arXiv (CS.AI) 2026-06-16 12:00 DOI: arXiv:2606.16364

Looking Is Not Picking: An Attention-Segment Account of Tool-Selection Failures in LLM Agents

Authors:

Abstract

arXiv:2606.16364v1

LLM agents mis-call tools, and the natural guess is that the model failed to see the right tool in a crowded harness. We show the opposite through a lens that concurrent work sets aside: the model's attention to labeled tool-definition segments. On real BFCL failures, measured by per-candidate attention argmax, the model attends most to the correct tool 80% of the time (vs. 21% chance), and the gold tool is the under-attended segment in only 10% of cases: the model looks at the right tool and still picks the wrong one. This directly refutes the intuitive "crowded-harness / lost-in-the-middle" explanation: the failure is at the decision readout, not the harness, and we pin it there three ways. (1) Input vs. readout: repairing the prompt (reordering or duplicating the gold tool) recovers
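The per-candidate attention argmax mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how attention is aggregated within a tool-definition segment, so the length-normalised mean used here is an assumption, as are the function names (`segment_attention`, `most_attended`, `summarize`), the span format, and the toy numbers. It is not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Half-open [start, end) token range covering one tool definition (assumed format).
Span = Tuple[int, int]
# One failure case: attention weights per prompt token, tool spans, gold tool name.
Failure = Tuple[List[float], Dict[str, Span], str]


def segment_attention(attn: List[float], spans: Dict[str, Span]) -> Dict[str, float]:
    """Mean attention weight inside each tool-definition segment.

    Averaging (rather than summing) is an assumption made here so that
    long tool definitions are not favoured simply for having more tokens.
    """
    scores: Dict[str, float] = {}
    for name, (start, end) in spans.items():
        if end <= start:
            raise ValueError(f"empty span for tool {name!r}")
        scores[name] = sum(attn[start:end]) / (end - start)
    return scores


def most_attended(attn: List[float], spans: Dict[str, Span]) -> str:
    """Per-candidate argmax: the tool whose segment has the highest score."""
    scores = segment_attention(attn, spans)
    return max(scores, key=lambda name: scores[name])


def summarize(failures: List[Failure]) -> Tuple[float, float]:
    """Return (fraction of failures where the gold tool is most attended,
    chance rate, i.e. the average of 1 / number of candidates)."""
    n = len(failures)
    hits = sum(most_attended(attn, spans) == gold for attn, spans, gold in failures)
    chance = sum(1.0 / len(spans) for _, spans, _ in failures) / n
    return hits / n, chance


# Toy failure cases with made-up attention weights.
case_a: Failure = (
    [0.05, 0.05, 0.30, 0.30, 0.10, 0.10, 0.05, 0.05],
    {"get_weather": (0, 2), "search_flights": (2, 4), "book_hotel": (4, 8)},
    "search_flights",  # gold tool is also the most-attended segment
)
case_b: Failure = (
    [0.40, 0.40, 0.10, 0.10],
    {"send_email": (0, 2), "create_event": (2, 4)},
    "create_event",  # gold tool is not the most-attended segment
)

gold_top_rate, chance_rate = summarize([case_a, case_b])
```

Comparing `gold_top_rate` with `chance_rate` mirrors the kind of "80% vs. 21% chance" comparison the abstract reports. The toy values above carry no empirical meaning.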
