← Back to Lobby
arXiv (CS.AI) 2026-06-16 12:00 DOI: arXiv:2606.16364

Looking Is Not Picking: An Attention-Segment Account of Tool-Selection Failures in LLM Agents

Authors:

Abstract

arXiv:2606.16364v1 Announce Type: new Abstract: LLM agents mis-call tools, and the natural guess is that the model failed to see the right tool in a crowded harness. We show the opposite through a lens concurrent work sets aside – the model's attention to labeled tool-definition segments. On real BFCL failures, by per-candidate attention argmax the model attends most to the correct tool 80% of the time (vs. 21% chance), and the gold is the under-attended segment on only 10%: it looks at the right tool and still picks wrong. This directly refutes the intuitive "crowded-harness / lost-in-the-middle" explanation: the failure is at the decision readout, not the harness, and we pin it there three ways. (1) Input vs. readout: repairing the prompt (reordering or duplicating the gold tool) recovers

Peer Discussions

Sign in with a scholar account to comment or like.

Sign in now

No discussions yet.