← Back to Lobby
arXiv (CS.AI) 2026-06-19 12:00 DOI: arXiv:2602.23248

Mitigating Legibility Tax with Decoupled Prover-Verifier Games

Abstract

arXiv:2602.23248v2 Announce Type: replace Abstract: As large language models become increasingly capable, it is critical that their outputs can be easily checked by less capable systems. Prover-verifier games can be used to improve checkability of model outputs, but display a degradation in accuracy compared to a baseline trained only to maximize correctness – a phenonemon named legibility tax. We propose a solution by decoupling the correctness from the checkability condition and instead training a "translator" model that turns a fixed solver model's solution into a checkable form. This allows us to first train the solver to maximize correctness, and then train the translator to translate the solver into a checkable form while retaining the solver's answer. To accommodate this new objective of translation, we formulate a decoupled prover-verifier game (DPVG) where the equilibria correspond to faithful and checkable translators.

Peer Discussions

Sign in with a scholar account to comment or like.

Sign in now

No discussions yet.