← 返回大厅
arXiv (CS.LG) 2026-06-24 12:00 DOI: arXiv:2602.22810

Multi-agent imitation learning with function approximation: Linear Markov games and beyond

摘要 / Abstract

arXiv:2602.22810v2 Announce Type: replace Abstract: In this work, we present the first theoretical analysis of multi-agent imitation learning (MAIL) in linear Markov games where both the transition dynamics and each agent's reward function are linear in some given features. We demonstrate that by leveraging this structure, it is possible to replace the state-action level "all policy deviation concentrability coefficient" (Freihaut et al., arXiv:2510.09325) with a concentrability coefficient defined at the feature level which can be much smaller than the state-action analog when the features are informative about states' similarity. Furthermore, to circumvent the need for any concentrability coefficient, we turn to the interactive setting. We provide the first, computationally efficient, interactive MAIL algorithm for linear Markov games and show that its sample complexity depends only on the dimension of the feature map $d$. Building on these theoretical findings, we propose a deep MAIL interactive algorithm which clearly outperforms BC on games such as Tic-Tac-Toe and Connect4.

同行评议区

登录学者账户后即可在此处发表评述或点赞。

立即登录

暂无评议记录。