← 返回大厅
arXiv (CS.LG) 2026-06-19 12:00 DOI: arXiv:2606.19656

DF-ExpEnse: Diffusion Filtered Exploration for Sample Efficient Finetuning

摘要 / Abstract

arXiv:2606.19656v1 Announce Type: cross Abstract: A natural recipe for intelligent robotic decision-making is initializing from pretrained generative control policies, which have summarized offline experience, and adapting them to self-collected online experience. We present DF-ExpEnse, an exploration technique that improves the quality of online experience collection, thus increasing finetuning sample-efficiency. DF-ExpEnse leverages the multimodal modeling capabilities of the generative control policy to create an expressive and tractably evaluatable candidate set. It then utilizes an ensemble of critics to identify the action that best balances quality with high exploration interest. In fleet settings, DF-ExpEnse further enables cross-agent communication to facilitate collaborative exploration as a group. DF-ExpEnse can be seamlessly integrated with existing strategies that finetune pretrained generative control policies via reinforcement learning. We experimentally validate consistent sample-efficiency benefits through DF-ExpEnse across a variety of manipulation and locomotion tasks, compared to default finetuning and alternative action selection schemes. Project can be found at https://df-expense.github.io.

同行评议区

登录学者账户后即可在此处发表评述或点赞。

立即登录

暂无评议记录。