arXiv (CS.CV) 2026-06-12 12:00 DOI: arXiv:2512.12571

Measurement Plasticity: Sensor-Level Adaptation for Vision-Language Models

Abstract

We propose Multi-View Physical-prompt (MVP) for Test-Time Adaptation (TTA), a forward-only framework that moves TTA from tokens to photons by treating the camera exposure triangle (ISO, shutter speed, and aperture) as physical prompts. At inference, MVP acquires multiple physical views selected by a source-affinity score, evaluates digitally augmented variants of each retained view and keeps the lowest-entropy predictions, and aggregates the kept predictions by hard voting. This selection-then-vote design is simple, calibration-friendly, and requires no gradients or model modifications. On ImageNet-ES and ImageNet-ES-Diverse, MVP outperforms digital-only TTA both on Auto-Exposure captures and when digital-only TTA is combined with conventional sensor control. MVP remains effective when the set of candidate parameters is reduced to lower capture latency, demonstrating its practicality.
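
The selection-then-vote procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the source-affinity score, so it is taken here as a precomputed number per view, and the function name `mvp_predict` and the parameters `top_k_views` and `keep_per_view` are hypothetical. The sketch also assumes that "filtering" means retaining the lowest-entropy (most confident) predictions.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mvp_predict(views, top_k_views=2, keep_per_view=2):
    """Hypothetical selection-then-vote sketch.

    views: list of (affinity_score, aug_probs) pairs, where aug_probs is a
    list of class-probability vectors, one per digitally augmented variant
    of that physical view. The affinity score is assumed to be given.
    """
    # 1. Retain the physical views with the highest source-affinity score.
    retained = sorted(views, key=lambda v: v[0], reverse=True)[:top_k_views]

    votes = []
    for _, aug_probs in retained:
        # 2. Within each retained view, keep the lowest-entropy predictions.
        confident = sorted(aug_probs, key=entropy)[:keep_per_view]
        # 3. Each kept prediction casts one hard vote for its argmax class.
        votes.extend(max(range(len(p)), key=p.__getitem__) for p in confident)

    # 4. Majority vote; on a tie, the class that was voted for first wins.
    return Counter(votes).most_common(1)[0][0]
```

Only forward-pass outputs (class probabilities) are consumed, which matches the forward-only, gradient-free framing: no model parameters are touched at any step.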
