arXiv (CS.AI) 2026-06-15 12:00 DOI: arXiv:2606.14466

The Perceived Fragility of Explanations in Audio Models: Manipulation of Attribution with Unchanged Predictions

Abstract

This paper investigates the fragility of post-hoc explanation methods in audio deepfake detection. Previous work on explanation manipulation focused on images and measured perturbations with standard $L_p$ metrics. We instead introduce a psychoacoustic framework that optimizes inaudible perturbations to decouple a model's attributions from its final classification. We evaluate this vulnerability across state-of-the-art architectures under strict prediction-preserving constraints. Measuring the manipulation cost with domain-specific perceptual audio quality metrics alongside explanation alignment criteria, we show that an adversary can systematically distort automated explanation heatmaps while preserving the predicted deepfake label. Code is available at: https://github.com/cncPomper/Audio-XAI
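
The core idea of changing attributions while leaving the prediction untouched can be illustrated with a deliberately tiny example. The sketch below is not the paper's method: it assumes a linear scorer with gradient-times-input attributions, and it uses a plain peak-amplitude bound `eps` as a stand-in for the psychoacoustic constraint. The function name `manipulate_attribution` and all values are hypothetical.

```python
import numpy as np

def manipulate_attribution(w, x, target, eps):
    """Toy sketch: move gradient*input attributions of a linear scorer
    toward `target` while leaving the logit w @ x unchanged.

    Assumptions (not from the paper): linear model, every weight nonzero,
    and a peak-amplitude bound `eps` replacing a perceptual constraint.
    """
    attribution = w * x                      # gradient * input for f(x) = w @ x
    desired_shift = target - attribution     # where we would like attributions to go
    # Removing the mean makes the shift sum to zero, so w @ delta == 0
    # and the logit (hence the predicted label) is preserved.
    shift = desired_shift - desired_shift.mean()
    delta = shift / w
    # Uniform rescaling keeps w @ delta == 0 while bounding the perturbation.
    peak = np.max(np.abs(delta))
    if peak > eps:
        delta = delta * (eps / peak)
    return delta

rng = np.random.default_rng(0)
w = rng.uniform(0.5, 1.5, size=8)            # hypothetical model weights
x = rng.normal(size=8)                       # hypothetical input features
target = np.zeros(8)
target[0] = 1.0                              # adversary wants all credit on feature 0

delta = manipulate_attribution(w, x, target, eps=0.05)

logit_before = w @ x
logit_after = w @ (x + delta)
attr_before = w * x
attr_after = w * (x + delta)

print("logit change:", abs(logit_after - logit_before))
print("distance to target before:", np.linalg.norm(attr_before - target))
print("distance to target after: ", np.linalg.norm(attr_after - target))
```

In this toy setting the logit is preserved up to floating-point error, yet the attribution vector moves toward the adversary's target. Deep detectors and perceptual audio metrics require iterative optimization rather than a closed-form projection, but the prediction-preserving constraint plays the same role.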
