arXiv (CS.CL) 2026-06-11 12:00 DOI: arXiv:2606.11279

Massive Open-Vocabulary Keyword Spotting

Abstract

Automatic speech recognition systems have been shown to underperform when transcribing words rarely seen in the training data, in particular specialized terminology. Open-vocabulary keyword spotting, combined with contextual biasing, has been shown to mitigate this issue. However, existing systems can handle glossaries of only a few hundred terms before they become a prohibitive bottleneck. We propose a system that stores features with a memory footprint up to 128 times smaller than that of a comparable baseline, allowing users to process massive databases while remaining open-vocabulary. Without fine-tuning the speech recognition model, our system achieves entity recall comparable to that of uncompressed solutions, even in languages not seen during training.
