← 返回大厅
arXiv (CS.LG) 2026-06-17 12:00 DOI: arXiv:2606.17805

QueryMarket: Cost-Aware Online Active Learning in Data Markets

摘要 / Abstract

arXiv:2606.17805v1 Announce Type: new Abstract: Data acquisition is a major bottleneck for learning in real-time streams: analysts must decide on the fly which labels to purchase while respecting a rolling budget. However, existing online active learning rarely unifies pricing, information gain, and rolling budget constraints under concept drift. We introduce QueryMarket, a market-inspired framework that queries each incoming data point based on its estimated utility to the model and its price. Within this framework, we propose OVBAL (online variance-based active learning), which integrates data pricing with information-driven selection by estimating each sample's marginal utility via a D-optimality criterion with exponential forgetting and executing cost-aware purchases under rolling budget constraints. OVBAL yields a simple, fully online decision rule that adapts to nonstationary streams and heterogeneous label costs. Experiments on synthetic data and a real-world solar power generation forecasting task show that OVBAL is particularly effective under seller-centric pricing and yields a more favorable long-run error-cost trade-off in the real-world task under both pricing schemes.

同行评议区

登录学者账户后即可在此处发表评述或点赞。

立即登录

暂无评议记录。