Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.LG) 2026-06-24

Reconstructing GRACE Terrestrial Water Storage with Spatio-Temporal Graph Neural Networks: An Application to South America

arXiv:2606.23833v1 Announce Type: new Abstract: Terrestrial water storage (TWS) integrates snow, soil moisture, surface water, and groundwater and is a key indicator of how climate variability and human activity reshape the global water cycle. The GRACE and GRACE-FO satellite missions provide the only direct, globally consistent observations of TWS change, but their record only begins in 2002 which is too short for many climate-scale analyses. We present a deep learning application that reconstructs monthly GRACE-like TWS anomalies (TWSA) back to 1940 by learning the relationship between daily ERA5 meteorological forcing (precipitation, evapotranspiration, runoff) and monthly GRACE observations. In contrast to prior reconstruction approaches based on grid-cell-wise regression, CNNs, or LSTMs, we adapt a multi-variate time series graph neural network (MTGNN) architecture, which was originally developed for mobility and traffic forecasting on urban sensor networks to this satellite-geodesy task. Spatial dependencies are encoded in a static, interpretable hybrid adjacency matrix that combines geodesic proximity with lagged correlations of climatic time series, capturing both local hydrological coupling and large-scale teleconnections. The reconstruction achieves a grid-cell Pearson correlation of 0.69, a basin-mean correlation of 0.94, and a near-zero bias, and it reproduces the spatial fingerprints of the 2015/16 El Niño and 2020/21 La Niña events. A systematic comparison with established reconstruction approaches (GTWS-MLrec, RM-REC, GRAiCE) shows that the graph-based model is statistically competitive at basin scale, reaching a correlation within 0.025 of the best baseline while using only roughly half to a tenth of the predictors the other models require and revealing characteristic weaknesses in arid regions in all models. The complete implementation is publicly available at github.com/hcu-cml/MTGNN-TWS-Reconstruction-GRACE

02.
arXiv (CS.CV) 2026-06-16

RaLMPH: Reliability-aware Learning for Multi-Pathologist Harmonization in Whole-Slide Image Classification

Multiple Instance Learning (MIL) is a standard paradigm for Whole-Slide Image (WSI) analysis and has achieved strong results in computational pathology. However, most MIL pipelines assume a single "gold" label per slide, which conflicts with clinical practice where substantial inter-pathologist variability is common. Existing multi-annotator learning and label-refinement methods typically estimate global annotator reliability or rely on single-instance assumptions, making them poorly suited to MIL and to localized diagnostic contexts where experts disagree. We propose RaLMPH (Reliability-aware Learning for Multi-Pathologist Harmonization), a MIL-based label reconciliation framework for WSIs annotated by multiple pathologists. RaLMPH introduces a reliability field that jointly models (i) local neighborhood structure in WSI feature space and (ii) expert uncertainty (entropy), enabling per-sample identification of trustworthy reference neighborhoods. Leveraging this field, RaLMPH performs sample-wise local annotator ranking to select reliable opinions per slide and applies an adaptive gating mechanism to fuse labels conditioned on local reliability. Experiments on a clinical WSI dataset with labels from six pathologists, as well as controlled simulated benchmarks, show that RaLMPH consistently outperforms existing approaches. Further analyses clarify how our reliability-aware mechanism improves label reconciliation and downstream MIL performance.

03.
arXiv (CS.AI) 2026-06-11

HiGR: Industrial-Scale Hierarchical Generative Slate Recommendation Framework in Tencent

arXiv:2512.24787v4 Announce Type: replace-cross Abstract: Slate recommendation, which presents users with a ranked item list in a single display, is ubiquitous across mainstream online platforms. While recent generative recommendation methods have shown strong potential in modeling item sequences with semantic IDs, directly applying them to industrial-scale slate recommendation faces a fundamental disconnect: entangled SID spaces confound high-level list planning, fine-grained autoregressive decoding over long sequences limits semantic planning efficiency, and token-level objectives misalign with holistic slate quality. In this paper, we propose HiGR, an industrial-scale hierarchical generative framework for slate recommendation that bridges this disconnect through a co-designed pipeline. First, HiGR learns structured SIDs via a Prefix-Contrastive Residual Quantized VAE (PCRQ-VAE). By enforcing high-level prefixes to capture shared semantics, PCRQ-VAE creates a controllable discrete space that acts as a prerequisite for efficient planning. Leveraging this structured space, our Hierarchical Slate Decoder (HSD) shifts autoregressive modeling from entangled token-level decoding to coarse-grained preference embeddings. This design significantly reduces inference latency while allowing explicit global slate structure planning. Finally, this stable planning space enables an ORPO-based listwise alignment mechanism to optimize triple-objective implicit feedback-ranking fidelity, genuine user interest, and diversity. Extensive offline experiments show that HiGR outperforms state-of-the-art baselines by over 10% in offline recommendation quality while achieving a $5\times$ inference speedup. Online A/B tests on Tencent platforms further improve watch time by 1.22% and video plays by 1.73%. HiGR has been deployed on multiple Tencent platform surfaces, serving hundreds of millions of users and proving its industrial-scale applicability.

04.
arXiv (CS.AI) 2026-06-18

Large-Scale OD Matrix Estimation with A Deep Learning Method

arXiv:2310.05753v2 Announce Type: replace Abstract: The estimation of origin-destination (OD) matrices is a crucial aspect of Intelligent Transport Systems (ITS). It involves adjusting an initial OD matrix by regressing the current observations like traffic counts of road sections (e.g., using least squares). However, the OD estimation problem lacks sufficient constraints and is mathematically underdetermined. To alleviate this problem, some researchers incorporate a prior OD matrix as a target in the regression to provide more structural constraints. However, this approach is highly dependent on the existing prior matrix, which may be outdated. Others add structural constraints through sensor data, such as vehicle trajectory and speed, which can reflect more current structural constraints in real-time. Our proposed method integrates deep learning and numerical optimization algorithms to infer matrix structure and guide numerical optimization. This approach combines the advantages of both deep learning and numerical optimization algorithms. The neural network(NN) learns to infer structural constraints from probe traffic flows, eliminating dependence on prior information and providing real-time performance. Additionally, due to the generalization capability of NN, this method is economical in engineering. We conducted tests to demonstrate the good generalization performance of our method on a large-scale synthetic dataset. Subsequently, we verified the stability of our method on real traffic data. Our experiments provided confirmation of the benefits of combining NN and numerical optimization.

05.
arXiv (math.PR) 2026-06-25

On the L{é}vy concentration function of Gaussian quadratic forms with applications to second order U-statistics

arXiv:2606.25441v1 Announce Type: new Abstract: We provide an upper-bound for the L{é}vy concentration function: $$ Q_{S}(\varepsilon):= \sup_{x \in\mathbb{R}}\mathbb{P} (x < S \leq x+\varepsilon) $$ where $S$ is a weighted sum of noncentral chi-square random variables: $$ S:= \sum_{k=1}^\infty \lambda_k (Z_k^2 - 1) + \mu_kZ_k $$ Here, $\{Z_k\}_{k=1}^\infty$ is a sequence of independent standard Gaussian random variables and $\{\lambda_k\}_{k=1}^\infty, \{\mu_k\}_{k=1}^\infty$ are real valued, square summable sequences. Random variables of this type often appear as limiting distributions of second order U-statistics. Our bound is adaptive, in that it recovers (up to constant factors) Gaussian type concentration function estimates if $\|\lambda\|_2$ is negligible compared to $\|\mu\|_2$ and chi-square estimates if $\|\mu\|_{2}$ is negligible compared to $\|\lambda\|_2$. Our bound generalizes existing bounds in various ways. In particular, we make no assumptions regarding the number of nonzero $|\lambda_k|$ or the size of the minimal $|\lambda_k|$, nor do we make any assumptions on the signs of $\lambda_k$. Finally, we apply our bound to some examples of interest, specifically quadratic forms that arise in limit theorems for second-order U-statistics.

06.
arXiv (CS.LG) 2026-06-16

The Complexity of Min-Max Optimization for Quadratic Polynomials

arXiv:2606.17000v1 Announce Type: cross Abstract: We prove that computing approximate stationary points of min-max optimization over the hypercube is PPAD-hard for quadratic polynomials. This holds even when the polynomials are multilinear, each variable appears in at most three monomials, and the approximation factor is inverse polynomial. As a direct consequence, we obtain the first PPAD-hardness results for two-team zero-sum polymatrix games.

07.
arXiv (CS.LG) 2026-06-19

Reversible Residual Normalization Alleviates Spatio-Temporal Distribution Shift

arXiv:2604.15838v2 Announce Type: replace Abstract: Distribution shift severely degrades the performance of deep forecasting models. While this issue is well-studied for individual time series, it remains a significant challenge in the spatio-temporal domain. Effective solutions like instance normalization and its variants can mitigate temporal shifts by standardizing statistics. However, distribution shift on a graph is far more complex, involving not only the drift of individual node series but also heterogeneity across the spatial network where different nodes exhibit distinct statistical properties. To tackle this problem, we propose Reversible Residual Normalization (RRN), a novel framework that performs spatially-aware invertible transformations to address distribution shift in both spatial and temporal dimensions. Our approach integrates graph convolutional operations within invertible residual blocks, enabling adaptive normalization that respects the underlying graph structure while maintaining reversibility. By combining Center Normalization with spectral-constrained graph neural networks, our method captures and normalizes complex Spatio-Temporal relationships in a data-driven manner. The bidirectional nature of our framework allows models to learn in a normalized latent space and recover original distributional properties through inverse transformation, offering a robust and model-agnostic solution for forecasting on dynamic spatio-temporal systems.

08.
arXiv (CS.CV) 2026-06-25

Wan-Streamer v0.1: End-to-end Real-time Interactive Foundation Models

We present Wan-Streamer, a native-streaming, end-to-end interactive foundation model designed from the ground up for real-time, low-latency, full-duplex audio-visual interaction. Wan-Streamer seamlessly models language, audio, and video as both input and output within a single Transformer, where the sequence is represented as interleaved visual, audio, and text input tokens together with visual, audio, and text output tokens, coordinated by block-causal attention for incremental streaming. Unlike cascaded interactive systems that rely on separate VAD, ASR, language, TTS, audio-driven animation, or video-generation modules, Wan-Streamer does not rely on external language, speech, avatar, or video-generation modules: perception, reasoning, generation, response timing, turn management, and cross-modal synchronization are learned jointly within one unified model, reducing pipeline latency and error accumulation. To support natural audio-visual responsiveness, we redesign the entire stack around streamability, including causal encoders, causal decoders, block-causal attention, and low-latency multimodal token scheduling, enabling streaming units as short as 160 ms at 25 fps. Wan-Streamer achieves approximately 200 ms model-side response latency and approximately 550 ms total interaction latency when combined with 350 ms bidirectional network latency, supporting sub-second duplex audio-visual communication. These results position Wan-Streamer as a unified, end-to-end, multimodal interactive foundation model for low-latency streaming interaction.

09.
arXiv (math.PR) 2026-06-15

Boltzmann-Like Occupation of Nonequilibrium Steady States on Dense Networks

arXiv:2606.14542v1 Announce Type: cross Abstract: A central problem in statistical physics is to extend the Boltzmann distribution to nonequilibrium steady states (NESS). We prove that NESS on large dense networks have Boltzmann-like occupation despite extensive entropy production. We further show that the active-matter heuristic of "low rattling" is asymptotically exact. Intuitively, these NESS spend a greater fraction of their time in states they leave more slowly. This explanation extends to the broader class of "equiaccessible" steady states, which play a role in our analysis akin to that of equilibrium in linear response.

10.
arXiv (CS.LG) 2026-06-12

Kareus: Joint Reduction of Dynamic and Static Energy in Large Model Training

arXiv:2601.17654v2 Announce Type: replace Abstract: The computing demand of AI is growing at an unprecedented rate, but energy supply is not keeping pace. As a result, energy has become an expensive and contended resource that requires explicit management and optimization. Although recent works have made significant progress in large model training optimization, they focus on optimizing either dynamic or static energy consumption. We find that fine-grained kernel scheduling and frequency scaling jointly and interdependently impact both dynamic and static energy consumption. Based on this finding, we design Kareus, a training system that pushes the time-energy tradeoff frontier by optimizing both aspects. Kareus decomposes the intractable joint optimization problem into local, partition-based subproblems. It then uses a multi-pass multi-objective optimization algorithm to find execution schedules that push the time-energy tradeoff frontier. Compared to the state of the art, Kareus reduces training energy by up to 28.3% at the same training time, or reduces training time by up to 27.5% at the same energy consumption.

11.
medRxiv (Medicine) 2026-06-17

The Unreliable Judges: Assessing Reproducibility and Self-Preference Bias of LLMs as Free-Text Evaluators

Large Language Models (LLMs) are transforming clinical practice and research, but their adoption requires rigorous evaluation. While human assessment is ideal, its cost has driven the widespread use of LLMs as evaluators. We introduce an open-source reciprocal framework comparing 71 human experts against six LLMs. AI evaluators show a strong self-preference bias, yet neither group reliably identified whether a response was human- or AI-generated. AI scores correlated with surface features such as length and lexical diversity, whereas human scores did not. By probing the evaluator's hidden states and applying targeted steering, we show that verbosity is a major causal driver of the bias. Moreover, shuffling question-response pairings shows that long responses keep high scores even when they no longer answer the question, whereas short ones do not, demonstrating that AI judges reward verbosity largely independently of content alignment. Finally, API-based and batch inference inflate stochasticity, underscoring the need for controlled deployment.

12.
arXiv (CS.AI) 2026-06-11

TreeSeeker: Tree-Structured Trial, Error, and Return in Deep Search

arXiv:2606.11662v1 Announce Type: new Abstract: Deep search requires agents to answer complex questions through multi-step web search, browsing, evidence comparison, and synthesis. A central challenge is deciding how to search when several directions look plausible but only some will later lead to reliable evidence. If an agent greedily follows the current best-looking direction, it may keep extending a weak continuation. If it explores without discipline, it may waste budget on disconnected trials. We propose TreeSeeker, an inference-time framework for controlled trial-and-error in deep search. TreeSeeker organizes search as branch-and-return search over tree-structured states, where each branch is a tentative direction for a sub-goal. At each round, TreeSearch reads all sub-goal trees, identifies active goals, and uses textual UCB signals of value, uncertainty, and risk to select among exploiting a promising branch, exploring an uncertain alternative, or pruning an unproductive continuation and returning to an earlier branch point. TreeMem supports this control loop by keeping evidence, uncertainty, conflicts, progress, and failure cues attached to the branches that produced them, so trial outcomes can guide later decisions. Experiments on XBench-DeepSearch, BrowseComp, and BrowseComp-ZH show that TreeSeeker consistently outperforms strong open-source baselines, suggesting that explicit branch-and-return control complements stronger reasoning and tool execution.

13.
arXiv (CS.CL) 2026-06-11

Measuring language complexity from hierarchical reuse of recurring patterns

We introduce the ladderpath index as a measure of language complexity grounded in algorithmic information theory. It counts the minimum steps needed to reconstruct a sequence through hierarchical reuse of repeated substructures, capturing an exactly computable but constrained form of algorithmic compressibility related to, but distinct from, Kolmogorov complexity. We apply the ladderpath approach to 21 parallel corpora from the Parallel Universal Dependencies dataset. The ladderpath index is approximately invariant across the languages, and varies much less than the corpus length. This is more pronounced when all corpora are mapped to a unified binary representation, providing evidence for the equi-complexity hypothesis from a representation-independent perspective. We also observe trade-offs between character inventory size and corpus length, and between vocabulary-level and corpus-level reconstruction complexity, supporting the trade-off hypothesis that total complexity is conserved and redistributed across linguistic levels. The reusable substructures identified by the ladderpath approach, without any linguistic input, overlap with words and morphological components attested in the natural vocabulary. The hierarchical reuse captured by the ladderpath approach parallels the chunking mechanisms proposed in cognitive science, where the human cognitive system compresses linguistic input into nested, reusable units under shared memory and processing constraints. This connection between cognitive chunking and the ladderpath approach provides a new interpretation for the equi-complexity and trade-off hypotheses, grounding both in the shared cognitive architecture that underlies language processing across human languages.

14.
medRxiv (Medicine) 2026-06-15

Automated AI-Based Ventricular Subcompartment Segmentation and Volumetry in Idiopathic Normal Pressure Hydrocephalus

Purpose In idiopathic normal pressure hydrocephalus (iNPH), longitudinal monitoring of ventricular size is important for diagnosis and treatment follow-up. This study aimed to validate a fully automated AI model for CT ventricular volumetry with subcompartments and to compare AI-derived volume changes with routine radiology assessments. Methods This retrospective, single-center study included 88 patients with iNPH and 456 non-contrast-enhanced head CT examinations. The model was trained on 38 manually labeled CT scans with 12 ventricular subcompartments. Outcomes included segmentation accuracy, correspondence between AI-derived longitudinal ventricular volume changes and radiology report categories (decreased, unchanged, increased), radiologist detection thresholds for ventricular change, and paired pre- and postoperative volume changes in 22 patients with ventriculoperitoneal shunt. Results Mean segmentation accuracy was high (Dice, 0.83). 91% of 100 segmentations were rated as excellent by an expert neuroradiologist. AI-derived ventricular volume changes corresponded well to radiology report categories (median total ventricular volume changes of -17% in cases reported as decreased, 0% in unchanged cases, and +22% in increased cases; all p < 0.001). Radiologists reported ventricular volume change in 50% of cases at an AI-measured relative volume change of +/-6%, and in 90% of cases at +21% for enlargement and -18% for decrease. After shunt placement, ventricular volume decreased by -8% (median), with the largest relative reductions observed in the right temporal and occipital horns. Conclusions Automated AI-based ventricular segmentation on CT enables accurate and reproducible assessment of ventricular volume changes in iNPH and complements routine radiological evaluation for longitudinal and postoperative monitoring.

15.
arXiv (CS.CL) 2026-06-24

Blockwise Policy-Drift Gating for On-Policy Distillation

On-policy distillation (OPD) trains a student policy using teacher signals computed on trajectories sampled by the student itself. Recent work shows that sampled-token OPD can be fragile on long-horizon reasoning tasks and that local teacher-support matching is a simple and effective repair. This paper introduces blockwise policy-drift gating, a lightweight student-only old-current drift controller for OPD under rollout reuse. The method computes log-probability shifts between the behavior student and the current student on the sampled token path, aggregates these shifts over fixed blocks or spans, and uses the resulting detached, mean-normalized gates to reweight OPD position losses. It does not change teacher targets, teacher top-K supports, or the rollout policy. In a six-variant Qwen3 math reasoning benchmark with a uniform 200-step training budget for all trained variants, we use pass@8 as the primary problem-level solve-rate metric. Fixed 64-token block gating improves sampled-token OPD mean pass@8 from 0.4978 to 0.5160 across AIME24, AIME25, MATH500, and AMC23. On Teacher-TopK/LSM, Block64 gives the best four-benchmark mean pass@8 among trained students. The results identify local old-current policy drift as a practical control signal for reused OPD rollouts and motivate block-level gating as a simple default for improving solve-rate robustness.

16.
arXiv (CS.CL) 2026-06-17

Scaling Enterprise Agent Routing: Degradation, Diagnosis, and Recovery

Production LLM assistants route user requests to growing libraries of specialized tools, but how does routing accuracy degrade as the catalog scales? We study single-step routing on a 110-agent, 584-tool catalog from a deployed enterprise productivity assistant, evaluating three frontier models from 10 to 110 agents. Routing F1 on under-specified requests drops 16–23 percentage points across models. An oracle analysis decomposes the degradation into a retrieval gap (the model cannot surface the right tool) and a confusion gap (even with perfect retrieval, the oracle ceiling drops 10pp). Embedding-based shortlisting recovers +10–11pp F1 at full scale across all three models and two providers. A production annotation study (1,435 human-labeled utterances, three annotators) confirms the recovery on real traffic at +10–17pp despite 10–15pp lower absolute performance.

17.
arXiv (CS.AI) 2026-06-19

CRAX: Fast Safe Reinforcement Learning Benchmarking

arXiv:2606.20376v1 Announce Type: cross Abstract: Safety is a core concern for deploying reinforcement learning (RL) agents in real-world domains such as robotics and autonomous driving. While benchmarks have been central to progress in RL, existing safety benchmarks with high-fidelity 3D physics remain computationally slow, limiting large-scale experimentation and rapid prototyping. To address this gap, we propose CRAX (Constrained RL Accelerated with JAX). Built on top of the MuJoCo XLA (MJX) physics engine with realistic 3D dynamics, CRAX leverages vectorized operations and hardware acceleration, yielding up to ~100x speedups over comparable CPU-based safety benchmarks. The benchmark features six environment suites and three agent-specific tasks, each spanning three difficulty levels. Evaluating six popular safe RL methods shows that no single approach dominates across all tasks, and reveals the trade-offs between performance and safety. We find that curriculum learning across difficulty levels and safety transfer can improve performance over direct training in harder settings.

18.
arXiv (CS.AI) 2026-06-24

Decentralized Coordination of Autonomous Traffic Through Advanced Air Mobility Corridors

arXiv:2606.23832v1 Announce Type: cross Abstract: The use of dedicated corridors for Advanced Air Mobility (AAM) traffic is one of the most commonly proposed pathways to integrating them into existing airspace operations. Most prior research has focused on the design of networks of AAM corridors and conflict resolution for aircraft within corridors. It is also generally believed that while attractive from an implementation perspective, corridor-based operations may be inefficient, especially in the absence of centralized traffic management. In this paper, we show that contrary to this belief, it is possible for autonomous aircraft to learn to self-organize into corridor flows in decentralized settings. We illustrate our approach using scenarios in which fixed-wing aircraft need to safely and efficiently traverse (1) a single corridor with metering after the exit, (2) a sequence of two consecutive corridors, and (3) a corridor that splits into two. We find that in decentralized settings with only local information, the aircraft are able to conform to the corridor boundaries more than 94% of the time and reach their goal in a relatively efficient manner. Furthermore, tactical interventions to handle violations of the separation minimum are needed only infrequently in low- and medium-density settings. However, such tactical interventions become more frequently necessary only when traffic density is high.

19.
arXiv (CS.LG) 2026-06-24

Data Augmentation: A Fourier Analysis Perspective

arXiv:2606.24418v1 Announce Type: new Abstract: Data augmentation is a simple and model-agnostic approach for exploiting known invariances in learning problems. Given a group acting on the input space, one augments the training set with transformed copies of each sample. Because it exploits symmetries without modifying the underlying learning algorithm, data augmentation can be applied broadly across learning methods. However, this universality comes at a computational cost: when the group is large, full group-sized augmentation quickly becomes computationally infeasible. This raises a fundamental question: Can partial data augmentation achieve the same statistical benefits as full augmentation in terms of generalization and sample complexity? We develop a general framework for investigating this question using Fourier analysis and the representation theory of finite groups. We show that, for a broad class of classical learning problems, partial data augmentation based on a randomly sampled subset of group elements achieves the same minimax rates as full augmentation, up to an approximation error that vanishes as the subset size increases. Our results provide a theoretical explanation for why partial augmentation can retain the statistical benefits of full augmentation despite enforcing symmetry only approximately, and shed light on a recently raised question in learning with symmetries: whether statistically optimal learning under general group invariances can be achieved using computationally scalable methods. Moreover, we prove a complementary impossibility result: enforcing exact invariance via data augmentation requires averaging over the entire group, and cannot be achieved by any strict subset when the hypothesis space is sufficiently expressive. Together, these results provide a unified perspective on full and partial data augmentation, as well as exact and approximate symmetry enforcement.

20.
arXiv (CS.CV) 2026-06-16

Physics-Driven Zero-Shot MRI Reconstruction with Non-local Image Priors

Zero-Shot Self-Supervised Learning (ZS-SSL) has emerged as a promising paradigm for accelerated Magnetic Resonance Imaging (MRI) reconstruction, eliminating the reliance on fully-sampled external datasets. However, learning solely from a single under-sampled scan suffers from supervision scarcity and optimization instability, often leading to overfitting or artifacts. To address these challenges, we propose a robust physics-driven ZS-SSL framework that synergizes physical consistency with image-domain non-local priors. Our method introduces three core innovations: (1) a Coil Sensitivity Map (CSM)-Guided Dynamic Repository, which stabilizes the training trajectory by filtering physically inconsistent artifacts based on coil sensitivity constraints; (2) a SPIRiT-based regularization, which enforces k-space self-consistency via a learned correlation kernel and stochastic masking; (3) a Non-Local Self-Similarity (NSS) Pixel Bank, which leverages the high-fidelity reference established by the former modules to explicitly mine non-local anatomical similarities, thereby augmenting supervision in the image domain. Extensive experiments on the FastMRI dataset demonstrate that our approach achieves state-of-the-art performance, particularly under high acceleration factors, effectively bridging the gap between zero-shot learning and supervised methods. The code is available at https://github.com/Zolento/NS-SSL.

21.
arXiv (CS.LG) 2026-06-17

Resource-Efficient Variational Quantum Classifier

arXiv:2511.09204v3 Announce Type: replace-cross Abstract: We introduce the unambiguous quantum classifier based on Hamming distance measurements combined with classical post-processing. The proposed approach improves classification performance through a more effective use of ansatz expressivity, while requiring significantly fewer circuit evaluations. Moreover, the method demonstrates enhanced robustness to noise, which is crucial for near-term quantum devices. We evaluate the proposed method on a breast cancer classification dataset. The unambiguous classifier achieves an average accuracy of 90%, corresponding to an improvement of 6.9 percentage points over the baseline, while requiring eight times fewer circuit executions per prediction. In the presence of noise, the improvement is reduced to approximately 3.1 percentage points, with the same reduction in execution cost. We substantiate our experimental results with theoretical evidence supporting the practical performance of the approach.

22.
arXiv (CS.CL) 2026-06-12

Multi-Bitwidth Quantization for LLMs Using Additive Codebooks

As large language models (LLMs) are increasingly deployed across heterogeneous hardware with varying resource constraints, the ability to adaptively manage the trade-off between performance and efficiency without retraining is critical. We propose Drop-by-Drop, a novel multi-bitwidth post-training quantization framework that enables inference-time precision control over LLM weights from a single trained model. Our method is theoretically grounded in information theory and successive refinement. We establish that LLM weights, which commonly follow a Gaussian distribution, can be optimally reconstructed with increasing fidelity as additional bits are incorporated, under a weighted mean squared error distortion motivated by LLM loss functions. To realize this in practice, Drop-by-Drop incorporates Matryoshka-style supervision into the loss function, exploiting the structure of additive codebooks. Drop-by-Drop produces a single model where ordered subsets of codebooks yield accurate partial reconstructions at each precision level. This approach significantly reduces storage and memory overhead by allowing a single checkpoint to serve multiple bitwidths, while maintaining competitive perplexity and accuracy across major architectures, such as Qwen, LLaMA, Gemma, and Mistral.

23.
arXiv (CS.LG) 2026-06-16

Using Reinforcement Learning to Optimize the Global and Local Crossing Number

arXiv:2509.06108v2 Announce Type: replace-cross Abstract: Graph drawing concerns the algorithmic visualization of graphs. A good drawing of a graph is easy to read and facilitates solving tasks on the graph. Several properties have been identified to occur in good drawings of graphs. Such properties include a low number of crossings, large angles between edges, short edges, and depicting symmetries. Many of these properties are explicitly measurable metrics. This brings us to the insight that graph drawing can be seen as a game. In this paper, we study a single-player optimization game in which the player iteratively moves vertices of a straight-line graph drawing to reduce edge crossings. This game arose naturally from the automatic track of the Graph Drawing Challenge, where solutions are obtained by repeatedly performing local vertex movements. We formalize this process as a game with full information and investigate whether reinforcement learning can discover effective strategies for playing it. Our reinforcement-learning agent observes the local geometric and structural context of a vertex and selects a movement direction with the goal of reducing either the global or the local crossing number, that is, the total number of crossings or the maximum number of crossings per edge. We compare the resulting strategies to existing methods and established crossing-minimization heuristics on standard benchmark graphs. While our approach does not out-compete state-of-the-art methods for minimizing the global crossing number, it is competitive and often superior for minimizing the local crossing number.

24.
arXiv (CS.LG) 2026-06-12

TEDD: Robust Detection of Unstable Temporal Features

arXiv:2606.12643v1 Announce Type: new Abstract: When working with real-world temporal data, it is common to encounter features whose distribution is changing over time. The naive employment of Machine Learning models on this unstable data might lead to rapidly degrading performance, especially if the new distribution is much different from what was previously seen during training. In order to cope with this problem, it is critical to automatically identify features that are changing over time. With these features detected, data scientists and other practitioners will be able to mitigate the issue (for instance, by applying data transformations), deploying more robust models that retain high performance for longer periods of time. In this paper, we describe which temporal changes a feature should not suffer from, and propose TEDD, a technique to a) identify when a dataset might lead to an unstable Machine Learning model and b) automatically detect which features cause such lack of robustness. In order to achieve it, we leverage a regression model to highlight which features contribute to a good prediction of an instance's timestamp. We compare our approach to other methods in real and synthetic data, testing their detection capability on all simple change patterns. We show that our method: detects all types of basic changes, both for numerical and categorical features; can detect multivariate drifts; returns a comparable value measuring the amount of change of each feature; requires no parameter tuning; and is scalable both on number of features and instances of the dataset.

25.
arXiv (CS.CV) 2026-06-16

Efficient Reinforcement for Visual-Textual Thinking with Discrete Diffusion Model

RL-based post-training has been widely adopted to enable interleaved visual and textual reasoning in unified multimodal models capable of both text and image generation. However, most existing approaches are built upon autoregressive (AR) unified models, which require full image regeneration during visual reasoning. In this work, we demonstrate that multimodal discrete diffusion models are effective alternatives to AR models for reinforcement learning in interleaved reasoning, owing to their ability to perform efficient visual rollouts via localized visual editing rather than full image-token regeneration. This reduces rollout computation during GRPO by 26.9\% compared to AR baselines, with minimal performance drop. Despite the improved efficiency, we find that joint reward assignment, which employs a shared reward signal across modalities, introduces cross-modal interference between unrelated image and text token sequences during RL updates. To address this issue, we propose factorized reward assignment, a strategy that assigns rewards independently to text and vision segments. With factorized reward assignment, our RL approach achieves an 11.2% improvement over joint reward assignment and a 38.04% improvement over the base model.