Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (quant-ph) 2026-06-16

Intrinsic preservation of plasticity in continual quantum learning

arXiv:2511.17228v2 Announce Type: replace Abstract: Artificial intelligence in dynamic, real-world environments requires the capacity for continual learning. However, standard deep learning suffers from a fundamental issue: loss of plasticity, in which networks gradually lose their ability to learn from new data. Here we show that quantum learning models naturally overcome this limitation, preserving plasticity over long timescales. We demonstrate this advantage systematically across a broad spectrum of tasks from multiple learning paradigms, including supervised learning and reinforcement learning, and diverse data modalities, from classical high-dimensional images to quantum-native datasets. Although classical models exhibit performance degradation correlated with unbounded weight and gradient growth, quantum neural networks maintain consistent learning capabilities regardless of the data or task. We identify the origin of the advantage as the intrinsic physical constraints of quantum models. Unlike classical networks where unbounded weight growth leads to landscape ruggedness or saturation, the unitary constraints confine the optimization to a compact manifold. Our results suggest that the utility of quantum computing in machine learning extends beyond potential speedups, offering a robust pathway for building adaptive artificial intelligence and lifelong learners.

02.
arXiv (CS.LG) 2026-06-16

Not all Jensen-Shannon Divergence Estimators are Equal

arXiv:2606.16411v1 Announce Type: new Abstract: The Jensen-Shannon divergence is widely reported as a scalar measure of fidelity for synthetic tabular data. Yet, in practice, it is estimated from finite samples using protocols that are often underspecified. This creates a measurement problem. Although the population divergence is well defined, the empirical value depends on the estimator family, sampling protocol, calibration, dimensionality, and class balance. We show that different protocols can yield non-comparable values: marginal-based estimators ignore dependencies in the joint distribution and can severely underestimate divergence, while classifier-based estimators capture joint structure but exhibit strong estimator dependence. We systematically study this behavior across controlled settings with reference divergences and real-world synthetic tabular benchmarks. Our analysis reveals dependence blindness in marginal estimators, prior-shift bias under class imbalance, and estimator sensitivity in high dimensions. To address prior shift, we derive a closed-form posterior correction for classifier-based Jensen-Shannon estimation. Our results show that empirical Jensen-Shannon divergence values are inherently protocol-dependent, making explicit specification of the estimation procedure necessary for meaningful comparison. We provide practical guidelines and an open-source tool for estimator-aware Jensen-Shannon evaluation.

03.
medRxiv (Medicine) 2026-06-12

Mathematical analysis of the overall survival after chemoradiotherapy of limited-stage small cell lung cancer and the effect of dose/fractionation

The purpose of this work is to analyze the 2-year overall survival (OS2y) of limited-stage small cell lung cancer (LS-SCLC) treated with chemoradiotherapy (CRT), aiming at characterizing the response of LS-SCLC, and in particular the /{beta} value and proliferation parameters. Through a systematic analysis of the literature, we collated a dataset containing 57 entries (3363 patients) of response of LS-SCLC treated with CRT. Radiotherapy schedules ranged from hyper- to hypofractionation. Four radiobiological models to describe the OS2y were investigated, with progressive levels of complexity including the effect of radiotherapy, chemotherapy, treatment year and toxicity. The Akaike Information Criterion (AIC) was used to compare models, and the profile likelihood methodology to compute confidence intervals. Model 4, which includes the effect of radiotherapy, chemotherapy, treatment year and dose-dependent toxicity, provided the best fits of the experimental data (lowest AIC value). While being the best model, model 4 still fails to provide a good prediction of the OS2y, in particular failing to predict the survival of the schedules achieving the lower/higher survivals. The radiobiological analysis of the dose-response of LS-SCLC to CRT does not allow to narrowly constrain the value of response parameters. We attribute this limitation to the large heterogeneity of this disease. Nonetheless, our analysis shows a large /{beta} value (>9 Gy, 95% CI), which implies a low fractionation effect in the radiotherapy of LS-SCLC. and an accelerated proliferation of tumor cells, {lambda}' > 1.6 Gy/day (95% CI), after a kick-off time of ~4-5 weeks, which supports the use of accelerated protocols to avoid the effect of tumor proliferation on the clinical outcome.

04.
arXiv (CS.LG) 2026-06-16

Filtered Conformal Ellipsoids for Graph-Native Time Series

arXiv:2606.17014v1 Announce Type: new Abstract: Joint prediction sets for multivariate time series should control a single event while adapting to cross-coordinate dependence. We study filtered conformal ellipsoids: a frozen state-space filter emits a one-step predictive mean and covariance, and split-conformal calibration is applied to the resulting Mahalanobis scores. The filter is used to choose the ellipsoid shape; conformal calibration chooses the scalar radius, so the construction benefits from a learned predictive covariance without relying on Gaussian tail probabilities for coverage. The main difficulty is that filtered scores are dependent and learned recurrent filters need not contract in their raw hidden state; we therefore analyse contraction in an observable predictive-law quotient that identifies hidden states producing the same future sequence of emitted Gaussian laws. Under a stable Bayes Gaussian-projection filter, covariance bounds, and a finite-horizon observability Fisher condition, small excess Gaussian negative log-likelihood implies contraction of the learned emitted laws. Combined with a threshold-autocovariance envelope this yields a Chebyshev-type approximate coverage bound for filtered split-conformal prediction under dependence; a sharper Bernstein-type bound requires an additional geometric-mixing concentration assumption. Under Gaussian oracle realisability we also obtain a near-oracle log-volume comparison within the class of conditionally valid Gaussian ellipsoid rules. We instantiate the framework with a GCN-GRU filter with diagonal-plus-low-rank covariance. On moderate-size graph-native traffic benchmarks (METRLA-$20$ and PEMSBAY-$50$), the learned filter gives sharper at-target ellipsoids than static-covariance and non-filter baselines; at full-graph scale and on non-graph-native datasets, factor and copula baselines can be stronger.

05.
arXiv (CS.AI) 2026-06-16

APEX: Adaptive Principle EXtraction A Three-Layer Self-Evolution Framework for Production AI Agents

arXiv:2606.15363v1 Announce Type: new Abstract: Self-improvement in AI agents has emerged as a key research frontier: systems that modify their own prompts, workflows, and decision rules based on accumulated operational experience. The state-of-the-art Self-Harness framework [1] achieves 14–21% improvement on Terminal-Bench-2.0 by mining failure clusters and patching the agent harness. However, Self-Harness optimises only one dimension – the prompt harness – leaving behavioural principles and workflow topology unchanged. We propose APEX (Adaptive Principle EXtraction), a three-layer co-evolution framework that simultaneously evolves: (L1) the harness via failure-mode patching, (L2) behavioural principles via success-trace distillation [2], and (L3) the agent workflow topology via structural fitness-based selection [6]. We implement APEX on Joe [13], a production-grade super AI Agent built on NVIDIA Nemotron and designed as an Edge AI Agent Factory for the NVIDIA Agent Challenge 2026, managing a 15-node compute fleet using 114 real task traces collected over 18 days. APEX achieves an APEX Health Score of 0.570 (+90% vs. baseline 0.300) in a single evolutionary run, distilling 6 novel reusable principles and selecting a research-first workflow topology scoring 0.900 (+20%). Our results demonstrate that multi-dimensional co-evolution substantially outperforms single-axis harness optimisation, at a cost of only 4 LLM calls (~270 s) on a local qwen2.5-coder:32b instance.

06.
arXiv (CS.LG) 2026-06-19

HEPTv2: End-to-End Efficient Point Transformer for Charged Particle Reconstruction

arXiv:2606.20437v1 Announce Type: cross Abstract: Charged-particle tracking – reconstructing trajectories from sparse detector measurements – is a fundamental high-energy-physics inference problem and a canonical example of learning under extreme combinatorial ambiguity. At the High-Luminosity Large Hadron Collider (HL-LHC), tracking must remain accurate and efficient despite unprecedented collision densities. Graph neural networks perform strongly, but incur substantial costs from graph construction and processing, while transformer-based approaches rely on auxiliary stages that prevent end-to-end optimization. To address this, we present HEPTv2, an end-to-end point-transformer architecture that reconstructs tracks from detector hits in one trainable pipeline. HEPTv2 combines a locality-aware point encoder with a track decoder that predicts complete trajectories without graph-building, clustering, or filtering. The encoder uses locality-sensitive hashing in detector coordinate space to preserve tracking-relevant geometry while enabling efficient local attention. The decoder resolves ambiguities through sectorized decoding and direct hit-to-track prediction under joint encoder-decoder supervision, allowing the full pipeline to be optimized end-to-end. On TrackML, HEPTv2 achieves 98.6% double-majority tracking efficiency at a 0.8% fake rate, while requiring only $\sim$15~ms inference time and 0.4~GB peak memory per event on a NVIDIA A100 GPU. Latency and memory scale approximately linearly for events with up to $5\times10^5$ hits. HEPTv2 establishes a new state of the art in the accuracy-latency trade-off, improving efficiency by 4.5% over the strongest prior transformer and by 1.1–2.2% over optimized graph-based pipelines, while reducing latency by factors of 7 and 38–52, respectively. These results show end-to-end transformers can deliver the accuracy and efficiency required for real-time particle reconstruction at the HL-LHC.

07.
arXiv (CS.CV) 2026-06-11

Towards Fully Automated Exam Grading: Fairness-Aware Recognition of Handwritten Answers with Foundation Models

Correcting handwritten exams by hand is time-consuming and error-prone, particularly for large cohorts, while fully digital exams tend to force a didactic narrowing towards closed question formats. A practical middle ground keeps paper-based, problem-oriented tasks but records the assessment-relevant answers as single capital letters in a table that a machine can read. The open question is whether this reading can be made accurate and, above all, fair enough for unsupervised grading. Earlier automated approaches reached only about 88%–91% recognition – too low – and failed on the cases that matter most: answers placed outside the cell, crossed out, or written in cursive. We show that general-purpose vision-language foundation models (VLMs), which interpret the page rather than match pixel templates, close this gap. On a benchmark of 61 anonymised exams (3141 answer positions) the best model reaches 98.4% accuracy, well above the previous baseline. Crucially, we centre the evaluation on fairness: we distinguish false negatives (a correct answer marked wrong, which disadvantages the student) from false positives, and a lightweight prompt that supplies the reference solution as context lowers the false-negative rate to 0.58%. Under an exemplary grading scheme only three of the 61 exams would be graded worse, all caught by a student self-review step. Fully automated, fairness-aware exam grading at scale is therefore defensible; we release the anonymised benchmark to support reproducibility.

08.
arXiv (CS.CL) 2026-06-19

FineREX: Fine-Tuned NER-RE for Human Smuggling Knowledge Graphs

Court proceedings contain valuable evidence about human smuggling networks, but this information is often buried within unstructured, jargon-heavy legal documents. While large language models (LLMs) can support knowledge graph construction through automated information extraction, existing approaches rely on general-purpose models that are not tailored to the entity and relationship definitions required in this domain. We introduce FineREX, a streamlined knowledge graph construction pipeline built around a fine-tuned LLM for named entity recognition and relationship extraction (NER-RE). Using a manually annotated dataset of $512$ text chunks, FineREX achieves absolute improvements of 15.50% and 31.46% in entity and relationship F1-score, respectively, compared to a larger general-purpose baseline. These gains translate into higher-quality knowledge graphs, reducing legal noise by nearly half and lowering node duplication on long documents from 17.78% to 11.17%. By eliminating document rewriting and redundant extraction stages, FineREX also reduces end-to-end processing time by 50.0%. Our results demonstrate that domain-specific fine-tuning can substantially outperform larger general-purpose models while improving both the quality and efficiency of knowledge graph construction for illicit network analysis.

09.
arXiv (CS.CV) 2026-06-16

DDTNet: Degradation Disentanglement and Transfer Network for Test-Time All-in-One De-weathering Adaptation

All-in-one adverse weather image restoration aims to remove multiple degradations, such as rain, haze, and snow, using a single unified model. Despite their broad applicability, existing methods typically compromise performance, delivering balanced but suboptimal results for individual degradation types. This issue becomes more pronounced when a domain gap exists between training and testing data. Motivated by the observation that modeling degradation patterns is more feasible than recovering clean content, we propose the Degradation Disentanglement and Transfer Network (DDTNet), which focuses specifically on degradation transfer. By disentangling degradation patterns from target-domain degraded images and transferring them to source domain clean images, DDTNet generates domain-adaptive paired training data. These pairs are then used to fine-tune restoration models, significantly enhancing their adaptability across diverse weather conditions and domains. The core of DDTNet is the Degradation Disentanglement Module (DDM), which comprises Degradation Coupled Attention (DCA) to capture both general and weather-specific features, thereby enabling effective disentanglement and transfer of degradation patterns. Experimental results demonstrate that DDTNet significantly and consistently improves existing all-in-one models across real-world deraining, desnowing, and dehazing datasets.

10.
arXiv (CS.LG) 2026-06-19

Optimal Ansatz-free Hamiltonian Learning In Situ

arXiv:2606.19486v1 Announce Type: cross Abstract: Characterizing the features of a Hamiltonian that governs a quantum system serves as a fundamental subroutine of quantum device calibration, signal sensing, and error correction. Recent works proposed protocols have achieved the optimal Heisenberg-limited scaling learning ansatz-free Hamiltonians from their real-time evolutions without fully specifying interaction structures. However, these protocols rely on both deep circuits with interleaving probes and control, and extremely short time resolution, making them difficult to implement on near- and intermediate-term in situ quantum experiments. In this work, we propose a computationally efficient, control-free, and ancilla-free algorithm that uses only Pauli product state preparation and measurement, and learns an ansatz-free Hamiltonian $H$ with $||H||\leq\Lambda$ in total evolution time of $\Theta(\frac{\Lambda}{\epsilon^2}\log(\frac{\Lambda}{\epsilon}))$. The evolution time cost of our algorithm is optimal for any control-free protocols as we further prove a lower bound of $\Omega(\frac{\Lambda}{\epsilon^2}\log(\frac{\Lambda}{\epsilon}))$. Technically, our method introduces a randomized-sampling framework that combines band-limited kernel-based time sampling with a displacement sieve for Hamiltonian structure learning. The characteristic probe time resolution depends only on $\Lambda$ instead of $\varepsilon$, which makes our protocol especially appealing in the high-precision regime for sensing and calibration applications. We also show that the algorithm maintains the same asymptotic total evolution time in the presence of state-preparation-and-measurement (SPAM) noise when the Hamiltonian is local after calibration. Our results demonstrate the fundamental cost of experimentally friendly Hamiltonian learning and provide a practical route to rigorous in situ characterization of near-term quantum platforms.

11.
arXiv (CS.AI) 2026-06-16

Deep Q-Learning on Hölder Spaces

作者:

arXiv:2606.16846v1 Announce Type: cross Abstract: We study the operator-theoretic core of Q-learning in continuous-time stochastic control with continuous states and actions. In value-based reinforcement learning, each Q-learning or DQN update is built from a Bellman optimality target; our analysis isolates this target in a diffusion setting and studies its regularity and approximation complexity. Under uniform ellipticity and Hölder-regular coefficients, we show that a Bellman update maps bounded inputs into an anisotropic regularity class, smoothing the state variable while leaving only Lipschitz dependence on the action variable. This yields a compact family of Bellman iterates and motivates a tensor-product DeepONet architecture adapted to the mixed regularity of the problem. We then derive explicit approximation and resource bounds, together with a stiffness–complexity trade-off as the time step $\delta \to 0$. The resulting theory makes a direct contribution to Q-learning theory at the level of Bellman target regularity and approximation in continuous stochastic control. At the same time, we do not claim a full convergence theorem for practical sampled Q-learning with exploration, replay, and stochastic gradient updates.

12.
arXiv (CS.CV) 2026-06-11

Benchmarking Cross-Domain Audio-Visual Deception Detection

Automated deception detection is crucial for assisting humans in accurately assessing truthfulness and identifying deceptive behavior. Conventional contact-based techniques, like polygraph devices, rely on physiological signals to determine the authenticity of an individual's statements. Nevertheless, recent developments in automated deception detection have demonstrated that multimodal features derived from both audio and video modalities may outperform human observers on publicly available datasets. Despite these positive findings, the generalizability of existing audio-visual deception detection approaches across different scenarios remains largely unexplored. To close this gap, we present the first cross-domain audio-visual deception detection benchmark, that enables us to assess how well these methods generalize for use in real-world scenarios. We used widely adopted audio and visual features and different architectures for benchmarking, comparing single-to-single and multi-to-single domain generalization performance. To further exploit the impacts using data from multiple source domains for training, we investigate three types of domain sampling strategies, including domain-simultaneous, domain-alternating, and domain-by-domain for multi-to-single domain generalization evaluation. We also propose an algorithm to enhance the generalization performance by maximizing the gradient inner products between modality encoders, named ``MM-IDGM". Furthermore, we proposed the Attention-Mixer fusion method to improve performance, and we believe that this new cross-domain benchmark will facilitate future research in audio-visual deception detection.

13.
arXiv (CS.LG) 2026-06-16

Learning the generating functional for variance reduction in lattice QCD

arXiv:2606.15986v1 Announce Type: cross Abstract: The generating functional in quantum field theory provides the natural framework for constructing correlation functions as derivatives with respect to source operators. We present a methodology that leverages machine-learned normalizing flows to reduce the variance of arbitrary $N$-point correlation functions of bosonic operators in lattice gauge field theory calculations by encoding a representation of the generating functional. We show that it is possible to systematically approach noiseless estimators of correlation functions in this framework. We demonstrate this methodology with applications to calculations of glueball correlation functions and Wilson loops in Quantum Chromodynamics and Yang-Mills theory. The results show up to three orders of magnitude variance reduction.

14.
arXiv (CS.CL) 2026-06-12

SkillCAT: Contrastive Assessment and Topology-Aware Skill Self-Evolution for LLM Agents

Skill self-evolution methods for LLM agents aim to turn execution trajectories into reusable skill documents, but current pipelines typically learn from one trajectory per task, merge candidate skill patches before checking them, and load the full skill corpus before inference. We propose SkillCAT, a training-free framework that separates this process into three stages. Contrastive Causal Extraction (CCE) samples multiple trajectories for each task and compares same-task success/failure pairs to identify evidence that explains outcome differences. Assessment-Augmented Evolution (AAE) replays each candidate patch on source-task clones and keeps only patches that improve or preserve task outcomes before hierarchical skill patch merging. Topology-Aware Task Execution (TTE) compiles the evolved skills into a routable sub-skill topology, so inference loads only the capability nodes relevant to the task. We evaluate SkillCAT on common agent benchmarks, including SpreadsheetBench, WikiTableQuestions, and DocVQA, and further test cross-model and out-of-distribution generalization. Across these settings, SkillCAT raises the average score over baselines by up to 40.40%, demonstrating reliable skill evolution without model training.

15.
arXiv (CS.CV) 2026-06-16

FairGen: Preference-Aligned Diffusion for Demographically Equitable Medical Image Synthesis

Medical imaging is central to modern diagnostics, and artificial intelligence (AI) systems are increasingly used to support image-based analysis by improving efficiency, accuracy, and access to care. However, inequities in healthcare access and differential disease prevalence create severe demographic imbalances in clinical image data. Such imbalances are compounded by the fact that diseases can manifest with distinct features across demographic groups, rendering certain phenotypic presentations naturally rare. AI models trained on such imbalanced data risk perpetuating diagnostic bias and widening healthcare disparities. Here we introduce FairGen, a fairness-aware diffusion framework that synthesizes demographically balanced medical images while preserving pathology-relevant visual features. By embedding physician-aligned preferences into the generation process, FairGen improves subgroup coverage during synthesis and downstream classification. Applied to dermatology, radiology, and neuroimaging benchmark tasks, FairGen achieves fairness improvements of 95.9% for skin images, 80.0% for chest radiography, and 35.2% for brain MRI, while maintaining competitive diagnostic accuracy relative to models trained on original clinical data. Clinician-facing expert review and external validation on independent cohorts further support that these gains extend beyond standard fidelity metrics and are not confined to the original in-distribution datasets.

16.
bioRxiv (Bioinfo) 2026-06-14

Generative design of antigen-specific T-cell receptor sequences with a conditional diffusion model

T cell receptor (TCR)-based immunotherapy holds immense potential for treating cancers and infectious diseases, where highly antigen-specific TCR recognition is crucial for adaptive immunity against tumors and pathogens. Engineering or de novo generation of the complementarity-determining region 3 (CDR3) loops of TCRs using artificial intelligence offers a powerful alternative to designing reactive TCRs rather than laborious experimental screening. However, current in silico approaches are constrained by weak conditional guidance, limited flexibility, and a lack of rigorous functional validation. To address these limitations, we introduce TCRDiff, a generative diffusion framework for designing antigen-specific TCRs conditioned on peptide-MHC (pMHC) targets and germline-encoded variable genes. By leveraging pre-trained knowledge from massive T-cell repertoires and TCR-pMHC recognition data, TCRDiff generates CDR3{beta} sequences with state-of-the-art fidelity to native binding TCRs through a denoising diffusion process. Furthermore, incorporating the interface geometry features generated TCR-pMHC complexes with superior structural plausibility. As a proof of concept, we deployed TCRDiff in a systematic pipeline to design candidate TCRs for immunotherapy. In vitro activation assays validated that TCRDiff-generated TCRs specifically recognize the MAGE-A3 epitope with minimized off-target cross-reactivity. Together, TCRDiff establishes a powerful, validated computational paradigm to accelerate the development of TCR-based immunotherapies.

17.
arXiv (CS.LG) 2026-06-16

Biarchetype analysis for univariate functional data. An application to macroeconomic financial time series

arXiv:2606.15881v1 Announce Type: cross Abstract: We introduce biarchetype analysis for the first time in the context of univariate functional data. This unsupervised methodology extends archetype analysis by simultaneously identifying archetypal structures across both the cases (countries, in our application) and the temporal argument. Both cases and time points are expressed as mixtures of biarchetypes, yielding a concise and highly interpretable representation of complex functional observations. Although biarchetype analysis is not intended as a clustering technique, it offers superior interpretability compared with biclustering approaches, as it is based on extreme, representative patterns rather than average centroids, thereby enhancing human comprehension. We apply the proposed method to 10-year government bond yields of European countries over the period 2001-2025. The results identify three distinct time regimes (the pre-crisis period, the euro-area sovereign debt crisis, and the post-crisis period), and reveal Germany, Greece, and Hungary as country archetypes.

18.
arXiv (quant-ph) 2026-06-19

Optimized Quantum States for Sensing in the Presence of Loss and Phase Noise

arXiv:2606.19649v1 Announce Type: new Abstract: Squeezed vacuum lets gravitational-wave detectors and other quantum sensors surpass the standard quantum limit, and is optimal in the loss-limited regime; phase noise breaks this optimality. Numerically optimizing the quantum Fisher information across the loss and phase-noise landscape, we identify non-Gaussian states that outperform any Gaussian state. These fall into three classes: Fock-like, cubic-phase-like, and states with discrete rotational symmetry. Limiting the average number of photons in the input state to $\bar{n}=5$, with $1-\eta = 5\%$ photon loss and 200 mrad phase noise, the non-Gaussian advantage reaches up to 2.2 dB. Furthermore, we observe that the non-Gaussian advantage can persist even when the measurement strategy is homodyne detection.

19.
arXiv (CS.AI) 2026-06-15

Aligning Quantum Operators with Large Language Models

arXiv:2606.13811v1 Announce Type: cross Abstract: Can Large Language Models (LLMs) understand and reason about quantum operators? Despite their remarkable capabilities in mathematics and symbolic reasoning, LLMs remain inherently blind to quantum representations such as unitary matrices. In this work, we take a step toward bridging this gap by introducing an approach that maps unitary operators into the latent space of an LLM, enabling unified modeling over quantum and linguistic inputs. We instantiate this idea on Clifford+T circuit synthesis over a Pauli rotation gate set, where our model achieves results competitive with state-of-the-art methods and scales consistently with training data, with no signs of saturation. Our approach further enables language-conditioned synthesis, allowing gate constraints unseen during training to be specified directly in natural language. This work suggests a path toward quantum–aware foundation models that can natively interpret and reason about quantum operations, which could have broader implications reaching across quantum compilation and algorithm discovery.

20.
arXiv (quant-ph) 2026-06-12

Asymmetric quantum steering harvested near a Lorentz-violating BTZ black hole

arXiv:2606.12766v1 Announce Type: cross Abstract: We investigate the harvesting of quantum steering and its directional asymmetry between two Unruh-DeWitt detectors in a Lorentz-violating BTZ black hole spacetime. Since the detectors are located at different radial positions outside the black hole, they experience inequivalent local environments induced by gravitational redshift, causing Alice to undergo stronger effective thermal noise than Bob. Remarkably, we uncover a counterintuitive phenomenon in which the detector subjected to a higher effective temperature exhibits stronger steerability than the other one, revealing a nontrivial inversion of thermal intuition in curved spacetime. Furthermore, quantum steering survives only within a finite window of detector energy gaps and reaches its maximum within an optimal regime. We find that Lorentz violation suppresses steering most strongly near this optimal energy gap, indicating an enhanced sensitivity of maximal correlation extraction to symmetry breaking effects. Our results demonstrate that Lorentz violation acts as a geometric constraint on the quantum information capacity of spacetime, simultaneously restricting both the strength and the directionality of quantum correlations.

21.
arXiv (CS.CL) 2026-06-16

Agentic Retrieval and Reinforcement Learned Equation Chains: A Controlled Generation Framework for Complex and Novel Physics Word Problems

Generating high-quality Physics Word Problems (PWPs) that are novel, complex, and solvable remains a challenging and underexplored problem in educational content generation. Existing approaches, many adapted from Math Word Problem (MWP) generation, often produce ambiguous, unsolvable, or structurally simple questions with limited linguistic diversity. We introduce ARVRE (Agentic Retrieval Value Reinforced Equation-chain), a two-stage framework for generating diverse and mathematically valid PWPs. In the first stage, a form of offline temporal-difference learning is used to construct valid chains of physics equations, while an agentic retrieval-augmented generation (RAG) framework dynamically selects topic-specific concepts and vocabulary. This design enables explicit control over problem structure and difficulty. In the second stage, a Large Language Model (LLM) converts the equation chain and retrieved concepts into a natural-language physics question. By grounding generation in valid equation chains, our method preserves mathematical correctness while promoting linguistic diversity and contextual richness. Human and automated evaluations demonstrate that ARVRE generates PWPs that are more complex, novel, and solvable than those produced by existing approaches. These results highlight the potential of combining reinforcement learning, retrieval, and LLMs for reliable generation of educational physics content.

22.
arXiv (CS.CL) 2026-06-18

Improving Medical Communication using Rubric-Guided Counterfactual Recommendations

Text-based telemedicine increasingly relies on lightweight patient feedback, however, such feedback primarily reflects perceived communication quality rather than medical accuracy. We introduce an LM-guided counterfactual recommendation pipeline that discovers and refines interpretable communication features such as tone, personalization, actionability and completeness in addressing patient concerns, without interfering with the medical content. These features are used together with patient-doctor interaction metadata to estimate positive feedback. At inference time, the system searches over low-cost ordinal feature changes and recommends minimal communication changes predicted to increase the probability of positive feedback, while independent auditor models test whether these gains generalize beyond the selection model. Across interactions, recommendations yield a mean +6.41% gain in predicted positive feedback probability under independent auditors, and are non-negative for 93.31% of recommendations. These results suggest that small, interpretable communication changes can capture most predicted gains while preserving the doctor's control over medical reasoning and final wording.

23.
arXiv (CS.LG) 2026-06-18

How fast can you find a good hypothesis?

arXiv:2509.03734v3 Announce Type: replace-cross Abstract: In the hypothesis selection problem, we are given sample and query access to finite set of candidate distributions (hypotheses), $\mathcal{H} = \{H_1, \ldots, H_n\}$, and samples from an unknown distribution $P$, both over a domain $\mathcal{X}$. The goal is to output a distribution $Q$ whose distance to $P$ is comparable to that of the nearest hypothesis in $\mathcal{H}$. Specifically, if the minimum distance is $\mathsf{OPT}$, we aim to output $Q$ such that, with probability at least $1-\delta$, its total variation distance to $P$ is at most $C \cdot \mathsf{OPT} + \varepsilon$. The optimal approximation for proper algorithms (where $Q \in \mathcal{H}$) is $C=3$ using $\Theta(\log(n/\delta)/\varepsilon^2)$ samples from $P$ and for improper algorithms (where $Q$ is not necessarily in $\mathcal{H}$) is $C=2$ using $\tilde{\Theta}(\log(n/\delta)/\varepsilon^2)$ samples from $P$. In the improper setting, the algorithm achieving $C=2$ [Bousquet, Braverman, Kol, Efremenko, Moran, FOCS 2021] runs in time which grows polynomially with $|\mathcal{X}|$ – it does not run in finite time for real-valued distributions. A promising path towards improved runtime is to consider improper algorithms which output a mixture $Q$ of the hypotheses as such a distribution can be represented in $n$ words of memory. We show (1) a lower bound that no algorithm which outputs a mixture can achieve approximation better than $C = 3-2/n$ unless the number of samples is polynomial in $|\mathcal{X}|$, as well as (2) an algorithm which runs in time $poly(n)$ and achieves the same approximation guarantee. In the proper setting, [Aliakbarpour, Bun, Smith, NeurIPS 2024] provided an algorithm with $C=3$ running in $\tilde{O}(n/(\delta^3\varepsilon^3))$ time. We improve this time complexity to $\tilde{O}(n/(\delta \varepsilon^2))$, significantly reducing the dependence on the confidence and error parameters.

24.
arXiv (quant-ph) 2026-06-16

Exact Many-body Quantum Dynamics in One-Dimensional Baths via Collective Spins

arXiv:2505.00588v2 Announce Type: replace Abstract: Computing the exact dynamics of many-body quantum systems becomes intractable as system size grows. Here, we present a symmetry-based method that provides an exponential reduction in the complexity of a broad class of such problems $\unicode{x2014}$ qubits coupled to one-dimensional electromagnetic baths. We identify conditions under which partial permutational symmetry emerges and exploit it to group qubits into collective multi-level degrees of freedom, which we term ''superspins.'' These superspins obey a generalized angular momentum algebra, reducing the relevant Hilbert space dimension from exponential to polynomial. Using this framework, we efficiently compute many-body superradiant dynamics in large arrays of qubits coupled to waveguides and ring resonators, showing that $\unicode{x2014}$ unlike in conventional Dicke superradiance $\unicode{x2014}$ the total spin length is not conserved. At long times, dark states become populated. We identify configurations where these states exhibit metrologically useful entanglement. Our approach enables exact treatment of complex dissipative dynamics beyond the fully symmetric limit and provides a rigorous benchmark for approximate numerical methods.

25.
arXiv (CS.CL) 2026-06-19

What Makes Effective Supervision in Latent Chain-of-Thought: An Information-Theoretic Analysis

Latent Chain-of-Thought (CoT) internalizes reasoning within continuous hidden states, offering a promising alternative to verbose discrete reasoning traces. However, robust latent reasoning remains difficult because outcome supervision provides weak learning signals and leaves latent trajectories prone to semantic drift. In this work, we analyze Latent CoT from an information-theoretic perspective and identify this failure as a dual collapse: gradient attenuation along the optimization path and representational drift in the latent space. We further decompose process supervision into two complementary dimensions: Trajectory Supervision, which injects dense stepwise reasoning signals, and Space Supervision, which preserves the semantic structure of the latent manifold. Our analysis shows that rigid geometric compression can collapse the reasoning space, whereas generative reconstruction provides a more flexible semantic anchor that better preserves information capacity. To measure these effects, we introduce the Unified Latent Probe (ULP), which quantifies the mutual information between latent trajectories and explicit reasoning steps. Experiments reveal a clear Information-Performance Binding: reasoning accuracy depends on the information fidelity preserved in the latent chain. These findings provide a principled framework for latent reasoning supervision and suggest shifting from geometric imitation toward mutual information maximization. Our code is available at \href{https://github.com/EIT-NLP/Supervision-in-Latent-CoT}{this repository}.