Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (math.PR) 2026-06-16

Super-Arrhenius relaxation of the triangular plaquette model in any dimension

arXiv:2606.16259v1 Announce Type: new Abstract: Consider the following plaquette model from statistical physics: a lamp lies at every vertex of the triangular lattice and a switch lies at every even vertex of the (bipartite) dual hexagonal lattice. Each switch toggles the three lamps on its face. The energy of a configuration is the number of ON lamps. For the Glauber dynamics associated with the Gibbs measure defined by this Hamiltonian at any inverse temperature $\beta>0$, we show that, in any dimension $d\ge 2$, the infinite volume relaxation time satisfies \[e^{\beta^2/C}/C \le T_{\mathrm{rel}}\le Ce^{e^{C\beta}}\] for some $C>0$. Our result entails that the Gibbs measure is unique. The $e^{\beta^2}$ scaling was conjectured by Newman and Moore in 1999 and matches the behaviour of supercritical rooted kinetically constrained models such as the East model, thus recovering fragile glass phenomenology in the absence of kinetic constraints. More precisely, we show that, on a torus of side length $2^k$, when $\beta\to\infty$ and $k/\beta\to0$, we have $T_{\mathrm{rel}}=e^{2\beta k(1+o(1))}$. Quite surprisingly, however, we also prove that, on non-periodic finite domains of size $n\le e^{\beta/C}$ for large $C>0$, we have the much larger asymptotics $\ln T_{\mathrm{rel}}=\beta n^{\Theta(1)}$. The main ingredients of the proofs are new results in extremal and enumerative combinatorics and rely on renormalisation ideas for the dynamics and its groundstates also known as the Ledrappier subshift. We note consequences of our results to geometric group theory (more precisely to the complexity of the word problem for the Baumslag finitely presented group) and to ergodic theory.

02.
arXiv (CS.CV) 2026-06-11

VLGA: Vision-Language-Geometry-Action Models for Autonomous Driving

Vision-language-action (VLA) models can describe scenes and reason about them in language, yet still struggle to ground their actions in the dense 3D world around them. Existing approaches either inject features from a frozen 3D foundation model without an objective that ensures the policy uses them, or constrain geometry with sparse box and map losses that provide no dense spatial signal. We introduce VLGA, the first vision-language-action model supervised to reconstruct the dense 3D world it drives through. VLGA introduces geometry as a fourth modality alongside vision, language, and action through a dedicated expert supervised by a per-pixel pointmap regression loss against LiDAR. Extensive experiments conducted on challenging nuScenes and Bench2Drive datasets for open-loop and closed-loop evaluations, respectively, show the superiority of VLGA over counterpart VLA methods. In particular, on open-loop nuScenes, VLGA sets a new state of the art among VLA methods without ego status, with the lowest L2 (0.50\,m average) and 3-second collision rate (0.18\%). On closed-loop Bench2Drive, VLGA attains the state-of-the-art driving score of 79.08, +0.71 over the strongest prior VLA, at comparable efficiency and comfort.

03.
arXiv (CS.LG) 2026-06-19

Quantile of Means: A Bonus-Free Ensemble Method for Minimax Optimal Reinforcement Learning

arXiv:2606.20107v1 Announce Type: new Abstract: Optimal Reinforcement Learning (RL) algorithms typically rely on carefully constructed count-based uncertainty estimates to drive exploration. Although theoretically sound, such estimates are hard to compute in practical settings and therefore offer limited insight for designing exploration heuristics. Meanwhile, ensembling has emerged as a practical approach, but remains without theoretical justification. Building on a recent ensemble-based method for Multi-Armed Bandits, we propose a quantile-based ensemble method for finite-horizon Markov Decision Processes (MDPs). Our simple count-free approach achieves optimal variance-dependent regret bounds, providing theoretical grounding for ensemble-based exploration in RL.

04.
arXiv (CS.LG) 2026-06-16

Scalar-pathway fidelity improves physical accuracy in short-range equivariant interatomic potentials

arXiv:2606.15892v1 Announce Type: new Abstract: Accurate interatomic potentials enable molecular dynamics of materials, molecules, and interfaces beyond density-functional-theory length and time scales. Equivariant neural network potentials have improved the representation of local geometry. However, their deployable energy surfaces ultimately manifest through invariant scalar channels, whose aggregation and spectral resolution remain comparatively underexamined. Here we use Physics-Aware Neighborhood (PAN) pooling and Physics-Guided Spectral (PGS) mixers as controlled scalar-pathway probes: lightweight, symmetry-preserving modifications that act only on \(\ell=0\) channels while leaving the equivariant tensor backbone unchanged. Using MACE as a high-body-order mechanistic scaffold, PAN adds coordination-sensitive amplitude modulation, whereas PGS augments edge and readout scalar features with radial and tapered spectral bases. Across metallic Ag, covalent Si, a short-range ionic LiF/Li–F subset, and MD17/rMD17 molecules, this scalar-pathway correction reduces MACE force errors by 22–27\% and energy errors by 19–22\%; on systems with stress labels, stress errors decrease by 27–28\%, at approximately 5\% additional inference-FLOPs cost. Directionally consistent gains in Allegro and NequIP further indicate that the correction is portable across distinct short-range equivariant backbones, although effect sizes remain architecture-dependent. These results identify scalar-pathway fidelity as a practical design dimension for short-range equivariant interatomic potentials.

05.
arXiv (CS.CL) 2026-06-11

When More Documents Hurt RAG: Mitigating Vector Search Dilution with Domain-Scoped, Model-Agnostic Retrieval

Retrieval-augmented generation degrades when scaled to large, heterogeneous document collections, where dense similarity loses discriminative power, and top-k retrieval increasingly returns semantically similar but contextually incorrect chunks. We refer to this failure mode as vector search dilution. Even when using hybrid dense+sparse retrieval, we observed this firsthand in a deployed Wyoming Department of Transportation corpus, where scaling from 54 to 1,128 documents (88,907 chunks) reduced accuracy from 75% to below 40%. To address this dilution, we propose MASDR-RAG ( Multi-Agent Scoped Domain Retrieval for RAG) and evaluate it on 200 expert-validated queries across five LLM backbones, six corpora, and two index stacks. Our results indicate that domain scoping using organizational metadata is the key fix, significantly improving P@10 from 0.77 to 0.86 ($p < 0.05$). Furthermore, our investigation of multi-agent orchestration revealed that a high degree of configuration dependence results –creating what we call the precision-faithfulness paradox. Based on these varied outcomes, our practical recommendation is simple: scope first, then perform a single synthesis call, reserving full multi-agent orchestration for genuinely multi-domain corpora paired with native-tool-call backbones. Code and Data will be made public upon acceptance.

06.
PLOS Computational Biology 2026-06-01

A statistical framework for comparing epidemic forests

Authors:

by Cyril Geismar, Peter J. White, Anne Cori, Thibaut Jombart Inferring who infected whom in an outbreak is essential for characterising transmission dynamics and guiding public health interventions. However, this task is challenging due to limited surveillance data and the complexity of immunological and social interactions. Instead of a single definitive transmission tree, epidemiologists often consider multiple plausible trees forming epidemic forests. Various inference methods and assumptions can yield different epidemic forests, yet no formal test exists to assess whether these differences are statistically significant. We propose such a framework using a chi-square test and permutational multivariate analysis of variance (PERMANOVA). We assessed each method’s ability to distinguish simulated epidemic forests generated under different offspring distributions. While both methods achieved perfect specificity for forests with 100+ trees, PERMANOVA consistently outperformed the chi-square test in sensitivity across all epidemic and forest sizes. Implemented in the R package mixtree, we provide the first statistical framework to robustly compare epidemic forests.

07.
bioRxiv (Bioinfo) 2026-06-18

Benchmarking attention-based methods for vision transformers' interpretability in retinal fundus imaging

Deep learning models based on Vision Transformers (ViTs) have shown strong performance in retinal fundus imaging, but their interpretability remains poorly understood. In particular, attention-based attribution methods are widely used to explain ViT predictions, despite limited evaluation of their faithfulness and biological relevance in medical imaging. Here, we systematically benchmark four attention-based interpretability methods for RETFound, a retinal ViT-based foundation model, that we previously fine-tuned to predict 17 retinal vascular phenotypes from UK Biobank fundus images1. We compare raw attention, attention rollout, gradient-weighted attention rollout, and Chefer's hybrid relevance-based method using both qualitative visualisation and quantitative evaluation frameworks. To assess attribution faithfulness, we perform perturbation-based deletion and insertion experiments, quantifying changes in model predictions as highly attended image regions are progressively removed or restored. To evaluate biological specificity, we run structure-aware analyses combining attribution maps with vessel segmentation and artery-vein labels through the Relative ratio of Attention Intensity (RAI) metric. Across models, attribution maps differed substantially depending on the selected interpretability method, highlighting the need for rigorous quantitative evaluation. Among the evaluated approaches, gradient-weighted attention rollout consistently achieved the strongest perturbation performance and produced attribution maps most closely aligned with the anatomical definition of the predicted retinal traits. Furthermore, vessel-type specific models systematically concentrate attention on the corresponding vascular structures despite being trained using only a single scalar value per image as supervision. These findings demonstrate that attention-based attribution methods capture biologically meaningful vascular representations, while also revealing method-dependent variability in attribution behaviour. This work provides a quantitative framework for evaluating interpretability methods in medical imaging with annotated segmentation and contributes toward more transparent and biologically grounded medical AI systems.

08.
arXiv (CS.LG) 2026-06-19

A Model-Driven Approach for Developing Families of Reinforcement Learning Environments

arXiv:2606.20324v1 Announce Type: cross Abstract: Virtual training environments are software-intensive systems in which reinforcement learning (RL) agents learn, adapt, and demonstrate meaningful behavior. Virtual training environments offer a safe and cost-efficient alternative to training agents in real-world settings. However, to converge, most realistic RL problems require training in multiple, mostly similar but slightly different environments - i.e., families of environment variants. The typical development process of environment families is a labor-intensive and error-prone manual endeavor that does not scale well. To alleviate these issues, in this paper, we propose a model-driven approach for developing families of RL training environments. To obtain the family of environments, we develop an approach and prototype tool. In our approach, a hybrid genetic algorithm - a combination of population-based global search and heuristic local search - generates environment families. Mutations and constraints are expressed as model transformations and are operationalized into a search process by a state-of-the-art model transformation engine. We demonstrate the soundness of our approach in a wildfire mitigation scenario and curriculum learning - a particular learning paradigm that relies on environment families.

09.
arXiv (CS.LG) 2026-06-18

Mixed-Precision Communication-Avoiding SGD for Generalized Linear Models on GPUs

arXiv:2606.18463v1 Announce Type: cross Abstract: Distributed stochastic gradient descent (SGD) is limited by communication rather than computation, since each iteration requires an AllReduce across processes. Communication-avoiding SGD (CA-SGD) amortizes communication over $s$ iterations by replacing $s$ consecutive AllReduces with a single AllReduce of an $sb\times sb$ Gram matrix, trading more computation and bandwidth for fewer synchronization points. Modern GPUs with matrix hardware and reduced-precision formats offset this by accelerating the Gram GEMM and shrinking BF16 traffic. We study mixed-precision CA-SGD for generalized linear models on NVIDIA GPUs. Our finite-precision analysis decomposes the local rounding error of one CA-SGD outer iteration into nine independent precision choices, depending on the hardware only through its low-precision unit roundoffs, so the resulting recipes transfer in principle across GPU generations. The recipe stores the input matrix and margin vector in low precision, computes the Gram matrix from low-precision inputs with high-precision accumulation, communicates it in high precision, and performs the inner recurrence and weight updates in high precision. On NERSC Perlmutter A100 GPUs, mixed-precision CA-SGD matches FP32 SGD loss within $0.5\%$ on logistic, linear, and Poisson problems and reaches $5.1$–$6.8\times$ speedup over FP32 SGD on epsilon, SUSY, HIGGS, synth, and Poisson-synth. Our software is available at https://doi.org/10.5281/zenodo.20448273

10.
arXiv (CS.CV) 2026-06-16

MAF: Multimodal Adaptive Few-shot Prompting for Sentiment Analysis with MLLMs

Authors:

Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in understanding complex multimodal content. However, their performance in sentiment analysis exhibits acute sensitivity to prompt design, rendering static, uniformly applied prompts inherently suboptimal for capturing the nuanced multimodal cues that vary across inputs. To address this limitation, we propose a Multimodal Adaptive Few-Shot Prompting (MAF) framework, which dynamically retrieves and integrates query-relevant demonstrations to elicit the sentiment reasoning capabilities of MLLMs in a context-sensitive manner. MAF constructs a demonstration retrieval module that holistically encodes facial expressions, scene context, and textual semantics, with a lip movement amplitude detection mechanism introduced for accurate speaker identification in multi-person scenarios. Departing from conventional fixed-weight fusion, a lightweight coefficient generation network is trained to output query-conditioned fusion weights in real time, enabling weighted aggregation of multimodal similarity scores to retrieve the top-K most informative demonstrations. Prediction stability is further enhanced through majority voting over multiple candidate outputs generated by the MLLM. Extensive experiments on public benchmark datasets demonstrate that MAF achieves substantial and consistent performance improvements over the corresponding backbone variants and remains competitive with strong multimodal sentiment-analysis baselines.

11.
arXiv (quant-ph) 2026-06-15

All about quantum error correction: distillation, mitigation, self-correction and beyond

Authors:

arXiv:2606.14034v1 Announce Type: new Abstract: In this work, it is shown that many quantum error-manipulating techniques, such as distillation, error mitigation, and dynamical decoupling, are special cases of the most general framework for quantum error correction. This unifying perspective is achieved by extending quantum error correction to include state-adaptive and channel-adaptive settings, as well as multi-stage coding scenarios. Based on this insight, a model of self-correcting quantum memory is also proposed. This work clarifies the relationship among these techniques and illustrates, through explicit constructions, how the unified perspective can guide the design of reliable quantum information systems.

12.
arXiv (CS.AI) 2026-06-19

Systematic Study of Dysarthric Speech Recognition: Spectral Features and Acoustic Models

arXiv:2606.19793v1 Announce Type: cross Abstract: The challenge associated with recognizing dysarthric speech primarily arises from pronounced acoustic variability attributed to impaired articulatory precision. Past research has demonstrated improved recognition through the use of hybrid DNN/HMM sequence discriminative training. This paper presents a comprehensive investigation of various combinations of acoustic features tailored to different Acoustic Models, offering suitable feature selections for each. The incorporation of Pitch features notably improved recognition performance, especially for sentence recognition tasks involving dysarthric speech. Through a systematic examination of the TORGO database, we have demonstrated the potential to enhance the performance of the state-of-the-art Factorized Time Delay Neural Network (F-TDNN) model for recognizing dysarthric speech. Our methods, implemented with the F-TDNN model, resulted in a 4.65\% relative improvement in isolated word recognition and a 4.63\% relative improvement in sentence recognition for dysarthric speech, compared to previous research. This improvement effectively compensates for speech variability, attributable to our deliberate selection of the number of overlapping frames between consecutive training example chunks.

13.
arXiv (CS.LG) 2026-06-16

Photon: Federated LLM Pre-Training

arXiv:2411.02908v2 Announce Type: replace Abstract: Scaling large language models (LLMs) demands extensive data and computing resources, which are traditionally constrained to data centers by the high-bandwidth requirements of distributed training. Low-bandwidth methods like federated learning (FL) could enable collaborative training of larger models across weakly-connected GPUs if they can effectively be used for pre-training. To achieve this, we introduce Photon, the first complete system for federated end-to-end LLM training, leveraging cross-silo FL for global-scale training with minimal communication overheads. Using Photon, we train the first federated family of decoder-only LLMs from scratch. We show that: (1) Photon can train model sizes up to 7B in a federated fashion while reaching an even better perplexity than centralized pre-training; (2) Photon model training time decreases with available compute, achieving a similar compute-time trade-off to centralized; and (3) Photon outperforms the wall-time of baseline distributed training methods by 35% via communicating 64x-512xless. Our proposal is robust to data heterogeneity and converges twice as fast as previous methods like DiLoCo. This surprising data efficiency stems from a unique approach combining small client batch sizes with extremely high learning rates, enabled by federated averaging's robustness to hyperparameters. Photon thus represents the first economical system for global internet-wide LLM pre-training.

14.
arXiv (CS.CL) 2026-06-18

As Easy as Rocket Science: Assessing the Ability of Large Language Models to Interpret Negation in Figurative Language

Figurative language and negation are two areas that challenge current language models, however, both are widely used throughout written and spoken language. Large language models (LLMs) are also widely used in everyday contexts where they cannot necessarily be tuned for a specific dataset. It is therefore essential to understand the ability of LLMs to correctly interpret text that includes both negation and figurative language. To investigate this, we develop a set of new annotations to an existing dataset of figurative language, and test a range of language models on the dataset. We find that the combination of negation and figurativeness can present a particular challenge, and that performance overall and across different negation types is particularly dependent on the prompt style used.

15.
arXiv (CS.CV) 2026-06-16

Mitigating Object Hallucinations in LVLMs via Attention Imbalance Rectification

Object hallucination in Large Vision-Language Models (LVLMs) severely compromises their reliability in real-world applications, posing a critical barrier to their deployment in high-stakes scenarios such as autonomous driving and medical image analysis. Through systematic empirical investigation, we identify that the imbalanced attention allocation, both across modalities (i.e., vision and language) and within modalities (among individual tokens), exhibits a strong causal correlation with the occurrence of object hallucination. Leveraging this insight, we introduce a novel concept termed attention imbalance, which not only quantifies the degree of attention disparity but also visually delineates the underlying patterns (e.g., over-attentiveness to irrelevant language tokens or under-attentiveness to discriminative visual features) that drive object hallucination. To mitigate object hallucination, we further propose Attention Imbalance Rectification (AIR), a lightweight decoding-time intervention method that reallocates attention weights and adjusts attention distributions to rectify modality-wise and token-wise imbalances. Extensive evaluations on four mainstream LVLMs and three benchmarks (CHAIR, POPE, and MM-Vet) with seven baselines demonstrate that AIR consistently reduces object hallucination rates, achieving up to a 35.1% reduction compared to the baselines, while improving up to 15.9% of LVLMs' general capability across diverse vision-language tasks.

16.
arXiv (CS.LG) 2026-06-12

Robust State-Conditional Feature-Weighted Jump Models for Temporal Clustering

arXiv:2606.13146v1 Announce Type: cross Abstract: We propose a robust feature-weighted jump model for time-dependent clustering. A penalty is used to encourage smoothness of transitions over time, while robustness is achieved through the use of a Tukey's biweight loss function. An additional parameter controls the variability of feature weights across states, allowing the model to assign state-specific relevance to each feature. We illustrate in simulation how the method accurately recovers the true cluster sequence and reliably identifies relevant features, outperforming competing approaches, particularly in the presence of outliers. We conclude with two empirical applications, one on the number of conflict-related homicides in Kosovo in the period 1998-2000, and another on macroeconomic performance of twelve European countries in the period 1949-2024.

17.
arXiv (math.PR) 2026-06-11

Feynman–Kac formula for the heat equation with a one-center point interaction in $d=3$

arXiv:2606.11677v1 Announce Type: new Abstract: We study Schrödinger operators with a one-center point interaction, formally defined by \begin{align*} -\Delta_\alpha=-\Delta+\alpha\,\delta_0(\cdot), \end{align*} for $\alpha\in\mathbb{R}$, and the associated heat equation \begin{align} \partial_t u=\tfrac{1}{2}\Delta_{\alpha} u,\quad u(0,x)=u_0(x)\in C_c^{\infty}(\mathbb{R}^3\setminus\{0\}).\label{eq:HEapp} \end{align} Here $\Delta$ denotes the Laplacian (self-adjoint on $L^2(\mathbb{R}^3)$) and $\delta_x$ the Dirac measure at $x$. The operator $-\Delta_\alpha$ can be realized either as a self-adjoint extension of $-\Delta|_{C_0^{\infty}(\mathbb{R}^3\setminus\{0\})}$ in $L^2(\mathbb{R}^3)$, or as the norm-resolvent limit of $-\Delta+\lambda_\varepsilon V(\cdot/\varepsilon)$ for suitable $\lambda_\varepsilon$ and $V:\mathbb{R}^3\to\mathbb{R}$. In this paper we construct, for each $t>0$ and $x\in\mathbb{R}^3\setminus\{0\}$, a probability law on path space and a normalizing function $G_t^\alpha(x)$ giving the following probabilistic representation of the solution to the associated equation: \begin{align*} u(t,x)=G_t^\alpha(x)\,\mathbb{E}\bigl[u_0\bigl(W^{t,x}(t)\bigr)\bigr], \end{align*} where $\{W^{t,x}(s):0\le s\le t\}$ is a continuous process depending on $(t,x,\alpha)$. The result provides a Feynman–Kac type formula for the heat equation with a one-point interaction in three dimensions.

18.
medRxiv (Medicine) 2026-06-17

Clinical Study Protocol of the 'Biomarkers of Severity of COVID-19 Patients' (BIOMARCOVID) Project

Introduction The coronavirus disease 2019 (COVID-19) pandemic has challenged health care systems worldwide, in certain areas exceeding hospital capacities and human resources. This has underscored the importance of having better tools to predict the outcome of potentially severe respiratory infections such as SARS-CoV-2. Predicting COVID-19 severity may allow physicians to better manage ICU beds and increase the chances of patient survival through appropriate management. During the toughest months of the pandemic, most physicians tried to identify patients that might develop severe forms based primarily on clinical features on admission (e.g., BMI, age). In this context, significant research has focused on identifying comorbidities, clinical manifestations, and routine blood biomarkers to predict disease severity. However, despite the demonstrated value of untargeted metabolomics in assessing severity, limited data exist on its use for identifying novel metabolite biomarkers that could improve both the sensitivity and specificity of outcome prediction. Our goal is to identify metabolite biomarkers that could enhance the predictive accuracy of standard medical biology data and clinical parameters. Methods and analysis This is a retrospective, observational, monocentric cohort study conducted at the Centre Hospitalier Universitaire Grenoble Alpes (CHUGA). The maximum number of eligible patients admitted for PCR-confirmed COVID-19 between March and December 2020 will be included. Severity outcome is defined using the WHO 10-category ordinal scale (mild: categories 4-5; severe: >5). Blood samples were collected within 48 hours of admission and analyzed for 62 routine blood tests and untargeted multiplatform LC-MS/MS metabolomics across four national platforms. Statistical analysis will include logistic regression with variable selection for the primary aim, and multi-block chemometric integration of clinical, biological, and metabolomics data as a secondary aim. Ethics and dissemination A study steering committee has been formed to ensure the accuracy of the collected data by thoroughly reviewing it prior to the data lock. All aspects of the study comply with ethical standards, including approval by the CHUGA institutional review board and adherence to CNIL Reference Methodology MR004 for the protection of participants' rights, privacy, and confidentiality. This study is registered on the French Health Data Hub (number F20210218154851). Results will be disseminated through peer-reviewed publications, presentations at national and international scientific and clinical conferences, and reports shared with key healthcare system stakeholders.

19.
arXiv (math.PR) 2026-06-16

Asymptotic behavior of some strongly critical decomposable 3-type Galton–Watson processes with immigration

arXiv:2406.09852v2 Announce Type: replace Abstract: We study the asymptotic behavior of a critical decomposable 3-type Galton-Watson process with immigration when its offspring mean matrix is triangular with diagonal entries 1. It is proved that, under second or fourth order moment assumptions on the offspring and immigration distributions, a sequence of appropriately scaled random step processes formed from such a Galton-Watson process converges weakly. The limit process can be described using independent squared Bessel processes $({\mathcal X}_{t,1})_{t\geq0}$, $({\mathcal X}_{t,2})_{t\geq0}$, and $({\mathcal X}_{t,3})_{t\geq0}$, the linear combinations of the integral processes of $({\mathcal X}_{t,1})_{t\geq0}$ and $({\mathcal X}_{t,2})_{t\geq0}$, and possibly the 2-fold iterated integral process of $({\mathcal X}_{t,1})_{t\geq0}$. The presence of the 2-fold iterated integral process in the limit distribution is a new phenomenon in the description of asymptotic behavior of critical multi-type Galton-Watson processes with immigration. Our results complete and extend some results of Foster and Ney (1978) for some strongly critical decomposable 3-type Galton-Watson processes with immigration.

20.
arXiv (CS.CL) 2026-06-12

LEDGER: A Long-Context Benchmark of Corporate Annual Reports for Grounded Financial Retrieval and Extraction

Finance reporting is a natural proving ground for large language models, and the very-long-context capabilities of recent models across all sizes make rigorous evaluation in this domain an increasingly pressing need. Yet most public financial resources reduce the task to plain-text SEC 10-K filings paired with a handful of question-answer items. We release LEDGER (Long-context Evaluation of Documents for Grounded Extraction and Retrieval), a corpus of 4,999 digitized corporate annual reports - full documents with figures, tables, and narrative, not just regulatory filings. Each report is labeled with 31 consolidated financial KPIs to be extracted and linked to the market's reaction at the earnings date. From this data we derive three evaluation benchmarks spanning the difficulty spectrum: a pure page-level KPI retrieval task with TREC-style relevance judgments over 118,048 questions in natural language, a conversational "needle-in-a-haystack" single-value lookup, and a full KPI extraction task, both from long, numerically dense reports. We additionally provide human OCR-quality annotations with inter-annotator agreement and the complete extraction, validation, and scoring toolchain. We further demonstrate the dataset's research utility with a case study linking CEO-letter rhetoric to post-publication market impact.

21.
arXiv (CS.AI) 2026-06-17

A homotopy-type-theoretic generalization of neurosymbolic inference

arXiv:2606.17851v1 Announce Type: new Abstract: A wide range of neurosymbolic (NeSy) systems compute one functional: a belief-weighted sum of a logical quantity over a space of $\sigma$-structures, of which weighted model counting, fuzzy logic, and probabilistic logic are special cases. This account is built on sets, and a set deliberately forgets two things that are important for NeSy: when two $\sigma$-structures are the same up to a symmetry of the theory, and how many distinct proofs witness a query. Replacing the underlying sets by types, in the sense of homotopy type theory, preserves this information, and turns this functional into a belief-weighted homotopy cardinality, a notion of size that counts each object in inverse proportion to its symmetries. We develop the framework from scratch for NeSy systems, prove a conservativity theorem that recovers the classical functional when symmetries are trivial, and show that the symmetry our framework exposes is exactly the one behind reasoning shortcuts. The payoff is concrete: the shortcut-aware concept posterior that recent methods reach by ensembling or expressive density estimation is the only symmetry-invariant point of the confusion-set simplex, computable in closed form by averaging a single model over the symmetry group. On MNIST reasoning-shortcut benchmarks this single-model wrapper is better calibrated than a diversity-trained ensemble, while leaving label accuracy and identifiable concepts untouched. Code is freely available at https://github.com/bio-ontology-research-group/hott-nesy.

22.
PLOS Computational Biology 2026-06-22

Heterogeneous suppressive effect of <i>Wolbachia</i> incompatible insect technique coupled with sterile insect technique across time and historical <i>Ae. aegypti</i> abundance - using distributional synthetic controls

Authors:

by Yichen Zhai, Chia-Chen Chang, Zhiyong Xi, Cheong Huat Tan, Lee Ching Ng, Jue Tao Lim Background Biological control tools such as Wolbachia incompatible-insect technique, are a promising class of interventions to modify and suppress Aedes aegypti mosquitoes to reduce risk of Aedes-borne diseases. Due to the spatial nature of the intervention, intervention effects can be spatio-temporally heterogeneous. Yet, most evaluations of field-based technologies rely on average treatment effects, which preclude characterization and understanding of treatment effect heterogeneities and the factors influencing it. Methods Here, we developed a causal inference framework using distributional synthetic controls to explicitly account for spatio-temporal trap-level mosquito abundance data to ascertain the entomological efficacy of Wolbachia in suppressing Ae. aegypti abundance. This method is able to construct counterfactual distributions of intervened areas, provide detailed comparisons to actual distributions and quantify treatment effects of the intervention on mosquito abundance over different quantiles. By employing our framework to trap-level mosquito abundance data from 57,990 unique mosquito traps routinely maintained and measured twice a week, and a large-scale field trial of Wolbachia incompatible-insect technique coupled with sterile insect technique (IIT-SIT) in Singapore, we (1) quantified heterogeneous treatment effects for IIT-SIT across the time-since-intervention, over the traps’ historical mosquito abundance, over calendar time, (2) quantified whether elimination of wild-type Aedes aegypti was possible in intervention locations and (3) addressed if suppressive effects in spillover locations adjacent to directly intervened locations were heterogeneous. Results IIT-SIT interventions led to a strong suppressive effect on adult Aedes aegypti abundance. From the onset of intervention in directly treated locations, sector-specific intervention effectiveness (IE) ranged from 24.04% in the earliest treatment period, and reached 86.08% in the latest treatment period. Raw reductions in aegypti abundance were also found to increase over time as sectors were intervened over longer time periods. In spillover sectors, IE was lower in magnitude and more variable, but average IE reached a maximum of 78.08% in 2-years post-treatment. Wolbachia interventions also led to an increase in the percentage of traps recording no mosquitoes from 6.8% at the start of intervention to 33.01% 124-weeks post-intervention. We found that IE was higher in sectors with lower historical mosquito abundance. However, IE converged across sectors with different historical mosquito abundance as intervention time increased. Conclusion This study revealed spatial heterogeneities in suppressing wild-type female Ae. aegypti by IIT-SIT and provided strong evidence that IIT-SIT can drastically suppress wild-type Ae. aegypti populations despite heterogeneous treatment effects over time.

23.
medRxiv (Medicine) 2026-06-18

The Effectiveness of aromatherapy and its supportive Interventions on anxiety and pain among breast cancer patients: A systematic review and meta-analysis

Introduction: Breast cancer treatments are often associated with pain and anxiety, which can hinder physical functioning and overall quality of life, even after treatment. Complementary therapies, such as aromatherapy, can be used to alleviate pain and reduce anxiety in breast cancer patients. This project aimed to synthesize current global evidence on the effectiveness of aromatherapy. Method: This systematic review followed the PRISMA 2020 guidelines, with a comprehensive, systematic search conducted in PubMed, CINAHL, Cochrane Library, and SCOPUS for randomized controlled trials (RCTS) published from 2015 to 2025. Eligible studies included adult women breast cancer surgery patients who received aromatherapy during various periods of breast cancer. Where possible, data from the included studies were pooled using meta-analysis. GRADE approach was used to assess certainty of findings. Results: The search yielded 84 studies. Out of these, six were included in this review. On average, aromatherapy reduces pain and anxiety scores by 0.79 (standard mean difference (SMD)=-0.79, 95% CI -1.42, -0.16) and 0.53 (SMD=-0.53, 95 CI=-0.90, -0.16) units, respectively, compared to control condition [Low-quality of evidence]. The combination of aromatherapy with music reduces pain and anxiety by 1.26 (SMD= -1.26, 95 CI=-1.65, -0.87) and 1.08 (SMD = -1.08, 95 % CI: -1.45, -0.70) units respectively compared to standard care [Low-quality of evidence]. Conclusion: There is a potential role for the use of aromatherapy and music therapy, to alleviate anxiety and pain, especially for non-preoperative anxiety and pain. Further research is needed to inform the integration of aromatherapy into the management of anxiety and pain.

24.
arXiv (CS.CL) 2026-06-11

RLCSD: Reinforcement Learning with Contrastive On-Policy Self-Distillation

On-policy self-distillation (OPSD) provides dense, token-level supervision for reasoning models by aligning a model's own distribution with the distribution it produces under privileged context, typically a verified solution. However, we show that the learning signal drawn from this distributional gap concentrates on style tokens rather than task-bearing ones, as the hinted model tends to produce more direct, shorter outputs. We term this pathology privilege-induced style drift, which destabilizes training or causes response length to shrink. To address this, we propose RLCSD (Reinforcement Learning with Contrastive on-policy Self-Distillation), which mitigates this drift by contrasting the teacher-student gap under a correct hint against that under a wrong hint, suppressing the style shift that conditioning on a hint tends to induce regardless of correctness, and yielding a signal that is more concentrated on task-bearing tokens. Experiments on Qwen3 (1.7B/4B/8B) and Olmo-3-7B-Think across mathematical and logical reasoning show that RLCSD consistently outperforms GRPO and prior OPSD methods. We further show that the contrastive principle is general: it plugs into existing OPSD methods to improve them, and its underlying insight extends to the broader cross-model on-policy distillation setting.

25.
arXiv (CS.AI) 2026-06-12

SymQNet: Amortized Acquisition for Low-Latency Adaptive Hamiltonian Learning

arXiv:2606.12808v1 Announce Type: cross Abstract: Adaptive Hamiltonian learning is central to calibrating and characterizing quantum devices. In an adaptive controller, choosing the next experiment is itself a computation. Bayesian design rules are recomputed after every posterior update, and that step can take seconds. Across hundreds of shots, those seconds become a significant wall-clock cost for adaptivity. We introduce SymQNet, an amortized reinforcement-learning approach for low-latency adaptive Hamiltonian learning. SymQNet learns a posterior-conditioned acquisition policy offline, then uses a fast policy forward pass online while retaining Bayesian posterior feedback. On transverse-field Ising benchmarks, SymQNet substantially reduces acquisition latency relative to bounded Fisher-information search and bounded two-step Bayesian active learning by disagreement (BALD). At five qubits, it reduces acquisition-only decision latency by $47.1\times$ and $72.6\times$ relative to these online baselines; at twelve qubits, full simulated steps take $1.02$ s for SymQNet versus $13.27$ s for bounded two-step BALD. Overall, we show that learned acquisition can make adaptive Hamiltonian learning practical for repeated low-latency workloads.