Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (quant-ph) 2026-06-16

Interaction-enabled topological pumping of Rydberg electrons

arXiv:2606.15126v1 Announce Type: cross Abstract: Topological pumping is a paradigmatic realization of quantized transport in band systems, yet its fate in strongly correlated regimes, especially with long-range interactions, remains largely unexplored. Here we report the experimental observation of interaction-enabled topological pumping of correlated Rydberg electrons in a synthetic lattice. We show that dipolar exchange interactions induce a controllable shift of the underlying topological singularity in parameter space, such that a fixed pumping trajectory can be driven through successive topological transitions by tuning the interaction strength alone. This leads to the emergence and breakdown of quantized transport. The observations are consistent with an effective Rice-Mele description with interaction-renormalized onsite potentials and are supported by characterizing the adiabaticity and robustness to control trajectory imperfections. Our results establish a platform for exploring interaction-controlled topological transport beyond perturbative regimes and open a route toward engineering correlated topological matter in synthetic quantum systems.

02.
arXiv (CS.AI) 2026-06-19

Wisdom of Committee: Diverse Distillation from Large Foundation Models and Domain Experts

arXiv:2402.14035v4 Announce Type: replace-cross Abstract: Knowledge distillation from foundation models to compact domain models is challenging due to substantial gaps in capacity, architecture, and modality. For example, in our experiments, distilling from a 76M-parameter language model to a 2M-parameter recommender closes less than 40% of the performance gap between the undistilled student and the teacher. We show that introducing domain-specific experts – which share the student's architectural characteristics – alongside the foundation model as a diverse teacher committee significantly improves transfer. However, standard multi-teacher methods fail to exploit this diversity: naively combining heterogeneous teachers can degrade performance below single-teacher distillation. To address this, we propose DiverseDistill, an interactive distillation framework that employs a learnable Question-Answer mechanism to generate teacher-conditioned queries and align heterogeneous teacher outputs into the student's representation space. Unlike methods requiring gradient-based co-optimization or architectural modification of teachers, DiverseDistill operates with frozen teachers using only forward-pass inference through their intermediate layers: no parameter updates, no co-training, and no architectural surgery. A dynamic teacher importance mechanism further reduces training cost by filtering low-relevance teachers per sample (e.g., ~30% fewer forward passes with no quality loss for recommendation tasks), while the entire Distillation Module is discarded after training, adding zero inference overhead. Evaluations on recommendation (38x compression) and vision (3.6x compression) tasks demonstrate that DiverseDistill recovers 73-114% of the teacher-student performance gap, consistently outperforming all single- and multi-teacher baselines.

03.
arXiv (CS.CL) 2026-06-17

Do We Still Need Humans in the Loop? Comparing Human and LLM Annotation in Active Learning for Hostility Detection

Instruction-tuned LLMs can annotate thousands of instances at low cost. This raises two questions for active learning (AL): can LLM labels replace human labels within the AL loop, and does AL remain necessary when entire corpora can be cheaply labeled? We investigate both on a new dataset of 277,902 German political TikTok comments (25,974 LLM-labeled, 5,000 human-annotated), comparing LLM and human annotation across seven conditions, four encoders, and 10 random seeds. Under a two-question interface that mirrors the human annotation task, LLM annotation at scale outperforms human-supervised classifiers at roughly one-tenth the cost (\$28 for GPT-5.2 Batch API vs. \$316 for Prolific). The advantage holds for both a closed-source (GPT-5.2) and an open-weight (Qwen3.5-122B-10B) LLM, is robust under soft-label evaluation, and is unlocked specifically by the two-question decomposition; a holistic single-prompt baseline only ties with human supervision. AL provides no reliable advantage over random sampling under either LLM annotator. However, error structure varies sharply: only GPT-5.2 under the two-question interface produces classifiers with near-human FP/FN balance, while other LLM variants over-flag border-control and economic competition discourse. We release the dataset and code.

04.
arXiv (CS.LG) 2026-06-16

Data-driven Control with Real-time Uncertainty Compensation for Multi-Fuel Engines

arXiv:2606.16171v1 Announce Type: cross Abstract: Multi-fuel compression ignition (CI) engines offer superior power density and fuel flexibility. However, achieving consistent and optimal combustion phasing across a wide range of operating conditions remains a major challenge, particularly in the presence of modeling uncertainties. This paper presents a novel, data-driven real-time uncertainty compensation framework for combustion control in multi-fuel CI engines. The proposed approach introduces a pseudo-engine speed that enables dynamic adaptation of control inputs in response to uncertainty affecting the engine. To model the underlying combustion process, a Gaussian Process Regression (GPR) model is first trained on available input-output data, capturing the nonlinear and fuel-dependent behavior across varying operating conditions. Control inputs are then synthesized through model inversion of the learned GPR surrogate and augmented with an uncertainty compensator designed to mitigate deviations caused by dynamic variations in operating conditions and model inaccuracies. This integrated control strategy allows for real-time input corrections within a finite number of combustion cycles. Theoretical analysis establishes finite-time convergence guarantees for the proposed controller. Simulation results demonstrate that the proposed method steers the combustion phasing to the desired value in real-time, providing a scalable and adaptive control solution for multi-fuel CI engine operation.

05.
arXiv (math.PR) 2026-06-17

Cutoff for asymmetric shelf shuffle

arXiv:2606.18039v1 Announce Type: new Abstract: A mechanical shuffler consists of $m$ shelves. A deck of $n$ cards, arranged in increasing order, is dealt from the bottom sequentially. Each card is assigned a shelf uniformly at random and placed on the top (bottom) of the existing pile with probability $p$ ($1-p$) independently. We refer to this as asymmetric shelf-shuffle. We find the law $\nu_{n, m}^{(p)}$ of the permutation induced by the asymmetric shelf-shuffle and show that the pair consisting of the number of descents and the number of valleys is a sufficient statistic. This generalizes a result of Diaconis, Fulman, and Holmes (Ann. Appl. Prob., 2013) corresponding to the case $p=1/2$. For $p=1/2$, Chen and Ottolini (ECP, 2025) established the cutoff in the total variation distance near $\lfloor n^{5/4}\rfloor$. We establish the cutoff for the asymmetric shelf shuffle. Let $\nu_n$ be the uniform measure on the set of all permutations $S_n$ of $\{1, \ldots, n\}$. For a fixed $p\neq 1/2$ and $c>0$, we show that \[\operatorname{TV}\left(\nu_{n, \lfloor cn^{3/2}\rfloor }^{(p)}, \nu_n\right)=1-2\Phi\left(-\frac{|2p-1|}{4\sqrt{3}c}\right)+O_{c, p}(n^{-1/2})\;.\] We also establish the cutoff in the separation distance near $m\approx n^{2}$ and in the relative entropy near $m=n^{3/2}$. In both cases, we also obtain the cutoff profile explicitly.

06.
arXiv (CS.AI) 2026-06-16

Learning aligned EEG representations with subject-specific encoders

arXiv:2606.16462v1 Announce Type: cross Abstract: Cross-subject EEG decoding promises more training data, but it also exposes neural networks to strong inter-subject distribution shifts. We study whether task supervision and architecture alone can learn subject-aligned representations. We replace a shared EEG encoder with subject-specific encoders followed by a common classifier, and compare this hybrid model with standard EEGNet, AttentionBaseNet, and CTNet baselines with Euclidean Alignment (EA) on four motor-imagery datasets. EA improves shared encoders by recentering subject covariances, but the hybrid encoder largely internalises this role: validation-loss curves and latent-distance analyses change little when EA is removed. Subject-specific heads increase class distinctiveness and place each subject close to its own latent manifold, improving most subjects while leaving a method-sensitive subset. These results support subject-specific encoders as a learned alignment mechanism for EEG decoding and identify head selection for unseen subjects as the remaining bottleneck.

07.
arXiv (quant-ph) 2026-06-16

Quantum simulation of the Liouville equation in classical mechanics with discontinuous potential via Schrödingerization

arXiv:2606.15066v1 Announce Type: new Abstract: We develop quantum simulation algorithms for the Liouville equation of classical mechanics with discontinuous potential. Such discontinuities represent potential barriers at which classical particles undergo energy preserving transmission or reflection, and the resulting interface conditions must be incorporated into the numerical flux. We combine Hamiltonian-preserving schemes by Jin and Wen in Commun. Math. Sci. 3(3), 285-315 (2005) with the Schrödingerization method, which embeds the resulting nonunitary semi-discrete dynamics into a unitary Schrödinger type system in one additional auxiliary variable [arXiv:2212.14703, arXiv:2212.13969]. For one-, two-, and $n$-dimensional problems with grid aligned interfaces, we construct sparse matrix representations of the transmission and reflection fluxes using step and hat functions, derive the corresponding Hamiltonians of the Schrödingerized systems, and analyze their sparse-access query complexity. In the sparse-access oracle model, the resulting algorithms have a polynomial dependence on the inverse accuracy and avoid the exponential dependence on the phase-space dimension suffered by classical grid based Hamiltonian-preserving schemes, up to the cost of implementing the oracles and the postselection overhead. We also describe the postselected recovery of the physical solution state and the quantum readout of macroscopic observables such as density and averaged velocity through overlap estimation. Numerical experiments based on classical simulation of the Schrödingerized dynamics validate the proposed formulation and illustrate the correct transmission/reflection behavior at potential barriers.

08.
medRxiv (Medicine) 2026-06-18

Biomedical Capacity, Governance, and Health Security: A Dominican Republic Research Analysis of Stakeholder Perspectives

The COVID-19 pandemic exposed critical vulnerabilities in globally concentrated biomedical supply chains and accelerated interest in nearshoring and hemispheric health-security strategies. The Dominican Republic, already the third-largest medical device exporter in Latin America, occupies a strategically significant but institutionally constrained position within this realignment. This study evaluates stakeholder perceptions of the principal opportunities and barriers affecting biomedical ecosystem development in the Dominican Republic, with particular attention to governance, workforce capacity, and value-chain upgrading pathways. Methods. A concurrent mixed-methods design was employed, integrating a cross-sectional electronic survey of 142 purposively sampled domain experts (administered September-December 2025) with a qualitative executive consultation with senior government and industry leaders. Survey analyses combined descriptive statistics, one-sample t-tests against the scale neutral midpoint, chi-square goodness-of-fit tests, Friedman non-parametric ranking, Spearman rank correlations, and exploratory linear and logistic multivariable regression. Qualitative responses were analyzed using a framework approach grounded in the Triple Helix model of innovation systems. Results. Perceived government support was significantly below neutral (mean = 2.67, SD = 1.12; p = 0.034). Workforce shortages (83.3%) and weak academia-industry collaboration (71.4%) were the most frequently endorsed barriers ({chi}2(5) = 18.7, p = 0.002). Regulatory modernization (88.1%) and workforce development (85.7%) ranked as the highest-priority policy levers (Friedman p = 0.005). Clinical trials and contract research organization services were the dominant sub-sector priority (76.2%, binomial p < 0.001). In multivariable analysis, perceived government support, talent availability, and confidence in IP protection jointly explained 46% of the variance in sector competitiveness (R2 = 0.46, p < 0.001). Strong majority support existed for a formal public-private biomedical coordination authority (73.8%, p < 0.001).Conclusion. Institutional credibility and advanced human capital–rather than geography or market access–are the perceived binding constraints on the Dominican Republics biomedical trajectory. Regulatory modernization, targeted workforce investment, and the establishment of a national biomedical coordination authority represent the highest-leverage interventions for positioning the country as a hemispheric hub for biomedical manufacturing, clinical research, and health security.

09.
arXiv (CS.AI) 2026-06-16

TS-Memory: Plug-and-Play Memory for Time Series Foundation Models

arXiv:2602.11550v2 Announce Type: replace-cross Abstract: Time Series Foundation Models (TSFMs) achieve strong zero-shot forecasting through large-scale pre-training, but adapting them to downstream domains under distribution shift remains challenging. Existing solutions face a trade-off: Parametric Adaptation can cause catastrophic forgetting and requires costly multi-domain maintenance, while Non-Parametric Retrieval improves forecasts but incurs high inference latency due to datastore search. We propose Parametric Memory Distillation and implement it as TS-Memory, a lightweight memory adapter that augments frozen TSFMs. TS-Memory is trained in two stages. First, we construct an offline, retrieval-leakage-safe kNN teacher that synthesizes confidence-aware quantile targets from retrieved futures. Second, we distill this retrieval-induced distributional correction into a lightweight memory adapter via confidence-gated supervision. During inference, TS-Memory fuses memory and backbone predictions with constant-time overhead, enabling retrieval-free deployment. Experiments across diverse TSFMs and benchmarks demonstrate consistent improvements in both point and probabilistic forecasting over representative adaptation methods, with efficiency comparable to the frozen backbone. Code: https://github.com/sisuolv/TS-Memory.

10.
arXiv (math.PR) 2026-06-18

Milstein-type Schemes for Hyperbolic SPDEs

arXiv:2512.19647v4 Announce Type: replace-cross Abstract: This article studies the temporal approximation of hyperbolic semilinear stochastic evolution equations with multiplicative Gaussian noise by Milstein-type schemes. We take the term hyperbolic to mean that the leading operator generates a contractive, not necessarily analytic $C_0$-semigroup. Optimal convergence rates are derived for the pathwise uniform strong error \[ E_h^\infty := \Big(\mathbb{E}\Big[\max_{1\le j \le M}\|U_{t_j}-u_j\|_X^p\Big]\Big)^{1/p} \] on a Hilbert space $X$ for $p\in [2,\infty)$. Here, $U$ is the mild solution and $u_j$ its Milstein approximation at time $t_j=jh$ with step size $h>0$ and final time $T=Mh>0$. For sufficiently regular nonlinearity and noise, we establish strong convergence of order one, with the error satisfying $E_h^\infty\lesssim h\sqrt{\log(T/h)}$ for rational Milstein schemes and $E_h^\infty \lesssim h$ for exponential Milstein schemes. This extends previous results from parabolic to hyperbolic SPDEs and from exponential to rational Milstein schemes. Moreover, root-mean-square error estimates are strengthened to pathwise uniform estimates. Numerical experiments validate the convergence rates for the stochastic Schrödinger equation. Further applications to Maxwell's and transport equations are included.

11.
medRxiv (Medicine) 2026-06-11

Ferritin across long-term conditions in England: cross-sectional primary care study

Background Iron deficiency (ID) is a readily treatable condition once identified. Ferritin is the primary diagnostic marker, but cut-offs vary and inflammation complicates interpretation in patients with long-term conditions (LTCs). Aim To describe ferritin distribution and the prevalence of threshold-defined low ferritin in adults with and without LTCs in primary care. Design and setting Cross-sectional observational study using routinely collected electronic health records from a national primary care database in England (1st January 2015 to 31st December 2021). Method Adults with >1 ferritin test in Clinical Practice Research Datalink (CPRD) Aurum were included. LTCs were identified using validated primary-care code lists. Outcomes included ferritin distribution and threshold-defined ID prevalence using World Health Organization (WHO) (

12.
Nature Medicine 2026-06-10

Dual-target gene therapy in Parkinson’s disease: a multicenter phase 1 trial

作者:

Restoring striatal dopamine synthesis is a promising gene therapy strategy for Parkinson’s disease. Previous adeno-associated virus-mediated aromatic L-amino acid decarboxylase (AADC) monotherapies remain dependent on exogenous levodopa, whereas multigene delivery is constrained by strict adeno-associated virus packaging limits. A ‘dual approach’ targeting the two rate-limiting enzymes, tyrosine hydroxylase (TH) and AADC, offers the potential for autonomous dopamine synthesis. We report the 12-month primary safety and tolerability outcomes of a multicenter, open-label, dose-escalation, phase 1 trial evaluating BBM-P002, a new adeno-associated virus vector—AAVT42—codelivering constitutively active TH and AADC. Ten participants with moderate-to-advanced Parkinson’s disease were enrolled and received bilateral intraputaminal infusions across doses of 4.0 × 1011 vg (Cohort 1; n = 1), 6.0 × 1011 vg (Cohort 2; n = 2), 1.0 × 1012 vg (Cohort 3; n = 2) and 1.2 × 1012 vg (Cohort 4; n = 5). The trial achieved its primary outcome, as BBM-P002 demonstrated a favorable safety and tolerability profile within 12 months post-treatment. No dose-limiting toxicities or drug-related serious adverse events occurred. A total of 23 adverse events were reported, all judged unrelated to BBM-P002 and primarily mild and transient. Systemic toxicity and clinically meaningful immunogenicity were absent. In conclusion, intraputaminal delivery of BBM-P002 was safe and well tolerated in this phase 1 trial, supporting continued clinical development. ClinicalTrials.gov registration: NCT05822739 . Phase 1 results reveal that BBM-P002, a dual-target gene therapy co-delivering TH and DDC, is safe and well tolerated in Parkinson’s disease, with 12-month motor improvements signaling therapeutic potential.

13.
bioRxiv (Bioinfo) 2026-06-18

Predicting optimal growth temperatures of bacteria using learned structural information from a single protein

Temperature is a fundamental determinant of bacterial physiology and ecology. Optimal growth temperature (OGT) is highly variable across species, contributing to differences in where and when species are most likely to thrive. Although the OGTs for most bacteria remain unknown, the increasing availability of genomes from uncultivated and cultivated taxa has made it advantageous to build genomic, cultivation-independent models to infer OGT. However, pre-existing genomic models often lack the generalizability and mechanistic grounding required for robust inferences of OGT. We propose a novel framework for predicting bacterial OGT which uses learned protein structural signatures of thermal adaptation. We hypothesize that biophysical tradeoffs which dictate enzymatic functions across variable temperatures provide a more robust empirical basis for OGT prediction than broad genomic features. Our OGT-predicting model, ROSEATE, is based on a single gene, adenylate kinase (ADK), that encodes for a ubiquitous enzyme essential for energy homeostasis. ROSEATE uses high-dimensional latent space encoding via MSA Transformer, a protein language model which embeds ADKs in a manner which preserves biophysical information about embedded proteins. We show that the accuracy of the ROSEATE model is on par with other genome-based models, has a high degree of phylogenetic generalizability, and the ESM embeddings effectively capture key temperature-adaptive enzyme characteristics derived from AlphaFold structures. Because ROSEATE is based on analyses of a single ubiquitous protein, it can be used with metagenomic data to infer the community-level variation in bacterial OGTs. We demonstrate this feature of ROSEATE by reconstructing ADK sequences from over 500 environmental and host-associated metagenomes, successfully distinguishing community-wide thermal preferences across diverse habitats, from polar oceans to mammalian guts. By transitioning from genomic proxies to informationally dense protein structural features, this work provides an efficient, interpretable tool for predicting bacterial OGTs across taxa and whole communities.

14.
arXiv (CS.LG) 2026-06-15

On the Generalization Bounds of Symbolic Regression with Genetic Programming

arXiv:2604.17402v2 Announce Type: replace Abstract: Symbolic regression (SR) with genetic programming (GP) aims to discover interpretable mathematical expressions directly from data. Despite its strong empirical success, the theoretical understanding of why GP-based SR generalizes beyond the training data remains limited. In this work, we provide a learning-theoretic analysis of SR models represented as expression trees. We derive a generalization bound for GP-style SR under constraints on tree size, depth, and learnable constants. Our result decomposes the generalization gap into two interpretable components: a structure-selection term, reflecting the combinatorial complexity of choosing an expression-tree structure, and a constant-fitting term, capturing the complexity of optimizing numerical constants within a fixed structure. This decomposition provides a theoretical perspective on several widely used practices in GP, including parsimony pressure, depth limits, numerically stable operators, and interval arithmetic. In particular, our analysis shows how structural restrictions reduce hypothesis-class growth while stability mechanisms control the sensitivity of predictions to parameter perturbations. By linking these practical design choices to explicit complexity terms in the generalization bound, our work offers a principled explanation for commonly observed empirical behaviors in GP-based SR and contributes towards a more rigorous understanding of its generalization properties.

15.
medRxiv (Medicine) 2026-06-16

Prevalence and Correlates of Ideal Cardiovascular Health among Ugandan Adolescents: A Cross-Sectional Study

Introduction: Cardiovascular disease (CVD) risk factors often emerge during adolescence and track into adulthood, yet data on cardiovascular health (CVH) in sub-Saharan Africa remain limited. We assessed the prevalence and correlates of ideal CVH among Ugandan adolescents. Methods: We analysed baseline data of adolescents enrolled in a cluster-randomised controlled trial being conducted in urban (Kampala) and rural (Jinja) districts of Uganda. In this study, Ideal CVH was defined as meeting "ideal" status of 5-7 of the American Heart Association's Life's Simple 7 metrics. Random-effects logistic regression was used to identify factors associated with ideal CVH, accounting for village-level clustering. Results: We recruited 1316 participants with a mean age of 13.2 years, of whom 58.1% were female. Overall, the prevalence of ideal CVH was 66.8% (95% CI: 64.2% - 69.3%). The prevalence was higher in Jinja (74.4%, 95%CI: 70.9% - 77.7%) than Kampala (59.6%, 95%CI: 55.8%-63.2%) and the difference was evident (p

16.
arXiv (CS.AI) 2026-06-16

Bayesian 3D Steerable CNNs: Enabling Equivariance and Uncertainty Quantification Simultaneously

arXiv:2606.15479v1 Announce Type: cross Abstract: Steerable convolutional neural networks (Steerable-CNNs) guarantee SE(3)-equivariance by parameterizing kernels as linear combinations of steerable basis functions, but their deterministic nature precludes uncertainty quantification - limiting their use in settings where confidence estimates are essential. We propose a Bayesian Steerable-CNN that places posterior distributions over the basis coefficients, yielding stochastic kernels while preserving equivariance exactly. The loss function of the model is obtained via variational inference and minimized by Bayes-by-Backpropagation. The framework admits a decomposition of predictive uncertainty into epistemic and aleatoric components. Empirically, the model attains competitive classification accuracy alongside an expected calibration error of 0.0263 and outperforms its deterministic counterpart by up to 6.17% under distributional shift induced by additive Gaussian noise. Furthermore, we leverage the model's uncertainty estimates to enhance its performance significantly, achieving a notable gain - approximately 4% higher accuracy across 84% of the test dataset. A statistically significant negative correlation between epistemic uncertainty and prediction error confirms that the learned posterior variance is semantically meaningful. The framework unifies Bayesian uncertainty quantification with the inductive bias of equivariant CNNs.

17.
arXiv (CS.LG) 2026-06-18

Fisher Width: A Geometric Measure of Complexity on Statistical Manifolds

作者:

arXiv:2606.18306v1 Announce Type: new Abstract: Gaussian width is a central geometric complexity measure in high-dimensional probability, compressed sensing, convex optimization, and learning theory. It quantifies the average extent of a set along random directions, thereby capturing the effective dimension of constraint sets, hypothesis classes, and descent cones. However, this notion is intrinsically Euclidean. Statistical models instead carry a natural Riemannian geometry induced by the Fisher information metric, where directions are scaled according to statistical distinguishability rather than ambient Euclidean length. We introduce Fisher width, a Fisher-geometric analogue of Gaussian width for statistical manifolds. At a parameter point $\theta$, Fisher width replaces the Euclidean identity by the local metric tensor $G(\theta)^{1/2}$, measuring the Gaussian width of the Fisher-rescaled set. This makes the resulting quantity sensitive to local statistical curvature and invariant under smooth reparameterizations. We develop the basic theory of Fisher width, showing that it retains key structural features of Gaussian width, including concentration, metric perturbation stability, and spectral comparison bounds with the Euclidean baseline, while also capturing anisotropic geometric effects invisible to Euclidean measures. As an application, we prove a generalization bound for Fisher-Lipschitz hypothesis classes and propose computable estimators, which we evaluate empirically on MNIST across three model classes. Fisher width is to statistical manifolds what Gaussian width is to Euclidean convex bodies. This work lays the foundation for studying complexity and learning on curved statistical manifolds.

18.
arXiv (CS.AI) 2026-06-18

Optimizing Lithium Production Decisions under Geological, Demand, and Pricing Uncertainties: A POMDP Framework for Multi-Objective Decision Making

arXiv:2606.18598v1 Announce Type: new Abstract: Decision making in lithium production is challenging, whether from an investor's perspective or a strategic production standpoint. Determining which mines to open and when to open them involves not only geological and price uncertainties, but also complexities around the choice of extraction method, from direct lithium extraction to hard rock mining. Prior work explored models of this problem and different methods to optimize mining decisions; these models did not account for uncertainty in pricing, uncertainty in demand, or different mining technologies to extract lithium. Incorporating different pricing models and extraction technology into these models enables more robust strategies for determining not only when and where to open a mine, but also which method of production to pursue. We frame the problem as a partially observable Markov decision process (POMDP) and solve using belief state planning methods to get optimal decision making. In our study, we show that POMDP solvers outperform human inspired heuristics by dynamically adapting to shifting lithium price regimes (static, linear, exponential, and stochastic) through belief state planning and explicit uncertainty management. By optimally sequencing exploration, production, and technology choice, the framework achieves higher demand fulfillment and more balanced economic environmental outcomes over the projects lifetime in all different pricing and deposit scenarios.

19.
arXiv (CS.CL) 2026-06-12

Helping Figures Tell their Story! Paper-Grounded Video Generation Explaining Complex Scientific Figures

Scientific figures compress complex pipelines into a single canvas, yet understanding them requires paper-grounded, step-by-step narration aligned with visual highlights a capability missing from current video generation systems and benchmarks. To address this, we introduce paper-grounded figure-to-video generation: generating narrated, region-grounded walkthrough videos from a figure and its paper. We propose MINARD (Multimodal Interpretation of Narrated Architecture via Region Decomposition), a pipeline that generates paper-grounded narrations and sequentially grounds them to figure regions. We also release FigTalk, a benchmark with new sequential and component-level grounding metrics derived. On FigTalk, MINARD generates humanlike, paper-faithful narrations and outperforms narration-conditioned figure spatial grounding compared to existing approaches in both automatic and human evaluation

20.
arXiv (CS.CL) 2026-06-12

Language Model Circuits Are Sparse in the Neuron Basis

The high-level concepts that a neural network uses to perform computation need not be aligned to individual neurons (Smolensky, 1986). Language model interpretability research has thus turned to techniques which decompose the neuron basis into more interpretable units of model computation, such as sparse autoencoders (SAEs). However, not all neuron-based representations are uninterpretable. For the first time, we empirically show that MLP neurons are as sparse a feature basis as SAEs. We use this finding to develop an end-to-end gradient-based attribution pipeline for circuit tracing on the MLP neuron basis, which surfaces causally effective neurons on a variety of tasks. On a standard subject-verb agreement benchmark (Marks et al., 2025), a circuit of $\approx 10^2$ MLP neurons is enough to control model behaviour. On the multi-hop city-state-capital task from (Lindsey et al., 2025), we find a circuit in which small sets of neurons encode specific latent reasoning steps (e.g. mapping a city to its state), and can be steered to change the model's output. This work thus advances automated interpretability of language models without imposing additional training costs.

21.
arXiv (CS.LG) 2026-06-16

Self-Supervised Learning of Iterative Solvers for Constrained Optimization

arXiv:2409.08066v3 Announce Type: replace Abstract: The real-time solution of parametric optimization problems is critical for applications that demand high accuracy under tight real-time constraints, such as model predictive control. To this end, this work presents a learning-based iterative solver for constrained optimization, comprising a neural network predictor that generates initial primal-dual solution estimates, followed by a learned iterative solver that refines these estimates to reach high accuracy. We introduce a novel loss function based on Karush-Kuhn-Tucker (KKT) optimality conditions, enabling fully self-supervised training without pre-solved optimizer solutions. Theoretical guarantees ensure that the training loss function attains minima exclusively at KKT points. A convexification procedure enables application to nonconvex problems while preserving these guarantees. Experiments on two nonconvex case studies demonstrate speedups of up to one order of magnitude compared to state-of-the-art solvers such as IPOPT, while achieving orders of magnitude higher accuracy than competing learning-based approaches.

22.
arXiv (CS.CL) 2026-06-11

Reassessing High-Performing LLMs on Polish Medical Exams: True Competence or Bias-Driven Performance?

Large language models (LLMs) in medicine are mainly evaluated using multiple-choice question answering (MCQA), which can overestimate real clinical ability due to guessing strategies and answer biases. To address these limitations, we introduce an expanded and more challenging benchmark based on Polish medical exams, adding over 15,000 questions, two new domains, and four structural modifications that reduce MCQA-specific artifacts and better test reasoning. We evaluate 21 LLMs and show that evaluation design strongly affects results. Under our harder setup, the best model (Qwen3.5-122B) drops by 28.4 and 31 pp on English and Polish exams, respectively. Despite low evidence of data contamination, standard MCQA scores do not reliably reflect true medical competence. To facilitate further research, we make our benchmark publicly available.

23.
arXiv (CS.CL) 2026-06-15

A Computational Audit of Demographic Association Encoding in ClinicalBERT Language Predictions

Transformer-based clinical language models are increasingly integrated into high-stakes clinical decision support pipelines, yet the computational mechanisms through which demographic associations encoded in medical documentation propagate into model probability distributions remain empirically underspecified. We present a systematic computational audit of representational bias in ClinicalBERT (Alsentzer et al., 2019), a BERT-based model pretrained on MIMIC-III discharge summaries, employing two complementary probing methodologies: Log Probability Bias Analysis (LPBA), which quantifies demographic descriptor-induced shifts in masked token probability distributions across behavioral and evaluative semantic categories, and Masked Language Model-based analysis (MLM), which probes internal representational structure for demographic agency attribution encoding across 98 real clinical sentence templates and eight intersectional race-gender combinations. Corpus frequency analysis operationalizes the distinction between statistical disparity and bias amplification by benchmarking model outputs against empirical term frequencies in the MIMIC-III training corpus. Of 32 statistically significant findings, 65.6% contradict observed corpus distributions, rising to 80% for Black patients and 87.5% for agency attribution under MLM probing, providing direct empirical evidence that representational bias in ClinicalBERT operates predominantly through model-internal amplification rather than training data inheritance. Keywords: natural language processing, clinical documentation, algorithmic auditing, representational bias, health equity 1

24.
arXiv (CS.LG) 2026-06-17

Geometry-Preserving Encoder/Decoder in Latent Generative Models

arXiv:2501.09876v4 Announce Type: replace-cross Abstract: Generative modeling aims to generate new data samples that resemble a given dataset. When using diffusion models for this task, one of the main challenges is solving the problem in the input space, which tends to be very high-dimensional. To address this, recent approaches solve diffusion models in the latent space through an encoder that maps from the data space to a lower-dimensional latent space, improving training efficiency and achieving state-of-the-art results. The variational autoencoder (VAE) is the most commonly used encoder/decoder framework in this domain, known for its ability to learn latent representations and generate data samples. In this paper, we introduce a novel encoder/decoder framework with theoretical properties distinct from those of the VAE, specifically designed to preserve the geometric structure of the data distribution. We demonstrate the significant advantages of this geometry-preserving encoder in the training process of both the encoder and decoder. Additionally, we provide theoretical results proving convergence of the training process, including convergence guarantees for encoder training, and results showing faster convergence of decoder training when using the geometry-preserving encoder.

25.
arXiv (CS.LG) 2026-06-12

Robust State-Conditional Feature-Weighted Jump Models for Temporal Clustering

arXiv:2606.13146v1 Announce Type: cross Abstract: We propose a robust feature-weighted jump model for time-dependent clustering. A penalty is used to encourage smoothness of transitions over time, while robustness is achieved through the use of a Tukey's biweight loss function. An additional parameter controls the variability of feature weights across states, allowing the model to assign state-specific relevance to each feature. We illustrate in simulation how the method accurately recovers the true cluster sequence and reliably identifies relevant features, outperforming competing approaches, particularly in the presence of outliers. We conclude with two empirical applications, one on the number of conflict-related homicides in Kosovo in the period 1998-2000, and another on macroeconomic performance of twelve European countries in the period 1949-2024.