Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.CL) 2026-06-15

Simulating Students' Java Programming Errors with Large Language Models

Understanding student errors in the programming is a cornerstone of programming education, yet obtaining a representative set of student errors for any newly designed task remains slow and costly, since authentic submissions only accumulate after extensive classroom deployment. This paper explores whether large language models (LLMs) can serve as scalable proxies for students by simulating realistic logical errors in code submissions. Using the CodeWorkout dataset of 74,000+ unique student Java submissions across 37 problems, we evaluate five LLMs under three mainstream prompting strategies: Input-Output (IO), Chain-of-Thought (CoT), and iterative Self-Refine. We assess performance along two key dimensions: diversity (the range of distinct error patterns) and alignment (alignment with authentic student mistakes), and examine how these vary by struggling level of programming tasks. Our quantitative findings reveal that while all models generate diverse errors, their alignment to human submissions diverges: Claude Sonnet 4 achieves the most balanced performance. In addition, we conducted a blinded expert annotation study (N = 401) comparing synthetic and authentic errors. This qualitative analysis confirms that the generated errors are functionally indistinguishable from authentic student errors. Moreover, higher-struggling-level problems elicit more diverse but less student-like errors. These results highlight trade-offs in using LLMs to simulate human learners and suggest design considerations for integrating synthetic errors into teachable agents, intelligent tutoring systems, and large-scale learning analytics.

02.
arXiv (CS.CV) 2026-06-18

Automatic ply-specific analyses of CFRP micrographs using shortest-path-based ply distinction

We present an automated approach to distinguish between ply instances in semantic segmentation masks of high-resolution carbon-fiber reinforced polymer micrographs. Interpreting the segmentation mask as a graph with pixels as vertices, enables us to use a shortest-path algorithm yielding the ply-separating paths. Thereby, we bridge the gap between semantic segmentation and ply instance segmentation using global information. We successfully apply our approach on high-resolution micrographs featuring a broad range of characteristics like artificially added gaps in single or multiple plies, different stacking sequences and ply traversing cracks. Assigning each fiber pixel to a ply based on the calculated paths, allows for a comprehensive, quantitative ply analysis with respect to its microstructural properties like the local fiber volume fraction as well as locally resolved ply and interleaf layer thickness. These insights help to reveal manufacturing-induced inhomogeneities, draw conclusions on manufacturing parameters and link mechanical properties to underlying microstructural imperfections.

03.
arXiv (CS.CL) 2026-06-16

Vernier: Probing Representational Misalignment Behind Lexical Gaps in Causal Reasoning

作者:

Instruction-tuned language models can answer the same causal-reasoning question differently after its English variable names are replaced by type-preserving placeholders, although the structural causal model and the gold answer are unchanged. We ask whether this lexical gap reflects information loss in the placeholder view or a misaligned read-out from a representation that still carries answer-relevant content. Vernier uses a paired-view weight update as an instrument and then inspects the mechanism left after the gap closes. In the working regimes, the evidence favours representational misalignment. A variable-name probe becomes more accurate on the placeholder view, and activation patching on Qwen-7B, Qwen-14B, and Llama-3.1-8B shows that the decision-token representation can transfer answer identity between views. The update that realigns the views is counterfactual augmentation over original and placeholder prompts, while the answer-subspace KL mainly sharpens intermediate answer-belief agreement. Success is bounded by model family, scale, and task. CRASS transfer is reliable across Qwen scales and Llama, e-CARE remains weak, and preliminary non-causal rename tasks show a similar qualitative pattern.

04.
medRxiv (Medicine) 2026-06-10

Optimisation of steatotic liver disease screening algorithm for resource-poor settings using machine learning

Background The European Association for the Study of the Liver (ESAL) - Steatotic Liver Disease (SLD) screening algorithm involves two steps; initial screening with FIB-4 followed by referral for vibration-controlled transient elastography (VCTE) in patients likely to have significant fibrosis (SF). However, VCTE is not widely available in resource-limited settings. Aim To optimise the EASL SLD screening algorithm for resource-poor settings using machine learning (ML). Methods We analysed data from 964 adults aged [≥]35 years who underwent VCTE at a tertiary referral centre in Sri Lanka between November 2024 and 2025. Multiple ML models using different methods and variable combinations were trained on 80% of the dataset and tested on the remaining 20%. Best models were selected based on performance and externally validated using data from 430 patients who underwent VCTE before November 2024. Model performance was compared with the FIB-4 using confusion matrices. Results A Random Forest model incorporating age, AST, ALT, and platelet count separately, rather than using FIB-4, outperformed. The all-variable ML model showed the best predictive performance for SF, with accuracy of 77.2%, recall of 0.762, precision of 0.778, and AUC-ROC of 0.818. The variables used in the model, in descending order of feature importance, were AST, platelet count, BMI, ALT, age, diabetes mellitus, hypertension, dyslipidaemia, sex, family history, hypothyroidism, diabetes complication and smoking. External validation demonstrated 75.1% accuracy and an AUC of 0.779. When used as the first step of the SLD screening algorithm, the all-variable ML model identified 37 (17.1%) additional true positives and reduced false-negative diagnoses by 50% compared with FIB-4. Conclusions ML-based models were more effective than the FIB-4 score as the first-line screening tool for VCTE referral, substantially improving the identification of patients with significant fibrosis in this South Asian cohort.

05.
arXiv (quant-ph) 2026-06-19

GPU-accelerated semidefinite programming for causal games

arXiv:2606.20519v1 Announce Type: new Abstract: The process matrix formalism describes quantum correlations in scenarios without a fixed causal order between local laboratories. Operational signatures of such correlations can be investigated through causal games. A paradigmatic example is the Guess-Your-Neighbour's-Input game, in which two parties attempt to guess each other's inputs. Correlations compatible with any definite, or probabilistically mixed, causal order cannot achieve a winning probability exceeding $1/2$. The best process-matrix strategy currently known attains a value of approximately $0.6218$ using local dimension $d=5$, while the strongest known dimension-independent upper bound is $0.7592$. In this work, we investigate whether increasing the local dimension beyond $d = 5$ can narrow this gap. To this end, we employ a see-saw optimization scheme in which each step is formulated as a semidefinite program. For scalability, we develop a custom implementation of the SCS solver in which the dominant computational cost, the projection onto the positive-semidefinite cone, is offloaded to a GPU, yielding a six-fold speedup. Using this implementation, we explore local dimensions up to $d = 8$, and we do not find significant improvements over the value at $d=5$. Our results suggest that either qualitatively different strategies are required to approach the known upper bound, or that the bound itself is not tight.

06.
arXiv (CS.AI) 2026-06-11

GPO: Learning from Critical Steps to Improve LLM Reasoning

arXiv:2509.16456v3 Announce Type: replace Abstract: Large language models (LLMs) are increasingly used in various domains, showing impressive potential on different tasks. Recently, reasoning LLMs have been proposed to improve the reasoning or thinking capabilities of LLMs to solve complex problems. Despite the promising results of reasoning LLMs, enhancing the multi-step reasoning capabilities of LLMs still remains a significant challenge. While existing optimization methods have advanced the LLM reasoning capabilities, they often treat reasoning trajectories as a whole, without considering the underlying critical steps within the trajectory. In this paper, we introduce Guided Pivotal Optimization (GPO), a novel fine-tuning strategy that dives into the reasoning process to enable more effective improvements. GPO first identifies the `critical step' within a reasoning trajectory - a point that the model must carefully proceed to succeed at the problem. We locate the critical step by estimating the advantage function. GPO then resets the policy to the critical step, samples the new rollout and prioritizes the learning process on those rollouts. This focus allows the model to learn more effectively from pivotal moments within the reasoning process to improve the reasoning performance. We demonstrate that GPO is a general strategy that can be integrated with various optimization methods to improve reasoning performance. Besides theoretical analysis, our experiments across challenging reasoning benchmarks show that GPO can consistently and significantly enhance the performance of existing optimization methods, showcasing its effectiveness and generalizability in improving LLM reasoning by concentrating on pivotal moments within the generation process.

07.
Nature (Science) 2026-06-17

<i>CHPO</i> coordinates chilling recovery and nitrogen use in rice

作者:

Global rice production faces mounting challenges from abnormal temperature fluctuations and nitrogen-fertilizer-driven environmental pollution1–7. Developing varieties that balance chilling resilience and nitrogen-use efficiency (NUE) offers a promising solution, but the molecular networks coordinating these traits remain poorly understood. Here we identify CHILLING PHOENIX (CHPO), a major gene underlying the quantitative trait locus shared by both chilling tolerance and resilience. It encodes a MYB transcription factor that acts as a key regulator coordinating post-chilling recovery with nitrogen use in rice. Natural variation in a GCG-repeat-encoded polyalanine tract alters CHPO DNA-binding preference and redirects regulatory outputs between the japonica-type (CHPOjap) and indica-type (CHPOind), causing opposing effects on chilling tolerance and resilience. This allelic variation is shaped by domestication selection, with the CHPOjap allele probably derived from Chinese wild rice. CHPOjap directly targets OsTCP19 and OsNRT2.4 to fine-tune NUE, thereby enhancing chilling tolerance and resilience. These findings provide a mechanistic framework for a chilling-induced high-nitrogen-utilization module that alleviates the damage caused by chilling stress, and a potential molecular design&nbsp;strategy for breeding rice varieties with both chilling resilience and high NUE at the&nbsp;recovery stage. A rice gene, CHPO, links chilling resilience with nitrogen-use efficiency, revealing a domestication-shaped regulatory mechanism that could guide breeding of climate-resilient, sustainable rice varieties.

08.
arXiv (CS.CV) 2026-06-16

Planning with Unified Multimodal Models

With the powerful reasoning capabilities of large language models (LLMs) and vision-language models (VLMs), many recent works have explored using them for decision-making. However, most of these approaches rely solely on language-based reasoning, which limits their ability to reason and make informed decisions. Recently, a promising new direction has emerged with unified multimodal models (UMMs), which support both multimodal inputs and outputs. We believe such models have greater potential for decision-making by enabling reasoning through generated visual content. To this end, we propose Uni-Plan, a planning framework built on UMMs. Within this framework, a single model simultaneously serves as the policy, dynamics model, and value function. In addition, to avoid hallucinations in dynamics predictions, we present a novel approach self-discriminated filtering, where the generative model serves as a self-discriminator to filter out invalid dynamics predictions. Experiments on embodied decision-making tasks show that Uni-Plan substantially improves success rates compared to VLM-based methods, while also showing strong data scalability, requiring no expert demonstrations and achieving better performance under the same training-data size. This work lays a foundation for future research in reasoning and decision-making with UMMs.

09.
Nature (Science) 2026-06-10

Mitochondria tethered to the nucleus secure its energy supply

Direct interactions between the cell’s powerhouses and nuclear pores might channel energy straight into the nucleus, fuelling cell division and differentiation. Direct interactions between the cell’s powerhouses and nuclear pores might channel energy straight into the nucleus, fuelling cell division and differentiation.

10.
Nature Medicine 2026-06-15

Adaptive deep brain stimulation for dynamic gait control in Parkinson’s disease: a randomized feasibility trial

A randomized crossover study of five patients with Parkinson’s disease (PD) demonstrates that gait-synchronized adaptive deep brain stimulation is feasible and safe, and reduces falls compared with continuous stimulation. Gait dysfunction in PD is a major source of disability and is often insufficiently treated by continuous deep brain stimulation (cDBS). Although adaptive DBS (aDBS) has shown efficacy for other motor symptoms using β-based, state-driven neural signals, gait is a dynamic, cyclical behavior that may require temporally precise modulation. Here we evaluated a behavior-contingent aDBS approach that synchronizes stimulation to gait phase. We reported a single-center, blinded, randomized, crossover study evaluating the feasibility of identifying patient-specific biomarkers to drive aDBS. The primary outcome was feasibility of successful identification of gait-phase biomarkers to implement aDBS. Five participants with PD undergoing pallidal DBS and subdural electrode paddle implantation were enrolled. We successfully identified personalized gait-phase biomarkers from cortical or pallidal field potentials in all five patients and embedded them into a bidirectional neurostimulator. During acute in-clinic testing, aDBS improved step variability and step symmetry versus cDBS. Three participants subsequently completed a double-blinded, multi-day crossover phase. In this setting, aDBS maintained general motor symptom control, reduced falls and yielded patient-specific gait improvements. No adverse events occurred and aDBS was well tolerated. These findings establish the feasibility of biomarker-driven, movement-synchronized neuromodulation and support the development of a larger randomized trial to determine clinical efficacy. ClinicalTrial.gov registration: NCT04675398 . A randomized crossover study shows that gait-phase-synchronized adaptive deep brain stimulation is feasible and safe, and reduces falls compared to continuous stimulation in Parkinson’s disease.

11.
arXiv (CS.CL) 2026-06-19

Segment-Level Mandarin Chinese Speech-Based Cognitive Impairment Detection via an Autoencoder with Contrastive Learning

\noindentBackground and Objective: Speech has emerged as a low-cost and non-invasive digital biomarker with considerable potential for cognitive impairment detection. However, limited labeled data and cross-dataset variability remain major challenges for robust speech-based screening systems. \par\noindentMethods: We developed a segment-level representation learning framework for speech-based cognitive impairment detection. Speech recordings were divided into short segments and converted into spectrogram representations. To improve robustness under limited-data conditions, offline and online augmentation strategies were combined with autoencoder-based representation learning and contrastive objectives to enhance discriminative latent representations. \par\noindentResults: Experiments conducted on four independent Mandarin Chinese speech datasets demonstrated stable and competitive performance in both binary and three-class classification tasks, with particularly notable improvements in the clinically challenging three-class setting. Ablation studies further supported the effectiveness of the proposed framework. \par\noindentConclusions: The findings suggest that segment-level speech representation learning may provide a scalable and practical approach for cognitive impairment screening in resource-constrained clinical settings.

12.
arXiv (quant-ph) 2026-06-16

Entangled states are typically incomparable

arXiv:2406.03335v2 Announce Type: replace Abstract: Consider a bipartite quantum system, where Alice and Bob jointly possess a pure state $|\psi\rangle$. Using local quantum operations on their respective subsystems, and unlimited classical communication, Alice and Bob may be able to transform $|\psi\rangle$ into another state $|\phi\rangle$. Famously, Nielsen's theorem [Phys. Rev. Lett., 1999] provides a necessary and sufficient algebraic criterion for such a transformation to be possible (namely, the local spectrum of $|\phi\rangle$ should majorise the local spectrum of $|\psi\rangle$). In the paper where Nielsen proved this theorem, he conjectured that in the limit of large dimensionality, for almost all pairs of states $|\psi\rangle, |\phi\rangle$ (according to the natural unitary invariant measure) such a transformation is not possible. That is to say, typical pairs of quantum states $|\psi\rangle, |\phi\rangle$ are entangled in fundamentally different ways, that cannot be converted to each other via local operations and classical communication. Via Nielsen's theorem, this conjecture can be equivalently stated as a conjecture about majorisation of spectra of random matrices from the so-called trace-normalised complex Wishart-Laguerre ensemble. Concretely, let $X$ and $Y$ be independent $n \times m$ random matrices whose entries are i.i.d. standard complex Gaussians; then Nielsen's conjecture says that the probability that the spectrum of $X X^\dagger / \operatorname{tr}(X X^\dagger)$ majorises the spectrum of $Y Y^\dagger / \operatorname{tr}(Y Y^\dagger)$ tends to zero as both $n$ and $m$ grow large. We prove this conjecture, and we also confirm some related predictions of Cunden, Facchi, Florio and Gramegna [J. Phys. A., 2020; Phys. Rev. A., 2021].

13.
arXiv (CS.LG) 2026-06-24

Evaluation Metrics as Averaged Outcomes of Fair Gambles

arXiv:2401.14483v4 Announce Type: replace Abstract: In the current practices of machine learning, the evaluation of forecasts has become a cornerstone of scientific progress. A multitude of evaluation metrics have been suggested and used to qualify "good" forecasts. What do those metrics share? How are they related? In this work, we use a protocol borrowed from game-theoretic probability to show that a large part of evaluation metrics can be viewed as averaged outcomes of fair gambles. Intuitively, a fair gambler is one which a forecaster would expect to fail. Hence, the gambler's ability to gain disproves the quality of the forecast. Standard evaluation metrics are then variants of choices of such fair gambles. In particular, this choice is structured along two dimensions, one of which separates calibration-type and regret-type metrics. In particular, this framework sheds light on the relationship of calibration and regret showing a theoretical equivalence in their ability to evaluate when being scaled appropriately, but the incomparability of obtained scores.

14.
arXiv (CS.LG) 2026-06-16

Dynamic Link Prediction with Temporally Enhanced Signed Graph Neural Networks

arXiv:2605.26290v2 Announce Type: replace Abstract: Temporal signed networks (TSNs) model the time evolution of cooperative and adversarial relationships that arise in applications such as social media analysis, trust and reputation systems, and financial transaction networks. While graph neural networks (GNNs) perform well for static or unsigned link prediction, effective learning in temporal signed graphs remains challenging due to the interaction of signed relations, evolving structure, and balance-theoretic constraints. To address this gap, we propose a modular temporal enhancement framework for signed GNNs that integrates historical context into otherwise static architectures. The framework introduces a Historical Context Integration Module (HCIM) that combines learnable recency-aware temporal weighting, LSTM-based embedding trajectory modeling, and multi-head temporal attention to capture both short- and long-term signed interaction dynamics. Historical information is fused with current node representations using either global or node-adaptive weighting, allowing the architecture-agnostic framework to accommodate heterogeneous temporal behaviors. We instantiate the approach on the Self-Explainable Signed Graph Transformer (SE-SGformer), preserving interpretability while extending it with temporal awareness. Experiments on real-world and synthetic TSNs, including Bitcoin OTC, Bitcoin Alpha, Reddit, and small-world network models, demonstrate consistent and statistically significant improvements over the static baseline.

15.
arXiv (quant-ph) 2026-06-16

Reconstruction of detector error model for quantum error correction

arXiv:2606.16288v1 Announce Type: new Abstract: Fault-tolerant quantum computing fundamentally relies on the accurate characterization of circuit-level noise to optimize decoding algorithms. However, extracting complex multi-body error correlations remains challenging. Contemporary greedy inference algorithms can suffer from statistical distortion, discarding true physical mechanisms while introducing many unphysical false positives. Here, we introduce the Correlation-Analysis-based Hypergraph Reconstruction (CAHR) algorithm, a globally consistent framework to invert experimental syndrome statistics directly into discrete physical hypergraphs. By coupling exact algebraic correlation equations with a top-down concurrent-pruning strategy, CAHR recovers the fault topology without false positives for both $d=5$ rotated surface codes and dense 8-body 2D color codes in our benchmark settings. Furthermore, we show that exact continuous parameter extraction in dense codes is limited by a variance cascade, where absolute statistical variance accumulates linearly from high- to low-degree mechanisms. This motivates a two-stage inference paradigm: utilizing CAHR to extract the fault topology, followed by continuous probability optimization. This provides a practical approach for characterizing and decoding highly correlated noise in realistic quantum hardware.

16.
medRxiv (Medicine) 2026-06-22

Multi-omics data fusion reveals divergent molecular signatures of intra-articular micro-fragmented adipose tissue and hyaluronic acid treatment in inflammatory-phenotype knee osteoarthritis

Knee osteoarthritis (KOA) affects an estimated 374 million people worldwide and has no approved disease-modifying treatment. Intra-articular micro-fragmented adipose tissue (MFAT) outperformed hyaluronic acid (HA) on patient-reported outcomes in our recent double-blind randomized trial (ISRCTN88966184), yet the molecular basis of this differential efficacy is unknown, and the two interventions have not previously been compared at the level of their in vivo molecular response in human KOA. Here we apply an interpretable artificial-intelligence data-fusion framework, based on non-negative matrix tri-factorization, to longitudinally collected plasma from this cohort, integrating proteomics, N-glycomics, miRNA transcriptomics and patient genetics with prior protein-protein and miRNA-gene regulatory networks at baseline, one and six months. The framework jointly decomposes all data modalities at each timepoint into shared, interpretable factors, from which we derive data-driven pathways of genes and of miRNAs and recover new patient-gene and patient-miRNA associations. These pathways were biologically coherent, showing significant enrichment in Gene Ontology Biological Process and Reactome Pathway annotations. By six months, the two treatments left clearly distinct molecular signatures: HA remained dominated by canonical OA pathogenic processes, including cartilage-degrading effectors such as MMP13 and LIMK2 and markers of synovial inflammation, whereas MFAT shifted the systemic landscape toward chondroprotection, anti-inflammatory signalling and bone-cartilage homeostasis, with prioritized effectors including SIRT7 and NDUFC1. To our knowledge, these are the first systems-level molecular data directly comparing the in vivo response to the two treatments in human KOA, providing initial evidence that MFAT acts as a disease-modifying intervention and demonstrating the value of interpretable data fusion for uncovering treatment mechanisms in small translational cohorts.

17.
arXiv (CS.CV) 2026-06-24

CanadaFireSat: Toward high-resolution wildfire forecasting with multiple modalities

Canada experienced in 2023 one of the most severe wildfire seasons in recent history, causing damage across ecosystems, destroying communities, and emitting large quantities of CO2. This extreme wildfire season is symptomatic of a climate-change-induced increase in the length and severity of the fire season that affects the boreal ecosystem. Therefore, it is critical to empower wildfire management in boreal communities with better mitigation solutions. Wildfire probability maps represent an important tool for understanding the likelihood of wildfire occurrence and the potential severity of future wildfires. The massive increase in the availability of Earth observation data has enabled the development of deep learning-based wildfire forecasting models, aiming at providing precise wildfire probability maps at different spatial and temporal scales. A main limitation of such methods is their reliance on coarse-resolution environmental drivers and satellite products, leading to wildfire occurrence prediction of reduced resolution, typically around $\sim 0.1${\deg}. This paper presents a benchmark dataset: CanadaFireSat, and baseline methods for high-resolution: 100 m wildfire forecasting across Canada, leveraging multi-modal data from high-resolution multi-spectral satellite images (Sentinel-2 L1C), mid-resolution satellite products (MODIS), and environmental factors (ERA5 reanalysis data). Our experiments consider two major deep learning architectures. We observe that using multi-modal temporal inputs outperforms single-modal temporal inputs across all metrics, achieving a peak performance of 60.3% in F1 score for the 2023 wildfire season, a season never seen during model training. This demonstrates the potential of multi-modal deep learning models for wildfire forecasting at high-resolution and continental scale.

18.
medRxiv (Medicine) 2026-06-15

Active commuting, anxiety symptoms and mental wellbeing: a dose-response study

Climate change draws attention to the planetary health perspective in sport and exercise sciences, that is, to physical activity that supports both human wellbeing and environmental sustainability. Active commuting is a sustainable form of physical activity with well-established somatic health benefits. However, more knowledge is needed on its relationship with mental health. We examined dose-response associations between active commuting, anxiety symptoms, and mental wellbeing among Finnish adults, and whether green commuting environment moderates these relationships. We used data from the cross-sectional Environment and Health Survey collected in June-September 2023 in the ten largest cities in Finland. Employed participants with data on anxiety symptoms (Generalized Anxiety Disorder-7, GAD-7), mental wellbeing (World Health Organization-Five Well-Being Index, WHO-5), commuting profile over a year (mode, frequency, distance, and perceived greenness along the commute route), and sociodemographic and lifestyle factors were included (n=1,672; mean age 45.3 years; 53.8% women). Active commuting was defined as travelling the entire commute by walking or cycling (including e-biking) that was converted into approximated annual km/week and MET-h/week. We used linear and logistic regression with restricted cubic splines to evaluate dose-response associations, adjusted for key covariates. The role of perceived greenness was tested using an active commuting x commute greenness interaction term. We found no dose-response relationships between active commuting and anxiety symptoms or mental wellbeing in any of the models. No effect modification by commute greenness was observed. More research on how active commuting may support planetary health from a mental health perspective is needed.

19.
arXiv (CS.AI) 2026-06-17

Reversal Q-Learning

arXiv:2606.17551v1 Announce Type: cross Abstract: Iterative generative modeling techniques, such as flow matching, provide powerful tools to model complex behaviors for effective offline reinforcement learning (RL). In this work, we propose a new off-policy RL algorithm that trains a flow policy based on prior data. Our idea starts from the "expanded" Markov decision process (MDP) framework, which treats individual flow refinement steps as separate actions in an MDP. To enable off-policy RL within this framework, we apply two techniques: we generate virtual on-policy trajectories (by "reversing" flows) to make this framework compatible with prior data, and we apply a bias-and-variance reduction technique to mitigate the curse of horizon in off-policy RL. We call the resulting algorithm Reversal Q-learning (RQL). RQL has several advantages over previous flow-based RL methods: it does not suffer from backpropagation through time, makes better use of the learned value function, and directly trains the full, expressive flow policy. Through our experiments on 50 challenging simulated robotic tasks, we show that RQL leads to the best average offline RL performance compared to state-of-the-art flow-based offline RL algorithms.

20.
arXiv (CS.AI) 2026-06-24

RetiSEM: Generalising Causal Models for Fragmented Biomedical Data

arXiv:2606.24488v1 Announce Type: cross Abstract: Learning causal models from fragmented biomedical data is challenging because clinical, molecular, and imaging variables are often incomplete or not jointly observed. We propose RetiSEM, a domain-constrained structural equation modelling (SEM) framework for causal graph recovery and mediation analysis under limited multimodal resources. This proposed work organises variables into biologically informed blocks, applies forbidden-edge constraints, and decomposes pathway-level effects into TE, NDE, and NIE components. We evaluate RetiSEM across ten synthetic benchmark scenarios that vary in dimensionality, nonlinearity, causal depth, and pathway structure, together with a fragmented real-world setting that combines NHANES clinical variables with externally derived retinal representations. This approach achieves lower structural error and higher causal accuracy than unconstrained baselines across the synthetic benchmarks. In the real-data analysis, retinal variables behave mainly as downstream biomarker-like indicators, with smaller but detectable indirect effects. These findings support our strategy as an interpretable framework for testing structured causal hypotheses in limited-resource biomedical AI. The code and resources for this work are publicly available at: https://github.com/Inamullah-Colab/ReitSEM.

21.
arXiv (CS.AI) 2026-06-24

VoltanaLLM: Energy-Efficient and SLO-Aware Disaggregated LLM Serving via Adaptive Frequency Control and State-Space Routing

arXiv:2509.04827v3 Announce Type: replace-cross Abstract: The energy cost of Large Language Model (LLM) inference is rapidly becoming a barrier to sustainable and scalable deployment. Although modern serving architectures expose distinct prefill and decode behaviors, existing systems fail to exploit these phase differences for energy-efficient serving under strict latency SLOs. This paper introduces VoltanaLLM, the first system that explicitly targets and reduces the energy bloat in modern prefill-decode (P/D) disaggregated LLM serving. Guided by a control-theory perspective, VoltanaLLM separates two levers: per-instance operating-point selection (GPU frequency per iteration) and system-level state-space routing of requests. We empirically observe that LLM inference exhibits a U-shaped energy-frequency curve creating "sweet spots" that depend on phase behavior and load. VoltanaLLM exploits this by combining phase-specific, iteration-level frequency selection driven by a lightweight, online-adaptive latency predictor, with a decode state-space guided router that avoids architectural granularity-induced inefficiencies, all while meeting desired SLOs. We implement VoltanaLLM using SGLang and evaluate it across multiple models and real-world workloads. Our results show VoltanaLLM reduces end-to-end energy by up to 36.3% versus a static max-frequency baseline while maintaining high SLO attainment, and generalizes to newer GPUs. These results point to sustainable LLM serving via phase-aware, iteration-level frequency selection coupled with architecture-aware routing. Source code is available in https://github.com/Supercomputing-System-AI-Lab/VoltanaLLM.

22.
arXiv (CS.LG) 2026-06-11

Coverage Guarantees for Pseudo-Calibrated Conformal Prediction under Distribution Shift

arXiv:2602.14913v2 Announce Type: replace Abstract: Conformal prediction (CP) offers distribution-free marginal coverage guarantees under an exchangeability assumption, but these guarantees can fail if the data distribution shifts. We analyze the use of pseudo-calibration as a tool to counter this performance loss under a bounded label-conditional covariate shift model. Using tools from domain adaptation, we derive a lower bound on target coverage in terms of the source-domain loss of the classifier and a Wasserstein measure of the shift. Using this result, we provide a method to design pseudo-calibrated sets that inflate the conformal threshold by a slack parameter to keep target coverage above a prescribed level. Finally, we propose a source-tuned pseudo-calibration algorithm that interpolates between hard pseudo-labels and randomized labels as a function of classifier uncertainty. Numerical experiments show that our bounds qualitatively track pseudo-calibration behavior and that the source-tuned scheme mitigates coverage degradation under distribution shift while maintaining nontrivial prediction set sizes.

23.
arXiv (quant-ph) 2026-06-16

Optical Creation of Synthetic Microgravity for Quantum Degenerate Gases

arXiv:2606.14985v1 Announce Type: cross Abstract: Microgravity environments provide unique opportunities for ultracold-atom experiments by enabling long interrogation times and reduced acceleration-induced dynamics. However, their realization has largely been restricted to specialized facilities such as drop towers, sounding rockets, and space-based laboratories. Here we realize synthetic microgravity for quantum degenerate gases using optically engineered force landscapes that compensate Earth's gravity to the milli-g level while maintaining continuous confinement of the atomic ensemble. These force landscapes are generated by dynamically painted optical dipole potentials and calibrated in situ through Bloch oscillations in a vertical optical lattice, enabling precise control of the residual acceleration. We use this capability to demonstrate matter-wave beam splitting with arm separations of several hundred microns. We further implement a Bloch-band atom interferometer in which interaction-induced dephasing is strongly suppressed through controlled three-dimensional expansion in the synthetic microgravity potential. This reduction of mean-field effects restores near-$\sqrt{N}$ scaling of interferometric sensitivity for large quantum degenerate ensembles. Our results establish a versatile platform for realizing synthetic microgravity with trapped quantum gases in terrestrial laboratories, bringing the advantages of microgravity experiments to continuously operating systems and opening new opportunities for quantum sensing, matter-wave interferometry, and precision measurements.

24.
medRxiv (Medicine) 2026-06-17

Clinician knowledge and self-efficacy in snakebite management: A cross-sectional assessment in Northern Uganda

Background: Snakebite envenomation (SBE) is a major public health crisis in rural Uganda, yet it remains a neglected tropical disease. Effective management is often compromised by systemic barriers and a lack of clinician training. This study assessed clinician self-efficacy and objective knowledge regarding SBE management in Northern Uganda. Methods: A descriptive, cross-sectional study was conducted between February and July 2025 among 379 healthcare workers in Gulu, Omoro, and Pader districts. A validated questionnaire was used to collect data on socio-demographics, self-reported efficacy (scale 1-10), and objective knowledge. Knowledge scores [&ge;]70% were categorized as adequate. Multivariable logistic regression identified independent predictors of adequate knowledge, and Spearmans correlation ({rho}) assessed the relationship between knowledge and self-efficacy. Results: The participants had a mean age of 35.6 years (SD {+/-}7.3), were predominantly female (56.5%, 214/379), and most (83.6%, 317/379) practiced at Health Centre III level facilities. While 53.8% (204/379) reported prior training, 48.3% (183/379) of these had not received an update in over 10 years. Adequate knowledge was demonstrated by 51.5% (195/379) of participants. In the multivariable analysis, practicing in Omoro (adjusted odds ratio [aOR]: 0.3, 95% CI: 0.1-0.6, p < 0.001) or Pader (aOR: 0.2, 95% CI: 0.1-0.4, p < 0.001) was associated with lower odds of adequate knowledge compared to Gulu district. Prior training significantly increased the odds of adequate knowledge (aOR: 2.3, 95% CI: 1.3-4.2, p = 0.006). A moderate positive correlation was observed between self-efficacy and objective knowledge (Spearmans {rho} = 0.33, p < 0.0001). Conclusion: Approximately half of the frontline healthcare workers in Northern Uganda lack adequate knowledge on SBE management, with significant geographic differences and outdated training. The gap between clinician self-efficacy and objective knowledge poses a risk to patient safety. Regular, mandatory refresher training and targeted educational outreach to remote districts are required to reduce SBE-related morbidity and mortality.

25.
arXiv (quant-ph) 2026-06-24

Cornell Interaction in the Two-body Pauli-Schrödinger-type Equation Framework: The Symplectic Quantum Mechanics Formalism

arXiv:2507.20045v3 Announce Type: replace Abstract: We investigate the quantum behavior of a quark-antiquark bound system under the influence of a magnetic field within the symplectic formulation of quantum mechanics. Employing a perturbative approach, we obtain the ground and first excited states of the system described by the Cornell potential, which incorporates both confining and non-confining interactions. After performing a Levi-Civita mapping in phase space, we solve the time-independent symplectic Pauli-Schrödinger-type equation and determine the corresponding Wigner function. Special attention is given to the observation of the confinement of the quark-antiquark, that is revealed in the phase space structure. Due to the presence of spin in the Hamiltonian, the results reveal that the magnetic field enhances the non-classicality of the Wigner function, signaling stronger quantum interference and a departure from classical behavior. The experimental mass spectra is used to estimate the intensity of the external field, leading to a value that is in order of the transiet magnetic field measured in non-central heavy-ion collisions at RHIC and LHC.