Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.AI) 2026-06-18

CaVe-VLM-CoT: An Interpretable Vision-Language Model Framework

arXiv:2606.18385v1 Announce Type: new Abstract: Vision-Language Models (VLMs) remain prone to hallucinations, producing fluent but visually unfaithful outputs. Existing chain-of-thought and retrieval-augmented methods only partially address this, as they neither enforce step-level citation grounding nor route verification failures back to retrieval for correction. We present CaVe-VLM-CoT, a modular reflection-based agentic-RAG framework that enforces evidence-grounded reasoning through a five-stage closed-loop pipeline: Extractor, Retriever, Solver, Citation Injector, and Verifier, in which detected ungrounded claims trigger structured feedback to the Extractor for targeted re-retrieval. Since no existing framework jointly measures retrieval quality, step-wise citation faithfulness, and cross-modal grounding, we propose a suite of 23 component-wise metrics across all stages, anchored by CaVeScore, a composite metric weighting accuracy, citation precision and recall, attribution, and evidence grounding. Without any architectural or prompt modifications, CaVe-VLM-CoT achieves 87.1% accuracy and 56.6% CaVeScore on ScienceQA, and 55.2% accuracy and 35.7% CaVeScore on MMMU (30 subjects).
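The closed-loop control flow described in this abstract can be sketched roughly as follows. This is only an illustration of the loop shape, not the authors' implementation: the function `run_closed_loop`, the stage signatures, the `max_rounds` cap, and the toy knowledge base are all assumptions made for the example.

```python
def run_closed_loop(question, extractor, retriever, solver, injector, verifier,
                    max_rounds=3):
    """Run the five stages in order; verifier failures feed back to the extractor.

    All stage callables are hypothetical stand-ins supplied by the caller.
    max_rounds must be at least 1.
    """
    feedback = None
    for round_no in range(1, max_rounds + 1):
        queries = extractor(question, feedback)   # stage 1: build retrieval queries
        evidence = retriever(queries)             # stage 2: fetch evidence
        steps = solver(question, evidence)        # stage 3: reasoning steps
        cited = injector(steps, evidence)         # stage 4: attach citations
        ungrounded = verifier(cited, evidence)    # stage 5: find unsupported steps
        if not ungrounded:
            return cited, round_no, True
        feedback = ungrounded                     # targeted re-retrieval next round
    return cited, max_rounds, False


# Toy stages over a two-entry knowledge base (purely illustrative).
KB = {"sky": "blue", "grass": "green"}

def extractor(question, feedback):
    return [question] + list(feedback or [])

def retriever(queries):
    return {q: KB[q] for q in queries if q in KB}

def solver(question, evidence):
    return ["sky", "grass"]  # topics the toy answer makes claims about

def injector(steps, evidence):
    return [(step, step if step in evidence else None) for step in steps]

def verifier(cited, evidence):
    return [step for step, source in cited if source is None]

cited, rounds, grounded = run_closed_loop(
    "sky", extractor, retriever, solver, injector, verifier
)
print(cited, rounds, grounded)
```

In this toy run the first round leaves the "grass" step uncited, so the verifier's output is passed back to the extractor and the second round retrieves the missing evidence.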

02.
arXiv (math.PR) 2026-06-16

Well-posedness of stochastic parabolic equations with gradient nonlinearities and applications to phase-field models

arXiv:2606.15425v1 Announce Type: new Abstract: We study well-posedness of stochastic parabolic equations with gradient nonlinearities. Our analysis is based on recent maximal-regularity frameworks for nonlinear stochastic parabolic equations in critical spaces. We extend the existing results by controlling drift and noise coefficient separately. This way we can allow for less regular driving noise in case of subcritical dispersion coefficients. Our approach, based on gluings of local solutions, moreover implies new continuation criteria. We then apply our existence result and the continuation criteria to show global well-posedness of phase-field models of moving boundary problems.

03.
arXiv (CS.LG) 2026-06-15

FreshRetailNet-50K: A Stockout-Annotated Censored Demand Dataset for Latent Demand Recovery and Forecasting in Fresh Retail

arXiv:2505.16319v4 Announce Type: replace Abstract: Accurate demand estimation is critical for the retail business in guiding the inventory and pricing policies of perishable products. However, it faces fundamental challenges from censored sales data during stockouts, where unobserved demand creates systemic policy biases. Existing datasets lack the temporal resolution and annotations needed to address this censoring effect. To fill this gap, we present FreshRetailNet-50K, the first large-scale benchmark for censored demand estimation. It comprises 50,000 store-product time series of detailed hourly sales data from 898 stores in 18 major cities, encompassing 863 perishable SKUs meticulously annotated for stockout events. The hourly stock status records unique to this dataset, combined with rich contextual covariates, including promotional discounts, precipitation, and temporal features, enable innovative research beyond existing solutions. We demonstrate one such use case of two-stage demand modeling: first, we reconstruct the latent demand during stockouts using precise hourly annotations. We then leverage the recovered demand to train robust demand forecasting models in the second stage. Experimental results show that this approach achieves a 2.73% improvement in prediction accuracy while reducing the systematic demand underestimation from 7.37% to near-zero bias. With unprecedented temporal granularity and comprehensive real-world information, FreshRetailNet-50K opens new research directions in demand imputation, perishable inventory optimization, and causal retail analytics. The unique annotation quality and scale of the dataset address long-standing limitations in retail AI, providing immediate solutions and a platform for future methodological innovation. The data (https://huggingface.co/datasets/Dingdong-Inc/FreshRetailNet-50K) and code (https://github.com/Dingdong-Inc/frn-50k-baseline) are openly released.

04.
arXiv (quant-ph) 2026-06-19

Frequency-Multiplexed Millimeter-Wave Fault-Tolerant Superconducting Qubits Enabled by an On-Chip Nonreciprocal Control Bus

arXiv:2512.17588v2 Announce Type: replace Abstract: Scaling superconducting quantum processors is fundamentally limited by the escalating complexity of cryogenic wiring and the detrimental effects of microwave crosstalk and Purcell decay. This paper proposes a novel architecture based on frequency-multiplexed millimeter-wave superconducting qubits, integrating an on-chip cryogenic nonreciprocal space-time-periodic Josephson frequency multiplier as a universal control bus. The bus replaces multiple high-frequency XY drive lines with a single low-frequency input tone, which is parametrically converted into a comb of high-order harmonics, each resonantly addressing a distinct qubit. The nonreciprocal nature of the bus provides intrinsic isolation that suppresses Purcell decay and reduces coherent crosstalk by more than 98% compared to a conventional reciprocal shared drive line. Full error-budget analysis demonstrates that the architecture can maintain gate errors below the fault-tolerance threshold for arrays exceeding 25 qubits, converting a crosstalk-dominated error budget into one primarily limited by intrinsic material coherence. Theoretical modeling based on a non-Markovian master equation further indicates that the engineered environment enables information backflow, offering a pathway to enhanced coherence. This integrated, frequency-multiplexed, and nonreciprocal control bus offers a compelling route toward dramatic I/O simplification, improved noise resilience, and scalable high-coherence superconducting quantum processors.

05.
arXiv (CS.AI) 2026-06-16

Policy Regret for Embedding Model Routing: Contextual Bandits with Low-Rank Experts

arXiv:2606.14929v1 Announce Type: cross Abstract: Modern recommendation systems increasingly rely on dynamically routing diverse queries to multiple embedding models. Despite its practical significance, this problem remains poorly understood under realistic conditions like adversarial queries, bandit feedback, and limited observability of models. We formalize embedding model routing as an adversarial contextual linear bandit with low-rank experts, where contexts are queries, actions are items, and experts are the embedding models working on low-rank latent representation spaces. We first establish that standard regret notions suffer from structural misspecification or statistical intractability, and we identify a log-quadratic policy class that is expressive enough to capture query-dependent model routing, yet structured enough to allow efficient online learning. Second, we propose a policy gradient algorithm called Hypentropy Policy Gradient (HPG). It provably adapts to the unknown low-rank structure under incomplete information and attains $\tilde{\mathcal O}(s\sqrt{M T})$ linearized policy regret – where $s, M$, and $T$ are the intrinsic rank of the experts, the number of models, and the number of rounds – thus avoiding a curse of dimensionality. Finally, we also provide a computationally efficient and parameter-free implementation of HPG.

06.
arXiv (math.PR) 2026-06-19

An alternative approach to well-posedness of McKean-Vlasov equations arising in Consensus-Based Optimization

arXiv:2512.19446v4 Announce Type: replace-cross Abstract: In this work we study the mean-field description of Consensus-Based Optimization (CBO), a derivative-free particle optimization method. Such a description is provided by a non-local SDE of McKean-Vlasov type, whose fields lack global Lipschitz continuity. We propose a novel approach to prove the well-posedness of the mean-field CBO equation based on a truncation argument. The latter is performed through the introduction of a cut-off function, defined on the space of probability measures, acting on the fields. This procedure allows us to study the well-posedness problem in the classical framework of Sznitman. Through this argument, we recover the established result on the existence of strong solutions, and we extend the class of solutions for which pathwise uniqueness holds.

07.
arXiv (CS.LG) 2026-06-18

Quantum Annealing Enhanced Reinforcement Learning for Accurate Remaining Useful Lifetime Prediction

arXiv:2606.18503v1 Announce Type: new Abstract: Remaining useful life (RUL) estimation is central to predictive maintenance, where an unplanned failure can cost far more than the asset itself. Statistical degradation models miss the strong nonlinearity of real systems, and data-driven models often converge to suboptimal solutions in high-dimensional, non-convex search spaces. We propose a Quantum Annealing enhanced Q-Learning (QAQL) framework that couples the sampling behaviour of quantum annealing with the sequential decision making of Q-learning. Each Q-value update is encoded as a small quadratic unconstrained binary optimization (QUBO) whose ground state is the greedy action; rather than acting as a deterministic optimizer, the annealer returns a distribution over near-optimal actions across many reads, and this stochastic action selection supplies the exploration that curbs premature convergence on nonlinear degradation trajectories. The QUBO is solved on the D-Wave Advantage system using minor embedding, with the annealer woven into the reinforcement-learning loop rather than bolted on after training. We validate QAQL on two public benchmarks: the NASA C-MAPSS turbofan engine datasets and a device-fleet predictive maintenance dataset. Averaged over many independent runs and across six error metrics, QAQL outperforms the classical and quantum baselines considered in this study, with statistically significant improvements. The results indicate that quantum annealing is a usable, not merely theoretical, optimizer inside a reinforcement-learning loop for industrial predictive-maintenance applications.
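One way to picture "a small QUBO whose ground state is the greedy action" is a one-hot encoding over actions, sketched below. The encoding, the penalty choice, and the names `build_one_hot_qubo` and `brute_force_ground_state` are assumptions for illustration; the abstract does not specify the authors' formulation, and an exhaustive search stands in for the annealer here.

```python
from itertools import product

def build_one_hot_qubo(q_values, penalty):
    """QUBO for minimizing -sum_a Q[a]*x[a] + penalty*(sum_a x[a] - 1)**2.

    Expanding the square with x*x == x for binary x gives diagonal terms
    -Q[a] - penalty and pairwise terms 2*penalty (the constant is dropped).
    """
    n = len(q_values)
    qubo = {}
    for a in range(n):
        qubo[(a, a)] = -q_values[a] - penalty
        for b in range(a + 1, n):
            qubo[(a, b)] = 2.0 * penalty
    return qubo

def energy(qubo, bits):
    """Energy of a binary assignment under the QUBO coefficients."""
    return sum(w * bits[i] * bits[j] for (i, j), w in qubo.items())

def brute_force_ground_state(qubo, n):
    """Exhaustive search over all 2**n assignments (stand-in for an annealer)."""
    return min(product((0, 1), repeat=n), key=lambda bits: energy(qubo, bits))

q_values = [0.2, 1.5, -0.3, 0.9]                   # hypothetical Q(s, a) for one state
penalty = 1.0 + max(abs(q) for q in q_values)      # larger than any |Q| value
qubo = build_one_hot_qubo(q_values, penalty)
ground = brute_force_ground_state(qubo, len(q_values))
greedy_action = ground.index(1)
print(ground, greedy_action)
```

With the penalty larger than every |Q| value, any assignment that selects zero or several actions costs more than the best single-action assignment, so the minimum-energy bit string picks out the argmax action. A sampler that returns low-energy states across many reads would instead yield a distribution concentrated on near-optimal actions, which is the exploration effect the abstract describes.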

08.
arXiv (quant-ph) 2026-06-16

Quantum Information Geometry of Multicomponent Superconducting Fluctuation Transport

arXiv:2606.15928v1 Announce Type: cross Abstract: Quantum geometry underlies many electronic responses, but its transport signatures have so far been established mainly for pure single-particle Bloch states. Whether collective many-body fluctuations possess a measurable quantum geometry remains largely unexplored. Here we show that superconducting fluctuation transport provides a direct probe of quantum information geometry in collective many-body matter. Starting from a multicomponent time-dependent Ginzburg-Landau theory in the Gaussian fluctuation regime, we identify the equilibrium density matrix of fluctuating Cooper pairs as the static pair propagator, which defines a positive mixed-state manifold in momentum space. The geometry of this manifold is directly measurable through paraconductivity: the longitudinal paraconductivity is governed by the quantum Fisher information of superconducting fluctuation modes, while the fluctuational anomalous Hall effect is governed by the mean Uhlmann curvature, the mixed-state counterpart of Berry curvature. This correspondence further yields geometric bounds between these two transport components, with no direct analogue in normal electronic transport. Applied to chiral superconducting fluctuations in quarter-metal systems motivated by rhombohedral multilayer graphene, a symmetry-allowed Lifshitz invariant generates finite mean Uhlmann curvature and logarithmically enhances the anomalous Hall conductivity above the critical temperature. Our results establish collective superconducting fluctuations as an experimentally accessible transport probe of mixed-state quantum information geometry.

09.
arXiv (quant-ph) 2026-06-15

Spin-orbit coupling by design in quantum state engineering of atomically defined quantum dots

arXiv:2606.14487v1 Announce Type: cross Abstract: Tuning spin-orbit coupling is essential in controlling both spin and charge in confined semiconductor nanostructures, yet it is rarely a truly controllable parameter. Here, we show control over the spin-orbit Hamiltonian in quantum dots and the resulting quantum states by tailoring the confinement potential with atomic-scale precision. Using scanning tunnelling microscopy and spectroscopy, we pattern individual Cs ions into designer quantum dot structures on the surface of indium antimonide, in which electrons from a two-dimensional electron gas are confined with chosen in-plane electric-field gradients. We then quantify the atomic level structure, both spatially resolving the orbital character of the electronic states and their magnetic-field evolution. We demonstrate that the level structure, including the induced zero-field splitting, can be tailored by the designed geometry of the local electric fields. These effects can be described using a Hamiltonian that allows consistent treatment of the confinement-induced spin-orbit coupling beyond the conventional Bychkov-Rashba description. This Hamiltonian is derived from a multiband k.p model and takes the energy dependence of the relevant physical parameters into account. Such precise control of spin-orbit coupling in semiconductor quantum dots is relevant to quantum and spintronic technologies.

10.
arXiv (CS.AI) 2026-06-11

Implicit Neural Representations of Individual Behavior

arXiv:2606.12200v1 Announce Type: cross Abstract: We study policy representation learning from unlabeled multi-policy behavioral data. Each episode is generated by a fixed policy, but policy labels are unavailable. This setting appears in robotics play, demonstrations, games, racing, and other datasets where heterogeneous behaviors are mixed without annotations. We introduce Behavioral INR, a self-supervised generative model that adapts implicit neural representations (INRs) from vision to behavior. Instead of mapping coordinates to RGB values, Behavioral INR represents a policy as a state-action function mapping states to subsequent actions. An episode-level latent modulates this function through FiLM layers, yielding a generative prior over policies and allowing policy identity to be inferred without supervision. Because INRs treat each datapoint as samples from an underlying function, the same model naturally accommodates variable episode lengths and different sampling granularities, as in vision INRs with different image resolutions. We also define policy-level out-of-distribution (OOD) shifts along state-distribution and action-distribution axes, which arise when policies overlap in states or actions but are not captured by standard behavioral OOD settings based only on new agents or environments. We evaluate on synthetic Gaussian random field data, MuJoCo demonstrations with controlled OOD splits, and real-world chess, Formula 1 racing, robotics, and Seek-Avoid datasets. Behavioral INR most consistently improves policy identifiability in the hardest continuous state-action settings, especially when longer episodes, more policies, and OOD splits reduce the usefulness of marginal shortcuts; amortized history encoders remain competitive when policy identity can be recovered from symbolic repetition or low-dimensional action statistics. We release code and checkpoints.

11.
PLOS Computational Biology 2026-06-05

A multiscale, Bayesian inference approach to augment mechanistic models of cell signaling with machine-learning predictions of binding affinity

by Holly A. Huber, Stacey D. Finley

Computational models in systems biology are often underdetermined: that is, there is little data relative to the complexity and size of the model. This lack of data is primarily due to limits in our ability to observe specific biological systems and restricts the utility of computational models. To reduce this uncertainty, recent methods have explored augmenting parameter inference of systems biology models with predictions from machine learning models. Such approaches expand the pool of data that is applicable for the inference problem. Here, we explore augmenting the parameter inference of intracellular signaling models. We choose to investigate signaling because experimental measurements of the variables of interest, protein dynamics, are still quite limited. To investigate, we propose a novel, multiscale, Bayesian inference approach that augments traditional signaling data with predictions of binding affinity. These predictions are generated using a machine learning pipeline with measurements of amino acid sequence, from the Universal Protein Resource, or protein structure, from the Protein Data Bank, as inputs. We find that we can successfully integrate these measurements into the inference problem using our novel framework. Excitingly, this integration significantly improves the parameter estimates of signaling models. We demonstrate that how much this improvement impacts predictions of signaling depends on the sensitivity of the prediction to perturbations in the parameter values. Overall, the framework we establish here improves the parameter inference of intracellular signaling models by successfully bridging data on protein sequence and structure with systems-level signaling.

12.
arXiv (CS.AI) 2026-06-12

Multi-Field Hybrid Retrieval-Augmented Generation for Maritime Accident Root Cause Analysis

arXiv:2606.13249v1 Announce Type: new Abstract: Maritime accident adjudication reports contain critical tribunal findings for root cause analysis (RCA), yet retrieving relevant precedents and drafting consistent reports from decades of records remains labor-intensive. This paper proposes a multi-field hybrid retrieval-augmented generation (RAG) framework for automated maritime RCA, utilizing a comprehensive dataset of 13,329 Korea Maritime Safety Tribunal (KMST) reports (1971-2025). We transform raw adjudications into a structured knowledge base of "incident cards", indexing three distinct fields (Summary, Causes, and Disposition) alongside a hierarchical L1/L2 cause taxonomy. Our retrieval strategy employs a field-aware hybrid approach, fusing sparse and dense rankings via Reciprocal Rank Fusion (RRF). Given the lack of large-scale expert relevance labels, we evaluate retrieval performance using ceiling-normalized recall and nDCG based on a metadata-derived proxy relevance score. Experimental results demonstrate that our proposed retrieval significantly outperforms baseline methods, improving NormRecall@100 from 0.18 to 0.55. Furthermore, grounding the generator on the retrieved precedents enhances RCA generation quality over an LLM-only baseline, increasing the LLM-as-a-judge score from 3.34 to 3.72. These findings suggest that field-aware RAG can substantially streamline maritime safety investigation workflows by enabling faster precedent search and more consistent, evidence-based RCA drafting.
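For readers unfamiliar with Reciprocal Rank Fusion, a minimal sketch is given below. It uses the standard formulation in which each list contributes 1/(k + rank) to a document's score. The constant k = 60 is a commonly used default and an assumption here, since the abstract does not state the value used; the document IDs and the two input rankings are made up for the example.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each list contributes 1 / (k + rank) to a document's score, with
    rank starting at 1. Documents missing from a list get nothing from it.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

sparse = ["case_17", "case_03", "case_42"]  # hypothetical keyword-based ranking
dense = ["case_03", "case_99", "case_17"]   # hypothetical embedding-based ranking
fused = reciprocal_rank_fusion([sparse, dense])
print(fused)
```

Because RRF uses only rank positions, it needs no score normalization between the sparse and dense retrievers. A field-aware variant could pass one ranking per field and retriever into the same function.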

13.
arXiv (CS.LG) 2026-06-12

Earth Science Foundation Models: From Perception to Reasoning and Discovery

arXiv:2605.12542v2 Announce Type: replace-cross Abstract: Large foundation models (FMs) are transforming Earth science by integrating heterogeneous multimodal data, such as multi-platform imagery, gridded reanalysis data, diverse geophysical and geochemical observations, and domain-specific text, to support tasks ranging from basic perception to advanced scientific discovery. This paper provides a unified review of Earth science foundation models (Earth FMs) through two complementary dimensions: depth, which traces the evolution of model capabilities from perception to multimodal reasoning and agentic scientific workflows, and breadth, which summarizes their expanding applications across the atmosphere, hydrosphere, lithosphere, biosphere, anthroposphere, and cryosphere, as well as coupled Earth system processes. Using this framework, we review representative multimodal Earth foundation models and compile more than 200 datasets and benchmarks spanning diverse Earth science tasks and modalities. We further discuss key challenges in multimodal data heterogeneity, scientific reliability and continual updating, scalability and sustainability, and the transition from foundation models to agentic and embodied Earth intelligence, and outline future directions toward more integrated, trustworthy, and actionable AI Earth scientists. Overall, this paper offers a structured roadmap for understanding the development of Earth foundation models from both capability depth and application breadth.

14.
arXiv (CS.AI) 2026-06-16

BRIDGE: Biological Evidence Refinement and Heterogeneous Dynamic Gating for Gene Regulatory Networks

arXiv:2606.14734v1 Announce Type: cross Abstract: Motivation: Gene regulatory network inference from single-cell RNA sequencing (scRNA-seq) data is important for uncovering cell-state-specific transcriptional programs. However, scRNA-seq measurements are sparse and noisy, and experimentally validated TF-target interactions remain limited, making reliable inference challenging. Although graph neural networks have advanced GRN prediction, existing methods often rely on biologically unconstrained graph augmentation, such as random edge perturbation, and insufficiently control information transfer between genes and cells. These limitations may distort regulatory structures and weaken robustness under noisy and weakly supervised settings. Results: To address these issues, we propose an innovative framework named Biological Evidence Refinement and Heterogeneous Dynamic Gating for Gene Regulatory Networks (BRIDGE). BRIDGE extracts gene and cell representations from the expression matrix and its matrix dual, and performs contrastive learning in the gene space and cell space between self and neighbors across the co-expression-refined regulatory view and the original graph. It then applies heterogeneous gated encoding to adaptively regulate information transfer between genes and cells, enabling robust transcription factor-to-target gene prediction. Experiments on benchmark datasets spanning three network types and seven cell types show that BRIDGE achieves state-of-the-art AUROC and AUPRC in most settings. In particular, on Specific networks, BRIDGE improves average AUPRC by 5% over the second-best baseline, GCLink. In cross-cell-type few-shot transfer, BRIDGE consistently outperforms GCLink and GENELink across all six target cell types. A case study on hESC further supports the biological relevance of the predictions, with 9 of the top 10 and 46 of the top 100 novel TF-target interactions validated by ChIPBase.

15.
arXiv (CS.LG) 2026-06-18

Automated Byzantine-Resilient Clustered Decentralized Federated Learning for Battery Intelligence in Connected EVs

arXiv:2605.21115v2 Announce Type: replace-cross Abstract: Federated learning (FL) has emerged as a promising paradigm for managing electric vehicle (EV) battery data in intelligent transportation systems (ITS), enabling privacy-preserving tasks such as anomaly detection and capacity estimation. However, most existing frameworks rely on centralized aggregation schemes, which pose critical limitations in terms of security and trust. To address these challenges, we propose ABC-DFL, an automated Byzantine-resilient clustered decentralized federated learning (C-DFL) framework for connected EVs. The proposed incentive-driven C-DFL system replaces the central server with an open-permissioned blockchain, featuring a new dynamic Quorum Byzantine Fault Tolerance (QBFT) protocol and an oracle-based aggregation layer, to enhance trust, security, and automation. At the core of ABC-DFL lies FLECA (Filtered Layered Enhanced Clustering Aggregation), a robust hierarchical aggregation protocol that mitigates Byzantine attacks by having each EV filter malicious updates using an adaptive threshold based on deviations from its reference model update. Oracle nodes, responsible for inter-group aggregation, employ robust clustering to isolate and aggregate model updates from trustworthy EV groups. Comprehensive experimental evaluations demonstrate that FLECA matches FedProx convergence under benign conditions and significantly outperforms existing defenses with attack impact scores below 0.10 in adaptive adversarial scenarios. Furthermore, several learning experiments with multitask models confirm the effectiveness and fairness of the incentive mechanism. Finally, on-chain and off-chain benchmarks validate the practicality of ABC-DFL.
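The idea of filtering updates with an adaptive threshold on their deviation from a reference update can be sketched as follows. This is not FLECA itself: the median-plus-MAD threshold, the `scale` parameter, and the function name `filter_updates` are assumptions chosen to show one plausible form of such a filter.

```python
import math
from statistics import median

def filter_updates(reference, updates, scale=3.0):
    """Keep updates whose distance to the reference is within an adaptive bound.

    The bound is median(distance) + scale * MAD(distance), so it adapts to
    the spread of the updates received in the current round.
    """
    distances = [math.dist(reference, update) for update in updates]
    med = median(distances)
    mad = median([abs(d - med) for d in distances])
    threshold = med + scale * mad
    return [u for u, d in zip(updates, distances) if d <= threshold]

reference = [0.0, 0.0]            # this node's own (trusted) model update
updates = [
    [0.1, 0.0],
    [0.0, 0.2],
    [-0.1, 0.1],
    [5.0, 5.0],                   # an outlying, possibly malicious update
]
kept = filter_updates(reference, updates)
print(kept)
```

A robust statistic such as the median is used here so that a minority of extreme updates cannot inflate the threshold enough to admit themselves.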

16.
arXiv (CS.LG) 2026-06-18

Measurement noise limits the advantage of nonlinear models over linear models in biomedical prediction

arXiv:2606.18420v1 Announce Type: new Abstract: On biomedical tabular data, flexible models such as deep networks, gradient-boosted trees, and kernel methods are repeatedly matched or beaten by linear and logistic regression given the same features. The usual reaction is to treat this as a model-side shortfall, to be fixed with more data, a better architecture, or tuning, on the assumption that the nonlinear structure is there and the model has failed to capture it. We argue that these fixes cannot help when the binding limit is the measurement rather than the model, as it frequently is in biomedicine. Additive noise blurs the population-optimal predictor, and because blurring removes a function's fine, rapidly varying detail before its broad shape, it erases nonlinear structure faster than linear structure. A degree-$k$ interaction is attenuated by the $k$-th power of feature reliability, while the linear part is attenuated only once. At the reliabilities typical of biomedical measurement, the nonlinear advantage can vanish even when the underlying biology is strongly nonlinear, and what the noise removes cannot be recovered by a larger cohort or a more flexible model, only by better measurement. The nonlinearity is hidden, not absent, and a tie between linear and flexible models is not by itself a verdict on the biology. These pieces are classical, drawn from measurement-error statistics, psychometrics, and Gaussian analysis, and we assemble them into an exact excess-risk identity. Measurement reliability is one of three conditions, alongside sample size and feature representation, that must align for a flexible model to help, and together they leave only a narrow window that most biomedical tasks fall outside. Across 140 UK Biobank tasks, the gap between flexible and linear models, where it exists, carries the predicted noise signature, and the three conditions can be separated by intervention but not by a benchmark alone.
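The stated scaling, in which a degree-k interaction is attenuated by the k-th power of feature reliability, can be made concrete with a few lines of arithmetic. The sketch below simply evaluates that power law; the reliability values are illustrative, and the helper name `attenuation` is an assumption rather than anything from the paper.

```python
def attenuation(reliability, degree):
    """Attenuation factor for a degree-`degree` term, per the stated power law."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    if degree < 1:
        raise ValueError("degree must be at least 1")
    return reliability ** degree

# Compare how much of a linear term (degree 1) survives against
# pairwise (degree 2) and three-way (degree 3) interactions.
for reliability in (0.9, 0.7, 0.5):
    factors = [round(attenuation(reliability, k), 3) for k in (1, 2, 3)]
    print(reliability, factors)
```

At a reliability of 0.7, for instance, a linear term keeps a factor of 0.7 while a three-way interaction keeps only about 0.34, which shows how a nonlinear advantage can shrink faster than the linear signal as measurement quality drops.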

17.
arXiv (CS.CL) 2026-06-16

Islamic Large Language Models: From Knowledge Acquisition to Trustworthy and Hallucination-Resistant AI

Large language models (LLMs) are increasingly used for knowledge-intensive question answering, including religious and legal questions. Islamic knowledge is a particularly demanding setting: answers are expected to be grounded in authoritative sources, citations must be exact, Arabic varieties differ substantially from the language of classical sources, and legitimate jurisprudential disagreement must be represented rather than collapsed into a single answer. This survey reviews the emerging field of Islamic LLMs and trustworthy Islamic AI. We organize the literature around Arabic NLP and Arabic-centric LLMs, Islamic NLP resources, Qur'anic question answering, Islamic knowledge benchmarks, retrieval-augmented generation, Islamic legal reasoning, inheritance reasoning, hallucination evaluation, and trustworthiness. We argue that fluency in Arabic is not sufficient for Islamic AI. Reliable systems require curated sources, retrieval and verification modules, citation-aware generation, madhhab-aware reasoning, human expert evaluation, and benchmarks that measure not only answer accuracy but also faithfulness, source validity, and reasoning quality. The survey concludes with a research agenda for hallucination-resistant Islamic AI systems.

18.
Nature Biotechnology 2026-06-19

Optimized R2 retroelement complexes for DNA insertion into plant genomes

Traditional approaches for DNA insertion into plant genomes using Agrobacterium tumefaciens result in random integration. Newer genetic engineering methods based on nucleases, prime editors, transposases and recombinases extend capabilities but remain constrained by low efficiencies, off-target integration or limited payload size. Here we adapt the avian Taeniopygia guttata R2 protein (R2Tg) for targeted DNA insertion into plant genomes by engineering R2Tg expression cassettes and RNA payloads carrying intron-disrupted reporters, with optimized ribosomal DNA homology arms and untranslated regions. In Arabidopsis thaliana protoplasts, Nicotiana benthamiana leaves and Solanum lycopersicum seedlings, our R2Tg editor system achieves targeted insertion of full-length payloads ranging from 2.2 kb to 5 kb. In Nicotiana benthamiana leaves, integration occurs, on average, at 1 copy per genome, which is 30 times more efficient than that achieved by Cas9 homology-directed repair. This work establishes an R2Tg ribonucleoprotein platform for targeted DNA insertion into plant genomes, using a multicopy genomic safe-harbor site to enable efficient addition of multikilobase genes. R2 retrotransposons are used to integrate DNA into plant and crop 25S ribosomal DNA sites.

19.
medRxiv (Medicine) 2026-06-15

Beyond the Apnea-Hypopnea Index: Physiological and Demographic Predictors of Excessive Daytime Sleepiness in Obstructive Sleep Apnea

Excessive daytime sleepiness (EDS) is a common but inconsistently predicted symptom of obstructive sleep apnea (OSA). OSA is typically diagnosed with polysomnography (PSG), and the current standard for severity assessment is the apnea-hypopnea index (AHI). AHI has many limitations, including its inability to explain physiological mechanisms or reflect variability in patient symptoms, such as EDS. This retrospective study aims to find physiological and demographic parameters that better predict EDS in patients with OSA and to evaluate whether these parameters outperform AHI using PSG data from the Mount Sinai Integrative Sleep Center. Clinical variables used to predict EDS included arousal index (AI), average oxygen desaturation during sleep, average heart rate during sleep, and AHI, along with demographic variables including age, sex, and BMI. Hypothesis tests, logistic regression models, and decision tree classifier models were applied to the data to discriminate sleepy from nonsleepy patients as determined by an Epworth Sleepiness Scale (ESS) score ≥ 10. AI and oxygen desaturation were found to be the most predictive physiological variables, and sex and BMI were found to be the most predictive demographic variables. The final decision tree model with these four variables outperformed the AHI in predicting EDS. These findings suggest that daytime sleepiness in OSA can be better explained by measures of apnea burden, oxygenation impairment, and patient demographics than by AHI alone, although these remain only modestly predictive. Future studies should focus on investigating more comprehensive physiological markers, multi-night sleep data, and more objective assessments of sleepiness.

20.
arXiv (CS.CL) 2026-06-18

EARS: Explanatory Abstention for Reliable Sub-Agent Modeling in Large-scale Multi-Agent Systems

In large-scale enterprise settings, centralized multi-agent systems (MAS) are increasingly adopted, in which a coordinator delegates user requests to lightweight, domain-specialized sub-agents. While this architecture improves modularity, scalability, and cost efficiency, its reliability depends not only on accurate routing but also on sub-agents' ability to calibrate their responses to capability constraints. In particular, sub-agents built on smaller fine-tuned models often struggle with such calibration, leading them to over-answer ambiguous, underspecified, misrouted, or unsupported requests and produce hallucinated outputs instead of actionable feedback. To address this challenge, we present EARS (Explanatory Abstention for Reliable Sub-Agent Modeling), a production-oriented framework that reframes sub-agent abstention as an inter-agent communication protocol: a sub-agent does not merely abstain, but exposes an actionable failure state to the coordinator. EARS curates human-agent interaction data using an ensemble of calibrated LLM-as-a-Judge models, producing structured abstention labels and rationales under a taxonomy of sub-agent failure modes. These data are used to fine-tune sub-agents to detect failure conditions and return rationales for coordinator-level clarification, rerouting, or fallback. We evaluate EARS in a large-scale production e-commerce assistant supporting enterprise business intelligence workflows. EARS improves the overall response pass rate from 68.5% to 78.9%, demonstrating that sub-agent-side explanatory abstention improves MAS reliability.

21.
arXiv (CS.LG) 2026-06-16

Towards CONUS-Wide ML-Augmented Conceptually-Interpretable Modeling of Catchment-Scale Precipitation-Storage-Runoff Dynamics

arXiv:2510.02605v2 Announce Type: replace Abstract: While many modern studies are dedicated to ML-based large-sample hydrologic modeling, these efforts have not necessarily translated into predictive improvements that are grounded in enhanced physical-conceptual understanding. Here, we report on a CONUS-wide large-sample study (spanning diverse hydro-geo-climatic conditions) using ML-augmented physically-interpretable catchment-scale models of varying complexity based on the Mass-Conserving Perceptron (MCP). Results were evaluated using attribute masks such as snow regime, forest cover, and climate zone. Our results indicate the importance of selecting model architectures of appropriate complexity based on how process dominance varies with hydrological regime. Benchmark comparisons show that physically-interpretable mass-conserving MCP-based models can achieve performance comparable to data-based models based on the Long Short-Term Memory network (LSTM) architecture. Overall, this study highlights the potential of a theory-informed, physically grounded approach to large-sample hydrology, with emphasis on mechanistic understanding and the development of parsimonious and interpretable model architectures, thereby laying the foundation for future models of everywhere that architecturally encode information about spatially- and temporally-varying process dominance.

22.
arXiv (CS.CV) 2026-06-18

SMART: A Flexible, Interpretable, and Scalable Spatio-temporal Brain Atlas from High-Resolution Imaging Data

We introduce SMART, a framework for learning a flexible, interpretable, and scalable spatio-temporal brain atlas from longitudinal high-resolution 3D medical images. Existing approaches to spatio-temporal atlas construction rely on black-box generative models that lack flexibility, limit interpretability, and struggle to scale to high-dimensional data. SMART addresses these challenges by learning a continuous disease-time atlas that decouples global group-wise disease dynamics from their patient-specific anatomical manifestation. Guided by anatomically inspired priors, SMART models interpretable global trajectories of regional progression along a shared disease timeline through region-specific differential equations. Global trajectories are further personalized to individual anatomies via dense diffeomorphic displacements parameterized by a flexible and scalable multi-scale Neural Cellular Automata. Evaluated on five longitudinal MRI datasets in Alzheimer's disease (ADNI-1/GO/2, OASIS-3, AIBL; > 1,300 subjects), SMART produces anatomically meaningful predictions of disease progression and achieves state-of-the-art forecasting accuracy and improved temporal consistency over adversarial and diffusion baselines. Our approach establishes a new paradigm for flexible, interpretable, and scalable modeling of spatio-temporal change in high-dimensional medical image time-series.

23.
arXiv (CS.LG) 2026-06-11

A Multi-Modal Sensor Fusion Instrument for Measuring Regional Human Mobility: The Distributed Human Data Engine (DHDE)

arXiv:2603.21639v2 Announce Type: replace-cross Abstract: Accurately estimating human mobility in peripheral regional economies presents a fundamental measurement challenge: physical ground-truth sensors are sparse, behavioral intent signals are heterogeneous, and environmental friction introduces systematic bias into demand inference. We introduce the Distributed Human Data Engine (DHDE), a multi-modal sensor fusion architecture that addresses this challenge by integrating physical instrumentation (Edge-AI cameras), digital intent signals (route search impression metrics), behavioral records (90,350 spending records, 97,719 standardized survey responses), and meteorological data across four geographically distributed nodes in Fukui, Japan. The primary measurement-science contribution is the design, deployment, and cross-node validation of the DHDE as a sparse-sensor compensation instrument: a heterogeneous sensor fusion architecture that anchors non-stationary digital intent signals to concurrent physical ground-truth counts, correcting for systematic bias introduced by meteorological planning friction. The instrument is implemented as an ensemble inference pipeline (Random Forest and Ordinary Least Squares with Newey-West robust inference), calibrated across 397 daily observations and validated by chronological holdout replication across four geographically distinct node types. The primary OLS specification achieved an in-sample explanatory power of R² = 0.810 and a chronological out-of-sample predictive performance of R² = 0.683. Results identify an Under-Vibrancy Paradox where macro-regional visitor satisfaction correlates positively with crowd density (Spearman rank correlation r_s = +0.150, p = 0.002). We estimate an annual proxy gap of 865,917 intent-implied visits, corresponding to JPY 11.96 billion (USD 72.6 million) in foregone revenue.

24.
arXiv (CS.AI) 2026-06-15

HarnessX: A Composable, Adaptive, and Evolvable Agent Harness Foundry

arXiv:2606.14249v1 Announce Type: new Abstract: AI agent performance depends critically on the runtime harness, comprising the prompts, tools, memory, and control flow that mediate how a model observes, reasons, and acts. Yet today's harnesses remain largely hand-crafted and static: each new model or task still demands bespoke scaffolding, and the rich traces produced during execution are rarely distilled back into systematic improvement. We introduce HarnessX, a foundry for composable, adaptive, and evolvable agent harnesses. HarnessX assembles typed harness primitives via a substitution algebra, adapts them through AEGIS, a trace-driven multi-agent evolution engine grounded in an operational mirror between symbolic adaptation and reinforcement learning, and closes the harness-model loop by turning trajectories into both harness updates and model training signal. Across five benchmarks (ALFWorld, GAIA, WebShop, τ³-Bench, and SWE-bench Verified), HarnessX yields an average gain of +14.5% (up to +44.0%), with gains largest where baselines are lowest. These results suggest that agent progress need not come from model scaling alone: composing and evolving runtime interfaces from execution feedback is an actionable and complementary lever. The complete codebase will be open-sourced in a future release.

25.
arXiv (CS.AI) 2026-06-17

LLM-Aided Joint Secrecy Precoding and Trajectory for RSMA-Based Heterogeneous UAV Networks

arXiv:2507.17188v3 Announce Type: replace-cross Abstract: This paper investigates secure communications in rate-splitting multiple access (RSMA) enabled heterogeneous UAV networks, where multiple UAVs collaboratively serve ground terminals in the presence of eavesdroppers. By jointly considering secrecy rate maximization and propulsion energy consumption minimization, we formulate a multi-objective optimization problem involving UAV trajectory design, service association, power allocation, and secrecy precoding under mobility, collision-avoidance, service-capacity, and communication constraints. The formulated problem is highly non-convex due to the coupling among UAV trajectories, RSMA transmission variables, and secrecy constraints. To address the resulting non-convex and highly coupled optimization problem, we propose a hierarchical optimization framework. The inner layer uses a semidefinite relaxation (SDR)-based S2DC algorithm combining penalty functions and difference-of-convex (D.C.) programming to solve the secrecy precoding problem with fixed UAV positions. The outer layer introduces a Large Language Model (LLM)-guided heuristic multi-agent reinforcement learning approach (LLM-HeMARL) for trajectory optimization. LLM-HeMARL efficiently incorporates an LLM-generated expert heuristic policy, enabling UAVs to learn energy-aware, security-driven trajectories without the inference overhead of real-time LLM calls. The simulation results show that our method outperforms existing baselines in secrecy rate and energy efficiency, with consistent robustness across varying UAV swarm sizes and random seeds.