Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
bioRxiv (Bioinfo) 2026-06-23

FateLimit quantifies the prediction horizon of cell fate

Single-cell technologies have enabled increasingly detailed reconstruction of developmental trajectories, yet a fundamental question remains unresolved: when does future cellular identity become predictable from cells current molecular state? Existing approaches infer lineage relationships, transition probabilities or future transcriptional dynamics, but do not directly quantify the emergence of fate predictability during cellular state transitions. Here we present FateLimit, an information-theoretic framework for measuring the temporal dynamics of cell-fate predictability from single-cell omics data. FateLimit combines probabilistic fate assignment, fate entropy and mutual information to quantify how information about future cellular outcomes is encoded in present molecular states. We introduce two quantitative descriptors: the Fate Information Half-Life (FIHL), which measures the characteristic timescale of fate-information dynamics, and the Prediction Horizon (PH), defined as the earliest developmental stage at which observed fate predictability exceeds the 95th percentile of a permutation-derived null distribution. We applied FateLimit across developmental, lineage-tracing and reprogramming systems, including pancreatic endocrinogenesis, CellTag reprogramming, human hematopoiesis and zebrafish embryogenesis. Across all datasets, FateLimit identified significant fate information and reproducible prediction horizons that were robust to cell-state representation, lineage structure and biological context. Comparative analysis revealed that prediction horizons differ substantially among cellular lineages, indicating that distinct developmental programs acquire predictive information at different rates. FateLimit establishes a general framework for quantifying the predictability of future cellular identity from present molecular states. By transforming developmental trajectories into predictability landscapes, FateLimit enables systematic comparison of commitment dynamics across biological systems and establishes prediction horizons as a quantitative measure of cell-fate determination.

02.
arXiv (CS.CL) 2026-06-25

MedLayBench-V: A Large-Scale Benchmark for Expert-Lay Semantic Alignment in Medical Vision Language Models

Medical Vision-Language Models (Med-VLMs) have achieved expert-level proficiency in interpreting diagnostic imaging. However, current models are predominantly trained on professional literature, limiting their ability to communicate findings in the lay register required for patient-centered care. While text-centric research has actively developed resources for simplifying medical jargon, there is a critical absence of large-scale multimodal benchmarks designed to facilitate lay-accessible medical image understanding. To bridge this resource gap, we introduce MedLayBench-V, the first large-scale multimodal benchmark dedicated to expert-lay semantic alignment. Unlike naive simplification approaches that risk hallucination, our dataset is constructed via a Structured Concept-Grounded Refinement (SCGR) pipeline. This method enforces strict semantic equivalence by integrating Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs) with micro-level entity constraints. MedLayBench-V provides a verified foundation for training and evaluating next-generation Med-VLMs capable of bridging the communication divide between clinical experts and patients.

03.
arXiv (CS.AI) 2026-06-16

Calibrated Sampling-Free Uncertainty Estimation in Bayesian Deep Learning

arXiv:2606.16214v1 Announce Type: cross Abstract: Modern deep learning models remain notoriously prone to overconfidence, limiting their reliability in high-stakes applications. Bayesian methods aim to counter this by learning a distribution over model parameters, and recent advances now make this feasible for large-scale architectures at costs comparable to AdamW. However, a challenge remains at test time: predictions must be averaged across many forward passes with weights sampled from the posterior, which is prohibitively expensive. Variance propagation offers an efficient alternative, computing layer-wise analytical approximations of uncertainty in a single forward pass. While such techniques are effective for MLPs, their extension to modern architectures remains challenging, due to increased depth and diversity of layer types. To fill this gap, we propose Calibrated Variance Propagation (CVP), which introduces a new propagation method for normalization layers, combines it with recent techniques for handling activation functions, and absorbs residual error through a light calibration step. CVP yields comparably accurate uncertainty estimates to MC sampling across transformers and CNNs, at a fraction of the cost. Against prior variance propagation work, CVP improves coverage at $0.5\%$ risk from $8.2\%$ to $14.6\%$ with BEiT-3 on Visual Reasoning (NLVR2) and from $2.6\%$ to $10.8\%$ with ViLT on VQAv2, with gains extending to convolutional architectures.

04.
arXiv (CS.CL) 2026-06-16

Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving

Large language models (LLMs) have shown promising first-order logic (FOL) reasoning capabilities with applications in various areas. However, their effectiveness in complex mathematical reasoning involving multi-step FOL deductions is still under-researched. While LLMs perform competitively on established mathematical reasoning benchmarks, they struggle with multi-step FOL tasks, as demonstrated by Deepseek-Prover-V2-7B's low accuracy (4.2%) on our proposed theorem proving dataset. This issue arises from the limited exploration of diverse proof strategies and the potential for early reasoning mistakes to undermine entire proofs. To address these issues, we propose DREAM, a self-adaptive solution that enhances the Diversity and REAsonability of LLMs' generation strategies. DREAM incorporates an Axiom-Driven Strategy Diversification mechanism to promote varied strategic outcomes and a Sub-Proposition Error Feedback to help LLMs reflect on and correct their proofs. Our contributions include pioneering advancements in LLMs' mathematical reasoning through FOL theorem proving, introducing a novel inference stage solution that improves performance by 0.6% to 6.4%, and providing a curated dataset of 447 mathematical theorems in Lean 4 format for evaluation.

05.
arXiv (CS.CL) 2026-06-11

Measuring Semantic Progress in Multi-turn Dialogue via Information Gain

Evaluating multi-turn dialogue is challenging because quality emerges across turns rather than within individual responses. We focus on a key dimension of information-seeking dialogue: semantic progress, defined as the accumulation of new, question-relevant, and non-redundant information over the course of a conversation. We formalize semantic progress as question-conditioned uncertainty reduction and introduce an information-theoretic metric that approximates it in embedding space. Our main estimator uses a tractable Gaussian formulation with closed-form updates, while a complementary maximum-entropy argument shows why log-determinant structure arises more broadly when only second-order embedding information is retained. This formulation yields desirable theoretical properties, including monotonicity, additive decomposition of total information gain across turns, and diminishing returns for redundant evidence. Unlike LLM-as-a-judge approaches, our metric requires no autoregressive inference at evaluation time and is fully reproducible for a fixed embedding model. Experiments on MT-Bench, Chatbot Arena, and UltraFeedback show that the proposed metric achieves competitive agreement with human judgments despite targeting only semantic progress, with improved alignment on MT-Bench and UltraFeedback compared to several LLM-based judges. Notably, the method remains effective with lightweight embedding models under CPU-only execution, indicating that semantic progress can be captured without reliance on large model capacity.

06.
medRxiv (Medicine) 2026-06-15

Instrumental Activities of Daily Living in Older Adults with Epilepsy: A Cross-Sectional and Longitudinal Multicenter Study

Objective: Instrumental activities of daily living (IADLs) represent a critical but understudied measure of day-to-day function in persons with epilepsy(PWE). In the multicenter Brain Aging and Cognition in Epilepsy (BrACE) study of PWE aged greater than or equal to 55 years, we examined the proportion, clinical correlates, epilepsy-related predictors, and longitudinal trajectory of IADL impairment. Methods: IADLs were assessed using the Functional Activities Questionnaire (FAQ; range=0 to 30; higher=more impaired); a FAQ greater than or equal to 2 defines MCI-level impairment, and a FAQ greater than or equal to 5 defines dementia-level functional impairment. Multivariable logistic regression identified predictors of baseline function. Global cognition (Montreal Cognitive Assessment [MoCA]), individual cognitive measures, and quality of life (QOL) were compared between the impaired and unimpaired groups. Linear regression evaluated predictors of longitudinal functional decline. Results: Of 57 participants (mean age=66.6 years; female=52.6%), 38.6% (n=22) had MCI-level functional impairment and 17.5% (n=10) had dementia-level functional impairment. In univariate analyses, worse FAQ scores were associated with lower education, higher area deprivation index, early-onset epilepsy (EOE less than 60 years), antiseizure medication polytherapy, and epilepsy localization. In multivariable analysis, temporal lobe epilepsy (OR=4.46, 95% CI=1.09, 21.83,p=0.047), EOE(OR=7.14, 95% CI=1.16, 59.97, p=0.046), and lower education(OR=0.70,95% CI=0.49, 0.93, p=0.025) remained independently associated with baseline MCI-level functional-impairment. Lower education (OR=0.55,95% CI=0.29, 0.84, p=0.021) was the only factor associated with dementia-level IADL-impairment. IADL-impaired participants demonstrated lower verbal memory scores (adjusted p=0.041) and MoCA scores (adjusted p

07.
arXiv (CS.AI) 2026-06-24

Data Scale, Not Latency, Shapes Cross-Lingual Encoder Transfer in Streaming ASR

Authors:

arXiv:2606.24169v1 Announce Type: new Abstract: Adapting a streaming speech recognition model to a new language requires choosing between two plausible warm starts: a multilingual (ML) encoder or an English-only (EN) encoder. The common intuition is that the multilingual encoder should help most at low data, but it is unclear how long that advantage persists, whether tight streaming latency amplifies it, and whether it survives deployment quantization. We answer these questions with a controlled sweep of a 0.6 B-parameter cache-aware FastConformer transducer across eight European languages, up to five target-language data scales (100 h to 2500 h), three streaming tiers plus offline decoding, and up to four public test sets. The main result is that multilingual initialization is a data-limited advantage, not a latency-limited one. On FLEURS at 160 ms, the mean EN-ML word error rate (WER) gap falls from +4.21 percentage points (pp) at 100 h to +0.20 pp at 2500 h; a power-law fit summarizes this decay, with each doubling of target-language data roughly halving the remaining advantage. Across the three streaming tiers, the across-language mean EN-ML gap is approximately stable at each scale from 100 to 1000 h, and is near zero by 2500 h. Finally, 4-bit weight-only encoder quantization at the matched 560 ms streaming tier reduces the encoder footprint by about 3x, with an average FLEURS WER increase of about 0.5 pp. The resulting guideline is simple: use multilingual initialization in low-data regimes, treat the choice as effectively irrelevant at large data, and make latency and quantization decisions independently.

08.
arXiv (CS.AI) 2026-06-11

AutoMine Solution for AV2 2026 Scenario Mining Challenge

arXiv:2606.11874v1 Announce Type: new Abstract: With the development of autonomous driving systems, mining high-value, safety-critical, and planning-relevant scenarios from large-scale driving logs has become essential for data-driven evaluation. In this paper, we propose AutoMine, a robust self-refining scenario mining method based on LLMs and VLMs. AutoMine uses semantics-preserving prompt augmentation to reduce LLM prompt sensitivity, combines robust trajectory atomic functions with VLM-based functions to handle perception noise and open-world visual cues, and refines generated code through execution feedback from real logs. In the Argoverse 2 Scenario Mining Competition at CVPR 2026, AutoMine achieves a HOTA-Temporal score of 36.38 and a Timestamp BA score of 77.21.

09.
arXiv (math.PR) 2026-06-25

Three non-Hermitian random matrix universality classes of complex edge statistics: Spacing ratios and distributions

arXiv:2603.28457v2 Announce Type: replace-cross Abstract: The conjectured three generic local bulk statistics amongst all non-Hermitian random matrix symmetry classes have recently been extended to three generic local edge statistics. We study analytically and numerically complex spacing ratios and nearest-neighbour (NN) spacing distributions that characterise such local statistics. We choose the three simplest representatives of these universality classes, given by the Gaussian ensembles of complex Ginibre, complex symmetric and complex self-dual matrices, denoted by class A, AI$^\dag$ and AII$^\dag$. In the first part, we analytically study the complex spacing ratio in class A, at finite matrix size $N$. Introducing a conditional point process, we simplify existing expressions and show why an uncontrolled approximation introduced earlier converges well in the large-$N$ limit in the bulk. When specifying to the elliptic Ginibre ensemble, we present a parameter-dependent $N=3$ surmise for the complex spacing ratio, interpolating to that of the Gaussian unitary ensemble (GUE), where such a surmise is very accurate. In the second numerical part, we compare complex spacing ratios, its moments, and NN spacing distributions for all three ensembles with that of uncorrelated points, the two-dimensional (2D) Poisson process, both in the bulk and at the edge. The varying degree of repulsion within these different edge universality classes can be well understood in terms of an effective 2D Coulomb gas description, at different values of inverse temperature $\beta$. We find indications that the complex spacing ratio does not fully unfold the local statistics at the edge. Finally we verify that for small argument, in all three symmetry classes the NN spacing distributions in the bulk and at the edge are consistent with the universal cubic repulsion.

10.
Nature (Science) 2026-06-10

Whole-genome duplication shaped cell-type evolution in the vertebrate brain

Authors:

The complex brains of vertebrates have more cell types than those of their closest relatives. Whole-genome duplications (WGDs) occurred during early vertebrate evolution1, but it is unclear whether the duplicated genes (ohnologues) facilitated cell-type evolution. Here using brain single-cell transcriptomes from five chordates—human2, mouse3, lizard4, lamprey5 and amphioxus—we report that many cell-type families with conserved core transcription factors in vertebrates do not show one-to-one homology with amphioxus. Moreover, ohnologues, particularly those from the first WGD, were more important than small-scale duplication paralogues for vertebrate cell-type evolution. To explore whether ohnologues are mechanistically important for this process, we predicted ancestral cell-type states and compared them to amphioxus and experimentally investigated macroglia. The findings indicate that ohnologues had a role in early vertebrate cell-type diversification. Moreover, by examining paralogue expression across cell types and species, we show that expression changes were mainly driven by dosage selection and subfunctionalization. We also link ohnologues to cellular diversity at different anatomical and cell-type scales. Our findings demonstrate the importance of WGDs for the evolution of early vertebrate brain complexity and highlight that the resultant ohnologues continued to capacitate cell-type evolution long after they were formed. Analyses of brain single-cell transcriptomes from human, mouse, lizard, lamprey and amphioxus reveal that duplicated genes (ohnologues) played a pivotal part in early vertebrate cell-type diversification.

11.
arXiv (CS.LG) 2026-06-16

Q-Learning with Fine-Grained Gap-Dependent Regret

arXiv:2510.06647v2 Announce Type: replace-cross Abstract: We study fine-grained gap-dependent regret bounds for model-free reinforcement learning in episodic tabular Markov Decision Processes. Existing model-free algorithms achieve minimax worst-case regret, but their gap-dependent bounds remain coarse and fail to fully capture the structure of suboptimality gaps. We address this limitation by establishing fine-grained gap-dependent regret bounds for both UCB-based and non-UCB-based algorithms. In the UCB-based setting, we develop a novel analytical framework that explicitly separates the analysis of optimal and suboptimal state-action pairs, yielding the first fine-grained regret upper bound for UCB-Hoeffding (Jin et al., 2018). To highlight the generality of this framework, we introduce ULCB-Hoeffding, a new UCB-based algorithm inspired by AMB (Xu et al.,2021) but with a simplified structure, which enjoys fine-grained regret guarantees and empirically outperforms AMB. In the non-UCB-based setting, we revisit the only known algorithm AMB, and identify two key issues in its algorithm design and analysis: improper truncation in the $Q$-updates and violation of the martingale difference condition in its concentration argument. We propose a refined version of AMB that addresses these issues, establishing the first rigorous fine-grained gap-dependent regret for a non-UCB-based method, with experiments demonstrating improved performance over AMB.

12.
arXiv (CS.CV) 2026-06-11

RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization

Visual manipulation localization (VML) aims to identify tampered regions in images and videos, a task that has become increasingly challenging with the rise of advanced editing tools. Existing methods face two central issues. The first is resolution diversity. Resizing or padding can distort subtle forensic cues and introduce unnecessary computational cost. The second is the difficulty of extending spatial models for images to spatio-temporal inputs in videos, which often results in maintaining separate architectures for the two data types. To address these challenges, we propose RelayFormer, a unified framework that adapts to varying resolutions and naturally handles both static and temporal visual data. RelayFormer partitions inputs into fixed-size sub-images and introduces Global Local Relay (GLR) tokens that propagate structured context through a relay-based attention mechanism. This design enables efficient exchange of global cues, such as semantic or temporal consistency, while preserving fine-grained manipulation artifacts. Unlike prior approaches that depend on uniform resizing or sparse attention, RelayFormer scales to variable resolutions and video sequences with minimal overhead. Experiments across diverse benchmarks demonstrate superior performance and strong efficiency, combining resolution adaptivity without interpolation or excessive padding, unified processing for images and videos, and a favorable balance between accuracy and computational cost. Code is available at~\href{https://github.com/WenOOI/RelayFormer}{https://github.com/WenOOI/RelayFormer}.

13.
arXiv (math.PR) 2026-06-16

A uniform-in-time weakly convergent explicit numerical method for the underdamped Langevin equation with polynomial potentials

Authors:

arXiv:2606.15175v1 Announce Type: cross Abstract: The underdamped Langevin equation is a fundamental model in statistical mechanics for sampling Gibbs measures and simulating molecular dynamics, for which numerical methods with uniform-in-time weak convergence are essential for accurately reproducing long-time statistical observables and invariant measures of the underlying dynamics. Currently, such uniform-in-time weak convergence is established for implicit schemes, but remains unknown for explicit ones under polynomially growing potentials. To improve efficiency in long-time simulations, we propose the first explicit numerical method for the underdamped Langevin equation with polynomially growing potentials that is proven to achieve uniform-in-time weak convergence. The explicit numerical method is constructed by introducing a dissipativity on the scalar auxiliary variable (SAV), which we call the DSAV method. The proposed DSAV method enables the approximation of the invariant measure for the underdamped Langevin equation with a precision of $\varepsilon$ at a significantly reduced computational cost of $\mathcal{O}(\varepsilon^{-1} \log(\varepsilon^{-1}))$. In addition, we establish the existence and positivity of the density function of the numerical solution without using the Malliavin calculus. Numerical experiments are performed to verify the theoretical findings and demonstrate the long-time stability of the proposed numerical method.

14.
arXiv (CS.LG) 2026-06-16

Tight Bounds for Logistic Regression with Large Stepsize Gradient Descent in Low Dimension

arXiv:2602.12471v2 Announce Type: replace Abstract: We consider the optimization problem of minimizing the logistic loss with gradient descent to train a linear model for binary classification with separable data. With a budget of $T$ iterations, it was recently shown that an accelerated $1/T^2$ rate is possible by choosing a large stepsize $\eta = \Theta(\gamma^2 T)$ (where $\gamma$ is the dataset's margin) despite the resulting non-monotonicity of the loss. In this paper, we provide a tighter analysis of gradient descent for this problem when the data is two-dimensional: we show that GD with a sufficiently large learning rate $\eta$ finds a point with loss smaller than $\mathcal{O}(1/(\eta \gamma^2 T))$, as long as $T \geq \Omega(n/\gamma + 1/\gamma^2)$, where $n$ is the dataset size. Our improved rate comes from a tighter bound on the time $\tau$ that it takes for GD to transition from unstable (non-monotonic loss) to stable (monotonic loss), via a fine-grained analysis of the oscillatory dynamics of GD in the subspace orthogonal to the max-margin classifier. We also provide a lower bound of $\tau$ matching our upper bound up to logarithmic factors, showing that our analysis is tight.

15.
arXiv (quant-ph) 2026-06-11

Global vs. Local Discrimination of Locally Implementable Multipartite Unitaries

arXiv:2509.10430v2 Announce Type: replace Abstract: We study single-shot distinguishability of locally implementable multipartite unitaries under Local Operations and Classical Communication (LOCC) and global operations. As unitary discrimination depends on both the choice of probing states and the measurements on the evolved states, we classify LOCC and global distinguishability into two categories: adaptive strategies, where probing states are chosen based on measurement outcomes from other subsystems, and restricted strategies, where probing states remain fixed. Our findings uncover three surprising features in the bipartite setting and establish new structural limits for unitary discrimination: (i) Certain pairs of unitaries are globally distinguishable with restricted strategies but indistinguishable under LOCC, even with adaptive strategies. (ii) There exist sets of four unitaries that are distinguishable via LOCC, yet remain globally indistinguishable with restricted strategies. (iii) Some sets of unitaries are globally indistinguishable under adaptive strategies, when probed with separable states, but become distinguishable via LOCC.

16.
Nature Biotechnology 2026-06-22

Affordable centimeter-scale 3D microscopy with submicrometer resolution

Authors: Unknown Author

Submicrometer-resolution three-dimensional (3D) imaging of large samples has been constrained by the short working distance, high cost and inflexible design of immersion objectives. We developed hybrid solid–liquid optics (HySIL) — a refractive framework with index-matched components — for submicrometer-resolution 3D imaging of centimeter-scale samples in various immersion media using inexpensive air objectives.

17.
arXiv (CS.AI) 2026-06-19

RACL: Reasoning-Agent Control Layers for Continuous Metaheuristic Learning

arXiv:2606.20142v1 Announce Type: new Abstract: This paper introduces RACL, a Reasoning-Agent Control Layer for metaheuristics. RACL places a reasoning agent above an existing optimizer. The agent does not replace the optimizer and does not modify business constraints. Instead, it controls the optimizer's internal search behavior by observing operational memory, reasoning over past behavior, formulating bounded hypotheses, testing interventions, evaluating outcomes, applying guardrails, consolidating useful policies and explaining its decisions. The experiment uses vehicle routing as a testbed, but the contribution is not a new routing solver, a particular ALNS configuration or a specific set of routing rules. The contribution is the RACL method: a way for a reasoning agent to discover, validate, consolidate and explain algorithmic control rules for a metaheuristic. In the current experimental setting, RACL improves or ties the Operational Memory Policy in 21 of 21 feasible cases and improves or ties a non-reasoning Stagnation-Triggered Policy in 18 of 21 feasible cases, with an average RACL vs STP cost delta of -0.641%. In the Sevilla-9/10 runtime sample, RACL improves average cost by -8.337% versus Fixed and -1.605% versus STP without showing material computational overhead. During the proof-of-concept, Codex was used as an in-the-loop reasoning agent observing executions, interpreting logs and proposing live bounded interventions. The policy proxy was later used only to make quantitative evaluation reproducible.

18.
arXiv (CS.LG) 2026-06-17

Fast Nonparametric Conditional Independence Testing via Two-Stage Regression

arXiv:2606.18011v1 Announce Type: cross Abstract: Constraint-based causal discovery relies on repeated conditional independence tests, but fast nonparametric tests often sacrifice calibration, especially when variables depend on the conditioning set through nonlinear relationships. We introduce BLITZ (Broad-to-Local Independence Testing via residualiZation), a nonparametric conditional independence test designed to run well under a second while maintaining the accuracy needed for the thousands of queries performed by constraint-based causal discovery algorithms. BLITZ first removes broad smooth dependence on the conditioning set using low-order polynomial regression, then applies a small nonlinear feature map and residualizes those features with shallow tree regressions. The resulting statistic tests residual cross-covariance, with a moment-matched chi-square approximation to the null distribution. We show theoretically that the two-stage design reduces the effective complexity faced by the tree residualizers, allowing shallow trees to control residual conditional-mean bias while avoiding excessive overfitting. In simulations, BLITZ provides better null calibration than fast kernel, random-feature, and regression-based competitors while remaining among the fastest methods tested. In causal discovery experiments on synthetic graphs and flow-cytometry data, BLITZ yields more reliable endpoint orientations among retained adjacencies and competitive structural recovery. These results suggest that broad-to-local residualization is a practical route to calibrated, scalable nonparametric conditional independence testing for causal discovery.

20.
arXiv (CS.AI) 2026-06-17

PreAct: Computer-Using Agents that Get Faster on Repeated Tasks

Authors:

arXiv:2606.17929v1 Announce Type: new Abstract: Computer-using agents drive real software through the screen – clicking and typing – but they solve every task from scratch: asked to repeat a task, an agent re-reads the screen, re-reasons every tap, and pays the full cost again. We present PreAct, which lets such an agent get faster on tasks it has done before. The first time it succeeds, PreAct compiles the run into a small state-machine program-states that check the screen, transitions that act-and on later runs replays it directly instead of invoking the agent 8.5-13x faster, with no per-step language-model calls. Replay is not blind: at each step PreAct checks that the screen matches what the program expects before acting, and hands control back to the agent the moment something is off. PreAct applies the same discipline when deciding what to keep: a freshly compiled program enters the store only if, re-run from a clean state, an independent evaluator confirms it solved the task-catching programs that replay to their last step yet leave the task undone. Across a mobile, a desktop, and a web benchmark, this store-time check separates repeated runs that improve from ones that degrade as faulty programs accumulate, worth 1.75-2.6 tasks per benchmark, the same direction on all three; a fallback that explores afresh when no program fits brings PreAct level with a strong record-and-replay baseline. We also report what did not matter: prompt wording, runtime guardrails, and whether a language model or a plain embedding retriever selects which program to reuse.

21.
arXiv (CS.AI) 2026-06-16

Infant Spontaneous Movement Noise Improves Exploration in Deep RL

arXiv:2606.16590v1 Announce Type: cross Abstract: Exploration in deep reinforcement learning (RL) is commonly implemented as temporally uncorrelated white noise. However, recent works show that temporally correlated colored noise can improve exploration efficiency by producing smooth trajectories with better coverage of the state space. We inquire whether action noise inspired by infant spontaneous movements can also improve exploration in deep RL. We find that the power spectral densities of babies' end-effector velocities follow a colored noise process where the spectral exponent increases with age. Inspired by this developmental pattern, we introduce a mechanism that progressively increases the temporal auto-correlation of exploration noise during RL training, matching the infant statistics. Experiments across several RL environments show that infant-inspired noise produces structured exploratory behavior and can improve learning efficiency compared to conventional exploration strategies. These findings suggest that human motor and cognitive development can provide useful guidance for designing learning mechanisms in artificial agents. Our code is available at https://github.com/trieschlab/baby-noise-rl.

22.
arXiv (CS.CV) 2026-06-25

What Does the Brain See? Multiview Neural Representations to Demystify the Brain-Visual Alignment

Zero-shot visual decoding from electroencephalography (EEG) aims to infer visual semantics from non-invasive neural recordings, but remains challenging due to the low signal-to-noise ratio, non-stationarity, and limited spatial resolution of EEG. Existing EEG-vision alignment methods often rely on holistic EEG embeddings, which can obscure the complementary temporal, spectral, and spatial structure underlying visual perception. We introduce a unified multiview EEG representation learning framework for aligning brain responses with visual semantic embeddings. Our method builds an EEG encoder that jointly models three complementary views: input-conditioned state-space temporal dynamics, learnable wavelet-based spectral decomposition for sample-adaptive frequency modeling, and attention-modulated graph learning for structured electrode interactions. The resulting multiview EEG embeddings are fused and aligned with pretrained visual representations in a shared semantic space using contrastive learning with EEG-specific regularization, enabling 200-way zero-shot visual classification. Experiments on THINGS-EEG benchmark show that our method achieves state-of-the-art performance, with 54.8% Top-1 and 85.6% Top-5 accuracy in the within-subject setting and 15.3% Top-1 and 45.4% Top-5 accuracy in the cross-subject setting. We further present the first systematic cross-session EEG-image decoding evaluation, achieving 40.8% Top-1 and 78.0% Top-5 accuracy. These results suggest that explicitly modeling multiview neural structure improves both semantic alignment and generalization in EEG-based visual decoding.

23.
medRxiv (Medicine) 2026-06-17

Low-Density Lipoprotein Cholesterol and Dementia Risk: Integrating Mendelian Randomization and Target Trial Emulation Within the Heart-Brain Axis

Background: The heart-brain axis links cardiovascular and neurodegenerative disease through shared vascular and inflammatory mechanisms. Although low-density lipoprotein cholesterol (LDL-C) is an established causal factor in atherosclerotic cardiovascular disease (ASCVD), its relationship with dementia remains uncertain, with midlife elevations associated with increased risk but late-life associations often appearing null or inverse. To address this cholesterol paradox, we integrated mendelian randomization (MR) with an active-comparator new-user target trial emulation. Methods: We applied a triangulated causal inference framework integrating two-sample MR with observational target trial emulation. Genetic variants associated with LDL-C were used as instrumental variables to evaluate Alzheimer disease (AD), dementia with Lewy bodies (DLB), frontotemporal dementia (FTD), and any dementia (AnyDem), with causal estimates derived using inverse-variance weighted models and sensitivity analyses for heterogeneity and pleiotropy. In parallel, an active-comparator new-user design compared statin versus ezetimibe initiation among adults aged 60 years or older using propensity score (PS) overlap weighting and Cox proportional hazards models to evaluate cardiovascular and dementia outcomes. Results: Genetically predicted LDL-C was associated with increased risk of DLB (OR 1.65, 95% CI 1.30-2.10; p

24.
arXiv (CS.LG) 2026-06-16

LoComposition: Terrain-Adaptive Energy-Efficient Quadruped Locomotion without Gait Priors

arXiv:2606.15896v1 Announce Type: cross Abstract: Learning-based quadrupedal locomotion typically relies on complex reward formulations that entangle task specification, operational limits, gait preference, and terrain adaptation within a single optimization objective. We instead treat these functions through distinct mechanisms: rewards for task specification, constraints for operational limits, energy minimization for gait preference, and exteroceptive perception for adapting energy use to terrain difficulty. We show that these components jointly enable efficient, terrain-adaptive locomotion, and that removing each component exposes a distinct failure mode. Our formulation removes explicit gait priors (including air-time, contact-count, and foot-clearance targets) in favor of emergent behavior. Compared to a conventional complex-reward baseline, our formulation achieves comparable terrain traversal while reducing cost of transport by 56% and operational-limit violations by 96%. The resulting policies transfer zero-shot to a physical Unitree Go2 using LiDAR-based elevation mapping. Project website with videos: https://tinyurl.com/locomposition.

25.
arXiv (CS.CL) 2026-06-16

Misinformation Propagation in Benign Multi-Agent Systems

Multi-agent systems, in which multiple large language model agents solve problems through turn-based interaction, are increasingly deployed in high-stakes settings such as medical diagnosis, legal analysis, and forensic decision-making. Their reliability can be at risk when single agents reason from incorrect or misleading context, e.g., from tool calls, since errors may propagate through agent interactions. This work studies this risk by injecting intent-based misinformation into benign single-agent and multi-agent systems across reasoning, knowledge, and alignment tasks. We find that misinformation can degrade single-agent performance and persists across multi-agent debate, with agents often retaining answers introduced by misinformed peers. Nevertheless, multi-agent debate reduces the resulting performance degradation compared to single-agent prompting, especially when most agents are not exposed to misinformation. Robustness depends on group composition and decision protocol. Consensus can be more stable than voting under peer pressure, while majorities can often steer misinformed agents back toward correct answers. Our results show that misinformation robustness in multi-agent systems depends on the underlying model and also on how agents exchange information and aggregate decisions.