Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-19

Interpreting Neural Combinatorial Optimization via Evolving Programmatic Bottlenecks

arXiv:2606.19741v1 Announce Type: new Abstract: Neural Combinatorial Optimization (NCO) achieves strong performance, yet its black-box nature remains a key roadblock to deployment and scientific diagnosis. Standard interpretability tools, such as Concept Bottleneck Models (CBMs), are ill-equipped for NCO, whose decisions are dynamic, state-dependent, and lack proper concept vocabulary definition. To close this gap, we introduce Evolving Programmatic Bottlenecks (EPB), to our knowledge, the first framework for interpreting NCO policies by distilling black-box NCO models into human-readable program portfolios. EPB employs an LLM to autonomously evolve a bank of programs, where each program's per-step action distribution serves as the bottleneck. EPB works through an iterative framework: Block I fixes program bank capacity and introduces a hybrid textual-numerical gradient descent scheme that couples numerical gradients for student router updates and textual gradients for LLM-based program revision; Block II dynamically adapts bank capacity via fault-targeted expansion and redundancy pruning. Extensive experiments demonstrate EPB's effectiveness and broad applicability, where the distilled program portfolios largely match original performance. EPB also reveals that NCO behavior shifts across optimization stages and can be approximated as a composition of classic heuristic variants. Our work advances interpretable NCO and establishes EPB as a promising tool for interpreting sequential decision-making models.

03.
arXiv (CS.AI) 2026-06-16

Optimizing Health Coverage in Ethiopia: A Learning-augmented Approach and Persistent Proportionality Under an Online Budget

arXiv:2509.00135v2 Announce Type: replace Abstract: As part of nationwide efforts aligned with the United Nations' Sustainable Development Goal 3 on Universal Health Coverage, Ethiopia's Ministry of Health is strengthening health posts to expand access to essential healthcare services. However, only a fraction of this health system strengthening effort can be implemented each year due to limited budgets and other competing priorities, thus the need for an optimization framework to guide prioritization across the regions of Ethiopia. In this paper, we develop a tool, Health Access Resource Planner (HARP), based on a principled decision-support optimization framework for sequential facility planning that aims to maximize population coverage under budget uncertainty while satisfying region-specific proportionality targets at every time step. We then propose two algorithms: (i) a learning-augmented approach that improves upon expert recommendations at any single-step; and (ii) a greedy algorithm for multi-step planning, both with strong worst-case approximation estimation. In collaboration with the Ethiopian Public Health Institute and Ministry of Health, we demonstrated the empirical efficacy of our method on three regions across various planning scenarios.

05.
arXiv (quant-ph) 2026-06-15

QCI Connect: A Modular Full-Stack Quantum Computing Platform

arXiv:2606.14456v1 Announce Type: new Abstract: In a world of various competing quantum computing architectures, hardware-agnostic, full-stack platforms are necessary to bring the full power of quantum computing hardware to domain experts via the cloud. QCI Connect and its Software Development Kit provide a reference architecture for a full-stack platform with a modular design and open-source interface definitions, built to facilitate a community-driven application ecosystem. Here, we present its overall design and features, central interfaces, and lessons learned, both for users of the platform and as a reference guide for future developments.

06.
arXiv (quant-ph) 2026-06-19

Thermodynamic Value of XOR-Game-Induced Side Information in a Szilard Engine

arXiv:2605.12044v3 Announce Type: replace Abstract: We introduce a Szilard-type thermodynamic valuation of side-information channels induced by Bell-type correlations. In each round, a two-level working system is thermalized with a degenerate Hamiltonian, so that its physical microstate is a uniform classical bit. A trusted referee embeds this bit into a finite two-player XOR game, and a correlation resource produces a compressed controller bit. The controller uses only this compressed bit as side information for feedback. The construction is formulated first for arbitrary finite XOR games. The referee encoding makes the game-winning event equivalent to correct prediction of the physical microstate. Consequently, the induced side-information channel is binary symmetric, with success probability equal to the XOR-game winning probability of the supplied behaviour. The reversible Szilard feedback value is therefore fixed by the mutual information between the microstate and the controller record. Optimizing over local, quantum, and nonsignalling behaviour sets turns the corresponding game values into local, quantum, and nonsignalling thermodynamic ceilings. The construction is an effective-channel valuation, not a claim that Bell nonlocality is thermodynamic fuel. The controller receives only the compressed prediction bit, not the auxiliary variables that define the game. The thermodynamic costs of the referee, the correlation resource, and the preprocessing are not included. When controller-memory reset is included in a full cycle, the net work is non-positive, consistently with the second law.

07.
arXiv (CS.LG) 2026-06-17

A tensor network approach for chaotic time series prediction

arXiv:2505.17740v2 Announce Type: replace Abstract: Making accurate predictions of chaotic time series is a complex challenge. Reservoir computing, a neuromorphic-inspired approach, has emerged as a powerful tool for this task. It exploits the memory and nonlinearity of dynamical systems without requiring extensive parameter tuning. However, selecting and optimizing reservoir architectures remains an open problem. Next-generation reservoir computing simplifies this problem by employing nonlinear vector autoregression based on truncated Volterra series, thereby reducing hyperparameter complexity. Nevertheless, the latter suffers from exponential parameter growth in terms of the maximum monomial degree. Tensor networks offer a promising solution to this issue by decomposing multidimensional arrays into low-dimensional structures, thus mitigating the curse of dimensionality. This paper explores the application of a previously proposed tensor network model for predicting chaotic time series, demonstrating its advantages in terms of accuracy and computational efficiency compared to conventional echo state networks. Using a state-of-the-art tensor network approach enables us to bridge the gap between the tensor network and reservoir computing communities, fostering advances in both fields.

08.
arXiv (CS.LG) 2026-06-16

CacheMuon: Using Temporal Preconditioning To Approximate Polar Factor

arXiv:2606.16371v1 Announce Type: new Abstract: Muon is an optimizer that computes updates using the polar factor of the momentum matrix and has shown strong empirical performance across a range of training settings. A key component of Muon is the Newton-Schulz iteration used to compute this polar factor. Although this avoids the cost of an exact singular value decomposition, it remains expensive in practice because it is applied at every optimization step. At the same time, the momentum matrix changes smoothly over training, suggesting strong temporal correlation in the corresponding polar factors. In this paper, we exploit this structure and propose CacheMuon, a temporal preconditioning method that reuses information from previous optimization steps to approximate the polar factor at the current step. This reduces redundant orthogonalization computation across iterations. We analyze CacheMuon as an inexact Muon update, with error controlled by fresh-solver error and cache staleness. Empirically, CacheMuon provides a controllable quality-efficiency frontier: conservative thresholds closely match fresh Muon on language-model and vision training while reducing orthogonalization FLOPs, whereas more aggressive thresholds yield larger arithmetic savings at the cost of modest validation-quality degradation.

09.
bioRxiv (Bioinfo) 2026-06-19

Morpho-FM: spatial molecular reconstruction from routine H&E histology using transcriptomic foundation-model priors

Routine haematoxylin and eosin (H&E) histology captures tissue architecture at clinical scale, but lacks a direct molecular readout of the transcriptional programmes that organise tumour epithelium, stroma, vasculature and immune compartments. Spatial transcriptomics provides this context, yet cost, workflow complexity and sparse sampling limit routine use. Most existing histology-to-expression models are trained de novo on small paired cohorts and therefore remain weakly constrained when extrapolating from sparse measurements to dense, tissue-wide molecular maps. Here we introduce Morpho-FM, a weakly supervised framework that predicts spatial gene expression from routine H&E whole-slide images by conditioning a pretrained single-cell transcriptomic foundation-model prior on local histological neighbourhoods. A lightweight morphology-to-transcriptome adapter maps cached whole-slide histology features into a transcriptomic decoder, enabling prediction at measured locations, dense full-section reconstruction, and re-aggregation to the original measurement support. Across harmonized prostate cancer benchmarks, Morpho-FM achieved the strongest overall performance among five representative methods, reaching mean per-gene Pearson correlations of 0.286 in rotating single-slide evaluation and 0.298 in multi-slide held-out validation. The framework reproduced this advantage across kidney cancer sections, achieved a mean correlation of 0.210 across 56 directed single-slide evaluations and retained measurable predictive signal after external transfer to clear-cell renal cell carcinoma sections. Controlled ablation analyses identified pretrained transcriptomic initialization as a reproducible source of performance gain exceeding that attributable to changes in the histology feature backbone. Beyond predictive accuracy benchmarks, Morpho-FM recovered ERBB2-enriched tumour compartments, boundary-associated molecular gradients, and annotation-aligned tissue domains across Xenium and HER2ST breast cancer datasets. Together, these results support transcriptomic foundation-model priors as an effective constraint for morphology-conditioned molecular decoding and demonstrate the potential of Morpho-FM to extend spatial transcriptomic insight across routine pathology sections.

10.
arXiv (math.PR) 2026-06-16

Quantitative Oppenheim Conjecture for Random Quadratic Forms and Optimal Variance Bounds in Function Fields

arXiv:2606.16699v1 Announce Type: cross Abstract: We prove a quantitative version of Oppenheim's conjecture in the function field setting. In order to do so, we compute the higher moments of the Siegel transform. In particular, we find an optimal bound on the variance of the number of lattice points in a set. Moreover, we compute the exact variance of the number of lattice points in a ball, which is of independent interest.

11.
Nature (Science) 2026-06-17

Molecular basis of polyadenylated RNA fate determination in the nucleus

作者:

Eukaryotic genomes generate a plethora of polyadenylated (pA+) RNAs1,2, which are packaged into ribonucleoprotein particles (RNPs). To ensure faithful gene expression, functional pA+ RNPs, including protein-coding RNPs, are exported to the cytoplasm, whereas transcripts within non-functional pA+ RNPs are degraded in the nucleus1–4. How cells distinguish these opposing fates remains unknown. The DExD-box ATPase UAP56 (also known as DDX39B) is a central component of functional pA+ RNPs, and promotes their docking to the nuclear pore complex-anchored TREX-25,6, which triggers transcript release from UAP56 to facilitate export7. Here we reveal that the poly(A) tail exosome targeting (PAXT) connection8 binds a TREX-2-like module, which releases pA+ RNAs from UAP56 for decay by the nuclear exosome. The core of this module consists of a LENG8–PCID2–SEM1 trimer, which we show is structurally and biochemically equivalent to the central GANP–PCID2–SEM1 trimer of TREX-2. Mutagenesis and transcriptomic data demonstrate that the nuclear fate of pA+ RNPs is governed by the contending actions of nucleoplasmic PAXT and nuclear pore complex-associated TREX-2, which interpret RNA-bound UAP56 as a signal for RNA decay or export, respectively. As RNA targets of PAXT are generally short and intron-poor, we propose an overall model for pA+ RNP fate determination whereby the distinct sub-nuclear localizations of PAXT and TREX-2 govern the degradation of short non-functional pA+ RNAs while allowing export of their longer and functional counterparts. Biochemical, structural and cell biological analyses reveal that UAP56 (DDX39B) assembles with a TREX-2–like module that redirects non-functional polyadenylated RNAs from export to degradation.

12.
arXiv (CS.AI) 2026-06-16

Attention is Just Another Name for Coupling?: A Fast-Slow ODE Perspective on Hierarchical Pretraining

arXiv:2606.16730v1 Announce Type: cross Abstract: Causal self-attention is a coupling mechanism: each token's hidden state is updated by a learned mixture of preceding tokens at the same timescale. This paper asks whether a second, temporally slower coupling-a slow sub-system operating on a temporally-downsampled view of the sequence and fed back into the fast path through a zero-initialised gate-complements it. The question is framed in the language of singularly perturbed ordinary differential equations (ODEs), where the fast variable $x$ evolves at the token rate, the slow variable $y$ evolves at one update per $P$ tokens, and the timescale ratio $\varepsilon = 1/P$ is enforced structurally by causal block-mean pooling. The paper instantiates the fast-slow ODE formalism as a concrete neural network: a fast path of standard causal attention over $T$ tokens, a slow path of full attention over $T/P$ pooled tokens ($P^2 \times$ cheaper per layer), and a zero-initialised additive gate. In addition, under a linear-generator assumption on the fast dynamics, we prove that the equilibrium manifold $x = \phi(y)$ is exactly the master-equation (ME) stationary distribution $p_{\mathrm{st}}(y)$; in that regime a learned MLP $\phi_\theta(y)$ is a variational approximation of it (the trained block is not a generator, so this identity is the structured limit, not a claim about the network as trained). Empirically, at $500$k tokens the coupling is neutral – the gate stays closed and the coupled and frozen ablations are within run-to-run noise – at a wall-clock cost comparable to a dense baseline. The contribution is the precise, gap-marked mapping itself, not a performance gain.

13.
arXiv (CS.LG) 2026-06-19

EQPO: Equitable Group Relative Policy Optimization for Clinical Reasoning

arXiv:2510.19893v2 Announce Type: replace Abstract: Medical AI systems demonstrated impressive diagnostic performance, yet they routinely show uneven accuracy across demographic groups, disadvantaging underrepresented populations. Although multimodal reasoning foundation models have pushed clinical diagnosis forward, reinforcement learning-based post-training tends to absorb and magnify the biases present in majority-dominated training corpora. We propose Equitable Group Relative Policy Optimization (EQPO), a hierarchical reinforcement learning method that encourages balanced learning across heterogeneous clinical populations by adaptively reweighting samples according to subgroup representation, task difficulty, and data source. As demographic annotations are frequently missing in real-world clinical data, EQPO additionally applies unsupervised clustering to recover latent subpopulations when they are unavailable. On 7 diagnostic benchmarks covering 5 modalities (X-ray, CT, dermoscopy, mammography, ultrasound), EQPO reduces F1 standard deviation by 43.9% and the maximum cross-group F1 gap by 42.7% on QoQ-Med3-8B over vanilla GRPO, and narrows predictive parity gaps by 27.2% on MedGemma-4B over bias-mitigated RL baselines while raising F1 by 12.5% even without any demographic labels. Examining the training trajectory shows that EQPO steadily improves fairness over the course of optimization, in contrast to baseline methods whose fairness degrades as training proceeds, and the discovered implicit groups remain stable and align with masked demographic attributes. We further release EquiMedGemma-4B and EquiQoQ-Med3-8B, equitability-aware clinical VLLMs that attain state-of-the-art accuracy with markedly smaller demographic gaps.

14.
arXiv (CS.CV) 2026-06-18

VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation

Controllable image-to-video (I2V) generation transforms a reference image into a coherent video guided by user-specified control signals. While precise control over camera motion, object motion, and lighting is essential for high-fidelity creation, existing methods often treat these factors independently. This overlooks the physical coupling among viewpoint, geometry, and illumination in dynamic scenes, leading to visual inconsistencies such as mismatched shadows and perspective drift under simultaneous changes. We present VidCRAFT3, a unified and flexible I2V framework that explicitly models cross-factor interactions among geometry, motion, and illumination, enabling both independent and joint control over camera motion, object motion, and lighting direction. Image2Cloud provides explicit 3D geometric priors for accurate camera motion control. ObjMotionNet encodes sparse object trajectories into multi-scale motion features to guide realistic object motion. A Spatial Triple-Attention Transformer integrates lighting direction through lighting cross-attention for consistent relighting. To address the scarcity of jointly annotated data, we construct the VideoLightingDirection (VLD) dataset with accurate per-frame lighting direction annotations, and introduce a three-stage progressive training strategy that enables robust learning without fully joint annotations. Extensive experiments demonstrate that VidCRAFT3 achieves state-of-the-art performance in control precision and visual coherence across diverse scenarios.

15.
arXiv (math.PR) 2026-06-17

Persistence diagrams of random triangular matrices over finite fields

arXiv:2606.17895v1 Announce Type: cross Abstract: Let us consider a random infinite lower triangular matrix, where the entries on and below the diagonal are i.i.d. uniform random elements of a fixed finite field. We investigate the evolution of the span of the first $n$ rows of this matrix as $n$ grows. Many properties of this evolving subspace can be captured with the help of the verbose persistence diagram, which is a standard tool in stochastic topology and topological data analysis. We give an explicit formula for the distribution of the persistence diagram. We prove a law of large numbers for the distribution of lifetimes. We also describe the fluctuations of the persistent Betti numbers.

16.
arXiv (CS.CV) 2026-06-11

DepthMaster: Unified Monocular Depth Estimation for Perspective and Panoramic Images

While monocular depth estimation has achieved significant progress, achieving generalized metric depth estimation for both narrow field-of-view (FoV) perspectives and $360^\circ$ panoramas remains an unsolved challenge. Existing methods are often tailored to specific camera types and struggle to produce accurate metric depth that generalizes across diverse settings. This limitation stems from two key challenges: the inherent geometric discrepancy between perspective and panoramic cameras, and the scarcity of panoramic training data with metric annotations. In this work, we introduce DepthMaster, a unified metric depth estimation framework. Rather than employing specialized networks to learn spherical distortions, we reformulate the problem by decomposing panoramic images into overlapping perspective patches. Crucially, distinct from prior projection-based methods that rely on ad-hoc architectural modifications to handle boundaries, we introduce a novel Correspondence Consistency Loss (CCL) and inject virtual projection cameras as geometric priors, allowing us to seamlessly stitch the patches while avoiding specialized operators and keeping the backbone largely compatible with standard Transformer designs. This strategy also resolves the geometric differences by unifying all inputs into a canonical perspective representation, and effectively circumvents data scarcity by directly unlocking powerful metric priors from vast perspective datasets. Trained on a mixed dataset that contains only one panorama dataset, DepthMaster achieves state-of-the-art zero-shot performance on 13 diverse datasets, outperforming not only universal methods but also leading specialist models in both perspective and panoramic domains.

17.
arXiv (CS.AI) 2026-06-17

SkillJect: Effectively Automating Skill-Based Prompt Injection for Skill-Enabled Agents

arXiv:2602.14211v3 Announce Type: replace-cross Abstract: Agent skills extend LLM agents with task-specific instructions, executable scripts, and auxiliary resources, improving reusability but creating a new supply-chain attack surface. A malicious or compromised skill can be repeatedly loaded as trusted guidance and steer downstream tool use. Existing skill-based prompt-injection attacks are often manual and brittle, because explicit malicious instructions are rejected or ignored when they are not aligned with the original workflow. We propose SkillJect, the first automated framework for generating poisoned skills against skill-enabled agent systems. SkillJect uses two coordinated channels. In the artifact channel, it hides the payload inside an auxiliary helper script. In the instruction channel, it rewrites SKILL.md with a front-loaded inducement strategy, placing injected content at the beginning and framing the helper script as a mandatory prerequisite or initialization step. The rewritten instruction explicitly references the helper-script path and provides an executable example command, making the helper appear to be a legitimate setup step before normal skill operations. SkillJect further adopts a closed-loop multi-agent process to improve attack effectiveness. An Attack Agent generates poisoned skills, a Victim Agent executes downstream tasks with the poisoned skill, and an Evaluate Agent inspects execution traces to determine whether the hidden payload was executed. The Attack Agent then uses this feedback to diagnose failure causes and rewrite SKILL.md, while keeping the payload fixed. Experiments across skill-enabled platforms, backend LLMs, and attack categories show that SkillJect substantially outperforms naive direct injection and prior manual skill-injection attacks, highlighting poisoned skills as a persistent threat in reusable skill ecosystems.

18.
arXiv (CS.AI) 2026-06-12

GeoDial: A Multimodal Conversational Tutoring Dataset for Geometry Problem-Solving with Visual Tutor Turns

arXiv:2606.12419v1 Announce Type: cross Abstract: Several educational domains rely heavily on diagrams and visual cues, yet most existing tutoring datasets are limited to text-only interactions. This limits the development of AI tutors that can teach in visually grounded ways used by human instructors. Thus, we introduce GeoDial, a multimodal tutoring dataset of over 1.3K teacher-student dialogs in the domain of geometry collected from experienced math teachers, where instructional turns are explicitly grounded in diagram highlights. We propose a scalable annotation protocol that integrates dialog acts, visual highlighting, and feedback, enabling fine-grained supervision of both language and visual tutoring behavior. To illustrate the challenges posed by this setting, we fine-tune several vision-language models on GeoDial and evaluate their ability to generate tutoring utterances and diagram highlights. While supervised fine-tuning substantially improves the quality of generated dialog, it struggles to produce accurate diagram highlights, revealing a key limitation of current methods and highlighting the need for approaches that more effectively integrate visual reasoning with pedagogical interaction.

19.
medRxiv (Medicine) 2026-06-17

Cardio Heart Connect: Protocol for a Randomized Trial of a Commercially Available mHealth Fitness Intervention for Cardiac Rehabilitation After Transcatheter Aortic Valve Replacement

Background: Despite ample evidence of the benefits of cardiac rehabilitation (CR), few transcatheter aortic valve replacement (TAVR) patients participate. Commercially available mobile health offers an opportunity to deliver activity-promotion content to populations that are challenged to participate in CR. This study aims to test the efficacy of clinically controlled, commercially available fitness programming for improving physical activity and cardiovascular health outcomes designed to be initiated while patients are on waitlists for traditional CR. Methods: The Cardio Heart Connect study is a hybrid type I effectiveness-implementation trial aiming to enroll N=200 patients who have been placed on a cardiac rehab waitlist following a TAVR procedure from the University of Colorado Hospital Heart and Vascular Center. Participants will be randomized 1:1 to the Cardio Heart Connect intervention with commercially available fitness or attention control, designed to control for technology access. At baseline, post-intervention (8 weeks), and follow-up (12 months), we will assess the primary outcome of participants? daily steps as measured by smartwatch accelerometer and secondary outcomes of interest including functional capacity (Duke Activity Status Index; VO2max), quality of life (Kansas City Cardiomyopathy Questionnaire), and cardiovascular health status (Life Essential 8). In addition, we will use mixed methodologies to evaluate the implementation of intervention using the Reach, Effectiveness, Adoption, Implementation, and Maintenance (RE-AIM) Framework. Conclusions: Commercially available fitness programs have the potential to provide more accessible opportunities for patients recovering from TAVR to engage in physical activity and may be preferred due to their customizability, convenience, and ease of scheduling. Overall, this study will provide insight into the use of commercial mHealth to promote activity following TAVR.

20.
arXiv (CS.CV) 2026-06-17

Principled RL for Flow Matching Emerges from the Chunk-level Policy Optimization

Recent Progress in post-training flow matching for text-to-image (T2I) generation with Group Relative Policy Optimization (GRPO) has demonstrated strong potential. However, it is hindered by a critical limitation: inaccurate advantage attribution. In this work, we argue that aggregating consecutive steps into a coherent 'chunk' and shifting the policy optimization paradigm from GRPO's step level to the chunk level can effectively mitigate the negative impact of this issue. Building on this insight, we propose Group Chunking Policy Optimization (GCPO), the first chunk-level reinforcement learning approach for post-training flow matching. Extensive experiments demonstrate that GCPO achieves superior performance on both standard T2I benchmarks and preference alignment, with up to 43% relative gains over GRPO, highlighting the promise of chunk-level policy optimization. The code is available on https://github.com/xingzhejun/GCPO.

21.
arXiv (CS.LG) 2026-06-16

An RRAM-based Hardware Implementation of a Radial Basis Function Neuron for Edge Classifiers

arXiv:2606.14739v1 Announce Type: cross Abstract: The deployment of modern machine learning (ML) solutions on resource-constrained edge devices highlights implementation challenges. This is especially true for extreme edge applications that include safety-critical components, such as autonomous navigation tasks. This paper demonstrates an artificial neural network (ANN) design leveraging Metal-Oxide Resistive RAM (RRAM) -based Analogue Content Addressable Memory (ACAM) as an efficient hardware substrate for performing metric-based classification and online adaptation on the edge. The proposed design is based on a custom Template piXeL (TXL) cell used for building the ACAM module, where each TXL cell acts as a configurable receptive field neuron. These cells employ a Radial Basis activation function to calculate the distance of an input from the programmed receptive field. The TXL can be organised into dense arrays for calculating the distance of a high-dimensional input against all stored prototypes, effectively performing fast and energy efficient similarity search. This hardware engine enables on-the-fly learning, where the receptive field parameters can be tuned to track domain shift. Through simulation of the proposed TXL-RBF classifier we can achieve 89.1\% accuracy on the MNIST dataset while consuming 185fJ per cell per operation when operating at 100MHz.

22.
arXiv (CS.LG) 2026-06-19

Characterization of Gaussian Universality Breakdown in High-Dimensional Empirical Risk Minimization

arXiv:2604.03146v3 Announce Type: replace-cross Abstract: We study high-dimensional convex empirical risk minimization (ERM) under general non-Gaussian data designs. By heuristically extending the Convex Gaussian Min-Max Theorem (CGMT) to non-Gaussian settings, we derive an asymptotic min-max characterization of key statistics, enabling approximation of the mean $\mu_{\hat{\theta}}$ and covariance $C_{\hat{\theta}}$ of the ERM estimator $\hat{\theta}$. Specifically, under a concentration assumption on the data matrix and standard regularity conditions on the loss and regularizer, we show that for a test covariate $x$ independent of the training data, the projection $\hat{\theta}^\top x$ approximately follows the convolution of the generally non-Gaussian distribution of $\mu_{\hat{\theta}}^\top x$ with an independent centered Gaussian variable of variance $\mathrm{tr}(C_{\hat{\theta}} \mathbb{E}[xx^\top])$. This result clarifies the scope and limits of Gaussian universality for ERMs. Additionally, we prove that any $\mathcal{C}^2$ regularizer is asymptotically equivalent to a quadratic form determined solely by its Hessian at zero and gradient at $\mu_{\hat{\theta}}$. Numerical simulations across diverse losses and models are provided to validate our theoretical predictions and qualitative insights.

23.
arXiv (CS.AI) 2026-06-19

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

arXiv:2606.19980v1 Announce Type: new Abstract: Achieving dexterous robotic manipulation in the real world heavily relies on human supervision and algorithm engineering, which becomes a central bottleneck in the pursuit of general physical intelligence. Although emerging coding agents can generate code to automate algorithm search, their successes remain largely confined in digital environments. We conjecture that the missing abstraction to automate robotics research is a repeatable feedback loop for real-world policy improvement: reset the scene, execute a policy, verify the outcome, and refine the next iteration. To bridge this gap, we introduce ENPIRE, a harness framework for coding agents that instantiates this physical feedback routine with four core modules: an Environment module (EN) for automatic reset and verification, a Policy Improvement module (PI) that launches policy refinement, a Rollout module (R) to evaluate policies with one or multiple physical robots operating in parallel, and an Evolution module (E) in which coding agents analyze logs, consult literature, improve training infrastructure and algorithm code to address failure modes. This closed-loop system transforms real-world manipulation learning into a controllable optimization procedure, minimizing human effort while allowing fair ablations across training recipe and agent variants. Powered by ENPIRE, frontier coding agents can autonomously train a policy to achieve a 99% success rate on challenging, dexterous manipulation tasks, such as organizing a pin box, fastening a zip tie, and tool use, a process that further accelerates when we dispatch an agent team on a robot fleet. Our results suggest a practical and scalable path toward deploying coding agents to autonomously advancing robotics in the physical world.

24.
medRxiv (Medicine) 2026-06-12

Heterogeneity of Treatment Effect of Aspirin and Clinically Significant Bleeding in Older Adults

Aim: The global population of older adults is growing, and older age is linked to higher bleeding risk. Although guidelines discourage aspirin for primary prevention in healthy older adults due to bleeding harms outweighing benefits, many continue taking it without a clear indication. It remains unclear whether all older adults face uniform aspirin-related bleeding risk or if certain subgroups are more vulnerable. Methods: We analyzed data from 19,114 ASPREE trial participants to develop machine learning models using 116 baseline variables. Random forest (RF) and random survival forest (RSF) models predicted 5-year bleeding risk, and participants were stratified into low, intermediate, and high-risk groups based on the 20th and 80th percentiles of predicted risk. We assessed heterogeneity of treatment effect (HTE) by testing treatment-by-risk group interactions on the relative scale using Fine-Gray models, and on the absolute scale using observed 5-year cumulative incidence rates. Results: Over a median follow-up of 4.7 years, 626 major bleeding events occurred. The RF model had moderate discrimination (AUC = 0.65, 95% CI: 0.63-0.67) and good calibration (Brier = 0.032, 95% CI: 0.029-0.034). Statistically significant HTE was observed on the relative scale, with the greatest relative increase in bleeding risk seen in the low-risk group (subdistribution hazard ratio = 2.26, 95% CI: 1.27-4.01). On the absolute scale, low-risk participants experienced higher bleeding with aspirin (absolute risk difference (ARD) = 1.17%, 95% CI: 0.37-1.95), but heterogeneity in ARDs was not statistically significant (Cochran's Q p > 0.45). Similar findings were observed when using the RSF model. Conclusion: Participants at lowest baseline bleeding risk experienced the greatest relative increase in bleeding risk with aspirin therapy. We found statistically significant heterogeneity in treatment effects on the relative but not absolute scale. These findings support an individualized, risk-based approach to aspirin therapy decision-making in older adults.

25.
arXiv (CS.LG) 2026-06-12

A Stabilized Path-Space Approach to Diffusion-Based Posterior Sampling

arXiv:2606.12710v1 Announce Type: new Abstract: Diffusion models provide expressive data-driven priors for Bayesian inverse problems, but many diffusion posterior samplers rely on heuristic guidance approximations that can fail for nonlinear operators and multimodal posteriors. In this work, we develop a stabilized path-space framework for diffusion-based posterior sampling. Starting from a base diffusion process whose terminal marginal represents the prior, we define a likelihood-weighted target measure on trajectories and cast posterior sampling as learning a controlled stochastic process whose path measure matches this target. This formulation connects diffusion posterior sampling to stochastic optimal control while preserving the Bayesian structure needed for uncertainty quantification. We introduce a time reparameterization that makes the path-space control problem well posed by removing the bias induced by the unknown initial value function, without auxiliary training. We then learn the control via a trust-region path-space optimization method with log-variance objectives. The path-space perspective also unifies our learned control approach with existing guidance-based samplers, quantifies the sampling error induced by approximate controls, and yields importance sampling corrections for asymptotically exact posterior expectations. We evaluate the proposed framework on a suite of benchmark inverse problems with analytically characterized or high-quality reference posteriors, enabling principled assessment of sampling accuracy and uncertainty quantification. These experiments provide insight into the behavior of diffusion-based posterior samplers and demonstrate improved accuracy and robustness over leading approaches.