Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
bioRxiv (Bioinfo) 2026-06-15

Multiple Fault Analysis and Drug Therapy on Signaling Pathways Using Dynamic Bayesian Network-based Model

Cell growth is an intricate biological phenomenon that is closely regulated by the interplay between various growth factors and transcription factors. Signaling pathways are the main mediators in this event, which provide the driving force for mitosis or sometimes meiosis. However, when malfunctions occur within the biological network, they can cause uncontrolled cell division, regardless of external stimuli. By employing Dynamic Bayesian Networks (DBNs), these malfunctions can be explicitly simulated, offering insights into their effects on cellular behavior and growth regulation. To a significant extent, the resultant outcomes can be mitigated through the use of reduced drug combinations. This study delves into the intricacies of signaling pathway behavior under the influence of concurrent malfunctions. Initially, we replicate the effects of these dysfunctions within DBNs. Subsequently, drug therapy is applied to alleviate their impact. Our methodology introduces a parameter known as efficiency_score, enabling the identification of optimized drug combinations without prior knowledge of specific dysfunctions. Particularly relevant in the context of realistic cancer conditions, these tailored drug inhibition points demonstrate enhanced efficacy compared to conventional treatments. Leveraging GPU acceleration throughout the modeling process accelerates the analysis of multiple faults within the biological networks, rendering our approach notably faster and more efficient.

02.
arXiv (quant-ph) 2026-06-19

Unleashing Emergent Fermions with Rydberg Atom Simulators

arXiv:2606.19444v1 Announce Type: cross Abstract: Rydberg atom simulators, in both analog and digital modes, have attracted significant recent interest due to their versatile geometric reconfigurability. In this work, leveraging this feature, we propose two complementary approaches, one for each mode, to characterize emergent fermions in critical quantum many-body systems. In the analog mode, we assemble the Rydberg atoms in a "developable" (namely, preserving local couplings) Möbius band geometry to realize antiperiodic boundary conditions, where fermionic states reside. Spectroscopic measurement in this sector then reveals universal energy ratios of the bosonic and fermionic states. In the digital mode, we carry out a fermionic version of Kibble-Zurek ramping with a quantum circuit, directly addressing the fermionic scaling form. Reconfigurability allows an exponential speed-up of this task, with an $O(\log L\log\log L)$ circuit-depth overhead. Our work establishes the Rydberg atom simulator as a uniquely powerful platform to attack the notoriously difficult issue of experimentally probing emergent fermions that are nonlocally defined in a bosonic system.

03.
arXiv (CS.LG) 2026-06-18

A physical adaptive material motor unit neural network: a hygromorph composite material machine

arXiv:2606.18275v1 Announce Type: cross Abstract: Advances in novel materials science enable structures to function as intelligent machines by embedding memory and learning capabilities directly into materials. Our work introduces a physical adaptive material motor unit neural network, leveraging a new generation of controllable actuators composed of wood- and carbon black-based composites, sensitive to temperature and relative humidity. These material actuators are assembled into a motor unit-like structure inspired by the trigger of muscle contraction, forming an intelligent machine capable of dynamic shading control that can be used, for example, in buildings. The machine is governed by a neural network trained on over 350 experimental data points collected under diverse environmental conditions. By establishing a new data-aware backpropagation training scheme, we show that the machine predicts shading responses and learns to predict appropriate behaviour incrementally as the database expands. We also demonstrate the ability of the machine to optimise configurations to achieve similar shading outputs under two distinct conditions.

04.
arXiv (CS.CL) 2026-06-15

Can Post-Training Turn LLMs into Good Medical Coders? An Empirical Study of Generative ICD Coding

Automated International Classification of Diseases (ICD) coding is a core medical-coding task for billing, epidemiology, and clinical decision support. Generative large language models (LLMs) are often reported as weak medical coders, but this finding mainly comes from inference-time settings such as prompting, retrieval, reranking, or tool use, leaving the role of task-specific post-training underexplored. We present a controlled empirical study of post-training for generative ICD coding, comparing discriminative baselines with LLM coders across prompting, supervised fine-tuning, and reinforcement learning under a common protocol and metric set. To our knowledge, this is the first study to evaluate RL-based post-training for generative LLM coders in ICD coding. We further introduce PHI, a diagnostic curriculum that extends GRPO to refine missed-code cases. Our results show that prompting-only evaluation substantially underestimates the potential of LLMs for ICD coding. SFT provides the main capability jump, GRPO further improves code-set prediction beyond SFT, and PHI provides targeted gains on macro-level performance. These findings suggest that the main bottleneck is not the generative formulation alone, but how the model is adapted and optimized for full-taxonomy recall. We release our code, data splits, and checkpoints at https://github.com/AlexandreWANG915/LLM4ICD.

05.
arXiv (CS.CL) 2026-06-24

On the Stability of Prompt Ranking in Large Language Model Evaluation

Prompt-based interaction has become a dominant paradigm for using large language models (LLMs), where multiple candidate prompts are evaluated and the top-ranked one is selected for downstream use. This workflow implicitly assumes that prompt rankings are stable under minor variations in evaluation conditions. In this paper, we systematically study prompt ranking stability under common sources of variability, including random seeds and limited evaluation subsets. Across three open-weight LLMs and two benchmark tasks, we find that while overall rank correlations are often moderate to high, the identity of the top-performing prompt frequently changes, leading to unreliable selection decisions. To address this issue, we propose a simple stability-aware selection strategy based on a lower confidence bound, which accounts for both performance and variance. Our results show that this approach improves robustness in unstable settings while remaining competitive in more stable regimes. These findings highlight the importance of accounting for evaluation uncertainty in prompt selection and LLM benchmarking.
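
A lower-confidence-bound selection rule of the kind described above can be sketched as follows. This is a minimal illustration and not the paper's implementation: the abstract does not give the exact bound, so the form used here (mean minus `z` standard errors), the parameter `z`, the function names, and the scores are all assumptions made for the example.

```python
import math
import statistics


def lower_confidence_bound(scores, z=1.0):
    """Mean score minus z standard errors (z is a hypothetical tuning knob)."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variance")
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - z * sem


def select_prompt(results, z=1.0):
    """Pick the prompt whose lower confidence bound is highest."""
    return max(results, key=lambda name: lower_confidence_bound(results[name], z))


# Made-up accuracy scores from four evaluation runs (e.g. different seeds).
results = {
    "prompt_a": [0.90, 0.60, 0.92, 0.58],  # higher mean, unstable across runs
    "prompt_b": [0.73, 0.74, 0.72, 0.73],  # slightly lower mean, stable
}

best_by_mean = max(results, key=lambda name: statistics.fmean(results[name]))
best_by_lcb = select_prompt(results, z=1.0)
print(best_by_mean, best_by_lcb)
```

In this toy data the mean alone prefers the unstable prompt, while the bound prefers the stable one, because the penalty grows with run-to-run variance.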

06.
arXiv (CS.LG) 2026-06-17

Physics-Constrained Neural Networks for Improved Short-Term Weather Forecasting: A Case Study over the South Pacific

arXiv:2606.17659v1 Announce Type: new Abstract: This study introduces enhancements to physics-constrained neural networks (PCNNs) that improve the accuracy and stability of hybrid short-term weather forecasting models. Building on the WeatherGFT architecture, three innovations are proposed. First, an upgraded numerical solver, combining a fifth-order weighted essentially non-oscillatory scheme (WENO-5), a beta-plane approximation, and subgrid-scale viscosity, permits a fourfold increase in the integration time step to 1200 s while reducing the daily mean squared error by up to 26%. Second, a unified autoregressive hybrid block replaces the original chain of 24 specialised modules, eliminating overfitting to specific lead times. Third, the physical core is integrated with two state-of-the-art neural backbones, resulting in PI-PredFormer and PI-IAM4VP. Evaluation on the WeatherBench South Pacific subset from 2000 to 2004 shows that these hybrids reduce root mean squared error at 1-12 h lead times by 8-22% compared to purely neural counterparts, while better preserving physical consistency. These results demonstrate that incremental refinement of hybrid components offers a practical route toward more accurate and efficient short-range weather forecasting.

07.
arXiv (CS.LG) 2026-06-16

IBAD: Interpretable Behavioral Anomaly Detection on Human Mobility Data

arXiv:2606.16023v1 Announce Type: new Abstract: Human mobility appears highly diverse, yet much of a person's daily mobility can be explained by a small set of recurring behavioral templates, such as commuting, school-centered activities, caregiving, nightlife, or errand patterns. We present IBAD (Interpretable Behavioral Anomaly Detection), a framework that learns interpretable daily mobility templates and represents each individual as a distribution over mixtures of these templates. Rather than focusing on specific locations, IBAD characterizes activities that individuals perform across locations. This approach first discovers global behavioral templates using Latent Dirichlet Allocation (LDA), then employs a hierarchical self-supervised model to learn normal behavior of individuals from their soft behavioral templates. We also introduce a splicing benchmark that creates controlled behavioral mismatches between an individual's historical profile and injected mobility patterns. Experiments on real-world and synthetic datasets show that daily behavior can be effectively decomposed into a small number of interpretable templates. Crucially, we show that the learned behavioral archetypes transfer across distinct geographic and demographic contexts. Furthermore, IBAD maintains robust, competitive performance across all settings. For reproducibility purposes, the code is accessible at https://github.com/USC-InfoLab/IBAD.

08.
arXiv (CS.CV) 2026-06-11

ActionMap: Robot Policy Learning via Voxel Action Heatmap

Vision-language-action (VLA) models have advanced rapidly across backbones, training recipes, and data scale, yet the action decoder, which converts the backbone's hidden state into a continuous control signal, has barely changed and remains a single-point predictor across the majority of current VLAs. Whether implemented via autoregressive token bins, L1 regression, or flow-matching denoising, the resulting decoder treats the action space as unstructured, leaving the geometric proximity of neighboring actions unexploited during training. To advance this, we introduce ActionMap, a voxel heatmap action head that drops into an existing VLA in place of its native action decoder. For each new action, the head predicts a voxel heatmap over the action space, where each voxel directly stores the probability of the corresponding action. Across LIBERO simulation and real-world Franka manipulation, our heatmap head surpasses two architecturally distinct backbones at matched training steps (e.g., +8.2% over OpenVLA-OFT's L1 regression head on the LIBERO four-suite average), converges at comparable or faster rates on both backbones, and remains markedly more data-efficient at low training data. The cross-backbone consistency indicates that action representation is a real lever for VLA performance, distinct from further backbone or recipe scaling. Project Page: https://showlab.github.io/ActionMap/.

09.
arXiv (CS.CL) 2026-06-18

Towards Scalable Customization and Deployment of Multi-Agent Systems for Enterprise Applications

Large language model (LLM)-based multi-agent systems demonstrate strong performance on complex reasoning and task execution, enabling broad enterprise applications. However, production deployment remains challenging due to domain-specific customization requirements and high latency and inference costs in agentic workflows. We propose a unified framework for customization and efficient deployment of multi-agent systems in real-world settings. The first stage, Agentic Model Customization, combines continual pretraining, supervised fine-tuning, and preference optimization to adapt a compact model to specialized domains while retaining strong agentic capabilities. The second stage, Inference Optimization, integrates speculative decoding and FP8 quantization with targeted calibration to enable cost-efficient serving with minimal quality loss. Across enterprise workloads, our framework enables rapid domain adaptation and achieves a 4.48x speedup in throughput while maintaining performance and improving robustness on long-tail scenarios.

10.
arXiv (quant-ph) 2026-06-24

Spectator-transition crosstalk in a spin-3/2 silicon vacancy qudit in silicon carbide revealed by broadband Ramsey interferometry

arXiv:2601.15559v3 Announce Type: replace Abstract: Color center spins in 4H-SiC offer a rare combination of wafer-scale materials maturity with long spin coherence and chip-level photonics, making them promising building blocks for scalable quantum technologies. In particular, the silicon vacancy hosts an S=3/2 ground state, a native qudit that enables compact encodings and subspace-selective control, but also introduces spectator transitions: short, detuned pulses can coherently drive non-addressed level pairs and create crosstalk. Here we use broadband Ramsey interferometry to reveal and quantify such spectator-transition crosstalk. Experimentally, the Ramsey Fourier spectra display multiple lines beyond the addressed single-quantum transition. Analytically, we map each line to a pairwise energy difference between qudit levels of the rotating-frame Hamiltonian and assign its weight via compact amplitudes set by the prepared state and the microwave pulse parameters, predicting a deterministic six-branch structure. Numerical time-domain propagation with the experimental sampling reproduces the detuning map, and the measured peak positions coincide with the analytic branch lines without frequency fitting. Together these results provide a practical, spectator-aware framework for multilevel control in the silicon vacancy qudit. The approach offers clear guidance to suppress crosstalk or, conversely, to exploit spectator lines, for example as additional constraints for in situ pulse calibration and for phase-sensitive quantum state and process estimation.

11.
arXiv (CS.LG) 2026-06-18

Self-attention-based non-linear basis transformations for compact latent space modelling of dynamic optical fibre transmission matrices

arXiv:2406.07775v2 Announce Type: replace Abstract: Multimode optical fibres are hair-thin strands of glass that efficiently transport light. They promise next-generation medical endoscopes that provide unprecedented sub-cellular image resolution deep inside the body. However, confining light to such fibres means that images are inherently scrambled in transit. Conventionally, this scrambling has been compensated by pre-calibrating how a specific fibre scrambles light and solving a stationary linear matrix equation that represents a physical model of the fibre. However, as the technology develops towards real-world deployment, the unscrambling process must account for dynamic changes in the matrix representing the fibre's effect on light, due to factors such as movement and temperature shifts, and non-linearities resulting from the inaccessibility of the fibre tip when inside the body. Such complex, dynamic and nonlinear behaviour is well-suited to approximation by neural networks, but most leading image reconstruction networks rely on convolutional layers, which assume strong correlations between adjacent pixels, a strong inductive bias that is inappropriate for fibre matrices which may be expressed in a range of arbitrary coordinate representations with long-range correlations. We introduce a new concept that uses self-attention layers to dynamically transform the coordinate representations of varying fibre matrices to a basis that admits compact, low-dimensional representations suitable for further processing. We demonstrate the effectiveness of this approach on diverse fibre matrix datasets. We show our models significantly improve the sparsity of fibre bases in their transformed bases with a participation ratio, p, as a measure of sparsity, of between 0.01 and 0.11. Further, we show that these transformed representations admit reconstruction of the original matrices with < 10% reconstruction error, demonstrating the invertibility.
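
The abstract reports sparsity through a participation ratio p without defining it. One common normalised definition is sketched below as an assumption; the paper may use a different normalisation. Under this definition p is 1/N when all energy sits in a single entry and 1 when energy is spread uniformly, so small values indicate a sparse representation.

```python
def participation_ratio(values):
    """Normalised participation ratio: (sum |x|^2)^2 / (N * sum |x|^4).

    Assumed definition for illustration only. Accepts real or complex
    entries, e.g. the flattened elements of a transmission matrix.
    """
    n = len(values)
    power = [abs(v) ** 2 for v in values]
    total = sum(power)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero entry")
    return total ** 2 / (n * sum(p ** 2 for p in power))


# Toy vectors, not data from the paper.
one_hot = [0, 0, 3, 0]   # all energy in one entry
uniform = [1, 1, 1, 1]   # energy spread evenly
print(participation_ratio(one_hot), participation_ratio(uniform))
```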

12.
arXiv (CS.CV) 2026-06-19

Benchmarking Vision Foundation Models for Domain-Generalizable Face Anti-Spoofing

Face Anti-Spoofing (FAS) remains challenging due to the requirement for robust domain generalization across unseen environments. While recent trends leverage Vision-Language Models (VLMs) for semantic supervision, these multimodal approaches often demand prohibitive computational resources and exhibit high inference latency. Furthermore, their efficacy is inherently limited by the quality of the underlying visual features. This paper revisits the potential of vision-only foundation models to establish a highly efficient and robust baseline for FAS. We conduct a systematic benchmarking of 15 pre-trained models, such as supervised CNNs, supervised ViTs, and self-supervised ViTs, under severe cross-domain scenarios including the MICO and Limited Source Domains (LSD) protocols. Our comprehensive analysis reveals that self-supervised vision models, particularly DINOv2 with Registers, significantly suppress attention artifacts and capture critical, fine-grained spoofing cues. Combined with Face Anti-Spoofing Data Augmentation (FAS-Aug), Patch-wise Data Augmentation (PDA) and Attention-weighted Patch Loss (APL), our proposed vision-only baseline achieves state-of-the-art performance in the MICO protocol. This baseline outperforms existing methods under the data-constrained LSD protocol while maintaining superior computational efficiency. This work provides a definitive vision-only baseline for FAS, demonstrating that optimized self-supervised vision transformers can serve as a backbone for both vision-only and future multimodal FAS systems. The project page is available at: https://gsisaoki.github.io/FAS-VFMbenchmark-CVPRW2026/ .

13.
arXiv (CS.AI) 2026-06-16

How Much Do Reviews Really Contribute? A Study on Text-Enriched Matrix Factorization for Recommendations

arXiv:2606.16973v1 Announce Type: cross Abstract: Incorporating textual reviews into a Recommender System has become a prominent strategy for enriching collaborative signals with semantic information. However, the actual contribution of review-derived representations remains an open question, particularly when strong collaborative baselines are employed. In this work, we systematically investigate the impact of textual information on Matrix Factorization by introducing and comparing three enrichment strategies over a common collaborative backbone. First, we propose a learnable gating mechanism that adaptively balances collaborative and textual signals during training. This mechanism is applied to two distinct review representations: (i) aggregated topic profiles extracted from user and item histories, and (ii) full text embedding representations derived from reviews. Additionally, we explore a cross-attention mechanism that identifies and emphasizes the most informative dimensions of the textual representation before fusion with collaborative factors. We evaluate six variants: pure, enriched with topic profiles and text via gating; enriched with topics and text via gating; and enhanced with cross-attention over textual features. Experiments across multiple review-based datasets reveal that although adaptive fusion mechanisms improve representation flexibility, the marginal contribution of textual signals remains limited compared to the collaborative backbone. These findings suggest that, under typical rating-prediction settings, collaborative information continues to dominate performance, raising important considerations for the effective integration of semantic review signals into recommendation models.

14.
arXiv (CS.CL) 2026-06-12

From Tokens to Faces: Investigating Discrete Speech Representations for 3D Facial Animation

The choice of speech representation is critical in speech-driven 3D facial animation. Representations differ in what they encode: SSL features emphasize segmental and semantic cues, neural codecs yield latents optimized for acoustic reconstruction, and ASR-style objectives produce label-based spaces. We evaluate four speech representation families for 3D facial synthesis, comparing their facial reconstruction quality across two facial decoders using objective metrics and a perceptual evaluation. We additionally conduct probing analyses that relate tokenized representations to phonetic units and to articulatory deformations. We found that encoding phonetic classes is beneficial for accurate facial animation prediction on both semantic and label-based representations with comparable facial animation quality. From the latter, we introduce an Audio Visual Text-to-Speech (AVTTS) pipeline that leverages, as a shared space, discrete representations to decode speech and 3D facial motion.

15.
arXiv (CS.CL) 2026-06-11

Dual-Stance Evaluation of Sycophancy: The Structure of Agreement and the Limits of Intervention

Activation steering can shift LLM behaviour, but standard evaluations do not typically test whether a sycophancy-reduction direction also suppresses agreement with factually correct statements. We introduce dual-stance evaluation, which tests both stances of each topic, and apply it to centroid-difference steering on Llama-3-8B-Instruct. We find a dissociation: the model represents sycophantic and factual agreement in geometrically distinct subspaces, yet the steering direction projects equally onto both and cannot differentially target either. The direction accordingly reduces agreement with factually correct statements (e.g. that the Earth is round) as well as sycophantic ones. All other static properties of the two activation groups are matched, suggesting the behavioural dissociation arises from generation dynamics or from finer-grained structure that residual-stream analysis cannot resolve. The pattern illustrates a general gap: representations that are readable from activations may not be writable through them.
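
Centroid-difference steering can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' code: the two-dimensional activation vectors, the scale `alpha`, and all function names are hypothetical, and real residual-stream activations would have thousands of dimensions.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def steering_direction(group_a, group_b):
    """Unit vector pointing from the centroid of group_b to that of group_a."""
    diff = [a - b for a, b in zip(centroid(group_a), centroid(group_b))]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]


def steer(hidden, direction, alpha):
    """Shift one activation vector against the steering direction."""
    return [h - alpha * d for h, d in zip(hidden, direction)]


def projection(vec, direction):
    """Scalar projection of vec onto a unit direction."""
    return sum(v * d for v, d in zip(vec, direction))


# Made-up activations for sycophantic and non-sycophantic responses.
sycophantic = [[2.0, 0.0], [2.0, 2.0]]
neutral = [[0.0, 0.0], [0.0, 2.0]]

direction = steering_direction(sycophantic, neutral)
hidden = [3.0, 5.0]
steered = steer(hidden, direction, alpha=2.0)
print(direction, steered)
```

Because the shift is the same for every input, the projection onto the direction drops by `alpha` whether the agreement being expressed is sycophantic or factually correct, which is the kind of side effect a dual-stance evaluation is designed to expose.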

16.
arXiv (CS.CV) 2026-06-16

Cross-Modal Registration Between 3D and 2D Fingerprints via Pose-Aware Unwrapping and Point-Cloud Fusion

Three-dimensional (3D) fingerprints preserve global finger geometry and local ridge structure while avoiding contact-induced deformation, but they remain difficult to integrate with legacy two-dimensional (2D) fingerprint systems. This paper addresses the intermediate stage between 3D acquisition and cross-modal matching, and presents a unified framework for 3D fingerprint preprocessing and registration across contactless and contact-based 2D modalities. The framework combines four components: 1) a nonparametric visualization and unwrapping method that converts a 3D fingerprint point cloud into a rolled-equivalent 2D representation without relying on a global finger-shape model; 2) a point-cloud fusion pipeline that registers and mosaics multiple partial 3D captures into a more complete fingerprint model; 3) an ellipse-based pose normalization method for canonical finger alignment; and 4) a pose-aware cross-modal registration strategy that improves compatibility between 3D fingerprints and both contactless and contact-based 2D fingerprints. Experiments on a self-collected multimodal fingerprint database containing 150 fingers show that the proposed framework achieves ridge-level 3D registration accuracy, robust pose estimation, and consistent gains in 2D compatibility. In particular, the 3D fusion error is concentrated around 0.09 mm, contactless 2D–3D registration reaches ridge-scale projection accuracy, and pose-aware unwrapping improves genuine matching scores relative to generic 3D unwrapping. These results support the use of 3D fingerprints as an effective geometric bridge across heterogeneous fingerprint modalities. The baseline implementation has been publicly released at https://github.com/XiongjunGuan/3DFpVisual.

17.
arXiv (quant-ph) 2026-06-16

Controlled Quantum Metrology with Anisotropic Heisenberg Spin Interactions under Intrinsic Decoherence

arXiv:2606.16918v1 Announce Type: new Abstract: We theoretically investigate quantum parameter estimation in a two-qubit anisotropic Heisenberg spin system with Dzyaloshinskii-Moriya (DM) interaction in the presence of intrinsic decoherence described by the Milburn model. Using the Quantum Fisher Information (QFI), we study the estimation of both the uniform magnetic field and the DM interaction strength. Analytical expressions for the time-evolved density matrix are obtained and used to explore the effects of exchange anisotropy, intrinsic decoherence, and probe-state preparation on the achievable estimation precision. Our results show that suitable tuning of the anisotropic exchange coupling and the initial entangled state can considerably enhance the estimation performance, with different optimal parameter regimes emerging for magnetic-field and DM-interaction sensing. To better understand the role of quantum resources in metrology, we also examine the behaviour of concurrence, quantum coherence, and von Neumann entropy. Overall, our findings demonstrate that anisotropic Heisenberg spin systems with DM interaction provide a promising and flexible platform for high-precision quantum metrology even in the presence of intrinsic decoherence.
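
For orientation, the standard textbook forms of the quantities named above are given below. The abstract does not state which conventions the paper adopts, so the notation here (units with $\hbar = 1$, decoherence rate $\gamma$, estimated parameter $\theta$, number of repetitions $\nu$) is an assumption.

```latex
% Milburn intrinsic-decoherence master equation, first order in 1/\gamma-expansion
\frac{d\rho}{dt} = -i\,[H,\rho] \;-\; \frac{\gamma}{2}\,\bigl[H,[H,\rho]\bigr]

% Quantum Fisher Information for \rho_\theta = \sum_i \lambda_i |i\rangle\langle i|
F_Q(\theta) = 2 \sum_{\substack{i,j \\ \lambda_i+\lambda_j>0}}
  \frac{\bigl|\langle i|\,\partial_\theta \rho_\theta\,|j\rangle\bigr|^{2}}{\lambda_i+\lambda_j}

% Quantum Cramer-Rao bound on the estimation precision
\Delta\theta \;\ge\; \frac{1}{\sqrt{\nu\,F_Q(\theta)}}
```

A larger $F_Q$ tightens the bound, which is why the abstract treats QFI as the figure of merit for both the magnetic field and the DM interaction strength.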

18.
medRxiv (Medicine) 2026-06-15

ECHOCARDIOGRAPHY ABNORMALITIES IN PREECLAMPSIA WITH SEVERE FEATURES.

Purpose: To determine the frequency of echocardiographic abnormalities in women with preeclampsia with severe features, and to describe the spectrum and types of echocardiographic abnormalities associated with preeclampsia with severe features. Method: This is a prospective observational study conducted at Vani Vilas Hospital, attached to Bangalore Medical College and Research Institute, Bangalore, from January 2023 to December 2025. A total of 560 pregnant women diagnosed with severe preeclampsia (SPE) were included in the study. Women with chronic hypertension without superimposed preeclampsia, underlying cardiac disease, or a previous history of peripartum cardiomyopathy were excluded. Transthoracic echocardiography (TTE; 2D ECHO) was performed to evaluate cardiac structure and function. Echocardiographic abnormalities identified during the study were documented and analysed using descriptive statistical methods. Results: Abnormalities on ECHO were noted in 23.03%. A unique finding was the documentation of elevated pulmonary artery systolic pressure (PASP) suggestive of pulmonary hypertension (PH) (PASP >35 mm Hg) among 20.25% of the participants. It was also the commonest abnormality on ECHO. Mild PH was the commonest (15.71%), moderate PH was seen in 3.92% and severe PH in 0.71% of cases. The next most frequent abnormality was moderate to severe valvular regurgitation (10%), followed by left ventricular hypertrophy (5.53%). Diastolic dysfunction (DD) was seen in 3.92%, systolic dysfunction (SD) in 3.57%, chamber dilatation in 3.57% and LV global hypokinesia in 3.03% of SPE cases. Conclusion: Echocardiographic abnormalities were found in 23.03% of women with preeclampsia with severe features (SPE). SPE is associated with systolic dysfunction, diastolic dysfunction, chamber dilatation, valvular regurgitation, left ventricular hypertrophy and pulmonary hypertension.

19.
arXiv (CS.LG) 2026-06-11

SpaTeoGL: Spatiotemporal Graph Learning for Interpretable Seizure Onset Zone Analysis from Intracranial EEG

arXiv:2602.11801v2 Announce Type: replace Abstract: Accurate localization of the seizure onset zone (SOZ) from intracranial EEG (iEEG) is essential for epilepsy surgery but is challenged by complex spatiotemporal seizure dynamics. We propose SpaTeoGL, a spatiotemporal graph learning framework for interpretable seizure network analysis. SpaTeoGL jointly learns window-level spatial graphs capturing interactions among iEEG electrodes and a temporal graph linking time windows based on similarity of their spatial structure. The method is formulated within a smooth graph signal processing framework and solved via an alternating block coordinate descent algorithm with convergence guarantees. Experiments on a multicenter iEEG dataset with successful surgical outcomes show that SpaTeoGL is competitive with a baseline based on horizontal visibility graphs and logistic regression, while improving non-SOZ identification and providing interpretable insights into seizure onset and propagation dynamics.

20.
medRxiv (Medicine) 2026-06-22

Rare loss-of-function variants in POLD1, PMS1 and FAN1 modify age at onset of motor symptoms in Huntington's disease

Huntington's disease is a rare neurodegenerative disease whose primary risk factors are inherited expansions of a CAG repeat tract in the HTT gene. Somatic expansion of these tracts leads to neuronal toxicity, neuronal death and clinical disease progression. To identify genetic factors with a major impact on disease onset and progression, we genome sequenced 18,825 individuals for the ENROLL-HD study. Our results show that rare inactivating mutations in three genes, all involved in DNA damage repair, are major determinants of age of onset for motor symptoms (n=10,610) and other clinical manifestations. Heterozygote carriers of predicted loss-of-function (pLoF) variants in POLD1 and PMS1 developed motor symptoms on average 20 years (n=3; P=1×10^-5) and 7 years (n=6; P=2×10^-3) later than non-carriers, respectively. Conversely, heterozygote carriers of pLoF variants in FAN1 (n=30) developed symptoms 10 years earlier (P=2×10^-10). Our findings highlight therapeutic strategies and help predict age of onset for at-risk individuals.

21.
arXiv (CS.LG) 2026-06-24

KLip-PPO: A per-sample KL perspective on PPO-Clip

arXiv:2606.23932v1 Announce Type: new Abstract: Proximal Policy Optimization (PPO) is the standard policy-gradient algorithm for on-policy reinforcement learning. The literature presents it in two forms, a clipped surrogate that bounds the importance ratio between successive policies and a Kullback-Leibler penalty between them. These forms are treated as separate algorithms with their own gradients, their own hyperparameters, and their own reference implementations, and a sizeable body of empirical work compares them. We show that the gradient of the clipped surrogate is reproduced exactly by a Kullback-Leibler surrogate whose coefficient varies per sample, with closed-form dependence on the importance ratio and the advantage. The identity holds at every minibatch step and across the entire inner loop, and on five MuJoCo continuous-control benchmarks the two losses produce indistinguishable training curves. The reformulation exposes a structural feature of the clipped surrogate that the min notation hides. PPO-Clip's implicit per-sample penalty is a step function at the boundary of the trust region, and the shape of this coefficient is the natural design axis for generalising the algorithm. We sketch the resulting follow-up directions in the discussion.
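
The step-function behaviour mentioned above can be seen directly in the standard PPO-Clip objective. The sketch below does not reproduce the paper's per-sample KL coefficient, whose closed form is not given in the abstract; it only shows, for a single sample, that the gradient of the usual clipped surrogate with respect to the importance ratio is either the advantage or exactly zero. The clip range `eps=0.2` and the function names are assumptions for the example.

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard per-sample PPO-Clip objective: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)


def surrogate_grad_wrt_ratio(ratio, advantage, eps=0.2):
    """Analytic derivative of the objective with respect to the ratio.

    The gradient switches off once the ratio has already moved past the
    trust-region boundary in the direction the advantage favours.
    """
    if advantage > 0 and ratio > 1.0 + eps:
        return 0.0
    if advantage < 0 and ratio < 1.0 - eps:
        return 0.0
    return advantage


def numeric_grad(ratio, advantage, eps=0.2, h=1e-6):
    """Central finite difference, used here only as a sanity check."""
    up = clipped_surrogate(ratio + h, advantage, eps)
    down = clipped_surrogate(ratio - h, advantage, eps)
    return (up - down) / (2 * h)


# (ratio, advantage) pairs chosen away from the clip boundaries.
cases = [(1.5, 1.0), (1.0, 1.0), (0.5, 1.0), (0.5, -1.0), (1.5, -1.0)]
for r, a in cases:
    print(r, a, surrogate_grad_wrt_ratio(r, a), round(numeric_grad(r, a), 6))
```

Note the asymmetry: a sample with positive advantage and a ratio below the lower boundary still receives the full gradient, because the unclipped term is the smaller of the two there.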

22.
arXiv (CS.AI) 2026-06-24

Beyond U-Net: A Latent-Representation-Aligned Skip-Free Backbone for Flow-Matching Speech Enhancement

arXiv:2606.24745v1 Announce Type: cross Abstract: Generative models, particularly diffusion and score-based approaches, have recently achieved strong performance in speech enhancement, but their iterative sampling process limits real-time deployment. Flow Matching offers an efficient alternative by transporting noisy speech toward clean speech through an ordinary differential equation with few function evaluations. In this work, we propose a skip-free encoder-decoder backbone for flow-matching speech enhancement, guided by Latent Representation Alignment (LRA). Instead of relying on U-Net skip connections, which may transfer noise-correlated low-level features to the decoder, the proposed model aligns its bottleneck and decoder representations with clean latent features extracted from a frozen Descript Audio Codec encoder-decoder without quantization. This codec-aligned supervision promotes compact clean-speech representations while preserving efficient few-step inference. Experiments on WSJ0-CHiME3 and VoiceBank-DEMAND show improved PESQ and perceptual quality, especially on VoiceBank-DEMAND, using only five function evaluations.

23.
Nature Medicine 2026-06-16

Engineered heart muscle passes early clinical milestone

Engineered heart muscle allografts derived from induced pluripotent stem cells show promising early outcomes in patients with treatment-refractory advanced heart failure with reduced left ventricular ejection fraction, in support of further clinical investigation.

24.
arXiv (CS.AI) 2026-06-16

APEX: Adaptive Principle EXtraction - A Three-Layer Self-Evolution Framework for Production AI Agents

arXiv:2606.15363v1 Announce Type: new Abstract: Self-improvement in AI agents has emerged as a key research frontier: systems that modify their own prompts, workflows, and decision rules based on accumulated operational experience. The state-of-the-art Self-Harness framework [1] achieves 14–21% improvement on Terminal-Bench-2.0 by mining failure clusters and patching the agent harness. However, Self-Harness optimises only one dimension – the prompt harness – leaving behavioural principles and workflow topology unchanged. We propose APEX (Adaptive Principle EXtraction), a three-layer co-evolution framework that simultaneously evolves: (L1) the harness via failure-mode patching, (L2) behavioural principles via success-trace distillation [2], and (L3) the agent workflow topology via structural fitness-based selection [6]. We implement APEX on Joe [13], a production-grade super AI Agent built on NVIDIA Nemotron and designed as an Edge AI Agent Factory for the NVIDIA Agent Challenge 2026, managing a 15-node compute fleet using 114 real task traces collected over 18 days. APEX achieves an APEX Health Score of 0.570 (+90% vs. baseline 0.300) in a single evolutionary run, distilling 6 novel reusable principles and selecting a research-first workflow topology scoring 0.900 (+20%). Our results demonstrate that multi-dimensional co-evolution substantially outperforms single-axis harness optimisation, at a cost of only 4 LLM calls (~270 s) on a local qwen2.5-coder:32b instance.

25.
arXiv (CS.AI) 2026-06-11

Feature-Aligned Speech Watermarking for Robustness to Reconstruction Distortions

arXiv:2606.11828v1 Announce Type: cross Abstract: Audio watermarking aims to embed identifiable information into audio while remaining imperceptible. Existing methods adopt high-fidelity, low-energy designs to preserve perceptual quality, but the resulting watermarks lack robustness under suppression by speech reconstruction models. Improving robustness is challenging due to the inherent robustness-fidelity trade-off in existing designs, where increasing watermark energy improves robustness but reduces fidelity. To address this problem, we propose a feature-aligned watermarking method that aligns the watermark with the original speech feature distribution, allowing higher watermark energy to improve robustness while preserving imperceptibility. We use a pretrained speech codec to generate a pseudo-speech watermark and fuse it into the spectrogram of the input audio, with VAD loss and perceptual losses guiding embedding within voiced regions. Experiments show that our method maintains imperceptibility comparable to existing approaches while substantially improving robustness under both seen and unseen speech reconstruction models.