Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.LG) 2026-06-12

An Empirical Study on Predictive Maintenance for Component X in Heavy-Duty Scania Trucks

arXiv:2606.12486v1 Announce Type: new Abstract: Condition-based Predictive Maintenance (PdM) for truck fleets has gained momentum in recent years. This maintenance strategy aims to minimize unplanned downtimes and reduce costs by monitoring the health status of vehicles and taking proactive action based on their condition. However, the implementation of condition-based PdM systems is challenging due to the large volume of data generated by the trucks, the inherent complexity of detecting failures through sensor data and the difficulties in finding cost-effective trade-offs in the solution's implementation. In this paper, we define and validate a condition-based PdM methodology built on the assumption that the wear-and-tear state of the monitored component can be represented as a monotonically non-decreasing time series. It involves selecting only the most recent observations from the time series and transforming them into a tabular format for classification using machine learning (ML) models designed for tabular data. Our results indicate that the proposed methodology reduces costs on the Scania Component X dataset compared to current state-of-the-art (SOTA) approaches, while also simplifying the modeling process through AutoML.

02.
medRxiv (Medicine) 2026-06-22

A Drug-Specific, Half-Life-Adjusted Framework for Classifying CNS-Active Systemic Therapy Exposure During and After Radiotherapy

Clinical oncology datasets often store systemic therapy as a regimen label with a start date and an end date. Those records are clinically recognizable but can be analytically incomplete when the research question concerns whether a patient was exposed to a concurrent CNS-active drug (cCNS-aD) or an adjuvant CNS-active drug (aCNS-aD) around radiotherapy. Contemporary CNS-oncology studies usually define CNS activity by empiric drug lists and define concurrency by fixed calendar windows, although the literature shows substantial heterogeneity across both concepts. This paper proposes a generalizable framework for converting raw systemic therapy records into reproducible cCNS-aD and aCNS-aD variables, useful in subgrouping for clinical studies. The framework uses a transparent CNS scoring model based on three clinical evidence components: intracranial objective response rate, consensus CNS endorsement, and intrathecal route of administration. It then defines a pharmacokinetic exposure proxy as the recorded end date plus five half-lives. Concurrent exposure is classified by overlap with the radiotherapy interval, while post-radiotherapy exposure is classified by overlap with a prespecified post-RT attribution window. The framework separately identifies post-RT pharmacokinetic persistence and post-RT treatment initiation, allowing investigators to distinguish continued exposure from true adjuvant initiation. This is a methodological framework and reference implementation. Implementation audits and endpoint-specific sensitivity analyses remain necessary before use as a definitive exposure classifier

03.
arXiv (CS.CV) 2026-06-24

MambaRaw: Selective State Space Modeling for Efficient 4K Raw Image Reconstruction

In-camera JPEG previews are ubiquitous in raw image formats and provide an sRGB reference at negligible storage cost. Although existing metadata-based reconstruction frameworks can exploit this side information when recovering raw images, their context models often become computationally expensive especially at high resolution, eg, 4K raw image, given that attention mechanisms scale quadratically with feature maps, hindering its practical application. To address these limitations, we propose MambaRaw, a JPEG-conditioned metadata-based raw image reconstruction framework that uses State Space Models (SSMs) to estimate entropy parameters efficiently. Our key contribution comprises a Spatial-Energy Coupled Context Modeling mechanism with two lightweight modules: (1) TileMambaBlock, which performs Mamba-style selective scanning only on information-dense tiles to improve the efficiency; and (2) Energy-Aware Refinement (EAR), an identity-initialized residual module that enhance feature representation to match the long-tail energy distribution of raw signals. Extensive experiments on three camera datasets (Sony, Olympus, Samsung) show consistent improvements over strong metadata-based baselines and set a new state of the art for JPEG-guided raw reconstruction with great efficiency. Notably, at low metadata bitrates, MambaRaw increases PSNR by 1.2–1.4 dB and reduces end-to-end coding latency by about 9%. Code is released at https://github.com/Peizeli1/MambaRaw.

04.
arXiv (CS.LG) 2026-06-19

A High-Resolution Landscape Dataset for Concept-Based XAI With Application to Species Distribution Models

arXiv:2604.13240v2 Announce Type: replace-cross Abstract: Mapping the spatial distribution of species is essential for conservation policy and invasive species management. Species distribution models (SDMs) are the primary tools for this task, serving two purposes: achieving robust predictive performance while providing ecological insights into the driving factors of distribution. However, the increasing complexity of deep learning SDMs has made extracting these insights more challenging. To reconcile these objectives, we propose the first implementation of concept-based Explainable AI (XAI) for SDMs. We leverage the Robust TCAV (Testing with Concept Activation Vectors) methodology to quantify the influence of landscape concepts on model predictions. To enable this, we provide a new open-access landscape concept dataset derived from high-resolution multispectral and LiDAR drone imagery. It includes 653 patches across 15 distinct landscape concepts and 1,450 random reference patches, designed to suit a wide range of species. We demonstrate this approach through a case study of two aquatic insects, Plecoptera and Trichoptera, using two Convolutional Neural Networks and one Vision Transformer. Results show that concept-based XAI helps validate SDMs against expert knowledge while uncovering novel associations that generate new ecological hypotheses. Robust TCAV also provides landscape-level information, useful for policy-making and land management. Code and datasets are publicly available.

05.
arXiv (CS.LG) 2026-06-18

Towards a future space-based, highly scalable AI infrastructure system design

arXiv:2511.19468v2 Announce Type: replace-cross Abstract: If AI is a foundational general-purpose technology, we should anticipate that demand for AI compute – and energy – will continue to grow. The Sun is by far the largest energy source in our solar system, and thus it warrants consideration how future AI infrastructure could most efficiently tap into that power. This work explores a scalable compute system for machine learning in space, using fleets of satellites equipped with solar arrays, inter-satellite links using free-space optics, and Google tensor processing unit (TPU) accelerator chips. To facilitate high-bandwidth, low-latency inter-satellite communication, the satellites would be flown in close proximity. We illustrate the basic approach to formation flight via an 81-satellite cluster of 1 km radius, and describe an approach for using high-precision ML-based models to control large-scale constellations. Trillium TPUs are radiation tested. They survive a total ionizing dose equivalent to a 5 year mission life without permanent failures, and are characterized for bit-flip errors. Launch costs are a critical part of overall system cost; a learning curve analysis suggests launch to low-Earth orbit (LEO) may reach $\lesssim$\$200/kg by the mid-2030s.

06.
arXiv (quant-ph) 2026-06-11

Large Fluctuations in Open Quantum Systems

arXiv:2606.11822v1 Announce Type: new Abstract: We study statistics of atypical measurement outcomes in the steady states of driven open quantum systems. In equilibrium, the probability distribution over the phase space, as encoded in, e.g., the Wigner function, is analytic in the phase-space coordinates. We show that this property is generically lost in driven dissipative systems: their {\it large-deviation function} develops lines and surfaces across which its derivatives are discontinuous. As an illustrative example, we consider a parametrically driven Kerr oscillator coupled linearly and/or nonlinearly to a dissipative bath. Rare fluctuations in the amplitude and phase of the induced oscillations are governed by semiclassical instanton trajectories of the corresponding Keldysh-Lindblad action. We demonstrate that a given fluctuation can be realized through multiple distinct instanton trajectories. The competition between these trajectories leads to abrupt switching of the dominant instanton and, consequently, to non-analytic features in the large-deviation function.

07.
arXiv (CS.CV) 2026-06-25

Cross-Modality Structural Guidance in 3D Latent Diffusion for Robust FLAIR Super-Resolution

High-resolution (HR) MRI acquisition is often hampered by scan time constraints, resulting in anisotropic or low-resolution scans (e.g., thick-slice FLAIR) that limit diagnostic accuracy. While deep learning-based super-resolution (SR) methods show promise, they often hallucinate anatomical details, which can compromise brain structural integrity. To mitigate this limitation, we introduce MR-DiffuSR, a Multi-Resolution Diffusion-based Super-Resolution framework that incorporates HR T1w structural image priors to guide the restoration of thick-slice FLAIR scans and operates in the 3D latent space. Our architecture introduces cross-modality structural swin-attention, which derives structural attention maps from the HR T1w and applies them to the low-resolution FLAIR latent features. This design disentangles anatomical structure from modality-specific contrast, effectively preventing hallucinations. Furthermore, we employ a mixed-scale degradation strategy, training the model on a continuum of downsampling factors to ensure robustness to varying slice thicknesses, while optimizing with a DINOv3-based perceptual loss to preserve high-frequency semantic details. Evaluated on the ADNI-4 dataset, MR-DiffuSR surpasses both CNN and 2D diffusion approaches, achieving an average PSNR of 32.46dB, SSIM of 0.97, and LPIPS of 0.07 across all downsampling factors. In downstream white matter hyperintensity segmentation, our model demonstrates exceptional robustness. While baseline performance collapses at 10x down-sampling (Dice: 0.51), MR-DiffuSR maintains a Dice score of 0.63, preserving utility even at 7mm equivalent slice thickness.

08.
bioRxiv (Bioinfo) 2026-06-18

fuzzyfold: a high-performance framework for stochastic RNA folding kinetics

Authors:

The analysis of nucleic acid secondary structures is overwhelmingly dominated by methods that analyze the thermodynamic equilibrium distribution and which ignore all dynamic aspects of nucleic acid folding. Yet, there are numerous popular examples of nucleic acid folding that rely on kinetic models, such as RNA riboswitches or DNA strand displacement systems. Here, I am presenting fuzzyfold, a Rust-based software package for nucleic acid secondary structure analysis with an explicit focus on stochastic modeling. The framework introduces three-way and four-way shift moves with a biophysically motivated rate-model parameterization, and it is developed with an emphasis on both model flexibility and performance, e.g. allowing for the generation of single co-transcriptional trajectories for thousand-nucleotide long RNA molecules in just a few minutes. The main strength of the fuzzyfold package, however, is its focus on user and developer interfaces for long-term development. It provides easily installable command-line interfaces, e.g. for aggregating data from multiple parallel trajectories efficiently into an ensemble-level dynamic analysis. For developers, the code-base supports straight-forward substitution of thermodynamic and kinetic free-energy models, and a flexible library interface with Python bindings, enabling integration of individual components into custom computational workflows.

09.
bioRxiv (Bioinfo) 2026-06-15

VrySure: A Multi-Task AI Scientific Fraud Detection Platform for Identifying Manipulated and AI-Generated Biomedical Research Images

Integrity of scientific data is critical in biomedical research, where images often serve as primary evidence for experimental observations and conclusions. Advances in image-editing technologies and generative artificial intelligence (AI) have increased the accessibility and realism of visual manipulation, making detection through manual review increasingly challenging. To empower our laboratory researchers to continuously monitor and uphold scientific rigor and data integrity, and serve the global scientific community, we developed VrySure, an easy-to-deploy, AI-driven multi-task platform for automated image-integrity screening in biomedical research. VrySure integrates four detection modules: cross-image transformation detection, within-image copy-move detection, splicing detection in blot and gel images, and AI-generated image detection. The system identifies potentially manipulated images and, when possible, localizes suspicious regions using bounding-box outputs to support downstream verification. To support development and evaluation, we constructed task-specific datasets by combining public biomedical image resources, curated manipulated examples, and synthetic images generated by multiple generative AI systems. We evaluated VrySure using region-level F1 score, recall, precision, false negative rate (FNR), and false discovery rate (FDR) across multiple manipulation categories and compared its performance with two commonly used commercial image-integrity screening platforms under a predefined benchmark protocol. Under the tested conditions, VrySure achieved a higher F1 score and recall, lower FNR, and maintained a low FDR for within-image copy-move detection, splicing detection, and AI-generated image detection, while showing comparable performance in transformation detection. Beyond automated screening, VrySure is designed to support source-data comparison and evidence-based assessment in scientific integrity investigations. By integrating multiple detection capabilities into a unified and scalable workflow, VrySure provides a practical framework to improve the efficiency and consistency of image-integrity screening in biomedical research.

10.
medRxiv (Medicine) 2026-06-15

ICD-10 Code Ambiguity Obscures Treatment-Eligible Adults with Spinal Muscular Atrophy: A Single-Center Chart Review and Patient Outreach Study

Background. Three disease-modifying therapies (DMTs) for spinal muscular atrophy (SMA) have been approved since 2016, yet many adults remain untreated. Identifying them depends on ICD-10 codes that capture SMA but do not reliably distinguish it from other related conditions. We examined, in one U.S. health system, both patients' engagement with therapy and the accuracy of the codes used to find them. Methods. We conducted a retrospective chart review of adults in an academic health system identified by SMA-associated ICD-10 codes, with manual adjudication of diagnosis and DMT status. Confirmed SMA-positive, DMT-naive patients were invited to a structured telephone interview on treatment awareness and barriers. Results. Of 60 charts, 22 (36.7%; 95% CI 25.6-49.3%) were appropriately coded for SMA or a related disorder; only 16 (26.7%) had molecularly confirmed SMA. The other 38 (63.3%) were miscoded, spanning spinal and bulbar muscular atrophy, asymptomatic carriers, prenatal screening, and conditions unrelated to SMA. Ten of the 16 confirmed patients (62.5%) were DMT-naive; one was interviewed, one declined, and eight could not be reached. The non-response is itself a finding: the patients least visible to administrative data are the hardest to reach. Conclusions. ICD-10 ambiguity is a barrier to treatment access in adult SMA, as is loss to follow-up. We make two recommendations: continuous documentation-coding alignment that uses natural language processing to verify the genetic precondition, and type-specific SMA codes (subcodes for Types 0-4) anchored on molecular SMN1 confirmation. Together these would support cohort identification, outreach, and evidence generation without adding to clinician burden.

11.
arXiv (CS.AI) 2026-06-16

HOLO-MPPI: Multi-Scenario Motion Planning via Hierarchical Policy Optimization

arXiv:2606.16480v1 Announce Type: cross Abstract: Robots deployed in the real world must plan motions across diverse scenarios without per-scenario retuning. End-to-end reinforcement learning (RL) can generalize across scenarios but often becomes brittle under distribution shift, reward misspecification, and stochastic interactions. Model predictive path integral (MPPI) control enables strong real-time refinement without gradients, but its performance depends on a well-shaped sampling prior, while manually designing the priors does not scale to multi-scenario deployment. We present HOLO-MPPI (High-level Offline, Low-level Online MPPI), a multi-scenario motion planning framework that combines high-level policy learning with low-level stochastic optimal control. Offline, we learn a high-level policy that proposes scenario-robust plans in an abstract action space, with a learned world model for online rollout. Online, the policy serves as a data-driven prior generator that parameterizes MPPI's sampling distribution conditioned on the current observation and goal. MPPI then optimizes low-level control sequences around this prior in real time to adapt to local disturbances. We instantiate HOLO-MPPI in autonomous driving by designing an effective high-level action space and tailored model architectures. Our evaluation across diverse driving scenarios shows that HOLO-MPPI improves upon MPPI and end-to-end RL baselines while maintaining real-time control.

12.
arXiv (CS.AI) 2026-06-17

First, do NOHARM: towards clinically safe large language models

arXiv:2512.01241v3 Announce Type: replace-cross Abstract: Large language models (LLMs) are routinely used by physicians and patients for medical advice, yet their clinical safety profiles remain poorly characterized. We present NOHARM (Numerous Options Harm Assessment for Risk in Medicine), a 1,100-task benchmark of primary care-to-specialist consultation cases to measure the frequency and severity of harm from LLM-generated medical recommendations. NOHARM covers 10 specialties, with 12,747 expert annotations for 4,249 clinical management options. Across 28 LLMs, recommendations carried the potential for severe harm in up to 22.6% of cases, with errors of omission accounting for more than 80% of severe errors. In a randomized trial of 101 generalist physicians, human benchmark performance significantly improved with AI assistance, yet physicians remained far from realizing the potential of AI tools, frequently ignoring essential advice surfaced by AI. Safety performance tracked general-intelligence and medical-knowledge benchmarks across the full range of models but decoupled at the frontier. Despite strong performance on existing evaluations, widely used AI models can produce medical advice with the potential for severe harm at non-trivial rates, highlighting the importance of explicit measurement of clinical safety.

13.
arXiv (CS.AI) 2026-06-12

Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation

arXiv:2606.12594v1 Announce Type: new Abstract: Modern Lean theorem provers achieve strong performance only with substantial training and inference compute, driven in part by scarce verified proof data and the long reasoning traces of formal proof search, making both supervised fine-tuning (SFT) and sampling expensive. We introduce Pythagoras-Prover, a compute-efficient open-source family of Lean theorem provers built for practical compute budgets. The family spans two generation paradigms: autoregressive models at 4B and 32B parameters, and a first proof-of-concept diffusion-based prover (4B) that iteratively refines Lean proofs at inference time. For training efficiency, we build a Lean-verified corpus stratified into easy, medium, and hard problems for curriculum SFT, so models acquire proof skills progressively from shorter, simpler proofs to longer, harder ones. During SFT, a dynamic proof-reasoning filtering scheme preserves informative proof traces while keeping each instance within an 8k-token context budget. We also introduce Augmented Lean Formalisation (ALF), which expands scarce verified corpora into variants of formal statements, populated via self-distillation for extra training signal without formally verifying every mutated instance. By perturbing known problems while preserving their formal character, ALF reduces reliance on any statement's surface form. Empirically, Pythagoras-Prover-4B surpasses DeepSeek-Prover-V2-671B at pass@32 on MiniF2F-Test (86.1% vs 82.4%) with ~167x fewer parameters, while Pythagoras-Prover-32B sets the open-source state of the art at 93.0% on MiniF2F-Test and solves 93 of 672 PutnamBench problems. We release MiniF2F-ALF, an ALF-mutated contamination-sensitive benchmark on which every evaluated model loses accuracy; here our 32B remains strongest and our 4B matches the prior state of the art, Goedel-Prover-V2-32B.

14.
medRxiv (Medicine) 2026-06-17

Frequency-dependent cognitive effects of Deep Brain Stimulation in Parkinson's Disease: A Systematic Review and Meta-Analysis

Background: Subthalamic nucleus deep brain stimulation (STN-DBS) improves levodopa-induced motor complications and cardinal motor symptoms of Parkinson's disease (PD), but stimulation frequency may differentially shape outcomes. This is evident for axial and gait symptoms, which may respond differently to lower-frequency stimulation. Whether frequency-dependent effects extend to cognition remains unclear. Objective: To investigate the cognitive effects of DBS at distinct frequencies in PD. Methods: We conducted a systematic review and meta-analysis (PROSPERO - CRD42024618253). PubMed, Web of Science, and EMBASE were searched for studies assessing cognitive outcomes under different stimulation frequencies. Eight cognitive domains were defined: verbal fluency, cognitive flexibility, executive control, working memory, attention, processing speed, episodic memory, and time processing. Multilevel random-effects meta-analyses were performed, with effect sizes expressed as Hedges' g. Results: Forty-three studies met the inclusion criteria, the majority (n = 31) involving STN-DBS. Twenty-one STN-DBS studies, including 355 patients, were included in the meta-analysis. Compared with HFS ([≥] 130 Hz), lower frequencies (4-80 Hz) were associated with better verbal fluency (g = 0.27) and cognitive flexibility (g = 0.38), with consistent effects across sensitivity and leave-one-out analyses. Accuracy-based executive control measures also favored lower-frequency stimulation. OFF-stimulation comparisons showed a concordant pattern. Evidence for other targets (PPN and NBM) was limited. Conclusions: Lower-frequency STN-DBS was associated with modest benefits in specific cognitive domains compared with HFS. These findings highlight the need for future research to determine how frequency interacts with stimulation location and symptom-specific networks to shape cognitive and cognitive-motor outcomes in PD.

15.
arXiv (math.PR) 2026-06-11

Integrated expectile-based measures of inequality

arXiv:2606.12333v1 Announce Type: cross Abstract: Expectiles provide a class of asymmetric location functionals that incorporate the magnitude of deviations and admit a natural geometric interpretation. Building on their structural consistency with the convex stochastic order, this paper introduces a family of integrated expectile functionals for measuring risk, dispersion, and inequality. The proposed functionals admit analytical representations as integrals of expectiles across asymmetry levels. For a distinguished subclass of these constructions, a geometric representation is available: the resulting quantities can be expressed as weighted areas of star-shaped sets encoding the distributional asymmetry of a random variable. This approach yields a new class of expectile-based inequality indices, constituting a natural counterpart to classical Gini-type measures while preserving desirable monotonicity and consistency properties. Empirical counterparts are derived in closed form and admit explicit decompositions over finite samples. The framework extends naturally to multivariate settings through directional expectile constructions, leading to measures capable of capturing genuinely joint forms of multivariate dispersion and inequality.

16.
arXiv (quant-ph) 2026-06-24

A high-fidelity two-qubit gate for multimode superconducting P-mon qubits

arXiv:2606.24772v1 Announce Type: new Abstract: To scale superconducting quantum processors, it is essential to achieve long coherence times while engineering interactions that do not introduce additional decoherence channels. In superconducting qubit systems, this can be realized using multimode circuits that feature a protected qubit mode alongside a distinct mediator mode. Building on this concept, our recently developed P-mon qubit provides intrinsic protection against decoherence from the readout environment. We extend this approach to controlled two-qubit interactions, by exploiting the mediator modes of P-mons for on-demand coupling. Because direct interactions between the qubit modes are strongly suppressed, unwanted $ZZ$-type interactions are significantly reduced to below $3.6(5)~kHz$ in the idle state. When tuning the coupled mediator modes on resonance, the cross-Kerr interaction between the qubit and the hybridized mediator modes leads to a qubit-state dependent frequency shift. By selectively addressing these transitions, we implement a $180~ns$ long CZ gate and determine a fidelity of $99.62(4)~%$. These results represent a significant step toward a scalable superconducting architecture that maintains high performance at scale.

17.
arXiv (CS.AI) 2026-06-11

SirenFNO: Efficient and Full Frequency Learning of Fourier Neural Operators

arXiv:2606.11518v1 Announce Type: cross Abstract: Fourier neural operators (FNOs) are effective and efficient surrogates for approximating solutions of PDEs and generalize across discretizations. However, owing to the reliance on frequency truncation to maintain learning efficiency of FNOs, empirical studies suggest that FNOs exhibit spectral bias toward low-frequency information, which may hinder the learning capability especially for certain PDEs with strong high-frequency oscillations. To address this limitation, we propose SirenFNO, a novel framework that leverages sinusoidal representation networks (SIRENs) to learn implicit neural representations and performs mode-wise kernel parameterization. Our SIREN parameterization learns a full-grid spectrum with a constant and discretization-independent parameter count, thereby eliminating the need for frequency truncation. We further extend SirenFNO with functional tensor decompositions to enhance parameter and learning efficiency. Empirical results show that our SirenFNO consistently outperforms FNO with approximately $4$ to $15$ times parameter reductions with preserved discretization invariance, and our functional decomposition variants obtain performance improvements with a maximum of $73$ times fewer parameters across multiple PDE benchmarks.

18.
arXiv (CS.CV) 2026-06-18

Neural Phase Correlation

Authors:

Correspondence is fundamentally relational: it seeks the unknown transformation between two observations of a common scene, not the content of either. Yet the dominant learning-based methods do not represent the transformation as a first-class object in the architecture. They encode each image independently and let a learned similarity function or a deep decoder discover the mapping implicitly. Phase correlation is the canonical exception, measuring the inter-image relationship directly in the Fourier domain, but the rigidity of its fixed basis confines it to global translation. We introduce a learned generalization of phase correlation that lifts this restriction by learning the basis on which the transformation decomposes. The same algebraic primitive extends to dense non-rigid deformations and to unitary dynamics. On the ACDC cardiac-MRI benchmark the framework matches or exceeds prior published baselines on both registration directions. On CAMUS echocardiography it matches state-of-the-art without auxiliary scoring or adaptive-smoothness mechanisms. Applied to time-evolved wavefunction pairs of the 1-D quantum harmonic oscillator, the same framework recovers the Hermite-function eigenstates and the quantized energy levels of the unknown Hamiltonian from observation pairs alone.

19.
arXiv (CS.LG) 2026-06-16

Robust Neural Tucker Factorization with Bias Correction and Adaptive Initialization

arXiv:2606.16388v1 Announce Type: new Abstract: High-dimensional incomplete (HDI) tensors are widely used in traffic and climate applications, but sparse observations make accurate completion difficult. The intrinsic non-linear dynamics and non-stationary variations across distinct multi-modal fields severely hinder the efficacy of conventional linear reconstruction frameworks. Neural Tucker factorization provides an effective framework for modeling high-order interactions among tensor modes. By parameterizing underlying structural characteristics into continuous latent spaces, neural representations circumvent the rigid low-rank constraints of classical algebra. However, its performance can still be affected by implementation-level choices, especially parameter initialization and the bias configuration of the final output mapping. Suboptimal initializations frequently lead to variance explosion across the cubically expanded interaction spaces, driving the subsequent non-linear activation boundaries into severe gradient saturation zones, while the omission of a dedicated translation parameter forces interaction weights to implicitly absorb global statistical deviations. This paper proposes a simple yet effective neural Tucker factorization model with Kaiming initialization and bias correction (KaBiN) for HDI tensor completion. The proposed model utilizes Kaiming uniform initialization for the embedding and Tucker linear parameters, and adopts a simple bias correction in output mapping. By elegantly decoupling global mean shifts from local structural representations, the framework provides a highly stable and well-conditioned optimization landscape. Experiments on three real-world HDI tensor datasets show that KaBiN achieves better performance than the original NeuTucF, while introducing minimal computational overhead.

20.
arXiv (CS.CL) 2026-06-17

LLM Features Can Hurt GNNs: Concatenation Interference on Homophilous Graph Benchmarks

Adding LLM-generated node features to graph neural networks (GNNs) is widely reported to improve accuracy on standard benchmarks. We document a contrasting observation: when LLM features are introduced through pure input concatenation (rather than joint training, distillation, or prompt-conditioning), they can systematically degrade accuracy on the same homophilous benchmarks where end-to-end LLM pipelines succeed. With an MLP backbone on the Planetoid public split and bag-of-words original features, concatenating SBERT-encoded GPT-4o-mini TAPE features reduces PubMed test accuracy by -17.0 +/- 0.3 pp and Cora by -4.3 +/- 0.6 pp (CiteSeer -0.6 +/- 0.8 pp, within seed noise). The drop attenuates as we relax each condition (GCN / GCNII / GAT backbones, random splits, smaller encoders) and reverses on medium-homophily WikiCS (+4.4 pp) and ogbn-arxiv (+11.7 pp). To predict when concatenation helps versus hurts, we report a simple measure of LLM-alone discriminability, Delta_sig. Across 9 datasets Delta_sig correlates with the concatenation cost more strongly than homophily at point estimate (r^2 = 0.38 vs. 0.06; N=9, bootstrap CIs overlap). The bootstrap-best change-point is tau = 13.8 pp, and the rule "Delta_sig

21.
arXiv (CS.CL) 2026-06-12

Getting Better at Working With You: Compiling User Corrections into Runtime Enforcement for Coding Agents

Interactive LLM agents are becoming part of daily work, but they do not reliably become easier to work with over time: a correction remembered in one session may still be violated in the next. We study this gap between preference access and preference compliance. In tasks derived from anonymized real-user friction cases, Mem0 memory still leaves 57.5% of applicable preference checks violated. We introduce Test-time Rule Acquisition and Compiled Enforcement (TRACE), a drop-in skill-layer pipeline for coding-agent runtimes that mines user corrections, rewrites them as atomic rules, and compiles them into runtime checks that must pass before an agent completes future tasks. Unlike runtime checks written ahead of time by developers, TRACE skills come from the user's own chat corrections. We evaluate TRACE with simulated user-in-the-loop experiments on ClawArena coding-agent tasks and MemoryArena-derived memory-intensive tasks. On ClawArena, TRACE reduces held-out preference violation from 100.0% to 37.6% on in-distribution tasks and from 100.0% to 2.0% on out-of-distribution tasks. On MemoryArena-derived tasks, TRACE reduces in-distribution violation from 100.0% to 60.5% while matching or exceeding the strongest memory baseline on task pass. These results suggest that compiling corrections into runtime enforcement can address a repeated-friction failure mode that memory alone does not reliably solve, reducing the need for users to restate the same correction across future sessions. Experiment code is available at https://github.com/YujunZhou/TRACE_exp, and the deployable skill is available at https://github.com/YujunZhou/tellonce.

22.
arXiv (CS.CV) 2026-06-25

Contrastive Conditional-Unconditional Alignment for Long-tailed Diffusion Model

Training data for class-conditional image synthesis often exhibit a long-tailed distribution with limited amount of images for tail classes. Such an imbalance causes mode collapse and reduces the diversity of synthesized images for tail classes. For class-conditional diffusion models trained with imbalanced data, we aim to improve the diversity and fidelity of tail class images without compromising the quality of head class images. We propose contrastive conditional-unconditional alignment (CCUA), which comprises two synergistic loss functions. Our first loss is an Alignment Loss (AL) that aligns class-conditional generation with unconditional generation at large timesteps. Alignment loss makes the denoising process insensitive to class conditions for the initial steps, which enriches tail classes through knowledge sharing from head classes. Secondly, we diversify unconditional generation via an Unsupervised Contrastive Loss (UCL) to increase the distance/dissimilarity among synthetic images. We combine the two losses to implicitly diversify conditional generation. Our framework is easy to implement as demonstrated on both U-Net based architecture and Diffusion Transformer. Our method outperforms vanilla denoising diffusion probabilistic models, score-based diffusion model, and alternative contrastive methods for class-imbalanced image generation across various datasets, in particular ImageNet-LT with 256$\times$256 resolution.

23.
arXiv (math.PR) 2026-06-18

Denoising Distances in Metric Measure Spaces

arXiv:2606.18301v1 Announce Type: cross Abstract: Recent work studied the problem of finding clusters and denoising pairwise distances from noisy distances of points sampled on a manifold. We study the same problems in more general metric measure spaces under \lowerphiregularity{}. We give an algorithm that extracts large localized clusters around every sampled point and uses them to denoise distances to any fixed accuracy, with near-linear running time in the dense fixed-accuracy regime. We also show how to achieve much higher accuracy with a non-efficient algorithm. This suggests that unlike the Riemannian case, denoising to higher accuracy in more general metric spaces has a statistical-computational gap.

24.
arXiv (CS.AI) 2026-06-11

Forecasting Future Behavior as a Learning Task

arXiv:2606.11445v1 Announce Type: new Abstract: Trust in an AI system is often anchored by explanations of how it works, which one then uses to forecast its behavior on new inputs. For large reasoning models (LRMs), this conventional route is particularly difficult to follow: explanation methods for single token generations do not naturally generalize to long trajectories, and the trajectories themselves are often not faithful when read as natural language. We propose an alternative that bypasses the explanation step: treat behavior forecasting as a learnable task and train Behavior Forecasters that operates on a single reasoning trajectory to make the same forecasts one would typically seek from an explanation. The forecaster's training data is obtained by querying the LRM with no human annotation, and its inference is done in a single forward pass. We instantiate this approach on two tasks: how likely the LRM is to repeat its answer on re-runs, and how removing parts of the input changes its answer. We evaluate this approach on both tasks across three diverse reasoning datasets and find that trained Behavior Forecasters are more accurate than GPT-5.4 and Claude Opus-4.6 reading the same trajectories as naive readers, at a small fraction of their inference cost. We find that fine-tuning the backbone end-to-end and initializing it from the target LRM are each necessary for strong performance. These results show that the reasoning trajectory carries information about the LRM's future behavior that goes beyond what naive reading conveys.

25.
arXiv (CS.AI) 2026-06-16

MA-SBI: Misspecification-Aware Simulation-Based Inference via Side-Channel Guidance

arXiv:2606.16923v1 Announce Type: new Abstract: Simulation-based inference (SBI) of latent parameters is often hindered by simulator misspecification, the mismatch between simulated and real-world observations caused by inherent modeling simplifications. RoPE, the recent state-of-the-art for robust SBI, addresses this through optimal transport between learned representations of real and simulated observations, but requires ground-truth parameter calibration pairs that are typically unavailable in the very settings where SBI is needed. What practitioners do have is unstructured side-information such as regime labels, instruction text, and policy bulletins. We propose Misspecification-Aware Simulation-Based Inference (MA-SBI), a calibration-free framework that turns this side-channel into a posterior correction. A learned corrector maps side-channel text to an observation-space shift applied before any pre-trained amortized posterior, requiring no retraining and no parameter ground-truth. Our main theorem bounds achievable bias reduction by the mutual information between misspecification and side-channel, with a non-vacuous constant that extends to all sub-Gaussian noise via Donsker-Varadhan. On hide-the-calibration benchmarks, MA-SBI with text alone matches the oracle posterior across 10 seeds and two backbones (TOST equivalence), while RoPE given more data does not. The two approaches are complementary: where misspecification is structural and recoverable from parameter pairs, RoPE dominates, as the theory predicts. A stochastic variant improves posterior-predictive log-likelihood on real COVID and OxCGRT epidemiological data, and correctly leaves the posterior unchanged on a well-specified cognitive-science corpus.