Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-18

Veriphi: Attack-Guided Neural Network Verification with Dataset-Dependent Training Methods

arXiv:2606.18454v1 Announce Type: cross Abstract: We present Veriphi, a GPU-accelerated neural network verification system that combines fast adversarial attacks with formal bound certification using alpha,beta-CROWN methods. Through systematic experiments on MNIST and CIFAR-10 using three training methodologies (standard, adversarial, certified), we demonstrate that training method effectiveness is fundamentally dataset-dependent. Interval Bound Propagation (IBP) achieves 78% certified accuracy on simple MNIST (784 dimensions) but provides negligible certification performance on the more complex CIFAR-10 dataset, where PGD adversarial training dominates with 94% certification at small perturbations. We achieve 5x verification speedup through attack-guided falsification and scale our approach to production-size models (105.8M parameters) for real-world aerospace logistics optimization. Our results challenge the assumption that certified training universally outperforms adversarial training, showing context matters critically for verification strategy selection.

02.
arXiv (CS.AI) 2026-06-24

G$^3$VLA: Geometric inductive bias for Vision-Language-Action Models

arXiv:2606.24472v1 Announce Type: cross Abstract: Vision-language-action (VLA) models have made rapid progress in generalist robot manipulation by harnessing semantic knowledge from pretrained vision-language backbones, but their visual tokens remain grounded in 2D image coordinates rather than the calibrated geometry of the robot's cameras – a mismatch especially pronounced in multi-camera setups, where views are coupled by known intrinsics and extrinsics yet processed as independent images. We propose G$^3$VLA, a camera-aware geometric module that injects calibrated structure into the visual-token stream of a pretrained VLA without altering its action space or imitation objective, combining intrinsic-conditioned ray embeddings, projective positional encoding (PRoPE), and bidirectional cross-view fusion. Geometric supervision is provided either from ground-truth point maps when available, or from confidence-gated $\pi^3$X teacher predictions, requiring no depth sensors or manual annotations. Instantiated on $\pi_0$, G$^3$VLA yields consistent gains across the LIBERO suites, RoboCasa24, RoboTwin2.0, and real-robot settings, with the largest improvements on spatially and object-sensitive tasks. We further validate on $\pi_{0.5}$ and GR00T 1.5, with results suggesting that geometric transfer is most effective when geometry-aware tokens have direct access to the action generation pathway. Our project page is at https://sites.google.com/view/g3vla

03.
arXiv (math.PR) 2026-06-15

Semiclassical limit of Polyakov-Liouville measure and Q-Curvature Uniformization on evev-dimensional manifolds

arXiv:2606.14443v1 Announce Type: new Abstract: We study the semiclassical limit of the Polyakov-Liouville measure $\boldsymbol{\nu}_\gamma$, which is a non-Gaussian measure on $H^{-\eps}(M)$ that has recently been extended from Riemann surfaces to general Riemannian manifolds $(M,g)$ of even dimension. We show that under an appropriate rescaling in the semiclassical limit as $\gamma\to0$, the normalized Polyakov-Liouville measure $\Q_\gamma$ concentrates on the unique smooth weight $u$ for which the conformal metric $e^{2u}g$ on $M$ has constant $Q$-curvature.

04.
arXiv (CS.LG) 2026-06-15

Muon$^p$: Muon with Fractional Spectral Powers

arXiv:2606.13867v1 Announce Type: new Abstract: Muon is an increasingly widely used optimizer that replaces a gradient $G=USV^\top$ with its polar factor $UV^\top$, thereby flattening the singular spectrum. However, full flattening discards singular-value information that may matter for adaptation. We introduce Muon$^p$, a Muon-style optimizer that instead uses fractional spectral-power updates $US^pV^\top$ for rational $p\in(0,1)$, interpolating between Muon and gradient descent. To make it practical, we prove that fractional spectral powers cannot be computed by any fixed univariate polynomial iteration, and furthermore derive low-degree odd bivariate recurrences that approximate $US^pV^\top$ using only matrix multiplications, preserving Muon's matrix-multiplication-only structure and compute complexity. We show that Muon$^p$ maximizes the linear improvement in loss under the Schatten $q$-norm for $q=1+\frac{1}{p}$. Empirically, Muon$^p$ is especially effective for finetuning: on billion-scale models, Muon$^p$ improves validation perplexity and downstream task performance. We further analyze when Muon$^p$ is less suitable, through the lens of spectral geometry. Our results reveal important insights on when preserving the singular spectrum can bring significant gains, and introduce a principled way to achieve them.

05.
medRxiv (Medicine) 2026-06-23

Socioeconomic Determinants of Guideline-Concordant Therapy for Early-Stage Non-Small Cell Lung Cancer: A Population-Based Analysis from Appalachian and Non-Appalachian Ohio, 2004-2015

Purpose: To examine the relative contributions of insurance, county-level poverty, and other socioeconomic factors, as compared with Appalachian geography, to receipt of guideline-concordant therapy for early-stage non-small cell lung cancer (NSCLC) in Appalachian and non-Appalachian Ohio. Methods: Retrospective population-based cohort study using the Ohio Cancer Incidence Surveillance System. We identified adults diagnosed with early-stage NSCLC between 2004 and 2015 (N=26,756). The primary outcome was receipt of guideline-concordant local therapy (surgery or definitive radiation). Rural-urban classification used USDA Rural-Urban Continuum Codes. Multivariable logistic regression and Cox proportional hazards models assessed predictors of treatment and survival, with E-values, race-stratified models, and propensity score weighting as sensitivity analyses. Findings: Median age was 71 years; 50.3% were male, 83.8% non-Hispanic White, and 20.4% Appalachian. Overall, 83.6% received guideline-concordant local therapy (59.6% surgery, 24.0% radiation). In adjusted analysis, Medicaid (adjusted odds ratio [OR] 0.53, 95% confidence interval [CI] 0.44-0.63; adjusted risk ratio [RR] 0.94, 0.91-0.96), county-level poverty >20% (OR 0.77, 95% CI 0.68-0.87; RR 0.96, 0.95-0.98), and unmarried status were independently associated with lower therapy receipt, whereas Appalachian residence was associated with modestly higher receipt (OR 1.17, 95% CI 1.06-1.29; RR 1.02, 1.01-1.04). Therapy rates converged across regions over the study period (year x Appalachian interaction p20% (HR 1.13, 95% CI 1.07-1.20). Conclusions: Socioeconomic factors, particularly Medicaid insurance and county-level poverty, were the patient characteristics most strongly associated with lower receipt of guideline-concordant therapy, whereas Appalachian residence was not a barrier. Findings support targeted interventions addressing insurance-related and poverty-related barriers to lung cancer care in high-poverty communities regardless of geographic designation.

06.
arXiv (CS.AI) 2026-06-24

Abstractions of Queries in Ontology-Based Data Access

arXiv:2606.24618v1 Announce Type: new Abstract: In ontology-based data access (OBDA), multiple data sources are integrated via mappings to an ontology. We consider an OBDA setting based on existential rules and the certain answer semantics. We address the recent issue of query abstraction, which consists of abstracting data queries by translating them to the ontology layer. Since a perfect abstraction may not exist, the notions of minimally complete and maximally sound abstractions have been introduced. We study abstractions within an extension of UCQs with a limited form of inequality and a special predicate marking database constants. While this extension does not lead to an increased complexity of the problems of interest, it is able to express minimally complete abstractions, hence perfect abstractions when they exist. We also characterize maximally sound abstractions by making a new connection with the notion of maximum recovery stemming from data exchange.

07.
arXiv (CS.LG) 2026-06-18

UST-GNN: A Unified Spatial–Topological Graph Neural Network Framework for Urban Analytics–Demonstrated through a Case Study on Urban Health Prediction

arXiv:2504.04739v3 Announce Type: replace Abstract: Understanding how social, demographic, environmental, and spatial factors jointly shape urban outcomes is essential for sustainable urban development and evidence-based policy. Traditional statistical approaches often struggle to capture complex non-linear relationships, while many machine learning methods overlook the joint roles of spatial autocorrelation and network topology in urban systems. Recent advances in GeoAI have addressed these challenges only partially, often treating spatial effects, graph structure, evaluation, and interpretability separately. We present UST-GNN, a unified spatial–topological graph neural network framework that integrates neighbourhood connectivity, heterogeneous urban features, and positional/locational embeddings into a single representation. Using the MedSAT dataset, which contains over 150 environmental and socio-demographic variables and six prescription outcomes across 4,835 neighbourhoods in Greater London, UST-GNN outperforms strong statistical, geographically enhanced, and graph Machine Learning baselines, improving out-of-sample $R^2$ by 8.4–13.2\% under strict spatial cross-validation. We further introduce a lightweight principal-component module to interpret learned node embeddings geographically and relate them to policy-relevant covariates. The resulting analyses recover established patterns, offer new perspectives on debated associations, and reveal novel predictors warranting further causal investigation. Together, these findings demonstrate the value of graph-based spatial machine learning for urban health analytics, environmental inequality assessment, and evidence-based urban policy. Beyond predictive gains, UST-GNN provides a unified GeoAI analytical pipeline that can be embedded into urban digital twin workflows for scenario testing, monitoring, and data-informed decision-making for healthier, more sustainable cities.

09.
arXiv (quant-ph) 2026-06-17

Efficient time-series prediction on NISQ devices via time-delayed quantum extreme learning machine

arXiv:2602.21544v2 Announce Type: replace Abstract: We proposed a time-delayed quantum extreme learning machine (TD-QELM) for efficient time-series prediction on noisy intermediate-scale quantum (NISQ) devices. By encoding multiple past inputs simultaneously, TD-QELM achieves shallow circuit depth independent of sequence length, thereby, mitigating noise accumulation and reducing computational complexity. Experiments using the NARMA benchmark on both noiseless simulations and IBM's 127-qubit processor demonstrate that TD-QELM consistently outperforms conventional quantum reservoir computing in prediction accuracy and noise robustness. These results highlight TD-QELM as a practical and scalable framework for time-series learning on current NISQ hardware.

10.
arXiv (CS.CV) 2026-06-16

Vision-Encoder Behavioral Fingerprints of Image-to-Image Generative Models: A Training-Paradigm-Driven Taxonomy of Six Commercial APIs

作者:

We study six production image-to-image AI systems (gpt-image-1, Gemini 2.5 Flash Image, Flux Kontext, SDXL img2img, SD3 img2img, and Qwen Image Edit) under a content-adaptive sub-JND adversarial perturbation pipeline, scoring all outputs by frozen DINOv2 ViT-B/14 token distances against clean references. Across a 3,588-call corpus spanning COCO photographs, CelebA-HQ portraits, and AI-generated inputs, the six systems partition into two image-invariant behavioral bands on a 2D (patch_mean, ssim_clean) plane: edit-trained models (Flux Kontext, Qwen Edit, Gemini) cluster in a tight band, while T2I-base models adapted at sampling time (SDXL, SD3, gpt-image-1) cluster in a drift band.

11.
arXiv (CS.LG) 2026-06-16

Beyond the Smile: A Hybrid Convolutional VAE for Crypto Volatility Surfaces

arXiv:2606.16961v1 Announce Type: new Abstract: We present a convolutional variational autoencoder for cryptocurrency implied-volatility surfaces, together with a deployable predictor that combines it with a quadratic smile re-fit through a deterministic per-tenor routing rule. Trained on 6,034 fully-filled hourly Binance Options surfaces of BTC and ETH spanning May-October 2023 and parameterised on a common $6 \times 7$ tenor-delta grid, the model attains a hidden-cell surface-completion RMSE in the 0.94-1.56 vol-point range across both markets and mask rates 10-50%. The hybrid predictor attains 0.83 vol points at 50% masking against 7.00 for the smile re-fit alone, an eightfold reduction obtained at no additional inference cost. Under structurally-correlated hole patterns that emulate the withdrawal of an entire tenor of strikes, the smile re-fit incurs 9.6-13.1 vol points of error while the learned model remains at 1.5-1.9, isolating a regime in which the generative model is the only viable predictor. Joint training on BTC and ETH improves the in-distribution model on both markets by 9-27% relative to the better-performing single-symbol counterpart, indicating a substantially shared vol-surface manifold across the two largest cryptocurrencies over the observation window. The hybrid is calendar- and butterfly-arbitrage-free at the listed strikes, a property that the parametric smile re-fit alone fails at high mask rates. The per-snapshot reconstruction error of the trained model flags the late-October ETF-anticipation rally and the August $17$, $2023$ flash crash as elevated-error periods without supervision. All training and evaluation infrastructure is released to support reproducible follow-on work.

12.
arXiv (CS.AI) 2026-06-16

Sample from What You See: Visuomotor Policy Learning via Diffusion Bridge with Observation-Embedded Stochastic Differential Equation

arXiv:2512.07212v3 Announce Type: replace Abstract: Imitation learning with diffusion models has advanced robotic control by capturing the multi-modal action distributions. However, existing methods typically treat observations only as high-level conditions to the denoising network, rather than integrating them into the stochastic dynamics of the diffusion process itself. As a result, the sampling is forced to begin from random noise, weakening the coupling between perception and control and often yielding suboptimal performance. We propose BridgePolicy, a generative visuomotor policy that directly integrates observations into the stochastic dynamics via a diffusion-bridge formulation. By constructing an observation-informed trajectory, BridgePolicy enables sampling to start from a rich and informative prior rather than random noise, substantially improving precision and reliability in control. A key difficulty is that diffusion bridge normally connects distributions of matched dimensionality, while robotic observations are heterogeneous and not naturally aligned with actions. To overcome this, we introduce a semantic aligner to unify the visual and state inputs and align the observations with action representations, making diffusion bridge applicable to heterogeneous robot data. Extensive experiments across 52 simulation tasks on three benchmarks and 5 real-world tasks demonstrate that BridgePolicy consistently outperforms state-of-the-art generative policies. Our code is available at https://jianghcsr.github.io/BridgePolicy_page/.

13.
arXiv (CS.CV) 2026-06-17

WeaveLA: Event Driven Cross-Subtask Latent Memory Weaving for Repetitive Robot Manipulation

Vision-Language-Action (VLA) policies have achieved remarkable single-step manipulation, yet they remain brittle precisely where each stage depends on what was just completed. The core issue is structural: short-window VLAs lack an explicit channel for rouxting information across sub-task boundaries, and existing memory-augmented variants either write at every frame, retrieve from demonstration-time stages, or fire at sub-goal events without performing an explicit sub-task-to-sub-task hand-off into the action expert. We identify the sub-goal completion event as the natural temporal unit for cross-subtask memory hand-off, and present WeaveLA (Weave Latent memory for Vision-Language-Action policies), a cross-subtask memory interface that, on top of a frozen VLA backbone, compresses each completed segment into latent tokens via query-driven attention pooling and routes them directly into the action-generation path of the next sub-task. This event-triggered, action-side design preserves the base policy's short-window interface while adding a lightweight cross-subtask channel. Through stratified evaluation on RoboMME with a $\pi_{0.5}$ backbone, WeaveLA's gains land exactly where the channel is needed: on the hardest repetition slice (SwingXtimes, $N{=}3$), success rises from $0\%$ to $47.8\%$, while single-execution episodes remain unchanged. Per-episode paired analysis confirms the gains are confined to tasks whose causal structure requires cross-subtask information.

14.
arXiv (CS.LG) 2026-06-16

Empirical Study of Pop and Jazz Mix Ratios for Genre-Adaptive Chord Generation

作者:

arXiv:2605.04998v2 Announce Type: replace-cross Abstract: This revision updates a pop-to-jazz chord-generation rehearsal study. Best-epoch metrics still show that modest pop rehearsal preserves pop accuracy while improving jazz prediction, but v2 corrects released-checkpoint selection: the released F1 equals Phase 0, F2 had a transcription error, and ft-pop80-v2 restores a hash-distinct jazz-adapted F1 across 3 seeds.

15.
bioRxiv (Bioinfo) 2026-06-16

OmicOS: A Comprehensive Omics Ecosystem Infrastructure and Agent System for the AI Era

Biology has accumulated a vast ecosystem of omics methods, but much of this ecosystem remains built for expert humans rather than scientific agents. Methods are scattered across Python packages, R/Bioconductor and CRAN workflows, command-line tools, incompatible data containers and implicit object states, making even routine analyses difficult for an AI system to choose, execute and verify reliably. Here we introduce OmicOS, a comprehensive omics ecosystem infrastructure and agent system that turns OmicVerse V2, an open-source omics community, into an executable foundation for agentic biology. OmicVerse V2 provides the community substrate: scalable AnnDataOOM-compatible rust backends, agent-friendly Python algorithms for single-cell, spatial, bulk and multi-omics analysis, interfaces to single-cell foundation models, and Python-native reconstructions of historically R-centred Bioconductor/CRAN-style workflows. OmicOS makes this substrate actionable by registering analytical functions as state-aware capability contracts, allowing agents to inspect live data objects, select valid methods, execute controlled workflows and record provenance. The result is not a fixed pipeline, but a programmable omics environment in which agents compose real analyses from verified community methods rather than inventing tools. Across external and purpose-built benchmarks, OmicOS ranked first among the evaluated systems, reaching 81.2% on BiomniBench. Adding OmicVerse to a minimal agent improved task completion by up to 34.2 percentage points with qwen-3.6-35b, and controlled ablations showed that the gains came from registry-grounded execution rather than from larger models, documentation retrieval or unrestricted tool exposure. The same infrastructure scaled to atlas-sized data, reproduced R-centred workflows in Python and converted external pathology software into agent-usable skills. In a discovery task starting from a whole-body spatial map and the term Alzheimer disease, OmicOS composed a non-canonical workflow that integrated spatial expression, genetic association, eQTL and colocalization evidence to nominate a colon epithelial risk axis centred on PICALM, CD2AP and CR1. Together, OmicVerse and OmicOS define an open foundation for AI-era omics, showing how a community of biological methods can be transformed into a reliable, extensible and agent-operable system for discovery.

16.
arXiv (quant-ph) 2026-06-24

An Analysis of Speculative Window Decoders for Quantum Error Correction

arXiv:2606.24048v1 Announce Type: new Abstract: Fault-tolerant quantum computing is essential for realizing the substantial computational speedups that quantum computing can bring, but it requires real-time error decoding with high performance. Speculative window decoding improves performance by reducing the time spent waiting for dependencies from prior decoding windows. However, speculative decoders have only been evaluated under the regime of superconducting qubits with fast gate speeds, surface codes, and matching decoders. Since different quantum technologies can have slower gate speeds, we evaluate the performance of speculative decoding under slow gate speeds. We also examine its sensitivity to speculation accuracy, decoder latency, processor count, and workload parallelism, which can vary across different quantum error correction codes, decoders, and hardware platforms. This work presents design principles for identifying when speculative decoding yields the greatest performance improvements. It also reveals the conditions under which non-speculative decoders outperform speculative decoders.

17.
arXiv (CS.LG) 2026-06-18

Stochastic Thermodynamics and SDE-based Generative Models

作者:

arXiv:2606.18290v1 Announce Type: cross Abstract: SDE-based generative models, including diffusion models and the Schrödinger bridge, have found broad applications in signal processing tasks such as speech enhancement, image restoration, and time-series generation. This note presents a modeling framework for such models within the context of stochastic thermodynamics. The main results of this note are trajectory-level definitions of work, heat, and entropy production, along with a generalized Jarzynski identity and a second-law-like inequality. The proposed framework extends the original Jarzynski setup to accommodate time-dependent bath temperature and nonconservative driving forces. This thermodynamic perspective may deepen our understanding of diffusion models and the Schrödinger bridge from a nonequilibrium statistical mechanics viewpoint.

18.
arXiv (CS.AI) 2026-06-24

Promise and challenges of heart chamber segmentation from non-contrast CT scans using contrastive unpaired image translation: a feasibility study

arXiv:2606.23879v1 Announce Type: cross Abstract: Purpose: To evaluate the feasibility and challenges of heart chamber segmentation from non-contrast CT scans using contrastive unpaired image translation and deep learning-based segmentation. Approach: We developed ChameleonNet, a framework utilizing the Contrastive Unpaired Translation (CUT) network with decoupled contrastive learning (DCL) loss to synthesize non-contrast CT from contrast CT scans. Using annotations of four heart chambers (left atrium (LA), left ventricle (LV), right atrium (RA), and right ventricle (RV)) from contrast scans, we trained a Hausdorff distance loss-enhanced nnU-Net on synthesized non-contrast images. The translation model was trained with 35,538 contrast-enhanced and 37,197 non-contrast CT slices. The segmentation model was trained with 292 synthesized non-contrast scans. Performance was evaluated using Dice similarity coefficient (DSC) and 95th Hausdorff distance (HD95) on 36 synthesized non-contrast scans, and volume agreement on 36 real non-contrast CT scans was assessed using Pearson correlation, mean absolute percentage error (MAPE), and mean percentage error (MPE). Results: The segmentation model achieved DSC of 0.94 (0.01), 0.91 (0.04), 0.92 (0.03), 0.93 (0.02), and HD95 of 3.63 (1.49), 5.74 (4.08), 5.18 (1.77), 5.51 (3.21) mm on synthesized non-contrast images for LA, LV, RA, and RV, respectively. On real non-contrast CT scans, Pearson correlations were 0.93, 0.82, 0.87, and 0.89 (all p

19.
arXiv (CS.LG) 2026-06-12

Plan, Don't Pose: Long Composite Motion Generation with Text-Aligned BFM

arXiv:2605.29906v2 Announce Type: replace Abstract: Text-to-motion (T2M) generation has broad applications in character animation, virtual avatars, and human-robot interaction. Existing methods typically generate pose trajectories or motion tokens directly from language, forcing a single model to handle semantic interpretation, long-horizon structure, and low-level physical realization. This coupling makes them costly and often unreliable for long, compositional, or semantically dense prompts. We propose Text2BFM, the first framework that aligns natural language with pretrained Behavioral Foundation Models (BFMs) for T2M generation without relying on heavy end-to-end motion generators. Text2BFM operates in the latent policy space of a frozen BFM, using it as an executable motion prior. A text-aligned variational behavioral bottleneck compresses BFM policy-latent sequences into compact motion representations that are compatible with language and preserve long-horizon behavioral structure. Generation is performed in this compact behavioral manifold with a lightweight conditional generator, and the resulting latent encoded behaviors are decoded into policy latents that drive the pretrained frozen BFM. By decoupling semantic planning from motion execution, Text2BFM achieves efficient, robust T2M generation and strong performance on long, compositional textual descriptions.

20.
arXiv (CS.CL) 2026-06-24

Decoherence as Defence and the Magnitude of Noise Regularisation: A Rigorous N -Qubit Theory of Stochastic Quantum Neural Networks for Adversarially Robust Network Intrusion Detection

Stochastic quantum neural networks (SQNNs) encode neuronal activations as qubits, synaptic topology as entanglement, and neural noise through a Lindblad master equation. A recent conference study applied a ring-entangled SQNN to collaborative intrusion detection and reached three conclusions: ring entanglement is essential for non-local anomaly detection; an adversarial-resilience bound holds but is conservative; and the depolarising channel fails to act as a dropout-style regulariser, behaving instead as output noise. It left open whether a per-gate stochastic deactivation (``true quantum dropout'') could regularise where the depolarising channel could not, and whether the loose robustness bound could be replaced by a predictive theory. This paper resolves both and extends the framework to real data and to neutral-atom hardware. We give an $N$-qubit formulation through the stochastic master equation and its vectorised Liouvillian, and prove a decoherence-contraction theorem: a depolarising channel of strength $\gamma$ over $L$ entangling layers contracts every weight-$w$ Pauli read-out by a factor $(1-4\gamma/3)^{wL}$ (for the weight-$1$ read-out used here, $(1-4\gamma/3)^{L}$); building on the general noise-as-defence result of Du et al., we make this quantitative and operational for intrusion detection. On the real NSL-KDD dataset under white-box FGSM and PGD attacks, a depolarising SQNN trained with the channel is, over seven seeds under strong $\ell_\infty$/$\ell_2$ attacks, significantly more robust than the noiseless circuit ($\ell_\infty$ PGD-$20$, $p=0.04$, large effect) and, critically, never suffers the catastrophic robustness collapse that the noiseless model and gradient-trained classical detectors (which fall from $95\%$ to $47\%$) do, cutting robustness variance roughly twofold; we show this robustness arises from a noise-reshaped training boundary rather than from attack-time gradient contraction. For generalisation, we derive an adaptive-penalty formula showing that per-gate dropout implements a curvature-weighted $L_2$ penalty $\tfrac{p(1-p)}{2}\sum\theta^2\partial^2_\theta L$ in weight space, maximised at $p=1/2$, whereas depolarising noise implements an output-space penalty. A $30$-seed study confirms the formula's quantitative prediction: both mechanisms reduce the train-test gap by a small but statistically significant margin ($\approx\!0.01$; $p

21.
arXiv (CS.AI) 2026-06-17

ARVO: Atlas of Reproducible Vulnerabilities for Open-Source Software

arXiv:2606.17283v1 Announce Type: cross Abstract: Achieving reproducibility, quantity, and diversity in vulnerability datasets has long been viewed as an inherent three-way trade-off, where improving one dimension often comes at the cost of the others. In practice, reproducibility has been the dimension most often neglected. This has limited what can be automatically extracted from historical bug datasets, and has reduced their utility for downstream security research. In this work, we propose a method to produce a new security dataset which ensures reproducibility for diverse vulnerabilities at scale by identifying the key obstacles to large-scale bug reproduction and addressing them with general solutions. Using this method, we introduce full reproducibility to the largest open source software vulnerability dataset (OSS-Fuzz) and construct the ARVO dataset (an Atlas of Reproducible Vulnerabilities in Open-source software). ARVO is a large-scale dataset consisting of over 6,100 real-world vulnerabilities across 311 projects. Focusing on reproducibility, ARVO differs from existing datasets by providing each vulnerability in a form that can be consistently rebuilt, triggered, and analyzed across versions. Reproducibility also enables automatic identification of the corresponding patch for each vulnerability and supports direct interaction with vulnerabilities after code changes, capabilities that existing large-scale datasets do not provide. In our evaluation, ARVO successfully reproduces 81% of vulnerabilities and achieves 89.4% accuracy on the located patches. We also discuss ARVO's influence on both upstream practices and downstream security research.

22.
arXiv (CS.LG) 2026-06-17

Generalization Guarantees for Multi-Input Neural Operator Learning in Sobolev Spaces

arXiv:2606.17419v1 Announce Type: new Abstract: We develop approximation and generalization error estimates for multi-input neural operators, with the output error measured in Sobolev norms. In contrast to standard operator-learning settings with a single input function, our framework allows multiple input functions defined on possibly different domains, with different dimensions and Sobolev regularities. The derived rates explicitly quantify the contribution of each input space to the final error bound. In particular, in the balanced regime, the approximation and generalization rates are governed by the interaction between the input dimensions, regularities, and Sobolev orders, while the dependence on the model complexity retains a \(\log\log/\log\)-type structure. Our analysis provides a general theoretical framework for multi-input operator learning, including Sobolev training, and is applicable to operator learning problems arising from partial differential equations and scientific computing.

23.
arXiv (CS.AI) 2026-06-11

Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation

arXiv:2606.11990v1 Announce Type: cross Abstract: Remaining Useful Life (RUL) prediction is essential for industrial predictive maintenance, yet many learning-based approaches rely on extensive feature engineering or large labeled datasets to train task-specific sequence models. In this work, we introduce a lightweight learning approach, in which we leverage a frozen pretrained time-series foundation model (TSFM) and combine it with a small regression head for RUL estimation from multivariate sensor streams. More specifically, we use Chronos-2 as a frozen backbone to extract context window features and train a lightweight regression neural network for RUL prediction. Experiments on real-world industrial sensor data from two device types show that Chronos-2 features consistently improve over recurrent, convolutional, Transformer-based, and gradient-boosting baselines under the same preprocessing and evaluation protocol. We further analyze the impact of context length and find that performance improves significantly with longer histories, indicating that TSFM representation offer a practical and data-efficient alternative for RUL estimation in industrial settings.

24.
arXiv (CS.CV) 2026-06-24

Learning Ego-Centric BEV Representations from a Perspective-Privileged View: Cross-View Supervision for Online HD Map Construction

Bird's-eye-view (BEV) representations derived from multi-camera input have become a central interface for online high-definition (HD) map construction. However, most approaches rely solely on ego-centric supervision, requiring large-scale scene structure to be inferred from incomplete observations, occlusions, and diminishing information density at long range, where perspective effects and spatial sparsity hinder consistent structural reasoning. We introduce Cross-View Supervision (CVS), a representation learning paradigm that transfers geometric and topological priors from an ego-aligned overhead perspective into camera-based BEV encoders. Rather than adding auxiliary semantic losses, CVS aligns representations in a shared BEV feature space and distills globally consistent structural knowledge from a perspective-privileged teacher into the ego-centric backbone. This supervision enhances structural coherence without modifying the inference architecture or requiring overhead input at test time. Experiments on nuScenes using ego-aligned aerial imagery from the AID4AD cross-view extension demonstrate consistent improvements over StreamMapNet while maintaining identical camera-only inference. CVS yields +3.9mAP in the standard $60\times30\,\mathrm{m}$ region and +9.9mAP in the extended $100\times50\,\mathrm{m}$ setting, corresponding to a 44% relative gain at long range. These results highlight perspective-privileged structural supervision as a promising training principle for improving BEV representation learning in HD map construction.

25.
arXiv (CS.LG) 2026-06-12

Normative Robustness as a Frontier for Non-Verifiable Reasoning in LLMs

arXiv:2606.12731v1 Announce Type: new Abstract: As LLMs increasingly serve in advisory and deliberative roles, users rely on them for non-verifiable reasoning in domains lacking objective ground truths. However, traditional evaluations of LLM reasoning focus almost exclusively on fact-based domains, such as mathematics and science, leaving uncertainty over whether and to what degree models can handle ambiguous, subjective, or value-laden problems over time. To address this concern, we propose moral reasoning as a paradigmatic subdomain of non-verifiable reasoning. We define moral robustness as a model's capacity to exhibit sound moral reasoning across time and contexts, and we introduce a scalable, adversarial, multi-turn evaluation framework to empirically measure this capability. We simulate 48,000 user-agent moral deliberations across four frontier LLMs, varying premise relevance, premise order, conversation duration, and the user's stated moral view. We find that models successfully ignore morally-irrelevant distractors, but shift their reasoning by up to 6.5%, on average, towards the user's stated preferred moral view, and varying their reasoning depending on factors such as order (altering moral judgments by order in 13-22% of the cases) and duration (altering moral judgments between single-turn and multi-turn in 10-24% of the cases). Our analysis indicates that models tailor not just their final verdicts but their underlying justifications to align with a user's moral viewpoint - a failure mode we characterize as moral deliberative sycophancy.