Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.CV) 2026-06-25

Efficient Cross-Scale Invertible Hiding Network with Spatial-Frequency Collaboration and Non-Invertible Mechanism

Image hiding aims to conceal image-level messages within cover images at the same resolution. Invertible neural networks (INN)-based image hiding has emerged as an important branch. It treats concealing and revealing as a pair of inverse problems on image domain transformation and uses INN's forward and backward processes to address them. Due to architectural constraints, existing INN-based methods suffer from single-scale and single-domain feature extraction and limited nonlinear representation capability, resulting in inferior image quality. To mitigate these limitations, we propose an efficient cross-scale invertible hiding network with the spatial-frequency collaboration and the non-invertible mechanism, termed CrosInv. CrosInv exploits cross-scale and spatial-frequency collaborative features while enhancing nonlinear representation. Specifically, we introduce a cross-scale invertible module that bijectively maps inputs to cross-scale representations. To effectively integrate spatial and frequency information, the cross-scale invertible module employs pixel shuffle, Haar wavelet transformation, and their inverse operations for scale transformation. Furthermore, a non-invertible cross dense module is integrated to enhance the nonlinearity. Comprehensive experiments verify the effectiveness and superiority of the proposed CrosInv.

03.
arXiv (math.PR) 2026-06-24

Conditionally Poissonian random digraphs

arXiv:1705.03801v2 Announce Type: replace Abstract: We define a Poissonian model of directed random graphs which generalises the undirected Poissonian random graph process introduced by Norros and Reittu in Adv. Appl. Probab. 38 (2006), 59–75. Its loopless simple projection is a rank-one independent-arc inhomogeneous digraph of the type studied by Cao and Olvera-Cravioto, Random Struct. Alg. 56 (2020), 722–774. For the Poissonian multigraph itself, we discuss the relation to Norros-Reittu graphs, characterise limiting degree distributions, and record explicit exploration estimates. In particular, we give fixed-depth directed local weak limits, stopped branching-process couplings with weight-mass collision budgets, a comparison with the simple projection, and a rare-event concentration criterion. These estimates are intended as graph-side structural inputs for later dynamics on the graph.

04.
arXiv (CS.CL) 2026-06-18

Trust Region On-Policy Distillation

On-Policy Distillation (OPD) is a fundamental technique for efficient post-training of large language models (LLMs), with broad applications in agent learning, multi-task enhancement, and model compression. However, OPD training becomes unstable when the teacher and student distributions differ substantially, as teacher supervision on student-generated tokens may yield unreliable policy gradients and even cause optimization failure. This work addresses reliable on-policy token-level supervision through credit assignment strategies, and proposes Trust Region On-Policy Distillation, TrOPD. It features the following characteristics: 1) Trust-Region On-Policy Learning: TrOPD performs OPD only in regions where the teacher provides reliable supervision, mitigating the optimization difficulty of the K1 reverse-KL estimator under distribution mismatch. 2) Outlier Estimation: For outlier regions, we explore gradient clipping, masking, and forward-KL estimation to reduce the adverse effects of unreliable supervision. 3) Off-Policy Guidance: The student continues generation from teacher prefixes and uses forward KL to imitate off-policy guidance, encouraging on-policy exploration toward reliable regions. Experiments show that TrOPD consistently outperforms SoTA OPD baselines, including OPD, EOPD, and REOPOLD, across mathematical reasoning, code generation, and general-domain benchmarks.

05.
arXiv (CS.LG) 2026-06-17

VISTA: Scale-Aware Visual Navigation via Action History Conditioning

arXiv:2606.17294v1 Announce Type: cross Abstract: Vision Navigation Foundation Models (VNMs) promise end-to-end learned navigation policies capable of zero-shot deployment across diverse embodiments and environments. To maintain generality, many vision-based navigation models predict normalized actions. However, this normalization introduces a critical deployment vulnerability: applying different scaling factors to the same normalized trajectory alters its physical geometry, which degrades navigation performance and increases collision risks. We address this vulnerability by conditioning the model on normalized action histories alongside image observations, providing explicit context on the relationship between the model's predictions and the robot's actual physical displacement. Furthermore, current VNMs often struggle in visually repetitive environments that lack distinct features. To resolve this issue, we integrate a DINOv3 encoder, whose richer representations enable our model to capture both spatial and geometric dimensions between observations. VISTA generalizes robustly to out-of-distribution environments, achieving 100% goal prediction accuracy in zero-shot, real-world deployment in Outdoor, Forest and Office settings, and an average of 95% checkpoints crossed, demonstrating consistent path following in unseen environments.

06.
arXiv (CS.AI) 2026-06-24

CRAFT: A Tendon-Driven Hand with Hybrid Hard-Soft Compliance

arXiv:2603.12120v2 Announce Type: replace-cross Abstract: We introduce CRAFT hand, a tendon-driven anthropomorphic hand with hybrid hard-soft compliance for contact-rich manipulation. The design is based on a simple idea: contact is not uniform across the hand. Impacts concentrate at joints, while links carry most of the load. CRAFT places soft material at joints and keeps links rigid, and uses rollingcontact joint surfaces to keep flexion on repeatable motion paths. Fifteen motors mounted on the fingers drive the hand through tendons, keeping the form factor compact and the fingers light. In structural tests, CRAFT improves strength and endurance while maintaining comparable repeatability. In teleoperation, CRAFT improves handling of fragile and low-friction items, and the hand covers 33/33 grasps in the Feix taxonomy. The full design costs under $600 and will be released open-source with visionbased teleoperation and simulation integration. Project page: http://craft-hand.github.io/

07.
arXiv (CS.CV) 2026-06-24

DLTPose: 6DoF Pose Estimation From Accurate Dense Surface Point Estimates

We propose DLTPose, a novel method for 6DoF object pose estimation from RGBD images that combines the accuracy of sparse keypoint methods with the robustness of dense pixel-wise predictions. DLTPose predicts per-pixel radial distances to a set of minimally four keypoints, which are then fed into our novel Direct Linear Transform (DLT) formulation to produce accurate 3D object frame surface estimates, leading to better 6DoF pose estimation. Additionally, we introduce a novel symmetry-aware keypoint ordering approach, designed to handle object symmetries that otherwise cause inconsistencies in keypoint assignments. Previous keypoint-based methods relied on fixed keypoint orderings, which failed to account for the multiple valid configurations exhibited by symmetric objects, which our ordering approach exploits to enhance the model's ability to learn stable keypoint representations. Extensive experiments on the benchmark LINEMOD, Occlusion LINEMOD and YCB-Video datasets show that DLTPose outperforms existing methods, especially for symmetric and occluded objects. The code is available at https://anonymous.4open.science/r/DLTPose_/ .

08.
arXiv (CS.CV) 2026-06-25

Geometry-Anchored Transport Framework for Exemplar-Free Class-Incremental Learning

Exemplar-free class-incremental learning (EFCIL) requires stable decision boundaries within a shifting feature space. While maintaining class-conditional Gaussian statistics provides a principled classification strategy, these parametric summaries remain sensitive to anisotropic representation drift. Existing methods often transport these statistics across tasks using a decoupled, post-hoc paradigm: optimizing a backbone without explicit geometric constraints can distort the legacy manifold, limiting the precision of retroactive alignment. In this paper, we formulate feature transport as an endogenous training constraint rather than a separate post-task step, presenting the Geometry-Anchored Transport Framework. First, we derive an Analytic Geometric Anchor via Mahalanobis-aligned regression to mitigate macroscopic anisotropic drift. Second, we introduce a Topology-Aware Evolution objective that regularizes localized manifold degradation while calibrating a residual network against the analytic prior. By coupling manifold evolution with transport constraints during the primary training phase, our framework mitigates evaluation errors without requiring decoupled fine-tuning. Experiments across CIFAR-100, TinyImageNet, and ImageNet-100 demonstrate that the proposed framework consistently improves upon existing post-hoc alternatives under strict exemplar-free constraints.

09.
arXiv (CS.LG) 2026-06-19

Comparative Study of Neural Surrogate Architectures for Autoregressive Prediction of Internal Battery States

arXiv:2606.20053v1 Announce Type: new Abstract: The Doyle-Fuller-Newman (DFN) model resolves internal electrochemical states in lithium-ion batteries with high fidelity. However, the numerical solution of its governing equations is computationally prohibitive for real-time deployment, limiting scalability from individual cells to pack and fleet-scale applications. While machine learning surrogates can substantially reduce inference latency through GPU acceleration, most existing approaches learn solution approximations tied to specific operating conditions rather than learning generalizable state-evolution dynamics. This work presents a systematic comparison of four neural network architectures (MLP, ResNet, U-Net, FNO) formulated as autoregressive state-transition operators that predict full DFN internal states across a wide range of operating conditions. To ensure a controlled architectural comparison, all models are trained under a unified framework using multi-step unrolling and current-conditioning, isolating the impact of spatial inductive bias. Results demonstrate that the U-Net's multi-scale feature hierarchy achieves a mean final-step nRMSE of 3% averaged across all internal state variables after 300-step autoregressive rollouts, while providing a 5.38x speed-up over the numerical solver. These findings highlight spatial inductive bias as a critical determinant of surrogate performance, advancing the development of surrogates for internal state observability for next-generation battery management systems and digital twins.

10.
arXiv (CS.CV) 2026-06-16

ATV-Net: Adaptive Triple-View Network with Dynamic Feature Fusion

Recent advances in semantic segmentation rely heavily on attention-based and transformer-style architectures that, while accurate, introduce considerable architectural complexity and computational cost. This paper asks whether a compact CNN-based segmentation head can remain competitive by adaptively selecting useful receptive-field evidence. We propose ATV-Net, an Adaptive Triple-View Network that attaches a lightweight head to a conventional backbone. The head organizes three complementary views – point-wise, neighborhood-level, and enlarged context – and fuses them through an Adaptive Decision Gate that generates image-dependent weights from global feature statistics. This allows the model to emphasize different receptive-field responses according to scene content, without dense attention or multi-scale aggregation. Experiments on Cityscapes and Pascal VOC 2012 show that ATV-Net achieves 80.31% mIoU on Cityscapes with ResNet-101 and 80.90% with ConvNeXt-Tiny, and 86.7% and 88.5% mIoU on Pascal VOC 2012, respectively, while requiring fewer GFLOPs than representative context-aggregation and attention-based heads. The results indicate that adaptive receptive-field selection remains a practical and effective design choice for CNN-based semantic segmentation.

11.
arXiv (CS.AI) 2026-06-16

On-Policy Distillation with Curriculum Turn-level Guidance for Multi-turn Agents

arXiv:2606.15912v1 Announce Type: cross Abstract: Multi-turn agents that plan, invoke tools, and interact with environments offer a promising paradigm for solving complex tasks, yet their capabilities typically rely on very large models whose inference cost is prohibitive in practice.On-Policy Distillation (OPD) is a natural recipe for transferring such capabilities to smaller students, but we find that it suffers a characteristic failure mode in this setting: small student errors compound across turns and push the trajectory out of the teacher's familiar state distribution, so the teacher's supervision becomes least reliable precisely where the student needs it most.We propose Guided On-Policy Distillation (Guided-OPD), a simple yet effective algorithm that mixes teacher- and student-generated turns within each rollout and schedules the teacher's intervention probability along a curriculum that decays to zero.Strong guidance keeps early trajectories close to the teacher distribution and is then gradually withdrawn to recover the purely on-policy regime used at inference.On ALFWorld, ScienceWorld, and WebShop, distilling Qwen3 students from a Qwen3-30B-A3B teacher, Guided-OPD improves Score by 21.1\% and Success Rate by 25.5\% over vanilla OPD on average, with larger gains on smaller students.

12.
arXiv (CS.LG) 2026-06-12

Single vs. Multiple Branches in DeepONet and S-DeepONet: Network Architecture Follows Coupling in Multiphysics Systems

arXiv:2507.03660v2 Announce Type: replace Abstract: `Real-time prediction of complex physical systems requires surrogate models that learn from data while representing strong multiphysics coupling. Deep Operator Networks have shown success in single-physics problems, yet their effectiveness in capturing nonlinear interactions in coupled systems (such as thermo-mechanical or electro-thermal coupling) remains underexplored. Here we pose a practical question: should the architecture of a neural operator reflect the strength of physical coupling it aims to model? We compare single-branch and multi-branch designs, in both feedforward and sequential recurrent forms, across three representative systems: a reaction–diffusion problem with heterogeneous sources, a nonlinear thermo-electrical problem with temperature-dependent conductivity and Joule heating, and a viscoplastic thermo-mechanical model of steel solidification. Single-branch networks consistently outperform multi-branch variants in tightly coupled regimes by encouraging shared latent representations, whereas multi-branch designs remain favorable for decoupled or single-physics tasks. Once trained, these surrogates deliver full-field predictions up to $1.8 \times 10^4$ times faster than physics-based solvers.

13.
arXiv (CS.CV) 2026-06-15

SED:Lightweight Saliency prediction for Event-based data via Distillation

Event-based saliency prediction has gained attention recently, as combining event cameras with saliency estimation can act as an upstream stage that naturally improves the efficiency of downstream eventbased perception at the edge. However, current approaches are either neuromorphic, underperforming on event-based saliency benchmarks, or too heavy for resource-constrained edge applications due to their reliance on transformers or 3D convolutions. Drawing inspiration from efficient convolutional modules, SED and aiming to exploit the temporal information in event data, we propose a lightweight network, trained through knowledge distillation, built on a Depthwise Spatio-Temporal Block (DSTconv) – a factorization of the 3D depthwise separable convolution. Relative to its teacher, our model reduces the model size from 180 MB to 0.32 MB (562x) and the parameter count from 45M to 81k (554x), while matching or outperforming it on the N-DHF1K and N-UCF Sports datasets. Moreover, it generalizes strongly beyond its training distribution, transferring from synthetic to real event data where a model trained from scratch fails.

14.
arXiv (CS.CL) 2026-06-15

The Linguistics Olympiads: Towards a New Corpus for Linguistics Research?

Linguistics olympiad problems (LOPs) are a category of self-sufficient puzzles consisting of a scaled-down corpus representative of certain linguistic phenomena, from which the solver must deduce a primitive set of rules of the language and then translate a new set of elements. The linguistics olympiads (LOs) have become a worldwide phenomenon with 43 different territories taking part in the International Linguistics Olympiad (IOL) 2025. While the typology and solving strategies of LOPs have been analysed, their scientific facet and connections to academic linguistics have yet to be explored. LOPs are directly connected to many linguistic fields, e.g., linguistic typology, linguistic relativity, and linguistics fieldwork. Recently, LOPs have become a research focus as benchmarks for large language models, thus highlighting their usefulness in computational linguistics. Nevertheless, they have not yet been integrated into mainstream linguistics research. This paper attempts to open new directions of including this particular type of puzzle in academic research by offering a structured evaluation of LOPs as linguistic data sources and proposes criteria for their responsible use in academic research. Starting from a set of over 1800 LOPs, this study critically examines the potential of LOPs as a novel corpus for linguistics research by discussing their strengths and limitations as tools, as well as the areas of linguistics into which these problems could fit. This work forms the foundation for a broader initiative aimed at bridging the gap between LOs and academic linguistics, by establishing a robust theoretical framework for LOPs.

15.
arXiv (CS.AI) 2026-06-19

Class-Incremental Motion Forecasting

arXiv:2603.09420v3 Announce Type: replace-cross Abstract: Motion forecasting enables autonomous vehicles to anticipate scene evolution by predicting the future trajectories of dynamic agents. However, existing approaches typically assume a closed-world setting with a fixed object taxonomy and access to high-quality perception, limiting their applicability in the real world where perception is imperfect, and new object classes may emerge over time. In this work, we introduce class-incremental motion forecasting, a novel setting in which new object classes are sequentially introduced over time and future object trajectories are predicted directly from camera images. We propose the first end-to-end framework for this setting, which adapts to newly introduced classes while mitigating catastrophic forgetting of previously learned ones. Our method generates motion forecasting pseudo-labels for known classes and matches them with 2D instance masks from an open-vocabulary segmentation model. This 3D-to-2D keypoint voting mechanism filters inconsistent and overconfident predictions, while a query feature variance-based replay strategy samples informative past sequences to preserve prior knowledge. Extensive evaluations on nuScenes and Argoverse 2 show that our approach successfully preserves performance on known classes while effectively adapting to novel ones. We further demonstrate zero-shot transfer to real-world driving and show that the framework extends naturally to open- and closed-loop end-to-end class-incremental planning on nuScenes and NeuroNCAP. Code and models will be made publicly available at https://omen.cs.uni-freiburg.de.

16.
arXiv (CS.CV) 2026-06-16

Wasserstein Equilibrium Decoding for Reliable Medical Visual Question Answering

Small vision-language models (2-8B) are well-suited for clinical deployment due to privacy constraints, limited connectivity, and low-latency requirements favouring on-device or on-premise inference. However, their limited capacity exacerbates the generation of plausible but incorrect outputs. We extend game-theoretic decoding, previously restricted to text-only, closed-ended NLP tasks, to vision-language models for open-ended Medical VQA. We introduce a semantically aware Wasserstein stopping criterion that replaces lexical order matching, enabling convergence based on semantic consensus among near-synonymous candidate answers and avoiding unnecessary iterations caused by clinically equivalent ranking swaps. On VQA-RAD and PathVQA, we obtain consistent, statistically significant improvements over greedy and discriminative baselines. On VQA-RAD, we improve Qwen3-VL-2B by +3.5 percentage points (p < 0.01), surpassing the greedy 4B model, with similar trends at larger scales. On PathVQA, Gemma-3-4B with BDG matches MedGemma-4B under greedy decoding despite no domain-specific fine-tuning. At accuracy parity with classic BDG, the Wasserstein criterion reduces average convergence iterations by approximately 20%, improving inference efficiency while preserving the game-theoretic equilibrium behaviour. Code is available at https://github.com/luca-hagen/ Wasserstein-BDG-medical-VQA.

17.
medRxiv (Medicine) 2026-06-24

Reducing stillbirth in high burden settings using biomarkers and ultrasound technologies: protocol for the multi-centre prospective iTECH cohort study

Introduction Stillbirth prevention requires reliable detection of potential causes for timely interventions. Currently, there is no effective screening strategy to identify fetuses at risk of stillbirth. Prognostic models have been proposed as a potential solution, but there is a shortage of robust, clinically applicable models in low- and middle-income countries. Early birth is frequently initiated without proper risk stratification, leading to increased neonatal and infant morbidity and mortality. This study aims to develop and validate multi modal multivariable prediction models for stillbirth and pathologies that lead to stillbirth (preeclampsia & fetal growth restriction) using widely accessible and cost-effective markers. Stakeholder perspectives will also be assessed. Methods and analysis This multi-center prospective cohort study is running in four high volume regional referral hospitals in Uganda: Kawempe, Hoima, Lira, and Mbale. We will enroll at least 6,075 pregnant women attending routine antenatal care (ANC), above 13 years of age, and greater than or equal to11 weeks of gestation. Data and biological samples will be collected at 11-23 weeks, 35-37 weeks and at birth in all women. In a subset of women, additional measurements will be obtained between 24-34 weeks and 38-42 weeks to allow for spread of the data across the full spectrum of pregnancy. This data will enable us to investigate the physiological changes with gestational development in healthy or unhealthy pregnancies, to guide future monitoring and management of women and establishment of reference values for novel markers. The placenta will be collected for histopathological analysis in women diagnosed with intrauterine fetal demise at greater than or equal to 20 weeks of gestation, stillbirth nearmiss and their corresponding controls. Data on socio-demographics, obstetric history, current pregnancy conditions, and tests such as maternal hemodynamics, ultrasound, and biochemical markers will be collected from each participant, and used to develop regression and machine learning prediction models. Models will be validated and evaluated by comparing their calibration plots, precision and recall, F1 scores and accuracy, aiming for less complexity and reliable predictions. Emerging models will be translated into software as a medical device (SAMD), while taking into account user experiences, regulatory requirements, data pipelines in clinical workflows and user-friendly interfaces that facilitate access and the interpretation of outputs, to allow for seamless integration into existing electronic health information systems and decision support tools. To assess stakeholder perceptions, we will employ an exploratory qualitative component using focus group discussions, semi-structured and key informant interviews. The sample will include 81 purposively selected women and their partners who use maternity care services, local leaders and healthcare providers in and out of the four hospitals implementing iTECH in Uganda. Qualitative data will be audio recorded, transcribed verbatim and thematic analysis performed using Nvivo 12.

18.
arXiv (CS.AI) 2026-06-24

What Does ODRL Mean? A Cross-Level Ontological Grounding of Permissions, Prohibitions, and Duties in UFO-L

arXiv:2606.24344v1 Announce Type: cross Abstract: ODRL policy evaluators produce verdicts, but say nothing about the normative positions a policy brings into existence, the authority structures those positions presuppose, or who holds the power to declare a norm violated. We formulate the Cross-Level Design Principle: any normative language with violable, consequential norms requires both conduct-level positions (Permission, Duty, Right, No right) and competence-level positions (Power, Subjection, Immunity, Disability). Applying this to ODRL, we establish that prohibition is sanctioned (violation possible and consequential), that permission is underspecified across its behaviour parameter (open vs. closed world), and that the formal semantics covers achievement obligations only. We ground ODRL in UFO-L, mapping each activated rule to a simple legal relator and extending coverage from two to eight legal positions; violation-declaration authority, implicit in every existing evaluator, becomes an explicit Power-Subjection pair. All axioms are mechanically verified in Isabelle/HOL and across a 39-problem benchmark under Vampire, E, and Z3.

19.
arXiv (CS.AI) 2026-06-24

ATHENA: Agentic Team for Hierarchical Evolutionary Numerical Algorithms

arXiv:2512.03476v3 Announce Type: replace-cross Abstract: Progress in computational science depends on complex numerical workflows that must faithfully encode physical laws, yet translating conceptual insight into reliable code remains a major bottleneck. Although large language models can generate isolated code fragments, they lack the structured reasoning required to design, verify, and iteratively refine complete scientific pipelines. Here we introduce ATHENA, an agentic framework explicitly designed to emulate scientific research modeled as a knowledge-driven contextual bandit process. Its core loop separates conceptual policy from numerical realization through expert-derived conceptual scaffolding, enabling principled diagnosis, reformulation, and repair of computational strategies. Across scientific computing and scientific machine learning tasks, ATHENA autonomously derives and correctly applies exact analytical solutions, constructs stable numerical solvers, diagnoses ill-posed formulations, and orchestrates hybrid symbolic-numeric workflows. Quantitatively, ATHENA matches and frequently surpasses the accuracy of expert-authored reference solutions reported in the literature on canonical benchmarks. By reframing computation as an object of agentic reasoning, our framework enables autonomous orchestration of heterogeneous algorithms across scientific domains.

20.
arXiv (CS.AI) 2026-06-17

The Price of Anarchy in Disaggregated Inference

arXiv:2606.17081v1 Announce Type: cross Abstract: Disaggregated inference architectures physically separate prefill and decode phases onto distinct GPU pools, creating competing "agents" that share a fixed hardware budget. We provide, to our knowledge, the first formal game-theoretic analysis of this architecture, using NVIDIA Dynamo as a concrete case study. We model disaggregated serving as three coupled games: a two-player resource game between prefill and decode pools, a selfish caching game over the hierarchical KV cache, and a congestion game with positive externalities for request routing. We empirically validate the latter two; the P/D resource game is treated analytically (Section 9.2). We characterize how GPU saturation induces regime transitions that shift the game's payoff structure: below saturation, selfish behavior has bounded Price of Anarchy (PoA); at saturation, superlinear latency and cache externalities drive our empirical estimator PoA-hat (defined in Section 6.4) upward. Based on this analysis, we design an adaptive controller that detects saturation transitions in real time and adjusts routing parameters accordingly, shifting from cache-affinity exploitation to load-balanced congestion avoidance. We instantiate our framework on a 3-node NVIDIA B200 cluster running Dynamo with two models, Nemotron-4-340B (TP=8, full-node workers with cross-InfiniBand KV transfers) and Llama-3.1-70B (TP=4), and find the same three-regime PoA-hat structure with the same first post-knee grid point (C=128) on both models. Adaptive routing shifts each model to a better operating point. Our strongest result is on the 70B 1P/5D topology, where PoA-hat drops 3.1x (66.4 to 21.5) in the saturated phase at a 13% throughput cost. On the 70B 1P/2D, PoA-hat drops 2.2x and TTFT P99 drops 7.6x (see Section 8.5).

21.
arXiv (CS.AI) 2026-06-17

ARVO: Atlas of Reproducible Vulnerabilities for Open-Source Software

arXiv:2606.17283v1 Announce Type: cross Abstract: Achieving reproducibility, quantity, and diversity in vulnerability datasets has long been viewed as an inherent three-way trade-off, where improving one dimension often comes at the cost of the others. In practice, reproducibility has been the dimension most often neglected. This has limited what can be automatically extracted from historical bug datasets, and has reduced their utility for downstream security research. In this work, we propose a method to produce a new security dataset which ensures reproducibility for diverse vulnerabilities at scale by identifying the key obstacles to large-scale bug reproduction and addressing them with general solutions. Using this method, we introduce full reproducibility to the largest open source software vulnerability dataset (OSS-Fuzz) and construct the ARVO dataset (an Atlas of Reproducible Vulnerabilities in Open-source software). ARVO is a large-scale dataset consisting of over 6,100 real-world vulnerabilities across 311 projects. Focusing on reproducibility, ARVO differs from existing datasets by providing each vulnerability in a form that can be consistently rebuilt, triggered, and analyzed across versions. Reproducibility also enables automatic identification of the corresponding patch for each vulnerability and supports direct interaction with vulnerabilities after code changes, capabilities that existing large-scale datasets do not provide. In our evaluation, ARVO successfully reproduces 81% of vulnerabilities and achieves 89.4% accuracy on the located patches. We also discuss ARVO's influence on both upstream practices and downstream security research.

22.
arXiv (CS.CV) 2026-06-16

DeepMine-Mamba: Mitigating Information Dilution in Mamba-Based State Space Models for Document Image Binarization

Document image binarization aims to separate foreground text from degraded backgrounds while preserving thin, broken, and low-contrast strokes. Although deep learning methods have improved binarization performance, most existing approaches rely on convolutional, transformer-based, or generative architectures, while Mamba-based state space models remain largely unexplored for this task. In this work, we investigate Mamba-based feature propagation and observe that direct state-space propagation may dilute weak foreground cues during long-range modeling, especially faint ink traces, fragmented characters, and boundary-sensitive stroke details. To address this problem, we propose DeepMine-Mamba, a Mamba-based binarization framework equipped with a novel Anti-Dilution Gate that estimates propagation-induced feature changes and selectively restores stroke-sensitive local responses while suppressing unnecessary background enhancement. Experiments on DIBCO/H-DIBCO benchmarks under a strict leave-one-year-out protocol show that DeepMine-Mamba achieves competitive overall performance, with strong average FM and Fps across benchmark years. Ablation results further show that the Anti-Dilution Gate is the key component for mitigating propagation-induced foreground dilution and improving stroke preservation.

23.
arXiv (CS.LG) 2026-06-24

AsyncOPD: How Stale Can On-Policy Distillation Be?

arXiv:2606.24143v1 Announce Type: new Abstract: On-policy distillation (OPD) trains a student on its own rollouts guided by teacher feedback and is becoming increasingly important for large language model (LLM) post-training. Like reinforcement learning (RL), however, OPD faces an on-policy systems bottleneck, as rollouts can dominate training time for reasoning workloads. Asynchronous training pipelines can alleviate this bottleneck by decoupling rollout generation from learner updates, but doing so introduces stale-policy data. While prior work has studied stale data in asynchronous RL, its effects in OPD remain underexplored. We present the first systematic study of staleness in asynchronous OPD, focusing on a practical setting where teacher feedback is implemented through local KL losses and full-vocabulary teacher logits are too expensive to store or transfer, necessitating finite teacher-score caches. We first show that KL direction changes the stale-data problem: teacher-weighted forward KL is more robust to stale rollouts, whereas student-weighted reverse KL is vulnerable. Second, for this vulnerable reverse-KL case, we study whether methods designed to stabilize asynchronous RL can mitigate OPD staleness. In our experiments, they do not improve over a simpler OPD-specific surrogate: recomputing the reverse-KL signal under the current student at learner time. Third, we analyze how finite teacher-score caches create a bias-variance tradeoff for sparse and sampled reverse-KL OPD estimators. This motivates multi-sample Monte Carlo (MC), which preserves MC correctability while reducing one-sample variance. Finally, we present and open-source AsyncOPD, a fully asynchronous OPD training pipeline built from these estimator choices. Experiments show that AsyncOPD improves training throughput by $1.6\times$ to $3.8\times$ over strict synchronous training while reaching comparable accuracy.

24.
bioRxiv (Bioinfo) 2026-06-18

ScriptManager: a platform for scalable and reproducible high-resolution analysis of genomics datasets

Background: The growing diversity of genomic and epigenomic assays has driven a parallel expansion in data formats, analysis workflows, and figure-generation tools. However, tools for analyzing data and assembling publication-quality figures are often specialized to a specific assay, dramatically limiting their interoperability and reproducibility. Results: We present the v1.0 release of ScriptManager, a Java-based framework for modular and reproducible analysis and visualization workflows of genomics and epigenomics data. Unlike existing tools specialized for individual assay types, ScriptManager provides a unified and extensible framework for cross-assay visualization and workflow reproducibility. The v1.0 release adds novel analytical modules, GUI session logging, automated unit and integration testing, tutorials, and expanded documentation. It also integrates with the broader reproducibility ecosystem through Singularity containers, Anaconda packaging, and Galaxy XML wrappers. We demonstrate ScriptManager's TagPileup scaling from local single-core execution to a 10,305-job analysis distributed across the Open Science Grid (OSG), with the full workload completing in

25.
arXiv (CS.LG) 2026-06-11

Knowledge Manifold: A Riemannian Geometric Framework for Semantic Mapping and Geodesic Analysis of Scientific Literature

arXiv:2606.05907v2 Announce Type: replace-cross Abstract: We present the knowledge manifold: a Riemannian geometric space in which a corpus of documents is arranged according to semantic positional relationships derived from character n-gram TF-IDF representations. The framework proceeds in five tightly coupled stages. First, each document is converted to a character-level n-gram TF-IDF vector (4-7 grams, up to 250,000 features, L2-normalized) and embedded in a two-dimensional knowledge map via constrained stress minimization with repulsion, variance, and centering regularizers. Second, knowledge at an arbitrary query point is estimated through Smoothed Particle Hydrodynamics (SPH) interpolation using a cubic-spline kernel, yielding an interpolated TF-IDF feature vector that can be linguistically characterized. Third, directional knowledge gradients at 0, 45, and 90 degrees are computed from the SPH interpolation map, and pairwise directional similarity is quantified via inner product and cosine similarity. Fourth, a Gaussian Process Regression (GPR) model, with a Constant x RBF + White kernel fitted on a 10-dimensional SVD projection, provides a Bayesian posterior mean, uncertainty estimate, and per-document contribution rate at the query point. Fifth, geodesics in the knowledge space are obtained by minimizing a discrete Riemannian path energy derived from the SPH-induced metric tensor, using L-BFGS-B with seven deterministic initial-path candidates. We apply the formulation to a corpus of 20 papers in fiber-reinforced composite materials and aerospace structural mechanics, showing that the semantic map recovers meaningful research clusters, geodesic paths reveal natural conceptual bridges between distant topics, and SPH/GPR interpolation enables the generation of virtual knowledge: hypothetical paper abstracts describing unstudied but geometrically predicted research directions.