论文广场 - AcademicHub

01.

arXiv (CS.CL) 2026-06-16 DOI: arXiv:2310.06555

It's About Time: Temporal References in Emergent Communication

作者:

Olaf Lipinski ↗Adam J. Sobey ↗Federico Cerutti ↗Timothy J. Norman ↗

Emergent communication enables agents to develop bespoke languages that improve communication efficiency. Despite the known importance of temporal structure in natural language, there is no existing evidence of temporal references in emergent communication. This paper addresses this gap, by exploring how agents communicate about temporal relationships. We analyse three potential factors for the emergence of temporal references: environmental, external, and architectural. Our experiments demonstrate that altering the loss function is insufficient for temporal references to emerge; rather, architectural changes are necessary. A minimal change in agent architecture, using a different batching method, allows the emergence of temporal references. This modified design is compared with the standard architecture in a temporal referential games environment, which emphasises temporal relationships. The analysis shows that over 95% of the agents with the modified batching method develop temporal references, without changes to their loss function. We consider temporal referencing necessary for future improvements to the agents' communication efficiency, enabling future agents to use a closer to optimal coding as compared to purely compositional languages. These insights provide the basis for incorporation of temporal references into other emergent communication settings, and investigation of other aspects of language.

阅读与讨论 → 访问原文 →

02.

arXiv (CS.CV) 2026-06-12 DOI: arXiv:2606.12633

ECA: Efficient Continual Alignment for Open-Ended Image-to-Text Generation

作者:

Jiangtao Kong ↗Peijun Zhao ↗Chun-Fu Chen ↗Youngwook Do ↗Shaohan Hu ↗Tianyi Zhou ↗Huajie Shao ↗

Incremental Learning (IL) for Open-ended Image-to-Text Generation (OpenITG) enables models to continuously generate accurate, contextually relevant text for new images while preserving previously acquired knowledge. Unlike prior studies, this paper addresses a more practical scenario in which the predominant category of visual data shifts over time as environments evolve. In this context, we introduce a new notion of continual alignment, which incrementally adapts the alignment module within pre-trained VLMs to preserve high-quality cross-modal representations. Based on this idea, we propose Efficient Continual Alignment (ECA), a novel exemplar-free IL approach for OpenITG. The key challenge is enabling the model to acquire new, task-specific features while minimizing interference with the established alignment without accessing raw data from previous tasks. To address this, ECA employs three core mechanisms: a Mixture of Query (MoQ) module that adapts task-specific query tokens, a Fisher Dynamic Expansion (FeDEx) that dynamically expands model structure based on a Fisher Information Matrix (FIM)-based metric, and an embedding dictionary with Dictionary Replay (DR) to retain past knowledge. To evaluate ECA's performance, we construct four new IL OpenITG benchmarks that better reflect real-world scenarios. Experimental results demonstrate that ECA significantly mitigates catastrophic forgetting and improves IL performance compared to baseline methods. Code and benchmarks are available at https://github.com/Snowball0823/ECA.

阅读与讨论 → 访问原文 →

03.

arXiv (CS.CV) 2026-06-16 DOI: arXiv:2606.16474

MVOFormer: Flow-Semantic Transformer for Robust Monocular Visual Odometry

作者:

Jituo Li ↗Shunwang Sun ↗Jialu Zhang ↗Xinqi Liu ↗Jinyao Hu ↗Zhicheng Lu ↗Sajad Saeedi ↗Guodong Lu ↗

Monocular visual odometry (MVO) is foundational to autonomous navigation and robotic localization. However, existing learning-based MVO approaches often struggle with either a lack of interpretable, complementary features or overly complex multi-stage architectures. These limitations inherently restrict their robustness and cross-domain generalization. In this work, we propose MVOFormer, a novel transformer framework for robust monocular visual odometry. Our architecture features a Flow-Semantic Dual Branch Encoder that synergizes dense geometric motion cues with object-centric semantic priors, explicitly distinguishing static structures from dynamic distractors. These representations are then fused by an Iterative Multimodal Decoder, enabling coarse-to-fine pose refinement while dynamically suppressing attention on unreliable regions. Extensive evaluations demonstrate that, without any target-domain fine-tuning, MVOFormer achieves superior zero-shot generalization and robustness, significantly outperforming prior learning-based frame-to-frame methods across diverse benchmarks including TartanAir, KITTI, TUM-RGBD, and ETH3D-SLAM.

阅读与讨论 → 访问原文 →

04.

arXiv (quant-ph) 2026-06-12 DOI: arXiv:2606.13499

Observation of Non-Gaussian Magnon Dynamics in a Two-Dimensional Long-Range XY Model

作者:

S. -A. Guo ↗J. -Y. Tan ↗J. Ye ↗Y. Jiang ↗L. Zhang ↗Y. -X. Chen ↗H. -J. Chen ↗H. -Y. Hu ↗W. -X. Guo ↗B. -X. Qi ↗L. He ↗Z. -C. Zhou ↗…

arXiv:2606.13499v1 Announce Type: new Abstract: Non-Gaussian evolution of high-order spin correlations characterizes important properties of quantum many-body systems. In practice, decoherence, statistical fluctuation and miscalibration of experimental parameters all hinder the witness of non-Gaussian dynamics. Here we demonstrate the crossover between Gaussian and non-Gaussian dynamics on a two-dimensional XY model with long-range and spatially structured interaction using a trapped ion quantum simulator. We prepare different initial densities of magnon excitations and verify the dynamics of single-spin observables for the engineered Hamiltonian. Then we compare the high-order spin correlations with the mean-field solution and the Holstein-Primakoff approximation, and demonstrate the non-Gaussian behavior in a way independent of the calibration errors. Our work provides a verifiable path from classically simulatable dynamics to regimes where quantum advantage may emerge.

阅读与讨论 → 访问原文 →

05.

arXiv (CS.AI) 2026-06-16 DOI: arXiv:2606.16326

Gaming-Resistant Insurance Contracts for Autonomous AI Agents: Strategy-Proof Toll Mechanism Design

作者:

Hao-Hsuan Chen ↗

arXiv:2606.16326v1 Announce Type: cross Abstract: Paper A defines a time-consistent actuarial runtime that prices each side-effect-bearing action against a contractually fixed safe default and gates execution against a reserve budget. It treats the operator as passive. This paper makes the operator strategic. We characterise a five-attack space for autonomous AI-agent insurance contracts and prove when the actuarial runtime is gaming-resistant. Two attack surfaces – post-toll safe-default selection and within-boundary action splitting – are closed by Paper A's minimal-authority and no-splitting clauses. The remaining three require new contract clauses. First, common-control aggregation prevents cross-boundary re-routing from reducing toll below the boundary potential applied to total exposure. Second, interface failures such as invalid JSON are contract-relevant events, not safety wins: treating them as zero-toll safe defaults can reward unreliable models, while escalation fees reverse the incentive. We validate this interface-compliance theorem on committed cross-model traces from the companion empirical paper. Third, a model-identity menu with a componentwise-minimum penalty schedule makes truthful reporting of the deployed model weakly dominant. We then compose these clauses with Paper A's runtime guarantees to obtain joint incentive compatibility over the five-attack space. Finally, a two-parameter premium family discharges operator individual rationality and weak budget balance at the truthful equilibrium. The result is an incentive-compatibility layer for actuarial control of autonomous-agent side effects.

阅读与讨论 → 访问原文 →

06.

arXiv (CS.CL) 2026-06-16 DOI: arXiv:2606.15893

BALTO: Balanced Token-Level Policy Optimization for Hallucination Mitigation

作者:

Ning Li ↗Zixuan Guo ↗Yan Xu ↗Wenbo Fei ↗Yifan Niu ↗Chang Luo ↗Yasheng Wang ↗Weiwen Liu ↗Yong Yu ↗Weinan Zhang ↗

Hallucinations remain a major obstacle to deploying large language models (LLMs) in knowledge-intensive settings, where generated responses must be faithfully grounded in provided evidence. Reinforcement learning (RL) is a promising direction for hallucination mitigation, but response-level faithfulness rewards suffer from a granularity mismatch: localized hallucinations can cause supported content to receive spurious penalties. Although recent work introduces fine-grained feedback such as claim-level verification and token-level rewards, unbalanced credit assignment can still induce length, verbosity, or optimization-noise biases. We propose BALTO, a Balanced Token-level Policy Optimization framework for hallucination mitigation. BALTO extracts checkable factual claims, verifies them against the reference context, and projects claim-level judgments to token-level labels. A balanced token-level credit assignment mechanism is introduced into the framework. This design redistributes probability mass from unsupported content toward faithful content, rather than suppressing the entire response. We systematically analyze the limitations of response-level rewards from a theoretical standpoint, and prove BALTO's advantages in training stability and optimization efficiency for hallucination mitigation. Experiments on ConFiQA, RAGTruth, and FinLLM-Eval show that BALTO achieves the highest faithfulness across all six model–benchmark settings and consistently outperforms existing post-training baselines in Q-Score, demonstrating a stronger faithfulness–informativeness trade-off.

阅读与讨论 → 访问原文 →

07.

arXiv (CS.CL) 2026-06-11 DOI: arXiv:2606.11279

Massive Open-Vocabulary Keyword Spotting

作者:

Leonor Barreiros ↗Raul Monteiro ↗Afonso Mendes ↗Gon\c{c}alo M. Correia ↗

Automatic speech recognition systems have been shown to under-perform when it comes to transcribing words rarely seen in the training data, namely specialized terminology. Open-vocabulary keyword spotting, combined with contextual biasing, has been shown to mitigate this issue. However, existing systems can only handle glossaries of a few hundred terms without becoming an infeasible bottleneck. We propose a system that stores features with a memory footprint up to 128 times smaller than a comparable baseline and allows users to process massive databases while remaining open-vocabulary. Without fine-tuning the speech recognition model, our system achieves a comparable entity recall as uncompressed solutions, even in languages not seen during training.

阅读与讨论 → 访问原文 →

08.

arXiv (CS.CV) 2026-06-18 DOI: arXiv:2606.08206

SegmentAnyTreeV2: Scaling Transformer-Based Tree Instance Segmentation Across Sensors, Platforms, and Forests

作者:

Maciej Wielgosz ↗Stefano Puliti ↗Rasmus Astrup ↗

We present SegmentAnyTreeV2, a sensor- and platform-agnostic framework for semantic and instance segmentation of forest point clouds. The model combines a serialization-based Point Transformer v3 backbone with a lightweight semantic head and a tree-focused cross-attention mask decoder. Semantic predictions restrict instance decoding to tree-class voxels, while instance-aware query initialization, one-to-many seed supervision, and asymmetric mask scoring improve separation in dense and structurally complex stands. We further introduce FOR-instance v3, an expanded benchmark comprising 427 scenes and 26,496 annotated trees across diverse biomes, forest structures, and LiDAR platforms. On the FOR-instanceV2 test split, SegmentAnyTreeV2 achieves 90.5% precision, 80.2% recall, 85.0% F1, 90.7% coverage, and 87.6% semantic mIoU, outperforming previous learning-based methods in both instance detection and mask completeness. Zero-shot evaluation on independent sites further demonstrates strong cross-domain generalization.

阅读与讨论 → 访问原文 →

09.

arXiv (CS.LG) 2026-06-17 DOI: arXiv:2508.19445

On Surjectivity of Neural Networks: Can you elicit any behavior from your model?

作者:

Haozhe Jiang ↗Nika Haghtalab ↗

arXiv:2508.19445v3 Announce Type: replace Abstract: Given a trained neural network, can any specified output be generated by some input? Equivalently, does the network correspond to a function that is surjective? In generative models, surjectivity implies that any output, including harmful or undesirable content, can in principle be generated by the networks, raising concerns about model safety and jailbreak vulnerabilities. In this paper, we prove that many fundamental building blocks of modern neural architectures, such as networks with pre-layer normalization and linear-attention modules, are almost always surjective. As corollaries, widely used generative frameworks, including GPT-style transformers and diffusion models with deterministic ODE solvers, admit inverse mappings for arbitrary outputs. By studying surjectivity of these modern and commonly used neural architectures, we contribute a formalism that sheds light on their unavoidable vulnerability to a broad class of adversarial attacks.

阅读与讨论 → 访问原文 →

10.

arXiv (CS.CV) 2026-06-12 DOI: arXiv:2606.12858

JSCGC: Joint Source-Channel-Generation Coding for Wireless Generative Communications

作者:

Tong Wu ↗Zhiyong Chen ↗Guo Lu ↗Li Song ↗Feng Yang ↗Meixia Tao ↗Wenjun Zhang ↗

Conventional communication systems, including both separation-based coding and learning-based joint source-channel coding (JSCC), are typically designed under Shannon's rate-distortion theory. However, relying on generic distortion metrics fails to capture complex human visual perception, often resulting in blurred or unrealistic reconstructions. In this paper, we propose Joint Source-Channel-Generation Coding (JSCGC), a generative communication paradigm that replaces the conventional decoder with a generative model at the receiver. The received signal is treated as a condition that controls the sampling process into the learned conditional distribution, reformulating communication from deterministic reconstruction for distortion minimization to controlled generation for mutual information maximization under perceptual constraints. Based on this formulation, we develop a unified joint training and efficient stochastic sampling framework, and provide theoretical analysis of its effectiveness in both learning and inference stages. Extensive experiments on latent-space image transmission demonstrate that the JSCGC consistently improves feature-based, semantic-level, and distributional quality across diverse channel conditions, while exhibiting a distinct error behavior characterized by semantic inconsistency rather than distortion.

阅读与讨论 → 访问原文 →

11.

arXiv (CS.LG) 2026-06-19 DOI: arXiv:2603.10184

Stabilizing Bandits using Regularization: Precise Regret and A Quantitative Central Limit Theorem

作者:

Budhaditya Halder ↗Ishan Sengupta ↗Koustav Chowdhury ↗Samya Praharaj ↗Koulik Khamaru ↗

arXiv:2603.10184v2 Announce Type: replace-cross Abstract: Statistical inference with bandit data presents fundamental challenges owing to adaptive sampling, which violates the independence assumptions underlying classical asymptotic theory. Recent work has identified stability~\citep{laiwei82} as a sufficient condition for valid inference under adaptivity. This paper first provides a refined stability condition, stated in terms of the iterates of an online algorithm, and shows that a large class of regularized stochastic-mirror-descent-style algorithms satisfy it. This refined condition allows us to strengthen the asymptotic results of~\citet{laiwei82} in several ways. First, we derive a non-asymptotic Berry–Esseen bound for the empirical reward estimates under adaptive sampling. Second, we derive matching non-asymptotic upper and lower bounds on the regret of the proposed algorithm, yielding a precise characterization of its regret. Third, we show that these regularized algorithms preserve asymptotic normality and valid inference under a prescribed level of adversarial corruption. Finally, we show that regularization is necessary rather than incidental: Lai–Wei stability is incompatible with the optimal $O(\sqrt{T})$ regret rate – the rate attained by unregularized algorithms such as EXP3 – so that a controlled, polylogarithmic inflation in regret is the price of valid inference.

阅读与讨论 → 访问原文 →

12.

arXiv (CS.AI) 2026-06-16 DOI: arXiv:2602.02028

Edit Knowledge, Not Just Facts via Multi-Step Reasoning over Background Stories

作者:

Ya Gao ↗Kalle Kujanp\"a\"a ↗Pekka Marttinen ↗Harri Valpola ↗Alexander Ilin ↗

arXiv:2602.02028v2 Announce Type: replace Abstract: Enabling artificial intelligence systems, particularly large language models, to update knowledge and flexibly apply it during reasoning remains a central challenge. Existing knowledge editing approaches emphasize atomic facts, improving factual recall but often failing to integrate updated information into a coherent framework usable across contexts. In this work, we argue that knowledge update is fundamentally a reasoning problem rather than a memorization problem. Consequently, a model should be trained in situations where the new information is instrumental to solving a task, combined with pre-existing knowledge, and exercised through multi-step reasoning. Based on this insight, we propose a training strategy based on three principles. First, new knowledge is introduced as a coherent background story that contextualizes novel facts and explains their relation to existing knowledge. Second, models are trained using self-generated multi-hop questions that require multi-step reasoning involving the new information. Third, training is done using knowledge distillation, forcing a student model to internalize the teacher's reasoning behavior without access to the novel information. Experiments show that models trained with this strategy effectively leverage newly acquired knowledge during reasoning and achieve remarkable performance on challenging questions that require combining multiple new facts.

阅读与讨论 → 访问原文 →

13.

arXiv (CS.AI) 2026-06-19 DOI: arXiv:2606.19533

A Tool for the Synthesis of Adaptive Probabilistic Processors Based on the Ising Model

作者:

Jonathan Juracy Carneiro da Silva ↗Leonardo R. Gobatto ↗Jose Rodrigo Azambuja ↗

arXiv:2606.19533v1 Announce Type: cross Abstract: This work presents a tool for the synthesis and simulation of probabilistic architectures for solving combinatorial optimization problems by mapping them to the Ising model. The proposed approach automatically constructs the Ising Hamiltonian and determines the number of probabilistic elements (p-bits) based on problem characteristics such as size and topology. Furthermore, the tool introduces an adaptive strategy for selecting the most suitable update algorithm among Gibbs Sampling, Simulated Annealing (SA), Simulated Quantum Annealing (SQA), and cluster-based methods. Experimental results using benchmark problems demonstrate improved convergence behavior and flexibility compared to fixed approaches. The proposed framework enables systematic evaluation of probabilistic computing strategies and supports the development of future hardware implementations based on MTJs and p-bits.

阅读与讨论 → 访问原文 →

14.

arXiv (CS.CV) 2026-06-11 DOI: arXiv:2606.11661

Learning Instance-Adaptive Low-Rank Orthogonal Subspaces for Clothes-Changing Person Re-Identification

作者:

Dong-Woo Kim ↗Tae-Kyun Kim ↗

Clothes-changing person re-identification (CC-ReID) aims to recognize individuals despite drastic appearance changes caused by clothing variation. While existing methods rely on adversarial learning to disentangle clothing features, we propose Ortho-ReID, which explicitly models a low-rank clothing subspace from VLM text descriptions and extracts clothing-invariant representations via direct geometric constraints. A critical component is our transformer-based Basis Maker, which refines a shared, low-dimensional clothing prior into an instance-adaptive low-rank subspace through cross-attention with image patches, enabling robust clothing feature extraction even under varying visibility conditions. This instance-adaptive subspace is supervised via alignment with clothing text embeddings, while identity features are extracted via a learnable projection head and geometrically constrained to be strictly orthogonal to it. Extensive experiments demonstrate state-of-the-art performance on PRCC (+5.9% top-1), Celeb-reID-light (+3.5%), and LaST (+5.3%), with competitive results on LTCC.

阅读与讨论 → 访问原文 →

15.

arXiv (quant-ph) 2026-06-16 DOI: arXiv:2606.16823

Physically Motivated Ansatz for Open Fermionic Systems on Quantum Computer

作者:

Yi Liu ↗Xiaopeng Li ↗Zhen Liu ↗Zhenyu Li ↗

arXiv:2606.16823v1 Announce Type: new Abstract: Determining non-equilibrium steady states (NESS) of open fermionic systems is a fundamental problem akin to finding ground states of closed systems. To address this, variational quantum algorithms can be used to solve the Lindblad master equation, much like the Schrödinger equation, yet ansatz design for NESS remains challenging. Existing approaches rely mostly on hardware-efficient ansätze (HEA), which suffer from the barren plateau problem. Here, we introduce a physically motivated ansatz named NE-UCC. Numerical simulations demonstrate that NE-UCC reliably converges to the steady state even in strongly correlated regimes far from equilibrium, reducing the infidelity by up to ten orders of magnitude compared to HEA. Furthermore, NE-UCC facilitates the exploration of excited eigenmodes with specific symmetries.

阅读与讨论 → 访问原文 →

16.

arXiv (CS.AI) 2026-06-17 DOI: arXiv:2602.08939

CausalT5k: Diagnosing Refusal and Failure Modes in Trustworthy Causal Reasoning Across Causal Rungs

作者:

Longling Geng ↗Andy Ouyang ↗Theodore Wu ↗Daphne Barretto ↗Matthew John Hayes ↗Rachael Cooper ↗Yuqiao Zeng ↗Sameer Vijay ↗Gia Ancone ↗Ankit Rai ↗Matthew Wolfman ↗Patrick Flanagan ↗…

arXiv:2602.08939v2 Announce Type: replace Abstract: Large language models increasingly produce fluent causal explanations, yet they often fail in ways aggregate accuracy cannot diagnose: confusing association with intervention, abandoning correct judgments under pressure, over-refusing valid claims, or answering when evidence is underdetermined. We introduce CTK, a diagnostic benchmark of 5,147 cases and growing, across 10 domains and all three levels of Pearl's Ladder of Causation. Unlike benchmarks that only score correctness, CTK reveals why a model failed by annotating causal rung, trap type, pressure sensitivity, refusal quality, and Utility-Safety tradeoffs. Its Sheep/Wolf taxonomy separates valid causal designs from inferential traps; paired neutral/pressure variants measure sycophantic drift through Bad Flip Rate; and Wise Refusal fields test whether a model identifies the missing information needed before endorsing a claim. CTK exposes failure modes hidden by aggregate accuracy: the Skepticism Trap, Rung Collapse under scaling, pressure-induced drift, Detection-Correction gaps, and counterfactual error modes. Rather than prescribing a correction method, it provides the diagnostic substrate for studying causal-reasoning failure profiles.

阅读与讨论 → 访问原文 →

17.

arXiv (CS.CV) 2026-06-11 DOI: arXiv:2602.03282

Global Geometry Is Not Enough for Vision Representations

作者:

Jiwan Chung ↗Seon Joo Kim ↗

A common assumption in representation learning is that globally well-distributed embeddings support robust and generalizable representations. This focus has shaped both training objectives and evaluation protocols, implicitly treating global geometry as a proxy for representational competence. While global geometry effectively encodes which elements are present, it is often insensitive to how they are composed. We investigate this limitation by testing the ability of geometric metrics to predict compositional binding across a diverse suite of vision encoders. We find that standard geometry-based statistics exhibit near-zero correlation with compositional binding. In contrast, functional sensitivity, as measured by the input–output Jacobian, reliably tracks this capability. We further provide an analytic account showing that this disparity arises from objective design, as existing losses explicitly constrain embedding geometry but leave the local input–output mapping unconstrained. These results suggest that global embedding geometry captures only a partial view of representational competence and establish functional sensitivity as a critical complementary axis for modeling composite structure.

阅读与讨论 → 访问原文 →

18.

arXiv (CS.AI) 2026-06-18 DOI: arXiv:2606.18310

Conflict-Aware Retriever Editing for Knowledge Injection Attacks on LLM-Based RAG Systems

作者:

Xinru Liu ↗Xianglong Zhang ↗Di Cai ↗Zhumin Chen ↗Pengfei Hu ↗Xin Xin ↗

arXiv:2606.18310v1 Announce Type: cross Abstract: Injecting malicious knowledge into retrieval-augmented generation (RAG) systems can manipulate retrieved evidence and mislead downstream generation, posing a serious security threat for AI applications. Existing RAG injection attacks mainly rely on manipulating external knowledge bases, such as crafting malicious corpus. However, the synthetic text crafted by such data-centric methods could be detectable, leading to the failure of attacks. Beyond corpus manipulation, open-source retrievers are increasingly exposing RAG systems to model-centric attacks. In this paper, we propose conflict-aware retriever editing, i.e., CAREATTACK, a model-centric retriever attack framework for malicious knowledge injection in RAG. Specifically, CAREATTACK consists two stages of conflict-aware retriever editing and attack-preserving anchor repair. Conflict-aware retriever editing adapts efficient closed-form parameter editing to the dense retrieval model, promoting malicious knowledge above benign competing passages and resolving potential parameter conflicts through graph-based conflict detection and parameter editing projection. Then, attack-preserving anchor repair performs lightweight calibration on the edited retriever to further eliminate the impact on non-target prompts while preserving the attack effectiveness for target prompts. We instantiate CAREATTACK on Qwen3-Embedding-0.6B and BGE-M3, and conduct evaluation on three benchmark datasets. Experimental results demonstrate our method substantially promote malicious passages into the retrieved knowledge of RAG systems and can perform attacks for batches of target prompts and passages, given the access of retrieval model parameters. Since most RAG systems are built upon open-source retrieval models, this work reveals a practical attack surface in RAG systems. Codes are public accessible at https://anonymous.4open.science/r/CareAttack-3F1C.

阅读与讨论 → 访问原文 →

19.

medRxiv (Medicine) 2026-06-17 DOI: HASH:b57b4288dd9142d14f75f4a7ba4b834e

Cross-Device Adaptation of Mirai for Mammography-Based Breast Cancer Risk Prediction

作者:

Sistig ↗Rothstein ↗J. H ↗Gadgil ↗Achacoso ↗Alexeeff ↗S. E ↗Gerstley ↗L. D ↗Klein ↗R. J ↗Margolies ↗…

Fine-tuning can adapt pretrained medical imaging models to new clinical datasets, but device-specific domain shifts may limit generalizability. We evaluated Mirai, a mammography-based deep learning model for breast cancer risk prediction, in a large screening cohort containing Hologic and General Electric (GE) full-field digital mammography systems, including GE Premium View (GE PV) and Tissue Equalization (GE TE) post-processing software. Native Mirai showed lower performance on TE images than on Hologic or PV images. Fine-tuning on TE images improved TE performance, particularly for short-term risk prediction, but substantially reduced performance on Hologic images, consistent with catastrophic forgetting. To mitigate this effect, we developed a device-invariant model using interleaved multi-device sampling and conditional adversarial training. This approach largely restored Hologic performance while maintaining improved TE performance, providing better robustness across heterogeneous imaging platforms. Comparison of cumulative and annual risk AUCs over a five-year time horizon further showed that performance gains were driven mainly by short- and intermediate-term predictions. These findings highlight both the value and dangers of device-specific fine-tuning and support balanced domain-adaptation strategies for deploying mammography-based risk models across diverse clinical imaging environments.

阅读与讨论 → 访问原文 →

20.

bioRxiv (Bioinfo) 2026-06-19 DOI: HASH:4e27796b09c01d0ba5d3c28788b6796b

Nickel-Driven Dynamics of Urease in Sporosarcina pasteurii: Integrated Computational and Experimental Insights

作者:

Al-Thawadi ↗S. M ↗

Urease is a nickel-dependent enzyme that plays an important role in urea hydrolysis and in a process named as microbial-induced calcium carbonate precipitation (MICP), which is widely used in sustainable environmental biotechnology. Despite its ecological importance, urease powers Biogrout (biocementation), a promising green technology for soil stabilization and infrastructure repair. Yet, the relationship between nickel availability, enzyme activation, and bacterial fitness remains poorly understood. In this study, we reveal a striking dual effect of nickel on Sporosarcina pasteurii: while high Ni2+ concentrations strongly inhibit growth (IC50 {approx} 637.7 {micro}M), they simultaneously boost specific urease activity up to six-fold. This uncoupling between biomass and enzymatic efficiency highlights a previously overlooked adaptive strategy under metal stress. Using structural bioinformatics and molecular docking, we show that Ure1–the catalytic subunit–exhibits the strongest nickel affinity (-4.3 kcal{middle dot}mol-1), supported by highly conserved active-site residues, whereas accessory proteins UreE and UreG display moderate and weak binding, consistent with their roles in metal delivery and GTP-dependent maturation. In addition, microscopic observations confirmed that calcium carbonate precipitation was most pronounced at intermediate nickel concentrations (approximately 400-1000 {micro}M), whereas higher concentrations ([≥]1000-1300 {micro}M) led to reduced mineral formation due to loss viable cells. Taken together, these results indicates that nickel availability controls both urease activation and bacterial fitness, and that an optimal balance is required to maximize biomenerilization efficiency in environmental applications, particularly in biocementation technology.

阅读与讨论 → 访问原文 →

21.

arXiv (CS.CL) 2026-06-16 DOI: arXiv:2502.05163

Enhancing LLM Safety Through a Theoretical Minimax Game Lens

作者:

Yihe Deng ↗Yu Yang ↗Junkai Zhang ↗Wei Wang ↗Bo Li ↗

The rapid advancement of large language models (LLMs) necessitates effective mechanisms to ensure their responsible deployment by accurately distinguishing unsafe content from benign content. While substantial safety datasets are available in English, multilingual safety modeling remains underexplored due to limited open-source safety datasets in other languages. Even within English datasets, safe yet sensitive corner-case content is scarce, leading to shortcut learning by models and non-trivial false-positive rates. To mitigate these issues, we introduce a novel minimax reinforcement learning (RL) framework wherein a data generator and a classifier model co-evolve, facilitating the production of high-quality synthetic multilingual safety data. We theoretically formalize this interaction as a minimax game and rigorously demonstrate convergence to a Nash equilibrium. Empirical evaluations confirm that our synthetic data generation method significantly enhances the classifier model performance, enabling a substantially smaller model to surpass the state-of-the-art by nearly 10% on English benchmarks while achieving 4.5x faster inference speed. These results establish a scalable and efficient methodology for synthetic data generation, advancing the development of safer and more robust multilingual LLM deployments.

阅读与讨论 → 访问原文 →

22.

arXiv (CS.LG) 2026-06-11 DOI: arXiv:2606.11625

TimeRouter: Efficient and Adaptive Routing of Time-Series Foundation Models

作者:

Kanghui Ning ↗Yushan Jiang ↗Kashif Rasul ↗Anderson Schneider ↗Yuriy Nevmyvaka ↗Dongjin Song ↗

arXiv:2606.11625v1 Announce Type: new Abstract: Time-series foundation models (TSFMs) are increasingly explored as predictive experts within emerging agentic time-series systems. However, TSFMs exhibit heterogeneous inductive biases, and no single model consistently dominates across forecasting regimes, making expert selection a critical challenge. Existing systems often delegate this decision to LLM-based controllers, incurring substantial inference overhead. We present TimeRouter, an efficient routing framework that leverages empirical complementarity across a pool of pretrained TSFMs through lightweight discriminative routing, selective gating, and ensemble fallback. Concretely, TimeRouter combines a learned routing head, a selective gate, and an ensemble fallback, enabling adaptive expert selection without invoking an LLM at inference time. TimeRouter achieves state-of-the-art performance on the GIFT-EVAL leaderboard, with an LB MASE of 0.6765. Beyond benchmark performance, our ablation studies provide empirical insights into TSFM routing design, highlighting the importance of pool composition and selective gating. Taken together, these results position TimeRouter as a modular and lightweight routing layer for future agentic time-series systems built upon foundation-model pools. Our code is available at https://github.com/UConn-DSIS/TimeRouter.

阅读与讨论 → 访问原文 →

23.

Nature Biotechnology 2026-06-05 DOI: HASH:3e73bcac3de442b75c61148ec9479e32

Multiplexed, precise genome engineering in monocots with twin prime editing systems

作者:

Hongchao Li ↗

Simultaneously introducing diverse genomic edits remains a challenge in crop genome engineering. Here we describe a twin prime editing-based knockout (TKO) system that installs stop codon clusters (SCCs) for precise translational termination with minimal in-frame mutations. TKO achieves knockout efficiencies of up to 70.5%, 58.6% and 75.1% in rice, maize and wheat protoplasts, respectively, and produces heritable knockout alleles in 96.8% of regenerated rice plants. In hexaploid wheat, TKO outperforms Cas9 4.2-fold in generating triple-homolog knockouts, largely by reducing in-frame mutations. Orthogonal TKO editors with sequence-divergent SCCs enable simultaneous knockout of up to ten genes without cross-interference. Integration of TKO with conventional prime editing establishes TRIM1 (TKO editor-enabled gene rupture and development of integrated multitype genome modification system) for simultaneous knockout and precise editing, achieving a 22.8% coediting of four genes in rice. TRIM2 extends this capacity to kilobase-scale modifications through a prime editor–recombinase system, enabling a 4.9-kb insertion (1.2% efficiency) and gene knockout (up to 79.8%) in protoplasts. Plant genome editing is multiplexed with twin prime editing.

阅读与讨论 → 访问原文 →

24.

arXiv (CS.CV) 2026-06-15 DOI: arXiv:2606.14010

RT-VLA: Real-Time Vision-Language-Action Models via Knowledge Distillation

作者:

Xiangyu Huang ↗Zhenlin Hua ↗Han Zhou ↗Shounak Sural ↗Ragunathan Rajkumar ↗

Vision-Language-Action (VLA) models have shown strong potential for end-to-end autonomous driving by jointly modeling visual perception, language reasoning, explainability and action prediction. However, their large vision-language backbones and reasoning modules introduce substantial inference latency and thereby prevent their deployment in the unforgiving reality of the road networks. We propose RT-VLA, a lightweight, distilled VLA model that transfers the driving and reasoning capabilities of the state-of-the-art SimLingo model into a compact student through multi-level supervised distillation. RT-VLA preserves language-based reasoning and supports post-hoc explanation through offline language analysis of safety-critical driving moments without adding latency to real-time control. Compared to the SimLingo teacher, RT-VLA maintains competitive closed-loop driving and language reasoning performance while reducing inference time by 44.8X in vision-only mode and 7.9X in vision+language mode. These results suggest that supervised distillation is a practical approach for building real-time, explainable VLA-style autonomous driving models.

阅读与讨论 → 访问原文 →

25.

arXiv (quant-ph) 2026-06-19 DOI: arXiv:2606.20184

Operator Learning for efficient Quantum Computation

作者:

Paul Over ↗Sergio Bengoechea ↗Leonardo Borello Busilacchi ↗Martin Kiffner ↗Thomas Rung ↗Alexios A. Michailidis ↗

arXiv:2606.20184v1 Announce Type: new Abstract: An efficient implementation of quantum algorithms is often hindered by the lack of efficient primitives for operators and state preparation. This limits both the ability of near-term quantum hardware to simulate complex problems and the potential of fault-tolerant algorithms to achieve practical quantum advantage. To address this, we propose a full-stack variational framework that transforms arbitrary operators to compact quantum circuits. The resulting variational circuits can be tailored to the connectivity and long-range interaction of the target hardware. The learning process employs backpropagation together with a cost function that efficiently optimizes unitary operators and non-unitary – dense or sparse – operators using only a single ancilla qubit for block encoding. Additionally, we introduce a regularization term that reduces the approximation error. The approach is validated for both quantum mechanical and engineering applications. In the former case, we learn propagators that arise in native quantum problems – such as quantum simulation and quantum chemistry – and achieve improved resource scaling in comparison to standard Suzuki-Trotter expansions. In the latter case, we demonstrate the approach's ability to implement the second-order central finite difference approximation of the Laplace operator – relevant for solving partial differential equations – while improving upon current error metrics. The final example deals with learning a dense, non-unitary operator that arises in the analysis of inviscid potential flow around an airfoil. This universality of the framework opens the door for solving general problems beyond prototypical engineering and quantum applications.

阅读与讨论 → 访问原文 →

探索全球前沿学术脉络