Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-17

Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies

作者:

arXiv:2502.17518v3 Announce Type: replace-cross Abstract: This paper presents a comprehensive study on the use of ensemble Reinforcement Learning (RL) models in financial trading strategies, leveraging classifier models to enhance performance. By combining RL algorithms such as A2C, PPO, and SAC with traditional classifiers like Support Vector Machines (SVM), Decision Trees, and Logistic Regression, we investigate how different classifier groups can be integrated to improve risk-return trade-offs. The study evaluates the effectiveness of various ensemble methods, comparing them with individual RL models across key financial metrics, including Cumulative Returns, Sharpe Ratios (SR), Calmar Ratios, and Maximum Drawdown (MDD). Our original experimental results demonstrate that ensemble methods often outperform base models in terms of risk-adjusted returns, providing better management of drawdowns and overall stability. However, both the original analysis and the additional reproduction reported in this version show that ensemble performance is sensitive to the choice of variance threshold \(\tau\), classifier group, RL-agent pair, and market universe. The reproduction evidence strengthens the conclusion that classifier-assisted ensemble selection can improve robustness, while also clarifying that the advantage is conditional rather than automatic across all datasets. This study emphasizes the value of combining RL with classifiers for adaptive decision-making, with implications for financial trading, robotics, and other dynamic environments.

02.
arXiv (quant-ph) 2026-06-16

A New Definition of Quantum Superposition

arXiv:2606.15607v1 Announce Type: new Abstract: The usual description of the superposition of two (pure quantum) states is ambiguous, since the binary operation of summation in a Hilbert space does not pass down to the quotient projective space. Even though Dirac noted this as early as 1930, it is often asserted that the superposition is a binary operation acting on two states with a value that is a unique state. The goal for this note is to motivate a rigorous, geometrical definition of the superposition of states in the setting of complex projective space, which has been argued elsewhere to be the natural geometric phase space for quantum theory. The upshot is that the new definition of the superposition of two pure states, viewed as two distinct points in the projective space, is the unique (complex) line on which those two points lie. Finally, a comparison is given between superposition and expansion in an orthonormal basis.

03.
bioRxiv (Bioinfo) 2026-06-11

Tumour evolution as ground truth for cancer whole-genome sequencing

Cancer genomes are shaped by evolutionary processes that couple mutagenesis, clonal selection, chromosomal instability, spatial growth and treatment response into structured genomic patterns, yet current benchmarking strategies largely ignore this evolutionary dependency. Here, we present SCOUT, a large-scale synthetic whole-genome sequencing resource of over 200 samples, designed for systematic benchmarking of tumour genomic analysis and evolutionary inference under controlled evolutionary ground truth. Unlike conventional task-specific simulations, SCOUT models tumour evolution as a latent generative process that simultaneously shapes mutations, copy-number alterations, variant allele frequencies, mutational signatures and clonal architectures. SCOUT recapitulates key features of solid and haematological malignancies, including driver mutations, chromosomal instability, intratumour heterogeneity, spatial sampling and treatment-associated evolutionary dynamics in tumour and matched-normal longitudinal and multi-region sequencing designs. Using SCOUT, we benchmarked widely used methods for somatic variant detection, copy-number analysis, mutational signature inference and tumour evolutionary reconstruction. Across analytical tasks, performance deteriorated in low-purity, highly subclonal and structurally complex tumours, while spatial sampling bias and hypermutation generated spurious evolutionary signals that confounded tumour interpretation across multiple inference layers. Evolutionary simulations further distinguished lineage-restricted genetic bottlenecks from multi-lineage resistance dynamics associated with tumour plasticity. Tumour purity consistently exerted a stronger effect on inference accuracy than sequencing depth. Together, our results establish evolutionary ground truth as a prerequisite for reproducible benchmarking and biologically interpretable analysis of cancer whole-genome sequencing data.

04.
arXiv (CS.AI) 2026-06-17

OmniSapiens: A Foundation Model for Social Behavior Processing via Heterogeneity-Aware Relative Policy Optimization

arXiv:2602.10635v3 Announce Type: replace Abstract: Socially intelligent AI systems must reason across diverse human behavioral tasks and generalize to new social contexts. However, behavioral data is inherently heterogeneous, comprising diverse modalities and prediction targets that produce uneven training signals across samples, creating imbalanced learning dynamics that challenge existing AI models. To address this, we develop Omnisapiens-7B 2.0, a foundation model for social behavior processing that explicitly addresses learning from heterogeneous behavioral data. This is enabled through Heterogeneity-Aware Relative Policy Optimization, a new RL method that rebalances learning signals across samples by approximating each sample's contribution to the policy update and using these estimates to drive geometrically centered, inertially smoothed advantage modulation for stable training. Omnisapiens-7B 2.0 achieves the best and most consistent performance across 10 behavioral tasks, while also attaining the best performance on all five held-out benchmarks, with gains of up to +12.02% and +9.37% respectively. Furthermore, it demonstrates more consistent and interpretable reasoning traces, supporting reliable real-world behavioral applications. Our model is available at https://github.com/MIT-MI/human_behavior_atlas.

05.
bioRxiv (Bioinfo) 2026-06-18

Calculation of sequence space coverage in a mutagenesis library

Directed evolution requires screening of large mutagenesis libraries, but accurate calculation of library sizes needed to discover functional variants remains challenging. Existing models provide baseline estimates, yet current computational approaches for finding the best variants scale poorly with library complexity. Here, we introduce a scalable algorithmic framework to compute exact discovery probabilities in saturation mutagenesis libraries with no requirement for explicit sequence enumeration. By aggregating variants into a composition log–sum distribution and applying log-space convolution across randomisation blocks, it is possible to extend this to massive sequence spaces and mixed codon schemes. By inverting these calculations, absolute mathematical ceilings for experimental design are established. Ultimately, this framework provides a rapid, quantitative tool to balance the statistical coverage-diversity trade-off within the limitations of laboratory screening. Finally, this is implemented as an open-source web application (SSCC) that allows researchers to construct heterogeneous library designs and compute required sampling depths, coverage probabilities, and absolute randomisation limits.

06.
arXiv (CS.CL) 2026-06-16

Bridging Passive and Active: Enhancing Conversation Starter Recommendation via Active Expression Modeling

Large Language Model (LLM)-driven conversational search is shifting information retrieval from reactive keyword matching to proactive, open-ended dialogues. In this context, Conversation Starters are widely deployed to provide personalized query recommendations that help users initiate dialogues. Conventionally, recommending these starters relies on a closed "exposure-click" loop. Yet, this feedback loop mechanism traps the system in an echo chamber where, compounded by data sparsity, it fails to capture the dynamic nature of conversational search intents shaped by the open world. As a result, the system skews towards popular but generic suggestions. In this work, we uncover an untapped paradigm shift to shatter this harmful feedback loop: harnessing user "free will" through active user expressions. Unlike traditional recommendations, conversational search empowers users to bypass menus entirely through manually typed queries. The open-world intents in active queries hold the key to breaking this loop. However, incorporating them is non-trivial: (1) there exists an inherent distribution shift between active queries and formulated starters. (2) Furthermore, the "non-ID-able" nature of open text renders traditional item-based popularity statistics ineffective for large-scale industrial streaming training. To this end, we propose Passive-Active Bridge (PA-Bridge), a novel framework that employs an adversarial distribution aligner to bridge the distributional gap between passively recommended starters and active expressions. Moreover, we introduce a semantic discretizer to enable the deployment of popularity debiasing algorithms. Online A/B tests on our platform, demonstrate that PA-Bridge significantly boosts the Feature Penetration Rate by 0.54% and User Active Days by 0.04%.

07.
arXiv (math.PR) 2026-06-16

Higher-order spectral perturbation expansions II: Kernel matrices and manifold learning

arXiv:2606.16373v1 Announce Type: cross Abstract: We study spectral concentration bounds for kernel matrices as approximation of the corresponding kernel integral operator. Results are established under weak assumptions on the data setting and the reproducing kernel relying only on a Mercer condition and a local Weyl law. This allows us to deal with key features of kernel matrices, such as large multiplicities, large effective dimension, and heavy-tailed distributions. Our results apply to infinite dimensional principal component analysis, manifold learning, and Bayesian nonparametric statistics. We illustrate this via two prototypical examples: The heat kernel on the sphere and a wavelet prior from Bayesian nonparametrics.

08.
arXiv (CS.AI) 2026-06-16

Estimating Mutual Information between Time Series and Temporal Event Sequences Across Diverse Analysis Tasks

arXiv:2606.01602v2 Announce Type: replace-cross Abstract: Pairwise dependence measures such as correlation and causality are fundamental to temporal data mining, yet there is still no principled and robust way to quantify dependence between heterogeneous data types, especially between continuous time series and discrete temporal event sequences. Existing approaches rely on ad hoc transformations or mutual-information estimators that are highly sensitive to quantization, repeated values, and event redundancy, leading to biased or unstable results in practice. We propose a nonparametric mutual information estimator that directly measures the dependence between time series and event sequences without data transformation, learning, or ad hoc discretization. Our method models the continuous-discrete duality of real-world time series to handle quantization and repeated-value artifacts and introduces a latent event clustering strategy to mitigate bias from event co-occurrence and redundancy. Together, these yield a robust and unified framework that bridges discrete and continuous mutual information. We evaluate the proposed estimator on four representative tasks: discrete-continuous time-delayed mutual information for causality analysis, global and local temporal repetition discovery, discrete covariate selection for time series forecasting, and continuous feature selection for classification. Experiments on synthetic and real-world datasets show consistent improvements over existing methods in accuracy, robustness, and interpretability, positioning our approach as a general-purpose dependence operator for heterogeneous temporal data, similar to Pearson correlation for homogeneous time series. Code available at: https://github.com/HaojiHu/Multimodal-Temporal-Data-Quantification

09.
arXiv (CS.AI) 2026-06-19

Mitigating Simplicity Bias in OOD Detection through Object Co-occurrence Analysis

arXiv:2605.07821v2 Announce Type: replace-cross Abstract: Out-of-distribution (OOD) detection is crucial for ensuring the reliability of deep learning models. Existing methods mostly focus on regular entangled representations to discriminate in-distribution (ID) and OOD data, neglecting the rich contextual information within images. This issue is particularly challenging for detecting near-OOD, as models with simplicity bias struggle to learn discriminative features in disentangled representations. The human visual system can use the co-occurrence of objects in the natural environment to facilitate scene understanding. Inspired by this, we propose an Object-Centric OOD detection framework that learns to capture Object CO-occurrence (OCO) patterns within images. The proposed method introduces a new OOD detection paradigm that understands object co-occurrence within an image by predicting disentangled representations for the test sample, then adaptively divides patterns into three scenarios based on object co-occurrence patterns observed in ID training data, and finally performs OOD detection in a divide-and-conquer manner. By doing so, OCO can distinguish near-OOD by considering the semantic contextual relationships present in their images, avoiding the tendency to focus solely on simple, easily learnable regions. We evaluate OCO through experiments across challenging and full-spectrum OOD settings, demonstrating competitive results and confirming its ability to address both semantic and covariate shifts. Code is released at https://github.com/Michael-McQueen/OCO.

10.
arXiv (CS.CV) 2026-06-16

CycliST: A Video Language Model Benchmark for Reasoning on Cyclical State Transitions

We present CycliST, a novel benchmark dataset designed to evaluate Video Language Models (VLM) on their ability for textual reasoning over cyclical state transitions. CycliST captures fundamental aspects of real-world processes by generating synthetic, richly structured video sequences featuring periodic patterns in object motion and visual attributes. CycliST employs a tiered evaluation system that progressively increases difficulty through variations in the number of cyclic objects, scene clutter, and lighting conditions, challenging state-of-the-art models on their spatio-temporal cognition. We conduct extensive experiments with current state-of-the-art VLMs, both open-source and proprietary, and reveal their limitations in generalizing to cyclical dynamics such as linear and orbital motion, as well as time-dependent changes in visual attributes like color and scale. Our results demonstrate that present-day VLMs struggle to reliably detect and exploit cyclic patterns, lack a notion of temporal understanding, and are unable to extract quantitative insights from scenes, such as the number of objects in motion, highlighting a significant technical gap that needs to be addressed. More specifically, we find no single model consistently leads in performance: neither size nor architecture correlates strongly with outcomes, and no model succeeds equally well across all tasks. By providing a targeted challenge and a comprehensive evaluation framework, CycliST paves the way for visual reasoning models that surpass the state-of-the-art in understanding periodic patterns.

11.
arXiv (CS.AI) 2026-06-16

CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies

arXiv:2606.16613v1 Announce Type: new Abstract: As LLM agents become capable of increasingly long-horizon tasks, evaluating their performance in economic systems is becoming increasingly important. Unlike existing benchmarks that primarily evaluate a single agent interacting with a passive environment, economic systems are inherently multi-agent, requiring autonomous agents to communicate, negotiate, and transact while pursuing their own objectives over extended periods. We introduce CoffeeBench, a benchmark for evaluating LLM agents in a long-horizon multi-agent economy composed of heterogeneous firms. In CoffeeBench, two farmers, two roasters, and two retailers autonomously operate their businesses over a 90-day simulation, each seeking to maximize cumulative net income through communication and transactions while managing cash, inventory, and pricing. The evaluated model controls one coffee roaster, while the remaining firms are controlled by fixed reference agents. Across several recent open-weight and proprietary LLMs, all models outperform a passive baseline that takes no actions, with most achieving positive net income. Analysis of agent behavior reveals substantial differences in long-horizon economic interaction: higher-performing models communicate more actively with other firms, whereas Claude~Haiku~4.5 exhibits an idle-drift failure mode, repeatedly choosing inaction despite producing coherent assessments and plans. We release our code and agent trajectories to support future research.

12.
arXiv (math.PR) 2026-06-11

Second-order PACF asymptotics and discrimination between fractional Gaussian noise and $\operatorname{FARIMA}(0,d,0)$

作者:

arXiv:2605.31416v2 Announce Type: replace-cross Abstract: Fractional Gaussian noise and $\operatorname{FARIMA}(0,d,0)$ have the same long-memory pole $|\theta|^{-2d}$ and hence the same leading PACF law $\alpha(n)\sim d/n$. We show that this agreement breaks at the first non-universal order. For $0

13.
arXiv (CS.CV) 2026-06-19

An Angular-Temporal Interaction Network for Light Field Object Tracking in Low-Light Scenes

High-quality 4D light field representation with efficient angular feature modeling is crucial for scene perception, as it can provide discriminative spatial-angular cues to identify moving targets. However, recent developments still struggle to deliver reliable angular modeling in the temporal domain, particularly in complex low-light scenes. In this paper, we propose a novel light field epipolar-plane structure image (ESI) representation that explicitly defines the geometric structure within the light field. By capitalizing on the abrupt changes in the angles of light rays within the epipolar plane, this representation can enhance visual expression in low-light scenes and reduce redundancy in high-dimensional light fields. We further propose an angular-temporal interaction network (ATINet) for light field object tracking that learns angular-aware representations from the geometric structural cues and angular-temporal interaction cues of light fields. Furthermore, ATINet can also be optimized in a self-supervised manner to enhance the geometric feature interaction across the temporal domain. Finally, we introduce a large-scale light field low-light dataset for object tracking. Extensive experimentation demonstrates that ATINet achieves state-of-the-art performance in single object tracking. Furthermore, we extend the proposed method to multiple object tracking, which also shows the effectiveness of high-quality light field angular-temporal modeling.

14.
arXiv (quant-ph) 2026-06-15

Nonadiabatic Self-Healing of Trotter Errors in Digitized Counterdiabatic Dynamics

arXiv:2512.22636v2 Announce Type: replace Abstract: Trotter errors in digitized quantum dynamics arise from approximating time-ordered evolution under noncommuting Hamiltonian terms with a product formula. In the adiabatic regime, such errors are known to exhibit long-time self-healing [Phys. Rev. Lett. 131, 060602 (2023)], where discretization effects are effectively suppressed. Here we show that self-healing persists at finite evolution times once nonadiabatic errors induced by finite-speed ramps are compensated. Using counterdiabatic driving to cancel diabatic transitions and isolate discretization effects, we study both noninteracting and interacting spin models and characterize the finite-time scaling with the Trotter steps and the total evolution time. In the instantaneous eigenbasis of the driven Hamiltonian, the leading digital error maps to an effective harmonic perturbation whose dominant Fourier component yields an analytic upper bound on the finite-time Trotter error and reveals the phase-cancellation mechanism underlying self-healing. Our results establish finite-time self-healing as a generic feature of digitized counterdiabatic protocols, clarify its mechanism beyond the long-time adiabatic limit, and provide practical guidance for high-fidelity state preparation on gate-based quantum processors.

15.
arXiv (CS.CV) 2026-06-16

Training-Free Open-Vocabulary Visual Grounding for Remote Sensing Images and Videos

Remote sensing visual grounding (RSVG) aims to localize a referred target in a remote sensing image or video according to a natural language expression. Existing RSVG methods usually rely on task-specific manual annotations, which are costly to collect and inevitably limited in covering the diversity of real-world geospatial scenarios. As a result, they often struggle to generalize to open-vocabulary queries involving novel objects, fine-grained attributes, complex spatial relationships, and functional semantics. In this paper, we propose RSVG-ZeroOV, a training-free framework that leverages frozen generic foundation models for zero-shot open-vocabulary RSVG. RSVG-ZeroOV follows an Overview-Focus-Evolve paradigm, which exploits the distinct yet complementary attention patterns of vision-language models (VLMs) and diffusion models (DMs) to progressively generate precise grounding results. Specifically, (i) Overview utilizes a VLM to extract cross-attention maps that capture semantic correlations between the referring expression and visual regions; (ii) Focus leverages the fine-grained modeling priors of a DM to compensate for object structure and shape information often overlooked by VLM attention; and (iii) Evolve introduces a simple yet effective attention evolution module to suppress irrelevant activations, yielding purified object masks. To handle video inputs, we further present Video RSVG-ZeroOV, which extends image-level grounding to spatio-temporal grounding through a query-relevant key-frame selector and a temporal propagator, enabling efficient and temporally coherent video grounding without video annotations or fine-tuning. Extensive experiments on six image and video grounding benchmarks show that RSVG-ZeroOV consistently outperforms existing zero-shot baselines and achieves competitive or superior performance compared with weakly- and fully-supervised methods.

16.
arXiv (CS.CV) 2026-06-16

Deep Learning in Seismic Interpretation: Federated Advances in Salt Dome Segmentation

Salt-dome delineation is a critical, high-impact task in subsurface geological interpretation, driving decisions in hydrocarbon exploration, reservoir modeling, and drilling safety. While convolutional encoder-decoder architectures have delivered significant improvements in automated salt segmentation, their widespread application is severely limited by data sovereignty concerns, dataset bias, and the scarcity of labeled seismic volumes. This paper introduces FedSaltNet, a Federated Learning (FL) framework explicitly engineered for robust, generalizable, and privacy preserving salt-dome segmentation. We couple a lightweight Small U-Net backbone, chosen for its efficiency and regularization properties with a novel Foreground-Weighted (FG-WEIGHTED) aggregation strategy designed to tackle domain-specific class imbalance. Through an extensive comparative study emulating non-IID conditions across four diverse seismic datasets (TGS, SEAM, F3, GBS), we demonstrate two critical findings: The FG-WEIGHTED algorithm effectively mitigates data heterogeneity, yielding a 4.0% relative improvement in Intersection over Union (IoU) over the best conventional FL method. The simple U-Net architecture proved essential, outperforming the higher capacity ResNet-18 U-Net variant by 166% in average IoU, underscoring the necessity of architectural simplicity in data-constrained federated environments. FedSaltNet provides a validated, high-performance solution that establishes the viability of federated deep learning for collaborative, next-generation subsurface interpretation.

17.
arXiv (CS.AI) 2026-06-12

EpiBench: Verifiable Evaluation of AI Agents on Epigenomics Analysis

arXiv:2606.13602v1 Announce Type: new Abstract: We introduce EpiBench, a verifiable benchmark for short-horizon epigenomics analysis. EpiBench evaluates whether agents can make well-defined analysis decisions from realistic workflow states and return deterministically gradable answers. The benchmark includes 106 evaluations across CUT\&Tag/CUT\&RUN, ATAC-seq, ChIP-seq, and DNA methylation workflows. Across 5,088 valid trajectories from 16 model-harness pairs, no system passed a majority of attempts: GPT-5.5 / Pi led at 45.0\% (143/318 attempts; 95\% confidence interval (CI), 36.3–53.7), followed by GPT-5.5 / OpenAI Codex at 39.9\% (127/318 attempts; 95\% CI, 31.6–48.3). Claude Opus 4.8 Max / Pi and GPT-5.4 / Pi each passed 39.0\% (124/318 attempts; 95\% CI, 30.2–47.8 and 31.0–47.0, respectively). Performance varies across assay types, and many failed runs still contain parts of the correct answer. Agents often found the right files and computed useful intermediate results, but failed when the task required deeper, assay-specific scientific judgment.

18.
arXiv (CS.LG) 2026-06-15

Learning the Context of Errors: Black-Box Online Adaptation of Time Series Foundation Models

arXiv:2606.14222v1 Announce Type: new Abstract: The rapid evolution of Time Series Foundation Models (TSFMs) has advanced zero-shot forecasting across diverse domains. Inspired by the current form of Large Language Models, future TSFMs may be offered as commercialized, closed-source API services. However, many existing online adaptation methods still rely on white-box access for parameter fine-tuning or gradient backpropagation. This paradigm mismatch raises a question: In black-box online adaptation for TSFMs, what should we learn? We answer this with an insight: the predictive errors of the base model are conditioned on both the input and output of the base model (i.e., the context of errors). To validate this insight, we propose ORCA (Online Residual Contextual Adaptation). We conduct extensive experiments across 5 state-of-the-art TSFMs and 8 datasets to demonstrate the effectiveness of our approach. Furthermore, through ablation studies, we quantitatively analyze the impact of different adapter learning hypotheses on the final adaptation performance in black-box online adaptation. Code available at https://github.com/Fifthky/ORCA.

19.
Nature (Science) 2026-06-09

Don’t compete, collaborate: why collective funding applications are the future

Scientists with disparate expertise writing grants together can identify knowledge gaps and drive progress — but systems must change to incentivize them. Scientists with disparate expertise writing grants together can identify knowledge gaps and drive progress — but systems must change to incentivize them.

20.
arXiv (quant-ph) 2026-06-16

Quantum learning with a single-atom sensor

arXiv:2606.15071v1 Announce Type: new Abstract: The ability to gather information and to act upon it is at the core of every learning agent. But what is the impact of quantum mechanics on an agent's ability to sense external inputs and to translate them into actions? Here we address the question for a prototype task of learning agency at the quantum scale: rotating a single spin based on information gathered by a single atom. We determine the ultimate performance limit for this task, revealing a fundamental tradeoff between entanglement at the sensing stage and coherence at the action stage: if the single-atom sensor is not entangled with the quantum system serving as the agent's internal memory, then the best learning strategy requires a coherent transfer of quantum information from the sensor to the system that controls the agent's actions. In contrast, if the sensor is initially entangled with the agent's memory, then the transfer of quantum information is no longer necessary. Our results indicate that the quantum properties of the sensor radically affect the optimal way to convert external stimuli into actions, revealing a link between quantum sensing and the behavior of quantum agents.

21.
arXiv (CS.LG) 2026-06-12

Scalable anomaly detection via a univariate Christoffel function

arXiv:2606.12483v1 Announce Type: new Abstract: Anomaly detection plays a critical role in identifying unusual patterns across domains such as fraud detection, network intrusion, and system fault diagnosis. Recently, Christoffel function-based methods, rooted in polynomial optimization, have emerged as promising alternatives to deep learning due to their strong mathematical foundations and computational frugality. However, their practical applicability is hindered by the need to invert a matrix whose size grows exponentially with the data dimension, rendering the method intractable even for moderate-dimensional datasets. This paper addresses the dimensionality limitations of Christoffel function-based anomaly detection while preserving its key theoretical properties, i.e., the on-off support dichotomy behavior and the accurate support shape capture. We introduce UCF, a univariate Christoffel function which is based on the squared distance between the query point and the support points. Extensive experiments on the ADBench benchmark demonstrate that UCF consistently outperforms 14 state-of-the-art baselines in terms of Average Precision. By resolving the scalability bottleneck of the Christoffel Function, this work expands the toolkit of anomaly detection methods with a robust, theoretically grounded, and universally applicable approach.

22.
arXiv (CS.CL) 2026-06-16

From ASR to ASP: Evaluating Prompt Attack Vulnerabilities Against Open-Source LLMs

Recent studies demonstrate that Large Language Models (LLMs) are vulnerable to attacks that generate harmful or sensitive outputs. As open-source LLMs are increasingly adopted in high-impact applications such as finance, law, and healthcare, systematically investigating their security risks is becoming increasingly important towards trustworthy LLM era. This paper comprehensively studies effective prompt injection attacks against 14 widely used open-source and three closed-source LLMs on five attack benchmarks. Moreover, existing evaluation metrics mostly only consider the attack success rate, overlooking uncertainty in model responses. Our proposed Attack Success Probability (ASP) additionally captures uncertain behaviors for evaluation, where the model may initially refuse a harmful request but subsequently provide harmful guidance or vice versa, reflecting inconsistency and ambiguity in attack feasibility. By systematically analyzing the effectiveness of prompt injection attacks, we propose a straightforward and effective hypnotism attack; results show that this attack causes aligned language models, including Stablelm2, Mistral, Openchat, and Vicuna, to generate objectionable behaviors, achieving around 90% ASP. They also indicate that ignore prefix attacks can break all 14 open-source LLMs, achieving over 60% ASP on a multi-categorical dataset. We find that moderately well-known LLMs exhibit higher vulnerability to prompt injection attacks, highlighting the need to raise public awareness and prioritize efficient mitigation strategies.

23.
arXiv (math.PR) 2026-06-16

Interplay of insurance and financial risks in a non Levy-Renewal environment

arXiv:2606.15596v1 Announce Type: new Abstract: In this paper we consider a multivariate risk model, with common counting process and common process of logarithmic returns for the investment portfolio. We assume that the claim-vectors, the counting process and the logarithmic returns of the investment portfolio satisfy a weak dependence structure. Further, we consider that the counting process represents an inhomogeneous renewal process, and the logarithmic returns represent a cadlag process with independent but not necessarily stationary increments. Under these conditions we provide an asymptotic expression for the infinite-time entrance probability of the discounted aggregate claims into some rare set xA, where A denotes a set from a general set family, crucial for the actuarial practice, when the common distribution of the claim vectors belong to a multivariate heavy-tailed distribution class. This result, is derived under a moment condition for the financial risks, and underlines the multivariate linear single big jump principle. When we restrict the distribution class of the claim-vectors to multivariate regular variation, we find more explicit asymptotic expressions, weakening the moment conditions on the financial risks. The asymptotic formulas, derived through double dependence solution, become more direct and practical in applications. With respect to the technical part, due to non Levy-Renewal framework, the classical Kesten-Goldie theorems are not applicable, nor their extensions. The way we make the discretization of the process of the discounted aggregate claims permits to derive uniform asymptotics with respect to the number of summands, that facilitate the approximation of the infinite sums of the main results.

24.
arXiv (CS.LG) 2026-06-15

Shuttling Compiler for Trapped-Ion Quantum Computers Based on Large Language Models

arXiv:2512.18021v3 Announce Type: replace-cross Abstract: We present the first shuttling compiler based on large language models (LLMs) for trapped-ion quantum computers, where qubits are shuttled between segments for gate execution and qubit storage. We fine-tune pre-trained LLMs on examples from linear and branched one-dimensional shuttling architectures. Thus, we obtain a layout-independent compilation strategy that learns the required shuttling operations directly from data. Using benchmark circuits with up to 16 qubits, such fine-tuned LLMs can now generate valid schedules for shuttling architectures. Notably, we also obtain a valid schedule for a previously unseen four-way junction layout. This demonstrates that trained LLMs can generalize to layouts not encountered during training. For various architectures, LLM-based schedules improve upon state-of-the-art baseline compiler results, reducing the shuttling effort by up to 15%.

25.
arXiv (CS.AI) 2026-06-17

MODE: Modality-Decomposed Expert-Level Mixed-Precision Quantization for MoE Multimodal LLMs

arXiv:2606.17118v1 Announce Type: cross Abstract: Mixture-of-Experts Multimodal Large Language Models (MoE-MLLMs) offer remarkable performance but incur prohibitive GPU memory costs, making compression essential. Among PTQ methods, expert-level mixed-precision quantization has proven effective for MoE-LLMs, yet suffers notable degradation on MoE-MLLMs due to two overlooked biases in expert importance estimation. (1) At the cross-modal level, the numerical dominance of vision tokens causes expert selection frequency to be dominated by vision tokens, masking experts that are critical to the text modality; (2) at the intra-vision level, the large proportion of redundant vision tokens further skew frequency statistics, obscuring experts critical for informative visual content. To bridge gaps, we propose MODE, a modality-decomposed expert-level mixed-precision quantization framework for MoE-MLLMs that decomposes expert selection frequency by modality, filters redundant vision tokens to obtain denoised visual frequency, and further evaluates quantization sensitivity per modality as a complementary signal to frequency-based estimation. These signals are integrated into an Integer Linear Programming formulation to assign per-expert bit-widths under a given budget. Extensive experiments show that MODE is particularly well-suited for MoE-MLLMs, limiting average performance loss to within 2.9% at W3A16, with larger gains at the extreme 2-bit setting.