Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.LG) 2026-06-18

Beyond Algorithms: Conceptual Innovation in Medical Imaging AI

arXiv:2606.19270v1 Announce Type: cross Abstract: Artificial intelligence has driven rapid progress in medical imaging research, producing increasingly sophisticated algorithms and steady improvements on benchmark tasks. However, this algorithm-centric trajectory has also revealed a growing imbalance: while computational methods advance rapidly, the conceptual foundations that define imaging tasks, evaluation metrics, and clinical meaning sometimes remain underexamined. In this Perspective, we distinguish algorithmic innovation, which focuses on improving computational implementations and performance within a fixed problem definition, from conceptual innovation, which reframes what problems are posed, how success is measured, and why an approach is clinically relevant. We argue that prevailing incentive structures, training pathways, and publication norms disproportionately reward algorithmic novelty, particularly for early-career researchers, while at times undervaluing conceptual contributions that are essential for scientific maturation and clinical translation. Through representative examples from medical imaging AI, we show how insufficient conceptual grounding can lead to misaligned objectives, fragile generalization, and limited real-world impact. We conclude with actionable recommendations for researchers, mentors, reviewers, and journals to better recognize, support, and integrate conceptual innovation alongside algorithmic advances.

02.
arXiv (CS.CV) 2026-06-11

From Simulation to Real-World: An In-Field 6D Pose Dataset and Baseline for Robotic Strawberry Harvesting

Robotic strawberry harvesting requires precise 6D pose estimation; however, collecting 6D pose ground truth in real agricultural fields is inherently challenging. Existing 6D pose estimation methods have therefore relied solely on synthetic data that lacks scene-level realism, leaving their performance under real agricultural field conditions unquantified. In this work, we present, to the best of our knowledge, the first real-world 6D pose ground truth dataset of strawberries collected in actual agricultural fields (12,040 images). We also introduce a synthetic dataset rendered in NVIDIA Isaac Sim, featuring scene-level realism and domain randomization. Nevertheless, our experiments reveal that a significant sim-to-real gap persists, underscoring the necessity of real agricultural field data for reliable evaluation. We further quantify the sim-to-real gap through baseline 6D pose estimation results across backbone encoders, serving as a reference for future work. The real-world dataset will be made available upon acceptance.

03.
arXiv (quant-ph) 2026-06-19

Local controllability of heralded quantum linear optics

arXiv:2606.19470v1 Announce Type: new Abstract: Photonic linear optical networks provide a versatile platform for quantum information processing and quantum state engineering. However, the set of states that can be generated using passive linear optics alone is fundamentally constrained by bosonic symmetries. Heralding, based on conditional measurements on auxiliary modes, is a widely used technique to overcome these limitations and effectively enlarge the set of accessible states. Despite the widespread use of heralding, it is often unclear how specific ancillary resources impact the overall reachability of the target space. In this work, we investigate the local controllability of photonic states in linear optical networks by analyzing the rank of the Jacobian of the output state with respect to the underlying unitary circuit, which provides a quantitative measure of the dimension of the accessible tangent space at a given configuration. Our analysis ranges from passive linear optics to heralded linear optics, where auxiliary resources and conditional measurements are included. Within this framework, we quantify how different resources enlarge the locally accessible state space beyond that of passive linear optics and determine the resources required for the Jacobian rank to reach its maximal value, thereby achieving full local controllability. As maximal local rank is a necessary condition for global reachability, our framework offers a systematic tool to assess and compare the accessible state space of measurement-based photonic architectures, and to establish practical criteria for the resources needed in high-dimensional quantum state engineering.

04.
arXiv (CS.CL) 2026-06-15

Detecting Historical Turning Points in Italian Media: A Complex Systems Approach to a Diachronic News Corpus

The increasing availability of large-scale textual corpora has opened new possibilities for data-driven, quantitative approaches to historical analysis using Natural Language Processing (NLP). However, diachronic corpora with historical relevance from the pre-digital era remain scarce and often incomplete. We present a quantitative approach to historical analysis based on the reconstruction and exploration of a diachronic corpus of around 600,000 articles from the Italian newspaper "La Repubblica", covering all the articles published from the 1st of January 1985 to the 31st of December 2000 - a period of major political, social, and geopolitical change in Italy and globally. Using NLP techniques, we analyze the text at both lexical and semantic levels; we then apply tools from complex systems and statistical physics to trace shifts in media discourse over time. This allows us to detect key transition periods, such as the transition from the First Republic to the Second Republic in Italy, or major international conflicts like the Gulf War or the Kosovo War, without relying on prior labeling. The results show how combining computational linguistics with ideas from complex systems can offer new quantitative insight into historical changes, opening up new paths for studying the dynamics of media and society through large-scale textual data.

05.
arXiv (CS.AI) 2026-06-16

Learn from Your Mistakes: Tree-like Self-Play for Secure Code LLMs

arXiv:2606.03489v2 Announce Type: replace-cross Abstract: While Large Language Models (LLMs) excel in code generation, they remain prone to replicating subtle yet critical vulnerabilities endemic to their training data. Current alignment techniques, such as Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL), typically apply coarse-grained optimization at the sequence level. This approach often fails to address the localized nature of security flaws, where a single incorrect token choice can compromise an entire program. To bridge this gap, we introduce Tree-like Self-Play (TSP), a framework that reframes secure code generation as a fine-grained sequential decision process. Unlike standard methods that blindly maximize likelihood, TSP constructs a decision tree where the model explores branching trajectories–generating both secure "golden paths" and vulnerable variants. By treating code generation as a self-play game, the model learns to strictly discriminate against its own localized errors. This provides a dense, on-policy learning signal that forces self-correction precisely at the critical decision nodes where vulnerabilities typically emerge. Our experiments demonstrate that TSP fundamentally enhances model reliability. In Python security benchmarks, TSP boosts CodeLlama-7B's pass rate (SPR@1) to 75.8%, significantly outperforming SFT (57.0%) and unstructured self-play baselines. Crucially, TSP induces robust out-of-distribution generalization: the model not only reduces vulnerabilities in unseen categories (CWEs) by 24.5% but also successfully transfers security principles learned from C/C++ to diverse languages, including Python, Go, and JavaScript. This suggests that TSP does not merely memorize patches, but internalizes abstract, language-agnostic security logic.

07.
arXiv (CS.AI) 2026-06-18

InstructTime++: Time Series Classification with Multimodal Language Modeling via Implicit Feature Enhancement

arXiv:2601.14968v2 Announce Type: replace-cross Abstract: Most existing time series classification methods adopt a discriminative paradigm that maps input sequences directly to one-hot encoded class labels. While effective, this paradigm struggles to incorporate contextual features and fails to capture semantic relationships among classes. To address these limitations, we propose InstructTime, a novel framework that reformulates time series classification as a multimodal generative task. Specifically, continuous numerical sequences, contextual textual features, and task instructions are treated as multimodal inputs, while class labels are generated as textual outputs by tuned language models. To bridge the modality gap, InstructTime introduces a time series discretization module that converts continuous sequences into discrete temporal tokens, together with an alignment projection layer and a generative self-supervised pre-training strategy to enhance cross-modal representation alignment. Building upon this framework, we further propose InstructTime++, which extends InstructTime by incorporating implicit feature modeling to compensate for the limited inductive bias of language models. InstructTime++ leverages specialized toolkits to mine informative implicit patterns from raw time series and contextual inputs, including statistical feature extraction and vision-language-based image captioning, and translates them into textual descriptions for seamless integration. Extensive experiments on multiple benchmark datasets demonstrate the superior performance of InstructTime++.

08.
arXiv (quant-ph) 2026-06-17

Quantum Resources and Wigner Symmetry in Nucleon-Nucleon Scattering from Effective Field Theory

arXiv:2606.17148v1 Announce Type: cross Abstract: We study quantum resources in the spin degrees of freedom, such as entanglement, stabilizer magic, and non-local magic, in low-energy nucleon-nucleon scattering through next-to-leading order in pionless effective field theory. Treating each nucleon spin as a qubit, we calculate the corresponding resource-generating powers of the scattering operator at generic center-of-mass momentum and scattering angle $\Theta$. The analysis retains $S$- and $P$-wave channels generated by two-derivative contact interactions. When the microscopic physics exhibits Wigner's $SU(4)$ spin-flavor symmetry, the neutron-proton amplitude becomes proportional to the spin-space identity operator and therefore generates no new resources after scattering, extending an observation previously made for leading-order $S$-wave scattering. The same-nucleon channel remains resource-generating because constraints from identical particles project out part of the Hilbert space. These results show how enhanced symmetries, partial-wave structure, and resource generation are intertwined in low-energy two-body scattering.

09.
arXiv (CS.AI) 2026-06-16

EMS: Multi-Agent Voting via Efficient Majority-then-Stopping

arXiv:2604.02863v2 Announce Type: replace Abstract: Majority voting is the standard for aggregating multi-agent responses into a final decision. However, traditional methods typically require all agents to complete their reasoning before aggregation begins, leading to significant computational overhead, as many responses become redundant once a majority consensus is achieved. In this work, we formulate efficient multi-agent voting as a reliability-aware agent scheduling problem and propose Efficient Majority-then-Stopping (EMS) to improve reasoning efficiency. EMS first estimates a Task-Conditioned Reliability Ordering (TCRO) for each agent by retrieving its historical consensus evidence on semantically similar queries, and then invoking agents in descending reliability order. Next, Adaptive Incremental Voting (AIV) terminates the process once the current leading answer cannot be overturned by any possible votes from the remaining agents, and returns this answer. Finally, Reliability History Updating (RHU) updates only the invoked agents according to their consensus with the final decision. Extensive evaluations across five benchmarks show that EMS preserves the accuracy of Majority Voting while reducing the average number of invoked agents by 35% and token consumption by 44%, respectively. The code is available at https://github.com/fuyu66/EMS.

10.
Nature (Science) 2026-06-19

Daily briefing: Human detritus remakes geology

作者:

What, exactly, is a rock? Plus, a stem-cell success for a severe autoimmune disease and evidence that ‘AI deskilling’ is real. Researchers have tracked the electrical activity of individual brain cells during conversation in real time. Plus, the history of GPS and a cross-species transplant that could reveal clues about the origin of animals.

11.
arXiv (CS.AI) 2026-06-18

TMR-GGNN: Credit Card Fraud Detection based on Time-Aware Multi-Relational Guided Graph Neural Network

arXiv:2606.18444v1 Announce Type: cross Abstract: In recent years, credit card fraud detection has faced significant challenges due to highly imbalanced data, evolving fraud patterns, and complex relational structures among transaction entities. To address these issues, this research proposes a novel framework called Timeaware Multi Relational Guided Graph Neural Network (TMR GGNN). Particularly, the proposed TMR GGNN extends the encoder decoder Graph Neural Network GNN architecture by modeling heterogeneous interactions across customers, merchants, devices, and IPs over temporal windows. Subsequently, the proposed TMR GGNN approach constructs a dynamic, multi relational graph and incorporates a time aware relational attention mechanism within the encoder to adaptively weigh the transaction relevance based on temporal proximity and semantic context. Consequently, the decoder employs a contrastive learning module to distinguish between real and synthesized transaction patterns, while improving the models generalization of rare fraud cases. Additionally, to effectively manage severe class imbalances and emphasize discriminative learning, a composite loss function combining Information Noise Contrastive Estimation (InfoNCE) based contrastive loss with Focal Loss is introduced. This integration assists in improving fraud identification while mitigating false negatives.

13.
arXiv (CS.AI) 2026-06-17

Cluster-Aware Dual-Level Test Specification Generation for Large-Scale Automotive Software Requirements

arXiv:2606.17197v1 Announce Type: cross Abstract: Generating test specifications that satisfy Automotive SPICE SWE.6 requirements becomes increasingly challenging and time-consuming as projects scale to thousands of requirements. Because this manual process often consumes weeks of engineering effort, automation becomes a critical necessity. However, standard Large Language Model (LLM) approaches struggle at scale: processing requirements individually discards vital inter-requirement dependencies, while feeding entire corpora at once exceeds context-window limits, leading to incomplete integration coverage and redundant test cases. This paper presents a novel "Cluster-then-Summarize" pipeline that addresses these limitations through three-stages. Requirements are embedded using sentence transformers and grouped using UMAP dimensionality reduction followed by HDBSCAN density-based clustering. This grouping utilizes an automatic minimum cluster size selection driven by a quality criterion combining normalized Silhouette and Calinski-Harabasz scores. A multi-level map-reduce summarization algorithm then distills each cluster into concise, domain-conformant descriptions while preserving quantitative thresholds and safety integrity levels. The pipeline exploits the derived cluster topology to generate test specifications at two levels: individual requirement verification and cluster-level integration tests that verify cross-requirement feature behavior. A nearby-cluster context mechanism provides bounded cross-feature awareness during each LLM call, and Retrieval-Augmented Generation grounds all outputs in ISO 26262 and ASPICE standards. Evaluation on automotive requirement datasets of varying scale demonstrates that the cluster-aware approach improves integration test coverage and maintains summarization fidelity compared to baseline methods while scaling efficiently to thousands of requirements.

14.
medRxiv (Medicine) 2026-06-16

Cross-sectional study of the association between depressive symptoms and attentional bias to emotional stimuli in patients with acute stroke: Study protocol

Post-stroke depression affects approximately 30% of patients after stroke and is associated with delayed recovery in activities of daily living, reduced rehabilitation effectiveness, and poorer quality of life. Attentional bias modification may provide a low-burden, nonpharmacological approach for patients in the acute phase of stroke. However, before such an intervention can be implemented in clinical practice, it is necessary to clarify whether attentional bias is present in patients with acute stroke and depressive symptoms, whether cognitive function influences the manifestation of this bias, and which task and stimulus formats are most appropriate for assessment. This multicenter, cross-sectional observational study will enroll patients with acute stroke between 7-30 days after stroke onset. Depressive symptoms will be assessed using the depression subscale of the Hospital Anxiety and Depression Scale. Attentional bias will be measured under four task conditions based on the dot-probe task and the cue-target task, using face and word stimuli. Secondary assessments will include cognitive function, anxiety symptoms, activities of daily living, health-related quality of life, and clinical background variables. The aims of this study are to investigate the association between depressive symptoms and attentional bias in patients with acute stroke, compare attentional bias characteristics across task and stimulus types, and examine the potential influence of cognitive function on this association. The findings are expected to provide an empirical basis for designing future attentional bias modification protocols targeting post-stroke depression in the acute phase. This study has been registered with the UMIN Clinical Trials Registry (UMIN000059166).

15.
arXiv (CS.LG) 2026-06-18

SCAN: Enhance Time Series Anomaly Detection via Multi-Scale Neighborhood-Centered Clustering

arXiv:2606.19255v1 Announce Type: new Abstract: Time series anomaly detection plays a crucial role in a wide range of real-world applications. Reconstruction-based methods have become the mainstream paradigm, but they suffer from over-generalization and under-generalization problems, which are challenging to balance. To address this, we introduce multi-scale clustering to enhance reconstruction-based methods. At the representation level, we integrate the cluster center representations of normal patterns to constrain the model to target representative normal patterns for reconstruction, preventing dominance of powerful capacity and representation capability. At the anomaly criterion level, we derive anomaly confidence score based on cluster membership probability and combine it with reconstruction error, providing dual criteria for detection. Furthermore, the effectiveness of the cluster center representations and anomaly confidence score depends on the clustering performance. Accordingly, we extract neighborhood-centered representations for multi-view clustering to improve clustering performance. Extensive experiments on multiple real-world datasets from diverse application domains demonstrate the state-of-the-art performance of SCAN.

16.
arXiv (math.PR) 2026-06-12

Conditional means, vector pricings, amenability and fixed points in cones

arXiv:2512.13829v4 Announce Type: replace Abstract: We develop a generalization of conditional probability for arbitrary ordered vector spaces. A related problem is that of assigning a numerical value to one vector relative to another. We characterize the groups for which these generalized probabilities can be stationary, respectively invariant. Our results deviate from the setting of classical probability and lead to a new criterion for amenability and for fixed points in cones.

17.
arXiv (quant-ph) 2026-06-11

Entanglement preservation and Clauser-Horne nonlocality in electromagnetically induced transparency quantum memories

arXiv:2507.15453v4 Announce Type: replace Abstract: Entanglement preservation in noisy quantum memories represents a central challenge in quantum information science. While experiments have shown that electromagnetically induced transparency (EIT) memories can store entangled photons, a quantitative theoretical analysis of whether nonlocal quantum correlations can survive storage loss induced by ground-state decoherence remains limited. Here we combine the dark-state polariton formalism with a reduced density-operator treatment to derive an EIT-specific effective pure-loss description for the retrieved photonic state in the ground-state-decoherence-limited regime. The analysis reveals that decoherence transforms an initially pure Bell state into a mixed state with a vacuum component and predicts a protocol-dependent storage-efficiency benchmark of 89.7% for violating the chosen unconditional Clauser-Horne (CH) inequality. Above this benchmark, the retrieved photonic state violates the CH inequality without post-selection, whereas below it, this unconditional CH violation is no longer obtained. This framework provides a quantitative theoretical description of entanglement retention, retrieved photonic density operators, and protocol-dependent Bell-test benchmarks in EIT quantum memories.

18.
medRxiv (Medicine) 2026-06-22

Brain-gut axis imaging, motion correction with 11C-carfentanil total-body PET

Background: Mu-opioid receptors (MORs) are expressed throughout the body including in the brain and gastrointestinal (GI) tract. Total-body PET imaging of the brain and GI tract offers a promising approach for cross-sectional in vivo evaluation of the MOR brain-GI axis. However, intestinal motility and bladder filling introduce motion throughout the GI tract over the scan window. Here we establish analysis methodology to account for motion for dynamic imaging of the brain-GI axis, to further characterize peripheral MORs throughout the body and provide a framework for semi-automatic total-body PET modeling. Methods: 4 subjects underwent 90-min dynamic [11C]-carfentanil (cfn) total-body PET acquisitions at baseline, after intravenous naloxone (central antagonist) administration, and after orally administered loperamide (peripheral agonist and P-glycoprotein substrate). Thalamic MOR availability was measured using the Logan reference tissue model. Using CT-based segmentation, the GI tract was subdivided into anatomical segments, in addition to other peripheral organs (e.g., liver, psoas muscle). Frame-by-frame semi-automatic motion correction was performed with three distinct reference frames (11-14 min post-injection, p.i., 35-40 min p.i., and 85-90 min p.i.). The performance of these three were compared to manual correction. Compartment modeling and Logan graphical analysis were performed to estimate relevant kinetic parameters (K1, VT, VTLogan). Results: Across the 4 subjects and regions, kinetic parameter estimates were highly correlated (r>0.7) for K1, VT and VT Logan when comparing semi-automatic (reference frame at 35-40 min p.i.) and manual correction. With semi-automatic motion correction, graphical-based estimation of VTLogan in the gastrointestinal tract was significantly decreased with loperamide relative to baseline (p

19.
arXiv (math.PR) 2026-06-11

Marked random graphs with given degree sequence: large deviations on the local topology

arXiv:2401.00351v2 Announce Type: replace Abstract: We investigate the behavior of the empirical neighborhood distribution of marked graphs in the framework of local weak convergence. Here we extend known results by considering uniform random graphs with given degree sequences and i.i.d. marks on half-edges and vertices. We establish a large deviation principle for such families of empirical measures. The proof builds on Bordenave and Caputo's seminal 2015 paper, and Delgosha and Anantharam's 2019 introduction of BC entropy, relying on combinatorial lemmas that allow one to construct suitable approximations of measures supported on marked trees. Possible applications of these results are in the study of interacting diffusions on top of random graphs.

20.
arXiv (quant-ph) 2026-06-19

Universality in Ionic Three-body Systems Near an Ion-atom Feshbach Resonance

arXiv:2511.00325v3 Announce Type: replace-cross Abstract: We calculate bound and scattering properties of a system of two neutral atoms and an ion near an atom-ion Feshbach resonance. Our results indicate that long-range atom-ion interactions lead to significant deviations from universal behavior derived from contact or van der Waals potentials. We find that ionic systems display an overall suppression of inelastic transitions leading to recombination rates and lifetimes of Efimov state orders of magnitude smaller with respect to those for neutral atoms. We further characterize the dense spectra of triatomic molecular ions with extended lifetimes. Our results provide a deeper insight on the universality and structure of three-body ionic systems and establishing them as a promising platform for exploring novel few- and many-body phenomena with long-range interactions.

21.
arXiv (CS.CL) 2026-06-11

GraspLLM: Towards Zero-Shot Generalization on Text-Attributed Graphs with LLMs

Research on Text-Attributed Graphs (TAGs) has gained significant attention recently due to its broad applications across various real-world data scenarios, such as citation networks, e-commerce platforms, social media, and web pages. Inspired by the remarkable semantic understanding ability of Large Language Models (LLMs), there have been numerous attempts to integrate LLMs into TAGs. However, existing methods still struggle to generalize across diverse graphs and tasks, and their ability to capture transferable graph structural patterns remains limited. To address this, we introduce the GraspLLM, a framework that combines Graph structural comprehension with semantic understanding prowess of LLMs to enhance the cross-dataset and cross-task generalizability. Specifically, we represent node texts from different graphs in a unified semantic space with a frozen general embedding model, on top of which we perform motif-aware contrastive learning across multiple motif-induced adjacency matrices to extract dataset-agnostic structural information. Then, with our proposed optimal contextual subgraph, we extract the most contextually relevant subgraph for each target node and align these subgraphs to the token space of LLM via an alignment projector. Extensive experiments on TAG benchmark datasets spanning diverse domains reveal that GraspLLM consistently outperforms previous LLM-based methods for TAGs, especially in zero-shot scenarios, highlighting its strong generalizability across different datasets and tasks. Our code is available at https://github.com/Heinz217/GraspLLM.

22.
arXiv (CS.LG) 2026-06-16

NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models

arXiv:2602.06694v3 Announce Type: replace Abstract: Weight-only quantization has become a standard approach for efficiently serving large language models (LLMs). However, existing methods fail to efficiently compress models to binary (1-bit) levels, as they either require large amounts of data and compute or incur additional storage. In this work, we propose NanoQuant, the first post-training quantization (PTQ) method to compress LLMs to both binary and sub-1-bit levels. NanoQuant formulates quantization as a low-rank binary factorization problem, and compresses full-precision weights to low-rank binary matrices and scales. Specifically, it utilizes an efficient alternating direction method of multipliers (ADMM) solver to precisely initialize latent binary matrices and scales, and then tunes the initialized parameters through a block and model reconstruction process. Consequently, NanoQuant establishes a new Pareto frontier in low-memory post-training quantization, and enables sub-1-bit compression. NanoQuant makes large-scale deployment feasible on consumer hardware. For example, it compresses Llama2-70B by 25.8$\times$ in just 13 hours on a single H100, enabling a 70B model to operate on a consumer 8 GB GPU. Code is available at https://github.com/SamsungLabs/NanoQuant.

23.
arXiv (math.PR) 2026-06-16

Experimentation for Different Scheduling Policies on Queues: Mixed Differences-in-Q Estimators Based on Little's Law

arXiv:2605.29641v2 Announce Type: replace-cross Abstract: In data centers, tasks are dispatched to various servers to evenly distribute the workload. When a data center considers implementing a new scheduling algorithm, it typically conducts an A/B test prior to deployment to assess the real-world impact of this new method. However, a straightforward A/B test might be interfered with so-called ``Markovian'' interference. We utilized the Differences-in-Q estimator, as developed by Farias et al. (2022), and introduced mixed Differences-in-Q estimators grounded in Little's Law. We show that our A/B testing methods significantly reduce bias and variance when testing various scheduling policies. Extensive simulations were conducted under scenarios like non-stationary arrival rates, heterogeneous service rates, and communication delays. These simulations highlight the robustness and efficacy of our A/B testing approach.

24.
arXiv (CS.LG) 2026-06-12

From geometry to dynamics: Learning overdamped Langevin dynamics from sparse observations with geometric constraints

arXiv:2512.23566v2 Announce Type: replace-cross Abstract: How can we learn the laws underlying the dynamics of stochastic systems when their trajectories are sampled sparsely in time? Existing methods either require temporally resolved high-frequency observations, or rely on geometric arguments that apply only to conservative systems, limiting the range of dynamics they can recover. Here, we present a new framework that reconciles these two perspectives by reformulating inference as a stochastic control problem. Our method uses geometry-driven path augmentation, guided by the geometry in the system's invariant density to reconstruct likely trajectories and infer the underlying dynamics without assuming specific parametric models. Applied to overdamped Langevin systems, our approach accurately recovers stochastic dynamics even from extremely undersampled data, outperforming existing methods in synthetic benchmarks. This work demonstrates the effectiveness of incorporating geometric inductive biases into stochastic system identification methods.

25.
arXiv (CS.CV) 2026-06-17

Query-Efficient Video Adversarial Attack with Stylized Logo on Service Computing

In service computing, video classification has become fundamental to many intelligent applications. While Deep Neural Networks (DNNs) have demonstrated excellent performance in recognizing video content, recent studies have shown that DNNs are highly vulnerable to adversarial examples. Thus, understanding adversarial attacks can better respond to emergency situations. In order to improve attack performance, many style-transfer-based attacks and patch-based attacks have been proposed. However, the global perturbation of the former will bring unnatural global colors, while the latter is difficult to achieve success in targeted attacks due to the limited perturbation space. Moreover, compared to a plethora of methods targeting image classifiers, video adversarial attacks remain relatively underexplored. Therefore, to generate adversarial examples with a low budget and to provide them with a higher verisimilitude, we propose a novel black-box video attack framework, called Stylized Logo Attack (SLA). SLA is conducted through three stages. The first stage involves building a style reference set for logos, which can not only make the generated examples more natural, but also carry more target class features in targeted attacks. Then, Reinforcement Learning is employed to determine the style reference and position parameters of the logo within the video, which ensures that the stylized logo is placed in the video with optimal attributes. Finally, perturbations are optimized in a step-by-step manner so as to improve the fooling rate. Experimental results indicate that SLA can achieve better performance than state-of-the-art methods and still maintain good deception effects when facing various defense methods. We believe SLA can raise awareness among the security community about the reliability and security of video classification systems and serve as a memorandum of possible attack methods.