Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.LG) 2026-06-24

New Bounds for the Last Iterate of the Stochastic subGradient Method

arXiv:2606.24879v1 Announce Type: cross Abstract: We study the last iterate of the stochastic subgradient method for one-dimensional convex Lipschitz objectives. For a fixed horizon $n$, we consider the standard fixed stepsizes $\eta =\Theta(1/\sqrt n)$. We prove that, for such stepsize policies, under additive i.i.d. subgradient noise with uniformly bounded variance, the last iterate features an optimization error of order $1/\sqrt n$, thereby removing the extra $(\log n)$ factor present in existing generic bounds. On the other hand, we show that without the i.i.d. assumption, the optimization error can be of order $(\log n)/\sqrt n$. Thus, under the uniformly bounded variance assumption alone, the last iterate of SsGM is suboptimal even in dimension one, resolving negatively an open problem posed in Koren and Segal, COLT, 2020.

02.
arXiv (CS.LG) 2026-06-15

A Unified Framework for Structured Flow Modeling: From Representation to Verification and Model Discovery

arXiv:2605.18250v3 Announce Type: replace-cross Abstract: Many dynamical systems can be described in terms of structured flows combining source/sink behavior, cyclic dynamics, and topology-constrained transport. These features arise across a wide range of physical, engineered, and data-driven systems. The objective of this work is to establish a unified perspective on such systems, to identify modeling approaches that balance expressivity, interpretability, computational complexity, and data requirements, and to investigate how highly expressive models can be used to uncover the dominant mechanisms underlying observed dynamics. Starting from the Helmholtz-Hodge decomposition of continuous vector fields, we review the recently proposed Graph Vector Field (GVF) framework and its discrete representation on simplicial complexes. We then introduce a hierarchy of alternative approaches, including parametric conditional models, linear graph dynamical systems, and reduced Hodge representations. Finally, we propose a verification and validation methodology based on benchmark datasets from well-understood physical systems and on systematic model-reduction and ablation studies. The resulting family of structured-flow models within a common framework, ranging from low-dimensional parametric representations to full GVF formulations, supports a diagnostic methodology in which gradient, curl, harmonic, and topological contributions are systematically assessed through ablation studies. This process enables the identification of dominant mechanisms underlying the observed dynamics and guides the construction of simplified models tailored to the available data and operational constraints. By separating structural verification, behavioral verification, and domain-specific validation, the proposed approach provides a foundation for scalable and interpretable analysis of complex dynamical systems across multiple application domains.

03.
arXiv (CS.LG) 2026-06-24

Real vs. Complex Spectral Bases for Neural Operators: The Role of Green's Function Alignment

arXiv:2606.24851v1 Announce Type: new Abstract: Fourier Neural Operators (FNO) learn solution operators of partial differential equations by parameterizing global convolutions in the complex Fourier domain. For real-valued PDE solutions, the complex FFT carries representational redundancy through conjugate symmetry. We introduce the Hartley Neural Operator (HNO), the exact real-valued mirror of FNO: it replaces the FFT with the purely real Discrete Hartley Transform and learns a single real multiplier per retained spectral mode, with no complex arithmetic. Because the real Hartley spectrum is not halved by conjugate symmetry, HNO retains twice as many frequency corners as FNO but one real weight where FNO carries a complex pair, so the two operators are iso-parametric at equal width and differ only in spectral basis. Our central thesis is that the best basis is a property of the operator. Self-adjoint elliptic operators (Poisson, biharmonic) have real, symmetric Green's functions that the real Hartley multiplier diagonalizes exactly, and HNO is favored there. Time-dependent operators carry phase, from oscillation in the wave equation to transport in advection, Burgers, and Navier-Stokes, which a real diagonal multiplier cannot represent, so FNO is favored there, and increasingly so with the operator's phase content, leaving the phaseless heat equation as the borderline case. Training both operators identically and benchmarking across PDE classes, initial-condition families, and boundary conditions, we find an elliptic-versus-time-dependent split that is monotone in operator phase content and matches the Green's-function theory we develop. Rather than a universal winner, our findings give a predictive rule: match the spectral basis to the symmetry of the solution operator.

04.
arXiv (CS.CL) 2026-06-11

Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code

Large Language Models (LLMs) are increasingly used for code generation, raising concerns that they may be misused to produce malicious code. Meanwhile, Grammar-Constrained Decoding (GCD) has been widely adopted to improve the reliability of LLM-generated code by enforcing syntactic validity. In this paper, we reveal a counterintuitive risk: this reliability-oriented technique can itself become an attack surface. We uncover a new jailbreak attack, termed CodeSpear, that exploits GCD to induce LLMs into generating malicious code. Our experiments show that simply applying a benign code grammar constraint can effectively jailbreak LLMs. To address this vulnerability, we propose CodeShield, a safety alignment approach that robustly preserves safe behavior even under attacker-controlled grammar constraints. CodeShield aligns the model in the code modality by teaching it to generate honeypot code under GCD. Such code is semantically harmless, so it does not implement the malicious request, and structurally diverse, so it is difficult to suppress through grammar tightening. At the same time, CodeShield still preserves natural-language refusals when natural language is available. Experiments on 10 popular LLMs across 4 benchmarks show that CodeSpear outperforms representative jailbreak baselines and increases the attack success rate by more than 30 percentage points on average. CodeShield also restores safety under CodeSpear while preserving benign utility. Our findings reveal a fundamental risk of GCD and call for greater attention to its potential security implications.

05.
medRxiv (Medicine) 2026-06-24

Differential COVID-19 Outcomes Across Lysosomal Disorders

Background Lysosomal disorders (LDs) are a heterogeneous group of rare inherited disorders characterized by multi-system involvement and high comorbidity burden, which raises concerns about severe COVID-19 outcomes. Conversely, because SARS-CoV-2 relies on endolysosomal pathways for cellular entry and replication, certain LDs may exert a protective effect against viral pathogenesis. Prior clinical evidence investigating LDs and severe SARS-CoV-2 infection has been limited by small sample sizes and inconsistent findings. Therefore, to resolve these conflicting biological hypotheses and estimate population-level outcomes, we conducted a large-scale retrospective cohort study using nationwide U.S. harmonized electronic health record data from the National Clinical Cohort Collaborative (N3C). This design utilized longitudinal records starting January 1, 2018, to evaluate COVID-19 infections captured between January 1, 2020, and July 11, 2024. Results The study included 16,380 individuals, comprising 5,460 patients with lysosomal disorders and 10,920 matched controls. Patients with LDs had significantly higher odds of COVID-19 hospitalization compared with controls (OR = 1.86, 95% CI: 1.70-2.04). Elevated odds were observed across the evaluated categories, but varied substantially. Notably, neurodegenerative LDs such as neuronal ceroid lipofuscinosis (OR = 9.32) and metachromatic leukodystrophy (OR = 2.33) remained associated with hospitalization after adjustment for comorbidities. Contrarily, the elevated odds for Fabry disease and Gaucher disease were no longer significant after adjustment. Mortality among hospitalized patients with LDs was comparable to that of matched controls (one-year survival: 82.1% vs 82.0%), suggesting that LD status does not independently worsen survival once hospitalization occurs. Conclusions Patients with LDs were at an increased odds of COVID-19 hospitalization, driven by a combination of elevated comorbidity burden and disorder-specific effects, which vary significantly across LD categories. This study clarifies that excess risk is concentrated in the transition to hospitalization. These patients may thus require personalized clinical care to mitigate the negative consequences of COVID-19.

06.
arXiv (CS.LG) 2026-06-18

Provable quantum speedups for computing persistence in topological data analysis

arXiv:2410.21258v2 Announce Type: replace-cross Abstract: Topological data analysis (TDA) aims to extract noise-robust features from a data set by examining the number and persistence of holes in its topology. We provide an efficient quantum algorithm for a computational problem closely related to a core task in TDA – determining whether a given hole persists across different length scales. Further, we prove the problem itself is $\mathsf{BQP}_1$-hard, implying that a classical solution is extremely unlikely; this stands in contrast to all previous quantum approaches to TDA, where the problems were also intractable for quantum computers, or where a rigorous proof of classical hardness still remains open. This result implies an {exponential} quantum speedup for this problem under standard complexity-theoretic assumptions. Our approach relies on encoding the persistence of a hole in a variant of the guided sparse Hamiltonian problem, where the guiding state is constructed from a harmonic representative of the hole.

07.
arXiv (quant-ph) 2026-06-16

Sharp Transitions for Subsystem Complexity

arXiv:2510.18832v2 Announce Type: replace-cross Abstract: The circuit complexity of time-evolved pure quantum states grows linearly in time for an exponentially long time. This behavior has been proven in certain models, is conjectured to hold for generic quantum many-body systems, and is believed to be dual to the long-time growth of black hole interiors in AdS/CFT. Achieving a similar understanding for mixed states remains an important problem. In this work, we study the circuit complexity of time-evolved subsystems of pure quantum states. We find that for greater-than-half subsystem sizes, the complexity grows linearly in time for an exponentially long time, similarly to that of the full state. However, for less-than-half subsystem sizes, the complexity rises and then falls, returning to low complexity as the subsystem equilibrates. Notably, the transition between these two regimes occurs sharply at half system size. We use holographic duality to map out this picture of subsystem complexity dynamics and rigorously prove the existence of the sharp transition in random quantum circuits. Furthermore, we use holography to predict features of complexity growth at finite temperature that lie beyond the reach of techniques based on random quantum circuits. In particular, at finite temperature, we argue for an additional sharp transition at a critical less-than-half subsystem size. Below this critical value, the subsystem complexity saturates nearly instantaneously rather than exhibiting a rise and fall. This novel phenomenon, as well as an analogous transition above half system size, provides a target for future studies based on rigorous methods.

08.
arXiv (CS.AI) 2026-06-15

HierSVA: A Data Synthesis Pipeline, Dataset, and Benchmark for LLM-Driven Hierarchical Hardware Formal Verification

arXiv:2606.13706v1 Announce Type: cross Abstract: We present HierSVA, an integrated suite that combines a pipeline, dataset, and benchmark for LLM-driven hierarchical hardware formal verification. HierSVA-SP pairs an RTL preprocessing toolchain with an LLM-in-the-loop formal verification flow to produce reference SystemVerilog Assertions (SVA) on hierarchical RTL. Applying it to BaseJump STL yields HierSVA-DS, a dataset of 342 modules, with hierarchy metadata and depths 0–9, accompanied by a deep subset of 28 module-bug pairs with natural-language specifications and bug variants. HierSVA-B decomposes assertion quality into six metric axes: syntax correctness, assertion proof success rate, vacuity, specification faithfulness, mutation coverage, and formal core coverage. Applying HierSVA-B to twelve recent LLMs reveals three findings. First, the module-level compile rate is 67.1\%; among generated assertions in evaluable runs, 82.1\% prove non-vacuously, but the corresponding assertion sets detect only 70.2\% of eligible injected faults and cover 36.2\% of the formal core. Second, on 211 evaluable model–module entries in the deep subset, assertion sets flag buggy RTL with 0.87 recall, but 40\% of predicted-buggy outcomes are false positives on correct RTL, limiting precision to 0.60. Third, agentic mode improves S1-style provability and strength metrics, but gains plateau and oscillate. Codes and artifacts are available at \href{https://github.com/HierSVAAnon/HierSVACodeAndArtifacts}{https://github.com/HierSVAAnon/HierSVACodeAndArtifacts}. Dataset is available at \href{https://huggingface.co/datasets/AnonymousHierSVA/HierSVA}{https://huggingface.co/datasets/AnonymousHierSVA/HierSVA}.

09.
arXiv (CS.CL) 2026-06-25

Space-Efficient Language Generation in the Limit

We initiate a resource-aware theory of language generation in the limit under the minimal constraint of space efficiency. In our framework, a learner observes an adversarial positive stream from a target language $K$ and must eventually output a hallucination-free hypothesis language $L \subseteq K$ while omitting at most $\Delta$ strings of $K$. We focus on $\mathcal{C}_{s,k}$, the collection of languages recognized by DFAs with at most $s$ states over an alphabet of size $k$, as the natural hypothesis class for memory-bounded learners. In the exponential-space regime, we prove that a learner can exactly identify the target $K$. Under a stricter memory budget, we characterize the strongest possible generation guarantees. In particular, we present a streaming algorithm using $\mathrm{poly}(s,k)$ space that converges to a hypothesis with generation gap $\Delta = O(k^{2s-2})$. Moreover, the learned hypothesis captures every string in $K$ of length at least $2s-1$. We complement this result with a near-matching lower bound through a reduction from a standard communication complexity problem. Specifically, achieving generation gap $\Delta \le k^{(1-\varepsilon)s}$ requires $k^{\Omega(\varepsilon s)}$ memory. Together, these results reveal a sharp transition between polynomial-space generation and exponential-space exact identification.

10.
arXiv (CS.CV) 2026-06-24

FlowDec: Temporal Conditional Flow Decorruptor for Robust Continuous Vision-Language Navigation

Vision-and-Language Navigation in Continuous Environments (VLN-CE) requires agents to follow natural-language instructions in unseen scenes. While Large Models (LMs) have advanced VLN-CE, their performance remains severely degraded by real-world visual corruptions, a critical yet underexplored domain constraint. We introduce Temporal Conditional Flow Decorruptor (FlowDec), a novel image restoration framework tailored for LM-based VLN-CE. FlowDec integrates a hybrid temporal conditioning strategy to align the generative flow path with historical context and employs action-centroid guided filtering to dynamically assess and integrate outputs. Extensive experiments demonstrate that FlowDec outperforms state-of-the-art decorruption methods in both navigation accuracy and generation latency. Our approach establishes a robust, efficient paradigm for resilient embodied navigation in unpredictable real-world conditions.

11.
arXiv (CS.CL) 2026-06-25

Probing in the Wild: A Case Study of Self-Supervised Speech Representations on Mandarin Sub-dialects with Unsupervised Articulatory Analysis

While self-supervised speech models have achieved strong performance across speech tasks, relatively little is known about how their internal phonetic representations behave under fine-grained dialect variation. Existing probing studies typically rely on curated corpora with manual phonetic annotations, limiting their applicability to naturally occurring dialect speech. We present a case study of articulatory feature representations in a Mandarin self-supervised speech model using an entirely unlabeled probing pipeline. Phone sequences are generated using a language-agnostic universal phone recognizer and mapped to articulatory feature vectors, enabling frame-level probing without manual annotation. Our results reveal a structured pattern in articulatory feature decodability across Mandarin sub-dialects. Acoustically salient features such as labiality and stridency remain comparatively stable, whereas features associated with finer spectral distinctions exhibit larger dialect-dependent variation. This variation is driven primarily by elevated decodability for Beijing speech relative to other Mandarin sub-dialects. Layer-wise analyses further show distinct representational dynamics for these feature groups. These findings suggest that language-agnostic articulatory probing can be applied to real-world dialect corpora and that dialect sensitivity in self-supervised speech representations is unevenly distributed across articulatory dimensions.

12.
arXiv (CS.AI) 2026-06-16

Z-Plane Neural Networks: Bounded Geometric Activation Replaces ReLU and LayerNorm

arXiv:2606.15669v1 Announce Type: cross Abstract: Modern deep neural networks rely on Euclidean scalar activations (e.g., ReLU) and global normalization techniques (e.g., LayerNorm) to prevent gradient instability in deep architectures. However, these mechanisms inherently cause dead neurons, discard critical directional information, and destroy the orthogonality of feature representations. Inspired by the frequency-modulation transmission of biological axons, we propose the Z-Plane Neural Network, which maps hidden states into 2D phasor bundles on a hypersphere. We introduce a novel geometric activation function, Radial Bounding($\mathbf{x} / \max(1, \|\mathbf{x}\|_2)$), which limits the energy magnitude while preserving the phase (direction). We demonstrate mathematically that this isotropic activation maintains 1-Lipschitz continuity and prevents gradient vanishing by preserving tangential gradients. Empirically, a 100-layer Z-Plane Multi-Layer Perceptron (MLP)-entirely devoid of ReLU and LayerNorm-successfully converges on the MNIST dataset with 98.34% accuracy and absolute numerical stability, proving that bounded geometric activation alone is sufficient for stable deep learning.

13.
arXiv (CS.CV) 2026-06-19

Holo-World: Unified Camera, Object and Weather Control for Video World Model

Video world models are moving toward preserving an observed world under controllable camera and object motion while allowing its environmental state to change. Yet these controls remain isolated, and weather generation typically relies on a source video or reconstructed scene that already specifies future structure. We study a first-frame-anchored source-to-state setting, where the model starts from a single image and follows explicit camera and object controls and an optional weather instruction, then generates a video that either preserves the source world or transfers it to a target weather state. To address these challenges, we first build HoloStateData, a state video dataset that turns diverse videos into unified control samples for camera, object, and weather supervision. Second, we introduce Holo-World, a unified controllable video world model that jointly controls scene from a single image. Its Unified Scene Adapter factorizes world preservation and weather transfer into distinct parameter subspaces, using rendered background, geometry buffers, and object controls to maintain controlled scene structure while modeling weather-dependent appearance and particle effects. Additionally, Scene-Weather Decomposed CFG guides scene and weather residuals separately, strengthening target weather effects without over-amplifying the full condition. Quantitative and qualitative experiments demonstrate that Holo-World maintains precise camera and object control with consistent scene structure while transferring scenes into diverse target weather state, outperforming video-to-video weather editing baselines on weather-state generation. Our project page is available at \url{https://xiangchenyin.github.io/Holo-World/}.

14.
arXiv (CS.CV) 2026-06-17

ActWorld: From Explorable to Interactive World Model via Action-Aware Memory

Interactive world models aim to simulate environment dynamics under real-time user actions. However, their action vocabulary is largely confined to navigation: most actions correspond to motion (e.g., walk, turn, look around), while interaction with objects in the scene (e.g., pick up plates, open doors, or trigger physical responses) is either absent, restricted to game domains, or relegated to prompt-to-full-video scenarios. The resulting worlds are visually explorable but not truly actionable. In this work, we present ActWorld, an interactive world model that extends prior navigation-centric generators to support mid-rollout object interaction within a chunk-autoregressive framework. We argue that the navigation-interaction gap stems from two bottlenecks. First, a data bottleneck: the lack of human-object interaction data with accurate, dense labels. Second, a memory bottleneck: recency-biased history compression in existing world models discards the event-transition frames that causally determine subsequent object states, leading to an action-forgetting pathology. On the data side, we construct a 100K interaction video dataset, each annotated with per-chunk captions via chain-of-thought reasoning. On the model side, we introduce a hierarchical action-aware memory design that routes history compression by interaction importance, complemented by a persistent memory bank that maintains event-update and object-identity tokens across long rollouts. Experiments show that ActWorld supports both flexible navigation and rich object interaction within a single model, substantially improving interaction fidelity over navigation-only baselines without sacrificing viewpoint control. Project page is available at https://interactwm.github.io/ActWorld.

15.
arXiv (CS.AI) 2026-06-24

Beyond the Autoregressive Horizon: A Comprehensive Survey of Diffusion Models, World Modelling, and State Space Models for Code

arXiv:2606.23690v1 Announce Type: cross Abstract: Autoregressive (AR) language models have driven significant progress in automated software engineering, enabling powerful code generation and assistance systems. However, the next-token prediction paradigm introduces structural limitations for code reasoning, including restricted global planning, challenges in maintaining long-range dependencies, and limited grounding in program execution semantics. Noting the heavy skewness of existing literature towards AR models, we discuss emerging paradigms that could potentially overcome the logic and scaling bottlenecks of next-token prediction by unlocking next-generation architectural capabilities for code intelligence. Specifically, we discuss the potential of Diffusion Models, which generate code via holistic denoising that captures long-range syntactic constraints often missed by AR models. We also discuss Code World Models (CWMs), which simulate execution states to support reasoning, and State Space Models (SSMs), which provide linear-time efficiency for massive contexts. By connecting these developments with findings from cognitive neuroscience, we outline directions for developing "System 2" code generation agents.

16.
arXiv (CS.LG) 2026-06-11

Bergson: An Open Source Library for Data Attribution

arXiv:2606.11660v1 Announce Type: new Abstract: Data attribution is a promising field in interpretability that aims to explain model behavior through the influence of its training data, with applications including debugging undesirable model behavior and training dataset curation. However, significant engineering effort is required to perform it at scale, and many cutting edge techniques lack open-source tooling and support. Bergson is an open source library that aims to enable faster progress in the field by providing a host of techniques that scale to very large language models and pre-training datasets. The library natively supports on-disk gradient stores and multi-node distributed training, and provides quality of life tools for researchers. Finally, we introduce the first open-source implementations of three leading data attribution methods: MAGIC, SOURCE, and TrackStar. The library is available at https://github.com/EleutherAI/bergson .

17.
medRxiv (Medicine) 2026-06-18

Early-life Urban Environment, Nutrition, and Pubertal Timing in Southern Europe: An Exposome Analysis

Background: Urban environmental and lifestyle factors during early life may influence pubertal timing, but the combined effects of multiple environmental exposures within an exposome analytical framework remain poorly understood. Objective: To examine the association between early-life urban environmental exposures and pubertal timing, and to explore whether these exposures interact with early-life nutritional factors, namely breastfeeding duration and childhood diet quality. Methods: Data from two European population-based birth cohorts were analysed: Generation XXI (G21, Portugal; n=5263; 51.5% girls) and INfancia y Medio Ambiente (INMA, Spain; n=1019; 50.1% girls). Urban environmental exposures including indicators of air pollution, traffic, built environment, and natural spaces were estimated at 4 early-life stages at both cohorts: pregnancy (INMA only), birth, 1 year, and 4-5 years of age. Pubertal development timing was assessed using Tanner staging and/or the Pubertal Development Scale (PDS), and age at menarche was self-reported. Exposome-Wide Association Study (ExWAS) models and unsupervised clustering followed by ordinal logistic regression models were used to examine single- and multi-exposure associations, respectively. Regression models were fitted adjusting for relevant child characteristics, maternal factors, and household socioeconomic conditions, and corrected for multiple testing. Results: Individuals living in more unfavourable urban environments characterised by higher building density, air pollution, and lower access to natural spaces showed earlier pubertal timing according to multiple outcomes, across multiple early-life exposure periods, and in both cohorts. In the G21 cohort, these environmental profiles were associated with earlier age at menarche, particularly for exposures at 1-1.5 and 4-5 years (e.g., 1-1.5y: {beta}=-0.172, FDR-adjusted p-value=0.041), while in the INMA cohort, boys exposed to more unfavourable environmental profiles showed more advanced pubertal development, also particularly for exposures at 1-1.5 and 4-5 years of age (e.g., 1-1.5y; {beta}=0.572, FDR-adjusted p-value=0.008). Among environmental domains, air pollution and traffic were the factors most consistently associated with pubertal timing. Regarding early-life nutritional factors, longer duration of exclusive breastfeeding was associated with a lower Tanner stage among girls in G21. No significant interactions between breastfeeding duration and environmental exposure clusters were observed. Conclusion: Early-life urban environmental exposures, particularly air pollution and traffic, may influence pubertal timing. Exclusive breastfeeding may have a protective role against earlier pubertal development. These findings highlight the importance of improving urban environmental conditions and promoting breastfeeding to support healthy developmental trajectories.

18.
arXiv (CS.CV) 2026-06-17

MOCHI: Motion Enhancement of Collaborative Human-object Interactions

Collaborative human-object interaction shows dynamic and complex movements that require mutual anticipation and continuous adjustment between participants and the shared object. Modeling such collaborative multi-human object interaction (MHOI) scenarios requires high-quality data acquisition as a foundational step; however, this is challenging due to the inherent complexity of MHOI where human-human and human-object interactions occur simultaneously. Such complexity leads to noisy MHOI captures characterized by several artifacts: contact misalignment between hands and objects, motion jitter and temporal inconsistencies in the captured sequences, and missing or incomplete finger-level articulation details. To address these challenges, we present MOCHI (MOtion Enhancement of Collaborative Human-object Interactions), a two-stage framework for enhancing noisy MHOI data. Our approach first generates physically plausible hand grasps through optimization from noisy body input, producing grasps that are both physically plausible and semantically consistent with the body pose, where these optimized grasps are extended into complete hand-object interaction sequences. Consequently, the full-body motion for all participants are refined through a diffusion-based noise optimization framework that uses single-person motion priors. During the optimization process, we introduce optimization objectives to encode human-object and human-human interaction information within these single-person priors. Experimental results demonstrate the effectiveness of our pipeline across diverse MHOI data, either acquired by existing capture methods or synthesized by generative models. We further show robustness of our system across varying numbers of participants and types of interactions, and demonstrate various applications including keyframe-based MHOI creation and data augmentation through varying object geometries.

19.
arXiv (CS.CL) 2026-06-17

Structural Role Injection in Handlebars-Templated LLM Prompts: Triple-Brace Interpolation, Delimiter Family, and the Limits of HTML Auto-Escaping

Large language model applications build prompts from templates, and Handlebars is a widely used templating engine and the default prompt-template format in Microsoft Semantic Kernel. Its double-brace {{x}} expression HTML-escapes the interpolated value and is documented as the safe default; its triple-brace {{{x}}} expression inserts the value raw. We show that this choice silently governs an application's exposure to structural role injection, where attacker-controlled data carries chat role delimiters that forge a higher-privilege turn. A model-free analysis establishes the mechanism: Handlebars escaping rewrites angle brackets but not square brackets, colons, or Markdown hashes, so it neutralises ChatML, Llama-3, and XML role delimiters (survival rate 0.00) while leaving Llama-2 [INST], legacy Human:/Assistant:, and Markdown ### delimiters intact (survival rate 1.00 for the last two). We then run 5760 trials across seven delimiter families, two attack objectives, and four models (GPT-3.5 Turbo, GPT-4o mini, GPT-4.1 mini, Claude Haiku 4.5) at a combined API cost of 1.63 USD. GPT-3.5 Turbo follows the task-hijack instruction in 97% of raw and 91% of escaped trials, with the escaping protection concentrated in the angle-bracket families and absent for the colon- and Markdown-based families; the harder secret-exfiltration objective, which does not saturate, exposes the same family interaction more cleanly. Claude Haiku 4.5 resists both objectives almost entirely. The escaped default protects only the delimiter schemes whose characters HTML escaping happens to cover, gives no protection for the rest, and cannot substitute for a structural separation of instruction and data.

20.
arXiv (CS.AI) 2026-06-16

Snyk VulnBench JS 1.0: Can LLMs Find the Same Bugs Twice?

arXiv:2606.15762v1 Announce Type: cross Abstract: We ran 300 repeated vulnerability-finding scans to measure how repeatable agentic large language model (LLM) security review is on the same JavaScript code, prompt, and benchmark harness. The headline result is that LLM security findings were unevenly repeatable: reference-matched findings were stable, but extra model reports varied heavily from run to run. Across 250 model runs, 80 of 161 unique unmatched findings appeared in only one of five identical repetitions, while only 22 appeared in all five. By contrast, when Claude matched a Snyk Code reference finding, the behavior was much more stable: 134 of 158 unique reference-matched findings appeared in all five repetitions. The benchmark also shows complementarity. Models consistently found familiar, high-signal exploit shapes, and in one case surfaced a likely Snyk Code product gap. Snyk Code static application security testing (SAST) was deterministic and better at systematically enumerating repeated data-flow sinks. The results support combining agentic LLM review with deterministic SAST rather than treating either technique as a replacement for the other.

21.
arXiv (CS.CL) 2026-06-25

SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

Sparse Mixture-of-Experts (MoE) architectures have emerged as an increasingly influential paradigm as they offer a strategic balance between parameter scalability and computational efficiency. However, low-resource languages, which suffer from a scarcity of high-quality training data, often have their tokens routed to different experts than those predominantly activated by high-resource inputs, which limits cross-lingual expert sharing. This cross-lingual routing divergence consequently hinders their efficacy in multilingual contexts. To address this issue, we propose SARA (Semantically Anchored Routing Alignment), a framework designed to transfer specialized capabilities from high-resource languages as anchors to low-resource languages. SARA explicitly aligns the routing distribution of multilingual inputs with high-resource semantic anchors using a symmetric Jensen-Shannon (JS) divergence constraint. Unlike traditional distillation methods that operate on output logits, SARA directly aligns the internal routing distributions of MoE layers, encouraging mechanistic consistency in expert selection across languages. We conduct experiments on 2 LLMs across 5 low-resource languages and 3 benchmarks. Experiment results demonstrate that SARA outperforms standard instruction tuning, e.g., +0.8% on Qwen3-30B-A3B and +1.2% on Phi-3.5-MoE-instruct on Global-MMLU. Further analyses show that SARA effectively addresses performance bottlenecks in low-resource languages, providing a scalable pathway to enhance multilingual capabilities in sparse architectures.

22.
medRxiv (Medicine) 2026-06-15

High Demand, Low Possession: Dilemmas and Strategies for Research Capability Cultivation in Clinical Medicine Postgraduates

Most previous studies have examined medical postgraduate research training from a single dimension, lacking a full-chain analysis that integrates capability demand, actual possession, obstacles, and output. Consequently, the measurement of capability gaps and the analysis of underlying training model deficiencies remain insufficient. To address this gap, we administered a self-designed multidimensional questionnaire to 86 clinical medicine postgraduates at a medical school, covering research cognition, interest, capability demand and possession, participation pathways, difficulties, and outputs. The aim was to systematically characterize the current situation, identify problems, and propose optimization strategies. Over 90% of participants expressed interest in research, yet only 1.16% self-rated as very knowledgeable. The largest demand-possess gap was for writing and publication (86.05% vs. 16.28%), followed by independent research capability (75.58% vs. 11.63%). A total of 59.30% cited lack of foundational knowledge, making experiments very difficult, as the greatest challenge, and 66.28% had no research achievements. The primary source of research topics was supervisor assignment (54.65%), with only 4.65% choosing topics independently. No statistically significant differences were found across grades or training types (P > 0.05). These findings reveal a structural high demand, low possession gap in medical postgraduate research training, with early research experience deficit and a passive research model as key constraining factors. Accordingly, an integrated bachelor-postgraduate progressive research competency training system is proposed.

23.
arXiv (CS.AI) 2026-06-25

SE-AGCNet: An End-to-End Framework for Joint Speech Enhancement and Loudness Control in Meeting Scenarios

arXiv:2606.25959v1 Announce Type: cross Abstract: Conventional audio pipelines typically treat speech enhancement (SE) and automatic gain control (AGC) as discrete modules, which often limits overall performance. For instance, applying AGC before SE may inadvertently amplify background noise, while prioritizing SE tends to over-suppress low-volume speech. To address these limitations, we propose SE-AGCNet, an end-to-end framework that jointly optimizes SE and AGC. Tailored for meeting scenarios with significant volume variations, SE-AGCNet leverages the synergy between the two tasks: SE preserves quiet speech, thereby facilitating effective volume adjustment by the AGC component. Furthermore, we propose a specialized data simulation pipeline, SE-AGC-DataGen, and incorporate standardized loudness evaluation metrics: integrated loudness (LUFS), short-term loudness (St LUFS), and LRA. Experiments show that SE-AGCNet consistently achieves target loudness while improving speech quality and ASR accuracy over competitive baselines.

24.
arXiv (CS.AI) 2026-06-16

An Integrated System for Real-Time Student Assessment and Career Guidance Using Neural Networks in Computing Disciplines

arXiv:2606.15831v1 Announce Type: new Abstract: Many undergraduate students in Computer Science (CS) and Software Engineering (SWE) struggle to identify suitable career paths, particularly when their academic performance, abilities, and interests do not fully align. To address this issue, this study proposes an AI-driven Student Assessment and Career Prediction System that integrates a Career Guidance Expert (CGE) system with a Web-Based Student Assessment (WBSA) platform. Within the integrated framework, CGE enhances personalized career recommendations using AI while also assisting students after graduation in identifying suitable jobs, research domains, and higher study opportunities aligned with their skills and interests. The WBSA platform further strengthens interaction between students and faculty through assessments, personalized tasks, mentorship activities, and a secure real-time chat application. The CGE system employs a Multilayer Perceptron (MLP) model trained on real-world academic and extracurricular data collected using the snowball sampling method from the students of universities, achieving a validation accuracy of 94.71% in predicting personalized career paths. A pre-survey was conducted across universities to evaluate the proposed model before deployment. The WBSA system was developed as a modern web application using technologies such as Node.js, Next.js, and PostgreSQL to ensure scalability, responsiveness, and secure data management. The overall system is supported by a secure cloud-based infrastructure, the platform provides reliable performance while assisting graduates to select suitable career path in IT sector. In addition, a post-survey involving both students and faculty was conducted to gather feedback and further improve the overall effectiveness and usability of the system.

25.
arXiv (CS.AI) 2026-06-15

Recovering Stranded Discrimination in Knowledge Tracing: Per-Item Bias Correction via Empirical-Bayes Shrinkage

arXiv:2606.14123v1 Announce Type: cross Abstract: Deployed knowledge-tracing models are typically frozen after training, yet systematic per-item logit bias arises, from limited per-item expressivity in backbone architectures and from post-deployment shifts in item properties, degrading prediction quality. Global post-hoc calibrators such as Platt scaling, temperature scaling, and isotonic regression improve probability estimates but leave discriminative ability, as measured by AUC, unchanged. This AUC invariance is a structural consequence of monotone score-only transforms; recovering the stranded discrimination requires conditioning on item identity. We propose SLC (State-space Logit Correction), which converts binary observations to Gaussian pseudo-observations via Laplace/IRLS, applies empirical-Bayes shrinkage through a Kalman smoother, and fits an offset-Platt link. The state-space formulation also yields a detectability bound that characterizes the Bernoulli information floor, explaining why temporal tracking provides no benefit at current data densities. Across four datasets, five backbones, and three seeds, SLC improves AUC on all four datasets and NLL on three, with the advantage concentrating on sparse items. Cross-domain controls suggest that the same phenomenon can arise beyond education when the deployed backbone leaves entity-level bias.