Academic Intelligence · Curated Daily

Explore the Frontiers of Global Research

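AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefs.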

01.
arXiv (CS.CL) 2026-06-15

Right or Wrong, Models Comply: Directional Blindness in LLM Moral Judgment

As language models take integrated roles across many domains, the response of LLMs to user pushback becomes a critical alignment property. Yet many existing evaluations treat compliance as unidirectional, measuring whether models resist pressure but not whether they resist it selectively. We introduce Compliance Asymmetry (A = BCR/HCR), a bidirectional diagnostic that compares beneficial output change under helpful nudges with harmful change under misleading nudges. Across 9 models and 972,000 nudge-condition responses, we find that this selectivity differs in factual and moral judgments: models follow helpful nudges more than harmful ones on factual questions (A = 1.58), but follow both directions at nearly identical rates on moral questions (A = 1.04). This phenomenon persists across model families, capability levels, and nudging types. Interestingly, we also find that chain-of-thought prompting amplifies helpful and harmful compliance together, while identity-based prompting suppresses both by nearly identical margins. These results identify direction-blind moral compliance as a distinct failure mode in current LLMs and suggest that alignment should target directionally calibrated updating rather than lower compliance alone.
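The ratio A = BCR/HCR can be illustrated with a minimal sketch. The abstract does not expand the acronyms, so reading BCR as a "beneficial change rate" and HCR as a "harmful change rate" is an assumption here, as are the names `NudgeCounts` and `compliance_asymmetry` and the example counts, which are made up to reproduce the reported factual-question value.

```python
from dataclasses import dataclass


@dataclass
class NudgeCounts:
    """Hypothetical tallies for one model on one question type."""
    helpful_trials: int      # responses that received a helpful nudge
    beneficial_changes: int  # of those, outputs that changed for the better
    harmful_trials: int      # responses that received a misleading nudge
    harmful_changes: int     # of those, outputs that changed for the worse


def compliance_asymmetry(counts: NudgeCounts) -> float:
    """Return A = BCR / HCR (assumed meaning: beneficial vs. harmful change rate)."""
    bcr = counts.beneficial_changes / counts.helpful_trials
    hcr = counts.harmful_changes / counts.harmful_trials
    if hcr == 0:
        raise ValueError("HCR is zero, so the ratio is undefined")
    return bcr / hcr


# Illustrative counts only; they are not taken from the paper.
factual = NudgeCounts(1000, 395, 1000, 250)
print(round(compliance_asymmetry(factual), 2))
```

Under this reading, A near 1 means the model follows helpful and misleading nudges at about the same rate, which is the direction-blind behaviour the abstract reports for moral questions.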

02.
arXiv (CS.LG) 2026-06-16

Contrastive Regularization for Accent-Robust ASR

arXiv:2605.03297v2 Announce Type: replace-cross Abstract: ASR systems based on self-supervised acoustic pretraining and CTC fine-tuning achieve strong performance on native speech but remain sensitive to accent variability. We investigate supervised contrastive learning (SupCon) as a lightweight, accent-invariant auxiliary objective for CTC fine-tuning. An utterance-level contrastive loss regularizes encoder representations without architectural modification or explicit accent supervision. Experiments on the L2-ARCTIC benchmark show consistent WER reductions across multiple pretrained encoders, with up to 25-29% relative reduction under unseen-accent evaluation. Analysis using within-transcript cosine dispersion indicates that SupCon promotes more compact and stable representation geometry under accent variability. Overall, SupCon provides an effective and model-agnostic regularization strategy for improving accent robustness.
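A minimal NumPy sketch of an utterance-level supervised contrastive loss may help make the objective concrete. Several things are assumptions rather than details from the abstract: positives are taken to be utterances sharing a transcript ID (suggested by the "within-transcript" analysis, but not stated), the temperature of 0.1 is arbitrary, and the function name `supcon_loss` is hypothetical. In training this term would presumably be added to the CTC loss with some weight.

```python
import numpy as np


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of utterance embeddings.

    embeddings: (n, d) array, one pooled encoder vector per utterance.
    labels:     (n,) array; utterances with equal labels are positives
                (assumed here to be transcript IDs).
    """
    z = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    if z.shape[0] < 2:
        return 0.0
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = z.shape[0]
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)  # an anchor never contrasts with itself

    # Row-wise log-softmax over all other utterances in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - row_max - log_norm

    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0  # anchors with at least one positive
    if not valid.any():
        return 0.0

    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())


# Two tight clusters: the loss is lower when labels agree with the geometry.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
print(supcon_loss(emb, [0, 0, 1, 1]) < supcon_loss(emb, [0, 1, 0, 1]))
```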

03.
arXiv (CS.AI) 2026-06-25

Model Forensics: Investigating Whether Concerning Behavior Reflects Misalignment

arXiv:2606.26071v1 Announce Type: cross Abstract: A central goal of safety research is determining whether a model is misaligned. Prior work has largely focused on detecting concerning behavior. But behavior alone does not establish misalignment: a concerning action can arise from benign causes such as confusion. This motivates model forensics: investigating whether the action was driven by malign intent. In this paper, we propose a baseline protocol for model forensics consisting of two steps, iterated as needed. First, we read the chain of thought (CoT) to generate hypotheses about what drives model behavior. Second, we make edits to the prompt or environment to test these hypotheses. While the CoT is not always faithful, it is a rich source of unsupervised insight that can guide the collection of more rigorous evidence. To evaluate our protocol, we create a suite of six agentic environments where models exhibit concerning behavior, and apply it to each. We establish that Kimi K2 Thinking takes shortcuts due to a genuine disposition towards low-effort actions, by showing this hypothesis successfully predicts its behavior. Through counterfactual experiments, we show DeepSeek R1 deceives out of a desire to be consistent with a previous instance of itself. Our methods nonetheless leave significant room for refinement. For example, when we test whether Kimi K2 Thinking believes it is violating user intent, we find no evidence of such a belief, but without positive controls we cannot confirm our tests would detect it. Overall, we find our simple protocol provides a strong baseline that we hope future work will improve upon. More broadly, our work is a concrete step in developing the growing field of model forensics.

04.
medRxiv (Medicine) 2026-06-16

Fidelity-Derived Quantum Dissimilarity-Enhanced k-Nearest Neighbor Algorithm for Arterial Hypertension Prediction

We present a quantum-enhanced version of the classic k-Nearest Neighbors (kNN) classification algorithm, applied to the prediction of arterial hypertension. The traditional Euclidean distance metric of the kNN algorithm is replaced with a fidelity-derived quantum dissimilarity measure to evaluate the similarity between data samples. We map classical real-world clinical and ECG-derived data features into quantum states via dense-angle encoding, which uses parameterized rotation gates to pack multiple features into a minimal number of qubits while maintaining pure states. We evaluate the performance of the dissimilarity measure using both a noiseless statevector simulator and the IBM Qiskit Estimator primitives. The quantum circuit demonstrates robust predictive capabilities comparable to the classical model. While we do not claim computational supremacy over the classical baseline, the framework shows that fidelity-based similarity is a physically meaningful and efficient approach for hybrid quantum-classical classification.
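The pipeline described above can be sketched classically, without a quantum SDK. The sketch assumes the common dense-angle form in which each qubit carries two features as cos(πx₁)|0⟩ + e^{2πi x₂} sin(πx₁)|1⟩, and it assumes the dissimilarity is 1 minus the fidelity; the abstract specifies neither, and all function names are hypothetical. Features are assumed to be pre-scaled so that the encoding does not wrap around.

```python
from collections import Counter

import numpy as np


def dense_angle_qubits(x):
    """Encode an even-length feature vector, two features per qubit.

    Returns an (n_qubits, 2) complex array of single-qubit amplitudes;
    the full state is their tensor product (a pure product state).
    """
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("dense-angle encoding needs an even number of features")
    theta, phi = x[0::2], x[1::2]
    amp0 = np.cos(np.pi * theta).astype(complex)
    amp1 = np.exp(2j * np.pi * phi) * np.sin(np.pi * theta)
    return np.stack([amp0, amp1], axis=1)


def fidelity(x, y):
    """|<psi(x)|psi(y)>|^2; for product states this factorises per qubit."""
    a, b = dense_angle_qubits(x), dense_angle_qubits(y)
    overlaps = np.sum(np.conj(a) * b, axis=1)
    return float(np.prod(np.abs(overlaps) ** 2))


def dissimilarity(x, y):
    """Assumed fidelity-derived dissimilarity: 1 - fidelity."""
    return 1.0 - fidelity(x, y)


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples least dissimilar to query."""
    dists = [dissimilarity(query, x) for x in train_x]
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


# Toy data, not clinical data: two features per sample, scaled into [0, 0.5].
train_x = [[0.05, 0.10], [0.07, 0.12], [0.40, 0.30], [0.42, 0.33]]
train_y = [0, 0, 1, 1]
print(knn_predict(train_x, train_y, [0.06, 0.10], k=3))
```

On hardware or with an estimator primitive the fidelity would be measured rather than computed from amplitudes; the classification step stays the same.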

05.
arXiv (CS.AI) 2026-06-19

Bid Farewell to Seesaw: Towards Accurate Long-tail Session-based Recommendation via Dual Constraints of Hybrid Intents

arXiv:2511.08378v4 Announce Type: replace-cross Abstract: Session-based recommendation (SBR) aims to predict anonymous users' next interaction based on their interaction sessions. In the practical recommendation scenario, low-exposure items constitute the majority of interactions, creating a long-tail distribution that severely compromises recommendation diversity. Existing approaches attempt to address this issue by promoting tail items but incur accuracy degradation, exhibiting a "see-saw" effect between long-tail and accuracy performance. We attribute such conflict to session-irrelevant noise within the tail items, which existing long-tail approaches fail to identify and constrain effectively. To resolve this fundamental conflict, we propose HID (Hybrid Intent-based Dual Constraint Framework), a plug-and-play framework that transforms the conventional "see-saw" into "win-win" through introducing the hybrid intent-based dual constraints for both long-tail and accuracy. Two key innovations are incorporated in this framework: (i) Hybrid Intent Learning, where we reformulate the intent extraction strategies by employing attribute-aware spectral clustering to reconstruct the item-to-intent mapping. Furthermore, discrimination of session-irrelevant noise is achieved through the assignment of the target and noise intents to each session. (ii) Intent Constraint Loss, which incorporates two novel constraint paradigms regarding the diversity and accuracy to regulate the representation learning process of both items and sessions. These two objectives are unified into a single training loss through rigorous theoretical derivation. Extensive experiments across multiple SBR models and datasets demonstrate that HID can enhance both long-tail performance and recommendation accuracy, establishing new state-of-the-art performance in long-tail recommender systems.

06.
arXiv (CS.AI) 2026-06-11

"That's AI Slop, You Bot!" Studying Accusations, Evidence, and Credibility in Online Discourse Towards LLM-Generated Comments

arXiv:2606.12073v1 Announce Type: cross Abstract: Generative AI has made fluent prose cheap to produce, breaking the old promise to readers that good writing meant real thinking. How have readers responded, and what can this tell us about changing anti-AI attitudes? We analyzed 25 million comments from Hacker News and Reddit (2023-2026), combining LLM judgment on 7,500 sampled accusations of AI use, sentiment trajectories, speech-act coding of 300 confirmed accusations of AI use, and a matched-control test of accused versus non-accused parent comments. We found that the pejorative-label share of accusations rose more than tenfold on both platforms while a placebo vocabulary of pre-2022 inauthenticity terms (shill, astroturf) did not. This shift reflected a fast-growing trend of branding any suspicious or seemingly inauthentic prose as "AI slop". The slop frame now constitutes 94 percent of pejorative mentions, with the dominant comments shifting in tone from mockery toward gatekeeping and structural protest. The key surprise comes from a matched-control test which found that prose features that statistically distinguish AI from human text do not predict which human text gets accused as AI. The new accusations work as social gatekeeping of perceived authenticity without actually screening for AI. This research extends signaling theory by showing that substitute signals used socially can grow even when inaccurate if the underlying detection problem cannot be solved at the non-expert level. It shows that AI's effects on writing from the reader side are distinct from those on the production (writer) side. Detection technology cannot resolve this dynamic because the social function of accusations is increasingly to perform social gatekeeping and in-group signaling as opposed to identifying AI-generated writing.

07.
arXiv (CS.CL) 2026-06-24

The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs

Sparse attention offers a promising strategy to extend long-context capabilities in Transformer LLMs, yet its efficiency-accuracy trade-offs remain unclear due to the lack of comprehensive evaluation. We address this gap with the largest-scale empirical analysis to date of training-free sparse attention, evaluating six methods across multiple model families and sizes, sequences up to 128K tokens, and sparsity levels up to 0.95 (i.e., $1/20$ attention budget) on nine diverse tasks. We first organise the rapidly evolving landscape of sparse attention methods into a taxonomy along four design axes. Our analysis then yields actionable insights: 1) sparse attention is effective: larger sparse models outperform smaller dense ones at equivalent cost, improving the Pareto frontier; 2) for the training-free methods we study, fine-grained per-query importance estimation during prefilling remains impractical, due to both the cost of estimation and the lack of sparse kernels that translate fine-grained sparsity into wall-clock gains, which forces a task-dependent choice between global-to-token and block-to-block selection. During decoding, in contrast, token-to-page selection becomes feasible, enabling better generalisation and higher sparsity tolerance; 3) longer sequences tolerate higher sparsity, suggesting that fixed-budget methods in production are suboptimal. Together, these findings provide practical guidance for deploying sparse attention and methodological recommendations for future evaluations. Our code is available at https://github.com/PiotrNawrot/sparse-frontier.

08.
arXiv (CS.CL) 2026-06-17

ConSA: Controllable Sparsity in Hybrid Attention via Learnable Allocation

Hybrid architectures combining full attention (FA) and sliding-window attention (SWA) are a promising paradigm for efficient LLM inference. However, existing methods typically rely on hand-crafted rules or simple post-hoc heuristics for FA/SWA allocation and offer limited analysis of the attention behaviors underlying these designs. We propose Controllable Sparsity in Hybrid Attention (ConSA), a framework that learns optimal FA/SWA assignment under a user-specified sparsity target. ConSA employs L0 regularization to learn binary masks selecting between FA and SWA for each attention unit, while an augmented Lagrangian constraint enforces the target sparsity at either layer or KV-head granularity. We evaluate ConSA on two LLMs at the 0.6B and 1.7B scales. Learned allocations consistently outperform rule-based baselines, with KV-head-wise allocation yielding clear gains over layer-wise allocation. The learned patterns place SWA in the bottom layers and concentrate FA into contiguous middle-layer blocks, diverging from evenly interleaved patterns in rule-based methods. This structure persists across model scales, sparsity levels, and allocation granularities, revealing a fine-grained spectrum of intrinsic attention behaviors that underlies the learned allocation.
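One way to picture the learnable allocation is the hard-concrete gate commonly used for L0 regularization, paired with a linear-plus-quadratic Lagrangian penalty on the gap between expected and target sparsity. This is a sketch of that generic recipe, not the paper's implementation: the stretch constants, the convention that an open gate means full attention, and every function name are assumptions.

```python
import numpy as np

# Stretch interval and temperature often used for hard-concrete gates (assumed).
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample_gates(log_alpha, rng):
    """Sample relaxed binary gates in [0, 1], one per attention unit.

    Convention assumed here: gate near 1 selects FA, gate near 0 selects SWA.
    """
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=log_alpha.shape)
    s = sigmoid((np.log(u) - np.log(1.0 - u) + log_alpha) / BETA)
    return np.clip(s * (ZETA - GAMMA) + GAMMA, 0.0, 1.0)


def prob_full_attention(log_alpha):
    """Probability that each gate is non-zero, i.e. the unit keeps FA."""
    return sigmoid(log_alpha - BETA * np.log(-GAMMA / ZETA))


def expected_sparsity(log_alpha):
    """Expected fraction of units assigned to SWA."""
    return float(1.0 - prob_full_attention(log_alpha).mean())


def lagrangian_penalty(log_alpha, target, lam1, lam2):
    """Penalty lam1 * gap + lam2 * gap^2 that vanishes when sparsity hits target."""
    gap = expected_sparsity(log_alpha) - target
    return lam1 * gap + lam2 * gap ** 2


rng = np.random.default_rng(0)
log_alpha = np.zeros(16)  # e.g. one gate per KV head (hypothetical size)
print(sample_gates(log_alpha, rng).round(2))
print(expected_sparsity(log_alpha))
```

In a full training loop the gate parameters would be updated by gradient descent on task loss plus this penalty, while the multipliers `lam1` and `lam2` are updated by gradient ascent; that loop is omitted here.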

09.
arXiv (CS.AI) 2026-06-11

Lung-SRAD: Spectral-Aware Regularized Audio DASS with Dual-Axis Patch-Mix Contrastive Learning for Respiratory Sound Classification

arXiv:2606.11922v1 Announce Type: cross Abstract: Recent respiratory sound classification (RSC) studies largely rely on CLS-token driven self-attention architectures such as the Audio Spectrogram Transformer (AST). While effective at modeling global context, recent analyses suggest a low-pass filtering behavior that may reduce sensitivity to localized abnormal patterns. In this work, we investigate State Space Models (SSMs) as an alternative backbone for RSC. Using the Distilled Audio State Space model, we analyze intermediate representations through spectral response curves and observe stronger preservation of mid-to-high spatial-frequency components. Based on these observations, we introduce spectral-aware layer regularization using Gaussian convolution applied to selected layers. We further propose Dual-Axis Patch-Mix contrastive learning tailored to SSM-based audio models for robust representation learning. Experiments on the ICBHI benchmark show that our approach achieves a score of 64.48%, outperforming the AST baseline by 5%. Code is available at https://github.com/RSC-Toolkit/Lung-SRAD.

10.
medRxiv (Medicine) 2026-06-24

Risk factors for suicide and repeat self-harm: a cohort study of adults with hospital-presenting self-harm

Background: Previous self-harm elevates the risk of repeat self-harm and suicide, but the prognostic value of events and clinician observations around the index event is unclear. We evaluated established and exploratory risk factors for suicide and repeat self-harm among patients presenting to emergency psychiatric units after a suicide attempt or nonsuicidal self-injury (NSSI). Methods: Multicentre cohort study in Sweden (n = 804). Outcomes were suicide and repeat self-harm at 1-year and 5-year follow-up, ascertained through linked national registers. Established risk factors included psychiatric diagnoses, prior suicidal behaviour, and sociodemographic characteristics; exploratory factors comprised past-week self-reported symptom changes and clinician observations. LASSO-regularised Cox regression models were fitted for established (n=21) and exploratory (n=11) risk factors. Results: During five-year follow-up, 285 (35%) individuals had a new episode of self-harm and 41 (5%) died by suicide. No risk factors reached statistical significance for suicide, although male sex was retained after regularisation (1-year hazard ratio [HR] = 3.57 [95% CI 0-8.33]; 5-year HR = 2.5 [0.03-4.55]). Three established risk factors were significantly associated with repeat self-harm: psychiatric inpatient care in the three months before the index event (1-year HR = 1.85 [1.3-2.6]; 5-year HR = 1.72 [1.23-2.65]), previous suicide attempt (1-year HR = 2.01 [0.79-2.4]; 5-year HR = 2.19 [1.27-2.6]), and borderline personality disorder (1-year HR = 1.82 [1.13-3]; 5-year HR = 1.67 [0.14-2.75]). Among exploratory risk factors, clinician-observed hopelessness (1-year HR = 1.72 [1.1-2.3]; 5-year HR = 1.51 [1.03-1.91]) and personality disorder features (1-year HR = 1.48 [0.96-2.05]; 5-year HR = 1.47 [1.04-1.95]) were associated with repeat self-harm. Conclusions: Risk factor profiles for repeat self-harm were consistent at 1 and 5 years. Beyond established risk factors, clinician-observed hopelessness and personality disorder features emerged as markers of risk, suggesting that qualitative clinician assessments may yield prognostic information not available from medical records alone.

11.
arXiv (quant-ph) 2026-06-16

Morphology-resolved scrambling in a chaotic quantum billiard

arXiv:2606.16865v1 Announce Type: new Abstract: Chaotic quantum systems can retain spatial memory through scarred eigenstates, but whether these static structures control scrambling remains unclear. This work establishes a morphology-resolved connection between scarred eigenstates and eigenstate-resolved OTOCs in a peanut-shaped quantum billiard. Scalar localisation diagnostics, including differential entropy and continuum participation ratios, detect anomalous concentration but discard spatial architecture. A scale-normalised density overlap, in contrast, directly compares probability density profiles, revealing families of orthogonal eigenstates with nearly identical spatial morphology. Comparing the complete OTOC time traces of these orthogonal eigenstates reveals that morphological recurrence has dynamical content: moderate density overlap yields no universal prediction, whereas strongly recurring morphologies exhibit nearly identical OTOC growth and saturation. Thus, scarred structures act as spatial templates for operator growth, not merely static violations of ergodicity. This morphology-resolved framework turns eigenstate shape into a quantitative predictor of scrambling and provides a scale-controlled diagnostic of weak ergodicity breaking in quantum chaos.
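The contrast between a scalar localisation diagnostic and a density comparison can be illustrated on a 1D grid. The overlap below is a cosine-style normalised overlap of probability densities, which may differ from the paper's scale-normalised definition, and the two wavefunctions are toy functions rather than billiard eigenstates; both choices, and the function names, are assumptions for illustration.

```python
import numpy as np


def normalise(psi, dx):
    """Scale psi so that sum(|psi|^2) * dx == 1."""
    return psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)


def participation_ratio(psi, dx):
    """Scalar diagnostic 1 / integral(rho^2): it discards spatial structure."""
    rho = np.abs(normalise(psi, dx)) ** 2
    return float(1.0 / (np.sum(rho ** 2) * dx))


def density_overlap(psi_a, psi_b):
    """Cosine-style overlap of |psi|^2 profiles, in [0, 1] (assumed form)."""
    rho_a, rho_b = np.abs(psi_a) ** 2, np.abs(psi_b) ** 2
    return float(np.sum(rho_a * rho_b) / np.sqrt(np.sum(rho_a ** 2) * np.sum(rho_b ** 2)))


x = np.linspace(0.0, 2.0 * np.pi, 2001)
dx = x[1] - x[0]
psi_a = normalise(np.sin(x), dx)
psi_b = normalise(np.abs(np.sin(x)), dx)

inner_product = float(np.sum(psi_a * psi_b) * dx)  # amplitude overlap
print(abs(inner_product) < 1e-9)                   # the two states are orthogonal
print(density_overlap(psi_a, psi_b))               # yet their densities coincide
```

The toy pair shows why orthogonality and morphological similarity are independent: the amplitudes cancel in the inner product while the densities are identical.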

12.
arXiv (CS.CV) 2026-06-25

HG-Bench: A Benchmark for Multi-Page Handwritten Answer-Region Grounding in Automated Homework Assessment

Automated homework assessment depends not only on recognizing student answers, but also on accurately locating where each answer and each intermediate reasoning step appears in noisy, multi-page handwritten work. This paper addresses the missing evaluation setting of page-aware, two-level answer-region grounding: given a sequence of homework page images, a model must localize complete answer regions and their ordered step-level subregions. We introduce HG-Bench, a benchmark of 500 human-annotated K-12 homework samples curated from a 1,489,278-image source pool, with question-level and step-level boxes linked by a hierarchical containment constraint. HG-Bench is paired with a page-aware evaluation protocol that separately measures complete-answer localization (FA) and step-level decomposition (FSm), revealing whether models truly ground the spatial structure of student reasoning rather than merely parse visible text. Across frontier closed-source APIs and competitive open-weight VLMs, no zero-shot system exceeds 55.22% on FA or 48.22% on FSm, while a GLM-4.6V 9B reference model fine-tuned on ~10k in-domain examples reaches 74.97/72.26. These results identify step-level handwritten grounding as a concrete capability gap and provide a reproducible benchmark, evaluation protocol, and trained reference point for future work on automated homework assessment.
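The hierarchical containment constraint can be sketched as a simple geometric check. The abstract does not define the FA and FSm metrics, so the code below only shows a page-aware box type, a containment test, and a standard intersection-over-union; the `Box` layout and all names are hypothetical rather than the benchmark's actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned box on a given page (hypothetical annotation layout)."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies on the same page and fully inside outer."""
    return (outer.page == inner.page
            and outer.x0 <= inner.x0 and outer.y0 <= inner.y0
            and inner.x1 <= outer.x1 and inner.y1 <= outer.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union; boxes on different pages never overlap."""
    if a.page != b.page:
        return 0.0
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    inter = max(0.0, w) * max(0.0, h)
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def containment_violations(answer_boxes: List[Box], step_boxes: List[Box]) -> List[int]:
    """Indices of step boxes not contained in any answer box on their page."""
    return [i for i, step in enumerate(step_boxes)
            if not any(contains(ans, step) for ans in answer_boxes)]


answer = [Box(0, 10, 10, 200, 300), Box(1, 10, 10, 200, 120)]  # answer spans two pages
steps = [Box(0, 20, 20, 180, 90), Box(1, 20, 20, 180, 100), Box(1, 20, 110, 180, 150)]
print(containment_violations(answer, steps))
```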

13.
arXiv (CS.LG) 2026-06-19

When Calibration Fails the Vulnerable Hospital: Federated Conformal Risk Control via Risk-Curve Shrinkage

arXiv:2606.20115v1 Announce Type: new Abstract: Conformal risk control (CRC) provides distribution-free guarantees on segmentation quality by calibrating a prediction-set threshold on held-out data. In federated deployments, the standard approach pools calibration scores across sites into a single threshold. We provide the first quantification, on real multi-institutional brain tumor data (FeTS-2022, 1,251 subjects, 20 institutions), showing that this naive pooled CRC protects the average hospital but violates coverage at 40% of individual institutions, with the worst site exceeding the target false-negative rate by 7.8 percentage points. The naive alternative, per-site local CRC, largely restores coverage but inflates prediction sets by 83x, rendering them clinically useless. We propose a shrinkage-based federated CRC protocol: each site transmits only its empirical risk curve (G scalars) to a server, which computes a shrinkage-regularized threshold per site. A single hyperparameter n0 smoothly trades worst-case coverage for prediction-set efficiency; leave-one-site-out sensitivity analysis identifies n0=19, achieving 2.7/20 violations at 2.0x stretch. We further show that direct Lagrangian optimization of coverage budgets fails, concentrating risk on vulnerable hospitals, and that the finite-sample correction term is essential: removing it triples violations. The marginal CRC guarantee is preserved by construction under the stated site-mixture assumption; per-site coverage is validated across four targets with three seeds. No patient-level images, masks, or per-volume scores leave any site.
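A rough sketch of how a server might turn per-site risk curves into per-site thresholds is given below. The abstract does not give the shrinkage formula, so the weight n_k / (n_k + n0) that blends each site's curve with the size-weighted pooled curve is an assumption, as is reusing the standard CRC finite-sample correction with the local sample size. Curves are assumed to be non-increasing in the threshold, and all names are hypothetical.

```python
import numpy as np


def crc_threshold(risk_curve, n, grid, alpha, loss_bound=1.0):
    """Smallest grid threshold whose corrected risk is at most alpha.

    Uses the usual CRC correction (n / (n + 1)) * risk + B / (n + 1).
    Falls back to the largest threshold if no grid point qualifies.
    """
    corrected = (n / (n + 1.0)) * np.asarray(risk_curve) + loss_bound / (n + 1.0)
    ok = np.nonzero(corrected <= alpha)[0]
    return float(grid[ok[0]]) if ok.size else float(grid[-1])


def shrinkage_thresholds(site_curves, site_sizes, grid, alpha, n0):
    """Per-site thresholds from curves shrunk toward the pooled curve.

    site_curves: (K, G) empirical risk of each site at each grid threshold.
    site_sizes:  (K,) calibration-set size per site.
    n0:          shrinkage strength; 0 gives local CRC, large values approach pooling.
    """
    curves = np.asarray(site_curves, dtype=float)
    sizes = np.asarray(site_sizes, dtype=float)
    pooled = (sizes[:, None] * curves).sum(axis=0) / sizes.sum()
    w = sizes / (sizes + n0)  # assumed shrinkage weight
    shrunk = w[:, None] * curves + (1.0 - w)[:, None] * pooled
    return np.array([crc_threshold(c, n, grid, alpha) for c, n in zip(shrunk, sizes)])


# Synthetic example: a large easy site and a small hard site.
grid = np.linspace(0.0, 1.0, 11)
curves = np.stack([0.5 * (1.0 - grid), 1.0 * (1.0 - grid)])
sizes = [200, 20]
for n0 in (0.0, 19.0, 1e9):
    print(n0, shrinkage_thresholds(curves, sizes, grid, alpha=0.1, n0=n0))
```

Only the G-point curves and the site sizes are needed by the server in this sketch, which matches the abstract's statement that no patient-level data leaves a site.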

14.
arXiv (CS.AI) 2026-06-19

Sensorimotor World Models: Perception for Action via Inverse Dynamics

arXiv:2606.20104v1 Announce Type: cross Abstract: Perception for action suggests that representations of the world should be shaped not by visual fidelity alone, but by their relevance for actions. At the same time, latent JEPA-style world models advocate learning compact predictive states from high-dimensional observations to facilitate the prediction of future states, but end-to-end training of these models is nontrivial because representations may collapse if our only goal is to construct a latent state that is easy to predict. We introduce a sensorimotor world model (SMWM): a latent world model trained end-to-end with inverse dynamics regularization. This single regularizer addresses both issues: it prevents representation collapse and induces action-aligned representations. By forcing latent states to preserve information about the action underlying a transition, it biases the model toward the controllable degrees of freedom of the environment while discarding uncontrollable distractors. This yields stable latent world models trained from offline, reward-free trajectories, without frozen encoders, exponential moving averages, or complex latent regularizers. Empirically, SMWM learns compact, interpretable latent spaces and enables competitive planning performance across simple 2D and 3D control tasks.

15.
arXiv (quant-ph) 2026-06-17

Intrinsic Pointer Basis and Irreversible Classicality from Coherence Contraction

arXiv:2604.23304v4 Announce Type: replace Abstract: This work analyzes an operational route to classical behavior for reduced quantum states using the intrinsic reference basis (IRB). Relative to a fixed physical conjugation, the IRB separates intrinsic populations from a real antisymmetric cohesion sector. A globally bounded cohesion index is defined and its exponential contraction is proved for phase-free dephasing dynamics aligned with the IRB; for general aligned dephasing, the corresponding modulus-based coherence functional contracts at the same computable rates. The results provide distance bounds to the IRB-diagonal description and a logarithmic upper bound on the time required to reach a prescribed experimental tolerance. The IRB projectors constitute state-derived candidate pointer sectors, and they become dynamically stable pointer sectors when the effective dephasing generator is aligned with them and damps the relevant inter-sector coherences. Degenerate population sectors lead naturally to block-classicality and protected intra-block coherence. In a two-level active sector, the cohesion index equals fringe visibility, giving a direct interferometric test of the contraction law. The construction is independent of any spacetime- or unification-emergence hypothesis and is intended as a channel-level complement to environment-induced einselection.

16.
arXiv (CS.AI) 2026-06-12

The KG-ER Conceptual Schema Language

arXiv:2508.02548v3 Announce Type: replace-cross Abstract: We propose KG-ER, a conceptual schema language for knowledge graphs that describes the structure of knowledge graphs independently of their representation (relational databases, property graphs, RDF) while helping to capture the semantics of the information stored in a knowledge graph.

17.
arXiv (quant-ph) 2026-06-16

Exactly Solvable Quantum Model with Spin-Dependent Coulomb Interaction

arXiv:2501.05103v5 Announce Type: replace Abstract: In this work, we report an exactly solvable quantum model featuring a spin-dependent Coulomb interaction, described by the spin vector potential \(\vec{\mathcal{A}} = k (\vec{r} \times \vec{S}) / r^2\) together with a Coulomb-type scalar potential \(\varphi = \kappa / r\). The model is governed by the Schrödinger-type Hamiltonian \(\mathcal{H}_S = \vec{\Pi}^2 / (2M) + q \varphi\) in nonrelativistic quantum mechanics and by the Dirac-type Hamiltonian \(\mathcal{H}_D = c \vec{\alpha} \cdot \vec{\Pi} + \beta M c^2 + q \varphi\) in relativistic quantum mechanics, where \(\vec{\Pi} = \vec{p} - (q/c)\vec{\mathcal{A}}\) is the canonical momentum. We demonstrate two main results: (i) Just as the Coulomb-type scalar potential \(\mathcal{S}_{\rm Maxwell} = \{\vec{\mathcal{A}} = 0,\ \varphi = \kappa / r\}\) is a local exact solution of Maxwell's equations on \(r \neq 0\), the gauge potential \(\mathcal{S}_{\rm YM} = \{\vec{\mathcal{A}} = k (\vec{r} \times \vec{S}) / r^2,\ \varphi = \kappa / r\}\) constitutes a local exact solution of the Yang–Mills equations on the punctured region \(r \neq 0\). (ii) Both Hamiltonians \(\mathcal{H}_S\) and \(\mathcal{H}_D\) can be solved exactly in the presence of this spin-dependent Coulomb interaction. The resulting energy spectra are derived, and they naturally reduce to those of the ordinary hydrogen atom when the spin-dependent terms are neglected. Finally, we clarify the quantization conditions and the fixed-background interpretation of the model.

18.
arXiv (math.PR) 2026-06-11

A Hybrid LSMC-PDE Method for Bermudan Option Pricing under the Gatheral Double Mean-Reverting Model

arXiv:2606.11237v1 Announce Type: cross Abstract: We study Bermudan option pricing under the Gatheral Double Mean-Reverting (GDMR) stochastic volatility model. The model features a variance process together with a stochastic long-run mean variance process and allows Constant Elasticity of Variance (CEV)-type exponents in the diffusion coefficients. This model is attractive since it provides a flexible specification for volatility dynamics. However, the pricing of early-exercise derivatives under the GDMR model remains largely unexplored in the literature. To address this challenge, we adapt a Hybrid Least-Squares Monte Carlo-Partial Differential Equation (LSMC-PDE) framework to the GDMR model and provide a detailed model-specific implementation. Conditioning on simulated variance paths, the pricing problem reduces to a one-dimensional problem in the asset price, which is solved by a Fourier-based approach, while the remaining dependence on the variance variables is approximated by least-squares regression. Our numerical experiments demonstrate that the Hybrid LSMC-PDE approach yields accurate pricing estimates and often lower pricing errors than plain LSMC, particularly for low and moderate numbers of simulation paths, showing the benefit of using the model structure in early-exercise option pricing.
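For orientation, the plain LSMC baseline that the hybrid method is compared against can be sketched in a few lines. To stay self-contained, this sketch prices a Bermudan put under constant-volatility geometric Brownian motion rather than the GDMR model, and it regresses on a quadratic polynomial of the asset price; the model swap, the basis choice, and the function name are all assumptions, and nothing here implements the hybrid LSMC-PDE step.

```python
import numpy as np


def lsmc_bermudan_put(s0, strike, rate, sigma, maturity, n_exercise, n_paths, seed=0):
    """Longstaff-Schwartz estimate of a Bermudan put price under GBM.

    Exercise is allowed at n_exercise equally spaced dates, the last being maturity.
    """
    rng = np.random.default_rng(seed)
    dt = maturity / n_exercise
    z = rng.standard_normal((n_paths, n_exercise))
    increments = (rate - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
    prices = s0 * np.exp(np.cumsum(increments, axis=1))  # column j = exercise date j + 1
    disc = np.exp(-rate * dt)

    cash = np.maximum(strike - prices[:, -1], 0.0)  # payoff at maturity
    for j in range(n_exercise - 2, -1, -1):
        cash = cash * disc  # value of later cash flows as seen from this date
        exercise = np.maximum(strike - prices[:, j], 0.0)
        itm = exercise > 0.0
        if itm.sum() > 3:
            # Regress continuation value on the asset price, in-the-money paths only.
            coeffs = np.polyfit(prices[itm, j], cash[itm], 2)
            continuation = np.polyval(coeffs, prices[itm, j])
            exercise_now = np.nonzero(itm)[0][exercise[itm] > continuation]
            cash[exercise_now] = exercise[exercise_now]
    return float(disc * cash.mean())


european = lsmc_bermudan_put(100.0, 100.0, 0.05, 0.2, 1.0, n_exercise=1, n_paths=20000)
bermudan = lsmc_bermudan_put(100.0, 100.0, 0.05, 0.2, 1.0, n_exercise=50, n_paths=20000)
print(european, bermudan)
```

With a single exercise date the routine reduces to a Monte Carlo European put, which gives a quick sanity check: the Bermudan estimate should come out higher because early exercise has value for a put.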

19.
arXiv (CS.LG) 2026-06-11

MemNovo: Look Back at the Spectrum for Balanced De Novo Peptide Sequencing from Mass Spectrometry

arXiv:2606.11868v1 Announce Type: new Abstract: De novo peptide sequencing from tandem mass spectrometry is pivotal in proteomics, enabling identification of novel peptides without reference databases. While recent Transformer-based encoder-decoder models have achieved remarkable performance, we uncover a critical pathology in their inference dynamics. Through comprehensive feature scaling experiments, we demonstrate that existing auto-regressive peptide decoders tend to over-rely on generated-sequence priors while progressively under-utilizing fine-grained physical evidence from the input mass spectrum. This phenomenon leads to suboptimal results, where generated peptide sequences are biologically plausible yet not faithful to the input spectrum. To rectify this, we propose MemNovo, a training-free and plug-and-play mechanism that re-balances peptide and spectral contributions at inference time. MemNovo alleviates the information bottleneck by establishing a persistent spectral memory bank and injecting retrieved features directly into the final decoding stage via an ultra-conservative residual connection. Theoretical analysis confirms that this mechanism restores the mutual information between the decoder state and the raw spectrum. Extensive experiments on the Nine Species benchmark with two representative baselines, Casanovo and InstaNovo, demonstrate that MemNovo consistently improves both amino acid precision and peptide precision, achieving up to 39.1% relative improvement in peptide precision for Casanovo and up to 3.9% for InstaNovo, with negligible computational overhead.

20.
arXiv (CS.AI) 2026-06-19

Reinforcement-aware Knowledge Distillation for LLM Reasoning

arXiv:2602.22495v3 Announce Type: replace-cross Abstract: Reinforcement learning (RL) post-training has recently driven major gains in long chain-of-thought reasoning large language models (LLMs), but the high inference cost of such models motivates distillation into smaller students. Most existing knowledge distillation (KD) methods are designed for supervised fine-tuning (SFT), relying on fixed teacher traces or teacher-student Kullback-Leibler (KL) divergence-based regularization. When combined with RL, these approaches often suffer from distribution mismatch and objective interference: teacher supervision may not align with the student's evolving rollout distribution, and the KL regularizer can compete with reward maximization and require careful loss balancing. To address these issues, we propose RL-aware distillation (RLAD), which performs selective imitation during RL – guiding the student toward the teacher only when it improves the current policy update. Our core component, Trust Region Ratio Distillation (TRRD), replaces the teacher-student KL regularizer with a PPO/GRPO-style likelihood-ratio objective anchored to a teacher–old-policy mixture, yielding advantage-aware, trust-region-bounded distillation on student rollouts and naturally balancing exploration, exploitation, and imitation. Across diverse logic reasoning and math benchmarks, RLAD consistently outperforms offline distillation, standard GRPO, and KL-based on-policy teacher-student knowledge distillation.
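The idea of anchoring the likelihood ratio to a teacher and old-policy mixture can be sketched as a clipped surrogate. The abstract does not specify how the mixture is formed or how clipping is applied, so the arithmetic mixture with weight `mix`, the PPO-style clip range, and the function name `trrd_surrogate` are assumptions made only to illustrate the shape of such an objective.

```python
import numpy as np


def trrd_surrogate(logp_student, logp_old, logp_teacher, advantages,
                   mix=0.5, clip_eps=0.2):
    """Clipped surrogate with the ratio anchored to a teacher/old-policy mixture.

    All log-probabilities are per-token values for tokens sampled from the
    student's own rollouts. The anchor is assumed to be
    mix * pi_teacher + (1 - mix) * pi_old, with 0 < mix < 1.
    Returns the mean objective (to be maximised).
    """
    if not 0.0 < mix < 1.0:
        raise ValueError("mix must lie strictly between 0 and 1")
    logp_student = np.asarray(logp_student, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    log_anchor = np.logaddexp(np.log(mix) + np.asarray(logp_teacher, dtype=float),
                              np.log1p(-mix) + np.asarray(logp_old, dtype=float))
    ratio = np.exp(logp_student - log_anchor)
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return float(np.mean(np.minimum(ratio * advantages, clipped * advantages)))


# Made-up per-token values for three tokens of one rollout.
logp_student = np.log([0.30, 0.10, 0.50])
logp_old = np.log([0.25, 0.12, 0.50])
logp_teacher = np.log([0.60, 0.05, 0.55])
advantages = np.array([1.0, -0.5, 0.2])
print(trrd_surrogate(logp_student, logp_old, logp_teacher, advantages))
```

When the teacher and the old policy agree on a token, the anchor equals the old policy and the expression reduces to the familiar PPO clipped surrogate, which is one way to check the sketch.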

21.
arXiv (quant-ph) 2026-06-12

Non-invertible symmetries out of equilibrium: Eigenstate order and Floquet physics

arXiv:2508.14213v2 Announce Type: replace-cross Abstract: Through the study of the Rep($D_8$) non-invertible symmetry, we show how non-invertible symmetries manifest in dynamics. Results are presented for dynamics generated by Hamiltonians as well as Floquet unitaries. For both examples, the role of the non-invertible symmetry is studied through the appearance of non-invertible symmetry protected edge modes. In addition, the role of the non-invertible symmetry for the Hamiltonian is studied through eigenstate order. In particular, by considering the effect of symmetry-preserving disorder, the non-invertible symmetry is shown to give rise to degeneracies in the spectra of the Hamiltonian that can only be completely lifted at orders of perturbation that scale with system size. The eigenstates of disordered Hamiltonians whose ground states correspond to non-trivial symmetry protected topological (SPT) states are shown to have either trivial or non-trivial SPT order, detected as non-zero expectation values of string order parameters. In contrast, non-trivial SPT order is absent in the eigenstates of trivial SPT Hamiltonians with disorder. The interface between two different SPT phases hosts edge modes whose dynamics are studied numerically and analytically. The edge mode is shown to oscillate at frequencies related to different effective chain lengths that are weighted by the temperature, becoming an exact zero mode in the limit of zero temperature. A Floquet model with the non-invertible symmetry is constructed whose edge mode is shown to exhibit period-doubled dynamics at low effective temperatures. The zero and period-doubled edge modes differ from those in conventional SPTs by being symmetric under the invertible symmetry, while being charged under the non-invertible symmetry.

22.
arXiv (quant-ph) 2026-06-25

Fundamental limit on the heralded single photons' spectral brightness

arXiv:2510.24439v3. Heralded single photons (HSPs) are versatile flying qubits in quantum communication and networks due to their ability to remove the randomness of arrival time and enhance transmission reliability. As the generation rate of HSPs increases or their linewidth narrows, both of which are desirable for quantum information processing, the fundamental limit of spectral brightness (SB), defined as the generation rate per unit linewidth, remains unclear. To examine the existence and value of such a limit, we systematically studied the SB together with the cross-correlation function, or equivalently, the signal-to-background ratio (SBR). We ultimately derive an upper bound on SB that applies universally to all types of HSP sources. A newly defined quantity governs this limit, the quality factor, which is the product of SBR and effective SB. The quality factor indicates how closely an HSP source approaches an ideal noise-free source. Furthermore, by employing an HSP source based on hot atomic vapor, we achieved an SB of $(8.5\pm0.3)\times10^5$ pairs/s/MHz and a quality factor of $0.73\pm0.02$ under the single-photon criterion. Both values represent the highest reported performance to date among all HSP platforms. These results provide a unified benchmark for evaluating and optimizing HSP sources.

23.
arXiv (math.PR) 2026-06-17

The Erdős-Hajnal High-Girth Subgraph Conjecture Holds in the Polynomial Chromatic-Sparsity Regime

arXiv:2606.17901v1. For a graph $G$ put $h_r(G)=\max\{\chi(H):H\subseteq G,\ \operatorname{girth}(H)\ge r\}.$ Erdős and Hajnal asked whether $h_r(G)\to\infty$ as $\chi(G)\to\infty$, for every fixed $r\ge4$. We prove this in every fixed polynomial edge-density regime: for all $r\ge4$, $k\ge2$, $P,C>0$, there is $M=M_{r,k}(P,C)$ such that $\chi(G)\ge M,\ e(G)\le C\chi(G)^P\Longrightarrow h_r(G)\ge k.$ Quantitatively, after replacing $P$ by $P\vee2$ and $C$ by $C\vee2$, $M_{r,k}(P,C)\le \exp\!\left(O_{r,k}\bigl((P+2+\log(C\vee2))^2\bigr)\right),$ and consequently the same conclusion holds throughout the quasi-polynomial range $e(G)\le \exp\bigl(C_0(\log\chi(G))^a\bigr),\ 1 < a < 3/2,$ for all sufficiently large $\chi(G)$. In each fixed polynomial-density regime we also obtain $f_{P,C}(k,r)\le k^{O_{r,P,C}(1)}.$ The proof combines a chromatic-defect random extraction lemma, compact and near-quadratic sparse-core bases, and a peeling/thinning bootstrap increasing the admissible edge exponent by $1/(r-1)$. We also prove structural saturation results for possible counterexamples, including Moore-strength exact-cycle packings and quadratic saturation in projected colour-pair space. Finally, writing $h_r^{\mathrm f}(G)=\max\{\chi_{\mathrm f}(H):H\subseteq G,\ \operatorname{girth}(H)\ge r\},$ we develop a fractional random-extraction framework based on Mohar-Wu preservation. We prove sufficient cheap-cycle-killing criteria and verify them for several structured families, including clique-organised families, line graphs of incidence graphs of equal-order generalized quadrangles and generalized hexagons, and the Bohman-Keevash tracking-time triangle-free-process graph. We also isolate a density-free obstruction that any proof using this fractional surgery route must overcome.
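
To make the quantity $h_r(G)$ concrete, here is a small brute-force sketch that evaluates it on tiny graphs by enumerating edge subsets. It only illustrates the definition and has nothing to do with the paper's proof techniques; the function names are hypothetical, vertices are assumed to be labelled `0..n-1`, graphs are assumed simple, and the running time is exponential in the number of edges.

```python
import math
from collections import deque
from itertools import combinations, product


def girth(n, edges):
    """Length of a shortest cycle in a simple graph, or math.inf for a forest."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    best = math.inf
    for s in range(n):
        dist, parent = [-1] * n, [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] == -1:
                    dist[v], parent[v] = dist[u] + 1, u
                    queue.append(v)
                elif parent[u] != v:
                    # Non-tree edge closes a walk through s that contains a cycle.
                    best = min(best, dist[u] + dist[v] + 1)
    return best


def chromatic_number(n, edges):
    """Smallest k admitting a proper k-colouring (brute force over colourings)."""
    if n == 0:
        return 0
    for k in range(1, n + 1):
        for colouring in product(range(k), repeat=n):
            if all(colouring[u] != colouring[v] for u, v in edges):
                return k
    return n


def h_r(n, edges, r):
    """Max chromatic number over subgraphs of girth >= r.

    Spanning subgraphs (edge subsets) suffice, because adding isolated
    vertices changes neither the girth nor the chromatic number bound.
    """
    best = 0
    for m in range(len(edges) + 1):
        for sub in combinations(edges, m):
            if girth(n, sub) >= r:
                best = max(best, chromatic_number(n, sub))
    return best


if __name__ == "__main__":
    c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    k4 = list(combinations(range(4), 2))
    print(h_r(5, c5, 4), h_r(5, c5, 6), h_r(4, k4, 4))
```

For the 5-cycle, the graph itself has girth 5 and chromatic number 3, so it qualifies for $r=4$ but not for $r=6$; every triangle-free graph on four vertices is bipartite, which is why $K_4$ drops to 2 once triangles are forbidden.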

24.
arXiv (CS.CV) 2026-06-15

Instruct-Particulate: Scaling Feed-Forward 3D Object Articulation with Kinematic Control

Reconstructing articulated 3D objects is important for animation, gaming, and robotic simulations. Recent neural networks can estimate the articulated structure of 3D objects, but their generalization remains limited by the scarcity of annotated data for this task. To address this gap, we introduce Instruct-Particulate, a model that takes a 3D mesh together with a target kinematic specification, including part descriptions, connectivity, joint types, and optional point prompts, and predicts the corresponding kinematic part segmentation and joint motion parameters. The kinematic specification disambiguates the task and allows the model to target annotations of different granularity, thereby making it possible to use more abundant heterogeneous training data. At test time, the kinematic specification can be obtained automatically from large-scale vision-language models, so the model can be applied to any input mesh. To train our model at scale, we construct a heterogeneous dataset of more than 150,000 articulated 3D objects, extending existing publicly available collections with data obtained by partially labelling other 3D models (monolithic or already decomposed into parts) with kinematic labels by means of vision-language models. Experiments show that our model generalizes better across categories and to AI-generated meshes, enabling articulated asset reconstruction from real-world images via image-to-3D models.

25.
bioRxiv (Bioinfo) 2026-06-10

A Unified Spatial AI Framework for Cross-Domain Tissue-State Analysis in Trauma, Oral, and Cardiovascular Pathology

Objective: To develop a cross-domain spatial AI framework for identifying conserved tissue-state organisation across trauma, oral disease, and cardiovascular tissue using spatial transcriptomic data. Methods: Four public spatial transcriptomic datasets spanning wound healing, periodontitis, oral squamous cell carcinoma, and cardiac tissue were integrated using recurrence modelling, graph-based spatial learning, fuzzy tissue-state analysis, and tensor decomposition. Cross-domain coupling, spatial fragmentation, recurrence structure, and permutation-based topological validation were evaluated. Results: Six conserved fuzzy tissue states were identified, dominated by extracellular matrix remodelling, fibroblast/stromal activation, endothelial signalling, and inflammatory pathways. Latent embedding analysis demonstrated strong overlap between trauma and oral domains, while cardiovascular tissue exhibited more compact spatial organisation. Oral inflammatory tissue showed the highest fragmentation, whereas cardiovascular tissue demonstrated greater recurrence coherence. Tensor decomposition identified conserved stromal-remodelling programmes across domains. Permutation testing confirmed significantly elevated graph modularity and reduced spatial entropy relative to null distributions. Conclusion: The proposed framework identified conserved spatial tissue-state architecture linking wound healing, oral pathology, and cardiovascular tissue despite differences in tissue origin, pathology, and acquisition technology. Significance: These findings demonstrate the potential of spatial AI for investigating conserved stromal and inflammatory microenvironmental organisation across clinically related disease systems and may support spatial biology research in trauma–oral–systemic health.