Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.LG) 2026-06-11

Provable Recovery of Locally Important Signed Features and Interactions from Random Forest

arXiv:2512.11081v2 Announce Type: replace-cross Abstract: Feature and Interaction Importance (FII) methods are essential in supervised learning for assessing the relevance of input variables and their interactions in complex prediction models. In many domains, such as personalized medicine, local interpretations for individual predictions are often required, rather than global scores summarizing overall feature importance. Random Forests (RFs) are widely used in these settings, and existing interpretability methods typically exploit tree structures and split statistics to provide model-specific insights. However, theoretical understanding of local FII methods for RF remains limited, making it unclear how to interpret high importance scores for individual predictions. We propose a novel, local, model-specific FII method that identifies frequent co-occurrences of features along decision paths, combining global patterns with those observed on paths specific to a given test point. We prove that our method consistently recovers the true local signal features and their interactions under a Locally Spike Sparse (LSS) model and also identifies whether large or small feature values drive a prediction. We illustrate the usefulness of our method and theoretical results through simulation studies and a real-world data example.

03.
arXiv (CS.LG) 2026-06-16

Communication-Efficient Neural Tangent Kernels for Heterogeneous Decentralized Federated Learning

作者:

arXiv:2512.12737v2 Announce Type: replace Abstract: Decentralized federated learning (DFL) enables collaborative model training without a central server, but converges slowly under statistical heterogeneity. Recent work has shown that neural tangent kernel (NTK) methods achieve faster convergence than gradient-based updates in DFL, while momentum has proven effective for accelerating gradient-based FL. However, applying momentum to NTK updates can destabilize training under heterogeneous data. We propose SPARK, which addresses this instability with a stage-wise annealed soft-label regularizer evaluated on neighborhood-aggregated data, so that momentum can accelerate NTK updates stably. Under high heterogeneity, SPARK converges about 3$\times$ faster than baselines and lowers the total communication to a target accuracy by up to about 70\%, and it attains higher accuracy across heterogeneity levels. We further study random projection as an optional Jacobian-compression strategy for bandwidth-constrained settings. We validate the approach across multiple datasets, network topologies, and heterogeneity levels.

04.
arXiv (CS.AI) 2026-06-18

QSignAI: Quantum-Randomness-Seeded Identity Signatures at the Intersection of AI for Science and Science for AI

arXiv:2605.27729v2 Announce Type: cross Abstract: The 2024-2025 Nobel and Turing awards recognised AI and quantum science simultaneously. Yet no deployed system has brought these streams together for the public. This paper presents QSignAI, a production-deployed platform demonstrating a bidirectional AI-quantum relationship in a real-time event participation system. We address three questions: can quantum-randomness generation via a two-source extractor be embedded in an AI-driven social platform with acceptable latency; can an AI bot make quantum phenomena perceptually legible to general audiences; and does the combined system work in practice? A conversational bot routes each participant's first message through a quantum pipeline comprising a Toeplitz two-source extractor over independent single-qubit Hadamard measurements on SV1 and DM1 simulators, plus a 2-qubit Bell state, producing a unique quantum-randomness-seeded identity signature per participant. The first two questions are answered through system architecture and qualitative deployment evidence from live events; the third through successful production deployment. The current deployment uses cloud quantum simulators; physical QPU randomness is the near-term extension. Measurable benchmarks are identified as priority future work.

05.
arXiv (CS.LG) 2026-06-17

A Dynamical Systems Perspective on the Analysis of Neural Networks

arXiv:2507.05164v2 Announce Type: replace-cross Abstract: In this chapter, we utilize dynamical systems to analyze several aspects of machine learning algorithms. As an expository contribution we demonstrate how to re-formulate a wide variety of challenges from deep neural networks, (stochastic) gradient descent, and related topics into dynamical statements. We also tackle three concrete challenges. First, we consider the process of information propagation through a neural network, i.e., we study the input-output map for different architectures. We explain the universal embedding property for augmented neural ODEs representing arbitrary functions of given regularity, the classification of multilayer perceptrons and neural ODEs in terms of suitable function classes, and the memory-dependence in neural delay equations. Second, we consider the training aspect of neural networks dynamically. We describe a dynamical systems perspective on gradient descent and study stability for overdetermined problems. We then extend this analysis to the overparameterized setting and describe the edge of stability phenomenon, also in the context of possible explanations for implicit bias. For stochastic gradient descent, we present stability results for the overparameterized setting via Lyapunov exponents of interpolation solutions. Third, we explain several results regarding mean-field limits of neural networks. We describe a result that extends existing techniques to heterogeneous neural networks involving graph limits via digraph measures. This shows how large classes of neural networks naturally fall within the framework of Kuramoto-type models on graphs and their large-graph limits. Finally, we point out that similar strategies to use dynamics to study explainable and reliable AI can also be applied to settings such as generative models or fundamental issues in gradient training methods, such as backpropagation or vanishing/exploding gradients.

06.
arXiv (CS.CV) 2026-06-17

Revisiting Structural Dependency in Autoregressive Multi-Task Table Recognition via Order-Independent Cell-Level Representations

Multi-task table recognition jointly addresses table structure prediction, cell localization, and cell content recognition within a unified framework. Existing approaches often rely on autoregressive decoders to generate table structures and reuse their hidden states for cell localization and content recognition. This autoregressive generation process can make cell representations order-dependent, degrading global consistency across cells. This paper proposes a structural refinement module that produces order-independent cell features through non-causal attention. This design enables parallel inference of cell contents while conditioning each cell on global context encoded in the refined features. Experiments on two large datasets demonstrate consistent gains in cell localization and end-to-end recognition, while reducing overall inference time by around threefold.

07.
arXiv (CS.CL) 2026-06-11

Evolving Agents in the Dark: Retrospective Harness Optimization via Self-Preference

AI agents rely on a harness of skills, tools, and workflows to solve complex problems. Continually improving this harness is essential for adapting to new tasks. However, existing optimization methods typically require ground-truth validation sets, yet such labeled data is difficult to acquire in practical deployment settings. To address this problem, we introduce Retrospective Harness Optimization (RHO), a self-supervised method that optimizes the agent harness using only past trajectories. Specifically, RHO selects a diverse coreset of challenging tasks from past trajectories and re-solves them in parallel. The agent analyzes these rollouts using self-validation and self-consistency, then generates candidate harness updates and selects the most effective one by its own pairwise self-preference. We evaluate RHO across three diverse domains, spanning software engineering, technical work, and knowledge work. Notably, a single optimization round improves the pass rate on SWE-Bench Pro from 59% to 78% without any external grading. Furthermore, our analysis demonstrates that RHO effectively targets prior failure modes. As a result, the optimized harness alters the agent's behavior patterns and sustains higher accuracy during long-horizon sessions.

08.
arXiv (quant-ph) 2026-06-16

Semiclassical Gravity Efficiently Solves $\mathsf{NP}$-Complete Problems

arXiv:2606.14806v1 Announce Type: cross Abstract: Assuming the gravitational field is classical and that it couples to quantum fields via the semiclassical Einstein field equations, we show that the weak-field dynamics of a massive and non-relativistic qubit can in principle be used to solve an $\mathsf{NP}$-complete problem in polynomial time. We attribute this vast computational power to the non-linear dynamics afforded by the semiclassical Einstein field equations. Consequently, the above two assumptions entail a violation of the Physical Extended Church–Turing Thesis, which we regard as evidence for the quantization of gravity.

09.
arXiv (CS.LG) 2026-06-17

A fairness-aware extension of Stochastic Multicriteria Acceptability Analysis for ranking

arXiv:2606.17756v1 Announce Type: new Abstract: Fairness has become a central concern in ranking problems involving individuals or social groups, particularly under the Responsible Artificial Intelligence agenda. In Multi-Criteria Decision Analysis, Stochastic Multicriteria Acceptability Analysis (SMAA) provides a robust framework for handling uncertainty and incomplete preference information, but it does not explicitly address fairness in the resulting rankings. This paper proposes SMAA-Fair, a fairness-aware extension of SMAA for ranking problems. The approach reweights the simulated rankings generated by SMAA according to their level of group fairness, so that fairer rankings contribute more strongly to the acceptability indices and central weights vector. The framework is independent of the aggregation model and can incorporate different fairness metrics. In this study, Statistical Parity, normalized discounted Kullback–Leibler divergence (rKL) and normalized discounted cumulative Kullback–Leibler divergence (nDKL) are adopted. Rankings are derived from the fairness-adjusted acceptability matrix using expected ranking and maximum acceptability ranking. We also derive the central weight according to the degree of fairness in the obtained rankings. Numerical experiments with synthetic and real data show that SMAA-Fair improves the representation of protected groups among favourable ranking positions, while preserving robustness to preference uncertainty.

10.
arXiv (math.PR) 2026-06-18

A scaling limit theorem for controlled branching processes with a size-divisible term

arXiv:2508.17116v2 Announce Type: replace Abstract: This paper establishes general sufficient conditions for a sequence of controlled branching processes to converge weakly on the Skorokhod space. We focus on a class of control mechanisms that extend previous results by decomposing those random variables into the sum of two independent components: an immigration term, which depends on the current population size, and a size-divisible term, which can be expressed as the sum of random contributions from each individual. This extension allows us to capture a broad range of control functions including Poisson, binomial, and negative binomial distributions, commonly used in the literature. The assumptions are formulated in terms of probability generating functions of the offspring and control laws, distinguishing in this latter between the immigration and the size-divisible parts. The limit process is shown to be a continuous-state branching process with dependent immigration. The proof essentially relies on tightness arguments and the identification of a martingale problem. We also identify the special case in which the limit reduces to a classical Feller branching diffusion with immigration.

11.
arXiv (CS.AI) 2026-06-19

Temporal Self-Imitation Learning

arXiv:2606.19752v1 Announce Type: cross Abstract: Long-horizon robot manipulation policies trained with reward shaping can still exploit dense rewards through inefficient interaction, while rare efficient behaviors may be forgotten during training. We argue that temporal efficiency itself provides a powerful and underutilized source of self-supervision for reinforcement learning. We introduce Temporal Self-Imitation Learning (TSIL), a reinforcement learning framework that mines temporally efficient successful trajectories generated during learning and converts them into reusable supervision for future policy improvement. TSIL progressively refines learning using configuration-conditioned adaptive temporal targets derived from fast successful trajectories, while preserving and replaying efficient behaviors through efficiency-weighted self-imitation learning. Across 15 distinct long-horizon manipulation tasks, TSIL consistently improves learning efficiency, task-completion efficiency, revisitation of fast successful behaviors, and robustness to unstable training conditions. More broadly, our results suggest that the temporal structure of successful behavior itself provides a scalable self-supervisory signal for reinforcement learning beyond manually engineered reward shaping alone.

12.
arXiv (CS.AI) 2026-06-12

LLM-as-an-Investigator: Evidence-First Reasoning for Robust Interactive Problem Diagnosis

arXiv:2606.13220v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used as interactive assistants for technical problem solving. However, when users provide incomplete descriptions or plausible but unverified explanations, LLMs may prematurely align with these assumptions and propose solutions before collecting sufficient evidence. We refer to this behavior as user-driven sycophancy: the tendency of an LLM to reinforce a user-provided hypothesis instead of testing alternative explanations. This paper introduces LLM-as-an-Investigator, an evidence-first agentic AI methodology for robust problem diagnosis. The approach is implemented through a Solution Investigator Agent, which estimates the ambiguity of an initial problem description, generates candidate hypotheses, asks targeted clarification questions, and updates hypothesis probabilities after each answer. Rather than producing an immediate response, the agent continues the investigation until the evidence makes one candidate explanation stronger than the alternatives. To evaluate the approach, we build a benchmark from solved technical forum threads in mechanical, electrical, and hydraulic domains. We use a three-agent evaluation pipeline in which a Problem-Solution Extractor Agent converts solved threads into structured cases, a Ground-Truth Evaluator Agent simulates the user while hiding the known solution, and the tested assistant attempts to recover the solution through dialogue. The experiments compare standard assistants, reasoning-oriented LLMs, and the proposed investigator-based model across LLM backbones. In addition to diagnostic accuracy, we analyze how standard assistants follow misleading user hypotheses in diagnostic cases. The results show that the proposed approach identifies the problem more accurately than direct prompting and reasoning-only baselines, while its evidence-first protocol helps reduce user-induced conversational bias.

13.
arXiv (CS.AI) 2026-06-11

End-to-End Machine Learning for Depressive State Classification via EEG and fNIRS

arXiv:2606.11555v1 Announce Type: cross Abstract: The escalating demand for mental healthcare, driven by rising societal stress, highlights the limitations of traditional psychiatric diagnostics. Conventional methods - relying primarily on clinical interviews and patient self-reports - are inherently vulnerable to subjective bias and the varying empirical judgment of practitioners. To address the need for quantitative evaluation, biological signal-based detection, including electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS), has emerged as a promising objective alternative. Such technology is particularly vital for identifying latent depressive states that may be unrecognized by the subjects themselves. Furthermore, in aging populations, the high comorbidity between depression and dementia necessitates early differentiation to prevent mutual symptom exacerbation and maintain Quality of Life (QoL). This pilot study of eleven healthy students establishes a framework for biological signal-based depression detection, serving as a foundational step toward automated, objective diagnostic tools for clinical use.

14.
Nature (Science) 2026-06-10

Two-component exciton condensates in an electron–hole bilayer

作者:

Macroscopic quantum coherence emerges when bosons condense into a Bose–Einstein condensate (BEC)1–5. Excitons are a long-sought solid-state route to high-temperature BECs with strong interactions, electrical tunability and potentially multicomponent spinor order, but conclusive evidence for equilibrium condensation has remained elusive. Here we report evidence for two-component exciton BECs in MoSe2/hBN/WSe2 electron–hole bilayers6–9 by probing the spin–valley susceptibility of constituent electrons and holes. This heterostructure hosts equilibrium exciton fluids with four spin–valley flavours. Magneto-optical spectroscopy in a dilution refrigerator reveals three exciton condensate phases with distinct flavour polarizations. At zero magnetic field, the many-body ground state is a coherent superposition of two condensed intravalley exciton flavours. Under a magnetic field, the intravalley exciton condensate first switches to a two-component intervalley condensate through a first-order quantum phase transition at a weak critical field and then turns into a fully polarized single-component condensate at high fields. The condensate signatures form a dome in density–temperature space, persisting up to approximately 1.8 K. Our results establish van der Waals electron–hole bilayers as a versatile platform for strongly interacting, multicomponent exciton BECs. Macroscopic quantum coherence arises in two-component exciton Bose–Einstein condensates within MoSe2/hBN/WSe2 electron–hole bilayers, exhibiting distinct spin–valley polarized phases, quantum phase transitions under magnetic fields and stable condensate behaviour up to approximately 1.8 K.

15.
arXiv (CS.CL) 2026-06-19

PsyScore: A Psychometrically-Aware Framework for Trait-Adaptive Essay Scoring and ZPD-Scaffolded Feedback

Effective Automated Essay Scoring (AES) are expected to support both reliable assessment and actionable instructional feedback. However, existing approaches often treat scoring and feedback as separate components: neural scoring models provide limited interpretability, while Large Language Model (LLM)-based feedback is typically insensitive to learners proficiency levels. To address this fragmentation, this work proposes PsyScore, a psychometrically-aware framework that integrates diagnostic assessment with instructional scaffolding through a shared latent ability representation. PsyScore comprises three key modules: a Trait-Adaptive Neural IRT Scorer that incorporates the Graded Partial Credit Model (GPCM) into a neural architecture, enabling the precise estimation of student ability while maintaining psychometric interpretability, a ZPD-Scaffolded Feedback Generator, which conditions multi-agent feedback strategies on the diagnosed ability parameter to adapt instructional focus across different proficiency levels, and a Multi-Perspective Feedback Evaluation Strategy that assesses feedback quality via pairwise preference judgements and student revision simulations. Experiments on the ASAP++ dataset demonstrate that PsyScore achieves competitive scoring performance while providing more pedagogically aligned feedback.

16.
arXiv (CS.CL) 2026-06-11

Soft-Prompt Tuning for Fair and Efficient LLM Benchmark Evaluation

Benchmark scores often misrepresent a large language model's (LLM's) knowledge, because they rely, e.g., on the model's ability to follow specific formatting requirements. This especially penalizes base models that may know the correct answers but lack the ability – typically introduced in post-training – to structure them as instructed. To overcome this, we propose soft-prompt tuning, an efficient, fair, and architecture-agnostic model evaluation. By optimizing only 10 soft-prompt vectors (roughly 0.0006% parameters for a 7B model) over a short tuning period, we adapt models to specific benchmark formats, closing gaps in format-following and ensuring that underlying knowledge is accurately reflected in benchmark scores. This allows one to fairly compare different base models – trained with various pre-training recipes – on benchmarks without the need for full post-training. We evaluated soft-prompt tuning across 7 models and 7 datasets. The results show that (a) soft-prompt tuning saturates format-following within 80 steps (~640 samples) making it highly efficient, (b) soft-prompt tuning significantly outperforms zero- and few-shot prompting, surfacing base model knowledge that standard prompting misses, that (c) even post-trained models can benefit from soft-prompts to maximize format compliance, and that (d) soft-prompted base model performance predicts post-trained model rankings more reliably than zero- and few-shot baselines, offering a low-cost proxy for downstream model quality. Our contributions include (1) metrics which disentangle format-following and knowledge accuracy, (2) a fairer benchmarking protocol of LLM knowledge, and (3) a cost- and memory-effective recipe to identify optimal pre-training strategies early in LLM development.

17.
medRxiv (Medicine) 2026-06-11

Polygenic risk scores associate with asthma phenotypes and proteomic analyses implicate IL1R1 in two family-based studies

Despite its high prevalence and the discovery of hundreds of genetic associations, the genetic determinants and heterogeneous manifestations of asthma remain incompletely understood. Incorporating polygenic risk scores (PRS) into asthma research offers a powerful approach to quantify inherited susceptibility, refine risk profiles, and advance mechanistic understanding of disease development. For this study, we leveraged whole-genome sequencing (WGS) data from two family-based cohorts of childhood asthma - the Genetics of Asthma in Costa Rica Study (GACRS) and the Childhood Asthma Management Program (CAMP) - to examine the transmission profiles of externally derived asthma PRS and their associations with clinical phenotypes in children with asthma. To further elucidate molecular mechanisms, we integrated large-scale external genome-wide association study (GWAS) summary statistics and genetic prediction models of protein abundance in a two-step proteome-wide association study (PWAS) of asthma. Our findings provide robust evidence supporting the validity of externally derived asthma PRS (asthma PRS association p-value p={10}^{-24} [GACRS and CAMP trios combined] for the Global Biobank Meta-analysis Initiative [GBMI]) and reveal consistent associations with spirometry measures and atopy markers across both studies, as 13 of 21 traits (62%) were significantly associated with the GBMI-PRS in the meta-analysis after multiple-testing correction. Moreover, the results of the integrative proteomic analysis implicate IL-1 signaling in the etiology of asthma, reinforcing the candidacy of IL1R1 antagonists for drug repurposing.

18.
Nature (Science) 2026-06-17

Molecular basis of polyadenylated RNA fate determination in the nucleus

作者:

Eukaryotic genomes generate a plethora of polyadenylated (pA+) RNAs1,2, which are packaged into ribonucleoprotein particles (RNPs). To ensure faithful gene expression, functional pA+ RNPs, including protein-coding RNPs, are exported to the cytoplasm, whereas transcripts within non-functional pA+ RNPs are degraded in the nucleus1–4. How cells distinguish these opposing fates remains unknown. The DExD-box ATPase UAP56 (also known as DDX39B) is a central component of functional pA+ RNPs, and promotes their docking to the nuclear pore complex-anchored TREX-25,6, which triggers transcript release from UAP56 to facilitate export7. Here we reveal that the poly(A) tail exosome targeting (PAXT) connection8 binds a TREX-2-like module, which releases pA+ RNAs from UAP56 for decay by the nuclear exosome. The core of this module consists of a LENG8–PCID2–SEM1 trimer, which we show is structurally and biochemically equivalent to the central GANP–PCID2–SEM1 trimer of TREX-2. Mutagenesis and transcriptomic data demonstrate that the nuclear fate of pA+ RNPs is governed by the contending actions of nucleoplasmic PAXT and nuclear pore complex-associated TREX-2, which interpret RNA-bound UAP56 as a signal for RNA decay or export, respectively. As RNA targets of PAXT are generally short and intron-poor, we propose an overall model for pA+ RNP fate determination whereby the distinct sub-nuclear localizations of PAXT and TREX-2 govern the degradation of short non-functional pA+ RNAs while allowing export of their longer and functional counterparts. Biochemical, structural and cell biological analyses reveal that UAP56 (DDX39B) assembles with a TREX-2–like module that redirects non-functional polyadenylated RNAs from export to degradation.

19.
arXiv (quant-ph) 2026-06-15

Efimov Effect in Ultracold Microwave-Shielded Polar Molecules

arXiv:2602.21433v2 Announce Type: replace-cross Abstract: A quantum-mechanical description is presented for the three-body physics of shielded dipolar molecules, including a prediction of observable Efimov physics. Despite the anisotropic and long-range nature of the interaction, shielding enables a regime in which universality emerges already at the two-body level and extends to the three-body sector, where Efimov physics emerges. On the negative side of the scattering-length resonance, computed trimer binding energies display the characteristic scaling expected for Efimov resonances. Finally, the sudden approximation can be used to create trimer bound states, starting from positive energy trap states as a way to create or detect these molecular trimers. Moreover, the three-body parameter expressed in dipolar units is found to be universal.

20.
arXiv (CS.AI) 2026-06-11

ConsistencyPlanner: Real-time Planning with Fast-Sampling Consistency Models

arXiv:2606.11569v1 Announce Type: cross Abstract: Closed-loop planning in complex, real-world driving scenarios presents a critical challenge for autonomous driving systems. While traditional rule-based methods are interpretable, their predefined heuristics lack the adaptability for dynamic traffic environments. Learning-based approaches have shown considerable promise. Conversely, learning-based approaches, despite their promise, struggle to balance the modeling diverse and multimodal driving behaviors and real-time planning, often leading to indecisive or unsafe actions. To address this limitation, we propose Consistency Planner, a real-time planning framework with fast-sampling consistency models. Our approach is built upon two key technical contributions. Efficient Multimodal Sampling: We employ fast-sampling consistency models to generate a diverse set of plausible future trajectories. This enables efficient, real-time exploration of multimodal actions, overcoming the computational bottlenecks of previous iterative generative methods. Heterogeneous Feature Fusion: We introduce an attention-enhanced decoder that dynamically integrates heterogeneous input features (including scene feature and action token) into a cohesive representation for robust planning. Extensive evaluation in the Waymax simulator demonstrates superior performance in safety metrics compared to existing methods, with particularly strong results in challenging dynamic scenarios.

21.
arXiv (CS.LG) 2026-06-11

Tensor Methods: A Unified and Interpretable Approach for Material Design

arXiv:2602.10392v2 Announce Type: replace Abstract: When designing new materials, it is often necessary to tailor the material design to have some desired properties. As the set of design parameters grow, the search space grows exponentially, making the actual synthesis and evaluation of all material combinations virtually impossible. Even using traditional computational methods such as Finite Element Analysis becomes too computationally heavy to search the design space. Recent methods use machine learning (ML) surrogate models to more efficiently determine optimal material designs; unfortunately, these methods often (i) are notoriously difficult to interpret and (ii) under perform when the training data comes from a non-uniform sampling of the design space. We suggest the use of tensor completion methods as an all-in-one approach for interpretability and predictions. We observe classical tensor methods are able to compete with traditional ML in predictions, with the added benefit of their interpretable tensor factors (which are given completely for free, as a result of the prediction). In our experiments, we are able to rediscover physical phenomena via the tensor factors, indicating that our predictions are aligned with the true underlying physics of the problem. This also means these tensor factors could be used by experimentalists to identify potentially novel patterns, given we are able to rediscover existing ones. We also study the effects of both types of surrogate models when we encounter training data from a non-uniform sampling of the design space. We observe more specialized tensor methods that can give better generalization in these non-uniforms sampling scenarios. We find the best generalization comes from a tensor model, which is able to improve upon the baseline ML methods by up to 5% on aggregate $R^2$, and halve the error in some out of distribution regions.

22.
arXiv (CS.AI) 2026-06-18

Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation

arXiv:2603.10827v2 Announce Type: replace-cross Abstract: Speech-aware large language models (LLMs) can accept speech inputs, yet their training objectives largely emphasize linguistic content or specific fields such as emotions or the speaker's gender, leaving it unclear whether they encode speaker identity. First, we propose a model-agnostic scoring protocol that produces continuous verification scores for both API-only and open-weight models, using confidence scores or log-likelihood ratios from the Yes/No token probabilities. Using this protocol, we benchmark recent speech-aware LLMs and observe weak speaker discrimination (EERs above 20% on VoxCeleb1). Second, we introduce a lightweight augmentation that equips an LLM with ASV capability by injecting frozen ECAPA-TDNN speaker embeddings through a learned projection and training only LoRA adapters. On TinyLLaMA-1.1B, the resulting ECAPA-LLM achieves 1.03% EER on VoxCeleb1-E, approaching a dedicated speaker verification system while preserving a natural-language interface.

23.
arXiv (math.PR) 2026-06-11

Heat kernel estimates for Markov processes with blowing-up jump kernels

arXiv:2512.24807v2 Announce Type: replace Abstract: In this paper, we establish sharp two-sided heat kernel estimates for a large class of purely discontinuous symmetric Markov processes on closed subsets $F$ of $\mathbb{R}^d$, whose jump kernels blow up on a Borel subset $\Sigma$ of $F$. We assume that $F\setminus \Sigma$ is a $\kappa$-fat set and is dense in $F$. To the best of our knowledge, this is the first work establishing sharp heat kernel estimates for jump processes whose jump kernels blow up on part of the state space. The jump kernels under consideration take the form $J(x,y)=|x-y|^{-d-\alpha}{\mathcal B}(x,y)$, where $\alpha\in (0,2)$ and the function ${\mathcal B}(x,y)$ blows up at a subset $\Sigma$ of $F$. A fundamental obstacle is that the tails of the jump measures are not uniformly bounded, and hence standard techniques in heat kernel analysis do not provide a priori off-diagonal estimates. To overcome this difficulty, we develop a new approach based on weighted integral estimates for the heat kernel that are sensitive to both the blow-up behavior of the jump kernel and the geometry of $F\setminus \Sigma$. Examples of processes falling within our general framework include traces of isotropic $\alpha$-stable processes in $C^{1,\rm Dini}$ sets, processes in Lipschitz sets arising in connection with the nonlocal Neumann problem, and a large class of resurrected self-similar processes in the closed upper half-space.

24.
arXiv (CS.AI) 2026-06-19

Stabilizing the Q-Gradient Field for Policy Smoothness in Actor-Critic Methods

arXiv:2601.22970v2 Announce Type: replace-cross Abstract: Policies learned via continuous actor-critic methods often exhibit erratic, high-frequency oscillations, making them unsuitable for physical deployment. Current approaches attempt to enforce smoothness by directly regularizing the policy's output. We argue that this approach treats the symptom rather than the cause. In this work, we theoretically establish that policy non-smoothness is fundamentally governed by the differential geometry of the critic. By applying implicit differentiation to the actor-critic objective, we prove that the sensitivity of the optimal policy is bounded by the ratio of the Q-function's mixed-partial derivative (noise sensitivity) to its action-space curvature (signal distinctness). To empirically validate this theoretical insight, we introduce PAVE (Policy-Aware Value-field Equalization), a critic-centric regularization framework that treats the critic as a scalar field and stabilizes its induced action-gradient field. PAVE rectifies the learning signal by minimizing the Q-gradient volatility while preserving local curvature. Experimental results demonstrate that PAVE achieves smoothness comparable to policy-side smoothness regularization methods, while maintaining competitive task performance, without modifying the actor.

25.
arXiv (CS.CV) 2026-06-17

SPATIA: Multimodal Generation and Prediction of Spatial Cell Phenotypes

Understanding how cellular morphology, gene expression, and spatial context jointly shape tissue function is a central challenge in biology. Image-based spatial transcriptomics technologies now provide high-resolution measurements of cell images and gene expression profiles, but existing methods typically analyze these modalities in isolation or at limited resolution. We address the problem by introducing SPATIA, a multi-level generative and predictive model that learns unified, spatially aware representations by fusing morphology, gene expression, and spatial context from the cell to the tissue level. SPATIA also incorporates a spatially conditioned generative framework with confidence-aware OT reweighting and morphology-profile alignment for modeling target-state morphology distributions. Specifically, we propose a confidence-aware flow matching objective that reweights weak optimal-transport pairs based on uncertainty. We further apply morphology-profile alignment to encourage biologically meaningful image generation, enabling the modeling of microenvironment-dependent phenotypic transitions. We assembled a multi-scale dataset consisting of 25.9 million cell-gene pairs across 17 tissues. We benchmark SPATIA against 18 models across 12 tasks, spanning categories such as phenotype generation, annotation, clustering, gene imputation, and cross-modal prediction. SPATIA achieves improved performance over state-of-the-art models, improving generative fidelity by 8% and predictive accuracy by up to 3%.