Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.LG) 2026-06-18

Smoothness-Based Derandomization of PAC-Bayes Bounds

arXiv:2606.19105v1 Announce Type: new Abstract: We study PAC-Bayes derandomization for smooth loss functions. Our goal is to obtain generalization bounds that hold with high probability for deterministic predictors by exploiting smoothness properties of both the loss and the predictor class. We show that passing from the Gibbs predictor to the deterministic predictor at the posterior mean has a precise cost, given by the generalization gap of the Jensen gap class. We control this class through its Rademacher complexity, leading to bounds for deterministic predictors that involve flatness quantities expressed in terms of parameter Jacobians and Hessians of the score map. The framework applies to both bounded and unbounded smooth loss functions, and we specialize the results to linear predictors and smooth neural networks. Finally, the Jacobian and Hessian quantities appearing in the theory motivate a practical regularizer. For BatchNorm networks, we compute this regularizer with respect to effective BatchNorm weights obtained by folding the BatchNorm transformation into the adjacent affine weights. Experiments on CIFAR-10 illustrate the behavior of this regularizer under different batch sizes.

02.
arXiv (CS.AI) 2026-06-18

ProfiLLM: Utility-Aligned Agentic User Profiling for Industrial Ride-Hailing Dispatch

arXiv:2606.18803v1 Announce Type: new Abstract: Bringing Large Language Models (LLMs) into industrial ride-hailing dispatch as semantic feature extractors over platform-scale behavioral logs is a compelling but under-explored data systems problem. Production matching pipelines remain dominated by structured numerical features, yet decisive behavioral signals (e.g., a driver's habitual aversion to certain regions) are inherently contextual and naturally expressible as LLM-generated user profiles. However, scaling such profiling to a live, millisecond-latency dispatcher faces three intertwined constraints rarely addressed together: on a platform with millions of daily orders, logs exceed any LLM's context window by orders of magnitude; most users are long-tail, with too few interactions for per-user profiling; and surface-fluent profiles do not necessarily improve downstream prediction utility. We present ProfiLLM, an agentic LLM data pipeline that operationalizes utility-aligned user profiling for production matching systems through two modules. (1) Tool-Augmented Global Knowledge Mining equips an LLM agent with 27 analytical tools to mine platform-scale data, producing reusable global knowledge, adaptive user clustering rules, and region-level supply-demand priors. (2) Utility-Aligned Profile Exploration generates multiple candidate profiles per cluster, evaluates them via a lightweight downstream utility proxy, iteratively refines the best candidates and constructs preference pairs for DPO fine-tuning. Deployed on DiDi's production dispatcher, ProfiLLM achieves up to +6.14% relative AUC improvement in outcome prediction, up to +4.35% GMV gain in dispatching simulation, and consistent improvements in a 14-day online A/B test including +0.47% GMV, +0.33% Completion Rate, and -0.82% Cancel-Before-Accept rate.

03.
arXiv (CS.CV) 2026-06-25

Benchmarking Vision-Language Models for Microscopic Plant Image Understanding

Microscopic imaging provides essential visual evidence for studying plant biology and pathology at the cellular and subcellular levels. However, existing benchmarks on vision-language models primarily focus on macroscopic plant imagery, while the microscopic domain remains underexplored. To address this gap, we present PlantMicro, a comprehensive benchmark for evaluating vision-language models (VLMs) in microscopic plant imagery. PlantMicro integrates more than 5,000 images collected across diverse hosts, biological domains, and imaging modalities. Building on this diversity, we design a set of complementary tasks that capture different facets of microscopic image understanding. To support these tasks, we construct over 9,000 VQA pairs that systematically evaluate the capabilities of VLMs. Experiments on PlantMicro show that current VLMs struggle with fine-grained recognition and biologically grounded reasoning. For example, GPT-5 achieves 34.93% accuracy on the pathogen classification task, which is only modestly above the random-guessing baseline. The results highlight a significant gap in current VLMs' ability to comprehend plant microscopic images. PlantMicro provides a standardized foundation for advancing VLMs toward reliable and comprehensive microscopy-level plant understanding.

04.
bioRxiv (Bioinfo) 2026-06-11

Combinatorial docking and molecular generation to navigate over 100-billion molecules for prospective ligand discovery

Commercially available make-on-demand libraries now exceed 100 billion compounds, requiring over 50 years to screen on 2,000 CPU cores using conventional docking. We present two complementary approaches to address this challenge. CombiDOCK, a combinatorial docking framework, enables exhaustive screening at the 100-billion scale within 40 days. MINT-Dock, a generative framework, accelerates navigation of this space by integrating CombiDOCK with Monte Carlo Tree Search. Benchmarked on 46 diverse targets, CombiDOCK matched full-molecule docking accuracy, and MINT-Dock achieved a 4,800-fold enrichment over random selection. Compared with prior billion-scale brute-force campaigns against {sigma}2, VMAT2, and VAChT, prospective CombiDOCK screens of the 100-billion-molecule library yielded higher hit rates and more potent ligands, while MINT-Dock achieved comparable outcomes across single- and multi-target objectives with >20-fold computational cost reductions. Docking-predicted poses of the best VAChT-binding compounds were confirmed by cryo-EM structures. These methods provide exhaustive and generative paths for navigating the trillion-molecule frontier of drug discovery.

05.
arXiv (CS.CL) 2026-06-16

Know Your Limits : On the Faithfulness of LLMs as Solvers and Autoformalizers in Legal Reasoning

Large Language Models (LLMs) achieve strong performance on reasoning tasks, but whether this reflects faithful logical inference or heuristic approximation remains unclear. We study this question in legal entailment by comparing three paradigms, including pure LLM classification, LLM-based Formal Reasoning, and solver-based Formal Reasoning using the Z3 SMT solver, on a re-annotated subset of ContractNLI across five LLMs. Our re-annotation reveals a systematic and measurable gap between pragmatic legal interpretation and strict formal entailment, where a substantial proportion of legally sound inferences are not formally grounded without additional unstated assumptions. While introducing formal structure improves accuracy, with LLM-based Formal Reasoning achieving the highest benchmark performance, we show that this gain does not imply faithful reasoning. We identify three recurring failure modes: scope laundering, where LLMs report solver-inconsistent classifications without executing the underlying formal reasoning, producing conclusions that appear logically grounded but are not; implicit constraint blindness, where LLMs overlook logical constraints present in formal representations; and program synthesis failures, where LLMs generate incorrect Z3 code despite structured prompting. Critically, scope laundering persists across all models, raising serious concerns about the faithfulness of LLM-based formal reasoning as a proxy for symbolic execution. These results reveal a fundamental gap between benchmark accuracy and logical faithfulness.

06.
arXiv (CS.CV) 2026-06-19

One-Shot Novel View and Pose Human Image Synthesis via 3D Prior Guided Diffusion Model

This paper addresses the challenge of one-shot novel view and pose human image synthesis. The existing methods transfer the reference human image to a target pose using a set of 2D pose keypoints or synthesize human images based on generalizable human NeRF which uses human model priors to extract point-wise features. However, pose transfer based methods can not handle complex human pose using ambiguous 2D pose as the condition, while generalizable human NeRFs may be inaccurate to recover occluded/invisiable human parts without extracted reliable features. To solve these problems, we propose a novel approach for novel view and pose synthesis from a singe human image via conditional denoising diffusion model. Our diffusion model divides the novel view and pose synthesis problem into a sequence of conditional denoising steps. Specifically, to generate humans with complex and arbitrary poses, we introduce 3D human priors, i.e., 3D normal map and color prompt, as geometry and color conditions into the generation process. By transferring the reference human into the target human with a series of diffusion steps, our diffusion model enables high-quality synthesis including the occluded/invisible parts. Further, we propose a self-reconstruction based customized refinement to enhance fine details when tested on novel persons.Experimental results on different public datasets demonstrate that our approach significantly outperforms previous methods and also shows better generalization ability across datasets. The code will be made publicly available at https://github.com/Yankeegsj/3DPGDM.

07.
arXiv (CS.AI) 2026-06-16

Epileptic Seizure Detection in Separate Frequency Bands Using Feature Analysis and Graph Convolutional Neural Network (GCN) from Electroencephalogram (EEG) Signals

arXiv:2604.00163v2 Announce Type: replace-cross Abstract: Epileptic seizures are neurological disorders characterized by abnormal and excessive electrical activity in the brain, resulting in recurrent seizure events. Electroencephalogram (EEG) signals are widely used for seizure diagnosis due to their ability to capture temporal and spatial neural dynamics. While recent deep learning methods have achieved high detection accuracy, they often lack interpretability and neurophysiological relevance. This study presents a frequency-aware framework for epileptic seizure detection based on ictal-phase EEG analysis. The raw EEG signals are decomposed into five frequency bands (delta, theta, alpha, lower beta, and higher beta), and eleven discriminative features are extracted from each band. A graph convolutional neural network (GCN) is then employed to model spatial dependencies among EEG electrodes, represented as graph nodes. Experiments on the CHB-MIT scalp EEG dataset demonstrate high detection performance, achieving accuracies of 97.1%, 97.13%, 99.5%, 99.7%, and 51.4% across the respective frequency bands, with an overall broadband accuracy of 99.01%. The results highlight the strong discriminative capability of mid-frequency bands and reveal frequency-specific seizure patterns. The proposed approach improves interpretability and diagnostic precision compared to conventional broadband EEG-based methods.

09.
medRxiv (Medicine) 2026-06-10

Developmental Associations Linking Childhood Trauma and Early Cannabis Use to Adolescent DNA Methylation and Psychotic-Like Experiences

Background. Psychotic-like experiences (PLEs) index early risk for psychotic disorders and are consistently associated with childhood trauma, yet underlying biological mechanisms remain poorly understood. DNA methylation (DNAm) may capture the biological embedding of early adversity, while adolescent exposures such as cannabis use may modify these processes. We examined epigenome-wide associations of childhood trauma and PLEs, tested the moderating role of early cannabis use, and evaluated DNAm as a potential mediator. Methods. We analysed data from the Avon Longitudinal Study of Parents and Children (ALSPAC), a UK population-based birth cohort. Childhood trauma was assessed prospectively and retrospectively. Epigenome-wide DNAm was measured in peripheral blood at ~17 years using the Illumina 450K array, and PLEs were assessed at 18 using a structured interview. Epigenome-wide association studies were conducted for trauma-DNAm and DNAm-PLEs associations in the final sample (n = 1,457), adjusting for demographic, biological, and technical covariates. Differentially methylated regions (DMRs) were identified using DMRff, followed by functional enrichment analyses. Cannabis use at 15.5 was modelled as a moderator with multiple imputation for missing data. Mediation was tested using the Divide-Aggregate Composite-null Test (DACT). Results. Childhood trauma was associated with widespread DNAm differences, primarily at the regional level, with enrichment in pathways related to cellular stress responses. In contrast, DNAm associated with PLEs was more limited and implicated loci involved in epigenetic regulatory processes. These signatures were largely distinct, and there was no evidence supporting mediation after multiple testing correction. Incorporating cannabis use altered the pattern and extent of DNAm associations, with stronger and more significant signals observed at both CpG and regional levels, although these did not translate into evidence of mediation. Conclusion. Childhood trauma and PLEs show distinct DNAm signatures in adolescence, with trauma-related DNAm reflecting broad stress-related processes and PLE-associated DNAm implicating regulatory mechanisms. We found little evidence that DNAm mediates the trauma-PLE association. Instead, adolescent exposures, particularly cannabis use, may distinctly influence trauma-related epigenetic variation with limited detectable downstream effects on PLEs. These findings support a context-dependent model of epigenetic risk and highlight the need for larger longitudinal studies to clarify causal pathways linking early adversity to psychosis.

10.
arXiv (CS.LG) 2026-06-16

Circuit Tracing in Autoregressive Protein Language Models

arXiv:2606.16044v1 Announce Type: new Abstract: Protein language models (pLMs) can generate novel protein sequences with properties beyond those observed in nature, yet the mechanisms underlying protein generation remain poorly understood. Existing mechanistic interpretability methods based on sparse autoencoders and transcoders primarily focus on protein representation learning models and do not capture the computation required for autoregressive generation. Here, we introduce ProGenMech, a mechanistic interpretability framework for generative protein language models that extends cross-layer transcoders (CLTs) to ProGen3, a sparse Mixture-of-Experts model trained for both causal generation and span infilling. Unlike per-layer approaches, CLTs reconstruct each layer using sparse latent variables from all preceding layers, enabling faithful recovery of inter-layer generative computation. We further develop a zero-shot circuit discovery framework to identify sparse latent circuits responsible for protein generation and fitness prediction. In causal generation and zero-shot fitness estimation tasks, ProGenMech outperforms local transcoder baselines in recovering ProGen3's probability distribution and functional scoring behavior, while matching the original model's generative distribution in span infilling tasks. Moreover, the recovered circuits reveal biologically meaningful motifs and functional regions associated with conserved sequence patterns and protein fitness landscapes, establishing a foundation for interpretable and steerable protein generation.

11.
arXiv (CS.CV) 2026-06-16

All Eyes on the Workflow: Automated and Efficient Event Discovery from Video Streams

Disciplines such as business process management and process mining aid organizations by discovering insights about processes on the basis of recorded event data. However, an obstacle to process analysis is data multi-modality: for instance, data in video form are not directly interpretable as events. Existing approaches rely on a dictionary of activity label as input, cannot provide frame-by-frame labeling explanations, or rely on superseded computer vision techniques. In this work, we present SnapLog, an approach to extract event data from videos by converting frames to feature vectors using image embeddings and performing temporal segmentation through frame-wise similarity matrices. A generalized few-shot classification is then used to assign labels to the video segments, yielding labeled, timestamped sub-sequences of frames that are interpretable as events. Conventional process mining techniques can be used to analyze the resulting data. We show that our approach produces logs that accurately reflect the process in the videos.

12.
arXiv (CS.CL) 2026-06-16

A Large-Scale Multi-Dimensional Empirical Study of LLMs for Conversation Summarization

Despite the significant advancement of LLMs in conversation summarization, their evaluation remains limited by insufficient scenarios, input lengths, and sample sizes. Furthermore, existing benchmarks often omit frontier reasoning systems and efficient small models, or lack fine-grained, multi-dimensional assessments. To bridge these gaps, we propose OmniCSEval, a unified benchmark comprising 1,800 diverse conversations across six real-world scenarios, featuring context lengths ranging from 128 to 32k tokens. For fine-grained evaluation, we employ a bidirectional fact-checking framework that integrates key fact matching to assess completeness and conciseness, alongside summary fact verification to evaluate faithfulness. To ensure reliable assessment, we establish a human-LLM collaborative pipeline for key fact extraction and a multi-LLM consensus verifier for summary fact decomposition. Leveraging this framework, we evaluate 28 LLMs across four distinct categories grouped by reasoning capability and model scale. Our extensive empirical study reveals critical insights regarding the cross-scenario challenges current LLMs continue to face, the impacts of reasoning and scale, and the efficiency and adaptability of reasoning models. We also provide guidance for system selection in real-world deployments.

13.
arXiv (quant-ph) 2026-06-24

Optimizing LOCC Protocols on Product Stiefel Manifold

arXiv:2510.06909v2 Announce Type: replace Abstract: Characterizing the operational limits of Local Operations and Classical Communication (LOCC) is a central problem in distributed quantum information, yet remains computationally intractable due to the non-convex geometry of the LOCC set. We introduce a geometric framework that embeds the physical constraints of fixed-round LOCC protocols onto the product Stiefel manifold, converting a constrained protocol-design problem into unconstrained Riemannian optimization. We demonstrate this framework through entanglement distillation: by directly optimizing finite-copy LOCC protocols, we discover achievable protocols whose fidelities match positive partial transpose (PPT) upper bounds to within numerical precision, and we provide numerical evidence for both the operational advantage of adaptive communication rounds and the super-additivity of coherent information under two-way processing. These results establish Riemannian manifold optimization as a practical tool for probing the physical limits of future quantum networks.

14.
arXiv (CS.CL) 2026-06-25

Invisible to humans, visible to machines: a preregistered audit of Unicode fidelity across four biomedical bibliographic APIs

Biomedical text mining, scientometrics, and the construction of training corpora for biomedical large language models (LLMs) all assume that the abstract text returned by a bibliographic API faithfully reproduces the published abstract. This pre-registered audit (OSF osf.io/269b5) tests that assumption for four widely used public APIs (PubMed E-utilities, Crossref, OpenAlex, Semantic Scholar) against PubMed Central (PMC) JATS XML as a common ground truth. From a complete enumeration of the PMC Open Access subset for 2024 (about 700,000 records), a simple random sample of 4,000 English-language research articles was drawn; for each, we recorded whether Unicode characters from four pre-specified classes present in the JATS abstract (typographic punctuation, mathematical/scientific symbols, Greek letters, special whitespace) were preserved by each API. Two systematic, deterministic losses met the pre-registered criterion (upper 95% CI bound below 5%): the PubMed AbstractText field preserved typographic punctuation in only 0.6% of eligible abstracts (95% CI 0.3-1.0%), and OpenAlex preserved special whitespace in 0% (0.0-0.4%). A blinded mechanism audit attributed the first loss to character substitution and the second to inverted-index serialization. Mathematical symbols and Greek letters were preserved faithfully (over 95%) by all four APIs. Separately, Crossref returned no abstract for 24.6% of papers (coverage 75.4%, 95% CI 74.1-76.7%), concentrated in specific publishers (Elsevier and ACS: 0%). Character-level fidelity is therefore API-dependent and undocumented: the same publisher-deposited JATS text carries different surface signatures depending on the serving API, with direct consequences for tokenization-sensitive bibliometrics, corpus construction, and character-level indicators of LLM-assisted writing.

15.
arXiv (CS.LG) 2026-06-17

Sum-of-Squares Degree Barriers for the Reweighted-Hinge Method in Robust Halfspace Learning: A Christoffel-Function Characterization

作者:

arXiv:2606.17215v1 Announce Type: new Abstract: A certificate that removes outliers sees the data only through its low-degree moments, and an adversary exploits exactly this, hiding corruption where the clean data already looks typical, in the blind spot no bounded-degree test resolves. That blind spot turns out to have an exact size: the Christoffel function of the clean marginal, the very quantity modern data analysis thresholds to detect outliers, here read from the adversary's side as the corruption a bounded-degree certificate cannot remove. We turn this inversion into the organizing principle of the reweighted-hinge approach to robustly learning $\gamma$-margin halfspaces under malicious noise (Shen, 2025; Zeng and Shen, 2025): the governing resource is the Sum-of-Squares degree of the outlier-removal certificate, and the resolution principle states that the maximal corruption mass which can hide at a center $c$ from a degree-$2t$ certificate is exactly the Christoffel function $\lambda_{t+1}(c)$ of the clean marginal. Three consequences follow, all against the certificate method (not information-theoretic). A margin-degree tradeoff: certifying the dense pancake to error $\epsilon$ costs SoS degree $\Omega(\log(1/\epsilon))$ or margin $\Omega(\sqrt{\log(1/\epsilon)}/\sqrt{d})$, explaining why the $\log(1/\epsilon)$ margin Shen (2025) records is forced, with a weighted-Chebyshev reduction making the threshold $2t=\Theta((|c|/s)^2)$ tight modulo one classical weighted-extremal estimate. A degree-$2$ outlier barrier: the resolution principle realized as an explicit instance on which degree $2$ is stuck at $\eta^{1/2}$ while degree $4$ escapes, locating the method's small breakdown rate in the degree, not the analysis. And a degree-$2t$ algorithm tracing the frontier $\eta^{1-1/2t}$ (recovering Shen (2025) at $t=1$), whose gain is an explicit constant, capped by the pancake density and shown unimprovable by the degree-$2$ barrier.

16.
PLOS Medicine 2026-05-21

U = U for all: Advancing equity in HIV prevention

by Thiago S. Torres, Paula M. Luz Suppression of HIV with antiretrovirals eliminates HIV transmission risk, summarized as Undetectable = Untransmittable (U = U). However, U = U literacy remains unevenly understood and shared, and stigmas persist. Equitable and accurate awareness of U = U requires culturally tailored interventions, improved provider education, and supportive policy environments beyond biomedical evidence alone. Suppression of HIV with antiretrovirals eliminates HIV transmission risk, summarized as Undetectable = Untransmittable (U=U). However, U=U literacy remains unevenly understood and shared, and stigmas persist. In this Perspective, Thiago Torres and Paula Luz outline what is needed to improve equity and accuracy in global awareness and education of U=U.

17.
arXiv (CS.AI) 2026-06-24

Uncertainty-Aware Longitudinal Forecasting of Alzheimer's Disease Progression Using Deep Learning

arXiv:2606.24604v1 Announce Type: new Abstract: Longitudinal modelling of Alzheimer's disease progression is clinically useful only if it can describe not just the most likely next diagnosis, but how a patient may evolve over time and how reliable that forecast is. Most deep learning approaches reduce this problem to single-step classification, treating cognitively normal, mild cognitive impairment, and dementia as flat categories while providing limited insight into how uncertainty accumulates across future visits. We propose a probabilistic framework that combines ordinal diagnosis prediction, multi-horizon trajectory generation, and decomposed uncertainty estimation. A Temporal Fusion Transformer encoder is adapted with a CORAL ordinal output layer, asymmetric loss weighting, and converter oversampling to respect disease-stage ordering and improve sensitivity to MCI-to-dementia transitions. Conditioned on the learned patient-context representation, an autoregressive Mixture Density Network generates five-year probabilistic trajectories for diagnosis state, CDR Sum of Boxes, MMSE orientation, and hippocampal volume. On ADNI, the model outperforms linear, recurrent, and transformer baselines for next-visit diagnosis prediction, with the strongest gains on MCI-versus-dementia discrimination. Generated trajectories achieve near-nominal 90% credible interval coverage, widening uncertainty across the forecast horizon, and biomarker dynamics consistent with expected Alzheimer's disease progression. We further separate aleatoric from epistemic uncertainty using analytic mixture variance and a five-member bootstrap ensemble, which provides the strongest encoder diversity and output-level epistemic signal. Epistemic uncertainty is higher for rare progression archetypes, MCI and dementia patients, and under external evaluation on OASIS-3, where it increases alongside prediction error.

18.
arXiv (CS.AI) 2026-06-16

Multi-agent Framework for Time-Sensitive Complementary Collaboration in Minecraft

arXiv:2606.15684v1 Announce Type: new Abstract: We present TickingCollabBench, a Minecraft-based multi-agent benchmark for a novel class of time-sensitive complementary collaboration tasks. Our benchmark reflects four core characteristics of real-world collaboration: agent heterogeneity, mandatory collaboration, dynamic environments, and strict real-time constraints with failure risks. To enable this, we develop the TickingCollab framework, which supports the generation of diverse dynamic environments and abstracts Minecraft's primitive APIs to enable declarative YAML task specifications for composing these events. Building on this, we design a feasibility-aware automated benchmark generation pipeline, where an LLM drafts structurally diverse task configurations and feasibility verifier filters out invalid ones using approximate constraints. Evaluations demonstrate that lang latency and inherent difficulty of coordinating under partial observability and agent heterogeneity cause LLMs to frequently fail under dynamic environments and fall significantly short of a global-knowledge oracle.

19.
arXiv (CS.LG) 2026-06-12

LongSpike: Fractional Order Spiking State Space Models for Efficient Long Sequence Learning

arXiv:2606.12895v1 Announce Type: new Abstract: Spiking Neural Networks (SNNs) are well-regarded for their biological plausibility and energy efficiency in processing sequential data. However, dominant SNN architectures typically rely on first-order Ordinary Differential Equations (ODEs) to govern neuronal state transitions. This first-order assumption imposes a "memoryless" bottleneck, limiting the model's capacity to capture the complex, long-range dependencies inherent in long-sequence tasks. In this work, we propose LongSpike, a novel SNN framework that integrates fractional-order State-Space Modeling, or f-SSM, from control theory into the spiking domain. By extending traditional integer-order SSMs to the fractional-calculus regime, LongSpike enables the hierarchical integration of neuronal dynamics with long-memory kernels. To mitigate the computational overhead and parallelization challenges typically associated with fractional operators, we leverage a state-space formulation that supports efficient, parallel training. Empirical evaluations on challenging benchmarks, including Long Range Arena (LRA), large-scale WikiText-103, and Speech Commands, demonstrate that LongSpike outperforms state-of-the-art SNNs in accuracy while preserving sparse synaptic computation. The code is available at https://github.com/xinruihe389-commits/LongSpike.

20.
arXiv (CS.CL) 2026-06-15

Knowing When to Quit: A Principled Framework for Dynamic Abstention in LLM Reasoning

LLMs utilizing chain-of-thought reasoning often waste substantial compute by producing long, incorrect responses. Abstention can mitigate this by withholding outputs unlikely to be correct. While most abstention methods decide to withhold outputs before or after generation, dynamic mid-generation abstention considers early termination of unpromising reasoning traces at each token position. Prior work has explored empirical variants of this idea, but principled guidance for the abstention rule remains lacking. We present a formal analysis of dynamic abstention for LLMs, modeling abstention as an explicit action within a regularized reinforcement learning framework. An abstention reward parameter controls the trade-off between compute and information. We show that abstaining when the value function falls below this reward strictly outperforms natural baselines under general conditions. We further derive a principled and efficient method to approximate the value function. Empirical results on mathematical reasoning and toxicity avoidance tasks support our theory and demonstrate improved selective accuracy over existing methods.

21.
arXiv (quant-ph) 2026-06-12

Relativistic Locality from Electromagnetism to Quantum Field Theory

arXiv:2412.11532v2 Announce Type: replace Abstract: Electromagnetism is the paradigm case of a theory that satisfies relativistic locality. This can be proven by demonstrating that, once the theory's laws are imposed, what is happening within a region fixes what will happen in the contracting light-cone with that region as its base. The Klein-Gordon and Dirac equations meet the same standard. We show that this standard can also be applied to quantum field theory (without collapse), examining two different ways of assigning reduced density matrix states to regions of space. Our preferred method begins from field wave functionals and judges quantum field theory to be local. Another method begins from particle wave functions (states in Fock space) and leads to either non-locality or an inability to assign states to regions, depending on the choice of creation operators. We take this analysis of quantum field theory (without collapse) to show that the many-worlds interpretation of quantum physics is local at the fundamental level. We argue that this fundamental locality is compatible with either local or global accounts of the non-fundamental branching of worlds, countering an objection that has been raised to the Sebens-Carroll derivation of the Born Rule from self-locating uncertainty.

22.
medRxiv (Medicine) 2026-06-24

Who funds stroke trials in Europe? A survey of funding sources for randomised controlled stroke trials by the European Stroke Organisation Trials Alliance (ESOTA) network

Abstract Aims and scope Evidence from randomised controlled trials (RCTs) has transformed stroke care. There are no systematically collected data on the amount of public funding, critical to delivering trials, going into stroke RCTs. To understand the extent of stroke RCT funding by national and EU funding bodies across Europe, the European Stroke Organisation Trials Alliance (ESOTA) conducted a survey of its member nations. Methods This is an observational study of research funding in Europe. The ESOTA steering group sent an electronic survey to the leads of the 16 participating national networks from 14 countries. Structured survey questions included who the funding bodies were in each country, the number of RCT applications put forward for public national or EU funding, the number of successful and failed applications, and the amount of funding granted between 01/01/2022 and 31/12/2023. Results Responses were received from 13 of 14 participating countries. There was significant variation in the number of grant applications submitted by individual countries, ranging from 0-17 during the 24-month survey period. The median number of funded studies per country was 1 (IQR 3, range 0-9) representing a median success rate of 47.1 % (IQR 21.1-59.4%), with no RCTs granted joint European funding. Conclusions Our survey highlights significant inequities in stroke trial funding across Europe. Given the encouraging rate of successful applications overall, it is important for all member networks to submit proposals. This is particularly pertinent for multicentre trials, given the evolution of evidence base in stroke towards large trials, across diverse populations.

23.
arXiv (quant-ph) 2026-06-25

A Mean-Field Lindblad Master Equation Framework for Interaction-Driven Decoherence in Solid-State Qubit Ensembles

arXiv:2606.25261v1 Announce Type: new Abstract: Multi-qubit systems are essential for scalable quantum technologies, but their performance is often limited by decoherence from qubit–qubit interactions and environmental noise. Although environmental decoherence in single-qubit systems and gate fidelity in multi-qubit systems have been widely studied, a predictive framework connecting qubit interactions, concentration, spatial distribution, and bath occupation to relaxation and decoherence times remains lacking. Here, we develop a multi-qubit mean-field Lindblad master equation (MQMF-LME) framework for the population and coherence dynamics of a solid-state qubit in an interacting multi-qubit environment. The framework treats one qubit as the system of interest and the surrounding qubits as an effective bath, incorporating intrinsic relaxation and bidirectional excitation transfer between the system and the bath. Analytical solutions provide closed-form expressions for density-matrix dynamics, steady-state populations, relaxation time $T_1$, and decoherence time $T_2$, while numerical simulations extend the framework to concentration-dependent dynamics, $1/f$-noise-induced dephasing, and material-specific excitation-transfer mechanisms. For a model system with Förster resonance energy transfer (FRET)-mediated excitation exchange, higher qubit concentrations reduce both $T_1$ and $T_2$, whereas $1/f$ noise reduces $T_2$ without changing $T_1$. Applied to Er$^{3+}$-doped CeO$_2$, the framework shows that long-range FRET-mediated excitation transfer reproduces the experimental decrease in relaxation time with dopant concentration, whereas short-range Dexter-type exchange does not, identifying FRET-mediated excitation transfer as the dominant mechanism. The MQMF-LME framework provides a modular route for linking microscopic interactions and environmental noise sources to measurable decoherence times in solid-state multi-qubit systems.

24.
arXiv (math.PR) 2026-06-18

Probabilistic representation and classical solutions of wave equations with complex polynomial nonlinearities

arXiv:2606.18919v1 Announce Type: cross Abstract: We review the probabilistic representation of solutions of wave equations with polynomial nonlinearities in spatial dimensions d=1,2,3 using stochastic branching processes. Under regularity assumptions on the initial data, we derive conditions ensuring the integrability of the corresponding Monte Carlo estimator, and the existence and smoothness of mild and classical solutions. We also present numerical results and comparisons with grid-based algorithms for the solution of nonlinear wave equations.

25.
arXiv (CS.CV) 2026-06-18

Prior-guided Fusion of Multimodal Features for Change Detection from Optical-SAR Images

Multimodal change detection (MMCD) identifies changed areas in multimodal remote sensing data, demonstrating significant application value in land use monitoring and urban sustainable development. However, literature MMCD approaches exhibit limitations in both cross-modal interaction and exploiting modality-specific characteristics. This leads to insufficient modeling of fine-grained change information, thus hindering the precise detection of semantic changes. To address these problems, we propose STSF-Net, a framework designed for MMCD between optical and SAR images. STSF-Net jointly models modality-specific and spatio-temporal common features to enhance change representations. Specifically, modality-specific features are exploited to capture genuine semantic change signals, while spatio-temporal common features are embedded to suppress pseudo-changes caused by differences in imaging mechanisms. Furthermore, we introduce an optical and SAR feature fusion strategy that adaptively adjusts multimodal feature importance based on semantic priors obtained from visual foundation models. Finally, we introduce the novel Delta-SN6 dataset, the first openly-accessible multiclass MMCD benchmark consisting of very-high-resolution fully polarimetric SAR and optical images. Experimental results on Delta-SN6, BRIGHT, and Wuhan datasets demonstrate that our method outperforms the state-of-the-art by 3.21%, 0.87%, and 1.32% in mIoU, respectively.