Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.CV) 2026-06-15

Gefen: Optimized Stochastic Optimizer

AdamW is a default optimizer for modern deep learning, but its first and second moment states add roughly two parameter-sized buffers to training memory. We propose Gefen, a memory-efficient optimizer that automatically shares second-moment estimates across parameter blocks and quantizes the first moment using a learned codebook, thereby reducing AdamW's memory footprint by ~8x while maintaining the same performance, corresponding to a reduction of 6.5 GiB per billion parameters. The method is motivated by a theoretical result showing that large mixed Hessian entries constrain the ratio of squared gradients toward one, suggesting that Hessian-aligned parameters are natural candidates for sharing second-moment statistics. Since computing Hessians is impractical at scale, Gefen infers block structure from the initial squared gradients, requiring no architecture-specific metadata or hyperparameters beyond AdamW defaults. Gefen learns an exact histogram-based dynamic-programming quantization codebook and reuses the same blocks for first-moment scaling. Across diverse experiments, Gefen achieves the lowest peak optimizer memory among the compared AdamW-like methods while maintaining AdamW-level performance. In FSDP and DDP training, the reduced memory footprint enables larger microbatches and improves throughput significantly over AdamW, providing a practical drop-in replacement with lower memory usage that can increase throughput and enable training larger models or using larger batch sizes. We provide the complete Python implementation, including fused CUDA kernels at https://github.com/ndvbd/Gefen

02.
arXiv (CS.CV) 2026-06-15

Giving AI a Headache: Acoustic Adversarial Attacks to Computer Vision Applications

Artificial Intelligence (AI) is increasingly used to automate a variety of real-world computer vision (CV) applications, such as autonomous vehicle control, facial recognition, and security cameras. Recent research has shown that acoustic vibration can induce real physical motion in cameras, interfering with their internal stabilization mechanisms. Because the motion falls outside the conditions the stabilization system was designed to handle, the system introduces artifacts into the frame, causing AI-based CV models to misclassify, miss targets, or hallucinate objects. Previous work used ultrasonic frequencies (>20 kHz) to perform short-range attacks, which limits them to short distances due to the attenuation exhibited by high frequencies. In this work, we investigate acoustic attacks using lower frequencies in the audible range (

03.
arXiv (CS.CV) 2026-06-11

How Seemingly Inconsequential Design Choices Dictate Performance of LLMs in Pathology

General-purpose large language models (LLMs) are routinely used as baselines when evaluating specialized pathology models on whole-slide images (WSIs). Because WSIs exceed contemporary model context limits, LLM baselines routinely use small, high-magnification patches processed independently via majority voting, without systematic evaluation of seemingly inconsequential design choices such as patch size, patch count, and magnification. Generalist LLMs have consistently underperformed specialized systems, reinforcing the perception that domain-specific training or architectural adaptation is necessary for pathology tasks involving WSIs. Here, we conduct a systematic factorial analysis of four input design factors: inference mode, patch size, magnification, and patch count. We demonstrate that prior studies have overstated the gap between specialized models and general-purpose LLMs by choosing non-optimized input configurations. On the MultiPathQA benchmark, switching to a single balanced configuration (large patches at lower magnification, processed jointly) raises GPT-5 from 15.1% to 39.5% on cancer-type classification (TCGA) and from 38.1% to 62.9% on organ classification (GTEx). Per-task optimization yields further gains up to 43.9% (TCGA) and 71.6% (GTEx). The same configuration generalizes to two other models and to a fully held-out CPTAC cohort, where it improves Gemini 3 Flash by 23.4 percentage points without any task-specific tuning.

04.
arXiv (CS.CL) 2026-06-18

Approximate Structured Diffusion for Sequence Labelling

Sequence labelling, a core task of Natural Language Processing (NLP), consists in assigning each token of an input sentence a label. From a Machine Learning point of view, sequence labelling is often cast as a Linear-Chain Conditional Random Field (CRF) parametrised by a neural network. While this approach gives good empirical results, CRFs assume a finite decision span (eg label bigrams) which can limit their expressivity and hurt performance when long-range dependencies are required. We show we can leverage diffusion to train a CRF conditioned on an entire label sequence, with the caveat that the condition is on a noisy version of labels. We show experimentally that this method, in conjunction with approximate CRF inference, improves label accuracy with a 16.5% error reduction for POS-tagging.

05.
arXiv (quant-ph) 2026-06-11

Tight Bounds for Quantum Phase Estimation and Related Problems

arXiv:2305.04908v3 Announce Type: replace Abstract: Phase estimation, due to Kitaev [arXiv'95], is one of the most fundamental subroutines in quantum computing. In the basic scenario, one is given black-box access to a unitary $U$, and an eigenstate $\lvert \psi \rangle$ of $U$ with unknown eigenvalue $e^{i\theta}$, and the task is to estimate the eigenphase $\theta$ within $\pm\delta$, with high probability. The cost of an algorithm for us is the number of applications of $U$ and $U^{-1}$. We tightly characterize the cost of several variants of phase estimation where we are no longer given an eigenstate, but are required to estimate the maximum eigenphase of $U$, aided by advice in the form of states (or a unitary preparing those states) which are promised to have at least a certain overlap $\gamma$ with the top eigenspace. We give algorithms and nearly matching lower bounds for all ranges of parameters. We show that a small number of copies of the advice state (or of an advice-preparing unitary) are not significantly better than having no advice at all. We also show that having lots of advice (applications of the advice-preparing unitary) does not significantly reduce cost, and neither does knowledge of the eigenbasis of $U$. We immediately obtain a lower bound on the complexity of the Unitary recurrence time problem, resolving an open question of She and Yuen~[ITCS'23]. Lastly, we study how efficiently one can reduce the error probability in the basic phase-estimation scenario. We show that a phase-estimation algorithm with precision $\delta$ and error probability $\epsilon$ has cost $\Omega\left(\frac{1}{\delta}\log\frac{1}{\epsilon}\right)$, matching an easy upper bound. This contrasts with some other scenarios in quantum computing (e.g., search) where error-probability reduction costs only a factor $O(\sqrt{\log(1/\epsilon)})$. Our lower bound uses a variant of the polynomial method with trigonometric polynomials.

06.
arXiv (CS.LG) 2026-06-15

Hybrid Uncertainty Sensitivity Analysis Based on the HSIC for High-Dimensional Responses with Aleatory–Epistemic Separation

arXiv:2606.14053v1 Announce Type: cross Abstract: Quantifying the influence of hybrid aleatory and epistemic uncertainties on high-dimensional system responses remains a major challenge in global sensitivity analysis (GSA). Existing Hilbert–Schmidt Independence Criterion (HSIC)-based approaches are primarily restricted to single-output settings and lack a rigorous decomposition of heterogeneous uncertainty sources and their interactions. To address this limitation, a novel double-space tensor-product RKHS framework is proposed for sensitivity analysis under hybrid uncertainty. By constructing factorized kernels over both the latent input space and the multidimensional output space, a concurrent double Möbius inversion is derived to orthogonally decompose the global dependence measure into pure aleatory effects, pure epistemic effects, and their interaction contributions. The resulting dimension-wise sensitivity indices preserve the uncertainty attribution structure across all output dimensions. To satisfy the independence assumptions required by the decomposition, an auxiliary-variable representation based on the inverse probability integral transform is introduced, enabling the treatment of hierarchical uncertainties and Copula-induced correlations within a unified latent space. A fully vectorized single-loop implementation is further developed to avoid the computational burden of nested Monte Carlo simulation. Statistical significance and estimation uncertainty are quantified through permutation testing and Bootstrap confidence intervals. Numerical studies on a modified multi-output Ishigami function and an aerodynamic pressure-field problem demonstrate the accuracy, scalability, and practical applicability of the proposed framework.

07.
arXiv (CS.CV) 2026-06-16

The Third Challenge on Image Denoising at NTIRE 2026: Methods and Results

This paper reports on the NTIRE 2026 Challenge on Image Denoising, specifically focusing on the high-noise regime ($\sigma = 50$). The competition investigates advanced neural architectures designed to restore high-fidelity details from images corrupted by additive white Gaussian noise (AWGN). Unlike constrained benchmarks, this track emphasizes peak quantitative performance, measured by Peak Signal-to-Noise Ratio (PSNR), without limitations on parameter count or computational overhead. By synthesizing contributions from 20 finalist teams out of 116 registrants, this report benchmarks the latest technical innovations and provides a comprehensive snapshot of the current state-of-the-art in unconstrained image restoration.

08.
arXiv (CS.CV) 2026-06-16

Focus When Necessary: Adaptive Routing and Collaborative Grounding for Training-Free Visual Grounding

While Multimodal Large Language Models (MLLMs) excel in cross-modal reasoning, they often struggle to perceive fine-grained details in complex high-resolution images. Recent training-free methods address this through image scaling and localized cropping. However, applying these manipulations indiscriminately introduces computational redundancy for simple queries and can degrade accuracy by truncating essential global context or introducing irrelevant background noise. To this end, we propose LazyMCoT, a dynamic and training-free framework that adaptively allocates visual grounding efforts based on sample difficulty. The framework features an Adaptive Routing mechanism that evaluates predictive uncertainty using first-token statistics from a single forward pass. This efficiently bypasses confident cases while ensuring the recall of difficult samples via conformal calibration. For these challenging cases, a Collaborative Grounding module integrates the inherent cross-modal attention of the model with an external visual expert through a two-stage refinement process. This refinement process generates a precise localized display to recover small or occluded targets. Extensive experiments across diverse benchmarks demonstrate that LazyMCoT rivals training-based approaches by simultaneously improving reasoning accuracy and reducing average inference latency. Our code is availble at https://github.com/TencentBAC/LazyMCoT.

09.
Nature (Science) 2026-06-17

Molecular basis of polyadenylated RNA fate determination in the nucleus

作者:

Eukaryotic genomes generate a plethora of polyadenylated (pA+) RNAs1,2, which are packaged into ribonucleoprotein particles (RNPs). To ensure faithful gene expression, functional pA+ RNPs, including protein-coding RNPs, are exported to the cytoplasm, whereas transcripts within non-functional pA+ RNPs are degraded in the nucleus1–4. How cells distinguish these opposing fates remains unknown. The DExD-box ATPase UAP56 (also known as DDX39B) is a central component of functional pA+ RNPs, and promotes their docking to the nuclear pore complex-anchored TREX-25,6, which triggers transcript release from UAP56 to facilitate export7. Here we reveal that the poly(A) tail exosome targeting (PAXT) connection8 binds a TREX-2-like module, which releases pA+ RNAs from UAP56 for decay by the nuclear exosome. The core of this module consists of a LENG8–PCID2–SEM1 trimer, which we show is structurally and biochemically equivalent to the central GANP–PCID2–SEM1 trimer of TREX-2. Mutagenesis and transcriptomic data demonstrate that the nuclear fate of pA+ RNPs is governed by the contending actions of nucleoplasmic PAXT and nuclear pore complex-associated TREX-2, which interpret RNA-bound UAP56 as a signal for RNA decay or export, respectively. As RNA targets of PAXT are generally short and intron-poor, we propose an overall model for pA+ RNP fate determination whereby the distinct sub-nuclear localizations of PAXT and TREX-2 govern the degradation of short non-functional pA+ RNAs while allowing export of their longer and functional counterparts. Biochemical, structural and cell biological analyses reveal that UAP56 (DDX39B) assembles with a TREX-2–like module that redirects non-functional polyadenylated RNAs from export to degradation.

10.
arXiv (CS.LG) 2026-06-17

Multi-Source Cybersecurity Logs: An ATT&CK-Labeled Dataset and SLM Evaluation

arXiv:2606.18190v1 Announce Type: cross Abstract: Multi-stage cyberattacks span system, network, and browser logs. Detecting them requires correlating events across all three sources. Machine learning methods can learn these cross-source patterns, but they need labeled multi-source data. Existing public datasets fall short. Network-only datasets such as CICIDS and UNSW-NB15 miss host and browser activity. Host-focused datasets such as LMDG and CICAPT-IIoT lack browser telemetry. ATLAS includes all three sources but labels events only as malicious or benign, without MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) technique granularity. No public dataset combines all three sources with per-entry ATT&CK technique labels. We close the gap by building a multi-source log dataset of 870 sessions (70 attack, 800 benign) and approximately 2.3 million events. We captured system, network, and browser activity simultaneously on Windows endpoints. We labeled malicious events with ATT&CK technique IDs, covering 12 tactics and 53 techniques. We generated all attack data using real tools, including Remote Access Trojan (RAT), Command and Control (C2) tunnels, and cloud exfiltration. To demonstrate learnability, we fine-tuned three Small Language Models (SLMs) (Qwen2.5-1.5B, Llama-3.2-3B, Phi-4-Mini) using Low-Rank Adaptation (LoRA). We compared each against its base variant across ten metrics on two tasks: chunk classification and ATT&CK technique identification. Fine-tuning improved every model on every metric. Chunk classification accuracy rose from approximately 8% in the base variants to between 90% and 97% after fine-tuning. Technique identification remained challenging, with the best exact-match accuracy at 42%, although high partial-match scores show the models captured most of the underlying reasoning.

11.
arXiv (CS.CL) 2026-06-12

LLM-based Embeddings: Attention Values Encode Sentence Semantics Better Than Hidden States

Sentence representations are foundational to many Natural Language Processing (NLP) applications. While recent methods leverage Large Language Models (LLMs) to derive sentence representations, most rely on final-layer hidden states, which are optimized for next-token prediction and thus often fail to capture global, sentence-level semantics. This paper introduces a novel perspective, demonstrating that attention value vectors capture sentence semantics more effectively than hidden states. We propose Value Aggregation (VA), a simple method that pools token values across multiple layers and token indices. In a training-free setting, VA outperforms other LLM-based embeddings, even matches or surpasses the ensemble-based MetaEOL. Furthermore, we demonstrate that when paired with suitable prompts, the layer attention outputs can be interpreted as aligned weighted value vectors. Specifically, the attention scores of the last token function as the weights, while the output projection matrix ($W_O$) aligns these weighted value vectors with the common space of the LLM residual stream. This refined method, termed Aligned Weighted VA (AlignedWVA), achieves state-of-the-art performance among training-free LLM-based embeddings, outperforming the high-cost MetaEOL by a substantial margin. Finally, we highlight the potential of obtaining strong LLM embedding models through fine-tuning Value Aggregation.

12.
arXiv (quant-ph) 2026-06-17

Quantum conditional entropies from convex trace functionals

arXiv:2410.21976v4 Announce Type: replace Abstract: We study geometric properties of trace functionals that generalize those in [Zhang, Adv. Math. 365:107053 (2020)], arising from a novel family of conditional entropies with applications in quantum information. Building on new convexity results for these functionals, we establish data-processing inequalities and additivity properties for our entropies, demonstrating their operational significance. We further prove completeness under duality, chain rules, and various monotonicity properties for this family. Our proofs draw on tools from complex interpolation theory, multivariate Araki–Lieb and Lieb–Thirring inequalities, variational characterizations of trace functionals, and spectral pinching techniques.

13.
arXiv (CS.AI) 2026-06-11

DuoBench: A Reproducible Benchmark for Bimanual Manipulation in Simulation and the Real World

arXiv:2606.11901v1 Announce Type: cross Abstract: Bimanual robot systems substantially expand manipulation capabilities, but coordinating two arms introduces additional control complexity and failure modes that are not well captured by existing benchmarks. We introduce DuoBench, an extensible benchmarking framework for bimanual manipulation policies on the FR3 Duo platform. DuoBench comprises eleven tasks spanning four coordination categories, implemented in simulation and partially reproduced in the real world through reproducible task recipes with 3D-printable assets. In addition, we propose a stage-based evaluation scheme that supports fine-grained semantic failure analysis beyond binary success and provide human-teleoperated datasets for all benchmark tasks. We benchmark several dual-arm imitation-learning and vision-language-action policies in simulation and on real hardware. Our results show that current policies remain challenged by bimanual manipulation, particularly in early interaction stages, parallel arm execution, and transfer between simulation and real-world settings. DuoBench provides a reproducible testbed for diagnosing these failure modes and studying future methods for dual-arm policy learning. Code, datasets, and videos are available at https://duobench.github.io/

14.
PLOS Medicine 2026-06-09

Molecular Tumor Boards clinical impact on patient care and structural features: A systematic review and meta-analysis

作者:

by Luigi Russo, Erika Giacobini, Nicolò Lentini, Tommaso Osti, Maud Kamal, Stefania Boccia, Roberta Pastorino Background Molecular Tumor Boards (MTBs) bring together multidisciplinary experts to translate genomic data into clinical decisions in oncology, however, their overall clinical impact remains unclear. The aim of this systematic review is to assess the clinical impact of MTB-recommended therapies on patients with cancer outcomes. Methods and findings In this systematic review and meta-analysis, we searched PubMed, Embase, Scopus, and CENTRAL up to July 2025. We included studies of any design, both single-arm studies and studies with a comparator group, that reported the clinical impact of MTBs in patients who received MTB-guided therapy. Meta-analyses were performed separately by study design, using hazard ratios (HRs) for overall survival (OS) and progression-free survival (PFS), relative risks (RRs) for objective response rate (ORR) and disease control rate (DCR), and pooled proportions for PFS ratio ≥1.3. All meta-analyses were conducted using random-effects models based on the inverse variance method. We evaluated the risk of bias using the RoB 2.0 for RCTs and ROBINS-I for non-randomized studies.From 6,846 records, 78 studies (9,195 patients; 4,569 treated per MTB recommendations) were included. MTB-guided therapies were associated with reduced risk of death (HR 0.87; 95% CI [0.76, 1.01]; p = 0.069; I2 = 0.0% in RCTs; 0.62 in retrospective studies) and disease progression (HR 0.73; 95% CI [0.64, 0.84]; p 

16.
arXiv (CS.LG) 2026-06-17

Meta-classification of one-class classification models using ranking correlation and nearest neighbor

arXiv:2606.17858v1 Announce Type: new Abstract: Machine Learning (ML) techniques have been applied to various problems. However, applying ML to ML models is an unexplored direction. For this purpose, this paper considers a meta-classification of one-class classification (OCC) models, because all ML models could be approximated as OCC models. The proposal represents OCC models as normality rankings and classifies them using nearest-neighbor and ranking-correlation metrics. The experiment classifies OCC models, where classes correspond to training datasets, algorithms, and hyperparameters. The proposal achieves high accuracy when class labels are datasets. Moreover, it can classify algorithms when the training datasets contain the same class. In addition, the discussion highlights that the classification of OCC models is essentially the classification of datasets that treats multiple samples as a single input. The experiment demonstrates the classification of datasets using sleeping records. The proposed method can provide a unified solution for classifying OCC models, datasets, and rankings. Source code is uploaded to the public repository https://github.com/ToshiHayashi/ClassOCC.

17.
arXiv (CS.LG) 2026-06-16

Finite Resources False Discovery Rate Control in Structured Hypothesis Spaces

arXiv:2606.15393v1 Announce Type: cross Abstract: Scientific discovery relies on large-scale hypothesis testing. However, the capacity to identify true discoveries while controlling false discovery faces major challenges: obtaining relevant reference data (the null distribution) is resource-intensive, leaving finite-data uncertainty, and the procedure should account for the inherent structure in the hypothesis space, when such structure exists. Here, we present a framework for controlling the false discovery rate both when each hypothesis is evidenced only by a finite count of null draws, leaving its p-value uncertain, and when the hypothesis space carries arbitrary structure, requiring only that the structure be represented through a suitable reproducing kernel. We present two decision rules that are both robust to structural mis-specification, yet offer a distinct trade-off between exact FDR control and statistical power. The first rule guarantees exact FDR control; the second maximizes power by adapting mirror-statistic control into count space, utilizing an analytical framework to assess FDR control when exact mirror symmetry is relaxed. Furthermore, the tractability gained by the RKHS framework allows us to directly investigate finite-data uncertainties, which we leverage to suggest a policy for the efficient allocation of null distribution samples.

18.
arXiv (quant-ph) 2026-06-12

Optimal classical shadow estimation of unitary channels at Heisenberg limit

arXiv:2606.13638v1 Announce Type: new Abstract: Full tomography of an unknown quantum evolution is resource-intensive and often unnecessary when the goal is only to predict selected properties. This motivates the study of classical shadow estimation of unitary channels (CSEU), a task in which one queries an unknown $d$-dimensional unitary $U$ and stores classical data that can later be used to predict expectation values $\mathrm{tr}[O \cdot U\rho U^\dagger]$ up to additive error $\varepsilon$ for arbitrary input states $\rho$ and observables $O$. We propose a parallel, non-adaptive CSEU protocol using $\mathcal{O}(d\varepsilon^{-1})$ queries when the input states or observables have constant rank. This achieves Heisenberg scaling with respect to $\varepsilon$ and is query-optimal, as we prove a matching $\Omega(d\varepsilon^{-1})$ lower bound that remains valid even with stronger access to the unknown unitary. Our query-optimal CSEU protocol provides a versatile and powerful tool for quantum learning theory, pushing the performance limits of several fundamental learning tasks, including unitary channel tomography, Hamiltonian learning, boundary-regime quantum channel tomography, Pauli transfer matrix learning, inverse-free amplitude estimation, pure-state property estimation, and shallow-circuit learning. Remarkably, we show that optimal unitary channel tomography can be achieved using only parallel queries, closing the gap between the best achievable efficiency of parallel and sequential tomography protocols. Together, these applications establish our framework as a fundamental tool for learning properties of quantum processes, particularly for certain key tasks that require high precision.

19.
arXiv (CS.CL) 2026-06-11

BaltiVoice: A Speech Corpus and Fine-tuned Whisper ASR System for the Balti Language

作者:

We present BaltiVoice, a 16.8-hour read-speech corpus for Balti (ISO 639-3: bft), a Tibetic language spoken in Gilgit-Baltistan, Pakistan, with no prior publicly available ASR resources. The corpus contains 10,060 validated utterances in native Nastaliq script, derived from Mozilla Common Voice recordings. Fine-tuning OpenAI Whisper-small yields a Word Error Rate (WER) of 26.74% and a Character Error Rate (CER) of 8.67% on a 538-utterance speaker-disjoint validation set, down from a zero-shot baseline of 159.19% WER and 152.52% CER. A Whisper-base fine-tuned on the same data achieves 44.54% WER and 15.61% CER, confirming that model capacity matters for this low-resource setting. The dataset, fine-tuned model, and a live transcription demo are publicly available on HuggingFace.

20.
medRxiv (Medicine) 2026-06-18

Consistency of sleep timing and duration are associated with more physical activity and favorable heart rate metrics in a naturalistic cohort

Background: Regularity of sleep patterns over time has increasingly gained traction as an important axis of sleep health. Since sleep habits are under some degree of behavioral control, understanding such patterns in naturalistic settings is particularly important. We quantified sleep variability and tested the hypothesis that regularity correlates with physical activity, resting heart rate (rHR), and heart rate variability (HRV). Methods: We analyzed real-world digital health data from over 81,000 participants (over 18 million nights) who provided informed consent to participate in the Apple Heart and Movement Study and elected to contribute sleep, activity, and heart rate data to the study. Variability was quantified using the standard deviation (SD) computed from total sleep time (TST), sleep start time (S-start), end time (S-end), and midpoint time (MP), as well as the Sleep Regularity Index (SRI). Results: The SD-based variability metrics correlated with one another (R values 0.74-0.92), and with the SRI metric (R values 0.62-0.64). More consistent sleep, by any metric, was associated with more activity and better rHR and HRV. The most consistent tertile for TST variability had higher median TST (6.9 vs 5.9 hours), more daily exercise (32.8 vs 20.4 minutes), lower rHR (62.4 vs 65.6 beats per minute), and higher HRV (40.6 vs 37.3), all p

21.
arXiv (math.PR) 2026-06-16

Well-posedness of stochastic parabolic equations with gradient nonlinearities and applications to phase-field models

作者:

arXiv:2606.15425v1 Announce Type: new Abstract: We study well-posedness of stochastic parabolic equations with gradient nonlinearities. Our analysis is based on recent maximal-regularity frameworks for nonlinear stochastic parabolic equations in critical spaces. We extend the existing results by controlling drift and noise coefficient separately. This way we can allow for less regular driving noise in case of subcritical dispersion coefficients. Our approach, based on gluings of local solutions, moreover implies new continuation criteria. We then apply our existence result and the continuation criteria to show global well-posedness of phase-field models of moving boundary problems.

22.
arXiv (CS.CL) 2026-06-16

Mapping Geopolitical Bias in 11 Large Language Models: A Bilingual, Dual-Framing Analysis of U.S.-China Tensions

Large language models are how hundreds of millions of people now encounter contested political questions, raising a subtle measurement problem: a model that simply agrees with whatever it is told can masquerade as biased, contaminating any claim that models hold political opinions. We address this by importing balanced keying from survey psychometrics, posing each proposition and its swapped reverse and signing the response so acquiescence cancels and genuine conviction accumulates. The result is a reproducible, quantitative instrument that maps geopolitical stance across 11 models and 2 languages (19,712 responses). Developer origin, query language and issue domain emerge as three near-equal, additive factors; every model, including those built in the United States, leans more Pro-China in Mandarin; and two models with identical agreement bias are told apart, one neutral, one biased. We release it as an open, interactive tool that extends to any contested-opinion domain.

23.
arXiv (math.PR) 2026-06-17

Limit theorems for random Dirichlet series with summation over primes, with an application to Rademacher random multiplicative functions

arXiv:2508.15032v2 Announce Type: replace Abstract: It is shown that two conjectures put forward in the recent article Iksanov and Kostohryz (2025) are true. Namely, we prove a functional central limit theorem (FCLT) and a law of the iterated logarithm (LIL) for a random Dirichlet series $\sum_p \frac{\eta_p}{p^{1/2+s}}$ as $s\to 0+$, where $\eta_1$, $\eta_2,\ldots$ are independent identically distributed random variables with zero mean and finite variance, and $\sum_p$ denotes the summation over the prime numbers. As a consequence, an FCLT and an LIL are obtained for $\log \sum_{n\geq 1} \frac{f(n)}{n^{1/2+s}}$ as $s\to 0+$, where $f$ is a Rademacher random multiplicative function.

24.
arXiv (quant-ph) 2026-06-16

Quantum Algorithm for Open-System Battery Cathodes by Modeling Multiple Strongly Coupled Holstein Polarons with Chain-Mapped Caldeira-Leggett Dynamics

arXiv:2606.16017v1 Announce Type: new Abstract: Cathode lithiation occupies a chemical regime of tightly localized orbitals, narrow bandwidths, and strong electron-lattice coupling. The defining electrochemical observables (open-circuit voltage and differential capacity) are open-system, reservoir-equilibration quantities that closed-Hamiltonian quantum simulation cannot produce, set by exchange with electron, Li$^+$, and phonon baths. We present a fault-tolerant quantum algorithm that recovers them through a unitary chain-mapped Caldeira-Leggett embedding, rendering the baths Trotterizable. The resulting fourth-order Trotter step has a T-gate count polynomial in system size, validating its open-system dynamics against hierarchical equations of motion (HEOM) at strong coupling and the Lindblad limit at weak coupling. For single-carrier olivine LiFePO$_4$, a single voltage anchor on an otherwise DFT-fixed Hamiltonian places the differential-capacity peak within the $\pm5$ mV reproducibility of the experimental plateau. For multi-carrier spinel LiMn$_2$O$_4$, whose $1{:}1$ Mn$^{3+}$/Mn$^{4+}$ filling makes the inter-site Coulomb repulsion dynamically active, the same kernel yields a two-plateau voltage curve with a $125$ mV split, within $17\%$ of the observed $150$ mV. We deliver an end-to-end fault-tolerant resource estimate for such a multi-carrier, three-reservoir observable: $368$ logical qubits and $\sim3\times10^5$ T-gates per step, or $\sim1.7\times10^{12}$ T-gates for a full voltage curve (parallelizable over $\sim10^3$ trajectories), leaving the production-scale dynamical run as a milestone for future hardware. The same kernel reproduces macroscopic quantum coherence, two-band superconductivity, and the Mikheyev-Smirnov-Wolfenstein resonance without modification, placing dynamical battery chemistry and similar Hamiltonians within scope for fault-tolerant quantum simulation.

25.
arXiv (CS.AI) 2026-06-12

Fault Lines: Navigating Ethics and Responsible AI Where National Policy Meets Local Practice in Public Sector Transformation

arXiv:2606.13039v1 Announce Type: cross Abstract: The UK government has adopted a pro-AI stance to help transform public service delivery in the face of severe financial pressures, but the path to translate this vision into responsible AI practice remains ill-defined. While UK policy is often set at the national level, local authorities are responsible for most public service delivery, and the rapid advance of AI-first narratives in the public sector is exposing fault lines in knowledge and practice at this national-local interface. This paper examines how responsible AI is interpreted and implemented at the interface between the UK's central government and local authorities, taking the high-stakes area of Special Educational Needs and Disabilities (SEND) as a case study. We present a thematic analysis of 17 semi-structured interviews with policymakers, practitioners, and third-sector professionals to identify barriers and enabling conditions for responsible AI where national policy meets local practice. We identify five interconnected challenges facing local authorities: shadow usage of AI and data privacy risks, market-government asymmetry in AI provision, insufficient workforce readiness, a lack of standardised definitions and measurements, and gaps in human accountability. For each, participants proposed actionable steps, from strengthening data protection frameworks and rebalancing the market-government relationship to enhancing workforce capacity. Our examination of SEND brings these challenges into sharper focus, showing how high-stakes decisions affecting vulnerable children and families intensify tensions around accountability, fairness, and human oversight, exposing the limits of a principle-based regulatory approach. We argue that responsible public sector AI requires both national policy adjustments and structural reforms to institutional capacity, values, and governance mechanisms at the local level.