Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.AI) 2026-06-18

How Well Do Large Language Models Capture Human Personality?

arXiv:2606.18263v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used to simulate human populations via persona prompting, often under the assumptions that richer persona descriptions improve behavioral fidelity, similarly sized attribute combinations are equally simulatable, and persona definitions generalize across tasks. In this work, we formalize these assumptions and systematically evaluate them across multiple architectures, scales, and simulation settings. We identify a fundamental limitation we term persona manifold collapse, where increasingly expressive persona specifications lead to systematic contraction of representational and behavioral diversity. Across models, increasing persona complexity consistently reduces inter-persona separation in latent space and weakens behavioral differentiation in downstream simulation tasks. These effects persist across multiple analyses as richer personas fail to preserve human subgroup disagreement, performance varies across attribute combinations of similar size, and adding descriptive detail often degrades rather than improves simulation fidelity. Surprisingly, simple Age-Gender personas consistently outperform richly specified Ideal Customer Profiles (ICPs) across industries, achieving substantially higher downstream prediction accuracy. We find that collapse is not uniform across attributes. Certain combinations remain behaviorally stable and preserve stronger alignment with human responses, forming localized regions we term alignment bridges. Together, our results provide empirical and conceptual foundations for understanding the limits of persona-conditioned simulation, highlighting the need for representation-aware persona construction rather than increasing persona expressivity alone.

02.
arXiv (CS.LG) 2026-06-16

Remember, Don't Re-read: Stateful ReAct Agents for Token-Efficient Autonomous Experimentation

arXiv:2606.14945v1 Announce Type: new Abstract: The autoresearch pattern enables autonomous experimentation by having a large language model (LLM) iteratively modify code to optimize a target metric. Its stateless design, however, reconstructs experimental context from scratch at every iteration, incurring $O(n)$ token cost per iteration and $O(n^{2})$ total. This work reformulates the pattern as a stateful ReAct agent using LangGraph, where typed persistent state carries experimental history across iterations via a tool-calling interface. Two benchmarks are evaluated: hyperparameter tuning (15 iterations, small per-iteration observations) and code performance optimization (40 iterations, large per-iteration observations containing full source code and benchmark results). On hyperparameter tuning, the stateful agent consumes 90\% fewer tokens (2{,}492 vs.\ 24{,}465). On code optimization, the stateful agent consumes 52\% fewer tokens (627K vs.\ 1{,}275K) while achieving comparable optimization quality on both tasks. The token reduction is structural: the stateless agent re-reads the full history at $O(n)$ cost per iteration, while the stateful agent operates within a fixed-size conversation window at $O(1)$ cost. This paper describes the architecture in sufficient detail for practitioners to implement a stateful autoresearch agent for their own workflows.

03.
medRxiv (Medicine) 2026-06-18

Cardiac rhythm development: A wearable device index of risk for physical and mental illness in adolescence

Objective. The autonomic nervous system, which regulates cardiac rhythm, undergoes pronounced maturation across adolescence. How cardiac rhythm develops over this period, however, and whether individual differences in its development forecast mental and physical illness, remain open questions. We used three waves of Fitbit data from the Adolescent Brain Cognitive Development (ABCD) Study to characterize the developmental trajectory of the cardiac rhythm and to test whether variation in that trajectory predicts onset of psychopathology and cardiometabolic disease. Methods. 8,301 adolescents contributed 242,811 valid Fitbit wear days across Waves 2 (Mage=12), 4 (Mage=14), and 6 (Mage=16). Cosinor mixed-effects models yielded three rhythm parameters per session: mesor (24-hour mean), amplitude (diurnal swing), and acrophase (peak timing). We first characterized age- and sex-specific trajectories, cross-wave stability, and factors shaping the rhythm. We then used parallel-process latent growth models to test whether within-person changes in rhythm tracked symptom trajectories, and hierarchical logistic models to test whether rhythm parameters predicted the first clinical onset of psychopathology and of obesity and hypertension. Results. The cardiac rhythm changed substantially across adolescence: mesor decreased, amplitude flattened, and acrophase shifted later. Within-person change in the rhythm tracked change in blood pressure, BMI, and trajectories of depression and ADHD symptoms. Higher mesor predicted incident onset of all five outcomes controlling for demographics, baseline symptoms, and behavior (ORs 1.36-1.54); amplitude, acrophase, and rhythm instability conferred additional risk. Conclusions. The 24-hour cardiac rhythm is a passively measurable substrate of adolescent autonomic development that indexes transdiagnostic risk for psychiatric and cardiometabolic illness.

04.
arXiv (quant-ph) 2026-06-16

Quantum Nonlocal Games on Graph Ensembles

arXiv:2606.16784v1 Announce Type: new Abstract: Quantum entanglement is one of the most striking discoveries in all of science. This effect allows, for instance, two spatially separated agents to coordinate their actions, without communication, to an extent that is both counter-intuitive, and provably impossible by any other physical means. A recently discovered example is that of mobile agents (players) performing spatial coordination tasks such as rendezvous, where the agents aim to meet on a network without communication. Until now, demonstrations of this advantage have relied on highly idealized conditions: agents are assumed to have complete knowledge of the topography, and experiments have been restricted to simulations using data generated by qubits within a single quantum processor. Here we address both limitations by developing a theory for graph ensembles that capture topographical uncertainty and by experimentally demonstrating the advantage in rendezvous scenarios between physically separated ion-trap systems with access to remote entanglement. Moreover, we simulate a broader set of problems on superconducting hardware. Surprisingly, when players are given the ability to gather more local information the quantum advantage increases – a feat impossible by classical means. Our findings establish a concrete route toward practical quantum advantages in motion coordination problems. More broadly, they point to a new way of using portable quantum devices to enhance collective decision-making in uncertain environments.

05.
arXiv (CS.LG) 2026-06-11

SwiftCTS: Fast Cross-Design Prediction and Pareto Optimization of Clock Tree Metrics via Few-Shot Calibration

arXiv:2606.11348v1 Announce Type: new Abstract: Clock Tree Synthesis (CTS) is a computationally expensive stage in the physical design flow, requiring iterative EDA tool invocations to navigate a vast configuration space for optimal power, wirelength, and timing skew. Existing machine learning approaches require computationally expensive retraining or fine-tuning cycles to adapt to unseen macro architectures and are architecturally mismatched to the millions of evaluations demanded by exhaustive combinatorial search. We present SwiftCTS, a physics-informed surrogate framework that addresses both limitations simultaneously. By coupling lightweight, physics-grounded statistical features with gradient-boosted ensembles, SwiftCTS trains in under five seconds on a CPU and delivers sub-millisecond inference without GPU support. To handle out-of-distribution (OOD) designs without retraining or fine-tuning, we introduce a K-shot multiplicative calibration mechanism that anchors predictions to just one or two physical reference runs, reducing power prediction error from 24.5\% to 3.3\% and wirelength error from 56.6\% to under 1\% on unseen macros. Integrating this engine with an evolutionary optimizer, SwiftCTS evaluates 100,000 CTS configurations in under ten seconds, yielding Pareto-optimal frontiers that are physically validated within the OpenROAD flow. Closed-loop validation confirms prediction errors below 0.5\% for power and wirelength, and timing skew predictions within five picoseconds on an OOD benchmark, consistently outperforming default tool heuristics across all target metrics. Code publicly available at: \href{https://anonymous.4open.science/r/SwiftCTS-7E6E}{https://github.com/BarsatKhadka/SwiftCTS}

07.
bioRxiv (Bioinfo) 2026-06-15

Biological meaning in protein embedding space is resolution-dependent

Protein language model embeddings are increasingly used to organise biological sequences, yet how biological meaning is encoded within embedding neighbourhoods remains poorly understood. Using two independent hierarchical enzyme systems, carbohydrate-active enzymes and peptidases, we investigated how biological interpretation changes across embedding organisations aligned to different levels of biological hierarchy. Different embedding organisations give rise to distinct neighbourhood semantics. When aligned to membership-boundary resolution, embeddings robustly separated artefacts and unrelated proteins from members of the target category. However, embeddings aligned to functional-grouping resolution maintained compositional neighbourhood structure for multi-domain proteins spanning more than one functional or catalytic group. Finally, embeddings aligned to local-family resolution recovered compact family-like neighbourhoods, including families withheld from training, while weakening broader membership-boundary and functional-grouping relationships. Moreover, embeddings optimised toward the same level of biological organisation retain different biological relationships depending on optimisation trajectory employed. Together, our results show that proximity in protein embedding space has no fixed biological interpretation. Instead, biological meaning emerges across embedding resolutions through selective preservation of different forms of biological organisation.

08.
arXiv (CS.CL) 2026-06-16

Context Compression Is Not One Thing: Readable Symbolic Re-expression vs. Coherent Summary at Matched Budget

We study context compression for multi-hop question answering with small language models. We propose Telegraph English, a readable symbolic format that rewrites retrieved passages into structured entity-relation statements, preserving reasoning evidence at lower token cost. In controlled experiments on MuSiQue, TwoWiki, and HotpotQA, Telegraph English outperforms three matched-budget compression baselines (character-level deletion, truncation, and random sub-sampling) on every dataset, with gains of 13 to 20 F1 percentage point. It also outperforms a coherent prose summary produced by the same encoder on the hardest dataset. A pre-registered depth-interaction hypothesis is null: the advantage does not grow with reasoning depth within datasets. We interpret these results as evidence that readable symbolic re-expression preserves entity content more densely than either natural language or coherent summarization at matched token budget.

10.
arXiv (CS.AI) 2026-06-18

Detecting High-Potential SMEs with Heterogeneous Graph Neural Networks

arXiv:2602.19591v3 Announce Type: replace-cross Abstract: Small and Medium Enterprises (SMEs) constitute 99.9% of U.S. businesses and generate 44% of economic activity, yet systematically identifying high-potential SMEs remains an open challenge. We introduce SME-HGT, a Heterogeneous Graph Transformer framework that predicts which SBIR Phase I awardees will advance to Phase II funding using exclusively public data. We construct a heterogeneous graph with 32,268 company nodes, 124 research topic nodes, and 13 government agency nodes connected by approximately 99,000 edges across three semantic relation types. SME-HGT achieves an AUPRC of 0.621 0.003 on a temporally-split test set, outperforming an MLP baseline (0.590 0.002) and R-GCN (0.608 0.013) across five random seeds. At a screening depth of 100 companies, SME-HGT attains 89.6% precision with a 2.14 lift over random selection. Our temporal evaluation protocol prevents information leakage, and our reliance on public data ensures reproducibility. These results demonstrate that relational structure among firms, research topics, and funding agencies provides meaningful signal for SME potential assessment, with implications for policymakers and early-stage investors.

11.
arXiv (CS.LG) 2026-06-15

Direct Fisher Score Estimation for Likelihood Maximization

arXiv:2506.06542v2 Announce Type: replace-cross Abstract: We study the problem of likelihood maximization when the likelihood function is intractable but model simulations are readily available. We propose a sequential, gradient-based optimization method that directly models the Fisher score based on a local score matching technique which uses simulations from a localized region around each parameter iterate. By employing a linear parameterization to the surrogate score model, our technique admits a closed-form, least-squares solution. This approach yields a fast, flexible, and efficient approximation to the Fisher score, effectively smoothing the likelihood objective and mitigating the challenges posed by complex likelihood landscapes. We provide theoretical guarantees for our score estimator, including bounds on the bias introduced by the smoothing. Empirical results on a range of synthetic and real-world problems demonstrate the superior performance of our method compared to existing benchmarks.

12.
arXiv (math.PR) 2026-06-17

Convergence rate of Euler–Maruyama scheme to the invariant probability measure under total variation distance for the SDEs

arXiv:2505.04218v3 Announce Type: replace Abstract: This article shows the geometric decay rate of Euler-Maruyama scheme for one-dimensional stochastic differential equation towards its invariant probability measure under total variation distance. Firstly, the existence and uniqueness of invariant probability measure and the uniform geometric ergodicity of the chain are studied through introduction of non-atomic Markov chains. Secondly, the equivalent conditions for uniform geometric ergodicity of the chain are discovered, by constructing a split Markov chain based on the original Euler-Maruyama scheme.

13.
arXiv (CS.CV) 2026-06-19

Vortex: Multi-Modal Fusion System for Intelligent Video Retrieval

This paper presents Vortex, the multimodal video retrieval system developed by our team, FocusOnFun, for the Ho Chi Minh City AI Challenge 2025, designed to advance intelligent multimedia search and temporal reasoning. The system integrates adaptive keyframe extraction, multimodal metadata generation from vision-language and speech models, and a hybrid retrieval strategy that fuses CLIP and SigLIP2 embeddings through Reciprocal Rank Fusion to balance global and fine-grained semantics. To enhance interactivity, Vortex incorporates Rocchio-based relevance feedback and a multi-stage temporal search mechanism for sequential event alignment. Built on Milvus and Elasticsearch, the architecture enables scalable indexing and efficient retrieval. Evaluated in the official competition, our FocusOnFun team's system achieved a score of 79.6/88 (90.5\%) in the Preliminary Round and was further evaluated in the Final Round, achieving an `Excellent' overall performance with `Outstanding' results in the question-answering (QA) task. This demonstrating the complementary strengths of CLIP and SigLIP2 and confirming the effectiveness of the hybrid retrieval approach. The system establishes a robust foundation for future research in intelligent, context-aware, and interactive video retrieval.

14.
arXiv (CS.CL) 2026-06-11

UR-BERT: Scaling Text Encoders for Massively Multilingual TTS Through Universal Romanization and Speech Token Prediction

We propose UR-BERT, a Romanized transcription-based text-to-speech (TTS) encoder for massively multilingual TTS systems. Conventional grapheme-to-phoneme (G2P)-based approaches are limited to around 100 languages due to the availability of reliable G2P resources. In contrast, UR-BERT scales to 495 languages by unifying diverse writing systems into a shared Romanization representation. To further enhance phonetic fidelity and text-speech alignment, we introduce a speech token prediction objective during training, which encourages the encoder to learn speech-aware phonetic representations in a data-efficient manner. Experiments show that TTS systems built on UR-BERT consistently outperform recent text encoder baselines across a wide range of languages and resource conditions, and demonstrate strong generalization to unseen languages.

15.
arXiv (CS.CL) 2026-06-19

ShoppingBench: A Real-World Intent-Grounded Shopping Benchmark for LLM-based Agents

Existing benchmarks in e-commerce primarily focus on basic user intents, such as finding or purchasing products. However, real-world users often pursue more complex goals, such as applying vouchers, managing budgets, and finding multi-products seller. To bridge this gap, we propose ShoppingBench, a novel end-to-end shopping benchmark designed to encompass increasingly challenging levels of grounded intent. Specifically, we propose a scalable framework to simulate user instructions based on various intents derived from sampled real-world products. To facilitate consistent and reliable evaluations, we provide a large-scale shopping sandbox that serves as an interactive simulated environment, incorporating over 2.5 million real-world products. Experimental results demonstrate that even state-of-the-art language agents (such as GPT-4.1) achieve absolute success rates under 50% on our benchmark tasks, highlighting the significant challenges posed by our ShoppingBench. In addition, we propose a trajectory distillation strategy and leverage supervised fine-tuning, along with reinforcement learning on synthetic trajectories, to distill the capabilities of a large language agent into a smaller one. As a result, our trained agent achieves competitive performance compared to GPT-4.1.

16.
arXiv (CS.AI) 2026-06-19

Variable-Length Tokenization via Learnable Global Merging for Diffusion Transformers

arXiv:2606.20076v1 Announce Type: cross Abstract: Latent Diffusion Models (LDMs) have become dominant in visual synthesis, but their quality-compute trade-off is largely constrained by the tokenizer's fixed compression ratio. Variable-length tokenizers (VLTs) promise adaptive compression by varying token counts, allowing diffusion models to flexibly balance quality and compute. However, conventional VLTs modulate length by truncating ordered token sequences, which makes token semantics depend on token position and breaks representational alignment across lengths. This leads to a cross-length shift in the latent distribution that hinders a single variable-length diffusion model from operating effectively. To address this, we propose a novel variable-length tokenizer that modulates length by merging tokens. We show that encouraging similar tokens to merge enables direct cross-length representation alignment when the diffusion transformer operates according to the merging pattern. Since conventional merging methods are data-dependent, making the merging pattern inaccessible during generation, we introduce learnable global merging, which is data-independent, to ensure compatibility with diffusion transformers. On ImageNet 256$\times$256 generation, our merging-based variable-length tokenizer integrated with a diffusion transformer achieves a superior gFID-compute trade-off compared to prior VLT methods. Code is available at [this https URL](https://github.com/movinghoon/lgm)

17.
arXiv (math.PR) 2026-06-11

Hierarchical Random Measures without Tables

arXiv:2505.02653v2 Announce Type: replace-cross Abstract: The hierarchical Dirichlet process is the cornerstone of Bayesian nonparametric multilevel models. Its generative model can be described through a set of latent variables, commonly referred to as tables within the popular restaurant franchise metaphor. The latent tables simplify the expression of the posterior and allow for the implementation of Gibbs sampling algorithms to approximately draw posterior samples. However, managing their assignments can become computationally expensive, especially as the size of the dataset and the number of levels increase. In this work, we identify a prior for the concentration parameter of the hierarchical Dirichlet process that (i) induces a quasi-conjugate posterior distribution, and (ii) removes the need for tables, leading to more interpretable expressions for the posterior, with both a scalable and an exact algorithm to sample from it. Remarkably, this construction extends beyond the Dirichlet process, leading to a new framework for defining normalized hierarchical random measures and a new class of algorithms to sample from their posteriors. The key analytical tool is the independence of multivariate increments, that is, their representation as completely random vectors.

18.
arXiv (CS.CV) 2026-06-12

Acquisition state behaves as a structured, measurable variable governing lung-nodule AI: kernel-driven measurement instability and noise-driven detection fragility, invisible to DICOM metadata

AI governance for medical imaging is formalizing: the 2026 ACR-SIIM Practice Parameter recommends local acceptance testing and ongoing drift monitoring, and the ACR Assess-AI registry monitors AI outputs using DICOM metadata for context. We argue that a necessary, currently unmonitored layer sits beneath output metrics: whether incoming studies remain within the acquisition envelope a model was validated on. Using a LUNA16-trained MONAI RetinaNet lung-nodule detector, we test whether acquisition state behaves as a structured, measurable variable. On real paired CT differing only in reconstruction kernel (NLST B30f vs B80f), kernel alone shifted AI-measured diameter and flipped a Fleischner size category in 5.2% (8 of 155) of nodules at fixed patient and acquisition, while detection confidence was unchanged (Wilcoxon p=0.22). Under controlled LIDC-IDRI perturbations the effects dissociated by axis: the noise axis degraded detection confidence (p=5.9e-32, concentrated in nodules under 6 mm) but not measurement, while the frequency/kernel axis corrupted measurement (p=8.6e-13) but not detection. A 4-feature pixel fingerprint recovered reconstruction identity (patient-level AUC about 0.95 on real CT, 0.995 on a QIBA phantom) where the ConvolutionKernel DICOM tag was uninformative (identical labels across reconstructions). The kernel axis transported across four manufacturers (leave-one-vendor-out AUC 0.94-0.98, matching the within-vendor ceiling). Acquisition state thus maps to distinct AI failure modes, frequency content to measurement reliability and noise to detection sensitivity, and is not recoverable from metadata. Acquisition-aware, input-side validation is the missing layer for the acceptance-testing and drift-monitoring requirements now entering imaging-AI accreditation.

19.
arXiv (CS.LG) 2026-06-11

Bypassing Prompt Guards in Production with Controlled-Release Prompting

arXiv:2510.01529v4 Announce Type: replace Abstract: Ball et al. recently established that prompt filtering for AI alignment faces a fundamental barrier: under standard cryptographic assumptions, no filter running significantly faster than the protected model can universally distinguish adversarial prompts from benign ones. We investigate whether this impossibility result translates to real-world vulnerabilities in deployed large language model (LLM) systems. We answer affirmatively by introducing controlled-release prompting, a practical instantiation of the theoretical framework that exploits the resource asymmetry between lightweight input filters and the main models they protect. Unlike the theoretical construction, our attack does not require model modification: it generates malicious prompts that are indecipherable by any bounded filter yet remain tractable to the target LLM. We find our attack to be successful on four major chat platforms (Google Gemini, DeepSeek Chat, xAI Grok, and Mistral Le Chat) where baseline methods fail. Additionally, we apply our attack to extract copyrighted data from Gemini. Finally, we provide a systematic evaluation of 14 open-weight prompt guard models, revealing that even reasoning-capable filters cannot reliably detect our attack without incurring prohibitive resource overhead.

20.
arXiv (CS.CL) 2026-06-11

A Geometric Profile of Semantic Information in Text: Frame-Conditional Uniqueness and a Trade-Off Triangle for Scalar Summaries

How much meaning does a text carry? Shannon's theory measures uncertainty over symbols and is intentionally indifferent to meaning, while pairwise metrics such as BERTScore compare two texts rather than characterizing one. We develop a geometric framework that measures semantic content from the structure of a text's sentence embeddings. The framework has three parts. First, within a fixed embedding and baseline, six natural axioms uniquely determine a scalar measure up to scale, a frame-conditional uniqueness theorem. The resulting scalar is empirically too coarse, motivating a richer representation. Second, we propose a three-coordinate semantic profile capturing novelty (displacement from generic discourse), breadth (diversity of distinct ideas), and integration (connectedness among them), together with a discrete minimal unit (the semantic quantum) whose resolution is fixed by a clustering threshold $\tau$. Third, we prove a no-go theorem: no scalar summary of the profile can simultaneously satisfy analytic stability under paraphrase and concatenation, ordinal robustness across text scales, and cross-representation comparability. We exhibit two practical scalars, $S_{\mathrm{minmax}}$ and $S_{\mathrm{rank}}$, each occupying a distinct corner of this trade-off triangle. Validation across 23 synthetic categories, 5 Project Gutenberg novels, and 3 embedding models confirms the trade-off. The recommended rank-normalized configuration passes 25 of 28 ordinal checks as point estimates (21 of 28 after Benjamini-Hochberg correction), outperforming seven baselines including unigram entropy and a BERTScore-based novelty signal. A separate variational result connects the breadth coordinate to the log-determinant of a determinantal point process (Spearman $\rho = 0.985$ over 507 Gutenberg chapters), giving an optimization-theoretic foundation for breadth.

21.
arXiv (CS.LG) 2026-06-16

Spectral Adaptive Conformal Prediction for Structured Non-Exchangeable Data

arXiv:2606.15950v1 Announce Type: cross Abstract: Conformal prediction gives prediction intervals with finite-sample coverage when the data are exchangeable. Many time-indexed datasets are not exchangeable. They have seasons, recurring regimes, changing frequencies, or other forms of structured dependence. This paper studies a simple way to use that structure. We propose spectral adaptive conformal prediction, a method that forms weighted conformal quantiles using local spectral similarity and then updates the target miscoverage level online. The spectral weights choose calibration residuals that look relevant to the current test point. The adaptive update corrects the long-run miss rate when uncertainty changes over time. We give an approximate coverage result for the fixed spectral weighted quantile and a deterministic long-run calibration result for the adaptive update. Simulations with recurring regimes and slowly changing frequencies, together with three U.S. real-data examples, show that the hybrid method can improve on fixed spectral weighting, while also showing that spectral weighting must be monitored through effective sample size diagnostics.

22.
arXiv (CS.CL) 2026-06-12

PiDA: Phonetically-Informed Data Augmentation for Robust Vietnamese Speech Translation

Cascaded speech translation (ST) systems suffer from error propagation when Automatic Speech Recognition (ASR) outputs incorrect transcripts. We present the first systematic categorization of ASR errors for Vietnamese ST, classifying substitution errors by phonetic cause and quantifying their impact on downstream Neural Machine Translation (NMT) performance using Linear Mixed-Effects Modelling. We confirm that most ASR substitution errors arise from phonetic confusions rather than random noise, and that these phonetic errors significantly degrade ST quality. Motivated by this finding, we propose Phonetically-Informed Data Augmentation (PiDA), which generates ASR-like corruptions by substituting words with phonetically similar alternatives using phonetic word embeddings. Fine-tuning on a PiDA-augmented version of FLEURS Vietnamese-English improves translation of erroneous ASR outputs (up to +2.04 BLEU over standard fine-tuning) while also slightly improving clean-text performance.

23.
arXiv (quant-ph) 2026-06-15

Landscape-Similarity-Guided Optimization in Divide-and-Conquer QAOA

arXiv:2602.21689v3 Announce Type: replace Abstract: Divide-and-conquer strategies mitigate hardware constraints for the Quantum Approximate Optimization Algorithm (QAOA) on Noisy Intermediate-Scale Quantum (NISQ) devices by partitioning large interaction graphs into smaller, hardware-compatible sub-problems. However, this approach introduces a severe classical training bottleneck: a decomposition across $m$ boundary nodes generates $2^m$ distinct sub-problems that typically require independent optimization. In this work, we demonstrate that across diverse synthetic and real-world interaction graphs, the variational landscapes of these reduced QAOA instances actually exhibit a robust universality. Adapting the replica-overlap framework of spin-glass physics, we define a landscape-overlap order parameter $q$ to quantify geometric correlations between energy landscapes, revealing a sharp landscape-similarity transition as graph connectivity is tuned. Exploiting this, we introduce Doubly Optimized QAOA (DO-QAOA), an adaptive pipeline that collapses the sub-problems from $2^m$ distinct sub-problems into $K=\mathcal{O}(1)$ effective landscape classes. By performing optimization on a single representative sub-problem and dynamically transferring parameters to remaining sub-problems, DO-QAOA lowers runtime and quantum measurement overhead by orders of magnitude while maintaining a competitive Approximation Ratio Gap (ARG).

24.
medRxiv (Medicine) 2026-06-16

A Poisson Process Life Expectancy framework for optimising patient lifetime during chemotherapy

Cancer therapy balances between two competing objectives - treatment efficacy against the tumour and the risk of treatment related severe adverse events, including patient death. Most existing optimal control theory (OCT) formulations rely on optimising heuristic cost functionals that lack direct clinical interpretability. In clinical practice treatment efficacy and patient tolerability are primarily assessed through survival metrics and adverse event rates. Here we introduce the Continuous Lifetime Payoff (CLP), a novel OCT objective functional that directly links treatment decisions to patient survival. It explicitly incorporates tumour dynamics, tumour eradication, and patient mortality from tumour progression, drug-related toxicity and age. We fit age-related mortality from life tables and infer parameters from simulated survival data. The CLP provides a clinically grounded framework for optimising chemotherapy regimens.

25.
arXiv (CS.AI) 2026-06-12

Select and Improve: Understanding the Mechanics of Post-Training for Reasoning

arXiv:2606.13125v1 Announce Type: cross Abstract: Reinforcement learning has rapidly emerged as a key component in the training of reasoning and coding models, yet it remains poorly understood from a mechanistic perspective. We study how and through what underlying processes capabilities are acquired or enhanced via reinforcement learning post-training. Our analysis, based on controlled math reasoning experiments with Qwen-2.5-1.5B, reveals two core mechanisms: strategy selection and strategy improvement. Our results highlight the role of SFT data and reinforcement learning data in activating these mechanisms, in particular showing how supervising the model on diverse reasoning strategies can enable strategy selection and how increasing difficulty in reinforcement learning data can enable strategy improvement. Taken together, our results provide mechanistic insight into RL training and suggest practical interventions to continue scaling reasoning capabilities.