Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.CL) 2026-06-16

Replay What Matters: Off-Policy Replay for Efficient LLM Reinforcement Unlearning

LLM unlearning has emerged as a cost-effective alternative to full retraining for removing hazardous knowledge from pretrained models while preserving general utility. Recent RL-based methods such as RULE reformulate unlearning as learning a refusal behavior, but their on-policy optimization repeatedly samples from the same forget and retain/boundary prompts throughout training. We identify a critical inefficiency in this process: easy cases quickly converge and provide little useful gradient signal, while hard cases near the forget/retain boundary continue to produce low-reward rollouts that are discarded after a single use. To address this issue, we propose ReRULE, an off-policy replay enhancement for reinforcement unlearning. ReRULE stores low-reward hard-case rollout groups in a replay buffer during early GRPO training and reuses them in later stages through importance-sampled off-policy updates, redirecting computation toward boundary cases that still require learning. Theoretically, we show that ReRULE yields a tighter hard-case convergence bound than pure on-policy RULE. Empirically, ReRULE improves MUSE-Books Retain Quality from 46.3 to 56.2 while adding only 5–11% training time across benchmarks. Its limited improvement on the simpler TOFU setting further supports the intended conditional behavior: replay is most beneficial when the hard/easy disparity is pronounced.

02.
arXiv (CS.AI) 2026-06-12

A Study of Belief Revision Postulates in Multi-Agent Systems (Extended Version)

arXiv:2605.02249v2 Announce Type: replace Abstract: We investigate the belief revision problem in epistemic planning, i.e., what will be the beliefs of all agents in a multi-agent system after an agent gains the belief in some state property. Based on the standard representation in epistemic planning of agents' beliefs via a single multi-agent Kripke model, we generalize the classical AGM belief revision postulates to the multi-agent setting, with the aim to provide a formal framework for evaluating dynamic epistemic reasoning frameworks in which the beliefs of all agents as the result of actions are computed. As an example of a simple operator that satisfies all of the generalized AGM postulates, we present generalized full-meet multi-agent belief revision. We moreover define a generalization of the standard postulates for iterated revision, present a more sophisticated, event model based revision operator, and discuss the potential issues in defining an epistemic operator on Kripke models that can satisfy all of the generalized postulates for iterated multi-agent belief revision.

03.
arXiv (CS.AI) 2026-06-16

Proximal Policy Optimization for Amortized Discrete Sampling

arXiv:2606.15793v1 Announce Type: cross Abstract: This paper explores policy gradient algorithms for training stochastic policies to sample from structured discrete probability distributions under the Generative Flow Network (GFlowNet) framework. Building on extensive theoretical connections between GFlowNets and entropy-regularized reinforcement learning, we derive equivalents of standard policy gradient algorithms for training GFlowNets, as well as experimentally explore their various methodological aspects, including baseline training and advantage estimation. Most importantly, our work is the first to derive and successfully apply proximal policy optimization to GFlowNets, showing its improved convergence speed and data efficiency compared to standard GFlowNet training objectives on benchmarks ranging from synthetic energies to molecular graph generation.

04.
arXiv (CS.CL) 2026-06-16

Equity with Efficiency: An Empirical Study of Tokenizers for Multilingual Large Language Models

Multilingual large language models (LLMs) depend on subword tokenization to bridge discrete text and continuous neural representation. State-of-the-art multilingual LLMs often use Byte-level Byte-Pair Encoding (BPE) tokenizers that structurally favor high-resource languages and Latin scripts. For speakers of underrepresented languages, particularly those across Southeast Asia, this bias inflates inference costs and widens cross-lingual capability gaps. We present the first systematic comparison of equitable tokenizers on a unified benchmark spanning 11 Southeast Asian languages. Beyond tokenizer-level analysis of compression efficiency and cross-lingual equity, we assess downstream task performance through controlled 1.5B-parameter language model training using the same training data. Our results show that Parity-aware BPE lies on the Pareto frontier of the efficiency-equity trade-off, achieving strong compression parity at competitive cost. Morphology-Driven Byte Encoding delivers the best semantic reasoning performance through morphologically richer representations, albeit at a higher computational expense. Byte Latent Transformer underperforms on downstream tasks, possibly because its architectural assumptions misalign with the constraints of limited low-resource training data. Together, our findings demonstrate that cross-lingual fairness and tokenization efficiency are not fundamentally at odds, and offer practical guidance for designing equitable multilingual models.

05.
arXiv (CS.AI) 2026-06-11

SirenFNO: Efficient and Full Frequency Learning of Fourier Neural Operators

arXiv:2606.11518v1 Announce Type: cross Abstract: Fourier neural operators (FNOs) are effective and efficient surrogates for approximating solutions of PDEs and generalize across discretizations. However, owing to the reliance on frequency truncation to maintain learning efficiency of FNOs, empirical studies suggest that FNOs exhibit spectral bias toward low-frequency information, which may hinder the learning capability especially for certain PDEs with strong high-frequency oscillations. To address this limitation, we propose SirenFNO, a novel framework that leverages sinusoidal representation networks (SIRENs) to learn implicit neural representations and performs mode-wise kernel parameterization. Our SIREN parameterization learns a full-grid spectrum with a constant and discretization-independent parameter count, thereby eliminating the need for frequency truncation. We further extend SirenFNO with functional tensor decompositions to enhance parameter and learning efficiency. Empirical results show that our SirenFNO consistently outperforms FNO with approximately $4$ to $15$ times parameter reductions with preserved discretization invariance, and our functional decomposition variants obtain performance improvements with a maximum of $73$ times fewer parameters across multiple PDE benchmarks.

06.
arXiv (CS.LG) 2026-06-11

Triangular-Reference Schrödinger Bridges for Time Series Generation

arXiv:2605.27478v3 Announce Type: replace-cross Abstract: Schrödinger bridges for time series (SBTS) generate synthetic paths by projecting, in relative entropy, a Brownian reference onto the path laws that match the joint distribution of the data on the observation grid. The Brownian reference, however, fixes the quadratic variation of the generated paths, which is restrictive when stochastic volatility, correlated noise, or rank-deficient covariance structures must be reproduced. We introduce "Triangular-Reference Schrödinger Bridges for Time Series" (TR-SBTS), which keeps the entropy-projection backbone of SBTS but replaces the Brownian reference by a triangular, volatility-informed, intervalwise frozen reference on a state augmented with latent covariance descriptors. The construction remains a single entropy projection on the augmented state: the minimiser is the \(h\)-transform of the reference, and on each frozen interval the optimal drift has the logarithmic-gradient form \(b^\star(t,x)=A\,\nabla\log H(t,x)\), intrinsic to the active covariance directions when the frozen covariance \(A\) is degenerate. We prove stability of the frozen approximation and consistency of the associated regularised kernel estimators, describe a reference-aware Nadaraya–Watson implementation of the conditional next-increment law, and evaluate the construction on numerical experiments.

07.
arXiv (CS.CV) 2026-06-17

Theoretical Grounding of Out-Of-Distribution Detection With Reinforcement Learning Optimizer

Out-of-distribution (OOD) detection in dynamic open-world environments requires a model to continually adapt to evolving data distributions while generalizing to covariate-shifted inputs and rejecting semantic-shifted OOD examples. Most existing OOD detection methods optimize only the current-step objective and do not explicitly account for how post-deployment environment changes affect future OOD behavior. In this paper, we establish a theoretical grounding for dynamic OOD detection using a reinforcement learning (RL)-guided optimizer that explicitly favors updates that reduce the semantic OOD false positive rate over time. We develop a novel augmented optimizer that uses an RL-guided correction term on top of standard gradient descent (GD) and show its improvement over both future-domain generalization and semantic-OOD rejection. We analyze temporal error decomposition in terms of model-change and environment-change generalization errors and develop a new theoretical framework for comparing the generalization errors under both GD and RL-guided optimizers.

08.
arXiv (CS.LG) 2026-06-17

Sign-Rank, Index, and List Replicability: Connections and Separations

arXiv:2606.18236v1 Announce Type: new Abstract: In learning theory, the sign rank of a binary concept class captures the smallest dimension in which it can be represented by points and halfspaces. Despite tremendous interest, lower bounds on sign rank are notoriously difficult to come by. Two recent approaches to the problem establish lower bounds on sign rank by measures that are easier to analyze: the $\mathbb{Z}_2$-index and the list replicability number. We order these measures, showing that the $\mathbb{Z}_2$-index is upper-bounded by a linear function of the list replicability number. As a main consequence, we obtain a strong separation between sign rank and $\mathbb{Z}_2$-index, thereby resolving a question of Frick, Hosseini, and Vasileuski. This motivates a thorough study of list replicability, the stronger of the two lower-bounding measures. We establish upper bounds on the list replicability number by two combinatorial measures: height and minimum star number. We also prove a fundamental composition result, showing that the product of two concept classes has list replicability number bounded by the sum of the list replicability numbers of the two classes.

09.
arXiv (CS.LG) 2026-06-19

Closing the Social-Semantic Gap: SPSD for Edge-Based Prompt Compression in Cloud LLM Inference

arXiv:2606.19364v1 Announce Type: new Abstract: The prefill stage of Large Language Model (LLM) inference is a growing contributor to cloud-scale energy cost. Many consumer-support and conversational prompts contain social scaffolding: politeness markers, apologetic preamble, repetition, and rapport-building language that is important for human communication but carries low marginal information for machine reasoning. We call this discrepancy the Social-Semantic Gap. We present SPSD (Sentiment Preserving Semantic Distillation), an edge-based pipeline that compresses user prompts using a 4-bit quantised Small Language Model before transmission to a cloud-deployed LLM. Evaluation on a 248-prompt corpus using Gemma-2-2B-Instruct (Q4_K_M) as the SLM and Llama-3.1-8B-Instruct as the cloud evaluation model yields a mean input token saving of 99.9 tokens per distilled call, with all 146 distilled calls yielding positive savings. Response quality, assessed by blind LLM-as-judge scoring across 121 pairs, is non-inferior to the raw path within a pre-specified 1-point margin on a 15-point rubric; the judge awarded 43 percent ties, 28 percent distilled wins, and 29 percent raw wins. Cosine similarity is mixed: mean 0.682, median 0.712, with 54.1 percent of pairs above the 0.70 reference threshold. Safety-critical domains are conservatively routed to passthrough via rule-based gates. Per-call net energy saving is estimated at 70-270 uWh under stated assumptions. SPSD shows that on-device prompt distillation can reduce cloud LLM input-token cost while preserving response quality within a practical non-inferiority margin.

10.
arXiv (CS.AI) 2026-06-17

Moving Out: Physically-grounded Human-AI Collaboration

arXiv:2507.18623v4 Announce Type: replace-cross Abstract: The ability to adapt to physical actions and constraints in an environment is crucial for embodied agents (e.g., robots) to effectively collaborate with humans. Such physically grounded human-AI collaboration must account for the increased complexity of the continuous state-action space and constrained dynamics caused by physical constraints. However, most existing collaboration benchmarks are discrete or do not consider physical attributes and constraints. To address this, we introduce Moving Out, a human-AI collaboration benchmark that resembles a wide range of collaboration modes affected by physical attributes and constraints, such as moving heavy items together and coordinating actions to move an item around a corner. Moving Out consists of two challenges and human-human interaction data to comprehensively evaluate models' abilities to adapt to diverse human behaviors and unseen physical attributes. To give embodied agents the capability to collaborate with humans under physical attributes and constraints, we propose a novel method, BASS (Behavior Augmentation, Simulation, and Selection), to enhance the diversity of agents and their understanding of the outcome of actions. We systematically compare BASS and state-of-the-art models in AI-AI and human-AI experiments, showing that BASS can effectively collaborate with both unseen AI and humans. The project page is available at https://live-robotics-uva.github.io/movingout_ai/.

11.
arXiv (CS.CL) 2026-06-19

StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs

Multimodal large language models (MLLMs) are increasingly deployed in personally and societally consequential settings, yet the visual cues that shape how these models judge people remain poorly understood. Prior work often compares different (groups of) individuals, making it difficult to separate appearance effects from identity differences. We introduce StylisticBias, a controlled benchmark for evaluating attribute-level social bias in MLLMs. We generate 500 photorealistic base faces and create about 50 single-attribute variations per face, producing about 25K images. This design keeps identity fixed and changes one visual attribute at a time. It lets us measure how specific cues shift model judgments. We evaluate six MLLMs across 25 binary social judgment scenarios. We find that age and body type dominate identity-level effects, while fashion style and other visual cues drive the largest attribute-level shifts. We further find that about 15 attributes account for nearly 80\% of the total variation, showing that bias is concentrated in a small set of visual cues. Sensitivity is strongest in judgments that are semantically aligned with appearance, especially socioeconomic and style-related judgments. We release StylisticBias as a benchmark for fine-grained bias evaluation in multimodal models. Code and dataset: https://github.com/timo-cavelius/StylisticBias and https://hf.co/datasets/shaghayegh/stylistic-bias-dataset.

12.
medRxiv (Medicine) 2026-06-18

Hospital-Level Variation in Antenatal Corticosteroids for Late Preterm Births

Objective: To determine whether and to what extent hospitals across the United States vary in their use of late-preterm steroids using a novel data set in which the timing of steroid administration relative to delivery can be observed. Methods: This was a retrospective cohort study of singleton births with known gestational ages identified in the Premier Healthcare Database from 2015 to 2022. The primary variable of interest was hospital-level adoption of antenatal corticosteroids for late-preterm singleton deliveries, calculated as the proportion of late-preterm singleton births (34-36 completed weeks of gestation) with any betamethasone exposure during the same late-preterm period. Hospital adoption was defined as the weighted average rate of ALPS administration among late-preterm infants across the entire post-period. Hospitals were ranked by their late-preterm steroid adoption rates and categorized by quartile based on the empirical distribution. Temporal trends were assessed using annual hospital-level adoption rates and visualized using time-series plots and distributional plots. A logistic regression model was constructed to determine hospital characteristics associated with being a highest-quartile adopting hospital. Results: The analysis cohort included 728 hospitals and 5,452,791 births, of which 361,006 (6.6%) were singleton late preterm births. Hospital steroid exposure rates ranged from 0 to 82% and were categorized into quartiles based on overall exposure rate, with cutoffs at 20.6%, 29.8%, and 40.1%. Median exposure rates increased progressively across quartiles from 14.1% (IQR 9.3-17.4%) in the lowest adopting hospitals (Q1) to 47.6% (IQR 43.7-53.2%) in the highest adopting hospitals (Q4), with substantial within-quartile variation. In the multivariable model, urban location was a strong predictor of high adoption after adjustment (aOR 2.05; 95% CI 1.11-3.83, p=0.02). Compared to Midwest hospitals, Southern hospitals had significantly lower odds of being high adopters (aOR 0.37; 95% CI 0.20-0.69, p

13.
arXiv (quant-ph) 2026-06-19

Arrival times of an atomic Bose-Einstein condensate

arXiv:2606.20281v1 Announce Type: cross Abstract: The times of flight of an atomic Bose-Einstein condensate are theoretically investigated in the experimentally unexplored regime corresponding to detection close to the trap of the condensate. In this regime, there is no consensus on how to calculate the distribution of times of arrival onto the detector. For non-interacting particles, distinct theoretical predictions have been made in the past. This work analyses how these predictions are modified for an interacting Bose-Einstein condensate. For this purpose, a time-dependent Gross-Pitaevskii equation is solved analytically and numerically.

14.
bioRxiv (Bioinfo) 2026-06-19

Perturbation Curve models continuous transcriptional response trajectories and improves prediction of genetic modulations

Single-cell CRISPR screens, Perturb-seq, have revolutionized functional genomics by revealing biological causality. However, although perturbation assignments are typically represented as discrete labels, the cell-level effective strength of perturbations is often continuous and diverse. Current analytical frameworks struggle to decouple the variability in perturbation strength from the diversity of downstream responses. Here, we present Perturbation Curve (PertCurve), a nonlinear, curve-based computational framework that models the trajectories of transcriptomic responses by explicitly incorporating diverse perturbation magnitudes and strengths. By ordering cells by perturbation strength, we demonstrate that PertCurve accurately recapitulates the response magnitudes and reveals the distinct modularity and asynchrony patterns of downstream gene behaviors. These patterns are categorized into archetypes, including proportional, sensitive, and threshold responses. By applying this framework across CRISPRi/a modalities, we identify universal response patterns in viral infection, apoptosis, and proliferation genes, and reveal previously overlooked context-specific regulatory features in cell differentiation. Finally, incorporating PertCurve into perturbation prediction models and evaluation metrics enhances predictive performance, delivering actionable insights for refining established models.

16.
arXiv (CS.CV) 2026-06-24

ActiveScope: Actively Seeking and Correcting Perception for MLLMs

Multimodal Large Language Models (MLLMs) have demonstrated impressive vision-language understanding, yet still struggle with fine-grained perception in high-resolution images. While existing training-free methods typically rely on attention-based localization or coarse-to-fine search, they are often misled by distractors and fail to locate multiple targets. Our investigation attributes these failures to Contextual Dominance, where salient distractors overwhelm target attention and cause inaccurate localization, and Semantic Bias, where global semantics cause the model to fixate on the most salient concept, resulting in incomplete localization in multi-object scenarios. Built on these insights, we propose ActiveScope, a training-free framework that enhances MLLMs by actively seeking and correcting perception. ActiveScope features two modules. The Semantic Anchor Localization (SAL) utilizes fine-grained semantic anchors to independently localize key targets, thereby mitigating semantic bias. The Interference-Suppressed Refinement (ISR) refines localization by suppressing attention on salient distractions to overcome contextual dominance. Extensive experiments on high-resolution image understanding benchmarks demonstrate that ActiveScope outperforms existing training-free methods (e.g., 96.34 percent accuracy on $V^{*}$ Bench), validating the superiority of the active search and self-correction paradigm. Our code is available at https://github.com/jasmine-ww/ActiveScope.

17.
arXiv (CS.AI) 2026-06-19

Global Ease of Living Index: a machine learning framework for longitudinal analysis of major economies

arXiv:2502.06866v3 Announce Type: replace-cross Abstract: The drastic changes in the global economy, geopolitical conditions, and disruptions such as the COVID-19 pandemic have impacted the cost of living and quality of life. It is essential to comprehend the long-term implications of the cost of living and quality of life in major economies. A transparent and comprehensive living index must include multiple dimensions of living conditions. In this study, we present an approach to quantifying the quality of life through the Global Ease of Living Index that combines various socio-economic and infrastructural factors into a single composite score. Our index utilises economic indicators that define living standards, which could help in targeted interventions to improve specific areas. We present a machine learning framework to address missing data for certain economic indicators in specific countries. We then curate and update the data and use a dimensionality reduction approach (Principal Component Analysis and Factor Analysis) to create the Ease of Living Index for major economies since 1970. Our work significantly adds to the literature by offering a practical tool for policymakers to identify areas needing improvement, such as healthcare systems, employment opportunities, and public safety. Our approach with open data and code can be easily reproduced and applied to various contexts, providing transparency and accessibility for ongoing research and policy development in quality-of-life assessment.

18.
medRxiv (Medicine) 2026-06-16

Development and reliability and validity test of the Questionnaire on Knowledge, Attitude and Practice of ICU Nurses on Blood Oxygen Saturation Management in Mechanically Ventilated Patients

Objective: A questionnaire on the knowledge, attitude and practice of ICU nurses regarding the management of blood oxygen saturation in patients with mechanical ventilation was compiled, and its reliability and validity were tested. Method: Drawing upon the knowledge-attitude-practice theory, the initial questionnaire draft was developed through literature review and consultation with Delphi experts. Employing convenience sampling, 32 nurses from the General ICU of Wuxi Second People's Hospital were surveyed between 1 August 2025 and 27 September 2025, enabling item screening and assessment of reliability and validity.The full version of the developed questionnaire is provided as Supporting Information (S1 File). All items are published under a CC BY 4.0 license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Result: A questionnaire on the knowledge, attitude and practice of ICU nurses regarding the management of blood oxygen saturation in mechanically ventilated patients was finalised, comprising 26 items: 11 in the knowledge dimension, 6 in the attitude dimension and 9 in the behaviour dimension. The overall Cronbach's coefficient for the questionnaire was 0.88, with dimension-specific coefficients of 0.787, 0.722, and 0.781 respectively. The Spearman-Brown coefficient for the entire questionnaire was 0.967, while dimension-specific coefficients were 0.796, 0.666, and 0.728 respectively. The content validity index at the questionnaire level (S-CVI) was 0.886, and the item-level content validity index (I-CVI) ranged from 0.913 to 0.967. 0.728. The questionnaire's level content validity index (S-CVI) was 0.886, and the item level content validity index (I-CVI) ranged from 0.913 to 1.00. Conclusion: The questionnaire on knowledge, attitude and practice of blood oxygen saturation management in mechanically ventilated patients demonstrates good reliability and validity. It may serve as an assessment tool for intensive care unit nurses regarding their knowledge, attitude, and practices concerning blood oxygen saturation management in mechanically ventilated patients, thereby establishing a foundation for developing targeted intervention strategies in future practice.

19.
medRxiv (Medicine) 2026-06-15

Epileptogenicity alters intrahippocampal ripple propagation

Objective: Tracing the propagation of high-frequency oscillations (HFOs) aids in localizing epileptogenic regions and improving surgical outcomes. We examined how hippocampal epileptogenicity influences the propagation properties of the HFOs it generates. Methods: We analyzed non-REM sleep stereo-EEG from 49 patients (68 hemispheres) with verified hippocampal contacts. Hippocampi were stratified by excitability: 28 seizure onset zone (SOZ), 22 more-irritative non-SOZ (>6 interictal epileptiform discharges [IED]/min), and 18 less-irritative non-SOZ (

20.
medRxiv (Medicine) 2026-06-15

Non-Parametric Ancestry Adjustment for Polygenic Scores

Modern polygenic risk scores (PRS) exhibit shifts correlated with ancestry, leading to erroneous predictions for non-European individuals when models are trained on predominantly European cohorts. Such shifts arise from, among other factors, (1) algorithmic limitations in the ability of PRS model training to detect causal variants, rather than nearby variants with ancestry-dependent correlations to the causal one, (2) under-representation of alleles with higher prevalence in non-European populations in the association study training, and (3) gene-by-environment interactions where the environment is correlated with genetic ancestry. Current ancestry-adjustment methodologies often discretize individuals into population categories and apply a simple affine mapping to reduce these genetic ancestry biases. However, such approaches provide suboptimal adjustments, particularly for admixed individuals. In this work, we introduce a detailed theoretical characterization of ancestry-dependent biases and propose novel methods based on non-parametric neighborhood techniques that provide more accurate empirical results and admit statistical consistency guarantees. Extensive experiments using the UK Biobank demonstrate the effectiveness of the proposed methods.

21.
arXiv (CS.LG) 2026-06-15

Testing For Distribution Shifts with Conditional Conformal Test Martingales

arXiv:2602.13848v2 Announce Type: replace Abstract: We propose a sequential test for detecting arbitrary distribution shifts that allows conformal test martingales (CTMs) to work under a fixed, reference-conditional setting. Existing CTM detectors construct test martingales by continually growing a reference set with each incoming sample, using it to assess how atypical the new sample is relative to past observations. While this design yields anytime-valid type-I error control, it suffers from test-time contamination: after a change, post-shift observations enter the reference set and dilute the evidence for distribution shift, increasing detection delay and reducing power. In contrast, our method avoids contamination by design by comparing each new sample to a fixed null reference dataset. Our main technical contribution is a robust martingale construction that remains valid conditional on the null reference data, achieved by explicitly accounting for the estimation error in the reference distribution induced by the finite reference set. This yields anytime-valid type-I error control together with guarantees of asymptotic power one and bounded expected detection delay. Empirically, our method detects shifts faster than standard CTMs, providing a powerful and reliable distribution-shift detector.

22.
arXiv (math.PR) 2026-06-24

Sub-Poisson distributions: Concentration inequalities, optimal variance proxies, and closure properties

arXiv:2508.12103v2 Announce Type: replace Abstract: We introduce a nonasymptotic framework for sub-Poisson distributions with moment generating function dominated by that of a Poisson distribution. At its core is a new notion of optimal sub-Poisson variance proxy, analogous to the variance parameter in the sub-Gaussian setting. This framework allows us to derive a Bennett-type concentration inequality without boundedness assumptions and to show that the sub-Poisson property is closed under key operations including independent sums and convex combinations, but not under all linear operations such as scalar multiplication. We derive bounds relating the sub-Poisson variance proxy to sub-Gaussian and sub-exponential Orlicz norms. Taken together, these results unify the treatment of Bernoulli and Poisson random variables and their signed versions in their natural tail regime.

23.
arXiv (CS.AI) 2026-06-24

ZONOS2 Technical Report

arXiv:2606.24320v1 Announce Type: cross Abstract: We present ZONOS2 8B, our latest TTS model, which achieves state-of-the-art naturalness, prosody, and voice cloning fidelity. We improve upon Zonos-v0.1 across scale, data, and training recipe. We scale the model from 1.6B to 8B total parameters (900M active) with a novel mixture-of-experts (MoE) backbone, improving inference latency and throughput. We expand our training corpus from 200K to over 6M hours using a new data processing pipeline, and we simplify our post-training and conditioning recipes to improve naturalness and voice cloning fidelity. We evaluate ZONOS2 8B on quality, speaker similarity, WER, and ZTTS1-Eval, our novel TTS benchmark, where it performs competitively with state-of-the-art systems while maintaining good streaming latency. We release our model weights and example inference code under an Apache 2.0 license on GitHub and Hugging Face.

24.
medRxiv (Medicine) 2026-06-12

Cancer care disruption during the COVID-19 pandemic in Ontario, Canada: A sequential mixed-methods study

Introduction The COVID-19 pandemic profoundly disrupted healthcare delivery worldwide, with cancer care among the most affected services. Prior studies documented delays in referrals, reduced specialist access, and increased provider burden. However, the extent to which these experiences were reflected at the system level remains unclear. Objective To document cancer care experiences and examine whether these experiences were reflected in population-level health system indicators across Ontario, Canada. Methods We used an exploratory sequential mixed-methods design. Qualitative data were collected through focus groups and semi-structured interviews with 32 participants, including patients with cancer (n=8), caregivers (n=5), healthcare providers (n=14), and decision-makers (n=5) across two hospital settings in Ontario, Canada. Emergent themes informed the development of quantitative indicators. We then conducted a retrospective population-based analysis of linked administrative health databases for cancer patients in Ontario (n=87,786) to assess the prevalence of identified themes. Results Four themes emerged: (I) delays in diagnosis and screening; (II) disrupted access to primary care; (III) barriers to specialist and mental health services; and (IV) fragmented care for patients with multimorbidity. Quantitative findings corroborated major themes. Screening rates declined for cervical (64.8% to 57.5%) and breast cancer (64.5% to 57.2%). While in-person primary care shifted almost entirely to virtual modalities (8.5% to 95.4%), overall visit volumes remained stable. Specialist care showed uneven patterns, with increased oncology visits but declines in cardiology and mental health services. Patients with multiple comorbidities experienced the largest reductions in non-oncology specialist care. Conclusion The pandemic disrupted key components of cancer care, particularly screening, access to certain specialist services, and care for patients with complex needs. Integrating qualitative and quantitative evidence highlights areas of system vulnerability and underscores the need for coordinated, resilient cancer care capable of maintaining essential services during future crises.

25.
arXiv (CS.AI) 2026-06-16

When in Doubt, Plan It Out: Committed Small Language Model Deliberation for Reactive Reinforcement Learning

arXiv:2606.16995v1 Announce Type: new Abstract: Reinforcement Learning (RL) policies often degrade in unfamiliar environments because they lack explicit deliberation. We propose Plan, Align, Commit, Think (PACT), a hybrid architecture that combines a fast, reactive RL policy with a slow, deliberative Small Language Model (SLM) planner. PACT invokes the SLM asynchronously to generate and validate candidate action plans. Once a plan is verified through simulation as safe, feasible, and complete, it is executed directly, bypassing the RL policy without retraining or modifying it. Evaluated on three FrozenLake configurations of increasing difficulty, PACT outperforms all baselines while relying on a 2B-parameter SLM backbone, suggesting that deliberative planning and reactive execution are more powerful in concert than either is alone in these settings.