Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.LG) 2026-06-19

Topological Data Analysis for High-Dimensional Dynamic Process Monitoring

arXiv:2606.20443v1 Announce Type: cross Abstract: Real-time process monitoring requires methods that extract actionable information from high-dimensional time-series data. In this work, we present a new approach for process monitoring that combines tools of topological data analysis (TDA) and machine learning. In the proposed approach, we represent multivariate time-series data as manifolds and use topological descriptors to summarize the structure of such data; we then use a neural ordinary differential equation to learn the dynamic evolution of the topological structure of the system. Using real data from an industrial process, we show that this trajectory-based event detection approach is effective at detecting diverse types of events. We contrast this approach against reconstruction-based approaches such as principal component analysis and autoencoders and against a trajectory-based approach that uses Koopman autoencoders.

02.
arXiv (quant-ph) 2026-06-11

Consistent Evaluation of Operators Involving the Position Operator in the Bloch Representation: Application to the Orbital Moment

arXiv:2606.11679v1 Announce Type: cross Abstract: The position operator plays a central role in condensed-matter observables such as velocity, orbital moment, and electric polarization. In solid-state physics, the evaluation of operators incorporating the position operator has not reached a consensus, as observed in the operator-level discrepancy between the local circulation of Wannier functions and the self-rotation of wave packets. Here, to achieve a consistent evaluation of such operators, we propose three rules for evaluating operators involving the position operator in the Bloch representation. The rules are devised to satisfy physical conditions: independence from the choice of unit cell, preservation of Hermitian conjugacy for the product of operators, and recovery of the correct intraband velocity. We further address the gauge dependence of the position operator and introduce a scheme termed gauge filtration, which systematically removes gauge-dependent contributions from the operators containing the position operator. This methodology ensures that the quantities obtained from the operator evaluation correspond to observable physical phenomena. By applying our framework, we reconcile the results concerning the self-rotation of the wave packet and the local circulation of the Wannier function. We expect our proposal to establish a consistent framework for evaluating operators involving the position operator.

03.
arXiv (CS.CV) 2026-06-15

Pix2Pix-Hybrid: Structure-Guided Conditional Synthesis of Hajj Crowd Images with Multi-Channel Conditioning and Weak Attribute Supervision

Developing accurate crowd-counting models for Hajj pilgrimage scenes remains challenging because domain-specific annotated images are scarce and data collection during large gatherings raises privacy concerns. To address these limitations, this paper proposes Pix2Pix-Hybrid (P2P-H), a hybrid conditional GAN for structure-guided Hajj crowd-image synthesis and data augmentation. P2P-H builds on Pix2Pix and employs a U-Net generator conditioned on eight input channels that jointly encode structural cues (edges and grayscale) and contextual attributes (crowd density and time of day). To capture detailed textures in dense scenes, the framework integrates two multi-scale PatchGAN discriminators operating at different resolutions. The training procedure combines adversarial, perceptual, and feature-matching objectives with adaptive data augmentation and stabilization strategies. The model was trained on 993 real Hajj frames collected from 60 publicly available video sources, with conditioning attributes derived automatically to reduce manual labeling effort. Using this framework, we constructed CrowdH, a synthetic dataset of 10,000 high-resolution Hajj crowd images. Experimental results show that P2P-H improves structure-preserving conditional synthesis quality compared with Pix2Pix and StyleGAN2-ADA baselines and shows favorable transfer to other crowd datasets. To assess downstream utility, we further constructed CrowdH-Mix-469, an annotated mixed real-synthetic dataset comprising 384 real Hajj images and 85 selected synthetic images, and evaluated five crowd-counting models under real-only and real-plus-synthetic training. The selected synthetic data reduced MAE across all five models, with the strongest gain observed for CSRNet.

04.
arXiv (CS.CV) 2026-06-16

RLPR: Radar-to-LiDAR Place Recognition via Two-Stage Asymmetric Cross-Modal Alignment for Autonomous Driving

All-weather autonomy is critical for autonomous driving, which necessitates reliable localization across diverse scenarios. While LiDAR place recognition is widely deployed for this task, its performance degrades in adverse weather. Conversely, radar-based methods, though weather-resilient, are hindered by the general unavailability of radar maps. To bridge this gap, radar-to-LiDAR place recognition, which localizes radar scans within existing LiDAR maps, has garnered increasing interest. However, extracting discriminative and generalizable features shared between modalities remains challenging, compounded by the scarcity of large-scale paired training data and the signal heterogeneity across radar types. In this work, we propose RLPR, a robust radar-to-LiDAR place recognition framework compatible with single-chip, scanning, and 4D radars. We first design a dual-stream network to extract structural features that abstract away from sensor-specific signal properties (e.g., Doppler or RCS). Subsequently, motivated by our task-specific asymmetry observation between radar and LiDAR, we introduce a two-stage asymmetric cross-modal alignment (TACMA) strategy, which leverages the pre-trained radar branch as a discriminative anchor to guide the alignment process. Experiments on four datasets demonstrate that RLPR achieves state-of-the-art recognition accuracy with strong zero-shot generalization capabilities.

05.
arXiv (quant-ph) 2026-06-17

Induced Resource Theories and Harvesting via Quantum Probes

arXiv:2606.17287v1 Announce Type: new Abstract: We consider scenarios in which a quantum system with a well-defined resource theory is used as a probe to interact with an environment, such as a quantum field, for which a resource-theoretic description is absent or incomplete. We clarify if and how the harvesting of a resource in the probe can tell us about the state of the environment. This is particularly ambiguous when the probe-environment interaction is not a free operation, or the concept of such free operations cannot be defined altogether. We propose a framework and precise conditions under which it becomes possible to interpret resource generation on the probe as evidence of resources in the environment, thereby introducing an effective notion of resources for the latter. Our results clarify in which sense resources can be said to be harvested from the environment and provide a systematic way to analyse such processes beyond fully controlled resource-theoretic settings. More generally, this work may provide a step towards a more general understanding of the interplay of different quantum resources.

06.
arXiv (CS.AI) 2026-06-15

TRACE: Trajectory-Routed Causal Memory for Delayed-Evidence Visuomotor Imitation

arXiv:2606.14551v1 Announce Type: cross Abstract: Robots under autonomous operation may require decisions based on evidence that is no longer visible. We study delayed-evidence tasks, where an early cue disappears before a later decision point, so visually similar observations can require different actions. In these settings, the current observation is not a sufficient state for control. We introduce TRAjectory-routed Causal Evidence (TRACE), a memory framework for visuomotor imitation policies. TRACE stores task-relevant visual and robot-state evidence, such as object identity, target choice, or route-dependent state, in a fixed-size latent memory that remains bounded over long episodes. Instead of indexing memory by raw time or manually provided task labels, TRACE uses path signatures: compact, order-sensitive features of the executed robot-state trajectory. These signatures do not store the visual cue itself; rather, they provide trajectory-conditioned keys for writing and retrieving the evidence stored when the cue was visible. When the robot later reaches an ambiguous observation, the policy conditions on TRACE memory to recover the missing context and choose the correct branch. TRACE attaches through lightweight adapters to policies, without changing the policy backbone, action head, or imitation objective. Across real-world long-horizon manipulation tasks with visually ambiguous branch points, TRACE improves branch selection and task success over alternative baselines, including short-history and recurrent memory. Project page: https://jeong-zju.github.io/trace
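
To make the path-signature idea concrete, the following minimal sketch computes a depth-2 signature of a piecewise-linear trajectory in pure Python. It is an illustration only, not the TRACE implementation: the function name `signature_depth2`, the truncation depth, and the 2D example paths are assumptions. The two example paths share start and end points but traverse the axes in opposite order, so their first-level terms match while their second-level terms differ, which is the order sensitivity the abstract refers to.

```python
def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path given as a list of points.

    Returns (s1, s2): s1 is the total increment, s2 the matrix of
    second-order iterated integrals.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(path, path[1:]):
        dx = [b - a for a, b in zip(p, q)]
        # Chen's identity for appending one straight segment:
        # S2 <- S2 + S1 (outer) dx + 0.5 * dx (outer) dx, using S1 before its update.
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            s1[i] += dx[i]
    return s1, s2


# Hypothetical robot-state trajectories: same endpoints, different order of motion.
path_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]  # move along x, then y
path_b = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # move along y, then x

s1_a, s2_a = signature_depth2(path_a)
s1_b, s2_b = signature_depth2(path_b)
```

Here `s1_a` and `s1_b` are equal, while the off-diagonal entries of `s2_a` and `s2_b` are swapped, so a key built from the second-level terms can tell the two routes apart even when the final robot state is identical.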

07.
arXiv (quant-ph) 2026-06-12

Multi-entropy in heavy local quenches

arXiv:2606.12526v1 Announce Type: cross Abstract: We study the time evolution of tripartite entanglement in heavy local quenches in two-dimensional holographic conformal field theories. Our diagnostic is the genuine multi-entropy of adjacent intervals, computed from both bulk and boundary perspectives. A perturbative bulk analysis shows that the first-order small-mass perturbation around the vacuum geodesic network cancels identically at any time after the quench. In the fully back-reacted geometry, a vacuum-subtracted genuine multi-entropy arises from a mismatch between the winding selected by the trivalent geodesic network and the windings selected independently by the pairwise geodesics. In the sharp quench limit, the time dependence of genuine multi-entropy is kinematically fixed to logarithms of rational functions of time and is independent of the heavy operator dimension. The CFT calculation reproduces the same formula within the heavy-light vacuum block approximation, where the branch choice in the heavy-background uniformization map corresponds to the winding selection in the bulk. These results indicate that, in this setup, the genuine multi-entropy is controlled by global saddle selection, rather than by a local energy response or quasiparticle propagation.

08.
arXiv (quant-ph) 2026-06-17

Cavity-enhanced superconducting response in an underdoped cuprate

arXiv:2606.18084v1 Announce Type: cross Abstract: Superconductors carry electrical current without resistance when paired electrons condense into a coherent macroscopic quantum state. In underdoped cuprates, evidence suggests that pairing-related correlations and superconducting fluctuations can survive above the temperature at which global coherence is lost, pointing to phase fluctuations as a key limitation on superconductivity in this regime. Motivated by recent demonstrations of cavity-modified collective states in quantum materials, we investigate whether superconducting coherence can be stabilized by engineering the electromagnetic environment of the superconductor. We study an underdoped YBa$_2$Cu$_3$O$_{7-\delta}$ thin film in a tunable terahertz cavity formed with a semi-transparent gold mirror. From temperature-dependent terahertz transmission measurements, we find that the cavity enhances the superconducting response below the critical temperature, with an increase of the inferred superfluid weight. The effect becomes more pronounced at smaller cavity lengths and is accompanied by an upward shift of the superconducting onset temperature. Calculations based on a cavity-coupled model for phase-fluctuating superconductors capture these trends and support an interpretation in terms of cavity-enhanced phase stiffness. These results showcase the potential of cavity engineering for designing emergent functionalities in correlated systems.

09.
arXiv (CS.CL) 2026-06-16

Why Tree-Style Branching Matters for Thought Advantage Estimation in GRPO

Group Relative Policy Optimization (GRPO) trains Chain-of-Thought reasoning with verifiable rewards, but estimating thought-level advantages without value functions often suffers from high variance. Although tree-style branching is used in practice to reduce variance, it lacks a theoretical explanation of why it works and whether it is important or potentially necessary. We study thought-level advantage estimation in GRPO from a variance perspective under a minimal tree-style setting where multiple continuations are sampled for each thought. Using the multivariate delta method, we reveal a sampling-dimension asymmetry. Increasing sampled thoughts ($K$) leaves a strictly positive estimation-variance floor, whereas increasing continuations per thought ($M$) drives the leading-order estimation variance to zero at rate $1/M$. This implies that, within the fixed-temperature GRPO-style estimator without value models studied here, accurate thought-level advantage estimation cannot be achieved by scaling thought sampling alone, making continuation-level branching a principled and potentially necessary mechanism rather than a heuristic. Experiments further provide empirical evidence for its effectiveness and potential necessity, demonstrating improved optimization stability, training efficiency, and final performance not only in math but also across vision domains and under different model architectures and sizes.
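
The asymmetry between $K$ and $M$ can be seen in a deliberately simplified toy model, sketched below. It assumes each thought has a fixed success probability, rewards are independent Bernoulli draws, and the advantage of a thought is its Monte Carlo value minus the group mean (standard-deviation normalization is omitted). This is not the paper's estimator or its delta-method analysis; `advantage_variance` is a hypothetical helper that evaluates the closed-form variance of that toy estimate.

```python
def advantage_variance(p_focus, p_others, m):
    """Variance of (focal value estimate - group mean) in the toy model.

    Each thought's value is estimated from m independent Bernoulli rewards,
    so its estimate has variance p * (1 - p) / m. With K thoughts in total,
    the focal estimate enters the difference with weight (1 - 1/K) and every
    other estimate with weight -1/K.
    """
    k = len(p_others) + 1
    var_focus = p_focus * (1.0 - p_focus) / m
    var_others = sum(p * (1.0 - p) for p in p_others) / m
    return (1.0 - 1.0 / k) ** 2 * var_focus + var_others / k ** 2


# Many thoughts, one continuation each: the variance stays near p(1-p) = 0.25.
many_thoughts = advantage_variance(0.5, [0.5] * 999, m=1)

# Few thoughts, many continuations each: the variance shrinks like 1/M.
many_continuations = advantage_variance(0.5, [0.5] * 3, m=100)
```

In this toy setting, growing the number of thoughts only removes the baseline's noise and leaves the focal thought's own sampling noise untouched, whereas growing the number of continuations reduces both.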

10.
arXiv (CS.LG) 2026-06-12

Retrieval-Augmented Foundation Models for Water Level Prediction in the Everglades

arXiv:2508.04888v2 Announce Type: replace Abstract: Accurate water level forecasting in the Everglades is essential for flood mitigation, drought management, water resource planning, and biodiversity conservation. While recent time-series foundation models have shown strong performance on generic tasks (represented in their pre-training), their effectiveness in domain-specific applications remains insufficiently understood. In this work, we curate a domain-specific dataset for water-level forecasting in the Everglades and observe that the performance of current state-of-the-art models remains limited. To address this gap, we leverage a retrieval-augmented mechanism that retrieves analogous multivariate hydrological episodes from an external archive of historical observations to enrich the input context of those pre-trained models. We study two retrieval strategies, statistical similarity-based retrieval and mutual information-based retrieval, and analyze how incorporating retrieved historical contexts affects predictive performance. Extensive experiments show that retrieval augmentation consistently improves long-horizon water level forecasts and yields disproportionately larger gains during extreme events, which is particularly critical for environmental decision-making. Our study provides empirical evidence that analog-based retrieval can benefit pretrained time-series foundation models in environmental science, offering practical insights into their strengths, limitations, and failure modes when applied to hydrological forecasting in the Everglades. Although evaluated in the Everglades, the proposed framework is general and can be applied to other hydrological systems given time series data. The code and data have been made publicly available at https://github.com/rahuul2992000/WaterRAF.

11.
arXiv (CS.AI) 2026-06-16

GIST-CMTF: Goal-State Inference for Causal Minimal Tool Filtering in LLM Agents

arXiv:2606.16813v1 Announce Type: new Abstract: Tool-augmented LLM agents rely on runtime filtering to decide which tools should be visible at each step. Causal Minimal Tool Filtering (CMTF) reduces tool-choice confusion by exposing only the next causally necessary tool frontier, but it assumes that the user request has already been mapped to a symbolic goal state. In practice, requests such as "handle my appointment" or "take care of this email" may correspond to multiple possible goals. This creates wrong-goal execution, where an agent follows a valid causal tool path for an unintended objective. We introduce GIST-CMTF, a goal-state inference layer that predicts candidate symbolic goals over the same state-transition vocabulary used by CMTF, estimates ambiguity, and either applies CMTF or exposes clarification as a causal action that produces missing goal or state variables. We evaluate GIST-CMTF across seven model backends, six filtering methods, and 120 controlled tool-use tasks. GIST-CMTF achieves 97.0% task success, compared with 80.1% for top-goal CMTF and 82.9% for semantic-goal CMTF. It reduces wrong-goal execution from 19.4% under top-goal CMTF to 2.5%, while preserving the one-tool exposure of causal filtering and using substantially fewer tokens than all-tools exposure. These results suggest that reliable tool-augmented agents should validate goal state, not only tool relevance, before exposing external actions.
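
The decision between filtering and clarification described above can be sketched loosely as follows. Everything in this block is hypothetical: the margin-based ambiguity rule, the threshold value, the goal names, the tool names, and the function `visible_tools` are assumptions made for illustration, and the abstract does not state how GIST-CMTF actually estimates ambiguity.

```python
def visible_tools(goal_scores, next_tool_for_goal, margin=0.2):
    """Return the tools to expose at this step.

    goal_scores: dict mapping candidate goal -> score (assumed in [0, 1]).
    next_tool_for_goal: dict mapping goal -> next tool on its causal path.
    margin: assumed minimum gap between the top two goals to count as unambiguous.
    """
    ranked = sorted(goal_scores.items(), key=lambda kv: kv[1], reverse=True)
    top_goal, top_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0
    if top_score - runner_up_score < margin:
        # Ambiguous request: expose clarification as the only available action.
        return ["ask_clarification"]
    # Confident goal: expose only the next tool on that goal's path.
    return [next_tool_for_goal[top_goal]]


# Hypothetical tool frontier for the request "handle my appointment".
frontier = {"reschedule": "calendar_lookup", "cancel": "calendar_delete"}

ambiguous = visible_tools({"reschedule": 0.45, "cancel": 0.40}, frontier)
confident = visible_tools({"reschedule": 0.90, "cancel": 0.10}, frontier)
```

Either way a single action is exposed, which mirrors the one-tool exposure the abstract mentions.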

12.
arXiv (CS.CV) 2026-06-17

SegTME-UNI2: A Foundation Model-Based Framework for Generalisable Multiclass Cell Segmentation and LLM-Driven Tumour Microenvironment Characterisation in Histopathology

Characterising the tumour microenvironment (TME) from routine H&E-stained histology images requires simultaneous cell segmentation, feature extraction, and interpretable clinical reporting. We present SEGTME-UNI2, a unified framework addressing these requirements. Its core is UNI2-UPERHOVER, a dual-head segmentation model pairing the UNI2-H pathology foundation model (ViT-Giant, pretrained on >100M tiles from 100K slides) with two parallel UperNet decoders: one for six-class semantic segmentation and one for horizontal-vertical gradient regression enabling watershed-based nuclear instance separation. To address the lack of pixel-level annotations in large real-world repositories, UNI2-UPERHOVER undergoes a three-stage progressive pseudo-label curriculum. Each stage trains a fresh model without weight transfer, driving improvement entirely via increased pseudo-label quality:

Stage 1: Uses human-annotated PanNuke (7,901 images, 189,744 nuclei, 0.25 µm/pixel).
Stage 2: Uses entropy-filtered pseudo-labels from the Stage 1 model on 271,711 TCGA-UT scale-0 patches (0.5 µm/pixel).
Stage 3: Uses pseudo-labels from the Stage 2 model on all 1,608,060 TCGA-UT patches across six resolution scales (0.5-1.0 µm/pixel).

Segmentation outputs feed a structured TME feature extraction pipeline computing 20+ per-patch compositional, morphological, spatial entropy, and intercellular distance metrics. These are encoded as JSON and passed to a fine-tuned NVIDIA BioNeMo GPT model to generate clinically interpretable TME narratives. Preliminary validation on held-out PanNuke and TCGA-UT partitions demonstrates framework feasibility and internal consistency. The pseudo-labelled TCGA-UT dataset and UNI2-UPERHOVER checkpoint are publicly released to support large-scale TME profiling and spatial biology research.

13.
arXiv (CS.CL) 2026-06-18

Structured Inference with Large Language Gibbs

The knowledge encoded in large language models (LLMs) can serve as a substrate for structured reasoning over variables describing a complex world, but accessing this knowledge in a probabilistically coherent manner poses a difficult inference problem. We propose Large Language Gibbs, a scheme for structured probabilistic inference that uses conditional distributions of an LLM as transition operators. Rather than sampling structured objects through single-pass autoregressive generation, we iteratively resample individual variables conditioned on others using an LLM's next-token conditionals. This approach avoids order-dependent biases and produces a stationary distribution that reflects a compromise between all local conditionals. We apply this approach to sampling from synthetic distributions, consistent reasoning tasks, and Bayesian structure learning. The results suggest that the use of LLM conditionals in MCMC is a practical alternative to one-pass generation for structured probabilistic inference under a world prior accessible through noisy LLM conditionals.
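
The resampling loop described above follows the usual Gibbs pattern, sketched here on two binary variables. This is a toy illustration, not the authors' code: the lookup table `COND` is a hypothetical stand-in for an LLM's next-token conditional, and the function names, seed, and number of sweeps are assumptions.

```python
import random

# Hypothetical stand-in for an LLM conditional: P(x_i = 1 | other variable).
# These values are compatible with a joint in which the two variables agree
# with probability 0.8.
COND = {0: 0.2, 1: 0.8}


def conditional_prob_one(other_value):
    """Probability that a variable equals 1 given the other variable's value."""
    return COND[other_value]


def gibbs(num_sweeps, seed=0):
    """Resample each variable in turn, conditioned on the other one."""
    rng = random.Random(seed)
    state = [0, 0]
    samples = []
    for _ in range(num_sweeps):
        for i in (0, 1):
            other = state[1 - i]
            state[i] = 1 if rng.random() < conditional_prob_one(other) else 0
        samples.append(tuple(state))
    return samples


samples = gibbs(20000)
agree = sum(a == b for a, b in samples) / len(samples)
```

With real LLM conditionals the table lookup would be replaced by a model query, and the local conditionals need not be mutually consistent, which is why the abstract describes the stationary distribution as a compromise between them.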

14.
arXiv (CS.CV) 2026-06-15

A Lightweight Fiducial-Based Pipeline for 3D Hyperspectral Mapping of ex-vivo Lumpectomy Specimens

Hyperspectral Imaging (HSI) is a promising modality for intraoperative assessment of resection margins in Breast-Conserving Surgery (BCS), but its clinical translation requires aligning the inherently 2D spectral information onto the 3D shape of the excised tissue so that suspicious regions can be precisely localized for targeted follow-up. We present a fully automated, calibration-free pipeline that produces a 3D hyperspectral point cloud of an ex-vivo lumpectomy specimen from a set of consumer-camera RGB images and a single top-down HSI acquisition. The 3D geometry is reconstructed with a deep-learning Structure-from-Motion backbone, stabilized in a metric reference frame by a custom bundle adjustment that enforces consistency on the corners of four ArUco markers placed around the specimen. The HSI cube is then registered to the reconstruction without recovering the HSI camera pose: the markers, visible in both modalities, define 16 corner correspondences that drive a planar homography, and 3D coordinates are recovered by lookup on an orthographically rendered depth map. Evaluated on two ex-vivo lumpectomy specimens, the pipeline achieves a median 3D registration error below 1 mm and a 2D reprojection error below 0.02 mm, with a total per-specimen processing time under 4 minutes on accelerated hardware. These results support the feasibility of integrating HSI-guided spatial localization into intraoperative margin assessment workflows for breast-conserving surgery.
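
The registration step (planar homography followed by a depth-map lookup) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the homography `H` is taken as already estimated, the nearest-pixel lookup, the pixel size `mm_per_pixel`, and the function names are all hypothetical, and the tiny depth map is made-up example data.

```python
def apply_homography(h, x, y):
    """Map pixel (x, y) through a 3x3 homography given as nested lists."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


def lift_to_3d(h, depth_map, mm_per_pixel, x, y):
    """Return an (X, Y, Z) point in mm for HSI pixel (x, y).

    The HSI pixel is mapped into the orthographic depth render, and the
    depth value at the nearest render pixel supplies the Z coordinate.
    """
    u, v = apply_homography(h, x, y)
    col, row = round(u), round(v)
    z = depth_map[row][col]
    return (col * mm_per_pixel, row * mm_per_pixel, z)


# Made-up example: halve the HSI coordinates and shift one pixel along x.
H = [[0.5, 0.0, 1.0],
     [0.0, 0.5, 0.0],
     [0.0, 0.0, 1.0]]
depth_map = [[0.0, 0.0, 0.0],
             [0.0, 0.0, 7.5],
             [0.0, 0.0, 0.0]]

point = lift_to_3d(H, depth_map, mm_per_pixel=0.2, x=2, y=2)
```

Because the depth render is orthographic, the in-plane coordinates follow directly from the pixel indices and the pixel size, so no HSI camera pose is needed in this sketch.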

15.
arXiv (CS.LG) 2026-06-17

Evaluating Uplift Modeling under Structural Biases: Insights into Metric Stability and Model Robustness

arXiv:2603.20775v2 Announce Type: replace Abstract: In personalized marketing, uplift models estimate the incremental effect of an intervention by modeling how customer behavior would change under alternative treatments using counterfactual analysis. However, real-world marketing data often exhibit various biases, such as selection bias, spillover effects, measurement error, and unobserved confounding. These biases can adversely affect both the accuracy of uplift estimation and the validity of evaluation metrics. Despite the importance of bias-aware assessment, there remains a lack of systematic studies evaluating how different models and metrics perform under such biased conditions. To bridge this gap, we design a systematic benchmarking framework. Unlike standard predictive tasks, real-world uplift datasets inherently lack counterfactual ground truth. This limitation renders the direct validation of evaluation metrics infeasible and prevents the precise quantification of biases. Therefore, a semi-synthetic approach serves as a critical enabler for systematic benchmarking. This approach effectively bridges the gap by retaining real-world feature dependencies while providing the ground truth needed to isolate structural biases. Our investigations reveal that (i) uplift targeting and prediction can manifest as distinct objectives, where proficiency in one does not ensure efficacy in the other; (ii) while many models exhibit inconsistent performance under diverse biases, TARNet shows notable robustness, providing insights for subsequent model design; (iii) the stability of evaluation metrics is linked to their mathematical alignment with the ATE, suggesting that ATE-approximating metrics yield more consistent model rankings under structural data imperfections. These findings suggest the need for more robust uplift models and evaluation metrics under real-world data imperfections.

16.
arXiv (CS.CV) 2026-06-16

DynFS-MoE: Dynamic Functional-Structural Mixture-of-Experts for Post-Traumatic Epilepsy Diagnosis

Post-traumatic epilepsy (PTE) is a severe complication of traumatic brain injury (TBI), yet early identification remains challenging due to the complex structural and functional alterations it induces in the brain. To address this, we propose a dynamic multimodal Mixture-of-Experts (MoE) framework that integrates functional and structural MRI through time-aware functional-structural encoding and class-conditioned expert routing. Within this framework, modality-specific and cross-modal experts learn complementary representations, while a Modality-Class MoE (MCoE) module dynamically dispatches expert weights according to each classification objective. Experimental results across three binary classification tasks demonstrate that the framework consistently outperforms static fusion baselines, and high-interpretability analyses further reveal meaningful region-of-interest (ROI) interactions. This dynamic multimodal expert framework effectively captures class-dependent brain interaction patterns and provides an interpretable approach for PTE diagnosis and risk stratification.

17.
bioRxiv (Bioinfo) 2026-06-08

DDI_single: Single-Sequence-Based Protein Domain Assembly

Domains are the basic units of protein structure and function. Appropriate inter-domain organization is critical to enable cooperative execution of multiple related functions. It is thus a crucial step to determine the full-length structure of multi-domain proteins for the purpose of elucidating their functions and designing new drugs to regulate these functions. Existing structure prediction algorithms are generally better at solving the internal conformation of domains, rather than modeling the relative positions between domains. To address the challenge of accurately determining multi-domain protein conformations, we develop a single-sequence-based domain assembly algorithm called DDI_single. DDI_single directly extracts features from the amino acid sequence using the protein language model ESM-1b, and accurately predicts the interactions between residue pairs of structural domains through a novel gated cross-attention module, thus achieving the correct assembly of structural domains. With the knowledge of domain definition, DDI_single achieves more than 20% higher accuracy in the task of predicting the relative distances of residue pairs between domains than that of the single-sequence-based structure prediction algorithm trRosettaX_single. When assembling domains with known spatial conformations, DDI_single correctly assembles 74.4% of the samples in the test set (TM-score>0.5). When assembling domains with unknown spatial conformations, in cases where the internal spatial conformations of domains are correctly modeled, DDI_single correctly assembles 73.9% of the samples.

18.
arXiv (CS.LG) 2026-06-16

Brownian Kernel Ladders

arXiv:2606.15812v1 Announce Type: new Abstract: Constructing mathematically tractable function spaces that capture hierarchical compositional representations remains a central challenge in statistical learning theory. We introduce Brownian kernel ladders (BKLs), a recursively defined hierarchy of integral reproducing kernel Hilbert spaces generated through Brownian-kernel integral constructions. Starting from linear functionals, each layer is obtained by integrating Brownian kernels over probability measures supported on subsets of the previous layer, yielding a recursive function-space model in which depth is encoded directly through the hierarchy. Based on this framework, we define canonical BKL spaces together with an associated complexity functional. We establish several analytical and statistical properties of these spaces. In particular, we show that BKL spaces form quasi-Banach spaces, satisfy depth-dependent Hölder regularity estimates, and exhibit strict monotonicity with respect to depth. We further prove existence results for regularized empirical risk minimization and derive Gaussian complexity bounds that remain uniformly controlled with respect to both the ambient dimension and the hierarchy depth. A key ingredient of the analysis is a combinatorial proof technique based on recursive subset decompositions and Brownian-kernel threshold representations. These estimates yield excess-risk guarantees of near-parametric order for regularized empirical risk minimization over BKL spaces. Our results provide a mathematically tractable hierarchical function-space framework for studying compositional representations in deep learning.

19.
arXiv (CS.AI) 2026-06-16

Calibrated Sampling-Free Uncertainty Estimation in Bayesian Deep Learning

arXiv:2606.16214v1 Announce Type: cross Abstract: Modern deep learning models remain notoriously prone to overconfidence, limiting their reliability in high-stakes applications. Bayesian methods aim to counter this by learning a distribution over model parameters, and recent advances now make this feasible for large-scale architectures at costs comparable to AdamW. However, a challenge remains at test time: predictions must be averaged across many forward passes with weights sampled from the posterior, which is prohibitively expensive. Variance propagation offers an efficient alternative, computing layer-wise analytical approximations of uncertainty in a single forward pass. While such techniques are effective for MLPs, their extension to modern architectures remains challenging, due to increased depth and diversity of layer types. To fill this gap, we propose Calibrated Variance Propagation (CVP), which introduces a new propagation method for normalization layers, combines it with recent techniques for handling activation functions, and absorbs residual error through a light calibration step. CVP yields comparably accurate uncertainty estimates to MC sampling across transformers and CNNs, at a fraction of the cost. Against prior variance propagation work, CVP improves coverage at $0.5\%$ risk from $8.2\%$ to $14.6\%$ with BEiT-3 on Visual Reasoning (NLVR2) and from $2.6\%$ to $10.8\%$ with ViLT on VQAv2, with gains extending to convolutional architectures.
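
As background for the variance-propagation idea, the sketch below pushes a mean and a variance through a single linear unit in closed form instead of sampling weights. It covers only this simplest case and is not the CVP method: it assumes all weights and inputs are mutually independent, and `propagate_linear` is a hypothetical helper name.

```python
def propagate_linear(w_mean, w_var, x_mean, x_var):
    """Mean and variance of y = sum_i w_i * x_i for independent w_i and x_i.

    For independent w and x:
        E[w x]   = E[w] E[x]
        Var(w x) = Var(w) E[x]^2 + E[w]^2 Var(x) + Var(w) Var(x)
    and the variances of the independent terms add up.
    """
    mean = sum(wm * xm for wm, xm in zip(w_mean, x_mean))
    var = sum(
        wv * xm ** 2 + wm ** 2 * xv + wv * xv
        for wm, wv, xm, xv in zip(w_mean, w_var, x_mean, x_var)
    )
    return mean, var


# Deterministic input (zero input variance), uncertain weights.
y_mean, y_var = propagate_linear(
    w_mean=[1.0, 2.0],
    w_var=[0.1, 0.2],
    x_mean=[3.0, 4.0],
    x_var=[0.0, 0.0],
)
```

For a linear unit these moments are exact under the independence assumption; activation functions and normalization layers need approximations, which is where the abstract locates the difficulty.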

20.
arXiv (CS.CL) 2026-06-12

Structuring The Future: Diffusion LLM Speculative Decoding via Calibrated Draft Graphs

Diffusion LLMs (dLLMs) have recently emerged as a powerful alternative to autoregressive LLMs (AR-LLMs) with the potential to operate at significantly higher token-generation rates. To unlock this potential, we present Spiffy, a speculative decoding algorithm to accelerate dLLM inference while provably preserving the model's output distribution. This work addresses the unique challenges involved in applying ideas from speculative decoding of AR-LLMs to dLLMs. Spiffy performs auto-speculation to eliminate the overheads of an independent draft model, structuring draft states in the form of a novel directed draft graph to take advantage of the bidirectional, blockwise nature of dLLM generation. These draft graphs are calibrated offline to maximize acceptance rates and are dynamically pruned during inference for improved computational efficiency. We present a detailed formulation of Spiffy and demonstrate its ability to accelerate LLaDA, Dream, and SDAR models in combination with KV caching and threshold-based dynamic unmasking leading to up to $8.6\times$ reduction in model inferences and $6.3\times$ acceleration in token rate.

21.
arXiv (CS.LG) 2026-06-17

Another Look at Log-PCA for Probability Measures: A Dynamical Formulation and Statistical Convergence

arXiv:2606.17196v1 Announce Type: cross Abstract: This paper is concerned with learning principal variations of random probability measures on $\mathbb{R}^m$ under the Wasserstein geometry. We introduce a new dynamical formulation to interpret the log-PCA, a linearized principal geodesic analysis, as a variational approach. Our differentiable version, termed Wasserstein Tangential PCA (WT-PCA), captures the local principal modes of geodesic variations of a (weighted) probability measure on the Wasserstein space via its covariance operator at the barycenter. Based on the dynamical perspective and leveraging the parallel transport structure of optimal transport problems, we derive a general statistical convergence rate of the empirical WT-PCA, when estimated from data, in terms of the 2-Wasserstein distance between the population and empirical barycenter reference measures.

22.
PLOS Medicine 2026-05-13

Contribution of nosocomial transmission to Klebsiella pneumoniae neonatal sepsis in Africa and South Asia: An observational study of infection clusters inferred from pathogen genomics and temporal data

by Erkison Ewomazino Odih, Jabir A. Abdulahi, Anne V. Amulele, Matthew Bates, Eva Heinz, Weiming Hu, Kajal Jain, Rindidzani Magobo, Courtney P. Olwagen, John M. Tembo, Tolbert Sonda, Jonathan Strysko, Caroline C. Tigoi, Kyle Bittinger, Jennifer Cornick, Ebenezer Foster-Nyarko, Wilson Gumbi, Steven M. Jones, Chileshe L. Musyani, Carolyn M. McGann, Ahmed M. Moustafa, Patrick Musicha, James C. L. Mwansa, Moreka L. Ndumba, Thomas D. Stanton, Donwilliams O. Omuoyo, Oliver Pearse, Laura T. Phillips, Paul J. Planet, Charlene M. C. Rodrigues, Fatou Secka, Kirsty Sands, Erin Theiller, Allan M. Zuza, Sulagna Basu, Grace J. Chan, Kenneth C. Iregbu, Jean-Baptiste Mazarati, Semaria Solomon Alemayehu, Timothy R. Walsh, Rabaab Zahra, Angela Dramowski, Sombo Fwoloshi, Appiah-Korang Labi, Lola Madrid, Noah Obeng-Nkrumah, David Ojok, Boaz D. Wadugu, Andrew C. Whitelaw, Anudita Bhargava, Atul Jindal, Ramesh K. Agarwal, Alexander M. Aiken, James A. Berkley, Susan E. Coffin, Nicholas A. Feasey, Nelesh P. Govender, Davidson H. Hamer, Shabir A. Madhi, Mari Jeeva Sankar, Kelly L. Wyres, Kathryn E. Holt

Background: Klebsiella pneumoniae is the leading cause of sepsis among neonates in low- and middle-income countries (LMICs) in Africa and Asia, contributing substantially to the overall burden of antimicrobial-resistant infections and mortality among neonates globally. Pathogen sequencing has been used to investigate case clusters and confirm nosocomial transmission in a small number of neonatal units. Here we utilise pathogen sequence data to estimate the fraction of K. pneumoniae neonatal sepsis attributable to nosocomial transmission in African and South Asian countries.

Methods and findings: We estimated the proportion of invasive K. pneumoniae disease involved in nosocomial transmission clusters in a given neonatal unit, using single-linkage clustering based on pairwise temporal and genetic distances estimated from bacterial whole-genome sequences aggregated from 10 contributing studies. Analysing 1,523 K. pneumoniae isolates from 27 units in 13 countries in Africa and South Asia between 2013 and 2023, we inferred 156 nosocomial transmission clusters, ranging from 2 to 188 neonates each (83 of the clusters comprised ≥3 cases). Overall, we estimated that 1,035 neonatal infections (68.0%) were part of nosocomial transmission clusters. Excluding the first infection in each cluster as a potential index case, we estimate at least 879 (57.7%) infections were acquired via nosocomial transmission. Sensitivity analyses showed that results were robust to the choice of genetic distance estimation methods and thresholds used to define clusters, and cluster estimates were stable over temporal distance thresholds ranging from 2 to 8 weeks. Isolates were mostly extended-spectrum beta-lactamase (ESBL) producers (90.9%) and included 172 multi-locus sequence types (STs). Fourteen STs, including several globally recognised multidrug-resistant lineages, were associated with transmission clusters at multiple units, and these were collectively responsible for two-thirds of all infections. Carriage of carbapenemase genes (adjusted odds ratio, aOR = 2.08 [95% confidence interval, CI: 1.04, 4.14]; p = 0.04) and ESBL genes (aOR = 2.48 [95% CI: 1.26, 4.90]; p = 0.006) were significantly positively associated with transmission in a logistic regression model with site as a covariate. Limitations of this study include the lack of sufficient clinical data to allow high-resolution investigation of transmission dynamics and lack of facility-level data to investigate contributors to the observed differences in transmission burden across sites.

Conclusions: Nosocomial transmission contributes to a substantial proportion of K. pneumoniae sepsis in neonatal care units in Africa and South Asia. Reducing transmission within these settings through improved infection prevention and control and other measures could substantially reduce the neonatal sepsis burden. A high burden of transmission clusters is associated with the same drug-resistant lineages that are recognised as high-risk clones associated with hospital outbreaks in high-income countries, indicating global connectivity of the antimicrobial-resistant pathogen population.
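
The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes that two isolates are linked when both their genetic distance and their sampling-date gap fall under a threshold, and that clusters are the connected components of those links (which is what single-linkage clustering at a fixed cut-off produces). The function name, the toy isolates, the SNP matrix, and the thresholds `max_snps=5` and `max_days=28` are all hypothetical; the abstract does not state the study's actual cut-offs. The last lines only re-derive the reported percentages from the reported counts.

```python
from itertools import combinations


def single_linkage_clusters(isolates, snp_dist, max_snps, max_days):
    """Return clusters (size >= 2) as sorted lists of isolate ids.

    Two isolates are linked when their genetic distance is <= max_snps AND
    their sampling dates are <= max_days apart; clusters are the connected
    components of the resulting link graph.
    """
    parent = list(range(len(isolates)))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(isolates)), 2):
        close_in_time = abs(isolates[i]["day"] - isolates[j]["day"]) <= max_days
        if close_in_time and snp_dist[i][j] <= max_snps:
            parent[find(i)] = find(j)

    groups = {}
    for i, isolate in enumerate(isolates):
        groups.setdefault(find(i), []).append(isolate["id"])
    return [sorted(ids) for ids in groups.values() if len(ids) >= 2]


# Hypothetical toy data: sampling day and pairwise SNP distances
isolates = [
    {"id": "A", "day": 0},
    {"id": "B", "day": 10},
    {"id": "C", "day": 20},
    {"id": "D", "day": 200},  # genetically close to A, but sampled much later
    {"id": "E", "day": 15},   # sampled nearby in time, but genetically distant
]
snp_dist = [
    [0, 3, 12, 2, 60],
    [3, 0, 4, 5, 58],
    [12, 4, 0, 9, 61],
    [2, 5, 9, 0, 59],
    [60, 58, 61, 59, 0],
]
clusters = single_linkage_clusters(isolates, snp_dist, max_snps=5, max_days=28)

# Re-deriving the reported percentages from the counts in the abstract
total, clustered, n_clusters = 1523, 1035, 156
pct_in_clusters = round(100 * clustered / total, 1)
acquired = clustered - n_clusters  # drop one potential index case per cluster
pct_acquired = round(100 * acquired / total, 1)
```

In the toy data, A and C are never directly linked (12 SNPs apart) but end up in the same cluster through B, which is the chaining behaviour characteristic of single linkage.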

23.
bioRxiv (Bioinfo) 2026-06-16

Better data, better trees: GenBank-GISAID deduplication and source-specific artifact masking in viral genomics

GenBank and GISAID are the primary repositories for viral genomic data, but integrating records across them remains a challenge. The same sequence could be made available in both databases without any cross-reference linking the two entries. Consequently, there is no systematic way to identify this redundancy, which compromises the compilation of representative, non-redundant large-scale datasets. In parallel, the growth of viral genomic data has increased the risk of systematic technical artifacts introduced during sequencing or assembly. These artifacts can inflate substitution rate estimates and degrade temporal signal, biasing evolutionary rate estimates. To address both challenges, here we present a formal, reproducible workflow integrating two newly developed complementary tools: G2G matcher for cross-repository harmonization and Lab-Specific Bias FILTer (LSBFILT) for masking of laboratory-specific artifacts. Using the Eastern/Central/South African (ECSA) chikungunya virus lineage as a proof-of-concept, we demonstrate that our integrated workflow restores temporal signal and provides a robust, curated dataset for downstream phylodynamic analyses. Critically, restricting masking of homoplastic sites to specific sequences reduces the substitution rate estimate from an inflated $8.517 \times 10^{-4}$ to $5.078 \times 10^{-4}$ substitutions/site/year and increases the coefficient of determination ($R^2$) of the root-to-tip regression analysis from 0.353 to 0.677. By enabling systematic cross-repository harmonization and source-specific artifact masking, we provide the molecular epidemiological community with scalable tools to reconcile fragmented genomic data and reduce technical biases, fostering more accurate and reproducible phylogenetic analysis. G2G matcher is available at https://github.com/andrezaleite/G2G-Matcher, and LSBFILT at https://github.com/khourious/LSBFILT.
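
The root-to-tip regression mentioned in this abstract can be sketched as an ordinary least-squares fit of each sequence's genetic distance from the root against its sampling date: the slope is a rough substitution rate estimate and the coefficient of determination measures temporal signal. The sketch below is an illustration under that assumption, not the authors' workflow; the function name and all data points are synthetic and hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, intercept, r_squared). The slope approximates the
    substitution rate in substitutions/site/year.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    syy = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic tips evolving at exactly 5e-4 substitutions/site/year
dates = [2010, 2012, 2014, 2016, 2018]
clean = [0.005, 0.006, 0.007, 0.008, 0.009]

# Same tips, but the most recent one carries extra artifactual substitutions
with_artifact = [0.005, 0.006, 0.007, 0.008, 0.019]

rate_clean, _, r2_clean = root_to_tip_regression(dates, clean)
rate_artifact, _, r2_artifact = root_to_tip_regression(dates, with_artifact)

# Relative change between the two rates reported in the abstract
reported_ratio = 5.078e-4 / 8.517e-4
```

In this synthetic case a single artifact-laden tip both inflates the fitted rate and lowers the coefficient of determination, which is the qualitative pattern the abstract reports before masking. The reported rates differ by a factor of about 0.6, that is, the masked estimate is roughly 40% lower.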

24.
arXiv (CS.CV) 2026-06-19

Distill Once, Adapt Life-Long: Exploring Dataset Distillation for Continual Test-Time Adaptation

Continual Test-Time Adaptation (CTTA) aims to maintain model performance under evolving target domains by adapting online without labeled data. However, practical deployments often cannot retain the source dataset due to privacy or licensing constraints, and purely source-free CTTA methods tend to become unstable under long-term distribution shift, suffering from compounding self-training errors and catastrophic forgetting. We introduce DO-ALL (Distill Once, Adapt Life-Long), a plug-and-play framework that revisits source information in a compact and privacy-conscious form via Dataset Distillation (DD). Before deployment, DO-ALL performs DD to produce a small set of synthetic distilled anchors that summarize the source distribution. During adaptation, each target sample is matched with its most semantically aligned anchor, which provides a stable reference for various CTTA methods via source replay, representation alignment, and manifold-smoothing regularization. DO-ALL can be seamlessly integrated into existing CTTA algorithms, consistently improving long-term robustness across CIFAR100-C, ImageNet-C, and the CCC benchmark. This demonstrates the potential of leveraging DD to enable stable and continuous adaptation without retaining raw source data. The code is available at https://github.com/blue-531/DOALL.

25.
arXiv (CS.CL) 2026-06-12

IVIE: A Neuro-symbolic Approach to Incremental and Validated Generation of Interactive Fiction Worlds

Computational creativity in Interactive Fiction faces a fundamental tension: Large Language Models (LLMs) may produce creative narratives but struggle with world coherence, while symbolic systems ensure consistency but lack creative flexibility. We present IVIE (Incremental & Validated Interactive Experiences), a neuro-symbolic approach to generating complete and playable interactive fiction worlds from scratch. Building upon PAYADOR's neuro-symbolic framework, IVIE implements a four-stage incremental generation pipeline that delegates creative decisions (setting and character creation, puzzle design) to LLMs while grounding the world state through symbolic validation. The system generates worlds with interconnected locations, functional items, non-player characters, and coherent puzzles, all structured around a central goal-oriented architecture. Human evaluation shows the approach generates immersive, thematically coherent worlds with high player engagement. Results seem to indicate that the neuro-symbolic approach successfully balances flexibility with narrative coherence: symbolic validation grounds LLM generation without eliminating generative freedom. However, challenges remain: LLM inconsistencies occasionally bypass puzzle constraints, and objective validation gaps allow some structurally impossible goals. We identify key design considerations for future neuro-symbolic interactive storytelling systems, particularly regarding LLM capabilities and their limitations.