Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
PLOS Computational Biology 2026-06-04

Cell differentiation can underpin the reproducibility of morphogenesis

by Dominic K. Devlin, Austen R. D. Ganley, Nobuto Takeuchi Morphogenesis of complex body shapes is reproducible despite the noise inherent in the underlying morphogenetic processes. However, how these morphogenetic processes work together to achieve this reproducibility remains unclear. Here, we ask how this reproducibility is achieved by evolving complex morphologies in a multi-scale, computational model. Each morphology consists of a population of cells on a two-dimensional grid using the Cellular Potts Model framework. Each cell contains a genome that encodes a gene regulatory network, morphogens for cell-cell signalling, and proteins that determine cell behaviours. By repeatedly simulating our model with different initial conditions under selection for shape complexity, we obtained a “zoo” of evolved morphologies. We find that these evolved, complex morphologies are reproducible in a sizeable fraction of simulations, despite no direct selection for reproducibility. We show that high reproducibility is caused by spatially segregating moving cells that “shape” morphologies from stationary cells that “maintain” morphologies during morphogenesis. Strikingly, most highly reproducible morphologies also evolved cell differentiation, where proliferative, moving progenitor cells irreversibly differentiate into non-dividing, stationary differentiated cells at tissue boundaries. These results suggest that cell differentiation observed in natural development plays a fundamental role in morphogenesis in addition to the production of specialised cell types. This previously unrecognised role of cell differentiation has major implications for our understanding of how morphologies are generated and regenerated.

02.
arXiv (CS.CV) 2026-06-17

Learning QoE from Packet-Level Measurements in Encrypted Video Conferencing Traffic

The quality of the user experience has become one of the most important aspects in todays world, as it directly influences individuals willingness to continue using or abandon a product or service. In this context, video conferencing applications (VCAs), which experienced widespread adoption following the COVID-19 pandemic, must deliver excellent performance to remain competitive in an increasingly crowded market. Although content providers (CPs) such as Zoom, WhatsApp, Telegram, and Google Meet can assess conversation quality by comparing transmitted and received data. The widespread use of end-to-end encryption in VCAs makes quality-of-experience (QoE) evaluation by internet service providers (ISPs) far more challenging. Since ISPs do not have access to the encrypted content, they must rely on passive measurements of unencrypted traffic characteristics on the data path. In this work, we present a simple yet effective QoE prediction framework based on an almost stock convolutional neural network (CNN) architecture that uses only the packet sizes extracted from the communication between two participants in a video conferencing (VC) call to predict two QoE metrics: BRISQUE and MOS. The proposed framework is simple, easy to implement, and does not require high-end computational resources, yet it provides superior prediction performance, as shown in our experiments on two custom datasets collected from WhatsApp and Zoom, which achieve substantial improvements over previous models for the QoE prediction task.

03.
arXiv (CS.CV) 2026-06-18

When AUC Misleads: Polarization-Aware Evaluation of Deepfake Detectors under Domain Shift

Recent advances in generative AI, such as diffusion models and face-swapping tools, have enabled the creation of highly realistic deepfakes, leading to real-world harms including financial fraud and non-consensual explicit content. In response, deepfake detection has become an active research area, with recent methods increasingly focusing on improving generalization to unseen manipulations. This is typically evaluated using the Area Under the ROC Curve (AUC) measured separately across multiple datasets. However, such an evaluation fails to reflect real-world scenarios where detectors face a mixture of data sources and varying artifact types. To address this limitation, we introduce a novel metric, Cross-dataset AUC (Cross-AUC) that averages per-domain AUCs with a measure of prediction polarization for taking into account the robustness to domain shift. The polarization extent is quantified by the Wasserstein Distance between class score distributions. Cross-AUC not only assesses the generalization capabilities of deepfake detectors under domain shifts more realistically, but it is also interpretable as it better explains the reason behind a drop in performance. Experiments performed on seven benchmark datasets demonstrate its practical relevance.

04.
arXiv (quant-ph) 2026-06-16

Readout-Induced Leakage in Superconducting Circuits with Nonlinear Couplings

arXiv:2606.16055v1 Announce Type: new Abstract: In superconducting circuits, drive-induced unwanted transitions limit the readout power, thereby constraining readout speed and fidelity. When such transitions excite the qubit into leakage states, they produce correlated errors that are particularly harmful for quantum error correction. Native nonlinear qubit-readout resonator coupling is a promising alternative to conventional linear hybridization because it provides intrinsic Purcell protection and stricter selection rules for multiphoton processes. In realistic devices, however, we show that such a coupling alone neither eliminates nor necessarily suppresses drive-induced transitions. Instead, if not appropriately engineered, these couplings often worsen the situation by introducing additional parasitic processes. Moreover, the rates of these unwanted transitions remain sensitive to the choice of readout frequency, regardless of the coupling mechanism. We demonstrate that readout-induced leakage can thus vary by orders of magnitude even when readout frequencies differ by less than ~7%. Our results establish that the benefits of native nonlinear couplings are realized only through informed device design, including the spectral placement of relevant auxiliary modes and elimination of parasitic ones.

05.
Science (Express) 2026-05-07

TranscriptFormer: A generative cell atlas across 1.5 billion years of evolution | Science

作者: 未知作者

Single-cell transcriptomics is revolutionizing our understanding of cellular diversity, yet comparing transcriptional programs across the tree of life remains challenging. We developed TranscriptFormer, a family of generative foundation models trained on up to 112 million cells spanning 1.53 billion years of evolution across 12 species. We demonstrate state-of-the-art performance on cell type classification, even for species separated over 685 million years of evolution, and zero-shot disease state identification in human cells. Developmental trajectories, phylogenetic relationships and cellular hierarchies emerge naturally in TranscriptFormer’s representations without any explicit training on these annotations. This work establishes a powerful framework for quantitative single-cell analysis and comparative cellular biology, thus demonstrating that universal principles of cellular organization can be learned and predicted across the tree of life.

06.
arXiv (CS.AI) 2026-06-19

ProMUSE: Progressive Multi-modal Uncertainty-guided Staged Evidential Alzheimer Disease Classification

arXiv:2606.19371v1 Announce Type: cross Abstract: Alzheimer's disease (AD) is a fatal disorder that destroys memory and cognitive skills in the elderly population. Most treatments for AD are effective in the early stage, leading to an increasing demand for early AD diagnosis. AD diagnosis increasingly relies on multimodal data such as clinical assessments, structural Magnetic Resonance Imaging (MRI), and Positron Emission Tomography (PET) imaging. However, MRI and PET acquisition remain costly and not universally accessible, making full-modality inference impractical in real-world clinical workflows. We propose ProMUSE, a Progressive Multi-modal Uncertainty Guided Staged Evidential Network that adaptively determines when additional modalities are necessary, helping reduce the overall cost of data acquisition while maintaining accuracy. ProMUSE first performs evidential classification using low-cost clinical data and quantifies uncertainty via a Dirichlet-based subjective logic model. When uncertainty exceeds a learned threshold, ProMUSE progressively incorporates MRI or PET features, fusing modality-wise belief and uncertainty through Dempster-Shafer theory to obtain a calibrated multimodal prediction. This staged acquisition strategy enables accurate diagnosis while minimizing reliance on expensive imaging. Experiments on ADNI, AIBL, and OASIS across CN-AD, CN-MCI, and MCI-AD tasks demonstrate that ProMUSE achieves competitive or superior accuracy compared to full-modality baselines while reducing MRI/PET usage by 50-90%, yielding substantial cost savings. These results highlight ProMUSE as a practical, uncertainty-aware, and resource-efficient solution for real-world AD screening.

07.
arXiv (CS.CV) 2026-06-15

HULFSynth : An INR based Super-Resolution and Ultra Low-Field MRI Synthesis via Contrast factor estimation

We present an unsupervised single image bidirectional Magnetic Resonance Image (MRI) synthesizer that synthesizes an Ultra-Low Field (ULF) like image from a High-Field (HF) magnitude image and vice-versa. Unlike existing MRI synthesis models, our approach is inspired by the physics that drives contrast changes between HF and ULF MRIs. Our forward model simulates a HF to ULF transformation by estimating the tissue-type Signal-to-Noise ratio (SNR) values based on target contrast values. For the Super-Resolution task, we used an Implicit Neural Representation (INR) network to synthesize HF image by simultaneously predicting tissue-type segmentations and image intensity without observed HF data. The proposed method is evaluated using synthetic ULF-like data from generated from standard 3T T$_1$-weighted images for qualitative assessments and paired 3T-64mT T$_1$-weighted images for validation experiments. WM-GM contrast improved by 52% in synthetic ULF-like images and 37% in 64mT images. Sensitivity experiments demonstrated the robustness of our forward model to variations in target contrast, noise and initial seeding.

08.
arXiv (CS.AI) 2026-06-12

Mechanical Conscience: A Mathematical Framework for Dependability of Machine Intelligenc

arXiv:2605.03847v2 Announce Type: replace Abstract: Distributed collaborative intelligence (DCI), encompassing edge-to-edge architectures, federated learning, transfer learning, and swarm systems, creates environments in which emergent risk is structurally unavoidable: locally correct decisions by individual agents compose into globally unacceptable behavioral trajectories under uncertainty. Existing approaches such as constrained optimization, safe reinforcement learning, and runtime assurance evaluate acceptability at the level of individual actions rather than across behavioral trajectories, and none addresses the multi-participant, uncertainty-laden nature of DCI deployments. This paper introduces mechanical conscience (MC), a novel concept and simplified mathematical framework that operationalizes trajectory-level normative regulation for both single-agent and distributed intelligent systems. Mechanical conscience is defined as a supervisory filter that minimally corrects a baseline policy's actions to reduce cumulative deviation from a normatively admissible region, while accounting for epistemic uncertainty. We introduce associated constructs, conscience score, mechanical guilt, and resonant dependability, that provide an interpretable vocabulary and computable governance signals for this emerging field. Core theoretical properties are established: admissibility equivalence, existence of optimal regulation, and monotonic deviation reduction. Illustrative results demonstrate that MC-regulated agents maintain trajectory-level normative acceptability where conventional controllers drift outside admissible bounds, and that the framework naturally extends to suppress interaction-induced emergent risk in multi-agent DCI settings.

09.
arXiv (CS.CV) 2026-06-16

CycliST: A Video Language Model Benchmark for Reasoning on Cyclical State Transitions

We present CycliST, a novel benchmark dataset designed to evaluate Video Language Models (VLM) on their ability for textual reasoning over cyclical state transitions. CycliST captures fundamental aspects of real-world processes by generating synthetic, richly structured video sequences featuring periodic patterns in object motion and visual attributes. CycliST employs a tiered evaluation system that progressively increases difficulty through variations in the number of cyclic objects, scene clutter, and lighting conditions, challenging state-of-the-art models on their spatio-temporal cognition. We conduct extensive experiments with current state-of-the-art VLMs, both open-source and proprietary, and reveal their limitations in generalizing to cyclical dynamics such as linear and orbital motion, as well as time-dependent changes in visual attributes like color and scale. Our results demonstrate that present-day VLMs struggle to reliably detect and exploit cyclic patterns, lack a notion of temporal understanding, and are unable to extract quantitative insights from scenes, such as the number of objects in motion, highlighting a significant technical gap that needs to be addressed. More specifically, we find no single model consistently leads in performance: neither size nor architecture correlates strongly with outcomes, and no model succeeds equally well across all tasks. By providing a targeted challenge and a comprehensive evaluation framework, CycliST paves the way for visual reasoning models that surpass the state-of-the-art in understanding periodic patterns.

10.
arXiv (CS.CV) 2026-06-18

Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation

Understanding disease progression is a central clinical challenge with direct implications for early diagnosis and personalized treatment. While recent generative approaches have attempted to model progression, key mismatches remain: disease dynamics are inherently continuous and monotonic, yet latent representations are often scattered, lacking semantic structure, and diffusion-based models disrupt continuity with random denoising process. In this work, we propose to treat the disease dynamic as a velocity field and leverage Flow Matching (FM) to align the temporal evolution of patient data. Unlike prior methods, it captures the intrinsic dynamic of disease, making the progression more interpretable. However, a key challenge remains: in latent space, Auto-Encoders (AEs) do not guarantee alignment across patients or correlation with clinical-severity indicators (e.g., age and disease conditions). To address this, we propose to learn patient-specific latent alignment, which enforces patient trajectories to lie along a specific axis, with magnitude increasing monotonically with disease severity. This leads to a consistent and semantically meaningful latent space. Together, we present $\Delta$-LFM, a framework for modeling patient-specific latent progression with flow matching. Across three longitudinal MRI benchmarks, $\Delta$-LFM demonstrates strong empirical performance and, more importantly, offers a new framework for interpreting and visualizing disease dynamics.

11.
arXiv (CS.LG) 2026-06-18

Enhanced Graph Neural Networks using K-Hop Gaussian Diffusion

arXiv:2606.18317v1 Announce Type: new Abstract: Most graph neural network (GNN) cores rely on graph convolutions, typically implemented as message passing between direct (single-hop) neighbors. In many real-world graphs, edges can be noisy or poorly defined, limiting information propagation to local neighborhoods. Existing diffusion kernels, such as Personalized PageRank (PPR) and Heat Kernel, alleviate this issue through global propagation, but still struggle with complex local structures and distant node noise. To address these limitations, we propose a K-Hop Gaussian (KHG) diffusion kernel as a preprocessing module for graph data. KHG introduces multi-hop diffusion with Gaussian weighting for remote nodes, balancing local and global information propagation before applying standard GNNs. Experiments on multiple benchmark datasets demonstrate that KHG significantly outperforms traditional message-passing GNNs, as well as PPR and Heat Kernel diffusion, particularly in noisy or structurally complex graphs.

12.
arXiv (CS.CL) 2026-06-17

When Tables Go Crazy: Evaluating Multimodal Models on French Financial Documents

Vision-language models (VLMs) perform well on many document understanding tasks, yet their reliability in specialized, non-English domains remains underexplored. This gap is especially critical in finance, where documents mix dense regulatory text, numerical tables, and visual charts, and where extraction errors can have real-world consequences. We introduce Scribe Finance, the first multimodal benchmark for evaluating French financial document understanding. The dataset contains 1,204 expert-validated questions spanning text extraction, table comprehension, chart interpretation, and multi-turn conversational reasoning, drawn from real investment prospectuses, KIDs, and PRIIPs. We evaluate six open-weight VLMs (8B-124B parameters) using an LLM-as-judge protocol. While models achieve strong performance on text and table tasks (85-90% accuracy), they struggle with chart interpretation (34-62%). Most notably, multi-turn dialogue reveals a sharp failure mode: early mistakes propagate across turns, driving accuracy down to roughly 50% regardless of model size. These results show that current VLMs are effective for well-defined extraction tasks but remain brittle in interactive, multi-step financial analysis. Scribe Finance offers a challenging benchmark to measure and drive progress in this high-stakes setting.

13.
arXiv (CS.LG) 2026-06-16

Information Leakage Detection through Approximate Bayes-optimal Prediction

arXiv:2401.14283v4 Announce Type: replace-cross Abstract: In today's data-driven world, the proliferation of publicly available information raises security concerns due to the information leakage (IL) problem. IL involves unintentionally exposing sensitive information to unauthorized parties via observable system information. Conventional statistical approaches rely on estimating mutual information (MI) between observable and secret information for detecting ILs, face challenges of the curse of dimensionality, convergence, computational complexity, and MI misestimation. Though effective, emerging supervised machine learning based approaches to detect ILs are limited to binary system sensitive information and lack a comprehensive framework. To address these limitations, we establish a theoretical framework using statistical learning theory and information theory to quantify and detect IL accurately. Using automated machine learning, we demonstrate that MI can be accurately estimated by approximating the typically unknown Bayes predictor's log-loss and accuracy. Based on this, we show how MI can effectively be estimated to detect ILs. Our method performs superior to state-of-the-art baselines in an empirical study considering synthetic and real-world OpenSSL TLS server datasets.

14.
arXiv (CS.AI) 2026-06-19

Toward Calibrated Mixture-of-Experts Under Distribution Shift

arXiv:2606.20544v1 Announce Type: new Abstract: Calibration aligns a model's predictive uncertainty with the frequencies of its empirical outcomes and is important for understanding and trusting reported probabilities. Recent work shows that enforcing calibration at the level of individual predictors can improve ensemble accuracy and calibration, with mixture-of-experts (MoE) models showing strong empirical improvements in particular; however, the conditions under which calibration helps MoE are not well understood. In this work, we study how MoE models behave under distribution shift, focusing on how routing mechanisms interact with expert-level calibration. We show that expert calibration is sufficient to ensure calibration of the overall model under a broad class of distribution shifts in hard-routed models, but is insufficient for calibrating soft-routed models. To address this, we propose an adversarial reweighting that penalizes calibration errors of the routed aggregate under distribution shift, and we demonstrate that it improves the accuracy-calibration tradeoff both on average and on difficult subsets of the data, across model classes, prediction tasks, and distribution shifts.

15.
Nature (Science) 2026-06-17

A blastoporal organizer in a ctenophore

In an iconic experiment in 1924, Hilde Mangold and Hans Spemann established that the dorsal blastopore lip of amphibian embryos functions as an organizer and induces a secondary body axis when transplanted into a host embryo1. This discovery demonstrated that specific embryonic regions can regulate embryonic patterning and lead to the establishment of an entire body axis. Subsequent studies have revealed that cnidarians, the sister group to Bilateria, also possess a blastoporal embryonic organizer2,3. However, the evolutionary origin of the organizer remains unclear. Here we report that the blastopore lip of the ctenophore Mnemiopsis leidyi, a member of the evolutionary sister group to all other metazoans4,5, exhibits organizer activity. We show that transplanted fragments of blastopore lip tissue from M. leidyi gastrula induce secondary pharynx and mouth formation. Moreover, transphyletic transplantation experiments show that the blastopore lip of M. leidyi leads to the generation of a secondary body axis in embryos of the cnidarian Nematostella vectensis. Organizer function in M. leidyi requires both β-catenin and TGFβ signalling, and the TGFβ-family ligands probably provide this inductive capacity. These findings reveal the deep homology of the blastoporal organizer in ctenophores, cnidarians and vertebrates, implying the ancestral organizer role of the blastopore lip. We propose that the emergence of the organizer was an essential innovation that facilitated the change from the temporal cell differentiation of unicellular relatives to the spatial cell differentiation of the first multicellular embryo. Experiments using the comb jelly Mnemiopsis leidyi and the sea anemone Nematostella vectensis reveal that the emergence of a core signalling pathway may have been a key innovation enabling the transition to multicellularity in animals.

16.
arXiv (CS.AI) 2026-06-12

Representing Time Series as Structured Programs for LLM Reasoning

arXiv:2606.12481v1 Announce Type: cross Abstract: Large language models (LLMs) have demonstrated strong reasoning and instruction-following capabilities, making them potentially powerful tools for time-series analysis. However, time series lie outside their native textual modality, raising a fundamental question: how should time series be represented so that LLMs can reason about them effectively? Existing work typically serializes raw numerical sequences or fine-tunes pre-trained LLMs on time-series data. These approaches place the burden of extracting temporal structure directly on the LLM, creating a modality mismatch that often degrades performance on long sequences and introduces substantial computational overhead. In this work, we introduce Time-Series-to-Structured-Program representation (T2SP), a deterministic, training-free method that represents a time series as a structured symbolic program. T2SP decomposes time series into trends, periods, and salient events, expressing them in a program-friendly format aligned with the textual and code-like modalities on which LLMs are natively trained. By shifting temporal-structure extraction from the model to the representation itself, T2SP enables off-the-shelf LLMs to leverage their existing reasoning capabilities for time-series understanding. We evaluate T2SP on three reasoning tasks – editing, captioning, and question answering – where it consistently improves performance, reduces reasoning time, and lowers failure rates compared with raw-string representations. Our results demonstrate that T2SP provides an effective interface between time series and LLMs.

17.
arXiv (CS.CV) 2026-06-16

Structure-Semantic Co-optimized Latent Diffusion Model for Fast Visual Anagram Synthesis

Visual anagram is an intriguing form of art creation wherein a single image presents different conceptual interpretations under transformations such as flipping or rotation. Recent work has achieved visual anagram synthesis by leveraging pretrained text-to-image (T2I) diffusion models, yet still suffers from several key limitations including computational inefficiency, suboptimal aesthetic quality, and weak semantic fidelity and expressiveness. This work focuses on generating visual anagrams with substantially improved visual quality at minimal computational cost, thereby advancing intelligent creation of illusionary digital art. To increase image resolution while reducing time overhead, we adapt the cutting-edge parallel denoising algorithm from pixel-based T2I model to the adversarially distilled latent-based one, and accordingly propose a structure-semantic co-optimization (S2CO) framework to counteract the consequent visual degradation. As the core of our approach, S2CO framework comprises three key innovations: (\romannumeral1) null-text structure alignment optimization; (\romannumeral2) semantic enhancement optimization; (\romannumeral3) attention-guided noise fusion. Building upon these components, our method dubbed S2CO-Anagram is able to generate higher-resolution anagram images with noticeably superior visual harmony and semantic faithfulness than related SOTA approaches, all while achieving substantially faster inference speed. Code will be publicly available.

18.
arXiv (CS.CV) 2026-06-11

MLT-Dedup: Efficient Large-Scale Online Video Deduplication via Multi-Level Representations and Spatial-Temporal Matching

The explosive growth of user-generated video content on online platforms is accompanied by the emergence of numerous near-duplicate videos–videos that are identical or highly similar but differ by partial edits. These duplicates degrade user experience and increase storage and bandwidth costs, making large-scale video deduplication a critical task. Existing video deduplication frameworks face a fundamental challenge in retrieving sufficient high-quality candidates under a limited index budget, as well as trade-offs between efficiency and precision. To address these issues, we propose MLT-Dedup, an efficient large-scale online video deduplication framework with Multi-Level representations and spatial-Temporal matching. Our approach employs a Multi-Level Video Encoder (ML-VE) to extract both fine-grained frame-level and sparse clip-level embeddings: sparse embeddings support efficient candidate retrieval, while fine-grained embeddings are loaded for precise pairwise matching. During matching, we introduce DiF-SiM, a Differential Feature-enhanced Similarity Module capable of locating duplicated temporal segments and providing reliable similarity evidence to support policy-driven deduplication decisions. Extensive experiments on a real-world large-scale platform demonstrate that MLT-Dedup reduces online repetition rates by 91% at 90% precision. Furthermore, our sparse retrieval design achieves a 5x increase in indexing capacity, enabling broader candidate coverage in real-world deployment.

19.
arXiv (quant-ph) 2026-06-12

Quantum-Driven Neuromorphic Computing for Million-Qubit-Scale Workloads

arXiv:2606.12968v1 Announce Type: new Abstract: We introduce Apollo, a 10000 node p-qubit neuromorphic processor fabricated in 16 nm mixed signal CMOS and operating fully at room temperature with a typical analog core power envelope of about 0.5 W. Its fundamental element, the p-qubit, is a bistable stochastic unit whose continuous time state fluctuations are driven by integrated quantum entropy units that inject true quantum derived randomness. This enables ultrafast stochastic transitions at low energy while preserving a classical state representation. Apollo combines these p-qubits with a high degree Hyperion 256 interconnect topology, allowing efficient embedding of dense Ising and QUBO problems with substantially reduced minor embedding overhead compared with sparse annealing platforms. We show that, through the Suzuki Trotter correspondence, the equilibrium statistics and annealing dynamics of the p-qubit network reproduce key properties of transverse field quantum annealing without cryogenic cooling, long lived coherence, or microwave control. Beyond device level validation, Apollo is evaluated on a three dimensional spin glass benchmark previously used to study quantum advantage in superconducting annealers. Across 300 disorder realizations, Apollo reaches substantially lower ground state energies than reported cryogenic quantum annealing hardware, while remaining distinct from classical simulated annealing and simulated quantum annealing. A 350 nm release candidate device experimentally validates the core p-qubit dynamics, thermodynamic sampling correctness, and continuous time annealing behavior. These results establish Apollo as a room temperature, industrially scalable platform for quantum driven energy based optimization, probabilistic inference, generative modeling, and hybrid classical quantum workflows.

20.
arXiv (CS.AI) 2026-06-12

TrajGenAgent: A Hierarchical LLM Agent for Human Mobility Trajectory Generation

arXiv:2606.12657v1 Announce Type: new Abstract: Human mobility data is important for transportation, urban planning, and epidemic control, but large-scale trajectory collection is often costly and privacy-constrained, motivating realistic synthetic trajectory generation. Existing LLM-based generators typically rely on either prompt engineering, which preserves zero-shot reasoning but lacks fine-grained spatiotemporal grounding, or trajectory-level fine-tuning, which improves statistical precision but incurs substantial computational cost and may weaken general reasoning. We propose TrajGenAgent, a semantic-aware hierarchical LLM-agent framework for human mobility trajectory generation without model fine-tuning. TrajGenAgent uses a two-stage orchestrator-worker design: an LLM first synthesizes an individual- and weekday-conditioned activity chain from historical evidence via in-context learning, and a deterministic workflow then grounds each activity into a complete visit using personalized POI retrieval, distance-aware location selection, kinematics-aware travel-time propagation, and LLM-based duration estimation. To evaluate realism beyond aggregate spatiotemporal statistics, we introduce an anomaly-detection-based evaluation framework using two complementary detectors to assess behavioral and semantic plausibility. Experiments on benchmark and large-scale simulation datasets show that TrajGenAgent improves spatiotemporal fidelity, semantic coherence, and individual-specific behavioral realism over representative neural and LLM-based baselines, while avoiding parameter updates.

21.
arXiv (quant-ph) 2026-06-19

Effective discrete-modulated continuous variable QKD under general attacks

arXiv:2606.20346v1 Announce Type: new Abstract: Continuous variable quantum key distribution via discrete modulations ensures information-theoretic security using standard telecom technologies, providing affordable and scalable quantum communications with simplified classical postprocessing. However, existing security proofs against general attacks often rely on restrictive assumptions, such as a bounded dimension for coherent states, or require impractically large block sizes. In this work, we develop a finite-size security analysis that removes these limitations while incorporating realistic experimental features. Our approach combines the dimension reduction technique, a security proof based on the marginal-constrained entropy accumulation, and a trusted detector model accounting for the receiver imperfections. We report positive key rates in the finite-size regime for relevant block sizes of the order of $10^8$. These results contribute to narrowing the gap between theoretical security proofs and practical implementations of discrete-modulated continuous variable quantum key distribution protocols.

22.
arXiv (math.PR) 2026-06-19

Model-independent upper bounds for the prices of Bermudan options with convex payoffs

arXiv:2503.13328v3 Announce Type: replace-cross Abstract: Suppose $\mu$ and $\nu$ are probability measures on $\mathbb{R}$ satisfying $\mu \leq_{cx} \nu$. Let $a$ and $b$ be convex functions on $\mathbb{R}$ with $a \geq b \geq 0$. We are interested in finding $$\sup_{\mathbf{M}} \sup_{\tau} \mathbb{E}^{\mathbf{M}} \left[ a(X) I_{ \{ \tau = 1 \} } + b(Y) I_{ \{ \tau = 2 \} } \right] $$ where the first supremum is taken over consistent models $\mathbf{M}$ (i.e., filtered probability spaces $(\Omega, \mathbf{F}, \mathbb{F}, \mathbb{P})$ such that $Z=(z,Z_1,Z_2)=(\int_{\mathbb{R}} x \mu(dx) = \int_{\mathbb{R}} y \nu(dy), X, Y)$ is a $(\mathbb{F},\mathbb{P})$ martingale, where $X$ has law $\mu$ and $Y$ has law $\nu$ under $\mathbb{P}$) and $\tau$ in the second supremum is a $(\mathbb{F},\mathbb{P})$-stopping time taking values in $\{1,2\}$. Our contributions are first to characterise and simplify the dual problem, and second to completely solve the problem under some structural assumptions on the measures $\mu$ and $\nu$ (namely that $\mu$ and $\nu$ are absolutely continuous probability measures that satisfy the Dispersion Assumption). A key finding is that the canonical set-up in which the filtration is that generated by $Z$ is not rich enough to define an optimal model and additional randomisation is required. This holds even though the marginal laws $\mu$ and $\nu$ are atom-free. The problem has an interpretation of finding the robust, or model-free, no-arbitrage bound on the price of a Bermudan option with two possible exercise dates, given the prices of co-maturing European options.

23.
arXiv (CS.AI) 2026-06-19

ORAgentBench: Can LLM Agents Solve Challenging Operations Research Tasks End to End?

arXiv:2606.19787v1 Announce Type: new Abstract: Large language models are increasingly deployed as autonomous agents for multi-step tasks in executable environments, yet their ability to perform realistic operations research (OR) work remains unclear. Existing OR evaluations often decouple modeling from solving, rely on pre-formalized or text-only instances, and rarely test the full workflow from operational artifacts to validated decisions. In this work, we introduce ORAgentBench, an execution-grounded benchmark for evaluating autonomous agents on challenging end-to-end operations research tasks. It contains 107 human-reviewed tasks across diverse operational scenarios, each packaged in an isolated environment with a natural-language brief, multi-file data, configuration artifacts, and a required submission schema. Agents must write and run solution code, and their submissions are evaluated by hidden validators for schema validity, hard-constraint feasibility, and normalized objective quality. Experiments with fourteen frontier agent-model configurations show that current agents remain far from reliable OR practice. The best agent passes only 35.51% of all tasks and 20.59% of hard tasks, and many feasible submissions still fall below the required quality threshold. Failure analysis further shows that errors are dominated by strategic weaknesses, including missed operational rules, brittle formulations, weak feasible-solution construction, and insufficient solution improvement. OR-specific procedural skills increase hard-task feasibility, but do not reliably improve solution quality or pass rate. These results suggest that progress in OR agents requires moving beyond plausible optimization code toward dependable, high-quality operational decision-making.

24.
arXiv (math.PR) 2026-06-11

Arrangements of Consecutive Numbers in Mallows Permutations

arXiv:2606.12410v1 Announce Type: cross Abstract: We study the random variable that counts the number of specific arrangements of clustered consecutive numbers in permutations under the Mallows distribution. We provide an asymptotic expression for the expected value of this random variable. This result extends and tightens the previously known result by Pinsky (2022) concerning clustered consecutive numbers in Mallows permutations. Moreover, we identify a range of parameters for which the distribution of the number of arrangements of clustered consecutive numbers in Mallows permutations is close to a Poisson distribution.

25.
medRxiv (Medicine) 2026-06-22

Starting, stopping and restarting. Patterns of Methylphenidate Use over 14 years in a large public health system

Background Persistence with stimulant medication is poor in children and adolescents with ADHD, and the evidence base is derived predominantly from high-income countries. We describe methylphenidate utilisation patterns and predictors of 12-month retention across 14 years in a large South African public health service. Methods Retrospective cohort study using routine pharmacy data from the Western Cape provincial health service (2011-2024). Children aged 5-18 at first prescription were included. Treatment episodes were defined as continuous prescription sequences with no gap exceeding 90 days and classified as initiations or restarts. Logistic regression modelled 12-month retention against early visit frequency and formulation type as pre-specified exposures. Findings 421,925 prescription events for 23,243 children across 115 facilities generated 65,885 treatment episodes. Median age at first prescription was 10 years (IQR 8-12); 77.6% were male. Kaplan-Meier 12-month survival was 28.2% for initiations and 15.4% for restarts, substantially below high-income country comparators. A quarter of all initiating prescriptions were not followed by a subsequent dispensing event; nearly 40% of patients had three or more treatment episodes. Early visit frequency was the strongest predictor of 12-month retention (high vs low: OR 2.85, 95% CI 2.65-3.06). The sustained-release formulation effect was present but attenuated on multivariable adjustment. Treatment re-initiations showed a marked seasonal pattern consistent with the South African school calendar. Interpretation Twelve-month retention was markedly lower than high-income country rates. Against a backdrop of high attrition, both early visit frequency and sustained-release formulation access predicted persistence; clinical engagement and reducing structural barriers to access are modifiable factors in this setting. Funding None.