Academic Intelligence · Curated Daily

Explore the Frontiers of Global Academic Research

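AcademicHub brings together real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate literature analysis briefs across interdisciplinary fields.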

01.
arXiv (CS.CV) 2026-06-16

Mutual Distillation of Dual-Foundation Models for Semi-Supervised PET/CT Segmentation

Organ segmentation from PET/CT is critical for quantitative analysis and radiotherapy planning in oncology. To ease the high annotation cost of PET/CT segmentation, semi-supervised learning (SSL) provides a practical and effective solution for developing deep models with limited labeled data. Recent developments in visual foundation models have demonstrated remarkable adaptability with improved efficiency. In this work, we propose a mutual distillation framework that seamlessly exploits both structural and functional foundation models, which act as modality-specific generalists for distilling knowledge from structural CT and metabolic PET imaging. By bridging the gap between the task-specific precision of student models and the segmentation priors of generalist foundation models, we propose MuDuo, a mutual distillation framework that synergistically leverages SAM-Med3D for CT and SegAnyPET for PET to distill their knowledge into a lightweight student network. Our approach eliminates the need for manual prompts while maximizing the utility of unlabeled data for automatic segmentation, achieving state-of-the-art performance on the AutoPET dataset with only 5 labeled cases. Our source code is available at https://github.com/Wu-beining/MuDuo.

02.
arXiv (CS.CV) 2026-06-16

Comparing Human Gaze and Vision-Language Model Attention in Safety-Relevant Environments

Human visual attention plays an important role in how people perceive and respond to environments containing potential risks. This study investigates whether large vision-language models can identify the same regions of a scene that attract human attention in safety-relevant environments. Eye-tracking data were collected from ten participants viewing 33 scene images representing environments with varying levels of potential risk using Pupil Invisible wearable glasses. Gaze coordinates were mapped onto stimulus images to generate population-averaged human gaze heatmaps. In parallel, GPT-4o was prompted through the OpenAI Vision Application Programming Interface (API) to generate spatial predictions of visual attention, which were converted into saliency maps for comparison with human gaze patterns. Spatial alignment between human gaze heatmaps and model-generated saliency maps was evaluated using four complementary metrics: Pearson correlation (r = 0.515 ± 0.117), Normalised Scanpath Saliency (NSS = 0.988 ± 0.323), Kullback-Leibler divergence (KL = 1.766 ± 0.844), and Area Under the Receiver Operating Characteristic Curve using the Judd formulation (AUC-Judd = 0.806 ± 0.076). A cross-model comparison with Gemini Pro, Gemini Flash, and Claude showed that all models exceeded the AUC-Judd chance baseline of 0.5 and achieved positive NSS scores. Gemini Pro demonstrated the strongest spatial localisation according to three of the four metrics, whereas GPT-4o produced the closest distributional match to human attention as measured by KL divergence. These findings suggest that large vision-language models can identify regions that broadly correspond to where humans direct visual attention in safety-relevant scenes without requiring eye-tracking training data. The results highlight the potential of vision-language models as a scalable tool for approximating human attentional patterns.
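For readers unfamiliar with the alignment metrics named above, the following is a minimal sketch of three of them (Pearson correlation, NSS, and KL divergence) using their common textbook definitions. It is not the study's evaluation code: the flattened-list representation, the function names, the `eps` smoothing term, and the toy maps are all assumptions made for illustration, and AUC-Judd is omitted for brevity.

```python
import math

def pearson_cc(model_map, human_map):
    """Pearson correlation between two flattened saliency maps."""
    n = len(model_map)
    mean_m = sum(model_map) / n
    mean_h = sum(human_map) / n
    cov = sum((m - mean_m) * (h - mean_h) for m, h in zip(model_map, human_map))
    var_m = sum((m - mean_m) ** 2 for m in model_map)
    var_h = sum((h - mean_h) ** 2 for h in human_map)
    return cov / math.sqrt(var_m * var_h)

def nss(model_map, fixation_indices):
    """Mean z-scored model saliency at the fixated pixel indices."""
    n = len(model_map)
    mean_m = sum(model_map) / n
    std_m = math.sqrt(sum((m - mean_m) ** 2 for m in model_map) / n)
    z_at_fixations = [(model_map[i] - mean_m) / std_m for i in fixation_indices]
    return sum(z_at_fixations) / len(z_at_fixations)

def kl_divergence(model_map, human_map, eps=1e-12):
    """KL(human || model) after normalising both maps to sum to 1."""
    total_m, total_h = sum(model_map), sum(human_map)
    kl = 0.0
    for m, h in zip(model_map, human_map):
        p, q = h / total_h, m / total_m
        kl += p * math.log((p + eps) / (q + eps))  # eps avoids log(0)
    return kl

# Toy 2x2 maps, flattened row by row (hypothetical values).
model_saliency = [1.0, 2.0, 3.0, 4.0]
human_heatmap = [2.0, 4.0, 6.0, 8.0]   # same shape as the model map, rescaled
cc_value = pearson_cc(model_saliency, human_heatmap)
kl_value = kl_divergence(model_saliency, human_heatmap)
nss_value = nss(model_saliency, fixation_indices=[3])  # one fixation on the brightest pixel
```

Because the toy heatmap is just a rescaled copy of the model map, the correlation is 1 and the divergence is 0; real comparisons such as those reported above fall between these extremes.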

03.
arXiv (CS.AI) 2026-06-11

Resource-Aware LLM Reasoning for Mobile Edge General Intelligence

arXiv:2509.23248v3 Announce Type: replace Abstract: The rapid advancement of large language models (LLMs) has enabled the emergence of agentic artificial intelligence (AI) with powerful reasoning and autonomous decision-making capabilities. This integration with edge computing has led to the development of Mobile Edge General Intelligence (MEGI), which brings real-time, privacy-preserving reasoning to the network edge. However, deploying LLM-based agentic AI reasoning in MEGI environments poses significant challenges due to the high computational demands of reasoning and the limited resources of edge devices. To address these challenges, we propose a joint optimization framework for efficient LLM reasoning deployment in MEGI. First, we systematically review enhancement methods to identify mechanisms suitable for edge adaptation. Subsequently, we present a distributed framework that synergizes reasoning enhancement via adaptive CoT prompting with scalable deployment through a distributed MoE architecture. An important innovation of this approach involves modeling reasoning depth as a dynamic network resource variable, which is optimized jointly with expert activation and transmission power. This mechanism allows the system to dynamically regulate expert networks and reasoning complexity according to task requirements and device capabilities. Experimental evaluations in mobile edge environments demonstrate that the proposed framework effectively balances reasoning quality and resource efficiency. The results show that with less than one second of additional inference time, both accuracy and latency satisfaction rate can reach 90%, validating the practical viability of deploying sophisticated LLM reasoning in resource-constrained MEGI systems.

04.
arXiv (CS.LG) 2026-06-19

Representing Piecewise-Linear Functions by Functions with Minimal Arity

arXiv:2406.02421v2 Announce Type: replace-cross Abstract: Any continuous piecewise-linear function $F\colon \mathbb{R}^{n}\to \mathbb{R}$ can be represented as a linear combination of $\max$ functions of at most $n+1$ affine-linear functions. In our previous paper [``Representing piecewise linear functions by functions with small arity'', AAECC, 2023], we showed that this upper bound of $n+1$ arguments is tight. In the present paper, we extend this result by establishing a correspondence between the function $F$ and the minimal number of arguments that are needed in any such decomposition. We show that the tessellation of the input space $\mathbb{R}^{n}$ induced by the function $F$ has a direct connection to the number of arguments in the $\max$ functions.
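As a small illustration of such a representation (an example chosen here, not taken from the paper), consider the one-dimensional clamp function. With $n = 1$, each $\max$ below takes $n + 1 = 2$ affine-linear arguments:

```latex
F(x) = \max\bigl(0, \min(x, 1)\bigr) = \max(x, 0) - \max(x - 1, 0), \qquad x \in \mathbb{R}.
```

The identity can be checked region by region: for $x < 0$ both terms vanish, for $0 \le x \le 1$ the right-hand side is $x - 0 = x$, and for $x > 1$ it is $x - (x - 1) = 1$.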

05.
arXiv (CS.CV) 2026-06-16

MMLongEmbed: Benchmarking Multimodal Embedding Models in Long-Context Scenarios

Recent advancements have significantly expanded the theoretical context windows of Multimodal Embedding Models (MEMs). However, larger context windows do not necessarily translate into effective comprehension and representation of long-context multimodal inputs, which remains a critical bottleneck for real-world deployment. To address the lack of systematic evaluation in this setting, we introduce MMLongEmbed, the first comprehensive benchmark for evaluating MEMs in long-context scenarios. MMLongEmbed comprises four retrieval tasks spanning multiple context-length ranges, covering text, document, and video modalities. Through extensive evaluation of state-of-the-art models, we find that current architectures rely heavily on superficial feature matching and struggle to capture deep semantic and structural dependencies. We further observe that performance degradation varies systematically with context length and key information placement. Moreover, models exhibit substantially different robustness to redundant contextual information across modalities. For reproducibility, the benchmark and code are publicly available.

06.
medRxiv (Medicine) 2026-06-12

Genome-wide association and multi-omics functional screens reveal the genetic architecture of foveal development

Foveal hypoplasia causes visual impairment across congenital eye disorders, yet the genetic programmes governing foveal development remain poorly characterised and no tractable model exists for foveal disease. In the first genome-wide association study of foveal hypoplasia, we identified 42 sentinel variants mapping to 54 effector genes supported by >= 2 criteria from a variant-to-gene framework incorporating developmental multi-omics. Disruption of six effector genes using mutant lines and CRISPR knockouts in the zebrafish high acuity zone recapitulates structural, functional, and ultrastructural hallmarks of foveal hypoplasia, establishing the first vertebrate disease model. Integration with human foetal single-cell and spatial transcriptomics reveals two temporal waves of effector gene expression and identifies Muller glia as critical mediators of foveal patterning. Phenome-wide analyses reveal foveal variants are pleiotropic with refractive, lenticular, and metabolic traits, connecting foveal development to anterior segment and systemic disease biology. These findings should inform mechanistic studies of macular disease.

07.
arXiv (CS.LG) 2026-06-12

Robustness Verification of Recurrent Neural Networks with Abstraction Refinement

arXiv:2606.12490v1 Announce Type: new Abstract: Certified local robustness verification for recurrent neural networks (RNNs) is challenging because approximation errors introduced by nonlinear relaxations can propagate through recurrent connections and accumulate over time. As a result, scalable linear bound propagation methods often become overly conservative and fail to certify inputs that are in fact robust, especially when many pre-activation intervals cross zero. We propose an abstraction-refinement framework for RNN verification that partitions such intervals to remove the dominant relaxation error: on each refined branch, ReLU becomes exact, and smooth activations such as tanh and sigmoid admit substantially tighter linear envelopes. To control the combinatorial cost of splitting in long sequences, we introduce a SHAP-guided timestep selection strategy that ranks hidden states by their contribution to the verification objective and refines only the most critical timesteps in temporal order. Experiments on CIFAR10 and MNIST stroke benchmarks demonstrate consistent improvements in verification success and robustness-margin tightness over abstraction-only baselines, while exposing clear runtime trade-offs between ReLU and tanh models.
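The claim that ReLU becomes exact on each refined branch can be illustrated for a single neuron. The sketch below assumes the standard chord (triangle) upper relaxation for an interval that crosses zero; the function names are hypothetical, and the paper's recurrent bound propagation and SHAP-guided selection are not modelled.

```python
def relu_relaxation_gap(lower, upper):
    """Largest gap between ReLU and its chord upper bound on [lower, upper]."""
    if lower >= 0.0 or upper <= 0.0:
        # ReLU is linear on the whole interval, so no relaxation is needed.
        return 0.0
    # The chord joins (lower, 0) and (upper, upper); the gap peaks at x = 0.
    return -lower * upper / (upper - lower)

def split_at_zero(lower, upper):
    """Partition a zero-crossing pre-activation interval into two branches."""
    return [(lower, 0.0), (0.0, upper)]

# Hypothetical pre-activation bounds for one hidden unit.
gap_before = relu_relaxation_gap(-1.0, 3.0)
gaps_after = [relu_relaxation_gap(lo, hi) for lo, hi in split_at_zero(-1.0, 3.0)]
```

Here `gap_before` is 0.75 while both entries of `gaps_after` are 0, at the price of verifying two branches instead of one. That doubling per split is the combinatorial cost the abstract refers to.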

08.
arXiv (CS.AI) 2026-06-16

Topological Flow Matching

arXiv:2606.15897v1 Announce Type: cross Abstract: Flow matching is a powerful generative modeling framework, valued for its simplicity and strong empirical performance. However, its standard formulation treats signals on structured spaces, such as fMRI data on brain graphs, as points in Euclidean space, overlooking the rich topological features of their domains. To address this, we introduce topological flow matching, a topology-aware generalization of flow matching. We interpret flow matching as a framework for solving a degenerate Schrödinger bridge problem and inject topological information by augmenting the reference process with a Laplacian-derived drift. This principled modification captures the structure of the underlying domain while preserving the desirable properties of flow matching: a stable, simulation-free objective and deterministic sample paths. As a result, our framework serves as a drop-in replacement for standard flow matching. We demonstrate its effectiveness on diverse structured datasets, including brain fMRIs, ocean currents, seismic events, and traffic flows.

09.
arXiv (quant-ph) 2026-06-12

Explicit Quantum Circuit Simulation of Nonlinear 1-Dimensional Fluid with Carleman-linearized Boltzmann Method

arXiv:2606.12770v1 Announce Type: new Abstract: Quantum computation of fluid dynamics has attracted growing attention as a key application of fault-tolerant quantum computers anticipated in the coming decade, with lattice Boltzmann methods emerging as a particularly promising approach. Explicit and efficient elementary-gate-level circuit simulations, however, have so far been demonstrated only in the linear case. Here we include the leading nonlinearity through second-order Carleman linearization of the one-dimensional Boltzmann equation, and demonstrate, via explicit quantum-circuit simulation, the preparation of the final-time state using a Taylor-expansion-based ODE solver based on the quantum singular value transformation. With this construction, we analyze the gate and qubit complexities, which scale logarithmically with the grid size, the nonlinearity captured by the higher-order Carleman linearization, and the practical utility of higher-order expansions in the Taylor ODE solver. The construction provides a concrete baseline for computational cost reduction and further developments such as extensions to higher dimensions, complex geometries, and the extraction of physical quantities, towards industrially useful quantum CFD.

10.
arXiv (CS.AI) 2026-06-19

CareTransition-Audit: A Benchmark to Audit Discharge Summaries for Efficient Care Transitions

arXiv:2604.05435v2 Announce Type: replace Abstract: Incomplete or inconsistent discharge documentation drives care fragmentation and avoidable readmissions. Despite its critical role in patient safety, auditing discharge summaries relies on manual review and does not scale. We propose an automated framework for auditing discharge summaries using large language models (LLMs). Our approach operationalizes the DISCHARGED framework into a checklist of 46 questions. Using 50 summaries from the MIMIC-IV database, with clinician ground-truth labels, we benchmark 11 LLMs. Model-assessed mean documentation completeness ranges from 54.9% to 74.2%, and the best-performing models achieve Cohen's kappa values around 0.5 against clinician labels, indicating moderate agreement. All models struggle to identify ambiguous documentation (Unclear), highlighting a key gap in current automated auditing. This work provides a clinician-validated benchmark and zero-shot baselines for systematic quality improvement in clinical documentation.
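The agreement statistic quoted above, Cohen's kappa, corrects raw agreement for the agreement expected by chance. The sketch below uses the standard two-rater definition; the label set ("Yes", "No", "Unclear") and the example answers are hypothetical and are not drawn from the benchmark.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical checklist answers for eight questions.
clinician = ["Yes", "Yes", "No", "No", "Unclear", "Yes", "No", "Yes"]
model = ["Yes", "No", "No", "No", "Yes", "Yes", "No", "Yes"]
kappa = cohens_kappa(clinician, model)
```

In this toy case the raters agree on 6 of 8 items (0.75), chance agreement is 0.4375, and kappa is about 0.56.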

11.
arXiv (CS.LG) 2026-06-15

Shuttling Compiler for Trapped-Ion Quantum Computers Based on Large Language Models

arXiv:2512.18021v3 Announce Type: replace-cross Abstract: We present the first shuttling compiler based on large language models (LLMs) for trapped-ion quantum computers, where qubits are shuttled between segments for gate execution and qubit storage. We fine-tune pre-trained LLMs on examples from linear and branched one-dimensional shuttling architectures. Thus, we obtain a layout-independent compilation strategy that learns the required shuttling operations directly from data. Using benchmark circuits with up to 16 qubits, such fine-tuned LLMs can now generate valid schedules for shuttling architectures. Notably, we also obtain a valid schedule for a previously unseen four-way junction layout. This demonstrates that trained LLMs can generalize to layouts not encountered during training. For various architectures, LLM-based schedules improve upon state-of-the-art baseline compiler results, reducing the shuttling effort by up to 15%.

12.
arXiv (CS.LG) 2026-06-15

LoMC: Localized Multidirectional Correction for Refusal Suppression in Routed Foundation Models

arXiv:2606.13709v1 Announce Type: cross Abstract: We study controlled post-training refusal suppression in routed MoE and hybrid-MoE foundation models, aiming to increase non-refusal target-response behavior while preserving general capability under a compact intervention footprint. Existing broad direction-based edits can perturb general-purpose computation, whereas support-only expert edits often lack sufficient capacity to correct heterogeneous refusal representations. To address this limitation, we introduce Localized Multidirectional Correction (LoMC), a support-gated intervention framework that follows a support-then-correction execution order: it first identifies a compact edit support, then aggregates prototype correction directions into layer-wise correction directions, and finally applies rank-one layer-wise correction only within the selected support. By using the edit support as a structural gating constraint, LoMC increases correction capacity without expanding the intervention scope. Experiments on text-only and multimodal safety benchmarks across four routed backbones show that LoMC substantially improves non-refusal target-response behavior while maintaining general capability under a compact intervention footprint.

13.
medRxiv (Medicine) 2026-06-16

Daily Healthy Eating Index (HEI-2020) scoring reveals diet quality patterns masked by aggregation

The Healthy Eating Index (HEI-2020) is conventionally computed by aggregating intake across days before scoring. Digital food logging enables an alternative: scoring each day and averaging daily scores. These methods are not equivalent. The HEI's density-based structure and component caps cause aggregation to inflate adequacy scores when intake is irregular. Using Food & You data, we show daily HEI correlates more strongly with microbiome diversity, and recommend co-reporting both metrics.
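The non-equivalence of the two scoring orders follows from the cap on each density-based component, and a two-day toy case shows it. The sketch assumes a single hypothetical adequacy component scored linearly up to 5 points at 0.8 units per 1,000 kcal; these numbers and the intake values are illustrative, not the study's data or the full HEI-2020 algorithm.

```python
def component_score(amount, kcal, standard=0.8, max_points=5.0):
    """Linear density-based score, capped at max_points."""
    density = amount / (kcal / 1000.0)  # units per 1,000 kcal
    return min(max_points, max_points * density / standard)

# (amount, kcal) per day: one day with high intake, one day with none.
days = [(4.0, 2000.0), (0.0, 2000.0)]

# Conventional approach: pool intake across days, then score once.
total_amount = sum(amount for amount, _ in days)
total_kcal = sum(kcal for _, kcal in days)
aggregate_score = component_score(total_amount, total_kcal)

# Daily approach: score each day, then average the scores.
daily_score = sum(component_score(a, k) for a, k in days) / len(days)
```

Pooling gives the full 5 points because the surplus on the first day offsets the empty second day before the cap applies, whereas the daily average is 2.5. This is the inflation under irregular intake that the abstract describes.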

14.
Nature (Science) 2026-06-22

Isotopic evidence for a cold and distant origin of 3I/ATLAS

Interstellar objects provide the only directly observable samples of icy planetesimals formed around other stars, and can therefore provide insight into the diversity of physical and chemical conditions occurring during exoplanet formation (refs. 1–3). Here we report isotopic measurements of the interstellar comet 3I/ATLAS, which reveal an elemental composition unlike any Solar System body. The water in 3I/ATLAS is enriched in deuterium, at a level of D/H = (0.98 ± 0.06)%, which is more than an order of magnitude higher than in known comets, while its range of 12C/13C ratios (141–191 for CO2 and 123–172 for CO) exceeds typical values found in the Solar System, as well as nearby interstellar clouds and protoplanetary disks. Such extreme isotopic signatures indicate formation at temperatures ≲ 30 K in a relatively metal-poor environment. When interpreted with respect to models for Galactic chemical evolution, the carbon isotopic composition implies that 3I/ATLAS may have accreted as long ago as 12 billion years, following a period of intense, early star formation. 3I/ATLAS thus represents a preserved fragment of an ancient planetary system.

15.
arXiv (quant-ph) 2026-06-19

A Finite-Volume Scheme for the Continuum Extrapolation of Lattice Step-Scaling in (2+1)D Hamiltonian U(1) Gauge Theory

arXiv:2606.20029v1 Announce Type: cross Abstract: We propose a finite-volume scheme to perform controlled continuum extrapolations of the lattice step-scaling function, a key ingredient for determining the running coupling in a Hamiltonian lattice gauge theory in small volumes. As a testbed, we employ a dual Hamiltonian formulation of pure U(1) gauge theory in (2+1) dimensions and an operator basis that remains efficient toward weak coupling. We describe the implementation of static external charges on the spatial lattice and study, using matrix product states, the resulting confining string, from which we extract the static potential and a force-based renormalized coupling. Using the proposed finite-volume scheme, we demonstrate a stable continuum limit of the step-scaling function on the lattice sizes accessible to present Hamiltonian simulations. The method is readily extendable to other gauge groups and dimensions, providing a pathway toward Hamiltonian step-scaling studies in other theories.

16.
arXiv (math.PR) 2026-06-11

Stochastic epidemic model with varying infectivity and waning immunity: the law of large numbers with unbounded infectivity

arXiv:2606.11845v1 Announce Type: new Abstract: We revisit the large population limit of our epidemic model with infection age dependent infectivity and progressive immunity waning, under the assumption that the supremum in $t$ of the random infectivity function has a finite expectation, while the previous proofs assumed that this supremum admits a deterministic upper bound.

17.
arXiv (CS.AI) 2026-06-15

Capability Minimization as a Safety Primitive: Risk-Aware Causal Gating for Least-Privilege LLM Agents

arXiv:2606.13884v1 Announce Type: new Abstract: Modern decision systems increasingly rely on learned components whose outputs may be confident yet wrong, exposing downstream actions to costly errors. We introduce Risk-Aware Causal Gating (RACG), a framework that decides whether to act on, defer, or abstain from a model's prediction by combining causal effect estimation with calibrated risk control. RACG models the causal pathway from candidate actions to outcomes and gates each decision according to an estimated counterfactual risk rather than raw predictive confidence. To make gating reliable, we derive distribution-free bounds on the probability of acting under high-risk conditions and show how these bounds translate into operating thresholds that satisfy user-specified safety constraints. We further propose an adaptive gating policy that adjusts to distribution shift by monitoring discrepancies between predicted and realized outcomes, tightening the gate when causal assumptions appear violated. Across simulated interventions and real-world decision benchmarks, RACG reduces high-cost errors substantially while preserving most of the utility of an ungated policy, and it outperforms confidence-based and selective-prediction baselines at matched abstention rates. Our results indicate that explicitly separating causal risk from predictive uncertainty yields decision systems that are both safer and more transparent, offering a principled mechanism for trustworthy automation in high-stakes settings.

18.
medRxiv (Medicine) 2026-06-18

Cardiac rhythm development: A wearable device index of risk for physical and mental illness in adolescence

Objective. The autonomic nervous system, which regulates cardiac rhythm, undergoes pronounced maturation across adolescence. How cardiac rhythm develops over this period, however, and whether individual differences in its development forecast mental and physical illness, remain open questions. We used three waves of Fitbit data from the Adolescent Brain Cognitive Development (ABCD) Study to characterize the developmental trajectory of the cardiac rhythm and to test whether variation in that trajectory predicts onset of psychopathology and cardiometabolic disease. Methods. 8,301 adolescents contributed 242,811 valid Fitbit wear days across Waves 2 (Mage=12), 4 (Mage=14), and 6 (Mage=16). Cosinor mixed-effects models yielded three rhythm parameters per session: mesor (24-hour mean), amplitude (diurnal swing), and acrophase (peak timing). We first characterized age- and sex-specific trajectories, cross-wave stability, and factors shaping the rhythm. We then used parallel-process latent growth models to test whether within-person changes in rhythm tracked symptom trajectories, and hierarchical logistic models to test whether rhythm parameters predicted the first clinical onset of psychopathology and of obesity and hypertension. Results. The cardiac rhythm changed substantially across adolescence: mesor decreased, amplitude flattened, and acrophase shifted later. Within-person change in the rhythm tracked change in blood pressure, BMI, and trajectories of depression and ADHD symptoms. Higher mesor predicted incident onset of all five outcomes controlling for demographics, baseline symptoms, and behavior (ORs 1.36-1.54); amplitude, acrophase, and rhythm instability conferred additional risk. Conclusions. The 24-hour cardiac rhythm is a passively measurable substrate of adolescent autonomic development that indexes transdiagnostic risk for psychiatric and cardiometabolic illness.
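The three rhythm parameters can be made concrete with a single-series cosinor fit, y(t) = mesor + amplitude * cos(2π(t - acrophase) / 24). The sketch below is a simplification of the mixed-effects models described above: it assumes equally spaced samples covering exactly one 24-hour period starting at t = 0, in which case least squares reduces to the closed-form sums used here. The function name and the synthetic heart-rate values are hypothetical.

```python
import math

def cosinor_fit(values, period_hours=24.0):
    """Return (mesor, amplitude, acrophase_hours) for one evenly sampled period."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three samples per period")
    step = period_hours / n
    omega = 2.0 * math.pi / period_hours
    mesor = sum(values) / n  # rhythm-adjusted mean
    # Cosine and sine coefficients; exact for evenly spaced full-period data.
    beta = 2.0 / n * sum(v * math.cos(omega * i * step) for i, v in enumerate(values))
    gamma = 2.0 / n * sum(v * math.sin(omega * i * step) for i, v in enumerate(values))
    amplitude = math.hypot(beta, gamma)
    acrophase_hours = (math.atan2(gamma, beta) / omega) % period_hours
    return mesor, amplitude, acrophase_hours

# Synthetic hourly heart rate: mean 70 bpm, swing 10 bpm, peak at 15:00.
hourly_bpm = [70.0 + 10.0 * math.cos(2.0 * math.pi * (t - 15.0) / 24.0) for t in range(24)]
mesor, amplitude, acrophase = cosinor_fit(hourly_bpm)
```

On this noiseless series the fit recovers the generating values up to floating-point error. Irregular wear times or missing hours would call for a general least-squares solve instead of the closed-form sums.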

19.
arXiv (CS.AI) 2026-06-16

Faster Completion, Less Learning: Generative AI Reduced Study Time on Math Problems and the Knowledge They Build

arXiv:2605.21629v2 Announce Type: replace-cross Abstract: How much have students' ordinary learning processes shifted in response to generative AI, and how does that affect their durable learning outcomes? Self-report surveys show little change, while small-scale behavioral studies report widespread AI use without the scale or duration to measure learning consequences. We address both questions using a ten-year panel of 3.2 million ALEKS learning interactions for investigating time-on-task, complemented by ALEKS PPL placement-assessment data for examining proctoring and learning outcomes, with a quasi-experimental design exploiting variation in tasks that are more susceptible to AI (text-based word problems) and less susceptible to AI (interactive graph-based problems). Learning time on AI-susceptible problems declines 2.8% per quarter among college students after ChatGPT's release, cumulating to 26.9% over eleven quarters; high-schoolers show 31.3%, middle-schoolers 9.0%, and Grade 5 students no detectable change. Among college students, the post-ChatGPT divergence vanishes entirely under proctoring, ruling out broad efficiency gains as the likely explanation. Logistic fixed-effects models on randomly assigned proctored retention items yield a 25% cumulative decline in odds of correct response; the same estimator on non-proctored assessment produces a large opposite-signed increase – inconsistent with any platform, cohort, or curriculum explanation. These results are among the first large-scale behavioral and outcome evidence that generative AI has altered how students study and the knowledge they build – the population-level indicator of cognitive surrender, with direct implications for educational research, assessment governance, and AI policy.

20.
arXiv (quant-ph) 2026-06-11

Recirculating Quantum Photonic Networks for Fast Deterministic Quantum Information Processing

arXiv:2602.11033v2 Announce Type: replace Abstract: A fundamental challenge in photonics-based deterministic quantum information processing is to realize key transformations on time scales shorter than those of detrimental decoherence and loss mechanisms. This challenge has been addressed through device-focused approaches that aim to increase nonlinear interactions relative to decoherence rates. In this work, we adopt a complementary architecture-focused approach by proposing a recirculating quantum photonic network (RQPN) that minimizes the duration of quantum information processing tasks, thereby reducing the requirements on nonlinear interaction rates. The RQPN consists of a network of all-to-all connected nonlinear cavities with dynamically controlled waveguide couplings, and it processes information by capturing a photonic input state, recirculating photons between the cavities, and releasing a photonic output state. We demonstrate the RQPN's architectural advantage through two examples: first, we show that processing all qubits simultaneously yields faster operations than single- and two-qubit decompositions of the three-qubit Toffoli gate. Second, we demonstrate implementations of a measurement-free correction for single-photon loss, achieving up to seven-fold speedups and significantly improved hardware efficiency relative to state-of-the-art architecture proposals. Our work shows that a single hardware-efficient recirculating architecture substantially reduces the temporal overhead of multi-qubit gates and quantum error correction, thereby lowering the barrier to experimental realizations of deterministic photonic quantum information processing.

21.
arXiv (CS.AI) 2026-06-11

MODF-SIR: A Multi-agent Omni-modal Distilled Framework for Social Intelligence Reasoning

arXiv:2606.12018v1 Announce Type: new Abstract: We propose a multi-agent collaborative framework built upon a lightweight Multimodal Large Language Model (MLLM), specifically designed for social intelligence reasoning. A key feature of our approach is that both the training and inference phases are augmented via knowledge distillation. Within this architecture, multi-modal data pertinent to social intelligence is precisely localized. Furthermore, relevant long-tail events are identified, extracted, and rendered as formatted, explicit text. This formatting strategy prevents critical long-tail information from being overshadowed by head events and environmental noise during the tokenization process. Specifically, we integrate Test-Time Adaptation (TTA) across the entire reasoning pipeline, encompassing the extraction and representation of long-tail events, Chain-of-Thought (CoT) prompting, and self-reflection. This TTA mechanism is also distillation-enhanced, utilizing Low-Rank Adaptation (LoRA) to fine-tune the foundation model exclusively for instance-level reasoning. Extensive evaluations against various open-source and proprietary AI models across multiple benchmarks demonstrate the effectiveness of the proposed framework. With around 30% of training data from IntentTrain, we achieve state-of-the-art results. Codes are available at https://github.com/eeee-sys/MODF-SIR, demo is available at https://huggingface.co/spaces/Harry-1234/MODF-SIR, LoRA is available at https://huggingface.co/Harry-1234/MODF-SIR and the dataset for training router is available at https://huggingface.co/datasets/Harry-1234/IntentRouterTrain.

22.
arXiv (quant-ph) 2026-06-19

QMCtwin: Master-Equation Simulation of Syndrome Statistics Beyond Pauli Noise

arXiv:2606.19848v1 Announce Type: new Abstract: As quantum error correction moves toward large-scale experimental implementations, decoder performance increasingly depends on how faithfully hardware noise is translated into syndrome statistics. Standard stabilizer workflows achieve scalability by replacing device dynamics with stochastic Pauli or detector-error models, but this compression can discard coherent phase information, nonunital drift, continuous-time effects of always-on couplings, and correlations generated by simultaneous Hamiltonian and dissipative evolution. Here we present QMCtwin, a sign-problem-suppressed quantum Monte Carlo framework for master-equation simulation of QEC circuits, and apply it to a full syndrome-extraction round of a distance-7 rotated surface code with 97 physical qubits. The open-system model includes realistic superconducting-device noise mechanisms such as relaxation, pure dephasing, coherent gate miscalibration, residual ZZ crosstalk, and drive-qubit detuning. By directly estimating syndrome observables from the QMC-generated stochastic density matrix estimator, we compare the master-equation dynamics with their Pauli-twirled Clifford simulation counterparts. QMCtwin predicts syndrome-extraction biases and correlations between syndromes and proxies of logical-string-parity that are absent or strongly suppressed in the stochastic Pauli description. We introduce information-theoretic diagnostics that further quantify how information concerning syndromes versus string-parity proxies differs between the realistic master-equation simulation and the corresponding Pauli-twirled model. These results show that QMC-based master-equation digital twins can expose noise features hidden by conventional Pauli/Clifford noise models and provide a practical path toward more accurate decoder-facing syndrome models.

23.
medRxiv (Medicine) 2026-06-15

A controlled human infection model for symptomatic pertussis in North America using the pertactin-producing clinical isolate D420

Background. Despite widespread vaccination, pertussis remains a poorly controlled disease globally and results in substantial annual morbidity and mortality, particularly in young children. Controlled human infection models (CHIMs) using the causative agent Bordetella pertussis are promising systems to enable the study of pertussis disease pathogenesis and immunology and to rapidly assess vaccines and therapeutics. While a pertussis CHIM that produces asymptomatic infection has been established in Europe, the development of a CHIM that leads to symptomatic illness would be advantageous for evaluating vaccine efficacy against both infection and disease. Methods. Healthy participants 18-40 years of age were inoculated intranasally with one of eight doses (ranging from 10^4 to 10^8 colony forming units (CFU)) of the pertactin-producing B. pertussis isolate D420 at the challenge facility within the Canadian Center for Vaccinology (Nova Scotia, Canada). The study occurred in two stages. In stage one, the B. pertussis dose was escalated in cohort groups of five to six participants until reaching an endpoint where 70-90% of participants exhibited mild (non-severe, Grade 1 or 2) symptomatic infection, defined as the Human Infectious Dose 70-90 (HID70-90). In stage two, additional challenges were conducted for doses below, at, and above the identified HID70-90 to characterize the emerging pertussis model. For all challenge doses, participants were closely monitored during an inpatient stay of up to 24 days and post-discharge for laboratory-confirmed infection, pertussis symptoms, safety, and IgG antibody responses to four B. pertussis antigens including pertussis toxin, filamentous hemagglutinin, fimbriae, and pertactin. All participants received a five-day course of azithromycin, where timing of initiation depended on B. pertussis testing and symptoms. The study was conducted between July 4, 2022 and March 19, 2025.
Findings: Seventy-five participants were inoculated with one of the eight B. pertussis D420 challenge doses and completed the inpatient stay. From the stage-one dose escalation, we found that 10^7 CFU of B. pertussis D420 was the lowest dose that achieved the HID70-90, where 9 of 12 participants (75.0%) exhibited mild symptomatic infection. Following stage-two challenges, 16 of 22 total participants at 10^7 CFU (72.7%) developed mild symptomatic infection, thus verifying the HID70-90. The symptomatic infection rate below the HID70-90, at 5×10^6 CFU of D420, was 20.0%; the rates above the HID70-90, at 5×10^7 and 10^8 CFU, were 58.3% and 55.6%, respectively. Symptoms with elevated frequency in symptomatic infection (relative to background symptoms in non-infected participants) included nasal congestion, runny nose, fatigue, malaise, and cough. At the HID70-90, 50% of symptomatic infections included cough. Serological analyses of the four highest (stage-two) challenge doses (5×10^6, 10^7, 5×10^7, and 10^8 CFU) revealed that antibody titres increased over time post-challenge. Seroconversion for at least one of the four studied antibodies was nearly twice as common in symptomatic infection (70.0%) as in asymptomatic infection (35.7%) and was absent (0%) in non-infected participants. All infections were cleared following azithromycin treatment (100%) and there were no study-related serious adverse events. Interpretation: A safe and reproducible symptomatic pertussis CHIM was achieved, providing a model for research on pertussis disease pathogenesis and immunology and for assessing vaccines and therapeutics. (ClinicalTrials.gov, NCT05136599).
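
The arithmetic behind the dose-selection criterion can be checked in a few lines. The sketch below is only an illustration: it recomputes the two symptomatic-infection rates reported at the selected dose from the counts given in the abstract and tests them against the 70-90% target window. The function names are hypothetical and are not taken from the study.

```python
def symptomatic_rate(symptomatic: int, total: int) -> float:
    """Percentage of challenged participants with symptomatic infection."""
    return 100.0 * symptomatic / total


def within_target(rate: float, low: float = 70.0, high: float = 90.0) -> bool:
    """True if a rate (in percent) falls inside the 70-90% target window."""
    return low <= rate <= high


# Counts reported in the abstract for the selected dose.
stage_one = symptomatic_rate(9, 12)   # stage-one dose escalation
combined = symptomatic_rate(16, 22)   # after the stage-two challenges

print(f"stage one: {stage_one:.1f}% in window: {within_target(stage_one)}")
print(f"combined:  {combined:.1f}% in window: {within_target(combined)}")
```

Both counts fall inside the window, whereas the rates reported for the two higher doses (58.3% and 55.6%) fall below it.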

24.
arXiv (CS.CV) 2026-06-17

Pareto LoRA: Mitigating Modality Imbalance in Unified Multimodal Models via Pareto-Optimal Gradient Integration

Unified multimodal models (UMMs) have recently emerged as a promising paradigm for integrating multimodal understanding and generation within a single autoregressive transformer. However, during multimodal instruction tuning, these models often exhibit pronounced modality imbalance: language gradients dominate optimization, which lowers image generation quality, especially under parameter-efficient fine-tuning such as LoRA. In this work, we systematically analyze modality imbalance in LoRA-based fine-tuning of UMMs for interleaved text-image generation. We show that vision modality performance degrades substantially more than text modality performance when compared to unimodal counterparts, and that modality-specific gradients can differ by orders of magnitude across tasks and layers. Motivated by this observation, we reformulate multimodal instruction tuning as a bi-objective optimization problem and propose Pareto LoRA, a Pareto-optimal gradient integration strategy that balances the text and image objectives by modulating the gradient direction and strength. Experiments on the CoMM benchmark with Emu2 demonstrate that Pareto LoRA consistently improves multimodal generation balance, achieving up to 44.9% gains in perceptual image quality over vanilla LoRA while maintaining comparable text performance.
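
The abstract does not spell out the gradient integration rule. As a rough illustration of what balancing two objectives at the gradient level can look like, the sketch below uses the generic two-objective min-norm combination (the closed form known from multiple-gradient descent), not the Pareto LoRA procedure itself. The function names and toy gradients are assumptions made for this example.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def min_norm_weight(g_text, g_image):
    """Weight alpha in [0, 1] minimising ||alpha*g_text + (1-alpha)*g_image||.

    Generic two-objective min-norm solution; an assumption for illustration,
    not the rule used by Pareto LoRA.
    """
    diff = [a - b for a, b in zip(g_text, g_image)]
    denom = dot(diff, diff)
    if denom == 0.0:
        return 0.5  # identical gradients: any weighting gives the same result
    alpha = dot([b - a for a, b in zip(g_text, g_image)], g_image) / denom
    return min(1.0, max(0.0, alpha))


def combine(g_text, g_image):
    """Combined update direction from the two modality gradients."""
    alpha = min_norm_weight(g_text, g_image)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(g_text, g_image)]


# Toy case: the text gradient is ten times larger than the image gradient.
g_text, g_image = [10.0, 0.0], [0.0, 1.0]
d = combine(g_text, g_image)
print(dot(d, g_text) > 0, dot(d, g_image) > 0)
```

In the toy case the larger text gradient is down-weighted, so the combined direction has a positive inner product with both gradients instead of being dominated by the text objective.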

25.
arXiv (CS.CV) 2026-06-17

Enhancing Pathological VLMs with Cross-scale Reasoning

Pathological images are inherently multi-scale, requiring pathologists to integrate evidence from global tissue architecture at low magnification to cellular morphology at higher magnification for accurate diagnosis. While existing pathological datasets for vision-language models (VLMs) include various scales, they often lack an explicit cross-scale reasoning objective. This limitation prevents VLMs from capturing essential cross-scale representations and learning evidence-based reasoning. To bridge this gap, we introduce the first cross-scale training and evaluation paradigm that formulates pathology interpretation as multi-magnification reasoning. However, creating such a task reveals a critical challenge: multi-image visual question answering (VQA) is prone to text-only shortcuts, which allow models to guess answers using magnification-dependent artifacts rather than visual evidence. To address this, we propose a leakage-aware curation pipeline that combines adversarial text-only screening with constraint-guided question design. Using this pipeline, we construct Scale-VQA, a high-quality benchmark with 4,685 multiple-choice questions grounded in 2,537 pathology images across multiple magnification levels. Finally, we present ScaleReasoner-R1, a model trained via reinforcement learning to optimize performance on the cross-scale VQA task. ScaleReasoner-R1 achieves state-of-the-art performance on our cross-scale reasoning benchmark and also reaches state-of-the-art performance on established single-scale benchmarks. These findings suggest that even limited cross-scale supervision can significantly improve pathological understanding. The code and demos will be open-sourced.
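
To make the idea of text-only screening concrete, here is a minimal sketch of one way such a filter could work, assuming each item is a dict with `question`, `options`, and `answer` (the index of the correct option). It drops items that image-blind guessers answer correctly too often. The data layout, the `max_leak_rate` threshold, and the toy keyword guesser are all hypothetical; the abstract does not describe the authors' pipeline at this level of detail.

```python
from typing import Callable, Dict, List, Sequence

Guesser = Callable[[str, Sequence[str]], int]


def screen_text_only(items: List[Dict], guessers: Sequence[Guesser],
                     max_leak_rate: float = 0.5) -> List[Dict]:
    """Keep items that image-blind guessers rarely answer correctly.

    Hypothetical filter: an item is dropped when the fraction of guessers
    that pick the right option without seeing any image exceeds max_leak_rate.
    """
    kept = []
    for item in items:
        hits = sum(g(item["question"], item["options"]) == item["answer"]
                   for g in guessers)
        if hits / len(guessers) <= max_leak_rate:
            kept.append(item)
    return kept


def keyword_guesser(question: str, options: Sequence[str]) -> int:
    """Toy shortcut: pick the first option that mentions magnification."""
    for index, option in enumerate(options):
        if "magnification" in option.lower():
            return index
    return 0


items = [
    {"question": "Which finding supports the diagnosis?",
     "options": ["Nuclei seen at high magnification", "Stroma", "Necrosis"],
     "answer": 0},
    {"question": "Which growth pattern is present?",
     "options": ["Glandular", "Solid", "Papillary"],
     "answer": 2},
]
clean = screen_text_only(items, [keyword_guesser])
print([item["question"] for item in clean])
```

In practice the guessers would be language models queried without the images; the keyword rule here only stands in for that role so the example stays self-contained.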