Academic Intelligence · Curated Daily

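Explore the frontiers of global academic research

AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefings.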

01.
medRxiv (Medicine) 2026-06-19

Cardiometabolic multimorbidity and care experiences in primary healthcare among Brazilian adults aged 50 and over (ELSI-Brazil)

Background: Population aging and the rising burden of non-communicable diseases have increased the prevalence of cardiometabolic multimorbidity (CM-MM) among older adults. Patient-reported experience measures (PREMs) are recognized as essential components of healthcare quality assessment, yet evidence on primary care experiences among individuals with CM-MM remains scarce. Objective: To analyze primary care experiences according to the presence of cardiometabolic multimorbidity among Brazilians aged 50 years and older. Methods: Cross-sectional study using data from the second wave of the Brazilian Longitudinal Study of Aging (ELSI-Brazil, 2019-2021; n = 9,949). CM-MM was defined as the self-reported coexistence of two or more of the following conditions: hypertension, diabetes mellitus, dyslipidemia, acute myocardial infarction, and stroke. Primary care experiences were assessed using a validated 12-item instrument organized into four domains: first-contact access, longitudinality, communication, and care coordination. Associations were estimated using Poisson regression adjusted for sociodemographic, health-condition, and healthcare-utilization variables, with stratified analysis by Family Health Strategy (FHS) coverage. Results: CM-MM prevalence was 25.5%, with a progressive increase by age and an inverse gradient by education. Individuals with CM-MM reported significantly more positive experiences in longitudinality (mean index 2.53 vs. 2.34; adjusted PR = 1.22; 95%CI 1.12-1.33; p < 0.001) and, to a lesser extent, in communication (mean index 2.68 vs. 2.58; adjusted PR = 1.10; 95%CI 1.00-1.20; p = 0.041). No statistically significant differences were found in first-contact access or care coordination. After stratification by FHS coverage, the observed differences in longitudinality and communication were no longer statistically significant. Conclusions: CM-MM was associated with more positive primary care experiences in longitudinality and communication. The absence of differentiated experiences in first-contact access and coordination highlights structural gaps in primary care responsiveness to individuals with greater clinical complexity. Keywords: Multimorbidity; Cardiometabolic diseases; Primary Care; Patient-reported experience measures; Older adults; ELSI-Brazil.

02.
arXiv (CS.CV) 2026-06-16

NeRD: Neuro-Symbolic Rule Distillation for Efficient Ontology-Grounded Chain-of-Thought in Medical Image Diagnosis

Interpretability is essential for trustworthy medical image diagnosis. However, existing concept-driven interpretable methods have key limitations: Concept Bottleneck Models (CBMs) require scoring all predefined concepts at inference time and for manual intervention, imposing a substantial burden on clinicians, while rationale-based generative approaches often select concepts by class discriminability, which can drift from diagnostic ontologies. To address these issues, we propose Neuro-Symbolic Rule Distillation (NeRD), a framework that produces efficient, ontology-grounded reasoning chains that are sufficient yet non-redundant, without manually crafting diagnostic rules. Experiments on two skin datasets demonstrate strong diagnostic performance and interpretability, and blinded expert evaluation confirms the clinical plausibility of NeRD rationales. Our method further enables a first expert-in-the-loop study for Multimodal Chain-of-Thought-based diagnosis, achieving efficient and effective concept-level intervention.

03.
arXiv (CS.LG) 2026-06-16

Diffusion Models for Adaptive Sequential Data Generation

arXiv:2606.06007v2. Generating realistic synthetic sequential data is critical in real-world applications across operations research, finance, healthcare, energy systems, and scientific computing, where time-indexed observations are used for prediction, simulation, risk assessment, and data-driven decision-making. While diffusion models have achieved remarkable success in generating static data, their direct extensions to sequential settings often fail to capture temporal dependence and information structure. Designing diffusion models that can simulate sequential data in an adapted manner, and hence without anticipation of future information, therefore remains an open challenge. In this work, we propose a sequential forward-backward diffusion framework for adapted time series generation. Our approach progressively injects and removes noise along the sequence, conditioning on the previously generated history to ensure adaptiveness. A novel score-matching objective is introduced for efficient parallel training. We derive rigorous statistical guarantees under a generic framework, then establish score approximation, score estimation, and distribution estimation results with ReLU networks serving as a concrete instance. Empirically, we validate our method on synthetic data, including ARMA models and Gaussian processes, and demonstrate its effectiveness in constructing mean-variance optimal portfolios.

04.
bioRxiv (Bioinfo) 2026-06-16

Physics-Driven Zero-Shot Reconstruction of Isotropic 3D Fluorescence Microscopy under Undersampled Acquisition

Three-dimensional (3D) imaging represents the next generation of fluorescence microscopy. However, routine axial down-sampling makes isotropic resolution unrealistic. Here, we propose DeepUI, a physics-driven zero-shot framework designed to reconstruct isotropic 3D fluorescence images from a low axial sampling rate. DeepUI fully leverages the intrinsic characteristics of 3D images through physics-guided degradation, which incorporates spatial-frequency joint learning to generate a scaled optical transfer function, combined with noise degradation and an up-sampling branch. DeepUI typically requires just 5 minutes for training and 0.5 minutes for high-throughput prediction. We demonstrate its superior performance in producing isotropic results and its exclusivity to axial down-sampling conditions, even under more challenging conditions, including defocused background, noise, and resolution blur.

05.
arXiv (CS.LG) 2026-06-12

Towards One-for-All Anomaly Detection for Tabular Data

arXiv:2603.14407v2. Tabular anomaly detection (TAD) aims to identify samples that deviate from the majority in tabular data and is critical in many real-world applications. However, existing methods follow a "one model for one dataset (OFO)" paradigm, which relies on dataset-specific training and thus incurs high computational cost and yields limited generalization to unseen domains. To address these limitations, we propose OFA-TAD, a generalist one-for-all (OFA) TAD framework that only requires one-time training on multiple source datasets and can generalize to unseen datasets from diverse domains on-the-fly. To realize one-for-all tabular anomaly detection, OFA-TAD extracts neighbor-distance patterns as transferable cues, and introduces multi-view neighbor-distance representations from multiple transformation-induced metric spaces to mitigate the transformation sensitivity of distance profiles. To adaptively combine multi-view distance evidence, a Mixture-of-Experts (MoE) scoring network is employed for view-specific anomaly scoring and entropy-regularized gated fusion, with a multi-strategy anomaly synthesis mechanism to support training under the one-class constraint. Extensive experiments on 34 datasets from 14 domains demonstrate that OFA-TAD achieves superior anomaly detection performance and strong cross-domain generalizability under the strict OFA setting. The source code is available at https://github.com/Shiy-Li/OFA-TAD.
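To make the idea of a neighbor-distance pattern concrete, here is a minimal sketch of a k-nearest-neighbor distance profile for one sample. It is not the OFA-TAD implementation: the function name `neighbor_distance_profile`, the Euclidean metric, and the choice of `k` are assumptions made for illustration, and the multi-view transformations and MoE scoring network are omitted.

```python
import math

def neighbor_distance_profile(points, query_index, k=3):
    """Sorted distances from one sample to its k nearest neighbors.

    Hypothetical helper for illustration only; Euclidean distance is
    an assumption, not something the abstract specifies.
    """
    query = points[query_index]
    distances = sorted(
        math.dist(query, other)
        for i, other in enumerate(points)
        if i != query_index  # skip the sample itself
    )
    return distances[:k]

# Three clustered samples and one isolated sample.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
inlier_profile = neighbor_distance_profile(samples, 0, k=2)
outlier_profile = neighbor_distance_profile(samples, 3, k=2)
```

The isolated sample has a much larger distance profile than the clustered ones, which is the kind of cue that does not depend on what the columns of a particular table mean.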

06.
arXiv (CS.CV) 2026-06-16

GeoRoPE: Ground-Aware Rotary Adaptation for Remote Sensing Foundation Models

Remote-sensing foundation models (RSFMs) benefit from pretraining on imagery from multiple sensors and ground sampling distances (GSDs), but such exposure alone does not resolve scale mismatch during downstream adaptation. A fixed token-grid offset can correspond to different ground distances across sensors, making grid-based positional priors physically inconsistent. Meanwhile, heterogeneous spatial granularity means that compact urban regions and homogeneous landscapes may require different positional sensitivities even under the same GSD. Therefore, we propose GeoRoPE, a ground-aware, RoPE-compatible, and parameter-efficient spatial adaptation method for RSFMs. GeoRoPE recalibrates token-level positional interactions from two complementary aspects. First, Geo-Coordinate Calibration (GCC) rescales raw token-grid offsets according to the ground distance represented by one token-grid step, producing geo-calibrated relative coordinates across GSDs. Second, Geo-Frequency Calibration (GFC) adjusts the native RoPE frequency with a relation-specific factor, enabling position-sensitive adaptation to scene-dependent spatial granularity. GeoRoPE is injected into pretrained RSFMs through a lightweight adapter, preserving the frozen spatial prior while adding geo-aware positional corrections. Experiments across multiple RSFMs, sensors, resolutions, and downstream tasks demonstrate that GeoRoPE improves cross-resolution robustness and scale-sensitive representation learning.
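The scale mismatch that motivates Geo-Coordinate Calibration can be illustrated with a few lines of arithmetic. The sketch below is an assumption-laden reading of the abstract, not the paper's formula: it takes the ground distance of one token-grid step to be patch size times GSD, and the names `geo_calibrated_offset` and `reference_m` are hypothetical.

```python
def geo_calibrated_offset(token_offset, patch_size_px, gsd_m_per_px, reference_m=1.0):
    """Convert a token-grid offset into a ground-distance offset.

    Assumption: one token-grid step covers patch_size_px * gsd_m_per_px
    metres. reference_m is a hypothetical normalising distance.
    """
    metres_per_step = patch_size_px * gsd_m_per_px
    return token_offset * metres_per_step / reference_m

# The same 3-token offset at two ground sampling distances.
coarse = geo_calibrated_offset(3, patch_size_px=16, gsd_m_per_px=10.0)
fine = geo_calibrated_offset(3, patch_size_px=16, gsd_m_per_px=0.5)
```

With 16-pixel patches, a 3-token offset spans 480 m at 10 m/pixel but only 24 m at 0.5 m/pixel, so a positional prior indexed purely by token offset treats two very different ground distances as identical.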

07.
arXiv (CS.AI) 2026-06-16

Edit Knowledge, Not Just Facts via Multi-Step Reasoning over Background Stories

arXiv:2602.02028v2. Enabling artificial intelligence systems, particularly large language models, to update knowledge and flexibly apply it during reasoning remains a central challenge. Existing knowledge editing approaches emphasize atomic facts, improving factual recall but often failing to integrate updated information into a coherent framework usable across contexts. In this work, we argue that knowledge update is fundamentally a reasoning problem rather than a memorization problem. Consequently, a model should be trained in situations where the new information is instrumental to solving a task, combined with pre-existing knowledge, and exercised through multi-step reasoning. Based on this insight, we propose a training strategy based on three principles. First, new knowledge is introduced as a coherent background story that contextualizes novel facts and explains their relation to existing knowledge. Second, models are trained using self-generated multi-hop questions that require multi-step reasoning involving the new information. Third, training is done using knowledge distillation, forcing a student model to internalize the teacher's reasoning behavior without access to the novel information. Experiments show that models trained with this strategy effectively leverage newly acquired knowledge during reasoning and achieve remarkable performance on challenging questions that require combining multiple new facts.

09.
arXiv (CS.CL) 2026-06-12

Direct Preference Optimization for Chatbot Fine-Tuning: An Empirical Study

We present an approach to fine-tuning large language models using Direct Preference Optimization (DPO), a reinforcement learning technique. Our experimental results demonstrate that DPO simplifies the training pipeline, improves computational efficiency, and achieves competitive performance. The evaluation using BLEU, ROUGE, and cosine similarity metrics indicates effective learning and convergence, though further investigation is needed to address observed training instability.
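The abstract does not spell out the objective, so for reference here is a minimal sketch of the commonly used DPO loss for a single preference pair. The function name `dpo_loss`, the default `beta`, and the use of summed sequence log-probabilities as inputs are assumptions for illustration; nothing here is taken from the study's own code.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of responses.

    Inputs are log-probabilities of each full response under the
    trainable policy and the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), written in a numerically stable form
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Policy identical to the reference: the loss equals log(2).
baseline = dpo_loss(-11.0, -11.0, -11.0, -11.0)
# Policy prefers the chosen response more than the reference does.
improved = dpo_loss(-10.0, -12.0, -11.0, -11.0)
```

The loss falls below log(2) as soon as the policy raises the chosen response relative to the rejected one, measured against the reference model, with no separate reward model or sampling loop involved.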

10.
arXiv (quant-ph) 2026-06-19

Stalls and Spequlation: Pipelined Execution for Fault Tolerant Quantum Computation

arXiv:2606.19593v1. Fault-tolerant quantum computation requires the coordinated action of three distinct systems: classical control logic, quantum hardware, and classical error decoders. Current scheduling models treat logical operations as atomic, hiding the fact that these subsystems operate sequentially and spend significant time idle. We present a pipelined execution framework that decomposes each logical operation into its component stages, i.e., Control, Execute, and Decode. Building on this, we discuss some speculation strategies that allow successor operations to begin processing before their predecessors have completed decoding. We evaluate our framework on several common benchmarks and show that pipelining with speculation reduces total pipeline steps by 20-40% compared to a no-speculation baseline. The most aggressive strategy consistently outperforms conservative alternatives, even though partial rollback is needed at times, because the per-rollback penalty is small relative to the parallelism gained. We further show that speculation facilitates load balancing by distributing work more evenly across the heterogeneous subsystems of a fault-tolerant quantum computer, converting idle time into useful computation while also saving on execution time.

11.
arXiv (CS.LG) 2026-06-11

Hierarchical Probabilistic Conformal Prediction for Distributed Energy Resources Adoption

arXiv:2411.12193v4. The rapid growth of distributed energy resources (DERs) presents both opportunities and operational challenges for electric grid management. Accurately predicting DER adoption is critical for proactive infrastructure planning, but the inherent uncertainty and spatial disparity of DER growth complicate traditional forecasting approaches. Moreover, the hierarchical structure of distribution grids demands that predictions satisfy statistical guarantees at both the circuit and substation levels, a non-trivial requirement for reliable decision-making. In this paper, we propose a novel uncertainty quantification framework for DER adoption predictions that ensures validity across hierarchical grid structures. Leveraging a multivariate Hawkes process to model DER adoption dynamics and a tailored split conformal prediction algorithm, we introduce a new nonconformity score that preserves statistical guarantees under aggregation while maintaining prediction efficiency. We establish theoretical validity under mild conditions and demonstrate through empirical evaluation on customer-level solar panel installation data from Indianapolis, Indiana that our method consistently outperforms existing baselines in both predictive accuracy and uncertainty calibration.
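For readers unfamiliar with the building block, the following is a minimal sketch of generic split conformal prediction with an absolute-residual nonconformity score. It is the textbook procedure only: the paper's hierarchical nonconformity score and Hawkes-process model are not reproduced, and the function names are hypothetical.

```python
import math

def split_conformal_radius(calib_true, calib_pred, alpha=0.1):
    """Half-width q so that [pred - q, pred + q] covers a new point with
    probability at least 1 - alpha, assuming exchangeable data."""
    scores = sorted(abs(y - p) for y, p in zip(calib_true, calib_pred))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]

def predict_interval(point_pred, radius):
    """Symmetric prediction interval around a point forecast."""
    return point_pred - radius, point_pred + radius

# Toy calibration set: predictions are all 0, residuals are 1..9.
radius = split_conformal_radius(list(range(1, 10)), [0] * 9, alpha=0.5)
interval = predict_interval(20.0, radius)
```

The calibration residuals come from a held-out split that the forecasting model never saw, which is what makes the coverage guarantee independent of how good the model is.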

12.
arXiv (CS.CL) 2026-06-17

Algorithmic Prompt Generation for Diverse Human-like Teaming and Communication with Large Language Models

Understanding how humans collaborate and communicate in teams is essential for improving human-agent teaming and AI-assisted decision-making. However, relying solely on data from large-scale user studies is impractical due to logistical, ethical, and practical constraints, necessitating synthetic models of multiple diverse human behaviors. Recently, agents powered by Large Language Models (LLMs) have been shown to emulate human-like behavior in social settings. However, obtaining a large set of diverse behaviors requires manual effort in the form of designing prompts. On the other hand, Quality Diversity (QD) optimization has been shown to be capable of generating diverse Reinforcement Learning (RL) agent behavior. In this work, we combine QD optimization with LLM-powered agents to iteratively search for prompts that generate diverse team behavior in a long-horizon, multi-step collaborative environment. We first show, through a human-subjects experiment, that humans exhibit diverse coordination and communication behavior in this domain. We then present a series of experiments showing that our approach captures behaviors that are difficult to observe without large-scale data collection, and a follow-up user study to show that these generated behaviors are human-like. Our findings highlight the combination of QD and LLM-powered agents as an effective tool for studying teaming and communication strategies in multi-agent collaboration.

13.
arXiv (CS.CL) 2026-06-18

FLiP: Towards understanding and interpreting multimodal multilingual sentence embeddings

This paper presents factorized linear projection (FLiP) models for understanding pretrained sentence embedding spaces. We train FLiP models to recover the lexical content from multilingual (LaBSE), multimodal (SONAR) and API-based (Gemini) sentence embedding spaces in several high- and mid-resource languages. We show that FLiP can recall more than 75% of lexical content from the embeddings, significantly outperforming existing non-factorized baselines. Using this as a diagnostic tool, we uncover the modality and language biases across the selected sentence encoders and provide practitioners with intrinsic insights about the encoders without relying on conventional downstream evaluation tasks. Our implementation is publicly available at https://github.com/BUTSpeechFIT/FLiP.

15.
arXiv (math.PR) 2026-06-16

Joint convergence in Wiener chaos via transport hierarchy and Malliavin covariances

arXiv:2606.14812v1. We study the joint convergence in distribution of a sequence $X_N = I_p(f_N)$ of multiple Wiener–Itô integrals of order $p\geq 2$ that converges to a Gaussian limit $Z\sim N(0,\sigma^2)$, together with another sequence $Y_N = I_q(g_N)$ converging in law. The central finding is that the joint convergence of $(X_N, Y_N)$ is completely governed by the asymptotic behavior of the iterated Malliavin covariances $Y_{r+1,N} = \langle DX_N, DY_{r,N}\rangle_H$, $r\geq 0$: joint convergence holds as soon as these covariances converge jointly with $Y_N$, and the structure of the limiting distribution is then explicitly determined by their limits. Moreover, the convergence of the Malliavin covariances is necessary for joint convergence, as shown by a counterexample. When $q

16.
medRxiv (Medicine) 2026-06-22

The impact of changes in age-based eligibility criteria on seasonal influenza vaccine uptake in England between 2019 and 2024: A retrospective cohort study

Objectives: To examine changes in seasonal influenza vaccine uptake among clinical risk groups over periods of differing age-based eligibility. Design: Retrospective cohort study. Setting: Individuals in England registered in the Clinical Practice Research Datalink Aurum. Participants: Between 1,239,802 (2019/20) and 1,289,330 (2023/24) individuals aged 40-69 years in clinical risk groups. Interventions: Natural experiment involving temporary expansion of age-based eligibility for influenza vaccination to include 50-64-year-olds from 2020/21 to 2022/23. Main outcome measures: Influenza vaccine uptake from 1st September to 28th February, incidence rate ratio (IRR) of vaccine uptake across consecutive seasons within age groups, and the ratio of IRRs between age groups. Results: Influenza vaccine uptake increased in all age groups in 2020/21 relative to 2019/20. The increase was larger in individuals aged 50-64 years (13.3%; IRR 1.50, 95% CI 1.50-1.51) compared with those aged 40-49 years (8.3%; IRR 1.35, 95% CI 1.34-1.35) and 65-69 years (6.8%; IRR 1.34, 95% CI 1.33-1.35). From 2020/21 to 2022/23, vaccine uptake decreased, with a more pronounced decline among those aged 40-49 years (-5.4%) compared with age-eligible groups (50-64 years: -3.0%; 65-69 years: -3.1%). The reversion of age eligibility in 2023/24 was associated with a larger decrease in uptake among those aged 50-64 years (-9.6% vs 2022/23; IRR 0.79, 95% CI: 0.79-0.79) compared with those aged 40-49 years (-4.9%; IRR 0.87, 95% CI: 0.87-0.88) and 65-69 years (-3.3%; IRR 0.97, 95% CI: 0.96-0.97). Patterns were broadly consistent across clinical risk groups. Conclusions: The COVID-19 pandemic saw a general increase in seasonal influenza vaccine uptake in clinical risk groups. This increase was larger and more sustained in 50-64 year-olds who had also become eligible based on age. Our findings highlight the potential gains in vaccine coverage among clinical risk groups based on expanded age-based eligibility.

17.
arXiv (CS.CV) 2026-06-15

ADAPT: An Autonomous Forklift for Construction Site Operation

Efficient material logistics play a critical role in controlling costs and schedules in the construction industry. However, manual material handling remains prone to inefficiencies, delays, and safety risks. Autonomous forklifts offer a promising solution to streamline on-site logistics, reducing reliance on human operators and mitigating labor shortages. This paper presents the development and evaluation of ADAPT (Autonomous Dynamic All-terrain Pallet Transporter), a fully autonomous off-road forklift designed for construction environments. Unlike structured warehouse settings, construction sites pose significant challenges, including dynamic obstacles, unstructured terrain, and varying weather conditions. To address these challenges, our system integrates AI-driven perception techniques with traditional approaches for decision making, planning, and control, enabling reliable operation in complex environments. We validate the system through extensive real-world testing, comparing its continuous performance against an experienced human operator across various weather conditions. Our findings demonstrate that autonomous outdoor forklifts can operate near human-level performance, offering a viable path toward safer and more efficient construction logistics.

18.
arXiv (CS.LG) 2026-06-16

Amortized mean-shift interacting particles

arXiv:2606.15871v1. Bayesian inference for inverse problems is run to evaluate integrals – posterior expectations, tail probabilities, and risks – across a stream of observations. The standard estimate averages the integrand over posterior samples, a Monte-Carlo average whose error decays only as the square root of the sample size, so accuracy demands many samples – prohibitive when each one calls a partial-differential-equation forward model. Mean-shift interacting particles need far fewer: they return a small set of signed-weight nodes – a deterministic quadrature whose weighted averages estimate those integrals. Finding the nodes, however, is a per-observation optimization that, in its most accurate form, reads the posterior score at every step – returning the cost it meant to save. We introduce amortized mean-shift interacting particles, a learned map that emits the weighted nodes from an observation and a few posterior samples in a single forward pass. Training asks only for joint parameter-observation samples and a posterior to draw from – a conditional normalizing flow, an empirical conditional, or any reference the user can sample – and the map learns to integrate that posterior from samples alone, evaluating neither its density nor its score. Once trained, it generalizes to unseen observations and integrands at any node budget and improves on independent samples in two ways: by reweighting them, provably no worse than the equal weights of Monte-Carlo; and by moving them, which empirically lowers it further. Across closed-form, sampled, learned, and physics-based posteriors – up to a thousand-coefficient groundwater field – it integrates more accurately than the same number of samples at every budget, and a posterior-whitened, dimension-aware kernel removes the high-dimensional wall. The result is a Pareto improvement on Monte-Carlo integration, not a competitor to drawing more samples.

19.
medRxiv (Medicine) 2026-06-18

A Novel Correction Method for QT Interval in the Presence of Left Bundle Branch Block Morphology

Background: Accurate assessment of the QT interval is challenging in the presence of QRS prolongation, such as during ventricular pacing or bundle branch block. Current correction methods are heterogeneous and lack consensus. Objective: To evaluate the relationship between QRS duration and QT interval during ventricular pacing and to develop a practical correction method for QT assessment. Methods: In this prospective single-centre study, 94 patients undergoing electrophysiology study for supraventricular tachycardia were included. Standardised pacing was performed at the same cycle length from the right ventricular (RV) apex, from the His catheter at high and low output, and from the coronary sinus (reference). QRS and QT intervals were measured from 12-lead ECGs. Changes in QT (ΔQT) and QRS duration (ΔQRS) were analysed using linear regression and mixed-effects modelling. QT correction formulas of the form QT corrected = QT − N × ΔQRS were evaluated using Bland-Altman analysis across multiple coefficients N. Results: A significant positive correlation between ΔQRS and ΔQT was observed across all pacing sites (r = 0.52-0.74, p < 0.001). In mixed-effects modelling, ΔQRS was a strong independent predictor of ΔQT (coefficient 0.59, p < 0.001), with no significant interaction between pacing site and ΔQRS, supporting a consistent relationship across pacing locations. Bland-Altman analysis demonstrated that correction coefficients of 0.65-0.70 minimised systematic bias compared with lower coefficients, with similar precision across models (SD 16 ms) and no evidence of proportional bias. A coefficient of 0.65 provided the most balanced performance between bias and variability. Conclusion: QT prolongation during ventricular pacing is primarily driven by QRS widening and follows a consistent linear relationship across pacing sites. A simple correction using QT corrected = QT − 0.65 × (QRS − 100 ms) provides a practical and accurate method for QT assessment, with potential clinical applicability in patients with conduction abnormalities or ventricular pacing.
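As a worked illustration of the closing formula, the sketch below assumes it reads QT corrected = QT − 0.65 × (QRS − 100 ms), with all intervals in milliseconds. The function and parameter names are hypothetical, and this adjusts only for QRS width; it is not a heart-rate correction and is not intended for clinical use.

```python
def qt_corrected_for_qrs(qt_ms, qrs_ms, coefficient=0.65, reference_qrs_ms=100.0):
    """Subtract the share of QT attributed to QRS widening beyond a
    reference QRS duration. All values are in milliseconds.

    Assumed reading of the abstract's formula, for illustration only.
    """
    return qt_ms - coefficient * (qrs_ms - reference_qrs_ms)

# A paced beat with QT = 480 ms and QRS = 160 ms.
example = qt_corrected_for_qrs(480.0, 160.0)
```

In this example the QRS is 60 ms wider than the 100 ms reference, so 0.65 × 60 = 39 ms is subtracted and the corrected QT is about 441 ms. A QRS of exactly 100 ms leaves the QT unchanged.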

20.
arXiv (quant-ph) 2026-06-12

A Robust Strontium Tweezer Apparatus for Quantum Computing

arXiv:2601.16564v2. Neutral atoms for quantum computing applications show promise in terms of scalability and connectivity. We demonstrate the realization of a versatile apparatus capable of stochastically loading a 5x5 array of optical tweezers with single $^{88}$Sr atoms featuring flexible magnetic field control and excellent optical access. A custom-designed oven, spin-flip Zeeman slower, and deflection stage produce a controlled flux of Sr directed to the science chamber. In the science chamber, featuring a vacuum pressure of $3 \times 10^{-11}$ mbar, the Sr is cooled using two laser cooling stages, resulting in $\sim 3 \times 10^5$ atoms at a temperature of 5(1) $\mu$K. The optical tweezers feature a $1/e^2$ waist of 0.81(2) $\mu$m, and loaded atoms can be imaged with a fidelity of $\sim 0.997$ and a survival probability of $0.99^{+0.01}_{-0.02}$. The atomic array presented here forms the core of a full-stack quantum computing processor targeted for quantum chemistry computational problems.

21.
PLOS Computational Biology 2026-06-08

Statistics of cortical representational drift can enable robust readout

by Charles Micou, Timothy O’Leary

Representational drift of fixed stimuli, learned tasks and familiar environments is observed in many brain areas, leading to reconfiguration of population codes over days to weeks. This raises the question of whether downstream brain regions employ mechanisms to track changes in population activity and thus preserve the fidelity of the information they extract. We show that the statistical properties of drift have a significant impact on such mechanisms. Over an extended period, a net change in population tuning due to drift can arise from an accumulation of small changes distributed across the population, or via abrupt jumps that affect smaller subsets of cells at each time point. We demonstrate that an adaptive readout can exploit the heavy-tailed statistics of abrupt jumps to maintain a more stable readout using a simple inference mechanism. Using experimental data, we investigate the extent to which heavy-tailed drift statistics are observed during representational drift in the posterior parietal cortex and visual cortex. We find that experimentally measured drift does not conform to a Gaussian random walk. Instead, we find sudden jumps in neural tuning that would be advantageous for a downstream observer adapting to changes in representation. These observations motivate future study to determine whether adaptive decoding mechanisms exist in the brain and to determine the physiological mechanisms that shape the statistics of representational drift.

22.
medRxiv (Medicine) 2026-06-17

Low-Density Lipoprotein Cholesterol and Dementia Risk: Integrating Mendelian Randomization and Target Trial Emulation Within the Heart-Brain Axis

Background: The heart-brain axis links cardiovascular and neurodegenerative disease through shared vascular and inflammatory mechanisms. Although low-density lipoprotein cholesterol (LDL-C) is an established causal factor in atherosclerotic cardiovascular disease (ASCVD), its relationship with dementia remains uncertain, with midlife elevations associated with increased risk but late-life associations often appearing null or inverse. To address this cholesterol paradox, we integrated mendelian randomization (MR) with an active-comparator new-user target trial emulation. Methods: We applied a triangulated causal inference framework integrating two-sample MR with observational target trial emulation. Genetic variants associated with LDL-C were used as instrumental variables to evaluate Alzheimer disease (AD), dementia with Lewy bodies (DLB), frontotemporal dementia (FTD), and any dementia (AnyDem), with causal estimates derived using inverse-variance weighted models and sensitivity analyses for heterogeneity and pleiotropy. In parallel, an active-comparator new-user design compared statin versus ezetimibe initiation among adults aged 60 years or older using propensity score (PS) overlap weighting and Cox proportional hazards models to evaluate cardiovascular and dementia outcomes. Results: Genetically predicted LDL-C was associated with increased risk of DLB (OR 1.65, 95% CI 1.30-2.10; p

23.
medRxiv (Medicine) 2026-06-17

LLM-Driven Extraction of NI-RADS and Imaging Tumor Characteristics to Enhance Oropharyngeal Cancer Survivorship Surveillance

Purpose: Radiologic surveillance is essential for oropharyngeal cancer (OPC) survivors, guiding recurrence detection and follow-up strategies. The Neck Imaging Reporting and Data System (NI-RADS) provides a standardized framework for post-treatment risk reporting at both the primary tumor site (pNI-RADS) and cervical lymph nodes (nNI-RADS). Comprehensive surveillance additionally requires assessment of disease status, including the primary tumor, nodal involvement, and distant metastases. These clinical results are often embedded as unstructured data within free-text radiology reports. We hypothesized that a large language model (LLM) can reliably extract NI-RADS score criteria and summarize key imaging features from unstructured radiology text, achieving high concordance with expert review. Methods: Previously untreated OPC patients who received definitive cancer therapy were identified. Eligible imaging reports included post-treatment head and neck CT, MRI, or FDG PET/CT scans containing narrative and impression text. Examinations lacking narrative or impression text, containing pre-existing NI-RADS annotations, or involving non-surveillance imaging modalities were excluded. A total of 200 reports were randomly selected from 7,076 eligible examinations for manual abstraction using a three-reviewer consensus framework to establish a reference dataset. Using the Palantir Foundry Pipeline Builder, a GPT-5-based LLM was deployed to extract pNI-RADS and nNI-RADS scores and key imaging features of disease status from these reports. Performance was evaluated using exact agreement and F1-based metrics. Results: Agreement for no evidence of disease (score of 1) was 93.3% (126/135; F1 = 0.94) and 90.3% (130/144; F1 = 0.93) for pNI-RADS and nNI-RADS, respectively. For NI-RADS ≥2, exact category agreement was 73.1% (38/52; macro-F1 = 0.75) for pNI-RADS and 64.3% (27/42; macro-F1 = 0.56) for nNI-RADS. Quadratic weighted κ was 0.81 and 0.59, respectively. For post-treatment disease surveillance variables, agreement was 94.9% (149/157; F1 = 0.87) for primary tumor presence, 89.1% (164/184; F1 = 0.87) for nodal disease presence, and 94.7% (126/133; F1 = 0.70) for distant metastasis detection. Specificity was high across disease-status variables (0.95-0.99), with negative predictive values of 0.95 for primary tumor, 0.87 for nodal disease, and 0.99 for distant metastasis. Conclusions: Our LLM-based information retrieval and classification approach for radiographic treatment response from unstructured, multidimensional imaging reports achieved high performance for disease exclusion and moderate performance for detecting suspected residual and/or new disease. This pipeline supports scalable and standardized surveillance data capture for longitudinal monitoring, clinical analytics, and survivorship research in head and neck oncology.
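The abstract reports exact agreement and quadratic weighted κ for ordinal NI-RADS categories. As a rough illustration of how such metrics are usually computed, here is a minimal Python sketch. The function names and the toy labels are hypothetical and are not taken from the study; the category list `[1, 2, 3, 4]` is only an assumed ordinal scale.

```python
def exact_agreement(ref, pred):
    """Fraction of items where the predicted label equals the reference label."""
    return sum(r == p for r, p in zip(ref, pred)) / len(ref)


def quadratic_weighted_kappa(ref, pred, categories):
    """Cohen's kappa with quadratic disagreement weights for ordinal labels.

    `categories` must list the labels in rank order and contain at least two.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ref)

    # Confusion matrix: rows are reference labels, columns are predictions.
    observed = [[0.0] * k for _ in range(k)]
    for r, p in zip(ref, pred):
        observed[index[r]][index[p]] += 1

    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # larger penalty for distant categories
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        # All labels fall in one category, so no disagreement is possible.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Toy labels for illustration only (not data from the study).
reference = [1, 1, 2, 3, 4, 1]
predicted = [1, 1, 2, 4, 4, 2]
agreement = exact_agreement(reference, predicted)
kappa = quadratic_weighted_kappa(reference, predicted, categories=[1, 2, 3, 4])
```

The reported percentages follow the same exact-agreement arithmetic: for example, 126 matching reports out of 135 gives 126/135, which rounds to 93.3%.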

24.
arXiv (CS.CL) 2026-06-15

Decoupled Mixture-of-Experts for Parametric Knowledge Injection

Knowledge injection aims to equip large language models (LLMs) with external, domain-specific, or time-sensitive knowledge. Existing approaches typically face a trade-off between flexibility and integration: retrieval-augmented generation keeps knowledge outside the model but only provides prompt-level augmentation, whereas post-training based methods encode new knowledge into shared parameters but may introduce catastrophic forgetting, knowledge conflict, and costly updates. In this paper, we propose Decoupled Mixture-of-Experts (DMoE), a modular architecture for parametric knowledge injection that decouples both experts and the router from the base model. DMoE converts external knowledge corpora into independently updatable expert modules and uses a lightweight uncertainty-aware router to activate relevant experts only when the base model lacks sufficient knowledge during generation. To support efficient auto-regressive inference, DMoE attaches experts only to the final-layer feed-forward network, preserving KV-cache reuse while enabling parameter-level knowledge augmentation. Experiments on knowledge-intensive benchmarks show that DMoE consistently improves answer quality over retrieval and adapter-based baselines.
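To make the routing idea concrete, the following Python sketch shows one way an uncertainty-gated expert could be added to a final-layer feed-forward output. It is not the DMoE implementation: the abstract does not say how uncertainty is measured or how experts are scored, so the entropy threshold, the dot-product scoring, and every name here (`route`, `token_entropy`, `threshold`) are assumptions made purely for illustration.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def route(hidden, base_ffn_out, next_token_probs, experts, threshold=1.0):
    """Hypothetical uncertainty-aware routing step.

    experts: list of (key_vector, expert_fn) pairs, where expert_fn maps the
    hidden state to a correction with the same length as base_ffn_out.
    Returns (output_vector, index_of_expert_used_or_None).
    """
    # Assumption: low entropy means the base model already "knows" the answer,
    # so no expert is activated and the base output passes through unchanged.
    if not experts or token_entropy(next_token_probs) < threshold:
        return list(base_ffn_out), None

    # Assumption: pick the expert whose key best matches the hidden state.
    scores = [sum(h * k for h, k in zip(hidden, key)) for key, _ in experts]
    best = max(range(len(experts)), key=scores.__getitem__)

    # Add the expert's correction to the base feed-forward output.
    delta = experts[best][1](hidden)
    return [b + d for b, d in zip(base_ffn_out, delta)], best


# Toy usage with two hand-written experts (illustrative values only).
toy_experts = [
    ([0.0, 1.0], lambda h: [10.0, 10.0]),
    ([1.0, 0.0], lambda h: [0.5, -0.5]),
]
confident = route([1.0, 0.0], [1.0, 2.0], [0.97, 0.01, 0.01, 0.01], toy_experts)
uncertain = route([1.0, 0.0], [1.0, 2.0], [0.25, 0.25, 0.25, 0.25], toy_experts)
```

Because the correction in this sketch touches only the last feed-forward output, the attention keys and values of earlier layers are unaffected, which is in the spirit of the KV-cache reuse the abstract describes.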

25.
arXiv (CS.AI) 2026-06-16

Evolutionary Dynamics of Cooperation in Next-Generation LLM Agent Systems: A Cross-Provider Empirical Extension

arXiv:2605.29874v2. Do next-generation LLM agents inherit the cooperative biases documented in their predecessors, or do scale and provider diversity reshape equilibrium behaviour in competitive multi-agent settings? Willis et al. established a benchmark for this question using evolutionary game theory and the Iterated Prisoner's Dilemma (IPD), finding consistent cooperative biases in ChatGPT-4o and Claude 3.5 Sonnet. We extend this benchmark to four frontier models released in 2025-2026 - Claude Sonnet 4.6, Gemini 2.5 Flash, Gemini 3.1 Pro, and GPT-5.4 Mini - applying the identical protocol across three prompting styles (Default, Prose, Self-Refine) and four population compositions (balanced and biased, with and without noise). Cooperative bias persists across providers (H1): ten of twelve model-prompt combinations favour cooperative equilibria in balanced noiseless conditions. Cross-provider divergence is substantial (H3): Gemini 2.5 Flash reaches up to 77% aggressive equilibria under biased conditions, while GPT-5.4 Mini reaches 70% cooperative equilibria under Self-Refine. Support for aggressive capability parity is partial (H2): Self-Refine raises ICD in all models and Gemini 3.1 Pro Refine achieves the highest ICD in the dataset (0.925), but Default and Prose prompts show no systematic narrowing. Evidence on noise robustness is directionally positive but not robustly confirmed (H4): with n=500 Moran iterations per condition, average noise sensitivity is about 6 percentage points for Claude Sonnet 4.6 versus 13 pp for Claude 3.5 Sonnet, but this cross-study gap is not statistically significant once the predecessor's unreported sampling error is propagated. Provider identity, rather than model generation, is the strongest correlate of equilibrium outcomes; noise remains a universal challenge regardless of model size or vintage.
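The equilibria in this abstract come from repeated Moran-process runs over populations of strategies. The Python sketch below shows a generic Moran process with frequency-dependent fitness, run until one strategy takes over the population. It is not the paper's protocol: the two strategy labels, the payoff values, and the function name `moran_fixation` are hypothetical, and payoffs are assumed to be strictly positive so they can serve as selection weights.

```python
import random


def moran_fixation(counts, payoff, rng, max_steps=100_000):
    """Run a Moran process until one strategy fixes in the population.

    counts: dict mapping strategy name to its number of individuals.
    payoff[a][b]: average payoff to strategy a when playing strategy b
                  (assumed strictly positive).
    Returns the surviving strategy, or None if max_steps is reached first.
    """
    counts = dict(counts)
    size = sum(counts.values())
    for _ in range(max_steps):
        alive = [s for s, c in counts.items() if c > 0]
        if len(alive) == 1:
            return alive[0]

        # Fitness of a strategy: mean payoff against all other individuals.
        fitness = {}
        for s in alive:
            total = sum(
                payoff[s][t] * (counts[t] - (1 if t == s else 0)) for t in alive
            )
            fitness[s] = total / (size - 1)

        # Birth: choose a parent strategy in proportion to fitness times abundance.
        parent = rng.choices(alive, weights=[fitness[s] * counts[s] for s in alive])[0]
        # Death: choose the replaced individual uniformly at random.
        dying = rng.choices(alive, weights=[counts[s] for s in alive])[0]
        counts[dying] -= 1
        counts[parent] += 1
    return None


# Hypothetical average payoffs per round between two strategy types.
toy_payoff = {
    "cooperative": {"cooperative": 3.0, "aggressive": 1.0},
    "aggressive": {"cooperative": 4.0, "aggressive": 1.5},
}
rng = random.Random(0)
winner = moran_fixation({"cooperative": 5, "aggressive": 5}, toy_payoff, rng)
```

Repeating such a run many times per condition (the abstract mentions n=500 Moran iterations) and counting how often each strategy type fixes would give equilibrium shares comparable in kind to the percentages quoted above.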