Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
medRxiv (Medicine) 2026-06-16

Higher Population Coverage with Typhoid Conjugate Vaccine is Needed to Induce Herd Protection: Evidence from a Cluster-Randomized Trial in Urban Bangladesh

Introduction: A cluster randomized trial (CRT) in Bangladesh found that Vi-tetanus toxoid (Vi-TT) vaccine conferred 85% protection to vaccinees at 18 months of follow-up; however, it failed to confer significant herd protection to non-vaccinees. Methods: In the CRT, children aged 9 months to

02.
arXiv (CS.CL) 2026-06-18

Rethinking Cross-lingual Gaps from a Statistical Viewpoint

Any piece of knowledge is usually expressed in one or a handful of natural languages on the web or in any large corpus. Large Language Models (LLMs) act as a bridge by acquiring knowledge from a source language and making it accessible when queried using target languages. A cross-lingual gap is a drop in accuracy incurred when querying knowledge in a target language rather than the source language. Existing research focused on modeling or training failures leading to cross-lingual gaps. In this work, we take an alternative view to characterize the nature of cross-lingual error, and hypothesize that the variance of responses in the target language is a key cause of this gap. For the first time, we formalize the cross-lingual gap in terms of biased and unbiased errors. We empirically validate our hypothesis through multiple inference-time interventions that control variance and reduce the cross-lingual gap. We demonstrate a few test-time ensemble methods that reduce response variance, and thereby improve source-target transfer scores by up to 12 absolute points yielding relative gains of 8% to over 50% across various LLMs.

03.
arXiv (CS.CV) 2026-06-12

SPARC: Reliable Spatial Annotations from Robot Demonstrations at Scale

This work introduces Spatial Annotations from Robot Demonstrations with Reliability Calibration (SPARC), a risk-aware framework that automatically labels robot demonstrations with structured spatial annotations and assigns each annotation a reliability score. Structured spatial annotations, such as bounding boxes, object trajectories, and manipulation phase labels, benefit a broad range of robotics applications from training grounded robot policies and embodied foundation models to motion planning and hierarchical task composition. Existing automated pipelines generate such annotations at scale but provide no reliable quality signal: detector confidence is poorly calibrated for annotation correctness, forcing a choice between accepting noisy labels or discarding useful samples. In contrast to existing automated pipelines, SPARC leverages the spatio-temporal structure inherent to robot tasks to generate a reliability signal, reducing noisy labels and retaining more useful samples. We further introduce Interaction-Aware Bench (IA-Bench), a benchmark that measures model accuracy in grounding the locations of interacted objects in robot demonstrations. On 1.7k human-annotated demonstrations spanning diverse embodiments and scenarios, SPARC significantly outperforms detection-only baselines in localization accuracy while retaining three times more samples at high-precision operating points. Our experiments demonstrate that models finetuned on our annotations achieve state-of-the-art results on object-grounding and pointing benchmarks among similarly sized models, while remaining competitive on broader spatial-reasoning suites without manually verified or annotated training data. Furthermore, policies trained on SPARC-generated annotations outperform baselines in cluttered, visually ambiguous real-world scenes. Code, data, and models are available at intuitive-robots.github.io/sparc-labeling.

04.
Nature (Science) 2026-06-10

Mitochondria directly interact with the nuclear pore complex

Mitochondria regulate cellular processes through direct and indirect interactions with other organelles. A well-studied example has been contact with the endoplasmic reticulum at mitochondrial-associated endoplasmic reticulum membranes1, which control pathways including redox and calcium homeostasis2,3. Recent studies have also reported direct mitochondria–nuclear membrane contacts in cancer cells and yeast that promote pro-survival signalling4,5. Here we identify direct interactions between mitochondria and nuclear pores. Using two unbiased proteomic screens, GST pulldown and BioID, we found that VDAC1 was the top mitochondrial candidate that interacts with the filamentous nuclear pore protein RANBP2. In vitro RANBP2 CRISPR knockout, RANBP2 truncation or site-directed mutagenesis of RANBP2–VDAC1 interacting amino acids resulted in reduced mitochondria–nucleus proximity and decreased nuclear ATP and phosphocreatine levels. This was accompanied by a decline in the levels of the nuclear phosphoproteome and downregulation of pathways involved in histone modification, cellular differentiation and transcriptional regulation in vitro. Moreover, deletion of the RANBP2 C-terminal domain in vivo in mice resulted in embryonic lethality due to cardiac and neural crest differentiation defects. Collectively, these results describe a mechanism by which mitochondria directly interact with the nuclear pore complex, a phenomenon critical for regulation of nuclear energetics and cellular differentiation. Undoubtedly, additional roles of this interaction remain to be revealed. Mitochondria interact directly with the nuclear pore complex via VDAC1–RANBP2 binding to sustain nuclear ATP levels.

05.
arXiv (CS.CL) 2026-06-19

Efficiently Representing Algorithms With Chain-of-Thought Transformers

The increasing popularity of reasoning models – language models that output a series of reasoning or thought tokens before producing an answer – is justified, in part, by theoretical results showing that chain-of-thought (CoT) transformers can simulate Turing machines, and thus perform arbitrary computation. However, the Turing machine, while suitable for complexity-theoretic analysis, is not convenient, intuitive, or efficient for discussing algorithms. Algorithms are typically designed and analyzed at a higher level of abstraction, captured by the Word RAM model with random-access memory and unit-cost operations on $\bigO(\log n)$-bit words. As a result, Word RAM algorithms can be substantially more efficient than their Turing machine counterparts, raising the question: Can CoT transformers efficiently simulate Word RAM algorithms? For instance, can they sort $n$ items in $\bigO(n \log n)$ steps or run Dijkstra's algorithm in $\bigO(E + V \log V)$ steps? We answer affirmatively, up to poly-logarithmic overhead. We first establish this for finite-precision transformers with poly-logarithmic width and rightmost unique hard attention, then strengthen the result to two more practical settings with finite width and log-precision: continuous CoT, where reasoning takes the form of vectors rather than tokens, and a hybrid architecture in which transformer layers sit atop a recurrent (linear RNN) layer. In all three cases, we find that CoT can efficiently simulate any Word RAM algorithm with only a poly-logarithmic overhead in $n$. This overhead reduces to log-square when the Word RAM has a ``flat'' instruction set, and only logarithmic for multiplication-free flat instructions – in stark contrast to known CoT simulations of Turing machines, which require quadratic overhead over Word RAM.

06.
bioRxiv (Bioinfo) 2026-06-24

A comprehensive analysis of calreticulin mutants reveals distinct biophysicochemical proprieties with a potential for refined targeted therapies

Calreticulin mutations in myeloproliferative neoplasms result in the replacement of the C-terminus acidic sequence with a positively charged tail that causes pathological activation of the thrombopoietin. The two canonical variants are Type-1 and Type-2. The remaining are mainly classified as Type-1 or Type-2 like based on the wild type sequence retained. Here, we performed in silico biophysicochemical analyses of 76 CALR exon 9 frameshift variants by their sequence and predicted biophysical properties, complemented by structural modeling of the mutant homodimers. Beyond confirming the Type-1 versus Type-2 distinction, we found that the Type 1-like variants form a continuum of charge architecture along which two reproducible subgroups can be identified, rather than sharply separated classes. This work refines the conventional mechanism-based classification into a charge-resolved framework and provides testable hypotheses linking novel-tail chemistry to receptor activation in CALR-mutant neoplasms and paves the way for improved targeted therapies based on individual mutants characteristics

07.
arXiv (quant-ph) 2026-06-19

Passive-User Bell-State Loop-Back Key Establishment without Quantum Detectors at the User Nodes

arXiv:2606.19551v1 Announce Type: new Abstract: We propose and analyze a Bell-state extension of the Loop-Back quantum key distribution architecture for secret-key establishment between two passive users that do not require quantum transmitters or quantum detectors. In the proposed setting, a single active station, Alice, provides the entangled-state infrastructure, retains one qubit of an initially prepared Bell pair, and sends the traveling subsystem through two passive users, denoted by $B_1$ and $B_2$. Each passive user applies a local Pauli operation to the same traveling subsystem, so that the operation observed by Alice is only the effective composition $U_{\mathrm{eff}}=U_2U_1$. After the subsystem returns, Alice performs a Bell-state measurement and, using her private knowledge of the initial Bell state, deterministically identifies the effective Pauli operation. However, the individual factors $U_1$ and $U_2$ remain algebraically hidden from Alice whenever the local choices are uniformly and independently selected. The public effective operation acts as a parity-like constraint: each passive user can infer the operation applied by the other from its own private choice, while the active station learns only the global composition. This construction transfers the essential distributed-transformation mechanism of passive-user Loop-Back QKD to the entangled-state regime. Unlike single-qubit passive-user schemes, whose useful events are intrinsically post-selected, the Bell-state version is limited primarily by the success probability of the Bell-state measurement. We discuss the algebraic structure of the protocol, its interpretation as an infrastructure-assisted mediated key-establishment mechanism, and the physical assumptions required to protect passive Pauli modulators against active injection or Trojan-horse-type attacks.

08.
arXiv (CS.LG) 2026-06-15

LapidaryEngine: Fully Conversational Crystal Generation

arXiv:2606.14215v1 Announce Type: new Abstract: The emergence of Large Language Models (LLMs) has inspired the vision of generating bespoke crystal materials directly from natural-language instructions, enabling users to design materials through intuitive, conversational interaction. Existing text-to-crystal generative models represent important early steps toward this goal, but they suffer from two critical limitations: (i) restricted input formats that require highly structured descriptions (e.g., chemical formulas), and (ii) one-directional generation, where models can map text to crystal but cannot perform the inverse. These limitations prevent fully conversational workflows and hinder alignment with users' inherently ambiguous and evolving desiderata. We address these challenges with LapidaryEngine, the first model to support fully conversational crystal generation. LapidaryEngine accepts free-form natural-language requests and performs iterative refinement and editing in a dialogue-like manner. The key innovation is a pivot representation, a third, intermediate form that enables bidirectional translation between text and crystal structures despite the absence of direct paired datasets. Leveraging this pivot allows robust interpretation of user feedback and precise structural control. We demonstrate LapidaryEngine across diverse tasks, including insulator discovery, stability optimization, compositional modification, and structural editing, showcasing its ability to align generated materials with user intent in an interactive manner.

09.
arXiv (CS.AI) 2026-06-19

Exit-and-Join Dynamics for Decentralized Coalition Formation

作者:

arXiv:2606.19683v1 Announce Type: new Abstract: This paper studies coalition formation as a decentralized dynamical process driven by unilateral exit-and-join decisions. Agents evaluate local moves using the Aumann-Dreze value, so payoffs are computed within the agent's current coalition rather than through a globally negotiated coalition structure. The resulting model links cooperative payoff allocation with noncooperative best-response behavior: a terminal partition is precisely a coalition structure with no admissible, individually profitable exit-and-join deviation. We establish equilibrium characterizations, identify conditions under which the dynamics admit scalar Lyapunov or exact-potential representations, and analyze how switching and acceptance costs shape local stability. Numerical experiments test finite-time stabilization, cost sensitivity, and a special convex-game benchmark.

10.
arXiv (CS.CV) 2026-06-16

Towards Global AI-Driven Cervical Cancer Screening

The global elimination of cervical cancer is a key public health goal set by the World Health Organization (WHO), with screening programs reducing mortality by up to 80%. However, access to experts and biopsy services is limited in low- to middle-income countries (LMICs). Deep learning (DL)-based algorithms offer promising support for screening, but most existing approaches have been developed and validated on private datasets from single countries. We present the first DL-based approach to cervical cancer screening validated on data from multiple countries. Technically, we phrase the problem of detecting and classifying lesions in colposcopy images as a multi-task learning problem, in which we simultaneously perform image-level classification and lesion segmentation. Our model was trained on a private data set of acid stain colposcopy images with manually generated lesion segmentation masks and corresponding histopathological results, employing extensive data augmentation to address image variability. In an in-distribution validation with pathology results serving as ground truth, our algorithm outperformed medical experts (Balanced Accuracy: 0.68 vs 0.64) in CIN1- (Cervical intraepithelial neoplasia grade 1 or lower) versus CIN2+ (grade 2 or higher) classification. External validation on four colposcopy data sets from four countries featuring radical differences in prevalence and patient characteristics yielded superior performance of our method compared to baseline methods. Performance variability across countries was high with AUC values ranging from 0.54 - 0.80. Overall, algorithm performance varied with age, transformation zone (cervical area most prone to lesion development), presence of comorbidities and pathognomonic signs, with comorbidities having by far the largest negative effect. Future work should focus on improving model robustness and generalizability.

11.
arXiv (CS.CV) 2026-06-16

PermaVid: Consistent Video Generation Across Edits via Disentangled Context Memory

Consistent video generation under editing operations requires persistence: when edits modify scene appearance or layout, subsequent generations should remain coherent across time and viewpoints. However, existing memory designs struggle to maintain long-term consistency after such modifications, as stored contexts may become outdated or invalid. To address this, we propose PermaVid, a novel framework built upon a multi-modal context memory that disentangles spatial context into semantic appearance and geometric structure, together with an edit-aware memory update and retrieval strategy that keeps memory evolution aligned with subsequent observations. Specifically, we develop two complementary memory banks: an RGB context memory that captures appearance-aware observations while implicitly encoding geometry, and a depth context memory that preserves geometry-only structure disentangled from semantics. Building on this design, we introduce a memory-guided video generation model that performs multi-modal feature fusion under reference conditions drawn from mixed-modality memory contexts. Experiments demonstrate that our method maintains strong long-term semantic and structural consistency after edits, significantly outperforming state-of-the-art methods.

12.
arXiv (CS.LG) 2026-06-12

Ride, Track, and Recover: Pilot Randomized Trial of a Wearable Digital Self-Management Intervention During a Veteran Endurance-Cycling Program

arXiv:2606.13529v1 Announce Type: cross Abstract: Post-traumatic stress disorder (PTSD) in veterans is characterized by persistent hyperarousal and comorbid anxiety and depressive symptoms that are difficult to monitor and manage outside clinical settings. Thirteen veterans participating in a Project Hero cycling event in Texas were randomized by computer-generated sequence in a naturalistic setting to two arms: (1) digital intervention plus physical activity, or (2) physical activity only, plus a third at-home monitoring control cohort consisting of 7 veterans selected from the broader Project Hero veteran community. Continuous smartwatch sensing combined heart rate and accelerometer features to detect hyperarousal events, which were confirmed in real time by participants. Weekly self-report measures of anxiety, depression, and PTSD severity were collected. Generalized additive mixed models characterized nonlinear trajectories over time. Baseline-normalized hyperarousal trajectories differed significantly across conditions, with the digital intervention group (n=7) showing structured stabilization compared to late-study escalation in the physical-only group (n=3). Both cycling groups exhibited acute symptom improvements during the endurance event; however, the digital intervention group demonstrated a higher overall maintenance of gains. The at-home control group (n=4) showed gradual symptom declines. Perceived precision of ML detections varied substantially across individuals and was positively associated with symptom severity, with higher-severity participants confirming a greater proportion of detected events. These results suggest that coupling wearable detection with digital self-management tools may support stabilization of hyperarousal and symptom improvement while emphasizing the importance of personalization and human-centered design in wearable mental health systems.

13.
arXiv (CS.LG) 2026-06-11

Last-Iterate Convergence of Optimistic Multiplicative Weight Update

arXiv:2606.11773v1 Announce Type: cross Abstract: Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative-Weights Update (OMWU) are two very popular algorithms to solve convex/concave saddle-point problems, where OMWU is the non-Euclidean, entropic version of OGDA. It is known since the '80s that the last iterate of OGDA asymptotically converges to a saddle point in smooth problems. On the other hand, it is unknown if OMWU has the same property. In this paper, I show that OMWU converges asymptotically for smooth convex-concave saddle-point problems, with a small enough constant learning rate. The result does not require uniqueness, strict complementarity, an error bound, or initialization near a solution. The main new ingredient is a boundary argument showing that every cluster point satisfies the inactive-coordinate KKT inequalities. The boundary argument was discovered with assistance from ChatGPT and is documented in the appendix.

14.
bioRxiv (Bioinfo) 2026-06-23

Systematic benchmarking of zero-shot utility and robustness in single-cell transcriptomic foundation models

Single-cell foundation models (scFMs) have been proposed as reusable representations for transcriptomic analysis, yet their practical utility and robustness when applied without task-specific fine-tuning remain incompletely characterized. Here, we systematically evaluated single-cell transcriptomic representations in zero-shot settings across 20 methods, 6 downstream tasks and 1,607 datasets comprising nearly 21.8 million cells. We characterized model behavior along three complementary dimensions: baseline utility, structural robustness, and dataset-level drivers of performance variability. Our large-scale analysis reveals a decoupling between utility and robustness: methods ranking highly on standard benchmarks often show marked instability under shifts in dataset structure. Furthermore, no single model performs uniformly well across tasks. In several tasks, classical statistical representations based on highly variable genes remain competitive under zero-shot conditions. Together, these results define the practical boundaries of zero-shot use in scFMs and provide a large-scale benchmark and decision framework for representation selection in single-cell genomics.

15.
arXiv (CS.CV) 2026-06-19

HumanScale: Egocentric Human Video Can Outperform Real-Robot Data for Embodied Pretraining

Embodied foundation models are expected to benefit from data scaling like large language models, but face a much tighter data bottleneck. Teleoperated real-robot trajectories remain the dominant pretraining source due to their precise action supervision and embodiment alignment, yet their scalability is limited by high collection cost, acquisition difficulty, and low behavioral and environmental diversity. These limitations have sparked interest in egocentric human video as a scalable, substantially lower-cost, and more diverse alternative for embodied model pretraining. However, its effectiveness compared to teleoperated real-robot data remains underexplored. To address this question, we conduct a systematic study comparing egocentric human video and teleoperated real-robot trajectories as pretraining data sources for embodied foundation models, under fixed post-training and validation protocols. Surprisingly, we find that egocentric data, when processed through a carefully designed filtering and labeling pipeline, is not merely a viable substitute for model pretraining but can lead to superior performance. With the same amount of pretraining data, models pretrained on egocentric data achieve a 24% lower validation loss on real-robot action prediction, as well as 52.5% and 90% higher success rates on in-distribution and out-of-distribution real-robot task execution, respectively. This finding verifies a scalable paradigm for embodied foundation models: pretrain on egocentric human video to learn diverse world representations, then adapt with a small amount of labeled real-robot data for action-space alignment. We hope this study encourages broader exploration of egocentric data and offers guidance for data quality assessment before costly robot data collection.

16.
arXiv (quant-ph) 2026-06-11

On-Chip Quantum Randomness Amplification

arXiv:2606.12173v1 Announce Type: new Abstract: Randomness amplification, the task of extracting uniform private bits from biased seeds that may be partly known by a malicious third party, is of central importance in cryptography. The highest security in this task is provided by a class of quantum protocols known as device-independent, which however are challenging to integrate into scalable devices. Semi-device-independent (SDI) protocols are a promising alternative that guarantees security under few natural assumptions, such as bounds on the amount of energy used by the devices. Here, we provide the first demonstration of SDI randomness amplification on an integrated silicon photonic chip, achieving a throughput rate of 20 Mbps suitable for practical applications. This rate is achieved through a novel technique for SDI entropy certification, which delivers strictly tighter von Neumann entropy bounds compared to existing methods and remains valid even if the preparation and measurement devices share quantum correlations. Overall, the methods developed in this work enable the integration of SDI technology into portable telecom devices, opening up a new generation of quantum cryptographic hardware.

17.
medRxiv (Medicine) 2026-06-10

Developmental Associations Linking Childhood Trauma and Early Cannabis Use to Adolescent DNA Methylation and Psychotic-Like Experiences

Background. Psychotic-like experiences (PLEs) index early risk for psychotic disorders and are consistently associated with childhood trauma, yet underlying biological mechanisms remain poorly understood. DNA methylation (DNAm) may capture the biological embedding of early adversity, while adolescent exposures such as cannabis use may modify these processes. We examined epigenome-wide associations of childhood trauma and PLEs, tested the moderating role of early cannabis use, and evaluated DNAm as a potential mediator. Methods. We analysed data from the Avon Longitudinal Study of Parents and Children (ALSPAC), a UK population-based birth cohort. Childhood trauma was assessed prospectively and retrospectively. Epigenome-wide DNAm was measured in peripheral blood at ~17 years using the Illumina 450K array, and PLEs were assessed at 18 using a structured interview. Epigenome-wide association studies were conducted for trauma-DNAm and DNAm-PLEs associations in the final sample (n = 1,457), adjusting for demographic, biological, and technical covariates. Differentially methylated regions (DMRs) were identified using DMRff, followed by functional enrichment analyses. Cannabis use at 15.5 was modelled as a moderator with multiple imputation for missing data. Mediation was tested using the Divide-Aggregate Composite-null Test (DACT). Results. Childhood trauma was associated with widespread DNAm differences, primarily at the regional level, with enrichment in pathways related to cellular stress responses. In contrast, DNAm associated with PLEs was more limited and implicated loci involved in epigenetic regulatory processes. These signatures were largely distinct, and there was no evidence supporting mediation after multiple testing correction. Incorporating cannabis use altered the pattern and extent of DNAm associations, with stronger and more significant signals observed at both CpG and regional levels, although these did not translate into evidence of mediation. Conclusion. Childhood trauma and PLEs show distinct DNAm signatures in adolescence, with trauma-related DNAm reflecting broad stress-related processes and PLE-associated DNAm implicating regulatory mechanisms. We found little evidence that DNAm mediates the trauma-PLE association. Instead, adolescent exposures, particularly cannabis use, may distinctly influence trauma-related epigenetic variation with limited detectable downstream effects on PLEs. These findings support a context-dependent model of epigenetic risk and highlight the need for larger longitudinal studies to clarify causal pathways linking early adversity to psychosis.

18.
arXiv (math.PR) 2026-06-18

A Unified Approach to Beta Moments, Combinatorial Identities, and Random Walks

arXiv:2605.05420v2 Announce Type: replace Abstract: The study of random walks has increasingly been popular across diverse disciplines such as statistics, mathematics, quantum physics, where they are used to model paths consisting of successive random steps in a mathematical space. A fundamental quantity of interest is the probability that a simple symmetric random walk returns to the origin after 2n steps. In this paper, we develop a unified probabilistic approach that connects the return probabilities in arbitrary dimensions with moment representations. Using this framework, we provide probabilistic proofs of several combinatorial identities involving beta and gamma functions, and derive new combinatorial identities in general dimensions.

19.
arXiv (CS.LG) 2026-06-24

Stochastic Expectation Maximization for Robust State-Space Radio Interferometric Imaging

arXiv:2606.23944v1 Announce Type: cross Abstract: State–space models provide a flexible framework for analyzing dynamical systems, yet they often rely on Gaussian assumptions that fail to capture heavy-tailed or outlier-prone measurement noise. We propose a robust estimation scheme for linear state–space models subject to compound-Gaussian noise, as encountered for instance in radio interferometry affected by radio-frequency interference (RFI). The method relies on a Stochastic Approximation Expectation–Maximization (SAEM) algorithm in which the standard E-step is replaced by Monte Carlo sampling of the latent states and noise texture through closed-form Gibbs updates, enabling tractable inference despite the heavy-tailed likelihood. Numerical experiments show that the proposed method significantly improves reconstruction fidelity and robustness to RFI, outperforming a Gaussian EM algorithm and even an oracle RTS smoother. These results highlight the benefits of heavy-tailed state–space modeling and SAEM-based inference in interference-dominated imaging scenarios.

20.
bioRxiv (Bioinfo) 2026-06-14

Generative design of antigen-specific T-cell receptor sequences with a conditional diffusion model

T cell receptor (TCR)-based immunotherapy holds immense potential for treating cancers and infectious diseases, where highly antigen-specific TCR recognition is crucial for adaptive immunity against tumors and pathogens. Engineering or de novo generation of the complementarity-determining region 3 (CDR3) loops of TCRs using artificial intelligence offers a powerful alternative to designing reactive TCRs rather than laborious experimental screening. However, current in silico approaches are constrained by weak conditional guidance, limited flexibility, and a lack of rigorous functional validation. To address these limitations, we introduce TCRDiff, a generative diffusion framework for designing antigen-specific TCRs conditioned on peptide-MHC (pMHC) targets and germline-encoded variable genes. By leveraging pre-trained knowledge from massive T-cell repertoires and TCR-pMHC recognition data, TCRDiff generates CDR3{beta} sequences with state-of-the-art fidelity to native binding TCRs through a denoising diffusion process. Furthermore, incorporating the interface geometry features generated TCR-pMHC complexes with superior structural plausibility. As a proof of concept, we deployed TCRDiff in a systematic pipeline to design candidate TCRs for immunotherapy. In vitro activation assays validated that TCRDiff-generated TCRs specifically recognize the MAGE-A3 epitope with minimized off-target cross-reactivity. Together, TCRDiff establishes a powerful, validated computational paradigm to accelerate the development of TCR-based immunotherapies.

21.
medRxiv (Medicine) 2026-06-11

Hantavirus Disease in Uruguay: Trends and Mortality Before and During the COVID-19 Pandemic.

Introduction: Hantavirus disease is an emerging and potentially severe zoonosis of global distribution. In Uruguay, it is transmitted by rodents inhabiting peridomestic, suburban, and rural areas. Global incidence is estimated at 150,000 to 200,000 cases per year, with up to 300 annual cases in the Americas. Since 1997, Uruguay's Ministry of Public Health (MPH) has monitored Hantavirus cardiopulmonary syndrome (HCPS), the most common clinical presentation in the region. By 2019, a total of 271 cases had been identified in the country, with an estimated mortality rate of nearly 50%. Objectives: To describe the clinical, epidemiological, and occupational characteristics of patients with Hantavirus disease in Uruguay during the pre-pandemic (2018-2019) and pandemic (2020-2021) periods. Methods: A descriptive, cross-sectional, observational study was conducted, including all serologically confirmed cases of Hantavirus infection reported to the MPH between 2018 and 2021. Clinical and demographic data were extracted from the mandatory reporting form for zoonotic diseases. Incidence and case fatality rates were calculated, and factors associated with fatal outcomes were analyzed. Results: A total of 58 confirmed cases were identified between 2018 and 2021. Most patients were male (62%), with a mean age of 36.5 years (SD 16). A decline in incidence was observed during 2020-2021, with no significant change in case fatality. Direct rodent exposure was the most frequently associated risk factor. Montevideo and Canelones were the most affected departments. Renal and pulmonary involvement were significantly associated with mortality. Conclusion: Hantavirus remains a relevant public health concern in Uruguay. Although a decrease in incidence was observed during the COVID-19 pandemic years, case fatality rates remained high. The findings underscore the need for sustained surveillance and early recognition, particularly in urbanizing regions.

22.
arXiv (math.PR) 2026-06-19

Maximal rigidity of random measure and uniqueness pairs: stealthy processes, quasicrystals and periodicity

arXiv:2512.10686v2 Announce Type: replace Abstract: This article investigates the phenomenon of maximal rigidity in spatial processes, where perfect interpolation of the process is possible from partial information, specifically, from its restriction to a strict subdomain, often resulting in a trivial tail $\sigma$algebra. A classical example known since the 1930's is that a time series is fully determined by its values on the negative integers if its spectrum has a gap, or at least a sufficiently deep zero. We extend such results to higher dimensions and continuous settings by establishing a connection with the concept of uniqueness pairs, rooted in the uncertainty principle of harmonic analysis. We present several other manifestations of this principle, unify and strengthen seemingly unrelated results across different models: quasicrystals and stealthy processes are shown to be maximally rigid on cones, and discrete integer-valued processes are necessarily periodic when they have a simply connected spectrum. Finally, we identify a surprising class of continuous fields with seemingly standard behavior, such as linear variance and finite dependency range, that undergo a phase transition: they are perfectly interpolable on B(0, $\rho$) for $\rho$ ___ 2 $\pi$ but exhibit no rigidity for $\rho$ > 2.

23.
medRxiv (Medicine) 2026-06-11

Advancing Clinical Implementation of Cardiovascular Polygenic Risk Scores Through Patient-Level Robustness Assessment

Background and Aims: Polygenic risk scores (PRSs) for atherosclerotic cardiovascular disease (ASCVD) can perform equivalently at the population level yet disagree for individual patients. We examined whether such intra-individual variability reflects genuinely complementary risk information or mainly statistical and methodological uncertainty, and whether it affects clinical classification once PRSs are integrated into SCORE2-OP. Methods: In 4,137 ASCVD-free participants of the CoLaus|PsyCoLaus cohort (478 incident events over a median 14.4 years), we identified 16 ASCVD-PRSs with practically equivalent population-level performance using Bayesian equivalence testing. We quantified intra-individual variability (standard deviation, coefficient of variation, intraclass correlation, Cohen's kappa, extreme discordance), tested whether discordance exceeded chance, decomposed scores into shared and unique genetic components, and assessed variability after integration into SCORE2-OP, benchmarked against perturbation of systolic blood pressure. Results: For a typical individual, risk estimates varied by 18 percentile points across PRSs. Discordance matched chance expectations under a shared-signal model, with no distinct phenotypic profile among discordant individuals, and predictive power resided overwhelmingly in the shared genetic component. Variability tracked PRS size and weighting rather than distinct variants. After integration into SCORE2-OP, 75.6% of participants were placed in different categories by at least one model and 54.6% as both low and high risk; instability was concentrated near guideline thresholds and far exceeded that from blood-pressure measurement error. Conclusions: Equivalent population-level performance is not sufficient to treat PRSs as interchangeable at the individual level, and methodological standardisation and pragmatic clinical trials remain necessary to determine whether PRS integration improves long-term cardiovascular outcomes.

24.
arXiv (CS.LG) 2026-06-16

libhmm: A Modern C++20 Library for Hidden Markov Models with Correct MLE Emission M-Steps

作者:

arXiv:2605.29208v2 Announce Type: replace-cross Abstract: We describe libhmm, a C++20 library for Hidden Markov Model parameter estimation, sequence decoding, and model selection. libhmm addresses two gaps in existing software: the absence of a well-maintained, zero-dependency C++ HMM library suitable for embedding in production systems, and the widespread use of method-of-moments (MOM) approximations in the emission distribution M-step of the Baum-Welch algorithm. The library implements correct maximum likelihood estimators for sixteen scalar emission distributions, including an ECME algorithm for the location-scale Student-t distribution, Newton-Raphson maximization for Gamma, Beta, Weibull, and Negative Binomial distributions, and the von Mises distribution for circular data. All forward-backward and Viterbi calculations operate in full log-space. SIMD acceleration is provided for AVX-512, AVX2, SSE2, and ARM NEON via compile-time dispatch with scalar fallback. Version 4 adds multivariate observation support via the BasicHmm template, with three multivariate emission families (diagonal Gaussian, full-covariance Gaussian, and independent components) each with correct weighted MLE M-steps. Python bindings are available via the companion package pylibhmm. We compare libhmm against established C and C++ HMM libraries and against published R reference packages on seven real-data benchmarks, and discuss the architectural tradeoffs made in the design.

25.
arXiv (CS.CV) 2026-06-17

GASE: Gaussian Splatting-Based Automated System for Reconstructing Embodied-Simulation Environments

Training embodied agents in the real world requires skilled operators and expensive hardware. Simulation environments offer a compelling alternative by enabling large-scale, cost-effective data augmentation. Consequently, rapidly constructing high-fidelity simulation scenes with a minimal sim-to-real gap has become a critical objective in robot learning. While reconstruction-based methods provide superior visual quality, current workflows are hindered by inefficient data acquisition and subpar foreground object extraction. We thus propose GASE, a highly automated system for simulation scene construction. GASE leverages multi-view video streams from panoramic camera arrays to enable rapid environment scanning. To ensure high-quality asset generation, our pipeline introduces a camera-pose-based strategy that robustly extracts objects across frames in the 2D domain, followed by high-fidelity scene inpainting. Foreground objects and the static background are then reconstructed independently and seamlessly imported into physics simulators for policy training. Extensive experiments demonstrate that GASE outperforms existing 3D Gaussian-based methods in segmentation accuracy by over 10\% while achieving state-of-the-art inpainting quality. Furthermore, real-robot deployments across manipulation and navigation tasks maintains a performance gap of less than 10\% compared to policies trained purely on real-world data. These results confirm that GASE provides an efficient and highly effective solution for bridging the sim-to-real gap. Code will be released.