Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
medRxiv (Medicine) 2026-06-18

The relationship between serotonin transporter occupancy and extracellular serotonin concentration is hyperbolic, not linear: implications for safely tapering antidepressants

Background: Hyperbolic tapering is an increasingly recognized approach for discontinuing serotonin reuptake inhibitor (SRI) antidepressants that involves non-linear dose reductions with equal stepwise reductions in serotonin transporter (SERT) occupancy to mitigate withdrawal symptoms. Its theoretical basis is the hyperbolic relationship between SRI dose and SERT occupancy reported in radioligand imaging studies. Hyperbolic tapering implicitly assumes that changes in SERT occupancy approximate changes in biologic effect and withdrawal risk. Because SERT occupancy plateaus across the therapeutic dose range of SRIs, this framework predicts relatively small biologic effects and withdrawal risk within this range. However, SERT occupancy influences serotonergic activity only indirectly via its effects on extracellular serotonin concentrations, and the relationship between these two variables is poorly characterized. Methods: We developed a two-pathway clearance model derived from mass-action kinetics to evaluate the steady-state relationship between SERT occupancy and extracellular serotonin concentrations under chronic SRI treatment. Results: Our analysis indicates that serotonin concentrations increase hyperbolically as transporter occupancy increases, suggesting that biologically meaningful differences in serotonergic signaling persist across the therapeutic dose range of SRIs despite plateauing occupancy. Conclusions: Our model predicts a hyperbolic relationship between SERT occupancy and extracellular serotonin concentrations, suggesting that changes in occupancy may not map proportionally onto serotonergic effect. These findings provide a potential mechanistic explanation for dose-dependent clinical effects of SRIs despite plateauing transporter occupancy and generate testable hypotheses regarding antidepressant tapering strategies. Empirical validation is warranted.

02.
arXiv (CS.AI) 2026-06-16

SDS-LoRA: Overcoming Anisotropic Gradient Scaling in Low-Rank Adaptation

arXiv:2606.16454v1 Announce Type: cross Abstract: Low-Rank Adaptation (LoRA) enables efficient adaptation of large pre-trained models to downstream tasks by parameterizing weight updates with low-rank matrices. In this paper, we investigate the limitations of the LoRA parameterization from a geometric perspective. Specifically, we show that when a full fine-tuning gradient is backpropagated to the low-rank matrices, it undergoes anisotropic scaling driven by their singular values. We argue that this phenomenon is undesirable because it distorts the full fine-tuning gradient by skewing it toward dominant singular directions while suppressing others. Our analyses demonstrate that anisotropic gradient scaling reduces the effective rank of the low-rank matrices' gradients and results in suboptimal alignment between the full fine-tuning gradient and its low-rank approximation in LoRA, thereby exacerbating the gap to full fine-tuning. To address these limitations, we propose a new low-rank parameterization, SDS-LoRA, which structurally decouples singular values from the backward pass. Our method ensures that the full fine-tuning gradient backpropagates only through the orthonormal bases of the low-rank matrices' subspaces, independent of their scales. Convergence analysis demonstrates that while LoRA's convergence rate degrades with the condition number of the low-rank matrices, SDS-LoRA remains independent of it. Experimental results across natural language and vision benchmarks show that SDS-LoRA improves loss convergence and reduces the gap to full fine-tuning, significantly enhancing adaptation performance.

03.
arXiv (CS.LG) 2026-06-16

Early Anomaly-Onset Detection based on Wigner–Ville Distribution Slice Spectra: A Transmission-Grid Test Case

arXiv:2606.15856v1 Announce Type: cross Abstract: Operational disturbance monitoring in power networks requires decisions to be made from waveform windows as they arrive, rather than from completed records after the event. This study evaluates full-vector Wigner–Ville Distribution Slice (WVDS) spectra for sequential anomaly-onset detection in high-voltage grid-voltage waveforms. The approach keeps the bilinear midpoint interaction structure of the Wigner–Ville distribution and represents each 128-sample voltage window by a 128-dimensional slice spectrum, avoiding manually selected fault-frequency markers. WVDS is used with a baseline-normalized deviation (BND) score and is compared against the BND of Fast Fourier Transform (FFT-BND), raw-window autoencoders, FFT autoencoders, and WVDS autoencoders under the same thresholding and three-window persistence rule. A synthetic autoencoder–clustering teacher is used to select RTE fault records that start from an initially normal region and then transition to anomalous behavior. On the filtered test set, FFT-BND achieves the highest sensitivity, whereas WVDS-BND provides the lowest false-alarm operating point, reducing record-level pre-onset false alarms to 0.69%. The autoencoder comparison follows the same selectivity pattern: WVDS reconstruction decreases false alarms relative to FFT reconstruction but misses more examples. The results indicate that preserved WVD cross-term information can form a selective representation for online grid-waveform anomaly monitoring when false alarms are costly.

04.
arXiv (CS.LG) 2026-06-19

Enhancing Graph Neural Networks Using Proximity Graphs for Dust Source Emission Forecasting

arXiv:2606.19825v1 Announce Type: new Abstract: Accurate prediction of dust source emissions is critical for mitigating the significant environmental and health hazards posed by dust storms. Traditional forecasting methods often struggle to capture the complex spatiotemporal dynamics of these phenomena. In this paper, we demonstrate that proximity graphs enable Graph Neural Networks (GNNs) to effectively model the intricate spatial and temporal relationships between data points. Specifically, we use proximity graphs–such as Delaunay triangulation, Gabriel graph, k-Nearest Neighbor graph, and Yao graph–as the input for GNNs (including GraphSAGE, Graph Convolutional Networks, and Graph Attention Networks) to perform message passing. Our approach highlights the effectiveness of integrating proximity graphs with GNNs for robust and accurate dust source forecasting. To emphasize the importance of proximity graph representations, we compare our method against GNNs using random graphs for message passing. The results show that GNNs with proximity graphs significantly outperform those with random graphs and are also far superior to Long Short-Term Memory (LSTM) model in dust source emission forecasting.

05.
arXiv (CS.CL) 2026-06-16

When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions

Chain-of-thought (CoT) reasoning has become the default strategy for enhancing LLM capabilities, yet its application raises a fundamental question: when is explicit reasoning actually beneficial? Empirical evidence reveals a striking paradox: CoT often provides marginal or even negative gains on factual and open-ended tasks while multiplying token consumption. In this work, we show that LLM reasoning is not a static property of tasks or models, but a dynamic decoding state that emerges during generation. Through systematic analysis, we find early-stage entropy dynamics provide a reliable signal of this state: tasks benefiting from CoT exhibit consistent entropy reduction, while others display unstable or increasing patterns. This behavior can be interpreted as a phase-transition-like shift from a high-entropy exploratory regime to a low-entropy structured reasoning regime. Based on these insights, we propose EDRM (Entropy Dynamics-based Reasoning Manifold), a lightweight and training-free routing framework that leverages early decoding entropy to adaptively select inference strategies. EDRM embeds entropy trajectories into a compact and interpretable manifold representation, enabling both zero-shot deployment and fine-grained instance-level adaptation. Across 15 benchmarks and 4 LLMs of varying scales and architectures, EDRM consistently outperforms static baselines. At the dataset level, EDRM achieves 41–55\% token reduction while improving accuracy with as few as 50 calibration samples. At the instance level, it further improves accuracy by up to 4.7\% while maintaining 27–45\% token savings. These results suggest that reasoning should be invoked selectively rather than by default, and demonstrate the effectiveness of entropy-driven decoding control for efficient and adaptive LLM inference.

06.
arXiv (CS.LG) 2026-06-15

When to Write and When to Suppress: Route-Specialized Dual Adapters for Memory-Assisted Knowledge Editing

作者:

arXiv:2606.14668v1 Announce Type: new Abstract: Knowledge editing systems must update selected facts while preserving nearby but irrelevant behavior. This paper studies this problem in a memory-assisted setting where an edit memory is retrieved at inference time and a parameter-efficient adapter corrects the model's object preference. We argue that the central design question is not only how to write an edit, but also when to suppress it. We introduce \method{}, a route-specialized dual-adapter editor. A relevance router first decides whether a prompt should receive an edit memory. Routed prompts use an edit adapter trained to prefer the new object over the original object; unrouted non-direct prompts use a separate locality adapter trained to preserve or restore the original-object preference. We evaluate \method{} on three 1,000-case protocols, \cf{}, \zsre{}, and \mquake{}, under the same memory protocol and two 7B/8B base models. On Llama-3.1-8B-Instruct, \method{} obtains the best overall probability-preference accuracy on all three benchmarks: 0.8180 on \cf{}, 0.8946 on \zsre{}, and 0.9922 on \mquake{}. The same trend holds on Qwen3-8B. Router ablations show that the relevant memory boundary differs across datasets: a lexical neural router is safest on \cf{}, while BGE embedding routing is better on \zsre{} and \mquake{}. Component and module ablations show that the gain mainly comes from separating edit injection from off-route suppression rather than from simply increasing LoRA capacity.

07.
arXiv (CS.CV) 2026-06-16

Classifying by Proxy: Explainable and Reproducible Ensemble of Proxy Tasks for Child Sexual Abuse Imagery Classification

Child Sexual Abuse Imagery (CSAI) classification systems are needed solutions for lessening the psychological impacts often felt by law enforcement agents responsible for evaluating these materials and for efficient removal of these materials from the web. However, due to the nature of the task, researching and developing such systems is not a trivial endeavor. The images are highly sensitive, and the related datasets are under restrictive access regimes, which means most studies in the area are not reproducible or distributable and are therefore hard to compare and validate. More concerning still, most models for this task today lack an aspect often desired by law enforcement agents: explainability. In this paper, we apply an ensemble of Proxy Tasks – tasks that correlate to CSAI classification – yielding improvements in reproducibility, explainability, and security for distribution. This concept is applied for the first time to real CSAI, with a novel selection of relevant Proxy Tasks (selected from the CSAI literature) and training adaptations to the original framework. Our final model achieves competitive results, yielding 91.9% balanced accuracy on the RCPD dataset with the best Proxy Task combination. We furthermore contrast these results with the best-in-class representation learning model, DINO, and show that our ensemble improves accuracy and provides explanations for its classification results, a feature that a single deep learning model can seldom provide.

08.
arXiv (CS.AI) 2026-06-11

Federated continual learning: A comprehensive survey on lifelong and privacy-preserving learning over distributed and non-stationary data

arXiv:2606.11272v1 Announce Type: cross Abstract: Federated Learning (FL) enables collaborative and privacy-preserving model training across distributed clients, but most existing FL systems implicitly assume data stationarity. In real-world settings-such as healthcare, industrial IoT (IIOT), cybersecurity, and smart cities-data streams are inherently non-stationary, leading classical FL methods to suffer from performance degradation, instability, and catastrophic forgetting. Continual Learning (CL) addresses learning under evolving data distributions but has been largely studied in centralized settings, overlooking key constraints of federated systems, including privacy, limited communication, and client heterogeneity. Federated Continual Learning (FCL) emerges at the intersection of FL and CL, aiming to support lifelong, adaptive, and privacy-aware learning over distributed and non-stationary data. This survey provides a comprehensive and systematic overview of FCL. We first present a formal definition of the FCL problem and clarify its distinctive characteristics. We then analyze the limitations of classical FL under non-stationary conditions, highlighting how CL principles support long-term adaptation. To organize the rapidly growing literature, we propose a multi-dimensional taxonomy of FCL approaches. Furthermore, we review representative application domains and data modalities, summarize commonly used evaluation metrics, and discuss experimental perspectives for assessing long-term performance and forgetting. Finally, we highlight key open challenges, including handling extreme heterogeneity under temporal drift, designing scalable and privacy-preserving memory mechanisms, and establishing standardized benchmarks. This survey aims to serve as a reference and a roadmap for advancing FCL toward robust and deployable real-world systems.

09.
arXiv (CS.CV) 2026-06-19

HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing

Foundation models are transitioning from offline predictors to deployed systems expected to operate over long time horizons. In real deployments, objectives are not fixed: domains drift, user preferences evolve, and new tasks appear after the model has shipped. This elevates continual learning and instant personalization from optional features to core architectural requirements. Yet most adaptation pipelines still follow a static weight paradigm: after training (or after any adaptation step), inference executes a single parameter vector regardless of user intent, domain, or instance-specific constraints. This treats the trained or adapted model as a single point in parameter space. In heterogeneous and continually evolving regimes, distinct objectives can induce separated feasible regions over parameters, forcing any single shared update into compromise, interference, or overspecialization. As a result, continual learning and personalization are often implemented as repeated overwriting of shared weights, risking degradation of previously learned behaviors. We propose HY-WU (Weight Unleashing), a memory-first adaptation framework that shifts adaptation pressure away from overwriting a single shared parameter point. HY-WU implements functional (operator-level) memory as a neural module: a generator that synthesizes weight updates on-the-fly from the instance condition, yielding instance-specific operators without test-time optimization.

10.
bioRxiv (Bioinfo) 2026-06-21

Expanding the GUSome: Structure-guided identification and characterization of gut microbial β-glucuronidases

The gut microbiome-encoded {beta}-glucuronidase (GUS) enzymes have a significant effect on human physiology through their deglucuronidation activity on endogenous and exogenous glucuronides. GUS activity also significantly influences the pharmacokinetics, efficacy and toxicity of various drugs including chemotherapeutic drugs. Given their crucial role in drug metabolism, GUS enzymes have emerged as promising targets for therapeutic intervention. Here, we have identified and characterized 79 unique GUS enzymes through a structure-guided approach. Structural modelling of these GUS enzymes revealed a conserved core and active-site residues with significant variations in the number and nature of the C-terminal domains. A new classification system based on the number and type of additional C-terminal domains is presented for the GUS proteins. Further, GUS enzymes have been categorized into different loop categories linked to their substrate preferences. The relationship between domain architecture and loop-type is explored by sequence similarity network analysis. We could successfully express, purify and validate GUS processing capability of a panel of identified GUS proteins. The nature of oligomer organization has been deciphered by SEC and DLS studies. Further, we have identified additional GUS enzymes capable of processing SN-38G, glucuronidated form of anticancer drug, irinotecan. These newly identified GUS enzymes will offer valuable insights into gut microbial GUS diversity and their role in understanding the population-specific drug-induced adverse effects on human health.

11.
arXiv (CS.CL) 2026-06-18

Enhancing Multilingual Reasoning via Steerable Model Merging

Model merging is an effective technique for composing the capabilities of a multilingual model and a reasoning model. It has achieved promising generalization in multilingual reasoning tasks by aligning feature spaces of different models. However, the merged single model often fails to address the conflicts between source models, leading to suboptimal performance. In other words, the one-size-fits-all merging strategy may not align with the characteristics of different inputs which may require prioritizing certain models over others. To this end, we propose a Steerable Model Merging (ST-Merge) framework to modulate the contribution of each source model. To realize this idea, we introduce a gated cross-attention mechanism to weight or filter the two attended source models in an adaptive manner. Extensive experiments demonstrate that ST-Merge consistently outperforms multiple strong baselines on four multilingual reasoning benchmarks across 21 different languages.

12.
arXiv (math.PR) 2026-06-16

The existence of invariant sublinear expectations for $G$-SDEs

arXiv:2606.15203v1 Announce Type: new Abstract: In this paper, we study the existence of invariant sublinear expectations of Markovian semigroups on sublinear expectation spaces. To achieve this, we establish a complete metric space of sublinear expectations, on which we extend Harris' method to the nonlinear setting on the convergence of sublinear semigroups. We then explore two cases of $G-$diffusions by studying the Lyapunov function and the local Doeblin condition. One is the $G-$Brownian motion on the unit circle which is the case studied in Feng and Zhao [Zhaonon], but with the new method. Another is the multidimensional $G-$SDEs on the whole space $\mathbb{R}^d$. We establish, for the first time in the literature, the existence of the invariant sublinear expectation for $G-$SDEs under the non-degenerate and weakly dissipative assumption. For this, we prove that for a class of $G-$SDEs, the $G-$expectation can be represented as the supremum of the semigroup of a family of SDEs, of which the regularity is obtained by considering the Bismut-Elworthy-Li formula and the Denis-Hu-Peng representation for the distribution of $G-$Brownian motions.

13.
arXiv (CS.CV) 2026-06-16

Explainable Flood Segmentation on Sentinel-1 SAR Imagery: A Comparative Study of CNN and Transformer Architectures

Rapid and accurate flood prediction is essential for disaster response and mitigation planning. Synthetic Aperture Radar (SAR) sensors in satellites are well-suited for this purpose because they operate independently of weather and daylight conditions. Although SAR-based data enable all-weather flood monitoring, distinguishing flooded land from permanent water remains a significant challenge, particularly when flooding is defined strictly as inundated land. This study provides a comprehensive comparison of convolutional neural network (CNN) and vision transformer architectures for multi-class flood segmentation using Sentinel-1 SAR imagery, specifically trained to separate flooded land from permanent water bodies and land. Three state-of-the-art (SOTA)CNN-based models, U-Net, U-Net++, and DeepLabV3 with ResNet-34 backbone, and three SegFormer variants (b0,b1,b2) were evaluated in two benchmark datasets, the ETCI NASA dataset and SenFloods11, using scene-based data splits to ensure a realistic assessment of spatial generalization. The results demonstrate that SegFormer-b2 significantly outperforms the U-Net baseline on the ETCI dataset (higher flood IoU across all 7 test scenes in the Wilcoxon signed-rank test), while after fine-tuning on Sen1Floods11, the advantage narrows to within the range of scene variability and is concentrated in spatially fragmented flood events. The study includes both qualitative and quantitative explainability techniques to visually comprehend model decisions and systematically assess prediction reliability. Qualitative analysis reveals that SegFormer-b2 produces more spatially coherent Grad-CAM activations focused on flood-relevant features, while U-Net generates more informative uncertainty estimates along flood boundaries.

14.
arXiv (CS.CV) 2026-06-17

Evaluating Synthetic Data Generation for Domain Generalization in Fetal Brain MRI Segmentation

Fetal brain tissue segmentation from magnetic resonance imaging (MRI) is crucial for studying neurodevelopment, but remains challenging due to data heterogeneity and limited annotations. Domain randomization (DR) has recently emerged as a promising strategy for single-source domain generalization by synthesizing training images with randomized artifacts, contrast, and resolution. In this work, we investigate how to maximize the out-of-domain (OOD) generalization of DR-based methods. We evaluate several synthetic data generation strategies for DR, with a particular focus on our recently proposed framework, FetalSynthSeg. We show that simple Gaussian mixture-based intensity modeling outperforms more complex physics-based simulations, and that intensity clustering (subdividing tissue classes based on intensity) improves OOD robustness. Evaluated on 348 fetal subjects from four sites spanning 0.55-3T and both T1w and T2w contrasts, FetalSynthSeg reaches state-of-the-art performance on several FeTA 2024 testing datasets (80-85 Dice score) and, for the first time, offers robust segmentation on modalities other than T2w for fetal brain segmentation (80 Dice on dHCP-T1w dataset). Compared with state-of-the-art methods such as BOUNTI, nnU-Net ensemble, and the FeTA 2024 winner, FetalSynthSeg delivers comparable or superior accuracy while maintaining strong robustness across domain shifts. Our code, model weights, and Docker image ready for easy inference are available at https://hub.docker.com/r/vzalevskyi/fetalsynthseg.

15.
arXiv (CS.AI) 2026-06-16

Sample from What You See: Visuomotor Policy Learning via Diffusion Bridge with Observation-Embedded Stochastic Differential Equation

arXiv:2512.07212v3 Announce Type: replace Abstract: Imitation learning with diffusion models has advanced robotic control by capturing the multi-modal action distributions. However, existing methods typically treat observations only as high-level conditions to the denoising network, rather than integrating them into the stochastic dynamics of the diffusion process itself. As a result, the sampling is forced to begin from random noise, weakening the coupling between perception and control and often yielding suboptimal performance. We propose BridgePolicy, a generative visuomotor policy that directly integrates observations into the stochastic dynamics via a diffusion-bridge formulation. By constructing an observation-informed trajectory, BridgePolicy enables sampling to start from a rich and informative prior rather than random noise, substantially improving precision and reliability in control. A key difficulty is that diffusion bridge normally connects distributions of matched dimensionality, while robotic observations are heterogeneous and not naturally aligned with actions. To overcome this, we introduce a semantic aligner to unify the visual and state inputs and align the observations with action representations, making diffusion bridge applicable to heterogeneous robot data. Extensive experiments across 52 simulation tasks on three benchmarks and 5 real-world tasks demonstrate that BridgePolicy consistently outperforms state-of-the-art generative policies. Our code is available at https://jianghcsr.github.io/BridgePolicy_page/.

16.
arXiv (quant-ph) 2026-06-11

An iterative Ising decoder for quantum error correction codes

arXiv:2606.12301v1 Announce Type: new Abstract: The Ising framework maps the decoding problem in quantum error correction onto ground-state optimization of a classical Hamiltonian, in which $X$-$Z$ error correlations enter as cross terms. Under phenomenological depolarizing noise, the exact joint formulation contains up to 8-body interactions for the toric code and 10-body for the $6.6.6$ color code. These high-order terms degrade solver convergence, inflate runtime, and raise the auxiliary spin overhead when embedding into native 2-body Ising hardware. In this work, we propose the iterative low-order decoding (ILOD) algorithm, which alternates between $X$- and $Z$-type sub-Hamiltonians, approximating cross-type correlations through Bayesian priors that reweight each type's couplings using the other type's inferred error configuration. This halves the maximum body count of interaction terms in the Hamiltonian, accelerating the solver, restoring convergence at larger code distances, and reducing the total spin count for 2-body embedding by a factor of $2.5$. For the toric code, ILOD attains a threshold of $4.73%$ versus $4.83%$ for the joint formulation, with the empirical runtime ratio scaling as $(0.81)^d$. For the $6.6.6$ color code, their thresholds agree within statistical uncertainty for small code distances, and ILOD remains convergent for larger distances where the joint formulation fails to converge despite a larger annealing budget.

18.
arXiv (quant-ph) 2026-06-16

Quantum Nonlocal Games on Graph Ensembles

arXiv:2606.16784v1 Announce Type: new Abstract: Quantum entanglement is one of the most striking discoveries in all of science. This effect allows, for instance, two spatially separated agents to coordinate their actions, without communication, to an extent that is both counter-intuitive, and provably impossible by any other physical means. A recently discovered example is that of mobile agents (players) performing spatial coordination tasks such as rendezvous, where the agents aim to meet on a network without communication. Until now, demonstrations of this advantage have relied on highly idealized conditions: agents are assumed to have complete knowledge of the topography, and experiments have been restricted to simulations using data generated by qubits within a single quantum processor. Here we address both limitations by developing a theory for graph ensembles that capture topographical uncertainty and by experimentally demonstrating the advantage in rendezvous scenarios between physically separated ion-trap systems with access to remote entanglement. Moreover, we simulate a broader set of problems on superconducting hardware. Surprisingly, when players are given the ability to gather more local information the quantum advantage increases – a feat impossible by classical means. Our findings establish a concrete route toward practical quantum advantages in motion coordination problems. More broadly, they point to a new way of using portable quantum devices to enhance collective decision-making in uncertain environments.

19.
arXiv (CS.CL) 2026-06-11

Measuring Semantic Progress in Multi-turn Dialogue via Information Gain

Evaluating multi-turn dialogue is challenging because quality emerges across turns rather than within individual responses. We focus on a key dimension of information-seeking dialogue: semantic progress, defined as the accumulation of new, question-relevant, and non-redundant information over the course of a conversation. We formalize semantic progress as question-conditioned uncertainty reduction and introduce an information-theoretic metric that approximates it in embedding space. Our main estimator uses a tractable Gaussian formulation with closed-form updates, while a complementary maximum-entropy argument shows why log-determinant structure arises more broadly when only second-order embedding information is retained. This formulation yields desirable theoretical properties, including monotonicity, additive decomposition of total information gain across turns, and diminishing returns for redundant evidence. Unlike LLM-as-a-judge approaches, our metric requires no autoregressive inference at evaluation time and is fully reproducible for a fixed embedding model. Experiments on MT-Bench, Chatbot Arena, and UltraFeedback show that the proposed metric achieves competitive agreement with human judgments despite targeting only semantic progress, with improved alignment on MT-Bench and UltraFeedback compared to several LLM-based judges. Notably, the method remains effective with lightweight embedding models under CPU-only execution, indicating that semantic progress can be captured without reliance on large model capacity.

20.
arXiv (CS.LG) 2026-06-19

When Calibration Fails the Vulnerable Hospital: Federated Conformal Risk Control via Risk-Curve Shrinkage

arXiv:2606.20115v1 Announce Type: new Abstract: Conformal risk control (CRC) provides distribution-free guarantees on segmentation quality by calibrating a prediction-set threshold on held-out data. In federated deployments, the standard approach pools calibration scores across sites into a single threshold. We provide the first quantification, on real multi-institutional brain tumor data (FeTS-2022, 1,251 subjects, 20 institutions), showing that this naive pooled CRC protects the average hospital but violates coverage at 40% of individual institutions, with the worst site exceeding the target false-negative rate by 7.8 percentage points. The naive alternative, per-site local CRC, largely restores coverage but inflates prediction sets by 83x, rendering them clinically useless. We propose a shrinkage-based federated CRC protocol: each site transmits only its empirical risk curve (G scalars) to a server, which computes a shrinkage-regularized threshold per site. A single hyperparameter n0 smoothly trades worst-case coverage for prediction-set efficiency; leave-one-site-out sensitivity analysis identifies n0=19, achieving 2.7/20 violations at 2.0x stretch. We further show that direct Lagrangian optimization of coverage budgets fails, concentrating risk on vulnerable hospitals, and that the finite-sample correction term is essential: removing it triples violations. The marginal CRC guarantee is preserved by construction under the stated site-mixture assumption; per-site coverage is validated across four targets with three seeds. No patient-level images, masks, or per-volume scores leave any site.

21.
PLOS Computational Biology 2026-06-05

Heuristic multi-site optimization for protein sequence design using Masked Protein Language Models

作者:

by Lijuan Wang, Yuze Wang, Chen Qiu, Liwei Xiao, Xianliang Liu, Junjie Chen Protein sequence design for tailored functional properties is a fundamental task in protein engineering, with critical applications in drug discovery and therapeutic development. Efficient navigation of the combinatorial vastness of protein sequence space to identify functional variants remains a formidable challenge. Conventional approaches, which predominantly rely on template-based local search or single-residue mutagenesis, are constrained by their susceptibility to local optima and their potential risk of destabilizing native structural stability. In this study, we introduce ProtHMSO, a heuristic multi-site optimization framework leveraging masked protein language models (ProtLMs) for context-aware sequence exploration. ProtHMSO mimics natural evolutionary mechanisms by employing ProtLM-derived substitution probabilities to guide heuristic searches for synergistic mutations, thereby constraining combinatorial search spaces through evolutionary and biophysical priors. ProtHMSO is further applied to replace the exploration strategies in genetic algorithms (GAs) and Monte Carlo tree search (MCTS) for improving their convergence efficiency. Benchmark experiments demonstrate that protein sequences generated by ProtHMSO exhibit superior functional performance and closer alignment with natural sequence distribution, compared with state-of-the-art methods. These advancements highlight that ProtHMSO has strong potential and compatibility to accelerate functional protein discovery, offering a robust framework for efficient and context-aware exploration of protein sequence space.

22.
arXiv (quant-ph) 2026-06-19

Steady-state entanglement of spin qubits mediated by nonreciprocal and chiral magnons

arXiv:2509.13094v3 Announce Type: replace Abstract: We propose a hybrid quantum system in which a magnet supporting non-reciprocal magnons, chiral magnons, or both mediates the dissipative and unidirectional coupling of spin qubits. By driving the qubits, the steady state of this qubit-qubit coupling scheme becomes the maximally entangled Bell state. We devise a protocol where the system converges to this entangled state and benchmark it including qubit decay and dephasing. The protocol is numerically tested on a hybrid system consisting of nitrogen-vacancy (NV) centers coupled to magnon surface modes of an yttrium iron garnet (YIG) film. We show that the dephasing time of the NV centers forms the bottleneck for achieving the entanglement of NV centers separated by a distance within the magnon coherence length. Our findings identify the key technological requirements and demonstrate a viable route toward steady-state entanglement of solid-state spins over distances of several microns using magnonic quantum networks, expanding the toolbox of magnonics for quantum information purposes.

23.
Nature (Science) 2026-06-10

A 5.3-million-year-old deep-sea whale necropolis in the Diamantina Zone

Whale falls are biodiversity oases at seabeds1–6, yet their record from the oceans has remained sparse and fragmentary6,7. Here we report the discovery of a vast whale necropolis in the Diamantina Zone (4,616- to 7,001-m depth), extending about 1,200 km along the sea floor of the southeastern Indian Ocean. This area has a deep and extensive accumulation comprising five modern natural whale-fall communities and 476 fossil cetaceans recorded. We show that carcasses host specialized communities dominated by brittle stars, bone-boring worms and chemosynthesis-based bivalves and that the fossil record in this area comprises both extant and extinct deep-diving beaked whales. Isotopic dating shows that whale falls in this region have occurred since at least 5.3 million years ago. These findings reshape the understanding of the limits and biogeography of whale-fall ecosystems and establish some deep sea floors as a fossil archive for tracing cetacean evolution over geological time. Researchers uncovered an enormous deep-sea accumulation of whale remains in the southeastern Indian Ocean, showing long-term, specialized ecosystems and an extensive fossil record that offers new insight into deep-ocean biodiversity and whale evolutionary history.

24.
bioRxiv (Bioinfo) 2026-06-12

Computational Design of Optimal Sequences for Targeted Hypermutagenesis Using Recombination-Coupled Diversity-Generating Retroelements

Diversity-generating retroelements (DGRs) are natural systems that accelerate evolution via targeted hypermutation at adenines. We previously developed DGRec, a system combining DGRs and recombineering for programmable mutagenesis in Escherichia coli. We here address two important issues with DGRec: the dependence of mutagenesis efficiency on the dgrRNA secondary structure and the variability of the reverse-transcription biases with sequence context and position. First, we introduce and validate a method to recode non-functional templates, i.e. with low mutagenesis efficiency, into highly functional ones through synonymous mutations. Second, we develop a Long Short-Term Memory (LSTM) model to predict DGRec mutational profiles for any given template sequence. By integrating this LSTM model with our recoding method, we establish a comprehensive workflow for customized directed evolution, enabling researchers to precisely fine-tune DGRec in vivo mutagenesis to their engineering needs.

25.
arXiv (CS.AI) 2026-06-16

Running hardware-aware neural architecture search on embedded devices under 512MB of RAM

arXiv:2606.14824v1 Announce Type: cross Abstract: This document proposes a novel approach to hardware-aware neural architecture search (HW NAS) that considers the resources available on the computing platform running it, enabling its execution on various embedded devices. The presented HW NAS produces tiny convolutional neural networks (CNNs) targeting low-end microcontroller units (MCUs), typically involved in the Internet of Things (IoT) or wearable robotics, opening new use cases. A gateway could run it to tailor CNNs' architecture on the acquired data without using external servers, ensuring privacy. The proposed technique achieves state-of-the-art results in the human-recognition tasks on the Visual Wake Word dataset, a standard TinyML benchmark, on several embedded devices.