Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.CV) 2026-06-17

Not Truly Multilingual: Script Consistency as a Missing Dimension in VLM Evaluation

Current multilingual evaluations for Vision-Language Models (VLMs) assume a one-to-one mapping between language and orthography, overlooking billions of users of multi-script languages. We introduce PuMVR (Punjabi Multimodal Visual Reasoning), a benchmark of 1,000 strictly parallel image-text instances across Punjabi's three active scripts: Gurmukhi, Shahmukhi, and Roman. Evaluating 10 state-of-the-art VLMs, we expose a substantial and systematic Script Gap. Models frequently solve visual tasks in one script while failing identical tasks in another, with accuracy deltas reaching 16%. Crucially, visual input boosts absolute performance uniformly yet does not close the orthographic gap. Furthermore, cross-script in-context transfer is highly brittle, exposing script-locked knowledge representation. Supported by McNemar tests across all script pairs, our findings demonstrate that current "multilingual" VLMs are not truly multi-script. We propose the Script Consistency Rate (SCR), which falls as low as 24.8% on our benchmark, as a mandatory metric for script-agnostic evaluation to ensure equitable AI access. Data and code are available at: https://github.com/prabhjotschugh/Not-Truly-Multilingual-PuMVR.

02.
arXiv (quant-ph) 2026-06-24

A Quantum Non-Gaussianity Criterion Based on Photon Correlations $g^{(2)}$ and $g^{(3)}$

arXiv:2511.08488v2 Announce Type: replace Abstract: Quantum non-Gaussian states, which cannot be written as mixtures of Gaussian states, are necessary to achieve a quantum advantage in continuous variable systems. They represent an important benchmark for the realization of an advanced quantum light source, as they cannot be made by simple means such as displacement and squeezing. We introduce an attenuation-resistant sufficient criterion for quantum non-Gaussian states based on the second- and third-order correlation functions, $g^{(2)}$ and $g^{(3)}$. The general non-linear bound for classical mixtures of Gaussian states is $\sqrt{g^{(3)}} + 3 \sqrt{g^{(2)}} \geq 2$. Any mixture of Gaussian states must fulfill this inequality, thus, the violation of it represents a direct confirmation of quantum non-Gaussianity. We experimentally show the non-Gaussianity of the state produced by a quantum dot single-photon source, where we obtain $\sqrt{g^{(3)}} + 3 \sqrt{g^{(2)}} = 0.174 (13)$, which represents a statistical significance of more than $100$ standard deviations.

03.
arXiv (CS.LG) 2026-06-17

Monotonic Kolmogorov-Arnold Networks: A Theoretical and Empirical Study of Monotonicity as an Inductive Bias

arXiv:2606.17886v1 Announce Type: new Abstract: Monotonicity has been a long-running architectural inductive bias for neural networks, motivated by tabular, scientific, and economic settings where outputs are known to respond monotonically to certain inputs. Existing approaches are MLP- or flow-based and lack per-edge functional transparency; the only Kolmogorov–Arnold Network (KAN) variant with monotonicity, MonoKAN, enforces the constraint only on a restricted parameter subset and requires a projection-style training procedure. We close this gap with MKAN, a KAN with hard monotonicity guaranteed for all parameter values via exponential reparameterization of B-spline coefficients, positive edge weights, and a monotone base activation. Training reduces to standard unconstrained gradient descent. Our headline theoretical contribution is a representation-cost theorem: any $C^K, K >0$ feature extractor inducing a ball-shaped semantic-neighborhood partition admits a monotone realization of the equivalent neighborhood structure at $N' = N^* + k \le 2N^*$, where $k$ is the number of non-monotone coordinates of the original. The bound is architecture-agnostic and gives a principled sizing rule for monotone encoders. Empirically, MKAN is competitive with state-of-the-art monotone NNs on the SMM/ICML-2024 benchmark while being the only method that combines hard unconstrained monotonicity with KAN's per-edge functional transparency; the $2N^*$ prediction is validated in a self-supervised feature-size sweep on four real datasets, and on a controlled monotone-generative dataset MKAN recovers ground-truth factors with substantially higher Spearman alignment than KAN, MLP, and linear baselines.

04.
arXiv (CS.CV) 2026-06-18

HACMatch Semi-Supervised Rotation Regression with Hardness-Aware Curriculum Pseudo Labeling

Regressing 3D rotations of objects from 2D images is a crucial yet challenging task, with broad applications in autonomous driving, virtual reality, and robotic control. Existing rotation regression models often rely on large amounts of labeled data for training or require additional information beyond 2D images, such as point clouds or CAD models. Therefore, exploring semi-supervised rotation regression using only a limited number of labeled 2D images is highly valuable. While recent work FisherMatch introduces semi-supervised learning to rotation regression, it suffers from rigid entropy-based pseudo-label filtering that fails to effectively distinguish between reliable and unreliable unlabeled samples. To address this limitation, we propose a hardness-aware curriculum learning framework that dynamically selects pseudo-labeled samples based on their difficulty, progressing from easy to complex examples. We introduce both multi-stage and adaptive curriculum strategies to replace fixed-threshold filtering with more flexible, hardness-aware mechanisms. Additionally, we present a novel structured data augmentation strategy specifically tailored for rotation estimation, which assembles composite images from augmented patches to introduce feature diversity while preserving critical geometric integrity. Comprehensive experiments on PASCAL3D+ and ObjectNet3D demonstrate that our method outperforms existing supervised and semi-supervised baselines, particularly in low-data regimes, validating the effectiveness of our curriculum learning framework and structured augmentation approach.

05.
arXiv (quant-ph) 2026-06-16

Accelerating physics-informed neural networks for full waveform inversion using a hybrid quantum-classical finite-basis architecture

arXiv:2606.01110v2 Announce Type: replace-cross Abstract: Full waveform inversion (FWI) reconstructs heterogeneous material properties from receiver data but remains computationally demanding. Physics-informed neural networks (PINNs) and their domain-decomposed variants (FBPINNs) offer a mesh-free alternative but face convergence challenges when representing complex velocity fields. We present a hybrid quantum-classical FBPINN for acoustic FWI, bringing together quantum computing and classical machine learning, in which the decomposed wavefield network and the global velocity network are implemented as classical-to-quantum pipelines terminating in parameterized quantum circuits (PQCs). The PQCs are realized as differentiable JAX statevector simulators, enabling end-to-end automatic differentiation through the classical PINN, the quantum circuit, and the physics-informed loss. On a geophysical anomaly benchmark, the quantum hybrid reaches a lower L1 velocity error than the primary classical FBPINN baseline in approximately 8x fewer training iterations, despite using approximately 33% fewer trainable parameters, and it outperforms all 15 classical hyperparameter variants tested. A second benchmark (checkerboard) demonstrates the generality of the inversion pipeline, confirming that the quantum hybrid architecture can recover structured spatial variations beyond the localized anomaly benchmark. Our framework is broadly applicable to wave-based inverse problems beyond geophysics, including medical ultrasound tomography and non-destructive evaluation.

06.
arXiv (CS.AI) 2026-06-19

TeleMorpher: Toward Robust Simultaneous Motion-Location Editing

arXiv:2606.19676v1 Announce Type: cross Abstract: Diffusion models have achieved remarkable success in image and video generation and editing. While recent studies have extended these efforts toward motion editing, simultaneously transforming both motion and location-despite its practical importance-remains largely unexplored. To better understand robust motion-location editing, we first analyze the fundamental factors that degrade its quality. Based on this analysis, we propose TeleMorpher, one of the first one-shot frameworks to the best of our knowledge, for simultaneous motion-location editing. Our approach leverages motion priors, a target motion-centric video generated from an off-the-shelf model as motion-editing guidance, and the ground truth motion to enable more controllable and precise motion-location editing. Via this, our framework works as follows: (1) we first disentangle the protagonist and the background via pre-trained segmentation and inpainting models. (2) Then, we introduce a training-free pose warping that edits the protagonist's motion with the motion prior as the guidance. (3) The result of warped motion video is directly injected into a baseline motion editor during inference, mitigating the difference between source and target motions while preserving the appearance of the source video. (4) To enhance the reliability of quantitative evaluations, we propose two new LPIPS-based metrics that measure the background consistency before and after the motion editing and the fidelity of motion editing performance via measuring the difference between the extracted protagonist's skeletons from source and target videos. Experiments with in-the-wild videos and the TaiChi dataset demonstrate that TeleMorpher achieves superior performance across both quantitative and qualitative measurements (real-human evaluation), underscoring its effectiveness.

07.
arXiv (quant-ph) 2026-06-25

Arbitrarily Loss-Tolerant Quantum Position Verification in a Single Execution

arXiv:2606.25037v1 Announce Type: new Abstract: Quantum position verification (QPV) seeks to certify the spatial location of an untrusted prover, but is challenged fundamentally by entanglement-based attacks and experimentally by photon loss. Both issues were addressed separately in different works and were simultaneously resolved for sequentially repeated protocols in Phys.\ Rev.\ Lett.\ 135,~260801 via a commitment-based modification that renders security independent of transmission losses. However, single-execution protocols are preferable in practice, and the original techniques do not extend to the parallel setting due to their reliance on sequential structure. We overcome this by utilizing different techniques based on no-signalling correlations, lifting the commitment modification to the parallel regime while preserving the security guarantees of the underlying QPV protocol. Applying this to a BB84-based QPV protocol suitable for near-term implementation and secure against bounded-entanglement adversaries, we prove that fixing a threshold~$k$ on the number of successfully committed qubits yields an adversarial acceptance probability that decays exponentially in~$k$. The resulting protocol maintains robustness to noise levels of up to~$3.7\%$ and remains secure under arbitrarily slow quantum communication, as does the original protocol. This yields the first fully loss-tolerant single-shot QPV protocol secure against entangled attackers, making QPV feasible over arbitrary distances. Finally, we refine the sequential analysis and obtain improved quantitative parameters for experimental implementations.

08.
arXiv (CS.LG) 2026-06-16

p-PSO: A Penalized Particle Swarm Optimization Technique for Finding D-Optimal Designs with Mixed Factors in Generalized Linear Models

arXiv:2606.15962v1 Announce Type: cross Abstract: Finding D-optimal designs for generalized linear models (GLMs) is challenging due to the dependence of the Fisher information matrix on unknown parameters and the lack of closed-form solutions, particularly when input factors include both discrete and continuous variables. Although classical algorithms and recent metaheuristic approaches have offered partial solutions, there remains a need for robust and computationally efficient methods. In this paper, we propose a penalized Particle Swarm Optimization (PSO) approach, named $p$-PSO. Here we introduce a new, general-purpose penalty formulation for constrained optimization and demonstrate its effectiveness in optimal design problems. The formulation is algorithm-agnostic and applicable to a broad class of black-box optimization methods. Results show that the method is highly efficient, with its primary contribution being a penalty formulation that enables the direct use of an off-the-shelf PSO algorithm and extends naturally to more general constrained optimization tasks.

09.
arXiv (CS.CV) 2026-06-24

Jolia: Concept-Level Vision-Language Alignment for 3D CT Contrastive Learning

Vision-language contrastive pretraining has become the dominant recipe for 3D medical foundation models, leveraging the large volumes of paired scans and reports produced in clinical practice. However, medical images usually span dozens of organs, and radiological reports are much longer than typical natural image captions and are composed of multiple structured sections. CLIP-style pretraining compresses this structure by encoding each modality into a single global token, at the risk of losing important details. We introduce ConQuer (Concept Queries), an image-text pretraining method that augments CLIP's global alignment with a set of localized alignments, one per concept. ConQuer splits the report into concept-specific sections and learns cross-attention queries that pool the matching image features without using any segmentation mask or spatial supervision. Contrastive learning is then applied independently for each concept. Concepts can be any unit of semantic localization; here, they are anatomical regions, one query per organ or gross body region. As a byproduct, each query learns attention maps focused on its concept, providing built-in spatial interpretability. We use ConQuer to train Jolia, a 3D CT foundation model on chest and abdominal CT. Jolia consistently outperforms a CLIP baseline on findings classification, report generation, and cross-center transfer, and sets a new state of the art across multiple public benchmarks. Jolia's weights will be released upon acceptance.

10.
arXiv (quant-ph) 2026-06-12

Influence-solvability: a systematic theory of $(1+1)D$ solvability and its application to brickwork circuits

arXiv:2606.12538v1 Announce Type: cross Abstract: `Solvable' circuits, such as dual unitaries and its generalisations, have arisen as paradigmatic examples of tractable chaotic non-equilibrium dynamics, both in classical and quantum systems. However, while increasingly more complicated sufficient conditions have been proposed, a systematic theory classifying and understanding general features of solvable circuits is missing. We develop such a theory by introducing influence-solvable circuits, a class of $(1+1)D$ circuits whose influence matrix, which represents the `bath' generated by its own evolution, is given by a uniform MPS with finite bond-dimension $\chi$. This property allows for efficient computation of subsystem dynamics and essentially contains all known examples of solvable circuits. We derive a set of necessary and sufficient local conditions by using a version of the fundamental theorem of MPS for open boundary conditions. Next we apply our theory to brickwork circuits with $\chi=1$ influence-solvability and perform a systematic classification of classical brickwork circuits with local dimension up to $d=3$ and quantum brickwork circuits with $d=2$. Our search reveals new solvable circuits that are not captured by known solvability conditions.

11.
PLOS Computational Biology 2026-06-18

A comparison of contact patterns derived from the population structure in agent-based models and empirical contact survey data

Authors:

by Janik Suer, Johannes Ponge, Michael Brüggemann, Jan Pablo Burgard, Vitaly Belik, Bernd Hellingrath, Alejandra Rincón Hidalgo, Andrzej K. Jarynowski, Richard Pastor, Huynh Thi Phuong, Steven Schulz, Ashish Thampi, Chao Xu, Marlli Zambrano, Rafael Mikolajczyk, André Karch, Veronika K. Jaeger, on behalf of the OptimAgent Consortium Agent-based models (ABMs) are powerful tools for simulating disease spread, relying on individual-level interaction rules from which emergent dynamics arise. An important component in ABMs is contact behaviour. To reduce computational complexity, contact behaviour in ABMs is often assumed as random mixing within structurally defined settings (as, e.g., workplaces). with setting composition typically based on empirical data such as census information. However, the validity of this approach to represent contacts remains unclear. To address this gap, we compare the contact structure derived through this approach in a large-scale ABM with empirical contact survey data with respect to age contact matrices for households, schools, workplaces, all remaining contact settings, and all contacts combined (based on difference matrices and sum of squared errors (SSE)). Our results demonstrate that random mixing in settings with known age compositions like households (SSE:0.7(95%CI0.4–0.9)), schools (SSE:0.7(95%CI:0.3–1.1)) and workplaces (SSE:0.5(95%CI:0.2-0.7)), captures basic interaction patterns but fails to account for age-related variation in contact numbers. The largest differences arise for contacts outside these settings (SSE:3.8(95%CI:1.2–6.5)), as ABMs typically use random regional contacts that do not capture age-structured behaviour observed in contact surveys. Applying contact matrices from both approaches to an age-structured compartmental model, leads to noticeable differences in simulated epidemic outcomes regarding reproduction numbers and spreading dynamics between age groups. Our results suggest that naïve approaches to represent contact behaviour in ABMs based on population structure can be valid in settings with defined age-structures while settings with low a priori structure require more advanced methods to represent contact behaviour observed in contact surveys.

12.
arXiv (CS.CL) 2026-06-16

OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference

Large language models (LLMs) with extended context windows enable powerful applications but impose significant memory overhead, as caching all key-value (KV) states scales linearly with sequence length and batch size. Existing cache eviction methods address this by exploiting attention sparsity, yet they typically rank tokens heuristically using accumulated attention weights without considering their true impact on attention outputs. We propose Optimal Brain Cache (OBCache), a principled framework that formulates cache eviction as a layer-wise structured pruning problem. Building upon the Optimal Brain Damage (OBD) theory, OBCache quantifies token saliency by measuring the perturbation in attention outputs induced by pruning tokens, with closed-form scores derived for isolated keys, isolated values, and joint key-value pairs. Our scores account not only for attention weights but also for information from value states and attention outputs, thereby enhancing existing eviction strategies with output-aware signals. Experiments on LLaMA and Qwen models demonstrate that replacing the heuristic scores in existing works, which estimate token saliency across different query positions, with OBCache's output-aware scores consistently improves long-context accuracy. Code is available at https://github.com/DreamSoul-AI/OBCache.

13.
medRxiv (Medicine) 2026-06-15

HPV Self-Sampling in Cervical Screening: A Rapid Review

Introduction Cervical cancer is the fourth largest cause of cancer deaths in women. HPV self-sampling could increase uptake of cervical screening. This rapid review aimed to determine the accuracy, concordance, uptake and acceptability of self-sampling over clinician-collected samples in high income countries. Method We followed Cochrane Rapid Reviews Methods. Top-up of 4 systematic reviews and meta-analyses was performed. Narrative data synthesis was conducted and meta-analysis where applicable. Databases searched were MEDLINE, EMBASE, CENTRAL and clinical trial registries. Risk of bias was assessed using AMSTAR 2, QUADAS, the Cochrane Risk of Bias (RoB), or the Nudelman and Otto, 2020 tool, depending on the study type. Findings The review included 39 studies for accuracy, 38 studies for concordance, 37 uptake and 48 studies for acceptability. Self-sampling has similar accuracy as clinician-collected samples when PCR-based assays are used. The overall agreement of self-sampling and clinician-collected samples was 87.1%(95%CI;85.6-88.6) with a kappa value of 0.70(95%CI;0.67-0.73). Mail-to-all strategies had higher uptake with participation differences of 11.3%(95%CI:8.4-14.2) in the intention-to-treat analysis and 7.7%(95%CI:4.7-10.8) in the per protocol analysis. Self-sampling is acceptable to non-attendees (91%(95%CI;85.3-94.6). Conclusion and Recommendation Self-sampling shows good performance on the four clinical effectiveness indicators of accuracy, concordance, uptake and acceptability.

14.
arXiv (math.PR) 2026-06-19

A Cycle Walk for Sampling Measures on Spanning Forests for Redistricting

arXiv:2509.08629v2 Announce Type: replace-cross Abstract: We introduce the Cycle Walk, a new Markov chain Monte Carlo method for sampling distributions on balanced graph partitions, motivated by applications in political redistricting. The method operates on spanning forests and combines two types of updates: local "cycle" moves within districts and global moves that exchange population between adjacent districts while preserving balance constraints. This construction enables efficient Metropolis–Hastings correction while allowing proposals at multiple spatial scales. We show that the Cycle Walk naturally interpolates between existing approaches based on local updates and a class of global update methods derived from recombination (RECOM). Through a range of numerical experiments on synthetic graphs and real-world precinct data, we demonstrate that the Cycle Walk exhibits improved empirical convergence diagnostics for distributions that place weaker weight on spanning-tree counts, a regime that is challenging for existing methods. In particular, the algorithm remains effective when incorporating alternative compactness measures that more closely reflect policy-relevant criteria. These results suggest that the Cycle Walk provides a flexible and computationally efficient framework for sampling from a broader class of redistricting distributions than previously accessible with MCMC techniques.

15.
arXiv (math.PR) 2026-06-19

Power-law hypothesis and (un)fairness of PageRank on undirected multi-type PAMs

arXiv:2606.19583v1 Announce Type: new Abstract: The preferential attachment model (PAM) describes the sequential growth of a network based on the "rich-get-richer" principle. Several versions of it have become established for modeling, e.g., citation networks, capturing a power-law degree distribution. Directed versions of the preferential attachment model where the edges are directed from the new to the old vertices have been the subject of extensive research. They have been shown to exhibit remarkable properties such as heavier tails for the limiting graph-normalized PageRank than for the in-degrees. By contrast, for the undirected version, we recently showed that PageRank has similar tails as the degree. In the present paper, we discuss the PageRank asymptotics for a multi-type version of the undirected PAM (here vertices have different colors), complementing previous results of Antunes, Bhamidi, Banerjee and Pipiras on the asymptotics of PageRank on similar directed multi-type or colored PAMs. Our studies are motivated by the aim to go beyond the rigid rule of edge orientation in directed preferential attachment models. As the main result, for the case of a finite set of colors, we show that the power-law hypothesis for PageRank is fulfilled also for the colored undirected PAM, where, by contrast to the directed case, the power-law exponent is color-dependent for some choices of the initial color distribution and the attractiveness function. For the specific case of a two-type model, we discuss implications of our results on fairness in sampling underrepresented nodes from the network.

16.
arXiv (CS.CL) 2026-06-16

Mechanistic Analysis of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning

Sequential fine-tuning of Large Language Models (LLMs) adaptation to target tasks often triggers catastrophic forgetting, where the acquisition of novel target skills degrades ancestral capabilities. This paper presents a systematic comparative study of catastrophic forgetting across twenty premier models representing the state-of-the-art in mid-2026. We categorize our investigation into two primary research lines: (i) a behavioral and semantic output drift analysis of ten leading closed-source models (including Claude Fable 5, GPT-5.5 High, and Gemini 3.5 Flash), and (ii) a deep mechanistic interpretation of ten prominent open-weight architectures (such as DeepSeek-V4-Pro, Llama 4 Maverick, and Qwen 3.6-27B). Through weight-space trajectory tracking, Centered Kernel Alignment (CKA), and routing gate drift calculations in Mixture-of-Experts (MoE) layers, we localize the neural circuits highly susceptible to parameter overwriting. Our findings indicate that early-layer attention heads exhibit systemic entropic dispersion, while mid-to-deep feed-forward networks (or sparse expert blocks) suffer localized representation collapse. Informed by these insights, we introduce Low-Rank Circuit Projection (LRCP), a subspace-regularized training intervention. Empirical evaluations show that LRCP successfully mitigates up to 94.2% of ancestral capabilities in open-weight configurations and matches the adaptation velocity of standard PEFT baselines.

17.
arXiv (CS.AI) 2026-06-16

AnonShield: Scalable On-Premise Pseudonymization for CSIRT Vulnerability Data

arXiv:2606.15650v1 Announce Type: cross Abstract: We present AnonShield, a high-throughput, on-premise pseudonymization system that combines GPU-accelerated NER, streaming processing, caching, and schema-aware configuration. Evaluated on datasets up to 550 MB (70,951 records), AnonShield reduces processing time from over 92 hours to under 10 minutes (up to 738x speedup) while achieving up to 94.2% F1-score and 96.7% recall. Our results show that scalable pseudonymization of vulnerability data is feasible without sacrificing analytical utility, enabling compliant data sharing in operational CSIRT environments.

18.
arXiv (CS.AI) 2026-06-16

SpecAlign: Efficient Specification-Grounded Alignment of Large Language Models via Synthetic Data

arXiv:2606.16276v1 Announce Type: new Abstract: As large language models (LLMs) are increasingly deployed in real-world applications, alignment is no longer governed by a single universal notion of safety or helpfulness, but instead by provider- or application-specific model specifications. These specifications are typically long, structured, and frequently updated, yet existing alignment pipelines lack a systematic mechanism to operationalize them as training signals. In this paper, we propose specification-grounded alignment, a new alignment paradigm that treats provider-authored model specifications as the primary alignment target rather than abstract principles or static benchmarks. To instantiate this paradigm, we introduce SpecAlign, a framework that synthesizes alignment data directly from specification documents. SpecAlign combines structured rule annotation, controllable specification instantiation, and multi-agent adversarial data synthesis to generate fine-grained, boundary-aware preference pairs that capture both compliant behaviors and meaningful specification violations. Experiments across multiple model specifications and backbone models demonstrate that training with SpecAlign consistently improves rule compliance while preserving general capabilities and avoiding over-conservative behavior. These results suggest that grounding alignment in explicit model specifications enables rapid, precise, and scalable adaptation of LLM behavior to evolving policy requirements.

19.
arXiv (CS.CV) 2026-06-16

Sinkhorn-CPD: Robust point cloud registration via unbalanced entropic optimal transport

Coherent Point Drift (CPD) is widely used for rigid point cloud registration because of its soft correspondences and closed-form parameter updates. However, CPD's target-side marginal constraint forces every observation, including outliers, to receive exactly unit probability mass. This assumption degrades registration accuracy under heavy outliers and partial overlap. Optimal transport (OT) methods can handle missing mass through unbalanced formulations, but require hand-tuned annealing schedules. In this paper, we propose Sinkhorn-CPD, which replaces CPD's target-side marginal constraint with dual Kullback-Leibler penalties, allowing the algorithm to discard outliers on both sides. The resulting formulation is a fully unbalanced entropic optimal transport problem, which can be efficiently solved by generalized Sinkhorn iterations. Moreover, Sinkhorn-CPD preserves the closed-form Procrustes and variance updates of CPD. In our method, the variance sigma^2 plays the role of the entropic regularization parameter, which induces an automatic annealing schedule from diffuse to sharp correspondences without manual temperature tuning. Experiments on synthetic, cross-category, and scan-to-CAD benchmarks show that Sinkhorn-CPD achieves state-of-the-art accuracy, with strong robustness to outliers and partial overlap.

20.
arXiv (CS.CV) 2026-06-17

Complex Layout Classification in the Wild: A Low-Resource Approach with Layout-Preserving Augmentations

Many digitized corpora suffer from low resources because annotations may be scarce, page scans are noisy and of poor resolution, or layouts are structurally complex in ways that negatively affect the quality of automatic transcription. Developing robust classification models for low-resource languages is inhibited by the lack of large-scale annotated data and by the frequent semantic complexity of page layouts. To this end, we have curated a complex-layout dataset, manually classified into eight distinct layout types based on their separator regions. To overcome data scarcity, we propose a novel training strategy in the form of a CNN-based classifier that employs strong, domain-aware augmentations to improve generalization. We utilize narrow anisotropic Gaussian masking to suppress incidental textual details while preserving essential separations, compelling the model to learn global geometric arrangements. Additionally, we implement reflection-induced label transformations to enrich the training distribution while maintaining label consistency across asymmetric categories. The results demonstrate that layout-specific augmentations can substantially improve page-level layout classification under severe annotation scarcity.

21.
arXiv (CS.CV) 2026-06-18

GUMP-Net: An interpretable model-data-driven intelligent algorithm for multi-class pelvic segmentation

Pelvic segmentation is one of the most important and fundamental research problems in precise and intelligent diagnosis and treatment, as well as surgical planning and navigation for pelvic fractures. By combining an improved geodesic active contour model with deep neural networks, we propose GUMP-Net, an interpretable model-data-driven intelligent algorithm for multi-class pelvic segmentation, in which three network modules are designed to constitute the overall segmentation framework together: the object detection module for automatic level set initialization, the edge detector module for learning an anatomy-aware edge detector function and the iteration module for deep level set evolution. Leveraging the advantages of level set representation and deep learning, GUMP-Net shows more accurate, robust and consistent segmentation performance, especially in small training data situation, compared to the state-of-the-art methods. Extensive experiments on pelvic datasets demonstrate the rationality and effectiveness of the proposed algorithm. Further experiments extended to ankle dataset indicate broader applications to other anatomies. The proposed algorithm not only provides an efficient segmentation method for complex fracture reduction, but also gives an interpretable geometric perspective for understanding deep learning segmentation.

22.
medRxiv (Medicine) 2026-06-15

Association of Genetic Liability to Psychiatric Disorders with Peripheral Metabolic Dysregulation

Importance: Individuals with psychiatric disorders face elevated cardiometabolic risk which is linked to increased mortality. The extent to which this reflects shared pathogenesis or the downstream effects of illness and treatment remains poorly understood. Objective: To characterize the direct pleiotropic effects of psychiatric genetic liability on circulating metabolites and aggregate cardiometabolic risk, independent of psychiatric diagnosis and psychotropic medication use. Design: Cohort study. Setting: Mass General Brigham Biobank (MGBB). Participants: MGBB participants with metabolomic profiling, genomic data, and linked electronic health records. Exposures: Genetic liability to nine psychiatric disorders quantified using polygenic risk scores (PRS): attention deficit/hyperactivity disorder (ADHD), anorexia nervosa (ANO), anxiety disorder (ANX), autism spectrum disorder (ASD), bipolar disorder (BD), major depressive disorder (MDD), PTSD, schizophrenia (SCZ), and substance use disorder (SUD). Main Outcomes and Measures: 249 circulating metabolites and four metabolomic risk scores (MRS) for type 2 diabetes, myocardial infarction, ischemic stroke, and vascular dementia. PRS-metabolite associations were estimated using nested models adjusting for lifetime psychiatric diagnosis and psychotropic medication use. Results: Across 25,290 participants, we identified 604 significant PRS-metabolite associations (Bonferroni p< 1.36 x 10-4), of which 89% persisted after adjustment for lifetime diagnosis and medication use, suggesting that the direct genetic effects on metabolism are largely independent of illness or treatment. PRS for MDD, PTSD, and ADHD showed the most extensive dysregulation, with a transdiagnostic pattern of elevated lipids and systemic inflammation, specifically triglycerides ({beta} = 0.04 to 0.05, all p< 4.4 x10-13) and glycoprotein acetyls ({beta} = 0.05, all p< 2.2 x10-16). Notably, PRS for SCZ and BD showed minimal metabolite dysregulation despite having the strongest association with their target diagnoses. PRS for MDD, PTSD, ADHD, and SUD were associated with increased MRS across cardiometabolic conditions ({beta} = 0.03 to 0.08, all p< 2.1 x10-4). Sensitivity analyses controlling for BMI or excluding participants without any psychiatric history (N: 21,305 and 11,150, respectively) showed a similar pattern. Conclusions and Relevance: Psychiatric genetic liability is associated with systemic metabolic dysregulation independent of illness onset or treatment, supporting a partially pleiotropic basis for psychiatric-cardiometabolic comorbidity.

23.
bioRxiv (Bioinfo) 2026-06-16

Evidence for recombination in dengue virus genomes

Recombination is a key driver of RNA virus evolution, yet its extent and evolutionary implications in dengue virus (DENV) remain incompletely understood. We conducted a comprehensive, genome-wide recombination screen across 6,905 complete DENV genomes representing all four serotypes, 82 countries, and eight decades of sampling (1944-2023) retrieved from the Bacterial and Viral Bioinformatics Resource Center. Using seven complementary recombination detection methods implemented in RDP5, we identified 66 recombination events across 53 unique recombinant sequences, of which 29 are newly described. Events included intra-genotypic (n = 18), inter-genotypic (n = 32), and inter-serotypic (n = 16) exchanges spanning 14 genotypes and four continents, with no meaningful serotype-level enrichment (Cramer's V = 0.054). Recombination was concentrated in non-structural genes, most frequently NS3 (19 events), NS5 (17), and NS2 (12), while the capsid gene contained no recombination events, consistent with strong functional constraint. Single-nucleotide polymorphism analyses confirmed low divergence between recombinants and their inferred parents in both recombinant and non-recombinant regions. Phylogenomic analysis of 6,642 sequences revealed that recombinants cluster significantly closer to their major parents (p = 8.9 x 10-6 ) and that their removal does not significantly alter tree topology (p = 0.898), suggesting that the short length of recombinant regions limits phylogenetic conflict. We also introduce RECOSIM, an unsupervised machine-learning tool for recombination detection that achieved higher precision than RDP5 on both simulated (93.4% vs. 80.0%) and empirical (98.1% vs. 39.3%) datasets. Collectively, these results establish recombination as a widespread, pan-serotypic phenomenon in DENV with implications for genomic surveillance, vaccine evaluation, and evolutionary inference.

24.
medRxiv (Medicine) 2026-06-22

Disentangling adiposity-related and non-adiposity-related genetic pathways for type 2 diabetes

OBJECTIVE To identify circulating proteins associated with type 2 diabetes (T2D) risk through pathways not fully explained by body mass index (BMI), and to assess therapeutic actionability. RESEARCH DESIGN AND METHODS We applied GWAS-by-subtraction within a genomic structural equation model to European ancestry summary statistics for T2D (74,124 cases, 824,006 controls) and BMI (n = 681,275), partitioning T2D liability into BMI-related and BMI-subtracted components. We then performed proteome-wide Mendelian randomization (MR) using cis-protein quantitative trait loci from four plasma proteomics cohorts: ARIC, deCODE, Fenland, and the UK Biobank Pharma Proteomics Project. Prioritized proteins passed sensitivity analyses with alternative MR methods and were supported by colocalization evidence. Tissue-resolution regulatory support was assessed using cis-eQTL colocalization across GTEx and pancreatic islet, subcutaneous adipose, and whole-blood resources. Actionability was evaluated using the druggable genome and Open Targets. RESULTS GWAS-by-subtraction attenuated the genetic correlation between BMI and BMI-subtracted T2D from 0.54 (SE 0.02) to 0.35 (SE 0.02). Proteome-wide MR prioritized 29 proteins for BMI-subtracted T2D. Thirteen showed eQTL colocalization in at least one tissue, implicating liver and intermediary metabolism (GCDH, NOTCH2), pancreatic islet biology (CTRB2, MANBA), adipose and Wnt signaling (RSPO3, GALNT3), and whole blood regulatory signals (PAM, SNUPN). Sixteen proteins were classified within druggable-genome Tiers 1-3, and five had existing Open Targets compounds. CONCLUSIONS Integrating GWAS-by-subtraction, proteome-wide MR, and colocalization nominated 29 proteins associated with T2D liability not fully explained by BMI. These findings highlight genetically supported targets for follow-up studies of T2D therapies that complement weight-centered approaches.

25.
bioRxiv (Bioinfo) 2026-06-14

Generative design of antigen-specific T-cell receptor sequences with a conditional diffusion model

T cell receptor (TCR)-based immunotherapy holds immense potential for treating cancers and infectious diseases, where highly antigen-specific TCR recognition is crucial for adaptive immunity against tumors and pathogens. Engineering or de novo generation of the complementarity-determining region 3 (CDR3) loops of TCRs using artificial intelligence offers a powerful alternative to designing reactive TCRs rather than laborious experimental screening. However, current in silico approaches are constrained by weak conditional guidance, limited flexibility, and a lack of rigorous functional validation. To address these limitations, we introduce TCRDiff, a generative diffusion framework for designing antigen-specific TCRs conditioned on peptide-MHC (pMHC) targets and germline-encoded variable genes. By leveraging pre-trained knowledge from massive T-cell repertoires and TCR-pMHC recognition data, TCRDiff generates CDR3{beta} sequences with state-of-the-art fidelity to native binding TCRs through a denoising diffusion process. Furthermore, incorporating the interface geometry features generated TCR-pMHC complexes with superior structural plausibility. As a proof of concept, we deployed TCRDiff in a systematic pipeline to design candidate TCRs for immunotherapy. In vitro activation assays validated that TCRDiff-generated TCRs specifically recognize the MAGE-A3 epitope with minimized off-target cross-reactivity. Together, TCRDiff establishes a powerful, validated computational paradigm to accelerate the development of TCR-based immunotherapies.