arXiv (CS.LG) 2026-06-11 12:00 DOI: arXiv:2606.11570

Enhancing Spectral Embedding through Robust and Flexible Knowledge Transfer in Electronic Health Records

Abstract

We propose a spectral-based, unsupervised representation learning framework to derive low-dimensional embeddings for clinical concepts and patients in rare disease cohorts from electronic health records, where data are high-dimensional but sample sizes are limited. To overcome this challenge, we incorporate a knowledge matrix extracted from a broader population that shares a partially overlapping subspace with the rare-disease cohort. Our method departs from existing approaches by relaxing restrictive one-to-one signal-alignment assumptions between the latent data matrix and knowledge matrix, allowing more flexible and realistic forms of structured sharing. We introduce a novel two-step spectral embedding procedure: first, we identify and remove irrelevant components from the knowledge matrix; then, we apply a projection-based method to separately recover shared and heterogeneous components. Simulations and an analysis of a real-world multiple sclerosis cohort show that the proposed method outperforms competing approaches, particularly in challenging scenarios where shared signals are weak and only partially aligned, as is common in rare-disease data.
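
The two-step idea in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' algorithm, which the abstract does not specify. It is one simple way to filter a knowledge subspace and then project, under assumptions made up for illustration. The function names, the synthetic data, the known ranks, and the noise-based threshold are all hypothetical.

```python
import numpy as np


def top_left_singular_vectors(M, rank):
    """Return the leading `rank` left singular vectors of M."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :rank]


def two_step_embedding(X, K, rank_k, rank_het, threshold):
    """Toy two-step spectral embedding (illustrative assumption only).

    X: features x samples matrix for the small target cohort.
    K: features x samples knowledge matrix from a broader population.
    """
    # Step 1: take the knowledge subspace, then keep only the part of it
    # that carries energy in X; the rest is treated as irrelevant.
    U_k = top_left_singular_vectors(K, rank_k)
    A, s, _ = np.linalg.svd(U_k.T @ X, full_matrices=False)
    U_shared = U_k @ A[:, s > threshold]

    # Step 2: project X onto the retained subspace (shared part), then take
    # the leading directions of the residual (heterogeneous part).
    X_shared = U_shared @ (U_shared.T @ X)
    U_het = top_left_singular_vectors(X - X_shared, rank_het)
    return U_shared, U_het


# Synthetic example: q0, q1 are shared, q2 appears only in the knowledge
# matrix, and q3 appears only in the target cohort.
rng = np.random.default_rng(0)
p, n, m = 60, 40, 200          # features, target samples, knowledge samples
sigma = 0.3                    # noise level in the target cohort (assumed known)
Q, _ = np.linalg.qr(rng.standard_normal((p, 4)))

X = 5.0 * Q[:, [0, 1, 3]] @ rng.standard_normal((3, n))
X += sigma * rng.standard_normal((p, n))
K = 5.0 * Q[:, [0, 1, 2]] @ rng.standard_normal((3, m))
K += 0.1 * rng.standard_normal((p, m))

# Hypothetical threshold: a multiple of the typical largest singular value
# of a 3 x n pure-noise matrix with entry standard deviation sigma.
threshold = 1.5 * sigma * (np.sqrt(n) + np.sqrt(3))

U_shared, U_het = two_step_embedding(X, K, rank_k=3, rank_het=1,
                                     threshold=threshold)
print("shared basis:", U_shared.shape, "heterogeneous basis:", U_het.shape)
```

Step 1 here filters a subspace rather than individual singular vectors, so the knowledge matrix's singular vectors need not match the target's one to one. The signals in this toy setup are strong; the weak, partially aligned regime highlighted in the abstract would need a more careful threshold and rank selection than this sketch provides.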
