bioRxiv (Bioinfo)
2026-06-10 00:00
DOI:
When batch correction corrupts gene expression: uncovering distortions in correlation structures
Authors:
Abstract
Batch correction is essential for integrating datasets and enabling population-level insights into health and disease. Embedding-based approaches are among the most widely used solutions, but here we highlight a critical, overlooked limitation: these methods can distort feature-to-feature (e.g., gene-gene) relationships, potentially undermining downstream analyses. We investigate this issue and introduce a novel metric to quantify it.
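
To make the idea of distorted gene-gene relationships concrete, the following is a minimal illustrative sketch and not the metric introduced in the paper, which the abstract does not define. It assumes two cells-by-genes expression matrices with identical gene order, one before and one after correction, and reports the mean absolute change in pairwise Pearson correlation. The function name `correlation_distortion` and the toy data are hypothetical.

```python
import numpy as np


def correlation_distortion(before, after):
    """Mean absolute change in pairwise gene-gene Pearson correlation.

    Illustrative only; NOT the metric proposed in the paper.
    before, after: arrays of shape (n_cells, n_genes), same genes in the same order.
    Genes with zero variance would yield NaN correlations and should be filtered first.
    """
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    if before.shape != after.shape:
        raise ValueError("matrices must have the same shape")
    # rowvar=False treats columns (genes) as the variables to correlate
    corr_before = np.corrcoef(before, rowvar=False)
    corr_after = np.corrcoef(after, rowvar=False)
    # Strict upper triangle: each gene pair counted once, diagonal excluded
    iu = np.triu_indices(before.shape[1], k=1)
    return float(np.mean(np.abs(corr_before[iu] - corr_after[iu])))


# Toy data (assumed for illustration): gene 2 closely tracks gene 1, gene 3 is independent
rng = np.random.default_rng(0)
n_cells = 500
g1 = rng.normal(size=n_cells)
g2 = g1 + 0.1 * rng.normal(size=n_cells)
g3 = rng.normal(size=n_cells)
expr = np.column_stack([g1, g2, g3])

# Hypothetical "corrected" output: permuting gene 2 across cells breaks its link to gene 1
distorted = expr.copy()
distorted[:, 1] = rng.permutation(distorted[:, 1])

identity_score = correlation_distortion(expr, expr)
rescaled_score = correlation_distortion(expr, 2.0 * expr + 1.0)
distorted_score = correlation_distortion(expr, distorted)
```

Because Pearson correlation is invariant to per-gene shifts and positive rescaling, a correction that only recenters or rescales each gene leaves this score near zero, whereas a transformation that scrambles a co-expressed pair raises it.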