01.
bioRxiv (Bioinfo)
2026-06-10
When batch correction corrupts gene expression: uncovering distortions in correlation structures
Authors:
Batch correction is essential for integrating datasets and enabling population-level insights into health and disease. Embedding-based approaches are among the most widely used solutions, but here we highlight a critical, overlooked limitation: these methods can distort feature-to-feature (e.g., gene-gene) relationships, potentially undermining downstream analyses. We investigate this issue and introduce a novel metric to quantify it.
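
To make the idea of a distorted gene-gene relationship concrete, here is a minimal sketch of one generic way to measure it: compare pairwise Pearson correlations before and after correction. This is not the metric introduced in the preprint, which the abstract does not specify. The function names, the toy values, and the assumption that a corrected expression matrix is available over the same cells are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def correlation_distortion(before, after):
    """Mean absolute change in gene-gene correlation over all gene pairs.

    `before` and `after` map gene name -> expression values over the same
    cells, in the same order. Illustrative score only (hypothetical).
    """
    pairs = list(combinations(sorted(before), 2))
    diffs = [
        abs(pearson(before[g], before[h]) - pearson(after[g], after[h]))
        for g, h in pairs
    ]
    return sum(diffs) / len(diffs)


# Toy data (made up): three genes measured in six cells.
before = {
    "geneA": [1, 2, 3, 4, 5, 6],
    "geneB": [2, 4, 6, 8, 10, 12],   # perfectly correlated with geneA
    "geneC": [6, 5, 4, 3, 2, 1],     # perfectly anti-correlated with geneA
}
# Hypothetical "corrected" data in which geneB was altered in the last three cells.
after = dict(before, geneB=[2, 4, 6, 12, 10, 8])

score = correlation_distortion(before, after)
```

A score of 0 means every pairwise correlation is unchanged; larger values mean the correction altered the correlation structure more. Averaging absolute differences is only one possible summary, chosen here for simplicity.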