bioRxiv (Bioinfo)
2026-06-11 00:00
DOI:
HASH:ab0afb05678c5e206a4781779e631bd5
inquiSTR: a toolkit for accurate and efficient population-scale tandem repeat genotyping and analysis
Authors:
Abstract
Tandem repeats are highly mutable genomic elements linked to human traits and diseases. Profiling large catalogs of tandem repeats from population-scale long-read sequencing data requires accurate and efficient tools. We introduce inquiSTR, a command-line toolkit for fast genome-wide tandem repeat length genotyping. Using efficient parallel processing and low-memory streaming algorithms, inquiSTR genotypes a genome-wide repeat catalog of 1.78 million loci in less than two minutes. Benchmarking against truth sets shows high accuracy, and inquiSTR runs significantly faster than existing tools. inquiSTR also provides methods for downstream analyses such as population structure inference, association testing, and outlier detection.
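
To make the outlier-detection use case concrete, the following is a minimal sketch of how unusually long or short repeat alleles could be flagged at a single locus across samples. It is not inquiSTR's method or interface, which the abstract does not describe; the function name `repeat_length_outliers`, the modified z-score approach (median and median absolute deviation), the 3.5 threshold, and the example lengths are all assumptions chosen for illustration.

```python
from statistics import median


def repeat_length_outliers(lengths, threshold=3.5):
    """Return indices of samples whose repeat length is an outlier at one locus.

    Illustrative only: uses the modified z-score, 0.6745 * (x - median) / MAD,
    which is robust to the expanded alleles it is trying to detect.
    """
    med = median(lengths)
    mad = median(abs(x - med) for x in lengths)
    if mad == 0:
        # No spread around the median, so the score is undefined.
        # This sketch flags nothing rather than dividing by zero.
        return []
    return [
        i
        for i, x in enumerate(lengths)
        if abs(0.6745 * (x - med) / mad) > threshold
    ]


# Hypothetical repeat lengths (in repeat units) for one locus across 8 samples.
lengths = [20, 21, 19, 20, 22, 20, 21, 85]
print(repeat_length_outliers(lengths))  # -> [7]
```

A median-based score is used here because a mean and standard deviation would be inflated by the very expansions being sought. A real analysis would also need to handle loci with no length variation, which this sketch simply skips.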