bioRxiv (Bioinfo) 2026-06-15 00:00 DOI: HASH:c331f6fb311a5ab72548d4ceaa8e6d66

AliceDB database and pipeline for identification of natural protein variants based on mass spectrometry measurement data

Abstract

The natural variation that distinguishes individual organisms within a single species is currently being studied intensively, primarily at the genetic level. Studies of natural variants at the level of protein gene products are far less common, mainly because appropriate databases and bioinformatics tools are lacking. The main research technique used to study proteomes and peptidomes is mass spectrometry (MS). The classic method for interpreting raw MS data in proteomic and peptidomic studies relies on databases of representative (canonical) sequences that define the proteome of the organism under study, so peptides carrying a variant residue have no matching database entry. In this paper, we present the AliceDB database, which contains information on over 7 million natural variants of protein sequences described in the scientific literature for Homo sapiens. The data in AliceDB can be used with widely available, commonly used software for interpreting proteomic data. Test results from using AliceDB to interpret proteomic data indicate that accounting for natural variants increases both the number and the quality of identified proteins. It also makes it straightforward to identify protein sequence variants that may, for example, be of medical significance.
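To make the idea of a variant-aware search database concrete, the following is a minimal sketch of how a single amino acid substitution changes the set of tryptic peptides a search engine could match. It is an illustration only, not the AliceDB pipeline or data format: the function names, the toy sequence, and the substitution are all hypothetical, and the digestion rule is the common simplification of trypsin (cleave after K or R, except before P, with no missed cleavages).

```python
import re


def apply_substitution(sequence, position, ref, alt):
    """Return `sequence` with residue `ref` at 1-based `position` replaced by `alt`."""
    if sequence[position - 1] != ref:
        raise ValueError(
            f"expected {ref} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + alt + sequence[position:]


def tryptic_peptides(sequence):
    """Simplified in-silico trypsin digest: cleave after K/R unless followed by P."""
    # The pattern matches the zero-width position after K or R that is not before P.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def variant_specific_peptides(canonical, variant):
    """Peptides produced by the variant sequence but absent from the canonical digest."""
    canonical_set = set(tryptic_peptides(canonical))
    return sorted(p for p in tryptic_peptides(variant) if p not in canonical_set)


# Toy canonical sequence and a hypothetical S6R substitution (not real data).
canonical = "MAGKLSDERPTK"
variant = apply_substitution(canonical, 6, "S", "R")

print(tryptic_peptides(canonical))
print(tryptic_peptides(variant))
print(variant_specific_peptides(canonical, variant))
```

In this toy case the substitution introduces a new cleavage site, so the variant yields peptides that a canonical-only database cannot explain. Appending such variant sequences to the search database is one general way standard software can be made to identify them.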
