首页 | 本学科首页   官方微博 | 高级检索  
   检索      


Deriving the probabilities of water loss and ammonia loss for amino acids from tandem mass spectra
Authors:Sun Shiwei  Yu Chungong  Qiao Yantao  Lin Yu  Dong Gongjin  Liu Changning  Zhang Jingfen  Zhang Zhuo  Cai Jinjin  Zhang Hong  Bu Dongbo
Institution:Bioinformatics Group, Center for Advanced Computing Research, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China.
Abstract:In protein identification through tandem mass spectrometry, it is critical to accurately predict the theoretical spectrum for a peptide sequence. The widely used prediction models, such as SEQUEST and MASCOT, ignore the intensity of the ions with important neutral losses, including water loss and ammonia loss. However, ignoring these neutral losses results in a significant deviation between the predicted theoretical spectrum and its experimental counterpart. Here, based on the "one peak, multiple explanations" observation, we proposed an expectation-maximization (EM) method to automatically learn the probabilities of water loss and ammonia loss for each amino acid. Then we employed these probabilities to design an improved statistical model for theoretical spectrum prediction. We implemented these methods and tested them on practical data. On a training set containing 1803 spectra, the experimental results show a good agreement with some known knowledge about neutral losses, such as the tendency of water loss from Asp, Glu, Ser, and Thr. Furthermore, on a testing set containing 941 spectra, the improved similarity between the experimental and predicted spectra demonstrates that this method can generate more reasonable predictions relative to the model that ignores neutral losses. As an application of the derived probabilities, we implemented a database searching method adopting the improved theoretical spectrum model with neutral loss ions estimated. Experimental results on Keller's data set demonstrate that this method can identify peptides more accurately than SEQUEST. In another application to validate SEQUEST's results, the reported peptide-spectrum pairs are reranked with respect to the similarity between experimental and predicted spectra. Experimental results on both LTQ and QSTAR data sets suggest that this reranking strategy can effectively distinguish the false negative predictions reported by SEQUEST.
Keywords:
本文献已被 PubMed 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号