Similar articles
1.
In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision-induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model when analyzing triply charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
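
The Naive model described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: it covers b and y ions only, uses standard monoisotopic residue masses for a small subset of amino acids, and the function name `naive_fragments` is hypothetical. It does not implement Basophile's ordinal regression.

```python
# Monoisotopic residue masses in Da (subset only; extend as needed).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
PROTON = 1.007276
WATER = 18.010565


def naive_fragments(peptide, precursor_charge):
    """Enumerate b/y ions for every peptide bond at every charge below the precursor's.

    Assumption: a singly charged precursor still yields charge-1 fragments.
    Returns a list of (ion_name, charge, m/z) tuples.
    """
    max_z = max(1, precursor_charge - 1)
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    fragments = []
    prefix = 0.0
    for i in range(1, len(peptide)):
        prefix += masses[i - 1]
        b_neutral = prefix                  # N-terminal fragment, no water
        y_neutral = total - prefix + WATER  # C-terminal fragment keeps water
        for z in range(1, max_z + 1):
            fragments.append((f"b{i}", z, (b_neutral + z * PROTON) / z))
            fragments.append((f"y{len(peptide) - i}", z, (y_neutral + z * PROTON) / z))
    return fragments
```

For a triply charged five-residue peptide this enumerates 4 bonds x 2 ion types x 2 charges = 16 candidate fragments, which illustrates why the abstract calls many of the predictions unnecessary for highly charged precursors.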

2.
Scherl A, Tsai YS, Shaffer SA, Goodlett DR. Proteomics 2008, 8(14): 2791-2797
Although mass spectrometers are capable of providing high mass accuracy data, assignment of the true monoisotopic precursor ion mass is complicated during data-dependent ion selection for LC-MS/MS analysis of complex mixtures. The complication arises when chromatographic peak widths for a given analyte exceed the time required to acquire a precursor ion mass spectrum. The result is that many measured monoisotopic masses are misassigned because they are calculated from a single mass spectrum with poor ion statistics, based on only a fraction of the total available ions for a given analyte. Such data in turn produce errors in automated database searches, where the precursor m/z value is one search parameter. We propose here a postacquisition approach to correct misassigned monoisotopic m/z values that involves peak detection over the entire elution profile and correction of the precursor ion monoisotopic mass. As a result of using this approach to reprocess shotgun proteomic data, we increased peptide sequence assignments by 10% while reducing the estimated false positive ratio from 1% to 0.2%. We also show that 4% of the salvaged identifications may be accounted for by correction of mixed tandem mass spectra resulting from fragmentation of multiple peptides simultaneously, a situation which we refer to as accidental CID.

3.
Fourier transform-all reaction monitoring (FT-ARM) is a novel approach for the identification and quantification of peptides that relies upon the selectivity of high mass accuracy data and the specificity of peptide fragmentation patterns. An FT-ARM experiment involves continuous, data-independent, high mass accuracy MS/MS acquisition spanning a defined m/z range. Custom software was developed to search peptides against the multiplexed fragmentation spectra by comparing theoretical or empirical fragment ions against every fragmentation spectrum across the entire acquisition. A dot product score is calculated against each spectrum to generate a score chromatogram used for both identification and quantification. Chromatographic elution profile characteristics are not used to cluster precursor peptide signals to their respective fragment ions. FT-ARM identifications are demonstrated to be complementary to conventional data-dependent shotgun analysis, especially in cases where the data-dependent method fails because of fragmenting multiple overlapping precursors. The sensitivity, robustness, and specificity of FT-ARM quantification are shown to be analogous to selected reaction monitoring-based peptide quantification with the added benefit of minimal assay development. Thus, FT-ARM is demonstrated to be a novel and complementary data acquisition, identification, and quantification method for the large scale analysis of peptides.
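
The score chromatogram idea in this abstract can be illustrated with a small sketch. The abstract does not specify the normalization, the matching tolerance, or how theoretical intensities are weighted, so the choices below (unit-intensity theoretical fragments, a ppm window, cosine-style normalization) and the function names are assumptions for illustration only.

```python
import math


def dot_product_score(theoretical_mz, peaks, tol_ppm=10.0):
    """Normalized dot product between unit-intensity theoretical fragments and one spectrum.

    peaks is a list of (m/z, intensity) pairs.
    """
    matched = []
    for target in theoretical_mz:
        tol = target * tol_ppm * 1e-6
        # Most intense observed peak within tolerance, or 0.0 if none matches.
        best = max((inten for mz, inten in peaks if abs(mz - target) <= tol), default=0.0)
        matched.append(best)
    norm_obs = math.sqrt(sum(inten * inten for _, inten in peaks))
    norm_theo = math.sqrt(len(theoretical_mz))
    if norm_obs == 0 or norm_theo == 0:
        return 0.0
    return sum(matched) / (norm_obs * norm_theo)


def score_chromatogram(theoretical_mz, spectra):
    """spectra is a list of (retention_time, peaks); returns [(retention_time, score)]."""
    return [(rt, dot_product_score(theoretical_mz, peaks)) for rt, peaks in spectra]
```

Plotting the returned scores against retention time gives a trace whose apex could serve for identification and whose area could serve for quantification, in the spirit of what the abstract describes.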

4.
Mass spectrometry-based sequencing of enzymatically generated peptides is widely used to obtain specific sequence tags allowing the unambiguous identification of proteins. In the present study, two types of desorption/ionization techniques combined with different modes of ion dissociation, namely vacuum matrix-assisted laser desorption/ionization (vMALDI) high energy collision induced dissociation (CID) and post-source decay (PSD) as well as atmospheric pressure (AP)-MALDI low energy CID, were applied for the fragmentation of singly protonated peptide ions, which were derived from two-dimensional separated, silver-stained and trypsin-digested hydrophilic as well as hydrophobic glomerular proteins. Thereby, defined properties of the individual fragmentation patterns generated by the specified modes could be observed. Furthermore, the compatibility of the varying PSD and CID (MS/MS) data with database search derived identification using two publicly accessible search algorithms has been evaluated. The peptide sequence tag information obtained by PSD and high energy CID enabled an unambiguous identification in the majority of cases. In contrast, part of the data obtained by low energy CID were not assignable using similar search parameters, and therefore no clear results were obtainable. Knowledge of the properties of available MALDI-based fragmentation techniques is an important factor for data interpretation using publicly accessible search algorithms and, moreover, for the identification of two-dimensional gel separated proteins.

5.
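Data-independent acquisition (DIA) is a mass spectrometry acquisition technique that has developed rapidly in proteomics in recent years. It acquires MS/MS spectra by fragmenting, without bias, all precursor ions within an isolation window, which in theory enables deep coverage of protein samples while offering high throughput, high reproducibility, and high sensitivity. Existing DIA acquisition methods fall into three categories: full-window fragmentation methods, sequential isolation-window fragmentation methods, and four-dimensional DIA (4D-DIA) acquisition methods. To suit the different characteristics of DIA data, the main data analysis methods fall into four categories: spectral library searching, direct searching of protein sequence databases, pseudo-MS/MS spectrum identification, and de novo sequencing. The resulting peptide identifications require confidence assessment, which comprises two steps: re-ranking with machine learning methods and false discovery rate estimation for the reported result set. Together these provide quality control of the analysis results. This article organizes and reviews DIA acquisition methods, data analysis methods and software, and methods for assessing the confidence of identifications, and discusses future directions.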

6.
Recent emergence of new mass spectrometry techniques (e.g. electron transfer dissociation, ETD) and improved availability of additional proteases (e.g. Lys-N) for protein digestion in high-throughput experiments raised the challenge of designing new algorithms for interpreting the resulting new types of tandem mass (MS/MS) spectra. Traditional MS/MS database search algorithms such as SEQUEST and Mascot were originally designed for collision induced dissociation (CID) of tryptic peptides and are largely based on expert knowledge about fragmentation of tryptic peptides (rather than machine learning techniques) to design CID-specific scoring functions. As a result, the performance of these algorithms is suboptimal for new mass spectrometry technologies or nontryptic peptides. We recently proposed the generating function approach (MS-GF) for CID spectra of tryptic peptides. In this study, we extend MS-GF to automatically derive scoring parameters from a set of annotated MS/MS spectra of any type (e.g. CID, ETD, etc.), and present a new database search tool MS-GFDB based on MS-GF. We show that MS-GFDB outperforms Mascot for ETD spectra or peptides digested with Lys-N. For example, in the case of ETD spectra, the number of tryptic and Lys-N peptides identified by MS-GFDB increased by a factor of 2.7 and 2.6 as compared with Mascot. Moreover, even following a decade of Mascot developments for analyzing CID spectra of tryptic peptides, MS-GFDB (that is not particularly tailored for CID spectra or tryptic peptides) resulted in 28% increase over Mascot in the number of peptide identifications. Finally, we propose a statistical framework for analyzing multiple spectra from the same precursor (e.g. CID/ETD spectral pairs) and assigning p values to peptide-spectrum-spectrum matches. Since the introduction of electron capture dissociation (ECD) in 1998 (1), electron-based peptide dissociation technologies have played an important role in analyzing intact proteins and post-translational modifications (2). However, until recently, this research-grade technology was available only to a small number of laboratories because it was commercially unavailable, required experience for operation, and could be implemented only with expensive FT-ICR instruments. The discovery of electron-transfer dissociation (ETD) (3) enabled an ECD-like technology to be implemented in (relatively cheap) ion-trap instruments. Nowadays, many researchers are employing the ETD technology for tandem mass spectra generation (4-9). Although the hardware technologies to generate ETD spectra are maturing rapidly, software technologies to analyze ETD spectra are still in their infancy. There are two major approaches to analyzing tandem mass spectra: de novo sequencing and database search. Both approaches find the best-scoring peptide either among all possible peptides (de novo sequencing) or among all peptides in a protein database (database search). Although de novo sequencing is emerging as an alternative to database search, database search remains a more accurate (and thus preferred) method of spectral interpretation, so here we focus on the database search approach. Numerous database search engines are currently available, including SEQUEST (10), Mascot (11), OMSSA (12), X!Tandem (13), and InsPecT (14). However, most of them are inadequate for the analysis of ETD spectra because they are optimized for collision induced dissociation (CID) spectra that show different fragmentation propensities than those of ETD spectra.
Additionally, the existing tandem mass spectrometry (MS/MS) tools are biased toward the analysis of tryptic peptides because trypsin is usually used for CID, and thus not suitable for the analysis of nontryptic peptides that are common for ETD. Therefore, even though some database search engines support the analysis of ETD spectra (e.g. SEQUEST, Mascot, and OMSSA), their performance remains suboptimal when it comes to analyzing ETD spectra. Recently, an ETD-specific database search tool (Z-Core) was developed; however, it does not significantly improve over OMSSA (15). We present a new database search tool (MS-GFDB) that significantly outperforms existing database search engines in the analysis of ETD spectra, and performs equally well on nontryptic peptides. MS-GFDB employs the generating function approach (MS-GF) that computes rigorous p values of peptide-spectrum matches (PSMs) based on the spectrum-specific score histogram of all peptides (16). MS-GF p values are dependent only on the PSM (and not on the database), thus can be used as an alternative scoring function for the database search. Computing p values requires a scoring model evaluating qualities of PSMs. MS-GF adopts a probabilistic scoring model (MS-Dictionary scoring model) described in Kim et al., 2009 (17), considering multiple features including product ion types, peak intensities and mass errors. To define the parameters of this scoring model, MS-GF only needs a set of training PSMs. This set of PSMs can be obtained in a variety of ways: for example, one can generate CID/ETD pairs and use peptides identified by CID to form PSMs for ETD. Alternatively, one can generate spectra from a purified protein (when PSMs can be inferred from the accurate parent mass alone) or use a previously developed (not necessarily optimal) tool to generate training PSMs.
From these training PSMs, MS-GF automatically derives scoring parameters without assuming any prior knowledge about the specifics of a particular peptide fragmentation method (e.g. ETD, CID, etc.) and/or proteolytic origin of the peptides. MS-GF was originally designed for the analysis of CID spectra, but now it has been extended to other types of spectra generated by various fragmentation techniques and/or various enzymes. We show that MS-GF can be successfully applied to novel types of spectra (e.g. ETD of Lys-N peptides (18, 19)) by simply retraining scoring parameters without any modification. Note that although the same scoring model is used for different types of spectra, the parameters derived to score different types of spectra are dissimilar. We compared the performance of MS-GFDB with Mascot on a large ETD data set and found that it generated many more peptide identifications for the same false discovery rates (FDR). For example, at 1% peptide level FDR, MS-GFDB identified 9450 unique peptides from 81,864 ETD spectra of Lys-N peptides whereas Mascot only identified 3672 unique peptides, ≈160% increase in the number of peptide identifications (a similar improvement is observed for ETD spectra of tryptic peptides). MS-GFDB also showed a significant 28% improvement in the number of identified peptides from CID spectra of tryptic peptides (16,203 peptides as compared with 12,658 peptides identified by Mascot). The ETD technology complements rather than replaces CID because both technologies have some advantages: CID for smaller peptides with small charges, ETD for larger and multiply charged peptides (20, 21). An alternative way to utilize ETD is to use it in conjunction with CID because CID and ETD generate complementary sequence information (20, 22, 23). ETD-enabled instruments often support generating both CID and ETD spectra (CID/ETD pairs) for the same peptide.
Although the CID/ETD pairs promise a great improvement in peptide identification, the full potential of such pairs has not been fully realized yet. In the case of de novo sequencing, de novo sequencing tools utilizing CID/ETD pairs indeed result in more accurate de novo peptide sequencing than traditional CID-based algorithms (23, 24, 25). However, in the case of database search, the argument that the use of CID/ETD pairs improves peptide identifications remains poorly substantiated. A few tools have been developed to use CID/ETD (or CID/ECD) pairs for the database search but they are limited to preprocessing/postprocessing of the spectral data before or following running a traditional database search tool (26, 27). Nielsen et al., 2005 (22) pioneered the combined use of CID and ECD for the database search. Given a CID/ECD pair, they generated a combined spectrum comprised only of complementary pairs of peaks, and searched it with Mascot. However, this approach is hard to generalize to less accurate CID/ETD pairs generated by ion-trap instruments because there is a higher chance that the identified complementary pairs of peaks are spurious. More importantly, using traditional MS/MS tools (such as Mascot) for the database search of the combined spectrum is inappropriate, because they are not optimized for analyzing such combined spectra; a better approach would be to develop a new database search tool tailored for the combined spectrum. Recently, Molina et al., 2008 (26) studied database search of CID/ETD pairs using Spectrum Mill (Agilent Technologies, Santa Clara, CA) and came to a counterintuitive conclusion that using only CID spectra identifies 12% more unique peptides than using CID/ETD pairs.
We believe that this is an acknowledgment of the limitations of the traditional MS/MS database search tools for the analysis of multiple spectra generated from a single peptide. In this paper, we modify the generating function approach for interpreting CID/ETD pairs and further apply it to improve the database search with CID/ETD pairs. In contrast to previous approaches, our scoring is specially designed to interpret CID/ETD pairs and can be generalized to analyzing any type of multiple spectra generated from a single peptide. When CID/ETD pairs from trypsin digests are used, MS-GFDB identified 13% and 27% more peptides compared with the case when only CID spectra and only ETD spectra are used, respectively. The difference was even more prominent when CID/ETD pairs from Lys-N digests were used, with 41% and 33% improvement over CID only and ETD only, respectively. Assigning a p value to a PSM greatly helped researchers to evaluate the quality of peptide identifications. We now turn to the problem of assigning a p value to a peptide-spectrum-spectrum match (PS2M) when two spectra in PS2M are generated by different fragmentation technologies (e.g. ETD and CID). We argue that assigning statistical significance to a PS2M (or even PSnM) is a prerequisite for rigorous CID/ETD analyses. To our knowledge, MS-GFDB is the first tool to generate statistically rigorous p values of PSnMs. The MS-GFDB executable and source code are available at the website of the Center for Computational Mass Spectrometry at UCSD (http://proteomics.ucsd.edu). It takes a set of spectra (CID, ETD, or CID/ETD pairs) and a protein database as an input and outputs peptide matches. If the input is a set of CID/ETD pairs, it outputs the best scoring peptide matches and their p values (1) using only CID spectra, (2) using only ETD spectra, and (3) using combined spectra of CID/ETD pairs.

7.
In proteomics, selected reaction monitoring (SRM) is rapidly gaining importance for targeted protein quantification. The triple quadrupole mass analyzers used in SRM assays allow for levels of specificity and sensitivity hard to accomplish by more standard shotgun proteomics experiments. Often, an SRM assay is built by in silico prediction of transitions and/or extraction of peptide precursor and fragment ions from a spectral library. Spectral libraries are typically generated from nonideal ion trap based shotgun proteomics experiments or synthetic peptide libraries, consuming considerable time and effort. Here, we investigate the usability of beam type CID (or "higher energy CID" (HCD)) peptide fragmentation spectra, as acquired using an Orbitrap Velos, to facilitate SRM assay development. Therefore, peptide fragmentation spectra, obtained by ion-trap CID, triple-quadrupole CID (QqQ-CID) and Orbitrap HCD, originating from digested cellular lysates, were compared. Spectral comparison and a dedicated correlation algorithm indicated significantly higher similarity between QqQ-CID and HCD fragmentation spectra than between QqQ-CID and ion trap-CID spectra. SRM transitions generated using a constructed HCD spectral library increased SRM assay sensitivity up to 2-fold, when compared to the use of a library created from more conventionally used ion trap-CID spectra, showing that HCD spectra can assist SRM assay development.

8.
Mass spectrometry (MS) analysis of peptides carrying post‐translational modifications is challenging due to the instability of some modifications during MS analysis. However, glycopeptides as well as acetylated, methylated and other modified peptides release specific fragment ions during CID (collision‐induced dissociation) and HCD (higher energy collisional dissociation) fragmentation. These fragment ions can be used to validate the presence of the PTM on the peptide. Here, we present PTM MarkerFinder, a software tool that takes advantage of such marker ions. PTM MarkerFinder screens the MS/MS spectra in the output of a database search (i.e., Mascot) for marker ions specific for selected PTMs. Moreover, it reports and annotates the HCD and the corresponding electron transfer dissociation (ETD) spectrum (when present), and summarizes information on the type, number, and ratios of marker ions found in the data set. In the present work, a sample containing enriched N‐acetylhexosamine (HexNAc) glycopeptides from yeast has been analyzed by liquid chromatography‐mass spectrometry on an LTQ Orbitrap Velos using both HCD and ETD fragmentation techniques. The identification result (Mascot .dat file) was submitted as input to PTM MarkerFinder and screened for HexNAc oxonium ions. The software output has been used for high‐throughput validation of the identification results.

9.
A novel database search algorithm is presented for the qualitative identification of proteins over a wide dynamic range, both in simple and complex biological samples. The algorithm has been designed for the analysis of data originating from data independent acquisitions, whereby multiple precursor ions are fragmented simultaneously. Measurements used by the algorithm include retention time, ion intensities, charge state, and accurate masses on both precursor and product ions from LC‐MS data. The search algorithm uses an iterative process whereby each iteration incrementally increases the selectivity, specificity, and sensitivity of the overall strategy. Increased specificity is obtained by utilizing a subset database search approach, whereby for each subsequent stage of the search, only those peptides from securely identified proteins are queried. Tentative peptide and protein identifications are ranked and scored by their relative correlation to a number of models of known and empirically derived physicochemical attributes of proteins and peptides. In addition, the algorithm utilizes decoy database techniques for automatically determining the false positive identification rates. The search algorithm has been tested by comparing the search results from a four‐protein mixture, the same four‐protein mixture spiked into a complex biological background, and a variety of other “system” type protein digest mixtures. The method was validated independently by data dependent methods, while concurrently relying on replication and selectivity. Comparisons were also performed with other commercially and publicly available peptide fragmentation search algorithms. The presented results demonstrate the ability to correctly identify peptides and proteins from data independent acquisition strategies with high sensitivity and specificity. 
They also illustrate a more comprehensive analysis of the samples studied, providing approximately 20% more protein identifications than a more conventional data directed approach using the same identification criteria, with a concurrent increase in both sequence coverage and the number of modified peptides.
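
The decoy database technique mentioned in this abstract is commonly summarized as a ratio of decoy to target hits above a score threshold. The sketch below shows that generic estimate only. It is an assumption that the described algorithm uses this exact formula, and the function name and input layout are hypothetical.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate the false positive rate among matches scoring at or above threshold.

    matches is a list of (score, is_decoy) pairs from a combined target/decoy search.
    The estimate is (#decoy hits) / (#target hits) above the threshold.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0
```

Raising the threshold until the estimate drops below a chosen level (for example 1%) gives the score cutoff that a search tool would then report.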

10.
Peptide identification using tandem mass spectrometry is a core technology in proteomics. Latest generations of mass spectrometry instruments enable the use of electron transfer dissociation (ETD) to complement collision induced dissociation (CID) for peptide fragmentation. However, a critical limitation to the use of ETD has been optimal database search software. Percolator is a post-search algorithm, which uses semi-supervised machine learning to improve the rate of peptide spectrum identifications (PSMs) together with providing reliable significance measures. We have previously interfaced the Mascot search engine with Percolator and demonstrated sensitivity and specificity benefits with CID data. Here, we report recent developments in the Mascot Percolator V2.0 software including an improved feature calculator and support for a wider range of ion series. The updated software is applied to the analysis of several CID and ETD fragmented peptide data sets. This version of Mascot Percolator increases the number of CID PSMs by up to 80% and ETD PSMs by up to 60% at a 0.01 q-value (1% false discovery rate) threshold over a standard Mascot search, notably recovering PSMs from high charge state precursor ions. The greatly increased number of PSMs and peptide coverage afforded by Mascot Percolator has enabled a fuller assessment of CID/ETD complementarity to be performed. Using a data set of CID and ETcaD spectral pairs, we find that at a 1% false discovery rate, the overlap in peptide identifications by CID and ETD is 83%, which is significantly higher than that obtained using either stand-alone Mascot (69%) or OMSSA (39%). We conclude that Mascot Percolator is a highly sensitive and accurate post-search algorithm for peptide identification and allows direct comparison of peptide identifications using multiple alternative fragmentation techniques.

11.
A database independent search algorithm for the detection of phosphopeptides is described. The program interrogates the tandem mass spectra of LC-MS/MS data sets regarding the presence of phosphorylation specific signatures. To achieve maximum informational content, the complementary fragmentation techniques electron capture dissociation (ECD) and collisionally activated dissociation (CAD) are used independently for peptide fragmentation. Several criteria characteristic for peptides phosphorylated on either serine or threonine residues were evaluated. The final algorithm searches for product ions generated by either the neutral loss of phosphoric acid or the combined neutral loss of phosphoric acid and water. Various peptide mixtures were used to evaluate the program. False positive results were not observed because the program utilizes the parts-per-million mass accuracy of Fourier transform ion cyclotron resonance mass spectrometry. Additionally, false negative results were not generated owing to the high sensitivity of the chosen criteria. The limitations of database dependent data interpretation tools are discussed and the potential of the novel algorithm to overcome these limitations is illustrated.
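
The neutral-loss criterion described in this abstract can be sketched as a simple peak lookup. The sketch assumes that the product ion keeps the precursor charge (a simplification that fits CAD better than ECD), uses monoisotopic masses for H3PO4 and H2O, and applies a ppm tolerance chosen for illustration. The function name is hypothetical and this is not the published program.

```python
H3PO4 = 97.976896  # monoisotopic mass of phosphoric acid, Da
H2O = 18.010565    # monoisotopic mass of water, Da


def has_phospho_neutral_loss(precursor_mz, charge, peaks, tol_ppm=5.0):
    """Return True if a peak matches loss of H3PO4 or of H3PO4 + H2O from the precursor.

    peaks is a list of (m/z, intensity) pairs. Assumes the product ion
    retains the precursor charge.
    """
    targets = [
        precursor_mz - H3PO4 / charge,
        precursor_mz - (H3PO4 + H2O) / charge,
    ]
    for target in targets:
        tol = target * tol_ppm * 1e-6
        if any(abs(mz - target) <= tol for mz, _ in peaks):
            return True
    return False
```

The tight ppm window is what makes such a check selective, which matches the abstract's point that high mass accuracy suppresses false positives.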

12.
The increasing use of multistage tandem mass spectrometry (MS/MS and MS3) methods for comprehensive phosphoproteome analysis studies, as well as the emerging application of in silico spectral intensity prediction algorithms in enhanced database search analysis strategies, necessitates the development of an improved understanding of the mechanisms and other factors that affect the gas-phase fragmentation reactions of phosphorylated peptide ions. To address this need, we have examined the multistage collision-induced dissociation (CID) behavior of a set of singly and doubly charged phosphoserine- and phosphothreonine-containing peptide ions, as well as their regioselectively or uniformly deuterated derivatives, in a quadrupole ion trap mass spectrometer. Consistent with previous reports, the neutral loss of phosphoric acid (H3PO4) was observed as a dominant reaction pathway upon MS/MS. The magnitude of this loss was found to be highly dependent on the proton mobility of the precursor ion for both phosphoserine- and phosphothreonine-containing peptides. In contrast to what is currently accepted in the literature, however, the results obtained in this study unequivocally demonstrate that the loss of H3PO4 does not predominantly occur via a "charge-remote" beta-elimination reaction. The observation of product ions corresponding to the loss of formaldehyde (CH2O, 30 Da, or CD2O, 32 Da) or acetaldehyde (CH3CHO, 44 Da) upon MS3 dissociation of the [M+nH-H3PO4]n+ product ions from phosphoserine- and phosphothreonine-containing peptide ions, respectively, provides experimental evidence for a "charge-directed" mechanism involving an SN2 neighboring group participation reaction, resulting in the formation of a cyclic product ion. Potentially, these "diagnostic" MS3 product ions may provide additional information to facilitate the characterization of phosphopeptides containing multiple potential phosphorylation sites.

13.
The dominant ions in MS/MS spectra of peptides, which have been fragmented by low-energy CID, are often b-, y-ions and their derivatives resulting from the cleavage of the peptide bonds. However, MS/MS spectra typically contain many more peaks. These can result not only from isotope variants and multiply charged replicates of the peptide fragmentation products but also from unknown fragmentation pathways, sample-specific or systematic chemical contaminations or from noise generated by the electronic detection system. The presence of this background complicates spectrum interpretation. Besides dramatically prolonged computation time, it can lead to incorrect protein identification, especially in the case of de novo sequencing algorithms. Here, we present an algorithm for detection and transformation of multiply charged peaks into singly charged monoisotopic peaks, removal of heavy isotope replicates, and random noise. A quantitative criterion for the recognition of some noninterpretable spectra has been derived as a byproduct. The approach is based on numerical spectral analysis and signal detection methods. The algorithm has been implemented in a stand-alone computer program called MS Cleaner that can be obtained from the authors upon request.
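
The transformation of multiply charged peaks into singly charged peaks rests on a short piece of arithmetic, sketched below. Inferring the charge from isotope spacing is a common heuristic and is assumed here; the abstract does not say that MS Cleaner works this way, and both function names and the tolerance are hypothetical.

```python
PROTON = 1.007276
ISOTOPE_SPACING = 1.003355  # approximate 13C - 12C mass difference, Da


def infer_charge(mz, all_mz, max_charge=4, tol=0.01):
    """Return the highest charge z with a peak at mz + ISOTOPE_SPACING / z, else None."""
    for z in range(max_charge, 0, -1):
        expected = mz + ISOTOPE_SPACING / z
        if any(abs(other - expected) <= tol for other in all_mz):
            return z
    return None


def to_singly_charged(mz, z):
    """Convert the m/z of a z-charged ion to the m/z of its singly charged form."""
    return mz * z - (z - 1) * PROTON
```

For example, a doubly charged peak at m/z 500.5 corresponds to a singly charged ion at 2 x 500.5 minus one proton mass, which is the value a deconvoluted spectrum would carry forward.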

14.
Identification of single glycoconjugate components in a complex mixture from the urine of a patient suffering from a congenital disorder of glycosylation was probed by MALDI-MS analysis on a hybrid quadrupole time-of-flight instrument. In negative ion mode, complex maps containing more than 50 ionic species were obtained and a number of molecular ions were directly assigned using a previously developed computer-assisted algorithm. To confirm the data and determine the carbohydrate sequence, single molecular ions were selected and submitted to fragmentation experiments. Interpretation of fragmentation spectra was also assisted by the software using alignment with spectra generated in silico. According to fragmentation data, the majority of glycoconjugate ionic species could be assigned to free oligosaccharides along with ten species tentatively assigned to glycopeptides. Following this approach for glycan identification by a combination of MALDI-QTOF-MS and MS/MS experiments, computer-assisted assignment and fragment analysis, data for a potential glycan database are produced. Of high benefit for this approach are two main factors: low sample consumption due to the high sensitivity of ion formation, and generation of only singly charged species in MALDI-MS, allowing interpretation without any deconvolution. In this experimental set-up, sequencing of single components from the MALDI maps by low energy CID followed by computer-assisted assignment and database search is proposed as a most efficient strategy for the rapid identification of complex carbohydrate structures in clinical glycomics.

15.
Protein analysis by database search engines using tandem mass spectra is limited by the presence of unexpected protein modifications, sequence isoforms which may not be in the protein databases, and poor quality tandem mass spectrometry (MS/MS) of low abundance proteins. The analysis of expected protein modifications can be efficiently addressed by precursor ion scanning. However, it is limited to modifications that show such a characteristic loss in a peptide independent manner. We observed that proline and aspartic acid induced backbone fragmentation is accompanied by a low intensity signal for loss of H3PO4 for several pSer- or pThr-phosphopeptides. We describe here the use of peptide-specific fragments that can be used after a protein has been identified to allow in-depth characterization of modifications and isoforms. We consider high abundance fragments formed by cleavage at the C-terminal side of aspartic acid, at the N-terminal side of proline and low mass ions such as a2, b2, b3, y1, y2, and y3. The MS/MS dataset is filtered for each sequence tag of interest by an in silico precursor ion scan. The resulting extracted ion traces are then combined by multiplication to increase specificity. Since the strategy is based on common peptide segments which are shared by different isoforms of peptides, it can be applied to the analysis of any post-translational modification or sequence variants of a protein. This is demonstrated for the cases of serine and threonine phosphorylation, histone H1 acetylation and the spotting of multiple H1 isoforms.
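
Combining extracted ion traces by multiplication, as this abstract describes, can be illustrated in a few lines. The sketch assumes each MS/MS spectrum is a list of (m/z, intensity) pairs and uses a fixed absolute tolerance. The function names and the tolerance value are illustrative assumptions, not the authors' implementation.

```python
def extracted_trace(spectra, target_mz, tol=0.5):
    """Per-spectrum maximum intensity within tol of target_mz (0.0 where nothing matches)."""
    return [
        max((inten for mz, inten in peaks if abs(mz - target_mz) <= tol), default=0.0)
        for peaks in spectra
    ]


def combined_trace(spectra, fragment_mzs, tol=0.5):
    """Multiply the extracted traces of all fragments, spectrum by spectrum."""
    combined = [1.0] * len(spectra)
    for target in fragment_mzs:
        trace = extracted_trace(spectra, target, tol)
        combined = [c * t for c, t in zip(combined, trace)]
    return combined
```

Because a single missing fragment drives the product to zero, only spectra containing every selected fragment survive, which is how multiplication increases specificity.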

16.
Because of the intrinsic physical properties of singly or doubly charged ions, MALDI-based CID on these peptide precursor ions tends to be incomplete, leaving a large number of MS/MS spectra unassigned or ambiguously identified. Consequently, the TOF/TOF high throughput capability may not be fully explored and utilized. Here, we describe a novel method for de novo sequence assignment of those MALDI TOF/TOF MS/MS spectra with incomplete or weak fragment ion series. In this approach, deuterium-labeled lysine and leucine precursors were used in parallel to mass-tag the proteome of a metastatic human hepatocellular carcinoma (HCC) cell line during in vivo cell culturing. These stable isotope precursor markers are located not only at terminal but also at internal MS/MS fragment ions, with the characteristic isotope pattern induced by multiple mass tagging in parallel. This enhanced signal specificity evidently resolved ambiguities in those sparse, poor-quality TOF/TOF spectra by providing critical sequential links among MS/MS fragment ions. Our data-dependent approach was able to reduce many false positives in current genome sequence-based peptide sequencing. With correspondingly developed new algorithms, our approach is amenable to automation that will lead to more comprehensive and reliable identification for proteomes.

17.
Although genome databases have become the key for proteomic analyses, de novo sequencing remains essential for the study of organisms whose genomes have not been completed. In addition, post-translational modifications present a challenge in database searching. Recognition of the b- or y-ion series in a peptide MS/MS spectrum as well as identification of the b1- and yn-1-ions can facilitate de novo analyses. Therefore, it is valuable to identify either amino-acid terminus. In previous work, we have demonstrated that peptides modified at the epsilon-amino group of lysine as a t-butyl peroxycarbamate derivative undergo free radical promoted peptide backbone fragmentation under low-energy collision-induced dissociation (CID) conditions. Here we explore the chemistry of the N-terminal amino group modified as a t-butyl peroxycarbamate. The conversion of N-terminal amines to peroxycarbamates of simple amino acids and peptides was studied with aryl t-butyl peroxycarbonates. ESI-MS/MS analysis of the peroxycarbamate adducts gave evidence of a product ion corresponding to the neutral loss of the N-terminal side chain (R), thus identifying this residue. Further fragmentation (MS3) of product ions formed by N-terminal residue side-chain loss (-R) exhibited an m/z shift of the b-ions equal to the neutral loss of R, therefore labeling the b-ion series. The study was extended to the analysis of a protein tryptic digest where the SALSA algorithm was used to identify spectra containing these neutral losses. The method for N-terminus identification presented here has the potential for improvement of de novo analyses as well as in constraining peptide mass mapping database searches.

18.
An Z  Chen Y  Koomen JM  Merkler DJ 《Proteomics》2012,12(2):173-182
Amidation is a post-translational modification found at the C-terminus of ~50% of all neuropeptide hormones. Cleavage of the C(α)-N bond of a C-terminal glycine yields the α-amidated peptide in a reaction catalyzed by peptidylglycine α-amidating monooxygenase (PAM). The mass of an α-amidated peptide is 58 Da lower than that of its precursor. The amino acid sequences of an α-amidated peptide and its precursor differ only by the C-terminal glycine, meaning that the peptides exhibit similar RP-HPLC properties and tandem mass spectral (MS/MS) fragmentation patterns. Growth of cultured cells in the presence of a PAM inhibitor ensured the coexistence of α-amidated peptides and their precursors. A strategy was developed for precursor and α-amidated peptide pairing (PAPP): LC-MS/MS data of peptide extracts were scanned for peptide pairs that differed by 58 Da in mass, but had similar RP-HPLC retention times. The resulting peptide pairs were validated by checking for similar fragmentation patterns in their MS/MS data prior to identification by database searching or manual interpretation. This approach significantly reduced the number of spectra requiring interpretation, decreasing the computing time required for database searching and enabling manual interpretation of unidentified spectra. Reported here are the α-amidated peptides identified from AtT-20 cells using the PAPP method.
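The pairing step of PAPP (a 58 Da mass difference combined with similar retention times) might be sketched as follows. This is a simplified illustration, not the published method: the record layout `(id, mass, retention_time)`, the function name, and both tolerances are assumptions. The later validation by fragmentation-pattern similarity is left out.

```python
def find_amidation_pairs(peptides, mass_delta=58.0, mass_tol=0.05, rt_tol=1.0):
    """Return (precursor_id, amidated_id) candidate pairs.

    `peptides` is a list of (id, mass_in_Da, retention_time_in_min) tuples.
    A pair qualifies when the first peptide is heavier than the second by
    `mass_delta` (within `mass_tol`) and the two retention times differ by
    at most `rt_tol`. All thresholds are illustrative assumptions.
    """
    pairs = []
    for id_a, mass_a, rt_a in peptides:
        for id_b, mass_b, rt_b in peptides:
            mass_ok = abs((mass_a - mass_b) - mass_delta) <= mass_tol
            rt_ok = abs(rt_a - rt_b) <= rt_tol
            if mass_ok and rt_ok:
                pairs.append((id_a, id_b))
    return pairs


# Toy data: "p1" and "a1" differ by about 58 Da and co-elute; "x" has the
# right mass but elutes much later, so it is not paired.
peptides = [
    ("p1", 1058.50, 20.0),
    ("a1", 1000.49, 20.4),
    ("x", 1000.49, 35.0),
]
pairs = find_amidation_pairs(peptides)
```

The nested loop is quadratic in the number of peptides; for a large dataset, sorting by mass and searching a narrow window around each mass would be the usual refinement.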

19.
Searching spectral libraries in MS/MS is an important new approach to improving the quality of peptide and protein identification. The idea relies on the observation that ion intensities in an MS/MS spectrum of a given peptide are generally reproducible across experiments, and thus, matching between spectra from an experiment and the spectra of previously identified peptides stored in a spectral library can lead to better peptide identification compared to the traditional database search. However, the use of libraries is greatly limited by their coverage of peptide sequences: even for well‐studied organisms a large fraction of peptides have not been previously identified. To address this issue, we propose to expand spectral libraries by predicting the MS/MS spectra of peptides based on the spectra of peptides with similar sequences. We first demonstrate that the intensity patterns of dominant fragment ions between similar peptides tend to be similar. In accordance with this observation, we develop a neighbor‐based approach that first selects peptides that are likely to have spectra similar to the target peptide and then combines their spectra using a weighted K‐nearest neighbor method to accurately predict fragment ion intensities corresponding to the target peptide. This approach has the potential to predict spectra for every peptide in the proteome. When rigorous quality criteria are applied, we estimate that the method increases the coverage of spectral libraries available from the National Institute of Standards and Technology by 20–60%, although the values vary with peptide length and charge state. We find that the overall best search performance is achieved when spectral libraries are supplemented by the high quality predicted spectra.
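The idea of combining neighbor spectra with a weighted K-nearest neighbor scheme can be sketched as below. The abstract does not say how similarity or the weights are defined, so this sketch assumes a precomputed similarity score per neighbor and uses it directly as the weight. The function name and the representation of a spectrum as a fixed-length intensity vector are also assumptions.

```python
def predict_intensities(neighbors, k=3):
    """Predict fragment ion intensities as a similarity-weighted average.

    `neighbors` is a list of (similarity, intensity_vector) tuples, where
    every intensity vector has the same length (one entry per fragment ion).
    Only the `k` most similar neighbors contribute. Using the similarity
    itself as the weight is an illustrative assumption.
    """
    top = sorted(neighbors, key=lambda n: n[0], reverse=True)[:k]
    if not top:
        raise ValueError("at least one neighbor is required")
    total = sum(sim for sim, _ in top)
    if total == 0:
        raise ValueError("neighbor similarities must not all be zero")
    n_ions = len(top[0][1])
    return [sum(sim * vec[i] for sim, vec in top) / total for i in range(n_ions)]


# Two equally similar neighbors dominate; the third falls outside k=2.
neighbors = [
    (1.0, [1.0, 0.0]),
    (1.0, [0.0, 1.0]),
    (0.1, [5.0, 5.0]),
]
predicted = predict_intensities(neighbors, k=2)
```

In practice the similarity would come from comparing peptide sequences, and the choice of `k` and of the weighting function would need to be tuned against known spectra.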

20.
Villén J  Beausoleil SA  Gygi SP 《Proteomics》2008,8(21):4444-4452
Phosphopeptide identification and site determination are major challenges in biomedical MS. Both are affected by frequent and often overwhelming losses of phosphoric acid in ion trap CID fragmentation spectra. These losses are thought to translate into reduced intensities of sequence informative ions and a general decline in the quality of MS/MS spectra. To address this issue, several methods have been proposed, which rely on extended fragmentation schemes including collecting MS3 scans from neutral loss-containing ions and multi-stage activation to further fragment these same ions. Here, we have evaluated the utility of these methods in the context of a large-scale phosphopeptide analysis strategy with current instrumentation capable of accurate precursor mass determination. Remarkably, we found that MS3-based schemes did not increase the overall number of confidently identified peptides and had only limited value in site localization. We conclude that the collection of MS3 or pseudo-MS3 scans in large-scale proteomics studies is not worthwhile when high-mass accuracy instrumentation is used.
