Similar articles
Found 20 similar articles (search time: 15 ms)
1.
One of the challenges associated with large-scale proteome analysis using tandem mass spectrometry (MS/MS) and automated database searching is to reduce the number of false positive identifications without sacrificing the number of true positives found. In this work, a systematic investigation of the effect of 2MEGA labeling (N-terminal dimethylation after lysine guanidination) on the proteome analysis of a membrane fraction of an Escherichia coli cell extract by 2-dimensional liquid chromatography MS/MS is presented. By a large-scale comparison of MS/MS spectra of native peptides with those from the 2MEGA-labeled peptides, the labeled peptides were found to undergo facile fragmentation with enhanced a1 or a1-related (a1-17 and a1-45) ions derived from all N-terminal amino acids in the MS/MS spectra; these ions are usually difficult to detect in the MS/MS spectra of nonderivatized peptides. The 2MEGA labeling alleviated the biased detection of arginine-terminated peptides that is often observed in MALDI and ESI MS experiments. 2MEGA labeling was found not only to increase the number of peptides and proteins identified but also to generate enhanced a1 or a1-related ions as a constraint to reduce the number of false positive identifications. In total, 640 proteins were identified from the E. coli membrane fraction, with each protein identified based on peptide mass and sequence match of one or more peptides using the MASCOT database search algorithm from the MS/MS spectra generated by a quadrupole time-of-flight mass spectrometer. Among them, the subcellular locations of 336 proteins are presently known, including 258 membrane and membrane-associated proteins (76.8%). Among the classified proteins, there was a dramatic increase in the total number of integral membrane proteins identified in the 2MEGA-labeled sample (153 proteins) versus the unlabeled sample (77 proteins).

2.
We describe an enabling technique for proteome analysis based on isotope-differential dimethyl labeling of N-termini of tryptic peptides followed by microbore liquid chromatography (LC) matrix-assisted laser desorption and ionization (MALDI) mass spectrometry (MS). In this method, lysine side chains are blocked by guanidination to prevent the incorporation of multiple labels, followed by N-terminal labeling via reductive amination using d(0),(12)C-formaldehyde or d(2),(13)C-formaldehyde. Relative quantification of peptide mixtures is achieved by examining the MALDI mass spectra of the peptide pairs labeled with different isotope tags. A nominal mass difference of 6 Da between the peptide pair allows negligible interference between the two isotopic clusters for quantification of peptides of up to 3000 Da. Since only the N-termini of tryptic peptides are differentially labeled and the a(1) ions are also enhanced in the MALDI MS/MS spectra, interpretation of the fragment ion spectra to obtain sequence information is greatly simplified. It is demonstrated that this technique of N-terminal dimethylation (2ME) after lysine guanidination (GA), or 2MEGA, offers several desirable features, including a simple experimental procedure, stable products, inexpensive and commercially available reagents, and a negligible isotope effect on reversed-phase separation. LC-MALDI MS combined with this 2MEGA labeling technique was successfully used to identify proteins that included polymorphic variants and low abundance proteins in bovine milk. In addition, by analyzing a mixture of two equal amounts of milk whey fraction as a control, it is shown that the measured average ratio for 56 peptide pairs from 14 different proteins is 1.02, which is very close to the theoretical ratio of 1.00. The calculated percentage error is 2.0% and relative standard deviation is 4.6%.
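The accuracy figures quoted in this abstract can be illustrated with a short calculation. The following is a minimal sketch, assuming that the percentage error is the deviation of the mean ratio from the expected ratio and that the relative standard deviation is the sample standard deviation divided by the mean. The function name `ratio_summary` and the example ratios are hypothetical and are not data from the paper.

```python
from statistics import mean, stdev


def ratio_summary(ratios, expected=1.0):
    """Return (mean ratio, percent error vs. expected, RSD in percent)."""
    avg = mean(ratios)
    percent_error = abs(avg - expected) / expected * 100.0
    rsd = stdev(ratios) / avg * 100.0  # sample standard deviation over the mean
    return avg, percent_error, rsd


# Hypothetical light/heavy peak-intensity ratios for a 1:1 control mixture.
example_ratios = [0.98, 1.05, 1.01, 1.04]
avg, err, rsd = ratio_summary(example_ratios)
print(f"mean ratio = {avg:.2f}, error = {err:.1f}%, RSD = {rsd:.1f}%")
```

With a measured mean of 1.02 against an expected 1.00, the same arithmetic gives the 2.0% error reported in the abstract.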

3.
Mass spectrometry (MS) is a technique widely used in biological studies. It associates a spectrum with a biological sample. A spectrum consists of pairs of values (intensity, m/z), where the intensity measures the abundance of biomolecules (such as proteins) with a given mass-to-charge ratio (m/z) in the originating sample. In proteomics experiments, MS spectra are used to identify expression patterns in clinical samples that may be responsible for diseases. More recently, to improve the identification of peptides and proteins related to such patterns, the MS/MS process has been used, which performs a cascade of mass spectrometric analyses on selected peaks. This technique has been demonstrated to improve the identification and quantification of proteins and peptides in samples. Nevertheless, MS analysis deals with a huge amount of data, often affected by noise, and thus requires automatic data management systems. Tools have been developed, most often supplied with the instruments, that allow: (i) spectra analysis and visualization, (ii) pattern recognition, (iii) protein database querying, and (iv) peptide and protein quantification and identification. Currently, most of the tools supporting these phases need to be optimized to improve the identification of proteins and their functions. In this article we survey applications that support spectrometrists and biologists in obtaining information from biological samples, analyzing the available software for the different phases. We consider different mass spectrometry techniques, and thus different requirements. We focus on tools for (i) data preprocessing, which prepares the results obtained from spectrometers for analysis; (ii) spectra analysis, representation and mining, aimed at identifying common and/or hidden patterns in spectra sets or at classifying data; (iii) database querying to identify peptides; and (iv) improving and boosting the identification and quantification of selected peaks. We outline some open problems and report on requirements that represent new challenges for bioinformatics.

4.
Only a small fraction of spectra acquired in LC-MS/MS runs matches peptides from target proteins upon database searches. The remaining, operationally termed background, spectra originate from a variety of poorly controlled sources and affect the throughput and confidence of database searches. Here, we report an algorithm and its software implementation that rapidly removes background spectra, regardless of their precise origin. The method estimates the dissimilarity distance between screened MS/MS spectra and unannotated spectra from a partially redundant background library compiled from several control and blank runs. Filtering MS/MS queries enhanced the protein identification capacity when searches lacked spectrum to sequence matching specificity. In sequence-similarity searches it reduced by, on average, 30-fold the number of orphan hits, which were not explicitly related to background protein contaminants and required manual validation. Removing high quality background MS/MS spectra, while preserving in the data set the genuine spectra from target proteins, decreased the false positive rate of stringent database searches and improved the identification of low-abundance proteins.
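The idea of screening query spectra against a background library can be sketched as follows. The abstract does not say which dissimilarity measure the authors use, so this example assumes a cosine distance on unit-width m/z bins. The function names, the bin width, and the threshold are illustrative assumptions, not the published algorithm.

```python
import math


def binned(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (assumed representation)."""
    vec = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        vec[key] = vec.get(key, 0.0) + intensity
    return vec


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse binned spectra."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def filter_background(queries, library, threshold=0.1):
    """Keep only queries that are far from every background spectrum."""
    lib_vectors = [binned(s) for s in library]
    kept = []
    for query in queries:
        q = binned(query)
        if all(cosine_distance(q, b) > threshold for b in lib_vectors):
            kept.append(query)
    return kept


# Hypothetical spectra given as (m/z, intensity) pairs.
background = [[(100.2, 10.0), (200.4, 5.0)]]
queries = [
    [(100.3, 20.0), (200.1, 10.0)],  # same pattern as the background spectrum
    [(150.0, 8.0), (300.0, 3.0)],    # unrelated to the background library
]
kept = filter_background(queries, background)
```

Only the second query survives here, because the first one has the same binned intensity pattern as a library spectrum.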

5.
Protein lysine monomethylation is an important post-translational modification that participates in regulating many biological processes. There is growing interest in identifying these methylation events. However, the introduction of one methyl group on a lysine residue has a negligible effect on the physical and chemical properties of proteins or peptides, making the enrichment and identification of monomethylated lysine (Kme1) proteins or peptides extraordinarily challenging. In this study, we propose an antibody-free chemical proteomics approach to capture Kme1 peptides from complex protein digests. By exploiting reductive glutaraldehydation, 5-aldehyde-pentanyl modified Kme1 residues and piperidine modified primary amines were generated at the same time. The peptides with aldehyde modified Kme1 residues were then enriched by solid-phase hydrazide chemistry. This chemical proteomics approach was validated using several synthetic peptides. It was demonstrated that the approach can enrich and detect a Kme1 peptide from a peptide mixture containing a 5000-fold excess of bovine serum albumin tryptic digest. In addition, we extended our approach to profile Kme1 in Jurkat T cells and HeLa cells labeled by heavy methyl stable isotope labeling by amino acids in cell culture (hmSILAC). In total, 29 Kme1 sites on 25 proteins were identified with high confidence, and 11 Kme1 sites were identified in both cell types. This is the first antibody-free chemical proteomics approach to enrich Kme1 peptides from complex protein digests, and it provides a potential avenue for the analysis of the methylome.

6.
The identification of peptides and proteins from fragmentation mass spectra is a very common approach in the field of proteomics. Contemporary high-throughput peptide identification pipelines can quickly produce large quantities of MS/MS data that contain valuable knowledge about the actual physicochemical processes involved in the peptide fragmentation process, which can be extracted through extensive data mining studies. As these studies attempt to exploit the intensity information contained in the MS/MS spectra, a critical step required for a meaningful comparison of this information between MS/MS spectra is peak intensity normalization. We here describe a procedure for quantifying the efficiency of different published normalization methods in terms of the quartile coefficient of dispersion (qcod) statistic. The quartile coefficient of dispersion is applied to measure the dispersion of the peak intensities between redundant MS/MS spectra, allowing the quantification of the differences in computed peak intensity reproducibility between the different normalization methods. We demonstrate that our results are independent of the data set used in the evaluation procedure, allowing us to provide generic guidance on the choice of normalization method to apply in a certain MS/MS pipeline application.
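The quartile coefficient of dispersion is defined as (Q3 - Q1) / (Q3 + Q1), where Q1 and Q3 are the first and third quartiles. The sketch below shows how it could be used to score one normalization method on redundant spectra. It assumes that peaks are already aligned by index across the replicates and uses total-intensity normalization as a stand-in for the published methods being compared. The helper names and the replicate values are hypothetical.

```python
from statistics import quantiles


def qcod(values):
    """Quartile coefficient of dispersion: (Q3 - Q1) / (Q3 + Q1)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)


def normalize_to_total(spectrum):
    """Scale intensities so that they sum to 1 (one possible normalization)."""
    total = sum(spectrum)
    return [x / total for x in spectrum]


def peak_dispersion(replicates, normalize):
    """qcod of each aligned peak across redundant spectra after normalization."""
    normalized = [normalize(s) for s in replicates]
    return [qcod(column) for column in zip(*normalized)]


# Hypothetical redundant spectra: the same three peaks measured four times.
replicates = [
    [10.0, 20.0, 70.0],
    [22.0, 38.0, 140.0],
    [5.0, 11.0, 34.0],
    [1.0, 2.0, 7.0],
]
dispersion = peak_dispersion(replicates, normalize_to_total)
```

A lower qcod for a peak means that the normalization makes its intensity more reproducible across the redundant spectra, so two normalization functions can be compared by passing each to `peak_dispersion`.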

7.
De novo peptide sequencing via tandem mass spectrometry. (Cited 10 times in total: 0 self-citations, 10 citations by others)
Peptide sequencing via tandem mass spectrometry (MS/MS) is one of the most powerful tools in proteomics for identifying proteins. Because complete genome sequences are accumulating rapidly, the recent trend in interpretation of MS/MS spectra has been database search. However, de novo MS/MS spectral interpretation remains an open problem typically involving manual interpretation by expert mass spectrometrists. We have developed a new algorithm, SHERENGA, for de novo interpretation that automatically learns fragment ion types and intensity thresholds from a collection of test spectra generated from any type of mass spectrometer. The test data are used to construct optimal path scoring in the graph representations of MS/MS spectra. A ranked list of high scoring paths corresponds to potential peptide sequences. SHERENGA is most useful for interpreting sequences of peptides resulting from unknown proteins and for validating the results of database search algorithms in fully automated, high-throughput peptide sequencing.

8.
A strategy based on isotope labeling of peptides and liquid chromatography matrix-assisted laser desorption ionization mass spectrometry (LC-MALDI MS) has been employed to accurately quantify and confidently identify differentially expressed proteins between an E-cadherin-deficient human carcinoma cell line (SCC9) and its transfectants expressing E-cadherin (SCC9-E). Proteins extracted from each cell line were tryptically digested and the resultant peptides were labeled individually with either d(0)- or d(2)-formaldehyde. The labeled peptides were combined and the peptide mixture was separated and fractionated by a strong cation exchange (SCX) column. Peptides from each SCX fraction were further separated by a microbore reversed-phase (RP) LC column. The effluents were then directly spotted onto a MALDI target using a heated droplet LC-MALDI interface. After mixing with a MALDI matrix, individual sample spots were analyzed by MALDI quadrupole time-of-flight MS, using an initial MS scan to quantify the dimethyl labeled peptide pairs. MS/MS analysis was then carried out on the peptide pairs having relative peak intensity changes of greater than 2-fold. The MS/MS spectra were subjected to database searching for protein identification. The search results were further confirmed by comparing the MS/MS spectra of the peptide pairs. Using this strategy, we detected and compared relative peak intensity changes of 5480 peptide pairs. Among them, 320 peptide pairs showed changes of greater than 2-fold. MS/MS analysis of these changing pairs led to the identification of 49 differentially expressed proteins between the parental SCC9 cells and SCC9-E transfectants. These proteins were determined to be involved in different pathways regulating cytoskeletal organization, cell adhesion, epithelial polarity, and cell proliferation. The changes in protein expression were consistent with increased cell-cell and cell-matrix adhesion and decreased proliferation in SCC9-E cells, in line with E-cadherin tumor suppressor activity. Finally, the accuracy of the MS quantification and subcellular localization for 6 differentially expressed proteins were validated by immunoblotting and immunofluorescence assays.
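The selection step in this workflow, where only peptide pairs with a greater than 2-fold intensity change go on to MS/MS, can be sketched as a simple filter. The example assumes that a change in either direction counts (ratio above 2 or below 1/2), which the abstract does not state explicitly. The function name `changed_pairs` and the intensities are hypothetical.

```python
def changed_pairs(pairs, fold=2.0):
    """Select (peptide_id, heavy/light ratio) for pairs changing more than `fold`.

    `pairs` is a list of (peptide_id, light_intensity, heavy_intensity).
    Pairs with a non-positive intensity are skipped because no ratio is defined.
    """
    selected = []
    for peptide_id, light, heavy in pairs:
        if light <= 0 or heavy <= 0:
            continue
        ratio = heavy / light
        if ratio > fold or ratio < 1.0 / fold:
            selected.append((peptide_id, ratio))
    return selected


# Hypothetical MS peak intensities for d(0)- and d(2)-labeled peptide pairs.
pairs = [("p1", 100.0, 250.0), ("p2", 100.0, 120.0), ("p3", 300.0, 100.0)]
selected = changed_pairs(pairs)
```

Here `p1` (2.5-fold up) and `p3` (3-fold down) would be sent to MS/MS, while `p2` is left out.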

9.
The proteins secreted by prostate cancer cells (PC3(AR)6) were separated by strong anion exchange chromatography, digested with trypsin and analyzed by unbiased liquid chromatography tandem mass spectrometry with an ion trap. The spectra were matched to peptides within proteins using a goodness of fit algorithm that showed a low false positive rate. The parent ions for MS/MS were randomly and independently sampled from a log-normal population and therefore could be analyzed by ANOVA. Normal distribution analysis confirmed that the parent and fragment ion intensity distributions were sampled over 99.9% of their range that was above the background noise. Arranging the ion intensity data with the identified peptide and protein sequences in structured query language (SQL) permitted the quantification of ion intensity across treatments, proteins and peptides. The intensities of 101,905 fragment ions from 1421 peptide precursors of 583 peptides from 233 proteins separated over 11 sample treatments were computed together in one ANOVA model using the statistical analysis system (SAS) prior to Tukey-Kramer honestly significant difference (HSD) testing. Thus complex mixtures of proteins were identified and quantified with a high degree of confidence using an ion trap without isotopic labels, multivariate analysis or comparing chromatographic retention times.

10.
We demonstrate an approach for global quantitative analysis of protein mixtures using differential stable isotopic labeling of the enzyme-digested peptides combined with microbore liquid chromatography (LC) matrix-assisted laser desorption ionization (MALDI) mass spectrometry (MS). Microbore LC provides higher sample loading, compared to capillary LC, which facilitates the quantification of low abundance proteins in protein mixtures. In this work, microbore LC is combined with MALDI MS via a heated droplet interface. The compatibilities of two global peptide labeling methods (i.e., esterification of carboxylic groups and dimethylation of amine groups of peptides) with this LC-MALDI technique are evaluated. Using a quadrupole-time-of-flight mass spectrometer, MALDI spectra of the peptides in individual sample spots are obtained to determine the abundance ratio among pairs of differential isotopically labeled peptides. MS/MS spectra are subsequently obtained from the peptide pairs showing significant abundance differences to determine the sequences of selected peptides for protein identification. The peptide sequences determined from MS/MS database search are confirmed by using the overlaid fragment ion spectra generated from a pair of differentially labeled peptides. The effectiveness of this microbore LC-MALDI approach is demonstrated in the quantification and identification of peptides from a mixture of standard proteins as well as E. coli whole cell extract of known relative concentrations. It is shown that this approach provides a facile and economical means of comparing relative protein abundances from two proteome samples.

11.
Proteomics uses tandem mass spectrometers and correlation algorithms to match peptides and their fragment spectra to amino acid sequences. The replication of multiple liquid chromatography experiments with electrospray ionization of peptides and tandem mass spectrometry (LC–ESI–MS/MS) produces large sets of MS/MS spectra. There is a need to assess the quality of large sets of experimental results by statistical comparison with that of random expectation. Classical frequency-based statistics such as goodness-of-fit tests for peptide-to-protein distributions could be used to calculate the probability that an entire set of experimental results has arisen by random chance. The frequency distributions of authentic MS/MS spectra from human blood were compared with those of false positive MS/MS spectra generated by a computer, or instrument noise, using the chi-square test. Here the mechanics of the chi-square test to compare the results in toto from a set of LC–ESI–MS/MS experiments with those of random expectation are detailed. The chi-square analysis of authentic spectra demonstrates unambiguously that the analysis of blood proteins separated by partition chromatography prior to tryptic digestion has a low probability that the cumulative peptide-to-protein distribution is the same as that of random or noise false positive spectra.
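A chi-square goodness-of-fit comparison of this kind can be sketched in a few lines. The example assumes three categories (proteins matched by one, two, or three or more peptides), so the test has two degrees of freedom and the p-value has the closed form exp(-x/2). The counts, the random-expectation fractions, and the function names are hypothetical and are not taken from the paper.

```python
import math


def chi_square_statistic(observed, expected_fractions):
    """Pearson chi-square statistic against expected category fractions."""
    total = sum(observed)
    expected = [f * total for f in expected_fractions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_two_df(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Hypothetical peptide-to-protein distribution: proteins with 1, 2, and 3+ peptides.
observed = [50, 30, 20]
# Hypothetical fractions expected from random or noise spectra.
random_fractions = [0.80, 0.15, 0.05]

stat = chi_square_statistic(observed, random_fractions)
p = p_value_two_df(stat)
print(f"chi-square = {stat:.2f}, p = {p:.3g}")
```

A very small p-value indicates that the observed distribution is unlikely to be the same as the random one. For other numbers of categories a general chi-square survival function would be needed in place of `p_value_two_df`.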

12.
In‐gel digestion followed by LC/MS/MS is widely used for the identification of trace amounts of proteins and for the site‐specific glycosylation analysis of glycoproteins in cells and tissues. A major limitation of this technique is the difficulty in acquiring reliable mass spectra for peptides present in minute quantities and glycopeptides with high heterogeneity and poor hydrophobicity. It is considered that the SDS used in electrophoresis can interact with proteins noncovalently and impede the ionization of peptides/glycopeptides. In this study, we report an improved in‐gel digestion method to acquire reliable mass spectra of a trace amount of peptides/glycopeptides. A key innovation of our improved method is the use of guanidine hydrochloride, which forms complexes with the residual SDS molecules in the sample. The precipitation and removal of SDS by addition of the guanidine hydrochloride was successful in improving the S/N of peptides/glycopeptides in mass spectra and acquiring a more comprehensive MS/MS data set for the various glycoforms of each glycopeptide.

13.
Despite significant advances in the identification of known proteins, the analysis of unknown proteins by MS/MS still remains a challenging open problem. Although Klaus Biemann recognized the potential of MS/MS for sequencing of unknown proteins in the 1980s, low throughput Edman degradation followed by cloning still remains the main method to sequence unknown proteins. The automated interpretation of MS/MS spectra has been limited by a focus on individual spectra and has not capitalized on the information contained in spectra of overlapping peptides. Indeed the powerful shotgun DNA sequencing strategies have not been extended to automated protein sequencing. We demonstrate, for the first time, the feasibility of automated shotgun protein sequencing of protein mixtures by utilizing MS/MS spectra of overlapping and possibly modified peptides generated via multiple proteases of different specificities. We validate this approach by generating highly accurate de novo reconstructions of multiple regions of various proteins in western diamondback rattlesnake venom. We further argue that shotgun protein sequencing has the potential to overcome the limitations of current protein sequencing approaches and thus catalyze the otherwise impractical applications of proteomics methodologies in studies of unknown proteins.

14.
Searching spectral libraries in MS/MS is an important new approach to improving the quality of peptide and protein identification. The idea relies on the observation that ion intensities in an MS/MS spectrum of a given peptide are generally reproducible across experiments, and thus, matching between spectra from an experiment and the spectra of previously identified peptides stored in a spectral library can lead to better peptide identification compared to the traditional database search. However, the use of libraries is greatly limited by their coverage of peptide sequences: even for well‐studied organisms a large fraction of peptides have not been previously identified. To address this issue, we propose to expand spectral libraries by predicting the MS/MS spectra of peptides based on the spectra of peptides with similar sequences. We first demonstrate that the intensity patterns of dominant fragment ions between similar peptides tend to be similar. In accordance with this observation, we develop a neighbor‐based approach that first selects peptides that are likely to have spectra similar to the target peptide and then combines their spectra using a weighted K‐nearest neighbor method to accurately predict fragment ion intensities corresponding to the target peptide. This approach has the potential to predict spectra for every peptide in the proteome. When rigorous quality criteria are applied, we estimate that the method increases the coverage of spectral libraries available from the National Institute of Standards and Technology by 20–60%, although the values vary with peptide length and charge state. We find that the overall best search performance is achieved when spectral libraries are supplemented by the high quality predicted spectra.
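The weighted K-nearest neighbor step described in this abstract can be illustrated with a small sketch. It assumes that each neighbor peptide already comes with a similarity weight and with fragment intensities aligned to the same ion positions as the target. How the authors compute similarity and align ions is not given in the abstract, and the function name and values below are hypothetical.

```python
def predict_intensities(neighbors, k=3):
    """Weighted average of the fragment intensities of the k most similar peptides.

    `neighbors` is a list of (similarity, intensity_vector) pairs, where every
    intensity vector covers the same fragment ions in the same order.
    """
    top = sorted(neighbors, key=lambda item: item[0], reverse=True)[:k]
    total_weight = sum(weight for weight, _ in top)
    if total_weight <= 0:
        raise ValueError("similarity weights must sum to a positive value")
    length = len(top[0][1])
    return [
        sum(weight * vector[i] for weight, vector in top) / total_weight
        for i in range(length)
    ]


# Hypothetical neighbors: (similarity to the target peptide, normalized intensities).
neighbors = [
    (1.0, [1.0, 0.0]),
    (1.0, [0.0, 1.0]),
    (0.1, [5.0, 5.0]),
]
predicted = predict_intensities(neighbors, k=2)
```

With `k=2` the low-similarity third neighbor is ignored and the two equally weighted neighbors are averaged.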

15.
Tandem mass spectrometry (MS/MS) has emerged as a cornerstone of proteomics owing in part to robust spectral interpretation algorithms. Widely used algorithms do not fully exploit the intensity patterns present in mass spectra. Here, we demonstrate that intensity pattern modeling improves peptide and protein identification from MS/MS spectra. We modeled fragment ion intensities using a machine-learning approach that estimates the likelihood of observed intensities given peptide and fragment attributes. From 1,000,000 spectra, we chose 27,000 with high-quality, nonredundant matches as training data. Using the same 27,000 spectra, intensity was similarly modeled with mismatched peptides. We used these two probabilistic models to compute the relative likelihood of an observed spectrum given that a candidate peptide is matched or mismatched. We used a 'decoy' proteome approach to estimate incorrect match frequency, and demonstrated that an intensity-based method reduces peptide identification error by 50-96% without any loss in sensitivity.
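The relative likelihood described here can be illustrated as a log-likelihood ratio between a "matched" and a "mismatched" intensity model. The sketch assumes that fragment intensities are discretized into a few bins and treated as independent, which is a simplification. The published model conditions on peptide and fragment attributes, and the probabilities below are made up for illustration.

```python
import math


def log_likelihood_ratio(observed_bins, p_match, p_mismatch):
    """Sum of log(P(bin | matched) / P(bin | mismatched)) over fragment ions."""
    return sum(math.log(p_match[b] / p_mismatch[b]) for b in observed_bins)


# Hypothetical probabilities of seeing each intensity bin for a predicted fragment.
p_match = {"high": 0.6, "low": 0.3, "absent": 0.1}
p_mismatch = {"high": 0.1, "low": 0.3, "absent": 0.6}

# Intensity bins observed at the predicted fragment positions of two candidates.
good_candidate = ["high", "high", "low"]
poor_candidate = ["absent", "absent", "low"]

good_score = log_likelihood_ratio(good_candidate, p_match, p_mismatch)
poor_score = log_likelihood_ratio(poor_candidate, p_match, p_mismatch)
```

A positive score favors the matched model and a negative score favors the mismatched one, so candidates can be ranked by this value.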

16.
We report an isotope labeling shotgun proteome analysis strategy to validate the spectrum-to-sequence assignments generated by using sequence-database searching for the construction of a more reliable MS/MS spectral library. This strategy is demonstrated in the analysis of the E. coli K12 proteome. In the workflow, E. coli cells were cultured in normal and (15)N-enriched media. The differentially labeled proteins from the cell extracts were subjected to trypsin digestion and two-dimensional liquid chromatography quadrupole time-of-flight tandem mass spectrometry (2D-LC QTOF MS/MS) analysis. The MS/MS spectra of the two samples were individually searched using Mascot against the E. coli proteome database to generate lists of peptide sequence matches. The two data sets were compared by overlaying the spectra of unlabeled and labeled matches of the same peptide sequence for validation. Two cutoff filters, one based on the number of common fragment ions and another one on the similarity of intensity patterns among the common ions, were developed and applied to the overlaid spectral pairs to reject the low quality or incorrectly assigned spectra. By examining 257,907 and 245,156 spectra acquired from the unlabeled and (15)N-labeled samples, respectively, an experimentally validated MS/MS spectral library of tryptic peptides was constructed for E. coli K12 that consisted of 9,302 unique spectra with unique sequence and charge state, representing 7,763 unique peptide sequences. This E. coli spectral library could be readily expanded, and the overall strategy should be applicable to other organisms. Even with this relatively small library, it was shown that more peptides could be identified with higher confidence using the spectral search method than by sequence-database searching.

17.
A sulfonation reagent, a succinimidyl ester of 3-sulfobenzoic acid, has been synthesized for effective peptide sequencing. It is capable of incorporating an additional mobile proton into the peptide backbone, thus facilitating efficient collision-induced dissociation. This reagent is easily and inexpensively prepared in a short time. Tandem mass spectra of the guanidinated and reagent-sulfonated peptides consist mainly of the y-ion series with higher intensities than those observed for solely guanidinated peptides. These enhanced tandem MS attributes significantly improved MASCOT total-ion scores, thus allowing more confident peptide sequencing. This derivatization was also very effective for the analysis of tryptic digest of human blood serum proteins separated by two-dimensional gel electrophoresis. When used in LC-MALDI/MS/MS format, this type of derivatization does not adversely affect chromatographic efficiencies.

18.
Beer I, Barnea E, Admon A. Proteomics, 2005, 5(13): 3491-3496
The human Plasma Proteome Project (PPP) is a large-scale collaboration between many laboratories. One of the most demanding tasks in the PPP involved the analysis of very large amounts of raw MS/MS data produced by the participants. The main approach for managing this task was letting the participants analyze their own data and submit the results to the central PPP repository as lists of identified proteins and peptides. To complement this distributed approach, we also performed centralized analysis of the raw MS/MS data provided by the participants. Due to the data redundancy inherent in such a project, centralized analysis has the potential to reduce the computational effort by reducing redundancy before the analysis. Centralized analysis can also unify the process and take advantage of data sharing among laboratories to improve protein identification and validation. The process we employed included removing low-quality spectra, clustering spectra by mutual similarity, and applying uniform peptide and protein identification procedures. To demonstrate the process, we analyzed 5.28 million MS/MS spectra derived by eight laboratories from tryptic peptides of serum and plasma proteins.

19.
High‐resolution MS/MS spectra of peptides can be deisotoped to identify monoisotopic masses of peptide fragments. The use of such masses should improve protein identification rates. However, deisotoping is not universally used and its benefits have not been fully explored. Here, MS2‐Deisotoper, a tool for use prior to database search, is used to identify monoisotopic peaks in centroided MS/MS spectra. MS2‐Deisotoper works by comparing the mass and relative intensity of each peptide fragment peak to every other peak of greater mass, and by applying a set of rules concerning mass and intensity differences. After comprehensive parameter optimization, it is shown that MS2‐Deisotoper can improve the number of peptide spectrum matches (PSMs) identified by up to 8.2% and proteins by up to 2.8%. It is effective with SILAC and non‐SILAC MS/MS data. The identification of unique peptide sequences is also improved, increasing the number of human proteoforms by 3.7%. Detailed investigation of results shows that deisotoping increases Mascot ion scores, improves FDR estimation for PSMs, and leads to greater protein sequence coverage. At a peptide level, it is found that the efficacy of deisotoping is affected by peptide mass and charge. MS2‐Deisotoper can be used via a user interface or as a command‐line tool.
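The peak-comparison idea behind deisotoping can be illustrated with a deliberately simplified sketch. This is not the MS2‐Deisotoper rule set, which the abstract does not spell out. It assumes that a heavier peak is an isotope peak when it sits about 1.00335/z Da above a lighter peak and is not much more intense than it. The tolerance, charge range, intensity ratio, and function name are all hypothetical.

```python
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def deisotope(peaks, max_charge=2, tol=0.01, max_ratio=1.5):
    """Return the peaks judged monoisotopic from a centroided spectrum.

    `peaks` is a list of (mz, intensity). Each peak is compared with every
    heavier peak within one isotope spacing; a heavier peak is flagged as an
    isotope peak when the m/z gap matches ISOTOPE_SPACING / z for some charge z
    and its intensity is at most `max_ratio` times that of the lighter peak.
    """
    peaks = sorted(peaks)
    is_isotope = [False] * len(peaks)
    for i, (mz_i, intensity_i) in enumerate(peaks):
        for j in range(i + 1, len(peaks)):
            mz_j, intensity_j = peaks[j]
            gap = mz_j - mz_i
            if gap > ISOTOPE_SPACING + tol:
                break  # peaks are sorted, so later ones are even further away
            for z in range(1, max_charge + 1):
                if (abs(gap - ISOTOPE_SPACING / z) <= tol
                        and intensity_j <= max_ratio * intensity_i):
                    is_isotope[j] = True
    return [peak for peak, flagged in zip(peaks, is_isotope) if not flagged]


# Hypothetical fragment peaks: one singly charged isotope cluster plus a lone peak.
spectrum = [(500.000, 100.0), (501.003, 55.0), (502.007, 15.0), (600.000, 40.0)]
monoisotopic = deisotope(spectrum)
```

In this toy spectrum the peaks at 501.003 and 502.007 are removed as isotope peaks of the fragment at 500.000, and the peak at 600.000 is kept.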

20.
Protein identification by mass spectrometry is mainly based on MS/MS spectra and the accuracy of molecular mass determination. However, the high complexity and dynamic range of proteomic samples from any species surpass the separation capacity and detection power of the most advanced multidimensional liquid chromatographs and mass spectrometers. Only a tiny portion of signals is selected for MS/MS experiments, and a considerable number of these still do not provide reliable peptide identifications. In this article, an in silico analysis of a novel methodology for peptide and protein identification is described. The approach uses mass accuracy, isoelectric point (pI), retention time (t(R)) and N-terminal amino acid determination as protein identification criteria, without requiring high quality MS/MS spectra. When the methodology was combined with selective isolation methods, the number of unique peptides and identified proteins increased. Finally, to demonstrate the feasibility of the methodology, an OFFGEL-LC-MS/MS experiment was also implemented. We compared the more reliable peptides identified with MS/MS information with the peptides identified using three experimental features (pI, t(R), molecular mass). In addition, two theoretical assumptions from MS/MS identification (selective isolation of peptides and N-terminal amino acid) were analyzed. Our results show that, using the information provided by these features and selective isolation methods, we could find 93% of the high confidence proteins identified by MS/MS with a false-positive rate lower than 5%.
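Identification from mass, pI, retention time and N-terminal residue amounts to filtering candidate peptides against several tolerances at once. The sketch below assumes simple absolute tolerances for pI and retention time and a ppm tolerance for mass. The tolerance values, field names, sequences, and masses are hypothetical placeholders, not values from the article.

```python
def ppm_error(observed, theoretical):
    """Mass error of an observation relative to a theoretical mass, in ppm."""
    return (observed - theoretical) / theoretical * 1e6


def match_candidates(feature, candidates, ppm_tol=10.0, pi_tol=0.5, rt_tol=2.0):
    """Return the sequences of candidates compatible with an observed feature."""
    hits = []
    for candidate in candidates:
        if (abs(ppm_error(feature["mass"], candidate["mass"])) <= ppm_tol
                and abs(feature["pI"] - candidate["pI"]) <= pi_tol
                and abs(feature["rt"] - candidate["rt"]) <= rt_tol
                and feature["nterm"] == candidate["sequence"][0]):
            hits.append(candidate["sequence"])
    return hits


# Hypothetical observed feature and in silico candidates (placeholder values).
feature = {"mass": 1000.500, "pI": 5.2, "rt": 30.0, "nterm": "A"}
candidates = [
    {"sequence": "AGLK", "mass": 1000.505, "pI": 5.0, "rt": 31.0},
    {"sequence": "SGLK", "mass": 1000.501, "pI": 5.3, "rt": 30.5},  # wrong N-terminus
    {"sequence": "AVVR", "mass": 1000.600, "pI": 5.1, "rt": 30.2},  # about 100 ppm off
]
hits = match_candidates(feature, candidates)
```

A feature is considered identified when exactly one candidate survives all four criteria. Tightening any tolerance trades sensitivity for a lower false-positive rate.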
