Similar articles
1.
We demonstrate a new approach to the determination of amino acid composition from tandem mass spectrometrically fragmented peptides using both experimental and simulated data. The approach has been developed to be used as a search-space filter in a protein identification pipeline with the aim of increased performance above that which could be attained by using immonium ion information. Three automated methods have been developed and tested: one based upon a simple peak traversal, in which all intense ion peaks are treated as being either a b- or y-ion using a wide mass tolerance; a second which uses a much narrower tolerance and does not perform transformations of ion peaks to the complementary type; and the unique fragments method which allows for b- or y-ion type to be inferred and corroborated using a scan of the other ions present in each peptide spectrum. The combination of these methods is shown to provide a high-accuracy set of amino acid predictions using both experimental and simulated data sets. These high quality predictions, with an accuracy of over 85%, may be used to identify peptide fragments that are hard to identify using other methods. The data simulation algorithm is also shown a posteriori to be a good model of noiseless tandem mass spectrometric peptide data.
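
The "transformation of ion peaks to the complementary type" mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, only singly charged fragments are considered, and the residue table is limited to a few amino acids with standard monoisotopic masses. For singly charged fragments, a b-ion and its complementary y-ion sum to the neutral peptide mass plus two proton masses.

```python
# Monoisotopic residue masses in Da (standard values; small illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "R": 156.10111,
}
PROTON = 1.007276
WATER = 18.010565


def neutral_mass(peptide):
    """Neutral monoisotopic mass of the intact peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def b_ions(peptide):
    """Singly charged b-ion m/z values (N-terminal prefixes b1..b(n-1))."""
    masses, running = [], 0.0
    for residue in peptide[:-1]:
        running += RESIDUE_MASS[residue]
        masses.append(running + PROTON)
    return masses


def y_ions(peptide):
    """Singly charged y-ion m/z values (C-terminal suffixes y1..y(n-1))."""
    masses, running = [], 0.0
    for residue in reversed(peptide[1:]):
        running += RESIDUE_MASS[residue]
        masses.append(running + WATER + PROTON)
    return masses


def complement(mz, peptide_neutral_mass):
    """Map a singly charged b-ion m/z to its complementary y-ion m/z, or vice versa."""
    return peptide_neutral_mass + 2 * PROTON - mz


if __name__ == "__main__":
    pep = "GASK"
    print(b_ions(pep))
    print(y_ions(pep))
    # b1 and y3 are complementary fragments of a four-residue peptide.
    print(complement(b_ions(pep)[0], neutral_mass(pep)))
```

A wide-tolerance peak traversal could apply `complement` to every intense peak and test both interpretations; the narrow-tolerance variant described above skips that step.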

2.
Mass spectrometry has proved to be an important tool for protein biomarker discovery, identification and characterization. However, global proteomic profiling strategies often fail to identify known low-abundance biomarkers as a result of the limited dynamic range of mass spectrometry (two to three orders of magnitude) compared with the large dynamic range of protein concentrations in biologic fluids (11 to 12 orders of magnitude for serum). In addition, the number of peptides generated in such methods vastly overwhelms the resolution capacity of mass spectrometers, requiring extensive sample clean-up (e.g., affinity tag, retentate chromatography and/or high-performance liquid chromatography) before mass spectrometry analysis. Baiting and affinity pre-enrichment strategies, which overcome the dynamic range and sample complexity issues of global proteomic strategies, are very difficult to couple to mass spectrometry, because it is nearly impossible to sort target peptides from those of the bait when many of the peptides are isobaric. IDBEST (Target Discovery, Inc.) is a new tagging strategy that enables such pre-enrichment of specific proteins or protein classes, as the resulting tagged peptides are distinguishable from those of the bait by a mass defect shift of approximately 0.1 atomic mass units. The special characteristics of these tags allow: resolution of tagged peptides from untagged peptides through incorporation of a mass defect element; high-precision quantitation of up- and downregulation by using stable isotope versions of the same tag; and potential analysis of protein isoforms through more complete peptide coverage from the proteins of interest.

4.
The identification of proteins in proteomics experiments is usually based on mass information derived from tandem mass spectrometry data. To improve the performance of the identification algorithms, additional information available in the fragment peak intensity patterns has been shown to be useful. In this study, we consider the effect of iTRAQ labeling on the fragment peak intensity patterns of singly charged peptides from MALDI tandem MS data. The presence of an iTRAQ-modified basic group on the N-terminus leads to a more pronounced set of b-ion peaks and distinct changes in the abundance of specific peptide types. We performed a simple intensity prediction by using a decision-tree machine learning approach and were able to show that the relative ion abundance in a spectrum can be correctly predicted and distinguished from closely related sequences. This information will be useful for the development of improved method-specific intensity-based protein identification algorithms.

5.
The strong need for quantitative information in proteomics has fueled the development of mass spectrometry-based analytical methods that are able to determine protein abundances. This article reviews mass spectrometry experiments aimed at providing an absolute quantification of proteins. The experiments make use of the isotope-dilution concept by spiking a known amount of synthetic, isotope-labeled reference peptide into the analyte sample. Quantification is achieved by comparing the mass spectrometry signal intensities of the reference with an endogenous peptide that is generated upon proteolytic cleavage of the target protein. In an analogous manner, the level of post-translational modification at a distinct residue within a target protein can be determined. Among the strengths of absolute quantification are low detection limits reaching subfemtomole levels, a high dynamic range spanning approximately five orders of magnitude, low requirements for sample clean-up, and fast and straightforward method development. Recent studies have demonstrated the compatibility of absolute quantification with various mass spectrometry readout techniques and sample purification steps such as 1D gel electrophoresis, size-exclusion chromatography, isoelectric peptide focusing, strong cation exchange and reversed phase or affinity chromatography. Under ideal conditions, quantification errors and coefficients of variation below 5% have been reported. However, because at the start of the experiment the analyte is a protein and the internal standard is a peptide, severe quantification errors may result from the selection of unsuitable reference peptides and/or imperfect protein proteolysis. Within the ensemble of mass spectrometry-based quantification methods, absolute quantification is the method of choice in cases where absolute numbers, many repetitive experiments or precise levels of post-translational modifications are required for a few, preselected species of interest.
Consequently, prominent application areas include biomarker quantification, the study of post-translational modifications such as phosphorylation or ubiquitination and the comparison of concentrations of interacting proteins.
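
The isotope-dilution arithmetic described in this abstract can be sketched as follows. This is an illustrative calculation only, assuming equal response of the labeled reference and the endogenous peptide and complete proteolysis; the function names and units (fmol) are hypothetical and not taken from the reviewed work.

```python
from statistics import mean, stdev


def absolute_amount(endogenous_intensity, reference_intensity, spiked_fmol):
    """Endogenous peptide amount from the intensity ratio to a spiked labeled reference."""
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return spiked_fmol * endogenous_intensity / reference_intensity


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * stdev(values) / mean(values)


if __name__ == "__main__":
    # Three hypothetical replicate measurements with 50 fmol of reference spiked in.
    replicates = [
        absolute_amount(2.00e6, 1.00e6, 50.0),
        absolute_amount(2.06e6, 1.00e6, 50.0),
        absolute_amount(1.96e6, 1.00e6, 50.0),
    ]
    print(replicates)
    print(coefficient_of_variation(replicates))
```

The sketch makes the abstract's caveat visible: any loss of endogenous peptide through incomplete digestion lowers `endogenous_intensity` and therefore the reported amount, and the calculation itself cannot detect that.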

7.
Bandeira N. BioTechniques 2007, 42(6): 687, 689, 691 passim
Significant technological advances have accelerated high-throughput proteomics to the automated generation of millions of tandem mass spectra on a daily basis. In such a setup, the desire for greater sequence coverage combines with standard experimental procedures to commonly yield multiple tandem mass spectra from overlapping peptides; typical observations include peptides differing by one or two terminal amino acids and spectra from modified and unmodified variants of the same peptides. In a departure from the traditional spectrum identification algorithms that analyze each tandem mass spectrum in isolation, spectral networks define a new computational approach that instead finds and simultaneously interprets sets of spectra from overlapping peptides. In shotgun protein sequencing, spectral networks capitalize on the redundant sequence information in the aligned spectra to deliver the longest and most accurate de novo sequences ever reported for ion trap data. Also, by combining spectra from multiple modified and unmodified variants of the same peptides, spectral networks are able to bypass the dominant guess/confirm approach to the identification of posttranslational modifications and alternatively discover modifications and highly modified peptides directly from experimental data. Open-source implementations of these algorithms may be downloaded from peptide.ucsd.edu.

8.
We report the use of microbore reverse-phase high performance liquid chromatography connected on-line to an electrospray mass spectrometer for the separation/detection of peptides derived by proteolytic digestion of proteins separated by polyacrylamide gel electrophoresis. A small fraction (typically 10% of the total) of the peptides eluting from the column was diverted through a flow-splitting device into the ion source of the mass spectrometer, whereas the majority of the peptide samples was collected for further analyses. We demonstrate the feasibility of obtaining reproducible peptide maps from submicrogram amounts of protein applied to the gel and good correlation of the signal detected by the mass spectrometer with peptide detection by UV absorbance. Furthermore, independently verifiable peptide masses were determined from subpicomole amounts of peptides directed into the mass spectrometer. The method was used to analyze the 265-kDa and the 280-kDa isoforms of the enzyme acetyl-CoA carboxylase isolated from rat liver. The results provide compelling evidence that the two enzyme isoforms are translation products of different genes and suggest that these approaches may be of general utility in the definitive comparison of protein isoforms. We furthermore illustrate that knowledge of peptide masses as determined by this technique provides a major advantage for error-free data interpretation in chemical high-sensitivity peptide sequence analysis.

9.
Direct matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) analysis of human serum yielded ion signals from only a fraction of the total number of peptides and proteins expected to be in the sample. We increased the number of peptide and protein ion signals observed in the MALDI-TOF mass spectral analysis of human serum by using a prefractionation protocol based on liquid phase isoelectric focusing electrophoresis. This prefractionation technique facilitated the MALDI-TOF MS detection of as many as 262 different peptide and protein ion signals from human serum. The results obtained from three replicate fractionation experiments on the same serum sample indicated that 148 different peptide and protein ion signals were reproducibly detected using our isoelectric focusing and MALDI-TOF MS protocol.

10.
We describe a statistical measure, Mass Distance Fingerprint, for automatic de novo detection of predominant peptide mass distances, i.e., putative protein modifications. The method's focus is to globally detect mass differences, not to assign peptide sequences or modifications to individual spectra. The Mass Distance Fingerprint is calculated from high accuracy measured peptide masses. For the data sets used in this study, known mass differences are detected at electron mass accuracy or better. The proposed method is novel because it works independently of protein sequence databases and without any prior knowledge about modifications. Both modified and unmodified peptides have to be present in the sample to be detected. The method can be used for automated detection of chemical/post-translational modifications, quality control of experiments and labeling approaches, and to control the modification settings of protein identification tools. The algorithm is implemented as a web application and is distributed as open source software.
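
The general idea of detecting predominant mass distances without a sequence database can be sketched by counting binned pairwise differences between measured peptide masses. This is a simplified illustration, not the Mass Distance Fingerprint statistic itself (the abstract does not give its definition); the function name, bin width and distance cutoff are assumptions.

```python
from collections import Counter
from itertools import combinations


def mass_distance_histogram(masses, max_distance=100.0, bin_width=0.001):
    """Count pairwise mass differences, binned to `bin_width` Da.

    Keys are integer bin indices; multiply by `bin_width` to recover the distance.
    """
    counts = Counter()
    for low, high in combinations(sorted(masses), 2):
        distance = high - low
        if distance <= max_distance:
            counts[round(distance / bin_width)] += 1
    return counts


if __name__ == "__main__":
    # Three hypothetical peptides, each present unmodified and with a +15.9949 Da shift
    # (the monoisotopic mass of one oxygen atom, i.e. oxidation).
    masses = [1000.0, 1015.9949, 1200.5, 1216.4949, 1500.25, 1516.2449]
    histogram = mass_distance_histogram(masses)
    for bin_index, count in histogram.most_common(3):
        print(bin_index * 0.001, count)
```

As the abstract notes, a distance only stands out when both the modified and the unmodified form of a peptide are present in the sample; here each pair contributes one count to the same bin.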

11.
Tandem mass spectrometry (MS/MS) is frequently used in the identification of peptides and proteins. Typical proteomic experiments rely on algorithms such as SEQUEST and MASCOT to compare thousands of tandem mass spectra against the theoretical fragment ion spectra of peptides in a database. The probabilities that these spectrum-to-sequence assignments are correct can be determined by statistical software such as PeptideProphet or through estimations based on reverse or decoy databases. However, many of the software applications that assign probabilities for MS/MS spectra to sequence matches were developed using training data sets from 3D ion-trap mass spectrometers. Given the variety of types of mass spectrometers that have become commercially available over the last 5 years, we sought to generate a data set of reference data covering multiple instrumentation platforms to facilitate both the refinement of existing computational approaches and the development of novel software tools. We analyzed the proteolytic peptides in a mixture of tryptic digests of 18 proteins, named the "ISB standard protein mix", using 8 different mass spectrometers. These include linear and 3D ion traps, two quadrupole time-of-flight platforms (qq-TOF), and two MALDI-TOF-TOF platforms. The resulting data set, which has been named the Standard Protein Mix Database, consists of over 1.1 million spectra in 150+ replicate runs on the mass spectrometers. The data were inspected for quality of separation and searched using SEQUEST. All data, including the native raw instrument and mzXML formats and the PeptideProphet validated peptide assignments, are available at http://regis-web.systemsbiology.net/PublicDatasets/.
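
The "estimations based on reverse or decoy databases" mentioned in this abstract are commonly done by counting decoy hits above a score threshold. The sketch below shows one simple form of that estimate (decoys divided by targets); it assumes a search against targets and reversed decoys of equal size, and the function names and score values are hypothetical. It does not reproduce PeptideProphet or SEQUEST.

```python
def reversed_decoy(sequence):
    """Build a decoy entry by reversing a protein sequence."""
    return sequence[::-1]


def decoy_fdr(psms, threshold):
    """Estimate the false discovery rate among target matches scoring >= threshold.

    `psms` is an iterable of (score, is_decoy) pairs, one per peptide-spectrum match.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


if __name__ == "__main__":
    matches = [
        (5.0, False), (4.5, False), (4.0, True),
        (3.5, False), (3.0, False), (2.0, True),
    ]
    print(reversed_decoy("MKTAYIAK"))
    print(decoy_fdr(matches, threshold=3.0))
    print(decoy_fdr(matches, threshold=4.2))
```

A reference data set with known protein content, such as the one described above, allows estimates of this kind to be checked against ground truth.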

12.
Does trypsin cut before proline? (Total citations: 1; self-citations: 0; citations by others: 1)
Trypsin is the most commonly used enzyme in mass spectrometry for protein digestion with high substrate specificity. Many peptide identification algorithms incorporate these specificity rules as filtering criteria. A generally accepted "Keil rule" is that trypsin cleaves next to arginine or lysine, but not before proline. Since this rule was derived two decades ago based on a small number of experimentally confirmed cleavages, we decided to re-examine it using 14.5 million tandem spectra (2 orders of magnitude increase in the number of observed tryptic cleavages). Our analysis revealed a surprisingly large number of cleavages before proline. We examine several hypotheses to explain these cleavages and argue that trypsin specificity rules used in peptide identification algorithms should be modified to "legitimatize" cleavages before proline. Our approach can be applied to analyze any protease, and we further argue that specificity rules for other enzymes should also be re-evaluated based on statistical evidence derived from large MS/MS data sets.
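
The effect of the "Keil rule" on an in-silico digest can be seen in a minimal sketch. The function below is illustrative only (its name and the `block_before_proline` flag are hypothetical, and missed cleavages are not modeled): it cleaves C-terminal to lysine or arginine and optionally suppresses cleavage when the next residue is proline.

```python
def tryptic_peptides(protein, block_before_proline=True):
    """Split a protein sequence after K or R.

    With `block_before_proline=True` the Keil rule is applied: no cleavage
    when the residue following K/R is proline.
    """
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        if residue in "KR" and not is_last:
            if block_before_proline and protein[i + 1] == "P":
                continue
            peptides.append(protein[start:i + 1])
            start = i + 1
    peptides.append(protein[start:])
    return peptides


if __name__ == "__main__":
    sequence = "AKPGRLK"
    print(tryptic_peptides(sequence, block_before_proline=True))
    print(tryptic_peptides(sequence, block_before_proline=False))
```

A search engine that filters candidates with the rule switched on would never consider the peptides `AK` and `PGR` from this sequence, which is the kind of loss the abstract argues against.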

13.
Multiplexed tandem mass spectrometry (MS/MS) has recently been demonstrated as a means to increase the throughput of peptide identification in liquid chromatography (LC) MS/MS experiments. In this approach, a set of parent species is dissociated simultaneously and measured in a single spectrum (in the same manner that a single parent ion is conventionally studied), providing a gain in sensitivity and throughput proportional to the number of species that can be simultaneously addressed. In the present work, simulations performed using the Caenorhabditis elegans predicted proteins database show that multiplexed MS/MS data allow the identification of tryptic peptides from mixtures of up to ten peptides from a single dataset with only three "y" or "b" fragments per peptide and a mass accuracy of 2.5 to 5 ppm. At this level of database and data complexity, 98% of the 500 peptides considered in the simulation were correctly identified. This compares favorably with the rates obtained for classical MS/MS at more modest mass measurement accuracy. LC multiplexed Fourier transform-ion cyclotron resonance MS/MS data obtained from a 66 kDa protein (bovine serum albumin) tryptic digest sample are presented to illustrate the approach, and confirm that peptides can be effectively identified from the C. elegans database to which the protein sequence had been appended.
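
The role of the 2.5 to 5 ppm mass accuracy quoted in this abstract can be made concrete with a short sketch of ppm-based fragment matching. The function names and peak values are hypothetical; the code only shows how a parts-per-million tolerance is evaluated and how matched fragments might be counted for one candidate peptide, not the simulation described above.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed, theoretical, tolerance_ppm=5.0):
    """True if the observed m/z lies within the ppm tolerance of the theoretical m/z."""
    return abs(ppm_error(observed, theoretical)) <= tolerance_ppm


def count_matched_fragments(observed_peaks, theoretical_fragments, tolerance_ppm=5.0):
    """Number of theoretical fragments that have at least one observed peak in tolerance."""
    return sum(
        1
        for theoretical in theoretical_fragments
        if any(within_tolerance(obs, theoretical, tolerance_ppm) for obs in observed_peaks)
    )


if __name__ == "__main__":
    observed = [500.2510, 650.3302, 801.4100]
    candidate_fragments = [500.2500, 650.3300, 801.4200]
    print(count_matched_fragments(observed, candidate_fragments, tolerance_ppm=5.0))
```

Under the abstract's criterion, a candidate would need at least three fragments matched in this way; tighter tolerances reduce the chance that peaks from the other co-fragmented peptides match by accident.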

14.
Protein phosphorylation modulates a myriad of biological functions, and its regulation is vital for proper cellular activity. Mass spectrometry is the enabling tool for phosphopeptide analysis, where recent instrumentation advances in both speed and sensitivity in linear ion trap and orbitrap technologies may yield more comprehensive phosphoproteomic analyses in less time. Protein phosphorylation analysis by MS relies on structural information derived through controlled peptide fragmentation. Compared with traditional, ion-trap-based collision-induced dissociation (CID), a more recent type of fragmentation termed HCD (higher energy collisional dissociation) provides beam type CID tandem MS with detection of fragment ions at high resolution in the orbitrap mass analyzer. Here we compared HCD to traditional CID for large-scale phosphorylation analyses of murine brain under three separate experimental conditions. These included a same-precursor analysis where CID and HCD scans were performed back-to-back, separate analyses of a phosphotyrosine peptide immunoprecipitation experiment, and separate whole phosphoproteome analyses. HCD generally provided higher search engine scores with more peptides identified, thus out-performing CID for back-to-back experiments for most metrics tested. However, for phosphotyrosine IPs and in a full phosphoproteome study of mouse brain, the greater acquisition speed of CID-only analyses provided larger data sets. We reconciled our results with those in direct contradiction from Nagaraj N, D'Souza RCJ et al. (J. Proteome Res. 9:6786, 2010). We conclude, for large-scale phosphoproteomics, CID fragmentation with rapid detection in the ion trap still produced substantially richer data sets, but the back-to-back experiments demonstrated the promise of HCD and orbitrap detection for the future.

15.
MOTIVATION: The identification of peptides by tandem mass spectrometry (MS/MS) is a central method of proteomics research, but due to the complexity of MS/MS data and the large databases searched, the accuracy of peptide identification algorithms remains limited. To improve the accuracy of identification we applied a machine-learning approach using a hidden Markov model (HMM) to capture the complex and often subtle links between a peptide sequence and its MS/MS spectrum. MODEL: Our model, HMM_Score, represents ion types as HMM states and calculates the maximum joint probability for a peptide/spectrum pair using emission probabilities from three factors: the amino acids adjacent to each fragmentation site, the mass dependence of ion types and the intensity dependence of ion types. The Viterbi algorithm is used to calculate the most probable assignment between ion types in a spectrum and a peptide sequence, then a correction factor is added to account for the propensity of the model to favor longer peptides. An expectation value is calculated based on the model score to assess the significance of each peptide/spectrum match. RESULTS: We trained and tested HMM_Score on three data sets generated by two different mass spectrometer types. For a reference data set recently reported in the literature and validated using seven identification algorithms, HMM_Score produced 43% more positive identification results at a 1% false positive rate than the best of two other commonly used algorithms, Mascot and X!Tandem. HMM_Score is a highly accurate platform for peptide identification that works well for a variety of mass spectrometer and biological sample types. AVAILABILITY: The program is freely available on ProteomeCommons via an OpenSource license. See http://bioinfo.unc.edu/downloads/ for the download link.

16.
The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms. 
The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets.

17.
Comparative proteomic approaches using isotopic labeling and MS have become increasingly popular. Conventionally, quantification is based on MS or extracted ion chromatogram (XIC) signals of differentially labeled peptides. However, in these MS-based experiments, the accuracy and dynamic range of quantification are limited by the high noise levels of MS/XIC data. Here we report a quantitative strategy based on multiplex (derived from multiple precursor ions) MS/MS data. One set of proteins was metabolically labeled with [13C6]lysine and [15N4]arginine; the other set was unlabeled. For peptide analysis after tryptic digestion of the labeled proteins, a wide precursor window was used to include both the light and heavy versions of each peptide for fragmentation. The multiplex MS/MS data were used for both protein identification and quantification. The use of the wide precursor window increased sensitivity, and the y ion pairs in the multiplex MS/MS spectra from peptides containing labeled and unlabeled lysine or arginine offered more information for, and thus the potential for improving, protein identification. Protein ratios were obtained by comparing intensities of y ions derived from the light and heavy peptides. Our results indicated that this method offers several advantages over the conventional XIC-based approach, including increased sensitivity for protein identification and more accurate quantification with more than a 10-fold increase in dynamic range. In addition, the quantification calculation process was fast, fully automated, and independent of instrument and data type. This method was further validated by quantitative analysis of signaling proteins in the EphB2 pathway in NG108 cells.
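
The comparison of light and heavy y-ion intensities described in this abstract can be sketched as a ratio over paired peaks. The abstract does not say how individual y-ion ratios are combined, so the median used here is an assumption, as are the function names; the mass shifts are computed from standard isotope mass differences for [13C6]lysine and [15N4]arginine.

```python
from statistics import median

# Mass shift of a singly charged y ion that contains the labeled C-terminal residue.
HEAVY_SHIFT = {
    "K": 6 * 1.003355,  # six 13C atoms replacing 12C
    "R": 4 * 0.997035,  # four 15N atoms replacing 14N
}


def heavy_y_mz(light_y_mz, c_terminal_residue):
    """Expected m/z of the heavy partner of a singly charged light y ion."""
    return light_y_mz + HEAVY_SHIFT[c_terminal_residue]


def heavy_to_light_ratio(y_ion_pairs):
    """Median heavy/light intensity ratio over (light_intensity, heavy_intensity) pairs."""
    ratios = [heavy / light for light, heavy in y_ion_pairs if light > 0]
    if not ratios:
        raise ValueError("no usable y-ion pairs")
    return median(ratios)


if __name__ == "__main__":
    print(heavy_y_mz(147.1128, "K"))
    print(heavy_to_light_ratio([(100.0, 200.0), (50.0, 110.0), (80.0, 150.0)]))
```

Because both members of each pair come from the same MS/MS spectrum, the ratio does not depend on aligning separate chromatographic traces, which may be one reason for the robustness reported above.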

18.
We present a tool to improve quantitative accuracy and precision in mass spectrometry based on shotgun proteomics: protein quantification by peptide quality control, PQPQ. The method is based on the assumption that the quantitative pattern of peptides derived from one protein will correlate over several samples. Dissonant patterns arise either from outlier peptides or because of the presence of different protein species. By correlation analysis, protein quantification by peptide quality control identifies and excludes outliers and detects the existence of different protein species. Alternative protein species are then quantified separately. By validating the algorithm on seven data sets related to different cancer studies we show that data processing by protein quantification by peptide quality control improves the information output from shotgun proteomics. Data from two labeling procedures and three different instrumental platforms were included in the evaluation. With this unique method using both peptide sequence data and quantitative data we can improve the quantitative accuracy and precision on the protein level and detect different protein species.

19.
We report on the effectiveness of CID, HCD, and ETD for LC-FT MS/MS analysis of peptides using a tandem linear ion trap-Orbitrap mass spectrometer. A range of software tools and analysis parameters were employed to explore the use of CID, HCD, and ETD to identify peptides (isolated from human blood plasma) without the use of specific "enzyme rules". In the evaluation of an FDR-controlled SEQUEST scoring method, the use of accurate masses for fragments increased the number of identified peptides (by ~50%) compared to the use of conventional low accuracy fragment mass information, and CID provided the largest contribution to the identified peptide data sets compared to HCD and ETD. The FDR-controlled Mascot scoring method provided significantly fewer peptide identifications than SEQUEST (by 1.3-2.3 fold) and CID, HCD, and ETD provided similar contributions to identified peptides. Evaluation of de novo sequencing and the UStags method for more intense fragment ions revealed that HCD afforded more contiguous residues (e.g., ≥ 7 amino acids) than either CID or ETD. Both the FDR-controlled SEQUEST and Mascot scoring methods provided peptide data sets that were affected by the decoy database used and mass tolerances applied (e.g., identical peptides between data sets could be limited to ~70%), while the UStags method provided the most consistent peptide data sets (>90% overlap). The m/z ranges in which CID, HCD, and ETD contributed the largest number of peptide identifications were substantially overlapping. This work suggests that the three peptide ion fragmentation methods are complementary and that maximizing the number of peptide identifications benefits significantly from a careful match with the informatics tools and methods applied. These results also suggest that the decoy strategy may inaccurately estimate identification FDRs.

20.
In a recent study, in vivo metabolic labeling using 15N traced the rate of label incorporation among more than 1700 proteins simultaneously and enabled the determination of individual protein turnover rate constants over a dynamic range of three orders of magnitude (Price, J. C., Guan, S., Burlingame, A., Prusiner, S. B., and Ghaemmaghami, S. (2010) Analysis of proteome dynamics in the mouse brain. Proc. Natl. Acad. Sci. U. S. A. 107, 14508-14513). These studies of protein dynamics provide a deeper understanding of healthy development and well-being of complex organisms, as well as the possible causes and progression of disease. In addition to a fully labeled food source and appropriate mass spectrometry platform, an essential and enabling component of such large scale investigations is a robust data processing and analysis pipeline, which is capable of the reduction of large sets of liquid chromatography tandem MS raw data files into the desired protein turnover rate constants. The data processing pipeline described in this contribution comprises a suite of software modules required for a workflow that fulfills such requirements. This software platform includes established software tools such as a mass spectrometry database search engine together with several additional, novel data processing modules specifically developed for 15N metabolic labeling. These fulfill the following functions: (1) cross-extraction of 15N-containing ion intensities from raw data files at varying biosynthetic incorporation times, (2) computation of peptide 15N isotopic incorporation distributions, and (3) aggregation of relative isotope abundance curves for multiple peptides into single protein curves. In addition, processing parameter optimization and noise reduction procedures were found to be necessary in the processing modules in order to reduce propagation of errors in the long chain of the processing steps of the entire workflow.
