Similar articles
Found 20 similar articles.
1.
Typically, protein sequences are detected in a collision-induced dissociation (CID) tandem MS (MS2) dataset by mapping identified peptide ions back to protein sequences using a protein database search (PDS) engine. Finding a particular peptide sequence of interest in CID MS2 records very often requires manual evaluation of the spectrum, regardless of whether the peptide-associated MS2 scan is identified by the PDS algorithm or not. We have developed a compact, cross-platform, database-free command-line utility, pepgrep, which helps to find an MS2 fingerprint for a selected peptide sequence by pattern-matching of modelled MS2 data using a Peptide-to-MS2 scoring algorithm. pepgrep can incorporate dozens of mass offsets corresponding to a variety of post-translational modifications (PTMs) into the algorithm. Decoy peptide sequences are used with the tested peptide sequence to reduce false-positive results. The engine is capable of screening an MS2 data file at a high rate when using a cluster computing environment. The matched MS2 spectrum can be displayed using a built-in graphical application programming interface (API) or optionally recorded to a file. Using this algorithm, we were able to find extra peptide sequences in the studied CID spectra that were missed by PDS identification. We also found pepgrep especially useful for examining CID data from small fractions of peptides resulting from, for example, affinity purification techniques. The peptide sequences in such samples are less likely to be positively identified by the routine protein-centric algorithms implemented in PDS. The software is freely available at http://bsproteomics.essex.ac.uk:8080/data/download/pepgrep-1.4.tgz.
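
The idea of matching a modelled MS2 pattern against observed peaks can be sketched as follows. This is only an illustration under stated assumptions, not pepgrep's actual Peptide-to-MS2 scoring: the function names `fragment_ions` and `matched_fraction`, the 0.5 Da tolerance, and the simple "fraction of matched ions" score are all hypothetical. The residue masses are the standard monoisotopic values.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_ions(peptide, offsets=None):
    """Return singly charged b- and y-ion m/z lists for a peptide.

    offsets (hypothetical parameter) maps a 0-based residue index to a
    mass offset in Da, e.g. a PTM on that residue.
    """
    offsets = offsets or {}
    masses = [RESIDUE_MASS[aa] + offsets.get(i, 0.0)
              for i, aa in enumerate(peptide)]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)           # N-terminal fragment
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)  # C-terminal fragment
    return b_ions, y_ions


def matched_fraction(peptide, peaks, tol=0.5, offsets=None):
    """Fraction of theoretical b/y ions found among observed peaks (toy score)."""
    b_ions, y_ions = fragment_ions(peptide, offsets)
    theoretical = b_ions + y_ions
    if not theoretical:
        return 0.0
    hits = sum(any(abs(mz - p) <= tol for p in peaks) for mz in theoretical)
    return hits / len(theoretical)
```

A decoy comparison could then score the same peak list against, for instance, the reversed sequence (`peptide[::-1]`); reversing is one common way to build a decoy and is assumed here rather than taken from the abstract.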

2.
Although peptide mass fingerprinting is currently the method of choice to identify proteins, the number of proteins available in databases is increasing constantly; hence, having sequence data on a selected peptide to increase the effectiveness of database searching is becoming more crucial. Until recently, the ability to identify proteins based on the peptide sequence was essentially limited to the use of electrospray ionization tandem mass spectrometry (MS) methods. The recent development of new instruments with matrix-assisted laser desorption/ionization (MALDI) sources and true tandem mass spectrometry (MS/MS) capabilities creates the capacity to obtain high-quality tandem mass spectra of peptides. In this work, using the new high-resolution tandem time-of-flight (MALDI-TOF/TOF) mass spectrometer from Applied Biosystems, examples of successful identification and characterization of bovine heart proteins (SWISS-PROT entries: P02192, Q9XSC6, P13620) separated by two-dimensional electrophoresis and blotted onto polyvinylidene difluoride membrane are described. Tryptic protein digests were analyzed by MALDI-TOF to identify peptide masses subsequently used for MS/MS. High-energy MALDI-TOF/TOF collision-induced dissociation spectra were then recorded on selected ions. All data, both MS and MS/MS, were recorded on the same instrument. Tandem mass spectra were submitted to database searching using MS-Tag or were manually de novo sequenced. An interesting modification of a tryptophan residue, a "double oxidation", came to light during these analyses.

3.
Identification of proteins by mass spectrometry (MS) is an essential step in proteomic studies and is typically accomplished by either peptide mass fingerprinting (PMF) or amino acid sequencing of the peptide. Although sequence information from MS/MS analysis can be used to validate PMF-based protein identification, it may not be practical when analyzing a large number of proteins and when high-throughput MS/MS instrumentation is not readily available. At present, a vast majority of proteomic studies employ PMF. However, there are huge disparities in criteria used to identify proteins using PMF. Therefore, to reduce incorrect protein identification using PMF, and also to increase confidence in PMF-based protein identification without accompanying MS/MS analysis, definitive guiding principles are essential. To this end, we propose a value-based scoring system that provides guidance on evaluating when PMF-based protein identification can be deemed sufficient without accompanying amino acid sequence data from MS/MS analysis.

6.
This study identified prostaglandin D2 synthase (PGDS) in murine epididymal fluid using a proteomic approach combining two-dimensional (2D) gel electrophoresis and mass spectrometry (MS). The caudal epididymal fluid was collected by retroperfusion, and proteins were separated by 2D gel electrophoresis followed by matrix-assisted laser desorption ionization MS analyses after trypsin digestion. The identification was based on the protein-specific peptide map as well as on sequence information generated by nano-electrospray ionization MS/MS. By in situ hybridization, the mRNA was detected in caput, corpus, and cauda, but it was not detected in the initial segment. The PGDS protein was mostly detected in the corpus and cauda by Western blot analysis and immunohistochemistry using a specific polyclonal antibody. In caudal fluid, PGDS was distributed among several isoforms (pI range, 6.5-8.8), suggesting that this protein undergoes posttranslational modification of its primary sequence. After N-glycanase digestion, the molecular mass decreased from 20-25 to 18.5 kDa, its theoretical mass. The PGDS was also detected in the epididymis of rat, hamster, and cynomolgus monkey from the caput to the cauda. In conclusion, MS is a powerful and accurate technique that allows unambiguous identification of the murine epididymal PGDS. The protein is 1) present throughout the epididymis, except in the initial segment, with an increasing luminal concentration from distal caput to cauda; 2) a major protein in caudal fluid; 3) an N-glycosylated, highly polymorphic protein; and 4) conserved during evolution.

7.
MOTIVATION: The identification of peptides by tandem mass spectrometry (MS/MS) is a central method of proteomics research, but due to the complexity of MS/MS data and the large databases searched, the accuracy of peptide identification algorithms remains limited. To improve the accuracy of identification we applied a machine-learning approach using a hidden Markov model (HMM) to capture the complex and often subtle links between a peptide sequence and its MS/MS spectrum. MODEL: Our model, HMM_Score, represents ion types as HMM states and calculates the maximum joint probability for a peptide/spectrum pair using emission probabilities from three factors: the amino acids adjacent to each fragmentation site, the mass dependence of ion types and the intensity dependence of ion types. The Viterbi algorithm is used to calculate the most probable assignment between ion types in a spectrum and a peptide sequence; a correction factor is then added to account for the propensity of the model to favor longer peptides. An expectation value is calculated based on the model score to assess the significance of each peptide/spectrum match. RESULTS: We trained and tested HMM_Score on three data sets generated by two different mass spectrometer types. For a reference data set recently reported in the literature and validated using seven identification algorithms, HMM_Score produced 43% more positive identification results at a 1% false positive rate than the best of two other commonly used algorithms, Mascot and X!Tandem. HMM_Score is a highly accurate platform for peptide identification that works well for a variety of mass spectrometer and biological sample types. AVAILABILITY: The program is freely available on ProteomeCommons via an open-source license. See http://bioinfo.unc.edu/downloads/ for the download link.
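
For readers unfamiliar with the Viterbi step mentioned above, a generic log-space version is sketched below. It is not the HMM_Score model: the two states ("ion", "noise"), the two observation symbols ("high", "low") and every probability are made-up toy values chosen only to make the example run, and the function assumes a non-empty observation sequence.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (most probable state path, its log probability)."""
    # best[t][s] = (log prob of the best path ending in state s at step t,
    #               predecessor state on that path)
    best = [{s: (log_start[s] + log_emit[s][obs[0]], None) for s in states}]
    for symbol in obs[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            score, origin = max((prev[r][0] + log_trans[r][s], r) for r in states)
            column[s] = (score + log_emit[s][symbol], origin)
        best.append(column)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


def logs(probabilities):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in probabilities.items()}


# Toy model (all numbers hypothetical).
states = ["ion", "noise"]
log_start = logs({"ion": 0.5, "noise": 0.5})
log_trans = {"ion": logs({"ion": 0.7, "noise": 0.3}),
             "noise": logs({"ion": 0.3, "noise": 0.7})}
log_emit = {"ion": logs({"high": 0.9, "low": 0.1}),
            "noise": logs({"high": 0.2, "low": 0.8})}

path, score = viterbi(["high", "high", "low"], states,
                      log_start, log_trans, log_emit)
```

Working in log space avoids numerical underflow on long sequences, which is why the probabilities are converted before the recursion.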

8.
Two-dimensional liquid chromatography (2D-LC) coupled on-line with electrospray ionization tandem mass spectrometry (2D-LC-ESI-MS/MS) is a new platform for analysis and identification of the proteome. Peptides are separated by 2D-LC and then analyzed by tandem MS. The MS/MS data are searched against a database for protein identification. In one 2D-LC-ESI-MS/MS run, we obtained not only the structural information of peptides directly from MS/MS, but also the retention time of peptides eluted from LC. Information on the chromatographic behavior of peptides can assist protein identification in the new platform for proteomics. The retention time of the matching peptides of the identified protein was predicted from the hydrophobic contribution of each amino acid on reversed-phase liquid chromatography (RPLC). Using this strategy, proteins were identified by four types of information: peptide mass fingerprinting (PMF), sequence query, MS/MS ion search, and the predicted retention time. This additional information obtained from LC could assist protein identification with no extra experimental cost.

9.
López JL, Marina A, Alvarez G, Vázquez J. Proteomics 2002, 2(12): 1658-1665
In this work, a novel approach based on proteomics is applied for the analysis of the three European marine mussel species: Mytilus edulis (ME), Mytilus galloprovincialis (MG) and Mytilus trossulus (MT), which are of interest in biotechnology and the food industry. The proteomes of these species are poorly described in databases, and the species themselves are difficult to diagnose and have a controversial taxonomy. To characterise species-specific peptides, we compared 51 matrix-assisted laser desorption/ionization-time of flight peptide mass maps generated from 6 randomly selected prominent spots derived from the two-dimensional electrophoresis analysis of foot protein extracts from several individuals. Minor species-specific differences in the peptide maps were detected in only one of the spots, corresponding to tropomyosin. Two peptides were unique to ME and MG individuals, whereas another peptide was present only in MT individuals. The sequence of these peptides was characterised by nanoelectrospray ionization-ion trap (nanoESI-IT) tandem mass spectrometry (MS/MS) analysis followed by database searching and de novo sequence interpretation. We detected a single T to D amino acid substitution in MT tropomyosin. Unambiguous and highly specific species identification was then demonstrated by analysing peptide extracts from tropomyosin spots by micro high-performance liquid chromatography (microHPLC) ESI-IT mass spectrometry using the selected ion monitoring configuration, focused on these peptides, in continuous MS/MS operation. Our results suggest that proteomics may be successfully applied for the identification of species whose proteome is not present in databases.

10.
11.
Kim SI, Kim JY, Kim EA, Kwon KH, Kim KW, Cho K, Lee JH, Nam MH, Yang DC, Yoo JS, Park YM. Proteomics 2003, 3(12): 2379-2392
As an initial step to the comprehensive proteomic analysis of Panax ginseng C. A. Meyer, protein mixtures extracted from the cultured hairy root of Panax ginseng were separated by two-dimensional polyacrylamide gel electrophoresis (2-DE). The protein spots were analyzed and identified by peptide fingerprinting and internal amino acid sequencing by matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) and electrospray ionization quadrupole-time of flight mass spectrometry (ESI Q-TOF MS), respectively. More than 300 protein spots were detected on silver-stained two-dimensional (2-D) gels using pH 3-10, 4-7, and 4.5-5.5 gradients. Major protein spots (159) were analyzed by peptide fingerprinting or de novo sequencing and the functions of 91 of these proteins were identified. Protein identification was achieved using the expressed sequence tag (EST) database from Panax ginseng and the protein databases of plants such as Arabidopsis thaliana and Oryza sativa. However, peptide mass fingerprinting by MALDI-TOF MS alone was insufficient for protein identification because of the lack of a genome database for Panax ginseng. Only 17 of the 159 protein spots were verified by peptide mass fingerprinting using MALDI-TOF MS, whereas 87 out of 102 protein spots, which included 13 of the 17 proteins identified by MALDI-TOF MS, were identified by internal amino acid sequencing using tandem mass spectrometry analysis by ESI Q-TOF MS. When the internal amino acid sequences were used as identification markers, the identification rate exceeded 85.3%, suggesting that a combination of internal sequencing and EST data analysis was an efficient identification method for proteome analysis of plants having incomplete genome data, like ginseng. The 2-D patterns of the main root and leaves of Panax ginseng differed from that of the cultured hairy root, suggesting that some proteins are exclusively expressed by different tissues for specific cellular functions. Proteome analysis will undoubtedly be helpful for understanding the physiology of Panax ginseng.

12.
Protein disulfide isomerase (PDI) has been identified in a protein extract from the venom duct of the marine snail C. amadis. In-gel tryptic digestion of a thick protein band at approximately 55 kDa yields a mixture of peptides. Analysis of tryptic fragments by MALDI-MS/MS and LC-ESI-MS/MS methods permits sequence assignment. Three tryptic fragments yield two nine-residue sequences (FVQDFLDGK and EPQLGDRVR) and an eleven-residue sequence (DQESTGALAFK). Database analysis showed that two of the peptides were consistent with the sequence of PDI, while the third appears to be derived from a co-migrating protein. In identifying proteins based on the characterization of short peptide sequences, the question arises about the reliability of identification using peptide fragments. Here we have also demonstrated the minimum length of peptide fragment necessary for unambiguous protein identification using fragments obtained from the experimentally derived sequences. Sequences of length ≥7 residues provide unambiguous identification in conjunction with protein molecular mass as a filter. The length of sequence necessary for unambiguous protein identification is also established using randomly chosen tryptic fragments from a standard dataset of proteins. The results are of significance in the identification of proteins from organisms with unsequenced genomes.
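
The question of how long a fragment must be before it points to a single protein can be illustrated with a small sketch. Everything here is an assumption made for illustration: the three-entry database and its sequences are invented, the helper names are hypothetical, fragments are taken as prefixes of the peptide, and the molecular-mass filter mentioned in the abstract is left out.

```python
def matching_proteins(fragment, database):
    """Names of database entries whose sequence contains the fragment."""
    return sorted(name for name, seq in database.items() if fragment in seq)


def min_unambiguous_length(peptide, database):
    """Smallest prefix length k for which exactly one entry matches, else None."""
    for k in range(1, len(peptide) + 1):
        if len(matching_proteins(peptide[:k], database)) == 1:
            return k
    return None


# Invented toy database; not real protein sequences.
toy_database = {
    "protA": "MKFVQDFLDGKAA",
    "protB": "MKFVQDAAAGK",
    "protC": "GGGFVQAAA",
}

shortest = min_unambiguous_length("FVQDFLDGK", toy_database)
```

In this toy case the prefix "FVQD" still matches two entries and "FVQDF" matches only one, so `shortest` is 5; with a real database the same loop would simply be run over many fragments to see where the ambiguity disappears.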

13.

Background

The immense diagnostic potential of human plasma has prompted great interest and effort in cataloging its contents, exemplified by the Human Proteome Organization (HUPO) Plasma Proteome Project (PPP) pilot project. Due to challenges in obtaining a reliable blood plasma protein list, HUPO later re-analysed their own original dataset with a more stringent statistical treatment that resulted in a much reduced list of high confidence (at least 95%) proteins compared with their original findings. In order to facilitate the discovery of novel biomarkers in the future and to realize the full diagnostic potential of blood plasma, we feel that there is still a need for an ultra-high confidence reference list (at least 99% confidence) of blood plasma proteins.

Methods

To address the complexity and dynamic protein concentration range of the plasma proteome, we employed a linear ion-trap-Fourier transform (LTQ-FT) and a linear ion trap-Orbitrap (LTQ-Orbitrap) for mass spectrometry (MS) analysis. Both instruments allow the measurement of peptide masses in the low ppm range. Furthermore, we employed a statistical score that allows database peptide identification searching using the products of two consecutive stages of tandem mass spectrometry (MS3). The combination of MS3 with very high mass accuracy in the parent peptide allows peptide identification with orders of magnitude more confidence than that typically achieved.
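
Mass accuracy "in the low ppm range" refers to the relative deviation between measured and theoretical mass, in parts per million. A minimal sketch of that calculation follows; the function names and the example tolerance are illustrative assumptions and not part of the study.

```python
def ppm_error(measured, theoretical):
    """Relative mass deviation in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def within_ppm(measured, theoretical, tol_ppm):
    """True if the measured mass lies within tol_ppm of the theoretical mass."""
    return abs(ppm_error(measured, theoretical)) <= tol_ppm
```

For example, a peptide of theoretical mass 1000.000 Da measured at 1000.002 Da deviates by about 2 ppm, so it would pass a 5 ppm window, whereas a measurement at 1000.020 Da (about 20 ppm) would not.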

Results

Herein we established a high confidence set of 697 blood plasma proteins and achieved a high 'average sequence coverage' of more than 14 peptides per protein and a median of 6 peptides per protein. All proteins annotated as belonging to the immunoglobulin family as well as all hypothetical proteins whose peptides completely matched immunoglobulin sequences were excluded from this protein list. We also compared the results of using two high-end MS instruments as well as the use of various peptide and protein separation approaches. Furthermore, we characterized the plasma proteins using cellular localization information, as well as comparing our list of proteins to data from other sources, including the HUPO PPP dataset.

Conclusion

Superior instrumentation combined with rigorous validation criteria gave rise to a set of 697 plasma proteins in which we have very high confidence, demonstrated by an exceptionally low false peptide identification rate of 0.29%.

14.
Signal transduction from the insulin receptor to downstream effectors is attenuated by phosphorylation at a number of Ser/Thr residues of insulin receptor substrate-1 (IRS-1) resulting in resistance to insulin action, the hallmark of type II diabetes. Ser/Thr residues can also be reversibly glycosylated by O-linked beta-N-acetylglucosamine (O-GlcNAc) monosaccharide, a dynamic posttranslational modification that offers an alternative means of protein regulation to phosphorylation. To identify sites of O-GlcNAc modification in IRS-1, recombinant rat IRS-1 isolated from HEK293 cells was analyzed by two complementary mass spectrometric methods. Using data-dependent neutral loss MS3 mass spectrometry, MS/MS data were scanned for peptides that exhibited a neutral loss corresponding to the mass of N-acetylglucosamine upon dissociation in an ion trap. This methodology provided sequence coverage of 84% of the protein, permitted identification of a novel site of phosphorylation at Thr-1045, and facilitated the detection of an O-GlcNAc-modified peptide of IRS-1 at residues 1027-1073. The level of O-GlcNAc modification of this peptide increased when cells were grown under conditions of high glucose with or without chronic insulin stimulation or in the presence of an inhibitor of the O-GlcNAcase enzyme. To map the exact site of O-GlcNAc modification, IRS-1 peptides were chemically derivatized with dithiothreitol following beta-elimination and Michael addition prior to LC-MS/MS. This approach revealed Ser-1036 as the site of O-GlcNAc modification. Site-directed mutagenesis and Western blotting with an anti-O-GlcNAc antibody suggested that Ser-1036 is the major site of O-GlcNAc modification of IRS-1. Identification of this site will facilitate exploring the biological significance of the O-GlcNAc modification.
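
Scanning MS/MS data for a neutral loss amounts to looking for a fragment peak at the precursor m/z minus the lost mass divided by the charge. The sketch below shows that arithmetic only; it is not the instrument's data-dependent acquisition logic, and the function names and the 0.5 m/z tolerance are assumptions. The value 203.0794 Da is the monoisotopic mass of an N-acetylhexosamine residue.

```python
HEXNAC_LOSS = 203.0794  # monoisotopic mass of a HexNAc (e.g. GlcNAc) residue, Da


def neutral_loss_mz(precursor_mz, charge, loss=HEXNAC_LOSS):
    """Expected m/z of the precursor after losing a neutral group."""
    return precursor_mz - loss / charge


def shows_neutral_loss(precursor_mz, charge, fragment_mzs,
                       tol=0.5, loss=HEXNAC_LOSS):
    """True if any fragment peak lies within tol of the expected loss peak."""
    target = neutral_loss_mz(precursor_mz, charge, loss)
    return any(abs(mz - target) <= tol for mz in fragment_mzs)
```

Because the lost group is neutral, the charge of the ion is unchanged, which is why the shift on the m/z axis shrinks with increasing charge state.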

15.
The nucleotide sequence of a 2,146 bp portion of the Anacystis nidulans (Synechococcus PCC6301) genome has been determined. This region contains an open reading frame (ORF) of 392 codons, whose predicted protein sequence shows partial homology to those of E. coli phoM and envZ. Hence ORF392 is suggested to be a sensory kinase gene in cyanobacteria.

16.
Homology-driven proteomics promises to reveal functional biology in insects with sparse genome sequence information. A proteomics study comparing plant virus transmission competent and refractive genotypes of the aphid Schizaphis graminum isolated numerous candidate proteins involved in virus transmission, but limited genome sequence information hampered their identification. The complete genome of the pea aphid, Acyrthosiphon pisum, released in 2008, enabled us to double the number of protein identifications beyond what was possible using available EST libraries and other insect sequences. This was concomitant with a dramatic increase of the number of MS and MS/MS peptide spectra matching the genome-derived protein sequence. LC-MS/MS proved to be the most robust method of peptide detection. Cross-matching spectral data to multiple EST sequences and error tolerant searching to identify amino acid substitutions enhanced the percent coverage of the Schizaphis graminum proteins. 2-D electrophoresis provided the protein pI and MW which enabled the refinement of the candidate protein selection and provided a measure of protein abundance when coupled to the spectral data. Thus, the homology-based proteomics pipeline for insects should include efforts to maximize the number of peptide matches to the protein to increase certainty in protein identification and relative protein abundance.

17.
18.
The combination of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), in-gel enzymatic digestion of proteins separated by two-dimensional gel electrophoresis and searches of molecular weight in peptide-mass databases is a powerful and well established method for protein identification in proteomics analysis. For successful protein identification by MALDI-TOF mass spectrometry of peptide mixtures, critical parameters include highly specific enzymatic cleavage, high mass accuracy and sufficient numbers and sequence coverage of the peptides which can be analyzed. For in-gel digestion with trypsin, the method employed should be compatible both with enzymatic cleavage and subsequent MALDI-TOF MS analysis. We report here an improved method for preparation of peptides for MALDI-TOF MS mass fingerprinting by using volatile solubilizing agents during the in-gel digestion procedure. Our study clearly demonstrates that modification of the in-gel digestion protocols by addition of dimethyl formamide (DMF) or a mixture of DMF/N,N-dimethyl acetamide at various concentrations can significantly increase the recovery of peptides. These higher yields of peptides resulted in more effective protein identification.

19.
A novel software tool named PTM-Explorer has been applied to LC-MS/MS datasets acquired within the Human Proteome Organisation (HUPO) Brain Proteome Project (BPP). PTM-Explorer enables automatic identification of peptide MS/MS spectra that were not explained in typical sequence database searches. The main focus was detection of PTMs, but PTM-Explorer also detects unspecific peptide cleavage, mass measurement errors, experimental modifications, amino acid substitutions, transpeptidation products and unknown mass shifts. To avoid a combinatorial problem, the search is restricted to a set of selected protein sequences, which stem from previous protein identifications using a common sequence database search. Prior to application to the HUPO BPP data, PTM-Explorer was evaluated on thoroughly, manually characterized LC-MS/MS data sets from Alpha-A-Crystallin gel spots obtained from mouse eye lens. Besides various PTMs including phosphorylation, a wealth of experimental modifications and unspecific cleavage products were successfully detected, completing the primary structure information of the measured proteins. Our results indicate that a large number of MS/MS spectra that currently remain unidentified in standard database searches contain valuable information that can only be elucidated using suitable software tools.

20.