Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Generating all plausible de novo interpretations of a peptide tandem mass (MS/MS) spectrum (Spectral Dictionary) and quickly matching them against the database represent a recently emerged alternative approach to peptide identification. However, the sizes of the Spectral Dictionaries quickly grow with the peptide length, making their generation impractical for long peptides. We introduce Gapped Spectral Dictionaries (all plausible de novo interpretations with gaps) that can be easily generated for any peptide length, thus addressing the limitation of the Spectral Dictionary approach. We show that Gapped Spectral Dictionaries are small, thus opening a possibility of using them to speed up MS/MS searches. Our MS-Gapped-Dictionary algorithm (based on Gapped Spectral Dictionaries) enables proteogenomics applications (such as searches in the six-frame translation of the human genome) that are prohibitively time-consuming with existing approaches. MS-Gapped-Dictionary generates gapped peptides that occupy a niche between accurate but short peptide sequence tags and long but inaccurate full-length peptide reconstructions. We show that, contrary to conventional wisdom, some high-quality spectra do not have good peptide sequence tags and introduce gapped tags that have advantages over the conventional peptide sequence tags in MS/MS database searches.
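As a rough illustration of what matching a gapped peptide against a database sequence might involve, the sketch below treats a gapped peptide as a list of masses, each of which is either a single residue or the sum of several consecutive residues. This is not the MS-Gapped-Dictionary algorithm; the function name `matches_gapped`, the tolerance, and the greedy grouping are assumptions made for this example. The residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def matches_gapped(sequence, gapped_masses, tolerance=0.01):
    """Return True if consecutive residues of `sequence` can be grouped
    so that the group masses equal `gapped_masses` (hypothetical helper)."""
    position = 0
    for target in gapped_masses:
        running = 0.0
        # Consume residues until the running mass reaches the target mass.
        while position < len(sequence) and running < target - tolerance:
            running += RESIDUE_MASS[sequence[position]]
            position += 1
        if abs(running - target) > tolerance:
            return False
    # Every residue must be accounted for by some mass in the gapped peptide.
    return position == len(sequence)
```

For example, the mass list for P, [E+P], T, [I+D], E is consistent with the sequence PEPTIDE even though two of its entries are gaps rather than single residues.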

2.
Protein identification has been greatly facilitated by database searches against protein sequences derived from product ion spectra of peptides. This approach is primarily based on the use of fragment ion mass information contained in an MS/MS spectrum. Unambiguous protein identification from a spectrum with low sequence coverage or poor spectral quality can be a major challenge. We present a two-dimensional (2D) mass spectrometric method in which the numbers of nitrogen atoms in the molecular ion and the fragment ions are used to provide additional discriminating power for much improved protein identification and de novo peptide sequencing. The nitrogen number is determined by analyzing the mass difference of corresponding peak pairs in overlaid spectra of 15N-labeled and unlabeled peptides. These peptides are produced by enzymatic or chemical cleavage of proteins from cells grown in 15N-enriched and normal media, respectively. It is demonstrated that, using 2D information, i.e., m/z and its associated nitrogen number, this method can not only confirm protein identification results generated by MS/MS database searching, but also identify peptides that cannot be identified by database searching alone. Examples are presented of analyzing Escherichia coli K12 extracts that yielded relatively poor MS/MS spectra, presumably from the digests of low-abundance proteins, which can still give positive protein identification using this method. Additionally, this 2D MS method can facilitate spectral interpretation for de novo peptide sequencing and identification of posttranslational or other chemical modifications. We envision that this method should be particularly useful for proteome expression profiling of organelles or cells that can be grown in 15N-enriched media.
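The nitrogen-counting step described above can be sketched as a small calculation. This is a minimal example, not the authors' software: it assumes complete 15N labeling, a known charge state, and a correctly paired light/heavy peak; the function name `nitrogen_count` is hypothetical. The isotope masses are standard values.

```python
# Mass difference between the 15N and 14N isotopes in daltons
# (standard isotope masses; roughly 0.99703 Da per nitrogen atom).
DELTA_15N = 15.0001089 - 14.0030740

def nitrogen_count(mz_unlabeled, mz_labeled, charge=1):
    """Estimate the number of nitrogen atoms in an ion from the m/z shift
    between its unlabeled and fully 15N-labeled peaks (assumed helper)."""
    # Convert the m/z shift back to a mass shift before dividing.
    mass_shift = (mz_labeled - mz_unlabeled) * charge
    return round(mass_shift / DELTA_15N)
```

A doubly charged ion whose labeled peak sits about 2.49 m/z units above the unlabeled peak would, under these assumptions, be assigned five nitrogen atoms.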

3.
De novo interpretation of tandem mass spectrometry (MS/MS) spectra provides sequences for searching protein databases when limited sequence information is present in the database. Our objective was to define a strategy for this type of homology-tolerant database search. Homology searches, using MS-Homology software, were conducted with 20, 10, or 5 of the most abundant peptides from 9 proteins, based either on precursor trigger intensity or on total ion current, and allowing for 50%, 30%, or 10% mismatch in the search. Protein scores were corrected by subtracting a threshold score that was calculated from random peptides. The highest (p < .01) corrected protein scores (i.e., above the threshold) were obtained by submitting 20 peptides and allowing 30% mismatch. Using these criteria, protein identification based on ion mass searching using MS/MS data (i.e., Mascot) was compared with that obtained using homology search. The highest-ranking protein was the same using Mascot, homology search using the 20 most intense peptides, or homology search using all peptides, for 63.4% of 112 spots from two-dimensional polyacrylamide gel electrophoresis gels. For these proteins, the percent coverage was greatest using Mascot compared with the use of all or just the 20 most intense peptides in a homology search (25.1%, 18.3%, and 10.6%, respectively). Finally, 35% of de novo sequences completely matched the corresponding known amino acid sequence of the matching peptide. This percentage increased when the search was limited to the 20 most intense peptides (44.0%). After identifying the protein using MS-Homology, a peptide mass search may increase the percent coverage of the protein identified.

4.
Nitric oxide is an important mediator that participates in reduction-oxidation (redox) mechanisms and in cellular signal transduction pathways. Two types of post-translational modifications are induced by nitric oxide: S-nitrosylation of cysteine residues and nitration of tyrosine residues. Two-dimensional gel electrophoresis-based Western blotting was used to detect, and liquid chromatography (LC)-tandem mass spectrometry (MS/MS) to determine the amino acid sequence of, several different nitrated proteins in the human pituitary. Proteins from several 2D gel spots, which corresponded to the strongly positive anti-nitrotyrosine Western blot spots, were subjected to in-gel trypsin-digestion and LC-MS/MS analysis. MS/MS, SEQUEST analysis, and de novo sequencing were used to determine the nitration site of each nitrated peptide. A total of four different nitrated peptides were characterized and were matched to four different proteins: synaptosomal-associated protein, actin, immunoglobulin alpha Fc receptor, and cGMP-dependent protein kinase 2. Those nitrotyrosyl-proteins participate in neurotransmission, cellular immunity, and cellular structure and mobility.

5.
De novo peptide sequencing by mass spectrometry (MS) can determine the amino acid sequence of an unknown peptide without reference to a protein database. MS-based de novo sequencing assumes special importance in focused studies of families of biologically active peptides and proteins, such as hormones, toxins, and antibodies, for which amino acid sequences may be difficult to obtain through genomic methods. These protein families often exhibit sequence homology or characteristic amino acid content; yet, current de novo sequencing approaches do not take advantage of this prior knowledge and, hence, search an unnecessarily large space of possible sequences. Here, we describe an algorithm for de novo sequencing that incorporates sequence constraints into the core graph algorithm and thereby reduces the search space by many orders of magnitude. We demonstrate our algorithm in a study of cysteine-rich toxins from two cone snail species (Conus textile and Conus stercusmuscarum) and report 13 de novo and about 60 total toxins.

6.
Although peptide mass fingerprinting is currently the method of choice to identify proteins, the number of proteins available in databases is increasing constantly, and hence the advantage of having sequence data on a selected peptide, in order to increase the effectiveness of database searching, is becoming more crucial. Until recently, the ability to identify proteins based on the peptide sequence was essentially limited to the use of electrospray ionization tandem mass spectrometry (MS) methods. The recent development of new instruments with matrix-assisted laser desorption/ionization (MALDI) sources and true tandem mass spectrometry (MS/MS) capabilities creates the capacity to obtain high-quality tandem mass spectra of peptides. In this work, using the new high-resolution tandem time-of-flight (MALDI-TOF/TOF) mass spectrometer from Applied Biosystems, examples of successful identification and characterization of bovine heart proteins (SWISS-PROT entries: P02192, Q9XSC6, P13620) separated by two-dimensional electrophoresis and blotted onto polyvinylidene difluoride membrane are described. Tryptic protein digests were analyzed by MALDI-TOF to identify peptide masses subsequently used for MS/MS. High-energy MALDI-TOF/TOF collision-induced dissociation spectra were then recorded on selected ions. All data, both MS and MS/MS, were recorded on the same instrument. Tandem mass spectra were submitted to database searching using MS-Tag or were manually de novo sequenced. An interesting modification of a tryptophan residue, a "double oxidation", came to light during these analyses.

7.
A novel insecticidal peptide (LaIT3) was isolated from the Liocheles australasiae venom. The primary structure of LaIT3 was determined by a combination of Edman degradation and MS/MS de novo sequencing analysis. Discrimination between Leu and Ile in MS/MS analysis was achieved based on the difference in side chain fragmentation assisted by chemical derivatization. LaIT3 was determined to be an 84-residue peptide with three intrachain disulfide bonds. The sequence similarity search revealed that LaIT3 belongs to the scorpine-like peptides consisting of two structural domains: an N-terminal α-helical domain and a C-terminal cystine-stabilized domain. As observed for most of the scorpine-like peptides, LaIT3 showed significant antibacterial activity against Escherichia coli, which is likely to be caused by its membrane-disrupting property.

8.
For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then matched, allowing for amino acid mutations, against the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach, and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software.
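The "same mass, different segment" error mentioned above can be made concrete with a short enumeration. The sketch below is not SPIDER's matching algorithm; it only lists which pairs of residues have a combined mass indistinguishable from a single residue at a given tolerance. The function name `isobaric_pairs` and the default tolerance are assumptions; the residue masses are standard monoisotopic values.

```python
from itertools import combinations_with_replacement

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def isobaric_pairs(tolerance=0.01):
    """Map each residue to the residue pairs whose summed mass matches
    that residue's mass within `tolerance` daltons (hypothetical helper)."""
    matches = {}
    # Unordered pairs, including a residue paired with itself (e.g. GG).
    for a, b in combinations_with_replacement(sorted(RESIDUE_MASS), 2):
        pair_mass = RESIDUE_MASS[a] + RESIDUE_MASS[b]
        for residue, mass in RESIDUE_MASS.items():
            if abs(pair_mass - mass) <= tolerance:
                matches.setdefault(residue, []).append(a + b)
    return matches
```

At a 0.01 Da tolerance this reports GG as indistinguishable from N and AG from Q; loosening the tolerance to 0.05 Da adds further confusions, such as AG with K and GV with R, which is one reason partially correct tags are common on lower-accuracy instruments.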

9.
Despite a recent surge of interest in database-independent peptide identifications, accurate de novo peptide sequencing remains an elusive goal. While the recently introduced spectral network approach resulted in accurate peptide sequencing in low-complexity samples, its success depends on the chance presence of spectra from overlapping peptides. On the other hand, while multistage mass spectrometry (collecting multiple MS3 spectra from each MS2 spectrum) can be applied to all spectra in a complex sample, there are currently no software tools for de novo peptide sequencing by multistage mass spectrometry. We describe a rigorous probabilistic framework for analyzing spectra of overlapping peptides and show how to apply it for multistage mass spectrometry. Our software results in both accurate de novo peptide sequencing from multistage mass spectra (despite the inferior quality of MS3 spectra) and improved interpretation of spectral networks. We further study the problem of de novo peptide sequencing with accurate parent mass (but inaccurate fragment masses), the protocol that may soon become the dominant mode of spectral acquisition. Most existing peptide sequencing algorithms (based on the spectrum graph approach) do not track the accurate parent mass and are thus not equipped for solving this problem. We describe a de novo peptide sequencing algorithm aimed at this experimental protocol and show that it improves the sequencing accuracy on both tandem and multistage mass spectrometry.

10.
Panax ginseng is an important herb that has clear effects on the treatment of diverse diseases. Until now, the natural peptide constitution of this herb has remained unclear. Here, we conduct an extensive characterization of the Ginseng peptidome using MS-based data mining and sequencing. The screen on the charge states of precursor ions indicated that Ginseng is a peptide-rich herb in comparison with a number of commonly used herbs. The Ginseng peptides were then extracted and submitted to nano-LC-MS/MS analysis using different fragmentation modes, including CID, high-energy collisional dissociation, and electron transfer dissociation. Further database search and de novo sequencing allowed the identification of a total of 308 peptides, some of which might have important biological activities. This study illustrates the abundance and sequences of endogenous Ginseng peptides, thus providing more candidates for the screening of active compounds in future biological research and drug discovery studies.

11.
De novo peptide sequencing via tandem mass spectrometry.   Total citations: 10, self-citations: 0, citations by others: 10
Peptide sequencing via tandem mass spectrometry (MS/MS) is one of the most powerful tools in proteomics for identifying proteins. Because complete genome sequences are accumulating rapidly, the recent trend in interpretation of MS/MS spectra has been database search. However, de novo MS/MS spectral interpretation remains an open problem typically involving manual interpretation by expert mass spectrometrists. We have developed a new algorithm, SHERENGA, for de novo interpretation that automatically learns fragment ion types and intensity thresholds from a collection of test spectra generated from any type of mass spectrometer. The test data are used to construct optimal path scoring in the graph representations of MS/MS spectra. A ranked list of high scoring paths corresponds to potential peptide sequences. SHERENGA is most useful for interpreting sequences of peptides resulting from unknown proteins and for validating the results of database search algorithms in fully automated, high-throughput peptide sequencing.
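The graph representation mentioned above can be illustrated with a toy spectrum graph. This sketch is not SHERENGA: it uses no learned ion types or intensity-based scoring, it assumes the peaks have already been converted to prefix (b-ion-like) residue-mass sums, and it uses a deliberately small residue alphabet. The function name `spectrum_graph_paths` and the tolerance are assumptions made for the example.

```python
# A small subset of standard monoisotopic residue masses (daltons),
# kept short so the toy graph stays easy to follow.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "V": 99.06841, "L": 113.08406,
}

def spectrum_graph_paths(prefix_masses, total_mass, tolerance=0.01):
    """Enumerate residue strings spelled by paths from mass 0 to
    `total_mass` in a toy spectrum graph (hypothetical helper)."""
    # Nodes are candidate prefix masses, plus the start and end of the peptide.
    nodes = sorted(set([0.0, total_mass] + list(prefix_masses)))
    paths = {0: [""]}  # node index -> residue strings that reach that node
    for j in range(1, len(nodes)):
        found = []
        for i in range(j):
            if i not in paths:
                continue  # node i is unreachable, e.g. a noise peak
            gap = nodes[j] - nodes[i]
            # An edge exists when the mass gap equals one residue mass.
            for residue, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tolerance:
                    found.extend(prefix + residue for prefix in paths[i])
        if found:
            paths[j] = found
    return paths.get(len(nodes) - 1, [])
```

Given the prefix masses of G and GA, an unrelated noise peak, and the total residue mass of GAS, the only complete path spells GAS. A real implementation would rank competing paths with a scoring function rather than list them all.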

12.
Although genome databases have become the key for proteomic analyses, de novo sequencing remains essential for the study of organisms whose genomes have not been completed. In addition, post-translational modifications present a challenge in database searching. Recognition of the b- or y-ion series in a peptide MS/MS spectrum as well as identification of the b1- and yn-1-ions can facilitate de novo analyses. Therefore, it is valuable to identify either amino acid terminus. In previous work, we have demonstrated that peptides modified at the epsilon-amino group of lysine as a t-butyl peroxycarbamate derivative undergo free radical promoted peptide backbone fragmentation under low-energy collision-induced dissociation (CID) conditions. Here we explore the chemistry of the N-terminal amino group modified as a t-butyl peroxycarbamate. The conversion of N-terminal amines to peroxycarbamates of simple amino acids and peptides was studied with aryl t-butyl peroxycarbonates. ESI-MS/MS analysis of the peroxycarbamate adducts gave evidence of a product ion corresponding to the neutral loss of the N-terminal side chain (R), thus identifying this residue. Further fragmentation (MS3) of product ions formed by N-terminal residue side-chain loss (-R) exhibited an m/z shift of the b-ions equal to the neutral loss of R, therefore labeling the b-ion series. The study was extended to the analysis of a protein tryptic digest where the SALSA algorithm was used to identify spectra containing these neutral losses. The method for N-terminus identification presented here has the potential to improve de novo analyses as well as to constrain peptide mass mapping database searches.

13.
Hernandez P, Gras R, Frey J, Appel RD. Proteomics 2003, 3(6): 870-878
In recent years, proteomics research has gained importance due to increasingly powerful techniques in protein purification, mass spectrometry and identification, and due to the development of extensive protein and DNA databases from various organisms. Nevertheless, current identification methods from spectrometric data have difficulties in handling modifications or mutations in the source peptide. Moreover, they have low performance when run on large databases (such as genomic databases), or with low quality data, for example due to bad calibration or low fragmentation of the source peptide. We present a new algorithm dedicated to automated protein identification from tandem mass spectrometry (MS/MS) data by searching a peptide sequence database. Our identification approach shows promising properties for solving the specific difficulties enumerated above. It consists of matching theoretical peptide sequences issued from a database with a structured representation of the source MS/MS spectrum. The representation is similar to the spectrum graphs commonly used by de novo sequencing software. The identification process involves the parsing of the graph in order to emphasize relevant sections for each theoretical sequence, and leads to a list of peptides ranked by a correlation score. The parsing of the graph, which can be a highly combinatorial task, is performed by a bio-inspired algorithm called Ant Colony Optimization algorithm.

14.
López JL, Marina A, Alvarez G, Vázquez J. Proteomics 2002, 2(12): 1658-1665
In this work, a novel approach based on proteomics is applied for the analysis of the three European marine mussel species: Mytilus edulis (ME), Mytilus galloprovincialis (MG) and Mytilus trossulus (MT), which are of interest in biotechnology and the food industry. The proteomes of these species are poorly described in databases, are difficult to diagnose, and have a controversial taxonomy. To characterise species-specific peptides, we compared 51 matrix-assisted laser desorption/ionization-time of flight peptide mass maps generated from 6 randomly selected prominent spots derived from the two-dimensional electrophoresis analysis of foot protein extracts from several individuals. Minor species-specific differences in the peptide maps were detected in only one of the spots, corresponding to tropomyosin. Two peptides were unique to ME and MG individuals, whereas another peptide was present only in MT individuals. The sequence of these peptides was characterised by nanoelectrospray ionization-ion trap (nanoESI-IT) tandem mass spectrometry (MS/MS) analysis followed by database searching and de novo sequence interpretation. We detected a single T to D amino acid substitution in MT tropomyosin. Unambiguous and highly specific species identification was then demonstrated by analysing peptide extracts from tropomyosin spots by micro high-performance liquid chromatography (microHPLC) ESI-IT mass spectrometry using the selected ion monitoring configuration, focused on these peptides, in continuous MS/MS operation. Our results suggest that proteomics may be successfully applied for the identification of species whose proteome is not present in databases.

15.
A proteomics-based approach was used for characterizing wheat gliadins from an Italian common wheat (Triticum aestivum) cultivar. A two-dimensional gel electrophoresis (2-DE) map of roughly 40 spots was obtained by submitting the 70% alcohol-soluble crude protein extract to isoelectric focusing on immobilized pH gradient strips across two pH gradient ranges, i.e., pH 3-10 or pH 6-11, and to sodium dodecyl sulfate-polyacrylamide electrophoresis in the second dimension. The chymotryptic digest of each spot was characterized by matrix-assisted laser desorption/ionization-time of flight mass spectrometry and nano electrospray ionization-tandem mass spectrometry (MS/MS) analysis, providing a "peptide map" for each digest. The measured masses were subsequently sought in databases for sequences. For accurate identification of the parent protein, it was necessary to determine de novo sequences by MS/MS experiments on the peptides. By partial mass fingerprinting, we identified protein molecules such as alpha/beta-, gamma-, omega-gliadin, and high molecular weight-glutenin. The single spots along the 2-DE map were discriminated on the basis of their amino acid sequence traits. alpha-Gliadin, the most represented wheat protein in databases, was highly conserved as the relative N-terminal sequence of the components from the 2-DE map contained only a few silent amino acid substitutions. The other closely related gliadins were identified by sequencing internal peptide chains. The results gave insight into the complex nature of gliadin heterogeneity. This approach has provided us with sound reference data for differentiating gliadins amongst wheat varieties.

16.
Electrostatic repulsion hydrophilic interaction chromatography (ERLIC) coupled with mass spectrometry (MS) is a technique that is increasingly being used as a trapping/enrichment tool for glycopeptides/phosphorylated peptides or sample fractionation in proteomics research. Here, we describe a novel ERLIC-MS/MS-based peptide mapping method that was successfully used for the characterization of denosumab, in particular the analysis of sequence coverage, terminal peptides, methionine oxidation, asparagine deamidation and glycopeptides. Compared to reversed phase liquid chromatography (RPLC)-MS/MS methods, ERLIC demonstrated unique advantages in the retention of small peptides, resulting in 100% sequence coverage for both the light and heavy chains. It also demonstrated superior performance in the separation and characterization of asparagine deamidated peptides, which is known to be challenging by RPLC-MS/MS. The developed method can be used alone for peptide mapping-based characterization of monoclonal antibodies, or as an orthogonal method to complement the RPLC-MS/MS method. This study extends the applications of ERLIC from that of a trapping/fractioning column to biologic therapeutics characterization. The ERLIC-MS/MS method can enhance biologic therapeutics analysis with more reliability and confidence for bottom-up peptide mapping-based characterization.

17.
Neuropeptidomics is used to characterize endogenous peptides in the brain of tree shrews (Tupaia belangeri). Tree shrews are small animals similar to rodents in size but close relatives of primates, and are excellent models for brain research. Currently, no complete proteome information is available for tree shrews that would allow a direct database search for neuropeptide identification. To increase the capability for identifying neuropeptides in tree shrews, we developed an integrated mass spectrometry (MS)-based approach that combines methods including data-dependent, directed, and targeted liquid chromatography (LC)-Fourier transform (FT)-tandem MS (MS/MS) analysis, database construction, de novo sequencing, precursor protein search, and homology analysis. Using this integrated approach, we identified 107 endogenous peptides that have sequences identical or similar to those from other mammalian species. High-accuracy MS and tandem MS information, together with BLAST analysis and chromatographic characteristics, were used to confirm the sequences of all the identified peptides. Interestingly, further sequence homology analysis demonstrated that tree shrew peptides have a significantly higher degree of homology to equivalent sequences in humans than those in mice or rats, consistent with the close phylogenetic relationship between tree shrews and primates. Our results provide the first extensive characterization of the peptidome in tree shrews, which now permits characterization of their function in the nervous and endocrine systems. Because the approach fully exploits the evolutionary conservation of neuropeptides and the advantage of high-accuracy MS, it can be ported to the identification of neuropeptides in other species for which fully sequenced genomes or proteomes are not available.

18.
The conventional approach in modern proteomics to identify proteins from limited information provided by molecular and fragment masses of their enzymatic degradation products carries an inherent risk of both false positive and false negative identifications. For reliable identification of even known proteins, complete de novo sequencing of their peptides is desired. The main problems of conventional sequencing based on tandem mass spectrometry are incomplete backbone fragmentation and the frequent overlap of fragment masses. In this work, the first proteomics-grade de novo approach is presented, where the above problems are alleviated by the use of complementary fragmentation techniques CAD and ECD. Implementation of a high-current, large-area dispenser cathode as a source of low-energy electrons provided efficient ECD of doubly charged peptides, the most abundant species (65-80%) in a typical trypsin-based proteomics experiment. A new linear de novo algorithm is developed combining efficiency and speed, processing 1000 MS/MS data sets in 60 s on a conventional 3 GHz PC. More than 6% of all MS/MS data for doubly charged peptides yielded complete sequences, and another 13% gave nearly complete sequences with a maximum gap of two amino acid residues. These figures are comparable with the typical success rates (5-15%) of database identification. For peptides reliably found in the database (Mowse score ≥ 34), the agreement with de novo-derived full sequences was >95%. Full sequences were derived in 67% of the cases when full sequence information was present in MS/MS spectra. Thus the new de novo sequencing approach reached the same level of efficiency and reliability as conventional database-identification strategies.

19.
The Virtual Expert Mass Spectrometrist (VEMS) program package was developed for flexible, automated, and manual de novo tandem mass spectrometry (MS/MS) protein sequencing, and includes accessory programs for matrix-assisted laser desorption/ionization-mass spectrometry (MS) interpretation, and generation of protein and peptide databases. VEMS V2.0 has been developed into a fast tool for combining database-independent and -dependent protein assignments in an extended analysis of MS/MS-peptide data. MS or MS/MS data can be directly recalibrated after the first search by fitting the data to the best search result using polynomial equations. The score function is an improvement of known scoring algorithms and can be adapted for any MS instrument type. In addition, VEMS offers a novel statistical model for evaluating the significance of the protein assignment. The novel features are illustrated by the analysis of the fragmentation spectra obtained by liquid chromatography-MS/MS analysis of peptides from an anionic peroxidase enriched protein fraction from potato root tissue. The extended analysis mode resulted in the additional assignment of spectra for nine modified tryptic peptides and nine miscleaved peptides, in addition to the 45 spectra from regular tryptic peptides. Of the nine modified peptides, three were glycosylated.

20.
MOTIVATION: Peptide identification following tandem mass spectrometry (MS/MS) is usually achieved by searching for the best match between the mass spectrum of an unidentified peptide and model spectra generated from peptides in a sequence database. This methodology will be successful only if the peptide under investigation belongs to an available database. Our objective is to develop and test the performance of a heuristic optimization algorithm capable of dealing with some features commonly found in actual MS/MS spectra that tend to stop simpler deterministic solution approaches. RESULTS: We present the implementation of a Genetic Algorithm (GA) in the reconstruction of amino acid sequences using only spectral features, discuss some of the problems associated with this approach and compare its performance to a de novo sequencing method. The GA can potentially overcome some of the most problematic aspects associated with de novo analysis of real MS/MS data such as missing or unclearly defined peaks and may prove to be a valuable tool in the proteomics field. We assess the performance of our algorithm under conditions of perfect spectral information, in situations where key spectral features are missing, and using real MS/MS spectral data.
