Similar articles
1.
Protein identification has been greatly facilitated by database searches against protein sequences derived from product ion spectra of peptides. This approach is primarily based on the use of fragment ion mass information contained in an MS/MS spectrum. Unambiguous protein identification from a spectrum with low sequence coverage or poor spectral quality can be a major challenge. We present a two-dimensional (2D) mass spectrometric method in which the numbers of nitrogen atoms in the molecular ion and the fragment ions are used to provide additional discriminating power for much improved protein identification and de novo peptide sequencing. The nitrogen number is determined by analyzing the mass difference of corresponding peak pairs in overlaid spectra of ¹⁵N-labeled and unlabeled peptides. These peptides are produced by enzymatic or chemical cleavage of proteins from cells grown in ¹⁵N-enriched and normal media, respectively. It is demonstrated that, using 2D information, i.e., m/z and its associated nitrogen number, this method can not only confirm protein identification results generated by MS/MS database searching but also identify peptides that cannot be identified by database searching alone. Examples are presented of analyzing Escherichia coli K12 extracts that yielded relatively poor MS/MS spectra, presumably from the digests of low abundance proteins, which can still give positive protein identification using this method. Additionally, this 2D MS method can facilitate spectral interpretation for de novo peptide sequencing and identification of posttranslational or other chemical modifications. We envision that this method should be particularly useful for proteome expression profiling of organelles or cells that can be grown in ¹⁵N-enriched media.
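
The nitrogen-counting step in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and it assumes unmodified peptides, complete ¹⁵N labeling, and a known charge state. The isotope masses and per-residue nitrogen counts are standard values.

```python
# Monoisotopic mass difference between 15N and 14N, in daltons.
N15_MINUS_N14 = 15.0001088982 - 14.0030740048  # about 0.99703

# Nitrogen atoms per amino acid residue (backbone N plus side-chain N).
RESIDUE_NITROGENS = {
    "G": 1, "A": 1, "S": 1, "P": 1, "V": 1, "T": 1, "C": 1, "L": 1, "I": 1,
    "D": 1, "E": 1, "M": 1, "F": 1, "Y": 1, "N": 2, "Q": 2, "K": 2, "W": 2,
    "H": 3, "R": 4,
}


def nitrogen_count(mz_unlabeled, mz_labeled, charge=1):
    """Estimate the nitrogen number of an ion from a 14N/15N peak pair."""
    # Convert the m/z shift to a mass shift, then divide by the per-atom shift.
    shift = (mz_labeled - mz_unlabeled) * charge
    return round(shift / N15_MINUS_N14)


def peptide_nitrogens(sequence):
    """Count nitrogen atoms in an unmodified peptide sequence."""
    return sum(RESIDUE_NITROGENS[aa] for aa in sequence)


def filter_by_nitrogen(candidates, observed_n):
    """Keep only candidate sequences whose nitrogen count matches."""
    return [seq for seq in candidates if peptide_nitrogens(seq) == observed_n]
```

Used this way, the nitrogen number acts as a second filter on top of m/z: two candidates with near-identical masses are separated whenever their nitrogen counts differ.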

2.
LC-MS/MS has emerged as the method of choice for the identification and quantification of protein sample mixtures. For very complex samples such as complete proteomes, the most commonly used LC-MS/MS method, data-dependent acquisition (DDA) precursor selection, is of limited utility. The limited scan speed of current mass spectrometers along with the highly redundant selection of the most intense precursor ions generates a bias in the pool of identified proteins toward those of higher abundance. A directed LC-MS/MS approach that alleviates the limitations of DDA precursor ion selection by decoupling peak detection and sequencing of selected precursor ions is presented. In the first stage of the strategy, all detectable peptide ion signals are extracted from high resolution LC-MS feature maps or aligned sets of feature maps. The selected features or a subset thereof are subsequently sequenced in sequential, non-redundant directed LC-MS/MS experiments, and the MS/MS data are mapped back to the original LC-MS feature map in a fully automated manner. The strategy, implemented on an LTQ-FT MS platform, allowed the specific sequencing of 2,000 features per analysis and enabled the identification of more than 1,600 phosphorylation sites using a single reversed phase separation dimension without the need for time-consuming prefractionation steps. Compared with conventional DDA LC-MS/MS experiments, a substantially higher number of peptides could be identified from a sample, and this increase was more pronounced for low intensity precursor ions.

3.
Recent advances in instrument control and enrichment procedures have enabled us to quantify large numbers of phosphoproteins and record site-specific phosphorylation events. An intriguing problem that has arisen with these advances is to accurately validate where phosphorylation events occur, if possible, in an automated manner. The problem is difficult because MS/MS spectra of phosphopeptides are generally more complicated than those of unmodified peptides. For large scale studies, the problem is even more evident because phosphorylation sites are based on single peptide identifications in contrast to protein identifications where at least two peptides from the same protein are required for identification. To address this problem we have developed an integrated strategy that increases the reliability and ease for phosphopeptide validation. We have developed an off-line titanium dioxide (TiO(2)) selective phosphopeptide enrichment procedure for crude cell lysates. Following enrichment, half of the phosphopeptide fractionated sample is enzymatically dephosphorylated, after which both samples are subjected to LC-MS/MS. From the resulting MS/MS analyses, the dephosphorylated peptide is used as a reference spectrum against the original phosphopeptide spectrum, in effect generating two peptide spectra for the same amino acid sequence, thereby enhancing the probability of a correct identification. The integrated procedure is summarized as follows: 1) enrichment for phosphopeptides by TiO(2) chromatography, 2) dephosphorylation of half the sample, 3) LC-MS/MS-based analysis of phosphopeptides and corresponding dephosphorylated peptides, 4) comparison of peptide elution profiles before and after dephosphorylation to confirm phosphorylation, and 5) comparison of MS/MS spectra before and after dephosphorylation to validate the phosphopeptide and its phosphorylation site. 
This phosphopeptide identification represents a major improvement as compared with identifications based only on single MS/MS spectra and probability-based database searches. We investigated the applicability of this method to crude cell lysates and demonstrate its application to the large scale analysis of phosphorylation sites in differentiating mouse myoblast cells.

4.
MOTIVATION: Peptide identification following tandem mass spectrometry (MS/MS) is usually achieved by searching for the best match between the mass spectrum of an unidentified peptide and model spectra generated from peptides in a sequence database. This methodology will be successful only if the peptide under investigation belongs to an available database. Our objective is to develop and test the performance of a heuristic optimization algorithm capable of dealing with some features commonly found in actual MS/MS spectra that tend to stop simpler deterministic solution approaches. RESULTS: We present the implementation of a Genetic Algorithm (GA) in the reconstruction of amino acid sequences using only spectral features, discuss some of the problems associated with this approach and compare its performance to a de novo sequencing method. The GA can potentially overcome some of the most problematic aspects associated with de novo analysis of real MS/MS data such as missing or unclearly defined peaks and may prove to be a valuable tool in the proteomics field. We assess the performance of our algorithm under conditions of perfect spectral information, in situations where key spectral features are missing, and using real MS/MS spectral data.

5.
Biniossek ML, Schilling O. Proteomics 2012, 12(9): 1303-1309
Peptide sequences lacking basic residues (arginine, lysine, or histidine, referred to as "base-less") are of particular importance in proteomic experiments targeting protein C-termini or employing nontryptic proteases such as GluC or chymotrypsin. We demonstrate enhanced identification of base-less peptides by focused analysis of singly charged precursors in liquid chromatography (LC) electrospray ionization (ESI) tandem mass spectrometry (MS/MS). Singly charged precursors are often excluded from fragmentation and sequence analysis in LC-MS/MS. We generated different pools of base-less and base-containing peptides by tryptic and nontryptic digestion of bacterial proteomes. Focused LC-MS/MS analysis of singly charged precursor ions yielded predominantly base-less peptide identifications. Similar numbers of base-less peptides were identified by LC-MS/MS analysis targeting multiply charged precursors. There was little redundancy between the base-less sequences derived by the two MS/MS schemes. In the present experiments, additional LC-MS/MS analysis of singly charged precursors substantially increased the identification rate of base-less sequences beyond that obtained from multiply charged precursors. In conclusion, LC-MS/MS based identification of base-less peptides is substantially enhanced by additional focused analysis of singly charged precursors.

6.
Tandem mass spectrometry (MS/MS) combined with database searching is currently the most widely used method for high-throughput peptide and protein identification. Many different algorithms, scoring criteria, and statistical models have been used to identify peptides and proteins in complex biological samples, and many studies, including our own, describe the accuracy of these identifications, using at best generic terms such as "high confidence." False positive identification rates for these criteria can vary substantially with changing organisms under study, growth conditions, sequence databases, experimental protocols, and instrumentation; therefore, study-specific methods are needed to estimate the accuracy (false positive rates) of these peptide and protein identifications. We present and evaluate methods for estimating false positive identification rates based on searches of randomized databases (reversed and reshuffled). We examine the use of separate searches of a forward then a randomized database and combined searches of a randomized database appended to a forward sequence database. Estimated error rates from randomized database searches are first compared against actual error rates from MS/MS runs of known protein standards. These methods are then applied to biological samples of the model microorganism Shewanella oneidensis strain MR-1. Based on the results obtained in this study, we recommend the use of combined searches of a reshuffled database appended to a forward sequence database as a means of providing quantitative estimates of false positive identification rates of peptides and proteins. This will allow researchers to set criteria and thresholds to achieve a desired error rate and provide the scientific community with direct and quantifiable measures of peptide and protein identification accuracy as opposed to vague assessments such as "high confidence."
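
The randomized-database idea in the abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, and the estimator shown (decoy hits divided by forward hits above a score threshold) is one common choice for a combined forward-plus-decoy search, which may differ from the formula the authors used.

```python
import random


def reversed_decoy(sequence):
    """Build a decoy entry by reversing a protein sequence."""
    return sequence[::-1]


def reshuffled_decoy(sequence, seed=0):
    """Build a decoy entry by shuffling residues (composition is preserved)."""
    rng = random.Random(seed)  # fixed seed keeps the decoy database reproducible
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)


def estimate_false_positive_rate(matches, threshold):
    """Estimate the error rate among forward hits scoring at or above threshold.

    matches: iterable of (score, is_decoy) pairs from a combined search.
    Assumes wrong forward matches are about as frequent as decoy matches.
    """
    forward = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoy = sum(1 for score, is_decoy in matches
                if score >= threshold and is_decoy)
    if forward == 0:
        return 0.0  # no forward hits to assess at this threshold
    return decoy / forward
```

Scanning the threshold from high to low and stopping at the last value whose estimate stays under the desired error rate gives a study-specific score cutoff.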

7.
Genes that encode glycosylphosphatidylinositol anchored proteins (GPI-APs) constitute an estimated 1-2% of eukaryote genomes. Current computational methods for the prediction of GPI-APs are sensitive and specific; however, the analysis of the processing site (omega- or ω-site) of GPI-APs is still challenging. Only 10% of the proteins that are annotated as GPI-APs have the omega-site experimentally verified. We describe an integrated computational and experimental proteomics approach for the identification and characterization of GPI-APs that provides the means to identify GPI-APs and the derived GPI-anchored peptides in LC-MS/MS data sets. The method takes advantage of sequence features of GPI-APs and the known core structure of the GPI-anchor. The first stage of the analysis encompasses LC-MS/MS based protein identification. The second stage involves prediction of the processing sites of the identified GPI-APs and prediction of the corresponding terminal tryptic peptides. The third stage calculates possible GPI structures on the peptides from stage two. The fourth stage calculates the scores by comparing the theoretical spectra of the predicted GPI-peptides against the observed MS/MS spectra. Automated identification of C-terminal GPI-peptides from porcine membrane dipeptidase, folate receptor and CD59 in complex LC-MS/MS data sets demonstrates the sensitivity and specificity of this integrated computational and experimental approach.

8.
The characterization by de novo peptide sequencing of the different nucleoside diphosphate kinase B (NDK B) proteins from all the commercial hakes and grenadiers belonging to the family Merlucciidae is reported. A classical proteomics approach, consisting of two-dimensional gel electrophoresis, tryptic in-gel digestion of the excised spots, MALDI-TOF MS, LC-MS/MS, and nanoESI-MS/MS analyses, was followed for the purification and characterization of the different isoforms of NDK B. Fragmentation spectra were used for de novo peptide sequencing. A high degree of homology was found between the sequences of all the species studied and the NDK B sequence from Gillichthys mirabilis, which is accessible in the protein databases. Particular attention was paid to the differential characterization of species-specific peptides that could be used for fish authentication purposes. These findings allowed us to propose a rapid and effective classification method, based on the detection of these biomarker peptides using the selective ion reaction monitoring (SIRM) scan mode in mass spectrometry.

9.
A method was developed for the liquid chromatographic-mass spectrometric (LC-MS) identification of extremely neurotoxic toxins. The method combines sample treatment in a safety containment and analysis of detoxified material in a common laboratory facility. The method was applied to the characterization of neat tetanus toxin and subsequent identification of the toxin in cell lysate supernatants and culture supernatants from different Clostridium tetani bacteria strains. Characterization of the neat toxin was accomplished by (1) accurate mass measurement of enzyme digest fragments of the toxin and (2) tandem mass spectrometric (MS/MS) amino acid sequencing of selected peptides. Accurate mass measurement proved no longer feasible for the analysis of supernatants, due to the overwhelming presence of peptides from proteins other than toxin. Even when high-molecular-weight proteins were filtered from the lysates and treated, the retained protein fraction yielded too many peptides. However, MS/MS could successfully be applied when the findings from the characterization of neat toxin were employed. Thus, LC-MS/MS of selected precursor ions from trypsin digest fragments yielded specific sequence data for identification of the toxin. This procedure provided reliable identification of the toxin at levels above 1 µg/ml and within a day. Investigations with the method developed will be extended to the botulinum neurotoxins.

10.
An Z, Chen Y, Koomen JM, Merkler DJ. Proteomics 2012, 12(2): 173-182
Amidation is a post-translational modification found at the C-terminus of ~50% of all neuropeptide hormones. Cleavage of the C(α)-N bond of a C-terminal glycine yields the α-amidated peptide in a reaction catalyzed by peptidylglycine α-amidating monooxygenase (PAM). The mass of an α-amidated peptide decreases by 58 Da relative to its precursor. The amino acid sequences of an α-amidated peptide and its precursor differ only by the C-terminal glycine, meaning that the peptides exhibit similar RP-HPLC properties and tandem mass spectral (MS/MS) fragmentation patterns. Growth of cultured cells in the presence of a PAM inhibitor ensured the coexistence of α-amidated peptides and their precursors. A strategy was developed for precursor and α-amidated peptide pairing (PAPP): LC-MS/MS data of peptide extracts were scanned for peptide pairs that differed by 58 Da in mass, but had similar RP-HPLC retention times. The resulting peptide pairs were validated by checking for similar fragmentation patterns in their MS/MS data prior to identification by database searching or manual interpretation. This approach significantly reduced the number of spectra requiring interpretation, decreasing the computing time required for database searching and enabling manual interpretation of unidentified spectra. Reported here are the α-amidated peptides identified from AtT-20 cells using the PAPP method.
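
The pairing step of the PAPP strategy described above can be sketched in Python. This covers only the mass and retention-time scan, not the subsequent fragmentation-pattern check, and it is not the authors' software. The function name, the tuple layout, and both tolerances are assumptions; the default shift of 58.005 Da is the monoisotopic mass of C2H2O2, consistent with the nominal 58 Da stated in the abstract.

```python
def find_amidation_pairs(features, mass_shift=58.005, mass_tol=0.02, rt_tol=1.0):
    """Pair candidate glycine-extended precursors with alpha-amidated peptides.

    features: list of (feature_id, neutral_mass_da, retention_time_min).
    Returns (precursor_id, amidated_id) tuples for features whose masses
    differ by mass_shift (within mass_tol) and whose retention times
    differ by at most rt_tol minutes.
    """
    ordered = sorted(features, key=lambda feature: feature[1])
    pairs = []
    # Simple quadratic scan; adequate for a sketch, slow for very large runs.
    for amidated in ordered:
        for precursor in ordered:
            mass_ok = abs((precursor[1] - amidated[1]) - mass_shift) <= mass_tol
            rt_ok = abs(precursor[2] - amidated[2]) <= rt_tol
            if mass_ok and rt_ok:
                pairs.append((precursor[0], amidated[0]))
    return pairs
```

Only the spectra belonging to the returned pairs then need validation and interpretation, which is where the reduction in search time reported in the abstract comes from.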

11.
Protein identification by interrogation of databases requires a comprehensive compilation of modified amino acid forms. Here, we describe the chemical oxidation of carboxyamidomethyl cysteine to the sulfoxide and sulfone forms, species that may add more complexity to peptide analyses. They can be easily distinguished by tandem mass spectrometry (MS/MS) due to their characteristic pattern of side chain neutral eliminations either from the parent ion or ion series that generate dehydroalanine as detected by MS³. This finding was supported by the MSⁿ spectra recorded for a peptide isolated from a mixture of tryptic peptides and for a derivatized/oxidized synthetic peptide with a different sequence. These modifications and their diagnostic neutral losses should be included in the list of chemical modifications and in algorithms designed for the automatic sequencing of peptides and database searching.

12.
Hernandez P, Gras R, Frey J, Appel RD. Proteomics 2003, 3(6): 870-878
In recent years, proteomics research has gained importance due to increasingly powerful techniques in protein purification, mass spectrometry and identification, and due to the development of extensive protein and DNA databases from various organisms. Nevertheless, current identification methods from spectrometric data have difficulties in handling modifications or mutations in the source peptide. Moreover, they have low performance when run on large databases (such as genomic databases), or with low quality data, for example due to bad calibration or low fragmentation of the source peptide. We present a new algorithm dedicated to automated protein identification from tandem mass spectrometry (MS/MS) data by searching a peptide sequence database. Our identification approach shows promising properties for solving the specific difficulties enumerated above. It consists of matching theoretical peptide sequences issued from a database with a structured representation of the source MS/MS spectrum. The representation is similar to the spectrum graphs commonly used by de novo sequencing software. The identification process involves the parsing of the graph in order to emphasize relevant sections for each theoretical sequence, and leads to a list of peptides ranked by a correlation score. The parsing of the graph, which can be a highly combinatorial task, is performed by a bio-inspired algorithm called the Ant Colony Optimization algorithm.

13.
Zhang N, Li XJ, Ye M, Pan S, Schwikowski B, Aebersold R. Proteomics 2005, 5(16): 4096-4106
In MS/MS experiments with automated precursor ion selection, only a fraction of sequencing attempts leads to the successful identification of a peptide. A number of reasons may contribute to this situation. They include poor fragmentation of the selected precursor ion, the presence of modified residues in the peptide, mismatches with sequence databases, and frequently, the concurrent fragmentation of multiple precursors in the same CID attempt. Current database search engines are incapable of correctly assigning the sequences of multiple precursors to such spectra. We have developed a search engine, ProbIDtree, which can identify multiple peptides from a CID spectrum generated by the concurrent fragmentation of multiple precursor ions. This is achieved by iterative database searching in which the submitted spectra are generated by subtracting the fragment ions assigned to a tentatively matched peptide from the acquired spectrum and in which each match is assigned a tentative probability score. Tentatively matched peptides are organized in a tree structure from which their adjusted probability scores are calculated and used to determine the correct identifications. The results using MALDI-TOF-TOF MS/MS data demonstrate that multiple peptides can be effectively identified simultaneously with high confidence using ProbIDtree.

14.
LC-MS/MS has demonstrated potential for detecting plant pathogens. Unlike PCR or ELISA, LC-MS/MS does not require pathogen-specific reagents for the detection of pathogen-specific proteins and peptides. However, the MS/MS approach we and others have explored does require a protein sequence reference database and database-search software to interpret tandem mass spectra. To evaluate the limitations of database composition on pathogen identification, we analyzed proteins from cultured Ustilago maydis, Phytophthora sojae, Fusarium graminearum, and Rhizoctonia solani by LC-MS/MS. When the search database did not contain sequences for a target pathogen, or contained sequences from related pathogens, target pathogen spectra were reliably matched to protein sequences from nontarget organisms, giving an illusion that proteins from nontarget organisms were identified. Our analysis demonstrates that when database-search software is used as part of the identification process, a paradox exists whereby additional sequences needed to detect a wide variety of possible organisms may lead to more cross-species protein matches and misidentification of pathogens.

15.
A "one-pot" alternative method for processing proteins and isolating peptide mixtures from bacterial samples is presented for liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis and data reduction. The conventional in-solution digestion of the protein contents of bacteria is compared to a small disposable filter unit placed inside a centrifuge vial for processing and digestion of bacterial proteins. Each processing stage allows filtration of excess reactants and unwanted byproducts while retaining the proteins. Upon addition of trypsin, the peptide mixture solution is passed through the filter while retaining the trypsin enzyme. The peptide mixture is then analyzed by LC-MS/MS with an in-house BACid algorithm for a comparison of the experimental unique peptides to a constructed proteome database of bacterial genus, species, and strain entries. The concentration of bacteria was varied from 10 × 10⁷ to 3.3 × 10³ cfu/mL for analysis of the effect of concentration on the ability of the sample processing, LC-MS/MS, and data analysis methods to identify bacteria. The protein processing method and dilution procedure result in reliable identification of pure suspensions and mixtures at high and low bacterial concentrations.

16.
MS/MS combined with database searching has emerged as a valuable technology for rapidly analyzing protein expression, localization, and post-translational modifications. The probability-based search engine Mascot has found widespread use as a tool to correlate tandem mass spectra with peptides in a sequence database. Although the Mascot scoring algorithm provides a probability-based model for peptide identification, the independent peptide scores do not correlate with the significance of the proteins to which they match. Herein, we describe a heuristic method for organizing proteins identified at a specified false-discovery rate using Mascot-matched peptides. We call this method PROVALT, and it uses peptide matches from a random database to calculate false-discovery rates for protein identifications and reduces a complex list of peptide matches to a nonredundant list of homologous protein groups. This method was evaluated using Mascot-identified peptides from a Trypanosoma cruzi epimastigote whole-cell lysate, which was separated by multidimensional LC and analyzed by MS/MS. PROVALT was then compared with the two traditional methods of protein identification when using Mascot, the single peptide score and cumulative protein score methods, and was shown to be superior to both with regard to the number of proteins identified and the inclusion of lower scoring nonrandom peptide matches.

17.
18.
For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then used to match, accounting for amino acid mutations, the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach, and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software.
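
The de novo error described above, where one segment is replaced by another of approximately the same mass, can be illustrated with a short Python sketch. It shows only the error type, not the SPIDER matching algorithm; the function names and the 0.02 Da default tolerance are assumptions, while the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def segment_mass(segment):
    """Sum the residue masses of a short amino acid segment."""
    return sum(RESIDUE_MASS[aa] for aa in segment)


def same_mass(segment_a, segment_b, tol=0.02):
    """Report whether two segments are indistinguishable within tol daltons."""
    return abs(segment_mass(segment_a) - segment_mass(segment_b)) <= tol


# Classic ambiguities: "GG" weighs the same as "N", and "GA" the same as "Q".
ambiguous = same_mass("GG", "N") and same_mass("GA", "Q")
```

A tag-matching procedure that tolerates such swaps can still align a partially wrong tag such as "...GG..." to a database sequence containing "...N...".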

19.
Two-dimensional gel electrophoresis-separated and excised haptoglobin alpha2-chain protein spots were subjected to in-gel digestion with trypsin. Previously unassigned peptide ion signals observed in mass spectrometric fingerprinting experiments were sequenced using the matrix-assisted laser desorption/ionization-quadrupole ion trap-time of flight (MALDI-QIT-TOF) mass spectrometer and showed that the haptoglobin alpha-chain derivative under study was cleaved by trypsin unspecifically. Abundant cleavages occurred C-terminal to histidine residues at H23, H28, and H87. In addition, mild acidic hydrolysis leading to cleavage after the aspartic acid residue at D13 was observed. The uninterpreted tandem mass spectrometry (MS/MS) spectrum of the peptide with ion signal at 2620.19 was submitted to database search and yielded the identification of the corresponding peptide sequence comprising amino acids (aa) 65-87 from the haptoglobin alpha-chain protein. Also, the presence of a mixture of two tryptic peptides (mass to charge ratio m/z 1708.8; aa40-54 and aa99-113, respectively), which is caused by a tiny sequence variation between the two repeats in the haptoglobin alpha2-chain protein, was resolved by MS/MS fragmentation using the MALDI-QIT-TOF mass spectrometer. Advantageous features such as (i) easy parent ion creation, (ii) minimal sample consumption, and (iii) real collision induced dissociation conditions were combined successfully to determine the amino acid sequences of the previously unassigned peptides. Hence, the novel mass spectrometric sequencing method applied here has proven effective for identification of distinct molecular protein structures.

20.
Generating all plausible de novo interpretations of a peptide tandem mass (MS/MS) spectrum (Spectral Dictionary) and quickly matching them against the database represent a recently emerged alternative approach to peptide identification. However, the sizes of the Spectral Dictionaries quickly grow with the peptide length, making their generation impractical for long peptides. We introduce Gapped Spectral Dictionaries (all plausible de novo interpretations with gaps) that can be easily generated for any peptide length, thus addressing the limitation of the Spectral Dictionary approach. We show that Gapped Spectral Dictionaries are small, thus opening a possibility of using them to speed up MS/MS searches. Our MS-Gapped-Dictionary algorithm (based on Gapped Spectral Dictionaries) enables proteogenomics applications (such as searches in the six-frame translation of the human genome) that are prohibitively time consuming with existing approaches. MS-Gapped-Dictionary generates gapped peptides that occupy a niche between accurate but short peptide sequence tags and long but inaccurate full length peptide reconstructions. We show that, contrary to conventional wisdom, some high-quality spectra do not have good peptide sequence tags and introduce gapped tags that have advantages over the conventional peptide sequence tags in MS/MS database searches.
