Similar literature
 20 similar articles found (search time: 15 ms)
1.
In this work, the commonly used algorithms for mass spectrometry-based protein identification, Mascot, MS-Fit, ProFound and SEQUEST, were studied with respect to the selectivity and sensitivity of their searches. The influence of various search parameters was also investigated. Approximately 6600 searches were performed using different search engines with several search parameters to establish a statistical basis. The applied mass spectrometric data set was chosen from a current proteome study. The huge amount of data could only be handled with computational assistance. We present a software solution for fully automated triggering of several peptide mass fingerprinting (PMF) and peptide fragmentation fingerprinting (PFF) algorithms. The development of this high-throughput method made possible an intensive evaluation based on data acquired in a typical proteome project. Previous evaluations of PMF and PFF algorithms were mainly based on simulations.

2.
The MultiTag method (Sunyaev et al., Anal. Chem. 2003, 75, 1307-1315) employs multiple error-tolerant searches with peptide sequence tags (Mann and Wilm, Anal. Chem. 1994, 66, 4390-4399) for the identification of proteins from organisms with unsequenced genomes. Here we demonstrate that the error-tolerant capabilities of MultiTag increased the number of peptide alignments and improved the confidence of identifications in an EST database. MultiTag outperformed conventional database searching software that only utilizes stringent matching of tandem mass spectra to nucleotide sequences of ESTs.

3.
We demonstrate a new approach to the determination of amino acid composition from tandem mass spectrometrically fragmented peptides using both experimental and simulated data. The approach has been developed to be used as a search-space filter in a protein identification pipeline with the aim of increased performance above that which could be attained by using immonium ion information. Three automated methods have been developed and tested: one based upon a simple peak traversal, in which all intense ion peaks are treated as being either a b- or y-ion using a wide mass tolerance; a second which uses a much narrower tolerance and does not perform transformations of ion peaks to the complementary type; and the unique fragments method, which allows the b- or y-ion type to be inferred and corroborated using a scan of the other ions present in each peptide spectrum. The combination of these methods is shown to provide a high-accuracy set of amino acid predictions using both experimental and simulated data sets. These high-quality predictions, with an accuracy of over 85%, may be used to identify peptide fragments that are hard to identify using other methods. The data simulation algorithm is also shown a posteriori to be a good model of noiseless tandem mass spectrometric peptide data.

4.
One of the major additions to MS technology has been the emergence of the Orbitrap mass analyzer, which has boosted proteomic analyses of complex biological samples since its introduction. Here, we took advantage of the capabilities of the new Orbitrap Fusion Lumos Tribrid mass spectrometer to assess the performance of different data-dependent acquisition methods for the identification and quantitation of peptides and phosphopeptides in single-shot analysis of human whole cell lysates. Our study explored the capabilities of tribrid mass spectrometers for (phospho-)peptide identification and quantitation using different gradient lengths, sample amounts, and combinations of different peptide fragmentation types and mass analyzers. Moreover, the acquisition of the same complex sample with different acquisition methods resulted in the generation of a dataset to be used as a reference for further analyses, and a starting point for future optimizations in particular applications.

5.
Proteins can be identified using a set of peptide fragment weights produced by a specific digestion to search a protein database in which sequences have been replaced by fragment weights calculated for various cleavage methods. We present a method using multidimensional searches that greatly increases the confidence level for identification, allowing DNA sequence databases to be examined. This method provides a link between 2-dimensional gel electrophoresis protein databases and genome sequencing projects. Moreover, the increased confidence level allows unknown proteins to be matched to expressed sequence tags, potentially eliminating the need to obtain sequence information for cloning. Database searching from a mass profile is offered as a free service by an automatic server at the ETH, Zürich. For information, send an electronic message to the address cbrg/inf.ethz.ch with the line: help mass search, or help all.
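
The search principle described in this abstract can be illustrated with a minimal Python sketch. It is not the ETH server's implementation: it assumes trypsin as the only cleavage method (cleave after K or R, not before P), uses standard monoisotopic residue masses, and ranks proteins by a plain count of matched masses. The names tryptic_peptides, peptide_mass, count_matches and rank_proteins, the 0.2 Da tolerance, and the toy database are all hypothetical.

```python
import re

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    # Splitting on a zero-width match requires Python 3.7 or later.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tol_da=0.2):
    """Number of observed masses within tol_da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(m - t) <= tol_da for t in theoretical) for m in observed)


def rank_proteins(observed, database, tol_da=0.2):
    """Sort (match_count, protein_name) pairs, best candidate first."""
    scored = [(count_matches(observed, seq, tol_da), name)
              for name, seq in database.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    toy_db = {"P1": "AKGGR", "P2": "WWWK"}  # hypothetical sequences
    masses = [peptide_mass("AK"), peptide_mass("GGR")]
    print(rank_proteins(masses, toy_db))
```

A raw match count favours long proteins, which is one reason published methods use more elaborate scoring; the count is used here only to keep the sketch short.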

6.
Alternative splicing is generally accepted as a mechanism that explains the discrepancy between the number of genes and proteins. We used peptide mass fingerprinting with a theoretical database and scoring method to discover and identify alternative splicing isoforms. Our theoretical database was built using published alternative splicing databases such as ECgene, H-DBAS, and TISA. According to our theoretical database of 190,529 isoforms, 37% of human genes have multiple isoforms. The isoforms produced from a gene partially share common peptide fragments because they have common exons, making it difficult to distinguish isoforms. Therefore, we developed a new method that effectively distinguishes the true isoform among multiple isoforms of a gene. To evaluate our algorithm, we made test sets for 4226 protein isoforms randomly extracted from our theoretical database. Our scoring algorithm identified 94% of the true isoforms.

7.
8.
Spectral library searching is an emerging approach in peptide identifications from tandem mass spectra, a critical step in proteomic data analysis. In spectral library searching, a spectral library is first meticulously compiled from a large collection of previously observed peptide MS/MS spectra that are conclusively assigned to their corresponding amino acid sequence. An unknown spectrum is then identified by comparing it to all the candidates in the spectral library for the most similar match. This review discusses the basic principles of spectral library building and searching, describes its advantages and limitations, and provides a primer for researchers interested in adopting this new approach in their data analysis. It will also discuss the future outlook on the evolution and utility of spectral libraries in the field of proteomics.
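
The comparison step described above ("comparing it to all the candidates in the spectral library for the most similar match") can be sketched in Python as follows. The abstract does not name a similarity measure; this sketch assumes a normalized dot product (cosine similarity) over spectra binned at 1 m/z unit, which is a common choice but not necessarily what any particular tool uses. The function names, the bin width, and the toy library are hypothetical.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def cosine_similarity(a, b):
    """Normalized dot product of two binned spectra (0 = disjoint, 1 = same shape)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_library_match(query_peaks, library):
    """Return (score, peptide) for the most similar entry of a non-empty library."""
    query = bin_spectrum(query_peaks)
    scored = [(cosine_similarity(query, bin_spectrum(peaks)), name)
              for name, peaks in library.items()]
    return max(scored)


if __name__ == "__main__":
    toy_library = {  # hypothetical entries: peptide -> (m/z, intensity) peaks
        "PEPTIDEA": [(100.1, 10.0), (200.2, 5.0)],
        "PEPTIDEB": [(150.3, 8.0), (300.4, 2.0)],
    }
    print(best_library_match([(100.2, 9.0), (200.1, 6.0)], toy_library))
```

In practice the candidates would first be restricted to library entries whose precursor mass agrees with the query, so that only a small part of the library is scored.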

9.
Peptide mass fingerprinting (PMF) is a valuable method for rapid and high-throughput protein identification using the proteomics approach. Automated search engines, such as MS-Fit, Mascot, ProFound, and PeptIdent, have facilitated protein identification through PMF. The potential to obtain a true MS protein identification result depends on the choice of algorithm as well as experimental factors that influence the information content in MS data. When mass spectral data are incomplete and/or have low mass accuracy, the “number of matches” approach may be inadequate for a useful identification. Several studies have evaluated factors influencing the quality of mass spectrometry (MS) experiments. Missed cleavages, posttranslational modifications of peptides and contaminants (e.g., keratin) are important factors that can affect the results of MS analyses by influencing the identification process as well as the quality of the MS spectra. We compared search engines frequently used to identify proteins from Homo sapiens and Halobacterium salinarum by evaluating factors, including data-based and mass tolerance, to develop an improved search engine for PMF. This study may provide information to help develop a more effective algorithm for protein identification in each species through PMF.

10.
A completely automated peptide mapping liquid chromatography/mass spectrometry (LC/MS) system for characterization of therapeutic proteins in which a common high-performance liquid chromatography (HPLC) autosampler is used for automated sample preparation, including protein denaturation, reduction, alkylation, and enzymatic digestion, is described. The digested protein samples are then automatically subjected to LC/MS analysis using the same HPLC system. The system was used for peptide mapping of monoclonal antibodies (mAbs), known as a challenging group of therapeutic proteins for achieving complete coverage and quantitative representation of all peptides. Detailed sample preparation protocols, using an Agilent HPLC system, are described for Lys-C digestion of mAbs with intact disulfide bonds and tryptic digestion of mAbs after reduction and alkylation. The automated procedure of Lys-C digestion of nonreduced antibody, followed by postdigestion disulfide reduction, produces both the nonreduced and reduced digests that facilitate disulfide linkage analysis. The automated peptide mapping LC/MS system has great utility in preparing and analyzing multiple samples for protein characterization, identification, and quantification of posttranslational modifications during process and formulation development as well as for protein identity and quality control.

11.
Current influenza vaccine manufacturing and testing timelines require that the constituent hemagglutinin (HA) and neuraminidase (NA) strains be selected each year approximately 10 months before the vaccine becomes available. The threat of a pandemic influenza outbreak requires that more rapid testing methods be found. We have developed a specialized on-filter sample preparation method that uses both trypsin and chymotrypsin to enzymatically digest peptide-N-glycosidase F (PNGase F)-deglycosylated proteins in vaccines. In tandem with replicate liquid chromatography-mass spectrometry (LC-MS) analyses, this approach yields sufficient protein sequencing data (>85% sequence coverage on average) for strain identification of HA and NA components. This has allowed the confirmation, and in some cases the correction, of the identity of the influenza strains in recent commercial vaccines as well as the correction of some ambiguous HA sequence annotations in available databases. This method also allows the identification of low-level contaminant egg proteins produced during the manufacturing process.

12.
Curation and interpretation of protein databank-search results by human experts are key aspects of MS-based proteomic data acquisition. These tasks are often overlooked due to the vast amount of data to inspect. We have developed myProMS, a web server designed to ease the validation and interpretation of search results by improving data organization, mining and sharing between MS specialists and biologists during MS-based collaborative projects. A demo is accessible at http://bioinfo.curie.fr/myproms.

13.
Lee K, Bae D, Lim D. Molecules and Cells, 2002, 13(2): 175-184
Protein identification by peptide mass fingerprinting, using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS), plays a major role in large proteome projects. In order to develop a simple and reliable method for protein identification by MALDI-TOF MS, we compared and evaluated the major steps in peptide mass fingerprinting. We found that the removal of excess enzyme from the in-gel digestion usually gave a few more peptide peaks, which were important for the identification of some proteins. Internal calibration always gave better results. However, for a large number of samples, a two-step calibration (i.e., a database search with peptide masses from external calibration, then the use of peptide masses from the search result as internal calibrants) was useful and convenient. From the evaluation and combination of steps that were already developed by others, we established a single overall procedure for peptide identification from a polyacrylamide gel.

14.
Peptide mass fingerprinting (PMF) is widely used for protein identification when studying a proteome with a time-of-flight mass spectrometer or with 1D or 2D electrophoresis. The peptide mass tolerance, which indicates the fit of a theoretical peptide mass to an experimental one, significantly influences protein identification. The role of peptide mass tolerance could be estimated by counting the number of correctly identified proteins for a reference set of mass spectra. The reference set of 400 Ultraflex (Bruker Daltonics, Germany) protein mass spectra was obtained from hydrolyzed slices of a 1D electrophoresis gel of liver microsomes. Using a Mascot server for protein identification, the peptide mass tolerance value was varied within 0.02–0.40 Da with a step of 0.01 Da. The number of identified proteins changed up to 10 times depending on the tolerance. The maximal number of identified proteins was reported for the tolerance value of 0.15 Da (120 ppm), which is 1.5–2-fold higher than the recommended values for this type of mass spectrometer. The software program PMFScan was developed to obtain the dependence between the number of identified proteins and the tolerance values.
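
The tolerance sweep described in this abstract can be sketched in Python. This is not PMFScan and it does not call Mascot: as a stand-in for "number of identified proteins" it simply counts observed peaks that fall within the tolerance of some theoretical mass. The function names and the toy peak lists are hypothetical. The conversion helper also shows that 0.15 Da equals 120 ppm at a mass of 1250 Da; that reference mass is inferred from the two quoted numbers and is not stated in the abstract.

```python
def da_to_ppm(tol_da, mz):
    """Express an absolute tolerance in Da as a relative tolerance in ppm at mz."""
    return tol_da / mz * 1e6


def count_matched_peaks(observed, theoretical, tol_da):
    """Count observed masses lying within tol_da of any theoretical mass."""
    return sum(any(abs(o - t) <= tol_da for t in theoretical) for o in observed)


def scan_tolerances(observed, theoretical, start=0.02, stop=0.40, step=0.01):
    """Return (tolerance, match_count) for every tolerance from start to stop."""
    # Integer step counting avoids accumulating floating-point error.
    n_steps = round((stop - start) / step)
    results = []
    for i in range(n_steps + 1):
        tol = round(start + i * step, 2)
        results.append((tol, count_matched_peaks(observed, theoretical, tol)))
    return results


if __name__ == "__main__":
    observed = [1000.0, 1500.0]          # hypothetical measured masses
    theoretical = [1000.055, 1500.305]   # hypothetical database masses
    for tol, n in scan_tolerances(observed, theoretical):
        print(f"{tol:.2f} Da: {n} matched")
    print(da_to_ppm(0.15, 1250.0))
```

With real data the match count cannot be used alone to pick a tolerance, because widening the window also admits random matches; the study above counts identifications, which already reflects that trade-off through the search engine's scoring.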

15.
Database search post-processing by a neural network was employed in peptide mapping experiments. The database search was performed using both the known algorithms and score functions, such as Bayesian, MOWSE, Z-score, correlations between calculated and actual peptide length fractional abundance, and, in addition, the probability of the protein digest pattern in the peptide fingerprint, all embedded in a locally developed program. The new signal-processing algorithm based on a neural network improves signal-noise separation and is acceptable for automatic protein identification in mixtures. Its power was tested on the Helicobacter pylori protein inventory after protein separation by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). An increase in the protein identification success rate was observed, and about 100 proteins were identified without human participation in the evaluation of database search results.

16.
Peptide mass fingerprinting is widely used for protein identification when studying a proteome with 1D or 2D electrophoresis. The peptide mass tolerance indicates the fit of the theoretical peptide mass to the experimental measurements, and the choice of this parameter significantly influences protein identification. The role of peptide mass tolerance was estimated by counting the number of identified proteins for a reference set of mass spectra. The reference set of 400 Ultraflex (Bruker Daltonics, Germany) mass spectra was obtained for slices of a 1D gel of liver microsomes. Using a Mascot server for protein identification, the peptide mass tolerance value was varied in the range from 0.02 to 0.40 Da with a step of 0.01 Da. Depending on the tolerance, the number of identified proteins changes up to 10 times. The maximal number of identified proteins was reported for the tolerance value of 0.15 Da (120 ppm), which is 1.5-2 times higher than the recommended values for this type of mass spectrometer. The software program PMFScan was developed to obtain the dependence of the number of identified proteins on the tolerance values.

17.
Due to the limited applicability of conventional protein identification methods to the proteomes of organisms with unsequenced genomes, researchers have developed approaches to identify proteins using mass spectrometry and sequence similarity database searches. Both the integration of mass spectrometry with bioinformatics and genomic sequencing drive the expanding organismal scope of proteomics.

18.
Zhao Song, Luonan Chen, Dong Xu. Proteomics, 2009, 9(11): 3090-3099
Protein identification using Peptide Mass Fingerprinting (PMF) data remains an important yet only partially solved problem. Current computational methods may lead to false positive identification since the top hit from a database search may not be the target protein. In addition, the identification scores assigned singly by a scoring function (raw scores) are not normalized. Therefore, the ranking based on raw scores may be biased. To address these issues, we have developed a statistical model to evaluate the confidence of the raw score and to improve the ranking of proteins for identification. The results show that the statistical model ranks the correct protein better than the raw scores do. Our study provides a new method to enhance the accuracy of protein identification by using PMF data. We incorporated the method into our software package “Protein-Decision” together with a user-friendly graphical interface. A standalone version of Protein-Decision is freely available at http://digbio.missouri.edu/ProteinDecision/ .

19.
Most proteomic labelling technologies intend to improve protein quantification and/or facilitate (de novo) peptide sequencing. We present here a novel stable-isotope labelling method to simultaneously identify and quantify protein components in complex mixtures by specifically derivatizing the N-terminus of proteins with 4-sulphophenyl isothiocyanate (SPITC). Our approach combines protein identification with quantification through differential isotope-coded labelling at the protein N-terminus prior to digestion. The isotope spacing of 6 Da (unlabelled vs. six-fold 13C-labelled tag) between derivatized peptide pairs enables the detection on different MS platforms (MALDI and ESI). Optimisation of the reaction conditions using SPITC was performed on three model proteins. Improved detection of the N-terminally derivatized peptide compared to the native analogue was observed in negative-ion MALDI-MS. Simpler fragmentation patterns compared to native peptides facilitated protein identification. The 13C-labelled SPITC resulted in convenient peptide pair spacing without isotopic overlap and hence facilitated relative quantification by MALDI-TOF/TOF and LC-ESI-MS/MS. The combination of facilitated identification and quantification achieved by differentially isotope-coded N-terminal protein tagging with light/heavy SPITC represents, to our knowledge, a new approach to quantitative proteomics.
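
The 6 Da spacing between light and heavy derivatized peptides suggests a simple way to pick out labelled pairs in a peak list, sketched below in Python. This is an illustration rather than the authors' data processing: it assumes centroided (m/z, intensity) peaks, a known charge state, and an exact spacing of six times the 13C-12C mass difference (about 6.020 Da). The function name, the 0.02 Da tolerance, and the example peaks are hypothetical.

```python
C13_C12_DIFF = 1.003355            # mass difference between 13C and 12C in Da
PAIR_SPACING = 6 * C13_C12_DIFF    # about 6.020 Da for a six-fold 13C tag


def find_labelled_pairs(peaks, charge=1, tol_da=0.02):
    """Return (light_mz, heavy_mz, heavy/light intensity ratio) tuples.

    peaks is an iterable of (m/z, intensity); the m/z spacing of a pair
    shrinks with charge, so the expected spacing is divided by it.
    """
    spacing = PAIR_SPACING / charge
    peaks = sorted(peaks)
    pairs = []
    for i, (light_mz, light_int) in enumerate(peaks):
        for heavy_mz, heavy_int in peaks[i + 1:]:
            delta = heavy_mz - light_mz
            if delta > spacing + tol_da:
                break  # peaks are sorted, so later ones are even further away
            if abs(delta - spacing) <= tol_da and light_int > 0:
                pairs.append((light_mz, heavy_mz, heavy_int / light_int))
    return pairs


if __name__ == "__main__":
    # Hypothetical singly charged MALDI peaks: one light/heavy pair plus noise.
    peak_list = [(1000.50, 200.0), (1006.52, 400.0), (1200.00, 50.0)]
    print(find_labelled_pairs(peak_list))
```

The heavy/light ratio from a single peak pair is only a rough estimate of relative abundance; summing the isotopic envelope of each partner would be a more careful choice.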

20.

Background

Peptide-spectrum matching is a common step in most data processing workflows for mass spectrometry-based proteomics. Many algorithms and software packages, both free and commercial, have been developed to address this task. However, these algorithms typically require the user to select instrument- and sample-dependent parameters, such as mass measurement error tolerances and number of missed enzymatic cleavages. In order to select the best algorithm and parameter set for a particular dataset, in-depth knowledge about the data as well as the algorithms themselves is needed. Most researchers therefore tend to use default parameters, which are not necessarily optimal.

Results

We have applied a new optimization framework for the Taverna scientific workflow management system (http://ms-utils.org/Taverna_Optimization.pdf) to find the best combination of parameters for a given scientific workflow to perform peptide-spectrum matching. The optimizations themselves are non-trivial, as demonstrated by several phenomena that can be observed when allowing for larger mass measurement errors in sequence database searches. On-the-fly parameter optimization embedded in scientific workflow management systems enables experts and non-experts alike to extract the maximum amount of information from the data. The same workflows could be used for exploring the parameter space and comparing algorithms, not only for peptide-spectrum matching, but also for other tasks, such as retention time prediction.

Conclusion

Using the optimization framework, we were able to learn about how the data were acquired as well as about the algorithms explored. We observed a phenomenon in which many ammonia-loss b-ion spectra were identified as peptides with N-terminal pyroglutamate and a large precursor mass measurement error. These insights could only be gained by extending the common range of the mass measurement error tolerance parameters explored by the optimization framework.
