
1.

A hierarchical MS2/MS3 database search algorithm for automated analysis of phosphopeptide tandem mass spectra

Hua Xu Liwen Wang Larry Sallans Michael A. Freitas 《Proteomics》2009,9(7):1763-1770

A novel hierarchical MS²/MS³ database search algorithm has been developed to analyze MS²/MS³ phosphopeptide proteomic data. The algorithm is incorporated in an automated database search program, MassMatrix. The algorithm matches experimental MS² spectra against a supplied protein database to determine candidate peptide matches. It then matches the corresponding experimental MS³ spectra against those candidate peptide matches. The MS² and MS³ spectra are used in concert to arrive at peptide matches with overall higher confidence rather than combining MS² and MS³ data searched separately. Receiver operating characteristic analysis showed that hierarchical MS²/MS³ database searches with MassMatrix had better sensitivity and specificity than the two‐stage MS²/MS³ database searches obtained with MassMatrix, MASCOT, and X!Tandem. A greater number of true peptide matches at a given false rate were identified by use of this new algorithm for data collected on both LCQ and LTQ‐FTICR mass spectrometers. The additional MS³ spectral data also improved the overall reliability and the number of true positives (TPs) because the TPs of the MS²/MS³ search results had higher scores than those of the MS² search results.
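The two-level matching described in this abstract can be illustrated with a toy sketch. It is not the MassMatrix implementation: the peak-counting score, the `top_n` shortlist size, the tolerance, and the names `count_matches` and `hierarchical_search` are all assumptions made for illustration.

```python
def count_matches(observed_mz, theoretical_mz, tol=0.5):
    """Count theoretical fragment m/z values with an observed peak within tol."""
    return sum(
        1 for t in theoretical_mz
        if any(abs(t - o) <= tol for o in observed_mz)
    )


def hierarchical_search(ms2_peaks, ms3_peaks, candidates, top_n=2, tol=0.5):
    """Rank peptides using MS2 first, then MS2 and MS3 together.

    candidates maps a peptide label to a pair of hypothetical theoretical
    fragment lists: (ms2_fragments, ms3_fragments).
    """
    # Stage 1: score every database peptide against the MS2 spectrum only.
    ms2_scores = {
        name: count_matches(ms2_peaks, frags[0], tol)
        for name, frags in candidates.items()
    }
    shortlist = sorted(ms2_scores, key=ms2_scores.get, reverse=True)[:top_n]

    # Stage 2: match the MS3 spectrum only against the shortlisted peptides
    # and add the two scores, so both spectra support the final ranking.
    combined = {
        name: ms2_scores[name] + count_matches(ms3_peaks, candidates[name][1], tol)
        for name in shortlist
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

In this sketch a peptide that fails the MS² stage is never compared with the MS³ spectrum, which is the contrast with searching the two spectrum types separately and merging the results afterwards.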

2.

Generalized method for probability-based peptide and protein identification from tandem mass spectrometry data and sequence database searching

Ramos-Fernández A Paradela A Navajas R Albar JP 《Molecular & cellular proteomics : MCP》2008,7(9):1748-1754

Tandem mass spectrometry-based proteomics is currently in great need of computational methods that facilitate the elimination of likely false positives in peptide and protein identification. In the last few years, a number of new peptide identification programs have been described, but scores or other significance measures reported by these programs cannot always be directly translated into an easy-to-interpret error rate measurement such as the false discovery rate. In this work we used generalized lambda distributions to model frequency distributions of database search scores computed by MASCOT, X!TANDEM with k-score plug-in, OMSSA, and InsPecT. From these distributions, we could successfully estimate p values and false discovery rates with high accuracy. From the set of peptide assignments reported by any of these engines, we also defined a generic protein scoring scheme that enabled accurate estimation of protein-level p values by simulation of random score distributions and that was also found to yield good estimates of protein-level false discovery rate. The performance of these methods was evaluated by searching four freely available data sets ranging from 40,000 to 285,000 MS/MS spectra.

3.

An evaluation, comparison, and accurate benchmarking of several publicly available MS/MS search algorithms: sensitivity and specificity analysis

Kapp EA Schütz F Connolly LM Chakel JA Meza JE Miller CA Fenyo D Eng JK Adkins JN Omenn GS Simpson RJ 《Proteomics》2005,5(13):3475-3490

MS/MS and associated database search algorithms are essential proteomic tools for identifying peptides. Due to their widespread use, it is now time to perform a systematic analysis of the various algorithms currently in use. Using blood specimens from the HUPO Plasma Proteome Project, we have evaluated five search algorithms with respect to their sensitivity and specificity, and have also accurately benchmarked them based on specified false-positive (FP) rates. Spectrum Mill and SEQUEST performed well in terms of sensitivity, but were inferior to MASCOT, X!Tandem, and Sonar in terms of specificity. Overall, MASCOT, a probabilistic search algorithm, correctly identified most peptides based on a specified FP rate. The rescoring algorithm, PeptideProphet, enhanced the overall performance of the SEQUEST algorithm, as well as provided predictable FP error rates. Ideally, score thresholds should be calculated for each peptide spectrum or minimally, derived from a reversed-sequence search as demonstrated in this study based on a validated data set. The availability of open-source search algorithms, such as X!Tandem, makes it feasible to further improve the validation process (manual or automatic) on the basis of "consensus scoring", i.e., the use of multiple (at least two) search algorithms to reduce the number of FPs.

4.

Monte carlo simulation-based algorithms for analysis of shotgun proteomic data

Xu H Freitas MA 《Journal of proteome research》2008,7(7):2605-2615

Two new statistical models based on Monte Carlo Simulation (MCS) have been developed to score peptide matches in shotgun proteomic data and incorporated in a database search program, MassMatrix (www.massmatrix.net). The first model evaluates peptide matches based on the total abundance of matched peaks in the experimental spectra. The second model evaluates amino acid residue tags within MS/MS spectra. The two models provide complementary scores for peptide matches that result in higher confidence in peptide identification when significant scores are returned from both models. The MCS-based models use a variance reduction technique that improves estimation precision. Due to the high computational expense of MCS-based models, peptide matches were prefiltered by other statistical models before further evaluation by the MCS-based models. Receiver operating characteristic analysis of the data sets confirmed that MCS-based models improved the overall performance of the MassMatrix search software, especially for low-mass accuracy data sets.

5.

Statistical validation of peptide identifications in large-scale proteomics using the target-decoy database search strategy and flexible mixture modeling

Choi H Ghosh D Nesvizhskii AI 《Journal of proteome research》2008,7(1):286-292

Reliable statistical validation of peptide and protein identifications is a top priority in large-scale mass spectrometry based proteomics. PeptideProphet is one of the computational tools commonly used for assessing the statistical confidence in peptide assignments to tandem mass spectra obtained using database search programs such as SEQUEST, MASCOT, or X! TANDEM. We present two flexible methods, the variable component mixture model and the semiparametric mixture model, that remove the restrictive parametric assumptions in the mixture modeling approach of PeptideProphet. Using a control protein mixture data set generated on a linear ion trap Fourier transform (LTQ-FT) mass spectrometer, we demonstrate that both methods improve on parametric models in terms of the accuracy of probability estimates and the power to detect correct identifications while controlling the false discovery rate to the same degree. The statistical approaches presented here require that the data set contain a sufficient number of decoy (known to be incorrect) peptide identifications, which can be obtained using the target-decoy database search strategy.
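The target-decoy strategy mentioned in this abstract is often summarized by a simple counting estimate. The sketch below shows only that basic estimate, not the mixture models of the paper; it assumes decoy hits model incorrect target hits one-to-one, and the function names and the 1% default are illustrative.

```python
def decoy_fdr(scores, is_decoy, threshold):
    """Estimate the FDR among target matches scoring at or above threshold.

    Uses the simple ratio (decoy hits) / (target hits), capped at 1.0.
    """
    targets = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and not d)
    decoys = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and d)
    if targets == 0:
        # No accepted targets: report 0 if nothing passed, otherwise 1.
        return 0.0 if decoys == 0 else 1.0
    return min(1.0, decoys / targets)


def lowest_threshold_at_fdr(scores, is_decoy, max_fdr=0.01):
    """Return the lowest observed score whose estimated FDR is <= max_fdr."""
    for threshold in sorted(set(scores)):
        if decoy_fdr(scores, is_decoy, threshold) <= max_fdr:
            return threshold
    return None
```

With too few decoy hits above the threshold the ratio becomes coarse, which is consistent with the abstract's requirement that the data set contain a sufficient number of decoy identifications.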

6.

Enhancing peptide identification confidence by combining search methods

Alves G Wu WW Wang G Shen RF Yu YK 《Journal of proteome research》2008,7(8):3102-3113

Confident peptide identification is one of the most important components in mass-spectrometry-based proteomics. We propose a method to properly combine the results from different database search methods to enhance the accuracy of peptide identifications. The database search methods included in our analysis are SEQUEST (v27 rev12), ProbID (v1.0), InsPecT (v20060505), Mascot (v2.1), X! Tandem (v2007.07.01.2), OMSSA (v2.0) and RAId_DbS. Using two data sets, one collected in profile mode and one collected in centroid mode, we tested the search performance of all 21 combinations of two search methods as well as all 35 possible combinations of three search methods. The results obtained from our study suggest that properly combining search methods does improve retrieval accuracy. In addition to performance results, we also describe the theoretical framework which in principle allows one to combine many independent scoring methods including de novo sequencing and spectral library searches. The correlations among different methods are also investigated in terms of common true positives, common false positives, and a global analysis. We find that the average correlation strength, between any pairwise combination of the seven methods studied, is usually smaller than the associated standard error. This indicates only weak correlation may be present among different methods and validates our approach in combining the search results. The usefulness of our approach is further confirmed by showing that the average cumulative number of false positive peptides agrees reasonably well with the combined E-value. The data related to this study are freely available upon request.

7.

Phosphorylation-specific MS/MS scoring for rapid and accurate phosphoproteome analysis

Payne SH Yau M Smolka MB Tanner S Zhou H Bafna V 《Journal of proteome research》2008,7(8):3373-3381

The promise of mass spectrometry as a tool for probing signal-transduction is predicated on reliable identification of post-translational modifications. Phosphorylations are key mediators of cellular signaling, yet are hard to detect, partly because of unusual fragmentation patterns of phosphopeptides. In addition to being accurate, MS/MS identification software must be robust and efficient to deal with increasingly large spectral data sets. Here, we present a new scoring function for the Inspect software for phosphorylated peptide tandem mass spectra for ion-trap instruments, without the need for manual validation. The scoring function was modeled by learning fragmentation patterns from 7677 validated phosphopeptide spectra. We compare our algorithm against SEQUEST and X!Tandem on testing and training data sets. At a 1% false positive rate, Inspect identified the greatest total number of phosphorylated spectra, 13% more than SEQUEST and 39% more than X!Tandem. Spectra identified by Inspect tended to score better in several spectral quality measures. Furthermore, Inspect runs much faster than either SEQUEST or X!Tandem, making desktop phosphoproteomics feasible. Finally, we used our new models to reanalyze a corpus of 423,000 LTQ spectra acquired for a phosphoproteome analysis of Saccharomyces cerevisiae DNA damage and repair pathways and discovered 43% more phosphopeptides than the previous study.

8.

PRIMA: peptide robust identification from MS/MS spectra

Liu J Ma B Li M 《Journal of bioinformatics and computational biology》2006,4(1):125-138

In proteomics, tandem mass spectrometry is the key technology for peptide sequencing. However, partially due to the deficiency of peptide identification software, a large portion of the tandem mass spectra are discarded in almost all proteomics centers because they are not interpretable. The problem is more acute with the lower quality data from low-end but more popular devices such as ion trap instruments. In order to deal with noisy and low quality data, this paper develops a systematic machine learning approach to construct a robust linear scoring function, whose coefficients are determined by linear programming. A prototype, PRIMA, was implemented. When tested with large benchmarks of varying qualities, PRIMA consistently has higher accuracy than the commonly used software MASCOT, SEQUEST and X! Tandem.

9.

Semisupervised model-based validation of peptide identifications in mass spectrometry-based proteomics

Choi H Nesvizhskii AI 《Journal of proteome research》2008,7(1):254-265

Development of robust statistical methods for validation of peptide assignments to tandem mass (MS/MS) spectra obtained using database searching remains an important problem. PeptideProphet is one of the commonly used computational tools available for that purpose. An alternative simple approach for validation of peptide assignments is based on addition of decoy (reversed, randomized, or shuffled) sequences to the searched protein sequence database. The probabilistic modeling approach of PeptideProphet and the decoy strategy can be combined within a single semisupervised framework, leading to improved robustness and higher accuracy of computed probabilities even in the case of the most challenging data sets. We present a semisupervised expectation-maximization (EM) algorithm for constructing a Bayes classifier for peptide identification using the probability mixture model, extending PeptideProphet to incorporate decoy peptide matches. Using several data sets of varying complexity, from control protein mixtures to a human plasma sample, and using three commonly used database search programs, SEQUEST, MASCOT, and TANDEM/k-score, we illustrate that more accurate mixture estimation leads to an improved control of the false discovery rate in the classification of peptide assignments.

10.

Identification and characterization of disulfide bonds in proteins and peptides from tandem MS data by use of the MassMatrix MS/MS search engine

Xu H Zhang L Freitas MA 《Journal of proteome research》2008,7(1):138-144

A new database search algorithm has been developed to identify disulfide-linked peptides in tandem MS data sets. The algorithm is included in the newly developed tandem MS database search program, MassMatrix. The algorithm exploits the probabilistic scoring model in MassMatrix to achieve identification of disulfide bonds in proteins and peptides. Proteins and peptides with disulfide bonds can be identified with high confidence without chemical reduction or other derivatization. The approach was tested on peptide and protein standards with known disulfide bonds. All disulfide bonds in the standard set were identified by MassMatrix. The algorithm was further tested on bovine pancreatic ribonuclease A (RNaseA). The 4 native disulfide bonds in RNaseA were detected by MassMatrix with multiple validated peptide matches for each disulfide bond with high statistical scores. Fifteen nonnative disulfide bonds were also observed in the protein digest under basic conditions (pH = 8.0) due to disulfide bond interchange. After minimizing the disulfide bond interchange (pH = 6.0) during digestion, only one nonnative disulfide bond was observed. The MassMatrix algorithm offers an additional approach for the discovery of disulfide bonds from tandem mass spectrometry data.

11.

The standard protein mix database: a diverse data set to assist in the production of improved peptide and protein identification software tools

Klimek J Eddes JS Hohmann L Jackson J Peterson A Letarte S Gafken PR Katz JE Mallick P Lee H Schmidt A Ossola R Eng JK Aebersold R Martin DB 《Journal of proteome research》2008,7(1):96-103

Tandem mass spectrometry (MS/MS) is frequently used in the identification of peptides and proteins. Typical proteomic experiments rely on algorithms such as SEQUEST and MASCOT to compare thousands of tandem mass spectra against the theoretical fragment ion spectra of peptides in a database. The probabilities that these spectrum-to-sequence assignments are correct can be determined by statistical software such as PeptideProphet or through estimations based on reverse or decoy databases. However, many of the software applications that assign probabilities for MS/MS spectra to sequence matches were developed using training data sets from 3D ion-trap mass spectrometers. Given the variety of types of mass spectrometers that have become commercially available over the last 5 years, we sought to generate a data set of reference data covering multiple instrumentation platforms to facilitate both the refinement of existing computational approaches and the development of novel software tools. We analyzed the proteolytic peptides in a mixture of tryptic digests of 18 proteins, named the "ISB standard protein mix", using 8 different mass spectrometers. These include linear and 3D ion traps, two quadrupole time-of-flight platforms (qq-TOF), and two MALDI-TOF-TOF platforms. The resulting data set, which has been named the Standard Protein Mix Database, consists of over 1.1 million spectra in 150+ replicate runs on the mass spectrometers. The data were inspected for quality of separation and searched using SEQUEST. All data, including the native raw instrument and mzXML formats and the PeptideProphet validated peptide assignments, are available at http://regis-web.systemsbiology.net/PublicDatasets/.

12.

LFQuant: A label‐free fast quantitative analysis tool for high‐resolution LC‐MS/MS proteomics data

Changming Xu Ning Li Hui Liu Jie Ma Yunping Zhu Hongwei Xie 《Proteomics》2012,12(23-24):3475-3484

Database searching based methods for label‐free quantification aim to reconstruct the peptide extracted ion chromatogram based on the identification information, which can limit the search space and thus make the data processing much faster. The random effect of the MS/MS sampling can be remedied by cross‐assignment among different runs. Here, we present a new label‐free fast quantitative analysis tool, LFQuant, for high‐resolution LC‐MS/MS proteomics data based on database searching. It is designed to accept raw data in two common formats (mzXML and Thermo RAW), and database search results from mainstream tools (MASCOT, SEQUEST, and X!Tandem), as input data. LFQuant can handle large‐scale label‐free data with fractionation such as SDS‐PAGE and 2D LC. It is easy to use and provides handy user interfaces for data loading, parameter setting, quantitative analysis, and quantitative data visualization. LFQuant was compared with two common quantification software packages, MaxQuant and IDEAL‐Q, on the replication data set and the UPS1 standard data set. The results show that LFQuant outperforms both in terms of precision and accuracy, and consumes significantly less processing time. LFQuant is freely available under the GNU General Public License v3.0 at http://sourceforge.net/projects/lfquant/.

13.

Improving sensitivity by probabilistically combining results from multiple MS/MS search methodologies

Searle BC Turner M Nesvizhskii AI 《Journal of proteome research》2008,7(1):245-253

Database-searching programs generally identify only a fraction of the spectra acquired in a standard LC/MS/MS study of digested proteins. Subtle variations in database-searching algorithms for assigning peptides to MS/MS spectra have been known to provide different identification results. To leverage this variation, a probabilistic framework is developed for combining the results of multiple search engines. The scores for each search engine are first independently converted into peptide probabilities. These probabilities can then be readily combined across search engines using Bayesian rules and the expectation maximization learning algorithm. A significant gain in the number of peptides identified with high confidence with each additional search engine is demonstrated using several data sets of increasing complexity, from a control protein mixture to a human plasma sample, searched using SEQUEST, Mascot, and X! Tandem database-searching programs. The increased rate of peptide assignments also translates into a substantially larger number of protein identifications in LC/MS/MS studies compared to a typical analysis using a single database-search tool.
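One simple way to combine per-engine peptide probabilities with Bayes' rule is sketched below. It is a deliberately simplified stand-in for the framework in the abstract: it assumes the engines are conditionally independent and share one prior, and it omits the expectation maximization step entirely. The function name and the default prior are illustrative.

```python
import math


def combine_probabilities(probs, prior=0.5):
    """Combine per-engine probabilities that a peptide match is correct.

    Naive-Bayes assumption: each engine's probability was computed with the
    same prior, and the engines are independent given the true answer.
    """
    prior_logit = math.log(prior / (1.0 - prior))
    logit = prior_logit
    for p in probs:
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
        # Add each engine's evidence (its log-odds relative to the prior).
        logit += math.log(p / (1.0 - p)) - prior_logit
    return 1.0 / (1.0 + math.exp(-logit))
```

Under these assumptions two engines that each report 0.8 yield a combined probability of 16/17, about 0.94. Real engines share much of their evidence, so the independence assumption tends to overstate confidence.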

14.

MassWiz: a novel scoring algorithm with target-decoy based analysis pipeline for tandem mass spectrometry

Yadav AK Kumar D Dash D 《Journal of proteome research》2011,10(5):2154-2160

Mass spectrometry has made rapid advances in the recent past and has become the preferred method for proteomics. Although many open source algorithms for peptide identification exist, such as X!Tandem and OMSSA, the field has largely been dominated by proprietary software. There is a need for better, freely available, and configurable algorithms that can help in identifying the correct peptides while keeping the false positives to a minimum. We have developed MassWiz, a novel empirical scoring function that gives appropriate weights to major ions, continuity of b-y ions, intensities, and the supporting neutral losses based on the instrument type. We tested MassWiz accuracy on 486,882 spectra from a standard mixture of 18 proteins generated on 6 different instruments downloaded from the Seattle Proteome Center public repository. We compared the MassWiz algorithm with Mascot, Sequest, OMSSA, and X!Tandem at 1% FDR. MassWiz outperformed all in the largest data set (AGILENT XCT) and was second only to Mascot in the other data sets. MassWiz showed good performance in the analysis of high confidence peptides, i.e., those identified by at least three algorithms. We also analyzed a yeast data set containing 106,133 spectra downloaded from the NCBI Peptidome repository and obtained similar results. The results demonstrate that MassWiz is an effective algorithm for high-confidence peptide identification without compromising on the number of assignments. MassWiz is open-source, versatile, and easily configurable.

15.

High-throughput Database Search and Large-scale Negative Polarity Liquid Chromatography–Tandem Mass Spectrometry with Ultraviolet Photodissociation for Complex Proteomic Samples

James A. Madsen Hua Xu Michelle R. Robinson Andrew P. Horton Jared B. Shaw David K. Giles Tamer S. Kaoud Kevin N. Dalby M. Stephen Trent Jennifer S. Brodbelt 《Molecular & cellular proteomics : MCP》2013,12(9):2604-2614

The use of ultraviolet photodissociation (UVPD) for the activation and dissociation of peptide anions is evaluated for broader coverage of the proteome. To facilitate interpretation and assignment of the resulting UVPD mass spectra of peptide anions, the MassMatrix database search algorithm was modified to allow automated analysis of negative polarity MS/MS spectra. The new UVPD algorithms were developed based on the MassMatrix database search engine by adding specific fragmentation pathways for UVPD. The new UVPD fragmentation pathways in MassMatrix were rigorously and statistically optimized using two large data sets with high mass accuracy and high mass resolution for both MS¹ and MS² data acquired on an Orbitrap mass spectrometer for complex Halobacterium and HeLa proteome samples. Negative mode UVPD led to the identification of 3663 and 2350 peptides for the Halo and HeLa tryptic digests, respectively, corresponding to 655 and 645 peptides that were unique when compared with electron transfer dissociation (ETD), higher energy collision-induced dissociation, and collision-induced dissociation results for the same digests analyzed in the positive mode. In sum, 805 and 619 proteins were identified via UVPD for the Halobacterium and HeLa samples, respectively, with 49 and 50 unique proteins identified in contrast to the more conventional MS/MS methods. The algorithm also features automated charge determination for low mass accuracy data, precursor filtering (including intact charge-reduced peaks), and the ability to combine both positive and negative MS/MS spectra into a single search, and it is freely open to the public. The accuracy and specificity of the MassMatrix UVPD search algorithm were also assessed for low resolution, low mass accuracy data on a linear ion trap.
Analysis of a known mixture of three mitogen-activated kinases yielded similar sequence coverage percentages for UVPD of peptide anions versus conventional collision-induced dissociation of peptide cations, and when these methods were combined into a single search, an increase of up to 13% sequence coverage was observed for the kinases. The ability to sequence peptide anions and cations in alternating scans in the same chromatographic run was also demonstrated. Because ETD has a significant bias toward identifying highly basic peptides, negative UVPD was used to improve the identification of the more acidic peptides in conjunction with positive ETD for the more basic species. In this case, tryptic peptides from the cytosolic section of HeLa cells were analyzed by polarity switching nanoLC-MS/MS utilizing ETD for cation sequencing and UVPD for anion sequencing. Relative to searching using ETD alone, positive/negative polarity switching significantly improved sequence coverages across identified proteins, resulting in a 33% increase in unique peptide identifications and more than twice the number of peptide spectral matches.

The advent of new high-performance tandem mass spectrometers equipped with the most versatile collision- and electron-based activation methods and ever more powerful database search algorithms has catalyzed tremendous progress in the field of proteomics. Despite these advances in instrumentation and methodologies, there are few methods that fully exploit the information available from the acidic proteome or acidic regions of proteins. Typical high-throughput, bottom-up workflows consist of the chromatographic separation of complex mixtures of digested proteins followed by online mass spectrometry (MS) and MSⁿ analysis. This bottom-up approach remains the most popular strategy for protein identification, biomarker discovery, quantitative proteomics, and elucidation of post-translational modifications.
To date, proteome characterization via mass spectrometry has overwhelmingly focused on the analysis of peptide cations, resulting in an inherent bias toward basic peptides that easily ionize under acidic mobile phase conditions and positive polarity MS settings. Given that ∼50% of peptides/proteins are naturally acidic and that many of the most important post-translational modifications (e.g. phosphorylation, acetylation, sulfonation, etc.) significantly decrease the isoelectric points of peptides (8), there is a compelling need for better analytical methodologies for characterization of the acidic proteome.

A principal reason for the shortage of methods for peptide anion characterization is the lack of MS/MS techniques suitable for the efficient and predictable dissociation of peptide anions. Although there is a growing array of new ion activation methods for the dissociation of peptides, most have been developed for the analysis of positively charged peptides. Collision-induced dissociation (CID) of peptide anions, for example, often yields unpredictable or uninformative fragmentation behavior, with spectra dominated by neutral losses from both precursor and product ions, resulting in insufficient peptide sequence information. The two most promising new electron-based methods, electron-capture dissociation and electron-transfer dissociation (ETD), are applicable only to positively charged ions, not to anions. Because of the known inadequacy of CID and the lack of feasibility of electron-capture dissociation and ETD for peptide anion sequencing, several alternative MSⁿ methods have been developed recently. Electron detachment dissociation using high-energy electrons to induce backbone cleavages was developed for peptide anions. Another new technique, negative ETD, entails reactions of radical cation reagents with peptide anions to promote electron transfer from the peptide to the reagent that causes radical-directed dissociation.
Activated-electron photodetachment dissociation, an MS³ technique, uses UV irradiation to produce intact peptide radical anions, which are then collisionally activated. Although they represent inroads in the characterization of peptide anions, these methods also suffer from several significant shortcomings. Electron detachment dissociation and activated-electron photodetachment dissociation are both low-efficiency methods that require long averaging cycles and activation times that range from half a second to multiple seconds, impeding the integration of these methods with chromatographic timescales. In addition, the fragmentation patterns frequently yield many high-abundance neutral losses from product ions, which clutter the spectra, and few sequence ions. Recently, we reported the use of 193-nm photons (ultraviolet photodissociation (UVPD)) for peptide anion activation, which was shown to yield rich and predictable fragmentation patterns with high sequence coverage on a fast liquid chromatographic timeline. This method showed promise for a range of peptide charge states (i.e. from 3- to 1-), as well as for both unmodified and phosphorylated species.

Several widely used or commercial database searching techniques are available for automated "bottom-up" analysis of peptide cations; SEQUEST, MASCOT, OMSSA, X! Tandem, and MASPIC are all popular choices and yield comparable results. MassMatrix, a recently introduced searching algorithm, uses a mass accuracy sensitive probability-based scoring scheme for both the total number of matched product ions and the total abundance of matched products. This searching method also utilizes LC retention times to filter false positive peptide matches and has been shown to yield results comparable to or better than those obtained with SEQUEST, MASCOT, OMSSA, and X! Tandem.
Despite the ongoing innovation in automated peptide cation analysis, there is a lack of publicly available methods for automated peptide anion analysis.

In this work, we have modified the mass accuracy sensitive probabilistic MassMatrix algorithms to allow database searching of negative polarity MS/MS spectra. The algorithm is specific to the fragmentation behavior generated from 193-nm UVPD of peptide anions. The UVPD pathways in MassMatrix were rigorously and statistically optimized using two large data sets with high mass accuracy and high mass resolution for both MS¹ and MS² data acquired on an Orbitrap mass spectrometer for complex HeLa and Halo proteome samples. For low mass accuracy/low mass resolution data, we also incorporated a charge-state-filtering algorithm that identifies the charge state of each MS/MS spectrum based on the fragmentation patterns prior to searching. MassMatrix not only can analyze both positive and negative polarity LC-MS/MS files separately, but also can combine files from different polarities and different dissociation methods into a single search, thus maximizing the information content for a given proteomics experiment. The explicit incorporation of mass accuracy in the scores for the UVPD MS/MS spectra of peptide anions increases peptide assignments and identifications. Finally, we showcase the utility of integrating MassMatrix searching with positive/negative polarity MS/MS switching (i.e. data-dependent positive ETD and negative UVPD during a single proteomic LC-MS/MS run). MassMatrix is available to the public as a free search engine online.

16.

Comparison of Mascot and X!Tandem performance for low and high accuracy mass spectrometry and the development of an adjusted Mascot threshold

Brosch M Swamy S Hubbard T Choudhary J 《Molecular & cellular proteomics : MCP》2008,7(5):962-970

It is a major challenge to develop effective sequence database search algorithms to translate molecular weight and fragment mass information obtained from tandem mass spectrometry into high quality peptide and protein assignments. We investigated the peptide identification performance of Mascot and X!Tandem for mass tolerance settings common for low and high accuracy mass spectrometry. We demonstrated that sensitivity and specificity of peptide identification can vary substantially for different mass tolerance settings, but this effect was more significant for Mascot. We present an adjusted Mascot threshold, which allows the user to freely select the best trade-off between sensitivity and specificity. The adjusted Mascot threshold was compared with the default Mascot and X!Tandem scoring thresholds and shown to be more sensitive at the same false discovery rates for both low and high accuracy mass spectrometry data.

17.

Improved proteomic analysis pipeline for LC-ETD-MS/MS using charge enhancing methods

LQ Xie CP Shen MB Liu ZD Chen RY Du GQ Yan HJ Lu PY Yang 《Molecular bioSystems》2012,8(10):2692-2698

Electron transfer dissociation (ETD) is a useful and complementary activation method for peptide fragmentation in mass spectrometry. However, ETD spectra typically receive a relatively low score in the identifications of 2+ ions. To overcome this challenge, we, for the first time, systematically interrogated the benefits of combining ion charge enhancing methods (dimethylation, guanidination, m-nitrobenzyl alcohol (m-NBA) or Lys-C digestion) and differential search algorithms (Mascot, Sequest, OMSSA, pFind and X!Tandem). A simple sample (BSA) and a complex sample (AMJ2 cell lysate) were selected in benchmark tests. Clearly distinct outcomes were observed with different experimental protocols. In the analysis of AMJ2 cell lines, X!Tandem and pFind revealed 92.65% of identified spectra; m-NBA adduction led to a 5-10% increase in average charge state and the most significant increase in the number of successful identifications, and Lys-C treatment generated peptides carrying mostly triple charges. Based on the complementary identification results, we suggest that a combination of m-NBA and Lys-C strategies accompanied by X!Tandem and pFind can greatly improve ETD identification.

18.

Tempest: GPU-CPU computing for high-throughput database spectral matching

Milloy JA Faherty BK Gerber SA 《Journal of proteome research》2012,11(7):3581-3591

Modern mass spectrometers are now capable of producing hundreds of thousands of tandem (MS/MS) spectra per experiment, making the translation of these fragmentation spectra into peptide matches a common bottleneck in proteomics research. When coupled with experimental designs that enrich for post-translational modifications such as phosphorylation and/or include isotopically labeled amino acids for quantification, additional burdens are placed on this computational infrastructure by shotgun sequencing. To address this issue, we have developed a new database searching program that utilizes the massively parallel compute capabilities of a graphical processing unit (GPU) to produce peptide spectral matches in a very high throughput fashion. Our program, named Tempest, combines efficient database digestion and MS/MS spectral indexing on a CPU with fast similarity scoring on a GPU. In our implementation, the entire similarity score, including the generation of full theoretical peptide candidate fragmentation spectra and its comparison to experimental spectra, is conducted on the GPU. Although Tempest uses the classical SEQUEST XCorr score as a primary metric for evaluating similarity for spectra collected at unit resolution, we have developed a new "Accelerated Score" for MS/MS spectra collected at high resolution that is based on a computationally inexpensive dot product but exhibits scoring accuracy similar to that of the classical XCorr.  In our experience, Tempest provides compute-cluster level performance in an affordable desktop computer.
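As a rough illustration of why a dot-product score is computationally inexpensive, the following Python sketch computes a normalized dot product (cosine similarity) between two binned spectra on the CPU. It is a generic textbook formulation under assumed inputs, not Tempest's "Accelerated Score" and not its GPU implementation. The function names, the bin width, and the example peaks are all hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks: list of (mz, intensity) pairs. Returns {bin_index: summed intensity}.
    The bin width is an illustrative choice, not a value from the abstract.
    """
    binned = {}
    for mz, intensity in peaks:
        index = int(mz / bin_width)
        binned[index] = binned.get(index, 0.0) + intensity
    return binned


def normalized_dot_product(observed, theoretical, bin_width=1.0):
    """Cosine similarity between two binned spectra, in the range [0, 1]."""
    a = bin_spectrum(observed, bin_width)
    b = bin_spectrum(theoretical, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical peak lists: (m/z, intensity).
observed = [(100.2, 1.0), (200.7, 2.0), (350.1, 0.5)]
candidate = [(100.4, 1.0), (200.9, 1.0)]
unrelated = [(500.3, 1.0)]
print(normalized_dot_product(observed, candidate))
print(normalized_dot_product(observed, unrelated))
```

Each candidate needs only one multiply-and-add pass over the shared bins, and candidates are independent of one another, which is the kind of workload that maps well onto parallel hardware.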

19.

A collection of open source applications for mass spectrometry data mining

Óscar Gallardo David Ovelleiro Marina Gay Montserrat Carrascal Joaquin Abian 《Proteomics》2014,14(20):2275-2279

We present several bioinformatics applications for the identification and quantification of phosphoproteome components by MS. These applications include a front‐end graphical user interface that combines several Thermo RAW formats to MASCOT Generic Format extractors (EasierMgf), two graphical user interfaces for search engines OMSSA and SEQUEST (OmssaGui and SequestGui), and three applications, one for the management of databases in FASTA format (FastaTools), another for the integration of search results from up to three search engines (Integrator), and another one for the visualization of mass spectra and their corresponding database search results (JsonVisor). These applications were developed to solve some of the common problems found in proteomic and phosphoproteomic data analysis and were integrated in the workflow for data processing and feeding on our LymPHOS database. Applications were designed modularly and can be used standalone. These tools are written in Perl and Python programming languages and are supported on Windows platforms. They are all released under an Open Source Software license and can be freely downloaded from our software repository hosted at GoogleCode.

20.

General framework for developing and evaluating database scoring algorithms using the TANDEM search engine (total citations: 2; self-citations: 0; citations by others: 2)

MacLean B Eng JK Beavis RC McIntosh M 《Bioinformatics (Oxford, England)》2006,22(22):2830-2832

MOTIVATION: Tandem mass spectrometry (MS/MS) identifies protein sequences using database search engines, at the core of which is a score that measures the similarity between peptide MS/MS spectra and a protein sequence database. The TANDEM application was developed as a freely available database search engine for the proteomics research community. To extend TANDEM as a platform for further research on developing improved database scoring methods, we modified the software to allow users to redefine the scoring function and replace the native TANDEM scoring function while leaving the remaining core application intact. Redefinition is performed at run time so multiple scoring functions are available to be selected and applied from a single search engine binary. We introduce the implementation of the pluggable scoring algorithm and also provide implementations of two TANDEM compatible scoring functions, one previously described scoring function compatible with PeptideProphet and one very simple scoring function that quantitative researchers may use to begin their development. This extension builds on the open-source TANDEM project and will facilitate research into and dissemination of novel algorithms for matching MS/MS spectra to peptide sequences. The pluggable scoring schema is also compatible with related search applications P3 and Hunter, which are part of the X! suite of database matching algorithms. The pluggable scores and the X! suite of applications are all written in C++. AVAILABILITY: Source code for the scoring functions is available from http://proteomics.fhcrc.org
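The idea of selecting a scoring function at run time while leaving the rest of the search engine unchanged can be sketched as a small registry. The example below is in Python for brevity, although the abstract states that the actual software is written in C++. It does not reproduce the TANDEM plug-in interface: the registry, the decorator, and both toy scoring functions are hypothetical names introduced only for illustration.

```python
# Hypothetical registry mapping a scorer name to a scoring function.
SCORERS = {}


def register_scorer(name):
    """Decorator that makes a scoring function selectable by name at run time."""
    def decorator(func):
        SCORERS[name] = func
        return func
    return decorator


@register_scorer("shared_peak_count")
def shared_peak_count(observed_mz, theoretical_mz, tolerance=0.5):
    """Count theoretical fragment m/z values that have an observed peak within tolerance."""
    return float(sum(
        1 for t in theoretical_mz
        if any(abs(t - o) <= tolerance for o in observed_mz)
    ))


@register_scorer("matched_fraction")
def matched_fraction(observed_mz, theoretical_mz, tolerance=0.5):
    """Fraction of theoretical fragments matched; 0.0 for an empty candidate."""
    if not theoretical_mz:
        return 0.0
    return shared_peak_count(observed_mz, theoretical_mz, tolerance) / len(theoretical_mz)


def score_candidate(scorer_name, observed_mz, theoretical_mz):
    """Core search step: it looks up the scorer by name and knows nothing else about it."""
    try:
        scorer = SCORERS[scorer_name]
    except KeyError:
        raise ValueError(f"unknown scorer: {scorer_name!r}") from None
    return scorer(observed_mz, theoretical_mz)


# Hypothetical m/z lists for one spectrum and one candidate peptide.
observed = [100.0, 200.0, 300.0]
candidate = [100.2, 250.0, 300.4, 400.0]
for name in sorted(SCORERS):
    print(name, score_candidate(name, observed, candidate))
```

Because `score_candidate` depends only on the registry, a new scoring function can be added without touching the calling code, which mirrors the design goal the abstract describes.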