Similar literature
1.
A technique that combines ion mobility spectrometry (IMS) with reversed-phase liquid chromatography (LC), collision-induced dissociation (CID) and mass spectrometry (MS) has been developed. The approach is described as a high throughput means of analysing complex mixtures of peptides that arise from enzymatic digestion of protein mixtures. In this approach, peptides are separated by LC and, as they elute from the column, they are introduced into the gas phase and ionised by electrospray ionisation. The beam of ions is accumulated in an ion trap and then the concentrated ion packet is injected into a drift tube where the ions are separated again in the gas phase by IMS, a technique that differentiates ions based on their mobilities through a buffer gas. As ions exit the drift tube, they can be subjected to collisional activation to produce fragments prior to being introduced into a mass spectrometer for detection. The IMS separation can be carried out in only a few milliseconds and offers a number of advantages compared with LC-MS alone. An example of a single 21-minute LC-IMS-(CID)-MS analysis of the human plasma proteome reveals approximately 20,000 parent ions and approximately 600,000 fragment ions and evidence for 227 unique protein assignments.

2.
Proteomics by mass spectrometry technology is widely used for identifying and quantifying peptides and proteins. The breadth and sensitivity of peptide detection have been advanced by the advent of data-independent acquisition mass spectrometry. Analysis of such data, however, is challenging due to the complexity of fragment ion spectra that have contributions from multiple co-eluting precursor ions. We present SWATHProphet software that identifies and quantifies peptide fragment ion traces in data-independent acquisition data, provides accurate probabilities to ensure results are correct, and automatically detects and removes contributions to quantitation originating from interfering precursor ions. Integration in the widely used open source Trans-Proteomic Pipeline facilitates subsequent analyses such as combining results of multiple data sets together for improved discrimination using iProphet and inferring sample proteins using ProteinProphet. This novel development should greatly help make data-independent acquisition mass spectrometry accessible to large numbers of users.

Mass spectrometry is widely used to identify and quantify protein samples. Proteins are typically cleaved into peptides (either enzymatically or chemically), separated by at least one-dimensional fractionation (e.g. liquid chromatography), and collisionally fragmented, and fragment ions are detected by their unique m/z values in a mass spectrometer (1). Data-dependent acquisition (shotgun) selects individual precursor ions for fragmentation and is limited in its ability to consistently detect large numbers of peptides, particularly those of lower intensity, in samples (2). In contrast, selective reaction monitoring (SRM) is a targeted approach in which known precursor and a set of fragment ions are monitored over time upon selection by mass filters in a triple quadrupole instrument. 
The selected fragment ions in conjunction with the parent ion constitute a highly sensitive molecular assay specific for a precursor ion of interest. Although this strategy has been successfully applied for a large number of biological studies, it is limited by low throughput.

An alternative approach, data-independent acquisition (DIA), aims to overcome the low throughput limitation of SRM while maintaining full quantitative analyses. It selects all ions within a sliding m/z precursor window for fragmentation (3–7) and effectively creates a digital record of the complete peptide contents of the sample. Its increased sensitivity, however, is limited by the challenge of interpreting fragment ion spectra generated from multiple precursors. This can be done by spectral deconvolution followed by database search (1, 8) or by query of the data with preselected fragment ions in a spectral library in a manner similar to targeted approaches such as SRM (3).

Software packages currently available for targeted analysis of DIA MS data with precursor ion assays contained within a spectral library include PeakView™ (Sciex, Framingham, MA), for data generated on a TripleTOF mass spectrometer. The proprietary Spectronaut (Biognosys AG, Zurich, Switzerland) and open source OpenSWATH software (9) are adaptations of the mProphet software suite (10) originally designed for SRM data, and the widely used SRM software Skyline (11) now also incorporates mProphet software to handle DIA MS data. None of these available programs, however, provide validation of results with computed probabilities or detection and removal of fragment ion interferences that give rise to inaccurate quantitation and decreased sensitivity.

Here we present SWATHProphet software that performs these functions in conjunction with a high quality spectral library. SWATHProphet validates results with accurate probabilities of being correct. 
These probabilities serve as input to downstream analyses in the highly developed Trans-Proteomic Pipeline (TPP) (12), such as combining together results of multiple runs for improved discrimination with iProphet (13) and inferring sample proteins with ProteinProphet (14). In addition, SWATHProphet uses these probabilities to help cope with complex spectra by automatically detecting fragment ion interferences and removing them in silico to yield accurate quantitation and adjusted probabilities.
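The reason DIA fragment spectra are chimeric can be illustrated with a minimal sketch of window assignment. The window start (400 m/z), the window width (25 m/z) and the peptide names are assumptions chosen for illustration only; they are not taken from the SWATHProphet description, and real acquisition schemes may use overlapping or variable-width windows.

```python
from collections import defaultdict


def window_index(mz, start=400.0, width=25.0):
    """Index of the fixed-width isolation window that contains mz.

    start and width are illustrative assumptions, not published settings.
    """
    return int((mz - start) // width)


def group_by_window(precursors, start=400.0, width=25.0):
    """Group (name, m/z) precursors by the isolation window they fall into."""
    groups = defaultdict(list)
    for name, mz in precursors:
        groups[window_index(mz, start, width)].append(name)
    return dict(groups)


# Hypothetical precursors: pepA and pepB share a window, so their
# fragments end up in the same fragment ion spectrum.
precursors = [("pepA", 412.3), ("pepB", 421.9), ("pepC", 455.0)]
groups = group_by_window(precursors)
```

Any window holding more than one co-eluting precursor yields a mixed fragment spectrum, which is the situation that interference detection has to handle.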

3.
Unambiguous identification of tandem mass spectra is a cornerstone in mass-spectrometry-based proteomics. As the study of post-translational modifications (PTMs) by means of shotgun proteomics progresses in depth and coverage, the ability to correctly identify PTM-bearing peptides is essential, increasing the demand for advanced data interpretation. Several PTMs are known to generate unique fragment ions during tandem mass spectrometry, the so-called diagnostic ions, which unequivocally identify a given mass spectrum as related to a specific PTM. Although such ions offer tremendous analytical advantages, algorithms to decipher MS/MS spectra for the presence of diagnostic ions in an unbiased manner are currently lacking. Here, we present a systematic spectral-pattern-based approach for the discovery of diagnostic ions and new fragmentation mechanisms in shotgun proteomics datasets. The developed software tool is designed to analyze large sets of high-resolution peptide fragmentation spectra independent of the fragmentation method, instrument type, or protease employed. To benchmark the software tool, we analyzed large higher-energy collisional activation dissociation datasets of samples containing phosphorylation, ubiquitylation, SUMOylation, formylation, and lysine acetylation. Using the developed software tool, we were able to identify known diagnostic ions by comparing histograms of modified and unmodified peptide spectra. Because the investigated tandem mass spectra data were acquired with high mass accuracy, unambiguous interpretation and determination of the chemical composition for the majority of detected fragment ions was feasible. 
Collectively we present a freely available software tool that allows for comprehensive and automatic analysis of analogous product ions in tandem mass spectra and systematic mapping of fragmentation mechanisms related to common amino acids.

In mass spectrometry (MS)-based proteomics, protein mixtures are digested into peptides using standard proteases such as trypsin or Lys-C (1). The complex peptide mixture is separated via liquid chromatography (LC) directly coupled to MS, and the eluting peptide ions are electrosprayed into the vacuum of the mass spectrometer, where a peptide mass spectrum is recorded (2). In the mass spectrometer, selected peptide ions are fragmented, most commonly through the collision of peptide molecular ions with inert gas molecules in a technique referred to as either collision-induced dissociation (CID) or collisionally activated dissociation (3, 4). During this energetic collision, some of the deposited kinetic energy is converted into internal energy, which results in peptide bond breakage and fragmentation of the molecular peptide ion into sequence-specific ions (5). Identification of the analyzed peptide is then performed by scanning the measured peptide mass and list of fragment masses against a protein sequence database (6). Overall this approach provides a rapid and sensitive means of determining the primary sequence of peptides.

During the fragmentation step, various types of fragment ions can be observed in the MS/MS spectrum. Their occurrence depends on the primary sequence of the investigated peptide, the amount of internal energy deposited, how the energy was introduced, the charge state, and other factors (7). Low-energy dissociation conditions as observed in ion trap CID mainly generate fragment ions containing sequence-specific amino acid information about the investigated peptides (8). 
This occurs because the energy deposited during this fragmentation method primarily facilitates the fragmentation of precursor ions yielding single peptide bond fragmentation between individual amino acids (9).

With faster activation methods, such as beam-type/quadrupole CID (10), generated fragments can undergo further collisions. Multiple bonds can thereby be fragmented, giving rise to internal sequence ions, which in combination with regular b- and y-type cleavage produce specific amino-immonium ions (11). These immonium ions appear in the very low m/z range of the MS/MS spectrum, and for the majority of naturally occurring amino acids such immonium ions are unique for that particular residue (12, 13). Exceptions to this are the leucine/isoleucine and lysine/glutamine pairs, which produce immonium ions with the same chemical mass. Overall, immonium ions can confirm the presence of certain amino acid residues in a peptide, whereas information regarding the position or the stoichiometry of these amino acid residues cannot be ascertained. Because tryptic peptides on average contain 9 to 12 amino acids, they frequently contain many different residues; as a result, the analytical information hidden in the regular amino acid immonium ions might be limited. However, immonium ions can be used to support peptide sequence assignment during proteomic database searching (14).

Contrary to the 20 naturally occurring residues, many amino acids can be modified by various post-translational modifications (PTMs), and these PTM-bearing residues can themselves generate unique immonium ions, the so-called diagnostic ions. The two most prominent examples are phosphorylation of tyrosine and acetylation of lysine residues (15), which generate diagnostic ions at m/z = 216.0424 and m/z = 126.0917, respectively. Thus, the presence of these unique ions in a MS/MS spectrum can unequivocally identify the sequenced peptide as harboring a given PTM. 
Evidently, knowledge regarding modification-specific diagnostic ions is of great importance for the identification and validation of modified peptides in MS-based proteomics (16, 17). Additionally, such PTM-specific information can be informative in targeted proteomics approaches facilitating MS/MS precursor ion scanning (18) and become valuable in post-acquisition analysis involving extracted ion chromatograms for specific m/z values. Moreover, information regarding diagnostic ions can be a powerful addition to analytical approaches such as selected reaction monitoring, a targeted technique that relies on ion-filtering capabilities to comprehensively study peptides and PTMs (19).

Currently only a minor subset of modified amino acids has been investigated for diagnostic ions, primarily because of the lack of unbiased methods for mapping such ions in large-scale proteomics experiments. The identification of diagnostic ions is a labor-intensive endeavor, requiring manual interpretation of large numbers of MS/MS spectra for proper validation of low-mass fragmentation ions. As a result, most studies on diagnostic ions have been performed on a few selected synthetic peptides, as the interrogation of larger biological datasets has not been feasible (15, 20).

Here we describe a proteomic approach utilizing a novel algorithm based upon binning of tandem mass spectra for fast and automated mapping of analogously occurring product ions. The developed algorithm is completely independent of instrument type and fragmentation technique employed, but it performs more favorably under experimental conditions that augment the generation of immonium ions. As a result, the performance of the algorithm is benchmarked on data derived from LTQ Orbitrap Velos and Q Exactive mass spectrometers, which exhibit improved HCD performance (21–23). 
HCD has proven to be a powerful fragmentation technique, particularly for PTM analysis (24, 25), as no low mass detection cutoff is observed as compared with fragmentation experiments on ion trap mass spectrometers (26). Moreover, the beam-type energy deposited during HCD fragmentation allows for improved generation of both immonium and other sequence-related ions relative to CID (27, 28). Additionally, HCD experiments are performed at very high resolution, yielding high mass accuracy (<10 ppm) on all detected fragment ions, which allows the algorithm to utilize very narrow mass binning and hence easily determine the exact chemical composition of any novel detected ions.

Briefly, the algorithm takes all significantly identified MS/MS spectra and bins them together in discrete mass bins. As commonly occurring ions, such as immonium and diagnostic ions, will have the same chemical composition and consequently the same m/z, they will cluster in the same mass bins, whereas sequence-specific fragment ions will scatter across the binned mass range. For validation of the presented approach, we mapped known and novel diagnostic ions from a variety of PTM-bearing amino acids, demonstrating the sensitivity and specificity of the method. Moreover, we demonstrate that mass spectral binning additionally can be employed for automated mapping of composition-specific neutral losses from large-scale proteomic experiments.
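The binning idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the published tool: the bin width (0.01 m/z), the two thresholds and the toy peak lists are hypothetical, and a fixed-width grid can split a peak cluster that straddles a bin boundary.

```python
from collections import Counter


def bin_spectra(spectra, bin_width=0.01):
    """Count, per m/z bin, how many spectra have at least one peak in it."""
    counts = Counter()
    for peaks in spectra:
        # A set ensures each spectrum contributes at most once per bin.
        counts.update({round(mz / bin_width) for mz in peaks})
    return counts


def candidate_diagnostic_bins(modified, unmodified, bin_width=0.01,
                              min_fraction=0.5, max_background=0.05):
    """Return bin centres frequent in modified spectra but rare in unmodified ones.

    min_fraction and max_background are illustrative thresholds.
    """
    mod = bin_spectra(modified, bin_width)
    unmod = bin_spectra(unmodified, bin_width)
    hits = []
    for b, n in mod.items():
        frac_mod = n / len(modified)
        frac_unmod = unmod.get(b, 0) / max(len(unmodified), 1)
        if frac_mod >= min_fraction and frac_unmod <= max_background:
            hits.append(round(b * bin_width, 4))
    return sorted(hits)


# Toy peak lists (m/z only): a recurring low-mass ion near 126.09 appears
# in every "modified" spectrum, while the other peaks scatter.
modified = [[126.0917, 300.1], [126.0915, 450.2], [126.0918, 512.3]]
unmodified = [[300.1, 275.2], [129.1, 450.2]]
hits = candidate_diagnostic_bins(modified, unmodified)
```

Sequence-specific fragments fall into different bins from spectrum to spectrum, so only the recurring ion passes the frequency threshold.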

4.
The orbitrap mass analyzer combines high sensitivity, high resolution, and high mass accuracy in a compact format. In proteomics applications, it is used in a hybrid configuration with a linear ion trap (LTQ-Orbitrap) where the linear trap quadrupole (LTQ) accumulates, isolates, and fragments peptide ions. Alternatively, isolated ions can be fragmented by higher energy collisional dissociation. A recently introduced stand-alone orbitrap analyzer (Exactive) also features a higher energy collisional dissociation cell but cannot isolate ions. Here we report that this instrument can efficiently characterize protein mixtures by alternating MS and “all-ion fragmentation” (AIF) MS/MS scans in a manner similar to that previously described for quadrupole time-of-flight instruments. We applied the peak recognition algorithms of the MaxQuant software at both the precursor and product ion levels. Assignment of fragment ions to co-eluting precursor ions was facilitated by high resolution (100,000 at m/z 200) and high mass accuracy. For efficient fragmentation of different mass precursors, we implemented a stepped collision energy procedure with cumulative MS readout. AIF on the Exactive identified 45 of 48 proteins in an equimolar protein standard mixture and all of them when using a small database. The technique also identified proteins with more than 100-fold abundance differences in a high dynamic range standard. When applied to protein identification in gel slices, AIF unambiguously characterized an immunoprecipitated protein that was barely visible by Coomassie staining and quantified it relative to contaminating proteins. AIF on a benchtop orbitrap instrument is therefore an attractive technology for a wide range of proteomics analyses.

Mass spectrometry (MS)-based proteomics is commonly performed in a “shotgun” format where proteins are digested to peptides, which are separated and analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) (1, 2). 
Many peptides typically co-elute from the column and are selected for fragmentation on the basis of their abundance (“data dependent acquisition”). The precursor mass, which can be determined with high mass accuracy in most current instruments, and a list of fragment ions, which are often determined at lower mass accuracy, are together used to identify the peptide in a sequence database. This scheme is the basis of most of current proteomics research from the identification of single protein bands to the comprehensive characterization of entire proteomes. To minimize stochastic effects from the selection of peptides for fragmentation and to maximize coverage in complex mixtures, very high sequencing speed is desirable. Although this is achievable, it requires complex instrumentation, and there is still no guarantee that all peptides in a mixture are fragmented and identified. Illustrating this challenge, when the Association of Biomolecular Resource Facilities (ABRF) and the Human Proteome Organisation (HUPO) conducted studies of protein identification success in different laboratories, results varied (4, 5). Despite using state of the art proteomics workflows, often with extensive fractionation, only a few laboratories correctly identified all of the proteins in an equimolar 49-protein mixture (ABRF) or a 20-protein mixture (HUPO).

As an alternative to data-dependent shotgun proteomics, the mass spectrometer can be operated to fragment the entire mass range of co-eluting analytes. This approach has its roots in precursor ion scanning techniques in which all precursors were fragmented simultaneously either in the source region or in the collision cell, and the appearance of specific “reporter ions” for a modification of interest was recorded (6–8). Several groups reported the identification of peptides from MS scans in conjunction with MS/MS scans without precursor ion selection (9–12). 
Yates and co-workers (13) pursued an intermediate strategy by cycling through the mass range in 10 m/z fragmentation windows. The major challenge of data-independent acquisition is that the direct relationship between precursor and fragments is lost. In most of the above studies, this problem was alleviated by making use of the fact that precursors and fragments have to “co-elute.”

In recent years, data-independent proteomics has mainly been pursued on the quadrupole TOF platform where it has been termed MS^E in analogy to MS^2, MS^3, and MS^n techniques used for fragmenting one peptide at a time. Geromanos and co-workers (14–16) applied MS^E to absolute quantification of proteins in mixtures. Another study showed excellent protein coverage of yeast enolase with data-independent peptide fragmentation where enolase peptide intensities varied over 2 orders of magnitude (17). In a recent comparison of data-dependent and -independent peptide fragmentation, the authors concluded that fragmentation information was highly comparable (18, 19).

Recently, the orbitrap mass analyzer (20–23) has been introduced in a benchtop format without the linear ion trap that normally performs ion accumulation, fragmentation, and analysis of the fragments. This instrument, termed Exactive, was developed for small molecule applications such as metabolite analysis. It can be obtained with a higher energy collisional dissociation (HCD) cell (24), enabling efficient fragmentation but no precursor ion selection. This option is called “all-ion fragmentation” (AIF) by the manufacturer, and this is the term that we use below. We reasoned that the high resolution (100,000 compared with 10,000 in quadrupole TOF) and mass accuracy of this device in both the MS and MS/MS modes might facilitate the analysis of the complex fragmentation spectra generated by dissociating several precursors simultaneously. 
The simplicity and compactness of this instrumentation platform would then make it interesting for diverse proteomics applications.
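The statement that precursors and fragments have to co-elute can be turned into a simple assignment rule. The sketch below uses Pearson correlation of elution profiles; this is one common way to express co-elution and is an assumption here, not necessarily the procedure used with MaxQuant in the work above. The threshold `min_r`, the trace values and the names `P1` and `P2` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length intensity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat trace carries no elution-profile information
    return cov / (sx * sy)


def assign_fragment(fragment_trace, precursor_traces, min_r=0.9):
    """Return the precursor whose elution profile best matches the fragment.

    Returns None if no precursor reaches the (illustrative) threshold min_r.
    """
    best_name, best_r = None, min_r
    for name, trace in precursor_traces.items():
        r = pearson(fragment_trace, trace)
        if r >= best_r:
            best_name, best_r = name, r
    return best_name


# Hypothetical intensity traces sampled over seven consecutive scans.
precursor_traces = {
    "P1": [0, 10, 50, 100, 50, 10, 0],
    "P2": [100, 50, 10, 0, 0, 0, 0],
}
owner = assign_fragment([0, 2, 11, 19, 10, 2, 0], precursor_traces)
```

A fragment whose trace rises and falls with `P1` is assigned to `P1`; a fragment with no matching profile stays unassigned rather than being forced onto the nearest precursor.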

5.
6.
7.
The development of a multidimensional approach involving high-performance liquid chromatography (LC), ion mobility spectrometry (IMS) and tandem mass spectrometry is described for the analysis of complex peptide mixtures. In this approach, peptides are separated based on differences in their LC retention times and mobilities (as ions drift through He) prior to being introduced into a quadrupole/octopole/time-of-flight mass spectrometer. The initial LC separation and IMS dispersion of ions is used to label ions for subsequent fragmentation studies that are carried out for mixtures of ions. The approach is demonstrated by examining a mixture of peptides generated from tryptic digestion of 18 commercially available proteins. Current limitations of this initial study and potential advantages of the experimental approach are discussed.

8.
The use of electron transfer dissociation (ETD) fragmentation for analysis of peptides eluting in liquid chromatography tandem mass spectrometry experiments is increasingly common and can allow identification of many peptides and proteins in complex mixtures. Peptide identification is performed through the use of search engines that attempt to match spectra to peptides from proteins in a database. However, software for the analysis of ETD fragmentation data is currently less developed than equivalent algorithms for the analysis of the more ubiquitous collision-induced dissociation fragmentation spectra. In this study, a new scoring system was developed for analysis of peptide ETD fragmentation data that varies the ion type weighting depending on the precursor ion charge state and peptide sequence. This new scoring regime was applied to the analysis of data from previously published results where four search engines (Mascot, Open Mass Spectrometry Search Algorithm (OMSSA), Spectrum Mill, and X!Tandem) were compared (Kandasamy, K., Pandey, A., and Molina, H. (2009) Evaluation of several MS/MS search algorithms for analysis of spectra derived from electron transfer dissociation experiments. Anal. Chem. 81, 7170–7180). Protein Prospector identified 80% more spectra at a 1% false discovery rate than the most successful alternative searching engine in this previous publication. These results suggest that other search engines would benefit from the application of similar rules.

The recently developed fragmentation approach of electron transfer dissociation (ETD) has become a genuine alternative to the more ubiquitous collision-induced dissociation (CID) for high throughput and high sensitivity proteomic analysis (1–3). 
ETD (4) and the related fragmentation process electron capture dissociation (ECD) (5) have been demonstrated to have particular advantages for the analysis of large peptides and small proteins (6–8) as well as the analysis of peptides bearing labile post-translational modifications (9–11). The results achieved through ETD and ECD analysis have been shown to be highly complementary to those obtained through CID fragmentation analysis, both through increasing confidence in particular identifications of peptides and also by allowing identification of extra components in complex mixtures (10, 12, 13). As CID and ETD can be sequentially or alternatively performed on precursor ions in the same mass spectrometric run, it is expected that the combined use of these two fragmentation analysis techniques will become increasingly common to enable more comprehensive sample analysis.

Software for analysis of CID spectra is significantly more advanced than that for ECD/ETD data. This is partly because the behavior of peptides under CID fragmentation is better characterized and understood so software has been developed that is better able to predict the fragment ions expected. The fragment ion types observed in ETD and ECD are largely known (5, 14, 15), but information about the frequency and peak intensities of the different ion types observed is less well documented.

We recently performed a study to characterize how frequently the different fragment ion types are detected in ETD spectra when analyzing complex digest mixtures produced by proteolytic enzymes or chemical cleavage reagents of different sequence specificity (16). These results were analyzed with respect to precursor charge state and location of basic residues, which were both shown to be significant factors in controlling the fragment ion types observed. 
The results showed that ETD spectra of doubly charged precursor ions produced very different fragment ions depending on the location of a basic residue in the sequence.

Based on this statistical analysis of ETD data from a diverse range of peptides (16), in the present study, a new scoring system was developed and implemented in the search engine Batch-Tag within Protein Prospector that adjusts the weighting for different fragment ion types based on the precursor charge state and the presence of basic amino acid residues at either peptide terminus. The results using this new scoring system were compared with the previous generation of Batch-Tag, which used ion score weightings based on the average frequency of observation of different fragment types in ETD spectra of tryptic peptides and used the same scoring irrespective of precursor charge and sequence. The performance of this new scoring was also compared with those reported by other search engines using results previously published from a large standard data set (17). The new scoring system allowed identification of significantly more spectra than achieved with the previous scoring system. It also assigned 80% more spectra than the most successful of the compared search engines when using the same false discovery rate threshold.

9.
10.

Introduction

Advances in high-resolution mass spectrometry have created renewed interest for studying global lipid biochemistry in disease and biological systems.

Objectives

Here, we present an untargeted 30 min. LC-MS/MS platform that utilizes positive/negative polarity switching to perform unbiased data dependent acquisitions (DDA) via higher energy collisional dissociation (HCD) fragmentation to profile more than 1000–1500 lipid ions mainly from methyl-tert-butyl ether (MTBE) or chloroform:methanol extractions.

Methods

The platform uses C18 reversed-phase chromatography coupled to a hybrid QExactive Plus/HF Orbitrap mass spectrometer and the entire procedure takes ~10 h from lipid extraction to identification/quantification for a data set containing 12 samples (~4 h for a single sample). Lipids are identified by both accurate precursor ion mass and fragmentation features and quantified using LipidSearch and Elements software.

Results

Using this approach, we are able to profile intact lipid ions from up to 18 different main lipid classes and 66 subclasses. We show several studies from different biological sources, including cultured cancer cells, resected tissues from mice such as lung and breast tumors and biological fluids such as plasma and urine.

Conclusions

Using mouse embryonic fibroblasts, we showed that TSC2−/− KD significantly abrogates lipid biosynthesis and that rapamycin can rescue triglyceride (TG) lipids and we show that SREBP−/− shuts down lipid biosynthesis significantly via mTORC1 signaling pathways. We show that in mouse EGFR driven lung tumors, a large number of TGs and phosphatidylmethanol (PMe) lipids are elevated while some phospholipids (PLs) show some of the largest decrease in lipid levels from ~2000 identified lipid ions. In addition, we identified more than 1500 unique lipid species from human blood plasma.

11.
We have developed web-based software for the rapid identification of protein biomarkers of bacterial microorganisms. Proteins from bacterial cell lysates were ionized by matrix-assisted laser desorption ionization (MALDI), mass isolated, and fragmented using a tandem time of flight (TOF-TOF) mass spectrometer. The sequence-specific fragment ions generated were compared to a database of in silico fragment ions derived from bacterial protein sequences whose molecular weights are the same as the nominal molecular weights of the protein biomarkers. A simple peak-matching and scoring algorithm was developed to compare tandem mass spectrometry (MS-MS) fragment ions to in silico fragment ions. In addition, a probability-based significance-testing algorithm (P value), developed previously by other researchers, was incorporated into the software for the purpose of comparison. The speed and accuracy of the software were tested by identification of 10 protein biomarkers from three Campylobacter strains that had been identified previously by bottom-up proteomics techniques. Protein biomarkers were identified using (i) their peak-matching scores and/or P values from a comparison of MS-MS fragment ions with all possible in silico N and C terminus fragment ions (i.e., ions a, b, b-18, y, y-17, and y-18), (ii) their peak-matching scores and/or P values from a comparison of MS-MS fragment ions to residue-specific in silico fragment ions (i.e., in silico fragment ions resulting from polypeptide backbone fragmentation adjacent to specific residues [aspartic acid, glutamic acid, proline, etc.]), and (iii) fragment ion error analysis, which distinguished the systematic fragment ion error of a correct identification (caused by calibration drift of the second TOF mass analyzer) from the random fragment ion error of an incorrect identification.

Food-borne illness is a serious and continuing problem, with an estimated 76 million cases in the United States per year (http://www.cdc.gov). 
It is often caused by bacteria and viruses that are ubiquitous in the environment and are difficult to eliminate due to their ability to adapt. In addition to the resulting morbidity, food-borne illness also has enormous societal costs, including losses in worker productivity due to illness, recall of food products determined (or suspected) to be contaminated, etc. Consequently, there is a critical need to develop rapid and sensitive methods for detection and accurate identification of food-borne pathogens.

A number of techniques have been developed for detection and identification of food-borne pathogens. A relatively recent technique for bacterial identification involves the use of mass spectrometry (MS). Because of its sensitivity and high specificity, MS has become a popular technique for chemicotaxonomic classification of microorganisms (16, 27). The use of MS in the analysis of microorganisms is a relatively recent application that was dramatically accelerated by the development of two ionization techniques in the late 1980s and early 1990s: electrospray ionization (15) and matrix-assisted laser desorption ionization (MALDI) (24, 37). When coupled with time of flight (TOF) MS, MALDI has been demonstrated to be a powerful tool for “fingerprinting” microorganisms by ionization and detection of proteins from intact bacterial cells or extracts resulting from bacterial cell lysis (1, 2, 3, 8-12, 19, 21, 25, 26, 29, 34, 40, 41, 42). Typically, MALDI-TOF MS “fingerprinting” of microorganisms involves analysis using either pattern recognition or bioinformatic algorithms.

Pattern recognition analysis compares MALDI-TOF MS spectra of samples of unknown microorganisms to spectra of known microorganisms. A high degree of similarity between the MS spectrum of an unknown microorganism and an MS spectrum of a known microorganism strongly suggests the identity of the unknown microorganism (22, 39, 43). 
It should be noted that pattern recognition analysis does not rely on actual identification of the biomarker ion peaks in an MS spectrum. It is the pattern generated by multiple ion peaks that constitutes a microorganism's “fingerprint.” The actual identities of individual ion peaks are not specified, and the peaks could be peaks for any of a number of possible biological molecules generated by a microorganism, including proteins, nucleic acids, lipids, etc.

Microorganism identification by bioinformatic analysis of MALDI-TOF MS data involves using the protein molecular weights (MWs) in bacterial genomic databases to assign biomarker ion peaks in a mass spectrum to specific proteins (4, 5, 32, 33, 45). If a significant number of biomarker ion peaks in a mass spectrum correspond to protein MWs for the open reading frames of a microorganism's genome, then the microorganism is considered identified. Such an analysis has also incorporated the simplest and most common posttranslational modification (PTM) observed for bacterial proteins, N-terminal methionine cleavage (5). It should be noted, however, that “identification” of a microorganism relies solely on a sufficient number of protein MWs derived from open reading frames of its genome corresponding to the m/z of biomarker ions in a MALDI-TOF MS spectrum. However, the protein MW alone is not sufficient to definitively identify a biomarker ion as a specific protein. Protein biomarkers are considered to be tentatively assigned instead of definitively identified.

Analysis of samples containing multiple bacterial organisms presents increased challenges for MALDI-TOF MS when protein MW is the sole criterion for protein biomarker identification. Clearly, it would be advantageous if researchers could obtain more information about a biomarker in addition to its MW. 
In the case of protein biomarkers, this can be accomplished by enzymatically digesting a protein in solution and analyzing its tryptic peptides by MS (peptide mass mapping) or by tandem MS (MS-MS) (sequence tags) (45). Alternatively, it is possible to fragment mature, intact proteins (without digestion) in the gas phase to obtain sequence-specific and PTM information. This approach is referred to as top-down proteomics. Until recently, top-down proteomics was possible only if Fourier transform ion cyclotron resonance MS involving complicated gas phase ion dissociation techniques was used (6, 23). Although not originally designed for top-down proteomics, recently developed MALDI-tandem TOF (MALDI-TOF-TOF) MS was shown to fragment small or modest-size proteins (5 kDa < molecular mass < 15 kDa) without prior digestion (28). Demirev and coworkers (7) identified Bacillus atrophaeus and Bacillus cereus spores by fragmenting their protein biomarkers using a MALDI tandem mass spectrometer and comparing the sequence-specific fragment ions generated to in silico fragment ions derived from protein amino acid sequences from genomic databases. Protein and microorganism identities were determined using a probability-based significance-testing algorithm (P value). The P value algorithm calculates the probability that a protein or microorganism identification occurred randomly. The smaller the P value, the lower the probability that an identification occurred randomly. The data analysis was performed using software developed in house (7). In the current study, web-based software and databases, developed in house at the U.S. Department of Agriculture (USDA), were used to identify 10 protein biomarkers from three pure strains of Campylobacter by sequence-specific fragmentation using a MALDI-TOF-TOF mass spectrometer. 
Many of the protein biomarkers had been identified previously by bottom-up proteomics techniques (9, 11, 12), which provided an excellent data set to test the accuracy and performance of the algorithms incorporated into the software. MALDI-TOF-TOF MS-MS fragment ions were compared with a database of in silico fragment ions derived from bacterial protein sequences. The sequence-specific MS-MS fragment ions were used to identify a protein and thus the source microorganism. A simple peak-matching mathematical algorithm, incorporated into the software, was used to score and rank protein and microorganism identifications. In addition, the P value algorithm of Demirev and coworkers (7) was also incorporated into the USDA software (available with execution of appropriate control usage agreement) for comparison to the peak-matching algorithm. The peak-matching algorithm correctly identified a protein biomarker among as many as ∼1,400 possible bacterial proteins and gave rankings for protein identification comparable to the rankings obtained by the more complicated and computationally intensive P value calculation. We often observed enhancement of the score for correct identification when MS-MS fragment ions were compared to residue-specific in silico fragment ions rather than to non-residue-specific in silico fragment ions. In addition, the correctness of the algorithm's identification was, in certain cases, further confirmed by fragment ion error analysis, which compared the random error caused by false matches between MS-MS fragment ions and in silico fragment ions with the systematic error observed for correct matches due to drift in the calibration of the TOF mass analyzer (38). (Portions of this work were presented at the 121st AOAC Conference [13] and at the 55th American Society of Mass Spectrometry Conference [14].)
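The abstract above does not give the formula of the peak-matching algorithm, so the following is only a minimal sketch of one plausible form: the score is taken to be the percentage of observed MS-MS fragment ions that fall within a mass tolerance of any in silico fragment ion of a candidate protein. The function names, the candidate data layout, and the default tolerance are assumptions made for illustration and do not describe the USDA software.

```python
from bisect import bisect_left


def count_matches(observed_mz, in_silico_mz, tol=2.5):
    """Count observed fragment ions lying within +/- tol (m/z) of any in silico ion."""
    theoretical = sorted(in_silico_mz)
    matched = 0
    for mz in observed_mz:
        # First theoretical ion that is not below the lower edge of the tolerance window.
        i = bisect_left(theoretical, mz - tol)
        if i < len(theoretical) and theoretical[i] <= mz + tol:
            matched += 1
    return matched


def peak_match_score(observed_mz, in_silico_mz, tol=2.5):
    """Hypothetical score: percentage of observed ions explained by the candidate."""
    if not observed_mz:
        return 0.0
    return 100.0 * count_matches(observed_mz, in_silico_mz, tol) / len(observed_mz)


def rank_candidates(observed_mz, candidates, tol=2.5):
    """Rank candidate proteins (name -> list of in silico fragment m/z) by score."""
    scored = [
        (name, peak_match_score(observed_mz, fragments, tol))
        for name, fragments in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Ranking every protein in a database this way costs one sorted lookup per observed ion per candidate, which is consistent with the abstract's remark that peak matching is less computationally intensive than the P value calculation.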

12.

Introduction

Tandem mass spectrometry (MS/MS) has been widely used for identifying metabolites in many areas. However, computationally identifying metabolites from MS/MS data is challenging because the fragmentation rules, which determine the precedence of chemical bond dissociation, are largely unknown. Although this problem has been tackled in different ways, the lack of computational tools that can flexibly represent the structures adjacent to chemical bonds remains a long-standing bottleneck for studying fragmentation rules.

Objectives

This study aimed to develop computational methods for investigating fragmentation rules by analyzing annotated MS/MS data.

Methods

We implemented a computational platform, MIDAS-G, for investigating fragmentation rules. MIDAS-G processes a metabolite as a simple graph and uses graph grammars to recognize specific chemical bonds and their adjacent structures. We can apply MIDAS-G to investigate fragmentation rules by adjusting bond weights in the scoring model of the metabolite identification tool and comparing metabolite identification performances.

Results

We used MIDAS-G to investigate four bond types on real annotated MS/MS data. The results matched data collected from wet-lab experiments and the literature, confirming the effectiveness of MIDAS-G.

Conclusion

We developed a computational platform for investigating fragmentation rules of tandem mass spectrometry. This platform is freely available for download.
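The idea of treating a metabolite as a simple graph and recognizing a bond together with its adjacent structure can be sketched as follows. This is an illustrative toy and not the MIDAS-G implementation or its graph grammar: the data layout (a dict of atom elements and a list of bonds), the function names, and the single "neighbouring element" constraint are all assumptions.

```python
from collections import defaultdict


def build_adjacency(bonds):
    """Build adjacency sets from an iterable of (atom_index, atom_index) bonds."""
    adjacency = defaultdict(set)
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency


def find_bonds(atoms, bonds, pair, neighbor_of_first=None):
    """Return bonds whose end elements equal `pair`, as (first_atom, second_atom).

    If `neighbor_of_first` is given, the first atom must also be bonded to at
    least one other atom of that element (a crude "adjacent structure" test).
    """
    adjacency = build_adjacency(bonds)
    hits = []
    for i, j in bonds:
        for a, b in ((i, j), (j, i)):
            if (atoms[a], atoms[b]) != pair:
                continue
            others = adjacency[a] - {b}
            if neighbor_of_first is None or any(
                atoms[k] == neighbor_of_first for k in others
            ):
                hits.append((a, b))
                break  # report each bond at most once
    return hits


# Heavy atoms of acetamide: CH3-C(=O)-NH2 (hydrogens and bond orders omitted).
atoms = {0: "C", 1: "C", 2: "O", 3: "N"}
bonds = [(0, 1), (1, 2), (1, 3)]

# C-N bonds whose carbon also carries an oxygen (amide-like pattern).
amide_like = find_bonds(atoms, bonds, ("C", "N"), neighbor_of_first="O")

# Hypothetical bond weights that a scoring model could then adjust per bond type.
bond_weights = {bond: 2.0 for bond in amide_like}
```

Changing the weight assigned to a recognized bond type and comparing identification performance before and after is the kind of experiment the Methods paragraph describes.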

13.
Histone post-translational modifications (PTMs) have a fundamental function in chromatin biology, as they model chromatin structure and recruit enzymes involved in gene regulation, DNA repair, and chromosome condensation. High throughput characterization of histone PTMs is mostly performed by using nano-liquid chromatography coupled to mass spectrometry. However, limitations in speed and stochastic sampling of data dependent acquisition methods in MS lead to incomplete discrimination of isobaric peptides and loss of low abundant species. In this work, we analyzed histone PTMs with a data-independent acquisition method, namely SWATH™ analysis. This approach allows for MS/MS-based quantification of all analytes without upfront assay development and no issues of biased and incomplete sampling. We purified histone proteins from human embryonic stem cells and mouse trophoblast stem cells before and after differentiation, and prepared them for MS analysis using the propionic anhydride protocol. Results on histone H3 peptides verified that sequential window acquisition of all theoretical mass spectra could accurately quantify peptides (<9% average coefficient of variation, CV) over four orders of magnitude, and we could discriminate isobaric and co-eluting peptides (e.g. H3K18ac and H3K23ac) using MS/MS-based quantification. This method provided high sensitivity and precision, supported by the fact that we could find significant differences for remarkably low abundance PTMs such as H3K9me2S10ph (relative abundance <0.02%). We performed relative quantification for few sample peptides using different fragment ions and observed high consistency (CV <15%) between the fragments. This indicated that different fragment ions can be used independently to achieve the same peptide relative quantification. 
Taken together, sequential window acquisition of all theoretical mass spectra proved to be an easy-to-use MS acquisition method to perform high quality MS/MS-based quantification of histone-modified peptides.Chromatin is a highly organized and dynamic entity in cell nuclei, mostly composed of DNA and histone proteins. Its structure directly influences gene expression, DNA repair, and cell duplication events such as mitosis and meiosis (1). Histones are assembled in octamers named nucleosomes, wrapped by DNA every ∼200 base pairs. Histones are heavily modified by dynamic post-translational modifications (PTMs)1, which affect chromatin structure because of their chemical properties and their ability to recruit chromatin modifier enzymes and binding proteins (2). Moreover, histone PTMs can be inherited through cell division and thus are crucial components of epigenetic memory (3). The function of histone PTMs has been extensively studied in the last 15–20 years, and several links have been found between aberrations of histone PTM levels and development of diseases (4, 5). Such discoveries revealed the importance of histone PTMs in fine-tuning cell phenotype. Because of this, technology has been rapidly evolving to investigate histone PTM relative abundance with higher accuracy and throughput.Mass spectrometry (MS)-based strategies have continuously evolved toward higher throughput and flexibility, allowing not only identification and quantification of single histone PTMs, but also their combinatorial patterns and even characterization of the intact proteins (reviewed in (6, 7)). For histone analysis, a widely adopted workflow for nano-liquid chromatography–tandem mass spectrometry (nLC-MS/MS) includes derivatization of lysine residue side chains with propionic anhydride, proteolytic digestion with trypsin, and subsequent derivatization of peptide N termini (8, 9). 
Such protocol leads to generation of ArgC-like peptides (only cleaved after arginine residues) after digestion. Moreover, propionylation of N termini increases peptide hydrophobicity, thereby improving LC retention of shorter ones, and thus the MS signal. Because of the high mass accuracy, sensitivity, and the possibility to perform label-free quantification MS has become the technique of choice, outperforming antibody-based strategies, to study both known and novel global histone PTMs.Several acquisition methods have been developed for MS analysis to accomplish different needs of identification and quantification (10). The most widely adopted in shotgun or discovery proteomics is the data-dependent acquisition (DDA) mode. Such acquisition method does not require any previous knowledge about the analyte, as it automatically selects precursor ions detectable at the full scan level in a given order (commonly from the most intense) to perform MS/MS fragmentation (11). Label-free quantification is performed at the full MS scan level by integrating the area of the LC peak from an extracted ion chromatogram of the precursor mass corresponding to the given peptide. On the other hand, the selected reaction monitoring (SRM) mode is the most widely used acquisition method in targeted proteomics. Such method performs cyclic precursor ion selection, MS/MS fragmentation, and product ion selection of a list of masses input by the user. Even though the method preparation is intuitively more complex than DDA, SRM is highly popular because of the high selectivity and sensitivity, which leads to more accurate label-free quantification (12). However, both methods have inevitable drawbacks; a DDA approach cannot perform accurate quantification of isobaric and co-eluting peptides, for example, KacQLATKAAR and KQLATKacAAR (histone H3 aa 9–17), as the fragment ions should be monitored through the entire peptide peak elution to define the ratio between the two similar analytes. 
On the contrary, an SRM experiment prevents future data mining of unpredicted peptides, and thus such method cannot be used for any classical PTM discovery. Therefore, LC-MS/MS analysis of histone peptides is commonly performed by integrating shotgun and targeted acquisition within the same MS method (13). This method requires previous knowledge about retention time and mass of co-eluting isobaric species, and tedious manual peak integration or dedicated software to deconvolute such complex raw data. Although this mixed MS mode is a powerful approach, the targeted sequences in the method reduce the duty cycle and number of DDA MS/MS spectra that can be acquired, making it far from ideal.Data independent acquisition (DIA) modes are a third option that recently gained popularity in proteomics (14, 15). Sequential window acquisition of all theoretical mass spectra (SWATH™)-MS is a data independent workflow that uses a first quadrupole isolation window to step across a mass range, collecting high resolution full scan composite MS/MS at each step and generating an ion map of fragments from all detectable precursor masses (15, 16). From such data set, a virtual SRM, or pseudo-SRM, can be performed by extracting the product ion chromatogram of a given peptide (17) with bioinformatics tools such as Peakview®, Skyline (18), or OpenSWATH™ (19). In order to define which fragment masses should be used to quantify a given peptide, a spectral library of identified peptides can be manually programmed, downloaded (if available), or generated by previous DDA experiments. In terms of quantification power, SWATH™ combines the advantages of both DDA and SRM, as it allows for MS/MS-based label-free quantification, discrimination of isobaric peptides, and subsequent data mining of unpredicted species.Histone proteins are an excellent target sample to test SWATH™, as the peptides are heavily modified by PTMs and often have isobaric proteoforms present. 
We analyzed two model systems with both DDA and SWATH™: (1) extracted histones from untreated (pluripotent) and retinoic acid (RA) treated (differentiated) human embryonic stem cells (hESCs, strain H9), and (2) extracted histones from undifferentiated and differentiated mouse trophoblast stem cells (mTSCs). The results from the DDA experiment were used to evaluate the reproducibility of peptide retention time and the variety of species identified. For the SWATH™ analysis we focused on histone H3, as it is the histone with the highest variety of modified peptides (6). Results highlighted that this acquisition method provides sensitive and precise MS/MS-based quantification of both isobaric and nonisobaric peptides. Our data demonstrate that quantification at the MS/MS level is highly reproducible, and identification of the peptide elution profile is assisted by the high mass accuracy and the large number of overlapping elution profiles of the fragment ions. Moreover, we show that using different fragment ions for MS/MS quantification gave similar quantification results. Thus, we used all unique fragment ions for a given species to provide a robust quantification method, where unique fragment ions are those that belong to only one of the possible isobaric peptide proteoforms. Taken together, we prove that SWATH™-MS is a reliable and simple-to-use acquisition method to perform epigenetic histone PTM analysis.
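As a rough illustration of quantifying two co-eluting isobaric peptides (for example H3K18ac and H3K23ac) from their unique fragment ions, and of the coefficient of variation (CV) used to report precision, consider the sketch below. Summing the unique fragment ion areas is an assumption made here for simplicity; the text states only that all unique fragment ions were used. Function names and inputs are hypothetical.

```python
from statistics import mean, stdev


def relative_abundance(unique_a, unique_b):
    """Fraction of species A, from peak areas of fragment ions unique to A and to B.

    Assumes the summed unique-fragment areas are proportional to abundance.
    """
    total_a = sum(unique_a)
    total_b = sum(unique_b)
    return total_a / (total_a + total_b)


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical peak areas of fragments unique to each of two isobaric peptides.
fraction_a = relative_abundance([300.0, 100.0], [50.0, 50.0])

# Hypothetical replicate measurements of the same peptide.
cv_percent = coefficient_of_variation([9.0, 10.0, 11.0])
```

Computing the ratio separately from each unique fragment and then taking the CV across fragments is one way to check the fragment-to-fragment consistency reported in the abstract.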

14.
Ion clustering and solvation properties in NaCl solutions are explored by molecular dynamics simulations with several popular force fields. The local hydrogen bond structures and the rotation times of water indicate that ions negligibly disturb the hydrogen bond structures and rotational mobility of water beyond the first ion solvation shells. The potential of mean force (PMF) of the ion pair in dilute solution is consistent with the populations of ion clusters in the electrolyte solutions. The aggregation level of ions is sensitive to the force field used in the simulations. The ion-ion interaction potential plays an important role in the formation of the contact ion pair. According to the results from the selected force fields, the entropy of water increases as the two ions of a pair approach each other, and the association of the ion pair is driven by this increase in water entropy. The kinetic transition from the single solvent separated state to the contact ion pair is controlled by the enthalpy loss of the solution.
Figure caption: Ion pairing and ion induction to solvent play an important role in the protein folding and chemical reactions in the water solutions. The existence of ions has a negligible disturbance to the hydrogen bond structures and rotational mobility of water beyond the first ion solvation shells in the NaCl solutions. The clustering level of ions is sensitive to the force field used in the simulations. The formation of NaCl ion pair in the dilute solution is driven by the entropy increment of water.

15.
Quantitative analysis of discovery-based proteomic workflows now relies on high-throughput large-scale methods for identification and quantitation of proteins and post-translational modifications. Advancements in label-free quantitative techniques, using either data-dependent or data-independent mass spectrometric acquisitions, have coincided with improved instrumentation featuring greater precision, increased mass accuracy, and faster scan speeds. We recently reported on a new quantitative method called MS1 Filtering (Schilling et al. (2012) Mol. Cell. Proteomics 11, 202–214) for processing data-independent MS1 ion intensity chromatograms from peptide analytes using the Skyline software platform. In contrast, data-independent acquisitions from MS2 scans, or SWATH, can quantify all fragment ion intensities when reference spectra are available. As each SWATH acquisition cycle typically contains an MS1 scan, these two independent label-free quantitative approaches can be acquired in a single experiment. Here, we have expanded the capability of Skyline to extract both MS1 and MS2 ion intensity chromatograms from a single SWATH data-independent acquisition in an Integrated Dual Scan Analysis approach. The performance of both MS1 and MS2 data was examined in simple and complex samples using standard concentration curves. Cases of interferences in MS1 and MS2 ion intensity data were assessed, as were the differentiation and quantitation of phosphopeptide isomers in MS2 scan data. In addition, we demonstrated an approach for optimization of SWATH m/z window sizes to reduce interferences using MS1 scans as a guide. Finally, a correlation analysis was performed on both MS1 and MS2 ion intensity data obtained from SWATH acquisitions on a complex mixture using a linear model that automatically removes signals containing interferences. 
This work demonstrates the practical advantages of properly acquiring and processing MS1 precursor data in addition to MS2 fragment ion intensity data in a data-independent acquisition (SWATH), and provides an approach to simultaneously obtain independent measurements of relative peptide abundance from a single experiment. Mass spectrometry is the leading technology for large-scale identification and quantitation of proteins and post-translational modifications (PTMs) in biological systems (1, 2). Although several types of experimental designs are employed in such workflows, most large-scale applications use data-dependent acquisitions (DDA) where peptide precursors are first identified in the MS1 scan and one or more peaks are then selected for subsequent fragmentation to generate their corresponding MS2 spectra. In experiments using DDA, one can employ either chemical/metabolic labeling or label-free strategies for relative quantitation of peptides (and proteins) (3, 4). Depending on the type of labeling approach employed, i.e. metabolic labeling with SILAC or postmetabolic labeling with ICAT or isobaric tags such as iTRAQ or TMT, the relative quantitation of these peptides is made using either MS1 or MS2 ion intensity data (4–7). Label-free quantitative techniques have until recently been based entirely on integrated ion intensity measurements of precursors in the MS1 scan, or in the case of spectral counting the number of assigned MS2 spectra (3, 8, 9). Label-free approaches have recently generated more widespread interest (10–12), in part because of their adaptability to a wide range of proteomic workflows, including human samples that are not amenable to most metabolic labeling techniques, or where chemical labeling may be cost prohibitive and/or interfere with subsequent enrichment steps (11, 13). 
However the use of DDA for label-free quantitation is also susceptible to several limitations including insufficient reproducibility because of under-sampling, digestion efficiency, as well as misidentifications (14, 15). Moreover, low ion abundance may prohibit peptide selection, especially in complex samples (14). These limitations often present challenges in data analysis when making comparisons across samples, or when a peptide is sampled in only one of the study conditions.To address the challenges in obtaining more comprehensive sampling in MS1 space, Purvine et al. first demonstrated the ability to obtain sequence information from peptides fragmented across the entire m/z range using “shotgun or parallel collision-induced dissociation (CID)” on an orthogonal time of flight instrument (16). Shortly thereafter Venable et al. reported on a data independent acquisition methodology to limit the complexity of the MS2 scan by using a segmented approach for the sequential isolation and fragmentation of all peptides in a defined precursor window (e.g. 10 m/z) using an ion trap mass spectrometer (17). However, the proper implementation of this DIA technique suffered from technical limitations of instruments available at that time, including slow acquisition rates and low MS2 resolution that made systematic product ion extraction problematic. To alleviate the challenge of long duty cycles in DIAs, researchers at the Waters Corporation adopted an alternative approach by rapidly switching between low (MS1) and high energy (MS2) scans and then using proprietary software to align peptide precursor and fragment ion information to determine peptide sequences (18, 19). Recent mass spectrometry innovations in efficient high-speed scanning capabilities, together with high-resolution data acquisition of both MS1 and MS2 scans, and multiplexing of scan windows have overcome many of these limitations (10, 20, 21). 
Moreover, the simultaneous development of novel software solutions for extracting ion intensity chromatograms based on spectral libraries has enabled the use of DIA for large-scale label free quantitation of multiple peptide analytes (21, 22). In addition to targeting specific peptides from a previously generated peptide spectral library, the data can also be reexamined (i.e. post-acquisition) for additional peptides of interest as new reference data emerges. On the SCIEX TripleTOF 5600, a quadrupole orthogonal time-of-flight mass spectrometer, this technique has been optimized and extended to what is called ‘SWATH MS2′ based on a combination of new technical and software improvements (10, 22).In a DIA experiment a MS1 survey scan is carried out across the mass range followed by a SWATH MS2 acquisition series, however the cycle time of the MS1 scan is dramatically shortened compared with DDA type experiments. The Q1 quadrupole is set to transmit a wider window, typically Δ25 m/z, to the collision cell in incremental steps over the full mass range. Therefore the MS/MS spectra produced during a SWATH MS2 acquisition are of much greater complexity as the MS/MS spectra are a composite of all fragment ions produced from peptide analytes with molecular ions within the selected MS1 m/z window. The cycle of data independent MS1 survey scans and SWATH MS2 scans is repeated throughout the entire LC-MS acquisition. Fragment ion information contained in these SWATH MS2 spectra can be used to uniquely identify specific peptides by comparisons to reference spectra or spectral libraries. Moreover, ion intensities of these fragment ions can also be used for quantitation. 
Although MS2 typically increases selectivity and reduces the chemical noise often observed in MS1 scans, quantifying peptides from SWATH MS2 scans can be problematic because of the presence of interferences in one or more fragment ions or decreased ion intensity of MS2 scans as compared with the MS1 precursor ion abundance. To partially alleviate some of these limitations in SWATH MS2 scan quantitation it is potentially advantageous to exploit MS1 ion intensity data, which is acquired independently as part of each SWATH scan cycle. Recently, our laboratories and others have developed label free quantitation tools for data dependent acquisitions (11, 12, 23) using MS1 ion intensity data. For example, the MS1 Filtering algorithm uses expanded features in the open source software application Skyline (11, 24). Skyline MS1 Filtering processes precursor ion intensity chromatograms of peptide analytes from full scan mass spectral data acquired during data dependent acquisitions by LC MS/MS. New graphical tools were developed within Skyline to enable visual inspection and manual interrogation and integration of extracted ion chromatograms across multiple acquisitions. MS1 Filtering was subsequently shown to have excellent linear response across several orders of magnitude with limits of detection in the low attomole range (11). We, and others, have demonstrated the utility of this method for carrying out large-scale quantitation of peptide analytes across a range of applications (25–28). However, quantifying peptides based on MS1 precursor ion intensities can be compromised by a low signal-to-noise ratio. This is particularly the case when quantifying low abundance peptides in a complex sample where the MS1 ion “background” signal is high, or when chromatograms contain interferences, or partial overlap of multiple target precursor ions. Currently MS1 scans are underutilized or even deemphasized by some vendors during DIA workflows. 
However, we believe an opportunity exists to improve data-independent acquisition (DIA) experiments by including MS1 ion intensity data in the final data processing of LC-MS/MS acquisitions. Therefore, to address this possibility, we have adapted Skyline to efficiently extract and process both precursor and product ion chromatograms for label free quantitation across multiple samples. The graphical tools and features originally developed for SRM and MS1 Filtering experiments have been expanded to process DIA data sets from multiple vendors including SCIEX, Thermo, Waters, Bruker, and Agilent. These expanded features provide a single platform for data mining of targeted proteomics using both the MS1 and MS2 scans that we call Integrated Dual Scan Analysis, or IDSA. As a test of this approach, a series of SWATH MS2 acquisitions of simple and complex mixtures was analyzed on a SCIEX TripleTOF 5600 mass spectrometer. We also investigated the use of MS2 scans for differentiating a case of phosphopeptide isomers that are indistinguishable at the MS1 level. In addition, we investigated whether smaller SWATH m/z windows would provide more reliable quantitative data in these cases by reducing the number of potential interferences. Lastly, we performed a statistical assessment of the accuracy and reproducibility of the estimated (log) fold change of mitochondrial lysates from mouse liver at different concentration levels to better assess the overall value of acquiring MS1 and MS2 data in combination and as independent measurements during DIA experiments.
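The stepping of the Q1 isolation window across the mass range can be pictured with a short sketch. The 25 m/z width comes from the text; the 400 to 1200 m/z range, the half-open window convention, and the function names are assumptions chosen for illustration and do not reflect any instrument's acquisition software.

```python
def swath_windows(start, stop, width=25.0):
    """Return consecutive (low, high) isolation windows covering [start, stop)."""
    windows = []
    low = start
    while low < stop:
        windows.append((low, min(low + width, stop)))
        low += width
    return windows


def window_for(precursor_mz, windows):
    """Return the window whose half-open range [low, high) contains the precursor."""
    for low, high in windows:
        if low <= precursor_mz < high:
            return (low, high)
    return None


# Assumed precursor range of 400-1200 m/z, stepped in 25 m/z windows.
windows = swath_windows(400.0, 1200.0)
selected = window_for(612.3, windows)
```

Every precursor inside the selected window is fragmented together, which is why the resulting MS2 spectrum is a composite; narrowing `width` reduces the number of co-fragmented precursors at the cost of more windows per cycle.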

16.
Immonium ions and immonium-related ions commonly appear in the mass spectra of peptide precursor ions. An overall understanding of the variation of the abundance of these ions is beneficial for the identification of unknown peptides. Here, four peptides from mass spectrometry (MS) of sucrose phosphorylase were selected as precursor ions, and the frequency of immonium ions and immonium-related ions in a dataset containing 130 MS/MS spectra was examined. Immonium ions and immonium-related ions were mainly produced from the further fragmentation of a-, b-, and y-ions. At the optimal collision energy (CE), the immonium ions of leucine at m/z 86, isoleucine at m/z 86, glutamine at m/z 101, arginine at m/z 129, tryptophan at m/z 159, proline at m/z 70, valine at m/z 72, glutamic acid at m/z 102, phenylalanine at m/z 120, and tyrosine at m/z 136, as well as the immonium-related ions of methionine at m/z 61, lysine at m/z 84, glutamine at m/z 84, and tyrosine at m/z 91, were present in higher abundance and had a higher confidence level, and were therefore good indicators of the corresponding amino acid residues. The immonium ions of serine at m/z 60 and threonine at m/z 74, although of lower abundance, were stable at high CE and had a higher confidence level, indicating the presence of serine and threonine residues, respectively. The immonium ion of asparagine at m/z 87 was also a good indicator of the presence of an asparagine residue.
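The diagnostic m/z values listed in this abstract lend themselves to a simple lookup table. The sketch below maps nominal m/z to candidate residues using only the values quoted above; matching by rounding to the nearest integer m/z is an assumption made for brevity (real data would call for a proper mass tolerance), and the function name is hypothetical.

```python
# Nominal m/z of immonium ions, as listed in the abstract above.
IMMONIUM = {
    60: ["Ser"], 70: ["Pro"], 72: ["Val"], 74: ["Thr"],
    86: ["Leu", "Ile"], 87: ["Asn"], 101: ["Gln"], 102: ["Glu"],
    120: ["Phe"], 129: ["Arg"], 136: ["Tyr"], 159: ["Trp"],
}

# Nominal m/z of immonium-related ions, as listed in the abstract above.
IMMONIUM_RELATED = {61: ["Met"], 84: ["Lys", "Gln"], 91: ["Tyr"]}


def suggest_residues(peaks_mz):
    """Return the sorted residues suggested by low-mass peaks (nominal m/z match)."""
    found = set()
    for mz in peaks_mz:
        nominal = round(mz)
        found.update(IMMONIUM.get(nominal, []))
        found.update(IMMONIUM_RELATED.get(nominal, []))
    return sorted(found)


suggested = suggest_residues([86.097, 120.081, 84.081, 500.3])
```

Note that a peak at m/z 86 cannot distinguish leucine from isoleucine, and m/z 84 is shared by lysine and glutamine, so the lookup returns both candidates in each case.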

17.
An important step in mass spectrometry (MS)-based proteomics is the identification of peptides by their fragment spectra. Regardless of the identification score achieved, almost all tandem-MS (MS/MS) spectra contain remaining peaks that are not assigned by the search engine. These peaks may be explainable by human experts but the scale of modern proteomics experiments makes this impractical. In computer science, Expert Systems are a mature technology to implement a list of rules generated by interviews with practitioners. We here develop such an Expert System, making use of literature knowledge as well as a large body of high mass accuracy and pure fragmentation spectra. Interestingly, we find that even with high mass accuracy data, rule sets can quickly become too complex, leading to over-annotation. Therefore we establish a rigorous false discovery rate, calculated by random insertion of peaks from a large collection of other MS/MS spectra, and use it to develop an optimized knowledge base. This rule set correctly annotates almost all peaks of medium or high abundance. For high resolution HCD data, median intensity coverage of fragment peaks in MS/MS spectra increases from 58% by search engine annotation alone to 86%. The resulting annotation performance surpasses a human expert, especially on complex spectra such as those of larger phosphorylated peptides. Our system is also applicable to high resolution collision-induced dissociation data. It is available both as a part of MaxQuant and via a webserver that only requires an MS/MS spectrum and the corresponding peptide sequence, and which outputs publication quality, annotated MS/MS spectra (www.biochem.mpg.de/mann/tools/). It provides expert knowledge to beginners in the field of MS-based proteomics and helps advanced users to focus on unusual and possibly novel types of fragment ions. In MS-based proteomics, peptides are matched to peptide sequences in databases using search engines (1–3). 
Statistical criteria are established for accepted versus rejected peptide spectra matches based on the search engine score, and usually a 99% certainty is required for reported peptides. The search engines typically only take sequence specific backbone fragmentation into account (i.e. a, b, and y ions) and some of their neutral losses. However, tandem mass spectra, especially of larger peptides, can be quite complex and contain a number of medium or even high abundance peptide fragments that are not annotated by the search engine result. This can result in uncertainty for the user, especially if only relatively few peaks are annotated, because it may reflect an incorrect identification. However, the most common cause of unlabeled peaks is that another peptide was present in the precursor selection window and was cofragmented. This has variously been termed “chimeric spectra” (4–6), or the problem of low precursor ion fraction (PIF) (7). Such spectra may still be identifiable with high confidence. The Andromeda search engine in MaxQuant, for instance, attempts to identify a second peptide in such cases (8, 9). However, even “pure” spectra (those with a high PIF) often still contain many unassigned peaks. These can be caused by different fragment types, such as internal ions, single or combined neutral losses as well as immonium and other ion types in the low mass region. A mass spectrometric expert can assign many or all of these peaks, based on expert knowledge of fragmentation and manual calculation of fragment masses, resulting in a higher degree of confidence for the identification. However, there are more and more practitioners of proteomics without in depth training or experience in annotating MS/MS spectra and such annotation would in any case be prohibitive for hundreds of thousands of spectra. 
Furthermore, even human experts may wrongly annotate a given peak, especially with low mass accuracy tandem mass spectra, or fail to consider every possibility that could have resulted in this fragment mass.

Given the desirability of annotating fragment peaks to the highest degree possible, we turned to “Expert Systems,” a well-established technology in computer science. Expert Systems achieved prominence in the 1970s and 1980s and were meant to solve complex problems by reasoning about knowledge (10, 11). Interestingly, one of the first examples was developed by Nobel Prize winner Joshua Lederberg more than 40 years ago, and dealt with the interpretation of mass spectrometric data. The program's name was Heuristic DENDRAL (12), and it was capable of interpreting the mass spectra of aliphatic ethers and their fragments. The hypotheses produced by the program described molecular structures that are plausible explanations of the data. To infer these explanations from the data, the program incorporated a theory of chemical stability that provided limiting constraints as well as heuristic rules.

In general, the aim of an Expert System is to encode knowledge extracted from professionals in the field in question. This then powers a rule-based system that can be applied broadly and in an automated manner. A rule-based Expert System represents the information obtained from human specialists in the form of IF-THEN rules. These are used to perform operations on input data to reach an appropriate conclusion. A generic Expert System is essentially a computer program that provides a framework for performing a large number of inferences in a predictable way, using forward or backward chains, backtracking, and other mechanisms (13). Therefore, in contrast to statistics-based learning, the “expert program” does not know what it knows through the raw volume of facts in the computer's memory. 
Instead, like a human expert, it relies on a reasoning-like process of applying an empirically derived set of rules to the data.

Here we implemented an Expert System for the interpretation of high mass accuracy tandem mass spectrometry data of peptides. It was developed in an iterative manner together with human experts on peptide fragmentation, using the published literature on fragmentation pathways as well as large data sets of higher-energy collisional dissociation (HCD) (14) and collision-induced dissociation (CID) based peptide identifications. Our goal was to achieve an annotation performance similar to or better than that of experienced mass spectrometrists (15), thus making comprehensively annotated peptide spectra available in large-scale proteomics.
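
The annotation and false-discovery ideas described above can be illustrated with a small sketch. This is not the authors' Expert System or its rule set: it only applies two textbook rules (singly charged b and y ions) within a ppm tolerance, computes intensity coverage, and checks how often inserted decoy peaks are annotated by chance. The function names, the 20 ppm tolerance, and the decoy-rate definition are assumptions made for illustration.

```python
# Hypothetical sketch: b/y annotation, intensity coverage, decoy annotation rate.
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)

# Monoisotopic amino acid residue masses (Da)
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def by_ions(seq):
    """Theoretical singly charged b and y fragment m/z values."""
    masses = [RESIDUE[a] for a in seq]
    ions = {}
    for i in range(1, len(seq)):
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions

def annotate(peaks, ions, ppm=20.0):
    """IF a peak lies within `ppm` of a theoretical ion THEN label it."""
    labels = {}
    for mz, _intensity in peaks:
        for label, theo in ions.items():
            if abs(mz - theo) / theo * 1e6 <= ppm:
                labels[mz] = label
                break
    return labels

def intensity_coverage(peaks, labels):
    """Fraction of total peak intensity that received an annotation."""
    total = sum(i for _, i in peaks)
    return sum(i for mz, i in peaks if mz in labels) / total

def decoy_annotation_rate(decoy_mzs, ions, ppm=20.0):
    """Fraction of inserted decoy peaks that the rules annotate by chance."""
    hits = annotate([(mz, 1.0) for mz in decoy_mzs], ions, ppm)
    return len(hits) / len(decoy_mzs)
```

A rule set that annotates many decoy peaks is over-annotating, so in this sketch adding a rule would only be justified if coverage rises while the decoy rate stays low.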

18.

Background

Analysis of multiple LC-MS based metabolomic studies is carried out to determine overlaps and differences among various experiments. For example, in large metabolic biomarker discovery studies involving hundreds of samples, it may be necessary to conduct multiple experiments, each involving a subset of the samples due to technical limitations. The ions selected from each experiment are analyzed to determine overlapping ions. One of the challenges in comparing the ion lists is the presence of a large number of derivative ions such as isotopes, adducts, and fragments. These derivative ions and the retention time drifts need to be taken into account during comparison.

Results

We implemented an ion annotation-assisted method to determine overlapping ions in the presence of derivative ions. Following this, each ion is represented by the monoisotopic mass of its cluster. This mass is then used to determine overlaps among the ions selected across multiple experiments.
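
A minimal sketch of this idea, assuming each ion has already been annotated with an adduct type: ions are converted to a neutral monoisotopic mass and then matched across two experiments within a mass tolerance and a retention-time window. The adduct table, the tolerances, and the function names are illustrative assumptions, not the published implementation, and only singly charged adducts are handled.

```python
# Hypothetical sketch: compare ion lists by neutral monoisotopic mass.
# Mass shifts (Da) for a few singly charged positive-mode adducts.
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+NH4]+": 18.033823,
}

def neutral_mass(mz, adduct):
    """Neutral monoisotopic mass of a singly charged annotated ion."""
    return mz - ADDUCT_SHIFT[adduct]

def overlaps(exp_a, exp_b, ppm=10.0, rt_tol=15.0):
    """Index pairs (i, j) of ions that share a neutral mass and elute close together.

    Each experiment is a list of (mz, adduct, retention_time_seconds).
    """
    pairs = []
    for i, (mz_a, adduct_a, rt_a) in enumerate(exp_a):
        mass_a = neutral_mass(mz_a, adduct_a)
        for j, (mz_b, adduct_b, rt_b) in enumerate(exp_b):
            mass_b = neutral_mass(mz_b, adduct_b)
            same_mass = abs(mass_a - mass_b) / mass_a * 1e6 <= ppm
            if same_mass and abs(rt_a - rt_b) <= rt_tol:
                pairs.append((i, j))
    return pairs
```

For example, a protonated ion at m/z 181.0707 and a sodiated ion at m/z 203.0526 differ by about 22 Da as observed, yet both map to a neutral mass near 180.0634 Da and are reported as one overlapping compound.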

Conclusion

The resulting ion list provides better coverage and more accurate identification of metabolites compared to the traditional method in which overlapping ions are selected on the basis of individual ion mass.

19.

Introduction

Mass spectrometry is the current technique of choice in studying drug metabolism. High-resolution mass spectrometry in combination with MS/MS gas-phase experiments has the potential to contribute to rapid advances in this field. However, the data emerging from such fragmentation spectral files pose challenges to downstream analysis, given their complexity and size.

Objectives

This study aims to detect and visualize antihypertensive drug metabolites in untargeted metabolomics experiments based on the spectral similarity of their fragmentation spectra. Furthermore, spectral clusters of endogenous metabolites were also examined.

Methods

Here we apply a molecular networking approach to search for drugs and their metabolites in fragmentation spectra from urine derived from a cohort of 26 patients on antihypertensive therapy. The mass spectrometry data were collected on a Thermo Q-Exactive coupled to pHILIC chromatography using data-dependent analysis (DDA) MS/MS gas-phase experiments.
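
The core of a spectral-similarity network can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: it bins fragment m/z values, scores pairs of spectra with a plain cosine similarity, and connects spectra whose score exceeds a threshold. Molecular networking tools typically use a modified cosine that also accounts for precursor mass differences; the bin width, the threshold, and the function names here are assumptions.

```python
# Hypothetical sketch: cosine similarity between MS/MS spectra and network edges.
import math

def binned(spectrum, width=0.01):
    """Map a list of (mz, intensity) peaks to {bin_index: summed intensity}."""
    bins = {}
    for mz, intensity in spectrum:
        key = round(mz / width)
        bins[key] = bins.get(key, 0.0) + intensity
    return bins

def cosine(spec_a, spec_b, width=0.01):
    """Cosine similarity of two binned spectra (0 = no shared peaks, 1 = identical shape)."""
    a, b = binned(spec_a, width), binned(spec_b, width)
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def network_edges(spectra, threshold=0.7):
    """Pairs of spectrum names whose similarity reaches the threshold.

    `spectra` is a dict of name -> list of (mz, intensity).
    """
    names = sorted(spectra)
    edges = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if cosine(spectra[first], spectra[second]) >= threshold:
                edges.append((first, second))
    return edges
```

Connected groups in the resulting edge list correspond to the spectral clusters discussed in the abstract, for example a parent drug and metabolites that share its fragmentation pattern.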

Results

In total, 165 separate drug metabolites were found and structurally annotated (17 by spectral matching and 122 by classification based on a clustered fragmentation pattern). The clusters could be traced to 13 drugs including the known antihypertensives verapamil, losartan and amlodipine. The molecular networking approach also generated clusters of endogenous metabolites, including carnitine derivatives, and conjugates containing glutamine, glutamate and trigonelline.

Conclusions

The approach offers unprecedented capability in the untargeted identification of drugs and their metabolites at the population level and has great potential to contribute to understanding stratified responses to drugs where differences in drug metabolism may determine treatment outcome.

20.
Isobaric labeling techniques coupled with high-resolution mass spectrometry have been widely employed in proteomic workflows requiring relative quantification. For each high-resolution tandem mass spectrum (MS/MS), isobaric labeling techniques can be used not only to quantify the peptide from different samples by reporter ions, but also to identify the peptide it is derived from. Because the ions related to isobaric labeling may act as noise in database searching, the MS/MS spectrum should be preprocessed before peptide or protein identification. In this article, we demonstrate that there are many high-frequency, high-abundance isobaric related ions in the MS/MS spectrum, and that removing isobaric related ions combined with deisotoping and deconvolution in MS/MS preprocessing procedures significantly improves the peptide/protein identification sensitivity. The user-friendly software package TurboRaw2MGF (v2.0) has been implemented for converting raw TIC data files to mascot generic format files and can be downloaded for free from https://github.com/shengqh/RCPA.Tools/releases as part of the software suite ProteomicsTools. The data have been deposited to the ProteomeXchange with identifier PXD000994.

Mass spectrometry-based proteomics has been widely applied to investigate protein mixtures derived from tissue, cell lysates, or body fluids (1, 2). Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is the most popular strategy for protein/peptide mixture analysis in shotgun proteomics (3). Large-scale protein/peptide mixtures are separated by liquid chromatography followed by online detection by tandem mass spectrometry. The capabilities of proteomics rely greatly on the performance of the mass spectrometer. With the improvement of MS technology, proteomics has benefited significantly from high resolution and excellent mass accuracy (4). 
In recent years, based on the higher efficiency of higher energy collision dissociation (HCD), a new “high–high” strategy (high-resolution MS as well as high-resolution MS/MS) has been applied instead of the “high–low” strategy (high-resolution MS, e.g. in the Orbitrap, and low-resolution MS/MS, e.g. in the ion trap) to obtain high quality MS/MS data as well as full MS data in shotgun proteomics. Both full MS scans and MS/MS scans can be performed, and the whole cycle time of MS detection is compatible with the chromatographic time scale (5).

High-resolution measurement is one of the most important features in mass spectrometric applications. In this high–high strategy, high-resolution and accurate spectra are acquired in MS/MS scans as well as full MS scans, which makes isotopic peaks distinguishable from one another and thus enables the easy calculation of precise charge states and monoisotopic masses. During an LC-MS/MS experiment, a multiply charged precursor ion (peptide) is usually isolated and fragmented, and fragment ions in multiple charge states are then generated and collected. After full extraction of peak lists from the original tandem mass spectra, the commonly used search engines (e.g. Mascot (6), Sequest (7)) have no capability to distinguish isotopic peaks and recognize charge states, so all of the product ions are considered under all charge-state hypotheses during the database search for protein identification. These multiple charge states of fragment ions and their isotopic cluster peaks can be incorrectly assigned by the search engine, which can cause false peptide identification. To overcome this issue, data preprocessing of the high-resolution MS/MS spectra is required before submitting them for identification. There are usually two major preprocessing steps used for high-resolution MS/MS data: deisotoping and deconvolution (8, 9). Deisotoping of spectra removes all peaks of an isotopic cluster except the monoisotopic peak. 
Deconvolution of spectra translates multiply charged ions to singly charged ions and also accumulates the intensity of fragment ions by summing up all the intensities from their multiply charged states. After performing these two data-preprocessing steps, the resulting spectra are simpler and cleaner and allow more precise database searching and accurate bioinformatics analysis.

With the capacity to analyze multiple samples simultaneously, stable isotope labeling approaches have been widely used in quantitative proteomics. Stable isotope labeling approaches are categorized as metabolic labeling (SILAC, stable isotope labeling by amino acids in cell culture) and chemical labeling (10, 11). The peptides labeled by the SILAC approach are quantified by precursor ions in full MS spectra, whereas peptides that have been isobarically labeled using chemical means are quantified by reporter ions in MS/MS spectra. There are two similar isobaric chemical labeling methods: (1) isobaric tag for relative and absolute quantification (iTRAQ), and (2) tandem mass tag (TMT) (12, 13). These reagents contain an amino-reactive group that specifically reacts with N-terminal amino groups and epsilon-amino groups of lysine residues to label digested peptides in a typical shotgun proteomics experiment. Isobaric tags are available in four multiplexing formats: TMT two-plex, iTRAQ four-plex, TMT six-plex, and iTRAQ eight-plex (12–16). The number before “plex” denotes the number of samples that can be analyzed simultaneously in a single mass spectrometric run. Peptides labeled with different isotopic variants of the tag show identical or similar mass and appear as a single peak in full scans. This single peak may be selected for subsequent MS/MS analysis. 
In an MS/MS scan, the masses of the reporter ions (114 to 117 for iTRAQ four-plex, 113 to 121 for iTRAQ eight-plex, and 126 to 131 for TMT six-plex upon CID or HCD activation) are associated with the corresponding samples, and the intensities represent the relative abundances of the labeled peptides. Meanwhile, the other ions from the MS/MS spectra can be used for peptide identification. Because of the multiplexing capability, isobaric labeling methods combined with bottom-up proteomics have been widely applied for accurate quantification of proteins on a global scale (14, 17–19). Although mostly associated with peptide labeling, these isobaric labeling methods have also been applied at the protein level (20–23).

For the proteomic analysis of isobarically labeled peptides/proteins in the “high–high” MS strategy, the common consensus is that accurate reporter ions can contribute to more accurate quantification. However, there is no evidence to show how the ions related to isobaric labeling affect peptide/protein identification and what preprocessing steps should be taken for high-resolution isobarically labeled MS/MS spectra. To demonstrate the effectiveness and importance of preprocessing, we examined how the combination of preprocessing steps improved peptide/protein identification sensitivity in database searching. Several combinations of data-preprocessing steps were applied for high-throughput data analysis, including deisotoping to keep only monoisotopic peaks, deconvolution of ions with multiple charge states, and preservation of the top 10 peaks in every 100 Dalton mass range. After systematic analysis of high-resolution isobarically labeled spectra, we further processed the spectra and removed interfering ions that were not related to the peptide. Our results suggested that the preprocessing of isobarically labeled high-resolution tandem mass spectra significantly improved the peptide/protein identification sensitivity.
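
The preprocessing steps named above can be sketched in a few lines. This is an illustration under stated assumptions, not the TurboRaw2MGF implementation: peaks are assumed to be already deisotoped and to carry a known charge, the reporter-ion window passed to `remove_window` is a hypothetical choice, and all function names are invented for this example. Charge deconvolution uses the standard relation m/z(1+) = z × m/z − (z − 1) × 1.007276.

```python
# Hypothetical sketch: charge deconvolution, reporter-ion removal, top-N filtering.
PROTON = 1.007276  # mass of a proton (Da)

def to_singly_charged(mz, z):
    """Convert the m/z of a z-charged ion to its singly charged equivalent."""
    return mz * z - (z - 1) * PROTON

def deconvolute(peaks, ppm=10.0):
    """Map (mz, intensity, charge) peaks to 1+ and sum intensities of coinciding ions."""
    converted = sorted((to_singly_charged(mz, z), inten) for mz, inten, z in peaks)
    merged = []
    for mz, inten in converted:
        if merged and abs(mz - merged[-1][0]) / mz * 1e6 <= ppm:
            merged[-1] = (merged[-1][0], merged[-1][1] + inten)
        else:
            merged.append((mz, inten))
    return merged

def remove_window(peaks, low, high):
    """Drop (mz, intensity) peaks inside [low, high], e.g. a reporter-ion region."""
    return [(mz, inten) for mz, inten in peaks if not (low <= mz <= high)]

def top_n_per_window(peaks, n=10, width=100.0):
    """Keep the n most intense peaks in every `width`-Dalton mass range."""
    bins = {}
    for mz, inten in peaks:
        bins.setdefault(int(mz // width), []).append((mz, inten))
    kept = []
    for group in bins.values():
        kept.extend(sorted(group, key=lambda p: p[1], reverse=True)[:n])
    return sorted(kept)
```

Applied in sequence, a doubly charged fragment at m/z 500.0 is folded onto its singly charged counterpart near m/z 998.9927, a reporter peak near m/z 114 is discarded, and the remaining peaks are thinned per 100 Da window before database searching.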
