Similar literature
1.
The use of electron transfer dissociation (ETD) fragmentation for analysis of peptides eluting in liquid chromatography tandem mass spectrometry experiments is increasingly common and can allow identification of many peptides and proteins in complex mixtures. Peptide identification is performed through the use of search engines that attempt to match spectra to peptides from proteins in a database. However, software for the analysis of ETD fragmentation data is currently less developed than equivalent algorithms for the analysis of the more ubiquitous collision-induced dissociation fragmentation spectra. In this study, a new scoring system was developed for analysis of peptide ETD fragmentation data that varies the ion type weighting depending on the precursor ion charge state and peptide sequence. This new scoring regime was applied to the analysis of data from previously published results where four search engines (Mascot, Open Mass Spectrometry Search Algorithm (OMSSA), Spectrum Mill, and X!Tandem) were compared (Kandasamy, K., Pandey, A., and Molina, H. (2009) Evaluation of several MS/MS search algorithms for analysis of spectra derived from electron transfer dissociation experiments. Anal. Chem. 81, 7170–7180). Protein Prospector identified 80% more spectra at a 1% false discovery rate than the most successful alternative search engine in this previous publication. These results suggest that other search engines would benefit from the application of similar rules.

The recently developed fragmentation approach of electron transfer dissociation (ETD) has become a genuine alternative to the more ubiquitous collision-induced dissociation (CID) for high throughput and high sensitivity proteomic analysis (1–3).
ETD (4) and the related fragmentation process electron capture dissociation (ECD) (5) have been demonstrated to have particular advantages for the analysis of large peptides and small proteins (6–8) as well as the analysis of peptides bearing labile post-translational modifications (9–11). The results achieved through ETD and ECD analysis have been shown to be highly complementary to those obtained through CID fragmentation analysis, both through increasing confidence in particular identifications of peptides and also by allowing identification of extra components in complex mixtures (10, 12, 13). As CID and ETD can be sequentially or alternatively performed on precursor ions in the same mass spectrometric run, it is expected that the combined use of these two fragmentation analysis techniques will become increasingly common to enable more comprehensive sample analysis.

Software for analysis of CID spectra is significantly more advanced than that for ECD/ETD data. This is partly because the behavior of peptides under CID fragmentation is better characterized and understood so software has been developed that is better able to predict the fragment ions expected. The fragment ion types observed in ETD and ECD are largely known (5, 14, 15), but information about the frequency and peak intensities of the different ion types observed is less well documented.

We recently performed a study to characterize how frequently the different fragment ion types are detected in ETD spectra when analyzing complex digest mixtures produced by proteolytic enzymes or chemical cleavage reagents of different sequence specificity (16). These results were analyzed with respect to precursor charge state and location of basic residues, which were both shown to be significant factors in controlling the fragment ion types observed.
The results showed that ETD spectra of doubly charged precursor ions produced very different fragment ions depending on the location of a basic residue in the sequence.

Based on this statistical analysis of ETD data from a diverse range of peptides (16), in the present study, a new scoring system was developed and implemented in the search engine Batch-Tag within Protein Prospector that adjusts the weighting for different fragment ion types based on the precursor charge state and the presence of basic amino acid residues at either peptide terminus. The results using this new scoring system were compared with the previous generation of Batch-Tag, which used ion score weightings based on the average frequency of observation of different fragment types in ETD spectra of tryptic peptides and used the same scoring irrespective of precursor charge and sequence. The performance of this new scoring was also compared with those reported by other search engines using results previously published from a large standard data set (17). The new scoring system allowed identification of significantly more spectra than achieved with the previous scoring system. It also assigned 80% more spectra than the most successful of the compared search engines when using the same false discovery rate threshold.
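The core idea of this abstract, choosing a set of ion-type weights from the precursor charge state and the position of a basic residue, can be illustrated with a minimal Python sketch. It is not the Batch-Tag implementation: the weight values, the class names, the function names `weight_class` and `score`, and the choice of K, R, and H as the basic residues are all assumptions made only for illustration.

```python
from typing import Dict, Iterable

# HYPOTHETICAL weights, chosen only to illustrate the idea. They are not the
# values used by Batch-Tag or reported in the study.
ILLUSTRATIVE_WEIGHTS: Dict[str, Dict[str, float]] = {
    "2+_c_term_basic": {"c": 0.5, "z": 2.0, "y": 1.0},
    "2+_n_term_basic": {"c": 2.0, "z": 0.5, "y": 0.5},
    "2+_other": {"c": 1.0, "z": 1.0, "y": 0.5},
    "3+_or_higher": {"c": 1.5, "z": 1.5, "y": 0.25},
}

# Assumption: lysine, arginine, and histidine are treated as basic residues.
BASIC_RESIDUES = set("KRH")


def weight_class(sequence: str, charge: int) -> str:
    """Pick a weighting class from precursor charge and terminal basic residues."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    if charge < 2:
        raise ValueError("this sketch only covers precursors of charge 2+ or higher")
    if charge >= 3:
        return "3+_or_higher"
    if sequence[-1] in BASIC_RESIDUES:
        return "2+_c_term_basic"
    if sequence[0] in BASIC_RESIDUES:
        return "2+_n_term_basic"
    return "2+_other"


def score(sequence: str, charge: int, matched_ion_types: Iterable[str]) -> float:
    """Sum the class-specific weight of every matched fragment ion."""
    weights = ILLUSTRATIVE_WEIGHTS[weight_class(sequence, charge)]
    # Ion types without an entry in the table contribute nothing.
    return sum(weights.get(ion_type, 0.0) for ion_type in matched_ion_types)


if __name__ == "__main__":
    # The same matched ions score differently for different precursors.
    matched = ["z", "z", "c"]
    print(score("PEPTIDEK", 2, matched))
    print(score("KPEPTIDE", 2, matched))
```

Keeping the weights in a lookup table keyed by precursor class means the same matching code serves every charge state, and only the table would need to be refitted from observed ion frequencies.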

2.
Cross-linking/mass spectrometry resolves protein–protein interactions or protein folds with the help of distance constraints. Cross-linkers with specific properties such as isotope-labeled or collision-induced dissociation (CID)-cleavable cross-linkers are in frequent use to simplify the identification of cross-linked peptides. Here, we analyzed the mass spectrometric behavior of 910 unique cross-linked peptides in high-resolution MS1 and MS2 from published data and validated the observations with a ninefold larger set from currently unpublished data to explore whether detailed understanding of their fragmentation behavior would allow computational delivery of information that otherwise would be obtained via isotope labels or CID cleavage of cross-linkers. Isotope-labeled cross-linkers reveal cross-linked and linear fragments in fragmentation spectra. We show that fragment mass and charge alone provide this information, alleviating the need for isotope-labeling for this purpose. Isotope-labeled cross-linkers also indicate cross-linker-containing, albeit not specifically cross-linked, peptides in MS1. We observed that acquisition can be guided to better than twofold enrich cross-linked peptides with minimal losses based on peptide mass and charge alone. With the help of CID-cleavable cross-linkers, individual spectra with only linear fragments can be recorded for each peptide in a cross-link. We show that cross-linked fragments of ordinary cross-linked peptides can be linearized computationally and that a simplified subspectrum can be extracted that is enriched in information on one of the two linked peptides. This allows identifying candidates for this peptide in a simplified database search as we propose in a search strategy here.
We conclude that the specific behavior of cross-linked peptides in mass spectrometers can be exploited to relax the requirements on cross-linkers.

Cross-linking/mass spectrometry extends the use of mass-spectrometry-based proteomics from identification (1, 2), quantification (3), and characterization of protein complexes (4) into resolving protein structures and protein–protein interactions (5–8). Chemical reagents (cross-linkers) covalently connect amino acid pairs that are within a cross-linker-specific distance range in the native three-dimensional structure of a protein or protein complex. A cross-linking/mass spectrometry experiment is typically conducted in four steps: (1) cross-linking of the target protein or complex, (2) protein digestion (usually with trypsin), (3) LC-MS analysis, and (4) database search. The digested peptide mixture consists of linear and cross-linked peptides, and the latter can be enriched by strong cation exchange (9) or size exclusion chromatography (10). Cross-linked peptides are of high value as they provide direct information on the structure and interactions of proteins.

Cross-linked peptides fragment under collision-induced dissociation (CID) conditions primarily into b- and y-ions, as do their linear counterparts. An important difference regarding database searches between linear and cross-linked peptides stems from not knowing which peptides might be cross-linked. Therefore, one has to consider each single peptide and all pairwise combinations of peptides in the database. Having n peptides leads to (n² + n)/2 possible pairwise combinations. This leads to two major challenges: with increasing size of the database, both the search time and the risk of identifying false positives increase.
One way of circumventing these problems is to use MS2-cleavable cross-linkers (11, 12), at the cost of limited experimental design and choice of cross-linker.

In a first database search approach (13), all pairwise combinations of peptides in a database were considered in a concatenated and linearized form. Thereby, all possible single bond fragments are considered in one of the two database entries per peptide pair, and the cross-link can be identified by a normal protein identification algorithm. The second search approach already split the peptides for the purpose of their identification (14). Linear fragments were used to retrieve candidate peptides from the database that are then matched based on the known mass of the cross-linked pair and scored as a pair against the spectrum. Isotope-labeled cross-linkers were used to sort the linear and cross-linked fragments apart. Many other search tools and approaches have been developed since (10, 15–19); see (20) for a more detailed list, at least some of which follow the general idea of an open modification search (21–24).

As a general concept for open modification search of cross-linked peptides, cross-linked peptides represent two peptides, each with an unknown modification given by the mass of the other peptide and the cross-linker. One identifies both peptides individually and then matches them based on the known mass of the cross-linked pair (14, 22, 24). Alternatively, one peptide is identified first and, using that peptide and the cross-linker as a modification mass, the second peptide is identified from the database (21, 23). An important element of the open modification search approach is that it essentially converts the quadratic search space of the cross-linked peptides into a linear search space of modified peptides.
Still, many peptides and many modification positions have to be considered, especially when working with large databases or when using highly reactive cross-linkers with limited amino acid selectivity (25).

We hypothesize that detailed knowledge of the fragmentation behavior of cross-linked peptides might reveal ways to improve the identification of cross-linked peptides. Detailed analyses of the fragmentation behavior of linear peptides exist (26–28), and the analysis of the fragmentation behavior of cross-linked peptides has guided the design of scores (24, 29). Further, cross-link-specific ions have been observed from higher energy collision dissociation (HCD) data (30). Isotope-labeled cross-linkers are used to distinguish cross-linked from linear fragments, generally in low-resolution MS2 of cross-linked peptides (14).

We compared the mass spectrometric behavior of cross-linked peptides to that of linear peptides, using 910 high-resolution fragment spectra matched to unique cross-linked peptides from multiple different public datasets at 5% peptide-spectrum match (PSM) false discovery rate (FDR). In addition, we repeated all experiments with a larger sample set that contains 8,301 spectra, also including data from ongoing studies from our lab (Supplemental material S9-S12). This paper presents the mass spectrometric signature of cross-linked peptides that we identified in our analysis and the resulting heuristics that are incorporated into an integrated strategy for the analysis and identification of cross-linked peptides. We present computational strategies that indicate the possibility of alleviating the need for mass-spectrometrically restricted cross-linker choice.
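The search-space argument in this entry, (n² + n)/2 candidate pairs versus a linear number of "modified" single peptides, can be made concrete with a short Python sketch. The function names are hypothetical, and the sketch assumes neutral monoisotopic masses and that `crosslinker_mass` is the net mass the cross-linker adds to the linked pair.

```python
def pairwise_search_space(n_peptides: int) -> int:
    """Unordered peptide pairs, self-pairs included: (n^2 + n) / 2."""
    return (n_peptides * n_peptides + n_peptides) // 2


def open_modification_mass(precursor_mass: float, peptide_mass: float) -> float:
    """Mass left unexplained by one candidate peptide.

    In the open modification view this remainder is treated as a single
    unknown modification: the partner peptide plus the cross-linker.
    """
    return precursor_mass - peptide_mass


def partner_mass(precursor_mass: float, peptide_mass: float,
                 crosslinker_mass: float) -> float:
    """Expected mass of the second peptide once the first one is identified."""
    return open_modification_mass(precursor_mass, peptide_mass) - crosslinker_mass


if __name__ == "__main__":
    # Quadratic pair enumeration versus one pass over single peptides.
    for n in (1_000, 100_000):
        print(f"{n} peptides: {pairwise_search_space(n)} pairs, {n} open-modification candidates")
```

With 1,000 database peptides the exhaustive approach already has to consider 500,500 pairs, whereas the open modification view scores each of the 1,000 peptides once and then looks up partners by mass.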

3.
Mass spectrometry-based unbiased analysis of the full complement of secretory peptides is expected to facilitate the identification of unknown biologically active peptides. However, tandem MS sequencing of endogenous peptides in their native form has proven difficult because they show size heterogeneity and contain multiple internal basic residues, characteristics not found in peptide fragments produced by in vitro digestion. Endogenous peptides remain largely unexplored by electron transfer dissociation (ETD), despite its widespread use in bottom-up proteomics. We used ETD, in comparison to collision-induced dissociation (CID), to identify endogenous peptides derived from secretory granules of a human endocrine cell line. For mass accuracy, both MS and tandem MS were analyzed on an Orbitrap. CID and ETD, performed in different LC-MS runs, resulted in the identification of 795 and 569 unique peptides (ranging from 1000 to 15000 Da), respectively, with an overlap of 397. Peptides larger than 3000 Da accounted for 54% in CID and 46% in ETD identifications. Although numerically outperformed by CID, ETD provided more extensive fragmentation, leading to the identification of peptides that are not reached by CID. This advantage was demonstrated in identifying a new antimicrobial peptide from neurosecretory protein VGF (non-acronymic), VGF[554–577]-NH2, or in differentiating nearly isobaric peptides (mass difference less than 2 ppm) that arise from alternatively spliced exons of the gastrin-releasing peptide gene. CID and ETD complemented each other to add to our knowledge of the proteolytic processing sites of proteins implicated in the regulated secretory pathway. An advantage of the use of both fragmentation methods was also noted in localization of phosphorylation sites.
These findings point to the utility of ETD mass spectrometry in the global study of endogenous peptides, or peptidomics.

Biologically active peptides, commonly known as peptide hormones and antimicrobial peptides, belong to a defined set of endogenous peptides that gain specialized functions not ascribed to original precursor proteins. For a precursor protein to generate such peptides, it must undergo specific cleavages and in some cases needs to be modified at specific sites (1). This limited cleavage, or proteolytic processing, represents an important cellular mechanism by which molecular diversity of proteins is increased at the post-translational level. In the postgenome era, it is being recognized that localization of processing sites in secretory proteins facilitates the identification of biologically active peptides. A standard approach to determining such sites is to use a panel of antibodies directed against different regions of a target protein (2). However, it is practically impossible to prepare antibodies that can thoroughly cover potential processing products arising from the precursor. Alternatively, mass spectrometry-assisted unbiased analysis of endogenous peptides may be a major step toward elucidating proteolytic processing (3).

In neurons and endocrine cells, a majority of biologically active peptides are released via the regulated secretory pathway. They are stored in secretory granules and await secretion until the cells receive an exocytotic stimulus. Owing to their compartmentalization, secretory peptides can be noninvasively recovered in culture supernatant. We have shown that a data set of endogenous peptide sequences that are collected by this procedure is applicable to infer processing sites, as well as to identify bona fide processing products (4). Rather than being digested, every endogenous peptide should be analyzed in its native form to understand how the peptide is generated and subsequently degraded.
However, it remains a challenge to identify endogenous peptides because of size heterogeneity (ranging from 3 aa to 100 aa). For example, thyrotropin-releasing hormone is a small 3-aa peptide, human adrenomedullin occurs as a 52-aa peptide, and a 98-aa N-terminal propeptide from the atrial natriuretic peptide precursor is found in the circulation. Unlike digested protein fragments used in bottom-up proteomics, C termini of these endogenous peptides are not restricted to specific residues. Furthermore, proteolytic processing leads to the production of peptides containing multiple internal basic residues, for which collision-induced dissociation (CID) shows limited performance (5).

A solution to address this issue in endogenous peptide sequencing might be the use of electron transfer dissociation (ETD) tandem mass spectrometry, which has been shown to provide a more complete series of fragment ions and hence a more confident sequence identification, along with the ability to leave labile post-translational modifications intact (6–10). The benefit of ETD in bottom-up proteomics has been increasingly documented, whereas endogenous peptides remain largely unexplored by ETD, despite the expectation that ETD would improve sequencing for larger peptides. In the few studies on endogenous peptides (11, 12), ETD did not cover large peptides exceeding 5000 Da. Because we have used CID to facilitate the discovery of previously unknown biologically active peptides (3, 13, 14), we were interested to see if ETD would be helpful to identify endogenous peptides that have escaped identification by CID. Here we conducted a large-scale identification of endogenous secretory peptides, ranging from 1000 to 15000 Da, using CID and ETD. We describe the merits of using ETD, in connection with CID, in peptidomics studies. The most significant finding is the identification of a previously unknown peptide, VGF[554–577]-NH2, which was sequenced solely by ETD.
This peptide was found to have antimicrobial activity.
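The abstract refers to nearly isobaric peptides whose masses differ by less than 2 ppm. As a small worked illustration, a relative mass difference in parts per million can be computed as below. The function name and the example masses are hypothetical and are not taken from the study; the sketch assumes the difference is expressed relative to a chosen reference mass.

```python
def ppm_difference(mass: float, reference_mass: float) -> float:
    """Absolute mass difference relative to the reference, in parts per million."""
    if reference_mass <= 0:
        raise ValueError("reference_mass must be positive")
    return abs(mass - reference_mass) / reference_mass * 1e6


if __name__ == "__main__":
    # Hypothetical masses in Da: a 0.004 Da gap at 2500 Da is 1.6 ppm.
    print(ppm_difference(2500.004, 2500.0))
```

Because ppm is a relative measure, the same 2 ppm window corresponds to a larger absolute mass gap for larger peptides, which is why high mass accuracy in both MS and tandem MS matters for peptides of several thousand Da.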

4.
Campylobacter jejuni is a gastrointestinal pathogen that is able to modify membrane and periplasmic proteins by the N-linked addition of a 7-residue glycan at the strict attachment motif (D/E)XNX(S/T). Strategies for a comprehensive analysis of the targets of glycosylation, however, are hampered by the resistance of the glycan-peptide bond to enzymatic digestion or β-elimination and have previously concentrated on soluble glycoproteins compatible with lectin affinity and gel-based approaches. We developed strategies for enriching C. jejuni HB93-13 glycopeptides using zwitterionic hydrophilic interaction chromatography and examined novel fragmentation, including collision-induced dissociation (CID) and higher energy collisional (C-trap) dissociation (HCD) as well as CID/electron transfer dissociation (ETD) mass spectrometry. CID/HCD enabled the identification of glycan structure and peptide backbone, allowing glycopeptide identification, whereas CID/ETD enabled the elucidation of glycosylation sites by maintaining the glycan-peptide linkage. A total of 130 glycopeptides, representing 75 glycosylation sites, were identified from LC-MS/MS using zwitterionic hydrophilic interaction chromatography coupled to CID/HCD and CID/ETD. CID/HCD provided the majority of the identifications (73 sites) compared with ETD (26 sites). We also examined soluble glycoproteins by soybean agglutinin affinity and two-dimensional electrophoresis and identified a further six glycosylation sites. This study more than doubles the number of confirmed N-linked glycosylation sites in C. jejuni and is the first to utilize HCD fragmentation for glycopeptide identification with intact glycan. We also show that hydrophobic integral membrane proteins are significant targets of glycosylation in this organism. 
Our data demonstrate that peptide-centric approaches coupled to novel mass spectrometric fragmentation techniques may be suitable for application to eukaryotic glycoproteins for simultaneous elucidation of glycan structures and peptide sequence.

Campylobacter jejuni is a Gram-negative, microaerophilic, spiral-shaped, motile bacterium that is the most common cause of food- and water-borne diarrheal illness worldwide (1). Typical infections are acquired via the consumption of undercooked poultry where C. jejuni is found commensally (2). Symptoms in humans range from mild, non-inflammatory diarrhea to severe abdominal cramps, vomiting, and inflammation (3). Prior infection with C. jejuni is a common antecedent of two chronic immune-mediated disorders: Guillain-Barré syndrome (4) and immunoproliferative small intestine disease (5). A unique molecular trait of C. jejuni is the ability to post-translationally modify proteins by the N-linked addition of a 7-residue glycan (GalNAc-α1,4-GalNAc-α1,4-(Glcβ1,3)-GalNAc-α1,4-GalNAc-α1,4-GalNAc-α1,3-Bac-β1 where Bac is bacillosamine (2,4-diacetamido-2,4,6-trideoxyglucopyranose)) (6) at the consensus sequon (D/E)XNX(S/T) where X is any amino acid except proline (7).

The N-linked C. jejuni heptasaccharide is encoded by the pgl (protein glycosylation) gene cluster (8–10), and the glycan is transferred to proteins by the PglB oligosaccharyltransferase (11) at the periplasmic face of the inner membrane (12). Removal of the N-glycosylation gene cluster (or indeed pglB alone) results in C. jejuni that displays poor adherence to and invasion of epithelial cell lines (13) and reduced colonization of the chicken gastrointestinal tract (14). Although this demonstrates a requirement for glycosylation in virulence, the proteins that mediate this are still unknown, and the overall role of glycan attachment remains to be elucidated. Our current understanding of the structural context of glycosylation in C.
jejuni suggests that it does not play a role in steric stabilization by conferring structural rigidity as seen in eukaryotes (15) but occurs preferably on flexible loops and unordered regions of proteins (16–18). To investigate the role of glycosylation in protein function, recent studies have utilized mutagenesis to remove the N-linked sequon from three glycoproteins: Cj1496c (19), Cj0143c (20), and VirB10 (21). Removal of glycosylation from Cj1496c and Cj0143c had little effect on protein function; however, glycan attachment was required for correct localization of VirB10. Although the exact role of the glycan remains largely unknown, it appears to be site-specific with a single site, Asn97, influencing localization of VirB10, whereas a second site, Asn32, is dispensable (21). It is clear that a more comprehensive analysis of the C. jejuni glycoproteome is required. A further complication in the elucidation of N-linked glycosylation is the use of the NCTC 11168 strain, which because of laboratory passage (22, 23) may not be the most appropriate model in which to study the virulence properties of glycan attachment. For example, we have recently shown that a surface-exposed virulence factor, JlpA, is glycosylated at two sites (Asn146 and Asn107) in all sequenced C. jejuni strains except NCTC 11168, which contains only Asn146 (24).

Glycoproteomics in C. jejuni is also a major technical challenge. Unlike eukaryotic N-linked glycans, the C. jejuni glycan is resistant to removal by protein N-glycosidase F (24) and chemical liberation via β-elimination (6) possibly because of the structure of the unique linking sugar, bacillosamine (25). Analysis therefore requires complementary methodology to elucidate the sites of glycosylation in the presence of the glycan.
Preferential fragmentation of the glycan itself during collision-induced dissociation (CID) generally results in poor recovery of peptide fragment ions, and thus identification of the underlying protein and site of attachment remains problematic. MS3 has been attempted for site identification (6, 26); however, the data are limited by the requirement for sufficient ions for two rounds of tandem MS. We have also shown previously that C. jejuni encodes several hydrophobic integral membrane and outer membrane proteins possessing multiple transmembrane-spanning regions that are not amenable to gel-based approaches (27), particularly those using lectins for glycoprotein purification (28). We hypothesize that N-linked glycosylation is more widespread than previously demonstrated (6, 7, 26) because these studies examined only soluble proteins (6, 26) or used lectin affinity (6, 7), which limits the amount and type of detergents that can be used. Recent work (26) has demonstrated the potential of exploiting the hydrophilic nature of the C. jejuni glycan to enable glycopeptide enrichment.

The ability to generate product ions useful for the identification of a glycosylated peptide is governed by three factors: the peptide backbone, the glycan, and the fragmentation approach. Multiple strategies exist to separately exploit the first two of these parameters (29, 30), but it is only recently that selective fragmentation of modified peptides has been available through electron transfer dissociation (ETD) and electron capture dissociation (31, 32). ETD/electron capture dissociation enable the selective cleavage of the peptide while maintaining the carbohydrate structure, and this has been demonstrated using eukaryotic glycopeptides (33, 34) and more recently glycopeptides isolated from the pathogen Neisseria gonorrhoeae (35).
A more recent fragmentation approach is higher energy collisional (C-trap) dissociation (HCD), which uses higher fragmentation energies than standard CID and enables identification of modifications, such as phosphotyrosine (36), via diagnostic immonium ions and high mass accuracy over the full mass range in MS/MS. HCD has not previously been applied to glycopeptides.

We applied several enrichment and MS fragmentation approaches to the characterization of the glycoproteome of C. jejuni HB93-13. Sequence analysis determined that the HB93-13 genome contains 510 N-linked sequons ((D/E)XNX(S/T)) in 382 proteins of which 261 (with 371 potential N-linked sites) are predicted to pass through the inner membrane and are therefore the subset that may be glycosylated. We examined trypsin digests of whole cell and membrane protein preparations using zwitterionic hydrophilic interaction chromatography (ZIC-HILIC) and graphite enrichment of gel-separated proteins using several mass spectrometric techniques (CID, HCD, and ETD). This is the first study to demonstrate the potential of using the high energy fragmentation of HCD to overcome the signal disruption caused by labile glycan fragmentation and to provide peptide sequencing within a single step. Manual data analysis was also simplified as the GalNAc fragment ion (204.086 Da) provides a signature that can be used to highlight glycopeptides within a complex mixture. We identified 81 glycosylation sites, including 47 not described previously in the literature and a single site that cannot be unambiguously assigned. The majority of these are present on proteins not amenable to traditional gel-based analyses, such as hydrophobic transmembrane proteins. Our work more than doubles the previously known N-linked C. jejuni glycoproteome and provides a clear rationale for other studies where the peptide and glycan need to remain associated.
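The sequon described in this entry, (D/E)XNX(S/T) with X being any amino acid except proline, maps directly onto a regular expression. The following Python sketch shows one way to scan a protein sequence for candidate sites. The function name `find_sequons` and the example sequences are hypothetical, and the sketch only finds sequence motifs; it does not predict whether a protein passes through the inner membrane or is actually glycosylated.

```python
import re
from typing import List, Tuple

# (D/E) X N X (S/T), where X is any residue except proline.
# The lookahead makes the scan report overlapping motifs as well.
SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")


def find_sequons(protein_sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position of the Asn, matched 5-residue motif) pairs."""
    hits = []
    for match in SEQUON.finditer(protein_sequence.upper()):
        # The Asn is the third residue of the motif: 0-based start + 2,
        # reported here as a 1-based position (start + 3).
        hits.append((match.start() + 3, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing one sequon with Asn at position 5.
    print(find_sequons("MKDFNVSAA"))
```

Wrapping the pattern in a lookahead is a deliberate choice: a plain pattern would consume the five matched residues and could miss a second sequon that starts inside the first.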

5.
The use of ultraviolet photodissociation (UVPD) for the activation and dissociation of peptide anions is evaluated for broader coverage of the proteome. To facilitate interpretation and assignment of the resulting UVPD mass spectra of peptide anions, the MassMatrix database search algorithm was modified to allow automated analysis of negative polarity MS/MS spectra. The new UVPD algorithms were developed based on the MassMatrix database search engine by adding specific fragmentation pathways for UVPD. The new UVPD fragmentation pathways in MassMatrix were rigorously and statistically optimized using two large data sets with high mass accuracy and high mass resolution for both MS1 and MS2 data acquired on an Orbitrap mass spectrometer for complex Halobacterium and HeLa proteome samples. Negative mode UVPD led to the identification of 3663 and 2350 peptides for the Halo and HeLa tryptic digests, respectively, corresponding to 655 and 645 peptides that were unique when compared with electron transfer dissociation (ETD), higher energy collision-induced dissociation, and collision-induced dissociation results for the same digests analyzed in the positive mode. In sum, 805 and 619 proteins were identified via UVPD for the Halobacterium and HeLa samples, respectively, with 49 and 50 unique proteins identified in contrast to the more conventional MS/MS methods. The algorithm also features automated charge determination for low mass accuracy data, precursor filtering (including intact charge-reduced peaks), and the ability to combine both positive and negative MS/MS spectra into a single search, and it is freely open to the public. The accuracy and specificity of the MassMatrix UVPD search algorithm was also assessed for low resolution, low mass accuracy data on a linear ion trap. 
Analysis of a known mixture of three mitogen-activated kinases yielded similar sequence coverage percentages for UVPD of peptide anions versus conventional collision-induced dissociation of peptide cations, and when these methods were combined into a single search, an increase of up to 13% sequence coverage was observed for the kinases. The ability to sequence peptide anions and cations in alternating scans in the same chromatographic run was also demonstrated. Because ETD has a significant bias toward identifying highly basic peptides, negative UVPD was used to improve the identification of the more acidic peptides in conjunction with positive ETD for the more basic species. In this case, tryptic peptides from the cytosolic section of HeLa cells were analyzed by polarity switching nanoLC-MS/MS utilizing ETD for cation sequencing and UVPD for anion sequencing. Relative to searching using ETD alone, positive/negative polarity switching significantly improved sequence coverages across identified proteins, resulting in a 33% increase in unique peptide identifications and more than twice the number of peptide spectral matches.

The advent of new high-performance tandem mass spectrometers equipped with the most versatile collision- and electron-based activation methods and ever more powerful database search algorithms has catalyzed tremendous progress in the field of proteomics (1–4). Despite these advances in instrumentation and methodologies, there are few methods that fully exploit the information available from the acidic proteome or acidic regions of proteins. Typical high-throughput, bottom-up workflows consist of the chromatographic separation of complex mixtures of digested proteins followed by online mass spectrometry (MS) and MSn analysis. This bottom-up approach remains the most popular strategy for protein identification, biomarker discovery, quantitative proteomics, and elucidation of post-translational modifications.
To date, proteome characterization via mass spectrometry has overwhelmingly focused on the analysis of peptide cations (5), resulting in an inherent bias toward basic peptides that easily ionize under acidic mobile phase conditions and positive polarity MS settings. Given that ∼50% of peptides/proteins are naturally acidic (6) and that many of the most important post-translational modifications (e.g. phosphorylation, acetylation, sulfonation, etc.) significantly decrease the isoelectric points of peptides (7, 8), there is a compelling need for better analytical methodologies for characterization of the acidic proteome.

A principal reason for the shortage of methods for peptide anion characterization is the lack of MS/MS techniques suitable for the efficient and predictable dissociation of peptide anions. Although there is a growing array of new ion activation methods for the dissociation of peptides, most have been developed for the analysis of positively charged peptides. Collision-induced dissociation (CID) of peptide anions, for example, often yields unpredictable or uninformative fragmentation behavior, with spectra dominated by neutral losses from both precursor and product ions (9), resulting in insufficient peptide sequence information. The two most promising new electron-based methods, electron-capture dissociation and electron-transfer dissociation (ETD), are applicable only to positively charged ions, not to anions (10–13). Because of the known inadequacy of CID and the lack of feasibility of electron-capture dissociation and ETD for peptide anion sequencing, several alternative MSn methods have been developed recently. Electron detachment dissociation using high-energy electrons to induce backbone cleavages was developed for peptide anions (14, 15).
Another new technique, negative ETD, entails reactions of radical cation reagents with peptide anions to promote electron transfer from the peptide to the reagent that causes radical-directed dissociation (16, 17). Activated-electron photodetachment dissociation, an MS3 technique, uses UV irradiation to produce intact peptide radical anions, which are then collisionally activated (18, 19). Although they represent inroads in the characterization of peptide anions, these methods also suffer from several significant shortcomings. Electron detachment dissociation and activated-electron photodetachment dissociation are both low-efficiency methods that require long averaging cycles and activation times that range from half a second to multiple seconds, impeding the integration of these methods with chromatographic timescales (14–19). In addition, the fragmentation patterns frequently yield many high-abundance neutral losses from product ions, which clutter the spectra (14–17), and few sequence ions (14, 18, 19). Recently, we reported the use of 193-nm photons (ultraviolet photodissociation (UVPD)) for peptide anion activation, which was shown to yield rich and predictable fragmentation patterns with high sequence coverage on a fast liquid chromatographic timeline (20). This method showed promise for a range of peptide charge states (i.e. from 3- to 1-), as well as for both unmodified and phosphorylated species.
Several widely used or commercial database searching techniques are available for automated “bottom-up” analysis of peptide cations; SEQUEST (21), MASCOT (22), OMSSA (23), X! Tandem (24), and MASPIC (25) are all popular choices and yield comparable results (26). MassMatrix (27), a recently introduced searching algorithm, uses a mass accuracy sensitive probability-based scoring scheme for both the total number of matched product ions and the total abundance of matched products. 
This searching method also utilizes LC retention times to filter false positive peptide matches (28) and has been shown to yield results comparable to or better than those obtained with SEQUEST, MASCOT, OMSSA, and X! Tandem (29). Despite the ongoing innovation in automated peptide cation analysis, there is a lack of publicly available methods for automated peptide anion analysis.
In this work, we have modified the mass accuracy sensitive probabilistic MassMatrix algorithms to allow database searching of negative polarity MS/MS spectra. The algorithm is specific to the fragmentation behavior generated from 193-nm UVPD of peptide anions. The UVPD pathways in MassMatrix were rigorously and statistically optimized using two large data sets with high mass accuracy and high mass resolution for both MS1 and MS2 data acquired on an Orbitrap mass spectrometer for complex HeLa and Halo proteome samples. For low mass accuracy/low mass resolution data, we also incorporated a charge-state-filtering algorithm that identifies the charge state of each MS/MS spectrum based on the fragmentation patterns prior to searching. MassMatrix not only can analyze both positive and negative polarity LC-MS/MS files separately, but also can combine files from different polarities and different dissociation methods into a single search, thus maximizing the information content for a given proteomics experiment. The explicit incorporation of mass accuracy in the scores for the UVPD MS/MS spectra of peptide anions increases peptide assignments and identifications. Finally, we showcase the utility of integrating MassMatrix searching with positive/negative polarity MS/MS switching (i.e. data-dependent positive ETD and negative UVPD during a single proteomic LC-MS/MS run). MassMatrix is available to the public as a free search engine online.

6.
Quantifying the similarity of spectra is an important task in various areas of spectroscopy, for example, to identify a compound by comparing sample spectra to those of reference standards. In mass spectrometry based discovery proteomics, spectral comparisons are used to infer the amino acid sequence of peptides. In targeted proteomics by selected reaction monitoring (SRM) or SWATH MS, predetermined sets of fragment ion signals integrated over chromatographic time are used to identify target peptides in complex samples. In both cases, confidence in peptide identification is directly related to the quality of spectral matches. In this study, we used sets of simulated spectra of well-controlled dissimilarity to benchmark different spectral comparison measures and to develop a robust scoring scheme that quantifies the similarity of fragment ion spectra. We applied the normalized spectral contrast angle score to quantify the similarity of spectra to objectively assess fragment ion variability of tandem mass spectrometric datasets, to evaluate portability of peptide fragment ion spectra for targeted mass spectrometry across different types of mass spectrometers and to discriminate target assays from decoys in targeted proteomics. Altogether, this study validates the use of the normalized spectral contrast angle as a sensitive spectral similarity measure for targeted proteomics, and more generally provides a methodology to assess the performance of spectral comparisons and to support the rational selection of the most appropriate similarity measure. The algorithms used in this study are made publicly available as an open source toolset with a graphical user interface.
In “bottom-up” proteomics, peptide sequences are identified by the information contained in their fragment ion spectra (1). Various methods have been developed to generate peptide fragment ion spectra and to match them to their corresponding peptide sequences. 
They can be broadly grouped into discovery and targeted methods. In the widely used discovery (also referred to as shotgun) proteomic approach, peptides are identified by establishing peptide to spectrum matches via a method referred to as database searching. Each acquired fragment ion spectrum is searched against theoretical peptide fragment ion spectra computed from the entries of a specified sequence database, whereby the database search space is constrained to a user defined precursor mass tolerance (2, 3). The quality of the match between experimental and theoretical spectra is typically expressed with multiple scores. These include the number of matching or nonmatching fragments and the number of consecutive fragment ion matches, among others. With few exceptions (4–7), commonly used search engines do not use the relative intensities of the acquired fragment ion signals, even though this information could be expected to strengthen the confidence of peptide identification because the relative fragment ion intensity pattern acquired under controlled fragmentation conditions can be considered as a unique “fingerprint” for a given precursor. Thanks to community efforts in acquiring and sharing large numbers of datasets, the proteomes of some species are now essentially mapped out and experimental fragment ion spectra covering entire proteomes are increasingly becoming accessible through spectral databases (8–16). This has catalyzed the emergence of new proteomics strategies that differ from classical database searching in that they use prior spectral information to identify peptides. Those comprise inclusion list sequencing (directed sequencing), spectral library matching, and targeted proteomics (17). These methods explicitly use the information contained in empirical fragment ion spectra, including the fragment ion signal intensity to identify the target peptide. 
For these methods, it is therefore of highest importance to accurately control and quantify the degree of reproducibility of the fragment ion spectra across experiments, instruments, labs, methods, and to quantitatively assess the similarity of spectra. To date, dot product (18–24), its corresponding arccosine spectral contrast angle (25–27) and (Pearson-like) spectral correlation (28–31), and other geometrical distance measures (18, 32), have been used in the literature for assessing spectral similarity. These measures have been used in different contexts including shotgun spectra clustering (19, 26), spectral library searching (18, 20, 21, 24, 25, 27–29), cross-instrument fragmentation comparisons (22, 30) and for scoring transitions in targeted proteomics analyses such as selected reaction monitoring (SRM) (23, 31). However, to our knowledge, those scores have never been objectively benchmarked for their performance in discriminating well-defined levels of dissimilarities between spectra. In particular, similarity scores obtained by different methods have not yet been compared for targeted proteomics applications, where the sensitive discrimination of highly similar spectra is critical for the confident identification of targeted peptides.
In this study, we have developed a method to objectively assess the similarity of fragment ion spectra. We provide an open-source toolset that supports these analyses. Using a computationally generated benchmark spectral library with increasing levels of well-controlled spectral dissimilarity, we performed a comprehensive and unbiased comparison of the performance of the main scores used to assess spectral similarity in mass spectrometry.
We then exemplify how this method, in conjunction with its corresponding benchmarked perturbation spectra set, can be applied to answer several relevant questions for MS-based proteomics. 
As a first application, we show that it can efficiently assess the absolute levels of peptide fragmentation variability inherent to any given mass spectrometer. By comparing the instrument's intrinsic fragmentation conservation distribution to that of the benchmarked perturbation spectra set, nominal values of spectral similarity scores can indeed be translated into a more directly understandable percentage of variability inherent to the instrument fragmentation. As a second application, we show that the method can be used to derive an absolute measure to estimate the conservation of peptide fragmentation between instruments or across proteomics methods. This allowed us to quantitatively evaluate, for example, the transferability of fragment ion spectra acquired by data dependent analysis in a first instrument into a fragment/transition assay list used for targeted proteomics applications (e.g. SRM or targeted extraction of data independent acquisition SWATH MS (33)) on another instrument. Third, we used the method to probe the fragmentation patterns of peptides carrying a post-translational modification (e.g. phosphorylation) by comparing the spectra of modified peptides with those of their unmodified counterparts. Finally, we used the method to determine the overall level of fragmentation conservation that is required to support target-decoy discrimination and peptide identification in targeted proteomics approaches such as SRM and SWATH MS.
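The similarity measures named in this abstract (dot product, its arccosine spectral contrast angle, and a normalized form of that angle) can be illustrated with a minimal sketch. It assumes that the two spectra have already been aligned into equal-length intensity vectors, and that the normalization maps the angle linearly from [0, π/2] onto [1, 0]; the function name is hypothetical and this is not the authors' published toolset.

```python
import math


def normalized_contrast_angle(a, b):
    """Return 1 - 2*theta/pi, where theta is the angle between two
    aligned intensity vectors (assumed normalization, see lead-in)."""
    if len(a) != len(b):
        raise ValueError("spectra must be aligned to the same fragment list")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("spectra must contain at least one non-zero intensity")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    theta = math.acos(cos_theta)
    return 1.0 - 2.0 * theta / math.pi


# Identical relative intensity patterns score close to 1, regardless of scale.
same_pattern = normalized_contrast_angle([10.0, 20.0, 30.0], [1.0, 2.0, 3.0])
# Spectra that share no fragment intensity score 0.
no_overlap = normalized_contrast_angle([1.0, 0.0], [0.0, 1.0])
```

Because intensities are non-negative, the angle stays within [0, π/2], so the score under this assumed normalization ranges from 0 (no shared signal) to 1 (identical pattern).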

7.
8.
Disulfide bond identification is important for a detailed understanding of protein structures, which directly affect their biological functions. Here we describe an integrated workflow for the fast and accurate identification of authentic protein disulfide bridges. This novel workflow incorporates acidic proteolytic digestion using pepsin to eliminate undesirable disulfide reshuffling during sample preparation and a novel search engine, SlinkS, to directly identify disulfide-bridged peptides isolated via electron transfer higher energy dissociation (EThcD). In EThcD fragmentation of disulfide-bridged peptides, electron transfer dissociation preferentially leads to the cleavage of the S–S bonds, generating two intense disulfide-cleaved peptides as primary fragment ions. Subsequently, higher energy collision dissociation primarily targets unreacted and charge-reduced precursor ions, inducing peptide backbone fragmentation. SlinkS is able to provide the accurate monoisotopic precursor masses of the two disulfide-cleaved peptides and the sequence of each linked peptide by matching the remaining EThcD product ions against a linear peptide database. The workflow was validated using a protein mixture containing six proteins rich in natural disulfide bridges. Using this pepsin-based workflow, we were able to efficiently and confidently identify a total of 31 unique Cys–Cys bonds (out of 43 disulfide bridges present), with no disulfide reshuffling products detected. Pepsin digestion not only outperformed trypsin digestion in terms of the number of detected authentic Cys–Cys bonds, but, more important, prevented the formation of artificially reshuffled disulfide bridges due to protein digestion under neutral pH. Our new workflow therefore provides a precise and generic approach for disulfide bridge mapping, which can be used to study protein folding, structure, and stability.
Disulfide bridges are one of the most common post-translational modifications in proteins (1). 
The formation of disulfide bonds between cysteine residues is a crucial component in the process of protein folding and plays an important role in stabilizing the tertiary and quaternary structures of proteins (2, 3). Therefore, detecting and characterizing the exact locations of disulfide bonds is an important aspect of proteomics, especially in the context of gaining a comprehensive understanding of protein folding and three-dimensional structures. Moreover, in the use of protein therapeutics (e.g. antibodies), it is also of interest to monitor the reshuffling of disulfide bonds during formulation, storage, and usage, which reflects the antibody structure, stability, and biological function (4).
Most knowledge about protein disulfide bridges comes from detailed molecular structures obtained via x-ray crystallography and NMR spectroscopy (5, 6), although regrettably such data are mostly obtained from overexpressed recombinant proteins. Mass spectrometry is gaining importance in the identification and characterization of protein disulfide bridges (7, 8). Some advantages of MS-based approaches include relatively easy sample preparation, short analysis time, and the capability to deal with more complex protein mixtures from endogenous sources. However, the detection of disulfide bridges remains challenging for a few reasons.
Firstly, the presence of free sulfhydryl groups can induce undesired sulfhydryl-disulfide reshuffling, especially under neutral and alkaline pH conditions. As most standard proteomic strategies use enzymatic digestion in a pH range of 7.5–8.5, undesirable disulfide reshuffling can occur during sample handling (8). 
Secondly, most of the widely applied database searching programs, such as SEQUEST and Mascot, are not developed, and thus are not suitable, for analyzing fragmentation spectra originating from disulfide-bridged peptides (9).
Efforts have been directed at tackling these obstacles and facilitating the identification of authentic disulfide bridges. With respect to sample handling, it has been demonstrated by several groups that disulfide reshuffling can be reduced by (i) blocking free cysteines using alkylating reagents before denaturing the protein, (ii) lowering the pH to 6.0 to 7.0 during tryptic digestion (8, 10–13), and (iii) using the enzyme pepsin under acidic conditions for proteolytic digestion (13–17). Unfortunately, trypsin becomes less efficient and less specific at more acidic pH, and pepsin, which has an optimal pH range of 1–3, tremendously increases the complexity of both protein digests and data analysis (8). Regarding data analysis, one of the current approaches used for the identification of disulfide bridges involves chromatographic comparison between reduced and non-reduced protein digests, with disulfide-bridged peptides appearing only in non-reduced samples (8, 12). Alternatively, disulfide bonds can be identified directly from non-reduced protein digests using an electron transfer dissociation (ETD) MS2 and collision-induced dissociation (CID)/higher energy collision dissociation (HCD) MS3 fragmentation scheme (termed the ETD-MS2 CID/HCD-MS3 approach) (13, 18, 19). Thereby, ETD aids in the preferential cleavage of S–S linkages, generating two disulfide-cleaved peptides, which can be subsequently isolated and further fragmented via CID/HCD for sequence information. 
In addition, substantial efforts have been made to develop novel strategies specifically for interpreting spectra from disulfide-bridged peptides, including de novo sequencing approaches (20, 21) and database search engines such as MassMatrix and Dbond (9, 22).
A combined dual fragmentation scheme, referred to as electron-transfer and higher-energy collision dissociation (EThcD), was introduced by our group recently as implemented on an Orbitrap Elite (23–25) and will become available for the Orbitrap Fusion. In this approach, an initial ETD step is applied to fragment the isolated MS precursor, and subsequently all resulting ions are subjected to HCD fragmentation, generating a mixture of b/y and c/z ions. Here we explored the use of EThcD for disulfide bridge analysis. We reasoned that the previously reported ETD-MS2 CID/HCD-MS3 method could be integrated into a single EThcD experiment, with ETD applied first to preferentially break the disulfide bond and HCD employed next to enhance the number of peptide backbone fragments. Based on the fact that all the ions resulting from the ETD process are subjected to HCD simultaneously and thus no MS3 isolation is necessary, the sensitivity and duty cycle of the EThcD workflow should potentially be improved relative to the previous MS3 strategy.
In this work, we describe a fast and accurate framework for both intrapeptide and interpeptide disulfide bridge identification, including the acidic digestion procedure using pepsin, the usage of the dual-fragmentation scheme EThcD, and the development of a novel search engine, SlinkS. The workflow described herein diminishes issues induced by disulfide reshuffling during sample preparation and provides direct and efficient identification of intrapeptide and interpeptide disulfide bonds from LC/MS2 experiments. 
We evaluated the integrated workflow using a mixture of six standard proteins and confirmed that this approach enables reliable and robust identification of authentic disulfide bridges from protein mixtures. Furthermore, we assessed the capability of the workflow to quantitatively monitor the changes of disulfide bridges in stress-induced therapeutic antibodies.
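The mass relationship behind pairing two disulfide-cleaved peptides with their intact precursor can be sketched as follows. This is only an illustration under stated assumptions: both peptide masses are taken as neutral monoisotopic masses of the fully reduced (free thiol) forms, forming one S–S bond removes two hydrogen atoms, and the function names and the 10 ppm default tolerance are hypothetical. It is not the SlinkS algorithm, and real ETD cleavage products may differ from the reduced forms by a hydrogen.

```python
# Monoisotopic mass of a hydrogen atom in Da.
HYDROGEN_MASS = 1.00782503


def disulfide_linked_mass(reduced_mass_a, reduced_mass_b):
    """Neutral mass of two peptides joined by one interpeptide S-S bond.

    Assumes both inputs are masses of the reduced (free thiol) peptides;
    bond formation removes two hydrogen atoms in total.
    """
    return reduced_mass_a + reduced_mass_b - 2.0 * HYDROGEN_MASS


def matches_precursor(reduced_mass_a, reduced_mass_b, precursor_mass, tol_ppm=10.0):
    """Check whether a peptide pair explains a neutral precursor mass.

    tol_ppm is an assumed illustrative tolerance, not a value from the source.
    """
    expected = disulfide_linked_mass(reduced_mass_a, reduced_mass_b)
    error_ppm = abs(expected - precursor_mass) / precursor_mass * 1e6
    return error_ppm <= tol_ppm


# Two hypothetical peptides of 1000 Da and 1500 Da linked by one bridge.
linked = disulfide_linked_mass(1000.0, 1500.0)
```

A pair whose combined mass is off by roughly 2 Da, for instance because the hydrogen loss was ignored, fails the check at this tolerance.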

9.
Protein–protein interactions (PPIs) are fundamental to the structure and function of protein complexes. Resolving the physical contacts between proteins as they occur in cells is critical to uncovering the molecular details underlying various cellular activities. To advance the study of PPIs in living cells, we have developed a new in vivo cross-linking mass spectrometry platform that couples a novel membrane-permeable, enrichable, and MS-cleavable cross-linker with multistage tandem mass spectrometry. This strategy permits the effective capture, enrichment, and identification of in vivo cross-linked products from mammalian cells and thus enables the determination of protein interaction interfaces. The utility of the developed method has been demonstrated by profiling PPIs in mammalian cells at the proteome scale and the targeted protein complex level. Our work represents a general approach for studying in vivo PPIs and provides a solid foundation for future studies toward the complete mapping of PPI networks in living systems.
Protein–protein interactions (PPIs) play a key role in defining protein functions in biological systems. Aberrant PPIs can have drastic effects on biochemical activities essential to cell homeostasis, growth, and proliferation, and thereby lead to various human diseases (1). Consequently, PPI interfaces have been recognized as a new paradigm for drug development. Therefore, mapping PPIs and their interaction interfaces in living cells is critical not only for a comprehensive understanding of protein function and regulation, but also for describing the molecular mechanisms underlying human pathologies and identifying potential targets for better therapeutics.
Several strategies exist for identifying and mapping PPIs, including yeast two-hybrid, protein microarray, and affinity purification mass spectrometry (AP-MS) (2–5). 
Thanks to new developments in sample preparation strategies, mass spectrometry technologies, and bioinformatics tools, AP-MS has become a powerful and preferred method for studying PPIs at the systems level (6–9). Unlike other approaches, AP-MS experiments allow the capture of protein interactions directly from their natural cellular environment, thus better retaining native protein structures and biologically relevant interactions. In addition, a broader scope of PPI networks can be obtained with greater sensitivity, accuracy, versatility, and speed. Despite the success of this very promising technique, AP-MS experiments can lead to the loss of weak/transient interactions and/or the reorganization of protein interactions during biochemical manipulation under native purification conditions. To circumvent these problems, in vivo chemical cross-linking has been successfully employed to stabilize protein interactions in native cells or tissues prior to cell lysis (10–16). The resulting covalent bonds formed between interacting partners allow affinity purification under stringent and fully denaturing conditions, consequently reducing nonspecific background while preserving stable and weak/transient interactions (12–16). Subsequent mass spectrometric analysis can reveal not only the identities of interacting proteins, but also cross-linked amino acid residues. The latter provides direct molecular evidence describing the physical contacts between and within proteins (17). This information can be used for computational modeling to establish structural topologies of proteins and protein complexes (17–22), as well as for generating experimentally derived protein interaction network topology maps (23, 24). 
Thus, cross-linking mass spectrometry (XL-MS) strategies represent a powerful and emergent technology that possesses unparalleled capabilities for studying PPIs.
Despite their great potential, current XL-MS studies that have aimed to identify cross-linked peptides have been mostly limited to in vitro cross-linking experiments, with few successfully identifying protein interaction interfaces in living cells (24, 25). This is largely because XL-MS studies remain challenging due to the inherent difficulty in the effective MS detection and accurate identification of cross-linked peptides, as well as in unambiguous assignment of cross-linked residues. In general, cross-linked products are heterogeneous and low in abundance relative to non-cross-linked products. In addition, their MS fragmentation is too complex to be interpreted using conventional database searching tools (17, 26). It is noted that almost all of the current in vivo PPI studies utilize formaldehyde cross-linking because of its membrane permeability and fast kinetics (10–16). However, in comparison to the most commonly used amine reactive NHS ester cross-linkers, identification of formaldehyde cross-linked peptides is even more challenging because of its promiscuous nonspecific reactivity and extremely short spacer length (27). Therefore, further developments in reagents and methods are urgently needed to enable simple MS detection and effective identification of in vivo cross-linked products, and thus allow the mapping of authentic protein contact sites as established in cells, especially for protein complexes.
Various efforts have been made to address the limitations of XL-MS studies, resulting in new developments in bioinformatics tools for improved data interpretation (28–32) and new designs of cross-linking reagents for enhanced MS analysis of cross-linked peptides (24, 33–39). Among these approaches, the development of new cross-linking reagents holds great promise for mapping PPIs on the systems level. 
One class of cross-linking reagents containing an enrichment handle has been shown to allow selective isolation of cross-linked products from complex mixtures, boosting their detectability by MS (33–35, 40–42). A second class of cross-linkers containing MS-cleavable bonds has proven to be effective in facilitating the unambiguous identification of cross-linked peptides (36–39, 43, 44), as the resulting cross-linked products can be identified based on their characteristic and simplified fragmentation behavior during MS analysis. Therefore, an ideal cross-linking reagent would possess the combined features of both classes of cross-linkers. To advance the study of in vivo PPIs, we have developed a new XL-MS platform based on a novel membrane-permeable, enrichable, and MS-cleavable cross-linker, Azide-A-DSBSO (azide-tagged, acid-cleavable disuccinimidyl bis-sulfoxide), and multistage tandem mass spectrometry (MSn). This new XL-MS strategy has been successfully employed to map in vivo PPIs from mammalian cells at both the proteome scale and the targeted protein complex level.

10.
Laserspray ionization (LSI) mass spectrometry (MS) allows, for the first time, the analysis of proteins directly from tissue using high performance atmospheric pressure ionization mass spectrometers. Several abundant and numerous lower abundant protein ions with molecular masses up to ∼20,000 Da were detected as highly charged ions from delipified mouse brain tissue mounted on a common microscope slide and coated with 2,5-dihydroxyacetophenone as matrix. The ability of LSI to produce multiply charged ions by laser ablation at atmospheric pressure allowed protein analysis at 100,000 mass resolution on an Orbitrap Exactive Fourier transform mass spectrometer. A single acquisition was sufficient to identify the myelin basic protein N-terminal fragment directly from tissue using electron transfer dissociation on a linear trap quadrupole (LTQ) Velos. The high mass resolution and mass accuracy, also obtained with a single acquisition, are useful in determining protein molecular weights and from the electron transfer dissociation data in confirming database-generated sequences. Furthermore, microscopy images of the ablated areas show matrix ablation of ∼15 μm-diameter spots in this study. The results suggest that LSI-MS at atmospheric pressure potentially combines speed of analysis and imaging capability common to matrix-assisted laser desorption/ionization and soft ionization, multiple charging, improved fragmentation, and cross-section analysis common to electrospray ionization.
Tissue imaging by mass spectrometry (MS) is proving useful in areas such as detecting tumor margins, determining sites of high drug uptake, and mapping signaling molecules in brain tissue (1–8). Imaging using secondary ion mass spectrometry is well established but is only marginally useful with intact molecular mass measurements from biological tissue (9–11). 
Matrix-assisted laser desorption/ionization (MALDI)-MS operating under vacuum conditions has been used for tissue imaging with success, especially for abundant components such as membrane lipids, drug metabolites, and proteins (12–14). Spatial resolution of ∼20 μm has been achieved (15), and the MALDI-MS method has been applied in an attempt to shed light on Parkinson disease (16, 17), muscular dystrophy (18), obesity, and cancer (12, 19).
Unfortunately, there are disadvantages in using vacuum-based MS for tissue imaging in relation to analysis of unadulterated tissue. Also, the mass spectrometers used in these studies frequently have much lower mass resolution and mass accuracy than are available with atmospheric pressure ionization (API) instruments and are not as widely available. Because the vacuum ionization methods produce singly charged ions, mass-selected fragmentation methods provide only limited information, especially for proteins. In addition, no advanced fragmentation such as electron transfer dissociation (ETD) (20–22) is available for confident protein confirmation or identification. Atmospheric pressure (AP) MALDI can be coupled to high performance mass spectrometers but suffers from sensitivity issues for tissue imaging where high spatial resolution is desired (23). AP MALDI also primarily produces singly charged ions (24, 25). Thus, mass and cross-section analysis of intact proteins has yet to be accomplished using AP MALDI because of intrinsic mass range limitations of API instruments, which frequently have a mass-to-charge (m/z) limit of <4000. Thus, new improved methods of mass-specific tissue imaging, especially at AP, are needed.
The potential of laserspray ionization (LSI) (Scheme 1) (26–33) for protein tissue analysis is reported here. 
LSI has advantages relative to other MS-based methods, including speed of analysis, laser ablation of small volumes, more relevant AP conditions, extended mass range and improved fragmentation through multiple charging, and the ability to obtain cross-section data for proteins on appropriate instrumentation. The applicability of LSI for high mass compounds on high performance API mass spectrometers (Orbitrap Exactive and SYNAPT G2) has been demonstrated producing ESI-like multiply protonated ions (26–28). The first experiments showing sequence analysis by ETD using the LSI method were successfully carried out on a Thermo Fisher Scientific (San Jose, CA) LTQ-ETD mass spectrometer (26). Nearly complete sequence coverage was obtained for ubiquitin, an important regulatory protein. Applying ETD fragmentation to LSI-MS analyses potentially provides a new method for studying biological processes, including the mapping of phosphorylation, glycosylation, and ubiquitination sites from intact proteins and directly from tissue.
Scheme 1. Overview of LSI-MS operated in transmission geometry.
Furthermore, unlike ESI and related ESI-based methods such as desorption-ESI (34), the LSI method has been shown to allow analysis of lipids in tissue from ablated areas <80 μm (30). In comparison with literature reports for AP MALDI at the same stage of development (35), LSI is more than an order of magnitude more sensitive and is capable of analyzing proteins on high resolution mass spectrometers as was demonstrated by obtaining full-acquisition mass spectra at 100,000 mass resolution (FWHH, m/z 200) after application of only 20 fmol of bovine pancreas insulin in the matrix 2,5-dihydroxyacetophenone (2,5-DHAP) onto a glass microscope slide (33). The analysis speed of LSI was demonstrated by obtaining mass spectra of five samples in 8 s (32). Here, we show the utility of LSI for intact peptide and protein analyses directly from mouse brain tissue. 
The ability to obtain a protein mass spectrum directly from mouse brain tissue in a single laser shot at 100,000 mass resolution and with ETD fragmentation is demonstrated.
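The link between multiple charging and mass range in this abstract can be made concrete with a short worked example. The sketch below assumes purely protonated positive ions, so that m/z = (M + z·m_p)/z with m_p the proton mass, and asks how many protons a ∼20,000 Da protein needs before its ion falls under an m/z limit of 4000. The function names are hypothetical.

```python
# Mass of a proton in Da.
PROTON_MASS = 1.00727646688


def mz(neutral_mass, charge):
    """m/z of an ion carrying `charge` protons (positive mode assumed)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def min_charge(neutral_mass, mz_limit):
    """Smallest charge state whose m/z lies strictly below mz_limit."""
    if mz_limit <= PROTON_MASS:
        raise ValueError("mz_limit is too low for any protonated ion")
    charge = 1
    while mz(neutral_mass, charge) >= mz_limit:
        charge += 1
    return charge


# A 20,000 Da protein on an instrument limited to m/z < 4000.
needed = min_charge(20000.0, 4000.0)
observed_mz = mz(20000.0, needed)
```

Under these assumptions a singly charged ion of this protein sits near m/z 20,001, far beyond the stated limit, and the 5+ ion still lies just above 4000, which is why ionization methods that yield mostly singly charged ions cannot reach such proteins on these instruments.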

11.
12.
The lack of consensus sequence, common core structure, and universal endoglycosidase for the release of O-linked oligosaccharides makes O-glycosylation more difficult to tackle than N-glycosylation. Structural elucidation by mass spectrometry is usually inconclusive as the CID spectra of most glycopeptides are dominated by carbohydrate-related fragments, preventing peptide identification. In addition, O-linked structures also undergo a gas-phase rearrangement reaction, which eliminates the sugar without leaving a telltale sign at its former attachment site. In the present study we report the enrichment and mass spectrometric analysis of proteins from bovine serum bearing Galβ1–3GalNAcα (mucin core-1 type) structures and the analysis of O-linked glycopeptides utilizing electron transfer dissociation and high resolution, high mass accuracy precursor ion measurements. Electron transfer dissociation (ETD) analysis of intact glycopeptides provided sufficient information for the identification of several glycosylation sites. However, glycopeptides frequently feature precursor ions of low charge density (m/z > ∼850) that will not undergo efficient ETD fragmentation. Exoglycosidase digestion was utilized to reduce the mass of the molecules while retaining their charge. ETD analysis of species modified by a single GalNAc at each site was significantly more successful in the characterization of multiply modified molecules. We report the unambiguous identification of 21 novel glycosylation sites. We also detail the limitations of the enrichment method as well as the ETD analysis.
Glycosylation is among the most prevalent post-translational modifications of proteins; it is estimated that over half of all proteins undergo glycosylation during their lifespan (1). Glycosylation of secreted proteins and the extracellular part of membrane proteins occurs in the endoplasmic reticulum and the contiguous Golgi complex. 
The side chains of Trp, Asn, and Thr/Ser residues can be modified, termed C-, N-, and O-glycosylation, respectively (2, 3). In addition, O-glycosylation also occurs within the nucleus and the cytosol: a single GlcNAc residue modifies Ser and Thr residues. O-GlcNAc glycosylation fulfills a regulatory/signaling function similar to phosphorylation (4).

From an analytical point of view, C-glycosylation is the simplest. A consensus sequence has been defined: WXXW where the first Trp is modified, and the modification, a Man moiety, readily survives sample preparation and mass spectrometric analysis, including collisional activation (5). Investigation of N-glycosylation is also facilitated by several factors. First, N-glycosylation again has a well defined consensus sequence: NX(S/T/C) where the middle amino acid cannot be Pro (6). Second, there is a universal core glycan structure: GlcNAc₂Man₃; and this core is conserved across species. Third, a specific endoglycosidase, peptide N-glycosidase F, has been identified. This enzyme cleaves the carbohydrate structure from the peptide, leaving behind a diagnostic sign: the Asn residue is hydrolyzed to Asp, inducing a mass shift of +1 Da. By contrast, analysis of O-glycosylation is hampered by a lack of (i) a consensus sequence, (ii) a universal core structure, and (iii) a universal endoglycosidase or gentle chemical hydrolysis method to facilitate analysis.

Glycosylation shows a high degree of species and tissue specificity; the same site may be modified by a wide variety of different glycan structures, and unmodified variants of the protein may occur simultaneously (7–9). Disease and physiological changes also may alter the glycosylation pattern (10–12). The biological role(s) of glycosylation has been studied extensively (13–15), although such studies are seriously hampered by the difficulties of glycosylation analysis.

Most secreted proteins are glycosylated; and thus, mammalian serum is rich in glycoproteins. 
On the other hand, O-linked glycoproteins represent a small percentage of the serum protein content. Glycoproteins may display a befuddling heterogeneity both in site specificity and site occupancy. Thus, the enrichment of modified proteins or peptides is necessary for their characterization, and different techniques have been tested for this purpose. Lectin affinity chromatography is a popular method for selective isolation of glycoproteins and glycopeptides. Concanavalin A can be used to isolate oligomannose type glycopeptides (16), wheat germ agglutinin is applied for GlcNAc-containing compounds (16, 17), and jacalin is selective for core-1 type O-glycopeptides (18, 19). Lectins with preferential affinity for fucosylated and sialylated structures can also be utilized (12). Non-selective capture of glycopeptides can be performed using hydrophilic interaction chromatography (20, 21) or size exclusion chromatography (22). A recent approach applies porous graphite columns for semiselective enrichment (23), whereas the acidic character of sialylated glycopeptides has also been exploited via titanium dioxide-mediated enrichment (24). Finally, vicinal cis-diols can be selectively captured using boronic acid derivatives (25–27). All methods described here provide some glycopeptide enrichment from non-glycosylated peptide background, but all also suffer from significant non-selective binding. N-Linked glycoproteins may also be selectively captured on hydrazide resin following periodate oxidation (28). This approach requires enzymatic deglycosylation to release the captured peptides for analysis, therefore excluding the determination of the carbohydrate structure.

Intact glycopeptide characterization still represents a significant challenge. Edman degradation, either alone or in combination with mass spectrometry, has been utilized for such tasks (29, 30). CID analysis of O-linked glycopeptides has limited utility. 
(i) MS/MS analysis cannot differentiate between the isomeric carbohydrate units and usually does not reveal the linkage positions and the configuration of the glycosidic bonds. (ii) Such spectra are typically dominated by abundant product ions associated with carbohydrate fragmentation, namely non-reducing end oxonium ions and product ions formed via sequential neutral losses of sugar residues from the precursor ions. (iii) The glycan is cleaved from the peptide via a gas-phase rearrangement reaction, and as a result the peptide itself and most peptide fragments (if any) are detected partially or completely deglycosylated (31–33). Recently a different approach, the combination of positive and negative ion mode infrared multiphoton dissociation, was found to provide conclusive structural assignment for some O-linked glycopeptides (34). However, two novel MS/MS techniques, electron capture dissociation (ECD), which is performed in FT-ICR mass spectrometers (35), and electron transfer dissociation (ETD), which is performed in various ion trapping devices (36), may represent the real breakthrough. In both cases an electron is transferred to multiply protonated peptide cations, triggering peptide fragmentation at the covalent bond between the amino group and the α-carbon, producing mostly c and radical z· product ions while leaving the side chains intact. ETD is typically more efficient than ECD and thus leads to more comprehensive fragmentation. In addition, ETD can be performed in ion traps and thus at a higher sensitivity level, especially in a linear ion trap. Because it has been observed that there are instances when the electron transfer is efficient and still no significant fragmentation occurs, ETD is usually combined with supplementary (and gentle) CID activation (37). O-Glycosylation analysis using these new dissociative techniques has been investigated (38, 39). 
However, because of the complexity of extracellular O-glycosylation, analysis of complex mixtures is rarely attempted (18), and the above techniques are usually restricted to the analysis of purified proteins.

In this study we present the analysis of secreted O-linked glycopeptides. Lectin (jacalin) affinity chromatography was used to achieve some enrichment of core-1 O-GalNAcα type carbohydrate-carrying glycopeptides from bovine serum. The glycopeptide fractions were subjected to CID and ETD analysis. These experiments were performed on a linear ion trap-Orbitrap hybrid mass spectrometer (40). The Orbitrap delivered high resolution, high mass accuracy for the precursor ions, whereas the linear trap provided high sensitivity MS/MS analyses. Some fractions were also subjected to sequential exoglycosidase digestions, and glycopeptides retaining only the proximal GalNAc residues were analyzed. ProteinProspector v5.2.1, developed to accommodate ETD product ion spectra, aided data interpretation (41). We identified 26 glycosylation sites from bovine serum unambiguously; 21 of these sites have never been reported by any other study. No other single study to date has yielded so much information about O-linked glycosylation sites.
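
The consensus sequences quoted in this abstract (NX(S/T/C) with X not Pro for N-glycosylation, and WXXW for C-glycosylation) lend themselves to a simple motif scan. The following is a minimal sketch, not part of the cited study; the function name `find_sites` and the example sequence are hypothetical, chosen only for illustration.

```python
import re

# N-glycosylation sequon as quoted above: Asn, any residue except Pro,
# then Ser/Thr/Cys. A lookahead keeps overlapping sequons (e.g. "NNTC").
N_SEQUON = re.compile(r"N(?=[^P][STC])")

# C-glycosylation motif as quoted above: W-X-X-W, first Trp modified.
C_MOTIF = re.compile(r"W(?=..W)")


def find_sites(sequence, pattern):
    """Return 1-based positions of the residue that would carry the glycan."""
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]
```

For the made-up sequence `MNGTWSAWNPSNNTC`, `find_sites(seq, N_SEQUON)` gives `[2, 12, 13]` (Asn-9 is skipped because it is followed by Pro), and `find_sites(seq, C_MOTIF)` gives `[5]`. No comparable scan exists for O-glycosylation, which is exactly the difficulty the abstract describes.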

13.
In large-scale proteomic experiments, multiple peptide precursors are often cofragmented simultaneously in the same mixture tandem mass (MS/MS) spectrum. These spectra tend to elude current computational tools because of the ubiquitous assumption that each spectrum is generated from only one peptide. Therefore, tools that consider multiple peptide matches to each MS/MS spectrum can potentially improve the relatively low spectrum identification rate often observed in proteomics experiments. More importantly, data independent acquisition protocols promoting the cofragmentation of multiple precursors are emerging as alternative methods that can greatly improve the throughput of peptide identifications but their success also depends on the availability of algorithms to identify multiple peptides from each MS/MS spectrum. Here we address a fundamental question in the identification of mixture MS/MS spectra: determining the statistical significance of multiple peptides matched to a given MS/MS spectrum. We propose the MixGF generating function model to rigorously compute the statistical significance of peptide identifications for mixture spectra and show that this approach improves the sensitivity of current mixture spectra database search tools by ≈30–390%. Analysis of multiple data sets with MixGF reveals that in complex biological samples the number of identified mixture spectra can be as high as 20% of all the identified spectra and the number of unique peptides identified only in mixture spectra can be up to 35.4% of those identified in single-peptide spectra.

The advancement of technology and instrumentation has made tandem mass (MS/MS) spectrometry the leading high-throughput method to analyze proteins (1, 2, 3). In typical experiments, tens of thousands to millions of MS/MS spectra are generated and enable researchers to probe various aspects of the proteome on a large scale. 
Part of this success hinges on the availability of computational methods that can analyze the large amount of data generated from these experiments. The classical question in computational proteomics asks: given an MS/MS spectrum, what is the peptide that generated the spectrum? However, it is increasingly being recognized that this assumption that each MS/MS spectrum comes from only one peptide is often not valid. Several recent analyses show that as many as 50% of the MS/MS spectra collected in typical proteomics experiments come from more than one peptide precursor (4, 5). The presence of multiple peptides in mixture spectra can decrease their identification rate to as low as one half of that for MS/MS spectra generated from only one peptide (6, 7, 8). In addition, there have been numerous developments in data independent acquisition (DIA) technologies where multiple peptide precursors are intentionally selected to cofragment in each MS/MS spectrum (9, 10, 11, 12, 13, 14, 15). These emerging technologies can address some of the enduring disadvantages of traditional data-dependent acquisition (DDA) methods (e.g. low reproducibility (16)) and potentially increase the throughput of peptide identification 5–10 fold (4, 17). However, despite the growing importance of mixture spectra in various contexts, there are still only a few computational tools that can analyze mixture spectra from more than one peptide (18, 19, 20, 21, 8, 22). Our recent analysis indicated that current database search methods for mixture spectra still have relatively low sensitivity compared with their single-peptide counterpart and the main bottleneck is their limited ability to separate true matches from false positive matches (8). 
Traditionally, the problem of peptide identification from MS/MS spectra involves two sub-problems: 1) define a Peptide-Spectrum-Match (PSM) scoring function that assigns each MS/MS spectrum to the peptide sequence that most likely generated the spectrum; and 2) given a set of top-scoring PSMs, select a subset that corresponds to statistically significant PSMs. Here we focus on the second problem, which is still an ongoing research question even for the case of single-peptide spectra (23, 24, 25, 26). Intuitively the second problem is difficult because one needs to consider spectra across the whole data set (instead of comparing different peptide candidates against one spectrum as in the first problem) and PSM scoring functions are often not well-calibrated across different spectra (i.e. a PSM score of 50 may be good for one spectrum but poor for a different spectrum). Ideally, a scoring function will give high scores to all true PSMs and low scores to false PSMs regardless of the peptide or spectrum being considered. However, in practice, some spectra may receive higher scores than others simply because they have more peaks or their precursor mass results in more peptide candidates being considered from the sequence database (27, 28). Therefore, a scoring function that accounts for spectrum or peptide-specific effects can make the scores more comparable and thus help assess the confidence of identifications across different spectra. The MS-GF solution to this problem is to compute the per-spectrum statistical significance of each top-scoring PSM, which can be defined as the probability that a random peptide (out of all possible peptides within parent mass tolerance) will match to the spectrum with a score at least as high as that of the top-scoring PSM. This measures how good the current best match is in relation to all possible peptides matching to the same spectrum, normalizing any spectrum effect from the scoring function. 
Intuitively, our proposed MixGF approach extends the MS-GF approach to now calculate the statistical significance of the top pair of peptides matched from the database to a given mixture spectrum M (i.e. the significance of the top peptide–peptide spectrum match (PPSM)). As such, MixGF determines the probability that a random pair of peptides (out of all possible peptides within parent mass tolerance) will match a given mixture spectrum with a score at least as high as that of the top-scoring PPSM.

Despite the theoretical attractiveness of computing statistical significance, it is generally prohibitive for any database search method to score all possible peptides against a spectrum. Therefore, earlier works in this direction focus on approximating this probability by assuming the score distribution of all PSMs follows a certain analytical form such as the normal, Poisson or hypergeometric distributions (29, 30, 31). In practice, because score distributions are highly data-dependent and spectrum-specific, these model assumptions do not always hold. Other approaches tried to learn the score distribution empirically from the data (29, 27). However, one is most interested in the region of the score distribution where only a small fraction of false positives are allowed (typically at 1% FDR). This usually corresponds to the extreme tail of the distribution where p values are on the order of 10⁻⁹ or lower and thus there is typically a lack of sufficient data points to accurately model the tail of the score distribution (32). More recently, Kim et al. (24) and Alves et al. (33), in parallel, proposed a generating function approach to compute the exact score distribution of random peptide matches for any spectra without explicitly matching all peptides to a spectrum. Because it is an exact computation, no assumption is made about the form of score distribution and the tail of the distribution can be computed very accurately. 
As a result, this approach substantially improved the ability to separate true matches from false positive ones and led to a significant increase in sensitivity of peptide identification over state-of-the-art database search tools in single-peptide spectra (24).

For mixture spectra, it is expected that the scores for the top-scoring match will be even less comparable across different spectra because now more than one peptide and different numbers of peptides can be matched to each spectrum at the same time. We extend the generating function approach (24) to rigorously compute the statistical significance of multiple-Peptide-Spectrum Matches (mPSMs) and demonstrate its utility toward addressing the peptide identification problem in mixture spectra. In particular, we show how to extend the generating function approach to mixtures of two peptides. We focus on this relatively simple case of mixture spectra because it accounts for a large fraction of mixture spectra presented in traditional DDA workflows (5). This allows us to test and develop algorithmic concepts using readily-available DDA data because data with more complex mixture spectra such as those from DIA workflows (11) is still not widely available in public repositories.
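
The generating function idea described in this abstract (obtaining the exact score distribution over all candidate peptides without enumerating them) can be illustrated with a small dynamic program. This is a toy sketch under loud assumptions, not the MS-GF or MixGF implementation: masses are integers, the score is a sum of per-prefix-mass contributions, every residue string counts equally (the published method weights peptides by amino acid probabilities), and the names `score_distribution` and `spectral_probability` are hypothetical.

```python
from collections import defaultdict


def score_distribution(parent_mass, residue_masses, peak_score):
    """Count residue strings of integer mass `parent_mass` by score.

    residue_masses: integer masses of a toy amino acid alphabet (all >= 1).
    peak_score: dict mapping an integer prefix mass to the integer score
                earned when a candidate has a prefix of that mass.
    Returns {score: number_of_strings}; no string is ever enumerated.
    """
    # table[m][t] = number of residue strings with total mass m and score t
    table = [defaultdict(int) for _ in range(parent_mass + 1)]
    table[0][0] = 1  # the empty prefix
    for m in range(1, parent_mass + 1):
        gain = peak_score.get(m, 0)
        for r in residue_masses:
            if r <= m:
                # Appending residue r to any string of mass m - r
                # creates a new prefix of mass m, which earns `gain`.
                for t, n in table[m - r].items():
                    table[m][t + gain] += n
    return dict(table[parent_mass])


def spectral_probability(dist, threshold):
    """Fraction of all candidates scoring at least `threshold`."""
    total = sum(dist.values())
    hits = sum(n for t, n in dist.items() if t >= threshold)
    return hits / total
```

With residue masses `[1, 2]`, parent mass 4 and `peak_score = {1: 1, 3: 2}`, the five possible strings yield the distribution `{3: 2, 2: 1, 1: 1, 0: 1}`, so a top score of 3 has a per-spectrum probability of 0.4. The point of the sketch is that the tail is computed exactly rather than fitted, which is the property the abstract emphasizes.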

14.
Here we present a novel methodology for the identification of the targeted post-translational modifications present in highly modified proteins using mixed integer linear optimization and electron transfer dissociation (ETD) tandem mass spectrometry. For a given ETD tandem mass spectrum, the rigorous set of modified forms that satisfy the mass of the precursor ion, within some tolerance error, are enumerated by solving a feasibility problem via mixed integer linear optimization. The enumeration of the entire superset of modified forms enables the method to normalize the relative contributions of the individual modification sites. Given the entire set of modified forms, a superposition problem is then formulated using mixed integer linear optimization to determine the relative fractions of the modified forms that are present in the multiplexed ETD tandem mass spectrum. Chromatographic information in the mass and time dimension is utilized to assess the likelihood of the assigned modification states, to average several tandem mass spectra for confident identification of lower level forms, and to infer modification states of partially assigned spectra. The utility of the proposed computational framework is demonstrated on an entire LC-MS/MS ETD experiment corresponding to a mixture of highly modified histone peptides. This new computational method will facilitate the unprecedented LC-MS/MS ETD analysis of many hypermodified proteins and offer novel biological insight into these previously understudied systems.

Accurate identification of post-translational modifications (PTMs) is a critical and often difficult task in proteomics. Most standard mass spectrometry-based techniques for the identification of protein modifications utilize a “bottom up” approach where the proteins are enzymatically digested into smaller peptides that are subsequently ionized and fragmented via CID to derive their sequence information (1–9). 
The identification of all the modifications present in a protein hinges on the successful identification of the PTMs of its corresponding peptides. This protocol can be limited by (a) insufficient elution and detection of all the peptides that cover the entire sequence of the protein, (b) false or incomplete identifications at the peptide level, and (c) the existence of different modification states of the same protein. Additional complications arise when using CID to study labile PTMs such as phosphorylation, glycosylation, or sulfonation. In these instances, the preferred reaction is often the cleavage of the PTM as opposed to the backbone of the peptide, resulting in a high intensity peak corresponding to the difference of the parent mass and the cleaved modification. The advent of electron capture dissociation (ECD) (10, 11) and electron transfer dissociation (ETD) (12–15) has enabled researchers to address the aforementioned issues associated with bottom up approaches using CID by adopting a complementary top down or middle down analysis strategy.

ECD and ETD both involve the reaction of an electron with a highly protonated cation to form an odd electron peptide. This process induces large amounts of backbone cleavage to yield c and z· ions that are analogous to the b and y ion series typically encountered in CID tandem mass spectra. Unlike CID, ECD/ETD cleavage is weakly affected by the composition and number of amino acids in the peptide and for certain systems can provide more fragmentation coverage than CID alone, especially for bigger peptides with higher charge states. Both ECD and ETD also prevent the cleavage of labile modifications, and thus PTMs are retained on the corresponding c and z· ions. The aforementioned benefits make ECD/ETD particularly well suited for the LC-MS/MS top down and middle down analysis of post-translationally modified proteins. 
These top down and middle down approaches also enable the approximate inference of protein abundance from the chromatogram and MS1 information because the full protein sequence elutes from the column (16).

In recent years, there has been significant interest in the identification of highly modified proteins, such as histones. Histone proteins are key regulators of many important DNA processes in eukaryotes, and recent studies have elucidated complex relationships between histone modifications and many nuclear events. It has also been shown that differences in global histone modifications in tissues can be used to predict the clinical outcome of cancer patients (17). Early MS or immunoassay studies were only able to analyze these modifications on a site-by-site basis and as a result lost important connectivity information on the molecular level because several modified forms of the same protein exist concurrently. In MS-based applications, the use of traditional reversed phase HPLC for the separation of a highly modified protein results in poor chromatographic resolution because all the modified forms are physically similar. Successful off-line techniques for the separation of highly modified histone forms have been achieved using cation exchange hydrophilic interaction chromatography (HILIC) (18), which separates the modified species primarily by the number of acetyl groups and secondly by the degree of methylation. The separation must be conducted off line because the mobile phase additives used are non-volatile components, and subsequent fractionation is necessary for mass spectrometric analysis. This protocol has made it possible to analyze the first 50 amino acids of the N-terminal tail of histone H3 and provided important insight regarding connectivity information between the modification sites. 
A major disadvantage of this approach is that the off-line nature of the experimental protocol is extremely time-consuming (on the order of months) and thus prohibits the ability to conduct multiple runs for high throughput studies and statistical validation. Additionally, other off-line techniques have been successful in the extraction and purification of modified histone proteins using acid-urea gel electrophoresis (19) but suffer from similar throughput constraints.

We have recently developed chromatography that is particularly suited for LC-MS ETD analysis of highly modified polypeptides with successful applications to histone proteins (20). The protocol uses a “saltless” pH gradient to elute the various modified forms in a weak cation exchange HILIC. Unprecedented separation of the modified histone forms is achieved within a single LC-MS/MS ETD experiment, thereby introducing important chromatographic information that can be utilized in the subsequent identification and quantification of these post-translational modifications. Although the achieved separation is exceptional in comparison with previous attempts, the complexity and relative similarity of the modified forms still result in minor species co-eluting with similar mass and retention times, thus resulting in multiplexed tandem mass spectra. The term “multiplexed” as used here refers to the fact that several species are dissociated and measured in a single tandem mass spectrum (21) and should not be confused with the multiplex experimental protocols. Computational methodologies that utilize the extensive and complementary information contained within these LC-MS/MS data sets are nonexistent as the technology has only recently been developed.

In this work, we present a novel mixed integer linear optimization (MILP) computational framework for the identification and quantification of highly modified proteins using LC-MS and ETD tandem mass spectrometry. 
Key concepts of the proposed framework are illustrated using histone H3.2 as an example system. For a given primary sequence, the entire set of post-translational modifications that satisfy a precursor mass are enumerated by solving an MILP feasibility problem. Given this set of PTM forms, an MILP superposition problem is then solved to determine the relative fractions of the modified forms that are present in the multiplexed ETD tandem mass spectrum. An important aspect of the proposed framework is that chromatographic information is used to correlate the modification states as a function of modification position, mass, and time. The proposed computational framework is applied to an entire LC-MS/MS ETD experiment corresponding to a mixture of highly modified histone peptides to demonstrate its utility.
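
The first step described in this abstract, enumerating every modified form whose mass matches the precursor within a tolerance, can be illustrated on a tiny example. The sketch below deliberately swaps the MILP feasibility formulation for brute-force enumeration, which is only practical for a handful of sites; the site names, the unmodified mass, and the function `enumerate_forms` are hypothetical, while the mass shifts are the standard monoisotopic values for methylation and acetylation.

```python
from itertools import product

# Monoisotopic mass shifts (Da) for a few common lysine modifications.
MOD_SHIFT = {
    "none": 0.0,
    "me1": 14.01565,
    "me2": 28.03130,
    "me3": 42.04695,
    "ac": 42.01057,
}


def enumerate_forms(site_options, unmodified_mass, precursor_mass, tol):
    """List every assignment of modifications to sites that fits the mass.

    site_options: dict mapping a site label to its allowed modifications.
    Brute force over all combinations (an assumption of this sketch; the
    abstract describes solving an MILP feasibility problem instead).
    """
    sites = list(site_options)
    forms = []
    for combo in product(*(site_options[s] for s in sites)):
        mass = unmodified_mass + sum(MOD_SHIFT[m] for m in combo)
        if abs(mass - precursor_mass) <= tol:
            forms.append(dict(zip(sites, combo)))
    return forms
```

For two hypothetical sites, K4 (none/me1/me2/me3) and K9 (none/me1/me2/me3/ac), and a precursor 42.04695 Da above the unmodified mass, a 0.01 Da tolerance returns four isobaric methylated forms and excludes the acetylated one; loosening the tolerance to 0.05 Da admits it as a fifth. This shows why high mass accuracy alone cannot resolve such forms and why the superposition step on fragment ions is needed.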

15.
The success of high-throughput proteomics hinges on the ability of computational methods to identify peptides from tandem mass spectra (MS/MS). However, a common limitation of most peptide identification approaches is the nearly ubiquitous assumption that each MS/MS spectrum is generated from a single peptide. We propose a new computational approach for the identification of mixture spectra generated from more than one peptide. Capitalizing on the growing availability of large libraries of single-peptide spectra (spectral libraries), our quantitative approach is able to identify up to 98% of all mixture spectra from equally abundant peptides and automatically adjust to varying abundance ratios of up to 10:1. Furthermore, we show how theoretical bounds on spectral similarity avoid the need to compare each experimental spectrum against all possible combinations of candidate peptides (achieving speedups of over five orders of magnitude) and demonstrate that mixture-spectra can be identified in a matter of seconds against proteome-scale spectral libraries. Although our approach was developed for and is demonstrated on peptide spectra, we argue that the generality of the methods allows for their direct application to other types of spectral libraries and mixture spectra.

The success of tandem MS (MS/MS) approaches to peptide identification is partly due to advances in computational techniques allowing for the reliable interpretation of MS/MS spectra. Mainstream computational techniques mainly fall into two categories: database search approaches that score each spectrum against peptides in a sequence database (1–4) or de novo techniques that directly reconstruct the peptide sequence from each spectrum (5–8). 
The combination of these methods with advances in high-throughput MS/MS has promoted the accelerated growth of spectral libraries, collections of peptide MS/MS spectra the identifications of which were validated by accepted statistical methods (9, 10) and often also manually confirmed by mass spectrometry experts. The similar concept of spectral archives was also recently proposed to denote spectral libraries including “interesting” nonidentified spectra (11) (i.e. recurring spectra with good de novo reconstructions but no database match). The growing availability of these large collections of MS/MS spectra has reignited the development of alternative peptide identification approaches based on spectral matching (12–14) and alignment (15–17) algorithms.

However, mainstream approaches were developed under the (often unstated) assumption that each MS/MS spectrum is generated from a single peptide. Although chromatographic procedures greatly contribute to making this a reasonable assumption, there are several situations where it is difficult or even impossible to separate pairs of peptides. Examples include certain permutations of the peptide sequence or post-translational modifications (see (18) for examples of co-eluting histone modification variants). In addition, innovative experimental setups have demonstrated the potential for increased throughput in peptide identification using mixture spectra; examples include data-independent acquisition (19), ion-mobility MS (20), and MSE strategies (21).

To alleviate the algorithmic bottleneck in such scenarios, we describe a computational approach, M-SPLIT (mixture-spectrum partitioning using library of identified tandem mass spectra), that is able to reliably and efficiently identify peptides from mixture spectra, which are generated from a pair of peptides. In brief, a mixture spectrum is modeled as a linear combination of two single-peptide spectra, and peptide identification is done by searching against a spectral library. 
We show that efficient filtration and accurate branch-and-bound strategies can be used to avoid the huge computational cost of searching all possible pairs. Thus equipped, our approach is able to identify the correct matches by considering only a minuscule fraction of all possible matches. Beyond potentially enhancing the identification capabilities of current MS/MS acquisition setups, we argue that the availability of methods to reliably identify MS/MS spectra from mixtures of peptides could enable the collection of MS/MS data using accelerated chromatography setups to obtain the same or better peptide identification results in a fraction of the experimental time currently required for exhaustive peptide separation.
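
The model stated in this abstract, a mixture spectrum as a linear combination of two single-peptide library spectra, can be sketched as an exhaustive search. This is an illustration under stated assumptions rather than M-SPLIT itself: spectra are equal-length binned intensity vectors, similarity is plain cosine, the abundance ratio is taken from a small fixed grid, and every pair is tried (the abstract's filtration and branch-and-bound steps exist precisely to avoid this). The names `cosine` and `best_pair` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def best_pair(mixture, library, alphas=(0.1, 0.2, 0.5, 1.0)):
    """Find the library pair A + alpha * B most similar to `mixture`.

    library: dict mapping a peptide name to its binned spectrum.
    Returns (similarity, major_name, minor_name, alpha).
    """
    best = (0.0, None, None, None)
    for (name_a, a), (name_b, b) in combinations(library.items(), 2):
        # Try each peptide of the pair as the more abundant component.
        for major, minor, n_major, n_minor in ((a, b, name_a, name_b),
                                               (b, a, name_b, name_a)):
            for alpha in alphas:
                combined = [x + alpha * y for x, y in zip(major, minor)]
                sim = cosine(mixture, combined)
                if sim > best[0]:
                    best = (sim, n_major, n_minor, alpha)
    return best
```

With a toy library of three non-overlapping spectra and a mixture built as `P1 + 0.5 * P2`, the search recovers `P1` as the major component, `P2` as the minor one, and `alpha = 0.5` with similarity of essentially 1. The cost grows quadratically with library size, which is the bottleneck the abstract's theoretical bounds are meant to remove.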

16.
Mitochondrial functions are dynamically regulated in the heart. In particular, protein phosphorylation has been shown to be a key mechanism modulating mitochondrial function in diverse cardiovascular phenotypes. However, site-specific phosphorylation information remains scarce for this organ. Accordingly, we performed a comprehensive characterization of murine cardiac mitochondrial phosphoproteome in the context of mitochondrial functional pathways. A platform using the complementary fragmentation technologies of collision-induced dissociation (CID) and electron transfer dissociation (ETD) demonstrated successful identification of a total of 236 phosphorylation sites in the murine heart; 210 of these sites were novel. These 236 sites were mapped to 181 phosphoproteins and 203 phosphopeptides. Among those identified, 45 phosphorylation sites were captured only by CID, whereas 185 phosphorylation sites, including a novel modification on ubiquinol-cytochrome c reductase protein 1 (Ser-212), were identified only by ETD, underscoring the advantage of a combined CID and ETD approach. The biological significance of the cardiac mitochondrial phosphoproteome was evaluated. Our investigations illustrated key regulatory sites in murine cardiac mitochondrial pathways as targets of phosphorylation regulation, including components of the electron transport chain (ETC) complexes and enzymes involved in metabolic pathways (e.g. tricarboxylic acid cycle). Furthermore, calcium overload injured cardiac mitochondrial ETC function, whereas enhanced phosphorylation of ETC via application of phosphatase inhibitors restored calcium-attenuated ETC complex I and complex III activities, demonstrating positive regulation of ETC function by phosphorylation. Moreover, in silico analyses of the identified phosphopeptide motifs illuminated the molecular nature of participating kinases, which included several known mitochondrial kinases (e.g. 
pyruvate dehydrogenase kinase) as well as kinases whose mitochondrial location was not previously appreciated (e.g. Src). In conclusion, the phosphorylation events defined herein advance our understanding of cardiac mitochondrial biology, facilitating the integration of the still fragmentary knowledge about mitochondrial signaling networks, metabolic pathways, and intrinsic mechanisms of functional regulation in the heart.

Mitochondria are the source of energy to sustain life. In addition to their evolutionary origin as an energy-producing organelle, their functionality has integrated into every aspect of life, including the cell cycle, ROS production, apoptosis, and ion balance (1, 2). Our understanding of mitochondrial biology is still growing. Several systems biology approaches have been dedicated to exploring the molecular infrastructure and dynamics of the functional versatility associated with this organelle (3–5).

To meet tissue-specific functional demands, mitochondria acquire heterogeneous properties in individual organs, a first statement of their plasticity in function and proteome composition (1, 6). The heterogeneity is evident even in an individual cardiomyocyte (7). A catalogue of the cardiac mitochondrial proteome is emerging via a joint effort (3–5). The dynamics of the mitochondrial proteome manifest at multiple levels, including post-translational modifications, such as phosphorylation. Our investigative goal is to decode this organellar proteome and its post-translational modification in a biological and functional context. In cardiomyocytes, mitochondria are also constantly exposed to fluctuation in energy demands and in ionic conditions. The capacity of mitochondria to cope with such a dynamic environment is essential for the functional role of mitochondria in normal and disease phenotypes (8–10). Unique protein features enabling the mitochondrial proteome to adapt to these biological changes can be interrogated by proteomics tools (10–12). 
Protein phosphorylation as a rapid and reversible chemical event is an integral component of these protein features (12–14).
It has been estimated that one-third of cellular proteins exist in a phosphorylated state at least one time in their lifetime (15). However, only a handful of phosphorylation events have been identified to tune mitochondrial functionality (13, 14, 16) despite the fact that the first demonstration of phosphorylation was reported on a mitochondrial protein more than 5 decades ago (17). Kinases and phosphatases comprise nearly 3% of the human genome (18, 19). In mitochondria, ∼30 kinases and phosphatases have been identified thus far within the expected organellar proteome of a few thousand (3–5, 16). The number of identified mitochondrial phosphoproteins is far below one-third of its proteome size (20). Thus, it appears that the current pool of reported phosphoproteins represents only a small fraction of the anticipated mitochondrial phosphoproteome. The seminal studies from several groups (12–14, 16) demonstrated the prevalence as well as the dynamic nature of phosphorylation in cardiac mitochondria, suggesting that obtaining a comprehensive map of the mitochondrial phosphoproteome is feasible.
In this study, we took a systematic approach to tackle the phosphorylation of murine cardiac mitochondrial pathways. We applied the unique strengths of both electron transfer dissociation (ETD) and collision-induced dissociation (CID) LC-MS/MS to screen phosphorylation events in a site-specific fashion. A total of 236 phosphorylation sites in 203 unique phosphopeptides were identified and mapped to 181 phosphoproteins. Novel phosphorylation modifications were discovered in diverse pathways of mitochondrial biology, including ion balance, proteolysis, and apoptosis. Consistent with the role of mitochondria as the major source of energy production under delicate control, metabolic pathways claimed one-third of phosphorylation sites captured in this analysis. 
To study molecular players steering mitochondrial phosphorylation, we probed the effects of calcium loading on phosphorylation. In addition, a number of kinases with previously unappreciated mitochondrial residence are suggested as potential players modulating mitochondrial pathways. Taken together, the cohort of novel phosphorylation events discovered in this study constitutes an essential step toward the full delineation of the cardiac mitochondrial phosphoproteome.

17.
Database search programs are essential tools for identifying peptides via mass spectrometry (MS) in shotgun proteomics. Simultaneously achieving high sensitivity and high specificity during a database search is crucial for improving proteome coverage. Here we present JUMP, a new hybrid database search program that generates amino acid tags and ranks peptide spectrum matches (PSMs) by an integrated score from the tags and pattern matching. In a typical run of liquid chromatography coupled with high-resolution tandem MS, more than 95% of MS/MS spectra can generate at least one tag, whereas the remaining spectra are usually too poor to derive genuine PSMs. To enhance search sensitivity, the JUMP program enables the use of tags as short as one amino acid. Using a target-decoy strategy, we compared JUMP with other programs (e.g. SEQUEST, Mascot, PEAKS DB, and InsPecT) in the analysis of multiple datasets and found that JUMP outperformed these preexisting programs. JUMP also permitted the analysis of multiple co-fragmented peptides from “mixture spectra” to further increase PSMs. In addition, JUMP-derived tags allowed partial de novo sequencing and facilitated the unambiguous assignment of modified residues. In summary, JUMP is an effective database search algorithm complementary to current search programs.
Peptide identification by tandem mass spectra is a critical step in mass spectrometry (MS)-based proteomics (1). Numerous computational algorithms and software tools have been developed for this purpose (2–6). These algorithms can be classified into three categories: (i) pattern-based database search, (ii) de novo sequencing, and (iii) hybrid search that combines database search and de novo sequencing. With the continuous development of high-performance liquid chromatography and high-resolution mass spectrometers, it is now possible to analyze almost all protein components in mammalian cells (7). 
In contrast to rapid data collection, it remains a challenge to extract accurate information from the raw data to identify peptides with low false positive rates (specificity) and minimal false negatives (sensitivity) (8).
Database search methods usually assign peptide sequences by comparing MS/MS spectra to theoretical peptide spectra predicted from a protein database, as exemplified in SEQUEST (9), Mascot (10), OMSSA (11), X!Tandem (12), Spectrum Mill (13), ProteinProspector (14), MyriMatch (15), Crux (16), MS-GFDB (17), Andromeda (18), BaMS2 (19), and Morpheus (20). Some other programs, such as SpectraST (21) and Pepitome (22), utilize a spectral library composed of experimentally identified and validated MS/MS spectra. These methods use a variety of scoring algorithms to rank potential peptide spectrum matches (PSMs) and select the top hit as a putative PSM. However, not all PSMs are correctly assigned. For example, false peptides may be assigned to MS/MS spectra with numerous noisy peaks and poor fragmentation patterns. If the samples contain unknown protein modifications, mutations, and contaminants, the related MS/MS spectra also result in false positives, as their corresponding peptides are not in the database. Other false positives may be generated simply by random matches. Therefore, it is important to remove these false PSMs to improve dataset quality. One common approach is to filter putative PSMs to achieve a final list with a predefined false discovery rate (FDR) via a target-decoy strategy, in which decoy proteins are merged with target proteins in the same database for estimating false PSMs (23–26). However, the true and false PSMs are not always distinguishable based on matching scores. 
Setting a score threshold that achieves both maximal sensitivity and high specificity therefore remains difficult (13, 27, 28).
De novo methods, including Lutefisk (29), PEAKS (30), NovoHMM (31), PepNovo (32), pNovo (33), Vonovo (34), and UniNovo (35), identify peptide sequences directly from MS/MS spectra. These methods can be used to derive novel peptides and post-translational modifications without a database, which is useful, especially when the related genome is not sequenced. High-resolution MS/MS spectra greatly facilitate the generation of peptide sequences in these de novo methods. However, because MS/MS fragmentation cannot always produce all predicted product ions, only a portion of collected MS/MS spectra have sufficient quality to extract partial or full peptide sequences, leading to lower sensitivity than achieved with the database search methods.
To improve the sensitivity of the de novo methods, a hybrid approach has been proposed to integrate peptide sequence tags into PSM scoring during database searches (36). Numerous software packages have been developed, such as GutenTag (37), InsPecT (38), Byonic (39), DirecTag (40), and PEAKS DB (41). These methods use peptide tag sequences to filter a protein database, followed by error-tolerant database searching. One restriction in most of these algorithms is the requirement of a minimum tag length of three amino acids for matching protein sequences in the database. This restriction reduces the sensitivity of the database search, because it filters out some high-quality spectra in which consecutive tags cannot be generated.
In this paper, we describe JUMP, a novel tag-based hybrid algorithm for peptide identification. The program is optimized to balance sensitivity and specificity during tag derivation and MS/MS pattern matching. JUMP can use all potential sequence tags, including tags consisting of only one amino acid. 
When we compared its performance to that of two widely used search algorithms, SEQUEST and Mascot, JUMP identified ∼30% more PSMs at the same FDR threshold. In addition, the program provides two additional features: (i) using tag sequences to improve modification site assignment, and (ii) analyzing co-fragmented peptides from mixture MS/MS spectra.
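The target-decoy filtering described in this entry can be illustrated with a minimal sketch. It is not JUMP's implementation: the `PSM` record, the `filter_by_fdr` function, and the simple estimate FDR ≈ decoys / targets above a score cutoff are all assumptions chosen for illustration (other estimators are also in common use).

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """Hypothetical peptide spectrum match record (illustrative only)."""
    spectrum_id: str
    score: float      # higher means a better match
    is_decoy: bool    # True if the matched peptide came from a decoy protein


def filter_by_fdr(psms, max_fdr=0.01):
    """Keep target PSMs above the loosest score cutoff whose estimated FDR
    stays at or below max_fdr.

    Assumption: FDR at a cutoff is estimated as (#decoys / #targets) among
    all PSMs scoring at or above that cutoff.
    """
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = 0
    decoys = 0
    best_cut = 0  # number of top-ranked PSMs to accept
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        fdr = decoys / targets if targets else 1.0
        if fdr <= max_fdr:
            best_cut = i
    # Report only target hits; decoys exist solely to estimate the error rate.
    return [p for p in ranked[:best_cut] if not p.is_decoy]
```

Scanning the whole ranked list, rather than stopping at the first decoy, lets the cutoff extend past an isolated decoy hit as long as the running estimate stays within the requested FDR.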

18.
19.
Understanding how a small brain region, the suprachiasmatic nucleus (SCN), can synchronize the body's circadian rhythms is an ongoing research area. This important time-keeping system requires a complex suite of peptide hormones and transmitters that remain incompletely characterized. Here, capillary liquid chromatography and FTMS have been coupled with tailored software for the analysis of endogenous peptides present in the SCN of the rat brain. After ex vivo processing of brain slices, peptide extraction, identification, and characterization from tandem FTMS data with <5-ppm mass accuracy produced a hyperconfident list of 102 endogenous peptides, including 33 previously unidentified peptides, and 12 peptides that were post-translationally modified with amidation, phosphorylation, pyroglutamylation, or acetylation. This characterization of endogenous peptides from the SCN will aid in understanding the molecular mechanisms that mediate rhythmic behaviors in mammals.
Central nervous system neuropeptides function in cell-to-cell signaling and are involved in many physiological processes such as circadian rhythms, pain, hunger, feeding, and body weight regulation (1–4). Neuropeptides are produced from larger protein precursors by the selective action of endopeptidases, which cleave at mono- or dibasic sites and then remove the C-terminal basic residues (1, 2). Some neuropeptides undergo functionally important post-translational modifications (PTMs), including amidation, phosphorylation, pyroglutamylation, or acetylation. These aspects of peptide synthesis impact the properties of neuropeptides, further expanding their diverse physiological implications. Therefore, unveiling new peptides and unreported peptide properties is critical to advancing our understanding of nervous system function.
Historically, the analysis of neuropeptides was performed by Edman degradation in which the N-terminal amino acid is sequentially removed. 
However, analysis by this method is slow and does not allow for sequencing of the peptides containing N-terminal PTMs (5). Immunological techniques, such as radioimmunoassay and immunohistochemistry, are used for measuring relative peptide levels and spatial localization, but these methods only detect peptide sequences with known structure (6). More direct, high throughput methods of analyzing brain regions can be used.
Mass spectrometry, a rapid and sensitive method that has been used for the analysis of complex biological samples, can detect and identify the precise forms of neuropeptides without prior knowledge of peptide identity, with these approaches making up the field of peptidomics (7–12). The direct tissue and single neuron analysis by MALDI MS has enabled the discovery of hundreds of neuropeptides in the last decade, and the neuronal homogenate analysis by fractionation and subsequent ESI or MALDI MS has yielded an equivalent number of new brain peptides (5). Several recent peptidome studies, including the work by Dowell et al. (10), have used the specificity of FTMS for peptide discovery (10, 13–15). Here, we combine the ability to fragment ions at ultrahigh mass accuracy (16) with a software pipeline designed for neuropeptide discovery. We use nanocapillary reversed-phase LC coupled to 12 Tesla FTMS for the analysis of peptides present in the suprachiasmatic nucleus (SCN) of rat brain.
A relatively small, paired brain nucleus located at the base of the hypothalamus directly above the optic chiasm, the SCN contains a biological clock that generates circadian rhythms in behaviors and homeostatic functions (17, 18). The SCN comprises ∼10,000 cellular clocks that are integrated as a tissue level clock which, in turn, orchestrates circadian rhythms throughout the brain and body. 
It is sensitive to incoming signals from the light-sensing retina and other brain regions, which cause temporal adjustments that align the SCN appropriately with changes in environmental or behavioral state. Previous physiological studies have implicated peptides as critical synchronizers of normal SCN function as well as mediators of SCN inputs, internal signal processing, and outputs; however, only a small number of peptides have been identified and explored in the SCN, leaving unresolved many circadian mechanisms that may involve peptide function.
Most peptide expression in the SCN has only been studied through indirect antibody-based techniques (19–29), although we recently used MS approaches to characterize several peptides detected in SCN releasates (30). Previous studies indicate that the SCN expresses a rich diversity of peptides relative to other brain regions studied with the same techniques. Previously used immunohistochemical approaches are not only inadequate for comprehensively evaluating PTMs and alternate isoforms of known peptides but are also incapable of exhaustively examining the full peptide complement of this complex biological network of peptidergic inputs and intrinsic components. A comprehensive study of SCN peptidomics is required that utilizes high resolution strategies for directly analyzing the peptide content of the neuronal networks comprising the SCN.
In our study, the SCN was obtained from ex vivo coronal brain slices via tissue punch and subjected to multistage peptide extraction. The SCN tissue extract was analyzed by FTMS/MS, and the high resolution MS and MS/MS data were processed using ProSightPC 2.0 (16), which allows the identification and characterization of peptides or proteins from high mass accuracy MS/MS data. In addition, the Sequence Gazer included in ProSightPC was used for manually determining PTMs (31, 32). 
As a result, a total of 102 endogenous peptides were identified, including 33 that were previously unidentified, and 12 PTMs (including amidation, phosphorylation, pyroglutamylation, and acetylation) were found. The present study is the first comprehensive peptidomics study for identifying peptides present within the mammalian SCN. In fact, this is one of the first peptidome studies to work with discrete brain nuclei as opposed to larger brain structures and follows up on our recent report using LC-ion trap for analysis of the peptides in the supraoptic nucleus (33); here, the use of FTMS allows a greater range of PTMs to be confirmed and allows higher confidence in the peptide assignments. This information on the peptides in the SCN will serve as a basis to more exhaustively explore the extent that previously unreported SCN neuropeptides may function in SCN regulation of mammalian circadian physiology.
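For readers unfamiliar with the "<5-ppm mass accuracy" criterion quoted in this entry, the sketch below shows how a parts-per-million mass error is conventionally computed. The function names and the 5-ppm default are illustrative assumptions; they are not taken from ProSightPC or from the study's pipeline.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=5.0):
    """True if the absolute mass error is below tol_ppm (hypothetical helper)."""
    return abs(ppm_error(observed_mass, theoretical_mass)) < tol_ppm
```

Because the error is relative, the same ppm tolerance corresponds to a wider absolute window for heavier peptides: at 5 ppm, about 0.005 Da for a 1000 Da peptide and about 0.015 Da for a 3000 Da peptide.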

20.
Although K-Ras, Cdc42, and PAK4 signaling are commonly deregulated in cancer, only a few studies have sought to comprehensively examine the spectrum of phosphorylation-mediated signaling downstream of each of these key signaling nodes. In this study, we completed a label-free quantitative analysis of oncogenic K-Ras, activated Cdc42, and PAK4-mediated phosphorylation signaling, and report relative quantitation of 2152 phosphorylated peptides on 1062 proteins. We define the overlap in phosphopeptides regulated by K-Ras, Cdc42, and PAK4, and find that perturbation of these signaling components affects phosphoproteins associated with microtubule depolymerization, cytoskeletal organization, and the cell cycle. These findings provide a resource for future studies to characterize novel targets of oncogenic K-Ras signaling and validate biomarkers of PAK4 inhibition.
The Ras oncoproteins are small monomeric GTPases that transduce mitogenic signals from cell surface receptor tyrosine kinases (RTKs) to intracellular serine/threonine kinases. Approximately thirty percent of human tumors harbor a somatic gain-of-function mutation in one of three RAS genes, resulting in the constitutive activation of Ras signaling and the aberrant hyperactivation of growth-promoting effector pathways (1). Designing therapeutic agents that directly target Ras has been challenging (2, 3), and thus clinical development efforts have focused on targeting effector pathways downstream of Ras. The Raf-MEK-ERK and PI3K-Akt effector pathways have been extensively studied and several small molecule inhibitors targeting these pathways are currently under clinical evaluation (4, 5). However, biochemical studies and mouse models indicate that several additional effector pathways are essential for Ras-driven transformation and tumorigenesis (6–11). 
Hence, a comprehensive characterization of these effector pathways may reveal additional druggable targets.
The Rho GTPase Cdc42 lies downstream of Ras (12–14) and regulates many cellular processes that are commonly perturbed in cancer, including migration, polarization, and proliferation (15) (Fig. 1A). Importantly, Cdc42 is overexpressed in several types of human cancer (16–20) and is required for Ras-driven cellular transformation (13, 21, 22). Recent studies show that genetic ablation of Cdc42 impairs Ras-driven tumorigenesis (13), indicating the potential of Cdc42 and its effectors as drug targets in Ras mutant tumors.
Fig. 1. Experimental workflow. A, K-Ras is a small GTPase that regulates the activity of a variety of downstream proteins including the Rho GTPase Cdc42. The PAK4 serine/threonine kinase is a direct effector of Cdc42 and regulates actin reorganization, microtubule stability, and cell polarity. B, To measure large-scale phosphorylation changes induced by constitutive K-Ras or Cdc42 signaling or PAK4 ablation, the quantitative label-free PTMScan® approach was employed (Cell Signaling Technology). Briefly, for each condition extracted proteins were digested with trypsin and separated from non-peptide material by solid-phase extraction with Sep-Pak C18 cartridges. Three phosphorylation motif antibodies were used serially to isolate phosphorylated peptides in independent immunoaffinity purifications (CDK substrate motif [K R]-pS-P-X-[K R], CK substrate motif pT-[D E]-X-[D E], PKD substrate motif L-X-R-X-X-p[S T]). The samples were run in duplicate and tandem mass spectra were collected with an LTQ-Orbitrap hybrid mass spectrometer. pLPC is an empty vector control.
In particular, the p21-activated kinases (PAKs) are Cdc42 effectors that have generated significant interest (23, 24), as they are central components of key oncogenic signaling pathways and regulate cytoskeletal organization, cell migration, and nuclear signaling (25). 
The PAK family comprises six members and is subdivided into two groups (Groups I and II) based on sequence and structural homology. Group I PAKs (PAK1–3) are relatively well characterized; however, much less is known regarding the function and regulation of Group II PAKs (PAK4–6). The kinase domains of Group I and II PAKs share only about 50% identity, suggesting the two groups may recognize distinct substrates and govern unique cellular processes (26).
The Group II PAK family member PAK4 is of particular interest as it is overexpressed or genetically amplified in several lung, colon, prostate, pancreas, and breast tumor cell lines and samples (26–30). Furthermore, functional studies have implicated PAK4 in cell transformation, cell invasion, and migration (27, 31). Xenograft studies in athymic mice show an important role for PAK4 in mediating Cdc42- or K-Ras-driven tumor formation, highlighting a critical role for PAK4 downstream of these GTPases (32). Given its roles in transformation, tumorigenesis, and oncogenic signaling, there is significant interest in targeting PAK4 therapeutically (23). PAK4 binds and phosphorylates several proteins involved in cytoskeletal organization and apoptosis, including Lim domain kinase 1 (LIMK1) (33), guanine nucleotide exchange factor-H1 (GEF-H1) (34), Raf-1 (35), and Bad (36). However, the Group I PAK family member PAK1 also phosphorylates several of these PAK4 targets (37). Thus, there remains a need to identify robust and selective pharmacodynamic biomarkers for PAK4 inhibition.
Despite the importance of PAK4 and its upstream regulators in cancer development, few studies have sought to comprehensively characterize the spectrum of K-Ras, Cdc42, or PAK4 mediated phosphorylation signaling (37–39). Recent developments in mass spectrometry allow the in-depth identification and quantitation of thousands of phosphorylation sites (40–43). 
The majority of large-scale efforts have aimed to identify the basal phosphoproteomes of different species (44, 45) or tissues (46) to characterize global steady-state phosphorylation. However, this methodology can also be applied to quantify perturbed phosphorylation regulation in cancer signaling pathways (40, 47–49), and has the potential to reveal novel biomarkers of oncogenic signaling.
In this study, we completed a label-free quantitative analysis of K-Ras, Cdc42, and PAK4 phosphorylation signaling using the PTMScan® method, which has proven to be a robust and reproducible quantitation technology (50, 51). We quantified phosphorylation levels in wild-type and PAK4 knockout NIH3T3 cells expressing oncogenic K-Ras, activated Cdc42, or an empty vector control to elucidate the molecular pathways and functions modulated by these key signaling proteins. We report relative quantitation of 2152 phosphorylated peptides on 1062 proteins among the different conditions, and find that many of the regulated phosphoproteins are associated with microtubule depolymerization, cytoskeletal organization, and the cell cycle. To our knowledge, our study is the first to examine the overlap among signaling networks regulated by K-Ras, Cdc42, and PAK4, and provides a resource for future studies to further interrogate the perturbation of this signaling pathway.
