Quantifying the similarity of spectra is an important task in various areas of spectroscopy, for example, to identify a compound by comparing sample spectra to those of reference standards. In mass spectrometry based discovery proteomics, spectral comparisons are used to infer the amino acid sequence of peptides. In targeted proteomics by selected reaction monitoring (SRM) or SWATH MS, predetermined sets of fragment ion signals integrated over chromatographic time are used to identify target peptides in complex samples. In both cases, confidence in peptide identification is directly related to the quality of spectral matches. In this study, we used sets of simulated spectra of well-controlled dissimilarity to benchmark different spectral comparison measures and to develop a robust scoring scheme that quantifies the similarity of fragment ion spectra. We applied the normalized spectral contrast angle score to quantify the similarity of spectra to objectively assess fragment ion variability of tandem mass spectrometric datasets, to evaluate portability of peptide fragment ion spectra for targeted mass spectrometry across different types of mass spectrometers and to discriminate target assays from decoys in targeted proteomics. Altogether, this study validates the use of the normalized spectral contrast angle as a sensitive spectral similarity measure for targeted proteomics, and more generally provides a methodology to assess the performance of spectral comparisons and to support the rational selection of the most appropriate similarity measure. The algorithms used in this study are made publicly available as an open source toolset with a graphical user interface.
In “bottom-up” proteomics, peptide sequences are identified by the information contained in their fragment ion spectra (1). Various methods have been developed to generate peptide fragment ion spectra and to match them to their corresponding peptide sequences.
They can be broadly grouped into discovery and targeted methods. In the widely used discovery (also referred to as shotgun) proteomic approach, peptides are identified by establishing peptide to spectrum matches via a method referred to as database searching. Each acquired fragment ion spectrum is searched against theoretical peptide fragment ion spectra computed from the entries of a specified sequence database, whereby the database search space is constrained to a user-defined precursor mass tolerance (2, 3). The quality of the match between experimental and theoretical spectra is typically expressed with multiple scores. These include the number of matching or nonmatching fragments and the number of consecutive fragment ion matches, among others. With few exceptions (4–7), commonly used search engines do not use the relative intensities of the acquired fragment ion signals, even though this information could be expected to strengthen the confidence of peptide identification because the relative fragment ion intensity pattern acquired under controlled fragmentation conditions can be considered a unique “fingerprint” for a given precursor. Thanks to community efforts in acquiring and sharing large numbers of datasets, the proteomes of some species are now essentially mapped out and experimental fragment ion spectra covering entire proteomes are increasingly becoming accessible through spectral databases (8–16). This has catalyzed the emergence of new proteomics strategies that differ from classical database searching in that they use prior spectral information to identify peptides. These comprise inclusion list sequencing (directed sequencing), spectral library matching, and targeted proteomics (17). These methods explicitly use the information contained in empirical fragment ion spectra, including the fragment ion signal intensity, to identify the target peptide.
For these methods, it is therefore of highest importance to accurately control and quantify the degree of reproducibility of the fragment ion spectra across experiments, instruments, labs, and methods, and to quantitatively assess the similarity of spectra. To date, the dot product (18–24), its corresponding arccosine spectral contrast angle (25–27), (Pearson-like) spectral correlation (28–31), and other geometrical distance measures (18, 32) have been used in the literature for assessing spectral similarity. These measures have been used in different contexts including shotgun spectra clustering (19, 26), spectral library searching (18, 20, 21, 24, 25, 27–29), cross-instrument fragmentation comparisons (22, 30) and for scoring transitions in targeted proteomics analyses such as selected reaction monitoring (SRM) (23, 31). However, to our knowledge, those scores have never been objectively benchmarked for their performance in discriminating well-defined levels of dissimilarities between spectra. In particular, similarity scores obtained by different methods have not yet been compared for targeted proteomics applications, where the sensitive discrimination of highly similar spectra is critical for the confident identification of targeted peptides.
In this study, we have developed a method to objectively assess the similarity of fragment ion spectra. We provide an open-source toolset that supports these analyses. Using a computationally generated benchmark spectral library with increasing levels of well-controlled spectral dissimilarity, we performed a comprehensive and unbiased comparison of the performance of the main scores used to assess spectral similarity in mass spectrometry.
We then exemplify how this method, in conjunction with its corresponding benchmarked perturbation spectra set, can be applied to answer several relevant questions for MS-based proteomics.
As a first application, we show that it can efficiently assess the absolute levels of peptide fragmentation variability inherent to any given mass spectrometer. By comparing the instrument's intrinsic fragmentation conservation distribution to that of the benchmarked perturbation spectra set, nominal values of spectral similarity scores can indeed be translated into a more directly understandable percentage of variability inherent to the instrument fragmentation. As a second application, we show that the method can be used to derive an absolute measure to estimate the conservation of peptide fragmentation between instruments or across proteomics methods. This allowed us to quantitatively evaluate, for example, the transferability of fragment ion spectra acquired by data dependent analysis in a first instrument into a fragment/transition assay list used for targeted proteomics applications (e.g. SRM or targeted extraction of data independent acquisition SWATH MS (33)) on another instrument. Third, we used the method to probe the fragmentation patterns of peptides carrying a post-translational modification (e.g. phosphorylation) by comparing the spectra of modified peptides with those of their unmodified counterparts. Finally, we used the method to determine the overall level of fragmentation conservation that is required to support target-decoy discrimination and peptide identification in targeted proteomics approaches such as SRM and SWATH MS.
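The similarity measures named in this entry (dot product, spectral contrast angle, Pearson-like correlation) can be illustrated with a minimal sketch. It assumes that both spectra have already been reduced to intensity vectors over the same ordered list of fragment ions, and it uses one common normalization of the contrast angle, 1 − 2θ/π; the entry itself does not spell out the formula, so the function names and that normalization are assumptions rather than the published toolset's API.

```python
import math


def dot_product(a, b):
    """Normalized dot product (cosine similarity) of two intensity vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must list the same fragment ions in the same order")
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if den == 0:
        raise ValueError("a spectrum with zero total intensity has no direction")
    # Clamp to guard against floating-point drift just outside [-1, 1].
    return max(-1.0, min(1.0, num / den))


def spectral_contrast_angle(a, b):
    """Angle in radians between the spectra: arccosine of the normalized dot product."""
    return math.acos(dot_product(a, b))


def normalized_spectral_contrast_angle(a, b):
    """Assumed normalization: 1 - 2*theta/pi, so non-negative spectra score in [0, 1]."""
    return 1.0 - 2.0 * spectral_contrast_angle(a, b) / math.pi


def pearson_correlation(a, b):
    """Pearson correlation of the two intensity vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must list the same fragment ions in the same order")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    den = math.sqrt(sum(x * x for x in da)) * math.sqrt(sum(y * y for y in db))
    if den == 0:
        raise ValueError("correlation is undefined for a constant intensity vector")
    return sum(x * y for x, y in zip(da, db)) / den


# Hypothetical intensities for six fragment ions of one peptide on two instruments.
reference = [100.0, 80.0, 55.0, 30.0, 12.0, 5.0]
measured = [95.0, 88.0, 50.0, 36.0, 10.0, 7.0]
print(dot_product(reference, measured))
print(normalized_spectral_contrast_angle(reference, measured))
print(pearson_correlation(reference, measured))
```

Because the arccosine is steep near a dot product of 1, the angle-based score spreads out highly similar spectra more than the dot product does, which is consistent with the entry's emphasis on discriminating highly similar spectra.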

Top-down mass spectrometry (MS)-based proteomics is arguably a disruptive technology for the comprehensive analysis of all proteoforms arising from genetic variation, alternative splicing, and posttranslational modifications (PTMs). However, the complexity of top-down high-resolution mass spectra presents a significant challenge for data analysis. In contrast to the well-developed software packages available for data analysis in bottom-up proteomics, the data analysis tools in top-down proteomics remain underdeveloped. Moreover, despite recent efforts to develop algorithms and tools for the deconvolution of top-down high-resolution mass spectra and the identification of proteins from complex mixtures, a multifunctional software platform, which allows for the identification, quantitation, and characterization of proteoforms with visual validation, is still lacking. Herein, we have developed MASH Suite Pro, a comprehensive software tool for top-down proteomics with multifaceted functionality. MASH Suite Pro is capable of processing high-resolution MS and tandem MS (MS/MS) data using two deconvolution algorithms to optimize protein identification results. In addition, MASH Suite Pro allows for the characterization of PTMs and sequence variations, as well as the relative quantitation of multiple proteoforms in different experimental conditions. The program also provides visualization components for validation and correction of the computational outputs. Furthermore, MASH Suite Pro facilitates data reporting and presentation via direct output of the graphics. Thus, MASH Suite Pro significantly simplifies and speeds up the interpretation of high-resolution top-down proteomics data by integrating tools for protein identification, quantitation, characterization, and visual validation into a customizable and user-friendly interface. 
We envision that MASH Suite Pro will play an integral role in advancing the burgeoning field of top-down proteomics.
With well-developed algorithms and computational tools for mass spectrometry (MS) data analysis, peptide-based bottom-up proteomics has gained considerable popularity in the field of systems biology (1–9). Nevertheless, the bottom-up approach is suboptimal for the analysis of protein posttranslational modifications (PTMs) and sequence variants as a result of protein digestion (10). Alternatively, the protein-based top-down proteomics approach analyzes intact proteins, which provides a “bird's eye” view of all proteoforms (11), including those arising from sequence variations, alternative splicing, and diverse PTMs, making it a disruptive technology for the comprehensive analysis of proteoforms (12–24). However, the complexity of top-down high-resolution mass spectra presents a significant challenge for data analysis. In contrast to the well-developed software packages available for processing data from bottom-up proteomics experiments, the data analysis tools in top-down proteomics remain underdeveloped.
The initial step in the analysis of top-down proteomics data is deconvolution of high-resolution mass and tandem mass spectra. Thorough high-resolution analysis of spectra by Horn (THRASH), which was the first algorithm developed for the deconvolution of high-resolution mass spectra (25), is still widely used. THRASH automatically detects and evaluates individual isotopomer envelopes by comparing the experimental isotopomer envelope with a theoretical envelope and reporting those that score higher than a user-defined threshold. Another commonly used algorithm, MS-Deconv, utilizes a combinatorial approach to address the difficulty of grouping MS peaks from overlapping isotopomer envelopes (26).
UniDec, a more recent algorithm that employs a Bayesian approach to separate mass and charge dimensions (27), can also be applied to the deconvolution of high-resolution spectra. Although these algorithms assist in data processing, unfortunately, the deconvolution results often contain a considerable number of misassigned peaks as a consequence of the complexity of the high-resolution MS and MS/MS data generated in top-down proteomics experiments. Errors such as these can undermine the accuracy of protein identification and PTM localization and, thus, necessitate the implementation of visual components that allow for the validation and manual correction of the computational outputs.
Following spectral deconvolution, a typical top-down proteomics workflow incorporates identification, quantitation, and characterization of proteoforms; however, most of the recently developed data analysis tools for top-down proteomics, including ProSightPC (28, 29), Mascot Top Down (also known as Big-Mascot) (30), MS-TopDown (31), and MS-Align+ (32), focus almost exclusively on protein identification. ProSightPC was the first software tool specifically developed for top-down protein identification. This software utilizes “shotgun annotated” databases (33) that include all possible proteoforms containing user-defined modifications. Consequently, ProSightPC is not optimized for identifying PTMs that are not defined by the user(s). Additionally, the inclusion of all possible modified forms within the database dramatically increases the size of the database and, thus, limits the search speed (32). Mascot Top Down (30) is based on standard Mascot but enables database searching using a higher mass limit for the precursor ions (up to 110 kDa), which allows for the identification of intact proteins. Protein identification using Mascot Top Down is fundamentally similar to that used in bottom-up proteomics (34), and, therefore, it is somewhat limited in terms of identifying unexpected PTMs.
MS-TopDown (31) employs the spectral alignment algorithm (35), which matches the top-down tandem mass spectra to proteins in the database without prior knowledge of the PTMs. Nevertheless, MS-TopDown lacks statistical evaluation of the search results and performs slowly when searching against large databases. MS-Align+ also utilizes spectral alignment for top-down protein identification (32). It is capable of identifying unexpected PTMs and allows for efficient filtering of candidate proteins when the top-down spectra are searched against a large protein database. MS-Align+ also provides statistical evaluation for the selection of proteoform spectrum matches (PrSMs) with high confidence. More recently, Top-Down Mass Spectrometry Based Proteoform Identification and Characterization (TopPIC) was developed (http://proteomics.informatics.iupui.edu/software/toppic/index.html). TopPIC is an updated version of MS-Align+ with increased spectral alignment speed and reduced computing requirements. In addition, MSPathFinder, developed by Kim et al., also allows for the rapid identification of proteins from top-down tandem mass spectra (http://omics.pnl.gov/software/mspathfinder) using spectral alignment. Although software tools employing spectral alignment, such as MS-Align+ and MSPathFinder, are particularly useful for top-down protein identification, these programs operate from the command line, making them difficult to use for those with limited knowledge of command syntax.
Recently, new software tools have been developed for proteoform characterization (36, 37). Our group previously developed MASH Suite, a user-friendly interface for the processing, visualization, and validation of high-resolution MS and MS/MS data (36). Another software tool, ProSight Lite, developed recently by the Kelleher group (37), also allows characterization of protein PTMs. However, both of these software tools require prior knowledge of the protein sequence for the effective localization of PTMs.
In addition, neither software tool can process data from liquid chromatography (LC)-MS and LC-MS/MS experiments, which limits their usefulness in large-scale top-down proteomics. Thus, despite these recent efforts, a multifunctional software platform enabling identification, quantitation, and characterization of proteins from top-down spectra, as well as visual validation and data correction, is still lacking.
Herein, we report the development of MASH Suite Pro, an integrated software platform, designed to incorporate tools for protein identification, quantitation, and characterization into a single comprehensive package for the analysis of top-down proteomics data. This program contains a user-friendly customizable interface similar to the previously developed MASH Suite (36) but also has a number of new capabilities, including the ability to handle complex proteomics datasets from LC-MS and LC-MS/MS experiments, as well as the ability to identify unknown proteins and PTMs using MS-Align+ (32). Importantly, MASH Suite Pro also provides visualization components for the validation and correction of the computational outputs, which ensures accurate and reliable deconvolution of the spectra and localization of PTMs and sequence variations.
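The deconvolution step described in this entry rests on simple arithmetic that can be sketched as follows. This is not THRASH, MS-Deconv, or UniDec; it only shows how a charge state can be read from the spacing of isotopic peaks and how an m/z value is then converted to a neutral mass. The sketch assumes positive-mode protonated ions and a single clean, non-overlapping isotopic envelope, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276466   # Da
ISOTOPE_SPACING = 1.0033548  # Da, mass difference between 13C and 12C


def charge_from_isotope_spacing(mz_peaks):
    """Estimate the charge state from the mean m/z spacing of adjacent isotopic peaks."""
    if len(mz_peaks) < 2:
        raise ValueError("at least two isotopic peaks are needed")
    peaks = sorted(mz_peaks)
    spacings = [hi - lo for lo, hi in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    # Adjacent isotopes of an ion with charge z are about 1.00335 / z apart in m/z.
    return max(1, round(ISOTOPE_SPACING / mean_spacing))


def neutral_mass(mz, charge):
    """Convert the m/z of an [M + zH]z+ ion to the neutral mass M."""
    return (mz - PROTON_MASS) * charge


# Hypothetical envelope: four isotopic peaks of a 10+ ion.
envelope = [1001.007276466 + i * 0.10033548 for i in range(4)]
z = charge_from_isotope_spacing(envelope)
print(z, neutral_mass(envelope[0], z))
```

Real top-down spectra contain overlapping envelopes and noisy spacings, which is why the dedicated algorithms named above score whole envelopes against theoretical ones instead of relying on peak spacing alone.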

Comprehensive proteomic profiling of biological specimens usually requires multidimensional chromatographic peptide fractionation prior to mass spectrometry. However, this approach can suffer from poor reproducibility because of the lack of standardization and automation of the entire workflow, thus compromising performance of quantitative proteomic investigations. To address these variables we developed an online peptide fractionation system comprising a multiphasic liquid chromatography (LC) chip that integrates reversed phase and strong cation exchange chromatography upstream of the mass spectrometer (MS). We showed superiority of this system for standardizing discovery and targeted proteomic workflows using cancer cell lysates and nondepleted human plasma. Five-step multiphase chip LC MS/MS acquisition showed clear advantages over analyses of unfractionated samples by identifying more peptides, consuming less sample and often improving the lower limits of quantitation, all in a highly reproducible, automated, online configuration. We further showed that multiphase chip LC fractionation provided a facile means to detect many N- and C-terminal peptides (including acetylated N terminus) that are challenging to identify in complex tryptic peptide matrices because of less favorable ionization characteristics. Given that as much as 95% of peptides were detected in only a single salt fraction from cell lysates, we exploited this high reproducibility and coupled it with multiple reaction monitoring on a high-resolution MS instrument (MRM-HR). This approach increased target analyte peak area and improved lower limits of quantitation without negatively influencing variance or bias. Further, we showed a strategy to use multiphase LC chip fractionation LC-MS/MS for ion library generation to integrate with SWATH™ data-independent acquisition quantitative workflows.
All MS data are available via ProteomeXchange with identifier PXD001464.
Mass spectrometry based proteomic quantitation is an essential technique used for contemporary, integrative biological studies. Whether used in discovery experiments or for targeted biomarker applications, quantitative proteomic studies require high reproducibility at many levels. They require reproducible run-to-run peptide detection, reproducible peptide quantitation, reproducible depth of proteome coverage, and ideally, a high degree of cross-laboratory analytical reproducibility. Mass spectrometry centered proteomics has evolved steadily over the past decade and is now mature enough to derive extensive draft maps of the human proteome (1, 2). Nonetheless, a key requirement yet to be realized is to ensure that quantitative proteomics can be carried out in a timely manner while satisfying the aforementioned challenges associated with reproducibility. This is especially important for recent developments using data independent MS quantitation and multiple reaction monitoring on high-resolution MS (MRM-HR) as they are both highly dependent on LC peptide retention time reproducibility and precursor detectability, while attempting to maximize proteome coverage (3). Strategies usually employed to increase the depth of proteome coverage utilize various sample fractionation methods including gel-based separation, affinity enrichment or depletion, protein or peptide chemical modification-based enrichment, and various peptide chromatography methods, particularly ion exchange chromatography (4–10). In comparison to an unfractionated “naive” sample, the trade-offs in using these enrichment/fractionation approaches are a higher risk of sample losses, introduction of undesired chemical modifications (e.g. oxidation, deamidation, N-terminal lactam formation), and the potential for result skewing and bias, as well as the considerable time and human resources required to perform the sample preparation tasks.
Online-coupled approaches aim to minimize those risks and address resource constraints. A widely practiced example of the benefits of online sample fractionation has been the decade-long use of combining strong cation exchange chromatography (SCX) with C18 reversed-phase (RP) for peptide fractionation (known as MudPIT – multidimensional protein identification technology), where SCX and RP are performed under the same buffer conditions and the SCX elution is performed with volatile organic cations compatible with reversed phase separation (11). This approach greatly increases analyte detection while avoiding sample handling losses. The MudPIT approach has been widely used for discovery proteomics (12–14), and we have previously shown that multiphasic separations also have utility for targeted proteomics when configured for selected reaction monitoring MS (SRM-MS). We showed substantial advantages of MudPIT-SRM-MS with reduced ion suppression, increased peak areas and lower limits of detection (LLOD) compared with conventional RP-SRM-MS (15).
To improve the reproducibility of proteomic workflows, increase throughput and minimize sample loss, numerous microfluidic devices have been developed and integrated for proteomic applications (16, 17). These devices can broadly be classified into two groups: (1) microfluidic chips for peptide separation (18–25); and (2) proteome reactors that combine enzymatic processing with peptide based fractionation (26–30). Because of the small dimension of these devices, they are readily able to integrate into nanoLC workflows. Various applications have been described including increasing proteome coverage (22, 27, 28) and targeting of phosphopeptides (24, 31, 32), glycopeptides and released glycans (29, 33, 34).
In this work, we set out to take advantage of the benefits of multiphasic peptide separations and address the reproducibility needs required for high-throughput comparative proteomics using a variety of workflows.
We integrated a multiphasic SCX and RP column in a “plug-and-play” microfluidic chip format for online fractionation, eliminating the need for users to make minimal dead volume connections between traps and columns. We show the flexibility of this format to provide robust peptide separation and reproducibility using conventional and topical mass spectrometry workflows. This was undertaken by coupling the multiphase liquid chromatography (LC) chip to a fast scanning Q-ToF mass spectrometer for data dependent MS/MS, data independent MS (SWATH) and for targeted proteomics using MRM-HR, showing clear advantages for repeatable analyses compared with conventional proteomic workflows.

Myofilament proteins are responsible for cardiac contraction. The myofilament subproteome, however, has not been comprehensively analyzed thus far. In the present study, cardiomyocytes were isolated from rodent hearts and stimulated with endothelin-1 and isoproterenol, potent inducers of myofilament protein phosphorylation. Subsequently, cardiomyocytes were “skinned,” and the myofilament subproteome was analyzed using a high mass accuracy ion trap tandem mass spectrometer (LTQ Orbitrap XL) equipped with electron transfer dissociation. As expected, a small number of myofilament proteins constituted the majority of the total protein mass with several known phosphorylation sites confirmed by electron transfer dissociation. More than 600 additional proteins were identified in the cardiac myofilament subproteome, including kinases and phosphatase subunits. The proteomic comparison of myofilaments from control and treated cardiomyocytes suggested that isoproterenol treatment altered the subcellular localization of protein phosphatase 2A regulatory subunit B56α. Immunoblot analysis of myocyte fractions confirmed that β-adrenergic stimulation by isoproterenol decreased the B56α content of the myofilament fraction in the absence of significant changes for the myosin phosphatase target subunit isoforms 1 and 2 (MYPT1 and MYPT2). Furthermore, immunolabeling and confocal microscopy revealed the spatial redistribution of these proteins with a loss of B56α from Z-disc and M-band regions but increased association of MYPT1/2 with A-band regions of the sarcomere following β-adrenergic stimulation. In summary, we present the first comprehensive proteomics data set of skinned cardiomyocytes and demonstrate the potential of proteomics to unravel dynamic changes in protein composition that may contribute to the neurohormonal regulation of myofilament contraction.
Myofilament proteins comprise the fundamental contractile apparatus of the heart, the cardiac sarcomere.
They are subdivided into thin filament proteins, including actin, tropomyosin, the troponin complex (troponin C, troponin I, and troponin T), and thick filament proteins, including myosin heavy chains, myosin light chains, and myosin-binding protein C. Although calcium is the principal regulator of cardiac contraction through the excitation-contraction coupling process that culminates in calcium binding to troponin C, myofilament function is also significantly modulated by phosphorylation of constituent proteins, such as cardiac troponin I (cTnI), cardiac myosin-binding protein C (cMyBP-C), and myosin regulatory light chain (MLC-2). “Skinned” myocyte preparations from rodent hearts, in which the sarcolemmal envelope is disrupted through the use of detergents, have been invaluable in providing mechanistic information on the functional consequences of myofilament protein phosphorylation following exposure to neurohormonal stimuli that activate pertinent kinases prior to skinning or direct exposure to such kinases in active form after skinning (for recent examples, see studies on the phosphorylation of cTnI (1–3), cMyBP-C (4–6), and MLC-2 (7–9)). Nevertheless, to date, only a few myofilament proteins have been studied using proteomics (10–19), and a detailed proteomic characterization of the myofilament subproteome and its associated proteins from skinned myocytes has not been performed. In the present analysis, we used an LTQ Orbitrap XL equipped with ETD (20) to analyze the subproteome of skinned cardiomyocytes with or without prior stimulation. Endothelin-1 and isoproterenol were used to activate the endothelin receptor/protein kinase C and β-adrenoreceptor/protein kinase A pathway, respectively (21, 22).
Importantly, the mass accuracy of the Orbitrap mass analyzer helped to distinguish true phosphorylation sites from false assignments, and the sensitivity of the ion trap provided novel insights into the translocation of phosphatase regulatory and targeting subunits following β-adrenergic stimulation.

Optimal performance of LC-MS/MS platforms is critical to generating high quality proteomics data. Although individual laboratories have developed quality control samples, there is no widely available performance standard of biological complexity (and associated reference data sets) for benchmarking of platform performance for analysis of complex biological proteomes across different laboratories in the community. Individual preparations of the yeast Saccharomyces cerevisiae proteome have been used extensively by laboratories in the proteomics community to characterize LC-MS platform performance. The yeast proteome is uniquely attractive as a performance standard because it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins. In this study, we describe a standard operating protocol for large scale production of the yeast performance standard and offer aliquots to the community through the National Institute of Standards and Technology where the yeast proteome is under development as a certified reference material to meet the long term needs of the community. Using a series of metrics that characterize LC-MS performance, we provide a reference data set demonstrating typical performance of commonly used ion trap instrument platforms in expert laboratories; the results provide a basis for laboratories to benchmark their own performance, to improve upon current methods, and to evaluate new technologies. 
Additionally, we demonstrate how the yeast reference, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix, thereby providing a metric to evaluate and minimize preanalytical and analytical variation in comparative proteomics experiments.
Access to proteomics performance standards is essential for several reasons. First, to generate the highest quality data possible, proteomics laboratories routinely benchmark and perform quality control (QC) monitoring of the performance of their instrumentation using standards. Second, appropriate standards greatly facilitate the development of improvements in technologies by providing a timeless standard with which to evaluate new protocols or instruments that claim to improve performance. For example, it is common practice for an individual laboratory considering purchase of a new instrument to require the vendor to run “demo” samples so that data from the new instrument can be compared head to head with existing instruments in the laboratory. Third, large scale proteomics studies designed to aggregate data across laboratories can be facilitated by the use of a performance standard to measure reproducibility across sites or to compare the performance of different LC-MS configurations or sample processing protocols used between laboratories to facilitate development of optimized standard operating procedures (SOPs).
Most individual laboratories have adopted their own QC standards, which range from mixtures of known synthetic peptides to digests of bovine serum albumin or more complex mixtures of several recombinant proteins (1). However, because each laboratory performs QC monitoring in isolation, it is difficult to compare the performance of LC-MS platforms throughout the community.
Several standards for proteomics are available for request or purchase (2, 3).
RM8327 is a mixture of three peptides developed as a reference material in collaboration between the National Institute of Standards and Technology (NIST) and the Association of Biomolecular Resource Facilities. Mixtures of 15–48 purified human proteins are also available, such as the HUPO (Human Proteome Organisation) Gold MS Protein Standard (Invitrogen), the Universal Proteomics Standard (UPS1; Sigma), and CRM470 from the European Union Institute for Reference Materials and Measurements. Although defined mixtures of peptides or proteins can address some benchmarking and QC needs, there is an additional need for more complex reference materials to fully represent the challenges of LC-MS data acquisition in complex matrices encountered in biological samples (2, 3).
Although it has not been widely distributed as a reference material, the yeast Saccharomyces cerevisiae proteome has been extensively used by the proteomics community to characterize the capabilities of a variety of LC-MS-based approaches (4–15). Yeast provides a uniquely attractive complex performance standard for several reasons. Yeast encodes a complex proteome consisting of ∼4,500 proteins expressed during normal growth conditions (7, 16–18). The concentration range of yeast proteins is sufficient to challenge the dynamic range of conventional mass spectrometers; the abundance of proteins ranges from fewer than 50 to more than 10^6 molecules per cell (4, 15, 16). Additionally, it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins (5, 9, 16, 17, 19, 20) as well as LC-MS/MS data sets showing good correlation between LC-MS/MS detection efficiency and the protein abundance estimates (4, 11, 12, 15). Finally, it is inexpensive and easy to produce large quantities of yeast protein extract for distribution.
In this study, we describe large scale production of a yeast S.
cerevisiae performance standard, which we offer to the community through NIST. Through a series of interlaboratory studies, we created a reference data set characterizing the yeast performance standard and defining reasonable performance of ion trap-based LC-MS platforms in expert laboratories using a series of performance metrics. This publicly available data set provides a basis for additional laboratories using the yeast standard to benchmark their own performance as well as to improve upon the current status by evolving protocols, improving instrumentation, or developing new technologies. Finally, we demonstrate how the yeast performance standard, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix.

Based on the conventional data-dependent acquisition strategy of shotgun proteomics, we present a new workflow, DeMix, which significantly increases the efficiency of peptide identification for in-depth shotgun analysis of complex proteomes. Capitalizing on the high resolution and mass accuracy of Orbitrap-based tandem mass spectrometry, we developed a simple deconvolution method of “cloning” chimeric tandem spectra for cofragmented peptides. In addition to a database search, a simple rescoring scheme utilizes mass accuracy and converts the unwanted cofragmenting events into a surprising advantage of multiplexing. With the combination of cloning and rescoring, we obtained on average nine peptide-spectrum matches per second on a Q-Exactive workbench, whereas the actual MS/MS acquisition rate was close to seven spectra per second. This efficiency boost to 1.24 identified peptides per MS/MS spectrum enabled analysis of over 5000 human proteins in single-dimensional LC-MS/MS shotgun experiments with only a two-hour gradient. These findings suggest a change in the dominant “one MS/MS spectrum - one peptide” paradigm for data acquisition and analysis in shotgun data-dependent proteomics. DeMix also demonstrated higher robustness than conventional approaches in terms of lower variation among the results of consecutive LC-MS/MS runs. Shotgun proteomics analysis based on a combination of high performance liquid chromatography and tandem mass spectrometry (MS/MS) (1) has achieved remarkable speed and efficiency (2–7). In a single four-hour long high performance liquid chromatography-MS/MS run, over 40,000 peptides and 5000 proteins can be identified using a high-resolution Orbitrap mass spectrometer with data-dependent acquisition (DDA) (2, 3). However, in a typical LC-MS analysis of unfractionated human cell lysate, over 100,000 individual peptide isotopic patterns can be detected (4), which corresponds to simultaneous elution of hundreds of peptides. 
With this complexity, a mass spectrometer needs to achieve ≥25 Hz MS/MS acquisition rate to fully sample all the detectable peptides, and ≥17 Hz to cover reasonably abundant ones (4). Although this acquisition rate is reachable by modern time-of-flight (TOF) instruments, the reported DDA identification results do not encompass all expected peptides. Recently, the next-generation Orbitrap instrument, working at 20 Hz MS/MS acquisition rate, demonstrated nearly full profiling of yeast proteome using an 80 min gradient, which opened the way for comprehensive analysis of human proteome in a time efficient manner (5). During the high performance liquid chromatography-MS/MS DDA analysis of complex samples, high density of co-eluting peptides results in a high probability for two or more peptides to overlap within an MS/MS isolation window. With the commonly used ±1.0–2.0 Th isolation windows, most MS/MS spectra are chimeric (4, 8–10), with cofragmenting precursors being naturally multiplexed. However, as has been discussed previously (9, 10), the cofragmentation events are currently ignored in most of the conventional analysis workflows. According to the prevailing assumption of “one MS/MS spectrum–one peptide,” chimeric MS/MS spectra are generally unwelcome in DDA, because the product ions from different precursors may interfere with the assignment of MS/MS fragment identities, increasing the rate of false discoveries in database search (8, 9). In some studies, the precursor isolation width was set as narrow as ±0.35 Th to prevent unwanted ions from being coselected, fragmented or detected (4, 5). On the contrary, multiplexing by cofragmentation is considered to be one of the solid advantages in data-independent acquisition (DIA) (10–13). In several commonly used DIA methods, the precursor ion selection windows are set much wider than in DDA: from 25 Th as in SWATH (12), to extremely broad range as in AIF (13). 
In order to use the benefit of MS/MS multiplexing in DDA, several approaches have been proposed to deconvolute chimeric MS/MS spectra. In the “alternative peptide identification” method implemented in Percolator (14), a machine learning algorithm reranks and rescores peptide-spectrum matches (PSMs) obtained from one or more MS/MS search engines. But the deconvolution in Percolator is limited to cofragmented peptides with masses differing from the target peptide by the tolerance of the database search, which can be as narrow as a few ppm. The “active demultiplexing” method proposed by Ledvina et al. (15) actively separates MS/MS data from several precursors using masses of complementary fragments. However, higher-energy collisional dissociation often produces MS/MS spectra with too few complementary pairs for reliable peptide identification. The “MixDB” method introduces a sophisticated new search engine, also with a machine learning algorithm (9). And the “second peptide identification” method implemented in the Andromeda/MaxQuant workflow (16) submits the same dataset to the search engine several times based on the list of chromatographic peptide features, subtracting assigned MS/MS peaks after each identification round. This approach is similar to the ProbIDTree search engine that also performed iterative identification while removing assigned peaks after each round of identification (17). One important factor for spectral deconvolution that has not been fully utilized in most conventional workflows is the excellent mass accuracy achievable with modern high-resolution mass spectrometry (18). An Orbitrap Fourier-transform mass spectrometer can provide mass accuracy in the range of hundreds of ppb (parts per billion) for mass peaks with high signal-to-noise (S/N) ratio (19). However, the mass error of peaks with lower S/N ratios can be significantly higher and exceed 1 ppm. 
Despite this dependence of the mass accuracy on the S/N level, most MS and MS/MS search engines only allow users to set hard cut-off values for the mass error tolerances. Moreover, some search engines do not provide the option of choosing a relative error tolerance for MS/MS fragments. Such negligent treatment of mass accuracy reduces the analytical power of high accuracy experiments (18). Identification results coming from different MS/MS search engines are sometimes not consistent because of different statistical assumptions used in scoring PSMs. Introduction of tools integrating the results of different search engines (14, 20, 21) makes the data interpretation even more complex and opaque for the user. The opposite trend, simplification of MS/MS data interpretation, is therefore a welcome development. For example, an extremely straightforward algorithm recently proposed by Wenger et al. (22) demonstrated a surprisingly high performance in peptide identification, even though it is only marginally more complex than simply counting the number of matches of theoretical fragment peaks in high resolution MS/MS, without any a priori statistical assumption. In order to take advantage of natural multiplexing of MS/MS spectra in DDA, as well as properly utilize high accuracy of Orbitrap-based mass spectrometry, we developed a simple and robust data analysis workflow, DeMix. It is presented in Fig. 1 as an expansion of the conventional workflow. Principles of some of the processes used by the workflow are borrowed from other approaches, including the custom-made mass peak centroiding (20), chromatographic feature detection (19, 20), and two-pass database search with the first limited pass to provide a “software lock mass” for mass scale recalibration (23). Fig. 1. An overview of the DeMix workflow that expands the conventional workflow, shown by the dashed line. 
Processes are colored in purple for TOPP, red for search engine (Morpheus/Mascot/MS-GF+), and blue for in-house programs. In the DeMix workflow, the deconvolution of chimeric MS/MS spectra consists of simply “cloning” an MS/MS spectrum if a potential cofragmented peptide is detected. The list of candidate peptide precursors is generated from chromatographic feature detection, as in the MaxQuant/Andromeda workflow (16, 19), but using The OpenMS Proteomics Pipeline (TOPP) (20, 24). During the cloning, the precursor is replaced by the new candidate, but no changes in the MS/MS fragment list are made, and therefore the cloned MS/MS spectra remain chimeric. Processing such spectra requires a search engine tolerant to the presence of unassigned peaks, as such peaks are always expected when multiple precursors cofragment. Thus, we chose Morpheus (22) as a search engine. Based on the original search algorithm, we implement a reformed scoring scheme: Morpheus-AS (advanced scoring). It inherits all the basic principles from Morpheus but makes fuller use of the high mass accuracy of the data. This kind of database search removes the necessity of spectral processing for physical separation of MS/MS data into multiple subspectra (15), or consecutive subtraction of peaks (16, 17). Despite the fact that the DeMix workflow is largely a combination of known approaches, it provides remarkable improvement compared with the state-of-the-art. On our Orbitrap Q-Exactive workbench, testing on a benchmark dataset of two-hour single-dimension LC-MS/MS experiments from HeLa cell lysate, we identified on average 1.24 peptides per MS/MS spectrum, breaking the “one MS/MS spectrum–one peptide” paradigm on the level of the whole data set. At 1% false discovery rate (FDR), we obtained on average nine PSMs per second (at the actual acquisition rate of ca. seven MS/MS spectra per second), and detected 40 human proteins per minute.
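The cloning step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the DeMix implementation: the `Feature` and `Spectrum` types, the `clone_spectra` function, and the ±2.0 Th default half-width are hypothetical names and values chosen for this example. The text states only that a spectrum is cloned for each candidate precursor taken from chromatographic feature detection, and that the fragment list is left unchanged.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Feature:
    """A chromatographic peptide feature (hypothetical structure)."""
    mz: float        # precursor m/z in Th
    charge: int
    rt_start: float  # start of elution, seconds
    rt_end: float    # end of elution, seconds


@dataclass(frozen=True)
class Spectrum:
    """An MS/MS spectrum with its assigned precursor (hypothetical structure)."""
    scan: int
    rt: float                # acquisition time, seconds
    isolation_center: float  # centre of the isolation window, Th
    precursor_mz: float
    precursor_charge: int
    peaks: Tuple[Tuple[float, float], ...]  # (m/z, intensity) pairs


def clone_spectra(spectrum: Spectrum,
                  features: List[Feature],
                  half_width: float = 2.0) -> List[Spectrum]:
    """Return one copy of `spectrum` per feature that could have been co-isolated.

    A feature qualifies when its m/z lies inside the isolation window and it
    is eluting at the acquisition time. Only the precursor fields change; the
    fragment peaks are shared, so every clone stays chimeric. If no feature
    qualifies, the original spectrum is returned unchanged (an assumption).
    """
    clones = []
    for f in features:
        in_window = abs(f.mz - spectrum.isolation_center) <= half_width
        eluting = f.rt_start <= spectrum.rt <= f.rt_end
        if in_window and eluting:
            clones.append(replace(spectrum,
                                  precursor_mz=f.mz,
                                  precursor_charge=f.charge))
    return clones or [spectrum]
```

Each clone would then be submitted to the search engine as an independent query, which is why the text stresses that the engine must tolerate unassigned fragment peaks.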

Insulin plays a central role in the regulation of vertebrate metabolism. The hormone, the post-translational product of a single-chain precursor, is a globular protein containing two chains, A (21 residues) and B (30 residues). Recent advances in human genetics have identified dominant mutations in the insulin gene causing permanent neonatal-onset DM (1–4). The mutations are predicted to block folding of the precursor in the ER of pancreatic β-cells. Although expression of the wild-type allele would in other circumstances be sufficient to maintain homeostasis, studies of a corresponding mouse model (5–7) suggest that the misfolded variant perturbs wild-type biosynthesis (8, 9). Impaired β-cell secretion is associated with ER stress, distorted organelle architecture, and cell death (10). These findings have renewed interest in insulin biosynthesis (11–13) and the structural basis of disulfide pairing (14–19). Protein evolution is constrained not only by structure and function but also by susceptibility to toxic misfolding.

Conjugation of small ubiquitin-like modifier (SUMO) to substrates is involved in a large number of cellular processes. Typically, SUMO is conjugated to lysine residues within a SUMO consensus site; however, an increasing number of proteins are sumoylated on non-consensus sites. To appreciate the functional consequences of sumoylation, the identification of SUMO attachment sites is of critical importance. Discovery of SUMO acceptor sites is usually performed by a laborious mutagenesis approach or using MS. In MS, identification of SUMO acceptor sites in higher eukaryotes is hampered by the large tryptic fragments of SUMO1 and SUMO2/3. MS search engines in combination with known databases lack the possibility to search MSMS spectra for larger modifications, such as sumoylation. Therefore, we developed a simple and straightforward database search tool (“ChopNSpice”) that successfully allows identification of SUMO acceptor sites from proteins sumoylated in vivo and in vitro. By applying this approach we identified SUMO acceptor sites in, among others, endogenous SUMO1, SUMO2, RanBP2, and Ubc9. Post-translational modification with ubiquitin and ubiquitin-like modifiers (Ubls) such as SUMO plays an important role in most, if not all, cellular processes (1–6). Conjugation of Ubls to their targets involves an isopeptide bond between the carboxyl group of the modifier and the ε-amino group of a lysine residue within the targets. Attachment of Ubls to specific targets involves an enzymatic cascade. First the Ubls are processed to expose their C-terminal diglycine motif. The mature Ubl is then transferred to its target via a cascade of E1 (activating), E2 (conjugating), and E3 (ligase) enzymes. The conjugation system for SUMO consists of a heterodimeric activating enzyme, Aos1/Uba2; a conjugating enzyme, Ubc9; and E3 ligases, such as RanBP2 or members of the PIAS family. 
The conjugation status undergoes perpetual change and is governed by a small family of SUMO proteases that hydrolyze the isopeptide bond between SUMO and its target (7, 8). Although in lower eukaryotes only one SUMO is present, vertebrates express at least three different SUMO paralogs: SUMO1, SUMO2, and SUMO3. Mature SUMO2 and SUMO3 (referred to as SUMO2/3) are 97% identical but differ substantially from SUMO1 (∼50% identity). Although the list of known SUMO substrates is growing rapidly, our understanding of the functional consequences for many of these targets is lagging behind. At a molecular level, the functional consequences of SUMO conjugation can be explained by a gain or loss of interaction with other macromolecules (3, 4). SUMO-dependent intramolecular conformational changes have also been described (9, 10). Thus, to appreciate the role that SUMO plays in the regulation of specific substrates, identification of the acceptor site(s) for SUMO conjugation is of key importance. So far, identification of SUMO acceptor sites has relied largely on mutation of the SUMO consensus site, which consists of a short motif with the sequence ψKXE (ψ represents a bulky hydrophobic residue, and X represents any amino acid). This motif is recognized by Ubc9 if presented in an extended conformation (11–13). However, an increasing number of proteins, such as PCNA, E2-25K, Daxx, and USP25, turned out to be sumoylated on lysine residues that do not conform to the SUMO consensus site (14–17). For this category of proteins, as well as for proteins that contain a large number of SUMO consensus sites, the identification of acceptor lysines is a burdensome task that often involves mutagenesis of each lysine residue within the substrate in turn. MS is currently one of the state-of-the-art technologies to identify protein factors and their post-translational modifications in an unbiased and sensitive manner. 
Several groups have shown that, using overexpressed tagged SUMO, MS can be efficiently exploited to identify endogenous substrates for SUMO conjugation (18–20). However, the identification of SUMO acceptor lysines using MS has remained a more challenging task (18, 21, 23, 24). So far, using tagged SUMO, unbiased identification of acceptor lysines for endogenous substrates has only been observed in Saccharomyces cerevisiae (18). The identification of substrates in higher eukaryotes has been hampered by the large conjugated SUMO peptide that arises upon tryptic digestion (>2154 Da with human SUMO1 and >3568 Da with human SUMO2/3 compared with 484 Da for Smt3 in S. cerevisiae). Such large fragments, in addition to the mass of the conjugated peptide, can impede their in-gel digestion, extraction, detection, and sequencing in MS. To overcome some of these limitations, several different strategies have been developed: 1) mutation of the tryptic fragment of SUMO, yielding a smaller tryptic fragment (23), 2) development of an automated recognition pattern tool (SUMmOn) (24), and 3) identification of targets using an in vitro to in vivo approach (21). Although these approaches have been applied successfully for the identification of SUMO conjugates in vitro and in vivo, unbiased identification of SUMO conjugates in vivo has not been achieved in higher eukaryotes. Another hurdle to such identification of SUMO conjugates is the variety of masses that can theoretically arise for just one SUMO-conjugated lysine in a given protein because of tryptic miscleavages. Thus, the unambiguous identification of SUMO acceptor sites requires the mass of the modified peptide carrying the conjugated SUMO (fragment) to be measured with high accuracy, and most importantly, it requires sequence analysis of the modified peptides. Because available proteomics search engines lack the possibility to search MSMS spectra for larger modifications, e.g. 
those that occur upon sumoylation, we developed a novel, simple, and straightforward database search tool (“ChopNSpice”) that, in combination with current proteomics search engines (such as MASCOT (25) or SEQUEST (26)), allows one to identify SUMO1 and SUMO2/3 acceptor sites unambiguously. We confirmed this strategy in vitro on various substrates and demonstrate the power of this technique by the identification of acceptor lysines within several endogenous targets from HeLa cells.
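The point made above, that tryptic miscleavages produce several candidate peptides for a single conjugated lysine, can be illustrated with a short enumeration. This sketch does not reproduce ChopNSpice, whose internals are not described here. The function names, the "cleave after K or R, but not before P" rule, and the assumption that trypsin does not cut after the conjugated lysine are all choices made for this example.

```python
from typing import List


def tryptic_sites(seq: str) -> List[int]:
    """Positions at which trypsin is assumed to cut: after K or R, unless P follows."""
    return [i + 1 for i, aa in enumerate(seq[:-1])
            if aa in "KR" and seq[i + 1] != "P"]


def peptides_covering_lysine(seq: str, lys_index: int,
                             max_missed: int = 2) -> List[str]:
    """List the tryptic peptides that contain the lysine at `lys_index`.

    The site directly after the conjugated lysine is treated as blocked
    (an assumption), so it is dropped from the cut list and does not count
    as a missed cleavage. Up to `max_missed` further sites may be skipped.
    """
    if seq[lys_index] != "K":
        raise ValueError("lys_index must point at a lysine")
    cuts = [c for c in tryptic_sites(seq) if c != lys_index + 1]
    bounds = [0] + cuts + [len(seq)]
    last = len(bounds) - 1
    peptides = []
    for i in range(last):
        # j - i - 1 is the number of missed cleavages inside the peptide
        for j in range(i + 1, min(i + 1 + max_missed, last) + 1):
            start, end = bounds[i], bounds[j]
            if start <= lys_index < end:
                peptides.append(seq[start:end])
    return peptides


# Toy sequence (not a real protein); the lysine at index 5 is the acceptor.
candidates = peptides_covering_lysine("AGKLMKTTRVVK", 5, max_missed=1)
```

Every candidate would additionally carry the mass of the conjugated SUMO tryptic fragment, so each missed-cleavage variant corresponds to a distinct precursor mass that the search has to consider.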

Database search programs are essential tools for identifying peptides via mass spectrometry (MS) in shotgun proteomics. Simultaneously achieving high sensitivity and high specificity during a database search is crucial for improving proteome coverage. Here we present JUMP, a new hybrid database search program that generates amino acid tags and ranks peptide spectrum matches (PSMs) by an integrated score from the tags and pattern matching. In a typical run of liquid chromatography coupled with high-resolution tandem MS, more than 95% of MS/MS spectra can generate at least one tag, whereas the remaining spectra are usually too poor to derive genuine PSMs. To enhance search sensitivity, the JUMP program enables the use of tags as short as one amino acid. Using a target-decoy strategy, we compared JUMP with other programs (e.g. SEQUEST, Mascot, PEAKS DB, and InsPecT) in the analysis of multiple datasets and found that JUMP outperformed these preexisting programs. JUMP also permitted the analysis of multiple co-fragmented peptides from “mixture spectra” to further increase PSMs. In addition, JUMP-derived tags allowed partial de novo sequencing and facilitated the unambiguous assignment of modified residues. In summary, JUMP is an effective database search algorithm complementary to current search programs. Peptide identification by tandem mass spectra is a critical step in mass spectrometry (MS)-based proteomics (1). Numerous computational algorithms and software tools have been developed for this purpose (2–6). These algorithms can be classified into three categories: (i) pattern-based database search, (ii) de novo sequencing, and (iii) hybrid search that combines database search and de novo sequencing. With the continuous development of high-performance liquid chromatography and high-resolution mass spectrometers, it is now possible to analyze almost all protein components in mammalian cells (7). 
In contrast to rapid data collection, it remains a challenge to extract accurate information from the raw data to identify peptides with low false positive rates (specificity) and minimal false negatives (sensitivity) (8). Database search methods usually assign peptide sequences by comparing MS/MS spectra to theoretical peptide spectra predicted from a protein database, as exemplified in SEQUEST (9), Mascot (10), OMSSA (11), X!Tandem (12), Spectrum Mill (13), ProteinProspector (14), MyriMatch (15), Crux (16), MS-GFDB (17), Andromeda (18), BaMS2 (19), and Morpheus (20). Some other programs, such as SpectraST (21) and Pepitome (22), utilize a spectral library composed of experimentally identified and validated MS/MS spectra. These methods use a variety of scoring algorithms to rank potential peptide spectrum matches (PSMs) and select the top hit as a putative PSM. However, not all PSMs are correctly assigned. For example, false peptides may be assigned to MS/MS spectra with numerous noisy peaks and poor fragmentation patterns. If the samples contain unknown protein modifications, mutations, and contaminants, the related MS/MS spectra also result in false positives, as their corresponding peptides are not in the database. Other false positives may be generated simply by random matches. Therefore, it is of importance to remove these false PSMs to improve dataset quality. One common approach is to filter putative PSMs to achieve a final list with a predefined false discovery rate (FDR) via a target-decoy strategy, in which decoy proteins are merged with target proteins in the same database for estimating false PSMs (23–26). However, the true and false PSMs are not always distinguishable based on matching scores. 
It is a problem to set up an appropriate score threshold to achieve maximal sensitivity and high specificity (13, 27, 28). De novo methods, including Lutefisk (29), PEAKS (30), NovoHMM (31), PepNovo (32), pNovo (33), Vonovo (34), and UniNovo (35), identify peptide sequences directly from MS/MS spectra. These methods can be used to derive novel peptides and post-translational modifications without a database, which is useful, especially when the related genome is not sequenced. High-resolution MS/MS spectra greatly facilitate the generation of peptide sequences in these de novo methods. However, because MS/MS fragmentation cannot always produce all predicted product ions, only a portion of collected MS/MS spectra have sufficient quality to extract partial or full peptide sequences, leading to lower sensitivity than achieved with the database search methods. To improve the sensitivity of the de novo methods, a hybrid approach has been proposed to integrate peptide sequence tags into PSM scoring during database searches (36). Numerous software packages have been developed, such as GutenTag (37), InsPecT (38), Byonic (39), DirecTag (40), and PEAKS DB (41). These methods use peptide tag sequences to filter a protein database, followed by error-tolerant database searching. One restriction in most of these algorithms is the requirement of a minimum tag length of three amino acids for matching protein sequences in the database. This restriction reduces the sensitivity of the database search, because it filters out some high-quality spectra in which consecutive tags cannot be generated. In this paper, we describe JUMP, a novel tag-based hybrid algorithm for peptide identification. The program is optimized to balance sensitivity and specificity during tag derivation and MS/MS pattern matching. JUMP can use all potential sequence tags, including tags consisting of only one amino acid. 
When we compared its performance to that of two widely used search algorithms, SEQUEST and Mascot, JUMP identified ∼30% more PSMs at the same FDR threshold. In addition, the program provides two additional features: (i) using tag sequences to improve modification site assignment, and (ii) analyzing co-fragmented peptides from mixture MS/MS spectra.
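The target-decoy filtering mentioned above can be sketched as a simple threshold search. This is an illustration only, not the procedure used by JUMP or any other named program: the function name is hypothetical, and the estimator used (decoy count divided by target count among PSMs at or above the cutoff) is one common convention for a concatenated target-decoy database, adopted here as an assumption.

```python
from typing import List, Optional, Tuple


def score_threshold_at_fdr(psms: List[Tuple[float, bool]],
                           max_fdr: float = 0.01) -> Optional[float]:
    """Find the lowest score cutoff whose estimated FDR stays within `max_fdr`.

    `psms` holds (score, is_decoy) pairs, higher scores being better.
    The FDR at a cutoff is estimated as decoys / targets among the PSMs
    scoring at or above it. Returns None if no cutoff qualifies. Tied
    scores are processed in sort order, a simplification.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best = score  # keep extending the accepted list while FDR allows
    return best


# Toy data: six PSMs, two of them matching decoy sequences.
toy = [(10.0, False), (9.0, False), (8.0, False),
       (7.0, True), (6.0, False), (5.0, True)]
cutoff = score_threshold_at_fdr(toy, max_fdr=0.25)
accepted = [s for s, is_decoy in toy if not is_decoy and s >= cutoff]
```

Because true and false PSMs overlap in score, loosening `max_fdr` trades specificity for sensitivity, which is the balance the text describes.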

A complete understanding of the biological functions of large signaling peptides (>4 kDa) requires comprehensive characterization of their amino acid sequences and post-translational modifications, which presents significant analytical challenges. In the past decade, there has been great success with mass spectrometry-based de novo sequencing of small neuropeptides. However, these approaches are less applicable to larger neuropeptides because of the inefficient fragmentation of peptides larger than 4 kDa and their lower endogenous abundance. The conventional proteomics approach focuses on large-scale determination of protein identities via database searching, lacking the ability for in-depth elucidation of individual amino acid residues. Here, we present a multifaceted MS approach for identification and characterization of large crustacean hyperglycemic hormone (CHH)-family neuropeptides, a class of peptide hormones that play central roles in the regulation of many important physiological processes of crustaceans. Six crustacean CHH-family neuropeptides (8–9.5 kDa), including two novel peptides with extensive disulfide linkages and PTMs, were fully sequenced without reference to genomic databases. High-definition de novo sequencing was achieved by a combination of bottom-up, off-line top-down, and on-line top-down tandem MS methods. Statistical evaluation indicated that these methods provided complementary information for sequence interpretation and increased the local identification confidence of each amino acid. Further investigations by MALDI imaging MS mapped the spatial distribution and colocalization patterns of various CHH-family neuropeptides in the neuroendocrine organs, revealing that two CHH-subfamilies are involved in distinct signaling pathways. Neuropeptides and hormones comprise a diverse class of signaling molecules involved in numerous essential physiological processes, including analgesia, reward, food intake, learning and memory (1). 
Disorders of the neurosecretory and neuroendocrine systems influence many pathological processes. For example, obesity results from failure of energy homeostasis in association with endocrine alterations (2, 3). Previous work from our lab using crustaceans as model organisms found that multiple neuropeptides were implicated in control of food intake, including RFamides, tachykinin related peptides, RYamides, and pyrokinins (4–6). Crustacean hyperglycemic hormone (CHH) family neuropeptides play a central role in energy homeostasis of crustaceans (7–17). Hyperglycemic response of the CHHs was first reported after injection of crude eyestalk extract in crustaceans. Based on their preprohormone organization, the CHH family can be grouped into two sub-families: subfamily-I containing CHH, and subfamily-II containing molt-inhibiting hormone (MIH) and mandibular organ-inhibiting hormone (MOIH). The preprohormones of the subfamily-I have a CHH precursor related peptide (CPRP) that is cleaved off during processing; and preprohormones of the subfamily-II lack the CPRP (9). Uncovering their physiological functions will provide new insights into neuroendocrine regulation of energy homeostasis. Characterization of CHH-family neuropeptides is challenging. They are comprised of more than 70 amino acids and often contain multiple post-translational modifications (PTMs) and complex disulfide bridge connections (7). In addition, physiological concentrations of these peptide hormones are typically below picomolar level, and most crustacean species do not have available genome and proteome databases to assist MS-based sequencing. MS-based neuropeptidomics provides a powerful tool for rapid discovery and analysis of a large number of endogenous peptides from the brain and the central nervous system. Our group and others have greatly expanded the peptidomes of many model organisms (3, 18–33). 
For example, we have discovered more than 200 neuropeptides with several neuropeptide families consisting of as many as 20–40 members in a simple crustacean model system (5, 6, 25–31, 34). However, a majority of these neuropeptides are small peptides 5–15 amino acid residues long, leaving a gap in identifying larger signaling peptides from organisms without a sequenced genome. The observed lack of larger size peptide hormones can be attributed to the lack of effective de novo sequencing strategies for neuropeptides larger than 4 kDa, which are inherently more difficult to fragment using conventional techniques (34–37). Although classical proteomics studies examine larger proteins, these tools are limited to identification based on database searching with one or more peptides matching without complete amino acid sequence coverage (36, 38). Large populations of neuropeptides from 4–10 kDa exist in the nervous systems of both vertebrates and invertebrates (9, 39, 40). Understanding their functional roles requires sufficient molecular knowledge and a unique analytical approach. Therefore, developing effective and reliable methods for de novo sequencing of large neuropeptides at the individual amino acid residue level is an urgent gap to fill in neurobiology. In this study, we present a multifaceted MS strategy aimed at high-definition de novo sequencing and comprehensive characterization of the CHH-family neuropeptides in crustacean central nervous system. The high-definition de novo sequencing was achieved by a combination of three methods: (1) enzymatic digestion and LC-tandem mass spectrometry (MS/MS) bottom-up analysis to generate detailed sequences of proteolytic peptides; (2) off-line LC fractionation and subsequent top-down MS/MS to obtain high-quality fragmentation maps of intact peptides; and (3) on-line LC coupled to top-down MS/MS to allow rapid sequence analysis of low abundance peptides. 
Combining the three methods overcomes the limitations of each, and thus offers complementary and high-confidence determination of amino acid residues. We report the complete sequence analysis of six CHH-family neuropeptides including the discovery of two novel peptides. With the accurate molecular information, MALDI imaging and ion mobility MS were conducted for the first time to explore their anatomical distribution and biochemical properties.

Knowledge of elaborate structures of protein complexes is fundamental for understanding their functions and regulations. Although cross-linking coupled with mass spectrometry (MS) has been presented as a feasible strategy for structural elucidation of large multisubunit protein complexes, this method has proven challenging because of technical difficulties in unambiguous identification of cross-linked peptides and determination of cross-linked sites by MS analysis. In this work, we developed a novel cross-linking strategy using a newly designed MS-cleavable cross-linker, disuccinimidyl sulfoxide (DSSO). DSSO contains two symmetric collision-induced dissociation (CID)-cleavable sites that allow effective identification of DSSO-cross-linked peptides based on their distinct fragmentation patterns unique to cross-linking types (i.e. interlink, intralink, and dead end). The CID-induced separation of interlinked peptides in MS/MS permits MS3 analysis of single peptide chain fragment ions with defined modifications (due to DSSO remnants) for easy interpretation and unambiguous identification using existing database searching tools. Integration of data analyses from three generated data sets (MS, MS/MS, and MS3) allows high confidence identification of DSSO cross-linked peptides. The efficacy of the newly developed DSSO-based cross-linking strategy was demonstrated using model peptides and proteins. In addition, this method was successfully used for structural characterization of the yeast 20 S proteasome complex. In total, 13 non-redundant interlinked peptides of the 20 S proteasome were identified, representing the first application of an MS-cleavable cross-linker for the characterization of a multisubunit protein complex. 
Given its effectiveness and simplicity, this cross-linking strategy can find a broad range of applications in elucidating the structural topology of proteins and protein complexes.
Proteins form stable and dynamic multisubunit complexes under different physiological conditions to maintain cell viability and normal cell homeostasis. Detailed knowledge of protein interactions and protein complex structures is fundamental to understanding how individual proteins function within a complex and how the complex functions as a whole. However, structural elucidation of large multisubunit protein complexes has been difficult because of a lack of technologies that can effectively handle their dynamic and heterogeneous nature. Traditional methods such as nuclear magnetic resonance (NMR) analysis and x-ray crystallography can yield detailed information on protein structures; however, NMR spectroscopy requires large quantities of pure protein in a specific solvent, whereas x-ray crystallography is often limited by the crystallization process.
In recent years, chemical cross-linking coupled with mass spectrometry (MS) has become a powerful method for studying protein interactions (1–3). Chemical cross-linking stabilizes protein interactions through the formation of covalent bonds and allows the detection of stable, weak, and/or transient protein-protein interactions in native cells or tissues (4–9). In addition to capturing protein interacting partners, many studies have shown that chemical cross-linking can yield low resolution structural information about the constraints within a molecule (2, 3, 10) or protein complex (11–13). The application of chemical cross-linking, enzymatic digestion, and subsequent mass spectrometric and computational analyses for the elucidation of three-dimensional protein structures offers distinct advantages over traditional methods because of its speed, sensitivity, and versatility.
Identification of cross-linked peptides provides distance constraints that aid in constructing the structural topology of proteins and/or protein complexes. Although this approach has been successful, effective detection and accurate identification of cross-linked peptides as well as unambiguous assignment of cross-linked sites remain extremely challenging due to their low abundance and complicated fragmentation behavior in MS analysis (2, 3, 10, 14). Therefore, new reagents and methods are urgently needed to allow unambiguous identification of cross-linked products and to improve the speed and accuracy of data analysis to facilitate its application in structural elucidation of large protein complexes.
A number of approaches have been developed to facilitate MS detection of low abundance cross-linked peptides from complex mixtures. These include selective enrichment using affinity purification with biotinylated cross-linkers (15–17) and click chemistry with alkyne-tagged (18) or azide-tagged (19, 20) cross-linkers. In addition, Staudinger ligation has recently been shown to be effective for selective enrichment of azide-tagged cross-linked peptides (21). Apart from enrichment, detection of cross-linked peptides can be achieved by isotope-labeled (22–24), fluorescently labeled (25), and mass tag-labeled cross-linking reagents (16, 26). These methods can identify cross-linked peptides with MS analysis, but interpretation of the data generated from interlinked peptides (two peptides connected with the cross-link) by automated database searching remains difficult. Several bioinformatics tools have thus been developed to interpret MS/MS data and determine interlinked peptide sequences from complex mixtures (12, 14, 27–32). Although promising, further developments are still needed to make such data analyses as robust and reliable as analyzing MS/MS data of single peptide sequences using existing database searching tools (e.g.
Protein Prospector, Mascot, or SEQUEST).
Various types of cleavable cross-linkers with distinct chemical properties have been developed to facilitate MS identification and characterization of cross-linked peptides. These include UV photocleavable (33), chemical cleavable (19), isotopically coded cleavable (24), and MS-cleavable reagents (16, 26, 34–38). MS-cleavable cross-linkers have received considerable attention because the resulting cross-linked products can be identified based on their characteristic fragmentation behavior observed during MS analysis. Gas-phase cleavage sites result in the detection of a “reporter” ion (26), single peptide chain fragment ions (35–38), or both reporter and fragment ions (16, 34). In each case, further structural characterization of the peptide product ions generated during the cleavage reaction can be accomplished by subsequent MSn analysis. Among these linkers, the “fixed charge” sulfonium ion-containing cross-linker developed by Lu et al. (37) appears to be the most attractive as it allows specific and selective fragmentation of cross-linked peptides regardless of their charge and amino acid composition based on their studies with model peptides.
Despite the availability of multiple types of cleavable cross-linkers, most of the applications have been limited to the study of model peptides and single proteins. Additionally, complicated synthesis and fragmentation patterns have impeded most of the known MS-cleavable cross-linkers from wide adoption by the community. Here we describe the design and characterization of a novel and simple MS-cleavable cross-linker, DSSO, and its application to model peptides and proteins and the yeast 20 S proteasome complex. In combination with new software developed for data integration, we were able to identify DSSO-cross-linked peptides from complex peptide mixtures with speed and accuracy.
Given its effectiveness and simplicity, we anticipate a broader application of this MS-cleavable cross-linker in the study of structural topology of other protein complexes using cross-linking and mass spectrometry.
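The statement that identified cross-links provide distance constraints can be illustrated with a minimal sketch. It only shows the general idea of checking a residue pair in a structural model against a maximum bridging distance; the function names, the coordinates, and the 30 Å cutoff are illustrative assumptions and are not taken from the study.

```python
import math

def residue_distance(a, b):
    """Euclidean distance between two 3D coordinates, in angstroms."""
    return math.dist(a, b)

def satisfies_crosslink(a, b, max_distance):
    """Return True if two residues are close enough to be bridged by a linker.

    max_distance is a user-supplied cutoff; the study does not prescribe one here.
    """
    return residue_distance(a, b) <= max_distance

# Hypothetical coordinates of two cross-linked lysines in a model structure.
lys_a = (0.0, 0.0, 0.0)
lys_b = (3.0, 4.0, 12.0)

ASSUMED_CUTOFF = 30.0  # assumed cutoff in angstroms, for illustration only

print(residue_distance(lys_a, lys_b))
print(satisfies_crosslink(lys_a, lys_b, ASSUMED_CUTOFF))
```

In practice the cutoff would be chosen from the linker's spacer length plus side-chain flexibility, and each identified interlinked peptide pair would contribute one such check.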

Plasma proteome analysis requires sufficient power to compare numerous samples and detect changes in protein modification, because the protein content of human samples varies significantly among individuals, and many plasma proteins undergo changes in the bloodstream. A label-free proteomics platform developed in our laboratory, termed “Two-Dimensional Image Converted Analysis of Liquid chromatography and mass spectrometry (2DICAL),” is capable of these tasks. Here, we describe successful detection of novel prolyl hydroxylation of α-fibrinogen using 2DICAL, based on comparison of plasma samples of 38 pancreatic cancer patients and 39 healthy subjects. Using a newly generated monoclonal antibody 11A5, we confirmed the increase in prolyl-hydroxylated α-fibrinogen plasma levels and identified prolyl 4-hydroxylase A1 as a key enzyme for the modification. Competitive enzyme-linked immunosorbent assay of 685 blood samples revealed dynamic changes in prolyl-hydroxylated α-fibrinogen plasma level depending on clinical status. Prolyl-hydroxylated α-fibrinogen is presumably controlled by multiple biological mechanisms, which remain to be clarified in future studies.
For comprehensive analysis of plasma proteins, it is necessary to compare a sufficient number of blood samples to avoid simple interindividual heterogeneity, because the protein content of human samples varies significantly among individuals. Also, the provision of sufficient power is needed to detect protein modification because many plasma proteins undergo changes in the bloodstream (1). Even though the proteomic technologies have advanced (2, 3), there remains room for improvement.
Different isotope labeling and identification-based methods have been developed for quantitative proteomics technologies (4–6), but the number of samples that can be compared by the current isotope-labeling methods is limited, and identification-based proteomics is unable to capture information regarding unknown modifications.
A label-free proteomics platform developed in our laboratory, termed “Two-Dimensional Image Converted Analysis of Liquid chromatography and mass spectrometry (2DICAL)” (7), simply compares the liquid chromatography and mass spectrometry (LC-MS) data and detects a protein modification by finding changes in the mass to charge ratio (m/z) and retention time (RT). Enhanced methods for accurate MS peak alignment across multiple LC runs have enabled the successful implementation of clinical studies requiring comparison of a large number of samples (8, 9). Using 2DICAL to analyze plasma samples of pancreatic cancer patients and healthy controls, novel prolyl hydroxylation of α-fibrinogen was successfully discovered.
Fibrinogen and its modifications have been investigated because of their clinical importance (10, 11). On the other hand, prolyl hydroxylation has attracted attention after the discovery of the hypoxia-inducible factor 1α (HIF1α) prolyl-hydroxylase and its role in switching of HIF1α functions (12). Prolyl hydroxylation in other proteins has been energetically sought, but only a few such proteins have been identified (13). Only one study has reported prolyl hydroxylation of fibrinogen at the β chain (14).
Here, we report the detection of prolyl 4-hydroxylated α-fibrinogen by plasma proteome analysis, a protein modification that dynamically changes in plasma depending on the clinical status and is a candidate plasma biomarker.
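To make concrete how a modification shows up as a change in m/z, the following minimal sketch computes the shift expected when a peptide gains one oxygen atom (hydroxylation, monoisotopic mass increase of about 15.9949 Da) at different charge states. It is not the 2DICAL algorithm; the function and constant names are hypothetical.

```python
# Monoisotopic mass of one oxygen atom, in daltons.
OXYGEN_MONOISOTOPIC_MASS = 15.994915

def expected_mz_shift(delta_mass, charge):
    """m/z shift produced by a mass change delta_mass on an ion of the given charge."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return delta_mass / charge

# Shift expected for a singly hydroxylated peptide at charge states 1 to 3.
for z in (1, 2, 3):
    shift = expected_mz_shift(OXYGEN_MONOISOTOPIC_MASS, z)
    print(f"z={z}: shift = {shift:.4f} m/z")
```

A pair of LC-MS features separated by this m/z difference at a given charge state, together with a retention time change, is the kind of signature a label-free comparison could flag for follow-up.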

The visual photoreceptor rhodopsin is a prototypical class I (rhodopsin-like) G protein-coupled receptor. Photoisomerization of the covalently bound ligand 11-cis-retinal leads to restructuring of the cytosolic face of rhodopsin. The ensuing protonation of Glu-134 in the class-conserved D(E)RY motif at the C-terminal end of transmembrane helix-3 promotes the formation of the G protein-activating state. Using transmembrane segments derived from helix-3 of bovine rhodopsin, we show that lipid protein interactions play a key role in this cytosolic “proton switch.” Infrared and fluorescence spectroscopic pKa determinations reveal that the D(E)RY motif is an autonomous functional module coupling side chain neutralization to conformation and helix positioning as evidenced by side chain to lipid headgroup Förster resonance energy transfer. The free enthalpies of helix stabilization and hydrophobic burial of the neutral carboxyl shift the side chain pKa into the range typical of Glu-134 in photoactivated rhodopsin. The lipid-mediated coupling mechanism is independent of interhelical contacts allowing its conservation without interference with the diversity of ligand-specific interactions in class I G protein-coupled receptors.
G protein-coupled receptors (GPCRs) are hepta-helical membrane proteins that couple a large variety of extracellular signals to cell-specific responses via activation of G proteins. In the visual photoreceptor rhodopsin, a prototypical class I GPCR (1, 2), molecular activation processes can be monitored in real time by spectroscopic assays and analyzed in the context of several crystal structures (3–8). The primary signal for rhodopsin is the 11-cis to all-trans photoisomerization of retinal covalently bound to the apoprotein opsin through a protonated Schiff base to Lys296. Current models converge toward a picture in which “microdomains” act as conformational switches that are coupled to different degrees to the primary activation process.
Two activating “proton switches” have been identified (9) as follows: breakage of an intramolecular salt bridge (10) by transfer of the Schiff base proton to its counter ion Glu-113 (11) is followed by movement of helix-6 (H6) (12, 13) in the metarhodopsin IIa (MIIa) to MIIb transition. The MIIb state takes up a proton at Glu-134 (14) in the class-conserved D(E)RY motif at the C-terminal end of helix-3 (H3) leading to the MIIbH+ intermediate (15, 16), which activates transducin (Gt), the G protein of the photoreceptor cell. Glu-134 regulates the pH sensitivity of receptor signaling (17) in membranes as reviewed previously (18), and in complex with Gt the protonated state of the carboxyl group becomes stabilized (19). This charge alteration is linked to the release of an “ionic lock,” originally described for the β2-adrenergic receptor (20), which also in rhodopsin stabilizes the inactive state (16) through interactions between the cytosolic ends of H3 and H6 (21).
In the absence of a lipidic bilayer, proton uptake and H6 movement become uncoupled (15). Lipidic composition affects MII formation, rhodopsin structure, and oligomerization (22–24) and differs at the rhodopsin membrane interface from the bulk lipidic phase (25). Likewise, MII formation specifically affects lipid structure (26). Although of fundamental importance for GPCR activation, the potential implication of lipid protein interactions in “proton switching” is not clear. A functional role of Glu-134 in lipid interactions has been originally derived from IR spectra where E134Q replacement abolished changes of lipid headgroup vibrations in the MII·Gt complex (19). Computational approaches emphasized the “strategic” location of the D(E)RY motif (27), and the Glu-134 carboxyl pKa may critically depend on the lipid protein interface (28).
However, the implications for proton switching are not evident, and the theoretical interest is contrasted by the lack of experimental data addressing the effect of the lipidic phase on side chain protonation, secondary structure, and membrane topology of the D(E)RY motif.
We have studied the coupling between conformation and protonation in single transmembrane segments derived from H3 of bovine rhodopsin. We have assessed the “modular” function of the D(E)RY motif by determining parameters not evident from the crystal structures, i.e. the pKa of the conserved carboxyl, its linkage to helical structure, and the effect of protonation on side chain to lipid headgroup distance. We show that the D(E)RY motif encodes an autonomous “proton switch” controlling side chain exposure and helix formation in the low dielectric of a lipidic phase. The data ascribe a functional role to lipid protein interactions that couple the chemical potential of protons to an activity-promoting GPCR conformation in a ligand-independent manner.
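The link between a side chain pKa and its protonation state can be made concrete with the Henderson-Hasselbalch relation, under which the protonated fraction of a single titratable group is 1 / (1 + 10^(pH − pKa)). The sketch below is a generic illustration of that relation, not the fitting procedure used in the study; the function name and the pKa values are hypothetical.

```python
def protonated_fraction(ph, pka):
    """Fraction of a single titratable group that is protonated at a given pH.

    Follows the Henderson-Hasselbalch relation for a non-cooperative site.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Hypothetical pKa values: raising the pKa increases the protonated
# (neutral carboxyl) population at a fixed pH.
for pka in (4.0, 6.0, 8.0):
    print(f"pKa={pka}: fraction protonated at pH 7 = {protonated_fraction(7.0, pka):.3f}")
```

This shows why an upward pKa shift of a carboxyl group, such as one caused by burial in a low-dielectric environment, makes the neutral form substantially populated near neutral pH.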

Top-down proteomics is emerging as a viable method for the routine identification of hundreds to thousands of proteins. In this work we report the largest top-down study to date, with the identification of 1,220 proteins from the transformed human cell line H1299 at a false discovery rate of 1%. Multiple separation strategies were utilized, including the focused isolation of mitochondria, resulting in significantly improved proteome coverage relative to previous work. In all, 347 mitochondrial proteins were identified, including ∼50% of the mitochondrial proteome below 30 kDa and over 75% of the subunits constituting the large complexes of oxidative phosphorylation. Three hundred of the identified proteins were found to be integral membrane proteins containing between 1 and 12 transmembrane helices, requiring no specific enrichment or modified LC-MS parameters. Over 5,000 proteoforms were observed, many harboring post-translational modifications, including over a dozen proteins containing lipid anchors (some previously unknown) and many others with phosphorylation and methylation modifications. Comparison between untreated and senescent H1299 cells revealed several changes to the proteome, including the hyperphosphorylation of HMGA2. This work illustrates the burgeoning ability of top-down proteomics to characterize large numbers of intact proteoforms in a high-throughput fashion.
Although traditional bottom-up approaches to mass-spectrometry-based proteomics are capable of identifying thousands of protein groups from a complex mixture, proteolytic digestion can result in the loss of information pertaining to post-translational modifications and sequence variants (1, 2).
The recent implementation of top-down proteomics in a high-throughput format using either Fourier transform ion cyclotron resonance (3–5) or Orbitrap instruments (6, 7) has shown an increasing scale of applicability while preserving information on combinatorial modifications and highly related sequence variants. For example, the identification of over 500 bacterial proteins helped researchers find covalent switches on cysteines (7), and over 1,000 proteins were identified from human cells (3). Such advances have driven the detection of whole protein forms, now simply called proteoforms (8), with several laboratories now seeking to tie these to specific functions in cell and disease biology (9–11).
The term “proteoform” denotes a specific primary structure of an intact protein molecule that arises from a specific gene and refers to a precise combination of genetic variation, splice variants, and post-translational modifications. Whereas special attention is required in order to accomplish gene- and variant-specific identifications via the bottom-up approach, top-down proteomics routinely links proteins to specific genes without the problem of protein inference. However, the fully automated characterization of whole proteoforms still represents a significant challenge in the field. Another major challenge is to extend the top-down approach to the study of whole integral membrane proteins, whose hydrophobicity can often limit their analysis via LC-MS (5, 12–16). Though integral membrane proteins are often difficult to solubilize, the long stretches of sequence information provided from fragmentation of their transmembrane domains in the gas phase can actually aid in their identification (5, 13).
In parallel to the early days of bottom-up proteomics a decade ago (17–21), in this work we brought the latest methods for top-down proteomics into combination with subcellular fractionation and cellular treatments to expand coverage of the human proteome.
We utilized multiple dimensions of separation and an Orbitrap Elite mass spectrometer to achieve large-scale interrogation of intact proteins derived from H1299 cells. For this focus issue on post-translational modifications, we report this summary of findings from the largest implementation of top-down proteomics to date, which resulted in the identification of 1,220 proteins and thousands more proteoforms. We also applied the platform to H1299 cells induced into senescence by treatment with the DNA-damaging agent camptothecin.

Human polymerase kappa (hPol κ) is one of four eukaryotic Y-class DNA polymerases and may be an important element in the cellular response to polycyclic aromatic hydrocarbons such as benzo[a]pyrene, which can lead to reactive oxygenated metabolite-mediated oxidative stress. Here, we present a detailed analysis of the activity and specificity of hPol κ bypass opposite the major oxidative adduct 7,8-dihydro-8-oxo-2′-deoxyguanosine (8-oxoG). Unlike its archaeal homolog Dpo4, hPol κ bypasses this lesion in an error-prone fashion by inserting mainly dATP. Analysis of transient-state kinetics shows diminished “bursts” for dATP:8-oxoG and dCTP:8-oxoG incorporation, indicative of non-productive complex formation, but dATP:8-oxoG insertion events that do occur are 2-fold more efficient than dCTP:G insertion events. Crystal structures of ternary hPol κ complexes with adducted template-primer DNA reveal non-productive (dGTP and dATP) alignments of incoming nucleotide and 8-oxoG. Structural limitations placed upon the hPol κ by interactions between the N-clasp and finger domains combined with stabilization of the syn-oriented template 8-oxoG through the side chain of Met-135 both appear to contribute to error-prone bypass. Mutating Leu-508 in the little finger domain of hPol κ to lysine modulates the insertion opposite 8-oxoG toward more accurate bypass, similar to previous findings with Dpo4. Our structural and activity data provide insight into important mechanistic aspects of error-prone bypass of 8-oxoG by hPol κ compared with accurate and efficient bypass of the lesion by Dpo4 and polymerase η.
DNA damage incurred by a multitude of endogenous and exogenous factors constitutes an inevitable challenge for the replication machinery, and various mechanisms exist to either remove the resulting lesions or bypass them in a more or less mutation-prone fashion (1). Error-prone polymerases are central to trans-lesion synthesis across sites of damaged DNA (2, 3).
Four so-called Y-class DNA polymerases have been identified in humans, Pol η, Pol ι, Pol κ, and Rev1, which exhibit different activities and abilities to replicate past a flurry of individual lesions (4, 5). Homologs have also been identified and characterized in other organisms, notably DinB (Pol IV) in Escherichia coli (6–8), Dbh in Sulfolobus acidocaldarius (9, 10), and Dpo4 in Sulfolobus solfataricus (11, 12). A decade of investigations directed at the structural and functional properties of bypass polymerases have significantly improved our understanding of this class of enzymes (5, 13). A unique feature of Y-class polymerases, compared with the common right-handed arrangement of palm, thumb, and finger subdomains of high fidelity (i.e. A-class) DNA polymerases (14), is a “little finger” or “PAD” (palm-associated domain) subdomain that plays a crucial role in lesion bypass (12, 15–21). In addition to the little finger subdomain at the C-terminal end of the catalytic core, both Rev1 and Pol κ exhibit an N-terminal extension that is absent in other translesion polymerases. The N-terminal extension in the structure of the ternary (human) hPol κ·DNA·dTTP complex folds into a U-shaped tether-helix-turn-helix “clasp” that is located between the thumb and little finger domains, allowing the polymerase to completely encircle the DNA (18). Although the precise role of the clasp for lesion bypass by hPol κ remains to be established, it is clear that this entity is functionally important, because mutant enzymes with partially or completely removed clasps exhibit diminished catalytic activity compared with the full-length catalytic core (hPol κ N1–526) or a core lacking the N-terminal 19 residues (hPol κ N19–526; the construct used for crystal structure determination of the ternary complex (18)).
7,8-Dihydro-8-oxo-2′-deoxyguanosine (8-oxoG), found in both lower organisms and eukaryotes, is a major lesion that is a consequence of oxidative stress.
The lesion is of relevance not only because of its association with cancer (22, 23), but also in connection with aging (24), hepatitis (25), and infertility (26). It is far from clear which DNA polymerases bypass 8-oxoG most often in a cellular context, but given the ubiquitous nature of the lesion it seems likely that more than one enzyme could encounter the lesion. Replicative polymerases commonly insert dATP opposite template 8-oxoG, with the lesion adopting the preferred syn conformation (e.g. 27, 28). It was recently found that the translesion polymerase Dpo4 from S. solfataricus synthesizes efficiently past 8-oxoG, inserting ≥95% dCTP > dATP opposite the lesion (29, 30). The efficient and low error bypass of the 8-oxoG lesion by Dpo4 is associated to a large extent with an arginine residue in the little finger domain (17). In the crystal structure of the ternary Dpo4·DNA·dCTP complex, the side chain of Arg-332 forms a hydrogen bond to the 8-oxygen of 8-oxoG, thus shifting the nucleoside conformational equilibrium toward the anti state and enabling a Watson-Crick binding mode with the incoming dCTP (30). The efficient and accurate replication of templates bearing 8-oxoG by yeast Pol η (31, 32) may indicate similarities between the bypass reactions catalyzed by the archaeal and eukaryotic enzymes. In contrast, bypass synthesis opposite 8-oxoG by human Pol κ is error-prone, resulting in efficient incorporation of A (33–35). The inaccurate bypass of 8-oxoG is thought to contribute to the deleterious effects associated with the lesion. These observations indicate different behaviors of the eukaryotic trans-lesion Pol κ and its archaeal “homolog” Dpo4 vis-à-vis the major oxidative stress lesion 8-oxoG. A mechanistic understanding of human DNA polymerases that bypass 8-oxoG in an error-prone fashion, such as hPol κ, is therefore of great interest.
To elucidate commonalities and differences between the trans-8-oxoG syntheses of S.
solfataricus Dpo4, yeast Pol η, and hPol κ, we carried out a comprehensive analysis of the bypass activity for the latter with template·DNA containing the 8-oxoG lesion, including pre-steady-state and steady-state kinetics of primer extension opposite and beyond 8-oxoG and LC-MS/MS assays of full-length extension products. We determined crystal structures of ternary hPol κ-(8-oxoG)DNA-dGTP and hPol κ-(8-oxoG)DNA-dATP complexes, apparently the first for any complex with adducted DNA for the κ enzyme reported to date. Our work demonstrates clear distinctions between genetically related translesion polymerases and provides insights into the origins of the error-prone reactions opposite 8-oxoG catalyzed by Y-family DNA polymerases.

Membrane fusion without lysis has been reconstituted with purified yeast vacuolar SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptors), the SNARE chaperones Sec17p/Sec18p and the multifunctional HOPS complex, which includes a subunit of the SNARE-interactive Sec1-Munc18 family, and vacuolar lipids: phosphatidylcholine (PC), phosphatidylethanolamine (PE), phosphatidylinositol (PI), phosphatidylserine (PS), phosphatidic acid (PA), cardiolipin (CL), ergosterol (ERG), diacylglycerol (DAG), and phosphatidylinositol 3-phosphate (PI3P). We now report that many of these lipids are required for rapid and efficient fusion of the reconstituted SNARE proteoliposomes in the presence of SNARE chaperones. Omission of either PE, PA, or PI3P from the complete set of lipids strongly reduces fusion, and PC, PE, PA, and PI3P constitute a minimal set of lipids for fusion. PA could neither be replaced by other lipids with small headgroups such as DAG or ERG nor by the acidic lipids PS or PI. PA is needed for full association of HOPS and Sec18p with proteoliposomes having a minimal set of lipids. Strikingly, PA and PE are as essential for SNARE complex assembly as for fusion, suggesting that these lipids facilitate functional interactions among SNAREs and SNARE chaperones.
Biological membrane fusion is the regulated rearrangement of the lipids in two apposed sealed membranes to form one bilayer while mixing lumenal contents without leakage or lysis. It is fundamental for intracellular vesicular traffic, cell growth and division, regulated secretion of hormones and other blood proteins, and neurotransmission and thus has attracted wide and sustained study (1, 2). Its fundamental mechanisms are conserved and employ a Rab-family GTPase, proteins which bind to the GTP-bound form of a Rab, termed its “effectors” (3), and SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptors) proteins (4) with their attendant chaperones.
SNAREs are integral or peripheral membrane proteins with characteristic heptad-repeat domains, which can associate in 4-helical coiled-coils (5), termed “cis-SNARE complexes,” if they are all anchored to the same membrane bilayer, or “trans-SNARE complexes” if they are anchored to apposed membranes.
Stable membrane proximity (docking) does not suffice for fusion. Studies in model systems have shown that fusion can be promoted by any of several agents, which promote bilayer rearrangement, such as diacylglycerol (6), high levels of calcium (7), viral-encoded fusion proteins (8, 9), or SNAREs (10, 11). These studies frequently employed liposomes or proteoliposomes of simple lipid composition, suggesting that fusion may not have stringent requirements of lipid head group species. However, each of these model fusion reactions is accompanied by substantial lysis (12–15), whereas the preservation of subcellular compartments is a hallmark of physiological membrane fusion.
We have studied membrane fusion with the vacuole (lysosome) of Saccharomyces cerevisiae (reviewed in Ref. 16). The fusion of isolated vacuoles requires the Rab Ypt7p, 4 SNAREs (Vam3p, Vti1p, Vam7p, and Nyv1p), the SNARE chaperones Sec17p (α-soluble N-ethylmaleimide-sensitive factor attachment protein)/Sec18p (N-ethylmaleimide-sensitive factor) and the hexameric HOPS complex (17), and key “regulatory” lipids including ERG, phosphoinositides, and DAG (18). HOPS interacts physically or functionally with each component of this fusion system. HOPS stably associates with Ypt7p in its GTP-bound state (19). One HOPS subunit, Vps33p, is a member of the Sec1-Munc18 family of SNARE-binding proteins, and HOPS exhibits direct affinity for SNAREs (17, 20–22) and proofreads correct vacuolar SNARE pairing (23). HOPS also has direct affinity for phosphoinositides (17). The SNAREs on isolated vacuoles are in cis-complexes, which are disassembled by Sec17p, Sec18p, and ATP (24). Docking requires Ypt7p (25) and HOPS (17).
During docking, vacuoles are drawn against each other until each has a substantial membrane domain tightly apposed to the other. Each of the proteins (26) and lipids (18) required for fusion becomes enriched in a ring-shaped microdomain, the “vertex ring,” which surrounds the two tightly apposed membrane domains. Not only do the proteins depend on each other, in a cascade fashion, for vertex ring enrichment, and the lipids depend on each other for their vertex ring enrichment as well, but the lipids and proteins are mutually interdependent for their enrichment at this ring-shaped microdomain (18, 27). Fusion occurs around the ring, joining the two organelles. The fusion of vacuoles bearing physiological fusion constituents does not cause measurable organelle lysis, although fusion supported exclusively by higher levels of SNARE proteins is accompanied by massive lysis (28), in accord with model liposome studies (14). Thus fusion microdomain assembly and the coordinate action of SNAREs with other proteins and lipids to promote fusion without lysis are central topics in membrane fusion studies.
Reconstitution of fusion with pure components allows chemical definition of essential elements of this biologically important reaction. Although SNAREs can drive a slow fusion of PC/PS proteoliposomes (29), this was not stimulated by HOPS and Sec17p/Sec18p (30). SNARE proteoliposomes bearing all the vacuolar lipids (18, 31–33), PC, PE, PI, PS, CL, PA, ERG, DAG, PI3P, and phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2), showed rapid and efficient fusion that was fully dependent on Sec17p/Sec18p and HOPS (30). The omission of either DAG, ERG, or phosphoinositide from the liposomes caused a marked reduction in fusion (30). We now report that PE and PA are also necessary for rapid and efficient fusion, function in distinct manners, and are required for efficient assembly of newly formed SNARE complexes by the SNARE chaperones Sec17p/Sec18p and HOPS.
