Similar literature
1.
Understanding how a small brain region, the suprachiasmatic nucleus (SCN), can synchronize the body's circadian rhythms is an ongoing research area. This important time-keeping system requires a complex suite of peptide hormones and transmitters that remain incompletely characterized. Here, capillary liquid chromatography and FTMS have been coupled with tailored software for the analysis of endogenous peptides present in the SCN of the rat brain. After ex vivo processing of brain slices, peptide extraction, identification, and characterization from tandem FTMS data with <5-ppm mass accuracy produced a hyperconfident list of 102 endogenous peptides, including 33 previously unidentified peptides, and 12 peptides that were post-translationally modified with amidation, phosphorylation, pyroglutamylation, or acetylation. This characterization of endogenous peptides from the SCN will aid in understanding the molecular mechanisms that mediate rhythmic behaviors in mammals.

Central nervous system neuropeptides function in cell-to-cell signaling and are involved in many physiological processes such as circadian rhythms, pain, hunger, feeding, and body weight regulation (1–4). Neuropeptides are produced from larger protein precursors by the selective action of endopeptidases, which cleave at mono- or dibasic sites and then remove the C-terminal basic residues (1, 2). Some neuropeptides undergo functionally important post-translational modifications (PTMs), including amidation, phosphorylation, pyroglutamylation, or acetylation. These aspects of peptide synthesis impact the properties of neuropeptides, further expanding their diverse physiological implications. Therefore, unveiling new peptides and unreported peptide properties is critical to advancing our understanding of nervous system function.

Historically, the analysis of neuropeptides was performed by Edman degradation, in which the N-terminal amino acid is sequentially removed.
However, analysis by this method is slow and does not allow for sequencing of the peptides containing N-terminal PTMs (5). Immunological techniques, such as radioimmunoassay and immunohistochemistry, are used for measuring relative peptide levels and spatial localization, but these methods only detect peptide sequences with known structure (6). More direct, high throughput methods of analyzing brain regions can be used.

Mass spectrometry, a rapid and sensitive method that has been used for the analysis of complex biological samples, can detect and identify the precise forms of neuropeptides without prior knowledge of peptide identity, with these approaches making up the field of peptidomics (7–12). The direct tissue and single neuron analysis by MALDI MS has enabled the discovery of hundreds of neuropeptides in the last decade, and the neuronal homogenate analysis by fractionation and subsequent ESI or MALDI MS has yielded an equivalent number of new brain peptides (5). Several recent peptidome studies, including the work by Dowell et al. (10), have used the specificity of FTMS for peptide discovery (10, 13–15). Here, we combine the ability to fragment ions at ultrahigh mass accuracy (16) with a software pipeline designed for neuropeptide discovery. We use nanocapillary reversed-phase LC coupled to 12 Tesla FTMS for the analysis of peptides present in the suprachiasmatic nucleus (SCN) of rat brain.

A relatively small, paired brain nucleus located at the base of the hypothalamus directly above the optic chiasm, the SCN contains a biological clock that generates circadian rhythms in behaviors and homeostatic functions (17, 18). The SCN comprises ∼10,000 cellular clocks that are integrated as a tissue level clock which, in turn, orchestrates circadian rhythms throughout the brain and body.
It is sensitive to incoming signals from the light-sensing retina and other brain regions, which cause temporal adjustments that align the SCN appropriately with changes in environmental or behavioral state. Previous physiological studies have implicated peptides as critical synchronizers of normal SCN function as well as mediators of SCN inputs, internal signal processing, and outputs; however, only a small number of peptides have been identified and explored in the SCN, leaving unresolved many circadian mechanisms that may involve peptide function.

Most peptide expression in the SCN has only been studied through indirect antibody-based techniques (19–29), although we recently used MS approaches to characterize several peptides detected in SCN releasates (30). Previous studies indicate that the SCN expresses a rich diversity of peptides relative to other brain regions studied with the same techniques. Previously used immunohistochemical approaches are not only inadequate for comprehensively evaluating PTMs and alternate isoforms of known peptides but are also incapable of exhaustively examining the full peptide complement of this complex biological network of peptidergic inputs and intrinsic components. A comprehensive study of SCN peptidomics is required that utilizes high resolution strategies for directly analyzing the peptide content of the neuronal networks comprising the SCN.

In our study, the SCN was obtained from ex vivo coronal brain slices via tissue punch and subjected to multistage peptide extraction. The SCN tissue extract was analyzed by FTMS/MS, and the high resolution MS and MS/MS data were processed using ProSightPC 2.0 (16), which allows the identification and characterization of peptides or proteins from high mass accuracy MS/MS data. In addition, the Sequence Gazer included in ProSightPC was used for manually determining PTMs (31, 32).
As a result, a total of 102 endogenous peptides were identified, including 33 that were previously unidentified, and 12 PTMs (including amidation, phosphorylation, pyroglutamylation, and acetylation) were found. The present study is the first comprehensive peptidomics study for identifying peptides present within the mammalian SCN. In fact, this is one of the first peptidome studies to work with discrete brain nuclei as opposed to larger brain structures and follows up on our recent report using LC-ion trap for analysis of the peptides in the supraoptic nucleus (33); here, the use of FTMS allows a greater range of PTMs to be confirmed and allows higher confidence in the peptide assignments. This information on the peptides in the SCN will serve as a basis to more exhaustively explore the extent that previously unreported SCN neuropeptides may function in SCN regulation of mammalian circadian physiology.
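The "<5-ppm mass accuracy" criterion quoted in this entry can be illustrated with a short calculation. The following is a minimal sketch, not the ProSightPC procedure: the function names and the example m/z values are hypothetical, and the only thing taken from the text is the 5-ppm threshold.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True if the absolute mass error is below tol_ppm (5 ppm assumed here)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tol_ppm


if __name__ == "__main__":
    # Hypothetical m/z values, chosen only to show both outcomes.
    print(within_tolerance(1000.002, 1000.0))  # about 2 ppm off
    print(within_tolerance(1000.010, 1000.0))  # about 10 ppm off
```

A relative (ppm) tolerance rather than an absolute one in daltons means the acceptance window scales with the mass of the ion being matched.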

2.
3.
MHC class I proteins assemble with peptides in the ER. The peptides are predominantly generated from cytoplasmic proteins, probably by the action of the proteasome, a multicatalytic proteinase complex. Peptides are translocated into the ER by the transporters associated with antigen processing (TAP), and bind to the MHC class I molecules before transport to the cell surface. Here, we use a new functional assay to demonstrate that peptides derived from vesicular stomatitis virus nucleoprotein (VSV-N) antigen are actively secreted from cells. This secretion pathway is dependent on the expression of TAP transporters, but is independent of the MHC genotype of the donor cells. Furthermore, the expression and transport of MHC class I molecules is not required. This novel pathway is sensitive to the protein secretion inhibitors brefeldin A (BFA) and a temperature block at 21°C, and is also inhibited by the metabolic poison, azide, and the protein synthesis inhibitor, emetine. These data support the existence of a novel form of peptide secretion that uses the TAP transporters, as opposed to the ER translocon, to gain access to the secretion pathway. Finally, we suggest that this release of peptides in the vicinity of uninfected cells, which we term surrogate antigen processing, could contribute to various immune and secretory phenomena.

Protein secretion has traditionally been thought to involve the action of the translocon located in the membrane of the ER of eukaryotic cells. Proteins are recognized cotranslationally when a signal sequence or a signal–anchor sequence emerges from the ribosome (Walter and Johnson, 1994). These sequences are recognized and bound by the signal recognition particle, and the resulting ribosomal complex then interacts with the signal recognition particle receptor on the ER membrane at the translocon (Andrews and Johnson, 1996). This results in the inclusion of proteins within the secretory pathway.
This pathway is by far the best described route of protein secretion in eukaryotic cells. Recently, it has been proposed that some proteins are recognized by a component of the translocon, sec 61, exit the ER, and are transported into the cytoplasm where they are degraded (Wiertz et al., 1996).

The translocation into the ER of antigenic peptides for presentation by major histocompatibility complex (MHC) class I molecules is largely independent of the translocon. This form of translocation involves the action of two gene products that are members of the ATP binding cassette family. These genes encode transporters associated with antigen processing 1 and 2 (TAP-1 and -2), and have been implicated in the translocation of peptides from the cytoplasm to the lumen of the ER (Deverson et al., 1990; Bahram et al., 1991; Spies and DeMars, 1991; Spies et al., 1992; Gabathuler et al., 1994). After translocation into the ER, antigenic peptides bind to MHC class I molecules composed of a heavy chain (46-kD) and a light chain (12-kD) called β2m (Nuchtern et al., 1989; Yewdell and Bennink, 1989; van Bleek and Nathenson, 1990; Matsumura et al., 1992), before transport to the cell surface. The assembly and transport of MHC class I molecules appear to be regulated by a series of chaperones that includes calnexin (Degen and Williams, 1991), calreticulin, and tapasin (Sadasivan et al., 1996).

High performance liquid chromatography analysis of peptides eluted from acid-treated whole cells or MHC class I molecules has allowed the identification and characterization of the peptides associated with MHC class I molecules (Falk et al., 1990; Rötzschke et al., 1990; van Bleek and Nathenson, 1990). It is proposed that MHC class I molecules determine the final identity of MHC-restricted peptides and have an instructive role, in addition to a selective role, in peptide selection (Wallny et al., 1992).
MHC binding to larger peptides followed by protected proteolytic trimming is a possible mechanism that could account for the observed MHC dependency of cellular peptides (Falk et al., 1990). Peptides unable to bind MHC class I because they are in excess or lack the correct MHC binding motif for the MHC haplotype are thus far undetectable by the techniques commonly used in the field, and are presumed to be short lived and degraded (Falk et al., 1990; Rötzschke et al., 1990). Recent results, however, suggest that peptides not able to bind to an MHC class I molecule intracellularly may be found bound to heat shock proteins (HSPs) such as gp96 (grp94; Arnold et al., 1995). These authors show that cellular antigens are represented by peptides associated with gp96 molecules independently of the MHC class I expressed, confirming earlier results (Udono and Srivastava, 1993, 1994). Gp96 extracted from a specific cell is able to induce cross-priming (Udono and Srivastava, 1993, 1994). Finally, two studies have demonstrated that peptides that are transported into the lumen of the ER and do not bind to MHC class I molecules can be transported out of the ER into the cytoplasm again by a process called “efflux” (Momburg et al., 1994; Schumacher et al., 1994), which may involve the action of the TAP molecules or the sec 61 protein associated with the translocon (Wiertz et al., 1996).

We have developed a new bioassay to test the hypothesis that peptides translocated into the ER by the action of the TAP molecules become secreted. Using this assay, we present evidence of an alternative secretion pathway that exists in various mammalian cell types. These observations revise the model of peptide catabolism, and may provide an explanation for various immune and secretion phenomena.

4.
Birdshot chorioretinopathy is a rare ocular inflammation whose genetic association with HLA-A*29:02 is the highest between a disease and a major histocompatibility complex (MHC) molecule. It belongs to a group of MHC-I-associated inflammatory disorders, also including ankylosing spondylitis, psoriasis, and Behçet's disease, for which endoplasmic reticulum aminopeptidases (ERAP) 1 and/or 2 have been identified as genetic risk factors. Since both enzymes are involved in the processing of MHC-I ligands, it seems reasonable that common peptide-mediated mechanisms may underlie the pathogenesis of these diseases. In this study, comparative immunopeptidomics was used to characterize >5000 A*29:02 ligands and quantify the effects of ERAP1 polymorphism and expression on the A*29:02 peptidome in human cells. The peptides predominant in an active ERAP1 context showed a higher frequency of nonamers and bulkier amino acid side chains at multiple positions, compared with the peptides predominant in a less active ERAP1 background. Thus, ERAP1 polymorphism has a large influence, shaping the A*29:02 peptidome through length-dependent and length-independent effects. These changes resulted in increased affinity and hydrophobicity of A*29:02 ligands in an active ERAP1 context. The results reveal the nature of the functional interaction between A*29:02 and ERAP1 and suggest that this enzyme may affect the susceptibility to birdshot chorioretinopathy by altering the A*29:02 peptidome. The complexity of these alterations is such that not only peptide presentation but also other potentially pathogenic features could be affected.

Several major histocompatibility complex class I (MHC-I) alleles are strongly associated with polygenic inflammatory diseases, including birdshot chorioretinopathy (BSCR: A*29:02), ankylosing spondylitis (AS: HLA-B*27), psoriasis (C*06:02), and Behçet's disease (HLA-B*51).
In the three latter disorders, ERAP1, an aminopeptidase of the endoplasmic reticulum performing the final trimming of MHC-I ligands (1, 2), is also a risk factor and is in epistasis with the predisposing MHC-I allele (3–5). These studies suggest common pathogenetic mechanisms involving the MHC-I bound peptidome. ERAP2, a related enzyme that acts in concert with ERAP1 (6, 7), influences the susceptibility to BSCR (8), AS (although not necessarily in epistasis with HLA-B*27) (9), Crohn's disease (10), and preeclampsia (11–13).

BSCR is a rare and severe form of bilateral posterior uveitis, showing a progressive inflammation of the choroid and retina, whose association with HLA-A*29 is the strongest for any disease and MHC. The frequency of this allele is about 7% in healthy individuals but >95% in BSCR patients (14, 15). This association specifically concerns A*29:02 and not the closely related allotype A*29:01 (8).

Genetic studies on BSCR also showed a highly significant association within the LNPEP gene (rs7705093) in the 5q15 region, which includes the ERAP1 and ERAP2 genes. One single nucleotide polymorphism (SNP) in this region (rs10044354) correlated with ERAP2 expression. This was confirmed at the protein level, leading to the conclusion that ERAP2 expression predisposes to BSCR. Yet, an involvement of functional ERAP1 polymorphisms, not determining protein expression, was not excluded. These polymorphisms have a large influence on the HLA-B*27 peptidome (16, 17). In contrast, the effects of ERAP2 on MHC-I peptidomes are poorly understood and are probably dependent on the particular ERAP1 context since ERAP2 cooperates with ERAP1 in peptide processing. Thus, the present study was conducted to characterize A*29:02-bound peptidomes in various ERAP1 backgrounds and to determine the influence of ERAP1 polymorphism on the amounts and features of A*29:02 ligands in human cells.

5.
A complete understanding of the biological functions of large signaling peptides (>4 kDa) requires comprehensive characterization of their amino acid sequences and post-translational modifications, which presents significant analytical challenges. In the past decade, there has been great success with mass spectrometry-based de novo sequencing of small neuropeptides. However, these approaches are less applicable to larger neuropeptides because of the inefficient fragmentation of peptides larger than 4 kDa and their lower endogenous abundance. The conventional proteomics approach focuses on large-scale determination of protein identities via database searching, lacking the ability for in-depth elucidation of individual amino acid residues. Here, we present a multifaceted MS approach for identification and characterization of large crustacean hyperglycemic hormone (CHH)-family neuropeptides, a class of peptide hormones that play central roles in the regulation of many important physiological processes of crustaceans. Six crustacean CHH-family neuropeptides (8–9.5 kDa), including two novel peptides with extensive disulfide linkages and PTMs, were fully sequenced without reference to genomic databases. High-definition de novo sequencing was achieved by a combination of bottom-up, off-line top-down, and on-line top-down tandem MS methods. Statistical evaluation indicated that these methods provided complementary information for sequence interpretation and increased the local identification confidence of each amino acid. Further investigations by MALDI imaging MS mapped the spatial distribution and colocalization patterns of various CHH-family neuropeptides in the neuroendocrine organs, revealing that two CHH-subfamilies are involved in distinct signaling pathways.

Neuropeptides and hormones comprise a diverse class of signaling molecules involved in numerous essential physiological processes, including analgesia, reward, food intake, learning and memory (1).
Disorders of the neurosecretory and neuroendocrine systems influence many pathological processes. For example, obesity results from failure of energy homeostasis in association with endocrine alterations (2, 3). Previous work from our lab, using crustaceans as model organisms, found that multiple neuropeptides were implicated in control of food intake, including RFamides, tachykinin related peptides, RYamides, and pyrokinins (4–6).

Crustacean hyperglycemic hormone (CHH) family neuropeptides play a central role in energy homeostasis of crustaceans (7–17). Hyperglycemic response of the CHHs was first reported after injection of crude eyestalk extract in crustaceans. Based on their preprohormone organization, the CHH family can be grouped into two sub-families: subfamily-I containing CHH, and subfamily-II containing molt-inhibiting hormone (MIH) and mandibular organ-inhibiting hormone (MOIH). The preprohormones of the subfamily-I have a CHH precursor related peptide (CPRP) that is cleaved off during processing; and preprohormones of the subfamily-II lack the CPRP (9). Uncovering their physiological functions will provide new insights into neuroendocrine regulation of energy homeostasis.

Characterization of CHH-family neuropeptides is challenging. They are comprised of more than 70 amino acids and often contain multiple post-translational modifications (PTMs) and complex disulfide bridge connections (7). In addition, physiological concentrations of these peptide hormones are typically below picomolar level, and most crustacean species do not have available genome and proteome databases to assist MS-based sequencing.

MS-based neuropeptidomics provides a powerful tool for rapid discovery and analysis of a large number of endogenous peptides from the brain and the central nervous system. Our group and others have greatly expanded the peptidomes of many model organisms (3, 18–33).
For example, we have discovered more than 200 neuropeptides with several neuropeptide families consisting of as many as 20–40 members in a simple crustacean model system (5, 6, 25–31, 34). However, a majority of these neuropeptides are small peptides 5–15 amino acid residues long, leaving a gap in identifying larger signaling peptides from organisms without a sequenced genome. The observed lack of larger size peptide hormones can be attributed to the lack of effective de novo sequencing strategies for neuropeptides larger than 4 kDa, which are inherently more difficult to fragment using conventional techniques (34–37). Although classical proteomics studies examine larger proteins, these tools are limited to identification based on database searching with one or more peptides matching without complete amino acid sequence coverage (36, 38).

Large populations of neuropeptides from 4–10 kDa exist in the nervous systems of both vertebrates and invertebrates (9, 39, 40). Understanding their functional roles requires sufficient molecular knowledge and a unique analytical approach. Therefore, developing effective and reliable methods for de novo sequencing of large neuropeptides at the individual amino acid residue level is an urgent need in neurobiology. In this study, we present a multifaceted MS strategy aimed at high-definition de novo sequencing and comprehensive characterization of the CHH-family neuropeptides in crustacean central nervous system. The high-definition de novo sequencing was achieved by a combination of three methods: (1) enzymatic digestion and LC-tandem mass spectrometry (MS/MS) bottom-up analysis to generate detailed sequences of proteolytic peptides; (2) off-line LC fractionation and subsequent top-down MS/MS to obtain high-quality fragmentation maps of intact peptides; and (3) on-line LC coupled to top-down MS/MS to allow rapid sequence analysis of low abundance peptides.
Combining the three methods overcomes the limitations of each, and thus offers complementary and high-confidence determination of amino acid residues. We report the complete sequence analysis of six CHH-family neuropeptides including the discovery of two novel peptides. With the accurate molecular information, MALDI imaging and ion mobility MS were conducted for the first time to explore their anatomical distribution and biochemical properties.

6.
7.
8.
9.
The performances of 10 different normalization methods on data of endogenous brain peptides produced with label-free nano-LC-MS were evaluated. Data sets originating from three different species (mouse, rat, and Japanese quail), each consisting of 35–45 individual LC-MS analyses, were used in the study. Each sample set contained both technical and biological replicates, and the LC-MS analyses were performed in a randomized block fashion. Peptides in all three data sets were found to display LC-MS analysis order-dependent bias. Global normalization methods correct this type of bias only to some extent. Only the novel normalization procedure RegrRun (linear regression followed by analysis order normalization) corrected for this type of bias. The RegrRun procedure performed the best of the normalization methods tested and decreased the median S.D. by 43% on average compared with raw data. This method also produced the smallest fraction of peptides with interblock differences while producing the largest fraction of differentially expressed peaks between treatment groups in all three data sets. Linear regression normalization (Regr) performed second best and decreased median S.D. by 38% on average compared with raw data. All other examined methods reduced median S.D. by 20–30% on average compared with raw data.

Peptidomics is defined as the analysis of the peptide content within an organism, tissue, or cell (1–3). The proteome and peptidome have common features, but there are also prominent differences. Proteomics generally identifies proteins by using the information of biologically inactive peptides derived from tryptic digestion, whereas peptidomics tries to identify endogenous peptides using single peptide sequence information only (4). Endogenous neuropeptides are peptides used for intracellular signaling that can act as neurotransmitters or neuromodulators in the nervous system.
These polypeptides of 3–100 amino acids can be abundantly produced in large neural populations or in trace levels from single neurons (5) and are often generated through the cleavage of precursor proteins. However, unwanted peptides can also be created through post-mortem induced proteolysis (6). The latter aspect complicates the technical analysis of neuropeptides as post-mortem conditions increase the number of degradation peptides. The possibility to detect, identify, and quantify lowly expressed neuropeptides using label-free LC-MS techniques has improved with the development of new sample preparation techniques including rapid heating of the tissue, which prevents protein degradation and inhibition of post-mortem proteolytic activity (7, 8).

It has been suggested by us (4, 5) and others (9) that comparing the peptidome between samples of, e.g., diseased and normal tissue may lead to the discovery of biologically relevant peptides of certain pathological or pharmacological events. However, differences in relative peptide abundance measurements may not only originate from biological differences but also from systematic bias and noise. To reduce the effects of experimentally induced variability it is common to normalize the raw data. This is a concept well known in the area of genomics studies using gene expression microarrays (10–12). As a consequence, many methods developed for microarray data have also been adapted for normalizing peptide data produced with LC-MS techniques (10–16). Normally the underlying assumption for applying these techniques is that the total or mean/median peak abundances should be equal across different experiments, in this case between LC-MS analyses. Global normalization methods refer to cases where all peak abundances are used to determine a single normalization factor between experiments (13, 15, 16), a subset of peaks assumed to be similarly abundant between experiments (16) is used, or spiked-in peptides are used as internal standards.
In a study by Callister et al. (14), normalization methods for tryptic LC-FTICR-MS peptide data were compared. The authors concluded that global or iterative linear regression works best in most cases but also recommended that the best procedure should be selected for each data set individually. Methods used for normalizing LC-MS data have been reviewed previously (14, 17, 18), but to our knowledge only Callister et al. (14) have used small data sets to systematically evaluate such methods. None of these studies have targeted data of endogenous peptides.

In this study, the effects of 10 different normalization methods were evaluated on data produced by a nano-LC system coupled to an electrospray Q-TOF or linear trap quadrupole (LTQ) mass spectrometer. Normalization methods that originally were developed for gene expression data were used, and one novel method, linear regression followed by analysis order normalization (RegrRun), is presented. The normalization methods were evaluated using three data sets of endogenous brain peptides originating from three different species (mouse, rat, and Japanese quail), each consisting of 35–45 individual LC-MS analyses. Each data set contained both technical and biological replicates.
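The two ideas this entry contrasts, a single global factor per LC-MS analysis and a correction for analysis-order bias, can be sketched in a few lines. This is an illustrative sketch only and not the authors' RegrRun implementation, whose details are not given here. It assumes log2-transformed peak intensities, a complete matrix with no missing values, and a simple linear drift over run order; both function names are hypothetical.

```python
from statistics import mean, median


def median_normalize(runs):
    """Global normalization: one shift per run.

    runs: list of LC-MS runs, each a list of log2 peak intensities
    (same peptides in the same order). Each run is shifted so that
    its median equals the median of all run medians.
    """
    run_medians = [median(r) for r in runs]
    target = median(run_medians)
    return [[x - m + target for x in r] for r, m in zip(runs, run_medians)]


def remove_run_order_trend(values):
    """Analysis-order correction for one peptide (assumed linear drift).

    values: that peptide's log2 intensities listed in analysis order
    (at least two runs). A least-squares line is fitted against run
    index and the fitted trend is subtracted, keeping the mean level.
    """
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sxx
    return [y - slope * (x - x_bar) for x, y in zip(xs, values)]


if __name__ == "__main__":
    print(median_normalize([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]))
    print(remove_run_order_trend([10.0, 10.5, 11.0, 11.5]))
```

A per-run shift cannot remove a drift that differs from peptide to peptide, which is one way to read the observation above that global methods correct order-dependent bias only to some extent.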

10.
HLA class I molecules reflect the health state of cells to cytotoxic T cells by presenting a repertoire of endogenously derived peptides. However, the extent to which the proteome shapes the peptidome is still largely unknown. Here we present a high-throughput mass-spectrometry-based workflow that allows stringent and accurate identification of thousands of such peptides and direct determination of binding motifs. Applying the workflow to seven cancer cell lines and primary cells yielded more than 22,000 unique HLA peptides across different allelic binding specificities. By computing a score representing the HLA-I sampling density, we show a strong link between protein abundance and HLA-presentation (p < 0.0001). When analyzing overpresented proteins – those with at least fivefold higher density score than expected for their abundance – we noticed that they are degraded almost 3 h faster than similar but nonpresented proteins (top 20% abundance class; median half-life 20.8 h versus 23.6 h, p < 0.0001). This validates protein degradation as an important factor for HLA presentation. Ribosomal, mitochondrial respiratory chain, and nucleosomal proteins are particularly well presented. Taking a set of proteins associated with cancer, we compared the predicted immunogenicity of previously validated T-cell epitopes with other peptides from these proteins in our data set. The validated epitopes indeed tend to have higher immunogenic scores than the other detected HLA peptides. Remarkably, we identified five mutated peptides from a human colon cancer cell line, which have very recently been predicted to be HLA-I binders. Altogether, we demonstrate the usefulness of combining MS-analysis with immunogenesis prediction for identifying, ranking, and selecting peptides for therapeutic use.

The highly polymorphic Human Leukocyte Antigen class I (HLA-I) genes are encoded by three loci (HLA-A, B, and C) in a gene-rich region on chromosome 6.
They produce up to six unique cell surface receptors that bind and present the so-called HLA class I peptidome, which consists of peptides derived from proteolysis of intracellular proteins. Their function is to reflect the health state of the body's cells to CD8+ cytotoxic T cells. During thymic maturation T cells that react to self-peptides are eliminated (1), leaving T cells with the capability to recognize peptides from viruses and bacteria. This recognition is interpreted as a danger signal, leading to removal of infected cells. Transformed, preneoplastic and cancer cells also tend to display atypical self-peptides from mutated or excessively expressed self-proteins, known as tumor associated antigens (TAAs). Although HLA-I molecules are indispensable in prevention of disease, they also pose a substantial health problem by causing allergies (2), life-threatening autoimmune diseases (3), and the often fatal rejection of donor organs because of recognition of both major and minor histocompatibility antigens (4).

Finding the rules for peptide generation and selection is regarded as the most important open issue in the field of HLA-I biology by leading experts (5). Although the antigen presentation pathway is well characterized, it is still unclear how basic properties such as protein abundance, turnover, and subcellular localization influence and shape the HLA-I presented peptidome (6–10). One expectation is that protein abundance should correlate with presentation (11), but previous studies have reported conflicting and contradicting results that mostly argue against a strong link (6, 7, 10, 12, 13). It is also not fully understood why only some HLA-sampled self-peptides from cancer antigens spontaneously activate T cells, whereas others do not.

The majority of HLA-I peptides are derived from proteasomal degradation (5).
Although the proteasome generates an excess of peptides, only some have the required sequence motifs for HLA binding, resulting in a selective sampling of available peptides (14). The presented peptides are typically nine amino acids long, but the length can range from eight to 15. The high degree of genetic variance of HLA-I receptors translates into allele-specific peptide-binding motifs defined by anchor positions, which are usually the second and the last positions in a peptide (15). Each cell has around 200,000 cell-surface-expressed HLA complexes, which bind about 10,000 unique peptide sequences (16). The affinity of a peptide toward the presenting HLA molecule does not correlate strongly with its immunogenicity, and neither does the number of presented HLA complexes (17). Instead, the most robust predictor of peptide immunogenicity appears to be the number of potential reactive T-cell clones (17–19).

The longer the source protein, the higher the chances it will contain sequences that fit to a certain HLA motif, which would inflate the representation of longer proteins regardless of biological role. Furthermore, some HLA-I peptide sequences can be mapped to multiple proteins, potentially causing a problem in determining the number of observed HLA peptides per protein (13). This illustrates that careful accounting of the potentially and actually presented HLA peptides is important in properly delineating trends in propensity of peptide presentation.

In cancer immunotherapy, T cells can be directed against tumors, based on the pattern of cancer associated HLA peptides. Therefore, there is great interest in determining the identity of these immunogenic peptides. Bioinformatic methods that attempt to predict HLA peptides of cancer proteins of interest are easily accessible and most commonly used.
They typically score sequences with respect to proteasomal degradation, transport into the ER via the transporter associated with antigen processing (TAP), and binding to different HLA-I alleles (20). However, their predictive success is modest (21, 22). The second approach is to directly capture the naturally presented peptides using mass spectrometry; however, this requires the relevant biological sample and sophisticated instruments and workflows, which have become accessible only recently for large-scale work (23–28). Although identification of cancer-associated HLA peptides by MS, if performed stringently, establishes the in vivo existence of the peptide, it still does not guarantee that the peptide will elicit a potent T-cell response, which is required for further development into therapeutics (29). Therefore, as with in silico predicted peptides, the immunogenicity of the peptides must be tested empirically.

We here present a rich and high-confidence HLA-I peptidome, established by applying state-of-the-art mass-spectrometric techniques to a collection of seven cell lines. We investigate how abundance affects the propensity of proteins to be presented as measurable HLA peptides and whether specific protein classes are overrepresented independent of abundance. Likewise, we explore how to use in silico immunogenicity tools on the set of identified HLA peptides from cancer-associated proteins, with a view to selecting vaccine candidates.
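The length effect described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: the anchor motif below (L or M at position 2, V or L at position 9 of a 9-mer) is a deliberately simplified, hypothetical stand-in for an allele-specific binding motif, and the function names are invented for illustration. The point is only that raw match counts grow with protein length, whereas a per-length density does not.

```python
import re

# Hypothetical, simplified anchor motif: 9-mers with L or M at position 2
# and V or L at position 9. Real allele-specific motifs are richer than this.
# The lookahead lets overlapping 9-mers be counted.
ANCHOR_MOTIF = re.compile(r"(?=(.[LM].{6}[VL]))")


def count_motif_matches(protein: str) -> int:
    """Count overlapping 9-mers in `protein` that fit the toy anchor motif."""
    return len(ANCHOR_MOTIF.findall(protein))


def matches_per_100_residues(protein: str) -> float:
    """Length-normalised match density, so long proteins are not favoured."""
    if not protein:
        return 0.0
    return 100.0 * count_motif_matches(protein) / len(protein)


# Made-up sequences: the long one is five copies of the short one.
short = "ALAAAAAAV"
long_ = short * 5

print(count_motif_matches(short), count_motif_matches(long_))  # 1 5
print(matches_per_100_residues(short), matches_per_100_residues(long_))
```

The raw counts differ fivefold while the normalised densities agree, which is the kind of accounting the abstract argues is needed before comparing presentation propensity across proteins.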

11.
Knowledge of elaborate structures of protein complexes is fundamental for understanding their functions and regulations. Although cross-linking coupled with mass spectrometry (MS) has been presented as a feasible strategy for structural elucidation of large multisubunit protein complexes, this method has proven challenging because of technical difficulties in unambiguous identification of cross-linked peptides and determination of cross-linked sites by MS analysis. In this work, we developed a novel cross-linking strategy using a newly designed MS-cleavable cross-linker, disuccinimidyl sulfoxide (DSSO). DSSO contains two symmetric collision-induced dissociation (CID)-cleavable sites that allow effective identification of DSSO-cross-linked peptides based on their distinct fragmentation patterns unique to cross-linking types (i.e. interlink, intralink, and dead end). The CID-induced separation of interlinked peptides in MS/MS permits MS3 analysis of single peptide chain fragment ions with defined modifications (due to DSSO remnants) for easy interpretation and unambiguous identification using existing database searching tools. Integration of data analyses from three generated data sets (MS, MS/MS, and MS3) allows high confidence identification of DSSO cross-linked peptides. The efficacy of the newly developed DSSO-based cross-linking strategy was demonstrated using model peptides and proteins. In addition, this method was successfully used for structural characterization of the yeast 20 S proteasome complex. In total, 13 non-redundant interlinked peptides of the 20 S proteasome were identified, representing the first application of an MS-cleavable cross-linker for the characterization of a multisubunit protein complex. 
Given its effectiveness and simplicity, this cross-linking strategy can find a broad range of applications in elucidating the structural topology of proteins and protein complexes.

Proteins form stable and dynamic multisubunit complexes under different physiological conditions to maintain cell viability and normal cell homeostasis. Detailed knowledge of protein interactions and protein complex structures is fundamental to understanding how individual proteins function within a complex and how the complex functions as a whole. However, structural elucidation of large multisubunit protein complexes has been difficult because of a lack of technologies that can effectively handle their dynamic and heterogeneous nature. Traditional methods such as nuclear magnetic resonance (NMR) analysis and x-ray crystallography can yield detailed information on protein structures; however, NMR spectroscopy requires large quantities of pure protein in a specific solvent, whereas x-ray crystallography is often limited by the crystallization process.

In recent years, chemical cross-linking coupled with mass spectrometry (MS) has become a powerful method for studying protein interactions (1–3). Chemical cross-linking stabilizes protein interactions through the formation of covalent bonds and allows the detection of stable, weak, and/or transient protein-protein interactions in native cells or tissues (4–9). In addition to capturing protein interacting partners, many studies have shown that chemical cross-linking can yield low resolution structural information about the constraints within a molecule (2, 3, 10) or protein complex (11–13). The application of chemical cross-linking, enzymatic digestion, and subsequent mass spectrometric and computational analyses for the elucidation of three-dimensional protein structures offers distinct advantages over traditional methods because of its speed, sensitivity, and versatility. 
Identification of cross-linked peptides provides distance constraints that aid in constructing the structural topology of proteins and/or protein complexes. Although this approach has been successful, effective detection and accurate identification of cross-linked peptides as well as unambiguous assignment of cross-linked sites remain extremely challenging because of their low abundance and complicated fragmentation behavior in MS analysis (2, 3, 10, 14). Therefore, new reagents and methods are urgently needed to allow unambiguous identification of cross-linked products and to improve the speed and accuracy of data analysis to facilitate its application in structural elucidation of large protein complexes.

A number of approaches have been developed to facilitate MS detection of low abundance cross-linked peptides from complex mixtures. These include selective enrichment using affinity purification with biotinylated cross-linkers (15–17) and click chemistry with alkyne-tagged (18) or azide-tagged (19, 20) cross-linkers. In addition, Staudinger ligation has recently been shown to be effective for selective enrichment of azide-tagged cross-linked peptides (21). Apart from enrichment, detection of cross-linked peptides can be achieved by isotope-labeled (22–24), fluorescently labeled (25), and mass tag-labeled cross-linking reagents (16, 26). These methods can identify cross-linked peptides with MS analysis, but interpretation of the data generated from interlinked peptides (two peptides connected with the cross-link) by automated database searching remains difficult. Several bioinformatics tools have thus been developed to interpret MS/MS data and determine interlinked peptide sequences from complex mixtures (12, 14, 27–32). Although promising, further developments are still needed to make such data analyses as robust and reliable as analyzing MS/MS data of single peptide sequences using existing database searching tools (e.g. 
Protein Prospector, Mascot, or SEQUEST).

Various types of cleavable cross-linkers with distinct chemical properties have been developed to facilitate MS identification and characterization of cross-linked peptides. These include UV photocleavable (33), chemical cleavable (19), isotopically coded cleavable (24), and MS-cleavable reagents (16, 26, 34–38). MS-cleavable cross-linkers have received considerable attention because the resulting cross-linked products can be identified based on their characteristic fragmentation behavior observed during MS analysis. Gas-phase cleavage sites result in the detection of a “reporter” ion (26), single peptide chain fragment ions (35–38), or both reporter and fragment ions (16, 34). In each case, further structural characterization of the peptide product ions generated during the cleavage reaction can be accomplished by subsequent MSn analysis. Among these linkers, the “fixed charge” sulfonium ion-containing cross-linker developed by Lu et al. (37) appears to be the most attractive because, based on their studies with model peptides, it allows specific and selective fragmentation of cross-linked peptides regardless of their charge and amino acid composition.

Despite the availability of multiple types of cleavable cross-linkers, most of the applications have been limited to the study of model peptides and single proteins. Additionally, complicated synthesis and fragmentation patterns have impeded wide adoption of most of the known MS-cleavable cross-linkers by the community. Here we describe the design and characterization of a novel and simple MS-cleavable cross-linker, DSSO, and its application to model peptides and proteins and the yeast 20 S proteasome complex. In combination with new software developed for data integration, we were able to identify DSSO-cross-linked peptides from complex peptide mixtures with speed and accuracy. 
Given its effectiveness and simplicity, we anticipate a broader application of this MS-cleavable cross-linker in the study of structural topology of other protein complexes using cross-linking and mass spectrometry.
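One consequence of an MS-cleavable linker can be sketched in a few lines. The example assumes, as a simplification not taken from the abstract, that cleavage splits an interlinked precursor into two single-chain fragments with no further neutral loss, so the two neutral fragment masses add up to the precursor mass. The function names, the 10 ppm tolerance, and all masses are hypothetical; real spectra may show additional losses that this sketch ignores.

```python
from itertools import combinations


def ppm_error(observed: float, expected: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - expected) / expected * 1e6


def complementary_pairs(precursor_mass, fragment_masses, tol_ppm=10.0):
    """Return fragment pairs whose neutral masses sum to the precursor mass.

    Assumes the linker cleaves into two pieces with no neutral loss, so each
    interlinked precursor yields pairs that add up to its intact mass.
    """
    pairs = []
    for a, b in combinations(sorted(fragment_masses), 2):
        if abs(ppm_error(a + b, precursor_mass)) <= tol_ppm:
            pairs.append((a, b))
    return pairs


# Made-up neutral masses (Da), chosen only to exercise the function.
precursor = 2500.0
fragments = [1000.0, 1500.0, 1049.99, 1450.01, 800.0]
print(complementary_pairs(precursor, fragments))
```

Each returned pair would then be a candidate for sequencing one chain at a time in a later MS stage, which is what makes ordinary single-peptide database search tools applicable.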

12.
13.
Presentation of the Mtv-1 superantigen (vSag1) to specific Vβ-bearing T cells requires association with major histocompatibility complex class II molecules. The intracellular route by which vSag1 traffics to the cell surface and the site of vSag1-class II complex assembly in antigen-presenting B lymphocytes have not been determined. Here, we show that vSag1 traffics independently of class II to the plasma membrane by the exocytic secretory pathway. At the surface of B cells, vSag1 associates primarily with mature peptide-bound class II αβ dimers, which are stable in sodium dodecyl sulfate. vSag1 is unstable on the cell surface in the absence of class II, and reagents that alter the surface expression of vSag1 and the conformation of class II molecules affect vSag1 stimulation of superantigen-reactive T cells.

T lymphocytes respond to peptide antigens presented by either major histocompatibility complex (MHC) class I or class II molecules. Many viruses have evolved sophisticated strategies that interfere with antigen presentation by infected cells in order to escape recognition by T lymphocytes. Most strategies studied rely on disrupting MHC class I presentation, either by affecting components of the processing machinery that generate and transport viral peptides into the endoplasmic reticulum (ER) or by retarding transport or targeting class I molecules into the degradation pathway (for a review, see reference 73).

In contrast, mouse mammary tumor virus (MMTV) utilizes T-cell stimulation to promote its life cycle. MMTVs encode within their 3′ long terminal repeat a viral superantigen (vSag), and coexpression of the Sag glycoprotein with MHC class II molecules on the surface of virally infected B cells induces Vβ-specific T-cell stimulation, generating an immune response which is critical for amplification of MMTV and ensures vertical transmission of virus to the next generation (13, 29, 30). In the absence of B cells, MHC class II, or Sag-reactive T cells, the infection is short-lived (5, 6, 24, 28). The assembly and functional expression of vSag-class II complexes are therefore essential to the viral life cycle. When inherited as germ line elements, Mtv proviruses expressing vSags during ontogeny trigger Vβ-specific clonal elimination of immature T cells and profoundly shape the T-cell repertoire (for a review, see reference 1).

vSags are type II integral membrane glycoproteins (14, 36). They possess up to six potential N-linked glycosylation sites, and carbohydrate addition is essential for vSag stability and activity (45). Their protein sequence is highly conserved among all MMTV strains except at the C-terminal 29 to 32 residues, which vary and confer T-cell Vβ specificity (77). 
Biochemical analyses of vSag7 (minor lymphocyte stimulating locus 1, Mls-1a) molecular forms after transfection into a murine B-cell line have identified a predominant 45-kDa endo-β-N-acetylglucosaminidase H (endo H)-sensitive ER-resident glycoprotein, as well as multiple highly glycosylated forms (74). It is thought that an 18-kDa C-terminal fragment binds MHC class II products (75). It has also been suggested that vSags associate weakly with class II in the ER and that proteolytic processing is required for the efficient assembly of vSag-class II complexes for presentation to T cells (46, 49, 75). As yet, the intracellular route that vSags take to the cell surface, the compartment in which they bind class II, and whether they associate with peptide-loaded class II dimers remain unknown.

Newly synthesized MHC class II αβ heterodimers assemble with invariant chain (Ii), a type II integral membrane protein, to form an oligomeric complex in the ER (37). Ii prevents class II heterodimers from binding peptides in the ER and Golgi complex (55), and signals in its cytoplasmic tail sort the complex into the endocytic pathway (4, 42). In this acidic, protease-rich compartment, Ii is degraded and class II binds antigenic peptides. After the formation of peptide-class II dimers, the complexes are exported to the plasma membrane (8, 48). In the absence of Ii, class II αβ heterodimers exhibit defective post-ER transport, and their conversion into functionally mature, sodium dodecyl sulfate (SDS)-stable compact dimers by peptide antigens is affected (7, 16, 22, 70).

A specialized endosomal compartment where class II peptide loading occurs, termed the MHC class II-enriched compartment (MIIC or CIIV), has been found recently in antigen-presenting cells (2, 50, 53, 58, 68, 71). 
Whether nascent Ii-class II complexes traffic directly to the MIIC from the trans-Golgi network (TGN) or transit first to early endosomes, either directly or via the cell surface, before entering late endocytic vesicles and MIIC is still under debate (26, 56, 57). Transport by all these routes most probably occurs to ensure the capture and loading of antigenic peptides throughout the endocytic pathway (12). MIIC vesicles are positive for lysosome-associated membrane proteins (LAMPs) and cathepsin D and are enriched for HLA-DM or H-2M (18, 32, 59), proteins that facilitate the catalytic exchange of class II-associated invariant peptide chain (CLIP) for antigenic peptides (19, 61, 62). The ultrastructural colocalization of DM with intracellular peptide-class II complexes suggests that the MIIC is a main site where class II dimers bind exogenous and endogenous peptide antigens (47, 58).

Determining the route by which vSag proteins traffic to the cell surface and the cellular location where vSag1 processing and assembly with class II molecules occur is central to understanding the mechanism whereby vSags activate T cells to maintain the viral life cycle. It has been unclear whether vSags traffic independently by the constitutive exocytic pathway or with class II and Ii to the MIIC before reaching the cell surface. Reagents that alter class II expression have been shown to affect vSag presentation (43, 46). Furthermore, mice lacking Ii show reduced intrathymic Vβ-specific T-cell deletion (70), suggesting that Ii may play a role, either by ensuring proper maturation of class II dimers or by targeting vSag-class II complexes to the MIIC, in promoting efficient vSag-induced immune responses.

To investigate these issues, we used immunochemical detection of vSag1 protein in combination with subcellular fractionation and surface reexpression assays. We show that class II is required for stable vSag1 surface expression. 
vSag1 traffics directly to the cell surface independently of class II, and reagents that alter the conversion of newly synthesized class II into peptide-loaded SDS-stable dimers affect functional vSag1 surface expression.

14.
A decoding algorithm is tested that mechanistically models the progressive alignments that arise as the mRNA moves past the rRNA tail during translation elongation. Each of these alignments provides an opportunity for hybridization between the single-stranded, 3′-terminal nucleotides of the 16S rRNA and the spatially accessible window of mRNA sequence, from which a free energy value can be calculated. Using this algorithm, we show that a periodic energetic pattern of frequency 1/3 is revealed. This periodic signal exists in the majority of coding regions of eubacterial genes, but not in the non-coding regions encoding the 16S and 23S rRNAs. Signal analysis reveals that the population of coding regions of each bacterial species has a mean phase that is correlated in a statistically significant way with species (G+C) content. These results suggest that the periodic signal could function as a synchronization signal for the maintenance of reading frame and that codon usage provides a mechanism for manipulation of signal phase.
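The sliding-alignment idea and the frequency-1/3 test can be sketched as follows. This is a toy under stated assumptions, not the published algorithm: a negative count of Watson-Crick and G-U pairs stands in for a thermodynamic free energy, the tail is supplied already oriented so that `tail[i]` faces `mrna[offset + i]`, and both function names are invented here.

```python
import cmath
import math

# Allowed base pairs (Watson-Crick plus G-U wobble) for the toy score.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_scores(mrna, tail):
    """Toy stand-in for free energy: minus the number of pairs at each offset."""
    w = len(tail)
    return [-sum((m, t) in PAIRS for m, t in zip(mrna[o:o + w], tail))
            for o in range(len(mrna) - w + 1)]


def period3_strength(signal):
    """Share of positive-frequency spectral power at 1/3 cycles per position."""
    n = len(signal) - len(signal) % 3      # trim so 1/3 falls on a DFT bin
    if n < 3:
        raise ValueError("need at least three values")
    x = signal[:n]
    mean = sum(x) / n
    centred = [v - mean for v in x]        # remove the constant offset

    def power(k):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centred))
        return abs(coeff) ** 2

    total = sum(power(k) for k in range(1, n // 2 + 1))
    return power(n // 3) / total if total > 0 else 0.0


print(pairing_scores("GGAGG", "CCUCC"))    # one fully paired window
print(period3_strength([-1, 0, 0] * 10))   # synthetic period-3 signal
```

In practice one would feed `pairing_scores` output for a coding region into `period3_strength`; a direct DFT is used here only to keep the sketch dependency-free.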

15.
16.
Mathematical tools developed in the context of Shannon information theory were used to analyze the meaning of the BLOSUM score, which was split into three components termed the BLOSUM spectrum (or BLOSpectrum). These relate respectively to the sequence convergence (the stochastic similarity of the two protein sequences), to the background frequency divergence (typicality of the amino acid probability distribution in each sequence), and to the target frequency divergence (compliance of the amino acid variations between the two sequences to the protein model implicit in the BLOCKS database). This treatment sharpens the protein sequence comparison, providing a rationale for the biological significance of the obtained score, and helps to identify weakly related sequences. Moreover, the BLOSpectrum can guide the choice of the most appropriate scoring matrix, tailoring it to the evolutionary divergence associated with the two sequences, or indicate if a compositionally adjusted matrix could perform better.
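For orientation, the log-odds form that underlies BLOSUM-style scores, and the link to an information-theoretic quantity, can be sketched as below. This does not reproduce the three-component BLOSpectrum decomposition of the abstract; it only shows how a score arises from target (pair) frequencies and background frequencies. The two-letter alphabet and all frequencies are invented, and the half-bit scaling (factor 2 on a base-2 logarithm) is assumed as the convention.

```python
import math


def log_odds_score(q_ab, p_a, p_b, scale=2.0):
    """Score = scale * log2(q_ab / (p_a * p_b)), rounded to an integer."""
    return round(scale * math.log2(q_ab / (p_a * p_b)))


def relative_entropy(target, background):
    """Expected unscaled score in bits per aligned pair."""
    return sum(q * math.log2(q / (background[a] * background[b]))
               for (a, b), q in target.items() if q > 0)


# Invented two-letter alphabet: identities are enriched, mismatches depleted.
background = {"X": 0.5, "Y": 0.5}
target = {("X", "X"): 0.4, ("Y", "Y"): 0.4, ("X", "Y"): 0.1, ("Y", "X"): 0.1}

print(log_odds_score(target[("X", "X")], 0.5, 0.5))   # positive: seen more than chance
print(log_odds_score(target[("X", "Y")], 0.5, 0.5))   # negative: seen less than chance
print(round(relative_entropy(target, background), 4))
```

The relative entropy is the average information per aligned position that the target distribution carries over the background, which is why divergence measures are a natural language for asking whether a given matrix suits a given pair of sequences.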

17.
Little is known about the nature of post mortem degradation of proteins and peptides on a global level, the so-called degradome. This is especially true for non-neural tissues. Degradome properties in relation to sampling procedures on different tissues are of great importance for studies of, for instance, post-translational modifications and/or the establishment of clinical biobanks. Here, snap freezing of fresh (<2 min post mortem time) mouse liver and pancreas tissue is compared with rapid heat stabilization with regard to effects on the proteome (using two-dimensional differential in-gel electrophoresis) and peptidome (using label-free liquid chromatography). We report several proteins and peptides that exhibit heightened degradation sensitivity, for instance superoxide dismutase in liver, and peptidyl-prolyl cis-trans isomerase and insulin C-peptides in pancreas. Tissue sampling based on snap freezing produces a greater amount of degradation products and lower levels of endogenous peptides than rapid heat stabilization. We also demonstrate that degradation related solely to snap freezing can be attenuated by subsequent heat stabilization. We conclude that tissue sampling involving a rapid heat stabilization step is preferable to freezing with regard to proteomic and peptidomic sample quality.

The evolving maturation of the field of proteomics has, in the same way as in genomics, highlighted the need for better sampling procedures and sample preparation methodologies to minimize the effect of post mortem alterations. The aspect of sample quality is not new in any way and is relevant in most biomedical fields but has only lately started to receive adequate attention. The main factors influencing sample quality are storage temperature of the body until tissue removal (foremost a problem in clinical settings and extraction of less accessible tissue samples from model organisms) and post mortem interval (PMI) (1–3). 
Post mortem degradation during the PMI is a well-known problem when studying endogenous peptides (2, 3) and has also been shown to affect the results of polypeptide (here defined as proteins larger than 10 kDa) studies (3–8). PMI degradation has mainly been studied on human or mouse brain tissue, using two-dimensional electrophoresis (2-DE), SDS-PAGE, and immunoblotting (1, 3–12). There are also a few proteomic studies on muscle tissue degradation in livestock (13–16).

We and others have previously explored the effect of focused microwave irradiation with regard to sample quality, demonstrating that this method is more reliable than snap freezing in liquid nitrogen, especially with regard to post-translational modification (PTM) stability (2, 3, 17–20). An alternative method based on cryostat dissection with subsequent heat treatment through boiling has also been reported to improve endogenous peptide sample quality (21). Besides focused microwave irradiation, which is specifically used for rodent brain tissue sampling, we have also demonstrated the efficiency of rapid heat stabilization through conductivity with regard to sample degradation (3, 22). Although somewhat constrained by its dependence on how quickly the tissue is harvested from the body, the latter procedure has the added advantage that it can be used on any type of tissue and species, fresh as well as frozen. This study will compare effects of sampling procedures on the liver and pancreas degradome following rapid heat stabilization, the more traditional snap freezing, or the combination of snap freezing with subsequent heat stabilization.

To summarize, this study investigated the effects of post mortem degradation in pancreas and liver. Both tissues are well studied because of their multiple functions in the body and their involvement in different diseases such as diabetes or hepatocarcinoma. 
Pancreas is especially interesting in this context as it displays endocrine secretion of peptides and exocrine secretion of digestive enzymes, the latter making it a protease-rich tissue. We used both two-dimensional difference in gel electrophoresis (2D-DIGE) and label-free liquid chromatography mass spectrometry (LC-MS) based differential peptide display (2, 18), the latter to better investigate changes in small molecular fragments that are not easily detectable by gel-based methods. 2D-DIGE is an unrivaled methodology to characterize alterations in isoform patterns, which is an important aspect considering that post-translational modifications (PTMs) such as phosphorylations are especially sensitive to post mortem influence within the first few minutes of the PMI (3). The peptidomics approach has been used in several studies to point out early post mortem changes and protein degradation that tissues undergo following sampling and is therefore a well-suited method (3, 18, 22).

18.
Insulin plays a central role in the regulation of vertebrate metabolism. The hormone, the post-translational product of a single-chain precursor, is a globular protein containing two chains, A (21 residues) and B (30 residues). Recent advances in human genetics have identified dominant mutations in the insulin gene causing permanent neonatal-onset DM (1–4). The mutations are predicted to block folding of the precursor in the ER of pancreatic β-cells. Although expression of the wild-type allele would in other circumstances be sufficient to maintain homeostasis, studies of a corresponding mouse model (5–7) suggest that the misfolded variant perturbs wild-type biosynthesis (8, 9). Impaired β-cell secretion is associated with ER stress, distorted organelle architecture, and cell death (10). These findings have renewed interest in insulin biosynthesis (11–13) and the structural basis of disulfide pairing (14–19). Protein evolution is constrained not only by structure and function but also by susceptibility to toxic misfolding.

19.
Given the ease of whole genome sequencing with next-generation sequencers, structural and functional gene annotation is now purely based on automated prediction. However, errors in gene structure are frequent, the correct determination of start codons being one of the main concerns. Here, we combine protein N termini derivatization using (N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPP Ac-OSu) as a labeling reagent with the COmbined FRActional DIagonal Chromatography (COFRADIC) sorting method to enrich labeled N-terminal peptides for mass spectrometry detection. Protein digestion was performed in parallel with three proteases to obtain a reliable automatic validation of protein N termini. The analysis of these N-terminal enriched fractions by high-resolution tandem mass spectrometry allowed the annotation refinement of 534 proteins of the model marine bacterium Roseobacter denitrificans OCh114. This study is especially efficient regarding mass spectrometry analytical time. From the 534 validated N termini, 480 confirmed existing gene annotations, 41 highlighted erroneous start codon annotations, five revealed totally new mis-annotated genes; the mass spectrometry data also suggested the existence of multiple start sites for eight different genes, a result that challenges the current view of protein translation initiation. Finally, we identified several proteins for which classical genome homology-driven annotation was inconsistent, questioning the validity of automatic annotation pipelines and emphasizing the need for complementary proteomic data. All data have been deposited to the ProteomeXchange with identifier PXD000337.

Recent developments in mass spectrometry and bioinformatics have established proteomics as a common and powerful technique for identifying and quantifying proteins at a very broad scale, but also for characterizing their post-translational modifications and interaction networks (1, 2). 
In addition to the avalanche of proteomic data currently being reported, many genome sequences are established using next-generation sequencing, fostering proteomic investigations of new cellular models. Proteogenomics is a relatively recent field in which high-throughput proteomic data are used to verify coding regions within model genomes to refine the annotation of their sequences (2–8). Because genome annotation is now fully automated, the need for accurate annotation of model organisms with experimental data is crucial. Many projects related to genome re-annotation of microorganisms with the help of proteomics have been recently reported, such as for Mycoplasma pneumoniae (9), Rhodopseudomonas palustris (10), Shewanella oneidensis (11), Thermococcus gammatolerans (12), Deinococcus deserti (13), Salmonella typhimurium (14), Mycobacterium tuberculosis (15, 16), Shigella flexneri (17), Ruegeria pomeroyi (18), and Candida glabrata (19), as well as for higher organisms such as Anopheles gambiae (20) and Arabidopsis thaliana (4, 5).

The most frequently reported problem in automatic annotation systems is the correct identification of the translational start codon (21–23). The error rate depends on the primary annotation system, but also on the organism, as reported for Halobacterium salinarum and Natronomonas pharaonis (24), Deinococcus deserti (21), and Ruegeria pomeroyi (18), where the error rate is estimated above 10%. Identification of a correct translational start site is essential for the genetic and biochemical analysis of a protein because errors can seriously impact subsequent biological studies. If the N terminus is not correctly identified, the protein will be considered in either a truncated or extended form, leading to errors in bioinformatic analyses (e.g. during the prediction of its molecular weight, isoelectric point, cellular localization) and major difficulties during its experimental characterization. 
For example, a truncated protein may be heterologously produced as an unfolded polypeptide recalcitrant to structure determination (25). Moreover, N-terminal modifications, which are poorly documented in annotation databases, may occur (26, 27).

Unfortunately, the poor polypeptide sequence coverage obtained for the numerous low abundance proteins in current shotgun MS/MS proteomic studies implies that the overall detection of N-terminal peptides obtained in proteogenomic studies is relatively low. Different methods for establishing the most extensive list of protein N termini, grouped under the so-called “N-terminomics” theme, have been proposed to selectively enrich or improve the detection of these peptides (2, 28, 29). Large N-terminome studies have recently been reported based on resin-assisted enrichment of N-terminal peptides (30) or terminal amine isotopic labeling of substrates (TAILS) coupled to depletion of internal peptides with a water-soluble aldehyde-functionalized polymer (31–35). Among the numerous N-terminal-oriented methods (2), specific labeling of the N terminus of intact proteins with N-tris(2,4,6-trimethoxyphenyl)phosphonium acetyl succinamide (TMPP-Ac-OSu) has proven reliable (21, 36–39). TMPP-derivatized N-terminal peptides have interesting properties for subsequent LC-MS/MS analysis: (1) an increase in hydrophobicity because of the trimethoxyphenyl moiety added to the peptides, increasing their retention times in reverse phase chromatography, (2) improvement of their ionization because of the introduction of a positively charged group, and (3) a much simpler fragmentation pattern in tandem mass spectrometry. 
Other reported approaches rely on acetylation, followed by trypsin digestion, and then biotinylation of free amino groups (40); guanidination of lysine lateral chains followed by N-biotinylation of the N termini and trypsin digestion (41); or reductive amination of all free amino groups with formaldehyde preceding trypsin digestion (42). Recently, we applied the TMPP method to the proteome of the Deinococcus deserti bacterium isolated from upper sand layers of the Sahara desert (13). This method enabled the detection of N-terminal peptides allowing the confirmation of 278 translation initiation codons, the correction of 73 translation starts, and the identification of non-canonical translation initiation codons (21). However, most TMPP-labeled N-terminal peptides are hidden among the more abundant internal peptides generated after proteolysis of a complex proteome, precluding their detection. This results in disproportionately fewer N-terminal validations, that is, 5 and 8% of total polypeptides coded in the theoretical proteomes of Mycobacterium smegmatis (37) and Deinococcus deserti (21) with a total of 342 and 278 validations, respectively.

An interesting chromatographic method to fractionate peptide mixtures for gel-free high-throughput proteome analysis has been developed in recent years and applied to various topics (43, 44). This technique, known as COmbined FRActional DIagonal Chromatography (COFRADIC), uses a double chromatographic separation with a chemical reaction in between to change the physico-chemical properties of the extraneous peptides to be resolved from the peptides of interest. Its previous applications include the separation of methionine-containing peptides (43), N-terminal peptide enrichment (45, 46), sulfur amino acid-containing peptides (47), and phosphorylated peptides (48). 
COFRADIC was identified as the best method for identification of N-terminal peptides of two archaea, resulting in the identification of 240 polypeptides (9% of the theoretical proteome) for Halobacterium salinarum and 220 (8%) for Natronomonas pharaonis (24). Taking advantage of the specificity of TMPP labeling, the resolving power of COFRADIC for enrichment, and the increase in information through the use of multiple proteases, we performed the proteogenomic analysis of a marine bacterium from the Roseobacter clade, namely Roseobacter denitrificans OCh114. This novel approach allowed us to validate and correct 534 unique proteins (13% of the theoretical proteome) with TMPP-labeled N-terminal signatures obtained using high-resolution tandem mass spectrometry. We corrected 41 annotations and detected five new open reading frames in the R. denitrificans genome. We further identified eight distinct proteins showing direct evidence for multiple start sites.

20.
Cross-linking/mass spectrometry resolves protein–protein interactions or protein folds with the help of distance constraints. Cross-linkers with specific properties, such as isotope-labeled or collision-induced dissociation (CID)-cleavable cross-linkers, are in frequent use to simplify the identification of cross-linked peptides. Here, we analyzed the mass spectrometric behavior of 910 unique cross-linked peptides in high-resolution MS1 and MS2 from published data and validated the observations with a ninefold larger set from currently unpublished data, to explore whether a detailed understanding of their fragmentation behavior would allow computational delivery of information that would otherwise be obtained via isotope labels or CID cleavage of cross-linkers. Isotope-labeled cross-linkers reveal cross-linked and linear fragments in fragmentation spectra. We show that fragment mass and charge alone provide this information, alleviating the need for isotope labeling for this purpose. Isotope-labeled cross-linkers also indicate cross-linker-containing, albeit not specifically cross-linked, peptides in MS1. We observed that acquisition can be guided to enrich cross-linked peptides better than twofold, with minimal losses, based on peptide mass and charge alone. With the help of CID-cleavable cross-linkers, individual spectra with only linear fragments can be recorded for each peptide in a cross-link. We show that cross-linked fragments of ordinary cross-linked peptides can be linearized computationally and that a simplified subspectrum can be extracted that is enriched in information on one of the two linked peptides. This allows candidates for this peptide to be identified in a simplified database search, as we propose in a search strategy here. 
We conclude that the specific behavior of cross-linked peptides in mass spectrometers can be exploited to relax the requirements on cross-linkers. Cross-linking/mass spectrometry extends the use of mass-spectrometry-based proteomics from identification (1, 2), quantification (3), and characterization of protein complexes (4) into resolving protein structures and protein–protein interactions (5–8). Chemical reagents (cross-linkers) covalently connect amino acid pairs that are within a cross-linker-specific distance range in the native three-dimensional structure of a protein or protein complex. A cross-linking/mass spectrometry experiment is typically conducted in four steps: (1) cross-linking of the target protein or complex, (2) protein digestion (usually with trypsin), (3) LC-MS analysis, and (4) database search. The digested peptide mixture consists of linear and cross-linked peptides, and the latter can be enriched by strong cation exchange (9) or size exclusion chromatography (10). Cross-linked peptides are of high value as they provide direct information on the structure and interactions of proteins. Cross-linked peptides fragment under collision-induced dissociation (CID) conditions primarily into b- and y-ions, as do their linear counterparts. An important difference between linear and cross-linked peptides in database searches stems from not knowing which peptides might be cross-linked. Therefore, one has to consider each single peptide and all pairwise combinations of peptides in the database. Having n peptides leads to (n² + n)/2 possible pairwise combinations. This leads to two major challenges: as the database grows, both the search time and the risk of identifying false positives increase. 
One way of circumventing these problems is to use MS2-cleavable cross-linkers (11, 12), at the cost of limited experimental design and choice of cross-linker. In a first database search approach (13), all pairwise combinations of peptides in a database were considered in a concatenated and linearized form. Thereby, all possible single-bond fragments are considered in one of the two database entries per peptide pair, and the cross-link can be identified by a normal protein identification algorithm. The second search approach already split the peptides for the purpose of their identification (14). Linear fragments were used to retrieve candidate peptides from the database, which were then matched based on the known mass of the cross-linked pair and scored as a pair against the spectrum. Isotope-labeled cross-linkers were used to tell the linear and cross-linked fragments apart. Many other search tools and approaches have been developed since (10, 15–19; see (20) for a more detailed list), at least some of which follow the general idea of an open modification search (21–24). As a general concept for open modification search of cross-linked peptides, a cross-linked peptide represents two peptides, each with an unknown modification given by the mass of the other peptide and the cross-linker. One identifies both peptides individually and then matches them based on the known mass of the cross-linked pair (14, 22, 24). Alternatively, one peptide is identified first and, using that peptide and the cross-linker as a modification mass, the second peptide is identified from the database (21, 23). An important element of the open modification search approach is that it essentially converts the quadratic search space of the cross-linked peptides into a linear search space of modified peptides. 
Still, many peptides and many modification positions have to be considered, especially when working with large databases or when using highly reactive cross-linkers with limited amino acid selectivity (25). We hypothesize that detailed knowledge of the fragmentation behavior of cross-linked peptides might reveal ways to improve the identification of cross-linked peptides. Detailed analyses of the fragmentation behavior of linear peptides exist (26–28), and the analysis of the fragmentation behavior of cross-linked peptides has guided the design of scores (24, 29). Further, cross-link-specific ions have been observed in higher energy collision dissociation (HCD) data (30). Isotope-labeled cross-linkers are used to distinguish cross-linked from linear fragments, generally in low-resolution MS2 of cross-linked peptides (14). We compared the mass spectrometric behavior of cross-linked peptides to that of linear peptides, using 910 high-resolution fragment spectra matched to unique cross-linked peptides from multiple different public datasets at 5% peptide-spectrum match (PSM) false discovery rate (FDR). In addition, we repeated all experiments with a larger sample set that contains 8,301 spectra and also includes data from ongoing studies in our lab (Supplemental material S9-S12). This paper presents the mass spectrometric signature of cross-linked peptides that we identified in our analysis and the resulting heuristics that are incorporated into an integrated strategy for the analysis and identification of cross-linked peptides. We present computational strategies that indicate the possibility of alleviating the need for mass-spectrometrically restricted cross-linker choice.
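
The search-space argument in this abstract can be made concrete with a minimal Python sketch. It assumes, for illustration only, that the mass of a cross-linked pair equals the sum of the two peptide masses plus a fixed cross-linker mass. The function names, the toy masses, and the 5-ppm tolerance are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left, bisect_right


def pair_count(n: int) -> int:
    """Unordered peptide pairs among n peptides, self-pairs included: (n^2 + n) / 2."""
    return n * (n + 1) // 2


def find_partners(precursor_mass, first_mass, sorted_masses, linker_mass, tol_ppm=5.0):
    """Open-modification-style lookup of the second peptide.

    Once one peptide is known, the partner must have mass
    precursor_mass - first_mass - linker_mass (simplifying assumption),
    so the sorted database is probed by binary search instead of
    enumerating every pair.
    Returns the indices into sorted_masses that fall within the tolerance.
    """
    target = precursor_mass - first_mass - linker_mass
    tol = precursor_mass * tol_ppm * 1e-6  # absolute tolerance in Da
    lo = bisect_left(sorted_masses, target - tol)
    hi = bisect_right(sorted_masses, target + tol)
    return list(range(lo, hi))


# Toy database of peptide masses in Da (hypothetical values), sorted ascending.
masses = [800.0, 1500.0, 2000.0, 2500.0]

# Exhaustive pairing grows quadratically with database size.
n_pairs_small = pair_count(len(masses))
n_pairs_large = pair_count(1000)

# With one peptide (1000.0 Da) identified and an illustrative linker mass of
# 138.0 Da, only masses near 3138.0 - 1000.0 - 138.0 = 2000.0 Da are candidates.
partners = find_partners(3138.0, 1000.0, masses, linker_mass=138.0)
```

Under these assumptions, the exhaustive strategy has to score every pair, whereas the mass-constrained lookup touches only the few database entries compatible with the precursor mass. This is the sense in which the quadratic search space becomes a linear one.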


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号