Similar articles
1.
The opportunistic human pathogen Acinetobacter baumannii is a concern to health care systems worldwide because of its persistence in clinical settings and the growing frequency of multiple drug resistant infections. To combat this threat, it is necessary to understand factors associated with disease and environmental persistence of A. baumannii. Recently, it was shown that a single biosynthetic pathway was responsible for the generation of capsule polysaccharide and O-linked protein glycosylation. Because of the requirement of these carbohydrates for virulence and the non-template driven nature of glycan biogenesis, we investigated the composition, diversity, and properties of the Acinetobacter glycoproteome. Utilizing global and targeted mass spectrometry methods, we examined 15 strains and found extensive glycan diversity in the O-linked glycoproteome of Acinetobacter. Comparison of the 26 glycoproteins identified revealed that different A. baumannii strains target similar protein substrates, both in characteristics of the sites of O-glycosylation and protein identity. Surprisingly, glycan micro-heterogeneity was also observed within nearly all isolates examined, demonstrating that glycan heterogeneity is a widespread phenomenon in Acinetobacter O-linked glycosylation. By comparing the 11 main glycoforms and over 20 alternative glycoforms characterized within the 15 strains, trends within the glycans utilized for O-linked glycosylation could be observed. These trends reveal that Acinetobacter O-linked glycosylation favors short (three to five residue) glycans with limited branching containing negatively charged sugars such as GlcNAc3NAcA4OAc or legionaminic/pseudaminic acid derivatives. These observations suggest that although highly diverse, the capsule/O-linked glycan biosynthetic pathways generate glycans with similar characteristics across all A. baumannii.

Acinetobacter baumannii is an emerging opportunistic pathogen of increasing significance to health care institutions worldwide (1–3). The growing number of identified multiple drug resistant (MDR) strains (2–4), the ability of isolates to rapidly acquire resistance (3, 4), and the propensity of this agent to survive harsh environmental conditions (5) account for the increasing number of outbreaks in intensive care, burn, or high dependence health care units since the 1970s (2–5). The burden on the global health care system of MDR A. baumannii is further exacerbated by standard infection control measures often being insufficient to quell the spread of A. baumannii to high risk individuals and generally failing to remove A. baumannii from health care institutions (5). Because of these concerns, there is an urgent need to identify strategies to control A. baumannii as well as understand the mechanisms that enable its persistence in health care environments.

Surface glycans have been identified as key virulence factors related to persistence and virulence within the clinical setting (6–8). Acinetobacter surface carbohydrates were first identified and studied in A. venetianus strain RAG-1, leading to the identification of a gene locus required for synthesis and export of the surface carbohydrates (9, 10). These carbohydrate synthesis loci are variable yet ubiquitous in A. baumannii (11, 12). Comparison of 12 known capsule structures from A. baumannii with the sequences of their carbohydrate synthesis loci has provided strong evidence that these loci are responsible for capsule synthesis with as many as 77 distinct serotypes identified by molecular serotyping (11). Because of the non-template driven nature of glycan synthesis, the identification and characterization of the glycans themselves are required to confirm the true diversity.
This diversity has widespread implications for Acinetobacter biology as the resulting carbohydrate structures are not solely used for capsule biosynthesis but can be incorporated and utilized by other ubiquitous systems, such as O-linked protein glycosylation (13, 14).

Although originally thought to be restricted to species such as Campylobacter jejuni (15, 16) and Neisseria meningitidis (17), bacterial protein glycosylation is now recognized as a common phenomenon within numerous pathogens and commensal bacteria (18, 19). Unlike eukaryotic glycosylation where robust and high-throughput technologies now exist to enrich (20–22) and characterize both the glycan and peptide component of glycopeptides (23–25), the diversity (glycan composition and linkage) within bacterial glycosylation systems makes few technologies broadly applicable to all bacterial glycoproteins. Because of this challenge, a deeper understanding of the glycan diversity and substrates of glycosylation has been largely unachievable for the majority of known bacterial glycosylation systems. The recent implementation of selective glycopeptide enrichment methods (26, 27) and the use of multiple fragmentation approaches (28, 29) have facilitated identification of an increasing number of glycosylation substrates independent of prior knowledge of the glycan structure (30–33). These developments have facilitated the undertaking of comparative glycosylation studies, revealing that glycosylation is widespread in diverse genera and far more diverse than initially thought. For example, Nothaft et al. were able to show N-linked glycosylation was widespread in the Campylobacter genus and that two broad groupings of the N-glycans existed (34).

During the initial characterization of A. baumannii O-linked glycosylation, the use of selective enrichment of glycopeptides followed by mass spectrometry analysis with multiple fragmentation technologies was found to be an effective means to identify multiple glycosylated substrates in the strain ATCC 17978 (14). Interestingly, in this strain the glycan utilized for protein modification was identical to a single subunit of the capsule (13), and the loss of either protein glycosylation or glycan synthesis led to decreases in biofilm formation and virulence (13, 14). Because of the diversity in the capsule carbohydrate synthesis loci and the ubiquitous distribution of the PglL O-oligosaccharyltransferase required for protein glycosylation, we hypothesized that the glycan variability might also extend to O-linked glycosylation. This diversity, although common in surface carbohydrates such as the lipopolysaccharide of numerous Gram-negative pathogens (35), has only recently been observed within bacterial protein glycosylation systems, which are typically conserved within species (36) and loosely across genera (34, 37).

In this study, we explored the diversity within the O-linked protein glycosylation systems of Acinetobacter species. Our analysis complements the recent in silico studies of A. baumannii showing extensive glycan diversity exists in the carbohydrate synthesis loci (11, 12). Employing global strategies for the analysis of glycosylation, we experimentally demonstrate that the variation in O-glycan structure extends beyond the genetic diversity predicted by the carbohydrate loci alone and targets proteins of similar properties and identity. Using this knowledge, we developed a targeted approach for the detection of protein glycosylation, enabling streamlined analysis of glycosylation within a range of genetic backgrounds.
We determined that O-linked glycosylation is widespread in clinically relevant Acinetobacter species; that inter- and intra-strain heterogeneity exists within glycan structures; that glycan diversity, although extensive, results in the generation of glycans with similar properties; and that the utilization of a single glycan for capsule and O-linked glycosylation is a general feature of A. baumannii but may not be a general characteristic of all Acinetobacter species, such as A. baylyi.

2.
Glycoprotein structure determination and quantification by MS require efficient isolation of glycopeptides from a proteolytic digest of complex protein mixtures. Here we show that the use of acids as ion-pairing reagents in normal-phase chromatography (IP-NPLC) considerably increases the hydrophobicity differences between non-glycopeptides and glycopeptides, thereby resulting in the reproducible isolation of N-linked high mannose type and sialylated glycopeptides from the tryptic digest of a ribonuclease B and fetuin mixture. The elution order of non-glycopeptides relative to glycopeptides in IP-NPLC is predictable by their hydrophobicity values calculated using the Wimley-White water/octanol hydrophobicity scale. O-linked glycopeptides can be efficiently isolated from fetuin tryptic digests using IP-NPLC when N-glycans are first removed with PNGase. IP-NPLC recovers close to 100% of bacterial N-linked glycopeptides modified with non-sialylated heptasaccharides from tryptic digests of periplasmic protein extracts from Campylobacter jejuni 11168 and its pglD mutant. Label-free nano-flow reversed-phase LC-MS is used for quantification of differentially expressed glycopeptides from the C. jejuni wild-type and pglD mutant followed by identification of these glycoproteins using multiple stage tandem MS. This method further confirms the acetyltransferase activity of PglD and demonstrates for the first time that heptasaccharides containing monoacetylated bacillosamine are transferred to proteins in both the wild-type and mutant strains. We believe that IP-NPLC will be a useful tool for quantitative glycoproteomics.

Protein glycosylation is a biologically significant and complex post-translational modification, involved in cell-cell and receptor-ligand interactions (1–4). In fact, clinical biomarkers and therapeutic targets are often glycoproteins (5–9).
Comprehensive glycoprotein characterization, involving glycosylation site identification, glycan structure determination, site occupancy, and glycan isoform distribution, is a technical challenge, particularly for quantitative profiling of complex protein mixtures (10–13). Both N- and O-glycans are structurally heterogeneous (i.e. a single site may have different glycans attached or be only partially occupied). Therefore, the MS1 signals from glycopeptides originating from a glycoprotein are often weaker than from non-glycopeptides. In addition, the ionization efficiency of glycopeptides is low compared with that of non-glycopeptides and is often suppressed in the presence of non-glycopeptides (11–13). When the MS signals of glycopeptides are relatively high in simple protein digests, diagnostic sugar oxonium ion fragments produced by, for example, front-end collisional activation can be used to detect them. However, when peptides and glycopeptides co-elute, parent ion scanning is required to selectively detect the glycopeptides (14). This can be problematic in terms of sensitivity, especially for detecting glycopeptides in digests of complex protein extracts.

Isolation of glycopeptides from proteolytic digests of complex protein mixtures can greatly enhance the MS signals of glycopeptides using reversed-phase LC-ESI-MS (RPLC-ESI-MS) or MALDI-MS (15–24). Hydrazide chemistry is used to isolate, identify, and quantify N-linked glycopeptides effectively, but this method involves lengthy chemical procedures and does not preserve the glycan moieties, thereby losing valuable information on glycan structure and site occupancy (15–17). Capturing glycopeptides with lectins has been widely used, but restricted specificities and unspecific binding are major drawbacks of this method (18–21). Under reversed-phase LC conditions, glycopeptides from tryptic digests of gel-separated glycoproteins have been enriched using graphite powder medium (22).
In this case, however, a second digestion with proteinase K is required for trimming down the peptide moieties of tryptic glycopeptides so that the glycopeptides (typically <5 amino acid residues) essentially resemble the glycans with respect to hydrophilicity for subsequent separation. Moreover, the short peptide sequences of the proteinase K digest are often inadequate for de novo sequencing of the glycopeptides.

Glycopeptide enrichment under normal-phase LC (NPLC) conditions has been demonstrated using various hydrophilic media and different capture and elution conditions (23–28). NPLC allows either direct enrichment of peptides modified by various N-linked glycan structures using a ZIC®-HILIC column (23–27) or targeting sialylated glycopeptides using a titanium dioxide micro-column (28). However, NPLC is effective neither for enriching less hydrophilic glycopeptides, e.g. the five high mannose type glycopeptides modified by 7–11 monosaccharide units from a tryptic digest of ribonuclease B (RNase B), nor for enriching O-linked glycopeptides of bovine fetuin using a ZIC-HILIC column (23). The use of Sepharose medium for enriching glycopeptides yielded only modest recovery of glycopeptides (28). In addition, binding of hydrophilic non-glycopeptides with these hydrophilic media contaminates the enriched glycopeptides (23, 28).

We have recently developed an ion-pairing normal-phase LC (IP-NPLC) method to enrich glycopeptides from complex tryptic digests using Sepharose medium and salts or bases as ion-pairing reagents (29). Though reasonably effective, the technique still left room for significant improvement. For example, the method demonstrated relatively modest glycopeptide selectivity, providing only 16% recovery for high mannose type glycopeptides (29). Here we report on a new IP-NPLC method using acids as ion-pairing reagents and polyhydroxyethyl aspartamide (A) as the stationary phase for the effective isolation of tryptic glycopeptides.
The method was developed and evaluated using a tryptic digest of an RNase B and fetuin mixture. In addition, we demonstrate that O-linked glycopeptides can be effectively isolated from a fetuin tryptic digest by IP-NPLC after removal of the N-linked glycans by PNGase F.

The new IP-NPLC method was used to enrich N-linked glycopeptides from the tryptic digests of protein extracts of wild-type (wt) and PglD mutant strains of Campylobacter jejuni NCTC 11168. C. jejuni has a unique N-glycosylation system that glycosylates periplasmic and inner membrane proteins containing the extended N-linked sequon, D/E-X-N-X-S/T, where X is any amino acid other than proline (30–32). The N-linked glycan of C. jejuni has been previously determined to be GalNAc-α1,4-GalNAc-α1,4-[Glcβ1,3]-GalNAc-α1,4-GalNAc-α1,4-GalNAc-α1,3-Bac-β1 (BacGalNAc5Glc residue mass: 1406 Da), where Bac is 2,4-diacetamido-2,4,6-trideoxyglucopyranose (30). In addition, the glycan structure of C. jejuni is conserved, unlike in eukaryotic systems (30–32). IP-NPLC recovered close to 100% of the bacterial N-linked glycopeptides with virtually no contamination of non-glycopeptides. Furthermore, we demonstrate for the first time that acetylation of bacillosamine is incomplete in the wt using IP-NPLC and label-free MS.
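
The sequon rule and the quoted glycan residue mass in the entry above can be checked with a short script. This is a minimal sketch and not the authors' method: the names `find_sequons`, `glycan_residue_mass`, and `RESIDUE_MASS` are hypothetical, the input is assumed to be a protein sequence in one-letter amino acid codes, and the monoisotopic residue masses are standard textbook values rather than figures taken from the article.

```python
import re

# Extended bacterial N-linked sequon D/E-X-N-X-S/T, where X is not proline.
# The lookahead lets overlapping candidate sites be reported as well.
SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, matched 5-mer) for every sequon."""
    seq = sequence.upper()
    # The Asn is the third residue of the 5-mer, so its 1-based
    # position is the 0-based match start plus 3.
    return [(m.start() + 3, m.group(1)) for m in SEQUON.finditer(seq)]


# Monoisotopic residue masses in Da (assumed standard values, not from the source).
RESIDUE_MASS = {
    "Bac": 228.1110,     # 2,4-diacetamido-2,4,6-trideoxyhexose residue, C10H16N2O4
    "GalNAc": 203.0794,  # N-acetylhexosamine residue, C8H13NO5
    "Glc": 162.0528,     # hexose residue, C6H10O5
}


def glycan_residue_mass(composition):
    """Sum residue masses for a composition given as {residue name: count}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


# Composition of the C. jejuni heptasaccharide, BacGalNAc5Glc.
heptasaccharide = {"Bac": 1, "GalNAc": 5, "Glc": 1}

if __name__ == "__main__":
    print(find_sequons("AAEANATAA"))
    print(round(glycan_residue_mass(heptasaccharide), 2))
```

Under these assumptions the summed residue mass is about 1405.56 Da, which agrees with the nominal 1406 Da quoted above. A proline at either X position (for example `DPNAS`) yields no match.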

3.
We recently reported that induced pluripotent stem cells (iPSCs) prepared from different human origins acquired similar glycan profiles to one another as well as to human embryonic stem cells. Although the results strongly suggested attainment of specific glycan expressions associated with the acquisition of pluripotency, the detailed glycan structures remained to be elucidated. Here, we perform a quantitative glycome analysis targeting both N- and O-linked glycans derived from 201B7 human iPSCs and human dermal fibroblasts as undifferentiated and differentiated cells, respectively. Overall, the fractions of high mannose-type N-linked glycans were significantly increased upon induction of pluripotency. Moreover, it became evident that the type of linkage of Sia on N-linked glycans was dramatically changed from α-2–3 to α-2–6, and the expression of α-1–2 fucose and type 1 LacNAc structures became clearly apparent, while no such glycan epitopes were detected in fibroblasts. The expression profiles of relevant glycosyltransferase genes were fully consistent with these results. These observations indicate unambiguously the manifestation of a “glycome shift” upon conversion to iPSCs, which may not merely be the result of the initialization of gene expression, but could be involved in a more aggressive manner either in the acquisition or maintenance of the undifferentiated state of iPSCs.

Induced pluripotent stem cells (iPSCs) are genetically manufactured pluripotent cells obtained by the transfection of reprogramming factors. Such iPSCs were first reported in 2006 for the mouse (1) and in 2007 for humans (2, 3). Although iPSCs have already been used in the fields of drug development and disease models (4–7), basic aspects of iPSCs largely remain to be elucidated to provide us with a fuller understanding of their properties and for therapeutic applications to be developed in the field of regenerative medicine.
These aspects include the need for a definitive system to be established to evaluate their properties, e.g. pluripotency, differentiation propensity, risk of possible contamination of xenoantigens, and even the potential for tumorigenesis. Cell surface glycans are often referred to as the “cell signature,” which changes dramatically depending on the cell properties and conditions (8) as a result of changes in gene expression, including epigenetic modifications of glycan-related molecules. Glycans, because of their outermost cell-surface locations and structural complexity, are considered to be most advantageous communication molecules, playing roles in various biological phenomena. Indeed, SSEA3/4 and Tra-1–60/81, which have been used to discriminate pluripotency, are cell surface glycan epitopes that respond to some specific antibodies (9–12).

Glycan-mediated cell-to-cell interactions have been shown to play important roles in various biological phenomena including embryogenesis and carcinogenesis (13–16). This might also be the case for the acquisition and maintenance of iPSC and ESC pluripotency, although there remains much to clarify concerning the roles of cell surface glycans in these events. Thus, the development of novel cell surface markers to evaluate the properties of iPSCs and ESCs is keenly required. Toward this goal, a glycomic approach has been made by several groups (17–20). In our previous study using an advanced lectin microarray technique (21), thirty-eight lectins capable of discriminating between iPSCs and SCs were statistically selected, and the characteristic features of the pluripotent state were obtained. The glycan profiles of the parent SCs, derived from four different tissues, were totally different from one another and from those of the iPSCs. Despite this observation, the technique used lacks the ability to determine detailed glycan structures or allow their quantification.
For this purpose, a conventional approach based on high performance liquid chromatography (HPLC) combined with matrix-assisted laser desorption-ionization (MALDI) - time of flight (TOF) mass spectrometry (MS) was undertaken for both the definitive identification of glycan structures and their quantitative comparison, which remained unclear in the previous analysis (21).

We report here structural data on N-linked and O-linked glycans derived from the human iPSC 201B7 cell line (2) and human dermal fibroblasts (SC) representing undifferentiated and differentiated cells, respectively. For quantitative comparison, the glycans were liberated by gas-phase hydrazinolysis from similar numbers of cells (22–25) and fluorescently tagged with 2-aminopyridine (2-AP) at their reducing terminus (26, 27), following which the derived pyridylaminated (PA-) glycans were purified by multiple-mode (i.e. anion-exchange, size-fractionation and reverse-phase) HPLC. Their structures were determined and quantified by HPLC mapping assisted with MALDI-TOF-MS and exoglycosidase digestion analyses. This report thus provides the first structural evidence showing the occurrence of a dynamic “glycome shift” upon induction of pluripotency.

4.
Human milk contains a rich set of soluble, reducing glycans whose functions and bioactivities are not well understood. Because human milk glycans (HMGs) have been implicated as receptors for various pathogens, we explored the functional glycome of human milk using shotgun glycomics. The free glycans from pooled milk samples of donors with mixed Lewis and Secretor phenotypes were labeled with a fluorescent tag and separated via multidimensional HPLC to generate a tagged glycan library containing 247 HMG targets that were printed to generate the HMG shotgun glycan microarray (SGM). To investigate the potential role of HMGs as decoy receptors for rotavirus (RV), a leading cause of severe gastroenteritis in children, we interrogated the HMG SGM with recombinant forms of VP8* domains of the RV outer capsid spike protein VP4 from human neonatal strains N155(G10P[11]) and RV3(G3P[6]) and a bovine strain, B223(G10P[11]). Glycans that were bound by RV attachment proteins were selected for detailed structural analyses using metadata-assisted glycan sequencing, which compiles data on each glycan based on its binding by antibodies and lectins before and after exo- and endo-glycosidase digestion of the SGM, coupled with independent MSn analyses. These complementary structural approaches resulted in the identification of 32 glycans based on RV VP8* binding, many of which are novel HMGs, whose detailed structural assignments by MSn are described in a companion report. Although sialic acid has been thought to be important as a surface receptor for RVs, our studies indicated that sialic acid is not required for binding of glycans to individual VP8* domains. Remarkably, each VP8* recognized specific glycan determinants within a unique subset of related glycan structures where specificity differences arise from subtle differences in glycan structures.

Human milk offers nutrition, innate immune protection, and other developmental benefits to infants (1, 2).
In addition to essential nutrients and bioactive antibodies, human milk uniquely possesses a rich pool of free-reducing glycans (oligosaccharides), most of which are unique to human milk (3, 4). Depending on the blood group status and the lactation stage of an individual, the concentration of human milk glycans (HMGs) larger than lactose varies between 5 and 15 g/l, making them the third largest component of human milk after lactose and lipids (5). Over the past decades, more than 100 structurally distinct HMGs have been identified (6–9). All of these glycans originate from a lactose that is extended by type 1 (Galβ1–3GlcNAc) or type 2 (Galβ1–4GlcNAc) N-acetyllactosamine in either linear or branched forms and further modified with α-linked fucose and/or N-acetylneuraminic acid. It has been shown that HMGs are only minimally digested in the upper gastrointestinal tract and are transported intact into the lower parts of the intestine (10, 11). Additionally, ∼1% to 2% of HMGs are excreted via an infant's urine and seem to appear in the circulation (12, 13).

Accumulated evidence has indicated that HMGs play multiple biological roles. In addition to having well-known prebiotic effects that promote the growth of beneficial microflora in the intestine (14, 15), HMGs are suggested to competitively interfere with pathogen attachment to the host cell surface by acting as soluble decoy receptors (16–18), and such anti-adhesive effects are often glycan specific (19). For example, α1–2 fucosylated HMGs, which arise mainly from individuals who are Secretor(+), were observed to prevent the adherence of Campylobacter jejuni to epithelial cells (20) and were associated with protection against diarrhea caused by Campylobacter, caliciviruses, and Escherichia coli toxin in breastfed infants (21–23). Sialylated HMGs were exclusive receptors for influenza viruses (24–26) and showed a capacity to inhibit cholera toxin B (27), Vibrio cholerae (28), enterotoxigenic E. coli, and uropathogenic E. coli strains (29, 30). It was also proposed that HMGs might serve as anti-inflammatory components and thus contribute to the lower incidence of necrotizing enterocolitis in breastfed infants. This idea is supported by the observations that the acidic fraction of HMG inhibits leukocyte rolling, adhesion, and activation (31) and disialyllacto-N-tetraose prevents necrotizing enterocolitis in neonatal rats (32). Furthermore, a variety of cytoprotective activities of HMGs have been reported against Clostridium difficile toxins (33), Helicobacter pylori (34, 35), Streptococcus pneumoniae (36), Entamoeba histolytica (37), and HIV-1-gp120 (38). Although the numerous in vitro and in vivo data provide important information about the function of HMGs, these studies have typically used HMG fraction mixtures or a small panel of defined HMGs, and therefore the bioactive HMGs were either not identified or poorly identified.

In order to better understand the interactions of HMGs with various microorganisms, it is necessary to examine the entire milk metaglycome and identify the specific bioactive components, which is not possible via traditional methods that mainly focus on compositional analysis of HMGs (39). To find an efficient route for establishing the function–structure relationship of HMGs, we applied a “shotgun glycomics” approach to generate a shotgun glycan microarray (SGM) from isolated human milk glycans of a Lewis-positive, non-secretor individual (25, 40). The functional recognition studies, along with metadata-assisted glycan sequencing (MAGS), revealed novel epitopes/receptors for anti-TRA-1 antibodies, influenza viruses, and minute viruses of mice. Our work represented the first natural glycan microarray of HMGs containing over 100 glycans.
Notably, the antibody binding data showed a lack of α1,2-fucosylated HMGs on this SGM, confirming that the donor was a non-secretor (41, 42).

Here we describe our studies in which we prepared an SGM containing over 200 isolated HMG targets from pooled human milk of mixed Lewis and Secretor phenotypes and investigated the binding of rotavirus (RV) cell attachment protein to them. Human RVs are the leading cause of severe gastroenteritis in children, responsible for an estimated 453,000 deaths each year worldwide (43). As with many other pathogens, RV infection is initiated by the interaction with specific cellular glycans. The VP8* domain of the RV outer capsid spike protein VP4 mediates this process (44), but the identity of VP8* receptors is quite controversial. It was believed that VP8* recognized either terminal sialic acid or internal sialic acid, mainly based on crystallographic and NMR studies (45–48). However, recently a human strain (HAL1166) with a P[14] VP8* was found to bind to A-type histo-blood group antigen (49), a neonatal strain with a P[11] VP8* bound to type 2 precursor glycans (50), and several other P types recognized secretor-related antigens Lewis b and H type 1 (51). These studies indicate that sialic acid might not be required by all RVs and that the glycan receptors are genotype-dependent. The infectivity of a porcine RV was inhibited by sialyl HMGs in vitro (52); however, there are limited data on human RVs. Here, we demonstrate that the VP8* of two different human neonatal RVs and an additional bovine strain bound to HMGs independent of sialic acid and that each VP8* demonstrated a unique glycan-binding specificity.

5.
Changes to the glycan structures of proteins secreted by cancer cells are known to be functionally important and to have potential diagnostic value. However, an exploration of the population variation and prevalence of glycan alterations on specific proteins has been lacking because of limitations in conventional glycobiology methods. Here we report the use of a previously developed antibody-lectin sandwich array method to characterize both the protein and glycan levels of specific mucins and carcinoembryonic antigen-related proteins captured from the sera of pancreatic cancer patients (n = 23) and control subjects (n = 23). The MUC16 protein was frequently elevated in the cancer patients (65% of the patients) but showed no glycan alterations, whereas the MUC1 and MUC5AC proteins were less frequently elevated (30 and 35%, respectively) and showed highly prevalent (up to 65%) and distinct glycan alterations. The most frequent glycan elevations involved the Thomsen-Friedenreich antigen, fucose, and Lewis antigens. An unexpected increase in the exposure of α-linked mannose also was observed on MUC1 and MUC5AC, indicating possible N-glycan modifications. Because glycan alterations occurred independently from the protein levels, improved identification of the cancer samples was achieved using glycan measurements on specific proteins relative to using the core protein measurements. The most significant elevation was the cancer antigen 19-9 on MUC1, occurring in 19 of 23 (87%) of the cancer patients and one of 23 (4%) of the control subjects. This work gives insight into the prevalence and protein carriers of glycan alterations in pancreatic cancer and points to the potential of using glycan measurements on specific proteins for highly effective biomarkers.

Alterations to the glycan structures on extracellular proteins are a common feature of many types of epithelial cancer such as pancreatic, colon, and breast cancers (1, 2).
Cancer-associated glycan structures are thought to be functionally involved in many of the phenotypes characterizing cancer cells, including the ability to migrate, avoid apoptosis, evade immune destruction, and enter and exit the vasculature (3). Because proteins bearing cancer-associated glycans can be shed by tumor cells into the circulation, blood-based diagnostic tests using glycan detection may be possible. A potential advantage of using glycans for diagnostics is that carbohydrate modifications of particular proteins may be altered more frequently or more specifically in certain disease states than their underlying core protein concentrations. However, to evaluate and use such a strategy, the prevalence with which various structures appear and the specific proteins on which they appear must be better characterized.

Previous studies of cancer-associated glycosylation using enzymatic, chromatographic, and mass spectrometry methods have been very effective for providing detailed information about the glycan structures produced by cancer cells, but because of the requirements for large amounts of material and the time involved to analyze each sample, these studies generally used either cell culture material or a small number of patient samples. Therefore, while many cancer-associated glycans have been identified, much remains unknown about these glycans, including how often they appear, how closely they are associated with particular disease states, and the distribution of protein carriers on which they appear.

Affinity-based methods, using reagents such as lectins or glycan-binding antibodies, are a valuable complement to the above-mentioned methods. Using antibodies or lectins that bind specific glycans, one may reproducibly measure the levels of those glycans over multiple samples.
Although affinity-based glycosylation studies do not provide the structural detail provided by mass spectrometry and enzymatic methods, they can provide information about the biological variation of a particular motif.

Lectins and glycan-binding antibodies have been used extensively in immunohistochemistry, for example in studies to examine the tissue distribution in pancreatic tumors of certain blood group carbohydrates (4, 5). Lectins have been valuable in immunoaffinity electrophoresis and blotting methods to identify cancer-associated glycan variants on major serum proteins such as α-fetoprotein (6), haptoglobin (7, 8), α1-acid glycoprotein (9), and α1-antitrypsin (10). Antibodies raised against particular glycan groups, such as the Thomsen-Friedenreich antigens (11), the Lewis blood group structures (12), and underglycosylated MUC1 (13) also have been used to study the roles of glycans in cancer. As a means of quantifying glycans on specific proteins, lectins have been used in the capture or detection of proteins in microtiter plates (14).

We previously demonstrated an antibody-lectin sandwich array method (15) that is a valuable complement to the above methods and is ideal for profiling the prevalence of multiple glycans on multiple proteins. Glycan levels can be probed directly from biological samples, and many samples or detection conditions can be processed efficiently in a low volume, high throughput format (16). This method is complementary to lectin microarrays (17–19), which are useful for measuring glycan levels on individual, purified proteins; glycan microarrays (20, 21), which are used to measure the recognition of carbohydrate structures by various glycan-binding reagents; and glycoprotein arrays (22) for examining glycosylation on proteins isolated from biological samples.

We applied this method to the study of glycan alterations on proteins in the circulation of pancreatic cancer patients.
We sought to define the prevalence of various glycan alterations on particular protein carriers and to investigate whether those measurements have advantages for cancer diagnostics relative to measurements of core proteins. We designed antibody microarrays to target members of the mucin and carcinoembryonic antigen-related cell adhesion molecule (CEACAM) families because some of those proteins are known to carry cancer-associated glycans. Mucins are extracellular, long-chain glycoproteins involved in the control and protection of epithelial surfaces, and the expression and glycosylation of several mucins are often altered and functionally involved in cancer (23, 24). The CEACAM family of proteins also is functionally involved in cancer, and they carry cancer-associated glycans (25, 26), but the glycans on CEACAMs are less well studied than those on mucins. By measuring both glycan levels and the core protein levels of several of these molecules, we were able to investigate whether alterations to glycans can appear at a higher rate than changes to core protein abundances. The ability to test the presence of glycan structures on multiple protein carriers in multiple samples was critical to investigating these questions.

6.
The biological and clinical relevance of glycosylation is becoming increasingly recognized, leading to a growing interest in large-scale clinical and population-based studies. In the past few years, several methods for high-throughput analysis of glycans have been developed, but thorough validation and standardization of these methods is required before significant resources are invested in large-scale studies. In this study, we compared liquid chromatography, capillary gel electrophoresis, and two MS methods for quantitative profiling of N-glycosylation of IgG in the same data set of 1201 individuals. To evaluate the accuracy of the four methods we then performed analysis of association with genetic polymorphisms and age. Chromatographic methods with either fluorescent or MS-detection yielded slightly stronger associations than MS-only and multiplexed capillary gel electrophoresis, but at the expense of lower levels of throughput. Advantages and disadvantages of each method were identified, which should inform the selection of the most appropriate method in future studies. Glycans are important structural and functional components of the majority of proteins, but because of their structural complexity and the absence of a direct genetic template our current understanding of the role of glycans in biological processes lags significantly behind the knowledge about proteins or DNA (1, 2). 
However, a recent comprehensive report endorsed by the US National Academies concluded that “glycans are directly involved in the pathophysiology of every major disease and that additional knowledge from glycoscience will be needed to realize the goals of personalized medicine” (3). It is estimated that the glycome (defined as the complete set of all glycans) of a eukaryotic cell is composed of more than a million different glycosylated structures (1), which contain up to 10,000 structural glycan epitopes for interaction with antibodies, lectins, receptors, toxins, microbial adhesins, or enzymes (4). Our recent population-based studies indicated that the composition of the human plasma N-glycome varies significantly between individuals (5, 6). Because glycans have important structural and regulatory functions on numerous glycoproteins (7), the observed variability suggests that differences in glycosylation might contribute to a large part of the human phenotypic variability. Interestingly, when the N-glycome of isolated immunoglobulin G (IgG) was analyzed, it was found to be even more variable than the total plasma N-glycome (8), indicating that the combined analysis of all plasma glycans released from many different glycoproteins blurs signals of protein-specific regulation of glycosylation. A number of studies have investigated the role of glycans in human disease, including autoimmune diseases and cancer (9, 10). However, most human glycan studies have been conducted with very small sample sizes. Given the complex causal pathways involved in pathophysiology of common complex disease, and thus the likely modest effect sizes associated with individual factors, the majority of these studies are very likely to be substantially underpowered. 
In the case of inflammatory bowel disease, only 20% of reported inflammatory bowel disease glycan associations were replicated in subsequent studies, suggesting that most are false positive findings and that there is publication bias favoring the publication of positive findings (11). This situation is similar to that which occurred in the field of genetic epidemiology in the past when many underpowered candidate gene studies were published and were later found to consist of mainly false positive findings (12, 13). It is essential, therefore, that robust and affordable methods for high-throughput analysis are developed so that adequately powered studies can be conducted and the publication of large numbers of small studies reporting false positive results (which could threaten the credibility of glycoscience) be avoided. Rapid advances of technologies for high-throughput genome analysis in the past decade enabled large-scale genome-wide association studies (GWAS). GWAS has become a reliable tool for identification of associations between genetic polymorphisms and various human diseases and traits (14). Thousands of GWAS have been conducted in recent years, but these have not included the study of glycan traits until recently. The main reason was the absence of reliable tools for high-throughput quantitative analysis of glycans that could match the measurements of genomic, biochemical, and other traits in their cost, precision, and reproducibility. However, several promising high-throughput technologies for analysis of N-glycans were developed (8, 15–20) recently. 
Successful implementation of high-throughput analytical techniques for glycan analysis resulted in publication of four initial GWAS of the human glycome (21–24). In this study, we compared ultra-performance liquid chromatography with fluorescence detection (UPLC-FLR), multiplex capillary gel electrophoresis with laser induced fluorescence detection (xCGE-LIF), matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS), and liquid chromatography electrospray mass spectrometry (LC-ESI-MS) as tools for mid-to-high-throughput glycomics and glycoproteomics. We have analyzed IgG N-glycans by all four methods in 1201 individuals from European populations. The analysis of associations between glycans and ∼300,000 single-nucleotide genetic polymorphisms was performed and correlation between glycans and age was studied in all four data sets to identify the analytical method that shows the strongest potential to uncover biological mechanisms underlying protein glycosylation.

7.
Allergenic proteins such as grass pollen and house dust mite (HDM) proteins are known to trigger hypersensitivity reactions of the immune system, leading to what is commonly known as allergy. Key allergenic proteins including sequence variants have been identified but characterization of their post-translational modifications (PTMs) is still limited. Here, we present a detailed PTM characterization of a series of the main and clinically relevant allergens used in allergy tests and vaccines. We employ Orbitrap-based mass spectrometry with complementary fragmentation techniques (HCD/ETD) for site-specific PTM characterization by bottom-up analysis. In addition, top-down mass spectrometry is utilized for targeted analysis of individual proteins, revealing hitherto unknown PTMs of HDM allergens. We demonstrate the presence of lysine-linked polyhexose glycans and asparagine-linked N-acetylhexosamine glycans on HDM allergens. Moreover, we identified more complex glycan structures than previously reported on the major grass pollen group 1 and 5 allergens, implicating important roles for carbohydrates in allergen recognition and response by the immune system. The new findings are important for understanding basic disease-causing mechanisms at the cellular level, which ultimately may pave the way for instigating novel approaches for targeted desensitization strategies and improved allergy vaccines. Allergic respiratory disease is a global health problem and current clinical guidelines recommend a combination of allergen avoidance, pharmacotherapy, and allergen specific immunotherapy for treatment (1–4). At present allergy testing and vaccines are based on isolated crude antigen preparations from natural sources (i.e. HDM, pollens, etc.), but a move toward recombinant allergen design is ongoing (5, 6). 
This could have important functional implications because the production host will determine the repertoire of post-translational modifications (PTMs) and in particular glycan modifications presented on allergens. The carbohydrate structures found on allergens are in most cases not found in mammals and therefore frequently lead to the induction of IgE antibodies named Cross-reactive Carbohydrate Determinants (CCD) (7–11). Moreover, glycans may directly be involved in and promote uptake and target allergens to carbohydrate lectin receptors on antigen presenting cells (APC) (12–14). Therefore, a full structural characterization of the glycans on the natural allergens is a prerequisite for understanding both antibody reactivity and lectin receptor mediated allergen recognition and modulation of the immune response (15, 16). Furthermore, a detailed characterization of PTMs of allergens is important for standardization of allergen products for diagnostic purposes as well as for vaccine use (17, 18). Although many major allergens and their etiology have been characterized in some detail, structural information on, for example, their immunologically important PTM status is still incomplete (19–21). Mass spectrometry-based technologies offer sensitive and accurate analyses for identification and characterization of proteins. The common proteomics workflow typically adopts the bottom-up approach, i.e. in vitro proteolytic digestion of proteins followed by nanoflow-liquid chromatography-tandem mass spectrometry (nLC-MS/MS) for protein identification and PTM characterization. Electron- or collision-driven fragmentation techniques, e.g. electron transfer dissociation (ETD) (22) or higher energy collisional dissociation (HCD) (23) have enabled accurate identification of peptides of purified proteins, e.g. allergens (21, 24), or complex biological samples (25–27) with concurrent characterization of their PTMs. 
One advantage of bottom-up mass spectrometry is the ability to resolve modified peptides within a narrow chromatographic time frame thereby enabling in-depth characterization of site-specific features, e.g. glycoforms, on peptides. This peptide-level information is subsequently used to generate a protein-level view on the PTM status for a given protein. Importantly, the PTM connectivity of the protein (28) is lost upon proteolytic digestion, and alternative approaches are often required for comprehensive characterization of all proteoforms (29). Top-down mass spectrometry has emerged as an alternative approach to bottom-up proteomics, offering complementary MS and MS/MS information that may be used for protein identification and characterization (30, 31). With top-down MS, intact proteins are typically analyzed by high-resolution FTMS and characterized at the MS/MS level by CID, HCD, ECD, or ETD. This technique provides instant protein-level information on analytes, e.g. sequence variants, amino acid substitutions, PTMs, etc., which can be verified at the MS/MS level by different fragmentation modes. The combination of bottom-up and top-down mass spectrometry is therefore a powerful tool for the identification and characterization of proteins. Here, we combine top-down and bottom-up mass spectrometry for comprehensive characterization of seven major allergens as a first step toward unraveling the molecular mode of action of allergens with complex PTMs. By these methods, we demonstrate hitherto unknown PTMs of HDM allergens and identify more complex glycan structures than previously reported on the major grass pollen group 1 and 5 allergens. The new findings implicate important roles for carbohydrates in allergen recognition and response by the immune system.

8.
Database search programs are essential tools for identifying peptides via mass spectrometry (MS) in shotgun proteomics. Simultaneously achieving high sensitivity and high specificity during a database search is crucial for improving proteome coverage. Here we present JUMP, a new hybrid database search program that generates amino acid tags and ranks peptide spectrum matches (PSMs) by an integrated score from the tags and pattern matching. In a typical run of liquid chromatography coupled with high-resolution tandem MS, more than 95% of MS/MS spectra can generate at least one tag, whereas the remaining spectra are usually too poor to derive genuine PSMs. To enhance search sensitivity, the JUMP program enables the use of tags as short as one amino acid. Using a target-decoy strategy, we compared JUMP with other programs (e.g. SEQUEST, Mascot, PEAKS DB, and InsPecT) in the analysis of multiple datasets and found that JUMP outperformed these preexisting programs. JUMP also permitted the analysis of multiple co-fragmented peptides from “mixture spectra” to further increase PSMs. In addition, JUMP-derived tags allowed partial de novo sequencing and facilitated the unambiguous assignment of modified residues. In summary, JUMP is an effective database search algorithm complementary to current search programs. Peptide identification by tandem mass spectra is a critical step in mass spectrometry (MS)-based proteomics (1). Numerous computational algorithms and software tools have been developed for this purpose (2–6). These algorithms can be classified into three categories: (i) pattern-based database search, (ii) de novo sequencing, and (iii) hybrid search that combines database search and de novo sequencing. With the continuous development of high-performance liquid chromatography and high-resolution mass spectrometers, it is now possible to analyze almost all protein components in mammalian cells (7). 
In contrast to rapid data collection, it remains a challenge to extract accurate information from the raw data to identify peptides with low false positive rates (specificity) and minimal false negatives (sensitivity) (8). Database search methods usually assign peptide sequences by comparing MS/MS spectra to theoretical peptide spectra predicted from a protein database, as exemplified in SEQUEST (9), Mascot (10), OMSSA (11), X!Tandem (12), Spectrum Mill (13), ProteinProspector (14), MyriMatch (15), Crux (16), MS-GFDB (17), Andromeda (18), BaMS2 (19), and Morpheus (20). Some other programs, such as SpectraST (21) and Pepitome (22), utilize a spectral library composed of experimentally identified and validated MS/MS spectra. These methods use a variety of scoring algorithms to rank potential peptide spectrum matches (PSMs) and select the top hit as a putative PSM. However, not all PSMs are correctly assigned. For example, false peptides may be assigned to MS/MS spectra with numerous noisy peaks and poor fragmentation patterns. If the samples contain unknown protein modifications, mutations, and contaminants, the related MS/MS spectra also result in false positives, as their corresponding peptides are not in the database. Other false positives may be generated simply by random matches. Therefore, it is of importance to remove these false PSMs to improve dataset quality. One common approach is to filter putative PSMs to achieve a final list with a predefined false discovery rate (FDR) via a target-decoy strategy, in which decoy proteins are merged with target proteins in the same database for estimating false PSMs (23–26). However, the true and false PSMs are not always distinguishable based on matching scores. 
It is a problem to set up an appropriate score threshold to achieve maximal sensitivity and high specificity (13, 27, 28). De novo methods, including Lutefisk (29), PEAKS (30), NovoHMM (31), PepNovo (32), pNovo (33), Vonovo (34), and UniNovo (35), identify peptide sequences directly from MS/MS spectra. These methods can be used to derive novel peptides and post-translational modifications without a database, which is useful, especially when the related genome is not sequenced. High-resolution MS/MS spectra greatly facilitate the generation of peptide sequences in these de novo methods. However, because MS/MS fragmentation cannot always produce all predicted product ions, only a portion of collected MS/MS spectra have sufficient quality to extract partial or full peptide sequences, leading to lower sensitivity than achieved with the database search methods. To improve the sensitivity of the de novo methods, a hybrid approach has been proposed to integrate peptide sequence tags into PSM scoring during database searches (36). Numerous software packages have been developed, such as GutenTag (37), InsPecT (38), Byonic (39), DirecTag (40), and PEAKS DB (41). These methods use peptide tag sequences to filter a protein database, followed by error-tolerant database searching. One restriction in most of these algorithms is the requirement of a minimum tag length of three amino acids for matching protein sequences in the database. This restriction reduces the sensitivity of the database search, because it filters out some high-quality spectra in which consecutive tags cannot be generated. In this paper, we describe JUMP, a novel tag-based hybrid algorithm for peptide identification. The program is optimized to balance sensitivity and specificity during tag derivation and MS/MS pattern matching. JUMP can use all potential sequence tags, including tags consisting of only one amino acid. 
When we compared its performance to that of two widely used search algorithms, SEQUEST and Mascot, JUMP identified ∼30% more PSMs at the same FDR threshold. In addition, the program provides two additional features: (i) using tag sequences to improve modification site assignment, and (ii) analyzing co-fragmented peptides from mixture MS/MS spectra.
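The target-decoy filtering described in this entry can be illustrated with a minimal sketch. It is not JUMP's implementation: the function name `score_threshold_at_fdr`, the `(score, is_decoy)` tuple layout, and the simple decoys/targets FDR estimate are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def score_threshold_at_fdr(
    psms: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    Each PSM is a hypothetical (score, is_decoy) pair. The FDR at a cutoff is
    estimated as (#decoy PSMs) / (#target PSMs) among PSMs scoring at or above
    that cutoff. Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best: Optional[float] = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted list while the estimate stays in bounds.
        if targets > 0 and decoys / targets <= max_fdr:
            best = score
    return best


# Toy PSM list: four target hits and two decoy hits.
example_psms = [
    (9.0, False), (8.0, False), (7.0, False),
    (6.0, True), (5.0, False), (4.0, True),
]
cutoff = score_threshold_at_fdr(example_psms, max_fdr=0.25)
```

With the toy list, a cutoff of 5.0 admits four target PSMs and one decoy, which is exactly the 25% bound. Comparing two search programs "at the same FDR threshold" then amounts to counting target PSMs above each program's own cutoff.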

9.
All human cells are covered by glycans, the carbohydrate units of glycoproteins, glycolipids, and proteoglycans. Most glycans are localized to cell surfaces and participate in events essential for cell viability and function. Glycosylation evolves during carcinogenesis, and therefore carcinoma-related glycan structures are potential cancer biomarkers. Colorectal cancer is one of the world's three most common cancers, and its incidence is rising. Novel biomarkers are essential to identify patients for targeted and individualized therapy. We compared the N-glycan profiles of five rectal adenomas and 18 rectal carcinomas of different stages by matrix-assisted laser desorption-ionization time-of-flight mass spectrometry. Paraffin-embedded tumor samples were deparaffinized, and glycans were enzymatically released and purified. We found differences in glycosylation between adenomas and carcinomas: monoantennary, sialylated, pauci-mannose, and small high-mannose N-glycan structures were more common in carcinomas than in adenomas. We also found differences between stage I–II and stage III carcinomas. Based on these findings, we selected two glycan structures, pauci-mannose and sialyl Lewis a, for immunohistochemical analysis of their tissue expression in 220 colorectal cancer patients. In colorectal cancer, poor prognosis correlated with elevated expression of sialyl Lewis a, and in advanced colorectal cancer, poor prognosis correlated with elevated expression of pauci-mannose. In conclusion, by mass spectrometry we found several carcinoma-related glycans, and we demonstrate a method of transforming these results into immunohistochemistry, a readily applicable method to study biomarker expression in patient samples. Glycans, the carbohydrate units of glycoproteins, glycolipids, and proteoglycans, cover all human cells. Around 1% of the human genome participates in the biosynthesis of glycans (1). 
This biosynthesis is the most complex post-translational modification of proteins, and the great variability in glycan structures contains a tremendous ability to fine-tune the chemical and biological properties of glycoproteins. The glycosylation process occurs most abundantly in the Golgi apparatus and the endoplasmic reticulum, but also occurs in the cytoplasm and nucleus (2). Most glycoconjugates are localized to cell surfaces, where glycans participate in events essential for cell viability and function, such as cell adhesion, motility, and intracellular signaling (2). Changes in these functions are key steps seen when normal cells transform to malignant ones, and these are also reflected in changes of a cell's glycan profile, observed in many cancers (3, 4). Specific structural changes in glycans may serve as cancer biomarkers (5, 6), and changes in glycosylation profiles are related to aggressive behavior in tumor cells (7–9). Cancer-associated asparagine-linked glycan (N-glycan) structures may play specific roles in supporting tumor progression: growth (10, 11), invasion (12, 13), and angiogenesis (14). Changes in the N-glycan profile emerge in numerous cancers, including lung (15, 16), breast (17), and colorectal cancer (CRC) (16, 18). Balog et al. (18), comparing the N-glycomic profile of CRC tissue to adjacent normal mucosa, reported differences in specific glycan structures. Moreover, serum N-glycosylation profiles from patients with CRC differ from those of healthy controls (19). Colorectal cancer is the third most common cause of cancer-related death worldwide and its incidence is rising; 40% of CRCs are of rectal origin. Roughly 40% of patients have localized disease (stage I–II; Dukes A–B), another 40% locoregional disease (stage III; Dukes C), and 20% metastasized disease (stage IV; Dukes D) (20). 
Although stage at diagnosis is the most important factor determining prognosis, clinical outcome and response to adjuvant treatment can vary markedly within each stage. Adjuvant therapy routinely goes to stage III patients, but the benefit of adjuvant treatment for stage II patients is unclear. Of stage II patients, 80% are cured by radical surgery alone. To identify patients who will benefit from postoperative treatment, we need novel biomarkers. The glycan profile of the tumor tissue could provide new biomarkers for diagnosis and prognosis of cancer. In this study, we characterized the N-glycomic profiles of rectal adenomas and carcinomas by MALDI-TOF mass spectrometric (MS) profiling of asparagine-linked glycans. Our aim was to identify differences between adenomas and carcinomas, and also between cancers of different stages. Based on glycan profiling, we also chose two glycan markers, sialyl Lewis a and pauci-mannose, for immunohistochemical expression studies in a series of 220 CRC patients.

10.
11.
The success of high-throughput proteomics hinges on the ability of computational methods to identify peptides from tandem mass spectra (MS/MS). However, a common limitation of most peptide identification approaches is the nearly ubiquitous assumption that each MS/MS spectrum is generated from a single peptide. We propose a new computational approach for the identification of mixture spectra generated from more than one peptide. Capitalizing on the growing availability of large libraries of single-peptide spectra (spectral libraries), our quantitative approach is able to identify up to 98% of all mixture spectra from equally abundant peptides and automatically adjust to varying abundance ratios of up to 10:1. Furthermore, we show how theoretical bounds on spectral similarity avoid the need to compare each experimental spectrum against all possible combinations of candidate peptides (achieving speedups of over five orders of magnitude) and demonstrate that mixture-spectra can be identified in a matter of seconds against proteome-scale spectral libraries. Although our approach was developed for and is demonstrated on peptide spectra, we argue that the generality of the methods allows for their direct application to other types of spectral libraries and mixture spectra. The success of tandem MS (MS/MS) approaches to peptide identification is partly due to advances in computational techniques allowing for the reliable interpretation of MS/MS spectra. Mainstream computational techniques mainly fall into two categories: database search approaches that score each spectrum against peptides in a sequence database (1–4) or de novo techniques that directly reconstruct the peptide sequence from each spectrum (5–8). 
The combination of these methods with advances in high-throughput MS/MS has promoted the accelerated growth of spectral libraries, collections of peptide MS/MS spectra the identification of which were validated by accepted statistical methods (9, 10) and often also manually confirmed by mass spectrometry experts. The similar concept of spectral archives was also recently proposed to denote spectral libraries including “interesting” nonidentified spectra (11) (i.e. recurring spectra with good de novo reconstructions but no database match). The growing availability of these large collections of MS/MS spectra has reignited the development of alternative peptide identification approaches based on spectral matching (12–14) and alignment (15–17) algorithms. However, mainstream approaches were developed under the (often unstated) assumption that each MS/MS spectrum is generated from a single peptide. Although chromatographic procedures greatly contribute to making this a reasonable assumption, there are several situations where it is difficult or even impossible to separate pairs of peptides. Examples include certain permutations of the peptide sequence or post-translational modifications (see (18) for examples of co-eluting histone modification variants). In addition, innovative experimental setups have demonstrated the potential for increased throughput in peptide identification using mixture spectra; examples include data-independent acquisition (19), ion-mobility MS (20), and MSE strategies (21). To alleviate the algorithmic bottleneck in such scenarios, we describe a computational approach, M-SPLIT (mixture-spectrum partitioning using library of identified tandem mass spectra), that is able to reliably and efficiently identify peptides from mixture spectra, which are generated from a pair of peptides. In brief, a mixture spectrum is modeled as a linear combination of two single-peptide spectra, and peptide identification is done by searching against a spectral library. 
We show that efficient filtration and accurate branch-and-bound strategies can be used to avoid the huge computational cost of searching all possible pairs. Thus equipped, our approach is able to identify the correct matches by considering only a minuscule fraction of all possible matches. Beyond potentially enhancing the identification capabilities of current MS/MS acquisition setups, we argue that the availability of methods to reliably identify MS/MS spectra from mixtures of peptides could enable the collection of MS/MS data using accelerated chromatography setups to obtain the same or better peptide identification results in a fraction of the experimental time currently required for exhaustive peptide separation.
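The idea of modeling a mixture spectrum as a linear combination of two library spectra can be sketched as follows. This is a deliberately naive illustration, not M-SPLIT itself: it assumes spectra are already binned into equal-length intensity vectors, scores candidates by cosine similarity, and searches every pair exhaustively over a coarse grid of mixing coefficients, which is exactly the cost that the filtration and branch-and-bound strategies in the abstract are said to avoid. All names (`cosine`, `best_pair`, the toy library keys) are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two binned spectra (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def best_pair(
    query: Vector, library: Dict[str, Vector], steps: int = 10
) -> Tuple[float, Optional[str], Optional[str], float]:
    """Exhaustively fit query ~ major + alpha * minor, with alpha in (0, 1].

    Returns (similarity, major_name, minor_name, alpha). With steps=10 the
    smallest alpha is 0.1, i.e. abundance ratios of up to 10:1 are covered.
    """
    best: Tuple[float, Optional[str], Optional[str], float] = (0.0, None, None, 0.0)
    for (name_a, spec_a), (name_b, spec_b) in combinations(library.items(), 2):
        # Try both assignments of the more abundant ("major") peptide.
        for major, minor, n_major, n_minor in (
            (spec_a, spec_b, name_a, name_b),
            (spec_b, spec_a, name_b, name_a),
        ):
            for i in range(1, steps + 1):
                alpha = i / steps
                mix = [x + alpha * y for x, y in zip(major, minor)]
                score = cosine(query, mix)
                if score > best[0]:
                    best = (score, n_major, n_minor, alpha)
    return best


# Toy library of three single-peptide spectra over four m/z bins.
toy_library = {
    "PEPA": [1.0, 0.0, 0.0, 0.0],
    "PEPB": [0.0, 1.0, 0.0, 0.0],
    "PEPC": [0.0, 0.0, 1.0, 0.0],
}
score, major_name, minor_name, alpha = best_pair([1.0, 0.5, 0.0, 0.0], toy_library)
```

In the toy case the query is recovered as PEPA plus half as much PEPB. The number of pairs grows quadratically with library size, so a proteome-scale library would make this brute-force loop impractical without some form of pruning.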

12.
13.
14.
Campylobacter jejuni is a gastrointestinal pathogen that is able to modify membrane and periplasmic proteins by the N-linked addition of a 7-residue glycan at the strict attachment motif (D/E)XNX(S/T). Strategies for a comprehensive analysis of the targets of glycosylation, however, are hampered by the resistance of the glycan-peptide bond to enzymatic digestion or β-elimination and have previously concentrated on soluble glycoproteins compatible with lectin affinity and gel-based approaches. We developed strategies for enriching C. jejuni HB93-13 glycopeptides using zwitterionic hydrophilic interaction chromatography and examined novel fragmentation, including collision-induced dissociation (CID) and higher energy collisional (C-trap) dissociation (HCD) as well as CID/electron transfer dissociation (ETD) mass spectrometry. CID/HCD enabled the identification of glycan structure and peptide backbone, allowing glycopeptide identification, whereas CID/ETD enabled the elucidation of glycosylation sites by maintaining the glycan-peptide linkage. A total of 130 glycopeptides, representing 75 glycosylation sites, were identified from LC-MS/MS using zwitterionic hydrophilic interaction chromatography coupled to CID/HCD and CID/ETD. CID/HCD provided the majority of the identifications (73 sites) compared with ETD (26 sites). We also examined soluble glycoproteins by soybean agglutinin affinity and two-dimensional electrophoresis and identified a further six glycosylation sites. This study more than doubles the number of confirmed N-linked glycosylation sites in C. jejuni and is the first to utilize HCD fragmentation for glycopeptide identification with intact glycan. We also show that hydrophobic integral membrane proteins are significant targets of glycosylation in this organism. 
Our data demonstrate that peptide-centric approaches coupled to novel mass spectrometric fragmentation techniques may be suitable for application to eukaryotic glycoproteins for simultaneous elucidation of glycan structures and peptide sequence. Campylobacter jejuni is a Gram-negative, microaerophilic, spiral-shaped, motile bacterium that is the most common cause of food- and water-borne diarrheal illness worldwide (1). Typical infections are acquired via the consumption of undercooked poultry where C. jejuni is found commensally (2). Symptoms in humans range from mild, non-inflammatory diarrhea to severe abdominal cramps, vomiting, and inflammation (3). Prior infection with C. jejuni is a common antecedent of two chronic immune-mediated disorders: Guillain-Barré syndrome (4) and immunoproliferative small intestine disease (5). A unique molecular trait of C. jejuni is the ability to post-translationally modify proteins by the N-linked addition of a 7-residue glycan (GalNAc-α1,4-GalNAc-α1,4-(Glcβ1,3)-GalNAc-α1,4-GalNAc-α1,4-GalNAc-α1,3-Bac-β1 where Bac is bacillosamine (2,4-diacetamido-2,4,6-trideoxyglucopyranose)) (6) at the consensus sequon (D/E)XNX(S/T) where X is any amino acid except proline (7). The N-linked C. jejuni heptasaccharide is encoded by the pgl (protein glycosylation) gene cluster (8–10), and the glycan is transferred to proteins by the PglB oligosaccharyltransferase (11) at the periplasmic face of the inner membrane (12). Removal of the N-glycosylation gene cluster (or indeed pglB alone) results in C. jejuni that displays poor adherence to and invasion of epithelial cell lines (13) and reduced colonization of the chicken gastrointestinal tract (14). Although this demonstrates a requirement for glycosylation in virulence, the proteins that mediate this are still unknown, and the overall role of glycan attachment remains to be elucidated. Our current understanding of the structural context of glycosylation in C. 
jejuni suggests that it does not play a role in steric stabilization by conferring structural rigidity as seen in eukaryotes (15) but occurs preferably on flexible loops and unordered regions of proteins (16–18). To investigate the role of glycosylation in protein function, recent studies have utilized mutagenesis to remove the N-linked sequon from three glycoproteins: Cj1496c (19), Cj0143c (20), and VirB10 (21). Removal of glycosylation from Cj1496c and Cj0143c had little effect on protein function; however, glycan attachment was required for correct localization of VirB10. Although the exact role of the glycan remains largely unknown, it appears to be site-specific with a single site, Asn97, influencing localization of VirB10, whereas a second site, Asn32, is dispensable (21). It is clear that a more comprehensive analysis of the C. jejuni glycoproteome is required. A further complication in the elucidation of N-linked glycosylation is the use of the NCTC 11168 strain, which because of laboratory passage (22, 23) may not be the most appropriate model in which to study the virulence properties of glycan attachment. For example, we have recently shown that a surface-exposed virulence factor, JlpA, is glycosylated at two sites (Asn146 and Asn107) in all sequenced C. jejuni strains except NCTC 11168, which contains only Asn146 (24). Glycoproteomics in C. jejuni is also a major technical challenge. Unlike eukaryotic N-linked glycans, the C. jejuni glycan is resistant to removal by protein N-glycosidase F (24) and chemical liberation via β-elimination (6) possibly because of the structure of the unique linking sugar, bacillosamine (25). Analysis therefore requires complementary methodology to elucidate the sites of glycosylation in the presence of the glycan. 
Preferential fragmentation of the glycan itself during collision-induced dissociation (CID) generally results in poor recovery of peptide fragment ions, and thus identification of the underlying protein and site of attachment remains problematic. MS3 has been attempted for site identification (6, 26); however, the data are limited by the requirement for sufficient ions for two rounds of tandem MS. We have also shown previously that C. jejuni encodes several hydrophobic integral membrane and outer membrane proteins possessing multiple transmembrane-spanning regions that are not amenable to gel-based approaches (27), particularly those using lectins for glycoprotein purification (28). We hypothesize that N-linked glycosylation is more widespread than previously demonstrated (6, 7, 26) because these studies examined only soluble proteins (6, 26) or used lectin affinity (6, 7), which limits the amount and type of detergents that can be used. Recent work (26) has demonstrated the potential of exploiting the hydrophilic nature of the C. jejuni glycan to enable glycopeptide enrichment.

The ability to generate product ions useful for the identification of a glycosylated peptide is governed by three factors: the peptide backbone, the glycan, and the fragmentation approach. Multiple strategies exist to separately exploit the first two of these parameters (29, 30), but it is only recently that selective fragmentation of modified peptides has been available through electron transfer dissociation (ETD) and electron capture dissociation (31, 32). ETD/electron capture dissociation enable the selective cleavage of the peptide while maintaining the carbohydrate structure, and this has been demonstrated using eukaryotic glycopeptides (33, 34) and more recently glycopeptides isolated from the pathogen Neisseria gonorrhoeae (35).
A more recent fragmentation approach is higher energy collisional (C-trap) dissociation (HCD), which uses higher fragmentation energies than standard CID and enables identification of modifications, such as phosphotyrosine (36), via diagnostic immonium ions and high mass accuracy over the full mass range in MS/MS. HCD has not previously been applied to glycopeptides.

We applied several enrichment and MS fragmentation approaches to the characterization of the glycoproteome of C. jejuni HB93-13. Sequence analysis determined that the HB93-13 genome contains 510 N-linked sequons ((D/E)XNX(S/T)) in 382 proteins of which 261 (with 371 potential N-linked sites) are predicted to pass through the inner membrane and are therefore the subset that may be glycosylated. We examined trypsin digests of whole cell and membrane protein preparations using zwitterionic hydrophilic interaction chromatography (ZIC-HILIC) and graphite enrichment of gel-separated proteins using several mass spectrometric techniques (CID, HCD, and ETD). This is the first study to demonstrate the potential of using the high energy fragmentation of HCD to overcome the signal disruption caused by labile glycan fragmentation and to provide peptide sequencing within a single step. Manual data analysis was also simplified as the GalNAc fragment ion (204.086 Da) provides a signature that can be used to highlight glycopeptides within a complex mixture. We identified 81 glycosylation sites, including 47 not described previously in the literature and a single site that cannot be unambiguously assigned. The majority of these are present on proteins not amenable to traditional gel-based analyses, such as hydrophobic transmembrane proteins. Our work more than doubles the previously known N-linked C. jejuni glycoproteome and provides a clear rationale for other studies where the peptide and glycan need to remain associated.
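
The sequon rule quoted in this abstract, (D/E)XNX(S/T) with X being any residue except proline, can be illustrated with a short scan over a protein sequence. The following is a minimal sketch and not the authors' pipeline: the function name `find_sequons` and the example sequence are hypothetical, and the scan ignores the additional requirement that the protein must pass through the inner membrane.

```python
import re

# (D/E) X N X (S/T), where X is any character except P (proline).
# The lookahead lets overlapping sequons be reported individually.
SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")

def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the Asn, matched 5-mer) for each sequon."""
    hits = []
    for match in SEQUON.finditer(protein.upper()):
        asn_position = match.start() + 3  # Asn is the third residue of the 5-mer
        hits.append((asn_position, match.group(1)))
    return hits

# Hypothetical sequence: "EPNAS" is skipped because X is proline.
print(find_sequons("MKDANGTAAEPNASDFNKT"))
```

Counting such matches across a predicted proteome would give candidate sites only; whether a site is occupied still has to be shown experimentally, as the abstract describes.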

15.
CD22, a regulator of B-cell signaling, is a siglec that recognizes the sequence NeuAcα2–6Gal on glycoprotein glycans as ligands. CD22 interactions with glycoproteins on the same cell (in cis) and apposing cells (in trans) modulate its activity in B-cell receptor signaling. Although CD22 predominantly recognizes neighboring CD22 molecules as cis ligands on B-cells, little is known about the trans ligands on apposing cells. We conducted a proteomics scale study to identify candidate trans ligands of CD22 on B-cells by UV photocross-linking CD22-Fc chimera bound to B-cell glycoproteins engineered to carry sialic acids with a 9-aryl azide moiety. Using mass spectrometry-based quantitative proteomics to analyze the cross-linked products, 27 glycoproteins were identified as candidate trans ligands. Next, CD22 expressed on the surface of one cell was photocross-linked to glycoproteins on apposing B-cells followed by immunochemical analysis of the products with antibodies to the candidate ligands. Of the many candidate ligands, only the B-cell receptor IgM was found to be a major in situ trans ligand of CD22 that is selectively redistributed to the site of cell contact upon interaction with CD22 on the apposing cell.

Glycan-binding proteins (GBPs) mediate diverse aspects of cell communication through their interactions with their counter-receptors comprising glycan ligands carried on cell surface glycoproteins and glycolipids. Identification of the in situ counter-receptors of glycan-binding proteins is problematic due to the fact that the vast majority of the glycoproteins of a cell will carry highly related glycan structures because they share the same secretory pathway that elaborates their glycans post-translationally en route to the cell surface.
Thus, although many glycoproteins will carry the glycan structure recognized by a GBP, the challenge is to determine whether one, several, or all of these cell surface glycoproteins (and glycolipids) are recognized in situ as physiologically relevant counter-receptors (1–4). Standard in vitro methods, such as co-precipitation from cell lysates or Western blotting using binding protein probes, are useful for identifying glycoproteins that contain the glycan structure recognized by the GBP. However, these may not be relevant ligands in situ due to constraints imposed by their microdomain localization and the geometric arrangement of their glycans relative to the GBP presented on the apposing cell.

In this report, we examine the in situ ligands of CD22 (Siglec-2), a member of the siglec family and a regulator of B-cell receptor (BCR) signaling that recognizes glycans containing the sequence NeuAcα2–6Gal as ligands (2, 5, 6). Regulation of BCR signaling by CD22 is effected by its proximity to the BCR through recruitment of a tyrosine phosphatase, SHP-1, which is in turn influenced by CD22 binding to its glycan ligands (6). Glycoproteins bearing CD22 ligands are abundantly expressed on B-cells and bind to CD22 in cis (on the same cell) (7), regulating BCR signaling (2, 5, 6). Although binding to cis ligands has been shown to “mask” CD22 from binding low avidity synthetic sialoside probes (2, 7), CD22 can also interact with ligands on apposing immune cells in trans (8–10). Interactions of CD22 with trans ligands influence T-cell signaling in vitro (11, 12), mediate B-cell homing via binding to sinusoidal endothelial cells in the bone marrow (13), and aid in “self”-recognition (14).
Thus, interactions with both cis and trans ligands modulate CD22 function in immune homeostasis.

Several groups have demonstrated that recombinant CD22-Fc chimera is capable of binding and precipitating the majority of glycoproteins from B- and T-cell lysates whose glycans contain the sequence NeuAcα2–6Gal (15–18). Among them, CD45, IgM, and CD22 itself were identified as specific B-cell binding partners and were postulated to have functional significance as in situ cis ligands of CD22 in regulation of BCR signaling (11, 16, 18–20). Several reports have also documented in situ interactions of CD22 with IgM and CD45, but these interactions were found to be of low stoichiometry and sialic acid-independent (19–21), leaving open the question of which glycoproteins served as in situ cis ligands of CD22 on B-cells that masked the glycan ligand binding site of CD22 (7). Subsequently, using metabolically labeled B-cells with sialic acids containing a photoactivatable 9-aryl azide moiety, we demonstrated that CD22 could be photocross-linked to its cis ligands, effectively tagging the in situ cis ligands with CD22 (15). Notably, there was no cross-linking observed to IgM or CD45, demonstrating that they are not significant in situ cis ligands of CD22 (15). Instead, only glycans of neighboring CD22 molecules interacted significantly with CD22, resulting in photocross-linking of homomultimeric complexes of CD22. Thus, despite the fact that most B-cell glycoproteins are recognized in vitro, CD22 selectively recognizes glycans of neighboring CD22 molecules as cis ligands in situ.

With the perspective gained from analysis of cis ligands, we wished to determine whether CD22 was also selective in recognition of trans ligands upon cell contact.
We have previously demonstrated that CD22 is redistributed to sites of cell contact of interacting B-cells and T-cells and that redistribution is mediated by the interaction of CD22 with sialic acid-containing trans ligands on the apposing cell (8). Stamenkovic et al. (22) had previously demonstrated that binding of T-cells to CD22-expressing COS cells was blocked by an anti-CD45RO antibody, suggesting that CD45 was a functional trans ligand of CD22 on T-cells. However, we found that redistribution of CD22 to sites of cell contact was also observed with CD45-deficient B-cells (8), indicating that, at a minimum, other glycoproteins must also serve as trans ligands of CD22 on B-cells.

To assess whether CD22 recognizes all or a subset of glycoproteins as trans ligands on an apposing cell, we initiated an unbiased analysis of the trans ligands of CD22 on apposing B-cells using our protein-glycan cross-linking strategy (15). By cross-linking CD22-Fc to intact B-cells, we identified 27 candidate trans ligands of CD22 by quantitative mass spectrometry-based proteomics. We then looked at the in situ trans interactions of CD22 in the physiologically relevant cellular context by cross-linking CD22 expressed on one cell to the trans ligands with photoreactive sialic acids on the apposing cell. Our results indicate that only a subset of cell surface glycoproteins, including IgM and, to a lesser extent, CD45 and Basigin, are selectively recognized in trans by CD22. Indeed, IgM in particular is a preferred trans ligand that is selectively redistributed to the sites of cell contact on apposing B-cells in a CD22- and sialic acid-dependent manner despite a vast excess of cell surface glycoproteins that carry a glycan recognized by CD22. The results support the view that factors other than glycan sequence are critical for the in situ engagement of glycan-binding proteins with glycan ligand bearing counter-receptors on the same cell (in cis) or apposing cell (in trans).

16.
Glycosylation is one of the most common and important protein modifications in biological systems. Many glycoproteins naturally occur at low abundances, which makes comprehensive analysis extremely difficult. Additionally, glycans are highly heterogeneous, which further complicates analysis in complex samples. Lectin enrichment has been commonly used, but each lectin is inherently specific to one or several carbohydrates, and thus no single or collection of lectin(s) can bind to all glycans. Here we have employed a boronic acid-based chemical method to universally enrich glycopeptides. The reaction between boronic acids and sugars has been extensively investigated, and it is well known that the interaction between boronic acid and diols is one of the strongest reversible covalent bond interactions in an aqueous environment. This strong covalent interaction provides a great opportunity to catch glycopeptides and glycoproteins by boronic acid, whereas the reversible property allows their release without side effects. More importantly, the boronic acid-diol recognition is universal, which provides great capability and potential for comprehensively mapping glycosylation sites in complex biological samples. By combining boronic acid enrichment with PNGase F treatment in heavy-oxygen water and MS, we have identified 816 N-glycosylation sites in 332 yeast proteins, among which 675 sites were well-localized with greater than 99% confidence. The results demonstrated that the boronic acid-based chemical method can effectively enrich glycopeptides for comprehensive analysis of protein glycosylation. A general trend seen within the large data set was that there were fewer glycosylation sites toward the C termini of proteins. Of the 332 glycoproteins identified in yeast, 194 were membrane proteins. Many proteins get glycosylated in the high-mannose N-glycan biosynthetic and GPI anchor biosynthetic pathways. 
Compared with lectin enrichment, the current method is more cost-efficient, generic, and effective. This method can be extensively applied to different complex samples for the comprehensive analysis of protein glycosylation.

Glycosylation is an extremely important protein modification that frequently regulates protein folding, trafficking, and stability. It is also involved in a wide range of cellular events (1) such as immune response (2, 3), cell proliferation (4), cell-cell interactions (5), and signal transduction (6). Aberrant protein glycosylation is believed to have a direct correlation with the development of several diseases, including diabetes, infectious diseases, and cancer (7–11). Secretory proteins frequently get glycosylated, including those in body fluids such as blood, saliva, and urine (12, 13). Samples containing these proteins can be easily obtained and used for diagnostic and therapeutic purposes. Several glycoproteins have previously been identified as biomarkers, including Her2/Neu in breast cancer (14), prostate-specific antigen (PSA) in prostate cancer (15), and CA125 in ovarian cancer (16, 17), which highlights the clinical importance of identifying glycoproteins as indicators or biomarkers of diseases. Therefore, effective methods for systematic analysis of protein glycosylation are essential to understand the mechanisms of glycobiology, identify drug targets and discover biomarkers.

Approximately half of mammalian cell proteins are estimated to be glycosylated at any given time (18). There have been many reports regarding identification of protein glycosylation sites and elucidation of glycan structures (19–30). Glycan structure analysis can lead to potential therapeutic and diagnostic applications (31, 32), but it is also critical to identify which proteins are glycosylated as well as the sites at which the modification occurs.
Despite progress in recent years, the large-scale analysis of protein glycosylation sites using MS-based proteomics methods is still a challenge. Without an effective enrichment method, the low abundance of glycoproteins prohibits the identification of the majority of sites using the popular intensity-dependent MS sequence method.

About a decade ago, a very beautiful and elegant method based on hydrazide chemistry was developed to enrich glycopeptides. Hydrazide conjugated beads reacted with aldehydes formed from the oxidation of cis-diols in glycans (33). This method has been extensively applied to many different types of biological samples (34–41). Besides the hydrazide-based enrichment method, lectins have also been frequently used to enrich glycopeptides or glycoproteins before MS analysis (28, 29, 42–46). However, there are many different types of lectins, and each is specific to certain glycans (47, 48). Therefore, no combination of lectins can bind to all glycosylated peptides or proteins, which prevents comprehensive analysis of protein glycosylation. Because of the complexity of biological samples, effective enrichment methods are critical for the comprehensive analysis of protein glycosylation before MS analysis.

One common feature of all glycoproteins and glycopeptides is that they contain multiple hydroxyl groups in their glycans. From a chemistry point of view, this can be exploited to effectively enrich them. Ideally, chemical enrichment probes must have both strong and specific interactions with multiple hydroxyl groups. The reaction between boronic acids and 1,2- or 1,3-cis-diols in sugars has been extensively studied (49–52) and applied for the small-scale analysis of glycoproteins (53–55). Furthermore, boronate affinity chromatography has been employed for the analysis of nonenzymatically glycated peptides (56, 57).
Boronic acid-based chemical enrichment methods are expected to have great potential for global analysis of glycopeptides when combined with modern MS-based proteomics techniques. However, the method has not yet been used for the comprehensive analysis of protein N-glycosylation in complex biological samples (58).

Yeast is an excellent model biological system that has been extensively used in a wide range of experiments. Last year, two papers reported the large-scale analysis of protein N-glycosylation in yeast (59, 60). In one study, a new MS-based method was developed based on N-glycopeptide mass envelopes with a pattern via metabolic incorporation of a defined mixture of N-acetylglucosamine isotopologs into N-glycans. Peptides with the recoded envelopes were specifically targeted for fragmentation, facilitating high confidence site mapping (59). Using this method, 133 N-glycosylation sites were confidently identified in 58 yeast proteins. When combined with an effective enrichment method, this MS-based analysis will provide a more complete coverage of the N-glycoproteome. The other work combined lectin enrichment with digestion by two enzymes (Glu-C and trypsin) to increase the peptide coverage, and 516 well-localized N-glycosylation sites were identified in 214 yeast proteins by MS (60).

Here we have comprehensively identified protein N-glycosylation sites in yeast by combining a boronic acid-based chemical enrichment method with MS-based proteomics techniques. Magnetic beads conjugated with boronic acid were systematically optimized to selectively enrich glycosylated peptides from yeast whole cell lysates. The enriched peptides were subsequently treated with Peptide-N4-(N-acetyl-beta-glucosaminyl)asparagine amidase (PNGase F) in heavy-oxygen water. Finally, peptides were analyzed by an on-line LC-MS system.
Over 800 protein N-glycosylation sites were identified in the yeast proteome, which clearly demonstrates that the boronic acid-based chemical method is an effective enrichment method for large-scale analysis of protein glycosylation by MS.
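
The abstract above does not state the mass tag produced by PNGase F treatment in heavy-oxygen water, but it follows from standard chemistry: the enzyme converts the glycosylated Asn to Asp, and in H2(18O) the incorporated oxygen is the heavy isotope. The sketch below works out the resulting monoisotopic shift from isotope masses; the function name `asn_to_asp_shift` is a hypothetical label chosen for illustration, and the calculation is not taken from the paper.

```python
# Monoisotopic masses in Da.
MASS_H = 1.00782503
MASS_N = 14.00307401
MASS_O16 = 15.99491462
MASS_O18 = 17.99915961

def asn_to_asp_shift(heavy_water: bool) -> float:
    """Mass change when an Asn residue (C4H6N2O2) becomes Asp (C4H5NO3).

    Net elemental change: one N and one H are lost and one O is gained.
    In heavy-oxygen water the gained oxygen is assumed to be 18O.
    """
    oxygen = MASS_O18 if heavy_water else MASS_O16
    return oxygen - MASS_N - MASS_H

print(f"normal water: {asn_to_asp_shift(False):+.4f} Da")
print(f"heavy water:  {asn_to_asp_shift(True):+.4f} Da")
```

The roughly 2 Da difference between the two cases is what would let enzymatic deglycosylation be told apart from spontaneous deamidation, which also yields the smaller shift.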

17.
In large-scale proteomic experiments, multiple peptide precursors are often cofragmented simultaneously in the same mixture tandem mass (MS/MS) spectrum. These spectra tend to elude current computational tools because of the ubiquitous assumption that each spectrum is generated from only one peptide. Therefore, tools that consider multiple peptide matches to each MS/MS spectrum can potentially improve the relatively low spectrum identification rate often observed in proteomics experiments. More importantly, data independent acquisition protocols promoting the cofragmentation of multiple precursors are emerging as alternative methods that can greatly improve the throughput of peptide identifications but their success also depends on the availability of algorithms to identify multiple peptides from each MS/MS spectrum. Here we address a fundamental question in the identification of mixture MS/MS spectra: determining the statistical significance of multiple peptides matched to a given MS/MS spectrum. We propose the MixGF generating function model to rigorously compute the statistical significance of peptide identifications for mixture spectra and show that this approach improves the sensitivity of current mixture spectra database search tools by ≈30–390%. Analysis of multiple data sets with MixGF reveals that in complex biological samples the number of identified mixture spectra can be as high as 20% of all the identified spectra and the number of unique peptides identified only in mixture spectra can be up to 35.4% of those identified in single-peptide spectra.

The advancement of technology and instrumentation has made tandem mass (MS/MS) spectrometry the leading high-throughput method to analyze proteins (1, 2, 3). In typical experiments, tens of thousands to millions of MS/MS spectra are generated and enable researchers to probe various aspects of the proteome on a large scale.
Part of this success hinges on the availability of computational methods that can analyze the large amount of data generated from these experiments. The classical question in computational proteomics asks: given an MS/MS spectrum, what is the peptide that generated the spectrum? However, it is increasingly being recognized that this assumption that each MS/MS spectrum comes from only one peptide is often not valid. Several recent analyses show that as many as 50% of the MS/MS spectra collected in typical proteomics experiments come from more than one peptide precursor (4, 5). The presence of multiple peptides in mixture spectra can decrease their identification rate to as low as one half of that for MS/MS spectra generated from only one peptide (6, 7, 8). In addition, there have been numerous developments in data independent acquisition (DIA) technologies where multiple peptide precursors are intentionally selected to cofragment in each MS/MS spectrum (9, 10, 11, 12, 13, 14, 15). These emerging technologies can address some of the enduring disadvantages of traditional data-dependent acquisition (DDA) methods (e.g. low reproducibility (16)) and potentially increase the throughput of peptide identification 5–10 fold (4, 17). However, despite the growing importance of mixture spectra in various contexts, there are still only a few computational tools that can analyze mixture spectra from more than one peptide (18, 19, 20, 21, 8, 22). Our recent analysis indicated that current database search methods for mixture spectra still have relatively low sensitivity compared with their single-peptide counterpart and the main bottleneck is their limited ability to separate true matches from false positive matches (8). 
Traditionally, the problem of peptide identification from MS/MS spectra involves two sub-problems: 1) define a Peptide-Spectrum-Match (PSM) scoring function that assigns each MS/MS spectrum to the peptide sequence that most likely generated the spectrum; and 2) given a set of top-scoring PSMs, select the subset of statistically significant PSMs. Here we focus on the second problem, which is still an ongoing research question even for the case of single-peptide spectra (23, 24, 25, 26). Intuitively the second problem is difficult because one needs to consider spectra across the whole data set (instead of comparing different peptide candidates against one spectrum as in the first problem) and PSM scoring functions are often not well-calibrated across different spectra (i.e. a PSM score of 50 may be good for one spectrum but poor for a different spectrum). Ideally, a scoring function will give high scores to all true PSMs and low scores to false PSMs regardless of the peptide or spectrum being considered. However, in practice, some spectra may receive higher scores than others simply because they have more peaks or their precursor mass results in more peptide candidates being considered from the sequence database (27, 28). Therefore, a scoring function that accounts for spectrum or peptide-specific effects can make the scores more comparable and thus help assess the confidence of identifications across different spectra. The MS-GF solution to this problem is to compute the per-spectrum statistical significance of each top-scoring PSM, which can be defined as the probability that a random peptide (out of all possible peptides within parent mass tolerance) will match to the spectrum with a score at least as high as that of the top-scoring PSM. This measures how good the current best match is in relation to all possible peptides matching to the same spectrum, normalizing any spectrum effect from the scoring function.
Intuitively, our proposed MixGF approach extends the MS-GF approach to now calculate the statistical significance of the top pair of peptides matched from the database to a given mixture spectrum M (i.e. the significance of the top peptide–peptide spectrum match (PPSM)). As such, MixGF determines the probability that a random pair of peptides (out of all possible peptides within parent mass tolerance) will match a given mixture spectrum with a score at least as high as that of the top-scoring PPSM.

Despite the theoretical attractiveness of computing statistical significance, it is generally prohibitive for any database search method to score all possible peptides against a spectrum. Therefore, earlier works in this direction focus on approximating this probability by assuming the score distribution of all PSMs follows a certain analytical form such as the normal, Poisson or hypergeometric distributions (29, 30, 31). In practice, because score distributions are highly data-dependent and spectrum-specific, these model assumptions do not always hold. Other approaches tried to learn the score distribution empirically from the data (29, 27). However, one is most interested in the region of the score distribution where only a small fraction of false positives are allowed (typically at 1% FDR). This usually corresponds to the extreme tail of the distribution where p values are on the order of 10⁻⁹ or lower, and thus there is typically a lack of sufficient data points to accurately model the tail of the score distribution (32). More recently, Kim et al. (24) and Alves et al. (33), in parallel, proposed a generating function approach to compute the exact score distribution of random peptide matches for any spectrum without explicitly matching all peptides to a spectrum. Because it is an exact computation, no assumption is made about the form of score distribution and the tail of the distribution can be computed very accurately.
As a result, this approach substantially improved the ability to separate true matches from false positive ones and led to a significant increase in sensitivity of peptide identification over state-of-the-art database search tools in single-peptide spectra (24).

For mixture spectra, it is expected that the scores for the top-scoring match will be even less comparable across different spectra because now more than one peptide and different numbers of peptides can be matched to each spectrum at the same time. We extend the generating function approach (24) to rigorously compute the statistical significance of multiple-Peptide-Spectrum Matches (mPSMs) and demonstrate its utility toward addressing the peptide identification problem in mixture spectra. In particular, we show how to extend the generating function approach to mixtures of two peptides. We focus on this relatively simple case of mixture spectra because it accounts for a large fraction of mixture spectra presented in traditional DDA workflows (5). This allows us to test and develop algorithmic concepts using readily-available DDA data because data with more complex mixture spectra such as those from DIA workflows (11) is still not widely available in public repositories.
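
The single-peptide idea described above, computing the exact score distribution of all peptides of a given parent mass without enumerating them, can be sketched as a dynamic program over mass and score. What follows is a toy illustration under loudly stated assumptions, not the MS-GF or MixGF implementation: residue masses are small made-up integers, the score of a peptide is simply the sum of hypothetical per-peak scores at its prefix masses, and every peptide of the parent mass is treated as equally likely. The names `RESIDUE_MASS`, `score_distribution`, and `spectral_p_value` are invented for this sketch, and the two-peptide extension is not shown.

```python
from collections import defaultdict

# Hypothetical toy alphabet with integer residue masses (not real amino acids).
RESIDUE_MASS = {"A": 1, "B": 2, "C": 3}

def score_distribution(peak_score: dict[int, int], parent_mass: int) -> dict[int, int]:
    """Map each score to the number of peptides of mass parent_mass achieving it.

    A peptide's score is the sum of peak_score[m] over all of its prefix
    masses m (the full mass included); missing masses contribute 0.
    """
    # table[m][s] = number of residue strings with total mass m and score s
    table = [defaultdict(int) for _ in range(parent_mass + 1)]
    table[0][0] = 1  # the empty prefix
    for mass in range(1, parent_mass + 1):
        gain = peak_score.get(mass, 0)
        for residue_mass in RESIDUE_MASS.values():
            if residue_mass <= mass:
                # Extend every shorter string by one residue ending at this mass.
                for score, count in table[mass - residue_mass].items():
                    table[mass][score + gain] += count
    return dict(table[parent_mass])

def spectral_p_value(peak_score: dict[int, int], parent_mass: int, threshold: int) -> float:
    """Fraction of all peptides of mass parent_mass scoring at least threshold."""
    distribution = score_distribution(peak_score, parent_mass)
    total = sum(distribution.values())
    hits = sum(count for score, count in distribution.items() if score >= threshold)
    return hits / total if total else 0.0

# Four peptides have mass 3 in this alphabet; two of them reach a score of 3.
print(spectral_p_value({1: 2, 3: 1}, parent_mass=3, threshold=3))
```

Because the table is filled once per mass and score rather than once per peptide, the cost grows with the parent mass and score range instead of with the exponential number of candidate peptides, which is the property the text attributes to the generating function approach.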

18.
Glycans present on glycoproteins and glycolipids of the major human parasite Schistosoma mansoni induce innate as well as adaptive immune responses in the host. To be able to study the molecular characteristics of schistosome infections it is therefore required to determine the expression profiles of glycans and antigenic glycan-motifs during a range of critical stages of the complex schistosome lifecycle. We performed a longitudinal profiling study covering schistosome glycosylation throughout worm- and egg-development using a mass spectrometry-based glycomics approach. Our study revealed that during worm development N-glycans with Galβ1–4(Fucα1–3)GlcNAc (LeX) and core-xylose motifs were rapidly lost after cercariae to schistosomula transformation, whereas GalNAcβ1–4GlcNAc (LDN)-motifs gradually became abundant and predominated in adult worms. LeX-motifs were present on glycolipids up to 2 weeks of schistosomula development, whereas glycolipids with mono- and multifucosylated LDN-motifs remained present up to the adult worm stage. In contrast, expression of complex O-glycans diminished to undetectable levels within days after transformation. During egg development, a rich diversity of N-glycans with fucosylated motifs was expressed, but with α3-core fucose and a high degree of multifucosylated antennae only in mature eggs and miracidia. N-glycan antennae were exclusively LDN-based in miracidia. O-glycans in the mature eggs were also diverse and contained LeX- and multifucosylated LDN, but none of these were associated with miracidia in which we detected only the Galβ1–3(Galβ1–6)GalNAc core glycan. Immature eggs also exhibited short O-glycan core structures only, suggesting that complex fucosylated O-glycans of schistosome eggs are derived primarily from glycoproteins produced by the subshell envelope in the developed egg. 
Lipid glycans with multifucosylated GlcNAc repeats were present throughout egg development, but with the longer highly fucosylated stretches enriched in mature eggs and miracidia. This global analysis of the developing schistosome's glycome provides new insights into how stage-specifically expressed glycans may contribute to different aspects of schistosome-host interactions.

Schistosoma blood flukes give rise to infections in over 200 million people in developing countries worldwide (1). With a Disability-Adjusted Life Years (DALY) value of more than 3 million, schistosomiasis ranks as one of the neglected tropical diseases with the highest impact on public health (2). The schistosome has a complex and intriguing lifecycle, which involves a definitive host (mammal) as well as an intermediate host (snail). Infections with Schistosoma mansoni, one of the major schistosome species infecting humans, are initiated when snail-borne cercariae penetrate intact skin. The cercariae then transform into schistosomula, which enter the vasculature of the host and mature while migrating to the portal system. Here, adult male and female worms pair, with the female worm producing hundreds of eggs each day during a life span of several years unless the infection is treated by chemotherapy. Miracidia develop inside the maturing eggs while they cross the intestinal wall over a period of several days to be excreted with the feces. Miracidia then hatch from the eggs upon contact with fresh water and infect the snail host where asexual replication takes place and eventually new cercariae are shed. Notably, many eggs get trapped in organs such as the liver, where they induce a granulomatous inflammation and organ damage, the main cause of pathology in schistosomiasis (1).

Throughout their lifecycle, schistosomes express a multitude of protein- and lipid-linked glycans that play an important role in the parasite biology.
The expression of many glycan elements appears to be developmentally regulated by the differential expression of glycosyltransferases during the different lifecycle stages (3). A series of papers has been published indicating that schistosome glycans play essential roles in the molecular interaction of the parasite and the host immune system, enabling survival of the parasite and allowing chronic infection to establish. For example, glycosylated soluble egg antigens (SEA) interact with the C-type lectins mannose receptor (MR), macrophage galactose-type lectin (MGL) and dendritic cell-specific ICAM-3-grabbing nonintegrin (DC-SIGN), and some of these interactions lead to immunomodulatory effects of specific components of SEA via dendritic cells (DCs) (4, 5). Furthermore, fucosylated egg glycolipids trigger innate immune responses of peripheral blood mononuclear cells and egg glycans are required for periovular granuloma formation in a mouse model. In addition, cercarial secretions induce alternatively activated macrophages in a carbohydrate dependent manner (6–9). Importantly, also adaptive immune responses to schistosome glycans are mounted by the human host. A large part of the antibody responses to schistosomes is directed against antigenic glycan motifs, raising the question whether they could form a basis for antischistosome vaccine strategies (10).

Rapid developments in mass spectrometry-based glycan-analysis technology in the last two decades have led to several studies focused on elucidating the glycan structures of somatic and secretory schistosome preparations (11–22). Among the typical glycan elements detected in S. mansoni were unusual and antigenic Fucα1–2Fucα1–3- (DF-) motifs attached to GalNAcβ1–4GlcNAc (LacDiNAc or LDN) (12, 14, 17–19, 21), Xylβ1–2- and Fucα1–3-modified N-glycan core structures (13, 15, 17, 20), and a unique O-glycan core (Galβ1–3(Galβ1–6)GalNAc) (14, 17) (see supplemental Table S5 for a definition of glycan motifs of S.
mansoni glycoconjugates). More widely occurring glycan elements shared with the mammalian or snail host were also detected, e.g. Galβ1–4GlcNAc (LacNAc or LN), Galβ1–4(Fucα1–3)GlcNAc (Lewis X or LeX), LDN, and GalNAcβ1–4(Fucα1–3)GlcNAc (LDN-F) (23, 24). These data were generated over a long period of time, often focusing on a single schistosome life stage and a specific class of glycans only, and using various analytical techniques and strategies that often make inter-study comparisons difficult. In addition, glycosylation of the schistosomula, which develop shortly after infection and are considered to be relatively vulnerable to immune attack, has remained largely unexplored (20, 25, 26), although these could be interesting therapeutic targets (27–29). Clearly, an integrated and complete overview of schistosome glycosylation has so far not been available.
In this study, we therefore set out to determine the overall schistosome protein- and lipid-linked glycome by analyzing a total of 16 lifecycle stages ranging from cercariae to miracidia. We analyzed the glycoprotein-derived N- and O-glycans as well as the lipid-derived glycans of these life stages by a MALDI-TOF MS-based approach complemented with fragmentation and enzyme degradation studies. Our findings give new insights into the glycobiology of parasite development and parasite–host interaction and contribute to the identification of new potential immune intervention targets.

19.
A complete understanding of the biological functions of large signaling peptides (>4 kDa) requires comprehensive characterization of their amino acid sequences and post-translational modifications, which presents significant analytical challenges. In the past decade, there has been great success with mass spectrometry-based de novo sequencing of small neuropeptides. However, these approaches are less applicable to larger neuropeptides because of the inefficient fragmentation of peptides larger than 4 kDa and their lower endogenous abundance. The conventional proteomics approach focuses on large-scale determination of protein identities via database searching, lacking the ability for in-depth elucidation of individual amino acid residues. Here, we present a multifaceted MS approach for identification and characterization of large crustacean hyperglycemic hormone (CHH)-family neuropeptides, a class of peptide hormones that play central roles in the regulation of many important physiological processes of crustaceans. Six crustacean CHH-family neuropeptides (8–9.5 kDa), including two novel peptides with extensive disulfide linkages and PTMs, were fully sequenced without reference to genomic databases. High-definition de novo sequencing was achieved by a combination of bottom-up, off-line top-down, and on-line top-down tandem MS methods. Statistical evaluation indicated that these methods provided complementary information for sequence interpretation and increased the local identification confidence of each amino acid. Further investigations by MALDI imaging MS mapped the spatial distribution and colocalization patterns of various CHH-family neuropeptides in the neuroendocrine organs, revealing that two CHH-subfamilies are involved in distinct signaling pathways.
Neuropeptides and hormones comprise a diverse class of signaling molecules involved in numerous essential physiological processes, including analgesia, reward, food intake, learning and memory (1). 
Disorders of the neurosecretory and neuroendocrine systems influence many pathological processes. For example, obesity results from failure of energy homeostasis in association with endocrine alterations (2, 3). Previous work from our lab, using crustaceans as model organisms, found that multiple neuropeptides were implicated in control of food intake, including RFamides, tachykinin related peptides, RYamides, and pyrokinins (4–6).
Crustacean hyperglycemic hormone (CHH) family neuropeptides play a central role in energy homeostasis of crustaceans (7–17). The hyperglycemic response of the CHHs was first reported after injection of crude eyestalk extract in crustaceans. Based on their preprohormone organization, the CHH family can be grouped into two subfamilies: subfamily-I containing CHH, and subfamily-II containing molt-inhibiting hormone (MIH) and mandibular organ-inhibiting hormone (MOIH). The preprohormones of subfamily-I have a CHH precursor related peptide (CPRP) that is cleaved off during processing, whereas preprohormones of subfamily-II lack the CPRP (9). Uncovering their physiological functions will provide new insights into neuroendocrine regulation of energy homeostasis.
Characterization of CHH-family neuropeptides is challenging. They comprise more than 70 amino acids and often contain multiple post-translational modifications (PTMs) and complex disulfide bridge connections (7). In addition, physiological concentrations of these peptide hormones are typically below picomolar level, and most crustacean species do not have available genome and proteome databases to assist MS-based sequencing.
MS-based neuropeptidomics provides a powerful tool for rapid discovery and analysis of a large number of endogenous peptides from the brain and the central nervous system. Our group and others have greatly expanded the peptidomes of many model organisms (3, 18–33). 
For example, we have discovered more than 200 neuropeptides with several neuropeptide families consisting of as many as 20–40 members in a simple crustacean model system (5, 6, 25–31, 34). However, a majority of these neuropeptides are small peptides 5–15 amino acid residues long, leaving a gap in the identification of larger signaling peptides from organisms without a sequenced genome. The observed lack of larger peptide hormones can be attributed to the lack of effective de novo sequencing strategies for neuropeptides larger than 4 kDa, which are inherently more difficult to fragment using conventional techniques (34–37). Although classical proteomics studies examine larger proteins, these tools are limited to identification based on database searching with one or more peptides matching, without complete amino acid sequence coverage (36, 38).
Large populations of neuropeptides from 4–10 kDa exist in the nervous systems of both vertebrates and invertebrates (9, 39, 40). Understanding their functional roles requires sufficient molecular knowledge and a unique analytical approach. Therefore, developing effective and reliable methods for de novo sequencing of large neuropeptides at the individual amino acid residue level is an urgent need in neurobiology. In this study, we present a multifaceted MS strategy aimed at high-definition de novo sequencing and comprehensive characterization of the CHH-family neuropeptides in the crustacean central nervous system. The high-definition de novo sequencing was achieved by a combination of three methods: (1) enzymatic digestion and LC-tandem mass spectrometry (MS/MS) bottom-up analysis to generate detailed sequences of proteolytic peptides; (2) off-line LC fractionation and subsequent top-down MS/MS to obtain high-quality fragmentation maps of intact peptides; and (3) on-line LC coupled to top-down MS/MS to allow rapid sequence analysis of low abundance peptides. 
Combining the three methods overcomes the limitations of each, and thus offers complementary and high-confidence determination of amino acid residues. We report the complete sequence analysis of six CHH-family neuropeptides including the discovery of two novel peptides. With this accurate molecular information, MALDI imaging and ion mobility MS were conducted for the first time to explore their anatomical distribution and biochemical properties.

20.
Various studies in the past have revealed that molluscs can produce a wide range of rather complex N-glycan structures, which vary from those occurring in other invertebrate animals; particularly methylated glycans have been found in gastropods, and there are some reports of anionic glycans in bivalves. Due to the high variability in terms of previously described structures and methodologies, it is a major challenge to establish glycomic workflows that yield the maximum amount of detailed structural information from relatively low quantities of sample. In this study, we apply differential release with peptide:N-glycosidases F and A followed by solid-phase extraction on graphitized carbon and reversed-phase materials to examine the glycome of Volvarina rubella (C. B. Adams, 1845), a margin snail of the clade Neogastropoda. The resulting four pools of N-glycans were fractionated on a fused core RP-HPLC column and subjected to MALDI-TOF MS and MS/MS in conjunction with chemical and enzymatic treatments. In addition, selected N-glycan fractions, as well as O-glycans released by β-elimination, were analyzed by porous graphitized carbon-LC-MS and MSn. This comprehensive approach enabled us to determine a number of novel modifications of protein-linked glycans, including N-methyl-2-aminoethylphosphonate on mannose and N-acetylhexosamine residues, core β1,3-linked mannose, zwitterionic moieties on core Galβ1,4Fuc motifs, additional mannose residues on oligomannosidic glycans, and bisubstituted antennal fucose; furthermore, typical invertebrate N-glycans with sulfate and core fucose residues are present in this gastropod.
Molluscs represent one of the largest groups of animals on the planet; there are an estimated 200,000 species, which vary in morphology from gastropods (snails) through to cephalopods (octopus) and live in a range of marine, aquatic, and terrestrial environments (1). Many molluscs are familiar because of their shells or as seafood. 
Less appreciated, perhaps, is their ecological role as filter feeders or scavengers and as indicators of water quality (2–4); also, some molluscs are intermediate hosts for pathogens such as viruses or schistosomes (5, 6).
In glycobiological terms, most studies on molluscs have been structural characterizations of the N-glycans on hemocyanins of a range of gastropods, such as from keyhole limpet (Megathura crenulata; KLH is an often-used carrier protein for immunization), Lymnaea stagnalis, Helix pomatia, and Rapana venosa (7–10). Furthermore, glycans from cephalopod rhodopsins, proteins of bivalves involved in biomineralization, or whole snail viscera have also been analyzed. Including our recent study on the hemocytes and plasma of the eastern oyster (Crassostrea virginica), the variety of modifications of N-glycans in these organisms is immense and includes branched fucose residues, glucuronylation, sulfation, methylation, core xylose, and galactosylation of core fucose as well as LacdiNAc and blood-group-like motifs (11–15). On the other hand, there is only scattered information regarding the biosynthesis of mollusc N-glycan epitopes, based on assay of some fucosyl-, xylosyl-, N-acetylglucosaminyltransferases, and N-acetylgalactosaminyltransferases (16–18); also, probably only two mollusc glycosyltransferases have ever been characterized in recombinant form (19, 20).
The high variability and lack of predictability of mollusc glycomes mean that a suitable glycomic workflow has to be employed that takes account of the maxim “expect the unexpected.” Thereby, in comparison to mammalian glycomes with known major components, the analyses of those of lower eukaryotes can present major challenges. In the past, mollusc glycans from either a single glycoprotein or from tissue were very often analyzed in any single study by one or two methods (e.g. 
GC-MS and NMR or MALDI-TOF MS/MS of HPLC-fractionated N-glycans, LC-MS/MS of glycopeptides, or GC-MS and MSn of permethylated N-glycans; see references above). In some cases, chemical and enzymatic treatments were employed. Here, we have sought to maximize the potential of off-line MALDI-TOF MS and MS/MS by prefractionating N-glycans first on the basis of whether they can be released by peptide:N-glycosidase A or F (the former being able to remove glycans containing core α1,3-fucose (21)) and then using solid-phase extraction on nonporous graphitized carbon (for an initial separation of anionic from neutral glycans (22)) and on a reversed-phase resin (which aids enrichment of glycans with substitutions of core α1,6-fucose). Subsequent use of a fused core reversed-phase (RP)-HPLC column (23) resulted in high-resolution separation into fractions containing either a single or very few glycan species that facilitated further MS-based analyses; as this RP column offers isomeric/isobaric separation, HPLC fractionation was a prerequisite for the definition of the individual N-glycan structures. Furthermore, the residual glycopeptides (posttreatment with peptide:N-glycosidases) were subjected to β-elimination to release the O-glycans followed by LC-MS.
On the basis of these considerations, we have examined the N- and O-glycomes of a margin snail (Volvarina rubella), a species of carnivorous and scavenging marine gastropod first described as Marginella rubella in 1845 (24). Using off-line LC-MALDI-TOF MS and on-line LC-ESI-MS, we reveal a particularly complex N-glycome encompassing a range of oligomannosidic, paucimannosidic, core-modified, and complex (up to triantennary) N-linked oligosaccharides along with a number of anionic and zwitterionic modifications, which are also present on O-glycans. Although some of these features are also found on N-glycans or lipid-linked glycans of other species, the majority of the ∼100 structures are described here for the first time.
