Similar articles
20 similar articles found (search time: 78 ms)
1.
The structural and functional analysis of the core protein of hepatitis B virus is important for a full understanding of the viral life cycle and the development of novel therapeutic agents. The majority of the core protein (CP149) comprises the capsid assembly domain, and the C-terminal region (residues 150–183) is responsible for nucleic acid binding. Protein monomers associate to form dimeric structural subunits, and helices 3 and 4 (residues 50–111 of the assembly domain) have been shown to be important for this as they constitute the interdimer interface. Here, using mass spectrometry coupled with ion mobility spectrometry, we demonstrate the conformational flexibility of the CP149 dimer. Limited proteolysis localized this flexibility to the C-terminal region. A genetically fused CP dimer was found to show decreased disorder, consistent with a more restricted C-terminus at the fusion junction. Incubation of CP149 dimer with heteroaryldihydropyrimidine-1, a small molecule known to interfere with the assembly process, was shown to result in oligomers different in shape from the capsid assembly-competent oligomers of the fused CP dimer. We suggest that heteroaryldihydropyrimidine-1 affects the dynamics of CP149 dimer in solution, likely affecting the ratio between assembly active and inactive states. Therefore, assembly of the less dynamic fused dimer is less readily misdirected by heteroaryldihydropyrimidine-1. These studies of the flexibility and oligomerization properties of hepatitis B virus core protein illustrate both the importance of C-terminal dynamics in function and the utility of gas-phase techniques for structural and dynamical biomolecular analysis.

2.
3.
4.
We present the first comprehensive capillary electrophoresis electrospray ionization mass spectrometry (CESI-MS) analysis of post-translational modifications derived from H1 and core histones. Using a capillary electrophoresis system equipped with a sheathless high-sensitivity porous sprayer and nano–liquid chromatography electrospray ionization mass spectrometry (nano-LC-ESI-MS) as two complementary techniques, we characterized H1 histones isolated from rat testis. Without any pre-separation of the perchloric acid extraction, a total of 70 different modified peptides, including 50 phosphopeptides, were identified in the rat linker histones H1.0, H1a-H1e, and H1t. Out of the 70 modified H1 histone peptides, 27 peptides could be identified with CESI-MS only, and 11 solely with LC-ESI-MS. Immobilized metal-affinity chromatography enrichment prior to MS analysis yielded a total of 55 phosphopeptides; 22 of these peptides could be identified only by CESI-MS, and 19 only by LC-ESI-MS, showing the complementarity of the two techniques. We mapped 42 H1 modification sites, including 31 phosphorylation sites, of which 8 were novel sites. For the analysis of core histones, we chose a different strategy. In a first step, the sulfuric-acid-extracted core histones were pre-separated using reverse-phase high-performance liquid chromatography. Individual rat testis core histone fractions obtained in this way were digested and analyzed via bottom-up CESI-MS. This approach yielded the identification of 42 different modification sites including acetylation (lysine and Nα-terminal); mono-, di-, and trimethylation; and phosphorylation. When we applied CESI-MS for the analysis of intact core histone subtypes from butyrate-treated mouse tumor cells, we were able to rapidly detect their degree of modification, and we found this method very useful for the separation of isobaric trimethyl and acetyl modifications. 
Taken together, our results highlight the need for additional techniques for the comprehensive analysis of post-translational modifications. CESI-MS is a promising new proteomics tool as demonstrated by this, the first comprehensive analysis of histone modifications, using rat testis as an example.

Histones are the most intensively studied group of basic nuclear proteins and are of great importance with regard to the organization of chromatin structure and control of gene activity. They are highly conserved during evolution, binding to and condensing eukaryotic chromosomal DNA to form chromatin. The fundamental chromatin subunit is the nucleosome, in which 166 bp of DNA are wrapped around a core histone octamer and a further ∼40 bp constitute the linker between one nucleosome core and the next. The histone octamer contains two molecules of each of the core histones H2A, H2B, H3, and H4. A fifth type of histone, referred to as linker histone (H1, H5), binds to both the DNA on the outer surface of nucleosomes and the linker DNA.

There are numerous microsequence variants of linker and core histones (except H4) differing only slightly in primary sequence. In rat testis, for example, six somatic H1 subtypes, designated as H1a, H1b, H1c, H1d, H1e, and H1.0, as well as germ cell specific subtypes (i.e. H1t, H1T2, and HILS1), have been identified (1–3). Under various biological conditions, all histone proteins, for both linker and core histones, are subjected to post-translational modifications, including phosphorylation, acetylation, methylation, ubiquitination, deamidation, glycosylation, and ADP-ribosylation, which have a great influence on the epigenetic control of gene expression (4–6). The multitude of histone proteins resulting from closely related sequence variants and post-translational modifications, as well as their highly basic nature combined with hydrophobic properties, provides a major analytical challenge in current proteomics research.
Over the past several years, considerable efforts have been expended to develop methods to identify the specific sites of histone modifications. Mass spectrometry (MS) coupled to liquid chromatography (LC) is the dominant technique for their characterization (7–14). However, because histone proteins contain up to nearly 35% basic amino acids, the analysis of histone peptides is still problematic, as digestion with many commonly used enzymes (e.g. trypsin, Lys-C, etc.) causes the formation of many short and polar peptides that poorly interact with the reverse-phase (RP) material and go undetected by conventional liquid chromatography electrospray ionization mass spectrometry (LC-ESI-MS). To overcome this problem, chemical derivatization such as propionylation is often applied (15, 16).

Capillary electrophoresis (CE) overcomes this disadvantage; this technique allows separations based on the mass-to-charge ratio of peptides and does not utilize their hydrophobic nature as a separation principle. The methods of electrophoresis and LC and their applicability for histone analysis have been reviewed in detail by Lindner (17). CE has proven to be a remarkably powerful method for separating individual histones and their modified forms based on their different electrophoretic mobilities. Using a bare fused silica capillary and hydroxypropylmethyl cellulose (HPMC) as a buffer additive in order to avoid undesired protein adsorption, different core and linker histones and their multiply phosphorylated and acetylated forms were successfully separated via capillary zone electrophoresis (CZE) (18–22). So far, no data have been published about the identification of histone modifications by means of capillary electrophoresis electrospray ionization mass spectrometry (CESI-MS).
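The digestion problem described above can be illustrated with a short in silico digest. This is only a sketch under stated assumptions: the cleavage rules are simplified (trypsin cleaves after K or R, Arg-C after R, neither before proline), the function name `digest` is hypothetical, and the sequence is an illustrative lysine- and arginine-rich stretch that is not taken from the source.

```python
import re


def digest(sequence, cleave_after):
    """Split a protein sequence C-terminal to the given residues.

    Simplified rule (an assumption of this sketch): no cleavage occurs
    when the next residue is proline.
    """
    # Zero-width pattern: positions preceded by a cleavage residue and
    # not followed by proline. Splitting on empty matches needs Python 3.7+.
    pattern = rf"(?<=[{cleave_after}])(?!P)"
    return [piece for piece in re.split(pattern, sequence) if piece]


# Illustrative basic sequence (hypothetical example input).
basic_tail = "SGRGKGGKGLGKGGAKRHRKVLR"

tryptic = digest(basic_tail, "KR")  # trypsin-like: after K and R
arg_c = digest(basic_tail, "R")     # Arg-C-like: after R only

print("trypsin:", tryptic)
print("Arg-C:  ", arg_c)
```

With this input the trypsin-like rule yields nine fragments of one to four residues, whereas the Arg-C-like rule yields four fragments, one of them 14 residues long. This matches the text's point that trypsin produces many short, polar peptides, and it is consistent with the study's choice of endoproteinase Arg-C.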
LC is given preference over CE because of the difficulty of achieving on-line interfacing of CE with MS that allows stable electrospray processes without compromising the quality of separation or the detection sensitivity. However, CE-MS is a promising technique with constantly increasing importance, as documented by numerous articles (23–26).

Various interfaces have been constructed to improve CESI-MS coupling (27, 28). Sheathflow interfaces are the most widely used, and although the drawback of having to dilute the analyte is inherent in this kind of interface, they offer stable electrophoretic separations and allow greater versatility in the choice of background electrolyte (BGE) and the range of flow rates (29–32). Sheathless interfaces have generated interest because no sheath liquid is added, which leads to enhanced detection sensitivity (33, 34). However, they have not been used frequently because of their limited robustness and lack of well-established interfaces and routine analysis protocols. The most widely used method for establishing the terminating electrical contact is coating the outer surface of the CE capillary tip with a conductive material (35–37). Unfortunately, the lifetimes of such coatings are generally very limited, as they suffer from deterioration under the influence of the high voltages applied.

A recently published concept of a sheathless interface based on a separation capillary with a porous tip acting as a nanospray emitter overcomes these disadvantages (38). The capillary tip is etched using hydrofluoric acid until the capillary wall becomes so thin and porous that an electric contact can be established. The performance of this methodology, which combines the low-flow characteristics of CE with an integrated ESI source, is described in Refs. 39–41. Applications such as the analysis of intact proteins (42), protein–protein and protein–metal complexes (43), and ribosomal protein digests from E. coli (44) have been published.
Method-inherent advantages of CESI-MS are highly efficient separations, low flow rates leading to reduced ion suppression, and greater sensitivity (40). In contrast to nano-LC, no column equilibration is needed, there are no gradient effects, and the instrumentation is less maintenance-intensive.

Our group recently described important features of CESI-MS and reported the comparison of this method with LC-ESI-MS for the analysis of a 5% perchloric acid extraction of rat testis consisting mainly of different histone H1 subtypes (39). The performance of both techniques was evaluated regarding analysis time, protein sequence coverage, and number and molecular mass distribution of the identified peptides. The CESI-MS method provided shorter analysis times, narrower peaks yielding high signals, and the identification of a greater number of low molecular mass range peptides than LC-ESI-MS (39).

In the current study, we investigated the analysis of post-translationally modified peptides, particularly phosphopeptides, obtained from endoproteinase Arg-C digested histones from rat testis; this organ contains the whole set of somatic and germ cell specific H1 histones, as well as numerous modified core histone proteins. CESI-MS and LC-ESI-MS were compared regarding the number and type of identified modified peptides. Without any pre-separation of the perchloric acid extraction, we found numerous known and novel modification sites in linker histones. In addition, immobilized metal-affinity chromatography (IMAC) experiments were utilized to enrich phosphopeptides prior to MS analysis. CESI-MS was also used for the rapid identification of post-translational modifications (PTMs) of rat testis core histones, which were pre-fractionated via RP-HPLC and digested with Arg-C.
Using core histones from butyrate-treated mouse erythroleukemia cells, we further demonstrated that our method achieves excellent separations of intact histone subtypes and their multiply modified forms and enables the detection of the extent of PTMs in a fast and reproducible way. Our work represents the first detailed characterization of modified linker and core histone peptides and clearly demonstrates that CESI-MS is a promising alternative tool for epigenetic studies.
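The complementarity figures in the abstract of this entry imply how many peptides both techniques identified. The sketch below works through that arithmetic, assuming each reported total is the union of peptides found by CESI-MS and LC-ESI-MS; the function name `shared_count` is hypothetical.

```python
def shared_count(total, only_cesi, only_lc):
    """Peptides found by both techniques, given the union and the two exclusive counts."""
    shared = total - only_cesi - only_lc
    if shared < 0:
        raise ValueError("exclusive counts exceed the reported total")
    return shared


# Counts quoted in the abstract (no pre-separation, then after IMAC enrichment).
modified_shared = shared_count(total=70, only_cesi=27, only_lc=11)
phospho_shared = shared_count(total=55, only_cesi=22, only_lc=19)

print(modified_shared)  # 32 modified peptides seen by both techniques
print(phospho_shared)   # 14 phosphopeptides seen by both techniques
```

Under that assumption, fewer than half of the modified peptides (32 of 70) and about a quarter of the enriched phosphopeptides (14 of 55) were seen by both methods, which is the sense in which the abstract calls the techniques complementary.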

5.
6.
The cellular microenvironment comprises soluble factors, support cells, and components of the extracellular matrix (ECM) that combine to regulate cellular behavior. Pluripotent stem cells utilize interactions between support cells and soluble factors in the microenvironment to assist in the maintenance of self-renewal and the process of differentiation. However, the ECM also plays a significant role in shaping the behavior of human pluripotent stem cells, including embryonic stem cells (hESCs) and induced pluripotent stem cells. Moreover, it has recently been observed that deposited factors in a hESC-conditioned matrix have the potential to contribute to the reprogramming of metastatic melanoma cells. Therefore, the ECM component of the pluripotent stem cell microenvironment necessitates further analysis.

In this study we first compared the self-renewal and differentiation properties of hESCs grown on Matrigel™ pre-conditioned by hESCs to those on unconditioned Matrigel™. We determined that culture on conditioned Matrigel™ prevents differentiation when supportive growth factors are removed from the culture medium. To investigate and identify factors potentially responsible for this beneficial effect, we performed a defined SILAC MS-based proteomics screen of hESC-conditioned Matrigel™. From this proteomics screen, we identified over 80 extracellular proteins in matrix conditioned by hESCs and induced pluripotent stem cells. These included matrix-associated factors that participate in key stem cell pluripotency regulatory pathways, such as Nodal/Activin and canonical Wnt signaling. This work represents the first investigation of stem-cell-derived matrices from human pluripotent stem cells using a defined SILAC MS-based proteomics approach.

The two defining characteristics of human embryonic stem cells (hESCs), self-renewal and pluripotency, are maintained by a delicate balance of intracellular and extracellular signaling processes.
Extracellular regulation is primarily the result of changes in the microenvironment surrounding the cells during growth in vitro or in vivo. hESCs interact with this “niche” through support cells, extracellular matrix (ECM) components, and autocrine/paracrine signaling (reviewed in Refs. 1–3). Modulation of any of these supportive elements individually or in combination has been used extensively to alter hESC behavior (1–3).

The culture of hESCs, as well as that of human induced pluripotent stem cells (hiPSCs), is conventionally performed on a layer of irradiated mouse embryonic fibroblast cells (MEFs). These MEFs are believed to promote the maintenance of hESCs and hiPSCs through the secretion of beneficial support proteins and cytokines into the soluble microenvironment. A number of proteomic studies have been conducted that examine the secretome of feeder-cell layers in an attempt to elucidate proteins and pathways essential for hESC and hiPSC survival (4–7). Alternatively, hESCs and hiPSCs can be cultured in feeder-free conditions in the absence of support cells. In feeder-free conditions, hESCs and hiPSCs are most often grown on the basement membrane matrix Matrigel™ in medium that has been previously conditioned by MEFs (MEF-CM). Matrigel™ is a gelatinous mixture that is secreted by Engelbreth-Holm-Swarm mouse sarcoma cells (8). Although recent studies have proposed that a variety of defined matrices can support the growth of hESCs and hiPSCs, few of these can maintain a wide range of stem cell lines and therefore are typically not used in place of Matrigel™. The properties of Matrigel™ that make it such an effective matrix for hESC and hiPSC culture remain poorly understood.
Because of the complexity of matrices like Matrigel™, the majority of proteomic studies that examine the hESC and hiPSC microenvironment have focused on contributions from support cells and soluble extracellular factors.

The ECM is typically a complex network of structural proteins and glycosaminoglycans that function to support cells through the regulation of processes such as adhesion and growth factor signaling (9). Thus, it is not surprising that the generation of a well-defined matrix capable of facilitating hESC and hiPSC self-renewal has remained difficult (10). Previous proteomic investigations of Matrigel™ and other matrices supportive of hESC maintenance in vitro have revealed the presence of numerous growth, binding, and signaling proteins (11, 12). Further examination of how hESCs and hiPSCs interact with these complex matrices would provide critical information about what role the ECM plays in the organization of processes involved in the regulation of self-renewal and pluripotency.

A recent study has established the ability of hESC-derived matrix microenvironments to alter tumorigenic properties through the reprogramming of metastatic melanoma cells (13). Importantly, this effect was found to be dependent on the exposure of metastatic cells to hESC-derived conditioned Matrigel™. Culture of metastatic melanoma cells in hESC-conditioned medium did not promote the reprogramming effect. These data suggest that the proteins responsible for this effect were integrated in the matrix. With the use of immunochemical techniques, it was later found that the left-right determination (Lefty) proteins A and B that were deposited in the matrix by hESCs during conditioning were at least in part responsible for the cellular change observed in metastatic cells (14). The Lefty A and B proteins are antagonists of transforming growth factor (TGF)-β signaling that act directly on Nodal protein, a critical regulator of the stem cell phenotype (15, 16).
Subsequent studies of conditioned matrix utilizing mESCs implicated the bone morphogenic protein (BMP) 4 antagonist Gremlin as a primary regulator of the observed changes in metastatic cells (17). Collectively, these studies were all biased by a targeted analysis of potential effectors of metastatic cells. A comprehensive proteomic analysis of conditioned matrix could potentially reveal other factors involved in metastatic cell reprogramming. Furthermore, proteomic examination of hESC and hiPSC conditioned matrix could expose factors important in the regulation of self-renewal and pluripotency by the microenvironment in vitro.

To this end, we have analyzed both types of human pluripotent stem cells, hESCs and hiPSCs, via a mass spectrometry (MS)-based proteomics approach to identify proteins deposited during growth in feeder-free conditions in vitro on Matrigel™. To investigate the hESC- and hiPSC-derived matrix, the metabolic labeling technique known as stable isotope labeling with amino acids in cell culture (SILAC) was used (18). SILAC facilitates the identification of hESC- and hiPSC-derived proteins that would otherwise be confounded by the presence of mouse-derived protein background from Matrigel™. From the proteomic analysis of three cell lines, namely, the hESC lines H9 and CA1 and the hiPSC line BJ-1D, we identified 621, 1355, and 1350 unique proteins, respectively. This work represents the first analysis of a hESC- and hiPSC-derived conditioned matrix and resulted in the identification of at least one novel microenvironmental contributor responsible for the regulation of human pluripotent stem cells.
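The way SILAC separates cell-derived proteins from the unlabeled Matrigel™ background can be sketched as a simple filter. This is a minimal illustration rather than the authors' pipeline: it assumes each identified peptide has already been flagged as heavy-labeled or not by the search software, and the function name, field layout, and protein identifiers are all hypothetical.

```python
from collections import defaultdict


def split_by_silac_label(peptide_hits):
    """Separate proteins with heavy-labeled peptides from unlabeled background.

    peptide_hits: iterable of (protein_id, is_heavy) tuples, one per
    identified peptide (hypothetical layout for this sketch).
    """
    heavy_peptides = defaultdict(int)
    seen = set()
    for protein_id, is_heavy in peptide_hits:
        seen.add(protein_id)
        if is_heavy:
            heavy_peptides[protein_id] += 1
    # Proteins with at least one heavy peptide were synthesized by the
    # labeled cells; the rest are treated as matrix background.
    cell_derived = sorted(heavy_peptides)
    background = sorted(seen - set(heavy_peptides))
    return cell_derived, background


# Hypothetical peptide identifications.
hits = [
    ("P_human_1", True),
    ("P_matrix_1", False),
    ("P_human_1", True),
    ("P_human_2", True),
    ("P_matrix_2", False),
    ("P_matrix_1", False),
]
cell_derived, background = split_by_silac_label(hits)
print(cell_derived)
print(background)
```

A real analysis would also need to handle incomplete label incorporation and peptides shared between human and mouse orthologs, which this sketch ignores.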

7.
8.
Glycosylation is one of the most important and common forms of protein post-translational modification that is involved in many physiological functions and biological pathways. Altered glycosylation has been associated with a variety of diseases, including cancer, inflammatory and degenerative diseases. Glycoproteins are becoming important targets for the development of biomarkers for disease diagnosis, prognosis, and therapeutic response to drugs. The emerging technology of glycoproteomics, which focuses on glycoproteome analysis, is increasingly becoming an important tool for biomarker discovery. An in-depth, comprehensive identification of aberrant glycoproteins, and further, quantitative detection of specific glycosylation abnormalities in a complex environment require a concerted approach drawing from a variety of techniques. This report provides an overview of the recent advances in mass spectrometry based glycoproteomic methods and technology, in the context of biomarker discovery and clinical application.

With recent advances in proteomics, analytical and computational technologies, glycoproteomics, the global analysis of glycoproteins, is rapidly emerging as a subfield of proteomics with high biological and clinical relevance. Glycoproteomics integrates glycoprotein enrichment and proteomics technologies to support the systematic identification and quantification of glycoproteins in a complex sample. The recent development of these techniques has stimulated great interest in applying the technology in clinical translational studies, in particular, protein biomarker research.

While glycomics is the study of the glycome (the repertoire of glycans), glycoproteomics focuses on studying the profile of glycosylated proteins, i.e. the glycoproteome, in a biological system. Considerable work has been done to characterize the sequences and primary structure of the glycan moieties attached to proteins (1–3), and their structural alterations related to cancer (4–6).
Recent reports have provided a comprehensive overview of the concept of glycomics and its prospects in biomarker research (7–10). In contrast, this review is focused on recent developments in glycoproteomic techniques and their unique applications and technical challenges in biomarker discovery.

Glycoproteomics in Biomarker Discovery and Clinical Study

Most secretory and membrane-bound proteins produced by mammalian cells contain covalently linked glycans with diverse structures (2). The glycosylation form of a glycoprotein is highly specific at each glycosylation site and generally stable for a given cell type and physiological state. However, the glycosylation form of a protein can be altered significantly because of changes in cellular pathways and processes resulting from diseases, such as cancer, inflammation, and neurodegeneration. Such disease-associated alterations in glycoproteins can happen in one or both of two ways: 1) protein glycosylation sites are hypo-, hyper-, or newly glycosylated; and/or 2) the glycosylation form of the attached carbohydrate moiety is altered. In fact, altered glycosylation patterns have long been recognized as hallmarks in cancer progression, in which tumor-specific glycoproteins are actively involved in neoplastic progression and metastasis (5, 6, 11, 12). Sensitive detection of such disease-associated glycosylation changes and abnormalities can provide a unique avenue to develop glycoprotein biomarkers for diagnosis and prognosis. In addition, intervention in the glycosylation and carbohydrate-dependent cellular pathways represents a potential new modality for cancer therapies (6, 11, 13). Table I lists some of the FDA-approved cancer biomarkers (14, 15) that are glycosylated proteins or protein complexes.

Table I

Listing of some of the US Food and Drug Administration (FDA) approved cancer biomarkers
| Protein target | Glycosylation | Detection | Source | Disease | Clinical biomarker |
| --- | --- | --- | --- | --- | --- |
| α-Fetoprotein | Yes | Glycoprotein | Serum | Nonseminomatous testicular cancer | Diagnosis |
| Human chorionic gonadotropin-β | Yes | Glycoprotein | Serum | Testicular cancer | Diagnosis |
| CA19–9 | Yes | Carbohydrate | Serum | Pancreatic cancer | Monitoring |
| CA125 | Yes | Glycoprotein | Serum | Ovarian cancer | Monitoring |
| CEA (carcinoembryonic antigen) | Yes | Protein | Serum | Colon cancer | Monitoring |
| Epidermal growth factor receptor | Yes | Protein | Tissue | Colon cancer | Therapy selection |
| KIT | Yes | Protein (IHC) | Tissue | Gastrointestinal (GIST) cancer | Diagnosis/Therapy selection |
| Thyroglobulin | Yes | Protein | Serum | Thyroid cancer | Monitoring |
| PSA-prostate-specific antigen (Kallikrein 3) | Yes | Protein | Serum | Prostate cancer | Screening/Monitoring/Diagnosis |
| CA15–3 | Yes | Glycoprotein | Serum | Breast cancer | Monitoring |
| CA27–29 | Yes | Glycoprotein | Serum | Breast cancer | Monitoring |
| HER2/NEU | Yes | Protein (IHC), Protein | Tissue, Serum | Breast cancer | Prognosis/Therapy selection/Monitoring |
| Fibrin/FDP-fibrin degradation protein | Yes | Protein | Urine | Bladder cancer | Monitoring |
| BTA-bladder tumour-associated antigen (Complement factor H related protein) | Yes | Protein | Urine | Bladder cancer | Monitoring |
| CEA and mucin (high molecular weight) | Yes | Protein (Immunofluorescence) | Urine | Bladder cancer | Monitoring |
Protein biomarker development is a complex and challenging task. The criteria and approach applied for developing each individual biomarker can vary, depending on the purpose of the biomarker and the performance requirement for its clinical application (16, 17). In general, it has been suggested that the preclinical exploratory phase of protein biomarker development can be divided into four stages (18): initial discovery of differential proteins; testing and selection of qualified candidates; verification of a subset of candidates; and assay development and pre-clinical validation of potential biomarkers. Thanks to recent technological advances, mass spectrometry based glycoproteomics is now playing a major role in the initial phase of discovering aberrant glycoproteins associated with a disease. Glycoprotein enrichment techniques, coupled with multidimensional chromatographic separation and high-resolution mass spectrometry, have greatly enhanced the analytical dynamic range and limit of detection for glycoprotein profiling in complex samples such as plasma, serum, other bodily fluids, or tissue. In addition, candidate-based quantitative glycoproteomics platforms have been introduced recently, allowing targeted detection of glycoprotein candidates in complex samples in a multiplexed fashion, providing a complementary tool for glycoprotein biomarker verification in addition to antibody based approaches. It is clear that glycoproteomics is gaining momentum in biomarker research.

Glycoproteomics Approaches

Glycoproteomic analysis is complicated not only by the variety of carbohydrates, but also by the complex linkage of the glycan to the protein. Glycosylation can occur at several different amino acid residues in the protein sequence. The most common and widely studied forms are N-linked and O-linked glycosylation. O-linked glycans are linked to the hydroxyl group on serine or threonine residues. N-linked glycans are attached to the amide group of asparagine residues in a consensus Asn-X-Ser/Thr sequence (X can be any amino acid except proline) (19). Other known, but less well studied forms of glycosylation include glycosylphosphatidylinositol anchors attached to protein carboxyl terminus, C-glycosylation that occurs on tryptophan residues (20), and S-linked glycosylation through a sulfur atom on cysteine or methionine (21, 22). Our following discussion is focused on glycoproteomic analysis of the most common N-linked and O-linked glycoproteins.

A comprehensive analysis of glycoproteins in a complex biological sample requires a concerted approach. Although the specific methods for sample preparation can be different for different types of samples (e.g. plasma, serum, tissue, and cell lysate), a glycoproteomics pipeline typically consists of glycoprotein or glycopeptide enrichment, multidimensional protein or peptide separation, tandem mass spectrometric analysis, and bioinformatic data interpretation. For glycoprotein-based enrichment methods, proteolytic digestion can be performed before or after glycan cleavage, depending on the specific workflow and enrichment methods used. For glycopeptide enrichment, proteolytic digestion is typically performed before the isolation step so that glycopeptides, instead of glycoproteins, can be captured. For quantitative glycoproteomics profiling, additional steps, such as differential stable isotope labeling of the sample and controls, are required. Fig. 1 illustrates the general strategy for an integrated glycoproteomics analysis.

Fig. 1. The strategies of mass spectrometry based glycoproteomic analysis.

Glycoproteins or glycopeptides can be effectively enriched using a variety of techniques (see below). Following the enrichment step, the workflow then splits into two directions: glycan analysis and glycoprotein analysis. The strategies for glycan analysis have been discussed in several reviews and will not be covered in this report. For glycoprotein analysis, bottom-up workflows (“shotgun proteomics,” i.e. peptide based proteomics analysis) (23) are still most common, providing not only detailed information of a glycoprotein profile, but also the specific mapping of glycosylation sites. It is notable that the reliable analysis of mass spectrometric data in glycoproteomic studies largely relies on bioinformatic tools and glyco-related databases that are available. An increasing number of algorithms and databases for glycan analysis have been developed and well documented in several recent reviews (24–26). For glycoprotein and glycopeptide sequence analysis, a large number of well-characterized and annotated glycoproteins can be found in the UniProt Knowledgebase. In addition, many glycopeptide mass spectra are now available in the continually expanding PeptideAtlas library (27), which stores millions of high-resolution peptide fragment ion mass spectra acquired from a variety of biological and clinical samples for peptide and protein identification. Ultimately, all the data obtained from different aspects of the workflow need to be merged and interpreted in an integrated fashion so that the full extent of glycosylation changes associated with a particular biological state can be better revealed.
To the best of our knowledge, a complete glycoform analysis has not yet been accomplished for any glycoprotein with multiple glycosylation sites in a specific cell type under any specific condition. Current technology can define the glycan complement and profile the glycoproteins, but is not capable of putting them together to define the molecular species present. To date, such integrated studies remain highly challenging, even with advanced tandem mass spectrometry technologies and growing bioinformatic resources (26, 28–31).
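The N-linked consensus sequence described in this section (Asn-X-Ser/Thr, where X is any residue except proline) lends itself to a short pattern search. The following is a minimal sketch; the function name and the test sequence are hypothetical, and a matching sequon only marks a potential site, not one that is actually glycosylated.

```python
import re

# Lookahead keeps the match zero-width so that overlapping sequons
# (for example "NNTT") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glyc_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T motifs (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical peptide sequence used only to exercise the pattern.
example = "MNGTANPSNKSNNTT"
print(find_n_glyc_sequons(example))  # [2, 9, 12, 13]
```

The asparagine at position 6 is skipped because it is followed by proline, which is the exclusion stated in the text.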

Enrichment of the Glycoproteome

Characterization of the glycoproteome in a complex biological sample such as plasma, serum, or tissue, is analytically challenging because of the enormous complexity of protein and glycan constituents and the vast dynamic range of protein concentration in the sample. The selective enrichment of the glycoproteome is one of the most efficient ways to simplify the enormous complexity of a biological sample to achieve an in-depth glycoprotein analysis. Two approaches for glycoprotein enrichment have been widely applied: lectin affinity based enrichment methods (31–36) and hydrazide chemistry-based solid phase extraction methods (37–42). Recent studies have demonstrated that the two methods are complementary and a very effective means for the enrichment of glycoproteins or glycopeptides from human plasma and other bodily fluids (38, 39, 43). In addition, glycoprotein and glycopeptide enrichment using boronic acid (44, 45), size-exclusion chromatography (46), hydrophilic interaction (47) and a graphite powder microcolumn (48) have been reported.

Lectin affinity enrichment is based on the specific binding interaction between a lectin and a distinct glycan structure attached on a glycoprotein (49, 50). There are a variety of lectin species that can selectively bind to different oligosaccharide epitopes. For instance, concanavalin A (ConA) binds to mannosyl and glucosyl residues of glycoproteins (51); wheat germ agglutinin (WGA) binds to N-acetyl-glucosamine and sialic acid (52); and jacalin (JAC) specifically recognizes galactosyl (β-1,3) N-acetylgalactosamine and O-linked glycoproteins (53). Lectin affinity enrichment has been designed to enrich glycoproteins with specific glycan attachment from plasma, serum, tissue, and other biological samples through affinity chromatography and other methods. Multiple lectin species can also be combined to isolate multiple types of glycoproteins in complex biological samples (54–59).
Concanavalin A, wheat germ agglutinin, and jacalin are often used together to achieve a more extensive glycoproteome characterization (31, 34, 57, 59, 60). Several reports have demonstrated a multilectin column approach to achieve a global enrichment of glycoproteins with various glycan attachments from serum and plasma (31, 34, 59, 61, 62). A recent study has developed a “filter aided sample preparation (FASP)” based method, which allows highly efficient enrichment of glycopeptides using multi-lectins (63). To date, most of the work using lectin affinity for targeted glycoprotein enrichment has focused on N-glycosylation because the binding specificity of lectin for O-glycosylation is less satisfactory. To overcome this limitation, efforts have been made using serial lectin columns of concanavalin A and jacalin in tandem to isolate O-glycopeptides from human serum (35).

A hydrazide chemistry-based method has been applied to isolate glycoproteins and glycopeptides through the formation of covalent bonding between the glycans and the hydrazide groups (37). The carbohydrates on glycoproteins are first oxidized to form aldehyde groups, which subsequently react with hydrazide groups that are immobilized on a solid surface. The chemical reaction conjugates the glycoproteins to the solid phase by forming the covalent hydrazone bond. Although, conceptually, the majority of the glycoproteins in a biological sample can be captured using this method, the further analysis of the captured glycoproteins is limited in practice by the methods available to cleave glycoproteins or glycopeptides from the solid phase. Because there is a lack of efficient enzymes or chemicals that can specifically deglycosylate and/or release O-linked glycoproteins or glycopeptides from the solid phase, most of the studies have applied this method solely for N-linked glycoprotein analysis.
PNGase F is the enzyme that can specifically release N-glycosylated proteins or peptides (except those carrying α1→3 linked core fucose (38)) from their corresponding oligosaccharide groups. The hydrazide chemistry method is not only highly efficient in enriching N-linked glycoproteins or glycopeptides from a complex environment, but also allows great flexibility in its applications, such as capturing extracellular N-glycoproteins on live cells to monitor changes in their abundance resulting from cell activation, differentiation, or other cellular activities (64). This method can be readily automated for analyzing a large quantity of samples.

Recent studies have compared glycoprotein isolation methods. One study assessed lectin-based protocols and hydrophilic interaction chromatography for their performance in enriching glycoproteins and glycopeptides from serum (65). Other studies compared lectin affinity and hydrazide chemistry methods for their efficiency in isolating glycoproteins and glycopeptides from a complex biological sample (39, 66, 67). The methods are complementary in enriching glycoproteins because they capture glycoproteins by different mechanisms. Applying both methods significantly improves the coverage of the glycoproteome, resulting in an increased number of glycoproteins identified. The lectin affinity method can be tailored to target glycoproteins with specific glycan structure(s) for isolation using different lectins, thus affording flexibility for its application in glycoproteomic studies. The hydrazide chemistry method has been widely used for the study of N-linked glycosylation. Hydrazide chemistry essentially reacts with all proteins that carry carbonyl groups, which may include glycoproteins with oxidized glycans (37, 40) and other oxidized proteins that carry carbonyl groups (68–70). 
The high specificity of this method may mainly result from the specificity of PNGase F, the enzyme cleaving N-glycosidic bonds to release N-glycoproteins and peptides from the solid phase. This method affords high efficiency and specificity in enriching N-linked glycoproteins or glycopeptides from a complex sample, and can be easily incorporated into a proteomics workflow for integrated analysis. In addition to the lectin and hydrazide chemistry-based methods, it has been suggested that boronic acid-based solid phase extraction may also be useful for an overall glycoproteome enrichment (44, 45), on the basis of the evidence that boronic acid can form diester bonds with most glycans, including both N-linked and O-linked glycosylation (71).

Mass Spectrometric Analysis of Glycoproteome

Mass spectrometry, because of its high sensitivity and selectivity, has been one of the most versatile and powerful tools in glycoprotein analysis, used to identify glycoproteins, evaluate glycosylation sites, and elucidate oligosaccharide structures (56, 72, 73). The use of a top-down approach (intact protein-based proteomics analysis) (74) for glycoprotein characterization in a complex sample is still technically challenging with the current technology. The most versatile and widely used current glycoproteomics methods are based on characterizing glycopeptides generated by the digestion of glycoproteins, analyzing either deglycosylated glycopeptides or intact glycopeptides with glycan attachment, as illustrated in Fig. 1.

The direct analysis of intact glycopeptides with carbohydrate attachments is complicated by the mixed information obtained from the fragment ion spectra, which may include fragment ions from the peptide backbone, the carbohydrate group, and combinations of both. Although it is technically challenging to comprehensively analyze intact glycopeptides on a global scale for a complex biological sample, complementary information regarding peptide backbone and glycan structure can likely be obtained in a single measurement. Early work using collision-induced dissociation (CID) has identified a few key features that are characteristic of the fragmentation of glycopeptides, providing the basis for intact glycopeptide identification (75–79). 
The analysis of intact glycopeptides has been carried out using a variety of different instruments, including electrospray ionization (ESI)-based ion trap (IT) (80–84), quadrupole ion trap (QIT) (85–87), Fourier transform ion cyclotron resonance (FTICR) (31, 57, 88, 89), ion trap/time-of-flight (IT/TOF) (90, 91), and quadrupole/time-of-flight (Q/TOF) (92–97); and matrix-assisted laser desorption/ionization (MALDI)-based Q/TOF (98–100), quadrupole ion trap/time-of-flight (QIT/TOF) (86, 101, 102), and tandem time-of-flight (TOF/TOF) (81, 82, 101, 103–105) mass spectrometers. In general, the CID-generated MS/MS spectrum of a glycopeptide is dominated by B- and Y-type glycosidic cleavage ions (carbohydrate fragments) (106), and b- and y-type peptide fragments from the peptide backbone. However, the MS/MS fragmentation data obtained from different instruments can differ markedly in the structural information they provide on the glycan and peptide backbone, depending on the experimental setting and instrumentation used for mass analysis, including ionization methods, collision techniques, and mass analyzers. Low energy CID with electrospray ionization-based ion trap, Fourier transform-ion cyclotron resonance, and Q/TOF instruments predominantly generates fragments of glycosidic bonds. Increasing the collision energy on Fourier transform-ion cyclotron resonance and Q/TOF instruments results in more efficient generation of b- and y-ions from the peptide backbone. MALDI ionization generates predominantly singly charged precursor ions, which are more stable and usually fragmented using higher energies via CID or post-source decay (PSD), generating fragments from both the peptide backbone and the glycan (98–100, 103, 107–110). 
Although Q/TOF instruments have been widely used for intact glycopeptide characterization, one unique feature of the ion trap instrument is that it allows repeated ion isolation/CID fragmentation cycles, which can provide a wealth of complementary information to interpret the structure of a glycan moiety and peptide backbone (56, 86, 111). Recently, fragmentation techniques using different mechanisms from CID have been introduced and applied for glycopeptide analysis, including infrared multiphoton dissociation (IRMPD) (112–115), electron-capture dissociation (ECD) (112–120), and electron-transfer dissociation (ETD) (85, 121–123). Infrared multiphoton dissociation and electron-capture dissociation are largely performed with Fourier transform-ion cyclotron resonance instruments. Complementary to CID fragmentation, electron-capture dissociation and electron-transfer dissociation tend to cleave the peptide backbone with no loss of the glycan moiety, providing specific information for localizing the glycosidic modification. More details regarding mass spectrometric analysis of intact glycopeptides can be found in recent reviews (56, 124). Although great efforts have been made to apply a variety of mass spectrometry techniques to study both N-linked (32, 56, 86, 87, 112–114, 125–130) and O-linked (90, 116, 119, 120, 130–140) glycopeptides, the interpretation of the fragment spectrum of an intact glycopeptide still requires intensive manual assignment and evaluation. A recent study has demonstrated the feasibility of developing an automated workflow for analyzing intact glycopeptides in mixtures (141). In general, however, high-throughput, large-scale profiling of intact glycopeptides in a complex sample remains a challenge with current technology.

The analysis of deglycosylated peptides requires the removal of glycan attachments from glycopeptides. 
Fortunately, for N-linked glycopeptides, the N-glycosidic bond can be specifically cleaved using the enzyme PNGase F, providing deglycosylated peptides, which can then be analyzed directly using shotgun proteomics. The PNGase F-catalyzed deglycosylation results in the conversion of asparagine to aspartic acid in the glycopeptide sequence, which introduces a mass difference of 0.9840 Da. Such distinct mass differences can be used to precisely map the N-linked glycosylation sites using high resolution mass spectrometers. Stable isotope labeling introduced by enzymatic cleavage of glycans in H₂¹⁸O has also been used to enhance the precise identification of N-glycosylation sites (33, 142, 143). The removal of O-linked glycans is less straightforward; most assays rely on chemical deglycosylation methods, such as trifluoromethanesulfonic acid (144), hydrazinolysis (145), β-elimination (146), and periodate oxidation (35, 147). The application of these methods suffers from a variety of limitations, such as low specificity for O-linked glycosylation, degradation of the peptide backbone, and modifications of the amino acid residues, all of which can complicate or compromise O-linked glycoproteomics analysis in a complex sample. Most of the large scale glycoproteomics studies using the deglycosylation approach have focused on N-glycoproteins, which are prevalent in blood and a rich source for biomarker discovery. O-glycosylation lacks a common core, a consensus sequence, and a universal enzyme that can specifically remove the glycans from the peptide backbone, and is therefore more challenging to analyze for large scale profiling.

Following deglycosylation, the glycopeptides can be treated and analyzed as stripped peptides using a shotgun proteomics pipeline. 
MS/MS fragment spectra with b-ions and y-ions generated from CID are searched against protein databases using search algorithms, such as SEQUEST (148), MASCOT (149), and X!Tandem (150), and subsequently validated via statistical analysis (151–154), to provide peptide and protein identifications with known false discovery rate. The N-glycosylation sites can be precisely mapped using the consensus sequence of Asn-X-Ser/Thr, in which asparagine is converted to aspartic acid following enzyme cleavage, introducing a mass difference of 0.9840 Da. A variety of mass spectrometers have been used to analyze glycoproteins, in particular N-linked glycoproteins, in complex biological and clinical samples using the deglycosylation approach. These studies include electrospray ionization-based ion trap (37–39, 41, 67, 155–157), Orbitrap (158), Q/TOF (33, 35, 142, 155), triple quadrupole (159), and Fourier transform-ion cyclotron resonance (64, 160); and MALDI-based TOF/TOF (41, 161) and Q/TOF (37). Recently, an attempt was made to apply ion mobility-mass spectrometry (IM-MS) to characterize deglycosylated glycopeptides and the corresponding carbohydrates simultaneously (162) in a single measurement. The approach of analyzing deglycosylated glycopeptides makes it possible to utilize available proteomics technology for large-scale glycoproteome profiling, especially N-linked glycoproteins, in a high-throughput fashion.
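
The site-mapping logic described in this section can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the cited studies: the function name `mapped_sites`, the example sequences, and the exclusion of proline at the X position (a commonly applied refinement that the text does not state) are assumptions introduced here. The sketch derives the 0.9840 Da shift from monoisotopic atomic masses and reports asparagine positions within an Asn-X-Ser/Thr motif that are observed as aspartic acid after deglycosylation.

```python
import re

# Monoisotopic atomic masses in Da.
MASS_H = 1.00782503
MASS_N = 14.00307401
MASS_O = 15.99491462

# Asn -> Asp replaces the side-chain amide (-CONH2) with a carboxyl (-COOH):
# the residue gains one O and loses one N and one H.
DEAMIDATION_SHIFT = MASS_O - MASS_N - MASS_H

# Asn-X-Ser/Thr sequon; excluding Pro at X is an assumption of this sketch.
# The lookahead lets overlapping motifs be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def mapped_sites(reference, observed):
    """Return 0-based positions where a sequon Asn in the database
    (reference) sequence is read as Asp in the identified (observed)
    sequence of the deglycosylated peptide."""
    if len(reference) != len(observed):
        raise ValueError("sequences must have the same length")
    return [m.start() for m in SEQUON.finditer(reference)
            if observed[m.start()] == "D"]


# Hypothetical peptide: Asn at index 2 sits in N-G-T, Asn at index 6 in N-P-S.
sites = mapped_sites("AKNGTLNPSK", "AKDGTLNPSK")
```

Comparing against the database sequence, rather than searching the observed peptide for Asp-X-Ser/Thr alone, avoids counting aspartic acid residues that are encoded in the protein. A converted asparagine can also arise from spontaneous chemical deamidation, so a match is a candidate site rather than proof of glycosylation.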

Glycoproteomics Analysis in Blood and Other Bodily Fluids

An important target for blood-based diagnostic assays involves the detection and quantification of glycosylated proteins. Glycosylated proteins, especially N-linked glycoproteins, are ubiquitous among the proteins destined for extracellular environments (163), such as plasma or serum. A systematic and in-depth global profiling of the blood glycoproteome can provide fundamental knowledge for blood biomarker development, and is now possible with the development of glycoproteomics technologies. In the past few years, several large scale proteomics studies on profiling the glycoproteome of human plasma and serum have been reported (34, 35, 37, 38, 43, 61, 65, 164–166), adding significant numbers of glycoproteins to the blood glycoproteome database. In one study (38), immunoaffinity subtraction and hydrazide chemistry were applied to enrich N-glycoproteins from human plasma. The captured plasma glycoproteins were subjected to two-dimensional liquid chromatography separation followed by tandem mass spectrometric analysis. A total of 2053 different N-glycopeptides were identified, covering 303 nonredundant glycoproteins, including many glycoproteins with low abundance in blood (38). In a different study, a hydrazide chemistry-based solid phase extraction method was applied to enhance the detection of tissue-derived proteins in human plasma (167). Other studies have applied lectin affinity-based approaches to characterize the serum and plasma glycoproteome (34, 43, 166). These studies provide detailed identification of the individual N-glycosylation sites using high-resolution mass spectrometry. The efforts made in global profiling of glycoproteins in plasma and serum have not only greatly enhanced our understanding of the blood glycoproteome, but also facilitated the development of new technologies that can be used for glycoprotein biomarker discovery. 
A variety of experimental designs and strategies for blood glycoprotein profiling have been applied in clinical disease studies, including prostate cancer (168), hepatocellular carcinoma (164, 168–170), lung adenocarcinoma (61, 171), breast cancer (58, 165, 172), atopic dermatitis (169), ovarian cancer (173, 174), congenital disorders of glycosylation (175), and pancreatic cancer (156, 176). Most of these studies focused on the early stages of glycoprotein biomarker discovery, and many of them exploited multilectin affinity techniques to isolate glycoproteins from serum or plasma.

Glycoproteomics techniques have also been applied to study the glycoproteome of other bodily fluids. The complementary application of hydrazide chemistry-based solid phase extraction and the lectin affinity method has led to the identification of 216 glycoproteins in human cerebrospinal fluid (CSF), including many of low abundance (39). A hydrazide chemistry-based study on human saliva has characterized 84 N-glycosylated peptides in 45 glycoproteins (177). A study on tear fluid identified 43 N-linked glycoproteins, including 19 proteins that had not been discovered in tear fluid previously (178). Other glycoproteomics studies on bodily fluids include N-glycoprotein profiling of lung adenocarcinoma pleural effusions (179), urine glycoprotein profiling (180), and urine glycoprotein signature identification for bladder cancer (181). In the urine glycoprotein profiling study, 150 annotated glycoproteins in addition to 43 predicted glycoproteins were identified (180). In our own study, 48 glycoproteins have so far been identified in pancreatic juice (unpublished data), adding complementary information to the pancreatic juice protein database (182–184).

Glycoproteomics Analysis of Tissue and Cell Lysates

Protein glycosylation has been increasingly recognized as one of the prominent alterations involved in tumorigenesis, inflammation, and other disease states. The study of glycoproteins in cells and tissues carries great promise for defining biomarkers for diagnostic and therapeutic targets. The glycoproteomics studies in liver tissue (185, 186) and cell lines (187) have provided a fundamental understanding of the liver glycoproteome and identified protein candidates that are associated with highly metastatic liver cancer cells. In one of the studies, hydrazide chemistry and multiple enzyme digestion provided a complementary identification of 939 N-glycosylation sites covering 523 nonredundant glycoproteins in human liver tissue (185). Studies on ovarian cancer have focused on discovering putative glycoprotein biomarkers for improving diagnosis (173, 174) and therapeutic treatment (188). Glycoproteomics studies have also been carried out to study hepatocellular carcinoma. Magnetic nanoparticle-immobilized concanavalin A was used to selectively enrich N-glycoproteins in a hepatocellular carcinoma cell line, leading to the identification of 184 glycosylation sites corresponding to 101 glycoproteins (189). In a different study, complementary methods of hydrophilic affinity and hydrazide chemistry were applied to investigate the secreted glycoproteins from a hepatocellular carcinoma cell line, in which 300 different glycosylation sites within 194 glycoproteins were identified (190). While many of these studies focused on N-glycoproteins, mucin-type O-linked glycoproteins are the predominant forms of O-linked glycosylation and are difficult to analyze. A metabolic labeling method was developed to facilitate their identification in complex cell lysates using proteomic strategies (191).

Cell surface and membrane proteins are particularly appealing for biomarker discovery, and many of them are glycosylated proteins. 
Both hydrazide chemistry- and lectin affinity-based approaches have been applied to specifically study cell surface and membrane N-glycoproteins that are associated with diseases, including colon carcinoma (192), breast cancer (158), and thyroid cancer (157). One study applied hydrazide chemistry to covalently label extracellular glycan moieties on live cells, providing highly specific and selective identification of cell surface N-glycoproteins (64). A complementary application of hydrazide chemistry and lectin affinity methods was demonstrated to profile cell membrane glycoproteins, significantly enhancing the glycoprotein identification (67).

Quantitative Glycoprotein Profiling

One of the major goals of clinical proteomics is to effectively identify dysregulated proteins that are specifically associated with a biological state, such as a disease. In the past decade, different quantitative proteomics techniques have been introduced and applied to study a wide variety of disease settings. These techniques are based on different mechanisms to facilitate mass spectrometry-based quantitative analysis, including stable isotopic or isobaric labeling using chemical reactions (e.g. ICAT and iTRAQ) (193–195), metabolic incorporation (e.g. SILAC) (196), and enzymatic reactions (e.g. ¹⁸O labeling) (197, 198); as well as less quantitatively accurate label-free approaches (199, 200). Overviews and comparisons of these quantitative techniques can be found in several reports in the literature and are not discussed in this review. Most of these isotopic labeling techniques can be adapted and utilized for glycoproteomics analysis to quantitatively compare the glycoproteome of a diseased sample to a control, thus revealing the glycosylation occupancy of individual glycosylation sites that may be involved in a disease. In addition to the well-established labeling methods cited above, several more experimental labeling strategies have been described in the field of glycoproteomics. One study demonstrated the feasibility of using stable isotope-labeled succinic anhydride for quantitative analysis of glycoproteins isolated from serum via hydrazide chemistry (37). In a different report, the heavy and light versions of N-acetoxy-succinimide combined with lectin affinity selection were used to quantitatively profile serum glycopeptides in canine lymphoma and transitional cell carcinoma (201). Stable isotope-labeled 2-nitrobenzenesulfenyl was also used for chemical labeling in a quantitative glycoprotein profiling study on the sera from patients with lung adenocarcinoma (202). 
O-Linked N-acetylglucosamine (O-GlcNAc) is an intracellular, reversible form of glycosylation that shares many features with phosphorylation (203). Studies have suggested that O-GlcNAc may play an important role in many biological processes (204). A quantitative study on O-GlcNAc glycosylation has been reported, in which a method termed quantitative isotopic and chemoenzymatic tagging (QUIC-Tag) was described using a biotin-avidin affinity strategy for O-GlcNAc glycopeptide enrichment and stable isotope-labeled formaldehyde for mass spectrometric quantification (205). Recently, the isobaric tag for relative and absolute quantitation (iTRAQ) technique, combined with different glycoprotein enrichment approaches, has been utilized in several quantitative glycoproteomics studies. In a study of hepatocellular carcinoma, N-linked glycoproteins were enriched from hepatocellular carcinoma patients and controls using a multilectin column and then quantitatively compared using iTRAQ to reveal the differentially abundant proteins associated with hepatocellular carcinoma (206). In a different study, narrow-selectivity lectin affinity chromatography followed by iTRAQ labeling was demonstrated to selectively identify differential glycoproteins in plasma samples from breast cancer patients (165). 
Another study utilized hydrazide chemistry-based solid phase extraction and iTRAQ to investigate the tear fluid of patients with climatic droplet keratopathy in comparison with normal controls, identifying multiple N-glycosylation sites with differential occupancy associated with climatic droplet keratopathy (178).

In addition to using chemical reactions to incorporate a stable isotope tag for quantitative mass spectrometric analysis, ¹⁸O can be introduced into N-glycopeptides during enzymatic reactions, such as tryptic digestion (incorporation of two ¹⁸O atoms into the peptide carboxyl terminus) and PNGase F-mediated hydrolysis (incorporation of one ¹⁸O atom into the asparagine of N-glycosylation sites (33)). Attempts have been made to apply this approach to identify differentially expressed N-glycosylation associated with ovarian cancer in serum (207). In a different approach, the SILAC technique allows incorporation of stable isotope-labeled amino acids into proteins during cell culture (196), and was applied to investigate the difference in cell surface N-glycoproteins among different cell types (64). A label-free approach has also been used for glycoproteomics profiling, including a method developed to profile intact glycopeptides in a complex sample (208) and a study that compares the plasma glycoproteome between psoriasis patients and healthy controls (209).
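
The mass shifts implied by the enzymatic ¹⁸O labeling described above follow directly from isotope masses, and a short calculation makes them concrete. The following is a minimal sketch under stated assumptions: the function names `pair_spacing` and `mz_spacing` are hypothetical, labeling is assumed to be complete, and the isotope masses are standard monoisotopic values rather than figures taken from the cited studies.

```python
# Monoisotopic masses in Da.
MASS_H = 1.00782503
MASS_N = 14.00307401
MASS_16O = 15.99491462
MASS_18O = 17.99915961

# Mass gained each time one 16O is replaced by one 18O.
DELTA_18O = MASS_18O - MASS_16O

# Asn -> Asp conversion at the glycosylation site when PNGase F acts in
# ordinary water and in H2(18)O, both relative to the unmodified Asn.
ASN_TO_ASP_16O = MASS_16O - MASS_N - MASS_H
ASN_TO_ASP_18O = MASS_18O - MASS_N - MASS_H


def pair_spacing(c_terminal_18o=0, labeled_sites=0):
    """Mass difference (Da) between the 18O-labeled and unlabeled forms of a
    deglycosylated peptide, assuming complete incorporation."""
    return (c_terminal_18o + labeled_sites) * DELTA_18O


def mz_spacing(mass_difference, charge):
    """Spacing on the m/z axis for a given charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mass_difference / charge


# Tryptic digestion in H2(18)O: two 18O atoms at the carboxyl terminus.
tryptic_pair = pair_spacing(c_terminal_18o=2)
# The same pair observed as a doubly charged ion.
tryptic_pair_2plus = mz_spacing(tryptic_pair, charge=2)
```

With these values, a doubly labeled carboxyl terminus separates the light and heavy peptides by about 4.008 Da, and deglycosylation in H₂¹⁸O marks the site with about 2.988 Da instead of 0.984 Da. The larger site shift is easier to distinguish from the isotope envelope of the unmodified peptide, which is why the labeling helps site assignment.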

Targeted Glycoproteomics Analysis

Mass spectrometry-based targeted proteomics has recently emerged as a multiplexed quantitative technique that affords highly specific and candidate-based detection of targeted peptides and proteins in a complex biological sample (18, 210–214). The technique is based on the concept of stable isotope dilution utilizing stable isotope-labeled synthetic reference peptides, which precisely mimic their endogenous counterparts, to achieve targeted quantification (214). Such techniques can be applied to target specific glycoproteins or glycopeptides, to precisely quantify the status of candidate glycosylation sites and assess the glycosylation occupancy at the molecular level. However, it is technically impractical to use synthetic peptides as internal standards to precisely mimic a large number of natural glycopeptides with an intact glycan moiety because of the structural complexity and variation of the sugar chain. To overcome these technical obstacles, an alternative approach was proposed for targeted analysis of N-glycosylation occupancy, in which stable isotope-labeled peptides were synthesized to mimic the deglycosylated form of candidate glycopeptides as internal references (161). It is known that the deglycosylation step using PNGase F results in a conversion of asparagine to aspartic acid in the peptide sequence, introducing a mass difference of 0.9840 Da. This phenomenon was utilized to design a synthetic peptide that mimics the endogenous N-linked glycopeptide in its deglycosylated form, with the exact amino acid sequence of its endogenous counterpart and with ¹³C and ¹⁵N labeling on one of its amino acids (161). Therefore, each matched pair of reference and endogenous candidate glycopeptides should share the same chromatographic and mass spectrometric characteristics, and can only be distinguished by the mass difference and isotopic pattern that result from the isotopic labeling. 
This design conceptually ensures that the synthetic internal standard of a candidate glycopeptide will be detected simultaneously with its endogenous form under the same analytical conditions, thus minimizing the systematic variation and providing reliable quantification (214). The strategy for targeted glycoproteomics analysis is schematically illustrated in Fig. 2.

Fig. 2. Targeted analysis of N-glycopeptides.

The targeted glycoproteomics technique was first demonstrated on N-glycopeptides that were extracted from human serum using an integrated pipeline combining a hydrazide chemistry-based solid phase extraction method and a data-driven liquid chromatography MALDI TOF/TOF mass spectrometric analysis to quantify 21 N-glycopeptides in human serum (161). A similar mass spectrometric platform was then applied in a different study to assess a subset of glycoprotein biomarker candidates in the sera from prostate cancer patients (215). The targeted glycoproteomics analysis has also been demonstrated using a triple Q/linear ion trap instrument with the selected reaction monitoring (also referred to as multiple reaction monitoring) technique for highly sensitive targeted detection of N-glycoproteins in plasma (159). The technique was also applied to detect tissue inhibitor of metalloproteinase 1 (TIMP1), an aberrant glycoprotein associated with colorectal cancer, in the sera of colorectal cancer patients; a tandem enrichment strategy combining lectin glycoprotein enrichment with the method of stable isotope standards and capture by antipeptide antibodies (SISCAPA) was used to enhance TIMP1 detection (216). These studies demonstrate an integrated pipeline for candidate-based glycoproteomics analysis with precise mapping of targeted N-linked motifs and absolute quantification of the glycoprotein targets in a complex biological sample. 
Such targeted glycoproteomics can reach a detection sensitivity at the nanogram per milliliter level for serum and plasma detection (159, 214–216).
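
The stable isotope dilution principle behind this targeted approach reduces to a ratio calculation, sketched below in Python. This is a simplified illustration and not the data processing used in the cited studies: the function names, the peak areas, the spiked amount, and the protein mass are hypothetical, and the conversion to a concentration assumes complete digestion and recovery with one copy of the target peptide per protein molecule.

```python
def endogenous_amount_fmol(light_area, heavy_area, heavy_spiked_fmol):
    """Amount of endogenous (light) peptide, from the ratio of its peak area
    to that of the co-eluting isotope-labeled (heavy) reference peptide."""
    if heavy_area <= 0:
        raise ValueError("heavy reference peak area must be positive")
    return light_area / heavy_area * heavy_spiked_fmol


def concentration_ng_per_ml(amount_fmol, protein_mass_da, sample_volume_ml):
    """Convert a molar amount into a protein mass concentration.
    1 fmol of a 1 Da species weighs 1e-6 ng."""
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    nanograms = amount_fmol * protein_mass_da * 1e-6
    return nanograms / sample_volume_ml


# Hypothetical measurement: the endogenous peak is half the size of the
# reference peak, and 100 fmol of heavy peptide was spiked into the sample.
amount = endogenous_amount_fmol(light_area=2.0e5, heavy_area=4.0e5,
                                heavy_spiked_fmol=100.0)
# Hypothetical 28 kDa glycoprotein measured from 1 microliter of serum.
conc = concentration_ng_per_ml(amount, protein_mass_da=28000.0,
                               sample_volume_ml=0.001)
```

Because the light and heavy peptides share chromatographic and ionization behavior, the area ratio cancels most run-to-run variation; the accuracy of the result then rests mainly on how well the spiked amount is known.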

Concluding Remarks

The major challenge for a comprehensive glycoproteomics analysis arises not only from the enormous complexity and nonlinear dynamic range of protein constituents in a clinical sample, but also from the profound biological intricacy within the glycoprotein molecule itself, involving the flexibility in glycan structures and the complex linkage with the corresponding protein. In the past decade, significant efforts have been made to structurally or quantitatively characterize the glycoproteome of a variety of biological samples, and to investigate the significant glycoproteins in a wide assortment of diseases. Shotgun proteomics-based techniques are still the most effective and versatile approach in glycoproteomics analysis, allowing high-throughput and detailed analysis of individual glycosylation sites. Although glycoproteomics is quickly emerging as an important technique for clinical proteomics study and biomarker discovery, a comprehensive, quantitative glycoproteomics analysis in a complex biological sample remains challenging. It is anticipated that, with the continued evolution of mass spectrometry, separation technology, and bioinformatics, many of the technical limitations associated with current glycoproteomics may be transient. There is no doubt that glycoproteomics is playing an increasingly important role in biomarker discovery and clinical study.

9.
Highlights
  • Automated metadata extraction from potentially large sets of mass spectrometric raw data.
  • Reduction of extracted metadata into groups of shared parameter sets.
  • Tabular representation for quality control, reporting and publication.

10.
The interaction at neutral pH between wild-type and a variant form (R3A) of the amyloid fibril-forming protein β2-microglobulin (β2m) and the molecular chaperone αB-crystallin was investigated by thioflavin T fluorescence, NMR spectroscopy, and mass spectrometry. Fibril formation of R3Aβ2m was potently prevented by αB-crystallin. αB-crystallin also prevented the unfolding and nonfibrillar aggregation of R3Aβ2m. From analysis of the NMR spectra collected at various R3Aβ2m to αB-crystallin molar subunit ratios, it is concluded that the structured β-sheet core and the apical loops of R3Aβ2m interact in a nonspecific manner with αB-crystallin. Complementary information was derived from NMR diffusion coefficient measurements of wild-type β2m at a 100-fold concentration excess with respect to αB-crystallin. Mass spectrometry acquired in the native state showed that the onset of wild-type β2m oligomerization was effectively reduced by αB-crystallin. Furthermore, and most importantly, αB-crystallin reversibly dissociated β2m oligomers formed spontaneously in aged samples. These results, coupled with our previous studies, highlight the potent effectiveness of αB-crystallin in preventing β2m aggregation at the various stages of its aggregation pathway. Our findings are highly relevant to the emerging view that molecular chaperone action is intimately involved in the prevention of in vivo amyloid fibril formation.

11.
A pilot study using capillary electrophoresis with mass spectrometry for the analysis of nucleotides in human erythrocytes is presented. Erythrocytes were incubated with 5-amino-4-imidazolecarboxamide riboside in order to mimic the situation in AICA-ribosiduria, a defect of purine metabolism. Characteristic AICA-ribotides together with normal nucleotides were separated by capillary electrophoresis in acetate buffer (20 mmol/L, pH 4.4) and identified online by mass spectrometry.

12.
13.
Progesterone has a number of important functions throughout the human body. While the roles of progesterone are well known, the possible actions and implications of progesterone metabolites in different tissues remain to be determined. There is a growing body of evidence that these metabolites are not inactive, but can have significant biological effects as anesthetics, anxiolytics, and anticonvulsants. Furthermore, they can facilitate synthesis of myelin components in the peripheral nervous system, have effects on human pregnancy and onset of labour, and have a neuroprotective role. For a better understanding of the functions of progesterone metabolites, improved analytical methods are essential. We have developed a combined liquid chromatography-tandem mass spectrometry (LC-MS/MS) method for detection and quantification of progesterone and 16 progesterone metabolites in a single chromatographic run, with femtomolar sensitivity and good reproducibility. MS/MS analyses were performed in positive mode and under constant electrospray ionization conditions. To increase the sensitivity, all of the transitions were recorded using the Scheduled MRM algorithm. This LC-MS/MS method requires small sample volumes and minimal sample preparation, and there is no need for derivatization. Here, we show the application of this method for evaluation of progesterone metabolism in the HES endometrial cell line. In HES cells, the metabolism of progesterone proceeds mainly to (20S)-20-hydroxy-pregn-4-ene-3-one, (20S)-20-hydroxy-5α-pregnane-3-one and (20S)-5α-pregnane-3α,20-diol. The investigation of possible biological effects of these metabolites on the endometrium is currently under way.

14.
15.
Phosphorylase kinase (PhK), a 1.3 MDa enzyme complex that regulates glycogenolysis, is composed of four copies each of four distinct subunits (α, β, γ, and δ). The catalytic protein kinase subunit within this complex is γ, and its activity is regulated by the three remaining subunits, which are targeted by allosteric activators from neuronal, metabolic, and hormonal signaling pathways. The regulation of activity of the PhK complex from skeletal muscle has been studied extensively; however, considerably less is known about the interactions among its subunits, particularly within the non-activated versus activated forms of the complex. Here, nanoelectrospray mass spectrometry and partial denaturation were used to disrupt PhK, and subunit dissociation patterns of non-activated and phospho-activated (autophosphorylation) conformers were compared. In so doing, we have established a network of subunit contacts that complements and extends prior evidence of subunit interactions obtained from chemical crosslinking, and these subunit interactions have been modeled for both conformers within the context of a known three-dimensional structure of PhK solved by cryoelectron microscopy. Our analyses show that the network of contacts among subunits differs significantly between the nonactivated and phospho-activated conformers of PhK, with the latter revealing new interprotomeric contact patterns for the β subunit, the predominant subunit responsible for PhK's activation by phosphorylation. Partial disruption of the phosphorylated conformer yields several novel subcomplexes containing multiple β subunits, arguing for their self-association within the activated complex. Evidence for the theoretical αβγδ protomeric subcomplex, which has been sought but not previously observed, was also derived from the phospho-activated complex. 
In addition to changes in subunit interaction patterns upon phospho-activation, mass spectrometry revealed a large change in the overall stability of the complex, with the phospho-activated conformer being more labile, in concordance with previous hypotheses on the mechanism of allosteric activation of PhK through perturbation of its inhibitory quaternary structure. In the cascade activation of glycogenolysis in skeletal muscle, phosphorylase kinase (PhK), upon becoming activated through phosphorylation, subsequently phosphorylates glycogen phosphorylase in a Ca2+-dependent reaction. This phosphorylation of glycogen phosphorylase activates its phosphorolysis of glycogen, leading to energy production (1). The 1.3 MDa (αβγδ)4 PhK complex was the first protein kinase to be characterized and is among the largest and most complex enzymes known (2). As such, the intact complex has proved to be refractory to high resolution x-ray crystallographic or NMR techniques; however, low resolution structures of the nonactivated and Ca2+-saturated conformers of PhK have been deduced through modeling (3) and solved by means of three-dimensional electron microscopic (EM) reconstruction (4–7), and they show that the complex is a bilobal structure with interconnecting bridges. Approximate locations of small regions of each subunit in the complex are known (8–10) and show that the subunits pack head-to-head as apparent αβγδ protomers that form two octameric (αβγδ)2 lobes associating in D2 symmetry (11), although direct evidence that the αβγδ protomers are discrete, functional subcomplexes has been lacking until now. Approximately 90% of the mass of the PhK complex is involved in its regulation. Its kinase activity is carried out by the catalytic core of the γ subunit (44.7 kDa), with the kcat being enhanced up to 100-fold by multiple metabolic, hormonal, and neural stimuli that are integrated through allosteric sites on PhK's three regulatory subunits, α, β, and δ (12).
The small δ subunit (16.7 kDa), which is tightly bound integral calmodulin (13), binds to at least the C-terminal regulatory domain of the γ subunit (γCRD) (14, 15), thereby mediating activation of the catalytic subunit by the obligate activator Ca2+ (16). The α and β subunits, as deduced from DNA sequencing, are polypeptides of 1237 and 1092 amino acids, respectively, with calculated masses prior to post-translational modifications of 138.4 and 125.2 kDa (17, 18). Both subunits can be phosphorylated by numerous protein kinases, including cAMP-dependent protein kinase and PhK itself (2). The α and β subunits are also homologous (38% identity and 61% similarity); however, each subunit has unique phosphorylatable regions that contain nearly all the phosphorylation sites found in these subunits (17, 18). The regulation of PhK activity by both Ca2+ (19–23) and phosphorylation has been studied extensively (reviewed in Ref. 24); however, only the structural effects induced by Ca2+ are well characterized (25), primarily through comparison of the non-activated and Ca2+-activated conformers using three-dimensional EM reconstructions (4), small angle x-ray scattering modeling (3), and biophysical (26–28) and chemical crosslinking methods (29–32). In contrast to the Ca2+-activated versus non-activated conformers, there are no reported structures of phosphorylated PhK to compare against the non-activated form. A very small amount of structural information for phospho-activated PhK derived from chemical crosslinking raises the possibility of phosphorylation-dependent communication between the β and γ subunits: Arg-18 in the N-terminal phosphorylatable region of β was found to be relatively near the γCRD (33). Several lines of evidence suggest that transduction of the activating phosphorylation signal in PhK occurs concomitantly with conformational changes in β (33) that are detected via various methods (10, 34), including chemical crosslinking (35).
For example, crosslinking of only the phosphorylated conformer by the short-span crosslinker 1,5-difluoro-2,4-dinitrobenzene results in the formation of β homodimers (35). Correspondingly, more recent two-hybrid screens of the full length β subunit against itself yielded positive binding interactions only for point mutants in which the N-terminal phosphorylatable serine residues were mutated to phosphomimetic glutamates (33). It should be noted, however, that both chemical crosslinking and two-hybrid screening have potential drawbacks in the study of subunit interactions within a multisubunit complex. In the case of the latter, it is difficult when observing homodimeric two-hybrid interactions to determine whether they correspond to naturally occurring interactions between two like subunits within a complex or between two interacting regions within a single subunit of that complex. Studying subunit interactions in a complex through chemical crosslinking comes with its own inherent limitations. For example, an initial mono-derivatization can potentially cause a conformational change in one subunit that might affect the subsequent crosslinking reaction. This is particularly the case if the crosslinker contains a functionality, such as an aromatic group, that can unexpectedly direct it to a specific locus on the protein complex (36, 37). In addition, the spacer arms on many crosslinkers are sufficiently long to confound interpretation as to whether two subunits within a complex are actually in contact. Similarly, it should be proved that any observed crosslinked conjugate is formed from subunits within a complex, as opposed to between complexes (38, 39), a control that is often not run. 
Thus, it is prudent to analyze subunit interactions within a complex using a variety of approaches. To corroborate, complement, and expand the previous two-hybrid screening and chemical crosslinking studies of PhK's subunit interactions and to investigate changes in the pattern of subunit interactions induced by phosphorylation, we carried out comparative MS analyses of both intact and partially denatured forms of nonactivated and phospho-activated PhK using mass spectrometers modified specifically to enhance the transmission of large noncovalently bound protein complexes (40–42). The array of subunit interactions detected for the nonactivated PhK complex largely replicated those reported in the crosslinking literature for this conformer, both corroborating those earlier studies and validating the use of these MS approaches to study subunit interactions within the PhK complex. Additionally, several novel subcomplexes of PhK were revealed, most notably an αβγδ protomer, which corroborates the observed packing of this subcomplex in the D2 symmetrical (αβγδ)4 native complex (9, 11). Moreover, we show herein that the array of subunit interactions detected for phospho-activated PhK differs significantly from that observed for the nonactivated conformer, with only the former showing extensive self-interactions between and among the regulatory β subunits. As is discussed, this suggests that activation through phosphorylation is associated with increased interprotomeric interactions in the bridged core of the PhK complex (33, 35).
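
The subunit masses quoted in this entry allow a quick arithmetic check of the stated 1.3 MDa size of the (αβγδ)4 complex, and of the masses one would expect for subcomplexes such as the αβγδ protomer or a β dimer. The following is a minimal Python sketch under stated assumptions: the function and variable names are illustrative only, the masses are the sequence-derived values given in the text, and post-translational modifications (including the phosphorylation that defines the activated conformer) are ignored.

```python
# Subunit masses in kDa, as quoted in the text. Post-translational
# modifications are ignored (an assumption of this sketch).
SUBUNIT_MASS_KDA = {"alpha": 138.4, "beta": 125.2, "gamma": 44.7, "delta": 16.7}


def complex_mass_kda(composition):
    """Sum subunit masses for a composition such as {"alpha": 1, "beta": 2}."""
    return sum(SUBUNIT_MASS_KDA[name] * count for name, count in composition.items())


# One copy of each subunit forms the protomer; four protomers form the complex.
protomer = {"alpha": 1, "beta": 1, "gamma": 1, "delta": 1}
holoenzyme = {name: 4 * count for name, count in protomer.items()}
beta_dimer = {"beta": 2}

protomer_mass = complex_mass_kda(protomer)
holoenzyme_mass = complex_mass_kda(holoenzyme)
beta_dimer_mass = complex_mass_kda(beta_dimer)

# Fraction of the total mass contributed by the alpha, beta, and delta subunits.
regulatory_fraction = 1 - 4 * SUBUNIT_MASS_KDA["gamma"] / holoenzyme_mass

print(f"protomer:   {protomer_mass:.1f} kDa")
print(f"holoenzyme: {holoenzyme_mass:.1f} kDa")
print(f"beta dimer: {beta_dimer_mass:.1f} kDa")
print(f"alpha+beta+delta share of mass: {regulatory_fraction:.1%}")
```

With these values the protomer sums to 325.0 kDa and four protomers to 1300.0 kDa, in line with the 1.3 MDa figure. The α, β, and δ subunits together account for about 86% of that mass; the roughly 90% quoted in the text presumably also counts the regulatory domain of γ, which this sketch does not separate out.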

16.
Oxalate oxidase (EC 1.2.3.4) was purified to apparent homogeneity from Pseudomonas sp. OX-53. The molecular weight of the enzyme was about 320,000 by Sephadex G-200 column chromatography and 38,000 by sodium dodecyl sulfate disc electrophoresis. The isoelectric point of the enzyme was pH 4.7 by isoelectric focusing. This enzyme contained 1.12 atoms of manganese and 0.36 atoms of zinc per subunit. Besides oxalic acid, the enzyme oxidized glyoxylic acid and malic acid at lower reaction rates. The Michaelis constant of the enzyme was 9.5 mM for oxalic acid at the optimal pH 4.8. The enzyme was stable from pH 5.5 to 7.0. The enzyme was activated by flavins, phenylhydrazine, and o-phenylenediamine, and inhibited by I⁻, Br⁻, semicarbazide, and hydroxylamine.
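
Two of the numbers reported in this entry lend themselves to a short worked calculation: the ratio of native to subunit molecular weight, and the fraction of maximal rate expected at a given oxalate concentration. The Python sketch below is illustrative only. It assumes simple Michaelis-Menten kinetics, and the function names are hypothetical. The subunit count it prints is an inference from two approximate mass estimates, not a figure stated in the abstract.

```python
NATIVE_MW = 320_000  # Da, by Sephadex G-200 gel filtration, as reported
SUBUNIT_MW = 38_000  # Da, by SDS disc electrophoresis, as reported
KM_MM = 9.5          # mM, Michaelis constant for oxalic acid at pH 4.8


def subunit_ratio(native_mw, subunit_mw):
    """Ratio of native to subunit mass; a rough guide to the subunit count."""
    return native_mw / subunit_mw


def fractional_rate(substrate_mm, km_mm=KM_MM):
    """v / Vmax from the Michaelis-Menten equation v = Vmax * S / (Km + S)."""
    return substrate_mm / (km_mm + substrate_mm)


ratio = subunit_ratio(NATIVE_MW, SUBUNIT_MW)
print(f"native / subunit mass ratio: {ratio:.2f}")

# Rate relative to Vmax at a few oxalate concentrations (mM).
for s in (1.0, 9.5, 95.0):
    print(f"[S] = {s:5.1f} mM -> v/Vmax = {fractional_rate(s):.2f}")
```

The mass ratio comes out near 8.4, which would be compatible with roughly eight subunits per native enzyme given the uncertainty of gel-filtration estimates. At a substrate concentration equal to the Michaelis constant, the rate is half of the maximum by definition.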

17.
18.
19.
The range of heterogeneous approaches available for quantifying protein abundance via mass spectrometry (MS) leads to considerable challenges in modeling, archiving, exchanging, or submitting experimental data sets as supplemental material to journals. To date, there has been no widely accepted format for capturing the evidence trail of how quantitative analysis has been performed by software, for transferring data between software packages, or for submitting to public databases. In the context of the Proteomics Standards Initiative, we have developed the mzQuantML data standard. The standard can represent quantitative data about regions in two-dimensional retention time versus mass/charge space (called features), peptides, and proteins and protein groups (where there is ambiguity regarding peptide-to-protein inference), and it offers limited support for small molecule (metabolomic) data. The format has structures for representing replicate MS runs, grouping of replicates (for example, as study variables), and capturing the parameters used by software packages to arrive at these values. The format has the capability to reference other standards such as mzML and mzIdentML, and thus the evidence trail for the MS workflow as a whole can now be described. Several software implementations are available, and we encourage other bioinformatics groups to use mzQuantML as an input, internal, or output format for quantitative software and for structuring local repositories. All project resources are available in the public domain from the HUPO Proteomics Standards Initiative http://www.psidev.info/mzquantml. The Proteomics Standards Initiative (PSI) has been working for ten years to improve the reporting and standardization of proteomics data.
The PSI has published minimum reporting guidelines, called MIAPE (Minimum Information about a Proteomics Experiment) documents, for MS-based proteomics (1) and molecular interactions (2), as well as data standards for raw/processed MS data in mzML (3), peptide and protein identifications in mzIdentML (4), transitions for selected reaction monitoring analysis in TraML (5), and molecular interactions in PSI-MI format (6). Standards are particularly important for quantitative proteomics research, because the associated bioinformatics analysis is highly challenging as a result of the range of different experimental techniques for deriving abundance values for proteins using MS. The techniques can be broadly divided into those based on (i) differential labeling, in which a metabolic label or chemical tag is applied to cells, peptides, or proteins, samples are mixed, and intensity signals for peptide ions are compared within single MS runs; or (ii) label-free methods in which MS runs occur in parallel and bioinformatics methods are used to extract intensity signals, ensuring that like-for-like signals are compared between runs (7). In most label-based and label-free approaches, peptide ratios or abundance values must be summarized in order for one to arrive at relative protein abundance values, taking into account ambiguity in peptide-to-protein inference. Absolute protein abundance values can typically be derived only using internal standards spiked into samples of known abundance (8, 9). The PSI has recently developed a MIAPE-Quant document defining and describing the minimal information necessary in order to judge or repeat a quantitative proteomics experiment. Software packages tend to report peptide or protein abundance values in a bespoke format, often as tab or comma separated values, for import into spreadsheet software.
In complementary work, the PSI has developed a standard format for capturing these final results in a standardized tab separated value format, called mzTab, suitable for post-processing and visualization in end-user tools such as Microsoft Excel or the R programming language. The final results of a quantitative analysis are sufficient for many purposes, such as performing statistical analysis to determine differential expression or cluster analysis to find co-expressed proteins. However, mzTab (or similar bespoke formats) was not designed to hold a trace of how the peptide and protein abundance values were calculated from MS data (i.e. metadata is lost that might be crucial for other tasks). For example, most quantitative software packages detect and quantify so-called “features” (representing all ions collected for a given peptide) in two-dimensional MS data, where the two dimensions are retention time from liquid chromatography (LC) and mass over charge (m/z). Without capturing the two-dimensional coordinates of the features, it is not possible to write visualization software showing exactly what the software has quantified; researchers have to trust that the software has accurately quantified all ions from isotopes of a given peptide, excluding any overlapping ions derived from other peptides. The history of proteomics research has been one in which studies of highly variable quality have been published. There is also little quality control or benchmarking performed on quantitative software (10), meaning it is difficult to make quality judgments on a set of peptide and protein abundance values. The PSI has recently developed mzML, which can capture raw or processed MS data in a vendor neutral format, and the mzIdentML standard, to capture search engine results and the important metadata (such as software parameters), such that peptide and protein identification data can be interpreted consistently. 
These two standards are now being used for data sharing and to support open source software development, so that informatics groups can focus on algorithmic development rather than file format conversions. Until now, there has been no widely used open source format or data standard for capturing metadata and data relating to the quantitation step of analysis pipelines. In this work, we report the mzQuantML standard from the PSI, which has recently completed the PSI standardization process (11), from which version 1.0 was released. We believe that quantitative proteomics research will benefit from improved capabilities for tracing what manipulations have happened to data at each stage of the analysis process. The mzQuantML standard has been designed to store quantitative values calculated for features, peptides, proteins, and/or protein groups (where there is ambiguity in protein inference), plus associated software parameters. It has also been designed to accommodate small molecule data to improve interoperability with metabolomics investigations. The format can represent experimental replicates and grouping of replicates, and it has been designed via an open and transparent process.
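
The layered evidence trail described in this entry (features in retention time versus m/z space, summarized into peptides, summarized in turn into protein groups) can be pictured with a small in-memory model. The Python sketch below is a hypothetical illustration, not the mzQuantML schema: the class names, field names, identifiers, and intensity values are all invented for the example, and summing intensities is just one possible summarization rule.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """A quantified region in retention time versus m/z space."""
    feature_id: str
    rt_seconds: float
    mz: float
    charge: int
    abundance_by_run: dict  # MS run id -> intensity


@dataclass
class Peptide:
    sequence: str
    feature_ids: list  # features that support this peptide


@dataclass
class ProteinGroup:
    accessions: list  # more than one accession means ambiguous inference
    peptide_sequences: list


def peptide_abundance(peptide, features, run_id):
    """Sum the intensities of a peptide's features in one run (assumed rule)."""
    return sum(features[fid].abundance_by_run.get(run_id, 0.0)
               for fid in peptide.feature_ids)


def group_abundance(group, peptides, features, run_id):
    """Sum peptide abundances for a protein group in one run (assumed rule)."""
    return sum(peptide_abundance(peptides[seq], features, run_id)
               for seq in group.peptide_sequences)


def evidence_trail(group, peptides, features):
    """Trace a protein group back to the 2-D coordinates of its features."""
    return [(seq, fid, features[fid].rt_seconds, features[fid].mz)
            for seq in group.peptide_sequences
            for fid in peptides[seq].feature_ids]


# Invented example data: two runs, three features, two peptides, one group.
features = {
    "f1": Feature("f1", 1200.5, 450.25, 2, {"run1": 100.0, "run2": 80.0}),
    "f2": Feature("f2", 1201.0, 300.50, 3, {"run1": 50.0}),
    "f3": Feature("f3", 1850.0, 520.75, 2, {"run1": 25.0, "run2": 30.0}),
}
peptides = {
    "PEPTIDEK": Peptide("PEPTIDEK", ["f1", "f2"]),
    "SAMPLER": Peptide("SAMPLER", ["f3"]),
}
group = ProteinGroup(["P1", "P2"], ["PEPTIDEK", "SAMPLER"])

run1_total = group_abundance(group, peptides, features, "run1")
run2_total = group_abundance(group, peptides, features, "run2")
trail = evidence_trail(group, peptides, features)
print(run1_total, run2_total)
for row in trail:
    print(row)
```

Keeping the feature coordinates alongside the summarized values is what makes it possible to show exactly which signals contributed to a reported protein abundance, which is the gap in final-results formats that the entry describes.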

20.
