Similar literature
1.
Post‐translational modifications (PTMs) of proteins are central in any kind of cellular signaling. Modern mass spectrometry technologies enable comprehensive identification and quantification of various PTMs. Given the increased numbers and types of mapped protein modifications, a database is necessary that simultaneously integrates and compares site‐specific information for different PTMs, especially in plants for which the available PTM data are poorly catalogued. Here, we present the Plant PTM Viewer (http://www.psb.ugent.be/PlantPTMViewer), an integrative PTM resource that comprises approximately 370 000 PTM sites for 19 types of protein modifications in plant proteins from five different species. The Plant PTM Viewer provides the user with a protein sequence overview in which the experimentally evidenced PTMs are highlighted together with an estimate of the confidence by which the modified peptides and, if possible, the actual modification sites were identified and with functional protein domains or active site residues. The PTM sequence search tool can query PTM combinations in specific protein sequences, whereas the PTM BLAST tool searches for modified protein sequences to detect conserved PTMs in homologous sequences. Taken together, these tools help to assume the role and potential interplay of PTMs in specific proteins or within a broader systems biology context. The Plant PTM Viewer is an open repository that allows the submission of mass spectrometry‐based PTM data to remain at pace with future PTM plant studies.

2.
Pharmacogenomics is the study of the genetic basis for individual variation in response to drugs and other xenobiotics. Successful prediction of effects of genetic variations that change encoded amino acid sequences on protein function and their consequent biomedical implications depends on three-dimensional (3D) structures of the encoded amino acid sequences. To bridge sequence to function, thus facilitating an in-depth pharmacogenomic study, we tested the feasibility of the use of a semi-computational approach to predict 3D structures of rabbit and human indolethylamine N-methyltransferases (INMTs) from their amino acid sequences, which share less than 26% sequence identity with known protein 3D structures. Herein, we report 3D models of INMTs predicted by using the crystal structure of rat catechol O-methyltransferase as a template, testing of the models both computationally and experimentally, and successful use of the models in retrospective prediction of the effects of genetic polymorphisms and in identification of residues that contribute to observed species-specific differences in substrate affinity. The results encourage the use of the semi-computational approach to predict 3D protein structures for use in pharmacogenomic studies when de novo prediction of protein 3D structures from their amino acid sequences is still not feasible and X-ray crystallography and/or solution nuclear magnetic resonance spectroscopy can only determine 3D structures for a small number of known amino acid sequences. Electronic Supplementary Material available.

3.
4.
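Post-translational modifications (PTMs) play a decisive role in protein maturation and in the diversity of protein structure and function. However, because PTMs are diverse, widespread, and dynamic, traditional biochemical methods offer only a limited understanding of them at the global level, and research on them, particularly at large scale, long progressed slowly. Today, building on experimental work, a range of bioinformatics methods makes it possible to predict and identify PTMs rapidly and at high throughput. On the one hand, one can start from the sequence: based on the specificity with which enzymes recognize their substrates, algorithms such as position weight matrices and support vector machines extract modification-related conserved sequences from substrate proteins and use them to predict PTM sites. This approach is relatively mature and can achieve fairly good prediction accuracy, but it cannot reflect the modification state at different times or in different cells. On the other hand, one can start from the analysis of mass spectrometry data, which promises to capture the dynamics of PTMs within the cell. The high sensitivity, accuracy, and throughput of mass spectrometry have made MS-based proteomics an important tool for studying PTMs, and combining bioinformatics methods with MS-based proteomics can further accelerate PTM research. This article summarizes and evaluates recent progress in PTM-related bioinformatics methods from both the sequence and the mass spectrometry perspectives, with a focus on new ideas for identifying PTMs from mass spectrometry data.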
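
The sequence-based route described in this abstract (a position weight matrix built from known substrate windows) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the training windows are invented toy data, the background is taken to be uniform over the 20 amino acids, and the names `build_pwm` and `score_window` are hypothetical, not taken from any tool the review covers.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(windows, pseudocount=1.0):
    """Build a log-odds position weight matrix from equal-length windows.

    Assumption: uniform background frequency (1/20) for every amino acid.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        pwm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one candidate window."""
    return sum(column[aa] for column, aa in zip(pwm, window))


# Invented windows centred on a modified serine (toy data, not real sites).
toy_sites = ["RRASV", "RKASL", "RRGSI", "KRASV"]
pwm = build_pwm(toy_sites)
motif_like = score_window(pwm, "RRASV")
unrelated = score_window(pwm, "GGGGG")
```

A window resembling the training motif scores higher than an unrelated one; in practice a score threshold would have to be calibrated on held-out sites, which this sketch does not attempt.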

5.
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)—a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits—conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit–N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. 
These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. Post-translational modifications (PTMs) are a rapidly expanding and important class of protein feature that broaden the functional diversity of proteins in a proteome. By definition, PTMs change protein structure and therefore have the potential to affect protein function by altering protein interactions, protein stability or catalytic activity (1, 2). As they have been found to occur on nearly every protein in the eukaryotic proteome, PTMs broadly impact nearly all known cellular processes. Over 300 different types of PTM are known, ranging from single atom modifications (e.g. oxide) to small protein modifiers (e.g. ubiquitin), which can occur on all but five amino acid residues resulting from enzymatic or nonenzymatic processes (3). Over 220,000 distinct PTM sites have been experimentally identified across ∼77,000 different proteins to date (dbPTM; http://dbptm.mbc.nctu.edu.tw/statistics.php) – numbers that continue to grow exponentially because of improved methods for high throughput detection by mass spectrometry (MS). By virtue of how they are detected, most PTM data are sequence-linked and lack structural context. The function of most PTMs is unknown because the rate of PTM detection far surpasses the rate at which any one modification can be studied empirically. Moreover, the functional impact of every PTM is likely not equivalent (4). For example, computational analysis of phosphorylation sites in yeast and human proteomes indicates that well-conserved phosphosites are more likely to have a functional consequence compared with poorly conserved sites, yet only a fraction of phosphosites are well conserved (5, 6).
Consequently, the development of tools that provide functional prioritization of PTMs could have a broad impact on our understanding of protein regulation, biological mechanism, and molecular evolution. The emerging need for methods that predict the functional impact of a PTM has not yet been met. Longstanding methods capitalize predominantly on the sequence context of PTMs and have been used to predict sites of modification (expasy.org/proteomics/post-translational_modification) and to compare enzyme/substrate interactions (7–9). More recently, studies aimed at expanding the parameters associated with functional PTMs have emerged. In these cases, a set of common features correlated with functional importance are derived from the analysis of PTMs within and between organisms including: number of PTM observations at a multiple sequence alignment position (i.e. hotspots), measures of co-occurrence between different PTMs (e.g. distance between phosphorylation and ubiquitination sites), biological dynamics (up or down-regulation), and protein–protein interaction influence (7, 10–12). Recent efforts to provide structural context by linking individual PTMs to three-dimensional structures in the Protein Data Bank (PDB) have also been described (13, 14). However, these resources are extensions of existing PTM databases that allow visualization of single instances of modification onto individual proteins, but do not provide quantitative or analytical value. In principle, combining PTM hotspot and structural analysis would offer multiple advantages over any one approach used in isolation. Sequence homology provides protein family membership, thereby clustering PTMs into hotspots for groups of proteins to provide information about: (1) the evolutionary conservation and (2) observation frequencies of PTMs within the family.
A primary consequence of their sequence homology is that members of a protein family will exhibit similar structures and protein interactions, features that dictate the function of protein systems. A secondary consequence is that PTM hotspots generated by alignment can be projected onto family-representative protein structures, which places each PTM hotspot into a three-dimensional context that can be visualized for each family. The structural context enabled by this projection can also provide spatial information about the PTM site that can supplement the sequence characteristics of the hotspot, namely: (3) solvent accessibility, which provides an estimate of whether a modification could occur on the folded protein; and (4) protein interface residence, which indicates the potential of the PTM to disrupt protein–protein interactions. Despite the theoretical advantages, no single tool has been developed that exploits the quantitative output from both sequence and structural data to evaluate the function potential of PTMs. Here we describe a new analytical method – Structural Analysis of PTM Hotspots (SAPH-ire), which ranks PTM hotspots by their potential to impact biological function for distinct protein families (Fig. 1). We demonstrate the application of SAPH-ire to the complete set of PTMs for eight distinct protein families including large heterotrimeric G proteins, revealing high-ranking hotspots for which a biological function has not yet been determined. In particular, SAPH-ire revealed the N-terminal tail (Nt) of G protein gamma (Gγ) subunits as one of the highest ranking PTM hotspots for heterotrimeric G proteins (Gα, Gβ, and Gγ). We tested this prediction by monitoring the phosphorylation state and mutation effects of phosphorylation sites in the N terminus of the yeast Gγ subunit (Ste18).
Consistent with SAPH-ire predictions, we found that phosphorylation of Ste18-Nt is biologically responsive to a GPCR stimulus and that phospho-null or phospho-mimic mutation of these sites controls protein abundance in an opposite manner in vivo. Thus, SAPH-ire is a powerful new method for predicting the function potential of PTM hotspots, which can guide empirical research toward the discovery of new protein regulatory elements based on high-throughput proteomics. Fig. 1. Schematic diagram of the SAPH-ire method. A, SAPH-ire integrates InterPro, the Protein Data Bank (PDB) and a customized database of experimentally validated PTMs. UniProt entries with PTMs that belong to specific InterPro-classified protein families undergo multiple-sequence alignment (MSA) and PTM hotspot analysis (HSA), which layers all PTMs for a given alignment position in the MSA. The total PTMs observed in each hotspot and the conservation of a modifiable residue (e.g. conservation of lysine at a ubiquitination hotspot) at the hotspot are quantified. B, PTM hotspots within the protein family are then projected onto all known crystal structures for the family using the Structural Projection of PTMs (SPoP) tool. From the structural topology of PTM hotspots generated by SPoP, the solvent accessible surface area (SASA) and protein interface residence is quantified for each hotspot. C, PTM Function Potential Calculator (FPC) integrates the output from HSA and SPoP, resulting in PTM function potential scores for each hotspot. The function potential score can be used to rank PTM hotspots within or between protein families – prioritizing hotspots with the greatest potential to be biologically regulated and/or effect a biological function for the protein family of interest.
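
The hotspot analysis step in the figure legend (layering all PTMs at one alignment position and quantifying conservation of a modifiable residue) might look roughly like the sketch below. It is not the SAPH-ire implementation: the alignment and PTM positions are invented toy data, the function names are hypothetical, and "conservation" is simplified here to the fraction of family members carrying a residue from an assumed modifiable set (`STY` for phosphorylation).

```python
from collections import defaultdict


def residue_to_column(aligned_seq):
    """Map 1-based ungapped residue positions to 0-based alignment columns."""
    mapping = {}
    residue_index = 0
    for column, char in enumerate(aligned_seq):
        if char != "-":
            residue_index += 1
            mapping[residue_index] = column
    return mapping


def ptm_hotspots(alignment, ptm_sites, modifiable="STY"):
    """Count PTM observations per alignment column and score conservation.

    alignment: dict of name -> gapped sequence (all the same length)
    ptm_sites: dict of name -> list of 1-based modified residue positions
    """
    counts = defaultdict(int)
    for name, positions in ptm_sites.items():
        mapping = residue_to_column(alignment[name])
        for pos in positions:
            counts[mapping[pos]] += 1
    n_seqs = len(alignment)
    return {
        column: {
            "ptm_count": n_ptms,
            # Fraction of family members with a modifiable residue here.
            "conservation": sum(
                seq[column] in modifiable for seq in alignment.values()
            ) / n_seqs,
        }
        for column, n_ptms in counts.items()
    }


# Invented three-member "family" (toy data, not real proteins).
toy_alignment = {
    "protA": "MS-TPLK",
    "protB": "MSATP-K",
    "protC": "MA-SPLK",
}
toy_ptms = {"protA": [2, 3], "protB": [2], "protC": [3]}
hotspots = ptm_hotspots(toy_alignment, toy_ptms)
```

Structural terms such as solvent accessibility and interface residence would need coordinates from a structure file and are deliberately left out of this sketch.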

6.
Mitogen-activated protein (MAP) kinase-mediated phosphorylation of specific residues in tyrosine hydroxylase leads to an increase in enzyme activity. However, the mechanism whereby phosphorylation affects enzyme turnover is not well understood. We used a combination of fluorescence resonance energy transfer (FRET) measurements and molecular dynamics simulations to explore the conformational free energy landscape of a 10-residue MAP kinase substrate found near the N terminus of the enzyme. This region is believed to be part of an autoregulatory sequence that overlies the active site of the enzyme. FRET was used to measure the effect of phosphorylation on the ensemble of peptide conformations, and molecular dynamics simulations generated free energy profiles for both the unphosphorylated and phosphorylated peptides. We demonstrate how FRET transfer efficiencies can be calculated from molecular dynamics simulations. For both the unphosphorylated and phosphorylated peptides, the calculated FRET efficiencies are in excellent agreement with the experimentally determined values. Moreover, the FRET measurements and molecular simulations suggest that phosphorylation causes the peptide backbone to change direction and fold into a compact structure relative to the unphosphorylated state. These results are consistent with a model of enzyme activation where phosphorylation of the MAP kinase substrate causes the N-terminal region to adopt a compact structure away from the active site. The methods we employ provide a general framework for analyzing the accessible conformational states of peptides and small molecules. Therefore, they are expected to be applicable to a variety of different systems.
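
One common way to estimate a FRET efficiency from simulation frames is to apply the Förster relation E = 1 / (1 + (r/R0)^6) to each donor-acceptor distance and average over the trajectory. The sketch below assumes exactly that and nothing more: a fixed Förster radius `R0` (so orientation effects are ignored), invented distances in ångströms, and hypothetical function names. The abstract does not state the authors' actual procedure, which may differ.

```python
def fret_efficiency(distance, forster_radius):
    """Foerster transfer efficiency for a single donor-acceptor distance."""
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def mean_fret_efficiency(distances, forster_radius):
    """Average the per-frame efficiency over an ensemble of distances."""
    if not distances:
        raise ValueError("need at least one distance")
    total = sum(fret_efficiency(r, forster_radius) for r in distances)
    return total / len(distances)


R0 = 20.0  # assumed Foerster radius in angstroms (illustrative value only)

# Invented per-frame distances for an extended and a compact ensemble.
extended_frames = [24.0, 26.5, 25.0, 27.0]
compact_frames = [14.0, 15.5, 13.0, 16.0]

extended_E = mean_fret_efficiency(extended_frames, R0)
compact_E = mean_fret_efficiency(compact_frames, R0)
```

Under these assumptions a more compact ensemble gives a higher average efficiency, which is the qualitative direction the abstract reports for the phosphorylated peptide.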

7.
Typically, detection of protein sequences in a collision-induced dissociation (CID) tandem MS (MS2) dataset is performed by mapping identified peptide ions back to protein sequences using a protein database search (PDS) engine. Finding a particular peptide sequence of interest in CID MS2 records very often requires manual evaluation of the spectrum, regardless of whether the peptide-associated MS2 scan is identified by the PDS algorithm or not. We have developed a compact cross-platform database-free command-line utility, pepgrep, which helps to find an MS2 fingerprint for a selected peptide sequence by pattern-matching of modelled MS2 data using a Peptide-to-MS2 scoring algorithm. pepgrep can incorporate dozens of mass offsets corresponding to a variety of post-translational modifications (PTMs) into the algorithm. Decoy peptide sequences are used with the tested peptide sequence to reduce false-positive results. The engine is capable of screening an MS2 data file at a high rate when using a cluster computing environment. The matched MS2 spectrum can be displayed using the built-in graphical application programming interface (API) or optionally recorded to file. Using this algorithm, we were able to find extra peptide sequences in studied CID spectra that were missed by PDS identification. We also found pepgrep especially useful for examining a CID of small fractions of peptides resulting from, for example, affinity purification techniques. The peptide sequences in such samples are less likely to be positively identified using the routine protein-centric algorithms implemented in PDS. The software is freely available at http://bsproteomics.essex.ac.uk:8080/data/download/pepgrep-1.4.tgz.
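
To illustrate how a PTM mass offset changes the modelled MS2 data for a peptide, the sketch below computes singly charged b- and y-ion m/z values from standard monoisotopic residue masses and adds an offset at a chosen residue. It is not pepgrep's Peptide-to-MS2 scoring algorithm, and it does not use pepgrep's interface; `fragment_ions` and its `offsets` argument are hypothetical names used only for this example.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
PHOSPHO = 79.96633  # phosphorylation mass offset (HPO3)


def fragment_ions(peptide, offsets=None):
    """Return (b_ions, y_ions) as singly charged m/z lists.

    offsets: optional dict of 1-based residue position -> mass offset (Da).
    """
    offsets = offsets or {}
    masses = [
        MONO[aa] + offsets.get(i, 0.0)
        for i, aa in enumerate(peptide, start=1)
    ]
    b_ions, running = [], 0.0
    for mass in masses[:-1]:            # b1 .. b(n-1), from the N terminus
        running += mass
        b_ions.append(running + PROTON)
    y_ions, running = [], 0.0
    for mass in reversed(masses[1:]):   # y1 .. y(n-1), from the C terminus
        running += mass
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


plain_b, plain_y = fragment_ions("GAS")
phospho_b, phospho_y = fragment_ions("GAS", {3: PHOSPHO})
```

With the offset on the C-terminal serine, every y ion shifts by the phosphate mass while the b ions are unchanged; a matching tool would compare such modelled peaks against observed spectra within a mass tolerance.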

8.
In eukaryotes, GPI (glycosylphosphatidylinositol) lipid anchoring of proteins is an abundant post-translational modification. The attachment of the GPI anchor is mediated by GPI-T (GPI transamidase), a multimeric, membrane-bound enzyme located in the ER (endoplasmic reticulum). Upon modification, GPI-anchored proteins enter the secretory pathway and ultimately become tethered to the cell surface by association with the plasma membrane and, in yeast, by covalent attachment to the outer glucan layer. This work demonstrates a novel in vivo assay for GPI-T. Saccharomyces cerevisiae INV (invertase), a soluble secreted protein, was converted into a substrate for GPI-T by appending the C-terminal 21 amino acid GPI-T signal sequence from the S. cerevisiae Yapsin 2 [Mkc7p (Y21)] on to the C-terminus of INV. Using a colorimetric assay and biochemical partitioning, extracellular presentation of GPI-anchored INV was shown. Two human GPI-T signal sequences were also tested and each showed diminished extracellular INV activity, consistent with lower levels of GPI anchoring and species specificity. Human/fungal chimaeric signal sequences identified a small region of five amino acids that was predominantly responsible for this species specificity.

9.
Post-translational modifications (PTMs) are required for proper folding of many proteins. The low capacity for PTMs hinders the production of heterologous proteins in the widely used prokaryotic systems of protein synthesis. Until now, a systematic and comprehensive study concerning the specific effects of individual PTMs on heterologous protein synthesis has not been presented. To address this issue, we expressed 1488 human proteins and their domains in a bacterial cell-free system, and we examined the correlation of the expression yields with the presence of multiple PTM sites bioinformatically predicted in these proteins. This approach revealed a number of previously unknown statistically significant correlations. Prediction of some PTMs, such as myristoylation, glycosylation, palmitoylation, and disulfide bond formation, was found to significantly worsen protein amenability to soluble expression. The presence of other PTMs, such as aspartyl hydroxylation, C-terminal amidation, and Tyr sulfation, did not correlate with the yield of heterologous protein expression. Surprisingly, the predicted presence of several PTMs, such as phosphorylation, ubiquitination, SUMOylation, and prenylation, was associated with the increased production of properly folded soluble proteins. The plausible rationales for the existence of the observed correlations are presented. Our findings suggest that identification of potential PTMs in polypeptide sequences can be of practical use for predicting expression success and optimizing heterologous protein synthesis. In sum, this study provides the most compelling evidence so far for the role of multiple PTMs in the stability and solubility of heterologously expressed recombinant proteins.

10.
The substrate specificity of furin, a mammalian enzyme involved in the cleavage of many constitutively expressed protein precursors, was studied using substrate phage display. In this method, a multitude of substrate sequences are displayed as fusion proteins on filamentous phage particles and ones that are cleaved can be purified by affinity chromatography. The cleaved phage are propagated and submitted to additional rounds of protease selection to further enrich for good substrates. DNA sequencing of the cleaved phage is used to identify the substrate sequence. After 6 rounds of sorting a substrate phage library comprising 5 randomized amino acids (xxxxx), virtually all clones had an RxxR motif and many had Lys, Arg, or Pro before the second Arg. Nine of the selected sequences were assayed using a substrate-alkaline phosphatase fusion protein system. All were cleaved after the RxxR, and some substrates with Pro or Thr in P2 were also found to be cleaved as efficiently as RxKR or RxRR. To further elaborate surrounding determinants, we constructed 2 secondary libraries (xxRx(K/R)Rx and xxRxPRx). Although no consensus developed for the latter library, many of the sequences in the former library had the 7-residue motif (L/P)RRF(K/R)RP, suggesting that the furin recognition sequence may extend over more than 4 residues. These studies further clarify the substrate specificity of furin and suggest the substrate phage method may be useful for identifying consensus substrate motifs in other protein processing enzymes.
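
The two motifs named in this abstract, the minimal RxxR and the extended (L/P)RRF(K/R)RP, translate directly into regular expressions. The sketch below scans a sequence for overlapping RxxR matches and checks for the 7-residue motif; the test sequences are invented, the function names are hypothetical, and a motif match is of course only a candidate site, not evidence of cleavage.

```python
import re

# Lookahead so that overlapping RxxR occurrences are all reported.
RXXR = re.compile(r"(?=(R[A-Z]{2}R))")
# 7-residue motif from the secondary library: (L/P)RRF(K/R)RP.
EXTENDED = re.compile(r"[LP]RRF[KR]RP")


def find_rxxr(sequence):
    """Return (0-based start, matched motif) for every RxxR occurrence."""
    return [(m.start(1), m.group(1)) for m in RXXR.finditer(sequence)]


def has_extended_motif(sequence):
    """True if the sequence contains the 7-residue (L/P)RRF(K/R)RP motif."""
    return EXTENDED.search(sequence) is not None


# Invented example sequences (not real furin substrates).
simple_hits = find_rxxr("AARAKRSS")
extended_hits = find_rxxr("GLRRFKRPG")
extended_present = has_extended_motif("GLRRFKRPG")
```

Reporting the start index makes it straightforward to derive the predicted cleavage position, which by the abstract's description lies after the second Arg of the RxxR match.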

11.
We present a statistical-mechanical selection theory for the sequence analysis of a set of specific DNA regulatory sites that makes it possible to predict the relationship between individual base-pair choices in the site and specific activity (affinity). The theory is based on the assumption that specific DNA sequences have been selected to conform to some requirement for protein binding (or activity), and that all sequences that can fulfil this requirement are equally likely to occur. In most cases, the number of specific DNA sequences that are known for a certain DNA-binding protein is very small, and we discuss in detail the small-sample uncertainties that this leads to. When applied to the binding sites for cro repressor in phage lambda, the theory can predict, from the sequence statistics alone, their rank order binding affinities in reasonable agreement with measured values. However, the statistical uncertainty generated by such a small sample (only 6 sites known) limits the result to order-of-magnitude comparisons. When applied to the much larger sample of Escherichia coli promoter sequences, the theory predicts the correlation between in vitro activity (k2KB values) and homology score (closeness to the consensus sequence) observed by Mulligan et al. (1984). The analysis of base-pair frequencies in the promoter sample is consistent with the assumption that base-pairs at different positions in the sites contribute independently to the specific activity, except in a few marginal cases that are discussed. When the promoter sites are ordered according to predicted activities, they seem to conform to the Gaussian distribution that results from a requirement for maximal sequence variability within the constraint of providing a certain average activity. The theory allows us to compare the number of specific sites with a certain activity to the number that would be expected from random occurrence in the genome. 
While strong promoters are "overspecified", in the sense that their probability of random occurrence is very low, random sequences with weak promoter-like properties are expected to occur in very large numbers. This leads to the conclusion that functional specificity is based on other properties in addition to primary sequence recognition; some possibilities are discussed. Finally, we show that the sequence information, as defined by Schneider et al. (1986), can be used directly (at least in the case of equilibrium binding sites) to estimate the number of protein molecules that are specifically bound at random "pseudosites" in the genome. (ABSTRACT TRUNCATED AT 400 WORDS)
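
The sequence information referred to here is commonly computed per position as 2 bits minus the Shannon entropy of the observed base frequencies, summed over the site. The sketch below does that for a handful of aligned sites. It assumes equiprobable background bases and omits the small-sample correction, which matters a great deal for samples as small as those the abstract discusses; the sites are invented and the function name is hypothetical.

```python
import math
from collections import Counter


def information_per_position(sites):
    """Information content in bits at each position of aligned DNA sites.

    Assumptions: equiprobable background (2 bits maximum per position)
    and no small-sample correction.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    n_sites = len(sites)
    bits = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        entropy = -sum(
            (c / n_sites) * math.log2(c / n_sites) for c in counts.values()
        )
        bits.append(2.0 - entropy)
    return bits


# Invented aligned binding sites (toy data).
toy_sites = ["TATA", "TATT", "TACA", "TAGA"]
per_position = information_per_position(toy_sites)
total_bits = sum(per_position)
```

Fully conserved positions contribute the maximum of 2 bits, while variable positions contribute less; with only four sites, the uncorrected values overestimate the true information.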

12.
Baoqiang Cao, Ron Elber. Proteins, 2010, 78(4): 985-1003
We investigate small sequence adjustments (of one or a few amino acids) that induce large conformational transitions between distinct and stable folds of proteins. Such transitions are intriguing from evolutionary and protein‐design perspectives. They make it possible to search for ancient protein structures or to design protein switches that flip between folds and functions. A network of sequence flow between protein folds is computed for representative structures of the Protein Data Bank. The computed network is dense: on average, each structure is connected to tens of other folds. Proteins that attract sequences from a higher than expected number of neighboring folds are more likely to be enzymes and alpha/beta fold. The large number of connections between folds may reflect the need of enzymes to adjust their structures for alternative substrates. The network of the Cro family is discussed, and we speculate that capacity is an important factor (but not the only one) that determines protein evolution. The experimentally observed flip from all alpha to alpha + beta fold is examined by the network tools. A kinetic model for the transition of sequences between the folds (with only protein stability in mind) is proposed. Proteins 2010. © 2009 Wiley‐Liss, Inc.

13.
XopD (Xanthomonas outer protein D), a type III secreted effector from Xanthomonas campestris pv. vesicatoria, is a desumoylating enzyme with strict specificity for its plant small ubiquitin-like modifier (SUMO) substrates. Based on SUMO sequence alignments and peptidase assays with various plant, yeast, and mammalian SUMOs, we identified residues in SUMO that contribute to XopD/SUMO recognition. Further predictions regarding the enzyme/substrate specificity were made by solving the XopD crystal structure. By incorporating structural information with sequence alignments and enzyme assays, we were able to elucidate determinants of the rigid SUMO specificity exhibited by the Xanthomonas virulence factor XopD.

14.
Molecular recognition features (MoRFs) are intrinsically disordered protein regions that bind to partners via disorder‐to‐order transitions. In one‐to‐many binding, a single MoRF binds to two or more different partners individually. MoRF‐based one‐to‐many protein–protein interaction (PPI) examples were collected from the Protein Data Bank, yielding 23 MoRFs bound to 2–9 partners, with all pairs of same‐MoRF partners having less than 25% sequence identity. Of these, 8 MoRFs were bound to 2–9 partners having completely different folds, whereas 15 MoRFs were bound to 2–5 partners having the same folds but with low sequence identities. For both types of partner variation, backbone and side chain torsion angle rotations were used to bring about the conformational changes needed to enable close fits between a single MoRF and distinct partners. Alternative splicing events (ASEs) and posttranslational modifications (PTMs) were also found to contribute to distinct partner binding. Because ASEs and PTMs both commonly occur in disordered regions, and because both ASEs and PTMs are often tissue‐specific, these data suggest that MoRFs, ASEs, and PTMs may collaborate to alter PPI networks in different cell types. These data enlarge the set of carefully studied MoRFs that use inherent flexibility and that also use ASE‐based and/or PTM‐based surface modifications to enable the same disordered segment to selectively associate with two or more partners. The small number of residues involved in MoRFs and in their modifications by ASEs or PTMs may simplify the evolvability of signaling network diversity.

15.
Bottom-up proteomics largely relies on tryptic peptides for protein identification and quantification. Tryptic digestion often provides limited coverage of protein sequence because of issues such as peptide length, ionization efficiency, and post-translational modification colocalization. Unfortunately, a region of interest in a protein, for example, because of proximity to an active site or the presence of important post-translational modifications, may not be covered by tryptic peptides. Detection limits, quantification accuracy, and isoform differentiation can also be improved with greater sequence coverage. Selected reaction monitoring (SRM) would also greatly benefit from being able to identify additional targetable sequences. In an attempt to improve protein sequence coverage and to target regions of proteins that do not generate useful tryptic peptides, we deployed a multiprotease strategy on the HeLa proteome. First, we used seven commercially available enzymes in single, double, and triple enzyme combinations. A total of 48 digests were performed. 5223 proteins were detected by analyzing the unfractionated cell lysate digest directly, with 42% mean sequence coverage. Additional strong-anion exchange fractionation of the most complementary digests permitted identification of over 3000 more proteins, with improved mean sequence coverage. We then constructed a web application (https://proteomics.swmed.edu/confetti) that allows the community to examine a target protein or protein isoform in order to discover the enzyme or combination of enzymes that would yield peptides spanning a certain region of interest in the sequence. Finally, we examined the use of nontryptic digests for SRM. From our strong-anion exchange fractionation data, we were able to identify three or more proteotypic SRM candidates within a single digest for 6056 genes.
Surprisingly, in 25% of these cases the digest producing the most observable proteotypic peptides was neither trypsin nor Lys-C. SRM analysis of Asp-N versus tryptic peptides for eight proteins determined that Asp-N yielded higher signal in five of eight cases. Mass spectrometry-based proteomics provides various tools to detect and quantify changes in protein expression or post-translational modifications (PTMs). In bottom-up proteomics, these analyses typically involve using peptides derived from the tryptic digestion of proteins. Although trypsin is a robust enzyme and provides peptides suitable for mass spectrometry, not all sequences are detectable by this approach (1). Sequences may be missed because of the limited number and uneven distribution of lysine and arginine residues throughout a protein sequence. Tryptic coverage of interesting regions of sequence, such as trans-membrane domains that may contain notable PTMs, is often incomplete (2). Sequence coverage greater than that offered by trypsin is a requirement for many studies (3). Missing sequence coverage can also adversely affect analysis by selected reaction monitoring (SRM). Although SRM has emerged in recent years as a highly sensitive and accurate method for protein detection and quantification (4), it is sometimes hampered by the limited number of targetable peptides (primarily tryptic peptides) available in public databases. Improving amino acid sequence coverage would provide more targets for SRM assay development, facilitating protein quantification and the ability to target specific isoforms or sequence regions of interest. Fractionation is commonly employed to increase protein identifications and improve sequence coverage, but introduces a number of complexities. Separation of proteins or peptides significantly increases the number of samples to analyze and the amount of data to process.
Species may be present in multiple fractions or in different fractions in different runs, which makes quantitative analysis with techniques like SRM difficult. However, SRM has sufficient sensitivity that peptides identified in fractionated discovery experiments are often targetable in whole lysate (5). One approach to increase sequence coverage without fractionation or purification is to use proteases other than trypsin for digestion (6, 7). In recent years, there has been a surge in the use of alternative proteases to improve sequence coverage. Biringer et al. demonstrated in 2006 that combining the MS data from tryptic and Glu-C digestions of human cerebrospinal fluid (CSF) resulted in increased protein identifications. Sequence coverage also improved versus individual enzyme digests, though this was shown only for the 38 most confidently identified proteins (8). In 2010, Swaney et al. expanded the multi-enzyme approach to five specific proteases (trypsin, Lys-C, Arg-C, Asp-N, and Glu-C) and showed that although this method only modestly increases the number of protein IDs, it significantly increases the average sequence coverage (from 24.5% to 43.4%) (9). The most comprehensive coverage of a human cell line to date was reported by Nagaraj et al., in which in-depth proteomics with two levels of prefractionation and analysis using trypsin, Lys-C, and Glu-C was carried out for the HeLa cell line. A total of 10,255 proteins and 166,420 peptides were identified (10). However, none of these studies investigated the use of consecutive enzymatic digestion on a sample. The Mann laboratory recently introduced a strategy, using consecutive digestion in conjunction with filter-aided sample preparation (FASP), for two-step and three-step digestions with various combinations of trypsin, Lys-C, Glu-C, Arg-C, and Asp-N (11). The consecutive use of Lys-C and trypsin enabled the identification of up to 40% more proteins and phosphorylation sites in comparison to trypsin alone.
However, a systematic study of all common commercially available proteases for comprehensive mapping of the human proteome has not yet been performed. These prior studies have clearly shown the ability of tandem and parallel protease digestion to improve protein ID and sequence coverage. However, their focus has been either to improve the number of protein identifications or to improve the sequence coverage of a few targets. In an effort to provide a resource for targeting as much of the amino acid sequence in a human cell line as possible, we conducted a comprehensive study in which seven commercially available enzymes were used individually and in combination. First, we digested HeLa lysate with a total of 48 single, double, and triple enzyme combinations. Across these combinations we detected 5223 proteins with an average of 42% sequence coverage by analyzing the total cell lysate digest without fractionation. We then selected the best five complementary digests for each of Orbitrap Elite collision-induced dissociation (CID) and Q Exactive higher-energy CID (HCD) analyses. A strong anion exchange (SAX) fractionation strategy was applied to these best digests, from which we were able to identify 8470 proteins with 40.3% mean sequence coverage. Combining all digests, both unfractionated and SAX, gave 8539 proteins with 44.7% mean coverage. These data are now publicly available (https://proteomics.swmed.edu/confetti) and can be queried using a simple web interface to discover the enzyme or combination of enzymes required to yield a peptide spanning a certain region of interest on a protein. Finally, we performed a proof-of-concept experiment to demonstrate that SRM assays using nontryptic peptides are viable, and in some cases more sensitive than tryptic assays. Though tryptic peptides are generally sufficient for protein quantification by SRM, we believe there will be increased use of nontryptic SRM as coverage of specific regions of sequence becomes more important. 
For example, biomarker studies considering the presence of specific PTMs rather than general protein abundance are increasingly common. Truly comprehensive PTM studies require access to the nontryptic proteome.  Similar articles
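
The idea that parallel digests with different proteases raise sequence coverage can be illustrated with a minimal in silico sketch. It is not the pipeline used in the study: the cleavage rules are simplified textbook rules (no proline exception, no missed cleavages), the 7 to 30 residue "observable" length window is an assumed placeholder, and the names `CLEAVAGE_RULES`, `digest`, and `coverage` are hypothetical.

```python
import re

# Simplified cleavage rules (assumptions for illustration only).
# Each rule is a zero-width regex that matches where the chain is cut.
CLEAVAGE_RULES = {
    "trypsin": r"(?<=[KR])",  # cut C-terminal to Lys or Arg
    "lys-c": r"(?<=K)",       # cut C-terminal to Lys
    "glu-c": r"(?<=E)",       # cut C-terminal to Glu
    "asp-n": r"(?=D)",        # cut N-terminal to Asp
}


def digest(sequence, enzyme):
    """Return (start, end) index pairs of peptides from a complete digest."""
    cuts = [m.start() for m in re.finditer(CLEAVAGE_RULES[enzyme], sequence)]
    bounds = sorted({0, len(sequence), *cuts})
    return list(zip(bounds[:-1], bounds[1:]))


def coverage(sequence, enzymes, min_len=7, max_len=30):
    """Fraction of residues covered by at least one peptide of usable length.

    Each enzyme is applied to the intact sequence (parallel digests);
    min_len and max_len are assumed bounds on detectable peptide length.
    """
    covered = [False] * len(sequence)
    for enzyme in enzymes:
        for start, end in digest(sequence, enzyme):
            if min_len <= end - start <= max_len:
                for i in range(start, end):
                    covered[i] = True
    return sum(covered) / len(sequence)
```

Calling `coverage(seq, ["trypsin"])` and then `coverage(seq, ["trypsin", "asp-n"])` on the same sequence shows how a second protease can rescue regions where lysine and arginine are too sparse or too dense to give peptides of usable length.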

16.
We have previously shown that DNA demethylation by chick embryo 5-methylcytosine (5-MeC)-DNA glycosylase needs both protein and RNA. RNA from enzyme purified by SDS-PAGE was isolated and cloned. The clones had inserts ranging from 240 to 670 bp and contained on average one CpG per 14 bases. All six clones tested had different sequences and did not have any sequence homology with any other known RNA. RNase-inactivated 5-MeC-DNA glycosylase regained enzyme activity when incubated with recombinant RNA. However, when recombinant RNA was incubated with the DNA substrate alone there was no demethylation activity. Short sequences complementary to the labeled DNA substrate are present in the recombinant RNA. Small synthetic oligoribonucleotides (11 bases long) complementary to the region of methylated CpGs of the hemimethylated double-stranded DNA substrate restore the activity of the RNase-inactivated 5-MeC-DNA glycosylase. Neither the corresponding oligodeoxyribonucleotide nor the oligoribonucleotide complementary to the non-methylated strand of the same DNA substrate is active in the complementation test. A minimum of 4 bases complementary to the CpG target sequence is necessary for reactivation of RNase-treated 5-MeC-DNA glycosylase. Complementation with double-stranded oligoribonucleotides does not restore 5-MeC-DNA glycosylase activity. An excess of targeting oligoribonucleotides cannot change the preferential substrate specificity of the enzyme for hemimethylated double-stranded DNA.  Similar articles

17.
MABS-AUSTIN, 2013, 5(4): 379-394
This study shows that state-of-the-art liquid chromatography (LC) and mass spectrometry (MS) can be used for rapid verification of identity and characterization of sequence variants and posttranslational modifications (PTMs) for antibody products. A candidate biosimilar IgG1 monoclonal antibody (mAb) was compared in detail to a commercially available innovator product. Intact protein mass, primary sequence, PTMs, and the micro-differences between the two mAbs were identified and quantified simultaneously. Although the two mAbs were very similar in terms of sequences and modifications, a mass difference observed by LC-MS intact mass measurements indicated that they were not identical. Peptide mapping, performed with data-independent acquisition LC-MS using an alternating low and elevated collision energy scan mode (LC-MSE), localized the mass difference between the biosimilar and the innovator to a two-amino-acid variance in the heavy chain sequences. The peptide mapping technique was also used to comprehensively catalogue and compare the differences in PTMs of the biosimilar and innovator mAbs. Comprehensive glycosylation profiling confirmed that the proportion of individual glycans was different between the biosimilar and the innovator, although the number and identity of glycans were the same. These results demonstrate that the combination of accurate intact mass measurement, released glycan profiling, and LC-MSE peptide mapping provides a set of routine tools that can be used to comprehensively compare a candidate biosimilar and an innovator mAb.  Similar articles

18.
We have identified several protein biomarkers of three Campylobacter jejuni strains (RM1221, RM1859, and RM3782) by proteomic techniques. The protein biomarkers identified are prominently observed in the time-of-flight mass spectra (TOF MS) of bacterial cell lysate supernatants ionized by matrix-assisted laser desorption/ionization (MALDI). The protein biomarkers identified were: DNA-binding protein HU, translation initiation factor IF-1, cytochrome c553, a transthyretin-like periplasmic protein, chaperonin GroES, thioredoxin Trx, and ribosomal proteins: L7/L12 (50S), L24 (50S), S16 (30S), L29 (50S), and S15 (30S), and conserved proteins similar to strain NCTC 11168 proteins Cj1164 and Cj1225. The protein biomarkers identified appear to represent high copy, intact proteins. The significant findings are as follows: (1) Biomarker mass shifts between these strains were due to amino acid substitutions of the primary polypeptide sequence and not due to changes in post-translational modifications (PTMs). (2) If present, a PTM of a protein biomarker appeared consistently for all three strains, which supported that the biomarker mass shifts observed between strains were not due to PTM variability. (3) The PTMs observed included N-terminal methionine (N-Met) cleavage as well as a number of other PTMs. (4) It was discovered that protein biomarkers of C. jejuni (as well as other thermophilic Campylobacters) appear to violate the N-Met cleavage rule of bacterial proteins, which predicts N-Met cleavage if the penultimate residue is threonine. Two protein biomarkers (HU and 30S ribosomal protein S16) that have a penultimate threonine residue do not show N-Met cleavage. In all other cases, the rule correctly predicted N-Met cleavage among the biomarkers analyzed. This exception to the N-Met cleavage rule has implications for the development of bioinformatics algorithms for protein/pathogen identification. 
(5) There were fewer biomarker mass shifts between strains RM1221 and RM1859 than between either of them and strain RM3782. As the mass shifts were due to the frequency of amino acid substitutions (and thus underlying genetic variations), this suggested that strains RM1221 and RM1859 were phylogenetically closer to one another than to strain RM3782 (in addition, a protein biomarker prominent in the spectra of RM1221 and RM1859 was absent from the RM3782 spectrum due to a nonsense mutation in the gene of the biomarker). These observations were confirmed by a nitrate reduction test, which showed that RM1221 and RM1859 were C. jejuni subsp. jejuni whereas RM3782 was C. jejuni subsp. doylei. This result suggests that detection/identification of protein biomarkers by pattern recognition and/or bioinformatics algorithms may easily subspeciate bacterial microorganisms. (6) Finally, the number and variation of PTMs detected in this relatively small number of protein biomarkers suggest that bioinformatics algorithms for pathogen identification may need to incorporate many more possible PTMs than suggested previously in the literature.  Similar articles
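
The N-Met cleavage rule discussed in finding (4) can be written down as a small check, which also makes it easy to flag exceptions such as the two biomarkers reported here. This is only a sketch under stated assumptions: the set of small penultimate residues is the commonly cited form of the rule (the abstract itself names only threonine), the mass loss of about 131.2 Da is the approximate average residue mass of methionine, and the function names and example sequences are hypothetical.

```python
# Penultimate residues commonly taken to permit N-terminal Met removal.
# Assumption: the abstract explicitly mentions only threonine.
SMALL_PENULTIMATE = frozenset("GASTCPV")

# Approximate average mass (Da) lost when the N-terminal Met is removed.
MET_RESIDUE_MASS = 131.2


def predicts_nmet_cleavage(sequence):
    """True if the rule predicts removal of the initiator methionine."""
    return (
        len(sequence) >= 2
        and sequence[0] == "M"
        and sequence[1] in SMALL_PENULTIMATE
    )


def expected_mass_shift(sequence):
    """Predicted mass change (Da) of the mature protein from N-Met cleavage."""
    return -MET_RESIDUE_MASS if predicts_nmet_cleavage(sequence) else 0.0


def rule_exceptions(observations):
    """Names of proteins whose observed cleavage state contradicts the rule.

    observations maps a protein name to (sequence, cleavage_observed).
    """
    return [
        name
        for name, (seq, observed) in observations.items()
        if predicts_nmet_cleavage(seq) != observed
    ]
```

A protein with a penultimate threonine that retains its methionine, as reported for HU and ribosomal protein S16, would be returned by `rule_exceptions`, which is the kind of case an identification algorithm would need to allow for.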

19.
Myristoylation by the myristoyl-CoA:protein N-myristoyltransferase (NMT) is an important lipid anchor modification of eukaryotic and viral proteins. Automated prediction of N-terminal N-myristoylation from the substrate protein sequence alone is necessary for large-scale sequence annotation projects but it requires a low rate of false positive hits in addition to a sufficient sensitivity. Our previous analysis of substrate protein sequence variability, NMT sequences and 3D structures has revealed motif properties in addition to the known PROSITE motif that are utilized in a new predictor described here. The composite prediction function (with separate ad hoc parameterization (a) for queries from non-fungal eukaryotes and their viruses and (b) for sequences from fungal species) consists of terms evaluating amino acid type preferences at sequence positions close to the N terminus as well as terms penalizing deviations from the physical property pattern of amino acid side-chains encoded in multi-residue correlation within the motif sequence. The algorithm has been validated with a self-consistency test and two jack-knife tests for the learning set as well as with kinetic data for model substrates. The sensitivity in recognizing documented NMT substrates is above 95 % for both taxon-specific versions. The corresponding rate of false positive prediction (for sequences with an N-terminal glycine residue) is close to 0.5 %; thus, the technique is applicable for large-scale automated sequence database annotation. The predictor is available as a public WWW server with the URL http://mendel.imp.univie.ac.at/myristate/. 
Additionally, we propose a version of the predictor that identifies a number of proteolytic protein processing sites at internal glycine residues and that evaluates possible N-terminal myristoylation of the protein fragments. A scan of public protein databases revealed new potential NMT targets for which the myristoyl modification may be of critical importance for biological function. Among others, the list includes kinases, phosphatases, proteasomal regulatory subunit 4, kinase interacting proteins KIP1/KIP2, protozoan flagellar proteins, homologues of mitochondrial translocase TOM40, of the neuronal calcium sensor NCS-1 and of the cytochrome c-type heme lyase CCHL. Analyses of complete eukaryote genomes indicate that about 0.5 % of all encoded proteins are apparent NMT substrates except for a higher fraction in Arabidopsis thaliana (approximately 0.8 %).  Similar articles
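
For orientation, the "known PROSITE motif" that the predictor goes beyond can be sketched as a regular-expression pre-filter. This is not the composite prediction function described in the abstract: it assumes the PROSITE N-myristoylation pattern is G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}, assumes the initiator methionine has been removed so that glycine is the first residue, and uses a hypothetical function name. A bare motif match like this is expected to give many more false positives than the scored predictor.

```python
import re

# Assumed PROSITE-style pattern: G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}
# translated to a regex; {..} means "any residue except these".
MYRISTOYLATION_MOTIF = re.compile(r"G[^EDRKHPFYW].{2}[STAGCN][^P]")


def has_nterminal_myristoylation_motif(sequence):
    """True if the mature N terminus matches the motif.

    A leading 'M' is stripped first, on the assumption that the initiator
    methionine is removed before NMT acts on the exposed glycine.
    """
    mature = sequence[1:] if sequence.startswith("M") else sequence
    return MYRISTOYLATION_MOTIF.match(mature) is not None
```

Because `match` anchors at the start of the string, only an N-terminal glycine is considered; scanning internal glycines, as in the proposed extension for proteolytic fragments, would need `finditer` over the whole sequence instead.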

20.
All organisms, including humans, possess a huge number of uncharacterized enzymes. Here we describe a general cell-based screen for enzyme substrate discovery by untargeted metabolomics and its application to identify the protein α/β-hydrolase domain-containing 3 (ABHD3) as a lipase that selectively cleaves medium-chain and oxidatively truncated phospholipids. Abhd3(-/-) mice possess elevated myristoyl (C14)-phospholipids, including the bioactive lipid C14-lysophosphatidylcholine, confirming the physiological relevance of our substrate assignments.  Similar articles
