Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
The enzymes of the GCN5-related N-acetyltransferase (GNAT) superfamily count more than 870 000 members across all kingdoms of life and share the same structural fold. GNAT enzymes transfer an acyl moiety from acyl coenzyme A to a wide range of substrates including aminoglycosides, serotonin, glucosamine-6-phosphate, protein N-termini and lysine residues of histones and other proteins. The GNAT subtype of protein N-terminal acetyltransferases (NATs) alone targets a majority of all eukaryotic proteins, underscoring the omnipresence of the GNAT enzymes. Despite the highly conserved GNAT fold, sequence similarity is quite low between members of this superfamily, even when substrates are similar. Furthermore, this superfamily is not well characterized phylogenetically. Thus, functional annotation based on sequence similarity is unreliable and strongly hampered for thousands of GNAT members that remain biochemically uncharacterized. Here we used sequence similarity networks to map the sequence space and propose a new classification for eukaryotic GNAT acetyltransferases. Using the new classification, we built a phylogenetic tree representing the entire GNAT acetyltransferase superfamily. Our results show that protein NATs have evolved more than once on the GNAT acetylation scaffold. We use our classification to predict the function of uncharacterized sequences and verify by in vitro protein assays that two fungal genes encode NAT enzymes targeting specific protein N-terminal sequences, showing that even slight changes in the GNAT fold can lead to a change in substrate specificity. In addition to providing a new map of the relationships between eukaryotic acetyltransferases, the proposed classification constitutes a tool to improve functional annotation of GNAT acetyltransferases.
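The sequence similarity network idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes pairwise similarity scores are already available (for example from an all-versus-all search); the sequence names, the scores, and the threshold of 50 are hypothetical and are not taken from the study. Sequences are nodes, an edge is drawn when the score meets the threshold, and connected components serve as candidate groups.

```python
def build_ssn(pairwise_scores, threshold):
    """Build an undirected graph: nodes are sequences, edges join pairs
    whose similarity score is at least `threshold`."""
    graph = {}
    for (a, b), score in pairwise_scores.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters(graph):
    """Return connected components as sorted lists, in sorted node order."""
    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Hypothetical scores, for illustration only.
scores = {
    ("seqA", "seqB"): 120.0,
    ("seqB", "seqC"): 95.0,
    ("seqC", "seqD"): 12.0,
    ("seqD", "seqE"): 80.0,
}

if __name__ == "__main__":
    print(clusters(build_ssn(scores, threshold=50.0)))
```

Raising the threshold splits the network into smaller, more stringent groups, while lowering it merges them; choosing that cutoff is a judgment call that this sketch does not address.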

2.
The exponential growth of sequence data provides abundant information for the discovery of new enzyme reactions. Correctly annotating the functions of highly diverse proteins can be difficult, however, hindering use of this information. Global analysis of large superfamilies of related proteins is a powerful strategy for understanding the evolution of reactions by identifying catalytic commonalities and differences in reaction and substrate specificity, even when only a few members have been biochemically or structurally characterized. A comparison of >2500 sequences sharing the six-bladed β-propeller fold establishes sequence, structural, and functional links among the three subgroups of the functionally diverse N6P superfamily: the arylesterase-like and senescence marker protein-30/gluconolactonase/luciferin-regenerating enzyme-like (SGL) subgroups, representing enzymes that catalyze lactonase and related hydrolytic reactions, and the so-called strictosidine synthase-like (SSL) subgroup. Metal-coordinating residues were identified as broadly conserved in the active sites of all three subgroups except for a few proteins from the SSL subgroup, which have been experimentally determined to catalyze the quite different strictosidine synthase (SS) reaction, a metal-independent condensation reaction. Despite these differences, comparison of conserved catalytic features of the arylesterase-like and SGL enzymes with the SSs identified similar structural and mechanistic attributes between the hydrolytic reactions catalyzed by the former and the condensation reaction catalyzed by SS. The results also suggest that despite their annotations, the great majority of these >500 SSL sequences do not catalyze the SS reaction; rather, they likely catalyze hydrolytic reactions typical of the other two subgroups instead. This prediction was confirmed experimentally for one of these proteins.

3.
Franco OL, Rigden DJ. Glycobiology, 2003, 13(10): 707-712
Glycosyltransferases (GTs) are diverse enzymes organized into 65 families. X-ray crystallography and in silico studies have shown many of these to belong to two structural superfamilies: GT-A and GT-B. Through application of fold recognition and iterated sequence searches, we demonstrate that families 60, 62, and 64 may also be grouped into the GT-A fold superfamily. Analysis of conserved acidic residues suggests that catalytic sites are better conserved in superfamily GT-B than in GT-A. Although 26% and 29% of GT families may now be confidently placed in superfamilies GT-A and GT-B, respectively, the remaining 45% of families bear no discernible resemblance to either superfamily, which, given the sensitivity of modern fold recognition methods, suggests the existence of novel structural scaffolds associated with GT activity. Furthermore, bioinformatics studies indicate the apparent ease with which mechanism (inverting or retaining) may change during evolution.

4.
The nitrilases are enzymes that convert nitriles to the corresponding acid and ammonia. They are members of a superfamily that includes amidases and occurs in both prokaryotes and eukaryotes. The superfamily is characterized by a homodimeric building block with an αββα–αββα sandwich fold and an active site containing four positionally conserved residues: Cys, Glu, Glu and Lys. Their high chemical specificity and frequent enantioselectivity make them attractive biocatalysts for the production of fine chemicals and pharmaceutical intermediates. Nitrilases are also used in the treatment of toxic industrial effluent and cyanide remediation. The superfamily enzymes have been visualized as dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers and variable-length helices, but all nitrilase oligomers have the same basic dimer interface. Moreover, in the case of the octamers, tetradecamers, octadecamers and the helices, common principles of subunit association apply. While the range of industrially interesting reactions catalysed by this enzyme class continues to increase, research efforts are still hampered by the lack of a high-resolution microbial nitrilase structure that could provide insights into their specificity, enantioselectivity and mechanism of catalysis. This review provides an overview of current progress in elucidating structure and function in this enzyme class and emphasizes insights that may lead to further biotechnological applications.

5.
Evolution of function in protein superfamilies, from a structural perspective   Total citations: 29, self-citations: 0, citations by others: 29
The recent growth in protein databases has revealed the functional diversity of many protein superfamilies. We have assessed the functional variation of homologous enzyme superfamilies containing two or more enzymes, as defined by the CATH protein structure classification, by way of the Enzyme Commission (EC) scheme. Combining sequence and structure information to identify relatives, the majority of superfamilies display variation in enzyme function, with 25% of superfamilies in the PDB having members of different enzyme types. We determined the extent of functional similarity at different levels of sequence identity for 486,000 homologous pairs (enzyme/enzyme and enzyme/non-enzyme), with structural and sequence relatives included. For single and multi-domain proteins, variation in EC number is rare above 40% sequence identity, and above 30%, the first three digits may be predicted with an accuracy of at least 90%. For more distantly related proteins sharing less than 30% sequence identity, functional variation is significant, and below this threshold, structural data are essential for understanding the molecular basis of observed functional differences. To explore the mechanisms for generating functional diversity during evolution, we have studied in detail 31 diverse structural enzyme superfamilies for which structural data are available. A large number of variations and peculiarities are observed, at the atomic level through to gross structural rearrangements. Almost all superfamilies exhibit functional diversity generated by local sequence variation and domain shuffling. Commonly, substrate specificity is diverse across a superfamily, whilst the reaction chemistry is maintained. In many superfamilies, the position of catalytic residues may vary despite playing equivalent functional roles in related proteins. The implications of functional diversity within superfamilies for the structural genomics projects are discussed.
More detailed information on these superfamilies is available at http://www.biochem.ucl.ac.uk/bsm/FAM-EC/.
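The identity thresholds quoted in the abstract above (40% and 30%) can be made concrete with a small sketch. The way percent identity is computed here (matches over aligned, non-gap columns) is an assumption, since the abstract does not define it; the function names and example sequences are hypothetical, and the handling of a value of exactly 30% is a choice made for the sketch.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.
    Assumption: other denominators (e.g. shorter sequence length) are also in use."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def ec_conservation_note(identity):
    """Map an identity value to the trend reported in the abstract."""
    if identity > 40.0:
        return "variation in EC number is rare"
    if identity > 30.0:
        return "first three EC digits predictable with at least 90% accuracy"
    return "functional variation is significant; structural data needed"


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    pid = percent_identity("MKT-AYIAK", "MKTQAYLAK")
    print(pid, ec_conservation_note(pid))
```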

6.
In this study, I explain the observation that a rather limited number of residues (about 10) establishes the immunoglobulin fold for sequences of about 100 residues. Immunoglobulin fold (IgF) proteins comprise SCOP protein superfamilies with rather different functions and with less than 10% sequence identity; their alignment can be accomplished only by taking into account the 3D structure. Therefore, I believe that discovering additional common features of the sequences is necessary to explain the existence of a common fold for these SCOP superfamilies. I propose a method for analysing pairwise interconnections between residues of the multiple sequence alignment, which helps to reveal the set of mutually correlated positions inherent to almost every superfamily of this protein fold. Hence, the set of constant positions (comprising the common hydrophobic core) and the set of variable but mutually correlated positions can serve as a basis for a common 3D structure among rather distinct protein sequences.

7.
What are the selective pressures on protein sequences during evolution? Amino acid residues may be highly conserved for functional or structural (stability) reasons. Theoretical studies have proposed that residues involved in the folding nucleus may also be highly conserved. To test this, we are using an experimental "fold approach" to the study of protein folding. This compares the folding and stability of a number of proteins that share the same fold but have no common amino acid sequence or biological activity. The fold selected for this study is the immunoglobulin-like beta-sandwich fold, which has no specifically conserved function. Four model proteins are used from two distinct superfamilies that share the immunoglobulin-like fold: the fibronectin type III and immunoglobulin superfamilies. Here, the fold approach and protein engineering are used to question the role of a highly conserved tyrosine in the "tyrosine corner" motif that is found ubiquitously and exclusively in Greek key proteins. In the four model beta-sandwich proteins characterised here, the tyrosine is the only residue that is absolutely conserved at equivalent sites. By mutating this position to phenylalanine, we show that the tyrosine hydroxyl is not required to nucleate folding in the immunoglobulin superfamily, whereas it is involved to some extent in early structure formation in the fibronectin type III superfamily. The tyrosine corner is important for stability; mutation to phenylalanine costs between 1.5 and 3 kcal mol⁻¹. We propose that the high level of conservation of the tyrosine is related to the structural restraints of the loop connecting the beta-sheets, representing an evolutionary "cul-de-sac".

8.
PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationships among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for PAS domain superfamily members belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have lower fluctuations than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggests a direct connection in which protein sequences are related to various functions through structural dynamics. This is a new attempt to delineate the functional evolution of proteins using the integrated information of sequence, structure, and dynamics.

9.
The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. 
Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.

10.
11.
We describe a novel approach for inferring functional relationships of proteins by detecting sequence and spatial patterns of protein surfaces. Well-formed concave surface regions in the form of pockets and voids are examined to identify similarity relationships that might be directly related to protein function. We first exhaustively identify and measure analytically all 910,379 surface pockets and interior voids on 12,177 protein structures from the Protein Data Bank. The similarity of patterns of residues forming pockets and voids is then assessed in sequence, in spatial arrangement, and in orientational arrangement. Statistical significance in the form of E- and p-values is then estimated for each of the three types of similarity measurement. Our method is fully automated, requires no human intervention, and can be used without input of query patterns. It does not assume any prior knowledge of the functional residues of a protein, and can detect similarity based on surface patterns small and large. It also tolerates, to some extent, conformational flexibility of functional sites. We show with examples that this method can detect functional relationships with specificity for members of the same protein family and superfamily, as well as remotely related functional surfaces from proteins of different fold structures. We envision that this method can be used for discovering novel functional relationships of protein surfaces, for functional annotation of protein structures with unknown biological roles, and for further inquiries into the evolutionary origins of structural elements important for protein function.

12.
13.
The group of proteins that contain a thioredoxin (Trx) fold is huge and diverse. Assessment of the variation in catalytic machinery of Trx fold proteins is essential in providing a foundation for understanding their functional diversity and predicting the function of the many uncharacterized members of the class. The proteins of the Trx fold class retain common features (including variations on a dithiol CxxC active site motif) that lead to delivery of function. We use protein similarity networks to guide an analysis of how structural and sequence motifs track with catalytic function and taxonomic categories for 4,082 representative sequences spanning the known superfamilies of the Trx fold. Domain structure in the fold class is varied and modular, with 2.8% of sequences containing more than one Trx fold domain. Most member proteins are bacterial. The fold class exhibits many modifications to the CxxC active site motif: only 56.8% of proteins have both cysteines, and no functional groupings have absolute conservation of the expected catalytic motif. Only a small fraction of Trx fold sequences have been functionally characterized. This work provides a global view of the complex distribution of domains and catalytic machinery throughout the fold class, showing that each superfamily contains remnants of the CxxC active site. The unifying context provided by this work can guide the comparison of members of different Trx fold superfamilies to gain insight about their structure-function relationships, illustrated here with the thioredoxins and peroxiredoxins.

14.
The manipulation of modular regulatory domains from allosteric enzymes represents a possible mechanism to engineer allostery into non-allosteric systems. Currently, there is insufficient understanding of the structure/function relationships in modular regulatory domains to rationally implement this methodology. The LeuA dimer regulatory domain represents a well-conserved, novel fold responsible for the regulation of two enzymes involved in branched chain amino acid biosynthesis, α-isopropylmalate synthase and citramalate synthase. The LeuA dimer regulatory domain is responsible for the feedback inhibition of these enzymes by their respective downstream products. Both enzymes display multidomain architecture with a conserved N-terminal TIM barrel catalytic domain and a C-terminal (βββα)2 LeuA dimer domain joined by a flexible linker region. Due to the similarity of three-dimensional structure and catalytic mechanism combined with low sequence similarity, we propose these enzymes can be classified as members of the LeuA dimer superfamily. Despite their similarity, members of the LeuA dimer superfamily display diversity in their allosteric mechanisms. In this review, structural aspects of the LeuA dimer superfamily are discussed followed by three examples highlighting the diversity of allosteric mechanisms in the LeuA dimer superfamily.

15.
Cupins: the most functionally diverse protein superfamily?   Total citations: 10, self-citations: 0, citations by others: 10

16.
As enzymes evolve and diverge from common ancestor sequences, they often keep their overall reaction chemistry but specialize in the binding of different cognate ligands. This study borrows methods for the computational assessment of 2D similarity of small molecules from the field of chemoinformatics, to examine the extent of structure conservation of cognate ligands binding to similar proteins. Proteins from 87 structural superfamilies from Escherichia coli form the core dataset, which is extended using homologues with functional assignments from any organism. We find that correlation of the substrate similarity with protein similarity (measured by either sequence-based or structure-based scores) can only be clearly established for very similar proteins. At low sequence identities, the superfamily to which a protein belongs can give helpful clues to its function, and more importantly, the confidence attached to such clues is superfamily-dependent. Our data indicate that only a few superfamilies show great substrate diversity, and that most exhibit conservation of at least part of the structural scaffold of the substrate.
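As an illustration of what a 2D small-molecule similarity score can look like, here is a minimal sketch of the Tanimoto coefficient over fingerprint bit sets. The abstract above does not say which similarity measure the study used, so Tanimoto is an assumption chosen because it is a common chemoinformatics measure; the fingerprints below are hypothetical sets of "on" bit indices rather than real molecular fingerprints.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared 'on' bits divided by the union of 'on' bits.
    Returns 0.0 for two empty fingerprints (a convention chosen for this sketch)."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical fingerprints, for illustration only.
    ligand_1 = {1, 2, 3, 4}
    ligand_2 = {3, 4, 5, 6}
    print(tanimoto(ligand_1, ligand_2))
```

A score of 1.0 means identical bit sets and 0.0 means no shared bits; how such scores relate to protein similarity is the empirical question the abstract addresses.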

17.
Determining enzyme functions is essential for a thorough understanding of cellular processes. Although many prediction methods have been developed, it remains a significant challenge to predict enzyme functions at the fourth-digit level of the Enzyme Commission numbers. The functional specificity of enzymes often changes drastically with mutations of a small number of residues, and therefore information about these critical residues can potentially help discriminate detailed functions. However, because these residues must be identified by mutagenesis experiments, the available information is limited, and the lack of experimentally verified specificity determining residues (SDRs) has hindered the development of detailed function prediction methods and computational identification of SDRs. Here we present a novel method for predicting enzyme functions by random forests, EFPrf, along with a set of putative SDRs, the random forests derived SDRs (rf-SDRs). EFPrf consists of a set of binary predictors for enzymes in each CATH superfamily, and the rf-SDRs are the residue positions corresponding to the most highly contributing attributes obtained from each predictor. EFPrf showed a precision of 0.98 and a recall of 0.89 in a cross-validated benchmark assessment. The rf-SDRs included many residues whose importance for specificity had been validated experimentally. The analysis of the rf-SDRs revealed both a general tendency for functionally diverged superfamilies to include more active site residues in their rf-SDRs than less diverged superfamilies, and superfamily-specific conservation patterns of each functional residue. EFPrf and the rf-SDRs will be an effective tool for annotating enzyme functions and for understanding how enzyme functions have diverged within each superfamily.

18.
Peptidase family U34 consists of enzymes with unclear catalytic mechanisms, for instance, dipeptidase A from Lactobacillus helveticus. Using extensive sequence similarity searches, we infer that U34 family members are homologous to penicillin V acylases (PVA) and thus potentially adopt the N-terminal nucleophile (Ntn) hydrolase fold. Comparative sequence and structural analysis reveals a cysteine as the catalytic nucleophile as well as other conserved residues important for catalysis. The PVA/U34 family is variable in sequence and exhibits great diversity in substrate specificity, including enzymes such as choloylglycine hydrolases, acid ceramidases, isopenicillin N acyltransferases, and a subgroup of eukaryotic proteins with unclear function.

19.
WW Zhu, C Wang, J Jipp, L Ferguson, SN Lucas, MA Hicks, ME Glasner. Biochemistry, 2012, 51(31): 6171-6181
Understanding how enzyme specificity evolves will provide guiding principles for protein engineering and function prediction. The o-succinylbenzoate synthase (OSBS) family is an excellent model system for elucidating these principles because it has many highly divergent amino acid sequences that are <20% identical, and some members have evolved a second function. The OSBS family belongs to the enolase superfamily, members of which use a set of conserved residues to catalyze a wide variety of reactions. These residues are the only conserved residues in the OSBS family, so they are not sufficient to determine reaction specificity. Some enzymes in the OSBS family catalyze another reaction, N-succinylamino acid racemization (NSAR). NSARs cannot be segregated into a separate family because their sequences are highly similar to those of known OSBSs, and many of them have both OSBS and NSAR activities. To determine how such divergent enzymes can catalyze the same reaction and how NSAR activity evolved, we divided the OSBS family into subfamilies and compared the divergence of their active site residues. Correlating sequence conservation with the effects of mutations in Escherichia coli OSBS identified two nonconserved residues (R159 and G288) at which mutations decrease efficiency ≥200-fold. These residues are not conserved in the subfamily that includes NSAR enzymes. The OSBS/NSAR subfamily binds the substrate in a different orientation, eliminating selective pressure to retain arginine and glycine at these positions. This supports the hypothesis that specificity-determining residues have diverged in the OSBS family and provides insight into the sequence changes required for the evolution of NSAR activity.

20.
Functional annotation is seldom straightforward, with complexities arising from functional divergence in protein families or functional convergence between non-homologous protein families, leading to mis-annotations. An enzyme may contain multiple domains, and not all domains may be involved in a given function, adding to the complexity of function annotation. To address this, we use binding site information from bound cognate ligands and catalytic residues, since it can help in resolving fold-function relationships at a finer level and with higher confidence. A comprehensive database of 2,020 fold-function-binding site relationships has been systematically generated. A network-based approach is employed to capture the complexity in these relationships, from which different types of associations are deciphered that identify versatile protein folds performing diverse functions, the same function associated with multiple folds, and one-to-one relationships. Binding site similarity networks integrated with fold, function, and ligand similarity information are generated to understand the depth of these relationships. Apart from the observed continuity in the functional site space, network properties of these revealed versatile families with topologically different or dissimilar binding sites and structural families that perform very similar functions. As a case study, subtle changes in the active site of a set of evolutionarily related superfamilies are studied using these networks. Tracing such similarities in evolutionarily related proteins provides clues into the transition and evolution of protein functions. Insights from this study will be helpful in accurate and reliable functional annotation of uncharacterized proteins, poly-pharmacology, and designing enzymes with new functional capabilities. Proteins 2017; 85:1319–1335. © 2017 Wiley Periodicals, Inc.
