Similar literature
 20 similar articles found (search time: 500 ms)
1.
Typically, protein spatial structures are more conserved in evolution than amino acid sequences. However, the recent explosion of sequence and structure information accompanied by the development of powerful computational methods led to the accumulation of examples of homologous proteins with globally distinct structures. Significant sequence conservation, local structural resemblance, and functional similarity strongly indicate evolutionary relationships between these proteins despite pronounced structural differences at the fold level. Several mechanisms such as insertions/deletions/substitutions, circular permutations, and rearrangements in beta-sheet topologies account for the majority of detected structural irregularities. The existence of evolutionarily related proteins that possess different folds brings new challenges to the homology modeling techniques and the structure classification strategies and offers new opportunities for protein design in experimental studies.

2.
X-ray crystal structures have revealed that numerous secondary transporter proteins originally categorized into different sequence families share similar structures, namely, the LeuT fold. The core of this fold consists of two units of five transmembrane helices, whose conformations have been proposed to exchange to form the two alternate states required for transport. That these two units are related implies that LeuT-like transporters evolved from gene-duplication and fusion events. Thus, the origins of this structural repeat may be relevant to the evolution of transport function. However, the lack of significant sequence similarity requires sensitive sequence search methods for analyzing their evolution. To this end, we developed a software application called AlignMe, which can use various types of input information, such as residue hydrophobicity, to perform pairwise alignments of sequences and/or of hydropathy profiles of (membrane) proteins. We used AlignMe to analyze the evolutionary relationships between repeats of the LeuT fold. In addition, we identified proteins from the so-called DedA family that potentially share a common ancestor with these repeats. DedA domains have been implicated in, e.g., selenite uptake; they are found widely distributed across all kingdoms of life; two or more DedA domains are typically found per genome, and some may adopt dual topologies. These results suggest that DedA proteins existed in ancient organisms and may function as dimers, as required for a would-be ancestor of the LeuT fold. In conclusion, we provide novel insights into the evolution of this important structural motif and thus potentially into the alternating-access mechanism of transport itself.

3.
KH domain: one motif, two folds   (total citations: 12; self-citations: 3; citations by others: 9)
The K homology (KH) module is a widespread RNA-binding motif that has been detected by sequence similarity searches in such proteins as heterogeneous nuclear ribonucleoprotein K (hnRNP K) and ribosomal protein S3. Analysis of spatial structures of KH domains in hnRNP K and S3 reveals that they are topologically dissimilar and thus belong to different protein folds. Thus KH motif proteins provide a rare example of protein domains that share significant sequence similarity in the motif regions but possess globally distinct structures. The two distinct topologies might have arisen from an ancestral KH motif protein by N- and C-terminal extensions, or one of the existing topologies may have evolved from the other by extension, displacement and deletion. C-terminal extension (deletion) requires β-sheet rearrangement through the insertion (removal) of a β-strand in a manner similar to that observed in the serine protease inhibitors (serpins). The current analysis offers a new look at how proteins can change fold in the course of evolution.

4.
Joseph M. Dybas, Andras Fiser. Proteins, 2016, 84(12): 1859-1874
Structure conservation, functional similarities, and homologous relationships that exist across diverse protein topologies suggest that some regions of the protein fold universe are continuous. However, the current structure classification systems are based on hierarchical organizations, which cannot accommodate structural relationships that span fold definitions. Here, we describe a novel, super‐secondary‐structure motif‐based, topology‐independent structure comparison method (SmotifCOMP) that is able to quantitatively identify structural relationships between disparate topologies. The basis of SmotifCOMP is a systematically defined super‐secondary‐structure motif library whose representative geometries are shown to be saturated in the Protein Data Bank and exhibit a unique distribution within the known folds. SmotifCOMP offers a robust and quantitative technique to compare domains that adopt different topologies since the method does not rely on a global superposition. SmotifCOMP is used to perform an exhaustive comparison of the known folds and the identified relationships are used to produce a nonhierarchical representation of the fold space that reflects the notion of a continuous and connected fold universe. The current work offers insight into previously hypothesized evolutionary relationships between disparate folds and provides a resource for exploring novel ones. Proteins 2016; 84:1859–1874. © 2016 Wiley Periodicals, Inc.

5.
Many protein classification systems capture homologous relationships by grouping domains into families and superfamilies on the basis of sequence similarity. Superfamilies with similar 3D structures are further grouped into folds. In the absence of discernable sequence similarity, these structural similarities were long thought to have originated independently, by convergent evolution. However, the growth of databases and advances in sequence comparison methods have led to the discovery of many distant evolutionary relationships that transcend the boundaries of superfamilies and folds. To investigate the contributions of convergent versus divergent evolution in the origin of protein folds, we clustered representative domains of known structure by their sequence similarity, treating them as point masses in a virtual 2D space which attract or repel each other depending on their pairwise sequence similarities. As expected, families in the same superfamily form tight clusters. But often, superfamilies of the same fold are linked with each other, suggesting that the entire fold evolved from an ancient prototype. Strikingly, some links connect superfamilies with different folds. They arise from modular peptide fragments of between 20 and 40 residues that co‐occur in the connected folds in disparate structural contexts. These may be descendants of an ancestral pool of peptide modules that evolved as cofactors in the RNA world and from which the first folded proteins arose by amplification and recombination. Our galaxy of folds summarizes, in a single image, most known and many yet undescribed homologous relationships between protein superfamilies, providing new insights into the evolution of protein domains.
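
The clustering idea in this abstract (domains as points in a 2D space that attract or repel according to pairwise sequence similarity) can be illustrated with a toy sketch. This is not the authors' implementation: the force law, the constants, the function name `layout`, and the four-point similarity matrix are all assumptions chosen only to show the principle.

```python
import math

def layout(similarity, positions, steps=200, lr=0.05, attract=1.0, repel=0.1):
    """Relax 2D points: similar pairs attract, every pair repels at short range.

    The force law and all constants are illustrative assumptions.
    """
    pts = [list(p) for p in positions]
    n = len(pts)
    for _ in range(steps):
        forces = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pts[j][0] - pts[i][0]
                dy = pts[j][1] - pts[i][1]
                d = math.hypot(dx, dy) or 1e-9  # guard against zero distance
                # Positive magnitude pulls i toward j; negative pushes it away.
                magnitude = attract * similarity[i][j] - repel / (d * d)
                forces[i][0] += magnitude * dx / d
                forces[i][1] += magnitude * dy / d
        for i in range(n):
            pts[i][0] += lr * forces[i][0]
            pts[i][1] += lr * forces[i][1]
    return pts

def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical data: domains 0 and 1 are similar, as are domains 2 and 3.
similarity = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
start = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
final = layout(similarity, start)
print(dist(final[0], final[1]), dist(final[0], final[2]))
```

In this toy setup the two similar pairs settle close together while the unrelated pairs drift apart, which is the qualitative behavior the abstract describes for families within a superfamily.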

6.
Kinch LN, Baker D, Grishin NV. Proteins, 2003, 52(3): 323-331
Sequence- and structure-based searching strategies have proven useful in the identification of remote homologs and have facilitated both structural and functional predictions of many uncharacterized protein families. We implement these strategies to predict the structure of and to classify a previously uncharacterized cluster of orthologs (COG3019) in the thioredoxin-like fold superfamily. The results of each searching method indicate that thioltransferases are the closest structural family to COG3019. We substantiate this conclusion using the ab initio structure prediction method rosetta, which generates a thioredoxin-like fold similar to that of the glutaredoxin-like thioltransferase (NrdH) for a COG3019 target sequence. This structural model contains the thiol-redox functional motif CYS-X-X-CYS in close proximity to other absolutely conserved COG3019 residues, defining a novel thioredoxin-like active site that potentially binds metal ions. Finally, the rosetta-derived model structure assists us in assembling a global multiple-sequence alignment of COG3019 with two other thioredoxin-like fold families, the thioltransferases and the bacterial arsenate reductases (ArsC).

7.
Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

8.
Protein folds, functions and evolution.   (total citations: 11; self-citations: 0; citations by others: 11)
The evolution of proteins and their functions is reviewed from a structural perspective in the light of the current database. Protein domain families segregate unequally between the three major classes, the 32 different architectures and almost 700 folds observed to date. We find that the number of new topologies is still increasing, although 25 new structures are now determined for each new topology. The corresponding analysis and classification of function is only just beginning, fuelled by the genome data. The structural data revealed unexpected conservations and divergence of function both within and between families. The next five years will see the compilation of a definitive dictionary of protein families and their related functions, based on structural data which reveals relationships hidden at the sequence level. Such information will provide the foundation to build a better understanding of the molecular basis of biological complexity and hopefully to facilitate rational molecular design.

9.
The U4/U6·U5 tri-snRNP complex is the catalytic core of the pre-mRNA splicing machinery. The thioredoxin-like protein hDim1 (U5-15 kDa) constitutes an essential component of the U5 particle, and its functions have been reported to be highly conserved throughout evolution. Recently, the Dim1-like protein (DLP) family has been extended to other proteins harboring similar sequence motifs. Here we report the biochemical characterization and crystallographic structure of a 149 amino acid protein, hDim2, which shares 38% sequence identity with hDim1. The crystallographic structure of hDim2 solved at 2.5 Å reveals a classical thioredoxin-fold structure. However, despite the similarity in the thioredoxin fold, hDim2 differs from hDim1 in many significant features. The structure of hDim2 contains an extra alpha helix (alpha3) and a beta strand (beta5), which stabilize the protein, suggesting that they may be involved in interactions with hDim2-specific partners. The stability and thermodynamic parameters of hDim2 were evaluated by combining circular dichroism and fluorescence spectroscopy together with chromatographic and cross-linking approaches. We have demonstrated that, in contrast to hDim1, hDim2 forms stable homodimers. The dimer interface is essentially stabilized by electrostatic interactions and involves tyrosine residues located in the alpha3 helix. Structural analysis reveals that hDim2 lacks some of the essential structural motifs and residues that are required for the biological activity and interactive properties of hDim1. Therefore, on the basis of structural investigations we suggest that, in higher eukaryotes, although both hDim1 and hDim2 are involved in pre-mRNA splicing, the two proteins are likely to participate in different multisubunit complexes and biological processes.

10.
Of the membrane proteins of known structure, we found that a remarkable 67% of the water soluble domains are structurally similar to water soluble proteins of known structure. Moreover, 41% of known water soluble protein structures share a domain with an already known membrane protein structure. We also found that functional residues are frequently conserved between extramembrane domains of membrane and soluble proteins that share structural similarity. These results suggest membrane and soluble proteins readily exchange domains and their attendant functionalities. The exchanges between membrane and soluble proteins are particularly frequent in eukaryotes, indicating that this is an important mechanism for increasing functional complexity. The high level of structural overlap between the two classes of proteins provides an opportunity to employ the extensive information on soluble proteins to illuminate membrane protein structure and function, for which much less is known. To this end, we employed structure guided sequence alignment to elucidate the functions of membrane proteins in the human genome. Our results bridge the gap of fold space between membrane and water soluble proteins and provide a resource for the prediction of membrane protein function. A database of predicted structural and functional relationships for proteins in the human genome is provided at sbi.postech.ac.kr/emdmp.

11.
Here, we provide an analysis of molecular evolution of five of the most populated protein folds: immunoglobulin fold, oligonucleotide-binding fold, Rossmann fold, alpha/beta plait, and TIM barrels. In order to distinguish between "historic", functional and structural reasons for amino acid conservations, we consider proteins that acquire the same fold and have no evident sequence homology. For each fold we identify positions that are conserved within each individual family and coincide when non-homologous proteins are structurally superimposed. As a baseline for statistical assessment we use the conservatism expected based on the solvent accessibility. The analysis is based on a new concept of "conservatism-of-conservatism". This approach allows us to identify the structural features that are stabilized in all proteins having a given fold, despite the fact that actual interactions that provide such stabilization may vary from protein to protein. Comparison with experimental data on thermodynamics, folding kinetics and function of the proteins reveals that such universally conserved clusters correspond to either: (i) super-sites (common location of active site in proteins having common tertiary structures but not function) or (ii) folding nuclei whose stability is an important determinant of folding rate, or both (in the case of Rossmann fold). The analysis also helps to clarify the relation between folding and function that is apparent for some folds.

12.
Disulfide-rich domains are small protein domains whose global folds are stabilized primarily by the formation of disulfide bonds and, to a much lesser extent, by secondary structure and hydrophobic interactions. Disulfide-rich domains perform a wide variety of roles functioning as growth factors, toxins, enzyme inhibitors, hormones, pheromones, allergens, etc. These domains are commonly found both as independent (single-domain) proteins and as domains within larger polypeptides. Here, we present a comprehensive structural classification of approximately 3000 small, disulfide-rich protein domains. We find that these domains can be arranged into 41 fold groups on the basis of structural similarity. Our fold groups, which describe broader structural relationships than existing groupings of these domains, bring together representatives with previously unacknowledged similarities; 18 of the 41 fold groups include domains from several SCOP folds. Within the fold groups, the domains are assembled into families of homologs. We define 98 families of disulfide-rich domains, some of which include newly detected homologs, particularly among knottin-like domains. On the basis of this classification, we have examined cases of convergent and divergent evolution of functions performed by disulfide-rich proteins. Disulfide bonding patterns in these domains are also evaluated. Reducible disulfide bonding patterns are much less frequent, while symmetric disulfide bonding patterns are more common than expected from random considerations. Examples of variations in disulfide bonding patterns found within families and fold groups are discussed.

13.
The quest to order and classify protein structures has led to various classification schemes, focusing mostly on hierarchical relationships between structural domains. At the coarsest classification level, such schemes typically identify hundreds of types of fundamental units called folds. As a result, we picture protein structure space as a collection of isolated fold islands. It is obvious, however, that many protein folds share structural and functional commonalities. Locating those commonalities is important for our understanding of protein structure, function, and evolution. Here, we present an alternative view of the protein fold space, based on an interfold similarity measure that is related to the frequency of fragments shared between folds. In this view, protein structures form a complicated, crossconnected network with very interesting topology. We show that interfold similarity based on sequence/structure fragments correlates well with similarities of functions between protein populations in different folds.

14.
Many dissimilar protein sequences fold into similar structures. A central and persistent challenge facing protein structural analysis is the discrimination between homology and convergence for structurally similar domains that lack significant sequence similarity. Classic examples are the OB-fold and SH3 domains, both small, modular beta-barrel protein superfolds. The similarities among these domains have variously been attributed to common descent or to convergent evolution. Using a sequence profile-based phylogenetic technique, we analyzed all structurally characterized OB-fold, SH3, and PDZ domains with less than 40% mutual sequence identity. An all-against-all, profile-versus-profile analysis of these domains revealed many previously undetectable significant interrelationships. The matrices of scores were used to infer phylogenies based on our derivation of the relationships between sequence similarity E-values and evolutionary distances. The resulting clades of domains correlate remarkably well with biological function, as opposed to structural similarity, indicating that the functionally distinct sub-families within these superfolds are homologous. This method extends phylogenetics into the challenging "twilight zone" of sequence similarity, providing the first objective resolution of deep evolutionary relationships among distant protein families.

15.
L B Ellis, P Saurugger, C Woodward. Biochemistry, 1992, 31(20): 4882-4891
We have developed a computerized search pattern for recognition of the three-dimensional redox site of thioredoxins based on primary and predicted secondary structure. This pattern, developed in the ARIADNE protein expert system, is used to search for a thioredoxin-like tertiary structural motif among proteins for which the only structural information is the primary sequence. The pattern was trained on 102 protein sequences (25 functionals and 77 controls); it matches all 25 members of the functional set under cutoff conditions that include only 2 members of the control set, for a sensitivity of 1.0 and a specificity of 0.97. The pattern matches only one of the two thioredoxin-like domains in protein disulfide isomerases (PDIs) and their analogues, suggesting that the C-terminal domain is more structurally similar to thioredoxin than the N-terminal domain. The Escherichia coli DsbA protein, a possible PDI analogue, appears to be more structurally similar to the N-terminal thioredoxin-like domain of PDIs. Thioredoxin-like redox functionality has been proposed for lutropin and follitropin, in part on the basis of their having -Cys-X-Pro-Cys- sequences. None match our pattern; all lack a predicted alpha-helix pattern element immediately after the active site. Hypothetical proteins in the National Biomedical Research Foundation Protein Identification Resource database were searched for matches to the pattern. The most interesting match was a hypothetical protein (161 residues) from the third open reading frame in the Staphylococcus aureus mer operon, which is involved in mercury detoxification. The match to our pattern and the hydrophobicity distribution in aligned elements of secondary structure not in our pattern strongly suggest that it has thioredoxin-like structure.
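
The sensitivity and specificity quoted in this abstract follow from the stated counts, assuming the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)). The abstract does not spell out the formulas, so the short check below is a sketch under that assumption; the variable and function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of functional sequences that the pattern matches."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of control sequences that the pattern correctly rejects."""
    return true_neg / (true_neg + false_pos)

# Counts taken from the abstract: 25 functional and 77 control sequences.
functional_total = 25
control_total = 77
true_pos = 25   # all functional sequences are matched
false_pos = 2   # control sequences that are also matched

false_neg = functional_total - true_pos   # 0
true_neg = control_total - false_pos      # 75

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)
print(round(sens, 2), round(spec, 2))
```

With these counts the specificity is 75/77, which rounds to the 0.97 reported in the abstract.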

16.
A unique family of proteins has been identified in the Deinococcus genus with an N-terminal cobalamin (vitamin B12) chelatase domain denoted CbiX and an additional unique C-terminal domain with unknown function. Here we report the first crystal structure from this new family of proteins with the structure of Deinococcus radiodurans protein DR2241. The structure reveals a multi-domain protein in which domain A (residues 1-132) has the same fold as the small CbiX (CbiX(S)), domains A and B (residues 1-272) together follow the chelatase super-family fold, and the two additional unique domains C and D have no structural homologues. Domain D harbours the sequence motifs CxxC and CxxxC, and DR2241 provides the first evidence that these motifs bind a [4Fe-4S] iron-sulphur cluster. In solution there are indications of multimeric forms, and in the crystallographic asymmetric unit a tetramer is found where domains C and D are involved in stabilising the tetrameric assembly.

17.
Copley SD, Novak WR, Babbitt PC. Biochemistry, 2004, 43(44): 13981-13995
The thioredoxin fold is found in proteins that serve a wide variety of functions. Among these are peroxiredoxins, which catalyze the reduction of hydrogen peroxide and alkyl peroxides. Although the common structural fold shared by thioredoxins and peroxiredoxins suggests the possibility that they have evolved from a common progenitor, it has been difficult to examine this hypothesis in depth because pairwise sequence identities between proteins in these two superfamilies are statistically insignificant. Using the Shotgun program, we have found that sequences of reductases involved in maturation of cytochromes in certain bacteria bridge the sequences of thioredoxins and peroxiredoxins. Analysis of motifs found in a divergent set of thioredoxins, cytochrome maturation proteins, and peroxiredoxins provides further support for an evolutionary relationship between these proteins. Within the conserved motifs are specific residues that are characteristic of individual protein classes, and therefore are likely to be involved in the specific functions of those classes. We have used this information, in combination with existing structural and functional information, to gain new insight into the structure-function relationships in these proteins and to construct a model for the emergence of peroxiredoxins from a thioredoxin-like ancestor.

18.
Dekker C, Willison KR, Taylor WR. Proteins, 2011, 79(4): 1172-1192
An analysis of the apical domain of the Group-I and Group-II chaperonins shows that they have structural similarities to two different protein folds: a "swivel-domain" phosphotransferase and a thioredoxin-like peroxiredoxin. There is no significant sequence similarity that supports either similarity and the degree of similarity based on structure is comparable but weak for both relationships. Based on possible evolutionary transitions, we deduced that a phosphotransferase origin would require both a large insertion and deletion of structure whereas a peroxiredoxin origin requires only a peripheral rearrangement, similar to an internal domain-swap. We postulate that this change could have been triggered by the insertion of a peroxiredoxin into the ATPase domain that led to the modern chaperonin domain arrangement. The peroxiredoxin fold is the most highly embellished member of the thioredoxin super-family and the insertion event may have "overloaded" the core, leading to a rearrangement. A peroxiredoxin origin for the domain also provides a functional explanation, as the peroxiredoxins can act as chaperones when they adopt a multimeric ring complex, similar to the chaperonin subunit configuration. In addition, several of the GroEL apical domain hydrophobic residues which interact with the unfolded protein are located in a position that corresponds to the protein substrate binding region of the peroxiredoxin fold. We suggest that the origin of the ur-chaperonin from a thioredoxin/peroxiredoxin fold might also account for the number of thioredoxin-fold containing proteins that interact with chaperonins, such as tubulin and phosducin-like proteins.

19.
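Domains in protein structure and function   (total citations: 5; self-citations: 1; citations by others: 4)
A domain is a compact globular region within the structure of a protein subunit. As a further level of structural organization between secondary and tertiary structure, the domain serves as an independent unit of structure, function, and folding. In complex proteins, domains act as structural and functional modules and as genetic units. Research at the domain level will advance the study of protein structure-function relationships, protein folding mechanisms, and protein design.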

20.
Proteins form arguably the most significant link between genotype and phenotype. Understanding the relationship between protein sequence and structure, and applying this knowledge to predict function, is difficult. One way to investigate these relationships is by considering the space of protein folds and how one might move from fold to fold through similarity, or potential evolutionary relationships. The many individual characterisations of fold space presented in the literature can tell us a lot about how well the current Protein Data Bank represents protein fold space, how convergence and divergence may affect protein evolution, how proteins affect the whole of which they are part, and how proteins themselves function. A synthesis of these different approaches and viewpoints seems the most likely way to further our knowledge of protein structure evolution and thus facilitate improved protein structure design and prediction.
