Similar articles
Found 20 similar articles (search time: 265 ms)
1.
Panchenko AR  Madej T 《Proteins》2004,57(3):539-547
Two proteins are considered to have a similar fold if sufficiently many of their secondary structure elements are positioned similarly in space and are connected in the same order. Such a common structural scaffold may arise due to either divergent or convergent evolution. The intervening unaligned regions ("loops") between the superimposable helices and strands can exhibit a wide range of similarity and may offer clues to the structural evolution of folds. One might argue that more closely related proteins differ less in their nonconserved loop regions than distantly related proteins and, at the same time, the degree of variability in the loop regions in structurally similar but unrelated proteins is higher than in homologs. Here we introduce a new measure for structural (dis)similarity in loop regions that is based on the concept of the Hausdorff metric. This measure is used to gauge protein relatedness and is tested on a benchmark of homologous and analogous protein structures. It has been shown that the new measure can distinguish homologous from analogous proteins with the same or higher accuracy than the conventional measures that are based on comparing proteins in structurally aligned regions. We argue that this result can be attributed to the higher sensitivity of the Hausdorff (dis)similarity measure in detecting particularly evident dissimilarities in structures and draw some conclusions about evolutionary relatedness of proteins in the most populated protein folds.
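
As a rough illustration of the Hausdorff metric that the abstract above builds on, the following sketch computes the symmetric Hausdorff distance between two point sets. It is not the authors' loop (dis)similarity measure: the function names, the use of plain 3D coordinates, and the toy "loop" coordinates are all assumptions made for this example.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical loop coordinates (e.g. C-alpha positions), invented for illustration.
loop_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
loop_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]

print(hausdorff(loop_a, loop_b))
```

Because the distance is governed by the single worst-matched point, one strongly displaced residue dominates the result, which is consistent with the sensitivity to evident dissimilarities that the abstract describes.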

2.
In this work we examine how protein structural changes are coupled with sequence variation in the course of evolution of a family of homologs. The sequence-structure correlation analysis performed on 81 homologous protein families shows that the majority of them exhibit statistically significant linear correlation between the measures of sequence and structural similarity. We observed, however, that there are cases where structural variability cannot be mainly explained by sequence variation, such as protein families with a number of disulfide bonds. To understand whether structures from different families and/or folds evolve in the same manner, we compared the degrees of structural change per unit of sequence change ("the evolutionary plasticity of structure") between those families with a significant linear correlation. Using rigorous statistical procedures we find that, with a few exceptions, evolutionary plasticity does not show a statistically significant difference between protein families. Similar sequence-structure analysis performed for protein loop regions shows that evolutionary plasticity of loop regions is greater than for the protein core.
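
The notion of "structural change per unit of sequence change" can be pictured as the slope of a least-squares line through pairwise (sequence divergence, structural divergence) points. The sketch below is only an illustration under that assumption: the abstract does not state which similarity measures or statistical procedures were used, and the function name and data values here are invented.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept, Pearson r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Invented pairwise values for one hypothetical family:
# x = fraction of differing residues, y = structural deviation (arbitrary units).
seq_divergence = [0.1, 0.2, 0.3, 0.4]
struct_divergence = [0.5, 0.9, 1.3, 1.7]

slope, intercept, r = fit_line(seq_divergence, struct_divergence)
print(slope, intercept, r)
```

In this reading, the slope plays the role of plasticity and the correlation coefficient indicates whether a linear description is reasonable for the family at all.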

3.
Fructose-6-phosphate aldolase from Escherichia coli is a member of a small enzyme subfamily (MipB/TalC family) that belongs to the class I aldolases. The three-dimensional structure of this enzyme has been determined at 1.93 Å resolution by single isomorphous replacement and tenfold non-crystallographic symmetry averaging and refined to an R-factor of 19.9% (R(free) 21.3%). The subunit folds into an alpha/beta barrel, with the catalytic lysine residue on barrel strand beta 4. It is very similar in overall structure to that of bacterial and mammalian transaldolases, although more compact due to extensive deletions of additional secondary structural elements. The enzyme forms a decamer of identical subunits with point group symmetry 52. Five subunits are arranged as a pentamer, and two ring-like pentamers pack like a doughnut to form the decamer. A major interaction within the pentamer is through the C-terminal helix from one monomer, which runs across the active site of the neighbouring subunit. In classical transaldolases, this helix folds back and covers the active site of the same subunit and is involved in dimer formation. The inter-subunit helix swapping appears to be a major determinant for the formation of pentamers rather than dimers while at the same time preserving important interactions of this helix with the active site of the enzyme. The active site lysine residue is covalently modified, by forming a carbinolamine with glyceraldehyde from the crystallisation mixture. The catalytic machinery is very similar to that of transaldolase, which together with the overall structural similarity suggests that enzymes of the MipB/TalC subfamily are evolutionarily related to the transaldolase family.

4.
Dekker C  Willison KR  Taylor WR 《Proteins》2011,79(4):1172-1192
An analysis of the apical domain of the Group-I and Group-II chaperonins shows that they have structural similarities to two different protein folds: a "swivel-domain" phosphotransferase and a thioredoxin-like peroxiredoxin. There is no significant sequence similarity that supports either similarity and the degree of similarity based on structure is comparable but weak for both relationships. Based on possible evolutionary transitions, we deduced that a phosphotransferase origin would require both a large insertion and deletion of structure whereas a peroxiredoxin origin requires only a peripheral rearrangement, similar to an internal domain-swap. We postulate that this change could have been triggered by the insertion of a peroxiredoxin into the ATPase domain that led to the modern chaperonin domain arrangement. The peroxiredoxin fold is the most highly embellished member of the thioredoxin super-family and the insertion event may have "overloaded" the core, leading to a rearrangement. A peroxiredoxin origin for the domain also provides a functional explanation, as the peroxiredoxins can act as chaperones when they adopt a multimeric ring complex, similar to the chaperonin subunit configuration. In addition, several of the GroEL apical domain hydrophobic residues which interact with the unfolded protein are located in a position that corresponds to the protein substrate binding region of the peroxiredoxin fold. We suggest that the origin of the ur-chaperonin from a thioredoxin/peroxiredoxin fold might also account for the number of thioredoxin-fold containing proteins that interact with chaperonins, such as tubulin and phosducin-like proteins.

5.
Minai R  Matsuo Y  Onuki H  Hirota H 《Proteins》2008,72(1):367-381
Many drugs, even ones that are designed to act selectively on a target protein, bind unintended proteins. These unintended bindings can explain side effects or indicate additional mechanisms for a drug's medicinal properties. Structural similarity between binding sites is one of the reasons for binding to multiple targets. We developed a method for the structural alignment of atoms in the solvent-accessible surface of proteins that uses similarities in the local atomic environment, and carried out all-against-all structural comparisons for 48,347 potential ligand-binding regions from a nonredundant protein structure subset (nrPDB, provided by NCBI). The relationships between the similarity of ligand-binding regions and the similarity of the global structures of the proteins containing the binding regions were examined. We found 10,403 known ligand-binding region pairs whose structures were similar despite having different global folds. Of these, we detected 281 region pairs that had similar ligands with similar binding modes. These proteins are good examples of convergent evolution. In addition, we found a significant correlation between Z-score of structural similarity and true positive rate of "active" entries in the PubChem BioAssay database. Moreover, we confirmed the interaction between ibuprofen and a new target, porcine pancreatic elastase, by NMR experiment. Finally, we used this method to predict new drug-target protein interactions. We obtained 540 predictions for 105 drugs (e.g., captopril, lovastatin, flurbiprofen, metyrapone, and salicylic acid), and calculated the binding affinities using AutoDock simulation. The results of these structural comparisons are available at http://www.tsurumi.yokohama-cu.ac.jp/fold/database.html.

6.
In a similar manner to sequence database searching, it is also possible to compare three-dimensional protein structures. Such methods can be extremely useful because a structural similarity may represent a distant evolutionary relationship that is undetectable by sequence analysis. In this review, we summarise the most popular structure comparison methods, show how they can be used for database searching, and then describe some of the most advanced attempts to develop comprehensive protein structure classifications. With such data, it is possible to identify distant evolutionary relationships, provide libraries of unique folds for structure prediction, estimate the total number of folds that exist, and investigate the preference for certain types of structures over others. BioEssays 20:884–891, 1998. © 1998 John Wiley & Sons, Inc.

7.
Many protein classification systems capture homologous relationships by grouping domains into families and superfamilies on the basis of sequence similarity. Superfamilies with similar 3D structures are further grouped into folds. In the absence of discernable sequence similarity, these structural similarities were long thought to have originated independently, by convergent evolution. However, the growth of databases and advances in sequence comparison methods have led to the discovery of many distant evolutionary relationships that transcend the boundaries of superfamilies and folds. To investigate the contributions of convergent versus divergent evolution in the origin of protein folds, we clustered representative domains of known structure by their sequence similarity, treating them as point masses in a virtual 2D space which attract or repel each other depending on their pairwise sequence similarities. As expected, families in the same superfamily form tight clusters. But often, superfamilies of the same fold are linked with each other, suggesting that the entire fold evolved from an ancient prototype. Strikingly, some links connect superfamilies with different folds. They arise from modular peptide fragments of between 20 and 40 residues that co‐occur in the connected folds in disparate structural contexts. These may be descendants of an ancestral pool of peptide modules that evolved as cofactors in the RNA world and from which the first folded proteins arose by amplification and recombination. Our galaxy of folds summarizes, in a single image, most known and many yet undescribed homologous relationships between protein superfamilies, providing new insights into the evolution of protein domains.

8.
Protein structures are much more conserved than sequences during evolution. Based on this observation, we investigate the consequences of structural conservation on protein evolution. We study seven of the most studied protein folds, determining that an extended neutral network in sequence space is associated with each of them. Within our model, neutral evolution leads to a non-Poissonian substitution process, due to the broad distribution of connectivities in neutral networks. The observation that the substitution process has non-Poissonian statistics has been used to argue against the original Kimura neutral theory, while our model shows that this is a generic property of neutral evolution with structural conservation. Our model also predicts that the substitution rate can strongly fluctuate from one branch to another of the evolutionary tree. The average sequence similarity within a neutral network is close to the threshold of randomness, as observed for families of sequences sharing the same fold. Nevertheless, some positions are more difficult to mutate than others. We compare such structurally conserved positions to positions conserved in protein evolution, suggesting that our model can be a valuable tool to distinguish structural from functional conservation in databases of protein families. These results indicate that a synergy between database analysis and structurally based computational studies can increase our understanding of protein evolution.

9.
Baoqiang Cao  Ron Elber 《Proteins》2010,78(4):985-1003
We investigate small sequence adjustments (of one or a few amino acids) that induce large conformational transitions between distinct and stable folds of proteins. Such transitions are intriguing from evolutionary and protein‐design perspectives. They make it possible to search for ancient protein structures or to design protein switches that flip between folds and functions. A network of sequence flow between protein folds is computed for representative structures of the Protein Data Bank. The computed network is dense: on average, each structure is connected to tens of other folds. Proteins that attract sequences from a higher than expected number of neighboring folds are more likely to be enzymes and to have an alpha/beta fold. The large number of connections between folds may reflect the need of enzymes to adjust their structures for alternative substrates. The network of the Cro family is discussed, and we speculate that capacity is an important factor (but not the only one) that determines protein evolution. The experimentally observed flip from all alpha to alpha + beta fold is examined by the network tools. A kinetic model for the transition of sequences between the folds (with only protein stability in mind) is proposed. Proteins 2010. © 2009 Wiley‐Liss, Inc.

10.
Despite significant methodological advances in protein structure determination, high-resolution structures of membrane proteins are still rare, leaving sequence-based predictions as the only option for exploring the structural variability of membrane proteins at large scale. Here, a new structural classification approach for α-helical membrane proteins is introduced based on the similarity of predicted helix interaction patterns. Its application to proteins with known 3D structure showed that it is able to reliably detect structurally similar proteins even in the absence of any sequence similarity, reproducing the SCOP and CATH classifications with a sensitivity of 65% at a specificity of 90%. We applied the new approach to enhance our comprehensive structural classification of α-helical membrane proteins (CAMPS), which is primarily based on sequence and topology similarity, in order to find protein clusters that describe the same fold in the absence of sequence similarity. A total of 151 helix architectures were delineated for proteins with more than four transmembrane segments. Interestingly, we observed that proteins with 8 or more transmembrane helices correspond to fewer different architectures than proteins with up to 7 helices, suggesting that in large membrane proteins the evolutionary tendency to re-use already available folds is more pronounced.

11.
12.
The crystal structures of three proteins of diverse function and low sequence similarity were analyzed to evaluate structural and evolutionary relationships. The proteins include a bacterial bleomycin resistance protein, a bacterial extradiol dioxygenase, and human glyoxalase I. Structural comparisons, as well as phylogenetic analyses, strongly indicate that the modern family of proteins represented by these structures arose through a rich evolutionary history that includes multiple gene duplication and fusion events. These events appear to be historically shared in some cases, but parallel and historically independent in others. A significant early event is proposed to be the establishment of metal-binding in an oligomeric ancestor prior to the first gene fusion. Variations in the spatial arrangements of homologous modules are observed that are consistent with the structural principles of three-dimensional domain swapping, but in the unusual context of the formation of larger monomers from smaller dimers or tetramers. The comparisons support a general mechanism for metalloprotein evolution that exploits the symmetry of a homooligomeric protein to originate a metal binding site and relies upon the relaxation of symmetry, as enabled by gene duplication, to establish and refine specific functions.

13.
MOTIVATION: The evolution of protein sequences can be described by a stepwise process, where each step involves changes of a few amino acids. In a similar manner, the evolution of protein folds can be at least partially described by an analogous process, where each step involves comparatively simple changes affecting few secondary structure elements. A number of such evolution steps, justified by biologically confirmed examples, have previously been proposed by other researchers. However, unlike the situation with sequences, as far as we know there have been no attempts to estimate the comparative probabilities for different kinds of such structural changes. RESULTS: We have tried to assess the comparative probabilities for a number of known structural changes, and to relate the probabilities of such changes with the distance between protein sequences. We have formalized these structural changes using a topological representation of structures (TOPS), and have developed an algorithm for measuring structural distances that involve few evolutionary steps. The probabilities of structural changes then were estimated on the basis of all-against-all comparisons of the sequence and structure of protein domains from the CATH-95 representative set. The results obtained are reasonably consistent for a number of different data subsets and permit the identification of several 'most popular' types of evolutionary changes in protein structure. The results also suggest that alterations in protein structure are more likely to occur when the sequence similarity is >10% (the average similarity being approximately 6% for the data sets employed in this study), and that the distribution of probabilities of structural changes is fairly uniform within the interval of 15-50% sequence similarity. AVAILABILITY: The algorithms have been implemented on the Windows operating system in C++ and using the Borland Visual Component Library. The source code is available on request from the first author. The data sets used for this study (representative sets of protein domains, matrices of sequence similarities and structural distances) are available at http://bioinf.mii.lu.lv/epsrc_project/struct_ev.html.

14.
The high frequency of internal structural symmetry in common protein folds is presumed to reflect their evolutionary origins from the repetition and fusion of ancient peptide modules, but little is known about the primary sequence and physical determinants of this process. Unexpectedly, a sequence and structural analysis of symmetric subdomain modules within an abundant and ancient globular fold, the β-trefoil, reveals that modular evolution is not simply a relic of the ancient past, but is an ongoing and recurring mechanism for regenerating symmetry, having occurred independently in numerous existing β-trefoil proteins. We performed a computational reconstruction of a β-trefoil subdomain module and repeated it to form a newly three-fold symmetric globular protein, ThreeFoil. In addition to its near perfect structural identity between symmetric modules, ThreeFoil is highly soluble, performs multivalent carbohydrate binding, and has remarkably high thermal stability. These findings have far-reaching implications for understanding the evolution and design of proteins via subdomain modules.

15.
A new method to analyze the similarity between multiply aligned protein motifs (blocks) was developed. It identifies sets of consistently aligned blocks. These are found to be protein regions of similar function and structure that appear in different contexts. For example, the Rossmann fold ligand-binding region is found to be similar to TIM barrel and methylase regions, various protein families are predicted to have a TIM-barrel fold, and the structural relation between the ClpP protease and crotonase folds is identified from their sequence. Besides identifying local structure features, sequence similarity across short sequence regions (fewer than 20 amino acids) also predicts structure similarity of whole domains (folds) a few hundred amino acid residues long. Most of these relations could not be identified by other advanced sequence-to-sequence or sequence-to-multiple-alignment comparisons. We describe the method (termed CYRCA), present examples of our findings, and discuss their implications.

16.
Utilizing concepts of protein building blocks, we propose a de novo computational algorithm that is similar to combinatorial shuffling experiments. Our goal is to engineer new naturally occurring folds with low homology to existing proteins. A selected protein is first partitioned into its building blocks based on their compactness, degree of isolation from the rest of the structure, and hydrophobicity. Next, the protein building blocks are substituted by fragments taken from other proteins with overall low sequence identity, but with a similar hydrophobic/hydrophilic pattern and a high structural similarity. These criteria ensure that the designed protein has a similar fold, low sequence identity, and a good hydrophobic core compared with its native counterpart. Here, we have selected two proteins for engineering, protein G B1 domain and ubiquitin. The two engineered proteins share approximately 20% and approximately 25% amino acid sequence identities with their native counterparts, respectively. The stabilities of the engineered proteins are tested by explicit water molecular dynamics simulations. The algorithm implements a strategy of designing a protein using relatively stable fragments, with a high population time. Here, we have selected the fragments by searching for local minima along the polypeptide chain using the protein building block model. Such an approach provides a new method for engineering new proteins with similar folds and low homology.

17.
To explore whether the generation of new protein folds could be linked to metallic cofactor recruitment, we identified the oldest examples of folds for manganese, iron, zinc, and copper proteins by analyzing their fold‐domain mapping patterns. We discovered that the generation of these folds was tightly coupled to corresponding metals. We found that the emerging order for these folds, i.e., manganese and iron protein folds appeared earlier than zinc and copper counterparts, coincides with the putative bioavailability of the corresponding metals in the ancient anoxic ocean. Therefore, we conclude that metallic cofactors, like organic cofactors, play an evolutionary role in the formation of new protein folds. This link could be explained by the emergence of protein structures with novel folds that could fulfill the new protein functions introduced by the metallic cofactors. These findings not only have important implications for understanding the evolutionary mechanisms of protein architectures, but also provide a further interpretation for the evolutionary story of superoxide dismutases.

18.
We have developed a method of searching for similar spatial arrangements of atoms around a given chemical moiety in proteins that bind a common ligand. The first step in this method is to consider a set of atoms that closely surround a given chemical moiety. Then, to compare the spatial arrangements of such surrounding atoms in different proteins, they are translated and rotated so that the chemical moieties are superposed on each other. Spatial arrangements of surrounding atoms in a pair of proteins are judged to be similar, when there are many corresponding atoms occupying similar spatial positions. Because the method focuses on the arrangements of surrounding atoms, it can detect structural similarities of binding sites in proteins that are dissimilar in their amino acid sequences or in their chain folds. We have applied this method to identify modes of nucleotide base recognition by proteins. An all-against-all comparison of the arrangements of atoms surrounding adenine moieties revealed an unexpected structural similarity between protein kinases, cAMP-dependent protein kinase (cAPK), and casein kinase-1 (CK1), and D-Ala:D-Ala ligase (DD-ligase) at their adenine-binding sites, despite a lack of similarity in their chain folds. The similar local structure consists of a four-residue segment and three sequentially separated residues. In particular the four-residue segments of these enzymes were found to have nearly identical conformations in their backbone parts, which are involved in the recognition of adenine. This common local structure was also found in substrate-free three-dimensional structures of other proteins that are similar to DD-ligase in the chain fold and of other protein kinases. As the proteins with different folds were found to share a common local structure, these proteins seem to constitute a remarkable example of convergent evolution for the same recognition mechanism. Received: 9 December 1996 / Accepted: 7 February 1997
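
The comparison step described above (counting corresponding atoms once the chemical moieties are superposed) might be sketched as follows. This is an assumption-laden illustration, not the published method: it presumes both atom sets are already expressed in the frame of the superposed moiety, and the greedy one-to-one pairing, the requirement of matching element symbols, and the 1.0 distance cutoff are all hypothetical choices.

```python
import math

def count_corresponding(atoms_a, atoms_b, cutoff=1.0):
    """Greedily pair same-element atoms lying within `cutoff` of each other.

    Each atom is (element, (x, y, z)); every atom is used at most once.
    """
    candidates = sorted(
        (math.dist(pos_a, pos_b), i, j)
        for i, (elem_a, pos_a) in enumerate(atoms_a)
        for j, (elem_b, pos_b) in enumerate(atoms_b)
        if elem_a == elem_b
    )
    used_a, used_b = set(), set()
    count = 0
    for distance, i, j in candidates:
        if distance > cutoff:
            break  # candidates are sorted, so no closer pairs remain
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        count += 1
    return count

# Invented atoms surrounding the same moiety in two hypothetical proteins.
site_a = [("N", (0.0, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0)), ("C", (0.0, 4.0, 0.0))]
site_b = [("N", (0.2, 0.0, 0.0)), ("O", (3.0, 0.5, 0.0)), ("C", (5.0, 5.0, 5.0))]

print(count_corresponding(site_a, site_b))
```

A pair of sites would then be judged similar when this count is large relative to the number of surrounding atoms; how large is a threshold the abstract does not specify.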

19.
A newly defined family of fungal lectins displays no significant sequence similarity to any protein in the databases. These proteins, made of about 140 amino acid residues, have sequence identities ranging from 38% to 65% and share binding specificity to N-acetyl galactosamine. One member of this family, the lectin XCL from Xerocomus chrysenteron, induces drastic changes in the actin cytoskeleton after sugar binding at the cell surface and internalization, and has potent insecticidal activity. The crystal structure of XCL to 1.4 Å resolution reveals the architecture of this new lectin family. The fold of the protein is not related to any of the several lectin folds documented so far. Unexpectedly, the structure similarity is significant with actinoporins, a family of pore-forming toxins. The specific structural features and sequence signatures in each protein family suggest a potential sugar binding site in XCL and a possible evolutionary relationship between these proteins. Finally, the tetrameric assembly of XCL reveals a complex network of protomer-protomer interfaces and generates a large, hydrated cavity of 1000 Å³, which may become accessible to larger solutes after a small conformational change of the protein.

20.
Using structural similarity clustering of protein domains (the protein domain universe graph, PDUG) and a hierarchical functional annotation (gene ontology, GO) as two evolutionary lenses, we find that each structural cluster (domain fold) exhibits a distribution of functions that is unique to it. These functional distributions are functional fingerprints that are specific to characteristic structural clusters and vary from cluster to cluster. Furthermore, as the structural similarity threshold for domain clustering in the PDUG is relaxed, we observe an influx of earlier-diverged domains into clusters. These domains join clusters without destroying the functional fingerprint. These results can be understood in light of a divergent evolution scenario that posits correlated divergence of structural and functional traits in protein domains from one or few progenitors.
