Similar articles (20 found)
1.
Shachar O  Linial M 《Proteins》2004,57(3):531-538
With currently available sequence data, it is feasible to conduct extensive comparisons among large sets of protein sequences. It is still a much more challenging task to partition the protein space into structurally and functionally related families solely based on sequence comparisons. The ProtoNet system automatically generates a treelike classification of the whole protein space. It stands to reason that this classification reflects evolutionary relationships, both close and remote. In this article, we examine this hypothesis. We present a semiautomatic procedure that singles out certain inner nodes in the ProtoNet tree that should ideally correspond to structurally and functionally defined protein families. We compare the performance of this method against several expert systems. Some of the competing methods incorporate additional extraneous information on protein structure or on enzymatic activities. The ProtoNet-based method performs at least as well as any of the methods with which it was compared. This article illustrates the ProtoNet-based method on several evolutionarily diverse families. Using this new method, an evolutionary divergence scheme can be proposed for a large number of structurally and functionally related superfamilies.

2.

Background  

The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins.

3.
Mosher DF  Adams JC 《Matrix biology》2012,31(3):155-161
The thrombospondins are a family of secreted, oligomeric glycoproteins that interact with cell surfaces, multiple components of the extracellular matrix, growth factors and proteases. These interactions underlie complex roles in cell interactions and tissue homeostasis in animals. Thrombospondins have been grouped functionally with SPARCs, tenascins and CCN proteins as adhesion-modulating or matricellular components of the extracellular milieu. Although all these multi-domain proteins share various commonalities of domains, the grouping is not based on structural homologies. Instead, the terms emphasise the general observations that these proteins do not form large-scale ECM structures, yet act at cell surfaces and function in coordination with the structural ECM and associated extracellular proteins. The designation of adhesion-modulation thus depends on observed tissue and cell culture ECM distributions and on experimentally identified functional properties. To date, the evolutionary relationships of these proteins have not been critically compared, yet knowledge of their evolutionary histories is clearly relevant to any consideration of functional similarities. In this article, we survey briefly the structural and functional knowledge of these protein families, consider the evolution of each family, and outline a perspective on their functional roles.

4.
5.
6.

Background  

Protein structural alignment provides a fundamental basis for deriving principles of functional and evolutionary relationships. It is routinely used for structural classification and functional characterization of proteins and for the construction of sequence alignment benchmarks. However, the available techniques do not fully consider the implications of protein structural diversity and typically generate a single alignment between sequences.

7.
MOTIVATION: It is commonly believed that sequence determines structure, which in turn determines function. However, the presence of many proteins with the same structural fold but different functions suggests that global structure and function do not always correlate well. RESULTS: We propose a method for accurate functional annotation, based on identification of functional signatures from structural alignments (FSSA) using the Structural Classification of Proteins (SCOP) database. The FSSA method is superior at function discrimination and classification compared with several methods that directly inherit functional annotation information from homology inference, such as Smith-Waterman, PSI-BLAST, hidden Markov models and structure comparison methods, for a large number of structural fold families. Our results indicate that the contributions of amino acid residue types and positions to structure and function are largely separable for proteins in multi-functional fold families.

8.
9.
Function prediction frequently relies on comparing genes or gene products to search for relevant similarities. Because the number of protein structures with unknown function is mushrooming, however, we asked here whether such comparisons could be improved by focusing narrowly on the key functional features of protein structures, as defined by the Evolutionary Trace (ET). Therefore a series of algorithms was built (a) to extract local motifs (3D templates) from protein structures based on ET ranking of residue importance; (b) to assess their geometric and evolutionary similarity to other structures; and (c) to transfer enzyme annotation whenever a plurality was reached across matches. Whereas a prototype had only been 80% accurate and was not scalable, here a speedy new matching algorithm enabled large-scale searches for reciprocal matches and thus raised annotation specificity to 100% in both positive and negative controls of 49 enzymes and 50 non-enzymes, respectively (in one case even identifying an annotation error), while maintaining sensitivity (approximately 60%). Critically, this Evolutionary Trace Annotation (ETA) pipeline requires no prior knowledge of functional mechanisms. It could thus be applied in a large-scale retrospective study of 1218 structural genomics enzymes and reached 92% accuracy. Likewise, it was applied to all 2935 unannotated structural genomics proteins and predicted enzymatic functions in 320 cases: 258 on first pass and 62 more on second pass. Controls and initial analyses suggest that these predictions are reliable. Thus the large-scale evolutionary integration of sequence-structure-function data, here through reciprocal identification of local, functionally important structural features, may contribute significantly to de-orphaning the structural proteome.
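
The reciprocal-match and plurality steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the ETA pipeline's actual code: the function names, the dictionary layout, the identifiers and the EC labels are all hypothetical, and treating a tie as "no plurality" is an assumption.

```python
from collections import Counter


def reciprocal_matches(query, forward_hits, reverse_hits):
    """Keep only targets whose own template also matches back to the query.

    forward_hits: iterable of target IDs matched by the query's template.
    reverse_hits: dict mapping a target ID to the set of IDs its template matches.
    Both structures are hypothetical and chosen only for illustration.
    """
    return [t for t in forward_hits if query in reverse_hits.get(t, set())]


def plurality_annotation(annotations):
    """Return the most frequent annotation, or None if there is no clear winner."""
    ranked = Counter(annotations).most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: assumed here to mean that no plurality was reached
    return ranked[0][0]


# Toy data with made-up identifiers and EC numbers.
forward = ["t1", "t2", "t3", "t4"]
reverse = {"t1": {"q"}, "t2": {"q", "x"}, "t3": {"y"}, "t4": {"q"}}
ec_of = {"t1": "3.1.1.1", "t2": "3.1.1.1", "t3": "2.7.1.1", "t4": "2.7.1.1"}

kept = reciprocal_matches("q", forward, reverse)
predicted = plurality_annotation([ec_of[t] for t in kept])
```

In this toy case the one-way match `t3` is discarded before voting, which is the intuition behind using reciprocity to raise specificity.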

10.
The expansion of functions in an enzyme superfamily is thought to occur through recruitment of latent promiscuous functions within existing enzymes. Thus, the promiscuous activities of enzymes represent connections between different catalytic landscapes and provide an additional layer of evolutionary connectivity between functional families alongside their sequence and structural relationships. Functional connectivity has been observed between individual functional families; however, little is known about how catalytic landscapes are connected throughout a highly diverged superfamily. Here, we describe a superfamily-wide analysis of evolutionary and functional connectivity in the metallo-β-lactamase (MBL) superfamily. We investigated evolutionary connections between functional families and related evolutionary to functional connectivity; 24 enzymes from 15 distinct functional families were challenged against 10 catalytically distinct reactions. We revealed that enzymes of this superfamily are generally promiscuous, as each enzyme catalyzes on average 1.5 reactions in addition to its native one. Catalytic landscapes in the MBL superfamily overlap substantially; each reaction is connected on average to 3.7 other reactions, whereas some connections appear to be unrelated to recent evolutionary events and occur between chemically distinct reactions. These findings support the idea that the highly distinct reactions in the MBL superfamily could have evolved from a common ancestor traversing a continuous network via promiscuous enzymes. Several functional connections (e.g., the lactonase/phosphotriesterase and phosphonatase/phosphodiesterase/arylsulfatase reactions) are also observed in structurally and evolutionarily distinct superfamilies, suggesting that these catalytic landscapes are substantially connected. Our results show that new enzymatic functions could evolve rapidly from the current diversity of enzymes and range of promiscuous activities.
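
The two averages quoted in this abstract (promiscuous reactions per enzyme and connected reactions per reaction) could be computed from an enzyme-by-reaction activity table along the following lines. This is a sketch under stated assumptions: the abstract does not define "connected", so two reactions are assumed here to be connected when at least one enzyme catalyzes both, and the enzyme names and activity data are invented.

```python
from itertools import combinations


def mean_promiscuity(activities, native):
    """Average number of non-native reactions catalyzed per enzyme.

    activities: dict mapping enzyme -> set of reactions it catalyzes.
    native: dict mapping enzyme -> its native reaction.
    """
    extra = [len(reactions - {native[e]}) for e, reactions in activities.items()]
    return sum(extra) / len(extra)


def mean_reaction_connectivity(activities):
    """Average number of other reactions each reaction shares an enzyme with."""
    reactions = set().union(*activities.values())
    neighbours = {r: set() for r in reactions}
    for catalysed in activities.values():
        # Every pair of reactions catalyzed by the same enzyme is linked.
        for a, b in combinations(catalysed, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return sum(len(n) for n in neighbours.values()) / len(neighbours)


# Invented toy data: three enzymes tested against four reactions.
activities = {
    "E1": {"lactonase", "phosphotriesterase"},
    "E2": {"phosphodiesterase"},
    "E3": {"lactonase", "phosphotriesterase", "arylsulfatase"},
}
native = {"E1": "lactonase", "E2": "phosphodiesterase", "E3": "arylsulfatase"}

promiscuity = mean_promiscuity(activities, native)
connectivity = mean_reaction_connectivity(activities)
```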

11.
From protein structure to function. (Cited 6 times: 0 self-citations, 6 by others)
Several databases of protein structural families now exist, organised according to both evolutionary relationships and common folding arrangements. Although these lag behind sequence databases in size, the prospect of structural genomics initiatives means that they may soon include representatives of many of the sequence families. To some extent, functional information can be derived from structural similarity. For some structural families, their function is highly conserved, whereas, for others, it can only be inherited or derived on the basis of additional information (e.g. sequence patterns, common residue clusters and characteristic surface properties).

12.

Background

Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution.

Results

Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case.

Conclusions

The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0399-6) contains supplementary material, which is available to authorized users.

13.

Background  

The post-genomic era is characterised by a torrent of biological information flooding the public databases. As a direct consequence, similarity searches starting with a single query sequence frequently lead to the identification of hundreds, or even thousands of potential homologues. The huge volume of data renders the subsequent structural, functional and evolutionary analyses very difficult. It is therefore essential to develop new strategies for efficient sampling of this large sequence space, in order to reduce the number of sequences to be processed. At the same time, it is important to retain the most pertinent sequences for structural and functional studies.

14.
Deacetylation of N-acetylhexosamine residues in structural polysaccharides and glycoconjugates is catalyzed by different families of carbohydrate esterases that, despite different structural folds, share a common metal-assisted acid/base mechanism with the metal cation coordinated by a conserved Asp-His-His triad. These enzymes serve diverse biological functions in the modification of cell-surface polysaccharides in bacteria and fungi as well as in the metabolism of hexosamines in the biosynthesis of cellular glycoconjugates. Focusing on carbohydrate de-N-acetylases, this article summarizes the background of the different families from a structural and functional viewpoint and covers advances in the characterization of novel enzymes over the last 2–3 years. Current research aims to identify new deacetylases and unravel their biological functions, as they are candidate targets for the design of antimicrobials against pathogenic bacteria and fungi. Likewise, some families are also used as biocatalysts for the production of defined glycostructures with diverse applications.

15.
A major component of the plant nuclear genome is constituted by different classes of repetitive DNA sequences. The structural, functional and evolutionary aspects of the satellite repetitive DNA families, and their organization in the chromosomes, are reviewed. The tandem satellite DNA sequences exhibit characteristic chromosomal locations, usually at subtelomeric and centromeric regions. The repetitive DNA family(ies) may be widely distributed in a taxonomic family or a genus, or may be specific for a species, genome or even a chromosome. They may acquire large-scale variations in their sequence and copy number over an evolutionary time-scale. These features have formed the basis of extensive utilization of repetitive sequences for taxonomic and phylogenetic studies. Hybrid polyploids have especially proven to be excellent models for studying the evolution of repetitive DNA sequences. Recent studies explicitly show that some repetitive DNA families localized at the telomeres and centromeres have acquired important structural and functional significance. The repetitive elements are under different evolutionary constraints as compared to the genes. Satellite DNA families are thought to arise de novo as a consequence of molecular mechanisms such as unequal crossing over, rolling circle amplification, replication slippage and mutation that constitute "molecular drive".

16.
Oliveira L  Paiva PB  Paiva AC  Vriend G 《Proteins》2003,52(4):544-552
We introduce sequence entropy-variability plots as a method of analyzing families of protein sequences, and demonstrate this for three well-known sequence families: globins, ras-like proteins, and serine-proteases. The location of an aligned residue position in the entropy-variability plot correlates with structural characteristics, and with known facts about the roles of individual amino acids in the function of these proteins. The large numbers of known sequences in these families allowed us to introduce new filtering methods for variability patterns. The results are discussed in terms of a simple evolutionary model for functional proteins.
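
To make the idea of an entropy-variability plot concrete, the sketch below computes one (entropy, variability) point per alignment column. The abstract does not give the exact definitions, so this assumes Shannon entropy with the natural logarithm and takes variability to be the number of distinct residue types in the column; the function name and the toy alignment are hypothetical, and no filtering of rare residues is applied.

```python
import math
from collections import Counter


def column_entropy_variability(alignment, gap="-"):
    """Return a list of (entropy, variability) tuples, one per alignment column.

    alignment: list of equal-length aligned sequences (strings).
    Gap characters are ignored when counting residues.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    points = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] != gap)
        total = sum(counts.values())
        # Shannon entropy of the residue frequencies in this column.
        entropy = sum(-(n / total) * math.log(n / total) for n in counts.values())
        points.append((entropy, len(counts)))
    return points


# Toy alignment: a conserved column, a two-state column and a fully variable column.
toy_alignment = ["ACD", "ACE", "AGF", "AGG"]
points = column_entropy_variability(toy_alignment)
```

Plotting entropy against variability for every column then gives the scatter in which, according to the abstract, a position's location correlates with its structural and functional role.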

17.
The prediction of functional sites in newly solved protein structures is a challenge for computational structural biology. Most methods for approaching this problem use evolutionary conservation as the primary indicator of the location of functional sites. However, sequence conservation reflects not only evolutionary selection at functional sites to maintain protein function, but also selection throughout the protein to maintain the stability of the folded state. To disentangle sequence conservation due to protein functional constraints from sequence conservation due to protein structural constraints, we use all-atom computational protein design methodology to predict sequence profiles expected under solely structural constraints, and to compute the free energy difference between the naturally occurring amino acid and the lowest free energy amino acid at each position. We show that functional sites are more likely than non-functional sites to have computed sequence profiles which differ significantly from the naturally occurring sequence profiles and to have residues with sub-optimal free energies, and that incorporation of these two measures improves sequence-based prediction of protein functional sites. The combined sequence- and structure-based functional site prediction method has been implemented in a publicly available web server.
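
The per-position free energy difference mentioned in this abstract can be illustrated with a short sketch. It assumes that a design calculation has already produced an energy for each candidate amino acid at each position; the function name, the data layout and the energy values are invented, and nothing here reproduces the authors' design methodology.

```python
def native_energy_gaps(position_energies, native_sequence):
    """Energy of the native residue minus the lowest energy at each position.

    position_energies: list of dicts, one per position, mapping an amino acid
        (one-letter code) to its computed free energy (hypothetical values).
    native_sequence: string giving the naturally occurring residue per position.
    A gap of 0 means the native residue is already the most favourable one;
    a large positive gap marks a residue that is sub-optimal for stability.
    """
    if len(position_energies) != len(native_sequence):
        raise ValueError("one energy table is required per sequence position")
    return [
        energies[aa] - min(energies.values())
        for energies, aa in zip(position_energies, native_sequence)
    ]


# Invented energies for a two-residue example.
energies = [
    {"A": -1.0, "G": -0.5},
    {"K": 0.2, "E": -1.3, "D": -0.9},
]
gaps = native_energy_gaps(energies, "AK")
```

Here the second position has a positive gap, which under the abstract's reasoning would make it a more likely functional-site candidate than the first.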

18.
Structural biology and structural genomics are expected to produce many three-dimensional protein structures in the near future. Each new structure raises questions about its function and evolution. Correct functional and evolutionary classification of a new structure is difficult for distantly related proteins and error-prone using simple statistical scores based on sequence or structure similarity. Here we present an accurate numerical method for the identification of evolutionary relationships (homology). The method is based on the principle that natural selection maintains structural and functional continuity within a diverging protein family. The problem of different rates of structural divergence between different families is solved by first using structural similarities to produce a global map of folds in protein space and then further subdividing fold neighborhoods into superfamilies based on functional similarities. In a validation test against a classification by human experts (SCOP), 77% of homologous pairs were identified with 92% reliability. The method is fully automated, allowing fast, self-consistent and complete classification of large numbers of protein structures. In particular, the discrimination between analogy and homology of close structural neighbors will lead to functional predictions while avoiding overprediction.
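
A validation figure of the form "77% of homologous pairs identified with 92% reliability" can be read as a coverage and a precision over pairs. The sketch below shows that arithmetic under this assumed interpretation: coverage is the fraction of reference pairs recovered and reliability is the fraction of predicted pairs found in the reference. The function name and the pair lists are hypothetical and are not taken from the paper.

```python
def coverage_and_reliability(predicted_pairs, reference_pairs):
    """Compare predicted homologous pairs with a reference classification.

    Pairs are unordered, so ("a", "b") and ("b", "a") count as the same pair.
    Returns (coverage, reliability) as fractions between 0 and 1.
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    reference = {frozenset(p) for p in reference_pairs}
    true_positives = len(predicted & reference)
    coverage = true_positives / len(reference) if reference else 0.0
    reliability = true_positives / len(predicted) if predicted else 0.0
    return coverage, reliability


# Invented example: four reference pairs, three predictions, two of them correct.
reference = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e")]
predicted = [("b", "a"), ("c", "a"), ("d", "f")]
coverage, reliability = coverage_and_reliability(predicted, reference)
```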

19.
The database PALI (Phylogeny and ALIgnment of homologous protein structures) consists of families of protein domains of known three-dimensional (3D) structure. In a PALI family, every member has been structurally aligned with every other member (pairwise), and a simultaneous (multiple) superposition of all the members has also been performed. The database also contains 3D structure-based and structure-dependent sequence similarity-based phylogenetic dendrograms for all the families. The PALI release used in the present analysis comprises 225 families derived largely from the HOMSTRAD and SCOP databases. The quality of the multiple rigid-body structural alignments in PALI was compared with that obtained from COMPARER, which encodes a procedure based on properties and relationships. The alignments from the two procedures agreed very well, and variations are seen only in the low sequence similarity cases, often in the loop regions. A validation of Direct Pairwise Alignment (DPA) between two proteins is provided by comparing it with Pairwise alignment extracted from Multiple Alignment of all the members in the family (PMA). In general, DPA and PMA are found to vary rarely. The ready availability of pairwise alignments allows the analysis of variations in structural distances as a function of sequence similarities and number of topologically equivalent Cα atoms. The structural distance metric used in the analysis combines root mean square deviation (r.m.s.d.) and number of equivalences, and is shown to vary similarly to r.m.s.d. The correlation between sequence similarity and structural similarity is poor in pairs with low sequence similarities. A comparison of sequence and 3D structure-based phylogenies for all the families suggests that only a few families have a radical difference in the two kinds of dendrograms. The difference could occur when the sequence similarity among the homologues is low or when the structures are subjected to evolutionary pressure for the retention of function. The PALI database is expected to be useful in furthering our understanding of the relationship between sequences and structures of homologous proteins and their evolution.

20.
In the postgenomic era, bioinformatic analysis of sequence similarity is an immensely powerful tool to gain insight into evolution and protein function. Over long evolutionary distances, however, sequence-based methods fail as the similarities become too low for phylogenetic analysis. Macromolecular structure generally appears better conserved than sequence, but clear models for how structure evolves over time are lacking. The exponential growth of three-dimensional structural information may allow novel structure-based methods to drastically extend the evolutionary time scales amenable to phylogenetics and functional classification of proteins. To this end, we analyzed 80 structures from the functionally diverse ferritin-like superfamily. Using evolutionary networks, we demonstrate that structural comparisons can delineate and discover groups of proteins beyond the "twilight zone" where sequence similarity does not allow evolutionary analysis, suggesting that considerable and useful evolutionary signal is preserved in three-dimensional structures.
