Similar literature
 20 similar documents found (search time: 31 ms)
1.
The genus Peridinium Ehrenb. comprises a group of highly diversified dinoflagellates. Their morphological taxonomy has been established over the last century. Here, we examined relationships within the genus Peridinium, including Peridinium bipes F. Stein sensu lato, based on a molecular phylogeny derived from nuclear rDNA sequences. Extensive rDNA analyses of nine selected Peridinium species showed that intraspecies genetic variation was very low, but interspecies genetic divergence was high (>1.5% dissimilarity in the nearly complete 18S sequence; >4.4% in the 28S rDNA D1/D2). The 18S and 28S rDNA Bayesian tree topologies showed that Peridinium species grouped according to their taxonomic positions and certain morphological characters (e.g., epithecal plate formula). Of these groups, the quinquecorne group (plate formula of 3′, 2a, 7″) diverged first, followed by the umbonatum group (4′, 2a, 7″) and polonicum group (4′, 1a, 7″). Peridinium species with a plate formula of 4′, 3a, 7″ diverged last. Thus, 18S and 28S rDNA D1/D2 sequences are informative about relationships among Peridinium species. Statistical analyses revealed that the 28S rDNA D1/D2 region had a significantly higher genetic divergence than the 18S rDNA region, suggesting that the former may be a more suitable DNA marker for sequence-based delimitation of Peridinium. The rDNA sequences had sufficient discriminative power to separate P. bipes f. occultatum (Er. Lindem.) M. Lefèvre and P. bipes f. globosum Er. Lindem. into two distinct species, even though these taxa are only marginally distinguishable morphologically, by spines on the antapical plates and the shape of red bodies during cyst formation. Our results suggest that 28S rDNA can be used for all Peridinium species to make species-level taxonomic distinctions, allowing improved taxonomic classification of Peridinium.
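To make the dissimilarity percentages quoted in the abstract above more concrete, here is a minimal sketch of an uncorrected percent dissimilarity (p-distance) between two pre-aligned sequences. The function name `percent_dissimilarity`, the choice to skip gapped columns, and the toy sequences are assumptions for illustration only; the abstract does not say which distance measure or alignment software the authors used.

```python
def percent_dissimilarity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Uncorrected p-distance, in percent, between two aligned sequences.

    Columns in which either sequence has a gap are skipped (an assumption,
    not something stated in the abstract).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * mismatches / compared


# Hypothetical toy alignment: one mismatch in ten compared columns.
toy = percent_dissimilarity("ACGTACGTAC", "ACGTACGTAA")
```

Under this measure, a threshold such as the >1.5% reported for 18S would correspond to more than 15 mismatches per 1000 compared positions.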

2.
Many protein classification systems capture homologous relationships by grouping domains into families and superfamilies on the basis of sequence similarity. Superfamilies with similar 3D structures are further grouped into folds. In the absence of discernible sequence similarity, these structural similarities were long thought to have originated independently, by convergent evolution. However, the growth of databases and advances in sequence comparison methods have led to the discovery of many distant evolutionary relationships that transcend the boundaries of superfamilies and folds. To investigate the contributions of convergent versus divergent evolution in the origin of protein folds, we clustered representative domains of known structure by their sequence similarity, treating them as point masses in a virtual 2D space which attract or repel each other depending on their pairwise sequence similarities. As expected, families in the same superfamily form tight clusters. But often, superfamilies of the same fold are linked with each other, suggesting that the entire fold evolved from an ancient prototype. Strikingly, some links connect superfamilies with different folds. They arise from modular peptide fragments of between 20 and 40 residues that co-occur in the connected folds in disparate structural contexts. These may be descendants of an ancestral pool of peptide modules that evolved as cofactors in the RNA world and from which the first folded proteins arose by amplification and recombination. Our galaxy of folds summarizes, in a single image, most known and many yet undescribed homologous relationships between protein superfamilies, providing new insights into the evolution of protein domains.

3.
4.
Several studies based on the known three-dimensional (3-D) structures of proteins show that two homologous proteins with insignificant sequence similarity can adopt a common fold and may perform the same or similar biochemical functions. Hence, it is appropriate to use similarities in the 3-D structures of proteins, rather than amino acid sequence similarities, in modelling the evolution of distantly related proteins. Here we present an assessment of the use of 3-D structures in modelling the evolution of homologous proteins. Using a dataset of 108 protein domain families of known structure with at least 10 members per family, we compare the extent of structural and sequence dissimilarities among pairs of proteins, which are the inputs to the construction of phylogenetic trees. We find that the correlation between the structure-based dissimilarity measures and the sequence-based dissimilarity measures is usually good if the sequence similarity among the homologues is about 30% or more. For protein families with low sequence similarity among the members, the correlation between the sequence-based and the structure-based dissimilarities is poor. In these cases the structure-based dendrogram clusters proteins with the most similar biochemical functional properties better than the sequence-similarity-based dendrogram does. In multi-domain protein families and disulphide-rich protein families, the correlation coefficient for the match of sequence-based and structure-based dissimilarity (SDM) measures can be poor even though the sequence identity may be higher than 30%. Hence it is suggested that protein evolution is best modelled using 3-D structures if the sequence similarities (SSM) of the homologues are very low.
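The comparison described in the abstract above rests on a correlation coefficient between two sets of pairwise dissimilarities. A minimal sketch of that calculation follows, assuming a plain Pearson coefficient; the abstract does not name the coefficient used, and the function name `pearson_r` and the toy numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical pairwise dissimilarities for one protein family
# (made-up values, not data from the study).
sequence_dissimilarity = [0.20, 0.35, 0.50, 0.65, 0.80]
structure_dissimilarity = [0.10, 0.18, 0.22, 0.40, 0.45]
r = pearson_r(sequence_dissimilarity, structure_dissimilarity)
```

In a real analysis each list would hold one value per protein pair in a family, taken from the sequence-based and structure-based distance matrices respectively.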

5.
The structures and functions of proteins play essential roles in biological processes. The functions of newly discovered proteins can be predicted by comparing their structures with those of proteins of known function. Many approaches have been proposed for measuring protein structure similarity, such as the template-modeling (TM)-score method, the GRaphlet (GR)-Align method, and the commonly used root-mean-square deviation (RMSD) measures. However, alignment-based comparisons of protein structures are time-consuming on large datasets, and their accuracy still has room for improvement. In this study, we introduce a new three-dimensional (3D) Yau–Hausdorff distance between any two 3D objects. The (3D) Yau–Hausdorff distance can be used in particular to measure the similarity/dissimilarity of two proteins of any size and does not require aligning or superimposing the two structures. We apply structural similarity to study functional similarity and perform phylogenetic analysis on several datasets. The results show that the (3D) Yau–Hausdorff distance could serve as a more precise and effective method for discovering biological relationships between proteins than other structure comparison methods.

Communicated by Ramaswamy H. Sarma
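For orientation, the sketch below computes the classical Hausdorff distance between two sets of 3D points. This is explicitly not the (3D) Yau–Hausdorff distance introduced in the abstract above, whose definition is not given there; it is only the standard set distance that the name builds on. The function names and toy coordinates are assumptions for illustration.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in points_b) for a in points_a)


def hausdorff(points_a, points_b):
    """Classical (symmetric) Hausdorff distance between two point sets."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    return max(
        directed_hausdorff(points_a, points_b),
        directed_hausdorff(points_b, points_a),
    )


# Hypothetical toy coordinates, not real protein data.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
d = hausdorff(set_a, set_b)  # 3.0: (4,0,0) lies 3 units from its nearest point in A
```

Note that this classical form depends on how the two point sets are positioned relative to each other, whereas the abstract states that the Yau–Hausdorff distance needs no alignment or superposition.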

6.
Phylogeny as a guide to structure and function of membrane transport proteins   (Cited by: 10 total; 0 self-citations; 10 by others)
Protein phylogeny, based on primary amino acid sequence relatedness, reflects the evolutionary process and therefore provides a guide to structure, mechanism and function. Any two proteins that are related by common descent are expected to exhibit similar structures and functions to a degree proportional to the degree of their sequence similarity; but two independently evolving proteins should not. This principle provides the impetus to define protein phylogenetic relationships and interrelate families when possible. In this mini-review, we summarize the computational approaches and criteria we use to establish common evolutionary origin. We apply these tools to define distant superfamily relationships between several previously recognized transport protein families. In some cases, available structural and functional data are evaluated in order to substantiate our claim that molecular phylogeny provides a reliable guide to protein structure and function.

7.
It is observed that during divergent evolution of two proteins with a common phylogenetic origin, the structural similarity of their backbones is often preserved even when the sequence similarity between them decreases to a virtually undetectable level. Here we analyzed whether the conservation of structure during evolution also involves the local atomic structures in the interfaces between secondary structural elements. As a case study, we used one protein family, the proteasomal subunits, for which 17 crystal structures are known. These include 14 different subunits of Saccharomyces cerevisiae, 2 subunits of Thermoplasma acidophilum and one subunit of Escherichia coli. The structural core of the 17 proteasomal subunits has 23 secondary structural elements. Any two adjacent secondary structural elements form a molecular interface consisting of two molecular patches. We found 61 interfaces that occurred in all 17 subunits. The 3D shapes of equivalent molecular patches from different proteasomal subunits were compared by superposition. Our results demonstrate that pairs of equivalent molecular patches show an RMSD which is lower than that of randomly chosen patches from unrelated proteins. This is true even when patch comparisons with identical residues were excluded from the analysis. Furthermore, it is known that sequence dissimilarity is correlated with the RMSD between the backbones of the members of protein families. The question arises whether this is also true for local atomic structures. The results show that the correlation between individual patch RMSD values and local sequence dissimilarities is low, ranging widely from 0 to 0.41; surprisingly, however, there is a good correlation between the average RMSD of all corresponding patches and the global sequence dissimilarity. This average patch RMSD correlates slightly more strongly with the global sequence dissimilarity than does the C(alpha)-trace RMSD.
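The RMSD values discussed in the abstract above can be illustrated with a minimal sketch. It assumes the two atom lists are already matched one-to-one and already superposed; finding the optimal superposition (which the study performs) is a separate step not shown here. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between equivalent, pre-superposed atoms.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples in which
    the i-th entries are assumed to be equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy patch: every atom is displaced by 2 units along z.
patch_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
patch_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
value = rmsd(patch_a, patch_b)  # 2.0
```

An average patch RMSD of the kind the abstract describes would then be the mean of this value over all corresponding patches of a pair of subunits.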

8.
9.
Bae SH, Liu D, Lim HM, Lee Y, Choi BS. Biochemistry 2008, 47(7): 1993-2001
Cnu is a nucleoid protein that has a high degree of sequence homology with Hha/YmoA family proteins, which bind to chromatin and regulate the expression of Escherichia coli virulence genes in response to changes in temperature or ionic strength. Here, we determined its solution structure and dynamic properties and mapped H-NS binding sites. Cnu consists of three alpha helices that are comparable with those of Hha, but it has significant flexibility in the C-terminal region and lacks a short alpha helix present in Hha. Upon increasing ionic strength, the helical structure of Cnu is destabilized, especially at the ends of the helices. The dominant H-NS binding sites, located at helix 3 as in Hha, reveal a common structural platform for H-NS binding. Our results may provide structural and dynamic bases for the similarity and dissimilarity between Cnu and Hha functions.

10.
Catalytic site structure is normally highly conserved between distantly related enzymes. As a consequence, templates representing catalytic sites have the potential to succeed at function prediction in cases where methods based on sequence or overall structure fail. There are many methods for searching protein structures for matches to structural templates, but few validated template libraries to use with these methods. We present a library of structural templates representing catalytic sites, based on information from the scientific literature. Furthermore, we analyse homologous template families to discover the diversity within families and the utility of templates for active site recognition. Templates representing the catalytic sites of homologous proteins mostly differ by less than 1 Å root mean square deviation, even when the sequence similarity between the two proteins is low. Within these sets of homologues there is usually no discernible relationship between catalytic site structure similarity and sequence similarity. Because of this structural conservation of catalytic sites, the templates can discriminate between matches to related proteins and random matches with over 85% sensitivity and predictive accuracy. Templates based on protein backbone positions are more discriminating than those based on side-chain atoms. These analyses show encouraging prospects for prediction of functional sites in structural genomics structures of unknown function, and will be of use in analyses of convergent evolution and exploring relationships between active site geometry and chemistry. The template library can be queried via a web server and is available for download.

11.
Functional analyses of the tRNA:(guanine 26, N2,N2)-dimethyltransferase (Trm1) have been hampered by a lack of structural information about the enzyme and by low sequence similarity to better studied methyltransferases. Here we used computational methods to detect novel homologs of Trm1, infer the evolutionary relationships of the family, and predict the structure of the Trm1 methyltransferase. The N-terminal region of the protein is predicted to form an S-adenosylmethionine-binding domain, which harbors the active site. The C-terminal region is rich in predicted alpha-helices and, in analogy to other nucleic acid methyltransferases, may constitute the target recognition domain of the enzyme. Interposing these two domains, most Trm1 homologs possess a highly variable inserted sequence that is delimited by a Cys4 cluster, likely forming a Zn-finger structure. The residues of Trm1 predicted to participate in cofactor binding, target recognition, and catalysis were mapped onto a preliminary structural model, providing a platform for designing new experiments to better understand the molecular functions of this protein family. In addition, identification of novel, atypical Trm1 homologs suggests candidates for cloning and biochemical characterization.

12.
Proteins that contain similar structural elements often have analogous functions regardless of the degree of sequence similarity or structure connectivity in space. In general, protein structure comparison (PSC) provides a straightforward methodology for biologists to determine critical aspects of structure and function. Here, we developed a novel PSC technique based on angle-distance image (A-D image) transformation and matching, which is independent of sequence similarity and connectivity of secondary structure elements (SSEs). An A-D image is constructed by utilizing protein secondary structure information. According to various types of SSEs, the mutual SSE pairs of the query protein are classified into three different types of sub-images. Subsequently, corresponding sub-images between query and target protein structures are compared using modified cross-correlation approaches to identify the similarity of various patterns. Structural relationships among proteins are displayed by hierarchical clustering trees, which facilitate the establishment of the evolutionary relationships between structure and function of various proteins. Four standard testing datasets and one newly created dataset were used to evaluate the proposed method. The results demonstrate that proteins from these five datasets can be categorized in conformity with their spatial distribution of SSEs. Moreover, for proteins with low sequence identity that share high structure similarity, the proposed algorithm is an efficient and effective method for structural comparison.

13.
Evolution of protein sequences and structures.   (Cited by: 9 total; 0 self-citations; 9 by others)
The relationship between sequence similarity and structural similarity has been examined in 36 protein families with five or more diverse members whose structures are known. The structural similarity within a family (as determined with the DALI structure comparison program) is linearly related to sequence similarity (as determined by a Smith-Waterman search of the protein sequences in the structure database). The correlation between structural similarity and sequence similarity is very high; 18 of the 36 families had linear correlation coefficients r ≥ 0.878, and only nine had correlation coefficients r…

14.
Kosloff M, Kolodny R. Proteins 2008, 71(2): 891-902
It is often assumed that in the Protein Data Bank (PDB), two proteins with similar sequences will also have similar structures. Accordingly, it has proved useful to develop subsets of the PDB from which "redundant" structures have been removed, based on a sequence-based criterion for similarity. Similarly, when predicting protein structure using homology modeling, if a template structure for modeling a target sequence is selected by sequence alone, this implicitly assumes that all sequence-similar templates are equivalent. Here, we show that this assumption is often not correct and that standard approaches to create subsets of the PDB can lead to the loss of structurally and functionally important information. We have carried out sequence-based structural superpositions and geometry-based structural alignments of a large number of protein pairs to determine the extent to which sequence similarity ensures structural similarity. We find many examples where two proteins that are similar in sequence have structures that differ significantly from one another. The source of the structural differences usually has a functional basis. The number of such protein pairs that are identified and the magnitude of the dissimilarity depend on the approach that is used to calculate the differences; in particular, sequence-based structure superpositioning will identify a larger number of structurally dissimilar pairs than geometry-based structural alignments. When two sequences can be aligned in a statistically meaningful way, sequence-based structural superpositioning provides a meaningful measure of structural differences. This approach and geometry-based structure alignments reveal somewhat different information and one or the other might be preferable in a given application. Our results suggest that in some cases, notably homology modeling, the common use of nonredundant datasets, culled from the PDB based on sequence, may mask important structural and functional information.
We have established a database of sequence-similar, structurally dissimilar protein pairs that will help address this problem (http://luna.bioc.columbia.edu/rachel/seqsimstrdiff.htm).

15.
A database search will often find a seemingly strong sequence similarity between two fragments of proteins that are not expected to have an evolutionary or functional relationship. It is tempting to suggest that the two fragments will adopt a similar conformation due to a common pattern of residues that dictate a particular substructure. To investigate the likelihood of such a structural similarity, local sequence similarities between proteins of known conformation were identified by a standard database search algorithm. Sequence similarity was deemed significant when the chance probability of obtaining the relatedness score from a scan of the entire database was less than 1%. In this region both true homologies and false homologies are detected. A total of 69 false homologies were located, ranging in length from 20 to 262 aligned positions. Many of these alignments had approximately 25% sequence identity and a further 25% of conservative changes. However, the results show that in general these aligned fragments did not have a significant similarity in secondary or tertiary structure. Thus, local sequence similarity does not indicate structural similarity when there is neither an evolutionary nor a functional explanation to support it. Accordingly, structure predictions based on finding a local sequence similarity with an evolutionarily unrelated protein of known conformation are unlikely to be valid.

16.
Levenshtein dissimilarity measures are used to compare sequences in application areas including coding theory, computer science and macromolecular biology. In general, they measure sequence dissimilarity by the length of a shortest weighted sequence of insertions, deletions and substitutions required to transform one sequence into another. Those Levenshtein dissimilarity measures based on insertions and deletions are analyzed by a model involving valuations on a partially ordered set. The model reveals structural relationships among poset, valuation and dissimilarity measure. As a consequence, certain Levenshtein dissimilarity measures are shown to be metrics characterized by betweenness properties and computable in terms of well-known measures of sequence similarity. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada under Grant A-4142.
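One familiar instance of a dissimilarity that is "computable in terms of well-known measures of sequence similarity" is the unit-cost insertion/deletion distance, which equals len(a) + len(b) - 2 * LCS(a, b), where LCS is the length of a longest common subsequence. The sketch below shows that special case only; the unit weights and the function names are assumptions, and the abstract above treats a more general weighted, poset-based model.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[len(b)]


def indel_distance(a: str, b: str) -> int:
    """Minimum number of unit-cost insertions and deletions turning a into b."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


# "ACGT" -> "AGT" needs one deletion (the C).
example = indel_distance("ACGT", "AGT")  # 1
```

Because substitutions are not allowed here, swapping one symbol for another costs two operations (one deletion plus one insertion) rather than one.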

17.
The 3D structures of α-crystallin, a major eye lens protein, and related small heat shock proteins are unresolved. It has been assumed that α-crystallin is primarily a β-sheet globular protein similar to γ-crystallin (Siezen and Argos, Biochim. Biophys. Acta, 1983, 748, 56–67) containing sequence repeats in its two domains (Wistow, FEBS Lett. 1985, 181, 1–6). Positional flexibility of amino acid residues and far UV-circular dichroism spectroscopy were used to investigate structural relationships among these proteins. The utility of flexibility plots for predicting protein structure is demonstrated by the excellent correlation of these plots with the known 3D X-ray structures of β/γ-crystallins. Similar analyses of α-crystallin subunits, αA and αB, and human heat shock protein 27 show that the C-terminal domains and connecting segments of these proteins are very similar while the N-terminal domains have significant structural differences. Unlike β/γ-crystallins, both Hsp27 and α-crystallin subunits are asymmetrical with highly flexible C-terminal domains. Flexibility is considered essential for protein functional activity. Therefore, the C-terminal region may play an active role in α-crystallin and small heat shock protein function. Differences in flexibility profiles and estimated secondary structure distribution in α-crystallin by three recent/updated algorithms from far UV-CD spectra support our predicted 3D structure and the concept that α-crystallin and members of β/γ-superfamily are structurally dissimilar.

18.
The dramatic increase in heterogeneous types of biological data, in particular the abundance of new protein sequences, requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity (GPCRs and kinases from humans, and the crotonase superfamily of enzymes), we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
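The core idea of a sequence similarity network, as described in the abstract above, can be sketched as a thresholded graph: proteins are nodes, and an edge joins two proteins whose pairwise similarity score passes a cutoff. The sketch below assumes precomputed pairwise scores where higher means more similar; the function names, the protein labels, the scores, and the cutoff are all hypothetical, and the abstract does not specify the scoring or software the authors used.

```python
def similarity_network(scores, threshold):
    """Build an adjacency map from pairwise scores.

    scores maps (name_a, name_b) -> similarity; an edge is kept when the
    score is at least `threshold`.
    """
    adjacency = {}
    for (a, b), score in scores.items():
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Group nodes into clusters reachable through kept edges."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components


# Hypothetical scores for four proteins (made-up values).
toy_scores = {
    ("p1", "p2"): 0.9, ("p2", "p3"): 0.8, ("p1", "p3"): 0.2,
    ("p1", "p4"): 0.1, ("p2", "p4"): 0.1, ("p3", "p4"): 0.1,
}
clusters = connected_components(similarity_network(toy_scores, threshold=0.5))
```

Here p1, p2 and p3 end up in one cluster even though p1 and p3 score below the cutoff, because both connect through p2; raising or lowering the threshold changes how finely the network splits, which is one of the caveats such analyses need to consider.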

19.
Many dissimilar protein sequences fold into similar structures. A central and persistent challenge facing protein structural analysis is the discrimination between homology and convergence for structurally similar domains that lack significant sequence similarity. Classic examples are the OB-fold and SH3 domains, both small, modular beta-barrel protein superfolds. The similarities among these domains have variously been attributed to common descent or to convergent evolution. Using a sequence profile-based phylogenetic technique, we analyzed all structurally characterized OB-fold, SH3, and PDZ domains with less than 40% mutual sequence identity. An all-against-all, profile-versus-profile analysis of these domains revealed many previously undetectable significant interrelationships. The matrices of scores were used to infer phylogenies based on our derivation of the relationships between sequence similarity E-values and evolutionary distances. The resulting clades of domains correlate remarkably well with biological function, as opposed to structural similarity, indicating that the functionally distinct sub-families within these superfolds are homologous. This method extends phylogenetics into the challenging "twilight zone" of sequence similarity, providing the first objective resolution of deep evolutionary relationships among distant protein families.

20.
Type II restriction enzymes are commercially important deoxyribonucleases and very attractive targets for protein engineering of new specificities. At the same time they are a very challenging test bed for protein structure prediction methods. Typically, enzymes that recognize different sequences show little or no amino acid sequence similarity to each other and to other proteins. Based on crystallographic analyses that revealed the same PD-(D/E)XK fold for more than a dozen case studies, they were nevertheless considered to be related until the combination of bioinformatics and mutational analyses demonstrated that some of these proteins belong to other, unrelated folds: PLD, HNH, and GIY-YIG. As a part of a large-scale project aiming at identification of a three-dimensional fold for all type II REases with known sequences (currently approximately 1000 proteins), we carried out preliminary structure prediction and selected candidates for experimental validation. Here, we present the analysis of HpaI REase, an ORFan with no detectable homologs, for which we detected a structural template by protein fold recognition, constructed a model using the FRankenstein monster approach and identified a number of residues important for DNA binding and catalysis. These predictions were confirmed by site-directed mutagenesis and in vitro analysis of the mutant proteins. The experimentally validated model of HpaI will serve as a low-resolution structural platform for evolutionary considerations in the subgroup of blunt-cutting REases with different specificities. The research protocol developed in the course of this work represents a streamlined version of the previously used techniques and can be used in a high-throughput fashion to build and validate models for other enzymes, especially ORFans that exhibit no sequence similarity to any other protein in the database.
