Similar literature
1.

Background

Conserved domains are recognized as the building blocks of eukaryotic proteins. Domains showing a tendency to occur in diverse combinations ("promiscuous" domains) are involved in versatile architectures in proteins with different functions. Current models, based on global-level analyses of domain combinations in multiple genomes, have suggested that the propensity of some domains to associate with other domains in high-level architectures increases with organismal complexity. Alternative models using domain-based phylogenetic trees propose that domains have become promiscuous independently in different lineages through convergent evolution and are, thus, random with no functional or structural preferences. Here we test whether complex protein architectures have occurred by accretion from simpler systems and whether the appearance of multidomain combinations parallels organismal complexity. As a model, we analyze the modular evolution of the PWWP domain and ask whether its appearance in combinations with other domains into multidomain architectures is linked with the occurrence of more complex life-forms. Whether high-level combinations of domains are conserved and transmitted as stable units (cassettes) through evolution is examined in the genomes of plant or metazoan species selected for their established position in the evolution of the respective lineages.

Results

Using the domain-tree approach, we analyze the evolutionary origins and distribution patterns of the promiscuous PWWP domain to understand the principles of its modular evolution and its existence in combination with other domains in higher-level protein architectures. We found that as a single module the PWWP domain occurs only in proteins with a limited, mainly species-specific distribution. Earlier, it was suggested that domain promiscuity is a fast-changing (volatile) feature shaped by natural selection and that only a few domains retain their promiscuity status throughout evolution. In contrast, our data show that most of the multidomain PWWP combinations in extant multicellular organisms (humans or land plants) are present in their unicellular ancestral relatives, suggesting they have been transmitted through evolution as conserved linear arrangements ("cassettes"). Among the most interesting biologically relevant results is the finding that the genes of the two plant Trithorax family subgroups (ATX1/2 and ATX3/4/5) have different phylogenetic origins. The two subgroups occur together in the earliest land plants Physcomitrella patens and Selaginella moellendorffii.

Conclusion

Gain/loss of a single PWWP domain is observed throughout evolution, reflecting dynamic lineage- or species-specific events. In contrast, higher-level protein architectures involving the PWWP domain have survived as stable arrangements driven by evolutionary descent. The association of PWWP domains with the DNA methyltransferases in O. tauri and in the metazoan lineage seems to have occurred independently, consistent with convergent evolution. Our results do not support models wherein more complex protein architectures involving the PWWP domain occur with the appearance of more evolutionarily advanced life forms.

2.
The members of the Ras-like superfamily of small GTP-binding proteins are molecular switches that are in general regulated in time and space by guanine nucleotide exchange factors and GTPase activating proteins. The Ras-like G-proteins Ras, Rap and Ral are regulated by a variety of guanine nucleotide exchange factors that are characterized by a CDC25 homology domain. Here we study the evolution of the Ras pathway by determining the evolutionary history of CDC25 homology domain coding sequences. We identified CDC25 homology domain coding sequences in animals, fungi and a wide range of protists, but not in plants. This suggests that the CDC25 homology domain originated in or before the last eukaryotic ancestor but was subsequently lost in plants. We provide evidence that at least seven different ancestral Ras guanine nucleotide exchange factors were present in the ancestor of fungi and animals. Differences between present day fungi and animals are the result of loss of ancestral Ras guanine nucleotide exchange factors early in fungal and animal evolution combined with lineage specific duplications and domain acquisitions. In addition, we identify Ral guanine exchange factors and Ral in early diverged fungi, dating the origin of Ral signaling back to before the divergence of animals and fungi. We conclude that the Ras signaling pathway evolved by gradual change as well as through differential sampling of the ancestral CDC25 homology domain repertoire by both fungi and animals. Finally, a comparison of the domain composition of the Ras guanine nucleotide exchange factors shows that domain addition and diversification occurred both prior to and after the fungal–animal split.

3.
Ponting CP, Dickens NJ. Genome Biology, 2001, 2(7): comment2006.1-comment2006.6
The evolutionary history of eukaryotic proteins involves rapid sequence divergence, addition and deletion of domains, and fusion and fission of genes. Although the protein repertoires of distantly related species differ greatly, their domain repertoires do not. To account for the great diversity of domain contexts and an unexpected paucity of ortholog conservation, we must categorize the coding regions of completely sequenced genomes into domain families, as well as protein families.

4.
Repurposing existing proteins for new cellular functions is recognized as a main mechanism of evolutionary innovation, but its role in organelle evolution is unclear. Here, we explore the mechanisms that led to the evolution of the centrosome, an ancestral eukaryotic organelle that expanded its functional repertoire through the course of evolution. We developed a refined sequence alignment technique that is more sensitive to coiled-coil proteins, which are abundant in the centrosome. For proteins with high coiled-coil content, our algorithm identified 17% more reciprocal best hits than BLAST. Analyzing 108 eukaryotic genomes, we traced the evolutionary history of centrosome proteins. In order to assess how these proteins formed the centrosome and adopted new functions, we computationally emulated evolution by iteratively removing the most recently evolved proteins from the centrosomal protein interaction network. Coiled-coil proteins that first appeared in the animal–fungi ancestor act as scaffolds and recruit ancestral eukaryotic proteins such as kinases and phosphatases to the centrosome. This process created a signaling hub that is crucial for multicellular development. Our results demonstrate how ancient proteins can be co-opted to different cellular localizations, thereby becoming involved in novel functions.
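
The "reciprocal best hits" criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' refined alignment method: the score tables, protein names, and function names below are hypothetical, and the only assumption is that some pairwise alignment score is available in both search directions.

```python
from typing import Dict, Set, Tuple

# (query, subject) -> alignment score; higher means more similar.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


# Hypothetical toy scores between proteins of genome A and genome B.
a_vs_b = {("a1", "b1"): 90.0, ("a1", "b2"): 40.0,
          ("a2", "b1"): 70.0, ("a2", "b2"): 30.0}
b_vs_a = {("b1", "a1"): 88.0, ("b1", "a2"): 65.0,
          ("b2", "a1"): 45.0, ("b2", "a2"): 20.0}
rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In this toy example only the pair (a1, b1) is reciprocal: a2 also prefers b1, but b1 prefers a1, so a2 has no reciprocal partner.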

5.
GGDEF domain is homologous to adenylyl cyclase   (Total citations: 21; self-citations: 0; citations by others: 21)
Pei J, Grishin NV. Proteins, 2001, 42(2): 210-216
The GGDEF domain is detected in many prokaryotic proteins, most of which are of unknown function. Several bacteria carry 12-22 different GGDEF homologues in their genomes. Conducting extensive profile-based searches, we detect statistically supported sequence similarity between the GGDEF domain and the adenylyl cyclase catalytic domain. From this homology, we deduce that the prokaryotic GGDEF domain is a regulatory enzyme involved in nucleotide cyclization, with the fold similar to that of the eukaryotic cyclase catalytic domain. This prediction correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. Domain architecture analysis shows that GGDEF is typically present in multidomain proteins containing regulatory domains of signaling pathways or protein-protein interaction modules. Evolutionary tree analysis indicates that the GGDEF/cyclase superfamily forms a large diversified cluster of orthologous proteins present in bacteria, archaea, and eukaryotes. Proteins 2001;42:210-216.

6.
There is a limited repertoire of domain families in nature that are duplicated and combined in different ways to form the set of proteins in a genome. Most proteins in both prokaryote and eukaryote genomes consist of two or more domains, and we show that the family size distribution of multi-domain protein families follows a power law like that of individual families. Most domain pairs occur in four to six different domain architectures: in isolation and in combinations with different partners. We showed previously that within the set of all pairwise domain combinations, most small and medium-sized families are observed in combination with one or two other families, while a few large families are very versatile and combine with many different partners. Though this may appear to be a stochastic pattern, in which large families have more combination partners by virtue of their size, we establish here that all the domain families with more than three members in genomes are duplicated more frequently than would be expected by chance considering their number of neighbouring domains. This duplication of domain pairs is statistically significant for between one and three quarters of all families with seven or more members. For the majority of pairwise domain combinations, there is no known three-dimensional structure of the two domains together, and we term these novel combinations. Novel domain combinations are interesting and important targets for structural elucidation, as the geometry and interaction between the domains will help understand the function and evolution of multi-domain proteins. Of particular interest are those combinations that occur in the largest number of multi-domain proteins, and several of these frequent novel combinations contain DNA-binding domains. Abbreviations: SCOP, Structural Classification of Proteins database; PDB, Protein DataBank; HMM, hidden Markov model.

7.
Though the heterotrimeric G-protein signaling system is one of the best studied in eukaryotes, its provenance and its prevalence outside of model eukaryotes remain poorly understood. We utilized the wealth of sequence data from recently sequenced eukaryotic genomes to uncover robust G-protein signaling systems in several poorly studied eukaryotic lineages such as the parabasalids, heteroloboseans and stramenopiles. This indicated that the Gα subunit is likely to have separated from the ARF-like GTPases prior to the last eukaryotic common ancestor. We systematically identified the structure and sequence features associated with this divergence and found that most of the neomorphic positions in Gα form a ring of residues centered on the nucleotide binding site, several of which are likely to be critical for interactions with the RGS domain for its GAP function. We also present evidence that in some of the potentially early branching eukaryotic lineages, like Trichomonas, Gα is likely to function independently of the Gβγ subunits. We were able to identify previously unknown Gγ subunits in Naegleria, suggesting that the trimeric version was already present by the time of the divergence of the heteroloboseans from the remaining eukaryotes. Evolution of Gα subunits is dominated by several independent lineage-specific expansions (LSEs). In most of these cases there are concomitant, independent LSEs of RGS proteins along with an extraordinary diversification of their domain architectures. The diversity of RGS domains from Naegleria in particular, which has the largest complement of Gα and RGS proteins for any eukaryote, provides new insights into RGS function and evolution.
We uncovered a new class of soluble ligand receptors of bacterial origin with RGS domains and an extraordinary diversity of membrane-linked, redox-associated, adhesion-dependent and small molecule-induced G-protein signaling networks that evolved in early-branching eukaryotes, independently of parallel systems in animals. Furthermore, this newly characterized diversity of RGS domains helps in defining their ancestral conserved interfaces with Gα and also those interfaces that are prone to extensive lineage-specific diversification and are thereby responsible for selectivity in Gα-RGS interactions. Several mushrooms show LSEs of Gαs but not of RGS proteins, pointing to the probable differentiation of Gαs in conjunction with mating-type diversity. When combined with the characterization of the 7TM receptors (GPCRs), it becomes apparent that, through much of eukaryotic evolution, cells contained both 7TM receptors that acted as GEFs and those that acted as GAPs (with C-terminal RGS domains) for Gαs. Only in some lineages, like animals and stramenopiles, were the 7TM receptors restricted to GEF-only roles, probably due to selection imposed by the rate-constants of the Gαs that underwent lineage-specific expansion in them. In the alveolate lineage the 7TM receptors occur independently of heterotrimeric G-proteins, suggesting the prevalence of G-protein-independent signaling in these organisms.

8.
Pax proteins play a diverse role in early animal development and contain the characteristic paired domain, consisting of two conserved helix-turn-helix motifs. In many Pax proteins the paired domain is fused to a second DNA binding domain of the paired-like homeobox family. By amino acid sequence alignments, secondary structure prediction, 3D-structure comparison, and phylogenetic reconstruction, we analyzed the relationship between Pax proteins and members of the Tc1 family of transposases, which possibly share a common ancestor with Pax proteins. We suggest that the DNA binding domain of an ancestral transposase (proto-Pax transposase) was fused to a homeodomain shortly after the emergence of metazoans about one billion years ago. Using the transposase sequences as an outgroup we reexamined the early evolution of the Pax proteins. Our novel evolutionary scenario features a single homeobox capturing event and an early duplication of Pax genes before the divergence of Porifera, indicating a more diverse role of Pax proteins in primitive animals than previously expected. Received: 16 February 2000 / Accepted: 13 August 2000

9.
The WD40 domain exhibits a β-propeller architecture, often comprising seven blades. The WD40 domain is one of the most abundant domains and also among the top interacting domains in eukaryotic genomes. In this review, we will discuss the identification, definition and architecture of the WD40 domains. WD40 domain proteins are involved in a large variety of cellular processes, in which WD40 domains function as a protein-protein or protein-DNA interaction platform. The WD40 domain mediates molecular recognition events mainly through the smaller top surface, but also through the bottom surface and sides. So far, no WD40 domain has been found to display enzymatic activity. We will also discuss the different binding modes exhibited by the large versatile family of WD40 domain proteins. In the last part of this review, we will discuss how post-translational modifications are recognized by WD40 domain proteins.

10.
In eukaryotes, the assembly and elongation of unbranched actin filaments is controlled by formins, which are long, multidomain proteins. These proteins are important for dynamic cellular processes such as determination of cell shape, cell division, and cellular interaction. Yet no comprehensive study of the origins and evolution of this gene family has been done. We therefore performed extensive phylogenetic and motif analyses of the formin genes by examining 597 prokaryotic and 53 eukaryotic genomes. Additionally, we used three-dimensional protein structure data in an effort to uncover distantly related sequences. Our results suggest that the formin homology 2 (FH2) domain, which promotes the formation of actin filaments, is a eukaryotic innovation and apparently originated only once in eukaryotic evolution. Despite the high degree of FH2 domain sequence divergence, the FH2 domains of most eukaryotic formins are predicted to assume the same fold and thus have similar functions. The formin genes have experienced multiple taxon-specific duplications and followed the birth-and-death model of evolution. Additionally, the formin genes experienced taxon-specific genomic rearrangements that led to the acquisition of unrelated protein domains. The evolutionary diversification of formin genes apparently increased the number of formin's interacting molecules and consequently contributed to the development of a complex and precise actin assembly mechanism. The diversity of formin types is probably related to the range of actin-based cellular processes that different cells or organisms require. Our results indicate the importance of gene duplication and domain acquisition in the evolution of the eukaryotic cell and offer insights into how a complex system, such as the cytoskeleton, evolved.

11.
Evolutionary innovation in eukaryotes and especially animals is at least partially driven by genome rearrangements and the resulting emergence of proteins with new domain combinations, and thus potentially novel functionality. Given the random nature of such rearrangements, one could expect that proteins with particularly useful multidomain combinations may have been rediscovered multiple times by parallel evolution. However, existing reports suggest a minimal role of this phenomenon in the overall evolution of eukaryotic proteomes. We assembled a collection of 172 complete eukaryotic genomes that is not only the largest, but also the most phylogenetically complete set of genomes analyzed so far. By employing a maximum parsimony approach to compare repertoires of Pfam domains and their combinations, we show that independent evolution of domain combinations is significantly more prevalent than previously thought. Our results indicate that about 25% of all currently observed domain combinations have evolved multiple times. Interestingly, this percentage is even higher for sets of domain combinations in individual species, with, for instance, 70% of the domain combinations found in the human genome having evolved independently at least once in other species. We also show that previous, much lower estimates of this rate are most likely due to the small number and biased phylogenetic distribution of the genomes analyzed. The independent emergence of identical domain combinations is widespread and not limited to domains with specific functional categories. Besides data from large-scale analyses, we also present individual examples of independent domain combination evolution. The surprisingly large contribution of parallel evolution to the development of the domain combination repertoire in extant genomes has profound consequences for our understanding of the evolution of pathways and cellular processes in eukaryotes and for comparative functional genomics.

12.
Convergent evolution of domain architectures (is rare)   (Total citations: 4; self-citations: 0; citations by others: 4)
MOTIVATION: In this paper, we shall examine the evolution of domain architectures across 62 genomes of known phylogeny including all kingdoms of life. We look in particular at the possibility of convergent evolution, with a view to determining the extent to which the architectures observed in the genomes are due to functional necessity or evolutionary descent. We used domains of known structure, because from this and other information we know their evolutionary relationships. We use a range of methods including phylogenetic grouping, sequence similarity/alignment, mutation rates and comparative genomics to approach this difficult problem from several angles. RESULTS: Although we do not claim an exhaustive analysis, we conclude that between 0.4 and 4% of sequences are involved in convergent evolution of domain architectures, and expect the actual number to be close to the lower bound. We also made two incidental observations, albeit on a small sample: the events leading to convergent evolution appear to be random with no functional or structural preferences, and changes in the number of tandem repeat domains occur more readily than changes which alter the domain composition. CONCLUSION: The principal conclusion is that the observed domain architectures of the sequences in the genomes are driven by evolutionary descent rather than functional necessity. CONTACT: gough@supfam.org.

13.
In eukaryotes, neighboring genes can be packaged together in specific chromatin structures that ensure their coordinated expression. Examples of such multi-gene chromatin domains are well-documented, but a global view of the chromatin organization of eukaryotic genomes is lacking. To systematically identify multi-gene chromatin domains, we constructed a compendium of genome-scale binding maps for a broad panel of chromatin-associated proteins in Drosophila melanogaster. Next, we computationally analyzed this compendium for evidence of multi-gene chromatin domains using a novel statistical segmentation algorithm. We find that at least 50% of all fly genes are organized into chromatin domains, which often consist of dozens of genes. The domains are characterized by various known and novel combinations of chromatin proteins. The genes in many of the domains are coregulated during development and tend to have similar biological functions. Furthermore, during evolution fewer chromosomal rearrangements occur inside chromatin domains than outside domains. Our results indicate that a substantial portion of the Drosophila genome is packaged into functionally coherent, multi-gene chromatin domains. This has broad mechanistic implications for gene regulation and genome evolution.

14.
15.
Protein domains represent the basic evolutionary units that form proteins. Domain duplication and shuffling by recombination are probably the most important forces driving protein evolution and hence the complexity of the proteome. While the duplication of whole genes as well as domain-encoding exons increases the abundance of domains in the proteome, domain shuffling increases versatility, i.e. the number of distinct contexts in which a domain can occur. Here, we describe a comprehensive, genome-wide analysis of the relationship between these two processes. We observe a strong and robust correlation between domain versatility and abundance: domains that occur more often also have many different combination partners. This supports the view that domain recombination occurs in a random way. However, we do not observe all the different combinations that are expected from a simple random recombination scenario, and this is due to frequent duplication of specific domain combinations. When we simulate the evolution of the protein repertoire considering stochastic recombination of domains followed by extensive duplication of the combinations, we approximate the observed data well. Our analyses are consistent with a stochastic process that governs domain recombination and thus protein divergence with respect to domains within a polypeptide chain. At the same time, they support a scenario in which domain combinations are formed only once during the evolution of the protein repertoire, and are then duplicated to various extents. The extent of duplication of different combinations varies widely and, in nature, will depend on selection for the domain combination based on its function. Some of the pair-wise domain combinations that are highly duplicated also recur frequently with other partner domains, and thus represent evolutionary units larger than single protein domains, which we term "supra-domains".
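
The two quantities compared in this abstract, abundance and versatility, can be made concrete with a small sketch. It assumes each protein is given as an ordered tuple of domain names and that "versatility" is counted as the number of distinct adjacent partner domains; both the representation and the function name are illustrative assumptions, not the authors' pipeline.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Architecture = Tuple[str, ...]  # ordered domain names along one protein


def abundance_and_versatility(
    architectures: List[Architecture],
) -> Dict[str, Tuple[int, int]]:
    """Return {domain: (abundance, versatility)}.

    abundance   = number of occurrences of the domain across all proteins
                  (repeats within one protein are each counted).
    versatility = number of distinct other domains found directly
                  adjacent to it (assumed definition for this sketch).
    """
    abundance: Dict[str, int] = defaultdict(int)
    partners: Dict[str, Set[str]] = defaultdict(set)
    for arch in architectures:
        for domain in arch:
            abundance[domain] += 1
        # Walk over neighbouring pairs; tandem repeats add no new partner.
        for left, right in zip(arch, arch[1:]):
            if left != right:
                partners[left].add(right)
                partners[right].add(left)
    return {d: (abundance[d], len(partners[d])) for d in abundance}


# Hypothetical toy proteome with placeholder domain names.
toy = [("A", "B"), ("C", "A", "B"), ("A",), ("D", "E")]
stats = abundance_and_versatility(toy)
```

Here domain "A" occurs three times and has two distinct neighbours ("B" and "C"), so it is both the most abundant and the most versatile domain in the toy set, which is the kind of relationship the abstract reports at genome scale.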

16.
Understanding the dynamics behind domain architecture evolution is of great importance to unravel the functions of proteins. Complex architectures have been created throughout evolution by rearrangement and duplication events. An interesting question is how many times a particular architecture has been created, a form of convergent evolution or domain architecture reinvention. Previous studies have approached this issue by comparing architectures found in different species. We wanted to achieve a finer-grained analysis by reconstructing protein architectures on complete domain trees. The prevalence of domain architecture reinvention in 96 genomes was investigated with a novel domain tree-based method that uses maximum parsimony for inferring ancestral protein architectures. Domain architectures were taken from Pfam. To ensure robustness, we applied the method to bootstrap trees and only considered results with strong statistical support. We detected multiple origins for 12.4% of the scored architectures. In a much smaller data set, the subset of completely domain-assigned proteins, the figure was 5.6%. These results indicate that domain architecture reinvention is a much more common phenomenon than previously thought. We also determined which domains are most frequent in multiply created architectures and assessed whether specific functions could be attributed to them. However, no strong functional bias was found in architectures with multiple origins.

17.

Background

Sequencing the genomes of multiple, taxonomically diverse eukaryotes enables in-depth comparative-genomic analysis which is expected to help in reconstructing ancestral eukaryotic genomes and major events in eukaryotic evolution and in making functional predictions for currently uncharacterized conserved genes.

Results

We examined functional and evolutionary patterns in the recently constructed set of 5,873 clusters of predicted orthologs (eukaryotic orthologous groups or KOGs) from seven eukaryotic genomes: Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae, Schizosaccharomyces pombe and Encephalitozoon cuniculi. Conservation of KOGs through the phyletic range of eukaryotes strongly correlates with their functions and with the effect of gene knockout on the organism's viability. The approximately 40% of KOGs that are represented in six or seven species are enriched in proteins responsible for housekeeping functions, particularly translation and RNA processing. These conserved KOGs are often essential for survival and might approximate the minimal set of essential eukaryotic genes. The 131 single-member, pan-eukaryotic KOGs we identified were examined in detail. For around 20 that remained uncharacterized, functions were predicted by in-depth sequence analysis and examination of genomic context. Nearly all these proteins are subunits of known or predicted multiprotein complexes, in agreement with the balance hypothesis of evolution of gene copy number. Other KOGs show a variety of phyletic patterns, which points to major contributions of lineage-specific gene loss and the 'invention' of genes new to eukaryotic evolution. Examination of the sets of KOGs lost in individual lineages reveals co-elimination of functionally connected genes. Parsimonious scenarios of eukaryotic genome evolution and gene sets for ancestral eukaryotic forms were reconstructed. The gene set of the last common ancestor of the crown group consists of 3,413 KOGs and largely includes proteins involved in genome replication and expression, and central metabolism. 
Only 44% of the KOGs, mostly from the reconstructed gene set of the last common ancestor of the crown group, have detectable homologs in prokaryotes; the remainder apparently evolved via duplication with divergence and invention of new genes.

Conclusions

The KOG analysis reveals a conserved core of largely essential eukaryotic genes as well as major diversification and innovation associated with evolution of eukaryotic genomes. The results provide quantitative support for major trends of eukaryotic evolution noticed previously at the qualitative level and a basis for detailed reconstruction of evolution of eukaryotic genomes and biology of ancestral forms.

18.
Domains are basic evolutionary units of proteins and most proteins have more than one domain. Advances in domain modeling and collection are making it possible to annotate a large fraction of known protein sequences by a linear ordering of their domains, yielding their architecture. Protein domain architectures link evolutionarily related proteins and underscore their shared functions. Here, we attempt to better understand this association by identifying the evolutionary pathways by which extant architectures may have evolved. We propose a model of evolution in which architectures arise through rearrangements of inferred precursor architectures and acquisition of new domains. These pathways are ranked using a parsimony principle, whereby scenarios requiring the fewest independent recombination events, namely fission and fusion operations, are assumed to be more likely. Using a data set of domain architectures present in 159 proteomes that represent all three major branches of the tree of life allows us to estimate the history of over 85% of all architectures in the sequence database. We find that the distribution of rearrangement classes is robust with respect to alternative parsimony rules for inferring the presence of precursor architectures in ancestral species. Analyzing the most parsimonious pathways, we find 87% of architectures to gain complexity over time through simple changes, among which fusion events account for 5.6 times as many architectures as fission. Our results may be used to compute domain architecture similarities, for example, based on the number of historical recombination events separating them. Domain architecture "neighbors" identified in this way may lead to new insights about the evolution of protein function.
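
The fusion and fission operations named in this abstract can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the authors' model: architectures are ordered tuples of placeholder domain names, a fusion is the end-to-end joining of two known precursors, a fission is the release of a prefix or suffix of a longer known precursor, and the function names are hypothetical.

```python
from typing import List, Set, Tuple

Arch = Tuple[str, ...]  # ordered domain names along one protein


def fusion_splits(target: Arch, precursors: Set[Arch]) -> List[Tuple[Arch, Arch]]:
    """List ways to obtain `target` by joining two precursor architectures."""
    return [
        (target[:i], target[i:])
        for i in range(1, len(target))
        if target[:i] in precursors and target[i:] in precursors
    ]


def fission_parents(target: Arch, precursors: Set[Arch]) -> List[Arch]:
    """List longer precursors that contain the (non-empty) `target`
    as a prefix or a suffix, i.e. could yield it by one split."""
    n = len(target)
    return sorted(
        p for p in precursors
        if len(p) > n and (p[:n] == target or p[-n:] == target)
    )


# Hypothetical precursor set with placeholder domain names.
precursors = {("A",), ("B", "C"), ("A", "B", "C", "D")}
target = ("A", "B", "C")
fusions = fusion_splits(target, precursors)
fissions = fission_parents(target, precursors)
```

For the toy target, one fusion (A joined to B-C) and one fission (splitting D off A-B-C-D) are both single-event explanations; a parsimony ranking of the kind described would then need further information, such as which precursors are inferred in ancestral species, to choose between them.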

19.
20.
Fan JS, Zhang M. Neuro-Signals, 2002, 11(6): 315-321
As one of the most abundant protein domains in the genomes of metazoans, PDZ domains play important roles in the targeting of proteins to specific cell membranes, as well as assembling proteins into supramolecular signaling complexes. The structures of individual PDZ domains, along with their diverse co-occurrence with a great variety of other protein domains, provide the biochemical basis for the functional diversity of PDZ proteins. In this review, we first briefly summarize the structure and target-binding properties of PDZ domains. After surveying the SMART protein domain database, we attempt to classify PDZ domain proteins into three general categories. We end the review by presenting several recent studies showing some novel features of PDZ domain proteins.
