Similar articles
Found 20 similar articles (search time: 359 ms)
1.
Most eukaryotic proteins consist of multiple domains created through gene fusions or internal duplications. The most frequent change of a domain architecture (DA) is insertion or deletion of a domain at the N or C terminus. Still, the mechanisms underlying the evolution of multidomain proteins are not very well studied. Here, we have studied the evolution of multidomain architectures (MDA), guided by evolutionary information in the form of a phylogenetic tree. Our results show that Pfam domain families and MDAs have been created with comparable rates (0.1-1 per million years (My)). The major changes in DA evolution have occurred in the process of multicellularization and within the metazoan lineage. In contrast, creation of domains seems to have been frequent already in the early evolution. Furthermore, most of the architectures have been created from older domains or architectures, whereas novel domains are mainly found in single-domain proteins. However, a particular group of exon-bordering domains may have contributed to the rapid evolution of novel multidomain proteins in metazoan organisms. Finally, MDAs have evolved predominantly through insertions of domains, whereas domain deletions are less common. In conclusion, the rate of creation of multidomain proteins has accelerated in the metazoan lineage, which may partly be explained by the frequent insertion of exon-bordering domains into new architectures. However, our results indicate that other factors have contributed as well.

2.
We have investigated the mechanism and the evolutionary pathway of protein dimerization through analysis of experimental structures of dimers. We propose that the evolution of dimers may have multiple pathways, including (1) formation of a functional dimer directly without going through an ancestor monomer, (2) formation of a stable monomer as an intermediate followed by mutations of its surface residues, and (3) a domain swapping mechanism, replacing one segment in a monomer by an equivalent segment from an identical chain in the dimer. Some of the dimers which are governed by a domain swapping mechanism may have evolved at an earlier stage of evolution via the second mechanism. Here, we follow the theory that the kinetic pathway reflects the evolutionary pathway. We analyze the structure-kinetics-evolution relationship for a collection of symmetric homodimers classified into three groups: (1) 14 dimers, which were referred to as domain swapping dimers in the literature; (2) nine 2-state dimers, which have no measurable intermediates in equilibrium denaturation; and (3) eight 3-state dimers, which have stable intermediates in equilibrium denaturation. The analysis consists of the following stages: (i) The dimer is divided into two structural units, which have twofold symmetry. Each unit contains a contiguous segment from one polypeptide chain of the dimer, and its complementary contiguous segment from the other chain. (ii) The division is repeated progressively, with different combinations of the two segments in each unit. (iii) The coefficient of compactness is calculated for the units in all divisions. The coefficients obtained for different cuttings of a dimer form a compactness profile. The profile probes the structural organization of the two chains in a dimer and the stability of the monomeric state. We describe the features of the compactness profiles in each of the three dimer groups.
The profiles identify the swapping segments in domain swapping dimers, and can usually predict whether a dimer has domain swapping. The kinetics of dimerization indicates that some dimers which have been assigned in the literature as domain swapping cases dimerize through 2-state kinetics, rather than through swapping segments of preformed monomers. The compactness profiles indicate a wide spectrum in the kinetics of dimerization: dimers having no intermediate stable monomers; dimers having an intermediate with a stable monomer structure; and dimers having an intermediate with a stable structure in part of the monomer. These correspond to the multiple evolutionary pathways for dimer formation. The evolutionary mechanisms proposed here for dimers are applicable to other oligomers as well.

3.
We have previously attempted to simulate domain creation in early protein evolution by recombining polypeptide segments from non-homologous proteins, and we have described the structure of one such de novo protein, 1b11, a segment-swapped tetramer with novel architecture. Here, we have analyzed the thermodynamic stability and folding kinetics of the 1b11 tetramer and its monomeric and dimeric intermediates, and of 1b11 mutants with changes at the domain interface. Denatured 1b11 polypeptides fold into transient, folded monomers with marginal stability (ΔG < 1 kcal mol⁻¹) which convert rapidly (approximately 6 × 10⁴ M⁻¹ s⁻¹) into dimers (ΔG = 9.8 kcal mol⁻¹) and then more slowly (approximately 3 M⁻¹ s⁻¹) into tetramers (ΔG = 28 kcal mol⁻¹). Segment swapping takes place during dimerization, as suggested by mass spectroscopic analysis of covalently linked peptides derived from proteolysis of a disulfide-linked dimer. Our results confirm that segment swapping and associated oligomerization are both powerful ways of stabilizing proteins, and we suggest that this may have been a feature of early protein evolution.

4.

Background

Conserved domains are recognized as the building blocks of eukaryotic proteins. Domains showing a tendency to occur in diverse combinations ("promiscuous" domains) are involved in versatile architectures in proteins with different functions. Current models, based on global-level analyses of domain combinations in multiple genomes, have suggested that the propensity of some domains to associate with other domains in high-level architectures increases with organismal complexity. Alternative models using domain-based phylogenetic trees propose that domains have become promiscuous independently in different lineages through convergent evolution and are, thus, random with no functional or structural preferences. Here we test whether complex protein architectures have occurred by accretion from simpler systems and whether the appearance of multidomain combinations parallels organismal complexity. As a model, we analyze the modular evolution of the PWWP domain and ask whether its appearance in combinations with other domains into multidomain architectures is linked with the occurrence of more complex life-forms. Whether high-level combinations of domains are conserved and transmitted as stable units (cassettes) through evolution is examined in the genomes of plant or metazoan species selected for their established position in the evolution of the respective lineages.

Results

Using the domain-tree approach, we analyze the evolutionary origins and distribution patterns of the promiscuous PWWP domain to understand the principles of its modular evolution and its existence in combination with other domains in higher-level protein architectures. We found that as a single module the PWWP domain occurs only in proteins with a limited, mainly species-specific distribution. Earlier, it was suggested that domain promiscuity is a fast-changing (volatile) feature shaped by natural selection and that only a few domains retain their promiscuity status throughout evolution. In contrast, our data show that most of the multidomain PWWP combinations in extant multicellular organisms (humans or land plants) are present in their unicellular ancestral relatives, suggesting they have been transmitted through evolution as conserved linear arrangements ("cassettes"). Among the most interesting biologically relevant results is the finding that the genes of the two plant Trithorax family subgroups (ATX1/2 and ATX3/4/5) have different phylogenetic origins. The two subgroups occur together in the earliest land plants Physcomitrella patens and Selaginella moellendorffii.

Conclusion

Gain/loss of a single PWWP domain is observed throughout evolution, reflecting dynamic lineage- or species-specific events. In contrast, higher-level protein architectures involving the PWWP domain have survived as stable arrangements driven by evolutionary descent. The association of PWWP domains with the DNA methyltransferases in O. tauri and in the metazoan lineage seems to have occurred independently, consistent with convergent evolution. Our results do not support models wherein more complex protein architectures involving the PWWP domain occur with the appearance of more evolutionarily advanced life forms.

5.
Potato type II serine proteinase inhibitors are proteins that consist of multiple sequence repeats, and exhibit a multidomain structure. The structural domains are circular permutations of the repeat sequence, as a result of intramolecular domain swapping. Structural studies give indications for the origins of this folding behaviour, and the evolution of the inhibitor family.

6.

Background  

Protein domains represent the basic units in the evolution of proteins. Domain duplication and shuffling by recombination and fusion, followed by divergence, are the most common mechanisms in this process. Such domain fusion and recombination events are predicted to occur only once for a given multidomain architecture. However, other scenarios may be relevant in the evolution of specific proteins, such as convergent evolution of multidomain architectures. With this in mind, we study glutaredoxin (GRX) domains, because these domains of approximately one hundred amino acids are widespread in archaea, bacteria and eukaryotes and participate in fusion proteins. GRXs are responsible for the reduction of protein disulfides or glutathione-protein mixed disulfides and are involved in cellular redox regulation, although their specific roles and targets are often unclear.

7.
Comparisons of bacteriophage PRD1 and adenovirus protein structures and virion architectures have been instrumental in unraveling an evolutionary relationship and have led to a proposal of a phylogeny-based virus classification. The structure of the PRD1 spike protein P5 provides further insight into the evolution of viral proteins. The crystallized P5 fragment comprises two structural domains: a globular knob and a fibrous shaft. The head folds into a ten-stranded jelly roll beta barrel, which is structurally related to the tumor necrosis factor (TNF) and the PRD1 coat protein domains. The shaft domain is a structural counterpart to the adenovirus spike shaft. The structural relationships between PRD1, TNF, and adenovirus proteins suggest that the vertex proteins may have originated from an ancestral TNF-like jelly roll coat protein via a combination of gene duplication and deletion.

8.
During evolution, many new proteins have been formed by the process of gene duplication and combination. The genes involved in this process usually code for whole domains. Small proteins contain one domain; medium and large proteins contain two or more domains. We have compared homologous domains that occur in both one-domain proteins and multidomain proteins. We have determined (1) how the functions of the individual domains in the multidomain proteins combine to produce their overall functions and (2) the extent to which these functions are similar to those in the one-domain homologs. We describe how domain combinations increase the specificity of enzymes; act as links between domains that have functional roles; regulate activity; combine within one chain functions that can act either independently, in concert or in new contexts; and provide the structural framework for the evolution of entirely new functions.

9.
When the entire genome of a filamentous heterocyst-forming N2-fixing cyanobacterium, Anabaena sp. PCC 7120 (Anabaena), was determined in 2001, a large number of PAS domains were detected in signal-transducing proteins. The draft genome sequence is also available for the cyanobacterium Nostoc punctiforme strain ATCC 29133 (Nostoc), which is closely related to Anabaena. In this study, we extracted all PAS domains from the Nostoc genome sequence and analyzed them together with those of Anabaena. Clustering analysis of all the PAS domains gave many specific pairings, indicative of evolutionary conservation. Ortholog analysis of PAS-containing proteins showed composite multidomain architecture in some cases of conserved domains and domains of disagreement between the two species. Further inspection of the domains of disagreement allowed us to trace them back in evolution. Thus, multidomain proteins could have been generated by duplication or shuffling in these cyanobacteria. The conserved PAS domains in the orthologous proteins were analyzed by structural fitting to the known PAS domains. We detected several subclasses with unique sequence features, which will be the target of experimental analysis.

10.
Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

11.
Qi Y, Grishin NV. Proteins. 2005;58(2):376-388
Protein structure classification is necessary to comprehend the rapidly growing structural data for better understanding of protein evolution and sequence-structure-function relationships. Thioredoxins are important proteins that ubiquitously regulate cellular redox status and various other crucial functions. We define the thioredoxin-like fold using the structure consensus of thioredoxin homologs and consider all circular permutations of the fold. The search for thioredoxin-like fold proteins in the PDB database identified 723 protein domains. These domains are grouped into eleven evolutionary families based on combined sequence, structural, and functional evidence. Analysis of the protein-ligand structure complexes reveals two major active site locations for the thioredoxin-like proteins. Comparison to existing structure classifications reveals that our thioredoxin-like fold group is broader and more inclusive, unifying proteins from five SCOP folds, five CATH topologies and seven DALI domain dictionary globular folding topologies. Considering these structurally similar domains together sheds new light on the relationships between sequence, structure, function and evolution of thioredoxins.

12.
During evolution, proteins containing newly emerged domains and the increasing proportion of multidomain proteins in the full genome-encoded proteome (GEP) have substantially contributed to increasing biological complexity. However, it is not known how these two potential structural factors are preferentially utilized at given physiological states. Here, we classified proteins according to domain number and domain age and explored the general trends across species for the utilization of proteins from GEP to various certain-state proteomes (CSPs, i.e., all the proteins expressed at certain physiological states). We found that multidomain proteins or only older domain-containing proteins are significantly overrepresented in CSPs compared with GEP, which is a trend that is stronger in multicellular organisms than in unicellular organisms. Interestingly, the strengths of overrepresentation decreased during evolution of multicellular eukaryotes. When comparing across CSPs, we found that multidomain proteins are more overrepresented in complex tissues than in simpler ones, whereas no difference among proteins with domains of different ages is evident between complex and simple tissues. Thus, biological complexity under certain conditions is more significantly realized by diverse domain organization than by the emergence of new types of domain. In addition, we found that multidomain or only older domain-containing proteins tend to evolve slowly and generally are under stronger purifying selection, which may partly result from their general overrepresentation trends in CSPs.

13.
Domains are basic evolutionary units of proteins and most proteins have more than one domain. Advances in domain modeling and collection are making it possible to annotate a large fraction of known protein sequences by a linear ordering of their domains, yielding their architecture. Protein domain architectures link evolutionarily related proteins and underscore their shared functions. Here, we attempt to better understand this association by identifying the evolutionary pathways by which extant architectures may have evolved. We propose a model of evolution in which architectures arise through rearrangements of inferred precursor architectures and acquisition of new domains. These pathways are ranked using a parsimony principle, whereby scenarios requiring the fewest independent recombination events, namely fission and fusion operations, are assumed to be more likely. Using a data set of domain architectures present in 159 proteomes that represent all three major branches of the tree of life allows us to estimate the history of over 85% of all architectures in the sequence database. We find that the distribution of rearrangement classes is robust with respect to alternative parsimony rules for inferring the presence of precursor architectures in ancestral species. Analyzing the most parsimonious pathways, we find 87% of architectures to gain complexity over time through simple changes, among which fusion events account for 5.6 times as many architectures as fission. Our results may be used to compute domain architecture similarities, for example, based on the number of historical recombination events separating them. Domain architecture "neighbors" identified in this way may lead to new insights about the evolution of protein function.
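
The parsimony ranking described in this abstract can be illustrated with a small sketch. It is not the authors' model: it assumes, for illustration only, that an architecture is a tuple of domain names, that candidate precursors are given as a plain list, and that the only operation is end-to-end fusion of contiguous pieces. The function name `min_fusions` and the toy domain names are hypothetical.

```python
def min_fusions(target, precursors):
    """Fewest fusion events needed to assemble `target` from precursor pieces.

    Toy assumption: each precursor may be reused, and a fusion joins two
    adjacent pieces end to end. Returns None if no assembly exists.
    """
    target = tuple(target)
    if not target:
        raise ValueError("target architecture must contain at least one domain")
    pieces = {tuple(p) for p in precursors}
    n = len(target)
    inf = float("inf")
    # best[i] = fewest precursor pieces that exactly cover target[:i]
    best = [inf] * (n + 1)
    best[0] = 0
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] < inf and target[start:end] in pieces:
                best[end] = min(best[end], best[start] + 1)
    # k pieces joined end to end require k - 1 fusion events
    return None if best[n] == inf else best[n] - 1


# Hypothetical precursor architectures for the example
precursors = [("A", "B"), ("A",), ("B",), ("C",)]
one_fusion = min_fusions(("A", "B", "C"), precursors)
```

Under these assumptions, the scenario with the smallest count would be ranked as most likely; a fuller treatment would also have to model fission and the acquisition of new domains, which this sketch omits.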

14.
We now know that the evolution of multidomain proteins has frequently involved genetic duplication events. These, however, are sometimes difficult to trace because of low sequence similarity between duplicated segments. Spectrin, the major component of the membrane skeleton that provides elasticity to the cell, contains tandemly repeated sequences of 106 amino acid residues. The same repeats are also present in α-actinin, dystrophin and utrophin. Sequence alignments and phylogenetic trees of these domains allow us to interpret the evolutionary relationship between these proteins, concluding that spectrin evolved from α-actinin by an elongation process that included two duplications of a block of seven repeats. This analysis shows how a modular protein unit can be used in the evolution of large cytoskeletal structures.

15.
16.
Tordai H, Nagy A, Farkas K, Bányai L, Patthy L. The FEBS Journal. 2005;272(19):5064-5078
Originally the term 'protein module' was coined to distinguish mobile domains that frequently occur as building blocks of diverse multidomain proteins from 'static' domains that usually exist only as stand-alone units of single-domain proteins. Despite the widespread use of the term 'mobile domain', the distinction between static and mobile domains is rather vague as it is not easy to quantify the mobility of domains. In the present work we show that the most appropriate measure of the mobility of domains is the number of types of local environments in which a given domain is present. Ranking of domains with respect to this parameter in different evolutionary lineages highlighted marked differences in the propensity of domains to form multidomain proteins. Our analyses have also shown that there is a correlation between domain size and domain mobility: smaller domains are more likely to be used in the construction of multidomain proteins, whereas larger domains are more likely to be static, stand-alone domains. It is also shown that shuffling of a limited set of modules was facilitated by intronic recombination in the metazoan lineage and this has contributed significantly to the emergence of novel complex multidomain proteins, novel functions and increased organismic complexity of metazoa.
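
The mobility measure named in this abstract, the number of types of local environments in which a domain occurs, can be sketched as follows. The abstract does not define "local environment", so this sketch assumes it means the pair of immediately flanking domains, with hypothetical markers `<N>` and `<C>` standing in for the chain termini. The function name `domain_mobility` and the toy architectures are likewise illustrative.

```python
from collections import defaultdict


def domain_mobility(architectures):
    """Count distinct (left neighbor, right neighbor) contexts per domain.

    Each architecture is a sequence of domain names in N-to-C order.
    Assumption: a "local environment" is the pair of flanking domains.
    """
    environments = defaultdict(set)
    for arch in architectures:
        # Pad with terminus markers so edge domains also get a context
        padded = ["<N>"] + list(arch) + ["<C>"]
        for i in range(1, len(padded) - 1):
            environments[padded[i]].add((padded[i - 1], padded[i + 1]))
    return {domain: len(envs) for domain, envs in environments.items()}


# Toy data set: SH2 appears in three different contexts, Globin only alone
toy_architectures = [
    ("SH3", "SH2", "Kinase"),
    ("SH2", "Kinase"),
    ("SH3", "SH2"),
    ("Globin",),
]
mobility = domain_mobility(toy_architectures)
```

In this toy set `SH2` scores 3 while `SH3`, `Kinase` and `Globin` each score 1, so ranking by the score separates the more mobile domain from the static ones. A wider window, or counting unordered neighbor sets, would be equally defensible readings of the abstract.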

17.
Gene fusion produces proteins with novel structural architectures during evolution. Recent comparative genome analysis shows several cases of fusion/fission across distant phylogeny. However, the selection forces driving gene fusion are not fully understood due to the lack of structural, dynamics and kinetics data. Available structural data at PDB (protein databank) contains limited cases of structural pairs describing fused and un-fused structures. Nonetheless, we identified a pair of IGPS (imidazole glycerol phosphate synthetase) structures (comprising HisF - glutaminase unit and HisH - cyclase unit) from S. cerevisiae (SC) and T. thermophilus (TT). The HisF-HisH structural units are domains in SC and subunits in TT. Hence, they are fused in SC and un-fused in TT. Subsequently, a domain-domain interface is formed in SC and a subunit-subunit interface in TT between HisF and HisH. Our interest is to document the structural and dynamic differences between fused and un-fused IGPS. Therefore, we probed the structures of fused IGPS in SC and un-fused IGPS in TT using molecular dynamics simulation for 5 ns. Simulation shows that fused IGPS in SC has a larger interface area between HisF and HisH and a greater radius of gyration than un-fused IGPS in TT. These structural features for the first time demonstrate the evolutionary advantage in generating proteins with novel structural architecture through gene fusion.

18.
In animals, the innate immune system is the first line of defense against invading microorganisms, and the pattern-recognition receptors (PRRs) are the key components of this system, detecting microbial invasion and initiating innate immune defenses. Two families of PRRs, the intracellular NOD-like receptors (NLRs) and the transmembrane Toll-like receptors (TLRs), are of particular interest because of their roles in a number of diseases. Understanding the evolutionary history of these families and their pattern of evolutionary changes may lead to new insights into the functioning of this critical system. We found that the evolution of both NLR and TLR families included massive species-specific expansions and domain shuffling in various lineages, which resulted in the same domain architectures evolving independently within different lineages in a process that fits the definition of parallel evolution. This observation illustrates both the dynamics of the innate immune system and the effects of “combinatorially constrained” evolution, where existence of the limited numbers of functionally relevant domains constrains the choices of domain architectures for new members in the family, resulting in the emergence of independently evolved proteins with identical domain architectures, often mistaken for orthologs.

19.
With the preponderance of multidomain proteins in eukaryotic genomes, it is essential to recognize the constituent domains and their functions. Often function involves communications across the domain interfaces, and the knowledge of the interacting sites is essential to our understanding of the structure–function relationship. Using evolutionary information extracted from homologous domains in at least two diverse domain architectures (single and multidomain), we predict the interface residues corresponding to domains from the two‐domain proteins. We also use information from the three‐dimensional structures of individual domains of two‐domain proteins to train a naïve Bayes classifier model to predict the interfacial residues. Our predictions are highly accurate (~85%) and specific (~95%) to the domain–domain interfaces. This method is specific to multidomain proteins which contain domains in more than one protein architectural context. Using predicted residues to constrain domain–domain interaction, rigid‐body docking was able to provide us with accurate full‐length protein structures with correct orientation of domains. We believe that these results can be of considerable interest toward rational protein and interaction design, apart from providing us with valuable information on the nature of interactions. Proteins 2014; 82:1219–1234. © 2013 Wiley Periodicals, Inc.

20.
Many signaling molecules are multidomain proteins that have other domains in addition to the catalytic kinase domain. Protein tyrosine kinases almost without exception contain Src homology 2 (SH2) and/or SH3 domains that can interact with other signaling proteins. Here, we studied the evolution of the tyrosine kinases containing SH2 and/or SH3 and kinase domains. The three domains seem to have duplicated together, since the phylogenetic analysis using parsimony gave almost identical evolutionary trees for the separate domains and the multidomain complexes. The congruence analysis of the sequences for the separate domains also suggested that the domains have coevolved. There are several reasons for the domains to appear in a cluster. Kinases are regulated in many ways, and the presence of SH2 and SH3 domains at proper positions is crucial. Because all three domains can recognize different parts of ligands and substrates, their evolution has been interconnected. The reasons for the clustering and coevolution of the three domains in protein tyrosine kinases (PTKs) are discussed.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司). ICP registration: 京ICP备09084417号