Similar literature
 20 similar entries found
1.
Studies of microbial eukaryotes have been pivotal in the discovery of biological phenomena, including RNA editing, self-splicing RNA, and telomere addition. Here we extend this list by demonstrating that genome architecture, namely the extensive processing of somatic (macronuclear) genomes in some ciliate lineages, is associated with elevated rates of protein evolution. Using newly developed likelihood-based procedures for studying molecular evolution, we investigate 6 genes to compare 1) ciliate protein evolution to that of 3 other clades of eukaryotes (plants, animals, and fungi) and 2) protein evolution in ciliates with extensively processed macronuclear genomes to that of other ciliate lineages. In 5 of the 6 genes, ciliates are estimated to have a higher ratio of nonsynonymous/synonymous substitution rates, consistent with an increase in the rate of protein diversification in ciliates relative to other eukaryotes. Even more striking, there is a significant effect of genome architecture within ciliates, as the most divergent proteins are consistently found in those lineages with the most highly processed macronuclear genomes. We propose a model whereby genome architecture (specifically chromosomal processing, amitosis within macronuclei, and epigenetics) allows ciliates to explore protein space in a novel manner. Further, we predict that examination of diverse eukaryotes will reveal additional evidence of the impact of genome architecture on molecular evolution.

2.
Domains are basic evolutionary units of proteins and most proteins have more than one domain. Advances in domain modeling and collection are making it possible to annotate a large fraction of known protein sequences by a linear ordering of their domains, yielding their architecture. Protein domain architectures link evolutionarily related proteins and underscore their shared functions. Here, we attempt to better understand this association by identifying the evolutionary pathways by which extant architectures may have evolved. We propose a model of evolution in which architectures arise through rearrangements of inferred precursor architectures and acquisition of new domains. These pathways are ranked using a parsimony principle, whereby scenarios requiring the smallest number of independent recombination events, namely fission and fusion operations, are assumed to be more likely. A data set of domain architectures present in 159 proteomes, representing all three major branches of the tree of life, allows us to estimate the history of over 85% of all architectures in the sequence database. We find that the distribution of rearrangement classes is robust with respect to alternative parsimony rules for inferring the presence of precursor architectures in ancestral species. Analyzing the most parsimonious pathways, we find 87% of architectures to gain complexity over time through simple changes, among which fusion events account for 5.6 times as many architectures as fission. Our results may be used to compute domain architecture similarities, for example, based on the number of historical recombination events separating them. Domain architecture "neighbors" identified in this way may lead to new insights about the evolution of protein function.

3.
Most eukaryotic proteins are multi-domain proteins that are created from fusions of genes, deletions and internal repetitions. An investigation of such evolutionary events requires a method to find the domain architecture from which each protein originates. Therefore, we defined a novel measure, domain distance, which is calculated as the number of domains that differ between two domain architectures. Using this measure, the evolutionary events that distinguish a protein from its closest ancestor have been studied, and it was found that indels are more common than internal repetition and that the exchange of a domain is rare. Indels and repetitions are common at both the N- and C-termini, while they are rare between domains. The evolution of the majority of multi-domain proteins can be explained by the stepwise insertion of single domains, with the exception of repeats, which are sometimes duplicated as several domains in tandem. We show that domain distances agree with sequence similarity and semantic similarity based on gene ontology annotations. In addition, we demonstrate the use of the domain distance measure to build evolutionary trees. Finally, the evolution of multi-domain proteins is exemplified by a closer study of the evolution of two protein families, non-receptor tyrosine kinases and RhoGEFs.
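
The domain distance in the abstract above can be illustrated with a small sketch. The abstract defines the measure only as the number of domains that differ between two architectures; the version below assumes that an architecture is an ordered list of domain names and counts insertions, deletions and exchanges with a standard edit distance. That is one plausible reading, not necessarily the authors' procedure, and the function name and example labels are hypothetical.

```python
def domain_distance(arch_a, arch_b):
    """Edit distance between two ordered lists of domain names.

    Assumption: each inserted, deleted or exchanged domain costs 1.
    """
    # prev[j] holds the distance between the processed prefix of arch_a
    # and the first j domains of arch_b.
    prev = list(range(len(arch_b) + 1))
    for i, dom_a in enumerate(arch_a, start=1):
        curr = [i]
        for j, dom_b in enumerate(arch_b, start=1):
            cost = 0 if dom_a == dom_b else 1
            curr.append(min(prev[j] + 1,          # delete dom_a
                            curr[j - 1] + 1,      # insert dom_b
                            prev[j - 1] + cost))  # match or exchange
        prev = curr
    return prev[-1]


# Hypothetical architectures written as ordered domain labels.
full_length = ["SH3", "SH2", "Kinase"]
truncated = ["SH2", "Kinase"]
distance = domain_distance(full_length, truncated)
```

Under these assumptions, losing a single N-terminal domain gives a distance of 1, which matches the abstract's picture of stepwise single-domain changes.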

4.
Protein domain architectures (PDAs), in which single domains are linked to form multiple-domain proteins, are a major molecular form used by evolution for the diversification of protein functions. However, the design principles of PDAs remain largely uninvestigated. In this study, we constructed networks to connect domain architectures that had grown out from the same single domain for every single domain in the Pfam-A database and found that there are three main distinctive types of these networks, which suggests that evolution can exploit PDAs in three different ways. Further analysis showed that these three different types of PDA networks are each adopted by different types of protein domains, although many networks exhibit the characteristics of more than one of the three types. Our results shed light on nature's blueprint for protein architecture and provide a framework for understanding architectural design from a network perspective.

5.
6.
In animals, the innate immune system is the first line of defense against invading microorganisms, and the pattern-recognition receptors (PRRs) are the key components of this system, detecting microbial invasion and initiating innate immune defenses. Two families of PRRs, the intracellular NOD-like receptors (NLRs) and the transmembrane Toll-like receptors (TLRs), are of particular interest because of their roles in a number of diseases. Understanding the evolutionary history of these families and their pattern of evolutionary changes may lead to new insights into the functioning of this critical system. We found that the evolution of both NLR and TLR families included massive species-specific expansions and domain shuffling in various lineages, which resulted in the same domain architectures evolving independently within different lineages in a process that fits the definition of parallel evolution. This observation illustrates both the dynamics of the innate immune system and the effects of “combinatorially constrained” evolution, where the existence of only a limited number of functionally relevant domains constrains the choices of domain architectures for new members in the family, resulting in the emergence of independently evolved proteins with identical domain architectures, often mistaken for orthologs.

7.
8.
Investigating the relative importance of protein stability, function, and folding kinetics in driving protein evolution has long been hindered by the fact that we can only compare modern natural proteins, the products of the very process we seek to understand, to each other, with no external references or baselines. Through a large-scale all-atom simulation of protein evolution, we have created a large diverse alignment of SH3 domain sequences which have been selected only for native state stability, with no other influencing factors. Although the average pairwise identity between computationally evolved and natural sequences is only 17%, the residue frequency distributions of the computationally evolved sequences are similar to natural SH3 sequences at 86% of the positions in the domain, suggesting that optimization for the native state structure has dominated the evolution of natural SH3 domains. Additionally, the positions which play a consistent role in the transition state of three well-characterized SH3 domains (by phi-value analysis) are structurally optimized for the native state, and vice versa. Indeed, we see a specific and significant correlation between sequence optimization for native state stability and conservation of transition state structure.
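
The "average pairwise identity" between two sets of aligned sequences, as quoted in the abstract above, might be computed along the following lines. The abstract does not say how identity was normalised, so the choice here (identical residues divided by the number of columns that are not gaps in both sequences) is an assumption, and the function names and toy sequences are hypothetical.

```python
def pairwise_identity(seq_a, seq_b):
    """Fraction of aligned columns with identical residues.

    Assumption: columns that are gaps in both sequences are ignored;
    a gap aligned to a residue counts as a mismatch.
    """
    columns = [(a, b) for a, b in zip(seq_a, seq_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return matches / len(columns)


def mean_cross_identity(set_a, set_b):
    """Mean identity over every pair drawn from two sets of aligned sequences."""
    scores = [pairwise_identity(a, b) for a in set_a for b in set_b]
    return sum(scores) / len(scores)


# Toy aligned sequences, not real SH3 domains.
designed = ["ACDE"]
natural = ["ACDF", "ACDE"]
average = mean_cross_identity(designed, natural)
```

Other normalisations, such as dividing by the shorter sequence length, would give different percentages, so a figure like the 17% above is only comparable across studies that use the same definition.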

9.
Domains are the evolutionary units that comprise proteins, and most proteins are built from more than one domain. Domains can be shuffled by recombination to create proteins with new arrangements of domains. Using structural domain assignments, we examined the combinations of domains in the proteins of 131 completely sequenced organisms. We found two-domain and three-domain combinations that recur in different protein contexts with different partner domains. The domains within these combinations have a particular functional and spatial relationship. These units are larger than individual domains and we term them "supra-domains". Amongst the supra-domains, we identified some 1400 (1203 two-domain and 166 three-domain) combinations that are statistically significantly over-represented relative to the occurrence and versatility of the individual component domains. Over one-third of all structurally assigned multi-domain proteins contain these over-represented supra-domains. This means that investigation of the structural and functional relationships of the domains forming these popular combinations would be particularly useful for an understanding of multi-domain protein function and evolution as well as for genome annotation. These and other supra-domains were analysed for their versatility, duplication, their distribution across the three kingdoms of life and their functional classes. By examining the three-dimensional structures of several examples of supra-domains in different biological processes, we identify two basic types of spatial relationships between the component domains: the combined function of the two domains is such that either the geometry of the two domains is crucial and there is a tight constraint on the interface, or the precise orientation of the domains is less important and they are spatially separate. Frequently, the role of the supra-domain becomes clear only once the three-dimensional structure is known. 
Since this is the case for only a quarter of the supra-domains, we provide a list of the most important unknown supra-domains as potential targets for structural genomics projects.
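
The first step described in the abstract above, finding domain combinations that recur with different partner domains, can be sketched as follows. This is a simplified illustration under stated assumptions: architectures are ordered lists of domain names, only adjacent two-domain combinations are considered, and no statistical test for over-representation is performed. The function name, the `min_contexts` parameter and the example data are hypothetical.

```python
from collections import defaultdict


def recurring_pairs(architectures, min_contexts=2):
    """Map each adjacent domain pair to its number of distinct contexts.

    A context is the (left neighbour, right neighbour) of the pair, with
    None marking a protein terminus. Only pairs seen in at least
    `min_contexts` distinct contexts are returned.
    """
    contexts = defaultdict(set)
    for arch in architectures:
        for i in range(len(arch) - 1):
            pair = (arch[i], arch[i + 1])
            left = arch[i - 1] if i > 0 else None
            right = arch[i + 2] if i + 2 < len(arch) else None
            contexts[pair].add((left, right))
    return {pair: len(ctx) for pair, ctx in contexts.items()
            if len(ctx) >= min_contexts}


# Hypothetical architectures using placeholder domain names.
example = [["A", "B", "C"], ["D", "A", "B"], ["A", "B"]]
candidates = recurring_pairs(example)
```

In this toy input only the pair A-B appears in more than one context, so it is the only candidate; a real analysis would additionally compare its frequency with what the abundance and versatility of A and B alone would predict.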

10.
The mitochondrion is an essential cellular compartment in eukaryotes. The mitochondrial proteins Tom20 and Tom22 are receptors that ensure recognition and binding of proteins imported for mitochondrial biogenesis. Comparison of the sequences of the Tom20 and Tom22 subunits in the yeasts Saccharomyces cerevisiae and Saccharomyces castellii shows a rare case of domain stealing: in Saccharomyces castellii, Tom22 has lost an acidic domain and Tom20 has gained one. This example of domain stealing is a snapshot of evolution in action and provides excellent evidence that Tom20 and Tom22 are subunits of a single, composite receptor that binds precursor proteins for import into mitochondria.

11.
Most studies of behaviour examine traits whose proximate causes include sensory input and neural decision-making, but conflict and collaboration in biological systems began long before brains or sensory systems evolved. Many behaviours result from non-neural mechanisms such as direct physical contact between recognition proteins or modifications of development that coincide with altered behaviour. These simple molecular mechanisms form the basis of important biological functions and can enact organismal interactions that are as subtle, strategic and interesting as any. The genetic changes that underlie divergent molecular behaviours are often targets of selection, indicating that their functional variation has important fitness consequences. These behaviours evolve by discrete units of quantifiable phenotypic effect (amino acid and regulatory mutations, often by successive mutations of the same gene), so the role of selection in shaping evolutionary change can be evaluated on the scale at which heritable phenotypic variation originates. We describe experimental strategies for finding genes that underlie biochemical and developmental alterations of behaviour, survey the existing literature highlighting cases where the simplicity of molecular behaviours has allowed insight into the evolutionary process, and discuss the utility of a genetic knowledge of the sources and spectrum of phenotypic variation for a deeper understanding of how genetic and phenotypic architectures evolve.

12.
We have developed a statistical method named MAP (mutagenesis assistant program) to equip protein engineers with a tool to develop promising directed evolution strategies by comparing 19 mutagenesis methods. Instead of conventional transition/transversion bias indicators as benchmarks for comparison, we propose to use three indicators based on the subset of amino acid substitutions generated on the protein level: (1) protein structure indicator; (2) amino acid diversity indicator with a codon diversity coefficient; and (3) chemical diversity indicator. A MAP analysis for a single nucleotide substitution was performed for four genes: (1) heme domain of cytochrome P450 BM-3 from Bacillus megaterium (EC 1.14.14.1); (2) glucose oxidase from Aspergillus niger (EC 1.1.3.4); (3) arylesterase from Pseudomonas fluorescens (EC 3.1.1.2); and (4) alcohol dehydrogenase from Saccharomyces cerevisiae (EC 1.1.1.1). Based on the MAP analysis of these four genes, 19 mutagenesis methods have been evaluated and criteria for an ideal mutagenesis method have been proposed. The statistical analysis showed that existing gene mutagenesis methods are limited and highly biased. An average amino acid substitution per residue of only 3.15-7.4 can be achieved with current random mutagenesis methods. For the four investigated gene sequences, an average fraction of amino acid substitutions of 0.5-7% results in stop codons and 4.5-23.9% in glycine or proline residues. An average fraction of 16.2-44.2% of the amino acid substitutions are preserved, and 45.6% (epPCR method) are chemically different. The diversity remains low even when applying a non-biased method: an average of seven amino acid substitutions per residue, 2.9-4.7% stop codons, 11.1-16% glycine/proline residues, 21-25.8% preserved amino acids, and 55.5% are amino acids with chemically different side-chains. 
Statistical information for each mutagenesis method can further be used to investigate the mutational spectra in protein regions regarded as important for the property of interest.
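
The kind of bookkeeping behind the figures in the abstract above (fractions of single-nucleotide substitutions that preserve the amino acid, create a stop codon, or introduce glycine or proline) can be sketched for one codon. This is not the MAP program: it assumes the standard genetic code and weights all nine substitutions equally, whereas the abstract stresses that real mutagenesis methods are biased. The names below are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitution_spectrum(codon):
    """Classify the nine single-nucleotide substitutions of one codon."""
    original = CODON_TABLE[codon]
    counts = {"preserved": 0, "stop": 0, "gly_or_pro": 0, "other": 0}
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a substitution
            mutant = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if mutant == original:
                counts["preserved"] += 1
            elif mutant == "*":
                counts["stop"] += 1
            elif mutant in "GP":
                counts["gly_or_pro"] += 1
            else:
                counts["other"] += 1
    return counts


# Tryptophan codon: no synonymous substitutions, two routes to a stop codon.
trp = substitution_spectrum("TGG")
```

Summing such counts over every codon of a gene, with weights that reflect a particular method's mutational bias, would give gene-level fractions comparable in spirit to those reported in the abstract.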

13.
The proteomes that make up the collection of proteins in contemporary organisms evolved through recombination and duplication of a limited set of domains. These protein domains are essentially the main components of globular proteins and are the most basic level at which protein function and protein interactions can be understood. An important aspect of domain evolution is their atomic structure and biochemical function, which are both specified by the information in the amino acid sequence. Changes in this information may bring about new folds, functions and protein architectures. With the present and still increasing wealth of sequences and annotation data brought about by genomics, new evolutionary relationships are constantly being revealed, unknown structures modeled and phylogenies inferred. Such investigations not only help predict the function of newly discovered proteins, but also assist in mapping unforeseen pathways of evolution and reveal crucial, co-evolving inter- and intra-molecular interactions. In turn this will help us describe how protein domains shaped cellular interaction networks and the dynamics with which they are regulated in the cell. Additionally, these studies can be used for the design of new and optimized protein domains for therapy. In this review, we aim to describe the basic concepts of protein domain evolution and illustrate recent developments in molecular evolution that have provided valuable new insights in the field of comparative genomics and protein interaction networks.

14.
Two recent studies demonstrated a positive correlation between divergence in gene expression and protein sequence in Drosophila. This correlation could be driven by positive selection or variation in functional constraint. To distinguish between these alternatives, we compared patterns of molecular evolution for 1,862 genes with two previously reported estimates of expression divergence in Drosophila. We found a slight negative trend (nonsignificant) between positive selection on protein sequence and divergence in expression levels between Drosophila melanogaster and Drosophila simulans. Conversely, shifts in expression patterns during Drosophila development showed a positive association with adaptive protein evolution, though as before the relationship was weak and not significant. Overall, we found no strong evidence for an increase in the incidence of positive selection on protein-coding regions in genes with divergent expression in Drosophila, suggesting that the previously reported positive association between protein and regulatory divergence primarily reflects variation in functional constraint.

15.
Adaptation is often regarded as the sequential fixation of individually, intrinsically beneficial mutations. Contrary to this expectation, we find a surprisingly large number of evolutionary trajectories on which natural selection first favors a mutation, then favors its removal, and later still favors its ultimate restoration during the course of antibiotic resistance evolution. The existence of reversion trajectories implies that natural selection may not follow the most parsimonious path separating two alleles, even during adaptation. Altogether, this discovery highlights the unusual and potentially circuitous routes natural selection can follow during adaptation.

16.
Molecular evolution of the mammalian prion protein   Total citations: 10, self-citations: 0, citations by others: 10
Prion protein (PrP) sequences have until now been available for only six of the 18 orders of placental mammals. A broader comparison of mammalian prions might help to understand the enigmatic functional and pathogenic properties of this protein. We therefore determined PrP coding sequences in 26 mammalian species to include all placental orders and major subordinal groups. Glycosylation sites, cysteines forming a disulfide bridge, and a hydrophobic transmembrane region are perfectly conserved. Also, the sequences responsible for secondary structure elements, for N- and C-terminal processing of the precursor protein, and for attachment of the glycosyl-phosphatidylinositol membrane anchor are well conserved. The N-terminal region of PrP generally contains five or six repeats of the sequence P(Q/H)GGG(G/-)WGQ, but alleles with two, four, and seven repeats were observed in some species. This suggests, together with the pattern of amino acid replacements in these repeats, the regular occurrence of repeat expansion and contraction. Histidines implicated in copper ion binding and a proline involved in 4-hydroxylation are lacking in some species, which calls into question their importance for normal functioning of cellular PrP. The finding in certain species of two or seven repeats, and of amino acid substitutions that have been related to human prion diseases, challenges the relevance of such mutations for prion pathology. The gene tree deduced from the PrP sequences largely agrees with the species tree, indicating that no major deviations occurred in the evolution of the prion gene in different placental lineages. In one species, the anteater, a prion pseudogene was present in addition to the active gene.
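
The repeat consensus quoted in the abstract above, P(Q/H)GGG(G/-)WGQ, translates directly into a regular expression, which gives a quick way to count repeats in a protein sequence. The sketch below assumes that (Q/H) means either residue and that (G/-) means an optional glycine; the function name is hypothetical, and the test string is synthetic rather than a real PrP sequence.

```python
import re

# P, then Q or H, then GGG, then an optional fourth G, then WGQ.
REPEAT = re.compile(r"P[QH]GGGG?WGQ")


def count_repeats(sequence):
    """Count non-overlapping matches of the repeat consensus."""
    return len(REPEAT.findall(sequence))


# Synthetic example: one short repeat followed by two long ones.
synthetic = "PHGGGWGQ" + "PQGGGGWGQ" * 2
repeats = count_repeats(synthetic)
```

Repeats that carry amino acid replacements outside the two variable positions would be missed by this strict pattern, so a count from it is a lower bound on the repeat number described in the abstract.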

17.
We use flexible backbone protein design to explore the sequence and structure neighborhoods of naturally occurring proteins. The method samples sequence and structure space in the vicinity of a known sequence and structure by alternately optimizing the sequence for a fixed protein backbone using rotamer-based sequence search, and optimizing the backbone for a fixed amino acid sequence using atomic-resolution structure prediction. We find that such a flexible backbone design method better recapitulates protein family sequence variation than sequence optimization on fixed backbones or randomly perturbed backbone ensembles for ten diverse protein structures. For the SH3 domain, the backbone structure variation in the family is also better recapitulated than in randomly perturbed backbones. The potential application of this method as a model of protein family evolution is highlighted by a concerted transition to the amino acid sequence in the structural core of one SH3 domain starting from the backbone coordinates of a homologous structure.

18.
19.
Domains are considered the basic units of protein folding, evolution, and function. Decomposing each protein into modular domains is thus a basic prerequisite for accurate functional classification of biological molecules. Here, we present ADDA, an automatic algorithm for domain decomposition and clustering of all protein domain families. We use alignments derived from an all-on-all sequence comparison to define domains within protein sequences based on a global maximum likelihood model. In all, 90% of domain boundaries are predicted within 10% of domain size when compared with the manual domain definitions given in the SCOP database. A representative database of 249,264 protein sequences was decomposed into 450,462 domains. These domains were clustered on the basis of sequence similarities into 33,879 domain families containing at least two members with less than 40% sequence identity. Validation against family definitions in the manually curated databases SCOP and PFAM indicates almost perfect unification of various large domain families while contamination by unrelated sequences remains at a low level. The global survey of protein-domain space by ADDA confirms that most large and universal domain families are already described in PFAM and/or SMART. However, a survey of the complete set of mobile modules leads to the identification of 1479 new interesting domain families which shuffle around in multi-domain proteins. The data are publicly available at ftp://ftp.ebi.ac.uk/pub/contrib/heger/adda.

20.
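Evolution of phenylalanyl-tRNA synthetase and domain loss   Total citations: 1, self-citations: 0, citations by others: 1
Gene duplication, gene fusion, and horizontal gene transfer are common events in the evolution of many proteins, including the aminoacyl-tRNA synthetases (AARSs). However, the authors' results show that the evolution of phenylalanyl-tRNA synthetase (PheRS) is mainly characterized by the loss of some domains, and that this domain loss does not affect the function or activity of PheRS. In the evolution of organisms from bacteria to eukaryotes, both genome size and gene number generally increase; interestingly, however, eukaryotic PheRS has markedly fewer domain types and fewer domains than bacterial PheRS. The evolution of PheRS through domain loss appears to parallel, by a different route, the evolution of certain AARSs from multiple specificity to single specificity.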
