Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Harper JT, Keeling PJ. Gene, 2004, 340(2): 227-235
Insertions and deletions in protein-coding genes are relatively rare events compared with sequence substitutions because they are more likely to alter the tertiary structure of the protein. For this reason, insertions and deletions which are clearly homologous are considered to be stable characteristics of the proteins where they are found, and their presence and absence has been used extensively to infer large-scale evolutionary relationships and events. Recently, however, it has been shown that the pattern of highly conserved, clearly homologous insertions at positions with no other detectable homoplasy can be incongruent with the phylogeny of the genes or organisms in which they are found. One case where this has been reported is in the enolase genes of apicomplexan parasites and ciliates, which share homologous insertions in a highly conserved region of the gene with the apparently distantly related enolases of plants. Here we explore the distribution of this character in enolase genes from the third major alveolate group, the dinoflagellates, as well as two groups considered to be closely related to alveolates, haptophytes and heterokonts. With these data, all major groups of the chromalveolates are represented, and the distribution of these insertions is shown to be far more complicated than previously believed. The incongruence between this pattern, the known evolutionary relationships between the organisms, and enolase phylogeny itself cannot be explained by any single event or type of event. Instead, the distribution of enolase insertions is more likely the product of several forces that may have included lateral gene transfer, paralogy, and/or recombination. Of these, lateral gene transfer is the easiest to detect and some well-supported cases of eukaryote-to-eukaryote lateral transfer are evident from the phylogeny.

2.
A comparative genomic analysis of 35 cyanobacterial strains has revealed that the gene complement of aminoacyl-tRNA synthetases (AARSs) and routes for aminoacyl-tRNA synthesis may differ among the species of this phylum. Several genes encoding AARS paralogues were identified in some genomes. In-depth phylogenetic analysis was done for each of these proteins to gain insight into their evolutionary history. GluRS, HisRS, ArgRS, ThrRS, CysRS, and Glu-Q-RS showed evidence of a complex evolutionary course as indicated by a number of inconsistencies with our reference tree for cyanobacterial phylogeny. In addition to sequence data, support for evolutionary hypotheses involving horizontal gene transfer or gene duplication events was obtained from other observations including biased sequence conservation, the presence of indels (insertions or deletions), or vestigial traces of ancestral redundant genes. We present evidence for a novel protein domain with two putative transmembrane helices recruited independently by distinct AARSs in particular cyanobacteria.

3.
The wheat high molecular weight (HMW) glutenins are important seed storage proteins that determine bread-making quality in hexaploid wheat (Triticum aestivum). In this study, detailed comparative sequence analyses of large orthologous HMW glutenin genomic regions from eight grass species, representing a wide evolutionary history of grass genomes, reveal a number of lineage-specific sequence changes. These lineage-specific changes, which resulted in duplications, insertions, and deletions of genes, are the major forces disrupting gene colinearity among grass genomes. Our results indicate that the presence of the HMW glutenin gene in Triticeae genomes was caused by lineage-specific duplication of a globulin gene. This tandem duplication event is shared by Brachypodium and Triticeae genomes, but is absent in rice, maize, and sorghum, suggesting the duplication occurred after Brachypodium and Triticeae genomes diverged from the other grasses ~35 Ma ago. Aside from their physical location in tandem, the sequence similarity, expression pattern, and conserved cis-acting elements responsible for endosperm-specific expression further support the paralogous relationship between the HMW glutenin and globulin genes. While the duplicated copy in Brachypodium has apparently become nonfunctional, the duplicated copy in wheat has evolved to become the HMW glutenin gene by gaining a central prolamin repetitive domain.

4.
Many approaches to compute the genomic distance are still limited to genomes with the same content, without duplicated markers. However, differences in the gene content are frequently observed and can reflect important evolutionary aspects. While duplicated markers can hardly be handled by exact models, when duplicated markers are not allowed, a few polynomial time algorithms that include genome rearrangements, insertions and deletions were already proposed. In an attempt to improve these results, in the present work we give the first linear time algorithm to compute the distance between two multichromosomal genomes with unequal content, but without duplicated markers, considering insertions, deletions and double cut and join (DCJ) operations. We derive from this approach algorithms to sort one genome into another one also using DCJ operations, insertions and deletions. The optimal sorting scenarios can have different compositions and we compare two types of sorting scenarios: one that maximizes and one that minimizes the number of DCJ operations with respect to the number of insertions and deletions. We also show that, although the triangle inequality can be disrupted in the proposed genomic distance, it is possible to correct this problem by adopting a surcharge on the number of non-common markers. We use our method to analyze six species of Rickettsia, a group of obligate intracellular parasites, and identify preliminary evidence of clusters of deletions.
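For orientation, the simpler classical setting can be sketched in code. The snippet below is a minimal illustration and not the linear-time algorithm described in the abstract: it assumes two genomes with identical gene content, each a single circular chromosome written as a list of signed integers, and uses the standard relation d = N - C, where N is the number of genes and C is the number of cycles in the adjacency graph. Insertions, deletions, and linear chromosomes are deliberately left out, and all function names are hypothetical.

```python
def extremities(gene):
    """Return (left end, right end) of a signed gene as it is read."""
    g = abs(gene)
    # A forward gene is read tail -> head, a reversed gene head -> tail.
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies(chromosome):
    """Map every gene extremity to its neighbour on a circular chromosome."""
    adj = {}
    n = len(chromosome)
    for i in range(n):
        _, right = extremities(chromosome[i])
        left, _ = extremities(chromosome[(i + 1) % n])
        adj[right] = left
        adj[left] = right
    return adj


def dcj_distance_circular(a, b):
    """DCJ distance d = N - C for two circular genomes with equal content.

    Assumption: no duplicated or missing genes (the equal-content case).
    """
    adj_a, adj_b = adjacencies(a), adjacencies(b)
    if set(adj_a) != set(adj_b):
        raise ValueError("genomes must contain exactly the same genes")
    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Walk the cycle, alternating adjacencies of genome a and genome b.
        while x not in seen:
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]
    return len(a) - cycles
```

For example, `dcj_distance_circular([1, -2, 3], [1, 2, 3])` counts the single inversion of gene 2 as one DCJ operation. Handling unequal content, as the abstract does, requires extra bookkeeping for the non-common markers that this sketch does not attempt.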

5.

Background

Bacterial genomes develop new mechanisms to tide them over the imposing conditions they encounter during the course of their evolution. Acquisition of new genes by lateral gene transfer may be one of the dominant ways of adaptation in bacterial genome evolution. Lateral gene transfer provides the bacterial genome with a new set of genes that help it to explore and adapt to new ecological niches.

Methods

A maximum likelihood analysis was done on the five sequenced corynebacterial genomes to model the rates of gene insertions/deletions at various depths of the phylogeny.

Results

The study shows that most of the laterally acquired genes are transient and the inferred rates of gene movement are higher on the external branches of the phylogeny and decrease as the phylogenetic depth increases. The newly acquired genes are under relaxed selection and evolve faster than their older counterparts. Analysis of some of the functionally characterised LGTs in each species has indicated that they may have a possible adaptive role.

Conclusion

The five corynebacterial genomes sequenced to date have evolved by acquiring between 8% and 14% of their genomes by LGT, and some of these genes may have a role in adaptation.

6.
Bacterial genomes can evolve by gene gain, gene loss, mutation of existing genes, and/or duplication of existing genes. Recent studies have clearly demonstrated that the acquisition of new genes by lateral gene transfer (LGT) is a predominant force in bacterial evolution. To better understand the significance of LGT, we employed a comparative genomics approach to model species-specific and intraspecies gene insertions/deletions (ins/del) among 12 sequenced streptococcal genomes using a maximum likelihood method. This study indicates that the rate of gene ins/del is higher on the external branches and varies dramatically for each species. We have analyzed here some of the experimentally characterized species-specific genes that have been acquired by LGT and conclude that at least a portion of these genes have a role in adaptation.

7.
The plastid genome from subclover, Trifolium subterraneum, is unusual in a variety of respects, compared with other land-plant chloroplast DNAs. Gene mapping of subclover chloroplast DNA reveals major structural reorganization of the genome. Ten clusters of genes are rearranged in both order and orientation. Eight large inversions are sufficient to explain this reorganization; however, the actual evolutionary changes may have been more complex. For example, a fine-scale analysis of a set of ribosomal protein genes reveals the occurrence of insertions, deletions, and transpositions. Associated with this unusually unstable genome are two structural features potentially involved in the rearrangements. A dispersed family of repeats, with each element about 1 kb in length, is present in at least six copies. A survey of a wide taxonomic range of species indicates that these elements are unique to the chloroplast DNAs of subclover and two closely related species. Several of the repeated elements are associated with genomic rearrangements, and one repeat is inserted within a normally highly conserved series of genes. This set of dispersed repeats may be the first family of transposable elements found in any organelle genome. In addition, the subclover genome is much larger than those in other closely related legumes, even when one takes into account the presence of the repeated elements. Some of the extra DNA has no sequence similarity to other chloroplast genomes and may represent insertion of DNA from another genome. These unusual features are not found in the structurally stable chloroplast genomes of other vascular plants and may, therefore, be implicated in the rapid and major reorganization of the chloroplast DNA in subclover.

8.
Deletional bias and the evolution of bacterial genomes   Total citations: 28; self-citations: 0; citations by others: 28
Although bacteria increase their DNA content through horizontal transfer and gene duplication, their genomes remain small and, in particular, lack nonfunctional sequences. This pattern is most readily explained by a pervasive bias towards higher numbers of deletions than insertions. When selection is not strong enough to maintain them, genes are lost in large deletions or inactivated and subsequently eroded. Gene inactivation and loss are particularly apparent in obligate parasites and symbionts, in which dramatic reductions in genome size can result not from selection to lose DNA, but from decreased selection to maintain gene functionality. Here we discuss the evidence showing that deletional bias is a major force that shapes bacterial genomes.

9.
The concept of the phase shift of triplet periodicity (TP) was used to search for potential DNA insertions in genes from 17 bacterial genomes. A mathematical algorithm for detection of these insertions has been developed. This approach can detect potential insertions and deletions with lengths that are not multiples of three bases, especially insertions of relatively large DNA fragments (>100 bases). A new similarity measure between triplet matrices was employed to improve the sensitivity for detecting the TP phase shift. Sequences of 17,220 bacterial genes, each consisting of more than 1,200 bases, were analyzed, and the presence of a TP phase shift has been shown in ~16% of analyzed genes (2,809 genes), which is about 4 times more than that detected in our previous work. We propose that shifts of the TP phase may indicate the shifts of reading frame in genes after insertions of the DNA fragments with lengths that are not multiples of three bases. A relationship between the phase shifts of TP and the frame shifts in genes is discussed.
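The idea of a TP phase shift can be illustrated with a small sketch. It rests on assumptions that are not in the abstract: a triplet matrix is taken to be a 3x4 table of base frequencies per codon position, two adjacent windows of a gene are compared, and the similarity measure is a plain L1 distance chosen only for illustration (the abstract's new measure is not reproduced here). The function names are hypothetical.

```python
BASES = "ACGT"


def tp_matrix(seq, offset=0):
    """3x4 matrix of base frequencies; row = (position + offset) % 3."""
    counts = [[0] * 4 for _ in range(3)]
    for pos, base in enumerate(seq.upper()):
        if base in BASES:
            counts[(pos + offset) % 3][BASES.index(base)] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def l1_distance(m1, m2):
    """Sum of absolute differences between two triplet matrices."""
    return sum(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


def phase_shift(left, right):
    """Estimate the TP phase shift (0, 1 or 2) between adjacent windows.

    `right` is assumed to start immediately after `left` in the gene.
    A result of 0 means the periodicity continues in the same frame.
    """
    reference = tp_matrix(left)
    # Find the frame assignment of `right` that best matches `left`.
    best = min(range(3), key=lambda s: l1_distance(reference, tp_matrix(right, s)))
    # Without any indel, the frame of right[0] would be len(left) % 3.
    return (best - len(left)) % 3
```

With an idealized periodic sequence, `phase_shift("ATG" * 50, "TG" + "ATG" * 50)` reports a shift of 1, while two in-frame windows give 0. Real genes have much weaker periodicity, so a practical detector needs a statistical threshold on the distance, which this sketch omits.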

10.
The concept of the phase shift of triplet periodicity (TP) was used to search for potential DNA insertions in genes from 17 bacterial genomes. A mathematical algorithm for detection of these insertions has been developed. This approach can detect potential insertions and deletions with lengths that are not multiples of three bases, especially insertions of relatively large DNA fragments (>100 bases). A new similarity measure between triplet matrices was employed to improve the sensitivity for detecting the TP phase shift. Sequences of 17,220 bacterial genes, each consisting of more than 1,200 bases, were analyzed, and the presence of a TP phase shift has been shown in ~16% of analyzed genes (2,809 genes), which is about 4 times more than that detected in our previous work. We propose that shifts of the TP phase may indicate the shifts of reading frame in genes after insertions of the DNA fragments with lengths that are not multiples of three bases. A relationship between the phase shifts of TP and the frame shifts in genes is discussed.

11.
A large fragment of the dissimilatory sulfite reductase genes (dsrAB) was PCR amplified and fully sequenced from 30 reference strains representing all recognized lineages of sulfate-reducing bacteria. In addition, the sequence of the dsrAB gene homologs of the sulfite reducer Desulfitobacterium dehalogenans was determined. In contrast to previous reports, comparative analysis of all available DsrAB sequences produced a tree topology partially inconsistent with the corresponding 16S rRNA phylogeny. For example, the DsrAB sequences of several Desulfotomaculum species (low G+C gram-positive division) and two members of the genus Thermodesulfobacterium (a separate bacterial division) were monophyletic with delta-proteobacterial DsrAB sequences. The most parsimonious interpretation of these data is that dsrAB genes from ancestors of as-yet-unrecognized sulfate reducers within the delta-Proteobacteria were laterally transferred across divisions. A number of insertions and deletions in the DsrAB alignment independently support these inferred lateral acquisitions of dsrAB genes. Evidence for a dsrAB lateral gene transfer event also was found within the delta-Proteobacteria, affecting Desulfobacula toluolica. The root of the dsr tree was inferred to be within the Thermodesulfovibrio lineage by paralogous rooting of the alpha and beta subunits. This rooting suggests that the dsrAB genes in Archaeoglobus species also are the result of an ancient lateral transfer from a bacterial donor. Although these findings complicate the use of dsrAB genes to infer phylogenetic relationships among sulfate reducers in molecular diversity studies, they establish a framework to resolve the origins and diversification of this ancient respiratory lifestyle among organisms mediating a key step in the biogeochemical cycling of sulfur.

12.
Makarova KS, Ponomarev VA, Koonin EV. Genome Biology, 2001, 2(9): research0033.1-research0033.14

Background

Ribosomal proteins are encoded in all genomes of cellular life forms and are, generally, well conserved during evolution. In prokaryotes, the genes for most ribosomal proteins are clustered in several highly conserved operons, which ensures efficient co-regulation of their expression. Duplications of ribosomal-protein genes are infrequent, and given their coordinated expression and functioning, it is generally assumed that ribosomal-protein genes are unlikely to undergo horizontal transfer. However, with the accumulation of numerous complete genome sequences of prokaryotes, several paralogous pairs of ribosomal protein genes have been identified. Here we analyze all such cases and attempt to reconstruct the evolutionary history of these ribosomal proteins.

Results

Complete bacterial genomes were searched for duplications of ribosomal proteins. Ribosomal proteins L36, L33, L31, S14 are each duplicated in several bacterial genomes and ribosomal proteins L11, L28, L7/L12, S1, S15, S18 are so far duplicated in only one genome each. Sequence analysis of the four ribosomal proteins, for which paralogs were detected in several genomes, two of the ribosomal proteins duplicated in one genome (L28 and S18), and the ribosomal protein L32 showed that each of them comes in two distinct versions. One form contains a predicted metal-binding Zn-ribbon that consists of four conserved cysteines (in some cases replaced by histidines), whereas, in the second form, these metal-chelating residues are completely or partially replaced. Typically, genomes containing paralogous genes for these ribosomal proteins encode both versions, designated C+ and C-, respectively. Analysis of phylogenetic trees for these seven ribosomal proteins, combined with comparison of genomic contexts for the respective genes, indicates that in most, if not all cases, their evolution involved a duplication of the ancestral C+ form early in bacterial evolution, with subsequent alternative loss of the C+ and C- forms in different lineages. Additionally, evidence was obtained for a role of horizontal gene transfer in the evolution of these ribosomal proteins, with multiple cases of gene displacement 'in situ', that is, without a change of the gene order in the recipient genome.

Conclusions

A more complex picture of evolution of bacterial ribosomal proteins than previously suspected is emerging from these results, with major contributions of lineage-specific gene loss and horizontal gene transfer. The recurrent theme of emergence and disruption of Zn-ribbons in bacterial ribosomal proteins awaits a functional interpretation.

13.
Determining the phylogeny of closely related prokaryotes may fail in an analysis of rRNA or a small set of sequences. Whole-genome phylogeny utilizes the maximally available sample space. For a precise determination of genome similarity, two aspects have to be considered when developing an algorithm of whole-genome phylogeny: (1) gene order conservation is a more precise signal than gene content; and (2) when using sequence similarity, failures in identifying orthologues or the in situ replacement of genes via horizontal gene transfer may give misleading results. GO4genome is a new paradigm, which is based on a detailed analysis of gene function and the location of the respective genes. For characterization of genes, the algorithm uses gene ontology enabling a comparison of function independent of evolutionary relationship. After the identification of locally optimal series of gene functions, their length distribution is utilized to compute a phylogenetic distance. The outcome is a classification of genomes based on metabolic capabilities and their organization. Thus, the impact of effects on genome organization that are not covered by methods of molecular phylogeny can be studied. Genomes of strains belonging to Escherichia coli, Shigella, Streptococcus, Methanosarcina, and Yersinia were analyzed. Differences from the findings of classical methods are discussed. Electronic supplementary material  The online version of this article (doi:) contains supplementary material, which is available to authorized users.

14.
ABSTRACT: BACKGROUND: The Escherichia coli species contains a variety of commensal and pathogenic strains, and its intraspecific diversity is extraordinarily high. With the availability of an increasing number of E. coli strain genomes, a more comprehensive concept of their evolutionary history and ecological adaptation can be developed using phylogenomic analyses. In this study, we constructed two types of whole-genome phylogenies based on 34 E. coli strains using collinear genomic segments. The first phylogeny was based on the concatenated collinear regions shared by all of the studied genomes, and the second phylogeny was based on the variable collinear regions that are absent from at least one genome. Intuitively, the first phylogeny is likely to reveal the lineal evolutionary history among these strains (i.e., an evolutionary phylogeny), whereas the latter phylogeny is likely to reflect the whole-genome similarities of extant strains (i.e., a similarity phylogeny). RESULTS: Within the evolutionary phylogeny, the strains were clustered in accordance with known phylogenetic groups and phenotypes. When comparing evolutionary and similarity phylogenies, a concept emerges that Shigella may have originated from at least three distinct ancestors and evolved into a single clade. By scrutinizing the properties that are shared amongst Shigella strains but missing in other E. coli genomes, we found that the common regions of the Shigella genomes were mainly influenced by mobile genetic elements, implying that they may have experienced convergent evolution via horizontal gene transfer. Based on an inspection of certain key branches of interest, we identified several collinear regions that may be associated with the pathogenicity of specific strains. Moreover, by examining the annotated genes within these regions, further detailed evidence associated with pathogenicity was revealed. 
CONCLUSIONS: Collinear regions are reliable genomic features used for phylogenomic analysis among closely related genomes while linking the genomic diversity with phenotypic differences in a meaningful way. The pathogenicity of a strain may be associated with both the arrival of virulence factors and the modification of genomes via mutations. Such phylogenomic studies that compare collinear regions of whole genomes will help to better understand the evolution and adaptation of closely related microbes and E. coli in particular.

15.
Most eukaryotic proteins are multi-domain proteins that are created from fusions of genes, deletions and internal repetitions. An investigation of such evolutionary events requires a method to find the domain architecture from which each protein originates. Therefore, we defined a novel measure, domain distance, which is calculated as the number of domains that differ between two domain architectures. Using this measure the evolutionary events that distinguish a protein from its closest ancestor have been studied and it was found that indels are more common than internal repetition and that the exchange of a domain is rare. Indels and repetitions are common at both the N- and C-termini while they are rare between domains. The evolution of the majority of multi-domain proteins can be explained by the stepwise insertions of single domains, with the exception of repeats, which are sometimes duplicated as several domains in tandem. We show that domain distances agree with sequence similarity and semantic similarity based on gene ontology annotations. In addition, we demonstrate the use of the domain distance measure to build evolutionary trees. Finally, the evolution of multi-domain proteins is exemplified by a closer study of the evolution of two protein families, non-receptor tyrosine kinases and RhoGEFs.
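One possible reading of the domain distance can be sketched as follows. The abstract only says the measure counts the domains that differ between two architectures; the sketch assumes, as an illustration rather than the authors' definition, that an architecture is an ordered list of domain names and that the distance is the minimum number of single-domain insertions and deletions needed to turn one list into the other (computed from the longest common subsequence). The function names and the example domain labels are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two domain lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def domain_distance(arch_a, arch_b):
    """Number of domains to insert or delete to convert arch_a into arch_b.

    Assumption: domain order matters and each indel affects one domain.
    """
    shared = lcs_length(arch_a, arch_b)
    return len(arch_a) + len(arch_b) - 2 * shared
```

Under this reading, `domain_distance(["SH3", "SH2", "Kinase"], ["SH2", "Kinase"])` is 1 (one N-terminal domain lost), while an exchange of one domain for another costs 2, since it is counted as a deletion plus an insertion. A definition that treats an exchange as a single event would give different values.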

16.
To elucidate some of the molecular mechanisms involved in genome differentiation and evolution of cultivated wheats, we compared orthologous genes encoding starch branching enzyme IIa (SBEIIa). Bread wheat is an allohexaploid species comprising the three genomes A, B and D, each of which contributes a copy of the SBEIIa gene, involved in starch biosynthesis and known to control important quality traits related to technological and nutritional value of wheat-based food products. Alignment of the nucleotide sequences of these three genes revealed variation, both at the level of single nucleotides and indels. Multiple transposon elements were identified in the intragenic regions, some of which appear to have inserted before the divergence of the wheat diploid genomes. The B genome homoeologue was the most divergent of the three genes. Two MITE transposon insertions were detected within the intronic sequence of SBEIIa-B and two other transposons within SBEIIa-D. The presence/absence of these transposons in a panel of diploid and polyploid Triticum and Aegilops species provided some insights into the phylogeny of wheat.

17.
To study reductive evolutionary processes in bacterial genomes, we examine sequences in the Rickettsia genomes which are unconstrained by selection and evolve as pseudogenes, one of which is the metK gene, which codes for AdoMet synthetase. Here, we sequenced the metK gene and three surrounding genes in eight different species of the genus Rickettsia. The metK gene was found to contain a high incidence of deletions in six lineages, while the three genes in its surroundings were functionally conserved in all eight lineages. A more drastic example of gene degradation was identified in the metK downstream region, which contained an open reading frame in Rickettsia felis. Remnants of this open reading frame could be reconstructed in five additional species by eliminating sites of frameshift mutations and termination codons. A detailed examination of the two reconstructed genes revealed that deletions strongly predominate over insertions and that there is a strong transition bias for point mutations which is coupled to an excess of GC-to-AT substitutions. Since the molecular evolution of these inactive genes should reflect the rates and patterns of neutral mutations, our results strongly suggest that there is a high spontaneous rate of deletions as well as a strong mutation bias toward AT pairs in the Rickettsia genomes. This may explain the low genomic G + C content (29%), the small genome size (1.1 Mb), and the high noncoding content (24%), as well as the presence of several pseudogenes in the Rickettsia prowazekii genome.

18.
Summary. Tracing organismal histories on the timescale of the tree of life remains one of the challenging tasks in evolutionary biology. The hotly debated questions include the evolutionary relationship between the three domains of life (e.g., which of the three domains are sister domains, are the domains para-, poly-, or monophyletic) and the location of the root within the universal tree of life. For the latter, many different points of view have been considered but so far no consensus has been reached. The only widely accepted rationale to root the universal tree of life is to use anciently duplicated paralogous genes that are present in all three domains of life. To date only few anciently duplicated gene families useful for phylogenetic reconstruction have been identified. Here we present results from a systematic search for ancient gene duplications using twelve representative, completely sequenced, archaeal and bacterial genomes. Phylogenetic analyses of identified cases show that the majority of datasets support a root between Archaea and Bacteria; however, some datasets support alternative hypotheses, and all of them suffer from a lack of strong phylogenetic signal. The results are discussed with respect to the impact of horizontal gene transfer on the ability to reconstruct organismal evolution. The exchange of genetic information between divergent organisms gives rise to mosaic genomes, where different genes in a genome have different histories. Simulations show that even low rates of horizontal gene transfer dramatically complicate the reconstruction of organismal evolution, and that the different most recent common molecular ancestors likely existed at different times and in different lineages. Correspondence and reprints: Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, U.S.A. 
Present address: Genome Atlantic, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada.

19.
20.
In this paper, we are interested in the computational complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes or genomic markers, a problem that happens frequently when comparing whole nuclear genomes. Recently, several methods ([1], [2]) have been proposed that are based on two steps to compute a given (dis)similarity measure M between two genomes G_1 and G_2: first, one establishes a one-to-one correspondence between genes of G_1 and genes of G_2; second, once this correspondence is established, it defines explicitly a permutation and it is then possible to quantify their similarity using classical measures defined for permutations, like the number of breakpoints. Hence these methods rely on two elements: a way to establish a one-to-one correspondence between genes of a pair of genomes, and a (dis)similarity measure for permutations. The problem is then, given a (dis)similarity measure for permutations, to compute a correspondence that defines an optimal permutation for this measure. We are interested here in two models to compute a one-to-one correspondence: the exemplar model, where all but one copy are deleted in both genomes for each gene family, and the matching model, that computes a maximal correspondence for each gene family. We show that for these two models, and for three (dis)similarity measures on permutations, namely the number of common intervals, the maximum adjacency disruption (MAD) number and the summed adjacency disruption (SAD) number, the problem of computing an optimal correspondence is NP-complete, and even APX-hard for the MAD number and SAD number.
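The two-step scheme described above can be illustrated with a brute-force sketch of the exemplar model. It uses the number of breakpoints, which the abstract cites as a classical permutation measure, rather than the three measures whose hardness is proved. The sketch assumes unsigned genes on a single linear chromosome and simply tries every choice of exemplars, so its running time grows exponentially with the number of duplicates, in keeping with the hardness results. All function names are hypothetical.

```python
from itertools import product


def breakpoints(p, q):
    """Count adjacencies of p that are not adjacencies of q (unsigned, linear)."""
    adj_q = {frozenset(pair) for pair in zip(q, q[1:])}
    return sum(frozenset(pair) not in adj_q for pair in zip(p, p[1:]))


def exemplars(genome):
    """Yield every way of keeping exactly one copy per gene family, in order."""
    positions = {}
    for index, gene in enumerate(genome):
        positions.setdefault(gene, []).append(index)
    for choice in product(*positions.values()):
        yield tuple(genome[i] for i in sorted(choice))


def exemplar_breakpoint_distance(g1, g2):
    """Minimum number of breakpoints over all pairs of exemplar genomes.

    Assumption: both genomes contain the same gene families.
    """
    if set(g1) != set(g2):
        raise ValueError("genomes must share the same gene families")
    return min(breakpoints(p, q) for p in exemplars(g1) for q in exemplars(g2))
```

For instance, `exemplar_breakpoint_distance(list("abca"), list("abc"))` is 0, because keeping the first copy of `a` yields the same order in both genomes, whereas keeping the second copy would create one breakpoint. The matching model would instead keep as many paired copies as possible per family, which this sketch does not cover.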
