The increasing number of genome sequences of archaea and bacteria shows their adaptation to different environmental conditions at the genomic level. Aeropyrum spp. are aerobic and hyperthermophilic archaea. Aeropyrum camini was isolated from a deep-sea hydrothermal vent, and Aeropyrum pernix was isolated from a coastal solfataric vent. To investigate the adaptation strategy in each habitat, we compared the genomes of the two species. Shared genome features were a small genome size, a high GC content, and a large portion of orthologous genes (86 to 88%). The genomes also showed high synteny. These shared features may have been derived from the small number of mobile genetic elements and the lack of a RecBCD system, a recombinational enzyme complex. In addition, the specialized physiology (aerobic and hyperthermophilic) of Aeropyrum spp. may contribute to the entire-genome similarity. Despite the stability of the genomes, synteny was disrupted by two proviruses, A. pernix spindle-shaped virus 1 (APSV1) and A. pernix ovoid virus 1 (APOV1), and by clustered regularly interspaced short palindromic repeat (CRISPR) elements. Spacer sequences derived from the A. camini CRISPR showed significant matches with protospacers of the two proviruses infecting A. pernix, indicating that A. camini interacted with viruses closely related to APSV1 and APOV1. Furthermore, a significant fraction of the nonorthologous genes (41 to 45%) were proviral genes or ORFans probably originating from viruses. Although the genomes of A. camini and A. pernix were conserved, we observed nonsynteny that was attributed primarily to virus-related elements. Our findings indicated that the genomic diversification of Aeropyrum spp. is substantially caused by viruses.

Various methods have been developed to detect horizontal gene transfer in bacteria, based on anomalous nucleotide composition, assuming that compositional features undergo amelioration in the host genome. Evolutionary theory predicts the inevitability of false positives when essential sequences are strongly conserved. Foreign genes could become more detectable on the basis of their higher order compositions if such features ameliorate more rapidly and uniformly than lower order features. This possibility is tested by comparing the heterogeneities of bacterial genomes with respect to strand-independent first- and second-order features, (i) G + C content and (ii) dinucleotide relative abundance, in 1 kb segments. Although statistical analysis confirms that (ii) is less inhomogeneous than (i) in all 12 species examined, extreme anomalies with respect to (ii) in the Escherichia coli K12 genome are typically co-located with essential genes.
Key words: amelioration, dinucleotide frequency, essential genes, horizontal transfer, molecular evolution
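The two compositional features compared above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the abstract does not give the formula for dinucleotide relative abundance, so the sketch assumes the widely used symmetrized odds ratio rho*_XY = f*_XY / (f*_X f*_Y), with frequencies pooled over a segment and its reverse complement. The function names and the non-overlapping windowing are likewise assumptions, and the input is assumed to be upper-case A/C/G/T only.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """First-order feature: fraction of G + C (identical on both strands)."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def dinucleotide_relative_abundance(seq):
    """Second-order feature: symmetrized odds ratio for all 16 dinucleotides.

    Counts are pooled over the segment and its reverse complement, which
    makes the result strand-independent (assumed definition, see lead-in).
    """
    mono, di = Counter(), Counter()
    for strand in (seq, reverse_complement(seq)):
        mono.update(strand)
        di.update(strand[i:i + 2] for i in range(len(strand) - 1))
    n_mono, n_di = sum(mono.values()), sum(di.values())
    rho = {}
    for x, y in product("ACGT", repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono)
        rho[x + y] = (di[x + y] / n_di) / expected if expected else float("nan")
    return rho


def windows(seq, size=1000):
    """Yield (start, segment) for consecutive non-overlapping segments."""
    for start in range(0, len(seq) - size + 1, size):
        yield start, seq[start:start + size]
```

Per-segment values from `windows` could then be compared across a genome, for example by looking at the spread of `gc_content` versus the spread of each `rho` entry; which heterogeneity statistic the study used is not stated in the abstract.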

Mitochondrial genomes of apicomplexans, dinoflagellates, and chrompodellids, which collectively make up the Myzozoa, encode only three proteins (Cytochrome b [COB], Cytochrome c oxidase subunit 1 [COX1], Cytochrome c oxidase subunit 3 [COX3]), contain fragmented ribosomal RNAs, and display extensive recombination, RNA trans-splicing, and RNA-editing. The early-diverging Perkinsozoa is the final major myzozoan lineage whose mitochondrial genomes have remained poorly characterized. Previous reports of Perkinsus genes indicated independent acquisition of non-canonical features, namely the occurrence of multiple frameshifts. To determine both ancestral myzozoan and novel perkinsozoan mitochondrial genome features, we sequenced and assembled mitochondrial genomes of four Perkinsus species. These data show a simple ancestral genome with the common reduced coding capacity but a disposition for rearrangement. We identified 75 frameshifts across the four species that occur as distinct types and that are highly conserved in gene location. A decoding mechanism apparently employs unused codons at the frameshift sites that advance translation either +1 or +2 frames to the next used codon. The locations of frameshifts are seemingly positioned to regulate protein folding of the nascent protein as it emerges from the ribosome. The cox3 gene is distinct in containing only one frameshift and showing strong selection against residues that are otherwise frequently encoded at the frameshift positions in cox1 and cob. All genes lack cysteine codons, implying a reduction to 19 amino acids in these genomes. Furthermore, mitochondrion-encoded rRNA fragment complements are incomplete in Perkinsus spp., but some are found in the nuclear DNA, suggesting import into the organelle. Perkinsus demonstrates further remarkable trajectories of organelle genome evolution, including pervasive integration of frameshift translation into genome expression.

Horizontal gene transfer (HGT) plays a central role in bacterial evolution, yet the molecular and cellular constraints on functional integration of the foreign genes are poorly understood. Here we performed inter-species replacement of the chromosomal folA gene, encoding the essential metabolic enzyme dihydrofolate reductase (DHFR), with orthologs from 35 other mesophilic bacteria. The orthologous inter-species replacements caused a marked drop (in the range 10–90%) in bacterial growth rate despite the fact that most orthologous DHFRs are as stable as E. coli DHFR at 37°C and are more catalytically active than E. coli DHFR. Although phylogenetic distance between E. coli and orthologous DHFRs as well as their individual molecular properties correlate poorly with growth rates, the product of the intracellular DHFR abundance and catalytic activity (kcat/KM) correlates strongly with growth rates, indicating that the drop in DHFR abundance constitutes the major fitness barrier to HGT. Serial propagation of the orthologous strains for ~600 generations dramatically improved growth rates by largely alleviating the fitness barriers. Whole genome sequencing and global proteome quantification revealed that the evolved strains with the largest fitness improvements have accumulated mutations that inactivated the ATP-dependent Lon protease, causing an increase in the intracellular DHFR abundance. In one case DHFR abundance increased further due to mutations accumulated in the folA promoter, but only after the lon-inactivating mutations were fixed in the population. Thus, by apparently distinguishing between self and non-self proteins, protein homeostasis imposes an immediate and global barrier to the functional integration of foreign genes by decreasing the intracellular abundance of their products.
Once this barrier is alleviated, more fine-tuned evolution occurs to adjust the function/expression of the transferred proteins to the constraints imposed by the intracellular environment of the host organism.

Over 3000 microbial (bacterial and archaeal) genomes have been made publicly available to date, providing an unprecedented opportunity to examine evolutionary genomic trends and offering valuable reference data for a variety of other studies such as metagenomics. The utility of these genome sequences is greatly enhanced when we have an understanding of how they are phylogenetically related to each other. Therefore, we here describe our efforts to reconstruct the phylogeny of all available bacterial and archaeal genomes. We identified 24 single-copy, ubiquitous genes suitable for this phylogenetic analysis. We used two approaches to combine the data for the 24 genes. First, we concatenated alignments of all genes into a single alignment from which a Maximum Likelihood (ML) tree was inferred using RAxML. Second, we used a relatively new approach to combining gene data, Bayesian Concordance Analysis (BCA), as implemented in the BUCKy software, in which the results of 24 single-gene phylogenetic analyses are used to generate a “primary concordance” tree. A comparison of the concatenated ML tree and the primary concordance (BUCKy) tree reveals that the two approaches give similar results, relative to a phylogenetic tree inferred from the 16S rRNA gene. After comparing the results and the methods used, we conclude that the current best approach for generating a single phylogenetic tree, suitable for use as a reference phylogeny for comparative analyses, is to perform a maximum likelihood analysis of a concatenated alignment of conserved, single-copy genes.
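The concatenation step in the first approach can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the data layout (one dictionary per gene, mapping taxon name to aligned sequence), the function name, and the choice to pad a taxon missing from a gene with gap characters are all assumptions made for illustration. The resulting supermatrix would still need to be written to a file format accepted by the tree-inference program.

```python
def concatenate_alignments(alignments):
    """Join per-gene alignments into one supermatrix.

    alignments: list of dicts, each mapping taxon -> aligned sequence.
    All sequences within one gene must have the same length.
    Taxa absent from a gene are padded with '-' (assumed handling).
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input with hypothetical taxa and two short "genes".
gene1 = {"taxonA": "MK-L", "taxonB": "MKQL"}
gene2 = {"taxonA": "GG", "taxonC": "GA"}
supermatrix = concatenate_alignments([gene1, gene2])
```

Every row of the supermatrix has the same length (the sum of the gene alignment widths), which is the property a concatenated maximum likelihood analysis relies on.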



The concept of ribosomal constraints on rRNA genes is deduced primarily based on the comparison of consensus rRNA sequences between closely related species, but recent advances in whole-genome sequencing allow evaluation of this concept within organisms with multiple rRNA operons.

Methodology/Principal Findings

Using the 23S rRNA gene as an example, we analyzed the diversity among individual rRNA genes within a genome. Of 184 prokaryotic species containing multiple 23S rRNA genes, diversity was observed in 113 (61.4%) genomes (mean 0.40%, range 0.01%–4.04%). Significant (1.17%–4.04%) intragenomic variation was found in 8 species. In 5 of the 8 species, the diversity in the primary structure had only minimal effect on the secondary structure (stem versus loop transition). In the remaining 3 species, the diversity significantly altered local secondary structure, but the alteration appears minimized through complex rearrangement. Intervening sequences (IVS), ranging between 9 and 1471 nt in size, were found in 7 species. IVS in Deinococcus radiodurans and Nostoc sp. encode transposases. T. tengcongensis was the only species in which intragenomic diversity >3% was observed among 4 paralogous 23S rRNA genes.


These findings indicate tight ribosomal constraints on individual 23S rRNA genes within a genome. Although classification using primary 23S rRNA sequences could be erroneous, significant diversity among paralogous 23S rRNA genes was observed only once in the 184 species analyzed, indicating little overall impact on the mainstream of 23S rRNA gene-based prokaryotic taxonomy.
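As a rough illustration of how intragenomic diversity among rRNA gene copies might be quantified, the Python sketch below computes the percentage of differing positions between pre-aligned copies and reports the largest pairwise value for a genome. The abstract does not specify the exact measure, so this definition, the treatment of a gap opposite a base as a difference, and the function names are assumptions.

```python
from itertools import combinations


def percent_difference(a, b):
    """Percentage of differing columns between two aligned sequences.

    Columns where both sequences have a gap are ignored; a gap opposite
    a base counts as a difference (assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    return 100.0 * mismatches / compared if compared else 0.0


def intragenomic_diversity(copies):
    """Largest pairwise percent difference among the gene copies of one genome."""
    return max(
        (percent_difference(a, b) for a, b in combinations(copies, 2)),
        default=0.0,
    )
```

A genome with a single copy, or with identical copies, scores 0.0 under this definition; real 23S rRNA genes would first need a multiple sequence alignment, which is outside the scope of the sketch.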

Analysis of intragenomic variation of 16S rRNA genes is a unique approach to examining the concept of ribosomal constraints on rRNA genes; the degree of variation is an important parameter to consider for estimation of the diversity of a complex microbiome in the recently initiated Human Microbiome Project (http://nihroadmap.nih.gov/hmp). The current GenBank database has a collection of 883 prokaryotic genomes representing 568 unique species, of which 425 species contained 2 to 15 copies of 16S rRNA genes per genome (2.22 ± 0.81). Sequence diversity among the 16S rRNA genes in a genome was found in 235 species (from 0.06% to 20.38%; 0.55% ± 1.46%). Compared with the 16S rRNA-based threshold for operational definition of species (1 to 1.3% diversity), the diversity was borderline (between 1% and 1.3%) in 10 species and >1.3% in 14 species. The diversified 16S rRNA genes in Haloarcula marismortui (diversity, 5.63%) and Thermoanaerobacter tengcongensis (6.70%) were highly conserved at the 2° structure level, while the diversified gene in Borrelia afzelii (20.38%) appears to be a pseudogene. The diversified genes in the remaining 21 species were also conserved, except for a truncated 16S rRNA gene in “Candidatus Protochlamydia amoebophila.” Thus, this survey of intragenomic diversity of 16S rRNA genes provides strong evidence supporting the theory of ribosomal constraint. Taxonomic classification using the 16S rRNA-based operational threshold could misclassify a number of species into more than one species, leading to an overestimation of the diversity of a complex microbiome. This phenomenon is especially seen in 7 bacterial species associated with the human microbiome or diseases.
rRNA genes are widely used for estimation of evolutionary history and taxonomic assignment of individual organisms (14, 26, 50-52). The choice of rRNA genes as optimal tools for such purposes is based on both observations and assumptions of ribosomal conservation (13, 50).
rRNA genes are essential components of the ribosome, which consists of >50 proteins and three classes of RNA molecules; precise spatial relationships may be essential for assembly of functional ribosomes, constraining rRNA genes from drastic change (9, 13). In bacteria, the three rRNA genes are organized into a gene cluster that is expressed as a single operon, which may be present in multiple copies in the genome. In organisms with multiple rRNA gene operons, the gene sequences tend to evolve in concert. It is generally believed that copies of rRNA genes within an organism are subject to a homogenization process through homologous recombination, also known as gene conversion (18), a form of concerted evolution that maintains their fit within the ribosome. The homogenization process may involve short domains without affecting the entire sequence of each gene (8).
However, significant differences between copies of rRNA genes in single organisms, albeit few, have been discovered in all three domains of life and in all three classes of rRNA genes. The amphibian Xenopus laevis and the loach Misgurnus fossilis have two types of 5S rRNA genes that are specific to either somatic or oocyte ribosomes (30, 48). The parasite Plasmodium berghei contains two types of 18S rRNA genes that differ at 3.5% of the nucleotide positions and are life cycle stage specific (17). The metazoan Dugesia mediterranea possesses two types of 18S rRNA genes with 8% dissimilarity (6). The archaeon Haloarcula marismortui contains two distinct types of 16S rRNA genes that differ by 5% (32, 33). In the domain Bacteria, the actinomycete Thermobispora bispora contains two types of 16S rRNA genes that differ by 6.4% (47). Copies of the 16S rRNA genes and 23S rRNA genes of the actinomycete Thermospora chromogena differ by approximately 6 and 10%, respectively (54).
Paralogous copies of rRNA genes with different sequences may have functionally distinct roles.
Divergent evolution between rRNA genes in the same genome may corrupt the record of evolutionary history and obscure the true identity of an organism. Substantial variation, if it occurs, may lead to the artificial classification of an organism into more than one species. For a cultivable organism, this problem can be resolved by cloning rRNA genes from a pure culture of the organism to identify the degree of variation. However, most environmental surveys and the recently initiated Human Microbiome Project (HMP) (http://nihroadmap.nih.gov/hmp/) (34) use cultivation-independent techniques to examine microbiomes that contain mixed species. In the case of the HMP, it is hoped that this approach may identify some idiopathic diseases that are caused by alterations in the microbiome in humans. In this type of study, it may be impossible to trace all rRNA genes observed back to their original host. For example, in the phylum TM7, multiple 16S rRNA gene sequences have been reported (21), but it is not known whether they belong to multiple species or to the same bacterium with a high degree of intragenomic variation among rRNA gene paralogs. Due to the limited number of microorganisms for which nucleotide sequences are available for all copies of the rRNA genes, intragenomic variation among 16S rRNA genes, and the likelihood of pyrosequencing errors (25, 40), the potential to overestimate the diversity of a microbiome exists.
Coenye et al. analyzed 55 bacterial genomes and found the intragenomic heterogeneity between multiple 16S rRNA genes in these genomes was below the common threshold (1 to 1.3%) for distinguishing species (44) and was unlikely to have a profound effect on the classification of taxa (10). The analysis of 76 whole genomes by Acinas et al. revealed the extreme diversity (11.6%) of 16S rRNA genes in Thermoanaerobacter tengcongensis (2).
These early analyses of intragenomic variation of 16S rRNA genes were limited to a small number of available whole genomes. With the increasing number of whole microbial genomes available from the National Center for Biotechnology Information (NCBI), the extent of diversity among the paralogous 16S rRNA genes within single organisms can now be more thoroughly assessed. In the present study, we (i) addressed the theory of 16S rRNA conservation by systematic evaluation of intragenomic diversity of 16S rRNA sequences in completely sequenced prokaryotic genomes to assess its effect on the accuracy of 16S rRNA-based molecular taxonomy and (ii) examined whether previously observed ribosomal constraints on conservation of 2° structures are uniformly applicable at the intragenomic level.

The purpose of this research was to search for evolutionarily conserved fungal sequences to test the hypothesis that fungi have a set of core genes that are not found in other organisms, as these genes may indicate what makes fungi different from other organisms. By comparing 6355 predicted or known yeast (Saccharomyces cerevisiae) genes to the genomes of 13 other fungi using Standalone TBLASTN at an e-value <1E-5, a list of 3340 yeast genes was obtained with homologs present in at least 12 of 14 fungal genomes. By comparing these common fungal genes to complete genomes of animals (Fugu rubripes, Caenorhabditis elegans), plants (Arabidopsis thaliana, Oryza sativa), and bacteria (Agrobacterium tumefaciens, Xylella fastidiosa), a list of common fungal genes with homologs in these plants, animals, and bacteria was produced (938 genes), as well as a list of exclusively fungal genes without homologs in these other genomes (60 genes). To ensure that the 60 genes were exclusively fungal, these were compared using TBLASTN to the major sequence databases at GenBank: NR (nonredundant), EST (expressed sequence tags), GSS (genome survey sequences), and HTGS (unfinished high-throughput genome sequences). This resulted in 17 yeast genes with homologs in other fungal genomes, but without known homologs in other organisms. These 17 core, fungal genes were not found to differ from other yeast genes in GC content or codon usage patterns. More intensive study is required of these 17 genes and other common fungal genes to discover unique features of fungi compared to other organisms.
Reviewing Editor: Prof. David Gottman
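The filtering logic described in this abstract can be sketched with plain Python sets. This is a hypothetical reconstruction, not the authors' workflow: it assumes the similarity search has already been reduced to, for each yeast gene, the set of genomes with a hit passing the e-value cutoff. It also assumes that the "common" list means a hit in every outgroup genome and the "exclusively fungal" list means a hit in none, which the abstract does not state outright. Gene names, genome labels, and function names are placeholders.

```python
def core_genes(fungal_hits, min_genomes=12):
    """Genes with a passing hit in at least `min_genomes` fungal genomes.

    fungal_hits: dict mapping gene -> set of fungal genomes with a hit.
    """
    return {gene for gene, genomes in fungal_hits.items()
            if len(genomes) >= min_genomes}


def partition_by_outgroups(core, outgroup_hits, outgroups):
    """Split core genes by their presence in non-fungal genomes.

    Returns (in_all, in_none): genes with hits in every outgroup genome,
    and genes with hits in no outgroup genome. Genes found in only some
    outgroups fall into neither set.
    """
    in_all, in_none = set(), set()
    for gene in core:
        found = outgroup_hits.get(gene, set()) & outgroups
        if found == outgroups:
            in_all.add(gene)
        elif not found:
            in_none.add(gene)
    return in_all, in_none


# Toy example with placeholder gene and genome labels.
fungal_hits = {
    "geneA": set(range(14)),
    "geneB": set(range(12)),
    "geneC": set(range(5)),
}
outgroups = {"plant1", "animal1", "bacterium1"}
outgroup_hits = {"geneA": {"plant1", "animal1", "bacterium1"}}
core = core_genes(fungal_hits)
common, fungal_only = partition_by_outgroups(core, outgroup_hits, outgroups)
```

Candidates in the second set would still need the follow-up check against broader sequence databases that the abstract describes before being called exclusively fungal.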

Kochetov A. V., Sarai A., Vorob'ev D. G., Kolchanov N. A. Molecular Biology, 2002, 36(6): 833-840
With the example of yeast genes, context organization was compared for functional gene regions (promoter, 5'-UTR, 3'-UTR) and tested for association with the level of gene expression. Several parameters (nucleotide composition, dinucleotide content bias) proved to correlate with expression level, each functional region having its specific features. Context optimization of a functional region was assumed to be essential for highly efficient interaction with the expression system of the cell. Specific context features were considered as dispersed signals important for high-level gene expression.

The organization of genes into operons, clusters of genes that are co-transcribed to produce polycistronic pre-mRNAs, is a trait found in a wide range of eukaryotic groups, including multiple animal phyla. Operons are present in the class Chromadorea, one of the two main nematode classes, but their distribution in the other class, the Enoplea, is not known. We have surveyed the genomes of Trichinella spiralis, Trichuris muris, and Romanomermis culicivorax and identified the first putative operons in members of the Enoplea. Consistent with the mechanism of polycistronic RNA resolution in other nematodes, the mRNAs produced by genes downstream of the first gene in the T. spiralis and T. muris operons are trans-spliced to spliced leader RNAs, and we are able to detect polycistronic RNAs derived from these operons. Importantly, a putative intercistronic region from one of these potential enoplean operons confers polycistronic processing activity when expressed as part of a chimeric operon in Caenorhabditis elegans. We find that T. spiralis genes located in operons have an increased likelihood of having operonic C. elegans homologs. However, operon structure in terms of synteny and gene content is not tightly conserved between the two taxa, consistent with models of operon evolution. We have nevertheless identified putative operons conserved between Enoplea and Chromadorea. Our data suggest that operons and “spliced leader” (SL) trans-splicing predate the radiation of the nematode phylum, an inference which is supported by the phylogenetic profile of proteins known to be involved in nematode SL trans-splicing.

