Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
We analyzed the nucleotide contents of several completely sequenced genomes, and we show that nucleotide bias can have a dramatic effect on the amino acid composition of the encoded proteins. By surveying the genes in 21 completely sequenced eubacterial and archaeal genomes, along with the entire Saccharomyces cerevisiae genome and two Plasmodium falciparum chromosomes, we show that biased DNA encodes biased proteins on a genomewide scale. The predicted bias affects virtually all genes within the genome, and it could be clearly seen even when we limited the analysis to sets of homologous gene sequences. Parallel patterns of compositional bias were found within the archaea and the eubacteria. We also found a positive correlation between the degree of amino acid bias and the magnitude of protein sequence divergence. We conclude that mutational bias can have a major effect on the molecular evolution of proteins. These results could have important implications for the interpretation of protein-based molecular phylogenies and for the inference of functional protein adaptation from comparative sequence data.

2.
Analyses of 55 individual and 31 concatenated protein data sets encoded in Reclinomonas americana and Marchantia polymorpha mitochondrial genomes revealed that current methods for constructing phylogenetic trees are insufficiently sensitive (or artifact-insensitive) to ascertain the sister of mitochondria among the current sample of eight alpha-proteobacterial genomes using mitochondrially-encoded proteins. However, Rhodospirillum rubrum came as close to mitochondria as any alpha-proteobacterium investigated. This prompted a search for methods to directly compare eukaryotic genomes to their prokaryotic counterparts to investigate the origin of the mitochondrion and its host from the standpoint of nuclear genes. We examined pairwise amino acid sequence identity in comparisons of 6,214 nuclear protein-coding genes from Saccharomyces cerevisiae to 177,117 proteins encoded in sequenced genomes from 45 eubacteria and 15 archaebacteria. The results reveal that approximately 75% of yeast genes having homologues among the present prokaryotic sample share greater amino acid sequence identity to eubacterial than to archaebacterial homologues. At high stringency comparisons, only the eubacterial component of the yeast genome is detectable. Our findings indicate that at the levels of overall amino acid sequence identity and gene content, yeast shares a sister-group relationship with eubacteria, not with archaebacteria, in contrast to the current phylogenetic paradigm based on ribosomal RNA. Among eubacteria and archaebacteria, proteobacterial and methanogen genomes, respectively, shared more similarity with the yeast genome than other prokaryotic genomes surveyed.

3.

Background  

As a key parameter of genome sequence variation, the GC content of bacterial genomes has been investigated for over half a century, and many hypotheses have been put forward to explain this GC content variation and its relationship to other fundamental processes. Previously, we classified eubacteria into dnaE-based groups (the dimeric combination of DNA polymerase III alpha subunits), according to a hypothesis where GC content variation is essentially governed by genome replication and DNA repair mechanisms. Further investigation led to the discovery that two major mutator genes, polC and dnaE2, may be responsible for genomic GC content variation. Consequently, an in-depth analysis was conducted to evaluate various potential intrinsic and extrinsic factors in association with GC content variation among eubacterial genomes.

4.
MOTIVATION: It has been speculated that CpG dinucleotide deficiency in genomes is a consequence of DNA methylation. However, this hypothesis does not adequately explain CpG deficiency in bacteria. The hypothesis based on DNA structure constraint as an alternative explanation was therefore examined. RESULTS: By comparing real bacterial genomes with artificial genomes generated by a second-order Markov model, we found that the core structure of a restricted pattern, the TTCGAA pattern, was underrepresented in low GC content bacterial genomes regardless of CpG dinucleotide level. This is in contrast to the AACGTT pattern, indicating that the counterselection is context-dependent. Further study discovered nine underrepresented patterns that were supposed to be capable of inducing DNA structure constraint. In summary, most of them are in TTCGNA and TTCGAN patterns in both DNA strands. An explanation is also proposed for the strong correlation between GC content and CpG deficiency. The result of random sequence simulation showed that the occurrences of these patterns were correlated with GC content, as well as the percentage of CpG dinucleotides being trapped in these patterns. Finally, we suggest that the degree of counterselection against these restricted patterns could be influenced by global GC content of a genome.

5.
Various methods have been developed to detect horizontal gene transfer in bacteria, based on anomalous nucleotide composition, assuming that compositional features undergo amelioration in the host genome. Evolutionary theory predicts the inevitability of false positives when essential sequences are strongly conserved. Foreign genes could become more detectable on the basis of their higher order compositions if such features ameliorate more rapidly and uniformly than lower order features. This possibility is tested by comparing the heterogeneities of bacterial genomes with respect to strand-independent first- and second-order features, (i) G + C content and (ii) dinucleotide relative abundance, in 1 kb segments. Although statistical analysis confirms that (ii) is less inhomogeneous than (i) in all 12 species examined, extreme anomalies with respect to (ii) in the Escherichia coli K12 genome are typically co-located with essential genes. Key words: amelioration, dinucleotide frequency, essential genes, horizontal transfer, molecular evolution

6.
Based on 152 mitochondrial genomes and 36 bacterial chromosomes that have been completely sequenced, as well as three long contigs for human chromosomes 6, 21, and 22, we examined skews of mononucleotide frequencies and the relative abundance of dinucleotides in one DNA strand. Each group of these genomes has its own characteristics. Regarding mitochondrial genomes, both CpG and GpT are underrepresented, while either GpG or CpC or both are overrepresented. The relative frequency of nucleotide T vs A and of nucleotide G vs C is strongly skewed, due presumably to strand asymmetry in replication errors and unidirectional DNA replication from single origins. Exceptions are found in the plant and yeast mitochondrial genomes, each of which may replicate from multiple origins. Regarding bacterial genomes, the "universal" rule of CpG deficiency is restricted to archaebacteria and some eubacteria. In other eubacteria, the most underrepresented dinucleotide is either TpA or GpT. In general, there are significant T vs A and G vs C skews in each half of the bacterial genome, although these are almost exactly canceled out over the whole genome. Regarding human chromosomes 6, 21, and 22, dinucleotide CpG tends to be avoided. The relative frequency of mononucleotides exhibits conspicuous local skews, suggesting that each of these chromosomal segments contains more than one DNA replication origin. It is concluded that, when there are several replicons in a genomic region, not only the number of DNA replication origins but also the directionality is important and that the observed patterns of nucleotide frequencies in the genome strongly support the hypothesis of strand asymmetry in replication errors. Received: 1 November 2000 / Accepted: 12 March 2001
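
The mononucleotide skews mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common normalized definition, (a - b) / (a + b), applied to G vs C or T vs A counts on one strand; the abstract does not state the exact statistic the authors used, and the function names `skew` and `windowed_skew` are hypothetical.

```python
from collections import Counter


def skew(seq: str, a: str, b: str) -> float:
    """Normalized skew (a - b) / (a + b) for bases a and b in seq.

    Assumption: returns 0.0 when neither base occurs in seq.
    """
    counts = Counter(seq.upper())
    total = counts[a] + counts[b]
    return (counts[a] - counts[b]) / total if total else 0.0


def windowed_skew(seq: str, a: str, b: str, window: int) -> list[float]:
    """Skew in consecutive non-overlapping windows; a trailing partial window is dropped."""
    return [
        skew(seq[i:i + window], a, b)
        for i in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    toy = "GGCC" + "GGGG"  # toy sequence, not real genomic data
    print(windowed_skew(toy, "G", "C", 4))
```

On a real chromosome, a sign change in the windowed G vs C skew is the kind of local pattern the abstract relates to replication origins; the window size is a free parameter here.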

7.
Castoe TA, Stephens T, Noonan BP, Calestani C. Gene 2007, 392(1-2):47-58
Type I polyketide synthases (PKSs), and related fatty acid synthases (FASs), represent a large group of proteins encoded by a diverse gene family that occurs in eubacteria and eukaryotes (mainly in fungi). Collectively, enzymes encoded by this gene family produce a wide array of polyketide compounds that encompass a broad spectrum of biological activity including antibiotic, antitumor, antifungal, immunosuppressive, and predator defense functional roles. We employed a phylogenomics approach to estimate relationships among members of this gene family from eubacterial and eukaryotic genomes. Our results suggest that some animal genomes (sea urchins, birds, and fish) possess a previously unidentified group of pks genes, in addition to possessing fas genes used in fatty acid metabolism. These pks genes in the chicken, fish, and sea urchin genomes do not appear to be closely related to any other animal or fungal genes, and instead are closely related to pks genes from the slime mold Dictyostelium and eubacteria. Continued accumulation of genome sequence data from diverse animal lineages is required to clarify whether the presence of these (non-fas) pks genes in animal genomes owes its origin to horizontal gene transfer (from eubacterial or Dictyostelium genomes) or to more conventional patterns of vertical inheritance coupled with massive gene loss in several animal lineages. Additionally, results of our broad-scale phylogenetic analyses bolster the support for previous hypotheses of horizontal gene transfer of pks genes from bacterial to fungal and protozoan lineages.

8.
Eubacterial genomes have highly variable GC content (0.17-0.75), and the primary mechanism of such variability remains unknown. The place to look is the machinery that actually catalyzes the synthesis of DNA, where DNA polymerase III takes center stage, particularly one of its 10 subunits, the alpha subunit. According to the dimeric combination of alpha subunits, GC contents of eubacterial genomes were partitioned into three groups with distinct GC content variation spectra: dnaE1 (full-spectrum), dnaE2/dnaE1 (high-GC), and polC/dnaE3 (low-GC). Therefore, genomic GC content variability is believed to be governed primarily by the alpha subunit grouping of DNA polymerase III; it is essential in genome composition analysis to take full account of such a grouping principle. Since horizontal gene transfer is very frequent among bacterial genomes, exceptions to the grouping scheme, a few percent of the total, are readily identifiable and should be excluded from in-depth analyses of nucleotide composition.

9.
Dinucleotide usage is known to vary in the genomes of organisms. The dinucleotide usage profiles or genome signatures are similar for sequence samples taken from the same genome, but are different for taxonomically distant species. This concept of genome signatures has been used to study several organisms including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host–pathogen interactions, where molecular interactions between the two species take place continuously, and can influence their genomic composition. In this study, whole genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, were analysed for variation in the genome signatures of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly a consequence of host-specific selection.

10.
11.
To study possible relationships between an organism's genomic DNA curvature and the amino acid composition of its proteome, every peptidic sequence from fully determined genomes was retrotranslated using the E. coli codon preferences, and the curvature profiles of the resulting DNA sequences were calculated and compared. A clear interdependence between these two variables was observed, as each retrotranslated proteome presented a distinctive, statistically significant DNA curvature profile biased toward its natural DNA curvature profile. In addition, by comparing the profiles arising from real and randomly permuted proteomes, we also found a position-dependent contribution of the peptidic sequence to DNA curvature. These results support the idea of a possible selection toward a specific global curvature of genomes.

12.
Static DNA curvature distributions of fully sequenced genomes and large DNA contigs from different organisms were calculated. Very distinctive differences among histogram profiles coming from archaebacteria, eubacteria, and eukaryotes were observed. Eubacterial profiles were, on average, more curved than were archaeal and eukaryotic profiles. A comparative analysis between real and randomized DNA sequences revealed that eubacterial genomes presented, overall, higher curvature values than random sequences. The opposite picture was exhibited by archaeal and eukaryotic genomes, which displayed a lower frequency of curved regions than their corresponding randomized sequences. The contributions of coding and intergenic regions to the curvature profile were also analyzed. Intergenic regions, on average, were found to be more curved than the overall genomic sequences, especially in prokaryotic organisms. Nevertheless, because of their small size with respect to coding regions, the contribution of intergenic sequences to the overall curvature profile tended to be minor. A clear relationship between codon usage and DNA curvature was demonstrated, and a proposal of the possible coevolution of both systems is discussed. Finally, we present a procedure to quantify the deviation of a curvature profile from randomness through a formal statistical analysis.

13.
It is known that in thermophiles the G+C content of ribosomal RNA linearly correlates with growth temperature, while that of genomic DNA does not. Although the G+C contents (singlet) of the genomic DNAs of thermophiles and mesophiles do not differ significantly, the dinucleotide (doublet) compositions of the two bacterial groups clearly do. The average amino acid compositions of proteins of the two groups are also distinct. Based on these facts, we here analyzed the DNA and protein compositions of various bacteria in terms of the optimal growth temperature (OGT). Regression analyses of the sequence data for thermophilic, mesophilic and psychrophilic bacteria revealed good linear relationships between OGT and the dinucleotide compositions of DNA, and between OGT and the amino acid compositions of proteins. Together with the above-mentioned linear relationship between ribosomal RNA and OGT, the DNA and protein compositions can be regarded as thermostability measures for RNA, DNA and proteins, covering a wide range of temperatures. Both the DNA and proteins of psychrophiles apparently exhibit characteristics diametrically opposite to those of thermophiles. The physicochemical parameters of dinucleotides suggested that supercoiling of DNA is relevant to its thermostability. Protein stability in thermophiles is realized primarily through global changes that increase charged residues (i.e., Glu, Arg, and Lys) on the molecular surface of all proteins. This kind of global change is attainable through a change in the amino acid composition coupled with alterations in the DNA base composition. The general strategies of thermophiles and psychrophiles for adaptation to higher and lower temperatures, respectively, that are suggested by the present study are discussed.

14.
SUMMARY: Although whole-genome sequences have been analysed for the presence of anomalous DNA, no dedicated application is currently available to analyse the composition of individual sequence entries, for instance those derived by experimental techniques, such as subtractive hybridization. Since genomic dinucleotide frequency values are conserved between related species, a representative genome sequence can often be found to score for anomalous sequence composition for many of these putative horizontally transferred sequences. We developed the application deltarho-web, which enables the determination of the differences between the dinucleotide composition of an input sequence and that of a selected genome in a size-dependent manner. A feature allowing batch comparisons is included as well. In addition, deltarho-web allows the analysis of the dinucleotide composition of complete genomes. This provides complementary information for the identification of large anomalous gene clusters.
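
As an illustration of the kind of comparison this abstract describes, here is a minimal sketch of a dinucleotide relative-abundance difference. It assumes the widely used odds-ratio form rho_xy = f_xy / (f_x * f_y), computed on the sequence plus its reverse complement, and the mean absolute difference over the 16 dinucleotides as the distance. Whether deltarho-web computes exactly this, and how it applies its size-dependent scoring, is not stated in the abstract; the function names below are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def relative_abundance(seq: str, symmetrize: bool = True) -> dict[str, float]:
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides.

    Assumptions: characters other than A, C, G, T are ignored, and rho is
    set to 0.0 when either base of the pair is absent from the sequence.
    """
    seq = seq.upper()
    strands = [seq]
    if symmetrize:
        # Count both strands separately so no artificial junction pair is created.
        strands.append(seq.translate(_COMPLEMENT)[::-1])

    mono: Counter = Counter()
    di: Counter = Counter()
    for strand in strands:
        mono.update(base for base in strand if base in BASES)
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair[0] in BASES and pair[1] in BASES:
                di[pair] += 1

    n_mono = sum(mono.values())
    n_di = sum(di.values())
    rho = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
        observed = di[x + y] / n_di if n_di else 0.0
        rho[x + y] = observed / expected if expected else 0.0
    return rho


def delta_distance(seq_a: str, seq_b: str) -> float:
    """Mean absolute difference between the two rho profiles (16 dinucleotides)."""
    rho_a = relative_abundance(seq_a)
    rho_b = relative_abundance(seq_b)
    return sum(abs(rho_a[k] - rho_b[k]) for k in rho_a) / 16
```

For example, `delta_distance(fragment, genome)` would give a single score for an input fragment against a reference genome, with both arguments being plain sequence strings. Short fragments give noisy profiles, which is presumably why the abstract stresses a size-dependent comparison.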

15.
16.
Microbes utilize defence systems with fundamental similarities to our innate and adaptive immune responses to protect themselves from harmful invaders. One system, made up of CRISPR loci and Cas proteins, incorporates recognizable features from the genomes of viruses (bacteriophages) and plasmids into bacterial genomes, where they are later used to direct a ribonucleoprotein complex to destroy invading nucleic acids upon re-exposure. CRISPR-mediated defence against invasive nucleic acids is found in most archaea and many eubacteria. Many aspects of this newly described defence system have not been worked out, including the molecular mechanisms by which foreign nucleic acids are incorporated into microbial genomes during adaptation and destroyed during interference. In this issue of Molecular Microbiology, DeLisa and colleagues provide insight into how this form of microbial immunity might be regulated in eubacteria. They demonstrate that Escherichia coli CRISPR-mediated immunity requires the presence of the BaeSR two-component system under certain conditions. Since BaeSR regulate an envelope stress response, their data imply that immunity against invading, foreign nucleic acids may be somehow linked to stresses to the bacterial membrane. These observations will help pave the way to understanding how and when CRISPR-based immunity may be important in driving evolution and adaptation in eubacteria.

17.
18.
19.
Organelle origins and ribosomal RNA   Total citations: 8, self-citations: 0, citations by others: 8
As the detailed molecular biology of organelle genomes has unfolded, there has been a general acceptance of the view that plastids and mitochondria are of endosymbiotic, eubacterial origin. Plastid genes are strikingly similar to their eubacterial (particularly cyanobacterial) counterparts in sequence, organization, and mode of expression, and such features strongly support the hypothesis that the plastid and its genome were derived in evolution from a blue-green alga-like endosymbiont. Mitochondria, on the other hand, are problematic: mitochondrial genes are organized and expressed in remarkably diverse ways in the different major groups of eukaryotes, and in no case are these features particularly characteristic of either bacterial or nuclear genomes. There is, however, clear evidence derived from gene sequence supporting the eubacterial ancestry of mitochondria, and some of the most compelling data have come from analyses of mitochondrial ribosomal RNA (rRNA). Plant mitochondrial rRNA genes diverge in sequence at a particularly slow rate, and these genes have proven to be especially supportive of the endosymbiont hypothesis, pointing to an origin of mitochondria from within the alpha subdivision of the purple bacteria. Ribosomal RNA sequences provide a basis for the construction of global phylogenetic trees that probe the evolutionary history of organelles, and that address the question of whether mitochondria and plastids are monophyletic or polyphyletic in origin. Such studies raise the possibility that the rRNA genes of plant mitochondria originated separately from the mitochondrial rRNA genes of other eukaryotes.

20.
Insertion sequences (ISs) are mobile elements that are commonly found in bacterial genomes. Here, the structural and functional diversity of these mobile elements in the genome of the cyanobacterium Crocosphaera watsonii WH8501 is analyzed. The number, distribution, and diversity of nucleotide and amino acid stretches with similarity to the transposase gene of this IS family suggested that this genome harbors many functional as well as truncated IS fragments. The selection pressure acting on full-length transposase open reading frames of these ISs suggested (i) the occurrence of positive selection and (ii) the presence of one or more positively selected codons. These results were obtained using three data sets of transposase genes from the same IS family that were collected based on the level of amino acid similarity, the presence of an inverted repeat, and the number of sequences in the data sets. Neither recombination nor ribosomal frameshifting, which may interfere with the selection analyses, appeared to be important forces in the transposase gene family. Some positively selected codons were located in a conserved domain, suggesting that these residues are functionally important. The finding that this type of selection acts on IS-carried genes is intriguing, because although ISs have been associated with the adaptation of the bacterial host to new environments, this has typically been attributed to transposition or transformation, thus involving different genomic locations. Intragenic adaptation of IS-carried genes identified here may constitute a novel mechanism associated with bacterial diversification and adaptation.
