Similar articles (20 found)
1.
The relative contribution of mutation and selection to the G+C content of DNA was analyzed in bacterial species having widely different G+C contents. The analysis used two methods that were developed previously. The first method was to plot the average G+C content of a set of nucleotides against the G+C content of the third codon position for each gene. This method was used to present the G+C distribution of the third codon position and to assess the neutrality of a set of nucleotides relative to that of the third codon position. The second method was to plot the intrastrand bias of the third codon position from Parity Rule 2 (PR2), where A=T and G=C. It was found that whereas intragenomic distributions of the DNA G+C content of these bacteria are narrow in the majority of species, in some species the G+C content of the minor class of genes is distributed over a wider range than that of the major class of genes. On the other hand, ubiquitous PR2 biases are amino acid specific and independent of the G+C content of DNA, so that when averaged over the amino acids, the biases are small and not correlated with the DNA G+C content. Therefore, translation-coupled PR2 biases are unlikely to explain the wide range of G+C contents among different species. Considering all data available, it was concluded that the amino acid-specific PR2 bias has only a minor effect, if any, on the average G+C content. In addition, PR2 bias patterns of different species show phylogenetic relationships, and the pattern can be used as a taxonomic fingerprint. Received: 5 November 1998 / Accepted: 1 March 1999
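The two quantities plotted in the abstract above can be illustrated with a minimal sketch. It assumes an in-frame coding sequence given as a plain string, and it pools all third codon positions rather than treating each amino acid separately as the abstract describes. The function names are hypothetical, and the bias ratios A3/(A3+T3) and G3/(G3+C3) are one common way of expressing a PR2 deviation, where 0.5 on both axes corresponds to A=T and G=C.

```python
def third_positions(cds: str) -> str:
    """Return the third-codon-position bases of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return cds[2:usable:3]


def gc3_and_pr2(cds: str) -> tuple[float, float, float]:
    """Return (GC3, A3/(A3+T3), G3/(G3+C3)) for one gene.

    Simplifying assumption: all third positions are pooled; the abstract's
    analysis is amino acid specific.
    """
    third = third_positions(cds)
    a, t, g, c = (third.count(base) for base in "ATGC")
    total = a + t + g + c
    nan = float("nan")
    gc3 = (g + c) / total if total else nan
    at_bias = a / (a + t) if (a + t) else nan
    gc_bias = g / (g + c) if (g + c) else nan
    return gc3, at_bias, gc_bias


if __name__ == "__main__":
    # Toy sequence, not real data: codons ATG GCC AAA TTT GGC
    print(gc3_and_pr2("ATGGCCAAATTTGGC"))
```

Plotting `gc_bias` against `at_bias` for every gene would give a PR2-style scatter, and points far from (0.5, 0.5) would indicate a strand bias at the third position.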

2.
Base composition is not uniform across the genome of Drosophila melanogaster. Earlier analyses have suggested that there is variation in composition in D. melanogaster on both a large scale and a much smaller, within-gene, scale. Here we present analyses on 117 genes which have reliable intron/exon boundaries and no known alternative splicing. We detect significant heterogeneity in G+C content among intron segments from the same gene, as well as a significant positive correlation between the intron and the third codon position G+C content within genes. Both of these observations appear to be due, in part, to an overall decline in intron and third codon position G+C content along Drosophila genes with introns. However, there is also evidence of an increase in third codon position G+C content at the start of genes; this is particularly evident in genes without introns. This is consistent with selection acting against preferred codons at the start of genes. Received: 24 February 1997 / Accepted: 10 November 1997

3.
Sueoka N  Kawanishi Y 《Gene》2000,261(1):53-62
The human genome, like those of other eukaryotes, has a wide heterogeneity in DNA base composition. The evolutionary basis for this heterogeneity has been unknown. A previous study of the human genome (846 genes analyzed) has shown that, in the major range of the G+C content in the third codon position (0.25-0.75), biases from Parity Rule 2 (PR2) among the synonymous codons of the four-codon amino acids are similar except in the highest G+C range (Sueoka, N., 1999. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 238, 53-58.). PR2 is an intra-strand rule where A=T and G=C are expected when there are no biases between the two complementary strands of DNA in mutation and selection rates (substitution rates). In this study, 14,026 human genes were analyzed. In addition, the third codon positions of two-codon amino acids were analyzed. New results show the following: (a) The G+C contents of the third codon position of human genes are scattered over the range 0.22-0.96. (b) The PR2 biases are similar in the range of 0.25-0.75, whereas, in the high G+C range (0.75-0.96; 13% of the genes), the PR2-bias fingerprints are different from those of the major range. (c) Unlike the PR2 biases, the G+C contents of the third codon position for both four-codon and two-codon amino acids are all correlated almost perfectly with the G+C content of the third codon position over the total G+C range. These results support the notion that directional mutation pressure, rather than directional selection pressure, is mainly responsible for the heterogeneity of the G+C content of the third codon position.

4.
We compared the codon usage of sequences of transposable elements (TEs) with that of host genes from the species Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Saccharomyces cerevisiae, and Homo sapiens. Factorial correspondence analysis showed that, regardless of the base composition of the genome, the TEs differed from the genes of their host species by their AT-richness. In all species, the percentage of A + T on the third codon position of the TEs was higher than that on the first codon position and lower than that in the noncoding DNA of the genomes. This indicates that the codon choice is not simply the outcome of mutational bias but is also subject to selection constraints. A tendency toward higher A + T on the third position than on the first position was also found in the host genes of A. thaliana, C. elegans, and S. cerevisiae but not in those of D. melanogaster and H. sapiens. This strongly suggests that the AT choice is a host-independent characteristic common to all TEs. The codon usage of TEs generally appeared to be different from the mean of the host genes. In the AT-rich genomes of Arabidopsis thaliana, Caenorhabditis elegans, and Saccharomyces cerevisiae, the codon usage bias of TEs was similar to that of weakly expressed genes. In the GC-rich genome of D. melanogaster, however, the bias in codon usage of the TEs clearly differed from that of weakly expressed genes. These findings suggest that selection acts on TEs and that TEs may display specific behavior within the host genomes. Received: 2 May 2001 / Accepted: 29 October 2001

5.
A+T content, phylogenetic relationships, codon usage, evolutionary rates, and the ratio of synonymous to non-synonymous substitutions have been studied in partial sequences of the atpD and aroQ/pheA genes of primary (Buchnera) and secondary symbionts of aphids and a set of selected non-symbiotic bacteria belonging to the five subdivisions of the Proteobacteria. Compared to the homologous genes of the last group, both genes belonging to Buchnera behave in a similar way, showing a higher A+T content, a monophyletic grouping, a loss of codon bias (especially at the third base position), an evolutionary acceleration, and an increase in the number of non-synonymous substitutions, confirming previous results reported elsewhere for other genes. Where data are available, these properties have been partly observed in the secondary symbionts, but with values that are intermediate between Buchnera and free-living Proteobacteria. They show a high A+T content, but not as high as that of Buchnera; an unresolved phylogenetic position between Buchnera and the other γ-Proteobacteria; a loss of codon bias, again not as pronounced as in Buchnera; and a significant evolutionary acceleration in the case of the three atpD genes, but not when considering the aroQ/pheA genes. These results support the hypothesis that they are symbionts at different stages of symbiotic accommodation to the host.

6.
The principal intracellular symbiotic bacteria of the cereal weevil Sitophilus oryzae were characterized using the sequence of the 16S rDNA gene (rrs gene) and G + C content analysis. Polymerase chain reaction amplification with universal eubacterial primers of the rrs gene showed a single expected sequence of 1,501 bp. Comparison of this sequence with the available database sequences placed the intracellular bacteria of S. oryzae as members of the Enterobacteriaceae family, closely related to the free-living bacteria, Erwinia herbicola and Escherichia coli, and the endocytobiotic bacteria of the tsetse fly and aphids. Moreover, by high-performance liquid chromatography, we measured the genomic G + C content of the S. oryzae principal endocytobiotes (SOPE) as 54%, while the known genomic G + C content of most intracellular bacteria is about 39.5%. Furthermore, based on the third codon position G + C content and the rrs gene G + C content, we demonstrated that most intracellular bacteria except SOPE are A + T biased irrespective of their phylogenetic position. Finally, using the hsp60 gene sequence, the codon usage of SOPE was compared with that of two phylogenetically closely related bacteria: E. coli, a free-living bacterium, and Buchnera aphidicola, the intracellular symbiotic bacteria of aphids. Taken together, these results show a peculiar and distinctly different DNA composition of SOPE with respect to the other obligate intracellular bacteria, and, combined with biological and biochemical data, they elucidate the evolution of symbiosis in S. oryzae. Received: 8 September 1997 / Accepted: 24 October 1997

7.
Identifying the G + C difference between closely related bacterial species or between different strains of the same species is one of the first steps in understanding the evolutionary mechanisms accounting for the differences observed among bacterial species. The G + C content can be one of the most important factors in the evolution of genomic structures. In this paper, we describe a new method for detecting an initial stage of differentiation of the G + C content at the third codon base position between two strains of the same bacterial species. We apply this method to the two strains of Helicobacter pylori. A group of genes is detected with large variations of G + C in the third positions, apparently genes of early response to pressures of changing G + C. We discuss our findings from the viewpoint of genomic evolution. Received: 26 February 2001 / Accepted: 16 May 2001

8.
We show that in animal mitochondria homologous genes that differ in guanine plus cytosine (G + C) content code for proteins differing in amino acid content in a manner that relates to the G + C content of the codons. DNA sequences were analyzed using square plots, a new method that combines graphical visualization and statistical analysis of compositional differences in both DNA and protein. Square plots divide codons into four groups based on first and second position A + T (adenine plus thymine) and G + C content and indicate differences in amino acid content when comparing sequences that differ in G + C content. When sequences are compared using these plots, the amino acid content is shown to correlate with the nucleotide bias of the genes. This amino acid effect is shown in all protein-coding genes in the mitochondrial genome, including cox I, cox II, and cyt b, mitochondrial genes which are commonly used for phylogenetic studies. Furthermore, nucleotide content differences are shown to affect the content of all amino acids with A + T- and G + C-rich codons. We speculate that phylogenetic analysis of genes so affected may tend erroneously to indicate relatedness (or lack thereof) based only on amino acid content. Received: 3 July 1996 / Accepted: 6 November 1996

9.
The phylogenetic relationships of the genus Passer (Old World sparrows) have been studied with species covering their complete worldwide range. Mitochondrial (mt) cyt b genes and pseudogenes have been analyzed, the latter being strikingly abundant in the genus Passer compared with other studied songbirds. The significance of these Passer pseudogenes is presently unclear. The mechanisms by which mt cyt b genes become pseudogenes after nuclear translocation are discussed together with their mode of evolution, i.e., the mitochondrial transition/transversion ratio is decreased in the nucleus, as is the constraint on variability at the three codon positions. However, the skewed base composition according to codon position (at the 1st position the percentages of the four bases are very similar; at the 2nd position there is less A and G and more T; and the 3rd position is poor in G and T and very rich in A and C) is maintained in the translocated nuclear pseudogenes. Different nuclear internal mechanisms and/or selective pressures must exist to explain this difference between nuclear and mitochondrial DNA in the evolutionary variability of base composition. Also, the phylogenetic usefulness of pseudogenes for defining relationships between closely related lineages is stressed. The analyses suggest that the primitive Passer species comes from Africa, the Cape sparrow being the oldest. P. hispaniolensis italiae is more likely conspecific with P. domesticus than with P. hispaniolensis. Also, Passer species are not included within weavers or Estrildinae or Emberizinae, as previously suggested. European and American Emberizinae sparrows are closely related to each other and seem to be the earliest species that radiated among the studied songbirds (all in the Miocene Epoch). Received: 29 November 2000 / Accepted: 22 March 2001

10.
Complete sequences of seven protein-coding genes from Penaeus notialis mitochondrial DNA were compared in base composition and codon usage with homologous genes from Artemia franciscana and four insects. The crustacean genes are significantly less A + T-rich than their counterparts in insects, and the pattern of codon usage (ratio of G + C-rich versus A + T-rich codons) is less biased. A phylogenetic analysis using amino acid sequences of the seven corresponding polypeptides supports a sister-taxon status for mollusks–annelids and arthropods. Furthermore, a distance matrix-based tree and two most-parsimonious trees both suggest that crustaceans are paraphyletic with respect to insects. This is also supported by the inclusion of Panulirus argus COII (complete) and COI and COIII (partial) sequence data. From analysis of single and combined genes to infer phylogenies, it is observed that the topologies obtained from single genes are not well supported in most cases and notably differ from that of the tree based on all seven genes. Received: 25 August 1998 / Accepted: 8 March 1999

11.
Along the gene, nucleotides in various codon positions tend to exert a slight but observable influence on the nucleotide choice at neighboring positions. Such context biases are different in different organisms and can be used as genomic signatures. In this paper, we will focus specifically on the dinucleotide composed of a third codon position nucleotide and its succeeding first position nucleotide. Using the 16 possible dinucleotide combinations, we calculate how well individual genes conform to the observed mean dinucleotide frequencies of an entire genome, forming a distance measure for each gene. It is found that genes from different genomes can be separated with a high degree of accuracy, according to these distance values. In particular, we address the problem of recent horizontal gene transfer, and how imported genes may be evaluated by their poor assimilation to the host's context biases. By concentrating on the third- and succeeding first position nucleotides, we eliminate most spurious contributions from codon usage and amino-acid requirements, focusing mainly on mutational effects. Since imported genes are expected to converge only gradually to genomic signatures, it is possible to question whether a gene present in only one of two closely related organisms has been imported into one organism or deleted in the other. Striking correlations between the proposed distance measure and poor homology are observed when Escherichia coli genes are compared to Salmonella typhi, indicating that sets of outlier genes in E. coli may contain a high number of genes that have been imported into E. coli, and not deleted in S. typhi. Received: 16 January 2001 / Accepted: 30 August 2001
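The per-gene distance described in the abstract above might be sketched as follows. The abstract does not state the actual distance formula, so the sum of absolute differences used here is an assumption chosen for illustration, and the function names are hypothetical. The input is assumed to be an in-frame coding sequence.

```python
from itertools import product

# The 16 possible dinucleotides, in a fixed order.
DINUCS = ["".join(pair) for pair in product("ACGT", repeat=2)]


def junction_freqs(cds: str) -> dict[str, float]:
    """Frequencies of dinucleotides made of codon position 3 and the
    first position of the following codon."""
    cds = cds.upper()
    counts = dict.fromkeys(DINUCS, 0)
    # Third positions sit at indices 2, 5, 8, ...; stop before the last
    # base so that every third position has a successor.
    for i in range(2, len(cds) - 1, 3):
        pair = cds[i:i + 2]
        if pair in counts:  # skip ambiguous bases such as N
            counts[pair] += 1
    total = sum(counts.values())
    return {d: (counts[d] / total if total else 0.0) for d in DINUCS}


def signature_distance(gene: dict[str, float], genome: dict[str, float]) -> float:
    """ASSUMPTION: sum of absolute frequency differences; the abstract
    does not specify which distance measure was used."""
    return sum(abs(gene[d] - genome[d]) for d in DINUCS)


if __name__ == "__main__":
    gene = junction_freqs("ATGGCCAAA")        # toy sequence, not real data
    uniform = {d: 1 / 16 for d in DINUCS}     # stand-in for a genome mean
    print(signature_distance(gene, uniform))
```

In practice the genome-wide mean would be computed by pooling the junction counts of all genes, and genes with unusually large distances would be the candidates for recent import.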

12.
Wolbachia are obligatory intracellular and maternally inherited bacteria, known to infect many species of arthropod. In this study, we discovered a bacteriophage-like genetic element in Wolbachia, which was tentatively named bacteriophage WO. The phylogenetic tree based on phage WO genes of several Wolbachia strains was not congruent with that based on chromosomal genes of the same strains, suggesting that phage WO was active and horizontally transmitted among various Wolbachia strains. All the strains of Wolbachia used in this study were infected with phage WO. Although the phage genome contained genes of diverse origins, the average G+C content and codon usage of these genes were quite similar to those of a chromosomal gene of Wolbachia. These results raised the possibility that phage WO has been associated with Wolbachia for a very long time, conferring some benefit to its hosts. The evolution and possible roles of phage WO in various reproductive alterations of insects caused by Wolbachia are discussed. Received: 28 January 2000 / Accepted: 3 August 2000

13.
Variation in GC content, GC skew and AT skew along genomic regions was examined at third codon positions in completely sequenced prokaryotes. Eight out of nine eubacteria studied show GC and AT skews that change sign at the origin of replication. The leading strand in DNA replication is G-T rich at codon position 3 in six eubacteria, but C-T rich in two Mycoplasma species. In M. genitalium the AT and GC skews are symmetrical around the origin and terminus of replication, whereas its GC content variation has been shown to have a centre of symmetry elsewhere in the genome. Borrelia burgdorferi and Treponema pallidum show extraordinary extents of base composition skew correlated with direction of DNA replication. Base composition skews measured at third codon positions probably reflect mutational biases, whereas those measured over all bases in a sequence (or at codon positions 1 and 2) can be strongly affected by protein considerations due to the tendency in some bacteria for genes to be transcribed in the same direction that they are replicated. Consequently in some species the direction of skew for total genomic DNA is opposite to that for codon position 3. Received: 2 February 1998 / Accepted: 15 June 1998
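As an illustration of the skews mentioned in the abstract above, the sketch below uses the conventional definitions GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T), restricted to third codon positions. The abstract itself does not spell out its formulas, so these definitions are an assumption, and the function name is hypothetical. The input is assumed to be an in-frame coding sequence read on the strand of interest.

```python
def third_position_skews(cds: str) -> tuple[float, float]:
    """Return (GC skew, AT skew) computed over third codon positions only.

    Assumed definitions: (G - C) / (G + C) and (A - T) / (A + T).
    A skew is reported as 0.0 when its denominator is zero.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # ignore a trailing partial codon
    third = cds[2:usable:3]
    g, c, a, t = (third.count(base) for base in "GCAT")
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    return gc_skew, at_skew


if __name__ == "__main__":
    # Toy sequence, not real data: codons ATG GCG AAG TTT
    print(third_position_skews("ATGGCGAAGTTT"))
```

A positive GC skew together with a negative AT skew corresponds to the "G-T rich" pattern the abstract reports for the leading strand in six eubacteria. To follow a skew along a chromosome, genes would first need to be expressed on a common reference strand and then averaged in windows by genomic position.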

14.
Genes with atypical G+C content and pattern of codon usage in a certain genome are possibly of exotic origin, and this idea has been applied to identify horizontal transfer events. In this way, it was postulated that a total of 755 genes in the E. coli genome are relics of horizontal events after the divergence of E. coli from the Salmonella lineage 100 million years ago (Lawrence and Ochman, 1998). In this paper we propose a new way to study sequence composition more thoroughly. We found that although the 755 genes differ in composition from other genes in the E. coli genome, the difference is minor. If we accept that these genes were horizontally transferred, then (1) it is more likely that they were transferred from genomes evolutionarily closely related to E. coli; and (2) the dating method used by Lawrence and Ochman (1997, 1998) largely underestimated the average age of introduced sequences in the E. coli genome; in particular, most of the 755 genes would have been introduced into E. coli before, instead of after, the divergence of E. coli from the Salmonella lineage. Our study reveals that atypical G+C content and pattern of codon usage are not reliable indicators of horizontal gene transfer events. Received: 27 September 2000 / Accepted: 9 April 2001

15.
Highly expressed plastid genes display codon adaptation, which is defined as a bias toward a set of codons which are complementary to abundant tRNAs. This type of adaptation is similar to what is observed in highly expressed Escherichia coli genes and is probably the result of selection to increase translation efficiency. In the current work, the codon adaptation of plastid genes is studied with regard to three specific features that have been observed in E. coli and which may influence translation efficiency. These features are (1) a relatively low codon adaptation at the 5′ end of highly expressed genes, (2) an influence of neighboring codons on codon usage at a particular site (codon context), and (3) a correlation between the level of codon adaptation of a gene and its amino acid content. All three features are found in plastid genes. First, highly expressed plastid genes have a noticeable decrease in codon adaptation over the first 10–20 codons. Second, for the twofold degenerate NNY codon groups, highly expressed genes have an overall bias toward the NNC codon, but this is not observed when the 3′ neighboring base is a G. At these sites highly expressed genes are biased toward NNT instead of NNC. Third, plastid genes that have higher codon adaptations also tend to have an increased usage of amino acids with a high G + C content at the first two codon positions and GNN codons in particular. The correlation between codon adaptation and amino acid content exists separately for both cytosolic and membrane proteins and is not related to any obvious functional property. It is suggested that at certain sites selection discriminates between nonsynonymous codons based on translational, not functional, differences, with the result that the amino acid sequence of highly expressed proteins is partially influenced by selection for increased translation efficiency. Received: 21 July 1999 / Accepted: 5 November 1999

16.
The sequence of the mitochondrial COII gene has been widely used to estimate phylogenetic relationships at different taxonomic levels across insects. We investigated the molecular evolution of the COII gene and its usefulness for reconstructing phylogenetic relationships within and among four collembolan families. The collembolan COII gene showed the lowest A + T content of all insects so far examined, confirming that the well-known A + T bias in insect mitochondrial genes tends to increase from the basal to apical orders. Fifty-seven percent of all nucleotide positions were variable and most of the third codon positions appeared free to vary. Values of genetic distance between congeneric species and between families were remarkably high; in some cases the latter were higher than divergence values between other orders of insects. The remarkably high divergence levels observed here provide evidence that collembolan taxa are quite old; divergence levels among collembolan families equaled or exceeded divergences among pterygote insect orders. Once the saturated third-codon positions (which violated stationarity of base frequencies) were removed, the COII sequences contained phylogenetic information, but the extent of that information was overestimated by parsimony methods relative to likelihood methods. In the phylogenetic analysis, consistent statistical support was obtained for the monophyly of all four genera examined, but relationships among genera/families were not well supported. Within the genus Orchesella, relationships were well resolved and agreed with allozyme data. Within the genus Isotomurus, although three pairs of populations were consistently identified, these appeared to have arisen in a burst of evolution from an earlier ancestor. Isotomurus italicus always appeared as basal and I. palustris appeared to harbor a cryptic species, corroborating allozyme data. Received: 12 January 1996 / Accepted: 10 August 1996

17.
Mitochondrial genetic codons can be categorized by four patterns of nucleotide-site degeneracy based on varying combinations of twofold- or nondegenerate sites at first codon positions and twofold- or fourfold-degenerate sites at third codon positions. Herein, a model of molecular evolution is introduced that uses these patterns to calculate expected substitution frequencies for each codon position and substitution type relative to overall number of synonymous or nonsynonymous substitutions. Regions of the pocket gopher cytochrome oxidase subunit I (COI) and cytochrome b (cyt-b) genes are analyzed using this model. Chi-square distributions are used to produce relative goodness-of-fit (GF) scores for measuring the difference between substitution frequencies predicted by the codon-degeneracy model (CDM), and frequencies inferred using a well-supported phylogenetic tree of closely related species. The GF scores for expected and observed synonymous (GFsyn = 0.429, p = 0.807) and nonsynonymous (GFns = 2.309, p = 0.679) substitution frequencies resulted in a failure to reject the CDM as a null hypothesis for the molecular evolution of COI and cyt-b in pocket gophers. Alternative tree topologies and calculations of transition bias for these data result in higher GF scores. Received: 25 March 1999 / Accepted: 17 September 1999
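The reported p-values can be checked with a short sketch of the chi-square upper-tail probability. The abstract does not state the degrees of freedom; the values 2 (synonymous) and 4 (nonsynonymous) used below are an assumption, chosen because they reproduce the reported p-values. The function names are hypothetical, and the closed form used is valid only for even degrees of freedom.

```python
import math


def goodness_of_fit(observed: list[float], expected: list[float]) -> float:
    """Pearson chi-square statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """P(X >= x) for a chi-square variable with an even number of
    degrees of freedom: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # ASSUMPTION: df = 2 and df = 4 are inferred, not stated in the abstract.
    print(round(chi2_upper_tail_even_df(0.429, 2), 3))
    print(round(chi2_upper_tail_even_df(2.309, 4), 3))
```

Large p-values such as these mean that the observed substitution frequencies do not differ detectably from those the model predicts, which is why the model is not rejected.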

18.
We have analyzed the patterns of synonymous codon preferences of the nuclear genes of Plasmodium falciparum, a unicellular parasite characterized by an extremely GC-poor genome. When all genes are considered, codon usage is strongly biased toward A and T in third codon positions, as expected, but multivariate statistical analysis detects a major trend among genes. At one end genes display codon choices determined mainly by the extreme genome composition of this parasite, and very probably their expression level is low. At the other end a few genes exhibit an increased relative usage of a particular subset of codons, many of which are C-ending. Since the majority of these few genes are putatively highly expressed, we postulate that the increased C-ending codons are translationally optimal. In conclusion, while codon usage of the majority of P. falciparum genes is determined mainly by compositional constraints, a small number of genes exhibit translational selection. Received: 10 November 1998 / Accepted: 28 January 1999

19.
Cultured isolates of the unicellular planktonic cyanobacteria Prochlorococcus and marine Synechococcus belong to a single marine picophytoplankton clade. Within this clade, two deeply branching lineages of Prochlorococcus, two lineages of marine A Synechococcus and one lineage of marine B Synechococcus exhibit closely spaced divergence points with low bootstrap support. This pattern is consistent with a near-simultaneous diversification of marine lineages with divinyl chlorophyll b and phycobilisomes as photosynthetic antennae. Inferences from 16S ribosomal RNA sequences including data for 18 marine picophytoplankton clade members were congruent with results of psbB and petB and D sequence analyses focusing on five strains of Prochlorococcus and one strain of marine A Synechococcus. Third codon position and intergenic region nucleotide frequencies vary widely among members of the marine picophytoplankton group, suggesting that substitution biases differ among the lineages. Nonetheless, standard phylogenetic methods and newer algorithms insensitive to such biases did not recover different branching patterns within the group, and failed to cluster Prochlorococcus with chloroplasts or other chlorophyll b-containing prokaryotes. Prochlorococcus isolated from surface waters of stratified, oligotrophic ocean provinces predominate in a lineage exhibiting low G + C nucleotide frequencies at highly variable positions. Received: 18 January 1997 / Accepted: 18 May 1997

20.
Synonymous codon choices vary considerably among Schistosoma mansoni genes. Principal components analysis detects a single major trend among genes, which is highly correlated with GC content in third codon positions and exons, but does not discriminate between putatively highly and lowly expressed genes. The effective number of codons used in each gene, and its distribution when plotted against GC3, suggest that codon usage is shaped mainly by mutational biases. The GC contents of exons, GC3, 5′, 3′, and flanking (5′ + 3′ + introns) regions are all correlated with one another, suggesting that variations in GC content may exist among different regions of the S. mansoni genome. We propose that this genome structure might be among the most important factors shaping codon usage in this species, although the action of selection on certain sequences cannot be excluded. Received: 10 March 1997 / Accepted: 27 June 1997
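A plot of the effective number of codons (Nc) against GC3, as mentioned in the abstract above, is usually read against a null curve. The sketch below assumes that the abstract refers to Wright's Nc and uses Wright's (1990) expected relationship, Nc = 2 + s + 29 / (s² + (1 − s)²) with s = GC3, which describes codon usage shaped by base composition alone. The abstract does not name the formula, and the function name is hypothetical.

```python
def expected_nc(gc3: float) -> float:
    """Expected effective number of codons when codon usage is determined
    only by GC3 (assumed here: Wright's 1990 null curve)."""
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


if __name__ == "__main__":
    # Tabulate the null curve at a few GC3 values.
    for gc3 in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"GC3 = {gc3:.2f}  expected Nc = {expected_nc(gc3):.1f}")
```

Genes that fall on or just below this curve are consistent with mutational bias as the main influence, which matches the abstract's reading; genes far below it would suggest additional constraints such as translational selection.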
