Similar literature
 20 similar documents found (search time: 343 ms)
1.
We compared the codon usage of sequences of transposable elements (TEs) with that of host genes from the species Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Saccharomyces cerevisiae, and Homo sapiens. Factorial correspondence analysis showed that, regardless of the base composition of the genome, the TEs differed from the genes of their host species by their AT-richness. In all species, the percentage of A + T on the third codon position of the TEs was higher than that on the first codon position and lower than that in the noncoding DNA of the genomes. This indicates that the codon choice is not simply the outcome of mutational bias but is also subject to selection constraints. A tendency toward higher A + T on the third position than on the first position was also found in the host genes of A. thaliana, C. elegans, and S. cerevisiae but not in those of D. melanogaster and H. sapiens. This strongly suggests that the AT choice is a host-independent characteristic common to all TEs. The codon usage of TEs generally appeared to be different from the mean of the host genes. In the AT-rich genomes of Arabidopsis thaliana, Caenorhabditis elegans, and Saccharomyces cerevisiae, the codon usage bias of TEs was similar to that of weakly expressed genes. In the GC-rich genome of D. melanogaster, however, the bias in codon usage of the TEs clearly differed from that of weakly expressed genes. These findings suggest that selection acts on TEs and that TEs may display specific behavior within the host genomes. Received: 2 May 2001 / Accepted: 29 October 2001
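
The comparison of A + T percentages at the first and third codon positions described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `at_content_by_position` and the toy sequence are assumptions made for illustration, and the input is assumed to be an in-frame coding sequence.

```python
def at_content_by_position(cds: str) -> list[float]:
    """Return the A+T fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (hypothetical helper,
    not taken from the abstract).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this codon position
        at_count = sum(base in "AT" for base in bases)
        fractions.append(at_count / len(bases) if bases else 0.0)
    return fractions


# Toy sequence, made up for illustration only.
at1, at2, at3 = at_content_by_position("ATGGCTGAAAAATTA")
print(f"AT1={at1:.2f} AT2={at2:.2f} AT3={at3:.2f}")
```

Running the same function over TE and host-gene sequences, and over noncoding DNA treated as a single pool, would give the three quantities the abstract compares.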

2.
Synonymous codon choices vary considerably among Schistosoma mansoni genes. Principal components analysis detects a single major trend among genes, which highly correlates with GC content in third codon positions and exons, but does not discriminate among putatively highly and lowly expressed genes. The effective number of codons used in each gene, and its distribution when plotted against GC3, suggests that codon usage is shaped mainly by mutational biases. The GC contents of exons, GC3, 5′, 3′, and flanking (5′ + 3′ + introns) regions are all correlated with one another, suggesting that variations in GC content may exist among different regions of the S. mansoni genome. We propose that this genome structure might be among the most important factors shaping codon usage in this species, although the action of selection on certain sequences cannot be excluded. Received: 10 March 1997 / Accepted: 27 June 1997
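
Plots of the effective number of codons against GC3 are commonly read against a null curve expected when codon choice depends on base composition alone. The sketch below uses Wright's (1990) approximation for that curve; the abstract does not state which formula was used, so this is an assumption. The helper names `gc3` and `expected_nc` are hypothetical, and `gc3` counts every third position rather than only synonymous ones, which is a simplification.

```python
def gc3(cds: str) -> float:
    """G+C fraction at third codon positions of an in-frame sequence.

    Simplification: all third positions are counted, including those of
    Met and Trp codons, which a strict GC3s measure would exclude.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third = cds[2:usable:3]
    return sum(base in "GC" for base in third) / len(third) if third else 0.0


def expected_nc(gc3_value: float) -> float:
    """Expected effective number of codons under composition bias alone.

    Wright's approximation: Nc = 2 + s + 29 / (s^2 + (1 - s)^2),
    where s is the G+C fraction at third positions.
    """
    s = gc3_value
    return 2.0 + s + 29.0 / (s * s + (1.0 - s) ** 2)


for s in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"GC3={s:.1f}  expected Nc={expected_nc(s):.1f}")
```

Genes that fall on or just below this curve are usually interpreted as shaped mainly by mutational bias, which is the reading the abstract gives for S. mansoni.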

3.
In this work, we present the sequences and a comparison of the glycosomal GAPDHs from a number of Kinetoplastida. The complete gene sequences have been determined for some species (Crithidia fasciculata, Herpetomonas samuelpessoai, Leptomonas seymouri, and Phytomonas sp.), whereas for other species (Trypanosoma brucei gambiense, Trypanosoma congolense, Trypanosoma vivax, and Leishmania major), only partial sequences have been obtained by PCR amplification. The structure of all available glycosomal GAPDH genes was analyzed in detail. Considerable variations were observed in both their nucleotide composition and their codon usage. The GC content varies between 64.4% in L. seymouri and 49.5% in the previously sequenced GAPDH gene from Trypanoplasma borreli. A highly biased codon usage was found in C. fasciculata, with only 34 triplets used, whereas in T. borreli 57 codons were employed. No obvious correlation could be observed between the codon usage and either the nucleotide composition or the level of gene expression. The glycosomal GAPDH is a very well-conserved enzyme. The maximal overall difference observed in the amino acid sequences is only 25%. Specific insertions and extensions are retained in all sequences. The residues involved in catalysis, substrate, and inorganic phosphate binding are fully conserved, whereas some variability is observed in the cofactor-binding pocket. The implications of these data for the design of new trypanocidal drugs targeted against GAPDH are discussed. All available gene and amino acid sequences of glycosomal GAPDHs were used for a phylogenetic analysis. The division of the Kinetoplastida into two suborders, Bodonina and Trypanosomatina, was well supported. Within the latter group, the Trypanosoma species appeared to be monophyletic, whereas the other trypanosomatids form a second clade. Received: 23 February 1998 / Accepted: 26 March 1998

4.
Wolbachia are obligatory intracellular and maternally inherited bacteria, known to infect many species of arthropod. In this study, we discovered a bacteriophage-like genetic element in Wolbachia, which was tentatively named bacteriophage WO. The phylogenetic tree based on phage WO genes of several Wolbachia strains was not congruent with that based on chromosomal genes of the same strains, suggesting that phage WO was active and horizontally transmitted among various Wolbachia strains. All the strains of Wolbachia used in this study were infected with phage WO. Although the phage genome contained genes of diverse origins, the average G+C content and codon usage of these genes were quite similar to those of a chromosomal gene of Wolbachia. These results raised the possibility that phage WO has been associated with Wolbachia for a very long time, conferring some benefit to its hosts. The evolution and possible roles of phage WO in various reproductive alterations of insects caused by Wolbachia are discussed. Received: 28 January 2000 / Accepted: 3 August 2000

5.
A+T content, phylogenetic relationships, codon usage, evolutionary rates, and ratio of synonymous versus non-synonymous substitutions have been studied in partial sequences of the atpD and aroQ/pheA genes of primary (Buchnera) and secondary symbionts of aphids and a set of selected non-symbiotic bacteria, belonging to the five subdivisions of the Proteobacteria. Compared to the homologous genes of the last group, both genes belonging to Buchnera behave in a similar way, showing a higher A+T content, forming a monophyletic group, a loss in codon bias, especially in third base position, an evolutionary acceleration and an increase in the number of non-synonymous substitutions, confirming previous results reported elsewhere for other genes. When available, these properties have been partly observed with the secondary symbionts, but with values that are intermediate between Buchnera and free-living Proteobacteria. They show high A+T content, but not as high as Buchnera; an unresolved phylogenetic position between Buchnera and the other γ-Proteobacteria; a loss in codon bias, again not as high as in Buchnera; and a significant evolutionary acceleration in the case of the three atpD genes, but not when considering aroQ/pheA genes. These results give support to the hypothesis that they are symbionts at different stages of the symbiotic accommodation to the host.

6.
Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified in the codon–anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces not related to the individual codon–anticodon interaction may be involved in determining which synonymous codons are preferred or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of the location of confamilial synonymous codons along coding regions, a pattern often described as a context dependence of nucleotide choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same kind of reading-frame-independent force(s) may also influence synonymous codon choice. These forces may have produced biases in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this scenario, tRNA pool size differences between preferred and nonpreferred codons initially were evolved to track the default overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences and the similarities in the codon usages of distantly related organisms. Received: 10 November 1998 / Accepted: 23 February 1999
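
The kind of randomization described in this abstract, one that relocates synonymous codons without changing the protein, the codon usage, or the base composition, can be sketched as follows. This is only an illustration under stated assumptions: all codons for one amino acid are treated as a single family (the authors' definition of "confamilial" may be narrower for sixfold-degenerate amino acids), the standard genetic code is assumed, and the name `shuffle_synonymous` is hypothetical.

```python
import random
from collections import defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (assumed here; "*" marks stops).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def shuffle_synonymous(codons, rng):
    """Permute codons among the positions that encode the same amino acid.

    The encoded protein, the codon counts, and the base composition are
    all unchanged; only the location of synonymous codons is randomized.
    """
    positions = defaultdict(list)
    for index, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(index)

    shuffled = list(codons)
    for indices in positions.values():
        pool = [codons[i] for i in indices]
        rng.shuffle(pool)  # in-place shuffle of this amino acid's codons
        for i, codon in zip(indices, pool):
            shuffled[i] = codon
    return shuffled


# Toy gene fragment, made up for illustration only.
gene = ["CTG", "AAA", "TTA", "AAG", "CTC", "GGT"]
print(shuffle_synonymous(gene, random.Random(0)))
```

Counting off-frame trinucleotides in many such shuffled copies, and comparing those counts with the real gene, would give a null distribution of the sort the abstract relies on.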

7.
To characterize the coding-sequence divergence of closely related genomes, we compared DNA sequence divergence between sequences from a Brassica rapa ssp. pekinensis EST library isolated from flower buds and genomic sequences from Arabidopsis thaliana. The specific objectives were (i) to determine the distribution of and relationship between Ka and Ks, (ii) to identify genes with the lowest and highest Ka:Ks values, and (iii) to evaluate how codon usage has diverged between two closely related species. We found that the distribution of Ka:Ks was unimodal, and that substitution rates were more variable at nonsynonymous than synonymous sites, and detected no evidence that Ka and Ks were positively correlated. Several genes had Ka:Ks values equal to or near zero, as expected for genes that have evolved under strong selective constraint. In contrast, there were no genes with Ka:Ks > 1 and thus we found no strong evidence that any of the 218 sequences we analyzed have evolved in response to positive selection. We detected a stronger codon bias but a lower frequency of GC at synonymous sites in A. thaliana than B. rapa. Moreover, there has been a shift in the profile of most commonly used synonymous codons since these two species diverged from one another. This shift in codon usage may have been caused by stronger selection acting on codon usage or by a shift in the direction of mutational bias in the B. rapa phylogenetic lineage.

8.
The sequence of the mitochondrial COII gene has been widely used to estimate phylogenetic relationships at different taxonomic levels across insects. We investigated the molecular evolution of the COII gene and its usefulness for reconstructing phylogenetic relationships within and among four collembolan families. The collembolan COII gene showed the lowest A + T content of all insects so far examined, confirming that the well-known A + T bias in insect mitochondrial genes tends to increase from the basal to apical orders. Fifty-seven percent of all nucleotide positions were variable and most of the third codon positions appeared free to vary. Values of genetic distance between congeneric species and between families were remarkably high; in some cases the latter were higher than divergence values between other orders of insects. The remarkably high divergence levels observed here provide evidence that collembolan taxa are quite old; divergence levels among collembolan families equaled or exceeded divergences among pterygote insect orders. Once the saturated third-codon positions (which violated stationarity of base frequencies) were removed, the COII sequences contained phylogenetic information, but the extent of that information was overestimated by parsimony methods relative to likelihood methods. In the phylogenetic analysis, consistent statistical support was obtained for the monophyly of all four genera examined, but relationships among genera/families were not well supported. Within the genus Orchesella, relationships were well resolved and agreed with allozyme data. Within the genus Isotomurus, although three pairs of populations were consistently identified, these appeared to have arisen in a burst of evolution from an earlier ancestor. Isotomurus italicus always appeared as basal and I. palustris appeared to harbor a cryptic species, corroborating allozyme data. Received: 12 January 1996 / Accepted: 10 August 1996

9.
Genes with atypical G+C content and pattern of codon usage in a certain genome are possibly of exotic origin, and this idea has been applied to identify horizontal events. In this way, it was postulated that a total of 755 genes in the E. coli genome are relics of horizontal events after the divergence of E. coli from the Salmonella lineage 100 million years ago (Lawrence and Ochman, 1998). In this paper we propose a new way to study sequence composition more thoroughly. We found that although the 755 genes differ in composition from other genes in the E. coli genome, the difference is minor. If we accepted that these genes are horizontally transferred, then (1) it would be more likely that they were transferred from genomes evolutionarily closely related to E. coli; but (2) the dating method used by Lawrence and Ochman (1997, 1998) largely underestimated the average age of introduced sequences in the E. coli genome; in particular, most of the 755 genes should have been introduced into E. coli before, instead of after, the divergence of E. coli from the Salmonella lineage. Our study reveals that atypical G+C content and pattern of codon usage are not reliable indicators of horizontal gene transfer events. Received: 27 September 2000 / Accepted: 9 April 2001

10.
We have analyzed the patterns of synonymous codon preferences of the nuclear genes of Plasmodium falciparum, a unicellular parasite characterized by an extremely GC-poor genome. When all genes are considered, codon usage is strongly biased toward A and T in third codon positions, as expected, but multivariate statistical analysis detects a major trend among genes. At one end genes display codon choices determined mainly by the extreme genome composition of this parasite, and very probably their expression level is low. At the other end a few genes exhibit an increased relative usage of a particular subset of codons, many of which are C-ending. Since the majority of these few genes is putatively highly expressed, we postulate that the increased C-ending codons are translationally optimal. In conclusion, while codon usage of the majority of P. falciparum genes is determined mainly by compositional constraints, a small number of genes exhibit translational selection. Received: 10 November 1998 / Accepted: 28 January 1999

11.
In this study, we analyzed the correlation between codon usage bias and Shine–Dalgarno (SD) sequence conservation, using complete genome sequences of nine prokaryotes. For codon usage bias, we adopted the codon adaptation index (CAI), which is based on the codon usage preference of genes encoding ribosomal proteins, elongation factors, heat shock proteins, outer membrane proteins, and RNA polymerase subunit proteins. To compute SD sequence conservation, we used SD motif sequences predicted by Tompa and systematically aligned them with 5′UTR sequences. We found that there exists a clear correlation between the CAI values and SD sequence conservation in the genomes of Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum, and Methanococcus jannaschii, and no relationship is found in M. genitalium, M. pneumoniae, and Synechocystis. That is, genes with higher CAI values tend to have more conserved SD sequences than do genes with lower CAI values in these organisms. Some organisms, such as M. thermoautotrophicum, do not clearly show the correlation. The biological significance of these results is discussed in the context of the translation initiation process and translation efficiency. Received: 22 June 2000 / Accepted: 18 October 2000
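
The codon adaptation index mentioned in this abstract can be sketched in a few lines of Python. This follows the usual definition (the geometric mean of relative adaptiveness values derived from a reference set of highly expressed genes); it is not the authors' code. The function names, the toy reference set, the standard genetic code, and the 0.5 pseudocount for unobserved codons are all assumptions.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, built in TCAG order (assumed here; "*" marks stops).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def relative_adaptiveness(reference_codons, pseudocount=0.5):
    """Weight of each codon relative to the most used synonym in the reference."""
    counts = Counter(reference_codons)
    weights = {}
    for amino_acid in set(AMINO_ACIDS) - {"*"}:
        family = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
        if len(family) < 2:
            continue  # Met and Trp have a single codon and carry no information
        # Assumption: unobserved codons get a small pseudocount to avoid log(0).
        observed = {c: counts[c] if counts[c] > 0 else pseudocount for c in family}
        top = max(observed.values())
        for codon in family:
            weights[codon] = observed[codon] / top
    return weights


def cai(gene_codons, weights):
    """Geometric mean of the weights of the gene's informative codons."""
    logs = [math.log(weights[c]) for c in gene_codons if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference set, made up for illustration only.
weights = relative_adaptiveness(["AAA", "AAA", "AAA", "AAG"])
print(round(cai(["AAA", "AAG"], weights), 3))
```

In the study, the reference set would be the ribosomal protein, elongation factor, heat shock, outer membrane, and RNA polymerase subunit genes of each genome, and the resulting per-gene CAI values would then be compared with SD sequence conservation.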

12.
Detailed nucleotide diversity studies revealed that the fil1 gene of Antirrhinum, which has been reported to be single copy, is a member of a gene family composed of at least five genes. In four Antirrhinum majus populations with different mating systems and one A. graniticum population, diversity within populations is very low. Divergence among Antirrhinum species and between Antirrhinum and Digitalis is also low. For three of these genes we also obtained sequences from a more divergent member of the Scrophulariaceae, Verbascum nigrum. Compared with Antirrhinum, little divergence is again observed. These results, together with similar data obtained previously for five cycloidea genes, suggest either that these gene families (or the Antirrhinum genome) are unusually constrained or that there is a low rate of substitution in these lineages. Using a sample of 52 genes, based on two measures of codon usage (ENC and GC3 content), we show that cyc and fil1 are among the least biased Antirrhinum genes, so that their low diversity is not due to extreme codon bias. Received: 20 June 2000 / Accepted: 25 October 2000

13.
Synonymous codon usage in related species may differ as a result of variation in mutation biases, differences in the overall strength and efficiency of selection, and shifts in codon preference, that is, the selective hierarchy of codons within and between amino acids. We have developed a maximum-likelihood method to employ explicit population genetic models to analyze the evolution of parameters determining codon usage. The method is applied to twofold degenerate amino acids in 50 orthologous genes from D. melanogaster and D. virilis. We find that D. virilis has significantly reduced selection on codon usage for all amino acids, but the data are incompatible with a simple model in which there is a single difference in the long-term Ne, or overall strength of selection, between the two species, indicating shifts in codon preference. The strength of selection acting on codon usage in D. melanogaster is estimated to be |Ne s| ≈ 0.4 for most CT-ending twofold degenerate amino acids, but 1.7 times greater for cysteine and 1.4 times greater for AG-ending codons. In D. virilis, the strength of selection acting on codon usage for most amino acids is only half that acting in D. melanogaster but is considerably greater than half for cysteine, perhaps indicating the dual selection pressures of translational efficiency and accuracy. Selection coefficients in orthologues are highly correlated (ρ = 0.46), but a number of genes deviate significantly from this relationship. Received: 20 December 1998 / Accepted: 17 February 1999

14.
The deduced amino acid sequences from 1200 Haemophilus influenzae genes were compared to a data set that contained the ORFs from yeast, two different Archaea and the Gram+ and Gram− bacteria, Bacillus subtilis and Escherichia coli. The comparison yielded a set of 26 orthologous genes that had at least one representative from each of the four groups. A four-taxon phylogenetic relationship for these 26 genes was determined. The statistical significance of each minimal tree was tested against the two alternative four-taxon trees. The result was that four genes significantly supported the (Archaea, Eukaryota) (Gram+, Gram−) topology, two genes supported the one where Gram− and Eukaryota form a clade, and one gene supported the tree where Gram+ and Eukaryota define one clade. The remaining genes do not uniquely support any phylogeny, thereby collapsing the two central nodes into a single node. These are referred to as star phylogenies. I offer a new suggestion for the mechanism that gave rise to the star phylogenies. Namely, these are genes that are younger than the underlying lineages that currently harbor them. This hypothesis is examined with two proteins that display the star phylogeny, namely ornithine transcarbamylase and tryptophan synthetase. It is shown, using the distance matrix rate test, that the rate of evolution of these two proteins is comparable to a control gene when rates are determined by comparing closely related species. This implies that the genes under comparison experience comparable functional constraint. However, when the genes from remotely related species are compared, a plateau is encountered. Since we see no unusual levels of functional constraint this plateau cannot be attributed to the divergence of the protein having reached saturation. The simplest explanation is that the genes displaying the star phylogenies were introduced after Archaea, Eukaryota, and Bacteria had diverged from one another. They presumably spread through life by horizontal gene transfer. Received: 12 July 2001 / Accepted: 27 July 2001

15.
We examined a region of high variability in the mosaic mercury resistance (mer) operon of natural bacterial isolates from the primate intestinal microbiota. The region between the merP and merA genes of nine mer loci was sequenced and either the merC, the merF, or no gene was present. Two novel merC genes were identified. Overall nucleotide diversity, π (per 100 sites), of the merC gene was greater (49.63) than adjacent merP (35.82) and merA (32.58) genes. However, the consequences of this variability for the predicted structure of the MerC protein are limited and putative functional elements (metal-binding ligands and transmembrane domains) are strongly conserved. Comparison of codon usage of the merTP, merC, and merA genes suggests that several merC genes are not coeval with their flanking sequences. Although evidence of homologous recombination within the very variable merC genes is not apparent, the flanking regions have higher homologies than merC, and recombination appears to be driving their overall sequence identities higher. The synonymous codon usage bias (ENC) values suggest greater variability in expression of the merC gene than in flanking genes in six different bacterial hosts. We propose a model for the evolution of MerC as a host-dependent, adventitious module of the mer operon. Received: 2 June 2000 / Accepted: 23 October 2000

16.
In the plant chloroplast genome the codon usage of the highly expressed psbA gene is unique and is adapted to the tRNA population, probably due to selection for translation efficiency. In this study the role of selection on codon usage in each of the fully sequenced chloroplast genomes, in addition to Chlamydomonas reinhardtii, is investigated by measuring adaptation to this pattern of codon usage. A method is developed which tests selection on each gene individually by constructing sequences with the same amino acid composition as the gene and randomly assigning codons based on the nucleotide composition of noncoding regions of that genome. The codon bias of the actual gene is then compared to a distribution of random sequences. The data indicate that within the algae selection is strong in Cyanophora paradoxa, affecting a majority of genes, of intermediate intensity in Odontella sinensis, and weaker in Porphyra purpurea and Euglena gracilis. In the plants, selection is found to be quite weak in Pinus thunbergii and the angiosperms but there is evidence that an intermediate level of selection exists in the liverwort Marchantia polymorpha. The role of selection is then further investigated in two comparative studies. It is shown that average relative codon bias is correlated with expression level and that, despite saturation levels of substitution, there is a strong correlation among the algae genomes in the degree of codon bias of homologous genes. All of these data indicate that selection for translation efficiency plays a significant role in determining the codon bias of chloroplast genes but that it acts with different intensities in different lineages. In general it is stronger in the algae than the higher plants, but within the algae Euglena is found to have several unusual features which are noted. The factors that might be responsible for this variation in intensity among the various genomes are discussed. Received: 6 June 1997 / Accepted: 24 July 1997

17.
The extent to which base composition and codon usage vary among RNA viruses, and the possible causes of this bias, are undetermined in most cases. A maximum-likelihood statistical method was used to test whether base composition and codon usage bias covary with arthropod association in the genus Flavivirus, a major source of disease in humans and animals. Flaviviruses are transmitted by mosquitoes, by ticks, or directly between vertebrate hosts. Those viruses associated with ticks were found to have a significantly lower G+C content than non-vector-borne flaviviruses and this difference was present throughout the genome at all amino acids and codon positions. In contrast, mosquito-borne viruses had an intermediate G+C content which was not significantly different from those of the other two groups. In addition, biases in dinucleotide and codon usage that were independent of base composition were detected in all flaviviruses, but these did not covary with arthropod association. However, the overall effect of these biases was slight, suggesting only weak selection at synonymous sites. A preliminary analysis of base composition, codon usage, and vector specificity in other RNA virus families also revealed a possible association between base composition and vector specificity, although with biases different from those seen in the Flavivirus genus. Received: 29 August 2000 / Accepted: 19 December 2000

18.
Algorithmic details to obtain maximum likelihood estimates of parameters on a large phylogeny are discussed. On a large tree, an efficient approach is to optimize branch lengths one at a time while updating parameters in the substitution model simultaneously. Codon substitution models that allow for variable nonsynonymous/synonymous rate ratios (ω = dN/dS) among sites are used to analyze a data set of human influenza virus type A hemagglutinin (HA) genes. The data set has 349 sequences. Methods for obtaining approximate estimates of branch lengths for codon models are explored, and the estimates are used to test for positive selection and to identify sites under selection. Compared with results obtained from the exact method estimating all parameters by maximum likelihood, the approximate methods produced reliable results. The analysis identified a number of sites in the viral gene under diversifying Darwinian selection and demonstrated the importance of including many sequences in the data in detecting positive selection at individual sites. Received: 25 April 2000 / Accepted: 24 July 2000

19.
The principal intracellular symbiotic bacteria of the cereal weevil Sitophilus oryzae were characterized using the sequence of the 16S rDNA gene (rrs gene) and G + C content analysis. Polymerase chain reaction amplification with universal eubacterial primers of the rrs gene showed a single expected sequence of 1,501 bp. Comparison of this sequence with the available database sequences placed the intracellular bacteria of S. oryzae as members of the Enterobacteriaceae family, closely related to the free-living bacteria, Erwinia herbicola and Escherichia coli, and the endocytobiotic bacteria of the tsetse fly and aphids. Moreover, by high-performance liquid chromatography, we measured the genomic G + C content of the S. oryzae principal endocytobiotes (SOPE) as 54%, while the known genomic G + C content of most intracellular bacteria is about 39.5%. Furthermore, based on the third codon position G + C content and the rrs gene G + C content, we demonstrated that most intracellular bacteria except SOPE are A + T biased irrespective of their phylogenetic position. Finally, using the hsp60 gene sequence, the codon usage of SOPE was compared with that of two phylogenetically closely related bacteria: E. coli, a free-living bacterium, and Buchnera aphidicola, the intracellular symbiotic bacteria of aphids. Taken together, these results show a peculiar and distinctly different DNA composition of SOPE with respect to the other obligate intracellular bacteria, and, combined with biological and biochemical data, they elucidate the evolution of symbiosis in S. oryzae. Received: 8 September 1997 / Accepted: 24 October 1997

20.
We present phylogenetic analyses to demonstrate that there are three families of sucrose phosphate synthase (SPS) genes present in higher plants. Two data sets were examined, one consisting of full-length proteins and a second larger set that covered a highly conserved region including the 14-3-3 binding region and the UDPGlu active site. Analysis of both data sets showed a well-supported separation of known genes into three families, designated A, B, and C. The genomic sequences of Arabidopsis thaliana include a member in each family: two genes on chromosome 5 belong to Family A, one gene on chromosome 1 to Family B, and one gene on chromosome 4 to Family C. Each of three Citrus genes belongs to one of the three families. Intron/exon organization of the four Arabidopsis genes differed according to phylogenetic analysis, with members of the same family from different species having similar genomic organization of their SPS genes. The two Family A genes on Arabidopsis chromosome 5 appear to be due to a recent duplication. Analysis of published literature and ESTs indicated that functional differentiation of the families was not obvious, although Family B members appear not to be expressed in roots. Family B genes were cloned from two Actinidia species and Southern analysis indicated the presence of a single gene family, which contrasts with the multiple members of Family A in Actinidia. Only two Family C genes have been reported to date. Received: 17 April 2001 / Accepted: 27 August 2001
