Found 20 similar articles; search took 15 ms
1.
The Nonrandom Location of Synonymous Codons Suggests That Reading Frame-Independent Forces Have Patterned Codon Preferences Cited by: 6 (self-citations: 0, citations by others: 6)
Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified
in the codon–anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces
not related to the individual codon–anticodon interaction may be involved in determining which synonymous codons are preferred
or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of
the location of confamilial synonymous codons along coding regions—a pattern often described as a context dependence of nucleotide
choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that
do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes
examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same
kind of reading-frame-independent force(s) may also influence synonymous codon choice. These forces may have produced biases
in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this
scenario, tRNA pool size differences between preferred and nonpreferred codons initially were evolved to track the default
overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences
and the similarities in the codon usages of distantly related organisms.
Received: 10 November 1998 / Accepted: 23 February 1999
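The randomization described in the abstract above (relocating synonymous codons without changing the encoded protein, the codon usage, or the base composition) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names are hypothetical, and "confamilial" is approximated here as "encoding the same amino acid under the standard genetic code", which is an assumption.

```python
import random
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    """Split an uppercase DNA coding sequence into codons (trailing bases dropped)."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def shuffle_synonymous(cds, rng):
    """Permute codons among the positions that encode the same amino acid.

    Assumption: codons are grouped by encoded amino acid. The protein,
    the codon counts, and the base composition are unchanged; only the
    location of each synonymous codon along the gene is randomized.
    """
    codons = split_codons(cds)
    by_aa = {}
    for idx, codon in enumerate(codons):
        by_aa.setdefault(CODON_TABLE[codon], []).append(idx)
    shuffled = list(codons)
    for idxs in by_aa.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)
```

Calling `shuffle_synonymous(seq, random.Random(0))` many times with different seeds would give a null distribution against which an observed codon-context statistic could be compared.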
2.
In many unicellular organisms, invertebrates, and plants, synonymous codon usage biases result from a coadaptation between
codon usage and tRNA abundance to optimize the efficiency of protein synthesis. However, it remains unclear whether natural
selection acts at the level of the speed or the accuracy of mRNA translation. Here we show that codon usage can improve the
fidelity of protein synthesis in multicellular species. As predicted by the model of selection for translational accuracy,
we find that the frequency of codons optimal for translation is significantly higher at codons encoding for conserved amino
acids than at codons encoding for nonconserved amino acids in 548 genes compared between Caenorhabditis elegans and Homo sapiens. Although this model predicts that codon bias correlates positively with gene length, a negative correlation between codon
bias and gene length has been observed in eukaryotes. This suggests that selection for fidelity of protein synthesis is not
the main factor responsible for codon biases. The relationship between codon bias and gene length remains unexplained. Exploring
the differences in gene expression process in eukaryotes and prokaryotes should provide new insights to understand this key
question of codon usage.
Received: 18 June 2000 / Accepted: 10 November 2000
3.
Natural selection favors certain synonymous codons which aid translation in Escherichia coli, yet codons not favored by translational selection persist. We use the frequency distributions of synonymous polymorphisms
to test three hypotheses for the existence of translationally sub-optimal codons: (1) selection is a relatively weak force,
so there is a balance between mutation, selection, and drift; (2) at some sites there is no selection on codon usage, so some
synonymous sites are unaffected by translational selection; and (3) translationally sub-optimal codons are favored by alternative
selection pressures at certain synonymous sites. We find that when all the data is considered, model 1 is supported and both
models 2 and 3 are rejected as sole explanations for the existence of translationally sub-optimal codons. However, we find
evidence in favor of both models 2 and 3 when the data is partitioned between groups of amino acids and between regions of
the genes. Thus, all three mechanisms appear to contribute to the existence of translationally sub-optimal codons in E. coli.
Received: 18 July 2000 / Accepted: 17 April 2001
4.
Along the gene, nucleotides in various codon positions tend to exert a slight but observable influence on the nucleotide
choice at neighboring positions. Such context biases are different in different organisms and can be used as genomic signatures.
In this paper, we will focus specifically on the dinucleotide composed of a third codon position nucleotide and its succeeding
first position nucleotide. Using the 16 possible dinucleotide combinations, we calculate how well individual genes conform
to the observed mean dinucleotide frequencies of an entire genome, forming a distance measure for each gene. It is found that
genes from different genomes can be separated with a high degree of accuracy, according to these distance values.
In particular, we address the problem of recent horizontal gene transfer, and how imported genes may be evaluated by their
poor assimilation to the host's context biases. By concentrating on the third- and succeeding first position nucleotides,
we eliminate most spurious contributions from codon usage and amino-acid requirements, focusing mainly on mutational effects.
Since imported genes are expected to converge only gradually to genomic signatures, it is possible to question whether a gene
present in only one of two closely related organisms has been imported into one organism or deleted in the other. Striking
correlations between the proposed distance measure and poor homology are observed when Escherichia coli genes are compared to Salmonella typhi, indicating that sets of outlier genes in E. coli may contain a high number of genes that have been imported into E. coli, and not deleted in S. typhi.
Received: 16 January 2001 / Accepted: 30 August 2001
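The distance measure described in the abstract above can be sketched roughly as follows. The abstract does not give the exact formula, so the sum of absolute frequency differences used here is an assumption, as are all function names; only the choice of dinucleotide (third codon position followed by the next codon's first position) comes from the text.

```python
from collections import Counter
from itertools import product

# The 16 possible dinucleotides.
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]


def boundary_dinuc_counts(cds):
    """Count dinucleotides formed by codon position 3 and the following position 1."""
    counts = Counter(cds[i + 2:i + 4] for i in range(0, len(cds) - 3, 3))
    # Keep only unambiguous dinucleotides (drops e.g. those containing N).
    return Counter({d: counts[d] for d in DINUCS if counts[d]})


def to_freqs(counts):
    """Convert counts to relative frequencies over the 16 dinucleotides."""
    total = sum(counts.values())
    return {d: (counts[d] / total if total else 0.0) for d in DINUCS}


def genome_freqs(genes):
    """Pooled boundary-dinucleotide frequencies over all genes (assumed 'genome mean')."""
    pooled = Counter()
    for cds in genes:
        pooled.update(boundary_dinuc_counts(cds))
    return to_freqs(pooled)


def gene_distance(cds, reference):
    """Assumed distance: sum of absolute frequency differences from the reference."""
    freqs = to_freqs(boundary_dinuc_counts(cds))
    return sum(abs(freqs[d] - reference[d]) for d in DINUCS)
```

Under this sketch, a gene whose distance to its host genome is unusually large relative to other genes would be a candidate for recent import, in the spirit of the abstract.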
5.
Héctor Musto, Héctor Romero, Helena Rodríguez-Maseda. Journal of molecular evolution, 1998, 46(2):159-167
Synonymous codon choices vary considerably among Schistosoma mansoni genes. Principal components analysis detects a single major trend among genes, which highly correlates with GC content in
third codon positions and exons, but does not discriminate among putatively highly and lowly expressed genes. The effective
number of codons used in each gene, and its distribution when plotted against GC3, suggests that codon usage is shaped mainly by mutational biases. The GC content of exons, GC3, 5′, 3′, and flanking (5′+ 3′+ introns) regions are all correlated among them, suggesting that variations in GC content may
exist among different regions of the S. mansoni genome. We propose that this genome structure might be among the most important factors shaping codon usage in this species,
although the action of selection on certain sequences cannot be excluded.
Received: 10 March 1997 / Accepted: 27 June 1997
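For readers unfamiliar with the plot mentioned in the abstract above (effective number of codons against GC3), the following Python sketch computes GC3 for a coding sequence and the null-expectation curve commonly attributed to Wright (1990), Nc = 2 + s + 29 / (s^2 + (1 - s)^2), where s is GC3. This is an illustration, not the authors' code; the function names are hypothetical, and the GC3 calculation is simplified in that it counts every third position, including those of Met, Trp, and stop codons.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an uppercase DNA sequence.

    Simplification (assumption): all codons are counted, including
    Met, Trp, and stop codons, which many analyses exclude.
    """
    third = cds[2::3]  # indices 2, 5, 8, ... are third codon positions
    if not third:
        return 0.0
    return sum(base in "GC" for base in third) / len(third)


def expected_nc(s):
    """Expected effective number of codons when codon usage depends only on GC3 = s."""
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)
```

Genes whose observed effective number of codons lies close to `expected_nc(gc3(cds))` are consistent with codon usage shaped mainly by compositional (mutational) bias; genes falling well below the curve suggest additional factors.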
6.
A Survey of the Molecular Evolutionary Dynamics of Twenty-Five Multigene Families from Four Grass Taxa Cited by: 10 (self-citations: 0, citations by others: 10)
We surveyed the molecular evolutionary characteristics of 25 plant gene families, with the goal of better understanding general
processes in plant gene family evolution. The survey was based on 247 GenBank sequences representing four grass species (maize,
rice, wheat, and barley). For each gene family, orthology and paralogy relationships were uncertain. Recognizing this uncertainty,
we characterized the molecular evolution of each gene family in four ways. First, we calculated the ratio of nonsynonymous
to synonymous substitutions (dN/dS) both on branches of gene phylogenies and across codons. Our results indicated that the dN/dS
ratio was statistically heterogeneous across branches in 17 of 25 (68%) gene families. The vast majority of dN/dS
estimates were <<1.0, suggestive of selective constraint on amino acid replacements, and no estimates were >1.0, either across
phylogenetic lineages or across codons. Second, we tested separately for nonsynonymous and synonymous molecular clocks. Sixty-eight
percent of gene families rejected a nonsynonymous molecular clock, and 52% of gene families rejected a synonymous molecular
clock. Thus, most gene families in this study deviated from clock-like evolution at either synonymous or nonsynonymous sites.
Third, we calculated the effective number of codons and the proportion of G+C synonymous sites for each sequence in each gene
family. One or both quantities vary significantly within 18 of 25 gene families. Finally, we tested for gene conversion, and
only six gene families provided evidence of gene conversion events. Altogether, evolution for these 25 gene families is marked
by selective constraint that varies among gene family members, a lack of molecular clock at both synonymous and nonsynonymous
sites, and substantial variation in codon usage.
Received: 25 May 2000 / Accepted: 16 October 2000
7.
Berg OG. Journal of molecular evolution, 1999, 48(4):398-407
The synonymous divergence between Escherichia coli and Salmonella typhimurium is explained in a model where there is a large variation between mutation rates at different nucleotide sites in the genome.
The model is based on the experimental observation that spontaneous mutation rates can vary over several orders of magnitude
at different sites in a gene. Such site-specific variation must be taken into account when studying synonymous divergence
and will result in an apparent saturation below the level expected from an assumption of uniform rates. Recently, it has been
suggested that codon preference in enterobacteria has a very large site-specific variation and that the synonymous divergence
between different species, e.g., E. coli and Salmonella, is saturated. In the present communication it is shown that when site-specific variation in mutation rates is introduced,
there is no need to invoke assumptions of saturation and a large variability in codon preference. The same rate variation
will also bring average mutation rates as estimated from synonymous sequence divergence into numerical agreement with experimental
values.
Received: 10 July 1998 / Accepted: 20 August 1998
8.
To characterize the coding-sequence divergence of closely related genomes, we compared DNA sequence divergence between sequences
from a Brassica rapa ssp. pekinensis EST library isolated from flower buds and genomic sequences from Arabidopsis thaliana. The specific objectives were (i) to determine the distribution of and relationship between Ka and Ks,
(ii) to identify genes with the lowest and highest Ka:Ks values, and (iii) to evaluate how codon usage has diverged between two closely related species. We found that the distribution
of Ka:Ks was unimodal, and that substitution rates were more variable at nonsynonymous than synonymous sites, and detected no evidence
that Ka and Ks were positively correlated. Several genes had Ka:Ks values equal to or near zero, as expected for genes that have evolved under strong selective constraint. In contrast, there
were no genes with Ka:Ks >1 and thus we found no strong evidence that any of the 218 sequences we analyzed have evolved in response to positive selection.
We detected a stronger codon bias but a lower frequency of GC at synonymous sites in A. thaliana than B. rapa. Moreover, there has been a shift in the profile of most commonly used synonymous codons since these two species diverged
from one another. This shift in codon usage may have been caused by stronger selection acting on codon usage or by a shift
in the direction of mutational bias in the B. rapa phylogenetic lineage.
9.
Wang B. Journal of molecular evolution, 2001, 53(3):244-250
Genes with atypical G+C content and pattern of codon usage in a certain genome are possibly of exotic origin, and this idea
has been applied to identify horizontal events. In this way, it was postulated that a total of 755 genes in the E. coli genome are relics of horizontal events after the divergence of E. coli from the Salmonella lineage 100 million years ago (Lawrence and Ochman, 1998). In this paper we propose a new way to study sequence composition
more thoroughly. We found that although the 755 genes differ in composition from other genes in the E. coli genome, the difference is minor. If we accepted that these genes are horizontally transferred, then (1) it would be more
likely that they were transferred from genomes evolutionarily closely related to E. coli; but (2) the dating method used by Lawrence and Ochman (1997, 1998) largely underestimated the average age of introduced sequences
in the E. coli genome, in particular, most of the 755 genes should be introduced into E. coli before, instead of after, the divergence of E. coli from the Salmonella lineage. Our study reveals that atypical G+C content and pattern of codon usage are not reliable indicators of horizontal
gene transfer events.
Received: 27 September 2000 / Accepted: 9 April 2001
10.
The mitochondrial DNA-encoded cytochrome oxidase subunit I (COI) gene and the nuclear DNA-encoded hsp60 gene from the euglenoid
protozoan Euglena gracilis were cloned and sequenced. The COI sequence represents the first example of a mitochondrial genome-encoded gene from this
organism. This gene contains seven TGG tryptophan codons and no TGA tryptophan codons, suggesting the use of the universal
genetic code. This differs from the situation in the mitochondrion of the related kinetoplastid protozoa, in which TGA codes
for tryptophan. In addition, a complete absence of CGN triplets may imply the lack of the corresponding tRNA species. COI
cDNAs from E. gracilis possess short 5′ and 3′ untranslated transcribed sequences and lack a 3′ poly[A] tail.
The COI gene does not require uridine insertion/deletion RNA editing, as occurs in kinetoplastid mitochondria, to be functional,
and no short guide RNA-like molecules could be visualized by labeling total mitochondrial RNA with [α-32P]GTP and guanylyl transferase. In spite of the differences in codon usage and the 3′ end structures of mRNAs, phylogenetic
analysis using the COI and hsp60 protein sequences suggests a monophyletic relationship between the mitochondrial genomes
of E. gracilis and of the kinetoplastids, which is consistent with the phylogenetic relationship of these groups previously obtained using
nuclear ribosomal RNA sequences.
Received: 5 March 1996 / Accepted: 31 July 1996
11.
Codon Usage in Plastid Genes Is Correlated with Context, Position Within the Gene, and Amino Acid Content Cited by: 5 (self-citations: 0, citations by others: 5)
Highly expressed plastid genes display codon adaptation, which is defined as a bias toward a set of codons which are complementary
to abundant tRNAs. This type of adaptation is similar to what is observed in highly expressed Escherichia coli genes and is probably the result of selection to increase translation efficiency. In the current work, the codon adaptation
of plastid genes is studied with regard to three specific features that have been observed in E. coli and which may influence translation efficiency. These features are (1) a relatively low codon adaptation at the 5′ end of
highly expressed genes, (2) an influence of neighboring codons on codon usage at a particular site (codon context), and (3)
a correlation between the level of codon adaptation of a gene and its amino acid content. All three features are found in
plastid genes. First, highly expressed plastid genes have a noticeable decrease in codon adaptation over the first 10–20 codons.
Second, for the twofold degenerate NNY codon groups, highly expressed genes have an overall bias toward the NNC codon, but
this is not observed when the 3′ neighboring base is a G. At these sites highly expressed genes are biased toward NNT instead
of NNC. Third, plastid genes that have higher codon adaptations also tend to have an increased usage of amino acids with a
high G + C content at the first two codon positions and GNN codons in particular. The correlation between codon adaptation
and amino acid content exists separately for both cytosolic and membrane proteins and is not related to any obvious functional
property. It is suggested that at certain sites selection discriminates between nonsynonymous codons based on translational,
not functional, differences, with the result that the amino acid sequence of highly expressed proteins is partially influenced
by selection for increased translation efficiency.
Received: 21 July 1999 / Accepted: 5 November 1999
12.
Codon Usage Bias and tRNA Abundance in Drosophila Cited by: 5 (self-citations: 0, citations by others: 5)
Codon usage bias of 1,117 Drosophila melanogaster genes, as well as fewer D. pseudoobscura and D. virilis genes, was examined from the perspective of relative abundance of isoaccepting tRNAs and their changes during development.
We found that each amino acid contributes about equally and highly significantly to overall codon usage bias, with the exception
of Asp which had very low contribution to overall bias. Asp was also the only amino acid that did not show a clear preference
for one of its synonymous codons. Synonymous codon usage in Drosophila was consistent with "optimal" codons deduced from the isoaccepting tRNA availability. Interestingly, amino acids whose
major isoaccepting tRNAs change during development did not show as strong bias as those with developmentally unchanged tRNA
pools. Asp is the only amino acid for which the major isoaccepting tRNAs change between larval and adult stages. We conclude
that synonymous codon usage in Drosophila is well explained by tRNA availability and is probably influenced by developmental changes in relative abundance.
Received: 5 December 1996 / Accepted: 14 June 1997
13.
Synonymous codon usage in related species may differ as a result of variation in mutation biases, differences in the overall
strength and efficiency of selection, and shifts in codon preference—the selective hierarchy of codons within and between
amino acids. We have developed a maximum-likelihood method to employ explicit population genetic models to analyze the evolution
of parameters determining codon usage. The method is applied to twofold degenerate amino acids in 50 orthologous genes from
D. melanogaster and D. virilis. We find that D. virilis has significantly reduced selection on codon usage for all amino acids, but the data are incompatible with a simple model
in which there is a single difference in the long-term Ne, or overall strength of selection, between the two species, indicating shifts in codon preference. The strength of selection
acting on codon usage in D. melanogaster is estimated to be |Nes| ≈ 0.4 for most CT-ending twofold degenerate amino acids, but 1.7 times greater for cysteine and 1.4 times greater for AG-ending
codons. In D. virilis, the strength of selection acting on codon usage for most amino acids is only half that acting in D. melanogaster but is considerably greater than half for cysteine, perhaps indicating the dual selection pressures of translational efficiency
and accuracy. Selection coefficients in orthologues are highly correlated (ρ= 0.46), but a number of genes deviate significantly
from this relationship.
Received: 20 December 1998 / Accepted: 17 February 1999
14.
Synonymous Codon Choices in the Extremely GC-Poor Genome of Plasmodium falciparum: Compositional Constraints and Translational Selection Cited by: 7 (self-citations: 0, citations by others: 7)
Héctor Musto, Héctor Romero, Alejandro Zavala, Kamel Jabbari, Giorgio Bernardi. Journal of molecular evolution, 1999, 49(1):27-35
We have analyzed the patterns of synonymous codon preferences of the nuclear genes of Plasmodium falciparum, a unicellular parasite characterized by an extremely GC-poor genome. When all genes are considered, codon usage is strongly
biased toward A and T in third codon positions, as expected, but multivariate statistical analysis detects a major trend among
genes. At one end genes display codon choices determined mainly by the extreme genome composition of this parasite, and very
probably their expression level is low. At the other end a few genes exhibit an increased relative usage of a particular subset
of codons, many of which are C-ending. Since the majority of these few genes is putatively highly expressed, we postulate
that the increased C-ending codons are translationally optimal. In conclusion, while codon usage of the majority of P. falciparum genes is determined mainly by compositional constraints, a small number of genes exhibit translational selection.
Received: 10 November 1998 / Accepted: 28 January 1999
15.
The usage of synonymous codons and the frequencies of amino acids were investigated in the complete genome of the bacterium
Thermotoga maritima using a multivariate statistical approach. The GC3 content of each gene was the most prominent source of variation of codon
usage. Surprisingly the usage of UGU and UGC (synonymous triplets coding for Cys, the least frequent amino acid in this species)
was detected as the second most prominent source of variation. However, this result is probably an artifact due to the very
low frequency of Cys together with the nonbiased composition of this genome. The third trend was related to the preferential
usage of a subset of codons among highly expressed genes, and these triplets are presumed to be translationally optimal. Concerning
the amino acid usage, the hydropathy level of each protein (and therefore the frequency of charged residues) was the main
trend, while the second factor was related to the frequency of usage of the smaller residues, suggesting that the cell economy
strongly influences the architecture of the proteins. The third axis of the analysis discriminated the usage of Phe, Tyr,
Trp (aromatic residues) plus Cys, Met, and His. These six residues have in common the property of being the preferential targets
of reactive oxygen species, and therefore the anaerobic condition of T. maritima is an important factor for the amino acid frequencies. Finally, the Cys content of each protein was the fourth trend.
Received: 22 June 2001 / Accepted: 1 October 2001
16.
Adam Eyre-Walker. Journal of molecular evolution, 1996, 42(2):73-78
It is shown that synonymous codon usage is less biased in favor of those codons preferred by highly expressed genes at the end of Escherichia coli genes than in the middle. This appears to be due to the close proximity of many E. coli genes. It is shown that a substantial number of genes overlap either the Shine-Dalgarno sequence or the coding sequence of the next gene on the chromosome and that the codons that overlap have lower synonymous codon bias than those which do not. It is also shown that there is an increase in the frequency of A-ending codons, and a decrease in the frequency of G-ending codons at the end of E. coli genes that lie close to another gene. It is suggested that these trends in composition could be associated with selection against the formation of mRNA secondary structure near the start of the next gene on the chromosome. Stop codon use is also affected by the close proximity of genes; many genes are forced to use TGA and TAG stop codons because they terminate either within the Shine-Dalgarno or coding sequence of the next gene on the chromosome. The implications these results have for the evolution of synonymous codon use are discussed.
17.
Capsular polysaccharides are important virulence factors both in Gram-positive and Gram-negative bacteria. A similar cluster
organization of the genes involved in the synthesis of bacterial exopolysaccharides has been postulated in both cases, suggesting
that these clusters evolved by module assembly. Horizontal gene transfer has been postulated to explain the polymorphism found
in these cellular polymers. The cap1K and cap3A genes coding for the pneumococcal type 1 and type 3 UDP-glucose dehydrogenases, respectively, have been compared with other
UDP-sugar dehydrogenases. We have observed that the evolutionary distance between Cap1K and Cap3A is approximately equal to
that found between Cap1K (or Cap3A) and other UDP-GlcDH of families evolutionarily distant like KfiD, the dehydrogenase from
Escherichia coli K5. On the basis of comparisons of G + C content, patterns of synonymous and nonsynonymous substitutions, dinucleotide frequencies,
and codon usage bias, we conclude that the kfiD gene has been introduced into E. coli from an exogenous source, probably from a streptococcal species.
Received: 26 May 1997 / Accepted: 30 July 1997
18.
Relationships Among Stop Codon Usage Bias, Its Context, Isochores, and Gene Expression Level in Various Eukaryotes Cited by: 1 (self-citations: 0, citations by others: 1)
It is well known that stop codons play a critical role in the process of protein synthesis. However, little effort has been
made to investigate whether stop codon usage exhibits biases, such as widely seen for synonymous codon usage. Here we systematically
investigate stop codon usage bias in various eukaryotes as well as its relationships with its context, GC3 content, gene expression
level, and secondary structure. The results show that there is a strong bias for stop codon usage in different eukaryotes,
i.e., UAA is overrepresented in the lower eukaryotes, UGA is overrepresented in the higher eukaryotes, and UAG is least used
in all eukaryotes. Different conserved patterns for each stop codon in different eukaryotic classes are found based on information
content and logo analysis. GC3 contents increase with increasing complexity of organisms. Secondary structure prediction revealed
that UAA is generally associated with loop structures, whereas UGA is more uniformly present in loop and stem structures,
i.e., UGA is less biased toward having a particular structure. The stop codon usage bias, however, shows no significant relationship
with GC3 content and gene expression level in individual eukaryotes. The results indicate that genomic complexity and GC3
content might contribute to stop codon usage bias in different eukaryotes. Our results indicate that stop codons, like synonymous
codons, exhibit biases in usage. Additional work will be needed to understand the causes of these biases and their relationship
to the mechanism of protein termination.
[Reviewing Editor: Dr. Manyuan Long]
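As a small illustration of what "stop codon usage bias" means operationally in the abstract above, the Python sketch below tallies the terminal codon of each coding sequence and reports relative frequencies of UAA, UAG, and UGA. It is a hypothetical helper, not the authors' pipeline, and it assumes each input sequence is a complete coding sequence that ends with its stop codon.

```python
from collections import Counter

STOP_CODONS = ("UAA", "UAG", "UGA")


def stop_codon_usage(cds_list):
    """Relative usage of each stop codon across a list of coding sequences.

    Assumption: each sequence ends with its stop codon. Sequences whose
    last three bases are not a stop codon are ignored. DNA input is
    accepted; T is converted to U so the keys match RNA notation.
    """
    counts = Counter()
    for cds in cds_list:
        last = cds[-3:].upper().replace("T", "U")
        if last in STOP_CODONS:
            counts[last] += 1
    total = sum(counts.values())
    return {s: (counts[s] / total if total else 0.0) for s in STOP_CODONS}
```

Comparing these frequencies between gene sets (for example, grouped by GC3 content or expression level) is one simple way to look for the kind of relationships the abstract discusses.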
19.
Richard M. Kliman. Journal of molecular evolution, 1999, 49(3):343-351
Evidence from a variety of sources indicates that selection has influenced synonymous codon usage in Drosophila. It has generally been difficult, however, to distinguish selection that acted in the distant past from ongoing selection.
However, under a neutral model, polymorphisms usually reflect more recent mutations than fixed differences between species
and may, therefore, be useful for inferring recent selection. If the ancestral state is preferred, selection should shift
the frequency distribution of derived states/site toward lower values; if the ancestral is unpreferred, selection should increase
the number of derived states/site. Polymorphisms were classified as ancestrally preferred or unpreferred for several genes
of D. simulans and D. melanogaster. A computer simulation of coalescence was employed to derive the expected frequency distributions of derived states/site under
various modifications of the Wright–Fisher neutral model, and distributions of test statistics (t and Mann–Whitney U) were derived by appropriate sampling. One-tailed tests were applied to transformed frequency data to assess whether the
two frequency distributions deviated from neutral expectations in the direction predicted by selection on codon usage. Several
genes from D. simulans appear to be subject to recent selection on synonymous codons, including one gene with low codon bias, esterase-6. Selection may also be acting in D. melanogaster.
Received: 15 April 1998 / Accepted: 13 May 1999
20.
Jenkins GM, Pagel M, Gould EA, de A Zanotto PM, Holmes EC. Journal of molecular evolution, 2001, 52(4):383-390
The extent to which base composition and codon usage vary among RNA viruses, and the possible causes of this bias, is undetermined
in most cases. A maximum-likelihood statistical method was used to test whether base composition and codon usage bias covary
with arthropod association in the genus Flavivirus, a major source of disease in humans and animals. Flaviviruses are transmitted by mosquitoes, by ticks, or directly between
vertebrate hosts. Those viruses associated with ticks were found to have a significantly lower G+C content than non-vector-borne
flaviviruses and this difference was present throughout the genome at all amino acids and codon positions. In contrast, mosquito-borne
viruses had an intermediate G+C content which was not significantly different from those of the other two groups. In addition,
biases in dinucleotide and codon usage that were independent of base composition were detected in all flaviviruses, but these
did not covary with arthropod association. However, the overall effect of these biases was slight, suggesting only weak selection
at synonymous sites. A preliminary analysis of base composition, codon usage, and vector specificity in other RNA virus families
also revealed a possible association between base composition and vector specificity, although with biases different from
those seen in the Flavivirus genus.
Received: 29 August 2000 / Accepted: 19 December 2000