Found 20 similar articles (search time: 46 ms)
1.
We simulate a deterministic population genetic model for the coevolution of genetic codes and protein-coding genes. We use
very simple assumptions about translation, mutation, and protein fitness to calculate mutation-selection equilibria of codon
frequencies and fitness in a large asexual population with a given genetic code. We then compute the fitnesses of altered
genetic codes that compete to invade the population by translating its genes with higher fitness. Codes and genes coevolve
in a succession of stages, alternating between genetic equilibration and code invasion, from an initial wholly ambiguous coding
state to a diversified frozen coding state. Our simulations almost always resulted in partially redundant frozen genetic codes.
Also, the range of simulated physicochemical properties among encoded amino acids in frozen codes was always less than maximal.
These results did not require the assumption of historical constraints on the number and type of amino acids available to
codes nor on the complexity of proteins, stereochemical constraints on the translational apparatus, nor mechanistic constraints
on genetic code change. Both the extent and timing of amino-acid diversification in genetic codes were strongly affected by
the message mutation rate and strength of missense selection. Our results suggest that various omnipresent phenomena that
distribute codons over sites with different selective requirements—such as the persistence of nonsynonymous mutations at equilibrium,
the positive selection of the same codon in different types of sites, and translational ambiguity—predispose the evolution
of redundancy and of reduced amino acid diversity in genetic codes.
Received: 21 December 2000 / Accepted: 12 March 2001
2.
Statistical and biochemical studies of the genetic code have found evidence of nonrandom patterns in the distribution of
codon assignments. It has, for example, been shown that the code minimizes the effects of point mutation or mistranslation:
erroneous codons are either synonymous or code for an amino acid with chemical properties very similar to those of the one
that would have been present had the error not occurred. This work has suggested that the second base of codons is less efficient
in this respect, by about three orders of magnitude, than the first and third bases. These results are based on the assumption
that all forms of error at all bases are equally likely. We extend this work to investigate (1) the effect of weighting transition
errors differently from transversion errors and (2) the effect of weighting each base differently, depending on reported mistranslation
biases. We find that if the bias affects all codon positions equally, as might be expected were the code adapted to a mutational
environment with transition/transversion bias, then any reasonable transition/transversion bias increases the relative efficiency
of the second base by an order of magnitude. In addition, if we employ weightings to allow for biases in translation, then
only 1 in every million random alternative codes generated is more efficient than the natural code. We thus conclude not only
that the natural genetic code is extremely efficient at minimizing the effects of errors, but also that its structure reflects
biases in these errors, as might be expected were the code the product of selection.
Received: 25 July 1997 / Accepted: 9 January 1998
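The position-by-position comparison in this abstract can be illustrated with a minimal Python sketch. It assumes the standard genetic code (NCBI translation table 1) and only counts how often a single-base change between sense codons is synonymous, split by codon position and by transition versus transversion. It does not reproduce the authors' polarity-based error measure or their weightings, and the names `CODE`, `is_transition`, and `synonymous_fraction` are illustrative rather than taken from the paper.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}

def is_transition(a, b):
    """A<->G and C<->T changes are transitions; all others are transversions."""
    return (a in PURINES) == (b in PURINES)

def synonymous_fraction(position, transitions):
    """Fraction of single-base changes between sense codons at `position` (0-2),
    of the requested type, that leave the encoded amino acid unchanged."""
    same = total = 0
    for codon, aa in CODE.items():
        if aa == "*":
            continue  # start only from sense codons
        for b in BASES:
            if b == codon[position]:
                continue
            if is_transition(codon[position], b) != transitions:
                continue
            mutant = codon[:position] + b + codon[position + 1:]
            if CODE[mutant] == "*":
                continue  # nonsense changes are ignored in this simple count
            total += 1
            same += CODE[mutant] == aa
    return same / total

for pos in range(3):
    print(pos + 1, synonymous_fraction(pos, True), synonymous_fraction(pos, False))
```

In this simplified count no second-position change between sense codons is synonymous, so any error-buffering at that position has to come from the chemical similarity of the substituted amino acids, which is what the polarity-based measures in the abstract assess.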
3.
How did the "universal" genetic code arise? Several hypotheses have been put forward, and the code has been analyzed extensively
by authors looking for clues to selection pressures that might have acted during its evolution. But this approach has been
ineffective. Although an impressive number of properties has been attributed to the universal code, it has been impossible
to determine whether selection on any of these properties was important in the code's evolution or whether the observed properties
arose as a consequence of selection on some other characteristic. Therefore we turned the question around and asked, what
would a genetic code look like if it had evolved in response to various different selection pressures? To address this question,
we constructed a genetic algorithm. We found first that selecting on a particular measure yields codes that are similar to
each other. Second, we found that the universal code is far from minimized with respect to the effects of mutations (or translation
errors) on the amino acid compositions of proteins. Finally, we found that the codes that most closely resembled real codes
were those generated by selecting on aspects of the code's structure, not those generated by selecting to minimize the effects
of amino acid substitutions on proteins. This suggests that the universal genetic code has been selected for a particular
structure—a structure that confers an important flexibility on the evolution of genes and proteins—and that the particular
assignments of amino acids to codons are secondary.
Received: 29 December 1998 / Accepted: 8 July 1999
4.
Natural selection favors certain synonymous codons which aid translation in Escherichia coli, yet codons not favored by translational selection persist. We use the frequency distributions of synonymous polymorphisms
to test three hypotheses for the existence of translationally sub-optimal codons: (1) selection is a relatively weak force,
so there is a balance between mutation, selection, and drift; (2) at some sites there is no selection on codon usage, so some
synonymous sites are unaffected by translational selection; and (3) translationally sub-optimal codons are favored by alternative
selection pressures at certain synonymous sites. We find that when all the data is considered, model 1 is supported and both
models 2 and 3 are rejected as sole explanations for the existence of translationally sub-optimal codons. However, we find
evidence in favor of both models 2 and 3 when the data is partitioned between groups of amino acids and between regions of
the genes. Thus, all three mechanisms appear to contribute to the existence of translationally sub-optimal codons in E. coli.
Received: 18 July 2000 / Accepted: 17 April 2001
5.
We have assumed that the coevolution theory of genetic code origin (Wong JT, Proc Natl Acad Sci USA 72:1909–1912, 1975) is
essentially correct. This theory makes it possible to identify at least 10 evolutionary stages through which genetic code
organization might have passed prior to reaching its current form. The calculation of the minimization level of all these
evolutionary stages leads to the following conclusions. (1) The minimization percentages increased linearly with the number
of amino acids codified in the codes of the various evolutionary stages when only the sense changes are considered in the
analysis. This seems to favor the physicochemical theory of genetic code origin even if, as discussed in the paper, this observation
is also compatible with the coevolution theory. (2) For the first seven evolutionary stages of the genetic code, this trend
is less clear and indeed is inverted when we consider the global optimisation of the codes due to both sense changes and synonymous
changes. This inverse correlation between minimization percentages and the number of amino acids codified in the codes of
the intermediate stages seems to favor neither the physicochemical nor the stereochemical theories of genetic code origin,
as it is in the early and intermediate stages of code development that these theories would expect minimization to have played
a crucial role, and this does not seem to be the case. However, these results are in agreement with the coevolution theory,
which attributes a role to the physicochemical properties of amino acids that, while important, is nevertheless subordinate
to the mechanism which concedes codons from the precursor amino acids to the product amino acids as the primary factor determining
the evolutionary structuring of the genetic code. The results are therefore discussed in the context of the various theories
proposed to explain genetic code origin.
Received: 25 October 1998 / Accepted: 19 February 1999
6.
7.
David H. Ardell. Journal of Molecular Evolution, 1998, 47(1):1-13
Distances between amino acids were derived from the polar requirement measure of amino acid polarity and Benner and co-workers'
(1994) 74-100 PAM matrix. These distances were used to examine the average effects of amino acid substitutions due to single-base
errors in the standard genetic code and equally degenerate randomized variants of the standard code. Second-position transitions
conserved all distances on average, an order of magnitude more than did second-position transversions. In contrast, first-position
transitions and transversions were about equally conservative. In comparison with randomized codes, second-position transitions
in the standard code significantly conserved mean square differences in polar requirement and mean Benner matrix-based distances,
but mean absolute value differences in polar requirement were not significantly conserved. The discrepancy suggests that these
commonly used distance measures may be insufficient for strict hypothesis testing without more information. The translational
consequences of single-base errors were then examined in different codon contexts, and similarities between these contexts
explored with a hierarchical cluster analysis. In one cluster of codon contexts corresponding to the RNY and GNR codons, second-position
transversions between C and G and transitions between C and U were most conservative of both polar requirement and the matrix-based
distance. In another cluster of codon contexts, second-position transitions between A and G were most conservative. Despite
the claims of previous authors to the contrary, it is shown theoretically that the standard code may have been shaped by position-invariant
forces such as mutation and base content. These forces may have left heterogeneous signatures in the code because of differences
in translational fidelity by codon position.
A scenario for the origin of the code is presented wherein selection for error minimization could have occurred multiple times
in disjoint parts of the code through a phyletic process of competition between lineages. This process permits error minimization
without the disruption of previously useful messages, and does not predict that the code is optimally error-minimizing with
respect to modern error. Instead, the code may be a record of genetic process and patterns of mutation before the radiation
of modern organisms and organelles.
Received: 28 July 1997 / Accepted: 23 January 1998
8.
Nucleotide Composition Bias Affects Amino Acid Content in Proteins Coded by Animal Mitochondria (cited by: 16; self-citations: 0; citations by others: 16)
We show that in animal mitochondria homologous genes that differ in guanine plus cytosine (G + C) content code for proteins
differing in amino acid content in a manner that relates to the G + C content of the codons. DNA sequences were analyzed using
square plots, a new method that combines graphical visualization and statistical analysis of compositional differences in
both DNA and protein. Square plots divide codons into four groups based on first and second position A + T (adenine plus thymine)
and G + C content and indicate differences in amino acid content when comparing sequences that differ in G + C content. When
sequences are compared using these plots, the amino acid content is shown to correlate with the nucleotide bias of the genes.
This amino acid effect is shown in all protein-coding genes in the mitochondrial genome, including cox I, cox II, and cyt b, mitochondrial genes which are commonly used for phylogenetic studies. Furthermore, nucleotide content differences are shown
to affect the content of all amino acids with A + T- and G + C-rich codons. We speculate that phylogenetic analysis of genes
so affected may tend erroneously to indicate relatedness (or lack thereof) based only on amino acid content.
Received: 3 July 1996 / Accepted: 6 November 1996
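The codon grouping behind the square plots can be sketched in a few lines of Python. This is only one reading of the abstract: it assumes the four groups are defined by whether each of the first two codon positions holds a G/C base (labelled `S`) or an A/T base (labelled `W`). It omits the graphical and statistical parts of the method, and the function names and the example sequence are made up for illustration.

```python
from collections import Counter

def codon_group(codon):
    """Label a codon by its first two bases: 'S' for G or C, 'W' for A or T."""
    return "".join("S" if base in "GC" else "W" for base in codon[:2].upper())

def group_counts(cds):
    """Count codons per group (WW, WS, SW, SS) in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return Counter(codon_group(c) for c in codons)

example = "ATGGCCTTAGGC"  # hypothetical sequence, for illustration only
print(group_counts(example))
```

Comparing these counts between homologous genes with different G + C content gives a rough view of the shift the abstract describes, although the authors' own analysis includes statistical testing that this sketch does not attempt.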
9.
10.
In bacteria, synonymous codon usage can be considerably affected by base composition at neighboring sites. Such context-dependent
biases may be caused by either selection against specific nucleotide motifs or context-dependent mutation biases. Here we
consider the evolutionary conservation of context-dependent codon bias across 11 completely sequenced bacterial genomes. In
particular, we focus on two contextual biases previously identified in Escherichia coli; the avoidance of out-of-frame stop codons and AGG motifs. By identifying homologues of E. coli genes, we also investigate the effect of gene expression level in Haemophilus influenzae and Mycoplasma genitalium. We find that while context-dependent codon biases are widespread in bacteria, few are conserved across all species considered.
Avoidance of out-of-frame stop codons does not apply to all stop codons or amino acids in E. coli, does not hold for different species, does not increase with gene expression level, and is not relaxed in Mycoplasma spp., in which the canonical stop codon, TGA, is recognized as tryptophan. Avoidance of AGG motifs shows some evolutionary
conservation and increases with gene expression level in E. coli, suggestive of the action of selection, but the cause of the bias differs between species. These results demonstrate that
strong context-dependent forces, both selective and mutational, operate on synonymous codon usage but that these differ considerably
between genomes.
Received: 6 May 1999 / Accepted: 29 October 1999
11.
We studied 10 protein-coding mitochondrial genes from 19 mammalian species to evaluate the effects of 10 amino acid properties
on the evolution of the genetic code, the amino acid composition of proteins, and the pattern of nonsynonymous substitutions.
The 10 amino acid properties studied are the chemical composition of the side chain, two polarity measures, hydropathy, isoelectric
point, volume, aromaticity, aliphaticity, hydrogenation, and hydroxythiolation. The genetic code appears to have evolved toward
minimizing polarity and hydropathy but not the other seven properties. This can be explained by our finding that the presumably
primitive amino acids differed much only in polarity and hydropathy, but little in the other properties. Only the chemical
composition (C) and isoelectric point (IE) appear to have affected the amino acid composition of the proteins studied, that
is, these proteins tend to have more amino acids with typical C and IE values, so that nonsynonymous mutations tend to result
in small differences in C and IE. All properties, except for hydroxythiolation, affect the rate of nonsynonymous substitution,
with the observed amino acid changes having only small differences in these properties, relative to the spectrum of all possible
nonsynonymous mutations.
Received: 2 January 1998 / Accepted: 25 April 1998
12.
Tachida H. Journal of Molecular Evolution, 2000, 50(1):69-81
A simple nearly neutral mutation model of protein evolution was studied using computer simulation assuming a constant population
size. In this model, a gene consists of a finite number of codons and there is no recombination within a gene. Each codon
has two replacement sites and one silent site. The fitness of a gene was determined multiplicatively by amino acids specified by
codons (the independent multicodon model). Nucleotide diversity at replacement sites decreases as selection becomes stronger.
A reduction of nucleotide diversity at silent sites also occurs as selection intensifies but the magnitude of the reduction
is not a monotone function of the intensity of selection. The dispersion index is close to one. The average values of Tajima's
and Fu and Li's statistics are negative and their absolute values increase as selection intensifies. However, their power
to detect selection under the present model was not high unless the number of sites was large or the mutation rate was high.
The MK test was shown to detect intermediate selection fairly well. For comparison, the house-of-cards model was also investigated
and its behavior was shown to be more sensitive to changes of population size than that of the independent multicodon model.
The relevance of the present model for explaining protein evolution is discussed by comparing its predictions with recent DNA
data.
Received: 24 May 1999 / Accepted: 17 August 1999
13.
Synonymous codon usage in related species may differ as a result of variation in mutation biases, differences in the overall
strength and efficiency of selection, and shifts in codon preference—the selective hierarchy of codons within and between
amino acids. We have developed a maximum-likelihood method to employ explicit population genetic models to analyze the evolution
of parameters determining codon usage. The method is applied to twofold degenerate amino acids in 50 orthologous genes from
D. melanogaster and D. virilis. We find that D. virilis has significantly reduced selection on codon usage for all amino acids, but the data are incompatible with a simple model
in which there is a single difference in the long-term N_e, or overall strength of selection, between the two species, indicating shifts in codon preference. The strength of selection
acting on codon usage in D. melanogaster is estimated to be |N_e s| ≈ 0.4 for most CT-ending twofold degenerate amino acids, but 1.7 times greater for cysteine and 1.4 times greater for AG-ending
codons. In D. virilis, the strength of selection acting on codon usage for most amino acids is only half that acting in D. melanogaster but is considerably greater than half for cysteine, perhaps indicating the dual selection pressures of translational efficiency
and accuracy. Selection coefficients in orthologues are highly correlated (ρ= 0.46), but a number of genes deviate significantly
from this relationship.
Received: 20 December 1998 / Accepted: 17 February 1999
14.
We have previously proposed an SNS hypothesis on the origin of the genetic code (Ikehara and Yoshida 1998). The hypothesis
predicts that the universal genetic code originated from the SNS code composed of 16 codons and 10 amino acids (S and N mean
G or C and either of four bases, respectively). But, it must have been very difficult to create the SNS code at one stroke
in the beginning. Therefore, we searched for a simpler code than the SNS code, which could still encode water-soluble globular
proteins with appropriate three-dimensional structures at a high probability using four conditions for globular protein formation
(hydropathy, α-helix, β-sheet, and β-turn formations). Four amino acids (Gly [G], Ala [A], Asp [D], and Val [V]) encoded by
the GNC code satisfied the four structural conditions well, but other codes in rows and columns in the universal genetic code
table do not, except for the GNG code, a slightly modified form of the GNC code. Three three-amino acid systems ([D], Leu
and Tyr; [D], Tyr and Met; Glu, Pro and Ile) also satisfied the above four conditions. But, some amino acids in the three
systems are far more complex than those encoded by the GNC code. In addition, the amino acids in the three-amino acid systems
are scattered in the universal genetic code table. Thus, we concluded that the universal genetic code originated not from
a three-amino acid system but from a four-amino acid system, the GNC code encoding [GADV]-proteins, as the most primitive
genetic code.
Received: 11 June 2001 / Accepted: 11 October 2001
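The codon sets named in this abstract can be checked against the standard genetic code with a short Python sketch. It assumes NCBI translation table 1 and the degenerate-base symbols used in the abstract (N for any base, S for G or C). The helper names `expand` and `encoded` are illustrative, and the sketch says nothing about the protein-structure criteria the authors applied.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def expand(pattern):
    """Expand a degenerate codon pattern: N = any base, S = G or C."""
    choices = {"N": "TCAG", "S": "GC"}
    return ["".join(c) for c in product(*(choices.get(ch, ch) for ch in pattern))]

def encoded(pattern):
    """Return the codons matching `pattern` and the sorted amino acids they encode."""
    codons = expand(pattern)
    return codons, sorted({CODE[c] for c in codons})

for pattern in ("GNC", "GNG", "SNS"):
    codons, amino_acids = encoded(pattern)
    print(pattern, len(codons), "codons ->", "".join(amino_acids))
```

Under the standard code, GNC expands to four codons encoding Gly, Ala, Asp, and Val, and SNS expands to 16 codons encoding 10 amino acids, matching the figures quoted in the abstract.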
15.
In many unicellular organisms, invertebrates, and plants, synonymous codon usage biases result from a coadaptation between
codon usage and tRNA abundance to optimize the efficiency of protein synthesis. However, it remains unclear whether natural
selection acts at the level of the speed or the accuracy of mRNA translation. Here we show that codon usage can improve the
fidelity of protein synthesis in multicellular species. As predicted by the model of selection for translational accuracy,
we find that the frequency of codons optimal for translation is significantly higher at codons encoding conserved amino
acids than at codons encoding for nonconserved amino acids in 548 genes compared between Caenorhabditis elegans and Homo sapiens. Although this model predicts that codon bias correlates positively with gene length, a negative correlation between codon
bias and gene length has been observed in eukaryotes. This suggests that selection for fidelity of protein synthesis is not
the main factor responsible for codon biases. The relationship between codon bias and gene length remains unexplained. Exploring
the differences in gene expression process in eukaryotes and prokaryotes should provide new insights to understand this key
question of codon usage.
Received: 18 June 2000 / Accepted: 10 November 2000
16.
It is now well-established that compositional bias in DNA sequences can adversely affect phylogenetic analysis based on those
sequences. Phylogenetic analyses based on protein sequences are generally considered to be more reliable than those derived
from the corresponding DNA sequences because it is believed that the use of encoded protein sequences circumvents the problems
caused by nucleotide compositional biases in the DNA sequences. There exists, however, a correlation between AT/GC bias at
the nucleotide level and content of AT- and GC-rich codons and their corresponding amino acids. Consequently, protein sequences
can also be affected secondarily by nucleotide compositional bias. Here, we report that DNA bias not only may affect phylogenetic
analysis based on DNA sequences, but also drives a protein bias which may affect analyses based on protein sequences. We present
a striking example where common phylogenetic tools fail to recover the correct tree from complete animal mitochondrial protein-coding
sequences. The data set is very extensive, containing several thousand sites per sequence, and the incorrect phylogenetic
trees are statistically very well supported. Additionally, neither the use of the LogDet/paralinear transform nor removal
of positions in the protein alignment with AT- or GC-rich codons allowed recovery of the correct tree. Two taxa with a large
compositional bias continually group together in these analyses, despite a lack of close biological relatedness. We conclude
that even protein-based phylogenetic trees may be misleading, and we advise caution in phylogenetic reconstruction using protein
sequences, especially those that are compositionally biased.
Received: 19 February 1998 / Accepted: 28 August 1998
17.
Codon Usage in Plastid Genes Is Correlated with Context, Position Within the Gene, and Amino Acid Content (cited by: 5; self-citations: 0; citations by others: 5)
Highly expressed plastid genes display codon adaptation, which is defined as a bias toward a set of codons which are complementary
to abundant tRNAs. This type of adaptation is similar to what is observed in highly expressed Escherichia coli genes and is probably the result of selection to increase translation efficiency. In the current work, the codon adaptation
of plastid genes is studied with regard to three specific features that have been observed in E. coli and which may influence translation efficiency. These features are (1) a relatively low codon adaptation at the 5′ end of
highly expressed genes, (2) an influence of neighboring codons on codon usage at a particular site (codon context), and (3)
a correlation between the level of codon adaptation of a gene and its amino acid content. All three features are found in
plastid genes. First, highly expressed plastid genes have a noticeable decrease in codon adaptation over the first 10–20 codons.
Second, for the twofold degenerate NNY codon groups, highly expressed genes have an overall bias toward the NNC codon, but
this is not observed when the 3′ neighboring base is a G. At these sites highly expressed genes are biased toward NNT instead
of NNC. Third, plastid genes that have higher codon adaptations also tend to have an increased usage of amino acids with a
high G + C content at the first two codon positions and GNN codons in particular. The correlation between codon adaptation
and amino acid content exists separately for both cytosolic and membrane proteins and is not related to any obvious functional
property. It is suggested that at certain sites selection discriminates between nonsynonymous codons based on translational,
not functional, differences, with the result that the amino acid sequence of highly expressed proteins is partially influenced
by selection for increased translation efficiency.
Received: 21 July 1999 / Accepted: 5 November 1999
18.
A survey of the patterns of synonymous codon preference in the HIV env gene reveals a correlation between the codon bias and the mutability requirements of different regions of the protein. At
hypervariable regions in gp120 one finds a greater proportion of codons that tend to mutate nonsynonymously, but to a target
that is similar in hydrophobicity and volume. We argue that this strategy results from a compromise between the selective
pressure placed on the virus by the induced immune response, which favors amino acid substitutions in the complementarity
determining regions, and the negative selection against missense mutations that violate structural constraints of the env protein.
Received: 9 June 1997 / Accepted: 25 May 1998
19.
The Nonrandom Location of Synonymous Codons Suggests That Reading Frame-Independent Forces Have Patterned Codon Preferences (cited by: 6; self-citations: 0; citations by others: 6)
Biased codon usage is common in eukaryotic and prokaryotic genes. Evidence from Escherichia, Saccharomyces, and Drosophila indicates that it favors translational efficiency and accuracy. However, to date no functional advantages have been identified
in the codon–anticodon interactions involving the most frequently used (preferred) codons. Here we present evidence that forces
not related to the individual codon–anticodon interaction may be involved in determining which synonymous codons are preferred
or avoided. We show that the "off-frame" trinucleotide motif preferences inferrable from Drosophila coding regions are often in the same direction as Drosophila's "in-frame" codon preferences, i.e., its codon usage. The off-frame preferences were inferred from the nonrandomness of
the location of confamilial synonymous codons along coding regions—a pattern often described as a context dependence of nucleotide
choice at synonymous positions or as codon-pair bias. We relied on randomizations of the location of confamilial codons that
do not alter, and cannot be influenced by, the encoded amino acid sequences, codon usage, or base composition of the genes
examined. The statistically significant congruency of in-frame and off-frame trinucleotide preferences suggests that the same
kind of reading-frame-independent force(s) may also influence synonymous codon choice. These forces may have produced biases
in codon usage that then led to the evolution of the translational advantages of these motifs as preferred codons. Under this
scenario, tRNA pool size differences between preferred and nonpreferred codons initially were evolved to track the default
overrepresentation of codons with preferred motifs. The motif preference hypothesis can explain the structuring of codon preferences
and the similarities in the codon usages of distantly related organisms.
Received: 10 November 1998 / Accepted: 23 February 1999
20.
Positive Darwinian Selection Promotes Heterogeneity Among Members of the Antifreeze Protein Multigene Family (cited by: 9; self-citations: 0; citations by others: 9)
A variety of organisms have independently evolved proteins exhibiting antifreeze activity that allows survival at subfreezing
temperatures. The antifreeze proteins (AFPs) bind ice nuclei and depress the freezing point by a noncolligative adsorption–inhibition
mechanism. Many organisms have a heterogeneous suite of AFPs with variation in primary sequence between paralogous loci. Here,
we demonstrate that the diversification of the AFP paralogues is promoted by positive Darwinian selection in two independently
evolved AFPs from fish and beetle. First, we demonstrate an elevated rate of nonsynonymous substitutions compared to synonymous
substitutions in the mature protein coding region. Second, we perform phylogeny-based tests of selection to demonstrate a
subset of codons is subjected to positive selection. When mapped onto the three-dimensional structure of the fish type III
antifreeze protein, these codons correspond to amino acid positions that surround but do not interrupt the putative
ice-binding surface. The selective agent may be related to efficient binding to diverse ice surfaces or some other aspect
of AFP function.
Received: 27 February 2001 / Accepted: 12 September 2001