Similar literature
20 similar documents found
1.
Wall DP, Herbeck JT. Journal of Molecular Evolution 2003, 56(6): 673-88; discussion 689-90
In this study we reconstruct the evolution of codon usage bias in the chloroplast gene rbcL using a phylogeny of 92 green-plant taxa. We employ a measure of codon usage bias that accounts for chloroplast genomic nucleotide content, as an attempt to limit plausible explanations for patterns of codon bias evolution to selection- or drift-based processes. This measure uses maximum likelihood-ratio tests to compare the performance of two models, one in which a single codon is overrepresented and one in which two codons are overrepresented. The measure allowed us to analyze both the extent of bias in each lineage and the evolution of codon choice across the phylogeny. Despite predictions based primarily on the low G + C content of the chloroplast and the high functional importance of rbcL, we found large differences in the extent of bias, suggesting differential molecular selection that is clade specific. The seed plants and simple leafy liverworts each independently derived a low level of bias in rbcL, perhaps indicating relaxed selectional constraint on molecular changes in the gene. Overrepresentation of a single codon was typically plesiomorphic, and transitions to overrepresentation of two codons occurred commonly across the phylogeny, possibly indicating biochemical selection. The total codon bias in each taxon, when regressed against the total bias of each amino acid, suggested that twofold amino acids play a strong role in inflating the level of codon usage bias in rbcL, despite the fact that twofolds compose a minority of residues in this gene. Those amino acids that contributed most to the total codon usage bias of each taxon are known through amino acid knockout and replacement to be of high functional importance. This suggests that codon usage bias may be constrained by particular amino acids and, thus, may serve as a good predictor of what residues are most important for protein fitness.

2.
Abstract identical to entry 1. Present address (Joshua T. Herbeck): JBP Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543, USA

3.
Codon usage patterns in cytochrome oxidase I across multiple insect orders   Total citations: 2 (self-citations: 0; citations by others: 2)
Synonymous codon usage bias is determined by a combination of mutational biases, selection at the level of translation, and genetic drift. In a study of mtDNA in insects, we analyzed patterns of codon usage across a phylogeny of 88 insect species spanning 12 orders. We employed a likelihood-based method for estimating levels of codon bias and determining major codon preference that removes the possible effects of genome nucleotide composition bias. Three questions are addressed: (1) How variable are codon bias levels across the phylogeny? (2) How variable are major codon preferences? and (3) Are there phylogenetic constraints on codon bias or preference? There is high variation in the level of codon bias values among the 88 taxa, but few readily apparent phylogenetic patterns. Bias level shifts within the lepidopteran genus Papilio are most likely a result of population size effects. Shifts in major codon preference occur across the tree in all of the amino acids in which there was bias of some level. The vast majority of changes involve double-preference models, however, and shifts between single preferred codons within orders occur only 11 times. These shifts among codons in double-preference models are phylogenetically conservative.

4.
We propose that the amino acid residues 57/58 and 60/61 of eukaryotic release factors (eRF1s) (counted from the N-terminal Met of human eRF1) are responsible for stop codon recognition in protein synthesis. The proposal is based on amino acid exchanges in these positions in the eRF1s of two ciliates that reassigned one or two stop codons to sense codons in evolution and on the crystal structure of human eRF1. The proposed mechanism of stop codon recognition assumes that the amino acid residues 57/58 interact with the second and the residues 60/61 with the third position of a stop codon. The fact that conventional eRF1s recognize all three stop codons but not the codon for tryptophan is attributed to the flexibility of the helix containing these residues. We suggest that the helix is able to assume a partly relaxed or tight conformation depending on the stop codon recognized. The restricted codon recognition observed in organisms with unconventional eRF1s is attributed mainly to the loss of flexibility of the helix due to exchanged amino acids.

5.
The 655 bp cytochrome c oxidase subunit I barcode region of single specimens of 388 species of fishes (four Holocephali, 61 Elasmobranchii and 323 Actinopterygii) was examined. All but two (Urolophus cruciatus and Urolophus sufflavus) showed different cox1 nucleotide sequences (99.5% species discrimination); the two that could not be resolved are suspected to hybridize. Most of the power of cox1 nucleotide sequence analysis for species identification comes from the degenerate nature of the genetic code and the highly variable nature of the third codon position of amino acids. Variation at the third codon position is bimodally distributed, and the more variable mode is dominated by amino acids with four or six codons, while the less variable mode is dominated by amino acids with two codons. The ratio of nonsynonymous to synonymous changes is much less than one, indicating that this gene is subject to strong purifying selection. Consequently, cox1 amino acid sequence diversity is much less than nucleotide sequence diversity and has very poor species resolution power. Fourteen of the 16 amino acid residues recognized as having important functions in the region of cox1 sequenced were completely conserved over all 388 species (and the bovine cox1 sequence), with one fish species varying at one of these sites, and three fish at another site. No significant differences in amino acid conservation were observed between residues in helices, strands and turns. Patterns of nucleotide and amino acid variability were very similar between elasmobranchs and actinopterygians.

6.
Are intron positions correlated with regions of high amino acid conservation? For a set of ancient conserved proteins, with intronless prokaryotic but intron-containing eukaryotic homologs, multiple sequence alignments identified residues invariant throughout evolution. Intron positions between codons show no preferences. However, introns lying after the first base of a codon prefer conserved regions, markedly in glycines. Because glycines are in excess in conserved regions, this behavior could reflect phase-one introns entering glycine residues randomly in the ancestral sequences. Examination of intron positions within codons of evolutionarily invariable amino acids showed that roughly 50% of these introns are bordered by guanines at both 5'- and 3'-ends, 25% have a G only before the intron, and 5% have a G only after the intron, whereas about 20% are bordered by nonguanine bases.

7.
This work describes the molecular characterization of the cytochrome c oxidase subunit I (COI) gene of the mitochondrial DNA from three species of great medical and veterinary importance: the horn fly, Haematobia irritans, the stable fly, Stomoxys calcitrans and the house fly, Musca domestica (Diptera: Muscidae) (Linnaeus). The nucleotide sequence in all species was 1536 bp in size and coded for a 512 amino acid peptide. The nucleotide bias for an A+T-rich sequence is linked to three features: a high A+T content throughout the entire gene, a high A+T content in the third codon position, and a predominance of A+T-rich codons. An anomalous TCG (serine) start codon was identified. Comparative analysis among members of the Muscidae, Scatophagidae, Calliphoridae and Drosophilidae showed high levels of nucleotide sequence conservation. Analysis of the divergent amino acids and COI protein topologies among these three Muscidae species agreed with the evolutionary model suggested for the insect mitochondrial COI protein. The characterization of the structure and evolution of this gene could be informative for further evolutionary analysis of dipteran species.

8.
Selection on Codon Usage for Error Minimization at the Protein Level   Total citations: 1 (self-citations: 0; citations by others: 1)
Given the structure of the genetic code, synonymous codons differ in their capacity to minimize the effects of errors due to mutation or mistranslation. I suggest that this may lead, in protein-coding genes, to a preference for codons that minimize the impact of errors at the protein level. I develop a theoretical measure of error minimization for each codon, based on amino acid similarity. This measure is used to calculate the degree of error minimization for 82 genes of Drosophila melanogaster and 432 rodent genes and to study its relationship with CG content, the degree of codon usage bias, and the rate of nucleotide substitution. I show that (i) Drosophila and rodent genes tend to prefer codons that minimize errors; (ii) this cannot be merely the effect of mutation bias; (iii) the degree of error minimization is correlated with the degree of codon usage bias; (iv) the amino acids that contribute more to codon usage bias are the ones for which synonymous codons differ more in the capacity to minimize errors; and (v) the degree of error minimization is correlated with the rate of nonsynonymous substitution. These results suggest that natural selection for error minimization at the protein level plays a role in the evolution of coding sequences in Drosophila and rodents. [Reviewing Editor: Dr. Massimo Di Giulio]

9.
To reveal how the AT-rich genome of bacteriophage PhiKZ has been shaped in order to carry out its growth in the GC-rich host Pseudomonas aeruginosa, synonymous codon and amino acid usage bias of PhiKZ was investigated and the data were compared with those of P. aeruginosa. It was found that synonymous codon and amino acid usage of PhiKZ was distinct from that of P. aeruginosa. In contrast to P. aeruginosa, the third codon position of the synonymous codons of PhiKZ carries mostly an A or T base; codon usage bias in PhiKZ is dictated mainly by mutational bias and, to a lesser extent, by translational selection. A cluster analysis of the relative synonymous codon usage values of 16 myoviruses including PhiKZ shows that PhiKZ is evolutionarily much closer to Escherichia coli phage T4. Further analysis reveals that the three factors of mean molecular weight, aromaticity and cysteine content are mostly responsible for the variation of amino acid usage in PhiKZ proteins, whereas amino acid usage of P. aeruginosa proteins is mainly governed by grand average of hydropathicity, aromaticity and cysteine content. Based on these observations, we suggest that codons of the phage-like PhiKZ have evolved to preferentially incorporate the smaller amino acid residues into their proteins during translation, thereby economizing the cost of its development in GC-rich P. aeruginosa.
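The abstract above clusters genomes by their relative synonymous codon usage (RSCU) values. As a rough illustration only, the sketch below uses the common definition of RSCU (observed codon count divided by the count expected if all synonymous codons of that amino acid were used equally). The function name `rscu`, the use of the standard genetic code, and the handling of incomplete trailing codons are assumptions of this sketch, not details taken from the study.

```python
from collections import Counter

# Standard genetic code in TCAG order (assumed; the study may use another table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(seq):
    """Return {codon: RSCU} for every codon of each amino acid present in seq."""
    seq = seq.upper().replace("U", "T")
    # Split into complete codons; a trailing partial codon is ignored.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # Count sense codons only; unknown triplets and stop codons are skipped.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")
    values = {}
    for aa in set(CODON_TABLE.values()) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined, so omit it
        for c in family:
            # Observed count relative to the equal-usage expectation total / len(family).
            values[c] = counts[c] * len(family) / total
    return values
```

A value of 1.0 means a codon is used exactly as often as expected under equal usage; values above 1.0 indicate over-representation within its synonymous family.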

10.
Ascaridoid nematodes parasitize the gastrointestinal tract of vertebrate definitive hosts and are represented by more than 50 described genera. We used 582 nucleotides (83% of the coding sequence) of the mitochondrial gene cytochrome oxidase subunit 2, in combination with published small- and large-subunit nuclear rDNA sequences (2,557 characters) and morphological data (20 characters), to produce a phylogenetic hypothesis for representatives of this superfamily. This combined evidence phylogeny strongly supported clades that, with 1 exception, were consistent with Fagerholm's 1991 classification. Parsimony mapping of character states on the combined evidence tree was used to develop hypotheses for the evolution of morphological, life history, and amino acid characters. This analysis of character evolution revealed that certain key features that have been used by previous workers for developing taxonomic and evolutionary hypotheses represent plesiomorphic states. Cytochrome oxidase subunit 2 nucleotides show a strong compositional bias to A+T and a substitution bias to thymine. These biases are most apparent at third positions of codons and 4-fold degenerate sites, which is consistent with the nonrandom substitution pattern of A+T pressure. Despite nucleotide bias, cytochrome oxidase amino acid sequences show conservation and retention of critical functional residues, as inferred from comparisons to other organisms.

11.
Unequal use of synonymous codons has been found in several prokaryotic and eukaryotic genomes. This bias has been associated with translational efficiency. The prevalence of this bias across lineages is currently unknown. Here, a new method (GCB) to measure codon usage bias is presented. It uses an iterative approach for the determination of codon scores and allows the computation of an index of codon bias suitable for interspecies comparison. A server to calculate GCB-values of individual genes as well as a list of compiled results are available at . The method was applied to complete bacterial genomes. The relation of codon usage bias with amino acid composition and the choice of stop codons were determined and discussed.

12.
The 'effective number of codons' used in a gene   Total citations: 64 (self-citations: 0; citations by others: 64)
Wright F. Gene 1990, 87(1): 23-29
A simple measure is presented that quantifies how far the codon usage of a gene departs from equal usage of synonymous codons. This measure of synonymous codon usage bias, the 'effective number of codons used in a gene', Nc, can be easily calculated from codon usage data alone, and is independent of gene length and amino acid (aa) composition. Nc can take values from 20, in the case of extreme bias where one codon is exclusively used for each aa, to 61 when the use of alternative synonymous codons is equally likely. Nc thus provides an intuitively meaningful measure of the extent of codon preference in a gene. Codon usage patterns across genes can be investigated by the Nc-plot: a plot of Nc vs. G + C content at synonymous sites. Nc-plots are produced for Homo sapiens, Saccharomyces cerevisiae, Escherichia coli, Bacillus subtilis, Dictyostelium discoideum, and Drosophila melanogaster. A FORTRAN77 program written to calculate Nc is available on request.
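The abstract gives the range of Nc (20 to 61) but not the formula. The sketch below follows the form in which the measure is usually stated: for each amino acid a homozygosity F = (n Σp² − 1)/(n − 1) is computed from its codon frequencies p, F is averaged within each degeneracy class, and Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6. The function name, the use of the standard genetic code, the refusal to handle missing degeneracy classes, and the cap at 61 are assumptions of this sketch; it is not the FORTRAN77 program mentioned above.

```python
from collections import Counter, defaultdict

# Standard genetic code in TCAG order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def effective_number_of_codons(seq):
    """Estimate Nc for a coding sequence (a sketch, not a reference implementation)."""
    seq = seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    # Group sense codons into synonymous families.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    # Homozygosity F per amino acid, collected by degeneracy class.
    homozygosity = defaultdict(list)
    for family in families.values():
        n = sum(counts[c] for c in family)
        if len(family) == 1 or n < 2:
            continue  # Met/Trp carry no choice; F is undefined for n < 2
        sum_p2 = sum((counts[c] / n) ** 2 for c in family)
        homozygosity[len(family)].append((n * sum_p2 - 1) / (n - 1))

    nc = 2.0  # Met and Trp each contribute exactly one codon
    for degeneracy, n_amino_acids in ((2, 9), (3, 1), (4, 5), (6, 3)):
        values = homozygosity.get(degeneracy)
        if not values:
            raise ValueError(f"no usable {degeneracy}-fold amino acid in sequence")
        mean_f = sum(values) / len(values)
        if mean_f <= 0:
            raise ValueError(f"non-positive homozygosity for {degeneracy}-fold class")
        nc += n_amino_acids / mean_f
    # Small samples can push the estimate above the theoretical maximum.
    return min(nc, 61.0)
```

With one codon used exclusively per amino acid every F equals 1 and the function returns 20; with short, evenly used sequences the raw estimate exceeds 61 and is capped, which is why the cap is included here.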

13.
Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability.

14.
The cytochrome c oxidase subunit 2 gene (COII) encodes a highly conserved protein that is directly responsible for the initial transfer of electrons from cytochrome c to cytochrome c oxidase (COX) crucial to the production of ATP during cellular respiration. Despite its integral role in electron transport, we have observed extensive intraspecific nucleotide and amino acid variation among 26 full-length COII sequences sampled from seven populations of the marine copepod, Tigriopus californicus. Although intrapopulation divergence was virtually nonexistent, interpopulation divergence at the COII locus was nearly 20% at the nucleotide level, including 38 nonsynonymous substitutions. Given the high degree of interaction between the cytochrome c oxidase subunit 2 protein (COX2) and the nuclear-encoded subunits of COX and cytochrome c (CYC), we hypothesized that some codons in the COII gene are likely to be under positive selection in order to compensate for amino acid substitutions in other subunits. Estimates of the ratio of nonsynonymous to synonymous substitution (ω), obtained using a series of maximum likelihood models of codon substitution, indicated that the majority of codons in T. californicus COII are under strong purifying selection (ω << 1), while approximately 4% of the sites in this gene appear to evolve under relaxed selective constraint (ω = 1). A branch-site maximum likelihood model identified three sites that may have experienced positive selection within the central California sequence clade in our COII phylogeny; these results are consistent with previous studies showing functional and fitness consequences among interpopulation hybrids between central and northern California populations. [Reviewing Editor: Dr. Willie Swanson]

15.
All established methods for detecting positive selection at the molecular level rely on comparisons between nucleotide sequences. An exceptional method that purports to detect selection on the basis of a single genomic sequence has recently been proposed. This method uses a measure called "codon volatility," defined for each codon as the ratio between the number of nonsynonymous codons that differ from the codon under study at a single nucleotide position and the number of sense codons that differ from the codon under study at a single nucleotide position. Here, we examine various properties of codon volatility and its derivatives and use simulation of evolutionary processes to determine whether they can be used to detect selective pressures. Codons for only four amino acids (glycine, leucine, arginine, and serine) show any variation in codon volatility. Thus, codon volatility is mainly a proxy for amino acid usage, rather than for codon usage, with 65% of all synonymous changes and 27% of all nonsynonymous changes being undetectable by this measure. Genes identified by the volatility method as being subject to positive selection tend to have idiosyncratic amino acid compositions (e.g., they are glycine rich or arginine poor). An additional property of codon volatility is the near zero variance of its mean expectation, which translates into overestimated statistical significance estimates, especially in the absence of corrections for multiple comparisons. A comparison with measures of selection inferred through comparative methodology reveals no relationship between the results of the two methods. Finally, we show that codon volatility can increase in the absence of positive Darwinian selection; that is, increased codon volatility is not indicative of positive selection.
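The definition of codon volatility quoted above is concrete enough to compute directly. The following sketch applies it to the standard genetic code (an assumption; the abstract does not name the code table) and then lists the amino acids whose synonymous codons differ in volatility. The function names are hypothetical, and exact fractions are used so that equal ratios compare equal.

```python
from fractions import Fraction

# Standard genetic code in TCAG order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def volatility(codon):
    """Nonsynonymous single-base neighbours divided by sense single-base neighbours."""
    aa = CODON_TABLE[codon]
    sense = nonsynonymous = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            neighbour_aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if neighbour_aa == "*":
                continue  # stop codons are not sense codons
            sense += 1
            if neighbour_aa != aa:
                nonsynonymous += 1
    return Fraction(nonsynonymous, sense)


def amino_acids_with_variable_volatility():
    """Amino acids (one-letter codes) whose codons do not all share one volatility."""
    by_amino_acid = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            by_amino_acid.setdefault(aa, set()).add(volatility(codon))
    return {aa for aa, values in by_amino_acid.items() if len(values) > 1}
```

Under these assumptions `amino_acids_with_variable_volatility()` returns the one-letter codes G, L, R and S, matching the four amino acids named in the abstract; for example, GGA has volatility 5/8 while GGG has 2/3 because GGA has a stop-codon neighbour (TGA).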

16.

Background

In a previous study of higher-level arthropod phylogeny, analyses of nucleotide sequences from 62 protein-coding nuclear genes for 80 panarthropod species yielded significantly higher bootstrap support for selected nodes than did amino acids. This study investigates the cause of that discrepancy.

Methodology/Principal Findings

The hypothesis is tested that failure to distinguish the serine residues encoded by two disjunct clusters of codons (TCN, AGY) in amino acid analyses leads to this discrepancy. In one test, the two clusters of serine codons (Ser1, Ser2) are conceptually translated as separate amino acids. Analysis of the resulting 21-amino-acid data matrix shows striking increases in bootstrap support, in some cases matching that in nucleotide analyses. In a second approach, nucleotide and 20-amino-acid data sets are artificially altered through targeted deletions, modifications, and replacements, revealing the pivotal contributions of distinct Ser1 and Ser2 codons. We confirm that previous methods of coding nonsynonymous nucleotide change are robust and computationally efficient by introducing two new degeneracy coding methods. We demonstrate for degeneracy coding that neither compositional heterogeneity at the level of nucleotides nor codon usage bias between Ser1 and Ser2 clusters of codons (or their separately coded amino acids) is a major source of non-phylogenetic signal.

Conclusions

The incongruity in support between amino-acid and nucleotide analyses of the aforementioned arthropod data set is resolved by showing that “standard” 20-amino-acid analyses yield lower node support specifically when serine provides crucial signal. Separate coding of Ser1 and Ser2 residues yields support commensurate with that found by degenerated nucleotides, without introducing phylogenetic artifacts. While exclusion of all serine data leads to reduced support for serine-sensitive nodes, these nodes are still recovered in the ML topology, indicating that the enhanced signal from Ser1 and Ser2 is not qualitatively different from that of the other amino acids.
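The 21-state recoding described in this entry can be illustrated with a small translation routine. In the sketch below, serine encoded by TCN keeps the usual symbol S and serine encoded by AGY is written as Z; the choice of Z, the function name, the assignment of Ser1 to TCN and Ser2 to AGY (inferred from the order in which the abstract lists them), and the use of X for unrecognised triplets are all assumptions made for illustration.

```python
# Standard genetic code in TCAG order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SER2_SYMBOL = "Z"  # hypothetical label for AGY-encoded serine


def translate_21_state(seq):
    """Translate a coding sequence, keeping TCN-serine and AGY-serine distinct."""
    seq = seq.upper().replace("U", "T")
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon, "X")  # X marks ambiguous or unknown codons
        if aa == "S" and codon.startswith("AG"):
            aa = SER2_SYMBOL  # Ser2 cluster (AGT, AGC)
        residues.append(aa)
    return "".join(residues)


print(translate_21_state("TCTAGCAGATCG"))  # Ser1, Ser2, Arg, Ser1
```

Keeping the two serine clusters apart matters because no single nucleotide change converts a TCN codon into an AGY codon, so a shared symbol hides substitutions that the nucleotide data still record.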

17.
Genome-wide analysis of sequence divergence patterns in 12,024 human-mouse orthologous pairs reveals, for the first time, that the trends in nucleotide and amino acid substitutions in orthologs of high and low GC composition are highly asymmetric and polarized to opposite directions. The entire dataset has been divided into three groups on the basis of the GC content at third codon sites of human genes: high, medium, and low. High-GC orthologs exhibit significant bias in favor of the replacements, Thr --> Ala, Ser --> Ala, Val --> Ala, Lys --> Arg, Asn --> Ser, Ile --> Val etc., from mouse to human, whereas in low-GC orthologs, the reverse trends prevail. In general, in the high-GC group, residues encoded by A/U-rich codons of mouse proteins tend to be replaced by the residues encoded by relatively G/C-rich codons in their human orthologs, whereas the opposite trend is observed among the low-GC orthologous pairs. The medium-GC group shares some trends with high-GC group and some with low-GC group. The only significant trend common in all groups of orthologs, irrespective of their GC bias, is (Asp)(Mouse) --> (Glu)(Human) replacement. At the nucleotide level, high-GC orthologs have undergone a large excess of (A/T)(Mouse) --> (G/C)(Human) substitutions over (G/C)(Mouse) --> (A/T)(Human) at each codon position, whereas for low-GC orthologs, the reverse is true.

18.
We present the nucleotide sequence of the tolC gene of Escherichia coli K12, and the amino acid sequence of the TolC protein (an outer membrane protein) as deduced from it. The mature TolC protein comprises 467 amino acid residues, and, as previously reported (1), a signal sequence of 22 amino acid residues is attached to the N-terminus. The C-terminus of the gene is followed by a stem-loop structure (8 base pair stem, 4 base loop) which may be a rho-independent termination signal. The codon usage of the gene is nonrandom; the major isoaccepting species of tRNA are preferentially utilised, or, among synonymous codons recognized by the same tRNA, those codons are used which can interact better with the anticodon (2,3). In contrast to the codon usage for other outer membrane proteins of E. coli (4) the rare arginine codons AGA and AGG are used once and twice respectively.

19.
Lavner Y, Kotlar D. Gene 2005, 345(1): 127-138
We study the interrelations between tRNA gene copy numbers, gene expression levels and measures of codon bias in the human genome. First, we show that isoaccepting tRNA gene copy numbers correlate positively with expression-weighted frequencies of amino acids and codons. Using expression data of more than 14,000 human genes, we show a weak positive correlation between gene expression level and frequency of optimal codons (codons with highest tRNA gene copy number). Interestingly, contrary to non-mammalian eukaryotes, codon bias tends to be high in both highly expressed genes and lowly expressed genes. We suggest that selection may act on codon bias, not only to increase elongation rate by favoring optimal codons in highly expressed genes, but also to reduce elongation rate by favoring non-optimal codons in lowly expressed genes. We also show that the frequency of optimal codons is in positive correlation with estimates of protein biosynthetic cost, and suggest another possible action of selection on codon bias: preference of optimal codons as production cost rises, to reduce the rate of amino acid misincorporation. In the analyses of this work, we introduce a new measure of frequency of optimal codons (FOP'), which is unaffected by amino acid composition and is corrected for background nucleotide content; we also introduce a new method for computing expected codon frequencies, based on the dinucleotide composition of the introns and the non-coding regions surrounding a gene.

20.
Palidwor GA, Perkins TJ, Xia X. PLoS ONE 2010, 5(10): e13431

Background

In spite of extensive research on the effect of mutation and selection on codon usage, a general model of codon usage bias due to mutational bias has been lacking. Because most amino acids allow synonymous GC content changing substitutions in the third codon position, the overall GC bias of a genome or genomic region is highly correlated with GC3, a measure of third position GC content. For individual amino acids as well, G/C ending codons usage generally increases with increasing GC bias and decreases with increasing AT bias. Arginine and leucine, amino acids that allow GC-changing synonymous substitutions in the first and third codon positions, have codons which may be expected to show different usage patterns.

Principal Findings

In analyzing codon usage bias in hundreds of prokaryotic and plant genomes and in human genes, we find that two G-ending codons, AGG (arginine) and TTG (leucine), unlike all other G/C-ending codons, show overall usage that decreases with increasing GC bias, contrary to the usual expectation that G/C-ending codon usage should increase with increasing genomic GC bias. Moreover, the usage of some codons appears nonlinear, even nonmonotone, as a function of GC bias. To explain these observations, we propose a continuous-time Markov chain model of GC-biased synonymous substitution. This model correctly predicts the qualitative usage patterns of all codons, including nonlinear codon usage in isoleucine, arginine and leucine. The model accounts for 72%, 64% and 52% of the observed variability of codon usage in prokaryotes, plants and human respectively. When codons are grouped based on common GC content, 87%, 80% and 68% of the variation in usage is explained for prokaryotes, plants and human respectively.

Conclusions

The model clarifies the sometimes-counterintuitive effects that GC mutational bias can have on codon usage, quantifies the influence of GC mutational bias and provides a natural null model relative to which other influences on codon bias may be measured.
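This entry relies on GC3, the G+C content at third codon positions. As a minimal sketch of that quantity, assuming an in-frame coding sequence and counting every complete codon (some published definitions exclude Met, Trp and stop codons, which this version does not), it could be computed as follows. The function name `gc3` is hypothetical.

```python
def gc3(seq):
    """Fraction of complete codons whose third base is G or C."""
    seq = seq.upper().replace("U", "T")
    # Third positions of complete codons only: indices 2, 5, 8, ...
    third_bases = [seq[i + 2] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


print(gc3("ATGGCCAAA"))  # third bases are G, C, A
```

Plotting the usage of a single codon against a per-gene or per-genome value like this one is the kind of comparison in which the abstract reports that AGG and TTG behave unlike other G/C-ending codons.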
