Similar literature
1.
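Weights are assigned to the change at each position when a codon other than a terminator mutates into a terminator under the amino acid coding method, and matrices are used to represent all mutation modes and their degree of difficulty. Taking hydrophilic and hydrophobic physicochemical properties into account, a sub-amino-acid coding method is proposed, together with the relative usage of synonymous codons under this coding (Subtypes Relative Synonymous Codon Usage, SRSCU). Fifteen H5N1 sequences are then selected, their homology is analysed with MEGA4.0, and the codon usage preference of the selected sequences is studied under three settings: amino acid coding, pseudo-amino-acid coding and sub-amino-acid coding. Comparison of the results confirms that the sub-amino-acid coding method has corresponding advantages.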
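As a point of reference for the usage measure mentioned above, the following is a minimal sketch of the standard relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons of that amino acid were used equally. It does not reproduce the SRSCU variant of the abstract, whose sub-amino-acid grouping is not specified here; the function name `rscu` and the use of the standard genetic code are assumptions made for illustration.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for every sense codon whose amino acid occurs in cds.

    RSCU = observed count * (number of synonymous codons) / (total count of
    all synonymous codons for that amino acid). Hypothetical helper, not the
    SRSCU measure of the abstract.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    result = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons are excluded
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        result[codon] = counts[codon] * len(family) / total
    return result
```

A value of 1 means a codon is used exactly as often as expected under equal usage; values above 1 indicate preference. Within one amino acid the RSCU values sum to the number of synonymous codons.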

2.
The genome of the social amoeba Dictyostelium discoideum is known to have a very high density of microsatellite repeats, including thousands of triplet microsatellite repeats in coding regions that apparently code for long runs of single amino acids. We used a mutation accumulation study to see if unusually high microsatellite mutation rates contribute to this pattern. There was a modest bias toward mutations that increase repeat number, but because upward mutations were smaller than downward ones, this did not lead to a net average increase in size. Longer microsatellites had higher mutation rates than shorter ones, but did not show greater directional bias. The most striking finding is that the overall mutation rate is the lowest reported for microsatellites: approximately 1 × 10⁻⁶ for 10 dinucleotide loci and 6 × 10⁻⁶ for 52 trinucleotide loci (which were longer). High microsatellite mutation rates therefore do not explain the high incidence of microsatellites. The causal relation may in fact be reversed, with low mutation rates evolving to protect against deleterious fitness effects of mutation at the numerous microsatellites.

3.
Selection on Codon Usage for Error Minimization at the Protein Level (total citations: 1; self-citations: 0; citations by others: 1)
Given the structure of the genetic code, synonymous codons differ in their capacity to minimize the effects of errors due to mutation or mistranslation. I suggest that this may lead, in protein-coding genes, to a preference for codons that minimize the impact of errors at the protein level. I develop a theoretical measure of error minimization for each codon, based on amino acid similarity. This measure is used to calculate the degree of error minimization for 82 genes of Drosophila melanogaster and 432 rodent genes and to study its relationship with CG content, the degree of codon usage bias, and the rate of nucleotide substitution. I show that (i) Drosophila and rodent genes tend to prefer codons that minimize errors; (ii) this cannot be merely the effect of mutation bias; (iii) the degree of error minimization is correlated with the degree of codon usage bias; (iv) the amino acids that contribute more to codon usage bias are the ones for which synonymous codons differ more in the capacity to minimize errors; and (v) the degree of error minimization is correlated with the rate of nonsynonymous substitution. These results suggest that natural selection for error minimization at the protein level plays a role in the evolution of coding sequences in Drosophila and rodents. Reviewing Editor: Dr. Massimo Di Giulio

4.
Palidwor GA, Perkins TJ, Xia X. PLoS ONE 2010, 5(10): e13431

Background

In spite of extensive research on the effect of mutation and selection on codon usage, a general model of codon usage bias due to mutational bias has been lacking. Because most amino acids allow synonymous GC-content-changing substitutions in the third codon position, the overall GC bias of a genome or genomic region is highly correlated with GC3, a measure of third position GC content. For individual amino acids as well, usage of G/C-ending codons generally increases with increasing GC bias and decreases with increasing AT bias. Arginine and leucine, amino acids that allow GC-changing synonymous substitutions in the first and third codon positions, have codons which may be expected to show different usage patterns.

Principal Findings

In analyzing codon usage bias in hundreds of prokaryotic and plant genomes and in human genes, we find that two G-ending codons, AGG (arginine) and TTG (leucine), unlike all other G/C-ending codons, show overall usage that decreases with increasing GC bias, contrary to the usual expectation that G/C-ending codon usage should increase with increasing genomic GC bias. Moreover, the usage of some codons appears nonlinear, even nonmonotone, as a function of GC bias. To explain these observations, we propose a continuous-time Markov chain model of GC-biased synonymous substitution. This model correctly predicts the qualitative usage patterns of all codons, including nonlinear codon usage in isoleucine, arginine and leucine. The model accounts for 72%, 64% and 52% of the observed variability of codon usage in prokaryotes, plants and human respectively. When codons are grouped based on common GC content, 87%, 80% and 68% of the variation in usage is explained for prokaryotes, plants and human respectively.

Conclusions

The model clarifies the sometimes-counterintuitive effects that GC mutational bias can have on codon usage, quantifies the influence of GC mutational bias and provides a natural null model relative to which other influences on codon bias may be measured.
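To make the non-monotone behaviour of a codon such as TTG concrete, here is a toy sketch rather than the authors' fitted model. It assumes that only synonymous single-nucleotide substitutions among the six leucine codons occur, and that each occurs at a rate proportional to the frequency of the incoming nucleotide (G and C each `gc/2`, A and T each `(1 - gc)/2`). Under those assumptions the chain satisfies detailed balance, so the equilibrium usage of a codon is proportional to the product of the nucleotide frequencies at its two variable positions. The function name and the parameter `gc` are hypothetical.

```python
def leucine_codon_equilibrium(gc):
    """Equilibrium usage of the six leucine codons under a toy GC-bias model.

    gc is the assumed probability that a mutation introduces G or C.
    Assumption: synonymous substitutions only, rate proportional to the
    frequency of the new nucleotide (a reversible chain).
    """
    if not 0.0 < gc < 1.0:
        raise ValueError("gc must lie strictly between 0 and 1")
    nuc = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    codons = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]
    # The middle base is always T, so only positions 1 and 3 contribute.
    weights = {c: nuc[c[0]] * nuc[c[2]] for c in codons}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}
```

In this toy model the usage of TTG works out to gc(1 - gc)/(1 + gc), which rises at low GC bias and falls at high GC bias, while CTG, at gc²/(1 + gc), increases throughout. That matches the qualitative pattern described in the abstract, but the numbers carry no empirical weight.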

5.
The codon bias in Escherichia coli for all two-fold degenerate amino acids was studied as a function of the context formed by the six bases in the nearest surrounding codons. By comparing the results in genes at different expression levels, effects that are due to differences in mutation rates can be distinguished from those that are due to selection. Selective effects on the codon bias are found mostly from the first neighbouring base in the 3' direction, while neighbouring bases further away influence mostly the mutational bias. In some cases it is also possible to identify specific molecular processes, repair or avoidance of frame shift, that lead to the context dependence of the bias.

6.
The genetic code is degenerate: most amino acids can be encoded by from two to as many as six different codons. The synonymous codons are not used with equal frequency: not only are some codons favored over others, but also their usage can vary significantly from species to species and between different genes in the same organism. Known causes of codon bias include differences in mutation rates as well as selection pressure related to the expression level of a gene, but the standard analysis methods can account for only a fraction of the observed codon usage variation. We here introduce an explicit model of codon usage bias, inspired by statistical physics. Combining this model with a maximum likelihood approach, we are able to clearly identify different sources of bias in various genomes. We have applied the algorithm to Saccharomyces cerevisiae as well as 325 prokaryote genomes, and in most cases our model explains essentially all observed variance.

7.
In this study we reconstruct the evolution of codon usage bias in the chloroplast gene rbcL using a phylogeny of 92 green-plant taxa. We employ a measure of codon usage bias that accounts for chloroplast genomic nucleotide content, as an attempt to limit plausible explanations for patterns of codon bias evolution to selection- or drift-based processes. This measure uses maximum likelihood-ratio tests to compare the performance of two models, one in which a single codon is overrepresented and one in which two codons are overrepresented. The measure allowed us to analyze both the extent of bias in each lineage and the evolution of codon choice across the phylogeny. Despite predictions based primarily on the low G+C content of the chloroplast and the high functional importance of rbcL, we found large differences in the extent of bias, suggesting differential molecular selection that is clade specific. The seed plants and simple leafy liverworts each independently derived a low level of bias in rbcL, perhaps indicating relaxed selectional constraint on molecular changes in the gene. Overrepresentation of a single codon was typically plesiomorphic, and transitions to overrepresentation of two codons occurred commonly across the phylogeny, possibly indicating biochemical selection. The total codon bias in each taxon, when regressed against the total bias of each amino acid, suggested that twofold amino acids play a strong role in inflating the level of codon usage bias in rbcL, despite the fact that twofolds compose a minority of residues in this gene. Those amino acids that contributed most to the total codon usage bias of each taxon are known through amino acid knockout and replacement to be of high functional importance. This suggests that codon usage bias may be constrained by particular amino acids and, thus, may serve as a good predictor of what residues are most important for protein fitness. Present address (Joshua T. Herbeck): JBP Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543, USA

8.
The genetic code has an inherent bias towards some amino acids because of the variable number of synonymous codons per amino acid. The extent to which these biases are expressed in protein secondary structure is described through the analysis of the overall amino acid compositions of the alpha-helix, beta-sheet, beta-turn and random coil segments elucidated by X-ray crystallography. Given the concept of neutral mutation in proteins, the allocation of synonyms in the genetic code appears to protect secondary structures from amino acid changes and discourages the appearance of chemically complex residues. The level of protection is similar for each structural form, despite their clear preferences for certain amino acids. The organization of the code is therefore relevant to the preservation of conformation seen in the evolution of many protein families.

9.
Genomics 2020, 112(2): 1319-1329
The NKX-2.5 gene is responsible for cardiac development, and its targeted disruption arrests cardiac development at the linear heart tube stage. Bioinformatic analysis was employed to investigate the codon usage pattern and dN/dS of the mammalian NKX-2.5 gene. The relative synonymous codon usage analysis revealed variation in codon usage, and two synonymous codons, namely ATA (Ile) and GTA (Val), were absent from the NKX-2.5 gene across the selected mammalian species, suggesting that these two codons were possibly selected against during evolution. Parity rule 2 analysis of two- and four-fold amino acids showed CT bias, whereas six-fold amino acids revealed GA bias. Neutrality analysis suggests that selection played a prominent role while mutation had a minor role. The dN/dS analysis suggests that synonymous substitution played a significant role and that it was negatively correlated with the p-distance of the gene. Purifying natural selection played a dominant role in the genetic evolution of the NKX-2.5 gene in mammals.

10.
The second parity rule states that, if there is no bias in mutation or selection, then within each strand of DNA complementary bases are present at approximately equal frequencies. In bacteria, however, there is commonly an excess of G (over C) and, to a lesser extent, T (over A) in the replicatory leading strand. The low G+C Firmicutes, such as Staphylococcus aureus, are unusual in displaying an excess of A over T on the leading strand. As mutation has been established as a major force in the generation of such skews across various bacterial taxa, this anomaly has been assumed to reflect unusual mutation biases in Firmicute genomes. Here we show that this is not the case and that mutation bias does not explain the atypical AT skew seen in S. aureus. First, recently arisen intergenic SNPs predict the classical replication-derived equilibrium enrichment of T relative to A, contrary to what is observed. Second, sites predicted to be under weak purifying selection display only weak AT skew. Third, AT skew is primarily associated with largely non-synonymous first and second codon sites and is seen with respect to their sense direction, not which replicating strand they lie on. The atypical AT skew we show to be a consequence of the strong bias for genes to be co-oriented with the replicating fork, coupled with the selective avoidance of both stop codons and costly amino acids, which tend to have T-rich codons. That intergenic sequence has more A than T, while at mutational equilibrium a preponderance of T is expected, points to a possible further unresolved selective source of skew.
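The skews discussed above are conventionally computed on a single strand as (G − C)/(G + C) and (A − T)/(A + T); under the second parity rule both are close to zero. The following is a minimal sketch of that calculation. The function name `strand_skews` is hypothetical, and treating an empty denominator as zero skew is a convenience chosen here, not a convention taken from the abstract.

```python
def strand_skews(seq):
    """Return (GC skew, AT skew) for one DNA strand.

    GC skew = (G - C) / (G + C); AT skew = (A - T) / (A + T).
    Characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    gc_skew = (g - c) / (g + c) if g + c else 0.0
    at_skew = (a - t) / (a + t) if a + t else 0.0
    return gc_skew, at_skew
```

A positive AT skew on the leading strand corresponds to the excess of A over T that the abstract describes as atypical; in practice the statistic is usually computed in sliding windows along the chromosome or separately per codon position.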

11.
Wall DP, Herbeck JT. Journal of Molecular Evolution 2003, 56(6): 673-88; discussion 689-90
In this study we reconstruct the evolution of codon usage bias in the chloroplast gene rbcL using a phylogeny of 92 green-plant taxa. We employ a measure of codon usage bias that accounts for chloroplast genomic nucleotide content, as an attempt to limit plausible explanations for patterns of codon bias evolution to selection- or drift-based processes. This measure uses maximum likelihood-ratio tests to compare the performance of two models, one in which a single codon is overrepresented and one in which two codons are overrepresented. The measure allowed us to analyze both the extent of bias in each lineage and the evolution of codon choice across the phylogeny. Despite predictions based primarily on the low G + C content of the chloroplast and the high functional importance of rbcL, we found large differences in the extent of bias, suggesting differential molecular selection that is clade specific. The seed plants and simple leafy liverworts each independently derived a low level of bias in rbcL, perhaps indicating relaxed selectional constraint on molecular changes in the gene. Overrepresentation of a single codon was typically plesiomorphic, and transitions to overrepresentation of two codons occurred commonly across the phylogeny, possibly indicating biochemical selection. The total codon bias in each taxon, when regressed against the total bias of each amino acid, suggested that twofold amino acids play a strong role in inflating the level of codon usage bias in rbcL, despite the fact that twofolds compose a minority of residues in this gene. Those amino acids that contributed most to the total codon usage bias of each taxon are known through amino acid knockout and replacement to be of high functional importance. This suggests that codon usage bias may be constrained by particular amino acids and, thus, may serve as a good predictor of what residues are most important for protein fitness.

12.
Correlations between genomic GC contents and amino acid frequencies were studied in the homologous sequences of 12 eubacterial genomes. Results show that the frequencies of amino acids encoded by GC-rich codons increase significantly with genomic GC content, whereas the opposite trend was observed for amino acids encoded by GC-poor codons. Further studies show that not all amino acids change in the direction predicted by their genomic GC pressure, suggesting that protein evolution is not entirely dictated by nucleotide frequencies. An amino acid substitution matrix calculated among hydrophobic, amphipathic and hydrophilic amino acid groups shows that amphipathic and hydrophilic amino acids are more frequently substituted by hydrophobic amino acids than hydrophobic amino acids are by hydrophilic or amphipathic ones. This indicates that nucleotide bias induces directional changes in proteome composition that produce strong changes in hydropathy values. In fact, significant increases in hydrophobicity values have also been observed with increasing genomic GC content. Correlations between GC content and amino acid composition in three different predicted protein secondary structures show that hydropathy values increase significantly with GC content in aperiodic and helix structures, whereas strand structure remains insensitive to genomic GC levels. The relative importance of mutation and selection in the evolution of proteins is discussed on the basis of these results.

13.
Many malarial antigens contain extensive arrays of tandemly repeated short amino acid sequences, and much of the antibody response induced by malaria infections is directed against these repeats. Indeed, it has been hypothesized that these repeats function to elicit a relatively ineffective T-cell-independent antibody response by the host. In order to test this hypothesis, tandem repeats of Plasmodium species were examined for a bias in composition favoring amino acids likely to form epitopes for the antibody. The genome of Plasmodium is very A+T-rich, and nucleotide compositional bias will, in itself, lead to a high proportion of hydrophilic amino acids. When this bias was controlled for, Plasmodium antigens did not show a higher proportion of hydrophilic amino acids than expected, but there was a significant reduction in the proportion of hydrophobic amino acids in the repeats of the antigens. The amino acid composition of the repeats was thus strikingly different from those seen both in the remainder of the antigens and in a sample of Plasmodium falciparum housekeeping genes.

14.
Defects in the chemotaxis proteins CheY and CheZ of Salmonella typhimurium can be suppressed by mutations in the flagellar switch, such that swarming of a pseudorevertant on semisolid plates is significantly better than that of its parent. cheY suppressors contribute to a clockwise switch bias, and cheZ suppressors contribute to a counterclockwise bias. Among the three known switch genes, fliM contributes most examples of such suppressor mutations. We have investigated the changes in FliM that are responsible for suppression, as well as the changes in CheY or CheZ that are being compensated for. Ten independently isolated parental cheY mutations represented nine distinct mutations, one an amino acid duplication and the rest missense mutations. Several of the altered amino acids lie on one face of the three-dimensional structure of CheY (A. M. Stock, J. M. Mottonen, J. B. Stock, and C. E. Schutt, Nature (London) 337:745-749, 1989; K. Volz and P. Matsumura, J. Biol. Chem. 266:15511-15519, 1991); this face may constitute the binding site for the switch. All 10 cheZ mutations were distinct, with several of them resulting in premature termination. cheY and cheZ suppressors in FliM occurred in clusters, which in general did not overlap. A few cheZ suppressors and one cheY suppressor involved changes near the N terminus of FliM, but neither cheY nor cheZ suppressors involved changes near the C terminus. Among the strongest cheY suppressors were changes from Arg to a neutral amino acid or from Val to Glu, suggesting that electrostatic interactions may play an important role in switching. A given cheY or cheZ mutation could be suppressed by many different fliM mutations; conversely, a given fliM mutation was often encountered as a suppressor of more than one cheY or cheZ mutation. 
The data suggest that an important factor in suppression is a balancing of the shift in switch bias introduced by alteration of CheY or CheZ with an appropriate opposing shift introduced by alteration of FliM. For strains with a severe parental mutation, such as the cheZ null mutations, adjustment of switch bias is essentially the only factor in suppression, since the attractant L-aspartate caused at most a slight further enhancement of the swarming rate over that occurring in the absence of a chemotactic stimulus. We discuss a model for switching in which there are distinct interactions for the counterclockwise and clockwise states, with suppression occurring by impairment of one of the states and hence by relative enhancement of the other state. FliM can also undergo amino acid changes that result in a paralyzed (Mot-) phenotype; these changes were confined to a very few residues in the protein.

15.

Introduction

Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of prokaryotic genomes is not protein coding even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affected by base compositional changes. In addition, we wanted to know how genome-wide amino acid usage was biased in the different genomes and how changes to base composition and mutations affected this bias. To carry this out, we used a Generalized Additive Mixed-effects Model (GAMM) to explore non-linear associations and strong data dependences in closely related microbes; principal component analysis (PCA) was used to examine genomic amino acid- and codon frequencies, while the concept of relative entropy was used to analyze genomic mutation rates.

Results

We found that genomic amino acid frequencies carried a stronger phylogenetic signal than codon frequencies, but that this signal was weak compared to that of genomic %AT. Further, in contrast to codon usage bias (CUB), amino acid usage bias (AAUB) was differently distributed in AT- and GC-rich genomes in the sense that AT-rich genomes did not prefer specific amino acids over others to the same extent as GC-rich genomes. AAUB was also associated with relative entropy; genomes with low AAUB contained more random mutations as a consequence of relaxed purifying selection than genomes with higher AAUB.

Conclusion

Genomic base composition has a substantial effect on both amino acid- and codon frequencies in bacterial genomes. While phylogeny influenced amino acid usage more in GC-rich genomes, AT-content was driving amino acid usage in AT-rich genomes. We found the GAMM model to be an excellent tool to analyze the genomic data used in this study.
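The abstract above relies on the concept of relative entropy without giving its form. As an illustration only, the sketch below computes the Kullback-Leibler divergence (in bits) between the observed amino acid frequencies of a protein sequence and a reference distribution. The uniform default reference, the function name `relative_entropy` and the choice of log base 2 are assumptions made here; the study's actual reference distribution and normalisation are not stated in the abstract.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def relative_entropy(protein, reference=None):
    """Kullback-Leibler divergence D(observed || reference) in bits.

    protein:   amino acid sequence in one-letter code.
    reference: {amino acid: probability}; defaults to uniform (assumption).
    """
    counts = Counter(r for r in protein.upper() if r in AMINO_ACIDS)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("sequence contains no standard amino acids")
    if reference is None:
        reference = {aa: 1 / 20 for aa in AMINO_ACIDS}

    divergence = 0.0
    for aa, k in counts.items():
        p = k / n  # observed frequency; absent residues contribute 0
        divergence += p * math.log2(p / reference[aa])
    return divergence
```

A value of zero means usage matches the reference exactly; larger values indicate a more biased composition, with log2(20) ≈ 4.32 bits as the maximum against a uniform reference.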

16.
AAindex: amino acid index database (total citations: 12; self-citations: 0; citations by others: 12)
AAindex is a database of amino acid indices and amino acid mutation matrices. An amino acid index is a set of 20 numerical values representing various physico-chemical and biochemical properties of amino acids. An amino acid mutation matrix is generally 20 × 20 numerical values representing similarity of amino acids. AAindex consists of two sections: AAindex1 for the collection of published amino acid indices and AAindex2 for the collection of published amino acid mutation matrices. Each entry of either AAindex1 or AAindex2 consists of the definition, the reference information, a list of related entries in terms of the correlation coefficient and the actual data. The database may be accessed through the DBGET/LinkDB system at GenomeNet (http://www.genome.ad.jp/aaindex/) or may be downloaded by anonymous FTP (ftp://ftp.genome.ad.jp/db/genomenet/aaindex/).
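To show what "a set of 20 numerical values" looks like in use, here is a small sketch that applies one well-known index, the Kyte-Doolittle hydropathy scale, to a protein sequence. The values are typed in by hand rather than parsed from an AAindex file, and the helper name `mean_index` is hypothetical; any other 20-value index could be passed in the same way.

```python
# Kyte-Doolittle hydropathy scale (Kyte and Doolittle, 1982), one value per
# standard amino acid. Hand-entered here for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_index(protein, index=KYTE_DOOLITTLE):
    """Average an amino acid index over a sequence.

    Residues missing from the index (for example X or gaps) are skipped.
    """
    values = [index[r] for r in protein.upper() if r in index]
    if not values:
        raise ValueError("sequence contains no residues covered by the index")
    return sum(values) / len(values)
```

With the hydropathy scale the result is the familiar average hydropathy of the sequence: positive for hydrophobic stretches, negative for hydrophilic ones.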

17.
The definition of a typical sec-dependent bacterial signal peptide contains a positive charge at the N-terminus, thought to be required for membrane association. In this study the amino acid distribution of all Escherichia coli secretory proteins was analysed. This revealed that there was a statistically significant bias for lysine at the second codon position (P2), consistent with a role for the positive charge in secretion. Removal of the positively charged residue at P2 in two different model systems revealed that a positive charge is not required for protein export. A well-characterized feature of large amino acids like lysine at P2 is inhibition of N-terminal methionine removal by methionyl amino-peptidase (MAP). Substitution of lysine at P2 for other large or small amino acids did not affect protein export. Analysis of codon usage revealed that there was a bias for the AAA lysine codon at P2, suggesting that a non-coding function for the AAA codon may be responsible for the strong bias for lysine at P2 of secretory signal sequences. We conclude that the selection for high translation initiation efficiency may be the selective pressure that has led to codon and consequent amino acid usage at P2 of secretory proteins.

18.
It has recently been demonstrated that human natural codon usage bias is optimized towards a higher buffering capacity to mutations (measured as the tendency of single point mutations in a DNA sequence to yield the same or similar amino acids) compared to random sequences. In this work, we investigate this phenomenon further by analyzing the natural DNA of four different species (human, mouse, zebrafish and fruit fly) to determine whether such a tolerance to mutations is correlated with the life span and age of sexual maturation for the corresponding organisms. We also propose a new measure to quantify the buffering capacity of a DNA sequence to mutations that takes into account the observed mutation rates within every genome and the effect of the corresponding mutation. Our results suggest there is a propensity for tolerance to mutations that is positively correlated with the life expectancy of the considered organisms. Moreover, random sequences that are constrained to produce the same protein as the naturally occurring sequences are found to be more buffered than completely random sequences while being less buffered than the natural sequences. These results suggest that optimization toward protective mechanisms tolerant to mutations is correlated with both life expectancy and age to sexual maturity at both the levels of codon usage bias and the bias of the natural sequence of codons itself.
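The simplest version of the buffering idea described above can be sketched as the fraction of the nine possible single-nucleotide changes to a codon that leave the encoded amino acid unchanged. This is a deliberately reduced illustration: it weights all mutations equally and counts only identical amino acids, whereas the measure proposed in the abstract also uses observed mutation rates and amino acid similarity. The function names are hypothetical, and the standard genetic code is assumed.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_fraction(codon):
    """Fraction of the 9 single-nucleotide mutants that are synonymous."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    same = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a mutation
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == amino_acid:
                same += 1
    return same / 9


def mean_buffering(cds):
    """Average synonymous fraction over the complete codons of a sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    scores = [
        synonymous_fraction(cds[i:i + 3])
        for i in range(0, usable, 3)
        if cds[i:i + 3] in CODON_TABLE
    ]
    if not scores:
        raise ValueError("sequence contains no complete, unambiguous codons")
    return sum(scores) / len(scores)
```

Comparing `mean_buffering` for a natural coding sequence with the same statistic for synonymous reshufflings of its codons gives a rough analogue of the constrained-random comparison mentioned in the abstract.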

19.
In this study, we use the random principle to analyse the distributions of amino acids and amino acid pairs in human tumour necrosis factor precursor (TNF-α) and its eight mutations, to compare the measured distribution probability with the theoretical distribution probability and to rank the measured distribution probability against the theoretical distribution probability. In this way, we can suggest that distributions with a high random rank should not be deliberately evolved and conserved and those with a low random rank should be deliberately evolved and conserved in human TNF-α. An increased distribution probability in a mutation means probabilistically that the mutation is more likely to occur spontaneously, whereas a decreased distribution probability in a mutation means probabilistically that the mutation is less likely to occur spontaneously and perhaps is more related to a certain cause. The results, for example, show that the distributions of 30% of the amino acids are identical with their probabilistic simplest distributions, and the distributions of some of the remaining amino acids are very close to their probabilistic simplest distributions. With respect to probabilities of distributions of amino acids in mutations, the results show that mutations lead to an increase in eight probabilities, which are thus more likely to occur. Eight probabilities decrease and are thus less likely to occur. With respect to the random ranks against the theoretical probabilities of distributions of amino acids, the results show that mutations lead to an increase in seven and a decrease in seven probabilities, with two probabilities unchanged.

20.