Similar articles
 20 similar articles found (search time: 515 ms)
1.
The observed mutation frequency of an amino acid is assumed to depend on two factors. The first is the mutation coefficient, which describes the stochastic rate of nucleotide substitution; the second is the similarity of amino acids, which represents the fitness of a mutant under selective pressure. A statistical theory is proposed and 380 mutation frequencies are calculated, only 10 of which clearly disagree with the observed data. The project was supported by NSFC.

2.
In this paper, we investigate a simple protein sequence conservation measure that takes amino acid similarity into account. Instead of grouping the 20 amino acids into disjoint sets, as in previous methods, we consider ten overlapping classes. The method is based on the assumption that a column in a multiple sequence alignment evolved from an identical column in its evolutionary history. Two ten-dimensional vectors are constructed for each position to denote the frequencies of the ten classes in a column and in the corresponding hypothetical identical column. The cosine of the angle between these two vectors is then taken as a measure of the divergence of stereochemical properties at this position. This divergence, combined with other conservation scores, is used as the conservation measure of the column. Finally, we evaluate our methods by identifying catalytic sites, using a rank analysis criterion and a receiver operating characteristic analysis criterion.
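The class-frequency cosine described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list the ten overlapping classes, so the `CLASSES` table below is a hypothetical placeholder, and taking the most frequent residue as the "hypothetical identical column" is likewise an assumption.

```python
import math
from collections import Counter

# Hypothetical overlapping classes (assumption; the abstract does not list them).
CLASSES = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQDEKRHY"),
    "small": set("AGSTCPDNV"),
    "tiny": set("AGS"),
    "aliphatic": set("ILV"),
    "aromatic": set("FWYH"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "charged": set("KRHDE"),
    "proline": set("P"),
}


def class_vector(residues):
    """Fraction of residues falling in each of the ten classes."""
    n = len(residues)
    return [sum(1 for r in residues if r in members) / n
            for members in CLASSES.values()]


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def class_cosine(column):
    """Cosine between a column and its assumed identical ancestor column."""
    # Drop gaps and any symbol that belongs to no class.
    residues = [r for r in column.upper()
                if any(r in members for members in CLASSES.values())]
    if not residues:
        raise ValueError("column contains no classifiable residues")
    # Assumption: the ancestral column consists of the most frequent residue.
    consensus = Counter(residues).most_common(1)[0][0]
    identical = [consensus] * len(residues)
    return cosine(class_vector(residues), class_vector(identical))
```

Under these assumptions a column such as "LLLI" scores the same as "LLLL", because L and I share all of their classes here, whereas "LLLD" scores lower.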

3.
The frequencies of occurrence of the four bases in the first, second and third codon positions and in the total coding sequences have been calculated from the codon usage table published in 1990 by Ikemura et al. The distributions of these frequencies are further analysed in detail by a graphic technique that we presented recently. Formulas expressing the frequencies of the four bases in the first and second codon positions in terms of amino acid frequencies are given. The graphic analysis shows that, for 90 species, the purine bases are dominant in the first codon position and in most cases G is the most dominant base. In the second codon position A is the most dominant base, while G is the least dominant. In the third codon position the G + C content varies from 0.1 to 0.9, while the A + C content stays approximately equal to 1/2 and the G content approximately equal to that of C. If the frequencies of the bases A, C, G and U in the total coding sequences are denoted by a, c, g and u, respectively, the inequality a² + c² + g² + u² < 1/3 is found to hold for each of the 90 species, including human and E. coli.
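The inequality at the end of this abstract is easy to check for any base composition. The sketch below uses a made-up composition rather than values from the codon usage table. Since the four frequencies sum to 1, the sum of squares cannot fall below 1/4 (its value for a uniform composition), so the inequality bounds how far a genome departs from uniformity.

```python
def sum_of_squares(freqs):
    """Return a^2 + c^2 + g^2 + u^2 for base frequencies that sum to 1."""
    total = sum(freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("base frequencies must sum to 1")
    return sum(f * f for f in freqs.values())


# Hypothetical composition for illustration only (not from the paper).
example = {"A": 0.30, "C": 0.20, "G": 0.28, "U": 0.22}
value = sum_of_squares(example)          # 0.09 + 0.04 + 0.0784 + 0.0484
satisfies_inequality = value < 1 / 3
```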

4.
Similarities and differences between amino acids define the rates at which they substitute for one another within protein sequences and the patterns by which these sequences form protein structures. However, there exist many ways to measure similarity, whether one considers the molecular attributes of individual amino acids, the roles that they play within proteins, or some nuanced contribution of each. One popular approach to representing these relationships is to divide the 20 amino acids of the standard genetic code into groups, thereby forming a simplified amino acid alphabet. Here, we develop a method to compare or combine different simplified alphabets, and apply it to 34 simplified alphabets from the scientific literature. We use this method to show that while different suggestions vary and agree in non-intuitive ways, they combine to reveal a consensus view of amino acid similarity that is clearly rooted in physico-chemistry.

5.
Since the genetic code was first determined, many have claimed that it is organized adaptively, so as to assign similar codons to similar amino acids. This claim has proved difficult to establish due to the absence of relevant comparative data on alternative primordial codes and of objective measures of amino acid exchangeability. Here we use a recently developed measure of exchangeability to evaluate a null hypothesis and two alternative hypotheses about the adaptiveness of the genetic code. The null hypothesis that there is no tendency for exchangeable amino acids to be assigned to similar codons can be excluded here, as expected from earlier work. The first alternative hypothesis is that any such correlation between codon distance and amino acid distance is due to incremental mechanisms of code evolution, and not to adaptation to reduce deleterious effects of future mutations. More specifically, new codon assignments that occur by ambiguity reduction or by codon capture will tend to give rise to correlations, whether due to the condition of amino acid ambiguity, or to the condition of similarity between a new tRNA synthetase (or tRNA) and its parent. The second alternative hypothesis, the adaptive hypothesis, then may be defined as an excess relative to what may be expected given the incremental nature of evolution, reflecting true adaptation for robustness rather than an incidental effect. The results reported here indicate that most of the nonrandomness in the amino acid-to-codon assignments can be explained by incremental code evolution, with a small residue of orderliness that may reflect code adaptation.

6.
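Using 73 diploid Rosa plants from 3 groups as material, homologs of the FLOWERING LOCUS T (FT) gene were cloned, and the coding sequences of the gene were subjected to polymorphism analysis and multidimensional scaling (MDS) cluster analysis. A total of 215 nucleotide polymorphic sites were detected in the FT genes of the 73 diploid roses, comprising 214 SNPs and 1 deletion mutation, an average of one mutation per 185 bases. Amino acid polymorphism analysis showed that 35 amino acids varied, an average of one mutation per 379.6 amino acid residues. Statistical analysis of the mutation sites identified positions 39, 258 and 426 bp as high-frequency mutation sites, at which the base changed from A or C to T. MDS cluster analysis showed that the within-group nucleotide differences of the FT coding sequences in the 3 groups were ranked in the order wild species, Rosa sect. Chinenses, old Chinese garden roses, whereas the within-group amino acid differences were ranked in the order old Chinese garden roses, Rosa sect. Chinenses, wild species. We infer that the FT gene of old Chinese garden roses may have experienced strong artificial selection pressure during long-term cultivation and domestication, and that the species and varieties of Rosa sect. Chinenses may be an important parental source of old garden roses.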

7.
We outline a method for estimating quantitatively the influence of point mutations and selection on the frequencies of codons and amino acids. We show how the mutation rate, i.e., the rate of amino acid replacement due to point mutation, can be affected by the codon usage as well as by the rates of the involved base exchanges. A comparison of the mutation rates calculated from reliable values of codon usage and base exchange probabilities with those that would be expected on the basis of chance reveals a notable suppression of replacements leading to tryptophan, glutamate, lysine, and methionine, and particularly of those leading to the termination codons. If selection constraints are neglected and only mutations are taken into account, the best agreement between expected and observed frequencies of both codons and amino acids is obtained for alpha = 1.13-1.15, where (Formula: see text). The "selection values" of codons and amino acids derived by our method show a pattern that partially deviates from others in the literature. For example, the selection pressure on methionine and cysteine turns out to be much more pronounced than expected if only the discrepancies between their observed and expected occurrences in proteins are considered. To estimate to what extent randomly occurring amino acid replacements are accepted by selection, we constructed an "acceptability matrix" from the well-established matrix of accepted point mutations. On the basis of this matrix "acceptability values" of the amino acids can be defined that correlate with their selection values. We also examine the significance of mutations and selection of amino acids with respect to their physicochemical properties and functions in proteins. 
The conservatism of amino acid replacements with respect to certain properties such as polarity can be brought about by the mutational process alone, whereas the conservatism with respect to other relevant properties--among them all measures of bulkiness--obviously is the result of additional selectional constraints on the evolution of protein structures.

8.
9.

Introduction

Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of prokaryotic genomes is not protein coding even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affected by base compositional changes. In addition, we wanted to know how genome-wide amino acid usage was biased in the different genomes and how changes to base composition and mutations affected this bias. To carry this out, we used a Generalized Additive Mixed-effects Model (GAMM) to explore non-linear associations and strong data dependences in closely related microbes; principal component analysis (PCA) was used to examine genomic amino acid- and codon frequencies, while the concept of relative entropy was used to analyze genomic mutation rates.

Results

We found that genomic amino acid frequencies carried a stronger phylogenetic signal than codon frequencies, but that this signal was weak compared to that of genomic %AT. Further, in contrast to codon usage bias (CUB), amino acid usage bias (AAUB) was differently distributed in AT- and GC-rich genomes in the sense that AT-rich genomes did not prefer specific amino acids over others to the same extent as GC-rich genomes. AAUB was also associated with relative entropy; genomes with low AAUB contained more random mutations as a consequence of relaxed purifying selection than genomes with higher AAUB.

Conclusion

Genomic base composition has a substantial effect on both amino acid- and codon frequencies in bacterial genomes. While phylogeny influenced amino acid usage more in GC-rich genomes, AT-content was driving amino acid usage in AT-rich genomes. We found the GAMM model to be an excellent tool to analyze the genomic data used in this study.
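The relative entropy mentioned in this abstract is the standard Kullback-Leibler divergence between two discrete distributions. The sketch below shows only that textbook formula; which distributions the authors compared is not stated in the abstract, so the example frequencies are hypothetical.

```python
import math


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    p and q map the same symbols to probabilities; q must be
    positive wherever p is positive.
    """
    return sum(pi * math.log2(pi / q[symbol])
               for symbol, pi in p.items() if pi > 0)


# Hypothetical distributions for illustration only.
observed = {"A": 0.5, "C": 0.25, "G": 0.125, "T": 0.125}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
divergence = relative_entropy(observed, uniform)
```

The divergence is zero when the two distributions coincide and grows as the observed frequencies move away from the reference.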

10.
The code within the codons   (Total citations: 6; self-citations: 0; citations by others: 6)
F J Taylor, D Coates. Bio Systems, 1989, 22(3): 177-187
For the first time it is shown that each of the three codon bases has a general correlation with a different, predictable amino acid property, depending on position within the codon. In addition to the previously recognized link between the mid-base and the hydrophobic-hydrophilic spectrum, we show that, with the exception of G, the first base is generally invariant within a synthetic pathway. G-coded amino acids show a different order, being found only at the head of the synthetic pathways. The redundancy of the nature of the third base has a previously unrecognised relationship with molecular weight. The bases U and A (transversions) are associated with the most sharply defined or opposite states in both the first and second position, C somewhat less so or intermediate, and G neutral. The apparently systematic nature of these relationships has profound implications for the origin of the genetic code. It appears to be the remains of the first language of the cell, predating the tRNA/ribosome system, persisting with remarkably little change at a deeper level of organisation than the codon language.

11.
Considering the chemical evolution of the Earth since its formation, when its composition was similar to the elemental composition of stellar matter, a tentative hypothesis is put forward that the molecular evolution of the four-letter genetic alphabet comprised two periods of chemical evolution: period I (pre-oxygen) and period II (oxygenated). In period I, the first nitrogenous base, adenine (A), which contains no oxygen, appeared in the primary atmosphere of the Earth. Period II, during which the three other nitrogenous bases appeared in the atmosphere, consisted of three stages: guanine (G) appeared at the first stage, cytosine (C) at the second, and uracil (U) at the third. In accordance with these periods, the formation of codons and amino acids in nature presumably proceeded as follows: in period I, the first and only codon, AAA, appeared, to which the amino acid lysine (Lys) corresponded; at the first stage of period II, 7 codons and 3 amino acids (Arg, Glu, Gly) appeared; at the second stage, 19 codons and 8 new amino acids (Asn, Gln, Ser, Asp, Thr, Ala, His, Pro) appeared; at the third stage, 37 codons and 8 more new amino acids (Trp, Tyr, Cys, Ile, Met, Val, Leu, Phe) appeared. Thus, in the course of biochemical evolution, 20 amino acids and 64 codons appeared in nature.
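The codon counts quoted in this abstract (1, 7, 19, 37) follow from simple counting: when the k-th base joins the alphabet, the new triplets are those that use it at least once, which is k³ - (k-1)³. A short sketch of that arithmetic, with the amino acid counts copied from the abstract:

```python
def new_codons(k):
    """Triplets over k bases that contain the newest base at least once."""
    return k ** 3 - (k - 1) ** 3


# Order of appearance of the bases as proposed in the abstract.
stages = ["A", "G", "C", "U"]
codons_per_stage = [new_codons(k) for k in range(1, len(stages) + 1)]
# codons_per_stage == [1, 7, 19, 37]

# Amino acids introduced at each stage, as listed in the abstract.
amino_acids_per_stage = [1, 3, 8, 8]

total_codons = sum(codons_per_stage)            # 64
total_amino_acids = sum(amino_acids_per_stage)  # 20
```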

12.
Amino acid sequence of a specific antigenic peptide of protein B23   (Total citations: 6; self-citations: 0; citations by others: 6)
A specific antigenic peptide was obtained from protein B23 (Mr/pI = 37,000/5.1) after 30 min of digestion with staphylococcal V8 protease (10 micrograms/ml/mg protein B23). The antigenic peptide was purified by DEAE-cellulose chromatography and high pressure liquid chromatography on a reverse-phase C18 column. The antigenic peptide contains 14.7 and 18.7 mol% of glutamic acid and lysine, respectively. Amino acid sequence analysis showed that the peptide has 68 amino acids and is located on the carboxyl-terminal sequence of protein B23. The sequence is Ser-Phe-Lys-Lys-Gln-Glu-Lys-Thr-Pro-Lys-Thr-Pro-Lys-Gly-Pro-Ser-Ser-Val-Glu-Asp-Ile-Lys-Ala-Lys-Met-Gln-Ala-Ser-Ile-Glu-Lys-Gly-Gly-Ser-Leu-Pro-Lys-Val-Glu-Ala-Lys-Phe-Ile-Asn-Tyr-Val-Lys-Asn-Cys-Phe-Arg-Met-Thr-Asp-Gln-Glu-Ala-Ile-Gln-Asp-Leu-Trp-Gln-Trp-Arg-Lys-Ser-Leu-COOH. Extensive digestion of the antigenic peptide with V8 protease, trypsin, or chymotrypsin results in loss of the antigenic activity. Three cloned cDNAs (hpB1, hpB2, and hpB7) which code for the 82 amino acids at the COOH terminus of protein B23 and the 3' non-translated sequence were identified and characterized. All three clones have identical nucleotide sequences coding for the antigenic portion of the protein (68 amino acids at the COOH terminus), the stop codon, and the 3' non-translated region. However, mutation of 6 nucleotide bases of one clone (hpB2) caused changes in 4 amino acids in the sequence just preceding the immunoreactive region. The result suggests the presence of at least 2 immunologically similar but distinct proteins which are both recognized by the anti-B23 antibody.

13.
The genetic code contains a system governing the distribution of doublets of the first two codon bases among amino acids. According to this system, a definite order in the relative distribution of the first and second codon bases coincides with a definite order among the common amino acids when they are arranged by the number of hydrogen atoms per molecule (an unexpected parameter). The pattern of the relative distribution of the first and second codon bases suggests that it originated from a crystalline-like structure in which the set of bases AUGC served as an elementary structural unit and the base doublets played the role of structural analogs of the amino acids. These hypothetical crystalline-like aggregates are composed of free molecules of amino acids and bases and, although different in their composition, should have an even number of hydrogen atoms per standard structural module.

14.
We studied the correlations between amino acid composition and mononucleotide and dinucleotide frequencies in 115 bacterial genomes of varying G+C content. Observed amino acid frequencies were compared with those expected from the actual mononucleotide and dinucleotide frequencies. Both mononucleotide and dinucleotide frequencies correlate well with the amino acid frequency, with dinucleotide frequencies doing so better. Despite the strong correlations, some of the observed amino acid frequencies, in particular for Arg, Val, Asp, Glu, Ser, and Cys, were consistently different from predicted values in all genomes. We suggest that this variation from predicted values is a consequence of selection pressure at the level of amino acids, while the close correspondence to the predictions in residues such as Thr, Phe, Lys, and Asn arises only from mutation and selection pressure at the level of the nucleic acid sequences.
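One simple way to obtain amino acid frequencies "expected" from mononucleotide frequencies is sketched below. It assumes the three codon positions are independent, uses the standard genetic code, and renormalizes over sense codons; the abstract does not give the authors' exact procedure, so this is an illustrative null model rather than their method.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks stops).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def expected_aa_freqs(base_freqs):
    """Expected amino acid frequencies from base frequencies.

    Assumes independent codon positions; stop codons are excluded
    and the remaining probabilities are renormalized to sum to 1.
    """
    expected = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        p = 1.0
        for base in codon:
            p *= base_freqs[base]
        expected[aa] = expected.get(aa, 0.0) + p
    total = sum(expected.values())
    return {aa: p / total for aa, p in expected.items()}


# Hypothetical base compositions for illustration only.
uniform = expected_aa_freqs({"T": 0.25, "C": 0.25, "A": 0.25, "G": 0.25})
gc_rich = expected_aa_freqs({"T": 0.15, "C": 0.35, "A": 0.15, "G": 0.35})
```

Under this model a GC-rich composition raises the expected share of Gly (GGN codons) relative to Lys (AAA, AAG); comparing such expectations with observed frequencies is what exposes the residues that deviate consistently.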

15.
Abstract: The amino acid sequence of 11 peptides generated from human placental choline acetyltransferase was compared to the corresponding amino acid sequences predicted from the nucleotide sequence of a recently cloned porcine choline acetyltransferase cDNA. These peptides, which were generated by cyanogen bromide cleavage or tryptic digestion, accounted for 23% of the amino acids in the enzyme. Of the 145 amino acids sequenced eight differed between the two species, yielding an identity of 94% over the regions sampled.
Of the eight amino acids that differed, six could represent single base changes in the DNA sequence. These findings demonstrate strong sequence similarity between porcine and human choline acetyltransferase and indicate that they are closely related evolutionarily.

16.

Background  

The arrangement of the amino acids in the genetic code is such that neighbouring codons are assigned to amino acids with similar physical properties. Hence, the effects of translational error are minimized with respect to randomly reshuffled codes. Further inspection reveals that it is amino acids in the same column of the code (i.e. same second base) that are similar, whereas those in the same row show no particular similarity. We propose a 'four-column' theory for the origin of the code that explains how the action of selection during the build-up of the code leads to a final code that has the observed properties.

17.
BACKGROUND: The composition and sequence of amino acids in a protein may serve the underlying needs of the nucleic acids that encode the protein (the genome phenotype). In extreme form, amino acids become mere placeholders inserted between functional segments or domains, and--apart from increasing protein length--playing no role in the specific function or structure of a protein (the conventional phenotype). METHODS: We studied the genomes of two malarial parasites and 521 prokaryotes (144 complete) that differ widely in GC% and optimum growth temperature, comparing the base compositions of the protein coding regions and corresponding lengths (kilobases). RESULTS: Malarial parasites show distinctive responses to base-compositional pressures that increase as protein lengths increase. A low-GC% species (Plasmodium falciparum) is likely to have more placeholder amino acids than an intermediate-GC% species (P. vivax), so that homologous proteins are longer. In prokaryotes, GC% is generally greater and AG% is generally less in open reading frames (ORFs) encoding long proteins. The increased GC% in long ORFs increases as species' GC% increases, and decreases as species' AG% increases. In low- and intermediate-GC% prokaryotic species, increases in ORF GC% as encoded proteins increase in length are largely accounted for by the base compositions of first and second (amino acid-determining) codon positions. In high-GC% prokaryotic species, first and third (non-amino acid-determining) codon positions play this role. CONCLUSION: In low- and intermediate-GC% prokaryotes, placeholder amino acids are likely to be well defined, corresponding to codons enriched in G and/or C at first and second positions. In high-GC% prokaryotes, placeholder amino acids are likely to be less well defined. 
Increases in ORF GC% as encoded proteins increase in length are greater in mesophiles than in thermophiles, which are constrained from increasing protein lengths in response to base-composition pressures.

18.
Amino acids are utilized with different frequencies both among species and among genes within the same genome. To date, no study of the amino acid usage pattern of chicken has been performed. In the present study, we carried out a systematic examination of amino acid usage in the chicken proteome. Our data indicated that relative amino acid usage is positively correlated with tRNA gene copy number. GC contents, including GC1, GC2, GC3, the GC content of the CDS and the GC content of the introns, were correlated with the usage of most amino acids, especially GC-rich and GC-poor amino acids. However, multiple linear regression analyses indicated that only approximately 10–40% of the variation in amino acid usage can be explained by GC content for GC-rich and GC-poor amino acids. For the other amino acids, of intermediate GC content, only approximately 10% of the variation can be explained. Correspondence analyses demonstrated that the main factors responsible for the variation of amino acid usage in chicken are hydrophobicity, aromaticity and genomic GC content. Gene expression level also influenced amino acid usage significantly. We argue that the amino acid usage of the chicken proteome likely reflects a balance or near balance between the action of selection, mutation, and genetic drift.

19.
Scanning tunneling microscopy and chromatography experiments exploring the potential templating properties of nucleic acid bases adsorbed to the surface of crystalline graphite revealed that the interactions of amino acids with the bare crystal surface are significantly modulated by the prior adsorption of adenine and hypoxanthine. These bases are the coding elements of a putative purine-only genetic alphabet, and the observed effects are different for each of the bases. Such mapping between bases and amino acids provides a coding mechanism. These observations demonstrate that a simple pre-RNA amino acid discrimination mechanism could have existed on the prebiotic Earth, providing critical functionality for the origin of life.

20.
The phenotypes of biological systems are to some extent robust to genotypic changes. Such robustness exists on multiple levels of biological organization. We analyzed this robustness for two categories of amino acids in proteins. Specifically, we studied the codons of amino acids that bind or do not bind small molecular ligands. We asked to what extent codon changes caused by mutation or mistranslation may affect physicochemical amino acid properties or protein folding. We found that the codons of ligand-binding amino acids are on average more robust than those of non-binding amino acids. Because mistranslation is usually more frequent than mutation, we speculate that selection for error mitigation at the translational level stands behind this phenomenon. Our observations suggest that natural selection can affect the robustness of very small units of biological organization.
