Similar literature
 20 similar documents found (search took 15 ms)
1.
Statistical studies of gene populations on the purine/pyrimidine alphabet have shown that the mean occurrence probability of the i-motif YRY(N)iYRY (R=purine, Y=pyrimidine, N=R or Y) is not uniform as i varies over the range [1,99], but presents a maximum at i=6 in the following populations: protein coding genes of eukaryotes, prokaryotes, chloroplasts and mitochondria, and also viral introns, ribosomal RNA genes and transfer RNA genes (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). From the “universality” of this observation, we suggested that the oligonucleotide YRY(N)6 is a primitive one and that it has a central function in DNA sequence evolution (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). Following this idea, we introduce a concept of a model of DNA sequence evolution which will be validated according to a scheme presented in three parts. In the first part, using the latest version of the gene database, the YRY(N)6YRY preferential occurrence (maximum at i=6) is confirmed for the populations mentioned above and is extended to some newly analysed populations: chloroplast introns, chloroplast 5′ regions, mitochondrial 5′ regions and small nuclear RNA genes. In addition, the YRY(N)6YRY preferential occurrence and periodicities are used to classify 18 gene populations. In the second part, we will demonstrate that several statistical features characterizing different gene populations (in particular the YRY(N)6YRY preferential occurrence and the periodicities) can be retrieved from a simple Markov model based on the mixing of the two oligonucleotides YRY(N)6 and YRY(N)3 and based on the percentages of RYR and YRY in the unspecified trinucleotides (N)3 of YRY(N)6 and YRY(N)3. Several properties are identified and prove in particular that the oligonucleotide mixing is an independent process and that several different features are functions of a unique parameter.
In the third part, comparing the model with real sequences shows a strong correlation between reality and simulation concerning the presence of large alternating purine/pyrimidine stretches and of periodicities. It also contributes to a greater understanding of biological reality, e.g. the presence or the absence of large alternating purine/pyrimidine stretches can be explained as being a simple consequence of the mixing of two particular oligonucleotides. Finally, we believe that such an approach is the first step toward a unified model of DNA sequence evolution allowing the molecular understanding of both the origin of life and the actual biological reality.
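The occurrence statistic described in this abstract can be illustrated with a minimal sketch. It assumes the probability is estimated as the fraction of window start positions at which YRY(N)iYRY occurs in a single sequence. The function names (`to_ry`, `motif_frequency`, `motif_profile`) are hypothetical and are not taken from the cited papers, whose exact estimator may differ.

```python
def to_ry(seq: str) -> str:
    """Map a DNA/RNA sequence onto the purine/pyrimidine (R/Y) alphabet."""
    table = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}
    return "".join(table[base] for base in seq.upper())


def motif_frequency(ry: str, i: int) -> float:
    """Fraction of start positions where YRY, i unspecified bases, then YRY occurs."""
    span = 3 + i + 3  # total length of YRY(N)iYRY
    windows = len(ry) - span + 1
    if windows <= 0:
        return 0.0
    hits = sum(
        1
        for k in range(windows)
        if ry[k:k + 3] == "YRY" and ry[k + 3 + i:k + span] == "YRY"
    )
    return hits / windows


def motif_profile(seq: str, max_i: int = 99) -> dict[int, float]:
    """Motif frequency for every spacer length i in [1, max_i]."""
    ry = to_ry(seq)
    return {i: motif_frequency(ry, i) for i in range(1, max_i + 1)}
```

Averaging `motif_profile` over all sequences of a gene population and looking for the largest value would be one way to check for a maximum at i=6. That averaging step is left out here because the abstract does not specify how it was done.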

2.
Recently, a new genetic process termed RNA editing has been identified showing insertions and deletions of nucleotides in particular RNA molecules. On the other hand, there are a few non-random statistical properties in genes: in particular, the periodicity modulo 3 (P3) associated with an open reading frame, the periodicity modulo 2 (P2) associated with alternating purine/pyrimidine stretches, the YRY(N)6YRY preferential occurrence (R = purine = adenine or guanine, Y = pyrimidine = cytosine or thymine, N = R or Y) representing a "code" of the DNA helix pitch, etc. The problem investigated here is whether a process of the RNA editing type can lead to the non-random statistical properties commonly observed in genes. This paper will show in particular that: (i) the process of insertions and deletions of mononucleotides in the initial sequence [YRY(N)3]* [series of YRY(N)3] can lead to the periodicity modulo 2 (P2); (ii) the process of insertions and deletions of trinucleotides in the initial sequence [YRY(N)6]* [series of YRY(N)6] can lead to the periodicity modulo 3 (P3) and the YRY(N)6YRY preferential occurrence. Furthermore, these two processes correlate strongly with real sequences: the mononucleotide insertion/deletion process with the 5' eukaryotic regions, and the trinucleotide insertion/deletion process with the eukaryotic protein coding genes.

3.
J.C. Shepherd notes that codons of the type RNY (R = purine, N = any nucleotide base, Y = pyrimidine) predominate over RNR in the genes for proteins. He has hypothesized that RNY codons are the relics of “a primitive code” composed of repeating RNY triplets. He found that RNY codons predominated in fourfold RNN codon sets (family boxes). These family boxes code for valine, threonine, alanine, and glycine. We argue that the proposed “comma-less” code composed of RNY never existed, and that, in any case, survival of such a code would have long since been erased by mutations. The excess of RNY codons in family boxes is probably attributable to preference for the corresponding tRNAs.

4.
Recently, we proposed a new model of DNA sequence evolution (Arquès and Michel, 1990b, Bull. math. Biol. 52, 741–772) according to which actual genes on the purine/pyrimidine (R/Y) alphabet (R=purine=adenine or guanine, Y=pyrimidine=cytosine or thymine) are the result of two successive evolutionary genetic processes: (i) a mixing (independent) process of non-random oligonucleotides (words of base length less than 10: YRY(N)6, YRYRYR and YRYYRY are so far identified; N=R or Y) leading to primitive genes (words of several hundreds of base length) and followed by (ii) a random mutation process, i.e. transformations of a base R (respectively Y) into the base Y (respectively R) at random sites in these primitive genes. Following this model, the problem investigated here is the study of the variation of the 8 R/Y codon probabilities RRR,..., YYY under random mutations. Two analytical expressions solved here allow analysis of this variation in the classical evolutionary sense (from the past to the present, i.e. after random mutations), but also in the inverted evolutionary sense (from the present to the past, i.e. before random mutations). Different properties are also derived from these formulae. Finally, a few applications of these formulae are presented. They prove the proposition in Arquès and Michel (1990b, Bull. math. Biol. 52, 741–772), Section 3.3.2, with the existence of a maximal mean number of random mutations per base of the order 0.3 in the protein coding genes. They also confirm the mixing process of oligonucleotides by excluding the purine/pyrimidine contiguous and alternating tracts from the formation process of primitive genes.
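The forward and inverted directions mentioned in this abstract can be illustrated under a simplifying assumption that is not taken from the paper: every base is flipped (R to Y or Y to R) independently with the same probability `p`. The abstract's own parameter is a mean number of mutations per base, which is a different quantity, and its analytical expressions are not reproduced here. The names `mutate` and `unmutate` are hypothetical.

```python
from itertools import product

# The 8 codons on the R/Y alphabet: RRR, RRY, ..., YYY.
CODONS = ["".join(c) for c in product("RY", repeat=3)]


def _apply(probs: dict[str, float], same: float, diff: float) -> dict[str, float]:
    """Apply a per-site 2x2 kernel (same on the diagonal, diff off it) to all three sites."""
    out = {}
    for target in CODONS:
        total = 0.0
        for source, p_source in probs.items():
            weight = 1.0
            for a, b in zip(source, target):
                weight *= same if a == b else diff
            total += p_source * weight
        out[target] = total
    return out


def mutate(probs: dict[str, float], p: float) -> dict[str, float]:
    """Past to present: codon probabilities after independent flips with probability p."""
    return _apply(probs, 1.0 - p, p)


def unmutate(probs: dict[str, float], p: float) -> dict[str, float]:
    """Present to past: invert mutate(); undefined at p = 0.5, where all information is lost."""
    d = 1.0 - 2.0 * p
    if abs(d) < 1e-12:
        raise ValueError("p = 0.5 cannot be inverted")
    return _apply(probs, (1.0 - p) / d, -p / d)
```

In this sketch, a population made only of YRY codons and mutated with p = 0.1 keeps YRY with probability 0.9^3 = 0.729, and `unmutate` recovers the starting distribution. The inverse can return values outside [0, 1] when the chosen `p` is larger than the observed data allow, which is one plausible way an upper bound on the mutation level could arise. This is an interpretation, not a statement from the source.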

5.
To study the evolution of mutation biased synonymous codon usage, we examined nucleotide co-occurrence patterns in the Deinococcus radiodurans, D. geothermalis, and Thermus thermophilus genomes for nucleotide replacement dependent on the surrounding nucleotide context. Nucleotides on the third codon site were found to be strongly correlated with nucleotide sites at most six nucleotides away in all three species, where abundance patterns were dependent on whether two nucleotides share the same purine(R)/pyrimidine(Y) status. In the class Deinococci adjacent third site nucleotides were strongly correlated, where NNR|NNR and NNY|NNY codon pairs were overabundant while NNR|NNY and NNY|NNR codon pairs were underabundant. By far the largest deviations in all three species occur for NN(YR)|(YR)NN codon pairs. In the Thermus species, the NNY|YNN and NNR|RNN codon pairs were overabundant versus the underabundant NNY|RNN and NNR|YNN codon pairs, whereas in the Deinococcus species the opposite over-/underabundance relationship held for adjacent (GC) bases. We also observed a weaker overabundance of NNR|NRN and NNY|NYN codon pairs versus the underabundant NNR|NYN and NNY|NRN codon pairs. The perfect purine/pyrimidine symmetry of each of these cases, plus the lack of significant deviations for nucleotide pairs on other length scales up to 20 codons apart demonstrates that a pervasive pattern of nucleotide replacement dependent on local nucleotide context, and not codon bias, has occurred in these species. This nucleotide replacement has led to modified synonymous codon usage within the class Deinococci that affects which codons are positioned at particular codon sites dependent on the local nucleotide context.

6.
1,144 sheep belonging to 21 breeds and known crosses were sequence analyzed for polymorphisms in the ovine PRNP gene. Genotype and allele frequencies of polymorphisms in PRNP known to confer resistance to scrapie, a fatal neurodegenerative disease of sheep, are reported. Known polymorphisms at codons 136 (A/V), 154 (H/R) and 171 (Q/R/H/K) were identified. The frequency of the 171R allele known to confer resistance to type C scrapie was 53.8% and the frequency of the 136A allele known to influence the resistance to type A scrapie was 96.01%. In addition, we report the identification of five new polymorphisms at codons 143 (H/R), 167 (R/S), 180 (H/Y), 195 (T/S) and 196 (T/S). We also report the identification of a novel allele (S/R) at codon 138.

7.
Xia X. PLoS ONE, 2007, 2(2): e188
The optimal context for translation initiation in mammalian species is GCCRCCaugG (where R = purine and "aug" is the initiation codon), with the -3R and +4G being particularly important. The presence of +4G has been interpreted as necessary for efficient translation initiation. Accumulated experimental and bioinformatic evidence has suggested an alternative explanation based on amino acid constraint on the second codon, i.e., amino acid Ala or Gly are needed as the second amino acid in the nascent peptide for the cleavage of the initiator Met, and the consequent overuse of Ala and Gly codons (GCN and GGN) leads to the +4G consensus. I performed a critical test of these alternative hypotheses on +4G based on 34169 human protein-coding genes and published gene expression data. The result shows that the prevalence of +4G is not related to translation initiation. Among the five G-starting codons, only alanine codons (GCN), and glycine codons (GGN) to a much smaller extent, are overrepresented at the second codon, whereas the other three codons are not overrepresented. While highly expressed genes have more +4G than lowly expressed genes, the difference is caused by GCN and GGN codons at the second codon. These results are inconsistent with +4G being needed for efficient translation initiation, but consistent with the proposal of amino acid constraint hypothesis.

8.
A statistical parameter identifies, with a high degree of significance, a motif which is present in protein-coding sequences of eukaryotes, prokaryotes, chloroplasts, mitochondria, viral introns, ribosomal RNA genes, and transfer RNA genes. The random probability of occurrence of such a situation is 10^-12. This motif has the following properties: (i) its significant presence in almost all present-day genes explains why it can be considered as a primitive oligonucleotide, (ii) its nucleotide order is YRY(N)6YRY, R being a purine base, Y a pyrimidine one and N any base, (iii) its length and its terminal trinucleotides YRY suggest a primordial function related to the spatial structure of the DNA sequences. This motif is found in some viral protein-coding genes, but not in eukaryotic introns.

9.
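The relationship between codon usage frequency and gene expression level was studied in nucleic acid sequences of Escherichia coli (115 genes) and Saccharomyces yeast (97 genes). Synonymous codons were divided, according to their usage-frequency statistics, into three classes: optimal codons (H), non-optimal codons (L) and rare codons (R). For the coding region of each gene sequence, the probabilities of occurrence of the three classes, P(H), P(L) and P(R), were calculated. Using P(H) and P(R) as indices and clustering with a graph-theoretic method, the highly and lowly expressed genes of each organism were found to separate clearly, and gene expression levels were divided into four grades: very highly expressed genes (VH), highly expressed genes (H), moderately low expressed genes (LM) and lowly expressed genes (LL). The expression level of each class of genes correlated well with experimental results and agreed well with existing data for E. coli and yeast.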
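A minimal sketch of the per-gene quantities P(H), P(L) and P(R) follows. It assumes a codon-to-class table is already available and that codons missing from the table are simply skipped. The function name and the tiny example table are hypothetical. The abstract does not give the actual classification thresholds or the clustering procedure, so neither is shown.

```python
from collections import Counter


def codon_class_probabilities(cds: str, codon_class: dict[str, str]) -> dict[str, float]:
    """Return P(H), P(L), P(R) for one coding region, read in frame from position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[k:k + 3] for k in range(0, usable, 3)]
    counts = Counter(codon_class[c] for c in codons if c in codon_class)
    total = sum(counts.values())
    if total == 0:
        return {"H": 0.0, "L": 0.0, "R": 0.0}
    return {label: counts[label] / total for label in ("H", "L", "R")}


# Illustrative classification for three leucine codons only (made-up labels).
example_classes = {"CTG": "H", "CTC": "L", "CTA": "R"}
example = codon_class_probabilities("CTGCTGCTCCTA", example_classes)
```

Each gene then becomes a point (P(H), P(R)), which is the pair of indices the abstract says was used for clustering.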

10.
A novel bias in codon third-letter usage was found in Escherichia coli genes with low fractions of "optimal codons", by comparing intact sequences with control random sequences. Third-letter usage has been found to be biased according to preference in codon usage and to doublet preference from the following first letter. The present study examines third-letter usage in the context of the nucleotide sequence when these preferences are considered. In order to exclude any influence by these factors, the random sequences were generated such that the amino acid sequence, codon usage, and the doublet frequency in each gene were all preserved. Comparison of intact sequences with these randomly generated sequences reveals that third letters of codons show a strong preference for the purine/pyrimidine pattern of the next codons: purine (R) is preferred to pyrimidine (Y) at the third site when followed by an R-Y-R codon, and pyrimidine is preferred when followed by an R-R-Y, an R-Y-Y or a Y-R-Y codon. This bias is probably related to interactions of tRNA molecules in the ribosome.

11.
Different codons are not utilized equally in known gene sequences. One of the important biases of codon usage is observed in the form of an enrichment of RNY codons, especially within RNN codon families. Such biases could represent the residue of a primitive repeating-RNY gene structure, or the outcome of natural selection, or both. Analyses based on the rates of silent substitutions, the frequencies of base doublets, and synonymous codon ratios for Escherichia coli, yeast, Drosophila and Xenopus proteins have been performed. The results rule out any significant support for a primitive repeating-RNY or repeating-RRY gene structure, and establish the important role of natural selection in determining the choice of codons. With strong intervention by natural selection, the relationship between primitive gene structure and codon usage necessarily becomes minimal.

12.
Codon pairs in the genome of Escherichia coli   Total citations: 9 (self-citations: 0, citations by others: 9)
MOTIVATION: The effect of two neighboring codons (codon pairs) on gene expression is mediated via the interaction of their cognate tRNAs occupying the two functional ribosomal sites during the translation elongation step. For steric reasons it is reasonable to assume that not all combinations of codons and therefore of tRNAs are equally favorable when situated on the ribosome surface. Aiming to identify preferential and rare codon pairs, we have determined the frequency of occurrence of all possible combinations of codon pairs in the entire genome of Escherichia coli (E. coli). RESULTS: The frequency of occurrence of the 3904 codon pairs comprising both sense:sense and sense:stop codon pairs in the full set of 4289 E. coli ORFs was found to vary from zero to 4913 times. For most of the pairs we have observed a significant difference between the real and statistically predicted frequency of occurrence. The analysis of 334 highly expressed and 303 poorly expressed E. coli genes showed that codon pair usage is different for the two gene categories. Using a specially defined criterion (Delta(REG)), the codon pairs are classified as 'hypothetically attenuating' (HAP) and 'hypothetically non-attenuating' (HNAP) and their possible effect on translation is discussed. AVAILABILITY: The program used in this study is available at http://www.bio21.bas.bg/codonpairs/
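The comparison of real and statistically predicted codon-pair frequencies can be sketched as follows. The sketch assumes the prediction treats the two codons of a pair as independent, so the expected count is the number of pairs times the product of the two codon frequencies. The abstract does not state which null model was used, and this is not the program at the URL above. The function names are hypothetical.

```python
from collections import Counter


def split_codons(orf: str) -> list[str]:
    """Split an in-frame ORF into codons, dropping any trailing partial codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return [orf[k:k + 3] for k in range(0, usable, 3)]


def codon_pair_table(orfs: list[str]) -> dict[tuple[str, str], tuple[int, float]]:
    """Map each observed adjacent codon pair to (observed count, expected count)."""
    codon_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for orf in orfs:
        codons = split_codons(orf)
        codon_counts.update(codons)
        pair_counts.update(zip(codons, codons[1:]))  # pairs never span two ORFs
    n_codons = sum(codon_counts.values())
    n_pairs = sum(pair_counts.values())
    table = {}
    for (first, second), observed in pair_counts.items():
        expected = (
            n_pairs
            * (codon_counts[first] / n_codons)
            * (codon_counts[second] / n_codons)
        )
        table[(first, second)] = (observed, expected)
    return table
```

Pairs whose observed count is well above or below the expected count would be the candidates for "preferential" or "rare" pairs. Pairs that never occur do not appear in the table and would need to be enumerated separately.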

13.
Many organisms exhibit biased codon usage in their genome, including the fungal model organism Neurospora crassa. The preferential use of a subset of synonymous codons (optimal codons) at the macroevolutionary level is believed to result from a history of selection to promote translational efficiency. At present, few data are available about selection on optimal codons at the microevolutionary scale, that is, at the population level. Herein, we conducted a large-scale assessment of codon mutations at biallelic sites, spanning more than 5,100 genes, in 2 distinct populations of N. crassa: the Caribbean and Louisiana populations. Based on analysis of the frequency spectra of synonymous codon mutations at biallelic sites, we found that derived (nonancestral) optimal codon mutations segregate at a higher frequency than derived nonoptimal codon mutations in each population; this is consistent with natural selection favoring optimal codons. We also report that optimal codon variants were less frequent in longer genes and that the fixation of optimal codons was reduced in rapidly evolving long genes/proteins, trends suggestive of genetic hitchhiking (Hill-Robertson) altering codon usage variation. Notably, nonsynonymous codon mutations segregated at a lower frequency than synonymous nonoptimal codon mutations (which impair translational efficiency) in each N. crassa population, suggesting that changes in protein composition are more detrimental to fitness than mutations altering translation. Overall, the present data demonstrate that selection, and partly genetic interference, shapes codon variation across the genome in N. crassa populations.

14.
Mutations were studied in the phenylalanine hydroxylase gene of phenylketonuria patients from Kemerovo oblast and Altaiskii krai (15 and 2 families, respectively). The following mutations were identified in exons of this gene: R408W, R261Q, R243Q, Y414C, Y386C, P281L, Y168H, R68S (lead to amino acid substitutions), R243X (leads to stop codon formation), and three splice site mutations (IVS12nt 1g→a, IVS2nt-13t→g, IVS7nt 1g→a).

15.
Lavner Y, Kotlar D. Gene, 2005, 345(1): 127-138
We study the interrelations between tRNA gene copy numbers, gene expression levels and measures of codon bias in the human genome. First, we show that isoaccepting tRNA gene copy numbers correlate positively with expression-weighted frequencies of amino acids and codons. Using expression data of more than 14,000 human genes, we show a weak positive correlation between gene expression level and frequency of optimal codons (codons with highest tRNA gene copy number). Interestingly, contrary to non-mammalian eukaryotes, codon bias tends to be high in both highly expressed genes and lowly expressed genes. We suggest that selection may act on codon bias, not only to increase elongation rate by favoring optimal codons in highly expressed genes, but also to reduce elongation rate by favoring non-optimal codons in lowly expressed genes. We also show that the frequency of optimal codons is in positive correlation with estimates of protein biosynthetic cost, and suggest another possible action of selection on codon bias: preference of optimal codons as production cost rises, to reduce the rate of amino acid misincorporation. In the analyses of this work, we introduce a new measure of frequency of optimal codons (FOP'), which is unaffected by amino acid composition and is corrected for background nucleotide content; we also introduce a new method for computing expected codon frequencies, based on the dinucleotide composition of the introns and the non-coding regions surrounding a gene.

16.
Summary: The cytochrome c oxidase subunit I (COI) gene sequences from planarian (Dugesia japonica) DNA, most probably of mitochondrial origin, are heterogeneous. Taking advantage of the heterogeneity that occurs primarily in silent sites of the COI DNA sequences, amino acid assignments of several codons have been deduced as nonuniversal: UGA = Trp, AAA = Asp, and AGR (R: A or G) = Ser. In addition, UAA, a stop codon in the universal genetic code, is tentatively assumed to be a tyrosine codon, because three of the sequences examined have UAA at the well-conserved tyrosine site of UAY (Y: U or C) in other planarian sequences as well as in the mitochondria of human, Xenopus, sea urchin, Drosophila, Trypanosoma, and Saccharomyces cerevisiae. AUA would most probably be an isoleucine codon in these mitochondria, whereas it is a methionine codon in the majority of nonplant mitochondria. Offprint requests to: Y. Bessho

17.
Xia X. Gene, 2005, 345(1): 13-20
The H-strand of vertebrate mitochondrial DNA is left single-stranded for hours during the slow DNA replication. This facilitates C→U mutations on the H-strand (and consequently G→A mutations on the L-strand) via spontaneous deamination which occurs much more frequently on single-stranded than on double-stranded DNA. For the 12 coding sequences (CDS) collinear with the L-strand, NNY synonymous codon families (where N stands for any of the four nucleotides and Y stands for either C or U) end mostly with C, and NNR and NNN codon families (where R stands for either A or G) end mostly with A. For the lone ND6 gene on the other strand, the codon bias is the opposite, with NNY codon families ending mostly with U and NNR and NNN codon families ending mostly with G. These patterns are consistent with the strand-specific mutation bias. The codon usage biased towards C-ending and A-ending in the 12 CDS sequences affects the codon-anticodon adaptation. The wobble site of the anticodon is always G for NNY codon families dominated by C-ending codons and U for NNR and NNN codon families dominated by A-ending codons. The only, but consistent, exception is the anticodon of tRNA-Met which consistently has a 5'-CAU-3' anticodon base-pairing with the AUG codon (the translation initiation codon) instead of the more frequent AUA. The observed CAU anticodon (matching AUG) would increase the rate of translation initiation but would reduce the rate of peptide elongation because most methionine codons are AUA, whereas the unobserved UAU anticodon (matching AUA) would increase the elongation rate at the cost of translation initiation rate. The consistent CAU anticodon in tRNA-Met suggests the importance of maximizing the rate of translation initiation.

18.
The compositional non-randomness was studied in genes of Saccharomyces cerevisiae and Schizosaccharomyces pombe. In both species, codon usage is well correlated with expressivity (measured as the codon adaptation index). Both species generally display higher nucleotide non-randomness in the group of highly expressed genes than in the lowly expressed genes. The highly expressed genes in both species are furthermore characterized by marked peaks in non-randomness at N=3 upstream of start codons, N=2 downstream of start codons and at N=1 and N=7 downstream of stop codons, indicating that these nucleotides may be key elements in translational regulation. Intragenic variation in codon usage was also observed to be linked to expressivity. It is suggested that the firm link between expressivity and codon usage calls for codon optimization. Based on bioinformatic calculations, examples of proteins are given for which codon optimizations might be relevant.
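For readers unfamiliar with the codon adaptation index used here as the measure of expressivity, the standard definition is the geometric mean of per-codon relative adaptiveness weights w, where w is a codon's frequency in a reference set of highly expressed genes divided by the frequency of the most used synonymous codon. A minimal sketch follows. It assumes the weights are already computed and strictly positive. The function name and the two example weights are hypothetical, and the abstract does not say which reference set or implementation the authors used.

```python
import math


def codon_adaptation_index(cds: str, weights: dict[str, float]) -> float:
    """Geometric mean of the relative adaptiveness weights of the codons in cds."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    logs = [
        math.log(weights[cds[k:k + 3]])
        for k in range(0, usable, 3)
        if cds[k:k + 3] in weights  # codons without a weight are skipped
    ]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Made-up weights for two leucine codons, for illustration only.
example_weights = {"CTG": 1.0, "CTA": 0.25}
example_cai = codon_adaptation_index("CTGCTA", example_weights)
```

With these illustrative weights the two-codon sequence scores the square root of 1.0 × 0.25, i.e. 0.5. A gene built only from the most used codon in each family scores 1.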

19.
Adaptive codon usage provides evidence of natural selection in one of its most subtle forms: a fitness benefit of one synonymous codon relative to another. Codon usage bias is evident in the coding sequences of a broad array of taxa, reflecting selection for translational efficiency and/or accuracy as well as mutational biases. Here, we quantify the magnitude of selection acting on alternative codons in genes of the nematode Caenorhabditis remanei, an outcrossing relative of the model organism C. elegans, by fitting the expected mutation-selection-drift equilibrium frequency distribution of preferred and unpreferred codon variants to the empirical distribution. This method estimates the intensity of selection on synonymous codons in genes with high codon bias as N_e s = 0.17, a value significantly greater than zero. In addition, we demonstrate for the first time that estimates of ongoing selection on codon usage among genes, inferred from nucleotide polymorphism data, correlate strongly with long-term patterns of codon usage bias, as measured by the frequency of optimal codons in a gene. From the pattern of polymorphisms in introns, we also infer that these findings do not result from the operation of biased gene conversion toward G or C nucleotides. We therefore conclude that coincident patterns of current and ancient selection are responsible for shaping biased codon usage in the C. remanei genome.

20.
Hu W, Feng Z, Tang MS. Biochemistry, 2003, 42(33): 10012-10023
In the ras gene superfamily, codon 12 (-TGGTG-) of the K-ras gene is the most frequently mutated codon in human cancers. Recently, we have found that bulky chemical carcinogens preferentially form DNA adducts at codons 12 and 14 (-CGTAG-) in the K-ras gene in normal human bronchial epithelial (NHBE) cells. Furthermore, DNA adducts formed at codon 12 of the K-ras gene are poorly repaired compared with those at other codons including codon 14. These results suggest that targeted carcinogen-DNA adduct formation is a major reason for the observed high mutation frequency at codon 12 of the K-ras gene in human cancers. This preferential carcinogen-DNA adduct formation at codons 12 and 14 could result from effects of (1) primary sequences of these codons and their surrounding codons in the K-ras gene, (2) the chromatin structure, and/or (3) epigenetic factors such as C5 cytosine methylation or other DNA modifications at these codons and their surrounding codons. To distinguish these possibilities, we have introduced modifications with benzo[a]pyrene diol epoxide, N-hydroxy-2-aminofluorene, and aflatoxin B1 8,9-epoxide in (1) naked intact genomic DNA isolated from NHBE cells, (2) fragmented genomic DNA digested by restriction enzymes, and (3) in vitro synthesized DNA fragments containing the K-ras gene exon 1 sequence with or without methylation of the cytosines at CpG sites and the cytosines pairing with the guanines of codons 12 and 14. The distribution of carcinogen-DNA adducts in the K-ras gene was mapped at the nucleotide sequence level using the UvrABC nuclease incision method with or without the ligation-mediated polymerase chain reaction technique. We have found that carcinogens preferentially form adducts at codons 12 and 14 in the K-ras gene exon 1 in intact as well as in fragmented genomic DNA. In contrast, this preferential DNA adduct formation at codons 12 and 14 was not observed in PCR-amplified DNA fragments containing the K-ras gene exon 1 sequence. 
Methylation of the cytosine at the CpG site of codon 14, or the cytosine pairing with guanine of codon 14, greatly enhanced carcinogen-DNA adduct formation at codon 14 but did not affect carcinogen-DNA adduct formation at codon 12. Methylation of the cytosine pairing with the guanine of codon 12 also did not enhance carcinogen-DNA adduct formation at codon 12. Furthermore, we found that the cytosine at the CpG site of codon 14 is highly methylated in NHBE cells. These results suggest that cytosine methylation at the CpG site is the major reason for the preferential DNA damage at codon 14 and that epigenetic modification(s) other than cytosine methylation may contribute to the preferential DNA damage at codon 12 of the K-ras gene.
