Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.
Asymmetrical distribution of CpG in an 'average' mammalian gene.   Cited by: 24 (self-citations: 7, citations by others: 17)
The frequency and distribution of the rare dinucleotide CpG were examined in 15 mammalian genes. CpG is highly methylated at cytosine in mammalian DNA (1,2) and 5-methylcytosine (5mC) is thought to undergo a transition mutation via deamination to produce thymine (3). This would result in the accumulation of TpG and CpA and depletion of CpG during evolution (4). Consistent with this hypothesis, the gene sample of 26,541 dinucleotides contained CpG at 40% of the frequency expected from base composition, and the CpG transition products, TpG+CpA, were significantly elevated at 124% of the expected random frequency. However, because CpG occurs at only 25% of the expected random frequency in the genome, the sampled genes were considerably enriched in this dinucleotide. CpGs were asymmetrically distributed in sequences flanking the genes. 5'-flanking sequences were enriched in CpG at 135% of the frequency expected assuming a symmetrical distribution of all the CpGs in the sampled genes (p < 0.01), while 3'-flanking regions were depleted in CpG at 40% of expected values (p < 0.0001). This asymmetry may reflect the role of 5-methylcytosine in gene expression. In contrast, the frequencies of GpC and GpT+ApC did not differ significantly from those predicted by base composition, and these dinucleotides were not asymmetrically distributed.
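
The "frequency expected by base composition" in the abstract above can be illustrated with a minimal sketch. It assumes the expected count is the product of the two single-base frequencies times the number of dinucleotide positions; the function name `dinucleotide_obs_exp` is hypothetical, and the abstract does not state the authors' exact formula.

```python
from collections import Counter


def dinucleotide_obs_exp(seq: str, dinuc: str = "CG") -> float:
    """Observed / expected ratio of a dinucleotide (assumed independence model)."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_counts = Counter(seq)
    # Observed: overlapping occurrences of the dinucleotide along the sequence.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    # Expected: product of the single-base frequencies times the number
    # of dinucleotide positions (n - 1), i.e. adjacent bases independent.
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n) * (n - 1)
    if expected == 0:
        raise ValueError("expected count is zero; ratio undefined")
    return observed / expected
```

Under this model, a ratio of 0.40 corresponds to the "40% of the frequency expected" reported for CpG, and the same function can be called with `"TG"` or `"CA"` for the transition products.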

3.
Human immunodeficiency virus type 1 (HIV-1) and other lentiviridae demonstrate a strong preference for the A-nucleotide, which can account for up to 40% of the viral RNA genome. The biological mechanism responsible for this nucleotide bias is currently unknown. The increased A-content of these viral genomes corresponds to the typical use of synonymous codons by all members of the lentiviral family (HIV, SIV, BIV, FIV, CAEV, EIAV, visna) and the human spuma retrovirus, but not by other retroviruses like the human T-cell leukemia viruses HTLV-I and HTLV-II. In this article, we analyzed A-bias for all codon groups in all open reading frames of several lentiviruses. The extent of lentiviral codon bias could be related to host cellular translation. By calculating codon bias indices (CBIs), we were able to demonstrate an inverse correlation between the extent of codon bias and the rate of translation of individual reading frames in these viruses. Specifically, the shift toward A-rich codons is more pronounced in pol than in gag lentiviral genes. Since it is known that Gag synthesis exceeds Pol synthesis by a factor of 20 due to infrequent ribosomal frame-shifting during translation of the gag-pol mRNA molecule, we propose that the aminoacyl-tRNA availability in the host cell restricts the lentiviral preference for A-rich codons. In addition, fewer A-nucleotides were found in regions of the viral genome encoding multiple functions, e.g., overlapping reading frames (tat-rev-env) or genes that overlap regulatory sequences (nef-LTR region). Finally, the characteristics of lentiviral codon usage are presented as a phylogenetic tree without the need for prior sequence alignment. Correspondence to: B. Berkhout

4.
This study reports the analysis of codon usage in 35 complete Homo sapiens genes. Both codon frequency and inter-codon interference exhibit patterns of evolutionary interest. There is a significant positive correlation between the frequency with which a given codon is used and the frequency with which its complement is used. Since the frequency of appearance of the complementary codon on the coding strand is equal to the frequency of appearance of the original codon on the non-coding strand, in the same phase, the non-coding strand is found to resemble the coding strand in triplet composition. The same effect has been observed in Escherichia coli. This preference for the use of certain complementary triplets as codons suggests that the evolution of the use of the genetic code depended to some extent upon the double-stranded nature of the coding material. In addition, the effect of discrimination against the use of two dinucleotides, CpG and UpA, is observed in codon usage and also in adjacent codon interference. Codons beginning with G, or A, are unlikely to be preceded by codons ending in C, or U, respectively. Consideration of codon assignment in the genetic code together with the observed CpG infrequency suggests that the evolution of the code may have been influenced by conditions in which the use of CpG dinucleotides was unfavorable. The infrequent use of UpA dinucleotides can be explained as the result of frameshift mutation during gene evolution.

5.
Summary: In species where actin genes exist as single copies, analysis of their synonymous codon usage and of the substitutions occurring between the genes of closely related species shows that there is positive selection for codons that do not have highly mutable CpG dinucleotides in codon positions 2 and 3 when the GC content of these genes is less than 57%.

6.
A wide spectrum of mutations, ranging from point mutations to large deletions, has been described in the retinoblastoma gene (RB1). Mutations have been found throughout the gene; however, these genetic alterations do not appear to be homogeneously distributed. In particular, a significant proportion of disease-causing mutations results in the premature termination of protein synthesis, and the majority of these mutations occur as C→T transitions at CpG dinucleotides (CpGs). Such recurrent CpG mutations, including those found in RB1, are likely the result of the deamination of 5-methylcytosine within these CpGs. In the present study, we used the sodium bisulfite conversion method to detect cytosine methylation in representative exons of RB1. We analyzed DNA from a variety of tissues and specifically targeted CGA codons in RB1, where recurrent premature termination mutations have been reported. We found that DNA methylation within RB1 exons 8, 14, 25, and 27 appeared to be restricted to CpGs, including six CGA codons. Other codons containing methylated cytosines have not been reported to be mutated. Therefore, disease-causing mutations at CpGs in RB1 appear to be determined by several factors, including the constitutive presence of DNA methylation at cytosines within CpGs, the specific codon within which the methylated cytosine is located, and the particular region of the gene within which that codon resides.

7.
De novo origin of coding sequence remains an obscure issue in molecular evolution. One of the possible paths for addition (subtraction) of DNA segments to (from) a gene is stop codon shift. Single nucleotide substitutions can destroy the existing stop codon, leading to uninterrupted translation up to the next stop codon in the gene’s reading frame, or create a premature stop codon via a nonsense mutation. Furthermore, frameshifts caused by short indels near a gene’s end may lead to premature stop codons or to translation past the existing stop codon. Here, we describe the evolution of the length of coding sequence of prokaryotic genes by change of positions of stop codons. We observed cases of addition of regions of 3′UTR to genes due to mutations at the existing stop codon, and cases of subtraction of C-terminal coding segments due to nonsense mutations upstream of the stop codon. Many of the observed stop codon shifts cannot be attributed to sequencing errors or rare deleterious variants segregating within bacterial populations. The additions of regions of 3′UTR tend to occur in those genes in which they are facilitated by nearby downstream in-frame triplets which may serve as new stop codons. Conversely, subtractions of coding sequence often give rise to in-frame stop codons located nearby. The amino acid composition of the added region is significantly biased, compared to the overall amino acid composition of the genes. Our results show that in prokaryotes, shift of stop codon is an underappreciated contributor to functional evolution of gene length.
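
The stop codon shift described in the abstract above can be sketched in a few lines. This is only an illustration, assuming the three standard stop codons (TAA, TAG, TGA) and a reading frame that starts at the first base; the function name `codons_before_stop` and the example sequences are made up and are not taken from the study.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (assumed code)


def codons_before_stop(seq: str) -> Optional[int]:
    """Number of codons translated before the first in-frame stop codon.

    Returns None when no in-frame stop codon is found.
    """
    seq = seq.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical example: a single substitution (TAA -> TAC) destroys the
# original stop codon, so translation continues to the next in-frame stop.
original = "ATGAAATAAGGGTGA"
mutated = "ATGAAATACGGGTGA"
```

Here `codons_before_stop(original)` gives 2 and `codons_before_stop(mutated)` gives 4, which is the kind of 3′ extension the abstract refers to; a nonsense mutation would shorten the count instead.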

8.
Summary: Based on the rates of synonymous substitution in 42 protein-coding gene pairs from rat and human, a correlation is shown to exist between the frequency of the nucleotides in all positions of the codon and the synonymous substitution rate. The correlation coefficients were positive for A and T and negative for C and G. This means that AT-rich genes accumulate more synonymous substitutions than GC-rich genes. Biased patterns of mutation could not account for this phenomenon. Thus, the variation in synonymous substitution rates and the resulting unequal codon usage must be the consequence of selection against A and T in synonymous positions. Most of the variation in rates of synonymous substitution can be explained by the nucleotide composition in synonymous positions. Codon-anticodon interactions, dinucleotide frequencies, and contextual factors influence neither the rates of synonymous substitution nor codon usage. Interestingly, the nucleotide in the second position of codons (always a nonsynonymous position) was found to affect the rate of synonymous substitution. This finding links the rate of nonsynonymous substitution with the synonymous rate. Consequently, highly conservative proteins are expected to be encoded by genes that evolve slowly in terms of synonymous substitutions, and are consequently highly biased in their codon usage.

9.
Akashi H, Ko WY, Piao S, John A, Goel P, Lin CF, Vitins AP. Genetics 2006, 172(3): 1711-1726
Although mutation, genetic drift, and natural selection are well established as determinants of genome evolution, the importance (frequency and magnitude) of parameter fluctuations in molecular evolution is less understood. DNA sequence comparisons among closely related species allow specific substitutions to be assigned to lineages on a phylogenetic tree. In this study, we compare patterns of codon usage and protein evolution in 22 genes (>11,000 codons) among Drosophila melanogaster and five relatives within the D. melanogaster subgroup. We assign changes to eight lineages using a maximum-likelihood approach to infer ancestral states. Uncertainty in ancestral reconstructions is taken into account, at least to some extent, by weighting reconstructions by their posterior probabilities. Four of the eight lineages show potentially genomewide departures from equilibrium synonymous codon usage; three are decreasing and one is increasing in major codon usage. Several of these departures are consistent with lineage-specific changes in selection intensity (selection coefficients scaled to effective population size) at silent sites. Intron base composition and rates and patterns of protein evolution are also heterogeneous among these lineages. The magnitude of forces governing silent, intron, and protein evolution appears to have varied frequently, and in a lineage-specific manner, within the D. melanogaster subgroup.

10.
11.
Lavner Y, Kotlar D. Gene 2005, 345(1): 127-138
We study the interrelations between tRNA gene copy numbers, gene expression levels and measures of codon bias in the human genome. First, we show that isoaccepting tRNA gene copy numbers correlate positively with expression-weighted frequencies of amino acids and codons. Using expression data of more than 14,000 human genes, we show a weak positive correlation between gene expression level and frequency of optimal codons (codons with highest tRNA gene copy number). Interestingly, contrary to non-mammalian eukaryotes, codon bias tends to be high in both highly expressed genes and lowly expressed genes. We suggest that selection may act on codon bias, not only to increase elongation rate by favoring optimal codons in highly expressed genes, but also to reduce elongation rate by favoring non-optimal codons in lowly expressed genes. We also show that the frequency of optimal codons is in positive correlation with estimates of protein biosynthetic cost, and suggest another possible action of selection on codon bias: preference of optimal codons as production cost rises, to reduce the rate of amino acid misincorporation. In the analyses of this work, we introduce a new measure of frequency of optimal codons (FOP'), which is unaffected by amino acid composition and is corrected for background nucleotide content; we also introduce a new method for computing expected codon frequencies, based on the dinucleotide composition of the introns and the non-coding regions surrounding a gene.
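
For orientation, the plain frequency of optimal codons (not the corrected FOP' measure introduced in the abstract above, whose formula is not given here) can be sketched as follows. The function name, the choice to skip ATG, TGG and stop codons, and the example optimal-codon set are all assumptions made for illustration.

```python
# Codons with no synonymous alternative (Met, Trp) and stop codons are
# skipped, as is common for the plain Fop measure (assumption).
DEFAULT_IGNORED = frozenset({"ATG", "TGG", "TAA", "TAG", "TGA"})


def frequency_of_optimal_codons(cds, optimal, ignored=DEFAULT_IGNORED):
    """Fraction of counted codons in `cds` that belong to the `optimal` set."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counted = [c for c in codons if c not in ignored]
    if not counted:
        raise ValueError("no codons left to count")
    return sum(c in optimal for c in counted) / len(counted)
```

With the hypothetical optimal set `{"CTG"}`, the sequence `"ATGCTGCTGCTATAA"` has three counted codons (CTG, CTG, CTA), two of which are optimal. In the abstract, optimal codons are defined per amino acid as those with the highest tRNA gene copy number, so a real set would come from tRNA gene counts.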

12.
13.
A O Urrutia, L D Hurst. Genetics 2001, 159(3): 1191-1199
In numerous species, from bacteria to Drosophila, evidence suggests that selection acts even on synonymous codon usage: codon bias is greater in more abundantly expressed genes, the rate of synonymous evolution is lower in genes with greater codon bias, and there is consistency between genes in the same species in which codons are preferred. In contrast, in mammals, while nonequal use of alternative codons is observed, the bias is attributed to the background variance in nucleotide concentrations, reflected in the similar nucleotide composition of flanking noncoding and exonic third sites. However, a systematic examination of the covariants of codon usage controlling for background nucleotide content has yet to be performed. Here we present a new method to measure codon bias that corrects for background nucleotide content and apply this to 2396 human genes. Nearly all (99%) exhibit a higher amount of codon bias than expected by chance. The patterns associated with selectively driven codon bias are weakly recovered: Broadly expressed genes have a higher level of bias than do tissue-specific genes, the bias is higher for genes with lower rates of synonymous substitutions, and certain codons are repeatedly preferred. However, while these patterns are suggestive, the first two patterns appear to be methodological artifacts. The last pattern reflects in part biases in usage of nucleotide pairs. We conclude that we find no evidence for selection on codon usage in humans.

14.
Summary: Sequence data from regions of five vertebrate vitellogenin genes were used to examine the frequency, distribution, and mutability of the dinucleotide CpG, the preferred modification site for eukaryotic DNA methyltransferases. The observed level of the CpG dinucleotide in all five genes was markedly lower than that expected from the known mononucleotide frequencies. CpG suppression was greater in introns than in exons. CpG-containing codons were found to be avoided in the vitellogenin genes, but not completely despite the redundancy of the genetic code. Frequency and distribution patterns of this dinucleotide varied dramatically among these otherwise closely related genes. Dense clusters of CpG dinucleotides tended to appear in regions of either functional or structural interest (e.g., in the transposon-like Vi-element of Xenopus) and these clusters contained 5-methylcytosine (5mC). 5mC is known to undergo deamination to form thymidine, but the extent to which this transition occurs in the heavily methylated genomes of vertebrates and its contribution to CpG suppression are still unclear. Sequence comparison of the methylated vitellogenin gene regions identified C→T and G→A substitutions that were found to occur at relatively high frequencies. The predicted products of CpG deamination, TpG and CpA, were elevated. These findings are consistent with the view that CpG distribution and methylation are interdependent and that deamination of 5mC plays an important role in promoting evolutionary change at the nucleotide sequence level.

15.
HIV avoids elimination by cytotoxic T-lymphocytes (CTLs) through the evolution of escape mutations. Although there is mounting evidence that these escape pathways are broadly consistent among individuals with similar human leukocyte antigen (HLA) class I alleles, previous population-based studies have been limited by the inability to simultaneously account for HIV codon covariation, linkage disequilibrium among HLA alleles, and the confounding effects of HIV phylogeny when attempting to identify HLA-associated viral evolution. We have developed a statistical model of evolution, called a phylogenetic dependency network, that accounts for these three sources of confounding and identifies the primary sources of selection pressure acting on each HIV codon. Using synthetic data, we demonstrate the utility of this approach for identifying sites of HLA-mediated selection pressure and codon evolution as well as the deleterious effects of failing to account for all three sources of confounding. We then apply our approach to a large, clinically-derived dataset of Gag p17 and p24 sequences from a multicenter cohort of 1144 HIV-infected individuals from British Columbia, Canada (predominantly HIV-1 clade B) and Durban, South Africa (predominantly HIV-1 clade C). The resulting phylogenetic dependency network is dense, containing 149 associations between HLA alleles and HIV codons and 1386 associations among HIV codons. These associations include the complete reconstruction of several recently defined escape and compensatory mutation pathways and agree with emerging data on patterns of epitope targeting. The phylogenetic dependency network adds to the growing body of literature suggesting that sites of escape, order of escape, and compensatory mutations are largely consistent even across different clades, although we also identify several differences between clades. 
As recent case studies have demonstrated, understanding both the complexity and the consistency of immune escape has important implications for CTL-based vaccine design. Phylogenetic dependency networks represent a major step toward systematically expanding our understanding of CTL escape to diverse populations and whole viral genes.

16.
Sequence Evolution of Drosophila Mitochondrial DNA   Cited by: 18 (self-citations: 3, citations by others: 15)
We have compared nucleotide sequences of corresponding segments of the mitochondrial DNA (mtDNA) molecules of Drosophila yakuba and Drosophila melanogaster, which contain the genes for six proteins and seven tRNAs. The overall frequency of substitution between the nucleotide sequences of these protein genes is 7.2%. As was found for mtDNAs from closely related mammals, most substitutions (86%) in Drosophila mitochondrial protein genes do not result in an amino acid replacement. However, the frequencies of transitions and transversions are approximately equal in Drosophila mtDNAs, which is in contrast to the vast excess of transitions over transversions in mammalian mtDNAs. In Drosophila mtDNAs the frequency of C↔T substitutions per codon in the third position is 2.5 times greater among codons of two-codon families than among codons of four-codon families; this is contrary to the hypothesis that third position silent substitutions are neutral in regard to selection. In the third position of codons of four-codon families transversions are 4.6 times more frequent than transitions and A↔T substitutions account for 86% of all transversions. Ninety-four percent of all codons in the Drosophila mtDNA segments analyzed end in A or T. However, as this alone cannot account for the observed high frequency of A↔T substitutions there must be either a disproportionately high rate of A↔T mutation in Drosophila mtDNA or selection bias for the products of A↔T mutation. Consideration of the frequencies of interchange of AGA and AGT codons in the corresponding D. yakuba and D. melanogaster mitochondrial protein genes provides strong support for the view that AGA specifies serine in the Drosophila mitochondrial genetic code.
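
The transition/transversion comparison in the abstract above rests on a simple classification: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. A minimal sketch for two aligned sequences follows; the function name `count_ts_tv` and the handling of gaps and ambiguous bases (skipped) are assumptions, not the authors' procedure.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def count_ts_tv(seq_a: str, seq_b: str) -> tuple:
    """Count (transitions, transversions) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites, gaps and ambiguous symbols.
        if x == y or x not in BASES or y not in BASES:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # e.g. A<->T, A<->C, G<->T, G<->C
    return transitions, transversions
```

For example, comparing `"ACGT"` with `"GTGA"` gives two transitions (A↔G, C↔T) and one transversion (T↔A). Restricting the input to third codon positions would give the kind of position-specific counts the abstract reports.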

17.
A codon-based model of nucleotide substitution for protein-coding DNA sequences   Cited by: 34 (self-citations: 23, citations by others: 11)
A codon-based model for the evolution of protein-coding DNA sequences is presented for use in phylogenetic estimation. A Markov process is used to describe substitutions between codons. Transition/transversion rate bias and codon usage bias are allowed in the model, and selective restraints at the protein level are accommodated using physicochemical distances between the amino acids coded for by the codons. Analyses of two data sets suggest that the new codon-based model can provide a better fit to data than can nucleotide-based models and can produce more reliable estimates of certain biologically important measures such as the transition/transversion rate ratio and the synonymous/nonsynonymous substitution rate ratio.

18.
The evolution of human immunodeficiency virus (HIV) type 1 nef quasispecies in a patient clonally infected with a contaminated batch of blood clotting factor IX was monitored. nef sequences were derived at 11, 25, and 41 months postinfection from infected peripheral blood mononuclear cells after molecular cloning of PCR-amplified proviral DNA. The phylogenetic relationships among a total of 41 informative sequences were established by split decomposition analysis and used as a basis to establish a substitution matrix and to score synonymous (s) and nonsynonymous (ns) substitutions. The number of observed in-phase stop codons within the nef sequences was comparable to that expected on a random basis. Similarly, the numbers of observed s and ns substitutions did not differ significantly from expected values. No codon position was preferentially mutated. The maximum sequence divergence increased in a linear manner, with approximately 4.4 nucleotide and approximately 3.2 amino acid changes per year. It appears that stochastic processes strongly influence short-term HIV nef quasispecies evolution in vivo.

19.
20.
Substitution rates at the three codon positions (r1, r2, and r3) of mammalian mitochondrial genes are in the order of r3 > r1 > r2, and the rate heterogeneity at the three positions, as measured by the shape parameter of the gamma distribution (alpha 1, alpha 2, and alpha 3), is in the order of alpha 3 > alpha 1 > alpha 2. The causes for the rate heterogeneity at the three codon positions remain unclear and, in particular, there has been no satisfactory explanation for the observation of alpha 1 > alpha 2. I attempted to dissect the causes of rate heterogeneity by studying the pattern of nonsynonymous substitutions with respect to codon positions in 10 mitochondrial genes from 19 mammalian species. Nonsynonymous substitutions involve more different amino acid replacements at the second than at the first codon position, which results in r1 > r2. The difference between r1 and r2 increases with the intensity of purifying selection, and so does the rate heterogeneity in nonsynonymous substitutions among sites at the same codon position. All mitochondrial genes appear to have functionally important and unimportant codons, with the latter having all three codon positions prone to nonsynonymous substitutions. Within the functionally important codons, the second codon position is much more conservative than the first codon position. This explains why alpha 1 > alpha 2. The result suggests that overweighting of the second codon position in phylogenetic analysis may be a misguided practice.
