Similar literature
Found 20 similar documents (search time: 15 ms)
1.
CpG dinucleotides mutate at a high rate because cytosine is vulnerable to deamination, cytosines in CpG dinucleotides are often methylated, and deamination of 5-methylcytosine (5mC) produces thymidine. Previous experiments have shown that DNA melting is the rate-limiting step in cytosine deamination. Here we show, through the analysis of human single-nucleotide polymorphisms (SNPs), that the mutation rate produced by 5mC deamination is highly dependent on local GC content. In fact, linear regression analysis showed that the log10 of the 5mC mutation rates (inferred from SNP frequencies) had slopes of -3 when graphed with respect to the GC content of neighboring sequences. This is the ideal slope that would be expected if the correlation between CpG underrepresentation and GC content had been solely caused by DNA melting. Moreover, this same result was obtained regardless of the SNP locations (all SNPs versus only SNPs in noncoding intergenic regions, excluding CpG islands) and regardless of the lengths over which GC content was calculated (SNP sequences with a modal length of 564 bp versus genomic contigs with a modal length of 163 kb). Several alternative interpretations are discussed.
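The regression described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the GC values are taken to be fractions (the abstract does not state the units), the rates are synthetic numbers constructed to follow log10(rate) = -1 - 3 × GC, and `fit_line` is a hypothetical helper, not the authors' analysis code.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical local GC fractions (assumed units: fraction, not percent).
gc_fractions = [0.35, 0.40, 0.45, 0.50, 0.55]

# Synthetic 5mC mutation rates built to follow log10(rate) = -1 - 3 * GC.
rates = [10 ** (-1.0 - 3.0 * gc) for gc in gc_fractions]

# Regress log10(rate) on GC content, as the abstract describes.
log_rates = [math.log10(r) for r in rates]
slope, intercept = fit_line(gc_fractions, log_rates)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Because the synthetic data are noise-free, the fit recovers the slope used to build them; real SNP-derived rates would scatter around the line.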

2.
Compositional evolution of noncoding DNA in the human and chimpanzee genomes   Cited by: 11 (self-citations: 0, citations by others: 11)
We have examined the compositional evolution of noncoding DNA in the primate genome by comparison of lineage-specific substitutions observed in 1.8 Mb of genomic alignments of human, chimpanzee, and baboon with 6542 human single-nucleotide polymorphisms (SNPs) rooted using chimpanzee sequence. The pattern of compositional evolution, measured in terms of the numbers of GC-->AT and AT-->GC changes, differs significantly between fixed and polymorphic sites, and indicates that there is a bias toward fixation of AT-->GC mutations, which could result from weak directional selection or biased gene conversion in favor of high GC content. Comparison of the frequency distributions of a subset of the SNPs revealed no significant difference between GC-->AT and AT-->GC polymorphisms, although AT-->GC polymorphisms in regions of high GC segregate at slightly higher frequencies on average than GC-->AT polymorphisms, which is consistent with a fixation bias favoring high GC in these regions. However, the substitution data suggest that this fixation bias is relatively weak, because the compositional structure of the human and chimpanzee genomes is becoming homogenized, with regions of high GC decreasing in GC content and regions of low GC increasing in GC content. The rate and pattern of nucleotide substitution in 333 Alu repeats within the human-chimpanzee-baboon alignments are not significantly affected by the GC content of the region in which they are inserted, providing further evidence that, since the time of the human-chimpanzee ancestor, there has been little or no regional variation in mutation bias.

3.
Regional biases in substitution pattern are likely to be responsible for the large-scale variation in base composition observed in vertebrate genomes. However, the evolutionary forces responsible for these biases are still not clearly defined. In order to study the processes of mutation and fixation across the entire human genome, we analyzed patterns of substitution in Alu repeats since their insertion. We also studied patterns of human polymorphism within the repeats. There is a highly significant effect of recombination rate on the pattern of substitution, whereas no such effect is seen on the pattern of polymorphism. These results suggest that regional biases in substitution are caused by biased gene conversion, a process that increases the probability of fixation of mutations that increase GC content. Furthermore, the strongest correlate of substitution patterns is found to be male recombination rates rather than female or sex-averaged recombination rates. This indicates that in addition to sexual dimorphism in recombination rates, the sexes also differ in the relative rates of crossover and gene conversion.

4.
SINEs, evolution and genome structure in the opossum   Cited by: 3 (self-citations: 0, citations by others: 3)
Short INterspersed Elements (SINEs) are non-autonomous retrotransposons, usually between 100 and 500 base pairs (bp) in length, which are ubiquitous components of eukaryotic genomes. Their activity, distribution, and evolution can be highly informative on genomic structure and evolutionary processes. To determine recent activity, we amplified more than one hundred SINE1 loci in a panel of 43 M. domestica individuals derived from five diverse geographic locations. The SINE1 family has expanded recently enough that many loci were polymorphic, and the SINE1 insertion-based genetic distances among populations reflected geographic distance. Genome-wide comparisons of SINE1 densities and GC content revealed that high SINE1 density is associated with high GC content in a few long and many short spans. Young SINE1s, whether fixed or polymorphic, showed an unbiased GC content preference for insertion, indicating that the GC preference accumulates over long time periods, possibly in periodic bursts. SINE1 evolution is thus broadly similar to human Alu evolution, although it has an independent origin. High GC content adjacent to SINE1s is strongly correlated with bias towards higher AT to GC substitutions and lower GC to AT substitutions. This is consistent with biased gene conversion, and also indicates that like chickens, but unlike eutherian mammals, GC content heterogeneity (isochore structure) is reinforced by substitution processes in the M. domestica genome. Nevertheless, both high and low GC content regions are apparently headed towards lower GC content equilibria, possibly due to a relative shift to lower recombination rates in the recent Monodelphis ancestral lineage. Like eutherians, metatherian (marsupial) mammals have evolved high CpG substitution rates, but this is apparently a convergence in process rather than a shared ancestral state.

5.
This study presents the first global, 1-Mbp-level analysis of patterns of nucleotide substitutions along the human lineage. The study is based on the analysis of a large number of repetitive elements deposited into the human genome since the mammalian radiation, yielding a number of results that would have been difficult to obtain using the more conventional comparative method of analysis. This analysis revealed substantial and consistent variability of rates of substitution, with the variability ranging up to twofold among different regions. The rates of substitutions of C or G nucleotides with A or T nucleotides vary much more sharply than the reverse rates, suggesting that much of that variation is due to differences in mutation rates rather than in the probabilities of fixation of C/G vs. A/T nucleotides across the genome. For all types of substitution we observe substantially more hotspots than coldspots, with hotspots showing substantial clustering over tens of Mbp. Our analysis revealed that GC-content of surrounding sequences is the best predictor of the rates of substitution. The pattern of substitution appears very different near telomeres compared to the rest of the genome and cannot be explained by the genome-wide correlations of the substitution rates with GC content or exon density. The telomere pattern of substitution is consistent with natural selection or biased gene conversion acting to increase the GC-content of the sequences that are within 10–15 Mbp of the telomere.
Reviewing Editor: Dr. Jerzy Jurka
This revised version was published online in July 2005 with corrected page numbers.

6.
Singh ND, Arndt PF, Petrov DA. Genetics, 2005, 169(2): 709-722
Mutation is the underlying force that provides the variation upon which evolutionary forces can act. It is important to understand how mutation rates vary within genomes and how the probabilities of fixation of new mutations vary as well. If substitutional processes across the genome are heterogeneous, then examining patterns of coding sequence evolution without taking these underlying variations into account may be misleading. Here we present the first rigorous test of substitution rate heterogeneity in the Drosophila melanogaster genome using almost 1500 nonfunctional fragments of the transposable element DNAREP1_DM. Not only do our analyses suggest that substitutional patterns in heterochromatic and euchromatic sequences are different, but also they provide support in favor of a recombination-associated substitutional bias toward G and C in this species. The magnitude of this bias is entirely sufficient to explain recombination-associated patterns of codon usage on the autosomes of the D. melanogaster genome. We also document a bias toward lower GC content in the pattern of small insertions and deletions (indels). In addition, the GC content of noncoding DNA in Drosophila is higher than would be predicted on the basis of the pattern of nucleotide substitutions and small indels. However, we argue that the fast turnover of noncoding sequences in Drosophila makes it difficult to assess the importance of the GC biases in nucleotide substitutions and small indels in shaping the base composition of noncoding sequences.

7.
8.
The ApoE gene, which is implicated in Alzheimer's disease, has been examined to identify functional consequences of single-nucleotide polymorphisms (SNPs). Eighty-eight SNPs have been identified in the ApoE gene, of which 31 are nonsynonymous, 8 are coding synonymous, 33 are intronic, and 3 are in untranslated regions (UTRs). The UTR SNPs comprise two in the 5′ UTR and one in the 3′ UTR. Twenty-nine percent of the identified nsSNPs have been reported as damaging. In the analysis of SNPs in the UTR regions, it has been recognized that rs72654467 from 5′ and rs71673244 from 5′ and 3′ are responsible for the alteration in levels of expression. Both native and mutant protein structures were analyzed along with the stabilization residues. It has been concluded that among all SNPs of ApoE, the mutation in rs11542041 (R132S) has the most significant effect on functional variation.

9.
Single nucleotide polymorphisms (SNPs) are thought to be well suited to genetic and evolutionary studies. In this study, we report the first set of SNP markers in a commercially important crab species, Scylla paramamosain. A total of 12,500 base pairs of high-quality DNA sequence were obtained from 15 genes, and thirty-seven SNPs were identified, representing one SNP every 338 base pairs. Twenty-four SNPs were successfully genotyped in a single population. All loci had two alleles and the minor allele frequency ranged from 0.02 to 0.44. The observed and expected heterozygosity ranged from 0.04 to 0.59 and from 0.04 to 0.50, respectively. No significant departures from Hardy–Weinberg equilibrium were found at any locus. Linkage disequilibrium was detected in six locus pairs, but was absent after sequential Bonferroni correction. These SNP markers will provide a useful addition to the genetic tools for genetic and evolutionary studies of S. paramamosain.
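The figures quoted in this abstract can be checked with a short sketch. It assumes the textbook expression for expected heterozygosity at a biallelic locus under Hardy–Weinberg proportions, He = 2pq; the abstract does not say which estimator the authors used, and both function names are hypothetical.

```python
def snp_density(total_bp, n_snps):
    """Average number of base pairs per SNP."""
    return total_bp / n_snps


def expected_heterozygosity(minor_allele_freq):
    """Expected heterozygosity 2pq at a biallelic locus (assumed estimator)."""
    q = minor_allele_freq
    p = 1.0 - q
    return 2.0 * p * q


# Values taken from the abstract: 12,500 bp sequenced, 37 SNPs found.
bp_per_snp = snp_density(12_500, 37)
print(f"one SNP every {bp_per_snp:.0f} bp")

# Minor allele frequencies at the two ends of the reported range.
he_low = expected_heterozygosity(0.02)
he_high = expected_heterozygosity(0.44)
print(f"He at MAF 0.02: {he_low:.4f}")
print(f"He at MAF 0.44: {he_high:.4f}")
```

Under this simple formula the lower bound rounds to the reported 0.04, while the upper bound comes out slightly below the reported 0.50; the gap may reflect rounding or a sample-size correction in the original analysis.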

10.
We used the LOKI software to generate multipoint identity-by-descent matrices for a microsatellite map (with 31 markers) and two single-nucleotide polymorphism (SNP) maps to examine information content across chromosome 7 in the Collaborative Study on the Genetics of Alcoholism dataset. Despite the lower information provided by a single SNP, SNP maps overall had higher and more uniform information content across the chromosome. The Affymetrix map (578 SNPs) and the Illumina map (271 SNPs) provided almost identical information. However, increased information has a computational cost: SNP maps require 100 times as many iterations as microsatellites to produce stable estimates.

11.
Microsatellite variation and the mechanisms which are responsible for this variation have received much attention in the last few years. Most theoretical studies of microsatellite allele distributions, however, did not incorporate the evolutionary dynamics of linked sites. The dynamics is usually modeled by invoking a special mutation mechanism such as stepwise mutation, which leads to a stepwise increase or decrease of the number of motif repeats on the occasion of mutation. It is shown here that selection at a locus, which itself is not subject to mutation, but which is adjacent to a microsatellite locus has an influence on statistics of the microsatellite allele distribution, provided that mutation rates are low to intermediate, when compared to 1/t1, the inverse of the time to fixation of a linked favorable substitution. If mutation rates are high, as for example in humans, a selective effect upon the microsatellite locus, such as hitchhiking, will quickly be obscured by mutations. In particular, in the latter case, the model shows that no correlation is to be expected between recombination rates and variability of microsatellites, such as had been predicted and experimentally demonstrated for nucleotide variability and recombination rates in Drosophila. The presented model is a generalization of the two locus two allele hitchhiking model which had been studied by Stephan and co-workers.

12.
Popescu CE, Lee RW. Genetics, 2007, 175(2): 819-826
The mitochondrial genomes of the Chlorophyta exhibit significant diversity with respect to gene content and genome compactness; however, quantitative data on the rates of nucleotide substitution in mitochondrial DNA, which might help explain the origin of this diversity, are lacking. To gain insight into the evolutionary forces responsible for mitochondrial genome diversification, we sequenced to near completion the mitochondrial genome of the chlorophyte Chlamydomonas incerta, estimated the evolutionary divergence between Chlamydomonas reinhardtii and C. incerta mitochondrial protein-coding genes and rRNA-coding regions, and compared the relative evolutionary rates in mitochondrial and nuclear genes. Synonymous and nonsynonymous substitution rates do not differ significantly between the mitochondrial and nuclear protein-coding genes. The mitochondrial rRNA-coding regions, however, are evolving much faster than their nuclear counterparts, and this difference might be explained by relaxed functional constraints on the mitochondrial translational apparatus due to the small number of proteins synthesized in Chlamydomonas mitochondria. Substitution rates at synonymous sites in a nonstandard mitochondrial gene (rtl) and at intronic and synonymous sites in nuclear genes expressed at low levels suggest that the mutation rate is similar in these two genetic compartments. Potential evolutionary forces shaping mitochondrial genome evolution in Chlamydomonas are discussed.

13.
Although single-nucleotide polymorphisms (SNPs) have become the marker of choice in the field of human genetics, these markers are only slowly emerging in ecological, evolutionary and conservation genetic analyses of nonmodel species. This is partly because of difficulties associated with the discovery and characterization of SNP markers. Herein, we adopted a simple straightforward approach to identifying SNPs, based on screening of a random genomic library. In total, we identified 768 SNPs in the ringed seal, Pusa hispida hispida, in samples from Greenland and Svalbard. Using three seal samples, SNPs were discovered at a rate of one SNP per 402 bp, whereas re-sequencing of 96 seals increased the density to one SNP per 29 bp. Although applicable to any species of interest, the approach is especially well suited for SNP discovery in nonmodel organisms and is easily implemented in any standard genetics laboratory, circumventing the need for prior genomic data and use of next-generation sequencing facilities.

14.
The mutation rate is known to vary between adjacent sites within the human genome as a consequence of context, the most well-studied example being the influence of CpG dinucleotides. We investigated whether there is additional variation by testing whether there is an excess of sites at which both humans and chimpanzees have a single-nucleotide polymorphism (SNP). We found a highly significant excess of such sites, and we demonstrated that this excess is not due to neighbouring nucleotide effects, ancestral polymorphism, or natural selection. We therefore infer that there is cryptic variation in the mutation rate. However, although this variation in the mutation rate is not associated with the adjacent nucleotides, we show that there are highly nonrandom patterns of nucleotides that extend ~80 base pairs on either side of sites with coincident SNPs, suggesting that there are extensive and complex context effects. Finally, we estimate the level of variation needed to produce the excess of coincident SNPs and show that there is a similar, or higher, level of variation in the mutation rate associated with this cryptic process than there is associated with adjacent nucleotides, including the CpG effect. We conclude that there is substantial variation in the mutation rate that has, until now, been hidden from view.

15.
We characterized rates and patterns of synonymous and nonsynonymous substitution in 242 duplicated gene pairs on chromosomes 2 and 4 of Arabidopsis thaliana. Based on their collinear order along the two chromosomes, the gene pairs were likely duplicated contemporaneously, and therefore comparison of genetic distances among gene pairs provides insights into the distribution of nucleotide substitution rates among plant nuclear genes. Rates of synonymous substitution varied 13.8-fold among the duplicated gene pairs, but 90% of gene pairs differed by less than 2.6-fold. Average nonsynonymous rates were approximately fivefold lower than average synonymous rates; this rate difference is lower than that of previously studied nonplant lineages. The coefficient of variation of rates among genes was 0.65 for nonsynonymous rates and 0.44 for synonymous rates, indicating that synonymous and nonsynonymous rates vary among genes to roughly the same extent. The causes underlying rate variation were explored. Our analyses tentatively suggest an effect of physical location on synonymous substitution rates but no similar effect on nonsynonymous rates. Nonsynonymous substitution rates were negatively correlated with GC content at synonymous third codon positions, and synonymous substitution rates were negatively correlated with codon bias, as observed in other systems. Finally, the 242 gene pairs permitted investigation of the processes underlying divergence between paralogs. We found no evidence of positive selection, little evidence that paralogs evolve at different rates, and no evidence of differential codon usage or third position GC content.

16.
A new modification of the single nucleotide polymorphism (SNP) analysis (DSNP, duplex-specific nuclease preference) method using the duplex-specific nuclease from the king crab was proposed. The method was used to study SNPs in the following human genes: kRAS, nRAS, hRAS, and p53, the genes of blood coagulation factor V, methyltetrahydrofolate reductase, prothrombin, and apolipoprotein E and a deletion in the BRCA1 gene. DSNP was shown to be useful for the estimation of the mutant allele content in DNA samples. A system for the simultaneous identification of several adjacent single-nucleotide polymorphisms in the kRAS gene was proposed. The approaches could be used to develop test systems for the detection of SNPs in human genes. The English version of the paper: Russian Journal of Bioorganic Chemistry, 2005, vol. 31, no. 6; see also http://www.maik.ru.

17.
The distribution of guanine and cytosine nucleotides throughout a genome, or the GC content, is associated with numerous features in mammals; understanding the pattern and evolutionary history of GC content is crucial to our efforts to annotate the genome. The local GC content is decaying toward an equilibrium point, but the causes and rates of this decay, as well as the value of the equilibrium point, remain topics of debate. By comparing the results of 2 methods for estimating local substitution rates, we identify 620 Mb of the human genome in which the rates of the various types of nucleotide substitutions are the same on both strands. These strand-symmetric regions show an exponential decay of local GC content at a pace determined by local substitution rates. DNA segments subjected to higher rates experience disproportionately accelerated decay and are AT rich, whereas segments subjected to lower rates decay more slowly and are GC rich. Although we are unable to draw any conclusions about causal factors, the results support the hypothesis proposed by Khelifi A, Meunier J, Duret L, and Mouchiroud D (2006. GC content evolution of the human and mouse genomes: insights from the study of processed pseudogenes in regions of different recombination rates. J Mol Evol. 62:745-752.) that the isochore structure has been reshaped over time. If rate variation were a determining factor, then the current isochore structure of mammalian genomes could result from the local differences in substitution rates. We predict that under current conditions strand-symmetric portions of the human genome will stabilize at an average GC content of 30% (considerably less than the current 42%), thus confirming that the human genome has not yet reached equilibrium.
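The idea of GC content decaying exponentially toward an equilibrium can be sketched with a simple two-state substitution model. This is an illustrative assumption, not the method used in the abstract: each site is either GC or AT, `u` is the GC-to-AT rate and `v` the AT-to-GC rate, so the equilibrium is v / (u + v) and the approach to it is exponential at rate u + v. The rate values are hypothetical and were chosen only so that the equilibrium equals the 30% quoted above.

```python
import math


def equilibrium_gc(u_gc_to_at, v_at_to_gc):
    """Equilibrium GC fraction of a two-state (GC/AT) substitution model."""
    return v_at_to_gc / (u_gc_to_at + v_at_to_gc)


def gc_at_time(gc_start, u_gc_to_at, v_at_to_gc, t):
    """GC fraction after time t, relaxing exponentially toward equilibrium."""
    gc_eq = equilibrium_gc(u_gc_to_at, v_at_to_gc)
    decay = math.exp(-(u_gc_to_at + v_at_to_gc) * t)
    return gc_eq + (gc_start - gc_eq) * decay


# Hypothetical rates per arbitrary time unit; only their ratio sets the equilibrium.
U_GC_TO_AT = 0.7
V_AT_TO_GC = 0.3

gc_eq = equilibrium_gc(U_GC_TO_AT, V_AT_TO_GC)

# Start from the current average of 42% quoted in the abstract.
trajectory = [gc_at_time(0.42, U_GC_TO_AT, V_AT_TO_GC, t) for t in range(6)]
for t, gc in enumerate(trajectory):
    print(f"t = {t}: GC = {gc:.4f}")
```

In this toy model, doubling both rates leaves the equilibrium unchanged but halves the time needed to approach it, which is one way to picture a decay whose pace depends on local substitution rates.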

18.
A new modification of the single nucleotide polymorphism (SNP) analysis (DSNP, duplex-specific nuclease preference) method using the duplex-specific nuclease from the king crab was proposed. The method was used to study SNPs in the following human genes: kRAS, nRAS, hRAS, and p53, the genes of blood coagulation factor V, methyltetrahydrofolate reductase, prothrombin, and apolipoprotein E and a deletion in the BRCA1 gene. DSNP was shown to be useful for the estimation of the mutant allele content in DNA samples. A system for the simultaneous identification of several adjacent single-nucleotide polymorphisms in the kRAS gene was proposed. The approaches could be used to develop test systems for the detection of SNPs in human genes.

19.
To investigate whether common variants in the human genetic background are associated with pathogenesis of ischemic heart diseases, we systematically surveyed 41 possible candidate genes for single-nucleotide polymorphisms (SNPs) by directly sequencing 96 independent alleles at each locus, derived from 48 unrelated Japanese patients with myocardial infarction, including 25.8 kb 5' flanking regions, 56.8 kb exonic and 35.4 kb intronic sequences, and 1.8 kb 3' flanking regions. In this genomic DNA of nearly 120 kb, we identified 187 SNPs: 55 in 5' flanking regions, seven in 5' untranslated regions (UTRs), 52 in coding elements, 64 in introns, eight in 3' UTRs, and one in a 3' flanking region. Among the 52 coding SNPs, 26 were non-synonymous changes. Allelic frequencies of some of the polymorphisms were significantly different from those reported in European populations. For example, the Q506R substitution in the coagulation factor V gene, the so-called "Leiden mutation", has a reported frequency of 2.3% in Europeans, but we detected the Leiden mutation in none of the Japanese genomes that we investigated. The allelic frequencies of the -33A>G SNP in the thrombomodulin gene were also very different; this allele occurred at a 12% frequency in the Japanese patients that we examined, although it had been detected in none of 82 Caucasians reported previously. These data support the hypothesis that some SNPs are specific to particular ethnic groups.

20.

Background

The epidermal growth factor receptor (EGFR) gene plays a key role in tumor survival, invasion, angiogenesis, and metastatic spread. Recent studies showed that gastric cancer (GC) was associated with polymorphisms of the EGFR gene and environmental influences, such as lifestyle factors. In this study, seven known SNPs in EGFR exons were investigated in a high-risk Chinese population in Jiangsu province to test whether genetic variants of EGFR exons and lifestyle are associated with an increased risk of GC.

Methodology/Principal Findings

A hospital-based case-control study was performed in Jiangsu province. The results showed that smoking, drinking and preference for salty food were significantly associated with the risk of GC. Lifestyle differences between males and females might be the reason for the higher incidence rates in males than in females. Seven exon SNPs were genotyped: rs2227983, rs2072454, rs17337023, rs1050171, rs1140475, rs2293347, and rs28384375. The variant rs2072454 T allele and TT genotype were significantly associated with an increased risk of GC. Interestingly, our results suggested that the ACAGCA haplotype might be associated with a decreased risk of GC. However, no significant association was observed between the other six SNPs and the risk of GC, either in the total population or in the age-matched population, even when gender differences were taken into account.

Conclusions

Smoking, drinking and preference for salty food were significantly associated with the risk of GC in Jiangsu province, with gender differences. Although only one SNP (rs2072454) was significantly associated with an increased risk of GC, combining the six EGFR exon SNPs may be useful for predicting the risk of GC.
