Similar articles
 Found 20 similar articles; search took 15 ms
1.
Microstructural changes such as insertions and deletions (indels) are a major driving force in the evolution of non-coding DNA sequences. To better understand the mechanisms by which indel mutations arise, as well as the molecular evolution of non-coding regions, the number and pattern of indels and nucleotide substitutions were compared in whole chloroplast genomes. Comparisons were made for a total of over 38 kb of non-coding DNA sequence from 126 intergenic regions in two data sets representing species with different divergence times: sugarcane and maize, and Oryza sativa var. indica and japonica. The main findings of this study are: (i) Approximately half of all indels are single-nucleotide indels, in agreement with previous studies in various organisms. (ii) The distribution and number of indels differed between the two data sets, and different patterns were observed for tandem repeat and non-repeat indels. (iii) The distribution of tandem repeat indels showed a statistically significant bias towards A/T-rich sequences. (iv) The rate of indel mutation was estimated to be approximately (0.8 ± 0.04) × 10⁻⁹ per site per year, similar to previous estimates in other organisms. (v) The frequencies of nucleotide substitutions and indels were significantly lower in the inverted repeat (IR).

2.
Singh ND  Arndt PF  Petrov DA 《Genetics》2005,169(2):709-722
Mutation is the underlying force that provides the variation upon which evolutionary forces can act. It is important to understand how mutation rates vary within genomes and how the probabilities of fixation of new mutations vary as well. If substitutional processes across the genome are heterogeneous, then examining patterns of coding sequence evolution without taking these underlying variations into account may be misleading. Here we present the first rigorous test of substitution rate heterogeneity in the Drosophila melanogaster genome using almost 1500 nonfunctional fragments of the transposable element DNAREP1_DM. Not only do our analyses suggest that substitutional patterns in heterochromatic and euchromatic sequences are different, but also they provide support in favor of a recombination-associated substitutional bias toward G and C in this species. The magnitude of this bias is entirely sufficient to explain recombination-associated patterns of codon usage on the autosomes of the D. melanogaster genome. We also document a bias toward lower GC content in the pattern of small insertions and deletions (indels). In addition, the GC content of noncoding DNA in Drosophila is higher than would be predicted on the basis of the pattern of nucleotide substitutions and small indels. However, we argue that the fast turnover of noncoding sequences in Drosophila makes it difficult to assess the importance of the GC biases in nucleotide substitutions and small indels in shaping the base composition of noncoding sequences.

3.
Sequence variation in 2.2 kb of non-coding regions of the chloroplast genome of eight dandelions (Taraxacum: Lactuceae) from Asia and Europe is interpreted in the light of the phylogenetic signal of base substitutions vs. indels (insertions-deletions). The four non-coding regions displayed a total of approximately 30 structural mutations of which 9 are potentially phylogenetically informative. Insertions, deletions, and an inversion were found that involved consecutive stretches of up to 172 bases. When compared to phylogenetic relationships of the chloroplast genomes based on nucleotide substitutions only, many homoplasious indels (33%) were detected that differed considerably in length and did not comprise simple sequence repeats typically associated with replication slippage. Though many indels in the intergenic spacers were associated with direct repeats, frequently, the variable stretches participated in inverted repeat stabilized hairpins. In each intergenic spacer or intron examined, nucleotide stretches ranging from 30 to 60 bp were able to fold into stabilized secondary structures. When these indels were homoplasious, they always ranked among the most stabilized hairpins in the non-coding regions. The association of higher order structures that involve both classes of repeats and parallel structural mutations in hot spot regions of the chloroplast genome can be used to differentiate among mutations that differ in phylogenetic reliability.

4.
5.
Mutation rates are used to calibrate molecular clocks and to link genetic variants with human disease. However, mutation rates are not uniform across each eukaryotic genome. Rates for insertion/deletion (indel) mutations have been found to vary widely when examined in vitro and at specific loci in vivo. Here, we report the genome-wide rates of formation and repair of indels made during replication of yeast nuclear DNA. Using over 6000 indels accumulated in four mismatch repair (MMR) defective strains, and statistical corrections for false negatives, we find that indel rates increase by 100,000-fold with increasing homonucleotide run length, representing the greatest effect on replication fidelity of any known genomic parameter. Nonetheless, long genomic homopolymer runs are overrepresented relative to random chance, implying positive selection. Proofreading defects in the replicative polymerases selectively increase indel rates in short repetitive tracts, likely reflecting the distance over which Pols δ and ε interact with duplex DNA upstream of the polymerase active site. In contrast, MMR defects hugely increase indel mutagenesis in long repetitive sequences. Because repetitive sequences are not uniformly distributed among genomic functional elements, the quantitatively different consequences on genome-wide repeat sequence instability conferred by defects in proofreading and MMR have important biological implications.

6.
Eighteen microsatellite markers were developed for the Crassostrea virginica nuclear genome, including di-, tri-, and tetranucleotide microsatellite repeat regions that included perfect, imperfect, and compound repeat sequences. A reference panel with DNA from the parents and four progeny of 10 full-sib families was used for a preliminary confirmation of polymorphism at these loci and indications of null alleles. Null alleles were discovered at three loci; in two instances, primer redesign enabled their amplification. Two to five representative alleles from each locus were sequenced to ensure that the targeted loci were amplifying. The sequence analysis revealed not only variation in the number of simple sequence repeat units, but also polymorphisms in the microsatellite flanking regions. A total of 3626 bp of combined microsatellite flanking region from the 18 loci was examined, revealing indels as well as nucleotide site substitutions. Overall, 16 indels and 146 substitutions were found with an average of 4.5% polymorphism across all loci. Eight markers were tested on the parents and 39-61 progeny from each of four families for examination of allelic inheritance patterns and genotypic ratios. Twenty-six tests of segregation ratios revealed eight significant departures from expected Mendelian ratios, three of which remained significant after correction for multiple tests. Deviations were observed in both the directions of heterozygote excess and deficiency.

7.
Traditional sequence comparison by alignment employs a mutation model comprising two events: substitutions and indels (insertions or deletions) of single positions. However, modern genetic analysis knows a variety of more complex mutation events (e.g., duplications, excisions, and rearrangements), especially regarding DNA. With ever more DNA sequence data becoming available, the need to accurately compare sequences that have clearly undergone more complicated types of mutational processes is becoming critical. Herein we introduce a new method for pairwise alignment and comparison of sequences with respect to the special evolution of tandem repeats: substitutions and indels of single positions and, additionally, duplications and excisions of variable degree (i.e., of one or more repeat copies simultaneously) are taken into account. To evaluate our method, we apply it to the spa VNTR (variable number of tandem repeats) cluster of Staphylococcus aureus, a bacterium of high medical importance.

8.
Estimate of the mutation rate per nucleotide in humans   Cited by: 41 (self-citations: 0; citations by others: 41)
Nachman MW  Crowell SL 《Genetics》2000,156(1):297-304
Many previous estimates of the mutation rate in humans have relied on screens of visible mutants. We investigated the rate and pattern of mutations at the nucleotide level by comparing pseudogenes in humans and chimpanzees to (i) provide an estimate of the average mutation rate per nucleotide, (ii) assess heterogeneity of mutation rate at different sites and for different types of mutations, (iii) test the hypothesis that the X chromosome has a lower mutation rate than autosomes, and (iv) estimate the deleterious mutation rate. Eighteen processed pseudogenes were sequenced, including 12 on autosomes and 6 on the X chromosome. The average mutation rate was estimated to be approximately 2.5 × 10⁻⁸ mutations per nucleotide site or 175 mutations per diploid genome per generation. Rates of mutation for both transitions and transversions at CpG dinucleotides are one order of magnitude higher than mutation rates at other sites. Single nucleotide substitutions are 10 times more frequent than length mutations. Comparison of rates of evolution for X-linked and autosomal pseudogenes suggests that the male mutation rate is 4 times the female mutation rate, but provides no evidence for a reduction in mutation rate that is specific to the X chromosome. Using conservative calculations of the proportion of the genome subject to purifying selection, we estimate that the genomic deleterious mutation rate (U) is at least 3. This high rate is difficult to reconcile with multiplicative fitness effects of individual mutations and suggests that synergistic epistasis among harmful mutations may be common.
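
The two headline figures in this abstract are linked by simple arithmetic, which the following minimal Python sketch makes explicit. Only the three constants are quoted from the abstract; the implied genome size and the implied minimum deleterious fraction are derived here for illustration and are not values reported by the authors.

```python
# Figures quoted in the abstract above.
RATE_PER_SITE = 2.5e-8        # mutations per nucleotide site per generation
MUTATIONS_PER_DIPLOID = 175   # mutations per diploid genome per generation
MIN_DELETERIOUS_RATE_U = 3    # lower bound on U given in the abstract

# Derived (not quoted): number of sites a diploid genome must contain
# for the two rate figures to agree.
implied_diploid_sites = MUTATIONS_PER_DIPLOID / RATE_PER_SITE

# Derived (not quoted): minimum share of new mutations that are deleterious
# if U is at least 3 out of 175 new mutations per generation.
min_deleterious_fraction = MIN_DELETERIOUS_RATE_U / MUTATIONS_PER_DIPLOID

print(f"implied diploid sites: {implied_diploid_sites:.2e}")
print(f"minimum deleterious fraction: {min_deleterious_fraction:.1%}")
```

The implied site count is on the order of a diploid human genome, which suggests the per-genome figure is simply the per-site rate scaled by genome size.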

9.
The DNA sequences of two wild-type strains of polyomavirus (A2 and strain 3) are known. We have determined the majority of the DNA sequence of a third strain, the Crawford small-plaque virus. This virus has been noted for its capacity to induce readily detected tumor-specific transplantation antigen in hamster cells, a property that is most likely attributable to an altered middle T-antigen. A comparison of its DNA sequence with those of the A2 and strain 3 viruses reveals numerous nucleotide substitutions, insertions, and deletions throughout the genome. Most sequence changes in coding regions are silent mutations; however, variability in proteins can be predicted from these sequence data at 5 locations in middle T-antigen, 10 in large T-antigen, and 10 in VP1. The Crawford small-plaque virus noncoding regulatory region contains, in addition to nucleotide substitutions, a 44-base-pair tandem repeat of sequences on the late side of the origin of DNA replication.

10.
The nucleotide sequence of the Korean ginseng (Panax schinseng Nees) chloroplast genome has been completed (AY582139). The circular double-stranded DNA, which consists of 156,318 bp, contains a pair of inverted repeat regions (IRa and IRb) of 26,071 bp each, which are separated by large and small single copy regions of 86,106 bp and 18,070 bp, respectively. The inverted repeat region is further extended into a large single copy region which includes the 5' parts of the rps19 gene. Four short inversions associated with short palindromic sequences that form stem-loop structures were also observed in the chloroplast genome of P. schinseng compared to that of Nicotiana tabacum. The genome content and the relative positions of 114 genes (75 peptide-encoding genes, 30 tRNA genes, 4 rRNA genes, and 5 conserved open reading frames [ycfs]), however, are identical with the chloroplast DNA of N. tabacum. Sixteen genes contain one intron while two genes have two introns. Of these introns, only one (trnL-UAA) belongs to the self-splicing group I; all remaining introns have the characteristics of six domains belonging to group II. Eighteen simple sequence repeats have been identified from the chloroplast genome of Korean ginseng. Several of these SSR loci show infra-specific variations. A detailed comparison of 17 known completed chloroplast genomes from the vascular plants allowed the identification of evolutionary modes of coding segments and intron sequences, as well as the evaluation of the phylogenetic utilities of chloroplast genes. Furthermore, through the detailed comparisons of several chloroplast genomes, evolutionary hotspots predominated by the inversion end points, indel mutation events, and high frequencies of base substitutions were identified. Large-sized indels were often associated with direct repeats at the end of the sequences facilitating intra-molecular recombination.

11.
Friedman R  Drake JW  Hughes AL 《Genetics》2004,167(3):1507-1512
To test the hypothesis that the proteins of thermophilic prokaryotes are subject to unusually stringent functional constraints, we estimated the numbers of synonymous and nonsynonymous nucleotide substitutions per site between 17,957 pairs of orthologous genes from 22 pairs of closely related species of Archaea and Bacteria. The average ratio of nonsynonymous to synonymous substitutions was significantly lower in thermophiles than in nonthermophiles, and this effect was observed in both Archaea and Bacteria. There was no evidence that this difference could be explained by factors such as nucleotide content bias. Rather, the results support the hypothesis that proteins of thermophiles are subject to unusually strong purifying selection, leading to a reduced overall level of amino acid evolution per mutational event. The results show that genome-wide patterns of sequence evolution can be influenced by natural selection exerted by a species' environment and shed light on a previous observation that relatively few of the mutations arising in a thermophilic archaeon were nucleotide substitutions in contrast to indels.

12.
It has become clear that a large proportion of functional DNA in the human genome does not code for protein. Identification of this non-coding functional sequence using comparative approaches is proving difficult and has previously been thought to require deep sequencing of multiple vertebrates. Here we introduce a new model and comparative method that, instead of nucleotide substitutions, uses the evolutionary imprint of insertions and deletions (indels) to infer the past consequences of selection. The model predicts the distribution of indels under neutrality, and shows an excellent fit to human–mouse ancestral repeat data. Across the genome, many unusually long ungapped regions are detected that are unaccounted for by the neutral model, and which we predict to be highly enriched in functional DNA that has been subject to purifying selection with respect to indels. We use the model to determine the proportion under indel-purifying selection to be between 2.56% and 3.25% of human euchromatin. Since annotated protein-coding genes comprise only 1.2% of euchromatin, these results lend further weight to the proposition that more than half the functional complement of the human genome is non-protein-coding. The method is surprisingly powerful at identifying selected sequence using only two or three mammalian genomes. Applying the method to the human, mouse, and dog genomes, we identify 90 Mb of human sequence under indel-purifying selection, at a predicted 10% false-discovery rate and 75% sensitivity. As expected, most of the identified sequence represents unannotated material, while the recovered proportions of known protein-coding and microRNA genes closely match the predicted sensitivity of the method. The method's high sensitivity to functional sequence such as microRNAs suggests that as yet unannotated microRNA genes are enriched among the sequences identified.
Furthermore, its independence of substitutions allowed us to identify sequence that has been subject to heterogeneous selection, that is, sequence subject to both positive selection with respect to substitutions and purifying selection with respect to indels. The ability to identify elements under heterogeneous selection enables, for the first time, the genome-wide investigation of positive selection on functional elements other than protein-coding genes.
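
The "more than half" claim in this abstract follows from the percentages it quotes. The Python sketch below reproduces that reasoning under one simplifying assumption that is ours, not the authors': all annotated protein-coding sequence (1.2% of euchromatin) is counted as part of the sequence under indel-purifying selection. The function name `noncoding_share` is hypothetical.

```python
# Percentages of human euchromatin quoted in the abstract above.
CODING_PCT = 1.2              # annotated protein-coding genes
SELECTED_LOW_PCT = 2.56       # lower estimate under indel-purifying selection
SELECTED_HIGH_PCT = 3.25      # upper estimate under indel-purifying selection


def noncoding_share(selected_pct: float, coding_pct: float = CODING_PCT) -> float:
    """Fraction of selected sequence that is non-protein-coding.

    Assumes (for illustration only) that every coding base lies inside
    the selected set, so the remainder is non-coding.
    """
    return (selected_pct - coding_pct) / selected_pct


low_share = noncoding_share(SELECTED_LOW_PCT)
high_share = noncoding_share(SELECTED_HIGH_PCT)
print(f"non-coding share of selected sequence: {low_share:.1%} to {high_share:.1%}")
```

Under that assumption both ends of the range exceed one half, which is consistent with the proposition stated in the abstract.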

13.
The rate at which new mutations arise in the genome is a key factor in the evolution and adaptation of species. Here we describe the rate and spectrum of spontaneous mutations for the fission yeast Schizosaccharomyces pombe, a key model organism with many similarities to higher eukaryotes. We undertook an ∼1700-generation mutation accumulation (MA) experiment with a haploid S. pombe, generating 422 single-base substitutions and 119 insertion-deletion mutations (indels) across the 96 replicates. This equates to a base-substitution mutation rate of 2.00 × 10⁻¹⁰ mutations per site per generation, similar to that reported for the distantly related budding yeast Saccharomyces cerevisiae. However, these two yeast species differ dramatically in their spectrum of base substitutions, the types of indels (S. pombe is more prone to insertions), and the pattern of selection required to counteract a strong AT-biased mutation rate. Overall, our results indicate that GC-biased gene conversion does not play a major role in shaping the nucleotide composition of the S. pombe genome and suggest that the mechanisms of DNA maintenance may have diverged significantly between fission and budding yeasts. Unexpectedly, CpG sites appear to be excessively liable to mutation in both species despite the likely absence of DNA methylation.
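
A mutation accumulation rate is conventionally the mutation count divided by lines × generations × sites surveyed. The Python sketch below applies that relation to the counts quoted in this abstract. It assumes every one of the 96 replicates ran for exactly 1700 generations and that the same sites were callable for substitutions and indels; neither is stated in the abstract. The implied site count and the indel rate are therefore illustrative derivations, not reported results.

```python
# Counts and rate quoted in the abstract above.
SUBSTITUTIONS = 422
INDELS = 119
LINES = 96
GENERATIONS = 1700             # the abstract says "~1700"; treated as exact here
SUBSTITUTION_RATE = 2.00e-10   # per site per generation

line_generations = LINES * GENERATIONS

# Derived (assumption-laden): sites surveyed per line, obtained by
# rearranging rate = count / (lines * generations * sites).
implied_sites = SUBSTITUTIONS / (line_generations * SUBSTITUTION_RATE)

# Derived (not reported in the abstract): indel rate over the same sites.
implied_indel_rate = INDELS / (line_generations * implied_sites)

print(f"implied callable sites per line: {implied_sites:.3e}")
print(f"implied indel rate: {implied_indel_rate:.2e} per site per generation")
```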

14.
15.
Mutations affect individual health, population persistence, adaptation, diversification, and genome evolution. There is evidence that the mutation rate varies among genotypes, but the causes of this variation are poorly understood. Here, we link differences in genetic quality with variation in spontaneous mutation in a Drosophila mutation accumulation experiment. We find that chromosomes maintained in low-quality genetic backgrounds experience a higher rate of indel mutation and a lower rate of gene conversion in a manner consistent with condition-based differences in the mechanisms used to repair DNA double strand breaks. These aspects of the mutational spectrum were also associated with body mass, suggesting that the effect of genetic quality on DNA repair was mediated by overall condition, and providing a mechanistic explanation for the differences in mutational fitness decline among these genotypes. The rate and spectrum of substitutions were unaffected by genetic quality, but we find variation in the probability of substitutions and indels with respect to several aspects of local sequence context, particularly GC content, with implications for models of molecular evolution and genome scans for signs of selection. Our finding that the chances of mutation depend on genetic context and overall condition has important implications for how sequences evolve, the risk of extinction, and human health.

16.
Nucleotide sequences of the self-splicing group II intron of the chloroplast rps16 gene were determined for the first time in nine species of the genus Solanum. The observed variation in intron length (855–864 bp) was associated with indels of 1 to 9 bp. Altogether, five indels and 50 nucleotide substitutions were detected, which were used to identify six Solanum haplotypes. Although the intron sequence was in general fairly well conserved, the distribution of the described mutations among its structural elements, which correspond to the six pre-RNA domains, was qualitatively and quantitatively nonuniform. The highest polymorphism levels were observed in domains I, II, and IV. The sequence of domain V was invariant, which is in agreement with its functional significance.

17.
Rates of molecular evolution: the hominoid slowdown   Cited by: 2 (self-citations: 0; citations by others: 2)
It is proposed that early in phylogeny a large proportion of amino acid substitutions were selectively neutral, but that bursts of adaptive substitutions during major radiations of life so increased selective constraints that most mutations in modern proteins are detrimental. Recent findings on DNA nucleotide sequences indicate that decreasing mutation rates further slowed the rate of molecular evolution in the lineage to humans.

18.
It is understood that DNA and amino acid substitution rates are highly sequence context-dependent, e.g., C → T substitutions in vertebrates may occur much more frequently at CpG sites and that cysteine substitution rates may depend on support of the context for participation in a disulfide bond. Furthermore, many applications rely on quantitative models of nucleotide or amino acid substitution, including phylogenetic inference and identification of amino acid sequence positions involved in functional specificity. We describe quantification of the context dependence of nucleotide substitution rates using baboon, chimpanzee, and human genomic sequence data generated by the NISC Comparative Sequencing Program. Relative mutation rates are reported for the 96 classes of mutations of the form 5′-αβγ-3′ → 5′-αδγ-3′, where α, β, γ, and δ are nucleotides and β ≠ δ, based on maximum likelihood calculations. Our results confirm that C → T substitutions are enhanced at CpG sites compared with other transitions, relatively independent of the identity of the preceding nucleotide. While, as expected, transitions generally occur more frequently than transversions, we find that the most frequent transversions involve the C at CpG sites (CpG transversions) and that their rate is comparable to the rate of transitions at non-CpG sites. A four-class model of the rates of context-dependent evolution of primate DNA sequences, CpG transitions > non-CpG transitions ≈ CpG transversions > non-CpG transversions, captures qualitative features of the mutation spectrum. We find that despite qualitative similarity of mutation rates among different genomic regions, there are statistically significant differences.
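
The figure of 96 mutation classes can be reproduced by enumeration. The Python sketch below lists every change of the middle base in a three-nucleotide context and then merges each change with its reverse complement. The abstract does not say how the authors arrived at 96, so the strand-folding step is an assumption on our part, and the helper names (`revcomp`, `canonical`, `four_class_label`) are hypothetical. The sketch also sorts the classes into the four groups named in the abstract.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")
TRANSITIONS = ({"A", "G"}, {"C", "T"})


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(context: str, new_mid: str) -> tuple[str, str]:
    """Pick one representative for a change and its reverse complement.

    Assumption: changes on opposite strands are treated as the same class.
    """
    mirrored = (revcomp(context), new_mid.translate(COMPLEMENT))
    return min((context, new_mid), mirrored)


def four_class_label(context: str, new_mid: str) -> str:
    """Label a change as CpG/non-CpG and transition/transversion."""
    left, mid, right = context
    at_cpg = (mid == "C" and right == "G") or (mid == "G" and left == "C")
    kind = "transition" if {mid, new_mid} in TRANSITIONS else "transversion"
    return f"{'CpG' if at_cpg else 'non-CpG'} {kind}"


# All strand-specific changes: 4 * 4 * 4 contexts times 3 replacement bases.
strand_specific = [
    (a + b + g, d) for a, b, g, d in product(BASES, repeat=4) if b != d
]
classes = {canonical(ctx, d) for ctx, d in strand_specific}
class_counts = Counter(four_class_label(ctx, d) for ctx, d in classes)

print(len(strand_specific), len(classes))
for label, count in sorted(class_counts.items()):
    print(label, count)
```

Because the middle base can never equal its own complement, no change is its own mirror image, so the strand-specific changes pair off exactly and folding halves their number.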

19.
Y. Ogihara  T. Terachi    T. Sasakuma 《Genetics》1991,129(3):873-884
The nucleotide divergence of chloroplast DNAs around the hot spot region related to length mutation in Triticum (wheat) and Aegilops was analyzed. DNA sequences (ca. 4.5 kbp) of three chloroplast genome types of wheat complex were compared with one another and with the corresponding region of other grasses. The sequenced region contained rbcL and psaI, two open reading frames, and a pseudogene, rpl23' (pseudogene for ribosomal protein L23) disrupted by AT-rich intergenic spacer regions. The evolution of these genes in the closely related wheat complex is characterized by nonbiased nucleotide substitutions in terms of being synonymous/nonsynonymous, having A-T pressure transitions over transversions, and frequent changes at the third codon position, in contrast with the gene evolution among more distant plant groups where biased nucleotide substitutions have frequently occurred. The sequences of these genes had diverged almost in proportion to taxonomic distance. The sequence of the pseudogene rpl23' changed approximately two times faster than that of the coding region. Sequence comparison between the pseudogene and its protein-coding counterpart revealed different degrees of nucleotide homology in wheat, rice and maize, suggesting that the transposition timing of the pseudogene differed and/or that different rates of gene conversion operated on the pseudogene in the cpDNA of the three plant groups in Gramineae. The intergenic spacer regions diverged approximately ten times faster than the genes. The divergence of wheat from barley, and that from rice are estimated based on the nucleotide similarity to be 1.5, 10 and 40 million years, respectively.

20.
Microsatellites are a major component of the human genome, and their evolution has been much studied. However, the evolution of microsatellite flanking sequences has received less attention, with reports of both high and low mutation rates and of a tendency for microsatellites to cluster. From the human genome we generated a database of many thousands of (AC)n flanking sequences within which we searched for common characteristics. Sequences flanking microsatellites of similar length show remarkable levels of convergent evolution, indicating shared mutational biases. These biases extend 25–50 bases either side of the microsatellite and may therefore affect more than 30% of the entire genome. To explore the extent and absolute strength of these effects, we quantified the observed convergence. We also compared homologous human and chimpanzee loci to look for evidence of changes in mutation rate around microsatellites. Most models of DNA sequence evolution assume that mutations are independent and occur randomly. Allowances may be made for sites mutating at different rates and for general mutation biases such as the faster rate of transitions over transversions. Our analysis suggests that these models may be inadequate, in that proximity to even very short microsatellites may alter the rate and distribution of mutations that occur. The elevated local mutation rate combined with sequence convergence, both of which we find evidence for, also provide a possible resolution for the apparently contradictory inferences of mutation rates in microsatellite flanking sequences.
