Similar literature
Found 20 similar documents (search time: 31 ms)
1.
Singh ND, Arndt PF, Petrov DA. Genetics, 2005, 169(2): 709-722
Mutation is the underlying force that provides the variation upon which evolutionary forces can act. It is important to understand how mutation rates vary within genomes and how the probabilities of fixation of new mutations vary as well. If substitutional processes across the genome are heterogeneous, then examining patterns of coding sequence evolution without taking these underlying variations into account may be misleading. Here we present the first rigorous test of substitution rate heterogeneity in the Drosophila melanogaster genome using almost 1500 nonfunctional fragments of the transposable element DNAREP1_DM. Not only do our analyses suggest that substitutional patterns in heterochromatic and euchromatic sequences are different, but also they provide support in favor of a recombination-associated substitutional bias toward G and C in this species. The magnitude of this bias is entirely sufficient to explain recombination-associated patterns of codon usage on the autosomes of the D. melanogaster genome. We also document a bias toward lower GC content in the pattern of small insertions and deletions (indels). In addition, the GC content of noncoding DNA in Drosophila is higher than would be predicted on the basis of the pattern of nucleotide substitutions and small indels. However, we argue that the fast turnover of noncoding sequences in Drosophila makes it difficult to assess the importance of the GC biases in nucleotide substitutions and small indels in shaping the base composition of noncoding sequences.

2.
Microstructural changes such as insertions and deletions (indels) are a major driving force in the evolution of non-coding DNA sequences. To better understand the mechanisms by which indel mutations arise, as well as the molecular evolution of non-coding regions, the number and pattern of indels and nucleotide substitutions were compared in whole chloroplast genomes. Comparisons were made for a total of over 38 kb of non-coding DNA sequence from 126 intergenic regions in two data sets representing species with different divergence times: sugarcane and maize, and Oryza sativa var. indica and japonica. The main findings of this study are: (i) Approximately half of all indels are single-nucleotide indels. This observation agrees with previous studies in various organisms. (ii) The distribution and number of indels differed between the two data sets, and different patterns were observed for tandem repeat and non-repeat indels. (iii) The distribution pattern of tandem repeat indels showed a statistically significant bias towards A/T-rich sequences. (iv) The rate of indel mutation was estimated to be approximately 0.8 ± 0.04 × 10⁻⁹ per site per year, which was similar to previous estimates in other organisms. (v) The frequencies of nucleotide substitutions and indels were significantly lower in the inverted repeat (IR).

3.
The principal sources of genetic variation that can be assayed with restriction enzymes are base substitutions and insertions/deletions (indels). The likelihood of detecting indels as restriction fragment length polymorphisms (RFLPs) is determined by the size and frequency of the indels, and the ability to resolve small indels as RFLPs is limited by the distribution of restriction fragment sizes. In this study, we use aligned sequences from the indica and japonica subspecies of rice ( Oryza sativa L.) to quantify and compare the ability of restriction enzymes to detect indels. We look specifically at two abundant transposable element-derived indel sources: miniature inverted repeat transposable elements (MITEs) and long terminal repeat (LTR) retroelements. From this analysis we conclude that indels rather than base substitutions are the prevailing source of the polymorphism detected in rice. We show that, although MITE derived indels are more abundant than LTR-retroelement derived indels, LTR-retroelements have a greater capacity to generate visible restriction fragment length polymorphism because of their larger size. We find that the variation in the detectability of indels among restriction enzymes can be explained by differences in the frequency and dispersion of their restriction sites in the genome. The parameters that describe the fragment size distributions obtained with the restriction enzymes are highly correlated across the sequenced genomes of rice, Arabidopsis and human, with the exception of some extreme deviations in frequency for particular recognition sequences corresponding to variations in the levels and modes of DNA methylation in the three disparate organisms. 
Thus, we can predict the relative ability of a restriction enzyme to detect indels derived from a specific source based on the distribution of restriction fragment sizes, even when this is estimated for a distantly related genome. Electronic Supplementary Material is available in the online version of this article. Communicated by M.-A. Grandbastien

4.
Brandström M, Ellegren H. Genetics, 2007, 176(3): 1691-1701
It is increasingly recognized that insertions and deletions (indels) are an important source of genetic as well as phenotypic divergence and diversity. We analyzed length polymorphisms identified through partial (0.25x) shotgun sequencing of three breeds of domestic chicken made by the International Chicken Polymorphism Map Consortium. A data set of 140,484 short indel polymorphisms in unique DNA was identified after filtering for microsatellite structures. There was a significant excess of tandem duplicates at indel sites, with deletions of a duplicate motif outnumbering the generation of duplicates through insertion. Indel density was lower in microchromosomes than in macrochromosomes, in the Z chromosome than in autosomes, and in 100 bp of upstream sequence, 5'-UTR, and first introns than in intergenic DNA and in other introns. Indel density was highly correlated with single nucleotide polymorphism (SNP) density. The mean density of indels in pairwise sequence comparisons was 1.9 × 10⁻⁴ indel events/bp, approximately 5% of the density of SNPs segregating in the chicken genome. The great majority of indels involved a limited number of nucleotides (median 1 bp), with A-rich motifs being overrepresented at indel sites. The overrepresentation of deletions at tandem duplicates indicates that replication slippage in duplicate sequences is a common mechanism behind indel mutation. The correlation between indel and SNP density indicates common effects of mutation and/or selection on the occurrence of indels and point mutations.

5.
6.
Insertions and deletions (indels) in chloroplast noncoding regions are common genetic markers to estimate population structure and gene flow, although relatively little is known about indel evolution among recently diverged lineages such as within plant families. Because indel events tend to occur nonrandomly along DNA sequences, recurrent mutations may generate homoplasy for indel haplotypes. This is a potential problem for population studies, because indel haplotypes may be shared among populations after recurrent mutation as well as gene flow. Furthermore, indel haplotypes may differ in fitness and therefore be subject to natural selection detectable as rate heterogeneity among lineages. Such selection could contribute to the spatial patterning of cpDNA haplotypes, greatly complicating the interpretation of cpDNA population structure. This study examined both nucleotide and indel cpDNA variation and divergence at six noncoding regions (psbB-psbH, atpB-rbcL, trnL-trnH, rpl20-5'rps12, trnS-trnG, and trnH-psbA) in 16 individuals from eight species in the Lecythidaceae and a Sapotaceae outgroup. We described patterns of cpDNA changes, assessed the level of indel homoplasy, and tested for rate heterogeneity among lineages and regions. Although regression analysis of branch lengths suggested some degree of indel homoplasy among the most divergent lineages, there was little evidence for indel homoplasy within the Lecythidaceae. Likelihood ratio tests applied to the entire phylogenetic tree revealed a consistent pattern rejecting a molecular clock. Tajima's 1D and 2D tests revealed two taxa with consistent rate heterogeneity, one showing relatively more and one relatively fewer changes than other taxa. In general, nucleotide changes showed more evidence of rate heterogeneity than did indel changes. 
The rate of evolution was highly variable among the six cpDNA regions examined, with the trnS-trnG and trnH-psbA regions showing as much as 10% and 15% divergence within the Lecythidaceae. Deviations from rate homogeneity in the two taxa were constant across cpDNA regions, consistent with lineage-specific rates of evolution rather than cpDNA region-specific natural selection. There is no evidence that indels are more likely than nucleotide changes to experience homoplasy within the Lecythidaceae. These results support a neutral interpretation of cpDNA indel and nucleotide variation in population studies within species such as Corythophora alta.

7.
Mutations affect individual health, population persistence, adaptation, diversification, and genome evolution. There is evidence that the mutation rate varies among genotypes, but the causes of this variation are poorly understood. Here, we link differences in genetic quality with variation in spontaneous mutation in a Drosophila mutation accumulation experiment. We find that chromosomes maintained in low-quality genetic backgrounds experience a higher rate of indel mutation and a lower rate of gene conversion in a manner consistent with condition-based differences in the mechanisms used to repair DNA double strand breaks. These aspects of the mutational spectrum were also associated with body mass, suggesting that the effect of genetic quality on DNA repair was mediated by overall condition, and providing a mechanistic explanation for the differences in mutational fitness decline among these genotypes. The rate and spectrum of substitutions were unaffected by genetic quality, but we find variation in the probability of substitutions and indels with respect to several aspects of local sequence context, particularly GC content, with implications for models of molecular evolution and genome scans for signs of selection. Our finding that the chances of mutation depend on genetic context and overall condition has important implications for how sequences evolve, the risk of extinction, and human health.

8.
9.
We investigated whether relative rates of divergence were correlated between the mitochondrial and chloroplast genomes as expected under lineage effects or were genome specific as expected with locus-specific effects. Five mitochondrial noncoding regions (nad1B_C, nad4exon1_2, nad7exon2_3, nad7exon3_4, and rps14-cob) for 21 samples from Lecythidaceae were sequenced. Three chloroplast regions (rpl20-5'rps12, trnS-trnG, and psbA-trnH) were sequenced to expand the taxa in an existing data set. Absolute rates of nucleotide and insertion and deletion (indel) changes were 13 times faster in the chloroplast genome than in the mitochondrial genome. Similar indel length frequency distributions for both organelles suggested that common mechanisms were responsible for generating indels. Molecular clock tests applied to phylogenetic trees estimated from mitochondrial and chloroplast sequences revealed global rate heterogeneity of nucleotide substitution. Maximum likelihood and Tajima's 1D relative rate tests show that Lecythis zabucajo exhibited a rate acceleration for both the mitochondrial and chloroplast sequences. Whereas Eschweilera romeu-cardosoi showed a significant rate slowdown for chloroplast sequences, the mitochondrial sequences for 3 Eschweilera taxa showed evidence for a rate slowdown only when compared with L. zabucajo. Significant rate heterogeneity was also observed for indel changes in the mitochondrial genome but not for the chloroplast. The lack of mitochondrial nucleotide changes for some taxa as well as chloroplast indel homoplasy may have limited the power of relative rate tests to detect rate variation. Relative ratio tests consistently indicated rate proportionality among branch lengths between the mitochondrial and chloroplast phylogenetic trees. The relative ratio tests showed that taxa possessing rate heterogeneity had parallel relative divergence rates in both mitochondrial and chloroplast sequences as expected under lineage effects. 
A neutral replication-dependent model of rate heterogeneity for both nucleotide and indel changes provides a simple explanation for common patterns of rate heterogeneity across the 2 organelle genomes in Lecythidaceae. The lineage effects observed here were uncoupled from annual/perennial habit because all the species from this study are perennial.

10.
Indels in DNA sequences frequently affect more than a single nucleotide, creating problems for alignment, character coding and phylogenetic analysis. However, the size and frequency of multiple‐residue indels are not usually tested, and with popular alignment packages their reconstruction is indirectly achieved by reducing the affine (gap extension) cost. We explored the length distribution of indels in intron sequences of the gene Mp20 by modifying the gap opening and gap extension costs. Given a “known” tree for the study group, global homology levels were greatest under low gap cost, with gap extension costs of roughly 0.4‐fold the opening cost. Different approaches to gap coding and weighting suggested that taxonomic congruence was correlated with high frequencies of multiple‐position indels, with a maximum indel length of 2–5 bp and few indels above 15 bp, but also including a proportion of indels > 100 bp. Only a small minority of indels could be reconstructed as single‐position indels. Consequently, tree topologies improved when homologous multinucleotide indels were recoded as binary characters which are otherwise highly homoplastic and weighted characters in single‐position coding. In tree‐generating alignment procedures as implemented in POY, where gap penalty determines the character weight during tree search, the problem of assigning inappropriately high weight to multiple‐residue indels could partly be overcome by setting the extension costs to about 0.4‐fold lower than gap opening costs. We conclude that multiple consecutive gap positions are not independent characters and hence methods for parsimony reconstruction of long indels are required. Finally, we also observed a general lack of correlation between taxonomic and character congruence, demonstrating the difficulties of applying congruence criteria to decide among competing alignments. 
This highlights the value of recent model‐based alignment procedures which can implement the statistical distributions of indel size classes, and do not rely on potentially circular strategies for optimizing overall congruence. © The Willi Hennig Society 2006.
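
The effect of the gap extension cost described in this abstract can be illustrated with a minimal sketch. Only the 0.4 ratio comes from the abstract; the function names, the opening cost of 10, and the convention that the first gap position pays the opening cost while each further position pays the extension cost are assumptions made for illustration, and are not taken from the paper or from POY.

```python
def affine_gap_cost(length, gap_open=10.0, ext_ratio=0.4):
    """Cost of a single gap spanning `length` positions under an affine scheme.

    Assumed convention: the first position costs `gap_open`; every
    additional position costs `gap_open * ext_ratio`.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0
    return gap_open + (length - 1) * gap_open * ext_ratio


def linear_gap_cost(length, gap_open=10.0):
    """Cost when every gap position is charged the full opening cost,
    i.e. each position is treated as an independent character."""
    if length < 0:
        raise ValueError("gap length must be non-negative")
    return length * gap_open


if __name__ == "__main__":
    # Compare the two schemes for a few indel lengths.
    for n in (1, 2, 5, 15):
        print(n, affine_gap_cost(n), linear_gap_cost(n))
```

Under these assumptions a 5 bp indel is charged far less by the affine scheme than by the per-position scheme, which is one way to see why treating consecutive gap positions as independent characters gives long indels inappropriately high weight.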

11.
Genes encoding reproductive proteins often diverge rapidly due to positive selection on nucleotide substitutions. While this general pattern is well established, the extent to which specific reproductive genes experience similar selection in different clades has been little explored, nor have possible targets of positive selection other than nucleotide substitutions, such as indels, received much attention. Here, we inspect for the signature of positive selection in the genes encoding five accessory gland proteins (Acps) (Acp26Aa, Acp32CD, Acp53Ea, Acp62F, and Acp70A) originally described from Drosophila melanogaster but with recognizable orthologues in the D. pseudoobscura subgroup. We compare patterns of selection within the D. pseudoobscura subgroup to those in the D. melanogaster subgroup. Similar patterns of positive selection were found in Acp26Aa and Acp62F in the two subgroups, while Acp53Ea and Acp70A experienced purifying selection in both subgroups. These proteins have thus remained targets for similar types of selection over long (>21-MY) periods of time. We also found several indel substitutions and polymorphisms in Acp26Aa and Acp32CD. These indels occur in the same regions as positively selected nucleotide substitutions for Acp26Aa in the D. pseudoobscura subgroup but not in the D. melanogaster subgroup. Rates of indel substitution within Acp26Aa in the D. pseudoobscura subgroup were up to several times those in noncoding regions of the Drosophila genome. This suggests that indel substitutions may be under positive selection and may play a key role in the divergence of some Acps. Electronic Supplementary Material is available for this article and accessible for authorised users. [Reviewing Editor: Dr. Willis Swanson]

12.
The plant mitochondrial rps3 intron was analyzed for substitution and indel rate variation among 15 monocot and dicot angiosperms from 10 genera, including perennial and annual taxa. Overall, the intron sequence was very conserved among angiosperms. Based on length polymorphism, 10 different alleles were identified among the 10 genera. These allelic differences were mainly attributable to large indels. An insertion of 133 nucleotides, observed in the Alnus intron was partially or completely absent in the other lineages of the family Betulaceae. This insertion was located within domain IV of the secondary-structure model of this group IIA intron. A mobile element of 47 nucleotides that showed homology to sequences located in rice rps3 intron and in intergenic plant mitochondrial genomes was found within this insertion. Both substitution and indel rates were low among the Betulaceae sequences, but substitution rates were increasingly larger than indel rates in comparisons involving more distantly related taxa. From a secondary-structure model, regions involved in helical structures were shown to be well preserved from indels as compared to substitutions, but compensatory changes were not observed among the angiosperm sequences analyzed. Using approximate divergence times based on the fossil record, substitution and indel rate heterogeneity was observed between different pairs of annual and perennial taxa. In particular, the annual petunia and primrose evolved more than 15 and 10 times faster, for substitution and indel rates respectively, than the perennial birch and alder. This is the first demonstration of an evolutionary rate difference between perennial and annual forms in noncoding DNA, lending support to neutral causes such as the generation time, population size, and speciation rate effects to explain such rate heterogeneity. 
Surprisingly, the sequence from the rps3 intron had a high identity with the sequence of intron 1 from the angiosperm mitochondrial nad5 gene, suggesting a common origin of these two group IIA introns.

13.
Opinions split when it comes to the significance and thus the weighting of indel characters as phylogenetic markers. This paper attempts to test the phylogenetic information content of indels and nucleotide substitutions by proposing an a priori weighting system of non-protein-coding genes. Theoretically, the system rests on a weighting scheme which is based on a falsificationist approach to cladistic inference. It assigns weights to insertions, deletions and nucleotide substitutions according to their specific number of identical classes of potential falsifiers, resulting in the following system: nucleotide substitutions weight = 3, deletions of n nucleotides weight = (2n–1), and insertions of n nucleotides weight = (5n–1). This weighting system and the utility of indels as phylogenetic markers are tested against a suitable data set of 18S rDNA sequences of Diptera and Strepsiptera taxa together with other Metazoa species. The indels support the same clades as the nucleotide substitution data, and the application of the weighting system increases the corresponding consistency indices of the differentially weighted character types. As a consequence, applying the weighting system seems to be reasonable, and indels appear to be good phylogenetic markers.

14.
Yang H, Wu Y, Feng J, Yang S, Tian D. Genomics, 2009, 93(1): 90-97
Mutations, which can alter amino acid constitution, contribute greatly to protein evolution. However, little is reported of their pattern during protein structural evolution. We investigated the distribution of non-synonymous single nucleotide polymorphisms (nsSNPs) and insertions/deletions (indels) along mammal and fruit fly proteins. We found that nsSNPs (and dN) and indels increased in protein boundary regions, and this pattern is inversely correlated with the distribution of protein domain density. Additionally, synonymous substitutions (and dS) are reduced in 5' and 3' regions, indicating more variable protein boundaries, compared with the central interior. All evidence suggests that the inner part of coding sequences (CDSs) is comparatively conserved, whereas the 5' and 3' regions, with higher evolution rates, are more variable. We assumed that due to greater frequencies of nsSNPs and indels in adaptive regions of CDSs it could be easier to ultimately alter, gain, or lose amino acids, thus becoming the front line of protein evolution.

15.
An insertion/deletion polymorphism (Ind2) in the Brassica nigra CONSTANS LIKE 1 (Bni COL1) gene was previously found to be associated with variation in flowering time. In the present study we examine the inter-specific divergence of COL1 in the family Brassicaceae. Analysis of codon substitution models did not reveal evidence of positive Darwinian selection, but comparisons of the COL1 gene in different species revealed a surprising number of indels. A total of 24 indels were found in the 650 bp of the middle variable region of the gene. This high number of indels could reflect a lack of constraint on length of this region of the protein, or the effect of positive selection. The number of indels was close to that expected in non-coding DNA, but the indels were longer in COL1 than those observed in non-coding regions. Reconstruction of indel evolution indicated that most indels resulted from deletions rather than insertions. The Ind2 indel that has shown association with flowering time in Brassica nigra exhibited a remarkable distribution in the Brassicaceae family, indicating that the polymorphism may have persisted more than ten million years. Considering presumed historic population sizes of Brassicaceae species, such a long persistence time seems unlikely for a neutral polymorphism.

16.
Recombination between homologous loci is accompanied by formation of heteroduplexes. Repairing mismatches in heteroduplexes often leads to single nucleotide substitutions in a process known as gene conversion. Gene conversion was shown to be GC‐biased in different organisms; that is, a W (A or T) → S (G or C) substitution is more likely in this process than an S → W substitution. Here, we show that the insertion/deletion ratio for short noncoding indels that reach fixation between species is positively correlated with the recombination rate in Drosophila melanogaster, Homo sapiens, and Saccharomyces cerevisiae. This correlation is due both to an increase in the fixation rate of insertions and to a decrease in the fixation rate of deletions in regions of high recombination. Whole‐genome data on indel polymorphism and divergence in D. melanogaster rule out mutation biases and selection as the cause of this trend, pointing to insertion‐biased gene conversion as the most likely explanation. The bias toward insertions is the strongest for single‐nucleotide indels, and decreases with indel length. In regions of high recombination rate this bias leads to an up to ~5‐fold excess of fixed short insertions over deletions, and substantially affects the evolution of DNA segments.

17.
The genome-sequencing gold rush has facilitated the use of comparative genomics to uncover patterns of genome evolution, although their causal mechanisms remain elusive. One such trend, ubiquitous to prokarya and eukarya, is the association of insertion/deletion mutations (indels) with increases in the nucleotide substitution rate extending over hundreds of base pairs. The prevailing hypothesis is that indels are themselves mutagenic agents. Here, we employ population genomics data from Escherichia coli, Saccharomyces paradoxus, and Drosophila to provide evidence suggesting that it is not the indels per se but the sequence in which indels occur that causes the accumulation of nucleotide substitutions. We found that about two-thirds of indels are closely associated with repeat sequences and that repeat sequence abundance could be used to identify regions of elevated sequence diversity, independently of indels. Moreover, the mutational signature of indel-proximal nucleotide substitutions matches that of error-prone DNA polymerases. We propose that repeat sequences promote an increased probability of replication fork arrest, causing the persistent recruitment of error-prone DNA polymerases to specific sequence regions over evolutionary time scales. Experimental measures of the mutation rates of engineered DNA sequences and analyses of experimentally obtained collections of spontaneous mutations provide molecular evidence supporting our hypothesis. This study uncovers a new role for repeat sequences in genome evolution and provides an explanation of how fine-scale sequence contextual effects influence mutation rates and thereby evolution.

18.
Indels are increasingly used in phylogenetics and play a major role in genome size evolution, and yet both the phylogenetic information content of indels and their evolutionary significance remain to be better assessed. Using three presumably independently evolving nuclear gene fragments (28S rDNA, β-fibrinogen, ornithine decarboxylase) from 29 families of neognathous birds, we have obtained a topology that is in general agreement with the current molecular consensus tree, supports the monophyly of Metaves, and provides evidence for the unresolved relationships within the Charadriiformes. Based on the retrieved topology, we assess the relative impact of indels and nucleotide substitutions and demonstrate that the superposition of the two kinds of data yields a topology that could not be obtained from either data set alone. Although only two out of three gene fragments reveal the deletion bias, the combined nucleotide insertion-to-deletion ratio is 0.22, indicating a rapid decrease of intron length. The average indel fixation rate in the neognaths is 2.5 times faster than that in therian (placental) mammals of similar geologic age. As in mammals, there is a considerable variation of indel fixation rate that is 1.5 times higher in Galloanseres compared to Neoaves, and 2.4 times higher in the Rallidae compared to the average for Neoaves (8.2 times higher compared to the related Gruidae). Our results add to the evidence that indel fixation rates correlate with lineage-specific evolutionary rates.

19.
We characterized rates and patterns of synonymous and nonsynonymous substitution in 242 duplicated gene pairs on chromosomes 2 and 4 of Arabidopsis thaliana. Based on their collinear order along the two chromosomes, the gene pairs were likely duplicated contemporaneously, and therefore comparison of genetic distances among gene pairs provides insights into the distribution of nucleotide substitution rates among plant nuclear genes. Rates of synonymous substitution varied 13.8-fold among the duplicated gene pairs, but 90% of gene pairs differed by less than 2.6-fold. Average nonsynonymous rates were approximately fivefold lower than average synonymous rates; this rate difference is lower than that of previously studied nonplant lineages. The coefficient of variation of rates among genes was 0.65 for nonsynonymous rates and 0.44 for synonymous rates, indicating that synonymous and nonsynonymous rates vary among genes to roughly the same extent. The causes underlying rate variation were explored. Our analyses tentatively suggest an effect of physical location on synonymous substitution rates but no similar effect on nonsynonymous rates. Nonsynonymous substitution rates were negatively correlated with GC content at synonymous third codon positions, and synonymous substitution rates were negatively correlated with codon bias, as observed in other systems. Finally, the 242 gene pairs permitted investigation of the processes underlying divergence between paralogs. We found no evidence of positive selection, little evidence that paralogs evolve at different rates, and no evidence of differential codon usage or third position GC content.
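
The two summary statistics quoted in this abstract, the fold range of rates and the coefficient of variation, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the abstract does not say whether the authors used the sample or the population standard deviation (the sample form is assumed here), and the rate values in the usage lines are invented placeholders rather than data from the study.

```python
import statistics


def fold_range(rates):
    """Ratio of the largest to the smallest rate (e.g. '13.8-fold')."""
    lowest = min(rates)
    if lowest <= 0:
        raise ValueError("rates must be strictly positive")
    return max(rates) / lowest


def coefficient_of_variation(rates):
    """Sample standard deviation divided by the mean (assumed definition)."""
    mean = statistics.mean(rates)
    if mean == 0:
        raise ValueError("mean rate is zero; CV is undefined")
    return statistics.stdev(rates) / mean


if __name__ == "__main__":
    # Placeholder per-gene-pair rates, not values from the paper.
    example_rates = [0.6, 0.8, 1.0, 1.3, 2.1]
    print(fold_range(example_rates))
    print(coefficient_of_variation(example_rates))
```

Because the coefficient of variation is scaled by the mean, it allows the spread of nonsynonymous and synonymous rates to be compared even though their averages differ roughly fivefold.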

20.

Background

Insertions and deletions (indels) are the most abundant form of structural variation in all genomes. Indels have been increasingly recognized as an important source of molecular markers due to high-density occurrence, cost-effectiveness, and ease of genotyping. Coupled with developments in bioinformatics, next-generation sequencing (NGS) platforms enable the discovery of millions of indel polymorphisms by comparing the whole genome sequences of individuals within a species.

Results

A total of 1,973,746 unique indels were identified in 345 maize genomes, with an overall density of 958.79 indels/Mbp, and an average allele number of 2.76, ranging from 2 to 107. There were 264,214 indels with polymorphism information content (PIC) values greater than or equal to 0.5, accounting for 13.39 % of overall indels. Of these highly polymorphic indels, we designed primer pairs for 83,481 and 29,403 indels with major allele differences (i.e. the size difference between the most and second most frequent alleles) greater than or equal to 3 and 8 bp, respectively, based on the differing resolution capabilities of gel electrophoresis. The accuracy of our indel markers was experimentally validated, and among 100 indel markers, average accuracy was approximately 90 %. In addition, we also validated the polymorphism of the indel markers. Of 100 highly polymorphic indel markers, all had polymorphisms with average PIC values of 0.54.
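
For readers unfamiliar with polymorphism information content, a minimal sketch of the widely used formula of Botstein et al. (1980) is given below: PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², where pᵢ are allele frequencies. The abstract does not state which PIC definition was used, so applying this formula to the study is an assumption, and the function name `pic` is hypothetical. The last lines only re-derive the share of highly polymorphic indels from the two counts quoted above.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from allele frequencies.

    Uses the Botstein et al. (1980) formula; `freqs` must sum to 1.
    """
    if not freqs or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-empty and sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term for matings that are uninformative for linkage.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    print(pic([0.5, 0.5]))           # two equally frequent alleles
    print(pic([1 / 3, 1 / 3, 1 / 3]))  # three equally frequent alleles

    # Share of indels with PIC >= 0.5, from the counts in the abstract.
    share = 100 * 264_214 / 1_973_746
    print(round(share, 2))
```

Under this formula a marker with only two alleles cannot exceed a PIC of 0.375, so the PIC ≥ 0.5 threshold would select indels with at least three alleles.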

Conclusions

The maize genome is rich in indel polymorphisms. Intriguingly, the level of polymorphism in genic regions of the maize genome was higher than that in intergenic regions. The polymorphic indel markers developed from this study may enhance the efficiency of genetic research and marker-assisted breeding in maize.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1797-5) contains supplementary material, which is available to authorized users.
