Similar articles
 20 similar articles found (search time: 31 ms)
1.
Duret L, Arndt PF. PLoS Genetics, 2008, 4(5): e1000071
Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue for detecting selection within sequences. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We analyzed the pattern of neutral substitutions in 1 Gb of primate non-coding regions. We show that the GC-content toward which sequences are evolving is strongly negatively correlated to the distance to telomeres and positively correlated to the rate of crossovers (R² = 47%). This demonstrates that recombination has a major impact on substitution patterns in human, driving the evolution of GC-content. The evolution of GC-content correlates much more strongly with male than with female crossover rate, which rules out selectionist models for the evolution of isochores. This effect of recombination is most probably a consequence of the neutral process of biased gene conversion (BGC) occurring within recombination hotspots. We show that the predictions of this model fit very well with the observed substitution patterns in the human genome. This model notably explains the positive correlation between substitution rate and recombination rate. Theoretical calculations indicate that variations in population size or density in recombination hotspots can have a very strong impact on the evolution of base composition. Furthermore, recombination hotspots can create strong substitution hotspots. This molecular drive affects both coding and non-coding regions. We therefore conclude that along with mutation, selection and drift, BGC is one of the major factors driving genome evolution. Our results also shed light on variations in the rate of crossover relative to non-crossover events, along chromosomes and according to sex, and also on the conservation of hotspot density between human and chimp.

2.
Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue in the identification of functional sequence features. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We have analyzed the pattern of neutral substitutions in 14.3 Mb of primate noncoding regions. We show that the GC-content toward which sequences are evolving is strongly correlated (r² = 0.61, P […]

3.
Substitutions occurring in noncoding sequences of the plant chloroplast genome violate the independence of sites that is assumed by substitution models in molecular evolution. The probability that a substitution at a site is a transversion, as opposed to a transition, increases significantly with increasing A + T content of the two adjacent nucleotides. In the present study, this dependency of substitutions on local context is examined further in a number of noncoding regions from the chloroplast genome of members of the grass family (Poaceae). Two features were examined: the influence of specific neighboring bases, as opposed to the general A + T content, on transversion proportion, and an influence on substitutions by nucleotides other than the two immediately adjacent to the site of substitution. In both cases, a significant effect was found. In the case of specific nucleotides, transversion proportion is significantly higher at sites with a pyrimidine immediately 5′ on either strand. Substitutions at sites of the type YNR, where N is the site of substitution, have the highest rate of transversion. This specific effect is secondary to the A + T content effect such that, in terms of proportion of substitutions that are transversions, the nucleotides are ranked T > A > C > G as to their effect when they are immediately 5′ to the site of substitution. In the case of nucleotides other than the immediate neighbors, a significant influence on substitution dynamics is observed in the case where the two neighboring bases are both A and/or T. Thus, substitutions are primarily, but not exclusively, influenced by the composition of the two nucleotides that are immediately adjacent. These results indicate that the pattern of molecular evolution of the plant chloroplast genome is extremely complex as a result of a variety of inter-site dependencies. Received: 18 October 1996 / Accepted: 12 April 1997

4.
The hepatitis B virus (HBV) has a circular DNA genome of about 3,200 base pairs. Economical use of the genome with overlapping reading frames may have led to severe constraints on nucleotide substitutions along the genome and to highly variable rates of substitution among nucleotide sites. Nucleotide sequences from 13 complete HBV genomes were compared to examine such variability of substitution rates among sites and to examine the phylogenetic relationships among the HBV variants. The maximum likelihood method was employed to fit models of DNA sequence evolution that can account for the complexity of the pattern of nucleotide substitution. Comparison of the models suggests that the rates of substitution are different in different genes and codon positions; for example, the third codon position changes at a rate over ten times higher than the second position. Furthermore, substantial variation of substitution rates was detected even after the effects of genes and codon positions were corrected; that is, rates are different at different sites of the same gene or at the same codon position. Such rates after the correction were also found to be positively correlated at adjacent sites, which indicated the existence of conserved and variable domains in the proteins encoded by the viral genome. A multiparameter model validates the earlier finding that the variation in nucleotide conservation is not random around the HBV genome. The test for the existence of a molecular clock suggests that substitution rates are more or less constant among lineages. The phylogenetic relationships among the viral variants were examined. Although the data do not seem to contain sufficient information to resolve the details of the phylogeny, it appears quite certain that the serotypes of the viral variants do not reflect their genetic relatedness. Correspondence to: Z. Yang

5.
Understanding the proximate and ultimate causes underlying the evolution of nucleotide composition in mammalian genomes is of fundamental interest to the study of molecular evolution. Comparative genomics studies have revealed that many more substitutions occur from G and C nucleotides to A and T nucleotides than the reverse, suggesting that mammalian genomes are not at equilibrium for base composition. Analysis of human polymorphism data suggests that mutations that increase GC-content tend to be at much higher frequencies than those that decrease or preserve GC-content when the ancestral allele is inferred via parsimony using the chimpanzee genome. These observations have been interpreted as evidence for a fixation bias in favor of G and C alleles due to either positive natural selection or biased gene conversion. Here, we test the robustness of this interpretation to violations of the parsimony assumption using a data set of 21,488 noncoding single nucleotide polymorphisms (SNPs) discovered by the National Institute of Environmental Health Sciences (NIEHS) SNPs project via direct resequencing of n = 95 individuals. Applying standard nonparametric and parametric population genetic approaches, we replicate the signatures of a fixation bias in favor of G and C alleles when the ancestral base is assumed to be the base found in the chimpanzee outgroup. However, upon taking into account the probability of misidentifying the ancestral state of each SNP using a context-dependent mutation model, the corrected distribution of SNP frequencies for GC-content increasing SNPs is nearly indistinguishable from the patterns observed for other types of mutations, suggesting that the signature of fixation bias is a spurious artifact of the parsimony assumption.

6.
Schmegner C, Hoegel J, Vogel W, Assum G. Genetics, 2007, 175(1): 421-428
The human genome is composed of long stretches of DNA with distinct GC contents, called isochores or GC-content domains. A boundary between two GC-content domains in the human NF1 gene region is also a boundary between domains of early- and late-replicating sequences and of regions with high and low recombination frequencies. The perfect conservation of the GC-content distribution in this region between human and mouse demonstrates that GC-content stabilizing forces must act regionally on a fine scale at this locus. To further elucidate the nature of these forces, we report here on the spectrum of human SNPs and base pair substitutions between human and chimpanzee. The results show that the mutation rate changes exactly at the GC-content transition zone from low values in the GC-poor sequences to high values in GC-rich ones. The GC content of the GC-poor sequences can be explained by a bias in favor of GC > AT mutations, whereas the GC content of the GC-rich segment may result from a fixation bias in favor of AT > GC substitutions. This fixation bias may be explained by direct selection by the GC content or by biased gene conversion.

7.
The “heterozygote instability” (HI) hypothesis suggests that gene conversion events focused on heterozygous sites during meiosis locally increase the mutation rate, but this hypothesis remains largely untested. As humans left Africa they lost variability, which, if HI operates, should have reduced the mutation rate in non-Africans. Relative substitution rates were quantified in diverse humans using aligned whole genome sequences from the 1000 Genomes Project. Substitution rate is consistently greater in Africans than in non-Africans, but only in diploid regions of the genome, consistent with a role for heterozygosity. Analysing the same data partitioned into a series of non-overlapping 2 Mb windows reveals a strong, non-linear correlation between the amount of heterozygosity lost “out of Africa” and the difference in substitution rate between Africans and non-Africans. Putative recent mutations, derived variants that occur only once among the 80 human chromosomes sampled, occur preferentially at the centre of 2 kb windows that have elevated heterozygosity compared both with the same region in a closely related population and with an immediately adjacent region in the same population. More than half of all substitutions appear attributable to variation in heterozygosity. This observation provides strong support for HI with implications for many branches of evolutionary biology.

8.
It has been demonstrated that recombination in the human p-arm pseudoautosomal region (p-PAR) is at least twenty times more frequent than the genomic average of approximately 1 cM/Mb, which may affect substitution patterns and rates in this region. Here I report the analysis of substitution patterns and rates in 10 human, chimpanzee, gorilla, and orangutan genes across the p-PAR. Between-species silent divergence in the p-PAR forms a gradient, increasing toward the telomere. The correlation of silent divergence with distance from the p-PAR boundary is highly significant (rho = 0.911, P < 0.001). After exclusion of the CpG dinucleotides this correlation is still significant (rho = 0.89, P < 0.01); thus the substitution rate gradient cannot be explained solely by the differences in the extent of methylation across the p-PAR. Frequent recombination in the PAR may result in a relatively strong effect of biased gene conversion (BGC), which, because of the increased probability of fixation of the G or C nucleotides at (A or T)/(G or C) segregating sites, may affect substitution rates. BGC, however, does not seem to be the factor creating the substitution rate gradient in the p-PAR, because the gradient is still detectable if only A↔T and G↔C substitutions are taken into account (rho = 0.82, P < 0.01). I hypothesize that the substitution rate gradient in the p-PAR is due to the mutagenic effect of recombination, which is very frequent in the distal human p-PAR and might be lower near the p-PAR boundary.

9.
Homologous recombination occurs especially frequently near special chromosomal sites called hotspots. In Escherichia coli, Chi hotspots control RecBCD enzyme, a protein machine essential for the major pathway of DNA break-repair and recombination. RecBCD generates recombinogenic single-stranded DNA ends by unwinding DNA and cutting it a few nucleotides to the 3′ side of 5′ GCTGGTGG 3′, the sequence historically equated with Chi. To test if sequence context affects Chi activity, we deep-sequenced the products of a DNA library containing 10 random base-pairs on each side of the Chi sequence and cut by purified RecBCD. We found strongly enhanced cutting at Chi with certain preferred sequences, such as A or G at nucleotides 4–7, on the 3′ flank of the Chi octamer. These sequences also strongly increased Chi hotspot activity in E. coli cells. Our combined enzymatic and genetic results redefine the Chi hotspot sequence, implicate the nuclease domain in Chi recognition, indicate that nicking of one strand at Chi is RecBCD's biologically important reaction in living cells, and enable more precise analysis of Chi's role in recombination and genome evolution.

10.
A full genome analysis of differences between the gene expression in the human and chimpanzee brains revealed that the gene for transthyretin, the carrier of thyroid hormones, is differently transcribed in the cerebella of these species. A 7-kbp DNA fragment of chimpanzee was sequenced to identify possible regulatory sequences responsible for the differences in expression. One hundred and thirteen substitutions were found in the chimpanzee sequence in comparison with the human sequence. About 40% of the substitutions were revealed within the repeating elements of the genome; their location and sizes did not differ from those in the corresponding fragments of the human genome, and the nucleotide sequences had a high degree of identity. A comparison of nucleotide sequences of the transthyretin region of human, chimpanzee, and mouse genes revealed substantial differences in the distribution of G + C content along the examined fragment in the human (chimpanzee) and mouse genes and allowed us to localize three sequence tracts with a higher degree of identity in the three species. One of these tracts was located in the promoter region of the gene, and the other two probably determine the specificity of transthyretin gene expression in the liver and brain. One of the conserved tracts of the chimpanzee genome was found to have a single and a triple nucleotide substitution. The triple substitution distinguishes chimpanzees from humans and mice, which have identical sequences of this site. It is likely that these substitutions are responsible for the differences in the expression levels of the transthyretin gene in the human and chimpanzee brains.

11.
A full genome analysis of differences between the gene expression in the human and chimpanzee brains revealed that the gene for transthyretin, the carrier of thyroid hormones, is differently transcribed in the cerebella of these species. A 7-kbp DNA fragment of chimpanzee was sequenced to identify possible regulatory sequences responsible for the differences in expression. One hundred and thirteen substitutions were found in the chimpanzee sequence in comparison with the human sequence. About 40% of the substitutions were revealed within the repeating elements of the genome; their location and sizes did not differ from those in the corresponding fragments of the human genome, and the nucleotide sequences had a high degree of identity. A comparison of nucleotide sequences of the transthyretin region of human, chimpanzee, and mouse genes revealed substantial differences in the distribution of G + C content along the examined fragment in the human (chimpanzee) and mouse genes and allowed us to localize three sequence tracts with a higher degree of identity in the three species. One of these tracts is located in the promoter region of the gene, and the other two probably determine the specificity of transthyretin gene expression in the liver and brain. One of the conserved tracts of the chimpanzee genome was found to have a single and a triple nucleotide substitution. The triple substitution distinguishes chimpanzees from humans and mice, which have identical sequences of this site. It is likely that these substitutions are responsible for the differences in the expression levels of the transthyretin gene in the human and chimpanzee brains.

12.
Some aspects of microsatellite evolution, such as the role of base substitutions, are far from being fully understood. To examine the significance of base substitutions underlying the evolution of microsatellites we explored the nature and the distribution of interruptions in dinucleotide repeats from the human genome. The frequencies that we inferred in the repetitive sequences were statistically different from the frequencies observed in other noncoding sequences. Additionally, we detected that the interruptions tended to be located towards the ends of the microsatellites and to show 5'-3' asymmetry. In all the estimates nucleotides forming the same repetitive motif seem to be affected by different base substitution rates in AC and AG. This tendency itself could generate patterning and similarity in flanking sequences and reconcile these phenomena with the high mutation rate found in flanking sequences without invoking convergent evolution. Nevertheless, our data suggest that there is a regional bias in the substitution pattern of microsatellites. The accumulation of random substitutions alone cannot explain the heterogeneity and the asymmetry of interruptions found in this study or the relative frequency of different compound microsatellites in the human genome. Therefore, we cannot rule out the possibility of a mutational bias leading to convergent or parallel evolution in flanking sequences.

13.
Interspersed repeats have emerged as a valuable tool for studying neutral patterns of molecular evolution. Here we analyze variation in the rate and pattern of nucleotide substitution across all autosomes in the chicken genome by comparing the present-day CR1 repeat sequences with their ancestral copies and reconstructing nucleotide substitutions with a maximum likelihood model. The results shed light on the origin and evolution of large-scale heterogeneity in GC content found in the genomes of birds and mammals (the isochore structure). In contrast to mammals, where GC content is becoming homogenized, heterogeneity in GC content is being reinforced in the chicken genome. This is also supported by patterns of substitution inferred from alignments of introns in chicken, turkey, and quail. Analysis of individual substitution frequencies is consistent with the biased gene conversion (BGC) model of isochore evolution, and it is likely that patterns of evolution in the chicken genome closely resemble those in the ancestral amniote genome, when it is inferred that isochores originated. Microchromosomes and distal regions of macrochromosomes are found to have elevated substitution rates and a more GC-biased pattern of nucleotide substitution. This can largely be accounted for by a strong correlation between GC content and the rate and pattern of substitution. The results suggest that an interaction between increased mutability at CpG motifs and fixation biases due to BGC could explain increased levels of divergence in GC-rich regions.

14.
Nucleotide sequences of the genome RNA encoding capsid protein VP1 (918 nucleotides) of 18 enterovirus 70 (EV70) isolates collected from various parts of the world in 1971 to 1981 were determined, and nucleotide substitutions among them were studied. The genetic distances between isolates were calculated by the pairwise comparison of nucleotide difference. Regression analysis of the genetic distances against time of isolation of the strains showed that the synonymous substitution rate was very high at 21.53 × 10⁻³ substitution per nucleotide per year, while the nonsynonymous rate was extremely low at 0.32 × 10⁻³ substitution per nucleotide per year. The rate estimated by the average value of synonymous and nonsynonymous substitutions (W.-H. Li, C.-C. Wu, and C.-C. Luo, Mol. Biol. Evol. 2:150-174, 1985) was 5.00 × 10⁻³ substitution per nucleotide per year. Taking the average value of synonymous and nonsynonymous substitutions as genetic distances between isolates, the phylogenetic tree was inferred by the unweighted pairwise grouping method of arithmetic average and by the neighbor-joining method. The tree indicated that the virus had evolved from one focal place, and the time of emergence was estimated to be August 1967 +/- 15 months, 2 years before first recognition of the pandemic of acute hemorrhagic conjunctivitis. By superimposing every nucleotide substitution on the branches of the phylogenetic tree, we analyzed nucleotide substitution patterns of EV70 genome RNA. In synonymous substitutions, the proportion of transitions, i.e., C↔U and G↔A, was found to be extremely high in comparison with that reported on other viruses or pseudogenes. In addition, parallel substitutions (independent substitutions at the same nucleotide position on different branches, i.e., different isolates, of the tree) were frequently found in both synonymous and nonsynonymous substitutions. These frequent parallel substitutions and the low nonsynonymous substitution rate despite the very high synonymous substitution rate described above imply a strong restriction on nonsynonymous substitution sites of VP1, probably due to the requirement for maintaining the rigid icosahedral conformation of the virus.

15.
Genomic variation and related evolutionary dynamics of human respiratory syncytial virus (RSV), a common causative agent of severe lower respiratory tract infections, may affect its transmission behavior. RSV evolutionary patterns are likely to be influenced by a precarious interplay between selection favoring variants with higher replicative fitness and variants that evade host immune responses. Studying RSV genetic variation can reveal both the genes and the individual codons within these genes that are most crucial for RSV survival. In this study, we conducted genetic diversity and evolutionary rate analyses on 36 RSV subgroup B (RSV-B) whole-genome sequences. The attachment protein, G, was the most variable protein; accordingly, the G gene had a higher substitution rate than other RSV-B genes. Overall, less genetic variability was found among the available RSV-B genome sequences than among RSV-A genome sequences in a comparable sample. The mean substitution rates of the two subgroups were, however, similar (for subgroup A, 6.47 × 10⁻⁴ substitutions/site/year [95% credible interval {CI 95%}, 5.56 × 10⁻⁴ to 7.38 × 10⁻⁴]; for subgroup B, 7.76 × 10⁻⁴ substitutions/site/year [CI 95%, 6.89 × 10⁻⁴ to 8.58 × 10⁻⁴]), with the time to their most recent common ancestors (TMRCAs) being much lower for RSV-B (19 years) than for RSV-A (46.8 years). The more recent RSV-B TMRCA is apparently the result of a genetic bottleneck that, over longer time scales, is still compatible with neutral population dynamics. Whereas the immunogenic G protein seems to require high substitution rates to ensure immune evasion, strong purifying selection in conserved proteins such as the fusion protein and nucleocapsid protein is likely essential to preserve RSV viability.

16.
The presence of at least ten mouse LDH-A pseudogenes was demonstrated in the genomic blot analysis, and four different processed pseudogenes have thus far been isolated and characterized. In this report, the nucleotide sequences of two different mouse lactate dehydrogenase-A processed pseudogenes, M11 and M14, were determined and compared with the protein-coding sequences of the mouse and rat LDH-A functional genes. In the pseudogene M11, the sequence of 64 nucleotides from codon no. 257 to 278 was tandemly duplicated. In the pseudogene M14, the sequence of 22 nucleotides from codon no. 68 to 75 was replaced by an inserted repetitive sequence of 242 nucleotides homologous to a mouse truncated R element. The pattern of nucleotide substitutions accumulated in mouse LDH-A pseudogenes M11 and M14, as well as that of pseudogene M10 identified previously, was analyzed, and the substitution frequencies of the C or G at the CG dinucleotide were found to be high.

17.
Plastid sequences of the atpB-rbcL spacer and the rbcL gene itself were used to evaluate their respective potential in reconstructing the phylogeny of 15 taxa from the tribe Rubieae (Rubiaceae). From our previous analyses using the atpB-rbcL spacer, the 15 selected taxa represent most of the variability of the tribe. Since this group is considered to be relatively recent (Upper Tertiary), it should allow the study of early dynamics of nucleotide substitutions in recent divergences. The results show that the spacer and rbcL inferred phylogenies are not totally congruent; the spacer trees are more similar to interpretations of morphological data. A comparative analysis of the pattern of nucleotide substitution of these two sequences in the Rubieae shows that (1) the overall rate of substitution is similar in the spacer and in rbcL, and the rate of synonymous substitution in rbcL is much higher; (2) the level of homoplasy is higher in rbcL than in the spacer matrix, which shows a higher phylogenetic structure; and (3) the pattern of transition and transversion substitutions is different in the two sequences, and is not linear in rbcL. As a result of these observations, we suggest that (1) the spacer is evolving relatively slowly because of unsuspected, and phylogenetically important, selective constraints on its sequence; and (2) in the rbcL sequence, many sites, free of constraint, are changing at a high rate, and some of these sites seem to have undergone multiple substitutions even in this recent tribe. This could explain the high level of homoplasy found in Rubieae rbcL sequences. Correspondence to: J.-F. Manen

18.
Yi SY, Joeng KS, Kweon JU, Cho JW, Chung IK, Lee J. FEBS Letters, 2001, 505(2): 301-306
We identified and characterized a protein (STB-1) from the nuclear extract of Caenorhabditis elegans that specifically binds single-stranded telomere DNA sequences, but not the corresponding RNA sequences. STB-1 binding activity is specific to the nematode telomere, but not to the human or plant telomere. STB-1 requires the core nucleotides of GCTTAGG and three spacer nucleotides in front of them for binding. While any single nucleotide change in the core sequence abolishes binding, the spacer nucleotides tolerate substitution. STB-1 was determined to be a basic protein of 45 kDa by Southwestern analyses. STB-1 forms a stable complex with DNA once bound to the telomere.

19.
Nucleotide substitution, insertion and deletion (indel) events are the major driving forces that have shaped genomes. Using the recently identified human ribosomal protein (RP) pseudogene sequences, we have thoroughly studied DNA mutation patterns in the human genome. We analyzed a total of 1726 processed RP pseudogene sequences, comprising more than 700 000 bases. To be sure to differentiate the sequence changes occurring in the functional genes during evolution from those occurring in pseudogenes after they were fixed in the genome, we used only pseudogene sequences originating from parts of RP genes that are identical in human and mouse. Overall, we found that nucleotide transitions are more common than transversions, by roughly a factor of two. Moreover, the substitution rates amongst the 12 possible nucleotide pairs are not homogeneous as they are affected by the type of immediately neighboring nucleotides and the overall local G+C content. Finally, our dataset is large enough that it has many indels, thus allowing for the first time statistically robust analysis of these events. Overall, we found that deletions are about three times more common than insertions (3740 versus 1291). The frequencies of both these events follow characteristic power-law behavior associated with the size of the indel. However, unexpectedly, the frequency of 3 bp deletions (in contrast to 3 bp insertions) violates this trend, being considerably higher than that of 2 bp deletions. The possible biological implications of such a 3 bp bias are discussed.

20.
Summary: Based on the rates of synonymous substitution in 42 protein-coding gene pairs from rat and human, a correlation is shown to exist between the frequency of the nucleotides in all positions of the codon and the synonymous substitution rate. The correlation coefficients were positive for A and T and negative for C and G. This means that AT-rich genes accumulate more synonymous substitutions than GC-rich genes. Biased patterns of mutation could not account for this phenomenon. Thus, the variation in synonymous substitution rates and the resulting unequal codon usage must be the consequence of selection against A and T in synonymous positions. Most of the variation in rates of synonymous substitution can be explained by the nucleotide composition in synonymous positions. Codon-anticodon interactions, dinucleotide frequencies, and contextual factors influence neither the rates of synonymous substitution nor codon usage. Interestingly, the nucleotide in the second position of codons (always a nonsynonymous position) was found to affect the rate of synonymous substitution. This finding links the rate of nonsynonymous substitution with the synonymous rate. Consequently, highly conservative proteins are expected to be encoded by genes that evolve slowly in terms of synonymous substitutions, and are consequently highly biased in their codon usage.
