Similar articles
 20 similar articles found
1.
How Do Variable Substitution Rates Influence Ka and Ks Calculations?   Cited by: 2 (self-citations: 1, citations by others: 1)
The ratio of nonsynonymous substitution rate (Ka) to synonymous substitution rate (Ks) is widely used as an indicator of selective pressure at the sequence level among different species, and diverse mutation models have been incorporated into several computing methods. We have previously developed a new γ-MYN method by capturing a key dynamic evolution trait of DNA nucleotide sequences, in consideration of varying mutation rates across sites. We now report a further improvement of NG, LWL, MLWL, LPB, MLPB, and ...

2.
Cereal genes are classified into two distinct classes according to the guanine-cytosine (GC) content at the third codon sites (GC3). Natural selection and mutation bias have both been proposed to affect the GC content, but the cause of GC variation remains controversial. Here, we characterized the GC content of 1,092 paralogs and other single-copy genes in the duplicated chromosomal regions of the rice genome (ssp. indica) and classified the paralogs into GC3-rich and GC3-poor groups. By referring to out-group sequences from Arabidopsis and maize, we confirmed that the average synonymous substitution rate of the GC3-rich genes is significantly lower than that of the GC3-poor genes. Furthermore, we explored other possible factors corresponding to the GC variation, including the length of coding sequences, the number of exons in each gene, the number of genes in each family, the location of genes on chromosomes and the protein functions. Consequently, we propose that natural selection rather than mutation bias was the primary cause of the GC variation.
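The GC3 statistic described in this abstract can be illustrated with a minimal Python sketch. It assumes an in-frame coding sequence with no ambiguity codes; the function name `gc3_content` is hypothetical, and the abstract does not state the threshold the authors used to separate GC3-rich from GC3-poor genes, so none is applied here.

```python
def gc3_content(cds: str) -> float:
    """Return the fraction of G or C bases at third codon positions.

    Assumes `cds` is an in-frame coding sequence; a trailing partial
    codon (one or two leftover bases) is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop any incomplete final codon
    third_positions = cds[2:usable:3]     # bases at index 2, 5, 8, ...
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


if __name__ == "__main__":
    # Codons ATG, GCC, AAA have third bases G, C, A.
    print(gc3_content("ATGGCCAAA"))
```

Sorting genes into two groups would then be a comparison of this value against a cutoff chosen by the analyst.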

3.
Cereal genes are classified into two distinct classes according to the guanine-cytosine (GC) content at the third codon sites (GC3). Natural selection and mutation bias have been proposed to affect the GC content. However, there has been controversy about the cause of GC variation. Here, we characterized the GC content of 1092 paralogs and other single-copy genes in the duplicated chromosomal regions of the rice genome (ssp. indica) and classified the paralogs into GC3-rich and GC3-poor groups. By referring to out-group sequences from Arabidopsis and maize, we confirmed that the average synonymous substitution rate of the GC3-rich genes is significantly lower than that of the GC3-poor genes. Furthermore, we explored the other possible factors corresponding to the GC variation including the length of coding sequences, the number of exons in each gene, the number of genes in each family, the location of genes on chromosomes and the protein functions. Consequently, we propose that natural selection rather than mutation bias was the primary cause of the GC variation.

4.
5.
We present an integrated stand-alone software package named KaKs_Calculator 2.0 as an updated version. It incorporates 17 methods for the calculation of nonsynonymous and synonymous substitution rates; among them, we added our modified versions of several widely used methods as the gamma series, including γ-NG, γ-LWL, γ-MLWL, γ-LPB, γ-MLPB, γ-YN and γ-MYN, which have been demonstrated to perform better under certain conditions than their original forms and are not implemented in the previous version. The package is readily used for the identification of positively selected sites based on a sliding window across the sequences of interest in the 5' to 3' direction of protein-coding sequences, and has improved the overall performance of sequence analysis for evolution studies. A toolbox, including C++ and Java source code and executable files for both Windows and Linux platforms together with user instructions, is downloadable from the website for academic purposes at https://sourceforge.net/projects/kakscalculator2/.
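The sliding-window idea mentioned in this abstract can be sketched as follows. This is only an illustration of codon-aligned windows moving in the 5' to 3' direction, not KaKs_Calculator's own implementation; the function name `codon_windows` and the window and step sizes are hypothetical choices.

```python
from typing import Iterator, Tuple


def codon_windows(
    length_nt: int, window_codons: int, step_codons: int
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) nucleotide offsets of codon-aligned windows.

    Windows run 5' to 3' over an alignment of `length_nt` nucleotides;
    `end` is exclusive. A trailing partial codon or partial window is skipped.
    """
    if window_codons <= 0 or step_codons <= 0:
        raise ValueError("window and step sizes must be positive")
    n_codons = length_nt // 3
    for first in range(0, n_codons - window_codons + 1, step_codons):
        yield first * 3, (first + window_codons) * 3


if __name__ == "__main__":
    # A 30-nt alignment (10 codons), 4-codon windows, advancing 2 codons.
    for start, end in codon_windows(30, window_codons=4, step_codons=2):
        print(start, end)
```

Each (start, end) pair would be used to slice both aligned sequences before passing the slices to whichever Ka/Ks estimator is in use.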

6.
Methods for estimating synonymous and nonsynonymous substitution rates among protein-coding sequences adopt different mutation (substitution) models with subtle yet significant differences, which lead to different estimates of evolutionary information. Little attention has been devoted to the comparison of methods for obtaining reliable estimates, since the amount of sequence variation within targeted datasets is always unpredictable. To our knowledge, there is little information available in the literature about evaluation of these different methods. In this study, we compared six widely used methods and provide evaluation results using simulated sequences. The results indicate that incorporating sequence features (such as transition/transversion bias and nucleotide/codon frequency bias) into methods could yield better performance. We recommend that conclusions related to or derived from Ka and Ks analyses should not be drawn from the results of one method alone.

7.
8.
The CD59-coding sequences were obtained from 5 mammals by PCR and BLAST and combined with the available sequences in GenBank, and the nucleotide substitution rates of mammalian cd59 were calculated. Results of synonymous and nonsynonymous substitution rates revealed that cd59 experienced negative selection in mammals overall. Four sites experiencing positive selection were found by using the "site-specific" model in the PAML software. These sites were distributed on the molecular surface, and 2 of them were located in the key functional domain. Furthermore, the "branch-site-specific" model detected 1 positive site in the cd59a and cd59b lineages, which underwent accelerated evolution caused by positive selection after gene duplication in mouse.

9.
The first intron of rice EPSP synthase enhances expression of foreign gene   Cited by: 5 (self-citations: 0, citations by others: 5)
Translatable exon sequences in pre-mRNA are often separated by non-coding introns in eukaryotic genomes. The removal of non-coding introns from pre-mRNA and the splicing together of translatable exon sequences is an essential requirement of gene expression. The DNA size of introns in a gene is 5 to 10 times larger than that of exons, which can store more information and is helpful for a gene during evolution[1]. In many experiments on gene expression, it is indispensable for a gene to be expresse…

10.
Since the birth of molecular evolutionary analysis, primates have been a central focus of study, and mitochondrial DNA is well suited to these endeavors because of its unique features. Surprisingly, to date no comprehensive evaluation of the nucleotide substitution patterns has been conducted on the mitochondrial genome of primates. Here, we analyzed the evolutionary patterns and evaluated selection and recombination in the mitochondrial genomes of 44 primate species downloaded from GenBank. The results revealed that strong rate heterogeneity occurred among sites and genes in all comparisons. Likewise, an obvious decline in primate nucleotide diversity was noted in the subunit rRNAs and tRNAs as compared to the protein-coding genes. Within the 13 protein-coding genes, the pattern of nonsynonymous divergence was similar to that of overall nucleotide divergence, while synonymous changes differed only for individual genes, indicating that the rate heterogeneity may result from the rate of change at nonsynonymous sites. Codon usage analysis revealed that there was intermediate codon usage bias in primate protein-coding genes, and supported the idea that GC mutation pressure might determine codon usage and that positive selection is not the driving force for the codon usage bias. Neutrality tests using site-specific positive selection from a Bayesian framework indicated no sites were under positive selection for any gene, consistent with near neutrality. Recombination tests based on the pairwise homoplasy test statistic supported complete linkage even for much older divergent primate species. Thus, with the exception of rate heterogeneity among mitochondrial genes, evaluating the validity of the assumed complete linkage and selective neutrality in primates prior to phylogenetic or phylogeographic analysis seems unnecessary.

11.
Since plant mitochondrial genomes exhibit some of the slowest known synonymous substitution rates, it is generally believed that they experience exceptionally low mutation rates. However, the use of synonymous substitution rates to infer mutation rates depends on the implicit assumption that synonymous sites are evolving neutrally (or nearly so). To assess the validity of this assumption in plant mitochondrial genomes, we examined coding sequence for footprints of selection acting at synonymous sites. We found that synonymous sites exhibit an AT-rich and pyrimidine-skewed nucleotide composition compared to both non-synonymous sites and non-coding regions. We also found some evidence for selection associated with both biased codon usage and conservation of regulatory sequences involved in mRNA processing, although some of these findings are subject to alternative non-adaptive interpretations. Regardless, the inferred strength of selection appears too weak to account for the variation in substitution rates between the mitochondrial genomes of plants and other multicellular eukaryotes. Therefore, these results are consistent with the interpretation that plant mitochondrial genomes experience a substantially lower mutation rate rather than increased functional constraints acting on synonymous sites. Nevertheless, there are important nucleotide composition patterns (particularly the differences between synonymous sites and non-coding DNA) that remain largely unexplained.

12.
Gene sequences from Escherichia coli, Salmonella typhimurium, and other members of the Enterobacteriaceae show a negative correlation between the degree of synonymous-codon usage bias and the rate of nucleotide substitution at synonymous sites. In particular, very highly expressed genes have very biased codon usage and accumulate synonymous substitutions very slowly. In contrast, there is little correlation between the degree of codon bias and the rate of protein evolution. It is concluded that both the rate of synonymous substitution and the degree of codon usage bias largely reflect the intensity of selection at the translational level. Because of the high variability among genes in rates of synonymous substitution, separate molecular clocks of synonymous substitution might be required for different genes.

13.
A common approach to estimating the strength and direction of selection acting on protein-coding sequences is to calculate the dN/dS ratio. The method to calculate dN/dS has been widely used by many researchers, and many critical reviews have been made on its application since its proposal by Nei and Gojobori in 1986. However, the method is still evolving to account for non-uniform substitution rates and pretermination codons. In our study of SNPs in 586 genes across 156 Escherichia coli strains, synonymous polymorphism in 2-fold degenerate codons was higher than that in 4-fold degenerate codons, which could be attributed to the difference between transition (Ti) and transversion (Tv) substitution rates, where the average rate of a transition is in general four times that of a transversion. We considered both the Ti/Tv ratio and nonsense mutation in pretermination codons to improve estimates of synonymous (S) and non-synonymous (NS) sites. The accuracy of estimating dN/dS has been improved by considering the Ti/Tv ratio and nonsense substitutions in pretermination codons. We showed that applying the modified approach based on the Ti/Tv ratio and pretermination codons results in higher values of dN/dS in 29 common genes of equal reading frames between E. coli and Salmonella enterica. This study emphasizes the robustness of amino acid composition with varying codon degeneracy, as well as the pretermination codons, when calculating dN/dS values.
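The quantities this abstract refers to (transitions versus transversions, and pretermination codons) can be made concrete with a small Python sketch. It only classifies the nine single-base changes of a codon under the standard genetic code; it is not the authors' modified dN/dS estimator, and the names `classify_single_changes` and `is_pretermination` are hypothetical. A pretermination codon is taken here to mean a sense codon that a single substitution can turn into a stop codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = frozenset("AG")


def is_transition(a: str, b: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return a != b and (a in PURINES) == (b in PURINES)


def classify_single_changes(codon: str) -> Counter:
    """Tally the nine one-base changes of a sense codon.

    Keys are (effect, kind) pairs, where effect is "synonymous",
    "nonsynonymous" or "nonsense" and kind is "Ti" or "Tv".
    """
    codon = codon.upper()
    original_aa = CODON_TABLE[codon]
    if original_aa == "*":
        raise ValueError("expected a sense codon, got a stop codon")
    tally = Counter()
    for pos, old_base in enumerate(codon):
        for new_base in BASES:
            if new_base == old_base:
                continue
            mutant_aa = CODON_TABLE[codon[:pos] + new_base + codon[pos + 1:]]
            if mutant_aa == "*":
                effect = "nonsense"
            elif mutant_aa == original_aa:
                effect = "synonymous"
            else:
                effect = "nonsynonymous"
            kind = "Ti" if is_transition(old_base, new_base) else "Tv"
            tally[(effect, kind)] += 1
    return tally


def is_pretermination(codon: str) -> bool:
    """True if one substitution can turn this sense codon into a stop codon."""
    return any(effect == "nonsense" for effect, _ in classify_single_changes(codon))


if __name__ == "__main__":
    for key, count in sorted(classify_single_changes("TGT").items()):
        print(key, count)
    print("TGT pretermination:", is_pretermination("TGT"))
```

A site-counting scheme that weights transitions and transversions differently, or that sets nonsense changes aside, could be built on top of such a tally; how exactly the authors do this is not specified in the abstract.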

14.
Detecting selection in noncoding regions of nucleotide sequences   Cited by: 2 (self-citations: 0, citations by others: 2)
Wong WS, Nielsen R. Genetics, 2004, 167(2): 949-958
We present a maximum-likelihood method for examining the selection pressure and detecting positive selection in noncoding regions using multiple aligned DNA sequences. The rate of substitution in noncoding regions relative to the rate of synonymous substitution in coding regions is modeled by a parameter zeta. When a site in a noncoding region is evolving neutrally zeta = 1, while zeta > 1 indicates the action of positive selection, and zeta < 1 suggests negative selection. Using a combined model for the evolution of noncoding and coding regions, we develop two likelihood-ratio tests for the detection of selection in noncoding regions. Data analysis of both simulated and real viral data is presented. Using the new method we show that positive selection in viruses is acting primarily in protein-coding regions and is rare or absent in noncoding regions.

15.
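We analyzed 42 sequences of the chloroplast rbcL gene, which encodes the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase, from the major groups of Liliales, and used the RRTree relative-rate test to examine in detail how the synonymous and nonsynonymous substitution rates of rbcL vary among seven families of Liliales. The relative-rate tests showed that, within Liliales, Colchicaceae has the fastest synonymous and nonsynonymous substitution rates, Campynemataceae has the slowest synonymous substitution rate, and Liliaceae has the slowest nonsynonymous substitution rate; however, neither the synonymous nor the nonsynonymous substitution rates differ significantly among the families of Liliales.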

16.
The molecular clock of mitochondrial DNA has been extensively used to date various genetic events. However, its substitution rate among humans appears to be higher than rates inferred from human-chimpanzee comparisons, limiting the potential of interspecies clock calibrations for intraspecific dating. It is not well understood how and why the substitution rate accelerates. We have analyzed a phylogenetic tree of 3057 publicly available human mitochondrial DNA coding region sequences for changes in the ratios of mutations belonging to different functional classes. The proportion of non-synonymous and RNA gene substitutions has decreased over hundreds of thousands of years. The highest mutation ratios, corresponding to fast acceleration in the apparent substitution rate of the coding sequence, have occurred after the end of the Last Ice Age. We recalibrate the molecular clock of human mtDNA as 7990 years per synonymous mutation over the mitochondrial genome. However, the distribution of substitutions at synonymous sites in human data significantly departs from a model assuming a single rate parameter and implies at least 3 different subclasses of sites. A neutral model with 3 synonymous substitution rates can explain most, if not all, of the apparent molecular clock difference between the intra- and interspecies levels. Our findings imply the sluggishness of purifying selection in removing slightly deleterious mutations from the human as well as the Neandertal and chimpanzee populations. However, for humans, the weakness of purifying selection has been further exacerbated by the population expansions associated with the out-of-Africa migration and the end of the Last Ice Age.

17.
Duret L, Arndt PF. PLoS Genetics, 2008, 4(5): e1000071
Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue for detecting selection within sequences. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We analyzed the pattern of neutral substitutions in 1 Gb of primate non-coding regions. We show that the GC-content toward which sequences are evolving is strongly negatively correlated to the distance to telomeres and positively correlated to the rate of crossovers (R² = 47%). This demonstrates that recombination has a major impact on substitution patterns in human, driving the evolution of GC-content. The evolution of GC-content correlates much more strongly with male than with female crossover rate, which rules out selectionist models for the evolution of isochores. This effect of recombination is most probably a consequence of the neutral process of biased gene conversion (BGC) occurring within recombination hotspots. We show that the predictions of this model fit very well with the observed substitution patterns in the human genome. This model notably explains the positive correlation between substitution rate and recombination rate. Theoretical calculations indicate that variations in population size or density in recombination hotspots can have a very strong impact on the evolution of base composition. Furthermore, recombination hotspots can create strong substitution hotspots. This molecular drive affects both coding and non-coding regions. We therefore conclude that along with mutation, selection and drift, BGC is one of the major factors driving genome evolution. Our results also shed light on variations in the rate of crossover relative to non-crossover events, along chromosomes and according to sex, and also on the conservation of hotspot density between human and chimp.

18.
Drosophila nuclear introns are commonly assumed to change according to a single rate of substitution, yet little is known about the evolution of these non-coding sequences. The hypothesis of a uniform substitution rate for introns seems to be at odds with recent findings that the nucleotide composition of introns varies at a scale unknown before, and that their base content variation is correlated with that of the adjacent exons. However, no direct comparison of substitution rates in introns seems to have been attempted so far. We have studied the rate of nucleotide substitution over a region of the Xdh gene containing two adjacent short, constitutively spliced introns, in several species of Drosophila and related genera. The two introns differ significantly in base composition and substitution rate, with one intron evolving at least twice as fast as the other. In addition, the substitution pattern of the introns is positively associated with that of the surrounding coding regions, indicating that the molecular evolution of these introns is affected by the region in which they are embedded. The observed differences cannot be attributed to selection acting differently at the level of the secondary structure of the pre-mRNA. Rather, they are better accounted for by locally heterogeneous patterns of mutation. Received: 26 July 1999 / Accepted: 21 August 1999

19.
Reduced median networks of African haplogroup L mitochondrial DNA (mtDNA) sequences were analyzed to determine the pattern of substitutions in both the noncoding control and coding regions. In particular, we attempted to determine the causes of the previously reported (Howell et al. 2004) violation of the molecular clock during the evolution of these sequences. In the coding region, there was a significantly higher rate of substitution at synonymous sites than at nonsynonymous sites as well as in the tRNA and rRNA genes. This is further evidence for the operation of purifying selection during human mtDNA evolution. For most sites in the control region, the relative rate of substitution was similar to the rate of neutral evolution (assumed to be most closely approximated by the substitution rate at 4-fold degenerate sites). However, there are a number of mutational hot spots in the control region, approximately 3% of the total sites, that have a rate of substitution greater than the neutral rate, at some sites by more than an order of magnitude. It is possible either that these sites are evolving under conditions of positive selection or that the substitution rate at some sites in the control region is strongly dependent upon sequence context. Finally, we obtained preliminary evidence for "nonideal" evolution in the control region, including haplogroup-specific substitution patterns and a decoupling between relative rates of substitution in the control and coding regions.

20.
The selective forces acting on a protein-coding gene are commonly inferred using evolutionary codon models by contrasting the rate of nonsynonymous substitutions to the rate of synonymous substitutions. These models usually assume that the synonymous substitution rate, Ks, is homogeneous across all sites, which is justified if synonymous sites are free from selection. However, a growing body of evidence indicates that the DNA and RNA levels of protein-coding genes are subject to varying degrees of selective constraints due to various biological functions encoded at these levels. In this paper, we develop evolutionary models that account for these layers of selection by allowing for both among-site variability of substitution rates at the DNA/RNA level (which leads to Ks variability among protein-coding sites) and among-site variability of substitution rates at the protein level (Ka variability). These models are constructed so that positive selection is either allowed or not. This enables statistical testing of positive selection when variability in the DNA/RNA substitution rate is accounted for. Using this methodology, we show that variability of the baseline DNA/RNA substitution rate is a widespread phenomenon in coding sequence data of mammalian genomes, most likely reflecting varying degrees of selection at the DNA and RNA levels. Additionally, we use simulations to examine the impact that accounting for the variability of the baseline DNA/RNA substitution rate has on the inference of positive selection. Our results show that ignoring this variability results in a high rate of erroneous positive-selection inference. Our newly developed model, which accounts for this variability, does not suffer from this problem and hence provides a likelihood framework for the inference of positive selection on a background of variability in the baseline DNA/RNA substitution rate.
