Similar articles
 Found 20 similar articles; search time 15 ms
1.
    
High-throughput sequencing enables rapid genome sequencing during infectious disease outbreaks and provides an opportunity to quantify the evolutionary dynamics of pathogens in near real-time. One difficulty of undertaking evolutionary analyses over short timescales is the dependency of the inferred evolutionary parameters on the timespan of observation. Crucially, an increasing number of molecular clock analyses use external evolutionary rate priors to infer evolutionary parameters. However, it is not clear which rate prior is appropriate for a given time window of observation due to the time-dependent nature of evolutionary rate estimates. Here, we characterize the molecular evolutionary dynamics of SARS-CoV-2 and 2009 pandemic H1N1 (pH1N1) influenza during the first 12 months of their respective pandemics. We use Bayesian phylogenetic methods to estimate the dates of emergence, evolutionary rates, and growth rates of SARS-CoV-2 and pH1N1 over time and investigate how varying sampling window and data set sizes affect the accuracy of parameter estimation. We further use a generalized McDonald–Kreitman test to estimate the number of segregating nonneutral sites over time. We find that the inferred evolutionary parameters for both pandemics are time dependent, and that the inferred rates of SARS-CoV-2 and pH1N1 decline by ∼50% and ∼100%, respectively, over the course of 1 year. After at least 4 months since the start of sequence sampling, inferred growth rates and emergence dates remain relatively stable and can be inferred reliably using a logistic growth coalescent model. We show that the time dependency of the mean substitution rate is due to elevated substitution rates at terminal branches which are 2–4 times higher than those of internal branches for both viruses. The elevated rate at terminal branches is strongly correlated with an increasing number of segregating nonneutral sites, demonstrating the role of purifying selection in generating the time dependency of evolutionary parameters during pandemics.

2.
We have used analysis of variance to partition the variation in synonymous and amino acid substitution rates between three effects (gene, lineage, and a gene-by-lineage interaction) in mammalian nuclear and mitochondrial genes. We find that gene effects are stronger for amino acid substitution rates than for synonymous substitution rates and that lineage effects are stronger for synonymous substitution rates than for amino acid substitution rates. Gene-by-lineage interactions, equivalent to overdispersion corrected for lineage effects, are found in amino acid substitutions but not in synonymous substitutions. The variance in the ratio of amino acid and synonymous substitution rates is dominated by gene effects, but there is also a significant gene-by-lineage interaction.

3.
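Using the number of amino acid substitutions in protein molecules, or of nucleotide substitutions in nucleic acid molecules, as the yardstick, this work shows that the change of biological macromolecules over time (that is, the rate of molecular evolution) remains relatively constant.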

4.
Sequences for multiple protein-coding genes are now commonly available from several, often closely related species. These data sets offer intriguing opportunities to test hypotheses regarding whether different types of genes evolve under different selective pressures. Although maximum likelihood (ML) models of codon substitution that are suitable for such analyses have been developed, little is known about the statistical properties of these tests. We use a previously developed fixed-sites model and computer simulations to examine the accuracy and power of the likelihood ratio test (LRT) in comparing the nonsynonymous-to-synonymous substitution rate ratio (ω = dN/dS) between two genes. Our results show that the LRT applied to fixed-sites models may be inaccurate in some cases when setting significance thresholds using a χ² approximation. Instead, we use a parametric bootstrap to describe the distribution of the LRT statistic for fixed-sites models and examine the power of the test as a function of sampling variables and properties of the genes under study. We find that the power of the test is high (>80%) even when sampling few taxa (e.g., six species) if sequences are sufficiently diverged and the test is largely unaffected by the tree topology used to simulate data. Our simulations show fixed-sites models are suitable for comparing substitution parameters among genes evolving under even strong evolutionary constraint (ω ≈ 0.05), although relative rate differences of 25% or less may be difficult to detect. Reviewing Editor: Dr. Rasmus Nielsen
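The parametric bootstrap mentioned in this abstract can be illustrated with a deliberately simplified sketch. It does not use a codon model: as a stand-in, it compares the proportion of substituted sites in two genes under a binomial model, computes the LRT statistic for "same proportion" versus "different proportions", and calibrates that statistic by simulating data under the fitted null instead of relying on a χ² threshold. The function names, the binomial model, the seed, and the number of replicates are all assumptions made for illustration, not details from the paper.

```python
import math
import random


def _xlogy(x, y):
    """Return x * log(y), with the convention that 0 * log(anything) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def _loglik(k, n, p):
    """Binomial log-likelihood for k hits in n sites (constant term dropped)."""
    return _xlogy(k, p) + _xlogy(n - k, 1.0 - p)


def lrt_statistic(k1, n1, k2, n2):
    """2 * (logL under separate proportions - logL under one shared proportion)."""
    p1, p2 = k1 / n1, k2 / n2
    p0 = (k1 + k2) / (n1 + n2)  # maximum-likelihood estimate under the null
    alt = _loglik(k1, n1, p1) + _loglik(k2, n2, p2)
    null = _loglik(k1, n1, p0) + _loglik(k2, n2, p0)
    return max(0.0, 2.0 * (alt - null))


def bootstrap_pvalue(k1, n1, k2, n2, n_boot=500, seed=1):
    """Parametric bootstrap: simulate under the fitted null, recompute the LRT."""
    rng = random.Random(seed)
    observed = lrt_statistic(k1, n1, k2, n2)
    p0 = (k1 + k2) / (n1 + n2)
    exceed = 0
    for _ in range(n_boot):
        s1 = sum(rng.random() < p0 for _ in range(n1))
        s2 = sum(rng.random() < p0 for _ in range(n2))
        if lrt_statistic(s1, n1, s2, n2) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    stat, p = bootstrap_pvalue(10, 300, 60, 300)
    print(f"LRT statistic = {stat:.2f}, bootstrap p-value = {p:.4f}")
```

The same loop structure would apply to a codon model: only the likelihood function and the simulator would change, at considerably higher computational cost.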

5.
    
Evolutionary timescales can be estimated from genetic data using phylogenetic methods based on the molecular clock. To account for molecular rate variation among lineages, a number of relaxed‐clock models have been developed. Some of these models assume that rates vary among lineages in an autocorrelated manner, so that closely related species share similar rates. In contrast, uncorrelated relaxed clocks allow all of the branch‐specific rates to be drawn from a single distribution, without assuming any correlation between rates along neighbouring branches. There is uncertainty about which of these two classes of relaxed‐clock models is more appropriate for biological data. We present an R package, NELSI, that allows the evolution of DNA sequences to be simulated according to a range of clock models. Using data generated by this package, we assessed the ability of two Bayesian phylogenetic methods to distinguish among different relaxed‐clock models and to quantify rate variation among lineages. The results of our analyses show that rate autocorrelation is typically difficult to detect, even when there is complete taxon sampling. This provides a potential explanation for past failures to detect rate autocorrelation in a range of data sets.

6.
Rate variation among nuclear genes and the age of polyploidy in Gossypium   (Total citations: 7; self-citations: 0; citations by others: 7)
Molecular evolutionary rate variation in Gossypium (cotton) was characterized using sequence data for 48 nuclear genes from both genomes of allotetraploid cotton, models of its diploid progenitors, and an outgroup. Substitution rates varied widely among the 48 genes, with silent and replacement substitution levels varying from 0.018 to 0.162 and from 0.000 to 0.073, respectively, in comparisons between orthologous Gossypium and outgroup sequences. However, about 90% of the genes had silent substitution rates spanning a more narrow threefold range. Because there was no evidence of rate heterogeneity among lineages for any gene and because rates were highly correlated in independent tests, evolutionary rate is inferred to be a property of each gene or its genetic milieu rather than the clade to which it belongs. Evidence from approximately 200,000 nucleotides (40,000 per genome) suggests that polyploidy in Gossypium led to a modest enhancement in rates of nucleotide substitution. Phylogenetic analysis for each gene yielded the topology expected from organismal history, indicating an absence of gene conversion or recombination among homoeologs subsequent to allopolyploid formation. Using the mean synonymous substitution rate calculated across the 48 genes, allopolyploid cotton is estimated to have formed circa 1.5 million years ago (MYA), after divergence of the diploid progenitors about 6.7 MYA.

7.
Summary. The rate of mitochondrial DNA evolution and the speciation pattern in relation to glacial periods are tested in the European taxa of the eusocial genus Reticulitermes. The linearized tree obtained from cytochrome oxidase II sequences and a geological event calibration shows a substitution rate 100-fold higher than that usually applied for insect mitochondrial DNA. An accelerated rate of evolution has also been observed in social Vespidae (Hymenoptera); we therefore suggest the involvement of eusociality in mediating gene pool drift. The role of the last ice age in the speciation pattern of Reticulitermes taxa is supported by molecular data, but a four-refugia model better explains genetic diversity, phyletic relationships and present-day distribution of these termites. Received 30 March 2004; revised 29 July and 15 November 2004; accepted 1 December 2004.

8.
Both the overall rate of nucleotide substitution and the relative proportions of synonymous and non-synonymous substitutions are predicted to vary between species that differ in effective population size (Ne). Our understanding of the genetic processes underlying these lineage-specific differences in molecular evolution is still developing. Empirical analyses indicate that variation in substitution rates and patterns caused by differences in Ne is often substantial, however, and must be accounted for in analyses of molecular evolution.

9.
For over half a century, it has been known that the rate of morphological evolution appears to vary with the time frame of measurement. Rates of microevolutionary change, measured between successive generations, were found to be far higher than rates of macroevolutionary change inferred from the fossil record. More recently, it has been suggested that rates of molecular evolution are also time dependent, with the estimated rate depending on the timescale of measurement. This followed surprising observations that estimates of mutation rates, obtained in studies of pedigrees and laboratory mutation-accumulation lines, exceeded long-term substitution rates by an order of magnitude or more. Although a range of studies have provided evidence for such a pattern, the hypothesis remains relatively contentious. Furthermore, there is ongoing discussion about the factors that can cause molecular rate estimates to be dependent on time. Here we present an overview of our current understanding of time-dependent rates. We provide a summary of the evidence for time-dependent rates in animals, bacteria and viruses. We review the various biological and methodological factors that can cause rates to be time dependent, including the effects of natural selection, calibration errors, model misspecification and other artefacts. We also describe the challenges in calibrating estimates of molecular rates, particularly on the intermediate timescales that are critical for an accurate characterization of time-dependent rates. This has important consequences for the use of molecular-clock methods to estimate timescales of recent evolutionary events.

10.
DNA sequences evolve at different rates in different species. This rate variation has been most closely examined in mammals, revealing a large number of characteristics that can shape the rate of molecular evolution. Many of these traits are part of the mammalian life-history continuum: species with small body size, rapid generation turnover, high fecundity and short lifespans tend to have faster rates of molecular evolution. In addition, rate of molecular evolution in mammals might be influenced by behaviour (such as mating system), ecological factors (such as range restriction) and evolutionary history (such as diversification rate). I discuss the evidence for these patterns of rate variation, and the possible explanations of these correlations. I also consider the impact of these systematic patterns of rate variation on the reliability of the molecular date estimates that have been used to suggest a Cretaceous radiation of modern mammals, before the final extinction of the dinosaurs.

11.
A note on 'Testing the number of components in a normal mixture'   (Total citations: 1; self-citations: 0; citations by others: 1)
Jeffries, Neal O. Biometrika, 2003, 90(4): 991-994

12.
In survivorship modelling using the proportional hazards model of Cox (1972, Journal of the Royal Statistical Society, Series B, 34, 187–220), it is often desired to test a subset of the vector of unknown regression parameters β in the expression for the hazard rate at time t. The likelihood ratio test statistic is well behaved in most situations but may be expensive to calculate. The Wald (1943, Transactions of the American Mathematical Society 54, 426–482) test statistic is easier to calculate, but has some drawbacks. In testing a single parameter in a binomial logit model, Hauck and Donner (1977, Journal of the American Statistical Association 72, 851–853) show that the Wald statistic decreases to zero the further the parameter estimate is from the null and that the asymptotic power of the test decreases to the significance level. The Wald statistic is extensively used in statistical software packages for survivorship modelling and it is therefore important to understand its behavior. The present work examines empirically the behavior of the Wald statistic under various departures from the null hypothesis and under the presence of Type I censoring and covariates in the model. It is shown via examples that the Wald statistic's behavior is not as aberrant as found for the logistic model. For the single parameter case, the asymptotic non-null distribution of the Wald statistic is examined.

13.
    
The mass-specific metabolic rate hypothesis of Gillooly and others predicts that DNA mutation and substitution rates are a function of body mass and temperature. We tested this hypothesis with sequence divergences estimated from mtDNA cytochrome b sequences of 54 taxa of cyprinid fish. Branch lengths estimated from a likelihood tree were compared with metabolic rates calculated from body mass and environmental temperatures experienced by those taxa. The problem of unknown age estimates of lineage splitting was avoided by comparing estimated amounts of metabolic activity along phyletic lines leading to pairs of modern taxa from their most recent common ancestor with sequence divergences along those same pairs of phyletic lines. There were significantly more pairs for which the phyletic line with greater genetic change also had the higher metabolic activity, when compared to the prediction of a hypothesis that body mass and temperature are not related to substitution rate.

14.
A number of studies indicated that lineages of animals with high rates of mitochondrial (mt) gene rearrangement might have high rates of mt nucleotide substitution. We chose the hemipteroid assemblage and the Insecta to test the idea that rates of mt gene rearrangement and mt nucleotide substitution are correlated. For this purpose, we sequenced the mt genome of a lepidopsocid from the Psocoptera, the only order of hemipteroid insects for which an entire mtDNA sequence is not available. The mt genome of this lepidopsocid is circular, 16,924 bp long, and contains 37 genes and a putative control region; seven tRNA genes and a protein-coding gene in this genome have changed positions relative to the ancestral arrangement of mt genes of insects. We then compared the relative rates of nucleotide substitution among species from each of the four orders of hemipteroid insects and among the 20 insects whose mt genomes have been sequenced entirely. All comparisons among the hemipteroid insects showed that species with higher rates of gene rearrangement also had significantly higher rates of nucleotide substitution than did species with lower rates of gene rearrangement. In comparisons among the 20 insects, where the mt genomes of the two species differed by more than five breakpoints, the more rearranged species always had a significantly higher rate of nucleotide substitution than the less rearranged species. However, in comparisons where the mt genomes of two species differed by five or fewer breakpoints, the more rearranged species did not always have a significantly higher rate of nucleotide substitution than the less rearranged species. We tested the statistical significance of the correlation between the rates of mt gene rearrangement and mt nucleotide substitution with nine pairs of insects that were phylogenetically independent from one another. We found that the correlation was positive and statistically significant (R² = 0.73, P = 0.01; Rs = 0.67, P < 0.05). We propose that increased rates of nucleotide substitution may lead to increased rates of gene rearrangement in the mt genomes of insects.

15.
    
A number of statistical tests have been proposed to detect positive Darwinian selection affecting a few amino acid sites in a protein, exemplified by an excess of nonsynonymous nucleotide substitutions. These tests are often more powerful than pairwise sequence comparison, which averages synonymous (d(S)) and nonsynonymous (d(N)) rates over the whole gene. In a recent study, however, Hughes AL and Friedman R (2005. Variation in the pattern of synonymous and nonsynonymous difference between two fungal genomes. Mol Biol Evol. 22: 1320-1324) argue that d(S) and d(N) are expected to fluctuate along the sequence by chance and that an excess of nonsynonymous differences in individual codons is no evidence for positive selection. The authors compared codons in protein-coding genes from the genomes of 2 yeast species, Saccharomyces cerevisiae and Saccharomyces paradoxus. They calculated the proportions of synonymous and nonsynonymous differences per site (p(S) and p(N)) in every codon and discovered that p(N) is often greater than p(S) and that among some codons p(S) and p(N) are negatively correlated. The authors argued that these results invalidate previous tests of codons under positive selection. Here I discuss several errors of statistics in the analysis of Hughes and Friedman, including confusion of statistics with parameters, arbitrary data filtering, and derivation of hypotheses from data. I also apply likelihood ratio tests of positive selection to the yeast data and illustrate empirically that Hughes and Friedman's criticisms of such tests are not valid.

16.
Chenuil A, Anne C. Genetica, 2006, 127(1-3): 101-120
The use of molecular genetic markers (MGMs) has become widespread among evolutionary biologists, and the methods of analysis of genetic data improve rapidly, yet an organized framework in which scientists can work is lacking. Elements of molecular evolution are summarized to explain the origin of variation at the DNA level, its measures, and the relationships linking genetic variability to the biological parameters of the studied organisms. MGMs are defined by two components: the DNA region(s) screened, and the technique used to reveal its variation. Criteria of choice belong to three categories: (1) the level of variability, (2) the nature of the information (e.g. dominance vs. codominance, ploidy, ... ) which must be determined according to the biological question and (3) some practical criteria which mainly depend on the equipment of the laboratory and experience of the scientist. A three-step procedure is proposed for drawing up MGMs suitable to answer given biological questions, and compiled data are organized to guide the choice at each step: (1) choice, determined by the biological question, of the level of variability and of the criteria of the nature of information, (2) choice of the DNA region and (3) choice of the technique.

17.
The distinctive gymnosperm genus Ephedra is sometimes considered to have originated over 200 million years (Myr) ago on the basis of "ephedroid" fossil pollen. In this article we estimate the age of extant Ephedra using chloroplast rbcL gene sequences. Relative rate tests fail to reject the null hypothesis of equal rates of nucleotide substitution of the rbcL sequences among three landmark lineages (Gnetales, Pinaceae, and Ginkgo). The most divergent sequences we have found in Ephedra differ by only 7 bp for a 1,110 bp region of rbcL sequence, whereas the differences among genera range from 92 to 107 bp in Gnetales and from 35 to 92 bp in Pinaceae. Using three landmark events, the age of extant Ephedra is estimated to be approximately 8-32 Myr. Our result is consistent with the current distribution of many Ephedra species in geologically recent habitats and points out difficulties in the identification of older ephedroid pollen fossils with the modern genus Ephedra.
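The arithmetic behind a clock-based age estimate of this kind can be sketched as follows. The abstract reports 7 differences over 1,110 bp but does not state which distance correction or calibration rates were used, so the Jukes-Cantor correction, the function names, and the rate passed to `divergence_time` are assumptions for illustration only and are not expected to reproduce the published 8-32 Myr range.

```python
import math


def p_distance(differences, sites):
    """Proportion of sites that differ between two aligned sequences."""
    return differences / sites


def jc69_distance(p):
    """Jukes-Cantor (1969) corrected distance, in substitutions per site."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_time(distance, rate_per_site_per_year):
    """Strict-clock age: both lineages accumulate changes, hence the factor 2."""
    return distance / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    d = jc69_distance(p_distance(7, 1110))
    print(f"corrected distance = {d:.5f} substitutions per site")
    # HYPOTHETICAL_RATE is a placeholder, not a value taken from the paper.
    HYPOTHETICAL_RATE = 1e-10
    print(f"age under that rate = {divergence_time(d, HYPOTHETICAL_RATE):.3g} years")
```

At such low divergence the correction barely changes the raw proportion of differences; it matters more for the intergeneric comparisons (92 to 107 bp), where multiple substitutions at the same site become more likely.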

18.
    
Gill PS. Biometrics, 2004, 60(2): 525-527
We propose a likelihood-based test for comparing the means of two or more log-normal distributions, with possibly unequal variances. A modification to the likelihood ratio test is needed when sample sizes are small. The performance of the proposed procedures is compared with the F-ratio test using Monte Carlo simulations.

19.
Distance-based phylogenetic methods are widely used in biomedical research. However, distance-based dating of speciation events and the test of the molecular clock hypothesis are relatively underdeveloped. Here I develop an approximate test of the molecular clock hypothesis for distance-based trees, as well as information-theoretic indices that have been used frequently in model selection, for use with distance matrices. The results are in good agreement with the conventional sequence-based likelihood ratio test. Among the information-theoretic indices, AICu is the most consistent with the sequence-based likelihood ratio test. The confidence in model selection by the indices can be evaluated by bootstrapping. I illustrate the usage of the indices and the approximate significance test with both empirical and simulated sequences. The tests show that distance matrices from protein gel electrophoresis and from genome rearrangement events do not violate the molecular clock hypothesis, and that the evolution of the third codon position conforms to the molecular clock hypothesis better than the second codon position in vertebrate mitochondrial genes. I outline evolutionary distances that are appropriate for phylogenetic reconstruction and dating.

20.
    
When analyzing mortality data due to rare diseases in small areas, it is common to find several health zones with no mortality cases. In these circumstances, the classical homogeneous model based on the Poisson distribution used to estimate the relative risks within each area may encounter lack of fit due to a disproportionately large frequency of zeros. To cope with these zeros, the zero inflated Poisson model can be used. In this paper, we propose a test for detecting zero inflation in the context of disease mapping which is based on bootstrap techniques. The test is illustrated using male mortality data due to brain cancer in Navarra, Spain. In addition, comparisons with other tests for Poisson zero inflation such as the score test and the likelihood ratio test are carried out in terms of empirical power and size using the brain cancer scenario. The proposed bootstrap test has good power and size and works well when detecting the excess of zeros in small area data sets. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
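A bootstrap check for excess zeros of the general kind described here might look like the sketch below. It is not the authors' procedure: the test statistic (the raw number of zero-count areas), the homogeneous Poisson null with one common relative risk, the function names, and the toy data are all assumptions chosen to keep the example short.

```python
import math
import random


def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def zero_inflation_pvalue(observed, expected, n_boot=999, seed=0):
    """Bootstrap p-value for 'more zeros than a homogeneous Poisson model predicts'.

    observed: death counts per area; expected: expected counts per area.
    """
    rng = random.Random(seed)
    risk = sum(observed) / sum(expected)  # common relative risk under the null
    zeros_obs = sum(1 for o in observed if o == 0)
    exceed = 0
    for _ in range(n_boot):
        sim = [poisson_draw(rng, risk * e) for e in expected]
        if sum(1 for s in sim if s == 0) >= zeros_obs:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    # Toy data: 30 areas with no cases and 10 areas with 8 cases each.
    counts = [0] * 30 + [8] * 10
    expected_counts = [1.0] * 40
    print(zero_inflation_pvalue(counts, expected_counts))
```

A small p-value suggests more zero-count areas than the homogeneous Poisson model can explain, which is the situation where a zero-inflated Poisson model becomes worth fitting.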
