Found 20 similar articles (search time: 15 ms)
1.
DNA and protein sequence comparisons are performed by a number of computational algorithms. Most of these algorithms search for the alignment of two sequences that optimizes some alignment score. It is an important problem to assess the statistical significance of a given score. In this paper we use newly developed methods for Poisson approximation to derive estimates of the statistical significance of k-word matches on a diagonal of a sequence comparison. We require at least q of the k letters of the words to match, where 0 < q ≤ k. The distribution of the number of matches on a diagonal is approximated, as well as the distribution of the order statistics of the sizes of clumps of matches on the diagonal. These methods provide an easily computed approximation of the distribution of the longest exact matching word between sequences. The methods are validated using comparisons of vertebrate and E. coli protein sequences. In addition, we compare two HLA class II transplantation antigens by this method and contrast the results with a dynamic programming approach. Several open problems are outlined in the last section. This work was supported by grants DMS 90-05833 from NSF and GM 36230 from NIH.
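As a rough illustration of the Poisson approach described in the abstract above, the sketch below approximates the probability that the longest run of exact matches on a single diagonal reaches length k. It covers only the exact-match case (q = k) and assumes that positions match independently with a fixed probability p. The function names are hypothetical, and the declumped mean lambda = ((n - k)(1 - p) + 1) p^k is the standard textbook form of this approximation, not necessarily the estimate derived in the paper. An exact dynamic-programming value is included for comparison.

```python
import math


def longest_run_pvalue(n, k, p):
    """Poisson approximation to P(longest run of matches >= k).

    n: length of the diagonal, k: word length, p: per-position match
    probability (assumed i.i.d., which is an assumption of this sketch).
    """
    if k > n:
        return 0.0
    # Expected number of "clumps": runs of k matches that start at position 1
    # or are immediately preceded by a mismatch.
    lam = ((n - k) * (1.0 - p) + 1.0) * p ** k
    return 1.0 - math.exp(-lam)


def longest_run_exact(n, k, p):
    """Exact P(longest run of matches >= k) under the same i.i.d. model."""
    # state[j] = probability that the current trailing run has length j
    # and no run of length k has occurred so far.
    state = [0.0] * k
    state[0] = 1.0
    for _ in range(n):
        nxt = [0.0] * k
        nxt[0] = sum(state) * (1.0 - p)   # a mismatch resets the run
        for j in range(k - 1):
            nxt[j + 1] = state[j] * p     # a match extends the run
        # mass leaving state[k-1] on a match is absorbed (run of length k seen)
        state = nxt
    return 1.0 - sum(state)


if __name__ == "__main__":
    n, k, p = 500, 10, 0.25   # illustrative values only
    print(longest_run_pvalue(n, k, p), longest_run_exact(n, k, p))
```

For small lambda the two values agree closely, which is the regime in which a Poisson approximation of this kind is normally used to attach a significance level to a long match.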
2.
Simon Y. W. Ho, K. Jun Tong, Charles S. P. Foster, Andrew M. Ritchie, Nathan Lo, Michael D. Crisp. Biology Letters, 2015, 11(9)
Molecular estimates of evolutionary timescales have an important role in a range of biological studies. Such estimates can be made using methods based on molecular clocks, including models that are able to account for rate variation across lineages. All clock models share a dependence on calibrations, which enable estimates to be given in absolute time units. There are many available methods for incorporating fossil calibrations, but geological and climatic data can also provide useful calibrations for molecular clocks. However, a number of strong assumptions need to be made when using these biogeographic calibrations, leading to wide variation in their reliability and precision. In this review, we describe the nature of biogeographic calibrations and the assumptions that they involve. We present an overview of the different geological and climatic events that can provide informative calibrations, and explain how such temporal information can be incorporated into dating analyses.
3.
Summary: We present the ideas, and their motivation, at the basis of a simple model of nucleic acid evolution: the stationary Markov process, or Markov clock. After a brief review of its relevant mathematical properties, the Markov clock is applied to nucleotide sequences from mitochondrial and nuclear genes of different species. Particular emphasis is given to the necessity of carrying out a correct statistical analysis, which allows us to check quantitatively the applicability of our model. We find evidence that the Markov clock ticks in many different processes, and that its limitations can be understood in terms of a simple idea that we call the base-drift hypothesis. This hypothesis correlates the deviations from the stationarity of the Markov process to the evolutionary distance d_AB(P) of two species A and B, relative to the process P. We conclude by discussing the implications of our findings for future work.
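To make the notion of stationarity in the abstract above concrete, the following minimal sketch computes the stationary distribution of a discrete-time Markov substitution matrix by power iteration. The function name and the example matrices are hypothetical and are not taken from the paper; the sketch only illustrates what it means for base frequencies to be left unchanged by the process.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Return pi with pi = pi * P for a row-stochastic matrix P.

    P is a list of rows, each summing to 1. Power iteration is used, so the
    chain is assumed to be irreducible and aperiodic.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


if __name__ == "__main__":
    # Hypothetical symmetric substitution matrix over (A, C, G, T):
    # every base changes to each other base with probability 0.01 per step.
    jc_like = [[0.97 if i == j else 0.01 for j in range(4)] for i in range(4)]
    print(stationary_distribution(jc_like))
```

Under such a model, a pair of sequences whose observed base compositions both sit near the stationary distribution is consistent with stationarity; large compositional differences would suggest a departure from it.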
4.
5.
Cutler DJ. Genetics, 2000, 154(3):1403-1417
Rates of molecular evolution at some protein-encoding loci are more irregular than expected under a simple neutral model of molecular evolution. This pattern of excessive irregularity in protein substitutions is often called the overdispersed molecular clock and is characterized by an index of dispersion, R(T) > 1. Assuming an infinite-sites, no-recombination model of the gene, R(T) is given for a general stationary model of molecular evolution. R(T) is shown to be affected by only three things: fluctuations that occur on a very slow time scale, advantageous or deleterious mutations, and interactions between mutations. In the absence of interactions, advantageous mutations are shown to lower R(T); deleterious mutations are shown to raise it. Previously described models for the overdispersed molecular clock are analyzed in terms of this work, as are a few very simple new models. A model of deleterious mutations is shown to be sufficient to explain the observed values of R(T). Our current best estimates of R(T) suggest that either most mutations are deleterious or some key population parameter changes on a very slow time scale. No other interpretations seem plausible. Finally, a comment is made on how R(T) might be used to distinguish selective sweeps from background selection.
6.
Recombination and the molecular clock (total citations: 7; self-citations: 0; citations by others: 7)
7.
On the molecular evolutionary clock (total citations: 1; self-citations: 0; citations by others: 1)
Emile Zuckerkandl. Journal of Molecular Evolution, 1987, 26(1-2):34-46
Summary: The conceptual framework surrounding the origin of the molecular evolutionary clock and circumstances of this origin are described. In regard to the quest for the best available molecular clocks, a return to protein clocks is conditionally recommended. On the basis of recent data and certain considerations, it is pointed out that the realm of neutrality in evolution is probably less extensive than is now commonly thought, in the three distinct senses of the term neutrality: neutrality as nonfunctionality of mutations, neutrality as equifunctionality of mutations, and neutrality as a mode of fixation of mutations. The possibility is raised that complex sets of interacting components forming a system that is bounded with respect to its environment may quite generally display an intrinsic trend to a quasi-clockwise evolutionary behavior.
8.
Calibrating the avian molecular clock (total citations: 6; self-citations: 0; citations by others: 6)
Molecular clocks are widely used to date phylogenetic events, yet evidence supporting the rate constancy of molecular clocks through time and across taxonomic lineages is weak. Here, we present 90 candidate avian clock calibrations obtained from fossils and biogeographical events. Cross-validation techniques were used to identify and discard 16 inconsistent calibration points. Molecular evolution occurred in an approximately clock-like manner through time for the remaining 74 calibrations of the mitochondrial gene, cytochrome b. A molecular rate of approximately 2.1% (± 0.1%, 95% confidence interval) was maintained over a 12-million-year interval and across most of 12 taxonomic orders. Minor but significant variance in rates occurred across lineages but was not explained by differences in generation time, body size or latitudinal distribution as previously suggested.
9.
A general comparison of relaxed molecular clock models (total citations: 4; self-citations: 0; citations by others: 4)
Several models have been proposed to relax the molecular clock in order to estimate divergence times. However, it is unclear which model has the best fit to real data and should therefore be used to perform molecular dating. In particular, we do not know whether rate autocorrelation should be considered or which prior on divergence times should be used. In this work, we propose a general benchmark of alternative relaxed clock models. We have reimplemented most of the already existing models, including the popular lognormal model, as well as various prior choices for divergence times (birth-death, Dirichlet, uniform), in a common Bayesian statistical framework. We also propose a new autocorrelated model, called the "CIR" process, with well-defined stationary properties. We assess the relative fitness of these models and priors, when applied to 3 different protein data sets from eukaryotes, vertebrates, and mammals, by computing Bayes factors using a numerical method called thermodynamic integration. We find that the 2 autocorrelated models, CIR and lognormal, have a similar fit and clearly outperform uncorrelated models on all 3 data sets. In contrast, the optimal choice for the divergence time prior is more dependent on the data investigated. Altogether, our results provide useful guidelines for model choice in the field of molecular dating while opening the way to more extensive model comparisons.
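For readers unfamiliar with the CIR process mentioned in the abstract above, the sketch below simulates a Cox-Ingersoll-Ross path, dr = theta (mu - r) dt + sigma sqrt(r) dW, with a simple Euler scheme. The parameter names (theta, mu, sigma), the step size, and the clipping at zero are assumptions of this illustration; the paper's own parametrization and implementation along a phylogeny may differ.

```python
import math
import random


def simulate_cir(r0, theta, mu, sigma, dt, steps, seed=0):
    """Simulate an autocorrelated, non-negative rate path with an Euler scheme.

    theta: strength of mean reversion, mu: long-run mean rate,
    sigma: volatility. Values are clipped at zero after each step,
    which is a common but crude way to keep the discretized path valid.
    """
    rng = random.Random(seed)
    r = r0
    path = [r]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)
        drift = theta * (mu - r) * dt
        diffusion = sigma * math.sqrt(r * dt) * z
        r = max(0.0, r + drift + diffusion)
        path.append(r)
    return path


if __name__ == "__main__":
    # Illustrative values only; the path fluctuates around mu.
    rates = simulate_cir(r0=0.5, theta=1.0, mu=1.0, sigma=0.3, dt=0.01, steps=1000)
    print(min(rates), max(rates), rates[-1])
```

The mean-reverting drift is what gives the process a well-defined stationary behaviour: rates on nearby parts of a path are correlated, but they do not wander off indefinitely.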
10.
11.
Summary: A few years ago we presented a stationary Markov model of gene evolution according to which only homologous genes from not too divergent species obeying the condition of being stationary may behave as reliable molecular clocks. A compartmentalized model of the nuclear genome in which the genes are distributed in compartments, the isochores, defined by their G+C content has been proposed recently. We have found that only homologous gene pairs that are stationary, and belong to the same isochore, can be used consistently for the determination of phylogeny and base substitution rate. In particular, for the rodent-human couple, only about half of the homologous gene pairs are stationary. Stationary genes evolve at the third silent codon position with the same velocity independent of the genes and base composition. By contrast, nonstationary genes display apparent rate values (pseudovelocities) that are significantly higher. Our results cast doubt upon recent claims of a large acceleration in the rate of molecular evolution in rodents.
12.
The modern molecular clock (total citations: 1; self-citations: 0; citations by others: 1)
The discovery of the molecular clock (a relatively constant rate of molecular evolution) provided an insight into the mechanisms of molecular evolution, and created one of the most useful new tools in biology. The unexpected constancy of rate was explained by assuming that most changes to genes are effectively neutral. Theory predicts several sources of variation in the rate of molecular evolution. However, even an approximate clock allows time estimates of events in evolutionary history, which provides a method for testing a wide range of biological hypotheses ranging from the origins of the animal kingdom to the emergence of new viral epidemics.
13.
Natural selection and the molecular clock (total citations: 12; self-citations: 1; citations by others: 12)
14.
DNA turnover and the molecular clock (total citations: 7; self-citations: 0; citations by others: 7)
Gabriel A. Dover. Journal of Molecular Evolution, 1987, 26(1-2):47-58
Summary: Many detailed studies on the mechanisms by which different components of eukaryotic nuclear genomes have diverged reveal that the majority of sequences are seemingly not passively accumulating base substitutions in a clocklike manner solely determined by laws of diffusion at the population level. It appears that variation in the rates, units, biases, and gradients of several DNA turnover mechanisms is contributing to the course of DNA divergence. Turnover mechanisms have the potential to retard, maintain, or accelerate the rate of DNA differentiation between populations. Furthermore, examples are known of coding and noncoding DNA subject to the simultaneous operation of several turnover mechanisms leading to complex patterns of fine-scale restructuring and divergence, generally uninterpretable using selection and/or neutral drift arguments in isolation. Constancy in the rate of divergence, where observed over defined periods of time, could be a reflection of constancy in the rates and units of turnover. However, a consideration of the generally large disparity between rates of turnover and mutation reveals that DNA clocks, which would be independently driven by turnover in separate genomic components, would tend to be episodic. The utility of any given DNA sequence for measuring time and species relationships, like individual proteins, is proportional to the extent to which all contributing forces to the evolution of the sequence, internal and external, are understood.
15.
Current understanding of the diversification of birds is hindered by their incomplete fossil record and uncertainty in phylogenetic relationships and phylogenetic rates of molecular evolution. Here we performed the first comprehensive analysis of mitogenomic data of 48 vertebrates, including 35 birds, to derive a Bayesian timescale for avian evolution and to estimate rates of DNA evolution. Our approach used multiple fossil time constraints scattered throughout the phylogenetic tree and accounts for uncertainties in time constraints, branch lengths, and heterogeneity of rates of DNA evolution. We estimated that the major vertebrate lineages originated in the Permian; the 95% credible intervals of our estimated ages of the origin of archosaurs (258 MYA), the amniote-amphibian split (356 MYA), and the archosaur-lizard divergence (278 MYA) bracket estimates from the fossil record. The origin of modern orders of birds was estimated to have occurred throughout the Cretaceous beginning about 139 MYA, arguing against a cataclysmic extinction of lineages at the Cretaceous/Tertiary boundary. We identified fossils that are useful as time constraints within vertebrates. Our timescale reveals that rates of molecular evolution vary across genes and among taxa through time, thereby refuting the widely used mitogenomic or cytochrome b molecular clock in birds. Moreover, the 5-Myr divergence time assumed between 2 genera of geese (Branta and Anser) to originally calibrate the standard mitochondrial clock rate of 0.01 substitutions per site per lineage per Myr (s/s/l/Myr) in birds was shown to be underestimated by about 9.5 Myr. Phylogenetic rates in birds vary between 0.0009 and 0.012 s/s/l/Myr, indicating that many phylogenetic splits among avian taxa also have been underestimated and need to be revised. 
We found no support for the hypothesis that the molecular clock in birds "ticks" according to a constant rate of substitution per unit of mass-specific metabolic energy rather than per unit of time, as recently suggested. Our analysis advances knowledge of rates of DNA evolution across birds and other vertebrates and will, therefore, aid comparative biology studies that seek to infer the origin and timing of major adaptive shifts in vertebrates.
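The sensitivity of divergence dates to the assumed clock rate, which the abstract above emphasizes, can be seen with a small strict-clock calculation. The sketch assumes that the pairwise distance d accumulates along both lineages, so t = d / (2r) with r in substitutions per site per lineage per Myr. The function name and the distance of 0.10 are hypothetical; only the three rates come from the abstract.

```python
def divergence_time_myr(pairwise_distance, rate_per_lineage):
    """Strict-clock divergence time in Myr.

    pairwise_distance: substitutions per site between two taxa.
    rate_per_lineage: substitutions per site per lineage per Myr.
    The factor 2 accounts for change accumulating on both lineages.
    """
    if rate_per_lineage <= 0:
        raise ValueError("rate_per_lineage must be positive")
    return pairwise_distance / (2.0 * rate_per_lineage)


if __name__ == "__main__":
    d = 0.10  # hypothetical pairwise distance, for illustration only
    # 0.01 is the standard rate; 0.0009 and 0.012 are the reported extremes.
    for rate in (0.01, 0.0009, 0.012):
        print(rate, divergence_time_myr(d, rate))
```

With the standard rate of 0.01 s/s/l/Myr, the hypothetical distance of 0.10 corresponds to 5 Myr; at the slowest reported rate the same distance implies a split more than ten times older, which is why a single fixed rate can badly underestimate divergence times.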
16.
There is evidence to suggest that eukaryotic genomes are subject to frequent insertions and deletions of non-coding DNA. This may lead to a gradual increase or decrease in genome size, or to a dynamic equilibrium in which the overall size remains constant. We argue, however, that there is a bias favouring an accumulation of non-coding DNA in the proximity of genes. Such bias causes a progressive change in genome structure regardless of whether the overall genome size increases, decreases or remains constant. We show that this change may serve as a 'molecular clock', supplementing that provided by nucleotide substitution rates.
17.
Statistical models of the overdispersed molecular clock (total citations: 2; self-citations: 0; citations by others: 2)
Naoyuki Takahata. Theoretical Population Biology, 1991, 39(3):329-344
The most commonly used statistical model to describe the rate constancy of molecular evolution (molecular clock) is a simple Poisson process in which the variance of the number of amino acid or nucleotide substitutions in a particular gene should be equal to the mean, and hence the dispersion index, the ratio of the variance to the mean, should be equal to one. Recent sequence data, however, have shown that the substitutional process in molecular evolution is often considerably overdispersed and have called into question the generality of using a simple Poisson process. Several efforts have been made to develop more realistic models of molecular evolution. In this paper, I will show that the spatial (site-specific) variation in the rate of molecular evolution is an improbable cause of the overdispersion and then review various statistical models which take the temporal variation into account. Although these models do not immediately specify what the mechanisms of molecular evolution might be, they do make qualitatively different predictions and give some insight into their inference. One way to distinguish them is suggested. In addition, effects of selected substitutions that presumably occur after a major change in a molecule are quasi-quantitatively examined. It is most likely that the overdispersion of the molecular clock is due either to a major molecular reconfiguration (fluctuating neutral space) led by a series of subliminal neutral changes or to selected substitutions fine-tuning a molecule after a major molecular change. Although the latter possibility, of course, violates the simplest neutrality assumption, it would not impair the neutral theory as a whole.
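The dispersion index defined in the abstract above (variance divided by mean) is straightforward to compute from substitution counts. The sketch below is a naive estimator that treats the counts as independent observations over equal time spans; the function name and the example counts are hypothetical, and real analyses typically correct for shared ancestry and unequal branch lengths.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Ratio of the sample variance to the sample mean of substitution counts.

    A value near 1 is what a simple Poisson clock predicts; values well
    above 1 indicate overdispersion.
    """
    if len(counts) < 2:
        raise ValueError("need at least two counts")
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; index is undefined")
    return variance(counts) / m


if __name__ == "__main__":
    # Hypothetical substitution counts for one gene along several lineages.
    print(dispersion_index([12, 30, 7, 25, 9, 41]))
```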
18.
19.
20.
MOTIVATION: The high pace of viral sequence change means that variation in the times at which sequences are sampled can have a profound effect both on the ability to detect trends over time in evolutionary rates and on the power to reject the Molecular Clock Hypothesis (MCH). Trends in viral evolutionary rates are of particular interest because their detection may allow connections to be established between a patient's treatment or condition and the process of evolution. Variation in sequence isolation times also impacts the uncertainty associated with estimates of divergence times and evolutionary rates. Variation in isolation times can be intentionally adjusted to increase the power of hypothesis tests and to reduce the uncertainty of evolutionary parameter estimates, but this fact has received little previous attention. RESULTS: We provide approximations for the power to reject the MCH when the alternative is that rates change in a linear fashion over time and when the alternative is that rates differ randomly among branches. In addition, we approximate the standard deviation of estimated evolutionary rates and divergence times. We illustrate how these approximations can be exploited to determine which viral sample to sequence when samples representing different dates are available.