Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Lineage effects and the index of dispersion of molecular evolution   Cited by: 9 (self-citations: 4, citations by others: 5)
Recent efforts to estimate the index of dispersion [R(t)] of molecular evolution (i.e., the ratio of the variance in the number of substitutions on a lineage to the mean number) have suffered from an inability to adjust the data for lineage effects. These effects may include the generation-time dependency of the rate of evolution or improper assumptions about the branching pattern of a phylogenetic tree. In the present paper a method for correcting for lineage effects in the estimation of R(t) is presented for trees made up of three species. The recent data published by Li et al. for 20 loci in three orders of mammals are examined, and the average R(t), corrected for lineage effects, is shown to be 7.75 for replacement substitutions and 3.3 for silent substitutions. Thus the high values reported earlier may not be dismissed as due to generation-time effects or improper assumptions about phylogenies. Computer simulations are presented to give confidence in the estimate for replacement substitutions but also to demonstrate that the estimate for silent substitutions is sensitive to corrections for multiple substitutions and is not as reliable. This work's implications for our understanding of the mechanism of molecular evolution are discussed, and the arguments in favor of the hypothesis that replacement substitutions are mostly selected while silent substitutions are mostly neutral are presented.

2.
The fractal doubly stochastic Poisson process (FDSPP) model of molecular evolution, like other doubly stochastic Poisson models, agrees with the high estimates for the index of dispersion found from sequence comparisons. Unlike certain previous models, the FDSPP also predicts a positive geometric correlation between the index of dispersion and the mean number of substitutions. Such a relationship is statistically proven herein using comparisons between 49 mammalian genes. There is no characteristic rate associated with molecular evolution according to this model, but there is a scaling relationship in rates according to a fractal dimension of evolution. The FDSPP is a suitable replacement for the homogeneous Poisson process in tests of the lineage dependence of rates and in estimating confidence intervals for divergence times. As opposed to other fractal models, this model can be interpreted in terms of Darwinian selection and drift.
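As a generic illustration of why doubly stochastic Poisson models yield an index of dispersion above one, the following derivation may help. It is a standard textbook argument, not the FDSPP derivation itself; the symbols N (number of substitutions on a lineage) and Λ (the random integrated substitution rate) are notation assumed here.

```latex
% Assumption: conditional on the integrated rate \Lambda, N is Poisson with mean \Lambda.
\begin{align*}
  \mathbb{E}[N] &= \mathbb{E}\bigl[\mathbb{E}[N \mid \Lambda]\bigr] = \mathbb{E}[\Lambda], \\
  \operatorname{Var}(N) &= \mathbb{E}\bigl[\operatorname{Var}(N \mid \Lambda)\bigr]
      + \operatorname{Var}\bigl(\mathbb{E}[N \mid \Lambda]\bigr)
      = \mathbb{E}[\Lambda] + \operatorname{Var}(\Lambda), \\
  R &= \frac{\operatorname{Var}(N)}{\mathbb{E}[N]}
      = 1 + \frac{\operatorname{Var}(\Lambda)}{\mathbb{E}[\Lambda]} \;\ge\; 1 .
\end{align*}
```

Under this reading, any variation in the underlying rate pushes R above the Poisson value of one; how R scales with the mean depends on the specific model for Λ.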

3.
A fractal renewal point process (FRPP) is used to model molecular evolution in agreement with the relationship between the variance and the mean numbers of nonsynonymous and synonymous substitutions in mammals. Like other episodic models such as the doubly stochastic Poisson process, this model accounts for the large variances observed in amino acid substitution rates, but unlike certain other episodic models, it also accounts for the increase in the index of dispersion with the mean number of substitutions in Ohta's (1995) data. We find that this correlation is significant for nonsynonymous substitutions at the 1% level and for synonymous substitutions at the 10% level, even after removing lineage effects and when using Bulmer's (1989) unbiased estimator of the index of dispersion. This model is simpler than most other overdispersed models of evolution in the sense that it is fully specified by a single interevent probability distribution. Interpretations in terms of chaotic dynamics and in terms of chance and selection are discussed. Received: 12 January 1998 / Accepted: 19 May 1998

4.
The extent of amino acid differences of major histocompatibility complex molecules within species is unusually high, consistent with the finding that some pairs of alleles have persisted for more than ten million years and the view that the polymorphism has been maintained by natural selection. The disparity between synonymous and non-synonymous substitutions in the antigen recognition site, however, suggests that some non-synonymous sites have undergone a number of substitutions whereas others have little or none. To describe statistically such an overdispersed underlying process, commonly used Poisson processes are inadequate. An alternative process leads to the surprising conclusion that each non-synonymous site has accumulated as many as 2.6 substitutions, on the average, in the two lineages leading to humans and mice. The standard deviation is also very large (6.6) and the dispersion index (the ratio of the variance to the mean) is at least 17. The substitution process thus inferred qualitatively agrees with the disposition (a boomerang pattern) of substitutions between HLA-A2 and Aw68 alleles, and quantitatively agrees well with that expected where the evolution of major histocompatibility complex molecules has long been driven mostly by balancing selection.

5.
The simplest neutral model of molecular evolution predicts that the number of substitutions within a lineage in T generations ought to be Poisson distributed. Therefore, the variance in the number of substitutions ought to equal the mean number. The ratio of the variance to the mean number of substitutions is called the index of dispersion, R(T). Assuming an infinite-sites, no-recombination model of the gene and a haploid Moran population structure, R(T) is derived for a general stationary model of molecular evolution. R(T) is shown to be affected by fluctuations in parameters only when they occur on a very slow time scale. In order for parameter fluctuations to cause R(T) to deviate significantly from one, the time between parameter changes must be roughly as large as, or larger than, the time between substitutions.
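The definition of the index of dispersion as a variance-to-mean ratio can be sketched in a few lines. This is a minimal illustration only: the function name `index_of_dispersion` and the example counts are hypothetical, and the plain sample variance used here is an assumption. The papers listed on this page apply corrections (for lineage effects, multiple substitutions, or estimator bias) that this sketch does not attempt.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Return the sample variance divided by the sample mean.

    `counts` holds substitution counts, one per lineage (hypothetical data).
    A value near 1 is what a simple Poisson process would give on average.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("index of dispersion is undefined when the mean is zero")
    # statistics.variance uses the unbiased (n - 1) denominator.
    return variance(counts) / m


# Hypothetical substitution counts on three lineages of a star phylogeny.
example_counts = [3, 7, 2]
r = index_of_dispersion(example_counts)
```

With only three lineages the estimate is very noisy, which is one reason the abstracts above average R over many loci.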

6.
The number of nucleotide substitutions accumulated in a gene or in a lineage is an important random variable in the study of molecular evolution. Of particular interest is the ratio of the variance to the mean of that random variable, often known as the dispersion index. Because nucleotide substitution is most commonly modeled by a continuous-time four-state Markov chain, this paper provides a systematic method of computing the dispersion indices exhibited by a continuous-time four-state Markov chain. Using this method along with computer algebra and Monte Carlo simulation, this paper offers partially proven conjectures that were supported by thorough computer experiments. It is believed that the Tamura model, the equal-input model and the Takahata-Kimura model always exhibit dispersion indices less than 2. It is also believed that a general four-state model can be chosen to exhibit a dispersion index of any desired magnitude, although the chance of a randomly chosen such model exhibiting a dispersion index greater than 2 is as small as about 2%. Relevance of these findings to the neutral theory is discussed.

7.
A new method is proposed for estimating the number of synonymous and nonsynonymous nucleotide substitutions between homologous genes. In this method, a nucleotide site is classified as nondegenerate, twofold degenerate, or fourfold degenerate, depending on how often nucleotide substitutions will result in amino acid replacement; nucleotide changes are classified as either transitional or transversional, and changes between codons are assumed to occur with different probabilities, which are determined by their relative frequencies among more than 3,000 changes in mammalian genes. The method is applied to a large number of mammalian genes. The rate of nonsynonymous substitution is extremely variable among genes; it ranges from 0.004 × 10^-9 (histone H4) to 2.80 × 10^-9 (interferon gamma), with a mean of 0.88 × 10^-9 substitutions per nonsynonymous site per year. The rate of synonymous substitution is also variable among genes; the highest rate is three to four times higher than the lowest one, with a mean of 4.7 × 10^-9 substitutions per synonymous site per year. The rate of nucleotide substitution is lowest at nondegenerate sites (the average being 0.94 × 10^-9), intermediate at twofold degenerate sites (2.26 × 10^-9), and highest at fourfold degenerate sites (4.2 × 10^-9). The implications of our results for the mechanisms of DNA evolution and for the relative likelihood of codon interchanges in parsimonious phylogenetic reconstruction are discussed.

8.
The evolutionary selection forces acting on a protein are commonly inferred using evolutionary codon models by contrasting the rate of synonymous to nonsynonymous substitutions. Most widely used models are based on theoretical assumptions and ignore the empirical observation that distinct amino acids differ in their replacement rates. In this paper, we develop a general method that allows assimilation of empirical amino acid replacement probabilities into a codon-substitution matrix. In this way, the resulting codon model takes into account not only the transition-transversion bias and the nonsynonymous/synonymous ratio, but also the different amino acid replacement probabilities as specified in empirical amino acid matrices. Different empirical amino acid replacement matrices, such as secondary structure-specific matrices or organelle-specific matrices (e.g., mitochondria and chloroplasts), can be incorporated into the model, making it context dependent. Using a diverse set of coding DNA sequences, we show that the novel model better fits biological data as compared with either mechanistic or empirical codon models. Using the suggested model, we further analyze human immunodeficiency virus type 1 protease sequences obtained from drug-treated patients and reveal positive selection in sites that are known to confer drug resistance to the virus.

9.
A simple nearly neutral mutation model of protein evolution was studied using computer simulation assuming a constant population size. In this model, a gene consists of a finite number of codons and there is no recombination within a gene. Each codon has two replacement sites and one silent site. The fitness of a gene was determined multiplicatively by amino acids specified by codons (the independent multicodon model). Nucleotide diversity at replacement sites decreases as selection becomes stronger. A reduction of nucleotide diversity at silent sites also occurs as selection intensifies, but the magnitude of the reduction is not a monotone function of the intensity of selection. The dispersion index is close to one. The average values of Tajima's and Fu and Li's statistics are negative and their absolute values increase as selection intensifies. However, their power to detect selection under the present model was not high unless the number of sites is large or the mutation rate is high. The MK test was shown to detect intermediate selection fairly well. For comparison, the house-of-cards model was also investigated and its behavior was shown to be more sensitive to changes of population size than that of the independent multicodon model. The relevance of the present model for explaining protein evolution was discussed by comparing its predictions with recent DNA data. Received: 24 May 1999 / Accepted: 17 August 1999

10.
The nearly neutral theory of molecular evolution predicts larger generation-time effects for synonymous than for nonsynonymous substitutions. This prediction is tested using the sequences of 49 single-copy genes by calculating the average and variance of synonymous and nonsynonymous substitutions in mammalian star phylogenies (Rodentia, Artiodactyla, and Primates). The average pattern of the 49 genes supports the prediction of the nearly neutral theory, with some notable exceptions. The nearly neutral theory also predicts that the variance of the evolutionary rate is larger than the value predicted by the completely neutral theory. This prediction is tested by examining the dispersion index (ratio of the variance to the mean), which is positively correlated with the average substitution number. After weighting by the lineage effects, this correlation almost disappears for nonsynonymous substitutions, but not quite so for synonymous substitutions. After weighting, the dispersion indices of both synonymous and nonsynonymous substitutions still exceed values expected under the simple Poisson process. The results indicate that both the systematic bias in evolutionary rate among the lineages and the episodic type of rate variation are contributing to the large variance. The former is more significant to synonymous substitutions than to nonsynonymous substitutions. Isochore evolution may be similar to synonymous substitutions. The rate and pattern found here are consistent with the nearly neutral theory, such that the relative contributions of drift and selection differ between the two types of substitutions. The results are also consistent with Gillespie's episodic selection theory.

11.
All current phylogenetic methods assume that DNA substitutions are independent among sites. However, ample empirical evidence suggests that the process of substitution is not independent but is, in fact, temporally and spatially correlated. The robustness of several commonly used phylogenetic methods to the assumption of independent substitution is examined. A compound Poisson process is used to model DNA substitution. This model assumes that substitution events are Poisson-distributed in time and that the number of substitutions associated with each event is geometrically distributed. The asymptotic properties of phylogenetic methods do not appear to change under a compound Poisson process of DNA substitution. Moreover, the rank order of the performance of different methods does not change. However, all phylogenetic methods become less efficient when substitution follows a compound Poisson process.
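The compound Poisson process described in this abstract (Poisson-timed events, each carrying a geometrically distributed number of substitutions) can be simulated with the standard library alone. The sketch below is an illustration under stated assumptions, not the authors' simulation: the function names, the parameter values, and the choice of a geometric distribution supported on 1, 2, 3, ... are all assumptions made here. For that choice the variance-to-mean ratio of the total count works out to (2 - p) / p, so bursts of substitutions alone are enough to produce overdispersion.

```python
import math
import random
from statistics import mean, variance


def geometric(rng, p):
    """Draw a burst size from a geometric distribution on 1, 2, 3, ... (assumed support)."""
    if p >= 1.0:
        return 1
    u = rng.random()  # uniform on [0, 1)
    # Inverse-transform sampling: P(burst > m) = (1 - p) ** m.
    return int(math.log(1.0 - u) / math.log(1.0 - p)) + 1


def substitutions_on_lineage(rng, event_rate, duration, p):
    """Total substitutions on one lineage: Poisson-timed events, geometric burst per event."""
    total = 0
    elapsed = rng.expovariate(event_rate)  # waiting time to the first event
    while elapsed < duration:
        total += geometric(rng, p)
        elapsed += rng.expovariate(event_rate)
    return total


def theoretical_dispersion(p):
    """E[X^2] / E[X] for burst size X ~ Geometric(p) on 1, 2, ...; equals (2 - p) / p."""
    return (2.0 - p) / p


def simulated_dispersion(n_lineages, event_rate, duration, p, seed=0):
    """Estimate the variance-to-mean ratio from independent simulated lineages."""
    rng = random.Random(seed)
    counts = [
        substitutions_on_lineage(rng, event_rate, duration, p)
        for _ in range(n_lineages)
    ]
    return variance(counts) / mean(counts)
```

Setting p = 1 makes every event a single substitution, which recovers the ordinary Poisson process with a ratio of one.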

12.
Most previous work on the evolution of mobile DNA was limited by incomplete sequence information. Whole genome sequences allow us to overcome this limitation. I study the nucleotide diversity of prominent members of five insertion sequence families whose transposition activity is encoded by a single transposase gene. Eighteen among 376 completely sequenced bacterial genomes and plasmids carry between 3 and 20 copies of a given insertion sequence. I show that these copies generally show very low DNA divergence. Specifically, more than 68% of the transposase genes are identical within a genome. The average number of amino acid replacement substitutions at amino acid replacement sites is Ka = 0.013, that at silent sites is Ks = 0.1. This low intragenomic diversity stands in stark contrast to a much higher divergence of the same insertion sequences among distantly related genomes. Gene conversion among protein-coding genes is unlikely to account for this lack of diversity. The relation between transposition frequencies and silent substitution rates suggests that most insertion sequences in a typical genome are evolutionarily young and have been recently acquired. They may undergo periodic extinction in bacterial lineages. By implication, they are detrimental to their host in the long run. This is also suggested by the highly skewed and patchy distribution of insertion sequences among genomes. In sum, one can think of insertion sequences as slow-acting infectious diseases of cell lineages.

13.
A nomographic method is presented that estimates the number of nucleotide substitutions since the common ancestor of two nucleotide sequences with no assumption about the proportion of transition and transversion substitutions except that it is constant over time. Of two previous methods of estimating this number, that of M. Kimura (Proc. natn. Acad. Sci. U.S.A. 78, 454-458 (1981)) obtains the same result, and is thus confirmed by this work, while that of W. M. Brown, E. M. Prager, A. Wang & A. C. Wilson (J. molec. Evol. 18, 225-239 (1982)) does not get the same result. The method presented here also obtains the fraction of all substitutions that are transitions. If one has three or more homologous sequences to compare, one can test the validity of the model by examining the constancy of the estimated proportion of substitutions that are transitions across the various pairs of sequences in a simple visual way. The method is general for any pair of mutually exclusive nucleotide substitutional categories, not just transitions and transversions. Mitochondrial data provide evidence that, for this and probably other current models correcting for superimposed substitutions, one or more of the underlying assumptions is incorrect. This is because there is some unknown systematic bias affecting this evolutionary process. It is suggested that at least part of the bias arises from incorrectly assuming that all sites are variable. In the absence of evidence that this bias is not present in other data, all estimates of the number of substitutions based upon pairs of sequences and current methods of estimating superimposed substitutions at a single site should be viewed as uncertain.

14.
Positive Darwinian selection promotes fixations of advantageous mutations during gene evolution and is probably responsible for most adaptations. Detecting positive selection at the DNA sequence level is of substantial interest because such information provides significant insights into possible functional alterations during gene evolution as well as important nucleotide substitutions involved in adaptation. Efficient detection of positive selection, however, has been difficult because selection often operates on only a few sites in a short period of evolutionary time. A likelihood-based method with branch-site models was recently introduced to overcome such difficulties. Here I examine the accuracy of the method using computer simulation. I find that the method detects positive selection in 20%-70% of cases when the DNA sequences are generated by computer simulation under no positive selection. Although the frequency of such false detection varies depending on, among other things, the tree topology, branch length, and selection scheme, the branch-site likelihood method generally gives misleading results. Thus, detection of positive selection by this method alone is unreliable. This unreliability may have resulted from its over-sensitivity to violations of assumptions made in the method, such as certain distributions of selective strength among sites and equal transition/transversion ratios for synonymous and nonsynonymous substitutions.

15.
HAP1 protein, the major apurinic/apyrimidinic (AP) endonuclease in human cells, is a member of a homologous family of multifunctional DNA repair enzymes including the Escherichia coli exonuclease III and Drosophila Rrp1 proteins. The most extensively characterised member of this family, exonuclease III, exhibits both DNA- and RNA-specific nuclease activities. Here, we show that the RNase H activity characteristic of exonuclease III has been conserved in the human homologue, although the products resulting from RNA cleavage are dissimilar. To identify residues important for enzymatic activity, five mutant HAP1 proteins containing single amino acid substitutions were purified and analysed in vitro. The substitutions were made at sites of conserved amino acids and targeted either acidic or histidine residues because of their known participation in the active sites of hydrolytic nucleases. One of the mutant proteins (replacement of Asp-219 by alanine) showed a markedly reduced enzymatic activity, consistent with a greatly diminished capacity to bind DNA and RNA. In contrast, replacement of Asp-90, Asp-308 or Glu-96 by alanine led to a reduction in enzymatic activity without significantly compromising nucleic acid binding. Replacement of His-255 by alanine led to only a very small reduction in enzymatic activity. Our data are consistent with the presence of a single catalytic active site for the DNA- and RNA-specific nuclease activities of the HAP1 protein.

16.
17.
Continuous and tractable models for the variation of evolutionary rates   Cited by: 1 (self-citations: 0, citations by others: 1)
We propose a continuous model for variation in the evolutionary rate across sites and over the phylogenetic tree. We derive exact transition probabilities of substitutions under this model. Changes in rate are modelled using the CIR process, a diffusion widely used in financial applications. The model directly extends the standard gamma-distributed rates-across-sites model, with one additional parameter governing changes in rate down the tree. The parameters of the model can be estimated directly from two well-known statistics: the index of dispersion and the gamma shape parameter of the rates-across-sites model. The CIR model can be readily incorporated into probabilistic models for sequence evolution. We provide here an exact formula for the likelihood of a three-taxon tree. The likelihoods of larger trees can be evaluated using Monte-Carlo methods.
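For readers unfamiliar with the CIR (Cox-Ingersoll-Ross) process, its usual textbook form is sketched below. The parameter names a, b and σ are notation assumed here and need not match the paper's parameterisation. The gamma stationary distribution is what makes the process a natural extension of gamma-distributed rates across sites.

```latex
% Textbook CIR diffusion for a rate R_t; a, b, \sigma > 0 are assumed notation.
\begin{align*}
  dR_t &= b\,(a - R_t)\,dt + \sigma \sqrt{R_t}\; dW_t, \\
  R_\infty &\sim \operatorname{Gamma}\!\left(\text{shape} = \frac{2ab}{\sigma^{2}},\;
      \text{rate} = \frac{2b}{\sigma^{2}}\right),
  \qquad \mathbb{E}[R_\infty] = a .
\end{align*}
```

Here a is the long-run mean rate, b controls how quickly the rate reverts to that mean along a branch, and σ sets the size of the fluctuations.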

18.
The "Hitchhiking Effect" Revisited   Cited by: 49 (self-citations: 18, citations by others: 49)
N. L. Kaplan, R. R. Hudson, C. H. Langley. Genetics, 1989, 123(4): 887-899
The number of selectively neutral polymorphic sites in a random sample of genes can be affected by ancestral selectively favored substitutions at linked loci. The degree to which this happens depends on when in the history of the sample the selected substitutions happen, the strength of selection and the amount of crossing over between the sampled locus and the loci at which the selected substitutions occur. This phenomenon is commonly called hitchhiking. Using the coalescent process for a random sample of genes from a selectively neutral locus that is linked to a locus at which selection is taking place, a stochastic, finite population model is developed that describes the steady state effect of hitchhiking on the distribution of the number of selectively neutral polymorphic sites in a random sample. A prediction of the model is that, in regions of low crossing over, strongly selected substitutions in the history of the sample can substantially reduce the number of polymorphic sites in a random sample of genes from that expected under a neutral model.

19.
Statistical models of the overdispersed molecular clock   Cited by: 2 (self-citations: 0, citations by others: 2)
The most commonly used statistical model to describe the rate constancy of molecular evolution (molecular clock) is a simple Poisson process in which the variance of the number of amino acid or nucleotide substitutions in a particular gene should be equal to the mean and hence the dispersion index, the ratio of the variance to the mean, should be equal to one. Recent sequence data, however, have shown that the substitutional process in molecular evolution is often considerably overdispersed and have called into question the generality of using a simple Poisson process. Several efforts have been made to develop more realistic models of molecular evolution. In this paper, I will show that the spatial (site-specific) variation in the rate of molecular evolution is an improbable cause of the overdispersion and then review various statistical models which take the temporal variation into account. Although these models do not immediately specify what the mechanisms of molecular evolution might be, they do make qualitatively different predictions and give some insight into their inference. One way to distinguish them is suggested. In addition, effects of selected substitutions that presumably occur after a major change in a molecule are quasi-quantitatively examined. It is most likely that the overdispersion of the molecular clock is due either to a major molecular reconfiguration (fluctuating neutral space) led by a series of subliminal neutral changes or to selected substitutions fine-tuning a molecule after a major molecular change. Although the latter possibility, of course, violates the simplest neutrality assumption, it would not impair the neutral theory as a whole.

20.
It is often stated that patterns of nonsynonymous rate variation among mammalian lineages are more irregular than expected or overdispersed under the neutral model, whereas synonymous sites conform to the neutral model. Here we reexamined genome-wide patterns of the variance to mean ratio, or index of dispersion (R), of substitutions in proteins from human, mouse, and dog. Contrary to the prevailing notion, we found that the mean index of dispersion for nonsynonymous sites of mammalian proteins is not significantly different from 1. We propose that earlier analyses were biased because the data included disproportionately more protein hormones, which tend to be more dispersed than genes in other functional categories. Synonymous sites exhibit a greater degree of dispersion than nonsynonymous sites, although similar to earlier estimates and potentially due to errors associated with correction for multiple hits. Overall, our analysis identifies strong genome-wide generation-time effect and natural selection as important determinants of among-lineage variation of protein evolutionary rates. Furthermore, patterns of lineage-specific selective constraint are consistent with the nearly neutral model of molecular evolution.
