A Hidden Markov Model approach to variation among sites in rate of evolution |
| |
Authors: | Felsenstein, J; Churchill, GA |
| |
Affiliation: | Department of Genetics, University of Washington, Seattle 98195, USA. |
| |
Abstract: | The method of Hidden Markov Models is used to allow for unequal and unknown evolutionary rates at different sites in molecular sequences. Rates of evolution at different sites are assumed to be drawn from a set of possible rates, with a finite number of possibilities. The overall likelihood of phylogeny is calculated as a sum of terms, each term being the probability of the data given a particular assignment of rates to sites, times the prior probability of that particular combination of rates. The probabilities of different rate combinations are specified by a stationary Markov chain that assigns rate categories to sites. While there will be a very large number of possible ways of assigning rates to sites, a simple recursive algorithm allows the contributions to the likelihood from all possible combinations of rates to be summed, in a time proportional to the number of different rates at a single site. Thus with three rates, the effort involved is no greater than three times that for a single rate. This "Hidden Markov Model" method allows for rates to differ between sites and for correlations between the rates of neighboring sites. By summing over all possibilities it does not require us to know the rates at individual sites. However, it does not allow for correlation of rates at nonadjacent sites, nor does it allow for a continuous distribution of rates over sites. It is shown how to use the Newton-Raphson method to estimate branch lengths of a phylogeny and to infer from a phylogeny what assignment of rates to sites has the largest posterior probability. An example is given using beta-hemoglobin DNA sequences in eight mammal species; the regions of high and low evolutionary rates are inferred and also the average length of patches of similar rates. |
| |
Keywords: | |
This article is indexed in Oxford and other databases. |
|
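
The recursion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code; it is a generic HMM forward recursion and Viterbi pass written under stated assumptions. The function names (`total_likelihood`, `brute_force_likelihood`, `most_probable_rates`) and all numbers are hypothetical. The per-site values `site_lik[i][k]` stand for the probability of the data at site `i` given rate category `k`; in a real analysis they would come from a tree likelihood computation that is not shown here.

```python
from itertools import product


def total_likelihood(site_lik, pi, trans):
    """Sum the likelihood over every assignment of rate categories to
    sites with a forward recursion, without enumerating assignments."""
    n_cat = len(pi)
    # forward[k]: probability of the sites seen so far, with the current
    # site in category k.
    forward = [pi[k] * site_lik[0][k] for k in range(n_cat)]
    for lik in site_lik[1:]:
        forward = [
            lik[k] * sum(forward[j] * trans[j][k] for j in range(n_cat))
            for k in range(n_cat)
        ]
    return sum(forward)


def brute_force_likelihood(site_lik, pi, trans):
    """Same sum computed by explicit enumeration (small inputs only)."""
    total = 0.0
    for cats in product(range(len(pi)), repeat=len(site_lik)):
        prob = pi[cats[0]] * site_lik[0][cats[0]]
        for i in range(1, len(cats)):
            prob *= trans[cats[i - 1]][cats[i]] * site_lik[i][cats[i]]
        total += prob
    return total


def most_probable_rates(site_lik, pi, trans):
    """Viterbi pass: the single assignment of categories to sites with
    the largest joint (and hence posterior) probability."""
    n_cat = len(pi)
    best = [pi[k] * site_lik[0][k] for k in range(n_cat)]
    back = []  # back[i][k]: best predecessor of category k at site i + 1
    for lik in site_lik[1:]:
        prev = [
            max(range(n_cat), key=lambda j: best[j] * trans[j][k])
            for k in range(n_cat)
        ]
        best = [lik[k] * best[prev[k]] * trans[prev[k]][k] for k in range(n_cat)]
        back.append(prev)
    k = max(range(n_cat), key=lambda j: best[j])
    path = [k]
    for prev in reversed(back):
        k = prev[k]
        path.append(k)
    return path[::-1]


# Made-up illustration: 4 sites, 2 rate categories (0 = slow, 1 = fast).
site_lik = [[0.10, 0.02], [0.08, 0.03], [0.01, 0.05], [0.02, 0.06]]
pi = [0.5, 0.5]                    # stationary distribution of the chain
trans = [[0.9, 0.1], [0.1, 0.9]]   # neighboring sites tend to share a rate

print(total_likelihood(site_lik, pi, trans))
print(most_probable_rates(site_lik, pi, trans))
```

With realistic sequence lengths the products above would underflow, so a practical version would rescale the forward vector at each site or work with logarithms.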