Estimation of Reversible Substitution Matrices from Multiple Pairs of Sequences |
| |
Authors: | Lars Arvestad, William J. Bruno |
| |
Institution: | (1) Theoretical Biology and Biophysics (T-10), MS K-710, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; (2) Numerical Analysis and Computing Science (NADA), Royal Institute of Technology (KTH), Stockholm, Sweden |
| |
Abstract: | We present a method for estimating the most general reversible substitution matrix corresponding to a given collection of
pairwise aligned DNA sequences. This matrix can then be used to calculate evolutionary distances between pairs of sequences
in the collection. If only two sequences are considered, our method is equivalent to that of Lanave et al. (1984). The main
novelty of our approach is in combining data from different sequence pairs. We describe a weighting method for pairs of taxa
related by a known tree that results in uniform weights for all branches. Our method for estimating the rate matrix results
in fast execution times, even on large data sets, and does not require knowledge of the phylogenetic relationships among sequences.
In a test case on a primate pseudogene, the matrix we arrived at resembles one obtained using maximum likelihood, and the
resulting distance measure is shown to have better linearity than is obtained in a less general model. |
| |
Keywords: | Evolutionary distance; General reversible model; Rate matrix; Eigenvalues |
This article is indexed in SpringerLink and other databases. |
|
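The estimation idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that base-pair counts from several aligned pairs are pooled with user-supplied weights, symmetrized to enforce reversibility, and then passed through a matrix logarithm computed by eigendecomposition. The function names `pair_counts` and `reversible_rate_matrix` and the `weights` parameter are hypothetical, and the tree-based weighting scheme mentioned in the abstract is not reproduced here.

```python
import numpy as np

BASES = "ACGT"


def pair_counts(seq_a, seq_b):
    """Count aligned base pairs; sites with gaps or ambiguity codes are skipped."""
    index = {base: i for i, base in enumerate(BASES)}
    counts = np.zeros((4, 4))
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in index and b in index:
            counts[index[a], index[b]] += 1.0
    return counts


def reversible_rate_matrix(count_matrices, weights=None):
    """Estimate a reversible rate matrix from pooled pairwise counts.

    Returns (Q, pi, mean_distance), where Q is scaled to one expected
    substitution per unit time. The pooling rule is an assumption of
    this sketch, not taken from the paper.
    """
    if weights is None:
        weights = [1.0] * len(count_matrices)
    pooled = sum(w * np.asarray(c, dtype=float)
                 for w, c in zip(weights, count_matrices))

    # Symmetrizing the joint frequencies enforces time reversibility.
    freq = (pooled + pooled.T) / 2.0
    freq = freq / freq.sum()
    pi = freq.sum(axis=1)  # estimated equilibrium base frequencies
    if np.any(pi <= 0):
        raise ValueError("every base must occur in the pooled data")

    # P = diag(pi)^-1 F is similar to the symmetric matrix
    # S = diag(pi)^-1/2 F diag(pi)^-1/2, so log(P) follows from eigh(S).
    root = np.sqrt(pi)
    sym = freq / np.outer(root, root)
    eigvals, eigvecs = np.linalg.eigh(sym)
    if np.any(eigvals <= 0):
        raise ValueError("sequences too divergent: matrix logarithm undefined")
    log_sym = (eigvecs * np.log(eigvals)) @ eigvecs.T
    rate_time = log_sym * root[None, :] / root[:, None]  # equals Q * t

    # Expected substitutions per site in the pooled data.
    mean_distance = -float(np.sum(pi * np.diag(rate_time)))
    if mean_distance <= 0:
        raise ValueError("no substitutions observed in the pooled data")
    return rate_time / mean_distance, pi, mean_distance
```

With a single count matrix and the default weights, the sketch reduces to the two-sequence case, where the estimate comes from the logarithm of the symmetrized transition matrix. Passing one matrix per sequence pair, for example `reversible_rate_matrix([pair_counts(s1, s2), pair_counts(s1, s3)])`, pools the evidence from all pairs without requiring a tree.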