Tests of applicability of several substitution models for DNA sequence data |
| |
Authors: | Rzhetsky A; Nei M |
| |
Institution: | Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802. |
| |
Abstract: | Using linear invariants for various models of nucleotide substitution, we
developed test statistics for examining the applicability of a specific
model to a given dataset in phylogenetic inference. The models examined are
those developed by Jukes and Cantor (1969), Kimura (1980), Tajima and Nei
(1984), Hasegawa et al. (1985), Tamura (1992), Tamura and Nei (1993), and a
new model called the eight-parameter model. The first six models are
special cases of the last model. The test statistics developed are
independent of evolutionary time and phylogeny, although the variances of
the statistics contain phylogenetic information. Therefore, these
statistics can be used before a phylogenetic tree is estimated. Our
objective is to find the simplest model that is applicable to a given
dataset, keeping in mind that a simple model usually gives an estimate of
evolutionary distance (number of nucleotide substitutions per site) with a
smaller variance than a complicated model when the simple model is correct.
We have also developed a statistical test of the homogeneity of nucleotide
frequencies of a sample of several sequences that takes into account
possible phylogenetic correlations. This test is used to examine the
stationarity in time of the base frequencies in the sample. For the Hasegawa
et al. model and the eight-parameter model, analytical formulas for estimating
evolutionary distances are presented. Application of the above tests to
several sets of real data has shown that the assumption of stationarity of
base composition is usually acceptable when the sequences studied are
closely related, but it is otherwise rejected. Similarly, the simple models
of nucleotide substitution are almost always rejected when actual genes are
distantly related and/or the total number of nucleotides examined is large.
|
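As an illustration of what "evolutionary distance (number of nucleotide substitutions per site)" means for the two simplest models named in the abstract, the following minimal Python sketch computes the standard Jukes-Cantor (1969) and Kimura (1980) two-parameter distances from a pair of aligned sequences. It is not the test statistic or the distance formula developed in the paper; the function names and the handling of gaps and ambiguous bases (such sites are skipped) are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def count_differences(seq1, seq2):
    """Return (sites, transitions, transversions) for two aligned sequences.

    Assumption for this sketch: sites with gaps or ambiguous bases are skipped.
    """
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue
        n += 1
        if a == b:
            continue
        # Same chemical class (purine/purine or pyrimidine/pyrimidine)
        # means a transition; otherwise the change is a transversion.
        if (a in PURINES) == (b in PURINES):
            ts += 1
        else:
            tv += 1
    if n == 0:
        raise ValueError("no comparable sites")
    return n, ts, tv


def jc69_distance(seq1, seq2):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3)."""
    n, ts, tv = count_differences(seq1, seq2)
    p = (ts + tv) / n
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0:
        raise ValueError("sequences too divergent for the JC69 formula")
    return -0.75 * math.log(arg)


def k80_distance(seq1, seq2):
    """Kimura two-parameter distance:
    d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q),
    where P and Q are the transition and transversion proportions.
    """
    n, ts, tv = count_differences(seq1, seq2)
    P, Q = ts / n, tv / n
    a, b = 1.0 - 2.0 * P - Q, 1.0 - 2.0 * Q
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for the K80 formula")
    return -0.5 * math.log(a) - 0.25 * math.log(b)
```

Both functions take two aligned strings of equal length, for example `jc69_distance("ACGTACGTAC", "ACGTACGTAT")`. The simpler model uses one observed proportion, the richer one uses two, which is the trade-off between model simplicity and estimator variance that the abstract refers to.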
| |
Keywords: | |
This article is indexed in Oxford and other databases. |
|