Estimates of DNA and protein sequence divergence: an examination of some assumptions |
| |
Authors: | Golding GB |
| |
Institution: | Genetics Laboratory, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709. |
| |
Abstract: | Some of the assumptions underlying estimates of DNA and protein sequence
divergence are examined. A solution for the variance of these estimates
that allows for different mutation rates and different population sizes in
each species and for an arbitrary structure in the initial population is
obtained. It is shown that these conditions do not strongly affect
estimates of divergence. In general, they cause the variance of divergence
to be smaller than a binomial variance. Thus, the binomial variance that is
usually assumed for these estimates is safely conservative. It is shown
that variability in the mutation rate among sites can have an effect as
large as or larger than variability in the mutation rate among bases.
Variability in the mutation rate among bases and among sites causes the
number of substitutions between two sequences to be underestimated. Protein
and DNA sequences from several species are collected to estimate the
variability in mutation rates among sites. When many homologous sequences
are known, standard methods to estimate this variability can be used. The
estimates of this variability show that this factor is important when
considering the spectrum of spontaneous mutations and is strongly reflected
in the divergence of sequences. Smaller variability is found for the third
position of codons than for the first and second codon positions. This may
be because of weaker selective constraints on this position or because the
third position has been saturated with mutations for the sequences
examined.
|
| |
Keywords: | |
This article is indexed in Oxford and other databases. |
|
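The abstract's claim that rate variability among sites causes substitutions to be underestimated can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the abstract: the Jukes-Cantor substitution model, gamma-distributed rates among sites with a hypothetical shape parameter `shape`, and illustrative values for the observed proportion of differing sites `p` and the sequence length. The function names are likewise hypothetical, and this is not presented as the author's own derivation.

```python
import math


def jc69_distance(p):
    """Substitutions per site under Jukes-Cantor, assuming every site
    mutates at the same rate. p is the observed proportion of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def jc69_gamma_distance(p, shape):
    """Substitutions per site under Jukes-Cantor when rates among sites follow
    a gamma distribution with the given shape (smaller shape = more variability)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if shape <= 0:
        raise ValueError("shape must be positive")
    return 0.75 * shape * ((1 - 4 * p / 3) ** (-1 / shape) - 1)


def binomial_variance(p, n_sites):
    """Binomial variance of the observed proportion p over n_sites sites."""
    return p * (1 - p) / n_sites


if __name__ == "__main__":
    p = 0.30          # illustrative observed divergence (assumption)
    n_sites = 500     # illustrative sequence length (assumption)
    print("equal rates:      ", jc69_distance(p))
    for shape in (0.5, 1.0, 5.0):  # hypothetical shape values
        print(f"gamma, shape={shape}:", jc69_gamma_distance(p, shape))
    print("binomial variance:", binomial_variance(p, n_sites))
```

For any observed `p` greater than zero, the gamma-corrected estimate exceeds the equal-rates estimate, and the gap widens as `shape` decreases. In this sketch, ignoring rate variability among sites therefore yields a lower estimate of the number of substitutions, which is the direction of bias the abstract describes.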