Evolutionary distances for protein-coding sequences: modeling site-specific residue frequencies |
| |
Authors: | Halpern AL; Bruno WJ |
| |
Institution: | Los Alamos National Laboratory, New Mexico, USA. ahalpern@ender.unm.edu |
| |
Abstract: | Estimation of evolutionary distances from coding sequences must take into
account protein-level selection to avoid relative underestimation of longer
evolutionary distances. Current modeling of selection via site-to-site rate
heterogeneity generally neglects another aspect of selection, namely
position-specific amino acid frequencies. These frequencies determine the
maximum dissimilarity expected for highly diverged but functionally and
structurally conserved sequences, and hence are crucial for estimating long
distances. We introduce a codon-level model of coding sequence evolution
in which position-specific amino acid frequencies are free parameters. In
our implementation, these are estimated from an alignment using methods
described previously. We use simulations to demonstrate the importance and
feasibility of modeling such behavior; our model produces linear distance
estimates over a wide range of distances, while several alternative models
underestimate long distances relative to short distances. Site-to-site
differences in rates, as well as synonymous/nonsynonymous and
first/second/third-codon-position differences, arise as a natural
consequence of the site-to-site differences in amino acid frequencies.
|
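The link between position-specific amino acid frequencies and the maximum expected dissimilarity can be illustrated with a small calculation. The sketch below is not the codon-level model of the abstract. It only uses the standard fact that two independent draws from a site's equilibrium frequencies differ with probability 1 minus the sum of squared frequencies. The two sites and their frequencies are hypothetical values chosen for illustration, and the function name is an assumption.

```python
def saturation_dissimilarity(freqs):
    """Probability that two independent draws from `freqs` differ.

    `freqs` holds the equilibrium amino acid frequencies at one site;
    they are normalized here so that they sum to 1.
    """
    total = sum(freqs)
    return 1.0 - sum((f / total) ** 2 for f in freqs)


# Hypothetical sites (illustrative values, not taken from the paper).
# A constrained site that tolerates only three residues:
conserved_site = [0.90, 0.05, 0.05] + [0.0] * 17
# An unconstrained site with all 20 residues equally likely:
variable_site = [0.05] * 20

sites = [conserved_site, variable_site]

# Ceiling on dissimilarity when each site keeps its own frequencies.
per_site = [saturation_dissimilarity(s) for s in sites]
site_specific_max = sum(per_site) / len(per_site)

# Ceiling implied by a model that applies one pooled frequency vector
# to every site (frequencies averaged across sites).
pooled = [sum(column) / len(sites) for column in zip(*sites)]
pooled_max = saturation_dissimilarity(pooled)

print(f"per-site ceilings:     {per_site}")
print(f"site-specific ceiling: {site_specific_max:.4f}")
print(f"pooled-model ceiling:  {pooled_max:.4f}")
```

In this toy case the site-specific ceiling is lower than the pooled one. A model that assumes the pooled frequencies would therefore expect more dissimilarity at saturation than conserved sequences can reach, which is consistent with the underestimation of long distances that the abstract describes.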
| |
Keywords: | |
This article is indexed in Oxford and other databases. |
|