Phylogenetic analysis based on spectral methods |
| |
Authors: | Abeysundera Melanie Field Chris Gu Hong |
| |
Affiliation: | Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada. amelanie@mathstat.dal.ca |
| |
Abstract: | Whole-genome or multiple gene phylogenetic analysis is of interest since single gene analysis often results in poorly resolved trees. Here, the use of spectral techniques for analyzing multigene data sets is explored. The protein sequences are treated as categorical time series, and a measure of similarity between a pair of sequences, the spectral covariance, is based on the common periodicity between these two sequences. Unlike the other methods, the spectral covariance method focuses on the relationship between the sites of genetic sequences. By properly scaling the dissimilarity measures derived from different genes between a pair of species, we can use the mean of these scaled dissimilarity measures as a summary statistic to measure the taxonomic distances across multiple genes. The methods are applied to three different data sets, one noncontroversial and two with some dispute over the correct placement of the taxa in the tree. Trees are constructed using two distance-based methods, BIONJ and FITCH. A variation of block bootstrap sampling method is used for inference. The methods are able to recover all major clades in the corresponding reference trees with moderate to high bootstrap support. Through simulations, we show that the covariance-based methods effectively capture phylogenetic signal even when structural information is not fully retained. Comparisons of simulation results with the bootstrap permutation results indicate that the covariance-based methods are fairly robust under perturbations in sequence similarity but more sensitive to perturbations in structural similarity. |
| |
Keywords: | |
本文献已被 PubMed 等数据库收录! |
|