Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of Apicomplexa |
| |
Authors: | Morrison DA; Ellis JT |
| |
Institution: | Molecular Parasitology Unit, University of Technology Sydney, NSW, Australia. davidm@bio.uts.edu.au |
| |
Abstract: | The reconstruction of phylogenetic history is predicated on being able to
accurately establish hypotheses of character homology, which involves
sequence alignment for studies based on molecular sequence data. In an
empirical study investigating nucleotide sequence alignment, we inferred
phylogenetic trees for 43 species of the Apicomplexa and 3 of Dinozoa based
on complete small-subunit rDNA sequences, using six different
multiple-alignment procedures: manual alignment based on the secondary
structure of the 18S rRNA molecule, and automated similarity-based
alignment algorithms using the PileUp, ClustalW, TreeAlign, MALIGN, and SAM
computer programs. Trees were constructed using neighbor-joining,
weighted-parsimony, and maximum-likelihood methods. All of the multiple
sequence alignment procedures yielded the same basic structure for the
estimate of the phylogenetic relationships among the taxa, which presumably
represents the underlying phylogenetic signal. However, the placement of
many of the taxa was sensitive to the alignment procedure used, and the
different alignments produced trees that were, on average, more dissimilar
from each other than were the trees from the different tree-building methods. The
multiple alignments from the different procedures varied greatly in length,
but aligned sequence length was not a good predictor of the similarity of
the resulting phylogenetic trees. We also systematically varied the gap
weights (the relative cost of inserting a new gap into a sequence or
extending an already-existing gap) for the ClustalW program, and this
produced alignments that were at least as different from each other as
those produced by the different alignment algorithms. Furthermore, there
was no combination of gap weights that produced the same tree as that from
the structure alignment, in spite of the fact that many of the alignments
were similar in length to the structure alignment. We also investigated the
phylogenetic information content of the helical and nonhelical regions of
the rDNA, and conclude that the helical regions are the most informative.
We therefore conclude that many of the literature disagreements concerning
the phylogeny of the Apicomplexa are probably based on differences in
sequence alignment strategies rather than differences in data or
tree-building methods.
|
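Note: the gap weights mentioned in the abstract can be illustrated with a small sketch. The code below is not taken from the paper or from ClustalW; it assumes one common affine convention, in which the opening weight is charged once per gap and the extension weight is charged for every position after the first. The function names and the example weights are hypothetical. It shows how changing the two weights can reverse which of two candidate gap arrangements is cheaper, which is one way that different weights may lead to different alignments.

```python
def affine_gap_cost(gap_length, gap_open, gap_extend):
    """Cost of one gap under an assumed affine scheme.

    Assumption: gap_open is charged once, and gap_extend is charged for
    each position after the first. Other programs may define this differently.
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_length == 0:
        return 0
    return gap_open + gap_extend * (gap_length - 1)


def total_gap_cost(gap_lengths, gap_open, gap_extend):
    """Sum the affine cost over every gap in a candidate alignment."""
    return sum(affine_gap_cost(n, gap_open, gap_extend) for n in gap_lengths)


# Two hypothetical ways to insert six gap positions into a sequence:
one_long_gap = [6]            # a single gap of length 6
three_short_gaps = [2, 2, 2]  # three separate gaps of length 2

# Expensive opening, cheap extension: the single long gap costs less.
high_open = (total_gap_cost(one_long_gap, 10, 1),
             total_gap_cost(three_short_gaps, 10, 1))

# Cheap opening, expensive extension: the three short gaps cost less.
high_extend = (total_gap_cost(one_long_gap, 1, 5),
               total_gap_cost(three_short_gaps, 1, 5))

print(high_open, high_extend)
```

Under the first pair of weights the single long gap is preferred, and under the second pair the three short gaps are preferred, even though both arrangements insert the same number of gap positions.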
| |
Keywords: | |
This article is indexed in Oxford and other databases. |
|