Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of Apicomplexa |
| |
Authors: | Morrison, DA; Ellis, JT |
| |
Affiliation: | Molecular Parasitology Unit, University of Technology Sydney, NSW, Australia. davidm@bio.uts.edu.au |
| |
Abstract: | The reconstruction of phylogenetic history is predicated on being able to accurately establish hypotheses of character homology, which involves sequence alignment for studies based on molecular sequence data. In an empirical study investigating nucleotide sequence alignment, we inferred phylogenetic trees for 43 species of the Apicomplexa and 3 of Dinozoa based on complete small-subunit rDNA sequences, using six different multiple-alignment procedures: manual alignment based on the secondary structure of the 18S rRNA molecule, and automated similarity-based alignment algorithms using the PileUp, ClustalW, TreeAlign, MALIGN, and SAM computer programs. Trees were constructed using neighbor-joining, weighted-parsimony, and maximum-likelihood methods. All of the multiple sequence alignment procedures yielded the same basic structure for the estimate of the phylogenetic relationship among the taxa, which presumably represents the underlying phylogenetic signal. However, the placement of many of the taxa was sensitive to the alignment procedure used; and the different alignments produced trees that were on average more dissimilar from each other than did the different tree-building methods used. The multiple alignments from the different procedures varied greatly in length, but aligned sequence length was not a good predictor of the similarity of the resulting phylogenetic trees. We also systematically varied the gap weights (the relative cost of inserting a new gap into a sequence or extending an already-existing gap) for the ClustalW program, and this produced alignments that were at least as different from each other as those produced by the different alignment algorithms. Furthermore, there was no combination of gap weights that produced the same tree as that from the structure alignment, in spite of the fact that many of the alignments were similar in length to the structure alignment. We also investigated the phylogenetic information content of the helical and nonhelical regions of the rDNA, and conclude that the helical regions are the most informative. We therefore conclude that many of the literature disagreements concerning the phylogeny of the Apicomplexa are probably based on differences in sequence alignment strategies rather than differences in data or tree-building methods. |
| |
Keywords: | |
This article is indexed in Oxford and other databases. |
|