Found 20 similar articles (search time: 31 ms)
1.
In the reconstruction of a large phylogenetic tree, the most difficult part is usually the problem of how to explore the topology space to find the optimal topology. We have developed a "divide-and-conquer" heuristic algorithm in which an initial neighbor-joining (NJ) tree is divided into subtrees at internal branches having bootstrap values higher than a threshold. The topology search is then conducted by using the maximum-likelihood method to reevaluate all branches with a bootstrap value lower than the threshold while keeping the other branches intact. Extensive simulation showed that our simple method, the neighbor-joining maximum-likelihood (NJML) method, is highly efficient in improving NJ trees. Furthermore, the performance of the NJML method is nearly equal to or better than existing time-consuming heuristic maximum-likelihood methods. Our method is suitable for reconstructing relatively large molecular phylogenetic trees (number of taxa ≥ 16).
2.
Whole-genome duplication (WGD) produces sets of gene pairs that are all of the same age. We therefore expect that phylogenetic trees that relate these pairs to their orthologs in other species should show a single consistent topology. However, a previous study of gene pairs formed by WGD in the yeast Saccharomyces cerevisiae found conflicting topologies among neighbor-joining (NJ) trees drawn from different loci and suggested that this conflict was the result of "asynchronous functional divergence" of duplicated genes (Langkjaer, R. B., P. F. Cliften, M. Johnston, and J. Piskur. 2003. Yeast genome duplication was followed by asynchronous differentiation of duplicated genes. Nature 421:848-852). Here, we test whether the conflicting topologies might instead be due to asymmetrical rates of evolution leading to long-branch attraction (LBA) artifacts in phylogenetic trees. We constructed trees for 433 pairs of WGD paralogs in S. cerevisiae with their single orthologs in Saccharomyces kluyveri and Candida albicans. We find a strong correlation between the asymmetry of evolutionary rates of a pair of S. cerevisiae paralogs and the topology of the tree inferred for that pair. Saccharomyces cerevisiae gene pairs with approximately equal rates of evolution tend to give phylogenies in which the WGD postdates the speciation between S. cerevisiae and S. kluyveri (B-trees), whereas trees drawn from gene pairs with asymmetrical rates tend to show WGD pre-dating this speciation (A-trees). Gene order data from throughout the genome indicate that the "A-trees" are artifacts, even though more than 50% of gene pairs are inferred to have this topology when the NJ method as implemented in ClustalW (i.e., with Poisson correction of distances) is used to construct the trees. This LBA artifact can be ameliorated, but not eliminated, by using gamma-corrected distances or by using maximum likelihood trees with robustness estimated by the Shimodaira-Hasegawa test. 
Tests for adaptive evolution indicated that positive selection might be the cause of rate asymmetry in a substantial fraction (19%) of the paralog pairs.
3.
In phylogenetic inference by maximum-parsimony (MP), minimum-evolution (ME), and maximum-likelihood (ML) methods, it is customary to conduct extensive heuristic searches of MP, ME, and ML trees, examining a large number of different topologies. However, these extensive searches tend to give incorrect tree topologies. Here we show by extensive computer simulation that when the number of nucleotide sequences (m) is large and the number of nucleotides used (n) is relatively small, the simple MP or ML tree search algorithms such as the stepwise addition (SA) plus nearest neighbor interchange (NNI) search and the SA plus subtree pruning regrafting (SPR) search are as efficient as the extensive search algorithms such as the SA plus tree bisection-reconnection (TBR) search in inferring the true tree. In the case of ME methods, the simple neighbor-joining (NJ) algorithm is as efficient as or more efficient than the extensive NJ+TBR search. We show that when ME methods are used, the simple p distance generally gives better results in phylogenetic inference than more complicated distance measures such as the Hasegawa-Kishino-Yano (HKY) distance, even when nucleotide substitution follows the HKY model. When ML methods are used, the simple Jukes-Cantor (JC) model of phylogenetic inference generally shows a better performance than the HKY model even if the likelihood value for the HKY model is much higher than that for the JC model. This indicates that at least in the present case, selecting a substitution model by using the likelihood ratio test or the AIC index is not appropriate. When n is small relative to m and the extent of sequence divergence is high, the NJ method with p distance often shows a better performance than ML methods with the JC model. However, when the level of sequence divergence is low, this is not the case.
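For readers unfamiliar with the two simplest distance measures named in this abstract, the following is a minimal sketch of the p distance and the standard Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). The function names and the choice to skip gapped or ambiguous sites are illustrative assumptions, not the procedure used in the paper.

```python
import math

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of comparable aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: sites with a gap or an ambiguous base in either sequence are skipped.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jc_distance(p: float) -> float:
    """Jukes-Cantor corrected distance computed from a p distance."""
    if p >= 0.75:
        raise ValueError("JC distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The correction always inflates the observed proportion (for p > 0), and its sampling variance grows quickly as p approaches 0.75, which is one commonly cited reason an uncorrected distance can sometimes do well on short sequences.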
4.
A rapid heuristic algorithm for finding minimum evolution trees (total citations: 2; self-citations: 0; citations by others: 2)
The minimum sum of branch lengths (S), or the minimum evolution (ME) principle, has been shown to be a good optimization criterion in phylogenetic inference. Unfortunately, the number of topologies to be analyzed is computationally prohibitive when a large number of taxa are involved. Therefore, simplified, heuristic methods, such as the neighbor-joining (NJ) method, are usually employed instead. The NJ method analyzes only a small number of trees (compared with the size of the entire search space); so, the tree obtained may not be the ME tree (for which the S value is minimum over the entire search space). Different compromises between very restrictive and exhaustive search spaces have been proposed recently. In particular, the "stepwise algorithm" (SA) utilizes what is known in computer science as the "beam search," whereas the NJ method employs a "greedy search." SA is virtually guaranteed to find the ME trees while being much faster than exhaustive search algorithms. In this study we propose an even faster method for finding the ME tree. The new algorithm adjusts its search exhaustiveness (from greedy to complete) according to the statistical reliability of the tree node being reconstructed. It is also virtually guaranteed to find the ME tree. The performances and computational efficiencies of ME, SA, NJ, and our new method were compared in extensive simulation studies. The new algorithm was found to perform practically as well as the SA (and, therefore, ME) methods and slightly better than the NJ method. For searching for the globally optimal ME tree, the new algorithm is significantly faster than existing ones, thus making it relatively practical for obtaining all trees with an S value equal to or smaller than that of the NJ tree, even when a large number of taxa is involved.
5.
We have developed a phylogenetic tree reconstruction method that detects and reports multiple topologically distant low-cost solutions. Our method is a generalization of the neighbor-joining method of Saitou and Nei and affords a more thorough sampling of the solution space by keeping track of multiple partial solutions during its execution. The scope of the solution space sampling is controlled by a pair of user-specified parameters (the total number of alternate solutions and the number of alternate solutions that are randomly selected), effecting a smooth trade-off between run time and solution quality and diversity. This method can discover topologically distinct low-cost solutions. In tests on biological and synthetic data sets using either the least-squares distance or minimum-evolution criterion, the method consistently performed as well as, or better than, both the neighbor-joining heuristic and the PHYLIP implementation of the Fitch-Margoliash distance measure. In addition, the method identified alternative tree topologies with costs within 1% or 2% of the best, but with topological distances of 9 or more partitions from the best solution (16 taxa); with 32 taxa, topologies were obtained 17 (least-squares) and 22 (minimum-evolution) partitions from the best topology when 200 partial solutions were retained. Thus, the method can find lower-cost tree topologies and near-best tree topologies that are significantly different from the best topology.
6.
Relative efficiencies of the maximum-parsimony and distance-matrix methods of phylogeny construction for restriction data. (total citations: 4; self-citations: 0; citations by others: 4)
The relative efficiencies of the maximum-parsimony (MP), UPGMA, and neighbor-joining (NJ) methods in obtaining the correct tree (topology) for restriction-site and restriction-fragment data were studied by computer simulation. In this simulation, six DNA sequences of 16,000 nucleotides were assumed to evolve following a given model tree. The recognition sequences of 20 different six-base restriction enzymes were used to identify the restriction sites of the DNA sequences generated. The restriction-site data and restriction-fragment data thus obtained were used to reconstruct a phylogenetic tree, and the tree obtained was compared with the model tree. This process was repeated 300 times. The results obtained indicate that when the rate of nucleotide substitution is constant the probability of obtaining the correct tree (Pc) is generally higher in the NJ method than in the MP method. However, if we use the average topological deviation from the model tree (dT) as the criterion of comparison, the NJ and MP methods are nearly equally efficient. When the rate of nucleotide substitution varies with evolutionary lineage, the NJ method is better than the MP method, whether Pc or dT is used as the criterion of comparison. With 500 nucleotides and when the number of nucleotide substitutions per site was very small, restriction-site data were, contrary to our expectation, more useful than sequence data. Restriction-fragment data were less useful than restriction-site data, except when the sequence divergence was very small. UPGMA seems to be useful only when the rate of nucleotide substitution is constant and sequence divergence is high.
7.
Murphy and colleagues reported that the mammalian phylogeny was resolved by Bayesian phylogenetics. However, the DNA sequences they used had many alignment gaps and undetermined nucleotide sites. We therefore reanalyzed their data by minimizing unshared nucleotide sites and retaining as many species as possible (13 species). In constructing phylogenetic trees, we used the Bayesian, maximum likelihood (ML), maximum parsimony (MP), and neighbor-joining (NJ) methods with different substitution models. These trees were constructed by using both protein and DNA sequences. The results showed that the posterior probabilities for Bayesian trees were generally much higher than the bootstrap values for ML, MP, and NJ trees. Two different Bayesian topologies for the same set of species were sometimes supported by high posterior probabilities, implying that two different topologies can be judged to be correct by Bayesian phylogenetics. This suggests that the posterior probability in Bayesian analysis can be excessively high as an indication of statistical confidence and therefore Murphy et al.'s tree, which largely depends on Bayesian posterior probability, may not be correct.
8.
Bayesian Markov chain Monte Carlo sampling has become increasingly popular in phylogenetics as a method both for estimating the maximum likelihood topology and for assessing nodal confidence. Despite the growing use of posterior probabilities, the relationship between the Bayesian measure of confidence and the most commonly used confidence measure in phylogenetics, the nonparametric bootstrap proportion, is poorly understood. We used computer simulation to investigate the behavior of three phylogenetic confidence methods: Bayesian posterior probabilities calculated via Markov chain Monte Carlo sampling (BMCMC-PP), maximum likelihood bootstrap proportion (ML-BP), and maximum parsimony bootstrap proportion (MP-BP). We simulated the evolution of DNA sequence on 17-taxon topologies under 18 evolutionary scenarios and examined the performance of these methods in assigning confidence to correct monophyletic and incorrect monophyletic groups, and we examined the effects of increasing character number on support value. BMCMC-PP and ML-BP were often strongly correlated with one another but could provide substantially different estimates of support on short internodes. In contrast, BMCMC-PP correlated poorly with MP-BP across most of the simulation conditions that we examined. For a given threshold value, more correct monophyletic groups were supported by BMCMC-PP than by either ML-BP or MP-BP. When threshold values were chosen that fixed the rate of accepting incorrect monophyletic relationships as true at 5%, all three methods recovered most of the correct relationships on the simulated topologies, although BMCMC-PP and ML-BP performed better than MP-BP. BMCMC-PP was usually a less biased predictor of phylogenetic accuracy than either bootstrapping method. BMCMC-PP provided high support values for correct topological bipartitions with fewer characters than was needed for nonparametric bootstrap.
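As background for the nonparametric bootstrap proportion discussed in this abstract, here is a minimal sketch of how such a proportion is generally obtained: alignment columns are resampled with replacement, a tree is rebuilt from each pseudo-replicate, and the fraction of replicates recovering a given group is reported. The `has_group` callable is a hypothetical placeholder for "build a tree and check whether the group is monophyletic"; it and the function names are assumptions for illustration, not part of the study.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap pseudo-replicate).

    alignment: dict mapping taxon name -> aligned sequence (all of equal length).
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

def bootstrap_proportion(alignment, has_group, n_reps=100, seed=0):
    """Fraction of pseudo-replicates for which has_group(replicate) is true.

    has_group is a hypothetical stand-in: it should infer a tree from the
    replicate (by NJ, MP, ML, ...) and report whether the group is recovered.
    """
    rng = random.Random(seed)
    hits = sum(bool(has_group(bootstrap_alignment(alignment, rng)))
               for _ in range(n_reps))
    return hits / n_reps
```

The same column indices are applied to every taxon, so each resampled column is an intact site pattern from the original alignment.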
9.
Relative efficiencies of the maximum parsimony and distance-matrix methods in obtaining the correct phylogenetic tree (total citations: 14; self-citations: 1; citations by others: 13)
The relative efficiencies of the maximum parsimony (MP) and distance-matrix methods in obtaining the correct tree (topology) were studied by using computer simulation. The distance-matrix methods examined are the neighbor-joining, distance-Wagner, Tateno et al. modified Farris, Faith, and Li methods. In the computer simulation, six or eight DNA sequences were assumed to evolve following a given model tree, and the evolutionary changes of the sequences were followed. Both constant and varying rates of nucleotide substitution were considered. From the sequences thus obtained, phylogenetic trees were constructed using the six tree-making methods and compared with the model (true) tree. This process was repeated 300 times for each different set of parameters. The results obtained indicate that when the number of nucleotide substitutions per site is small and a relatively small number of nucleotides are used, the probability of obtaining the correct topology (P1) is generally lower in the MP method than in the distance-matrix methods. The P1 value for the MP method increases with increasing number of nucleotides but is still generally lower than the value for the NJ or DW method. Essentially the same conclusion was obtained whether or not the rate of nucleotide substitution was constant or whether or not a transition bias in nucleotide substitution existed. The relatively poor performance of the MP method for these cases is due to the fact that information from singular sites is not used in this method. The MP method also showed a relatively low P1 value when the model of varying rate of nucleotide substitution was used and the number of substitutions per site was large. However, the MP method often produced cases in which the correct tree was one of several equally parsimonious trees. When these cases were included in the class of "success," the MP method performed better than the other methods, provided that the number of nucleotide substitutions per site was small. 
10.
Junhyong Kim, F. James Rohlf, Robert R. Sokal. Evolution; international journal of organic evolution, 1993, 47(2): 471-486
We studied the factors affecting the accuracy of the neighbor-joining (NJ) method for estimating phylogenies by simulating character change under different evolutionary models applied to twenty different 8-OTU tree topologies that varied widely with respect to tree imbalance and stemminess. The models incorporated three evolutionary rates (constant, varying among lineages, varying among characters) and three evolutionary contexts concerning patterns of character change relative to speciation events (phyletic, speciational, and punctuational). All combinations of the rate and context models were studied. In addition, three different absolute rates of change were investigated. To measure the accuracy, the strict consensus index was computed between the estimated tree and the tree topology along which the data had been generated. The results were analyzed by analysis of variance and compared to a previous study that evaluated UPGMA clustering and maximum parsimony (MP) as phylogenetic estimation techniques. We found evolutionary context and tree imbalance to be the most important factors affecting the accuracy of the NJ method. NJ was more accurate than UPGMA or MP in terms of the average strict consensus index over all treatments. However, no one method was more accurate than the other two for all combinations of treatments. Higher absolute rate of change generally resulted in higher accuracy for all three methods.
11.
The neighbor-joining (NJ) method is widely used in reconstructing large phylogenies because of its computational speed and
the high accuracy in phylogenetic inference as revealed in computer simulation studies. However, most computer simulation
studies have quantified the overall performance of the NJ method in terms of the percentage of branches inferred correctly
or the percentage of replications in which the correct tree is recovered. We have examined other aspects of its performance,
such as the relative efficiency in correctly reconstructing shallow (close to the external branches of the tree) and deep
branches in large phylogenies; the contribution of zero-length branches to topological errors in the inferred trees; and the
influence of increasing the tree size (number of sequences), evolutionary rate, and sequence length on the efficiency of the
NJ method. Results show that the correct reconstruction of deep branches is no more difficult than that of shallower branches.
The presence of zero-length branches in realized trees contributes significantly to the overall error observed in the NJ tree,
especially in large phylogenies or slowly evolving genes. Furthermore, the tree size does not influence the efficiency of
NJ in reconstructing shallow and deep branches in our simulation study, in which the evolutionary process is assumed to be
homogeneous in all lineages.
Received: 7 March 2000 / Accepted: 2 August 2000
12.
Silva AE, Villanueva WJ, Knidel H, Bonato VC, Reis SF, Von Zuben FJ. Genetics and molecular research : GMR, 2005, 4(3): 525-534
The computationally challenging problem of reconstructing the phylogeny of a set of contemporary data, such as DNA sequences or morphological attributes, was treated by an extended version of the neighbor-joining (NJ) algorithm. The original NJ algorithm provides a single-tree topology, after a cascade of greedy pairing decisions that tries to simultaneously optimize the minimum evolution and the least squares criteria. Given that some sub-trees are more stable than others, and that the minimum evolution tree may not be achieved by the original NJ algorithm, we propose a multi-neighbor-joining (MNJ) algorithm capable of performing multiple pairing decisions at each level of the tree reconstruction, keeping various partial solutions along the recursive execution of the NJ algorithm. The main advantages of the new reconstruction procedure are: 1) as is the case for the original NJ algorithm, the MNJ algorithm is still a low-cost reconstruction method; 2) a further investigation of the alternative topologies may reveal stable and unstable sub-trees; 3) the chance of achieving the minimum evolution tree is greater; 4) tree topologies with very similar performances will be simultaneously presented at the output. When there are multiple unrooted tree topologies to be compared, a visualization tool is also proposed, using a radial layout to uniformly distribute the branches with the help of well-known metaheuristics used in computer science.
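The "cascade of greedy pairing decisions" in the original NJ algorithm can be sketched as follows. This is a minimal illustration of the standard NJ pairing criterion, Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), with r(i) the sum of distances from i to all other nodes; it returns the topology only (no branch lengths) and is not the MNJ extension described in the abstract. The data layout (a dict keyed by `frozenset` pairs) and the function name are assumptions made for this example.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Return an unrooted NJ topology as nested tuples (branch lengths omitted).

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence r_i: sum of distances from node i to every other node.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Greedy step: join the pair that minimizes Q(i, j); ties go to the first pair.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        u = (i, j)  # the new internal node is represented by the pair it joins
        for k in nodes:
            if k != i and k != j:
                # Distance from the new node to each remaining node.
                d[frozenset((u, k))] = 0.5 * (d[frozenset((i, k))]
                                              + d[frozenset((j, k))]
                                              - d[frozenset((i, j))])
        nodes = [k for k in nodes if k != i and k != j] + [u]
    return tuple(nodes)
```

Because only the single best pair is kept at each step, ties or near-ties in Q are resolved arbitrarily; retaining several candidate pairs at that point is the kind of choice a multi-solution variant would explore.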
13.
Evolution of Lycopodiaceae Inferred from Spacer Sequencing of Chloroplast rRNA Genes (total citations: 1; self-citations: 0; citations by others: 1)
Yatsentyuk S. P., Valiejo-Roman K. M., Samigullin T. H., Wilkström N., Troitsky A. V. Russian Journal of Genetics, 2001, 37(9): 1068-1073
Nucleotide sequences of a chloroplast rDNA region including 8 bp from the 3′ end of 23S rDNA–ITS2–4.5S rDNA–ITS3–5S rDNA–ITS4 (approximately 800 bp) were determined in 25 species of Lycopodiaceae and two species of the genus Isoetes. The rate of molecular evolution of spacers significantly varied in different Lycopsida taxa. A phylogenetic analysis by the neighbor-joining (NJ) method revealed that the family Lycopodiaceae is monophyletic. The topology of phylogenetic trees suggests the isolation of four or probably five genera in family Lycopodiaceae. For these genera, synapomorphic indels were detected. The obtained data were compared with the results of phylogenetic analysis of Lycopsida with regard to other sequences. The relationships of taxa within the family Lycopodiaceae are discussed.
14.
Iatentiuk SP, Val'ekho-Roman KM, Samigullin TKh, Wilkström N, Troitskiĭ AV. Genetika, 2001, 37(9): 1274-1280
Nucleotide sequences of a chloroplast rDNA region including 8 bp from the 3' end of 23S rDNA-ITS2-4.5S rDNA-ITS3-5S rDNA-ITS4 (approximately 800 bp) were determined in 25 species of Lycopodiaceae and two species of the genus Isoetes. The rate of molecular evolution of spacers significantly varied in different Lycopsida taxa. A phylogenetic analysis by the neighbor-joining (NJ) method revealed that the family Lycopodiaceae is monophyletic. The topology of phylogenetic trees suggests the isolation of four or probably five genera in family Lycopodiaceae. For these genera, synapomorphic indels were detected. The obtained data were compared with the results of phylogenetic analysis of Lycopsida with regard to other sequences. The relationships of taxa within the family Lycopodiaceae are discussed.
15.
16.
Efficiencies of different genes and different tree-building methods in recovering a known vertebrate phylogeny (total citations: 24; self-citations: 6; citations by others: 18)
The relative efficiencies of different protein-coding genes of the
mitochondrial genome and different tree-building methods in recovering a
known vertebrate phylogeny (two whale species, cow, rat, mouse, opossum,
chicken, frog, and three bony fish species) were evaluated. The
tree-building methods examined were the neighbor joining (NJ), minimum
evolution (ME), maximum parsimony (MP), and maximum likelihood (ML), and
both nucleotide sequences and deduced amino acid sequences were analyzed.
Generally speaking, amino acid sequences were better than nucleotide
sequences in obtaining the true tree (topology) or trees close to the true
tree. However, when only first and second codon positions data were used,
nucleotide sequences produced reasonably good trees. Among the 13 genes
examined, Nd5 produced the true tree in all tree-building methods or
algorithms for both amino acid and nucleotide sequence data. Genes Cytb and
Nd4 also produced the correct tree in most tree-building algorithms when
amino acid sequence data were used. By contrast, Co2, Nd1, and Nd4l showed
a poor performance. In general, large genes produced better results, and
when the entire set of genes was used, all tree-building methods generated
the true tree. In each tree-building method, several distance measures or
algorithms were used, but all these distance measures or algorithms
produced essentially the same results. The ME method, in which many
different topologies are examined, was no better than the NJ method, which
generates a single final tree. Similarly, an ML method, in which many
topologies are examined, was no better than the ML star decomposition
algorithm that generates a single final tree. In ML the best substitution
model chosen by using the Akaike information criterion produced no better
results than simpler substitution models. These results question the
utility of the currently used optimization principles in phylogenetic
construction. Relatively simple methods such as the NJ and ML star
decomposition algorithms seem to produce as good results as those obtained
by more sophisticated methods. The efficiencies of the NJ, ME, MP, and ML
methods in obtaining the correct tree were nearly the same when amino acid
sequence data were used. The most important factor in constructing reliable
phylogenetic trees seems to be the number of amino acids or nucleotides
used.
17.
In the field of phylogenetics and comparative genomics, it is important to establish orthologous relationships when comparing homologous sequences. Because orthologs and paralogs differ only slightly in sequence, paralogs are easily mistaken for orthologs. For this reason, several methods based on evolutionary distance, phylogeny and BLAST have tried to detect orthologs with more precision. Depending on their algorithmic implementations, each of these methods sometimes has increased false negative or false positive rates. Here, we developed a novel algorithm for orthology detection that uses a distance method based on the phylogenetic criterion of minimum evolution. Our algorithm assumes that sets of sequences exhibiting orthologous relationships are evolutionarily less costly than sets that include one or more paralogous relationships. Calculation of evolutionary cost requires the reconstruction of a neighbor-joining (NJ) tree, but calculations are unaffected by the topology of any given NJ tree. Unlike tree reconciliation, our algorithm appears free from the problem of incorrect topologies of species and gene trees. The reliability of the algorithm was tested in a comparative analysis with two other orthology detection methods using 95 manually curated KOG datasets and 21 experimentally verified EXProt datasets. Sensitivity and specificity estimates indicate that the concept of minimum evolution could be valuable for the detection of orthologs.
18.
Applying the tree bisection and reconnection (TBR) algorithm, we have developed a heuristic method (maximum likelihood (ML)-TBR) for inferring the ML tree based on tree topology search. For initial trees from which iterative processes start in ML-TBR, two cases were considered: one is 100 neighbor-joining (NJ) trees based on bootstrap resampling and the other is 100 randomly generated trees. The same ML tree was obtained in both cases. All different iterative processes started from 100 independent initial trees ultimately converged on one optimum tree with the largest log-likelihood value, suggesting that a limited number of initial trees will be quite enough in ML-TBR. This also suggests that the optimum tree corresponds to the global optimum in tree topology space and thus probably coincides with the ML tree inferred by intact ML analysis. This method has been applied to the inference of the phylogenetic tree of the SOX family members. The mammalian testis-determining gene SRY is believed to have evolved from SOX-3, a member of the SOX family, based on several lines of evidence, including their sequence similarity, the location of SOX-3 on the X chromosome and some aspects of their expression. This model should be supported directly from the phylogenetic tree of the SOX family, but no evidence has been provided to date. A recently published NJ tree shows an implausibly remote origin of SRY, suggesting that a more sophisticated method is required for understanding this problem. The ML tree inferred by the present method showed that the SRYs of marsupial and placental mammals form a monophyletic cluster which had diverged from the mammalian SOX-3 in the early evolution of mammals.
19.
Accuracy of phylogenetic trees estimated from DNA sequence data (total citations: 4; self-citations: 1; citations by others: 3)
The relative merits of four different tree-making methods in obtaining the
correct topology were studied by using computer simulation. The methods
studied were the unweighted pair-group method with arithmetic mean (UPGMA),
Fitch and Margoliash's (FM) method, the distance Wagner (DW) method, and
Tateno et al.'s modified Farris (MF) method. An ancestral DNA sequence was
assumed to evolve into eight sequences following a given model tree. Both
constant and varying rates of nucleotide substitution were considered. Once
the DNA sequences for the eight extant species were obtained, phylogenetic
trees were constructed by using corrected (d) and uncorrected (p)
nucleotide substitutions per site. The topologies of the trees obtained
were then compared with that of the model tree. The results obtained can be
summarized as follows: (1) The probability of obtaining the correct rooted
or unrooted tree is low unless a large number of nucleotide differences
exists between different sequences. (2) When the number of nucleotide
substitutions per sequence is small or moderately large, the FM, DW, and MF
methods show a better performance than UPGMA in recovering the correct
topology. The former group of methods is particularly good for obtaining
the correct unrooted tree. (3) When the number of substitutions per
sequence is large, UPGMA is at least as good as the other methods,
particularly for obtaining the correct rooted tree. (4) When the rate of
nucleotide substitution varies with evolutionary lineage, the FM, DW, and
MF methods show a better performance in obtaining the correct topology than
UPGMA, except when a rooted tree is to be produced from data with a large
number of nucleotide substitutions per sequence. (ABSTRACT TRUNCATED AT 250
WORDS)
20.
Relative efficiencies of the maximum-likelihood, neighbor-joining, and maximum-parsimony methods when substitution rate varies with site (total citations: 10; self-citations: 5; citations by others: 5)
The relative efficiencies of the maximum-likelihood (ML), neighbor-joining
(NJ), and maximum-parsimony (MP) methods in obtaining the correct topology
and in estimating the branch lengths for the case of four DNA sequences
were studied by computer simulation, under the assumption either that there
is variation in substitution rate among different nucleotide sites or that
there is no variation. For the NJ method, several different distance
measures (Jukes-Cantor, Kimura two-parameter, and gamma distances) were
used, whereas for the ML method three different transition/transversion
ratios (R) were used. For the MP method, both the standard unweighted
parsimony and the dynamically weighted parsimony methods were used. The
results obtained are as follows: (1) When the R value is high, dynamically
weighted parsimony is more efficient than unweighted parsimony in obtaining
the correct topology. (2) However, both weighted and unweighted parsimony
methods are generally less efficient than the NJ and ML methods even in the
case where the MP method gives a consistent tree. (3) When all the
assumptions of the ML method are satisfied, this method is slightly more
efficient than the NJ method. However, when the assumptions are not
satisfied, the NJ method with gamma distances is slightly better in
obtaining the correct topology than is the ML method. In general, the two
methods show more or less the same performance. The NJ method may give a
correct topology even when the distance measures used are not unbiased
estimators of nucleotide substitutions. (4) Branch length estimates of a
tree with the correct topology are affected more easily than topology by
violation of the assumptions of the mathematical model used, for both the
ML and the NJ methods. Under certain conditions, branch lengths are
seriously overestimated or underestimated. The MP method often gives
serious underestimates for certain branches. (5) Distance measures that
generate the correct topology, with high probability, do not necessarily
give good estimates of branch lengths. (6) The likelihood-ratio test and
the confidence-limit test, in Felsenstein's DNAML, for examining the
statistical of branch length estimates are quite sensitive to violation of
the assumptions and are generally too liberal to be used for actual data.
Rzhetsky and Nei's branch length test is less sensitive to violation of the
assumptions than is Felsenstein's test. (7) When the extent of sequence
divergence is ≤ 5% and when ≥ 1,000 nucleotides are used,
all three methods show essentially the same efficiency in obtaining the
correct topology and in estimating branch lengths. (ABSTRACT TRUNCATED AT
400 WORDS)
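For reference, the Kimura two-parameter distance named in this abstract separates transitions from transversions. The sketch below uses the standard formula d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the observed proportions of transitional and transversional differences. The function name and the handling of gaps are assumptions made for illustration; the simulation settings of the study are not reproduced here.

```python
import math

# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}

def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: sites with gaps or ambiguous bases are ignored.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    diffs = [frozenset((a, b)) for a, b in pairs if a != b]
    p_ts = sum(pair in TRANSITIONS for pair in diffs) / n      # P: transitions
    q_tv = sum(pair not in TRANSITIONS for pair in diffs) / n  # Q: transversions
    # math.log raises ValueError if the sequences are too divergent for the formula.
    return -0.5 * math.log(1 - 2 * p_ts - q_tv) - 0.25 * math.log(1 - 2 * q_tv)
```

When transitions and transversions occur at rates matching the Jukes-Cantor assumption, this estimate and the Jukes-Cantor distance are expected to agree closely; they diverge as the transition/transversion ratio R grows.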