Similar literature
20 similar articles found.
1.
We conducted a simulation study of the phylogenetic methods UPGMA, neighbor joining, maximum parsimony, and maximum likelihood for a five-taxon tree under a molecular clock. The parameter space included a small region where maximum parsimony is inconsistent, so we tested inconsistency correction for parsimony and distance correction for neighbor joining. As expected, corrected parsimony was consistent. For these data, maximum likelihood with the clock assumption outperformed each of the other methods tested. The distance-based methods performed marginally better than did maximum parsimony and maximum likelihood without the clock assumption. Data correction was generally detrimental to accuracy, especially for short sequence lengths. We identified another region of the parameter space where, although a given method was consistent, some incorrect trees were each selected with up to twice the frequency of the correct (generating) tree for sequences of bounded length. These incorrect trees are those where the outgroup has been incorrectly placed. In addition to this problem, the placement of the outgroup sequence can have a confounding effect on the ingroup tree, whereby the ingroup is correct when using the ingroup sequences alone, but with the inclusion of the outgroup the ingroup tree becomes incorrect.

2.
Using simulated data, we compared five methods of phylogenetic tree estimation: parsimony, compatibility, maximum likelihood, Fitch-Margoliash, and neighbor joining. For each combination of substitution rates and sequence length, 100 data sets were generated for each of 50 trees, for a total of 5,000 replications per condition. Accuracy was measured by two measures of the distance between the true tree and the estimate of the tree, one measure sensitive to accuracy of branch lengths and the other not. The distance-matrix methods (Fitch-Margoliash and neighbor joining) performed best when they were constrained from estimating negative branch lengths; all comparisons with other methods used this constraint. Parsimony and compatibility had similar results, with compatibility generally inferior; Fitch-Margoliash and neighbor joining had similar results, with neighbor joining generally slightly inferior. Maximum likelihood was the most successful method overall, although for short sequences Fitch-Margoliash and neighbor joining were sometimes better. Bias of the estimates was inferred by measuring whether the independent estimates of a tree for different data sets were closer to the true tree than to each other. Parsimony and compatibility had particular difficulty with inaccuracy and bias when substitution rates varied among different branches. When rates of evolution varied among different sites, all methods showed signs of inaccuracy and bias.
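
The abstract does not name its two tree-distance measures. As an illustration of a measure that ignores branch lengths, the sketch below counts the clades found in one rooted tree but not the other (a Robinson-Foulds-style symmetric difference). The nested-tuple tree encoding and the function names are assumptions made for this example, not the authors' procedure.

```python
def clades(tree):
    """Collect the non-trivial clades of a rooted tree.

    The tree is given as nested tuples with string leaf names,
    e.g. (("A", "B"), ("C", ("D", "E"))).  This encoding is a
    hypothetical convenience for the example.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree
    return found


def clade_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


true_tree = (("A", "B"), ("C", ("D", "E")))
estimate = (("A", "C"), ("B", ("D", "E")))
print(clade_distance(true_tree, estimate))
```

A distance of zero means the two topologies agree on every clade; branch lengths play no role in this measure.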

3.
We examine whether phylogenetic methods provide biased estimates of tree shape with respect to the random branching model. We investigate the performance of five commonly used phylogenetic methods using computer simulation: (1) maximum parsimony; (2) neighbor joining; (3) UPGMA with an outgroup taxon; (4) UPGMA without an outgroup taxon; and (5) maximum likelihood. All methods provide estimates of tree shape that are, on average, more asymmetrical than the true tree, especially when rates of evolution are high. We suggest a simple explanation for the bias and propose a modified test of tree shape that corrects for it.
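
The abstract does not say which tree-shape statistic was used. One common way to quantify asymmetry is the Colless imbalance index, which sums, over all internal nodes, the absolute difference in leaf counts between the two daughter subtrees. The sketch below assumes a rooted binary tree written as nested 2-tuples; both the choice of index and the encoding are assumptions for illustration.

```python
def colless(tree):
    """Return (leaf count, Colless imbalance) for a rooted binary tree.

    Leaves are strings; internal nodes are 2-tuples (hypothetical encoding).
    """
    if isinstance(tree, str):
        return 1, 0
    left, right = tree
    n_left, imbalance_left = colless(left)
    n_right, imbalance_right = colless(right)
    total_imbalance = imbalance_left + imbalance_right + abs(n_left - n_right)
    return n_left + n_right, total_imbalance


balanced = (("A", "B"), ("C", "D"))
caterpillar = ((("A", "B"), "C"), "D")
print(colless(balanced))     # (4, 0)
print(colless(caterpillar))  # (4, 3)
```

Higher values indicate a more asymmetrical tree, so a bias of the kind reported would show up as estimated trees scoring higher on average than the generating trees.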

4.
AFLPs (and to a lesser extent ISSRs and RAPDs) are increasingly being used for phylogenetic inference among closely related species. Presence/absence characters for each AFLP allele treat all absences as homologous to one another. With three or more alleles, terminals are grouped by their shared absence of alleles in character-based phylogenetic-inference methods in a manner that is not redundant with their shared presence of an alternative allele. We conducted simulations to quantify how severe the negative effect of using presence/absence characters of individual bands is for phylogenetic inference relative to standard multistate characters. We examined alternative tree topologies, relative branch lengths, numbers of characters, rates of evolution, and numbers of alternative alleles, using both parsimony and Nei-and-Li distance analyses. Multistate parsimony generally outperformed presence/absence parsimony, which in turn outperformed Nei-and-Li distance. Increasing the character-state space (i.e., the number of alternative character states available) was found to be advantageous for all three methods of analysis examined, but was most advantageous for multistate parsimony. However, the advantage of multistate parsimony relative to Nei-and-Li distance decreased when applied to more divergent characters. More parsimony-informative variation generally alleviated the problem associated with scoring multistate characters as presence/absence characters. The ensemble consistency index was lower for presence/absence characters relative to multistate characters.
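
For orientation, the Nei-and-Li similarity between two band profiles is 2·n_xy / (n_x + n_y), where n_xy is the number of shared bands and n_x, n_y are the band counts of each profile. The sketch below uses the simple dissimilarity 1 − similarity; whether the study used this form or a transformed version is not stated in the abstract, so treat that choice, and the function name, as assumptions.

```python
def nei_li_distance(bands_x, bands_y):
    """Nei-and-Li (Dice) dissimilarity between two 0/1 band profiles.

    Shared absences (0 in both profiles) do not enter the formula.
    """
    if len(bands_x) != len(bands_y):
        raise ValueError("profiles must score the same bands")
    shared = sum(1 for x, y in zip(bands_x, bands_y) if x == 1 and y == 1)
    total = sum(bands_x) + sum(bands_y)
    if total == 0:
        raise ValueError("no bands present in either profile")
    return 1.0 - 2.0 * shared / total


x = [1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 1]
print(nei_li_distance(x, y))
# Appending bands that are absent in both profiles leaves the distance unchanged.
print(nei_li_distance(x + [0, 0], y + [0, 0]))
```

Unlike presence/absence parsimony, this distance gives no weight to shared absences, which is one way to see why the two approaches can respond differently to the same AFLP matrix.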

5.
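To investigate the effect of evolutionary models on DNA barcoding classification, this study obtained COI gene sequences from specimens of 44 species of Noctuidae from Wuling Mountain. Phylogenetic trees were constructed using neighbor joining, maximum parsimony, maximum likelihood, and Bayesian inference, and model success rates were evaluated for 12 neighbor-joining models, 7 maximum-likelihood models, and 2 Bayesian models. The results show that the success rates of the 12 neighbor-joining models differed little and were relatively stable; the success rates of different models under maximum likelihood and Bayesian inference differed markedly and were unstable; maximum parsimony is not model-based, and its success rate was relatively stable. Neighbor joining and maximum likelihood shared six models, and the success rates of these six models differed between the two methods. In addition, cases in which a species was represented by only a single sequence in the molecular data significantly reduced model success rates, indicating that DNA barcoding studies need multiple samples per species.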

6.
Parsimony, likelihood, and simplicity   Cited by: 2 (self-citations: 1, citations by others: 1)
The latest charge against parsimony in phylogenetic inference is that it involves estimating too many parameters. The charge is derived from the fact that, when each character is allowed a branch length vector of its own (instead of the homogeneous branch lengths assumed in current likelihood models), the results for likelihood and parsimony are identical. Parsimony, however, can also be derived from simpler models, involving fewer parameters. Therefore, parsimony provides (as many authors had argued before) the simplest explanation of the data, or the most realistic, depending on one's views. If (as argued by likelihoodists) phylogenetic inference is to use the simplest model that provides sufficient explanation of the data, the starting point of phylogenetic analyses should be parsimony, not maximum likelihood. If the addition of new parameters (which increase the likelihood) to a parsimony estimation is seen as desirable, this may lead to a preference for results based on current likelihood models. If the addition of parameters is continued, however, the results will eventually come back to the same place where they had started, since allowing each character a branch length of its own also produces parsimony. Parsimony can be justified by very different types of models, either very complex or very simple. This suggests that parsimony does have a unique place among methods of phylogenetic estimation.

7.
ABSTRACT: BACKGROUND: The unbranched filamentous green alga Spirogyra (Streptophyta, Zygnemataceae) is easily recognizable based on its vegetative morphology, which shows one to several spiral chloroplasts. This simple structure falsely points to a low genetic diversity: Spirogyra is commonly excluded from phylogenetic analyses because the genus is known as a long-branch taxon caused by a high evolutionary rate. RESULTS: We focused on this genetic diversity and sequenced 130 Spirogyra small subunit nuclear ribosomal DNA (SSU rDNA) strands of different origin. The resulting SSU rDNA sequences were used for phylogenetic analyses using complex evolutionary models (posterior probability, maximum likelihood, neighbor joining, and maximum parsimony methods). The sequences were between 1672 and 1779 nucleotides long. Sequence comparisons revealed 53 individual clones, but our results still support monophyly of the genus. Our data set did not contain a single slow-evolving taxon that would have been placed on a shorter branch compared to the remaining sequences. Out of 130 accessions analyzed, 72 showed a.

8.
A comparison of phylogenetic network methods using computer simulation   Cited by: 1 (self-citations: 0, citations by others: 1)

Background

We present a series of simulation studies that explore the relative performance of several phylogenetic network approaches (statistical parsimony, split decomposition, union of maximum parsimony trees, neighbor-net, simulated history recombination upper bound, median-joining, reduced median joining and minimum spanning network) compared to standard tree approaches (neighbor-joining and maximum parsimony) in the presence and absence of recombination.

Principal Findings

In the absence of recombination, all methods recovered the correct topology and branch lengths nearly all of the time when the substitution rate was low, except for minimum spanning networks, which did considerably worse. At a higher substitution rate, maximum parsimony and union of maximum parsimony trees were the most accurate. With recombination, the ability to infer the correct topology was halved for all methods and no method could accurately estimate branch lengths.

Conclusions

Our results highlight the need for more accurate phylogenetic network methods and the importance of detecting and accounting for recombination in phylogenetic studies. Furthermore, we provide useful information for choosing a network algorithm and a framework in which to evaluate improvements to existing methods and novel algorithms developed in the future.

9.
To elucidate the phylogenetic relationships among groups of the order Entomobryomorpha (Collembola), sequences of the ITS 1 to ITS 2 fragment of the rRNA gene were analyzed in 11 species of three families. To avoid the potential risks and inconsistencies of a single method or data set, the phylogenetic reconstructions were based on three different approaches: maximum parsimony, maximum likelihood, and neighbor joining. The inferred phylogenies supported monophyly of the order Entomobryomorpha. The relationships between families differed among methods, but the orders of branching within each family were the same. Entomobryidae and Isotomidae were paraphyletic, whereas Tomoceridae was monophyletic. Tomoceridae was subdivided into two branches; the molecular analysis provided results distinctive enough to separate the two genera with a high bootstrap value. On the other hand, two different populations of putative Homidia koreana appeared to be different species, although their chaetotaxy is identical. A wide coverage of characters, including not only morphological characters but also genetic data such as allozymes and DNA sequences, will give a more accurate picture of the classification and phylogeny of the studied group.

10.
Allozyme data are widely used to infer the phylogenies of populations and closely related species. Numerous parsimony, distance, and likelihood methods have been proposed for phylogenetic analysis of these data; the relative merits of these methods have been debated vigorously, but their accuracy has not been well explored. In this study, I compare the performance of 13 phylogenetic methods (six parsimony, six distance, and continuous maximum likelihood) by applying a congruence approach to eight allozyme data sets from the literature. Clades are identified that are supported by multiple data sets other than allozymes (e.g. morphology, DNA sequences), and the ability of different methods to recover these 'known' clades is compared. The results suggest that (1) distance and likelihood methods generally outperform parsimony methods, (2) methods that utilize frequency data tend to perform well, and (3) continuous maximum likelihood is among the most accurate methods, and appears to be robust to violations of its assumptions. These results are in agreement with those from recent simulation studies, and help provide a basis for empirical workers to choose among the many methods available for analysing allozyme characters.

11.
The nucleotide substitution matrix inferred from avian data sets using cytochrome b differs considerably from the models commonly used in phylogenetic analyses. To analyze the possible effects of this particular pattern of change in phylogeny estimation we performed a computer simulation in which we started with a real sequence and used the inferred model of change to produce a tree of 10 species. Maximum parsimony (MP), maximum likelihood (ML), and various distance methods were then used to recover the topology and the branch lengths. We used two kinds of data with varying levels of variation. In addition, we tested with the removal of third positions and different weighting schemes. At low levels of variation, MP was outstanding in recovering the topology (90% correct), while unweighted pair-group method, arithmetic average (UPGMA), regardless of distances used, was poor (40%). At the higher level, most methods had a chance of around 40%-58% of finding the true tree. However, in most cases, the trees found were only slightly wrong, with only one or a few branches misplaced. On the other hand, the use of a "wrong" model had serious effects on the estimation of branch lengths (distances). Although precision was high, accuracy was poor with most methods, giving branch lengths that were biased downward. When seeded with the true distance matrix, Fitch and NJ always found the true tree, while UPGMA frequently failed to do so. The effect of removing third positions was dramatic at low levels of variation, because only one MP program was able to find a true tree at all, albeit rarely, while none of the others ever did so. At higher levels, the situation was better, but still much worse than with the whole data set.

12.
A phylogenetic method is a consistent estimator of phylogeny if and only if it is guaranteed to give the correct tree, given that sufficient (possibly infinite) independent data are examined. The following methods are examined for consistency: UPGMA (unweighted pair-group method, averages), NJ (neighbor joining), MF (modified Farris), and P (parsimony). A two-parameter model of nucleotide sequence substitution is used, and the expected distribution of character states is calculated. Without perfect correction for superimposed substitutions, all four methods may be inconsistent if there is but one branch evolving at a faster rate than the other branches. Partial correction of observed distances improves the robustness of the NJ method to rate variation, and perfect correction makes the NJ method a consistent estimator for all combinations of rates that were examined. The sensitivity of all the methods to unequal rates varies over a wide range, so relative-rate tests are unlikely to be a reliable guide for accepting or rejecting phylogenies based on parsimony analysis.
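
To make "correction for superimposed substitutions" concrete, the sketch below computes a corrected distance under Kimura's two-parameter model, d = −½·ln(1 − 2P − Q) − ¼·ln(1 − 2Q), where P and Q are the observed proportions of transitions and transversions. That the abstract's "two-parameter model" is Kimura's is an assumption here, and the function assumes aligned, gap-free sequences containing only A, C, G, and T.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter corrected distance between two aligned sequences.

    Raises ValueError from math.log if the sequences are too divergent
    for the correction to be defined.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    n = len(seq_a)
    p = transitions / n
    q = transversions / n
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


observed = 2 / 10  # two differing sites out of ten
corrected = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
print(observed, corrected)
```

The corrected value exceeds the raw proportion of differing sites because the formula adds back substitutions that were hidden by later changes at the same site.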

13.
A simulation study was carried out to investigate the relative importance of tree topology (both balance and stemminess), evolutionary rates (constant, varying among characters, and varying among lineages), and evolutionary models in determining the accuracy with which phylogenetic trees can be estimated. The three evolutionary context models were phyletic (characters can change at each simulated time step), speciational (changes are possible only at the time of speciation into two daughter lineages), and punctuational (changes occur at the time of speciation but only in one of the daughter lineages). UPGMA clustering and maximum parsimony ("Wagner trees") methods for estimating phylogenies were compared. All trees were based on eight recent OTUs. The three evolutionary context models were found to have the largest influence on accuracy of estimates by both methods. The next most important effect was that of the stemminess × context interaction. Maximum parsimony and UPGMA performed worst under the punctuational models. Under the phyletic model, trees with high stemminess values could be estimated more accurately and balanced trees were slightly easier to estimate than unbalanced ones. Overall, maximum parsimony yielded more accurate trees than UPGMA, but that was expected for these simulations since many more characters than OTUs were used. Our results suggest that the great majority of estimated phylogenetic trees are likely to be quite inaccurate; they underscore the inappropriateness of characterizing current phylogenetic methods as being for reconstruction rather than for estimation.

14.
Yang Z. Systematic Biology, 1998, 47(1): 125-133
The effect of the evolutionary rate of a gene on the accuracy of phylogeny reconstruction was examined by computer simulation. The evolutionary rate is measured by the tree length, that is, the expected total number of nucleotide substitutions per site on the phylogeny. DNA sequence data were simulated using both fixed trees with specified branch lengths and random trees with branch lengths generated from a model of cladogenesis. The parsimony and likelihood methods were used for phylogeny reconstruction, and the proportion of correctly recovered branch partitions by each method was estimated. Phylogenetic methods including parsimony appear quite tolerant of multiple substitutions at the same site. The optimum levels of sequence divergence were even higher than upper limits previously suggested for saturation of substitutions, indicating that the problem of saturation may have been exaggerated. Instead, the lack of information at low levels of divergence should be seriously considered in evaluation of a gene's phylogenetic utility, especially when the gene sequence is short. The performance of parsimony, relative to that of likelihood, does not necessarily decrease with the increase of the evolutionary rate.

15.
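This paper analyzes partial 18S rDNA sequences from 29 genera in 21 families of 4 orders of extant ostracods (Ostracoda) and uses maximum likelihood (ML), neighbor joining (NJ), and maximum parsimony (MP) to attempt to construct a molecular phylogenetic tree of ostracods. Drawing on the morphological characters and fossil record of ostracods, it mainly discusses the phylogenetic relationships of the orders Podocopida and Myodocopida and their superfamily-level taxa. All three methods support the morphological delimitation of Podocopida, Myodocopida, and the superfamily Cypridinacea, but they cast doubt on the systematic position of the superfamily Bairdiacea within Podocopida; this group may not be a monophyletic natural group. The analyses show that Podocopida, Myodocopida, Platycopida, and Halocypridina form a monophyletic group, and that ostracods may have undergone multiple radiations at the order, superfamily, family, and genus levels.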

16.
Comprehensive phylogenetic trees are essential tools to better understand evolutionary processes. For many groups of organisms or projects aiming to build the Tree of Life, comprehensive phylogenetic analysis implies sampling hundreds to thousands of taxa. For the tree of all life, this number rises to a highly conservative 13 million. Here, we assessed the performance of methods to reconstruct large trees using Monte Carlo simulations with parameters inferred from four large angiosperm DNA matrices, containing between 141 and 567 taxa. For each data set, parameters of the HKY85+G model were estimated and used to simulate 20 new matrices for sequence lengths from 100 to 10,000 base pairs. Maximum parsimony and neighbor joining were used to analyze each simulated matrix. In our simulations, accuracy was measured by counting the number of nodes in the model tree that were correctly inferred. The accuracy of the two methods increased very quickly with the addition of characters before reaching a plateau around 1000 nucleotides for all tree sizes simulated. An increase in the number of taxa from 141 to 567 did not significantly decrease the accuracy of the methods used, despite the increase in the complexity of tree space. Moreover, the distribution of branch lengths rather than the rate of evolution was found to be the most important factor for accurately inferring these large trees. Finally, a tree containing 13,000 taxa was created to represent a hypothetical tree of all angiosperm genera and the efficiency of phylogenetic reconstructions was tested with simulated matrices containing an increasing number of nucleotides up to a maximum of 30,000. Even with such a large tree, our simulations suggested that simple heuristic searches were able to infer up to 80% of the nodes correctly.

17.
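The progesterone receptor gene of selected species of the suborder Anthropoidea was analyzed to reconstruct the phylogenetic relationships of the suborder. Progesterone receptor coding-region sequences were amplified and sequenced from anthropoid species representing 14 genera, and phylogenetic relationships were reconstructed from these sequence data using neighbor joining, maximum parsimony, and maximum likelihood. Except within Platyrrhini, the trees produced by the three methods had similar topologies with high support at each node. The reconstructed relationships within Hominoidea and Cercopithecoidea support the widely accepted classification. This study provides evidence for a sister-group relationship between chimpanzees and humans, suggesting that chimpanzees are closer to humans than gorillas or other apes and monkeys are. The phylogenetic relationships among the three platyrrhine families Atelidae, Cebidae, and Pitheciidae were not well resolved in this study.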

18.
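Validity of the species status of the dwarf blue sheep inferred from the mitochondrial cytochrome b gene   Cited by: 21 (self-citations: 0, citations by others: 21)
The complete 1140 bp sequence of the mitochondrial Cyt b gene was analyzed in 18 skull or skin specimens of blue sheep and dwarf blue sheep collected in Sichuan Province (for one sample only an 802 bp fragment was obtained). Phylogenetic trees reconstructed separately with NJ, MP, and ML had identical topologies, and none supported the dwarf blue sheep as a monophyletic group. The results suggest that all samples studied belong to a single species, the blue sheep P. nayaur, and do not support species status for the dwarf blue sheep. Based on the phylogenetic tree built from the complete Cyt b sequences of 17 samples, together with genetic variation and geographic distribution, these blue sheep samples cluster into five clades; according to geographic divisions and the results of this analysis, the blue sheep of Sichuan can be divided into five populations: Motianling, western Sichuan, northwestern Sichuan, southwestern Sichuan, and northern Sichuan. Based on the 1137 bp coding region of the Cyt b gene in 17 samples, 16 haplotypes were defined, and no haplotypes were shared among the five populations.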

19.
Evolutionary biologists have adopted simple likelihood models for purposes of estimating ancestral states and evaluating character independence on specified phylogenies; however, for purposes of estimating phylogenies by using discrete morphological data, maximum parsimony remains the only option. This paper explores the possibility of using standard, well-behaved Markov models for estimating morphological phylogenies (including branch lengths) under the likelihood criterion. An important modification of standard Markov models involves making the likelihood conditional on characters being variable, because constant characters are absent in morphological data sets. Without this modification, branch lengths are often overestimated, resulting in potentially serious biases in tree topology selection. Several new avenues of research are opened by an explicitly model-based approach to phylogenetic analysis of discrete morphological data, including combined-data likelihood analyses (morphology + sequence data), likelihood ratio tests, and Bayesian analyses.
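
Conditioning on variability can be read as dividing each character's likelihood by the probability that a character is variable, that is, by one minus the total probability of all constant patterns under the same tree and model. The sketch below shows only that bookkeeping step; the per-character likelihoods and the constant-pattern probability are taken as given inputs, and the function and parameter names are hypothetical rather than taken from the paper.

```python
import math


def conditional_log_likelihood(site_likelihoods, constant_probability):
    """Log-likelihood of the observed characters, conditional on variability.

    site_likelihoods: unconditional likelihood of each observed (variable)
        character under some tree and Markov model.
    constant_probability: total probability, under the same tree and model,
        that a character is constant across all taxa.
    """
    if not 0.0 <= constant_probability < 1.0:
        raise ValueError("constant_probability must lie in [0, 1)")
    n = len(site_likelihoods)
    unconditional = sum(math.log(value) for value in site_likelihoods)
    return unconditional - n * math.log(1.0 - constant_probability)


# Two characters with unconditional likelihoods 0.1 and 0.2, on a tree where
# half of all characters would be expected to be constant.
print(conditional_log_likelihood([0.1, 0.2], 0.5))
```

Because the correction term depends on the branch lengths through the constant-pattern probability, it changes which branch lengths maximize the likelihood, not just the likelihood's scale.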

20.
Convergence in nucleotide composition (CNC) in unrelated lineages is a factor potentially affecting the performance of most phylogeny reconstruction methods. Such convergence has deleterious effects because unrelated lineages show similarities due to similar nucleotide compositions and not shared histories. While some methods (such as the LogDet/paralinear distance measure) avoid this pitfall, the amount of convergence in nucleotide composition necessary to deceive other phylogenetic methods has never been quantified. We examined analytically the relationship between convergence in nucleotide composition and the consistency of parsimony as a phylogenetic estimator for four taxa. Our results show that rather extreme amounts of convergence are necessary before parsimony begins to prefer the incorrect tree. Ancillary observations are that (for unweighted Fitch parsimony) transition/transversion bias contributes to the impact of CNC and, for a given amount of CNC and fixed branch lengths, data sets exhibiting substantial site-to-site rate heterogeneity present fewer difficulties than data sets in which rates are homogeneous. We conclude by reexamining a data set originally used to illustrate the problems caused by CNC. Using simulations, we show that in this case the convergence in nucleotide composition alone is insufficient to cause any commonly used methods to fail, and accounting for other evolutionary factors (such as site-to-site rate heterogeneity) can give a correct inference without accounting for CNC.
