首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
The popular neighbor-joining (NJ) algorithm used in phylogenetics is a greedy algorithm for finding the balanced minimum evolution (BME) tree associated to a dissimilarity map. From this point of view, NJ is "optimal" when the algorithm outputs the tree which minimizes the balanced minimum evolution criterion. We use the fact that the NJ tree topology and the BME tree topology are determined by polyhedral subdivisions of the spaces of dissimilarity maps to study the optimality of the neighbor-joining algorithm. In particular, we investigate and compare the polyhedral subdivisions for n ≤ 8. This requires the measurement of volumes of spherical polytopes in high dimension, which we obtain using a combination of Monte Carlo methods and polyhedral algorithms. Our results include a demonstration that highly unrelated trees can be co-optimal in BME reconstruction, and that NJ regions are not convex. We obtain the l 2 radius for neighbor-joining for n = 5 and we conjecture that the ability of the neighbor-joining algorithm to recover the BME tree depends on the diameter of the BME tree.  相似文献   

2.
Due to its speed, the distance approach remains the best hope for building phylogenies on very large sets of taxa. Recently (R. Desper and O. Gascuel, J. Comp. Biol. 9:687-705, 2002), we introduced a new "balanced" minimum evolution (BME) principle, based on a branch length estimation scheme of Y. Pauplin (J. Mol. Evol. 51:41-47, 2000). Initial simulations suggested that FASTME, our program implementing the BME principle, was more accurate than or equivalent to all other distance methods we tested, with running time significantly faster than Neighbor-Joining (NJ). This article further explores the properties of the BME principle, and it explains and illustrates its impressive topological accuracy. We prove that the BME principle is a special case of the weighted least-squares approach, with biologically meaningful variances of the distance estimates. We show that the BME principle is statistically consistent. We demonstrate that FASTME only produces trees with positive branch lengths, a feature that separates this approach from NJ (and related methods) that may produce trees with branches with biologically meaningless negative lengths. Finally, we consider a large simulated data set, with 5,000 100-taxon trees generated by the Aldous beta-splitting distribution encompassing a range of distributions from Yule-Harding to uniform, and using a covarion-like model of sequence evolution. FASTME produces trees faster than NJ, and much faster than WEIGHBOR and the weighted least-squares implementation of PAUP*. Moreover, FASTME trees are consistently more accurate at all settings, ranging from Yule-Harding to uniform distributions, and all ranges of maximum pairwise divergence and departure from molecular clock. Interestingly, the covarion parameter has little effect on the tree quality for any of the algorithms. FASTME is freely available on the web.  相似文献   

3.
Minimum evolution is the guiding principle of an important class of distance-based phylogeny reconstruction methods, including neighbor-joining (NJ), which is the most cited tree inference algorithm to date. The minimum evolution principle involves searching for the tree with minimum length, where the length is estimated using various least-squares criteria. Since evolutionary distances cannot be known precisely but only estimated, it is important to investigate the robustness of phylogenetic reconstruction to imprecise estimates for these distances. The safety radius is a measure of this robustness: it consists of the maximum relative deviation that the input distances can have from the correct distances, without compromising the reconstruction of the correct tree structure. Answering some open questions, we here derive the safety radius of two popular minimum evolution criteria: balanced minimum evolution (BME) and minimum evolution based on ordinary least squares (OLS + ME). Whereas BME has a radius of \frac12\frac{1}{2}, which is the best achievable, OLS + ME has a radius tending to 0 as the number of taxa increases. This difference may explain the gap in reconstruction accuracy observed in practice between OLS + ME and BME (which forms the basis of popular programs such as NJ and FastME).  相似文献   

4.
Understanding the face structure of the balanced minimal evolution (BME) polytope, especially its top-dimensional facets, is a fundamental problem in phylogenetic theory. We show that BME polytope has a sublattice of its poset of faces which is isomorphic to a quotient of the well-studied permutoassociahedron. This sublattice corresponds to compatible sets of splits displayed by phylogenetic trees and extends the lattice of faces of the BME polytope found by Hodge, Haws and Yoshida. Each of the maximal elements in our new poset of faces corresponds to a single split of the leaves. Nearly all of these turn out to actually be facets of the BME polytope, a collection of facets which grows exponentially.  相似文献   

5.
It is well known among phylogeneticists that adding an extra taxon (e.g. species) to a data set can alter the structure of the optimal phylogenetic tree in surprising ways. However, little is known about this “rogue taxon” effect. In this paper we characterize the behavior of balanced minimum evolution (BME) phylogenetics on data sets of this type using tools from polyhedral geometry. First we show that for any distance matrix there exist distances to a “rogue taxon” such that the BME-optimal tree for the data set with the new taxon does not contain any nontrivial splits (bipartitions) of the optimal tree for the original data. Second, we prove a theorem which restricts the topology of BME-optimal trees for data sets of this type, thus showing that a rogue taxon cannot have an arbitrary effect on the optimal tree. Third, we computationally construct polyhedral cones that give complete answers for BME rogue taxon behavior when our original data fits a tree on four, five, and six taxa. We use these cones to derive sufficient conditions for rogue taxon behavior for four taxa, and to understand the frequency of the rogue taxon effect via simulation.  相似文献   

6.
The Minimum Evolution (ME) approach to phylogeny estimation has been shown to be statistically consistent when it is used in conjunction with ordinary least-squares (OLS) fitting of a metric to a tree structure. The traditional approach to using ME has been to start with the Neighbor Joining (NJ) topology for a given matrix and then do a topological search from that starting point. The first stage requires O(n(3)) time, where n is the number of taxa, while the current implementations of the second are in O(p n(3)) or more, where p is the number of swaps performed by the program. In this paper, we examine a greedy approach to minimum evolution which produces a starting topology in O(n(2)) time. Moreover, we provide an algorithm that searches for the best topology using nearest neighbor interchanges (NNIs), where the cost of doing p NNIs is O(n(2) + p n), i.e., O(n(2)) in practice because p is always much smaller than n. The Greedy Minimum Evolution (GME) algorithm, when used in combination with NNIs, produces trees which are fairly close to NJ trees in terms of topological accuracy. We also examine ME under a balanced weighting scheme, where sibling subtrees have equal weight, as opposed to the standard "unweighted" OLS, where all taxa have the same weight so that the weight of a subtree is equal to the number of its taxa. The balanced minimum evolution scheme (BME) runs slower than the OLS version, requiring O(n(2) x diam(T)) operations to build the starting tree and O(p n x diam(T)) to perform the NNIs, where diam(T) is the topological diameter of the output tree. In the usual Yule-Harding distribution on phylogenetic trees, the diameter expectation is in log(n), so our algorithms are in practice faster that NJ. Moreover, this BME scheme yields a very significant improvement over NJ and other distance-based algorithms, especially with large trees, in terms of topological accuracy.  相似文献   

7.
Many phylogenetic algorithms search the space of possible trees using topological rearrangements and some optimality criterion. FastME is such an approach that uses the {em balanced minimum evolution (BME)} principle, which computer studies have demonstrated to have high accuracy. FastME includes two variants: {em balanced subtree prune and regraft (BSPR)} and {em balanced nearest neighbor interchange (BNNI)}. These algorithms take as input a distance matrix and a putative phylogenetic tree. The tree is modified using SPR or NNI operations, respectively, to reduce the BME length relative to the distance matrix, until a tree with (locally) shortest BME length is found. Following computer simulations, it has been conjectured that BSPR and BNNI are consistent, i.e. for an input distance that is a tree-metric, they converge to the corresponding tree. We prove that the BSPR algorithm is consistent. Moreover, even if the input contains small errors relative to a tree-metric, we show that the BSPR algorithm still returns the corresponding tree. Whether BNNI is consistent remains open.  相似文献   

8.
The phylogenetic positioning of the non-pathogenic genusSpiromastix in the Onygenales was studied based on large subunit rDNA (LSU rDNA) partial sequences (ca. 570 bp.). FourSpiromastix species and 28 representative taxa of the Onygenales were newly sequenced. Phylogenetic trees were constructed by the neighbor-joining (NJ) method and evaluated by the maximum parsimony (MP) method with the data of 13 taxa retrieved from DNA databases.Spiromastix and dimorphic systemic pathogens,Ajellomyces andParacoccidioides, appear to be a monophyletic group with 74% bootstrap probability (BP) in the NJ tree constructed with the representative taxa of the Onygenales. The tree topology was concordant with the NJ tree based on SSU rDNA sequences of our previous work and corresponded to the classification system of the Onygenales by Currah (1985) and its minor modification by Udagawa (1997) with the exception of the classification of the Onygenaceae. The Onygeneceae sensu Udagawa may still be polyphyletic, since three independent lineages were recognized. The taxa forming helicoid peridial appendages were localized to two clades on the tree. The topology of the NJ tree constructed withSpiromastix and its close relatives suggested that the helicoid peridial appendages were apomorphic and acquired independently in the two clades of the Onygenales.  相似文献   

9.
Our ability to construct very large phylogenetic trees is becoming more important as vast amounts of sequence data are becoming readily available. Neighbor joining (NJ) is a widely used distance-based phylogenetic tree construction method that has historically been considered fast, but it is prohibitively slow for building trees from increasingly large datasets. We developed a fast variant of NJ called relaxed neighbor joining (RNJ) and performed experiments to measure the speed improvement over NJ. Since repeated runs of the RNJ algorithm generate a superset of the trees that repeated NJ runs generate, we also assessed tree quality. RNJ is dramatically faster than NJ, and the quality of resulting trees is very similar for the two algorithms. The results indicate that RNJ is a reasonable alternative to NJ and that it is especially well suited for uses that involve large numbers of taxa or highly repetitive procedures such as bootstrapping. [Reviewing Editor: Dr. James Bull]  相似文献   

10.
Liu L  Yu L 《Systematic biology》2011,60(5):661-667
In this study, we develop a distance method for inferring unrooted species trees from a collection of unrooted gene trees. The species tree is estimated by the neighbor joining (NJ) tree built from a distance matrix in which the distance between two species is defined as the average number of internodes between two species across gene trees, that is, average gene-tree internode distance. The distance method is named NJ(st) to distinguish it from the original NJ method. Under the coalescent model, we show that if gene trees are known or estimated correctly, the NJ(st) method is statistically consistent in estimating unrooted species trees. The simulation results suggest that NJ(st) and STAR (another coalescence-based method for inferring species trees) perform almost equally well in estimating topologies of species trees, whereas the Bayesian coalescence-based method, BEST, outperforms both NJ(st) and STAR. Unlike BEST and STAR, the NJ(st) method can take unrooted gene trees to infer species trees without using an outgroup. In addition, the NJ(st) method can handle missing data and is thus useful in phylogenomic studies in which data sets often contain missing loci for some individuals.  相似文献   

11.
Aim We aimed to elucidate how the current geographic distribution of alpine plants in the Japanese archipelago was shaped during Quaternary climatic oscillations, using Potentilla matsumurae as a case study. According to previous phylogeographic studies, post‐glacial range fragmentation (vicariance scenario) and stepwise migration (dispersal scenario) are both possible. We thus aimed to assess which scenario is more probable for the distribution changes of alpine plants in the Japanese archipelago. Location The alpine zone in the Japanese archipelago. Methods Using amplified fragment length polymorphism we determined the genotype of 161 individuals of P. matsumurae from 22 populations. Relationships among individuals and populations were examined using principal coordinates analysis and a neighbour‐joining (NJ) tree, respectively. To examine the genetic population structure, we performed analysis of molecular variance (amova ) and structure analysis. Results Differentiation between central Honshu and northern Japan was not very strong based on the principal coordinates analysis among individuals, the NJ tree of populations (59% bootstrap support), or amova (12% of genetic variation). Moreover, structure analysis did not detect clear geographic differentiation across populations. Although the populations in central Honshu were structured geographically (Mantel test: r = 0.45, P < 0.005; NJ tree), those in northern Japan did not exhibit geographic structure regardless of geographic distance (Mantel test: r = 0.26, P = 0.03; NJ tree). Population relationships in the NJ tree did not always reflect the geographic location. Main conclusions The current geographic structure of P. matsumurae could not be explained by stepwise migration. This suggests that a single continuous distribution during the last glacial period was later fragmented, perhaps by recovering forest, during the post‐glacial period, resulting in the current distribution and phylogeographic structure of P. matsumurae. Our data support the vicariance scenario.  相似文献   

12.
Tree squirrels (Tamiasciurus) are important selective agents on conifer reproductive strategies (Smith 1970, 1975). Although this is well established for wind-dispersed pines, the impact of tree squirrels on bird-dispersed pines has been largely ignored. I assessed the impact of tree squirrels on the allocation of reproductive energy in the bird-dispersed limber pine (Pinus flexilis) by comparing its cone and seed traits from three sites in the Rocky Mountains where tree squirrels (Tamiasciurus) are present to those from three mountain ranges in the Great Basin where tree squirrels are absent. As predicted, differences between the two regions in individual cone and seed traits are consistent with the hypothesis that tree squirrels are important selective agents on these traits. In the absence of tree squirrels, limber pine allocates more than twice as much energy to kernel compared with that invested in putative seed defenses (cone, resin, and seed coat) as does limber pine where tree squirrels are present. Such a large difference is particularly striking, because tree squirrels may have become extinct in the Great Basin in only the last 12,000 yr. Although many factors influence the allocation of energy to cones and seeds, no single factor other than the presence of tree squirrels is compatible with the large and consistent differences between limber pine in the Rocky Mountains and Great Basin. These results show that tree squirrels are an important constraint on the evolution of cone and seed traits that promote the dispersal of seeds by birds.  相似文献   

13.
Objective

In mathematical phylogenetics, a labeled rooted binary tree topology can possess any of a number of labeled histories, each of which represents a possible temporal ordering of its coalescences. Labeled histories appear frequently in calculations that describe the combinatorics of phylogenetic trees. Here, we generalize the concept of labeled histories from rooted phylogenetic trees to rooted phylogenetic networks, specifically for the class of rooted phylogenetic networks known as rooted galled trees.

Results

Extending a recursive algorithm for enumerating the labeled histories of a labeled tree topology, we present a method to enumerate the labeled histories associated with a labeled rooted galled tree. The method relies on a recursive decomposition by which each gall in a galled tree possesses three or more descendant subtrees. We exhaustively provide the numbers of labeled histories for all small galled trees, finding that each gall reduces the number of labeled histories relative to a specified galled tree that does not contain it.

Conclusion

The results expand the set of structures for which labeled histories can be enumerated, extending a well-known calculation for phylogenetic trees to a class of phylogenetic networks.

  相似文献   

14.
重建系统演化树的一种新方法--试错法   总被引:1,自引:0,他引:1  
谭远德 《动物学报》2000,46(4):448-456
重建系统演化树是进化研究的一个极为重要的方面。系统树的构建依赖于一定的方法和数据。在分子系统演化研究中,所使用的数据大多是DNA序列、氨基酸序列和分子标记。而就构树方法来说,NJ法、ML法和MP法是三种最为普遍使用的方法。本文给出了一种新的建树方法,即试错法。该方法不但具有与NJ法一样好的建树效果,而且不存在难以解释的负枝长问题。  相似文献   

15.
A rapid heuristic algorithm for finding minimum evolution trees   总被引:2,自引:0,他引:2  
The minimum sum of branch lengths (S), or the minimum evolution (ME) principle, has been shown to be a good optimization criterion in phylogenetic inference. Unfortunately, the number of topologies to be analyzed is computationally prohibitive when a large number of taxa are involved. Therefore, simplified, heuristic methods, such as the neighbor-joining (NJ) method, are usually employed instead. The NJ method analyzes only a small number of trees (compared with the size of the entire search space); so, the tree obtained may not be the ME tree (for which the S value is minimum over the entire search space). Different compromises between very restrictive and exhaustive search spaces have been proposed recently. In particular, the "stepwise algorithm" (SA) utilizes what is known in computer science as the "beam search," whereas the NJ method employs a "greedy search." SA is virtually guaranteed to find the ME trees while being much faster than exhaustive search algorithms. In this study we propose an even faster method for finding the ME tree. The new algorithm adjusts its search exhaustiveness (from greedy to complete) according to the statistical reliability of the tree node being reconstructed. It is also virtually guaranteed to find the ME tree. The performances and computational efficiencies of ME, SA, NJ, and our new method were compared in extensive simulation studies. The new algorithm was found to perform practically as well as the SA (and, therefore, ME) methods and slightly better than the NJ method. For searching for the globally optimal ME tree, the new algorithm is significantly faster than existing ones, thus making it relatively practical for obtaining all trees with an S value equal to or smaller than that of the NJ tree, even when a large number of taxa is involved.  相似文献   

16.
The neighbor-joining (NJ) method is widely used in reconstructing large phylogenies because of its computational speed and the high accuracy in phylogenetic inference as revealed in computer simulation studies. However, most computer simulation studies have quantified the overall performance of the NJ method in terms of the percentage of branches inferred correctly or the percentage of replications in which the correct tree is recovered. We have examined other aspects of its performance, such as the relative efficiency in correctly reconstructing shallow (close to the external branches of the tree) and deep branches in large phylogenies; the contribution of zero-length branches to topological errors in the inferred trees; and the influence of increasing the tree size (number of sequences), evolutionary rate, and sequence length on the efficiency of the NJ method. Results show that the correct reconstruction of deep branches is no more difficult than that of shallower branches. The presence of zero-length branches in realized trees contributes significantly to the overall error observed in the NJ tree, especially in large phylogenies or slowly evolving genes. Furthermore, the tree size does not influence the efficiency of NJ in reconstructing shallow and deep branches in our simulation study, in which the evolutionary process is assumed to be homogeneous in all lineages. Received: 7 March 2000 / Accepted: 2 August 2000  相似文献   

17.
Longleaf pine (Pinus palustris Mill.) cones have been counted annually by the United States Forest Service (USFS) at eleven locations throughout the species’ range since 1958. These data have been useful for understanding spatiotemporal patterns in longleaf pine cone production, and are beneficial in timing regeneration efforts. Variations in annual mast (i.e. seed crop) are known to influence ring widths in numerous tree species, yet this relationship is poorly understood for longleaf pine. This research examines the relationship between longleaf pine cone data and tree-ring growth from trees sampled in the multi-decadal USFS cone-crop study. We examined cone–radial growth relationships using individual tree-ring data and proprietary cone data for each tree from six sites in four locations in the southeastern USA. We found that longleaf pine cones were correlated with basal area increment growth (BAI) over the three-year cone-development cycle. Low BAI years were more frequently associated with above-average cone crop and BAI during years that coincided with the largest cone-crop class (bumper, > 100 cones per tree) were statistically less than any other cone class. We prepared linear models that predicted radial growth using PDSI and cones as predictors, and found that including cones in the models did not improve adjusted R2 values. We conclude that while cone production is inversely related to radial growth, the combination of infrequent bumper years and the concentration of cone production by a few trees per stand, creates an environment where radial-growth chronologies assembled from longleaf pine for dendroclimatic purposes are unlikely to be significantly influenced by reproductive strain.  相似文献   

18.
The Mediterranean cypress (Cupressus sempervirens) is a multi-purpose tree widely used in the Mediterranean region. An anthropogenic range expansion of cypress has taken place at the northern margin of the range in Italy in recent decades, driven by ornamental planting in spite of climatic constraints imposed by low winter temperature. The expansion has created new habitats for pathogens and pests, which strongly limit tree survival in the historical (core) part of the range. Based on the enemy release hypothesis, we predicted that damage should be lower in the expansion area. By comparing tree and seed cone damage by pathogens and pests in core and expansion areas of Trentino, a district in the southern Alps, we showed that tree damage was significantly higher in the core area. Seed cones of C. sempervirens are intensively colonized by an aggressive and specific pathogen (the canker fungus Seiridium cardinale, Coelomycetes), associated with seed insect vectors Megastigmus wachtli (Hymenoptera Torymidae) and Orsillus maculatus (Heteroptera Lygaeidae). In contrast, we observed lower tree damage in the expansion area, where a non-aggressive fungus (Pestalotiopsis funerea, Coelomycetes) was more frequently associated with the same insect vectors. Our results indicate that both insect species have a great potential to reach the range margin, representing a continuous threat of the arrival of fungal pathogens to trees planted at extreme sites. Global warming may accelerate this process since both insects and fungi profit from increased temperature. In the future, cypress planted at the range margin may then face similar pest and pathogen threats as in the historical range.  相似文献   

19.
Betelvine (Piper betle L., family Piperaceae) is an important, traditional and widely cultivated crop of India. The cultivators and consumers recognize more than 100 cultivars (landraces) based on regional and organoleptic considerations, while in terms of phytochemical constituents only five groups have been identified for all the landraces. Since betelvine is an obligate vegetatively propagated species, genomic changes, if any, may have become ‘fixed’ in the landraces. We carried out random amplified polymorphic DNA (RAPD) analysis in several landraces considered in four groups, namely, ‘Kapoori’, ‘Bangla’, ‘Sanchi’ and ‘Others’ in order to ascertain their genetic diversity. On the basis of the data from eleven RAPD primers, we distinguished genetic variation within and among the four groups of landraces. The results indicate the’Kapoori’ group is the most diverse. The neighbour joining (NJ) tree after a bootstrap (500 replicate) test of robustness clearly shows the four groups to be well separated. Interestingly, all known male or female betelvine landraces have separated in the NJ tree indicating an apparent gender-based distinction among the betelvines.  相似文献   

20.
Clearcut: a fast implementation of relaxed neighbor joining   总被引:1,自引:0,他引:1  
SUMMARY: Clearcut is an open source implementation for the relaxed neighbor joining (RNJ) algorithm. While traditional neighbor joining (NJ) remains a popular method for distance-based phylogenetic tree reconstruction, it suffers from a O(N(3)) time complexity, where N represents the number of taxa in the input. Due to this steep asymptotic time complexity, NJ cannot reasonably handle very large datasets. In contrast, RNJ realizes a typical-case time complexity on the order of N(2)logN without any significant qualitative difference in output. RNJ is particularly useful when inferring a very large tree or a large number of trees. In addition, RNJ retains the desirable property that it will always reconstruct the true tree given a matrix of additive pairwise distances. Clearcut implements RNJ as a C program, which takes either a set of aligned sequences or a pre-computed distance matrix as input and produces a phylogenetic tree. Alternatively, Clearcut can reconstruct phylogenies using an extremely fast standard NJ implementation. AVAILABILITY: Clearcut source code is available for download at: http://bioinformatics.hungry.com/clearcut  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号