Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
2.
Matrix representation with parsimony (MRP) supertree construction has been criticized because the supertree may specify clades that are contradicted by every source tree contributing to it. Such unsupported clades may also occur using other supertree methods; however, their incidence is largely unknown. In this study, I investigated the frequency of unsupported clades in both simulated and empirical MRP supertrees. Here, I propose a new index, QS, to quantify the qualitative support for a supertree and its clades among the set of source trees. Results show that unsupported clades are very rare in MRP supertrees, occurring most often when there are few source trees that all possess the same set of taxa. However, even under these conditions the frequency of unsupported clades was <0.2%. Unsupported clades were absent from both the Carnivora and Lagomorpha supertrees, reflecting the use of large numbers of source trees for both. The proposed QS indices are correlated broadly with another measure of quantitative clade support (bootstrap frequencies, as derived from resampling of the MRP matrix) but appear to be more sensitive. More importantly, they sample at the level of the source trees and thus, unlike the bootstrap, are suitable for summarizing the support of MRP supertree clades.

3.
Accurate phylogenetic reconstruction methods are inherently computationally heavy and are therefore limited to relatively small numbers of taxa. Supertree construction is the task of amalgamating small trees over partial taxon sets into a big tree over the complete taxon set. The need for fast and accurate supertree methods has become crucial due to the enormous number of new genomic sequences generated by modern technology and the desire to use them for classification purposes. In particular, the Assembling the Tree of Life (ATOL) program aims at constructing the evolutionary history of all living organisms on Earth. When dealing with unrooted trees, a quartet (an unrooted tree over four taxa) is the most basic piece of phylogenetic information. Therefore, quartet amalgamation stands at the heart of any supertree problem, as it concerns combining many minimal pieces of information into a single, coherent, and more comprehensive piece of information. We have devised an extremely fast algorithm for quartet amalgamation and implemented it in very efficient code. The new code can handle over a hundred million quartet trees over several hundred taxa with very high accuracy.

4.
Supertree methods are used to assemble separate phylogenetic trees with shared taxa into larger trees (supertrees) in an effort to construct more comprehensive phylogenetic hypotheses. In spite of much recent interest in supertrees, there are still few methods for supertree construction. The flip supertree problem is an error correction approach that seeks to find a minimum number of changes (flips) to the matrix representation of the set of input trees to resolve their incompatibilities. A previous flip supertree algorithm was limited to finding exact solutions and was only feasible for small input trees. We developed a heuristic algorithm for the flip supertree problem suitable for much larger input trees. We used a series of 48- and 96-taxon simulations to compare supertrees constructed with the flip supertree heuristic algorithm with supertrees constructed using other approaches, including MinCut (MC), modified MC (MMC), and matrix representation with parsimony (MRP). Flip supertrees are generally far more accurate than supertrees constructed using MC or MMC algorithms and are at least as accurate as supertrees built with MRP. The flip supertree method is therefore a viable alternative to other supertree methods when the number of taxa is large.

5.

Background  

Supertree methods synthesize collections of small phylogenetic trees with incomplete taxon overlap into comprehensive trees, or supertrees, that include all taxa found in the input trees. Supertree methods based on the well-established Robinson-Foulds (RF) distance have the potential to build supertrees that retain much information from the input trees. Specifically, the RF supertree problem seeks a binary supertree that minimizes the sum of the RF distances from the supertree to the input trees. Thus, an RF supertree is a supertree that is consistent with the largest number of clusters (or clades) from the input trees.
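The RF supertree objective described above can be illustrated with a minimal sketch. It assumes rooted trees written as nested tuples of taxon names, and it assumes that the distance to an input tree is measured after pruning the supertree down to that tree's taxa; the function names `clusters`, `restrict`, `rf_distance` and `rf_score` are hypothetical and are not taken from the paper.

```python
def clusters(tree):
    """Return (non-root clusters, leaf set) of a rooted tree given as nested tuples."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.add(below)
        return below

    leaves = walk(tree)
    found.discard(leaves)  # the root cluster is shared by every tree on these taxa
    return found, leaves


def restrict(tree, keep):
    """Prune the tree to the taxa in `keep`, suppressing nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [k for k in (restrict(child, keep) for child in tree) if k is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)


def rf_distance(supertree, source):
    """Clusters present in exactly one of: the source tree, the pruned supertree."""
    source_clusters, source_leaves = clusters(source)
    pruned_clusters, _ = clusters(restrict(supertree, source_leaves))
    return len(source_clusters ^ pruned_clusters)


def rf_score(supertree, sources):
    """Sum of RF distances from a candidate supertree to all input trees."""
    return sum(rf_distance(supertree, source) for source in sources)


candidate = ((("a", "b"), "c"), ("d", "e"))
inputs = [(("a", "b"), "d"), (("a", "c"), "b")]
print(rf_score(candidate, inputs))
```

The first input tree agrees with the candidate once the candidate is pruned to its taxa, while the second contributes two mismatched clusters. A search procedure for the RF supertree would compare such scores across candidate trees; that search is not shown here.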

6.
Important desired properties of an algorithm that constructs a supertree (species tree) by reconciling input trees are low complexity and applicability to large biological data sets. In its common formulation, the problem has been proved NP-hard, i.e., to have exponential complexity in practice. We propose a reformulation of the supertree building problem that allows a computationally effective solution. We introduce a biologically natural requirement: the supertree sought must not contain clades incompatible with those existing in the input trees. The algorithm was tested with simulated and biological trees and was shown to have almost quadratic complexity even if horizontal gene transfers (HGTs) are allowed. If HGTs are not assumed, the algorithm is mathematically correct and has a worst-case running time of $n^3 \times |V_0|^3$, where $n$ is the number of input trees and $|V_0|$ is the total number of species. The authors are unaware of analogous solutions in the published literature. The corresponding inference program, its usage examples and manual are freely available at http://lab6.iitp.ru/en/super3gl. The available program does not implement HGTs. The generalized case is described in the publication "A tree nearest in average to a set of trees" (Information Transmission Problems, 2011).

7.
Despite the growing popularity of supertree construction for combining phylogenetic information to produce more inclusive phylogenies, large-scale performance testing of this method has not been done. Through simulation, we tested the accuracy of the most widely used supertree method, matrix representation with parsimony analysis (MRP), with respect to a (maximum parsimony) total evidence solution and a known model tree. When source trees overlap completely, MRP provided a reasonable approximation of the total evidence tree; agreement was usually > 85%. Performance improved slightly when using smaller, more numerous, or more congruent source trees, and especially when elements were weighted in proportion to the bootstrap frequencies of the nodes they represented on each source tree ("weighted MRP"). Although total evidence always estimated the model tree slightly better than nonweighted MRP methods, weighted MRP in turn usually out-performed total evidence slightly. When source studies were even moderately nonoverlapping (i.e., sharing only three-quarters of the taxa), the high proportion of missing data caused a loss in resolution that severely degraded the performance for all methods, including total evidence. In such cases, even combining more trees, which had positive effects elsewhere, did not improve accuracy. Instead, "seeding" the supertree or total evidence analyses with a single largely complete study improved performance substantially. This finding could be an important strategy for any studies that seek to combine phylogenetic information. Overall, our results suggest that MRP supertree construction provides a reasonable approximation of a total evidence solution and that weighted MRP should be used whenever possible.

8.
The supertree algorithm matrix representation with parsimony was used to combine existing hypotheses of coral relationships and provide the most comprehensive species-level estimate of scleractinian phylogeny, comprised of 353 species (27% of extant species), 141 genera (63%) and 23 families (92%) from all seven suborders. The resulting supertree offers a guide for future studies in coral systematics by highlighting regions of concordance and conflict in existing source phylogenies. It should also prove useful in formal comparative studies of character evolution. Phylogenetic effort within Scleractinia has been taxonomically uneven, with a third of studies focussing on the Acroporidae or its most diverse genera. Sampling has also been geographically non-uniform, as tropical, reef-forming taxa have been considered twice as often as non-reef species. The supertree indicated that source trees concur on numerous aspects of coral relationships, such as the division between robust versus complex corals and the distant relationship between families in Archaeocoeniina. The supertree also supported the existence of a large, taxonomically diverse and monophyletic group of corals with many Atlantic representatives having exsert corallites. Another large, unanticipated clade consisted entirely of solitary deep-water species from three families. Important areas of ambiguity include the relationship of Astrocoeniidae to Pocilloporidae and the relative positions of several, mostly deep-water genera of Caryophylliidae. Conservative grafting of species at the base of congeneric groups with uncontroversial monophyletic status resulted in a more comprehensive, though less resolved tree of 1016 taxa.

9.

Background

Supertree methods combine trees on subsets of the full taxon set together to produce a tree on the entire set of taxa. Of the many supertree methods, the most popular is MRP (Matrix Representation with Parsimony), a method that operates by first encoding the input set of source trees by a large matrix (the "MRP matrix") over {0,1, ?}, and then running maximum parsimony heuristics on the MRP matrix. Experimental studies evaluating MRP in comparison to other supertree methods have established that for large datasets, MRP generally produces trees of equal or greater accuracy than other methods, and can run on larger datasets. A recent development in supertree methods is SuperFine+MRP, a method that combines MRP with a divide-and-conquer approach, and produces more accurate trees in less time than MRP. In this paper we consider a new approach for supertree estimation, called MRL (Matrix Representation with Likelihood). MRL begins with the same MRP matrix, but then analyzes the MRP matrix using heuristics (such as RAxML) for 2-state Maximum Likelihood.

Results

We compared MRP and SuperFine+MRP with MRL and SuperFine+MRL on simulated and biological datasets. We examined the MRP and MRL scores of each method on a wide range of datasets, as well as the resulting topological accuracy of the trees. Our experimental results show that MRL, coupled with a very good ML heuristic such as RAxML, produced more accurate trees than MRP, and MRL scores were more strongly correlated with topological accuracy than MRP scores.

Conclusions

SuperFine+MRP, when based upon a good MP heuristic, such as TNT, produces among the best scores for both MRP and MRL, and is generally faster and more topologically accurate than other supertree methods we tested.
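The matrix encoding that both MRP and MRL start from (described in the Background above) can be sketched as follows. This is a minimal illustration that assumes the standard Baum and Ragan style coding (one column per non-root clade of each source tree) and rooted trees written as nested tuples; the function names `clades` and `mrp_matrix` are hypothetical, and the all-zero outgroup row that MRP analyses usually add for rooting is omitted.

```python
def clades(tree):
    """Return (non-root clades, leaf set) of a rooted tree given as nested tuples."""
    found = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.append(below)
        return below

    leaves = walk(tree)
    return [clade for clade in found if clade != leaves], leaves


def mrp_matrix(source_trees):
    """Encode source trees as a taxon -> character string over {0, 1, ?}."""
    parsed = [clades(tree) for tree in source_trees]
    taxa = sorted(set().union(*(leaves for _, leaves in parsed)))
    rows = {taxon: [] for taxon in taxa}
    for tree_clades, leaves in parsed:
        for clade in tree_clades:  # one matrix column per clade
            for taxon in taxa:
                if taxon not in leaves:
                    rows[taxon].append("?")  # taxon missing from this source tree
                elif taxon in clade:
                    rows[taxon].append("1")  # taxon inside the clade
                else:
                    rows[taxon].append("0")  # taxon in the tree but outside the clade
    return {taxon: "".join(states) for taxon, states in rows.items()}


sources = [(("a", "b"), "c"), (("b", "d"), "c")]
for taxon, states in mrp_matrix(sources).items():
    print(taxon, states)
```

The resulting matrix would then be handed to a parsimony heuristic (for MRP) or a 2-state maximum likelihood heuristic (for MRL); that analysis step is outside this sketch.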

10.
Semi-strict supertrees   Cited by: 3 (self-citations: 1; citations by others: 2)
A method to calculate semi-strict supertrees is proposed. The semi-strict supertrees are calculated by creating the matrix that represents all the groups in the source trees (as done in already existing techniques), and then finding the trees determined by the ultra-clique. The ultra-clique is defined as the set of characters where each possible subset is compatible with each possible subset from the entire matrix. Finding the ultra-clique is computationally complex (since in most cases many of the characters have missing entries), but a heuristic method yields reliable results. When the trees have no conflict, or when there are only two trees, the method produces the exact result for any ordering of the input trees and any ordering of the groups within them; when there are more than two trees and they have conflict, a single ordering or sequence can create some spurious groups, but doing multiple sequences eliminates the spurious groups. The method uses only state set operations, and is thus easily implemented in computer programs. Unlike any existing type of supertree, semi-strict supertrees display all the groups, and only those groups, that are implied by at least some combination of the input trees and contradicted by none. The idea that supertrees should take into account the number of occurrences of a given group, so as to retain some groups even in the case of conflict, is discussed; it is argued that a conceptual equivalent of the majority rule consensus is not possible when the sets of taxa differ among trees. Also, when pruning taxa from a set of trees, the supertree can display groups that contradict the consensus for the entire trees, suggesting that supertrees for matrices with very dissimilar sets of taxa should be interpreted with caution. If (for any valid reason) the data cannot be combined in a single matrix, it is advisable that the taxon sets in the matrices be as similar as possible.

11.
When molecules and morphology produce incongruent hypotheses of primate interrelationships, the data are typically viewed as incompatible, and molecular hypotheses are often considered to be better indicators of phylogenetic history. However, it has been demonstrated that the choice of which taxa to include in cladistic analysis as well as assumptions about character weighting, character state transformation order, and outgroup choice all influence hypotheses of relationships and may positively influence tree topology, so that relationships between extant taxa are consistent with those found using molecular data. Thus, the source of incongruence between morphological and molecular trees may lie not in the morphological data themselves but in assumptions surrounding the ways characters evolve and their impact on cladistic analysis. In this study, we investigate the role that assumptions about character polarity and transformation order play in creating incongruence between primate phylogenies based on morphological data and those supported by multiple lines of molecular data. By releasing constraints imposed on published morphological analyses of primates from disparate clades and subjecting those data to parsimony analysis, we test the hypothesis that incongruence between morphology and molecules results from inherent flaws in morphological data. To quantify the difference between incongruent trees, we introduce a new method called branch slide distance (BSD). BSD mitigates many of the limitations attributed to other tree comparison methods, thus allowing for a more accurate measure of topological similarity. We find that releasing a priori constraints on character behavior often produces trees that are consistent with molecular trees. Case studies are presented that illustrate how congruence between molecules and unconstrained morphological data may provide insight into issues of polarity, transformation order, homology, and homoplasy.

12.
We used the supertree approach of matrix representation with parsimony to reconstruct the most exhaustive (genus-level) phylogeny of Cyprinidae to date. The supertree of Cyprinidae, representing 397 taxa (237 nominal genera) and 990 pseudocharacters, was well resolved (96%) through extended consensus majority rule, although 36 nodes (9.4%) were unsupported. The proportion of shared taxa among source trees was very low after calculation of the taxonomic coverage index (TCI = 0.059), which is proposed here as a more accurate alternative to the usual ratios calculated from the number of pseudo-characters or source trees per taxon. We define a new index for the calculation of partitioned qualitative clade support, the partitioned rQS (prQS), which offers a straightforward visualization of the relative supports of source tree partitions at supertree nodes. The use of prQS showed that the molecular source tree partition contributed to most node supports within the supertree of Cyprinidae (73%, contra 21% for the morphological partition) and evidenced a fair proportion of conflict at nodes between the two partitions (21%), notably reflecting (i) the greater number and resolution of molecular source trees, and (ii) potential morphological convergences. Most of the higher-level relationships within Cyprinidae were supported by both morphological and molecular source tree partitions. Our supertree showed a well-supported dichotomy between a clade consisting of a ‘barbine’ + ‘rasborine’ lineage, sister group to (Barbinae [paraphyletic], (Cyprininae, Labeoninae)), and a clade consisting of other rasborines (large polytomy) and the two monophyletic groups ((Tincinae, Tanichthys), (Ecocarpia, (Acheilognathinae, (Gobioninae, Leuciscinae)))) and (Squaliobarbinae, (Xenocyprinae, Cultrinae)). Through the non-monophyly of almost all the traditional subfamilies of Cyprinidae and 34 genera, our supertree exemplified the taxonomic chaos that reigns in the classification of the family. It also highlighted that further efforts should aim at increasing taxonomic sampling and generating alternative phylogenetic signals, notably for the still poorly understood Tincinae, Squaliobarbinae, Acheilognathinae, Gobioninae, and Rasborinae, the latter representing a key taxon for the understanding of early cyprinid evolution. Our supertree also proved useful for testing macro-evolutionary scenarios at a wide taxonomic scale. Ancestral reconstructions using linear parsimony confirmed that the Oriental tropical region was the centre of origin of Cyprinidae, and identified three Oriental-to-Palaearctic, two Palaearctic-to-Nearctic, and one Oriental-to-Afrotropical major migration events. On the other hand, we almost completely rejected the hypothesis of presence of barbels as a plesiomorphic condition within Cyprinidae (although ambiguous for maxillary barbels of the Barbinae-Cyprininae type). The supertree of Cyprinidae serves as a basis to discuss the applications and bias of the newly proposed prQS, to provide future guidelines for a better achievement of cyprinid phylogeny, and to elaborate further on inter-continental migrations and the adaptive value of barbels.

13.

Background  

Supertree methods combine phylogenies with overlapping sets of taxa into a larger one. Topological conflicts frequently arise among source trees for methodological or biological reasons, such as long branch attraction, lateral gene transfers, gene duplication/loss or deep gene coalescence. When topological conflicts occur among source trees, liberal methods infer supertrees containing the most frequent alternative, while veto methods infer supertrees not contradicting any source tree, i.e. discard all conflicting resolutions. When the source trees host a significant number of topological conflicts or have a small taxon overlap, supertree methods of both kinds can propose poorly resolved, hence uninformative, supertrees.

14.
Suppose that a family of rooted phylogenetic trees $T_i$ with different sets $X_i$ of leaves is given. A supertree for the family is a single rooted tree $T$ whose leaf set is the union of all the $X_i$, such that the branching information in $T$ corresponds to the branching information in all the trees $T_i$. This paper proposes a polynomial-time method BUILD-WITH-DISTANCES that makes essential use of distance information provided by the trees $T_i$ to construct a rooted tree $S_0$. When a supertree also containing the distance information exists, then $S_0$ is a supertree. The supertree $S_0$ often shows increased resolution over the trees found by methods that utilize only the topology of the input trees. When no supertree exists because the input trees are incompatible, several variants of the method are described which still produce trees with provable properties.

15.
Supertree methods construct trees on a set of taxa (species) by combining many smaller trees on overlapping subsets of the entire set of taxa. A ‘quartet’ is an unrooted tree over four taxa; hence quartet-based supertree methods combine many four-taxon unrooted trees into a single, coherent tree over the complete set of taxa. Quartet-based phylogeny reconstruction methods have been receiving considerable attention in recent years. An accurate and efficient quartet-based method might be competitive with the current best phylogenetic tree reconstruction methods (such as maximum likelihood or Bayesian MCMC analyses), without being as computationally intensive. In this paper, we present a novel and highly accurate quartet-based phylogenetic tree reconstruction method. We performed an extensive experimental study to evaluate the accuracy and scalability of our approach on both simulated and biological datasets.
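To make the notion of a quartet concrete, the sketch below lists the quartet topologies induced by a tree. It is only an illustration of the input data that quartet-based methods work with, not the reconstruction method of the paper. It assumes the tree is written as nested tuples with an arbitrary root (the rooting does not affect the quartets), and the function name `induced_quartets` is hypothetical.

```python
from itertools import combinations


def induced_quartets(tree):
    """Return the resolved quartets ab|cd of a tree, each as a frozenset of two pairs."""
    clusters = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        clusters.append(below)
        return below

    leaves = walk(tree)
    quartets = set()
    for four in combinations(sorted(leaves), 4):
        chosen = frozenset(four)
        for cluster in clusters:
            inside = chosen & cluster
            # A cluster holding exactly two of the four taxa separates them
            # from the other two, which fixes the quartet topology.
            if len(inside) == 2:
                quartets.add(frozenset([inside, chosen - inside]))
                break
    return quartets


tree = ((("a", "b"), "c"), ("d", "e"))
for quartet in sorted(sorted("".join(sorted(pair)) for pair in q) for q in induced_quartets(tree)):
    print("|".join(quartet))
```

Four-taxon subsets that no cluster splits two against two are unresolved in the tree and are simply left out of the result.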

16.
One way to build larger, more comprehensive phylogenies is to combine the vast amount of phylogenetic information already available. We review the two main strategies for accomplishing this (combining raw data versus combining trees), but employ a relatively new variant of the latter: supertree construction. The utility of one supertree technique, matrix representation using parsimony analysis (MRP), is demonstrated by deriving a complete phylogeny for all 271 extant species of the Carnivora from 177 literature sources. Beyond providing a 'consensus' estimate of carnivore phylogeny, the tree also indicates taxa for which the relationships remain controversial (e.g. the red panda; within canids, felids, and hyaenids) or have not been studied in any great detail (e.g. herpestids, viverrids, and intrageneric relationships in the procyonids). Times of divergence throughout the tree were also estimated from 74 literature sources based on both fossil and molecular data. We use the phylogeny to show that some lineages within the Mustelinae and Canidae contain significantly more species than expected for their age, illustrating the tree's utility for studies of macroevolution. It will also provide a useful foundation for comparative and conservational studies involving the carnivores.

17.
Large and comprehensive phylogenetic trees are desirable for studying macroevolutionary processes and for classification purposes. Such trees can be obtained in two different ways. Either the widest possible range of taxa can be sampled and used in a phylogenetic analysis to produce a "big tree," or preexisting topologies can be used to create a supertree. Although large multigene analyses are often favored, combinable data are not always available, and supertrees offer a suitable solution. The most commonly used method of supertree reconstruction, matrix representation with parsimony (MRP), is presented here. We used a combined data set for the Poaceae to (1) assess the differences between an approach that uses combined data and one that uses different MRP modifications based on the character partitions and (2) investigate the advantages and disadvantages of these modifications. The Baum and Ragan modification and the Purvis modification gave similar results. Incorporating bootstrap support associated with pre-existing topologies improved the Baum and Ragan modification and its similarity to a combined analysis. Finally, we used the supertree reconstruction approach on 55 published phylogenies to build one of the most comprehensive phylogenetic trees published for the grass family, including 403 taxa, and discuss its strengths and weaknesses in relation to other published hypotheses.

18.
Systematists and comparative biologists commonly want to make statements about relationships among taxa that have never been collectively included in any single phylogenetic analysis. Construction of phylogenetic 'supertrees' provides one solution. Supertrees are estimates of phylogeny assembled from sets of smaller estimates (source trees) sharing some but not necessarily all their taxa in common. If certain conditions are met, supertrees can retain all or most of the information from the source trees and also make novel statements about relationships of taxa that do not co-occur on any one source tree. Supertrees have commonly been constructed using subjective and informal approaches, but several explicit approaches have recently been proposed.

19.
Supertree methods are used to construct a large tree over a large set of taxa from a set of small trees over overlapping subsets of the complete taxon set. Since accurate reconstruction methods are currently limited to a maximum of a few dozen taxa, the use of a supertree method to construct the tree of life is inevitable. Supertree methods are broadly divided according to the input trees. When the input trees are unrooted, the basic reconstruction unit is a quartet tree. In this case, the basic decision problem of whether there exists a tree that agrees with all quartets is NP-complete. On the other hand, when the input trees are rooted, the basic reconstruction unit is a rooted triplet and the above decision problem has a polynomial-time algorithm. When no tree agrees with all triplets, it would be desirable to find the tree that agrees with the maximum number of triplets; however, this optimization problem was shown to be NP-hard. Current heuristic approaches perform a min cut on a graph representing the triplet inconsistency and return a tree that is guaranteed to satisfy some required properties. In this work, we present a different heuristic approach that guarantees the properties provided by the current methods, and we give experimental evidence that it significantly outperforms currently used methods. This method is based on a divide-and-conquer approach, where the min cut in the divide step is replaced by a max cut in a variant of the same graph. The latter is achieved by a lightweight semidefinite programming-like heuristic that leads to very fast running times.
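The polynomial-time decision problem for rooted triplets mentioned above can be sketched with the classic BUILD procedure of Aho et al. The abstract does not name its algorithm, so treating BUILD as the intended one is an assumption. In this sketch a triplet ab|c is written as the tuple `(("a", "b"), "c")`, and the function name `build` and the nested-tuple output format are illustrative choices.

```python
def build(leaves, triplets):
    """Return a rooted tree (nested tuples) displaying every triplet, or None if none exists."""
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))

    # Keep only triplets whose three taxa all lie in the current leaf set.
    relevant = [t for t in triplets if {t[0][0], t[0][1], t[1]} <= leaves]

    # Union-find: for each triplet ab|c, a and b must fall in the same child subtree.
    parent = {leaf: leaf for leaf in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), _ in relevant:
        parent[find(a)] = find(b)

    groups = {}
    for leaf in leaves:
        groups.setdefault(find(leaf), set()).add(leaf)
    if len(groups) == 1:
        return None  # every taxon is forced together: the triplets conflict

    subtrees = []
    for group in sorted(groups.values(), key=min):
        subtree = build(group, relevant)
        if subtree is None:
            return None
        subtrees.append(subtree)
    return tuple(subtrees)


compatible = [(("a", "b"), "c"), (("c", "d"), "a")]
conflicting = [(("a", "b"), "c"), (("a", "c"), "b")]
print(build("abcd", compatible))
print(build("abc", conflicting))
```

When the procedure returns None, no tree agrees with all triplets, which is the situation the min-cut and max-cut heuristics in the abstract are designed for; this sketch only covers the decision step.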

20.
A phylogenetic supertree of oscine passerine birds (Aves: Passeri)   Cited by: 1 (self-citations: 0; citations by others: 1)
Oscine passerine birds make up almost half of all avian diversity. Relationships within the group, and its classification, have long been controversial. Over the last 10 years numerous molecular phylogenies have been published. We compiled source phylogenies from 99 published studies to construct an oscine supertree. We aimed to illustrate weak and strong parts of the phylogeny and set targets for future phylogenetic work and therefore preferred a heuristic approach where we judged the adequacy of taxon sampling and molecular method of each source tree instead of using matrices and automated tree-building programs. We present an estimate of the phylogenetic relationships of 1723 extant and one extinct species of oscine passerine birds (Aves: Passeri), more than 37% of the total. We included 34/35 (97%) families, 38/39 (97%) subfamilies and 40/43 (93%) tribes. Overall resolution is 83% of a fully bifurcating tree. The basal lineages are all distributed in the Australo-Papuan region, but several more distal lineages dispersed out of this region and radiated in other parts of the world. However, taxa of the Australian region suffer from larger evolutionary gaps and the deep branches of the Sylvioidea and nine South American primaried oscines are still poorly resolved.

