Similar literature
20 similar records found
1.
Majority-rule reduced consensus trees and their use in bootstrapping   Total citations: 3, self-citations: 0, citations by others: 3
Bootstrap analyses are usually summarized with majority-rule component consensus trees. This consensus method is based on replicated components and, like all component consensus methods, it is insensitive to other kinds of agreement between trees. Recently developed reduced consensus methods can be used to summarize much additional agreement on hypothesised phylogenetic relationships among multiple trees. The new methods are "strict" in the sense that they require agreement among all the trees being compared for any relationships to be represented in a consensus tree. Majority-rule reduced consensus methods are described and their use in bootstrap analyses is illustrated with a hypothetical and a real example. The new methods provide summaries of the bootstrap proportions of all n-taxon statements/partitions and facilitate the identification of hypotheses of relationships that are supported by high bootstrap proportions, in spite of a lack of support for particular components or clades. In practice majority-rule reduced consensus profiles may contain many trees. The size of the profile can be reduced by constraints on minimal bootstrap proportions and/or cardinality of the included trees. Majority-rule reduced consensus trees can also be selected a posteriori from the profile. Surrogates to the majority-rule reduced consensus methods using partition tables or tree pruning options provided by widely used phylogenetic inference software are also described. The methods are designed to produce more informative summaries of bootstrap analyses and thereby foster more informed assessment of the strengths and weaknesses of complex phylogenetic hypotheses.
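The ordinary majority-rule component consensus that this abstract takes as its starting point can be sketched in a few lines. This is only an illustration of the component method, not of the reduced consensus methods the paper introduces. The representation of a tree as a set of clades, the function names, and the toy replicates are all assumptions made for the example.

```python
from collections import Counter

def clade_frequencies(trees):
    """Return the fraction of trees that contain each clade.

    Each tree is modelled as a set of clades, and each clade as a
    frozenset of taxon names (an assumed, simplified representation).
    """
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}

def majority_rule(trees, cutoff=0.5):
    """Keep the clades found in more than `cutoff` of the trees."""
    return {c for c, f in clade_frequencies(trees).items() if f > cutoff}

# Hypothetical bootstrap replicates on the taxa A-E.
AB, ABC, CD, AC = (frozenset(s) for s in ("AB", "ABC", "CD", "AC"))
replicates = [{AB, ABC}, {AB, ABC}, {AB, CD}, {AC, ABC}]

consensus = majority_rule(replicates)
```

Here the clades AB and ABC each occur in three of the four replicates and are retained, while CD and AC are dropped. The values returned by `clade_frequencies` play the role of bootstrap proportions for components only.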

2.
Semi-strict supertrees   Total citations: 3, self-citations: 1, citations by others: 2
A method to calculate semi‐strict supertrees is proposed. The semi‐strict supertrees are calculated by creating the matrix that represents all the groups in the source trees (as done in already existing techniques), and then finding the trees determined by the ultra‐clique. The ultra‐clique is defined as the set of characters where each possible subset is compatible with each possible subset from the entire matrix. Finding the ultra‐clique is computationally complex (since in most cases many of the characters have missing entries), but a heuristic method yields reliable results. When the trees have no conflict, or when there are only two trees, the method produces the exact result for any ordering of the input trees and any ordering of the groups within them; when there are more than two trees and they have conflict, a single ordering or sequence can create some spurious groups, but doing multiple sequences eliminates the spurious groups. The method uses only state set operations, and is thus easily implemented in computer programs. Unlike any existing type of supertree, semi‐strict supertrees display all the groups, and only those groups, that are implied by at least some combination of the input trees and contradicted by none. The idea that supertrees should take into account the number of occurrences of a given group, so as to retain some groups even in the case of conflict, is discussed; it is argued that a conceptual equivalent of the majority rule consensus is not possible when the sets of taxa differ among trees. Also, when pruning taxa from a set of trees, the supertree can display groups that contradict the consensus for the entire trees, suggesting that supertrees for matrices with very dissimilar sets of taxa should be interpreted with caution. If (for any valid reason) the data cannot be combined in a single matrix, it is advisable that the taxon sets in the matrices be as similar as possible.
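The "present in some tree, contradicted by none" criterion is easiest to see in the simpler setting where all trees share the same taxa. The sketch below covers only that complete-taxon-set case (the semi-strict consensus); it does not implement the ultra-clique heuristic or handle missing entries. Trees as sets of rooted clades, the function names, and the example trees are assumptions for illustration.

```python
def clades_compatible(a, b):
    """Two rooted clades can coexist if they are nested or disjoint."""
    return a <= b or b <= a or not (a & b)

def semi_strict(trees):
    """Clades found in at least one tree and contradicted by no tree.

    Assumes every tree is a set of frozenset clades over the same taxa.
    """
    candidates = set().union(*trees)
    return {
        c for c in candidates
        if all(clades_compatible(c, d) for tree in trees for d in tree)
    }

# Hypothetical input trees on the taxa A-E.
AB, ABC, BC, DE = (frozenset(s) for s in ("AB", "ABC", "BC", "DE"))
trees = [{AB, ABC}, {DE}, {ABC, BC}]

result = semi_strict(trees)
```

In this example AB and BC contradict each other and are both dropped, whereas ABC and DE are contradicted by nothing and are kept even though DE appears in only one tree.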

3.
The increasing availability of large genomic data sets as well as the advent of Bayesian phylogenetics facilitates the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered as a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low‐dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group‐specific consensus phylogenies. treespace also provides a user‐friendly web interface for interactive data analysis and is integrated alongside existing standards for phylogenetics. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results.

4.
Although recent studies indicate that estimating phylogenies from alignments of concatenated genes greatly reduces the stochastic error, the potential for systematic error still remains, heightening the need for reliable methods to analyze multigene data sets. Consensus methods provide an alternative, more inclusive, approach for analyzing collections of trees arising from multiple genes. We extend a previously described consensus network method for genome-scale phylogeny (Holland, B. R., K. T. Huber, V. Moulton, and P. J. Lockhart. 2004. Using consensus networks to visualize contradictory evidence for species phylogeny. Mol. Biol. Evol. 21:1459-1461) to incorporate additional information. This additional information could come from bootstrap analysis, Bayesian analysis, or various methods to find confidence sets of trees. The new methods can be extended to include edge weights representing genetic distance. We use three data sets to illustrate the approach: 61 genes from 14 angiosperm taxa and one gymnosperm, 106 genes from eight yeast taxa, and 46 members of a gene family from 15 vertebrate taxa.

5.
This paper examines a recent proposal to calculate supertrees by minimizing the sum of subtree prune‐and‐regraft distances to the input trees. The supertrees thus calculated may display groups present in a minority of the input trees but contradicted by the majority, or groups that are not supported by any input tree or combination of input trees. The proponents of the method themselves stated that these are serious problems of “matrix representation with parsimony”, but they can in fact occur in their own method. The majority rule supertrees, being explicitly clade‐based, cannot have these problems, and seem much more suited to retrieving common clades from a set of trees with different taxon sets. However, it is dubious that so‐called majority rule supertrees can always be interpreted as displaying those clades present in (or compatible with) a majority of the trees. The majority rule consensus is always a median tree, in terms of the Robinson–Foulds distances (i.e. it minimizes the sum of Robinson–Foulds distances to the input trees). In contrast, majority rule supertrees may not be median—different, contradictory trees may minimize Robinson–Foulds distances, while their strict consensus does not. If being “majority” results from being median in Robinson–Foulds distances, this means that in the supertree setting a “majority” is ambiguously defined, sometimes achievable only by mutually contradictory trees.
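The median property mentioned for the majority rule consensus can be checked on a small example. The sketch below is a simplification: it treats trees as sets of rooted clades and computes the Robinson–Foulds distance as the size of the symmetric difference of those sets, rather than working with unrooted bipartitions. The function names and the toy trees are assumptions for illustration.

```python
def rf_distance(t1, t2):
    """Robinson-Foulds distance as the number of clades found in
    exactly one of the two trees (rooted-clade simplification)."""
    return len(t1 ^ t2)

def total_rf(candidate, trees):
    """Sum of Robinson-Foulds distances from a candidate to all inputs."""
    return sum(rf_distance(candidate, t) for t in trees)

# Hypothetical input trees on the taxa A-E, all with the same taxon set.
AB, ABC, CD = (frozenset(s) for s in ("AB", "ABC", "CD"))
trees = [{AB, ABC}, {AB, ABC}, {AB, CD}]

majority = {AB, ABC}       # clades present in more than half of the trees
strict = {AB}              # clades present in every tree
minority = {AB, CD}        # one of the input trees

scores = {
    "majority": total_rf(majority, trees),
    "strict": total_rf(strict, trees),
    "minority": total_rf(minority, trees),
}
```

With these inputs the totals are 2 for the majority rule tree, 3 for the strict consensus, and 4 for the minority tree, so the majority rule tree has the smallest sum among the candidates tried. The abstract's point is that this guarantee applies to consensus trees over identical taxon sets and need not carry over to supertrees.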

6.
When dissimilarity matrices of faunistic and phylogenetic beta‐diversity turnover indices are projected in dendrograms, a high frequency of ties and zero values produces trees whose topology and bootstrap support are affected by the order of areas in the original presence–absence matrix. We tested the magnitude of this bias and developed R functions to obtain consensus trees after shuffling of matrix row order and applied this algorithm to a multiscale bootstrap procedure. Our functions not only solve the bias of row order but, owing to varying support for different bootstrap scales, reveal fundamental characteristics about the structure of species assemblages.

7.
ON CONSENSUS, COLLAPSIBILITY, AND CLADE CONCORDANCE   Total citations: 1, self-citations: 0, citations by others: 1
Abstract — Consensus in cladistics is reviewed. Consensus trees, which summarize the agreement in grouping among a set of cladograms, are distinguished from compromise trees, which may contain groups that do not appear in all the cladograms being compared. Only a strict or Nelson tree is an actual consensus. This distinction has implications for the concept of support for cladograms: only those branches supported under all possible optimizations are unambiguously supported. We refer to such cladograms as strictly supported, in contrast to the semistrictly (ambiguously) supported cladograms output by various current microcomputer programs for cladistic analysis. Such semistrictly supported cladograms may be collapsed, however, by a variety of options in various programs. Consideration of collapsibility and optimization on multifurcations leads to some conclusions on the use of consensus. Consensus tree length provides information about character conflict that occurs between, not within, cladograms. We propose the clade concordance index, which employs the consensus tree length to measure inter-cladogram character conflict for all characters among a set of cladograms.

8.
Many phylogenetic methods produce large collections of trees as opposed to a single tree, which allows the exploration of support for various evolutionary hypotheses. However, to be useful, the information contained in large collections of trees should be summarized; frequently this is achieved by constructing a consensus tree. Consensus trees display only those signals that are present in a large proportion of the trees. However, by their very nature consensus trees require that any conflicts between the trees are necessarily disregarded. We present a method that extends the notion of consensus trees to allow the visualization of conflicting hypotheses in a consensus network. We demonstrate the utility of this method in highlighting differences amongst maximum likelihood bootstrap values and Bayesian posterior probabilities in the placental mammal phylogeny, and also in comparing the phylogenetic signal contained in amino acid versus nucleotide characters for hexapod monophyly.

9.
Collections of phylogenetic trees are usually summarized using consensus methods. These methods build a single tree, supposed to be representative of the collection. However, in the case of heterogeneous collections of trees, the resulting consensus may be poorly resolved (strict consensus, majority-rule consensus, ...), or may perform arbitrary choices among mutually incompatible clades, or splits (greedy consensus). Here, we propose an alternative method, which we call the multipolar consensus (MPC). Its aim is to display all the splits having a support above a predefined threshold, in a minimum number of consensus trees, or poles. We show that the problem is equivalent to a graph-coloring problem, and propose an implementation of the method. Finally, we apply the MPC to real data sets. Our results indicate that, typically, all the splits down to a weight of 10% can be displayed in no more than 4 trees. In addition, in some cases, biologically relevant secondary signals, which would not have been present in any of the classical consensus trees, are indeed captured by our method, indicating that the MPC provides a convenient exploratory method for phylogenetic analysis. The method was implemented in a package freely available at http://www.lirmm.fr/~cbonnard/MPC.html
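The link between distributing splits over poles and graph coloring can be illustrated with a greedy heuristic. This is not the authors' implementation, and a greedy pass may use more poles than the true minimum. Each split is represented by one of its two sides as a frozenset; the function names, the weights, and the example splits are assumptions made for the sketch.

```python
def compatible(a, b, taxa):
    """Two splits are compatible if at least one of the four
    intersections of their sides is empty."""
    return any(not part for part in (a & b, a - b, b - a, taxa - (a | b)))

def greedy_poles(weighted_splits, taxa, threshold=0.1):
    """Assign each split above `threshold` to the first pole whose
    splits are all compatible with it, opening a new pole otherwise.

    Splits are processed in order of decreasing weight. Incompatible
    splits never share a pole, which is a proper coloring of the
    incompatibility graph (greedy, so not necessarily minimal).
    """
    kept = [ws for ws in weighted_splits if ws[1] >= threshold]
    kept.sort(key=lambda ws: -ws[1])
    poles = []
    for split, _weight in kept:
        for pole in poles:
            if all(compatible(split, other, taxa) for other in pole):
                pole.append(split)
                break
        else:
            poles.append([split])
    return poles

# Hypothetical splits on the taxa A-E with made-up support values.
taxa = frozenset("ABCDE")
AB, BC, DE, AE = (frozenset(s) for s in ("AB", "BC", "DE", "AE"))
splits = [(AB, 0.6), (BC, 0.3), (DE, 0.5), (AE, 0.05)]

poles = greedy_poles(splits, taxa)
```

In this example AE falls below the 10% threshold and is discarded, AB and DE share the first pole, and BC, which conflicts with AB, is placed in a second pole.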

10.
Methods for Quick Consensus Estimation   Total citations: 5, self-citations: 0, citations by others: 5
A method that allows estimating consensus trees without exhaustive searches is described. The method consists of comparing the results of different independent superficial searches. The results of the searches are then summarized through a majority rule, consensed with the strict consensus tree of the best trees found overall. This assumes that to the extent that a group is recovered by most searches, it is more likely to be actually supported by the data. The effect of different parameters on the accuracy and reliability of the results is discussed. Increasing the cutoff frequency decreases the number of spurious groups, although it also decreases the number of correct nodes recovered. Collapsing trees during swapping reduces the number of spurious groups without significantly decreasing the number of correct nodes recovered. A way to collapse branches considering suboptimal trees is described, which can be extended as a measure of relative support for groups; the relative support is based on the Bremer support, but takes into account relative amounts of favorable and contradictory evidence. More exhaustive searches increase the number of correct nodes recovered, but leave unaffected (or increase) the number of spurious groups. Within some limits, the number of replications does not strongly affect the accuracy of the results, so that using relatively small numbers of replications normally suffices to produce a reliable estimation.

11.
New examples are presented, showing that supertree methods such as matrix representation with parsimony, minimum flip trees, and compatibility analysis of the matrix representing the input trees, produce supertrees that cannot be interpreted as displaying the groups present in the majority of the input trees. These methods may produce a supertree displaying some groups present in the minority of the trees, and contradicted by the majority. Of the three methods, compatibility analysis is the least used, but it seems to be the one that differs the least from majority rule consensus. The three methods are similar in that they choose the supertree(s) that best fit the set of input trees (quantified as some measure of the fit to the matrix representation of the input trees); in the case of complete trees, it is argued that, for a supertree method to be equivalent to majority rule or frequency difference consensus, two necessary (but not sufficient) conditions must be met. First, the measure of fit between a supertree and an input tree must be symmetrical. Second, the fit for a character representing a group must be measured as absolute: either it fits or it does not fit. In the restricted case of complete and equally resolved input trees, compatibility analysis (unlike MRP and minimum flipping) fulfils these two conditions: it is symmetrical (i.e., as long as the trees have the same taxon sets and are equally resolved, the number of characters in the matrix representation of tree A that require homoplasy in tree B is always the same as the number of characters in the matrix representation of tree B that require homoplasy in tree A) and it measures fit as all‐or‐none. In the case of just two complete and equally resolved input trees, the two conditions (symmetry and absolute fit) are necessary and sufficient, which explains why the compatibility analysis of such trees behaves as majority consensus. 
With more than two such trees, these conditions are still necessary but no longer sufficient for the equivalence; in such cases, the compatibility supertree may differ significantly from the majority rule consensus, even when these conditions apply (as shown by example). MRP and minimum flipping are asymmetric and measure various degrees of fit for each character, which explains why they often behave very differently from majority rule procedures, and why they are very likely to have groups contradicted by each of the input trees, or groups supported by a minority of the input trees. © The Willi Hennig Society 2005.

12.
The stratigraphic record of first appearances provides an independent source of data for evaluating and comparing phylogenetic hypotheses that include taxa with fossil histories. However, no standardized method exists for calculating these metrics for polytomous phylogenies, restricting their applicability. Previously proposed methods insufficiently deal with this problem because they skew or restrict the resulting scores. To resolve this issue, we propose a standardized method for treating polytomies when calculating these metrics: the Comprehensive Polytomy approach (ComPoly). This approach accurately describes how phylogenetic uncertainty, indicated by polytomies, affects stratigraphic consistency scores. We also present a new program suite (Assistance with Stratigraphic Consistency Calculations) that incorporates the ComPoly approach and simplifies the calculation of absolute temporal stratigraphic consistency metrics. This study also demonstrates that stratigraphic consistency scores calculated from strict consensus trees can be overly inclusive and those calculated from less‐than‐strict consensus trees inaccurately describe the phylogenetic signal present in the source most‐parsimonious trees (MPTs). Therefore, stratigraphic consistency scores should be calculated directly from the source MPTs whenever possible to ensure their accuracy. Finally, we offer recommendations for standardizing comparisons between molecular divergence dates and the stratigraphic record of first appearances, a promising new application of these methods. © The Willi Hennig Society 2010.

13.
Majority-rule supertrees   Total citations: 1, self-citations: 0, citations by others: 1
Most supertree methods proposed to date are essentially ad hoc, rather than designed with particular properties in mind. Although the supertree problem remains difficult, one promising avenue is to develop from better understood consensus methods to the more general supertree setting. Here, we generalize the widely used majority-rule consensus method to the supertree setting. The majority-rule consensus tree is the strict consensus of the median trees under the symmetric-difference metric, so we can generalize the consensus method by generalizing this metric to trees with differing leaf sets. There are two different natural generalizations, based on pruning or grafting leaves to produce comparable trees, and these two generalizations produce two different, but related, majority-rule supertree methods.

14.
Phylogenetic trees based on mtDNA polymorphisms are often used to infer the history of recent human migrations. However, there is no consensus on which method to use. Most methods make strong assumptions which may bias the choice of polymorphisms and result in computational complexity which limits the analysis to a few samples/polymorphisms. For example, parsimony minimizes the number of mutations, which biases the results to minimizing homoplasy events. Such biases may miss the global structure of the polymorphisms altogether, with the risk of identifying a "common" polymorphism as ancient without an internal check on whether it either is homoplasic or is identified as ancient because of sampling bias (from oversampling the population with the polymorphism). A signature of this problem is that different methods applied to the same data or the same method applied to different datasets results in different tree topologies. When the results of such analyses are combined, the consensus trees have a low internal branch consensus. We determine human mtDNA phylogeny from 1737 complete sequences using a new, direct method based on principal component analysis (PCA) and unsupervised consensus ensemble clustering. PCA identifies polymorphisms representing robust variations in the data and consensus ensemble clustering creates stable haplogroup clusters. The tree is obtained from the bifurcating network obtained when the data are split into k = 2,3,4,...,kmax clusters, with equal sampling from each haplogroup. Our method assumes only that the data can be clustered into groups based on mutations, is fast, is stable to sample perturbation, uses all significant polymorphisms in the data, works for arbitrary sample sizes, and avoids sample choice and haplogroup size bias. The internal branches of our tree have a 90% consensus accuracy. 
In conclusion, our tree recreates the standard phylogeny of the N, M, L0/L1, L2, and L3 clades, confirming the African origin of modern humans and showing that the M and N clades arose in almost coincident migrations. However, the N clade haplogroups split along an East-West geographic divide, with a "European R clade" containing the haplogroups H, V, H/V, J, T, and U and a "Eurasian N subclade" including haplogroups B, R5, F, A, N9, I, W, and X. The haplogroup pairs (N9a, N9b) and (M7a, M7b) within N and M are placed in nonnearest locations in agreement with their expected large TMRCA from studies of their migrations into Japan. For comparison, we also construct consensus maximum likelihood, parsimony, neighbor joining, and UPGMA-based trees using the same polymorphisms and show that these methods give consistent results only for the clade tree. For recent branches, the consensus accuracy for these methods is in the range of 1-20%. From a comparison of our haplogroups to two chimp and one bonobo sequences, and assuming a chimp-human coalescent time of 5 million years before present, we find a human mtDNA TMRCA of 206,000 +/- 14,000 years before present.

15.
The success of resampling approaches to branch support depends on the effectiveness of the underlying tree searches. Two primary factors are identified as key: the depth of tree search and the number of trees saved per resampling replicate. Two datasets were explored for a range of search parameters using jackknifing. Greater depth of tree search tends to increase support values because shorter trees conflict less with each other, while increasing numbers of trees saved tends to reduce support values because of conflict that reduces structure in the replicate consensus. Although a relatively small amount of branch swapping will achieve near‐accurate values for a majority of clades, some clades do not yield accurate values until more extensive searches are performed. This means that in order to maximize the accuracy of resampling analyses, one should employ as extensive a search strategy as possible, and save as many trees per replicate as possible. Strict consensus summary of resampling replicates is preferable to frequency‐within‐replicates summary because it is a more conservative approach to the reporting of replicate results. Jackknife analysis is preferable to bootstrap because of its closer relationship to the original data. © The Willi Hennig Society 2010.

16.
Nowadays, there are many phylogeny reconstruction methods, each with advantages and disadvantages. We explored the advantages of each method, putting together the common parts of trees constructed by several methods, by means of a consensus computation. A number of phylogenetic consensus methods are already known. Unfortunately, there is also a taboo concerning consensus methods, because most biologists see them mainly as comparators and not as phylogenetic tree constructors. We challenged this taboo by defining a consensus method that builds a fully resolved phylogenetic tree based on the most common parts of fully resolved trees in a given collection. We also generated results showing that this consensus is in a way a kind of "median" of the input trees; as such it can be closer to the correct tree in many situations.

17.
18.
Abstract Phylogenetic relationships of 25 genera of Holarctic Teleiodini (Gelechiidae) are postulated based on morphology and molecular characters, including CO‐I, CO‐II, and 28S genes. The phylogenetic analysis of the morphology matrix yielded four equally most‐parsimonious trees (length 330 steps, CI = 0.36, RI = 0.55) and a strict consensus tree (length 335 steps, CI = 0.36, RI = 0.54) with one polytomy and one trichotomy. The phylogenetic analysis of the combined morphology and CO‐I + CO‐II + 28S matrices yielded two equally most‐parsimonious trees (length 1184 steps, CI = 0.50, RI = 0.42) and a strict consensus tree (length 1187 steps, CI = 0.50, RI = 0.42) that reinforced results from the morphological analysis and resolved the one polytomy present in the morphology consensus tree. Teleiodini are defined as a monophyletic clade with a Bremer support value greater than 5 in the consensus tree based on morphological and molecular data. Twenty‐three clades of genera are defined with Bremer support values provided. An analysis of larval host‐plant preferences based on the consensus tree for combined data indicates derivation of feeding on woody hosts from genera feeding on herbaceous hosts and a single origin of feeding on coniferous hosts. An area cladogram indicates five independent origins of Nearctic genera from Holarctic ancestors and one origin from a Palearctic genus.
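For readers unfamiliar with the indices quoted in this abstract, the usual definitions of the ensemble consistency index (CI) and retention index (RI) are given below. The symbols S, M, and G are notation introduced here for illustration and do not come from the abstract: S is the observed tree length in steps, M is the minimum number of steps the characters could require on any tree, and G is the maximum number of steps they could require.

```latex
\mathrm{CI} = \frac{M}{S},
\qquad
\mathrm{RI} = \frac{G - S}{G - M}
```

As a rough check on the morphology result, S = 330 and CI = 0.36 imply M ≈ 0.36 × 330 ≈ 119 steps. The figure is approximate because the reported CI is rounded.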

19.
20.
Estimating species trees using multiple-allele DNA sequence data   Total citations: 3, self-citations: 0, citations by others: 3
Several techniques, such as concatenation and consensus methods, are available for combining data from multiple loci to produce a single statement of phylogenetic relationships. However, when multiple alleles are sampled from individual species, it becomes more challenging to estimate relationships at the level of species, either because concatenation becomes inappropriate due to conflicts among individual gene trees, or because the species from which multiple alleles have been sampled may not form monophyletic groups in the estimated tree. We propose a Bayesian hierarchical model to reconstruct species trees from multiple-allele, multilocus sequence data, building on a recently proposed method for estimating species trees from single allele multilocus data. A two-step Markov Chain Monte Carlo (MCMC) algorithm is adopted to estimate the posterior distribution of the species tree. The model is applied to estimate the posterior distribution of species trees for two multiple-allele datasets--yeast (Saccharomyces) and birds (Manacus-manakins). The estimates of the species trees using our method are consistent with those inferred from other methods and genetic markers, but in contrast to other species tree methods, it provides credible regions for the species tree. The Bayesian approach described here provides a powerful framework for statistical testing and integration of population genetics and phylogenetics.
