Similar literature
 20 similar articles found (search time: 31 ms)
1.
The success of resampling approaches to branch support depends on the effectiveness of the underlying tree searches. Two primary factors are identified as key: the depth of tree search and the number of trees saved per resampling replicate. Two datasets were explored for a range of search parameters using jackknifing. Greater depth of tree search tends to increase support values because shorter trees conflict less with each other, while increasing numbers of trees saved tends to reduce support values because of conflict that reduces structure in the replicate consensus. Although a relatively small amount of branch swapping will achieve near‐accurate values for a majority of clades, some clades do not yield accurate values until more extensive searches are performed. This means that in order to maximize the accuracy of resampling analyses, one should employ as extensive a search strategy as possible, and save as many trees per replicate as possible. Strict consensus summary of resampling replicates is preferable to frequency‐within‐replicates summary because it is a more conservative approach to the reporting of replicate results. Jackknife analysis is preferable to bootstrap because of its closer relationship to the original data. © The Willi Hennig Society 2010.
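
The contrast between the two summary methods named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: trees are reduced to sets of clades, the helper names (`tree`, `support_strict`, `support_frequency_within`) and the two toy replicates are all hypothetical, and no tree search is performed.

```python
from fractions import Fraction

def tree(*clades):
    """Toy tree: a frozenset of clades, each clade a frozenset of taxon names."""
    return frozenset(frozenset(c) for c in clades)

def strict_consensus(trees):
    """Clades shared by every tree saved in one resampling replicate."""
    return frozenset.intersection(*trees)

def support_strict(replicates, clade):
    """Share of replicates whose strict consensus contains the clade."""
    clade = frozenset(clade)
    hits = sum(clade in strict_consensus(trees) for trees in replicates)
    return Fraction(hits, len(replicates))

def support_frequency_within(replicates, clade):
    """Mean, over replicates, of the share of saved trees containing the clade."""
    clade = frozenset(clade)
    shares = [Fraction(sum(clade in t for t in trees), len(trees)) for trees in replicates]
    return sum(shares) / len(replicates)

# Hypothetical data: replicate 1 saved one tree, replicate 2 saved two conflicting trees.
replicates = [
    [tree("AB", "CD")],
    [tree("AB", "CD"), tree("AC", "BD")],
]
strict_value = support_strict(replicates, "AB")               # Fraction(1, 2)
frequency_value = support_frequency_within(replicates, "AB")  # Fraction(3, 4)
```

In this toy case the strict-consensus summary gives clade AB a support of 1/2, while the frequency-within-replicates summary gives 3/4, which matches the abstract's description of the strict consensus as the more conservative of the two.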

2.
A numerical cladistic analysis, based on 23 terminal groups and 63 morphological characters, was done to infer phylogenetic relationships within the Eurasian catfish family Siluridae. Nine hundred and forty-five equally most parsimonious trees (134 steps, consistency index 0.634) were found that differ in their resolutions of four polychotomies. Strict consensus of these trees includes ten internal nodes, does not support monophyly of Silurus, Ompok and Kryptopterus, as usually defined, and offers ambiguous support for monophyly of Wallago. Silurus and Kryptopterus are each composed of two non-sister group clades, and Ompok is composed of at least two such clades. Heuristic searches constrained by monophyly of Silurus, Ompok or Kryptopterus yielded trees five or six steps longer than the shortest trees free of constraints. The strict consensus also infers a basal dichotomy that separates the Siluridae into a temperate Eurasian clade with about 20 nominal species and a subtropical/tropical south and southeast Asian clade with about 75 nominal species. The distributions of these clades overlap in a relatively narrow region of east Asia. A heuristic search for trees 1 step longer than the shortest trees yielded 253,890 trees. A strict consensus of these trees also infers a basal dichotomy between the above-mentioned clades. This analysis revealed four additional putative synapomorphies of the Siluridae, pending further resolution of the family's outgroup relationships.

3.
We compared general behaviour trends of resampling methods (bootstrap, bootstrap with Poisson distribution, jackknife, and jackknife with symmetric resampling) and different ways to summarize the results for resampling (absolute frequency, F, and frequency difference, GC') for real data sets under variable resampling strengths in three weighting schemes. We propose an equivalence between bootstrap and jackknife in order to make bootstrap variable across different resampling strengths. Specifically, for each method we evaluated the number of spurious groups (groups not present in the strict consensus of the unaltered data set), of real groups, and of inconsistencies in ranking of groups under variable resampling strengths. We found that GC' always generated more spurious groups and recovered more groups than F. Bootstrap methods generated more spurious groups than jackknife methods; and jackknife is the method that recovered more real groups. We consistently obtained a higher proportion of spurious groups for GC' than for F; and for bootstrap than for jackknife. Finally, we evaluated the ranking of groups under variable resampling strengths qualitatively in the trajectories of "support" against resampling strength, and quantitatively with Kendall coefficient values. We found fewer ranking inconsistencies for GC' than for F, and for bootstrap than for jackknife.
© The Willi Hennig Society 2009.
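
The two summary measures compared in this abstract can be illustrated with a small Python sketch. It assumes the usual reading of the frequency difference (the frequency of a group minus the frequency of its most frequent contradictory group); the abstract itself does not spell out the formula, so treat that definition, the function names, and the toy replicate trees as assumptions.

```python
from collections import Counter
from fractions import Fraction

def conflicts(a, b):
    """Two clades conflict when they overlap but neither contains the other."""
    return bool(a & b) and not (a <= b or b <= a)

def absolute_frequencies(trees):
    """F: share of replicate trees (one tree per replicate here) containing each clade."""
    counts = Counter(clade for t in trees for clade in t)
    return {clade: Fraction(n, len(trees)) for clade, n in counts.items()}

def frequency_difference(freqs, clade):
    """F of the clade minus F of its most frequent contradictory clade (assumed definition)."""
    rival = max(
        (f for other, f in freqs.items() if conflicts(clade, other)),
        default=Fraction(0),
    )
    return freqs.get(clade, Fraction(0)) - rival

# Hypothetical outcome of ten replicates, each reduced to a single clade.
AB, AC, BC = frozenset("AB"), frozenset("AC"), frozenset("BC")
replicate_trees = [{AB}] * 6 + [{AC}] * 3 + [{BC}]
freqs = absolute_frequencies(replicate_trees)
f_ab = freqs[AB]                          # Fraction(3, 5)
gc_ab = frequency_difference(freqs, AB)   # Fraction(3, 10)
gc_bc = frequency_difference(freqs, BC)   # Fraction(-1, 2)
```

Under this definition a group found in 60% of replicates but contradicted in 30% scores 0.3, and a group that is contradicted more often than it is found receives a negative value, which an absolute frequency cannot express.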

4.
The neighbor-joining (NJ) method is widely used in reconstructing large phylogenies because of its computational speed and the high accuracy in phylogenetic inference as revealed in computer simulation studies. However, most computer simulation studies have quantified the overall performance of the NJ method in terms of the percentage of branches inferred correctly or the percentage of replications in which the correct tree is recovered. We have examined other aspects of its performance, such as the relative efficiency in correctly reconstructing shallow (close to the external branches of the tree) and deep branches in large phylogenies; the contribution of zero-length branches to topological errors in the inferred trees; and the influence of increasing the tree size (number of sequences), evolutionary rate, and sequence length on the efficiency of the NJ method. Results show that the correct reconstruction of deep branches is no more difficult than that of shallower branches. The presence of zero-length branches in realized trees contributes significantly to the overall error observed in the NJ tree, especially in large phylogenies or slowly evolving genes. Furthermore, the tree size does not influence the efficiency of NJ in reconstructing shallow and deep branches in our simulation study, in which the evolutionary process is assumed to be homogeneous in all lineages. Received: 7 March 2000 / Accepted: 2 August 2000

5.
Semi-strict supertrees   Cited by: 3 (self-citations: 1, citations by others: 2)
A method to calculate semi‐strict supertrees is proposed. The semi‐strict supertrees are calculated by creating the matrix that represents all the groups in the source trees (as done in already existing techniques), and then finding the trees determined by the ultra‐clique. The ultra‐clique is defined as the set of characters where each possible subset is compatible with each possible subset from the entire matrix. Finding the ultra‐clique is computationally complex (since in most cases many of the characters have missing entries), but a heuristic method yields reliable results. When the trees have no conflict, or when there are only two trees, the method produces the exact result for any ordering of the input trees and any ordering of the groups within them; when there are more than two trees and they have conflict, a single ordering or sequence can create some spurious groups, but doing multiple sequences eliminates the spurious groups. The method uses only state set operations, and is thus easily implemented in computer programs. Unlike any existing type of supertree, semi‐strict supertrees display all the groups, and only those groups, that are implied by at least some combination of the input trees and contradicted by none. The idea that supertrees should take into account the number of occurrences of a given group, so as to retain some groups even in the case of conflict, is discussed; it is argued that a conceptual equivalent of the majority rule consensus is not possible when the sets of taxa differ among trees. Also, when pruning taxa from a set of trees, the supertree can display groups that contradict the consensus for the entire trees, suggesting that supertrees for matrices with very dissimilar sets of taxa should be interpreted with caution. If (for any valid reason) the data cannot be combined in a single matrix, it is advisable that the taxon sets in the matrices be as similar as possible.

6.
An analysis of the relationship between the number of loci utilized in an electrophoretic study of genetic relationships and the statistical support for the topology of UPGMA trees is reported for two published data sets. These are Highton and Larson (Syst. Zool. 28: 579-599, 1979), an analysis of the relationships of 28 species of plethodonine salamanders, and Hedges (Syst. Zool. 35: 1-21, 1986), a similar study of 30 taxa of Holarctic hylid frogs. As the number of loci increases, the statistical support for the topology at each node in UPGMA trees was determined by both the bootstrap and jackknife methods. The results show that the bootstrap and jackknife probabilities supporting the topology at some nodes of UPGMA trees increase as the number of loci utilized in a study is increased, as expected for nodes that have groupings that reflect phylogenetic relationships. The pattern of increase varies and is especially rapid in the case of groups with no close relatives. At nodes that likely do not represent correct phylogenetic relationships, the bootstrap probabilities do not increase and often decline with the addition of more loci.

7.
Using outgroup(s) is the most frequent method to root trees. Rooting through unconstrained simultaneous analysis of several outgroups is a favoured option because it serves as a test of the supposed monophyly of the ingroup. When contradiction occurs among the characters of the outgroups, the branching pattern of basal nodes of the rooted tree is dependent on the order of the outgroups listed in the data matrix, that is, on the prime outgroup (even in the case of exhaustive search). Different equally parsimonious rooted trees (=cladograms) can be obtained by permutation of prime outgroups. An alternative to a common implicit practice (select one outgroup to orientate the tree) is that the accepted cladogram is the strict consensus of the different equally parsimonious rooted trees. The consensus tree is less parsimonious but is not hampered by extra assumptions such as the choice of one outgroup (or more) among the initial number of outgroup terminals. It also does not show sister-group relations that are ambiguously resolved or not resolved at all.

8.
9.
Comprehensive phylogenetic trees are essential tools to better understand evolutionary processes. For many groups of organisms or projects aiming to build the Tree of Life, comprehensive phylogenetic analysis implies sampling hundreds to thousands of taxa. For the tree of all life this task rises to a highly conservative 13 million. Here, we assessed the performances of methods to reconstruct large trees using Monte Carlo simulations with parameters inferred from four large angiosperm DNA matrices, containing between 141 and 567 taxa. For each data set, parameters of the HKY85+G model were estimated and used to simulate 20 new matrices for sequence lengths from 100 to 10,000 base pairs. Maximum parsimony and neighbor joining were used to analyze each simulated matrix. In our simulations, accuracy was measured by counting the number of nodes in the model tree that were correctly inferred. The accuracy of the two methods increased very quickly with the addition of characters before reaching a plateau around 1000 nucleotides for all tree sizes simulated. An increase in the number of taxa from 141 to 567 did not significantly decrease the accuracy of the methods used, despite the increase in the complexity of tree space. Moreover, the distribution of branch lengths rather than the rate of evolution was found to be the most important factor for accurately inferring these large trees. Finally, a tree containing 13,000 taxa was created to represent a hypothetical tree of all angiosperm genera and the efficiency of phylogenetic reconstructions was tested with simulated matrices containing an increasing number of nucleotides up to a maximum of 30,000. Even with such a large tree, our simulations suggested that simple heuristic searches were able to infer up to 80% of the nodes correctly.

10.
The phylogeny of Iberian Aphodiini species was reconstructed based on morphology. Wing venation, mouthparts, male and female genitalia, and external morphology provided ninety-four characters scored for ninety-three Aphodiini species. Phylogenetic analyses were based on maximum parsimony and Bayesian inference criteria. Maximum parsimony consensus trees recovered Acrossus species as a sister group of the remaining Aphodiini, followed by two other branches, one including Neagolius, Plagiogonus, Ahermodontus and Ammoecius species, and the other including Oxyomus, Nimbus, Heptaulacus and Euheptaulacus species. The remaining studied taxa clustered in an unresolved group. Bayesian inference trees recovered Acrossus as the sister group of the remaining Iberian Aphodiini, followed by Colobopterus erraticus and the rest of the Iberian Aphodiini, but this latter branch was unresolved. The general lack of statistical support for the inferred phylogenetic relationships at terminal nodes using both maximum parsimony and Bayesian inference suggests that variation in morphological characters useful for phylogenetic inference in the present study is small, perhaps as a consequence of a radiation event occurring at the origin of the tribe. A probable evolutionary pattern for Aphodiini is proposed which infers six groups, namely Acrossian, Ammoecian, Oxyomian, Aphodian s.str., Colobopteran and Aphodian s.l. clades.

11.
Nowadays, there are many phylogeny reconstruction methods, each with advantages and disadvantages. We explored the advantages of each method, putting together the common parts of trees constructed by several methods, by means of a consensus computation. A number of phylogenetic consensus methods are already known. Unfortunately, there is also a taboo concerning consensus methods, because most biologists see them mainly as comparators and not as phylogenetic tree constructors. We challenged this taboo by defining a consensus method that builds a fully resolved phylogenetic tree based on the most common parts of fully resolved trees in a given collection. We also generated results showing that this consensus is in a way a kind of "median" of the input trees; as such it can be closer to the correct tree in many situations.

12.
Quantification of the success of phylogenetic inference in simulations   Cited by: 1 (self-citations: 0, citations by others: 1)
For phylogenetic simulation studies, the accuracy of topological reconstruction obtained from different data matrices or different methods of phylogenetic inference generally needs to be quantified. Two components of performance within this context are: (1) how the inferred tree topology matches or conflicts with the correct tree topology, and (2) the branch support assigned to both correctly and incorrectly resolved clades. We present a method (averaged overall success of resolution) that incorporates both of these components. Branch support is incorporated in the averaged overall success of resolution by linearly scaling the observed support relative to that conferred by uncontradicted synapomorphies. We believe that this method represents an improvement relative to the commonly used approaches of quantifying the percentage of clades that are correctly resolved in the inferred trees or presenting the Robinson–Foulds distance between the inferred trees and the correct tree. In contrast to Bremer support, the averaged overall success of resolution may be applied equally well to distance, likelihood and parsimony analyses. © The Willi Hennig Society 2006.
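
The two baseline measures that this abstract contrasts with its own method can be sketched as follows. The sketch does not implement the averaged overall success of resolution, whose formula the abstract does not give. It assumes rooted trees represented as sets of clades, so the Robinson–Foulds distance is taken as the number of clades found in exactly one of the two trees; unrooted trees would use bipartitions instead, and some conventions halve the value. All names and the toy trees are hypothetical.

```python
def clades(*names):
    """Toy rooted tree: a set of non-trivial clades, each a frozenset of taxa."""
    return {frozenset(n) for n in names}

def robinson_foulds(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(tree_a ^ tree_b)

def percent_correct(inferred, correct):
    """Percentage of the correct tree's clades that the inferred tree recovers."""
    return 100 * len(inferred & correct) / len(correct)

# Hypothetical six-taxon example: the inferred tree misplaces one clade.
correct_tree = clades("AB", "ABC", "DE", "DEF")
inferred_tree = clades("AB", "ABC", "EF", "DEF")

rf = robinson_foulds(inferred_tree, correct_tree)   # 2 (DE is missing, EF is spurious)
pc = percent_correct(inferred_tree, correct_tree)   # 75.0
```

Neither number uses branch support: a wrongly resolved clade counts the same whether it is weakly or strongly supported, which is the gap the abstract says its method addresses.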

13.
For more than 10 years, systematists have been debating the superiority of character or taxonomic congruence in phylogenetic analysis. In this paper, we demonstrate that the competing approaches can converge to the same solution when a consensus method that accounts for branch lengths is selected. Thus, we propose to use both methods in combination, as a way to corroborate the results of combined and separate analyses. This so-called "global congruence" approach is tested with a wide variety of examples sampled from the literature, and the results are compared with those obtained by standard consensus methods. Our analyses show that when the total evidence and consensus trees differ topologically, collapsing nodes with low bootstrap support usually improves "global congruence".

14.
An expanded plastid DNA phylogeny for Orchidaceae was generated from sequences of rbcL and matK for representatives of all five subfamilies. The data were analyzed using equally weighted parsimony, and branch support was assessed with jackknifing. The analysis supports recognition of five subfamilies with the following relationships: (Apostasioideae (Vanilloideae (Cypripedioideae (Orchidoideae (Epidendroideae))))). Support for many tribal-level groups within Epidendroideae is evident, but relationships among these groups remain uncertain, probably due to a rapid radiation in the subfamily that resulted in short branches along the spine of the tree. A series of experiments examined jackknife parameters and strategies to determine a reasonable balance between computational effort and results. We found that support values plateau rapidly with increased search effort. Tree bisection-reconnection swapping in a single search replicate per jackknife replicate and saving only two trees resulted in values that were close to those obtained in the most extensive searches. Although this approach uses considerably more computational effort than less extensive (or no) swapping, the results were also distinctly better. The effect of saving a maximal number of trees in each jackknife replicate can also be pronounced and is important for representing support accurately.

15.
Quantifying branch support using the bootstrap and/or jackknife is generally considered to be an essential component of rigorous parsimony and maximum likelihood phylogenetic analyses. Previous authors have described how application of the frequency-within-replicates approach to treating multiple equally optimal trees found in a given bootstrap pseudoreplicate can provide apparent support for otherwise unsupported clades. We demonstrate how a similar problem may occur when a non-representative subset of equally optimal trees is held per pseudoreplicate, which we term the undersampling-within-replicates artifact. We illustrate the frequency-within-replicates and undersampling-within-replicates bootstrap and jackknife artifacts using both contrived and empirical examples, demonstrate that the artifacts can occur in both parsimony and likelihood analyses, and show that the artifacts occur in outputs from multiple different phylogenetic-inference programs. Based on our results, we make the following five recommendations, which are particularly relevant to supermatrix analyses, but apply to all phylogenetic analyses. First, when two or more optimal trees are found in a given pseudoreplicate they should be summarized using the strict-consensus rather than frequency-within-replicates approach. Second, jackknife resampling should be used rather than bootstrap resampling. Third, multiple tree searches while holding multiple trees per search should be conducted in each pseudoreplicate rather than conducting only a single search and holding only a single tree. Fourth, branches with a minimum possible optimized length of zero should be collapsed within each tree search rather than collapsing branches only if their maximum possible optimized length is zero. Fifth, resampling values should be mapped onto the strict consensus of all optimal trees found rather than simply presenting the ≥ 50% bootstrap or jackknife tree or mapping the resampling values onto a single optimal tree.

16.
Majority-rule reduced consensus trees and their use in bootstrapping   Cited by: 3 (self-citations: 0, citations by others: 3)
Bootstrap analyses are usually summarized with majority-rule component consensus trees. This consensus method is based on replicated components and, like all component consensus methods, it is insensitive to other kinds of agreement between trees. Recently developed reduced consensus methods can be used to summarize much additional agreement on hypothesised phylogenetic relationships among multiple trees. The new methods are "strict" in the sense that they require agreement among all the trees being compared for any relationships to be represented in a consensus tree. Majority-rule reduced consensus methods are described and their use in bootstrap analyses is illustrated with a hypothetical and a real example. The new methods provide summaries of the bootstrap proportions of all n-taxon statements/partitions and facilitate the identification of hypotheses of relationships that are supported by high bootstrap proportions, in spite of a lack of support for particular components or clades. In practice majority-rule reduced consensus profiles may contain many trees. The size of the profile can be reduced by constraints on minimal bootstrap proportions and/or cardinality of the included trees. Majority-rule reduced consensus trees can also be selected a posteriori from the profile. Surrogates to the majority-rule reduced consensus methods using partition tables or tree pruning options provided by widely used phylogenetic inference software are also described. The methods are designed to produce more informative summaries of bootstrap analyses and thereby foster more informed assessment of the strengths and weaknesses of complex phylogenetic hypotheses.
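
The insensitivity of component consensus described in this abstract can be shown with a toy Python sketch. It is only a rough stand-in for the tree-pruning surrogate the abstract mentions, not the majority-rule reduced consensus method itself; the function names, the taxon "X", and the three bootstrap trees are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def majority_rule(trees):
    """Clades found in more than half of the trees, with their proportions."""
    counts = Counter(clade for t in trees for clade in t)
    return {c: Fraction(n, len(trees)) for c, n in counts.items() if 2 * n > len(trees)}

def prune(tree, taxon):
    """Remove one taxon from every clade and drop clades left with fewer than two taxa."""
    pruned = {clade - {taxon} for clade in tree}
    return {clade for clade in pruned if len(clade) >= 2}

AB, AX, BX = frozenset("AB"), frozenset("AX"), frozenset("BX")
ABX, ABC = frozenset("ABX"), frozenset("ABC")

# Hypothetical bootstrap trees in which taxon X moves around.
bootstrap_trees = [{AX, ABX}, {BX, ABX}, {AB, ABC}]

full = majority_rule(bootstrap_trees)                                # {ABX: 2/3}
reduced = majority_rule([prune(t, "X") for t in bootstrap_trees])    # {AB: 1}
```

The component AB appears in only one of the three trees, so it is absent from the ordinary majority-rule summary, yet once X is pruned every tree agrees that A and B group together. This is the kind of agreement the abstract says component consensus methods miss.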

17.
The phylogeny of flying squirrels was assessed, based on analyses of 80 morphological characters. Three published hypotheses were tested with constraint trees and compared with trees based on heuristic searches, all using PAUP*. Analyses were conducted on unordered data, on ordered data (Wagner), and on ordered data using Dollo parsimony. Compared with trees based on heuristic searches, the McKenna (1962) constraint trees were consistently the longest, requiring 8–11 more steps. The Mein (1970) constraint trees were shorter, requiring five to seven steps more than the unconstrained trees, and the Thorington and Darrow (2000) constraint trees were shorter yet, zero to one step longer than the corresponding unconstrained tree. In each of the constraint trees, some of the constrained nodes had poor character support. The heuristic trees provided best character support for three groups, but they did not resolve the basal trichotomy between a Glaucomys group of six genera, a Petaurista group of four genera, and a Trogopterus group of four genera. The inclusion of the small northern Eurasian flying squirrel, Pteromys, in the Petaurista group of giant South Asian flying squirrels is an unexpected hypothesis. Another novel hypothesis is the inclusion of the genus Aeromys, large animals from the Sunda Shelf, with the Trogopterus group of smaller "complex-toothed flying squirrels" from mainland Malaysia and southeast Asia. We explore the implications of this study for future analysis of molecular data and for past and future interpretations of the fossil record.

18.
This study is a phylogenetic analysis of the avian family Ciconiidae, the storks, based on two molecular data sets: 1065 base pairs of sequence from the mitochondrial cytochrome b gene and a complete matrix of single-copy nuclear DNA–DNA hybridization distances. Sixteen of the nineteen stork species were included in the cytochrome b data matrix, and fifteen in the DNA–DNA hybridization matrix. Both matrices included outgroups from the families Cathartidae (New World vultures) and Threskiornithidae (ibises, spoonbills). Optimal trees based on the two data sets were congruent in those nodes with strong bootstrap support. In the best-fit tree based on DNA–DNA hybridization distances, nodes defining relationships among very recently diverged species had low bootstrap support, while nodes defining more distant relationships had strong bootstrap support. In the optimal trees based on the sequence data, nodes defining relationships among recently diverged species had strong bootstrap support, while nodes defining basal relationships in the family had weak support and were incongruent among analyses. A combinable-component consensus of the best-fit DNA–DNA hybridization tree and a consensus tree based on different analyses of the cytochrome b sequences provide the best estimate of relationships among stork species based on the two data sets.

19.
The Lamprologini are the most species-rich and diverse tribe of Lake Tanganyika cichlids, comprising around 90 described species. We reconstruct the most complete (approximately 70 species) mtDNA phylogeny to date for this tribe, based on NADH dehydrogenase subunit 2 (ND2, approximately 1047 bp) and the non-coding control region (approximately 874 bp) and examine the degree to which mtDNA trees are good proxies for species trees. Phylogenetic relationships are assessed using Bayesian inference, maximum likelihood and maximum parsimony to determine the robustness of relationships. The resulting topologies are largely congruent and only the tree produced by an unpartitioned BI analysis is rejected using the non-parametric likelihood-based AU test. The trees are remarkably balanced, with two major clades consistently recovered in all analyses and with reasonable support. A smaller clade of deep-water species is also recovered. Overall support is good, when compared to some groups that have undergone adaptive radiation and rapid lineage formation. The much-expanded phylogeny of the group helps resolve the placement of some previously problematic taxa, such as Neolamprologus moori, highlighting the importance of greater taxonomic sampling. The results include a number of divergent placements of closely related species, and the genera Neolamprologus, Lamprologus, Julidochromis and Telmatochromis are not monophyletic, with alternative hypotheses consistent with traditional taxonomy providing a significantly worse fit to the data. We find several examples of divergent mtDNA sequences in taxa presumed to be closely related. This could be due to incorrect taxonomy or to the failure of the mtDNA to reflect species relationships and may support the hypothesis that speciation within this group has been facilitated by introgressive hybridisation.

20.
Exploring a large number of parameter sets in sensitivity analyses of direct optimization parsimony can be costly in terms of time and computing resources, and there is little a priori guidance available for reasonable limits to these search parameters. For this reason, we sought a general‐purpose upper limit for gap costs in the direct optimization program POY to streamline this process. To test the performance of POY as gap costs increase, we simulated data onto a pre‐set topology using a GTR + I + G model modified to include gaps by adding them according to a negative‐binomial model. Gaps were then removed and the data were analysed in POY at increasing gap costs. Increasing gap costs consistently resulted in reduced phylogenetic accuracy across trees of different relative branch lengths. Decoupling gap insertion and gap extension costs recovered a fraction of the accuracy lost by having both high gap insertion and gap extension costs, but only in trees with long internal nodes. To determine whether loss of phylogenetic accuracy was node‐specific, we designed a small dataset with a constrained node, where all possible combinations of cost substitution and different percentages of gap versus nucleotide changes were explored. These analyses showed that the effects of gap insertion and extension are node‐specific, and the minimum threshold for convergence on gap‐supported nodes is similar to the threshold for accuracy loss found in the larger simulated datasets. Subsequent analyses of empirical data revealed that a similar pattern of loss with gap cost increase can occur with ribosomal genes (18S, 28S, 16S and 12S) but this pattern was not seen in the intron data (myoglobin II) examined. In conjunction with previously published congruence‐based studies, the results suggest that POY sensitivity analyses can be streamlined and made more accurate if gap insertion and extension costs follow, as a guideline, a limit of four times the highest base‐transformation cost. 
© The Willi Hennig Society 2008.
