首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Dissimilarity measures for (possibly weighted) phylogenetic trees based on the comparison of their vectors of path lengths between pairs of taxa, have been present in the systematics literature since the early seventies. For rooted phylogenetic trees, however, these vectors can only separate non-weighted binary trees, and therefore these dissimilarity measures are metrics only on this class of rooted phylogenetic trees. In this paper we overcome this problem, by splitting in a suitable way each path length between two taxa into two lengths. We prove that the resulting splitted path lengths matrices single out arbitrary rooted phylogenetic trees with nested taxa and arcs weighted in the set of positive real numbers. This allows the definition of metrics on this general class of rooted phylogenetic trees by comparing these matrices through metrics in spaces Mn(\mathbb R){\mathcal{M}_n(\mathbb {R})} of real-valued n × n matrices. We conclude this paper by establishing some basic facts about the metrics for non-weighted phylogenetic trees defined in this way using L p metrics on Mn(\mathbb R){\mathcal{M}_n(\mathbb {R})}, with ${p \in \mathbb {R}_{ >0 }}${p \in \mathbb {R}_{ >0 }}.  相似文献   

2.
Majority-rule supertrees   总被引:1,自引:0,他引:1  
Most supertree methods proposed to date are essentially ad hoc, rather than designed with particular properties in mind. Although the supertree problem remains difficult, one promising avenue is to develop from better understood consensus methods to the more general supertree setting. Here, we generalize the widely used majority-rule consensus method to the supertree setting. The majority-rule consensus tree is the strict consensus of the median trees under the symmetric-difference metric, so we can generalize the consensus method by generalizing this metric to trees with differing leaf sets. There are two different natural generalizations, based on pruning or grafting leaves to produce comparable trees, and these two generalizations produce two different, but related, majority-rule supertree methods.  相似文献   

3.
Semi-strict supertrees   总被引:2,自引:1,他引:2  
A method to calculate semi‐strict supertrees is proposed. The semi‐strict supertrees are calculated by creating the matrix that represents all the groups in the source trees (as done in already existing techniques), and then finding the trees determined by the ultra‐clique. The ultra‐clique is defined as the set of characters where each possible subset is compatible with each possible subset from the entire matrix. Finding the ultra‐clique is computationally complex (since in most cases many of the characters have missing entries), but a heuristic method yields reliable results. When the trees have no conflict, or when there are only two trees, the method produces the exact result for any ordering of the input trees and any ordering of the groups within them; when there are more than two trees and they have conflict, a single ordering or sequence can create some spurious groups, but doing multiple sequences eliminates the spurious groups. The method uses only state set operations, and is thus easily implemented in computer programs. Unlike any existing type of supertree, semi‐strict supertrees display all the groups, and only those groups, that are implied by at least some combination of the input trees and contradicted by none. The idea that supertrees should take into account the number of occurences of a given group, so as to retain some groups even in the case of conflict, is discussed; it is argued that a conceptual equivalent of the majority rule consensus is not possible when the sets of taxa differ among trees. Also, when pruning taxa from a set of trees, the supertree can display groups that contradict the consensus for the entire trees, suggesting that supertrees for matrices with very dissimilar sets of taxa should be interpreted with caution. If (for any valid reason) the data cannot be combined in a single matrix, it is advisable that the taxon sets in the matrices be as similar as possible.  相似文献   

4.
The stationary birth-only, or Yule-Furry, process for rooted binary trees has been analysed with a view to developing explicit expressions for two fundamental statistical distributions: the probability that a randomly selected leaf is preceded by N nodes, or “ancestors”, and the probability that two randomly selected leaves are separated by N nodes. For continuous-time Yule processes, the first of these distributions is presented in closed analytical form as a function of time, with time being measured with respect to the moment of “birth” of the common ancestor (which is essentially inaccessible to phylogenetic analysis), or with respect to the instant at which the first bifurcation occurred.The second distribution is shown to follow in an iterative manner from a hierarchy of second-order ordinary differential equations.For Yule trees of a given number n of tips, expressions have been derived for the mean and variance for each of these distributions as functions of n, as well as for the distributions themselves.In addition, it is shown how the methods developed to obtain these distributions can be employed to find, with minor effort, expressions for the expectation values of two statistics on Yule trees, the Sackin index (sum over all root-to-leaf distances), and the sum over all leaf-to-leaf distances.  相似文献   

5.
6.
7.
8.
Maximum likelihood supertrees   总被引:2,自引:0,他引:2  
  相似文献   

9.
Supertrees result from combining many smaller, overlapping phylogenetic trees into a single, more comprehensive tree. As such, supertree construction is probably as old as the field of systematics itself, and remains our only way of visualizing the Tree of Life as a whole. Over the past decade, supertree construction has gained a more formal, objective footing, and has become an area of active theoretical and practical research. Here, I review the history of the supertree approach, focusing mainly on its current implementation. The supertrees of today represent some of the largest, complete phylogenies available for many groups, but are not without their critics. I conclude by arguing that the ever-growing molecular revolution will result in supertree construction taking on a new role and implementation in the future for analyzing large DNA sequence matrices as part of a divide-and-conquer phylogenetic approach.  相似文献   

10.
A peak is a pair of real values (x,y), where x is the time when peak of height y is registered. In the peak alignment problem, we are given two sequences of peaks, and our task is to align the sequences allowing some basic edit operations on the peaks. We study an instance of the peak alignment problem that arises in the analysis of Mass Spectrometry data in Systems Biology. There the measurement technique guarantees that two peaks (x,y), (x',y') can only be considered the same if x is close enough to x', and y is close enough to y'. We review some methods to do alignment under such restrictions on matches.  相似文献   

11.
Gasbarra D  Sillanpää MJ 《Genetics》2006,172(2):1325-1335
A new statistical approach for construction of the genetic linkage map and estimation of the parental linkage phase based on allele frequency data from pooled gametic (sperm or egg) samples is introduced. This method can be applied for estimation of recombination fractions (over distances <1 cM) and ordering of large numbers (even hundreds) of closely linked markers. This method should be extremely useful in species with a long generation interval and a large genome size such as in dairy cattle or in forest trees; the conifer species have haploid tissues available in megagametophytes. According to Mendelian expectation, two parental alleles should occur in gametes in 1:1 proportions, if segregation distortion does not occur. However, due to mere sampling variation, the observed proportions may deviate from their expected value in practice. These deviations and their dependence along the chromosome can provide information on the parental linkage phase and on the genetic linkage map. Usefulness of the method is illustrated with simulations. The role of segregation distortion as a source of these deviations is also discussed. The software implementing this method is freely available for research purposes from the authors.  相似文献   

12.
Large and comprehensive phylogenetic trees are desirable for studying macroevolutionary processes and for classification purposes. Such trees can be obtained in two different ways. Either the widest possible range of taxa can be sampled and used in a phylogenetic analysis to produce a "big tree," or preexisting topologies can be used to create a supertree. Although large multigene analyses are often favored, combinable data are not always available, and supertrees offer a suitable solution. The most commonly used method of supertree reconstruction, matrix representation with parsimony (MRP), is presented here. We used a combined data set for the Poaceae to (1) assess the differences between an approach that uses combined data and one that uses different MRP modifications based on the character partitions and (2) investigate the advantages and disadvantages of these modifications. Baum and Ragan and Purvis modifications gave similar results. Incorporating bootstrap support associated with pre-existing topologies improved Baum and Ragan modification and its similarity with a combined analysis. Finally, we used the supertree reconstruction approach on 55 published phylogenies to build one of most comprehensive phylogenetic trees published for the grass family including 403 taxa and discuss its strengths and weaknesses in relation to other published hypotheses.  相似文献   

13.
The estimation of ever larger phylogenies requires consideration of alternative inference strategies, including divide-and-conquer approaches that decompose the global inference problem to a set of smaller, more manageable component problems. A prominent locus of research in this area is the development of supertree methods, which estimate a composite tree by combining a set of partially overlapping component topologies. Although promising, the use of component tree topologies as the primary data dissociates supertrees from complexities within the underling character data and complicates the evaluation of phylogenetic uncertainty. We address these issues by exploring three approaches that variously incorporate nonparametric bootstrapping into a common supertree estimation algorithm (matrix representation with parsimony, although any algorithm might be used), including bootstrap-weighting, source-tree bootstrapping, and hierarchical bootstrapping. We illustrate these procedures by means of hypothetical and empirical examples. Our preliminary experiments suggest that these methods have the potential to improve the correspondence of supertree estimates to those derived from simultaneous analysis of the combined data and to allow uncertainty in supertree topologies to be quantified. The ability to increase the transparency of supertrees to the underlying character data has several practical implications and sheds new light on an old debate. These methods have been implemented in the freely available program, tREeBOOT.  相似文献   

14.
Mammalian phylogeny: genes and supertrees   总被引:8,自引:0,他引:8  
Novacek MJ 《Current biology : CB》2001,11(14):R573-R575
A massive effort to sample mammals for genes has yielded new proposals for the branching architecture of the great radiation of placental mammals. Some of these are notably discrepant with morphologically based analyses, but they suggest new research that should address several major outstanding issues.  相似文献   

15.
Inferring species phylogenies is an important part of understanding molecular evolution. Even so, it is well known that an accurate phylogenetic tree reconstruction for a single gene does not always necessarily correspond to the species phylogeny. One commonly accepted strategy to cope with this problem is to sequence many genes; the way in which to analyze the resulting collection of genes is somewhat more contentious. Supermatrix and supertree methods can be used, although these can suppress conflicts arising from true differences in the gene trees caused by processes such as lineage sorting, horizontal gene transfer, or gene duplication and loss. In 2004, Huson et al. (IEEE/ACM Trans. Comput. Biol. Bioinformatics 1:151-158) presented the Z-closure method that can circumvent this problem by generating a supernetwork as opposed to a supertree. Here we present an alternative way for generating supernetworks called Q-imputation. In particular, we describe a method that uses quartet information to add missing taxa into gene trees. The resulting trees are subsequently used to generate consensus networks, networks that generalize strict and majority-rule consensus trees. Through simulations and application to real data sets, we compare Q-imputation to the matrix representation with parsimony (MRP) supertree method and Z-closure, and demonstrate that it provides a useful complementary tool.  相似文献   

16.
In this paper, we evaluate the relative performance of competing approaches for estimating phylogenies from incomplete distance matrices. The direct approach proceeds with phylogenetic reconstruction while ignoring missing cells, whereas the indirect approach proceeds by estimating the missing distances prior to phylogenetic analysis. Two distinct indirect procedures based on the ultrametric inequality and the four-point condition are further compared. Using simulations, we show that more reliable results are obtained when such indirect methods are used. Expectedly, the phylogenies become less accurate as the percentage of missing cells increases, but combining different estimation methods greatly improves the accuracy. An application to bat phylogeny confirms the results obtained in the simulation study and illustrates the effect of missing distances in the construction of supertrees.  相似文献   

17.
A Robinson-Foulds (RF) supertree for a collection of input trees is a tree containing all the species in the input trees that is at minimum total RF distance to the input trees. Thus, an RF supertree is consistent with the maximum number of splits in the input trees. Constructing RF supertrees for rooted and unrooted data is NP-hard. Nevertheless, effective local search heuristics have been developed for the restricted case where the input trees and the supertree are rooted. We describe new heuristics, based on the Edge Contract and Refine (ECR) operation, that remove this restriction, thereby expanding the utility of RF supertrees. Our experimental results on simulated and empirical data sets show that our unrooted local search algorithms yield better supertrees than those obtained from MRP and rooted RF heuristics in terms of total RF distance to the input trees and, for simulated data, in terms of RF distance to the true tree.  相似文献   

18.
Systematists and comparative biologists commonly want to make statements about relationships among taxa that have never been collectively included in any single phylogenetic analysis. Construction of phylogenetic 'supertrees' provides one solution. Supertrees are estimates of phylogeny assembled from sets of smaller estimates (source trees) sharing some but not necessarily all their taxa in common. If certain conditions are met, supertrees can retain all or most of the information from the source trees and also make novel statements about relationships of taxa that do not co-occur on any one source tree. Supertrees have commonly been constructed using subjective and informal approaches, but several explicit approaches have recently been proposed.  相似文献   

19.
20.
Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) are often conflicting with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号