首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
We report on the only known case of independent discovery of unrooted trees in a historical science outside of biological systematics. The method of textual criticism (ecdotics, i.e., the building of text-version genealogies) created by French philologist Henri Quentin (1872–1935) proposes the use of a type of branching scheme equivalent to unrooted trees in phylogenetics. Because Quentin's method has never become the prevailing paradigm in philology, his insight into unrooted trees has not been noticed in previous studies comparing philology and phylogenetics. In fact, the modern use of unrooted trees in philology is seen as imported from phylogenetics. Quentin's procedure starts by building an unrooted tree (‘chain’) expressing the network of text versions (taxa) based on ‘variants’ (equivalent to unpolarized character states). Such undirected scheme is then rooted on the basis of extrinsic temporal information, thus resulting in a complete (rooted) hypothesis of relationships. Quentin asserts that the building of an unrooted tree precedes the determination of its orientation (rooting) and that the two procedures reflect distinct levels of structural organization, relying on different assumptions. Henri Quentin fully grasped the implications of time-reversible properties of unrooted trees and associated characters, in striking prescience of the same concepts developed in phylogenetics some 45 years later. The two versions of unrooted trees were developed entirely independently of each other and such convergence is testimony to the formal efficiency of approaching historical reconstruction in unrooted and rooted dimensions.  相似文献   

2.

Background  

Lateral genetic transfer can lead to disagreements among phylogenetic trees comprising sequences from the same set of taxa. Where topological discordance is thought to have arisen through genetic transfer events, tree comparisons can be used to identify the lineages that may have shared genetic information. An 'edit path' of one or more transfer events can be represented with a series of subtree prune and regraft (SPR) operations, but finding the optimal such set of operations is NP-hard for comparisons between rooted trees, and may be so for unrooted trees as well.  相似文献   

3.
A Robinson-Foulds (RF) supertree for a collection of input trees is a tree containing all the species in the input trees that is at minimum total RF distance to the input trees. Thus, an RF supertree is consistent with the maximum number of splits in the input trees. Constructing RF supertrees for rooted and unrooted data is NP-hard. Nevertheless, effective local search heuristics have been developed for the restricted case where the input trees and the supertree are rooted. We describe new heuristics, based on the Edge Contract and Refine (ECR) operation, that remove this restriction, thereby expanding the utility of RF supertrees. Our experimental results on simulated and empirical data sets show that our unrooted local search algorithms yield better supertrees than those obtained from MRP and rooted RF heuristics in terms of total RF distance to the input trees and, for simulated data, in terms of RF distance to the true tree.  相似文献   

4.
Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals—each with many genes—splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for 5 or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.  相似文献   

5.
Summary The maximum likelihood (ML) method for constructing phylogenetic trees (both rooted and unrooted trees) from DNA sequence data was studied. Although there is some theoretical problem in the comparison of ML values conditional for each topology, it is possible to make a heuristic argument to justify the method. Based on this argument, a new algorithm for estimating the ML tree is presented. It is shown that under the assumption of a constant rate of evolution, the ML method and UPGMA always give the same rooted tree for the case of three operational taxonomic units (OTUs). This also seems to hold approximately for the case with four OTUs. When we consider unrooted trees with the assumption of a varying rate of nucleotide substitution, the efficiency of the ML method in obtaining the correct tree is similar to those of the maximum parsimony method and distance methods. The ML method was applied to Brown et al.'s data, and the tree topology obtained was the same as that found by the maximum parsimony method, but it was different from those obtained by distance methods.  相似文献   

6.
One of the criteria for inferring a species tree from a collection of gene trees, when gene tree incongruence is assumed to be due to incomplete lineage sorting (ILS), is Minimize Deep Coalescence (MDC). Exact algorithms for inferring the species tree from rooted, binary trees under MDC were recently introduced. Nevertheless, in phylogenetic analyses of biological data sets, estimated gene trees may differ from true gene trees, be incompletely resolved, and not necessarily rooted. In this article, we propose new MDC formulations for the cases where the gene trees are unrooted/binary, rooted/non-binary, and unrooted/non-binary. Further, we prove structural theorems that allow us to extend the algorithms for the rooted/binary gene tree case to these cases in a straightforward manner. In addition, we devise MDC-based algorithms for cases when multiple alleles per species may be sampled. We study the performance of these methods in coalescent-based computer simulations.  相似文献   

7.
The need for structures capable of accommodating complex evolutionary signals such as those found in, for example, wheat has fueled research into phylogenetic networks. Such structures generalize the standard model of a phylogenetic tree by also allowing for cycles and have been introduced in rooted and unrooted form. In contrast to phylogenetic trees or their unrooted versions, rooted phylogenetic networks are notoriously difficult to understand. To help alleviate this, recent work on them has also centered on their “uprooted” versions. By focusing on such graphs and the combinatorial concept of a split system which underpins an unrooted phylogenetic network, we show that not only can a so-called (uprooted) 1-nested network N be obtained from the Buneman graph (sometimes also called a median network) associated with the split system \(\Sigma (N)\) induced on the set of leaves of N but also that that graph is, in a well-defined sense, optimal. Along the way, we establish the 1-nested analogue of the fundamental “splits equivalence theorem” for phylogenetic trees and characterize maximal circular split systems.  相似文献   

8.
MOTIVATION: Inferring species phylogenies with a history of gene losses and duplications is a challenging and an important task in computational biology. This problem can be solved by duplication-loss models in which the primary step is to reconcile a rooted gene tree with a rooted species tree. Most modern methods of phylogenetic reconstruction (from sequences) produce unrooted gene trees. This limitation leads to the problem of transforming unrooted gene tree into a rooted tree, and then reconciling rooted trees. The main questions are 'What about biological interpretation of choosing rooting?', 'Can we find efficiently the optimal rootings?', 'Is the optimal rooting unique?'. RESULTS: In this paper we present a model of reconciling unrooted gene tree with a rooted species tree, which is based on a concept of choosing rooting which has minimal reconciliation cost. Our analysis leads to the surprising property that all the minimal rootings have identical distributions of gene duplications and gene losses in the species tree. It implies, in our opinion, that the concept of an optimal rooting is very robust, and thus biologically meaningful. Also, it has nice computational properties. We present a linear time and space algorithm for computing optimal rooting(s). This algorithm was used in two different ways to reconstruct the optimal species phylogeny of five known yeast genomes from approximately 4700 gene trees. Moreover, we determined locations (history) of all gene duplications and gene losses in the final species tree. It is interesting to notice that the top five species trees are the same for both methods. AVAILABILITY: Software and documentation are freely available from http://bioputer.mimuw.edu.pl/~gorecki/urec  相似文献   

9.
Background

Discovering the location of gene duplications and multiple gene duplication episodes is a fundamental issue in evolutionary molecular biology. The problem introduced by Guigó et al. in 1996 is to map gene duplication events from a collection of rooted, binary gene family trees onto theirs corresponding rooted binary species tree in such a way that the total number of multiple gene duplication episodes is minimized. There are several models in the literature that specify how gene duplications from gene families can be interpreted as one duplication episode. However, in all duplication episode problems gene trees are rooted. This restriction limits the applicability, since unrooted gene family trees are frequently inferred by phylogenetic methods.

Results

In this article we show the first solution to the open problem of episode clustering where the input gene family trees are unrooted. In particular, by using theoretical properties of unrooted reconciliation, we show an efficient algorithm that reduces this problem into the episode clustering problems defined for rooted trees. We show theoretical properties of the reduction algorithm and evaluation of empirical datasets.

Conclusions

We provided algorithms and tools that were successfully applied to several empirical datasets. In particular, our comparative study shows that we can improve known results on genomic duplication inference from real datasets.

  相似文献   

10.
URec is a software based on a concept of unrooted reconciliation. It can be used to reconcile a set of unrooted gene trees with a rooted species tree or a set of rooted species trees. Moreover, it computes detailed distribution of gene duplications and gene losses in a species tree. It can be used to infer optimal species phylogenies for a given set of gene trees. URec is implemented in C++ and can be easily compiled under Unix and Windows systems. Availability: Software is freely available for download from our website at http://bioputer.mimuw.edu.pl/~gorecki/urec. This webpage also contains Windows executables and a number of advanced examples with explanations.  相似文献   

11.
We developed a recurrence relation that counts the number of tandem duplication trees (either rooted or unrooted) that are consistent with a set of n tandemly repeated sequences generated under the standard unequal recombination (or crossover) model of tandem duplications. The number of rooted duplication trees is exactly twice the number of unrooted trees, which means that on average only two positions for a root on a duplication tree are possible. Using the recurrence, we tabulated these numbers for small values of n. We also developed an asymptotic formula that for large n provides estimates for these numbers. These numbers give a priori probabilities for phylogenies of the repeated sequences to be duplication trees. This work extends earlier studies where exhaustive counts of the numbers for small n were obtained. One application showed the significance of finding that most maximum-parsimony trees constructed from repeat sequences from human immunoglobins and T-cell receptors were tandem duplication trees. Those findings provided strong support to the proposed mechanisms of tandem gene duplication. The recurrence relation also suggests efficient algorithms to recognize duplication trees and to generate random duplication trees for simulation. We present a linear-time recognition algorithm.  相似文献   

12.
Neutral macroevolutionary models, such as the Yule model, give rise to a probability distribution on the set of discrete rooted binary trees over a given leaf set. Such models can provide a signal as to the approximate location of the root when only the unrooted phylogenetic tree is known, and this signal becomes relatively more significant as the number of leaves grows. In this short note, we show that among models that treat all taxa equally, and are sampling consistent (i.e. the distribution on trees is not affected by taxa yet to be included), all such models, except one (the so-called PDA model), convey some information as to the location of the ancestral root in an unrooted tree.  相似文献   

13.
Popular methods for exploring the space of rooted phylogenetic trees use rearrangement moves such as rooted Nearest Neighbour Interchange (rNNI) and rooted Subtree Prune and Regraft (rSPR). Recently, these moves were generalized to rooted phylogenetic networks, which are a more suitable representation of reticulate evolutionary histories, and it was shown that any two rooted phylogenetic networks of the same complexity are connected by a sequence of either rSPR or rNNI moves. Here, we show that this is possible using only tail moves, which are a restricted version of rSPR moves on networks that are more closely related to rSPR moves on trees. The connectedness still holds even when we restrict to distance-1 tail moves (a localized version of tail moves). Moreover, we give bounds on the number of (distance-1) tail moves necessary to turn one network into another, which in turn yield new bounds for rSPR, rNNI and SPR (i.e. the equivalent of rSPR on unrooted networks). The upper bounds are constructive, meaning that we can actually find a sequence with at most this length for any pair of networks. Finally, we show that finding a shortest sequence of tail or rSPR moves is NP-hard.  相似文献   

14.
Pedigrees illustrate the genealogical relationships among individuals, and phylogenies do the same for groups of organisms (such as species, genera, etc.). Here, I provide a brief survey of current concepts and methods for calculating and displaying genealogical relationships. These relationships have long been recognized to be reticulating, rather than strictly divergent, and so both pedigrees and phylogenies are correctly treated as networks rather than trees. However, currently most pedigrees are instead presented as “family trees”, and most phylogenies are presented as phylogenetic trees. Nevertheless, the historical development of concepts shows that networks pre-dated trees in most fields of biology, including the study of pedigrees, biology theory, and biology practice, as well as in historical linguistics in the social sciences. Trees were actually introduced in order to provide a simpler conceptual model for historical relationships, since trees are a specific type of simple network. Computationally, trees and networks are a part of graph theory, consisting of nodes connected by edges. In this mathematical context they differ solely in the absence or presence of reticulation nodes, respectively. There are two types of graphs that can be called phylogenetic networks: (1) rooted evolutionary networks, and (2) unrooted affinity networks. There are quite a few computational methods for unrooted networks, which have two main roles in phylogenetics: (a) they act as a generic form of multivariate data display; and (b) they are used specifically to represent haplotype networks. Evolutionary networks are more difficult to infer and analyse, as there is no mathematical algorithm for reconstructing unique historical events. There is thus currently no coherent analytical framework for computing such networks.  相似文献   

15.
The evolutionary history of a collection of species is usually represented by a phylogenetic tree. Sometimes, phylogenetic networks are used as a means of representing reticulate evolution or of showing uncertainty and incompatibilities in evolutionary datasets. This is often done using unrooted phylogenetic networks such as split networks, due in part, to the availability of software (SplitsTree) for their computation and visualization. In this paper we discuss the problem of drawing rooted phylogenetic networks as cladograms or phylograms in a number of different views that are commonly used for rooted trees. Implementations of the algorithms are available in new releases of the Dendroscope and SplitsTree programs.  相似文献   

16.
Liu L  Yu L 《Systematic biology》2011,60(5):661-667
In this study, we develop a distance method for inferring unrooted species trees from a collection of unrooted gene trees. The species tree is estimated by the neighbor joining (NJ) tree built from a distance matrix in which the distance between two species is defined as the average number of internodes between two species across gene trees, that is, average gene-tree internode distance. The distance method is named NJ(st) to distinguish it from the original NJ method. Under the coalescent model, we show that if gene trees are known or estimated correctly, the NJ(st) method is statistically consistent in estimating unrooted species trees. The simulation results suggest that NJ(st) and STAR (another coalescence-based method for inferring species trees) perform almost equally well in estimating topologies of species trees, whereas the Bayesian coalescence-based method, BEST, outperforms both NJ(st) and STAR. Unlike BEST and STAR, the NJ(st) method can take unrooted gene trees to infer species trees without using an outgroup. In addition, the NJ(st) method can handle missing data and is thus useful in phylogenomic studies in which data sets often contain missing loci for some individuals.  相似文献   

17.
Abstract

Phyto-recurrent selection is an established method for selecting tree genotypes for phytoremediation. To identify promising Populus (poplar) and Salix (willow) genotypes for phytotechnologies, our objectives were to (1) evaluate the genotypic variability in survival, height, and diameter of poplar and willow clones established on soils heavily contaminated with nitrates; and (2) assess the genotypic stability in survival and diameter of selected poplar clones after one and eleven growing seasons. We tested 27 poplar and 10 willow clones planted as unrooted cuttings, along with 15 poplar genotypes planted as rooted cuttings. The trees were tested at an agricultural production facility in the Midwestern, United States. After 11 growing seasons, using phyto-recurrent selection, we surveyed survival and measured the diameter of 27 poplar clones (14 unrooted, 13 rooted) that were selected based on superior survival and growth throughout plantation development. Overall, willow exhibited the greatest survival, while poplar had the greatest height and diameter. At 11 years after planting, superior clones were identified that exhibited above-average diameter growth at the establishment- and rotation-age, most of which had stable genotypic performance over time. Selection of specific clones was favorable to genomic groups, based on the geographic location and soil conditions of the site.  相似文献   

18.
Accuracy of phylogenetic trees estimated from DNA sequence data   总被引:4,自引:1,他引:3  
The relative merits of four different tree-making methods in obtaining the correct topology were studied by using computer simulation. The methods studied were the unweighted pair-group method with arithmetic mean (UPGMA), Fitch and Margoliash's (FM) method, thd distance Wagner (DW) method, and Tateno et al.'s modified Farris (MF) method. An ancestral DNA sequence was assumed to evolve into eight sequences following a given model tree. Both constant and varying rates of nucleotide substitution were considered. Once the DNA sequences for the eight extant species were obtained, phylogenetic trees were constructed by using corrected (d) and uncorrected (p) nucleotide substitutions per site. The topologies of the trees obtained were then compared with that of the model tree. The results obtained can be summarized as follows: (1) The probability of obtaining the correct rooted or unrooted tree is low unless a large number of nucleotide differences exists between different sequences. (2) When the number of nucleotide substitutions per sequence is small or moderately large, the FM, DW, and MF methods show a better performance than UPGMA in recovering the correct topology. The former group of methods is particularly good for obtaining the correct unrooted tree. (3) When the number of substitutions per sequence is large, UPGMA is at least as good as the other methods, particularly for obtaining the correct rooted tree. (4) When the rate of nucleotide substitution varies with evolutionary lineage, the FM, DW, and MF methods show a better performance in obtaining the correct topology than UPGMA, except when a rooted tree is to be produced from data with a large number of nucleotide substitutions per sequence.(ABSTRACT TRUNCATED AT 250 WORDS)   相似文献   

19.
Compatibility analysis and its applications   总被引:1,自引:0,他引:1  
A two-state character is defined as uniquely derived if it has only evolved once in the history of a group, without subsequent reversal. Two independent characters cannot both be uniquely derived if all four possible combinations (or all three excluding that of the two ancestral forms) occur.
A number of ways of choosing compatible sets of uniquely derived characters are discussed and used to derive possible unrooted and rooted trees. Results of these are related to those chosen on parsimony criteria, using data for orthopteroid groups, and the assumptions of both methods are compared. Application of compatibility analysis to the moth genera Teldenia and Argodrepana is also discussed. Compatibility and parsimony methods are complementary rather than exclusive of each other.  相似文献   

20.
The Yule model and the coalescent model are two neutral stochastic models for generating trees in phylogenetics and population genetics, respectively. Although these models are quite different, they lead to identical distributions concerning the probability that pre-specified groups of taxa form monophyletic groups (clades) in the tree. We extend earlier work to derive exact formulae for the probability of finding one or more groups of taxa as clades in a rooted tree, or as ‘clans’ in an unrooted tree. Our findings are relevant for calculating the statistical significance of observed monophyly and reciprocal monophyly in phylogenetics.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号