Found 20 similar documents (search time: 15 ms)
1.
Phylogenetic networks are a generalization of phylogenetic trees that are used to represent non-tree-like evolutionary histories that arise in organisms such as plants and bacteria, or uncertainty in evolutionary histories. An unrooted phylogenetic network on a non-empty, finite set X of taxa, or network, is a connected, simple graph in which every vertex has degree 1 or 3 and whose leaf set is X. It is called a phylogenetic tree if the underlying graph is a tree. In this paper we consider properties of tree-based networks, that is, networks that can be constructed by adding edges into a phylogenetic tree. We show that although they have some properties in common with their rooted analogues which have recently drawn much attention in the literature, they have some striking differences in terms of both their structural and computational properties. We expect that our results could eventually have applications to, for example, detecting horizontal gene transfer or hybridization which are important factors in the evolution of many organisms.
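The definition in this abstract (connected, simple, every vertex of degree 1 or 3, leaf set equal to X) can be checked mechanically. The following is a minimal sketch, not taken from the paper; the function names `is_unrooted_network` and `is_tree` and the edge-list input format are assumptions made for illustration, and the degenerate single-vertex case is not handled.

```python
from collections import deque

def is_unrooted_network(edges, taxa):
    """Check: simple, connected, all degrees in {1, 3}, leaf set == taxa.

    `edges` is a list of (u, v) pairs; this input format is an assumption.
    """
    adj = {}
    seen_edges = set()
    for u, v in edges:
        if u == v:
            return False                      # loops make the graph non-simple
        key = frozenset((u, v))
        if key in seen_edges:
            return False                      # parallel edges are not allowed
        seen_edges.add(key)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return False
    if any(len(nb) not in (1, 3) for nb in adj.values()):
        return False
    leaves = {v for v, nb in adj.items() if len(nb) == 1}
    if leaves != set(taxa):
        return False
    # Breadth-first search to confirm connectivity.
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)

def is_tree(edges):
    """A connected graph is a tree exactly when |E| = |V| - 1."""
    vertices = {v for e in edges for v in e}
    return len(edges) == len(vertices) - 1

# Quartet tree ab|cd with internal vertices u and v.
tree_edges = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")]
# A 4-cycle p-q-r-s with one leaf hanging off each cycle vertex.
cycle_edges = [("p", "q"), ("q", "r"), ("r", "s"), ("s", "p"),
               ("a", "p"), ("b", "q"), ("c", "r"), ("d", "s")]
```

Both example graphs satisfy the network definition on the taxa a, b, c, d, but only the first is a phylogenetic tree; `is_tree` is meaningful only after `is_unrooted_network` has confirmed connectivity.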
2.
Huson Daniel H. 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(1):103-109
The evolutionary history of a collection of species is usually represented by a phylogenetic tree. Sometimes, phylogenetic networks are used as a means of representing reticulate evolution or of showing uncertainty and incompatibilities in evolutionary datasets. This is often done using unrooted phylogenetic networks such as split networks, due in part to the availability of software (SplitsTree) for their computation and visualization. In this paper we discuss the problem of drawing rooted phylogenetic networks as cladograms or phylograms in a number of different views that are commonly used for rooted trees. Implementations of the algorithms are available in new releases of the Dendroscope and SplitsTree programs.
3.
Stephen J. Willson 《Bulletin of mathematical biology》2010,72(2):340-358
A phylogenetic network is a rooted acyclic digraph with vertices corresponding to taxa. Let X denote a set of vertices containing the root, the leaves, and all vertices of outdegree 1. Regard X as the set of vertices on which measurements such as DNA can be made. A vertex is called normal if it has one parent, and hybrid if it has more than one parent. The network is called normal if it has no redundant arcs and also from every vertex there is a directed path to a member of X such that all vertices after the first are normal. This paper studies properties of normal networks. Under a simple model of inheritance that allows homoplasies only at hybrid vertices, there is essentially unique determination of the genomes at all vertices by the genomes at members of X if and only if the network is normal. This model is a limiting case of more standard models of inheritance when the substitution rate is sufficiently low. Various mathematical properties of normal networks are described. These properties include that the number of vertices grows at most quadratically with the number of leaves and that the number of hybrid vertices grows at most linearly with the number of leaves.
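The vertex classes named in this abstract (normal, hybrid, and the measurable set X) follow directly from in- and out-degrees. The sketch below only illustrates those definitions; it does not test the full normality condition (redundant arcs and normal paths), and the function name `classify_vertices` and the arc-list input are assumptions.

```python
def classify_vertices(arcs):
    """Split the vertices of a rooted DAG into normal and hybrid, and build X.

    `arcs` is a list of (parent, child) pairs; this format is an assumption.
    """
    parents, children = {}, {}
    for u, v in arcs:
        parents.setdefault(u, set())
        parents.setdefault(v, set()).add(u)
        children.setdefault(v, set())
        children.setdefault(u, set()).add(v)
    normal = {v for v, p in parents.items() if len(p) == 1}   # exactly one parent
    hybrid = {v for v, p in parents.items() if len(p) > 1}    # more than one parent
    roots = {v for v, p in parents.items() if not p}
    leaves = {v for v, c in children.items() if not c}
    outdegree_one = {v for v, c in children.items() if len(c) == 1}
    return {"normal": normal, "hybrid": hybrid,
            "X": roots | leaves | outdegree_one}

# Hypothetical example: leaf "2" descends from hybrid vertex "h",
# which has the two parents "a" and "b".
example_arcs = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "h"),
                ("b", "h"), ("b", "3"), ("h", "2")]
example_classes = classify_vertices(example_arcs)
```

In this example `h` is the only hybrid vertex, and it also belongs to X because it has outdegree 1.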
4.
Cardona Gabriel Rossello Francesc Valiente Gabriel 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(4):552-569
Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of nontreelike evolutionary events, like recombination, hybridization, or lateral gene transfer. While much progress has been made to find practical algorithms for reconstructing a phylogenetic network from a set of sequences, all attempts to endow a class of phylogenetic networks (strictly extending the class of phylogenetic trees) with a well-founded distance measure have, to the best of our knowledge and with the only exception of the bipartition distance on regular networks, failed so far. In this paper, we present and study a new meaningful class of phylogenetic networks, called tree-child phylogenetic networks, and we provide an injective representation of these networks as multisets of vectors of natural numbers, their path multiplicity vectors. We then use this representation to define a distance on this class that extends the well-known Robinson-Foulds distance for phylogenetic trees and to give an alignment method for pairs of networks in this class. Simple polynomial algorithms for reconstructing a tree-child phylogenetic network from its path multiplicity vectors, for computing the distance between two tree-child phylogenetic networks and for aligning a pair of tree-child phylogenetic networks, are provided. They have been implemented as a Perl package and a Java applet, which can be found at http://bioinfo.uib.es/~recerca/phylonetworks/mudistance/.
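To make the path multiplicity representation concrete: the vector of a node records, for each leaf, the number of directed paths from that node to the leaf, and the distance compares the resulting multisets by symmetric difference. The sketch below is an illustration in Python, not the authors' Perl or Java implementation; the function names, the arc-list input and the example networks are assumptions.

```python
from collections import Counter
from functools import lru_cache

def mu_representation(arcs, leaf_order):
    """Return the multiset (Counter) of path multiplicity vectors of a rooted DAG.

    `arcs` is a list of (parent, child) pairs and `leaf_order` fixes the
    coordinate of each leaf; both conventions are assumptions.
    """
    children = {}
    for u, v in arcs:
        children.setdefault(u, []).append(v)
        children.setdefault(v, [])
    index = {leaf: i for i, leaf in enumerate(leaf_order)}

    @lru_cache(maxsize=None)
    def mu(node):
        if not children[node]:
            # A leaf reaches only itself, by the single trivial path.
            return tuple(1 if i == index[node] else 0
                         for i in range(len(leaf_order)))
        # Paths from an internal node are the paths from its children, summed.
        return tuple(map(sum, zip(*(mu(child) for child in children[node]))))

    return Counter(mu(node) for node in children)

def mu_distance(arcs1, arcs2, leaf_order):
    """Size of the symmetric difference of the two multisets of vectors."""
    m1 = mu_representation(arcs1, leaf_order)
    m2 = mu_representation(arcs2, leaf_order)
    return sum(((m1 - m2) + (m2 - m1)).values())

# Hypothetical network with one hybrid node "h" above leaf "2".
network_arcs = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "h"),
                ("b", "h"), ("b", "3"), ("h", "2")]
# The rooted tree ((1,2),3) on the same leaves.
tree_arcs = [("r", "x"), ("r", "3"), ("x", "1"), ("x", "2")]
```

For the example network the root has the vector (1, 2, 1), since leaf 2 is reachable along two paths, and `mu_distance(network_arcs, tree_arcs, ["1", "2", "3"])` evaluates to 4.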
5.
6.
Cardona Gabriel Llabres Merce Rossello Francesc Valiente Gabriel 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(4):629-638
We prove that Nakhleh's metric for reduced phylogenetic networks is also a metric on the classes of tree-child phylogenetic networks, semibinary tree-sibling time consistent phylogenetic networks, and multilabeled phylogenetic trees. We also prove that it separates distinguishable phylogenetic networks. In this way, it becomes the strongest dissimilarity measure for phylogenetic networks available so far. Furthermore, we propose a generalization of that metric that separates arbitrary phylogenetic networks.
7.
In the last decade, the use of phylogenetic networks to analyze the evolution of species whose past is likely to include reticulation events, such as horizontal gene transfer or hybridization, has gained popularity among evolutionary biologists. Nevertheless, the evolution of a particular gene can generally be described without reticulation events and therefore be represented by a phylogenetic tree. While these two views do not contradict each other, they emphasize the need for algorithms that analyze and summarize the tree-like information that is contained in a phylogenetic network. We contribute to the toolbox of such algorithms by investigating the question of whether or not a phylogenetic network embeds a tree twice and give a quadratic-time algorithm to solve this problem for a class of networks that is more general than tree-child networks.
8.
van Iersel Leo Keijsper Judith Kelk Steven Stougie Leen Hagen Ferry Boekhout Teun 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(4):667-681
Jansson and Sung showed that, given a dense set of input triplets T (representing hypotheses about the local evolutionary relationships of triplets of taxa), it is possible to determine in polynomial time whether there exists a level-1 network consistent with T, and if so, to construct such a network [24]. Here, we extend this work by showing that this problem is solvable in polynomial time even for the construction of level-2 networks. This shows that, assuming density, it is tractable to construct plausible evolutionary histories from input triplets even when such histories are heavily nontree-like. This further strengthens the case for the use of triplet-based methods in the construction of phylogenetic networks. We also implemented the algorithm and applied it to yeast data.
9.
Leo van Iersel Vincent Moulton Eveline de Swart Taoyang Wu 《Bulletin of mathematical biology》2017,79(5):1135-1154
Phylogenetic networks are a generalization of evolutionary trees that are used by biologists to represent the evolution of organisms which have undergone reticulate evolution. Essentially, a phylogenetic network is a directed acyclic graph having a unique root in which the leaves are labelled by a given set of species. Recently, some approaches have been developed to construct phylogenetic networks from collections of networks on 2 and 3 leaves, which are known as binets and trinets, respectively. Here we study in more depth properties of collections of binets, one of the simplest possible types of networks into which a phylogenetic network can be decomposed. More specifically, we show that if a collection of level-1 binets is compatible with some binary network, then it is also compatible with a binary level-1 network. Our proofs are based on useful structural results concerning lowest stable ancestors in networks. In addition, we show that, although the binets do not determine the topology of the network, they do determine the number of reticulations in the network, which is one of its most important parameters. We also consider algorithmic questions concerning binets. We show that deciding whether an arbitrary set of binets is compatible with some network is at least as hard as the well-known graph isomorphism problem. However, if we restrict to level-1 binets, it is possible to decide in polynomial time whether there exists a binary network that displays all the binets. We also show that to find a network that displays a maximum number of the binets is NP-hard, but that there exists a simple polynomial-time 1/3-approximation algorithm for this problem. It is hoped that these results will eventually assist in the development of new methods for constructing phylogenetic networks from collections of smaller networks.
10.
Juan Wang 《PloS one》2016,11(11)
Rooted phylogenetic networks are primarily used to represent conflicting evolutionary information and to describe reticulate evolutionary events in phylogeny. Many methods have been presented so far for constructing rooted phylogenetic networks; among them, the methods based on the decomposition property of networks and on the incompatibility graph (such as CASS, LNETWORK and BIMLR) are more efficient than the other available methods. The paper discusses and compares these methods on both practical and artificial datasets, in terms of running time and the effectiveness of the constructed phylogenetic networks. The results show that LNETWORK can construct much simpler networks than the others.
11.
Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are indistinguishable. This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks.
12.
Stephen J. Willson 《Bulletin of mathematical biology》2013,75(10):1840-1878
Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle.
13.
Willson SJ 《Bulletin of mathematical biology》2006,68(4):919-944
In this paper, a class of rooted acyclic directed graphs (called TOM-networks) is defined that generalizes rooted trees and allows for models including hybridization events. It is argued that the defining properties are biologically plausible. Each TOM-network has a distance defined between each pair of vertices. For a TOM-network N, suppose that the set X consisting of the leaves and the root is known, together with the distances between members of X. It is proved that N is uniquely determined from this information and can be reconstructed in polynomial time. Thus, given exact distance information on the leaves and root, the phylogenetic network can be uniquely recovered, provided that it is a TOM-network. An outgroup can be used instead of a true root.
14.
Cardona Gabriel Llabres Merce Rossello Francesc Valiente Gabriel 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(3):454-469
The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the second in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we generalize to phylogenetic networks two metrics that have already been introduced in the literature for phylogenetic trees: the nodal distance and the triplets distance. We prove that they are metrics on any class of tree-child time consistent phylogenetic networks on the same set of taxa, as well as some basic properties for them. To prove these results, we introduce a reduction/expansion procedure that can be used not only to establish properties of tree-child time consistent phylogenetic networks by induction, but also to generate all tree-child time consistent phylogenetic networks with a given number of leaves.
15.
Willson SJ 《Bulletin of mathematical biology》2007,69(8):2561-2590
Suppose G is a phylogenetic network given as a rooted acyclic directed graph. Let X be a subset of the vertex set containing the root, all leaves, and all vertices of outdegree 1. A vertex is “regular” if it has a unique parent, and “hybrid” if it has two parents. Consider the case where each gene is binary. Assume an idealized system of inheritance in which no homoplasies occur at regular vertices, but homoplasies can occur at hybrid vertices. Under our model, the distances between taxa are shown to be described using a system of numbers called “originating weights” and “homoplasy weights.” Assume that the distances are known between all members of X. Sufficient conditions are given such that the graph G and all the originating and homoplasy weights can be reconstructed from the given distances.
16.
Phylogenetic networks are necessary to represent the tree of life expanded by edges to represent events such as horizontal gene transfers, hybridizations or gene flow. Not all species follow the paradigm of vertical inheritance of their genetic material. While a great deal of research has flourished into the inference of phylogenetic trees, statistical methods to infer phylogenetic networks are still limited and under development. The main disadvantage of existing methods is a lack of scalability. Here, we present a statistical method to infer phylogenetic networks from multi-locus genetic data in a pseudolikelihood framework. Our model accounts for incomplete lineage sorting through the coalescent model, and for horizontal inheritance of genes through reticulation nodes in the network. Computation of the pseudolikelihood is fast and simple, and it avoids the burdensome calculation of the full likelihood which can be intractable with many species. Moreover, estimation at the quartet-level has the added computational benefit that it is easily parallelizable. Simulation studies comparing our method to a full likelihood approach show that our pseudolikelihood approach is much faster without compromising accuracy. We applied our method to reconstruct the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), which is characterized by widespread hybridizations.
17.
Cardona Gabriel Llabres Merce Rossello Francesc Valiente Gabriel 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(1):46-61
The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the first in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we study three metrics that have already been introduced in the literature: the Robinson-Foulds distance, the tripartitions distance and the $\mu$-distance. They generalize to networks the classical Robinson-Foulds or partition distance for phylogenetic trees. We analyze the behavior of these metrics by studying their least and largest values and when they achieve them. As a by-product of this study, we obtain tight bounds on the size of a tree-child time consistent phylogenetic network.
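As a reminder of the classical tree metric that these network distances generalize, the sketch below computes a Robinson-Foulds distance for rooted trees as the size of the symmetric difference of their cluster sets. It is an illustration only: the nested-tuple tree encoding and the function names are assumptions, the count is unnormalized (some conventions halve it), and the unrooted version compares splits rather than clusters.

```python
def cluster_set(tree):
    """Collect the leaf set below every node of a rooted tree.

    A tree is a nested tuple whose leaves are strings; this encoding
    is an assumption made for the example.
    """
    found = set()

    def visit(node):
        if isinstance(node, tuple):
            below = frozenset().union(*(visit(child) for child in node))
        else:
            below = frozenset([node])
        found.add(below)
        return below

    visit(tree)
    return found

def rf_distance(tree1, tree2):
    """Number of clusters that occur in exactly one of the two trees."""
    return len(cluster_set(tree1) ^ cluster_set(tree2))

t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
```

Here `rf_distance(t1, t2)` is 4, because the clusters {a, b} and {c, d} occur only in `t1` and the clusters {a, c} and {b, d} occur only in `t2`.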
18.
After the introduction of uprooted trees to their environment, the behavior of 28 socially housed, laboratory chimpanzees (Pan troglodytes) was studied for five months. Subjects used the tree during 41.9% of the data points collected during the first day trees were introduced. Thereafter, the mean for tree use dropped to 3.5% and remained fairly consistent. Immature subjects used the trees significantly more than did adult subjects, as measured by the Mann-Whitney U-test. No sex difference was detected. The trees elicited a variety of species-appropriate behaviors. Increasing the similarity between the behavior of captive and wild chimpanzees can be viewed as promoting the psychological well-being of the captive animals.
19.
It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and providing an analogous augmentation of the general Markov model has thus far been elusive. In this paper, we rectify this shortcoming by showing how to extend the general Markov model on trees to include incompatible edges; and even further to more general network models. This is achieved by exploring the algebra of the generators of the continuous-time Markov chain together with the “splitting” operator that generates the branching process on phylogenetic trees. For simplicity, we proceed by discussing the two state case and then show that our results are easily extended to more states with little complication. Intriguingly, upon restriction of the two state general Markov model to the parameter space of the binary symmetric model, our extension is indistinguishable from the Hadamard approach only on trees; as soon as any incompatible splits are introduced the two approaches give rise to differing probability distributions with disparate structure. Through exploration of a simple example, we give an argument that our extension to more general networks has desirable properties that the previous approaches do not share. In particular, our construction allows for convergent evolution of previously divergent lineages; a property that is of significant interest for biological applications.
20.
Jin Guohua Nakhleh Luay Snir Sagi Tuller Tamir 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(3):495-505
Phylogenies, the evolutionary histories of groups of organisms, play a major role in representing the interrelationships among biological entities. Many methods for reconstructing and studying such phylogenies have been proposed, almost all of which assume that the underlying history of a given set of species can be represented by a binary tree. Although many biological processes can be effectively modeled and summarized in this fashion, others cannot: recombination, hybrid speciation, and horizontal gene transfer result in networks of relationships rather than trees of relationships. In previous works, we formulated a maximum parsimony (MP) criterion for reconstructing and evaluating phylogenetic networks, and demonstrated its quality on biological as well as synthetic data sets. In this paper, we provide further theoretical results as well as a very fast heuristic algorithm for the MP criterion of phylogenetic networks. In particular, we provide a novel combinatorial definition of phylogenetic networks in terms of “forbidden cycles,” and provide detailed hardness and hardness of approximation proofs for the “small” MP problem. We demonstrate the performance of our heuristic in terms of time and accuracy on both biological and synthetic data sets. Finally, we explain the difference between our model and a similar one formulated by Nguyen et al., and describe the implications of this difference on the hardness and approximation results.