Similar articles
 20 similar articles found (search time: 15 ms)
1.
Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle.

2.
We describe some new and recent results that allow for the analysis and representation of reticulate evolution by non-tree networks. In particular, we (1) present a simple result to show that, despite the presence of reticulation, there is always a well-defined underlying tree that corresponds to those parts of life that do not have a history of reticulation; (2) describe and apply new theory for determining the smallest number of hybridization events required to explain conflicting gene trees; and (3) present a new algorithm to determine whether an arbitrary rooted network can be realized by contemporaneous reticulation events. We illustrate these results with examples. [Directed acyclic graph; reticulate evolution; hybrid species; sub-tree prune and re-graft.]

3.
We present new methods for reconstructing reticulate evolution of species due to events such as horizontal transfer or hybrid speciation; both methods are based upon extensions of Wayne Maddison's approach in his seminal 1997 paper. Our first method is a polynomial time algorithm for constructing phylogenetic networks from two gene trees contained inside the network. We allow the network to have an arbitrary number of reticulations, but we limit the reticulation in the network so that the cycles in the network are node-disjoint ("galled"). Our second method is a polynomial time algorithm for constructing networks with one reticulation, where we allow for errors in the estimated gene trees. Using simulations, we demonstrate improved performance of this method over both NeighborNet and Maddison's method.

4.

Phylogenetic networks are a type of leaf-labelled, acyclic, directed graph used by biologists to represent the evolutionary history of species whose past includes reticulation events. A phylogenetic network is tree–child if each non-leaf vertex is the parent of a tree vertex or a leaf. Up to a certain equivalence, it has been recently shown that, under two different types of weightings, edge-weighted tree–child networks are determined by their collection of distances between each pair of taxa. However, the size of these collections can be exponential in the size of the taxa set. In this paper, we show that, if we have no “shortcuts”, that is, the networks are normal, the same results are obtained with only a quadratic number of inter-taxa distances by using the shortest distance between each pair of taxa. The proofs are constructive and give cubic-time algorithms in the size of the taxa sets for building such weighted networks.
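
As an illustration of the tree-child condition defined in this abstract, the following Python sketch checks it for a network given as a list of directed (parent, child) edges. The function name `is_tree_child`, the edge-list representation, and the two example networks are assumptions made here for illustration and are not taken from the paper. A reticulation is taken to be any vertex with in-degree of at least two.

```python
from collections import defaultdict


def is_tree_child(edges):
    """Return True if every non-leaf vertex has at least one child that is
    a tree vertex or a leaf, i.e. a child with in-degree at most 1."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # The keys of `children` are exactly the non-leaf vertices.
    return all(
        any(indegree[c] <= 1 for c in kids)
        for kids in children.values()
    )


# Hypothetical network with one reticulation h (parents a and b).
one_reticulation = [
    ("r", "a"), ("r", "b"),
    ("a", "x"), ("a", "h"),
    ("b", "h"), ("b", "y"),
    ("h", "z"),
]

# Hypothetical network in which both children of a (and of b) are reticulations.
stacked = [
    ("r", "a"), ("r", "b"),
    ("a", "h1"), ("a", "h2"),
    ("b", "h1"), ("b", "h2"),
    ("h1", "x"), ("h2", "y"),
]

print(is_tree_child(one_reticulation))
print(is_tree_child(stacked))
```

The check covers only the tree-child condition. It does not test for the "shortcuts" that distinguish normal networks, and it does not validate that the input is acyclic.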


5.
Reticulation processes in evolution mean that the ancestral history of certain groups of present-day species is non-tree-like. These processes include hybridization, lateral gene transfer, and recombination. Despite the existence of reticulation, such events are relatively rare and so a fundamental problem for biologists is the following: Given a collection of rooted binary phylogenetic trees on sets of species that correctly represent the tree-like evolution of different parts of their genomes, what is the smallest number of "reticulation" vertices in any network that explains the evolution of the species under consideration? It has been previously shown that this problem is NP-hard even when the collection consists of only two rooted binary phylogenetic trees. However, in this paper, we show that the problem is fixed-parameter tractable in the two-tree instance, when parameterized by this smallest number of reticulation vertices.

6.
Phylogenetic networks are models of evolution that go beyond trees, incorporating non-tree-like biological events such as recombination (or more generally reticulation), which occur either in a single species (meiotic recombination) or between species (reticulation due to lateral gene transfer and hybrid speciation). The central algorithmic problems are to reconstruct a plausible history of mutations and non-tree-like events, or to determine the minimum number of such events needed to derive a given set of binary sequences, allowing one mutation per site. Meiotic recombination, reticulation and recurrent mutation can cause conflict or incompatibility between pairs of sites (or characters) of the input. Previously, we used "conflict graphs" and "incompatibility graphs" to compute lower bounds on the minimum number of recombination nodes needed, and to efficiently solve constrained cases of the minimization problem. Those results exposed the structural and algorithmic importance of the non-trivial connected components of those two graphs. In this paper, we more fully develop the structural importance of non-trivial connected components of the incompatibility and conflict graphs, proving a general decomposition theorem (Gusfield and Bansal, 2005) for phylogenetic networks. The decomposition theorem depends only on the incompatibilities in the input sequences, and hence applies to many types of phylogenetic networks, and to any biological phenomenon that causes pairwise incompatibilities. More generally, the proof of the decomposition theorem exposes a maximal embedded tree structure that exists in the network when the sequences cannot be derived on a perfect phylogenetic tree. This extends the theory of perfect phylogeny in a natural and important way. The proof is constructive and leads to a polynomial-time algorithm to find the unique underlying maximal tree structure.
We next examine and fully solve the major open question from Gusfield and Bansal (2005): Is it true that for every input there must be a fully decomposed phylogenetic network that minimizes the number of recombination nodes used, over all phylogenetic networks for the input? We previously conjectured that the answer is yes. In this paper, we show that the answer is no, both for the case that only single-crossover recombination is allowed, and also for the case that unbounded multiple-crossover recombination is allowed. The latter case also resolves a conjecture recently stated in (Huson and Klopper, 2007) in the context of reticulation networks. Although the conjecture from Gusfield and Bansal (2005) is disproved in general, we show that the answer to the conjecture is yes in several natural special cases, and establish necessary combinatorial structure that counterexamples to the conjecture must possess. We also show that counterexamples to the conjecture are rare (for the case of single-crossover recombination) in simulated data.
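
The incompatibility graph mentioned in this abstract can be sketched in a few lines of Python. The abstract does not define pairwise incompatibility, so the sketch assumes the standard four-gamete test: two binary sites are incompatible when all four combinations 00, 01, 10 and 11 occur among the input sequences. The function names and the toy input are hypothetical, and the code only builds the graph and lists its non-trivial connected components. It does not implement the decomposition theorem itself.

```python
from itertools import combinations


def incompatible(col_a, col_b):
    """Four-gamete test (assumed definition): the two sites are incompatible
    when all four state combinations occur among the sequences."""
    return len(set(zip(col_a, col_b))) == 4


def incompatibility_components(sequences):
    """Return the non-trivial connected components (two or more sites) of the
    incompatibility graph, as sorted lists of site indices."""
    n_sites = len(sequences[0])
    columns = [[seq[i] for seq in sequences] for i in range(n_sites)]

    # Build the graph: one node per site, one edge per incompatible pair.
    adjacency = {i: set() for i in range(n_sites)}
    for i, j in combinations(range(n_sites), 2):
        if incompatible(columns[i], columns[j]):
            adjacency[i].add(j)
            adjacency[j].add(i)

    # Depth-first search to collect connected components.
    seen, components = set(), []
    for start in range(n_sites):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            site = stack.pop()
            if site in component:
                continue
            component.add(site)
            stack.extend(adjacency[site] - component)
        seen |= component
        if len(component) > 1:
            components.append(sorted(component))
    return components


# Toy input: sites 0 and 1 show all four gametes; site 2 conflicts with neither.
print(incompatibility_components(["000", "010", "100", "111"]))
```

Binary sequences are passed as equal-length strings of "0" and "1". An input that fits a perfect phylogeny yields an empty list of components.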

7.
《Comptes Rendus Palevol》2013,12(6):333-337
Hybridization is increasingly seen as an important source of adaptive genetic variation and biotic diversity. Recent phylogenetic studies on the early evolution of birds suggest that the early diversification of neoavian orders perhaps involved a period of extensive hybridization or incomplete lineage sorting. Phylogenetic error, saturation, long-branch attraction, and convergence make it difficult to detect ancient hybridization events and differentiate them from incomplete lineage sorting using sequence data. We used recently published retroposon marker data to visualize the early radiation of Neoaves within a phylogenetic network approach, and found that the most basal neoavian taxa indeed show a complex pattern of reticulated relationships. Moreover, the reticulation levels of different parts of the network are consistent with the insertion pattern of the retroposon elements. The use of network-based analyses on homoplasy-free data shows true conflicting signals and the taxa involved that are not represented in trees.

8.
Reticulate evolution, the umbrella term for processes like hybridization, horizontal gene transfer, and recombination, plays an important role in the history of life of many species. Although the occurrence of such events is widely accepted, approaches to calculate the extent to which reticulation has influenced evolution are relatively rare. In this paper, we show that the NP-hard problem of calculating the minimum number of reticulation events for two (arbitrary) rooted phylogenetic trees parameterized by this minimum number is fixed-parameter tractable.

9.
In the last decade, the use of phylogenetic networks to analyze the evolution of species whose past is likely to include reticulation events, such as horizontal gene transfer or hybridization, has gained popularity among evolutionary biologists. Nevertheless, the evolution of a particular gene can generally be described without reticulation events and therefore be represented by a phylogenetic tree. These two views do not contradict each other, but together they emphasize the need for algorithms that analyze and summarize the tree-like information contained in a phylogenetic network. We contribute to the toolbox of such algorithms by investigating the question of whether or not a phylogenetic network embeds a tree twice and give a quadratic-time algorithm to solve this problem for a class of networks that is more general than tree-child networks.

10.
Phylogenetic networks aim to represent the evolutionary history of taxa. Within these, reticulate networks are explicitly able to accommodate evolutionary events like recombination, hybridization, or lateral gene transfer. Although several metrics exist to compare phylogenetic networks, they make several assumptions regarding the nature of the networks that are not likely to be fulfilled by the evolutionary process. In order to characterize the potential disagreement between the algorithms and the biology, we have used the coalescent with recombination to build the type of networks produced by reticulate evolution and classified them as regular, tree sibling, tree child, or galled trees. We show that, as expected, the complexity of these reticulate networks is a function of the population recombination rate. At small recombination rates, most of the networks produced are already more complex than regular or tree sibling networks, whereas with moderate and large recombination rates, no network fits into any of the standard classes. We conclude that new metrics still need to be devised in order to properly compare two phylogenetic networks that have arisen from a reticulating evolutionary process.

11.

Background  

Phylogenetic trees based on sequences from a set of taxa can be incongruent due to horizontal gene transfer (HGT). By identifying the HGT events, we can reconcile the gene trees and derive a taxon tree that adequately represents the species' evolutionary history. One HGT can be represented by a rooted Subtree Prune and Regraft (RSPR) operation and the number of RSPRs separating two trees corresponds to the minimum number of HGT events. Identifying the minimum number of RSPRs separating two trees is NP-hard, but the problem is fixed-parameter tractable. A number of heuristic and two exact approaches to identifying the minimum number of RSPRs have been proposed. This is the first implementation delivering an exact solution as well as the intermediate trees connecting the input trees.
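
To make the RSPR operation concrete, here is a minimal Python sketch of a single rooted subtree prune and regraft move on a rooted binary tree. The child-to-parent dictionary, the function name `rspr`, and the example tree are assumptions chosen for illustration. This is not the implementation described in the abstract, and it performs one move only rather than searching for the minimum number of moves.

```python
def rspr(parent, v, target):
    """Prune the subtree rooted at `v` and regraft it onto the edge entering
    `target` (above the root if `target` has no parent).

    `parent` maps each non-root node of a rooted binary tree to its parent.
    A new child -> parent dictionary is returned; the input is not modified.
    """
    if v not in parent:
        raise ValueError("cannot prune the root")
    # The regraft point must lie outside the pruned subtree.
    node = target
    while node is not None:
        if node == v:
            raise ValueError("target lies inside the pruned subtree")
        node = parent.get(node)
    p = parent[v]
    if target == p:
        raise ValueError("target is the node suppressed by pruning")

    new = dict(parent)
    sibling = next(c for c, q in parent.items() if q == p and c != v)

    # Prune: detach v and suppress its former parent p.
    del new[v]
    grandparent = new.pop(p, None)
    if grandparent is None:
        del new[sibling]  # p was the root, so the sibling becomes the root
    else:
        new[sibling] = grandparent

    # Regraft: reuse p to subdivide the edge entering `target`.
    if target in new:
        new[p] = new[target]
    new[target] = p
    new[v] = p
    return new


# Hypothetical tree ((a,b),(c,d)) with internal nodes x, y and root r.
tree = {"a": "x", "b": "x", "c": "y", "d": "y", "x": "r", "y": "r"}

# Moving leaf a next to leaf c gives (b,((a,c),d)).
print(rspr(tree, "a", "c"))
```

Reusing the suppressed node `p` as the new attachment point keeps the node set unchanged, which makes trees before and after a move easy to compare.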

12.
Hybridization is a well-documented, natural phenomenon that is common at low taxonomic levels in the higher plants and other groups. In spite of the obvious potential for gene flow via hybridization to cause reticulation in an evolutionary tree, analytical methods based on a strictly bifurcating model of evolution have frequently been applied to data sets containing taxa known to hybridize in nature. Using simulated data, we evaluated the relative performance of phenetic, tree-based, and network approaches for distinguishing between taxa with known reticulate history and taxa that were true terminal monophyletic groups. In all methods examined, type I error (the erroneous rejection of the null hypothesis that a taxon of interest is not monophyletic) was likely during the early stages of introgressive hybridization. We used the gradual erosion of type I error with continued gene flow as a metric for assessing relative performance. Bifurcating tree-based methods performed poorly, with highly supported, incorrect topologies appearing during some phases of the simulation. Based on our model, we estimate that many thousands of gene flow events may be required in natural systems before reticulate taxa will be reliably detected using tree-based methods of phylogeny reconstruction. We conclude that the use of standard bifurcating tree-based methods to identify terminal monophyletic groups for the purposes of defining or delimiting phylogenetic species, or for prioritizing populations for conservation purposes, is difficult to justify when gene flow between sampled taxa is possible. As an alternative, we explored the use of two network methods. Minimum spanning networks performed worse than most tree-based methods and did not yield topologies that were easily interpretable as phylogenies. The performance of NeighborNet was comparable to parsimony bootstrap analysis. 
NeighborNet and reverse successive weighting were capable of identifying an ephemeral signature of reticulate evolution during the early stages of introgression by revealing conflicting phylogenetic signal. However, when gene flow was topologically complex, the conflicting phylogenetic signal revealed by these methods resulted in a high probability of type II error (inferring that a monophyletic taxon has a reticulate history). Lastly, we present a novel application of an existing nonparametric clustering procedure that, when used against a density landscape derived from principal coordinate data, showed superior performance to the tree-based and network procedures tested.

13.
The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, could be reticulate due to the occurrence of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, thus incorporating reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. As phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under MSNC. We implemented the methods in the publicly available, open-source software package PhyloNet and studied their performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation.

14.
Application of phylogenetic networks in evolutionary studies   (Total citations: 42; self-citations: 0; citations by others: 42)
The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees.

15.
Yu Y  Degnan JH  Nakhleh L 《PLoS genetics》2012,8(4):e1002660
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.

16.
Phylogenetic networks generalise phylogenetic (evolutionary) trees by allowing for the representation of reticulation (non-treelike) events. The structure of such networks is often viewed by the phylogenetic trees they embed. In this paper, we determine when a phylogenetic network \({\mathcal {N}}\) has two phylogenetic tree embeddings which collectively contain all of the edges of \({\mathcal {N}}\). This determination leads to a polynomial-time algorithm for recognising such networks and an unexpected characterisation of the class of reticulation-visible networks.

17.
Rooted phylogenetic networks are used to model non-treelike evolutionary histories. Such networks are often constructed by combining trees, clusters, triplets or characters into a single network that in some well-defined sense simultaneously represents them all. We review these four models and investigate how they are related. Motivated by the parsimony principle, one often aims to construct a network that contains as few reticulations (non-treelike evolutionary events) as possible. In general, the model chosen influences the minimum number of reticulation events required. However, when one obtains the input data from two binary (i.e. fully resolved) trees, we show that the minimum number of reticulations is independent of the model. The number of reticulations necessary to represent the trees, triplets, clusters (in the softwired sense) and characters (with unrestricted multiple crossover recombination) are all equal. Furthermore, we show that these results also hold when not the number of reticulations but the level of the constructed network is minimised. We use these unification results to settle several computational complexity questions that have been open in the field for some time. We also give explicit examples to show that already for data obtained from three binary trees the models begin to diverge.

18.

Background and Aims

Here evidence for reticulation in the pantropical orchid genus Polystachya is presented, using gene trees from five nuclear and plastid DNA data sets, first among only diploid samples (homoploid hybridization) and then with the inclusion of cloned tetraploid sequences (allopolyploids). Two groups of tetraploids are compared with respect to their origins and phylogenetic relationships.

Methods

Sequences from plastid regions, three low-copy nuclear genes and ITS nuclear ribosomal DNA were analysed for 56 diploid and 17 tetraploid accessions using maximum parsimony and Bayesian inference. Reticulation was inferred from incongruence between gene trees using supernetwork and consensus network analyses and from cloning and sequencing duplicated loci in tetraploids.

Key Results

Diploid trees from individual loci showed considerable incongruity but little reticulation signal when support from more than one gene tree was required to infer reticulation. This was coupled with generally low support in the individual gene trees. Sequencing the duplicated gene copies in tetraploids showed clearer evidence of hybrid evolution, including multiple origins of one group of tetraploids included in the study.

Conclusions

A combination of cloning duplicate gene copies in allotetraploids and consensus network comparison of gene trees allowed a phylogenetic framework for reticulation in Polystachya to be built. There was little evidence for homoploid hybridization, but our knowledge of the origins and relationships of three groups of allotetraploids is greatly improved by this study. One group showed evidence of multiple long-distance dispersals to achieve a pantropical distribution; another showed no evidence of multiple origins or long-distance dispersal but had greater morphological variation, consistent with hybridization between more distantly related parents.

19.
Pedigrees illustrate the genealogical relationships among individuals, and phylogenies do the same for groups of organisms (such as species, genera, etc.). Here, I provide a brief survey of current concepts and methods for calculating and displaying genealogical relationships. These relationships have long been recognized to be reticulating, rather than strictly divergent, and so both pedigrees and phylogenies are correctly treated as networks rather than trees. However, currently most pedigrees are instead presented as “family trees”, and most phylogenies are presented as phylogenetic trees. Nevertheless, the historical development of concepts shows that networks pre-dated trees in most fields of biology, including the study of pedigrees, biology theory, and biology practice, as well as in historical linguistics in the social sciences. Trees were actually introduced in order to provide a simpler conceptual model for historical relationships, since trees are a specific type of simple network. Computationally, trees and networks are a part of graph theory, consisting of nodes connected by edges. In this mathematical context they differ solely in the absence or presence of reticulation nodes, respectively. There are two types of graphs that can be called phylogenetic networks: (1) rooted evolutionary networks, and (2) unrooted affinity networks. There are quite a few computational methods for unrooted networks, which have two main roles in phylogenetics: (a) they act as a generic form of multivariate data display; and (b) they are used specifically to represent haplotype networks. Evolutionary networks are more difficult to infer and analyse, as there is no mathematical algorithm for reconstructing unique historical events. There is thus currently no coherent analytical framework for computing such networks.
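
The graph-theoretic distinction drawn in this abstract, that rooted trees and rooted networks differ only in the absence or presence of reticulation nodes, can be illustrated with a short Python sketch. It assumes a rooted graph stored as a list of directed (parent, child) edges and treats any node with two or more parents as a reticulation node. The function name and the example data are hypothetical.

```python
from collections import Counter


def reticulation_nodes(edges):
    """Return the sorted nodes that have two or more incoming edges."""
    indegree = Counter(child for _, child in edges)
    return sorted(node for node, degree in indegree.items() if degree >= 2)


# A rooted tree: every node has at most one parent.
tree_edges = [("r", "u"), ("r", "c"), ("u", "a"), ("u", "b")]

# A rooted network: node h has the two parents u and w.
network_edges = [
    ("r", "u"), ("r", "w"),
    ("u", "a"), ("u", "h"),
    ("w", "h"), ("w", "c"),
    ("h", "b"),
]

print(reticulation_nodes(tree_edges))
print(reticulation_nodes(network_edges))
```

An empty result means the rooted graph is tree-like in this sense. The sketch applies to rooted evolutionary networks only, since unrooted affinity networks have no parent and child direction to count.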

20.
Polyploidy, the duplication of entire genomes, plays a major role in plant evolution. In allopolyploids, genome duplication is associated with hybridization between two or more divergent genomes. Successive hybridization and polyploidization events can build up species complexes of allopolyploids with complicated network-like histories, and the evolutionary history of many plant groups cannot be adequately represented by phylogenetic trees because of such reticulate events. The history of complex genome mergings within a high-polyploid species complex in the genus Cerastium (Caryophyllaceae) is here untangled by the use of a network algorithm and noncoding sequences of a low-copy number gene. The resulting network illustrates how hybridization and polyploidization have acted as key evolutionary processes in creating a plant group where high-level allopolyploids clearly outnumber extant parental genomes.
