Similar literature
1.
In recent studies, phylogenetic networks have been derived from so-called multilabeled trees in order to understand the origins of certain polyploids. Although the trees used in these studies were constructed using sophisticated techniques in phylogenetic analysis, the presented networks were inferred using ad hoc arguments that cannot be easily extended to larger, more complicated examples. In this paper, we present a general method for constructing such networks, which takes as input a multilabeled phylogenetic tree and outputs a phylogenetic network with certain desirable properties. To illustrate the applicability of our method, we discuss its use in reconstructing the evolutionary history of plant allopolyploids. We conclude with a discussion concerning possible future directions. The network construction method has been implemented and is freely available for use from http://www.uea.ac.uk/~a043878/padre.html.

2.
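Xu Liye (徐立业), Li Yu (李玉). 《生物信息学》, 2007, 5(4): 160-162
For a given set of DNA or protein sequences, the binary phylogenetic tree built by the UPGMA algorithm may not be unique: its topology can depend on the order in which the sequences are input, a phenomenon usually called "tied trees". We propose an improved variant of UPGMA, the unweighted arithmetic-mean group method (UMGMA), to resolve this non-uniqueness. When the UPGMA tree is unique, the method produces the same tree as UPGMA; when it is not, the method produces a unique multifurcating tree that is independent of the input order. The algorithm also has an adjustable tolerance parameter that controls the main branching structure of the resulting tree, which is valuable for bringing out the overall outline of large-scale phylogenetic trees.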
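
The idea of removing ties by merging all tied clusters at once can be illustrated with a small sketch. This is not the authors' UMGMA implementation, which the abstract does not spell out; it is a hypothetical tie-tolerant variant of average-linkage clustering. The function name `tie_tolerant_upgma`, the `tol` parameter, and the nested-tuple tree format are all assumptions made for illustration.

```python
from itertools import combinations


def tie_tolerant_upgma(names, dist, tol=0.0):
    """Average-linkage clustering that merges every tied group in one step.

    dist maps frozenset({a, b}) -> distance between leaves a and b.
    Returns a leaf name or a nested tuple of subtrees (hypothetical format).
    Assumption: this is an illustrative sketch, not the published UMGMA.
    """
    # Map each cluster (a frozenset of leaves) to its subtree.
    clusters = {frozenset([n]): n for n in names}

    def avg(c1, c2):
        # UPGMA distance: mean over all leaf pairs between the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = {frozenset((c1, c2)): avg(c1, c2)
                 for c1, c2 in combinations(clusters, 2)}
        d_min = min(pairs.values())

        # Union-find: link all cluster pairs whose distance is within tol of d_min.
        parent = {c: c for c in clusters}

        def find(c):
            while parent[c] != c:
                c = parent[c]
            return c

        for pair, d in pairs.items():
            if d <= d_min + tol:
                c1, c2 = pair
                parent[find(c1)] = find(c2)

        groups = {}
        for c in clusters:
            groups.setdefault(find(c), []).append(c)

        merged_clusters = {}
        for members in groups.values():
            if len(members) == 1:
                merged_clusters[members[0]] = clusters[members[0]]
            else:
                leaves = frozenset().union(*members)
                # Sort children so the result does not depend on input order.
                subtree = tuple(sorted((clusters[m] for m in members), key=repr))
                merged_clusters[leaves] = subtree
        clusters = merged_clusters

    return next(iter(clusters.values()))
```

With three mutually equidistant sequences, plain UPGMA would have to pick one of three binary trees depending on input order, whereas this sketch returns a single three-way node. Raising `tol` collapses near-ties as well, which loosely mirrors the role of the tolerance parameter described above. Exact floating-point ties can be fragile in practice, so a small positive `tol` may be preferable to zero.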

3.
The increasing availability of large genomic data sets as well as the advent of Bayesian phylogenetics facilitates the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered as a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low-dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group-specific consensus phylogenies. treespace also provides a user-friendly web interface for interactive data analysis and is integrated alongside existing standards for phylogenetics. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results.

4.
Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals, each with many genes, splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for five or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.

5.
Phylogenetic networks are a generalization of phylogenetic trees that are used to represent non-tree-like evolutionary histories that arise in organisms such as plants and bacteria, or uncertainty in evolutionary histories. An unrooted phylogenetic network on a non-empty, finite set X of taxa, or network, is a connected, simple graph in which every vertex has degree 1 or 3 and whose leaf set is X. It is called a phylogenetic tree if the underlying graph is a tree. In this paper we consider properties of tree-based networks, that is, networks that can be constructed by adding edges into a phylogenetic tree. We show that although they have some properties in common with their rooted analogues which have recently drawn much attention in the literature, they have some striking differences in terms of both their structural and computational properties. We expect that our results could eventually have applications to, for example, detecting horizontal gene transfer or hybridization which are important factors in the evolution of many organisms.
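
The definition in this abstract (a connected, simple graph in which every vertex has degree 1 or 3 and whose leaf set is X) is concrete enough to check mechanically. The following is a minimal sketch of such a check. The function names and the edge-list input format are assumptions for illustration and do not come from the paper.

```python
def is_unrooted_network(edges, taxa):
    """Check the abstract's definition on an undirected edge list.

    edges: iterable of (u, v) pairs; taxa: the set X of leaf labels.
    Hypothetical helper for illustration only.
    """
    adj = {}
    seen = set()
    for u, v in edges:
        e = frozenset((u, v))
        if u == v or e in seen:
            return False  # a loop or a parallel edge: the graph is not simple
        seen.add(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return False
    # Every vertex must have degree 1 (leaf) or 3 (internal vertex).
    if any(len(nbrs) not in (1, 3) for nbrs in adj.values()):
        return False
    leaves = {v for v, nbrs in adj.items() if len(nbrs) == 1}
    if leaves != set(taxa):
        return False
    # Connectivity check by depth-first search from an arbitrary vertex.
    start = next(iter(adj))
    stack, visited = [start], {start}
    while stack:
        for w in adj[stack.pop()]:
            if w not in visited:
                visited.add(w)
                stack.append(w)
    return len(visited) == len(adj)


def is_phylogenetic_tree(edges, taxa):
    """A network is a tree when the connected graph has |V| - 1 edges."""
    edges = list(edges)
    vertices = {v for e in edges for v in e}
    return is_unrooted_network(edges, taxa) and len(edges) == len(vertices) - 1
```

For example, a triangle of internal vertices with one leaf attached to each vertex passes `is_unrooted_network` but fails `is_phylogenetic_tree`, because it contains a cycle.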

6.
Given a collection of rooted phylogenetic trees with overlapping sets of leaves, a compatible supertree $S$ is a single tree whose set of leaves is the union of the input sets of leaves and such that $S$ agrees with each input tree when restricted to the leaves of the input tree. Typically with trees from real data, no compatible supertree exists, and various methods may be utilized to reconcile the incompatibilities in the input trees. This paper focuses on a measure of robustness of a supertree method called its "radius" $R$. The larger the value of $R$ is, the further the data set can be from a natural correct tree $T$ and yet the method will still output $T$. It is shown that the maximal possible radius for a method is $R = 1/2$. Many familiar methods, both for supertrees and consensus trees, are shown to have $R = 0$, indicating that they need not output a tree $T$ that would seem to be the natural correct answer. A polynomial-time method Normalized Triplet Supertree (NTS) with the maximal possible $R = 1/2$ is defined. A geometric interpretation is given, and NTS is shown to solve an optimization problem. Additional properties of NTS are described.

7.
Martin FN, Tooley PW. Mycologia, 2003, 95(2): 269-284
The phylogenetic relationships of 51 isolates representing 27 species of Phytophthora were assessed by sequence alignment of 568 bp of the mitochondrially encoded cytochrome oxidase II gene. A total of 1299 bp of the cytochrome oxidase I gene also were examined for a subset of 13 species. The cox II gene trees constructed by a heuristic search, based on maximum parsimony for a bootstrap 50% majority-rule consensus tree, revealed 18 species grouping into seven clades and nine species unaffiliated with a specific clade. The phylogenetic relationships among species observed on cox II gene trees did not exhibit consistent similarities in groupings for morphology, pathogenicity, host range or temperature optima. The topology of cox I gene trees, constructed by a heuristic search based on maximum parsimony for a bootstrap 50% majority-rule consensus tree for 13 species of Phytophthora, revealed 10 species grouping into three clades and three species unaffiliated with a specific clade. The groupings in general agreed with what was observed in the cox II tree. Species relationships observed for the cox II gene tree were in agreement with those based on ITS regions, with several notable exceptions. Some of these differences were noted in species in which the same isolates were used for both ITS and cox II analysis, suggesting either a differential rate of evolutionary divergence for these two regions or incorrect assumptions about alignment of ITS sequences. Analysis of combined data sets of ITS and cox II sequences generated a tree that did not differ substantially from analysis of ITS data alone; however, the results of a partition homogeneity test suggest that combining data sets may not be valid.

8.
In this study, phylogenetic trees based on two-component system (TCS) gene profiles and on metabolic network gene profiles were constructed and compared to each other to evaluate the evolutionary relationship between the bacterial sensing system and metabolism. The gene profiles of these systems suggested that bacteria employed different evolutionary strategies to optimize the two-component system and the metabolic network. In addition, comparative analysis revealed that the TCS-based tree showed better family grouping than the metabolic-network-based tree, which indicated that the TCS and the metabolic network have been modified via self-evolution and recruitment methods, respectively.

9.
The reconstruction of bacterial evolutionary relationships has proven to be a daunting task because variable mutation rates and horizontal gene transfer (HGT) among species can cause grave incongruities between phylogenetic trees based on single genes. Recently, a highly robust phylogenetic tree was constructed for 13 gamma-proteobacteria using the combined alignments of 205 conserved orthologous proteins. Only two proteins had incongruent tree topologies, which were attributed to HGT between Pseudomonas species and Vibrio cholerae or enterics. While the evolutionary relationships among these species appear to be resolved, further analysis suggests that HGT events with other bacterial partners likely occurred; this alters the implicit assumption of gamma-proteobacteria monophyly. Thus, any thorough reconstruction of bacterial evolution must not only choose a suitable set of molecular markers but also strive to reduce potential bias in the selection of species.

10.
Phylogenetic inference based on matrix representation of trees. (Cited 14 times in total: 0 self-citations, 14 citations by others)
Rooted phylogenetic trees can be represented as matrices in which the rows correspond to termini, and columns correspond to internal nodes (elements of the n-tree). Parsimony analysis of such a matrix will fully recover the topology of the original tree. The maximum size of the represented matrix depends only on the number of termini in the tree; for a tree derived from molecular sequences, the represented matrix may be orders of magnitude smaller than the original data matrix. Representations of multiple trees (which may or may not have identical termini) can readily be combined into a single matrix; columns of discrete-character-state data can be added and, if desired, weighted differentially. Parsimony analysis of the resulting composite matrix yields a hybrid supertree which typically provides greater resolution than conventional consensus trees. Use of this method is illustrated with examples involving multiple tRNA genes in organelles and multiple protein-coding genes in eukaryotes.
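
The matrix coding described in this abstract can be sketched in a few lines. The example below assumes rooted trees given as nested tuples of leaf names, omits the root column because it would be 1 for every terminus, and codes termini absent from a tree as "?". The nested-tuple input, the function names, and the "?" convention are assumptions for illustration; the abstract itself does not state how missing termini are coded.

```python
def clades(tree):
    """Return (leaf set of the tree, leaf sets of its non-root internal nodes)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, cols = frozenset(), []
    for child in tree:
        child_leaves, child_cols = clades(child)
        leaves |= child_leaves
        cols += child_cols
        if not isinstance(child, str):
            cols.append(child_leaves)  # one column per internal node
    return leaves, cols


def matrix_representation(trees):
    """Build a combined matrix: rows are termini, columns are internal nodes.

    Entries: "1" if the terminus descends from the node, "0" if it is in the
    same tree but outside the node, "?" if it is absent from that tree
    (the "?" coding is an assumption of this sketch).
    """
    per_tree = [clades(t) for t in trees]
    taxa = sorted(set().union(*(leaves for leaves, _ in per_tree)))
    matrix = {t: "" for t in taxa}
    for leaves, cols in per_tree:
        for clade in cols:
            for t in taxa:
                if t in clade:
                    matrix[t] += "1"
                elif t in leaves:
                    matrix[t] += "0"
                else:
                    matrix[t] += "?"
    return matrix
```

Calling `matrix_representation([(("A", "B"), "C"), (("B", "D"), "C")])` combines two trees with different termini into one matrix with a row per terminus and one column per internal node. A parsimony program would then be run on that matrix; the parsimony search itself is not shown here.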

11.
Maximum likelihood supertrees (Cited 2 times in total: 0 self-citations, 2 citations by others)

12.
Recently, much attention has been devoted to the construction of phylogenetic networks which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here, we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks (a type of network slightly more general than a phylogenetic tree) from triplets. Our algorithm has been made publicly available as the program LEV1ATHAN. It combines ideas from several known theoretical algorithms for phylogenetic tree and network reconstruction with two novel subroutines, namely an exponential-time exact algorithm and a greedy algorithm, both of which are of independent theoretical interest. Most importantly, LEV1ATHAN runs in polynomial time and always constructs a level-1 network. If the data are consistent with a phylogenetic tree, then the algorithm constructs such a tree. Moreover, if the input triplet set is dense and, in addition, is fully consistent with some level-1 network, it will find such a network. The potential of LEV1ATHAN is explored by means of an extensive simulation study and a biological data set. One of our conclusions is that LEV1ATHAN is able to construct networks consistent with a high percentage of input triplets, even when these input triplets are affected by a low to moderate level of noise.

13.
Two different methods of using paralogous genes for phylogenetic inference have been proposed: reconciled trees (or gene tree parsimony) and uninode coding. Gene tree parsimony suffers from 10 serious problems, including differential weighting of nucleotide and gap characters, undersampling which can be misinterpreted as synapomorphy, all of the characters not being allowed to interact, and conflict between gene trees being given equal weight, regardless of branch support. These problems are largely avoided by using uninode coding. The uninode coding method is elaborated to address multiple gene duplications within a single gene tree family and handle problems caused by lack of gene tree resolution. An example of vertebrate phylogeny inferred from nine genes is reanalyzed using uninode coding. We suggest that uninode coding be used instead of gene tree parsimony for phylogenetic inference from paralogous genes.

14.
Nowadays, there are many phylogeny reconstruction methods, each with advantages and disadvantages. We explored the advantages of each method, putting together the common parts of trees constructed by several methods, by means of a consensus computation. A number of phylogenetic consensus methods are already known. Unfortunately, there is also a taboo concerning consensus methods, because most biologists see them mainly as comparators and not as phylogenetic tree constructors. We challenged this taboo by defining a consensus method that builds a fully resolved phylogenetic tree based on the most common parts of fully resolved trees in a given collection. We also generated results showing that this consensus is in a way a kind of "median" of the input trees; as such it can be closer to the correct tree in many situations.

15.
Discrepancies in phylogenetic trees of bacteria and archaea are often explained as lateral gene transfer events. However, such discrepancies may also be due to phylogenetic artifacts or orthology assignment problems. A first step that may help to resolve this dilemma is to estimate the extent of phylogenetic inconsistencies in trees of prokaryotes in comparison with those of higher eukaryotes, where no lateral gene transfer is expected. To test this, we used 21 proteomes each of eukaryotes (mainly opisthokonts), proteobacteria, and archaea that spanned equivalent levels of genetic divergence. In each domain of life, we defined a set of putative orthologous sequences using a phylogenetic-based orthology protocol and, as a reference topology, we used a tree constructed with concatenated genes of each domain. Our results show, for most of the tests performed, that the magnitude of topological inconsistencies with respect to the reference tree was very similar in the trees of proteobacteria and eukaryotes. When clade support was taken into account, prokaryotes showed some more inconsistencies, but then all values were very low. Discrepancies were only consistently higher in archaea but, as shown by simulation analysis, this is likely due to the particular tree of the archaeal species used here being more difficult to reconstruct, whereas the trees of proteobacteria and eukaryotes were of similar difficulty. Although these results are based on a relatively small number of genes, it seems that phylogenetic reconstruction problems, including orthology assignment problems, have a similar overall effect over prokaryotic and eukaryotic trees based on single genes. Consequently, lateral gene transfer between distant prokaryotic species may have been more rare than previously thought, which opens the way to obtain the tree of life of bacterial and archaeal species using genomic data and the concatenation of adequate genes, in the same way as it is usually done in eukaryotes.  

16.
Toward the goal of recovering the phylogenetic relationships among elapid snakes, we separately found the shortest trees from the amino acid sequences for the venom proteins phospholipase A2 and the short neurotoxin, collectively representing 32 species in 16 genera. We then applied a method we term gene tree parsimony for inferring species trees from gene trees that works by finding the species tree which minimizes the number of deep coalescences or gene duplications plus unsampled sequences necessary to fit each gene tree to the species tree. This procedure, which is both logical and generally applicable, avoids many of the problems of previous approaches for inferring species trees from gene trees. The results support a division of the elapids examined into sister groups of the Australian and marine (laticaudines and hydrophiines) species, and the African and Asian species. Within the former clade, the sea snakes are shown to be diphyletic, with the laticaudines and hydrophiines having separate origins. This finding is corroborated by previous studies, which provide support for the usefulness of gene tree parsimony.

17.
Numerous simulation studies have investigated the accuracy of phylogenetic inference of gene trees under maximum parsimony, maximum likelihood, and Bayesian techniques. The relative accuracy of species tree inference methods under simulation has received less study. The number of analytical techniques available for inferring species trees is increasing rapidly, and in this paper, we compare the performance of several species tree inference techniques at estimating recent species divergences using computer simulation. Simulating gene trees within species trees of different shapes and with varying tree lengths (T) and population sizes, and evolving sequences on those gene trees, allows us to determine how phylogenetic accuracy changes in relation to different levels of deep coalescence and phylogenetic signal. When the probability of discordance between the gene trees and the species tree is high (i.e., T is small and/or the population size is large), Bayesian species tree inference using the multispecies coalescent (BEST) outperforms other methods. The performance of all methods improves as the total length of the species tree is increased, which reflects the combined benefits of decreasing the probability of discordance between species trees and gene trees and gaining more accurate estimates for gene trees. Decreasing the probability of deep coalescences by reducing the population size also leads to accuracy gains for most methods. Increasing the number of loci from 10 to 100 improves accuracy under difficult demographic scenarios (i.e., coalescent units ≤ 4N_e), but 10 loci are adequate for estimating the correct species tree in cases where deep coalescence is limited or absent. In general, the correlation between the phylogenetic accuracy and the posterior probability values obtained from BEST is high, although posterior probabilities are overestimated when the prior distribution for the population size is misspecified.

18.
Most plant phylogenetic inference has used DNA sequence data from the plastid genome. This genome represents a single genealogical sample with no recombination among genes, potentially limiting the resolution of evolutionary relationships in some contexts. In contrast, nuclear DNA is inherently more difficult to employ for phylogeny reconstruction because major mutational events in the genome, including polyploidization, gene duplication, and gene extinction can result in homologous gene copies that are difficult to identify as orthologs or paralogs. Gene tree parsimony (GTP) can be used to infer the rooted species tree by fitting gene genealogies to species trees while simultaneously minimizing the estimated number of duplications needed to reconcile conflicts among them. Here, we use GTP for five nuclear gene families and a previously published plastid data set to reconstruct the phylogenetic backbone of the aquatic plant family Pontederiaceae. Plastid-based phylogenetic studies strongly supported extensive paraphyly of Eichhornia (one of the four major genera) but also depicted considerable ambiguity concerning the true root placement for the family. Our results indicate that species trees inferred from the nuclear genes (alone and in combination with the plastid data) are highly congruent with gene trees inferred from plastid data alone. Consideration of optimal and suboptimal gene tree reconciliations place the root of the family at (or near) a branch leading to the rare and locally restricted E. meyeri. We also explore methods to incorporate uncertainty in individual gene trees during reconciliation by considering their individual bootstrap profiles and relate inferred excesses of gene duplication events on individual branches to whole-genome duplication events inferred for the same branches. Our study improves understanding of the phylogenetic history of Pontederiaceae and also demonstrates the utility of GTP for phylogenetic analysis.

19.
Conserved genes have found their way into the mainstream of molecular systematics. Many of these genes are members of multigene families. A difficulty with using single genes of multigene families for phylogenetic inference is that genes from one species may be paralogous to those from another taxon. We focus attention on this problem using heat shock 70 (HSP70) genes. Using polymerase chain reaction techniques with genomic DNA, we isolated and sequenced 123 distinct sequences from 12 species of sharks. Phylogenetic analysis indicated that the sequences cluster with constitutively expressed cytoplasmic heat shock-like genes. Three highly divergent gene clades were sampled. A number of similar sequences were sampled from each species within each distinct gene clade. Comparison of published species trees with an HSP70 gene tree inferred using Bayesian phylogenetic analysis revealed several cases of gene duplication and differential sorting of gene lineages within this group of sharks. Gene tree parsimony based on the objective criteria of duplication and losses showed that previously published hypotheses of species relationships and two novel hypotheses based on Bayesian phylogenetics were concordant with the history of HSP70 gene duplication and loss. By contrast, two published hypotheses based on morphological data were not significantly different from the null hypothesis of a random association between species relatedness and the HSP70 gene tree. These results suggest that gene tree parsimony using data from multigene families can be used for inferring species relationships or testing published alternative hypotheses. More importantly, the results suggest that systematic studies relying on phylogenetic inferences from HSP70 genes may be plagued by unrecognized paralogy of sampled genes. Our results underscore the distinction between gene and species trees and highlight an underappreciated source of discordance between gene trees and organismal phylogeny, i.e., unrecognized paralogy of sampled genes.

20.
An important problem in phylogenetics is the construction of phylogenetic trees. One way to approach this problem, known as the supertree method, involves inferring a phylogenetic tree with leaves consisting of a set X of species from a collection of trees, each having leaf-set some subset of X. In the 1980s, Colonius and Schulze gave certain inference rules for deciding when a collection of 4-leaved trees, one for each 4-element subset of X, can be simultaneously displayed by a single supertree with leaf-set X. Recently, it has become of interest to extend this and related results to phylogenetic networks. These are a generalization of phylogenetic trees which can be used to represent reticulate evolution (where species can come together to form a new species). It has recently been shown that a certain type of phylogenetic network, called an (unrooted) level-1 network, can essentially be constructed from 4-leaved trees. However, the problem of providing appropriate inference rules for such networks remains unresolved. Here, we show that by considering 4-leaved networks, called quarnets, as opposed to 4-leaved trees, it is possible to provide such rules. In particular, we show that these rules can be used to characterize when a collection of quarnets, one for each 4-element subset of X, can all be simultaneously displayed by a level-1 network with leaf-set X. The rules are an intriguing mixture of tree inference rules, and an inference rule for building up a cyclic ordering of X from orderings on subsets of X of size 4. This opens up several new directions of research for inferring phylogenetic networks from smaller ones, which could yield new algorithms for solving the supernetwork problem in phylogenetics.
