1.

Estimating species trees from unrooted gene trees

Liu L Yu L 《Systematic biology》2011,60(5):661-667

In this study, we develop a distance method for inferring unrooted species trees from a collection of unrooted gene trees. The species tree is estimated by the neighbor joining (NJ) tree built from a distance matrix in which the distance between two species is defined as the average number of internodes between two species across gene trees, that is, average gene-tree internode distance. The distance method is named NJ(st) to distinguish it from the original NJ method. Under the coalescent model, we show that if gene trees are known or estimated correctly, the NJ(st) method is statistically consistent in estimating unrooted species trees. The simulation results suggest that NJ(st) and STAR (another coalescence-based method for inferring species trees) perform almost equally well in estimating topologies of species trees, whereas the Bayesian coalescence-based method, BEST, outperforms both NJ(st) and STAR. Unlike BEST and STAR, the NJ(st) method can take unrooted gene trees to infer species trees without using an outgroup. In addition, the NJ(st) method can handle missing data and is thus useful in phylogenomic studies in which data sets often contain missing loci for some individuals.
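The distance matrix described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes each unrooted gene tree is given as an edge list plus a set of leaf names (a hypothetical input format), that "internodes" means the internal nodes on the path between two leaves, and that a pair of species is averaged only over the gene trees containing both, which is one plausible way to accommodate missing loci. The function names are invented for the example, and the neighbor-joining step itself is not shown.

```python
from collections import deque


def internode_distances(edges, leaves):
    """Count internal nodes on the path between each pair of leaves in one tree."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    distances = {}
    for source in leaves:
        # Breadth-first search gives the number of edges from source to every node.
        depth = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in depth:
                    depth[neighbor] = depth[node] + 1
                    queue.append(neighbor)
        for target in leaves:
            if source < target:
                # Internal nodes on the path = edges on the path minus one.
                distances[(source, target)] = depth[target] - 1
    return distances


def average_internode_distance(gene_trees):
    """Average each pair's distance over the gene trees that contain both species.

    Assumption: skipping trees that lack a species is how missing loci are handled here.
    """
    totals, counts = {}, {}
    for edges, leaves in gene_trees:
        for pair, d in internode_distances(edges, sorted(leaves)).items():
            totals[pair] = totals.get(pair, 0) + d
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


# Two quartet gene trees: AB|CD and AC|BD ("x" and "y" are internal nodes).
quartet_ab = ([("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")], {"A", "B", "C", "D"})
quartet_ac = ([("A", "x"), ("C", "x"), ("x", "y"), ("B", "y"), ("D", "y")], {"A", "B", "C", "D"})
matrix = average_internode_distance([quartet_ab, quartet_ac])
print(matrix[("A", "B")])  # 1.5
```

The resulting dictionary plays the role of the distance matrix; any standard neighbor-joining routine could then be applied to it.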

2.

Clanistics: a multi-level perspective for harvesting unrooted gene trees

François-Joseph Lapointe Philippe Lopez Yan Boucher Jeremy Koenig Eric Bapteste 《Trends in microbiology》2010,18(8):341-347

3.

Linear-time algorithms for the multiple gene duplication problems

Luo CW Chen MC Chen YC Yang RW Liu HF Chao KM 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2011,8(1):260-265

A fundamental problem arising in evolutionary molecular biology is to discover the locations of gene duplications and multiple gene duplication episodes based on the phylogenetic information. The solutions to the MULTIPLE GENE DUPLICATION problems can provide useful clues to place the gene duplication events onto the locations of a species tree and to expose the multiple gene duplication episodes. In this paper, we study two variations of the MULTIPLE GENE DUPLICATION problems: the EPISODE-CLUSTERING (EC) problem and the MINIMUM EPISODES (ME) problem. For the EC problem, we improve the results of Burleigh et al. with an optimal linear-time algorithm. For the ME problem, on the basis of the algorithm presented by Bansal and Eulenstein, we propose an optimal linear-time algorithm.

4.

Of clades and clans: terms for phylogenetic relationships in unrooted trees

Wilkinson M McInerney JO Hirt RP Foster PG Embley TM 《Trends in ecology & evolution》2007,22(3):114-115

5.

Protein classification based on propagation of unrooted binary trees

Kocsor A Busa-Fekete R Pongor S 《Protein and peptide letters》2008,15(5):428-434

We present two efficient network propagation algorithms that operate on a binary tree, i.e., a sparse-edged substitute of an entire similarity network. TreeProp-N is based on passing increments between nodes while TreeProp-E employs propagation to the edges of the tree. Both algorithms improve protein classification efficiency.

6.

Identifying the rooted species tree from the distribution of unrooted gene trees under the coalescent

Elizabeth S. Allman James H. Degnan John A. Rhodes 《Journal of mathematical biology》2011,62(6):833-862

Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals—each with many genes—splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for 5 or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.

7.

On counting tandem duplication trees

Yang J Zhang L 《Molecular biology and evolution》2004,21(6):1160-1163

8.

Topological rearrangements and local search method for tandem duplication trees

Bertrand D Gascuel O 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2005,2(1):15-28

The problem of reconstructing the duplication history of a set of tandemly repeated sequences was first introduced by Fitch (1977). Many recent studies deal with this problem, showing the validity of the unequal recombination model proposed by Fitch, describing numerous inference algorithms, and exploring the combinatorial properties of these new mathematical objects, which are duplication trees. In this paper, we deal with the topological rearrangement of these trees. Classical rearrangements used in phylogeny (NNI, SPR, TBR, ...) cannot be applied directly on duplication trees. We show that restricting the neighborhood defined by the SPR (Subtree Pruning and Regrafting) rearrangement to valid duplication trees allows exploring the whole duplication tree space. We use these restricted rearrangements in a local search method which improves an initial tree via successive rearrangements. This method is applied to the optimization of parsimony and minimum evolution criteria. We show through simulations that this method improves on all existing programs for both reconstructing the topology of the true tree and recovering its duplication events. We apply this approach to tandemly repeated human zinc finger genes and observe that a much better duplication tree is obtained by our method than using any other program.

9.

Evidence for gene duplication in collagen

A.D McLachlan 《Journal of molecular biology》1976,107(2):159-174

10.

Fast local search for unrooted Robinson-Foulds supertrees

Chaudhary R Burleigh JG Fernández-Baca D 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2012,9(4):1004-1013

A Robinson-Foulds (RF) supertree for a collection of input trees is a tree containing all the species in the input trees that is at minimum total RF distance to the input trees. Thus, an RF supertree is consistent with the maximum number of splits in the input trees. Constructing RF supertrees for rooted and unrooted data is NP-hard. Nevertheless, effective local search heuristics have been developed for the restricted case where the input trees and the supertree are rooted. We describe new heuristics, based on the Edge Contract and Refine (ECR) operation, that remove this restriction, thereby expanding the utility of RF supertrees. Our experimental results on simulated and empirical data sets show that our unrooted local search algorithms yield better supertrees than those obtained from MRP and rooted RF heuristics in terms of total RF distance to the input trees and, for simulated data, in terms of RF distance to the true tree.
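The RF distance that this abstract optimizes can be illustrated with a small sketch. It is not the authors' heuristic: it only computes the unrooted RF distance between two trees on the same leaf set, whereas the supertree setting would also require restricting the supertree to each input tree's leaves. Trees are assumed to be nested tuples of leaf names (an arbitrary rooting of the unrooted tree), and the function names are hypothetical.

```python
def splits(tree, all_leaves):
    """Return the non-trivial bipartitions of an unrooted tree given as nested tuples.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference leaf, so the result does not depend on where the tuple is rooted.
    """
    all_leaves = frozenset(all_leaves)
    reference = min(all_leaves)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(walk(child) for child in node))
        side = all_leaves - clade if reference in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    walk(tree)
    return found


def rf_distance(tree_a, tree_b, all_leaves):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(tree_a, all_leaves) ^ splits(tree_b, all_leaves))


leaves = {"A", "B", "C", "D", "E"}
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(tree_1, tree_2, leaves))  # 2
```

Under these assumptions, the total RF distance of a candidate supertree would be the sum of such distances over the input trees.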

11.

Coalescent histories for discordant gene trees and species trees

Noah A. Rosenberg James H. Degnan 《Theoretical population biology》2010,77(3):145-151

Given a gene tree and a species tree, a coalescent history is a list of the branches of the species tree on which coalescences in the gene tree take place. Each pair consisting of a gene tree topology and a species tree topology has some number of possible coalescent histories. Here we show that, for each n≥7, there exist a species tree topology S and a gene tree topology G≠S, both with n leaves, for which the number of coalescent histories exceeds the corresponding number of coalescent histories when the species tree topology is S and the gene tree topology is also S. This result has the interpretation that the gene tree topology G discordant with the species tree topology S can be produced by the evolutionary process in more ways than can the gene tree topology that matches the species tree topology, providing further insight into the surprising combinatorial properties of gene trees that arise from their joint consideration with species trees.

12.

URec: a system for unrooted reconciliation

Górecki P Tiuryn J 《Bioinformatics (Oxford, England)》2007,23(4):511-512

URec is software based on the concept of unrooted reconciliation. It can be used to reconcile a set of unrooted gene trees with a rooted species tree or a set of rooted species trees. Moreover, it computes a detailed distribution of gene duplications and gene losses in a species tree. It can be used to infer optimal species phylogenies for a given set of gene trees. URec is implemented in C++ and can be easily compiled under Unix and Windows systems. Availability: Software is freely available for download from our website at http://bioputer.mimuw.edu.pl/~gorecki/urec. This webpage also contains Windows executables and a number of advanced examples with explanations.

13.

Evolution by gene duplication

Charles J. Epstein 《American journal of human genetics》1971,23(5):541

14.

Simplifying gene trees for easier comprehension

Paul-Ludwig Lott Marvin Mundry Christoph Sassenberg Stefan Lorkowski Georg Fuellen 《BMC bioinformatics》2006,7(1):231-15

Background

In the genomic age, gene trees may contain large amounts of data making them hard to read and understand. Therefore, an automated simplification is important.

15.

Genomic duplication, fractionation and the origin of regulatory novelty

Langham RJ Walsh J Dunn M Ko C Goff SA Freeling M 《Genetics》2004,166(2):935-945

16.

An ILP solution for the gene duplication problem

Chang WC Burleigh GJ Fernández-Baca DF Eulenstein O 《BMC bioinformatics》2011,12(Z1):S14

Background

The gene duplication (GD) problem seeks a species tree that implies the fewest gene duplication events across a given collection of gene trees. Solving this problem makes it possible to use large gene families with complex histories of duplication and loss to infer phylogenetic trees. However, the GD problem is NP-hard, and therefore, most analyses use heuristics that lack any performance guarantee.

Results

We describe the first integer linear programming (ILP) formulation to solve instances of the gene duplication problem exactly. With simulations, we demonstrate that the ILP solution can solve problem instances with up to 14 taxa. Furthermore, we apply the new ILP solution to solve the gene duplication problem for the seed plant phylogeny using a 12-taxon, 6,084-gene data set. The unique, optimal solution, which places Gnetales sister to the conifers, represents a new, large-scale genomic perspective on one of the most puzzling questions in plant systematics.

Conclusions

Although the GD problem is NP-hard, our novel ILP solution for it can solve instances with data sets consisting of as many as 14 taxa and 1,000 genes in a few hours. These are the largest instances that have been solved to optimality to date. Thus, this work can provide large-scale genomic perspectives on phylogenetic questions that previously could only be addressed by heuristic estimates.
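The objective that the GD problem minimizes can be sketched for a fixed candidate species tree. This is not the ILP formulation from the paper; it assumes the standard least-common-ancestor (LCA) reconciliation, in which a gene tree node counts as a duplication when it maps to the same species tree node as one of its children. Rooted trees are assumed to be nested tuples whose leaves are species names, and all function names are hypothetical.

```python
def clades(tree):
    """Collect the leaf set (cluster) of every node in a rooted nested-tuple tree."""
    collected = []

    def walk(node):
        if isinstance(node, str):
            leaf_set = frozenset([node])
        else:
            leaf_set = frozenset().union(*(walk(child) for child in node))
        collected.append(leaf_set)
        return leaf_set

    walk(tree)
    return collected


def count_duplications(gene_tree, species_tree):
    """Count duplication nodes of one gene tree under LCA mapping (assumed model)."""
    species_clades = clades(species_tree)

    def lca(species_set):
        # The smallest species-tree cluster containing the set identifies the LCA node.
        return min((c for c in species_clades if species_set <= c), key=len)

    duplications = 0

    def walk(node):
        nonlocal duplications
        if isinstance(node, str):
            leaf_set = frozenset([node])
            return leaf_set, lca(leaf_set)
        children = [walk(child) for child in node]
        species_set = frozenset().union(*(child[0] for child in children))
        mapped = lca(species_set)
        # A node mapping to the same place as one of its children is a duplication.
        if any(mapped == child[1] for child in children):
            duplications += 1
        return species_set, mapped

    walk(gene_tree)
    return duplications


def total_duplications(gene_trees, species_tree):
    """Objective value of a candidate species tree: duplications summed over gene trees."""
    return sum(count_duplications(g, species_tree) for g in gene_trees)


species = (("A", "B"), "C")
genes = [(("A", "B"), "C"), (("A", "C"), "B")]
print(total_duplications(genes, species))  # 1
```

An exact or heuristic solver would search over candidate species trees for the one with the smallest such total.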

17.

Relationships between gene trees and species trees

Pamilo P; Nei M 《Molecular biology and evolution》1988,5(5):568-583

It is well known that a phylogenetic tree (gene tree) constructed from DNA sequences for a genetic locus does not necessarily agree with the tree that represents the actual evolutionary pathway of the species involved (species tree). One of the important factors that cause this difference is genetic polymorphism in the ancestral species. Under the assumption of neutral mutations, this problem can be studied by evaluating the probability (P) that a gene tree has the same topology as that of the species tree. When one gene (allele) is used from each of the species involved, the probability can be expressed as a simple function of T_i = t_i/(2N), where t_i is the evolutionary time measured in generations for the ith internodal branch of the species tree and N is the effective population size. When any of the T_i values is less than 1, the probability P becomes considerably less than 1.0. This probability cannot be substantially increased by increasing the number of alleles sampled from a locus. To increase the probability, one has to use DNA sequences from many different loci that have evolved independently of each other.
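The abstract does not spell out the function of T, so the sketch below uses the standard three-species result, P = 1 - (2/3)e^(-T) with T = t/(2N), as an assumed example of such a "simple function". It applies only to three species with one gene sampled per species and a single internal branch. The function name is hypothetical.

```python
import math


def prob_gene_tree_matches(t_generations, effective_size):
    """P(gene tree topology == species tree topology) for three species.

    Assumes the standard three-species coalescent result
    P = 1 - (2/3) * exp(-T), with T = t / (2N).
    """
    scaled_time = t_generations / (2 * effective_size)
    return 1 - (2 / 3) * math.exp(-scaled_time)


# With T = 1 (for example t = 20,000 generations and N = 10,000),
# roughly a quarter of gene trees are still expected to disagree.
print(round(prob_gene_tree_matches(20_000, 10_000), 4))  # 0.7547
```

At T = 0 the value falls to 1/3, the chance of picking the matching topology at random among the three possible ones, which is why short internal branches make a single locus unreliable.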

18.

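Research progress on gene duplication

Li Hongjian Tan Jun 《生命科学》2006,18(2):150-154

Gene duplication refers to the copying of a DNA segment within the genome to produce one or more additional copies; the segment may be a short stretch of genomic sequence, an entire chromosome, or even the whole genome. Gene duplication is one of the main driving forces of genome evolution and one of the main sources of genes with new functions and of new species. This paper reviews recent progress in research on gene duplication during the evolution of vertebrates, model plants, and yeast, and discusses future directions for the field.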

19.

Enumeration of compact coalescent histories for matching gene trees and species trees

Disanto Filippo Rosenberg Noah A. 《Journal of mathematical biology》2019,78(1-2):155-188

Journal of Mathematical Biology - Compact coalescent histories are combinatorial structures that describe for a given gene tree G and species tree S possibilities for the numbers of coalescences of...

20.

Adaptive evolution after gene duplication

Hughes AL 《Trends in genetics : TIG》2002,18(9):433-434

One of the two ribonuclease genes in a leaf-eating monkey has adapted to a role in the digestion of bacterial RNA. Following duplication of the ancestral ribonuclease gene, adaptation occurred through a series of changes in the amino acid sequence of the protein it encodes. This example is a good illustration of how specialization of protein function after gene duplication can be a source of novel protein functions.