Similar articles
 20 similar articles found
1.
Most plant phylogenetic inference has used DNA sequence data from the plastid genome. This genome represents a single genealogical sample with no recombination among genes, potentially limiting the resolution of evolutionary relationships in some contexts. In contrast, nuclear DNA is inherently more difficult to employ for phylogeny reconstruction because major mutational events in the genome, including polyploidization, gene duplication, and gene extinction, can result in homologous gene copies that are difficult to identify as orthologs or paralogs. Gene tree parsimony (GTP) can be used to infer the rooted species tree by fitting gene genealogies to species trees while simultaneously minimizing the estimated number of duplications needed to reconcile conflicts among them. Here, we use GTP for five nuclear gene families and a previously published plastid data set to reconstruct the phylogenetic backbone of the aquatic plant family Pontederiaceae. Plastid-based phylogenetic studies strongly supported extensive paraphyly of Eichhornia (one of the four major genera) but also depicted considerable ambiguity concerning the true root placement for the family. Our results indicate that species trees inferred from the nuclear genes (alone and in combination with the plastid data) are highly congruent with gene trees inferred from plastid data alone. Consideration of optimal and suboptimal gene tree reconciliations places the root of the family at (or near) a branch leading to the rare and locally restricted E. meyeri. We also explore methods to incorporate uncertainty in individual gene trees during reconciliation by considering their individual bootstrap profiles and relate inferred excesses of gene duplication events on individual branches to whole-genome duplication events inferred for the same branches. Our study improves understanding of the phylogenetic history of Pontederiaceae and also demonstrates the utility of GTP for phylogenetic analysis.

2.
Genome-scale sequence data have become increasingly available in phylogenetic studies for understanding the evolutionary histories of species. However, it is challenging to develop probabilistic models to account for heterogeneity of phylogenomic data. The multispecies coalescent model describes gene trees as independent random variables generated from a coalescence process occurring along the lineages of the species tree. Since the multispecies coalescent model allows gene trees to vary across genes, coalescent-based methods have been widely used to account for heterogeneous gene trees in phylogenomic data analysis. In this paper, we summarize and evaluate the performance of coalescent-based methods for estimating species trees from genome-scale sequence data. We investigate the effects of deep coalescence and mutation on the performance of species tree estimation methods. We found that the coalescent-based methods perform well in estimating species trees for a large number of genes, regardless of the degree of deep coalescence and mutation. The performance of the coalescent methods is negatively correlated with the lengths of internal branches of the species tree.

3.
Phylogenomics reveal a robust fungal tree of life   Cited by: 3 (self-citations: 0; citations by others: 3)
Our understanding of the tree of life (TOL) is still fragmentary. Until recently, molecular phylogeneticists have built trees based on ribosomal RNA sequences and selected protein sequences, which, however, usually suffered from lack of support for the deeper branches and inconsistencies probably due to limited subsampling of the entire genome. Now, phylogenetic hypotheses can be based on the analysis of full genomes. We used available complete genome data as well as the eukaryote orthologous group (KOG) proteins to reconstruct with confidence basal branches of the fungal TOL. Phylogenetic analysis of a core of 531 KOGs shared among 21 fungal genomes, three animal genomes and one plant genome showed a single tree with high support resulting from four different methods of phylogenetic reconstruction. The single tree that we inferred from our dataset showed excellent nodal support for each branch, suggesting that it reflects the true phylogenetic relationships of the species involved.

4.
Reconstruction artifacts are a serious hindrance to the elucidation of phylogenetic relationships and a number of methods have been devised to alleviate them. Previous studies have demonstrated a striking disparity in the evolutionary rates of the mitochondrial (mt) genomes of squamate reptiles (lizards, worm lizards and snakes) and the reconstruction artifacts that may arise from this. Here, to examine basal squamate relationships, we have added the mt genome of the blind skink Dibamus novaeguineae to the mitogenomic dataset and applied different models for resolving the squamate tree. Categorical models were found to be less susceptible to artifacts than were the commonly used noncategorical phylogenetic models GTR and mtREV. The application of different treatments to the data showed that the removal of the fastest evolving sites in snakes improved phylogenetic signal in the dataset. Basal divergences remained, nevertheless, poorly resolved. The proportion of both fast-evolving and conserved sites in the squamate mt genomes relative to sites with intermediate rates of evolution suggests rapid early divergences among squamate taxa and at least partly explains the short internal relative to external branches in the squamate tree. Thus, mt and nuclear trees may never reach full agreement because of the short branches characterizing these divergences.

5.
Mutation and lateral transfer are two categories of processes generating genetic diversity in prokaryotic genomes. Their relative importance varies between lineages, yet both are complementary rather than independent, separable evolutionary forces. The replication process inevitably merges together their effects on the genome. We develop the concept of “open lineages” to characterize evolutionary lineages that over time accumulate more changes in their genomes by lateral transfer than by mutation. They contrast with “closed lineages,” in which most of the changes are caused by mutation. Open and closed lineages are interspersed along the branches of any tree of prokaryotes. This patchy distribution conflicts with the basic assumptions of traditional phylogenetic approaches. As a result, a tree representation including both open and closed lineages is a misrepresentation. The evolution of all prokaryotic lineages cannot be studied under a single model unless new phylogenetic approaches that are more pluralistic about lineage evolution are designed.

6.
We determined the complete mitochondrial genomes of five cephalopods of the Subclass Coleoidea (Suborder Oegopsida: Watasenia scintillans, Todarodes pacificus, Suborder Myopsida: Sepioteuthis lessoniana, Order Sepiida: Sepia officinalis, and Order Octopoda: Octopus ocellatus) and used them to infer phylogenetic relationships. In our Maximum Likelihood (ML) tree, sepiids (cuttlefish) are at the most basal position of all decapodiformes, and oegopsids and myopsids form a monophyletic clade, thus supporting the traditional classification of the Order Teuthida. We detected extensive gene rearrangements in the mitochondrial genomes of broad cephalopod groups. It is likely that the arrangements of mitochondrial genes in Oegopsida and Sepiida were derived from those of Octopoda, which is thought to be the ancestral order, by entire gene duplication and random gene loss. Oegopsida in particular has undergone long-range gene duplications. We also found that the mitochondrial gene arrangement of Sepioteuthis lessoniana differs from that of Loligo bleekeri, although they belong to the same family. Analysis of both the phylogenetic tree and mitochondrial gene rearrangements of coleoid Cephalopoda suggests that each mitochondrial gene arrangement was acquired after the divergence of each lineage.

7.
Kim SY, Pritchard JK. PLoS Genetics 2007, 3(9): 1572-1586
Conserved noncoding elements (CNCs) are an abundant feature of vertebrate genomes. Some CNCs have been shown to act as cis-regulatory modules, but the function of most CNCs remains unclear. To study the evolution of CNCs, we have developed a statistical method called the “shared rates test” to identify CNCs that show significant variation in substitution rates across branches of a phylogenetic tree. We report an application of this method to alignments of 98,910 CNCs from the human, chimpanzee, dog, mouse, and rat genomes. We find that ~68% of CNCs evolve according to a null model where, for each CNC, a single parameter models the level of constraint acting throughout the phylogeny linking these five species. The remaining ~32% of CNCs show departures from the basic model including speed-ups and slow-downs on particular branches and occasionally multiple rate changes on different branches. We find that a subset of the significant CNCs have evolved significantly faster than the local neutral rate on a particular branch, providing strong evidence for adaptive evolution in these CNCs. The distribution of these signals on the phylogeny suggests that adaptive evolution of CNCs occurs in occasional short bursts of evolution. Our analyses suggest a large set of promising targets for future functional studies of adaptation.

8.
When gene copies are sampled from various species, the resulting gene tree might disagree with the containing species tree. The primary causes of gene tree and species tree discord include incomplete lineage sorting, horizontal gene transfer, and gene duplication and loss. Each of these events yields a different parsimony criterion for inferring the (containing) species tree from gene trees. With incomplete lineage sorting, species tree inference means finding the tree that minimizes the extra gene lineages that had to coexist along species lineages; with gene duplication, it means finding the tree that minimizes gene duplications and/or losses. In this paper, we present the following results: 1) The deep coalescence cost is equal to the number of gene losses minus two times the gene duplication cost in the reconciliation of a uniquely leaf-labeled gene tree and a species tree. The deep coalescence cost can be computed in linear time for any arbitrary gene tree and species tree. 2) The deep coalescence cost is never less than the gene duplication cost in the reconciliation of an arbitrary gene tree and a species tree. 3) Species tree inference by minimizing deep coalescence events is NP-hard.
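The relation in result 1 can be checked on small trees with a minimal sketch of least-common-ancestor (LCA) reconciliation. The nested-tuple tree encoding, the function names, and the example trees below are assumptions made for illustration and are not taken from the paper; the sketch also assumes binary trees and a gene tree that contains every species exactly once, and it uses a simple quadratic LCA lookup rather than the linear-time algorithm the abstract mentions.

```python
def leaf_set(tree):
    """Return the set of leaf labels below a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def cluster_depths(species_tree):
    """Map every species-tree node, identified by its leaf set, to its depth."""
    depths = {}

    def walk(node, depth):
        depths[leaf_set(node)] = depth
        if not isinstance(node, str):
            for child in node:
                walk(child, depth + 1)

    walk(species_tree, 0)
    return depths


def reconcile(gene_tree, species_tree):
    """Return (duplications, losses, deep_coalescence) under the LCA mapping.

    Assumes binary trees and a gene tree in which every species occurs once.
    """
    depths = cluster_depths(species_tree)

    def lca_depth(leaves):
        # Species clusters containing `leaves` form a chain; the deepest is the LCA.
        return max(d for cluster, d in depths.items() if leaves <= cluster)

    dups = losses = path_edges = 0

    def walk(node):
        nonlocal dups, losses, path_edges
        here = lca_depth(leaf_set(node))
        if isinstance(node, str):
            return here
        left, right = (walk(child) for child in node)
        # Species-tree edges crossed by the two gene-tree edges below this node.
        steps = (left - here) + (right - here)
        path_edges += steps
        if left == here or right == here:
            dups += 1            # a child maps to the same species node
            losses += steps      # one loss per species node passed on the way down
        else:
            losses += steps - 2  # speciation: the first edge on each side is free
        return here

    walk(gene_tree)
    species_edges = len(depths) - 1
    # Extra lineages: lineage-edge incidences minus one lineage per species edge.
    return dups, losses, path_edges - species_edges


species = ((("A", "B"), "C"), "D")
gene = ((("A", "C"), "B"), "D")
dups, losses, deep = reconcile(gene, species)
print(dups, losses, deep, deep == losses - 2 * dups)
```

For this pair of trees the deep coalescence cost equals the losses minus twice the duplications, and it is not smaller than the duplication cost, in line with results 1 and 2.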

9.
Incomplete taxon sampling has been a major problem in resolving the early divergences in birds. Five new mitochondrial genomes are reported here (brush-turkey, lyrebird, suboscine flycatcher, turkey vulture, and a gull) and three break up long branches that tended to attract the distant reptilian outgroup. These long branches were to galliforms, and to oscine and suboscine passeriformes. Breaking these long branches leaves the root, as inferred by maximum likelihood and Bayesian phylogenetic analyses, between paleognaths and neognaths. This means that morphological, nuclear, and mitochondrial data are now in agreement on the position of the root of the avian tree and we can move on to other questions. An overview is then given of the deepest divisions in the mitogenomic tree inferred from complete mitochondrial genomes. The strict monophyly of both the galloanseres and the passerines is strongly supported, leaving the deep six-way split within Neoaves as the next major question for which resolution is still lacking. Incomplete taxon sampling was also a problem for Neoaves, and although some resolution is now available there are still problems because current phylogenetic methods still fail to account for real features of DNA sequence evolution.

10.
The PHASE software package allows phylogenetic tree construction with a number of evolutionary models designed specifically for use with RNA sequences that have conserved secondary structure. Evolution in the paired regions of RNAs occurs via compensatory substitutions, hence changes on either side of a pair are correlated. Accounting for this correlation is important for phylogenetic inference because it affects the likelihood calculation. In the present study we use the complete set of tRNA and rRNA sequences from 69 complete mammalian mitochondrial genomes. The likelihood calculation uses two evolutionary models simultaneously for different parts of the sequence: a paired-site model for the paired sites and a single-site model for the unpaired sites. We use Bayesian phylogenetic methods, with a Markov chain Monte Carlo algorithm to obtain the most probable trees and posterior probabilities of clades. The results are well resolved for almost all the important branches on the mammalian tree. They support the arrangement of mammalian orders within the four supra-ordinal clades that have been identified by studies of much larger data sets mainly comprising nuclear genes. Groups such as the hedgehogs and the murid rodents, which have been problematic in previous studies with mitochondrial proteins, appear in their expected position with the other members of their order. Our choice of genes and evolutionary model appears to be more reliable and less subject to biases caused by variation in base composition than previous studies with mitochondrial genomes.

11.
Yu Y, Degnan JH, Nakhleh L. PLoS Genetics 2012, 8(4): e1002660
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.

12.
Analyses of the increasingly available genomic data continue to reveal the extent of hybridization and its role in the evolutionary diversification of various groups of species. We show, through extensive coalescent-based simulations of multilocus data sets on phylogenetic networks, how divergence times before and after hybridization events can result in incomplete lineage sorting with gene tree incongruence signatures identical to those exhibited by hybridization. Evolutionary analysis of such data under the assumption of a species tree model can miss all hybridization events, whereas analysis under the assumption of a species network model would grossly overestimate hybridization events. These issues necessitate a paradigm shift in evolutionary analysis under these scenarios, from a model that assumes a priori a single source of gene tree incongruence to one that integrates multiple sources in a unifying framework. We propose a framework of coalescence within the branches of a phylogenetic network and show how this framework can be used to detect hybridization despite incomplete lineage sorting. We apply the model to simulated data and show that the signature of hybridization can be revealed as long as the interval between the divergence times of the species involved in hybridization is not too small. We reanalyze a data set of 106 loci from 7 in-group Saccharomyces species for which a species tree with no hybridization has been reported in the literature. Our analysis supports the hypothesis that hybridization occurred during the evolution of this group, explaining a large amount of the incongruence in the data. Our findings show that an integrative approach to gene tree incongruence and its reconciliation is needed. Our framework will help in systematically analyzing genomic data for the occurrence of hybridization and elucidating its evolutionary role.

13.
We examined how alignment of internal transcribed spacers of rDNA in fungi and plants changes with increasing genetic distance by successive removal of sequences from each data set followed by realignment and phylogenetic analysis. Increasing genetic distance can negatively affect phylogenetic reconstruction in two ways. First, it may cause errors in the alignment and therefore the homology hypotheses of the sequence characters. Second, it may cause errors in the homology assessments of character states because of multiple hits on individual branches. These two causes of error in phylogenetic inference were distinguished from one another in our analysis. The errors in alignment caused by increasing genetic distance were primarily due to inserting too few gaps and inserting gaps at the wrong positions. Errors in tree resolution, topology, and/or branch-support values were more often caused by multiple hits than by misaligned positions. This suggests that increasing genetic distance negatively affects our primary homology assessments of character states more severely than our primary homology assessments of characters. We suggest that increasing taxon sampling with the aim of subdividing long branches is a strategy for obtaining reliable alignments.

14.
Efficient enumeration of phylogenetically informative substrings.   Cited by: 1 (self-citations: 0; citations by others: 1)
We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore conserved) substrings that are shared between all mammals and not found in non-mammals. Such a collection of substrings may be used to identify conserved subsequences or to construct sets of identifying substrings for branches of a phylogenetic tree. For two disjoint sets of genomes on a phylogenetic tree, a substring is called a tag if it is found in all of the genomes of one set and none of the genomes of the other set. We present a near-linear time algorithm that finds all tags in a given phylogeny; and a sublinear space algorithm (at the expense of running time) that is more suited for very large data sets. Under a stochastic model of evolution, we show that a simple process of tag-generation essentially captures all possible ways of generating tags. We use this insight to develop a faster tag discovery algorithm with a small chance of error. However, since tags are not guaranteed to exist in a given data set, we generalize the notion of a tag from a single substring to a set of substrings. We present a linear programming-based approach for finding approximate generalized tag sets. Finally, we use our tag enumeration algorithm to analyze a phylogeny containing 57 whole microbial genomes. We find tags for all nodes in the phylogeny except the root for which we find generalized tag sets.
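The definition of a tag can be made concrete with a brute-force sketch. This is not the near-linear or sublinear-space algorithm described in the abstract; it only enumerates short substrings directly, and the function names, the `max_len` cut-off, and the toy sequences are assumptions introduced for illustration.

```python
def substrings(genome, max_len):
    """Return every substring of `genome` with length between 1 and max_len."""
    return {
        genome[i:i + n]
        for n in range(1, max_len + 1)
        for i in range(len(genome) - n + 1)
    }


def find_tags(in_group, out_group, max_len):
    """Return substrings present in every in-group genome and in no out-group genome."""
    shared = set.intersection(*(substrings(g, max_len) for g in in_group))
    return {s for s in shared if not any(s in g for g in out_group)}


# Toy sequences, chosen only to exercise the definition.
tags = find_tags(["GATTACA", "TTACAG"], ["GATTTCA"], max_len=3)
print(sorted(tags))
```

The direct enumeration costs time and memory proportional to the genome length times `max_len` per genome, which is why index structures are needed at whole-genome scale.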

15.
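As more and more genomes are sequenced, alignment-free phylogenetic analysis based on whole genomes has become a research hotspot. The nucleotide composition of genomes differs among species and individuals. The information carried by the genetic language, the DNA sequence, is largely reflected in its k-mer frequencies. A phylogenetic tree based on the k-mer frequencies of genome sequences therefore offers a new perspective on the relationships among species. In this paper we define an information parameter based on k-mer frequencies, use it to characterize genome sequences, compute the distances between the information parameters of different genomes, and build a phylogenetic tree for 84 viruses with the neighbor-joining method. The resulting tree agrees to a large extent with existing phylogenetic trees.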
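The first steps of such an alignment-free pipeline can be sketched as follows. The abstract does not give the form of its information parameter, so this sketch substitutes plain relative k-mer frequencies and a Euclidean distance, both of which are assumptions; the function names and toy sequences are likewise hypothetical. The resulting matrix is the kind of input a neighbor-joining implementation would take.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_frequencies(sequence, k):
    """Return relative frequencies of all 4**k DNA k-mers in a fixed order."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (for example N) are ignored.
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[kmer] / total for kmer in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(genomes, k):
    """Return (names, matrix) of pairwise distances between k-mer profiles."""
    names = sorted(genomes)
    profiles = {name: kmer_frequencies(genomes[name], k) for name in names}
    matrix = [[euclidean(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


# Toy sequences standing in for genomes.
names, matrix = distance_matrix(
    {"x": "ACGTACGT", "y": "ACGTACGT", "z": "AAAAAAAA"}, k=2
)
print(names)
print(matrix)
```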

16.
It is well known that molecular data "saturates" with increasing sequence divergence (thereby losing phylogenetic information) and that in addition the accumulation of misleading information due to chance similarities or to systematic bias may accompany saturation as well. Exploratory data analysis methods that can quantify the extent of signal loss or convergence for a given data set are scarce. Such methods are needed because genomics delivers very long sequence alignments spanning substantial phylogenetic depth, where site saturation may be compounded by systematic biases or other alternative signals. Here we introduce the Treeness Triangle (TT) graph, in which signals detectable by Hadamard (spectral) analysis are summed into 3 categories--those supporting 1) external and 2) internal branches in the optimal tree, in addition to 3) the residuals (potential internal branches not present in the optimal tree). These 3 values are plotted in a standard ternary coordinate system. The approach is illustrated with simulated and real data sets, the latter from complete chloroplast genomes, where potential problems of paralogy or lateral gene acquisition can be excluded. The TT uncovers the divergence-dependent loss of phylogenetic signal as subsets of chloroplast genomes are investigated that span increasingly deeper evolutionary timescales. The rate of signal loss (or signal retention) varies with the gene and/or the method of analysis.

17.
The advances accelerated by next-generation sequencing and long-read sequencing technologies continue to provide an impetus for plant phylogenetic study. In the past decade, a large number of phylogenetic studies adopting hundreds to thousands of genes across a wealth of clades have emerged and ushered plant phylogenetics and evolution into a new era. In the meantime, a roadmap for researchers when making decisions across different approaches for their phylogenomic research design is imminent. This r...

18.
We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.
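The probability distribution of gene trees given a species tree has a simple closed form in the smallest case, which may help readers unfamiliar with the second likelihood. For a rooted three-species tree ((A,B),C) with one sampled lineage per species and an internal branch of length t in coalescent units, the standard multispecies coalescent result is that the matching gene tree topology has probability 1 - (2/3)e^(-t) and each of the two discordant topologies has probability (1/3)e^(-t). The sketch below only evaluates that formula; the function name and the dictionary keys are chosen for illustration.

```python
from math import exp


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    `t` is the internal branch length in coalescent units; one lineage is
    sampled per species.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    discordant = exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,  # matches the species tree
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


for t in (0.0, 0.5, 2.0):
    print(t, three_taxon_gene_tree_probs(t))
```

As t shrinks toward zero the three topologies approach equal probability, which is the regime in which deep coalescence makes individual gene trees least informative about the species tree.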

19.

Background  

The ever-increasing wealth of genomic sequence information provides an unprecedented opportunity for large-scale phylogenetic analysis. However, species phylogeny inference is obfuscated by incongruence among gene trees due to evolutionary events such as gene duplication and loss, incomplete lineage sorting (deep coalescence), and horizontal gene transfer. Gene tree parsimony (GTP) addresses this issue by seeking a species tree that requires the minimum number of evolutionary events to reconcile a given set of incongruent gene trees. Despite its promise, the use of gene tree parsimony has been limited by the fact that existing software is either not fast enough to tackle large data sets or is restricted in the range of evolutionary events it can handle.

20.
Denitrification is a facultative respiratory pathway in which nitrite (NO2-), nitric oxide (NO), and nitrous oxide (N2O) are successively reduced to nitrogen gas (N2), effectively closing the nitrogen cycle. The ability to denitrify is widely dispersed among prokaryotes, and this polyphyletic distribution has raised the possibility of horizontal gene transfer (HGT) having a substantial role in the evolution of denitrification. Comparisons of 16S rRNA and denitrification gene phylogenies in recent studies support this possibility; however, these results remain speculative as they are based on visual comparisons of phylogenies from partial sequences. We reanalyzed publicly available nirS, nirK, norB, and nosZ partial sequences using Bayesian and maximum likelihood phylogenetic inference. Concomitant analysis of denitrification genes with 16S rRNA sequences from the same organisms showed substantial differences between the trees, which were supported by examining the posterior probability of monophyletic constraints at different taxonomic levels. Although these differences suggest HGT of denitrification genes, the presence of structural variants for nirK, norB, and nosZ makes it difficult to distinguish HGT from other evolutionary events. Additional analysis using phylogenetic networks and likelihood ratio tests of phylogenies based on full-length sequences retrieved from genomes also revealed significant differences in tree topologies among denitrification and 16S rRNA gene phylogenies, with the exception of the nosZ gene phylogeny within the data set of the nirK-harboring genomes. However, inspection of codon usage and G + C content plots from complete genomes gave no evidence for recent HGT. Instead, the close proximity of denitrification gene copies in the genomes of several denitrifying bacteria suggests duplication. Although HGT cannot be ruled out as a factor in the evolution of denitrification genes, our analysis suggests that other phenomena, such as gene duplication/divergence and lineage sorting, may have differently influenced the evolution of each denitrification gene.
