Similar literature
1.
Ge F  Wang LS  Kim J 《PLoS biology》2005,3(10):e316
With the availability of increasing amounts of genomic sequences, it is becoming clear that genomes experience horizontal transfer and incorporation of genetic information. However, to what extent such horizontal gene transfer (HGT) affects the core genealogical history of organisms remains controversial. Based on initial analyses of complete genomic sequences, HGT has been suggested to be so widespread that it might be the “essence of phylogeny” and might leave the treelike form of genealogy in doubt. On the other hand, possible biased estimation of HGT extent and the findings of coherent phylogenetic patterns indicate that phylogeny of life is well represented by tree graphs. Here, we reexamine this question by assessing the extent of HGT among core orthologous genes using a novel statistical method based on statistical comparisons of tree topology. We apply the method to 40 microbial genomes in the Clusters of Orthologous Groups database over a curated set of 297 orthologous gene clusters, and we detect significant HGT events in 33 out of 297 clusters over a wide range of functional categories. Estimates of positions of HGT events suggest a low mean genome-specific rate of HGT (2.0%) among the orthologous genes, which is in general agreement with other quantitative analyses of HGT. We propose that HGT events, even when relatively common, still leave the treelike history of phylogenies intact, much like cobwebs hanging from tree branches.

2.
Application of phylogenetic networks in evolutionary studies   Total citations: 42, self-citations: 0, citations by others: 42
The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees.

3.
The rapid increase in published genomic sequences for bacteria presents the first opportunity to reconstruct evolutionary events on the scale of entire genomes. However, extensive lateral gene transfer (LGT) may thwart this goal by preventing the establishment of organismal relationships based on individual gene phylogenies. The group for which cases of LGT are most frequently documented and for which the greatest density of complete genome sequences is available is the gamma-Proteobacteria, an ecologically diverse and ancient group including free-living species as well as pathogens and intracellular symbionts of plants and animals. We propose an approach to multigene phylogeny using complete genomes and apply it to the case of the gamma-Proteobacteria. We first applied stringent criteria to identify a set of likely gene orthologs and then tested the compatibilities of the resulting protein alignments with several phylogenetic hypotheses. Our results demonstrate phylogenetic concordance among virtually all (203 of 205) of the selected gene families, with each of the exceptions consistent with a single LGT event. The concatenated sequences of the concordant families yield a fully resolved phylogeny. This topology also received strong support in analyses aimed at excluding effects of heterogeneity in nucleotide base composition across lineages. Our analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT. This gene set can be identified and used to yield robust hypotheses for organismal phylogenies, thus establishing a foundation for reconstructing the evolutionary transitions, such as gene transfer, that underlie diversity in genome content and organization.

4.
5.
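Phylogenomics is a new field that uses whole-genome data to construct phylogenetic trees. Whole-genome data can effectively eliminate the influence on phylogenetic trees of factors such as horizontal gene transfer and differences in gene evolutionary rates among taxa. According to the type of whole-genome data used, phylogenomic methods can be divided into the following five categories: multigene concatenation methods, methods based on gene content, methods based on gene order, methods based on the content of short sequence strings, and methods based on metabolic pathways. This article systematically summarizes the principles, speed, accuracy, and scope of application of each category of methods, as well as their use in various groups of organisms, and outlines the prospects of phylogenomics and the challenges it faces.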

6.
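Research on deep relationships in angiosperm phylogeny: progress and challenges   Total citations: 1, self-citations: 0, citations by others: 1
Zeng Liping  Zhang Ning  Ma Hong 《生物多样性》 (Biodiversity Science) 2014,22(1):21-434
Angiosperm phylogenetics is the discipline that studies the relationships and evolutionary history of angiosperms and their constituent groups. Since the 1990s, molecular data such as nucleotide and amino acid sequences have been widely used in angiosperm phylogenetic research. Over more than 20 years of development, from the use of single or a few combined organellar genes to the recent use of whole chloroplast genomes to reconstruct angiosperm relationships, an angiosperm phylogenetic framework at the order and family levels has become widely accepted. In this framework, the basal groups, the five major clades (eudicots, monocots, magnoliids, Chloranthales, and Ceratophyllales), the orders contained in each clade, and the core groups included in several large clades are all highly supported. At the same time, organellar genes have some inherent problems, such as uniparental inheritance and limited phylogenetic information. In recent years, therefore, the importance of biparentally inherited nuclear genes in angiosperm phylogenetic research has gradually gained attention, and some progress has been made in studies at different taxonomic ranks. However, some relationships in angiosperm phylogeny remain difficult to resolve, such as the relationships among the five angiosperm clades and the positions of certain groups within the eudicots. This article briefly reviews the main advances of the past 20-plus years in research on deep relationships in angiosperm phylogeny, and discusses the choice of organellar and nuclear genes commonly used in angiosperm phylogenetics, the major groups whose phylogenetic positions have and have not been determined, and the problems that remain along with possible solutions.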

7.
The rapid increase in published genomic sequences for bacteria presents the first opportunity to reconstruct evolutionary events on the scale of entire genomes. However, extensive lateral gene transfer (LGT) may thwart this goal by preventing the establishment of organismal relationships based on individual gene phylogenies. The group for which cases of LGT are most frequently documented and for which the greatest density of complete genome sequences is available is the γ-Proteobacteria, an ecologically diverse and ancient group including free-living species as well as pathogens and intracellular symbionts of plants and animals. We propose an approach to multigene phylogeny using complete genomes and apply it to the case of the γ-Proteobacteria. We first applied stringent criteria to identify a set of likely gene orthologs and then tested the compatibilities of the resulting protein alignments with several phylogenetic hypotheses. Our results demonstrate phylogenetic concordance among virtually all (203 of 205) of the selected gene families, with each of the exceptions consistent with a single LGT event. The concatenated sequences of the concordant families yield a fully resolved phylogeny. This topology also received strong support in analyses aimed at excluding effects of heterogeneity in nucleotide base composition across lineages. Our analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT. This gene set can be identified and used to yield robust hypotheses for organismal phylogenies, thus establishing a foundation for reconstructing the evolutionary transitions, such as gene transfer, that underlie diversity in genome content and organization.

8.
Using the sequence information from nine completely sequenced bacterial genomes, we extract 32 protein families that are thought to contain orthologous proteins from each genome. The alignments of these 32 families are used to construct a phylogeny with the neighbor-joining algorithm. This tree has several topological features that are different from the conventional phylogeny, yet it is highly reliable according to its bootstrap values. Upon closer study of the individual families used, it is clear that the strong phylogenetic signal comes from three families, at least two of which are good candidates for horizontal transfer. The tree from the remaining 29 families consists almost entirely of noise at the level of bacterial phylum divisions, indicating that, even with large amounts of data, it may not be possible to reconstruct the prokaryote phylogeny using standard sequence-based methods. Received: 22 November 1998 / Accepted: 17 February 1999

10.
If lateral gene transfer (LGT) has affected all genes over the course of prokaryotic evolution, reconstruction of organismal phylogeny is compromised. However, if a core of genes is immune to transfer, then the evolutionary history of that core might be our most reliable guide to the evolution of organisms. Such a core should be preferentially included in the subset of genes shared by all organisms, but where universally conserved genes have been analyzed, there is too little phylogenetic signal to allow determination of whether or not they indeed have the same history (Hansmann and Martin 2000; Teichmann and Mitchison 1999). Here we look at a more restricted set, 521 homologous genes (COGs) simultaneously present in four sequenced euryarchaeal genomes. Although there is overall little robust phylogenetic signal in this data set, there is, among well-supported trees, strong representation of all three possible four-taxon topologies. "Informational" genes seem no less subject to LGT than are "operational genes," within the euryarchaeotes. We conclude that (i) even in this collection of conserved genes there has been extensive LGT (orthologous gene replacement) and (ii) the notion that there is a core of nontransferable genes (the "core hypothesis") has not been proven and may be unprovable. Received: 7 November 2000 / Accepted: 20 February 2001
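The "three possible four-taxon topologies" mentioned in this abstract can be made concrete with a small sketch. It is illustrative only: the taxon labels and the input format (one sister pair per well-supported gene tree) are assumptions, not the authors' data or pipeline.

```python
from collections import Counter


def quartet_topologies(taxa):
    """Return the three unrooted topologies for four taxa.

    Each topology is a frozenset holding its two sister pairs,
    so AB|CD and CD|AB compare equal.
    """
    if len(set(taxa)) != 4:
        raise ValueError("exactly four distinct taxa are required")
    a, b, c, d = taxa
    return [
        frozenset({frozenset({a, b}), frozenset({c, d})}),
        frozenset({frozenset({a, c}), frozenset({b, d})}),
        frozenset({frozenset({a, d}), frozenset({b, c})}),
    ]


def tally_topologies(sister_pairs, taxa):
    """Count how many gene trees support each four-taxon topology.

    sister_pairs: hypothetical input, one (x, y) tuple per gene tree,
    naming two taxa that the tree groups together.
    """
    taxa_set = set(taxa)
    counts = Counter()
    for x, y in sister_pairs:
        pair = frozenset({x, y})
        rest = frozenset(taxa_set - pair)
        counts[frozenset({pair, rest})] += 1
    return counts


# Hypothetical labels for four genomes and three gene trees.
taxa = ["A", "B", "C", "D"]
counts = tally_topologies([("A", "B"), ("C", "D"), ("A", "C")], taxa)
```

If LGT were rare, one would expect a single topology to dominate such a tally; the abstract reports that all three are strongly represented among well-supported trees.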

11.
The natural history of nitrogen fixation   Total citations: 1, self-citations: 0, citations by others: 1
In recent years, our understanding of biological nitrogen fixation has been bolstered by a diverse array of scientific techniques. Still, the origin and extant distribution of nitrogen fixation have been perplexing from a phylogenetic perspective, largely because of factors that confound molecular phylogeny such as sequence divergence, paralogy, and horizontal gene transfer. Here, we make use of 110 publicly available complete genome sequences to understand how the core components of nitrogenase, including NifH, NifD, NifK, NifE, and NifN proteins, have evolved. These genes are universal in nitrogen-fixing organisms (typically found within highly conserved operons) and, overall, have remarkably congruent phylogenetic histories. Additional clues to the early origins of this system are available from two distinct clades of nitrogenase paralogs: a group composed of genes essential to photosynthetic pigment biosynthesis and a group of uncharacterized genes present in methanogens and in some photosynthetic bacteria. We explore the complex genetic history of the nitrogenase family, which is replete with gene duplication, recruitment, fusion, and horizontal gene transfer and discuss these events in light of the hypothesized presence of nitrogenase in the last common ancestor of modern organisms, as well as the additional possibility that nitrogen fixation might have evolved later, perhaps in methanogenic archaea, and was subsequently transferred into the bacterial domain.

12.
Phylogenetic sequence analysis of single or multiple genes has dominated the study and census of the genetic diversity among closely related bacteria. It remains unclear, however, how the results based on a few genes in the genome correlate with whole-genome-based relatedness and what genes (if any) best reflect whole-genome-level relatedness and hence should be preferentially used to economize on cost and to improve accuracy. We show here that phylogenies of closely related organisms based on the average nucleotide identity (ANI) of their shared genes correspond accurately to phylogenies based on state-of-the-art analysis of their whole-genome sequences. We use ANI to evaluate the phylogenetic robustness of every gene in the genome and show that almost all core genes, regardless of their functions and positions in the genome, offer robust phylogenetic reconstruction among strains that show 80 to 95% ANI (16S rRNA identity, >98.5%). Lack of elapsed time and, to a lesser extent, horizontal transfer and recombination make the selection of genes more critical for applications that target the intraspecies level, i.e., strains that show >95% ANI according to current standards. A much more accurate phylogeny for the Escherichia coli group was obtained based on just three best-performing genes according to our analysis compared to the concatenated alignment of eight genes that are commonly employed for phylogenetic purposes in this group. Our results are reproducible within the Salmonella, Burkholderia, and Shewanella groups and therefore are expected to have general applicability for microevolution studies, including metagenomic surveys.
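As a rough illustration of the ANI idea described in this abstract, the sketch below averages per-gene percent identity over the genes two genomes share. It assumes the shared genes are already pairwise aligned (equal length, `-` for gaps) and stored in dictionaries keyed by hypothetical gene names; published ANI pipelines rely on alignment search tools and filtering steps that are not reproduced here.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def average_nucleotide_identity(genome_a, genome_b):
    """Mean percent identity across genes present in both genomes.

    genome_a, genome_b: dicts mapping gene name -> aligned nucleotide string
    (an assumed toy representation).
    """
    shared = sorted(set(genome_a) & set(genome_b))
    if not shared:
        raise ValueError("the genomes share no genes")
    identities = [percent_identity(genome_a[g], genome_b[g]) for g in shared]
    return sum(identities) / len(identities)


# Toy data: geneX differs at one of ten sites, geneY is identical,
# geneZ is absent from genome_a and therefore ignored.
genome_a = {"geneX": "ACGTACGTAC", "geneY": "GGGGCCCCAA"}
genome_b = {"geneX": "ACGTACGTAT", "geneY": "GGGGCCCCAA", "geneZ": "TTTT"}
ani = average_nucleotide_identity(genome_a, genome_b)
```

With these toy inputs the two shared genes score 90% and 100%, so the mean is 95%, which sits at the intraspecies boundary the abstract mentions.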

13.
A number of recent papers have suggested that gene family content can be used to resolve phylogenies, particularly in the case of prokaryotes, in which extensive horizontal gene transfer means that individual gene phylogenies may not mirror the organismal phylogeny. However, no study has yet examined how sensitive such analyses are to the criterion of homology assessment used to assemble multigene families. Using data from 99 completely sequenced prokaryotic genomes, we examined the effect of homology criteria in phylogenetic analyses wherein presence or absence of each family in the genome was used as a cladistic character. Different criteria resulted in evidence for contradictory tree topologies, sometimes with high bootstrap support. A moderately strict criterion seemed best for assembling multigene families in a biologically meaningful way, but it was not necessarily preferable for phylogenetic analysis. Instead, a very strict criterion, which broke up gene families into smaller subfamilies, seemed to have advantages for phylogenetic purposes. The poor performance of gene family content-based phylogenetic analysis in the case of prokaryotes appears to reflect high levels of homoplasy resulting not only from horizontal gene transfer but also, more importantly, from extensive parallel loss of gene families in certain bacterial genomes.
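To show what "presence or absence of each family used as a cladistic character" looks like in practice, here is a minimal sketch that builds a binary character matrix and picks out the parsimony-informative families (those present in at least two genomes and absent from at least two). The genome and family names are hypothetical, and the sketch does not cover the homology criteria or the tree search used in the study.

```python
def presence_absence_matrix(genome_families):
    """Build a binary character matrix from genome -> set of gene family IDs.

    Returns the sorted family list and a dict genome -> list of 0/1 states,
    with one column per family in that sorted order.
    """
    families = sorted(set().union(*genome_families.values()))
    matrix = {
        genome: [1 if fam in fams else 0 for fam in families]
        for genome, fams in genome_families.items()
    }
    return families, matrix


def informative_families(genome_families):
    """Families whose presence/absence pattern is parsimony-informative."""
    families, matrix = presence_absence_matrix(genome_families)
    n_genomes = len(matrix)
    informative = []
    for i, fam in enumerate(families):
        present = sum(row[i] for row in matrix.values())
        if present >= 2 and n_genomes - present >= 2:
            informative.append(fam)
    return informative


# Hypothetical toy data: four genomes, three gene families.
toy = {
    "g1": {"F1", "F2"},
    "g2": {"F1", "F2", "F3"},
    "g3": {"F1", "F3"},
    "g4": {"F1"},
}
families, matrix = presence_absence_matrix(toy)
informative = informative_families(toy)
```

Here F1 is present everywhere and carries no grouping information, while F2 and F3 each split the genomes two against two. A stricter homology criterion would split families into subfamilies and so change which columns exist, which is the sensitivity the abstract examines.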

14.
Phylogenetic trees based on gene repertoires are remarkably similar to the current consensus of life history. Yet it has been argued that shared gene content is unreliable for phylogenetic reconstruction because of convergence in gene content due to horizontal gene transfer and parallel gene loss. Here we test this argument, by filtering out as noise those orthologous groups that have an inconsistent phylogenetic distribution, using two independent methods. The resulting phylogenies do indeed contain small but significant improvements. More importantly, we find that the majority of orthologous groups contain some phylogenetic signal and that the resulting phylogeny is the only detectable signal present in the gene distribution across genomes. Horizontal gene transfer or parallel gene loss does not cause systematic biases in the gene content tree.

15.
The mitochondrial genome is one of the most frequently used loci in phylogenetic and phylogeographic analyses, and it is becoming increasingly possible to sequence and analyze this genome in its entirety from diverse taxa. However, sequencing the entire genome is not always desirable or feasible. Which genes should be selected to best infer the evolutionary history of the mitochondria within a group of organisms, and what properties of a gene determine its phylogenetic performance? The current study addresses these questions in a Bayesian phylogenetic framework with reference to a phylogeny of plethodontid and related salamanders derived from 27 complete mitochondrial genomes; this topology is corroborated by nuclear DNA and morphological data. Evolutionary rates for each mitochondrial gene and divergence dates for all nodes in the plethodontid mitochondrial genome phylogeny were estimated in both Bayesian and maximum likelihood frameworks using multiple fossil calibrations, multiple data partitions, and a clock-independent approach. Bayesian analyses of individual genes were performed, and the resulting trees compared against the reference topology. Ordinal logistic regression analysis of molecular evolution rate, gene length, and the gamma-shape parameter α demonstrated that slower rate of evolution and longer gene length both increased the probability that a gene would perform well phylogenetically. Estimated rates of molecular evolution vary 84-fold among different mitochondrial genes and different salamander lineages, and mean rates among genes vary 15-fold. Despite having conserved amino acid sequences, cox1, cox2, cox3, and cob have the fastest mean rates of nucleotide substitution, and the greatest variation in rates, whereas rrnS and rrnL have the slowest rates. Reasons underlying this rate variation are discussed, as is the extensive rate variation in cox1 in light of its proposed role in DNA barcoding.

16.
Retrotransposons of the R2 superclade specifically insert within the 28S ribosomal gene. They have been isolated from a variety of metazoan genomes and were found to be vertically inherited even if their phylogeny does not always agree with that of the host species. This was explained by the diversification/extinction of paralogous lineages, the absence of horizontal transfer having been demonstrated. We here analyze the widest available collection of R2 sequences, either newly isolated from recently sequenced genomes or drawn from public databases, in a phylogenetic framework. Results are congruent with previous analyses, but new important issues emerge. First, the N-terminal end of the R2-B clade protein, so far unknown, presents a new zinc-finger configuration. Second, the phylogenetic pattern is consistent with an ancient, rapid radiation of R2 lineages: because the estimated time of R2 origin (850–600 million years ago) falls just before the metazoan Cambrian explosion, the wide element diversity and the incongruence with the host phylogeny could be attributable to the sudden expansion of available niches represented by the hosts' 28S ribosomal genes. Finally, we detect instances of coexisting multiple R2 lineages showing a non-random phylogenetic pattern, strongly similar to that of the “library” model known for tandem repeats: a collection of R2s was present in the ancestral genome and then differentially activated/repressed in the derived species. Models for activation/repression as well as mechanisms for sequence maintenance are also discussed within this framework.

17.
The evolutionary forces that determine genome size in bacteria and archaea have been the subject of intense debate over the last few decades. Although the preferential loss of genes observed in prokaryotes is explained through the deletional bias, factors promoting and preventing the fixation of such gene losses often remain unclear. Importantly, statistical analyses on this topic typically do not consider the potential bias introduced by the shared ancestry of many lineages, which is critical when using species as data points because of the potential dependence on residuals. In this study, we investigated the genome size distributions across a broad diversity of bacteria and archaea to evaluate if this trait is phylogenetically conserved at broad phylogenetic scales. After model fit, Pagel’s lambda indicated a strong phylogenetic signal in genome size data, suggesting that the diversification of this trait is influenced by shared evolutionary histories. We used a phylogenetic generalized least-squares analysis (PGLS) to test whether phylogeny influences the predictability of genome size from dN/dS ratios and 16S copy number, two variables that have been previously linked to genome size. These results confirm that failure to account for evolutionary history can lead to biased interpretations of genome size predictors. Overall, our results indicate that although bacteria and archaea can rapidly gain and lose genetic material through gene transfers and deletions, respectively, phylogenetic signal for genome size distributions can still be recovered at broad phylogenetic scales that should be taken into account when inferring the drivers of genome size evolution.

18.
Phylogenetic classifications based on single genes such as rRNA genes do not provide a complete and accurate picture of evolution because they do not account for evolutionary leaps caused by gene transfer, duplication, deletion and functional replacement. Here, we present a whole-genome-scale phylogeny based on metabolic pathway reaction content. From the genome sequences of 42 microorganisms, we deduced the metabolic pathway reactions and used the relatedness of these contents to construct a phylogenetic tree that represents the similarity of metabolic profiles (relatedness) as well as the extent of metabolic pathway similarity (evolutionary distance). This method accounts for horizontal gene transfer and specific gene loss by comparison of whole metabolic subpathways, and allows evaluation of evolutionary relatedness and changes in metabolic pathways. Thus, a tree based on metabolic pathway content represents both the evolutionary time scale (changes in genetic content) and the evolutionary process (changes in metabolism).
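One simple way to turn "relatedness of reaction content" into numbers is a set-based distance between the reaction inventories of two organisms. The sketch below uses the Jaccard distance, which is an assumption made for illustration; the abstract does not state which similarity measure the authors used, and the organism and reaction identifiers are hypothetical.

```python
def jaccard_distance(reactions_a, reactions_b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of reaction identifiers."""
    union = reactions_a | reactions_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(reactions_a & reactions_b) / len(union)


def distance_matrix(profiles):
    """Pairwise distances between organisms.

    profiles: dict mapping organism name -> set of reaction identifiers.
    Returns the sorted organism names and a square list-of-lists matrix.
    """
    names = sorted(profiles)
    rows = [
        [jaccard_distance(profiles[x], profiles[y]) for y in names]
        for x in names
    ]
    return names, rows


# Hypothetical reaction profiles for three organisms.
profiles = {
    "org1": {"R1", "R2", "R3"},
    "org2": {"R2", "R3", "R4"},
    "org3": {"R1", "R2", "R3"},
}
names, dist = distance_matrix(profiles)
```

A matrix like this could then be passed to any distance-based tree-building method, such as neighbor joining, to obtain a tree of metabolic profiles.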

19.
One of the most complicated remaining problems of molecular-phylogenetic analysis is choosing an appropriate genome region. In an ideal case, such a region should have two specific properties: (i) results of analysis using this region should be similar to the results of multigene analysis using the maximal number of regions; (ii) this region should be arranged compactly and be significantly shorter than the multigene set. The second condition is necessary to facilitate sequencing and extension of the set of taxa under analysis, the number of which is also crucial for molecular phylogenetic analysis. Such regions have been revealed for some groups of animals and have been designated as "lucky genes". We have carried out a computational experiment on analysis of 41 complete chloroplast genomes of flowering plants aimed at searching for a "lucky gene" for reconstruction of their phylogeny. It is shown that the phylogenetic tree inferred from a combination of translated nucleotide sequences of genes encoding subunits of plastid RNA polymerase is closest to the tree constructed using all protein coding sites of the chloroplast genome. The only node for which a contradiction is observed is unstable across the different types of analysis. For all the other genes or their combinations, the coincidence is significantly worse. The RNA polymerase genes are compactly arranged in the genome and are fourfold shorter than the total length of protein coding genes used for phylogenetic analysis. The combination of all necessary features makes this group of genes the main candidate for the role of "lucky gene" in studying phylogeny of flowering plants.

20.
Most plant phylogenetic inference has used DNA sequence data from the plastid genome. This genome represents a single genealogical sample with no recombination among genes, potentially limiting the resolution of evolutionary relationships in some contexts. In contrast, nuclear DNA is inherently more difficult to employ for phylogeny reconstruction because major mutational events in the genome, including polyploidization, gene duplication, and gene extinction can result in homologous gene copies that are difficult to identify as orthologs or paralogs. Gene tree parsimony (GTP) can be used to infer the rooted species tree by fitting gene genealogies to species trees while simultaneously minimizing the estimated number of duplications needed to reconcile conflicts among them. Here, we use GTP for five nuclear gene families and a previously published plastid data set to reconstruct the phylogenetic backbone of the aquatic plant family Pontederiaceae. Plastid-based phylogenetic studies strongly supported extensive paraphyly of Eichhornia (one of the four major genera) but also depicted considerable ambiguity concerning the true root placement for the family. Our results indicate that species trees inferred from the nuclear genes (alone and in combination with the plastid data) are highly congruent with gene trees inferred from plastid data alone. Consideration of optimal and suboptimal gene tree reconciliations place the root of the family at (or near) a branch leading to the rare and locally restricted E. meyeri. We also explore methods to incorporate uncertainty in individual gene trees during reconciliation by considering their individual bootstrap profiles and relate inferred excesses of gene duplication events on individual branches to whole-genome duplication events inferred for the same branches. Our study improves understanding of the phylogenetic history of Pontederiaceae and also demonstrates the utility of GTP for phylogenetic analysis.
