Similar articles
 20 similar articles found (search time: 31 ms)
1.
The availability of whole-genome data has created an extraordinary opportunity to reconstruct the 'tree of life' in fine detail. Such a comprehensive effort promises to unravel the enigmatic evolutionary relationships between prokaryotes and eukaryotes. Traditionally, biologists have represented the evolutionary relationships of all organisms by a bifurcating phylogenetic tree. But recent analyses of completely sequenced genomes using conditioned reconstruction (CR), a newly developed gene-content algorithm, suggest that a cycle graph or 'ring' rather than a 'tree' is a better representation of the evolutionary relationships between prokaryotes and eukaryotes. CR is the first phylogenetic-reconstruction method to provide precise evidence about the origin of the eukaryotes. This review summarizes how the CR analyses of complete genomes provide evidence for a fusion origin of the eukaryotes.

2.
Phylogenies involving nonmodel species are based on a few genes, mostly chosen following historical or practical criteria. Because gene trees are sometimes incongruent with species trees, the resulting phylogenies may not accurately reflect the evolutionary relationships among species. The increase in availability of genome sequences now provides large numbers of genes that could be used for building phylogenies. However, for practical reasons only a few genes can be sequenced for a wide range of species. Here we asked whether we can identify a few genes, among the single-copy genes common to most fungal genomes, that are sufficient for recovering accurate and well-supported phylogenies. Fungi represent a model group for phylogenomics because many complete fungal genomes are available. An automated procedure was developed to extract single-copy orthologous genes from complete fungal genomes using a Markov Clustering Algorithm (Tribe-MCL). Using 21 complete, publicly available fungal genomes with reliable protein predictions, 246 single-copy orthologous gene clusters were identified. We inferred the maximum likelihood trees using the individual orthologous sequences and constructed a reference tree from concatenated protein alignments. The topologies of the individual gene trees were compared to that of the reference tree using three different methods. The performance of individual genes in recovering the reference tree was highly variable. Gene size and the number of variable sites were highly correlated and significantly affected the performance of the genes, but the average substitution rate did not. Two genes recovered exactly the same topology as the reference tree, and when concatenated provided high bootstrap values. The genes typically used for fungal phylogenies did not perform well, which suggests that current fungal phylogenies based on these genes may not accurately reflect the evolutionary relationships among species. 
Analyses of subsets of species showed that the phylogenetic performance did not seem to depend strongly on the sample. We expect that the best-performing genes identified here will be very useful for phylogenetic studies of fungi, at least at a large taxonomic scale. Furthermore, we compare the method developed here for finding genes for building robust phylogenies with previous ones, and we argue that our method could be applied to other groups of organisms when more complete genomes are available.
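
The comparison of each gene tree with the reference tree can be illustrated with a small sketch. The abstract does not name its three comparison methods, so the rooted Robinson-Foulds distance used here is an assumption chosen for illustration, and the nested-tuple tree encoding and the function names `clades` and `rooted_rf` are hypothetical.

```python
def clades(tree):
    """Collect the non-trivial clades of a rooted tree given as nested tuples.

    Leaves are strings; internal nodes are tuples of children.
    Returns (set of clades as frozensets of leaf names, frozenset of all leaves).
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree
    return found, all_leaves


def rooted_rf(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    clades_a, leaves_a = clades(tree_a)
    clades_b, leaves_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must have the same leaf set")
    return len(clades_a ^ clades_b)


# Hypothetical reference tree and one gene tree over five taxa.
reference = ((("A", "B"), "C"), ("D", "E"))
gene_tree = ((("A", "C"), "B"), ("D", "E"))
print(rooted_rf(reference, gene_tree))
```

A distance of zero means the gene recovers the reference topology exactly, which is the criterion the abstract reports for its two best genes.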

3.
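As more and more genomes are sequenced, alignment-free phylogenetic analysis based on whole genomes has become an active research topic. The nucleotide composition of genomes differs among species and among individuals. The information in the genetic language, the DNA sequence, is largely reflected in its k-mer frequencies. A phylogenetic tree based on the k-mer frequencies of genome sequences therefore offers a new perspective on the relationships among species. In this paper we define an information parameter based on k-mer frequencies and use it to characterize genome sequences, compute the distances between the information parameters of different genomes, and build a phylogenetic tree of 84 viruses with the neighbor-joining method. The resulting tree agrees to a large extent with existing phylogenetic trees.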
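
A minimal sketch of the alignment-free idea follows. The abstract does not specify its information parameter, so plain relative k-mer frequencies and a Euclidean distance stand in for it here; both are assumptions, as are the function names and the toy sequences. The resulting matrix is the kind of input a neighbor-joining implementation would take.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k DNA k-mers, in a fixed lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)  # k-mers with ambiguous bases are ignored
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(genomes, k=3):
    """Pairwise distances between genomes given as a dict name -> sequence."""
    names = list(genomes)
    vectors = {n: kmer_frequencies(genomes[n], k) for n in names}
    matrix = [[euclidean(vectors[a], vectors[b]) for b in names] for a in names]
    return names, matrix


# Toy sequences, purely illustrative.
names, matrix = distance_matrix(
    {"virus1": "ACGTACGTAC", "virus2": "ACGTACGTTT", "virus3": "GGGGCCCCGG"}, k=2
)
for name, row in zip(names, matrix):
    print(name, [round(x, 3) for x in row])
```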

4.
Genome trees and the tree of life   Cited 16 times (self-citations: 0; citations by others: 16)
Genome comparisons indicate that horizontal gene transfer and differential gene loss are major evolutionary phenomena that, at least in prokaryotes, involve a large fraction, if not the majority, of genes. The extent of these events casts doubt on the feasibility of constructing a 'Tree of Life', because the trees for different genes often tell different stories. However, alternative approaches to tree construction that attempt to determine tree topology on the basis of comparisons of complete gene sets seem to reveal a phylogenetic signal that supports the three-domain evolutionary scenario and suggests the possibility of delineation of previously undetected major clades of prokaryotes. If the validity of these whole-genome approaches to tree building is confirmed by analyses of numerous new genomes, which are currently being sequenced at an increasing rate, it would seem that the concept of a universal 'species' tree is still appropriate. However, this tree should be reinterpreted as a prevailing trend in the evolution of genome-scale gene sets rather than as a complete picture of evolution.

5.
The concept of the genome tree depends on the potential evolutionary significance in the clustering of species according to similarities in the gene content of their genomes. In this respect, genome trees have often been identified with species trees. With the rapid expansion of genome sequence data it becomes increasingly important to develop accurate methods for grasping global trends in the phylogenetic signals that mutually link the various genomes. We therefore derive here the methodological concept of genome trees based on protein conservation profiles in multiple species. The basic idea in this derivation is that the multi-component "presence-absence" protein conservation profiles permit tracking of common evolutionary histories of genes across multiple genomes. We show that a significant reduction in informational redundancy is achieved by considering only the subset of distinct conservation profiles. Beyond these basic ideas, we point out various pitfalls and limitations associated with the data handling, paving the way for further improvements. As an illustration of the methods, we analyze a genome tree based on the above principles, along with a series of other trees derived from the same data and based on pairwise comparisons (ancestral duplication-conservation and shared orthologs). In all trees we observe a sharp discrimination between the three primary domains of life: Bacteria, Archaea, and Eukarya. The new genome tree, based on conservation profiles, displays a significant correspondence with classically recognized taxonomical groupings, along with a series of departures from such conventional clusterings.
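
The presence-absence profiles and the redundancy reduction described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the family and genome names are invented, and the mismatch fraction used as a genome-to-genome distance is a simple stand-in that the abstract does not specify.

```python
from collections import Counter


def conservation_profiles(presence, genomes):
    """Map each protein family to a tuple of 0/1 flags, one flag per genome."""
    return {family: tuple(int(g in members) for g in genomes)
            for family, members in presence.items()}


def distinct_profiles(profiles):
    """Collapse identical profiles; values count the families sharing each one."""
    return Counter(profiles.values())


def profile_distance(distinct, i, j):
    """Fraction of distinct profiles in which genomes i and j disagree (assumed metric)."""
    if not distinct:
        raise ValueError("no profiles given")
    mismatches = sum(1 for p in distinct if p[i] != p[j])
    return mismatches / len(distinct)


# Hypothetical input: which genomes contain a member of each protein family.
genomes = ["Eco", "Bsu", "Mja"]
presence = {
    "famA": {"Eco", "Bsu"},
    "famB": {"Eco", "Bsu"},
    "famC": {"Mja"},
    "famD": {"Eco", "Bsu", "Mja"},
}
profiles = conservation_profiles(presence, genomes)
distinct = distinct_profiles(profiles)
print(len(profiles), "families,", len(distinct), "distinct profiles")
print(profile_distance(distinct, 0, 2))
```

Counting each distinct profile once, rather than once per family, is what removes the redundancy contributed by many families with the same distribution across genomes.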

6.
The PHASE software package allows phylogenetic tree construction with a number of evolutionary models designed specifically for use with RNA sequences that have conserved secondary structure. Evolution in the paired regions of RNAs occurs via compensatory substitutions, hence changes on either side of a pair are correlated. Accounting for this correlation is important for phylogenetic inference because it affects the likelihood calculation. In the present study we use the complete set of tRNA and rRNA sequences from 69 complete mammalian mitochondrial genomes. The likelihood calculation uses two evolutionary models simultaneously for different parts of the sequence: a paired-site model for the paired sites and a single-site model for the unpaired sites. We use Bayesian phylogenetic methods with a Markov chain Monte Carlo algorithm to obtain the most probable trees and the posterior probabilities of clades. The results are well resolved for almost all the important branches on the mammalian tree. They support the arrangement of mammalian orders within the four supra-ordinal clades that have been identified by studies of much larger data sets mainly comprising nuclear genes. Groups such as the hedgehogs and the murid rodents, which have been problematic in previous studies with mitochondrial proteins, appear in their expected position with the other members of their order. Our choice of genes and evolutionary model appears to be more reliable and less subject to biases caused by variation in base composition than previous studies with mitochondrial genomes.

7.
The phylogenetic profile method has been widely applied in the prediction of protein-protein interactions (PPIs). Studies often use all of the available complete genomes for this method. With more than 400 genomes complete and new ones on the horizon, it remains unclear how reference organisms should be selected for profile construction and how that selection influences PPI prediction. Here, we performed a systematic assessment of reference organism selection from 225 complete genomes with their evolutionary tree. Our results suggest that reference organisms should be selected from moderately and highly genetically distant organisms, from all three domains (Bacteria, Archaea, and Eukarya), and by their even distribution at the fifth hierarchical level in the evolutionary tree. Our study provides important guidance on the construction of phylogenetic profiles for PPI prediction and functional genomics, which has become challenging due to the large and increasing number of available candidate organisms.
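
To show why the choice of reference organisms matters, here is a small sketch of phylogenetic-profile scoring. The Jaccard similarity, the 0.8 threshold, the protein names and the profiles are all assumptions made for illustration; the abstract does not state which similarity measure the study used.

```python
from itertools import combinations


def jaccard(p, q):
    """Jaccard similarity of two 0/1 profiles (0.0 when both are all zeros)."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def predict_pairs(profiles, reference_idx, threshold=0.8):
    """Return protein pairs whose profiles, restricted to the chosen
    reference organisms, reach the similarity threshold."""
    hits = []
    for (a, pa), (b, pb) in combinations(sorted(profiles.items()), 2):
        sub_a = [pa[i] for i in reference_idx]
        sub_b = [pb[i] for i in reference_idx]
        score = jaccard(sub_a, sub_b)
        if score >= threshold:
            hits.append((a, b, score))
    return sorted(hits, key=lambda hit: -hit[2])


# Hypothetical profiles over five reference organisms (1 = homolog present).
profiles = {
    "P1": (1, 1, 0, 1, 0),
    "P2": (1, 1, 0, 1, 0),
    "P3": (0, 0, 1, 0, 1),
}
print(predict_pairs(profiles, reference_idx=[0, 1, 2, 3, 4]))
print(predict_pairs(profiles, reference_idx=[2]))
```

With all five organisms, P1 and P2 are predicted to interact; restricted to a single uninformative organism, the same pair is missed, which is the kind of sensitivity to reference selection the abstract assesses.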

8.
As more complete genomes are sequenced, phylogenetic analysis is entering a new era, that of phylogenomics. One branch of this expanding field aims to reconstruct the evolutionary history of organisms on the basis of the analysis of their genomes. Recent studies have demonstrated the power of this approach, which has the potential to provide answers to several fundamental evolutionary questions. However, challenges for the future have also been revealed. The very nature of the evolutionary history of organisms and the limitations of current phylogenetic reconstruction methods mean that part of the tree of life might prove difficult, if not impossible, to resolve with confidence.

9.
Horseshoe crabs (order Xiphosura) are often referred to as an ancient order of marine chelicerates and have been considered keystone taxa for the understanding of chelicerate evolution. However, the mitochondrial genome of this order is available from only a single species, Limulus polyphemus. In the present study, we analyzed the complete mitochondrial genomes of two Asian horseshoe crabs, Carcinoscorpius rotundicauda and Tachypleus tridentatus, to offer novel data on the evolutionary relationships within Xiphosura and their position in the chelicerate phylogeny. The mitochondrial genomes of C. rotundicauda (15,033 bp) and T. tridentatus (15,006 bp) encode 13 protein-coding genes, two ribosomal RNA (rRNA) genes, and 22 transfer RNA (tRNA) genes. The overall sequences and genome structures of the two Asian species were highly similar to those of Limulus polyphemus, though clear differences among the three were found in the stem-loop structure of the putative control region. In the phylogenetic analysis with complete mitochondrial genomes of 43 chelicerate species, C. rotundicauda and T. tridentatus were recovered as a monophyletic group, while L. polyphemus alone formed an independent clade. Xiphosuran species were placed at the base of the tree, and the other major chelicerate taxa clustered in a single monophyletic group, clearly confirming that horseshoe crabs constitute an ancestral taxon among chelicerates. By contrast, the phylogenetic tree built without the Asian horseshoe crabs did not support monophyletic clustering of the other chelicerates. In conclusion, our analyses may provide a more robust and reliable perspective on the evolutionary history of chelicerates than earlier analyses with a single Atlantic species.

10.
With the development of genome sequencing, more whole genomes of microorganisms have been completed, and many methods have been introduced to reconstruct the phylogenetic tree of those microorganisms from information extracted from the whole genomes, through various ways of transforming or mapping the whole-genome sequences into other forms that describe evolutionary distance in a new way. We think it is possible that there exists information buried in the whole genome and transferred along lineages, which remains stable and is more essential than the sequence conservation of individual genes or the arrangement of a selected set of genes. We need to find a measurement that can involve as many phylogenetic features as possible beyond the genome sequence itself. We converted each genome sequence of the microorganisms into another linear sequence representing the functional structure of the sequence, used a new information function to calculate the discrepancy between sequences and obtain a distance matrix of the genomes, and built a phylogenetic tree with the neighbor-joining method. The resulting tree shows that the major lineages are consistent with the result based on their 16S rRNA sequences. Our method discovered a phylogenetic feature, derived from the genome sequences and the encoded genes, that can rebuild the phylogenetic tree correctly. The mapping of a genome sequence to a new form representing the relative positions of the functional genes provides a new way to measure phylogenetic relationships, and with a more specific classification of gene functions the result could be more sensitive.

11.
Mitochondrial (mt) genes and genomes are among the major sources of data for evolutionary studies in birds. This places mitogenomic studies in birds at the core of intense debates in avian evolutionary biology. Indeed, complete mt genomes are actively being used to unveil the phylogenetic relationships among major orders, whereas single genes (e.g., cytochrome c oxidase I [COX1]) are considered standard for species identification and defining species boundaries (DNA barcoding). In this investigation, we study the time of origin and evolutionary relationships among Neoaves orders using complete mt genomes. First, we were able to resolve polytomies previously observed at the deep nodes of the Neoaves phylogeny by analyzing 80 mt genomes, including 17 new sequences reported in this investigation. As an example, we found evidence indicating that columbiforms and charadriforms are sister groups. Overall, our analyses indicate that, with improved taxonomic sampling, complete mt genomes can resolve the evolutionary relationships among major bird groups. Second, we used our phylogenetic hypotheses to estimate the time of origin of major avian orders as a way to test whether their diversification took place prior to the Cretaceous/Tertiary (K/T) boundary. Such timetrees were estimated using several molecular dating approaches and conservative calibration points. Whereas we found time estimates slightly younger than those reported by others, most of the major orders originated prior to the K/T boundary. Finally, we used our timetrees to estimate the rate of evolution of each mt gene. We found great variation in mutation rates among mt genes and within different bird groups. COX1 was the gene with the least variation among Neoaves orders and the least rate heterogeneity across lineages. Such findings support the choice of COX1 among mt genes as the target for developing DNA barcoding approaches in birds.

12.
Mimivirus is a nucleocytoplasmic large DNA virus (NCLDV) with a genome size (1.2 Mb) and coding capacity (about 1000 genes) comparable to those of some cellular organisms. Unlike other viruses, Mimivirus and its NCLDV relatives encode homologs of broadly conserved informational genes found in Bacteria, Archaea, and Eukaryotes, raising the possibility that they could be placed on the tree of life. A recent phylogenetic analysis of these genes showed the NCLDVs emerging as a monophyletic group branching between Eukaryotes and Archaea. These trees were interpreted as evidence for an independent "fourth domain" of life that may have contributed DNA processing genes to the ancestral eukaryote. However, the analysis of ancient evolutionary events is challenging, and tree reconstruction is susceptible to bias resulting from non-phylogenetic signals in the data. These include compositional heterogeneity and homoplasy, which can lead to the spurious grouping of compositionally similar or fast-evolving sequences. Here, we show that these informational gene alignments contain both significant compositional heterogeneity and homoplasy, which were not adequately modelled in the original analysis. When we use more realistic evolutionary models that better fit the data, the resulting trees are unable to reject a simple null hypothesis in which these informational genes, like many other NCLDV genes, were acquired by horizontal transfer from eukaryotic hosts. Our results suggest that a fourth domain is not required to explain the available sequence data.

13.
Complete bacterial and archaeal genome sequences have been obtained from a wide range of evolutionary lines, which allows some general conclusions about the phylogenetic distribution and evolution of bioenergetic pathways to be drawn. In particular, I searched the complete genomes for key enzymes involved in aerobic and anaerobic respiratory pathways and in photosynthesis, and mapped them onto an rRNA tree of sequenced species. The phylogenetic distribution of these enzymes is very irregular, and clearly shows the diverse strategies of energy conservation used by prokaryotes. In addition, a thorough phylogenetic analysis of other widely distributed bioenergetic protein families reveals a complex evolutionary history for the respective genes. A parsimonious explanation for these complex phylogenetic patterns and for the irregular distribution of metabolic pathways is that the last common ancestor of Bacteria and Archaea contained several members of every gene family as a consequence of previous gene or genome duplications, while different patterns of gene loss occurred during the evolution of every gene family. This would imply that the last universal ancestor was a bioenergetically sophisticated organism. Finally, important steps that occurred during the evolution of energetic machineries, such as the early evolution of aerobic respiration and the acquisition of eukaryotic mitochondria from a proteobacterial ancestor, are supported by the analysis of the complete genome sequences.

14.
Li X, Ogoh K, Ohba N, Liang X, Ohmiya Y. Gene, 2007, 392(1-2): 196-205
We determined the mitochondrial DNA (mtDNA) sequences of two luminous beetles (Arthropoda, Insecta, Coleoptera), Rhagophthalmus lufengensis from Yunnan, China and Rhagophthalmus ohbai from Yaeyama Island, Japan. We identified all 37 mtDNA genes of R. lufengensis (15,982 bp) and 34 genes of R. ohbai (15,704 bp). R. lufengensis and R. ohbai genomes have higher A + T contents than other coleopteran genomes although the gene arrangements are similar. Interestingly, in a study of the evolutionary relationship among R. lufengensis, R. ohbai and the firefly Pyrocoelia rufa, the phylogenetic tree inferred from lrRNA genes from mitochondrial genomes indicates a biogeographic relationship among the bioluminescent insects in East Asia and the phylogenetic tree inferred from luciferase-related genes from nuclear genomes shows an appropriate relationship among coleopterans, reflecting the evolutionary origin of bioluminescence. Thus, the mtDNAs of luminescent beetles can provide an insight into their evolutionary origin and biogeography.

15.
Native grasslands are one of the most endangered ecosystems in North America. In this study, we examined the ecological and evolutionary roles of endangered and threatened (e/t) grasses by establishing robust evolutionary relationships with other nonthreatened native and introduced grass species of the community. We hypothesized that the e/t species of grasses in Illinois would be phylogenetically clustered because closely related species would be vulnerable to the same threats and have similar requirements for survival. This is the first study in which a phylogeny based on complete plastome DNA of Poaceae was analyzed by phylogenetic diversity analysis. To avoid disturbing e/t populations, DNA was extracted from herbarium specimens. Next-generation sequencing (NGS) techniques were used to sequence DNA of plastid genomes (plastomes). The resulting phylogenomic tree was analyzed by phylogenetic diversity metrics. The extracted DNA successfully produced complete plastomes, demonstrating that herbarium material is a practical source of DNA for genomic studies. The phylogenomic tree was strongly supported and defined Dichanthelium as a separate clade from Panicum. The phylogenetic metrics revealed phylogenetic clustering of e/t species, confirming our hypothesis.

16.
Over 3000 microbial (bacterial and archaeal) genomes have been made publicly available to date, providing an unprecedented opportunity to examine evolutionary genomic trends and offering valuable reference data for a variety of other studies such as metagenomics. The utility of these genome sequences is greatly enhanced when we have an understanding of how they are phylogenetically related to each other. Therefore, we here describe our efforts to reconstruct the phylogeny of all available bacterial and archaeal genomes. We identified 24 single-copy, ubiquitous genes suitable for this phylogenetic analysis. We used two approaches to combine the data for the 24 genes. First, we concatenated alignments of all genes into a single alignment from which a Maximum Likelihood (ML) tree was inferred using RAxML. Second, we used a relatively new approach to combining gene data, Bayesian Concordance Analysis (BCA), as implemented in the BUCKy software, in which the results of 24 single-gene phylogenetic analyses are used to generate a “primary concordance” tree. A comparison of the concatenated ML tree and the primary concordance (BUCKy) tree reveals that the two approaches give similar results, relative to a phylogenetic tree inferred from the 16S rRNA gene. After comparing the results and the methods used, we conclude that the current best approach for generating a single phylogenetic tree, suitable for use as a reference phylogeny for comparative analyses, is to perform a maximum likelihood analysis of a concatenated alignment of conserved, single-copy genes.
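
The concatenation step can be sketched as below. This only illustrates how per-gene alignments are joined into one supermatrix with partition coordinates; it does not call RAxML or BUCKy, and the function name, the gap-filling of missing taxa and the toy alignments are assumptions rather than details taken from the abstract.

```python
def concatenate_alignments(gene_alignments, missing="-"):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: list of dicts mapping taxon -> aligned sequence; all
    sequences within one gene must have the same length.
    Returns (dict taxon -> concatenated sequence,
             list of (gene index, first column, last column), 1-based).
    """
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for index, aln in enumerate(gene_alignments, 1):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {index}: sequences differ in length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon lacking this gene is padded with gap characters.
            pieces[taxon].append(aln.get(taxon, missing * width))
        partitions.append((index, start, start + width - 1))
        start += width
    return {taxon: "".join(parts) for taxon, parts in pieces.items()}, partitions


# Two toy gene alignments; taxon C lacks gene 1 and taxon B lacks gene 2.
genes = [
    {"A": "MK-L", "B": "MKTL"},
    {"A": "GG", "C": "GA"},
]
matrix, partitions = concatenate_alignments(genes)
for taxon, seq in matrix.items():
    print(taxon, seq)
print(partitions)
```

Keeping the partition coordinates lets a downstream maximum likelihood analysis apply a separate model to each gene if desired.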

17.
A phylogenomic approach to microbial evolution   Cited 21 times (self-citations: 2; citations by others: 19)
To study the origin and evolution of biochemical pathways in microorganisms, we have developed methods and software for automatic, large-scale reconstructions of phylogenetic relationships. We define the complete set of phylogenetic trees derived from the proteome of an organism as the phylome and introduce the term phylogenetic connection as a concept that describes the relative relationships between taxa in a tree. A query system has been incorporated into the software so as to allow searches for defined categories of trees within the phylome. As a complement, we have developed the pyphy system for visualising the results of complex queries on phylogenetic connections, genomic locations and functional assignments in a graphical format. Our phylogenomics approach, which links phylogenetic information to the flow of biochemical pathways within and among microbial species, has been used to examine more than 8000 phylogenetic trees from seven microbial genomes. The results have revealed a rich web of phylogenetic connections. However, the separation of Bacteria and Archaea into two separate domains remains robust.

18.
Zhang YJ, Ma PF, Li DZ. PLoS ONE, 2011, 6(5): e20596

Background

Bambusoideae is the only subfamily that contains woody members in the grass family, Poaceae. In phylogenetic analyses, Bambusoideae, Pooideae and Ehrhartoideae formed the BEP clade, yet the internal relationships of this clade are controversial. The distinctive life history (infrequent flowering and predominance of asexual reproduction) of woody bamboos makes them an interesting but taxonomically difficult group. Phylogenetic analyses based on large DNA fragments could only provide a moderate resolution of woody bamboo relationships, although a robust phylogenetic tree is needed to elucidate their evolutionary history. Phylogenomics is an alternative choice for resolving difficult phylogenies.

Methodology/Principal Findings

Here we present the complete nucleotide sequences of six woody bamboo chloroplast (cp) genomes using Illumina sequencing. These genomes are similar to those of other grasses and rather conservative in evolution. We constructed a phylogeny of Poaceae from 24 complete cp genomes including 21 grass species. Within the BEP clade, we found strong support for a sister relationship between Bambusoideae and Pooideae. In a substantial improvement over prior studies, all six nodes within Bambusoideae were supported with ≥0.95 posterior probability from Bayesian inference and 5/6 nodes resolved with 100% bootstrap support in maximum parsimony and maximum likelihood analyses. We found that repeats in the cp genome could provide phylogenetic information, while caution is needed when using indels in phylogenetic analyses based on few selected genes. We also identified relatively rapidly evolving cp genome regions that have the potential to be used for further phylogenetic study in Bambusoideae.

Conclusions/Significance

The cp genome of Bambusoideae evolved slowly, and phylogenomics based on whole cp genome could be used to resolve major relationships within the subfamily. The difficulty in resolving the diversification among three clades of temperate woody bamboos, even with complete cp genome sequences, suggests that these lineages may have diverged very rapidly.

19.
The completely sequenced genomes of chloroplasts have provided much information on the origin and evolution of this organelle. In this paper we attempt to use these sequences to test a novel approach for phylogenetic analysis of complete genomes based on correlation analysis of compositional vectors. All protein sequences from 21 complete chloroplast genomes are analyzed in comparison with selected archaea, eubacteria, and eukaryotes. The distance-based analysis shows that the chloroplast genomes are most closely related to cyanobacteria, consistent with the endosymbiotic origin of chloroplasts. The chloroplast genomes are separated into two major clades corresponding to chlorophytes (green plants) s.l. and rhodophytes (red algae) s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding of chloroplast evolution. For instance, the analysis places the chloroplasts of two chromophytes (Guillardia and Odontella) within the rhodophyte lineage, supporting secondary endosymbiosis as the source of these chloroplasts. The relationships among the green algae and land plants in our tree also agree with results from traditional phylogenetic analyses. Thus, this study establishes the value of our simple correlation analysis in elucidating the evolutionary relationships among genomes. It is hoped that this approach will provide insights into comparative genome analysis.
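
A simplified sketch of a compositional-vector comparison is given below. It is an illustration under assumptions, not the published procedure: the vectors hold raw K-string frequencies from a proteome with no background correction, the correlation is a plain cosine, and the mapping of correlation C to a distance (1 - C) / 2 is an assumed normalization. The function names and toy peptides are hypothetical.

```python
from collections import Counter
import math


def composition_vector(proteins, k=3):
    """Relative frequencies of overlapping K-strings over all proteins of a genome."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}


def correlation(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(s, 0.0) for s, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cv_distance(u, v):
    """Map correlation C to a distance (1 - C) / 2 (assumed normalization)."""
    return (1.0 - correlation(u, v)) / 2.0


# Toy "proteomes", each a list of short peptide strings.
genome_a = composition_vector(["MKTAYIAKQR", "MKTAYLAKQR"])
genome_b = composition_vector(["MKTAYIAKQK"])
genome_c = composition_vector(["GGSGGSGGSG"])
print(round(cv_distance(genome_a, genome_b), 3))
print(round(cv_distance(genome_a, genome_c), 3))
```

Because no alignment is involved, the same functions apply to genomes with very different gene content, which is the appeal of this family of methods for whole-genome comparison.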

20.
Phylogenetic classifications based on single genes such as rRNA genes do not provide a complete and accurate picture of evolution because they do not account for evolutionary leaps caused by gene transfer, duplication, deletion and functional replacement. Here, we present a whole-genome-scale phylogeny based on metabolic pathway reaction content. From the genome sequences of 42 microorganisms, we deduced the metabolic pathway reactions and used the relatedness of these contents to construct a phylogenetic tree that represents the similarity of metabolic profiles (relatedness) as well as the extent of metabolic pathway similarity (evolutionary distance). This method accounts for horizontal gene transfer and specific gene loss by comparison of whole metabolic subpathways, and allows evaluation of evolutionary relatedness and changes in metabolic pathways. Thus, a tree based on metabolic pathway content represents both the evolutionary time scale (changes in genetic content) and the evolutionary process (changes in metabolism).
