Similar articles
 Found 20 similar articles; search took 31 ms
1.
In silico genomic fingerprints were produced by virtual hybridization of 191 fully sequenced bacterial genomes using a set of 15,264 13-mer probes specially designed to produce universal whole genome fingerprints. A novel approach for constructing phylogenetic trees, based on comparative analysis of genomic fingerprints, was developed. The resultant bacterial phylogenetic tree had strong similarities to those produced from the alignment of conserved sequences. Notably, the trees derived from the alignment of other conserved COG genes divided the Bacillus and Corynebacterium genera into the same subgroups produced by the novel bacterial tree. A number of discrepancies between the two techniques were observed for the grouping of some Lactobacillus species. However, a detailed analysis of the alignment of these genomes using other bioinformatics tools revealed that the grouping of these organisms in the novel tree was more satisfactory than the groupings from previous classifications, which used only a few conserved genes. All these data suggest that the bacterial taxonomy produced by genomic fingerprints is satisfactory, but sometimes different from classical taxonomies. Discrepancies probably arise because the fingerprinting technique analyzes genomic sequences and reveals more information than previously used approaches.
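The fingerprinting idea in this abstract can be illustrated with a minimal sketch. It assumes exact substring matching stands in for virtual hybridization (the abstract does not state the matching rules, so mismatch tolerance and reverse-complement handling are left out), and the function names `fingerprint` and `fingerprint_distance` are hypothetical, not taken from the paper.

```python
def fingerprint(genome: str, probes: list[str]) -> list[int]:
    """Return 1 for each probe found verbatim in the genome, else 0.

    Assumption: exact matching replaces the hybridization model.
    """
    genome = genome.upper()
    return [1 if probe.upper() in genome else 0 for probe in probes]


def fingerprint_distance(a: list[int], b: list[int]) -> float:
    """Fraction of probes whose signal differs between two fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must use the same probe set")
    if not a:
        raise ValueError("fingerprints must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Toy 3-mer probes; the study used 15,264 13-mer probes.
probes = ["ACG", "GGT", "TTA"]
fp1 = fingerprint("ACGTTACG", probes)
fp2 = fingerprint("GGTACG", probes)
print(fp1, fp2, fingerprint_distance(fp1, fp2))
```

A pairwise matrix of such distances could then be passed to any distance-based tree builder, for example neighbor joining.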

2.
Genomic trees have been constructed based on the presence and absence of families of protein-encoding genes observed in 27 complete genomes, including genomes of 15 free-living organisms. This method does not rely on the identification of suspected orthologs in each genome, nor the specific alignment used to compare gene sequences because the protein-encoding gene families are formed by grouping any protein with a pairwise similarity score greater than a preset value. Because of this all-inclusive grouping, this method is resilient to some effects of lateral gene transfer because transfers of genes are masked when the recipient genome already has a homolog (not necessarily an ortholog) of the incoming gene. Of 71 genes suspected to have been laterally transferred to the genome of Aeropyrum pernix, only approximately 7 to 15 represent genes where a lateral gene transfer appears to have generated homoplasy in our character dataset. The genomic tree of the 15 free-living taxa includes six different bacterial orders, six different archaeal orders, and two different eukaryotic kingdoms. The results are remarkably similar to results obtained by analysis of rRNA. Inclusion of the other 12 genomes resulted in a tree only broadly similar to that suggested by rRNA with at least some of the differences due to artifacts caused by the small genome size of many of these species. Very small genomes, such as those of the two Mycoplasma genomes included, fall to the base of the Bacterial domain, a result expected due to the substantial gene loss inherent to these lineages. Finally, artificial "partial genomes" were generated by randomly selecting ORFs from the complete genomes in order to test our ability to recover the tree generated by the whole genome sequences when only partial data are available. The results indicated that partial genomic data, when sampled randomly, could robustly recover the tree generated by the whole genome sequences.
Received: 30 May 2001 / Accepted: 10 October 2001
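The presence/absence character data described in this abstract can be sketched as follows. The family identifiers, the function names, and the use of a Jaccard distance are illustrative assumptions; the abstract does not say how the characters were turned into a tree.

```python
def presence_absence_matrix(genomes: dict[str, set[str]]):
    """Build a 0/1 character row per genome over all observed gene families."""
    families = sorted(set().union(*genomes.values()))
    rows = {
        name: [int(family in present) for family in families]
        for name, present in genomes.items()
    }
    return families, rows


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """1 - |shared families| / |families in either genome| (assumed metric)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical genomes, each reduced to the set of gene families it contains.
genomes = {"A": {"f1", "f2"}, "B": {"f2", "f3"}}
families, rows = presence_absence_matrix(genomes)
print(families, rows, jaccard_distance(genomes["A"], genomes["B"]))
```

Because each family is scored only as present or absent, a transferred gene that joins a family already present in the recipient leaves the row unchanged, which is the masking effect the abstract describes.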

3.
The order Trichosporonales (Tremellomycotina, Basidiomycota) includes various species that have clinical, agricultural and biotechnological value. Thus, understanding why and how evolutionary diversification occurred within this order is extremely important. This study clarified the phylogenetic relationships among Trichosporonales species. To select genes suitable for phylogenetic analysis, we determined the draft genomes of 17 Trichosporonales species and extracted 30 protein-coding DNA sequences (CDSs) from genomic data. The CDS regions of Trichosporon asahii and T. faecale were identified by referring to mRNA sequence data since the intron positions of the respective genes differed from those of Cryptococcus neoformans (outgroup) and are not conserved within this order. A multiple alignment of the respective gene was first constructed using the CDSs of T. asahii, T. faecale and C. neoformans, and those of other species were added and aligned based on codons. The phylogenetic trees were constructed based on each gene and a concatenated alignment. Resolution of the maximum-likelihood trees estimated from the concatenated dataset based on both nucleotide (72,531) and amino acid (24,173) sequences was greater than in previous reports. In addition, we found that several genes, such as phosphatidylinositol 3-kinase TOR1 and glutamate synthase (NADH), had good resolution in this group (even when used alone). Our study proposes a set of genes suitable for constructing a phylogenetic tree with high resolution to examine evolutionary diversification in Trichosporonales. These can also be used for epidemiological and biogeographical studies, and may also serve as the basis for a comprehensive reclassification of pleomorphic fungi.

4.
Over 3000 microbial (bacterial and archaeal) genomes have been made publicly available to date, providing an unprecedented opportunity to examine evolutionary genomic trends and offering valuable reference data for a variety of other studies such as metagenomics. The utility of these genome sequences is greatly enhanced when we have an understanding of how they are phylogenetically related to each other. Therefore, we here describe our efforts to reconstruct the phylogeny of all available bacterial and archaeal genomes. We identified 24 single-copy, ubiquitous genes suitable for this phylogenetic analysis. We used two approaches to combine the data for the 24 genes. First, we concatenated alignments of all genes into a single alignment from which a Maximum Likelihood (ML) tree was inferred using RAxML. Second, we used a relatively new approach to combining gene data, Bayesian Concordance Analysis (BCA), as implemented in the BUCKy software, in which the results of 24 single-gene phylogenetic analyses are used to generate a “primary concordance” tree. A comparison of the concatenated ML tree and the primary concordance (BUCKy) tree reveals that the two approaches give similar results, relative to a phylogenetic tree inferred from the 16S rRNA gene. After comparing the results and the methods used, we conclude that the current best approach for generating a single phylogenetic tree, suitable for use as a reference phylogeny for comparative analyses, is to perform a maximum likelihood analysis of a concatenated alignment of conserved, single-copy genes.
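The concatenation step mentioned in this abstract can be sketched in a few lines. This is an illustration only, not the authors' pipeline: the function name `concatenate_alignments` is hypothetical, and padding a missing taxon with gap characters is an assumption (with ubiquitous single-copy genes, missing taxa should be rare).

```python
def concatenate_alignments(
    gene_alignments: list[dict[str, str]], gap: str = "-"
) -> dict[str, str]:
    """Join per-gene alignments (taxon -> aligned sequence) into one supermatrix."""
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    parts: dict[str, list[str]] = {taxon: [] for taxon in taxa}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            # Assumption: a taxon lacking this gene is padded with gaps.
            parts[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


gene1 = {"A": "AC-", "B": "ACG"}
gene2 = {"A": "TT"}
print(concatenate_alignments([gene1, gene2]))
```

The resulting supermatrix is the kind of input a maximum likelihood program such as RAxML would take, typically written out in PHYLIP or FASTA format first.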

5.
Genomics, 2019, 111(6): 1590-1603
Genomes are not random sequences because natural selection has injected information in biological sequences for billions of years. Inspired by this idea, we developed a simple method to compare genomes considering nucleotide counts in subsequences (blocks) instead of their exact sequences. We introduce the Block Alignment method for comparing two genomes and, based on this comparison method, define a similarity score and a distance. The presented model ignores nucleotide order in the sequence. On the other hand, in this block comparison method, due to exclusion of point mutations and small size variations, there is no need for high coverage sequencing, which is responsible for the high costs of data production and storage; moreover, the sequence comparisons could be performed with higher speed. Phylogenetic trees of two sets of bacterial genomes were constructed and the results were in full agreement with their already constructed phylogenetic trees. Furthermore, a weighted and directed similarity network of each set of bacterial genomes was inferred ab initio by this model. Remarkably, the communities of these networks are in agreement with the clades of the corresponding phylogenetic trees, which means these similarity networks also contain phylogenetic information about the genomes. Moreover, the block comparison method was used to distinguish rob(15;21)c-associated iAMP21 and sporadic iAMP21 rearrangements in subgroups of chromosome 21 in acute lymphoblastic leukemia. Our results show a meaningful difference between the number of contigs that mapped to chromosomes 15 and 21 in these cases. Furthermore, the presented block alignment model can select the candidate blocks to perform more accurate analysis and is capable of finding conserved blocks in a set of genomes.
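The core idea of comparing nucleotide counts per block rather than exact sequences can be shown with a small sketch. The fixed block size, the L1 distance between count vectors, and the names `block_counts` and `block_distance` are assumptions made for illustration; the abstract does not give the paper's actual score or distance definitions.

```python
from collections import Counter


def block_counts(seq: str, block_size: int) -> list[tuple[int, int, int, int]]:
    """Split a sequence into consecutive blocks and count A, C, G, T in each."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    seq = seq.upper()
    counts = []
    for start in range(0, len(seq), block_size):
        c = Counter(seq[start:start + block_size])
        counts.append((c["A"], c["C"], c["G"], c["T"]))
    return counts


def block_distance(a: tuple[int, ...], b: tuple[int, ...]) -> int:
    """Sum of absolute count differences between two blocks (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


x = block_counts("AACGTT", 3)
y = block_counts("ACAGTT", 3)
# "AAC" and "ACA" have identical counts, so nucleotide order is ignored.
print(x, y, block_distance(x[0], y[0]))
```

Ignoring order within a block is what makes the comparison cheap, at the cost of treating rearranged blocks as identical.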

6.
Species belonging to the phylum Synergistetes are poorly characterized. Though the known species display Gram-negative characteristics and the ability to ferment amino acids, no single characteristic is known which can define this group. For eight Synergistetes species, complete genome sequences or draft genomes have become available. We have used these genomes to construct detailed phylogenetic trees for the Synergistetes species and carried out comprehensive analysis to identify molecular markers consisting of conserved signature indels (CSIs) in protein sequences that are specific for either all Synergistetes or some of their sub-groups. We report here identification of 32 CSIs in widely distributed proteins such as RpoB, RpoC, UvrD, GyrA, PolA, PolC, MraW, NadD, PyrE, RpsA, RpsH, FtsA, RadA, etc., including a large >300 aa insert within the RpoC protein, that are present in various Synergistetes species, but except for isolated bacteria, these CSIs are not found in the protein homologues from any other organisms. These CSIs provide novel molecular markers that distinguish the species of the phylum Synergistetes from all other bacteria. The large numbers of other CSIs discovered in this work provide valuable information that supports and consolidates evolutionary relationships amongst the sequenced Synergistetes species. Of these CSIs, seven are specifically present in Jonquetella, Pyramidobacter and Dethiosulfovibrio species indicating a cladal relationship among them, which is also strongly supported by phylogenetic trees. A further 15 CSIs that are only present in Jonquetella and Pyramidobacter indicate a close association between these two species. Additionally, a previously described phylogenetic relationship between the Aminomonas and Thermanaerovibrio species was also supported by 9 CSIs. 
The strong relationships indicated by the indel analysis provide incentives for the grouping of species from these clades into higher taxonomic groups such as families or orders. The identified molecular markers, due to their specificity for Synergistetes and presence in highly conserved regions of important proteins, suggest novel targets for evolutionary, genetic and biochemical studies on these bacteria as well as for the identification of additional species belonging to this phylum in different environments.

7.
Molecular sequences provide a rich source of data for inferring the phylogenetic relationships among species. However, recent work indicates that even an accurate multiple alignment of a large sequence set may yield an incorrect phylogeny and that the quality of the phylogenetic tree improves when the input consists only of the highly conserved, motif regions of the alignment. This work introduces two methods of producing multiple alignments that include only the conserved regions of the initial alignment. The first method retains conserved motifs, whereas the second retains individual conserved sites in the initial alignment. Using parsimony analysis on a mitochondrial data set containing 19 species among which the phylogenetic relationships are widely accepted, both conserved alignment methods produce better phylogenetic trees than the complete alignment. Unlike any of the 19 inference methods used before to analyze these data, both methods produce trees that are completely consistent with the known phylogeny. The motif-based method employs far fewer alignment sites for comparable error rates. For a larger data set containing mitochondrial sequences from 39 species, the site-based method produces a phylogenetic tree that is largely consistent with known phylogenetic relationships and suggests several novel placements. J. Exp. Zool. (Mol. Dev. Evol.) 285:128-139, 1999.
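The site-based idea of keeping only conserved alignment columns can be sketched as below. The conservation criterion (fraction of sequences sharing the most common non-gap residue), the default threshold, and the function names are assumptions; the abstract does not specify how the authors score conservation.

```python
from collections import Counter


def conserved_sites(alignment: list[str], min_fraction: float = 1.0) -> list[int]:
    """Return indices of columns whose most common residue meets the threshold."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    keep = []
    for i in range(length):
        residue, count = Counter(seq[i] for seq in alignment).most_common(1)[0]
        # Assumption: a column dominated by gaps is never treated as conserved.
        if residue != "-" and count / len(alignment) >= min_fraction:
            keep.append(i)
    return keep


def filter_alignment(alignment: list[str], columns: list[int]) -> list[str]:
    """Keep only the selected columns of every sequence."""
    return ["".join(seq[i] for i in columns) for seq in alignment]


aln = ["ACGT", "ACTT", "A-GT"]
cols = conserved_sites(aln)
print(cols, filter_alignment(aln, cols))
```

Lowering `min_fraction` keeps more columns; the motif-based variant in the abstract would instead retain contiguous runs of conserved columns.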

8.
9.
Phenotypic behavior of a group of organisms can be studied using a range of molecular evolutionary tools that help to determine evolutionary relationships. Traditionally, a gene or a set of gene sequences was used for generating phylogenetic trees. Incomplete evolutionary information in a few selected genes causes problems in phylogenetic tree construction. Whole genomes are used as a remedy. Now, the task is to identify the suitable parameters to extract the hidden information from whole genome sequences that truly represent evolutionary information. In this study we explored a random anchor (a stretch of 100 nucleotides) based approach (ABWGP) for finding the distance between any two genomes, and used the distance estimates to compute evolutionary trees. A number of strains and species of Mycobacteria were used for this study. Anchor-derived parameters, such as cumulative normalized score, anchor order and indels, were computed in a pair-wise manner, and the scores were used to compute distance/phylogenetic trees. The strength of branching was determined by bootstrap analysis. The terminal branches are clearly discernible using the distance estimates described here. In general, different measures gave similar trees except the trees based on indels. Overall, the tree topology reflected the known biology of the organisms. This was also true for different strains of Escherichia coli. A new whole genome-based approach has been described here for studying evolutionary relationships among bacterial strains and species.

10.
Detailed phylogenetic and comparative genomic analyses are reported on 140 genome sequenced cyanobacteria with the main focus on the heterocyst-differentiating cyanobacteria. In a phylogenetic tree for cyanobacteria based upon concatenated sequences for 32 conserved proteins, the available cyanobacteria formed 8–9 strongly supported clades at the highest level, which may correspond to the higher taxonomic clades of this phylum. One of these clades contained all heterocystous cyanobacteria; within this clade, the members exhibiting either true (Nostocales) or false (Stigonematales) branching of filaments were intermixed, indicating that the division of the heterocyst-forming cyanobacteria into these two groups is not supported by phylogenetic considerations. However, in both the protein tree and the 16S rRNA gene tree, the akinete-forming heterocystous cyanobacteria formed a distinct clade. Within this clade, the members which differentiate into hormogonia or those which lack this ability were also separated into distinct groups. A novel molecular signature identified in this work that is uniquely shared by the akinete-forming heterocystous cyanobacteria provides further evidence that the members of this group are specifically related and that they shared a common ancestor exclusive of the other cyanobacteria. Detailed comparative analyses on protein sequences from the genomes of heterocystous cyanobacteria reported here have also identified eight conserved signature indels (CSIs) in proteins involved in a broad range of functions, and three conserved signature proteins, that are either uniquely or mainly found in all heterocyst-forming cyanobacteria, but generally not found in other cyanobacteria. These molecular markers provide novel means for the identification of heterocystous cyanobacteria, and they provide evidence of their monophyletic origin.
Additionally, this work has also identified seven CSIs in other proteins which, in addition to the heterocystous cyanobacteria, are uniquely shared by two smaller clades of cyanobacteria, which form the successive outgroups of the clade comprising the heterocystous cyanobacteria in the protein trees. Based upon their close relationship to the heterocystous cyanobacteria, the members of these clades are indicated to be the closest relatives of the heterocyst-forming cyanobacteria.

11.
12.
13.
Phototrophic microbial mat communities from 60 °C and 65 °C regions in the effluent channels of Mushroom and Octopus Springs (Yellowstone National Park, WY, USA) were investigated by shotgun metagenomic sequencing. Analyses of assembled metagenomic sequences resolved six dominant chlorophototrophic populations and permitted the discovery and characterization of undescribed but predominant community members and their physiological potential. Linkage of phylogenetic marker genes and functional genes showed novel chlorophototrophic bacteria belonging to uncharacterized lineages within the order Chlorobiales and within the Kingdom Chloroflexi. The latter is the first chlorophototrophic member of Kingdom Chloroflexi that lies outside the monophyletic group of chlorophototrophs of the Order Chloroflexales. Direct comparison of unassembled metagenomic sequences to genomes of representative isolates showed extensive genetic diversity, genomic rearrangements and novel physiological potential in native populations as compared with genomic references. Synechococcus spp. metagenomic sequences showed a high degree of synteny with the reference genomes of Synechococcus spp. strains A and B′, but synteny declined with decreasing sequence relatedness to these references. There was evidence of horizontal gene transfer among native populations, but the frequency of these events was inversely proportional to phylogenetic relatedness.

14.
15.
Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences (MHGs) where each MHG is a maximal set of maximum-length sequences which are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and its overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers cause incorrect inferences as to the overall relationship between bacterial taxa.

16.
Nucleotide sequence comparisons of three house-keeping genes, adenylate kinase (adk), shikimate dehydrogenase (aroE), and glucose-6-phosphate dehydrogenase (gdh), were used to infer the phylogeny of 33 gamma-proteobacteria. Phylogenetic trees inferred from each gene, and from the concatenated sequences of all three genes, are, in general, similar to a 16S rRNA gene-inferred tree. Similar groupings of bacteria are revealed at the family, genus, species and strain levels in all five trees. The house-keeping genes, however, show a higher rate of nucleotide sequence substitutions. Consequently, they can possibly probe deeper branches of a phylogenetic tree than the 16S rRNA gene. However, because their nucleotide sequences are not as highly conserved among gamma-proteobacteria, family- or genus-specific primers would need to be designed for the amplification of any of these three house-keeping genes. Since these genes are used in multilocus sequence typing, it is expected that the number of sequences publicly available for many taxa will increase over time, making them very useful either for complementing 16S rRNA-inferred phylogenies or for specific, targeted phylogenetic analysis.

17.
Comparisons of proteins show that they evolve through the movement of domains. However, in many cases, the underlying mechanisms remain unclear. Here, we observed the movements of DNA recognition domains between non-orthologous proteins within a prokaryote genome. Restriction–modification (RM) systems, consisting of a sequence-specific DNA methyltransferase and a restriction enzyme, contribute to maintenance/evolution of genomes/epigenomes. RM systems limit horizontal gene transfer but are themselves mobile. We compared Type III RM systems in Helicobacter pylori genomes and found that target recognition domain (TRD) sequences are mobile, moving between different orthologous groups that occupy unique chromosomal locations. Sequence comparisons suggested that a likely underlying mechanism is movement through homologous recombination of similar DNA sequences that encode amino acid sequence motifs that are conserved among Type III DNA methyltransferases. Consistent with this movement, incongruence was observed between the phylogenetic trees of TRD regions and other regions in proteins. Horizontal acquisition of diverse TRD sequences was suggested by detection of homologs in other Helicobacter species and distantly related bacterial species. One of these RM systems in H. pylori was inactivated by insertion of another RM system that likely transferred from an oral bacterium. TRD movement represents a novel route for diversification of DNA-interacting proteins.

18.
Understanding the early evolution of placental mammals is one of the most challenging issues in mammalian phylogeny. Here, we addressed this question by using the sequence data of the ENCODE consortium, which include 1% of mammalian genomes in 18 species belonging to all main mammalian lineages. Phylogenetic reconstructions based on an unprecedented amount of coding sequences taken from 218 genes resulted in a highly supported tree placing the root of Placentalia between Afrotheria and Exafroplacentalia (Afrotheria hypothesis). This topology was validated by the phylogenetic analysis of a new class of genomic phylogenetic markers, the conserved noncoding sequences. Applying the tests of alternative topologies on the coding sequence dataset resulted in the rejection of the Atlantogenata hypothesis (Xenarthra grouping with Afrotheria), while this test rejected the second alternative scenario, the Epitheria hypothesis (Xenarthra at the base), when using the noncoding sequence dataset. Thus, the two datasets support the Afrotheria hypothesis; however, none can reject both of the remaining topological alternatives.

19.

Background

Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda.

Results

The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic.

Conclusions

The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the resulting tree topologies suspect. As such, these data are likely inappropriate for investigating such ancient relationships.

20.
Reconstruction of the Tree of Life is a central goal in biology. Although numerous novel phyla of bacteria and archaea have recently been discovered, inconsistent phylogenetic relationships are routinely reported, and many inter-phylum and inter-domain evolutionary relationships remain unclear. Here, we benchmark different marker genes often used in constructing multidomain phylogenetic trees of bacteria and archaea and present a set of marker genes that perform best for multidomain trees constructed from concatenated alignments. We use recently developed Tree Certainty metrics to assess the confidence of our results and to obviate the complications of traditional bootstrap-based metrics. Given the vastly disparate number of genomes available for different phyla of bacteria and archaea, we also assessed the impact of taxon sampling on multidomain tree construction. Our results demonstrate that biases between the representation of different taxonomic groups can dramatically impact the topology of resulting trees. Inspection of our highest-quality tree supports the division of most bacteria into Terrabacteria and Gracilicutes, with Thermotogota and Synergistota branching earlier from these superphyla. This tree also supports the inclusion of the Patescibacteria within the Terrabacteria as a sister group to the Chloroflexota instead of as a basal-branching lineage. For the Archaea, our tree supports three monophyletic lineages (DPANN, Euryarchaeota, and TACK/Asgard), although we note the basal placement of the DPANN may still represent an artifact caused by biased sequence composition. Our findings provide a robust and standardized framework for multidomain phylogenetic reconstruction that can be used to evaluate inter-phylum relationships and assess uncertainty in conflicting topologies of the Tree of Life.

