Similar literature
 Found 20 similar articles (search time: 58 ms)
1.
The sequencing and analysis of multiple housekeeping genes has been routinely used to phylogenetically compare closely related bacterial isolates. Recent studies using whole-genome alignment (WGA) and phylogenetics from >100 Escherichia coli genomes have demonstrated that tree topologies from WGA and multilocus sequence typing (MLST) markers differ significantly. A nonrepresentative phylogeny can lead to incorrect conclusions regarding important evolutionary relationships. In this study, the Phylomark algorithm was developed to identify a minimal number of useful phylogenetic markers that recapitulate the WGA phylogeny. To test the algorithm, we used a set of diverse draft and complete E. coli genomes. The algorithm identified more than 100,000 potential markers of different fragment lengths (500 to 900 nucleotides). Three molecular markers were ultimately chosen to determine the phylogeny based on a low Robinson-Foulds (RF) distance compared to the WGA phylogeny. A phylogenetic analysis demonstrated that a more representative phylogeny was inferred for a concatenation of these markers compared to all other MLST schemes for E. coli. As a functional test of the algorithm, the three markers (genomic guided E. coli markers, or GIG-EM) were amplified and sequenced from a set of environmental E. coli strains (ECOR collection) and informatically extracted from a set of 78 diarrheagenic E. coli strains (DECA collection). In the instances of the 40-genome test set and the DECA collection, the GIG-EM system outperformed other E. coli MLST systems in terms of recapitulating the WGA phylogeny. This algorithm can be employed to determine the minimal marker set for any organism that has sufficient genome sequencing.
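
The abstract above ranks candidate markers by their Robinson-Foulds (RF) distance to the WGA tree. As a minimal sketch of what that distance measures, and not the Phylomark implementation itself, the Python below counts the non-trivial bipartitions found in one unrooted tree but not the other. The nested-tuple tree encoding and the function names `leaves`, `splits`, and `robinson_foulds` are assumptions made for illustration only.

```python
from itertools import chain


def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (splits) of a tree read as unrooted."""
    all_leaves = leaves(tree)
    anchor = min(all_leaves)  # fixed reference taxon so every split has one orientation
    found = set()

    def walk(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        # Always record the side of the split that does NOT contain the anchor taxon.
        side = clade if anchor not in clade else all_leaves - clade
        # Skip trivial splits (empty, a single leaf, or all leaves but one).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        for child in node:
            walk(child)

    walk(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("both trees must contain the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical four-taxon trees, not data from the study.
reference_tree = (("A", "B"), ("C", "D"))
marker_tree = (("A", "C"), ("B", "D"))
print(robinson_foulds(reference_tree, marker_tree))
```

A distance of 0 means the two topologies agree on every internal branch, so a lower RF distance to the WGA tree indicates a marker set that better recapitulates it.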

2.
Escherichia coli O104:H4 was associated with a severe foodborne disease outbreak originating in Germany in May 2011. More than 4000 illnesses and 50 deaths were reported. The outbreak strain was a typical enteroaggregative E. coli (EAEC) that acquired an antibiotic resistance plasmid and a Shiga-toxin 2 (Stx2)-encoding bacteriophage. Based on whole-genome phylogenies, the O104:H4 strain was most closely related to other EAEC strains; however, Stx2-bacteriophage are mobile, and do not necessarily share an evolutionary history with their bacterial host. In this study, we analyzed Stx2-bacteriophage from the E. coli O104:H4 outbreak isolates and compared them to all available Stx2-bacteriophage sequences. We also compared Stx2 production by an E. coli O104:H4 outbreak-associated isolate (ON-2011) to that of E. coli O157:H7 strains EDL933 and Sakai. Among the E. coli Stx2-phage sequences studied, that from O111:H- strain JB1-95 was most closely related phylogenetically to the Stx2-phage from the O104:H4 outbreak isolates. The phylogeny of most other Stx2-phage was largely concordant with their bacterial host genomes. Finally, O104:H4 strain ON-2011 produced less Stx2 than E. coli O157:H7 strains EDL933 and Sakai in culture; however, when mitomycin C was added, ON-2011 produced significantly more toxin than the E. coli O157:H7 strains. The Stx2-phage from the E. coli O104:H4 outbreak strain and the Stx2-phage from O111:H- strain JB1-95 likely share a common ancestor. Incongruence between the phylogenies of the Stx2-phage and their host genomes suggests recent Stx2-phage acquisition by E. coli O104:H4. The increase in Stx2 production by ON-2011 following mitomycin C treatment may or may not be related to the high rates of hemolytic uremic syndrome associated with the German outbreak strain. Further studies are required to determine whether the elevated Stx2 production levels are due to bacteriophage or E. coli O104:H4 host-related factors.

3.
The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, and MLSA, and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a small number of strains was enough to explain most of the CDS diversity in the core and strain-specific genomic fractions. Finally, the identification and analysis of the group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as PGPR.

4.
MOTIVATION: Phylogenomics integrates the vast amount of phylogenetic information contained in complete genome sequences, and is rapidly becoming the standard for reliably inferring species phylogenies. There are, however, fundamental differences between the ways in which phylogenomic approaches like gene content, superalignment, superdistance and supertree integrate the phylogenetic information from separate orthologous groups. Furthermore, they all depend on the method by which the orthologous groups are initially determined. Here, we systematically compare these four phylogenomic approaches, in parallel with three approaches for large-scale orthology determination: pairwise orthology, cluster orthology and tree-based orthology. RESULTS: Including various phylogenetic methods, we apply a total of 54 fully automated phylogenomic procedures to the fungi, the eukaryotic clade with the largest number of sequenced genomes, for which we retrieved a gold standard phylogeny from the literature. Phylogenomic trees based on gene content show, relative to the other methods, a bias in the tree topology that parallels convergence in lifestyle among the species compared, indicating convergence in gene content. CONCLUSIONS: Complete genomes are no guarantee of good or even consistent phylogenies. However, the large amounts of data in genomes enable us to carefully select the data most suitable for phylogenomic inference. In terms of performance, the superalignment approach, combined with restrictive orthology, is the most successful in recovering a fungal phylogeny that agrees with current taxonomic views, and allows us to obtain a high-resolution phylogeny. We provide solid support for what has grown to be a common practice in phylogenomics during its advance in recent years. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

5.
Escherichia coli exhibits a wide range of lifestyles encompassing commensalism and various pathogenic behaviors, which its highly dynamic genome helps to develop. How environmental and host factors shape the genetic structure of E. coli strains remains, however, largely unknown. Following a previous study of E. coli genomic diversity, we investigated its diversity at the metabolic level by building and analyzing the genome-scale metabolic networks of 29 E. coli strains (8 commensal and 21 pathogenic strains, including 6 Shigella strains). Using a tailor-made reconstruction strategy, we significantly improved the completeness and accuracy of the metabolic networks over default automatic reconstruction processes. Among the 1,545 reactions forming E. coli panmetabolism, 885 reactions were common to all strains. This high proportion of core reactions (57%) was found to be in sharp contrast to the low proportion (13%) of core genes in the E. coli pangenome, suggesting less diversity of metabolic functions compared to that of all gene functions. Core reactions were significantly overrepresented among biosynthetic reactions compared to the more variable degradation processes. Differences between metabolic networks were found to follow E. coli phylogeny rather than pathogenic phenotypes, except for Shigella networks, which were significantly more distant from the others. This suggests that most metabolic changes in non-Shigella strains were not driven by their pathogenic phenotypes. Using a supervised method, we were nevertheless able to identify small sets of reactions related to pathogenicity or commensalism. The quality of our reconstructed networks also makes them reliable bases for building metabolic models.
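
The 57% figure in the abstract above follows directly from the two counts it reports: 885 core reactions out of 1,545 in the panmetabolism. The short Python sketch below reproduces that arithmetic and shows how a core set (shared by all strains) and a pan set (found in any strain) could be derived from per-strain reaction lists. The strain names and reaction identifiers in the toy example are hypothetical, and the functions `core_and_pan` and `core_fraction` are illustrative names, not part of the study's pipeline.

```python
def core_and_pan(strain_reactions):
    """Return (core, pan): reactions shared by all strains, and reactions seen in any strain."""
    reaction_sets = [set(reactions) for reactions in strain_reactions.values()]
    core = set.intersection(*reaction_sets)
    pan = set.union(*reaction_sets)
    return core, pan


def core_fraction(core_size, pan_size):
    """Proportion of the pan set that belongs to the core set."""
    return core_size / pan_size


# Counts reported in the abstract: 885 core reactions among 1,545 in total.
print(f"{core_fraction(885, 1545):.1%}")

# Hypothetical toy data, for illustration only.
toy = {
    "strain_1": ["R1", "R2", "R3"],
    "strain_2": ["R1", "R2", "R4"],
    "strain_3": ["R1", "R2"],
}
core, pan = core_and_pan(toy)
print(sorted(core), sorted(pan))
```

The same ratio applied to gene presence instead of reaction presence would give the 13% core-gene proportion that the abstract contrasts it with; the underlying gene counts are not given there, so they are not reproduced here.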

6.
Species evolutionary relationships have traditionally been defined by sequence similarities of phylogenetic marker molecules, recently followed by whole-genome phylogenies based on gene order, average ortholog similarity or gene content. Here, we introduce genome conservation, a novel metric of evolutionary distances between species that simultaneously takes into account both gene content and sequence similarity at the whole-genome level. Genome conservation represents a robust distance measure, as demonstrated by accurate phylogenetic reconstructions. The genome conservation matrix for all presently sequenced organisms exhibits a remarkable ability to define evolutionary relationships across all taxonomic ranges. An assessment of taxonomic ranks with genome conservation shows that certain ranks are inadequately described and raises the possibility of a more precise and quantitative taxonomy in the future. All phylogenetic reconstructions are available at the genome phylogeny server: .

7.
Genomes of prokaryotes differ significantly in size and DNA composition. Escherichia coli is considered a model organism to analyze the processes involved in bacterial genome evolution, as the species comprises numerous pathogenic and commensal variants. Pathogenic and nonpathogenic E. coli strains differ in the presence and absence of additional DNA elements contributing to specific virulence traits and also in the presence and absence of additional genetic information. To analyze the genetic diversity of pathogenic and commensal E. coli isolates, a whole-genome approach was applied. Using DNA arrays, the presence of all translatable open reading frames (ORFs) of nonpathogenic E. coli K-12 strain MG1655 was investigated in 26 E. coli isolates, including various extraintestinal and intestinal pathogenic E. coli isolates, 3 pathogenicity island deletion mutants, and commensal and laboratory strains. Additionally, the presence of virulence-associated genes of E. coli was determined using a DNA "pathoarray" developed in our laboratory. The frequency and distributional pattern of genomic variations vary widely in different E. coli strains. Up to 10% of the E. coli K-12-specific ORFs were not detectable in the genomes of the different strains. DNA sequences described for extraintestinal or intestinal pathogenic E. coli are more frequently detectable in isolates of the same origin than in other pathotypes. Several genes coding for virulence or fitness factors are also present in commensal E. coli isolates. Based on these results, the conserved E. coli core genome is estimated to consist of at least 3,100 translatable ORFs. The absence of K-12-specific ORFs was detectable in all chromosomal regions. These data demonstrate the great genome heterogeneity and genetic diversity among E. coli strains and underline the fact that both the acquisition and deletion of DNA elements are important processes involved in the evolution of prokaryotes.

8.
Bacteria exchange genetic material by horizontal gene transfer (HGT). To evaluate the impact of HGT on Escherichia coli genome plasticity, 19 commensal strains collected from the intestinal floras of humans and animals were analyzed by microarrays. Strains were hybridized against an oligoarray containing 2700 E. coli K12 chromosomal genes. A core (genes shared among compared genomes) and a flexible gene pool (genes unique to each genome) have been identified. Analysis of hybridization signals revealed 1015 divergent genes among the 19 strains, and each strain showed a specific genomic variability pattern. Four hundred and fifty-eight genes were characterized by higher rates of interstrain variation and were considered hyperdivergent. These genes are not randomly distributed along the chromosome but are clustered in precise regions. Hyperdivergent genes belong to the flexible gene pool and show a specific GC content, differing from that of the chromosome, indicating acquisition by HGT. Among these genes, those involved in defense mechanisms and cell motility as well as intracellular trafficking and secretion were far more highly represented than others. The observed genome plasticity contributes to the maintenance of genetic diversity and may therefore be a source of evolutionary adaptation and survival.

9.
Escherichia coli, including the closely related genus Shigella, is a highly diverse species in terms of genome structure. Comparative genomic hybridization (CGH) microarray analysis was used to compare the gene content of E. coli K-12 with the gene contents of pathogenic strains. Missing genes in a pathogen were detected on a microarray slide spotted with 4,071 open reading frames (ORFs) of W3110, a commonly used wild-type K-12 strain. For 22 strains subjected to the CGH microarray analyses, 1,424 ORFs were found to be absent in at least one strain. The common backbone of the E. coli genome was estimated to contain about 2,800 ORFs. The mosaic distribution of absent regions indicated that the genomes of pathogenic strains were highly diversified because of insertions and deletions. Prophages, cell envelope genes, transporter genes, and regulator genes in the K-12 genome often were not present in pathogens. The gene contents of the strains tested were treated as a matrix for a neighbor-joining analysis. The phylogenic tree obtained was consistent with the results of previous studies. However, unique relationships between enteroinvasive strains and Shigella, uropathogenic, and some enteropathogenic strains were suggested by the results of this study. The data demonstrated that the CGH microarray technique is useful not only for genomic comparisons but also for phylogenic analysis of E. coli at the strain level.
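
The abstract above describes feeding a gene-content matrix into a neighbor-joining analysis but does not say which distance measure was used. The Python sketch below shows one plausible first step under a stated assumption: each strain is a presence/absence vector over the same ORFs, and the distance between two strains is the fraction of ORFs on which they disagree. The choice of this mismatch fraction, the function names, and the toy profiles are all assumptions for illustration, not the authors' method.

```python
def gene_content_distance(profile_a, profile_b):
    """Fraction of ORFs whose presence/absence call differs between two strains."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same ORFs in the same order")
    mismatches = sum(a != b for a, b in zip(profile_a, profile_b))
    return mismatches / len(profile_a)


def distance_matrix(profiles):
    """Return (names, matrix) with pairwise distances between all strains."""
    names = sorted(profiles)
    matrix = [
        [gene_content_distance(profiles[x], profiles[y]) for y in names]
        for x in names
    ]
    return names, matrix


# Hypothetical presence (1) / absence (0) calls for four ORFs in three strains.
toy_profiles = {
    "K-12": [1, 1, 1, 1],
    "pathogen_1": [1, 0, 1, 1],
    "pathogen_2": [0, 0, 1, 1],
}
names, matrix = distance_matrix(toy_profiles)
for name, row in zip(names, matrix):
    print(name, row)
```

A symmetric matrix of this kind is the usual input to a neighbor-joining routine, which then builds the tree by repeatedly joining the closest pair of strains.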

10.
We describe the design and evaluate the use of a high-density oligonucleotide microarray covering seven sequenced Escherichia coli genomes in addition to several sequenced E. coli plasmids, bacteriophages, pathogenicity islands, and virulence genes. Its utility is demonstrated for comparative genomic profiling of two unsequenced strains, O175:H16 D1 and O157:H7 3538 (Deltastx(2)::cat), as well as two well-known control strains, K-12 W3110 and O157:H7 EDL933. By using fluorescently labeled genomic DNA to query the microarrays and subsequently analyze common virulence genes and phage elements and perform whole-genome comparisons, we observed that O175:H16 D1 is a K-12-like strain and confirmed that its phi3538 (Deltastx(2)::cat) phage element originated from the E. coli 3538 (Deltastx(2)::cat) strain, with which it shares a substantial proportion of phage elements. Moreover, a number of genes involved in DNA transfer and recombination were identified in both new strains, providing a likely explanation for their capability to transfer phi3538 (Deltastx(2)::cat) between them. Analyses of control samples demonstrated that results using our custom-designed microarray were representative of the true biology, e.g., by confirming the presence of all known chromosomal phage elements as well as 98.8 and 97.7% of queried chromosomal genes for the two control strains. Finally, we demonstrate that use of spatial information, in terms of the physical chromosomal locations of probes, improves the analysis.

11.
This report describes the sequencing in the Escherichia coli B genome of 36 randomly chosen regions that are present in most or all of the fully sequenced E. coli genomes. The phylogenetic relationships among E. coli strains were examined, and evidence for horizontal gene transfer and variation in mutation rates was evaluated. The overall phylogenetic tree indicated that E. coli B and K-12 are the most closely related strains, with E. coli O157:H7 being more distantly related, Shigella flexneri 2a even more so, and E. coli CFT073 the most distant strain. Within the B, K-12, and O157:H7 clusters, several regions supported alternative topologies. While horizontal transfer may explain these phylogenetic incongruities, faster evolution at synonymous sites along the O157:H7 lineage was also identified. Further interpretation of these results is confounded by an association between genes showing more rapid evolution and results supporting horizontal transfer. Using genes supporting the B and K-12 clusters, an estimate of the genomic mutation rate from a long-term experiment with E. coli B, and an estimate of 200 generations per year, it was estimated that B and K-12 diverged several hundred thousand years ago, while O157:H7 split off from their common ancestor about 1.5-2 million years ago.

12.
This review covers the O antigens of the 46 serotypes of Shigella, but those of most Shigella flexneri are variants of one basic structure, leaving 34 distinct Shigella O antigens to review, together with their gene clusters. Several of the structures and gene clusters are reported for the first time, and this is the first such group for which structures and DNA sequences have been determined for all O antigens. Shigella strains are in effect Escherichia coli with a specific mode of pathogenicity, and 18 of the 34 O antigens are also found in traditional E. coli. Three are very similar to E. coli O antigens and 13 are unique to Shigella strains. The O antigen of Shigella sonnei is quite atypical for E. coli and is thought to have transferred from Plesiomonas. The other 12 O antigens unique to Shigella strains have structures that are typical of E. coli, but there are considerably more anomalies in their gene clusters, probably reflecting recent modification of the structures. Having the complete set of structures and genes opens the way for experimental studies on the role of this diversity in pathogenicity.

13.
mutS mutators accelerate the bacterial mutation rate 100- to 1,000-fold and relax the barriers that normally restrict homeologous recombination. These mutators thus afford the opportunity for horizontal exchange of DNA between disparate strains. While much is known regarding the mutS phenotype, the evolutionary structure of the mutS(+) gene in Escherichia coli remains unclear. The physical proximity of mutS to an adjacent polymorphic region of the chromosome suggests that this gene itself may be subject to horizontal transfer and recombination events. To test this notion, a phylogenetic approach was employed that compared gene phylogeny to strain phylogeny, making it possible to identify E. coli strains in which mutS alleles have recombined. Comparison of mutS phylogeny against predicted E. coli "whole-chromosome" phylogenies (derived from multilocus enzyme electrophoresis and mdh sequences) revealed striking levels of phylogenetic discordance among mutS alleles and their respective strains. We interpret these incongruences as signatures of horizontal exchange among mutS alleles. Examination of additional sites surrounding mutS also revealed incongruous distributions compared to E. coli strain phylogeny. This suggests that other regional sequences are equally subject to horizontal transfer, supporting the hypothesis that the 61.5-min mutS-rpoS region is a recombinational hot spot within the E. coli chromosome. Furthermore, these data are consistent with a mechanism for stabilizing adaptive changes promoted by mutS mutators through rescue of defective mutS alleles with wild-type sequences.

14.
SHOT: a web server for the construction of genome phylogenies   Total citations: 23 (self-citations: 0, citations by others: 23)
With the increasing availability of genome sequences, new methods are being proposed that exploit information from complete genomes to classify species in a phylogeny. Here we present SHOT, a web server for the classification of genomes on the basis of shared gene content or the conservation of gene order that reflects the dominant phylogenetic signal in these genomic properties. In general, the genome trees are consistent with classical gene-based phylogenies, although some interesting exceptions indicate massive horizontal gene transfer. SHOT is a useful tool for analysing the tree of life from a genomic point of view. It is available at http://www.Bork.EMBL-Heidelberg.de/SHOT.

15.
Given the considerable promise whole-genome sequencing offers for phylogeny and classification, it is surprising that microbial systematics and genomics have not yet been reconciled. This might be due to the intrinsic difficulties in inferring reasonable phylogenies from genomic sequences, particularly in the light of the significant amount of lateral gene transfer in prokaryotic genomes. However, recent studies indicate that the species tree and the hierarchical classification based on it are still meaningful concepts, and that state-of-the-art phylogenetic inference methods are able to provide reliable estimates of the species tree to the benefit of taxonomy. Conversely, we suspect that the current lack of completely sequenced genomes for many of the major lineages of prokaryotes and for most type strains is a major obstacle in progress towards a genome-based classification of microorganisms. We conclude that phylogeny-driven microbial genome sequencing projects such as the Genomic Encyclopaedia of Archaea and Bacteria (GEBA) project are likely to rectify this situation.

16.
17.
18.
Clostridium difficile is the most frequent cause of nosocomial diarrhea worldwide, and recent reports suggested the emergence of a hypervirulent strain in North America and Europe. In this study, we applied comparative phylogenomics (whole-genome comparisons using DNA microarrays combined with Bayesian phylogenies) to model the phylogeny of C. difficile, including 75 diverse isolates comprising hypervirulent, toxin-variable, and animal strains. The analysis identified four distinct statistically supported clusters comprising a hypervirulent clade, a toxin A(-) B(+) clade, and two clades with human and animal isolates. Genetic differences among clades revealed several genetic islands relating to virulence and niche adaptation, including antibiotic resistance, motility, adhesion, and enteric metabolism. Only 19.7% of genes were shared by all strains, confirming that this enteric species readily undergoes genetic exchange. This study has provided insight into the possible origins of C. difficile and its evolution that may have implications for disease control strategies.

19.
DNA-DNA hybridization has been established as an important technology in bacterial species taxonomy and phylogenetic analysis. In this study, we analyzed how the efficiency with which the genomic DNA from one species hybridizes to the genomic DNA of another species (DNA-DNA hybridization) in microarray analysis relates to the similarity between two genomes. We found that the predicted DNA-DNA hybridization based on genome sequence similarity correlated well with the experimentally determined microarray hybridization. Between closely related strains, significant numbers of highly divergent genes (<55% identity) and/or the accumulation of mismatches between conserved genes lowered the DNA-DNA hybridization signal, and this reduced the hybridization signals to below 70% even for bacterial strains with over 97% 16S rRNA gene identity. In addition, our results suggest that a DNA-DNA hybridization signal intensity of over 40% indicates that two genomes share at least 30% conserved genes (>60% gene identity). This study may expand our knowledge of DNA-DNA hybridization based on genomic sequence similarity comparison and further provide insights for bacterial phylogeny analyses.

20.
Genome evolution in prokaryotes is assisted by integration of gene pools from phages and plasmids. Regions downstream of tRNAs and tmRNAs are considered hot spots for the integration of these gene pools or genomic islands. To date, genomic islands have been identified only at tRNA/tmRNA genes in the enterobacterial genomes. The present work reports 10 distinct small RNAs as potent integration sites for genomic islands. A known tool, tRNAcc 1.0, has been used to identify genomic islands associated with small RNAs c0362, oxyS, ryaA, rybB, rybD, ryeB, ryeE, rtT, sraE and tmRNA. The coordinates of 25 such small RNA-associated genomic islands in three E. coli (strains: CFT073, EDL933 and K12) and Shigella flexneri (strain: 301) genomes are presented. Moreover, cross-verification of the genomic sequences encoded within the identified genomic islands against a horizontal gene transfer database, GenBank annotation features and atypical sequence compositions supports our results. In addition, all 25 identified genomic integration sites exhibit genomic block rearrangements with respect to the associated small RNA. Similar to tRNAs/tmRNAs, the downstream regions of the small RNAs are found to be hot spots of integration.
