Similar literature
1.
Specific identification of microorganisms in the environment is important but challenging, especially at the species/strain level. Here, we have developed a novel k-mer-based approach to select strain/species-specific probes for microbial identification with diagnostic microarrays. Application of this approach to human microbiome genomes showed that multiple (≥10 probes per strain) strain-specific 50-mer oligonucleotide probes could be designed for 2,012 of 3,421 bacterial strains of the human microbiome, and species-specific probes could be designed for most of the other strains. The method can also be used to select strain/species-specific probes for sequenced genomes from any environment, such as soil and water.
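The abstract does not describe the selection algorithm itself, so the following is only a minimal Python sketch of the general idea of strain-specific k-mers: collect every k-mer per genome and keep those that occur in exactly one strain. The function names, the toy sequences, and k = 4 are illustrative assumptions (the paper designs 50-mer probes), and the sketch ignores reverse complements, mismatches, and probe thermodynamics.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-length substrings of seq (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def strain_specific_kmers(genomes, k):
    """Map each strain name to the k-mers that occur in that strain only."""
    owners = defaultdict(set)  # k-mer -> names of strains containing it
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)

    specific = {name: set() for name in genomes}
    for kmer, names in owners.items():
        if len(names) == 1:  # seen in exactly one strain
            specific[next(iter(names))].add(kmer)
    return specific


# Hypothetical toy genomes; real inputs would be full genome sequences.
toy_genomes = {
    "strain_A": "ACGTACGTTT",
    "strain_B": "ACGTACGAAA",
}
specific = strain_specific_kmers(toy_genomes, k=4)
```

In this toy case the shared prefix contributes no candidates, and only k-mers that overlap the differing 3' ends remain as strain-specific.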

2.
With the decreasing cost of next-generation sequencing, deep sequencing of clinical samples provides unique opportunities to understand host-associated microbial communities. Among the primary challenges of clinical metagenomic sequencing is the rapid filtering of human reads to survey for pathogens with high specificity and sensitivity. Metagenomes are inherently variable due to different microbes in the samples and their relative abundance, the size and architecture of genomes, and factors such as target DNA amounts in tissue samples (i.e. human DNA versus pathogen DNA concentration). This variation in metagenomes typically manifests in sequencing datasets as low pathogen abundance, a high number of host reads, and the presence of close relatives and complex microbial communities. In addition to these challenges posed by the composition of metagenomes, high numbers of reads generated from high-throughput deep sequencing pose immense computational challenges. Accurate identification of pathogens is confounded by individual reads mapping to multiple different reference genomes due to gene similarity in different taxa present in the community or close relatives in the reference database. Available global and local sequence aligners also vary in sensitivity, specificity, and speed of detection. The efficiency of detection of pathogens in clinical samples is largely dependent on the desired taxonomic resolution of the organisms. We have developed an efficient strategy that identifies “all against all” relationships between sequencing reads and reference genomes. Our approach allows for scaling to large reference databases and then genome reconstruction by aggregating global and local alignments, thus allowing genetic characterization of pathogens at higher taxonomic resolution. These results were consistent with strain level SNP genotyping and bacterial identification from laboratory culture.

3.
The human body consists of innumerable multifaceted environments that predispose colonization by a number of distinct microbial communities, which play fundamental roles in human health and disease. In addition to community surveys and shotgun metagenomes that seek to explore the composition and diversity of these microbiomes, there are significant efforts to sequence reference microbial genomes from many body sites of healthy adults. To illustrate the utility of reference genomes when studying more complex metagenomes, we present a reference-based analysis of sequence reads generated from 55 shotgun metagenomes, selected from 5 major body sites, including 16 sub-sites. Interestingly, between 13% and 92% (62.3% average) of these shotgun reads were aligned to a then-complete list of 2780 reference genomes, including 1583 references for the human microbiome. However, no reference genome was universally found in all body sites. For any given metagenome, the body site-specific reference genomes, derived from the same body site as the sample, accounted for an average of 58.8% of the mapped reads. While different body sites did differ in abundant genera, proximal or symmetrical body sites were found to be most similar to one another. The extent of variation observed, both between individuals sampled within the same microenvironment, or at the same site within the same individual over time, calls into question comparative studies across individuals even if sampled at the same body site. This study illustrates the high utility of reference genomes and the need for further site-specific reference microbial genome sequencing, even within the already well-sampled human microbiome.

4.
Comparative bacterial genomics shows that even different isolates of the same bacterial species can vary significantly in gene content. An effective means to survey differences across whole genomes would be highly advantageous for understanding this variation. Here we show that suppression subtractive hybridization (SSH) provides high, representative coverage of regions that differ between similar genomes. Using Helicobacter pylori strains 26695 and J99 as a model, SSH identified approximately 95% of the unique open reading frames in each strain, showing that the approach is effective. Furthermore, combining data from parallel SSH experiments using different restriction enzymes significantly increased coverage compared to using a single enzyme. These results suggest a powerful approach for assessing genome differences among closely related strains when one member of the group has been completely sequenced.

5.
Some Eubacterium and Roseburia species are among the most prevalent motile bacteria present in the intestinal microbiota of healthy adults. These flagellate species contribute “cell motility” category genes to the intestinal microbiome and flagellin proteins to the intestinal proteome. We reviewed and revised the annotation of motility genes in the genomes of six Eubacterium and Roseburia species that occur in the human intestinal microbiota and examined their respective locus organization by comparative genomics. Motility gene order was generally conserved across these loci. Five of these species harbored multiple genes for predicted flagellins. Flagellin proteins were isolated from R. inulinivorans strain A2-194 and from E. rectale strains A1-86 and M104/1. The amino-termini sequences of the R. inulinivorans and E. rectale A1-86 proteins were almost identical. These protein preparations stimulated secretion of interleukin-8 (IL-8) from human intestinal epithelial cell lines, suggesting that these flagellins were pro-inflammatory. Flagellins from the other four species were predicted to be pro-inflammatory on the basis of alignment to the consensus sequence of pro-inflammatory flagellins from the β- and γ-proteobacteria. Many fliC genes were deduced to be under the control of σ28. The relative abundance of the target Eubacterium and Roseburia species varied across shotgun metagenomes from 27 elderly individuals. Genes involved in the flagellum biogenesis pathways of these species were variably abundant in these metagenomes, suggesting that the current depth of coverage used for metagenomic sequencing (3.13–4.79 Gb total sequence in our study) insufficiently captures the functional diversity of genomes present at low (≤1%) relative abundance. E. rectale and R. inulinivorans thus appear to synthesize complex flagella composed of flagellin proteins that stimulate IL-8 production. A greater depth of sequencing, improved evenness of sequencing and improved metagenome assembly from short reads will be required to facilitate in silico analyses of complete complex biochemical pathways for low-abundance target species from shotgun metagenomes.

6.
The Human Microbiome Project (HMP) aims to characterize the microbial communities of 18 body sites from healthy individuals. To accomplish this, the HMP generated two types of shotgun data: reference shotgun sequences isolated from different anatomical sites on the human body and shotgun metagenomic sequences from the microbial communities of each site. The alignment strategy for characterizing these metagenomic communities using available reference sequences is important to the success of HMP data analysis. Six next-generation aligners were used to align a community of known composition against a database comprising reference organisms known to be present in that community. All aligners reported nearly complete genome coverage (>97%) for strains with over 6X depth of coverage; however, they differed in speed, memory requirements, and ease-of-use issues such as database size limitations and supported mapping strategies. The selected aligner was tested across a range of parameters to maximize sensitivity while maintaining a low false positive rate. We found that constraining alignment length had more impact on sensitivity than constraining similarity in all cases tested. However, when reference species were replaced with phylogenetic neighbors, similarity began to play a larger role in detection. We also show that choosing the top hit randomly when multiple, equally strong mappings are available increases overall sensitivity at the expense of taxonomic resolution. The results of this study identified a strategy that was used to map over 3 tera-bases of microbial sequence against a database of more than 5,000 reference genomes in just over a month.

7.
We describe here a new method for large-scale scanning of microbial genomes on a quantitative and qualitative basis. To achieve this aim we propose to create NotI passports: databases containing NotI tags. We demonstrated that these tags comprising 19 bp of sequence information could be successfully generated using DNA isolated from intestinal or fecal samples. Such NotI passports allow the discrimination between closely related bacterial species and even strains. This procedure for generating restriction site tagged sequences (RSTS) is called passporting and can be adapted to any other rare cutting restriction enzyme. A comparison of 1312 tags from available sequenced Escherichia coli genomes, generated with the NotI, PmeI and SbfI restriction enzymes, revealed only 219 tags that were not unique. None of these tags matched human or rodent sequences. Therefore the approach allows analysis of complex microbial mixtures, such as those in the human gut, and identification with high accuracy of a particular bacterial strain on a quantitative and qualitative basis.

8.
Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity in a way that maximizes discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.

9.
As most eukaryotic genomes are yet to be sequenced, the mechanisms underlying their contribution to different ecosystem processes remain untapped. Although approaches to recovering prokaryotic genomes have become common in genome biology, few studies have tackled the recovery of eukaryotic genomes from metagenomes. This study assessed the reconstruction of microbial eukaryotic genomes using 6000 metagenomes from terrestrial and some transition environments using the EukRep pipeline. Only 215 metagenomic libraries yielded eukaryotic bins. From a total of 447 eukaryotic bins recovered, 197 were classified at the phylum level. Streptophytes and fungi were the most represented clades with 83 and 73 bins, respectively. More than 78% of the obtained eukaryotic bins were recovered from samples whose biomes were classified as host-associated, aquatic, and anthropogenic terrestrial. However, only 93 bins were taxonomically assigned at the genus level and 17 bins at the species level. Completeness and contamination estimates were obtained for a total of 193 bins and averaged 44.64% (σ = 27.41%) and 3.97% (σ = 6.53%), respectively. Micromonas commoda was the most frequent taxon found while Saccharomyces cerevisiae presented the highest completeness, probably because more reference genomes are available. Current measures of completeness are based on the presence of single-copy genes. However, mapping of the contigs from the recovered eukaryotic bins to the chromosomes of the reference genomes showed many gaps, suggesting that completeness measures should also include chromosome coverage. Recovering eukaryotic genomes will benefit significantly from long-read sequencing, development of tools for dealing with repeat-rich genomes, and improved reference genome databases.

10.
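Lactobacilli (Lactobacillus) are probiotics and a current research focus. Studying lactobacilli in samples such as pickled vegetables requires a rapid detection method. A pair of Lactobacillus-specific primers was designed from the 16S rDNA sequences of 14 Lactobacillus species with completed whole-genome sequences. PCR showed that the primers amplified an 800 bp fragment from Lactobacillus and Leuconostoc but produced no amplification band from Staphylococcus epidermidis, Lactococcus lactis, or Bacillus subtilis, indicating a degree of Lactobacillus specificity. Combined with MRS semi-selective medium for lactobacilli and Gram staining, colony PCR allowed rapid and efficient detection of lactobacilli in Sichuan pickles (paocai). Sequencing the PCR-amplified fragments then allowed the lactobacilli to be identified to the species level. Fifteen Lactobacillus strains were detected in 16 Sichuan pickle samples; 14 were identified as Lactobacillus plantarum, and 1 requires further characterization before its species can be determined. The method can detect novel Lactobacillus species.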

11.

Background

The unclassified simian strain Treponema Fribourg-Blanc was isolated in 1966 from baboons (Papio cynocephalus) in West Africa. This strain was morphologically indistinguishable from T. pallidum ssp. pallidum or ssp. pertenue strains, and it was shown to cause human infections.

Methodology/Principal Findings

To precisely define genetic differences between Treponema Fribourg-Blanc (unclassified simian isolate, FB) and T. pallidum ssp. pertenue strains (TPE), a high quality sequence of the whole Fribourg-Blanc genome was determined with 454-pyrosequencing and Illumina sequencing platforms. Combined average coverage of both methods was greater than 500×. Restriction target sites (n = 1,773), identified in silico, of selected restriction enzymes within the Fribourg-Blanc genome were verified experimentally and no discrepancies were found. When compared to the other three sequenced TPE genomes (Samoa D, CDC-2, Gauthier), no major genome rearrangements were found. The Fribourg-Blanc genome clustered with other TPE strains (especially with the TPE CDC-2 strain), while T. pallidum ssp. pallidum strains clustered separately, as did the genome of T. paraluiscuniculi strain Cuniculi A. Within coding regions, 6 deletions, 5 insertions and 117 substitutions differentiated Fribourg-Blanc from other TPE genomes.

Conclusions/Significance

The Fribourg-Blanc genome showed genetic characteristics similar to those of other TPE strains. Therefore, we propose to rename the unclassified simian isolate to Treponema pallidum ssp. pertenue strain Fribourg-Blanc. Since the Fribourg-Blanc strain was shown to cause experimental infection in human hosts, non-human primates could serve as possible reservoirs of TPE strains. This could considerably complicate recent efforts to eradicate yaws. Genetic differences specific to Fribourg-Blanc could then contribute to the identification of cases of animal-derived yaws infections.

12.
By comparing the SEED and Pfam functional profiles of metagenomes of two Brazilian coral species with 29 datasets that are publicly available, we were able to identify some functions, such as protein secretion systems, that are overrepresented in the metagenomes of corals and may play a role in the establishment and maintenance of bacteria-coral associations. However, only a small percentage of the reads of these metagenomes could be annotated by these reference databases, which may lead to a strong bias in the comparative studies. For this reason, we have searched for identical sequences (99% of nucleotide identity) among these metagenomes in order to perform a reference-independent comparative analysis, and we were able to identify groups of microbial communities that may be under similar selective pressures. The identification of sequences shared among the metagenomes was found to be even better for the identification of groups of communities with similar niche requirements than the traditional analysis of functional profiles. This approach is not only helpful for the investigation of similarities between microbial communities with high proportion of unknown reads, but also enables an indirect overview of gene exchange between communities.

13.
In the pelagic environment, iron is a scarce but essential micronutrient. The iron acquisition capabilities of selected marine bacteria have been investigated, but the recent proliferation of marine prokaryotic genomes and metagenomes offers a more comprehensive picture of microbial iron uptake pathways in the ocean. Searching these data sets, we were able to identify uptake mechanisms for Fe(3+), Fe(2+) and iron chelates (e.g. siderophore and haem iron complexes). Transport of iron chelates is accomplished by TonB-dependent transporters (TBDTs). After clustering the TBDTs from marine prokaryotic genomes, we identified TBDT clusters for the transport of hydroxamate and catecholate siderophore iron complexes and haem using gene neighbourhood analysis and co-clustering of TBDTs of known function. The genomes also contained two classes of siderophore biosynthesis genes: NRPS (non-ribosomal peptide synthase) genes and NIS (NRPS Independent Siderophore) genes. The most common iron transporters, in both the genomes and metagenomes, were Fe(3+) ABC transporters. Iron uptake-related TBDTs and siderophore biosynthesis genes were less common in pelagic marine metagenomes relative to the genomic data set, in part because Pelagibacter ubique and Prochlorococcus species, which almost entirely lacked these Fe uptake systems, dominate the metagenomes. Our results are largely consistent with current knowledge of iron speciation in the ocean, but suggest that in certain niches the ability to acquire siderophores and/or haem iron chelates is beneficial.

14.
15.
Environmental bioremediation relies heavily on the realized potential of efficient bioremediation agents or microbial strains of interest. Identifying suitable microbial agents for plant biomass waste valorization requires (i) high-quality genome assemblies to predict the full metabolic and functional potential and (ii) accurate mapping of lignocellulose-metabolizing enzymes. However, the fragmented nature of the sequenced genomes often limits the prediction ability due to breaks occurring in coding sequences. To address these challenges and as part of our ongoing agri-culturomics efforts, we have performed a hybrid genome assembly using Illumina and Nanopore reads with a modified assembly protocol, for a novel Streptomyces strain isolated from the rhizosphere niche of green leafy vegetables grown in a commercial urban farm. A high-quality genome of 8.6 Mb was assembled in just two contigs, with an N50 of 8,542,030 and coverage of 383X. This facilitated identification and complete arrangement of approximately 248 CAZymes and 38 biosynthetic gene clusters in the genome. Multiple gene clusters consisting of cellulases and hemicellulases associated with substrate recognition domains were identified in the genome. Genes for the degradation of lignin, chitin, and even some aromatic compounds were found in the Streptomyces sp. genome, which makes it a promising candidate for lignocellulosic waste valorization. Supplementary Information: The online version contains supplementary material available at 10.1007/s12088-021-00935-5.
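For readers unfamiliar with the N50 statistic quoted in this abstract, here is a minimal Python sketch of the standard definition: the length of the contig at which the running total, taken over contigs sorted from longest to shortest, first reaches half of the assembly size. The second contig length used in the example is a hypothetical value chosen only for illustration; the abstract reports just the N50 and the contig count.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:  # this contig crosses the halfway point
            return length
    raise ValueError("no contigs given")


# 8,542,030 is the N50 reported in the abstract; 60,000 is a hypothetical
# length for the second contig, used only to make the example runnable.
example_n50 = n50([8_542_030, 60_000])
```

With only two contigs and one of them spanning nearly the whole genome, the N50 is simply the length of the larger contig.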

16.
Marine phages have an astounding global abundance and ecological impact. However, little knowledge is derived from phage genomes, as most of the open reading frames in their small genomes are unknown, novel proteins. To infer potential functional and ecological relevance of sequenced marine Pseudoalteromonas phage H105/1, two strategies were used. First, similarity searches were extended to include six viral and bacterial metagenomes paired with their respective environmental contextual data. This approach revealed ‘ecogenomic’ patterns of Pseudoalteromonas phage H105/1, such as its estuarine origin. Second, intrinsic genome signatures (phylogenetic, codon adaptation and tetranucleotide (tetra) frequencies) were evaluated on a resolved intra-genomic level to shed light on the evolution of phage functional modules. On the basis of differential codon adaptation of Phage H105/1 proteins to the sequenced Pseudoalteromonas spp., regions of the phage genome with the most ‘host’-adapted proteins also have the strongest bacterial tetra signature, whereas the least ‘host’-adapted proteins have the strongest phage tetra signature. Such a pattern may reflect the evolutionary history of the respective phage proteins and functional modules. Finally, analysis of the structural proteome identified seven proteins that make up the mature virion, four of which were previously unknown. This integrated approach combines both novel and classical strategies and serves as a model to elucidate ecological inferences and evolutionary relationships from phage genomes that typically abound with unknown gene content.

17.
The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains was enough to explain most of the CDS diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genomes and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as PGPR.

18.
Amplified fragment length polymorphism (AFLP) analysis allows a rapid, relatively simple analysis of a large portion of a microbial genome, providing information about the species and its phylogenetic relationship to other microbes (Vos et al. 1995). The method simply surveys the genome for length and sequence polymorphisms. The AFLP pattern identified can be used for comparison to the genomes of other species. Unlike other methods, it does not rely on analysis of a single genetic locus that may bias the interpretation of results and does not require any prior knowledge of the targeted organism. Moreover, a standard set of reagents can be applied to any species without using species-specific information or molecular probes. We are using AFLP analysis to rapidly identify different bacterial species. A comparison of AFLP profiles generated from a large battery of Bacillus anthracis strains shows very little variability among different isolates (Keim et al. 1997). By contrast, there is a significant difference between AFLP profiles generated for any B. anthracis strain and even the most closely related Bacillus species. Sufficient variability is apparent among all known microbial species to allow phylogenetic analysis based on large numbers of genetically unlinked loci. These striking differences among AFLP profiles allow unambiguous identification of previously identified species and phylogenetic placement of newly characterized isolates relative to known species based on a large number of independent genetic loci. Data generated thus far show that the method provides phylogenetic analyses that are consistent with other widely accepted phylogenetic methods. However, AFLP analysis provides a more detailed analysis of the targets and samples a much larger portion of the genome. Consequently, it provides an inexpensive, rapid means of characterizing microbial isolates to further differentiate among strains and closely related microbial species. Such information cannot be rapidly generated by other means. AFLP sample analysis quickly generates a very large amount of molecular information about microbial genomes. However, this information cannot be analysed rapidly using manual methods. We are developing a large archive of electronic AFLP signatures that is being used to identify isolates collected from medical, veterinary, forensic and environmental samples. We are also developing the computational packages necessary to rapidly and unambiguously analyse the AFLP profiles and conduct a phylogenetic comparison of these data relative to information already in our database. We will use this archive and the associated algorithms to determine the species identity of previously uncharacterized isolates and place them phylogenetically relative to other microbes based on their AFLP signatures. This study provides significant new information about microbes with environmental, veterinary and medical significance. This information can be used in further studies to understand the relationships among these species and the factors that distinguish them from one another. It should also allow the identification of unique factors that contribute to important microbial traits, including pathogenicity and virulence. We are also using AFLP data to identify, isolate and sequence DNA fragments that are unique to particular microbial species and strains. The fragment patterns and sequence information provide insights into the complexity and organization of bacterial genomes relative to one another. They also provide the information necessary for the development of species-specific polymerase chain reaction primers that can be used to interrogate complex samples for the presence of B. anthracis, other microbial pathogens or their remnants.

19.
In sequenced genomes of prokaryotes, anomalous DNA (aDNA) can be recognized, among other features, by atypical clustering of dinucleotides. We hypothesized that atypical clustering of hexameric endonuclease recognition sites in aDNA allows the specific isolation of anomalous sequences in vitro. Clustering of endonuclease recognition sites in aDNA regions of eight published prokaryotic genome sequences was demonstrated. In silico digestion of the Neisseria meningitidis MC58 genome, using four selected endonucleases, revealed that of the 27 small fragments predicted (<5 kb), 21 were located in known genomic islands. Of the 24 calculated fragments (>300 bp and <5 kb), 22 met our criteria for aDNA, i.e. a high dinucleotide dissimilarity and/or aberrant GC content. The four enzymes also allowed the identification of aDNA fragments from the related Z2491 strain. Similarly, the sequenced genomes of three strains of Escherichia coli assessed by in silico digestion using XbaI yielded strain-specific sets of fragments of anomalous composition. In vitro applicability of the method was demonstrated by using adaptor-linked PCR, yielding the predicted fragments from the N. meningitidis MC58 genome. In conclusion, this strategy allows the selective isolation of aDNA from prokaryotic genomes by a simple restriction digest–amplification–cloning–sequencing scheme.
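The in silico digestion step can be illustrated with a short Python sketch. It cuts a sequence at every occurrence of a recognition site and then keeps fragments inside a size window, as the abstract does with its >300 bp and <5 kb criterion. The XbaI site (TCTAGA, cut after the first T) is standard; the function names, the toy sequence, and the tiny size window are assumptions for illustration, and the sketch treats the sequence as linear even though bacterial chromosomes are usually circular.

```python
def digest(seq, site, cut_offset):
    """Cut a linear sequence at every occurrence of site; return the fragments.

    cut_offset is the position of the cut within the site
    (1 for XbaI, which cuts T^CTAGA).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_filter(fragments, min_len, max_len):
    """Keep fragments whose length is strictly between min_len and max_len."""
    return [f for f in fragments if min_len < len(f) < max_len]


# Hypothetical toy sequence with two XbaI sites.
toy_seq = "AAAATCTAGAGGGGGGTCTAGACC"
fragments = digest(toy_seq, "TCTAGA", cut_offset=1)
selected = size_filter(fragments, min_len=6, max_len=20)
```

For a real genome the window would be `min_len=300, max_len=5000`, and the selected fragments would then be scored for dinucleotide dissimilarity and GC content, which this sketch does not attempt.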

20.
Determining the taxonomic affiliation of sequences assembled from metagenomes remains a major bottleneck that affects research across the fields of environmental, clinical and evolutionary microbiology. Here, we introduce MyTaxa, a homology-based bioinformatics framework to classify metagenomic and genomic sequences with unprecedented accuracy. The distinguishing aspect of MyTaxa is that it employs all genes present in an unknown sequence as classifiers, weighting each gene based on its (predetermined) classifying power at a given taxonomic level and frequency of horizontal gene transfer. MyTaxa also implements a novel classification scheme based on the genome-aggregate average amino acid identity concept to determine the degree of novelty of sequences representing uncharacterized taxa, i.e. whether they represent novel species, genera or phyla. Application of MyTaxa on in silico generated (mock) and real metagenomes of varied read length (100–2000 bp) revealed that it correctly classified at least 5% more sequences than any other tool. The analysis also showed that ∼10% of the assembled sequences from human gut metagenomes represent novel species with no sequenced representatives, several of which were highly abundant in situ such as members of the Prevotella genus. Thus, MyTaxa can find several important applications in microbial identification and diversity studies.

