Similar literature
 20 similar articles found
1.
Next-generation DNA sequencing has enabled a rapid expansion in the size of molecular fungal ecology studies employing the nuclear internal transcribed spacer (ITS) region. Many sequence-processing pipelines and protocols require sequence clustering to generate operational taxonomic units (OTUs) based on sequence similarity as a step to reduce total data quantity and complexity prior to taxonomic assignment. However, the consequences of ITS sequence clustering in regard to sample taxonomic coverage have not been carefully examined. Here we demonstrate that typically used clustering thresholds for fungal ITS sequences result in statistically significant losses in taxonomic coverage. Analyses using environmentally derived fungal sequences indicated an average of 3.1% of species went undetected (P < 0.05) if the sequences were denoised and clustered at a 97% threshold prior to taxonomic assignment. Additionally, an in silico analysis using a reference fungal ITS database suggested that approximately 25% of species went undetected if the sequences were clustered prior to taxonomic assignment. Finally, analysis of sequences derived from pure-cultured fungal isolates of known identity indicated sequence denoising and clustering were not critical in improving identification accuracy.

2.
Given the absence of universal marker genes in the viral kingdom, researchers typically use BLAST (with stringent E-values) for taxonomic classification of viral metagenomic sequences. Since the majority of metagenomic sequences originate from hitherto unknown viral groups, using stringent E-values results in most sequences remaining unclassified. Conversely, using less stringent E-values results in a high number of incorrect taxonomic assignments. The SOrt-ITEMS algorithm provides an approach to address these issues. Based on alignment parameters, SOrt-ITEMS follows an elaborate workflow for assigning reads originating from hitherto unknown archaeal/bacterial genomes. In SOrt-ITEMS, alignment parameter thresholds were generated by observing patterns of sequence divergence within and across various taxonomic groups belonging to the bacterial and archaeal kingdoms. However, many taxonomic groups within the viral kingdom lack a typical Linnaean-like taxonomic hierarchy. In this paper, we present ProViDE (Program for Viral Diversity Estimation), an algorithm that uses a customized set of alignment parameter thresholds specifically suited for viral metagenomic sequences. These thresholds capture the pattern of sequence divergence and the non-uniform taxonomic hierarchy observed within and across various taxonomic groups of the viral kingdom. Validation results indicate that the percentage of 'correct' assignments by ProViDE is around 1.7 to 3 times higher than that by the widely used similarity-based method MEGAN. The misclassification rate of ProViDE is around 3 to 19% (compared with 5 to 42% for MEGAN), indicating significantly better assignment accuracy. ProViDE software and a supplementary file (containing supplementary figures and tables referred to in this article) are available for download from http://metagenomics.atc.tcs.com/binning/ProViDE/

3.
Next-generation DNA sequencing (NGS) approaches are rapidly surpassing Sanger sequencing for characterizing the diversity of natural microbial communities. Despite this rapid transition, few comparisons exist between Sanger sequences and the generally much shorter reads of NGS. Operational taxonomic units (OTUs) derived from full-length (Sanger sequencing) and pyrotag (454 sequencing of the V9 hypervariable region) sequences of 18S rRNA genes from 10 global samples were analyzed in order to compare the resulting protistan community structures and species richness. Pyrotag OTUs called at 98% sequence similarity yielded numbers of OTUs that were similar overall to those for full-length sequences when the latter were called at 97% similarity. Singleton OTUs strongly influenced estimates of species richness but not the higher-level taxonomic composition of the community. The pyrotag and full-length sequence data sets had slightly different taxonomic compositions of rhizarians, stramenopiles, cryptophytes, and haptophytes, but the two data sets had similarly high compositions of alveolates. Pyrotag-based OTUs were often derived from sequences that mapped to multiple full-length OTUs at 100% similarity. Thus, pyrotags sequenced from a single hypervariable region might not be appropriate for establishing protistan species-level OTUs. However, nonmetric multidimensional scaling plots constructed with the two data sets yielded similar clusters, indicating that beta diversity analysis results were similar for the Sanger and NGS sequences. Short pyrotag sequences can provide holistic assessments of protistan communities, although care must be taken in interpreting the results. The longer reads (>500 bp) that are now becoming available through NGS should provide powerful tools for assessing the diversity of microbial eukaryotic assemblages.

4.
Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Metadata on country of collection, geographical coordinates, interacting taxon and isolation source were added so that they now cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.

5.
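[Background] For studying the diversity of ammonia-oxidizing archaea (AOA) in environmental samples, the amoA functional gene offers greater specificity and higher resolution as a molecular marker than the 16S rRNA gene, and it reflects the population structure and distribution of AOA in environmental samples more accurately. However, the analysis of high-throughput sequencing data from amoA gene amplicons currently faces two major limitations: first, there is no corresponding amoA gene reference database; second, the species-level similarity threshold for the AOA amoA gene is unknown, so there is no clear threshold for delineating species-level operational taxonomic units (OTUs) during analysis. [Objective] To establish a method for analyzing AOA diversity based on amoA functional gene sequences, and to provide a reference for high-throughput-sequencing-based diversity analysis of functional microorganisms. [Methods] An AOA amoA gene reference database was constructed from the 34 AOA strains obtained to date by isolation and purification or by enrichment culture, together with amoA gene sequences from environmental samples deposited in functional gene databases. The species-level similarity threshold for the amoA gene was determined by correlating pairwise amoA gene similarity between strains with pairwise 16S rRNA gene similarity. Using the MOTHUR software platform, the reference database and the threshold were applied to a diversity analysis of amoA gene sequences from a vertical water-column profile in the South China Sea. [Results] An archaeal amoA gene reference database containing 26,091 sequences was constructed, and 89% was determined as the threshold for delineating species-level OTUs of the archaeal amoA gene. The diversity analysis of the South China Sea water samples clearly showed the population structure and phylogenetic relationships of AOA in water layers at different depths, and effectively revealed differences in the vertical distribution of AOA in the South China Sea. [Conclusion] A method for analyzing AOA diversity based on high-throughput sequencing of the amoA gene was established; the method can effectively analyze AOA diversity in environmental samples.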

6.
Identification of North Sea molluscs with DNA barcoding
Sequence-based specimen identification, known as DNA barcoding, is a common method complementing traditional morphology-based taxonomic assignments. The fundamental resource in DNA barcoding is the availability of a taxonomically reliable sequence database to use as a reference for sequence comparisons. Here, we provide a reference library including 579 sequences of the mitochondrial cytochrome c oxidase subunit I for 113 North Sea mollusc species. We tested the efficacy of this library by simulating a sequence-based specimen identification scenario using Best Match, Best Close Match (BCM) and All Species Barcode (ASB) criteria with three different threshold values. Each identification result was compared with our prior morphology-based taxonomic assignments. Our simulation resulted in 87.7% congruent identifications (93.8% when excluding singletons). The highest number of congruent identifications was obtained with BCM and ASB and a 0.05 threshold. We also compared identifications with genetic clustering (Barcode Index Numbers, BINs) computed by the Barcode of Life Datasystem (BOLD). About 68% of our morphological identifications were congruent with BINs created by BOLD. Forty-nine sequences were clustered in 16 discordant BINs, and these were divided into two classes: sequences from different species clustered in a single BIN, and conspecific sequences split across several BINs. Whereas the former incongruences were probably caused by BOLD entries in need of a taxonomic update, the latter concerned taxa requiring further investigation. These include species with amphi-Atlantic distribution, whose genetic structure should be evaluated over their entire range to produce a reliable sequence-based identification system.

7.
Environmental DNA sequencing efforts of substrates such as soil, wood, and seawater have been found to present very different views of the underlying biological communities compared with efforts based on morphological examination and culture studies. The taxonomic affiliation of many of these environmental sequences cannot be settled with certainty due to the lack of proximate reference sequences in the corpus of public sequence data, and they are typically submitted to the international sequence databases without much indication of their relatedness. The scientific community has proved reluctant to include such unnamed sequences in phylogenetic analyses and taxonomic studies, but the present study shows such a position to be not only largely unwarranted but also potentially unsound. The sequences of 48 published fungal alignments of the nuclear ribosomal internal transcribed spacer region were subjected to similarity searches in the sequence databases to recover environmental sequences with a clear bearing on the respective ingroup. An average of 20 environmental sequences were added to each alignment, and upon rerunning the phylogenetic analyses of each study we found that topological rearrangements involving the original ingroup sequences were observed for no less than 29 (60%) of the studies. In nearly 20% of these cases, the rearrangements were large enough to question or even overthrow at least one conclusion presented in the original studies. The basal branching order was similarly subject to changes in 16% of the applicable studies. Environmental sequences are thus not only relevant in ecological research but form a requisite source of information also in systematics and taxonomy.
© The Willi Hennig Society 2010.

8.
Molecular identification of ectomycorrhizal mycelium in soil horizons
Molecular identification techniques based on total DNA extraction provide a unique tool for identification of mycelium in soil. Using molecular identification techniques, the ectomycorrhizal (EM) fungal community under coniferous vegetation was analyzed. Soil samples were taken at different depths from four horizons of a podzol profile. A basidiomycete-specific primer pair (ITS1F-ITS4B) was used to amplify fungal internal transcribed spacer (ITS) sequences from total DNA extracts of the soil horizons. Amplified basidiomycete DNA was cloned and sequenced, and a selection of the obtained clones was analyzed phylogenetically. Based on sequence similarity, the fungal clone sequences were sorted into 25 different fungal groups, or operational taxonomic units (OTUs). Out of 25 basidiomycete OTUs, 7 OTUs showed high nucleotide homology (≥99%) with known EM fungal sequences and 16 were found exclusively in the mineral soil. The taxonomic positions of six OTUs remained unclear. OTU sequences were compared to sequences from morphotyped EM root tips collected from the same sites. Of the 25 OTUs, 10 OTUs had ≥98% sequence similarity with these EM root tip sequences. The present study demonstrates the use of molecular techniques to identify EM hyphae in various soil types. This approach differs from the conventional method of EM root tip identification and provides a novel approach to examine EM fungal communities in soil.

9.
Molecular Identification of Ectomycorrhizal Mycelium in Soil Horizons
Molecular identification techniques based on total DNA extraction provide a unique tool for identification of mycelium in soil. Using molecular identification techniques, the ectomycorrhizal (EM) fungal community under coniferous vegetation was analyzed. Soil samples were taken at different depths from four horizons of a podzol profile. A basidiomycete-specific primer pair (ITS1F-ITS4B) was used to amplify fungal internal transcribed spacer (ITS) sequences from total DNA extracts of the soil horizons. Amplified basidiomycete DNA was cloned and sequenced, and a selection of the obtained clones was analyzed phylogenetically. Based on sequence similarity, the fungal clone sequences were sorted into 25 different fungal groups, or operational taxonomic units (OTUs). Out of 25 basidiomycete OTUs, 7 OTUs showed high nucleotide homology (≥99%) with known EM fungal sequences and 16 were found exclusively in the mineral soil. The taxonomic positions of six OTUs remained unclear. OTU sequences were compared to sequences from morphotyped EM root tips collected from the same sites. Of the 25 OTUs, 10 OTUs had ≥98% sequence similarity with these EM root tip sequences. The present study demonstrates the use of molecular techniques to identify EM hyphae in various soil types. This approach differs from the conventional method of EM root tip identification and provides a novel approach to examine EM fungal communities in soil.

10.
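The grasshopper classification systems in common use today were all established on the basis of morphology. Because morphological classification is highly subjective and different experts hold different views, these systems remain controversial. DNA sequences contain rich biological information and record the history of biological evolution. Studying grasshopper phylogeny from the biological information in DNA sequences can provide ample molecular evidence for establishing a sound grasshopper classification system. This paper reviews progress in the application of RAPD and DNA sequencing techniques to the molecular systematics of grasshoppers, and briefly describes the problems that both techniques face in molecular systematics research.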

11.
Identification of ichthyoplankton is difficult because fish during early life stages often lack stable morphological characteristics; such difficulty in species identification can be a major hindrance in conducting ichthyoplankton surveys for fish biodiversity investigations. Here, we evaluated the feasibility of a molecular operational taxonomic unit (MOTU) approach for ichthyoplankton investigations, and describe fish biodiversity in the Jinshajiang section of the upper Yangtze River, China. The MOTUs were established by grouping specimens diverging less than 1.00% Kimura two-parameter (K2P) distance units from their nearest neighbor within the same MOTU, based on previous work on between-species divergences of the mitochondrial cytochrome c oxidase subunit I (COI) gene. Taxonomic assignment of the MOTUs was performed by comparing the MOTU sequences with the COI sequences of taxonomic species. Sixty-eight MOTUs were inferred from 818 COI sequences of ichthyoplankton in the Jinshajiang river section. Among those, one MOTU was composed of two identified taxonomic species, and each of the other MOTUs was linked to a single, identified taxonomic species. Only 26 MOTUs were successfully identified to taxonomic species due to the limited reference database. Our results demonstrate that the MOTU approach can be applied successfully for analyzing biodiversity and identifying species of freshwater ichthyoplankton. Compared with previous ichthyoplankton investigations, the richness of ichthyoplankton was very high. The high diversity of ichthyoplankton noted in our study suggests that the Jinshajiang section should be an important target for fish biodiversity conservation in the Yangtze River.
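
The K2P distance used for the MOTU threshold above can be illustrated with a short sketch. This is not the authors' pipeline: the function name `k2p_distance`, the constant `MOTU_THRESHOLD`, and the choice to skip sites with gaps or ambiguous bases are assumptions made for illustration. It uses the standard Kimura two-parameter formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions between two aligned sequences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
MOTU_THRESHOLD = 0.01  # 1.00% K2P, the cutoff quoted in the abstract


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption of this sketch, not something stated in the abstract).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def within_threshold(seq_a: str, seq_b: str) -> bool:
    """True if the two sequences diverge by less than the MOTU cutoff."""
    return k2p_distance(seq_a, seq_b) < MOTU_THRESHOLD
```

With a roughly 650 bp COI barcode, a 1% cutoff corresponds to about six substitutions, so the grouping is sensitive to alignment quality and to how gaps are handled.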

12.
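Subtribe Swertiinae is one of the taxonomically more difficult subtribes of Gentianaceae. To explore the phylogenetic relationships among and within its genera, 86 species and varieties of the subtribe were sampled, and fragments of the chloroplast genes matK and rbcL were analyzed with ML and BI methods to reconstruct a phylogenetic tree of the subtribe. Key divergence times were estimated by Bayesian analysis of the molecular sequences using a Markov chain Monte Carlo (MCMC) algorithm. The results show that: (1) subtribes Gentianinae and Swertiinae are each monophyletic and are sister groups; (2) Swertia, Gentianella, Lomatogonium and Comastoma are all non-monophyletic, with the species of these genera intermixed on the tree; in particular, several Swertia species fall on different branches and are paraphyletic with respect to other genera; (3) 49 species of Swertiinae began to form at about 4 Ma; (4) the molecular data support the division between subgenus 獐牙菜亚属 and subgenus 多枝亚属 in the classification of He Tingnong (何廷农), and partly support the division of sections 多枝组 and 宽丝组 within subgenus 多枝亚属; (5) the delimitation of the genera 异型花属, Swertia, Gentianella, Comastoma and Lomatogonium, and the systematic position of section 密花组 of Swertia subgenus 肉根亚属, require further discussion.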

13.
The microarray approach has been proposed for high throughput analysis of the microbial community by providing snapshots of the microbial diversity under different environmental conditions. For this purpose, a prototype of a 16S rRNA-based taxonomic microarray was developed and evaluated for assessing bacterial community diversity. The prototype microarray is composed of 122 probes that target bacteria at various taxonomic levels from phyla to species (mostly Alphaproteobacteria). The prototype microarray was first validated using bacteria in pure culture. Differences in the sequences of probes and potential target DNAs were quantified as weighted mismatches (WMM) in order to evaluate hybridization reliability. As a general feature, probes having a WMM > 2 with target DNA displayed only 2.8% false positives. The prototype microarray was subsequently tested with an environmental sample, which consisted of an Agrobacterium-related polymerase chain reaction amplicon from a maize rhizosphere bacterial community. Microarray results were compared to results obtained by cloning-sequencing with the same DNA. Microarray analysis enabled the detection of all 16S rRNA gene sequences found by cloning-sequencing. Sequences representing only 1.7% of the clone library were detected. In conclusion, this prototype 16S rRNA-based taxonomic microarray appears to be a promising tool for the analysis of Alphaproteobacteria in complex ecosystems.

14.
Dinoflagellates are a heterogeneous group of protists present in all aquatic ecosystems where they occupy various ecological niches. They play a major role as primary producers, but many species are mixotrophic or heterotrophic. Environmental metabarcoding based on high-throughput sequencing is increasingly applied to assess diversity and abundance of planktonic organisms, and reference databases are definitely needed to taxonomically assign the huge number of sequences. We provide an updated 18S rRNA reference database of dinoflagellates: DinoREF. Sequences were downloaded from GenBank and filtered based on stringent quality criteria. All sequences were taxonomically curated, classified taking into account classical morphotaxonomic studies and molecular phylogenies, and linked to a series of metadata. DinoREF includes 1,671 sequences representing 149 genera and 422 species. The taxonomic assignation of 468 sequences was revised. The largest number of sequences belongs to Gonyaulacales and Suessiales that include toxic and symbiotic species. DinoREF provides an opportunity to test the level of taxonomic resolution of different 18S barcode markers based on a large number of sequences and species. As an example, when only the V4 region is considered, 374 of the 422 species included in DinoREF can still be unambiguously identified. Clustering the V4 sequences at 98% similarity, a threshold that is commonly applied in metabarcoding studies, resulted in a considerable underestimation of species diversity.

15.
A total of 95 nucleotide sequences of a Co-1 gene fragment of approximately 650 bp were analyzed for fishes of the orders Perciformes and Scorpaeniformes (outgroup). Gene trees based on four algorithms (BA, NJ, MP, and ML) were similar in topology of resolved branches. An emphasis was placed on the species and generic levels, but a significant phylogenetic signal was obtained for higher taxonomic ranks as well. For instance, a monophyletic origin was confirmed for the family Zoarcidae and the subfamily Opisthocentrinae (Stichaeidae). The proportion of different nucleotides in the sequences compared (p-distances) significantly increased with increasing taxonomic rank. The p-distances were estimated for four hierarchic levels and were (1) 0.15 ± 0.06% for the within-species hierarchic level, (2) 6.33 ± 0.37% for the within-genus level, (3) 11.83 ± 0.06% for the within-family level, and (4) 15.22 ± 0.05% for the within-order level. The difference in the Co-1 gene fragments between levels (1) and (2) allows almost errorless species identification on the basis of this kind of a molecular bar code.
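
The p-distance referred to above is simply the proportion of aligned sites at which two sequences differ. The sketch below is illustrative only: the function name `p_distance` is hypothetical, and it assumes the two sequences are already aligned and contain no gaps, which the abstract does not specify.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count mismatching positions, ignoring case.
    differences = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return differences / len(seq_a)
```

Multiplying the result by 100 gives a percentage comparable to the values quoted in the abstract; a within-species mean near 0.15% against a within-genus mean near 6% is the kind of gap that makes a barcode threshold workable.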

16.
The microbial mats of Guerrero Negro (GN), Baja California Sur, Mexico historically were considered a simple environment, dominated by cyanobacteria and sulfate-reducing bacteria. Culture-independent rRNA community profiling instead revealed these microbial mats as among the most phylogenetically diverse environments known. A preliminary molecular survey of the GN mat based on only ∼1500 small subunit rRNA gene sequences discovered several new phylum-level groups in the bacterial phylogenetic domain and many previously undetected lower-level taxa. We determined an additional ∼119 000 nearly full-length sequences and 28 000 >200 nucleotide 454 reads from a 10-layer depth profile of the GN mat. With this unprecedented coverage of long sequences from one environment, we confirm the mat is phylogenetically stratified, presumably corresponding to light and geochemical gradients throughout the depth of the mat. Previous shotgun metagenomic data from the same depth profile show the same stratified pattern and suggest that metagenome properties may be predictable from rRNA gene sequences. We verify previously identified novel lineages and identify new phylogenetic diversity at lower taxonomic levels, for example, thousands of operational taxonomic units at the family-genus levels differ considerably from known sequences. The new sequences populate parts of the bacterial phylogenetic tree that previously were poorly described, but indicate that any comprehensive survey of GN diversity has only begun. Finally, we show that taxonomic conclusions are generally congruent between Sanger and 454 sequencing technologies, with the taxonomic resolution achieved dependent on the abundance of reference sequences in the relevant region of the rRNA tree of life.

17.
Complex microbial communities remain poorly characterized despite their ubiquity and importance to human and animal health, agriculture, and industry. Attempts to describe microbial communities by either traditional microbiological methods or molecular methods have been limited in both scale and precision. The availability of genomics technologies offers an unprecedented opportunity to conduct more comprehensive characterizations of microbial communities. Here we describe the application of an established molecular diagnostic method based on the chaperonin-60 sequence, in combination with high-throughput sequencing, to the profiling of a microbial community: the pig intestinal microbial community. Four libraries of cloned cpn60 sequences were generated by two genomic DNA extraction procedures in combination with two PCR protocols. A total of 1,125 cloned cpn60 sequences from the four libraries were sequenced. Among the 1,125 cloned cpn60 sequences, we identified 398 different nucleotide sequences encoding 280 unique peptide sequences. Pairwise comparisons of the 398 unique nucleotide sequences revealed a high degree of sequence diversity within the library. Identification of the likely taxonomic origins of cloned sequences ranged from imprecise, with clones assigned to a taxonomic subclass, to precise, for cloned sequences with 100% DNA sequence identity with a species in our reference database. The compositions of the four libraries were compared and differences related to library construction parameters were observed. Our results indicate that this method is an alternative to 16S rRNA sequence-based studies which can be scaled up for the purpose of performing a potentially comprehensive assessment of a given microbial community or for comparative studies.

18.
16S rDNA library-based analysis of ruminal bacterial diversity
Bacterial 16S rDNA sequence data, incorporating sequences > 1 kb, were retrieved from published rumen library studies and public databases, then were combined and analysed to assess the diversity of the rumen microbial ecosystem as indicated by the pooled data. Low G+C Gram positive bacteria (54%) and the Cytophaga-Flexibacter-Bacteroides (40%) phyla were most abundantly represented. The diversity inferred by combining the datasets was much wider than inferred by individual studies, most likely due to different diets enriching for bacteria with different fermentative activities. A total of 341 operational taxonomic units (OTU) was predicted by the Chao1 non-parametric estimator approach. Phylogenetic and database analysis demonstrated that 89% of the diversity had greatest similarity to organisms which had not been cultivated, and that several sequences are likely to represent novel taxonomic groupings. Furthermore, of the 11% of the diversity represented by cultured isolates (> 95% 16S rDNA identity), not all of the bacteria were of ruminal origin. This study therefore reinforces the need to reconcile classical culture-based rumen microbiology with molecular ecological studies to determine the metabolic role of uncultivated species.
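
The Chao1 estimator mentioned above extrapolates total richness from the number of OTUs seen exactly once (singletons, F1) and exactly twice (doubletons, F2). The abstract does not say which variant was used; the sketch below assumes the commonly used bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)), and the function name `chao1` and the input format (one sequence count per OTU) are hypothetical.

```python
from typing import Iterable


def chao1(otu_counts: Iterable[int]) -> float:
    """Bias-corrected Chao1 richness estimate from per-OTU sequence counts."""
    observed = [count for count in otu_counts if count > 0]
    s_obs = len(observed)                                  # OTUs actually seen
    singletons = sum(1 for count in observed if count == 1)
    doubletons = sum(1 for count in observed if count == 2)
    # The bias-corrected form stays defined when there are no doubletons.
    unseen = singletons * (singletons - 1) / (2 * (doubletons + 1))
    return s_obs + unseen
```

For example, counts of `[10, 5, 1, 1, 1, 2, 2, 0]` give seven observed OTUs, three singletons and two doubletons, so the estimate is 7 + (3 × 2) / (2 × 3) = 8. Because the correction depends only on the rarest OTUs, sequencing errors that inflate singletons will inflate the estimate as well.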

19.
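Objective: To compare Sanger sequencing and pyrosequencing for analyzing the composition of the oral microbiota in healthy people. Methods: Saliva, tongue dorsum, mucosa, supragingival plaque and subgingival plaque were collected from 6 healthy adults, 16S rRNA gene libraries were constructed, and the libraries were analyzed by Sanger sequencing and by pyrosequencing. Results: Sanger sequencing yielded 5,794 known sequences (88.7% of 6,535 total sequences), 75 genera, and 396 operational taxonomic units (OTUs; 61.4% of all OTUs). Pyrosequencing yielded 10,771 known sequences (97.0% of 11,103 total sequences), 66 genera, and 322 OTUs (68.0% of all OTUs). The distribution of the oral microbiota obtained by the two methods was broadly consistent at the phylum and genus levels but differed significantly at the species level. The evenness values of the oral microbiota libraries built with Sanger sequencing and pyrosequencing were 0.016 and 0.007, respectively, indicating that the species abundance distribution in the pyrosequencing library was slightly less even than in the Sanger library, but that dominant species were more pronounced. Conclusion: The gene libraries constructed for pyrosequencing represent the diversity of the oral microbiota and are economical and time-saving, so pyrosequencing can be applied to the analysis of oral bacterial species.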

20.
Determining the taxonomic affiliation of sequences assembled from metagenomes remains a major bottleneck that affects research across the fields of environmental, clinical and evolutionary microbiology. Here, we introduce MyTaxa, a homology-based bioinformatics framework to classify metagenomic and genomic sequences with unprecedented accuracy. The distinguishing aspect of MyTaxa is that it employs all genes present in an unknown sequence as classifiers, weighting each gene based on its (predetermined) classifying power at a given taxonomic level and frequency of horizontal gene transfer. MyTaxa also implements a novel classification scheme based on the genome-aggregate average amino acid identity concept to determine the degree of novelty of sequences representing uncharacterized taxa, i.e. whether they represent novel species, genera or phyla. Application of MyTaxa on in silico generated (mock) and real metagenomes of varied read length (100–2000 bp) revealed that it correctly classified at least 5% more sequences than any other tool. The analysis also showed that ∼10% of the assembled sequences from human gut metagenomes represent novel species with no sequenced representatives, several of which were highly abundant in situ such as members of the Prevotella genus. Thus, MyTaxa can find several important applications in microbial identification and diversity studies.
