Similar articles
Found 20 similar articles (search time: 31 ms)
1.
DNA metabarcoding is a promising method for describing communities and estimating biodiversity. This approach uses high‐throughput sequencing of targeted markers to identify species in a complex sample. By convention, sequences are clustered at a predefined sequence divergence threshold (often 3%) into operational taxonomic units (OTUs) that serve as a proxy for species. However, variable levels of interspecific marker variation across taxonomic groups make clustering sequences from a phylogenetically diverse dataset into OTUs at a uniform threshold problematic. In this study, we use mock zooplankton communities to evaluate the accuracy of species richness estimates when following conventional protocols to cluster hypervariable sequences of the V4 region of the small subunit ribosomal RNA gene (18S) into OTUs. By including individually tagged single specimens and “populations” of various species in our communities, we examine the impact of intra‐ and interspecific diversity on OTU clustering. Communities consisting of single individuals per species generated a correspondence of 59–84% between OTU number and species richness at a 3% divergence threshold. However, when multiple individuals per species were included, the correspondence between OTU number and species richness dropped to 31–63%. Our results suggest that intraspecific variation in this marker can often exceed 3%, such that a single species does not always correspond to one OTU. We advocate the need to apply group‐specific divergence thresholds when analyzing complex and taxonomically diverse communities, but also encourage the development of additional filtering steps that allow identification of artifactual rRNA gene sequences or pseudogenes that may generate spurious OTUs.
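The effect of a fixed divergence threshold can be illustrated with a minimal sketch. It is not the pipeline used in the study: it assumes sequences are already aligned to equal length, measures divergence as the plain proportion of differing sites, and uses a simple greedy seed-based rule. The function names `p_distance` and `greedy_otus` are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: list[str], threshold: float = 0.03) -> list[list[int]]:
    """Greedy clustering (illustrative assumption, not the study's method).

    Each sequence joins the first OTU whose seed sequence lies within
    `threshold` divergence; otherwise it founds a new OTU.
    Returns a list of OTUs, each a list of indices into `seqs`.
    """
    seeds: list[int] = []       # index of the seed sequence of each OTU
    otus: list[list[int]] = []  # member indices of each OTU
    for i, s in enumerate(seqs):
        for k, seed in enumerate(seeds):
            if p_distance(s, seqs[seed]) <= threshold:
                otus[k].append(i)
                break
        else:
            # No existing seed is close enough: start a new OTU.
            seeds.append(i)
            otus.append([i])
    return otus


# Toy "species": one reference and two variants diverging by 2% and 5%.
base = "A" * 100
variant_2pct = "C" * 2 + "A" * 98
variant_5pct = "C" * 5 + "A" * 95
toy = [base, variant_2pct, variant_5pct]
```

In this toy case, `greedy_otus(toy, 0.03)` places the 5% variant in its own OTU, whereas a 6% threshold keeps all three sequences together, which mirrors how intraspecific variation above the threshold can inflate OTU counts.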

2.
In spite of technical advances that have provided increases in orders of magnitude in sequencing coverage, microbial ecologists still grapple with how to interpret the genetic diversity represented by the 16S rRNA gene. Two widely used approaches put sequences into bins based on either their similarity to reference sequences (i.e., phylotyping) or their similarity to other sequences in the community (i.e., operational taxonomic units [OTUs]). In the present study, we investigate three issues related to the interpretation and implementation of OTU-based methods. First, we confirm the conventional wisdom that it is impossible to create an accurate distance-based threshold for defining taxonomic levels and instead advocate for a consensus-based method of classifying OTUs. Second, using a taxonomic-independent approach, we show that the average neighbor clustering algorithm produces more robust OTUs than other hierarchical and heuristic clustering algorithms. Third, we demonstrate several steps to reduce the computational burden of forming OTUs without sacrificing the robustness of the OTU assignment. Finally, by blending these solutions, we propose a new heuristic that has a minimal effect on the robustness of OTUs and significantly reduces the necessary time and memory requirements. The ability to quickly and accurately assign sequences to OTUs and then obtain taxonomic information for those OTUs will greatly improve OTU-based analyses and overcome many of the challenges encountered with phylotype-based methods.
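As a rough illustration of average neighbor (average linkage) clustering, the sketch below repeatedly merges the two clusters with the smallest mean pairwise distance and stops once that mean exceeds a cutoff. It is a naive implementation written for readability, not the heuristic proposed in the study; the function name `average_neighbor` and the toy distance matrix are assumptions.

```python
def average_neighbor(dist: list[list[float]], threshold: float) -> list[list[int]]:
    """Naive average-linkage clustering on a symmetric distance matrix.

    Merges the pair of clusters with the smallest mean inter-cluster
    distance until that distance exceeds `threshold`.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None  # (mean distance, index a, index b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = len(clusters[a]) * len(clusters[b])
                mean = sum(dist[i][j] for i in clusters[a] for j in clusters[b]) / pairs
                if best is None or mean < best[0]:
                    best = (mean, a, b)
        if best[0] > threshold:
            break  # the closest remaining pair is too divergent to merge
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy matrix: sequences 0 and 1 differ by 1%, sequence 2 differs from both by 4%.
toy_dist = [
    [0.00, 0.01, 0.04],
    [0.01, 0.00, 0.04],
    [0.04, 0.04, 0.00],
]
```

With a 0.03 cutoff the first two sequences form one OTU and the third remains alone; raising the cutoff to 0.05 merges all three.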

3.
The vast number of undescribed species and the fast rate of biodiversity loss call for new approaches to speed up alpha taxonomy. A plethora of methods for delimiting species or operational taxonomic units (OTUs) based on sequence data have been published in recent years. We test the ability of four delimitation methods (BIN, ABGD, GMYC, PTP) to reproduce established species boundaries on a carefully curated DNA barcode data set of 1870 North European beetle species. We also explore how sampling effort, intraspecific variation, nearest neighbour divergence and nonmonophyly affect the OTU delimitations. All methods produced approximately 90% identity between species and OTUs. The effects of variation and sampling differed between methods. ABGD was sensitive to singleton sequences, while GMYC showed tendencies for oversplitting. The best fit between species and OTUs was achieved using simple rules to find consensus between discordant OTU delimitations. Using several approaches simultaneously allows the methods to compensate for each other's weaknesses. Barcode‐based OTU‐picking is an efficient way to delimit putative species from large data sets where the use of more sophisticated methods based on multilocus or genomic data is not feasible.

4.
Many aspects of animal ecology and physiology are influenced by the microbial communities within them. The underlying forces contributing to the assembly and diversity of gut microbiotas include chance events, host‐based selection and interactions among microorganisms within these communities. We surveyed 215 wild individuals from four sympatric species of Drosophila that share a common diet of decaying mushrooms. Their microbiotas consistently contained abundant bacteria that were undetectable or at low abundance in their diet. Despite their deep phylogenetic divergence, all species had similar microbiotas, thus failing to support predictions of the phylosymbiosis hypothesis. Communities within flies were not random assemblages drawn from a common pool; instead, many bacterial operational taxonomic units (OTUs) were overrepresented or underrepresented relative to the neutral expectations, and OTUs exhibited checkerboard distributions among flies. These results suggest that selective factors play an important role in shaping the gut community structure of these flies.

5.
Large‐scale environmental disturbances may impact both partners in coral host–Symbiodinium systems. Elucidation of the assembly patterns in such complex and interdependent communities may enable better prediction of environmental impacts across coral reef ecosystems. In this study, we investigated how the community composition and diversity of dinoflagellate symbionts in the genus Symbiodinium were distributed among 12 host species from six taxonomic orders (Actinaria, Alcyonacea, Miliolida, Porifera, Rhizostoma, Scleractinia) and in the reef water and sediments at Lizard Island, Great Barrier Reef before the 3rd Global Coral Bleaching Event. 454 pyrosequencing of the ITS2 region of Symbiodinium yielded 83 operational taxonomic units (OTUs) at a 97% similarity cut‐off. Approximately half of the Symbiodinium OTUs from reef water or sediments were also present in symbio. OTUs belonged to six clades (A‐D, F‐G), but community structure was uneven. The two most abundant OTUs (100% matches to types C1 and A3) comprised 91% of reads and OTU C1 was shared by all species. However, sequence‐based analysis of these dominant OTUs revealed host species specificity, suggesting that genetic similarity cut‐offs of Symbiodinium ITS2 data sets need careful evaluation. Of the less abundant OTUs, roughly half occurred at only one site or in one species and the background Symbiodinium communities were distinct between individual samples. We conclude that sampling multiple host taxa with differing life history traits will be critical to fully understand the symbiont diversity of a given system and to predict coral ecosystem responses to environmental change and disturbance considering the differential stress response of the taxa within.

6.
7.
The composition of lichen ecosystems, apart from the mycobiont and photobiont, has not been evaluated intensively. In addition, recent studies to identify algal genotypes have raised questions about the specific relationship between mycobiont and photobiont. In the current study, we analyzed algal and fungal community structures in lichen species from King George Island, Antarctica, by pyrosequencing of eukaryotic large subunit (LSU) and algal internal transcribed spacer (ITS) domains of the nuclear rRNA gene. The sequencing results of LSU and ITS regions indicated that each lichen thallus contained diverse algal species. The major algal operational taxonomic unit (OTU) defined at a 99% similarity cutoff of LSU sequences accounted for 78.7–100% of the total algal community in each sample. In several cases, the major OTUs defined by LSU sequences were represented by two closely related OTUs defined by 98% sequence similarity of the ITS domain. The results of LSU sequences indicated that lichen‐associated fungi belonged to the Arthoniomycetes, Eurotiomycetes, Lecanoromycetes, Leotiomycetes, and Sordariomycetes of the Ascomycota, and Tremellomycetes and Cystobasidiomycetes of the Basidiomycota. The composition of the major photobiont species and of the lichen‐associated fungal community was mostly related to the mycobiont species. A contribution of growth forms or substrates to the composition of photobiont and lichen‐associated fungi was not evident.

8.
This study summarizes results of a DNA barcoding campaign on German Diptera, involving analysis of 45,040 specimens. The resultant DNA barcode library includes records for 2,453 named species comprising a total of 5,200 barcode index numbers (BINs), including 2,700 COI haplotype clusters without species‐level assignment, so called “dark taxa.” Overall, 88 out of 117 families (75%) recorded from Germany were covered, representing more than 50% of the 9,544 known species of German Diptera. Until now, most of these families, especially the most diverse, have been taxonomically inaccessible. By contrast, within a few years this study provided an intermediate taxonomic system for half of the German Dipteran fauna, which will provide a useful foundation for subsequent detailed, integrative taxonomic studies. Using DNA extracts derived from bulk collections made by Malaise traps, we further demonstrate that species delineation using BINs and operational taxonomic units (OTUs) constitutes an effective method for biodiversity studies using DNA metabarcoding. As the reference libraries continue to grow, and gaps in the species catalogue are filled, BIN lists assembled by metabarcoding will provide greater taxonomic resolution. The present study has three main goals: (a) to provide a DNA barcode library for 5,200 BINs of Diptera; (b) to demonstrate, based on the example of bulk extractions from a Malaise trap experiment, that DNA barcode clusters, labelled with globally unique identifiers (such as OTUs and/or BINs), provide a pragmatic, accurate solution to the “taxonomic impediment”; and (c) to demonstrate that interim names based on BINs and OTUs obtained through metabarcoding provide an effective method for studies on species‐rich groups that are usually neglected in biodiversity research projects because of their unresolved taxonomy.

9.
Recent studies of 16S rRNA sequences through next-generation sequencing have revolutionized our understanding of the microbial community composition and structure. One common approach in using these data to explore the genetic diversity in a microbial community is to cluster the 16S rRNA sequences into Operational Taxonomic Units (OTUs) based on sequence similarities. The inferred OTUs can then be used to estimate species diversity, composition, and richness. Although a number of methods have been developed and commonly used to cluster the sequences into OTUs, relatively little guidance is available on their relative performance and the choice of key parameters for each method. In this study, we conducted a comprehensive evaluation of ten existing OTU inference methods. We found that the appropriate dissimilarity value for defining distinct OTUs depends not only on the specific method but also on the sample complexity. For data sets with low complexity, all the algorithms need a higher dissimilarity threshold to define OTUs. Some methods, such as CROP and SLP, are more robust to the specific choice of the threshold than other methods, especially for shorter reads. For high-complexity data sets, hierarchical cluster methods need a stricter dissimilarity threshold to define OTUs because the commonly used dissimilarity threshold of 3% often leads to an under-estimation of the number of OTUs. In general, hierarchical clustering methods perform better at lower dissimilarity thresholds. Our results show that sequence abundance plays an important role in OTU inference. We conclude that care is needed to choose both a threshold for dissimilarity and abundance for OTU inference.

10.
High‐throughput sequencing is revealing that most macro‐organisms house diverse microbial communities. Of particular interest are disease vectors whose microbiome could potentially affect pathogen transmission and vector competence. We investigated bacterial community composition and diversity of the ticks Dermacentor variabilis (n = 68) and Ixodes scapularis (n = 15) and blood of their shared rodent host, Peromyscus leucopus (n = 45) to quantify bacterial diversity and concordance. The 16S rRNA gene was amplified from genomic DNA from field‐collected tick and rodent blood samples, and 454 pyrosequencing was used to elucidate their bacterial communities. After quality control, over 300 000 sequences were obtained and classified into 118 operational taxonomic units (OTUs, clustered at 97% similarity). Analysis of rarefied communities revealed that the most abundant OTUs were tick species‐specific endosymbionts, Francisella and Rickettsia, and the commonly flea‐associated bacterium Bartonella in rodent blood. An Arsenophonus and additional Francisella endosymbiont were also present in D. variabilis samples. Rickettsia was found in both tick species but not in rodent blood, suggesting that it is not transmitted during feeding. Bartonella was present in larvae and nymphs of both tick species, even those scored as unengorged. Relatively few OTUs (e.g. Bartonella, Lactobacillus) were found in all sample types. Overall, bacterial communities from each sample type were significantly different and highly structured, independent of their dominant OTUs. Our results point to complex microbial assemblages inhabiting ticks and host blood including infectious agents, tick‐specific endosymbionts and environmental bacteria that could potentially affect arthropod‐vectored disease dynamics.

11.
Understanding how and why populations evolve is of fundamental importance to molecular ecology. Restriction site‐associated DNA sequencing (RADseq), a popular reduced representation method, has ushered in a new era of genome‐scale research for assessing population structure, hybridization, demographic history, phylogeography and migration. RADseq has also been widely used to conduct genome scans to detect loci involved in adaptive divergence among natural populations. Here, we examine the capacity of those RADseq‐based genome scan studies to detect loci involved in local adaptation. To understand what proportion of the genome is missed by RADseq studies, we developed a simple model using different numbers of RAD‐tags, genome sizes and extents of linkage disequilibrium (length of haplotype blocks). Under the best‐case modelling scenario, we found that RADseq using six‐ or eight‐base pair cutting restriction enzymes would fail to sample many regions of the genome, especially for species with short linkage disequilibrium. We then surveyed recent studies that have used RADseq for genome scans and found that the median density of markers across these studies was 4.08 RAD‐tag markers per megabase (one marker per 245 kb). The length of linkage disequilibrium for many species is one to three orders of magnitude less than the density of the typical recent RADseq study. Thus, we conclude that genome scans based on RADseq data alone, while useful for studies of neutral genetic variation and genetic population structure, will likely miss many loci under selection in studies of local adaptation.
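The density figure quoted above can be checked with simple arithmetic, and the same reasoning gives a crude upper bound on how much of a genome a set of RAD‐tags could tag. The sketch below is not the authors' model: it assumes each tag covers one non-overlapping haplotype block, and the function names and example numbers are hypothetical.

```python
def marker_spacing_bp(markers_per_mb: float) -> float:
    """Average distance between markers, in base pairs, for a given density."""
    return 1_000_000 / markers_per_mb


def max_fraction_tagged(n_tags: int, genome_bp: float, ld_block_bp: float) -> float:
    """Upper bound on the genome fraction in linkage with at least one tag.

    Assumes (optimistically) that every tag sits in a different,
    non-overlapping haplotype block of length `ld_block_bp`.
    """
    return min(1.0, n_tags * ld_block_bp / genome_bp)


# 4.08 markers per megabase corresponds to about one marker per 245 kb.
spacing_kb = marker_spacing_bp(4.08) / 1000

# Hypothetical example: 4,080 tags, a 1 Gb genome, 10 kb haplotype blocks.
example_fraction = max_fraction_tagged(4080, 1e9, 10_000)
```

Under these illustrative numbers at most about 4% of the genome would be in linkage with a marker, which is the kind of shortfall the abstract describes for species with short linkage disequilibrium.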

12.
Adequate read filtering is critical when processing high-throughput data in marker-gene-based studies. Sequencing errors can cause the mis-clustering of otherwise similar reads, artificially increasing the number of retrieved Operational Taxonomic Units (OTUs) and therefore leading to the overestimation of microbial diversity. Sequencing errors will also result in OTUs that are not accurate reconstructions of the original biological sequences. Herein we present the Poisson binomial filtering algorithm (PBF), which minimizes both problems by calculating the error-probability distribution of a sequence from its quality scores. In order to validate our method, we quality-filtered 37 publicly available datasets obtained by sequencing mock and environmental microbial communities with the Roche 454, Illumina MiSeq and IonTorrent PGM platforms, and compared our results to those obtained with previous approaches such as the ones included in mothur, QIIME and USEARCH. Our algorithm retained substantially more reads than its predecessors, while resulting in fewer and more accurate OTUs. This improved sensitivity produced more faithful representations, both quantitatively and qualitatively, of the true microbial diversity present in the studied samples. Furthermore, the method introduced in this work is computationally inexpensive and can be readily applied in conjunction with any existing analysis pipeline.
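The idea of deriving an error-count distribution from quality scores can be sketched as follows. Each Phred score Q implies a per-base error probability of 10^(-Q/10), and the number of errors in a read then follows a Poisson binomial distribution, computed here by dynamic programming. The acceptance rule in `keep_read` (its name, `max_errors` and `min_prob`) is an assumption made for illustration, not the published PBF criterion.

```python
def phred_to_prob(q: int) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_count_pmf(quals: list[int]) -> list[float]:
    """Poisson binomial PMF of the number of erroneous bases in a read.

    pmf[k] is the probability of exactly k errors, treating bases as
    independent Bernoulli trials with probabilities given by `quals`.
    """
    pmf = [1.0]  # with zero bases processed, zero errors is certain
    for q in quals:
        p = phred_to_prob(q)
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)   # this base is correct
            nxt[k + 1] += mass * p     # this base is an error
        pmf = nxt
    return pmf


def keep_read(quals: list[int], max_errors: int = 1, min_prob: float = 0.95) -> bool:
    """Hypothetical filter: keep a read if P(errors <= max_errors) >= min_prob."""
    pmf = error_count_pmf(quals)
    return sum(pmf[: max_errors + 1]) >= min_prob
```

For example, a 100-base read with uniform Q30 passes this illustrative filter, while the same read at uniform Q10 does not.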

13.
In Middle European suburban environments green algae often cover open surfaces of artificial hard substrates. Microscopy reveals the Apatococcus/Desmococcus morphotype predominant over smaller coccoid forms. Adverse conditions such as limited water availability connected with high PAR and UV irradiance may narrow the algal diversity to a few specialists in these subaerial habitats. We used rRNA gene cloning/sequencing from both DNA extracts of the biofilms without culturing as well as cultures, for the unambiguous determination of the algal composition and to assess the algal diversity more comprehensively. The culture‐independent approach revealed mainly just two genera (Apatococcus, Trebouxia) for all study sites and five molecular operational taxonomic units (OTUs) for a particular study site, which based on microscopic observation was the one with the highest morphological diversity. The culture approach, however, revealed seven additional OTUs from five genera (Chloroidium, Coccomyxa, Coenochloris, Pabia, Klebsormidium) and an unidentified trebouxiophyte lineage for that same site; only two OTUs were shared by both approaches. Two OTUs or species were recovered for which references have been isolated only from Antarctica so far. However, the internal transcribed spacer (ITS) sequence differences among them indicated that they represent distinct populations of the same species. Within Apatococcus five clearly distinct groups of ITS sequences, each putatively representing a distinct species, were recovered with three or four such ITS types co‐occurring at the same study site. Except for the streptophyte Klebsormidium only members of Trebouxiophyceae were detected suggesting these algae may be particularly well‐adapted to subaerial habitats.

14.
Pedigree and sibship reconstruction are important methods in quantifying relationships and fitness of individuals in natural populations. Current methods employ a Markov chain‐based algorithm to explore plausible pedigrees iteratively. This provides accurate results, but is time‐consuming. Here, we develop a method to infer sibship and paternity relationships from half‐sibling arrays of known maternity using hierarchical clustering. Given 50 or more unlinked SNP markers and empirically derived error rates, the method performs as well as the widely used package Colony, but is faster by two orders of magnitude. Using simulations, we show that the method performs well across contrasting mating scenarios, even when samples are large. We then apply the method to open‐pollinated arrays of the snapdragon Antirrhinum majus and find evidence for a high degree of multiple mating. Although we focus on diploid SNP data, the method does not depend on marker type and as such has broad applications in nonmodel systems.

15.
16.
Macroalgal bloom‐forming species occur in coastal systems worldwide. However, due to overlapping morphologies in some taxa, accurate taxonomic assessment and classification of these species can be quite challenging. We investigated the molecular and morphological characteristics of 153 specimens of bloom‐forming Ulva located in and around Narragansett Bay, RI, USA. We analyzed sequences of the nuclear internal transcribed spacer 1 region (ITS1) and the chloroplast‐encoded rbcL; based on the ITS1 data, we grouped the specimens into nine operational taxonomic units (OTUs). Eight of these OTUs have been previously reported to exist, while one is novel. Of the eight OTUs, all shared sequence identity with previously published sequences or differed by less than 1.5% sequence divergence for two molecular markers. Previously, 10 species names were reported for Ulva in Rhode Island (one blade and nine tube‐forming species) based upon morphological classification alone. Of our nine OTUs, three contained blade‐forming specimens (U. lactuca, U. compressa, U. rigida), one OTU had a blade with a tubular stipe, and six contained unbranched and/or branched tubular morphologies (one of these six, U. compressa, had both a blade and a tube morphology). While the three blade‐forming OTUs in Narragansett Bay can frequently be distinguished by careful observations of morphological characteristics, and spatial/temporal distribution, it is much more difficult to distinguish among the tube‐forming specimens based upon morphology or distribution alone. Our data support the molecular species concept for Ulva, and indicate that molecular‐based classifications of Ulva species are critical for proper species identification, and subsequent ecological assessment or mitigation of Ulva blooms.

17.
Each holotype specimen provides the only objective link to a particular Linnean binomen. Sequence information from them is increasingly valuable due to the growing usage of DNA barcodes in taxonomy. As type specimens are often old, it may only be possible to recover fragmentary sequence information from them. We tested the efficacy of short sequences from type specimens in the resolution of a challenging taxonomic puzzle: the Elachista dispunctella complex which includes 64 described species with minuscule morphological differences. We applied a multistep procedure to resolve the taxonomy of this species complex. First, we sequenced a large number of newly collected specimens and as many holotypes as possible. Second, we used all >400 bp sequences to examine species boundaries. We employed three unsupervised methods (BIN, ABGD, GMYC) with specified criteria on how to handle discordant results and examined diagnostic bases from each delineated putative species (operational taxonomic units, OTUs). Third, we evaluated the morphological characters of each OTU. Finally, we associated short barcodes from types with the delineated OTUs. In this step, we employed various supervised methods, including distance‐based, tree‐based and character‐based. We recovered 658 bp barcode sequences from 194 of 215 fresh specimens and recovered an average of 141 bp from 33 of 42 holotypes. We observed strong congruence among all methods and good correspondence with morphology. We demonstrate potential pitfalls with tree‐, distance‐ and character‐based approaches when associating sequences of varied length. Our results suggest that sequences as short as 56 bp can often provide valuable taxonomic information. The results support significant taxonomic oversplitting of species in the Elachista dispunctella complex.

18.
A set of expressed sequence tag (EST) simple sequence repeat (SSR) markers was developed and characterized using next‐generation sequencing technology for the genus Diabelia (Caprifoliaceae). De novo assembly of RNA‐seq reads resulted in 58 669 contigs with an N50 length of 1211 bp. A total of 2746 contigs were identified to harbor SSR motifs, of which 48 primer pairs were designed and 11 were shown to be polymorphic across three morphospecies of Diabelia. When evaluated with 30 individuals, the number of alleles per locus ranged from 2 to 11 and the expected heterozygosity varied from 0.399 to 0.873. Distance‐based clustering indicated that the EST‐SSR markers can provide sufficient power to distinguish the three species (or populations). These markers will be useful for evaluating the range‐wide genetic diversity of each species and examining genetic divergence and gene flow between the three species.
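Two summary statistics mentioned above, N50 and expected heterozygosity, have simple definitions that a short sketch can make concrete. N50 is the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the assembly; expected heterozygosity is computed here as one minus the sum of squared allele frequencies, without the small-sample correction that some software applies. The function names and toy inputs are hypothetical and unrelated to the Diabelia data.

```python
def n50(contig_lengths: list[int]) -> int:
    """Length L such that contigs of length >= L hold at least half the assembly."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def expected_heterozygosity(allele_counts: list[int]) -> float:
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    total = sum(allele_counts)
    return 1.0 - sum((count / total) ** 2 for count in allele_counts)
```

For instance, contigs of 2, 3, 4, 5 and 6 bp have an N50 of 5, and a locus with two equally frequent alleles has an expected heterozygosity of 0.5.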

19.
The fungal communities associated with three bryophyte species (the liverwort Barbilophozia hatcheri, the mosses Chorisodontium aciphyllum and Sanionia uncinata) in the Fildes Region, King George Island, maritime Antarctica, were studied using clone library analysis. Fungal communities showed low diversity; the 680 clones belonged to 93 OTUs. Of these, 78 belonged to the phylum Ascomycota, 13 to the phylum Basidiomycota, 1 to the phylum Zygomycota, and 1 to an unknown phylum. Among the OTUs, the most common orders in the Ascomycota were Helotiales (42 OTUs) and Chaetothyriales (14 OTUs) and the most common orders in the Basidiomycota were Sebacinales (3 OTUs) and Platygloeales (3 OTUs). Most OTUs clustered within clades that contained phylotypes identified from samples in Antarctic or Arctic ecosystems or from bryophytes in other ecosystems. In addition, we found that host-related factors may shape the fungal communities associated with bryophytes in this region. This is the first systematic study of the fungal community in Antarctic bryophytes to be performed using a culture-independent method, and the results may improve understanding of endophytic fungal evolution and ecology in the Antarctic ecosystem.

20.
The advent of next generation sequencing has coincided with a growth in interest in using these approaches to better understand the role of the structure and function of the microbial communities in human, animal, and environmental health. Yet, use of next generation sequencing to perform 16S rRNA gene sequence surveys has resulted in considerable controversy surrounding the effects of sequencing errors on downstream analyses. We analyzed 2.7×10^6 reads distributed among 90 identical mock community samples, which were collections of genomic DNA from 21 different species with known 16S rRNA gene sequences; we observed an average error rate of 0.0060. To improve this error rate, we evaluated numerous methods of identifying bad sequence reads, identifying regions within reads of poor quality, and correcting base calls and were able to reduce the overall error rate to 0.0002. Implementation of the PyroNoise algorithm provided the best combination of error rate, sequence length, and number of sequences. Perhaps more problematic than sequencing errors was the presence of chimeras generated during PCR. Because we knew the true sequences within the mock community and the chimeras they could form, we identified 8% of the raw sequence reads as chimeric. After quality filtering the raw sequences and using the Uchime chimera detection program, the overall chimera rate decreased to 1%. The chimeras that could not be detected were largely responsible for the identification of spurious operational taxonomic units (OTUs) and genus-level phylotypes. The number of spurious OTUs and phylotypes increased with sequencing effort indicating that comparison of communities should be made using an equal number of sequences. Finally, we applied our improved quality-filtering pipeline to several benchmarking studies and observed that even with our stringent data curation pipeline, biases in the data generation pipeline and batch effects were observed that could potentially confound the interpretation of microbial community data.
