Similar articles
 Found 20 similar articles; search took 0 ms
1.
High‐throughput sequencing (HTS) technologies generate millions of sequence reads from DNA/RNA molecules rapidly and cost‐effectively, enabling single‐investigator laboratories to address a variety of ‘omics’ questions in nonmodel organisms and fundamentally changing the way genomic approaches are used to advance biological research. One major challenge posed by HTS is the complexity and difficulty of data quality control (QC). While QC issues associated with sample isolation, library preparation and sequencing are well known and protocols for their handling are widely available, the QC of the actual sequence reads generated by HTS is often overlooked. HTS‐generated sequence reads can contain various errors, biases and artefacts whose identification and amelioration can greatly impact subsequent data analysis. However, a systematic survey of QC procedures for HTS data is still lacking. In this review, we begin by presenting standard ‘health check‐up’ QC procedures recommended for HTS data sets and establishing what ‘healthy’ HTS data look like. We then classify the errors, biases and artefacts present in HTS data into three major types of ‘pathologies’, discussing their causes and symptoms and illustrating with examples their diagnosis and impact on downstream analyses. We conclude by offering examples of successful ‘treatment’ protocols and recommendations on standard practices and treatment options. Notwithstanding the speed with which HTS technologies – and consequently their pathologies – change, we argue that careful QC of HTS data is an important – yet often neglected – aspect of their application in molecular ecology, and lay the groundwork for developing an HTS data QC ‘best practices’ guide.
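
As a rough illustration of the kind of read‐level ‘health check‐up’ the abstract above refers to, the following minimal Python sketch computes per‐read mean base quality and drops reads below a cutoff. It assumes Phred+33 quality encoding; the function names, the toy reads and the cutoff of 20 are hypothetical and are not taken from the review.

```python
from typing import List, Tuple

def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]

def mean_quality(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a read; 0.0 for an empty quality string."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores) if scores else 0.0

def filter_reads(reads: List[Tuple[str, str, str]], min_mean: float = 20.0):
    """Keep (name, sequence, quality) records whose mean quality >= min_mean.

    The cutoff of 20 is an illustrative assumption, not a recommendation
    from the review.
    """
    return [r for r in reads if mean_quality(r[2]) >= min_mean]

# Hypothetical toy reads: 'I' decodes to Q40 and '!' to Q0 under Phred+33.
reads = [("read1", "ACGT", "IIII"), ("read2", "ACGT", "!!!!")]
kept_names = [name for name, _, _ in filter_reads(reads)]
```

Mean quality is only one of many possible checks; real QC tools also inspect per‐position quality, adapter content and duplication levels.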

2.
To enable rapid selection of traits in marker‐assisted breeding, markers must be technically simple, low‐cost, high‐throughput and randomly distributed in a genome. We developed such a technology, designated Multiplex Restriction Amplicon Sequencing (MRASeq), which reduces genome complexity by polymerase chain reaction (PCR) amplification of amplicons flanked by restriction sites. The first PCR primers contain restriction site sequences at their 3′ ends, preceded by 6–10 bases of specific or degenerate nucleotide sequence and then by a unique M13‐tail sequence, which serves as the binding site for a second PCR that adds sequencing primers and barcodes so that samples can be multiplexed for sequencing. The sequences of the restriction sites and adjacent nucleotides can be altered to suit different species. Physical mapping of MRASeq SNPs from a biparental population of allohexaploid wheat (Triticum aestivum L.) showed a random distribution of SNPs across the genome. MRASeq generated thousands of SNPs from a wheat biparental population and from natural populations of wheat and barley (Hordeum vulgare L.). This novel, next‐generation sequencing‐based genotyping platform can be used for linkage mapping to screen quantitative trait loci (QTL), for background selection in breeding and for many other genetics and breeding applications in various species.

3.
The purpose of this review is to present the most common and emerging DNA‐based methods used to generate data for biodiversity and biomonitoring studies. As environmental assessment and monitoring programmes may require biodiversity information at multiple levels, we pay particular attention to the DNA metabarcoding method and discuss a number of bioinformatic tools and considerations for producing DNA‐based indicators using operational taxonomic units (OTUs), taxa at a variety of ranks and community composition. By developing the capacity to harness the advantages provided by the newest technologies, investigators can “scale up” by increasing the number of samples and replicates processed, the frequency of sampling over time and space, and even the depth of sampling, for example by sequencing more reads or more markers per sample. The ability to scale up is made possible by the reduced hands‐on time and cost per sample provided by the newest kits, platforms and software tools. Results gleaned from broad‐scale monitoring will provide opportunities to address key scientific questions linked to biodiversity and its dynamics across time and space; they will also be more relevant for policymakers, enable science‐based decision‐making and provide a greater socio‐economic impact. As genomic approaches are continually evolving, we provide this guide to methods used in biodiversity genomics.

4.
The application of high‐throughput sequencing‐based approaches to DNA extracted from environmental samples such as gut contents and faeces has become a popular tool for studying dietary habits of animals. Due to the high resolution and prey detection capacity they provide, both metabarcoding and shotgun sequencing are increasingly used to address ecological questions grounded in dietary relationships. Despite their great promise in this context, recent research has unveiled how a wealth of biological (related to the study system) and technical (related to the methodology) factors can distort the signal of taxonomic composition and diversity. Here, we review these studies in the light of high‐throughput sequencing‐based assessment of trophic interactions. We address how the study design can account for distortion factors, and how acknowledging the limitations and biases inherent to sequencing‐based diet analyses is essential for obtaining reliable results and thus for drawing appropriate conclusions. Furthermore, we suggest strategies to minimize the effect of distortion factors, measures to increase reproducibility, replicability and comparability of studies, and options to scale up DNA sequencing‐based diet analyses. In doing so, we aim to aid end‐users in designing reliable diet studies by informing them about the complexity and limitations of DNA sequencing‐based diet analyses, and encourage researchers to create and improve tools that will eventually drive this field to its maturity.

5.
Recent advances in high‐throughput sequencing library preparation and subgenomic enrichment methods have opened new avenues for population genetics and phylogenetics of nonmodel organisms. To multiplex large numbers of indexed samples while sequencing predominantly orthologous, targeted regions of the genome, we propose modifications to an existing in‐solution capture method that uses PCR products as target probes to enrich library pools for the genomic subset of interest. The sequence capture using PCR‐generated probes (SCPP) protocol requires no specialized equipment, is highly flexible and significantly reduces experimental costs for projects where a modest scale of genetic data is optimal (25–100 genomic loci). Our alterations enable application of this method across a wider phylogenetic range of taxa and result in higher capture efficiencies and coverage at each locus. Efficient and consistent capture over multiple SCPP experiments and at various phylogenetic distances is demonstrated, extending the utility of this method to both phylogeographic and phylogenomic studies.

6.
7.
High‐throughput sequencing has been proposed as a method to genotype microsatellites and overcome the four main technical drawbacks of capillary electrophoresis: amplification artifacts, imprecise sizing, length homoplasy, and limited multiplex capability. The objective of this project was to test a high‐throughput amplicon sequencing approach to fragment analysis of short tandem repeats and characterize its advantages and disadvantages against traditional capillary electrophoresis. We amplified and sequenced 12 muskrat microsatellite loci from 180 muskrat specimens and analyzed the sequencing data for precision of allele calling, propensity for amplification or sequencing artifacts, and evidence of length homoplasy. Of the 294 total alleles we detected by sequencing, only 164 would have been detected by capillary electrophoresis; the remaining 130 alleles (44%) would have been hidden by length homoplasy. The ability to detect a greater number of unique alleles made it possible to resolve greater population genetic structure. The primary advantages of fragment analysis by sequencing are the ability to precisely size fragments, resolve length homoplasy, multiplex many individuals and many loci into a single high‐throughput run, and compare data across projects and across laboratories (present and future) with minimal technical calibration. A significant disadvantage of fragment analysis by sequencing is that the method is only practical and cost‐effective when performed on batches of several hundred samples with multiple loci. Future work is needed to optimize throughput while minimizing costs and to update existing microsatellite allele calling and analysis programs to accommodate sequence‐aware microsatellite data.
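
The idea of length homoplasy in the abstract above can be sketched in a few lines of Python: alleles that differ in sequence but share a fragment length are indistinguishable by sizing alone. The toy allele sequences and the function name below are hypothetical; only the counts 294 and 164 come from the abstract.

```python
def count_alleles(sequences):
    """Return (alleles distinguishable by sequence, alleles distinguishable by length)."""
    by_sequence = set(sequences)
    by_length = {len(seq) for seq in by_sequence}
    return len(by_sequence), len(by_length)

# Hypothetical alleles at one locus: the first two share a length (8 bp)
# but differ in sequence, so sizing alone would merge them.
alleles = ["ACACACAC", "ACACACAT", "ACACAC", "ACACACACAC"]
n_by_sequence, n_by_length = count_alleles(alleles)
hidden_by_homoplasy = n_by_sequence - n_by_length

# Arithmetic check on the figures reported in the abstract.
total_alleles, detectable_by_ce = 294, 164
hidden_reported = total_alleles - detectable_by_ce
hidden_percent = round(100 * hidden_reported / total_alleles)
```

In this toy locus one of four alleles is hidden; the reported figures give 130 hidden alleles, or about 44% of the total, consistent with the abstract.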

8.
  1. Increasing access to next‐generation sequencing (NGS) technologies is revolutionizing the life sciences. In disease ecology, NGS‐based methods have the potential to provide higher‐resolution data on communities of parasites found in individual hosts as well as host populations.
  2. Here, we demonstrate how a novel analytical method, utilizing high‐throughput sequencing of PCR amplicons, can be used to explore variation in blood‐borne parasite (Theileria—Apicomplexa: Piroplasmida) communities of African buffalo at higher resolutions than has been obtained with conventional molecular tools.
  3. Results reveal temporal patterns of synchronized and opposite fluctuations of prevalence and relative abundance of Theileria spp. within the host population, suggesting heterogeneous transmission across taxa. Furthermore, we show that the community composition of Theileria spp. and their subtypes varies considerably between buffalo, with differences in composition reflected in mean and variance of overall parasitemia, thereby showing potential to elucidate previously unexplained contrasts in infection outcomes for host individuals.
  4. Importantly, our methods are generalizable as they can be utilized to describe blood‐borne parasite communities in any host species. Furthermore, our methodological framework can be adapted to any parasite system given the appropriate genetic marker.
  5. The findings of this study demonstrate how a novel NGS‐based analytical approach can provide fine‐scale, quantitative data, unlocking opportunities for discovery in disease ecology.

9.
Microalgae in the division Haptophyta play key roles in the marine ecosystem and in global biogeochemical processes. Despite their ecological importance, knowledge of their seasonal dynamics, community composition and abundance at the species level is limited because of their small cell size and the few morphological features visible under the light microscope. Here, we present unique data on haptophyte seasonal diversity and dynamics from two annual cycles, with the taxonomic resolution and sampling depth obtained with high‐throughput sequencing. From outer Oslofjorden, S Norway, nano‐ and picoplanktonic samples were collected monthly for 2 years, and the haptophytes were targeted by amplification of RNA/cDNA with Haptophyta‐specific 18S rDNA V4 primers. We obtained 156 operational taxonomic units (OTUs) from c. 400,000 reads generated by 454 pyrosequencing, after rigorous bioinformatic filtering and clustering at 99.5%. Most OTUs represented uncultured and/or not yet 18S rDNA‐sequenced species. Haptophyte OTU richness and community composition exhibited high temporal variation and significant yearly periodicity. Richness was highest in September–October (autumn) and lowest in April–May (spring). Some taxa were detected all year, such as Chrysochromulina simplex, Emiliania huxleyi and Phaeocystis cordata, whereas most calcifying coccolithophores only appeared from summer to early winter. We also revealed the seasonal dynamics of OTUs representing putative novel classes (clades HAP‐3–5) or orders (clades D, E, F). Season, light and temperature accounted for 29% of the variation in OTU composition. Residual variation may be related to biotic factors, such as competition and viral infection. This study provides new, in‐depth knowledge on seasonal diversity and dynamics of haptophytes in North Atlantic coastal waters.

10.
High‐throughput DNA analyses are increasingly being used to detect rare mutations in moderately sized genomes. These methods have yielded genome mutation rates that are markedly higher than those obtained using pre‐genomic strategies. Recent work in a variety of organisms has shown that mutation rate is strongly affected by sequence context and genome position. These observations suggest that high‐throughput DNA analyses will ultimately allow researchers to identify trans‐acting factors and cis sequences that underlie mutation rate variation. Such work should provide insights into how mutation rate variability can impact genome organization and disease progression.

11.
High‐throughput sequencing (HTS) of PCR amplicons is becoming the method of choice to sequence one or several targeted loci for phylogenetic and DNA barcoding studies. Although the development of HTS has allowed rapid generation of massive amounts of DNA sequence data, preparing amplicons for HTS remains a rate‐limiting step. For example, HTS platforms require platform‐specific adapter sequences to be present at the 5′ and 3′ ends of the DNA fragment to be sequenced. In addition, short multiplex identifier (MID) tags are typically added to allow multiple samples to be pooled in a single HTS run. Existing methods to incorporate HTS adapters and MID tags into PCR amplicons are either inefficient, requiring multiple enzymatic reactions and clean‐up steps, or costly when applied to multiple samples or loci (fusion primers). We describe a method to amplify a target locus and add HTS adapters and MID tags via a linker sequence using a single PCR. We demonstrate our approach by generating reference sequence data for two mitochondrial loci (COI and 16S) for a diverse suite of insect taxa. Our approach provides a flexible, cost‐effective and efficient method to prepare amplicons for HTS.
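
To make the role of MID tags in the abstract above concrete, here is a minimal Python sketch of how pooled reads might be sorted back into samples after sequencing. It assumes each read begins with an exact, fixed‐length tag; the tag sequences, sample names and function name are hypothetical, and real pipelines typically also handle adapters, linkers and sequencing errors in the tag.

```python
from collections import defaultdict

def demultiplex(reads, tag_to_sample):
    """Assign reads to samples by their leading MID tag and trim the tag.

    Assumes exact matches and that all tags share one length; reads with
    an unknown tag are kept untrimmed under the key "unassigned".
    """
    tag_lengths = {len(tag) for tag in tag_to_sample}
    if len(tag_lengths) != 1:
        raise ValueError("this sketch assumes all MID tags have the same length")
    tag_len = tag_lengths.pop()

    bins = defaultdict(list)
    for read in reads:
        sample = tag_to_sample.get(read[:tag_len])
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[tag_len:])
    return dict(bins)

# Hypothetical 4-base tags and toy reads.
tags = {"ACGT": "sample_A", "TGCA": "sample_B"}
pooled = ["ACGTGGGTTT", "TGCACCCAAA", "NNNNGGGTTT"]
by_sample = demultiplex(pooled, tags)
```

Exact matching keeps the sketch short; tolerating one mismatch in the tag would recover more reads at the cost of a small risk of misassignment.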

12.
DNA analysis of predator faeces using high‐throughput amplicon sequencing (HTS) enhances our understanding of predator–prey interactions. However, conclusions drawn from this technique are constrained by biases that occur in multiple steps of the HTS workflow. To better characterize insectivorous animal diets, we used DNA from a diverse set of arthropods to assess PCR biases of commonly used and novel primer pairs for the mitochondrial gene cytochrome c oxidase subunit I (COI). We compared diversity recovered from HTS of bat guano samples using a commonly used primer pair, “ZBJ”, with results using the novel primer pair “ANML”. To parameterize our bioinformatics pipeline, we created an arthropod mock community consisting of single‐copy (cloned) COI sequences. To examine biases associated with both PCR and HTS, mock community members were combined in equimolar amounts both pre‐ and post‐PCR. We validated our system using guano from bats fed known diets and using composite samples of morphologically identified insects collected in pitfall traps. In PCR tests, the ANML primer pair amplified 58 of 59 arthropod taxa (98%), whereas ZBJ amplified 24–40 of 59 taxa (41%–68%). Furthermore, in an HTS comparison of field‐collected samples, the ANML primers detected nearly fourfold more arthropod taxa than the ZBJ primers. The additional arthropods detected include medically and economically relevant insect groups such as mosquitoes. Results revealed biases at both the PCR and sequencing levels, demonstrating the pitfalls associated with using HTS read numbers as proxies for abundance. The use of an arthropod mock community allowed for improved bioinformatics pipeline parameterization.

13.
14.
The present study aimed to estimate the clinical performance of non‐invasive prenatal testing (NIPT) based on high‐throughput sequencing for the detection of foetal chromosomal deletions and duplications. A total of 6348 pregnant women receiving NIPT by high‐throughput sequencing were included in our study. All had conceived naturally and carried singleton pregnancies. Individuals showing abnormalities in NIPT received invasive ultrasound‐guided amniocentesis for chromosomal karyotype and microarray analysis at 18–24 weeks of pregnancy. Detection results for foetal chromosomal deletions and duplications were compared between the high‐throughput sequencing method and chromosomal karyotype and microarray analysis. Thirty‐eight individuals were identified as carrying 51 chromosomal deletions/duplications by high‐throughput sequencing. In subsequent chromosomal karyotype and microarray analysis, 34 subchromosomal deletions/duplications were identified in 26 pregnant women. The observed deletions and duplications ranged from 1.05 to 17.98 Mb. Detection accuracy for these deletions and duplications was 66.7%. Twenty‐one deletions and duplications were found to be correlated with known abnormalities. NIPT based on high‐throughput sequencing is able to identify foetal chromosomal deletions and duplications, but its sensitivity and specificity were not explored. Further progress should be made to reduce false‐positive results.
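
The 66.7% figure in the abstract above appears to be the share of sequencing‐based calls that were confirmed by karyotype and microarray analysis (34 of 51). The short Python sketch below reproduces that arithmetic under this assumption; the function name is hypothetical, and the interpretation of "detection accuracy" as a confirmation rate is an inference from the reported counts rather than a definition given in the abstract.

```python
def confirmation_rate(confirmed: int, called: int) -> float:
    """Fraction of screening calls confirmed by the reference method."""
    if called <= 0:
        raise ValueError("number of calls must be positive")
    return confirmed / called

# Counts reported in the abstract.
called_by_nipt = 51          # deletions/duplications called by sequencing
confirmed_by_reference = 34  # confirmed by karyotype and microarray analysis

rate_percent = round(100 * confirmation_rate(confirmed_by_reference, called_by_nipt), 1)
unconfirmed_calls = called_by_nipt - confirmed_by_reference
```

Because only screen‐positive pregnancies were followed up, these counts say nothing about missed cases, which is consistent with the abstract's statement that sensitivity and specificity were not explored.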

15.
Wild crop relatives represent a source of novel alleles for crop genetic improvement. Screening biodiversity for useful or diverse gene homologues has often been based upon the amplification of targeted genes using available sequence information to design primers that amplify the target gene region across species. The crucial requirement of this approach is the presence of sequences with sufficient conservation across species to allow for the design of universal primers. This approach is often not successful with diverse organisms or highly variable genes. Massively parallel sequencing (MPS) can quickly produce large amounts of sequence data and provides a viable option for characterizing homologues of known genes in poorly described genomes. MPS of genomic DNA was used to obtain species‐specific sequence information for 18 rice genes related to domestication characteristics in a wild relative of rice, Microlaena stipoides. Species‐specific primers were available for 16 genes, compared with 12 genes using the universal primer method. The use of species‐specific primers had the potential to cover 92% of the sequence of these genes, while traditional universal primers could only be designed to cover 80%. A total of 24 species‐specific primer pairs were used to amplify gene homologues, and 11 primer pairs were successful in capturing six gene homologues. The 23 million 36‐base pair (bp) paired‐end reads, equating to an average of 2X genome coverage, facilitated the successful amplification and sequencing of six target gene homologues, illustrating an important approach to the discovery of useful genes in wild crop relatives.

16.
Metabarcoding has been used in a range of ecological applications such as taxonomic assignment, dietary analysis and the analysis of environmental DNA. However, after a decade of use in these applications there is little consensus on the extent to which the proportions of reads generated correspond to the original proportions of species in a community. To quantify our current understanding, we conducted a structured review and meta‐analysis. The analysis suggests that a weak quantitative relationship may exist between the biomass and sequences produced (slope = 0.52 ± 0.34, p < 0.01), albeit with a large degree of uncertainty. None of the tested moderators (sequencing platform type, the number of species used in a trial or the source of DNA) was able to explain the variance. Our current understanding of the factors affecting the quantitative performance of metabarcoding is still limited: additional research is required before metabarcoding can be confidently utilized for quantitative applications. Until then, we advocate the inclusion of mock communities when metabarcoding, as this facilitates direct assessment of the quantitative ability of any given study.
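
One way a mock community could be used for the kind of direct assessment the abstract above advocates is to regress the proportion of reads per taxon on its known input proportion; a slope near 1 would indicate a quantitative relationship. The Python sketch below fits an ordinary least‐squares slope to hypothetical numbers. The data, the function name and the choice of untransformed proportions are all assumptions and do not reproduce the meta‐analysis model.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical mock community: known input proportions per taxon and the
# proportions of reads recovered for the same taxa.
input_proportion = [0.1, 0.2, 0.3, 0.4]
read_proportion = [0.2, 0.25, 0.25, 0.3]

slope = ols_slope(input_proportion, read_proportion)
```

With these toy numbers the slope is about 0.3, meaning that read proportions are compressed relative to the input; a perfectly quantitative assay would give a slope of 1.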

17.
18.
Dietary changes linked to the availability of anthropogenic food resources can have complex implications for species and ecosystems, especially when species are in decline. Here, we use recently developed primers targeting the ITS2 region of plants to characterize diet from faecal samples of four UK columbids, with particular focus on the European turtle dove (Streptopelia turtur), a rapidly declining obligate granivore. We examine dietary overlap between species (potential competition), associations with body condition in turtle doves and spatiotemporal variation in diet. We identified 143 taxonomic units, of which we classified 55% to species, another 34% to genus and the remaining 11% to family. We found significant dietary overlap between all columbid species, with the highest between turtle doves and stock doves (Columba oenas), then between turtle doves and woodpigeons (Columba palumbus). The lowest overlap was between woodpigeons and collared doves (Streptopelia decaocto). We show considerable change in columbid diets compared to previous studies, probably reflecting opportunistic foraging behaviour by columbids within a highly anthropogenically modified landscape, although our data for columbids other than turtle doves should be considered preliminary. Nestling turtle doves in better condition had a higher dietary proportion of taxonomic units from natural arable plant species and a lower proportion of taxonomic units from anthropogenic food resources such as garden bird seed mixes and brassicas. This suggests that breeding ground conservation strategies for turtle doves should include provision of anthropogenic seeds for adults early in the breeding season, coupled with habitat rich in accessible seeds from arable plants once chicks have hatched.

19.
20.