Similar literature
1.
Next‐generation sequencing (NGS) methodologies have proven useful in deciphering the food items of generalist predators, but have yet to be applied to gelatinous animal gut and tentacle content. NGS can potentially supplement traditional methods of visual identification. Chrysaora quinquecirrha (Atlantic sea nettle) has progressively become more abundant in Mid‐Atlantic United States’ estuaries including Barnegat Bay (New Jersey), potentially having detrimental effects on both marine organisms and human enterprises. Full characterization of this predator's diet is essential for a comprehensive understanding of its impact on the food web and its management. Here, we tested the efficacy of NGS for prey item determination in the Atlantic sea nettle. We implemented an NGS ‘shotgun’ approach to randomly sequence DNA fragments isolated from gut lavages and gastric pouch/tentacle picks of eight and 84 sea nettles, respectively. These results were verified by visual identification and co‐occurring plankton tows. Over 550 000 contigs were assembled from ~110 million paired‐end reads. Of these, 100 contigs were confidently assigned to 23 different taxa, including soft‐bodied organisms previously undocumented as prey species. The taxa included copepods, fish, ctenophores, anemones, amphipods, barnacles, shrimp, polychaete worms, flukes, flatworms, echinoderms, gastropods, bivalves and hemichordates. Our results indicate not only that a ‘shotgun’ NGS approach can supplement visual identification methods, but also that targeted enrichment of a specific amplicon/gene is not a prerequisite for identifying Atlantic sea nettle prey items.

2.
Ecological understanding of the role of consumer–resource interactions in natural food webs is limited by the difficulty of accurately and efficiently determining the complex variety of food types animals have eaten in the field. We developed a method based on DNA metabarcoding multiplexing and next‐generation sequencing to uncover different taxonomic groups of organisms from complex diet samples. We validated this approach on 91 faeces of a large omnivorous mammal, the brown bear, using DNA metabarcoding markers targeting the plant, vertebrate and invertebrate components of the diet. We included internal controls in the experiments and performed PCR replication for accuracy validation in postsequencing data analysis. Using our multiplexing strategy, we significantly simplified the experimental procedure and accurately and concurrently identified different prey DNA corresponding to the targeted taxonomic groups, with ≥60% of taxa of all diet components identified to genus/species level. The systematic application of internal controls and replication was a useful and simple way to evaluate the performance of our experimental procedure, standardize the selection of sequence filtering parameters for each marker data and validate the accuracy of the results. Our general approach can be adapted to the analysis of dietary samples of various predator species in different ecosystems, for a number of conservation and ecological applications entailing large‐scale population level diet assessment through cost‐effective screening of multiple DNA metabarcodes, and the detection of fine dietary variation among samples or individuals and of rare food items.

3.
The predator-prey interactions within food chains are used to both characterize and understand ecosystems. Conventional methods of constructing food chains from visual identification of prey in predator diet can suffer from poor taxonomic resolution, misidentification, and bias against small or completely digestible prey. Next-generation sequencing (NGS) technology has become a powerful tool for diet reconstruction through barcoding of DNA in stomach content or fecal samples. Here we use multi-locus (16S and CO1) next-generation sequencing of DNA barcodes on the feces of Atlantic puffin (Fratercula arctica) chicks (n=65) and adults (n=64) and the stomach contents of their main prey, Atlantic herring (Clupea harengus, n=44) to investigate a previously studied food chain. We compared conventional and molecular-derived chick diet, tested the similarity between the diets of puffin adults and chicks, and determined whether herring prey can be detected in puffin diet samples. There was high variability in the coverage of prey groups between 16S and CO1 markers. We identified more unique prey with our 16S than with our CO1 barcoding markers (51 and 39 taxa respectively), with only 12 taxa identified by both genes. We found no significant difference between the 16S-identified diets of puffin adults (n=17) and chicks (n=41). Our molecular method is more taxonomically resolved and detected chick prey at higher frequencies than conventional field observations. Many likely planktonic prey of herring were detected in feces from puffin adults and chicks, highlighting the impact secondary consumption may have on the interpretation of molecular dietary analysis. This study represents the first simultaneous molecular investigation into the diet of multiple components of a food chain and highlights the utility of a multi-locus approach to diet reconstruction that is broadly applicable to food web analysis.

4.
Who is eating what: diet assessment using next generation sequencing   Total citations: 4 (self-citations: 0; citations by others: 4)
The analysis of food webs and their dynamics facilitates understanding of the mechanistic processes behind community ecology and ecosystem functions. Having accurate techniques for determining dietary ranges and components is critical for this endeavour. While visual analyses and early molecular approaches are highly labour intensive and often lack resolution, recent DNA-based approaches potentially provide more accurate methods for dietary studies. A suite of approaches has been used based on the identification of consumed species by characterization of DNA present in gut or faecal samples. In one approach, a standardized DNA region (DNA barcode) is PCR amplified, amplicons are sequenced and then compared to a reference database for identification. Initially, this involved sequencing clones from PCR products, and studies were limited in scale because of the costs and effort required. The recent development of next generation sequencing (NGS) has made this approach much more powerful, by allowing the direct characterization of dozens of samples with several thousand sequences per PCR product, and has the potential to reveal many consumed species simultaneously (DNA metabarcoding). Continual improvement of NGS technologies, on-going decreases in costs and current massive expansion of reference databases make this approach promising. Here we review the power and pitfalls of NGS diet methods. We present the critical factors to take into account when choosing or designing a suitable barcode. Then, we consider both technical and analytical aspects of NGS diet studies. Finally, we discuss the validation of data accuracy including the viability of producing quantitative data.

5.
While the morphological identification of prey remains in predators' faeces is the most commonly used method to study trophic interactions, many studies indicate that this method does not detect all consumed prey. Polymerase chain reaction (PCR)–based methods are increasingly used to detect prey DNA in the predator food bolus and have proven efficient, delivering highly accurate results. When studying complex diet samples, the extraction of total DNA is a critical step, as PCR inhibitors may be co‐extracted. Another critical step involves a careful selection of suitable group‐specific primer sets that should only amplify DNA from the targeted prey taxon. In this study, the food boluses of five Rattus rattus and seven Rattus exulans were analysed using both morphological and molecular methods. We tested a panel of 31 PCR primer pairs targeting bird, invertebrate and plant sequences; four of them were selected to be used as group‐specific primer pairs in PCR protocols. The performances of four DNA extraction protocols (QIAamp® DNA stool mini kit, DNeasy® mericon food kit and two cetyltrimethylammonium bromide‐based methods) were compared using four variables: DNA concentration, A260/A280 absorbance ratio, food compartment analysed (stomach or faecal contents) and total number of prey‐specific PCR amplifications per sample. Our results clearly indicate that the A260/A280 absorbance ratio, which varies between extraction protocols, is positively correlated with the number of PCR amplifications of each prey taxon. We recommend using the DNeasy® mericon food kit (QIAGEN), which yielded results very similar to those achieved with the morphological approach.

6.
DNA barcoding is used in a variety of ecological applications to identify organisms, including partially digested prey items from diet samples. That particular application can enhance the ability to characterize diet and predator–prey dynamics but is problematic when genetic sequences of prey match those of consumer species (i.e., self-DNA). Such a result may indicate cannibalism, but false positives can result from contamination of degraded prey samples with consumer DNA. Here, nuclear-encoded microsatellite markers were used to genotype invasive lionfish, Pterois volitans, consumers and their prey (n = 80 pairs) previously barcoded as lionfish. Cannibalism was confirmed when samples exhibited two or more different alleles between lionfish and prey DNA across multiple microsatellite loci. This occurred in 26.2% of all samples and in 42% of samples for which the data were considered conclusive. These estimates should be considered conservative given rigorous assignment criteria and low allelic diversity in invasive lionfish populations. The highest incidence of cannibalism corresponded to larger sized consumers from areas with high lionfish densities, suggesting cannibalism in northern Gulf of Mexico lionfish is size- and density-dependent. Cannibalism has the potential to influence population dynamics of lionfish which lack native western Atlantic predators. These results also have important implications for interpreting DNA barcoding analysis of diet in other predatory species where cannibalism may be underreported.

7.
Next‐generation sequencing (NGS) technology has extraordinarily enhanced the scope of research in the life sciences. To broaden the application of NGS to systems that were previously difficult to study, we present protocols for processing faecal and swab samples into amplicon libraries amenable to Illumina sequencing. We developed and tested a novel metagenomic DNA extraction approach using solid phase reversible immobilization (SPRI) beads on Western Bluebird (Sialia mexicana) samples stored in RNAlater. Compared with the MO BIO PowerSoil Kit, the current standard for the Human and Earth Microbiome Projects, the SPRI‐based method produced comparable 16S rRNA gene PCR amplification from faecal extractions but significantly greater DNA quality, quantity and PCR success for both cloacal and oral swab samples. We furthermore modified published protocols for preparing highly multiplexed Illumina libraries with minimal sample loss and without post‐adapter ligation amplification. Our library preparation protocol was successfully validated on three sets of heterogeneous amplicons (16S rRNA gene amplicons from SPRI and PowerSoil extractions as well as control arthropod COI gene amplicons) that were sequenced across three independent, 250‐bp, paired‐end runs on Illumina's MiSeq platform. Sequence analyses revealed largely equivalent results from the SPRI and PowerSoil extractions. Our comprehensive strategies focus on maximizing efficiency and minimizing costs. In addition to increasing the feasibility of using minimally invasive sampling and NGS capabilities in avian research, our methods are notably not avian‐specific and thus applicable to many research programmes that involve DNA extraction and amplicon sequencing.

8.
Molecular techniques have become an important tool to empirically assess feeding interactions. The increased usage of next‐generation sequencing approaches has stressed the need for fast DNA extraction that does not compromise DNA quality. Dietary samples here pose a particular challenge, as these demand high‐quality DNA extraction procedures for obtaining the minute quantities of short‐fragmented food DNA. Automatic high‐throughput procedures significantly decrease time and costs and allow for standardization of extracting total DNA. However, these approaches have not yet been evaluated for dietary samples. We tested the efficiency of an automatic DNA extraction platform and a traditional CTAB protocol, employing a variety of dietary samples including invertebrate whole‐body extracts as well as invertebrate and vertebrate gut content samples and feces. Extraction efficacy was quantified using the proportions of successful PCR amplifications of both total and prey DNA, and cost was estimated in terms of time and material expense. For extraction of total DNA, the automated platform performed better for both invertebrate and vertebrate samples. This was also true for prey detection in vertebrate samples. For the dietary analysis in invertebrates, there is still room for improvement when using the high‐throughput system for optimal DNA yields. Overall, the automated DNA extraction system proved to be a promising alternative to labor‐intensive, low‐throughput manual extraction methods such as CTAB. It opens up the opportunity for extensive use of this cost‐efficient and innovative methodology, at low contamination risk, in trophic ecology as well.

9.
Spiders are the most common and predominant predators in terrestrial ecosystems. The predatory behavior of spiders affects the energy flow across the food web within an ecosystem. Traditional methods for analyzing spider diets, such as field observation, anatomical examination and faeces analysis, are not suitable for spiders because of their special dietary behavior. The molecular method based on prey-specific DNA primers also seems inefficient, despite its wide application in diet analysis. As next-generation sequencing (NGS) technology becomes prevalent in many different areas, several cases of NGS-based analysis of mammal diets have been published. This study analyzed the diet differences of Pardosa pseudoannulata (Araneae: Lycosidae) in four habitats (a wetland, a tea plantation, an alpine meadow and a paddy field) by using NGS technology combined with the DNA barcode method. The results suggested that Pardosa pseudoannulata feeds on a broad range of prey, and 7 orders and 24 families of insects were detected in the four investigated habitats. Moreover, the diet diversity of Pardosa pseudoannulata was found to be greatly influenced by its living environment and season. In short, this study established an NGS-based methodology for spider diet analysis, and the results provide basic material to inform the protection and utilization of Pardosa pseudoannulata as a potential eco-friendly predator against pests.

10.
Somatic activating GNAS mutations cause McCune-Albright syndrome (MAS). Owing to low mutation abundance, mutant-specific enrichment procedures, such as the peptide nucleic acid (PNA) method, are required to detect mutations in peripheral blood. Next generation sequencing (NGS) can analyze millions of PCR amplicons independently, and is thus expected to detect low-abundance GNAS mutations quantitatively. In the present study, we aimed to develop an NGS-based method to detect low-abundance somatic GNAS mutations. PCR amplicons encompassing exons 8 and 9 of GNAS, in which most activating mutations occur, were sequenced on the MiSeq instrument. As expected, our NGS-based method could sequence the GNAS locus with very high read depth (approximately 100,000) and low error rate. A serial dilution study using cloned mutant and wildtype DNA samples showed a linear correlation between dilution and measured mutation abundance, indicating that quantification of the mutation is reliable. Using the serially diluted samples, the detection limits of three mutation detection methods (the PNA method, NGS, and combinatory use of PNA and NGS [PNA-NGS]) were determined. The lowest detectable mutation abundance was 1% for the PNA method, 0.03% for NGS and 0.01% for PNA-NGS. Finally, we analyzed 16 MAS patient-derived leukocytic DNA samples with the three methods and compared their mutation detection rates. The mutation detection rates of the PNA method, NGS and PNA-NGS in the 16 patient-derived peripheral blood samples were 56%, 63% and 75%, respectively. In conclusion, NGS can detect somatic activating GNAS mutations quantitatively and sensitively from peripheral blood samples. At present, the PNA-NGS method is likely the most sensitive method to detect low-abundance GNAS mutations.

11.
As one of the most abundant predators of insects in terrestrial ecosystems, spiders have long received much attention from agricultural scientists and ecologists. Do spiders have a certain controlling effect on the main insect pests of concern in farmland ecosystems? Answering this question requires us to fully understand the prey spectrum of spiders. Next‐generation sequencing (NGS) has been successfully employed to analyze spider prey spectra. However, the high sequencing costs make it difficult to analyze the prey spectrum of various spider species with large samples in a given farmland ecosystem. We performed a comparative analysis of the prey spectra of Ovia alboannulata (Araneae, Lycosidae) using NGS with individual and mixed DNA samples to demonstrate which treatment was better for determining the spider prey spectra in the field. We collected spider individuals from tea plantations, and two treatments were then carried out: (1) The DNA was extracted from the spiders individually and then sequenced separately (DESISS) and (2) the DNA was extracted from the spiders individually and then mixed and sequenced (DESIMS). The results showed that the number of prey families obtained by the DESISS treatment was approximately twice that obtained by the DESIMS treatment. Therefore, the DESIMS treatment greatly underestimated the prey composition of the spiders, although its sequencing costs were obviously lower. However, the relative abundance of prey sequences detected in the two treatments was slightly different only at the family level. Therefore, we concluded that if our purpose were to obtain the most accurate prey spectrum of the spiders, the DESISS treatment would be the best choice. However, if our purpose were to obtain only the relative abundance of prey sequences of the spiders, the DESIMS treatment would also be an option. 
The present study provides an important reference for choosing applicable methods to analyze the prey spectra and food web compositions of animals in ecosystems.

12.
Testing for deviations from Hardy–Weinberg equilibrium (HWE) is a common practice for quality control in genetic studies. Variable sites violating HWE may be identified as technical errors in the sequencing or genotyping process, or they may be of particular evolutionary interest. Large‐scale genetic studies based on next‐generation sequencing (NGS) methods have become more prevalent as cost is decreasing but these methods are still associated with statistical uncertainty. The large‐scale studies usually consist of samples from diverse ancestries that make the existence of some degree of population structure almost inevitable. Precautions are therefore needed when analysing these data sets, as population structure causes deviations from HWE. Here we propose a method that takes population structure into account in the testing for HWE, such that other factors causing deviations from HWE can be detected. We show the effectiveness of PCAngsd in low‐depth NGS data, as well as in genotype data, for both simulated and real data sets, where the use of genotype likelihoods enables us to model the uncertainty.
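To make the point about population structure concrete, the following is a minimal Python sketch of the classical chi-square HWE test on called genotype counts. It is not the structure-aware, genotype-likelihood method the abstract proposes; the function name `hwe_chi_square` and the example counts are illustrative assumptions.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Classical 1-d.f. chi-square test for HWE from genotype counts.

    Illustrative only: assumes called genotypes and a single unstructured
    population, which is exactly what the abstract says often does not hold.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical example: two subpopulations, each fixed for a different allele,
# pooled into one sample. No heterozygotes are observed although 50 are expected.
chi2_pooled, p_pooled = hwe_chi_square(50, 0, 50)
```

In this made-up pooled sample the test rejects HWE strongly even though each subpopulation on its own shows no deviation, which is the kind of structure-induced signal the abstract's method is designed to separate from other causes.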

13.
Understanding predator–prey interactions is a major challenge in ecological studies. In particular, the accurate identification of prey is a fundamental requirement in elucidating food‐web structure. This study took a molecular approach in determining the species identity of consumed prey items of a freshwater carnivorous fish (largemouth bass, Micropterus salmoides), according to their size class. Thirty randomly selected gut samples were categorized into three size classes, based on the total length of the bass. Using the universal primer for the mtDNA cytochrome oxidase I (COI) region, polymerase chain reaction (PCR) amplification was performed on unidentified gut contents, and the products were sequenced after cloning. Two gut samples were completely empty, and DNA from 27 of the remaining 28 gut samples was successfully amplified by PCR (success rate: 96.4%). A total of 308 clones were obtained, containing DNA from 26 prey items. Based on BLAST and BOLD database searches, these comprised four phyla, seven classes, 12 orders, and 12 families. The results indicate that largemouth bass show selective preferences in prey item consumption as they mature. These results corroborate the hypothesis of an ontogenetic diet shift, derived through other methodological approaches. Despite the practical limitations inherent in DNA barcoding analysis, high‐resolution (i.e., species level) identification was possible, and the predation patterns of predators of different sizes were identifiable. The utilization of this method is strongly recommended for determining specific predator–prey relationships in complex freshwater ecosystems.

14.
Diatoms are frequently used for water quality assessments; however, identification to species level is difficult, time‐consuming and needs in‐depth knowledge of the organisms under investigation, as nonhomoplastic species‐specific morphological characters are scarce. We here investigate how identification methods based on DNA (metabarcoding using NGS platforms) perform in comparison to morphological diatom identification and propose a workflow to optimize diatom fresh water quality assessments. Diatom diversity at seven different sites along the course of the river system Odra and Lusatian Neisse from the source to the mouth is analysed with DNA and morphological methods, which are compared. The NGS technology almost always leads to a higher number of identified taxa (270 via NGS vs. 103 by light microscopy, LM), whose presence could subsequently be verified by LM. The sequence‐based approach allows for a much more graduated insight into the taxonomic diversity of the environmental samples. Taxa retrieval varies considerably throughout the river system, depending on species occurrences and the taxonomic depth of the reference databases. Mostly rare taxa from oligotrophic parts of the river systems are less well represented in the reference database used. A workflow for DNA‐based NGS diatom identification is presented. 28 000 diatom sequences were evaluated. Our findings provide evidence that metabarcoding of diatoms via NGS sequencing of the V4 region (18S) has a great potential for water quality assessments and could complement and maybe even improve the identification via light microscopy.

15.
All next-generation sequencing (NGS) procedures include assays performed at the laboratory bench ("wet bench") and data analyses conducted using bioinformatics pipelines ("dry bench"). Both elements are essential to produce accurate and reliable results, which are particularly critical for clinical laboratories. Targeted NGS technologies have increasingly found favor in oncology applications to help advance precision medicine objectives, yet the methods often involve disconnected and variable wet and dry bench workflows and uncoordinated reagent sets. In this report, we describe a method for sequencing challenging cancer specimens with a 21-gene panel as an example of a comprehensive targeted NGS system. The system integrates functional DNA quantification and qualification, single-tube multiplexed PCR enrichment, and library purification and normalization using analytically-verified, single-source reagents with a standalone bioinformatics suite. As a result, accurate variant calls from low-quality and low-quantity formalin-fixed, paraffin-embedded (FFPE) and fine-needle aspiration (FNA) tumor biopsies can be achieved. The method can routinely assess cancer-associated variants from an input of 400 amplifiable DNA copies, and is modular in design to accommodate new gene content. Two different types of analytically-defined controls provide quality assurance and help safeguard call accuracy with clinically-relevant samples. A flexible "tag" PCR step embeds platform-specific adaptors and index codes to allow sample barcoding and compatibility with common benchtop NGS instruments. Importantly, the protocol is streamlined and can produce 24 sequence-ready libraries in a single day. Finally, the approach links wet and dry bench processes by incorporating pre-analytical sample quality control results directly into the variant calling algorithms to improve mutation detection accuracy and differentiate false-negative and indeterminate calls. 
This targeted NGS method uses advances in both wetware and software to achieve high-depth, multiplexed sequencing and sensitive analysis of heterogeneous cancer samples for diagnostic applications.

16.
Chemical mutagenesis is routinely used to create large numbers of rare mutations in plant and animal populations, which can be subsequently subjected to selection for beneficial traits and phenotypes that enable the characterization of gene functions. Several next‐generation sequencing (NGS)‐based target enrichment methods have been developed for the detection of mutations in target DNA regions. However, most of these methods aim to sequence a large number of target regions from a small number of individuals. Here, we demonstrate an effective and affordable strategy for the discovery of rare mutations in a large sodium azide‐induced mutant rice population (F2). The integration of multiplex, semi‐nested PCR combined with NGS library construction allowed for the amplification of multiple target DNA fragments for sequencing. The 8 × 8 × 8 tridimensional DNA sample pooling strategy enabled us to obtain DNA sequences of 512 individuals while only sequencing 24 samples. A stepwise filtering procedure was then elaborated to eliminate most of the false positives expected to arise through sequencing error, and the application of a simple Student's t‐test against position‐prone error allowed for the discovery of 16 mutations from 36 enriched targeted DNA fragments of 1024 mutagenized rice plants, all without any false calls.
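The arithmetic behind the 8 × 8 × 8 pooling (512 individuals, 8 + 8 + 8 = 24 pools) can be sketched in Python as follows. This is only an illustration of the general three-dimensional pooling idea; the pool labels, the index-to-coordinate mapping, and the function names `pools_for` and `decode` are assumptions, not the authors' protocol.

```python
SIDE = 8  # grid edge length; 8 x 8 x 8 = 512 individuals, as in the abstract


def pools_for(individual: int) -> tuple[str, str, str]:
    """Return the three pool labels (one per axis) that contain this individual."""
    x, rest = divmod(individual, SIDE * SIDE)
    y, z = divmod(rest, SIDE)
    return (f"X{x}", f"Y{y}", f"Z{z}")


def decode(positive_pools: set[str]) -> list[int]:
    """Return every individual whose three pools all tested positive."""
    return [i for i in range(SIDE ** 3) if set(pools_for(i)) <= positive_pools]


# Every distinct pool label across the whole grid: 8 per axis, 24 in total.
ALL_POOLS = {label for i in range(SIDE ** 3) for label in pools_for(i)}
```

Under these assumptions a variant carried by a single individual lights up exactly one pool per axis and decodes uniquely. If two individuals that share no coordinate carry the same variant, six pools are positive and `decode` returns eight candidates, so rare mutations are the favourable case for this design.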

17.
Dietary metabarcoding has vastly improved our ability to analyse the diets of animals, but it is hampered by a plethora of technical limitations including potentially reduced data output due to the disproportionate amplification of the DNA of the focal predator, here termed “the predator problem”. We review the various methods commonly used to overcome this problem, from deeper sequencing to exclusion of predator DNA during PCR, and how they may interfere with increasingly common multipredator-taxon studies. We suggest that multiprimer approaches with an emphasis on achieving both depth and breadth of prey detections may overcome the issue to some extent, although multitaxon studies require further consideration, as highlighted by an empirical example. We also review several alternative methods for reducing the prevalence of predator DNA that are conceptually promising but require additional empirical examination. The predator problem is a key constraint on molecular dietary analyses but, through this synthesis, we hope to guide researchers in overcoming this in an effective and pragmatic way.

18.
The application of high‐throughput sequencing‐based approaches to DNA extracted from environmental samples such as gut contents and faeces has become a popular tool for studying dietary habits of animals. Due to the high resolution and prey detection capacity they provide, both metabarcoding and shotgun sequencing are increasingly used to address ecological questions grounded in dietary relationships. Despite their great promise in this context, recent research has unveiled how a wealth of biological (related to the study system) and technical (related to the methodology) factors can distort the signal of taxonomic composition and diversity. Here, we review these studies in the light of high‐throughput sequencing‐based assessment of trophic interactions. We address how the study design can account for distortion factors, and how acknowledging limitations and biases inherent to sequencing‐based diet analyses is essential for obtaining reliable results and thus drawing appropriate conclusions. Furthermore, we suggest strategies to minimize the effect of distortion factors, measures to increase reproducibility, replicability and comparability of studies, and options to scale up DNA sequencing‐based diet analyses. In doing so, we aim to aid end‐users in designing reliable diet studies by informing them about the complexity and limitations of DNA sequencing‐based diet analyses, and encourage researchers to create and improve tools that will eventually drive this field to its maturity.

19.
The applicability of species-specific primers to study feeding interactions is restricted to those ecosystems where the targeted prey species occur. Therefore, group-specific primer pairs, targeting higher taxonomic levels, are often desired to investigate interactions in a range of habitats that do not share the same species but the same groups of prey. Such primers are also valuable to study the diet of generalist predators when next generation sequencing approaches cannot be applied beneficially. Moreover, due to the large range of prey consumed by generalists, it is impossible to investigate the breadth of their diet with species-specific primers, even when multiplexing them. However, only a few group-specific primers are available to date, and important groups of prey such as flying insects have rarely been targeted. Our aim was to fill this gap and develop group-specific primers suitable to detect and identify the DNA of common taxa of flying insects. The primers were combined in two multiplex PCR systems, which allow a time- and cost-effective screening of samples for DNA of the dipteran subsection Calyptratae (including Anthomyiidae, Calliphoridae, Muscidae), other common dipteran families (Phoridae, Syrphidae, Bibionidae, Chironomidae, Sciaridae, Tipulidae), three orders of flying insects (Hymenoptera, Lepidoptera, Plecoptera) and coniferous aphids within the genus Cinara. The two PCR assays were highly specific and sensitive and their suitability to detect prey was confirmed by testing field-collected dietary samples from arthropods and vertebrates. The PCR assays presented here allow targeting prey at higher taxonomic levels such as family or order and therefore improve our ability to assess (trophic) interactions with flying insects in terrestrial and aquatic habitats.

20.
Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose 'Copy Number estimation by a Mixture Of PoissonS' (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark datasets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1-FDR) and recall for both gains and losses in all benchmark data sets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor.
