Large‐scale genomic studies of wild animal populations are often limited by access to high‐quality DNA. Although noninvasive samples, such as faeces, can be readily collected, DNA from the sample producers is usually present in low quantities, fragmented, and contaminated by microbial and dietary DNA. Hybridization capture can help to overcome these impediments by increasing the proportion of subject DNA prior to high‐throughput sequencing. Here we evaluate a key design variable for hybridization capture, the number of rounds of capture, by testing whether one or two rounds are more appropriate, given varying sample quality (as measured by the ratio of subject to total DNA). We used a set of 1,780 quality‐assessed wild chimpanzee (Pan troglodytes schweinfurthii) faecal samples and chose 110 samples of varying quality for exome capture and sequencing. We used multiple regression to assess the effects of the ratio of subject to total DNA (sample quality), rounds of capture and sequencing effort on the number of unique exome reads sequenced. We not only show that one round of capture is preferable when the proportion of subject DNA in a sample is above ~2%–3%, but also explore various types of bias introduced by capture, and develop a model that predicts the sequencing effort necessary for a desired data yield from samples of a given quality. Thus, our results provide a useful guide and pave a methodological way forward for researchers wishing to plan similar hybridization capture studies.

Sequence capture methods for targeted next generation sequencing promise to massively reduce the cost of genomics projects compared to untargeted sequencing. However, evaluated capture methods specifically dedicated to biologically relevant genomic regions are rare. Whole exome capture has been shown to be a powerful tool to discover the genetic origin of disease and provides a reduction in target size and thus calculative sequencing capacity of > 90-fold compared to untargeted whole genome sequencing. For further cost reduction, a valuable complementary approach is the analysis of smaller, relevant gene subsets in large cohorts of samples. However, effective adjustment of target sizes and sample numbers is hampered by the limited scalability of enrichment systems. We report a highly scalable and automated method to capture a 480 Kb exome subset of 115 cancer-related genes using microfluidic DNA arrays. The arrays are adaptable from 125 Kb to 1 Mb target size and/or one to eight samples without barcoding strategies, representing a further 26- to 270-fold reduction of calculative sequencing capacity compared to whole exome sequencing. Illumina GAII analysis of a HapMap genome enriched for this exome subset revealed a completeness of > 96%. Uniformity was such that > 68% of exons had at least half the median depth of coverage. An analysis of reference SNPs revealed a sensitivity of up to 93% and a specificity of 98.2% or higher.

The ability to generate genomic data from wild animal populations has the potential to give unprecedented insight into the population history and dynamics of species in their natural habitats. However, for many species, it is impossible legally, ethically or logistically to obtain tissue samples of quality sufficient for genomic analyses. In this study we evaluate the success of multiple sources of genetic material (faeces, urine, dentin and dental calculus) and several capture methods (shotgun, whole‐genome, exome) in generating genome‐scale data in wild eastern chimpanzees (Pan troglodytes schweinfurthii) from Gombe National Park, Tanzania. We found that urine harbours significantly more host DNA than other sources, leading to broader and deeper coverage across the genome. Urine also exhibited a lower rate of allelic dropout. We found exome sequencing to be far more successful than both shotgun sequencing and whole‐genome capture at generating usable data from low‐quality samples such as faeces and dental calculus. These results highlight urine as a promising and untapped source of DNA that can be noninvasively collected from wild populations of many species.

The mountain bongo antelope Tragelaphus eurycerus isaaci has rapidly declined in recent decades, due to a combination of hunting, habitat degradation and disease. Endemic to Kenya, mountain bongo populations have shrunk to approximately 100 individuals now mainly confined to the Aberdares mountain ranges. Indirect observation of bongo signs (e.g. tracks, dung) can be misleading, thus methods to ensure reliable species identification, such as DNA-based techniques, are necessary to effectively study and monitor this species. We assessed bongo presence in four mountain habitats in Kenya (Mount Kenya National Park, Aberdare National Park, Eburu and Mau forests) and carried out a preliminary analysis of genetic variation by examining 466 bp of the first domain of the mtDNA control region using DNA extracted from faecal samples. Of the 201 dung samples collected in the field, 102 samples were molecularly identified as bongo, 97 as waterbuck, one as African buffalo and one as Aders’ duiker. Overall species-identification accuracy by experienced trackers was 64%, with very high error of commission when identifying bongo sign (37%), and high error of omission for waterbuck sign (82%), suggesting that the two species’ signs are easily confused. Despite high variation in the mtDNA control region in most antelope species, our results suggest low genetic variation in mountain bongo, as only two haplotypes were detected in the 102 samples analyzed. In contrast, the analysis of 63 waterbuck samples from the same sites revealed 21 haplotypes. Nevertheless, further examination using nuclear DNA markers (e.g. microsatellites) in a multi-locus approach is still required, especially because the use of mitochondrial DNA can result in population overestimation, as distinct dung samples can originate from the same individual.
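
The commission and omission errors quoted in this abstract can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions (commission: share of field calls for a species that the molecular result contradicts; omission: share of molecularly confirmed samples of a species that trackers called something else); the abstract does not spell out its own definitions. The function name `identification_errors` and the ten sample records are hypothetical and are not the study's data.

```python
from collections import Counter

def identification_errors(pairs, species):
    """Return (commission, omission) error rates for one species.

    pairs: iterable of (field_call, molecular_id) tuples, one per dung sample.
    Definitions are the conventional ones and are an assumption here.
    """
    counts = Counter(pairs)
    # Samples the trackers labelled as this species.
    called = sum(n for (field, _), n in counts.items() if field == species)
    # Samples the molecular assay assigned to this species.
    actual = sum(n for (_, molecular), n in counts.items() if molecular == species)
    correct = counts[(species, species)]
    commission = (called - correct) / called if called else float("nan")
    omission = (actual - correct) / actual if actual else float("nan")
    return commission, omission

# Hypothetical records (field call, molecular identification); not real data.
samples = (
    [("bongo", "bongo")] * 6
    + [("bongo", "waterbuck")] * 3
    + [("waterbuck", "waterbuck")] * 1
)

accuracy = sum(field == molecular for field, molecular in samples) / len(samples)
bongo_commission, bongo_omission = identification_errors(samples, "bongo")
waterbuck_commission, waterbuck_omission = identification_errors(samples, "waterbuck")

print(f"overall accuracy: {accuracy:.2f}")
print(f"bongo commission error: {bongo_commission:.2f}")
print(f"waterbuck omission error: {waterbuck_omission:.2f}")
```

In this toy set the two errors mirror each other: every waterbuck sample mislabelled as bongo raises both the bongo commission error and the waterbuck omission error, which is the pattern the abstract describes.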



Human exome resequencing using commercial target capture kits has been and is being used for sequencing large numbers of individuals to search for variants associated with various human diseases. We rigorously evaluated the capabilities of two solution exome capture kits. These analyses help clarify the strengths and limitations of those data as well as systematically identify variables that should be considered in the use of those data.

The large genome size of many species hinders the development and application of genomic tools to study them. For instance, loblolly pine (Pinus taeda L.), an ecologically and economically important conifer, has a large and as yet uncharacterized genome of 21.7 Gbp. To characterize the pine genome, we performed exome capture and sequencing of 14 729 genes derived from an assembly of expressed sequence tags. Efficiency of sequence capture was evaluated and shown to be similar across samples with increasing levels of complexity, including haploid cDNA, haploid genomic DNA and diploid genomic DNA. However, this efficiency was severely reduced for probes that overlapped multiple exons, presumably because intron sequences hindered probe:exon hybridizations. Such regions could not be entirely avoided during probe design, because of the lack of a reference sequence. To improve the throughput and reduce the cost of sequence capture, a method to multiplex the analysis of up to eight samples was developed. Sequence data showed that multiplexed capture was reproducible among 24 haploid samples, and can be applied for high‐throughput analysis of targeted genes in large populations. Captured sequences were de novo assembled, resulting in 11 396 expanded and annotated gene models, significantly improving the knowledge about the pine gene space. Interspecific capture was also evaluated: over 98% of the probes designed from P. taeda that were efficient in sequence capture were also suitable for analysis of the related species Pinus elliottii Engelm.

Advanced resources for genome‐assisted research in barley (Hordeum vulgare) including a whole‐genome shotgun assembly and an integrated physical map have recently become available. These have made possible studies that aim to assess genetic diversity or to isolate single genes by whole‐genome resequencing and in silico variant detection. However, such an approach remains expensive given the 5 Gb size of the barley genome. Targeted sequencing of the mRNA‐coding exome reduces barley genomic complexity more than 50‐fold, thus dramatically reducing this heavy sequencing and analysis load. We have developed and employed an in‐solution hybridization‐based sequence capture platform to selectively enrich for a 61.6 megabase coding sequence target that includes predicted genes from the genome assembly of the cultivar Morex as well as publicly available full‐length cDNAs and de novo assembled RNA‐Seq consensus sequence contigs. The platform provides a highly specific capture with substantial and reproducible enrichment of targeted exons, both for cultivated barley and related species. We show that this exome capture platform provides a clear path towards a broader and deeper understanding of the natural variation residing in the mRNA‐coding part of the barley genome and will thus constitute a valuable resource for applications such as mapping‐by‐sequencing and genetic diversity analyses.
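
The "more than 50‐fold" figure can be sanity-checked from the two sizes given in the abstract. The sketch below simply divides genome size by target size; this is an assumed reading of "complexity reduction", and the abstract may define it differently (for example, relative to the bases actually captured). The function name `fold_reduction` is hypothetical.

```python
def fold_reduction(genome_bp: float, target_bp: float) -> float:
    """Ratio of whole-genome size to capture-target size."""
    if target_bp <= 0:
        raise ValueError("target size must be positive")
    return genome_bp / target_bp

BARLEY_GENOME_BP = 5e9      # "5 Gb" as stated in the abstract
CAPTURE_TARGET_BP = 61.6e6  # "61.6 megabase" target as stated in the abstract

fold = fold_reduction(BARLEY_GENOME_BP, CAPTURE_TARGET_BP)
print(f"approximate reduction: {fold:.1f}-fold")
```

Under this simple ratio the reduction comes out at roughly 81-fold, which is consistent with the abstract's "more than 50‐fold" claim.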

Hybridization-based target enrichment protocols require relatively large starting amounts of genomic DNA, which are not always available. Here, we tested three approaches to pre-capture library preparation starting from 10 ng of genomic DNA: (i and ii) whole-genome amplification of DNA samples with REPLI-g (Qiagen) and GenomePlex (Sigma) kits followed by standard library preparation, and (iii) library construction with a low-input-oriented ThruPLEX kit (Rubicon Genomics). Exome capture with Agilent SureSelectXT2 Human AllExon v4+UTRs capture probes and HiSeq2000 sequencing were performed for test libraries along with the control library prepared from 1 µg of starting DNA. Tested protocols were characterized in terms of mapping efficiency, enrichment ratio, coverage of the target region, and reliability of SNP genotyping. REPLI-g- and ThruPLEX-FD-based protocols seem to be adequate solutions for exome sequencing of low input samples.

Remote biopsy darting of polar bears (Ursus maritimus) is less invasive and time intensive than physical capture and is therefore useful when capture is challenging or unsafe. We worked with two manufacturers to develop a combination biopsy and marking dart for use on polar bears. We had an 80% success rate of collecting a tissue sample with a single biopsy dart and collected tissue samples from 143 polar bears on land, in water, and on sea ice. Dye marks ensured that 96% of the bears were not resampled during the same sampling period, and we recovered 96% of the darts fired. Biopsy heads with 5 mm diameters collected an average of 0.12 g of fur, tissue, and subcutaneous adipose tissue, while biopsy heads with 7 mm diameters collected an average of 0.32 g. Tissue samples were 99.3% successful (142 of 143 samples) in providing a genetic and sex identification of individuals. We had a 64% success rate collecting adipose tissue and we successfully examined fatty acid signatures in all adipose samples. Adipose lipid content values were lower compared to values from immobilized or harvested polar bears, indicating that our method was not suitable for quantifying adipose lipid content.

Despite their suitability for studying evolution, many conifer species have large and repetitive giga-genomes (16–31 Gbp) that create hurdles to producing high coverage SNP data sets that capture diversity from across the entirety of the genome. Due in part to multiple ancient whole genome duplication events, gene family expansion and subsequent evolution within Pinaceae, false diversity from the misalignment of paralog copies creates further challenges in accurately and reproducibly inferring evolutionary history from sequence data. Here, we leverage the cost-saving benefits of pool-seq and exome-capture to discover SNPs in two conifer species, Douglas-fir (Pseudotsuga menziesii var. menziesii (Mirb.) Franco, Pinaceae) and jack pine (Pinus banksiana Lamb., Pinaceae). We show, using minimal baseline filtering, that allele frequencies estimated from pooled individuals show a strong, positive correlation with those estimated by sequencing the same population as individuals (r > .948), on par with such comparisons made in model organisms. Further, we highlight the utility of haploid megagametophyte tissue for identifying sites that are probably due to misaligned paralogs. Together with additional minor filtering, we show that it is possible to remove many of the loci with large frequency estimate discrepancies between individual and pooled sequencing approaches, improving the correlation further (r > .973). Our work addresses bioinformatic challenges in non-model organisms with large and complex genomes, highlights the use of megagametophyte tissue for the identification of paralogous artefacts, and suggests the combination of pool-seq and exome capture to be robust for further evolutionary hypothesis testing in these systems.

It is often difficult to determine optimal sampling design for non-invasive genetic sampling, especially when dealing with rare or elusive species depleted of genetic diversity. To address this problem, we ran a hair-snag pilot study on the remnant Apennine brown bear population. We used occupancy models to estimate the performance of an improved field protocol, a meta-analysis approach to indirectly model capture probability, and simulations to evaluate the effect of genotyping errors on the accuracy of capture-recapture population estimates. In spring 2007 we collected 70 bear hair samples in fifteen 5 × 5 km cells, using five 10-day trapping sessions. Bear detectability was higher in 2007 than in a previous attempt on the same population in 2004, reflecting improved field protocols and sampling design. However, individual capture probability was 0.136 (95% CI = 0.120–0.152), still below the minimum requirements of capture-mark-recapture closed population models. We genotyped hair samples (n = 63) at 9 microsatellite loci, obtaining 94% Polymerase Chain Reaction success, and 13 bear genotypes. Estimated PIDsib was 0.00594, and per-genotype error rate was 0.13, corresponding to a 99% probability of correct individual identification. Simulation studies showed that the effect of non-corrected or filtered genetic errors on the accuracy of population estimates was negligible only when individual capture probability was >0.2. Our results underline how the interaction among field protocols, sampling strategies and genotyping errors may affect the accuracy of DNA-based estimates of small and genetically depleted populations, and warn about the feasibility of a survey using only traditional hair-snag sampling. In this and similar cases, indications from pilot studies can provide cost-effective means to evaluate the efficiency of designed sampling and modelling procedures.
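
For readers unfamiliar with PIDsib (the probability that two siblings share an identical multilocus genotype), the following is a minimal sketch of one widely used estimator: per locus, 0.25 + 0.5·Σp² + 0.5·(Σp²)² − 0.25·Σp⁴, multiplied across loci under an assumption of independence. The abstract does not state which estimator or software was used, so this is an assumption. The function names and the allele frequencies are hypothetical and are not the study's data.

```python
import math

def pid_sib_locus(allele_freqs):
    """Single-locus probability of identity among siblings.

    allele_freqs: allele frequencies at one locus; must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    sum_p2 = sum(p ** 2 for p in allele_freqs)
    sum_p4 = sum(p ** 4 for p in allele_freqs)
    return 0.25 + 0.5 * sum_p2 + 0.5 * sum_p2 ** 2 - 0.25 * sum_p4

def pid_sib(loci):
    """Multilocus PIDsib, assuming loci are independent."""
    return math.prod(pid_sib_locus(freqs) for freqs in loci)

# Hypothetical panel: 9 loci, each with two equally frequent alleles.
hypothetical_loci = [[0.5, 0.5]] * 9

print(f"single-locus PIDsib: {pid_sib_locus([0.5, 0.5]):.5f}")
print(f"9-locus PIDsib: {pid_sib(hypothetical_loci):.5f}")
```

Low allelic diversity keeps each single-locus value high, so a genetically depleted population needs more loci to reach a given multilocus PIDsib, which is why this statistic matters for the sampling design discussed in the abstract.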

Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples.
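
The link between enrichment and sequencing cost can be made concrete with a small sketch. It assumes that fold enrichment is the ratio of post-capture to pre-capture endogenous read fractions and that the reads required scale inversely with the endogenous fraction; it ignores duplicates and library complexity, which the abstract notes are included in its mapping figures. The function names and the 30% post-capture value are hypothetical; only the 1.2% pre-capture average comes from the abstract.

```python
import math

def fold_enrichment(pre_fraction: float, post_fraction: float) -> float:
    """Ratio of endogenous read fractions after versus before capture."""
    return post_fraction / pre_fraction

def reads_needed(target_endogenous_reads: int, endogenous_fraction: float) -> int:
    """Total reads to sequence to expect a given number of endogenous reads."""
    return math.ceil(target_endogenous_reads / endogenous_fraction)

PRE_CAPTURE_FRACTION = 0.012   # average reported in the abstract (1.2%)
POST_CAPTURE_FRACTION = 0.30   # hypothetical library, for illustration only

enrichment = fold_enrichment(PRE_CAPTURE_FRACTION, POST_CAPTURE_FRACTION)
reads_before = reads_needed(1_000_000, PRE_CAPTURE_FRACTION)
reads_after = reads_needed(1_000_000, POST_CAPTURE_FRACTION)

print(f"enrichment: {enrichment:.0f}-fold")
print(f"reads for 1M endogenous reads, pre-capture: {reads_before:,}")
print(f"reads for 1M endogenous reads, post-capture: {reads_after:,}")
```

Under these assumptions the saving in sequencing effort equals the fold enrichment, so the cost argument in the abstract follows directly from the change in endogenous fraction.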

Nearly 25 years ago, Allan Wilson and colleagues isolated DNA sequences from museum specimens of kangaroo rats (Dipodomys panamintinus) and compared these sequences with those from freshly collected animals (Thomas et al. 1990). The museum specimens had been collected up to 78 years earlier, so the two samples provided a direct temporal comparison of patterns of genetic variation. This was not the first time DNA sequences had been isolated from preserved material, but it was the first time it had been carried out with a population sample. Population geneticists often try to make inferences about the influence of historical processes such as selection, drift, mutation and migration on patterns of genetic variation in the present. The work of Wilson and colleagues was important in part because it suggested a way in which population geneticists could actually study genetic change in natural populations through time, much the same way that experimentalists can do with artificial populations in the laboratory. Indeed, the work of Thomas et al. (1990) spawned dozens of studies in which museum specimens were used to compare historical and present‐day genetic diversity (reviewed in Wandeler et al. 2007). All of these studies, however, were limited by the same fundamental problem: old DNA is degraded into short fragments. As a consequence, these studies mostly involved PCR amplification of short templates, usually short stretches of mitochondrial DNA or microsatellites. In this issue, Bi et al. (2013) report a breakthrough that should open the door to studies of genomic variation in museum specimens. They used target enrichment (exon capture) and next‐generation (Illumina) sequencing to compare patterns of genetic variation in historic and present‐day population samples of alpine chipmunks (Tamias alpinus) (Fig. 1). The historic samples came from specimens collected in 1915, so the temporal span of this comparison is nearly 100 years.

PCR-based methods for rRNA gene analysis have been widely used to study microbial diversity. However, the analysis is difficult when the DNA content of a sample is too low to be amplified by conventional PCR. Nested PCR offers the advantage of higher sensitivity: it can detect target DNA at several-fold lower concentrations than conventional PCR. However, the amplification bias and other factors associated with nested PCR that potentially affect the measurement of sample diversity have received little attention. Here, nested PCR was compared to reconditioning PCR, which is based on conventional PCR and reduces heteroduplex formation. We investigated the use of both nested and reconditioning PCR methods to construct clone libraries of 16S rRNA genes from four swimming pool water samples. Abundances of OTUs (operational taxonomic units) were correlated between the libraries (r² = 0.88, P < 0.0005), and some OTUs had equivalent abundances in the two libraries according to the Chi-square test. Differences in taxonomic groups, as well as in diversity and richness estimators, were compared by paired t-test and the Wilcoxon test, respectively. There were no significant differences between the clone libraries produced by the two PCR methods. The results of ∫-Libshuff analysis suggested that nested PCR has no particular biases in revealing the OTU diversity of a bacterial community. Thus, nested PCR produces a picture congruent with that of reconditioning PCR in microbial community analysis.

Sequence capture across large phylogenetic scales is not easy because hybridization capture is only effective when the genetic distance between the bait and target is small. Here, we propose a simple but effective strategy to tackle this issue: pooling DNA from a number of selected representative species of different clades to prepare PCR‐generated baits to minimize the genetic distance between the bait and target. To demonstrate the utility of this strategy, we newly developed a set of universal nuclear markers (including 94 nuclear protein‐coding genes) for Lepidoptera, a superdiverse insect group. We used a DNA pool from six lepidopteran species (representing six superfamilies) to prepare PCR baits for the 94 markers. These homemade PCR baits were used to capture sequence data from 43 species of 17 lepidopteran families, and 94% of the target loci were recovered. We constructed two data sets from the obtained data (one containing ~90 kb target coding sequences and the other containing ~120 kb target + flanking coding sequences). Both data sets yielded highly similar and well‐resolved trees with 90% of nodes having >95% bootstrap support. Our capture experiment indicated that using DNA mixtures pooled from different clade‐representative species of Lepidoptera to prepare PCR baits can reliably capture a large number of targeted nuclear markers across different Lepidoptera lineages. We hope that this newly developed nuclear marker set will serve as a new phylogenetic tool for Lepidoptera phylogenetics, and the PCR bait preparation strategy can facilitate the application of sequence capture techniques by researchers to accelerate data collection.

Noninvasive genetic sampling, or noninvasive DNA sampling (NDS), can be an effective monitoring approach for elusive, wide‐ranging species at low densities. However, few studies have attempted to maximize sampling efficiency. We present a model for combining sample accumulation and DNA degradation to identify the most efficient (i.e. minimal cost per successful sample) NDS temporal design for capture–recapture analyses. We use scat accumulation and faecal DNA degradation rates for two sympatric carnivores, kit fox (Vulpes macrotis) and coyote (Canis latrans) across two seasons (summer and winter) in Utah, USA, to demonstrate implementation of this approach. We estimated scat accumulation rates by clearing and surveying transects for scats. We evaluated mitochondrial (mtDNA) and nuclear (nDNA) DNA amplification success for faecal DNA samples under natural field conditions for 20 fresh scats/species/season from <1–112 days. Mean accumulation rates were nearly three times greater for coyotes (0.076 scats/km/day) than foxes (0.029 scats/km/day) across seasons. Across species and seasons, mtDNA amplification success was ≥95% through day 21. Fox nDNA amplification success was ≥70% through day 21 across seasons. Coyote nDNA success was ≥70% through day 21 in winter, but declined to <50% by day 7 in summer. We identified a common temporal sampling frame of approximately 14 days that allowed species to be monitored simultaneously, further reducing time, survey effort and costs. Our results suggest that when conducting repeated surveys for capture–recapture analyses, overall cost‐efficiency for NDS may be improved with a temporal design that balances field and laboratory costs along with deposition and degradation rates.
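
The trade-off described in this abstract (longer intervals accumulate more scats per survey, but older scats amplify less often) can be sketched as a cost-per-successful-sample calculation. This is an illustrative toy model, not the authors' model: it assumes a constant deposition rate, one scat cohort per day, a fixed field cost per kilometre surveyed and a fixed laboratory cost per collected scat. The exponential decay curve, the cost figures and all function names are hypothetical; only the coyote accumulation rate (0.076 scats/km/day) comes from the abstract.

```python
import math

def cost_per_successful_sample(
    interval_days,
    accumulation_rate,   # scats per km per day
    transect_km,
    success_at_age,      # function: scat age in days -> amplification probability
    field_cost_per_km,
    lab_cost_per_sample,
):
    """Expected cost per successfully amplified scat for one survey.

    Transects are assumed cleared at day 0 and surveyed after interval_days,
    so collected scats have ages 0, 1, ..., interval_days - 1.
    """
    deposited_per_day = accumulation_rate * transect_km
    collected = deposited_per_day * interval_days
    successful = sum(
        deposited_per_day * success_at_age(age) for age in range(interval_days)
    )
    total_cost = field_cost_per_km * transect_km + lab_cost_per_sample * collected
    return total_cost / successful

def hypothetical_decay(age_days):
    """Assumed amplification success curve; the decay constant is made up."""
    return math.exp(-0.03 * age_days)

COYOTE_RATE = 0.076  # scats/km/day, from the abstract

# Search candidate intervals using hypothetical costs (arbitrary currency units).
best_interval = min(
    range(1, 61),
    key=lambda days: cost_per_successful_sample(
        days, COYOTE_RATE, 100.0, hypothetical_decay, 5.0, 20.0
    ),
)
print(f"lowest-cost interval under these assumptions: {best_interval} days")
```

The optimum depends entirely on the assumed decay curve and cost ratio, so the printed interval should not be compared with the roughly 14-day frame reported in the abstract; the sketch only shows how deposition, degradation and costs interact.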
