Similar articles
1.

Background

Techniques enabling targeted re-sequencing of the protein coding sequences of the human genome on next generation sequencing instruments are of great interest. We conducted a systematic comparison of the solution-based exome capture kits provided by Agilent and Roche NimbleGen. A control DNA sample was captured with all four capture methods and prepared for Illumina GAII sequencing. Sequence data from additional samples prepared with the same protocols were also used in the comparison.

Results

We developed a bioinformatics pipeline for quality control, short read alignment, variant identification and annotation of the sequence data. In our analysis, a larger percentage of the high quality reads from the NimbleGen captures than from the Agilent captures aligned to the capture target regions. High GC content of the target sequence was associated with poor capture success in all exome enrichment methods. Comparison of mean allele balances for heterozygous variants indicated a tendency to have more reference bases than variant bases in the heterozygous variant positions within the target regions in all methods. There was virtually no difference in the genotype concordance compared to genotypes derived from SNP arrays. A minimum of 11× coverage was required to make a heterozygote genotype call with 99% accuracy when compared to common SNPs on genome-wide association arrays.
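The link between read depth and heterozygote calling can be sketched with a simple binomial model. This is only an illustration, not the pipeline used in the study: the function name, the `min_alt_reads` threshold and the `alt_fraction` parameter are assumptions introduced here. Lowering `alt_fraction` below 0.5 mimics the reference-allele excess described above.

```python
from math import comb


def het_detection_probability(depth, min_alt_reads=2, alt_fraction=0.5):
    """Probability of observing at least `min_alt_reads` variant reads
    at a heterozygous site covered by `depth` reads.

    Assumes reads are independent draws and each read carries the
    variant allele with probability `alt_fraction` (hypothetical model).
    """
    # Probability of seeing fewer variant reads than the threshold.
    p_too_few = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    for depth in (5, 10, 11, 20):
        print(depth, het_detection_probability(depth))
```

A real caller would also model base-quality and mapping errors, so the depth at which this toy probability passes a given threshold should not be read as a reproduction of the reported figure.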

Conclusions

Libraries captured with NimbleGen kits aligned more accurately to the target regions. The updated NimbleGen kit most efficiently covered the exome with a minimum coverage of 20×, yet none of the kits captured all the Consensus Coding Sequence annotated exons.

2.

Background  

Whole exome capture sequencing allows researchers to cost-effectively sequence the coding regions of the genome. Although the exome capture sequencing methods have become routine and well established, there is currently a lack of tools specialized for variant calling in this type of data.

3.

Background

The domestic pig (Sus scrofa) is both an important livestock species and a model for biomedical research. Exome sequencing has accelerated identification of protein-coding variants underlying phenotypic traits in human and mouse. We aimed to develop and validate a similar resource for the pig.

Results

We developed probe sets to capture pig exonic sequences based upon the current Ensembl pig gene annotation supplemented with mapped expressed sequence tags (ESTs) and demonstrated proof-of-principle capture and sequencing of the pig exome in 96 pigs, encompassing 24 capture experiments. For most of the samples at least 10x sequence coverage was achieved for more than 90% of the target bases. Bioinformatic analysis of the data revealed over 236,000 high confidence predicted SNPs and over 28,000 predicted indels.
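A statistic such as "at least 10x coverage for more than 90% of the target bases" is a breadth-of-coverage measure. The sketch below shows one way to compute it from a list of per-base depths. The function name and the input layout are assumptions made for illustration and are not taken from the study.

```python
def breadth_at_depth(per_base_depths, min_depth=10):
    """Fraction of target bases covered by at least `min_depth` reads.

    `per_base_depths` is any iterable of integer depths, one per
    target base (hypothetical input format).
    """
    depths = list(per_base_depths)
    if not depths:
        raise ValueError("no target bases supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


if __name__ == "__main__":
    example_depths = [0, 5, 10, 20, 35, 12, 9, 40]
    print(breadth_at_depth(example_depths, min_depth=10))
```

In practice the per-base depths would come from a coverage tool restricted to the capture target intervals.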

Conclusions

We have achieved coverage statistics similar to those seen with commercially available human and mouse exome kits. Exome capture in pigs provides a tool to identify coding region variation associated with production traits, including loss of function mutations which may explain embryonic and neonatal losses, and to improve genomic assemblies in the vicinity of protein coding genes in the pig.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-550) contains supplementary material, which is available to authorized users.

4.

Background

Recent developments in deep (next-generation) sequencing technologies are significantly impacting medical research. The global analysis of protein coding regions in genomes of interest by whole exome sequencing is a widely used application. Many technologies for exome capture are commercially available; here we compare the performance of four of them: NimbleGen’s SeqCap EZ v3.0, Agilent’s SureSelect v4.0, Illumina’s TruSeq Exome, and Illumina’s Nextera Exome, all applied to the same human tumor DNA sample.

Results

Each capture technology was evaluated for its coverage of different exome databases, target coverage efficiency, GC bias, sensitivity in single nucleotide variant detection, sensitivity in small indel detection, and technical reproducibility. In general, all technologies performed well; however, our data demonstrated small, but consistent differences between the four capture technologies. Illumina technologies cover more bases in coding and untranslated regions. Furthermore, whereas most of the technologies provide reduced coverage in regions with low or high GC content, the Nextera technology tends to bias towards target regions with high GC content.
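GC bias is usually assessed by relating the GC content of each target region to its coverage. As a minimal sketch of the first half of that calculation, the function below computes the GC fraction of a sequence. The name `gc_fraction` and the choice to ignore ambiguous bases are assumptions made here, not details from the study.

```python
def gc_fraction(sequence):
    """Fraction of G and C among the unambiguous bases of `sequence`.

    Bases other than A, C, G and T (for example N) are ignored.
    """
    called = [base for base in sequence.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_count = sum(1 for base in called if base in "GC")
    return gc_count / len(called)


if __name__ == "__main__":
    for target in ("ATATATGC", "GGCCGGCC", "ACGTNNNN"):
        print(target, gc_fraction(target))
```

Binning targets by this value and plotting mean depth per bin is one common way to visualize the kind of low-GC and high-GC coverage loss described above.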

Conclusions

We show key differences in performance between the four technologies. Our data should help researchers who are planning exome sequencing to select appropriate exome capture technology for their particular application.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-449) contains supplementary material, which is available to authorized users.

5.
Asan, Xu Y, Jiang H, Tyler-Smith C, Xue Y, Jiang T, Wang J, Wu M, Liu X, Tian G, Wang J, Wang J, Yang H, Zhang X. Genome Biology 2011, 12(9):R95-12

Background

Exome sequencing, which allows the global analysis of protein coding sequences in the human genome, has become an effective and affordable approach to detecting causative genetic mutations in diseases. Currently, there are several commercial human exome capture platforms; however, the relative performances of these have not been characterized sufficiently to know which is best for a particular study.

Results

We comprehensively compared three platforms: NimbleGen's Sequence Capture Array and SeqCap EZ, and Agilent's SureSelect. We assessed their performance in a variety of ways, including number of genes covered and capture efficacy. Differences that may impact on the choice of platform were that Agilent SureSelect covered approximately 1,100 more genes, while NimbleGen provided better flanking sequence capture. Although all three platforms achieved similar capture specificity of targeted regions, the NimbleGen platforms showed better uniformity of coverage and greater genotype sensitivity at 30- to 100-fold sequencing depth. All three platforms showed similar power in exome SNP calling, including medically relevant SNPs. Compared with genotyping and whole-genome sequencing data, the three platforms achieved a similar accuracy of genotype assignment and SNP detection. Importantly, all three platforms showed similar levels of reproducibility, GC bias and reference allele bias.

Conclusions

We demonstrate key differences between the three platforms, particularly advantages of solutions over array capture and the importance of a large gene target set.

6.

Background

Less than two percent of the human genome is protein coding, yet that small fraction harbours the majority of known disease causing mutations. Despite rapidly falling whole genome sequencing (WGS) costs, much research and increasingly the clinical use of sequence data is likely to remain focused on the protein coding exome. We set out to quantify and understand how WGS compares with the targeted capture and sequencing of the exome (exome-seq), for the specific purpose of identifying single nucleotide polymorphisms (SNPs) in exome targeted regions.

Results

We have compared polymorphism detection sensitivity and systematic biases using a set of tissue samples that have been subject to both deep exome and whole genome sequencing. The scoring of detection sensitivity was based on sequence down sampling and reference to a set of gold-standard SNP calls for each sample. Despite evidence of incremental improvements in exome capture technology over time, whole genome sequencing has greater uniformity of sequence read coverage and reduced biases in the detection of non-reference alleles than exome-seq. Exome-seq achieves 95% SNP detection sensitivity at a mean on-target depth of 40 reads, whereas WGS only requires a mean of 14 reads. Known disease causing mutations are not biased towards easy or hard to sequence areas of the genome for either exome-seq or WGS.

Conclusions

From an economic perspective, WGS is at parity with exome-seq for variant detection in the targeted coding regions. WGS offers benefits in uniformity of read coverage and more balanced allele ratio calls, both of which can in most cases be offset by deeper exome-seq, with the caveat that some exome-seq targets will never achieve sufficient mapped read depth for variant detection due to technical difficulties or probe failures. As WGS is intrinsically richer data that can provide insight into polymorphisms outside coding regions and reveal genomic rearrangements, it is likely to progressively replace exome-seq for many applications.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-247) contains supplementary material, which is available to authorized users.

7.

Background

Knowledge of antimicrobial susceptibility, especially to macrolides, has become crucial for the management of Helicobacter pylori infection. Our aim was to evaluate two new PCR kits able to detect H. pylori in gastric biopsies as well as the mutations associated with macrolide resistance.

Materials and Methods

Two hundred successive biopsies (received from gastroenterologists all over France) were used. The two new kits tested were Amplidiag H. pylori+ClariR from Mobidiag, Espoo, Finland, and RIDA®GENE H. pylori from R‐Biopharm, Darmstadt, Germany. Culture and a validated in‐house real‐time PCR were also performed, and in the case of a positive culture, Etest for clarithromycin was carried out. Discrepancies were resolved by reviewing the pathology data.

Results

Culture was positive in 68 cases (34%); our in‐house real‐time PCR was positive in these 68 cases plus 5 others (N = 73, 36%). All were also detected by the two new kits. In addition, RIDA®GENE H. pylori detected one more positive, which was also detected by Amplidiag H. pylori+ClariR, and Amplidiag detected two other positives. Of these three additional cases, pathology confirmed positivity for two. Only one case diagnosed by Amplidiag could be considered a false positive. With regard to clarithromycin resistance, 22 cases were detected. The corresponding mutations (A2142/43G) were all identified with the three PCRs.

Conclusions

These two new kits, which have excellent sensitivity and specificity, are convenient to use, adaptable to different thermocyclers, and provide quick results. They deserve to be used in H. pylori diagnosis for a better choice of treatment regimen.

8.

Background  

Complete exome resequencing has the power to greatly expand our understanding of non-human primate genomes. This includes both a better appreciation of the variation that exists in non-human primate model species and an improved annotation of their genomes. By developing an understanding of the variation between individuals, non-human primate models of human disease can be better developed. This effort is hindered largely by the lack of comprehensive information on specific non-human primate genetic variation and the costs of generating these data. If the tools that have been developed in humans for complete exome resequencing can be applied to closely related non-human primate species, then these difficulties can be circumvented.

9.

Key message

Imputing genotypes from the 90K SNP chip to exome sequence in wheat was moderately accurate. We investigated the factors that affect imputation and propose several strategies to improve accuracy.

Abstract

Imputing genetic marker genotypes from low to high density has been proposed as a cost-effective strategy to increase the power of downstream analyses (e.g. genome-wide association studies and genomic prediction) for a given budget. However, imputation is often imperfect and its accuracy depends on several factors. Here, we investigate the effects of reference population selection algorithms, marker density and imputation algorithms (Beagle4 and FImpute) on the accuracy of imputation from low SNP density (9K array) to the Infinium 90K single-nucleotide polymorphism (SNP) array for a collection of 837 hexaploid wheat Watkins landrace accessions. Based on these results, we then used the best performing reference selection and imputation algorithms to investigate imputation from 90K to exome sequence for a collection of 246 globally diverse wheat accessions. Accession-to-nearest-entry and genomic relationship-based methods were the best performing selection algorithms, and FImpute resulted in higher accuracy and was more efficient than Beagle4. The accuracy of imputing exome capture SNPs was comparable to imputing from 9K to 90K at approximately 0.71. This relatively low imputation accuracy is in part due to inconsistency between 90K and exome sequence formats. We also found the accuracy of imputation could be substantially improved to 0.82 when choosing an equivalent number of exome SNPs, instead of 90K SNPs on the existing array, as the lower density set. We present a number of recommendations to increase the accuracy of exome imputation.
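The abstract does not say how imputation accuracy was scored. One widely used measure is the Pearson correlation between true and imputed genotypes coded as allele counts (0, 1, 2), and the sketch below assumes that definition purely for illustration. The function name and the example data are hypothetical.

```python
from math import sqrt


def imputation_accuracy(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed allele counts.

    Both inputs are equal-length sequences of genotypes coded 0, 1 or 2.
    This scoring rule is an assumption, not the study's stated metric.
    """
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype vectors must have the same length")
    n = len(true_genotypes)
    if n < 2:
        raise ValueError("need at least two genotypes")
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    if var_t == 0 or var_i == 0:
        raise ValueError("correlation undefined for constant genotypes")
    return cov / sqrt(var_t * var_i)


if __name__ == "__main__":
    truth = [0, 1, 2, 2, 1, 0]     # made-up masked genotypes
    imputed = [0, 1, 2, 1, 1, 0]   # made-up imputed genotypes
    print(imputation_accuracy(truth, imputed))
```

Under this definition, accuracy is typically estimated by masking known genotypes, imputing them, and comparing the two vectors per marker or per individual.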

10.

Background

To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer and the Ion TargetSeq™ Exome Enrichment Kit together make up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer.

Results

Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by two sequencing platforms were from 68.0 to 75.3 % in four samples, whereas the concordance of co-detected variant loci reached 99 %. Sanger sequencing validation revealed that the validated rate of concordant single nucleotide polymorphisms (SNPs) (91.5 %) was higher than the SNPs specific to TargetSeq-Proton (60.0 %) or specific to SureSelect-HiSeq (88.3 %). With regard to 1-bp small insertions and deletions (InDels), the Sanger sequencing validated rates of concordant variants (100.0 %) and SureSelect-HiSeq-specific (89.6 %) were higher than those of TargetSeq-Proton-specific (15.8 %).
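The two headline figures above, the rate of shared variant loci and the concordance at co-detected loci, can be illustrated with set operations on two call sets. The definitions used in this sketch are assumptions, since the abstract does not spell them out: shared rate is taken as shared loci divided by the union of loci, and concordance as the fraction of shared loci with identical genotypes. All names and example calls are hypothetical.

```python
def compare_call_sets(calls_a, calls_b):
    """Compare two variant call sets keyed by (chromosome, position).

    Each argument maps a locus to a genotype string such as "0/1".
    Returns (shared_rate, concordance) under the assumed definitions:
    shared_rate  = shared loci / loci called by either platform
    concordance  = shared loci with identical genotype / shared loci
    """
    loci_a, loci_b = set(calls_a), set(calls_b)
    shared = loci_a & loci_b
    union = loci_a | loci_b
    shared_rate = len(shared) / len(union) if union else 0.0
    matching = sum(1 for locus in shared if calls_a[locus] == calls_b[locus])
    concordance = matching / len(shared) if shared else 0.0
    return shared_rate, concordance


if __name__ == "__main__":
    platform_a = {("chr1", 100): "0/1", ("chr1", 200): "1/1", ("chr2", 50): "0/1"}
    platform_b = {("chr1", 100): "0/1", ("chr1", 200): "0/1", ("chr3", 7): "1/1"}
    print(compare_call_sets(platform_a, platform_b))
```

Before such a comparison, both call sets would need to be restricted to the region targeted by both kits, as the study did with the 33.6 Mb overlap.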

Conclusions

In the sequencing of exonic regions, a combination of two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform. However, for the sequencing of platform-specific variants, the accuracy of variant calling by HiSeq 2000 was higher than that of Ion Proton, specifically for InDel detection. Moreover, the variant calling software also influences the detection of SNPs and, specifically, InDels in Ion Proton exome sequencing.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1796-6) contains supplementary material, which is available to authorized users.

11.
12.

Background  

Next generation ultra-sequencing technologies are starting to produce extensive quantities of data from entire human genome or exome sequences, and therefore new software is needed to present and analyse this vast amount of information. The 1000 Genomes project has recently released raw data for 629 complete genomes representing several human populations through their Phase I interim analysis and, although there are certain public tools available that allow exploration of these genomes, to date there is no tool that permits comprehensive population analysis of the variation catalogued by such data.

13.
14.

Background  

The systematic capture of appropriately annotated experimental data is a prerequisite for most bioinformatics analyses. Data capture is required not only for submission of data to public repositories, but also to underpin integrated analysis, archiving, and sharing – both within laboratories and in collaborative projects. The widespread requirement to capture data means that data capture and annotation are taking place at many sites, but the small scale of the literature on tools, techniques and experiences suggests that there is work to be done to identify good practice and reduce duplication of effort.

15.

Aims

Nine commercial DNA extraction kits were evaluated for the isolation of DNA from 10‐fold serial dilutions of Bacillus anthracis spores using quantitative real‐time PCR (qPCR). The three kits determined by qPCR to yield the most sensitive and consistent detection (Epicenter MasterPure Gram Positive; MoBio PowerFood; ABI PrepSeq) were subsequently tested for their ability to isolate DNA from trace amounts of B. anthracis spores (approx. 6·5 × 10¹ and 1·3 × 10² CFU in 25 ml or 50 g of food sample) spiked into complex food samples including apple juice, ham, whole milk and bagged salad and recovered with immunomagnetic separation (IMS).

Methods and Results

The MasterPure kit effectively and consistently isolated DNA from low amounts of B. anthracis spores captured from food samples. Detection was achieved from apple juice, ham, whole milk and bagged salad from as few as 65 ± 14, 68 ± 8, 66 ± 4 and 52 ± 16 CFU, respectively, and IMS samples were demonstrated to be free of PCR inhibitors.

Conclusions

Detection of B. anthracis spores isolated from food by IMS differs substantially between commercial DNA extraction kits; however, sensitive results can be obtained with the MasterPure Gram Positive kit.

Significance and Impact of the Study

The extraction protocol identified herein combined with IMS is novel for B. anthracis and allows detection of low levels of B. anthracis spores from contaminated food samples.

16.

Background

Distinguishing between individuals is critical to those conducting animal/plant breeding, food safety/quality research, diagnostic and clinical testing, and evolutionary biology studies. Classical genetic identification studies are based on marker polymorphisms, but polymorphism-based techniques are time and labor intensive and often cannot distinguish between closely related individuals. Illumina sequencing technologies provide the detailed sequence data required for rapid and efficient differentiation of related species, lines/cultivars, and individuals in a cost-effective manner. Here we describe the use of Illumina high-throughput exome sequencing, coupled with SNP mapping, as a rapid means of distinguishing between related cultivars of the lignocellulosic bioenergy crop giant miscanthus (Miscanthus × giganteus). We provide the first exome sequence database for Miscanthus species complete with Gene Ontology (GO) functional annotations.

Results

A SNP comparative analysis of rhizome-derived cDNA sequences was successfully utilized to distinguish three Miscanthus × giganteus cultivars from each other and from other Miscanthus species. Moreover, the resulting phylogenetic tree generated from SNP frequency data parallels the known breeding history of the plants examined. Some of the giant miscanthus plants exhibit considerable sequence divergence.

Conclusions

Here we describe an analysis of Miscanthus in which high-throughput exome sequencing was utilized to differentiate between closely related genotypes despite the current lack of a reference genome sequence. We functionally annotated the exome sequences and provide resources to support Miscanthus systems biology. In addition, we demonstrate the use of commercial high-performance cloud computing for computational GO annotation.

17.
18.
19.

Purpose

To define the molecular basis of retinal degeneration in consanguineous Pakistani pedigrees with early onset retinal degeneration.

Methods

A cohort of 277 individuals representing 26 pedigrees from the Punjab province of Pakistan was analyzed. Exomes were captured with commercial kits and sequenced on an Illumina HiSeq 2500. Candidate variants were identified using standard tools and analyzed using exomeSuite to detect all potentially pathogenic changes in genes implicated in retinal degeneration. Segregation analysis was performed by dideoxy sequencing and novel variants were additionally investigated for their presence in ethnicity-matched controls.

Results

We identified a total of nine causal mutations, including six novel variants in RPE65, LCA5, USH2A, CNGB1, FAM161A, CERKL and GUCY2D as the underlying cause of inherited retinal degenerations in 13 of 26 pedigrees. In addition to the causal variants, a total of 200 variants each observed in five or more unrelated pedigrees investigated in this study that were absent from the dbSNP, HapMap, 1000 Genomes, NHLBI ESP6500, and ExAC databases were identified, suggesting that they are common in, and unique to the Pakistani population.

Conclusions

We identified causal mutations associated with retinal degeneration in nearly half of the pedigrees investigated in this study through next generation whole exome sequencing. All novel variants detected in this study through exome sequencing have been cataloged, providing a reference database of variants common in, and unique to, the Pakistani population.

20.

Background  

More than 200 studies related to nucleic acid amplification (NAA) tests to detect Mycobacterium tuberculosis directly from clinical specimens have appeared in the world literature since this technology was first introduced. NAA tests come as either commercial kits or as tests designed by the reporting investigators themselves (in-house tests). In-house tests vary widely in their accuracy, and factors that contribute to heterogeneity in test accuracy are not well characterized. Here, we used meta-analytical methods, including meta-regression, to identify factors related to study design and assay protocols that affect test accuracy in order to identify those factors associated with high estimates of accuracy.
