Similar literature
Found 20 similar documents (search time: 31 ms)
1.
Large-scale high-throughput sequencing techniques are rapidly becoming popular methods to profile complex communities and have generated deep insights into community biodiversity. However, several technical problems, especially sequencing artifacts such as nucleotide calling errors, could artificially inflate biodiversity estimates. Sequence filtering for artifact removal is a conventional method for deleting error-prone sequences from high-throughput sequencing data. Although rare species represented by low-abundance sequences in datasets may be sensitive to the artifact removal process, the influence of artifact removal on rare species recovery has not been well evaluated in natural complex communities. Here we employed both internal (reliable operational taxonomic units selected from communities themselves) and external (indicator species spiked into communities) references to evaluate the influence of artifact removal on rare species recovery using 454 pyrosequencing of complex plankton communities collected from both freshwater and marine habitats. Multiple analyses revealed three clear patterns: 1) rare species were eliminated during the sequence filtering process at all tested filtering stringencies, 2) more rare taxa were eliminated as filtering stringencies increased, and 3) elimination of rare species intensified as biomass of a species in a community was reduced. Our results suggest that caution should be applied when processing high-throughput sequencing data, especially for rare taxa detection for conservation of species at risk and for rapid response programs targeting non-indigenous species. Establishment of both internal and external references proposed here provides a practical strategy to evaluate the artifact removal process.

2.
As the number of transgenic livestock increases, reliable detection and molecular characterization of transgene integration sites and copy number are crucial not only for interpreting the relationship between the integration site and the specific phenotype but also for commercial and economic demands. However, the ability of conventional PCR techniques to detect incomplete and multiple integration events is limited, making it technically challenging to characterize transgenes. Next-generation sequencing has enabled cost-effective, routine and widespread high-throughput genomic analysis. Here, we demonstrate the use of next-generation sequencing to extensively characterize cattle harboring a 150-kb human lactoferrin transgene that was initially analyzed by chromosome walking without success. Using this approach, the sites upstream and downstream of the target gene integration site in the host genome were identified at the single nucleotide level. The sequencing result was verified by event-specific PCR for the integration sites and FISH for the chromosomal location. Sequencing depth analysis revealed that multiple copies of the incomplete target gene and the vector backbone were present in the host genome. Upon integration, complex recombination was also observed between the target gene and the vector backbone. These findings indicate that next-generation sequencing is a reliable and accurate approach for the molecular characterization of the transgene sequence, integration sites and copy number in transgenic species.

3.
Complementary to the time- and cost-intensive direct bisulfite sequencing, we applied reduced representation bisulfite sequencing (RRBS) to human peripheral blood mononuclear cells (PBMC) from YH, the Asian individual whose genome and epigenome have been deciphered in the YH project. We systematically assessed the genomic coverage, coverage depth and reproducibility of this technology, as well as the concordance of DNA methylation levels measured by RRBS and direct bisulfite sequencing for the detected CpG sites. Our results suggest that RRBS can cover more than half of CpG islands and promoter regions with a good coverage depth, and the proportion of the CpG sites covered by the biological replicates reaches 80-90%, indicating good reproducibility. Given a smaller data quantity, RRBS enjoys much better coverage depth than direct bisulfite sequencing, and the concordance of DNA methylation levels between the two methods is high. It can be concluded that RRBS is a time- and cost-effective sequencing method for unbiased DNA methylation profiling of CpG islands and promoter regions on a genome-wide scale, and it is the method of choice to assay certain genomic regions for multiple samples in a rapid way.

4.
Amplicon sequencing has been the method of choice in many high-throughput DNA sequencing (HTS) applications. To date there has been a heavy focus on the means by which to analyse the burgeoning amount of data afforded by HTS. In contrast, there has been a distinct lack of attention paid to considerations surrounding the importance of sample preparation and the fidelity of library generation. No amount of high-end bioinformatics can compensate for poorly prepared samples, and it is therefore imperative that careful attention is given to sample preparation and library generation within workflows, especially those involving multiple PCR steps. This paper redresses this imbalance by focusing on aspects pertaining to the benchtop within typical amplicon workflows: sample screening, the target region, and library generation. Empirical data is provided to illustrate the scope of the problem. Lastly, the impact of various data analysis parameters is also investigated in the context of how the data was initially generated. It is hoped this paper may serve to highlight the importance of pre-analysis workflows in achieving meaningful, future-proof data that can be analysed appropriately. As amplicon sequencing gains traction in a variety of diagnostic applications, from forensics to environmental DNA (eDNA), it is paramount that workflows and analytics are both fit for purpose.

5.
6.
We have developed and validated a consolidated bead-based genotyping platform, the Bioplex suspension array, for simultaneous detection of multiple single nucleotide polymorphisms (SNPs) of the ATP-binding cassette transporters. Genetic polymorphisms have been known to influence therapeutic response and risk of disease pathologies. Genetic screening for therapeutic and diagnostic applications thus holds great promise in clinical management. The allele-specific primer extension (ASPE) reaction was used to assay 22 multiplexed SNPs for eight subjects. Comparison of the microsphere-based ASPE assay results to sequencing results showed complete concordance in genotype assignments. The Bioplex suspension array thus proves to be a reliable, cost-effective and high-throughput technological platform for genotyping. It can be easily adapted to customized SNP panels for specific applications involving large-scale mutation screening of clinically relevant markers.

7.
Yoo SD  Cho YH  Sheen J 《Nature protocols》2007,2(7):1565-1572
The transient gene expression system using Arabidopsis mesophyll protoplasts has proven an important and versatile tool for conducting cell-based experiments using molecular, cellular, biochemical, genetic, genomic and proteomic approaches to analyze the functions of diverse signaling pathways and cellular machineries. A well-established protocol that has been extensively tested and applied in numerous experiments is presented here. The method includes protoplast isolation, PEG-calcium transfection of plasmid DNA and protoplast culture. Physiological responses and high-throughput capability enable facile and cost-effective explorations as well as hypothesis-driven tests. The protoplast isolation and DNA transfection procedures take 6-8 h, and the results can be obtained in 2-24 h. The cell system offers reliable guidelines for further comprehensive analysis of complex regulatory mechanisms in whole-plant physiology, immunity, growth and development.

8.
New generation sequencing technologies offer unique opportunities and challenges for re-sequencing studies. In this article, we focus on re-sequencing experiments using the Solexa technology, based on bacterial artificial chromosome (BAC) clones, and address an experimental design problem. In these specific experiments, approximate coordinates of the BACs on a reference genome are known, and fine-scale differences between the BAC sequences and the reference are of interest. The high-throughput characteristics of the sequencing technology make it possible to multiplex BAC sequencing experiments by pooling BACs for a cost-effective operation. However, the way BACs are pooled in such re-sequencing experiments has an effect on the downstream analysis of the generated data, mostly due to subsequences common to multiple BACs. The experimental design strategy we develop in this article offers combinatorial solutions based on approximation algorithms for the well-known max n-cut problem and the related max n-section problem on hypergraphs. Our algorithms, when applied to a number of sample cases, give more than a 2-fold performance improvement over random partitioning.
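The pooling idea in the abstract above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a simplified model in which conflicts are pairwise (an ordinary graph rather than a hypergraph), and it uses a plain greedy heuristic for max n-cut. The names `greedy_pooling`, `uncut_pairs` and the toy BAC labels are hypothetical.

```python
from collections import defaultdict


def greedy_pooling(bacs, shared_pairs, n_pools):
    """Greedy max n-cut heuristic (illustrative only, not the paper's method).

    bacs         -- iterable of BAC identifiers, processed in the given order
    shared_pairs -- pairs of BACs assumed to share a common subsequence
    n_pools      -- number of sequencing pools
    Each BAC goes to the pool holding the fewest BACs it conflicts with.
    """
    neighbours = defaultdict(set)
    for a, b in shared_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    pool_of = {}
    for bac in bacs:
        # Count already-placed conflicting BACs in each pool.
        conflicts = [0] * n_pools
        for other in neighbours[bac]:
            if other in pool_of:
                conflicts[pool_of[other]] += 1
        # Ties are broken in favour of the lowest pool index.
        pool_of[bac] = min(range(n_pools), key=lambda p: conflicts[p])
    return pool_of


def uncut_pairs(pool_of, shared_pairs):
    """Number of conflicting pairs that ended up in the same pool."""
    return sum(1 for a, b in shared_pairs if pool_of[a] == pool_of[b])


# Toy data: four BACs whose neighbours along a chain share subsequences.
toy_bacs = ["A", "B", "C", "D"]
toy_pairs = [("A", "B"), ("B", "C"), ("C", "D")]
toy_pools = greedy_pooling(toy_bacs, toy_pairs, n_pools=2)
```

This greedy rule does not balance pool sizes, so it corresponds to the max n-cut formulation rather than the max n-section variant, which would need an additional size constraint.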

9.
Mass spectrometry-based proteomics holds great promise as a discovery tool for biomarker candidates in the early detection of diseases. Recently, much emphasis has been placed upon producing highly reliable data for quantitative profiling, for which highly reproducible methodologies are indispensable. The main problems that affect experimental reproducibility stem from variations introduced by sample collection, preparation, and storage protocols and LC-MS settings and conditions. On the basis of a formally precise and quantitative definition of similarity between LC-MS experiments, we have developed Chaorder, a fully automatic software tool that can assess experimental reproducibility of sets of large-scale LC-MS experiments. By visualizing the similarity relationships within a set of experiments, this tool can form the basis of systematic quality control and thus help assess the comparability of mass spectrometry data over time, across different laboratories, and between instruments. Applying Chaorder to data from multiple laboratories and a range of instruments, experimental protocols, and sample complexities revealed biases introduced by the sample processing steps, experimental protocols, and instrument choices. Moreover, we show that reducing bias by correcting for just a few steps, for example randomizing the run order, does not provide much gain in statistical power for biomarker discovery.

10.
The use of N-glycan mass spectrometry for clinical diagnostics requires the development of robust high-throughput profiling methods. Still, structural assignment of glycans requires additional information such as MS2 fragmentation or exoglycosidase digestions. We present a setting which combines a MALDI ionization source with a linear ion trap analyzer. This instrumentation allows automated measurement of samples thanks to the crystal positioning system, combined with MSn sequencing options. 2,5-Dihydroxybenzoic acid, commonly used for the analysis of glycans, failed to produce the required reproducibility due to its non-homogeneous crystallization properties. In contrast, α-cyano-4-hydroxycinnamic acid provided a homogeneous crystallization pattern and reproducibility of the measurements. Using serum N-glycans as a test sample, we focused on the automation of data collection by optimizing the instrument settings. Glycan structures were confirmed by MS2 analysis. Although sample processing still needs optimization, this method provides a reproducible and high-throughput approach for measurement of N-glycans using a MALDI–linear ion trap instrument.

11.
12.
The quest for a universal and efficient method of identifying species has been a longstanding challenge in biology. Here, we show that accurate identification of species in all domains of life can be accomplished by multiplex analysis of variable-length sequences containing multiple insertion/deletion variants. The new method, called SPInDel, is able to discriminate 93.3% of eukaryotic species from 18 taxonomic groups. We also demonstrate that the identification of prokaryotic and viral species with numeric profiles of fragment lengths is generally straightforward. A computational platform is presented to facilitate the planning of projects and includes a large data set with nearly 1800 numeric profiles for species in all domains of life (1556 for eukaryotes, 105 for prokaryotes and 130 for viruses). Finally, a SPInDel profiling kit for discrimination of 10 mammalian species was successfully validated on highly processed food products with species mixtures and proved to be easily adaptable to multiple screening procedures routinely used in molecular biology laboratories. These results suggest that SPInDel is a reliable and cost-effective method for broad-spectrum species identification that is appropriate for use in suboptimal samples and is amenable to different high-throughput genotyping platforms without the need for DNA sequencing.

13.
Although nematodes are the most abundant metazoan animals on Earth, their diversity is largely unknown. To overcome limitations of traditional approaches (labour, time, and cost) for assessing biodiversity of nematode species in environmental samples, we have previously examined the suitability of high-throughput sequencing for assessing species-level diversity with a set of control experiments employing a mixture of nematodes of known number and with known sequences for target diagnostic loci. Those initial experiments clearly demonstrated the suitability of the approach for identification of nematode taxa but lacked the replicate experiments necessary to evaluate reproducibility of the approach. Here, we analyze reads generated from three different PCR amplifications and three different sequencing reactions to examine the differential PCR amplification, the possibility of emulsion PCR artefacts, and differences between sequencing reactions. Our results suggest that both qualitative and quantitative data are consistent and highly reproducible. Variation associated with in-house PCR amplification or emPCR and sequencing is present, but the representation of each nematode is very consistent from experiment to experiment and supports the use of read counts to estimate relative abundance of taxa in a metagenetic sample.

14.

Background  

High-resolution tandem mass spectra can now be readily acquired with hybrid instruments, such as LTQ-Orbitrap and LTQ-FT, in high-throughput shotgun proteomics workflows. The improved spectral quality enables more accurate de novo sequencing for identification of post-translational modifications and amino acid polymorphisms.

15.
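Copy number variation (CNV) refers to an increase or decrease in the copy number of large DNA segments in the genome. Existing studies show that CNVs cause a variety of human diseases and are closely related to the mechanisms of their onset and progression. The advent of high-throughput sequencing has provided technical support for CNV detection, and in fields such as human disease research and clinical diagnosis and treatment, high-throughput sequencing has become the mainstream CNV detection technology. Although new algorithms and software based on high-throughput sequencing continue to be developed, their accuracy remains unsatisfactory. This article comprehensively reviews CNV detection methods based on high-throughput sequencing data, including read-depth-based methods, paired-end-mapping-based methods, split-read-based methods, de novo assembly-based methods, and combined approaches built on these four. It discusses in depth the principle of each class of method, representative software tools, the data to which each class applies, and their strengths and weaknesses, and looks ahead to future directions.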

16.

Background

Human leukocyte antigen (HLA) is a group of genes that are extremely polymorphic among individuals and populations and have been associated with more than 100 different diseases and adverse drug effects. HLA typing is accordingly an important tool in clinical application, medical research, and population genetics. We have previously developed a phase-defined HLA gene sequencing method using MiSeq sequencing.

Results

Here we report a simple, high-throughput, and cost-effective sequencing method that includes normalized library preparation and adjustment of DNA molar concentration. We applied long-range PCR to amplify HLA-B for 96 samples, followed by transposase-based library construction and multiplex sequencing with the MiSeq sequencer. After sequencing, we observed low variation in read percentages (0.2% to 1.55%) among the 96 demultiplexed samples. On this basis, all the samples were amenable to haplotype phasing using our phase-defined sequencing method. In our study, a sequencing depth of 800x was necessary and sufficient to achieve full phasing of HLA-B alleles with reliable assignment of the allelic sequence to the 8-digit level.

Conclusions

Our HLA sequencing method, optimized for 96 multiplexed samples, is highly time-effective and cost-effective and is especially suitable for automated multi-sample library preparation and sequencing.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-645) contains supplementary material, which is available to authorized users.

17.
18.
19.
Kwon YS 《Biotechnology letters》2011,33(8):1633-1641
The discovery of novel small RNA classes and species has accelerated since the implementation of high-throughput sequencing technologies for the identification of small RNAs. However, as the sequence coverage increases in a cell, the expectation of finding novel small RNAs from a batch of sequencing gradually decreases. To improve the finding of novel small RNAs, an alternative small RNA library preparation method, the single ligation, extension and circularization (SLEC) method, has been developed which is adequate for high-throughput sequencing. The procedure is faster and simpler than the more widely used procedures, and the constructed libraries are compatible with high-level multiplex analysis. The analysis of human small RNA libraries prepared by the SLEC method reported known small RNAs and novel small RNAs including 25 mirtron candidates. This study demonstrates that the method is effective in identifying known and novel small RNAs.

20.
The ability to assay genome-scale methylation patterns using high-throughput sequencing makes it possible to carry out association studies to determine the relationship between epigenetic variation and phenotype. While bisulfite sequencing can determine a methylome at high resolution, cost inhibits its use in comparative and population studies. MethylSeq, based on sequencing of fragment ends produced by a methylation-sensitive restriction enzyme, is a method for methyltyping (survey of methylation states) and is a site-specific and cost-effective alternative to whole-genome bisulfite sequencing. Despite its advantages, the use of MethylSeq has been restricted by biases in MethylSeq data that complicate the determination of methyltypes. Here we introduce a statistical method, MetMap, that produces corrected site-specific methylation states from MethylSeq experiments and annotates unmethylated islands across the genome. MetMap integrates genome sequence information with experimental data, in a statistically sound and cohesive Bayesian Network. It infers the extent of methylation at individual CGs and across regions, and serves as a framework for comparative methylation analysis within and among species. We validated MetMap's inferences with direct bisulfite sequencing, showing that the methylation status of sites and islands is accurately inferred. We used MetMap to analyze MethylSeq data from four human neutrophil samples, identifying novel, highly unmethylated islands that are invisible to sequence-based annotation strategies. The combination of MethylSeq and MetMap is a powerful and cost-effective tool for determining genome-scale methyltypes suitable for comparative and association studies.
