Similar Documents
 Found 20 similar documents (search time: 46 ms)
1.
2.
Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require substantial investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variant detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species.

3.
Studies of feeding in the early life stages of fishes help clarify their food sources and their trophic position in the food web, and obtaining complete and accurate information on their prey species is the key; the development of high-throughput sequencing has brought unprecedented opportunities and challenges to studies of animal diets. In this study, juveniles of the spotted scat (Scatophagus argus) from the waters around an artificial dock in Daya Bay were examined. Using 18S rDNA as the target marker, their diet composition was analyzed by both traditional Sanger sequencing and Illumina Solexa high-throughput sequencing, to compare the applicability of the two methods to studies of juvenile feeding. The results showed that the juveniles were omnivorous with a highly diverse diet, in which ciliates and bryozoans were the most dominant food groups. Traditional sequencing yielded 67 valid prey sequences belonging to 8 groups and covering 23 taxa, whereas high-throughput sequencing yielded more than 17,000 valid prey sequences belonging to 9 groups and covering 35 taxa. The food groups detected by the two methods were largely the same, but high-throughput sequencing better reflected dietary diversity and coverage and was more sensitive, detecting dinoflagellate and brown algal taxa missed by traditional sequencing, indicating that it can cover the food spectrum of juvenile fish more completely and accurately. The large volume of data generated by high-throughput sequencing can also provide semi-quantitative information to some extent, overcoming the limitations of traditional sequencing in quantitative studies. High-throughput sequencing thus shows clear advantages in studies of juvenile feeding, with broader dietary coverage, higher detection sensitivity and markedly improved reliability of data and results, and can provide strong support for research on the feeding ecology of marine organisms.

4.
High-throughput sequencing (HTS) studies have been highly successful in identifying the genetic causes of human disease, particularly those following Mendelian inheritance. Many HTS studies to date have been performed without utilizing available family relationships between samples. Here, we discuss the many merits and occasional pitfalls of using identity by descent information in conjunction with HTS studies. These methods are not only applicable to family studies but are also useful in cohorts of apparently unrelated, ‘sporadic’ cases and small families underpowered for linkage and allow inference of relationships between individuals. Incorporating familial/pedigree information not only provides powerful filtering options for the extensive variant lists that are usually produced by HTS but also allows valuable quality control checks, insights into the genetic model and the genotypic status of individuals of interest. In particular, these methods are valuable for challenging discovery scenarios in HTS analysis, such as in the study of populations poorly represented in variant databases typically used for filtering, and in the case of poor-quality HTS data.

5.
Some popular methods for polymorphism and mutation discovery involve ascertainment of novel bands by the examination of electrophoretic gel images. Although existing strategies for mapping bands work well for specific applications, such as DNA sequencing, these strategies are not well suited for novel band detection. Here, we describe a general strategy for band mapping that uses background banding patterns to facilitate lane calling and size calibration. We have implemented this strategy in GelBuddy, a user-friendly Java-based program for PC and Macintosh computers, which includes several utilities to assist discovery of mutations and polymorphisms. We demonstrate the use of GelBuddy in applications based on single-base mismatch cleavage of heteroduplexed PCR products. Use of software designed to facilitate novel band detection can significantly shorten the time needed for image analysis and data entry in a high-throughput setting. Furthermore, the interactive strategy implemented in GelBuddy has been successfully applied to DNA fingerprinting applications, such as AFLP. GelBuddy promises to make electrophoretic gel analysis a viable alternative to DNA resequencing for discovery of mutations and polymorphisms.

6.
MOTIVATION: High-throughput screening (HTS) plays a central role in modern drug discovery, allowing for testing of >100,000 compounds per screen. The aim of our work was to develop and implement methods for minimizing the impact of systematic error in the analysis of HTS data. To the best of our knowledge, two new data correction methods included in HTS-Corrector are not available in any existing commercial software or freeware. RESULTS: This paper describes HTS-Corrector, a software application for the analysis of HTS data, detection and visualization of systematic error, and corresponding correction of HTS signals. Three new methods for the statistical analysis and correction of raw HTS data are included in HTS-Corrector: background evaluation, well correction and hit-sigma distribution procedures intended to minimize the impact of systematic errors. We discuss the main features of HTS-Corrector and demonstrate the benefits of the algorithms.

7.
The future of high-throughput screening   (Total citations: 3; self-citations: 0; citations by others: 3)
High-throughput screening (HTS) is a well-established process in lead discovery for pharma and biotech companies and is now also being set up for basic and applied research in academia and some research hospitals. Since its first advent in the early to mid-1990s, the field of HTS has seen not only a continuous change in technology and processes but also an adaptation to various needs in lead discovery. HTS has now evolved into a quite mature discipline of modern drug discovery. Whereas in previous years, much emphasis has been put toward a steady increase in capacity ("quantitative increase") via various strategies in the fields of automation and miniaturization, the past years have seen a steady shift toward higher content and quality ("quality increase") for these biological test systems. Today, many experts in the field see HTS at the crossroads with the need to decide either toward further increase in throughput or more focus toward relevance of biological data. In this article, the authors describe the development of HTS over the past decade and point out their own ideas for future directions of HTS in biomedical research. They predict that the trend toward further miniaturization will slow down with the implementation of 384-well, 1536-well, and 384 low-volume-well plates. The authors predict that, ultimately, each hit-finding strategy will be much more project related, tailor-made, and better integrated into the broader drug discovery efforts.

8.
The incursion of High-Throughput Sequencing (HTS) in environmental microbiology brings unique opportunities and challenges. HTS now allows a high-resolution exploration of the vast taxonomic and metabolic diversity present in the microbial world, which can provide an exceptional insight on global ecosystem functioning, ecological processes and evolution. This exploration also has economic potential, as we will have access to the evolutionary innovation present in microbial metabolisms, which could be used for biotechnological development. HTS is also challenging the research community, and the current bottleneck lies on the data analysis side. At the moment, researchers are in a sequence data deluge, with sequencing throughput advancing faster than the computer power needed for data analysis. However, new tools and approaches are being developed constantly and the whole process could be depicted as a fast co-evolution between sequencing technology, informatics and microbiologists. In this work, we examine the most popular and recently commercialized HTS platforms as well as bioinformatics methods for data handling and analysis used in microbial metagenomics. This non-exhaustive review is intended to serve as a broad state-of-the-art guide to researchers expanding into this rapidly evolving field.

9.
10.
Application of high-throughput sequencing technology in the dietary analysis of wild animals   (Total citations: 2; self-citations: 0; citations by others: 2)
刘刚  宁宇  夏晓飞  龚明昊 《生态学报》2018,38(9):3347-3356
Dietary research is an important and much-studied topic in animal ecology, and methods of diet analysis, constrained by technique and scope of applicability, have been continually improved and updated. With the development of high-throughput sequencing, the technique has gradually been extended to the dietary analysis of wild animals, greatly improving the efficiency of diet analysis and broadening its range of applications. Although high-throughput sequencing offers clear advantages in data volume, sensitivity and resolution for diet analysis, it involves many steps and complex confounding factors, and its application to diet remains a relatively weak area of research. This review outlines the basic workflow of applying high-throughput sequencing to diet analysis; summarizes research trends in food composition analysis, intraspecific and interspecific trophic relationships, and the relationships between diet, habitat and behavior; analyzes how PCR, contamination and quantitative analysis affect the applicability of the technique; proposes corresponding solutions and recommendations; and discusses its application prospects.

11.
The drug discovery process pursued by major pharmaceutical companies for many years starts with target identification followed by high-throughput screening (HTS) with the goal of identifying lead compounds. To accomplish this goal, significant resources are invested into automation of the screening process or HTS. Robotic systems capable of handling thousands of data points per day are implemented across the pharmaceutical sector. Many of these systems are amenable to handling cell-based screening protocols as well. On the other hand, as companies strive to develop innovative products based on novel mechanisms of action(s), one of the current bottlenecks of the industry is the target validation process. Traditionally, bioinformatics and HTS groups operate separately at different stages of the drug discovery process. The authors describe the convergence and integration of HTS and bioinformatics to perform high-throughput target functional identification and validation. As an example of this approach, they initiated a project with a functional cell-based screen for a biological process of interest using libraries of small interfering RNA (siRNA) molecules. In this protocol, siRNAs function as potent gene-specific inhibitors. siRNA-mediated knockdown of the target genes is confirmed by TaqMan analysis, and genes with impacts on biological functions of interest are selected for further analysis. Once the genes are confirmed and further validated, they may be used for HTS to yield lead compounds.

12.
High‐throughput sequencing (HTS) technologies generate millions of sequence reads from DNA/RNA molecules rapidly and cost‐effectively, enabling single investigator laboratories to address a variety of ‘omics’ questions in nonmodel organisms, fundamentally changing the way genomic approaches are used to advance biological research. One major challenge posed by HTS is the complexity and difficulty of data quality control (QC). While QC issues associated with sample isolation, library preparation and sequencing are well known and protocols for their handling are widely available, the QC of the actual sequence reads generated by HTS is often overlooked. HTS‐generated sequence reads can contain various errors, biases and artefacts whose identification and amelioration can greatly impact subsequent data analysis. However, a systematic survey on QC procedures for HTS data is still lacking. In this review, we begin by presenting standard ‘health check‐up’ QC procedures recommended for HTS data sets and establishing what ‘healthy’ HTS data look like. We next proceed by classifying errors, biases and artefacts present in HTS data into three major types of ‘pathologies’, discussing their causes and symptoms and illustrating with examples their diagnosis and impact on downstream analyses. We conclude this review by offering examples of successful ‘treatment’ protocols and recommendations on standard practices and treatment options. Notwithstanding the speed with which HTS technologies – and consequently their pathologies – change, we argue that careful QC of HTS data is an important – yet often neglected – aspect of their application in molecular ecology, and lay the groundwork for developing an HTS data QC ‘best practices’ guide.

13.
The improvements in high-throughput sequencing (HTS) technologies made clinical sequencing projects such as ClinSeq and Genomics England feasible. Although there are significant improvements in the accuracy and reproducibility of HTS-based analyses, the usability of these types of data for diagnostic and prognostic applications necessitates near-perfect data generation. To assess the usability of a widely used HTS platform for accurate and reproducible clinical applications in terms of robustness, we generated whole genome shotgun (WGS) sequence data from the genomes of two human individuals in two different genome sequencing centers. After analyzing the data to characterize SNPs and indels using the same tools (BWA, SAMtools, and GATK), we observed a significant number of discrepancies in the call sets. As expected, most of the disagreements between the call sets were found within genomic regions containing common repeats and segmental duplications, albeit only a small fraction of the discordant variants were within the exons and other functionally relevant regions such as promoters. We conclude that although HTS platforms are sufficiently powerful for providing data for first-pass clinical tests, the variant predictions still need to be confirmed using orthogonal methods before use in clinical applications.
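The cross-center comparison described in this abstract reduces, at its core, to intersecting two variant call sets and flagging discordant sites. A minimal sketch of that bookkeeping (the helper and the call data below are hypothetical, not from the paper):

```python
def compare_call_sets(calls_a, calls_b):
    """Compare two variant call sets keyed by (chrom, pos) -> alt allele.

    Returns sites with identical calls, sites private to each set,
    and shared sites where the two centers report different alleles.
    """
    shared = calls_a.keys() & calls_b.keys()
    concordant = {k for k in shared if calls_a[k] == calls_b[k]}
    allele_mismatch = shared - concordant
    only_a = calls_a.keys() - calls_b.keys()
    only_b = calls_b.keys() - calls_a.keys()
    return concordant, only_a, only_b, allele_mismatch

# Hypothetical calls for the same genome from two sequencing centers
center1 = {("chr1", 100): "A", ("chr1", 200): "T", ("chr2", 50): "G"}
center2 = {("chr1", 100): "A", ("chr1", 200): "C", ("chr3", 10): "T"}

conc, only1, only2, mismatch = compare_call_sets(center1, center2)
print(len(conc), len(only1), len(only2), len(mismatch))  # prints "1 1 1 1"
```

In practice such comparisons run on full VCF files (for example with bcftools), and the discordant sites are then intersected with repeat and segmental-duplication annotations, as the abstract describes.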

14.
Next generation sequencing (NGS) has enabled high throughput discovery of somatic mutations. Detection depends on experimental design, lab platforms, parameters and analysis algorithms. However, NGS-based somatic mutation detection is prone to erroneous calls, with reported validation rates near 54% and congruence between algorithms less than 50%. Here, we developed an algorithm to assign a single statistic, a false discovery rate (FDR), to each somatic mutation identified by NGS. This FDR confidence value accurately discriminates true mutations from erroneous calls. Using sequencing data generated from triplicate exome profiling of C57BL/6 mice and B16-F10 melanoma cells, we used the existing algorithms GATK, SAMtools and SomaticSNiPer to identify somatic mutations. For each identified mutation, our algorithm assigned an FDR. We selected 139 mutations for validation, including 50 somatic mutations assigned a low FDR (high confidence) and 44 mutations assigned a high FDR (low confidence). All of the high-confidence somatic mutations validated (50 of 50), none of the 44 low-confidence somatic mutations validated, and 15 of 45 mutations with an intermediate FDR validated. Furthermore, the assignment of a single FDR to individual mutations enables statistical comparisons of lab and computation methodologies, including ROC curves and AUC metrics. Using the HiSeq 2000, single-end 50 nt reads from replicates generate the highest confidence somatic mutation call set.
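The paper derives its FDR from triplicate profiling; as a generic illustration of turning per-mutation statistics into FDR-style confidence values, here is the standard Benjamini-Hochberg adjustment (a stand-in for the idea, not the authors' exact algorithm, and the p-values are made up):

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjustment: p-values -> FDR-style q-values.

    q_i = min over ranks j >= i of p_(j) * n / j, computed on the
    sorted p-values and mapped back to the original input order.
    """
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    q = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        q[i] = running_min
    return q

# Made-up per-mutation p-values; a low q-value marks a high-confidence call
fdrs = benjamini_hochberg([0.001, 0.01, 0.02, 0.5])
```

Ranking every candidate mutation by such a value is what makes the ROC/AUC comparisons mentioned above possible.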

15.
Fungal natural product discovery in the post-genomic era   (Total citations: 1; self-citations: 0; citations by others: 1)
Secondary metabolites produced by fungi are an important source of new drug discovery. In recent years, however, traditional approaches to fungal natural product discovery have faced great challenges in an era when large numbers of fungal genomes have been sequenced. How to exploit these genome data to discover new natural products in fungi has become a focus and hotspot of natural product research in the post-genomic era. This review introduces the major classes of fungal natural products and the characteristics of their corresponding gene clusters and backbone enzymes; the new discovery strategies developed from genome-mining techniques; and the current application of synthetic biology concepts and technologies to fungal natural product discovery. Finally, we discuss the research frontiers of natural product discovery in the post-genomic era and the prospects for applying genome data to the discovery of fungal natural products.

16.
High-throughput screening (HTS) using high-density microplates is the primary method for the discovery of novel lead candidate molecules. However, new strategies that eschew 2D microplate technology, including technologies that enable mass screening of targets against large combinatorial libraries, have the potential to greatly increase throughput and decrease unit cost. This review presents an overview of state-of-the-art microplate-based HTS technology and includes a discussion of emerging miniaturized systems for HTS. We focus on new methods of encoding combinatorial libraries that promise throughputs of as many as 100,000 compounds per second.

17.
High-throughput screening (HTS) plays a central role in modern drug discovery, allowing the rapid screening of large compound collections against a variety of putative drug targets. HTS is an industrial-scale process, relying on sophisticated automation, control, and state-of-the-art detection technologies to organize, test, and measure hundreds of thousands to millions of compounds in nano- to microliter volumes. Despite this high technology, hit selection for HTS is still typically done using simple data analysis and basic statistical methods. The authors discuss in this article some shortcomings of these methods and present alternatives based on modern methods of statistical data analysis. Most important, they describe and show numerous real examples from the biologist-friendly Stat Server HTS application (SHS), a custom-developed software tool built on the commercially available S-PLUS and StatServer statistical analysis and server software. This system remotely processes HTS data using powerful and sophisticated statistical methodology but insulates users from the technical details by outputting results in a variety of readily interpretable graphs and tables.

18.

Background  

High-throughput sequencing (HTS) technologies play important roles in the life sciences by allowing the rapid parallel sequencing of very large numbers of relatively short nucleotide sequences, in applications ranging from genome sequencing and resequencing to digital microarrays and ChIP-Seq experiments. As experiments scale up, HTS technologies create new bioinformatics challenges for the storage and sharing of HTS data.

19.
MOTIVATION: High-throughput screening (HTS) is an early-stage process in drug discovery which allows thousands of chemical compounds to be tested in a single study. We report a method for correcting HTS data prior to the hit selection process (i.e. selection of active compounds). The proposed correction minimizes the impact of systematic errors which may affect the hit selection in HTS. The introduced method, called a well correction, proceeds by correcting the distribution of measurements within wells of a given HTS assay. We use simulated and experimental data to illustrate the advantages of the new method compared to other widely used methods of data correction and hit selection in HTS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
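The well correction concept (adjusting the distribution of measurements at each well position across the assay) can be sketched as a per-well z-score across plates; this is an illustrative simplification of the idea, not the published procedure, and the plate data are made up:

```python
from statistics import mean, stdev

def well_correction(plates):
    """Z-score each well position across plates to remove positional bias.

    `plates` is a list of plates, each a list of raw measurements in the
    same well order.  A systematic offset hitting the same well on every
    plate (e.g. an edge effect) is subtracted out before hit selection.
    """
    n_wells = len(plates[0])
    corrected = [list(plate) for plate in plates]
    for w in range(n_wells):
        column = [plate[w] for plate in plates]  # same well, all plates
        mu, sd = mean(column), stdev(column)
        for plate in corrected:
            plate[w] = (plate[w] - mu) / sd if sd > 0 else 0.0
    return corrected

# Toy assay of three plates x two wells; well 0 carries a +10 artifact
raw = [[110, 100], [112, 103], [111, 97]]
print(well_correction(raw))  # prints "[[-1.0, 0.0], [1.0, 1.0], [0.0, -1.0]]"
```

After such a correction, hits can be selected by thresholding the normalized values rather than the raw signals, so wells with systematic offsets no longer dominate the hit list.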

20.

Background

Patient-derived tumor xenografts in mice are widely used in cancer research and have become important in developing personalized therapies. When these xenografts are subject to DNA sequencing, the samples could contain various amounts of mouse DNA. It has been unclear how the mouse reads would affect data analyses. We conducted comprehensive simulations to compare three alignment strategies at different mutation rates, read lengths, sequencing error rates, human-mouse mixing ratios and sequenced regions. We also sequenced a nasopharyngeal carcinoma xenograft and a cell line to test how the strategies work on real data.

Results

We found that the "filtering" and "combined reference" strategies performed better than aligning reads directly to the human reference in terms of alignment and variant calling accuracy. The combined reference strategy was particularly good at reducing false-negative variant calls without significantly increasing the false-positive rate. In some scenarios the performance gain of these two special handling strategies was too small for special handling to be cost-effective, but it was found crucial when false non-synonymous SNVs should be minimized, especially in exome sequencing.
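A combined-reference run concatenates the human and mouse genomes so that each read aligns to whichever species it matches best, and only human-assigned reads enter variant calling. The species-assignment step can be caricatured with k-mer matching (purely illustrative: real pipelines use a full aligner such as BWA, and the reference fragments below are made up):

```python
def assign_species(read, refs, k=4):
    """Assign a read to the reference sharing the most k-mers with it.

    `refs` maps species name -> reference sequence.  Loosely mimics
    aligning against a combined human+mouse reference and keeping the
    best-scoring hit for each read.
    """
    read_kmers = {read[i:i + k] for i in range(len(read) - k + 1)}

    def score(seq):
        ref_kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        return len(read_kmers & ref_kmers)

    return max(refs, key=lambda name: score(refs[name]))

# Made-up reference fragments; only reads assigned "human" would go on
# to variant calling in a xenograft pipeline
refs = {
    "human": "ACGTACGTTGCAGGCA",
    "mouse": "TTTTGGGGCCCCAAAA",
}
print(assign_species("ACGTACGT", refs))  # prints "human"
```

Discarding mouse-assigned reads before calling is what suppresses the false-positive variants that mouse contamination would otherwise introduce.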

Conclusions

Our study systematically analyzes the effects of mouse contamination in the sequencing data of human-in-mouse xenografts. Our findings provide information for designing data analysis pipelines for these data.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1172) contains supplementary material, which is available to authorized users.
