We present Quip, a lossless compression algorithm for next-generation sequencing data in the FASTQ and SAM/BAM formats. In addition to implementing reference-based compression, we have developed, to our knowledge, the first assembly-based compressor, using a novel de novo assembly algorithm. A probabilistic data structure is used to dramatically reduce the memory required by traditional de Bruijn graph assemblers, allowing millions of reads to be assembled very efficiently. Read sequences are then stored as positions within the assembled contigs. This is combined with statistical compression of read identifiers, quality scores, alignment information and sequences, effectively collapsing very large data sets to <15% of their original size with no loss of information. Availability: Quip is freely available under the 3-clause BSD license from http://cs.washington.edu/homes/dcjones/quip.
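The abstract does not say which probabilistic data structure Quip uses. As a rough illustration of the general idea only, the sketch below uses a plain Bloom filter as a stand-in to record which k-mers occur in a set of reads without storing the k-mers themselves. The class name, the default sizes, and the helper functions are all hypothetical and are not taken from Quip.

```python
import hashlib


class KmerBloomFilter:
    """Approximate k-mer set: no false negatives, some false positives.

    Illustrative stand-in only; not the data structure used by Quip.
    """

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, kmer):
        # Derive several bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(kmer.encode(), salt=bytes([seed])).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(kmer)
        )


def kmers(read, k):
    """Yield every length-k substring of a read, left to right."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


def build_filter(reads, k):
    """Insert all k-mers of all reads into a fresh filter."""
    bloom = KmerBloomFilter()
    for read in reads:
        for kmer in kmers(read, k):
            bloom.add(kmer)
    return bloom
```

A membership test such as `"ACGT" in build_filter(["ACGTACGT"], 4)` always succeeds for k-mers that were inserted. A k-mer that was never inserted may occasionally also test positive, which is the price paid for the fixed, small memory footprint.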

10.

Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms

Daniel I Speiser M Sabrina Pankey Alexander K Zaharoff Barbara A Battelle Heather D Bracken-Grissom Jesse W Breinholt Seth M Bybee Thomas W Cronin Anders Garm Annie R Lindgren Nipam H Patel Megan L Porter Meredith E Protas Ajna S Rivera Jeanne M Serb Kirk S Zigler Keith A Crandall Todd H Oakley 《BMC bioinformatics》2014,15(1)

11.

Optimal assembly strategies of transcriptome related to ploidies of eukaryotic organisms

Bin He Shirong Zhao Yuehong Chen Qinghua Cao Changhe Wei Xiaojie Cheng Yizheng Zhang 《BMC genomics》2015,16(1)

12.

RADtyping: An Integrated Package for Accurate De Novo Codominant and Dominant RAD Genotyping in Mapping Populations

Xiaoteng Fu Jinzhuang Dou Junxia Mao Hailin Su Wenqian Jiao Lingling Zhang Xiaoli Hu Xiaoting Huang Shi Wang Zhenmin Bao 《PloS one》2013,8(11)

Genetic linkage maps are indispensable tools in genetic, genomic and breeding studies. As one of the genotyping-by-sequencing methods, RAD-Seq (restriction-site associated DNA sequencing) has gained particular popularity for construction of high-density linkage maps. Current RAD analytical tools are predominantly used for typing codominant markers. However, no genotyping algorithm has been developed for dominant markers (resulting from recognition site disruption). Given their abundance in eukaryotic genomes, utilization of dominant markers would greatly diminish the extensive sequencing effort required for large-scale marker development. In this study, we established, for the first time, a novel statistical framework for de novo dominant genotyping in mapping populations. An integrated package called RADtyping was developed by incorporating both de novo codominant and dominant genotyping algorithms. We demonstrated the superb performance of RADtyping in achieving remarkably high genotyping accuracy based on simulated and real mapping datasets. The RADtyping package is freely available at http://www2.ouc.edu.cn/mollusk/detailen.asp?id=727.

13.

Corset: enabling differential gene expression analysis for de novo assembled transcriptomes

Nadia M Davidson Alicia Oshlack 《Genome biology》2014,15(7)

14.

SHEAR: sample heterogeneity estimation and assembly by reference

Sean R Landman Tae Hyun Hwang Kevin AT Silverstein Yingming Li Scott M Dehm Michael Steinbach Vipin Kumar 《BMC genomics》2014,15(1)

Background

Personal genome assembly is a critical process when studying tumor genomes and other highly divergent sequences. The accuracy of downstream analyses, such as RNA-seq and ChIP-seq, can be greatly enhanced by using personal genomic sequences rather than standard references. Unfortunately, reads sequenced from these types of samples often have a heterogeneous mix of various subpopulations with different variants, making assembly extremely difficult using existing assembly tools. To address these challenges, we developed SHEAR (Sample Heterogeneity Estimation and Assembly by Reference; http://vk.cs.umn.edu/SHEAR), a tool that predicts structural variants (SVs), accounts for heterogeneous variants by estimating their representative percentages, and generates personal genomic sequences to be used for downstream analysis.

Results

By making use of structural variant detection algorithms, SHEAR offers improved performance in the form of a stronger ability to handle difficult structural variant types and better computational efficiency. We compare against the lead competing approach using a variety of simulated scenarios as well as real tumor cell line data with known heterogeneous variants. SHEAR is shown to successfully estimate heterogeneity percentages in both cases, and demonstrates an improved efficiency and better ability to handle tandem duplications.

Conclusion

SHEAR allows for accurate and efficient SV detection and personal genomic sequence generation. It is also able to account for heterogeneous sequencing samples, such as from tumor tissue, by estimating the subpopulation percentage for each heterogeneous variant.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-84) contains supplementary material, which is available to authorized users.

15.

PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms

Ruei-Chi Gan Ting-Wen Chen Timothy H. Wu Po-Jung Huang Chi-Ching Lee Yuan-Ming Yeh Cheng-Hsun Chiu Hsien-Da Huang Petrus Tang 《BMC bioinformatics》2016,17(19):513

16.

Computing folding pathways between RNA secondary structures

Ivan Dotu William A. Lorenz Pascal Van Hentenryck Peter Clote 《Nucleic acids research》2010,38(5):1711-1722

Given an RNA sequence and two designated secondary structures A, B, we describe a new algorithm that computes a nearly optimal folding pathway from A to B. The algorithm, RNAtabupath, employs a tabu semi-greedy heuristic, known to be an effective search strategy in combinatorial optimization. Folding pathways, sometimes called routes or trajectories, are computed by RNAtabupath in a fraction of the time required by the barriers program of Vienna RNA Package. We benchmark RNAtabupath with other algorithms to compute low energy folding pathways between experimentally known structures of several conformational switches. The RNApathfinder web server, source code for algorithms to compute and analyze pathways and supplementary data are available at http://bioinformatics.bc.edu/clotelab/RNApathfinder.
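To make the notion of a folding pathway concrete, the sketch below builds a path from structure A to structure B in which each step adds or removes a single base pair. It is not RNAtabupath: it uses a much simpler greedy direct-path heuristic instead of the tabu semi-greedy search, and a toy energy (minus the number of base pairs) instead of a thermodynamic model. Structures are assumed to be sets of (i, j) index pairs with i < j, the target is assumed to be a valid pseudoknot-free structure, and all function names are hypothetical.

```python
def compatible(pair, structure):
    """True if `pair` shares no base with, and does not cross, any existing pair."""
    i, j = pair
    for k, l in structure:
        if len({i, j, k, l}) < 4:
            return False  # a base would be paired twice
        if i < k < j < l or k < i < l < j:
            return False  # the two pairs would cross (pseudoknot)
    return True


def greedy_direct_path(start, target):
    """Return a list of structures leading from `start` to `target`.

    Each step adds one target pair if any fits; otherwise it removes
    one pair that does not belong to the target.
    """
    current = set(start)
    target = set(target)
    path = [frozenset(current)]
    while current != target:
        addable = sorted(p for p in target - current if compatible(p, current))
        if addable:
            current.add(addable[0])
        else:
            current.remove(min(current - target))
        path.append(frozenset(current))
    return path


def toy_barrier(path):
    """Highest toy energy along the path, relative to the starting structure."""
    energies = [-len(structure) for structure in path]
    return max(energies) - energies[0]
```

For example, `greedy_direct_path({(0, 5)}, {(2, 8)})` must first remove the pair (0, 5) before it can add the crossing pair (2, 8), so the path passes through the empty structure. A real pathway search would also consider moves away from the target and would score structures with a proper energy function.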

17.

TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions

Daehwan Kim Geo Pertea Cole Trapnell Harold Pimentel Ryan Kelley Steven L Salzberg 《Genome biology》2013,14(4):R36

18.

A new approach for annotation of transposable elements using small RNA mapping

Moaine El Baidouri Kyung Do Kim Brian Abernathy Siwaret Arikit Florian Maumus Olivier Panaud Blake C. Meyers Scott A. Jackson 《Nucleic acids research》2015,43(13):e84

Transposable elements (TEs) are mobile genomic DNA sequences found in most organisms. They so densely populate the genomes of many eukaryotic species that they are often the major constituents. With the rapid generation of many plant genome sequencing projects over the past few decades, there is an urgent need for improved TE annotation as a prerequisite for genome-wide studies. Analogous to the use of RNA-seq for gene annotation, we propose a new method for de novo TE annotation that uses as a guide 24 nt-siRNAs that are a part of TE silencing pathways. We use this new approach, called TASR (for Transposon Annotation using Small RNAs), for de novo annotation of TEs in Arabidopsis, rice and soybean and demonstrate that this strategy can be successfully applied for de novo TE annotation in plants. Executable PERL is available for download from: http://tasr-pipeline.sourceforge.net/

19.

The Medicago sativa gene index 1.2: a web-accessible gene expression atlas for investigating expression differences between Medicago sativa subspecies

Jamie A. O’Rourke Fengli Fu Bruna Bucciarelli S. Sam Yang Deborah A. Samac JoAnn F. S. Lamb Maria J. Monteros Michelle A. Graham John W. Gronwald Nick Krom Jun Li Xinbin Dai Patrick X. Zhao Carroll P. Vance 《BMC genomics》2015,16(1)

20.

Schistosoma mansoni Egg, Adult Male and Female Comparative Gene Expression Analysis and Identification of Novel Genes by RNA-Seq

Letícia Anderson Murilo S. Amaral Felipe Beckedorff Lucas F. Silva Bianca Dazzani Katia C. Oliveira Giulliana T. Almeida Monete R. Gomes David S. Pires João C. Setubal Ricardo DeMarco Sergio Verjovski-Almeida 《PLoS neglected tropical diseases》2015,9(12)
