Found 20 similar articles (search time: 15 ms)
1.
Juan Falgueras Antonio J Lara Noé Fernández-Pozo Francisco R Cantón Guillermo Pérez-Trabado M Gonzalo Claros 《BMC bioinformatics》2010,11(1):38
Background
High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing enhances this problem and necessitates customisable pre-processing algorithms.
2.
Background
Next-generation sequencing technologies have led to the high-throughput production of sequence data (reads) at low cost. However, these reads are significantly shorter and more error-prone than conventional Sanger shotgun reads. This poses a challenge for the de novo assembly in terms of assembly quality and scalability for large-scale short read datasets.
3.
Background
Trace or chromatogram files (raw data) are produced by automatic nucleic acid sequencing equipment or sequencers. Each file contains information which can be interpreted by specialised software to reveal the sequence (base calling). This is done by the sequencer proprietary software or publicly available programs. Depending on the size of a sequencing project the number of trace files can vary from just a few to thousands of files. Sequencing quality assessment on various criteria is important at the stage preceding clustering and contig assembly. Two major publicly available packages – Phred and Staden are used by preAssemble to perform sequence quality processing.
4.
Background
Finishing is the process of improving the quality and utility of draft genome sequences generated by shotgun sequencing and computational assembly. Finishing can involve targeted sequencing. Finishing reads may be incorporated by manual or automated means. One automated method uses targeted addition by local re-assembly of gap regions. An obvious alternative uses de novo assembly of all the reads.
5.
Chongle Pan Byung H Park William H McDonald Patricia A Carey Jillian F Banfield Nathan C VerBerkmoes Robert L Hettich Nagiza F Samatova 《BMC bioinformatics》2010,11(1):118
Background
High-resolution tandem mass spectra can now be readily acquired with hybrid instruments, such as LTQ-Orbitrap and LTQ-FT, in high-throughput shotgun proteomics workflows. The improved spectral quality enables more accurate de novo sequencing for identification of post-translational modifications and amino acid polymorphisms.
6.
Pavle Goldstein Jurica Zucko Dušica Vujaklija Anita Kriško Daslav Hranueli Paul F Long Catherine Etchebest Bojan Basrak John Cullum 《BMC bioinformatics》2009,10(1):335
Background
The number of protein family members defined by DNA sequencing is usually much larger than those characterised experimentally. This paper describes a method to divide protein families into subtypes purely on sequence criteria. Comparison with experimental data allows an independent test of the quality of the clustering.
7.
Elizabeth T Cirulli Abanish Singh Kevin V Shianna Dongliang Ge Jason P Smith Jessica M Maia Erin L Heinzen James J Goedert David B Goldstein the Center for HIV/AIDS Vaccine Immunology 《Genome biology》2010,11(5):R57
Background
There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. There are three approaches possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost effective alternatives are important.
9.
Background
Better automation, lower cost per reaction and a heightened interest in comparative genomics has led to a dramatic increase in DNA sequencing activities. Although the large sequencing projects of specialized centers are supported by in-house bioinformatics groups, many smaller laboratories face difficulties managing the appropriate processing and storage of their sequencing output. The challenges include documentation of clones, templates and sequencing reactions, and the storage, annotation and analysis of the large number of generated sequences.
10.
Background
Second-generation sequencing has the potential to revolutionize genomics and impact all areas of biomedical science. New technologies will make re-sequencing widely available for such applications as identifying genome variations or interrogating the oligonucleotide content of a large sample (e.g. ChIP-sequencing). The increase in speed, sensitivity and availability of sequencing technology brings demand for advances in computational technology to perform associated analysis tasks. The Solexa/Illumina 1G sequencer can produce tens of millions of reads, ranging in length from ~25–50 nt, in a single experiment. Accurately mapping the reads back to a reference genome is a critical task in almost all applications. Two sources of information that are often ignored when mapping reads from the Solexa technology are the 3' ends of longer reads, which contain a much higher frequency of sequencing errors, and the base-call quality scores.
11.
12.
Cristian Coarfa Fuli Yu Christopher A Miller Zuozhou Chen R Alan Harris Aleksandar Milosavljevic 《BMC bioinformatics》2010,11(1):572
Background
Massively parallel sequencing readouts of epigenomic assays are enabling integrative genome-wide analyses of genomic and epigenomic variation. Pash 3.0 performs sequence comparison and read mapping and can be employed as a module within diverse configurable analysis pipelines, including ChIP-Seq and methylome mapping by whole-genome bisulfite sequencing.
13.
Erik Arner Martti T Tammi Anh-Nhi Tran Ellen Kindlund Bjorn Andersson 《BMC bioinformatics》2006,7(1):155-11
Background
Many genome projects are left unfinished due to complex, repeated regions. Finishing is the most time consuming step in sequencing and current finishing tools are not designed with particular attention to the repeat problem.
14.
Background
In expressed sequence tag (EST) sequencing, we are often interested in how many genes we can capture in an EST sample of a targeted size. This information provides insights to sequencing efficiency in experimental design, as well as clues to the diversity of expressed genes in the tissue from which the library was constructed.
15.
Robert Kofler Tatiana Teixeira Torres Tamas Lelley Christian Schlötterer 《BMC bioinformatics》2009,10(1):143
Background
Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA does not only provide an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression.
16.
Background
Recent high throughput sequencing technologies are capable of generating a huge amount of data for bacterial genome sequencing projects. Although current sequence assemblers successfully merge the overlapping reads, often several contigs remain which cannot be assembled any further. It is still costly and time consuming to close all the gaps in order to acquire the whole genomic sequence.
17.
Background
The generation and analysis of high-throughput sequencing data are becoming a major component of many studies in molecular biology and medical research. Illumina's Genome Analyzer (GA) and HiSeq instruments are currently the most widely used sequencing devices. Here, we comprehensively evaluate properties of genomic HiSeq and GAIIx data derived from two plant genomes and one virus, with read lengths of 95 to 150 bases.
18.
Background
Mitochondria are highly complex, membrane-enclosed organelles that are essential to the eukaryotic cell. The experimental elucidation of organellar proteomes combined with the sequencing of complete genomes allows us to trace the evolution of the mitochondrial proteome.
19.
Background
New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data.
20.
Jose C Jimenez-Lopez Emma W Gachomo Manfredo J Seufferheld Simeon O Kotchoni 《BMC structural biology》2010,10(1):43