Found 20 similar articles (search time: 15 ms)
1.
Background
New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data.
2.
Background
Recent high-throughput sequencing technologies are capable of generating a huge amount of data for bacterial genome sequencing projects. Although current sequence assemblers successfully merge the overlapping reads, often several contigs remain which cannot be assembled any further. It is still costly and time-consuming to close all the gaps in order to acquire the whole genomic sequence.
3.
Background
Next-generation sequencing technologies allow genomes to be sequenced more quickly and less expensively than ever before. However, as sequencing technology has improved, the difficulty of de novo genome assembly has increased, due in large part to the shorter reads generated by the new technologies. The use of mated sequences (referred to as mate-pairs) is a standard means of disambiguating assemblies to obtain a more complete picture of the genome without resorting to manual finishing. Here, we examine the effectiveness of mate-pair information in resolving repeated sequences in the DNA (a paramount issue to overcome). While it has been empirically accepted that mate-pairs improve assemblies, and a variety of assemblers use mate-pairs in the context of repeat resolution, the effectiveness of mate-pairs in this context has not been systematically evaluated in previous literature.
4.
Background
De novo genome assembly of next-generation sequencing data is one of the most important current problems in bioinformatics, essential in many biological applications. In spite of a significant amount of work in this area, better solutions are still very much needed.
Results
We present a new program, SAGE, for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on the string-overlap graph and maximum likelihood assembly, bringing a substantial number of new ideas, such as the efficient computation of the transitive reduction of the string-overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers.
Conclusions
SAGE benefits from innovations in almost every aspect of the assembly process: error correction of input reads, string-overlap graph construction, read copy counts estimation, overlap graph analysis and reduction, contig extraction, and scaffolding. We hope that these new ideas will help advance the current state-of-the-art in an essential area of research in genomics.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-302) contains supplementary material, which is available to authorized users.
5.
Background
Viruses have unique properties, such as small genomes and regions of high similarity, whose effects on metagenomic assemblies have not been characterized so far. This study uses diverse in silico simulated viromes to evaluate how extensively genomes can be assembled using different sequencing platforms and assemblers. Further, it investigates the suitability of different methods to estimate viral diversity in metagenomes.
Results
We created in silico metagenomes mimicking various platforms at different sequencing depths. The CLC assembler proved subpar compared to IDBA_UD and CAMERA, which are metagenomic-specific. Up to a saturation point, Illumina platforms proved more capable of reconstructing large portions of viral genomes compared to 454. Read length was an important factor for limiting chimericity, while scaffolding marginally improved contig length and accuracy. The genome length of the various viruses in the metagenomes did not significantly affect genome reconstruction, but the co-existence of highly similar genomes was detrimental. When evaluating diversity estimation tools, we found that PHACCS results were more accurate than those from CatchAll and clustering, which were both orders of magnitude above expected.
Conclusions
Assemblers designed specifically for the analysis of metagenomes should be used to facilitate the creation of high-quality long contigs. Despite the high coverage possible, scientists should not expect to always obtain complete genomes, because their reconstruction may be hindered by co-existing species bearing highly similar genomic regions. Further development of metagenomics-oriented assemblers may help bypass these limitations in future studies. Meanwhile, the lack of fully reconstructed communities keeps methods to estimate viral diversity relevant. While none of the three methods tested had absolute precision, only PHACCS was deemed suitable for comparative studies.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-989) contains supplementary material, which is available to authorized users.
6.
Background
A "physiologically based pharmacokinetic" (PBPK) approach uses a realistic model of the animal to describe the pharmacokinetics. Previous PBPKs have been designed for specific solutes, required specification of a large number of parameters and have not been designed for general use.
7.
Background
Avida is a computer program that performs evolution experiments with digital organisms. Previous work has used the program to study the evolutionary origin of complex features, namely logic operations, but has consistently used extremely large mutational fitness effects. The present study uses Avida to better understand the role of low-impact mutations in evolution.
8.
Codruta Ignea Ivana Cvetkovic Sofia Loupassaki Panagiotis Kefalas Christopher B Johnson Sotirios C Kampranis Antonios M Makris
Background
Terpenoids constitute a large family of natural products, attracting commercial interest for a variety of uses as flavours, fragrances, drugs and alternative fuels. Saccharomyces cerevisiae offers a versatile cell factory, as the precursors of terpenoid biosynthesis are naturally synthesized by the sterol biosynthetic pathway.
9.
10.
Jamie Twycross Leah R Band Malcolm J Bennett John R King Natalio Krasnogor 《BMC systems biology》2010,4(1):34
Background
Stochastic and asymptotic methods are powerful tools in developing multiscale systems biology models; however, little has been done in this context to compare the efficacy of these methods. The majority of current systems biology modelling research, including that of auxin transport, uses numerical simulations to study the behaviour of large systems of deterministic ordinary differential equations, with little consideration of alternative modelling frameworks.
11.
Will Stott Andy Ryan Ian J Jacobs Usha Menon Conrad Bessant Christopher Jones 《Source code for biology and medicine》2008,3(1):11
Background
Ultrasound scanning uses the medical imaging format, DICOM, for electronically storing the images and data associated with a particular scan. Large health care facilities typically use a picture archiving and communication system (PACS) for storing and retrieving such images. However, these systems are usually not suitable for managing large collections of anonymized ultrasound images gathered during a clinical screening trial.
12.
13.
14.
15.
Adonney Allan de Oliveira Veras Pablo Henrique Caracciolo Gomes de Sá Vasco Azevedo Artur Silva Rommel Thiago Jucá Ramos 《Bioinformation》2013,9(16):840-841
Next-generation sequencing technologies have increased the amount of biological data generated. Thus, bioinformatics has become important because new methods and algorithms are necessary to manipulate and process such data. However, certain challenges have emerged, such as genome assembly using short reads and high-throughput platforms. In this context, several algorithms have been developed, such as Velvet, Abyss, Euler-SR, Mira, Edna, Maq, SHRiMP, Newbler, ALLPATHS, Bowtie and BWA. However, most such assemblers do not have a graphical interface, which makes their use difficult for users without computing experience given the complexity of the assembler syntax. Thus, to make the operation of such assemblers accessible to users without a computing background, we developed AutoAssemblyD, which is a graphical tool for genome assembly submission and remote management by multiple assemblers through XML templates.
Availability
AutoAssemblyD is freely available at https://sourceforge.net/projects/autoassemblyd. It requires Sun JDK 6 or higher.
16.
Evolution of plant senescence (total citations: 3; self-citations: 0; citations by others: 3)
Background
Senescence is integral to the flowering plant life-cycle. Senescence-like processes occur also in non-angiosperm land plants, algae and photosynthetic prokaryotes. Increasing numbers of genes have been assigned functions in the regulation and execution of angiosperm senescence. At the same time there has been a large expansion in the number and taxonomic spread of plant sequences in the genome databases. The present paper uses these resources to make a study of the evolutionary origins of angiosperm senescence based on a survey of the distribution, across plant and microbial taxa, and expression of senescence-related genes.
17.
Background
Finishing is the process of improving the quality and utility of draft genome sequences generated by shotgun sequencing and computational assembly. Finishing can involve targeted sequencing. Finishing reads may be incorporated by manual or automated means. One automated method uses targeted addition by local re-assembly of gap regions. An obvious alternative uses de novo assembly of all the reads.
18.
19.
Olivier Delaneau Cédric Coulonges Pierre-Yves Boelle George Nelson Jean-Louis Spadoni Jean-François Zagury 《BMC bioinformatics》2007,8(1):205
Background
We have developed a new haplotyping program based on the combination of an iterative multiallelic EM algorithm (IEM), bootstrap resampling and a pseudo Gibbs sampler. The use of the IEM-bootstrap procedure considerably reduces the space of possible haplotype configurations to be explored, greatly reducing computation time, while the adaptation of the Gibbs sampler with a recombination model on this restricted space maintains high accuracy. On large SNP datasets (>30 SNPs), we used a segmented approach based on a specific partition-ligation strategy. We compared this software, Ishape (Iterative Segmented HAPlotyping by Em), with reference programs such as Phase, Fastphase, and PL-EM. Analogously with Phase, there are 2 versions of Ishape: Ishape1 which uses a simple coalescence model for the pseudo Gibbs sampler step, and Ishape2 which uses a recombination model instead.
20.