1.

Identification of Cytotoxic T Lymphocyte Epitopes on Swine Viruses: Multi-Epitope Design for Universal T Cell Vaccine

Yu-Chieh Liao Hsin-Hung Lin Chieh-Hua Lin Wen-Bin Chung 《PloS one》2013,8(12)

Classical swine fever (CSF), foot-and-mouth disease (FMD) and porcine reproductive and respiratory syndrome (PRRS) are the primary diseases affecting the pig industry globally. Vaccine-induced CD8⁺ T cell-mediated immune responses may be long-lived and cross-serotype, and thus deserve further attention. Although large panels of synthetic overlapping peptides spanning the entire length of a virus's polyproteins facilitate the detection of cytotoxic T lymphocyte (CTL) epitopes, this is an exceedingly costly and cumbersome approach. Alternatively, computational predictions have been shown to be of satisfactory accuracy and are easily performed. Such a method enables the systematic identification of genome-wide CTL epitopes by incorporating epitope prediction tools in the analysis of large numbers of viral sequences. In this study, we have implemented an integrated bioinformatics pipeline for the identification of CTL epitopes of swine viruses, including the CSF virus (CSFV), FMD virus (FMDV) and PRRS virus (PRRSV), and assembled these epitopes on a web resource to facilitate vaccine design. Epitopes that may confer cross-protection against different virus subtypes are also reported in this study and may be useful for the development of a universal vaccine against such viral infections among the swine population. The CTL epitopes identified in this study have been evaluated in silico and may provide broader protection than traditional single-reference vaccine designs. The web resource is free and open to all users through http://sb.nhri.org.tw/ICES.

2.

Simplifier: a web tool to eliminate redundant NGS contigs

Rommel Thiago Jucá Ramos Adriana Ribeiro Carneiro Vasco Azevedo Maria Paula Schneider Debmalya Barh Artur Silva 《Bioinformation》2012,8(20):996-999

Modern genomic sequencing technologies produce a large amount of data with reduced cost per base; however, this data consists of short reads. This reduction in the size of the reads, compared to those obtained with previous methodologies, presents new challenges, including a need for efficient algorithms for the assembly of genomes from short reads and for resolving repetitions. Additionally, after ab initio assembly, curation of the hundreds or thousands of contigs generated by assemblers demands considerable time and computational resources. We developed Simplifier, a stand-alone program that selectively eliminates redundant sequences from the collection of contigs generated by ab initio assembly of genomes. Application of Simplifier to data generated by assembly of the genome of Corynebacterium pseudotuberculosis strain 258 reduced the number of contigs generated by ab initio methods from 8,004 to 5,272, a reduction of 34.14%; in addition, N50 increased from 1 kb to 1.5 kb. Processing the contigs of Escherichia coli DH10B with Simplifier reduced the mate-paired library by 17.47% and the fragment library by 23.91%. In tests with data from prokaryotic organisms, Simplifier removed redundant sequences from datasets produced by assemblers, thereby reducing the effort required for finalization of genome assembly.

Availability

Simplifier is available at http://www.genoma.ufpa.br/rramos/softwares/simplifier.xhtml. It requires Sun JDK 6 or higher.
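
The N50 statistic quoted in this abstract can be computed from a list of contig lengths. The following Python sketch shows one common way to do so; the function name and the toy lengths are illustrative assumptions and are not taken from Simplifier itself.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy data (hypothetical): dropping short redundant contigs raises N50.
before = [2000, 1500, 1000, 500, 500, 500, 500, 500, 500]
after = [2000, 1500, 1000]
print(n50(before), n50(after))
```

Because N50 depends on the total assembly length, removing many short redundant contigs can raise it even though no contig becomes longer.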

3.

MetaVelvet-SL: an extension of the Velvet assembler to a de novo metagenomic assembler utilizing supervised learning

Afiahayati Kengo Sato Yasubumi Sakakibara 《DNA research》2015,22(1):69-77

The assembly of multiple genomes from mixed sequence reads is a bottleneck in metagenomic analysis. A single-genome assembly program (assembler) is not capable of resolving metagenome sequences, so assemblers designed specifically for metagenomics have been developed. MetaVelvet is an extension of the single-genome assembler Velvet. It has been proved to generate assemblies with higher N50 scores and higher quality than single-genome assemblers such as Velvet and SOAPdenovo when applied to metagenomic sequence reads and is frequently used in this research community. One important open problem for MetaVelvet is its low accuracy and sensitivity in detecting chimeric nodes in the assembly (de Bruijn) graph, which prevents the generation of longer contigs and scaffolds. We have tackled this problem of classifying chimeric nodes using supervised machine learning to significantly improve the performance of MetaVelvet and developed a new tool, called MetaVelvet-SL. A Support Vector Machine is used for learning the classification model based on 94 features extracted from candidate nodes. In extensive experiments, MetaVelvet-SL outperformed the original MetaVelvet and other state-of-the-art metagenomic assemblers, IDBA-UD, Ray Meta and Omega, to reconstruct accurate longer assemblies with higher N50 scores for both simulated data sets and real data sets of human gut microbial sequences.

4.

AutoAssemblyD: a graphical user interface system for several genome assemblers

Adonney Allan de Oliveira Veras Pablo Henrique Caracciolo Gomes de Sá Vasco Azevedo Artur Silva Rommel Thiago Jucá Ramos 《Bioinformation》2013,9(16):840-841

Next-generation sequencing technologies have increased the amount of biological data generated. Thus, bioinformatics has become important because new methods and algorithms are necessary to manipulate and process such data. However, certain challenges have emerged, such as genome assembly using short reads and high-throughput platforms. In this context, several algorithms have been developed, such as Velvet, Abyss, Euler-SR, Mira, Edna, Maq, SHRiMP, Newbler, ALLPATHS, Bowtie and BWA. However, most such assemblers do not have a graphical interface, which makes their use difficult for users without computing experience given the complexity of the assembler syntax. Thus, to make the operation of such assemblers accessible to users without a computing background, we developed AutoAssemblyD, which is a graphical tool for genome assembly submission and remote management by multiple assemblers through XML templates.

Availability

AutoAssemblyD is freely available at https://sourceforge.net/projects/autoassemblyd. It requires Sun JDK 6 or higher.

5.

CAR: contig assembly of prokaryotic draft genomes using rearrangements

Chin Lung Lu Kun-Tze Chen Shih-Yuan Huang Hsien-Tai Chiu 《BMC bioinformatics》2014,15(1)

Background

Next generation sequencing technology has allowed efficient production of draft genomes for many organisms of interest. However, most draft genomes are just collections of independent contigs, whose relative positions and orientations along the genome being sequenced are unknown. Although several tools have been developed to order and orient the contigs of draft genomes, more accurate tools are still needed.

Results

In this study, we present a novel reference-based contig assembly (or scaffolding) tool, named CAR, that can efficiently and more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome of a related organism. Given a set of contigs in multi-FASTA format and a reference genome in FASTA format, CAR outputs a list of scaffolds, each of which is a set of ordered and oriented contigs. For validation, we tested CAR on a real dataset composed of several prokaryotic genomes and compared its performance with several other reference-based contig assembly tools. Our experimental results show that CAR performs better than all of these other tools in terms of sensitivity, precision and genome coverage.

Conclusions

CAR serves as an efficient tool that can more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome. The web server of CAR is freely available at http://genome.cs.nthu.edu.tw/CAR/ and its stand-alone program can also be downloaded from the same website.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0381-3) contains supplementary material, which is available to authorized users.
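
To make "ordering and orienting contigs against a reference" concrete, here is a minimal Python sketch. It assumes each contig already has a single best alignment on the reference; the `Hit` record and its fields are hypothetical. It does not reproduce CAR's rearrangement-based method, only the general idea of reference-based scaffolding.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Best alignment of one contig on the reference (hypothetical record)."""
    contig: str
    ref_start: int  # 0-based start coordinate on the reference
    strand: str     # "+" if the contig aligns forward, "-" if reversed


def order_and_orient(hits):
    """Return (contig, strand) pairs sorted by position on the reference."""
    ordered = sorted(hits, key=lambda h: h.ref_start)
    return [(h.contig, h.strand) for h in ordered]


# Toy alignments (assumed values, for illustration only).
hits = [
    Hit("contig_3", 52_000, "+"),
    Hit("contig_1", 0, "-"),
    Hit("contig_2", 18_500, "+"),
]
scaffold = order_and_orient(hits)
print(scaffold)
```

A real tool also has to cope with contigs that align to several places or not at all, which is where approaches such as the one described in this abstract differ from a simple sort.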

6.

GABenchToB: A Genome Assembly Benchmark Tuned on Bacteria and Benchtop Sequencers

Sebastian Jünemann Karola Prior Andreas Albersmeier Stefan Albaum Jörn Kalinowski Alexander Goesmann Jens Stoye Dag Harmsen 《PloS one》2014,9(9)

7.

Efficient de novo assembly of large and complex genomes by massively parallel sequencing of Fosmid pools

Andrey Alexeyenko Björn Nystedt Francesco Vezzi Ellen Sherwood Rosa Ye Bjarne Knudsen Martin Simonsen Benjamin Turner Pieter de Jong Cheng-Cang Wu Joakim Lundeberg 《BMC genomics》2014,15(1)

Background

Sampling genomes with Fosmid vectors and sequencing of pooled Fosmid libraries on the Illumina platform for massive parallel sequencing is a novel and promising approach to optimizing the trade-off between sequencing costs and assembly quality.

Results

In order to sequence the genome of Norway spruce, which is of great size and complexity, we developed and applied a new technology based on the massive production, sequencing, and assembly of Fosmid pools (FP). The spruce chromosomes were sampled with ~40,000 bp Fosmid inserts to obtain around two-fold genome coverage, in parallel with traditional whole genome shotgun sequencing (WGS) of haploid and diploid genomes. Compared to the WGS results, the contiguity and quality of the FP assemblies were high, and they allowed us to fill WGS gaps resulting from repeats, low coverage, and allelic differences. The FP contig sets were further merged with WGS data using a novel software package GAM-NGS.

Conclusions

By exploiting FP technology, the first published assembly of a conifer genome was sequenced entirely with massively parallel sequencing. Here we provide a comprehensive report on the different features of the approach and the optimization of the process. We have made public the input data (FASTQ format) for the set of pools used in this study: ftp://congenie.org/congenie/Nystedt_2013/Assembly/ProcessedData/FosmidPools/ (alternatively accessible via http://congenie.org/downloads). The software used for running the assembly process is available at http://research.scilifelab.se/andrej_alexeyenko/downloads/fpools/.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-439) contains supplementary material, which is available to authorized users.

8.

REAPR: a universal tool for genome assembly evaluation

Martin Hunt Taisei Kikuchi Mandy Sanders Chris Newbold Matthew Berriman Thomas D Otto 《Genome biology》2013,14(5):R47

Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.

9.

SWALO: scaffolding with assembly likelihood optimization

Atif Rahman Lior Pachter 《Nucleic acids research》2021,49(20):e117

Scaffolding, i.e. ordering and orienting contigs, is an important step in genome assembly. We present a method for scaffolding using second-generation sequencing reads based on likelihoods of genome assemblies. A generative model for sequencing is used to obtain maximum likelihood estimates of gaps between contigs and to estimate whether linking contigs into scaffolds would lead to an increase in the likelihood of the assembly. We then link contigs if they can be unambiguously joined or if the corresponding increase in likelihood is substantially greater than that of other possible joins of those contigs. The method is implemented in a tool called Swalo with approximations to make it efficient and applicable to large datasets. Analysis on real and simulated datasets reveals that it consistently makes a similar or greater number of correct joins compared with other scaffolders while linking very few contigs incorrectly, thus outperforming other scaffolders and demonstrating that substantial improvement in genome assembly may be achieved through the use of statistical models. Swalo is freely available for download at https://atifrahman.github.io/SWALO/.
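
The joining rule described in this abstract (link when the join is unambiguous, or when its likelihood gain clearly exceeds that of the alternatives) can be sketched in Python as below. This is an illustration only: the function name, the `margin` threshold, and the reading of "unambiguous" as "exactly one candidate with a positive gain" are assumptions, and the likelihood gains are taken as given rather than derived from a generative model as in Swalo.

```python
def choose_join(gains, margin=2.0):
    """Pick a neighbour to join at one contig end, or None if ambiguous.

    gains maps each candidate neighbour to the (assumed, precomputed)
    increase in assembly log-likelihood obtained by joining to it.
    """
    # Only joins that increase the likelihood are worth considering.
    positive = {c: g for c, g in gains.items() if g > 0}
    if not positive:
        return None
    ranked = sorted(positive.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]  # a single candidate: unambiguous join
    (best, best_gain), (_, second_gain) = ranked[0], ranked[1]
    # Join only if the best candidate clearly beats the runner-up.
    if best_gain - second_gain >= margin:
        return best
    return None


print(choose_join({"B": 9.0, "C": 1.0}))  # clear winner
print(choose_join({"B": 5.0, "C": 4.5}))  # too close to call
```

Declining to join when two candidates score similarly trades some contiguity for fewer incorrect links, which matches the emphasis in the abstract on linking very few contigs incorrectly.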

10.

Graphical contig analyzer for all sequencing platforms (G4ALL): a new stand-alone tool for finishing and draft generation of bacterial genomes

Rommel Thiago Jucá Ramos Adriana R Carneiro Pablo H Caracciolo Vasco Azevedo Maria Paula C Schneider Debmalya Barh Artur Silva 《Bioinformation》2013,9(11):599-604

Genome assembly has always been complicated due to the inherent difficulties of sequencing technologies, as well as the computational methods used to process sequences. Although many of the problems in generating contigs from reads are well known, especially those involving short reads, the orientation and ordering of contigs in the finishing stages is still very challenging and time consuming, as it requires manual curation of the contigs to guarantee their correct identification and prevent misassembly. Due to the large numbers of sequences that are produced, especially from the reads produced by next-generation sequencers, this process demands considerable manual effort, and there are few software options available to facilitate the process. To address this problem, we have developed the Graphic Contig Analyzer for All Sequencing Platforms (G4ALL): a stand-alone multi-user tool that facilitates the editing of the contigs produced in the assembly process. Besides providing information on the gene products contained in each contig, obtained through a search of the available biological databases, G4ALL produces a scaffold of the genome, based on the overlap of the contigs after curation.

Availability

The software is available at: http://www.genoma.ufpa.br/rramos/softwares/g4all.xhtml

11.

Genome sequence of the clover-nodulating Rhizobium leguminosarum bv. trifolii strain SRDI565.

Wayne Reeve Elizabeth Drew Ross Ballard Vanessa Melino Rui Tian Sofie De Meyer Lambert Brau Mohamed Ninawi Hazuki Teshima Lynne Goodwin Patrick Chain Konstantinos Liolios Amrita Pati Konstantinos Mavromatis Natalia Ivanova Victor Markowitz Tanja Woyke Nikos Kyrpides 《Standards in genomic sciences》2013,9(2):220-231

Rhizobium leguminosarum bv. trifolii SRDI565 (syn. N8-J) is an aerobic, motile, Gram-negative, non-spore-forming rod. SRDI565 was isolated from a nodule recovered from the roots of the annual clover Trifolium subterraneum subsp. subterraneum grown in the greenhouse and inoculated with soil collected from New South Wales, Australia. SRDI565 has a broad host range for nodulation within the clover genus; however, N₂-fixation is sub-optimal with some Trifolium species and ineffective with others. Here we describe the features of R. leguminosarum bv. trifolii strain SRDI565, together with genome sequence information and annotation. The 6,905,599 bp high-quality-draft genome is arranged into 7 scaffolds of 7 contigs, contains 6,750 protein-coding genes and 86 RNA-only encoding genes, and is one of 100 rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project.

12.

Genome sequence of the Trifolium rueppellianum-nodulating Rhizobium leguminosarum bv. trifolii strain WSM2012.

Wayne Reeve Vanessa Melino Julie Ardley Rui Tian Sofie De Meyer Jason Terpolilli Ravi Tiwari Ronald Yates Graham O’Hara John Howieson Mohamed Ninawi Brittany Held David Bruce Chris Detter Roxanne Tapia Cliff Han Chia-Lin Wei Marcel Huntemann James Han I-Min Chen Konstantinos Mavromatis Victor Markowitz Ernest Szeto Natalia Ivanova Natalia Mikhailova Ioanna Pagani Amrita Pati Lynne Goodwin Tanja Woyke Nikos Kyrpides 《Standards in genomic sciences》2013,9(2):283-293

Rhizobium leguminosarum bv. trifolii WSM2012 (syn. MAR1468) is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from an ineffective root nodule recovered from the roots of the annual clover Trifolium rueppellianum Fresen growing in Ethiopia. WSM2012 has a narrow, specialized host range for N₂-fixation. Here we describe the features of R. leguminosarum bv. trifolii strain WSM2012, together with genome sequence information and annotation. The 7,180,565 bp high-quality-draft genome is arranged into 6 scaffolds of 68 contigs, contains 7,080 protein-coding genes and 86 RNA-only encoding genes, and is one of 20 rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Community Sequencing Program.

13.

De novo assembly of bacterial transcriptomes from RNA-seq data

Brian Tjaden 《Genome biology》2015,16(1)

14.

MapRepeat: an approach for effective assembly of repetitive regions in prokaryotic genomes

Diego CB Mariano Felipe L Pereira Preetam Ghosh Debmalya Barh Henrique CP Figueiredo Artur Silva Rommel TJ Ramos Vasco AC Azevedo 《Bioinformation》2015,11(6):276-279

15.

The SAMBA tool uses long reads to improve the contiguity of genome assemblies

Aleksey V. Zimin Steven L. Salzberg 《PLoS computational biology》2022,18(2)

Third-generation sequencing technologies can generate very long reads with relatively high error rates. The lengths of the reads, which sometimes exceed one million bases, make them invaluable for resolving complex repeats that cannot be assembled using shorter reads. Many high-quality genome assemblies have already been produced, curated, and annotated using the previous generation of sequencing data, and full re-assembly of these genomes with long reads is not always practical or cost-effective. One strategy to upgrade existing assemblies is to generate additional coverage using long-read data, and add that to the previously assembled contigs. SAMBA is a tool that is designed to scaffold and gap-fill existing genome assemblies with additional long-read data, resulting in substantially greater contiguity. SAMBA is the only tool of its kind that also computes and fills in the sequence for all spanned gaps in the scaffolds, yielding much longer contigs. Here we compare SAMBA to several similar tools capable of re-scaffolding assemblies using long-read data, and we show that SAMBA yields better contiguity and introduces fewer errors than competing methods. SAMBA is open-source software that is distributed at https://github.com/alekseyzimin/masurca.

16.

Shewregdb: Database and visualization environment for experimental and predicted regulatory information in Shewanella oneidensis mr-1

Mustafa H Syed Tatiana V Karpinets Michael R Leuze Guruprasad H Kora Margaret R Romine Edward C Uberbacher 《Bioinformation》2009,4(4):169-172

17.

BinPacker: Packing-Based De Novo Transcriptome Assembly from RNA-seq Data

Juntao Liu Guojun Li Zheng Chang Ting Yu Bingqiang Liu Rick McMullen Pengyin Chen Xiuzhen Huang 《PLoS computational biology》2016,12(2)

18.

Bridger: a new framework for de novo transcriptome assembly using RNA-seq data

Zheng Chang Guojun Li Juntao Liu Yu Zhang Cody Ashby Deli Liu Carole L Cramer Xiuzhen Huang 《Genome biology》2015,16(1)

19.

MICO: A meta-tool for prediction of the effects of non-synonymous mutations

Gilliean Lee Chin-Fu Chen 《Bioinformation》2014,10(7):469-471

Next-generation sequencing (NGS) is a state-of-the-art technology that produces high-throughput data with high-resolution mutation information in the genome. Numerous methods with different efficiencies have been developed to predict mutational effects in the genome. The challenge is to present the results in a balanced manner for better biological insight and interpretation. Hence, we describe a meta-tool named Mutation Information Collector (MICO) for automatically querying and collecting related information from multiple biology/bioinformatics web servers with prediction capabilities. The predicted mutational results for the proteins of interest are returned and presented as an easy-to-read summary table in this service. MICO also allows users to navigate the results from each website for further analysis.

Availability

http://mico.ggc.org/MICO

20.

PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM and Look Ahead Approach

Xiao Zhu Henry C. M. Leung Francis Y. L. Chin Siu Ming Yiu Guangri Quan Bo Liu Yadong Wang 《PloS one》2014,9(12)

Since the read lengths of high-throughput sequencing (HTS) technologies are short, de novo assembly, which plays a significant role in many applications, remains a great challenge. Most state-of-the-art approaches are based on the de Bruijn graph strategy or the overlap-layout strategy. However, these approaches, which depend on k-mers or read overlaps, do not fully utilize the information in paired-end and single-end reads when resolving branches. Because they treat all single-end reads with an overlap length larger than a fixed threshold equally, they fail to favor the more confident reads with long overlaps during assembly and mix them with reads with relatively short overlaps. Moreover, these approaches have not been specially designed to handle tandem repeats (repeats that occur adjacently in the genome), and they usually break contigs near tandem repeats. We present PERGA (Paired-End Reads Guided Assembler), a novel sequence-reads-guided de novo assembly approach, which adopts a greedy-like prediction strategy for assembling reads into contigs and scaffolds, using paired-end reads and different read overlap sizes ranging from O_max to O_min to resolve gaps and branches. By constructing a decision model using a machine learning approach based on branch features, PERGA can determine the correct extension in 99.7% of cases. When the correct extension cannot be determined, PERGA tries to extend the contig by all feasible extensions and determines the correct extension using a look-ahead approach. Many branches that are difficult to resolve are due to tandem repeats that are close together in the genome. PERGA detects such different copies of the repeats to resolve the branches, making the extension much longer and more accurate. We evaluated PERGA on both real and simulated Illumina datasets ranging from small bacterial genomes to a large human chromosome, and it constructed longer and more accurate contigs and scaffolds than other state-of-the-art assemblers.
PERGA can be freely downloaded at https://github.com/hitbio/PERGA.
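
The idea of trying read overlaps from O_max down to O_min, so that longer and more confident overlaps are preferred, can be illustrated with a small Python sketch. It is a simplified, assumed model (exact string matching, single-end reads, no paired-end information) and not PERGA's implementation; where this sketch gives up at a branch, the abstract says PERGA applies its learned decision model and look-ahead.

```python
def extend_once(contig, reads, o_max, o_min):
    """Try to extend contig by one read, preferring longer overlaps.

    Returns the extended contig, or the unchanged contig when no overlap
    size in [o_min, o_max] gives exactly one candidate extension.
    """
    if o_min < 1:
        raise ValueError("o_min must be at least 1")
    for overlap in range(min(o_max, len(contig)), o_min - 1, -1):
        suffix = contig[-overlap:]
        # Reads whose prefix matches the contig suffix and add new bases.
        extensions = {
            read[overlap:]
            for read in reads
            if len(read) > overlap and read.startswith(suffix)
        }
        if len(extensions) == 1:
            return contig + extensions.pop()
        if len(extensions) > 1:
            # Branch: this sketch stops here instead of resolving it.
            return contig
    return contig


# Toy reads (hypothetical): the 4-base overlap wins over the 2-base one.
print(extend_once("ACGTAC", ["GTACGG", "ACTTT"], o_max=4, o_min=2))
```

In the toy call, the read `ACTTT` overlaps the contig by only two bases and would suggest a different extension, but it is never consulted because a longer overlap already yields a unique answer.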