期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

NGS-QCbox and Raspberry for Parallel,Automated and Rapid Quality Control Analysis of Large-Scale Next Generation Sequencing (Illumina) Data

Mohan A. V. S. K. Katta Aamir W. Khan Dadakhalandar Doddamani Mahendar Thudi Rajeev K. Varshney 《PloS one》2015,10(10)

相似文献

2.

A Real-Time All-Atom Structural Search Engine for Proteins

Gabriel Gonzalez Brett Hannigan William F. DeGrado 《PLoS computational biology》2014,10(7)

Protein designers use a wide variety of software tools for de novo design, yet their repertoire still lacks a fast and interactive all-atom search engine. To solve this, we have built the Suns program: a real-time, atomic search engine integrated into the PyMOL molecular visualization system. Users build atomic-level structural search queries within PyMOL and receive a stream of search results aligned to their query within a few seconds. This instant feedback cycle enables a new “designability”-inspired approach to protein design where the designer searches for and interactively incorporates native-like fragments from proven protein structures. We demonstrate the use of Suns to interactively build protein motifs, tertiary interactions, and to identify scaffolds compatible with hot-spot residues. The official web site and installer are located at http://www.degradolab.org/suns/ and the source code is hosted at https://github.com/godotgildor/Suns (PyMOL plugin, BSD license), https://github.com/Gabriel439/suns-cmd (command line client, BSD license), and https://github.com/Gabriel439/suns-search (search engine server, GPLv2 license).

This is a PLOS Computational Biology Software Article

相似文献

3.

Comprehensive Annotation of the Parastagonospora nodorum Reference Genome Using Next-Generation Genomics,Transcriptomics and Proteogenomics

Robert A. Syme Kar-Chun Tan James K. Hane Kejal Dodhia Thomas Stoll Marcus Hastie Eiko Furuki Simon R. Ellwood Angela H. Williams Yew-Foon Tan Alison C. Testa Jeffrey J. Gorman Richard P. Oliver 《PloS one》2016,11(2)

相似文献

4.

UniqTag: Content-Derived Unique and Stable Identifiers for Gene Annotation

Shaun D. Jackman Joerg Bohlmann ?nan? Birol 《PloS one》2015,10(5)

When working on an ongoing genome sequencing and assembly project, it is rather inconvenient when gene identifiers change from one build of the assembly to the next. The gene labelling system described here, UniqTag, addresses this common challenge. UniqTag assigns a unique identifier to each gene that is a representative k-mer, a string of length k, selected from the sequence of that gene. Unlike serial numbers, these identifiers are stable between different assemblies and annotations of the same data without requiring that previous annotations be lifted over by sequence alignment. We assign UniqTag identifiers to ten builds of the Ensembl human genome spanning eight years to demonstrate this stability. The implementation of UniqTag in Ruby and an R package are available at https://github.com/sjackman/uniqtag sjackman/uniqtag. The R package is also available from CRAN: install.packages ("uniqtag"). Supplementary material and code to reproduce it is available at https://github.com/sjackman/uniqtag-paper. 相似文献

5.

lra: A long read aligner for sequences and contigs

Jingwen Ren Mark J. P. Chaisson 《PLoS computational biology》2021,17(6)

It is computationally challenging to detect variation by aligning single-molecule sequencing (SMS) reads, or contigs from SMS assemblies. One approach to efficiently align SMS reads is sparse dynamic programming (SDP), where optimal chains of exact matches are found between the sequence and the genome. While straightforward implementations of SDP penalize gaps with a cost that is a linear function of gap length, biological variation is more accurately represented when gap cost is a concave function of gap length. We have developed a method, lra, that uses SDP with a concave-cost gap penalty, and used lra to align long-read sequences from PacBio and Oxford Nanopore (ONT) instruments as well as de novo assembly contigs. This alignment approach increases sensitivity and specificity for SV discovery, particularly for variants above 1kb and when discovering variation from ONT reads, while having runtime that are comparable (1.05-3.76×) to current methods. When applied to calling variation from de novo assembly contigs, there is a 3.2% increase in Truvari F1 score compared to minimap2+htsbox. lra is available in bioconda (https://anaconda.org/bioconda/lra) and github (https://github.com/ChaissonLab/LRA). 相似文献

6.

GRAFIMO: Variant and haplotype aware motif scanning on pangenome graphs

Manuel Tognon Vincenzo Bonnici Erik Garrison Rosalba Giugno Luca Pinello 《PLoS computational biology》2021,17(9)

相似文献

7.

Wham: Identifying Structural Variants of Biological Consequence

Zev N. Kronenberg Edward J. Osborne Kelsey R. Cone Brett J. Kennedy Eric T. Domyan Michael D. Shapiro Nels C. Elde Mark Yandell 《PLoS computational biology》2015,11(12)

Existing methods for identifying structural variants (SVs) from short read datasets are inaccurate. This complicates disease-gene identification and efforts to understand the consequences of genetic variation. In response, we have created Wham (Whole-genome Alignment Metrics) to provide a single, integrated framework for both structural variant calling and association testing, thereby bypassing many of the difficulties that currently frustrate attempts to employ SVs in association testing. Here we describe Wham, benchmark it against three other widely used SV identification tools–Lumpy, Delly and SoftSearch–and demonstrate Wham’s ability to identify and associate SVs with phenotypes using data from humans, domestic pigeons, and vaccinia virus. Wham and all associated software are covered under the MIT License and can be freely downloaded from github (https://github.com/zeeev/wham), with documentation on a wiki (http://zeeev.github.io/wham/). For community support please post questions to https://www.biostars.org/.

This is PLOS Computational Biology software paper.

相似文献

8.

POMAShiny: A user-friendly web-based workflow for metabolomics and proteomics data analysis

Pol Castellano-Escuder Raúl Gonzlez-Domínguez Francesc Carmona-Pontaque Cristina Andrs-Lacueva Alex Snchez-Pla 《PLoS computational biology》2021,17(7)

Metabolomics and proteomics, like other omics domains, usually face a data mining challenge in providing an understandable output to advance in biomarker discovery and precision medicine. Often, statistical analysis is one of the most difficult challenges and it is critical in the subsequent biological interpretation of the results. Because of this, combined with the computational programming skills needed for this type of analysis, several bioinformatic tools aimed at simplifying metabolomics and proteomics data analysis have emerged. However, sometimes the analysis is still limited to a few hidebound statistical methods and to data sets with limited flexibility. POMAShiny is a web-based tool that provides a structured, flexible and user-friendly workflow for the visualization, exploration and statistical analysis of metabolomics and proteomics data. This tool integrates several statistical methods, some of them widely used in other types of omics, and it is based on the POMA R/Bioconductor package, which increases the reproducibility and flexibility of analyses outside the web environment. POMAShiny and POMA are both freely available at https://github.com/nutrimetabolomics/POMAShiny and https://github.com/nutrimetabolomics/POMA, respectively. 相似文献

9.

Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks

Payam Dibaeinia Saurabh Sinha 《Nucleic acids research》2021,49(18):10309

相似文献

10.

MetAMOS: a modular and open source metagenomic assembly and analysis pipeline

Todd J Treangen Sergey Koren Daniel D Sommer Bo Liu Irina Astrovskaya Brian Ondov Aaron E Darling Adam M Phillippy Mihai Pop 《Genome biology》2013,14(1):R2

We describe MetAMOS, an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequencing reads and producing genomic scaffolds, open-reading frames and taxonomic or functional annotations. MetAMOS can aid in reducing assembly errors, commonly encountered when assembling metagenomic samples, and improves taxonomic assignment accuracy while also reducing computational cost. MetAMOS can be downloaded from: https://github.com/treangen/MetAMOS. 相似文献

11.

SubcloneSeeker: a computational framework for reconstructing tumor clone structure for cancer variant interpretation and prioritization

Yi Qiao Aaron R Quinlan Amir A Jazaeri Roeland GW Verhaak David A Wheeler Gabor T Marth 《Genome biology》2014,15(8)

Many tumors are composed of genetically divergent cell subpopulations. We report SubcloneSeeker, a package capable of exhaustive identification of subclone structures and evolutionary histories with bulk somatic variant allele frequency measurements from tumor biopsies. We present a statistical framework to elucidate whether specific sets of mutations are present within the same subclones, and the order in which they occur. We demonstrate how subclone reconstruction provides crucial information about tumorigenesis and relapse mechanisms; guides functional study by variant prioritization, and has the potential as a rational basis for informed therapeutic strategies for the patient. SubcloneSeeker is available at: https://github.com/yiq/SubcloneSeeker.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0443-x) contains supplementary material, which is available to authorized users. 相似文献

12.

scDeepSort: a pre-trained cell-type annotation method for single-cell transcriptomics using deep learning with a weighted graph neural network

Xin Shao Haihong Yang Xiang Zhuang Jie Liao Penghui Yang Junyun Cheng Xiaoyan Lu Huajun Chen Xiaohui Fan 《Nucleic acids research》2021,49(21):e122

相似文献

13.

AnnoSINE: a short interspersed nuclear elements annotation tool for plant genomes

Yang Li Ning Jiang Yanni Sun 《Plant physiology》2022,188(2):955

Short interspersed nuclear elements (SINEs) are a widespread type of small transposable element (TE). With increasing evidence for their impact on gene function and genome evolution in plants, accurate genome-scale SINE annotation becomes a fundamental step for studying the regulatory roles of SINEs and their relationship with other components in the genomes. Despite the overall promising progress made in TE annotation, SINE annotation remains a major challenge. Unlike some other TEs, SINEs are short and heterogeneous, and they usually lack well-conserved sequence or structural features. Thus, current SINE annotation tools have either low sensitivity or high false discovery rates. Given the demand and challenges, we aimed to provide a more accurate and efficient SINE annotation tool for plant genomes. The pipeline starts with maximizing the pool of SINE candidates via profile hidden Markov model-based homology search and de novo SINE search using structural features. Then, it excludes the false positives by integrating all known features of SINEs and the features of other types of TEs that can often be misannotated as SINEs. As a result, the pipeline substantially improves the tradeoff between sensitivity and accuracy, with both values close to or over 90%. We tested our tool in Arabidopsis thaliana and rice (Oryza sativa), and the results show that our tool competes favorably against existing SINE annotation tools. The simplicity and effectiveness of this tool would potentially be useful for generating more accurate SINE annotations for other plant species. The pipeline is freely available at https://github.com/yangli557/AnnoSINE.

By integrating all known features of short interspersed nuclear elements (SINEs), AnnoSINE is a tool for both accurate and fast genome-scale SINE annotation in plant genomes. 相似文献

14.

ALLMAPS: robust scaffold ordering based on multiple maps

Haibao Tang Xingtan Zhang Chenyong Miao Jisen Zhang Ray Ming James C Schnable Patrick S Schnable Eric Lyons Jianguo Lu 《Genome biology》2015,16(1)

The ordering and orientation of genomic scaffolds to reconstruct chromosomes is an essential step during de novo genome assembly. Because this process utilizes various mapping techniques that each provides an independent line of evidence, a combination of multiple maps can improve the accuracy of the resulting chromosomal assemblies. We present ALLMAPS, a method capable of computing a scaffold ordering that maximizes colinearity across a collection of maps. ALLMAPS is robust against common mapping errors, and generates sequences that are maximally concordant with the input maps. ALLMAPS is a useful tool in building high-quality genome assemblies. ALLMAPS is available at: https://github.com/tanghaibao/jcvi/wiki/ALLMAPS. 相似文献

15.

tidy tree: A New Layout for Phylogenetic Trees

Simon Penel Damien M de Vienne 《Molecular biology and evolution》2022,39(10)

Many layouts exist for visualizing phylogenetic trees, allowing to display the same information (evolutionary relationships) in different ways. For large phylogenies, the choice of the layout is a key element, because the printable area is limited, and because interactive on-screen visualizers can lead to unreadable phylogenetic relationships at high zoom levels. A visual inspection of available layouts for rooted trees reveals large empty areas that one may want to fill in order to use less drawing space and eventually gain readability. This can be achieved by using the nonlayered tidy tree layout algorithm that was proposed earlier but was never used in a phylogenetic context so far. Here, we present its implementation, and we demonstrate its advantages on simulated and biological data (the measles virus phylogeny). Our results call for the integration of this new layout in phylogenetic software. We implemented the nonlayered tidy tree layout in R language as a stand-alone function (available at https://github.com/damiendevienne/non-layered-tidy-trees), as an option in the tree plotting function of the R package ape, and in the recent tool for visualizing reconciled phylogenetic trees thirdkind (https://github.com/simonpenel/thirdkind/wiki). 相似文献

16.

PhyloWGS: Reconstructing subclonal composition and evolution from whole-genome sequencing of tumors

Amit G Deshwar Shankar Vembu Christina K Yung Gun Ho Jang Lincoln Stein Quaid Morris 《Genome biology》2015,16(1)

Tumors often contain multiple subpopulations of cancerous cells defined by distinct somatic mutations. We describe a new method, PhyloWGS, which can be applied to whole-genome sequencing data from one or more tumor samples to reconstruct complete genotypes of these subpopulations based on variant allele frequencies (VAFs) of point mutations and population frequencies of structural variations. We introduce a principled phylogenic correction for VAFs in loci affected by copy number alterations and we show that this correction greatly improves subclonal reconstruction compared to existing methods. PhyloWGS is free, open-source software, available at https://github.com/morrislab/phylowgs.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0602-8) contains supplementary material, which is available to authorized users. 相似文献

17.

Prophage Tracer: precisely tracing prophages in prokaryotic genomes using overlapping split-read alignment

Kaihao Tang Weiquan Wang Yamin Sun Yiqing Zhou Pengxia Wang Yunxue Guo Xiaoxue Wang 《Nucleic acids research》2021,49(22):e128

The life cycle of temperate phages includes a lysogenic cycle stage when the phage integrates into the host genome and becomes a prophage. However, the identification of prophages that are highly divergent from known phages remains challenging. In this study, by taking advantage of the lysis-lysogeny switch of temperate phages, we designed Prophage Tracer, a tool for recognizing active prophages in prokaryotic genomes using short-read sequencing data, independent of phage gene similarity searching. Prophage Tracer uses the criterion of overlapping split-read alignment to recognize discriminative reads that contain bacterial (attB) and phage (attP) att sites representing prophage excision signals. Performance testing showed that Prophage Tracer could predict known prophages with precise boundaries, as well as novel prophages. Two novel prophages, dsDNA and ssDNA, encoding highly divergent major capsid proteins, were identified in coral-associated bacteria. Prophage Tracer is a reliable data mining tool for the identification of novel temperate phages and mobile genetic elements. The code for the Prophage Tracer is publicly available at https://github.com/WangLab-SCSIO/Prophage_Tracer. 相似文献

18.

cLoops2: a full-stack comprehensive analytical tool for chromatin interactions

Yaqiang Cao Shuai Liu Gang Ren Qingsong Tang Keji Zhao 《Nucleic acids research》2022,50(1):57

Investigating chromatin interactions between regulatory regions such as enhancer and promoter elements is vital for understanding the regulation of gene expression. Compared to Hi-C and its variants, the emerging 3D mapping technologies focusing on enriched signals, such as TrAC-looping, reduce the sequencing cost and provide higher interaction resolution for cis-regulatory elements. A robust pipeline is needed for the comprehensive interpretation of these data, especially for loop-centric analysis. Therefore, we have developed a new versatile tool named cLoops2 for the full-stack analysis of these 3D chromatin interaction data. cLoops2 consists of core modules for peak-calling, loop-calling, differentially enriched loops calling and loops annotation. It also contains multiple modules for interaction resolution estimation, data similarity estimation, features quantification, feature aggregation analysis, and visualization. cLoops2 with documentation and example data are open source and freely available at GitHub: https://github.com/KejiZhaoLab/cLoops2. 相似文献

19.

NORMA: The Network Makeup Artist — A Web Tool for Network Annotation Visualization

Mikaela Koutrouli Evangelos Karatzas Katerina Papanikolopoulou Georgios A. Pavlopoulos 《基因组蛋白质组与生物信息学报(英文版)》2022,20(3):578

The Network Makeup Artist (NORMA) is a web tool for interactive network annotation visualization and topological analysis, able to handle multiple networks and annotations simultaneously. Precalculated annotations (e.g., Gene Ontology, Pathway enrichment, community detection, or clustering results) can be uploaded and visualized in a network, either as colored pie-chart nodes or as color-filled areas in a 2D/3D Venn-diagram-like style. In the case where no annotation exists, algorithms for automated community detection are offered. Users can adjust the network views using standard layout algorithms or allow NORMA to slightly modify them for visually better group separation. Once a network view is set, users can interactively select and highlight any group of interest in order to generate publication-ready figures. Briefly, with NORMA, users can encode three types of information simultaneously. These are 1) the network, 2) the communities or annotations of interest, and 3) node categories or expression values. Finally, NORMA offers basic topological analysis and direct topological comparison across any of the selected networks. NORMA service is available at http://norma.pavlopouloslab.info, whereas the code is available at https://github.com/PavlopoulosLab/NORMA. 相似文献

20.

Determining the quality and complexity of next-generation sequencing data without a reference genome

Seyed Yahya Anvar Lusine Khachatryan Martijn Vermaat Michiel van Galen Irina Pulyakhina Yavuz Ariyurek Ken Kraaijeveld Johan T den Dunnen Peter de Knijff Peter AC ’t Hoen Jeroen FJ Laros 《Genome biology》2014,15(12)

We describe an open-source kPAL package that facilitates an alignment-free assessment of the quality and comparability of sequencing datasets by analyzing k-mer frequencies. We show that kPAL can detect technical artefacts such as high duplication rates, library chimeras, contamination and differences in library preparation protocols. kPAL also successfully captures the complexity and diversity of microbiomes and provides a powerful means to study changes in microbial communities. Together, these features make kPAL an attractive and broadly applicable tool to determine the quality and comparability of sequence libraries even in the absence of a reference sequence. kPAL is freely available at https://github.com/LUMC/kPAL.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0555-3) contains supplementary material, which is available to authorized users. 相似文献