Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Metabolomics and proteomics, like other omics domains, usually face a data mining challenge: providing understandable output that advances biomarker discovery and precision medicine. Statistical analysis is often one of the most difficult steps, and it is critical to the subsequent biological interpretation of the results. Because of this, and because of the programming skills this type of analysis requires, several bioinformatic tools aimed at simplifying metabolomics and proteomics data analysis have emerged. However, the analysis is sometimes still limited to a few hidebound statistical methods and to data sets with limited flexibility. POMAShiny is a web-based tool that provides a structured, flexible and user-friendly workflow for the visualization, exploration and statistical analysis of metabolomics and proteomics data. This tool integrates several statistical methods, some of them widely used in other types of omics, and it is based on the POMA R/Bioconductor package, which increases the reproducibility and flexibility of analyses outside the web environment. POMAShiny and POMA are both freely available at https://github.com/nutrimetabolomics/POMAShiny and https://github.com/nutrimetabolomics/POMA, respectively.

2.
The HTSstation analysis portal is a suite of simple web forms coupled to modular analysis pipelines for various applications of High-Throughput Sequencing (HTS), including ChIP-seq, RNA-seq, 4C-seq and re-sequencing. HTSstation offers biologists the possibility to rapidly investigate their HTS data using an intuitive web application with heuristically pre-defined parameters. A number of open-source software components have been implemented and can be used to build, configure and run HTS analysis pipelines reactively. In addition, our programming framework lets developers design their own workflows and integrate additional third-party software. The HTSstation web application is accessible at http://htsstation.epfl.ch.

3.
4.
Identification of biopolymer motifs represents a key step in the analysis of biological sequences. The MEME Suite is a widely used toolkit for comprehensive analysis of biopolymer motifs; however, these tools are poorly integrated within popular analysis frameworks like the R/Bioconductor project, creating barriers to their use. Here we present memes, an R package that provides a seamless R interface to a selection of popular MEME Suite tools. memes provides a novel “data aware” interface to these tools, enabling rapid and complex discriminative motif analysis workflows. In addition to interfacing with popular MEME Suite tools, memes leverages existing R/Bioconductor data structures to store the multidimensional data returned by MEME Suite tools for rapid data access and manipulation. Finally, memes provides data visualization capabilities to facilitate communication of results. memes is available as a Bioconductor package at https://bioconductor.org/packages/memes, and the source code can be found at github.com/snystrom/memes.

5.
6.
7.
8.
The ordering and orientation of genomic scaffolds to reconstruct chromosomes is an essential step during de novo genome assembly. Because this process utilizes various mapping techniques that each provide an independent line of evidence, a combination of multiple maps can improve the accuracy of the resulting chromosomal assemblies. We present ALLMAPS, a method capable of computing a scaffold ordering that maximizes colinearity across a collection of maps. ALLMAPS is robust against common mapping errors, and generates sequences that are maximally concordant with the input maps. ALLMAPS is a useful tool in building high-quality genome assemblies. ALLMAPS is available at: https://github.com/tanghaibao/jcvi/wiki/ALLMAPS.
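
To make "maximizing colinearity across a collection of maps" concrete, the following is a minimal sketch in Python. It is not ALLMAPS's actual objective or search procedure, which the abstract does not describe. It assumes a simple proxy score (the length of the longest monotone run of marker positions along a candidate ordering, summed over maps) and an exhaustive search that is only practical for a handful of scaffolds. The names longest_nondecreasing, colinearity and best_order are hypothetical.

```python
from bisect import bisect_right
from itertools import permutations, product


def longest_nondecreasing(values):
    """Length of the longest non-decreasing subsequence (patience sorting)."""
    tails = []
    for v in values:
        k = bisect_right(tails, v)
        if k == len(tails):
            tails.append(v)
        else:
            tails[k] = v
    return len(tails)


def colinearity(order, maps):
    """Proxy score (an assumption): sum over maps of the longest monotone run.

    order: list of (scaffold_name, flipped) pairs.
    maps:  list of dicts, scaffold_name -> marker map positions, listed in
           the order the markers occur along the scaffold.
    """
    total = 0
    for m in maps:
        positions = []
        for name, flipped in order:
            marks = m.get(name, [])
            positions.extend(reversed(marks) if flipped else marks)
        # The direction of a map is arbitrary, so accept either direction.
        forward = longest_nondecreasing(positions)
        backward = longest_nondecreasing([-p for p in positions])
        total += max(forward, backward)
    return total


def best_order(scaffolds, maps):
    """Exhaustive search over orders and orientations (tiny inputs only)."""
    best = None
    for perm in permutations(scaffolds):
        for flips in product([False, True], repeat=len(perm)):
            order = list(zip(perm, flips))
            score = colinearity(order, maps)
            if best is None or score > best[0]:
                best = (score, order)
    return best


# Toy input: two maps; scaffold s2 is reversed relative to the first map.
toy_maps = [
    {"s1": [1.0, 2.0], "s2": [5.0, 4.0], "s3": [7.0, 9.0]},
    {"s1": [10, 20], "s3": [50, 60]},
]
score, order = best_order(["s1", "s2", "s3"], toy_maps)
```

On this toy input the search flips s2, which makes every marker in both maps colinear. A real tool would need a heuristic search, since the number of orderings and orientations grows factorially with the number of scaffolds.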

9.
TnSeq has become a popular technique for determining the essentiality of genomic regions in bacterial organisms. Several methods have been developed to analyze the wealth of data that has been obtained through TnSeq experiments. We developed a tool for analyzing Himar1 TnSeq data called TRANSIT. TRANSIT provides a graphical interface to three different statistical methods for analyzing TnSeq data. These methods cover a variety of approaches capable of identifying essential genes in individual datasets as well as comparative analysis between conditions. We demonstrate the utility of this software by analyzing TnSeq datasets of M. tuberculosis grown on glycerol and cholesterol. We show that TRANSIT can be used to discover genes which have been previously implicated for growth on these carbon sources. TRANSIT is written in Python, and thus can be run on Windows, OSX and Linux platforms. The source code is distributed under the GNU GPL v3 license and can be obtained from the following GitHub repository: https://github.com/mad-lab/transit
This is a PLOS Computational Biology Software paper

10.
Protein designers use a wide variety of software tools for de novo design, yet their repertoire still lacks a fast and interactive all-atom search engine. To solve this, we have built the Suns program: a real-time, atomic search engine integrated into the PyMOL molecular visualization system. Users build atomic-level structural search queries within PyMOL and receive a stream of search results aligned to their query within a few seconds. This instant feedback cycle enables a new “designability”-inspired approach to protein design where the designer searches for and interactively incorporates native-like fragments from proven protein structures. We demonstrate the use of Suns to interactively build protein motifs, tertiary interactions, and to identify scaffolds compatible with hot-spot residues. The official web site and installer are located at http://www.degradolab.org/suns/ and the source code is hosted at https://github.com/godotgildor/Suns (PyMOL plugin, BSD license), https://github.com/Gabriel439/suns-cmd (command line client, BSD license), and https://github.com/Gabriel439/suns-search (search engine server, GPLv2 license).
This is a PLOS Computational Biology Software Article

11.
For many RNA molecules, the secondary structure is essential for the correct function of the RNA. Predicting RNA secondary structure from nucleotide sequences is a long-standing problem in genomics, but the prediction performance has reached a plateau over time. Traditional RNA secondary structure prediction algorithms are primarily based on thermodynamic models through free energy minimization, which imposes strong prior assumptions and is slow to run. Here, we propose a deep learning-based method, called UFold, for RNA secondary structure prediction, trained directly on annotated data and base-pairing rules. UFold proposes a novel image-like representation of RNA sequences, which can be efficiently processed by Fully Convolutional Networks (FCNs). We benchmark the performance of UFold on both within- and cross-family RNA datasets. It significantly outperforms previous methods on within-family datasets, while achieving performance similar to that of traditional methods when trained and tested on distinct RNA families. UFold is also able to predict pseudoknots accurately. Its prediction is fast, with an inference time of about 160 ms per sequence up to 1500 bp in length. An online web server running UFold is available at https://ufold.ics.uci.edu. Code is available at https://github.com/uci-cbcl/UFold.
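
As an illustration of what an "image-like representation" of a sequence and "base-pairing rules" could look like, here is a minimal pure-Python sketch. The abstract does not specify the encoding, so the layout below is an assumption: a 16-channel L x L tensor formed from the outer product of one-hot base encodings, plus a 0/1 mask of canonical pairs with a minimum hairpin loop of 3. The names one_hot, pair_image and pairing_mask are hypothetical and are not taken from the UFold code.

```python
BASES = "AUCG"
# Canonical Watson-Crick pairs plus the G-U wobble pair.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def one_hot(seq):
    """Return an L x 4 one-hot encoding; unknown bases map to all zeros."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def pair_image(seq):
    """Return a 16 x L x L tensor as nested lists.

    Channel a * 4 + b is 1 at (i, j) when position i holds BASES[a]
    and position j holds BASES[b], and 0 otherwise.
    """
    enc = one_hot(seq)
    n = len(enc)
    return [
        [[enc[i][a] * enc[j][b] for j in range(n)] for i in range(n)]
        for a in range(4)
        for b in range(4)
    ]


def pairing_mask(seq, min_loop=3):
    """Return an L x L 0/1 matrix of positions allowed to pair.

    min_loop is an assumed constraint: at least that many unpaired
    bases must separate two paired positions.
    """
    s = seq.upper()
    n = len(s)
    return [
        [
            1 if abs(i - j) > min_loop and (s[i], s[j]) in CANONICAL_PAIRS else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


image = pair_image("GCAU")       # 16 channels, each 4 x 4
mask = pairing_mask("GAAAAC")    # only positions 0 and 5 can pair
```

A tensor of this shape can be fed to a 2D convolutional network in the same way as a multi-channel image, which is the general idea the abstract points to.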

12.
Anvaya is a workflow environment for automated genome analysis that provides an interface for several bioinformatics tools and databases, loosely coupled together in a coordinated system, enabling the execution of a set of analysis tools in series or in parallel. It is a client-server workflow environment whose advantage over existing software is that it enables extensive pre- and post-processing of biological data in an efficient manner. Anvaya offers the user novel functionality for carrying out exhaustive comparative analysis via "custom tools," which are tools with new functionality not available in standard tools, and "built-in PERL parsers," which automate data flow between tools that hitherto required manual intervention. It also provides a set of 11 pre-defined workflows for frequently used pipelines in genome annotation and comparative genomics, ranging from EST assembly and annotation to phylogenetic reconstruction and microarray analysis. It provides a platform that serves as a single-stop solution for biologists to carry out comprehensive analysis without having to deal with the details of tool installation, command-line parameters, the format conversions required to connect tools, or the management and processing of multiple data sets at once.

13.
The ability to aggregate experimental data analysis and results into a concise and interpretable format is a key step in evaluating the success of an experiment. This critical step determines baselines for reproducibility and is a key requirement for data dissemination. However, in practice it can be difficult to consolidate data analyses that encapsulate the broad range of datatypes available in the life sciences. We present STENCIL, a web templating engine designed to organize, visualize, and enable the sharing of interactive data visualizations. STENCIL leverages a flexible web framework for creating templates to render highly customizable visual front ends. This flexibility enables researchers to render small or large sets of experimental outcomes, producing high-quality downloadable and editable figures that retain their original relationship to the source data. REST API-based back ends provide programmatic data access and support easy data sharing. STENCIL is a lightweight tool that can stream data from Galaxy, a popular bioinformatic analysis web platform. STENCIL has been used to support the analysis and dissemination of two large-scale genomic projects containing the complete data analysis for over 2,400 distinct datasets. Code and implementation details are available on GitHub: https://github.com/CEGRcode/stencil

14.
Despite the growing number of immune repertoire sequencing studies, the field still lacks software for analysis and comprehension of this high-dimensional data. Here we report VDJtools, a complementary software suite that solves a wide range of T cell receptor (TCR) repertoire post-analysis tasks, provides a detailed tabular output and publication-ready graphics, and is built on top of a flexible API. Using TCR datasets for a large cohort of unrelated healthy donors, twins, and multiple sclerosis patients, we demonstrate that VDJtools greatly facilitates the analysis and leads to sound biological conclusions. VDJtools software and documentation are available at https://github.com/mikessh/vdjtools.

15.
Comparative genome-scale analyses of protein-coding gene sequences are employed to examine evidence for whole-genome duplication and horizontal gene transfer. For this purpose, an orthogroup should be delineated to infer evolutionary history regarding each gene, and results of all orthogroup analyses need to be integrated to infer a genome-scale history. An orthogroup is a set of genes descended from a single gene in the last common ancestor of all species under consideration. However, such analyses confront several problems: 1) analytical pipelines to infer all gene histories with methods comparing species and gene trees are not fully developed, and 2) without detailed analyses within orthogroups, evolutionary events of paralogous genes in the same orthogroup cannot be distinguished for genome-wide integration of results derived from multiple orthogroup analyses. Here I present an analytical pipeline, ORTHOSCOPE* (star), to infer evolutionary histories of animal/plant genes from genome-scale data. ORTHOSCOPE* estimates a tree for a specified gene, detects speciation/gene duplication events that occurred at nodes belonging to only one lineage leading to a species of interest, and then integrates results derived from gene trees estimated for all query genes in genome-wide data. Thus, ORTHOSCOPE* can be used to detect species nodes just after whole-genome duplications as a first step of comparative genomic analyses. Moreover, by examining the presence or absence of genes belonging to species lineages with dense taxon sampling available from the ORTHOSCOPE web version, ORTHOSCOPE* can detect genes lost in specific lineages and horizontal gene transfers. This pipeline is available at https://github.com/jun-inoue/ORTHOSCOPE_STAR.

16.
17.
To manage and intelligently mine the avalanche of genomic sequences, intuitive and user-friendly graphical interfaces are required. Here we present BlastXtract2, which is dedicated to the early exploration of unannotated genomic and metagenomic sequences. Translated searches in various formats, including the commonly used BlastX, of multiple sequences against multiple protein databases can be uploaded to a relational database server, which can be accessed via a locally installed web server. There, an intuitive GUI allows straightforward data mining and enables quick detection of potential frameshifts and poorly sequenced or assembled regions, making BlastXtract2 a valuable tool for early exploration of (meta)genomic sequences.

Availability

Source code, documentation and an online demo version are available at https://github.com/ClaessonLab/BlastXtract2

18.
Existing methods for identifying structural variants (SVs) from short read datasets are inaccurate. This complicates disease-gene identification and efforts to understand the consequences of genetic variation. In response, we have created Wham (Whole-genome Alignment Metrics) to provide a single, integrated framework for both structural variant calling and association testing, thereby bypassing many of the difficulties that currently frustrate attempts to employ SVs in association testing. Here we describe Wham, benchmark it against three other widely used SV identification tools (Lumpy, Delly and SoftSearch) and demonstrate Wham’s ability to identify and associate SVs with phenotypes using data from humans, domestic pigeons, and vaccinia virus. Wham and all associated software are covered under the MIT License and can be freely downloaded from GitHub (https://github.com/zeeev/wham), with documentation on a wiki (http://zeeev.github.io/wham/). For community support please post questions to https://www.biostars.org/.
This is a PLOS Computational Biology Software paper.

19.
Background: Record linkage integrates records across multiple related data sources, identifying duplicates and accounting for possible errors. Real-life applications require efficient algorithms to merge these voluminous data sources and find all records belonging to the same individuals. Our recently devised, highly efficient record linkage algorithms provide best-known solutions to this challenging problem.
Method: We have developed RLT-S, a freely available web tool, which implements our single linkage clustering algorithm for record linkage. This tool requires input data sets and a small set of configuration settings about these files to work efficiently. RLT-S employs exact match clustering, blocking on a specified attribute, and single linkage based hierarchical clustering among these blocks.
Results: RLT-S is an implementation package of our sequential record linkage algorithm. It outperforms previous best-known implementations by a large margin. The tool is at least two times faster for any dataset than the previous best-known tools.
Conclusions: The RLT-S tool implements our record linkage algorithm, which outperforms previous best-known algorithms in this area. The website also contains necessary information such as instructions, submission history, feedback, publications and some other sections to facilitate the usage of the tool.
Availability: RLT-S is integrated into http://www.rlatools.com, which is currently serving this tool only. The tool is freely available and can be used without login. All data files used in this paper have been stored in https://github.com/abdullah009/DataRLATools. For copies of the relevant programs please see https://github.com/abdullah009/RLATools.
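
The three stages named in the abstract (exact match clustering, blocking on an attribute, and single linkage clustering within blocks) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the RLT-S implementation: records are assumed to be tuples of strings, similarity is assumed to be edit distance on the joined fields, and single linkage is cut at a fixed distance threshold (which is equivalent to taking connected components). The names edit_distance and link_records are hypothetical.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def link_records(records, block_key, threshold):
    """Return clusters of record indices that likely describe one individual."""
    parent = list(range(len(records)))  # union-find forest over indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Stage 1: exact match clustering; identical records merge immediately.
    first_seen = {}
    for idx, rec in enumerate(records):
        key = tuple(rec)
        if key in first_seen:
            union(idx, first_seen[key])
        else:
            first_seen[key] = idx

    # Stage 2: blocking; only representatives sharing a block are compared.
    blocks = defaultdict(list)
    for idx in first_seen.values():
        blocks[block_key(records[idx])].append(idx)

    # Stage 3: single linkage inside each block, cut at the threshold.
    for members in blocks.values():
        for pos, a in enumerate(members):
            for b in members[pos + 1:]:
                dist = edit_distance(" ".join(records[a]), " ".join(records[b]))
                if dist <= threshold:
                    union(a, b)

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


people = [
    ("john", "smith", "1980"),
    ("john", "smith", "1980"),
    ("jon", "smith", "1980"),
    ("mary", "jones", "1975"),
]
# Block on the first letter of the surname; allow one edit between records.
groups = link_records(people, lambda rec: rec[1][:1], threshold=1)
```

Blocking trades recall for speed: two records of the same person that fall into different blocks are never compared, so the choice of blocking attribute matters as much as the distance threshold.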

20.