首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.

Background

Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts.

Results

In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on using an extensible set of re-usable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: It allows the integration of existing Taverna and Galaxy workflows in a single environment, and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible, where the users can either instantiate the whole system on the cloud, or delegate the execution of certain sub-workflows to the cloud infrastructure.

Conclusions

Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub-) workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and complexity of analysis. The system can be accessed either through a cloud-enabled web-interface or downloaded and installed to run within the user's local environment. All resources related to Tavaxy are available at http://www.tavaxy.org.  相似文献   

2.
Many data manipulation processes involve the use of programming libraries. These processes may beneficially be automated due to their repeated use. A convenient type of automation is in the form of workflows that also allow such processes to be shared amongst the community. The Taverna workflow system has been extended to enable it to use and invoke Java classes and methods as tasks within Taverna workflows. These classes and methods are selected for use during workflow construction by a Java Doclet application called the API Consumer. This selection is stored as an XML file which enables Taverna to present the subset of the API for use in the composition of workflows. The ability of Taverna to invoke Java classes and methods is demonstrated by a workflow in which we use libSBML to map gene expression data onto a metabolic pathway represented as a SBML model. AVAILABILITY: Taverna and the API Consumer application can be freely downloaded from http://taverna.sourceforge.net  相似文献   

3.
Summary: Taverna is an application that eases the integrationof tools and databases for life science research by the constructionof workflows. The Taverna Interaction Service extends the functionalityof Taverna by defining human interaction within a workflow andacting as a mediation layer between the automated workflow engineand one or more users. Availability: Taverna, the Interaction Service plug-in and webapplication are available as open source and can be downloadedfrom http://taverna.sourceforge.net/ Contact: taverna-users{at}lists.sourceforge.net Associate Editor: John Quackenbush  相似文献   

4.
Ecological niche modelling (ENM) Components are a set of reusable workflow components specialized for performing ENM tasks within the Taverna workflow management system. Each component encapsulates specific functionality and can be combined with other components to facilitate the creation of larger and more complex workflows. One key distinguishing feature of ENM Components is that most tasks are performed remotely by calling web services, simplifying software setup and maintenance on the client side and allowing more powerful computing resources to be exploited. This paper presents the current set of ENM Components in the context of the Taverna family of tools for creating, publishing and sharing workflows. An example is included showing how the components can be used in a preliminary investigation of the effects of mixing different spatial resolutions in ENM experiments.  相似文献   

5.
Taverna: a tool for the composition and enactment of bioinformatics workflows   总被引:12,自引:0,他引:12  
MOTIVATION: In silico experiments in bioinformatics involve the co-ordinated use of computational tools and information repositories. A growing number of these resources are being made available with programmatic access in the form of Web services. Bioinformatics scientists will need to orchestrate these Web services in workflows as part of their analyses. RESULTS: The Taverna project has developed a tool for the composition and enactment of bioinformatics workflows for the life sciences community. The tool includes a workbench application which provides a graphical user interface for the composition of workflows. These workflows are written in a new language called the simple conceptual unified flow language (Scufl), where by each step within a workflow represents one atomic task. Two examples are used to illustrate the ease by which in silico experiments can be represented as Scufl workflows using the workbench application.  相似文献   

6.
7.
ABSTRACT: BACKGROUND: Progress in the modeling of biological systems strongly relies on the availability of specialized computer-aided tools. To that end, the Taverna Workbench eases integration of software tools for life science research and provides a common workflow-based framework for computational experiments in Biology. RESULTS: The Taverna services for Systems Biology (Tav4SB) project provides a set of new Web service operations, which extend the functionality of the Taverna Workbench in a domain of systems biology. Tav4SB operations allow you to perform numerical simulations or model checking of, respectively, deterministic or stochastic semantics of biological models. On top of this functionality, Tav4SB enables the construction of high-level experiments. As an illustration of possibilities offered by our project we apply the multi-parameter sensitivity analysis. To visualize the results of model analysis a flexible plotting operation is provided as well. Tav4SB operations are executed in a simple grid environment, integrating heterogeneous software such as Mathematica, PRISM and SBML ODE Solver. The user guide, contact information, full documentation of available Web service operations, workflows and other additional resources can be found at the Tav4SB project's Web page: http://bioputer.mimuw.edu.pl/tav4sb/. CONCLUSIONS: The Tav4SB Web service provides a set of integrated tools in the domain for which Web-based applications are still not as widely available as for other areas of computational biology. Moreover, we extend the dedicated hardware base for computationally expensive task of simulating cellular models. Finally, we promote the standardization of models and experiments as well as accessibility and usability of remote services.  相似文献   

8.
Anvaya is a workflow environment for automated genome analysis that provides an interface for several bioinformatics tools and databases, loosely coupled together in a coordinated system, enabling the execution of a set of analyses tools in series or in parallel. It is a client-server workflow environment that has an advantage over existing software as it enables extensive pre & post processing of biological data in an efficient manner. "Anvaya" offers the user, novel functionalities to carry out exhaustive comparative analysis via "custom tools," which are tools with new functionality not available in standard tools, and "built-in PERL parsers," which automate data-flow between tools that hitherto, required manual intervention. It also provides a set of 11 pre-defined workflows for frequently used pipelines in genome annotation and comparative genomics ranging from EST assembly and annotation to phylogenetic reconstruction and microarray analysis. It provides a platform that serves as a single-stop solution for biologists to carry out hassle-free and comprehensive analysis, without being bothered about the nuances involved in tool installation, command line parameters, format conversions required to connect tools and manage/process multiple data sets at a single instance.  相似文献   

9.
One of the challenges of computational-centric research is to make the research undertaken reproducible in a form that others can repeat and re-use with minimal effort. In addition to the data and tools necessary to re-run analyses, execution environments play crucial roles because of the dependencies of the operating system and software version used. However, some of the challenges of reproducible science can be addressed using appropriate computational tools and cloud computing to provide an execution environment.Here, we demonstrate the use of a Kepler scientific workflow for reproducible science that is sharable, reusable, and re-executable. These workflows reduce barriers to sharing and will save researchers time when undertaking similar research in the future.To provide infrastructure that enables reproducible science, we have developed cloud-based Collaborative Environment for Ecosystem Science Research and Analysis (CoESRA) infrastructure to build, execute and share sophisticated computation-centric research. The CoESRA provides users with a storage and computational platform that is accessible from a web-browser in the form of a virtual desktop. Any registered user can access the virtual desktop to build, execute and share the Kepler workflows. This approach will enable computational scientists to share complete workflows in a pre-configured environment so that others can reproduce the computational research with minimal effort.As a case study, we developed and shared a complete IUCN Red List of Ecosystems Assessment workflow that reproduces the assessments undertaken by Burns et al. (2015) on Mountain Ash forests in the Central Highlands of Victoria, Australia. This workflow provides an opportunity for other researchers and stakeholders to run this assessment with minimal supervision. The workflow also enables researchers to re-evaluate the assessment when additional data becomes available. The assessment can be run in a CoESRA virtual desktop by opening a workflow in a Kepler user interface and pressing a “start” button. The workflow is pre-configured with all the open access datasets and writes results to a pre-configured folder.  相似文献   

10.
11.

Background  

As biology becomes an increasingly computational science, it is critical that we develop software tools that support not only bioinformaticians, but also bench biologists in their exploration of the vast and complex data-sets that continue to build from international genomic, proteomic, and systems-biology projects. The BioMoby interoperability system was created with the goal of facilitating the movement of data from one Web-based resource to another to fulfill the requirements of non-expert bioinformaticians. In parallel with the development of BioMoby, the European myGrid project was designing Taverna, a bioinformatics workflow design and enactment tool. Here we describe the marriage of these two projects in the form of a Taverna plug-in that provides access to many of BioMoby's features through the Taverna interface.  相似文献   

12.
Warp2D is a novel time alignment approach, which uses the overlapping peak volume of the reference and sample peak lists to correct misleading peak shifts. Here, we present an easy-to-use web interface for high-throughput Warp2D batch processing time alignment service using the Dutch Life Science Grid, reducing processing time from days to hours. This service provides the warping function, the sample chromatogram peak list with adjusted retention times and normalized quality scores based on the sum of overlapping peak volume of all peaks. Heat maps before and after time alignment are created from the arithmetic mean of the sum of overlapping peak area rearranged with hierarchical clustering, allowing the quality control of the time alignment procedure. Taverna workflow and command line tool are provided for remote processing of local user data. AVAILABILITY: online data processing service is available at http://www.nbpp.nl/warp2d.html. Taverna workflow is available at myExperiment with title '2D Time Alignment-Webservice and Workflow' at http://www.myexperiment.org/workflows/1283.html. Command line tool is available at http://www.nbpp.nl/Warp2D_commandline.zip. CONTACT: p.l.horvatovich@rug.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.  相似文献   

13.
14.
DataBiNS is a custom-designed BioMoby Web Service workflow that integrates non-synonymous coding single nucleotide polymorphisms (nsSNPs) data with structure/function and pathway data for the relevant protein. A KEGG Pathway Identifier representing a specific human biological pathway initializes the DataBiNS workflow. The workflow retrieves a list of publications, gene ontology annotations and nsSNP information for each gene involved in the biological pathway. Manual inspection of output data from several trial runs confirms that all expected information is appropriately retrieved by the workflow services. The use of an automated BioMoby workflow, rather than manual 'surfing', to retrieve the necessary data, significantly reduces the effort required for functional interpretation of SNP data, and thus encourages more speculative investigation. Moreover, the modular nature of the individual BioMoby Services enables fine-grained reusing of each service in other workflows, thus reducing the effort required to achieve similar investigations in the future. AVAILABILITY: The workflow is freely available as a Taverna SCUFL XML document at the iCAPTURE Centre web site, http://www.mrl.ubc.ca/who/who_bios_scott_tebbutt.shtml.  相似文献   

15.
Data processing in proteomics can be a challenging endeavor, requiring extensive knowledge of many different software packages, all with different algorithms, data format requirements, and user interfaces. In this article we describe the integration of a number of existing programs and tools in Taverna Workbench, a scientific workflow manager currently being developed in the bioinformatics community. We demonstrate how a workflow manager provides a single, visually clear and intuitive interface to complex data analysis tasks in proteomics, from raw mass spectrometry data to protein identifications and beyond.  相似文献   

16.
17.
Motivated by the widespread use of workflow systems in e-Science applications, this article introduces a formal analysis framework for the verification and profiling of the control flow aspects of scientific workflows. The framework relies on process algebras that characterise each workflow component with a process behaviour, which is then used to build a CTL state model that can be reasoned about. We demonstrate the benefits of the approach by modelling the control flow behaviour of the Discovery Net system, one of the earliest workflow-based e-Science systems, and present how some key properties of workflows and individual service utilisation can be queried at design time. Our approach is generic and can be applied easily to modelling workflows developed in any other system. It also provides a formal basis for the comparison of control aspects of e-Science workflow systems and a design method for future systems.  相似文献   

18.
Data analysis processes in scientific applications can be expressed as coarse-grain workflows of complex data processing operations with data flow dependencies between them. Performance optimization of these workflows can be viewed as a search for a set of optimal values in a multidimensional parameter space consisting of input performance parameters to the applications that are known to affect their execution times. While some performance parameters such as grouping of workflow components and their mapping to machines do not affect the accuracy of the analysis, others may dictate trading the output quality of individual components (and of the whole workflow) for performance. This paper describes an integrated framework which is capable of supporting performance optimizations along multiple such parameters. Using two real-world applications in the spatial, multidimensional data analysis domain, we present an experimental evaluation of the proposed framework.  相似文献   

19.
20.
ABSTRACT: BACKGROUND: Next generation sequencing platforms are now well implanted in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. RESULTS: We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies,...) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. CONCLUSIONS: NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号