Similar Articles
20 similar records found.
1.
A scientific workflow system is a workflow management system composed of a series of specially designed data analysis and management steps, organized according to a defined logic, that carries out specific scientific research in a given runtime environment. Scientific workflow systems aim to let scientists around the world exchange ideas on a simple, easy-to-use platform, jointly design global-scale experiments, and share data, experimental procedures, and results. Each scientist can independently create workflows, execute them, and view results in real time; workflows can also be conveniently shared and reused among different scientists. Taking the Kepler system and the Biodiversity Virtual e-Laboratory (BioVeL) as examples, this paper reviews the history, background, existing projects, and applications of scientific workflows. An ecological niche modeling workflow is used to illustrate the process and characteristics of a scientific workflow. Based on an analysis of existing scientific workflow systems, we offer our views and expectations regarding their development trends and open problems.

2.
《植物生态学报》(Chinese Journal of Plant Ecology), 2013, 22(3): 277
A scientific workflow system is designed specifically to organize, manage and execute a series of research steps, or a workflow, in a given runtime environment. The vision for scientific workflow systems is that scientists around the world can collaborate on designing global-scale experiments, sharing data sets, experimental processes, and results on an easy-to-use platform. Each scientist can create and execute their own workflows, view results in real time, and subsequently share and reuse workflows with other scientists. Two case studies, using the Kepler system and BioVeL, are introduced in this paper. The ecological niche modeling process, a specialized workflow included in both the Kepler system and BioVeL, is used to describe and discuss the features, development trends, and open problems of scientific workflows.

3.
4.

Background  

There is significant demand in the life sciences for pipelines or workflows that chain a number of discrete compute- and data-intensive analysis tasks into sophisticated analysis procedures. This need has led to the development of general as well as domain-specific workflow environments that are either complex desktop applications or Internet-based applications. Complexities can arise when configuring these applications in heterogeneous compute and storage environments if the execution and data access models are not designed appropriately. These complexities manifest themselves as limited access to available HPC resources, significant overhead required to configure tools, and the inability of users to easily manage files across heterogeneous HPC storage infrastructure.

5.
6.
Environmental sensor networks are now commonly being deployed within environmental observatories and as components of smaller-scale ecological and environmental experiments. Effectively using data from these sensor networks presents technical challenges that are difficult for scientists to overcome, severely limiting the adoption of automated sensing technologies in environmental science. The Realtime Environment for Analytical Processing (REAP) is an NSF-funded project to address the technical challenges related to accessing and using heterogeneous sensor data from within the Kepler scientific workflow system. Using distinct use cases in terrestrial ecology and oceanography as motivating examples, we describe workflows and extensions to Kepler to stream and analyze data from observatory networks and archives. We focus on the use of two newly integrated data sources in Kepler: DataTurbine and OPeNDAP. Integrated access to both near real-time data streams and data archives from within Kepler facilitates both simple data exploration and sophisticated analysis and modeling with these data sources.
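As a rough illustration of the kind of archive access described above (outside of Kepler itself), the sketch below opens a remote dataset over OPeNDAP with xarray. It is not the REAP/Kepler implementation: the URL and variable name are placeholders, and it assumes xarray plus an OPeNDAP-capable backend (netCDF4 or pydap) is installed.

```python
# Minimal, hypothetical sketch of pulling archived sensor data over OPeNDAP.
# Not the Kepler/REAP implementation; it only illustrates the access pattern.
# Assumes xarray with an OPeNDAP-capable backend (netCDF4 or pydap) is installed,
# and that the placeholder URL is replaced with a real OPeNDAP endpoint.
import xarray as xr

OPENDAP_URL = "http://example.org/opendap/sea_surface_temperature.nc"  # placeholder

def load_subset(url: str, variable: str, time_slice: slice):
    """Open a remote OPeNDAP dataset lazily and pull only the requested slice."""
    ds = xr.open_dataset(url)          # reads metadata; data are fetched on demand
    da = ds[variable].isel(time=time_slice)
    return da.load()                   # transfers only this subset over the network

if __name__ == "__main__":
    sst = load_subset(OPENDAP_URL, "sst", slice(0, 10))   # "sst" is illustrative
    print(sst.mean().item())
```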

7.

Modeling complex computational applications as large computational workflows has proven an effective means of understanding their intricacies and of determining the best approach to their realization. Scheduling such workflows in the cloud while also respecting users' differing quality-of-service requirements is a challenging task. The present paper introduces a new direction based on a divide-and-conquer approach to scheduling these workflows. The proposed Divide-and-conquer Workflow Scheduling algorithm (DQWS) is designed to minimize the cost of workflow execution while respecting its deadline. The critical path concept is the inspiration behind the divide-and-conquer process. DQWS finds the critical path, schedules it, removes it from the workflow, and thereby divides the remainder into several smaller workflows. The process continues until only chain-structured workflows, called linear graphs, remain; scheduling these linear graphs is performed in the final phase of the algorithm. Experiments show that DQWS outperforms its competitors, both in meeting deadlines and in minimizing the monetary cost of executing scheduled workflows.

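To make the divide-and-conquer idea above concrete, here is a toy sketch that repeatedly extracts the longest (critical) path from a small DAG of task durations and peels it off until nothing remains. It is not the DQWS algorithm itself (deadlines, monetary cost, and VM assignment are omitted), and the task graph is invented purely for illustration.

```python
# Toy illustration of critical-path-based decomposition of a workflow DAG.
# Not DQWS: deadlines, costs, and resource assignment are deliberately left out.
# The task durations and edges below are invented for the example.
from collections import defaultdict

def critical_path(duration, edges):
    """Longest path (by summed task duration) in a DAG, via topological-order DP."""
    succ, indeg = defaultdict(list), defaultdict(int)
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    order, stack = [], [n for n in duration if indeg[n] == 0]
    while stack:                       # Kahn-style topological ordering
        n = stack.pop()
        order.append(n)
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                stack.append(m)
    best, prev = {n: duration[n] for n in duration}, {}
    for u in order:                    # relax edges in topological order
        for v in succ[u]:
            if best[u] + duration[v] > best[v]:
                best[v], prev[v] = best[u] + duration[v], u
    end = max(best, key=best.get)
    path = [end]
    while path[-1] in prev:            # walk predecessors back to the path start
        path.append(prev[path[-1]])
    return list(reversed(path))

def decompose(duration, edges):
    """Repeatedly peel off the current critical path, divide-and-conquer style."""
    duration, edges = dict(duration), list(edges)
    pieces = []
    while duration:
        cp = critical_path(duration, edges)
        pieces.append(cp)
        duration = {n: d for n, d in duration.items() if n not in cp}
        edges = [(u, v) for u, v in edges if u not in cp and v not in cp]
    return pieces

if __name__ == "__main__":
    dur = {"A": 2, "B": 4, "C": 3, "D": 1, "E": 5}
    dag = [("A", "B"), ("A", "C"), ("B", "E"), ("C", "E"), ("D", "E")]
    print(decompose(dur, dag))   # e.g. [['A', 'B', 'E'], ['C'], ['D']]
```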

8.
Soma, Prathibha; Latha, B. Cluster Computing, 2021, 24(2): 1123-1134

Scientific workflow applications are used by scientists to carry out research in domains such as physics, chemistry, and astronomy. These applications require huge computational resources, and cloud platforms are currently used to run them efficiently. Improving the makespan and cost of workflow execution in the cloud requires identifying the proper number of virtual machines (VMs) and choosing the proper VM type. Because the cloud platform is dynamic, the available resources and their types are two important factors affecting the cost and makespan of workflow execution. The primary objective of this work is to analyze the relationships among the cloud configuration parameters (number of VMs, VM type, VM configurations) when executing scientific workflow applications on a cloud platform. To accurately analyze the influence of cloud resource configuration and scheduling policies, this work introduces a new predictive modelling approach using the Box–Behnken design, one of the modelling techniques of Response Surface Methodology (RSM), which builds quadratic mathematical models for analyzing relationships among input and output variables. Workflow cost and makespan models were built for real-world scientific workflows using ANOVA; the models fit well and can be useful in analyzing the performance of scientific workflow applications in the cloud.

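The quadratic response-surface idea can be sketched without any cloud infrastructure: fit a second-order polynomial relating resource-configuration factors to makespan, then inspect or optimize it. The snippet below does this with scikit-learn on synthetic data; the factor names, data generator, and threshold values are invented for illustration and do not reproduce the paper's Box–Behnken experiments.

```python
# Minimal sketch of a quadratic (second-order) response-surface model, the kind of
# model that RSM/Box-Behnken designs are used to fit. Synthetic, illustrative data
# only; this does not reproduce the paper's experiments or factor levels.
# Assumes numpy and scikit-learn are installed.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Hypothetical design factors: number of VMs, VM size (vCPUs), and a scheduling
# policy code treated as numeric purely for illustration.
X = rng.uniform([2, 1, 0], [16, 8, 2], size=(60, 3))

# Hypothetical makespan response with curvature, an interaction term, and noise.
n_vm, vcpu, policy = X.T
y = 500.0 / (n_vm * vcpu) + 3.0 * policy**2 - 0.5 * n_vm * policy + rng.normal(0, 1, 60)

# Second-order polynomial model: linear, interaction, and squared terms.
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      LinearRegression())
model.fit(X, y)

print("R^2 on training data:", round(model.score(X, y), 3))
print("Predicted makespan for 8 VMs, 4 vCPUs, policy 1:",
      round(float(model.predict([[8, 4, 1]])[0]), 2))
```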

9.

Background

Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts.

Results

In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on an extensible set of reusable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: it allows the integration of existing Taverna and Galaxy workflows in a single environment and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible: users can either instantiate the whole system on the cloud or delegate the execution of certain sub-workflows to the cloud infrastructure.

Conclusions

Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub-) workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and complexity of analysis. The system can be accessed either through a cloud-enabled web-interface or downloaded and installed to run within the user's local environment. All resources related to Tavaxy are available at http://www.tavaxy.org.
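The hierarchical-workflow idea mentioned above, where a sub-workflow appears as a single step inside a larger workflow, can be pictured in a few lines of plain Python. This is a conceptual sketch only, not Tavaxy's pattern language or its Taverna/Galaxy import machinery, and the task names are invented.

```python
# Conceptual sketch of hierarchical workflows: a sub-workflow behaves like one task
# inside an enclosing workflow. Purely illustrative; this is not Tavaxy's engine,
# its pattern set, or its Taverna/Galaxy integration.
from typing import Callable, List

Task = Callable[[str], str]

class Workflow:
    """A linear sequence of tasks; itself usable as a task in a parent workflow."""

    def __init__(self, name: str, steps: List[Task]):
        self.name, self.steps = name, steps

    def __call__(self, data: str) -> str:
        for step in self.steps:
            data = step(data)
        return data

# Two hypothetical atomic tasks.
def trim_reads(data: str) -> str:
    return data + " -> trimmed"

def align_reads(data: str) -> str:
    return data + " -> aligned"

# A sub-workflow composed of atomic tasks ...
preprocessing = Workflow("preprocessing", [trim_reads, align_reads])

# ... nested as a single step of a larger workflow.
analysis = Workflow("analysis", [preprocessing, lambda d: d + " -> variants called"])

if __name__ == "__main__":
    print(analysis("sample.fastq"))
    # sample.fastq -> trimmed -> aligned -> variants called
```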

10.
11.
12.
SUMMARY: Emerging web-services technology allows interoperability between multiple distributed architectures. Here, we present REMORA, a web server implemented according to the BioMoby web-service specifications, providing life science researchers with an easy-to-use workflow generator and launcher, a repository of predefined workflows and a survey system. CONTACT: Jerome.Gouzy@toulouse.inra.fr AVAILABILITY: The REMORA web server is freely available at http://bioinfo.genopole-toulouse.prd.fr/remora; sources are available upon request from the authors.

13.
Data analysis processes in scientific applications can be expressed as coarse-grain workflows of complex data processing operations with data flow dependencies between them. Performance optimization of these workflows can be viewed as a search for a set of optimal values in a multidimensional parameter space consisting of input performance parameters to the applications that are known to affect their execution times. While some performance parameters such as grouping of workflow components and their mapping to machines do not affect the accuracy of the analysis, others may dictate trading the output quality of individual components (and of the whole workflow) for performance. This paper describes an integrated framework which is capable of supporting performance optimizations along multiple such parameters. Using two real-world applications in the spatial, multidimensional data analysis domain, we present an experimental evaluation of the proposed framework.
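The abstract above frames performance tuning as a search over a multidimensional parameter space in which some parameters trade output quality for speed. A minimal way to picture that search is a constrained sweep, as in the hedged sketch below; the parameter names, runtime and quality models, and the threshold are all invented for illustration and are not the paper's framework.

```python
# Toy sketch of searching a multidimensional performance-parameter space while
# respecting a minimum output-quality constraint. The parameters, the runtime and
# quality models, and the threshold are invented purely for illustration.
from itertools import product

# Hypothetical tunable parameters: component grouping size, data chunk size (MB),
# and an approximation level where higher values run faster but degrade output.
GROUP_SIZES = [1, 2, 4, 8]
CHUNK_MB = [64, 128, 256]
APPROX_LEVELS = [0, 1, 2]

def predicted_runtime(group, chunk, approx):
    """Stand-in cost model: bigger groups/chunks and coarser output run faster."""
    return 1000.0 / (group * chunk ** 0.5) * (1.0 - 0.2 * approx)

def predicted_quality(approx):
    """Stand-in quality model: only the approximation level degrades quality."""
    return 1.0 - 0.15 * approx

def best_configuration(min_quality=0.8):
    """Exhaustively sweep the space, keeping only configurations above the bar."""
    feasible = (
        (predicted_runtime(g, c, a), (g, c, a))
        for g, c, a in product(GROUP_SIZES, CHUNK_MB, APPROX_LEVELS)
        if predicted_quality(a) >= min_quality
    )
    return min(feasible)

if __name__ == "__main__":
    runtime, (group, chunk, approx) = best_configuration()
    print(f"grouping={group}, chunk={chunk} MB, approximation={approx}, "
          f"predicted runtime={runtime:.1f}")
```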

14.
15.
Extremely large datasets have become routine in biology. However, performing a computational analysis of a large dataset can be overwhelming, especially for novices. Here, we present a step-by-step guide to computing workflows with the biologist end-user in mind. Starting from a foundation of sound data management practices, we make specific recommendations on how to approach and perform computational analyses of large datasets, with a view to enabling sound, reproducible biological research.

16.
Incomplete methods sections have made it difficult for researchers to replicate and build on the work of others, contributing to problems in reproducibility. It is important to increase the level of detail in our methods sections and to share step-by-step protocols in protocol repositories or journals. Request a Protocol is a new feature in Molecular Biology of the Cell that allows readers to request detailed protocols directly from the methods section of the research article, with links between the protocols and the research articles, and has the potential to improve research reproducibility and help everyone design and execute robust life science experiments.

17.
Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful technique to decipher tissue composition at the single-cell level and to inform on disease mechanisms, tumor heterogeneity, and the state of the immune microenvironment. Although multiple methods for the computational analysis of scRNA-seq data exist, their application in a clinical setting demands standardized and reproducible workflows, targeted to extract, condense, and display the clinically relevant information. To this end, we designed scAmpi (Single Cell Analysis mRNA pipeline), a workflow that facilitates scRNA-seq analysis from raw read processing to informing on sample composition, clinically relevant gene and pathway alterations, and in silico identification of personalized candidate drug treatments. We demonstrate the value of this workflow for clinical decision making in a molecular tumor board as part of a clinical study.

18.
BioImageXD puts open-source computer science tools for three-dimensional visualization and analysis into the hands of all researchers, through a user-friendly graphical interface tuned to the needs of biologists. BioImageXD has no restrictive licenses or undisclosed algorithms and enables publication of precise, reproducible and modifiable workflows. It allows simple construction of processing pipelines and should enable biologists to perform challenging analyses of complex processes. We demonstrate its performance in a study of integrin clustering in response to selected inhibitors.

19.
Taverna: a tool for the composition and enactment of bioinformatics workflows
MOTIVATION: In silico experiments in bioinformatics involve the co-ordinated use of computational tools and information repositories. A growing number of these resources are being made available with programmatic access in the form of Web services. Bioinformatics scientists will need to orchestrate these Web services in workflows as part of their analyses. RESULTS: The Taverna project has developed a tool for the composition and enactment of bioinformatics workflows for the life sciences community. The tool includes a workbench application which provides a graphical user interface for the composition of workflows. These workflows are written in a new language called the simple conceptual unified flow language (Scufl), whereby each step within a workflow represents one atomic task. Two examples are used to illustrate the ease with which in silico experiments can be represented as Scufl workflows using the workbench application.

20.