Similar Articles (20 results)
1.

High energy consumption (EC) is one of the leading issues in cloud environments. Optimizing EC is generally framed as a scheduling problem: an optimal scheduling strategy selects resources or tasks so that system performance is not violated while EC is minimized and resource utilization (RU) is maximized. This paper presents a task scheduling model for scheduling tasks on virtual machines (VMs). The objective of the proposed model is to minimize EC, maximize RU, and minimize workflow makespan while preserving each task's deadline and dependency constraints. An energy- and resource-efficient workflow scheduling algorithm (ERES) is proposed to schedule workflow tasks to VMs and to dynamically deploy/un-deploy VMs based on the workflow tasks' requirements. An energy model is presented to compute the EC of the servers. A double-threshold policy is used to determine each server's status, i.e., overloaded, underloaded, or normal. To balance the workload on overloaded/underloaded servers, a live VM migration strategy is used. To check the effectiveness of the proposed algorithm, exhaustive simulation experiments are conducted. The proposed algorithm is compared with the power-efficient scheduling and VM consolidation (PESVMC) algorithm in terms of RU, energy efficiency, and task makespan. The results are also verified in a real cloud environment and demonstrate the effectiveness of the proposed ERES algorithm.
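As a rough illustration of the double-threshold policy described above, the sketch below classifies servers by CPU utilization. The `Server` record and the threshold values are hypothetical; the paper's actual load metric and thresholds may differ.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    cpu_utilization: float  # fraction in [0, 1]

def classify(server: Server, t_low: float = 0.2, t_high: float = 0.8) -> str:
    """Double-threshold policy: label a server as underloaded,
    normal, or overloaded based on its CPU utilization."""
    if server.cpu_utilization < t_low:
        return "underloaded"  # candidate for consolidation / shutdown
    if server.cpu_utilization > t_high:
        return "overloaded"   # candidate source for live VM migration
    return "normal"

servers = [Server("s1", 0.10), Server("s2", 0.55), Server("s3", 0.93)]
for s in servers:
    print(s.name, classify(s))
```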


2.
In heterogeneous distributed computing systems such as cloud computing, mapping tasks to resources is a major issue that strongly affects system performance. Because of heterogeneity, dynamic behavior, and the dependencies among requests, task scheduling is known to be an NP-complete problem. In this paper, we propose a hybrid heuristic method (HSGA), based on a genetic algorithm, to find a suitable schedule for a workflow graph that converges quickly while optimizing makespan, load balancing across resources, and speedup ratio. First, HSGA prioritizes the tasks of a complex graph according to their impact on other tasks, based on graph topology; this technique is effective in reducing the application's completion time. It then merges the Best-Fit and Round-Robin methods to build a good initial population quickly, and applies suitable operators such as mutation to steer the algorithm toward an optimized solution. The algorithm evaluates solutions against parameters relevant to cloud environments. Finally, compared with the other algorithms studied, the proposed algorithm produces better results as the number of tasks in the application graph increases.
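The topology-based prioritization could plausibly be realized by ranking tasks on how much downstream work depends on them. The sketch below is one such reading, not HSGA's exact procedure; the DAG and the scoring rule are illustrative assumptions.

```python
from functools import lru_cache

# Workflow DAG as an adjacency list: task -> tasks that depend on it.
dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

def impact_scores(dag):
    """Count each task's descendants; more downstream work = higher impact."""
    @lru_cache(maxsize=None)
    def reach(task):
        out = set()
        for child in dag[task]:
            out.add(child)
            out |= reach(child)
        return frozenset(out)
    return {t: len(reach(t)) for t in dag}

impact = impact_scores(dag)
# Tasks whose completion unblocks more descendants are scheduled first.
priority = sorted(dag, key=impact.get, reverse=True)
print(priority)  # ['A', 'B', 'C', 'D']
```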

3.
Geoscience observations and model simulations generate vast amounts of multi-dimensional data, and analyzing these data effectively is essential for geoscience studies. The task is challenging for geoscientists, however, because processing the massive volumes is both compute- and data-intensive, and the analytics require complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics that leverages cloud computing, MapReduce, and Service-Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers; a MapReduce-based algorithm framework is developed to support parallel processing of geoscience data; and a service-oriented workflow architecture is built to support on-demand, complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this framework significantly improves the efficiency of big geoscience data analytics by reducing data processing time and by simplifying analytical procedures for geoscientists.
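The MapReduce step of such a framework follows the standard map/shuffle/reduce pattern. The plain-Python sketch below (a stand-in for a real Hadoop/HBase deployment) computes a per-grid-cell mean from toy records; the record format is an assumption, not taken from the paper.

```python
from collections import defaultdict
from itertools import chain

# Toy records: (grid_cell_id, temperature) pairs from many input files.
records = [("cell-1", 280.1), ("cell-2", 281.5), ("cell-1", 279.7)]

def mapper(record):
    cell, value = record
    yield cell, (value, 1)         # emit a partial sum and a count

def reducer(cell, partials):
    total = sum(v for v, _ in partials)
    count = sum(c for _, c in partials)
    return cell, total / count     # per-cell mean

# Shuffle: group mapped pairs by key, as a MapReduce runtime would.
groups = defaultdict(list)
for key, val in chain.from_iterable(map(mapper, records)):
    groups[key].append(val)

print(dict(reducer(k, v) for k, v in groups.items()))
```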

4.
Soma Prathibha, Latha B. Cluster Computing, 2021, 24(2): 1123-1134.

Scientists use scientific workflow applications to carry out research in domains such as physics, chemistry, and astronomy. These applications require huge computational resources, and cloud platforms are currently used to run them efficiently. Improving the makespan and cost of workflow execution on a cloud platform requires identifying the proper number of virtual machines (VMs) and choosing the proper VM type; because the cloud platform is dynamic, the available resources and their types are two important factors in the cost and makespan of workflow execution. The primary objective of this work is to analyze the relationships among the cloud configuration parameters (number of VMs, VM type, VM configuration) when executing scientific workflow applications on a cloud platform. To accurately analyze the influence of cloud resource configuration and scheduling policies, a new predictive model is built using the Box-Behnken design, a modelling technique from Response Surface Methodology (RSM) that yields quadratic mathematical models for analyzing relationships between input and output variables. Workflow cost and makespan models were built for real-world scientific workflows using ANOVA; the models fit well and can be used to analyze the performance of scientific workflow applications in the cloud.
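The quadratic response-surface model underlying a Box-Behnken design has the form y = b0 + b1*x1 + b2*x2 + b12*x1*x2 + b11*x1^2 + b22*x2^2. The sketch below fits such a model by ordinary least squares on made-up data for two coded factors; it illustrates the model form only, not the paper's actual design matrix or measured responses.

```python
import numpy as np

# Coded factor levels (e.g. x1 = number of VMs, x2 = VM size) and a
# measured response y (e.g. makespan). All values are made up.
x1 = np.array([-1, -1,  1, 1, 0, 0, -1, 1,  0])
x2 = np.array([-1,  1, -1, 1, 0, 0,  0, 0, -1])
y  = np.array([42, 38, 30, 29, 33, 34, 40, 31, 36])

# Design matrix for the full quadratic RSM model.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["b0", "b1", "b2", "b12", "b11", "b22"], coeffs.round(3))))
```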


5.
Most traditional scheduling strategies consider only users' quality-of-service (QoS) time or cost requirements, lack effective analysis of users' real service demands, and cannot guarantee scheduling security. To overcome these restrictions, this paper adds trust to the workflow QoS target and proposes a novel customizable cloud workflow scheduling model. To better analyze different users' service requirements and provide customizable services, the new model divides workflow scheduling into two stages: macro multi-workflow scheduling, with the cloud user as the unit, and micro single-workflow scheduling. It introduces a trust mechanism at the multi-workflow scheduling level. At the single-workflow scheduling level, it classifies workflows into time-sensitive, cost-sensitive, and balanced types according to each workflow's QoS demand parameters using a fuzzy clustering method, and on that basis customizes a service strategy for each type. Simulation experiments show that the new scheme shortens workflows' final completion times and achieves a relatively high execution success rate and user satisfaction compared with other kindred solutions.
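The fuzzy classification into time-sensitive, cost-sensitive, and balanced workflows can be sketched with fuzzy-c-means-style memberships against fixed class prototypes. The feature encoding and prototype centers below are assumptions for illustration, not the paper's clustering setup.

```python
import numpy as np

# Each workflow described by (time_sensitivity, cost_sensitivity) in [0, 1].
centers = {
    "time-sensitive": np.array([0.9, 0.1]),
    "cost-sensitive": np.array([0.1, 0.9]),
    "balanced":       np.array([0.5, 0.5]),
}

def memberships(x, m=2.0):
    """Fuzzy-c-means-style membership degrees to fixed class centers."""
    d = {k: max(np.linalg.norm(x - c), 1e-9) for k, c in centers.items()}
    return {k: 1.0 / sum((d[k] / d[j]) ** (2 / (m - 1)) for j in d) for k in d}

wf = np.array([0.8, 0.3])  # a fairly deadline-driven workflow
u = memberships(wf)
print(max(u, key=u.get), {k: round(v, 2) for k, v in u.items()})
```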

6.

Background

Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance in the bioinformatics community are Taverna and Galaxy. Each system has a large user base and supports an ever-growing repository of application workflows; however, workflows developed for one system cannot easily be imported and executed on the other. This lack of interoperability stems from differences in the two systems' models of computation, workflow languages, and architectures, and it limits the sharing of workflows between the user communities and leads to duplicated development effort.

Results

In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on an extensible set of reusable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: it allows the integration of existing Taverna and Galaxy workflows in a single environment, and it supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run time and design time, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible: users can either instantiate the whole system on the cloud or delegate the execution of certain sub-workflows to the cloud infrastructure.

Conclusions

Tavaxy reduces the workflow development cycle by introducing workflow patterns that simplify workflow creation. It enables the re-use and integration of existing (sub-)workflows from Taverna and Galaxy and allows the creation of hybrid workflows. Its additional features exploit recent advances in high-performance cloud computing to cope with increasing data sizes and analysis complexity. The system can be accessed either through a cloud-enabled web interface or downloaded and installed to run within the user's local environment. All resources related to Tavaxy are available at http://www.tavaxy.org.

7.
Cluster Computing - Recently, modern businesses have started moving to cloud computing platforms to deploy their workflow applications. However, scheduling workflows under resource...

8.
In recent studies, exome sequencing has proven to be a successful screening tool for identifying candidate genes that cause rare genetic diseases. Although the underlying targeted sequencing methods are well established, the necessary data handling and the focused, structured analysis remain demanding tasks. Here, we present a cloud-enabled autonomous analysis pipeline comprising the complete exome analysis workflow. The pipeline combines several in-house developed and published applications to perform the following steps: (a) initial quality control, (b) intelligent data filtering and pre-processing, (c) sequence alignment to a reference genome, (d) SNP and DIP detection, (e) functional annotation of variants using different approaches, and (f) detailed report generation during various stages of the workflow. The pipeline connects the selected analysis steps, exposes all available parameters for customized usage, performs the required data handling, and distributes computationally expensive tasks either on a dedicated high-performance computing infrastructure or on the Amazon cloud environment (EC2). The presented application has already been used in several research projects, including studies to elucidate the role of rare genetic diseases. The pipeline is continuously tested and is publicly available under the GPL as a VirtualBox or cloud image at http://simplex.i-med.ac.at; additional supplementary data are provided at http://www.icbi.at/exome.
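At its core, such a pipeline is a linear composition of stages. The skeleton below mirrors steps (a)-(f) with hypothetical placeholder functions; the real pipeline wraps external tools, parameter handling, and distributed execution.

```python
# Hypothetical stage functions standing in for the tools behind (a)-(f).
def quality_control(reads):    return reads                               # (a)
def filter_reads(reads):       return reads                               # (b)
def align(reads):              return {"alignments": reads}               # (c)
def call_variants(alns):       return ["snp1", "dip1"]                    # (d)
def annotate(variants):        return [(v, "exonic") for v in variants]   # (e)
def report(annotated):         return f"{len(annotated)} variants annotated"  # (f)

STAGES = [quality_control, filter_reads, align, call_variants, annotate, report]

def run_pipeline(data):
    for stage in STAGES:
        data = stage(data)  # each stage's output feeds the next
    return data

print(run_pipeline(["read1", "read2"]))
```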

9.
With the popularization and development of cloud computing, many scientific computing applications are run in cloud environments. However, the application scenarios of scientific computing are becoming increasingly dynamic and complicated, with unpredictable job submission times, differing job priorities, and deadline and budget constraints on job execution. How to perform scientific computing efficiently in the cloud has therefore become an urgent problem. To address it, we design an elastic resource provisioning and task scheduling mechanism for running scientific workflow jobs in the cloud. The goal of this mechanism is to complete as many high-priority workflow jobs as possible under budget and deadline constraints. The mechanism consists of four steps: job preprocessing, job admission control, elastic resource provisioning, and task scheduling. We evaluate it with four kinds of real scientific workflow jobs under different budget constraints, also accounting for uncertainties in task runtime estimates, provisioning delays, and failures. The results show that in most cases our mechanism achieves better performance than other mechanisms; moreover, the uncertainties of task runtime estimates, VM provisioning delays, and task failures have no major impact on its performance.
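The admission-control step can be sketched as a feasibility check: a job is admitted only if its estimated cost fits its budget and its estimated finish time (including a provisioning delay) meets its deadline. The `Job` fields and the delay value below are assumptions, not the paper's exact mechanism.

```python
from dataclasses import dataclass

@dataclass
class Job:
    priority: int        # larger = more important
    deadline: float      # seconds from now
    budget: float        # dollars
    est_runtime: float   # estimated seconds on the chosen VM type
    est_cost: float      # estimated dollars on the chosen VM type

def admit(jobs, provisioning_delay=60.0):
    """Keep high-priority jobs whose cheapest plan meets both the
    deadline (including VM provisioning delay) and the budget."""
    admitted = []
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        on_time = job.est_runtime + provisioning_delay <= job.deadline
        affordable = job.est_cost <= job.budget
        if on_time and affordable:
            admitted.append(job)
    return admitted

jobs = [Job(2, 3600, 5.0, 1800, 2.5), Job(1, 600, 1.0, 900, 0.5)]
print(admit(jobs))  # the second job misses its deadline and is rejected
```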

10.
Quantifying ecosystem structure is of key importance for ecology, conservation, restoration, and biodiversity monitoring, because the diversity, geographic distribution, and abundance of animals, plants, and other organisms are tightly linked to the physical structure of vegetation and its associated microclimates. Light Detection And Ranging (LiDAR), an active remote sensing technique, can provide detailed, high-resolution information on ecosystem structure: the laser pulses emitted from the sensor and their return signals from the vegetation (leaves, branches, stems) yield three-dimensional point clouds from which metrics of vegetation structure (e.g. ecosystem height, cover, and structural complexity) can be derived. However, processing 3D LiDAR point clouds into geospatial data products of ecosystem structure remains challenging across broad spatial extents due to the large volume of national or regional point cloud datasets (typically multiple terabytes comprising hundreds of billions of points). Here, we present a high-throughput workflow called 'Laserfarm' enabling the efficient, scalable, and distributed processing of multi-terabyte LiDAR point clouds from national and regional airborne laser scanning (ALS) surveys into geospatial data products of ecosystem structure. Laserfarm is a free and open-source, end-to-end workflow containing modular pipelines for the re-tiling, normalization, feature extraction, and rasterization of point cloud information from ALS and other LiDAR surveys. The workflow is designed for horizontal scalability and can be deployed with distributed computing on different infrastructures, e.g. a cluster of virtual machines. We demonstrate Laserfarm by processing a country-wide multi-terabyte ALS dataset of the Netherlands (covering ∼34,000 km2 with ∼700 billion points and ∼16 TB of uncompressed LiDAR point clouds) into 25 raster layers at 10 m resolution capturing ecosystem height, cover, and structural complexity at a national extent. The workflow, implemented in Python and available as Jupyter Notebooks, is applicable to other LiDAR datasets and enables users to execute automated pipelines for generating consistent and reproducible geospatial data products of ecosystem structure from massive amounts of LiDAR point clouds on distributed computing infrastructures, including cloud computing environments. We provide information on workflow performance (including total CPU times, total wall-time estimates, and average CPU times for single files and LiDAR metrics) and discuss how Laserfarm can be scaled to other LiDAR datasets and computing environments, including remote cloud infrastructures. Laserfarm allows a broad user community to process massive amounts of LiDAR point clouds for mapping vegetation structure, e.g. for applications in ecology, biodiversity monitoring, and ecosystem restoration.
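The rasterization step of such a workflow reduces, per grid cell, the heights of all points falling in that cell. Below is a minimal numpy sketch assuming a toy point cloud and a per-cell maximum as the height metric; Laserfarm itself supports many more metrics and runs distributed.

```python
import numpy as np

# Toy point cloud: columns are x, y, z in metres. Real ALS tiles hold
# millions of points, but the binning logic is the same.
pts = np.array([[1.0, 2.0, 5.1], [12.0, 3.0, 9.8], [11.5, 3.2, 7.0]])
cell = 10.0  # raster resolution in metres

ix = (pts[:, 0] // cell).astype(int)
iy = (pts[:, 1] // cell).astype(int)
grid = np.full((ix.max() + 1, iy.max() + 1), np.nan)

# Per-cell maximum height, one simple "ecosystem height" metric.
for i, j, z in zip(ix, iy, pts[:, 2]):
    if np.isnan(grid[i, j]) or z > grid[i, j]:
        grid[i, j] = z
print(grid)
```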

11.
Cloud servers consume different amounts of energy in different running states, and mismatches between cloud servers and cloud tasks waste energy. We therefore study an energy-optimization method for the cloud computing center based on a priced timed automaton, which models the running behavior of the cloud computing system. After introducing a matching matrix between cloud tasks and cloud resources, together with a power matrix over the running states of cloud servers, we design a generation algorithm for the cloud system automaton based on the given generation and reduction rules. We then propose another algorithm that solves the minimum path energy consumption problem in the cloud system automaton, thereby obtaining an energy-optimal solution and an energy-optimal value for the cloud system. A case study and repeated experimental analyses show that our method is effective and feasible.
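Once the cloud system automaton is generated, the minimum path energy consumption problem amounts to a shortest-path search over its weighted transition graph. The sketch below applies Dijkstra's algorithm to a hypothetical state graph; the states, transitions, and energy costs are illustrative only, not the paper's model.

```python
import heapq

# States of the automaton; edge weights are the accumulated energy
# costs of transitions. Graph and costs are made up for illustration.
graph = {
    "idle":  [("busy", 5.0), ("sleep", 1.0)],
    "sleep": [("idle", 2.0)],
    "busy":  [("done", 3.0)],
    "done":  [],
}

def min_energy(start, goal):
    """Dijkstra over the transition graph: minimum accumulated
    energy to reach `goal` from `start`, or None if unreachable."""
    pq, seen = [(0.0, start)], set()
    while pq:
        cost, state = heapq.heappop(pq)
        if state == goal:
            return cost
        if state in seen:
            continue
        seen.add(state)
        for nxt, w in graph[state]:
            if nxt not in seen:
                heapq.heappush(pq, (cost + w, nxt))
    return None

print(min_energy("idle", "done"))  # 8.0 via idle -> busy -> done
```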

12.
Cluster Computing - The resource provisioning and workflow execution in a multi-cloud environment using a pay-as-you-use framework have recently gained the attention of the cloud computing research...

13.
Ecological niche modelling (ENM) Components are a set of reusable workflow components specialized for performing ENM tasks within the Taverna workflow management system. Each component encapsulates specific functionality and can be combined with other components to facilitate the creation of larger and more complex workflows. One key distinguishing feature of ENM Components is that most tasks are performed remotely by calling web services, simplifying software setup and maintenance on the client side and allowing more powerful computing resources to be exploited. This paper presents the current set of ENM Components in the context of the Taverna family of tools for creating, publishing and sharing workflows. An example is included showing how the components can be used in a preliminary investigation of the effects of mixing different spatial resolutions in ENM experiments.

14.

Background

With the globalization of clinical trials, a growing emphasis has been placed on the standardization of the workflow in order to ensure the reproducibility and reliability of the overall trial. Despite the importance of workflow evaluation, to our knowledge no previous studies have attempted to adapt existing modeling languages to standardize the representation of clinical trials. Unified Modeling Language (UML) is a computational language that can be used to model operational workflow, and a UML profile can be developed to standardize UML models within a given domain. This paper's objective is to develop a UML profile to extend the UML Activity Diagram schema into the clinical trials domain, defining a standard representation for clinical trial workflow diagrams in UML.

Methods

Two Brazilian clinical trial sites in rheumatology and oncology were examined to model their workflow and collect time-motion data. UML modeling was conducted in Eclipse, and a UML profile was developed to incorporate information used in discrete event simulation software.

Results

Ethnographic observation revealed bottlenecks in the workflow: these included tasks requiring the full commitment of clinical research coordinators (CRCs), transferring notes from paper to computers, deviations from standard operating procedures, and conflicts between different IT systems. Time-motion analysis revealed that nurses' activities took up the most time in the workflow and contained a high frequency of shorter-duration activities. Administrative assistants performed more activities near the beginning and end of the workflow. Overall, clinical trial tasks occurred more frequently than clinic routines or other general activities.

Conclusions

This paper describes a method for modeling clinical trial workflow in UML and standardizing the resulting workflow diagrams through a UML profile. In the increasingly global environment of clinical trials, standardized workflow modeling is a necessary precursor to comparative analysis of international clinical trial workflows.

15.
Computational analysis and interactive visualization of biological networks and protein structures are common tasks for gaining insight into biological processes. This protocol describes three workflows based on the NetworkAnalyzer and RINalyzer plug-ins for Cytoscape, a popular software platform for networks. NetworkAnalyzer has become a standard Cytoscape tool for comprehensive network topology analysis. In addition, RINalyzer provides methods for exploring residue interaction networks derived from protein structures. The first workflow uses NetworkAnalyzer to perform a topological analysis of biological networks. The second workflow applies RINalyzer to study protein structure and function and to compute network centrality measures. The third workflow combines NetworkAnalyzer and RINalyzer to compare residue networks. The full protocol can be completed in ~2 h.
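For readers without Cytoscape at hand, the same kinds of centrality measures can be reproduced on a toy residue interaction network with the networkx library (an assumption here, not a tool the protocol itself uses):

```python
import networkx as nx

# Toy residue interaction network: nodes are residues, edges are
# non-covalent contacts. Real networks are derived from structures.
G = nx.Graph([("A1", "A2"), ("A2", "A3"), ("A2", "A4"),
              ("A3", "A4"), ("A4", "A5")])

print("degree:     ", nx.degree_centrality(G))
print("betweenness:", nx.betweenness_centrality(G))
print("closeness:  ", nx.closeness_centrality(G))
```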

16.
Cloud computing is becoming the new generation of computing infrastructure, and many cloud vendors provide different types of cloud services. Choosing the best cloud services for specific applications is very challenging, since it requires balancing business demands, technologies, policies, and preferences in addition to computing requirements. This paper proposes a mechanism for selecting the best public cloud service at the Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) levels. A systematic framework and associated workflow cover cloud service filtration, solution generation, evaluation, and selection of public cloud services. Specifically, we propose: a hierarchical information model for integrating heterogeneous cloud information from different providers, with a corresponding cloud-information collection mechanism; a cloud service classification model for categorizing and filtering cloud services, together with an application requirement schema providing rules for creating application-specific configuration solutions; and a preference-aware solution evaluation mode for evaluating and recommending solutions according to the preferences of application providers. To test the proposed framework and methodologies, a cloud service advisory tool prototype was developed and relevant experiments were conducted. The results show that the proposed system collects, updates, and records cloud information from multiple mainstream public cloud services in real time; generates feasible cloud configuration solutions according to user specifications and acceptable cost predictions; assesses solutions from multiple aspects (e.g., computing capability, potential cost, and Service Level Agreement, SLA) and offers rational recommendations based on user preferences and practical cloud provisioning; and visually presents and compares solutions through an interactive web graphical user interface (GUI).
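Preference-aware evaluation can be sketched as a weighted sum over normalized criteria, with the weights expressing user preferences. The solution names, criteria, and weights below are illustrative assumptions, not the paper's scoring model.

```python
# Candidate cloud configuration solutions scored on normalized
# criteria in [0, 1], higher is better. Values are illustrative.
solutions = {
    "aws-m5":   {"compute": 0.8, "cost": 0.6, "sla": 0.9},
    "gcp-n2":   {"compute": 0.7, "cost": 0.8, "sla": 0.8},
    "azure-d4": {"compute": 0.9, "cost": 0.5, "sla": 0.7},
}
prefs = {"compute": 0.5, "cost": 0.3, "sla": 0.2}  # user preference weights

def score(sol):
    """Weighted-sum utility of one solution under the preferences."""
    return sum(prefs[criterion] * value for criterion, value in sol.items())

ranked = sorted(solutions, key=lambda name: score(solutions[name]), reverse=True)
print(ranked, round(score(solutions[ranked[0]]), 2))  # ['aws-m5', ...] 0.76
```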

17.
Cluster Computing - Due to the limitations associated with the processing capability of mobile devices in cloud environments, various tasks are offloaded to the cloud server. This has led to an...

18.
With the rapid development of cloud computing techniques, the number of users is growing exponentially, and traditional data centers struggle to perform many tasks in real time because of limited resource bandwidth. The concept of fog computing has been proposed to supplement traditional cloud computing and to help provide cloud services. In fog computing, the resource pool is composed of sporadically distributed resources that are more flexible and movable than those of a traditional data center. In this paper, we propose a fog computing structure and present a crowd-funding algorithm that integrates spare resources in the network. Furthermore, to encourage more resource owners to share their resources with the pool and to ensure that resource providers actively perform their assigned tasks, we build an incentive mechanism into the algorithm. Simulation results show that the proposed incentive mechanism effectively reduces the SLA violation rate and accelerates task completion.
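One simple form such an incentive mechanism could take is a payout proportional to the work each provider's spare resources completed. The sketch below is an assumption for illustration, not the paper's crowd-funding algorithm.

```python
# Providers contributing spare resources to the fog pool are rewarded
# in proportion to completed task-units. All figures are illustrative.
providers = {"p1": 120.0, "p2": 60.0, "p3": 20.0}  # completed task-units
reward_pool = 100.0

total = sum(providers.values())
payouts = {p: reward_pool * units / total for p, units in providers.items()}
print(payouts)  # {'p1': 60.0, 'p2': 30.0, 'p3': 10.0}
```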

19.
Data processing in proteomics can be a challenging endeavor, requiring extensive knowledge of many different software packages, all with different algorithms, data format requirements, and user interfaces. In this article we describe the integration of a number of existing programs and tools in Taverna Workbench, a scientific workflow manager currently being developed in the bioinformatics community. We demonstrate how a workflow manager provides a single, visually clear and intuitive interface to complex data analysis tasks in proteomics, from raw mass spectrometry data to protein identifications and beyond.

20.
In this paper we introduce Armadillo v1.1, a novel workflow platform dedicated to designing and conducting phylogenetic studies, including comprehensive simulations. A number of important phylogenetic and general bioinformatics tools have been included in the first software release. As Armadillo is an open-source project, it allows scientists to develop their own modules as well as to integrate existing computer applications. Using our workflow platform, different complex phylogenetic tasks can be modeled and presented in a single workflow without any prior knowledge of programming techniques. The first version of Armadillo was successfully used by professors of bioinformatics at the Université du Québec à Montréal during graduate computational biology courses taught in 2010-11. The program and its source code are freely available at: .
