
1.

Reliable MapReduce computing on opportunistic resources

Heshan Lin Xiaosong Ma Wu-chun Feng 《Cluster computing》2012,15(2):145-161

MapReduce offers an ease-of-use programming paradigm for processing large data sets, making it an attractive model for opportunistic compute resources. However, unlike dedicated resources, where MapReduce has mostly been deployed, opportunistic resources have significantly higher rates of node volatility. As a consequence, the data and task replication scheme adopted by existing MapReduce implementations is woefully inadequate on such volatile resources. In this paper, we propose MOON, short for MapReduce On Opportunistic eNvironments, which is designed to offer reliable MapReduce service for opportunistic computing. MOON adopts a hybrid resource architecture by supplementing opportunistic compute resources with a small set of dedicated resources, and it extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms to take advantage of the hybrid resource architecture. Our results on an emulated opportunistic computing system running atop a 60-node cluster demonstrate that MOON can deliver significant performance improvements to Hadoop on volatile compute resources and even finish jobs that are not able to complete in Hadoop.

2.

Dynamic data replacement and adaptive scheduling policies in spark

Li Chunlin Cai Qianqian Luo Youlong 《Cluster computing》2022,25(2):1421-1439

Improper data replacement and inappropriate selection of the job scheduling policy are important reasons for the degradation of Spark system operation speed, which directly degrades the performance of Spark parallel computing. In this paper, we analyze the existing caching mechanism of Spark and find that there is still room for optimizing the existing caching policy. For the task structure analysis, the key information of Spark tasks is extracted to obtain the data and memory usage during task runtime, and based on this, an RDD weight calculation method is proposed, which integrates various factors affecting RDD usage and establishes an RDD weight model. Based on this model, a minimum weight replacement algorithm based on RDD structure analysis is proposed. The algorithm ensures that the relatively more valuable data in the data replacement process can be cached in memory. In addition, the default job scheduling algorithm of the Spark framework considers a single factor, which cannot form effective scheduling for jobs and wastes cluster resources. In this paper, an adaptive job scheduling policy based on job classification is proposed to solve this problem. The policy can classify job types and schedule resources more effectively for different types of jobs. The experimental results show that the proposed dynamic data replacement algorithm effectively improves Spark's memory utilization. The proposed job classification-based adaptive job scheduling algorithm effectively improves system resource utilization and shortens job completion time.


3.

An effective reliability-driven technique of allocating tasks on heterogeneous cluster systems

Xiaoyong Tang Kenli Li Guiping Liao 《Cluster computing》2014,17(4):1413-1425

In large-scale heterogeneous cluster computing systems, processor and network failures are inevitable and can have an adverse effect on applications executing on such systems. One way of taking failures into account is to employ a reliable scheduling algorithm. However, most existing scheduling algorithms for precedence-constrained tasks in heterogeneous systems only consider schedule length, and do not efficiently satisfy the reliability requirements of tasks. In recognition of this problem, we build an application reliability analysis model based on the Weibull distribution, which can dynamically measure the reliability of tasks executing on heterogeneous clusters with arbitrary network architectures. Then, we propose a reliability-driven earliest finish time with duplication scheduling algorithm (REFTD), which incorporates task reliability overhead into scheduling. Furthermore, to improve system reliability, it duplicates a task if the task's hazard rate exceeds a threshold \(\theta\). The comparison study, based on both randomly generated graphs and the graphs of some real applications, shows that our scheduling algorithm can shorten schedule length and improve system reliability significantly.
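
As a rough illustration of the duplication rule described in this abstract, the sketch below computes the textbook Weibull hazard rate \(h(t) = (k/\lambda)(t/\lambda)^{k-1}\) and flags a task for duplication when that rate exceeds \(\theta\). The function names, the shape/scale parameterization, and the sample values are assumptions made for illustration; they are not the REFTD definitions from the paper.

```python
import math


def weibull_reliability(t: float, shape: float, scale: float) -> float:
    """Textbook Weibull reliability R(t) = exp(-(t/scale)^shape)."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be >= 0; shape and scale must be > 0")
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Textbook Weibull hazard rate h(t) = (shape/scale) * (t/scale)^(shape-1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be > 0")
    return (shape / scale) * (t / scale) ** (shape - 1)


def should_duplicate(t: float, shape: float, scale: float, theta: float) -> bool:
    """Hypothetical rule: duplicate the task when its hazard rate exceeds theta."""
    return weibull_hazard(t, shape, scale) > theta


if __name__ == "__main__":
    # Illustrative values only: shape=2, scale=10, threshold theta=0.05.
    for t in (1.0, 5.0):
        print(t, weibull_hazard(t, 2.0, 10.0), should_duplicate(t, 2.0, 10.0, 0.05))
```

With a shape parameter of 1 the hazard rate is constant (the exponential case), which is a convenient sanity check for the formula.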

4.

A MapReduce task scheduling algorithm for deadline constraints

Zhuo Tang Junqing Zhou Kenli Li Ruixuan Li 《Cluster computing》2013,16(4):651-662

Current work on MapReduce task scheduling with deadline constraints takes neither the differences between Map and Reduce tasks nor the cluster’s heterogeneity into account. This paper proposes an extended MapReduce Task Scheduling algorithm for Deadline constraints on the Hadoop platform: MTSD. It allows the user to specify a job’s deadline and tries to finish the job before the deadline. By measuring each node’s computing capacity, a node classification algorithm is proposed in MTSD. This algorithm classifies the nodes of a heterogeneous cluster into several levels. Under this algorithm, we first present a novel data distribution model which distributes data according to each node’s capacity level. The experiments show that the node classification algorithm can noticeably improve data locality compared with the default scheduler, and that it can also improve the locality of other schedulers. Secondly, we calculate the task’s average completion time based on the node level, which improves the precision of the estimate of a task’s remaining time. Finally, MTSD provides a mechanism to decide which job’s task should be scheduled by calculating the Map and Reduce task slot requirements.

5.

Energy and resource efficient workflow scheduling in a virtualized cloud environment

Garg Neha Singh Damanpreet Goraya Major Singh 《Cluster computing》2021,24(2):767-797

High energy consumption (EC) is one of the leading issues in the cloud environment. The optimization of EC is generally related to the scheduling problem. An optimum scheduling strategy selects the resources or tasks in such a way that system performance is not violated while minimizing EC and maximizing resource utilization (RU). This paper presents a task scheduling model for scheduling tasks on virtual machines (VMs). The objective of the proposed model is to minimize EC, maximize RU, and minimize workflow makespan while preserving the tasks’ deadline and dependency constraints. An energy and resource efficient workflow scheduling algorithm (ERES) is proposed to schedule the workflow tasks to the VMs and dynamically deploy/un-deploy the VMs based on the workflow tasks’ requirements. An energy model is presented to compute the EC of the servers. A double threshold policy is used to determine each server’s status, i.e., overloaded, underloaded, or normal. To balance the workload on the overloaded/underloaded servers, a live VM migration strategy is used. To check the effectiveness of the proposed algorithm, exhaustive simulation experiments are conducted. The proposed algorithm is compared with the power efficient scheduling and VM consolidation (PESVMC) algorithm in terms of RU, energy efficiency, and task makespan. Further, the results are also verified in a real cloud environment. The results demonstrate the effectiveness of the proposed ERES algorithm.
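
The double threshold policy mentioned in this abstract can be pictured with a minimal sketch such as the one below. The threshold values (0.2 and 0.8), the use of a single utilization figure per server, and all names are hypothetical; the abstract does not state how ERES defines them.

```python
from enum import Enum


class ServerStatus(Enum):
    UNDERLOADED = "underloaded"
    NORMAL = "normal"
    OVERLOADED = "overloaded"


def classify_server(utilization: float, lower: float = 0.2, upper: float = 0.8) -> ServerStatus:
    """Classify a server by comparing its utilization (0..1) with two thresholds.

    The default thresholds are illustrative assumptions, not values from the paper.
    """
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= lower < upper <= 1")
    if utilization > upper:
        return ServerStatus.OVERLOADED
    if utilization < lower:
        return ServerStatus.UNDERLOADED
    return ServerStatus.NORMAL


if __name__ == "__main__":
    for u in (0.05, 0.50, 0.95):
        print(u, classify_server(u).value)
```

In a scheme of this kind, overloaded and underloaded servers would be the candidates for VM migration, while servers in the normal band are left alone.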


6.

Value of service based resource management for large-scale computing systems

Cihan Tunc Dylan Machovec Nirmal Kumbhare Ali Akoglu Salim Hariri Bhavesh Khemka Howard Jay Siegel 《Cluster computing》2017,20(3):2013-2030

Task scheduling for large-scale computing systems is a challenging problem. From the user’s perspective, the main concern is the performance of the submitted tasks, whereas for cloud service providers, reducing operating cost while providing the required service is critical. Therefore, it is important for task scheduling mechanisms to balance users’ performance requirements and energy efficiency, because energy consumption is one of the major operational costs. We present a time-dependent value of service (VoS) metric to be maximized by a scheduling algorithm that takes into consideration the arrival time of a task while evaluating the value functions for completing a task at a given time and the task’s energy consumption. We consider the variation in value for completing a task at different times, such that the value of energy reduction can change significantly between peak and non-peak periods. To determine the value of a task completion, we use completion time and energy consumption with soft and hard thresholds. We define the VoS for a given workload to be the sum of the values of all tasks that are executed during a given period of time. Our system model is based on virtual machines, where each task is assigned a resource configuration characterized by the number of homogeneous cores and the amount of memory. For the scheduling of each task submitted to our system, we use the estimated time to compute matrix and the estimated energy consumption matrix, which are created using historical data. We design, evaluate, and compare our task scheduling methods to show that a significant improvement in energy consumption can be achieved when considering time-of-use dependent scheduling algorithms. The simulation results show that we improve the performance and energy values by up to 49% compared to schedulers that do not consider the value functions. Similar to the simulation results, our experimental results from running our value-based scheduling on an IBM blade server show up to 82% improvement in performance value, 110% improvement in energy value, and up to 77% improvement in VoS compared to schedulers that do not consider the value functions.

7.

MRPack: Multi-Algorithm Execution Using Compute-Intensive Approach in MapReduce

Muhammad Idris Shujaat Hussain Muhammad Hameed Siddiqi Waseem Hassan Hafiz Syed Muhammad Bilal Sungyoung Lee 《PloS one》2015,10(8)

Large quantities of data have been generated from multiple sources at exponential rates in the last few years. These data are generated at high velocity as real-time and streaming data in a variety of formats. These characteristics give rise to challenges in their modeling, computation, and processing. Hadoop MapReduce (MR) is a well-known data-intensive distributed processing framework using the distributed file system (DFS) for Big Data. Current implementations of MR only support execution of a single algorithm in the entire Hadoop cluster. In this paper, we propose MapReducePack (MRPack), a variation of MR that supports execution of a set of related algorithms in a single MR job. We exploit the computational capability of a cluster by increasing the compute-intensiveness of MapReduce while maintaining its data-intensive approach. It uses the available computing resources by dynamically managing the task assignment and intermediate data. Intermediate data from multiple algorithms are managed using multi-key and skew mitigation strategies. The performance study of the proposed system shows that it is time, I/O, and memory efficient compared to the default MapReduce. The proposed approach reduces the execution time by 200% with an approximate 50% decrease in I/O cost. Complexity and qualitative results analysis shows significant performance improvement.

8.

A cost-benefit analysis of using cloud computing to extend the capacity of clusters (total citations: 2; self-citations: 0; citations by others: 2)

Marcos Dias de Assunção Alexandre di Costanzo Rajkumar Buyya 《Cluster computing》2010,13(3):335-347

In this paper, we investigate the benefits that organisations can reap by using “Cloud Computing” providers to augment the computing capacity of their local infrastructure. We evaluate the cost of seven scheduling strategies used by an organisation that operates a cluster managed by virtual machine technology and seeks to utilise resources from a remote Infrastructure as a Service (IaaS) provider to reduce the response time of its user requests. Requests for virtual machines are submitted to the organisation’s cluster, but additional virtual machines are instantiated in the remote provider and added to the local cluster when there are insufficient resources to serve the users’ requests. Naïve scheduling strategies can have a great impact on the amount paid by the organisation for using the remote resources, potentially increasing the overall cost with the use of IaaS. Therefore, in this work we investigate seven scheduling strategies that consider the use of resources from the “Cloud”, to understand how these strategies achieve a balance between performance and usage cost, and how much they improve the requests’ response times.

9.

P-Aware: a proportional multi-resource scheduling strategy in cloud data center

Hang Zhou Qing Li Weiqin Tong Samina Kausar Hai Zhu 《Cluster computing》2016,19(3):1089-1103

Concentrating on a single resource cannot efficiently achieve high overall utilization of resources in cloud data centers. Nowadays, the multiple-resource scheduling problem attracts growing attention from researchers. Some studies have made progress in multi-resource scenarios. However, these previous heuristics have obvious limitations in complex software-defined cloud environments. Focusing on energy conservation and load balancing, we propose a preciousness model for multiple-resource scheduling in this paper. We give the formulation of the problem and propose an innovative strategy (P-Aware). In P-Aware, a special algorithm, PMDBP (Proportional Multi-dimensional Bin Packing), is applied in the multi-dimensional bin packing approach. In this algorithm, multiple resources are consumed in a proportional way. The structure and details of PMDBP are discussed in this paper. Extensive experiments demonstrate that our strategy outperforms others in both efficiency and load balancing. P-Aware has now been implemented in the resource management system of our partner company to cut energy consumption and reduce resource contention.

10.

Resource Management for Ad-Hoc Wireless Networks with Cluster Organization

Ionut Cardei Srivatsan Varadarajan Allalaghatta Pavan Lee Graba Mihaela Cardei Manki Min 《Cluster computing》2004,7(1):91-103

Boosted by technology advancements and by government and commercial interest, ad-hoc wireless networks are emerging as a serious platform for distributed mission-critical applications. Guaranteeing QoS in this environment is a hard problem because several applications may share the same resources in the network, and mobile ad-hoc wireless networks (MANETs) typically exhibit high variability in network topology and communication quality. In this paper we introduce DYNAMIQUE, a resource management infrastructure for MANETs. We present a resource model for multi-application admission control that optimizes the application admission utility, defined as a combination of the QoS satisfaction ratio. A method based on external adaptation (shrinking QoS for existing applications and later QoS expansion) is introduced as a way to reduce computational complexity by reducing the search space. We designed an application admission protocol that uses a greedy heuristic to improve application utility. For this, the admission control considers network topology information from the routing layer. Specifically, the admission protocol benefits from a cluster network organization, as defined by ad-hoc routing protocols such as CBRP and LANMAR. Information on cluster membership and cluster head elections allows the admission protocol to minimize control signaling and to improve application quality by localizing task mapping.

11.

Energy efficient job scheduling with workload prediction on cloud data center

Xiaoyong Tang Xiaoyi Liao Jie Zheng Xiaopan Yang 《Cluster computing》2018,21(3):1581-1593

Data centers are the backbone of cloud infrastructure platforms, supporting large-scale data processing and storage. More and more business-to-consumer and enterprise applications are based on cloud data centers. However, data center energy consumption inevitably leads to high operating costs. The aim of this paper is to comprehensively reduce the energy consumption of cloud data center servers, network, and cooling systems. We first build an energy-efficient cloud data center system, including its architecture and its job and power consumption models. Then, we combine linear regression and wavelet neural network techniques into a prediction method, which we call MLWNN, to forecast the short-term workload of the cloud data center. Third, we propose a heuristic energy-efficient job scheduling solution with workload prediction, which is divided into a resource management strategy and an online energy-efficient job scheduling algorithm. Our extensive simulation results demonstrate that the proposed solution performs well and is particularly suitable for cloud data centers with low workloads.

12.

Energy-aware task scheduling in heterogeneous computing environments

Jing Mei Kenli Li Keqin Li 《Cluster computing》2014,17(2):537-550

Efficient application scheduling is critical for achieving high performance in heterogeneous computing (HC) environments. Because of its importance, there has been much research on this problem and various algorithms have been proposed. Duplication-based algorithms are one well-known class of algorithms for scheduling problems, and they achieve high performance in minimizing the overall completion time (makespan) of applications. However, they pursue the shortest makespan excessively by duplicating some tasks redundantly, which leads to a large amount of energy consumption and resource waste. With the growing advocacy for green computing systems, energy conservation has become an important issue and has gained particular interest. An existing technique to reduce the energy consumption of an application is dynamic voltage/frequency scaling (DVFS), whose efficiency is affected by the time and energy overhead caused by voltage scaling. In this paper, we propose a new energy-aware scheduling algorithm with reduced task duplication, called Energy-Aware Scheduling by Minimizing Duplication (EAMD), which takes the energy consumption as well as the makespan of an application into consideration. It adopts a subtle energy-aware method to search for and delete redundant task copies in the schedules generated by duplication-based algorithms; it is easier to operate than DVFS and incurs no extra time or energy consumption. This algorithm not only consumes less energy but also maintains good performance in terms of makespan compared with duplication-based algorithms. Two kinds of DAGs, i.e., randomly generated graphs and two real-world application graphs, are tested in our experiments. Experimental results show that EAMD can save up to 15.59% of the energy consumption of HLD and HCPFD, two classic duplication-based algorithms. Several factors affecting the performance are also analyzed in the paper.

13.

Multi-prediction based scheduling for hybrid workloads in the cloud data center

Haiou Jiang Haihong E Meina Song 《Cluster computing》2018,21(3):1607-1622

Cloud computing can leverage over-provisioned resources that are wasted in traditional data centers hosting production applications by consolidating tasks with lower QoS and SLA requirements. However, the dramatic fluctuation of workloads with lower QoS and SLA requirements may impact the performance of production applications. Frequent task eviction, killing, and rescheduling operations also waste CPU cycles and create overhead. This paper aims to schedule hybrid workloads in the cloud data center to reduce task failures and increase resource utilization. The multi-prediction model, including the ARMA model and the feedback-based online AR model, is used to predict the current and future resource availability. The decision to accept or reject a new task is based on the available resources and task properties. Evaluations show that the scheduler can reduce host overload and failed tasks by nearly 70%, and increase effective resource utilization by more than 65%. The task delay performance degradation is also acceptable.

14.

SOCCER: Self-Optimization of Energy-efficient Cloud Resources

Sukhpal Singh Inderveer Chana Maninder Singh Rajkumar Buyya 《Cluster computing》2016,19(4):1787-1800

Cloud data centers often schedule heterogeneous workloads without considering energy consumption and carbon emissions. A tremendous amount of energy consumption leads to high operational costs, reduces return on investment, and contributes to the carbon footprint. Therefore, there is a need for an energy-aware cloud-based system which schedules computing resources automatically by considering energy consumption as an important parameter. In this paper, an energy-efficient autonomic cloud system [Self-Optimization of Cloud Computing Energy-efficient Resources (SOCCER)] is proposed for energy-efficient scheduling of cloud resources in data centers. The proposed work considers energy as a Quality of Service (QoS) parameter and automatically optimizes the efficiency of cloud resources by reducing energy consumption. The performance of the proposed system has been evaluated in a real cloud environment, and the experimental results show that the proposed system performs better in terms of energy consumption of cloud resources and utilizes these resources optimally.

15.

MrHeter: improving MapReduce performance in heterogeneous environments

Xiao Zhang Yanjun Wu Chen Zhao 《Cluster computing》2016,19(4):1691-1701

As GPUs, ARM CPUs and even FPGAs are widely used in modern computing, data centers are gradually developing towards heterogeneous clusters. However, many well-known programming models such as MapReduce are designed for homogeneous clusters and perform poorly in heterogeneous environments. In this paper, we reconsider the problem and make four contributions: (1) We analyse the causes of MapReduce's poor performance in heterogeneous clusters; the most important one is unreasonable task allocation between nodes with different computing ability. (2) Based on this, we propose MrHeter, which separates the MapReduce process into a map-shuffle stage and a reduce stage, constructs a separate optimization model for each, and obtains different task allocations \(ml_{ij}, mr_{ij}, r_{ij}\) for heterogeneous nodes based on computing ability. (3) To make it suitable for dynamic execution, we propose D-MrHeter, which includes a monitoring and feedback mechanism. (4) Finally, we show that MrHeter and D-MrHeter can decrease the total execution time of MapReduce by 30 to 70% in heterogeneous clusters compared with original Hadoop, performing better especially under heavy workloads and large differences in computing ability between nodes.

16.

Planning of distributed data production for High Energy and Nuclear Physics

Dzmitry Makatun Jérôme Lauret Hana Rudová 《Cluster computing》2018,21(4):1949-1965

Modern experiments in High Energy and Nuclear Physics rely heavily on distributed computations using multiple computational facilities across the world. One of the essential types of computation is distributed data production, where petabytes of raw files from a single source have to be processed once (per production campaign) using thousands of CPUs at distant locations and the output has to be transferred back to that source. The data distribution over a large system does not necessarily match the distribution of storage, network and CPU capacity. Therefore, bottlenecks may appear and lead to increased latency and degraded performance. In this paper we propose a new scheduling approach for distributed data production which is based on the network flow maximization model. In our approach a central planner defines how much input and output data should be transferred over each network link in order to maximize the computational throughput. Such plans are created periodically for a fixed planning time interval using up-to-date information on network, storage and CPU resources. The centrally created plans are executed in a distributed manner by dedicated services running at participating sites. Our simulations based on the log records from the data production framework of the STAR (Solenoid Tracker at RHIC) experiment have shown that the proposed model systematically provides better performance than the simulated traditional techniques.
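
To give a feel for the network flow maximization idea named in this abstract, here is a small, generic maximum-flow sketch (the Edmonds-Karp algorithm) on a toy graph. It is not the planner from the paper: the node names and link capacities are invented for illustration, and storage and CPU constraints, which the paper's model takes into account, are not represented.

```python
from collections import deque


def max_flow(capacity: dict, source: str, sink: str) -> int:
    """Edmonds-Karp: repeatedly push flow along shortest augmenting paths."""
    if source == sink:
        raise ValueError("source and sink must differ")
    # Residual graph: copy forward capacities and add zero-capacity reverse edges.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left: the flow is maximal

        # Find the bottleneck capacity along the path, then update residuals.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


if __name__ == "__main__":
    # Hypothetical link capacities (arbitrary units per planning interval).
    links = {
        "source": {"siteA": 10, "siteB": 5},
        "siteA": {"siteB": 4, "sink": 7},
        "siteB": {"sink": 8},
    }
    print(max_flow(links, "source", "sink"))
```

In this toy graph the limiting factor is the total capacity leaving the source; a planner of the kind described would read the per-link flows out of the solution rather than only the total.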

17.

Infrastructures and services for remote sensing data production management across multiple satellite data centers

Jie Zhang Jining Yan Yan Ma Dong Xu Pengfei Li Wei Jie 《Cluster computing》2016,19(3):1243-1260

With the number of satellite sensors and data centers increasing continuously, managing and processing massive remote sensing data from multiple distributed sources is becoming a trend. However, combining multiple satellite data centers for collaborative processing of massive remote sensing (RS) data still faces many challenges. In order to reduce the huge amount of data migration and improve the efficiency of multi-datacenter collaborative processing, this paper presents the infrastructures and services for data management as well as workflow management for massive remote sensing data production. A dynamic data scheduling strategy was employed to reduce duplicated data requests and data processing. By combining remote sensing spatial metadata repositories and the Gfarm grid file system, unified management of the raw data, intermediate products and final products was achieved in the co-processing. In addition, multi-level task order repositories and workflow templates were used to construct the production workflow automatically. With the help of specific heuristic scheduling rules, the production tasks were executed quickly. Ultimately, the Multi-datacenter Collaborative Process System (MDCPS) was implemented for large-scale remote sensing data production based on the effective management of data and workflow. The performance of MDCPS in an experimental environment showed that these strategies could significantly enhance the efficiency of co-processing across multiple data centers.

18.

Component Object Based Single System Image for Dependable Implementation of Genetic Programming on Clusters

Ivan Tanev Takashi Uozumi Dauren Akhmetov 《Cluster computing》2004,7(4):347-356

We present a distributed component-object model (DCOM) based single system image (SSI) for dependable parallel implementation of genetic programming (DPIGP). DPIGP aims to significantly and reliably improve the computational performance of genetic programming (GP) by exploiting the inherent parallelism in GP among the evaluation of individuals. It runs on cost-effective clusters of commodity, non-dedicated, heterogeneous workstations or PCs. The developed SSI represents the pool of heterogeneous workstations as a single, unified virtual resource, a metacomputer, and addresses the issues of locating and allocating the physical resources, communicating between the entities of DPIGP, scheduling and load balancing. In addition, addressing the issue of fault tolerance, SSI allows for building a highly available metacomputer in which workstation failures result only in a corresponding partial degradation of the overall performance characteristics of DPIGP. Adopting DCOM as a communication paradigm makes the proposed approach neutral with respect to software platform and network protocol, and provides generic support for locating, allocating, and securing the distributed entities of DPIGP.

19.

A new approach to the job scheduling problem in computational grids (total citations: 1; self-citations: 0; citations by others: 1)

Javad Akbari Torkestani 《Cluster computing》2012,15(3):201-210

Job scheduling is one of the most challenging issues in Grid resource management and strongly affects the performance of the whole Grid environment. The major drawback of the existing Grid scheduling algorithms is that they are unable to adapt to the dynamics of the resources and the network conditions. Furthermore, the network model that is used for resource information aggregation in most scheduling methods is centralized or semi-centralized. Therefore, these methods do not scale well as Grid size grows and do not perform well as the environmental conditions change with time. This paper proposes a learning automata-based job scheduling algorithm for Grids. In this method, the workload that is placed on each Grid node is proportional to its computational capacity and varies with time according to the Grid constraints. The performance of the proposed algorithm is evaluated through several simulation experiments under different Grid scenarios. The obtained results are compared with those of several existing methods. Numerical results confirm the superiority of the proposed algorithm over the others in terms of makespan, flowtime, and load balancing.
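
For readers unfamiliar with learning automata, the sketch below shows one classical update scheme, linear reward-inaction, in which the probability of the chosen action is increased after a favorable response and left unchanged otherwise. The abstract does not say which update scheme the paper uses, so this choice, the learning rate, and the interpretation of actions as Grid nodes are all assumptions.

```python
def reward_inaction_update(probs, chosen, rewarded, rate=0.1):
    """Linear reward-inaction update of an action probability vector.

    probs    -- current action probabilities (should sum to 1)
    chosen   -- index of the action that was selected (e.g. a Grid node)
    rewarded -- True if the environment's response was favorable
    rate     -- learning rate in (0, 1); 0.1 is an illustrative value
    """
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be in (0, 1)")
    if not rewarded:
        return list(probs)  # inaction: penalties leave the vector unchanged
    return [
        p + rate * (1.0 - p) if i == chosen else (1.0 - rate) * p
        for i, p in enumerate(probs)
    ]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.2]
    probs = reward_inaction_update(probs, chosen=0, rewarded=True)
    print(probs, sum(probs))
```

The update keeps the vector a valid probability distribution: the mass added to the chosen action equals the mass removed from the others.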

20.

Resource scheduling methods in cloud and fog computing environments: a systematic literature review

Rahimikhanghah Aryan Tajkey Melika Rezazadeh Bahareh Rahmani Amir Masoud 《Cluster computing》2022,25(2):911-945

In recent years, cloud computing has emerged as a technology that can share resources with users. Because cloud computing is on-demand, efficient use of resources such as memory, processors, and bandwidth is a big challenge. Despite the advantages of cloud computing, sometimes it is not a proper choice due to its delay in responding to existing requests, which led to the need for another technology called fog computing. Fog computing reduces traffic and time lags by extending cloud services to the network edge, closer to users. It can schedule resources with higher efficiency and utilize them to improve the user's experience dramatically. This paper surveys studies in the field of scheduling in fog/cloud computing environments. The focus of this survey is on studies published between 2015 and 2021 in journals or conferences. We selected 71 studies in a systematic literature review (SLR) from four major scientific databases based on their relation to our paper. We classified these studies into five categories based on their traced parameters and their focus area: (1) performance, (2) energy efficiency, (3) resource utilization, (4) performance and energy efficiency, and (5) performance and resource utilization simultaneously. 42.3% of the studies focused on performance, 9.9% on energy efficiency, 7.0% on resource utilization, 21.1% on both performance and energy efficiency, and 19.7% on both performance and resource utilization. Finally, we present challenges and open issues in resource scheduling methods in fog/cloud computing environments.
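
As a quick arithmetic check of the percentages above, the sketch below shows one set of per-category counts that adds up to the 71 selected studies and reproduces the reported figures after rounding to one decimal place. The counts themselves are inferred from the percentages for illustration; the abstract does not state them.

```python
# Hypothetical per-category study counts, inferred from the reported percentages.
counts = {
    "performance": 30,
    "energy efficiency": 7,
    "resource utilization": 5,
    "performance and energy efficiency": 15,
    "performance and resource utilization": 14,
}

total = sum(counts.values())
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

if __name__ == "__main__":
    print("total studies:", total)
    for name, pct in percentages.items():
        print(f"{name}: {pct}%")
```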
