期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Continuous Data Block Placement in and Elevation from Tertiary Storage in Hierarchical Storage Servers

Peter Triantafillou Thomas Papadakis 《Cluster computing》2001,4(2):157-172

Given the cost of memories and the very large storage and bandwidth requirements of large-scale multimedia databases, hierarchical storage servers (which consist of disk-based secondary storage and tape-library-based tertiary storage) are becoming increasingly popular. Such server applications rely upon tape libraries to store all media, exploiting their excellent storage capacity and cost per MB characteristics. They also rely upon disk arrays, exploiting their high bandwidth, to satisfy a very large number of requests. Given typical access patterns and server configurations, the tape drives are fully utilized uploading data for requests that fall through to the tertiary level. Such upload operations consume significant secondary storage device and bus bandwidth. In addition, with present technology (and trends) the disk array can serve fewer requests to continuous objects than it can store, mainly due to IO and/or backplane bus bandwidth limitations. In this work we address comprehensively the performance of these hierarchical, continuous-media, storage servers by looking at all three main system resources: the tape drive bandwidth, the secondary-storage bandwidth, and the host's RAM. We provide techniques which, while fully utilizing the tape drive bandwidth (an expensive resource) they introduce bandwidth savings, which allow the secondary storage devices to serve more requests and do so without increasing demands for the host's RAM space. Specifically, we consider the issue of elevating continuous data from its permanent place in tertiary for display purposes. We develop algorithms for sharing the responsibility for the playback between the secondary and tertiary devices and for placing the blocks of continuous objects on tapes, and show how they achieve the above goals. We study these issues for different commercial tape library products with different bandwidth and tape capacity and in environments with and without the multiplexing of tape libraries. 相似文献

2.

Cost-aware job scheduling for cloud instances using deep reinforcement learning

Cheng Feng Huang Yifeng Tanpure Bhavana Sawalani Pawan Cheng Long Liu Cong 《Cluster computing》2022,25(1):619-631

As the services provided by cloud vendors are providing better performance, achieving auto-scaling, load-balancing, and optimized performance along with low infrastructure maintenance, more and more companies migrate their services to the cloud. Since the cloud workload is dynamic and complex, scheduling the jobs submitted by users in an effective way is proving to be a challenging task. Although a lot of advanced job scheduling approaches have been proposed in the past years, almost all of them are designed to handle batch jobs rather than real-time workloads, such as that user requests are submitted at any time with any amount of numbers. In this work, we have proposed a Deep Reinforcement Learning (DRL) based job scheduler that dispatches the jobs in real time to tackle this problem. Specifically, we focus on scheduling user requests in such a way as to provide the quality of service (QoS) to the end-user along with a significant reduction of the cost spent on the execution of jobs on the virtual instances. We have implemented our method by Deep Q-learning Network (DQN) model, and our experimental results demonstrate that our approach can significantly outperform the commonly used real-time scheduling algorithms.

相似文献

3.

On the Scalability of Dynamic Scheduling Scientific Applications with Adaptive Weighted Factoring 总被引：1，自引：0，他引：1

Ioana Banicescu Vijay Velusamy Johnny Devaprasad 《Cluster computing》2003,6(3):215-226

In heterogeneous environments, dynamic scheduling algorithms are a powerful tool towards performance improvement of scientific applications via load balancing. However, these scheduling techniques employ heuristics that require prior knowledge about workload via profiling resulting in higher overhead as problem sizes and number of processors increase. In addition, load imbalance may appear only at run-time, making profiling work tedious and sometimes even obsolete. Recently, the integration of dynamic loop scheduling algorithms into a number of scientific applications has been proven effective. This paper reports on performance improvements obtained by integrating the Adaptive Weighted Factoring, a recently proposed dynamic loop scheduling technique that addresses these concerns, into two scientific applications: computational field simulation on unstructured grids, and N-Body simulations. Reported experimental results confirm the benefits of using this methodology, and emphasize its high potential for future integration into other scientific applications that exhibit substantial performance degradation due to load imbalance. 相似文献

4.

Scheduling data analytics work with performance guarantees: queuing and machine learning models in synergy

Ji Xue Feng Yan Alma Riska Evgenia Smirni 《Cluster computing》2016,19(2):849-864

In today’s scaled out systems, co-scheduling data analytics work with high priority user workloads is common as it utilizes better the vast hardware availability. User workloads are dominated by periodic patterns, with alternating periods of high and low utilization, creating promising conditions to schedule data analytics work during low activity periods. To this end, we show the effectiveness of machine learning models in accurately predicting user workload intensities, essentially by suggesting the most opportune time to co-schedule data analytics work. Yet, machine learning models cannot predict the effects of performance interference when co-scheduling is employed, as this constitutes a “new” observation. Specifically, in tiered storage systems, their hierarchical design makes performance interference even more complex, thus accurate performance prediction is more challenging. Here, we quantify the unknown performance effects of workload co-scheduling by enhancing machine learning models with queuing theory ones to develop a hybrid approach that can accurately predict performance and guide scheduling decisions in a tiered storage system. Using traces from commercial systems we illustrate that queuing theory and machine learning models can be used in synergy to surpass their respective weaknesses and deliver robust co-scheduling solutions that achieve high performance. 相似文献

5.

Fair Scheduling of General-Purpose Workloads on Workstation Clusters

Cosimo Anglano 《Cluster computing》2002,5(1):87-96

In this paper we present a scheduling strategy for workstation clusters able to effectively and fairly schedule general-purpose workloads potentially made up by compute-bound, interactive, and I/O-intensive applications, that may each be sequential, client-server, or parallel. The scheduling strategy allocates resources to processes of the same parallel applications in such a way that they all get the same CPU share regardless of the level of resource contention on the respective machines, and relies on an extended stride scheduler to fairly allocate individual workstations. A simulation analysis carried out for a variety of workloads and operational conditions shows that our strategy (a) delivers good performance to all the applications classes composing general-purpose workloads, (b) fairly allocates resources among competing applications, and (c) outperforms alternative strategies. 相似文献

6.

Value of service based resource management for large-scale computing systems

Cihan?Tunc Email author Dylan?Machovec Nirmal?Kumbhare Ali?Akoglu Salim?Hariri Bhavesh?Khemka Howard?Jay?Siegel 《Cluster computing》2017,20(3):2013-2030

Task scheduling for large-scale computing systems is a challenging problem. From the users perspective, the main concern is the performance of the submitted tasks, whereas, for the cloud service providers, reducing operation cost while providing the required service is critical. Therefore, it is important for task scheduling mechanisms to balance users’ performance requirements and energy efficiency because energy consumption is one of the major operational costs. We present a time dependent value of service (VoS) metric that will be maximized by the scheduling algorithm that take into consideration the arrival time of a task while evaluating the value functions for completing a task at a given time and the tasks energy consumption. We consider the variation in value for completing a task at different times such that the value of energy reduction can change significantly between peak and non-peak periods. To determine the value of a task completion, we use completion time and energy consumption with soft and hard thresholds. We define the VoS for a given workload to be the sum of the values for all tasks that are executed during a given period of time. Our system model is based on virtual machines, where each task will be assigned a resource configuration characterized by the number of the homogeneous cores and amount of memory. For the scheduling of each task submitted to our system, we use the estimated time to compute matrix and the estimated energy consumption matrix which are created using historical data. We design, evaluate, and compare our task scheduling methods to show that a significant improvement in energy consumption can be achieved when considering time-of-use dependent scheduling algorithms. The simulation results show that we improve the performance and the energy values up to 49% when compared to schedulers that do not consider the value functions. Similar to the simulation results, our experimental results from running our value based scheduling on an IBM blade server show up to 82% improvement in performance value, 110% improvement in energy value, and up to 77% improvement in VoS compared to schedulers that do not consider the value functions. 相似文献

7.

A comparative study of online scheduling algorithms for networks of workstations 总被引：4，自引：0，他引：4

Olaf Arndt Bernd Freisleben Thilo Kielmann Frank Thilo 《Cluster computing》2000,3(2):95-112

Networks of workstations offer large amounts of unused processing time. Resource management systems are able to exploit this computing capacity by assigning compute-intensive tasks to idle workstations. To avoid interferences between multiple, concurrently running applications, such resource management systems have to schedule application jobs carefully. Continuously arriving jobs and dynamically changing amounts of available CPU capacity make traditional scheduling algorithms difficult to apply in workstation networks. Online scheduling algorithms promise better results by adapting schedules to changing situations. This paper compares six online scheduling algorithms by simulating several workload scenarios. Based on the insights gained by simulation, the three online scheduling algorithms performing best were implemented in the Winner resource management system. Experiments conducted with Winner in a real workstation network confirm the simulation results obtained. This revised version was published online in July 2006 with corrections to the Cover Date. 相似文献

8.

Empirical and analytical approaches for web server power modeling

Leonardo Piga Reinaldo A. Bergamaschi Sandro Rigo 《Cluster computing》2014,17(4):1279-1293

Power-aware computing has emerged as a significant concern in data centers. In this work, we develop empirical models for estimating the power consumed by web servers. These models can be used by on-the-fly power-saving algorithms and are imperative for simulators that evaluate the power behavior of workloads. To apply power saving methodologies and algorithms at the data center level, we must first be able to measure or estimate the power and performance of individual servers running in the data centers. We show a novel method for developing full system web server power models that reduces non-linear relationships among performance measurements and system power and prunes model parameters. The web server power models use as parameters performance indicators read from the machine internal performance counters. We evaluate our approach on an AMD Opteron-based web server and on an Intel i7-based web sever. Our best model displays an average absolute error of 1.92 % for Intel i7 server and 1.46 % for AMD Opteron as compared to actual measurements, and 90th percentile for the absolute percent error equals to 2.66 % for Intel i7 and 2.08 % for AMD Opteron. 相似文献

9.

Performance and energy modeling for live migration of virtual machines

Haikun Liu Hai Jin Cheng-Zhong Xu Xiaofei Liao 《Cluster computing》2013,16(2):249-264

Live migration of virtual machine (VM) provides a significant benefit for virtual server mobility without disrupting service. It is widely used for system management in virtualized data centers. However, migration costs may vary significantly for different workloads due to the variety of VM configurations and workload characteristics. To take into account the migration overhead in migration decision-making, we investigate design methodologies to quantitatively predict the migration performance and energy consumption. We thoroughly analyze the key parameters that affect the migration cost from theory to practice. We construct application-oblivious models for the cost prediction by using learned knowledge about the workloads at the hypervisor (also called VMM) level. This should be the first kind of work to estimate VM live migration cost in terms of both performance and energy in a quantitative approach. We evaluate the models using five representative workloads on a Xen virtualized environment. Experimental results show that the refined model yields higher than 90% prediction accuracy in comparison with measured cost. Model-guided decisions can significantly reduce the migration cost by more than 72.9% at an energy saving of 73.6%. 相似文献

10.

A Performance Comparison of Coscheduling Strategies for Workstation Clusters

Cosimo Anglano 《Cluster computing》2001,4(2):121-131

Workstation clusters are emerging as a general-purpose computing platform for the execution of workloads comprising parallel and sequential applications. The scalability and flexibility typical of implicit coscheduling strategies makes them a very promising solution to the scheduling needs of workstation clusters. In this paper we present a simulation study that compares, for a variety of workloads (that include both parallel and sequential applications) and operating system schedulers, 12 implicit coscheduling strategies in terms of the performance they are able to deliver to applications. By using a detailed simulator, we evaluate the performance of different coscheduling alternatives for a variety of simulation scenarios, and we identify the set of strategies that deliver the best performance to all the applications composing typical cluster workloads. Moreover, we show that for schedulers providing immediate preemption, the best strategies are also the simplest ones to implement. 相似文献

11.

Energy aware DAG scheduling on heterogeneous systems 总被引：1，自引：0，他引：1

Sanjeev Baskiyar Rabab Abdel-Kader 《Cluster computing》2010,13(4):373-383

We address the problem of scheduling directed a-cyclic task graph (DAG) on a heterogeneous distributed processor system with the twin objectives of minimizing finish time and energy consumption. Previous scheduling heuristics have assigned DAGs to processors to minimize overall run-time of the application. But applications on embedded systems, such as high performance DSP in image processing, multimedia, and wireless security, need schedules which use low energy too. 相似文献

12.

A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems

Aftab Ahmed Chandio Kashif Bilal Nikos Tziritas Zhibin Yu Qingshan Jiang Samee U. Khan Cheng-Zhong Xu 《Cluster computing》2014,17(4):1349-1367

In the large-scale parallel computing environment, resource allocation and energy efficient techniques are required to deliver the quality of services (QoS) and to reduce the operational cost of the system. Because the cost of the energy consumption in the environment is a dominant part of the owner’s and user’s budget. However, when considering energy efficiency, resource allocation strategies become more difficult, and QoS (i.e., queue time and response time) may violate. This paper therefore is a comparative study on job scheduling in large-scale parallel systems to: (a) minimize the queue time, response time, and energy consumption and (b) maximize the overall system utilization. We compare thirteen job scheduling policies to analyze their behavior. A set of job scheduling policies includes (a) priority-based, (b) first fit, (c) backfilling, and (d) window-based policies. All of the policies are extensively simulated and compared. For the simulation, a real data center workload comprised of 22385 jobs is used. Based on results of their performance, we incorporate energy efficiency in three policies i.e., (1) best result producer, (2) average result producer, and (3) worst result producer. We analyze the (a) queue time, (b) response time, (c) slowdown ratio, and (d) energy consumption to evaluate the policies. Moreover, we present a comprehensive workload characterization for optimizing system’s performance and for scheduler design. Major workload characteristics including (a) Narrow, (b) Wide, (c) Short, and (d) Long jobs are characterized for detailed analysis of the schedulers’ performance. This study highlights the strengths and weakness of various job scheduling polices and helps to choose an appropriate job scheduling policy in a given scenario. 相似文献

13.

Rack aware scheduling in HPC data centers: an energy conservation strategy

Vikas Ashok Patil Vipin Chaudhary 《Cluster computing》2013,16(3):559-573

Energy consumption in high performance computing data centers has become a long standing issue. With rising costs of operating the data center, various techniques need to be employed to reduce the overall energy consumption. Currently, among others there are techniques that guarantee reduced energy consumption by powering on/off the idle nodes. However, most of them do not consider the energy consumed by other components in a rack. Our study addresses this aspect of the data center. We show that we can gain considerable energy savings by reducing the energy consumed by these rack components. In this regard, we propose a scheduling technique that will help schedule jobs with the above mentioned goal. We claim that by our scheduling technique we can reduce the energy consumption considerably without affecting other performance metrics of a job. We implement this technique as an enhancement to the well-known Maui scheduler and present our results. We propose three different algorithms as part of this technique. The algorithms evaluate the various trade-offs that could be possibly made with respect to overall cluster performance. We compare our technique with various currently available Maui scheduler configurations. We simulate a wide variety of workloads from real cluster deployments using the simulation mode of Maui. Our results consistently show about 7 to 14 % savings over the currently available Maui scheduler configurations. We shall also see that our technique can be applied in tandem with most of the existing energy aware scheduling techniques to achieve enhanced energy savings. We also consider the side effects of power losses due to the network switches as a result of deploying our technique. We compare our technique with the existing techniques in terms of the power losses due to these switches based on the results in Sharma and Ranganathan, Lecture Notes in Computer Science, vol. 5550, 2009 and account for the power losses. We there on provide a best fit scheme with the rack considerations. We then propose an enhanced technique that merges the two extremes of node allocation based on rack information. We see that we can provide a way to configure the scheduler based on the kind of workload that it schedules and reduce the effect of job splitting across multiple racks. We further discuss how the enhancement can be utilized to build a learning model which can be used to adaptively adjust the scheduling parameters based on the workload experienced. 相似文献

14.

Security-driven scheduling for data-intensive applications on grids

Tao Xie Xiao Qin 《Cluster computing》2007,10(2):145-153

Security-sensitive applications that access and generate large data sets are emerging in various areas including bioinformatics and high energy physics. Data grids provide such data-intensive applications with a large virtual storage framework with unlimited power. However, conventional scheduling algorithms for data grids are unable to meet the security needs of data-intensive applications. In this paper we address the problem of scheduling data-intensive jobs on data grids subject to security constraints. Using a security- and data-aware technique, a dynamic scheduling strategy is proposed to improve quality of security for data-intensive applications running on data grids. To incorporate security into job scheduling, we introduce a new performance metric, degree of security deficiency, to quantitatively measure quality of security provided by a data grid. Results based on a real-world trace confirm that the proposed scheduling strategy significantly improves security and performance over four existing scheduling algorithms by up to 810% and 1478%, respectively.

Xiao QinEmail:

相似文献

15.

Energy efficient job scheduling with workload prediction on cloud data center

Xiaoyong Tang Xiaoyi Liao Jie Zheng Xiaopan Yang 《Cluster computing》2018,21(3):1581-1593

Data centers are the backbone of cloud infrastructure platform to support large-scale data processing and storage. More and more business-to-consumer and enterprise applications are based on cloud data center. However, the amount of data center energy consumption is inevitably lead to high operation costs. The aim of this paper is to comprehensive reduce energy consumption of cloud data center servers, network, and cooling systems. We first build an energy efficient cloud data center system including its architecture, job and power consumption model. Then, we combine the linear regression and wavelet neural network techniques into a prediction method, which we call MLWNN, to forecast the cloud data center short-term workload. Third, we propose a heuristic energy efficient job scheduling with workload prediction solution, which is divided into resource management strategy and online energy efficient job scheduling algorithm. Our extensive simulation performance evaluation results clearly demonstrate that our proposed solution has good performance and is very suitable for low workload cloud data center. 相似文献

16.

Managing performance and power consumption tradeoff for multiple heterogeneous servers in cloud computing

Yuan Tian Chuang Lin Keqin Li 《Cluster computing》2014,17(3):943-955

There are typically multiple heterogeneous servers providing various services in cloud computing. High power consumption of these servers increases the cost of running a data center. Thus, there is a problem of reducing the power cost with tolerable performance degradation. In this paper, we optimize the performance and power consumption tradeoff for multiple heterogeneous servers. We consider the following problems: (1) optimal job scheduling with fixed service rates; (2) joint optimal service speed scaling and job scheduling. For problem (1), we present the Karush-Kuhn-Tucker (KKT) conditions and provide a closed-form solution. For problem (2), both continuous speed scaling and discrete speed scaling are considered. In discrete speed scaling, the feasible service rates are discrete and bounded. We formulate the problem as an MINLP problem and propose a distributed algorithm by online value iteration, which has lower complexity than a centralized algorithm. Our approach provides an analytical way to manage the tradeoff between performance and power consumption. The simulation results show the gain of using speed scaling, and also prove the effectiveness and efficiency of the proposed algorithms. 相似文献

17.

Pulse Rate Criteria for Determining the Energy Cost of Exercise

N. I. Volkov O. I. Popov A. G. Samborskii 《Human physiology》2003,29(3):349-353

From the pulse rate curve taken during exercise and recovery from highly qualified sportsmen performing a bicycle exercise test at a maximum workload, the work pulse sum, pulse debt, and pulse cost were calculated. Plots of these indices versus the time to exhaustion and versus the relative exercise workload were identical to the respective plots of oxygen consumption during exercise, oxygen debt, and oxygen requirement. The exercise pulse cost can serve as a criterion for quantifying physical workloads. 相似文献

18.

Cost-intelligent application-specific data layout optimization for parallel file systems

Huaiming Song Yanlong Yin Yong Chen Xian-He Sun 《Cluster computing》2013,16(2):285-298

Parallel file systems have been developed in recent years to ease the I/O bottleneck of high-end computing system. These advanced file systems offer several data layout strategies in order to meet the performance goals of specific I/O workloads. However, while a layout policy may perform well on some I/O workload, it may not perform as well for another. Peak I/O performance is rarely achieved due to the complex data access patterns. Data access is application dependent. In this study, a cost-intelligent data access strategy based on the application-specific optimization principle is proposed. This strategy improves the I/O performance of parallel file systems. We first present examples to illustrate the difference of performance under different data layouts. By developing a cost model which estimates the completion time of data accesses in various data layouts, the layout can better match the application. Static layout optimization can be used for applications with dominant data access patterns, and dynamic layout selection with hybrid replications can be used for applications with complex I/O patterns. Theoretical analysis and experimental testing have been conducted to verify the proposed cost-intelligent layout approach. Analytical and experimental results show that the proposed cost model is effective and the application-specific data layout approach can provide up to a 74% performance improvement for data-intensive applications. 相似文献

19.

Spiking neural network simulation: memory-optimal synaptic event scheduling

Stewart RD Gurney KN 《Journal of computational neuroscience》2011,30(3):721-728

Spiking neural network simulations incorporating variable transmission delays require synaptic events to be scheduled prior to delivery. Conventional methods have memory requirements that scale with the total number of synapses in a network. We introduce novel scheduling algorithms for both discrete and continuous event delivery, where the memory requirement scales instead with the number of neurons. Superior algorithmic performance is demonstrated using large-scale, benchmarking network simulations. 相似文献

20.

GMBlock: Optimizing data movement in a block-level storage sharing system over Myrinet

Evangelos Koukis Anastassios Nanos Nectarios Koziris 《Cluster computing》2010,13(4):349-372

We present gmblock, a block-level storage sharing system over Myrinet which uses an optimized I/O path to transfer data directly between the storage medium and the network, bypassing the host CPU and main memory bus of the storage server. It is device driver independent and retains the protection and isolation features of the OS. We evaluate the performance of a prototype gmblock server and find that: (a) the proposed techniques eliminate memory and peripheral bus contention, increasing remote I/O bandwidth significantly, in the order of 20–200% compared to an RDMA-based approach, (b) the impact of remote I/O to local computation becomes negligible, (c) the performance characteristics of RAID storage combined with limited NIC resources reduce performance. We introduce synchronized send operations to improve the degree of disk to network I/O overlapping. We deploy the OCFS2 shared-disk filesystem over gmblock and show gains for various application benchmarks, provided I/O scheduling can eliminate the disk bottleneck due to concurrent access. 相似文献