Similar Literature
20 similar records found.
1.
A Load Balancing Tool for Distributed Parallel Loops
Large-scale applications typically contain parallel loops with many iterates. The iterates of a parallel loop may have variable execution times, which translate into performance degradation of an application due to load imbalance. This paper describes a tool for load balancing parallel loops on distributed-memory systems. The tool assumes that the data for a parallel loop to be executed is already partitioned among the participating processors. The tool utilizes the MPI library for interprocessor coordination, and determines processor workloads by loop scheduling techniques. The tool was designed independently of any application; hence, it must be supplied with a routine that encapsulates the computations for a chunk of loop iterates, as well as the routines to transfer data and results between processors. Performance evaluation on a Linux cluster indicates that the tool reduces the cost of executing a simulated irregular loop by up to 81% compared with execution without load balancing. The tool is useful for parallelizing sequential applications with parallel loops, or as an alternate load balancing routine for existing parallel applications.
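As an illustration of the kind of chunk-based dynamic loop scheduling such a tool relies on, the following is a minimal, self-contained Python sketch. It assumes a guided-self-scheduling style chunking rule and models workers with per-worker clocks; the function names and the MPI-free master/worker stand-in are illustrative, not the tool's actual interface.

```python
# Illustrative chunk-based dynamic loop scheduling (guided self-scheduling style).
# The real tool coordinates processors via MPI; here a simple loop stands in
# for the master so the sketch stays self-contained and runnable.
import random

def guided_chunks(total_iters, n_workers, min_chunk=1):
    """Yield shrinking chunk sizes: each chunk is roughly remaining / n_workers."""
    remaining = total_iters
    while remaining > 0:
        chunk = max(min_chunk, remaining // n_workers)
        chunk = min(chunk, remaining)
        yield chunk
        remaining -= chunk

def simulate(total_iters=1000, n_workers=4, seed=0):
    """Hand shrinking chunks to the earliest-free worker; iterate costs vary."""
    random.seed(seed)
    clock = [0.0] * n_workers                       # per-worker busy time
    for chunk in guided_chunks(total_iters, n_workers):
        w = min(range(n_workers), key=lambda i: clock[i])   # earliest-free worker
        work = sum(random.uniform(0.5, 1.5) for _ in range(chunk))  # irregular iterates
        clock[w] += work
    return max(clock)                               # simulated parallel completion time

if __name__ == "__main__":
    print("simulated makespan:", round(simulate(), 1))
```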

2.
List scheduling algorithms are known to be efficient when the application to be executed can be described statically as a Directed Acyclic Graph (DAG) of tasks. Even when the entire DAG is known beforehand, obtaining an optimal schedule on a parallel machine is an NP-hard problem. Moreover, many programming tools propose the use of scheduling techniques based on list strategies. This paper presents an analysis of scheduling algorithms for multithread programs in a dynamic scenario where threads are created and destroyed during execution. We introduce an algorithm to convert DAGs, describing applications as tasks, into Directed Cyclic Graphs (DCGs) describing the same application as designed in a multithread programming interface. Our algorithm covers case studies described in previous works, successfully mapping from the abstract level of graphs to the application environment. These mappings preserve the guarantees offered by the abstract model, providing efficient scheduling of dynamic programs that follow the intended multithread model. We conclude the paper by presenting performance results obtained with list schedulers in dynamic multithreaded environments, and compare these results with the best schedules we could obtain with similar static task schedulers.
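For context, list scheduling over a static DAG can be sketched in a few lines: tasks are taken in a priority order and each is placed on the processor that finishes it earliest. The sketch below ignores communication costs and is a generic illustration, not the DCG construction or the dynamic schedulers studied in the paper.

```python
# Minimal list scheduling over a static task DAG (illustrative only).
def list_schedule(tasks, deps, cost, n_procs):
    """tasks: ids in a valid topological/priority order.
    deps: {task: [predecessors]}; cost: {task: execution time}."""
    proc_free = [0.0] * n_procs
    finish, placement = {}, {}
    for t in tasks:
        ready = max((finish[p] for p in deps.get(t, [])), default=0.0)
        # processor giving the earliest finish time for this task
        best = min(range(n_procs), key=lambda p: max(proc_free[p], ready) + cost[t])
        start = max(proc_free[best], ready)
        finish[t] = start + cost[t]
        proc_free[best] = finish[t]
        placement[t] = best
    return placement, max(finish.values())

if __name__ == "__main__":
    deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
    cost = {"a": 2, "b": 3, "c": 1, "d": 2}
    print(list_schedule(["a", "b", "c", "d"], deps, cost, n_procs=2))
```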

3.
Security-sensitive applications that access and generate large data sets are emerging in various areas including bioinformatics and high energy physics. Data grids provide such data-intensive applications with a large virtual storage framework of virtually unlimited capacity. However, conventional scheduling algorithms for data grids are unable to meet the security needs of data-intensive applications. In this paper we address the problem of scheduling data-intensive jobs on data grids subject to security constraints. Using a security- and data-aware technique, a dynamic scheduling strategy is proposed to improve quality of security for data-intensive applications running on data grids. To incorporate security into job scheduling, we introduce a new performance metric, degree of security deficiency, to quantitatively measure the quality of security provided by a data grid. Results based on a real-world trace confirm that the proposed scheduling strategy significantly improves security and performance over four existing scheduling algorithms, by up to 810% and 1478%, respectively.
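The abstract does not reproduce the formal definition of the degree of security deficiency; purely as an illustration of how a metric of that flavor could be written (the notation below is assumed here, not taken from the paper), one might aggregate, per job, the shortfall between the security level a job requests and the level offered by the node it is assigned to:

\[
\mathrm{DSD} \;=\; \frac{\sum_{j} w_j \,\max\bigl(0,\; s_j^{\mathrm{req}} - s_{m(j)}^{\mathrm{off}}\bigr)}{\sum_{j} w_j},
\]

where \(s_j^{\mathrm{req}}\) is the security level requested by job \(j\), \(s_{m(j)}^{\mathrm{off}}\) the level offered by the node \(m(j)\) that runs it, and \(w_j\) a job weight; a lower value indicates better quality of security.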

4.
Divisible load scenarios occur in modern media server applications since most multimedia applications typically require access to continuous and discrete data. A high performance Continuous Media (CM) server greatly depends on the ability of its disk IO subsystem to serve both types of workloads efficiently. Disk scheduling algorithms for mixed media workloads, although they play a central role in this task, have been overlooked by related research efforts. These algorithms must satisfy several stringent performance goals, such as achieving low response time and ensuring fairness, for the discrete-data workload, while at the same time guaranteeing the uninterrupted delivery of continuous data, for the continuous-data workload. The focus of this paper is on disk scheduling algorithms for mixed media workloads in a multimedia information server. We propose novel algorithms, present a taxonomy of relevant algorithms, and study their performance through experimentation. Our results show that our algorithms offer drastic improvements in discrete request average response times, are fair, serve continuous requests without interruptions, and that the disk technology trends are such that the expected performance benefits can be even greater in the future.

5.
Task scheduling is one of the most challenging aspects of improving the overall performance of cloud computing and optimizing cloud utilization and Quality of Service (QoS). This paper focuses on task scheduling optimization using a novel approach based on Dynamic dispatch Queues (TSDQ) and hybrid meta-heuristic algorithms. We propose two hybrid meta-heuristic algorithms, the first one using Fuzzy Logic with the Particle Swarm Optimization algorithm (TSDQ-FLPSO), the second one using Simulated Annealing with the Particle Swarm Optimization algorithm (TSDQ-SAPSO). Several experiments have been carried out based on an open source simulator (CloudSim) using synthetic and real data sets from real systems. The experimental results demonstrate the effectiveness of the proposed approach and show that TSDQ-FLPSO provides the best results compared to TSDQ-SAPSO and other existing scheduling algorithms, especially on high-dimensional problems. The TSDQ-FLPSO algorithm shows a great advantage in terms of waiting time, queue length, makespan, cost, resource utilization, degree of imbalance, and load balancing.
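For reference, both hybrids build on the standard PSO velocity and position updates (a textbook form is shown below; how the fuzzy-logic and simulated-annealing components tune or perturb these updates is not detailed in the abstract):

\[
v_i(t+1) = \omega\, v_i(t) + c_1 r_1 \bigl(p_i - x_i(t)\bigr) + c_2 r_2 \bigl(g - x_i(t)\bigr), \qquad
x_i(t+1) = x_i(t) + v_i(t+1),
\]

where \(x_i\) and \(v_i\) are the position and velocity of particle \(i\), \(p_i\) its personal best, \(g\) the global best, \(\omega\) the inertia weight, \(c_1, c_2\) acceleration coefficients, and \(r_1, r_2\) random numbers in \([0,1]\).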

6.
Soma Prathibha, Latha B. Cluster Computing, 2021, 24(2): 1123–1134.

Scientific workflow applications are used by scientists to carry out research in various domains such as Physics, Chemistry, and Astronomy. These applications require huge computational resources, and currently the cloud platform is used to run them efficiently. Improving the makespan and cost of workflow execution on a cloud platform requires identifying the proper number of Virtual Machines (VMs) and choosing the proper VM type. As the cloud platform is dynamic, the availability and the type of resources are two important factors affecting the cost and makespan of workflow execution. The primary objective of this work is to analyze the relationships among the cloud configuration parameters (number of VMs, type of VM, VM configuration) for executing scientific workflow applications on a cloud platform. In this work, to accurately analyze the influence of cloud platform resource configuration and scheduling policies, a new predictive model is built using the Box–Behnken design, one of the modelling techniques of Response Surface Methodology (RSM). It is used to build quadratic mathematical models that can be used to analyze relationships among input and output variables. Workflow cost and makespan models were built for real-world scientific workflows using ANOVA, and it was observed that the models fit well and can be useful in analyzing the performance of scientific workflow applications in the cloud.
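The quadratic models produced by a Box–Behnken design take the standard second-order response-surface form (shown here for reference; the concrete coefficient values for cost and makespan are reported in the paper, not here):

\[
y = \beta_0 + \sum_{i=1}^{k}\beta_i x_i + \sum_{i=1}^{k}\beta_{ii} x_i^2 + \sum_{i<j}\beta_{ij} x_i x_j + \varepsilon,
\]

where \(y\) is the response (workflow cost or makespan), the \(x_i\) are the coded factors (e.g., number of VMs, VM type, VM configuration), and the \(\beta\) coefficients are estimated by regression and assessed with ANOVA.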


7.
Efficient application scheduling is critical for achieving high performance in heterogeneous computing (HC) environments. Because of this importance, there has been much research on the problem and various algorithms have been proposed. Duplication-based algorithms are one kind of well-known algorithms for solving scheduling problems, achieving high performance in minimizing the overall completion time (makespan) of applications. However, they overly pursue the shortest makespan by duplicating some tasks redundantly, which leads to a large amount of energy consumption and resource waste. With the growing advocacy for green computing systems, energy conservation has become an important issue and gained particular interest. An existing technique to reduce the energy consumption of an application is dynamic voltage/frequency scaling (DVFS), whose efficiency is affected by the time and energy overhead caused by voltage scaling. In this paper, we propose a new energy-aware scheduling algorithm with reduced task duplication called Energy-Aware Scheduling by Minimizing Duplication (EAMD), which takes the energy consumption as well as the makespan of an application into consideration. It adopts a subtle energy-aware method to search for and delete redundant task copies in the schedules generated by duplication-based algorithms; it is easier to apply than DVFS and produces no extra time or energy consumption. This algorithm not only consumes less energy but also maintains good performance in terms of makespan compared with duplication-based algorithms. Two kinds of DAGs, i.e., randomly generated graphs and two real-world application graphs, are tested in our experiments. Experimental results show that EAMD can save up to 15.59% of the energy consumed by HLD and HCPFD, two classic duplication-based algorithms. Several factors affecting the performance are also analyzed in the paper.

8.
During the past decade, cluster computing and mobile communication technologies have been extensively deployed and widely applied because of their enormous commercial value. Rapid technological advancement has made it feasible to integrate these two technologies, and a revolutionary application called mobile cluster computing is arising on the horizon. Mobile cluster computing technology can further enhance the power of our laptops and mobile devices by running parallel applications. However, scheduling parallel applications on mobile clusters is technically challenging due to the significant communication latency and limited battery life of mobile devices. Therefore, shortening schedule length and conserving energy consumption have become two major concerns in designing efficient and energy-aware scheduling algorithms for mobile clusters. In this paper, we propose two novel scheduling strategies aimed at balancing performance and power consumption for parallel applications running on mobile clusters. Our research focuses on scheduling precedence-constrained parallel tasks, and thus duplication heuristics are applied to minimize communication overheads. However, existing duplication algorithms are developed with consideration of schedule lengths only, completely ignoring the energy consumption of clusters. In this regard, we design two energy-aware duplication scheduling algorithms, called EADUS and TEBUS, to schedule precedence-constrained parallel tasks with a complexity of O(n^2), where n is the number of tasks in a parallel task set. Unlike the existing duplication-based scheduling algorithms that replicate all the possible predecessors of each task, the proposed algorithms judiciously replicate predecessors of a task only if the duplication can help in conserving energy. Our energy-aware scheduling strategies are conducive to balancing schedule lengths and energy savings of a set of precedence-constrained parallel tasks. We conducted extensive experiments using both synthetic benchmarks and real-world applications to compare our algorithms with two existing approaches. Experimental results based on simulated mobile clusters demonstrate the effectiveness and practicality of the proposed duplication-based scheduling strategies. For example, EADUS and TEBUS can reduce energy consumption for the Gaussian Elimination application by averages of 16.08% and 8.1%, with merely 5.7% and 2.2% increases in schedule length, respectively.

9.
Asymmetric multicore processors have recently emerged as an appealing technology for severely energy-constrained environments, especially in mobile appliances where heterogeneity in applications is mainstream. In addition, given the growing interest in low-power high performance computing, this type of architecture is also being investigated as a means to improve the throughput-per-Watt of complex scientific applications on clusters of commodity systems-on-chip. In this paper, we design and embed several architecture-aware optimizations into a multi-threaded general matrix multiplication (gemm), a key operation of the BLAS, in order to obtain a high performance implementation for ARM big.LITTLE AMPs. Our solution is based on the reference implementation of gemm in the BLIS library, and integrates a cache-aware configuration as well as asymmetric-static and dynamic scheduling strategies that carefully tune and distribute the operation's micro-kernels among the big and LITTLE cores of the target processor. The experimental results on a Samsung Exynos 5422, a system-on-chip with ARM Cortex-A15 and Cortex-A7 clusters that implements the big.LITTLE model, show that our cache-aware versions of gemm with asymmetric scheduling attain important gains in performance with respect to their architecture-oblivious counterparts, while exploiting all the resources of the AMP to deliver considerable energy efficiency.
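As a rough illustration of the asymmetric-static idea, the sketch below splits one gemm loop dimension between the big and LITTLE clusters in proportion to their measured throughput, rounded to a micro-kernel block size. The throughput figures and the function itself are illustrative placeholders, not the BLIS-based implementation described in the paper.

```python
# Illustrative asymmetric-static partitioning: split the n-dimension of C = A*B
# between the big and LITTLE clusters in proportion to their measured gemm
# throughput (the GFLOPS values below are made-up placeholders).
def split_columns(n, gflops_big, gflops_little, block=8):
    """Return (n_big, n_little), rounded to the micro-kernel block size."""
    share_big = gflops_big / (gflops_big + gflops_little)
    n_big = int(round(n * share_big / block)) * block
    n_big = min(max(n_big, 0), n)
    return n_big, n - n_big

if __name__ == "__main__":
    print(split_columns(n=4096, gflops_big=12.0, gflops_little=3.5))
```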

10.
In heterogeneous distributed computing systems such as cloud computing, the problem of mapping tasks to resources is a major issue which can have a strong impact on system performance. Because of factors such as heterogeneity, dynamic behavior, and the dependencies among requests, task scheduling is known to be an NP-complete problem. In this paper, we propose a hybrid heuristic method (HSGA), based on a genetic algorithm, to find a suitable schedule for a workflow graph quickly while optimizing makespan, load balancing on resources, and speedup ratio. First, the HSGA algorithm prioritizes the tasks of a complex graph according to their impact on the others, based on the graph topology. This technique effectively reduces the completion time of the application. Then, it merges the Best-Fit and Round Robin methods to construct an optimized initial population and obtain a good solution quickly, and applies suitable operations such as mutation to guide the algorithm toward an optimized solution. The algorithm evaluates solutions by considering parameters relevant to the cloud environment. Finally, the proposed algorithm yields better results than the other studied algorithms as the number of tasks in the application graph increases.

11.
In this paper, we consider the problem of scheduling divisible loads on arbitrary graphs with the objective of minimizing the total processing time of the entire load submitted for processing. We consider an arbitrary graph network comprising heterogeneous processors interconnected via heterogeneous links in an arbitrary fashion. The divisible load is assumed to originate at any processor in the network. We transform the problem into a multi-level unbalanced tree network and schedule the divisible load. We design systematic procedures to identify and eliminate any redundant processor–link pairs (those pairs whose consideration in scheduling would penalize the performance) and derive an optimal tree structure to obtain an optimal processing time for a fixed sequence of load distribution. Since the algorithm strives to determine an equivalent number of processors (resources) that can be used for processing the entire load, we refer to this approach as the resource-aware optimal load distribution (RAOLD) algorithm. We extend our study by applying the optimal sequencing theorem proposed in the literature for single-level tree networks to multi-level trees to obtain an optimal solution. We evaluate the performance for a wide range of arbitrary graphs with varying connectivity probabilities and processor densities. We also study the effect of network scalability and connectivity. We demonstrate the time performance when the point of load origination differs in the network and highlight certain key features that may be useful for algorithm and/or network system designers. We evaluate the time performance with rigorous simulation experiments under different system parameters to provide a complete understanding.
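For readers unfamiliar with divisible load theory, the single-level tree case that such multi-level schemes build on is usually written with the optimality principle that all participating processors stop computing at the same instant; with notation assumed here (not taken from the paper), the load fractions \(\alpha_i\) satisfy

\[
\alpha_i w_i T_{cp} = \alpha_{i+1} z_{i+1} T_{cm} + \alpha_{i+1} w_{i+1} T_{cp}, \quad i = 0, \ldots, m-1, \qquad \sum_{i=0}^{m} \alpha_i = 1,
\]

where \(w_i\) is the computing-speed parameter of processor \(p_i\), \(z_i\) that of its incoming link, and \(T_{cp}\), \(T_{cm}\) the unit computation and communication times; solving this linear system gives the optimal split for a fixed distribution sequence.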

12.
The treatment paradigm of non-small cell lung cancer (NSCLC) has evolved into oncogene-directed precision medicine. Identifying actionable genomic alterations is the initial step towards precision medicine. An important scientific advance in the molecular profiling of NSCLC over the past decade is the shift from the traditional piecemeal fashion to massively parallel sequencing with the use of next-generation sequencing (NGS). Another technical advance is the development of the liquid biopsy, which has great potential for providing a dynamic and comprehensive genomic profile of NSCLC in a minimally invasive manner. A growing body of evidence demonstrates that the integration of NGS with liquid biopsy plays emerging roles in the genomic profiling of NSCLC. This review summarizes the potential applications of NGS-based liquid biopsy in the diagnosis and treatment of NSCLC, including identifying actionable genomic alterations, tracking spatiotemporal tumor evolution, dynamically monitoring response and resistance to targeted therapies, and its diagnostic value in early-stage NSCLC, and discusses emerging challenges to overcome in order to facilitate clinical translation in the future.

13.
Energy preservation is very important nowadays. A large number of applications in science, engineering, astronomy, and business analytics are classified as Bag-of-Tasks (BoT) applications. A BoT is a collection of independent tasks that do not communicate with each other during execution. BoT scheduling has been extensively studied from a performance point of view. In this paper, we address the problem of energy-efficient BoT scheduling in a heterogeneous environment with the twin objectives of minimizing finish time and energy consumption. Specifically, we extend two performance-oriented scheduling policies, Min–Min and Max–Min, and propose power-aware centralized scheduling policies that incorporate a dynamic voltage/frequency scaling mechanism and can power on and off unneeded computing nodes of a heterogeneous cluster environment using dynamic power management. Additionally, to evaluate the system using a more realistic workload, high-priority tasks with and without time constraints are also submitted. A series of simulation experiments shows that we can achieve significant energy savings without significantly affecting the execution of BoTs and high-priority tasks. Additional experiments on a real system also confirmed the effectiveness of our policies.
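For context, the Min–Min baseline that the power-aware policies extend can be sketched as follows (the DVFS levels, node power on/off decisions, and high-priority task handling described in the paper are omitted):

```python
# Minimal Min-Min mapping sketch (performance-oriented baseline only).
def min_min(etc, n_machines):
    """etc[t][m]: estimated time to compute task t on machine m."""
    ready = [0.0] * n_machines          # machine available times
    unmapped = set(range(len(etc)))
    mapping = {}
    while unmapped:
        # for each unmapped task, its minimum completion time and best machine
        best = {t: min((ready[m] + etc[t][m], m) for m in range(n_machines))
                for t in unmapped}
        # pick the task whose minimum completion time is smallest
        t = min(unmapped, key=lambda u: best[u][0])
        ct, m = best[t]
        mapping[t] = m
        ready[m] = ct
        unmapped.remove(t)
    return mapping, max(ready)

if __name__ == "__main__":
    etc = [[3, 5], [2, 4], [6, 1]]      # 3 tasks, 2 heterogeneous machines
    print(min_min(etc, n_machines=2))
```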

14.
This paper addresses the efficient exploitation of task-level parallelism, present in many dense linear algebra operations, from the point of view of both computational performance and energy consumption. The strategies considered here, referred to as the Slack Reduction Algorithm (SRA) and the Race-to-Idle Algorithm (RIA), adjust the operating frequency of the cores during the execution of a collection of tasks (into which many dense linear algebra algorithms can be decomposed), following very different approaches to saving energy. The procedures are evaluated using an energy-aware simulator, which is in charge of scheduling/mapping the execution of these tasks to the cores, leveraging the dynamic voltage and frequency scaling featured by current technology. Experiments with this tool and the practical integration of the RIA strategy into a runtime show the energy gains for two versions of the QR factorization.

15.
With the popularization and development of cloud computing, many scientific computing applications are conducted in cloud environments. However, the application scenarios of scientific computing are also becoming increasingly dynamic and complicated, with unpredictable job submission times, different job priorities, and deadline and budget constraints on executing jobs. Thus, how to perform scientific computing efficiently in the cloud has become an urgent problem. To address this problem, we design an elastic resource provisioning and task scheduling mechanism to run scientific workflow jobs in the cloud. The goal of this mechanism is to complete as many high-priority workflow jobs as possible under budget and deadline constraints. This mechanism consists of four steps: job preprocessing, job admission control, elastic resource provisioning, and task scheduling. We perform the evaluation with four kinds of real scientific workflow jobs under different budget constraints. We also consider the uncertainties of task runtime estimations, provisioning delays, and failures in the evaluation. The results show that in most cases our mechanism achieves better performance than other mechanisms. In addition, the uncertainties of task runtime estimations, VM provisioning delays, and task failures do not have a major impact on the mechanism's performance.

16.
A new approach to the job scheduling problem in computational grids
Job scheduling is one of the most challenging issues in Grid resource management and strongly affects the performance of the whole Grid environment. The major drawback of the existing Grid scheduling algorithms is that they are unable to adapt to the dynamicity of the resources and the network conditions. Furthermore, the network model that is used for resource information aggregation in most scheduling methods is centralized or semi-centralized. Therefore, these methods do not scale well as the Grid size grows and do not perform well as the environmental conditions change with time. This paper proposes a learning automata-based job scheduling algorithm for Grids. In this method, the workload that is placed on each Grid node is proportional to its computational capacity and varies with time according to the Grid constraints. The performance of the proposed algorithm is evaluated through several simulation experiments under different Grid scenarios, and the obtained results are compared with those of several existing methods. Numerical results confirm the superiority of the proposed algorithm over the others in terms of makespan, flowtime, and load balancing.
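The abstract does not state which reinforcement scheme the automata use; as a reference point, the widely used linear reward–inaction (\(L_{R-I}\)) update for an automaton with action probabilities \(p_1,\ldots,p_r\) is

\[
p_i(t+1) = p_i(t) + a\bigl(1 - p_i(t)\bigr), \qquad p_j(t+1) = (1-a)\,p_j(t) \;\; (j \neq i)
\]

when the chosen action \(i\) is rewarded (e.g., a job completes within expectations), with \(0 < a < 1\) the learning rate; on a penalty the probabilities are left unchanged. This is a standard scheme, not necessarily the paper's exact choice.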

17.
Lipid mediators (LMs) derived from PUFAs play important roles in health and disease. Databases and search algorithms are crucial, but currently unavailable, for accurate and prompt analysis of LMs via liquid chromatography-ultraviolet-tandem mass spectrometry (LC-UV-MS/MS). A novel algorithm and databases, the cognoscitive-contrast-angle algorithm and databases (COCAD), were developed for the identification of LMs based on the integration of standard MS/MS spectra with chromatograms and UV spectra. Segment naming and empirical fragmentation rules were introduced to determine MS/MS ion identities, along with the ion intensities used by COCAD in matching the unknown to those of authentic standards. The structures of potential LMs without synthetic and/or authentic products as standards were identified by developing theoretical databases and algorithms based on virtual LC-UV-MS/MS spectra and chromatograms. The performance of these databases and algorithms was tested by identifying LMs in murine tissues. These results indicate that COCAD has many advantages for the profiling and identification of LMs compared with the conventional dot-product algorithm.
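For comparison, the conventional dot-product score between an acquired MS/MS spectrum with peak intensities \(I_k\) and a library spectrum with intensities \(J_k\) is the cosine of the spectral contrast angle,

\[
\cos\theta = \frac{\sum_k I_k J_k}{\sqrt{\sum_k I_k^2}\,\sqrt{\sum_k J_k^2}},
\]

with smaller angles \(\theta\) indicating closer matches. COCAD extends this kind of matching by also integrating chromatograms and UV spectra; its exact scoring function is not given in this abstract.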

18.
Cloud computing has attracted significant attention from the research community because of the rapid migration rate of Information Technology services to its domain. Advances in virtualization technology have made cloud computing very popular by enabling easier deployment of application services. Tasks are submitted to cloud datacenters to be processed in a pay-as-you-go fashion. Task scheduling is one of the significant research challenges in the cloud computing environment. The current formulation of the task scheduling problem has been shown to be NP-complete; hence, finding the exact solution, especially for large problem sizes, is intractable. The heterogeneous and dynamic features of cloud resources make optimum task scheduling non-trivial. Therefore, efficient task scheduling algorithms are required for optimum resource utilization. Symbiotic Organisms Search (SOS) has been shown to perform competitively with Particle Swarm Optimization (PSO). The aim of this study is to optimize task scheduling in the cloud computing environment based on a proposed Simulated Annealing (SA) based SOS (SASOS) in order to improve the convergence rate and solution quality of SOS. The SOS algorithm has a strong global exploration capability and uses fewer parameters. The systematic reasoning ability of SA is employed to find better solutions in local solution regions, hence adding exploitation ability to SOS. Also, a fitness function is proposed which takes into account the utilization level of virtual machines (VMs), which reduces makespan and the degree of imbalance among VMs. The CloudSim toolkit was used to evaluate the efficiency of the proposed method using both synthetic and standard workloads. Simulation results show that the hybrid SOS performs better than SOS in terms of convergence speed, response time, degree of imbalance, and makespan.
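The simulated-annealing component adds the standard Metropolis acceptance rule to the local search: a candidate solution with fitness change \(\Delta f\) (for a minimization objective) is accepted with probability

\[
P(\text{accept}) = \begin{cases} 1, & \Delta f \le 0,\\ \exp(-\Delta f / T), & \Delta f > 0, \end{cases}
\]

where the temperature \(T\) is decreased over iterations (e.g., \(T \leftarrow \gamma T\) with \(0 < \gamma < 1\)). How exactly SASOS embeds this rule into the phases of SOS is not specified in the abstract.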

19.
Many-task computing aims to bridge the gap between two computing paradigms, high throughput computing and high performance computing. Many-task computing denotes high-performance computations comprising multiple distinct activities, coupled via file system operations. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. Traditional techniques found in production systems in the scientific community to support many-task computing do not scale to today's largest systems, due to issues in local resource manager scalability and granularity, efficient utilization of the raw hardware, long wait queue times, and shared/parallel file system contention and scalability. To address these limitations, we adopted a "top-down" approach to building a middleware called Falkon, to support the most demanding many-task computing applications at the largest scales. Falkon (Fast and Light-weight tasK executiON framework) integrates (1) multi-level scheduling to enable dynamic resource provisioning and minimize wait queue times, (2) a streamlined task dispatcher able to achieve orders-of-magnitude higher task dispatch rates than conventional schedulers, and (3) data diffusion, which performs data caching and uses a data-aware scheduler to co-locate computational and storage resources. Micro-benchmarks have shown Falkon to achieve throughputs of over 15K tasks/s, scale to hundreds of thousands of processors and to millions of queued tasks, and execute billions of tasks per day. Data diffusion has also been shown to improve application scalability and performance, with its ability to achieve hundreds of Gb/s I/O rates on modest sized clusters, with Tb/s I/O rates on the horizon. Falkon has shown orders-of-magnitude improvements in performance and scalability over traditional approaches to resource management across many diverse workloads and applications, at scales of billions of tasks on hundreds of thousands of processors across clusters, specialized systems, Grids, and supercomputers. Falkon's performance and scalability have enabled a new class of applications, Many-Task Computing, to operate at scales previously believed impossible, with high efficiency.

20.
Many studies on the integration of process planning and production scheduling have been carried out during the last decade. While various integration approaches and algorithms have been proposed, the implementation of these approaches is still a difficult issue. To achieve successful implementation, it is important to examine and evaluate integration approaches or algorithms beforehand. Based on an object-oriented integration testbed, a simulation study that compares different integration algorithms is presented in this paper. Both the separated planning method and integrated planning methods are examined. Situations of both fixed and variable processing times are also simulated, and useful results have been observed. The successful simulation with the object-oriented integration testbed will eventually be extended to include other new planning algorithms to examine their effectiveness and implementation feasibility.
