A Two-Step Job Scheduling Algorithm Based on Priority for Cloud Computing
Journal of Information and Communication Convergence Engineering. 2013. Dec, 11(4): 235-240
Copyright ©2013, The Korean Institute of Information and Communication Engineering
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • Received : March 07, 2013
  • Accepted : July 17, 2013
  • Published : December 31, 2013
About the Authors
Jeongwon Kim
jwkim@silla.ac.kr

Abstract
Cloud systems are a popular computing environment because they can provide easy access to computing resources for users as well as efficient use of resources for companies. The resources of cloud computing are heterogeneous, and jobs have various characteristics, which makes effective job scheduling an important issue. Scheduling in the cloud system may be defined as a multiple-criteria decision model. To address this issue, this paper proposes a priority-based two-step job scheduling algorithm. On the first level, jobs are classified based on preference. Resources are dedicated to a job if a deadline failure would cause severe results or critical business losses. If a deadline failure would cause only minor discomfort or slight functional impairment, the job is scheduled using a best-effort approach. On the second level, jobs are allocated to adequate resources through priorities that are calculated by the analytic hierarchy process model. We then analyze the proposed algorithm and present a scheduling example to confirm its efficiency.
I. INTRODUCTION
In a cloud computing environment, which is Internet-based computing, users can utilize resources such as computing software, hardware, and information systems as metered services, much like public utilities such as gas, electricity, and water. Because the cloud system centralizes storage, memory, processing, hardware, and bandwidth, it can provide an efficient and low-cost computing environment [1].
On the other hand, handling jobs efficiently in cloud services is a problem that remains to be solved. In general, the job scheduling problem has been a major research topic in grid and cloud computing. Static and dynamic algorithms have been proposed for single workflows in order to achieve optimum performance. These schemes mainly focus on completion-time guarantees and latency reduction techniques.
These techniques are mainly best-effort scheduling algorithms. However, scheduling algorithms in cloud computing have different requirements because of the importance of how quickly a certain level of resources can be allocated and how much of a resource is dedicated to a job [2]. In addition, because cloud computing users require various levels of service and a wide range of scalability, which is set dynamically through virtualization, jobs should be scheduled in different ways.
A scheduling algorithm in the cloud environment must also give special consideration to the fact that the storage and data transfer costs of data-oriented jobs continue to grow exponentially as time passes. Therefore, this study proposes a new scheduling algorithm that takes the characteristics of each job into consideration.
The proposed scheme is divided into two steps: the first step classifies jobs by their degree of importance, and the second step assigns resources to each job. In the first step, if missing a deadline may bring about severe results or critical business losses, resources are dedicated to that job (resource provisioning). If a deadline failure would cause only minor discomfort or slight functional impairment, the job is scheduled using a best-effort approach (best effort). In the second step, jobs are assigned to a virtual machine (VM) so as to satisfy the user's quality of service (QoS) requirements and maximize the efficiency of the cloud system.
In a cloud environment, many nodes exist, and the reliability of the heterogeneous nodes is inevitably low. This allocation issue can be defined as a kind of multi-criteria decision making (MCDM) because the responsiveness, cost, and load of jobs vary in terms of QoS. Therefore, our work uses the analytic hierarchy process (AHP) method to solve this allocation issue, and our scheduling algorithm chooses a VM to run a job by determining the priorities of the alternatives under the various decision criteria.
This paper is organized as follows: The research related to the proposed algorithm is described in Section II. In Section III, our new algorithm is introduced in detail. In Section IV, the effectiveness of our algorithm is analyzed and verified. We then discuss conclusions and future work in Section V.
II. RELATED WORKS
Many studies have addressed job scheduling in grid and cloud environments. For batch job scheduling, which is known to be suitable for scientific applications, both online and offline schemes exist. This paper focuses on the scheduling of real-time requests.
Yu et al. [3] proposed a cost-based workflow scheduling algorithm that minimizes the execution cost while satisfying the deadline. Yu and Buyya [4] proposed a scheduling technique that minimizes the execution time while satisfying a constraint on the workflow execution cost. In this approach, a genetic algorithm was applied to solve the optimization problem, and experimental results were presented for a grid environment. Padala et al. [5] proposed an algorithm that satisfies the QoS of workflows and can improve resource utilization between applications by adjusting resource sharing. Yu and Shi [6] proposed a plan-based algorithm for multiple workflows that determines the execution order through rankings of the workflows. Xu et al. [7] proposed a scheduling algorithm that supports multiple workflows with multiple QoS constraints. They further showed that the scheme can improve scheduling accessibility. Kosinska et al. [8] proposed a variety of phases to improve the reliability and scalability of applications that run on cloud resources. Ghanbari and Othman [9] reported a technique similar to this study: their scheme used the AHP technique to determine a job's priority, with cloud resources at the attribute level and jobs at the alternative level. Our study, on the other hand, obeys the basic principle of the AHP model. In our custom model, the attribute level defines the decision criteria. This level determines which criteria, such as responsiveness, cost, and system load, are preferable for choosing alternatives. The alternative level is defined as a resource allocation unit, which in our scheme is a VM, to determine which VM is more adequate for executing a job.
III. TWO-STEP JOB SCHEDULING SCHEME BASED ON PRIORITY
The proposed scheduling scheme in this paper is composed of two steps. In the first step, the priority of a job is calculated based on the degree of importance of each job. In the second step, the algorithm selects a VM on which to run the job. Every candidate VM is weighted using the AHP model.
A. First Step: Job Classification Based on Its Degree of Importance
Because resources in cloud computing are dedicated to jobs, it is advantageous for a cloud system to secure the needed resources in advance. In the case of data-oriented jobs, this resource provisioning is an especially important factor in performance improvement because the required storage as well as bandwidth increases as time passes. Thus, our scheme classifies all jobs into one of two categories: resource provisioning or best effort.
In this step, jobs for which the user pays a high cost or that are of higher importance belong to the resource provisioning case. If a scheduling failure may bring about severe business loss, the job is classified into the first (highest) level of priority. If the failure may cause considerable after-effects, the job belongs to the second level of priority. Jobs for which the user pays regular or low costs or that are of medium importance belong to the best effort case. If a scheduling failure may bring about a trivial inconvenience to a small number of users, the job is classified into the third level of priority. If the failure may cause minor functional impairments, the job belongs to the fourth (lowest) level of priority. The classified jobs are queued in order in the cloud system scheduler. The sequences are fixed, and the jobs are executed in a nonpreemptive way. Jobs with the same priority are served in a round-robin manner.
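The following is a minimal Python sketch of this classification step; the class and field names are our own illustrative assumptions, not an implementation from the paper.

```python
from collections import deque
from dataclasses import dataclass
from enum import IntEnum

class Priority(IntEnum):
    SEVERE_BUSINESS_LOSS = 1    # resource provisioning, highest priority
    CONSIDERABLE_EFFECTS = 2    # resource provisioning
    TRIVIAL_INCONVENIENCE = 3   # best effort
    MINOR_IMPAIRMENT = 4        # best effort, lowest priority

@dataclass
class Job:
    name: str
    priority: Priority

class FirstStepScheduler:
    """Nonpreemptive dispatch: fixed priority order, one queue per level."""

    def __init__(self) -> None:
        self.queues = {p: deque() for p in Priority}

    def submit(self, job: Job) -> None:
        self.queues[job.priority].append(job)

    def next_job(self) -> Job | None:
        # Serve the highest nonempty priority level; within a level,
        # FIFO dispatch of the queued jobs realizes round-robin service.
        for p in sorted(Priority):
            if self.queues[p]:
                return self.queues[p].popleft()
        return None
```

For instance, a job whose deadline failure would cause severe business loss is submitted as `Job("backup", Priority.SEVERE_BUSINESS_LOSS)` and is always dispatched before any best-effort job.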
Fig. 1. The analytic hierarchy process model used to place jobs on the virtual machine (VM).
B. Second Step: Job Allocation to a VM Based on the AHP Model
The AHP is a kind of MCDM method that helps to choose one among several alternatives. Each alternative is evaluated against a set of attributes, and a weight is assigned to each attribute. Therefore, the AHP model can select the best choice from the list of alternatives. The merit of the AHP is that it considers variable parameters for many alternatives and generates the result that best matches those parameters [10]. The proposed scheme uses an AHP model for the VM selection.
As Fig. 1 shows, the objective level in the proposed algorithm is to place a job on the VM that best matches it under many different parameters. The attribute level in our algorithm is composed of user requirements or decision criteria such as response time, system utilization, and cost, and the alternative level represents all VMs in the cloud system.
As each parameter in the AHP model has a preference, the proposed scheme uses preference values from 1 to 9, where 9 is the strongest preference. The values 2, 4, 6, and 8 are intermediate preference values, and the reciprocal of a value represents the counterpart preference. These preference values are used in representing and calculating the requirements of a job as well as the parameters of each VM, such as the response time, system utilization, and cost.
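As a small illustration, the snippet below builds a reciprocal preference matrix from upper-triangle judgments on this 1–9 scale; the function name and the sample values are our own assumptions.

```python
import numpy as np

def build_pcm(n, judgments):
    """judgments: {(i, j): preference} for i < j, on Saaty's 1-9 scale."""
    pcm = np.eye(n)
    for (i, j), pref in judgments.items():
        pcm[i, j] = pref
        pcm[j, i] = 1.0 / pref  # the reciprocal is the counterpart preference
    return pcm

# Illustrative judgments over three criteria
# (0: response time, 1: system utilization, 2: cost).
pcm = build_pcm(3, {(0, 1): 3, (0, 2): 5, (1, 2): 2})
```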
Suppose that the set of jobs, which are the scheduling objects in the cloud environment, is $\zeta = \{J_1, J_2, \ldots, J_m\}$, the criteria set is $\psi = \{C_1, C_2, \ldots, C_n\}$, and the set of VMs is $\xi = \{VM_1, VM_2, \ldots, VM_v\}$.
The VM allocation is done in two sub-steps. In the first sub-step, the algorithm calculates the weight vector by pairwise comparison and performs a consistency check at both levels. In the second sub-step, the best VM is selected by multiplying the two weight vectors of the levels. The process is explained in detail below.
Sub-step 1: Calculation of the weight matrix and consistency index between the objective and attribute level.
$$\mathrm{PCM} = \begin{bmatrix} 1 & a_{12} & \cdots & a_{1n} \\ 1/a_{12} & 1 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 1/a_{1n} & 1/a_{2n} & \cdots & 1 \end{bmatrix} \qquad (1)$$
Let the pairwise comparison matrix between the objective level and the attribute level be called the pairwise comparison matrix (PCM). The PCM represents the preference for a job under all criteria, such as responsiveness, system utilization, and cost. The PCM is an $n \times n$ matrix, as in Eq. (1). If the element $\mathrm{PCM}(i,j)$ is 5, then $\mathrm{PCM}(j,i)$ will be 1/5. There are n pairwise comparison matrices for all criteria, which are created according to the priority of the decision criteria. For each comparison matrix, the scheme computes a priority vector (vector of weights); obtaining this vector is an essential step in the model, and it can be computed by solving Eq. (2). Here, $\lambda_{max}$ is the principal eigenvalue of the PCM, with corresponding eigenvector $\omega_{criteria}$. For any comparison matrix, the model produces a vector of weights $\omega_{criteria} = \{\omega_1, \omega_2, \ldots, \omega_n\}$. A positive $n \times n$ matrix has the ratio form $\mathrm{PCM} = (w_i / w_j)$, $i, j = 1, \ldots, n$, if and only if it is consistent. The matrix of ratios $\mathrm{PCM} = (w_i / w_j)$ is consistent if and only if $n$ is its principal eigenvalue and $\mathrm{PCM} \cdot \omega_{criteria} = \lambda_{max} \cdot \omega_{criteria}$. Further, $\omega_{criteria} > 0$ is unique to within a multiplicative constant.
$$\mathrm{PCM} \cdot \omega_{criteria} = \lambda_{max} \cdot \omega_{criteria} \qquad (2)$$
Saaty [10] defined the consistency ratio (CR) as in Eq. (3).
$$\mathrm{CR} = \frac{\mathrm{CI}}{\mathrm{RI}}, \qquad \mathrm{CI} = \frac{\lambda_{max} - n}{n - 1} \qquad (3)$$
In Eq. (3), RI is the random index, the average consistency index of randomly generated comparison matrices of the same order; Eq. (3) uses the RI values calculated by Saaty [10]. If CR < 0.1, the PCM is considered consistent.
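The sketch below computes the priority vector, $\lambda_{max}$, and CR of Eqs. (2) and (3) with numpy; the function name ahp_weights is our own, and the RI table is the standard one published by Saaty.

```python
import numpy as np

# Saaty's random index (RI) for matrix orders 1..10.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(pcm):
    """Return (weight vector, lambda_max, CR) of a pairwise comparison matrix."""
    eigvals, eigvecs = np.linalg.eig(pcm)
    k = np.argmax(eigvals.real)             # principal eigenvalue, Eq. (2)
    lam_max = eigvals.real[k]
    w = np.abs(eigvecs[:, k].real)
    w /= w.sum()                            # normalize into a priority vector
    n = pcm.shape[0]
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0
    cr = ci / RI[n] if RI[n] > 0 else 0.0   # Eq. (3); accept if CR < 0.1
    return w, lam_max, cr
```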
Sub-step 2: Calculation of the weight matrix and consistency index between the attribute and alternative level.
The next sub-step likewise calculates a weight matrix and consistency index, this time relating each decision criterion to all VMs. As in Eqs. (1)–(3), the scheme builds a pairwise comparison matrix and calculates the weight vector and consistency ratio of the VMs under each decision criterion.
Suppose that the resulting weight matrix for the VMs is $\omega_{vm}$, a $v \times n$ matrix whose rows $\omega_1, \omega_2, \ldots, \omega_v$ are the weight vectors of the VMs under the $n$ decision criteria, where $v$ is the number of VMs and $n$ is the number of decision criteria. We then obtain the final score of each VM by multiplying the weight matrix of sub-step 2 by the weight vector of sub-step 1, as in Eq. (4). The score vector in Eq. (4) has $v$ elements, and the index of its maximum value identifies the VM that will execute the job.
$$\mathrm{score} = \omega_{vm} \cdot \omega_{criteria}, \qquad (4)$$

where $\omega_{vm}$ is $v \times n$ and $\omega_{criteria}$ is $n \times 1$.
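A minimal sketch of this selection with illustrative weights (not the values from the paper's example) follows; each column of omega_vm sums to 1 because it is a per-criterion priority vector over the VMs.

```python
import numpy as np

omega_criteria = np.array([0.54, 0.30, 0.16])   # sub-step 1: criterion weights
omega_vm = np.array([[0.25, 0.40, 0.30],        # VM1's weights per criterion
                     [0.35, 0.25, 0.30],        # VM2
                     [0.40, 0.35, 0.40]])       # VM3

score = omega_vm @ omega_criteria               # Eq. (4): v-element score vector
best = int(np.argmax(score))                    # index of the VM to run the job
print(f"selected VM{best + 1}, scores = {np.round(score, 3)}")
# -> selected VM3, scores = [0.303 0.312 0.385]
```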
IV. ANALYSIS AND EXAMPLE OF THE PROPOSED SCHEDULING ALGORITHM
First, we analyze the time complexity of the proposed scheme. As the priority classification of the first step is determined by job characteristics or requirements, its time complexity is trivial. The computation of the pairwise comparison matrices and CR accounts for most of the complexity of the proposed scheme, as expressed in Eq. (5).
$$O(n^{2.38}) \qquad (5)$$
In Eq. (6), $n$ is the number of decision criteria, $v$ is the number of VMs, and $d$ denotes the number of matrices eliminated for inconsistency while checking the CR. The term $n^{2.38}$ counts the additions and multiplications in the calculation of the weight vector from the objective level to the attribute level. The term $n \cdot v^{2.38}$ counts the arithmetic operations in the computation of the weight matrix from the attribute level to the alternative level. Furthermore, if a candidate pairwise matrix is rejected in the consistency check, it must be recalculated, adding $d \cdot n^{2.38}$ operations. Thus, the final time complexity is given by Eq. (6).
$$n^{2.38} + n \cdot v^{2.38} + d \cdot n^{2.38} \qquad (6)$$
Multiplying two $n \times n$ matrices requires $O(n^3)$ multiplications in the worst case. The fastest known algorithm, devised by Don Coppersmith and Shmuel Winograd, runs in $O(n^{2.38})$ time [11]. Most researchers believe that an optimal algorithm would run in essentially $O(n^2)$ time, yet until recently no further progress had been made toward finding one [12]. Therefore, the best-case complexity of the proposed scheme is $O(n^{2.38})$, its worst-case complexity is $O(n^3)$, and the general case is $n^{2.38}$ or $v^{2.38}$. However, as the number of VMs is generally greater than the number of decision criteria, the complexity of the proposed algorithm is best described as $n \cdot v^{2.38}$.
The following is an example of the proposed scheduling algorithm. Table 1 shows a sample preference matrix of a job, and Eq. (7) shows the corresponding PCM and weight vector.
[Eq. (7): the PCM built from Table 1 and its computed weight vector]
To obtain the CR, we first multiply the preference matrix by the weight vector, as shown in Eq. (8):
[Eq. (8): the product of the preference matrix and the weight vector]
From Eqs. (2) and (3), we obtain $\lambda_{max} = 3.004$ and CR = 0.002. As the CR is smaller than 0.1, this preference matrix is considered consistent.
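Since the entries of Table 1 are not reproduced here, the snippet below runs the same check on a hypothetical 3×3 preference matrix; its $\lambda_{max}$ and CR differ from the paper's 3.004 and 0.002 but pass the same 0.1 threshold.

```python
import numpy as np

A = np.array([[1,   2,   3],
              [1/2, 1,   2],
              [1/3, 1/2, 1]])           # hypothetical values, not Table 1

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
lam_max = eigvals.real[k]               # principal eigenvalue, Eq. (2)
n = A.shape[0]
cr = (lam_max - n) / (n - 1) / 0.58     # Eq. (3) with RI = 0.58 for n = 3
print(round(lam_max, 3), round(cr, 3))  # ~3.009 and ~0.008: consistent
```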
Table 1. Preference matrix example of a job
Table 2. $\lambda_{max}$ and consistency ratio (CR) for response time
Table 3. $\lambda_{max}$ and consistency ratio (CR) for system utilization
Table 4. $\lambda_{max}$ and consistency ratio (CR) for cost
Next, we calculate the weight vector and CR of each VM for the response time, system utilization, and cost. Tables 2–4 show these weight vectors and CRs; as all CRs are less than 0.1, all preference matrices are considered consistent. Finally, the score of each VM is calculated through Eq. (4) by multiplying the weight matrix ($\omega_{vm}$) of the alternative level by the weight vector ($\omega_{criteria}$) of the attribute level. Eq. (9), which follows, is the resulting score vector. Because VM3 records the maximum score (0.375), the proposed scheduler selects VM3 to execute the job.
[Eq. (9): the score vector of the VMs, with a maximum of 0.375 for VM3]
V. CONCLUSION
Job scheduling is an important problem in cloud computing, which is naturally composed of heterogeneous resources. We defined this issue as an MCDM problem. To settle it, this paper introduced a two-step job scheduling scheme based on priority, with consideration of both the job characteristics and the MCDM. In the first step, the priority of a job was classified into one of four levels determined by importance attributes. In the second step, we applied the AHP to job allocation to the VMs to tackle the MCDM issue, adopting the responsiveness, system utilization, and cost of jobs as decision criteria. We then analyzed the time complexity of the proposed scheme and confirmed that the proposed algorithm shows acceptable complexity. In future work, we aim to minimize the complexity and to implement a real scheduler in a sample cloud system.
References
[1] Baraglia R., Capannini G., Dazzi P., Pagano G. 2013 "A multi-criteria job scheduling framework for large computing farms" Journal of Computer and System Sciences 79 (2) 230-244 DOI: 10.1016/j.jcss.2012.05.005
[2] Deelman E. 2010 "Grids and clouds: making workflow applications work in heterogeneous distributed environments" International Journal of High Performance Computing Applications 24 (3) 284-298 DOI: 10.1177/1094342009356432
[3] Yu J., Buyya R., Tham C. K. 2005 "Cost-based scheduling of scientific workflow applications on utility grids" in Proceedings of the 1st IEEE International Conference on e-Science and Grid Computing Melbourne, Australia 5-8
[4] Yu J., Buyya R. 2006 "A budget constrained scheduling of workflow applications on utility grids using genetic algorithms" in Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing Paris, France 19-23
[5] Padala P., Shin K. G., Zhu X., Uysal M., Wang Z., Singhal S., Merchant A., Salem K. 2007 "Adaptive control of virtualized resources in utility computing environments" in Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems Lisbon, Portugal 289-302
[6] Yu Z., Shi W. 2008 "A planner-guided scheduling strategy for multiple workflow applications" in Proceedings of the 37th International Conference on Parallel Processing Workshops Portland, OR 1-8
[7] Xu M., Cui L., Wang H., Bi Y. 2009 "A multiple QoS constrained scheduling strategy of multiple workflows for cloud computing" in Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications Chengdu, China 629-634
[8] Kosinska J., Kosinski J., Zielinski K. 2010 "The concept of application clustering in cloud computing environments: the need for extending the capabilities of virtual networks" in Proceedings of the 5th International Multi-Conference on Computing in the Global Information Technology Valencia, Spain 139-145
[9] Ghanbari S., Othman M. 2012 "A priority based job scheduling algorithm in cloud computing" in Proceedings of the International Conference on Advances Science and Contemporary Engineering 2012 Jakarta, Indonesia 778-785
[10] Saaty T. L. 2012 Decision Making for Leaders: The Analytic Hierarchy Process for Decisions in a Complex World. RWS Publications, Pittsburgh, PA
[11] Strassen V. 1969 "Gaussian elimination is not optimal" Numerische Mathematik 13 (4) 354-356 DOI: 10.1007/BF02165411
[12] Robinson S. 2005 "Toward an optimal algorithm for matrix multiplication" SIAM News 38 (9)