International Journal of Engineering Trends and Technology (IJETT) – Volume 4 Issue 6 – June 2013
ISSN: 2231-5381, http://www.ijettjournal.org

Classifier MCT for immediate mode independent task scheduling in Computational Grid

Gaurav Sharma (1), Puneet Banga (2)
(1) Assistant Professor, Department of Computer Science and Engineering, JMIT, Radaur, Yamuna Nagar, Haryana, India
(2) Research Scholar, Department of Computer Science and Engineering, JMIT, Radaur, Yamuna Nagar, Haryana, India

Abstract: Grid computing consists of coordinating and sharing computational power, storage and network resources across geographically dispersed locations [1]. Scheduling each task to the best resource is a tedious job that determines the efficiency of the whole Grid system. Meta-scheduling plays a crucial role in scheduling the tasks submitted for execution, and it requires special attention because an increasing number of tasks are being executed on a limited number of resources [2]. The crucial task of meta-scheduling is to select the best resources to execute the underlying tasks while still achieving the following objectives: reducing the processing time, decreasing the makespan, increasing the overall throughput of the system [3], maintaining the average resource utilization rate, and considering task requirements.

Keywords: Task scheduling, MET, MCT, Immediate mode, Batch mode mapping.

I. INTRODUCTION
Scheduling is one of the most important and crucial steps in a Grid environment. The major objective functions used to assess grid scheduler performance are resource utilization rate, makespan and matching. Makespan is the time difference between the start time of the first task and the finish time of the last task [3] [4]. It can also be treated as a turnaround time: the gap between the submission of the first task and the completion of the last task. Makespan is a measure of the throughput of the heterogeneous computing system, and the objective of a grid scheduling algorithm is to minimize it. It is well known that the problem of deciding on an optimal assignment of jobs to resources is NP-complete. Another important parameter is resource utilization, which is based on the idle time of resources; effective job scheduling is indicated by high resource throughput. The motivation behind this research work is to design and implement a new scheduling algorithm [4] under immediate/online mode for independent task scheduling that minimizes the makespan and maximizes the average resource utilization rate.

Generally, Grid scheduling algorithms based on mapping heuristics can be classified into two major groups: Immediate/Online Mode and Batch/Offline Mode. In Immediate mode, a task is mapped as soon as it arrives in the system. MET and MCT [4] [5] are the best-known Immediate mode heuristics. In Batch mode, tasks are grouped together into a Meta task (MT), and the batch is scheduled at predefined times called mapping events. Many mapping heuristics have been proposed for batch mode scheduling, among which Min-Min, Max-Min and Suffrage are the simplest and most popular.

Unlike traditional scheduling algorithms, which make steady decisions in order to assign a single task to a resource, our proposed algorithm classifies each task, based on its task ratio, as CPU bound or I/O bound.

II. GRID SCHEDULING
Task scheduling is a fundamental and important issue in achieving high performance in grid computing systems. In simple terms, scheduling is the process of mapping submitted tasks to the available resources, that is, assigning jobs to the intended or selected resources. The performance of a grid should be improved by reducing the job processing time and by making sure that all the grid resources [6] are used without being idle, i.e., by balancing the load among resources.
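The two objective functions named in the Introduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper gives no formula for the average resource utilization rate, so the definition used here (busy time divided by makespan, averaged over resources) is an assumption, and the `Schedule` type, the function names and the sample intervals are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: for each resource, the (start, finish)
# interval of every task it executed.
Schedule = Dict[str, List[Tuple[float, float]]]


def makespan(schedule: Schedule) -> float:
    """Finish time of the last task minus start time of the first task."""
    intervals = [iv for ivs in schedule.values() for iv in ivs]
    if not intervals:
        return 0.0
    first_start = min(start for start, _ in intervals)
    last_finish = max(finish for _, finish in intervals)
    return last_finish - first_start


def average_utilization(schedule: Schedule) -> float:
    """Assumed definition: mean over resources of busy time / makespan."""
    span = makespan(schedule)
    if span == 0:
        return 0.0
    busy = [sum(finish - start for start, finish in ivs)
            for ivs in schedule.values()]
    return sum(b / span for b in busy) / len(schedule)


# Made-up schedule on two resources, for illustration only.
example = {"R1": [(0, 5), (5, 14)], "R2": [(0, 7), (7, 12)]}
print(makespan(example))                        # 14
print(round(average_utilization(example), 3))   # 0.929
```

In this made-up schedule R1 is busy for the whole makespan while R2 idles for 2 of the 14 time units, which is what pulls the average below 1.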
Another term is Meta-scheduling: a process for deciding the best site locations for job execution within a distributed environment [8]. We can categorize scheduling into two types:

1. Static scheduling
In static scheduling, information such as the number of tasks and resources and their requirements, for example task length in MI (Million Instructions), volume of data (in Mb), resource speed in MIPS (Million Instructions Per Second) and bandwidth (in Mbps), is known a priori. The information regarding task MI comes from various prediction models or from assumptions, which may not always be accurate [2].

2. Dynamic scheduling
In dynamic scheduling, the number of tasks and resources is not certain and cannot be predicted at any given time. Some or all scheduling decisions are made during the execution of a schedule. In a Grid, the processor speed of each processor varies over time, so dynamic scheduling is more appropriate than static scheduling [2].

III. IMMEDIATE MODE SCHEDULING
In Immediate mode, a task is handed to the mapper as soon as it arrives in the system. In the on-line mode heuristics, each task is considered only once [7] [8] for matching and scheduling, i.e., the mapping is not altered once it is computed. The two oldest and best-known mapping heuristics of this category are:

1. MET (Minimum Execution Time)
In this method, the minimum execution time is used to assign the task, without considering resource availability: each task is assigned to the resource on which it can be executed in the minimum time [5] [6]. Allocating tasks without considering resource availability results in load imbalance across grid resources. The MET heuristic runs in O(nm) time, where n denotes the number of independent tasks and m denotes the number of allocated resources.
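The MET rule described above can be sketched as follows. This is a minimal illustration, not the authors' Core Java implementation: the function and variable names are hypothetical, and the execution times are the ETC values of Table 1 from the illustrative example in Section V.

```python
from typing import Dict, List

MACHINES = ["R1", "R2"]
# Expected time to compute (ETC) of each task on R1 and R2, from Table 1.
ETC: Dict[str, List[float]] = {
    "T1": [5, 3], "T2": [12, 7], "T3": [10, 5],
    "T4": [9, 5], "T5": [18, 11], "T6": [6, 3],
}


def met_schedule(etc, machines):
    """Map each task to the machine with its minimum execution time.

    Machine availability is deliberately ignored, as in MET.
    """
    assignment = {}
    load = {m: 0.0 for m in machines}
    for task, times in etc.items():          # tasks in arrival order
        j = min(range(len(machines)), key=lambda k: times[k])
        assignment[task] = machines[j]
        load[machines[j]] += times[j]
    return assignment, load


assignment, load = met_schedule(ETC, MACHINES)
print(assignment)
print(load)   # {'R1': 0.0, 'R2': 34.0}
```

Because R2 is the faster machine for every task in Table 1, this sketch sends all six tasks to R2 and leaves R1 idle, which is the load imbalance the text attributes to MET.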
Pseudo code for MET
//-------------------------------------------------------------------
Step 1: FOR all the tasks ti
Step 2: FOR all machines mj
Step 3: Find the machine mq with the minimum execution time for task ti
[End step 2 loop]
Step 4: Schedule task ti to machine mq
[End step 1 loop]
Step 5: Compute the parameters such as Makespan etc.
//-------------------------------------------------------------------

2. MCT (Minimum Completion Time)
Another well-known immediate/online mapping heuristic is MCT. In the MCT mapping heuristic, a task ti is assigned to the resource mj that gives the minimum completion time Cij (ready time of the resource + execution time of the task on that resource) [5] [6]. Allocating tasks in this manner may cause some tasks to execute on machines that are not the fastest for them. MCT also runs in O(nm) time, where n denotes the number of independent tasks and m denotes the number of allocated resources.

Pseudo code for MCT
//-------------------------------------------------------------------
Step 1: FOR all the tasks ti
Step 2: FOR all machines mj
Step 3: Compute Cij = E(i,j) + rj, where rj is the ready time of machine mj
[End step 2 loop]
Step 4: Find the machine mq which will finish task ti earliest
Step 5: Schedule task ti to machine mq
Step 6: Update machine mq availability time
[End step 1 loop]
Step 7: Compute the parameters such as Makespan etc.
//-------------------------------------------------------------------

IV. PROPOSED ALGORITHM
In this section, we discuss the proposed scheduling algorithm, Classifier Minimum Completion Time (CMCT), which belongs to the immediate mapping category. The conventional MCT scheduling algorithm does not consider task requirements, which affects the performance of the overall Grid. There are two types of tasks in a Grid: computation based and communication based. Communication based jobs, such as transferring a file from one node to another, require high bandwidth. Computation based jobs, such as solving scientific computation problems, require more clock cycles to carry out the assigned task with minimum delay [9].

Pseudo code for CMCT (Proposed)
//-------------------------------------------------------------------
Step 1: Start
Step 2: Categorize the allocated resources into two classes, HIGH and LOW.
Step 3: Examine the category of the current task based on its task ratio.
Step 4: If the current task requires a larger number of clock cycles or its MIPS requirement is high, then:
Submit the task to "HIGH" class resources.
Else
Submit the task to "LOW" class resources.
[End If]
Step 5: Apply MCT on the resources of the respective class.
Step 6: Are there more upcoming tasks to be mapped?
If Yes, then:
Go to Step 3.
Else
Calculate the Makespan and Average Resource utilization rate.
[End If]
Step 7: End
//-------------------------------------------------------------------

If a computational task is submitted to a low power resource (in terms of MIPS or processing speed), it cannot be executed effectively. Similarly, if a communication based job is submitted to a resource with a high speed CPU and low bandwidth, it does not fully utilize the resource and its completion time increases. We therefore propose an independent task heuristic that considers the class of each task, either HIGH or LOW computational, in order to maximize resource utilization and minimize the makespan, which is another important parameter in a computational grid. The new scheduling heuristic is based on MCT. The mapping heuristic first finds out the type of the job, i.e., HIGH computation based or LOW. Depending on this criterion, the job is assigned to the required class of resources.
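The MCT and CMCT steps above can be sketched together in Python. This is a hedged illustration rather than the authors' Core Java code. The ETC values, the class codes (1 for HIGH, 2 for LOW), the resource classes and the task types are taken from Table 1 and Scenario 1 of the illustrative example in Section V. The `classify` threshold (HIGH when the computation share of the ratio exceeds the I/O share) and the assumption that all tasks are available at time 0 are mine, and every function name is hypothetical.

```python
from typing import Dict, Tuple

HIGH, LOW = 1, 2   # class codes as in the illustrative example

# ETC values from Table 1: task -> {machine: expected execution time}.
ETC = {
    "T1": {"R1": 5, "R2": 3}, "T2": {"R1": 12, "R2": 7},
    "T3": {"R1": 10, "R2": 5}, "T4": {"R1": 9, "R2": 5},
    "T5": {"R1": 18, "R2": 11}, "T6": {"R1": 6, "R2": 3},
}
MACHINE_CLASS = {"R1": LOW, "R2": HIGH}
# Scenario 1 task types.
TASK_CLASS = {"T1": LOW, "T2": HIGH, "T3": HIGH,
              "T4": LOW, "T5": HIGH, "T6": LOW}


def classify(compute_share: float, io_share: float) -> int:
    """Assumed rule: HIGH when the computation part of the ratio dominates."""
    return HIGH if compute_share > io_share else LOW


def mct_pick(times: Dict[str, float], ready: Dict[str, float]) -> str:
    """Choose the candidate machine with the minimum completion time
    (ready time + execution time) and advance its ready time."""
    best = min(times, key=lambda m: ready[m] + times[m])
    ready[best] += times[best]
    return best


def schedule(etc, machine_class, task_class=None) -> Tuple[Dict[str, str], float]:
    """Plain MCT when task_class is None, otherwise CMCT.

    Assumes every task class has at least one machine of the same class.
    """
    ready = {m: 0.0 for m in machine_class}
    assignment = {}
    for task, times in etc.items():      # immediate mode: one task at a time
        if task_class is None:
            candidates = times
        else:
            candidates = {m: t for m, t in times.items()
                          if machine_class[m] == task_class[task]}
        assignment[task] = mct_pick(candidates, ready)
    return assignment, max(ready.values())   # all tasks start at time 0


_, mct_makespan = schedule(ETC, MACHINE_CLASS)
cmct_assignment, cmct_makespan = schedule(ETC, MACHINE_CLASS, TASK_CLASS)
print(mct_makespan, cmct_makespan)      # 26.0 23.0
print(classify(3, 1), classify(1, 3))   # 1 2
```

The printed makespans are what this sketch produces under the stated assumptions, not the paper's measured results. With only one resource per class, as in the example, the MCT step inside each class has a single candidate, so CMCT reduces to routing each task to the resource of its class.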
In our proposed algorithm, we first categorize the allocated resources into two classes: HIGH class resources, to which HIGH computational tasks are scheduled, and LOW class resources, to which LOW computational tasks are scheduled. Each incoming task is examined according to its task ratio. The proposed algorithm determines not only the task type but also the task ratio, which indicates the extent to which a task is HIGH or LOW computational. Each task has its own ratio, which is composed of two values. The first value indicates the computation execution requirement of the task, and the second value indicates its data-intensive I/O requirement. For example, a task with a ratio of 3:1 requires 75% computational execution time and 25% data access time or I/O requirements, so CMCT classifies such a task as HIGH computational. As a second example, a task with a ratio of 1:3 requires 25% computational time and 75% data access time or I/O requirements; CMCT classifies it as LOW computational. MCT (Minimum Completion Time) is then applied within the HIGH or LOW set of resources: each task is assigned, in arbitrary order, to the machine of its class with the minimum expected completion time (ready time of the machine + task execution time on the selected machine). Like MET and MCT, the proposed heuristic runs in O(nm) time.

V. AN ILLUSTRATIVE EXAMPLE

Table 1: ETC Matrix
Task/Resource    R1    R2
T1               5     3
T2               12    7
T3               10    5
T4               9     5
T5               18    11
T6               6     3

Each task is classified as either HIGH or LOW, and the same applies to resources. In the proposed mapping heuristic, the Meta-Scheduler examines whether the upcoming task is of the HIGH or LOW computational category based on its task ratio. Each task has its own ratio, which is composed of two values.
The first value indicates the computation execution requirement of the task, and the second value indicates its data-intensive I/O requirement. For example, a task with a ratio of 3:1 requires 75% computational execution time and 25% data access time or I/O requirements, so in this case CMCT classifies the task as HIGH computational. If the upcoming task is of HIGH class, it is scheduled to the set of HIGH class resources, and traditional MCT is applied within that set; the makespan and average resource utilization rate are then computed from the resulting schedule. In the implementation, the proposed heuristic uses 1 to denote the HIGH class and 2 to denote the LOW class. According to Table 1, there are two resources, R1 and R2; R1 is defined as a LOW class resource and R2 as a HIGH class resource. Let us take three scenarios based on the percentage of HIGH and LOW computational class tasks.

Scenario 1: 50% of tasks are of HIGH computation class.
Task Type: T1: 2; T2: 1; T3: 1; T4: 2; T5: 1; T6: 2;

Scenario 2: 67% of tasks are of HIGH computation class.
Task Type: T1: 1; T2: 2; T3: 1; T4: 2; T5: 1; T6: 1;

Scenario 3: 83% of tasks are of HIGH computation class.
Task Type: T1: 1; T2: 1; T3: 1; T4: 1; T5: 2; T6: 1;

VI. SIMULATION RESULTS
This section presents the results obtained by executing the code for the example discussed in Section V (Table 1). The comparison results in Graphs 1.1.1 to 1.3.2 indicate that the proposed CMCT algorithm performs better than the other immediate mode algorithms: in scenarios 1 to 3, CMCT outperforms MET and MCT. The proposed task scheduling algorithm reduces the makespan and achieves a higher average resource utilization rate, and thus provides better QoS to the tasks than the other scheduling algorithms in the Grid computing environment. The other algorithms consider only the execution time, based on the task length defined in MI and the MIPS of each resource.
Our proposed work also considers the task class, which depends on the task ratio, i.e., whether the task has a HIGH or LOW computational requirement. By considering this feature, the proposed algorithm provides better results than MET and MCT in terms of lower makespan and higher average resource utilization rate.

Graph 1.1.1: Makespan for scenario 1
Graph 1.1.2: Average Resource Utilization Rate (1)
Graph 1.2.1: Makespan for scenario 2
Graph 1.2.2: Average Resource Utilization Rate (2)
Graph 1.3.1: Makespan for scenario 3
Graph 1.3.2: Average Resource Utilization Rate (3)

VII. EXPERIMENT SETUP
The algorithms discussed above, MET, MCT and CMCT (proposed), are compared on the basis of makespan and average resource utilization rate. The code is implemented in Core Java, with TextPad as the working environment. The machine is a Windows-based laptop with an Intel i5 (2.5 GHz) processor and 4 GB RAM. It is used to evaluate the performance of online mode scheduling under different scenarios, in terms of the classification of tasks as examined by the Meta Scheduler based on task ratio. As shown in Graph 1.1.1 to Graph 1.3.2, the proposed algorithm (CMCT) yields a higher average resource utilization rate and a lower makespan than MET and MCT.

VIII. CONCLUSION AND FUTURE WORK
Independent task scheduling is still an interesting topic and many researchers are working on it. We have highlighted one of the important scheduling modes: online/immediate mode task scheduling. A Grid can be seen as a powerful system that uses resources within different domains distributed across geographical areas. To make the whole system work efficiently, we have to consider optimal task scheduling, and to achieve that goal we require suitable resources, which can be obtained during the resource discovery phase. In future, the proposed algorithm can be tailored to different classes, for example one for computational tasks and another for I/O or data-intensive tasks that require high bandwidth. It can also be adapted to a dynamic environment in which information is not known in advance.

REFERENCES
[1] I. Foster, C. Kesselman, and S. Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations," International Journal of Supercomputer Applications, 15(3), pp. 200-220, Fall 2001.
[2] D. Paranhos, W. Cirne, and F. Brasileiro, "Trading cycles for information: Using replication to schedule bag-of-tasks applications on computational grids," International Conference on Parallel and Distributed Computing (Euro-Par), Lecture Notes in Computer Science, vol. 2790, pp. 169-180, 2003.
[3] T. D. Braun, H. J. Siegel, N. Beck, L. L. Boloni, M. Maheswaran, A. I. Reuther, J. P. Robertson, M. D. Theys, B. Yao, D. Hensgen, and R. F. Freund, "A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems," Journal of Parallel and Distributed Computing, vol. 61, issue 6, pp. 810-837, Jun. 2001.
[4] M. Maheswaran, S. Ali, H. J. Siegel, D. Hensgen, and R. F. Freund, "Dynamic Matching and Scheduling of a Class of Independent Tasks onto Heterogeneous Computing Systems," 8th Heterogeneous Computing Workshop, pp. 30-44, Apr. 1999.
[5] R. Armstrong, D. Hensgen, and T. Kidd, "The relative performance of various mapping algorithms is independent of sizable variances in run-time predictions," 7th IEEE Heterogeneous Computing Workshop (HCW '98), Mar. 1998, pp. 79-87.
[6] R. F. Freund, M. Gherrity, S. Ambrosius, M. Campbell, M. Halderman, D. Hensgen, E. Keith, T. Kidd, M. Kussow, J. D. Lima, F. Mirabile, L. Moore, B. Rust, and H. J. Siegel, "Scheduling resources in multi-user, heterogeneous, computing environments with SmartNet," 7th IEEE Heterogeneous Computing Workshop (HCW '98), Mar. 1998, pp. 184-199.
[7] F. Xhafa and A. Abraham (eds.), Metaheuristics for Scheduling in Distributed Computing Environments. Springer, Berlin, 2008.
[8] A. A. Khokhar, V. K. Prasanna, M. E. Shaaban, and C. L. Wang, "Heterogeneous Computing: Challenges and Opportunities," IEEE Computer, vol. 26, pp. 18-27, June 1993.
[9] T. D. Braun, H. J. Siegel, N. Beck, L. L. Boloni, M. Maheswaran, A. I. Reuther, J. P. Robertson, M. D. Theys, B. Yao, D. Hensgen, and R. F. Freund, "A Comparison Study Mapping Heuristics for a Class of Meta-tasks on Heterogeneous Computing Systems," 8th IEEE Heterogeneous Computing Workshop, pp. 15-29, 1999.