Classifier MCT for immediate mode independent task scheduling in Computational Grid

International Journal of Engineering Trends and Technology (IJETT) – Volume 4 Issue 6- June 2013
Classifier MCT for immediate mode independent
task scheduling in Computational Grid
1Gaurav Sharma, 2Puneet Banga
1Assistant Professor
Department of Computer Science and Engineering, JMIT, Radaur
Yamuna Nagar, Haryana, India
2Research Scholar
Department of Computer Science and Engineering, JMIT, Radaur
Yamuna Nagar, Haryana, India
Abstract— Grid computing consists of coordinating and sharing
computational power, storage and network resources across
geographically dispersed locations [1]. Scheduling each task to
the best resource is a difficult problem, and how well it is
solved determines the efficiency of the whole Grid system.
Meta-scheduling plays a crucial role in scheduling the tasks
submitted for execution and requires special attention because
an increasing number of tasks are being executed using a limited
number of resources [2]. The crucial task of meta-scheduling is
to select the best resources to execute the underlying tasks
while still achieving the following objectives: reducing the
processing time, decreasing the makespan, increasing the overall
throughput of the system [3], ensuring a high average resource
utilization rate and considering task requirements.
Keywords— Task scheduling, MET, MCT, Immediate mode,
Batch mode mapping.
I. INTRODUCTION
Scheduling is one of the most important and crucial steps in a
Grid environment. The major objective functions used to assess
a grid scheduler's performance are resource utilization rate,
makespan and matching. Makespan is the time difference
between the start time of the first task and the finish time of
the last task [3] [4]. It can also be treated as turnaround time,
i.e., the time gap between the submission of the first task and
the completion of the last task. Makespan is thus a measure of
the throughput of the heterogeneous computing system, and the
objective of a grid scheduling algorithm is to minimize it. It is
well known that the problem of deciding on an optimal
assignment of jobs to resources is NP-complete. Another
important parameter is resource utilization, which is based on
the idle time of resources; effective job scheduling is measured
by high resource throughput.
The motivation behind this research work is to design and
implement a new scheduling algorithm [4] under
immediate/online mode for independent task scheduling so as
to minimize makespan and maximize the average resource
utilization rate. Generally, Grid scheduling algorithms based on
mapping heuristics can be classified into two major groups:
Immediate/Online mode and Batch/Offline mode. In Immediate
mode, a task is mapped as soon as it arrives in the system. MET
and MCT [4] [5] are the best-known immediate mode heuristics.
In Batch mode, tasks are grouped together into a Meta-task
(MT), and the batch is scheduled at predefined times called
mapping events. Many mapping heuristics have been proposed
for batch mode scheduling, among which Min-Min, Max-Min
and Suffrage are the simplest and most popular. Unlike
traditional scheduling algorithms, which make steady decisions
in order to assign a single task to a resource, our proposed
algorithm classifies each task by its task ratio, i.e., whether it is
CPU bound or I/O bound.
ISSN: 2231-5381
II. GRID SCHEDULING
Task scheduling is a fundamental issue in achieving high
performance in grid computing systems. In simple terms,
scheduling is the process of mapping submitted tasks to the
available resources, i.e., assigning job(s) to selected
resource(s). The performance of a grid can be improved by
reducing the job processing time and by making sure that no
grid resource [6] sits idle, which amounts to balancing the load
among resources. A related term is meta-scheduling, the
process of deciding the best site locations for job execution
within a distributed environment [8].
We can categorize scheduling into two types:
1. Static scheduling
In static scheduling, information such as the number of tasks
and resources and their requirements, e.g., task length in MI
(Million Instructions), volume of data (in Mb), resource speed
in MIPS (Million Instructions Per Second) and bandwidth
(Mbps), is known a priori. The task MI information is based on
prediction models or on assumptions, which may not always be
accurate [2].
2. Dynamic scheduling
In dynamic scheduling, the number of tasks and resources is
not certain: how many tasks and resources are present at any
given time is not predictable. Some or all scheduling decisions
are made during the execution of a schedule. In a Grid, the
speed of each processor varies over time, so dynamic
scheduling is more appropriate than static scheduling [2].
III. IMMEDIATE MODE SCHEDULING
In Immediate mode, a task is mapped as soon as it arrives at
the mapper. In on-line mode heuristics, each task is considered
only once [7] [8] for matching and scheduling, i.e., the mapping
is not altered once it is computed. The two oldest and
best-known mapping heuristics of this category are:
1. MET (Minimum Execution Time)
In this method, the minimum execution time is used to assign
the task, without considering resource availability: each task is
assigned to the resource on which it can be executed in
minimum time [5] [6]. Allocating tasks without considering
resource availability results in load imbalance across grid
resources. The MET heuristic runs in O(nm) time, where n
denotes the number of independent tasks and m denotes the
number of allocated resources.
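As a rough illustration of the MET rule, the following Python sketch maps tasks in arrival order using the ETC values of Table 1 (Section V). The function name `met_schedule` and the dictionary layout are assumptions made for this sketch; it is not the authors' Core Java implementation.

```python
# ETC[task][resource]: expected time to compute, taken from Table 1.
ETC = {
    "T1": {"R1": 5, "R2": 3},
    "T2": {"R1": 12, "R2": 7},
    "T3": {"R1": 10, "R2": 5},
    "T4": {"R1": 9, "R2": 5},
    "T5": {"R1": 18, "R2": 11},
    "T6": {"R1": 6, "R2": 3},
}


def met_schedule(etc):
    """Map each task, in arrival order, to the resource with the
    smallest execution time, ignoring when that resource is free."""
    ready = {r: 0 for r in next(iter(etc.values()))}  # busy time per resource
    mapping = {}
    for task, times in etc.items():
        best = min(times, key=times.get)  # availability is not consulted
        mapping[task] = best
        ready[best] += times[best]
    return mapping, max(ready.values())


mapping, makespan = met_schedule(ETC)
print(mapping)   # every task is mapped to R2
print(makespan)  # 34
```

On this data R2 has the smaller execution time for every task, so R1 stays idle, which is the load imbalance described above.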
Pseudo code for MET
//------------------------------------------------------------------
Step 1: FOR all the tasks ti
Step 2:     FOR all machines mj
Step 3:         Find the machine mq on which task ti has the
                minimum execution time E(i,j)
            [End step 2 loop]
Step 4:     Schedule task ti to machine mq
        [End step 1 loop]
Step 5: Compute the parameter like Makespan etc.
//------------------------------------------------------------------
2. MCT (Minimum Completion Time)
MCT is another well-known immediate/online mapping
heuristic. In MCT, each task is assigned to the resource that
gives the minimum completion time Cj (ready time of the
resource + task execution time on that resource) for the task
[5][6]. Allocating tasks in this manner may result in some jobs
being executed on slower grid machines. The MCT heuristic
also runs in O(nm) time, where n denotes the number of
independent tasks and m denotes the number of allocated
resources.
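The MCT rule can be sketched in Python as follows, again on the ETC values of Table 1 (Section V). The name `mct_schedule`, the arrival order T1 to T6 and the assumption that both resources are idle at time 0 are choices made for this sketch only.

```python
# ETC[task][resource]: expected time to compute, taken from Table 1.
ETC = {
    "T1": {"R1": 5, "R2": 3},
    "T2": {"R1": 12, "R2": 7},
    "T3": {"R1": 10, "R2": 5},
    "T4": {"R1": 9, "R2": 5},
    "T5": {"R1": 18, "R2": 11},
    "T6": {"R1": 6, "R2": 3},
}


def mct_schedule(etc):
    """Map each task, in arrival order, to the resource that gives the
    smallest completion time (ready time + execution time)."""
    ready = {r: 0 for r in next(iter(etc.values()))}  # assumed idle at t=0
    mapping = {}
    for task, times in etc.items():
        completion = {r: ready[r] + times[r] for r in times}
        best = min(completion, key=completion.get)
        mapping[task] = best
        ready[best] = completion[best]  # update availability of chosen resource
    return mapping, max(ready.values())


mapping, makespan = mct_schedule(ETC)
print(mapping)   # T3 and T6 go to R1, the rest to R2
print(makespan)  # 26
```

Here T3 and T6 are sent to the slower R1 because R2 is busy when they arrive, which is the effect noted above: completion time, not raw speed, drives the choice.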
Pseudo code for MCT
//------------------------------------------------------------------
Step 1: FOR all the tasks ti
Step 2:     FOR all machines mj
Step 3:         Compute Cij = E(i,j) + ready time of mj
            [End step 2 loop]
Step 4:     Find the machine mq which will finish task ti
            earliest
Step 5:     Schedule task ti to machine mq
Step 6:     Update machine mq availability time
        [End step 1 loop]
Step 7: Compute the parameter like Makespan etc.
//------------------------------------------------------------------
IV. PROPOSED ALGORITHM
In this section, we discuss the proposed scheduling algorithm,
Classifier Minimum Completion Time (CMCT), which belongs
to the immediate mapping category.
The conventional MCT scheduling algorithm does not
consider task requirements, which affects the performance of
the overall Grid. There are two types of tasks in a Grid:
computation based and communication based. Communication
based jobs, such as transferring a file from one node to another,
require high bandwidth. Computation based jobs, such as
solving scientific computation problems, require more clock
cycles to carry out the assigned task with minimum delay [9].
Pseudo code for CMCT (Proposed)
//--------------------------------------------------------------------
Step 1: Start
Step 2: Categorize the allocated resources into two classes,
        HIGH and LOW.
Step 3: Examine the category of the current task based on its
        task ratio.
Step 4: If the current task requires a larger number of clock
        cycles or its MIPS requirement is high then:
            Submit the task to “HIGH” class resources.
        Else
            Submit the task to “LOW” class resources.
        [End If]
Step 5: Apply MCT on the resources of the respective class.
Step 6: Are there more upcoming task(s) to be mapped? If
        Yes, then:
            Go to Step 3.
        Else
            Calculate the Makespan and Average Resource
            utilization rate.
        [End If]
Step 7: End
//--------------------------------------------------------------------
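The steps above can be sketched in Python as MCT restricted to the resources whose class matches the task's class. The names `cmct_schedule`, `task_class` and `resource_class` are hypothetical; the data are the ETC values of Table 1 with the resource classes and the Scenario 1 task types of Section V. The sketch assumes every class contains at least one resource.

```python
# ETC[task][resource]: expected time to compute, taken from Table 1.
ETC = {
    "T1": {"R1": 5, "R2": 3},
    "T2": {"R1": 12, "R2": 7},
    "T3": {"R1": 10, "R2": 5},
    "T4": {"R1": 9, "R2": 5},
    "T5": {"R1": 18, "R2": 11},
    "T6": {"R1": 6, "R2": 3},
}
RESOURCE_CLASS = {"R1": "LOW", "R2": "HIGH"}          # Section V
TASK_CLASS = {"T1": "LOW", "T2": "HIGH", "T3": "HIGH",  # Scenario 1
              "T4": "LOW", "T5": "HIGH", "T6": "LOW"}


def cmct_schedule(etc, task_class, resource_class):
    """Apply MCT only among the resources of the task's own class."""
    ready = {r: 0 for r in resource_class}  # resources assumed idle at t=0
    mapping = {}
    for task, times in etc.items():
        # Step 4: restrict candidates to the matching class
        # (assumes at least one resource per class).
        candidates = [r for r in times if resource_class[r] == task_class[task]]
        # Step 5: plain MCT within that class.
        completion = {r: ready[r] + times[r] for r in candidates}
        best = min(completion, key=completion.get)
        mapping[task] = best
        ready[best] = completion[best]
    return mapping, ready


mapping, ready = cmct_schedule(ETC, TASK_CLASS, RESOURCE_CLASS)
print(mapping)
print(ready)                # {'R1': 20, 'R2': 23}
print(max(ready.values()))  # makespan: 23
```

With one resource per class the MCT step has a single candidate; with several resources per class the same function picks the earliest-finishing one inside the class.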
If a computational task is submitted to a low-power resource
(in terms of MIPS or processing speed), it will not be served
effectively; similarly, if a communication based job is
submitted to a resource with a high speed CPU and low
bandwidth, it does not fully utilize the resource and its
completion time increases. We therefore propose an
independent task heuristic that considers the class of each task,
either HIGH or LOW computational, which effectively
maximizes resource utilization and also minimizes the
makespan, another important parameter in a computational
grid. The new scheduling heuristic is based on MCT. The
mapping heuristic first finds out the type of job, i.e., HIGH or
LOW computation based. Depending on this criterion, the job
is assigned to the required class of resources.
In our proposed algorithm, we first categorize the allocated
resources into two classes: the HIGH class, to which HIGH
computational task(s) are scheduled, and the LOW class, to
which LOW computational task(s) are scheduled. Each
incoming task is examined according to its task ratio. The
proposed algorithm determines not only the task type but also
the task ratio, which indicates the extent to which a task is
HIGH or LOW computational. Each task has its own ratio,
which is composed of two values. The first value indicates the
task's computation execution requirement, and the second value
indicates its data-intensive I/O requirement. For example, a
task with a ratio of 3:1 requires 75% computational execution
time and 25% data access time or I/O requirements, so the
proposed CMCT classifies such a task as HIGH computational.
As a second example, a task with a ratio of 1:3 requires 25%
computational time and 75% data access time or I/O
requirements. In this case the proposed algorithm assigns it to
the LOW computational class. MCT (Minimum Completion
Time) is then applied to the HIGH or LOW set of resources
accordingly: it assigns each task, in arbitrary order, to the
machine of the matching class with the minimum expected
completion time (ready time of the machine + task execution
time on the selected machine) for that task.
Like MET and MCT, the proposed heuristic runs in O(nm)
time.
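The task-ratio examples above can be checked with a small Python sketch. The function name `classify` and the 50% threshold are assumptions: the text only states that a 3:1 task is HIGH and a 1:3 task is LOW, and does not say how a 1:1 task would be treated.

```python
def classify(compute_part, io_part, threshold=0.5):
    """Return (class, computation share) for a task ratio compute:io.

    The threshold is an assumption of this sketch, not a value
    given in the paper.
    """
    share = compute_part / (compute_part + io_part)
    label = "HIGH" if share > threshold else "LOW"
    return label, share


print(classify(3, 1))  # ('HIGH', 0.75)
print(classify(1, 3))  # ('LOW', 0.25)
```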
V. AN ILLUSTRATIVE EXAMPLE
Table 1: ETC Matrix
Task/Resource    R1    R2
T1                5     3
T2               12     7
T3               10     5
T4                9     5
T5               18    11
T6                6     3
Each task is classified as either HIGH or LOW, and the same
applies to resources. In our proposed mapping heuristic, the
Meta-Scheduler examines whether the upcoming task is of the
HIGH or LOW computational category based on its task ratio.
Each task has its own ratio, which is composed of two values.
The first value indicates the task's computation execution
requirement, and the second value indicates its data-intensive
I/O requirement. For example, a task with a ratio of 3:1
requires 75% computational execution time and 25% data
access time or I/O requirements, so in this case the proposed
CMCT classifies the task as HIGH computational.
If the upcoming task is of HIGH class, it is scheduled to the
set of HIGH class resources, and traditional MCT is applied
within that set; the Makespan and average resource utilization
rate are then computed from the resulting schedule.
In the implementation, the proposed heuristic uses 1 to denote
the HIGH class and 2 to denote the LOW class.
According to Table 1, there are two resources, R1 and R2. R1
is defined as a LOW class resource and R2 as a HIGH class
resource.
Let us take three scenarios based on the percentage of HIGH
and LOW computational class tasks.
Scenario 1: 50% of tasks are of HIGH computation class.
Task Type: T1: 2; T2: 1; T3: 1; T4: 2; T5: 1; T6: 2;
Scenario 2: 67% of tasks are of HIGH computation class.
Task Type: T1: 1; T2: 2; T3: 1; T4: 2; T5: 1; T6: 1;
Scenario 3: 83% of tasks are of HIGH computation class.
Task Type: T1: 1; T2: 1; T3: 1; T4: 1; T5: 2; T6: 1;
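Because each class holds a single resource in this example, every task has only one candidate under CMCT, so the load on each resource is simply the sum of the ETC entries of the tasks in its class. The Python sketch below works this out for the three scenarios. The variable names are hypothetical, and the printed values are derived from Table 1 by this sketch; they are not figures quoted from the paper's graphs.

```python
# ETC[task][resource]: expected time to compute, taken from Table 1.
ETC = {
    "T1": {"R1": 5, "R2": 3},
    "T2": {"R1": 12, "R2": 7},
    "T3": {"R1": 10, "R2": 5},
    "T4": {"R1": 9, "R2": 5},
    "T5": {"R1": 18, "R2": 11},
    "T6": {"R1": 6, "R2": 3},
}
# Type code 1 = HIGH (served by R2), 2 = LOW (served by R1).
RESOURCE_OF_TYPE = {1: "R2", 2: "R1"}
SCENARIOS = {
    "Scenario 1": {"T1": 2, "T2": 1, "T3": 1, "T4": 2, "T5": 1, "T6": 2},
    "Scenario 2": {"T1": 1, "T2": 2, "T3": 1, "T4": 2, "T5": 1, "T6": 1},
    "Scenario 3": {"T1": 1, "T2": 1, "T3": 1, "T4": 1, "T5": 2, "T6": 1},
}


def loads(task_types):
    """Total busy time per resource when each task runs on its class's resource."""
    load = {"R1": 0, "R2": 0}
    for task, type_code in task_types.items():
        resource = RESOURCE_OF_TYPE[type_code]
        load[resource] += ETC[task][resource]
    return load


results = {name: loads(types) for name, types in SCENARIOS.items()}
for name, load in results.items():
    print(name, load, "makespan =", max(load.values()))
# Scenario 1 {'R1': 20, 'R2': 23} makespan = 23
# Scenario 2 {'R1': 21, 'R2': 22} makespan = 22
# Scenario 3 {'R1': 18, 'R2': 23} makespan = 23
```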
VI. SIMULATION RESULTS
This section presents the results obtained by executing the
code on the example discussed in Section V (Table 1). The
comparison shows that the proposed CMCT algorithm gives
better results than the other immediate mode algorithms: in
scenarios 1 to 3, CMCT outperforms MET and MCT. The
proposed task scheduling algorithm minimizes the makespan
and achieves a higher average resource utilization rate, and thus
provides better QoS to the task than the other scheduling
algorithms in a Grid computing environment. The other
algorithms consider only the execution time of a task on each
resource, based on the task length defined in MI and the
resource's MIPS. Our proposed work also considers the task
class, which depends on the task ratio, i.e., whether the task has
a HIGH or LOW computational requirement. By considering
this feature, the proposed algorithm provides better results than
MET and MCT in terms of lower makespan and higher average
resource utilization rate.
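The two metrics can be illustrated with a short Python sketch. The paper does not give a formula for the average resource utilization rate, so the definition used here (mean over resources of busy time divided by makespan) is an assumption. The busy times are derived by applying each heuristic to Table 1 with tasks arriving in the order T1 to T6 and the Scenario 1 task types; they are not values read from the paper's graphs.

```python
# Busy time per resource for Scenario 1, derived from Table 1
# (assumed: tasks arrive in order T1..T6, resources idle at t=0).
BUSY = {
    "MET": {"R1": 0, "R2": 34},
    "MCT": {"R1": 16, "R2": 26},
    "CMCT": {"R1": 20, "R2": 23},
}


def makespan(busy):
    """Finish time of the last task, with no idle gaps on any resource."""
    return max(busy.values())


def average_utilization(busy):
    """Assumed definition: mean of (busy time / makespan) over all resources."""
    span = makespan(busy)
    return sum(t / span for t in busy.values()) / len(busy)


rates = {name: average_utilization(b) for name, b in BUSY.items()}
for name, rate in rates.items():
    print(f"{name}: makespan={makespan(BUSY[name])}, utilization={rate:.3f}")
# MET: makespan=34, utilization=0.500
# MCT: makespan=26, utilization=0.808
# CMCT: makespan=23, utilization=0.935
```

Under these assumptions the ordering matches the claim above: CMCT has the lowest makespan and the highest utilization of the three.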
Graph 1.1.1: Makespan for scenario 1
Graph 1.1.2: Average Resource Utilization Rate (1)
Graph 1.2.1: Makespan for scenario 2
Graph 1.2.2: Average Resource Utilization Rate (2)
Graph 1.3.1: Makespan for scenario 3
Graph 1.3.2: Average Resource Utilization Rate (3)
VII. EXPERIMENT SETUP
The algorithms discussed above, MET, MCT and CMCT
(proposed), are compared on the basis of Makespan and
average resource utilization rate. The code is implemented in
Core Java, with TextPad as the working environment. The
experiments were run on a Windows-based laptop with an Intel
i5 (2.5 GHz) processor and 4 GB RAM, to evaluate the
performance of online mode scheduling under different
scenarios in terms of the classification of tasks as examined by
the Meta-Scheduler based on task ratio. As shown in Graph
1.1.1 to Graph 1.3.2, our proposed algorithm (CMCT) yields a
higher average resource utilization rate and a lower makespan
than MET and MCT.
VIII. CONCLUSION AND FUTURE WORK
Independent task scheduling is still an active research topic.
We have highlighted one important scheduling mode:
online/immediate mode task scheduling. A Grid can be seen as
a powerful system that uses resources from different domains
distributed across geographical areas. To make the whole
system work efficiently, we have to consider optimal task
scheduling, and to achieve that goal we require suitable
resources, which can be obtained during the resource discovery
phase.
In future, the proposed algorithm can be tailored to different
classes, such as one for computational tasks and another for
I/O or data intensive tasks that require high bandwidth. It can
also be adopted in a dynamic environment in which information
is not known in advance.
REFERENCES
[1] I. Foster, C. Kesselman and S. Tuecke, “The Anatomy of the Grid:
Enabling Scalable Virtual Organizations,” International J.
Supercomputer Applications, 15(3), pp. 200-220, Fall 2001.
[2] D. Paranhos, W. Cirne, and F. Brasileiro, “Trading cycles for
information: Using replication to schedule bag-of-tasks applications on
computational grids,” International Conference on Parallel and
Distributed Computing (Euro-Par), Lecture Notes in Computer Science,
volume 2790, pp. 169-180, 2003.
[3] T. D. Braun, H. J. Siegel, N. Beck, L. L. Boloni, M. Maheswaran, A. I.
Reuther, J. P. Robertson, M. D. Theys, B. Yao, D. Hensgen, and R. F.
Freund, “A comparison of eleven static heuristics for mapping a class of
independent tasks onto heterogeneous distributed computing systems,”
Journal of Parallel and Distributed Computing, vol. 61, issue 6,
pp. 810-837, Jun. 2001.
[4] M. Maheswaran, S. Ali, H. J. Siegel, D. Hensgen, and R. F. Freund,
“Dynamic Matching and Scheduling of a Class of Independent Tasks
onto Heterogeneous Computing Systems,” The 8th Heterogeneous
Computing Workshop, pp. 30-44, Apr. 1999.
[5] R. Armstrong, D. Hensgen, and T. Kidd, “The relative performance of
various mapping algorithms is independent of sizable variances in
run-time predictions,” 7th IEEE Heterogeneous Computing Workshop
(HCW '98), Mar. 1998, pp. 79-87.
[6] R. F. Freund, M. Gherrity, S. Ambrosius, M. Campbell, M. Halderman,
D. Hensgen, E. Keith, T. Kidd, M. Kussow, J. D. Lima, F. Mirabile, L.
Moore, B. Rust, and H. J. Siegel, “Scheduling resources in multi-user,
heterogeneous, computing environments with SmartNet,” 7th IEEE
Heterogeneous Computing Workshop (HCW '98), Mar. 1998,
pp. 184-199.
[7] F. Xhafa and A. Abraham (eds.), Metaheuristics for scheduling in
distributed computing environments, Springer, Berlin, 2008.
[8] A. A. Khokhar, V. K. Prasanna, M. E. Shaaban, and C. L. Wang,
“Heterogeneous Computing: Challenges and Opportunities,” IEEE
Computer, vol. 26, pp. 18-27, June 1993.
[9] T. D. Braun, H. J. Siegel, N. Beck, L. L. Boloni, M. Maheswaran, A. I.
Reuther, J. P. Robertson, M. D. Theys, B. Yao, D. Hensgen, and R. F.
Freund, “A Comparison Study Mapping Heuristics for a Class of
Meta-tasks on Heterogeneous Computing Systems,” 8th IEEE
Heterogeneous Computing Workshop, pp. 15-29, 1999.
ISSN: 2231-5381
http://www.ijettjournal.org