Partial Critical Path

Cost-driven Scheduling of Grid
Workflows Using Partial Critical
Paths
28 October 2010
Saeid Abrishami and Mahmoud Naghibzadeh
Ferdowsi University of Mashhad
Mashhad, Iran
Dick Epema
Delft University of Technology
Delft, the Netherlands
Grid 2010, Brussels, Belgium
Introduction
• Utility Grids versus Community Grids
• main difference: QoS and SLAs
• Workflows: a common application type in distributed systems
• The Workflow Scheduling Problem
• community grids:
many heuristics try to minimize the
makespan of the workflow
• utility grids:
QoS attributes other than execution time,
e.g., economic cost, play a role, so it is a
multi-objective problem
• We propose a new QoS-based workflow scheduling algorithm
called Partial Critical Paths (PCP)
Partial Critical Paths
The PCP Algorithm: main idea
• The PCP Algorithm tries to create a schedule that
1. minimizes the total execution cost of a workflow
2. while satisfying a user-defined deadline
• The PCP Algorithm
1. first schedules the (overall) critical path of the workflow such that
a. its execution cost is minimized
b. it completes before the user’s deadline
2. then finds the partial critical path leading to each scheduled task
on the critical path and applies the same procedure recursively
[Figure: a workflow graph with its overall critical path and a partial critical path marked]
Scheduling System Model (1)
• Workflow Model
• an application is modeled by a directed acyclic graph G(T,E)
• T is a set of n tasks {t1, t2, …, tn}
• E is a set of arcs, each connecting two tasks
• each arc ei,j = (ti, tj) represents a precedence constraint
• dummy tasks: tentry and texit
[Figure: example workflow graph with tasks t1 to t6, dummy tasks tentry and texit, and the arc e1,4 labeled]
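The workflow model above can be sketched in a few lines of Python. This is only an illustration: the helper names `add_dummy_tasks` and `parents_of`, the identifiers `t_entry` and `t_exit`, and the arc list are assumptions made for the example, since the slide's figure does not survive in text form.

```python
from collections import defaultdict


def add_dummy_tasks(tasks, arcs, entry="t_entry", exit_="t_exit"):
    """Return (tasks, arcs) extended with one dummy entry and one dummy exit task."""
    has_parent = {j for _, j in arcs}
    has_child = {i for i, _ in arcs}
    new_arcs = list(arcs)
    # Every task without a parent becomes a child of the dummy entry task.
    new_arcs += [(entry, t) for t in tasks if t not in has_parent]
    # Every task without a child becomes a parent of the dummy exit task.
    new_arcs += [(t, exit_) for t in tasks if t not in has_child]
    return [entry, *tasks, exit_], new_arcs


def parents_of(arcs):
    """Map each task to the list of its parents (precedence constraints)."""
    parents = defaultdict(list)
    for i, j in arcs:
        parents[j].append(i)
    return parents


# Hypothetical arc set; each arc (ti, tj) means ti must finish before tj starts.
tasks = ["t1", "t2", "t3", "t4", "t5", "t6"]
arcs = [("t1", "t4"), ("t2", "t4"), ("t2", "t5"), ("t3", "t6")]
all_tasks, all_arcs = add_dummy_tasks(tasks, arcs)
```

With the dummy tasks in place, every real task has at least one parent and one child, so later steps can start from a single exit node.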
Scheduling System Model (2)
• Utility Grid Model
• Grid Service Providers (GSPs)
• Each task can be processed by a number of services on different GSPs
• ET(ti,s) and EC(ti,s)
• estimated execution time and execution cost for processing task ti on
service s
• TT(ei,j,r,s) and TC(ei,j,r,s)
• estimated transfer time and transfer cost of sending the required data
along ei,j from service s (processing task ti) to service r (processing task tj)
• Grid Market Directory (GMD)
Basic Definitions
• Minimum Execution Time (S_i is the set of services that can process t_i):
\[ MET(t_i) = \min_{s \in S_i} ET(t_i, s) \]
• Minimum Transfer Time:
\[ MTT(e_{i,j}) = \min_{s \in S_i,\; r \in S_j} TT(e_{i,j}, r, s) \]
(used for finding the partial critical paths)
• Earliest Start Time:
\[ EST(t_{entry}) = 0 \]
\[ EST(t_i) = \max_{t_p \in parents(t_i)} \left\{ EST(t_p) + MET(t_p) + MTT(e_{p,i}) \right\} \]
• SS(ti): the selected service for processing the scheduled task ti
• AST(ti): the actual start time of ti on its selected service
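The definitions of MET, MTT, and EST translate almost directly into code. The sketch below is a minimal illustration under assumptions: the function name `earliest_start_times`, the dictionary layouts for `ET` and `TT`, and all numbers are hypothetical, and the tasks are assumed to be supplied in topological order.

```python
def earliest_start_times(order, parents, ET, TT):
    """Compute MET, MTT and EST for a workflow.

    order   -- tasks in topological order, entry task first
    parents -- parents[t] is the list of parent tasks of t
    ET      -- ET[t][s]: execution time of task t on service s
    TT      -- TT[(p, t)][(s, r)]: transfer time along arc (p, t)
               from service s (running p) to service r (running t)
    """
    MET = {t: min(ET[t].values()) for t in order}
    MTT = {arc: min(times.values()) for arc, times in TT.items()}
    EST = {}
    for t in order:
        # max over an empty parent list defaults to 0, which gives EST(entry) = 0.
        EST[t] = max(
            (EST[p] + MET[p] + MTT[(p, t)] for p in parents.get(t, [])),
            default=0,
        )
    return MET, MTT, EST


# Hypothetical diamond-shaped workflow: entry -> a, b -> c -> exit
order = ["entry", "a", "b", "c", "exit"]
parents = {"a": ["entry"], "b": ["entry"], "c": ["a", "b"], "exit": ["c"]}
ET = {
    "entry": {"none": 0}, "exit": {"none": 0},  # dummy tasks take no time
    "a": {"s1": 4, "s2": 8},
    "b": {"s1": 6, "s2": 3},
    "c": {"s1": 5, "s2": 10},
}
TT = {
    ("entry", "a"): {("none", "s1"): 0},
    ("entry", "b"): {("none", "s1"): 0},
    ("a", "c"): {("s1", "s1"): 1, ("s2", "s1"): 2},
    ("b", "c"): {("s1", "s1"): 4, ("s2", "s1"): 3},
    ("c", "exit"): {("s1", "none"): 0},
}
MET, MTT, EST = earliest_start_times(order, parents, ET, TT)
print(EST)  # {'entry': 0, 'a': 0, 'b': 0, 'c': 6, 'exit': 11}
```

Task c starts at 6 because the path through b (0 + 3 + 3) is longer than the path through a (0 + 4 + 1), even though b itself is the faster task.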

The PCP Scheduling Algorithm
PROCEDURE ScheduleWorkflow(G(T,E), deadline)
1. Request available services for each task in T from GMD
2. Query available time slots for each service from related GSPs
3. Add tentry and texit and their corresponding edges to G
4. Compute MET(ti) for each task in G
5. Compute MTT(ei,j) for each edge in G
6. Compute EST(ti) for each task in G
7. Mark tentry and texit as scheduled
8. Set AST(tentry) = 0 and AST(texit) = deadline
9. Call ScheduleParents(texit)
10. If this procedure was successful, make advance reservations for
all tasks in G according to the schedule; otherwise return failure
ScheduleParents (1)
• The Critical Parent of a node t is the unscheduled parent p of t
for which EST(p)+MET(p)+MTT(ep,t) is maximal
[Figure: a node t and its Critical Parent p, annotated with the example values 126, 281, and 69]
• The Partial Critical Path of node t is:
• empty if t does not have unscheduled parents
• otherwise, the Critical Parent p of t together with the Partial
Critical Path of p
• Critical parent and partial critical path change over time
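The two definitions above can be sketched as a pair of small functions. The names `critical_parent` and `partial_critical_path`, the dictionary layout, and the numbers are assumptions chosen for illustration; the path is returned from its first task to the parent of t.

```python
def critical_parent(t, parents, scheduled, EST, MET, MTT):
    """Unscheduled parent p of t maximizing EST(p) + MET(p) + MTT(e_p,t), or None."""
    candidates = [p for p in parents.get(t, []) if p not in scheduled]
    if not candidates:
        return None
    return max(candidates, key=lambda p: EST[p] + MET[p] + MTT[(p, t)])


def partial_critical_path(t, parents, scheduled, EST, MET, MTT):
    """Partial critical path of t, listed from its first task to the parent of t."""
    path = []
    p = critical_parent(t, parents, scheduled, EST, MET, MTT)
    while p is not None:
        path.append(p)
        p = critical_parent(p, parents, scheduled, EST, MET, MTT)
    path.reverse()
    return path


# Hypothetical workflow: entry -> a, b -> c -> exit, with made-up timing values.
parents = {"a": ["entry"], "b": ["entry"], "c": ["a", "b"], "exit": ["c"]}
MET = {"entry": 0, "a": 4, "b": 3, "c": 5, "exit": 0}
MTT = {("entry", "a"): 0, ("entry", "b"): 0,
       ("a", "c"): 1, ("b", "c"): 3, ("c", "exit"): 0}
EST = {"entry": 0, "a": 0, "b": 0, "c": 6, "exit": 11}

scheduled = {"entry", "exit"}
first = partial_critical_path("exit", parents, scheduled, EST, MET, MTT)
scheduled.update(first)  # pretend the first path has now been scheduled
second = partial_critical_path("c", parents, scheduled, EST, MET, MTT)
print(first, second)  # ['b', 'c'] ['a']
```

The second call illustrates why the critical parent changes over time: once b and c are marked as scheduled, the only unscheduled parent left for c is a.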
ScheduleParents (2)
PROCEDURE ScheduleParents(t)
1. If t has no unscheduled parents then return success
2. Let CriticalPath be the partial critical path of t
3. Call SchedulePath(CriticalPath)
4. If this procedure is unsuccessful, return failure and a
suggested start time for the failed node (try to repair)
5. For all ti on CriticalPath /* from start to end */
Call ScheduleParents(ti)
6. Repeat from step 1 for the remaining unscheduled parents of t
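The control flow of ScheduleParents can be sketched as follows. This is a simplified illustration, not the authors' implementation: `schedule_path` is a caller-supplied stand-in for SchedulePath, the `rank` table stands in for EST(p) + MET(p) + MTT(ep,t), the repair step is omitted, and the graph and its values are hypothetical.

```python
def schedule_parents(t, parents, scheduled, rank, schedule_path):
    """Recursive driver; rank[(p, c)] stands in for EST(p) + MET(p) + MTT(e_p,c)."""
    while True:
        # Steps 1-2: follow critical parents upward to build the partial critical path.
        path, node = [], t
        while True:
            unscheduled = [p for p in parents.get(node, []) if p not in scheduled]
            if not unscheduled:
                break
            node = max(unscheduled, key=lambda p, child=node: rank[(p, child)])
            path.append(node)
        if not path:
            return True           # step 1: t has no unscheduled parents
        path.reverse()            # order the path from start to end
        # Steps 3-4: schedule the whole path; this sketch gives up on failure.
        if not schedule_path(path):
            return False
        scheduled.update(path)
        # Step 5: recurse into every task on the path.
        for ti in path:
            if not schedule_parents(ti, parents, scheduled, rank, schedule_path):
                return False
        # Step 6: loop again for the remaining unscheduled parents of t.


# Hypothetical workflow with entry S, tasks 1 to 9, and exit E.
parents = {"1": ["S"], "2": ["S"], "3": ["S"], "4": ["1"], "5": ["2"],
           "6": ["2", "3"], "7": ["4"], "8": ["5"], "9": ["6", "8"],
           "E": ["7", "9"]}
rank = {("1", "4"): 5, ("2", "5"): 5, ("2", "6"): 10, ("3", "6"): 8,
        ("4", "7"): 15, ("5", "8"): 10, ("6", "9"): 20, ("8", "9"): 15,
        ("7", "E"): 25, ("9", "E"): 30}

log = []


def record(path):
    """Stand-in for SchedulePath: remember the path and report success."""
    log.append(list(path))
    return True


ok = schedule_parents("E", parents, {"S", "E"}, rank, record)
print(log)  # [['2', '6', '9'], ['3'], ['5', '8'], ['1', '4', '7']]
```

The log shows the order in which paths are handed to the path scheduler: the overall critical path first, then the partial critical paths hanging off it, then the next path into the exit node.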
SchedulePath (1)
• SchedulePath tries to find the cheapest schedule for a Path
without violating the actual start times of the scheduled
children of the tasks on Path
• SchedulePath is based on a backtracking strategy
• A selected service for a task is admissible if the actual start
times of the scheduled children of that task can be met
SchedulePath (2)
• Moves from the first task in Path to the last task
• For each task, it selects an untried available service
• If the selected service creates an admissible (partial) schedule, then
it moves forward to the next task, otherwise it selects another
untried service for that task
• If there is no available untried service for that task left, then it
backtracks to the previous task on the path and selects another
service for it
• This may lead to failure
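The backtracking search described above can be sketched under strong simplifying assumptions: tasks on the path run back to back, transfer times and service time slots are ignored, costs are non-negative, and "admissible" is reduced to finishing no later than a single `latest_finish` value. The function name, the tuple layout of `services`, and the numbers are hypothetical.

```python
def schedule_path(path, services, start, latest_finish):
    """Cheapest admissible assignment for a path, found by backtracking.

    services[t] is a list of (service_name, exec_time, cost) tuples.
    Returns (total_cost, [service_name, ...]) or None if no assignment fits.
    """
    best = None
    chosen = []

    def extend(i, now, cost):
        nonlocal best
        if i == len(path):
            if best is None or cost < best[0]:
                best = (cost, list(chosen))
            return
        for name, exec_time, price in services[path[i]]:
            finish = now + exec_time
            if finish > latest_finish:
                continue  # inadmissible: the scheduled child could not start in time
            if best is not None and cost + price >= best[0]:
                continue  # cannot beat the cheapest schedule found so far
            chosen.append(name)
            extend(i + 1, finish, cost + price)  # move forward to the next task
            chosen.pop()                         # backtrack and try another service

    extend(0, start, 0)
    return best


services = {
    "a": [("fast", 2, 10), ("slow", 5, 4)],
    "b": [("fast", 3, 12), ("slow", 6, 5)],
}
print(schedule_path(["a", "b"], services, start=0, latest_finish=9))
# (15, ['fast', 'slow'])
print(schedule_path(["a", "b"], services, start=0, latest_finish=4))
# None
```

In the first call the cheapest pair (slow, slow) would finish at time 11 and is rejected, so the search settles on (fast, slow); in the second call no combination fits, which corresponds to the failure case.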
An Example (1)
Start: Call ScheduleParents(E)
[Figure: example workflow graph with entry node S, tasks 1 to 9, and exit node E]
An Example (2)
Find the Partial Critical Path for node E
(this is the overall critical path of the workflow)
An Example (3)
Call SchedulePath for path 2-6-9
Call ScheduleParents for nodes 2, 6, and 9, respectively
An Example (4)
Node 2 has no unscheduled parents,
so its partial critical path is empty
An Example (5)
Node 6: find its partial critical path and then call SchedulePath
ScheduleParents is called for node 3 but it has no unscheduled parents
If SchedulePath cannot schedule this path, then it returns failure, which causes the
path 2-6-9 to be rescheduled.
An Example (6)
Node 9: find its partial critical path and then call SchedulePath
ScheduleParents is called for the nodes 5 and 8 but they have no unscheduled parents
If SchedulePath cannot schedule this path, then it returns failure, which causes the
path 2-6-9 to be rescheduled.
An Example (7)
Now scheduling of the path 2-6-9 has been finished, and
ScheduleParents is called again for node E to find its
next partial critical path and to schedule that path
Performance Evaluation
Experimental Setup (1): the system
• Simulation Software: GridSim
• Grid Environment: DAS-3, a multicluster grid in the Netherlands
• 5 clusters (32-85 nodes)
• Average inter-cluster bandwidth: between 10 and 512 MB/s
• Processor speed: changed to create a factor-of-10 difference
between the fastest and the slowest cluster
• Processor price: fictitious prices have been assigned to each cluster
(a faster cluster has a higher price)
Performance Evaluation
Experimental Setup (2): the workflows
• Five synthetic workflow applications that are based on
real scientific workflows (see next page)
• Montage
• CyberShake
• Epigenomics
• LIGO
• SIPHT
• Three sizes for each workflow:
• small (about 30 tasks)
• medium (about 100 tasks)
• large (about 1000 tasks)
• Each task can be executed on every cluster
Performance Evaluation
Experimental Setup (3): the workflows
[Figure: structure of the five workflows: LIGO, Montage, SIPHT, Epigenomics, and CyberShake]
Performance Evaluation
Experimental Setup (4): metrics
• Three scheduling algorithms to schedule each workflow:
• HEFT: a well-known makespan minimization algorithm
• Fastest: submits all tasks to the fastest cluster
• Cheapest: submits all tasks to the cheapest (and slowest) cluster
• The Normalized Cost (NC) and the Normalized Makespan (NM) of a
workflow:
\[ NC = \frac{\text{total schedule cost}}{\mathrm{CC}}, \qquad NM = \frac{\text{schedule makespan}}{\mathrm{MH}} \]
• CC: the cost of executing that workflow with Cheapest
• MH: the makespan of executing that workflow with HEFT
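As a small worked example of the two metrics, the sketch below computes NC and NM for one schedule. The function names and all numbers are hypothetical; they are not results from the experiments.

```python
def normalized_cost(total_schedule_cost, cheapest_cost):
    """NC: schedule cost relative to CC, the cost obtained with Cheapest."""
    return total_schedule_cost / cheapest_cost


def normalized_makespan(schedule_makespan, heft_makespan):
    """NM: schedule makespan relative to MH, the makespan obtained with HEFT."""
    return schedule_makespan / heft_makespan


# Hypothetical schedule: cost 150 against CC = 100, makespan 250 against MH = 200.
nc = normalized_cost(150, 100)
nm = normalized_makespan(250, 200)
print(nc, nm)  # 1.5 1.25
```

Here the schedule costs 1.5 times as much as running everything on the cheapest cluster and takes 1.25 times as long as the HEFT schedule.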
Performance Evaluation
Experimental Results (1)
Normalized Makespan (left) and Normalized Cost (right) of
scheduling workflows with HEFT, Fastest and Cheapest
Performance Evaluation
Experimental Results (2)
deadline = deadline-factor × MH
Normalized Makespan (left) and Normalized Cost (right) of
scheduling small workflows with the Partial Critical Paths algorithm
Performance Evaluation
Experimental Results (3)
Normalized Makespan (left) and Normalized Cost (right) of
scheduling medium workflows with the Partial Critical Paths algorithm
Performance Evaluation
Experimental Results (4)
Normalized Makespan (left) and Normalized Cost (right) of
scheduling large workflows with the Partial Critical Paths algorithm
Performance Evaluation
Comparison to Other Algorithms (1)
• One of the most cited algorithms in this area was proposed
by Yu et al.:
• divide the workflow into partitions
• assign each partition a sub-deadline according to the minimum
execution time of each task and the overall deadline of the workflow
• try to minimize the cost of execution of each partition under the sub-deadline constraints
Performance Evaluation
Comparison to Other Algorithms (2): Cost
[Figure: cost comparison for CyberShake, Epigenomics, and LIGO; curves labeled "us" (PCP) and "them" (Yu et al.)]
Performance Evaluation
Comparison to Other Algorithms (3): Cost
[Figure: cost comparison for Montage and SIPHT]
Related Work
• Sakellariou et al. proposed two scheduling algorithms for
minimizing the execution time under budget constraints:
1. Initially schedule a workflow with minimum execution time, and
then refine the schedule until its budget constraint is satisfied
2. Initially assign each task to the cheapest resource, and then
refine the schedule to shorten the execution time under budget
constraints
Conclusions
• PCP: a new algorithm for workflow scheduling in utility grids
that minimizes the total execution cost while meeting a user-defined deadline
• Simulation results:
• PCP shows promising performance on small and medium workflows
• PCP’s performance on large workflows is variable and depends on the
structure of the workflow
• Future work:
• Extend the algorithm to support other economic grid models
• Enhance it for the cloud computing model
Information
• PDS group home page and publications database:
www.pds.ewi.tudelft.nl
• KOALA web site: www.st.ewi.tudelft.nl/koala
• Grid Workloads Archive (GWA): gwa.ewi.tudelft.nl
• Failure Trace Archive (FTA): fta.inria.fr