Task Scheduling with QoS Satisfaction in Hadoop Cloud Computing

Adaptive Scheduling with QoS
Satisfaction in Hybrid Cloud
Environment
Graduate student: 李羿慷
Advisor: Prof. 張玉山
Outline
1. Introduction
2. Related Work
3. Problem Definition and Formulation
4. Adaptive Scheduling Algorithm with QoS Satisfaction
5. Experiments and Discussion
6. Conclusions and Future Work
2
1. Introduction
• Cloud computing
– Huge data store and highly parallel computing
– Cloud services: SaaS, PaaS, IaaS
• Private cloud
– Control and security concerns
– One-time purchase and long-term maintenance
• Public cloud
– Flexible, scalable
– Pay-per-use
4
Introduction (cont.)
• Cloud environment workload status
– Ex: Yahoo! Video
5
Introduction (cont.)
• Hybrid Cloud
– Combine Private and Public cloud
– Private cloud
• Regular workload
• Constant maintenance cost
– Public cloud
• Transient overloading
• Pay-per-use (cost issue)
6
Introduction (cont.)
• Cloud services
– Efficiency
– Reliability
– Cost
• Quality of Service (QoS)
– Response time ↓
– Payment ↓
7
Introduction (cont.)
• Hybrid Cloud
– Guarantee user QoS demand
• Workload dispatching
– Private Cloud
• Maximize utilization
• Minimize execution time
– Public Cloud
• Minimize cost expense
8
Introduction (cont.)
• To improve QoS satisfaction in hybrid cloud
– We propose:
Adaptive Scheduling with QoS Satisfaction in
Hybrid Cloud Environment
9
MMKP
• Map the QoS satisfaction and cost functions onto an MMKP
– MMKP: Multi-dimension Multi-choice Knapsack Problem
– MMKP is known to be NP-complete
– Maximal utilization
– Minimal cost value
– QoS deadline constraint
10
Introduction (cont.)
• We solve the problem by finding a near-optimal heuristic solution in polynomial time
• Use dynamic programming to find the near-optimal heuristic solution
– Break the complex problem into smaller subproblems
• Use CloudSim to run the evaluation experiments
11
2. Related Work
• FIFO: first come, first served
– Most common
– Drawback: Convoy Effect
• Fair (Facebook), Capacity (Yahoo) Scheduling
– Solve multi-user problem in FIFO
– Ensure every task has approximately equal
computational resource/time
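The convoy effect named above can be illustrated with a small sketch. The job run times are hypothetical and the function name is ours; the point is only that one long job at the head of a FIFO queue inflates the waiting time of every job behind it.

```python
from itertools import accumulate


def average_wait(run_times):
    """Average waiting time when jobs run back to back in the given order."""
    # Each job waits for the total run time of all jobs ahead of it.
    starts = [0, *accumulate(run_times)][:-1]
    return sum(starts) / len(starts)


# Hypothetical run times (seconds); the long job arrives first.
jobs = [100, 2, 3]
fifo_wait = average_wait(jobs)                    # arrival order (FIFO)
shortest_first_wait = average_wait(sorted(jobs))  # short jobs first, for contrast
```

Shortest-first ordering is used here only as a contrast; it is not one of the schedulers discussed in the slides.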
13
Related Work (cont.)
• Intelligent Workload Factoring for A Hybrid
Cloud Computing Model
– Split the workload into two parts
– Base load and trespassing load (privately-owned
data center and public cloud service)
– Reduce data cache/replication overhead
– Does not support real-time, QoS-constrained computing
14
Related Work (cont.)
• GA-Based Task Scheduler for the Cloud
Computing Systems
– Genetic-algorithm-based task-level scheduling in Hadoop MapReduce
– Achieves better load balancing
– Uses a GA to search for the optimal decision
– Not designed for hybrid cloud
– Does not support QoS constraints
15
Related Work (cont.)
• Cost-Minimizing Scheduling of Workflows on a
Cloud of Memory Managed Multicore Machines
– Service-oriented architecture framework
– Cost function maps workflow tardiness to a corresponding cost value
– Minimizes the sum of cost function values over all workflows
– Cost function is not designed from the user's perspective
– Does not support hybrid cloud
16
3. Problem Definition and Formulation
• 3.1 Resource Slot Definition
• 3.2 Request Job Definition
• 3.3 Problem Formulation
18
3.1 Resource Slot Definition
• Private resource slot
– One node (machine) in the private cloud can host multiple slots
– Based on virtual machine infrastructure
– Each slot requires the computing capacity of one CPU
– Basic unit for handling a requested task
19
Resource Slot Definition (cont.)
Example of Private Slots
20
Resource Slot Definition (cont.)
• Resource slots have different computing abilities, depending on CPU speed, memory, etc.
– Unit: million instructions per second (MIPS)
• The private cloud replicates data between resource slots
21
Resource Slot Definition (cont.)
• Public Resource Slot
– Resources rented from a fee-charging public cloud provider
– Based on different instance types
– Charging policy unified as
• Computing price
• Storage price
• Data transfer price
22
Resource Slot Definition (cont.)
Example of Public Slots
23
3.2 Request Job Definition
• Target applications
– Internet-based applications
– Focus on data sets for certain kinds of distributable, parallel problems
– Ex: image and video rendering codes and highly
parallel data analysis codes
– Each application has a completion deadline
24
Example of
Request Jobs
25
3.3 Problem Formulation
• To guarantee the QoS deadline constraint
– Maximize private slot utilization
– Minimize task execution time
– Minimize cost value
26
Definition
• Deadline constraint:
– For Job Ji = {Vi1 ~ Vin}, and deadline Di
– Code size SCij for task Vij
– For private slot PrRk and computing ability Prμk, k
= 1 to m
$$\sum_{k=1}^{m} \sum_{j=1}^{n} \frac{SC_{ij}}{Pr\mu_k} \le D_i$$
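A minimal sketch of checking the deadline constraint, assuming it reads as the double sum of SC_ij / Prμ_k over all private slots k and tasks j, bounded by D_i. The function name and the sample numbers are hypothetical; units are assumed to be MI for code size, MIPS for computing ability, and seconds for the deadline.

```python
def meets_deadline(code_sizes, slot_mips, deadline):
    """Evaluate sum_k sum_j SC_ij / Prmu_k <= D_i literally as written."""
    total = sum(sc / mu for mu in slot_mips for sc in code_sizes)
    return total <= deadline


# Hypothetical job: two tasks (400 MI, 600 MI) and two private slots (10, 50 MIPS).
# Double sum = (400 + 600) / 10 + (400 + 600) / 50 = 120 seconds.
loose = meets_deadline([400, 600], [10, 50], deadline=150)
strict = meets_deadline([400, 600], [10, 50], deadline=100)
```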
27
Definition (cont.)
• Budget control
– For Job Ji = {Vi1 ~ Vin}, and cost budget Mi
– Code size SCij for task Vij
– Information data size SDij for task Vij
– For public slot PuRq and computing price xq
– For public slot PuRq and storage price yq
• q = 1 to m
$$\sum_{q=1}^{m} \sum_{j=1}^{n} \left( x_q \cdot SC_{ij} + y_q \cdot SD_{ij} \right) \le M_i$$
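The budget control condition can be sketched the same way, assuming it sums x_q · SC_ij + y_q · SD_ij over all public slots q and tasks j and compares the result with the budget M_i. Names and sample values are hypothetical.

```python
def within_budget(tasks, slots, budget):
    """tasks: list of (SC, SD) pairs; slots: list of (x, y) price pairs."""
    total = sum(x * sc + y * sd for (x, y) in slots for (sc, sd) in tasks)
    return total <= budget


# Hypothetical job: tasks given as (code size in MI, data size in MB).
job = [(400, 200), (600, 300)]
# One public slot priced at 0.1 $/MI for computing and 0.01 $/MB for storage.
public = [(0.1, 0.01)]
# Total is roughly 42 + 63 = 105.
ok = within_budget(job, public, budget=110)
too_expensive = within_budget(job, public, budget=100)
```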
28
Definition (cont.)
• Estimated Finish Time (Est)
– For task Vij on private slot PrRk
– Code size SCij for Vij
– Computing ability Prμk for PrRk
$$Est[k] = \frac{SC_{ij}[k]_{remain}}{Pr\mu_k}$$
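As a rough illustration of Est, the sketch below divides the remaining code size on each slot by that slot's computing ability. The slot table and all names are hypothetical; the result is the time until each slot becomes free.

```python
def estimated_finish_time(remaining_mi, mips):
    """Est[k]: remaining code size on slot k divided by its computing ability."""
    return remaining_mi / mips


# Hypothetical private slots: slot id -> (remaining MI, computing ability in MIPS).
slots = {1: (300, 20), 2: (0, 50), 3: (400, 10)}
est = {k: estimated_finish_time(rem, mips) for k, (rem, mips) in slots.items()}
earliest = min(est, key=est.get)  # slot that becomes available first
```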
29
Definition (cont.)
• Estimated Execution Time (EEt)
– For task Vij on private slot PrRk
– Code size SCij for Vij
– Computing ability Prμk for PrRk
• Data Transmission Time (Dtt)
– For task Vij with information data size SDij
– Network bandwidth NB
– Disk speed DSk on resource slot k
$$EEt[k, ij] = \frac{SC_{ij}}{Pr\mu_k} + Dtt, \qquad Dtt = \left( \frac{SD_{ij}}{NB} + \frac{SD_{ij}}{DS_k} \right) \times 2$$
30
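A small sketch of the EEt and Dtt definitions, assuming the reading EEt = SC_ij / Prμ_k + Dtt with Dtt = (SD_ij / NB + SD_ij / DS_k) × 2. The `is_local` flag reflects the rule stated in Section 4.2 that Dtt is 0 when the task's data already resides on the slot. Function names, units (MB and MB/s), and sample values are assumptions.

```python
def data_transmission_time(sd, nb, ds, is_local=False):
    """Dtt: network plus disk transfer time, doubled; zero if the data is local."""
    if is_local:
        return 0.0
    return (sd / nb + sd / ds) * 2


def estimated_execution_time(sc, mips, sd, nb, ds, is_local=False):
    """EEt[k, ij]: compute time on slot k plus data transmission time."""
    return sc / mips + data_transmission_time(sd, nb, ds, is_local)


# Hypothetical task: 400 MI of code, 200 MB of data, on a 20 MIPS slot,
# with network bandwidth 100 MB/s and disk speed 50 MB/s.
remote = estimated_execution_time(400, 20, 200, nb=100, ds=50)
local = estimated_execution_time(400, 20, 200, nb=100, ds=50, is_local=True)
```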
Definition (cont.)
• Cost Function (CostF)
– Code size SCij and information data size SDij for
task Vij
– Computing price xk, storage price yk, data transfer in price dtik and data transfer out price dtok for public resource slot PuRk
$$CostF = SC_{ij} \cdot x_k + SD_{ij} \cdot y_k + SD_{ij} \cdot \left( \frac{1}{dti_k} + \frac{1}{dto_k} \right)$$
31
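The cost function can be sketched as follows, assuming it reads CostF = SC_ij · x_k + SD_ij · y_k + SD_ij · (1/dti_k + 1/dto_k). The function name and sample prices are hypothetical.

```python
def cost_f(sc, sd, x, y, dti, dto):
    """Cost of running one task on a public slot: compute + storage + transfer."""
    compute = sc * x
    storage = sd * y
    transfer = sd * (1 / dti + 1 / dto)
    return compute + storage + transfer


# Hypothetical task (400 MI, 200 MB) on a slot with x = 0.1, y = 0.01,
# dti = 100 and dto = 50: roughly 40 + 2 + 6 = 48.
cost = cost_f(400, 200, x=0.1, y=0.01, dti=100, dto=50)
```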
MMKP
• Map our mathematical formulation onto the MMKP (NP-complete)
– MMKP: Multi-dimension Multi-choice Knapsack Problem
– Maximal utilization
– Minimal cost value
– QoS deadline constraint
• We solve the problem by finding a near-optimal heuristic solution in polynomial time
32
33
4. Adaptive Scheduling Algorithm
with QoS Satisfaction
• 4.1 Resource Needed Weight
• 4.2 Execution Time Estimation with Task on
Different Slots
• 4.3 Dynamic Programming for Dispatching to
Candidate Slots
• 4.4 Dispatch Selection from Slot Queue
• 4.5 Dynamic Programming for Minimal Cost
on Public Slot
35
4.1 Resource Needed Weight
• If multiple jobs arrive in the pool at the same time, they share the resources in proportion to their Resource Needed Weight Wi
• Unlike Fair Scheduling, the resource amount is guaranteed based on code size and deadline
– Wx : Wy : Wz gives the slot distribution ratio for jobs x, y, z
$$W_i = \sum_{j=1}^{N} \frac{SC_{ij}}{D_i}$$
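A minimal sketch of the weight computation, assuming W_i is the sum of a job's task code sizes divided by its deadline and that slots are split in proportion to the weights. Job data and function names are hypothetical; how fractional shares would be rounded is not specified in the slides, so the sketch leaves them as floats.

```python
def resource_needed_weight(code_sizes, deadline):
    """W_i: total code size of the job divided by its deadline."""
    return sum(code_sizes) / deadline


def slot_shares(jobs, total_slots):
    """Split total_slots among jobs in the ratio of their weights."""
    weights = {name: resource_needed_weight(sc, d) for name, (sc, d) in jobs.items()}
    total = sum(weights.values())
    return {name: total_slots * w / total for name, w in weights.items()}


# Hypothetical jobs: name -> (task code sizes in MI, deadline in seconds).
jobs = {"x": ([400, 600], 100), "y": ([500], 50), "z": ([1000, 1000], 100)}
shares = slot_shares(jobs, total_slots=8)  # weights are 10 : 10 : 20
```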
36
Resource Needed Weight (cont.)
[Figure: Fair Scheduling vs. AsQ]
37
4.2 Execution Time Estimation with
Task on Different Slots
• Collect the current status and information of private cloud resources
– Remaining code size
– Computation ability
• Calculate the estimated finish time (Est)
$$Est[k] = \frac{SC_{ij}[k]_{remain}}{Pr\mu_k}$$
• This shows when each slot will become available
38
Example of Est
39
Execution Time Estimation with Task
on Different Slots (cont.)
• Calculate the estimated execution time (EEt) of the current task on every private resource slot, k = 1 to m
– Estimated execution time
$$EEt[k, ij] = \frac{SC_{ij}}{Pr\mu_k} + Dtt$$
– Data transfer time
• If $V_{ij} \in L_k$, $Dtt = 0$
• Else if $V_{ij} \notin L_k$,
$$Dtt = \left( \frac{SD_{ij}}{NB} + \frac{SD_{ij}}{DS_k} \right) \times 2$$
40
Example of EEt
41
Execution Time Estimation with Task
on Different Slots (cont.)
• With Est and EEt, the slots that can finish the task before the deadline can be selected
• The slots that can meet the QoS (deadline) are collected into a candidate set
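The candidate-set selection can be sketched as a filter: a slot qualifies when its waiting time (Est) plus the task's run time on it (EEt) does not exceed the deadline. The dictionaries and the function name are hypothetical.

```python
def candidate_slots(est, eet, deadline):
    """Return the slots k with Est[k] + EEt[k] <= deadline."""
    return [k for k in est if est[k] + eet[k] <= deadline]


# Hypothetical values in seconds for three private slots.
est = {1: 15.0, 2: 0.0, 3: 40.0}  # time until each slot is free
eet = {1: 20.0, 2: 50.0, 3: 8.0}  # run time of the task on each slot
candidates = candidate_slots(est, eet, deadline=45.0)
```

Slot 2 is free immediately but too slow, and slot 3 is fast but busy for too long, so only slot 1 qualifies.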
Example of Est + EEt
42
Example of Overloading Dispatch
43
4.3 Dynamic Programming for
Dispatching to Candidate Slots
• The optimal scheduling problem has been mapped to the MMKP
• Use dynamic programming to solve the NP-complete problem
• Find the minimal runtime over all tasks and slots
– Data location, computation ability, network bandwidth, etc. affect the total runtime
44
Example of Scheduling Job 2
45
Dynamic Programming for Dispatching
to Candidate Slots (cont.)
• Dynamic programming selects the assignment with the minimal total execution time
• The less execution time we use, the more tasks we can serve on the same private cloud resources at the same operating cost
• More work on private slots means less on fee-charging public slots
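The slides do not show the recurrence itself, so the following is only one possible sketch of a dynamic program for this step. It assumes each candidate slot takes at most one task per scheduling round and minimizes the sum of EEt values, memoizing over (next task, set of used slots). The function name and the sample matrix are hypothetical.

```python
from functools import lru_cache


def min_total_time(eet):
    """eet[t][s] is the estimated execution time of task t on candidate slot s.

    Returns (minimal total time, tuple with the slot chosen for each task).
    """
    n_tasks, n_slots = len(eet), len(eet[0])

    @lru_cache(maxsize=None)
    def best(task, used):
        # Subproblem: cheapest way to place tasks task..n-1 given used slots.
        if task == n_tasks:
            return 0.0, ()
        result = (float("inf"), ())
        for s in range(n_slots):
            if used & (1 << s):
                continue  # slot already taken in this round
            sub_cost, sub_plan = best(task + 1, used | (1 << s))
            total = eet[task][s] + sub_cost
            if total < result[0]:
                result = (total, (s,) + sub_plan)
        return result

    return best(0, 0)


# Hypothetical EEt matrix: 3 tasks (rows) by 3 candidate slots (columns).
eet = [[4, 2, 8], [4, 3, 7], [3, 1, 6]]
cost, plan = min_total_time(eet)
```

The state space grows as 2^slots, so this only suits small candidate sets; a heuristic or a different formulation would be needed at larger scale.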
46
4.4 Dispatch Selection from
Slot Queue
• Under transient overloading or strict deadlines
– Private slots cannot meet the QoS demand
• Tasks need to be dispatched to fee-charging public slots
• Examine which queued tasks can be dispatched to public slots
– Data transmission time
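One way to picture this examination step, as a sketch under our own assumptions rather than the slides' algorithm: for every queued task, compute its data transmission time to a public slot, keep the tasks that would still meet their deadline there, and prefer the ones that are cheapest to move. The Dtt reading (SD/NB + SD/DS) × 2, the field names, and the sample queue are all hypothetical.

```python
def transfer_time(sd, nb, ds):
    """Assumed Dtt: network plus disk transfer time, doubled."""
    return (sd / nb + sd / ds) * 2


def offload_candidates(queue, nb, ds, public_mips):
    """Return ids of queued tasks that meet their deadline on a public slot,
    ordered by increasing data transmission time."""
    feasible = []
    for task in queue:
        dtt = transfer_time(task["sd"], nb, ds)
        finish = dtt + task["sc"] / public_mips
        if finish <= task["deadline"]:
            feasible.append((dtt, task["id"]))
    return [task_id for _, task_id in sorted(feasible)]


# Hypothetical queue: code size in MI, data size in MB, deadline in seconds.
queue = [
    {"id": "a", "sc": 400, "sd": 400, "deadline": 60},
    {"id": "b", "sc": 1000, "sd": 200, "deadline": 25},
    {"id": "c", "sc": 600, "sd": 200, "deadline": 60},
]
order = offload_candidates(queue, nb=100, ds=100, public_mips=50)
```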
47
Example of Job 3 Arrive
48
Example of Examining Dtt in Queue
49
4.5 Dynamic Programming for Minimal
Cost on Public Slot
• Minimize the cost of renting public resource slots
• The cost function, cast as a knapsack problem, can be solved by dynamic programming
• Find the minimal cost that still meets the QoS deadline
$$CostF = SC_{ij} \cdot x_k + SD_{ij} \cdot y_k + SD_{ij} \cdot \left( \frac{1}{dti_k} + \frac{1}{dto_k} \right)$$
50
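A hedged sketch of minimal-cost selection as a multiple-choice knapsack: every task picks exactly one public slot type, each choice has a run time and a cost, and the total time must fit the deadline. It assumes integer time units and that the offloaded tasks run one after another so their times add up; neither assumption comes from the slides, and the option values are hypothetical.

```python
import math


def min_cost_within_deadline(options, deadline):
    """options[i] lists the (time, cost) choices for task i.

    Returns the minimal total cost with total time <= deadline,
    or math.inf when no combination fits.
    """
    # dp[t] = minimal cost of the tasks handled so far using exactly t time units
    dp = [math.inf] * (deadline + 1)
    dp[0] = 0.0
    for choices in options:
        nxt = [math.inf] * (deadline + 1)
        for used, cost_so_far in enumerate(dp):
            if cost_so_far == math.inf:
                continue  # this time total is unreachable
            for time, cost in choices:
                t = used + time
                if t <= deadline and cost_so_far + cost < nxt[t]:
                    nxt[t] = cost_so_far + cost
        dp = nxt
    return min(dp)


# Hypothetical (time, cost) choices on slow, medium, and fast public slots.
task_a = [(40, 40), (20, 80), (8, 200)]
task_b = [(100, 100), (50, 200), (20, 500)]
cheapest = min_cost_within_deadline([task_a, task_b], deadline=70)
```

With a deadline of 70 the cheapest feasible pairing is the medium slot for both tasks; loosening the deadline lets the slower, cheaper slots win.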
Example of Minimum Cost Selection
51
Algorithm of
[Execution Time Estimation with Task on Different Slots] &
[Dynamic Programming for Dispatching to Candidate Slots]
52
Algorithm of
[Dispatch Selection from Slot Queue]
53
Algorithm of
[Dynamic Programming for Minimal
Cost on Public Slot]
54
5. Experiments and Discussion
• CloudSim
• Supports modeling and simulation of cloud computing with customizable resource-scheduling policies
Slot experiment setup
image size: 5 GB
RAM: 512 MB
BW: 1,000
CPU number: 1
computing ability: [10, 50] MIPS
Task experiment setup
file size: [200, 400] MB
output size: [20, 40] MB
code size: [400, 1000] MI
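For readers who want to reproduce a comparable workload outside CloudSim, the parameter ranges above can be sampled as in the sketch below. The slides give only the ranges, so uniform sampling, the seed, the counts, and the field names are assumptions.

```python
import random


def make_slot(rng):
    """One resource slot with the fixed settings and a sampled computing ability."""
    return {
        "image_size_gb": 5,
        "ram_mb": 512,
        "bw": 1000,
        "cpus": 1,
        "mips": rng.uniform(10, 50),
    }


def make_task(rng):
    """One task with sizes sampled from the listed ranges."""
    return {
        "file_size_mb": rng.uniform(200, 400),
        "output_size_mb": rng.uniform(20, 40),
        "code_size_mi": rng.uniform(400, 1000),
    }


rng = random.Random(42)  # fixed seed so runs are repeatable
slots = [make_slot(rng) for _ in range(4)]
tasks = [make_task(rng) for _ in range(20)]
```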
56
• 5.1 Measurement of AsQ, FIFO and Fair
– Latency measurement
– QoS satisfaction rate measurement
– Cost analysis
• 5.2 Measurement of AsQ and COSHIC
– Cost analysis
– Latency measurement
– QSR spending time measurement
– Normalized Violated Quality Value measurement
57
5.1 Latency Measurement
• Latency measurement
– No deadline limit
– Private resource only
– 5, 10, 20, 50 tasks
– Waiting time
– Execution time
– Finish time
58
Task Waiting Time Measurement
59
Task Execution Time Measurement
60
Task Finish Time Measurement
61
Measurement of
AsQ, FIFO and Fair (cont.)
• QoS satisfaction rate measurement
– Percentage of all tasks that complete in time
– QSR = k/n
• n = total number of tasks, k = number of tasks that respond before the deadline, 0 ≤ k ≤ n
– 20, 50, 70 tasks
– Private slots only
– Deadline: loose → strict
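The QSR definition translates directly into code. The finish times and deadlines below are hypothetical, and the function name is ours.

```python
def qos_satisfaction_rate(finish_times, deadlines):
    """QSR = k / n, where k counts the tasks that finish by their deadline."""
    k = sum(f <= d for f, d in zip(finish_times, deadlines))
    return k / len(deadlines)


# Hypothetical run: four tasks, two of which finish in time.
qsr = qos_satisfaction_rate([10, 25, 40, 55], [20, 20, 50, 50])
```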
62
QSR Measurement (20 tasks)
63
QSR Measurement (50 tasks)
64
QSR Measurement (70 tasks)
65
Measurement of
AsQ, FIFO and Fair (cont.)
• QSR – Cost measurement
– 50, 70 tasks
– Using public slots
– Paying more for higher QSR
Public Cloud Slots
Computing ability (MIPS) | Computing price ($/MI) | Storage price ($/MB)
10 | 0.1 | [0.01, 0.05]
20 | 0.2 | [0.01, 0.05]
50 | 0.5 | [0.01, 0.05]
66
QSR – Cost Measurement (50 tasks)
67
QSR – Cost Measurement (70 tasks)
68
Measurement of
AsQ, FIFO and Fair (cont.)
• Cost analysis
– 20, 50, 70 tasks
– Deadline: loose → strict
69
Cost Analysis (20 tasks)
70
Cost Analysis (50 tasks)
71
Cost Analysis (70 tasks)
72
5.2 Measurement of AsQ and COSHIC
• Compare with “Cost-optimal Scheduling in
Hybrid IaaS Clouds”
– Linear programming formulation
– Assume that applications are CPU and network
intensive
– Scheduling applications in the public cloud, in
terms of cost minimization
73
Cost Analysis
Cost 22.7% as AsQ
74
Task Execution Time Measurement
Time spent: 10.7%
75
Task Finish Time Measurement
Time spent: 16.8%
76
QSR Spending Time Measurement
COSHIC takes 4.9 times as long as AsQ
77
NVQV Measurement
• Normalized Violated Quality Value
• Normalizes performance across execution time and cost value
78
NVQV Measurement (cont.)
79
Comparison with other Scheduling
Algorithm
80
6. Conclusions and Future Work
• We propose Adaptive Scheduling Algorithm
with QoS Satisfaction
• Satisfy user QoS demand
• Near optimal resource allocation
– Better resource utilization
• Lower cost for the service provider
82
• Find a suitable workload level for the private cloud, with a better tradeoff between operating cost and computing efficiency
• Reliability
• Implement on a real cloud environment
83
• End
84