Investigating Business-Driven
Cloudburst Schedulers for e-Science
Bag-of-Tasks Applications
David Candeia, Ricardo Araújo, Raquel
Lopes, Francisco Brasileiro
UFCG/LSD - Brazil
E-Science
• Computers are changing scientific
research
– More collaborative
– As investigation tools
(simulations, data analysis, etc.)
• Many researchers are now
hungry for computing resources
– Sometimes they have deadlines
© Raquel Lopes - UFCG/LSD
Solution 1: P2P grids
www.ourgrid.org
The problem with solution 1
Solution 2: Cloud providers
The problem with solution 2
• Researchers have in-house resources and would
like to use them
• They have access to a P2P grid almost for
“free” and would like to use it
• They have some budget to buy cloud
resources, but would like to spend it efficiently
The problem
• We’d like to run BoT applications
in this hybrid infrastructure
• How to schedule the
application?
– To meet the deadline
– Use the budget efficiently
Business-driven approach
• Resources from cloud providers have a cost
• Applications have value to their owners
– Utility functions
– What is the gain if we run the application in ∆t
units of time?
• Maximise profit: utility(∆t) – cost(myCloud)
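The profit expression above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator: the function names and the step-shaped utility are assumptions made for the example, and the hourly price is the one quoted in the experimental setup later in these slides.

```python
# Price per cloud instance-hour, as quoted in the experimental setup.
CLOUD_PRICE_PER_INSTANCE_HOUR = 0.085


def cloud_cost(instance_hours, price=CLOUD_PRICE_PER_INSTANCE_HOUR):
    """Cost of the acquired cloud capacity, billed per instance-hour."""
    return instance_hours * price


def profit(utility, delta_t, instance_hours):
    """profit = utility(delta_t) - cost of the cloud resources bought."""
    return utility(delta_t) - cloud_cost(instance_hours)


def step_utility(delta_t):
    """Hypothetical utility: worth 10.0 if done within 5 hours, else nothing."""
    return 10.0 if delta_t <= 5 else 0.0
```

For instance, finishing in 4 hours with 20 instance-hours of cloud capacity costs 1.70 and leaves a profit of about 8.30 under this made-up utility.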
Approaches we studied
• Use all your budget acquiring cloud resources
until the application has finished
– Greedy scheduler
• Try to find the execution time that
maximises the profit achieved by running
the application
– Online cloudburst scheduler
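One possible reading of the greedy scheduler, as a rough sketch: in every hourly turn it buys as many cloud instances as the budget and the provider limit allow, until no work is left. The work model (a constant grid rate, a fixed rate per instance, and never buying more instances than the remaining work can occupy) and all parameter names are assumptions for illustration; the slides do not specify them.

```python
import math


def greedy_schedule(total_work, grid_rate, instance_rate, price,
                    budget=math.inf, max_instances=math.inf):
    """Simulate hourly turns of a greedy cloud-buying policy.

    total_work    : work units in the BoT
    grid_rate     : work units/hour from the P2P grid (assumed constant)
    instance_rate : work units/hour from one cloud instance
    Returns (hours_used, money_spent); hours_used is None if the
    application can never finish (no grid capacity and no money left).
    """
    remaining, spent, hours = total_work, 0.0, 0
    while remaining > 0:
        affordable = (budget - spent) / price
        # Do not buy instances that would have nothing to run.
        needed = math.ceil(max(remaining - grid_rate, 0) / instance_rate)
        n = math.floor(min(max_instances, affordable, needed))
        if n == 0 and grid_rate <= 0:
            return None, spent
        spent += n * price
        remaining -= grid_rate + n * instance_rate
        hours += 1
    return hours, spent
```

With an unlimited budget and no instance limit, this policy finishes as early as possible regardless of what the extra speed is worth, which is the behaviour the online scheduler tries to improve on.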
Online Cloudburst Scheduler
1. “Estimate” P2P grid throughput
2. Simulation process…
3. Find out the best time to finish
4. Acquire cloud instances for the next hour
[Timeline figure: Time (hours) 1, 2, 3, … from BoT submitted to BoT deadline]
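The four steps above might be sketched as a single planning function that is called once per hourly turn. This is only an illustration under simplifying assumptions that are not in the slides: the grid estimate is a constant rate, the plan uses the same number of cloud instances in every remaining hour, and candidate finish times are whole hours. The name `plan_next_hour` and its parameters are hypothetical.

```python
import math


def plan_next_hour(remaining_work, hours_to_deadline, est_grid_rate,
                   instance_rate, price, utility, elapsed=0):
    """Pick the finish time with the highest estimated profit.

    Returns (estimated_profit, finish_in_hours, instances_for_next_hour).
    Assumes hours_to_deadline >= 1.
    """
    best = None
    for t in range(1, hours_to_deadline + 1):
        # Work the estimated grid capacity cannot cover within t hours.
        shortfall = max(remaining_work - est_grid_rate * t, 0)
        instances = math.ceil(shortfall / (instance_rate * t))
        cost = instances * t * price
        gain = utility(elapsed + t) - cost
        if best is None or gain > best[0]:
            best = (gain, t, instances)
    return best
```

With a steeply decaying utility the plan favours finishing early with many instances; with a flat utility it stretches the run towards the deadline and buys fewer instance-hours.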
P2P grid throughput estimation
• Collects past information about the grid
• Uses the information to estimate future
throughput
• Prediction approaches:
– Conservative
– Derivative
– Predictive
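The slides do not define the three prediction approaches, so the sketch below does not implement any of them. It only illustrates the general idea of collecting past observations and turning them into an estimate, using a plain moving average as a stand-in technique; the class name and window size are assumptions.

```python
from collections import deque


class MovingAverageEstimator:
    """Estimate future grid throughput as the mean of recent observations."""

    def __init__(self, window=3):
        # Only the most recent `window` observations are kept.
        self.samples = deque(maxlen=window)

    def observe(self, work_done_last_hour):
        """Record the throughput the grid delivered in the last hour."""
        self.samples.append(work_done_last_hour)

    def estimate(self):
        """Return the estimated throughput; 0.0 until something is observed."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)
```

Returning 0.0 before any observation is a deliberately pessimistic choice: the scheduler would then plan as if the grid contributed nothing.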
Evaluation
• Question: These online solutions seem to be
more sophisticated than the greedy approach,
but are they more efficient?
• Simulation experiments
– We developed a simulator in Java
– It simulates a scheduler coordinating the execution of a
BoT application
– Each simulation experiment gives the profit achieved
Evaluation
• Optimal scheduler
– Knows the real grid capacity
– Able to make optimal decisions
• Compare profits
– Efficiency metric in (−∞, 0]
– e = −0.3 means the profit is 30% worse than the
profit achieved by the optimal scheduler
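The slides give the meaning of the metric but not its formula. A relative difference to the optimal profit is one definition consistent with the example above; the sketch below assumes that definition and a positive optimal profit.

```python
def efficiency(profit, optimal_profit):
    """Relative gap to the optimal profit (assumed formula).

    0.0 means the scheduler matched the optimal scheduler;
    -0.3 means its profit was 30% below the optimal one.
    """
    if optimal_profit <= 0:
        raise ValueError("optimal profit must be positive for this ratio")
    return (profit - optimal_profit) / optimal_profit
```

For example, a scheduler earning 70 when the optimal scheduler earns 100 gets an efficiency of about -0.3.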
Experimental setup - Application
• Collection of tasks whose demands are
normally distributed
– Four application flavours
• Two utility functions: linearly decaying and
exponentially decaying
• Maximum utility: 1×Cost, 2×Cost,
10×Cost, 50×Cost, 100×Cost
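The exact utility curves were shown graphically and are not recoverable from the text. As a hedged illustration of the two shapes named in the result slides, the functions below decay linearly to zero at a deadline and exponentially at a fixed rate; `deadline` and `decay_rate` are hypothetical parameters, and `max_utility` stands for the chosen multiple of the cost.

```python
import math


def linear_decay_utility(t, max_utility, deadline):
    """Utility falls linearly from max_utility at t=0 to 0 at the deadline."""
    return max_utility * max(0.0, 1.0 - t / deadline)


def exponential_decay_utility(t, max_utility, decay_rate):
    """Utility falls exponentially from max_utility at the given rate."""
    return max_utility * math.exp(-decay_rate * t)
```

Under these assumed shapes, every extra hour costs the same amount of utility in the linear case, while in the exponential case most of the value is lost in the first hours.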
Experimental setup - others
• One cloud provider
– $0.085 per cloud instance-hour
– Number of instances acquired simultaneously:
limited to 20, or unlimited
• Scheduling policies:
– Greedy
– Online schedulers (Conservative, Derivative,
Predictive-±10%, Predictive-±50%)
• Turn size: 1 hour
• Researcher budget: +∞
Linearly decaying utility – cloud unlimited
[Figure: plot panels with x-axis Utility/Cost (log scale), ticks at 1, 2, 10, 50]
Exponentially decaying utility – cloud unlimited
[Figure: plot panels with x-axis Utility/Cost (log scale), ticks at 1, 2, 10, 50]
Linearly decaying utility – cloud limit is 20
Exponentially decaying utility – cloud limit is 20
Conclusions
• Modeled the problem and carried out
simulation experiments whose results were
treated with appropriate statistical methods
• Utility/Cost relationship drives the scheduling
– Low (single digits): beware of the costs
– High: the cost of acquiring more resources is almost
negligible compared with the utility they return
Future work
• Investigate other online schedulers
– Use a more intelligent grid QoS prediction model
• Consider data transfer costs in the model
• Carry out measurement experiments
– Cloudburst scheduler implemented as an OurGrid
Broker
• Consider different experimental environments
– User also has an in-house cluster
Thanks
david@lsd.ufcg.edu.br
ricardo@lsd.ufcg.edu.br
raquel@dsc.ufcg.edu.br
fubica@dsc.ufcg.edu.br