A Case for Economy Grid Architecture for Service Oriented Grid Computing
Rajkumar Buyya, David Abramson, Jon Giddy
School of Computer Science and Software Engineering, Monash University, Melbourne, Australia
www.buyya.com/ecogrid
http://www.gridcomputing.com
Overview
• A brief introduction to Grid computing
• Resource Management issues
• A Glance at Approaches to Grid computing
• Grid Architecture for Computational Economy
• Economy Grid = Globus + GRACE
• Nimrod-G: A Grid Resource Broker
• Scheduling Experiments
• Conclusions
[Slide graphic: Grid + Scheduling + Economics = the Economy Grid]
Scalable HPC: Breaking Administrative Barriers
[Slide graphic: performance grows as computing scales across administrative barriers, from the individual to the group, department, campus, state, nation, globe, inter-planet, and universe:]
• Desktop
• SMPs or Supercomputers
• Local Cluster
• Enterprise Cluster/Grid
• Global Cluster/Grid
• Inter-Planet Cluster/Grid ??
Why Grids? Large-Scale Exploration Needs Them: Killer Applications
• Solving grand challenge applications using computer modeling, simulation and analysis
• Example domains: Aerospace, Internet & E-commerce, Life Sciences, CAD/CAM, Digital Biology, Military Applications
What is a Grid?
An infrastructure that couples:
• Computers: PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.
• Software: e.g., ASPs renting expensive special-purpose applications on demand
• Catalogued data and databases: e.g., transparent access to the human genome database
• Special devices: e.g., a radio telescope, or SETI@Home searching for life in the galaxy
• People/collaborators
It potentially offers simple, consistent, dependable, and pervasive access across wide-area networks and presents users with an integrated global resource.
Grid Applications - Drivers
• Distributed HPC (Supercomputing): computational science
• High-throughput computing: large-scale simulation/chip design & parameter studies
• On-demand computing: medical instrumentation & network-enabled solvers
• Data-intensive computing: data mining, particle physics (CERN), drug design
• Remote software access/renting services: application service providers (ASPs)
• Content sharing: sharing digital content among peers (e.g., Napster)
• Collaborative: collaborative design, data exploration, education
Building and Using Grids requires…
• Services that make our systems Grid ready!
• Security mechanisms that permit resources to be accessed only by authorized users.
• (New) programming tools that make our applications Grid ready!
• Tools that can translate the requirements of an application into requirements for computers, networks, and storage.
• Tools that perform resource discovery, trading, composition, scheduling and distribution of jobs, and collect the results.
Players in Grid Computing
What do users want?
Users in the Grid Economy & Their Strategies
• Grid Consumers
  • Execute jobs to solve problems of varying size and complexity
  • Benefit by selecting and aggregating resources wisely
  • Trade off timeframe and cost
  • Strategy: minimise expenses
• Grid Providers
  • Contribute "idle" resources for executing consumer jobs
  • Benefit by maximising resource utilisation
  • Trade off local requirements & market opportunity
  • Strategy: maximise return on services
Sources of Complexity in Resource Management for World-Wide Computing
• Size (large number of nodes, providers, consumers)
• Heterogeneity of resources (PCs, workstations, clusters, and supercomputers)
• Heterogeneity of fabric management systems (single-system-image OS, queuing systems, etc.)
• Heterogeneity of fabric management policies
• Heterogeneity of applications (scientific, engineering, and commerce)
• Heterogeneity of application requirements (CPU, I/O, memory, and/or network intensive)
• Heterogeneity in demand patterns
• Geographic distribution and different time zones
• Differing goals (producers and consumers have different objectives and strategies)
• Insecure and unreliable environment
Traditional approaches to resource management are NOT useful for the Grid. Why?
• They use centralised policies that need:
  • complete state information, and
  • a common fabric management policy or a decentralised consensus-based policy.
• With so many heterogeneous parameters in the Grid, it is impossible to define:
  • a system-wide performance metric, and
  • a common fabric management policy that is acceptable to all.
• So, we propose using the "economics" paradigm for managing resources:
  • it has proved successful in managing the decentralisation and heterogeneity present in human economies!
  • we can easily leverage proven economic principles and techniques
  • it is easy to regulate demand and supply
  • it is user-centric, scalable, adaptable, with value-driven costing, etc.
  • it offers an incentive (money?) for being part of the Grid!
[Slide graphic: problem-solving approaches to Grid computing: object-oriented, Internet-WWW, market/computational economy, and mix-and-match.]
Grid RMS to support:
• Authentication (once).
• Specify (code, resources, etc.).
• Discover resources.
• Negotiate authorisation, acceptable use, cost, etc.
• Acquire resources.
• Schedule jobs.
• Initiate computation.
• Steer computation.
• Access remote data sets.
• Collaborate on results.
• Account for usage.
[Slide graphic: these steps cut across administrative domains (Domain 1, Domain 2). Ack: Globus.]
Building an Economy Grid "brokerage" system…
Foundation for the Grid Economy
Economic Models for Resource Trading
• Commodity Market Model
• Posted Price Model
• Bargaining Model
• Tendering (Contract Net) Model
• Auction Model
  • English, first-price sealed-bid, second-price sealed-bid (Vickrey), and Dutch
• Proportional Resource Sharing Model
• Shareholder Model
• Partnership Model
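To make the contrast concrete, here is a minimal Python sketch (not part of GRACE; all names and numbers are invented for illustration) comparing a posted-price purchase of CPU time with a tendering/sealed-bid round:

# Illustrative sketch only: contrasting a posted-price purchase with a
# tendering (contract-net) round for CPU time. Names and numbers are invented.

def posted_price_cost(posted_rates, cpu_seconds):
    """Commodity/posted-price model: take the cheapest advertised rate."""
    provider = min(posted_rates, key=posted_rates.get)
    return provider, posted_rates[provider] * cpu_seconds

def tender_cost(bids, cpu_seconds):
    """Tendering model: call for sealed bids and accept the lowest one."""
    provider = min(bids, key=bids.get)
    return provider, bids[provider] * cpu_seconds

if __name__ == "__main__":
    rates = {"monash-linux": 20, "anl-sp2": 5, "isi-sgi": 10}   # advertised G$/CPU-sec
    bids = {"monash-linux": 18, "anl-sp2": 6, "isi-sgi": 9}     # one-off sealed bids
    print(posted_price_cost(rates, 300))   # ('anl-sp2', 1500)
    print(tender_cost(bids, 300))          # ('anl-sp2', 1800)

In the posted-price case the consumer simply takes the cheapest advertised rate; in the tendering case providers respond to a call for bids and the broker accepts the lowest bid. The auction, bargaining and sharing models differ mainly in how that final price is arrived at.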
Grid Architecture for Computational Economy
[Architecture diagram: the Grid user signs on and runs an application through a Grid Resource Broker consisting of a Grid Explorer, Schedule Advisor, Trade Manager, Deployment Agent and Job Control Agent. The broker consults Grid Market Services and Grid Information Server(s), and talks through Grid Middleware Services to Grid Service Providers (Grid Node 1 … Grid Node N). Each provider runs a Trade Server with pricing algorithms, plus resource allocation, resource reservation, accounting, QoS, secure job execution (JobExec), storage, a health monitor and miscellaneous services, in front of its resources R1, R2, …, Rm.]
Economy Grid = Globus + GRACE
[Layered architecture diagram:
• Grid applications: science, engineering, commerce; portals; ActiveSheet; Nimrod/G parameter studies.
• High-level services and tools (Grid tools): globusrun, MPI-G, MPI-IO, CC++, Nimrod/G, DUROC, GlobusView, Grid status.
• Core services (Grid middleware): MDS, GRAM, GARA, Nexus, GASS, Heartbeat Monitor, Globus security interface, plus the GRACE additions: GRACE-TS (trade server), GMD (Grid Market Directory), GBank, QBank, eCash.
• Local services (Grid fabric): Condor, LSF, PBS, GRD, JVM, TCP, UDP, Linux, Irix, Solaris.]
GRACE components
• A resource broker (e.g., Nimrod/G)
• Resource trading protocols
• A mediator for negotiation between users and Grid service providers (Grid Market Directory)
• A deal template for specifying resource requirements and service offers
• A trade server
• A pricing policy specification
• Accounting (e.g., QBank) and payment management (GBank)
Grid Open Trading Protocols
Trading API between the Trade Manager and the Trade Server (which applies the owner's pricing rules):
• Get Connected
• Call for Bid (DT)
• Reply to Bid (DT)
• Negotiate Deal (DT)
• Confirm Deal (DT, Y/N)
• Cancel Deal (DT)
• Change Deal (DT)
• Get Disconnected
• …
DT = Deal Template, carrying:
• resource requirements (BM)
• resource profile (BS)
• price (either party can set it)
• status
• validity period
Either party can change these values, negotiation can continue, and the deal is finally accepted or declined.
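As a rough illustration (not the actual GRACE trading API), a Deal Template and the back-and-forth between a Trade Manager and a Trade Server might look like the Python sketch below; the field names, the haggling rule and the numbers are all assumptions made for the example:

# Hypothetical sketch of a Deal Template (DT) being negotiated between a
# Trade Manager (buyer) and a Trade Server (seller). Everything here is invented.
from dataclasses import dataclass

@dataclass
class DealTemplate:
    resource_requirements: str    # what the Trade Manager wants to run
    resource_profile: str         # what the Trade Server offers
    price: float                  # either party may overwrite this
    status: str = "open"          # open / accepted / declined
    validity_period: int = 60     # seconds the current offer remains valid

def negotiate(dt, tm_limit, ts_floor, rounds=5):
    """Alternate counter-offers until the price falls inside both parties' bounds."""
    for _ in range(rounds):
        if ts_floor <= dt.price <= tm_limit:
            dt.status = "accepted"
            return dt
        # Toy haggling rule: move the price halfway towards the midpoint of the bounds.
        dt.price = (dt.price + (tm_limit + ts_floor) / 2) / 2
    dt.status = "declined"
    return dt

dt = DealTemplate("165 jobs x 5 CPU-min", "Linux cluster, 60 nodes", price=20.0)
print(negotiate(dt, tm_limit=12.0, ts_floor=8.0))   # accepted at a price both can live with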
Open Trading Finite State Machine
[State diagram: starting from a Deal Template (DT), the Trade Manager requests a resource and asks the price; offers (<TS, Final Offer> or <TM, Final Offer>) move the negotiation into an Offer state, <<TS, Update>> / <<TM, Update>> and <TS, Bid> messages carry counter-offers back to DT, and an <Accept> message ends in DA while a <Reject> ends in DN.]
Legend: DT = Deal Template, TM = Trade Manager, TS = Trade Server, DA = Deal Accepted, DN = Deal Not Accepted.
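One plausible encoding of this state machine in Python, under the reading of the figure described above (the diagram is only partially recoverable, so the exact transitions are assumptions):

# A plausible encoding of the open-trading state machine. State names follow the
# legend; which message triggers which transition is an assumption.
TRANSITIONS = {
    ("DT", "TM.request_resource"): "DT",
    ("DT", "TM.ask_price"): "Offer_TS",    # the Trade Server replies with an offer
    ("Offer_TS", "TS.update"): "DT",       # counter-offer: keep negotiating
    ("Offer_TS", "TS.final_offer"): "Offer_TS",
    ("Offer_TS", "TM.accept"): "DA",
    ("Offer_TS", "TM.reject"): "DN",
    ("DT", "TS.bid"): "Offer_TM",          # the Trade Server invites a bid instead
    ("Offer_TM", "TM.update"): "DT",
    ("Offer_TM", "TM.final_offer"): "Offer_TM",
    ("Offer_TM", "TS.accept"): "DA",
    ("Offer_TM", "TS.reject"): "DN",
}

def run(messages, state="DT"):
    """Feed a message sequence through the table; unknown messages leave the state unchanged."""
    for msg in messages:
        state = TRANSITIONS.get((state, msg), state)
    return state

print(run(["TM.request_resource", "TM.ask_price", "TS.update",
           "TM.ask_price", "TM.accept"]))   # -> 'DA'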
Pricing, Accounting, Allocation and Job Scheduling Flow (at each site / Grid level)
[Flow diagram: a pricing policy and the Grid Bank (digital transactions) feed the Trade Server; QBank keeps the accounting database at each site; jobs go through the local resource manager (IBM LoadLeveler, PBS, …) onto the compute resources (clusters, SGI, SP, …).]
0. Make deposits, transfers, refunds, queries/reports.
1. The client negotiates for the access cost.
2. Negotiation is performed per owner-defined policies.
3. If the client is happy, the Trade Server informs QBank about the access deal.
4. The job is submitted.
5. Check with QBank for the "go ahead".
6. The job starts.
7. The job completes.
8. Inform QBank about the resource utilisation.
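Read end to end, the nine steps could be strung together roughly as below; every object and method is a hypothetical stand-in for the corresponding Grid Bank / Trade Server / QBank / resource-manager interaction, not a real API:

# Hypothetical walk-through of the accounting and scheduling flow above.
# None of these calls are real GRACE, QBank or PBS APIs; they mirror steps 0-8.

def run_job_with_accounting(client, trade_server, qbank, resource_mgr, job):
    qbank.deposit(client.account, amount=1000)        # 0. deposits/transfers/queries
    deal = trade_server.negotiate(client, job)        # 1-2. negotiate per owner policies
    if deal is None:                                   # the client rejects the price
        return None
    qbank.record_deal(client.account, deal)            # 3. TS informs QBank about the deal
    handle = resource_mgr.submit(job)                   # 4. the job is submitted
    if not qbank.authorise(client.account, deal):       # 5. check with QBank for the "go ahead"
        resource_mgr.cancel(handle)
        return None
    resource_mgr.start(handle)                           # 6. the job starts
    usage = resource_mgr.wait(handle)                    # 7. the job completes
    qbank.charge(client.account, deal, usage)            # 8. report resource utilisation
    return usage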
Service Items to be Charged
• CPU: user and system time
• Memory:
  • maximum resident set size, page size
  • amount of memory used
  • page faults: with/without physical I/O
• Storage: size, read/write/block I/O operations
• Network: messages sent/received
• Signals received, context switches
• Software and libraries accessed
• Data sources (e.g., the Protein Data Bank)
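A toy example of turning such a usage record into a charge is to multiply each metered item by a per-unit rate; only the item list follows the slide, while the record format and the rates below are invented:

# Toy charging sketch: the metered items follow the slide above, but the
# record format and the per-unit rates are invented for the example.
RATES = {                        # G$ per unit, purely illustrative
    "cpu_seconds": 0.01,
    "memory_mb_hours": 0.002,
    "storage_mb": 0.001,
    "network_messages": 0.0001,
    "library_accesses": 0.5,
    "pdb_queries": 1.0,
}

def charge(usage):
    """Sum rate * quantity over every metered service item."""
    return sum(RATES[item] * qty for item, qty in usage.items())

job_usage = {"cpu_seconds": 300, "memory_mb_hours": 128, "storage_mb": 50,
             "network_messages": 2000, "library_accesses": 1, "pdb_queries": 3}
print(round(charge(job_usage), 2))   # about 7.01 G$ with the rates above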
How to decide the Price?
• Fixed-price model (like today's Internet)
• Dynamic, demand-and-supply driven (like tomorrow's Internet)
• Usage period
• Loyalty of customers (like airlines favouring frequent flyers!)
• Historical data
• Advance agreement (high discounts for corporations)
• Usage timing (peak, off-peak, lunch time)
• Calendar based (holiday/vacation period)
• Bulk purchase (register 100 .com domains at once!)
• Voting: trade unions decide the pricing structure
• Resource capability as benchmarked in the market!
• Academic, R&D and public-good application users can be offered a cheaper rate than commercial users
• Customer type: quality-sensitive or price-sensitive buyers
• Can be prescribed by regulating (government) authorities
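One way to picture how several of these factors might combine is a base rate adjusted by multiplicative factors; the structure and the numbers below are illustrative assumptions, not an actual GRACE pricing policy:

# Illustrative pricing sketch: a base rate adjusted by a few of the factors
# listed above. The factor values are assumptions, not a real pricing policy.

def price_per_cpu_second(base_rate, peak=False, loyal_customer=False,
                         bulk_cpu_hours=0, academic=False):
    rate = base_rate
    rate *= 1.5 if peak else 0.8            # usage timing: peak vs off-peak
    if loyal_customer:
        rate *= 0.9                          # loyalty discount (frequent user)
    if bulk_cpu_hours >= 1000:
        rate *= 0.85                         # bulk-purchase / advance agreement
    if academic:
        rate *= 0.5                          # academic / public-good rate
    return rate

print(price_per_cpu_second(10, peak=True))                   # 15.0
print(price_per_cpu_second(10, peak=False, academic=True))    # 4.0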
Payments: Options & Automation
• Buy credits in advance, or GSPs bill the user later ("pay as you go").
• Pay by electronic currency via the Grid Bank:
  • NetCash (anonymity), NetCheque, and PayPal.
  • NetCheque (http://www.isi.edu/gost/info/netcheque/): users register with NetCheque accounting servers and can write electronic cheques and send them (e.g., by email); when a cheque is deposited, the balance is transferred from the sender's account to the receiver's.
  • NetCash (http://www.isi.edu/gost/info/netcash/): supports anonymity and uses the NetCheque system to clear payments between currency servers.
  • PayPal.com: an account plus email address is linked to a credit card. Enter the recipient's email address and the amount you wish to request; the recipient gets an email notification and pays you at www.PayPal.com.
A Glance at the Nimrod-G Broker
[Architecture diagram: Nimrod/G clients talk to the Nimrod/G engine, which uses a Schedule Advisor, a Trading Manager (TM), a Grid Explorer (GE) and a Grid Store; the Grid Dispatcher submits jobs through Grid middleware (Globus, Legion, Condor-G, Ninf, etc.) to remote nodes, each running a local Resource Manager (RM) and a Trade Server (TS). The Grid Explorer queries the Grid Information Server(s) (GIS). Legend: G = Globus-enabled node, L = Legion-enabled node, C = Condor-enabled node.]
Nimrod/G: A Grid Resource Broker
• A resource broker for managing and steering task-farming (parametric sweep) applications on computational Grids, based on deadlines and computational economy.
• Key features:
  • A single window to manage & control an experiment
  • Resource discovery
  • Trading for resources
  • Resource composition & scheduling
  • Steering & data management
• It allows users to study the behaviour of output variables against a range of different input scenarios.
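A parameter-sweep (task-farming) experiment of the kind Nimrod/G manages boils down to taking the cross-product of the parameter values and creating one independent job per combination. The sketch below is a generic Python illustration, not Nimrod's plan-file language; the parameter names and the executable are made up:

# Generic parameter-sweep sketch (NOT Nimrod's plan-file syntax): build one job
# per combination of parameter values and hand them to a broker for scheduling.
from itertools import product

parameters = {                   # invented example parameters
    "temperature": [300, 310, 320],
    "pressure": [1.0, 2.0],
    "seed": range(3),
}

def generate_jobs(params, executable="model.exe"):
    names = list(params)
    for combo in product(*(params[n] for n in names)):
        args = " ".join(f"--{n}={v}" for n, v in zip(names, combo))
        yield f"{executable} {args}"

jobs = list(generate_jobs(parameters))
print(len(jobs))    # 3 * 2 * 3 = 18 independent jobs
print(jobs[0])      # model.exe --temperature=300 --pressure=1.0 --seed=0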
Nimrod/G Grid Broker Architecture
[Architecture diagram: Nimrod clients (legacy applications, customised apps such as ActiveSheet, P-Tools with GUI/scripting for parameter modelling, and monitoring/steering portals) drive the farming engine. The Nimrod broker manages programmable entities (resources, jobs, tasks, variables) and contains a schedule advisor/meta-scheduler with pluggable algorithms (Algorithm1 … AlgorithmN), a resource scheduler, a job server, a Grid explorer, a trading manager, and a dispatcher for transport and execution management. Middleware: Globus, Legion, Condor-G, GRACE-TS, G-Bank, and local schedulers (Condor, LL, Mosix, …). Fabric: computers (PCs, workstations, clusters), storage, networks, databases, and instruments such as a radio telescope.]
A Nimrod/G Client
[Screenshot: the Nimrod/G client with deadline and cost controls, monitoring an experiment on a testbed of Legion hosts and Globus hosts spread across Virginia (Arlington, Alexandria, Richmond, Roanoke, Hampton, Norfolk, Virginia Beach, Portsmouth, Chesapeake, Newport News); the host Bezek is in both the Globus and Legion domains.]
Nimrod/G Interactions
[Interaction diagram: on the root node, the farming engine works with the scheduler (which performs resource discovery via the Grid information servers and trades via the Trade Server) and the dispatcher/process server; on the gatekeeper node, local resource allocation and the queuing system start a job wrapper; on the computational node, the user process runs, with file access going through the I/O server back to the root node.]
Adaptive Scheduling Algorithms

Algorithm           Execution time (not beyond the deadline)   Execution cost (not beyond the budget)
Time minimisation   Minimise                                   Limited by budget
Cost minimisation   Limited by deadline                        Minimise
None minimisation   Limited by deadline                        Limited by budget

Scheduling loop: Discover resources -> Establish rates -> Compose & schedule -> Distribute jobs -> Evaluate & reschedule -> Meet requirements? Remaining jobs, deadline, & budget? -> Discover more resources (if needed). A sketch of the cost-minimisation case follows.
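Here is a minimal sketch of the cost-minimisation case: fill the cheapest resources first, as long as the jobs can still finish before the deadline. It is a static simplification (the real Nimrod/G broker re-evaluates and reschedules as jobs complete), and it reuses the peak-time prices from the experiment table as per-CPU-minute rates purely for illustration:

# Simplified cost-minimisation sketch, not the actual Nimrod/G scheduler:
# allocate jobs to the cheapest resources first, subject to the deadline.

def cost_min_schedule(n_jobs, job_minutes, deadline_minutes, resources):
    """resources: list of (name, cpus, price); prices treated as G$ per CPU-minute."""
    price = {name: p for name, _, p in resources}
    allocation, remaining = {}, n_jobs
    for name, cpus, p in sorted(resources, key=lambda r: r[2]):   # cheapest first
        if remaining == 0:
            break
        capacity = cpus * (deadline_minutes // job_minutes)       # jobs it can finish in time
        take = min(remaining, capacity)
        if take:
            allocation[name] = take
            remaining -= take
    if remaining:
        raise RuntimeError("deadline cannot be met with the given resources")
    cost = sum(jobs * job_minutes * price[name] for name, jobs in allocation.items())
    return allocation, cost

# 165 jobs of 5 CPU-minutes each and a 60-minute deadline, as in the experiment slides.
testbed = [("Monash Linux", 60, 20), ("ANL SP2", 80, 5), ("ANL Sun", 8, 5),
           ("ANL SGI", 96, 15), ("ISI SGI", 10, 10)]
print(cost_min_schedule(165, 5, 60, testbed))

Time minimisation is the mirror image: sort by effective speed instead of price, and stop adding resources once the budget would be exceeded.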
Inter-Continental Grid Testbed
• Australia: Monash University: Nimrod/G, Linux cluster, Solaris workstations (Globus + Legion + Condor/G, GRACE-TS)
• North America: ANL (SGI/Sun/SP2), USC-ISI (SGI), UVa (Linux cluster); Globus/Legion, GRACE-TS
• Asia/Japan: Tokyo Institute of Technology; ETL, Tsukuba; Linux cluster (Globus + GRACE-TS)
• Europe: ZIB/FUB (T3E/Mosix), Cardiff (Sun E6500), Paderborn (HPCLine), Lecce (Compaq SC), CNR (cluster), Calabria (cluster), CERN (cluster), Poznan (SGI/SP2); Globus + GRACE-TS
All sites are connected via the Internet.
Experiment 1 Setup
• Workload: 165 jobs, each needing 5 minutes of CPU time
• Deadline: 1 hour; budget: 800,000 units
• Strategy: minimise cost and meet the deadline
• Execution cost with cost optimisation:
  • AU peak time: 471,205 G$
  • AU off-peak time: 427,155 G$
Resources Selected & Price per CPU-second

Resource type & size        Owner and location    Grid services     Peak-time cost (G$)   Off-peak cost (G$)
Linux cluster (60 nodes)    Monash, Australia     Globus/Condor     20                    5
IBM SP2 (80 nodes)          ANL, Chicago, US      Globus/LL         5                     10
Sun (8 nodes)               ANL, Chicago, US      Globus/Fork       5                     10
SGI (96 nodes)              ANL, Chicago, US      Globus/Condor-G   15                    15
SGI (10 nodes)              ISI, LA, US           Globus/Fork       10                    20
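As a back-of-the-envelope check on how such charges add up, the total cost of a run follows from the job split across resources and the per-CPU-second rates above; the split used below is made up for illustration, not the measured allocation:

# Back-of-the-envelope cost check using the peak per-CPU-second rates in the table.
# The split of the 165 jobs across resources is invented, not the measured run.
PEAK_RATES = {"Monash Linux": 20, "ANL SP2": 5, "ANL Sun": 5,
              "ANL SGI": 15, "ISI SGI": 10}     # G$ per CPU-second
JOB_CPU_SECONDS = 5 * 60                         # each job needs 5 CPU-minutes

def total_cost(allocation, rates):
    return sum(jobs * JOB_CPU_SECONDS * rates[r] for r, jobs in allocation.items())

example_split = {"Monash Linux": 60, "ANL SP2": 40, "ANL Sun": 5,
                 "ANL SGI": 40, "ISI SGI": 20}   # 165 jobs in total (invented split)
print(total_cost(example_split, PEAK_RATES))     # 667,500 G$ for this particular split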
Execution @ AU Peak Time
[Chart: number of jobs executing over time (minutes, up to about 54) on each resource; the values in parentheses are the peak per-CPU-second prices: Linux cluster - Monash (20), Sun - ANL (5), SP2 - ANL (5), SGI - ANL (15), SGI - ISI (10).]
Execution @ AU Off-peak Time
[Chart: number of jobs executing over time (minutes, up to about 60) on each resource; the values in parentheses are the off-peak per-CPU-second prices: Linux cluster - Monash (5), Sun - ANL (10), SP2 - ANL (10), SGI - ANL (15), SGI - ISI (20).]
AU Peak: Resources/Cost in Use
[Two charts over time (minutes): the number of CPUs in use and the cost of the resources in use. After the calibration phase, note the difference in the patterns of the two graphs: this is when the scheduler stopped using expensive resources.]
AU Off-peak: Resources/Cost in Use
[Two charts over time (minutes): the number of CPUs in use and the cost of the resources in use during the off-peak run.]
DesignDrug@Home: Data-Intensive Computing on the Grid
• A virtual laboratory for "Molecular Modelling for Drug Design" on a peer-to-peer Grid.
• It provides tools for examining millions of chemical compounds (molecules) in the Protein Data Bank (PDB) to identify those with potential use in drug design.
• In collaboration with Kim Branson, Structural Biology, Walter and Eliza Hall Institute (WEHI).
http://www.csse.monash.edu.au/~rajkumar/dd@home/
ActiveSheet: Spreadsheet Processing on the Grid
[Slide graphic: a spreadsheet client connects through a Nimrod proxy to Nimrod/G.]
Related Works (contd.)
• Mariposa: a distributed database system (UCB)
  • A query arrives with a budget; Mariposa creates sub-queries, divides the budget among them, and trades with (remote) servers.
• UCB Millennium clusters
  • rexec: a remote execution environment for clusters that supports a computational economy via proportional resource sharing.
• UNSW Mungi
  • Storage management: allocation of backing store and garbage collection of unwanted memory segments depend on the available credit; the amount of credit required to store data increases as the available storage space runs low.
Related Works
• JaWS: a Java-based Web-computing system
  • offers market-oriented programming and computing mechanisms on the Web.
• Xenoservers: accounted execution of untrusted code
• D'Agents: agents and computational economy
• MOSIX: cost-based cluster load balancing
• A number of theoretical works on pricing
• FIPA standard agent interaction protocols (for trading): we plan to explore this!
Can we Predict its Future ?
“I think there is a world market for about five computers.”
Thomas J. Watson Sr., IBM Founder, 1943
Conclusions
• HPC will be dominated by peer-to-peer Grids of clusters.
• Adaptive, scalable, and easy-to-use systems and end-user applications will be prominent.
• Access electricity, the Internet, entertainment (music, movies, …), etc. from the wall socket!
• An economics-based, service-oriented approach to Grid computing is needed for the eventual success of Grids!
• The impact of the World-Wide Grid on the 21st-century economy will be the same as that of electricity on the 20th-century economy.
Thank You… Any Questions?
Download: www.buyya.com/ecogrid