Scientific Grid RM and Discovery
Karl Czajkowski
Center for Grid Technologies
USC/ISI
Talk Outline
• Introduction
– Scientific Grid RM
– Virtual Organizations
• MDS-2 Architecture
– Distributed service model
– Mapping to OGSA
• GRAM-1 and GRAM-2 Architectures
– Service model
– Mapping to OGSA
Scientific Resource Management
• Supercomputing jobs are services
– Provide domain-specific capability
– Require resource/hosting-environment
– Many legacy applications
• Leading-edge users act like administrators
– Deploy jobs dynamically
– Reconfigure service environment
• Complex resource environment
– This is the root of Grid computing
Resource Discovery/Monitoring
[Figure: dispersed users (?) query resources (R) across a network; the resources are grouped into VO-A and VO-B.]
• Distributed users and resources
• Variable resource status
• Variable grouping and connectivity
Resource Acquisition Phases
• Resource Discovery
– “What resources are relevant?”
– Bootstraps planner state
• Resource Status Inquiry
– “How do resources compare (now)?”
– Refines planner knowledge
• Resource Control
– “Did I acquire the resources?”
– Affects service environment
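
The three phases can be read as a narrowing pipeline: find candidates, compare their current status, then try to take one. The following is a minimal sketch under stated assumptions, not the Globus API: the `Resource` record, the in-memory `directory`, and the `discover`, `inquire`, and `acquire` helpers are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    arch: str
    free_nodes: int  # status value that changes over time


def discover(directory, arch):
    """Phase 1: which resources are relevant? Returns candidate names."""
    return [r.name for r in directory.values() if r.arch == arch]


def inquire(directory, names):
    """Phase 2: how do the candidates compare right now? Most free nodes first."""
    return sorted((directory[n] for n in names),
                  key=lambda r: r.free_nodes, reverse=True)


def acquire(resource, nodes):
    """Phase 3: try to take the nodes and report whether it worked."""
    if resource.free_nodes >= nodes:
        resource.free_nodes -= nodes  # acquisition changes the environment
        return True
    return False


# Hypothetical directory contents (assumed for illustration).
directory = {
    "R1": Resource("R1", "x86", 4),
    "R2": Resource("R2", "x86", 16),
    "R3": Resource("R3", "sparc", 64),
}
candidates = discover(directory, "x86")
ranked = inquire(directory, candidates)
granted = acquire(ranked[0], 8)
```

Discovery returns only names, so the planner must ask again for status before committing; that separation is what lets discovery data be stale while status data stays fresh.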
Base Required Features
• Virtual Organizations (VOs)
– Group together resources and users
– Support community-specific “discovery”
– Specialized “views”
• Scalability
– Many resources
– Many VOs
– Graceful degradation of service
Virtual Organizations
• Collaborating individuals and institutions
– Shared goals
– Enable sharing of resources
– Non-locality of participants
• Dynamic in nature
– VOs come and go
– Resources join and leave VOs
– Resources change status and fail
• Community-wide goals
MDS-2 Service Architecture
[Figure: users (?) perform discovery (GRIP?) against VO-specific Aggregate Directories (A); lookup (GRIP) and registration (GRRP) connect the directories to standard Resource Description services (R).]
• Dynamic Registration via Reg. Protocol (GRRP)
• Resource Inquiry via Info. Protocol (GRIP)
– Co-located with resource on network
• Resource Discovery (via GRIP or other)
– Using GRIP allows resource/directory hierarchy
Distributed Services
[Figure: resources (R) send registration messages to directories (D) in VO-A and VO-B; directories may be replicated, and a fault-partition can leave divergent directories.]
• Service scales with Grid growth
• Loose consistency model tolerates failures
• Interoperability by GRIP/GRRP protocols
Soft-state Registration
• Periodic notification
– “Service/resource is available”
– Expected-frequency metadata
• Automatic index/registry construction
– Add new resources to registry
– Invite resources to join new registry
• Self-cleaning
– Reduce occurrence of “dead” references
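
Soft-state registration can be illustrated with a registry that forgets any entry whose announcement is not refreshed in time. This is a minimal sketch, not the GRRP wire protocol: the `SoftStateRegistry` class, its method names, and the `ttl` parameter (standing in for the expected-frequency metadata) are assumptions, and time is passed in explicitly so the behavior is deterministic.

```python
class SoftStateRegistry:
    """Registry whose entries expire unless the resource re-announces itself."""

    def __init__(self):
        self._expiry = {}  # resource name -> time after which it is presumed dead

    def register(self, name, now, ttl):
        # A periodic "I am available" notification; a repeat simply extends the lease.
        self._expiry[name] = now + ttl

    def live(self, now):
        # Self-cleaning: drop entries whose lease has run out, then list the rest.
        self._expiry = {n: t for n, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


registry = SoftStateRegistry()
registry.register("R1", now=0, ttl=30)
registry.register("R2", now=0, ttl=30)
alive_at_20 = registry.live(now=20)      # both leases still valid
registry.register("R1", now=25, ttl=30)  # R1 refreshes; R2 stays silent
alive_at_40 = registry.live(now=40)      # R2's lease ran out at time 30
```

No explicit "unregister" message is needed: a failed or partitioned resource disappears from the registry on its own once its lease lapses.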
Mapping to OGSA
• GRIP: OGSI ServiceData enquiry
– Self-describing services
– Extensible data model
– Query and subscription/notification
• GRRP: OGSI Registry
– Simple case of “mutable store”
• GRRP: OGSI ServiceData notification
– Allows general Index transformation
Index Namespace Management
[Figure: a top-level AggDir indexes hosts under qualified names (hn=R1, O=O1; hn=R2, O=O1; hn=R3, O=O1; hn=R1, O=O2; hn=R2, O=O2). The per-organization AggDirs index hosts by local name (O1: hn=R1, hn=R2, hn=R3; O2: hn=R1, hn=R2). Each resource's ResDesc service names its own host data (e.g. hn=R1).]
• ServiceData is named within home service
• Qualifying “source name” to disambiguate in index, or use URLs to refer to remote info
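
The name collision that qualification solves can be shown with the host and organization names from this slide: both O1 and O2 have a host named R1, so an index that merges them must add the source name. This is a minimal sketch; the `qualify` helper and the dictionary representation of an entry are hypothetical, not an MDS data format.

```python
def qualify(entries, source):
    """Copy entries into an index, tagging each with the name of its source."""
    return [dict(entry, O=source) for entry in entries]


# Each organization names its hosts locally, so "R1" appears twice.
o1_hosts = [{"hn": "R1"}, {"hn": "R2"}, {"hn": "R3"}]
o2_hosts = [{"hn": "R1"}, {"hn": "R2"}]

# The aggregate index qualifies every entry with its organization.
index = qualify(o1_hosts, "O1") + qualify(o2_hosts, "O2")
names = [f"hn={e['hn']}, O={e['O']}" for e in index]
```

After qualification all five names in `names` are distinct, even though only three distinct local host names exist.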
GRAM Architecture
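[Figure: the Application passes RSL to a Broker, which specializes it, using queries and info from the Information Service, into ground RSL; a Co-allocator splits ground RSL into simple ground RSL requests for GRAM services, which front local resource managers (LSF, Condor, NQE).]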
Resource Specification Language
• Common notation for exchange of information between components
– Meant as a machine-to-machine language
• RSL provides two types of information:
– Resource requirements: Machine type, number of nodes, memory, etc.
– Job configuration: Directory, executable, args, environment
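
As an illustration of the two kinds of information, the sketch below renders one dictionary of resource requirements and one of job configuration as a single RSL-style conjunction of `(name=value)` relations. It rests on assumptions: the `to_rsl` helper is hypothetical and not part of any Globus library, the attribute names (`count`, `maxMemory`, `directory`, `executable`) follow GRAM RSL usage as commonly documented, and no quoting or multi-valued attributes are handled.

```python
def to_rsl(requirements, configuration):
    """Render two attribute dicts as one conjunction: &(name=value)(name=value)..."""
    relations = {**requirements, **configuration}
    return "&" + "".join(f"({name}={value})" for name, value in relations.items())


# Resource requirements: what the job needs from the machine (assumed values).
requirements = {"count": 4, "maxMemory": 512}

# Job configuration: how to launch the job (assumed values).
configuration = {"directory": "/tmp", "executable": "/bin/hostname"}

rsl = to_rsl(requirements, configuration)
print(rsl)
```

Because the notation is meant for machine-to-machine exchange, a broker or co-allocator would normally generate and rewrite such strings rather than a person typing them.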
Advance Reservation and Other Generalizations
• General-purpose Architecture for Reservation and Allocation (GARA)
– 2nd generation resource management services
• Broadens GRAM on two axes
– Generalize to support various resource types
> CPU, storage, network, devices, etc.
– Advance reservation of resources, in addition to allocation
• Currently a research prototype
GARA: The Big Picture
[Figure: a Co-Reservation Agent, the MDS Info Service, and four Gatekeepers, each fronting one resource manager: GRIO RM, Scheduler RM, Diffserv RM, and DSRT RM.]
GRAM-2 (planned for GT-3)
• Advance reservations
– As prototyped in GARA in previous 2 years
• Multiple resource types
– Manage anything: storage, networks, etc., etc.
• Recoverable requests, timeout, etc.
• Exploit OGSI capabilities
– Reliable lifetime management
– Use ServiceData mechanisms
– Depend on generalized security solutions
Karl Czajkowski, Steve Tuecke, others
GRAM-2 Agreement Model
• Submission agreements
– Manager agrees to run task for client
– Temporary service deployment
• Assignment agreements
– Manager agrees to provide resources
– Advance reservation and QoS
• Binding agreements
– Manager binds assignment to task
– Allows complex RM arrangements
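
The relationship between the three agreement types can be sketched as plain records held by a manager. This is an illustration only, not the GRAM-2 interface: the class names mirror the slide, but every field (`task`, `resource`, `start`, `end`), the integer time units, and the `agree` and `bind` methods are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    task: str  # manager agrees to run this task for the client


@dataclass(frozen=True)
class Assignment:
    resource: str  # manager agrees to provide this resource...
    start: int     # ...over this window (an advance reservation)
    end: int


@dataclass(frozen=True)
class Binding:
    submission: Submission  # manager binds the assignment to the task
    assignment: Assignment


class Manager:
    def __init__(self):
        self.agreements = []

    def agree(self, agreement):
        self.agreements.append(agreement)
        return agreement

    def bind(self, submission, assignment):
        # Only agreements this manager has already made can be bound together.
        if submission not in self.agreements or assignment not in self.agreements:
            raise ValueError("unknown agreement")
        return self.agree(Binding(submission, assignment))


manager = Manager()
job = manager.agree(Submission("simulate"))
slot = manager.agree(Assignment("R2", start=100, end=200))
binding = manager.bind(job, slot)
```

Keeping submission and assignment separate is what allows the more complex arrangements: a reservation can be made before the task exists, and a task can later be bound to a different reservation.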
Mapping to OGSA
• Manager is a Factory
– Agreements rendered as transient services
– Agreements present simple meta-interface
• Agreements embed Resource Description
– XML-based Resource Model forms “RSL2”
• Tasks may reflect as Grid Services
– Provide OGSI service interface
> Including ServiceData and domain-specific methods
– Appear in service registries
> Become discoverable resources themselves
Moving Forward
• MDS-2 Architecture details
– Paper from HPDC-10 on www.globus.org
• GRAM-2 Architecture details
– Paper submitted for publication
> Contact me (Karl Czajkowski) for access
– Grid Service rendering being outlined
> Perhaps a BOF at GGF-5?
• Resource Modeling
– Not just for requests… also advertisement
– Needs GGF discussion