An Open Grid Services Architecture & Specification Steven Newhouse Jeff Nick, IBM

An Open Grid Services
Architecture & Specification
Steven Newhouse
London e-Science Centre
Jeff Nick, IBM
Steve Tuecke / Ian Foster
Argonne National Laboratory
Globus Project™
http://www.globus.org
Partial Acknowledgements
• Open Grid Services Architecture work is performed in collaboration with
– Ian Foster, Globus Co-PI @ ANL & UC
– Carl Kesselman, Globus Co-PI @ USC/ISI
– Steve Tuecke, Globus Toolkit Architect @ ANL
– Jeff Nick, Steve Graham, Jeff Frey @ IBM
• Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)
• Strong collaborations with many outstanding EU, UK, US Grid projects
• Support from DOE, NASA, NSF, Microsoft
Grid Services
www.globus.org/ogsa
Why Grids?
• eScience
– A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
– 1,000 physicists worldwide pool resources for peta-op analyses of petabytes of data
• eBusiness
– An application service provider offloads excess load to a compute cycle provider
– An enterprise configures internal & external resources to support eBusiness workload
• Technology Drivers
– Moore’s law ⇒ highly functional end-systems
– Ubiquitous Internet ⇒ universal connectivity
Elements of the Grid Problem
• Resource sharing
– Computers, storage, sensors, networks, …
– Heterogeneity of device, mechanism, policy
– Sharing conditional: negotiation, payment, …
• Coordinated problem solving
– Integration of distributed resources
– Compound quality of service requirements
• Dynamic, multi-institutional virtual orgs
– Dynamic overlays on classic org structures
– Map to underlying control mechanisms
The Grid World: Current Status
• Dozens of major Grid projects in scientific & technical computing/research & education
– Deployment, application, technology
• Considerable consensus on key concepts and technologies
– Open source Globus Toolkit™ a de facto standard for major protocols & services
– Far from complete or perfect, but out there, evolving rapidly, and large tool/user base
• Global Grid Forum a significant force
• Industrial interest emerging rapidly
The Globus Toolkit in One Slide
• Grid protocols (GSI, GRAM, …) enable resource sharing within virtual orgs; toolkit provides reference implementation (= Globus Toolkit services)
[Figure: Globus Toolkit components. Labels: User; GSI (Grid Security Infrastructure): authenticate & create proxy credential; reliable remote invocation; GRAM (Grid Resource Allocation & Management): Gatekeeper (factory), create process, user process #1 with proxy, user process #2 with proxy #2; Reporter (registry + discovery): register; MDS-2 (Meta Directory Service): soft state registration, enquiry; GIIS: Grid Information Index Server (discovery); other GSI-authenticated remote service requests; other service (e.g. GridFTP)]
• Protocols (and APIs) enable other tools and services for membership, discovery, data mgmt, workflow, …
Globus Toolkit: Evaluation (+)
• Good technical solutions for key problems, e.g.
– Authentication and authorization
– Resource discovery and monitoring
– Reliable remote service invocation
– High-performance remote data access
• This + good engineering is enabling progress
– Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
– Growing community code base built on tools
Globus Toolkit: Evaluation (-)
• Protocol deficiencies, e.g.
– Heterogeneous basis: HTTP, LDAP, FTP
– No standard means of invocation, notification, error propagation, authorization, termination, …
• Significant missing functionality, e.g.
– Databases, sensors, instruments, workflow, …
– Virtualization of end systems (hosting envs.)
• Little work on total system properties, e.g.
– Dependability, end-to-end QoS, …
– Reasoning about system properties
Service Oriented Architecture
Service Oriented Architectures
• Web Services
– WSDL, SOAP, UDDI
• CORBA
– IDL, ORBs
• Jini/Java
– RMI, Look-up Server
• The Web
– HTML+EYES, FTP/HTTP
• …
“Web Services”
• Increasingly popular standards-based framework for accessing network applications
– W3C standardization; Microsoft, IBM, Sun, others
• WSDL: Web Services Description Language
– Interface Definition Language for Web services
• SOAP: Simple Object Access Protocol
– XML-based RPC protocol; common WSDL target
• WS-Inspection
– Conventions for locating service descriptions
• UDDI: Universal Description, Discovery, & Integration
– Directory for Web services
Transient Service Instances
• “Web services” address discovery & invocation of persistent services
– Interface to persistent state of entire enterprise
• In Grids, must also support transient service instances, created/destroyed dynamically
– Interfaces to the states of distributed activities
– E.g. workflow, video conf., dist. data analysis
• Significant implications for how services are managed, named, discovered, and used
– In fact, much of our work is concerned with the management of service instances
Open Grid Services Architecture
• Service orientation to virtualize resources
• From Web services:
– Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
• Building on Globus Toolkit:
– Grid service: semantics for service interactions
– Management of transient instances (& state)
– Factory, Registry, Discovery, other services
– Reliable and secure transport
• Multiple hosting targets: J2EE, .NET, “C”, …
OGSA Service Model
• System comprises (typically a few) persistent services & (potentially many) transient services
• All services adhere to specified Grid service interfaces and behaviors
– Reliable invocation, lifetime management, discovery, authorization, notification, upgradeability, concurrency, manageability
• Interfaces for managing Grid service instances
– Factory, registry, discovery, lifetime, etc.
=> Reliable, secure mgmt of distributed state
Specification of Protocols
• The “Grid Service Specification” is a protocol specification
– Only concerned with issues of how clients interact with a service
– Promotes interoperable implementations
> E.g. J2EE, .NET, Python, C, etc.
• Hosting environment issues are out of scope
– Will be addressed in other specifications
> E.g. How to write a Grid service as an EJB.
Open Grid Services Architecture:
Fundamental Structure
1) WSDL conventions and extensions for
describing and structuring services
– Useful independent of “Grid” computing
2) Standard WSDL interfaces & behaviors for
core service activities
– portTypes and operations => protocols
– Define common patterns that occur
repeatedly in Grid settings
Use of Web Services (1)
• A Grid service interface is a WSDL portType
• A Grid service definition is a WSDL extension (serviceType) containing:
– A set of one or more portTypes supported by the service
– portType & serviceType compatibility statements, to support upgradeability
> For discovery of compatible services when interfaces are upgraded
– Implementation version information
Use of Web Services (2)
• A GSR is a WSDL document with extensions:
– Extension to service element to reference serviceType
– Service element extensions to carry the GSH, and the expiration time of the GSR
• A GSH is a URL, with the following properties:
– Globally unique for all time
– HTTP GET on GSH + “.wsdl” returns GSR
– Can derive GSH to Mapper from it
• Registry returns WS-Inspection documents
Using OGSA
to Construct Grid Environments
[Figure: three panels. (a) Simple Hosting Environment: Factory, Registry, H2R Mapper, Service instances. (b) Virtual Hosting Environment: E2E Factory, E2E Reg, E2E H2R Mapper, and per-environment F, R, M, S. (c) Compound Services: E2E S instances over hosting environments 1 and 2, each with F, R, M, S.]
In each case, the Registry handle is effectively the unique name for the virtual organization.
OGSA and the Globus Toolkit
• Technically, OGSA enables
– Refactoring of protocols (GRAM, MDS-2, etc.) while preserving all GT concepts/features!
– Integration with hosting environments: simplifying components, distribution, etc.
– Greatly expanded standard service set
• Pragmatically, we are proceeding as follows
– Develop open source OGSA implementation
> Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
– Partnerships for service development
– Also expect commercial value-adds
Globus Toolkit Refactoring
• Grid Security Infrastructure (GSI)
– Used in Grid service network protocol bindings
• Meta Directory Service 2 (MDS-2)
– Native part of each Grid service:
> Discovery, Registry, RegistryManagement, Notification
• Grid Resource Allocation & Mgmt (GRAM)
– Gatekeeper -> Factory for job mgr instances
• GridFTP
– Refactor control channel protocol
• Other services refactored to use Grid services
WSDL Conventions & Extensions
• portType (standard WSDL)
– Define an interface: a set of related operations
• serviceType (extensibility element)
– List of port types: enables aggregation
• serviceImplementation (extensibility element)
– Represents actual code
• service (standard WSDL)
– instanceOf extension: map descr. -> instance
• compatibilityAssertion (extensibility element)
– portType, serviceType, serviceImplementation
Structure of a Grid Service
[Figure: two layers. Service Instantiation: service elements, each linked by instanceOf to a serviceImplementation. Service Description: serviceImplementation, serviceType and PortType elements, with cA links between them. Legend: shaded = standard WSDL; cA = compatibilityAssertion.]
Service Description
• Describes how a client interacts with a service, independent of any particular service instance
• Primary purposes:
– Discovery: find services of interest
– Tooling: generate client proxies & server code
• Any number of service instances may bind to a particular service description
Discovery
• Discovery drove many details of the GS Spec
• Examples: Find me a service that…
– supports a particular set of operations.
– can create a service that supports operations.
– will respond as I expect to an op request.
– I can use.
– is currently suspended waiting for input.
– has 10MB bandwidth to my machine.
– has 5ms latency to any copy of my database.
– has various combinations of these…
Tooling
• Standard WSDL has most of what is needed for code generation
– Client proxies in various languages
– Server skeletons
• One missing bit: serviceType
– WSDL <service> element is ambiguous about the relationship of its ports
> Do you generate one class that is a union of the operations from all portTypes, or separate classes for each port?
Capturing Semantics
• Service description obviously captures interface syntax
• But capturing semantic meaning is critical for discovery
– Not only does the service accept an operation request with a particular signature
– But it should also respond as expected
> “As expected” is usually defined offline in specifications
• Approach: name everything
– Use names as basis for reasoning about semantics
Compatibility Assertions
• One type of semantic reasoning is about compatibility between services
– Just because two services implement the same operations does not necessarily imply that they can be used interchangeably by a client
• Current approach: define compatibility relations between named parts of the service description
• But who is making the assertion?
– Probably will move this out of WSDL, and into service data of compatibility services
Standard Interfaces & Behaviors:
Four Interrelated Concepts
• Naming and bindings
– Every service instance has a unique name, from which a client can discover supported bindings
• Information model
– Service data associated with Grid service instances, operations for accessing this info
• Notification
– Interfaces for registering interest and delivering notifications
• Lifecycle
– Service instances created by factories
– Destroyed explicitly or via soft state
OGSA Interfaces and Operations
Defined to Date
• GridService (Required)
– FindServiceData
– Destroy
– SetTerminationTime
• NotificationSource
– SubscribeToNotificationTopic
– UnsubscribeToNotificationTopic
• NotificationSink
– DeliverNotification
• Factory
– CreateService
• PrimaryKey
– FindByPrimaryKey
– DestroyByPrimaryKey
• Registry
– RegisterService
– UnregisterService
• HandleMap
– FindByHandle
Authentication, reliability are binding properties
Manageability, concurrency, etc., to be defined
Composition of portTypes
• We are trying to define basic patterns of interaction, which can be combined with each other and with custom patterns in a myriad of ways
• GS Spec focuses on:
– Atomic, composable patterns in the form of portTypes and service data element types
– A model for how these are composed
• Actual serviceType definitions are left to other groups that are defining real services
– More on this later…
Naming and Bindings
• Every service instance has a unique and immutable name: Grid Service Handle (GSH)
– Basically just a URL
• Handle must be converted to a Grid Service Reference (GSR) to use service
– Includes binding information; may expire
– Separation of name from implementation facilitates service evolution
• The HandleMap interface allows a client to map from a GSH to a GSR
– Each service instance has home HandleMap
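The handle-to-reference step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the specification's API: the class names, the `ttl` parameter, the `soap://` binding strings and the client-side cache are all hypothetical. It shows the one property the slide relies on: the GSH never changes, while the GSR it resolves to may expire and be replaced.

```python
from dataclasses import dataclass


@dataclass
class GridServiceReference:
    """Hypothetical GSR: binding details plus an expiry time (seconds)."""
    handle: str
    binding: str
    expires_at: float


class HandleMap:
    """Toy HandleMap: maps an immutable GSH to a current, expiring GSR."""

    def __init__(self, ttl: float = 60.0):
        self._bindings = {}  # GSH -> current binding string
        self._ttl = ttl

    def bind(self, gsh: str, binding: str) -> None:
        # The service may move; only the binding changes, never the GSH.
        self._bindings[gsh] = binding

    def find_by_handle(self, gsh: str, now: float) -> GridServiceReference:
        return GridServiceReference(gsh, self._bindings[gsh], now + self._ttl)


def resolve(cache: dict, handle_map: HandleMap, gsh: str, now: float) -> GridServiceReference:
    """Return a cached GSR while it is valid; otherwise ask the HandleMap again."""
    gsr = cache.get(gsh)
    if gsr is None or gsr.expires_at <= now:
        gsr = handle_map.find_by_handle(gsh, now)
        cache[gsh] = gsr
    return gsr
```

A client that keeps only the GSH can therefore follow a service that is rebound to a new endpoint, at the cost of one extra lookup whenever its cached GSR expires.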
Observations on Handles
• Names vs references vs handles
– Handle is a special name that is known to the service
• Perhaps not as special as we first thought
– Maybe just another form of reference, which requires a particular type of resolver
– Originally thought handle could be used for policy assertions about the service
> But this only works if handle is an authenticated name
• Should generalize the specification to allow for other handle/resolver techniques
Service Data
• A Grid service instance maintains a set of service data elements
– XML fragments encapsulated in standard <name, type, TTL-info> containers
– Includes basic introspection information, interface-specific data, and application data
• FindServiceData operation (GridService interface) queries this information
– Extensible query language support
• See also notification interfaces
– Allows notification of service existence and changes in service data
Why Service Data?
• Discovery often requires instance-specific, perhaps dynamic information
• Service data offers a general solution
– Every service must support some common service data, and may support any additional service data desired
– Not just meta-data, but also instance state
• Part of the MDS-2 model contained in OGSA
– Defines standard data model, and query op
– Complements soft-state registration and notification
ServiceData Attributes
• name: local name for this Grid service data element.
• globalName: A global name (i.e. QName) for this Grid service data element.
• type: The XML schema type of the element contained in the extensibility element.
• goodFrom: Declares the time from which the value of the SDE carried in its extensibility element is said to be valid. This is typically the time at which the contained element was created or aggregated.
• goodUntil: Declares the time until which the value of the SDE carried in its extensibility elements is said to be valid. This value MUST be greater than the goodFrom time.
• availableUntil: Declares the time until which this named SDE is expected to be available. Prior to this time, a client SHOULD be able to query for an updated value of this SDE. This value MUST be greater than the goodFrom time.
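The two MUST constraints on goodUntil and availableUntil can be checked mechanically. The sketch below assumes timestamps are written as twelve digits, YYYYMMDDHHMM (the layout used in the deck's ServiceData example); the function names are hypothetical and the real specification may use a different time representation.

```python
from datetime import datetime

STAMP = "%Y%m%d%H%M"  # assumed layout, e.g. "200204271020"


def parse_stamp(text: str) -> datetime:
    return datetime.strptime(text, STAMP)


def check_sde_times(good_from: str, good_until: str, available_until: str) -> list:
    """Return a list of violated constraints (empty when the SDE is consistent)."""
    start = parse_stamp(good_from)
    problems = []
    if parse_stamp(good_until) <= start:
        problems.append("goodUntil must be greater than goodFrom")
    if parse_stamp(available_until) <= start:
        problems.append("availableUntil must be greater than goodFrom")
    return problems


def is_fresh(good_from: str, good_until: str, now: str) -> bool:
    """True when `now` falls inside the [goodFrom, goodUntil] validity window."""
    return parse_stamp(good_from) <= parse_stamp(now) <= parse_stamp(good_until)
```

A client could use `is_fresh` to decide whether a cached value is still usable, and re-query the service for an updated value otherwise.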
ServiceData Example
<gsdl:serviceData name="foo" type="n1:sometype"
    goodFrom="200204271020" goodUntil="200204271120"
    availableUntil="200204281020">
  <n1:e1>
    <n1:e2>
      abc
    </n1:e2>
    <n1:e3 gsdl:goodUntil="200204271030">
      def
    </n1:e3>
    <n1:e4 gsdl:availableUntil="200203272020">
      ghi
    </n1:e4>
  </n1:e1>
</gsdl:serviceData>
FindServiceData
• Standard query operation against a service’s service data elements
– Simple “by name” query language required
– Can support XPath, XQuery, etc.
• Simple, extensible query operation
– Not meant to be the end-all, be-all of query interfaces
– Expect other groups to define query interfaces designed to handle other data types (e.g. relational), large responses (e.g. iterator-based interface), etc.
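As a rough sketch of "one required by-name language plus an extension point", the Python below keeps service data elements in a dictionary and dispatches each query on a language name. The class, method names and the plug-in mechanism are assumptions made for illustration; the specification defines the operation in WSDL, not in this form.

```python
class GridServiceInstance:
    """Toy instance holding service data elements (SDEs) keyed by name."""

    def __init__(self):
        self._sde = {}
        # The by-name language is always present; others are optional.
        self._languages = {"by-name": self._query_by_name}

    def set_service_data(self, name, value):
        self._sde[name] = value

    def register_query_language(self, language, evaluator):
        # Extension point: e.g. an XPath evaluator could be plugged in here.
        self._languages[language] = evaluator

    def _query_by_name(self, sde, expression):
        return [sde[expression]] if expression in sde else []

    def find_service_data(self, expression, language="by-name"):
        if language not in self._languages:
            raise ValueError(f"unsupported query language: {language}")
        # Evaluators receive a copy so they cannot mutate instance state.
        return self._languages[language](dict(self._sde), expression)
```

The operation signature stays fixed while the set of query languages grows, which is one way to read "simple, extensible" on the slide.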
Static Service Data
• In order to support rich discovery, we often want to annotate a WSDL serviceType with additional information
– Meta-data and policies about service
– What service data the service supports
• Maybe support service data in WSDL
– A serviceType can reference a set of service data elements
– All static service data also available from instance via FindServiceData
Notification Interfaces
• NotificationSource for client subscription
– Persistent query against service data
> Generates notification message, whose type is determined by the query
> Filters, topics, etc. can be represented in query language
> Supports messaging services, 3rd party filter services, …
– Soft state subscription to a generator
• NotificationSink for asynchronous delivery of notification messages
• A wide variety of uses are possible
– E.g. Dynamic discovery/registry services, monitoring, application error notification, …
Notification & FindServiceData
• In current spec they are somewhat separate
• They are being unified in new spec
– Both are simply forms of query against the service data of an instance
– FindServiceData is a simple query (pull)
– Notification subscription is a persistent query, with asynchronous response (push)
• Interesting open questions on what the subscription language should look like
– How to define temporal aspects of query?
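The pull/push symmetry is easy to see when both forms share one predicate. In this hypothetical Python sketch (names and the callable-as-query representation are assumptions, not part of the specification), `find` evaluates a query once, while `subscribe` stores the same query and re-evaluates it on every update.

```python
class ServiceData:
    """Toy service data store offering a pull query and a persistent (push) query."""

    def __init__(self):
        self._values = {}
        self._subscriptions = []  # (predicate, sink) pairs

    def find(self, predicate):
        # Pull: evaluate the query once against the current state.
        return {k: v for k, v in self._values.items() if predicate(k, v)}

    def subscribe(self, predicate, sink):
        # Push: keep the query and deliver matching changes to the sink.
        self._subscriptions.append((predicate, sink))

    def update(self, name, value):
        self._values[name] = value
        for predicate, sink in self._subscriptions:
            if predicate(name, value):
                sink(name, value)
```

This sketch ignores the temporal questions the slide raises (for example, whether a subscriber should see the value that matched before it subscribed).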
Notification Subscription Lifetime
• Another planned change is to use normal service lifetime management approach to manage subscription lifetime
– A subscription is just a factory operation, which creates a new service that represents the subscription state
– SetTerminationTime & Destroy can be used to manage lifetime of that subscription
– The service data of the subscription service contains information about the subscription
Lifetime Management
• GS instances created by factory or manually; destroyed explicitly or via soft state
– Negotiation of initial lifetime with a factory
• GridService interface supports
– Destroy operation for explicit destruction
– SetTerminationTime operation for keepalive
• Soft state lifetime management avoids
– Explicit client teardown of complex state
– Resource “leaks” in hosting environments
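A minimal sketch of soft-state lifetime management, assuming a hosting environment that periodically sweeps for lapsed instances. The class, the `sweep` method and the use of plain numbers for time are hypothetical; only the Destroy and SetTerminationTime operation names come from the slides.

```python
class HostingEnvironment:
    """Toy hosting environment that reclaims instances whose lifetime has lapsed."""

    def __init__(self):
        self._termination = {}  # handle -> termination time (seconds)

    def create(self, handle, now, lifetime):
        self._termination[handle] = now + lifetime

    def set_termination_time(self, handle, new_time):
        # Keepalive: the client pushes the termination time further out.
        if handle not in self._termination:
            raise KeyError(handle)
        self._termination[handle] = new_time

    def destroy(self, handle):
        # Explicit destruction.
        self._termination.pop(handle, None)

    def sweep(self, now):
        """Remove expired instances and return their handles."""
        expired = [h for h, t in self._termination.items() if t <= now]
        for handle in expired:
            del self._termination[handle]
        return expired

    def alive(self):
        return sorted(self._termination)
```

If a client crashes and stops sending keepalives, its instances disappear at the next sweep, which is how soft state avoids both explicit teardown and resource leaks.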
Lifetime Management Questions
• Should Destroy and SetTerminationTime be required operations?
• What are semantics of SetTerminationTime?
– Contract between client and service, related to accounting?
> Client is willing to keep paying for service until time X
> Service will not charge for service after time Y
Factory
• Factory interface’s CreateService operation creates a new Grid service instance
– Reliable creation (once-and-only-once)
> Is reliability part of service interface, or at binding level?
• “Reliable messaging” vs “reliable invocation”
• CreateService operation can be extended to accept service-specific creation parameters
• Returns a Grid Service Handle (GSH)
– A globally unique URL
– Uniquely identifies the instance for all time
– Based on name of a home handleMap service
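One way to picture once-and-only-once creation is duplicate suppression keyed on a client-supplied request id. That technique is an assumption of this sketch, not something the slides prescribe (they leave open whether reliability belongs in the interface or the binding). The `Factory` class, `request_id` parameter and handle layout are likewise hypothetical.

```python
import uuid


class Factory:
    """Toy factory: CreateService returns a handle rooted at a home HandleMap URL."""

    def __init__(self, home_handle_map_url):
        self._home = home_handle_map_url.rstrip("/")
        self._by_request = {}  # request id -> handle, for duplicate suppression
        self.instances = {}    # handle -> service-specific creation parameters

    def create_service(self, request_id, **creation_params):
        # A retried request with the same id returns the original handle
        # instead of creating a second instance.
        if request_id in self._by_request:
            return self._by_request[request_id]
        handle = f"{self._home}/{uuid.uuid4()}"
        self.instances[handle] = creation_params
        self._by_request[request_id] = handle
        return handle
```

The keyword arguments stand in for the service-specific creation parameters mentioned above; a random UUID is used here only to make handles unique within the sketch.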
Factories as Templates
• Factories are under-specified in current spec
– There is an extensibility argument that hides all the interesting input/output parameters
> Good because it allows for generic clients
> Bad because it hinders discovery
• Two options:
– Move to differently-named, fully-typed factory creation operations
> Factories are just a concept (e.g. consider subscription)
– Use service data to describe what a particular factory supports in its extensibility arguments
> Single Factory portType which is basically a template
Factories and Virtualization
• Consider a factory to create a given service
– CreateService expects particular input args
• GS interfaces permit various implementations, translucent to the client
– Simple factory might create service within its own hosting environment
– Factory might discover an appropriate resource to host the service, and delegate request to another factory
– Factory may decompose request into multiple service creations, & create aggregating service
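The first two implementation styles can be sketched side by side. In this hypothetical Python example the client calls the same `create_service` operation on either object; the selection rule (most free slots) and all names and URLs are assumptions chosen only to make the delegation visible.

```python
class LocalFactory:
    """Creates the service inside its own (named) hosting environment."""

    def __init__(self, host, free_slots):
        self.host = host
        self.free_slots = free_slots
        self._count = 0

    def create_service(self, **args):
        self.free_slots -= 1
        self._count += 1
        return f"http://{self.host}/services/{self._count}"


class BrokerFactory:
    """Same operation, but picks a resource and delegates to its factory."""

    def __init__(self, factories):
        self._factories = factories

    def create_service(self, **args):
        # Choose the hosting environment with the most spare capacity.
        target = max(self._factories, key=lambda f: f.free_slots)
        if target.free_slots <= 0:
            raise RuntimeError("no hosting environment has capacity")
        return target.create_service(**args)
```

Because a broker exposes the same operation as the factories it wraps, brokers can themselves be nested behind other brokers.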
Observations on Registry
• Perhaps “Registration” is a better name than “Registry”
– Not concerned with registry query
– Just notification of existence
• Debating if registration should just fold into a notification subscription
• More on this later…
Example: Building Registries
• Options for building registries…
– Need for radically different query capabilities
– Topology of services used for discovery
> E.g. hierarchical, p2p
• Can illuminate important aspects of OGSA…
– Composition of interfaces
– Service data
– Multiple protocol bindings
Architecting Registries
• There is no single registry that can serve all purposes
– A Virtual Organization (community) must architect registries that are appropriate to their needs
• But there are common primitives that can be used to architect many different registries
– Service data
– Notification
– Soft-state registration
Need for Different Queries
• Need registries that can answer radically different queries
– “Find me all Redhat Linux 7.2 machines which are available for my use with a load < 0.3.”
> Requires a registry that can deal with dynamic information
– “Find me both an available cluster and one of my project database servers with good network connectivity between them.”
> Requires a registry that can join information from multiple services
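The first query above can be answered by a registry built from two of the primitives named earlier, soft-state registration and service data. The Python below is a sketch under assumptions: the `Registry` class, the dictionary-shaped service data, the `ttl` parameter and the predicate-as-query representation are hypothetical, and the join query across multiple services is not attempted.

```python
class Registry:
    """Toy registry: soft-state registrations carrying service data, plus a filter query."""

    def __init__(self):
        self._entries = {}  # handle -> (expiry time, service data dict)

    def register_service(self, handle, service_data, now, ttl):
        # Re-registering refreshes both the data and the expiry time.
        self._entries[handle] = (now + ttl, dict(service_data))

    def unregister_service(self, handle):
        self._entries.pop(handle, None)

    def query(self, predicate, now):
        """Return handles of unexpired entries whose service data satisfies the predicate."""
        return sorted(
            handle
            for handle, (expiry, data) in self._entries.items()
            if expiry > now and predicate(data)
        )
```

Dynamic values such as load stay current only as long as services keep re-registering, so the registration interval bounds how stale an answer can be.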
Use of Service Data
• A registry’s service data should be architected to support query requirements
– Customized service data XML types
– More powerful (e.g. XPath, XQuery) or custom query languages
• A registry is defined largely by its service data and query language
Discovery Topologies
• GS patterns can be applied in various ways to build discovery topologies
– Hierarchical with caching
– Hierarchical with forwarding
– Peer-to-peer mesh
– Multicast/broadcast
Summary:
Evolution of Grid Technologies
• Initial exploration (1996-1999; Globus 1.0)
– Extensive application experiments; core protocols
• Data Grids (1999-??; Globus 2.0+)
– Large-scale data management and analysis
• Open Grid Services Architecture (2001-??; Globus 3.0)
– Integration w/ Web services, hosting environments, resource virtualization
– Databases, higher-level services
• Radically scalable systems (2003-??)
– Sensors, wireless, ubiquitous computing
Summary
• The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
• Grid architecture: Protocol, service definition for interoperability & resource sharing
• Globus Toolkit a source of protocol and API definitions, and reference implementations
– And many projects applying Grid concepts (& Globus technologies) to important problems
• Open Grid Services Architecture represents (we hope!) next step in evolution
For More Information
• The Globus Project™
– www.globus.org
• Grid architecture
– www.globus.org/research/papers/anatomy.pdf
• Open Grid Services Architecture (soon)
– www.globus.org/research/papers/ogsa.pdf
– www.globus.org/research/papers/gsspec.pdf
The End