London e-Science Centre

An Open Grid Services Architecture & Specification

Steven Newhouse, London e-Science Centre
Jeff Nick, IBM
Steve Tuecke / Ian Foster, Argonne National Laboratory
Globus Project™ http://www.globus.org

Partial Acknowledgements
• Open Grid Services Architecture work is performed in collaboration with
  – Ian Foster, Globus Co-PI @ ANL & UC
  – Carl Kesselman, Globus Co-PI @ USC/ISI
  – Steve Tuecke, Globus Toolkit Architect @ ANL
  – Jeff Nick, Steve Graham, Jeff Frey @ IBM
• Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)
• Strong collaborations with many outstanding EU, UK, US Grid projects
• Support from DOE, NASA, NSF, Microsoft

Grid Services www.globus.org/ogsa

Why Grids?
• eScience
  – A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
  – 1,000 physicists worldwide pool resources for peta-op analyses of petabytes of data
• eBusiness
  – An application service provider offloads excess load to a compute cycle provider
  – An enterprise configures internal & external resources to support eBusiness workload
• Technology Drivers
  – Moore’s law ⇒ highly functional end-systems
  – Ubiquitous Internet ⇒ universal connectivity

Elements of the Grid Problem
• Resource sharing
  – Computers, storage, sensors, networks, …
  – Heterogeneity of device, mechanism, policy
  – Sharing conditional: negotiation, payment, …
• Coordinated problem solving
  – Integration of distributed resources
  – Compound quality of service requirements
• Dynamic, multi-institutional virtual orgs
  – Dynamic overlays on classic org structures
  – Map to underlying control mechanisms

The Grid World: Current Status
• Dozens of major Grid projects in scientific & technical computing/research & education
  – Deployment, application, technology
• Considerable consensus on key concepts and technologies
  – Open source Globus Toolkit™ a de facto standard for major protocols & services
  – Far from complete or perfect, but out there, evolving rapidly, and large tool/user base
• Global Grid Forum a significant force
• Industrial interest emerging rapidly

The Globus Toolkit in One Slide
• Grid protocols (GSI, GRAM, …) enable resource sharing within virtual orgs; toolkit provides reference implementation ( = Globus Toolkit services)
[Diagram labels: User; GSI (Grid Security Infrastructure): authenticate & create proxy credential; GRAM (Grid Resource Allocation & Management): reliable remote invocation; Gatekeeper (factory): create process; Reporter (registry + discovery): register; User process #1 and User process #2, each with a proxy; MDS-2 (Meta Directory Service): soft state registration, enquiry; GIIS: Grid Information Index Server (discovery); other GSI-authenticated remote service requests; other service (e.g. GridFTP)]
• Protocols (and APIs) enable other tools and services for membership, discovery, data mgmt, workflow, …

Globus Toolkit: Evaluation (+)
• Good technical solutions for key problems, e.g.
  – Authentication and authorization
  – Resource discovery and monitoring
  – Reliable remote service invocation
  – High-performance remote data access
• This + good engineering is enabling progress
  – Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
  – Growing community code base built on tools

Globus Toolkit: Evaluation (-)
• Protocol deficiencies, e.g.
  – Heterogeneous basis: HTTP, LDAP, FTP
  – No standard means of invocation, notification, error propagation, authorization, termination, …
• Significant missing functionality, e.g.
  – Databases, sensors, instruments, workflow, …
  – Virtualization of end systems (hosting envs.)
• Little work on total system properties, e.g.
  – Dependability, end-to-end QoS, …
  – Reasoning about system properties

Service Oriented Architecture

Service Oriented Architectures
• Web Services
  – WSDL, SOAP, UDDI
• CORBA
  – IDL, ORBs
• Jini/Java
  – RMI, Look-up Server
• The Web
  – HTML+EYES, FTP/HTTP
• …

“Web Services”
• Increasingly popular standards-based framework for accessing network applications
  – W3C standardization; Microsoft, IBM, Sun, others
• WSDL: Web Services Description Language
  – Interface Definition Language for Web services
• SOAP: Simple Object Access Protocol
  – XML-based RPC protocol; common WSDL target
• WS-Inspection
  – Conventions for locating service descriptions
• UDDI: Universal Desc., Discovery, & Integration
  – Directory for Web services

Transient Service Instances
• “Web services” address discovery & invocation of persistent services
  – Interface to persistent state of entire enterprise
• In Grids, must also support transient service instances, created/destroyed dynamically
  – Interfaces to the states of distributed activities
  – E.g. workflow, video conf., dist. data analysis
• Significant implications for how services are managed, named, discovered, and used
  – In fact, much of our work is concerned with the management of service instances

Open Grid Services Architecture
• Service orientation to virtualize resources
• From Web services:
  – Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
• Building on Globus Toolkit:
  – Grid service: semantics for service interactions
  – Management of transient instances (& state)
  – Factory, Registry, Discovery, other services
  – Reliable and secure transport
• Multiple hosting targets: J2EE, .NET, “C”, …

OGSA Service Model
• System comprises (typically a few) persistent services & (potentially many) transient services
• All services adhere to specified Grid service interfaces and behaviors
  – Reliable invocation, lifetime management, discovery, authorization, notification, upgradeability, concurrency, manageability
• Interfaces for managing Grid service instances
  – Factory, registry, discovery, lifetime, etc.
=> Reliable, secure mgmt of distributed state

Specification of Protocols
• The “Grid Service Specification” is a protocol specification
  – Only concerned with issues of how clients interact with a service
  – Promotes interoperable implementations
    > E.g. J2EE, .NET, Python, C, etc.
• Hosting environment issues are out of scope
  – Will be addressed in other specifications
    > E.g. how to write a Grid service as an EJB.
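The service model described above (a few persistent services, such as a factory, managing many transient instances with bounded lifetimes) can be sketched in a few lines of Python. This is an illustrative in-memory model only, not the Grid Service Specification or any Globus Toolkit API: the class names, the `sweep` method, the example URL, and the use of plain floats for time are all assumptions made for the sketch. Only the operation names CreateService, Destroy, and SetTerminationTime come from this deck.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ServiceInstance:
    """A transient service instance (hypothetical in-memory stand-in)."""
    handle: str               # plays the role of a unique instance name
    termination_time: float   # deadline after which the instance may be reclaimed
    service_data: dict = field(default_factory=dict)

    def set_termination_time(self, new_time: float) -> float:
        """Keepalive: the client moves the deadline (SetTerminationTime)."""
        self.termination_time = new_time
        return self.termination_time


class Factory:
    """A persistent service that creates and tracks transient instances."""

    def __init__(self, base_url: str):
        self.base_url = base_url          # hypothetical URL prefix for handles
        self._ids = itertools.count(1)
        self.instances = {}

    def create_service(self, now: float, lifetime: float, **service_data):
        """CreateService: mint a new handle and an initial lifetime."""
        handle = f"{self.base_url}/instances/{next(self._ids)}"
        instance = ServiceInstance(handle, now + lifetime, dict(service_data))
        self.instances[handle] = instance
        return instance

    def destroy(self, handle: str) -> None:
        """Destroy: explicit teardown requested by a client."""
        self.instances.pop(handle, None)

    def sweep(self, now: float) -> list:
        """Reclaim every instance whose deadline has passed (assumed helper)."""
        expired = [h for h, inst in self.instances.items()
                   if inst.termination_time <= now]
        for h in expired:
            del self.instances[h]
        return expired


factory = Factory("http://example.org/factory")
job = factory.create_service(now=0.0, lifetime=60.0, kind="analysis")
other = factory.create_service(now=0.0, lifetime=60.0, kind="workflow")
job.set_termination_time(300.0)      # client keeps `job` alive
expired = factory.sweep(now=120.0)   # `other` was never refreshed, so it lapses
```

In this sketch a client that disappears simply stops refreshing the deadline, and the hosting side reclaims the instance on the next sweep without any explicit teardown call.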
Open Grid Services Architecture: Fundamental Structure
1) WSDL conventions and extensions for describing and structuring services
  – Useful independent of “Grid” computing
2) Standard WSDL interfaces & behaviors for core service activities
  – portTypes and operations => protocols
  – Define common patterns that occur repeatedly in Grid settings

Use of Web Services (1)
• A Grid service interface is a WSDL portType
• A Grid service definition is a WSDL extension (serviceType) containing:
  – A set of one or more portTypes supported by the service
  – portType & serviceType compatibility statements, to support upgradability
    > For discovery of compatible services when interfaces are upgraded
  – Implementation version information

Use of Web Services (2)
• A GSR is a WSDL document with extensions:
  – Extension to service element to reference serviceType
  – Service element extensions to carry the GSH, and the expiration time of the GSR
• A GSH is a URL, with the following properties:
  – Globally unique for all time
  – HTTP GET on GSH + “.wsdl” returns GSR
  – Can derive GSH to Mapper from it
• Registry returns WS-Inspection documents

Using OGSA to Construct Grid Environments
[Figure, three panels: (a) Simple Hosting Environment; (b) Virtual Hosting Environment; (c) Compound Services. Labelled components include Factory, Registry, H2R Mapper, and Service, plus E2E Factory, E2E Registry, and E2E H2R Mapper.]
In each case, Registry handle is effectively the unique name for the virtual organization.

OGSA and the Globus Toolkit
• Technically, OGSA enables
  – Refactoring of protocols (GRAM, MDS-2, etc.) while preserving all GT concepts/features!
  – Integration with hosting environments: simplifying components, distribution, etc.
  – Greatly expanded standard service set
• Pragmatically, we are proceeding as follows
  – Develop open source OGSA implementation
    > Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
  – Partnerships for service development
  – Also expect commercial value-adds

Globus Toolkit Refactoring
• Grid Security Infrastructure (GSI)
  – Used in Grid service network protocol bindings
• Meta Directory Service 2 (MDS-2)
  – Native part of each Grid service:
    > Discovery, Registry, RegistryManagement, Notification
• Grid Resource Allocation & Mgmt (GRAM)
  – Gatekeeper -> Factory for job mgr instances
• GridFTP
  – Refactor control channel protocol
• Other services refactored to use Grid services

WSDL Conventions & Extensions
• portType (standard WSDL)
  – Define an interface: a set of related operations
• serviceType (extensibility element)
  – List of port types: enables aggregation
• serviceImplementation (extensibility element)
  – Represents actual code
• service (standard WSDL)
  – instanceOf extension: map descr.->instance
• compatibilityAssertion (extensibility element)
  – portType, serviceType, serviceImplementation

Structure of a Grid Service
[Figure: service elements (Service Instantiation) linked by instanceOf to serviceImplementation, serviceType, and PortType elements (Service Description); cA = compatibilityAssertion; a legend marks the standard WSDL elements.]

Service Description
• Describes how a client interacts with a service, independent of any particular service instance
• Primary purposes:
  – Discovery: find services of interest
  – Tooling: generate client proxies & server code
• Any number of service instances may bind to a particular service description

Discovery
• Discovery drove many details of the GS Spec
• Examples: Find me a service that…
  – supports a particular set of operations.
  – can create a service that supports operations.
  – will respond as I expect to an op request.
  – I can use.
  – is currently suspended waiting for input.
  – has 10MB bandwidth to my machine.
  – has 5ms latency to any copy of my database.
  – has various combinations of these…

Tooling
• Standard WSDL has most of what is needed for code generation
  – Client proxies in various languages
  – Server skeletons
• One missing bit: serviceType
  – WSDL <service> element is ambiguous about the relationship of its ports
    > Do you generate one class that is a union of the operations from all portTypes, or separate classes for each port?

Capturing Semantics
• Service description obviously captures interface syntax
• But capturing semantic meaning is critical for discovery
  – Not only does the service accept an operation request with a particular signature
  – But it should also respond as expected
    > “As expected” is usually defined offline in specifications
• Approach: name everything
  – Use names as basis for reasoning about semantics

Compatibility Assertions
• One type of semantic reasoning is about compatibility between services
  – Just because two services implement the same operations does not necessarily imply that they can be used interchangeably by a client
• Current approach: define compatibility relations between named parts of the service description
• But who is making the assertion?
  – Probably will move this out of WSDL, and into service data of compatibility services

Standard Interfaces & Behaviors: Four Interrelated Concepts
• Naming and bindings
  – Every service instance has a unique name, from which supported bindings can be discovered
• Information model
  – Service data associated with Grid service instances, operations for accessing this info
• Notification
  – Interfaces for registering interest and delivering notifications
• Lifecycle
  – Service instances created by factories
  – Destroyed explicitly or via soft state

OGSA Interfaces and Operations Defined to Date
• GridService (required)
  – FindServiceData
  – Destroy
  – SetTerminationTime
• NotificationSource
  – SubscribeToNotificationTopic
  – UnsubscribeToNotificationTopic
• NotificationSink
  – DeliverNotification
• Factory
  – CreateService
• PrimaryKey
  – FindByPrimaryKey
  – DestroyByPrimaryKey
• Registry
  – RegisterService
  – UnregisterService
• HandleMap
  – FindByHandle
Authentication, reliability are binding properties
Manageability, concurrency, etc., to be defined

Composition of portTypes
• We are trying to define basic patterns of interaction, which can be combined with each other and with custom patterns in a myriad of ways
• GS Spec focuses on:
  – Atomic, composable patterns in the form of portTypes and service data element types
  – A model for how these are composed
• Actual serviceType definitions are left to other groups that are defining real services
  – More on this later…

Naming and Bindings
• Every service instance has a unique and immutable name: Grid Service Handle (GSH)
  – Basically just a URL
• Handle must be converted to a Grid Service Reference (GSR) to use service
  – Includes binding information; may expire
  – Separation of name from implementation facilitates service evolution
• The HandleMap interface allows a client to map from a GSH to a GSR
  – Each service instance has a home HandleMap

Observations on Handles
• Names vs references vs handles
  – Handle is a special name that is known to the service
• Perhaps not as special as we first thought
  – Maybe just another form of reference, which requires a particular type of resolver
  – Originally thought handle could be used for policy assertions about the service
    > But this only works if handle is an authenticated name
• Should generalize the specification to allow for other handle/resolver techniques

Service Data
• A Grid service instance maintains a set of service data elements
  – XML fragments encapsulated in standard <name, type, TTL-info> containers
  – Includes basic introspection information, interface-specific data, and application data
• FindServiceData operation (GridService interface) queries this information
  – Extensible query language support
• See also notification interfaces
  – Allows notification of service existence and changes in service data

Why Service Data?
• Discovery often requires instance-specific, perhaps dynamic information
• Service data offers a general solution
  – Every service must support some common service data, and may support any additional service data desired
  – Not just meta-data, but also instance state
• Part of the MDS-2 model contained in OGSA
  – Defines standard data model, and query op
  – Complements soft-state registration and notification

ServiceData Attributes
• name: Local name for this Grid service data element.
• globalName: A global name (i.e. QName) for this Grid service data element.
• type: The XML schema type of the element contained in the extensibility element.
• goodFrom: Declares the time from which the value of the SDE carried in its extensibility element is said to be valid. This is typically the time at which the contained element was created or aggregated.
• goodUntil: Declares the time until which the value of the SDE carried in its extensibility elements is said to be valid. This value MUST be greater than the goodFrom time.
• availableUntil: Declares the time until which this named SDE is expected to be available. Prior to this time, a client SHOULD be able to query for an updated value of this SDE. This value MUST be greater than the goodFrom time.

ServiceData Example
<gsdl:serviceData name="foo" type="n1:sometype"
                  goodFrom="200204271020"
                  goodUntil="200204271120"
                  availableUntil="200204281020">
  <n1:e1>
    <n1:e2> abc </n1:e2>
    <n1:e3 gsdl:goodUntil="200204271030"> def </n1:e3>
    <n1:e4 gsdl:availableUntil="200203272020"> ghi </n1:e4>
  </n1:e1>
</gsdl:serviceData>

FindServiceData
• Standard query operation against a service’s service data elements
  – Simple “by name” query language required
  – Can support XPath, XQuery, etc.
• Simple, extensible query operation
  – Not meant to be the end-all, be-all of query interfaces
  – Expect other groups to define query interfaces designed to handle other data types (e.g. relational), large responses (e.g. iterator-based interface), etc.

Static Service Data
• In order to support rich discovery, we often want to annotate a WSDL serviceType with additional information
  – Meta-data and policies about service
  – What service data the service supports
• Maybe support service data in WSDL
  – A serviceType can reference a set of service data elements
  – All static service data also available from instance via FindServiceData

Notification Interfaces
• NotificationSource for client subscription
  – Persistent query against service data
    > Generates notification message, whose type is determined by the query
    > Filters, topics, etc. can be represented in query language
    > Supports messaging services, 3rd party filter services, …
  – Soft state subscription to a generator
• NotificationSink for asynchronous delivery of notification messages
• A wide variety of uses are possible
  – E.g. dynamic discovery/registry services, monitoring, application error notification, …

Notification & FindServiceData
• In current spec they are somewhat separate
• They are being unified in new spec
  – Both are simply forms of query against the service data of an instance
  – FindServiceData is a simple query (pull)
  – Notification subscription is a persistent query, with asynchronous response (push)
• Interesting open questions on what the subscription language should look like
  – How to define temporal aspects of query?

Notification Subscription Lifetime
• Another planned change is to use normal service lifetime management approach to manage subscription lifetime
  – A subscription is just a factory operation, which creates a new service that represents the subscription state
  – SetTerminationTime & Destroy can be used to manage lifetime of that subscription
  – The service data of the subscription service contains information about the subscription

Lifetime Management
• GS instances created by factory or manually; destroyed explicitly or via soft state
  – Negotiation of initial lifetime with a factory
• GridService interface supports
  – Destroy operation for explicit destruction
  – SetTerminationTime operation for keepalive
• Soft state lifetime management avoids
  – Explicit client teardown of complex state
  – Resource “leaks” in hosting environments

Lifetime Management Questions
• Should Destroy and SetTerminationTime be required operations?
• What are semantics of SetTerminationTime?
  – Contract between client and service, related to accounting?
    > Client is willing to keep paying for service until time X
    > Service will not charge for service after time Y

Factory
• Factory interface’s CreateService operation creates a new Grid service instance
  – Reliable creation (once-and-only-once)
    > Is reliability part of service interface, or at binding level?
• “Reliable messaging” vs “reliable invocation”
• CreateService operation can be extended to accept service-specific creation parameters
• Returns a Grid Service Handle (GSH)
  – A globally unique URL
  – Uniquely identifies the instance for all time
  – Based on name of a home HandleMap service

Factories as Templates
• Factories are under-specified in current spec
  – There is an extensibility argument that hides all the interesting input/output parameters
    > Good because it allows for generic clients
    > Bad because it hinders discovery
• Two options:
  – Move to differently-named, fully-typed factory creation operations
    > Factories are just a concept (e.g. consider subscription)
  – Use service data to describe what a particular factory supports in its extensibility arguments
    > Single Factory portType which is basically a template

Factories and Virtualization
• Consider a factory to create a given service
  – CreateService expects particular input args
• GS interfaces permit various implementations, translucent to the client
  – Simple factory might create service within its own hosting environment
  – Factory might discover an appropriate resource to host the service, and delegate request to another factory
  – Factory may decompose request into multiple service creations, & create aggregating service

Observations on Registry
• Perhaps “Registration” is a better name than “Registry”
  – Not concerned with registry query
  – Just notification of existence
• Debating if registration should just fold into a notification subscription
• More on this later…

Example: Building Registries
• Options for building registries…
  – Need for radically different query capabilities
  – Topology of services used for discovery
    > E.g. hierarchical, p2p
• Can illuminate important aspects of OGSA…
  – Composition of interfaces
  – Service data
  – Multiple protocol bindings

Architecting Registries
• There is no single registry that can serve all purposes
  – A Virtual Organization (community) must architect registries that are appropriate to their needs
• But there are common primitives that can be used to architect many different registries
  – Service data
  – Notification
  – Soft-state registration

Need for Different Queries
• Need registries that can answer radically different queries
  – “Find me all Redhat Linux 7.2 machines which are available for my use with a load < 0.3.”
    > Requires a registry that can deal with dynamic information
  – “Find me both an available cluster and one of my project database servers with good network connectivity between them.”
    > Requires a registry that can join information from multiple services

Use of Service Data
• A registry’s service data should be architected to support query requirements
  – Customized service data XML types
  – More powerful (e.g. XPath, XQuery) or custom query languages
• A registry is defined largely by its service data and query language

Discovery Topologies
• GS patterns can be applied in various ways to build discovery topologies
  – Hierarchical with caching
  – Hierarchical with forwarding
  – Peer-to-peer mesh
  – Multicast/broadcast

Summary: Evolution of Grid Technologies
• Initial exploration (1996-1999; Globus 1.0)
  – Extensive application experiments; core protocols
• Data Grids (1999-??; Globus 2.0+)
  – Large-scale data management and analysis
• Open Grid Services Architecture (2001-??, Globus 3.0)
  – Integration w/ Web services, hosting environments, resource virtualization
  – Databases, higher-level services
• Radically scalable systems (2003-??)
  – Sensors, wireless, ubiquitous computing

Summary
• The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
• Grid architecture: Protocol, service definition for interoperability & resource sharing
• Globus Toolkit a source of protocol and API definitions, and reference implementations
  – And many projects applying Grid concepts (& Globus technologies) to important problems
• Open Grid Services Architecture represents (we hope!) next step in evolution

For More Information
• The Globus Project™
  – www.globus.org
• Grid architecture
  – www.globus.org/research/papers/anatomy.pdf
• Open Grid Services Architecture (soon)
  – www.globus.org/research/papers/ogsa.pdf
  – www.globus.org/research/papers/gsspec.pdf

The End