Overview of Workshop and OGSA-DAI
EPCC, University of Edinburgh
Ali Anjomshoaa
ali@epcc.ed.ac.uk
OGSA-DAI Training Workshop - April 2003
Release: P2R2

Workshop Overview

Workshop Outline – Day 1
• 10:30 - Welcome
  – Workshop Overview
• 10:45 - Introduction
  – Grids and Virtual Organisations
  – OGSA-DAI Motivation and History
• 11:30 - Summary of prerequisites
  – XML, XML Schema, SOAP, WSDL, G-WSDL
  – Web Services, Web Containers (e.g. Tomcat), Web Services Web Applications (e.g. Axis)
  – Grid Services, OGSI, OGSA
  – Introduction to GT3, GT3 Grid Service Container (Tomcat/Axis)
  – GT3-alpha3 vs. GT3-alpha4
• 13:00 - LUNCH
• 14:00 - OGSA-DAI Architecture
  – Gross overview of the architecture, highlighting the various components of the system (analyst, consumer, registry, factory, GDS, database, etc.)
  – Typical end-to-end interaction involving configuration and perform documents
  – Preamble to end-to-end demonstrator
• 15:15 - GUI Demonstrator
• 15:30 - BREAK
• 16:00 - OGSA-DAI Architecture
  – Low level detail
  – Component interactions
• 17:30 - END OF DAY 1

Workshop Outline – Day 2
• 09:30 - Recap of the OGSA-DAI components/actors
• 10:00 - GDSR - GDSF - Configuration
• 11:00 - BREAK
• 11:30 - Hands on GDSF Configuration
• 12:30 - LUNCH
• 13:30 - GDS - Activities
• 14:30 - Perform Documents
• 15:00 - BREAK
• 15:30 - Writing Perform Scripts
• 16:30 - Future Releases & Feedback
• 17:00 - END OF DAY 2 – CLOSE

Introduction & OGSA-DAI Overview

Talk Outline
• Grids and Virtual Organisations
• Data and Grids
• Motivating OGSA-DAI

Grids

Motivating Grids
• Difficult problems (“Grand Challenges”) require collaboration and the sharing of:
  – Ideas
  – Efforts
  – Resources
• Perhaps most importantly there is a strong need for the sharing of data.
• Emerging Open Grid Infrastructures will allow global collaboration and will change the way that we work.

What are Grids?
• Grids are about dynamic Virtual Organisations (VOs) sharing data and compute resources.
• The headlining effort in the Global Grid Forum (GGF) is the development of the Open Grid Services Architecture (OGSA).
• Open Grids enable coordinated and coherent collaboration and resource sharing.
• Service-component based Grids enable heterogeneous environments to be integrated and reconciled to accomplish complex tasks.
• Grid Services will use the ubiquitous Web infrastructure.

Virtual Organisations
• Virtual Organisations (VOs)
  • “Dynamic groups of individuals, institutions, and resources, which have a well-defined set of sharing and Quality-of-Service (QoS) rules associated with them.” [1]
  • VOs include resource providers as well as users. These include application and storage service providers (ASP and SSP).
  • Resources include software clients, servers, and other programs.
• Dynamic VOs
  • VOs should be able to change in scope and extension, transparently to the end users, or clients, in the VO.
• Resource Usage Policies
  • Grid end-users can utilise resources according to:
    – the policies (rules) of the VO of which they are a member, and
    – the policies of the resource providers.

Examples of VOs
• Scientific Collaborations
  – e.g. Astronomers, Geneticists, Physicists, Medics, etc.,
    • need to share and integrate data, and
    • need to share compute power for data analysis.
• Engineering Collaborations
  – e.g. Aircraft design and construction consortia,
    • need to share compute power for simulations for design, and
    • need to share and integrate analysis data and results, etc.
• Business
  – e.g. Business-to-Business (B-to-B),
    • need to share and integrate data, and
    • need to share compute power for e.g. data mining, market simulations, etc.

Open Grids
• Open Grid Architecture
  • “defines the technologies and infrastructure of Grids”, [1, 2]
  • “supports the sharing and coordinated use of diverse resources in dynamic distributed VOs”. [1, 2]
• Open Grid Services Architecture (OGSA)
  – OGSA describes a set of implementation and platform independent protocols and standards.
    • OGSA defines the Grid architecture in terms of Grid Services.
    • Grid Services are aligned with Web Services in that they use Web Services technologies.
    • “Web Services technologies form the distributed computing foundations of Microsoft’s .NET framework, IBM’s e-business strategies, and Sun’s Sun ONE.” [3] + other co-operative initiatives.

Data and Grids

Data Requirements
• What do we need for effective sharing of data?
  – Structured, organised, annotated & curated data
  – Computable data models
  – Visualisation of data
  – Data provenance
  – Shared distributed systems
  – Networked workplaces, instruments, data sources
  – Metadata, ontologies, standards
  – Authentication, authorisation, accounting, policies
  – …

The Need for Gr[ee]d
• Grid computing gives us the framework to access the resources that host databases.
• Can draw an analogy to the electricity grid: resources “on tap”
• We want to provide “data on tap”
  – Must be aware of a “data deluge”

Uses of Data on the Grid
• There are a wide variety of uses for databases on the Grid.
These include:
  – Metadata catalogues
  – Provenance data repositories
  – Resource inventories (registries)
  – Rule bases for path optimisation
  – Knowledge repositories
  – Project repositories
  – Accounting
  – …

Grid Gains
• The infrastructure of the Grid will:
  – Improve scalability
  – Enable handling of unpredictable usage loads
  – Provide metadata driven access
  – Allow multiple database federations
  – Provide distributed data stores

Integrating DBs into the Grid
• We want to build on existing DBs, not replace them.
  – Could produce a Grid-enabled version of JDBC/ODBC
  – Need something more for metadata-driven access to data
  – Service-based approach should be better
• Provide a uniform framework for access to databases on the Grid
Grid Services

Grid Services and OGSA?
[Figure: “The Best of Both Worlds”. Diagram relating the Open Grid Services Architecture (share resource, access resource, manage resource) to Web Services and Grid Protocols. Labels near Web Services: continuous availability, applications on demand, secure and universal access, business integration. Labels near Grid Protocols: resources on demand, global accessibility, vast resource scalability.]

Example 1: Online retail
• An online retailer wants to return results from its suppliers’ stock databases based on a customer query, where:
  – each supplier uses a slightly different database structure
  – not all suppliers' records are always available
  – the result must be returned to the customer as soon as possible

Example 2: Astronomy
• An astronomer wants to:
  – query the Sloan Digital Sky Survey to find areas of sky with a particular density of astronomical bodies
  – do a further query on these areas, returning metadata to their desktop and transporting the actual data to a research server for further analysis

Data Grid Future
• Ultimate goal: true virtualised data
• No need to worry about:
  – Location
  – Access
• Can also:
  – (Transparently) Generate (composite) data on demand
  – Provide efficiency through caching
• Grid Data Handles?
• Metadata is the key to data management

Data Grid Challenges
• Generic concerns, for example:
  – Scale, Heterogeneity, Dynamism, Location, Distribution, Virtualization, Ownership, Cost…
• Application (or class of application) specific concerns, for example:
  – Common schema, Data description and semantics, Data formats, Process and procedure, Provenance…

Motivating OGSA-DAI

OGSA-DAI Objective
• To define open standards and open source based uniform service interfaces for accessing heterogeneous data sources within the Open Grid Services Architecture (OGSA) framework.

The OGSA-DAI Project
• The OGSA-DAI project is jointly funded by the UK DTI’s e-Science programme and by industry.
• Provides data access and integration functions for computing grids using the OGSA framework.
• Closely associated with the GGF’s DAIS-WG.
• Project team members are drawn from commercial and non-commercial organizations.
• Project runs until July 2003 and will support DB2, Oracle, MySQL, Xindice and another XML database in the first instance.

Who are we?
• Contributing to the global grid computing community
• Partners: EPCC & NeSC, IBM UK, IBM USA, Manchester e-SC, Newcastle e-SC, Oracle
• 373 man months
• £3 million, 18 months, started February 2002
• Funded by the Grid Core Programme
[Figure: map marking Glasgow, Newcastle, Belfast, Manchester, Daresbury Lab, Oxford, RAL, Cardiff, Cambridge, Hinxton, London, Hursley and Southampton, together with the partner sites.]

What are we doing?
[Figure: layered diagram, built up over five successive slides. Layers as labelled on the slide: Data Intensive Applications; Scientific Data Mining & Integration Technology; Monitoring, Diagnosis, Logging, Scheduling, Accounting, Authorisation, Data Integration, Data Access; Grid Plumbing & Security Infrastructure; Distributed Data & Storage Resources, including Structured Data. Groups of people labelled beside the layers: Data Intensive Application Scientists, App. Developers, Tech. Developers, Operations Team, Owners, Data Providers, Data Curators. Message on the final build: keep all the groups happy.]

Project Requirements
• Derived from project requirements survey
  – see DAIS WG
• Driven by Technical Authority and Early Adopters
  – AstroGrid
  – MyGrid
• Close relationship with many other projects

OGSA-DAI Positioning - Vision
[Figure: layered stack, as listed on the slide:
  – OGSA-DAI Distributed Query
  – OGSA-DAI Basic Services
  – Data Grid Infrastructure – Location, Delivery, Replication…
  – Resource Grid Infrastructure – OGSA…
  – Database, Communication, OS… Technology]

OGSA-DAI Positioning - Today
[Figure: layered stack, as listed on the slide:
  – OGSA-DAI Distributed Query
  – OGSA-DAI Basic Services: Delivery, Data Format, Drivers, Query (Create Retrieve Update Delete), App Specifics
  – OGSA: Meta Data, Location, Notification, Lifetime
  – Database, Communication, OS… Technology]

OGSA-DAI To Date
• Assuming that OGSA becomes the standard framework
  – Have adopted the OGSA approach
• Have first concentrated on data access
  – Data integration, for example distributed query and pipelines, comes later
• Implementation provides focus on basic functionality first
  – But architecturally we have tried to answer many pertinent questions
  – Functionality will increase over subsequent releases

Project Timeline
[Figure: project timeline with a “today” marker.
  Axis: Feb ’02, May ’02, Jul ’02, Sep ’02, Dec ’02, Feb ’03, May ’03, Sep ’03; Phase 1 Starts and Phase 2 Starts are marked.
  Milestones, in the order shown:
  – WS + GSI UK support (> 100 downloads)
  – XML + OGSA Prototypes for Early Adopters
  – Design Documents & Demos for DAIS WG @ GGF5
  – XML + OGSA Prototype Available
  – RDB + GT2 / OGSA Prototypes Available
  – GGF6 WG Papers & Prototypes
  – Ship Release 1 (January 15th 2003)
  – Demo & Release 1.5 @ GGF7
  – Release 2
  – Release 3]

Web Services

Web Services
• A Web Service is an interface that describes a collection of operations that are network accessible.
• Web Services build on XML based technologies and messaging, and use XML schemas to mark up and describe services and their operations.
• Web Services commonly use the Simple Object Access Protocol (SOAP) as a communication protocol over HTTP.

Web Services
• Service Oriented Architecture
[Figure: the Service Provider publishes to the Service Registry, the Service Requestor finds services in the Service Registry, and the Service Requestor binds to the Service Provider.]
• Service Provider
  – Network nodes which provide services and publish their availability through a registry.
• Service Registry
  – Network nodes which act as repositories, yellow pages, or clearing houses for services.
• Service Requestor
  – Network nodes that discover and invoke other services to provide a service based solution.

XML
• eXtensible Markup Language
  • “The eXtensible Markup Language (XML) is an attempt at a standard format for structured documents and data. XML was designed to be readable by humans to facilitate the developers task of interacting with XML documents. Because of the human-readability, XML can be less efficient than proprietary binary formats, but is still extremely useful.”
  – XML is a restricted form of SGML for web applications.
  – XML was developed to address the limitations of HTML.
  – XML is used to compose documents which describe their grammar and structure via a set of tags.

An Example of XML
• Here is some example XML:

    <!-- all jars in jar directory -->
    <fileset dir="${jar.dir}">
      <include name="*.jar"/>
    </fileset>

• 'Tags' delimit 'elements', which can carry zero or more 'attributes'. The attributes are specified in the opening tag of the element.
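The relationship between tags, elements and attributes can be made concrete by parsing the snippet from the slide. The sketch below is only an illustration, not part of the workshop material: it assumes Python and its standard-library `xml.etree.ElementTree` module, and the helper name `describe` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The XML fragment from the slide (an Ant-style fileset).
SNIPPET = """
<!-- all jars in jar directory -->
<fileset dir="${jar.dir}">
  <include name="*.jar"/>
</fileset>
"""


def describe(element, depth=0):
    """Return (depth, tag, attributes) for an element and all its descendants."""
    rows = [(depth, element.tag, dict(element.attrib))]
    for child in element:
        rows.extend(describe(child, depth + 1))
    return rows


# The comment is skipped by the default parser; 'fileset' becomes the root element.
root = ET.fromstring(SNIPPET.strip())
rows = describe(root)

# Print one line per element, indented by nesting depth, with its attributes.
for depth, tag, attrs in rows:
    print("  " * depth + tag, attrs)
```

Here `fileset` and `include` are the elements, while `dir` and `name` are attributes read from their opening tags. Note that `${jar.dir}` is just an attribute value as far as XML is concerned; any substitution is up to the tool that consumes the document.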

Main XML Based Technologies
• The XML based technologies for Web Services include (but are not limited to):
  – The Simple Object Access Protocol (SOAP)
    • SOAP is an XML protocol used to invoke a method on a server to execute an operation and return its results in XML.
  – The Web Services Description Language (WSDL)
    • WSDL uses XML schemas and is used to describe the functionality of Web Services through well defined interfaces (PortTypes).
  – The Universal Description, Discovery, and Integration (UDDI)
    • UDDI provides a searchable index of registered Web Services, and…
  – …along with the Web Services Inspection Language (WSIL)
    • provide mechanisms to register, or publish, WSDL documents describing Web Services, and for their subsequent discovery.

Other XML Based Technologies
  – Business Process Execution Language for Web Services (BPEL4WS)
    • Can be used to integrate multiple services together according to a prescribed order.
  – The Web Services Invocation Framework (WSIF)
    • Can be used to dynamically generate service proxies that, for example, can be implemented in the language of the client application.

OGSA, OGSA-DAI, OGSI…

Talk Outline
• What is OGSA?
  • Motivation for OGSA
  • Relation between OGSA and OGSI
• What is OGSA-DAI?
  • Motivation for OGSA-DAI
  • Relation between OGSA and OGSI
• What is OGSI?
• What is OGSA?
  • Grid Services

OGSA

Motivation for OGSA - Problem
• There is a need to integrate computing services across distributed, heterogeneous, and dynamic groups of resources (Virtual Organisations).
• Grid concepts and technologies (e.g. Globus) have addressed the issues that allow applications to access and share resources and basic services across WANs, i.e.
  registration and discovery (GIS/MDS), access/security (GSI), job/resource management (GRAM), data access and transfer (GridFTP, GASS), etc.
• WHAT ABOUT SERVICE INTEGRATION?
• There is, however, more work to be done in order to enable truly seamless collaboration across heterogeneous environments and sets of services and resources.

Motivation for OGSA - Problem
• What are resources?
• Consider collaborations of distributed
  – software
    • (applications, OSs, compilers, databases, etc.),
  – hardware
    • (computers, PDAs and other mobile compute resources, instrumentation, sensors and transmitters, etc.), and
  – human resources
    • (scientists, business people, analysts, users/consumers, admin staff, etc.).

Motivation for OGSA - Problem
• What is a service?
  – A facility providing some public demand.
  – A facility providing maintenance and repair.
  – A helpful act.
  – Useful labour that does not produce a tangible commodity.
  – Work performed by an entity that serves.
    • … and more (http://www.m-w.com/)
• Basic services, e.g. data transfer, compute, storage, security/access, etc.
• Complex services, e.g. applications, weather services, access services, information services, etc.

Motivation for OGSA - Solution
• Access and share resources and integrate services.
• Provide uniform access.
• Provide a set of standards and protocols as the infrastructure for accessing, sharing, and integrating.
• Protocols and standards for communication, translation and transformation of data, access and security, auditing and logging (charging?)

Motivation for OGSA
• Goal: provide a set of service-components based on a set of standards and protocols for the Grid.
• Assumption: initial application areas are scientific, engineering, and medical e-Science. Can expand to e-business and other computing environments.
• Motivation: uniform coordinated access to increasingly distributed and heterogeneous environments.
• Future: users will move away from technical issues such as handling resource and service location, structure, data transfer, translation & integration, security, access and auditing concerns, administration, etc.