Enabling Grids for E-sciencE Web Services, EGEE and Richard Hopkins Training Team, National e-Science Centre, Edinburgh, UK rph@nesc.ac.uk National Grid Services Induction March 11 2005 www.eu-egee.org EGEE is a project co-funded by the European Commission under contract INFSO-RI-508833 Outline and Acknowledgements Enabling Grids for E-sciencE • Grids and Web Services • EGEE • gLite Acknowlegements • This talk has drawn on material produced by many people, including • Tony Hey • Mike Mineter • EGEE colleagues Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 2 Specialised DistributedComputing Enabling Grids for E-sciencE Grids come within the general field of • Parallel / Distributed Computing A computational task involves coordination of components which occur • Simultaneously And/or • At physically separated locations The Distribution dimension - Parallel Multiprocessor Distributed Processor Pool Richard Hopkins, Training Team, NeSC Embedded Systems Flexible Manufacturing ... GRIDS NGS Induction, NeSc, March 11 2005 – Web Service Grids - 3 Enabling Grids for E-sciencE Grids & Web Services A very inclusive formulation of “Grid Computing” is • Coordination of computational components • Processing / storage resources • Of up to international level of geographic separation • Crossing organisation boundaries A (Web) service is a • S/W system designed to support interoperable machine-to-machine interaction over a network. Similar concepts • For grids, the focus is on a computational model – Running a job Marshalling the data needed • For web services – the focus is on service inter-operability A service can be any functionality you want to provide via the web Need to bring these technologies together • First we look closer at Web Services Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 4 Enabling Grids for E-sciencE Web Services Approach Web Services is the next step in the automation of inter-enterprise interaction Web Browsing • Human travel agent provides “organise holiday” service by surfing the web to look for and invoking services – book a hotel; book a plane; book a car hire; ….; confirm bookings of best options to meet client needs. Web Services • The aspiration of Web services is to provide a framework that allows that same model to be used in writing an application – • which is itself becomes an “organise a holiday” service, finding and using useful services Mode human intervention at – service provider service consumer E-mail Web browsing Web Services Yes No No Richard Hopkins, Training Team, NeSC Yes Yes No NGS Induction, NeSc, March 11 2005 – Web Service Grids - 5 Service Interaction Enabling Grids for E-sciencE I organise holidays Get a car rental quote locate service ask for quote I know the weather Is quote good enough? Yes Reserve it … get other resources reserved I locate services Confirm booking I book planes I book car Rentals Richard Hopkins, Training Team, NeSC I book hotels I convert currency NGS Induction, NeSc, March 11 2005 – Web Service Grids - 6 Enabling Grids for E-sciencE Essentials of WS Approach Need to achieve effective cooperation even though • the different services are produced by different organisations, without any design collaboration • the services are autonomously evolving • Loose coupling – minimum prior shared information between the designer of the two components of an interaction This requires • Self-description – Meta data • Tolerance of partial understanding Collaboration is on defining generic standards, rather than on specific design Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 7 Core WS Enabling Grids for E-sciencE Policy Quoteheld - 10 mins Prov Reserv. Held - 2 hours …. Basic Requirements • Service Description – WSDL • Web services Definition language, defines logical service – Operations. Message structures for each Physical – protocol, location • Allows service user to bind to the service Statically – use WSDL to generate code register Dynamically – using • Service Registry and Discovery Services – UDDI • Universal Description, Discovery and integration • Allows one service to register with a registry known by potential users • User can discover it and obtain its description • Usability Negotiation – Policies Richard Hopkins, Training Team, NeSC I book hotels I organise holidays I locate services NGS Induction, NeSc, March 11 2005 – Web Service Grids - 8 Core WS Enabling Grids for E-sciencE Basic Requirements • Communication Protocol - SOAP • Request - Response • Other patterns – e.g notification Ask car rental service to notify me of any special deals To notify me if my reservation hold time is about to expire • Common Language – XML Documents • Tag-value pair • Defined by a Schema • Used for messages, policies, service definition • Specific extensibility mechanism • Self Description – tagged values, WSDL • Necessary for loose coupling • Autonomous evolution • Autonomous evolution – discovery service – UDDI or out of band Richard Hopkins, Training Team, NeSC Give me a quote for … Quote Type = Quote Validity = 1 hour Price = £200 …. NGS Induction, NeSc, March 11 2005 – Web Service Grids - 9 Evolving Standards Enabling Grids for E-sciencE • • • Collaboration is on defining generic standards, not specific design Two main standards bodies – • W3C – actually produces “recommendations” – web community • OASIS – industry – IBM, Microsoft, Sun, …. These standards are factored to allow partial adoption and combination • The core standards • WS-I – clarifications to aid interoperability • Higher level standards built on them WSsecurity Core WS WS-Transaction UDDI* Framework WS-notification … WSRF WSDL* SCHEMAS* SOAP* DTD *WS-Interoperability Richard Hopkins, Training Team, NeSC XML* WS-addressing WS-MetaData Exchange NGS Induction, NeSc, March 11 2005 – Web Service Grids - 10 Core WS Enabling Grids for E-sciencE • • • • XML – the standard format for all information SCHEMA –the standard language for defining the structure (syntax/type) of a unit of information DTD is a deprecated predecessor of Schemas WSDL – the language for defining a service – • Operations; Logical Message Structure; Bindings; locations SOAP – the standard message format WSsecurity Core WS WS-Transaction UDDI* Framework WS-notification … WSRF WSDL* SCHEMAS* SOAP* DTD *WS-Interoperability Richard Hopkins, Training Team, NeSC XML* WS-addressing WS-MetaData Exchange NGS Induction, NeSc, March 11 2005 – Web Service Grids - 11 Some Further Standards Enabling Grids for E-sciencE • • • • • WS-Security – Framework for authentication and confidentiality WS-Transaction Framework – for robustness of correlated interactions, e.g two phase – provisionally book everything, then confirm everything UDDI – standard repository interface (included in WS-I) WS-MetaDataExchange – how to communicate meta-data …. WSsecurity Core WS WS-Transaction UDDI* Framework WS-notification … WSRF WSDL* SCHEMAS* SOAP* DTD *WS-Interoperability Richard Hopkins, Training Team, NeSC XML* WS-addressing WS-MetaData Exchange NGS Induction, NeSc, March 11 2005 – Web Service Grids - 12 Enabling Grids for E-sciencE • • • WSRF-Related Standards WS-Addressing - For communication of identities between services WS-Notification - Framework of notification interaction – subscribe, publish WSRF – Web Services Resource Framework – E.g. a quote is a persistent entity which will need to be identified in subsequent interactions to finalise a provisional booking – a resource – Consistent standard framework for creating, identifying, destroying resources WSsecurity Core WS WS-Transaction UDDI* Framework WS-notification … WSRF WSDL* SCHEMAS* SOAP* DTD *WS-Interoperability Richard Hopkins, Training Team, NeSC XML* WS-addressing WS-MetaData Exchange NGS Induction, NeSc, March 11 2005 – Web Service Grids - 13 Enabling Grids for E-sciencE • Grid Services Web Service Grids • Grid middleware can be seen as a special application area of web services dealing mainly with – VOs data replication job execution • Many advantages to basing grids on web services Cooperation between autonomously evolving components Mix-and-match of components Combining Aspects of grids with other Web Service application domains Rapid Development of higher-level services Leverage industrially-produced support packages Previously, independent simultaneous developments • Much of existing middleware components are not based on web services Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 14 Enabling Grids for E-sciencE • • • • Grid Services Now trying to unify these two major developmental thrusts - OGSA – Open Grid Services Architecture Quite central to Grids is dynamically created resources, particularly • Files • Jobs • VO membership Thus WSRF is important to this unification – • WSRF originated from the grid world A (web) resource is a concept similar to object in O-O programming • Multiple instances of the same type • Each with its own (changing) state Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 15 Enabling Grids for E-sciencE Web Service grids : WSRF Actual Grid (www/grid) – Services & Resources A Grid Access Service Submit Job Job Id : J6 Notify Started : J6 cancel: J6 Richard Hopkins, Training Team, NeSC www/grid Job J6 A resource can • Be created Globally unique id www/grid:J6 • Change state • Notify change of state • Be destroyed • Self-destruct (scheduled) NGS Induction, NeSc, March 11 2005 – Web Service Grids - 16 Web Service Grids: Interoperability Enabling Grids for E-sciencE WS-I – covers Schemas, WSDL, SOAP, UDDI (assumes XML) Specifications that have/will enter a standardisation process but are not stable and are still experimental WS-I Specifications that are emerging from standardisation process and are recognised as being ‘useful’ Richard Hopkins, Training Team, NeSC ‘WS-I+’ profile minimal for grids BPEL - workflows WS-Addressing WS-ReliableMessaging ‘WS-I+’ nextWSRF Notification Security Portlets Standards that have broad industry support and multiple interoperable implementations NGS Induction, NeSc, March 11 2005 – Web Service Grids - 17 Enabling Grids for E-sciencE Outline • Grids and Web Services • EGEE • gLite Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 19 EGEE – towards e-infrastructure Enabling Grids for E-sciencE EGEE is building a large-scale production grid service to: • Underpin research, technology and public service • Link with and build on national, regional and international initiatives • Foster international cooperation both in the creation and the use of the e-infrastructure Richard Hopkins, Training Team, NeSC Pan-European Grid Operations, Support and training Collaboration Network infrastructure & Resource centres NGS Induction, NeSc, March 11 2005 – Web Service Grids - 20 Background Enabling Grids for E-sciencE • By 2003: • Grid technology shown to be viable • Large amount of functional middleware • …thanks to: FP5 : DataGrid, DataTAG, CrossGrid, etc… USA: VDT, Globus, Condor, etc. … and others • Next step - major production infrastructure • EGEE was proposed to the EU in 2003 • 2 year project began in April 2004, with a 4-year vision. Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 21 Grids for e-Infrastructure… Enabling Grids for E-sciencE • In 2003, what was missing? • Production-quality (stable, mature) Grid middleware • Production-quality operational support Grid Operation Centres, Helpdesks, etc. • Multi-discipline grid-enabled application environment Now led by HEP, Bio-info • Administrative and policy decision framework in order to share resources at pan-European scale (and beyond) Areas such as AAA (Authentication, Authorisation, Accounting) End-to-end issues (Network related) Funding Policies (Grid economics) Resource Sharing Policies Usage Policies • EGEE project is tackling most of the above issues Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 22 Enabling Grids for E-sciencE In the first 2 years EGEE will • Establish production quality sustained Grid services • 3000 users from at least 5 disciplines • integrate 50 sites into a common infrastructure • offer 5 Petabytes (1015) storage • Demonstrate a viable general process to bring other scientific communities on board • Propose a second phase in mid 2005 Pilot New to take over EGEE in early 2006 Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 23 EGEE Organisation Enabling Grids for E-sciencE • 70 leading institutions in 27 countries, federated in regional Grids • ~32 M Euros EU funding for first 2 years starting April 2004 (matching funds from partners) • Leveraging national and regional grid activities • Promoting scientific partnership outside EU Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 24 Activities Definition Enabling Grids for E-sciencE • Network Activities • • • • • • Service Activities • • • NA1: Project Management NA2: Dissemination and Outreach NA3: User Training and Induction NA4: Application Identification and Support NA5: Policy and International Cooperation SA1: Grid Support, Operation and Management SA2: Network Resource Provision Joint Research Activities • • • • JRA1: Middleware Reengineering + Integration JRA2: Quality Assurance JRA3: Security JRA4: Network Services Development Emphasis in EGEE is on operating a production grid and supporting the end-users Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 25 Enabling Grids for E-sciencE Operations - Introduction • Strategy has been to • have a robust certification and testing activity, • simplify as far as possible what is deployed, and to make that robust and useable. • In parallel construct the essential infrastructure needed to operate and maintain a grid infrastructure in a sustainable way. • Current service based on work done in LCG – culminating in the current service (“LCG-2”) • Now at the point where in parallel we need to deploy and understand gLite – whilst maintaining a reliable production service. Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 26 Enabling Grids for E-sciencE EGEE Service Activities • Create, operate, support and manage a production quality infrastructure • Offered services: • Middleware deployment and installation • Software and documentation repository • Grid monitoring and problem tracking • Bug reporting and knowledge database • VO services • Grid management services Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 27 SA1 – Operations Structure Enabling Grids for E-sciencE • Operations Management Centre (OMC): • • Core Infrastructure Centres (CIC) • • • • • • • • Act as front-line support for user and operations issues Provide local knowledge and adaptations One in each region – many distributed User Support Centre (GGUS) • • Richard Hopkins, Training Team, NeSC Manage daily grid operations – oversight, troubleshooting Run essential infrastructure services Provide 2nd level support to ROCs UK/I, Fr, It, CERN, + Russia (M12) Taipei also run a CIC Regional Operations Centres (ROC) • • At CERN – coordination etc In FZK – manage PTS – provide single point of contact (service desk) Not foreseen as such in TA, but need is clear NGS Induction, NeSc, March 11 2005 – Web Service Grids - 28 Grid Operations Enabling Grids for E-sciencE • • RC RC ROC RC • RC • RC RC RC RC ROC CIC CIC RC CIC CIC • • • OMC RC CIC RC RC RC RC ROC RC • RC • ROC RC RC Operational oversight (grid operator) responsibility rotates weekly between CICs Report problems to ROC/RC ROC is responsible for ensuring problem is resolved ROC oversees regional RCs ROCs responsible for organising the operations in a region • • Essential to scale the operation CICs act as a single Operations Centre • RC RC CIC The grid is flat, but Hierarchy of responsibility Coordinate deployment of middleware, etc CERN coordinates sites not associated with a ROC RC = Resource Centre Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 29 (Human) Networking Activities Enabling Grids for E-sciencE • Dissemination and Outreach: 5% of EGEE budget • • Dissemination – to actively promote and raise awareness of the EGEE project Outreach – to identify and contact potential new user communities • Training and Induction: 4% of EGEE budget • • Induction – to introduce and orient - users and members Training – to create, collate, make available and deliver material and courses • Application Identification and Support • • Process for selecting new application areas Supports selected VO’s in porting applications • International cooperation Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 30 Enabling Grids for E-sciencE Dissemination • 1st project conference, Cork, April • 2nd conference in The Hague • 22-26 November • http://public.eu-egee.org/conferences/2nd • Over 300 delegates • Websites, Brochures and press releases • For project and general public www.eu- egee.org • Information packs for the general public, press and industry Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 31 Training and Induction Enabling Grids for E-sciencE NeSC Edinburgh UK & Ireland IHEP IMPB RAS ITEP JINR Protvino Russia Moscow Russia Moscow Russia Dubna Russia KU-NATFAK PNPI RRCKI Copenhagen Denmark Petersburgh Russia Moscow Russia GUP Linz Austria FZK Karlsruhe Germany Innsbruck Austria GRNET Athens Greece INFN CESNET Rome Italy Prague Czech Rep. Richard Hopkins, Training Team, NeSC BUTE Budapest Hungary II-SAS Bratislava Slovakia ICM PSNC ICI Warsaw Poland Poznan Poland Bucharest Romania ELUB Budapest Hungary MTA SZTAKI TAU Budapest Hungary Tel Aviv Isreal NGS Induction, NeSc, March 11 2005 – Web Service Grids - 32 Enabling Grids for E-sciencE “User interface” Current production m’ware: LCG-2 Input “sandbox” Output “sandbox” DataSets info Replica Catalogue Information Service Resource Broker Publish Logging & Book-keeping Job Query Job Submit Event Author. &Authen. Storage Element Job Status Richard Hopkins, Training Team, NeSC Computing Element NGS Induction, NeSc, March 11 2005 – Web Service Grids - 33 Enabling Grids for E-sciencE • Easily and quickly deployable Use existing services where possible Condor, EDG, Globus, LCG, … AliEn LCG ... Being built on Scientific Linux and Windows Security • • ... Portable • • EDG Allow for multiple interoperable implementations Lightweight (existing) services • • • VDT Service oriented approach • • gLite: Guiding Principles Sites and Applications • Performance/Scalability & Resilience/Fault Tolerance • Comparable to deployed infrastructure Co-existence with deployed infrastructure • • Site autonomy • • Richard Hopkins, Training Team, NeSC Co-existence with LCG-2 and OSG (US) are essential for the EGEE Grid services Reduce dependence on ‘global, central’ services Open source license NGS Induction, NeSc, March 11 2005 – Web Service Grids - 34 gLite Services for Release 1 Enabling Grids for E-sciencE JRA3 Grid Access Service CERN UK API Access Services Authorization Information & Monitoring Auditing Authentication File & Replica Catalog Storage Element Data Management Accounting Job Provenance Package Manager Site Proxy Computing Element Workload Management Data Services Richard Hopkins, Training Team, NeSC Application Monitoring Information & Monitoring Services Security Services Metadata Catalog IT/CZ Job Management Services NGS Induction, NeSc, March 11 2005 – Web Service Grids - 35 Approach Enabling Grids for E-sciencE • Role: gLite is intended as the middleware for Pan-European Grid • Approach: Combining and re-engineering existing components (as far as possible) • Middleware (MW) components now work reasonably well – problems usually elsewhere • Substantial body of experience of MW development, packing and use • Based on LCG2 • What is currently used within the EGEE project • Can be viewed as an initial version of Pan-European Grid middleware • gLIte deals with VDT EDG ... AliEn LCG ... • LCG2 shortcomings • Advanced application needs • State-of-the-art Internet approaches and standards Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 36 Enabling Grids for E-sciencE • • • • • Improvements Extended Functionality • Incorporates features from other products • Some new functionality Increased Modularity • Previous products tend to be monolithic – assumed to be used in entirety • Now need to move towards smaller scale of component combinability Improved Deployability • Modularity • Installation support • Non-dedicated machines Coherent architecture • extending functionalities and increasing modularity requires – • a revised picture of the organisation of functions • rationalisation of interface Adopt Web Services Approach and Standards Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 37 Enabling Grids for E-sciencE Guiding Principles • Lightweight - Easily and quickly deployable • Mix-and-match components • Multiple services running on the same physical machine (if possible) • Interoperable • Client may talk to different independent implementations of the same service • Resilient and Fault Tolerant • Co-exist with deployed infrastructure • Reduce requirements on site components • Co-existence with LCG-2 essential for the EGEE Grid service • Platform support- Goal is to have portable middleware • Building & Integration on RHEL 3 and windows • Initial testing (at least 3 sites) using different Linux flavours (including free distributions) Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 38 Guiding Principles Enabling Grids for E-sciencE • Service oriented approach • Based on web services standards • Service autonomy • User may talk to services directly or through other services (like access service) • Open Source software license • No restriction on usage (academic or commercial) beyond acknowledgement • That's for MW - for application software, Sites must obtain appropriate licenses before installation • Main Documentation Application requirements: http://egee-na4.ct.infn.it/requirements/ Architecture: https://edms.cern.ch/document/476451 Design: https://edms.cern.ch/document/476451 Release plan: https://edms.cern.ch/document/468699 Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 39 Application communities and EGEE Enabling Grids for E-sciencE • LCG and Bio-informatics from day 1 • New application communities are selected by the EGEE Generic Applications Advisory Panel: • For new applications • See: EGEE web site (NA4 activity) and also http://agenda.cern.ch/age?a042351 • Selected are: • • • • Computational chemistry Earth sciences Earth observation Astrophysics • Also working with DILIGENT: • Virtual digital data libraries Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 40 Development Cycle Enabling Grids for E-sciencE • Prototyping short development cycles for fast user feedback – Requirements Deployment : Prototype Infrastructure → Pre-production service → Production Service Planning & Design Implementation Testing LCG-1 Globus 2 based Richard Hopkins, Training Team, NeSC LCG-2 gLite-1 gLite-2 Web services based NGS Induction, NeSc, March 11 2005 – Web Service Grids - 41 Enabling Grids for E-sciencE Open Source Software License • The existing EGEE grid middleware (LCG-2) is distributed under an Open Source License developed by EU DataGrid project • Derived from modified BSD - no restriction on usage (academic or commercial) beyond acknowledgement • Approved by Open Source Initiative (OSI) • Same approach for new middleware (gLite) • New license agreed by partners is derived from the EDG license and takes into account feedback from the World Intellectual Property Office (WIPO) Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 42 Enabling Grids for E-sciencE Further Information EGEE www.eu-egee.org LCG lcg.web.cern.ch/LCG/ “Concertation event” and Second EGEE conference: http://public.euegee.org/conferences/2nd/programme/outline.html The Grid Cafe www.gridcafe.org •More EU sites: •http://www.cordis.lu/ist/grids/fp6_grid_projects.htm •http://www.gridstart.org/concertation_mtg.shtml •“e-Infrastructures Reflection Group http://www.e-irg.org •NeSC www.nesc.ac.uk Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 43 Enabling Grids for E-sciencE • THE END Richard Hopkins, Training Team, NeSC NGS Induction, NeSc, March 11 2005 – Web Service Grids - 44