A Bright Future with OGSA & DAIS Data Services
Malcolm Atkinson
Director
www.nesc.ac.uk
6th July 2004
BNCOD Pre-meeting on Grids, NeSC, Edinburgh 6th July 2004

A la carte Menu
• E-Science
• Grid Infrastructure Deployment
• WS & Grid comparisons
• OGSI → WS-RF → WS-RF (5) + WS-Notification (3)
• OGSA Data & DAIS
• OGSA-DAI
• Future Look

What is e-Science?
• Goal: to enable better research
• Method: Invention and exploitation of advanced computational methods
  to generate, curate and analyse research data
    • From experiments, observations and simulations
    • Quality management, preservation and reliable evidence
  to develop and explore models and simulations
    • Computation and data at extreme scales
    • Trustworthy, economic, timely and relevant results
  to enable dynamic distributed virtual organisations
    • Facilitating collaboration with information and resource sharing
    • Security, reliability, accountability, manageability and agility

The Primary Requirement …
Enabling People to Work Together on Challenging Projects: Science, Engineering & Medicine

The e-Science Centres
Figure labels: Globus Alliance; Open Middleware Infrastructure Institute; Digital Curation Centre; e-Science Institute; Grid Operations Centre; ?
CeSC (Cambridge); EGEE

UK e-Science Grids
Figure labels: 1600 x CPU, AIX; 512 x CPU, Irix; Engineering Task Force (Contributions from e-Science Centres); HPC(x); 20 x CPU, 18TB Disk, Linux; Grid Support Centre / Grid Operations Centre; NGS; OGSA Test Grid projects; CeSC (Cambridge); 64 x CPU, 4TB Disk, Linux

Importance of collaboration: VDT
• A highly successful collaborative effort
  Used by many projects
  Systematic testing
  Rich integration of components
    • Middleware, testing, patches, feedback …
  The UK will be part of this
    • Hardening and testing – exploit test bed
Contributors shown in the figure:
  VDT Working Group
  VDS (Chimera/Pegasus) team
    • Provides the “V” in VDT
  PPDG
  Pacman
    • Provides easy installation capability
    • Currently Pacman 2, moving to Pacman 3 soon
  Condor Team, Globus Alliance, NMI Build and Test team, EDG/LCG/EGEE: contribute components
Thanks to Miron Livny

E-Science’s Growing Assets
• Understanding of Processes & Requirements
• International and Multi-disciplinary Skill base
• Experience composing & adapting existing technologies and of building new components
• Experience Supporting Developers and Users
• Experience Establishing Virtual Organisations across Enterprise boundaries
Embedded in People & Teams, Growing – they need nurture

EGEE Implementation
• From day 1 (1st April 2004)
  Production grid service based on the LCG infrastructure running LCG-2 grid middleware (SA)
  LCG-2 will be maintained until the new generation has proven itself (fallback solution)
• In parallel develop a “next generation” grid facility (JRA)
  Produce a new set of grid services according to evolving standards (Web Services)
  Run a development service providing early access for evaluation purposes
  Will replace LCG-2 on production facility in 2005
Timeline labels: LCG-1; LCG-2; Globus 2 based; EGEE-1; EGEE-2; Web
services based; VDT; EDG; ...; AliEn; LCG; ...; EGEE

Certification, Testing and Release Cycle
Figure: flow diagram of the release cycle.
Stage labels: DEVELOPMENT & INTEGRATION; UNIT & FUNCTIONAL TESTING; CERTIFICATION TESTING; APP INTEGR; DEPLOYMENT PREPARATION; PRE-PRODUCTION; PRODUCTION
Activity labels: Integrate; Basic Functionality Tests; Run Certification Matrix; Run tests (C&T suites, Site suites); APPS SW Installation; DEPLOY; SERVICES
Application groups: HEP EXPTS; BIO-MED; OTHER TBD
Tags: Dev Tag; Release candidate tag; Certified release tag; Deployment release tag; Production tag
Activities: JRA1; SA1

Sites in LCG-2/EGEE-0: June 4 2004
Austria: U-Innsbruck
Canada: Triumf, Alberta, Carleton, Montreal, Toronto
Czech Republic: Prague-FZU, Prague-CESNET
France: CC-IN2P3, Clermont-Ferrand
Germany: FZK, Aachen, DESY, Wuppertal
Greece: HellasGrid
Hungary: Budapest
India: TIFR
Israel: Tel-Aviv, Weizmann
Italy: CNAF, Frascati, Legnaro, Milano, Napoli, Roma, Torino
Japan: Tokyo
Netherlands: NIKHEF
Pakistan: NCP
Poland: Krakow
Portugal: LIP
Russia: SINP-Moscow, JINR-Dubna
Spain: PIC, UAM, USC, UB-Barcelona, IFCA, CIEMAT, IFIC
Switzerland: CERN, CSCS
Taiwan: ASCC, NCU
UK: RAL, Birmingham, Cavendish, Glasgow, Imperial, Lancaster, Manchester, QMUL, RAL-PP, Sheffield, UCL
US: BNL, FNAL
HP: Puerto-Rico
• 22 Countries
• 58 Sites (45 Europe, 2 US, 5 Canada, 5 Asia, 1 HP)
• Coming: New Zealand, China, other HP (Brazil, Singapore)
• 3800 CPUs

Examples of HealthGRID applications
Grids for medical development
• Preparation and follow-up of medical missions in developing countries
• Support to local medical centres in terms of second diagnosis, patient follow-up and e-learning
2 missions (Ibagué & Chuxiong) with the French NPO « Chaîne de l’Espoir » used as test cases
Figure labels: Clermont-Ferrand/Paris; Ibagué; Chuxiong; Hand surgery; Medical centre
The grid impact:
• Improved telemedicine
services
• Federation of patient databases
• Interactive e-learning (high bandwidth network required)

DataGrid: status of biomedical applications
Status categories: deployed; tested on EDG; under preparation
• Bio-informatics
  Phylogenetics: BBE Lyon (T. Sylvestre)
  Search for primers: Centrale Paris (K. Kurata)
  Bio-informatics web portal: IBCP (C. Blanchet)
  Parasitology: LBP Clermont, Univ B. Pascal (N. Jacq)
  Data-mining on DNA chips: Karolinska (R. Médina, R. Martinez)
  Geometrical protein comparison: Univ. Padova (C. Ferrari)
• Medical imaging
  MR image simulation: CREATIS (H. Benoit-Cattin)
  Medical data and metadata management: CREATIS (J. Montagnat)
  Mammographies analysis: ERIC/Lyon 2 (S. Miguet, T. Tweed)
  Simulation platform for PET/SPECT based on Geant4: GATE collaboration (L. Maigne)
Figure: GATE Monte Carlo simulation platform for nuclear medicine. Time in minutes (axis 0 to 180) against parallelisation (Local_Monopro1500MHz, X10, X20, X50, X100).

October 2001 View
Figure labels: Web Services; Grid Technology; Grid Services

Web Services – Q4 2001 view
• Independence
  Client from Service
  Service from Client
• Description
  Web Services DL …
• Separation
  Function from Delivery
• Tools & Platforms
  Java ONE
  Visual .NET
  WebSphere
  Oracle
• Commercial Buy in
www.w3c.
org/TR/SOAP or TR/wsdl

Grid Technology
• Distribution
  Various Protocols
  FTP
• Security
  Single Sign in
• Resource Sharing
  Discovery
  Process Creation
  Scheduling
• Portability
  APIs
• Government Agency Buy in
Foster, I., Kesselman, C. and Tuecke, S., The Anatomy of the Grid: Enabling Virtual Organisations, Intl. J. Supercomputer Applications, 15(3), 2001

Service-Oriented Architecture
Figure labels: Registry; Discovery; Registration; Invocation; Client; Service

WS & Grid Comparison 1
Web Services
• Goals
  Computational presentation & access of Enterprise services
  Marketing integrated large scale software and systems
  Model for independent development
  Model for independent operation
Grid Services
• Goals
  Inter-organisational collaboration
  Sharing information and resources
  Framework for collaborative development
  Framework for collaborative operation

WS & Grid Comparison 2
Web Services
• Commitment
  Most large technology providers
  Some service providers
  Some service hosters
• Standardisation
  W3C
  OASIS
  …
Grid Services
• Commitment
  Some large laboratories
  Many government-funded research programmes
  Some resource providers
• Standardisation
  IETF
  GGF
  OASIS

WS & Grid Comparison 3
Web Services
• Standards
  WS-I
    • Core of agreed & provided
    • WSDL, SOAP, UDDI, WS-Security
    • Revised regularly
  Many others under way
    • WS-* are important
    • Competition & synthesis
  Commercial battleground
    • Do these standards support my business model
    • When do I want them
  Hard to understand and engage
Grid Services
• Standards
  None
  Many exist as proposals
  Important Architecture
    • OGSA
  Continuum from requirements & research to well specified standards proposals
  Building on & influencing WS
  Hard to understand &
engage
What matters is what is widely adopted

WS & Grid Comparison 4
Web Services
• Usage
  Complex services created & delivered persistently by owner organisation
  Client interactions short-lived
  Multi-organisation integration responsibility of client
    • Workflow enactment
    • Transaction coordination
    • May be by an intermediate service
  Security on a local basis
Grid Services
• Usage
  All of WS patterns +
  Dynamic services / resources
  Long-lived interactions
  Persistent computational integration
    • Data management
    • Computation management
  Persistent operational infrastructures
    • GOC managing European-scale grid
  System organised optimisation
  End-to-end security (goal)
  Virtual Organisations
    • Establish multi-organisation security policies

WS & Grid Comparison 5
Web Services / Grid Services (items from both columns, in slide order)
• Status
Operational research projects and grids
Commercially successful operational applications
Several good toolsets available
• >100 projects use GT2 or GT3
• Mostly costly to use outside academia
• BPEL4WS
Apache Tomcat
• High-level work-load generators
Beware hype and marketing
Scale, usability & reliability problems in free-ware
Much momentum
Very high levels of investment
No toolsets
Scientific workflow
• Chimera, Pegasus, VDT, …
Workflow enactment
• Many fixes were needed to
Some very robust and well tested technologies
• Condor, GT2, VDT, GT3.2, LCG2, EGEE1
All free-ware
Performance, usability and reliability problems
Much momentum
High levels of investment

WS & Grid Comparison 6
Web Services
• Interaction
  Grids will influence provision systems
  Grids stimulating many standards development
Grid Services
• Interaction
  Using web services extensively
  Balancing act
    • Reach goals
    • Retain access to WS tools
Both
Expect a continuous co-evolution
• Significant new species next year
Application
goals push technical limits in both cases
At limits expect difficulties – most work not at limits

What changed and why?
• OGSI
  One unit
    • Dynamic resources & properties
    • Static functions
  Dynamic lifetime management
    • Creation & termination common (cheap) operations
  Global persistent identification (annotated: Lost)
• WS-Resource Framework
  WS-Address
  WS has functions
  Resource has lifetime, properties, etc.
  Partitioned specs more manageable ☺
  Reconciles static view of WS with required dynamics
Annotation: Notification continues

GT & WSRF Timeline
Figure: release timeline across 2004 and 2005, with event markers (GGF10, OASIS TC, interop, Demo).
Release markers: 3.2 (March); 4.0 β (Q2)
GT3.2: Improved robustness, scalability, performance, usability
GT4.0: Not waiting for finalisation of WSRF specs.
Use as submitted
Release markers: 4.0 (Q3); 4.2 (Q2 ‘05)
GT4.0: WSRF; some new functionality; further usability, performance enhancements
GT4.2: Numerous new WSRF-based services

Components in GT 3.2
Security: GSI; WS-Security; CAS (OGSI); SimpleCA
Data Management: WU GridFTP; RFT (OGSI); RLS; OGSA-DAI
Resource Management: Pre-WS GRAM; WS GRAM (OGSI)
Information Services: MDS2; WS-Index (OGSI)
WS Core: JAVA WS Core (OGSI); OGSI C Bindings; OGSI Python Bindings (contributed); pyGlobus (contributed); XIO

Planned Components in GT 4.0
Security: GSI; WS-Security; CAS (WSRF); SimpleCA; Authz Framework
Data Management: New GridFTP; RFT (WSRF); RLS; OGSA-DAI
Resource Management: Pre-WS GRAM; WS-GRAM (WSRF); CSF (contribution)
Information Services: MDS2; WS-Index (WSRF)
WS Core: JAVA WS Core (WSRF); C WS Core (WSRF); pyGlobus (contributed); XIO

Relative Importance – a sense of proportion
• What envelopes you put your messages in
  How they are delivered
  Infrastructure to organise a common technical platform – the foundations of communication
• What information you send in your messages
  Their patterns of Use - sequences that mean something
  Their Contents
  The Grammar and Vocabulary of Communication
  Agreed Interpretations
Annotation: Scope of OGSI to WS-Resource Framework change

Relative Importance
Side label: Technical Experts
• What envelopes you put your messages in
  How they are delivered
  Infrastructure to organise a common technical platform – the foundations of communication
• What information you send in your messages
  Their patterns of Use - sequences that mean something
  Their Contents
  The Grammar and Vocabulary of Communication
  Agreed Interpretations
• What you do when you get a message
  The Application Code you Execute
  The Middleware Services
    • Security, Privacy, Authorisation, Accounting, Registries, Brokers, …
  Integration Services
    • Multi-site Hierarchical Scheduling, Data Access & Integration, …
  Portals, Workflow Systems, Virtual Data, Semantic Grids
  Tools to support Application Developers, Users & Operations
    • Incremental deployment tools, diagnostic aids, performance monitoring, …

Relative Importance
Side label: Domain Experts
• What envelopes you put your messages in
  How they are delivered
  Infrastructure to organise a common technical platform – the foundations of communication
• What information you send in your messages
  Their patterns of Use - sequences that mean something
  Their Contents
  The Grammar and Vocabulary of Communication
  Agreed Interpretations
• What you do when you get a message
  The Application Code you Execute
  The Middleware Services
    • Security, Privacy, Authorisation, Accounting, Registries, Brokers, …
  Integration Services
    • Multi-site Hierarchical Scheduling, Data Access & Integration, …
  Portals, Workflow Systems, Virtual Data, Semantic Grids
  Tools to support Application Developers, Users & Operations
• Creative Actions and Judgements of Researchers, Designers & Clinicians
  Data, Models & Analyses
  In Silico Experiments, Design, Diagnosis & Planning
  Creating the Scientific Record

Move Computation to Data
• Code scale
  Depends on wet-ware
    • No noticeable rate of improvement
• Data scale
  Grows Moore’s Law or Moore’s Law²
• Analysis of data
  Extracts & derivatives used
    • Often smaller – more value for current investigation
• Implies move code to data
  SQL, XQuery, Java code, …
• Extensibility mechanisms used by OGSA-DAIers
    • Java mobility (e.g. DataCutter), database procedures, …
Callout: Increasingly necessary: Application control or higher-level service decisions

Integration is Everything
• Motivation
  No business or research team is satisfied with one data resource
• Human centred
  Domain-specialist driven
  Dynamic specification of combination function
  Iterative processes
    • Revised request minutes later
    • Revised request after months of thought
• Sources inevitably heterogeneous
  Time-varying content, structure & policies
• Robust & stable steerable integration services
  Higher-level services over multiple resources
  Fundamental requirements for (re)negotiation
Callout: Federation or Virtualisation preceding integration, or kit of integration tools to be interwoven with an application?

Multiple tasks / request
Figure: a CLIENT / REQUESTOR passes one request through an API and STUB; labels include two Data Sets and task records numbered 0 to 7, each with Ident, Type and Value fields.

Be Direct
• Double Handling costs too much
  • Double Handling via discs pathologically bad
  • Data translation expensive
  Avoid
  • Deliver as stored, …
  Memory cycles, bus capacity, cache disruption, …
• Compose
  Stream
  • Main memory is not big enough
  Stream or use Disk
  • Couple generator & consumer directly
  Stream from RAM to RAM
  Requires coupled computation execution
  Models for process transformation and optimisation
Callout: Breaks down boundaries and merges data, execution & transport requirements. Demands smart workflow enactment service & foundation services

OGSA-DAI
Figure components: Analyst (client); Registry (GDSR); Factory (GDSF); Grid Data Service (GDS); Consumer; Database (Xindice, MySQL, Oracle, DB2)
Interaction types: SOAP/HTTP; service creation; API interactions
• Request to Registry for sources of data about “x”
• Registry responds with Factory handle
• Request to Factory for access to database
• Factory creates GridDataService
• Factory returns handle of GDS to client
• Client queries GDS with SQL, XPath, XQuery etc
• GDS interacts with database
• Query results returned as XML, or delivered to consumer as XML
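
The OGSA-DAI interaction sequence (registry lookup, factory creation of a Grid Data Service, query, XML results) can be sketched in a few lines of Python. This is only an in-process illustration under stated assumptions: the class and method names (Registry, GridDataServiceFactory, GridDataService, find, create_service, perform) are hypothetical and are not the OGSA-DAI API, plain method calls stand in for SOAP/HTTP, and an in-memory SQLite database with made-up rows stands in for the databases named on the slide.

```python
import sqlite3
from xml.sax.saxutils import escape


class GridDataService:
    """Hypothetical stand-in for a GDS: wraps one database connection."""

    def __init__(self, connection):
        self._connection = connection

    def perform(self, sql):
        """Run an SQL query and return the rows as a small XML document."""
        cursor = self._connection.execute(sql)
        columns = [description[0] for description in cursor.description]
        rows = []
        for record in cursor:
            cells = "".join(
                f"<{name}>{escape(str(value))}</{name}>"
                for name, value in zip(columns, record)
            )
            rows.append(f"<row>{cells}</row>")
        return "<results>" + "".join(rows) + "</results>"


class GridDataServiceFactory:
    """Hypothetical stand-in for a GDSF: creates a GDS on request."""

    def __init__(self, open_connection):
        self._open_connection = open_connection

    def create_service(self):
        # "Factory creates GridDataService" and returns its handle.
        return GridDataService(self._open_connection())


class Registry:
    """Hypothetical stand-in for a GDSR: maps topics to factories."""

    def __init__(self):
        self._factories = {}

    def register(self, topic, factory):
        self._factories.setdefault(topic, []).append(factory)

    def find(self, topic):
        # "Registry responds with Factory handle" (here: a list of them).
        return list(self._factories.get(topic, []))


def open_demo_database():
    """Illustrative data only; any relational source could sit here."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE star (name TEXT, magnitude REAL)")
    connection.executemany(
        "INSERT INTO star VALUES (?, ?)", [("Sirius", -1.46), ("Vega", 0.03)]
    )
    return connection


def run_demo():
    registry = Registry()
    registry.register("stars", GridDataServiceFactory(open_demo_database))

    factory = registry.find("stars")[0]   # ask the registry about "stars"
    service = factory.create_service()    # ask the factory for a GDS
    return service.perform("SELECT name FROM star ORDER BY name")


if __name__ == "__main__":
    print(run_demo())
```

The sketch covers only the path where results are returned to the client; delivery to a separate consumer would replace the return value of perform with a call to that consumer.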