OGSA-DAI Status Report and Future Directions
Neil Chue Hong, N.ChueHong@epcc.ed.ac.uk
Malcolm Atkinson, mpa@nesc.ac.uk
http://www.ogsadai.org.uk

Outline of Talk
• Project background and motivation
• Status of current release
• Releases and usage statistics
• Roadmap for future releases

Motivation
• Entering an age of data
  – Data explosion
    • CERN: LHC will generate 1 GB/s = 10 PB/y
    • VLBA (NRAO) generates 1 GB/s today
    • Pixar generates 100 TB per movie
  – Storage getting cheaper
• Data stored in many different ways
  – Data resources
    • Relational databases
    • XML databases
    • Flat files
• Need ways to facilitate
  – Data discovery
  – Data access
  – Data integration
E-Life™, e-Research
• Empower e-Business and e-Science
  – The Grid is a vehicle for achieving this
Major challenge: rate of growth in number and complexity of data sources, and rate at which they change ⇒ old “cottage industry” solutions don’t scale

OGSA-DAI
• First steps towards a generic framework for integrating data access and computation
• Using the grid to take specific classes of computation nearer to the data
• Kit of parts for building tailored access and integration applications
• Investigations to inform DAIS-WG
• One reference implementation for DAIS
• Releases publicly available NOW

Project Partners
Powered by ….
Funded by the Grid Core Programme
OGSA-DAI
  £3 million, 18 months, from Feb 2002
  Three major releases, three interim releases
DAIT (DAI-Two)
  Keep the OGSA-DAI brand name
  £1.5 million, 24 months, from Oct 2003
  Four major releases
GGF DAIS WG
  Strong involvement.
  Standardise the interfaces
  OGSA-DAI to be a reference implementation

Project Membership
[Organisation chart]
Roles: Principal Investigators; Research Team; Programme Management Board Chair; Technical Review Board Chair; Project Manager; EPCC Team; IBM Development Team; IBM Dissemination Team
Names, in the order shown: Malcolm, Kostas, Norman, Paul, Neil, Charaka, Mike, Ally, Mario, Amy, Charaka, Andy, Jon, Simon, Dave, Brian, Neil, Patrick

Infrastructure Architecture
[Layered diagram: Distributed Virtual Integration Architecture]
• Data Intensive X Scientists
• Data Intensive Applications for Science X
• Simulation, Analysis & Integration Technology for Science X
• Generic Virtual Data Access and Integration Layer
• OGSA and OGSA-DAI components: Job Submission, Brokering, Registry, Banking, Data Transport, Workflow, Structured Data Integration, Authorisation, Resource Usage, Transformation, Structured Data Access
• Grid or Web Service Infrastructure
• Compute, Data & Storage Resources
• Structured Data: Relational, XML, Semi-structured

OGSA-DAI
[Interaction diagram]
Components: Analyst (client), Registry (GDSR), Factory (GDSF), Grid Data Service (GDS), Consumer, Database (Xindice, MySQL, Oracle, DB2)
Legend: SOAP/HTTP, service creation, API interactions
• Request to Registry for sources of data about “x”
• Registry responds with Factory handle
• Request to Factory for access to database
• Factory creates GridDataService
• Factory returns handle of GDS to client
• Client queries GDS with SQL, XPath, XQuery etc.
• GDS interacts with database
• Query results returned as XML, or delivered to consumer as XML

Multiple tasks / request
[Diagram labels: CLIENT, REQUESTOR, API STUB, Data Set; entries numbered 0 to 7, each with Ident, Type, Value]

Current status
• OGSA-DAI R4 available since April 2004
• OGSI based – built on top of GT 3.2
• Supports relational, XML and some files
  – MySQL, Oracle, DB2, SQL Server, Postgres, Xindice, CSV
• Supports various delivery options
  – SOAP, FTP, GridFTP, HTTP, files, email,
    inter-service
• 2746 downloads, 792 registered users (Aug 04) from all around the world
• If you need to build high-level data services in your project, we strongly urge you to build on OGSA-DAI

Current Release
• R4 April 2004
  – Provides Data Access components, an extensible framework for building applications and some integration components
  – Built on top of Globus Toolkit 3.2
  – Supports relational, XML and some files
    • MySQL, Oracle, DB2, SQL Server, Postgres, Xindice, CSV
  – Supports various delivery options
    • SOAP, FTP, GridFTP, HTTP, files, email, inter-service
  – Supports various transforms
    • XSLT, ZIP, GZip
  – Supports message-level security using X509 certificates
  – Client Toolkit library for application developers
  – GUI data browser (contributed by FirstDIG project)
  – Separate Distributed Query Processing components
  – Comprehensive documentation and tutorials in XHTML format

Downloads by country
792 registered users @ 23/8/04

Downloads by Release
[Chart: downloads over time for R1, R2, R3 and R4; x-axis 15/01/2003 to 15/07/2004, y-axis 0 to 3000]
2746 downloads (~4.7 downloads a day)

Projects using OGSA-DAI at UK 2004 AHM
• Astrogrid
• ConvertGrid
• eDiamond
• EdSkyQueryG
• FirstDIG
• GEDDM
• GeneGrid
• INWA
• myGrid
• ODD-Genes
All these projects have demos at All Hands

Some more projects
http://cabig.nci.nih.gov/
“Expediting the cancer research communities' access to key bioinformatics platforms by deploying an integrating biomedical informatics infrastructure”
  – Chosen to use OGSA-DAI and Project Mobius to create data infrastructure
http://www.geongrid.org/
OGSA-DAI being deployed on all GEON nodes

Future plans
• OGSA-DAI is working on:
  – WS-I/WS-RF interfaces
  – Data integration tools and applications
  – Workflow and mobile code
• Basic
  WS-I version will be released with OMII middleware in October 2004
  – OGSA-DAI R5 (GT 3.2.1 based) will also be available in October 2004
• OGSA-DAI R6 available April 2005
  – Full support for WS-RF
  – Data Integration applications supporting identified scenarios
  – OGSA-DQP as an integrated part of release
  – Fully compliant JDBC Driver for OGSA-DAI
  – Support for WS-Security implementations
  … and much more

Roadmap
• New roadmap document published on OGSA-DAI website
  – http://www.ogsadai.org.uk/docs/OtherDocs/OGSADAIRoadmapV2.0.pdf
• User feedback required to drive this document
  – Suggest requirements
  – Let us know your priorities
• User Group meetings
  – Next one at GGF12 (Monday 20th September, 5pm)
  – Details at http://www.ogsadai.org.uk/news/ug2.php

Release 5
• R5 October 2004 (Interfaces) – OGSI
  – Built on Globus Toolkit 3.2.1
  – Re-engineered interface-independent core OGSA-DAI functionality.
  – Improved dependability and security integration.
  – New file data resources representing flat files queried using full text searches (e.g. EMBL format).
  – Installation and Configuration Wizard, including “all-in-one installer”
  – Improved Data Browser which allows XPath querying.
  – Set of standard benchmarks.
  – JSP Quick View interface.
  – Support for other databases (e.g. Access, eXist, HSQL).

WS-I Technical Preview
• A limited functionality evaluation version
  – An OGSA-DAI “Data Service” combining the metadata, configuration and perform document capabilities of the OGSI-based GDSF and GDS services.
  – Access to service metadata provided by a partial implementation of the WS-ResourceProperties specification.
  – Example clients are provided for testing and coding reference.
• Caveats/Issues:
  – No registry component, no support for third party delivery.
  – Security may be available (based on OMII WS-Security plug-in for Axis).
  – Document schema and interfaces WILL change.
  – The WSDL is based on the OGSI-based WSDL from OGSA-DAI
  – Will not be supported to same level as main release.
• Released with OMII middleware distribution in October

Release 6
• R6 April 2005 (Integration) – WS-RF
  – Data Integration applications supporting identified scenarios
  – OGSA-DQP as an integrated part of release
  – Fully compliant JDBC Driver for OGSA-DAI
  – Support for WS-Security implementations
  – Support for stored procedures on all supported databases
  – Improved support for different database specific SQL types
  – SQL translation between vendor dialects for subset of queries
  – Support for XQuery data resources
  – We expect to comply with a version of the emerging DAIS specification at this release.

Future releases
• Produce a reference implementation of the DAIS Specification
• Integrate with other eScience components
  – Workflow is vital
• Start to provide tools to manage distributed data resources
• Collaborate with other groups to produce additional functionality

Summary
• Data Access – done
• Data Integration – in progress
• OGSI → WS-I → WS-RF
• OGSA-DAI provides a framework which reduces application development time
• We need to know what you’re doing
  – Comment on the Roadmap
  – Participate in the User Group
• OGSA-DAI is out there NOW
  – Use it, and let us know what you’ve done

Further information
• The OGSA-DAI Project Site:
  – http://www.ogsadai.org.uk
• The DAIS-WG site:
  – http://cs.man.ac.uk/grid-db
• OGSA-DAI Users Mailing list
  – users@ogsadai.org.uk
  – General discussion on grid DAI matters
• Formal support for OGSA-DAI releases
  – http://www.ogsadai.org.uk/support
  – support@ogsadai.org.uk
• OGSA-DAI training courses
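
Appendix: interaction sketch

The Registry, Factory and Grid Data Service interaction outlined earlier in the talk (look up a factory in the registry, ask the factory for a data service, query the service, receive XML) can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not the OGSA-DAI API, it involves no SOAP or OGSI, and the class names, method names, the in-memory SQLite database and the sample table are all hypothetical stand-ins chosen for illustration.

```python
import sqlite3
from xml.etree import ElementTree as ET


class GridDataService:
    """Hypothetical stand-in for a GDS: wraps one database connection."""

    def __init__(self, connection):
        self._connection = connection

    def perform(self, sql):
        """Run a query against the database and return the rows as XML text."""
        cursor = self._connection.execute(sql)
        columns = [description[0] for description in cursor.description]
        root = ET.Element("results")
        for row in cursor:
            row_element = ET.SubElement(root, "row")
            for name, value in zip(columns, row):
                ET.SubElement(row_element, name).text = str(value)
        return ET.tostring(root, encoding="unicode")


class Factory:
    """Hypothetical stand-in for a GDSF: creates a data service on request."""

    def __init__(self, connection):
        self._connection = connection

    def create_service(self):
        return GridDataService(self._connection)


class Registry:
    """Hypothetical stand-in for a GDSR: maps a topic to a factory."""

    def __init__(self):
        self._factories = {}

    def register(self, topic, factory):
        self._factories[topic] = factory

    def find(self, topic):
        return self._factories[topic]


def demo():
    # Made-up sample data held in an in-memory SQLite database.
    database = sqlite3.connect(":memory:")
    database.execute("CREATE TABLE stars (name TEXT, magnitude REAL)")
    database.execute("INSERT INTO stars VALUES ('Sirius', -1.46)")
    database.commit()

    registry = Registry()
    registry.register("stars", Factory(database))

    factory = registry.find("stars")   # registry responds with a factory
    service = factory.create_service() # factory creates and returns a service
    return service.perform("SELECT name, magnitude FROM stars")


if __name__ == "__main__":
    print(demo())
```

The client in this sketch only ever holds a reference obtained from the previous step, which mirrors the handle-passing described in the talk; delivery to a separate consumer is left out.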