edikt: e-Science Data, Information and Knowledge Transformation
E-Science Centres of Excellence Meeting, 6 November 2003
Dr. Denise Ecklund, edikt technical architect

What is edikt?
[Diagram labels: Standards; Requirements analysis; Technology matchmaking; E-Science Apps; CS Research; Edikt project; Gap filling; Grid Services for e-Science Data Management; Rigorous engineering; Commercial SW components and skills]
The team: 8 professional software engineers, support staff, project manager, commercialisation manager, architect, and SAB
SHEFC funded research and development grant
  – 3 years funding: May 2002 – 2005
  – +3 years funding upon successful project and review
www.edikt.org

Current activities
ELDAS – Enterprise Level Data Access Services
  – Core data services supporting e-Science virtual organisations
BinX and AstroBinX – Binary XML
  – Supports data interchange for astronomy and other applications
OSAGE – Ontology-based Species Atlas for Gene Expression
  – Defines a database schema for storing and annotating 3D anatomy and gene expression data for multiple species
Technology and research evaluations

Creating a Virtual Organization
[Diagram: ELDAS + Grid Directory Services connecting radio spectrum, optical spectrum, and X-ray spectrum data sources; databases shown are DB2, Xindice, and MySQL]

ELDAS – Extensibility via DACs
[Diagram: User1, User2, and User3 access ELDAS instances built on a reusable ELDAS Core; DACs connect the core to Xindice, MySQL, DB2, and Oracle 9i databases]
Data Access Components (DACs) interface to distinct DBMSs
Multiple DB drivers can be supported
  – JDBC, ODBC for relational DBMSs
Plug-n-Play installation of ELDAS

ELDAS – EJB Implementation
[Diagram: Grid User1 and Grid User2 connect through a Grid Proxy, Web User1 through a Web Servlet; both reach ELDAS EJB - GDS inside the Java Framework; DACs connect to Xindice, MySQL, DB2, and Oracle 9i databases]
ELDAS runs anywhere
Suitable for grid & web
Java 2 Enterprise Edition implements basic server tasks
Java Beans container used to implement ELDAS core

BinX – accessing legacy binary data simulations
The Problem:
  – Many binary data files
  – Applications must “know” the data format
  – Binary data formats are machine-specific
The Solution:
  – Write a “stand-aside” format description in XML
  – Provide a library to
      interpret the description
      provide file access across different machines
  – Build higher-level services
[Diagram: a BinX file describes the binary file structure; the BinX Library sits between the binary data files and the e-Science Application]

AstroBinX – format transformation
Even when we try to agree, we disagree
Multiple data format standards require conversions
Data format transformations based on XML descriptions
[Diagram labels: BinX description (x2); Binary Data File (x2); BinX Library; BinX Utilities; FITS data format; VOTable data format; Spectral Analysis Application; 3D Image Data Mining Application]

OSAGE – Applying Computer Science
Extend the Edinburgh Mouse Atlas
  – Data model to describe multiple species
  – Support scientific collaboration via data sharing
Computer Science theory and best practice
  – Data modelling to efficiently relate images and text
  – Flexible data annotation and versioning with XML
[Diagram labels: CS theory; DB2 DB; Data Access Services]

The Future – bringing components together
Extended Grid Data Services for Virtual Organisations
  – Data Versioning Service
  – Constraint Mgmt Service
  – User Annotation Service
  – Data Archiving Service
  – ...
CS research results layered over basic ELDAS services
BinX is an intelligent binary file data source
[Diagram labels: ELDAS; BinX Library; Xindice DB; MySQL DB; DB2 DB; Binary Files]

edikt: e-Science Data, Information and Knowledge Transformation
http://www.edikt.org
ELDAS – download the library and docs in January 2004
BinX – download the library, utilities, docs, and sample applications now
Thank you! Questions?
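
Backup illustration for the BinX slide: the "stand-aside" idea is that the binary file stays untouched while a separate XML document states its byte order and field layout, and a library interprets that description to read the file on any machine. The following is a minimal sketch of that idea only. It is not the BinX library, and the XML vocabulary (`layout`, `field`, `byteOrder`, the type names) and the function names are invented for this example rather than taken from the BinX schema.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical description vocabulary, made up for this sketch (NOT the BinX schema).
DESCRIPTION = """
<layout byteOrder="bigEndian">
  <field name="id" type="int32"/>
  <field name="flux" type="float64"/>
</layout>
"""

# Map the assumed type names and byte orders onto Python struct codes.
TYPE_CODES = {"int16": "h", "int32": "i", "float32": "f", "float64": "d"}
ORDER_CODES = {"bigEndian": ">", "littleEndian": "<"}


def compile_layout(xml_text):
    """Turn the XML description into field names plus a struct.Struct."""
    root = ET.fromstring(xml_text)
    fields = root.findall("field")
    names = [f.get("name") for f in fields]
    codes = "".join(TYPE_CODES[f.get("type")] for f in fields)
    # An explicit byte-order prefix makes the layout machine-independent.
    return names, struct.Struct(ORDER_CODES[root.get("byteOrder")] + codes)


def read_records(data, xml_text):
    """Decode every fixed-size record in `data` according to the description."""
    names, record = compile_layout(xml_text)
    return [dict(zip(names, values)) for values in record.iter_unpack(data)]


# Two big-endian records standing in for a legacy binary file.
sample = struct.pack(">id", 7, 1.5) + struct.pack(">id", 8, 2.25)
records = read_records(sample, DESCRIPTION)
print(records)
```

Because the byte order lives in the description rather than in the application, the same reader works whether the file was written on a big-endian or a little-endian machine; only the description changes.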