edikt: e-Science Data, Information and Knowledge Transformation
E-Science Centres of Excellence Meeting
6 November 2003
Dr. Denise Ecklund, edikt technical architect
What is edikt?
[Diagram labels: Edikt project; Gap filling; E-Science Apps; CS Research; Grid Services for e-Science Data Management; Standards; Requirements analysis; Technology matchmaking; Rigorous engineering; Commercial SW components and skills]
• The team: 8 professional software engineers, support staff, project manager, commercialisation manager, architect, and SAB
• SHEFC-funded research and development grant
– 3 years' funding: May 2002 – 2005
– A further 3 years' funding upon a successful project and review
www.edikt.org
Current activities
• ELDAS – Enterprise-Level Data Access Services
– Core data services supporting e-Science virtual organisations
• BinX and AstroBinX – Binary XML
– Supports data interchange for astronomy and other applications
• OSAGE – Ontology-based Species Atlas for Gene Expression
– Defines a database schema for storing and annotating 3D anatomy and gene expression data for multiple species
• Technology and research evaluations
Creating a Virtual Organisation
[Diagram labels: ELDAS + Grid Directory Services; Radio spectrum; Optical spectrum; X-Ray spectrum; DB2 DB; Xindice DB; MySQL DB]
ELDAS – Extensibility via DACs
[Diagram labels: User1, User2, User3; ELDAS; Reusable ELDAS Core; DAC, DAC2; Xindice DB; MySQL DB; DB2 DB; Oracle 9i DB]
• Data Access Components (DACs) interface to distinct DBMSs
• Multiple DB drivers can be supported
– e.g. JDBC and ODBC for relational DBMSs
• Plug-and-play installation of ELDAS
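
The DAC arrangement on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ELDAS API: the names `DataAccessComponent`, `SqliteDAC`, and `Core` are hypothetical, Python stands in for the project's Java implementation, and an in-memory SQLite database stands in for a real DBMS.

```python
import sqlite3
from abc import ABC, abstractmethod


class DataAccessComponent(ABC):
    """Hypothetical DAC interface: one implementation per kind of DBMS."""

    @abstractmethod
    def query(self, statement: str) -> list:
        """Run a query against the underlying database and return its rows."""


class SqliteDAC(DataAccessComponent):
    """Stand-in DAC backed by an in-memory SQLite database (an assumption)."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE spectra (band TEXT, flux REAL)")
        self._conn.execute("INSERT INTO spectra VALUES ('optical', 1.5)")

    def query(self, statement: str) -> list:
        return self._conn.execute(statement).fetchall()


class Core:
    """Hypothetical reusable core: routes each request to a registered DAC."""

    def __init__(self) -> None:
        self._dacs = {}

    def register(self, name: str, dac: DataAccessComponent) -> None:
        self._dacs[name] = dac

    def query(self, name: str, statement: str) -> list:
        return self._dacs[name].query(statement)


core = Core()
core.register("optical-archive", SqliteDAC())
rows = core.query("optical-archive", "SELECT band, flux FROM spectra")
```

In this sketch, supporting another DBMS would mean writing one more `DataAccessComponent` subclass and registering it; the core stays unchanged.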
ELDAS – EJB Implementation
ELDAS runs anywhere
Suitable for grid & web
[Diagram labels: Grid User1, Grid User2, Grid Proxy; Web User1, Web Servlet; Java Framework; ELDAS EJB - GDS; DAC; Xindice DB; MySQL DB; DB2 DB; Oracle 9i DB]
• Java 2 Enterprise Edition implements basic server tasks
• An Enterprise JavaBeans (EJB) container is used to implement the ELDAS core
BinX – accessing legacy binary data
• The Problem:
– Many binary data files
– Applications must “know” the data format
– Binary data formats are machine-specific
[Diagram labels: simulations; Binary Data File (three files)]
• The Solution:
– Write a “stand-aside” format description in XML
– Provide a library to
   • Interpret the description
   • Provide file access across different machines
– Build higher-level services
[Diagram: BinX file describes binary file structure; BinX Library; e-Science Application]
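
The stand-aside description idea can be sketched as follows. This is only an illustration: the XML element and attribute names (`dataset`, `field`, `byteOrder`) and the function `read_record` are hypothetical and are not the real BinX schema or library API, which these slides do not show. The sketch uses Python's standard `struct` and `xml.etree.ElementTree` modules.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical description format, kept apart from the binary data it describes.
DESCRIPTION = """
<dataset byteOrder="bigEndian">
  <field name="count" type="int32"/>
  <field name="mean" type="float64"/>
</dataset>
"""

# Map the illustrative type and byte-order names to struct format codes.
TYPE_CODES = {"int32": "i", "float64": "d"}
ORDER_CODES = {"bigEndian": ">", "littleEndian": "<"}


def read_record(description: str, data: bytes) -> dict:
    """Decode one record from `data` according to the XML description."""
    root = ET.fromstring(description)
    fields = root.findall("field")
    fmt = ORDER_CODES[root.get("byteOrder")] + "".join(
        TYPE_CODES[f.get("type")] for f in fields
    )
    values = struct.unpack(fmt, data)
    return {f.get("name"): v for f, v in zip(fields, values)}


# A big-endian record, as a legacy application might have written it.
raw = struct.pack(">id", 3, 2.5)
record = read_record(DESCRIPTION, raw)
```

Because the byte order is stated in the description rather than assumed by the reader, the same call decodes the file identically on any machine.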
AstroBinX – format transformation
• Even when we try to agree, we disagree
• Multiple data format standards require conversions
[Diagram labels: Binary Data File (two files), each with a BinX description; BinX Library; BinX Utilities; FITS data format; VOTable data format; Spectral Analysis Application; 3D Image Data Mining Application]
Data format transformations based on XML descriptions
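
A description-driven transformation can be sketched in miniature. The example below only re-encodes a record from one byte order to another; it does not read or write real FITS or VOTable files. The XML vocabulary and the functions `struct_format` and `transform` are hypothetical, not part of the BinX Utilities.

```python
import struct
import xml.etree.ElementTree as ET

# Illustrative type and byte-order names mapped to struct format codes.
TYPE_CODES = {"int16": "h", "float32": "f"}
ORDER_CODES = {"bigEndian": ">", "littleEndian": "<"}


def struct_format(description: str) -> str:
    """Build a struct format string from a hypothetical XML description."""
    root = ET.fromstring(description)
    codes = "".join(TYPE_CODES[f.get("type")] for f in root.iter("field"))
    return ORDER_CODES[root.get("byteOrder")] + codes


def transform(data: bytes, source_desc: str, target_desc: str) -> bytes:
    """Decode with the source description, re-encode with the target one."""
    values = struct.unpack(struct_format(source_desc), data)
    return struct.pack(struct_format(target_desc), *values)


SOURCE = ('<dataset byteOrder="bigEndian">'
          '<field type="int16"/><field type="float32"/></dataset>')
TARGET = ('<dataset byteOrder="littleEndian">'
          '<field type="int16"/><field type="float32"/></dataset>')

big = struct.pack(">hf", 7, 0.5)
little = transform(big, SOURCE, TARGET)
```

The conversion logic never mentions a concrete layout; swapping either description changes the transformation without touching the code.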
OSAGE – Applying Computer Science
• Extend the Edinburgh Mouse Atlas
– Data model to describe multiple species
– Support scientific collaboration via data sharing
• Computer Science theory and best practice
– Data modelling to efficiently relate images and text
– Flexible data annotation and versioning with XML
[Diagram labels: CS theory; DB2 DB; Data Access Services]
The Future – bringing components together
Extended Grid Data Services for Virtual Organisations
[Diagram labels: Data Versioning Service; Constraint Mgmt Service; User Annotation Service; Data Archiving Service; ...; ELDAS; BinX Library; Xindice DB; MySQL DB; DB2 DB; Binary Files]
• CS research results layered over basic ELDAS services
• BinX is an intelligent binary file data source
http://www.edikt.org
• ELDAS
– download the library and docs in January 2004
• BinX
– download the library, utilities, docs, and sample applications now
Thank you!
Questions?