edikt: e-Science Data, Information and Knowledge Transformation
E-Science Centres of Excellence Meeting
6 November 2003
Dr. Denise Ecklund, edikt technical architect
What is edikt?
[Diagram labels: Standards; Requirements analysis; Technology matchmaking; E-Science Apps; CS Research; edikt project; Gap filling; Grid Services for e-Science Data Management; Rigorous engineering; Commercial SW components and skills]
www.edikt.org
The edikt team
• The team
– 8 professional software engineers
– project manager
– commercialisation manager
– architect
– support staff
– Scientific Advisory Board (SAB)
• SHEFC funded R&D grant
– 3 years of funding: May 2002 – 2005
– a further 3 years of funding upon successful review
Current activities
• ELDAS – Enterprise level data access services
– Core data services supporting e-Science virtual organisations
• BinX and AstroBinX – Binary XML
– Supports data interchange for astronomy and other applications
• OSAGE – Ontology-based Species Atlas for Gene Expression
– Defines a database schema for storing and annotating 3D anatomy and gene expression data for multiple species
• Technology and research evaluations
Creating a Virtual Organization
[Diagram: radio, optical and X-ray spectrum data sets stored across DB2, Xindice and MySQL databases]
Creating a Virtual Organization
[Diagram: ELDAS + Grid Directory Services shown with the radio, optical and X-ray spectrum data in the DB2, Xindice and MySQL databases]
ELDAS – Extensibility via DACs
[Diagram: User1, User2 and User3 share a reusable ELDAS Core; DACs connect the core to Xindice, MySQL, DB2 and Oracle 9i databases]
• Data Access Components (DACs) interface to distinct DBMSs
• Multiple DB drivers can be supported
– JDBC, ODBC for relational DBMSs
• Plug-n-Play installation of ELDAS
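The DAC idea can be illustrated with a minimal sketch: a core that only talks to a common interface, with one pluggable component per back end. ELDAS itself is built on Java and EJBs; Python is used here only for brevity. Every name below (`DataAccessComponent`, `RelationalDAC`, `DocumentDAC`, `Core`) is hypothetical and not taken from the ELDAS code base. SQLite stands in for a relational DBMS and a plain dict stands in for an XML store.

```python
import sqlite3
from abc import ABC, abstractmethod


class DataAccessComponent(ABC):
    """Hypothetical DAC contract: the core only ever talks to this interface."""

    @abstractmethod
    def query(self, statement):
        """Run a query in the back end's own language and return a list of rows."""


class RelationalDAC(DataAccessComponent):
    """Stand-in for a relational DAC; SQLite replaces MySQL/DB2/Oracle here."""

    def __init__(self, connection):
        self._connection = connection

    def query(self, statement):
        return self._connection.execute(statement).fetchall()


class DocumentDAC(DataAccessComponent):
    """Stand-in for an XML-store DAC; a dict replaces Xindice here."""

    def __init__(self, documents):
        self._documents = documents

    def query(self, statement):
        # In this toy back end the "query" is simply a document key.
        return [self._documents[statement]]


class Core:
    """Reusable core: routes each request to whichever DAC is registered."""

    def __init__(self):
        self._dacs = {}

    def register(self, source_name, dac):
        self._dacs[source_name] = dac

    def query(self, source_name, statement):
        return self._dacs[source_name].query(statement)


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE spectra (band TEXT, flux REAL)")
connection.execute("INSERT INTO spectra VALUES ('x-ray', 1.5)")

core = Core()
core.register("xray", RelationalDAC(connection))
core.register("optical", DocumentDAC({"m31": "<spectrum band='optical'/>"}))

xray_rows = core.query("xray", "SELECT band, flux FROM spectra")
optical_rows = core.query("optical", "m31")
```

Supporting a further DBMS in this sketch means writing one more `DataAccessComponent` subclass and registering it; the core stays unchanged.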
ELDAS – EJB Implementation
Using EJBs, ELDAS separates
- data access
- business logic
- presentation layers
[Diagram: Java framework hosting the ELDAS EJB - GDS; DACs connect to Xindice, MySQL, DB2 and Oracle 9i databases]
• Java 2 Enterprise Edition (J2EE) implements basic server tasks
• An Enterprise JavaBeans (EJB) container is used to implement the ELDAS core
ELDAS – EJB Implementation
Suitable for grid users
[Diagram: Grid User1 and Grid User2 connect through a Grid Proxy to the ELDAS EJB - GDS in the Java framework; DACs connect to Xindice, MySQL, DB2 and Oracle 9i databases]
ELDAS – EJB Implementation
Suitable for grid users and web users
[Diagram: Grid User1 and Grid User2 connect through a Grid Proxy, and Web User1 through a Web Servlet, to the ELDAS EJB - GDS in the Java framework; DACs connect to Xindice, MySQL, DB2 and Oracle 9i databases]
BinX – accessing legacy binary data
• The Problem:
– Many binary data files
– Applications must “know” the data format of each file
– Binary data formats are machine-specific
[Diagram: simulations, several binary data files, and an e-Science application]
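A small illustration of why binary formats are machine-specific: the same four bytes decode to different integers depending on the byte order the reader assumes. The byte values are made up for the example; only Python's standard `struct` module is used.

```python
import struct

# Four bytes as they might appear at the start of a binary data file.
raw = bytes([0x00, 0x00, 0x01, 0x00])

# The same bytes read as a 32-bit unsigned integer under each byte order.
as_big_endian = struct.unpack(">I", raw)[0]
as_little_endian = struct.unpack("<I", raw)[0]

print(as_big_endian, as_little_endian)  # prints "256 65536"
```

An application that does not know which machine wrote the file cannot tell which of the two values was intended.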
BinX – accessing legacy binary data
• The Solution:
– Write a “stand-aside” format description in XML
– Provide a library to
  • Interpret the description
  • Provide file access across different machines
– Build higher-level services
[Diagram: a BinX file describes the binary file structure; the BinX Library sits between the binary data files and the e-Science application]
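The "stand-aside description" approach can be sketched as follows. The XML vocabulary here (`description`, `field`, `byteOrder`, the type names) is invented for illustration and is not the real BinX schema, and `read_record` is a hypothetical helper, not part of the BinX library. The point is only that the reader learns the layout and byte order from the description rather than from hard-coded knowledge.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical description format for illustration only; NOT the real BinX schema.
DESCRIPTION = """
<description byteOrder="big">
  <field name="count" type="int32"/>
  <field name="temperature" type="float64"/>
</description>
""".strip()

# Map the illustrative type names and byte orders onto struct format codes.
TYPE_CODES = {"int16": "h", "int32": "i", "float32": "f", "float64": "d"}
ORDER_CODES = {"big": ">", "little": "<"}


def read_record(description_xml, data):
    """Decode one record from `data` using the layout given in the XML description."""
    root = ET.fromstring(description_xml)
    fields = root.findall("field")
    fmt = ORDER_CODES[root.get("byteOrder")] + "".join(
        TYPE_CODES[field.get("type")] for field in fields
    )
    values = struct.unpack_from(fmt, data)
    return dict(zip((field.get("name") for field in fields), values))


# Bytes as a big-endian machine would have written them: int32 7, float64 21.5.
raw = struct.pack(">id", 7, 21.5)
record = read_record(DESCRIPTION, raw)
```

Because the byte order is stated in the description, the same call decodes the file correctly on any machine; changing the file layout means editing the XML, not the application.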
AstroBinX – format transformation
• Even when we try to agree, we disagree
• Multiple data format standards require conversions
[Diagram: binary data files in FITS and VOTable data formats; Spectral Analysis Application; 3D Image Data Mining Application]
AstroBinX – format transformation
• Data format transformations based on XML descriptions
• Build AstroBinX services using the BinX library
[Diagram: BinX descriptions attached to the FITS and VOTable binary data files; BinX Library and BinX Utilities shown with the Spectral Analysis Application and the 3D Image Data Mining Application]
OSAGE – Applying Computer Science
• Extend the Edinburgh Mouse Atlas
– Data model to describe multiple species
– Support scientific collaboration via data sharing
• Computer Science theory and best practice
– Data modelling to efficiently relate images and text
– Flexible data annotation and versioning with XML
[Diagram labels: CS theory; DB2 DB; Data Access Services]
The Future – bringing components together
BinX is an intelligent binary file data source
[Diagram: ELDAS with Xindice DB, MySQL DB and DB2 DB, plus binary files reached via the BinX Library]
The Future – bringing components together
Extended Grid Data Services for Virtual Organisations
– Data Versioning Service
– Constraint Mgmt Service
– User Annotation Service
– Data Archiving Service
– ...
CS research results layered over basic ELDAS services
BinX is an intelligent binary file data source
[Diagram: the extended services shown with ELDAS, Xindice DB, MySQL DB and DB2 DB, plus binary files reached via the BinX Library]
e-Science Data Information and Knowledge Transformation
http://www.edikt.org
• ELDAS
– download the library and docs in January 2004
• BinX
– download the library, utilities, docs, and sample applications now
Thank you!
Questions?