IBM Software Group
The Use of OGSA-DAI with DB2 Content Manager
in the eDiaMoND Project
M Oevers, B Collins, A Knox, J Williams
Overview
• eDiaMoND the project
• Strategies for Virtualisation
• How DB2 and CM are used
• OGSA-DAI enablement of CM
• Lessons Learnt
eDiaMoND – Project Announcement
“One of the pilot e-science projects is to develop a digital mammography archive, together with an intelligent medical decision support system for breast cancer diagnosis and treatment. An individual hospital will not have supercomputing facilities, but through the grid it could buy the time it needs. So the surgeon in the operating theatre will be able to pull up a high-resolution mammogram to identify exactly where the tumour can be found.”
Tony Blair (speech to the Royal Society, 23 May 2002)
eDiaMoND Partners
eDiaMoND – Project Deliverables
Phase 0 Prototype (end-2003)
eDiaMoND BluePrint
Phase 1 Prototype (mid-2004)
Next Phase (?)
• Grid Infrastructure
• Grid-connected Workstation
• Database for Storage & Retrieval of Images & Metadata
• Computation for CADe, CADi and Statistical Analyses
• Required Hardware, Software & Network for given Service Levels
Breast Screening Programmes
eDiaMoND Functional Model
Strategies for Virtualisation
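• Use DB2 Information Integrator (II) and Information Integrator for Content (II4C)
• Expose through OGSA-DAI
• Investigate Distributed Query Processing (DQP)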
Virtualisation – things to remember
• Each Breast Care Unit (BCU) to operate independently from others
  - Individual organisations coming together to form a Virtual Organisation
  - Data loaded locally in each BCU
  - Data is “owned” by the BCU
• Enable read access across all BCUs seamlessly
  - Replication or Federation
  - DB2 II & II4C
• Remember it’s got to be a Grid (eScience project)
  - OGSA-DAI
  - Distributed Query Processing (DQP) over OGSA-DAI
How OGSA-DAI is used with DB2 and CM
• DB2 stores the non-image data in a structured form
  - DICOM describes an ER model: Patient – Study – Series – Image
  - Flexible to allow for multiple modalities
  - Allows flexibility of data modelling, access control and query rewrite
• CM is used to store and manage the large (30 MB) DICOM files
  - Files contain both non-image data and image data
  - Identified by DICOM SOP Instance UID
  - Flat CM data model (customer requirement)
• Both exposed as OGSA-DAI services
DICOM – Digital Imaging and Communications in Medicine
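The Patient – Study – Series – Image hierarchy above can be sketched as plain data classes. This is only an illustration: the class and field names other than the SOP Instance UID are assumptions chosen for readability, not the eDiaMoND schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Image:
    # DICOM SOP Instance UID; per the slide this also identifies the file in CM.
    sop_instance_uid: str


@dataclass
class Series:
    series_uid: str  # hypothetical field name
    modality: str    # e.g. "MG", the DICOM modality code for mammography
    images: List[Image] = field(default_factory=list)


@dataclass
class Study:
    study_uid: str  # hypothetical field name
    series: List[Series] = field(default_factory=list)


@dataclass
class Patient:
    patient_id: str
    studies: List[Study] = field(default_factory=list)


def sop_instance_uids(patient: Patient) -> List[str]:
    """Walk Patient -> Study -> Series -> Image and collect the image keys."""
    return [
        image.sop_instance_uid
        for study in patient.studies
        for series in study.series
        for image in series.images
    ]


# Made-up sample data, for illustration only.
patient = Patient(
    patient_id="P-0001",
    studies=[
        Study(
            study_uid="1.2.3",
            series=[
                Series(
                    series_uid="1.2.3.1",
                    modality="MG",
                    images=[Image("1.2.3.1.1"), Image("1.2.3.1.2")],
                )
            ],
        )
    ],
)
print(sop_instance_uids(patient))
```

Keeping the modality on the series rather than on the patient is one way to leave room for multiple modalities.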
[Architecture diagram: services across three layers]
Client Layer: Screening, Administration Client, Viewer Client
Grid Layer: Query Service (Persistent), Retrieve Service (Persistent), Worklist Service (Transient), two OGSA-DAI Services (Persistent)
Data Layer: DB2 Instance, Content Manager Instance
Workflow: 1. Query; 2. Worklist Create; 3. Worklist Consume; 4. Retrieve
Identifiers exchanged: Patient ID, DICOM ID, URL – DICOM ID
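The four workflow steps (Query, Worklist Create, Worklist Consume, Retrieve) can be sketched with in-memory stand-ins. Everything here is an assumption for illustration: the dictionaries take the place of the DB2 and Content Manager instances, and the function names, sample IDs and example.org URLs are hypothetical rather than the project's service interfaces.

```python
from typing import Dict, List, Optional

# Stand-in for the DB2 instance: Patient ID -> DICOM IDs (made-up sample data).
METADATA_DB: Dict[str, List[str]] = {"P-0001": ["1.2.3.1.1", "1.2.3.1.2"]}

# Stand-in for the Content Manager instance: DICOM ID -> URL of the stored file.
CONTENT_STORE: Dict[str, str] = {
    "1.2.3.1.1": "http://cm.example.org/items/1.2.3.1.1",
    "1.2.3.1.2": "http://cm.example.org/items/1.2.3.1.2",
}


def query(patient_id: str) -> List[str]:
    """Step 1: the query service resolves a Patient ID to DICOM IDs."""
    return list(METADATA_DB.get(patient_id, []))


class Worklist:
    """Steps 2 and 3: a transient list of DICOM IDs, created and then consumed."""

    def __init__(self, dicom_ids: List[str]) -> None:
        self._pending = list(dicom_ids)

    def consume(self) -> Optional[str]:
        """Hand out the next DICOM ID, or None once the worklist is empty."""
        return self._pending.pop(0) if self._pending else None


def retrieve(dicom_id: str) -> str:
    """Step 4: the retrieve service resolves a DICOM ID to a URL."""
    return CONTENT_STORE[dicom_id]


def run_workflow(patient_id: str) -> List[str]:
    worklist = Worklist(query(patient_id))
    urls = []
    dicom_id = worklist.consume()
    while dicom_id is not None:
        urls.append(retrieve(dicom_id))
        dicom_id = worklist.consume()
    return urls


print(run_workflow("P-0001"))
```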
Grid Development – Phase 0 to Phase 1
[Layered deployment diagram]
Sites: UED, UCL, KCL, CHU
Client Layer: Data Loader, Admin, Viewer
Grid Layer: WORKLIST, QUERY and RETRIEVE services; OGSA DAI services for DB2, CM, DB2 Fed. and CM Fed.
Data Layer: DB2 and CM instances
Other labels: Deploy
CM Grid enablement – What it means
OGSA-DAI conf/ext points and their mapping to CM
• Driver Class, e.g. com.ibm.db2.jcc.DB2Driver
  - CM: Datastore object, e.g. com.ibm.mm.sdk.server.DKDatastoreICM
• Driver URI, e.g. jdbc:db2://localhost:50000/SAMPLE
  - CM: Data store name, e.g. ICMNLSDB
• Connection: DriverManager.getConnection()
  - CM: Connected Datastore: Datastore.connect()
• Metadata: Table Schema for SQL, XML schema for XML DB
  - CM: ItemTypes and Attributes (could it be treated as an XML DB?)
• Mapping of Grid Certificates to DB user and password
  - CM: Mapping of Grid Certificate to CM user and password
It was possible to map CM concepts to corresponding JDBC concepts that are exposed in OGSA-DAI configuration files
2 XML files to edit and 2 Java classes to write
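The correspondence between the JDBC-style settings and their CM counterparts can be sketched as two instances of one record type. This is a hypothetical illustration, not the OGSA-DAI configuration file format: `ResourceConfig`, the role map and `credentials_for` are assumed names, and the certificate DN and credentials are made up. Only the class names, URI, datastore name and connect calls come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ResourceConfig:
    """Hypothetical container for the settings one data resource needs."""

    implementation_class: str  # JDBC driver class, or CM datastore class
    location: str              # JDBC URI, or CM datastore name
    connect_call: str          # how a live connection is obtained
    metadata_source: str       # what describes the stored data


DB2_RESOURCE = ResourceConfig(
    implementation_class="com.ibm.db2.jcc.DB2Driver",
    location="jdbc:db2://localhost:50000/SAMPLE",
    connect_call="DriverManager.getConnection()",
    metadata_source="Table schema",
)

CM_RESOURCE = ResourceConfig(
    implementation_class="com.ibm.mm.sdk.server.DKDatastoreICM",
    location="ICMNLSDB",
    connect_call="Datastore.connect()",
    metadata_source="ItemTypes and Attributes",
)

# Hypothetical role map: grid certificate DN -> (CM user, CM password).
CM_ROLE_MAP: Dict[str, Tuple[str, str]] = {
    "/C=UK/O=eScience/OU=Oxford/CN=example user": ("cmuser", "cmpassword"),
}


def credentials_for(
    distinguished_name: str, role_map: Dict[str, Tuple[str, str]]
) -> Tuple[str, str]:
    """Map a grid certificate DN to the user and password of the back end."""
    try:
        return role_map[distinguished_name]
    except KeyError:
        raise PermissionError(
            f"no mapping for certificate {distinguished_name!r}"
        ) from None


user, password = credentials_for(
    "/C=UK/O=eScience/OU=Oxford/CN=example user", CM_ROLE_MAP
)
print(CM_RESOURCE.location, user)
```

Because both back ends fill the same four slots, the framework can treat a CM datastore much like a JDBC source, which is the point the slide makes.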
The Gory details
Lessons Learnt
• OGSA-DAI is a flexible framework into which CM fits reasonably well
  - Chaining of activities
  - User-defined activities
  - Developer focus on writing activities
• Use of dynamic discovery to configure the system
  - Useful during development/testing
  - Register more in the registry
  - Unifies the view of the system as far as data is concerned
• Experience of grid-enabling an existing product
  - Have not explored how to expose CM metadata yet
Thank You
Manfred Oevers
Data Load - High Level Design
Components: Load Client, DICOM Parser, Load API, LoadPlugin for Core DB, LoadPlugin for Core Store
Artefacts: DICOM File (Image or SR), XML File, Reference
Steps:
1. DICOM file gets parsed
2. XML file created with Reference
3. XML file passed to load services
4. CM pulls DICOM file in
As simple as possible
Across the Grid Boundary: Invocation of the OGSA-DAI DB2 Service and of the OGSA-DAI CM Service; Pull from Reference
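The load steps (parse the DICOM file, create an XML file carrying a reference, pass it to the load services, let CM pull the file) can be sketched as follows. This is a minimal sketch under stated assumptions: the XML element names, the plugin function names and the in-memory stores are all hypothetical, and a dictionary of already-parsed attributes stands in for the DICOM parser.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List


def build_load_document(attributes: Dict[str, str], reference: str) -> str:
    """Step 2: wrap parsed DICOM attributes and a file reference in XML."""
    root = ET.Element("dicomLoad")  # hypothetical element names throughout
    for name, value in sorted(attributes.items()):
        ET.SubElement(root, "attribute", {"name": name}).text = value
    ET.SubElement(root, "reference").text = reference
    return ET.tostring(root, encoding="unicode")


def core_db_plugin(xml_text: str, database: List[Dict[str, str]]) -> None:
    """Load plugin for the core DB: store the non-image attributes as a row."""
    root = ET.fromstring(xml_text)
    database.append({el.get("name"): el.text for el in root.findall("attribute")})


def core_store_plugin(
    xml_text: str, store: Dict[str, bytes], fetch: Callable[[str], bytes]
) -> None:
    """Load plugin for the core store: pull the file from its reference."""
    root = ET.fromstring(xml_text)
    uid = root.find("attribute[@name='SOPInstanceUID']").text
    store[uid] = fetch(root.findtext("reference"))


# Step 1 stand-in: attributes a DICOM parser might have extracted (made up).
parsed = {"PatientID": "P-0001", "SOPInstanceUID": "1.2.3.1.1"}
files = {"file:///data/img1.dcm": b"DICM..."}  # stand-in for the file system

document = build_load_document(parsed, "file:///data/img1.dcm")
database: List[Dict[str, str]] = []
store: Dict[str, bytes] = {}

# Steps 3 and 4: the same XML document goes to both load services.
core_db_plugin(document, database)
core_store_plugin(document, store, files.__getitem__)
print(database, sorted(store))
```

Passing a reference rather than the file itself keeps the XML small and lets the store pull the large DICOM file on its own side.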
Data Load Detailed Design
• Plugin Architecture
• Decoupling
• Configuration of Plugin to decide
• Parser also pluggable
• API as simple as possible
OUCL
IBM
eDiaMoND API
eDiaMoND - Organisation
[Network diagram: each site has an eDiaMoND LAN behind a VPN & FW, connected to its local LAN and to the JANET Network]
• Development (OUCL): OUCL LAN
• Oxford / Churchill: Oxford LAN
• Edinburgh: Edinburgh LAN
• Aberdeen: Aberdeen LAN
• Development (IBM): IBM LAN
• Development (Mirada): Mirada LAN
• UCL / St Georges: UCL LAN
• KCL / Guys: KCL LAN
Legend: Grid Boundary, Server, Workstation, T221
Federation setup DB2
Federated database: DB=FEDCORE, Node=edibm
• View cis.patient = edibm.patient union edouc.patient
• Create view over union of nicknames of identical tables
• No query rewrite necessary
• Server = edibm, Nickname = edibm.patient
• Server = edouc, Nickname = edouc.patient
Source databases:
• DB=EDCORE, Node=edibm, Table=cis.patient
• DB=EDCORE, Node=edouc, Table=cis.patient
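The idea of a view over the union of identically structured tables can be tried out locally. The sketch below uses Python's built-in sqlite3 module purely as a stand-in: two ordinary tables play the role of the DB2 nicknames, the underscore names replace the schema-qualified names on the slide, the column names and rows are made up, and UNION ALL is an assumed choice where the slide says only "union".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-ins for the nicknames edibm.patient and edouc.patient.
    CREATE TABLE edibm_patient (patient_id TEXT, site TEXT);
    CREATE TABLE edouc_patient (patient_id TEXT, site TEXT);
    INSERT INTO edibm_patient VALUES ('P-0001', 'edibm');
    INSERT INTO edouc_patient VALUES ('P-0002', 'edouc');

    -- Stand-in for the federated view cis.patient.
    CREATE VIEW cis_patient AS
        SELECT patient_id, site FROM edibm_patient
        UNION ALL
        SELECT patient_id, site FROM edouc_patient;
    """
)

# A client queries the view exactly as it would query a single site's table.
rows = conn.execute(
    "SELECT patient_id, site FROM cis_patient ORDER BY patient_id"
).fetchall()
print(rows)
```

Because the view keeps the name and shape of the per-site table, existing queries run unchanged against the federated database.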
The M Diagram
eDiaMoND – Non-Functional
Uses of the Grid: Screening, Diagnosis, Teaching, Training, Epidemiology
Non-functional requirements:
• Ethics: Anonymisation
• Legal: Lossless Compression
• Security: Encryption
• Performance: 256MB & 5 secs response
• Scalability: ~100 Centres
• Manageability: Systems Administration
• Auditability: Non-Repudiation
• ……
Phase 1 Deployment
[Deployment diagram]
Site labels: GEO, MIR, SCO, IBM, KCL, GUY, UED, CHU, UCL, OUCL
Networks: eDiaMoND LAN, OUCL LAN, GUY LAN, IBM LAN, UED LAN, CHU LAN, UCL LAN, MIR LAN, all joined over JANET / Internet
Components: eDiaMoND Grid Nodes (including Dev., Test and Demo nodes), eDiaMoND W/S (including Dev. and Demo), Digitiser W/S, T221, eDiaMoND Repository Server
UK Breast Screening – Challenges
Digital
Digital
2,000,000 - Screened every Year
120,000 - Recalled for Assessment
10,000 - Cancers
1,250 - Lives Saved
230 - Radiologists (Double Reading)
50% - Workload Increase
Began in 1988
Women 50-70 Screened Every 3 Years
2 Views/Breast + Demographic Increase
~100 Breast Screening Programmes
- Scotland
- Wales
- Northern Ireland
- England
Breast Cancer Facts
• 1 in 8 women will develop breast cancer in the course of their lives; 1 in 28 will die of it
• In the EC, breast cancer accounts for 19% of cancer deaths and 24% of cancer cases
• Diagnosed in 348,000 women in EC+USA and kills 115,000 women annually
• 1,000,000 new cases world-wide in 1997
• Rationale for Screening
  - Early diagnosis = better prognosis
  - Detection at 0.5 cm has a favourable outcome in 99% of cases, but at 2 cm in only 50%
UK Breast Screening Programme
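[Flow diagram: outcomes per 1000 women screened; figures in brackets are for First Time Screening]
• Screening: 1000
• All Clear: 960 (914)
• Recall for Assessment: 40 (86)
• Cancer: 6
• All Clear after Assessment: 34 (80)
• Missed (Interval Cancers): 1
• Other labels: Call, Previous, Current, Epidemiology, Training, ~100 Breast Screening Programmes
The recall rate is 86 per 1000 for First Time Screening, as no comparison is possible with a previous Screening.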
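The figures on this slide can be checked with a little arithmetic. The sketch below assumes one reading of the diagram: all numbers are per 1000 women screened, bracketed values apply to first-time screening, and the 6 cancers apply to both cases. The dictionary keys are names chosen for illustration.

```python
SCREENED = 1000  # women screened, the base for every figure below

# Assumed reading of the diagram: plain figures are for subsequent screening
# rounds, bracketed figures for first-time screening.
flows = {
    "subsequent": {
        "recalled": 40,
        "all_clear_at_screening": 960,
        "cancer": 6,
        "all_clear_at_assessment": 34,
    },
    "first_time": {
        "recalled": 86,
        "all_clear_at_screening": 914,
        "cancer": 6,
        "all_clear_at_assessment": 80,
    },
}


def is_consistent(flow: dict) -> bool:
    """Everyone screened is either cleared or recalled; everyone recalled
    is either diagnosed with cancer or cleared at assessment."""
    return (
        flow["recalled"] + flow["all_clear_at_screening"] == SCREENED
        and flow["cancer"] + flow["all_clear_at_assessment"] == flow["recalled"]
    )


def recall_rate(flow: dict) -> float:
    """Fraction of screened women recalled for assessment."""
    return flow["recalled"] / SCREENED


for name, flow in flows.items():
    print(name, is_consistent(flow), f"{recall_rate(flow):.1%}")
```

Under this reading both sets of figures add up, and the recall rate for first-time screening (8.6%) is a little more than twice that for subsequent rounds (4.0%).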
Project Teams
• Grid Infrastructure Team
  - IBM
  - Oxford University Computing Laboratory
• Image Analysis Technology Team
  - Dept of Engineering Science
  - Mirada Solutions
• Image Collection & Clinical Assessment Team
  - St Georges Hospital
  - Guy’s and St Thomas’ Hospitals
  - Oxford Radcliffe Hospitals
  - Kings College London
  - University College London
  - University of Edinburgh
SMF® - Mirada’s Patented Standardisation Process
Mammograms have very different appearances, depending on image settings and acquisition systems
The “interesting tissue” representation is a surface independent of scanner
Mirada’s Interesting Tissue Representation
[Diagram labels: Tumour, Compression Plates, Glandular Tissue, Fatty Tissue; 1cm; Hint 1.0 cm]
A quantitative representation of breast tissue density