Requirements for Complex Interactive Workflows in Biomedical Research Jeffrey S. Grethe, BIRN-CC

University of California, San Diego
e-Science Workflow Services
December 3, 2003
Scientific Workflows
Laboratory information and study management
Procedural flow
Subject Management
Experimental Protocol
Data Collection
Data Preparation and Validation
Data Analysis
Data acquisition and analysis
Data flow
Data Deposition
Telescience
Computation
Visualization
Training and Dissemination
Databases & Digital Libraries
Partnership
Network Connectivity
Remote Instrumentation
A combination of several independent technologies
integrated for the application of biological tomography
in a way that fosters collaboration.
Telescience Architecture
The Telescience Portal centralizes those layers and presents them
to the user as a SINGLE SIGN-ON web-based environment
Telescience Portal enabled Tomography Workflow
The tomography workflow is composed of the sequence of
steps required to acquire, process, visualize, and extract
useful information from a 3D volume.
• Problems with non-Portal “traditional” workflow:
  • Applications are heterogeneous and platform specific
  • Spectrum of applications is extremely varied (~20)
    • Simple Shell Scripts
    • Parallel Grid enabled software
    • Commercial software
  • Administration is the responsibility of the user
  • Manual tracking and handling of data
• Advantages of workflow managed by Telescience Portal:
  • Applications are centralized to a common interface
  • Automatic and transparent data management, with easy deposition into the database
  • Appropriate tools have been merged into single applications
Biomedical Informatics Research Network
Enable new understanding of neurological disease by integrating data across multiple scales, from macroscopic brain function to its molecular and cellular underpinnings
• Federate distributed multiscale brain data
• Accommodate associated Large Scale Computational Challenges
• Provide Infrastructure for Next Generation Collaboratory
Scales of NS from Maryann Martone
What Has BIRN Been Building?
• A Stable, Robust, Shared Network and Distributed Database Environment, Tailored to the Pioneering BIRN Collaborations.
• Generalizable and Extensible Tools and IT Infrastructure.
• Project- & TestBed-Specific Software and Scientific Workflows.
• An Interdisciplinary Community of BIRN Investigators.
• Processes to help Govern Large Scale Collaborations and Resource Sharing.
The BIRN Site Map as of October 2003
Shared Biomedical IT Infrastructure to Hasten the Derivation
of New Understanding and Treatment of Disease through use of
Distributed Knowledge
Brain Morphometry BIRN
• Examining neuroanatomical correlates of
neuropsychiatric illnesses including Unipolar
Depression, mild Alzheimer’s Disease (AD) and mild
cognitive impairment (MCI)
• Harvard (MGH and BWH), Duke, UCLA, UC San
Diego, Johns Hopkins
Human Subjects Considerations
• High-resolution structural images can be used as an identifier.
  • Reconstruction of the face from raw anatomical data could be used to identify the subject
  • Some members of BIRN require/desire unaltered raw data
  • Need to be able to provide both sets of data and handle them properly within the system
• BIRN must conform to multiple overlapping regulations
  • Common Rule
  • HIPAA
  • State Law
  • International Law
BIRN De-identification and Upload Pipeline
BIRNDUP
• Create sharable output of images from diverse input MRI data using a
common data entry software package
Inputs:
• DICOM Files from GE, Siemens, Picker, and Philips scanners
• Retrospective Data Archives (various formats), after Conversion to DICOM
Sites shown: BWH/MGH, Duke, UCI, UCSD
Steps (Local Desktop): Sort Images → De-identify → Go/NoGo → Upload
Sort Images
- Standard Directory Hierarchy
- Identify Deface Series
De-identify
- Deface or Mask
- Clean DICOM Header
Go/NoGo
- Render Movies
- Display Movies
- QA Approval of Defacing
Upload
- Extract Metadata
- Optimize for SRB
Data cannot be exported from the Local Desktop prior to de-identification and validation
Output: Human Image Database (Data & Provenance)
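The export constraint on this slide can be sketched as a gated pipeline. This is a minimal illustration only, not the BIRNDUP implementation: the `Series` class, its fields, the `IDENTIFYING_FIELDS` set, and all function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Series:
    """Hypothetical record for one image series moving through the pipeline."""
    series_id: str
    header: dict
    defaced: bool = False
    qa_approved: bool = False
    provenance: list = field(default_factory=list)


# Header fields treated as identifying in this sketch (an assumption,
# not the actual BIRNDUP field list).
IDENTIFYING_FIELDS = {"PatientName", "PatientBirthDate", "PatientID"}


def deidentify(series: Series) -> Series:
    """Clean the header and mark the series as defaced or masked."""
    series.header = {k: v for k, v in series.header.items()
                     if k not in IDENTIFYING_FIELDS}
    series.defaced = True
    series.provenance.append("de-identify")
    return series


def go_no_go(series: Series, reviewer_approves: bool) -> Series:
    """Record the QA decision made after reviewing the rendered movies."""
    series.qa_approved = reviewer_approves
    series.provenance.append("go" if reviewer_approves else "no-go")
    return series


def upload(series: Series) -> dict:
    """Refuse to export anything that is not de-identified and approved."""
    if not (series.defaced and series.qa_approved):
        raise PermissionError(f"{series.series_id}: not cleared for export")
    series.provenance.append("upload")
    return {"id": series.series_id,
            "metadata": dict(series.header),
            "provenance": list(series.provenance)}
```

Putting the check inside `upload` rather than in the caller means no code path can skip it, which mirrors the slide's rule that data stays on the local desktop until it has been de-identified and validated.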
Human Subjects Protection and Workflows
• Security related metadata
  • All data uploaded within BIRN must have security related metadata
    • Data classification
    • IRB agreements
    • Subject consent
    • Longitudinal data
  • Access to data is dependent on metadata and access privileges
    • For example, de-identified data cannot be shared with all users
• Secure environment required for the storage of protected information
• Trust in targeted computation resources
  • Compliance with privacy regulations (e.g. auditing)
  • Ability to trust actual applications/services accessed
• Auditing of data access and movement required
  • HIPAA
  • Internal Security
  • Can distributed auditing/logging meet the above requirement?
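The two requirements above, access decided by metadata plus privileges and auditing of every access, can be sketched together. This is an illustrative assumption about how such a check might look, not BIRN's actual security model; `SecurityMetadata`, `User`, and the field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SecurityMetadata:
    """Hypothetical security metadata attached to every uploaded dataset."""
    classification: str      # e.g. "de-identified" or "protected"
    irb_agreement: str       # identifier of the governing IRB agreement
    subject_consent: bool    # whether the subject consented to sharing


@dataclass(frozen=True)
class User:
    name: str
    irb_agreements: frozenset   # agreements the user is covered by
    may_view_protected: bool = False


audit_log = []  # a distributed system would collect entries per site


def can_access(user: User, meta: SecurityMetadata) -> bool:
    """Grant access only when metadata and user privileges both allow it."""
    if not meta.subject_consent:
        return False
    if meta.irb_agreement not in user.irb_agreements:
        return False
    if meta.classification == "protected" and not user.may_view_protected:
        return False
    return True


def access(user: User, dataset_id: str, meta: SecurityMetadata) -> bool:
    """Check access and record the attempt, whether granted or denied."""
    granted = can_access(user, meta)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user.name,
        "dataset": dataset_id,
        "granted": granted,
    })
    return granted
```

Note that even a de-identified dataset is refused to a user who is not covered by its IRB agreement, and that denied attempts are logged as well as granted ones.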
Morphometry BIRN
Harvard-MGH: Surface based coordinate system
Harvard-BWH: Model based three dimensional Medical Image Segmentation
UCLA-LONI: Dynamics of Gray Matter Loss Rates, Mapped in a Schizophrenia Population
Harvard-BWH: 3D-Slicer - An Integrated visualization system for surgical planning and guidance using image fusion and interventional imaging
Morphometry Analysis Workflow
Provide researchers with transparent access to a computing environment that
supports their natural working paradigm while taking advantage of the evolving
grid infrastructure
Expert users required for some interactive processing
Data curation requires determination of data quality and validity
Global versus Local Optimization
Long running workflows may need to be reoptimized during execution
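One way to picture re-optimization during execution is to contrast planning all resource assignments once, up front, with re-planning before each step. The sketch below is an assumption made for illustration; the function names and the use of a single "load" number per host are hypothetical and do not describe any BIRN scheduler.

```python
from typing import Callable, Dict, List, Tuple


def plan_once(steps: List[str],
              loads: Dict[str, float]) -> List[Tuple[str, str]]:
    """Assign every step to the host that is least loaded at planning time."""
    best = min(loads, key=loads.get)
    return [(step, best) for step in steps]


def plan_per_step(steps: List[str],
                  get_loads: Callable[[], Dict[str, float]]
                  ) -> List[Tuple[str, str]]:
    """Re-query host loads before each step and pick the best host then."""
    assignments = []
    for step in steps:
        loads = get_loads()               # fresh snapshot for this step
        best = min(loads, key=loads.get)
        assignments.append((step, best))
    return assignments
```

For a workflow that runs for hours or days, the snapshot used by `plan_once` goes stale, which is the situation the slide describes.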
MIRIAD Project
Study of Major Depression in Late Life
The MIRIAD collaboration promises BWH and UCLA tools that offer
reduced variance and access to atlas-driven lobar and regional analysis
MGH: Freesurfer, Cortical & Subcortical segmentations
JHU: Linear Deformation Metric Mapping, Shape Analysis of Segmented Structures
BWH: 3D-Slicer, Visualization
Acquisition Site: De-identification
BIRN Data Grid
MIRIAD Plan/Data Flow
• Use Anonymization at Duke to avoid IRB delays
  • select 50 depression subjects, baseline and year 2 MRI
  • select 50 age-comparable normal subjects, baseline and year 2 MRI
  • select metadata variables that will be needed for analysis
  • anonymize data retaining a new BIRN number to link MRI and metadata that cannot be traced to original subject
• UCLA Processing
  • atlas preparation and orientation/registration (rigid body) of all subjects (also compute HO deformation parameters for segmentation)
  • registration of BWH atlas to common data set
  • after BWH segmentation, perform atlas driven lobar analysis
• BWH Processing
  • EM Segmentation for gray, white, CSF
  • Atlas driven regional segmentation
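The anonymization step in the plan above (a new BIRN number that links MRI and metadata but cannot be traced to the original subject) can be sketched as follows. This is a minimal illustration, not the procedure used at Duke; the record layout and the field names `subject_id`, `visit`, `age`, and `diagnosis` are hypothetical.

```python
import secrets


def anonymize(records, id_field="subject_id",
              keep_fields=("age", "diagnosis")):
    """Replace original subject IDs with random numbers; keep chosen metadata.

    All visits of one subject share the same new number, so baseline and
    year 2 data stay linked. The mapping back to the original ID is
    neither returned nor stored.
    """
    new_ids = {}     # original ID -> new number, local to this call
    used = set()     # new numbers already handed out
    out = []
    for rec in records:
        original = rec[id_field]
        if original not in new_ids:
            candidate = secrets.randbelow(10**8)
            while candidate in used:          # avoid collisions
                candidate = secrets.randbelow(10**8)
            used.add(candidate)
            new_ids[original] = candidate
        anon = {"birn_number": new_ids[original], "visit": rec["visit"]}
        anon.update({k: rec[k] for k in keep_fields if k in rec})
        out.append(anon)
    return out
```

Only the fields named in `keep_fields` are copied, so any variable not explicitly selected for analysis is dropped by default.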
LONI Pipeline Environment
• Legacy application from one of the BIRN testbed sites
  • Designed by domain scientists for their needs
• LONI Pipeline not currently designed with the Grid in mind
  • Client-server model where workflow control client resides on user’s desktop
  • No means for authentication through user certificates or proxies
  • Does not use standard grid transfer protocols
  • Client must remain running even for extended jobs
  • Does not utilize resource discovery and monitoring to schedule job
• Scientist’s Requirements
  • Easy to use (nice GUI)
  • Very straightforward way to “wrap” new applications
  • User selects specific application (version, host)
  • Easy to view status
Function BIRN
• Developing a common fMRI protocol to study regional brain
dysfunction related to the progression and treatment of
schizophrenia
• Correlating functional data with anatomical data acquired
from the Morphometry test-bed to determine whether there are
neuroanatomical correlates of cognitive dysfunction across
disorders
• UCLA, UC San Diego, UC Irvine, Harvard (MGH and BWH),
Stanford, Minnesota, Iowa, New Mexico, Duke/U. North
Carolina
Function BIRN
Harvard-MGH: Patients with schizophrenia show abnormal modulation of temporal and frontal regions during semantic processing
Stanford – Lucas Center: Inferior frontal/temporal neocortical network mediates semantic processing of sentences in healthy individuals
fMRI response at 3T and 1.5T with identical software and hardware platforms (GE SIGNA)
Functional MRI Analysis Workflow
[Figure: workflow diagram]
Inputs: MR Scanner, Scanner Parameters, K-Space Images, Anatomical Image, Anatomical Template, Experimental Paradigm
Processing steps: Reconstruction, Slice Time Correction, Motion Correction, Anatomical Co-Registration, Spatial Normalization, Data Validation, Single Subject Statistics, Group Statistics, Statistical Map Overlay
Intermediate data: Functional Images, Motion Parameters, Valid Data, Normalized Anatomical Image, Normalized Functional Images, Multiple Subject’s Data
Outputs: Results, Overlay Results
Functional MRI Analysis Workflow
[Figure: the same workflow diagram as on the previous slide, with the question below overlaid]
How do I re-analyze 120 subjects?
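The question on this slide is about repeating a multi-step workflow across a whole cohort. A minimal sketch of that idea follows, assuming each step can be expressed as a function from a subject's data to updated data. The step functions are placeholders named after boxes in the diagram and do not call any real analysis tool; `run_workflow` and `reanalyze` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List

Step = Callable[[dict], dict]


def run_workflow(subject: dict, steps: List[Step]) -> dict:
    """Apply each processing step in order to one subject's data."""
    data = dict(subject)
    for step in steps:
        data = step(data)
    return data


def reanalyze(subjects: Iterable[dict], steps: List[Step]) -> Dict[str, dict]:
    """Run the same workflow for every subject; record failures and go on."""
    results, failures = {}, {}
    for subject in subjects:
        try:
            results[subject["id"]] = run_workflow(subject, steps)
        except Exception as exc:   # one bad subject must not block the rest
            failures[subject["id"]] = str(exc)
    return {"results": results, "failures": failures}


# Placeholder steps; real ones would invoke the actual processing tools.
def slice_time_correction(d: dict) -> dict:
    return {**d, "slice_timed": True}


def motion_correction(d: dict) -> dict:
    return {**d, "motion_corrected": True}


def spatial_normalization(d: dict) -> dict:
    return {**d, "normalized": True}
```

Usage, with 120 hypothetical subject records:

```python
subjects = [{"id": f"s{i:03d}"} for i in range(120)]
summary = reanalyze(subjects, [slice_time_correction,
                               motion_correction,
                               spatial_normalization])
```

Because the step list is data rather than hand-run commands, changing one step and re-running the cohort is a single call.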
Real-Time Experimental Control
[Figure: workflow diagram]
Inputs: Scanner Parameters, K-Space Images, Anatomical Image, Experimental Paradigm
Processing steps: Reconstruction, Slice Time Correction, Motion Correction, Single Subject Statistics, Statistical Map Overlay, Experiment Control
Intermediate data: Functional Images
Outputs: Overlay Results
http://www.nbirn.net