Translating Imaging Science to the Emerging Grid Infrastructure May 31, 2016

Translating Imaging Science to the
Emerging Grid Infrastructure
Jeffrey S. Grethe - BIRN
University of California, San Diego
Imaging, Medical Analysis and Grid Environments (IMAGE)
May 31, 2016
We speak piously of taking measurements
and making small studies that will add
another brick to the temple of science. Most
such bricks just lie around the brickyard.
Platt, J.R. (1964) Strong Inference. Science.
146: 347-353.
Objectives
• Establish a stable, high performance network linking key
Biotechnology Centers and General Clinical Research
Centers
• Establish distributed and linked data collections with
partnering groups - create a “Data GRID” for the BIRN
• Facilitate the use of "grid-based" computational infrastructure
and integrate BIRN with other GRID middleware projects
• Enable data mining from multiple data collections or
databases on neuroimaging and bioinformatics
• Build a stable software and hardware infrastructure that will
allow centers to coordinate efforts to accumulate larger
studies than can be carried out at one site.
Challenges
Neuroscience
High Speed Network
Computation
Mouse BIRN
User Access
FIRST BIRN
Distributed Data
Data Integration
Morphometry BIRN
Informatics
Community
Policies
Best Practices
IRB
HIPAA
Governance
CREATING BIRN TEST-BED PARTNERSHIPS
• Three Research Project “Application Test Beds”
have been Assembled to Shape BIRN and Guide
Infrastructure Development:
• Multi-scale Mouse BIRN - Animal Models of disease / Multi
Scale/Multi Method - Examples: MS Mouse, DAT KOM (a
schizophrenic and otherwise interesting mouse animal
model) and a Parkinson’s Disease Mouse
• Brain Morphometrics (Human Structure BIRN) - Targets:
neuroanatomical correlates of neuropsychiatric illness
(Unipolar Depression, mild Alzheimer's Disease (AD), mild
cognitive impairment (MCI))
• Functional Imaging BIRN - Development of a common
functional magnetic resonance imaging (fMRI) protocol
to study regional brain dysfunction related to the
progression and treatment of schizophrenia - attack on
underlying cause of disease
A National Collaboratory
Science Drives The Infrastructure
• USE APPLICATION SCIENCE “PULL” TO GUIDE
DEVELOPMENT OF THE NEXT GENERATION
CYBERINFRASTRUCTURE
• Craft a plan to achieve an important scientific goal
requiring development and implementation of innovative
computational infrastructure.
• Articulate a Grand Challenge and define work to achieve
this goal with increasing levels of specificity.
• Bring application scientists and computer scientists
together in projects at each level to build elements of the
new infrastructure.
User Access to Grid Resources
• Application environment being developed to provide centralized
access to BIRN tools, applications, and resources with a single login
from any Internet-capable location
• Provides simple, intuitive access to Grid resources for data storage,
distributed computation, and visualization
Interfacing the Desktop with the Grid
• Developed a Java Grid Interface (JGI) that provides a wrapper for
applications on a user's desktop.
• Brokers communications and information/data transfer between the
application and BIRN resources (e.g. SRB)
• LONI Pipeline, 3D Slicer, FreeSurfer, and ImageJ
• Continue to extend and develop the JGI
• OGSA compliance
Distribution of a Bioinformatics Toolbox
• Package and deploy test-bed-specific software through the distribution of
the BIRN bioinformatics toolbox
• Use ROCKS (http://www.rocksclusters.org) as the distribution mechanism
• Bioinformatics toolbox can be made available to any researcher interested
in a robust package of neuroimaging applications.
• First release to occur this fall using the new ROCKS distribution model.
[Figure: BIRN ROCKS Distribution. Components shown: BIRN Roll (FreeSurfer, AFNI, AIR, FSL, ..., Grid Wrappers), Grid Roll, ROCKS Core.]
Scientific Workflow
• Sequence of steps (utilities, applications, pipelines) required to
acquire, process, visualize, and extract useful information from
scientific data.
• Advantages of a workflow managed within the Portal:
  • Progress through the workflow can be organized and tracked
  • Automated and transparent mechanisms for the flow of data
  from one step to the next using SRB
  • Tools are centralized and presented with uniform GUIs to
  improve usability
  • Administration burden of each step (or group of steps) is eliminated
  • Flexibility to enhance each process through direct, transparent
  access to the grid
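The idea of a tracked sequence of steps, with data passed automatically from one step to the next, can be sketched in a few lines of Python. This is only an illustration, not the BIRN Portal implementation: the `Step` and `Workflow` classes, the three toy step functions, and the in-memory dictionary standing in for SRB-managed data are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One workflow step (hypothetical stand-in for a utility or pipeline)."""
    name: str
    run: Callable[[Dict], Dict]  # receives the previous step's data, returns new data
    status: str = "pending"


@dataclass
class Workflow:
    steps: List[Step]
    log: List[str] = field(default_factory=list)

    def execute(self, data: Dict) -> Dict:
        """Run steps in order, handing each step's output to the next."""
        for step in self.steps:
            step.status = "running"
            try:
                data = step.run(data)
            except Exception as exc:
                step.status = "failed"
                self.log.append(f"{step.name}: failed ({exc})")
                raise
            step.status = "done"
            self.log.append(f"{step.name}: done")
        return data


# Toy steps: a real deployment would call imaging tools and move files via SRB.
def acquire(data: Dict) -> Dict:
    return {**data, "image": [3, 1, 2]}


def process(data: Dict) -> Dict:
    return {**data, "image": sorted(data["image"])}


def extract(data: Dict) -> Dict:
    return {**data, "max_intensity": max(data["image"])}


workflow = Workflow([Step("acquire", acquire),
                     Step("process", process),
                     Step("extract", extract)])
result = workflow.execute({"subject": "demo"})
print(workflow.log)
print(result["max_intensity"])
```

Keeping the status and log on the workflow object is what lets a portal display progress without each tool having to report it separately.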
Interactive Scientific Workflows
Provide researchers with transparent access to a computing environment that
supports their natural working paradigm while taking advantage of the evolving
grid infrastructure
Data curation requires
determination of data
quality and validity
Workflow Considerations
• Provide full provenance for data within the BIRN environment
• Morphometry BIRN is modifying tools to provide proper provenance
information
• Data provenance is being taken into account in the human imaging database
• Workflow Optimization
• Take advantage of resource discovery services being deployed
• Use of data provenance information
• Global versus run time optimizations
• Incorporation of legacy applications
  • LONI Pipeline (UCLA)
    • Standard install
    • Incorporation into Portal
    • Advisement on future Grid enhancements to Pipeline
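One possible shape for a provenance record is sketched below in Python. It is an assumption for illustration only, since the slides do not specify the BIRN provenance format: the field names, the `run_with_provenance` helper, and the toy `scale` tool are all hypothetical. The point is that each processing step emits a record naming the tool, its version and parameters, and checksums of its input and output.

```python
import hashlib
import json
from datetime import datetime, timezone


def checksum(payload: bytes) -> str:
    """SHA-256 digest used to tie a record to exact input and output bytes."""
    return hashlib.sha256(payload).hexdigest()


def run_with_provenance(tool, version, params, input_bytes, func):
    """Run func on input_bytes and return (output, provenance record)."""
    output_bytes = func(input_bytes, **params)
    record = {
        "tool": tool,
        "version": version,
        "parameters": params,
        "input_sha256": checksum(input_bytes),
        "output_sha256": checksum(output_bytes),
        "finished_at": datetime.now(timezone.utc).isoformat(),
    }
    return output_bytes, record


def scale(data: bytes, factor: int) -> bytes:
    """Toy 'legacy application': multiply each byte, clipping at 255."""
    return bytes(min(255, b * factor) for b in data)


output, record = run_with_provenance(
    "scale", "0.1", {"factor": 2}, bytes([1, 2, 200]), scale
)
print(json.dumps(record, indent=2))
```

Wrapping the tool from the outside, rather than modifying it, is one way a legacy application could be incorporated without changes to its own code.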
Governance
• Incorporating processes for multi-site studies and sharing of
human data
  • HIPAA compliance
  • Patient confidentiality
  • Institutional Review Board (IRB) approvals
• Developing guidelines for sharing data and authorship
• Breaking down the barriers
  • Mistrust
  • Open sharing of information
  • Who gets credit
  • Commercial products
• Integrating new participants
IRB Working Group
• One member from each BIRN site required to participate
• Each member is required to review BIRN consents, waivers and
procedures with local IRBs
• Regular video conferences among members to coordinate information
and activities
• Produce BIRN template language for subject consent, IRB waiver for
data upload and IRB waiver for data download
• Interact with Data Sharing Task Force
What Regulations Apply?
It depends!
• Institutional Policy
• HIPAA
• Common Rule
• State Law
• IRB Interpretation
• Local Policy
Data Sharing Task Force
• Produce guidelines and procedures for data sharing across institutions
taking into account Common Rule, HIPAA and state regulations
• Develop procedures to allow for longitudinal studies within BIRN
• Examine policies that are relevant to BIRN (e.g. revised policies being
drafted for tissue banks and data banks)
• Interact with Architecture working groups to help define security and
subject confidentiality infrastructure and policy
  • Data Replication
  • Certificate Policies
  • Registration Authority Policies
  • Local access control
  • Auditing & activity logs
EU Privacy Directives
• EU directive 95/46/EC: article 8
• Member states shall prohibit the processing of personal
data concerning health or sex life.
• Recommendation nr R (97) 5: Exceptions
• Diagnostic and therapeutic reasons
• Public health reasons, public interest
• Criminal offenses
• Specific contractual obligations fulfilment
• Legal claims
• Consent for specific purposes
Data Classifications
Table 1: Data Characteristics (adapted from Masys et al. 2002)
Values for each characteristic are given in the order: Protected Health Information | Research Health Information | Limited Data Set | De-Identified Data
• Individually identifiable, i.e., meets HIPAA definition of individually identifiable health information: Yes | Yes | Varies | No
• Used to support clinical decision making for an individual, or for payment or operations: Yes | No | Varies | Varies
• Associated with healthcare service event: Yes | No | Varies | Varies
• Need-to-know, minimum necessary access control: Yes | Yes | Yes | No
• Separation of person-identifiable and non-person-identifiable data elements wherever feasible: No | Yes | Yes | N/A
• Individual authorization (consent) for creation and use of data: Varies | Yes | Varies | No
• Business Partner agreements for disclosures: Yes | No | No | No
• Logs and audit trails of use and disclosure: Yes | Current best practice for research records | Current best practice for research records | Current best practice for research records
• Right to request amendment of records: Yes | At discretion of investigator | At discretion of investigator | At discretion of investigator
Anonymization vs. De-Identification
• Both require deletion of direct identifiers.
• Anonymization cannot have a link field (de-identified data can).
• Anonymization makes a protocol eligible for exemption from IRB review.
• De-identification makes data exempt from HIPAA regulations.
• De-identification with a link field does NOT exempt data from IRB review.
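The difference between the two approaches comes down to whether a link field survives, which a small Python sketch can make concrete. Everything here is illustrative: the `DIRECT_IDENTIFIERS` set is a made-up subset and not the HIPAA identifier list, the record fields are invented, and the `BIRN-` prefixed study ID is a hypothetical format.

```python
import secrets

# Illustrative subset only; the real list of direct identifiers is longer.
DIRECT_IDENTIFIERS = {"name", "mrn", "birth_date"}


def strip_identifiers(record: dict) -> dict:
    """Delete direct identifiers (required by both approaches)."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def de_identify(record: dict, link_table: dict) -> dict:
    """Remove identifiers but keep a link field back to the source record.

    The link table maps the new study ID to the original subject ID and
    would have to be stored separately in a secure environment.
    """
    study_id = "BIRN-" + secrets.token_hex(4)  # hypothetical ID format
    link_table[study_id] = record["mrn"]
    cleaned = strip_identifiers(record)
    cleaned["study_id"] = study_id
    return cleaned


def anonymize(record: dict) -> dict:
    """Remove identifiers and keep no link field at all."""
    return strip_identifiers(record)


subject = {"name": "Jane Doe", "mrn": "12345",
           "birth_date": "1970-01-01", "hippocampal_volume_mm3": 3900}
links = {}
shared = de_identify(subject, links)
anonymous = anonymize(subject)
print(shared)
print(anonymous)
```

Because the link table allows re-identification, the de-identified record stays within the scope of IRB review, while the anonymized one carries no path back to the subject.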
EU Data Definitions
• Recommendation R (97)5 on the protection of medical data
• Personal data covers any information relating to an identified
or identifiable individual.
• An individual shall not be regarded as ‘identifiable’ if
identification requires an unreasonable amount of time and
manpower.
• In cases where the individual is not identifiable, the data are
referred to as anonymous
Identifiable Health Information
• High-resolution structural images can be used as an identifier.
  • Reconstruction of a face from raw anatomical data might be used
  to identify a subject
• Some members of the scientific community require/desire unaltered raw data
  • Both raw and skull-stripped data may be provided
  • Need to get approval from local IRB to allow for the sharing of raw
  anatomical data
  • Users wishing to access data also require IRB approval
Is there a scalable and distributed solution for researchers to access
identifiable health information?
[Figure: raw versus skull-stripped anatomical images]
Data Sharing Infrastructure
• Security related metadata
  • All data uploaded within BIRN must have associated metadata
    • Data classification
    • IRB agreements
    • Subject consent
    • Longitudinal data
  • Data sharing permissions are dependent on metadata
    • For example, de-identified data cannot be shared with all users
• Secure environment required for the storage of protected information
  • Linkage of BIRN ID with original subject ID
  • Protected data
• Auditing of data access and movement required
  • HIPAA
  • Internal Security
  • Data Usage Statistics
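A minimal Python sketch of how sharing decisions might be driven by security metadata, with every request written to an audit log, is given below. The access rule itself is an assumption chosen for illustration (sharing requires IRB agreement, subject consent, and an IRB-approved user, and protected data is never released); it is not BIRN policy, and the class and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DatasetMetadata:
    classification: str        # e.g. "protected", "limited", "de-identified"
    irb_allows_sharing: bool   # local IRB agreement covers sharing
    subject_consented: bool    # subject consent covers sharing


@dataclass(frozen=True)
class User:
    name: str
    has_irb_approval: bool


audit_log = []


def can_access(user: User, meta: DatasetMetadata) -> bool:
    """Hypothetical rule: the metadata decides, not the requester."""
    if not (meta.irb_allows_sharing and meta.subject_consented):
        return False
    if meta.classification == "protected":
        return False  # stays inside the secure environment in this sketch
    return user.has_irb_approval


def request_access(user: User, dataset_id: str, meta: DatasetMetadata) -> bool:
    """Evaluate a request and record it, whether granted or denied."""
    granted = can_access(user, meta)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user.name,
        "dataset": dataset_id,
        "granted": granted,
    })
    return granted


meta = DatasetMetadata("de-identified", True, True)
alice = User("alice", has_irb_approval=True)
bob = User("bob", has_irb_approval=False)
print(request_access(alice, "scan-001", meta))  # True
print(request_access(bob, "scan-001", meta))    # False
print(len(audit_log))                           # 2
```

Logging denied requests alongside granted ones means the same log can serve both security auditing and data usage statistics.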