QI-Bench_F2F_11-15

3rd Program Face to Face
November 15, 2011
With funding support provided by the National Institute of Standards and Technology
Andrew J. Buckler, MS
Principal Investigator
Value proposition of QI-Bench
• Efficiently collect and exploit evidence establishing standards for optimized quantitative imaging:
  – Users want confidence in the read-outs
  – Pharma wants to use them as endpoints
  – Device/SW companies want to market products that produce them without huge costs
  – Public wants to trust the decisions that they contribute to
• By providing a verification framework to develop precompetitive specifications and support test harnesses to curate and utilize reference data
• Doing so as an accessible and open resource facilitates collaboration among diverse stakeholders
QI-Bench
Structure / Acknowledgements
• Prime: BBMSC (Andrew Buckler, Gary Wernsing, Mike Sperling, Matt Ouellette)
• Co-Investigators
  – Kitware (Rick Avila, Patrick Reynolds, Julien Jomier, Mike Grauer)
  – Stanford (David Paik, Tiffany Ting Liu)
• Financial support as well as technical content: NIST (Mary Brady, Alden Dima, Guillaume Radde)
• Collaborators / Colleagues / Idea Contributors
  – FDA (Nick Petrick, Marios Gavrielides)
  – UCLA (Grace Kim)
  – UMD (Eliot Siegel, Joe Chen, Ganesh Saiprasad)
  – VUmc (Otto Hoekstra)
  – Northwestern (Pat Mongkolwat)
  – Georgetown (Baris Suzek)
• Industry
  – Pharma: Novartis (Stefan Baumann), Merck (Richard Baumgartner)
  – Device/Software: Definiens (Maria Athelogou), Claron Technologies (Ingmar Bitter)
• Coordinating Programs
  – RSNA QIBA (e.g., Dan Sullivan, Binsheng Zhao)
  – Under consideration: CTMM TraIT (Henk Huisman, Jeroen Belien)
QI-Bench is Use-Case Driven
• Create and Manage Semantic Infrastructure and Linked Data Archives
• Create and Manage Physical and Digital Reference Objects
• Core Activities for Biomarker Development
• Collaborative Activities to Standardize and/or Optimize the Biomarker
• Consortium Establishes Clinical Utility/Efficacy of Putative Biomarker
• Commercial Sponsor Prepares Device/Test for Market
[Architecture diagram: Reference Data Set Manager (Reference Data Sets, Annotations, and Analysis Results); Quantitative Imaging Specification Language (QIBO; UPICT Protocols and QIBA Profiles entered with a Ruby on Rails web service, plus literature and other sources); Batch Analysis Service (batch analysis scripts); sources of clinical study results; output QIBO, whose red edges represent biostatistical generalizability; and a Clinical Body of Evidence formatted to enable BatchMake scripts and SDTM and/or other standardized registrations.]
[Same architecture diagram as the previous slide, annotated with the five workflow steps below.]

Specify
• Specify context for use and assay methods.
• Use consensus terms in doing so.

Formulate
• Assemble applicable reference data sets.
• Include both imaging and non-imaging clinical data.

Execute
• Compose and iterate batch analyses on reference data.
• Accumulate quantitative read-outs for analysis.

Analyze
• Characterize the method relative to intended use.
• Apply the existing tools and/or extend them.

Package
• Compile evidence for regulatory filings.
• Use standards in transfer to regulatory agencies.
Specify: QIBO, AIM, RadLex/SNOMED/NCIt; built using Ruby on Rails.
• Specify context for use and assay methods.
• Use consensus terms in doing so.

Formulate: caB2B, NBIA, PODS data elements, DICOM query tools.
• Assemble applicable reference data sets.
• Include both imaging and non-imaging clinical data.

Execute: MIDAS, BatchMake, Condor Grid; built using CakePHP.
• Compose and iterate batch analyses on reference data.
• Accumulate quantitative read-outs for analysis.

Analyze: MVT portion of AVT, reusable library of R scripts.
• Characterize the method relative to intended use.
• Apply the existing tools and/or extend them.

Package: SDTM standard of CDISC into repositories like FDA's Janus.
• Compile evidence for regulatory filings.
• Use standards in transfer to regulatory agencies.
BSD-2 license
Domain is www.qi-bench.org.
Landing page provides
• Access to prototypes,
• Repositories for download and development,
• Acknowledgements,
• Jira issue tracking, and
• Documentation
Project wiki includes sections for
• Project management plan,
• User needs analysis (including use cases),
• Lab Protocol,
• Developer help (including use of Git),
• Meeting minutes, and
• Discussion of investigators/collaborators.
Specify:
Specify is presently a composite of QISL and the AIM template builder.

The Quantitative Imaging Specification Language (QISL) uses the Quantitative Imaging Biomarker Ontology (QIBO) and other linked ontologies to develop a triple store based on user Q/A.
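To make the triple-store idea concrete, a minimal sketch follows; this is illustrative only, not the QISL implementation, and the namespace URI, specification node, and term names are hypothetical placeholders.

```python
# Hedged sketch: recording Specify-style Q/A answers as RDF triples against
# QIBO-like terms. The namespace URI and all term names are hypothetical.
from rdflib import Graph, Literal, Namespace, RDF

QIBO = Namespace("http://example.org/qibo#")  # placeholder URI, not QIBO's real one

g = Graph()
g.bind("qibo", QIBO)

spec = QIBO.volumetricCT_spec_001  # hypothetical specification node
g.add((spec, RDF.type, QIBO.QuantitativeImagingBiomarkerSpecification))
g.add((spec, QIBO.hasContextForUse, Literal("response assessment in NSCLC")))
g.add((spec, QIBO.hasAssayMethod, QIBO.CTVolumetry))

# Serialize the triple-store content for inspection.
print(g.serialize(format="turtle"))
```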
Quantitative Imaging Biomarker Ontology (QIBO)
• Initial curation to collect terms: reviewed 126 articles across 6 therapeutic areas elaborating 225 imaging markers
• Reusing other publicly available ontologies: MeSH, NCI Thesaurus, GO, FMA, and BIRNLex
• Current state: 490 classes and relationship properties for clinical context for use and assay methods
• Next steps: adopt Basic Formal Ontology (BFO) as an upper ontology providing a formal structure of upper-level abstract classes; BFO has been adopted by the Open Biological and Biomedical Ontologies (OBO) Foundry, a large collaborative effort with the goal of creating orthogonal and interoperable ontologies in biomedical research
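For readers who want to inspect an OWL ontology like QIBO directly, a short sketch using the owlready2 library is below; the local file name is an assumption, and the actual QIBO distribution may differ.

```python
# Hedged sketch: loading an OWL file and walking its class hierarchy with
# owlready2. "QIBO.owl" is a hypothetical local copy of the ontology.
import os

from owlready2 import get_ontology

onto = get_ontology("file://" + os.path.abspath("QIBO.owl")).load()

# Count classes, then print each class with its named parents.
print(len(list(onto.classes())), "classes")
for cls in onto.classes():
    parents = [p.name for p in cls.is_a if hasattr(p, "name")]
    print(cls.name, "->", parents)
```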
Specify (cont)
The idea is that AIM templates would be constructed and linked to the other specification information from the ontologies.

Presently it just co-exists in the prototype app; it is not yet functionally integrated as ultimately intended.
Formulate
Web-enabled service for aggregating reference data based on endpoints
Formulate (continued)
• One small part of Formulate that we have done is to create a CQL "connector" to import data from NBIA. We do this to optimize storage for grid computing and to include the metadata storage needed to run experiments.
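As a rough illustration of the import step (not the actual connector code), the sketch below reads DICOM headers with pydicom and builds a flat metadata index of the kind grid experiments could query; the directory layout and index format are assumptions.

```python
# Hedged sketch: index DICOM metadata after an NBIA-style download so that
# grid jobs can select series without re-reading every header. Paths and the
# index layout are hypothetical, not the QI-Bench storage model.
import csv
from pathlib import Path

import pydicom

rows = []
for path in Path("./nbia_download").rglob("*.dcm"):  # hypothetical download dir
    ds = pydicom.dcmread(path, stop_before_pixels=True)  # headers only
    rows.append({
        "file": str(path),
        "patient_id": ds.get("PatientID", ""),
        "series_uid": ds.get("SeriesInstanceUID", ""),
        "modality": ds.get("Modality", ""),
    })

# Persist a flat metadata index alongside the imagery.
if rows:
    with open("series_index.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```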
Endpoints for Formulate
Execute is where analyses of Reference Data Sets take place. It is based on MIDAS and the associated BatchMake capability, but extends it for QI-Bench. The storage model is optimized for metadata storage and grid computing.
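The batch pattern can be pictured as follows. This is a hedged sketch only: QI-Bench drives execution through MIDAS/BatchMake rather than direct subprocess calls, and the tool name, arguments, and layout here are hypothetical.

```python
# Hedged sketch of the batch-analysis pattern: iterate a reference data set,
# run an analysis tool on each case, and accumulate quantitative read-outs.
# "lesion_volumetry" and the directory layout are hypothetical.
import json
import subprocess
from pathlib import Path

readouts = []
for series_dir in sorted(Path("./reference_data").iterdir()):
    result = subprocess.run(
        ["lesion_volumetry", "--input", str(series_dir)],  # hypothetical tool
        capture_output=True, text=True, check=True,
    )
    # Assume the tool prints a single volume in mm^3 on stdout.
    readouts.append({"series": series_dir.name, "volume_mm3": float(result.stdout)})

Path("readouts.json").write_text(json.dumps(readouts, indent=2))
```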
Execute
First Reference Data Sets
• Pilot 3A
– 156 lesions for evaluation (1A read 15)
• Pivotal 3A
– 408 lesions for evaluation (1A read 40)
• Study 1C
– 2364 lesions for evaluation (1C is set to read 66)
• Study 1187
– 7122 lesions for evaluation
• Available: RIDER, IDRI, MSKCC “1B”, …
Execute roadmap
• Script to write "Image Formation" content into the Biomarker Database for provenance of Reference Data Sets: application for pulling in data from the "Image Formation" schema to populate the biomarker database. This data will originate from the DICOM imagery imported into QI-Bench.
• Laboratory Protocol for the NBIA Connector and Batch Analysis Service: laboratory protocol describing the use of the NBIA Connector and the Image Formation script to import data into QI-Bench, and use of the Batch Analysis Service for server-side processing.
• Support change-analysis biomarkers in serial studies (up to two time points in the current period, extensible to additional time points in subsequent development iterations): support experiments including at minimum two time points. An example is the change in volume or SUV, rather than (only) estimation of the value at one time point; see the sketch after this list.
• Document and solidify the API harness for execution modules of the Batch Analysis Service: this task will include the documentation and complete specification of the BatchMake Application API.
• Support scripted reader studies: support reader studies through worklist items specified via AIM templates as well as Query/Retrieve via DICOM standards for interaction with reader stations. ClearCanvas will serve as the target reader station for the first implementation.
• Generate output from the LSTK module via AIM template (as opposed to hardcoded): generate annotation and image markup output from reference algorithms (i.e., LSTK for volumetric CT and Slicer3D for SUV) based on AIM templates instead of the current hard-coded implementation. An AIM Template is an .xsd file.
• Re-work the NBIA Connector to run in the context of Formulate: this task will include refactoring and stabilization of the NBIA Connector in order to incorporate its functionality into Formulate.
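As a concrete illustration of the change-analysis item above, a minimal sketch of the two-time-point metric (percent change in volume or SUV) follows; the function name is ours, not QI-Bench's.

```python
# Hedged sketch: the serial-study change metric, i.e., percent change in a
# quantitative read-out (volume or SUV) between two time points.
def percent_change(baseline: float, followup: float) -> float:
    """Percent change from baseline to follow-up."""
    return 100.0 * (followup - baseline) / baseline

# Example: a lesion measured at 4200 mm^3 at baseline and 3150 mm^3 at
# follow-up shows a -25.0% change in volume.
print(percent_change(4200.0, 3150.0))
```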
Analyze
Current Prototype Capabilities
Analyze
MVT provides a good framework, but with gaps:
• Lesion tracking
• Other modalities and measures, e.g., SUV via FDG-PET
• Properly functioning multiple regression and N-way ANOVA (a sketch follows this list)
• Support for Clinical Performance assessment (i.e., in addition to current Technical Performance)
  – Outcome studies
  – Integrated genomic/proteomic correlation studies
  – Group studies for biomarker qualification
• Serial studies / change analysis
• Persistent database
• Scale-up to handle thousands of cases (tens of thousands of lesions)
• Deploy as a Web app
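To illustrate the N-way ANOVA gap named above, here is a hedged sketch using statsmodels; the column names and input file are hypothetical stand-ins for accumulated QI-Bench read-outs.

```python
# Hedged sketch: two-way ANOVA with interaction over accumulated read-outs,
# e.g., volume error by reader and algorithm. "readouts.csv" and the column
# names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("readouts.csv")  # hypothetical table of read-outs

# Fit an OLS model with two categorical factors and their interaction,
# then produce the Type II ANOVA table.
model = smf.ols("volume_error ~ C(reader) * C(algorithm)", data=df).fit()
print(anova_lm(model, typ=2))
```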
Analyze
Figures of Merit and Descriptive Statistics
• Collaborative activity underway to converge on definitive descriptive statistics for technical and clinical performance
• Approaches to defining and administering compliance in relationship with QIBA Profiles

Plan (progressing from Done and Next-up through Mid-term to As Funding Allows):
• Process the lesion reads on the same 40 lesions used in the 1A pivotal as a 7th reader using the 1A STATA method and compare results.
• Process the 6 selected lesions from the MVT demonstrator using the 1A STATA method and compare results. [MVT<->1A STATA]
• Process the lesion reads on the same 6 lesions used in the MVT demo set as a 7th reader and compare results. [QI-Bench<->MVT]
• Convert the 1A STATA analysis to R and compare the results on the 408. [STATA<->R]
• Extend MVT to use the created R scripts (and fill other gaps).
• Re-do analyses to verify that results come out the same; convert to R-based, MVT-implemented analysis.

Mid-term: perform and analyze other studies (e.g., 1C, 3A, 1187, other modes, etc.) using the STATA analysis method.
As Funding Allows: completion of the tool box to include all of the descriptive statistics determined in the discussions / workshops.
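For the cross-check steps above (e.g., STATA<->R), a small sketch of a tolerance-based comparison is shown below; the file names are hypothetical.

```python
# Hedged sketch: confirm that two analysis pipelines produce the same figures
# of merit within numerical tolerance. File names are hypothetical.
import numpy as np

stata_results = np.loadtxt("results_stata.csv", delimiter=",")
r_results = np.loadtxt("results_r.csv", delimiter=",")

assert stata_results.shape == r_results.shape
print("match:", np.allclose(stata_results, r_results, rtol=1e-6, atol=1e-9))
```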
Package
Structure submissions according to eCTD, HL7
RCRIM, and SDTM
Section 2
Summaries
2.1. Biomarker Qualification Overview
2.1.1. Introduction
2.1.2. Context of Use
2.1.3. Summary of Methodology and Results
2.1.4. Conclusion
2.2. Nonclinical Technical Methods Data
2.2.1. Summary of Technical Validation Studies and Associated Analytical Methods
2.2.2. Synopses of individual studies
2.3. Clinical Biomarker Data
2.3.1. Summary of Biomarker Efficacy Studies and Associated Analytical Methods
2.3.2. Summary of Clinical Efficacy [one for each clinical context]
2.3.3. Synopses of individual studies
Section 3
Quality
<used when individual sponsor qualifies marker in context of a specific NDA>
Section 4
Nonclinical Reports
4.1. Study reports
4.1.1. Technical Methods Development Reports
4.1.2. Technical Methods Validation Reports
4.1.3. Nonclinical Study Reports (in vivo)
4.2. Literature references
Section 5
Clinical Reports
5.1. Tabular listing of all clinical studies
5.2. Clinical study reports and related information
5.2.1. Technical Methods Development reports
5.2.2. Technical Methods Validation reports
5.2.3. Clinical Efficacy Study Reports [context for use]
5.3. Literature references
Package
Standards Mapping
Mapping of CDASH/SDTM variables to ACRIN, DICOM, NBIA, and BRIDG (example row):

Domain: DM
SDTM Variable Name: BRTHDTC
Variable Label: Patient's Date of Birth
Definition: Date/time of birth of subject
DICOM Tag: (0010,0030) Patient's Date of Birth
BRIDG: BiologicEntity.birthDate*
Data type: CHAR*
Controlled Terms, Codelist or Format: NA
Role: Record Qualifier (SDTM Role)
Core: Permissible
Implementation Notes: Point to NCIt, RadLex, etc. here; ISO 8601 & 21090
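As a worked example of the mapping row above, the sketch below pulls DICOM (0010,0030) Patient's Date of Birth with pydicom and reformats it as the ISO 8601 value SDTM expects for DM.BRTHDTC; the file name is hypothetical.

```python
# Hedged sketch: map DICOM (0010,0030) Patient's Date of Birth to the
# SDTM DM.BRTHDTC variable in ISO 8601 form. "case001.dcm" is hypothetical.
from datetime import datetime

import pydicom

ds = pydicom.dcmread("case001.dcm", stop_before_pixels=True)
dicom_dob = ds.PatientBirthDate  # DICOM DA format, e.g., "19470521"

# Reformat YYYYMMDD to the ISO 8601 date expected by SDTM.
brthdtc = datetime.strptime(dicom_dob, "%Y%m%d").date().isoformat()
print("DM.BRTHDTC =", brthdtc)  # e.g., "1947-05-21"
```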
Package
NCI – CPATH – CDISC CRF WG
Package
Web-enabled service for compiling results
Open source model
• www.qi-bench.org domain
• BSD license
• Extending rather than forking assets
  – Engaging with CBIIT OSDI program
• QI-Bench specific assets in publicly accessible repositories, and full access to development tools through www.qi-bench.org
• Project wiki at www.qi-bench.org/wiki
Development Lifecycle Process for Centrally Developed Portions
High-level relationship among development processes (modeled after corresponding process flow at caBIG to allow effective integration)
QI-Bench Developer’s Resources
Year Outlook

October 2011
• 3A Pilot data curation
• Specify feasibility established with proof of concept using BioPortal and QIBO
• Data center upgrades (now including 24 processor cores, 40 GB main memory, 12 TB disk storage, Linux, Windows Server, and Mac OS X)

November 2011
• QI-Bench developer environment (Git, Jira, dashboards)
• Execute upgrade
• 3A Challenge Launch

December 2011
• "Image Formation": scripts and spreadsheet for provenance of Reference Data Sets
• Laboratory Protocol (Formulate, Execute, Analyze)

January 2012
• Support change analysis / serial studies (up to two time points, extensible to additional time points in subsequent development iterations) (Formulate, Execute, Analyze)

February 2012
• Specify working model (including QIB Ontology and AIM template builder 18 or 23 as possible)
• Document and solidify the API harness for execution modules of the Batch Analysis Service (Execute)

March 2012
• Drive reading stations by making worklist items from within BatchMake scripts (Formulate, Execute)

April 2012
• Support DICOM Query/Retrieve from Reference Data Set Manager (to complete the Reader Studies support package) (Execute, Analyze)

May 2012
• Extend harness API for AIM templates (including generating output from the LSTK module via AIM template) (Execute)

Summer 2012
• Formulate working model (including re-worked NBIA Connector and other content as above)
Summary:
QI-Bench Contributions
• We make it practical to increase the magnitude of data for greater statistical significance.
• We provide practical means to grapple with massive data sets.
• We address the problem of efficient use of resources to assess limits of
generalizability.
• We make formal specification accessible to diverse groups of experts who are neither skilled nor interested in knowledge engineering.
• We map both medical and technical domain expertise into representations well suited to emerging capabilities of the semantic web.
• We enable a mechanism to assess compliance with standards or
requirements within specific contexts for use.
• We take a “toolbox” approach to statistical analysis.
• We provide the capability in a manner that is accessible to varying collaborative models, from individual companies or institutions, to larger consortia or public-private partnerships, to fully open public access.