"Analyze" Scope Description (ASD) - QI

advertisement
QI-Bench “Analyze” ASD
Rev 0.1
QI-Bench “Analyze” Scope Description
June 2011
Rev 0.1
Required Approvals (Print Name / Signature / Date):
  Author of this Revision: Andrew J. Buckler
  Project Manager: Andrew J. Buckler
Document Revisions:
  Revision 0.1 | Revised By: AJ Buckler | Reason for Update: Initial version | Date: June 2011
Table of Contents
1. Executive Summary
   1.1. Application Purpose
   1.2. Application Scope
   1.3. The Reason Why the Application Is Necessary
   1.4. Terms Used in This Document
2. Profiles
   2.1. Information Profiles
   2.2. Functional Profiles
      2.2.1. Biostatistical Assessment of Predictive Biomarkers
      2.2.2. Biostatistical Assessment of Prognostic Biomarkers
   2.3. Behavioral Profiles
3. Conformance Assertions
4. References
1. Executive Summary
Imaging biomarkers are developed for use in the clinical care of patients and in the
conduct of clinical trials of therapy. In clinical practice, imaging biomarkers are intended
to (a) detect and characterize disease, before, during or after a course of therapy, and
(b) predict the course of disease, with or without therapy. In clinical research, imaging
biomarkers are intended to be used in defining endpoints of clinical trials. A precondition
for the adoption of the biomarker for use in either setting is the demonstration of the
ability to standardize the biomarker across imaging devices and clinical centers and the
assessment of the biomarker’s safety and efficacy. Currently, qualitative imaging
biomarkers are extensively used by the medical community, and major improvements in
clinical imaging are making quantitative biomarkers possible. In this document,
“biomarker” refers to the measurement derived from an imaging method, and “device” or
“test” refers to the hardware/software used to generate the image and extract the
measurement.
Regulatory approval for clinical use1 and regulatory qualification for research use
depend on demonstrating proof of performance relative to the intended application of
the biomarker:
• In a defined patient population,
• For a specific biological phenomenon associated with a known disease state,
• With evidence in large patient populations, and
• Externally validated.
The use of imaging biomarkers occurs at a time of great pressure on the cost of medical
services. To allow for maximum speed and economy for the validation process, this
strategy is proposed as a methodological framework by which stakeholders may work
together.
1.1. Application Purpose
The purpose of the QI-Bench project is to aggregate evidence relevant to the process of
implementing imaging biomarkers so that data of sufficient quality and quantity are
generated to support the responsible use of these new tools in clinical settings. The
efficiencies that follow from using this approach could translate into defined processes
that can be sustained to develop and refine imaging diagnostic and monitoring tools for
the healthcare marketplace, enabling sustained progress in improving healthcare
outcomes. Specifically, the “Analyze” application is developed to allow users to:
• Characterize the method relative to intended use.
• Apply the existing tools and/or extend them.
1.2. Application Scope
From a technology point of view, Analyze refers to the part of the project most closely
associated with the Measurement Variability Toolkit portion of AVT live as well as the
ideas being discussed presently for a library of reference statistical analysis methods.
Its job is to enrich the logical specification with statistical results.
Concretely, Analyze will be packaged in two forms: 1) as a web service linking to
the databases on the project server dev.bbmsc.com; and 2) as a local
installation/instance of the functionality for more sophisticated users.
1.3. The reason why the application is necessary
Biomarkers are useful in clinical practice only if they improve performance and only to the
extent that they are clinically relevant. As such, objective evidence regarding the
biomarkers’ relationships to health status must be established. Imaging biomarkers are
usually used in concert with other types of biomarkers and with clinical endpoints (such
as patient reported outcomes (PRO) or survival). Imaging and other biomarkers are
often essential to the qualification of each other.
In the past decade, researchers have grappled with emerging high-throughput
technologies and the data analysis problems they present. Statistical validation provides a
means of understanding the results of high-throughput datasets. Conceptually,
statistical validation involves associating elements in the results of high-throughput data
analysis to concepts in an ontology of interest, using the ontology hierarchy to create a
summarization of the result, and computing statistical significance for observed trends in
such a way as to identify the performance of methods within tested contexts for use and
to identify the limits of generalizability across them.
The canonical example of statistical validation is when sufficient statistical power either
proving or disproving a hypothesis is met, usually stated along with a characterization of
variability under defined scenarios. Determining the biological relevance of a
quantitative imaging read-out is a difficult problem. First it is important to establish to
what extent a read-out is an intermediate end-point capable of being measured prior to
a definitive endpoint that is causally rather than coincidentally related. Second, given
the combinatorial complexity that arises with a multiplicity of contexts for use coupled
with the cost in time and resource to experimentally interrogate the space fully, a logical
and mathematical framework is needed to establish how extant study data may be used
to establish performance in contexts that have not been explicitly tested.
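As a hedged illustration of the kind of power statement referred to above, the following
sketch (in Python, using illustrative numbers that are not drawn from this document)
computes the number of subjects per arm needed to detect a given change in a quantitative
imaging read-out when the overall measurement variability is known.

import math
from statistics import NormalDist


def subjects_per_arm(delta, sigma, alpha=0.05, power=0.8):
    """Two-arm, two-sided comparison of means with common standard deviation sigma."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z(power)           # quantile corresponding to the desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Illustrative scenario (assumed values): detect a 10% change in tumor volume when
# total measurement variability (biological + scanner + reader) is 15%.
print(subjects_per_arm(delta=10.0, sigma=15.0))  # about 36 subjects per arm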
1.4. Terms Used in This Document
The following are terms commonly used that may be of assistance to the reader.
AAS: Application Architecture Specification
ASD: Application Scope Description
BAM: Business Architecture Model
BRIDG: Biomedical Research Integrated Domain Group
caBIG: Cancer Biomedical Informatics Grid
caDSR: Cancer Data Standards Registry and Repository
CAT: Composite Architecture Team
CBIIT: Center for Biomedical Informatics and Information Technology
CFSS: Conceptual Functional Service Specification
CIM: Computational Independent Model
DAM: Domain Analysis Model
EAS: Enterprise Architecture Specification
ECCF: Enterprise Conformance and Compliance Framework
EOS: End of Support
ERB: Enterprise Review Board
EUC: Enterprise Use-case
IMS: Issue Management System (Jira)
KC: Knowledge Center
NCI: National Cancer Institute
NIH: National Institutes of Health
PIM: Platform Independent Model
PSM: Platform Specific Model
PMO: Project Management Office
PMP: Project Management Plan
QA: Quality Assurance
QSR: FDA’s Quality System Regulation
SAIF: Service Aware Interoperability Framework
SDD: Software Design Document
SIG: Service Implementation Guide
SUC: System Level Use-case
SME: Subject Matter Expert
SOA: Service Oriented Architecture
SOW: Statement of Work
UML: Unified Modeling Language
UMLS: Unified Medical Language System
VCDE: Vocabularies & Common Data Elements
When using the template, extend with specific terms related to the particular EUC being
documented.
2. Profiles
A profile is a named set of cohesive capabilities. A profile enables an application to be
used at different levels and allows implementers to provide different levels of
capabilities in differing contexts. Whereas interoperability is the metric with services,
applications focus on usability (from a user’s perspective) and reusability (from an
implementer’s).
Include the following three components in each profile:
• Information Profile: identification of a named set of information descriptions (e.g.
  semantic signifiers) that are supported by one or more operations.
• Functional Profile: a named list of a subset of the operations defined as
  dependencies within this specification which must be supported in order to claim
  conformance to the profile.
• Behavioral Profile: the business workflow context (choreography) that fulfills one
  or more business purposes for this application. This may optionally include
  additional constraints where relevant.
Fully define the profiles being defined by this version of the application.
When appropriate, a minimum profile should be defined. For example, if an application
provides access to several business workflows, then one or more should be deemed
essential to the purpose of the application.
Each functional profile must identify which interfaces are required and, when relevant,
which specific data groupings are covered.
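Purely as a sketch of how the three profile components described above might be
recorded, the following Python fragment captures a hypothetical minimum profile for
Analyze; the field names and example entries are assumptions made for illustration and
are not part of this specification.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Profile:
    name: str
    information: List[str] = field(default_factory=list)  # semantic signifiers supported
    functional: List[str] = field(default_factory=list)   # operations required for conformance
    behavioral: List[str] = field(default_factory=list)   # workflow steps (choreography)


# A hypothetical minimum profile for the Analyze application (illustrative entries only).
minimum_profile = Profile(
    name="Analyze-Minimum",
    information=["MeasurementResult", "UncertaintyStatement"],
    functional=["characterizeMethod", "assessRepeatability"],
    behavioral=["load study data", "run reference analysis", "report uncertainty"],
)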
When profiling, consider the use of your application in:
• Differing business contexts
• Different localizations
• Different information models
• Partner-to-Partner Interoperability contexts
• Product packaging and offerings
Profiles themselves are optional components of application specifications; they do not
necessarily define dependencies in the way they do for services. Nevertheless,
profiles may be an effective means of creating groupings of components that make
sense within the larger application concept.
2.1. Information Profiles
• Identify a named set of information descriptions (e.g. semantic signifiers) that are
  supported by one or more operations.
2.2. Functional Profiles
The most demanding standard for clinical biomarker application stems from federal drug
approval agencies which have a statutory requirement that any test have demonstrated
validity and reliability. In casual scientific conversations in imaging contexts, the words
reliability and validity are often used to describe a variety of properties (and sometimes
the same one). The metrology view of proof of performance dictates that a
measurement result is complete only when it includes a quantitative statement of its
uncertainty.2,3 Generating this statement typically involves the identification and
quantification of many sources of uncertainty, including those due to reproducibility and
repeatability (which themselves may be due to multiple sources). Measures of
uncertainty are required to assess whether a result is adequate for its intended purpose
and how it compares with alternative methods. A high level of uncertainty can limit
utilization, as uncertainty reduces statistical power, especially in the multi-center trials
needed for major studies. Uncertainty compromises longitudinal measurements,
especially when patients move between centers or when scanner changes occur.
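The following minimal sketch (with assumed example values, not taken from this
document) shows one way the repeatability component of such an uncertainty statement
can be quantified from test-retest measurements, reporting the within-subject standard
deviation and the repeatability coefficient.

import math


def repeatability(test, retest):
    diffs = [a - b for a, b in zip(test, retest)]
    # Within-subject SD from paired replicates: sqrt(mean(d^2) / 2).
    wsd = math.sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))
    rc = 1.96 * math.sqrt(2) * wsd  # repeatability coefficient
    return wsd, rc


# Example with made-up lesion volumes (mL) measured twice on the same scanner.
test = [10.2, 5.1, 7.8, 12.4, 9.0]
retest = [10.6, 4.8, 8.1, 12.0, 9.3]
wsd, rc = repeatability(test, retest)
print(f"within-subject SD = {wsd:.2f} mL, repeatability coefficient = {rc:.2f} mL")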
2.2.1. Biostatistical Assessment of Predictive Biomarkers4
Biomarker qualification will be determined by analyses of data addressing
treatment-induced changes in the biomarker that also correlate with the corresponding
clinical outcome. The performance of the biomarker will be assessed against standard
practice as a benchmark, that is, how the marker performs relative to established methods.
Data used in the study would variously include results from the published literature,
retrospectively re-analyzed data from previous clinical trials, and analyzed data from
existing ongoing trials and those based on RSNA/QIBA protocols and profiles.
The primary purpose of predictive response markers in the phase II setting is to serve
as an early but accurate indicator of a promising treatment effect on survival. The key
criteria proposed to judge the utility of the new endpoint primarily relate to its ability to
accurately and reproducibly predict the eventual phase III endpoint for treatment effect,
which is assessed by a difference between two arms on progression-free or overall
survival, both at the patient level and more importantly at the trial level. More precisely,
the measure of treatment effect on the phase II endpoint must correlate sufficiently well
with the measure of treatment effect on the phase III primary endpoint such that the
former can be considered reasonably predictive of the latter.
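As a hedged illustration of this trial-level criterion, the sketch below (with invented
treatment-effect estimates for five hypothetical trials) correlates, across trials, the
effect on the early endpoint with the effect on the phase III endpoint.

from statistics import correlation  # Python 3.10+

effect_early = [-0.42, -0.10, -0.55, -0.25, -0.05]  # e.g. log HR for the imaging/PFS endpoint
effect_late = [-0.30, -0.08, -0.47, -0.20, -0.02]   # e.g. log HR for overall survival

r_trial = correlation(effect_early, effect_late)
print(f"trial-level correlation R = {r_trial:.2f}, R^2 = {r_trial ** 2:.2f}")
# In practice each trial's estimate carries its own uncertainty, so a weighted or
# meta-analytic (error-in-variables) model would be used rather than a raw correlation.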
It is not sufficient that the endpoint being considered for a phase II trial be a prognostic
indicator of clinical outcome. Within the context of a clinical trial, the early endpoint must
capture at least a component of treatment benefit, a concept that specifies that a
change due to treatment in the early endpoint predicts a change in the ultimate clinical
endpoint. Theoretical principles to define treatment benefit were outlined by Prentice,5
although capturing the full treatment benefit (as measured by the phase III endpoint)
has been recognized as too cumbersome to be useful in practice.6,7 A more practical
and demonstrable criterion requires that the early endpoint captures a substantial
proportion of the treatment benefit, for example, more than 50%. This approach has
been used to establish the utility of endpoints such as progression-free survival (PFS)
by demonstrating that they are sufficiently predictive of OS, even if they do not satisfy
the Prentice criterion.8,9,10,11,12,13,14,15,16 We are primarily trying to show an association of
the biomarker with the clinical endpoint that is not likely due to chance. Eventually we
would like to do this better than present methods such as RECIST; for example, how does
a 50% proportion-of-treatment-benefit threshold compare with what RECIST achieves? We
would like to claim that FDG and VIA are better than RECIST (or an equivalent measure of
disease state).
The Freedman approach involves estimating the treatment effect on the true endpoint,
defined as s, and then assessing the proportion of treatment effect explained by the
early endpoint. However, as noted by Freedman, this approach has statistical power
limitations that will generally preclude conclusively demonstrating that a substantial
proportion of the treatment benefit at the individual patient level is explained by the early
endpoint. In addition, it has been recognized that the proportion explained is not in fact
a true proportion, as it may exceed 100%, and that, whilst it may be estimated within a
single trial, data from multiple trials are required to provide a robust estimate of the
predictive endpoint. It can also have interpretability problems, as pointed out
by Freedman. Buyse and Molenberghs also proposed an adjusted correlation method
that overcomes some of these issues.
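The following sketch, using simulated data rather than any result from this project,
illustrates the Freedman proportion-of-treatment-effect calculation discussed above: the
treatment coefficient in a model for the true endpoint is compared with and without
adjustment for the candidate early endpoint.

import numpy as np

rng = np.random.default_rng(0)
n = 500
treatment = rng.integers(0, 2, n).astype(float)
early = -0.8 * treatment + rng.normal(0, 1, n)                   # early (surrogate) endpoint
true_end = 1.5 * early - 0.3 * treatment + rng.normal(0, 1, n)   # true endpoint


def treatment_coefficient(y, extra_columns):
    """Least-squares coefficient on treatment, optionally adjusting for extra columns."""
    X = np.column_stack([np.ones(n), treatment] + extra_columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


beta_unadj = treatment_coefficient(true_end, [])
beta_adj = treatment_coefficient(true_end, [early])
pte = 1 - beta_adj / beta_unadj  # proportion of treatment effect explained
print(f"unadjusted {beta_unadj:.2f}, adjusted {beta_adj:.2f}, PTE about {pte:.2f}")
# As Freedman noted, this quantity is imprecisely estimated and can exceed 100%,
# which is why multi-trial (meta-analytic) approaches are preferred.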
2.2.2. Biostatistical Assessment of Prognostic Biomarkers17
The assessment framework for predictive markers stems from the accepted definition of
a surrogate marker as a measure which can substitute for a more difficult, distant, or
expensive-to-measure endpoint in predicting the effect of a treatment or therapy in a
clinical trial.18 Greatly complicating the issue is the fact that all the definitions of
surrogacy revolve around the elucidation of the joint and conditional distributions of the
desired endpoint, putative surrogate and their dependence on a specified
therapy.19,20,21,22 Therefore, what may work adequately for a given endpoint and one
type of therapy may not be adequate for the same endpoint and a different type of
therapy. Disease screening calls for a prognostic marker where it is neither necessary
nor possible to anticipate all the potential therapies for which a surrogate marker might
be desired.
Nevertheless, as measurements are developed that capture more and more accurately
the structure, functioning and tissue metabolism of pre-symptomatic cancer, it will
become more likely that proposed biomarkers are on the causal pathway to the
symptomatic disease and its clinical outcomes and can function as surrogate markers
for at least one element of disease. Furthermore, the longitudinal nature of the proposed
screening application allows correlation of changes within a person over time between
different elements of disease including different measures of structural change, such as
volumetric CT findings. So that the screening studies will support the analyses that
researchers may want to perform to evaluate putative biomarkers and assess their
potential for surrogacy, the study design must have adequate precision for estimating the
joint relationship between proposed biomarkers and desired endpoints. At the very least,
investigators will be able to identify a number of promising biomarkers that can be used
in early development of treatments and tested in trials as surrogates for treatment
effects. These initial objectives for surrogacy may require somewhat different validation
standards in comparison to use of surrogates by regulatory authorities in registering a
new drug treatment.
Surrogacy means more than a demonstrable or even a strong association between the
desired endpoint and the proposed surrogate, and original definitions have been
criticized as being limited in scope and having fundamental shortcomings.23 Recent
proposals in the context of meta-analysis get more to the heart of surrogacy. By
correlating changes in the surrogate with changes in a primary endpoint, these
approaches more directly address the surrogacy question. These analytic techniques
are equally applicable in a longitudinal setting, such as screening.
The techniques for doing so are most easily described in the context of a continuous
surrogate (e.g. change in nodule volume) and a continuous outcome. Linear mixed
models24 with random slopes (or, more generally, random functions) and intercepts
through time are built for both the surrogate marker and the endpoint. That is, the joint
distribution of the surrogate marker and the endpoint are modeled using the same
techniques as used for each variable individually. The degree to which the random
slopes for the surrogate and the endpoint are correlated gives a direct measure of how
well changes in the surrogate correlate with changes in the endpoint. The ability of the
surrogate to extinguish the influence of potent risk factors, in a multivariate model,
further strengthens its use as a surrogate marker.
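The sketch below is not the full joint mixed model; it is a simple two-stage approximation
on simulated data that conveys the idea: estimate a per-subject slope over time for the
candidate surrogate and for the endpoint, then correlate the slopes across subjects.

import numpy as np

rng = np.random.default_rng(1)
n_subjects = 200
times = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

true_slope = rng.normal(0, 1, n_subjects)  # shared latent rate of change per subject
surr_slopes, end_slopes = [], []
for s in true_slope:
    surrogate = s * times + rng.normal(0, 0.3, len(times))         # e.g. nodule volume change
    endpoint = (0.8 * s) * times + rng.normal(0, 0.3, len(times))  # e.g. clinical measure
    surr_slopes.append(np.polyfit(times, surrogate, 1)[0])  # per-subject OLS slope
    end_slopes.append(np.polyfit(times, endpoint, 1)[0])

r = np.corrcoef(surr_slopes, end_slopes)[0, 1]
print(f"correlation of per-subject slopes about {r:.2f}")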
In practice, it is likely there will often be competing candidate surrogate markers, each
correlated to a different degree with the endpoint. The preferred surrogate is one that is
biologically defensible and most highly correlated with the endpoint. The statistical
significance of the differences between correlations can be evaluated using a
parametric bootstrap.25
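A minimal parametric-bootstrap sketch of this comparison follows, on invented data: a
trivariate normal model is fitted to the observed candidate surrogates and endpoint,
bootstrap datasets are drawn from it, and a confidence interval is formed for the
difference between the two candidates’ correlations with the endpoint.

import numpy as np

rng = np.random.default_rng(2)
n = 150
endpoint = rng.normal(0, 1, n)
cand_a = 0.8 * endpoint + rng.normal(0, 0.6, n)  # stronger candidate surrogate
cand_b = 0.5 * endpoint + rng.normal(0, 0.9, n)  # weaker candidate surrogate
data = np.column_stack([cand_a, cand_b, endpoint])

mu, cov = data.mean(axis=0), np.cov(data, rowvar=False)  # fitted parametric model
diffs = []
for _ in range(2000):
    sim = rng.multivariate_normal(mu, cov, size=n)
    c = np.corrcoef(sim, rowvar=False)
    diffs.append(c[0, 2] - c[1, 2])  # corr(A, endpoint) minus corr(B, endpoint)

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"95% bootstrap CI for the difference in correlations: [{lo:.2f}, {hi:.2f}]")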
2.3. Behavioral Profiles
• The business workflow context (choreography) that fulfills one or more business
  purposes for this application. This may optionally include additional constraints
  where relevant.
3. Conformance Assertions
Conformance Assertions are testable, verifiable statements made in the context of a
single RM-ODP Viewpoint (ISO Standard Reference Model for Open Distributed
Processing, ISO/IEC IS 10746|ITU-T X.900). They may be made in four of the five RMODP Viewpoints, i.e. Enterprise, Information, Computational, and/or Engineering. The
Technology Viewpoint specifies a particular implementation /technology binding that is
run within a ‘test harness’ to establish the degree to which the implementation is
conformant with a given set of Conformance Assertions made in the other RM-ODP
Viewpoints. Conformance Assertions are conceptually non-hierarchical. However,
Conformance Assertions may have hierarchical relationships to other Conformance
Assertions within the same Viewpoint (i.e. be increasingly specific). They are not,
however, extensible in and of themselves.
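Purely as an illustration (no such structure is defined by this specification), the
following sketch shows one way Conformance Assertions could be recorded with their
Viewpoint and optional parent assertion, together with a trivial test harness that reports
which assertions a given implementation satisfies.

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ConformanceAssertion:
    identifier: str
    viewpoint: str                 # Enterprise, Information, Computational, or Engineering
    statement: str
    parent: Optional[str] = None   # a more general assertion within the same Viewpoint
    check: Callable[[object], bool] = lambda impl: False  # test executed in the harness


def run_harness(implementation, assertions):
    """Report, per assertion identifier, whether the implementation satisfies it."""
    return {a.identifier: a.check(implementation) for a in assertions}


# Hypothetical example: an Analyze implementation must report measurement uncertainty.
assertions = [
    ConformanceAssertion(
        identifier="INF-001",
        viewpoint="Information",
        statement="Every measurement result includes an uncertainty statement.",
        check=lambda impl: hasattr(impl, "uncertainty"),
    ),
]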
4. References
1. http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?CFRPart=820&showFR=1, accessed 28 February 2010.
2. International Organization for Standardization. Guide to the Expression of Uncertainty in Measurement. Geneva: International Organization for Standardization; 1993.
3. Joint Committee for Guides in Metrology. International Vocabulary of Metrology – Basic and General Concepts and Associated Terms. Paris: Bureau International des Poids et Mesures; 2008.
4. Sargent DJ, Rubinstein L, Schwartz L, Dancey JE, Gatsonis C, Dodd LE, Shankar LK. Validation of novel imaging methodologies for use as cancer clinical end-points. Eur J Cancer 2009;45:290–299.
5. Prentice RL. Surrogate endpoints in clinical trials: definitions and operational criteria. Stat Med 1989;8:431–40.
6. Freedman LS, Graubard BI, Schatzkin A. Statistical validation of intermediate endpoints for chronic diseases. Stat Med 1992;11:167–78.
7. Korn EL, Albert PS, McShane LM. Assessing surrogates as trial endpoints using mixed models. Stat Med 2005;24:163–82.
8. Buyse M, Burzykowski T, Carroll K, et al. Progression-free survival is a surrogate for survival in advanced colorectal cancer. J Clin Oncol 2007;25:5218–24.
9. Buyse M, Thirion P, Carlson RW, Burzykowski T, Molenberghs G, Piedbois P. Relation between tumor response to first line chemotherapy and survival in advanced colorectal cancer: a meta-analysis. Lancet 2000;356:373–8.
10. Burzykowski T, Buyse M, Piccart-Gebhart MJ, et al. Evaluation of tumor response, disease control, progression-free survival, and time to progression as potential surrogate end points in metastatic breast cancer. J Clin Oncol 2008;26:1987–92.
11. Buyse M, Molenberghs G. Criteria for the validation of surrogate endpoints in randomized experiments. Biometrics 1998;54:1014–29.
12. Burzykowski T, Molenberghs G, Buyse M, Geys H, Renard D. Validation of surrogate end points in multiple randomized clinical trials with failure time end points. Appl Stat 2001;50:405–22.
13. Burzykowski T, Molenberghs G, Buyse M. The validation of surrogate end points by using data from randomized clinical trials: a case-study in advanced colorectal cancer. J Royal Stat Soc A 2004;167:103–24.
14. Bruzzi P, Del Mastro L, Sormani MP, et al. Objective response to chemotherapy as a potential surrogate end point of survival in metastatic breast cancer patients. J Clin Oncol 2005;23:5117–25.
15. Sargent DJ, Wieand HS, Haller DG, et al. Disease-free survival versus overall survival as a primary end point for adjuvant colon cancer studies: individual patient data from 20,898 patients on 18 randomized trials. J Clin Oncol 2005;23:8664–70.
16. Sargent DJ, Patiyil S, Yothers G, et al. End points for colon cancer adjuvant trials: observations and recommendations based on individual patient data from 20,898 patients on 18 randomized trials from the ACCENT group. J Clin Oncol 2007;25:4569–74.
17. Nevitt MC, Felson DT, Lester G. The Osteoarthritis Initiative: Protocol for the Cohort Study. Osteoarthritis Initiative, version 1.1, 21 June 2006.
18. Prentice RL. Surrogate endpoints in clinical trials: definition and operational criteria. Stat Med 1989;8:431–40.
19. Buyse M, Molenberghs G, Burzykowski T, Renard D, Geys H. Statistical validation of surrogate endpoints: problems and proposals. Drug Inf J 2000;34:447–54.
20. Freedman LS, Graubard BI, Schatzkin A. Statistical validation of intermediate endpoints for chronic diseases. Stat Med 1992;11:167–78.
21. Buyse M, Molenberghs G, Burzykowski T, Geys H, Renard D. The validation of surrogate endpoints in meta-analyses of randomized experiments. Biostatistics 2000;1:1–19.
22. Fleming TR, DeMets DL. Surrogate endpoints in clinical trials: are we being misled? Ann Intern Med 1996;125:605–13.
23. Fleming TR, DeMets DL. Surrogate endpoints in clinical trials: are we being misled? Ann Intern Med 1996;125:605–13.
24. McCulloch CE, Searle SR. Generalized, Linear, and Mixed Models. New York: Wiley; 2000.
25. Davison AC, Hinkley DV. Bootstrap Methods and Their Application. Cambridge: Cambridge University Press; 1997.