7/21/2014 QIB Technical Assessment Quantitative Imaging Metrology: What Should Be Assessed and How? Methods for Technical Performance Assessment Nicholas Petrick CDRH/OSEL/DIDSR U.S. Food and Drug Administration QIB Technical Assessment Outline of Talk • • • • Purpose Main components of technical assessment Example Analysis Summary 7/21/2014 2 QIB Technical Assessment Purpose • Discuss QIBA consensus on technical assessment of QIB performance* – Include example analysis from our research 7/21/2014 *Raunig et al., Stat Methods Med Res, 2014 3 1 7/21/2014 QIB Technical Assessment Why FDA? • QIB are important growing area of interest • CDER (drugs) – Qualification of QIBs for use in drug trials • CDRH (devices) – Potential for specific QIB device claims – Develop least burdensome assessment methods *Raunig et al., Stat Methods Med Res, 2014 7/21/2014 4 QIB Technical Assessment Components of a QIB Patient Patient Patient Patient Image Image Acquisition Image Acquisition Image System Acquisition System Acquisition System System Image Image Measurement Image Measurement Image Measurement Measurement QIB Performance Analysis • Factors influencing reliability of a QIB – Patient factors • Natural variation in lesions/patients • Variation among patients – Acquisition parameters • Slice thickness, collimation, exposure, reconstruction, etc. • Variation in imaging platform – Software effects • Automated/semi-automated • Variation in measurement tool 7/21/2014 5 QIB Technical Assessment QIB Technical Assessment • QIBs typically evaluated for reliability under study conditions • Problem: QIB performance evaluation often different from study-to-study – Partly due to differences in data set, etc. – Also due to analysis differences • Ad hoc metrics • Inconsistent use of statistical metrics • Inconsistent terminology 7/21/2014 6 2 7/21/2014 QIB Technical Assessment QIBA Technical Assessment Group* • Provide common framework for researchers to assess technical performance • Ensure reliability of image measurement – Subject QIBs to same rigor as other quantitative clinical measurements 7/21/2014 *Raunig et al., Stat Methods Med Res, 2014 7 QIB Technical Assessment Designing QIB analysis plan* Step 1: Define QIB and its relationship to quantity to be measured (measurand) Step 2: Define question to be addressed – Hypothesis or bounds on performance Step 3: Define the experimental unit – Lesion-level, patient-level Step 4: Define statistical measures of performance – What are the metrics for bias, repeatability, reproducibility? Step 5: Specify elements of statistical design Reference, reproducibility conditions, etc. Step 6: Determine the data requirements – Patient population, type of images, sample size, etc. Step 7: Collect data and perform statistical analysis 7/21/2014 *Raunig et al., Stat Methods Med Res, 2014 8 QIB Technical Assessment Basic Technical Assessment Components • Bias/Linearity • Repeatability • Reproducibility 7/21/2014 9 3 7/21/2014 QIB Technical Assessment QIB Technical Assessment Bias and Linearity 7/21/2014 10 QIB Technical Assessment Bias and Linearity • Relationship of QIB to reference standard (measurand) • E 𝑋 𝑋 = 𝑥 = 𝑓(𝑥) • 𝑋: Estimate • 𝑋: Reference standard (measurand) • Linearity – On average, change in 𝑥 reflects a proportional change in 𝑋 6/26/2013 Petrick, QMI 2013 11 QIB Technical Assessment Bias and Linearity • Linearity 6000 – Proportional change • True (red) 1.0 • Measured – 1000: 0.6 – 3000: 1.4 Non-constant Bias Constant Bias Curvature Identity Slope=1.2 4000 Measured Value • Nonlinear Green Blue Red Black --- 5000 • True (green) : 1.0 • Measured (green): 1.2 ~1400 3000 2000 ~600 1000 0 0 1000 2000 3000 4000 5000 6000 True Value 7/21/2014 12 4 7/21/2014 QIB Technical Assessment Bias • 𝐵𝑖𝑎𝑠 𝑋 = 𝐸 𝑋 𝑋 = 𝑥 − 𝑥 6000 • Types Green Blue Red Black --- 5000 – Unbiased • 𝐵𝑖𝑎𝑠 𝑋 = 0 Non-constant Bias Constant Bias Curvature Identity Measured Value 4000 – Slope=1 – Intercept=0 – Constant bias 3000 2000 • 𝐵𝑖𝑎𝑠 𝑋 = 𝐵 – Slope=1 1000 – Non-constant bias 0 • 𝐵𝑖𝑎𝑠 𝑋 = 𝐵(𝑥) 0 1000 2000 3000 4000 5000 6000 True Value – Depended on truth 7/21/2014 13 QIB Technical Assessment Bias and Linearity Assessment • Visually assess measured vs reference – Define limits of quantitation Measured Value Identity Line • Bias analysis – Utilize multiple replicates for multiple “truth” values • Linearity analysis Measuring Interval LLoQ ULoQ True Value – Test for linearity • Sequential polynomial testing 7/21/2014 14 QIB Technical Assessment QIB Technical Assessment Repeatability 7/21/2014 28 5 7/21/2014 QIB Technical Assessment Repeatability • Test-retest within-subject variability – Within-subject variability • Variability across repeat observations nested within patients 7/21/2014 29 QIB Technical Assessment Repeatability Metrics • Within-subject variance – 𝜎𝑤2 = 𝐸𝑝 𝐸𝑟 (𝑦𝑖𝑗 − 𝜇𝑖 )2 • Repeat observations, 𝑗 = 1, … , 𝑘 • Patient, 𝑖 = 1, … , 𝑛 • Repeatability Coefficient – Variance of difference in two independent measurements • 𝑅𝐶 = 1.96 2𝜎𝑤2 , for normal data – Limits of Agreement • 𝐿𝑂𝐴 = [−𝑅𝐶, 𝑅𝐶] • Others – Intra-class correlation (ICC) – Within-subject coefficient of variation (𝑤𝐶𝑉) 7/21/2014 30 QIB Technical Assessment • Reference known – Box-whisker plot • Reference unknown Normalized Size (%) Repeatability Plots 50 0 -50 10 20 Reference Diameter (mm) – Bland-Altman plot • Plot of difference against mean of two repeated measurements 7/21/2014 31 6 7/21/2014 QIB Technical Assessment QIB Technical Assessment Reproducibility 7/21/2014 32 QIB Technical Assessment Reproducibility • Variability of QIB under differing experimental conditions • Evaluate QIB performance under varied but controlled conditions – Potential reproducibility conditions • • • • Across scanners Across QIB measurement tools Across sites Across readers/clinicians 7/21/2014 33 QIB Technical Assessment Reproducibility Metrics • Within-subject reproducibility variance 2 2 2 – 𝜎𝑅𝑒𝑝𝑟𝑜𝑑𝑢𝑐𝑖𝑏𝑖𝑙𝑖𝑡𝑦 = 𝜎𝑅𝑒𝑝𝑒𝑎𝑡𝑎𝑏𝑖𝑙𝑖𝑡𝑦 + 𝜎𝑏𝑒𝑡𝑤𝑒𝑒𝑛−𝑐𝑜𝑛𝑑𝑖𝑡𝑖𝑜𝑛𝑠 • Reproducibility Coefficient – Variance for difference in two independent measurements 2 • 𝑅𝐷𝐶 = 1.96 2𝜎𝑅𝑒𝑝𝑟𝑜𝑑𝑢𝑐𝑖𝑏𝑖𝑙𝑖𝑡𝑦 , for normal data – Limits of Agreement • 𝐿𝑂𝐴𝑅𝐷𝐶 = [−𝑅𝐷𝐶, 𝑅𝐷𝐶] • Others – Concordance correlation (CCC) 7/21/2014 34 7 7/21/2014 QIB Technical Assessment Reproducibility Plots • 2 Factors – Scatter plots • Regression – Slope= 0.97 – Intercept= 0.09 – Bland-Altman plot – RDC= 0.234 • ≥2 Factors – Box-whisker plot –… 7/21/2014 35 QIB Technical Assessment QIB Technical Assessment Phantom Study Example 7/21/2014 36 QIB Technical Assessment Purpose • Evaluate technical performance of a nodule volume estimation tool – Segmentation-based nodule volume estimation tool – Reproducibility condition • CT slice thickness ranges includes 075 mm & 1.5 mm 7/21/2014 37 8 7/21/2014 QIB Technical Assessment Study Design • Phantom – Thorax phantom with vascular insert • Synthetic nodules – 4 spherical nodules • 5, 8, 9, 10 mm • +100 HU 7/21/2014 38 QIB Technical Assessment Study Design Phantom • Image collection protocols 100 mAs – 10 repeat acquisitions 16×0.75 mm Acquisition 0.75 mm 1.5 mm Detailed Detailed Reconstruction 7/21/2014 39 QIB Technical Assessment Bias/Linearity Nodule Size 8 mm 9 mm 5 mm 10 mm 60 550 ) Measured Value-Median (mm 40 3 3 Measured Value (mm) 450 350 250 150 20 0 -20 -40 50 0.75 1.5 0.75 1.5 0.75 1.5 0.75 1.5 Slice Thickness (mm) -60 5 mm 8 mm 9 mm 10 mm Nodule Size • Visually assess data – Evaluate means/variances • Is data transformation necessary/appropriate? 7/21/2014 40 9 7/21/2014 QIB Technical Assessment Bias/Linearity • Visually assess measured vs reference 600 𝑦 = 1.005𝑥 + 3.37 500 3 Measured Value (mm) – Linear Regression • 𝑦 = 1.005𝑥 + 3.37 – 𝐵𝑖𝑎𝑠𝐸𝑠𝑡 = 4.79 𝑚𝑚 3 400 300 200 100 LLoQ 0 0 100 ULoQ 200 300 400 Reference Standard (mm3) 500 600 7/21/2014 QIB Technical Assessment Results • Repeatability/Reproducibility Slice Thickness 0.75 mm 1.5 mm 14.4 mm3 RC 27.6 mm3 29.0 mm3 RDC – Want to calculate CI on values 7/21/2014 42 QIB Technical Assessment Repeatability Analysis Bland-Altman Plot (0.75 mm Slices) Bland-Altman Plot (1.5 mm Slices) 60 60 RC=14.44 mm3 40 20 3 Diff(Esti,Estj) (mm ) 3 Diff(Esti,Estj) (mm ) 40 0 -20 -40 -60 0 20 0 -20 -40 100 200 300 400 3 Mean(Esti,Estj) (mm ) 0.75 mm Slices 7/21/2014 RC=27.62 mm3 500 600 -60 0 100 200 300 400 3 Mean(Esti,Estj) (mm ) 500 600 1.5 mm Slices 43 10 7/21/2014 QIB Technical Assessment Reproducibility Analysis Bland-Altman Plot (Reproducibility) 60 3 Diff(Esti,Estj) (mm ) 40 RDC=28.95 mm3 20 0 -20 -40 -60 0 100 200 300 400 3 Mean(Esti,Estj) (mm ) 500 600 7/21/2014 44 QIB Technical Assessment Summary • Main components of QIB technical assessment – – – – Bias/linearity analysis Repeatability analysis Reproducibility analysis Others • Identification of significant factors/subgroups • …. • Challenging to maintain consistency across studies – – – – – – Phantom/clinical data Transformation of data Reference standard Is test-retest data available? Reproducibility conditions …. 7/21/2014 45 QIB Technical Assessment Acknowledgement • Qin Li and Marios A. Gavrielides for their efforts on the FDA QIB projects • Qin Li for her efforts in developing this presentation 7/21/2014 46 11