Multimodal MR Imaging Model to Predict Tumor Infiltration in Patients with
Gliomas
APPENDIX (On-line Only)
APPENDIX 1. STATISTICAL METHODS
Overview: As previously indicated, the goal of our study was to evaluate whether data from a combination of
advanced MR imaging techniques could predict tumor infiltration in patients with gliomas better than a
single imaging technique. A three-step analytical modeling process was undertaken to achieve this goal.
Principal component analyses: Because of the small sample size and the large number of imaging
variables, the first step in the model building process was to reduce the dimensionality of the imaging
dataset without losing important information about any of the 12 imaging variables. This step was
accomplished by way of a principal components analysis (PCA). PCA uses orthogonal projections
of the multivariate data to produce a set of linearly independent principal composite scores
(pc-composite scores), which together capture all of the information associated with the complete
multivariate dataset. The number of orthogonal projections is equal to the number of variables in the
multivariate dataset. Thus, for the study at hand, we used a PCA to generate 12 pc-composite scores
per nuclear density measurement. Each pc-composite score was derived as a linear combination of the
values of the precontrast MPRAGE, T2, FLAIR, DWI, DTI, DSC, PWI, post-contrast MPRAGE, and
axial T1 spin echo imaging variables associated with the nuclear density measurement.
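The score construction described above can be sketched as follows. This is a minimal illustration on synthetic data, not the S-Plus code used in the study; the array `X`, its 40-row size, the random scaling of the columns, and the choice of a covariance-based (non-standardized) PCA are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 40 nuclear density sites x 12 imaging variables,
# with deliberately different scales per variable.
X = rng.normal(size=(40, 12)) * rng.uniform(0.1, 10.0, size=12)

# Center each variable, then diagonalize the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# eigh returns eigenvalues in ascending order; reorder so that PC1
# carries the largest variance.
order = np.argsort(eigvals)[::-1]
eigvals, weights = eigvals[order], eigvecs[:, order]

# Each column of `weights` holds the 12 coefficients of one component;
# each column of `scores` is the corresponding pc-composite score.
scores = Xc @ weights
```

Because all 12 components are retained, the scores are a rotation of the centered data: no information is discarded at this stage, and the scores are mutually uncorrelated.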
Supplemental Table 1. Non-standardized principal component coefficients (i.e., the weights given to
the individual predictors)
Predictor            PC1       PC2       PC3       PC4       PC5       PC6       PC7       PC8       PC9      PC10      PC11      PC12
T1               0.05830   0.44323  -0.86202   0.23875  -0.00630  -0.00117   0.00470  -0.00062   0.00000   0.00000   0.00000   0.00000
fa               0.00082  -0.00090  -0.00103   0.00421   0.01733  -0.01466  -0.17951   0.98348   0.00184   0.00022  -0.00179   0.00006
fa num           0.00000   0.00000   0.00000   0.00001   0.00000  -0.00005  -0.00057   0.00086   0.22320   0.37934   0.82340   0.35819
fa denom         0.00000   0.00000   0.00000   0.00000   0.00000   0.00004  -0.00016  -0.00123   0.32903   0.77806  -0.23945  -0.47857
Mean Diff.       0.00000   0.00000   0.00000   0.00000  -0.00001   0.00006   0.00016  -0.00117   0.08694   0.29929  -0.51017   0.80162
T2              -0.96708  -0.19246  -0.13901   0.09161   0.00199   0.00114  -0.00069  -0.00006   0.00000   0.00000   0.00000   0.00000
K2              -0.00001  -0.00001  -0.00002  -0.00007   0.00010   0.00004   0.00073   0.00187  -0.91343   0.40144   0.06639  -0.00857
rCBF            -0.00681   0.01967   0.01531   0.00064  -0.04356   0.04308   0.98131   0.18056   0.00117   0.00004   0.00015   0.00002
rCBV corr.      -0.19794   0.57266   0.06273  -0.78983  -0.06910  -0.00826  -0.01643   0.00223   0.00004  -0.00002  -0.00001   0.00000
rCBV uncorr.    -0.14875   0.66192   0.48231   0.55080   0.05720   0.00823  -0.01899  -0.00548  -0.00007   0.00002   0.00000   0.00000
rMTT            -0.00042  -0.00030   0.01033   0.02728  -0.19516  -0.97970   0.03506  -0.00487  -0.00003   0.00005  -0.00005   0.00000
TTP              0.00336  -0.00584   0.02636   0.08148  -0.97551   0.19489  -0.05400   0.00991  -0.00012   0.00003  -0.00001  -0.00001
Univariate analyses: The second step in the model building process was to identify the set of
pc-composite scores that were linearly associated with nuclear density. This
step was accomplished by conducting a set of univariate generalized estimating equation regression
(GEER) analyses. For each of the 12 sets of pc-composite scores, nuclear density served as
the GEER model response variable and the set of pc-composite scores served as the GEER model
predictor variable. With regard to the GEER model specification, the Gaussian distribution was the
assumed underlying distribution of the nuclear density measurements. Because each patient had
multiple nuclear density measurements, the GEER model variance-covariance parameter estimates were
derived by way of the Huber-White sandwich variance-covariance estimator, to account for
intra-patient measurement correlation in the hypothesis testing process. With regard to hypothesis
testing, a p≤0.05 decision rule was used as the criterion for rejecting the null hypothesis that there
was no linear association between the pc-composite score and the nuclear density measurement. The
adequacy of each univariate model to predict nuclear density was assessed via the coefficient of
determination (R²).
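A univariate analysis of this kind can be sketched as below. The sketch assumes an independence working correlation, which the text does not specify; under that assumption, a Gaussian GEE with identity link has the same point estimates as ordinary least squares, and the Huber-White variance is obtained by summing score contributions within each patient. The function name, the simulated data, and the patient layout are hypothetical.

```python
import math
import numpy as np

def gee_gaussian_independence(x, y, patient):
    """Fit y = b0 + b1*x; return (b1, robust Wald chi-square, p-value)."""
    x, y, patient = map(np.asarray, (x, y, patient))
    X = np.column_stack([np.ones(len(x)), x])
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: outer products of per-patient score sums.
    meat = np.zeros((2, 2))
    for pid in np.unique(patient):
        rows = patient == pid
        s = X[rows].T @ resid[rows]
        meat += np.outer(s, s)
    cov = bread @ meat @ bread
    chi2 = float(beta[1] ** 2 / cov[1, 1])
    p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square upper tail, 1 df
    return float(beta[1]), chi2, p

# Simulated example: 10 hypothetical patients with 4 measurement sites each.
rng = np.random.default_rng(1)
patient = np.repeat(np.arange(10), 4)
pcs = rng.normal(size=40)
patient_effect = rng.normal(scale=100, size=10)[patient]
density = 2000 + 300 * pcs + patient_effect + rng.normal(scale=50, size=40)

slope, chi2, p = gee_gaussian_independence(pcs, density, patient)
```

The per-patient summation is what distinguishes this from an ordinary regression test: residuals from the same patient are allowed to be correlated, so the standard error reflects the number of patients rather than the number of measurements.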
Supplemental Figure 1. Principal component coefficients for PC1-PC12.
[Figure: twelve panels, PC 1 through PC 12, each plotting the principal component coefficient (axis from -1.0 to 1.0) for the predictors T1, fa, fa num, fa denom, mean diff, T2, K2, rCBF, rCBV corr, rCBV uncor, rMTT, and TTP.]
Supplemental Table 2. Univariate generalized estimating equation analysis (modeling only marginal
associations).
Predictor   Degrees of Freedom   Chi-Square   P-value     R2
PCS 1                        1         0.16     0.688   0.01
PCS 2                        1         1.36     0.244   0.04
PCS 3                        1         0.68     0.408   0.01
PCS 4                        1         0.78     0.376   0.01
PCS 5                        1         0.03     0.854   0.00
PCS 6                        1         5.31     0.021   0.05
PCS 7                        1         1.16     0.281   0.05
PCS 8                        1         0.56     0.453   0.02
PCS 9                        1         0.00     0.981   0.00
PCS 10                       1         5.75     0.016   0.08
PCS 11                       1         0.06     0.811   0.00
PCS 12                       1         3.97     0.046   0.19
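As a quick arithmetic check on Supplemental Table 2, the p-value for a chi-square statistic with 1 degree of freedom can be computed from the complementary error function. The helper name below is hypothetical, and because the tabulated statistics are rounded to two decimals, the computed values should only be expected to match the tabulated p-values approximately.

```python
import math

def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Chi-square statistics reported for the three scores with p <= 0.05.
for label, stat in [("PCS 6", 5.31), ("PCS 10", 5.75), ("PCS 12", 3.97)]:
    print(label, round(chi2_1df_pvalue(stat), 3))
```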
Multivariate analyses: The third and final step in the model building process was to use the
pc-composite scores that were found to be statistically associated with nuclear density in step 2
to build a multivariate model to predict nuclear density. Two multivariate GEER models were
examined. The first model included the pc-composite scores that were found to be associated with
nuclear density in step 2 of the model building process, while the second model included linear and
nonlinear restricted cubic spline functions of the same pc-composite scores. As in the univariate
analyses, the Gaussian distribution was the assumed underlying distribution of the nuclear density
measurements, and the GEER model variance-covariance parameter estimates were derived by way
of the Huber-White sandwich variance-covariance estimator. With regard to hypothesis testing,
the GEE modified version of the generalized Wald test was used to compare the two models. A
p≤0.05 decision rule was used as the criterion for rejecting the null hypothesis that the predictive
information gained by allowing for nonlinear associations between the pc-composite scores and
nuclear density was no greater than what would be expected by chance alone. The same hypothesis
testing strategy was used to determine which pc-composite scores were
included in the final multivariate GEER model.
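The nonlinear spline term referred to above can be illustrated with a short sketch. It assumes three knots per predictor (consistent with the three truncated cubic terms per score in Eq. 1) and the standard unnormalized restricted cubic spline construction; the function names are hypothetical, and the knot values are the ones that appear in the PC10 terms of Eq. 1.

```python
def pos_cube(u):
    """Truncated cubic: u**3 when u > 0, otherwise 0."""
    return u ** 3 if u > 0 else 0.0

def rcs_nonlinear_term(x, knots):
    """Nonlinear basis term of a 3-knot restricted cubic spline (unnormalized)."""
    t1, t2, t3 = knots
    return (pos_cube(x - t1)
            - pos_cube(x - t2) * (t3 - t1) / (t3 - t2)
            + pos_cube(x - t3) * (t2 - t1) / (t3 - t2))

# Knot locations as they appear in the PC10 terms of Eq. 1 (PC10 x 1000 scale).
knots = (5.2251, 5.4618, 5.6171)

# The term is zero below the first knot and linear above the last knot,
# which is the "restricted" part of the construction.
below = rcs_nonlinear_term(5.0, knots)
second_difference = (rcs_nonlinear_term(6.0, knots)
                     - 2 * rcs_nonlinear_term(7.0, knots)
                     + rcs_nonlinear_term(8.0, knots))
```

With three knots, each predictor contributes one linear term and one such nonlinear term, so testing the nonlinear terms jointly amounts to testing whether the spline model improves on the linear one.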
The final regression model is described by Equation 1 (Eq. 1), where E(Density | X) denotes the
predicted nuclear density given the values of PC10 and PC12, which are defined in Equations 2 and 3,
respectively.
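Eq 1.
\[
\begin{aligned}
E(\text{Density} \mid X) ={}& 1857.0969 + 322.9077\,(\text{PC10} \times 1000) \\
&- 1996.2808\,(\text{PC10} \times 1000 - 5.2251)_+^3 + 5038.5490\,(\text{PC10} \times 1000 - 5.4618)_+^3 \\
&- 3043.2683\,(\text{PC10} \times 1000 - 5.6171)_+^3 - 1100.6736\,(\text{PC12} \times 10000) \\
&+ 14148.0490\,(\text{PC12} \times 10000 + 0.5485)_+^3 - 24336.1410\,(\text{PC12} \times 10000 + 0.4333)_+^3 \\
&+ 10188.0910\,(\text{PC12} \times 10000 + 0.2734)_+^3
\end{aligned}
\]
Note that if the quantity within $(\cdot)_+^3$ is $\le 0$, then $(\cdot)_+^3 = 0$; otherwise $(\cdot)_+^3 = (\cdot)^3$.
where
Eq 2.
\[
\begin{aligned}
\text{PC10} ={}& (4.17 \times 10^{-6} \cdot \text{T1}) + (2.24 \times 10^{-4} \cdot \text{FA}) + (3.79 \times 10^{-1} \cdot \text{FA numerator}) \\
&+ (7.78 \times 10^{-1} \cdot \text{FA denominator}) + (2.99 \times 10^{-1} \cdot \text{Mean Diffusivity}) \\
&+ (4.01 \times 10^{-1} \cdot \text{K2}) + (3.96 \times 10^{-5} \cdot \text{rCBF}) + (-1.13 \times 10^{-6} \cdot \text{T2}) \\
&+ (-1.99 \times 10^{-5} \cdot \text{rCBV}_{\text{corrected}}) + (1.85 \times 10^{-5} \cdot \text{rCBV}_{\text{uncorrected}}) \\
&+ (5.18 \times 10^{-5} \cdot \text{rMTT}) + (3.0 \times 10^{-5} \cdot \text{TTP})
\end{aligned}
\]
and
Eq 3.
\[
\begin{aligned}
\text{PC12} ={}& (-8.78 \times 10^{-8} \cdot \text{T1}) + (5.59 \times 10^{-5} \cdot \text{FA}) + (3.58 \times 10^{-1} \cdot \text{FA numerator}) \\
&+ (-4.76 \times 10^{-1} \cdot \text{FA denominator}) + (8.02 \times 10^{-1} \cdot \text{Mean Diffusivity}) + (7.39 \times 10^{-8} \cdot \text{T2}) \\
&+ (-8.57 \times 10^{-3} \cdot \text{K2}) + (1.58 \times 10^{-5} \cdot \text{rCBF}) + (-8.77 \times 10^{-9} \cdot \text{rCBV}_{\text{corrected}}) \\
&+ (-6.535 \times 10^{-7} \cdot \text{rCBV}_{\text{uncorrected}}) + (3.92 \times 10^{-6} \cdot \text{rMTT}) + (-5.25 \times 10^{-6} \cdot \text{TTP})
\end{aligned}
\]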
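Equations 1 to 3 can be transcribed directly into code, which may help when applying the model to new measurements. The sketch below copies the published coefficients; the function and dictionary names are hypothetical, and the input values in the usage lines are arbitrary placeholders chosen only to exercise the function, not measurements from the study.

```python
def pos_cube(u):
    """Truncated cubic used in Eq. 1: u**3 when u > 0, otherwise 0."""
    return u ** 3 if u > 0 else 0.0

# Coefficients of Eq. 2 (PC10) and Eq. 3 (PC12), keyed by imaging variable.
PC10_WEIGHTS = {
    "T1": 4.17e-06, "FA": 2.24e-04, "FA numerator": 3.79e-01,
    "FA denominator": 7.78e-01, "Mean Diffusivity": 2.99e-01,
    "K2": 4.01e-01, "rCBF": 3.96e-05, "T2": -1.13e-06,
    "rCBVcorrected": -1.99e-05, "rCBVuncorrected": 1.85e-05,
    "rMTT": 5.18e-05, "TTP": 3.0e-05,
}
PC12_WEIGHTS = {
    "T1": -8.78e-08, "FA": 5.59e-05, "FA numerator": 3.58e-01,
    "FA denominator": -4.76e-01, "Mean Diffusivity": 8.02e-01,
    "T2": 7.39e-08, "K2": -8.57e-03, "rCBF": 1.58e-05,
    "rCBVcorrected": -8.77e-09, "rCBVuncorrected": -6.535e-07,
    "rMTT": 3.92e-06, "TTP": -5.25e-06,
}

def composite(weights, values):
    """Weighted sum of imaging values (Eq. 2 or Eq. 3)."""
    return sum(w * values[name] for name, w in weights.items())

def predicted_density(pc10, pc12):
    """Predicted nuclear density from Eq. 1."""
    a = pc10 * 1000
    b = pc12 * 10000
    return (1857.0969 + 322.9077 * a
            - 1996.2808 * pos_cube(a - 5.2251)
            + 5038.5490 * pos_cube(a - 5.4618)
            - 3043.2683 * pos_cube(a - 5.6171)
            - 1100.6736 * b
            + 14148.0490 * pos_cube(b + 0.5485)
            - 24336.1410 * pos_cube(b + 0.4333)
            + 10188.0910 * pos_cube(b + 0.2734))

# Placeholder PC values lying below every knot, so only the linear terms act.
density = predicted_density(0.005, -0.0001)
```

Below the lowest knot of each score, every truncated cubic vanishes and the prediction reduces to the intercept plus the two linear terms.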
Model adequacy, with respect to the final model’s ability to predict nuclear density, was assessed
based on a bias-corrected version of the multiple coefficient of determination (R²). The
bias-corrected R² was estimated via the bootstrap validation function “validate” of the HMISC library of
Spotfire Splus version 8.3 (TIBCO Inc., Palo Alto, CA). The bias-corrected R² essentially
represents the predicted R² after subtracting out the optimism in the observed value of R² induced by
the fact that the model parameter estimates were optimized to predict the observed values of the
response variable.
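The idea of subtracting optimism from the observed R² can be sketched with a generic optimism bootstrap. This is an illustration of the general technique, not a reproduction of the “validate” function: it assumes a simple straight-line model in place of the spline GEER model, resamples individual observations rather than patients, and uses simulated data; all names are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(model, x, y):
    """R2 of a fitted (intercept, slope) model evaluated on data (x, y)."""
    intercept, slope = model
    my = sum(y) / len(y)
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1.0 - sse / sst

def bias_corrected_r2(x, y, n_boot=200, seed=0):
    """Return (apparent R2, estimated optimism, bias-corrected R2)."""
    rng = random.Random(seed)
    n = len(x)
    apparent = r_squared(fit_line(x, y), x, y)
    optimism = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx, by = [x[i] for i in idx], [y[i] for i in idx]
        model = fit_line(bx, by)
        # Optimism: R2 on the resample the model was fit to, minus
        # R2 of that same model on the original data.
        optimism += r_squared(model, bx, by) - r_squared(model, x, y)
    optimism /= n_boot
    return apparent, optimism, apparent - optimism

# Simulated example data.
noise = random.Random(1)
x = [float(i) for i in range(30)]
y = [2.0 * xi + noise.gauss(0.0, 5.0) for xi in x]
apparent, optimism, corrected = bias_corrected_r2(x, y)
```

With few observations and a flexible model, the estimated optimism tends to be larger, which is why a corrected value is the more cautious summary of predictive ability.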
Statistical software: The principal components analysis and the univariate and multivariate
generalized estimating equation modeling were conducted using the Spotfire Splus
version 8.3 (TIBCO Inc., Palo Alto, CA) statistical package.