Online Appendix for the following JACC article

TITLE: Reproducibility of Echocardiographic Techniques for Sequential Assessment of Left Ventricular Ejection Fraction and Volumes: Application to Patients Undergoing Cancer Chemotherapy

AUTHORS: Paaladinesh Thavendiranathan, MD, MSc, Andrew D. Grant, MD, Tomoko Negishi, MD, Juan Carlos Plana, MD, Zoran B. Popović, MD, PhD, Thomas H. Marwick, MD, PhD, MPH

APPENDIX

Supplementary Methods

Expanded Statistical Analysis

To calculate the standard error of the measurement (SEM) for each technique over the entire follow-up period, one-way analysis of variance (ANOVA) was used. The EF or volume from each technique was used as the dependent variable, while patient (each patient was assigned an ID from 1 to 56) was used as the fixed factor. The square root of the error term (the residual mean square) was used as the measure of the temporal variability in EF or volumes for each technique.

Intra-observer and inter-observer variability were determined using the approach described by Eliasziw et al. (1). First, two-way ANOVA was performed using EF or volumes as the dependent variable, with observers and patients as fixed factors. From this analysis, the mean squares for observers, patients, the observer-patient interaction, and residuals were used as described by Eliasziw et al. to calculate the inter-observer and intra-observer mean square error (MSE). In this calculation the observers were treated as random factors. The square root of the MSE was taken as the variability. The intra-observer variability consisted of the overall (“average”) intra-observer variability across the two observers, while the inter-observer variability consisted of both the variability among observers’ measurements and the variability within observers’ measurements. This inter-observer variability is more clinically relevant because it reflects the disagreement between the observers as well as the imprecision with which each observer makes the measurements.
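The SEM calculation described at the start of this section (the square root of the error term from a one-way ANOVA with patient as the factor) can be sketched in a few lines. This is only an illustrative Python sketch, not the SPSS procedure used in the study; the function name `sem_one_way` and the sample EF values are hypothetical and are not study data.

```python
from math import sqrt


def sem_one_way(measurements):
    """Return the SEM from a one-way ANOVA with patient as the factor.

    `measurements` maps a patient ID to the list of repeated values
    (for example, EF at each visit) obtained with one technique.
    The SEM is the square root of the within-patient (error) mean square.
    """
    ss_within = 0.0  # error sum of squares
    df_within = 0    # error degrees of freedom
    for values in measurements.values():
        patient_mean = sum(values) / len(values)
        ss_within += sum((v - patient_mean) ** 2 for v in values)
        df_within += len(values) - 1
    if df_within == 0:
        raise ValueError("at least one patient needs two or more measurements")
    return sqrt(ss_within / df_within)


# Hypothetical EF values (as fractions) for three patients; NOT study data.
example = {
    1: [0.60, 0.62, 0.58],
    2: [0.55, 0.55, 0.55],
    3: [0.50, 0.54],
}
sem = sem_one_way(example)
print(f"SEM = {sem:.4f}")  # prints "SEM = 0.0179"
```

Pooling the squared deviations from each patient's own mean removes between-patient differences, so the resulting value reflects only the visit-to-visit scatter within patients.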
Traditionally, inter-observer variability estimates have captured only the variability between the two observers, assuming that the measurement by each observer is error free (an assumption that is not necessarily correct).

The inter-observer test-retest variability was also calculated using two-way ANOVA. EF or volumes were used as the dependent variable, while patients and observers were used as fixed factors. The Eliasziw et al. (1) method described above was then used to calculate the inter-observer test-retest variability. This measure consists of variability within observers, between observers, and over time.

All statistical analyses were performed using SPSS (ver 19.0.0, IBM Corporation, Chicago, IL).

Supplementary Results

Table A: Temporal variability represented as coefficient of variation (COV) and 95% CI for all methods for all follow-up periods.

Method                 EF COV (95% CI), %    EDV COV (95% CI), %    ESV COV (95% CI), %
Bi-Plane               7.4 (6.2 – 9.1)       16.2 (13.7 – 20.0)     22.0 (18.5 – 27.0)
Bi-Plane + Contrast    8.4 (7.0 – 10.5)*     16.0 (13.3 – 20.0)     23.6 (19.7 – 29.6)
Triplane               9.4 (7.9 – 11.5)      23.0 (19.4 – 28.2)*    26.2 (22.1 – 32.3)
Triplane + Contrast    9.4 (7.8 – 11.8)      20.1 (16.7 – 25.2)     23.6 (19.7 – 29.7)
3D                     4.0 (3.3 – 4.9)       11.9 (10.0 – 14.7)     13.2 (11.1 – 16.2)
3D + Contrast          7.2 (6.0 – 9.1)*      16.6 (13.8 – 20.9)*    20.0 (16.5 – 25.1)*

Non-contrast 3D had the lowest temporal variability based on COV for EF, EDV, and ESV compared to all other methods (p<0.01 for all).
*Statistically different when compared to the respective non-contrast method (p<0.05).

Table B: Temporal variability and 95% CI for the first 3 visits only, where data were available for all patients (N=56).

Method                 EF SEM (95% CI)           EDV SEM (95% CI), ml    ESV SEM (95% CI), ml
Bi-Plane               0.050 (0.045 – 0.055)     19.1 (17.2 – 21.3)      7.7 (7.0 – 8.7)
Bi-Plane + Contrast    0.057 (0.051 – 0.065)     16.2 (14.4 – 18.4)*     7.9 (7.0 – 8.9)
Triplane               0.061 (0.055 – 0.069)     15.4 (13.9 – 17.3)      8.0 (7.2 – 9.0)
Triplane + Contrast    0.069 (0.061 – 0.078)     19.7 (17.5 – 22.2)*     8.8 (7.8 – 9.9)
3D                     0.027 (0.024 – 0.031)     10.0 (8.9 – 11.2)       4.5 (4.0 – 5.0)
3D + Contrast          0.052 (0.046 – 0.060)*    17.0 (15.40 – 19.3)*    7.6 (6.7 – 8.6)*

Non-contrast 3D had the lowest temporal variability for EF, EDV, and ESV compared to all other methods (p<0.01 for all).
*Statistically different when compared to the respective non-contrast method (p<0.05).

Table C: Temporal variability and 95% CI for any 3 visits where data from all techniques were available at each visit (N=32).

Method                 EF SEM (95% CI)           EDV SEM (95% CI), ml    ESV SEM (95% CI), ml
Bi-Plane               0.049 (0.043 – 0.057)     13.9 (12.1 – 16.1)      8.5 (7.4 – 9.8)
Bi-Plane + Contrast    0.051 (0.044 – 0.059)     15.0 (13.0 – 17.3)      7.1 (6.2 – 8.3)*
Triplane               0.063 (0.055 – 0.073)     21.0 (18.3 – 24.3)      9.8 (8.6 – 11.4)
Triplane + Contrast    0.061 (0.053 – 0.070)     21.0 (18.3 – 24.3)      10.7 (9.4 – 12.4)
3D                     0.029 (0.025 – 0.033)     10.1 (8.8 – 11.7)       4.5 (3.9 – 5.2)
3D + Contrast          0.055 (0.048 – 0.064)*    18.9 (16.5 – 21.9)*     8.2 (7.2 – 9.5)*

Non-contrast 3D had the lowest temporal variability for EF, EDV, and ESV compared to all other methods (p<0.01 for all).
*Statistically different when compared to the respective non-contrast method (p<0.05).

References

1. Eliasziw M, Young SL, Woodbury MG, Fryday-Field K. Statistical methodology for the concurrent assessment of interrater and intrarater reliability: using goniometric measurements as an example. Physical Therapy 1994;74:777-88.