VALIDATION OF GEOSPATIAL MODELS USING EQUIVALENCE TESTS

B. Tyler Wilson
NC Research Station

Numerous modeling efforts, both within and outside of the Forest Service, are underway to develop maps of forest attributes (e.g., area, volume, and growth) utilizing satellite imagery and other geospatial datasets. More rigorous statistical tools must be developed in order to evaluate these models and their resultant maps. This paper proposes a method for validating geospatial models and maps of forest attributes by using FIA plot data with equivalence tests. Unlike traditional significance testing for model validation, equivalence testing posits as the null hypothesis that the test statistics for the population of observations and predictions are different. With sufficient evidence the null hypothesis can be rejected and the model can be validated. This differs from traditional tests, where a failure to reject the null hypothesis does not suggest that the model has been validated. The proposed methodology is applied to several geospatial models of forest area for illustration.

Workshop Proceedings: Quantitative Techniques for Deriving National Scale Data

Validation of geospatial models using equivalence tests
B. Tyler Wilson, Mark H. Hansen, Ronald E. McRoberts
North Central Research Station, St. Paul, MN 55108
barrywilson@fs.fed.us

Evaluation of geospatial models
• FIA data used with numerous predictive models
• Not only “how much”, but “where”
• Geospatial data as input
  – Satellite imagery
  – Other raster data (e.g., DEM, climate)
  – Vector data (e.g., ecological regions, soils)
• Produce maps as output
• How good are resultant maps?

Primary reference
• Robinson, Andrew P., Remko A. Duursma, and John D. Marshall. “A regression-based equivalence test for model validation: shifting the burden of proof.” Tree Physiology, 25 (2005): 903-13.
• Derived from bioequivalence testing
• Used to compare efficacy of drugs

Hypothesis testing for models
• Compare two populations
  – observations
  – predictions
• Test statistic
  – e.g., mean difference between associated pairs

Traditional significance test
• Hypotheses
  – Null is that mean difference = 0
  – Alternative is that mean difference ≠ 0
• Specify α (region of rejection)
• Rejection of null hypothesis
  – acceptance of alternative hypothesis
• Failure to reject null hypothesis
  – not acceptance of null
  – simply lack of evidence
• Burden of proof misplaced for model validation

Equivalence test
• Hypotheses
  – Null is that mean difference ≠ 0
  – Alternative is that mean difference = 0
• Specify α and θ (region of equivalence)
• Rejection of null hypothesis
  – acceptance of alternative hypothesis
  – validates model
• Failure to reject null hypothesis
  – not acceptance of null
  – lack of evidence to validate model

Regression-based validation procedure
1. Tabulate observations and predictions
2. Subtract mean prediction from predictions
3. Define regions of equivalence
4. Fit linear regression
5. Test null hypotheses of dissimilarity

Interpreting the results
• Model validated if the confidence interval (at the chosen α) lies within the region of equivalence
• Separate tests for intercept and slope
• Alternative is to report the minimum θ at which the null hypothesis is rejected

An illustration
[Figure: four panels labeled a. to d.; images not reproduced]

Modeling forest area
• Compare three geospatial models
• Produce prediction maps of forest area
• Observations estimated from FIA plots

Name        Technique         Imagery       Type
FIA-DT      Decision tree     250 m MODIS   Thematic
FIA-Logit   Logistic          250 m MODIS   Thematic
VCF         Regression tree   500 m MODIS   Continuous

Evaluation area
• Subset of Minnesota
• Easily applied to larger region

Circular estimation units
• Spatial mismatch between plots and pixels
• Use circular estimation units instead
• Random center points
• 500 circles each at radius of 5 - 100 km
• Each circle a data point in validation procedure
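
The circular estimation units described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the presentation: the square 300 km study area, the synthetic plot locations and forest indicators, and the helper names `random_centers` and `circle_mean`. Coordinates are assumed to be in a projected system measured in kilometers.

```python
import math
import random


def random_centers(n, bounds, rng):
    """Draw n uniformly random points inside a box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = bounds
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]


def circle_mean(center, radius, points, values):
    """Mean of the values whose points lie within `radius` of `center`.

    Returns None when the circle contains no points.
    """
    cx, cy = center
    inside = [v for (x, y), v in zip(points, values)
              if math.hypot(x - cx, y - cy) <= radius]
    return sum(inside) / len(inside) if inside else None


rng = random.Random(0)
bounds = (0.0, 0.0, 300.0, 300.0)            # hypothetical study area, km
plots = random_centers(1000, bounds, rng)    # synthetic plot locations
forest = [1.0 if rng.random() < 0.6 else 0.0 for _ in plots]  # synthetic 0/1 forest

centers = random_centers(500, bounds, rng)   # 500 random circle centers
radii = (5.0, 20.0, 60.0, 100.0)             # km
estimates = {r: [circle_mean(c, r, plots, forest) for c in centers]
             for r in radii}
```

Each entry of `estimates[r]` is one plot-based proportion of forest for one circle, or `None` where a small circle happens to contain no plots. The same helper could be applied to pixel centers and pixel predictions to obtain the matching model-based value for each circle.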

Two estimates for each circle
• Compute estimates from FIA plot observations and model predictions
• Model-based estimate is the average of all pixels within a circle
• Estimates are proportion forest
[Figure: example circle showing the two estimates, 68% forest and 73% forest. *Not true plot locations]

Coefficient of determination
[Figure: R2 (axis 0 to 1) by circle radius (5km, 20km, 60km, 100km) for FIA-DT, FIA-Logit, and VCF]

Standard error of regression
[Figure: RMSE (axis 0 to 0.16) by circle radius (5km, 20km, 60km, 100km) for FIA-DT, FIA-Logit, and VCF]

Equivalence test of intercept
[Figure: θ (α = .05) (axis 0 to 0.025) by circle radius (5km, 20km, 60km, 100km) for FIA-DT, FIA-Logit, and VCF]

Equivalence test of slope
[Figure: θ (α = .05) (axis 0 to 0.16) by circle radius (5km, 20km, 60km, 100km) for FIA-DT, FIA-Logit, and VCF]

Conclusions
• Readily applied at national scale
• Coarse or fine spatial resolution
• Relatively simple to implement
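
As a closing illustration, the regression-based validation procedure and the minimum θ reported in the equivalence test charts can be sketched in Python. This is one possible reading and not the authors' implementation. It assumes the two one-sided tests formulation common in bioequivalence work, a region of equivalence of (mean prediction) × (1 ± θ) for the intercept and 1 ± θ for the slope, and a normal quantile in place of the t quantile. The function name `equivalence_summary` and the synthetic data are hypothetical.

```python
import math
import random
from statistics import NormalDist


def equivalence_summary(obs, pred, alpha=0.05):
    """Regress observations on mean-centered predictions and report the
    smallest relative theta at which both one-sided tests would reject
    the null hypothesis of dissimilarity.

    Assumes the mean prediction is nonzero and the predictions vary.
    """
    n = len(obs)
    xbar = sum(pred) / n
    ybar = sum(obs) / n
    xc = [x - xbar for x in pred]            # subtract mean prediction
    yc = [y - ybar for y in obs]
    sxx = sum(x * x for x in xc)
    slope = sum(x * y for x, y in zip(xc, yc)) / sxx
    intercept = ybar                         # OLS intercept with a centered predictor
    resid = [y - slope * x for x, y in zip(xc, yc)]
    s2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    se_int = math.sqrt(s2 / n)
    se_slope = math.sqrt(s2 / sxx)
    # Assumption: normal quantile approximates the t quantile for large n.
    z = NormalDist().inv_cdf(1.0 - alpha)
    ci_int = (intercept - z * se_int, intercept + z * se_int)
    ci_slope = (slope - z * se_slope, slope + z * se_slope)
    # Assumption: regions are xbar * (1 +/- theta) and 1 +/- theta.
    theta_int = max(abs(ci_int[0] - xbar), abs(ci_int[1] - xbar)) / abs(xbar)
    theta_slope = max(abs(ci_slope[0] - 1.0), abs(ci_slope[1] - 1.0))
    return {"intercept": intercept, "slope": slope,
            "theta_intercept": theta_int, "theta_slope": theta_slope}


# Synthetic example: 500 circles, predictions close to observations.
rng = random.Random(1)
pred = [rng.uniform(0.2, 0.9) for _ in range(500)]
obs = [p + rng.gauss(0.0, 0.05) for p in pred]
result = equivalence_summary(obs, pred)
```

Under these assumptions, a model would be considered validated for a chosen θ when both `theta_intercept` and `theta_slope` are no larger than that θ. Reporting the two minimum values avoids fixing θ in advance.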