SUPPLEMENTARY INFORMATION

Review of heterogeneity modelling approaches and additional tables/figures

There is a large body of research on estimating effective hydraulic parameters for heterogeneous aquifer systems (e.g., Renard and de Marsily 1997; Sanchez-Vila et al. 2006). These approaches range from averaging point data using an upscaling formula to interpreting multiple drawdown curves from pumping tests through type curves (Neuman et al. 2007) or inverse modeling to estimate effective parameters and to obtain uncertainty estimates. While effective parameter estimates may be applicable to larger-scale problems, or to aquifer systems with a relatively low variance, using effective parameters to predict drawdown responses or transport over short distances in highly heterogeneous systems can pose a challenge. As such, methods for mapping the heterogeneous distribution of hydraulic parameters are commonly employed.

One of the most common approaches for interpolating small-scale data is geostatistics (de Marsily et al. 2005). However, traditional geostatistical methods such as kriging tend to provide a smooth image of the spatial heterogeneity and may not represent the subsurface heterogeneity accurately. Although a variety of stochastic simulation techniques (e.g., Deutsch and Journel 1998) can overcome this smoothing, many of them do not preserve geological features such as morphology and facies assemblages. This is because traditional geostatistical methods are based on variograms, which are computed using two-point statistics. To overcome this shortcoming, multiple-point geostatistics (e.g., Guardiano and Srivastava 1993; Caers 2001; Strebelle 2002) has been developed. It uses more complex point configurations, whose statistics are retrieved from training images that represent the geological facies distributions obtained from outcrop mapping and/or geophysical imaging.
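To make the two-point nature of the variogram concrete, the following minimal sketch computes an empirical semivariogram of ln K along a single profile and evaluates an exponential model of the kind listed in Tables S1 and S2. It is an illustration only, not the procedure used in this study: the function names, the binning rule, and the sample values are hypothetical, and the exponential model is assumed to take the form sill × (1 − exp(−h/λ)), with λ the correlation length.

```python
import math


def empirical_semivariogram(positions, values, lag, n_lags):
    """Return a list of (h, gamma, n_pairs) for lag classes h = lag, 2*lag, ...

    gamma(h) = sum((v_i - v_j)^2) / (2 * N(h)) over the N(h) pairs whose
    separation is closest to h. Only pairs of points enter the estimate,
    which is why the variogram is called a two-point statistic.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            h = abs(positions[j] - positions[i])
            k = int(h / lag + 0.5) - 1  # index of the nearest lag class
            if 0 <= k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    result = []
    for k in range(n_lags):
        gamma = sums[k] / (2.0 * counts[k]) if counts[k] else float("nan")
        result.append(((k + 1) * lag, gamma, counts[k]))
    return result


def exponential_variogram(h, sill, corr_length):
    """Exponential model (assumed form): sill * (1 - exp(-h / corr_length))."""
    return sill * (1.0 - math.exp(-h / corr_length))


if __name__ == "__main__":
    # Hypothetical ln K samples at 1 m spacing along one borehole.
    depth = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ln_k = [-12.1, -11.4, -13.0, -15.2, -14.8, -12.9]
    for h, gamma, n_pairs in empirical_semivariogram(depth, ln_k, 1.0, 3):
        print(h, round(gamma, 3), n_pairs)
```

A model fit would then adjust the sill and correlation length until the model curve follows the empirical points. Because only pairs of points are compared, curvilinear features such as channel shapes are not captured, which is the limitation that multiple-point methods address.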
Alternative approaches to representing abrupt changes in parameter values from one layer to the next, or to resolving facies, are based on categorical interpolation methods such as indicator kriging (e.g., Journel 1983; Journel and Isaaks 1984; Johnson and Dreiss 1989; Journel and Alabert 1990; Journel and Gomez-Hernandez 1993) and Transition Probability/Markov Chain geostatistical methods (Carle 1999; Carle and Fogg 1997; Weissmann et al. 1999). These approaches interpolate categories, as opposed to continuous values, making it possible to reproduce abrupt material changes and juxtapositional tendencies of different hydrofacies. Where high-resolution conditioning data are available, realistic hydrofacies models can be constructed.

Another approach is to construct geological models based on stratigraphic or hydrostratigraphic units (e.g., Martin and Frind 1998; Jones et al. 2008). These models are often based on the interpretation of soil cores collected during the installation of wells. While soil cores can provide information on material types along a given borehole, their collection and analysis are expensive and, depending on the material, sample recovery can be poor. This can pose a challenge for mapping the lateral extent of layers or their connectivity at a site. Because lateral information is often lacking, stratigraphy is commonly interpolated manually, with interpolation algorithms, or through genesis models that consider the geological processes that create sedimentary units (Koltermann and Gorelick 1996; Teles et al. 2004; de Vries et al. 2009; Ronayne et al. 2010). Once constructed, individual layers within these models can either be assigned hydraulic parameter values deterministically, or the values can be estimated through calibration of a groundwater model.
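As an illustration of the categorical (transition probability) idea described above, the sketch below estimates one-lag transition probabilities from a sequence of facies codes logged at regular intervals along a borehole. It is a simplified stand-in, not the T-PROGS workflow used for Approach 3: the function name and the facies labels are hypothetical, and a full Markov chain model would fit continuous-lag transition rates in each direction.

```python
from collections import defaultdict


def transition_probabilities(sequence, lag=1):
    """Estimate t[a][b] = P(facies b at z + lag | facies a at z).

    `sequence` is a list of facies codes sampled at a regular interval,
    and `lag` is the separation expressed in number of samples.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(sequence, sequence[lag:]):
        counts[a][b] += 1
    categories = sorted(set(sequence))
    probs = {}
    for a in categories:
        total = sum(counts[a].values())
        # Each row sums to 1 whenever facies a has at least one successor.
        probs[a] = {b: (counts[a][b] / total if total else 0.0)
                    for b in categories}
    return probs


if __name__ == "__main__":
    # Hypothetical core log, top to bottom, one code per sampling interval.
    log = ["sand", "sand", "clay", "clay", "clay", "sand", "silt", "silt"]
    for facies, row in transition_probabilities(log).items():
        print(facies, row)
```

The diagonal entries describe how persistent each facies is, while the off-diagonal entries describe which facies tend to follow one another, which is the juxtapositional tendency mentioned above.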
Over the last several decades, significant progress has also been made in the development of geostatistical and stochastic inverse methods (e.g., Kitanidis and Vomvoris 1983; Hoeksema and Kitanidis 1984, 1989; Rubin and Dagan 1987, 1992; Gutjahr and Wilson 1989; Harvey and Gorelick 1995; Kitanidis 1995; LaVenue et al. 1995; RamaRao et al. 1995; Yeh et al. 1995, 1996; Gómez-Hernández et al. 1997; Vesselinov et al. 2001; Hernandez et al. 2003, 2006; Alcolea et al. 2006, 2008; Riva et al. 2009) to obtain maps of K heterogeneity. Zimmerman et al. (1998) used synthetic data to compare seven geostatistically based inverse approaches for estimating transmissivities for modeling advective transport by groundwater flow. One important finding of that study was that the proper selection of the variogram of the log10 transmissivity field had a significant impact on the accuracy and precision of the transport predictions. More recently, Hendricks Franssen et al. (2009) compared more modern methods for geostatistical inverse modeling, but again, the study was based on synthetic data and only one groundwater flow and transport scenario was considered.

Additional references not provided in the main text

Alcolea, A., J. Carrera, and A. Medina. 2006. Pilot points method incorporating prior information for solving the groundwater flow inverse problem. Advances in Water Resources 29, no. 11: 1678–89. doi:10.1016/j.advwatres.2005.12.009.
Alcolea, A., J. Carrera, and A. Medina. 2008. Regularized pilot points method for reproducing the effect of small-scale variability: application to simulations of contaminant transport. Journal of Hydrology 355, no. 1–4: 76–90. doi:10.1016/j.jhydrol.2008.03.004.
Caers, J. 2001. Geostatistical reservoir modelling using statistical pattern recognition. Journal of Petroleum Science and Engineering 29: 177–188.
de Vries, L. M., J. Carrera, O. Falivene, O. Gratacos, and L. J. Slooten. 2009.
Application of multiple point geostatistics to non-stationary images. Mathematical Geosciences 41, no. 1: 29–42.
Gómez-Hernández, J. J., A. Sahuquillo, and J. E. Capilla. 1997. Stochastic simulation of transmissivity fields conditional to both transmissivity and piezometric data, 1, Theory. Journal of Hydrology 203: 162–174.
Guardiano, F., and R. M. Srivastava. 1993. Multivariate geostatistics: beyond bivariate moments. In: Soares A (ed) Geostatistics-Troia. Kluwer, Dordrecht, pp 133–144.
Gutjahr, A. L., and J. L. Wilson. 1989. Cokriging for stochastic models. Transport in Porous Media 4, no. 6: 585–598.
Hernandez, A. F., S. P. Neuman, A. Guadagnini, and J. Carrera. 2003. Conditioning mean steady state flow on hydraulic head and conductivity through geostatistical inversion. Stochastic Environmental Research and Risk Assessment 17, no. 5: 329–338. doi:10.1007/s00477-003-0154-4.
Hernandez, A. F., S. P. Neuman, A. Guadagnini, and J. Carrera. 2006. Inverse stochastic moment analysis of steady state flow in randomly heterogeneous media. Water Resources Research 42: W05425. doi:10.1029/2005WR004449.
Hoeksema, R. J., and P. K. Kitanidis. 1984. An application of the geostatistical approach to the inverse problem in two-dimensional groundwater modeling. Water Resources Research 20, no. 7: 1003–1020.
Hoeksema, R. J., and P. K. Kitanidis. 1989. Prediction of transmissivities, heads, and seepage velocities using mathematical modeling and geostatistics. Advances in Water Resources 12: 90–101.
Johnson, N. M., and S. J. Dreiss. 1989. Hydrostratigraphic interpretation using indicator geostatistics. Water Resources Research 25, no. 12: 2501–2510.
Journel, A. G. 1983. Nonparametric estimation of spatial distribution. Mathematical Geology 15, no. 3: 445–468.
Journel, A. G., and E. K. Isaaks. 1984. Conditional indicator simulation: application to a Saskatchewan uranium deposit. Mathematical Geology 16, no. 7: 685–718.
Journel, A. G., and F. G. Alabert. 1990.
New method for reservoir mapping. Journal of Petroleum Technology 42, no. 2: 212–218.
Journel, A. G., and J. Gomez-Hernandez. 1993. Stochastic imaging of the Wilmington clastic sequence. Society of Petroleum Engineering Formation Evaluation 8, no. 1: 33–40.
Kitanidis, P. K., and E. G. Vomvoris. 1983. A geostatistical approach to the inverse problem in groundwater modeling and one-dimensional simulations. Water Resources Research 19, no. 3: 677–690.
LaVenue, A. M., B. S. RamaRao, G. de Marsily, and M. G. Marietta. 1995. Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 2, Application. Water Resources Research 31, no. 3: 495–516.
Neuman, S. P., A. Blattstein, M. Riva, D. M. Tartakovsky, A. Guadagnini, and T. Ptak. 2007. Type curve interpretation of late-time pumping test data in randomly heterogeneous aquifers. Water Resources Research 43, no. 10: W10421. doi:10.1029/2007WR005871.
RamaRao, B. S., A. M. LaVenue, G. de Marsily, and M. G. Marietta. 1995. Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields, 1, Theory and computational experiments. Water Resources Research 31, no. 3: 475–493.
Riva, M., A. Guadagnini, S. P. Neuman, E. Bianchi Janetti, and B. Malama. 2009. Inverse analysis of stochastic moment equations for transient flow in randomly heterogeneous media. Advances in Water Resources 32: 1495–1507. doi:10.1016/j.advwatres.2009.07.003.
Ronayne, M. J., S. M. Gorelick, and C. Zheng. 2010. Geological modeling of submeter scale heterogeneity and its influence on tracer transport in a fluvial aquifer. Water Resources Research 46: W10519. doi:10.1029/2010WR009348.
Rubin, Y., and G. Dagan. 1987. Stochastic identification of transmissivity and effective recharge in steady groundwater flow, 1, Theory. Water Resources Research 23, no. 7: 1185–1192.
Rubin, Y., and G. Dagan. 1992.
Conditional estimation of solute travel time in heterogeneous formations: Impact of transmissivity measurements. Water Resources Research 28, no. 4: 1033–1040.
Strebelle, S. 2002. Conditional simulation of complex geological structures using multiple-point statistics. Mathematical Geology 34: 1–22.
Teles, V., F. Delay, and G. de Marsily. 2004. Comparison between different methods for characterizing the heterogeneity of alluvial media: groundwater flow and transport simulations. Journal of Hydrology 294, no. 1–3: 103–121.
Vesselinov, V. V., S. P. Neuman, and W. A. Illman. 2001. Three-dimensional numerical inversion of pneumatic cross-hole tests in unsaturated fractured tuff: 2. Equivalent parameters, high-resolution stochastic imaging and scale effects. Water Resources Research 37, no. 12: 3019–3042.
Weissmann, G., S. Carle, and G. Fogg. 1999. Three-dimensional hydrofacies modeling based on soil surveys and transition probability geostatistics. Water Resources Research 35, no. 6: 1761–1770.
Yeh, T.-C. J., A. L. Gutjahr, and M. G. Jin. 1995. An iterative cokriging-like technique for groundwater-flow modeling. Ground Water 33, no. 1: 33–41.

Table S1: Geometric mean, variance, and correlation lengths of ln K for each approach.

Approach / case     Kx (m/s)      Ky (m/s)      Kz (m/s)      KG (m/s)      σ²lnK   Sill       λx     λy     λz     Model
1. Kriging                                                    4.0 × 10^-8   5.5     4.3 (8z)   19.4   19.4   7.2    Exponential (Gaussian z)
2. Effective parameter model
  Case 1 (PW1-3)    1.4 × 10^-5   7.3 × 10^-5   2.0 × 10^-7   -             -       -          -      -      -      -
  Case 2 (PW3-3)    4.0 × 10^-6   4.8 × 10^-6   3.0 × 10^-8   -             -       -          -      -      -      -
  Case 3 (PW4-3)    2.9 × 10^-5   3.0 × 10^-5   1.0 × 10^-6   -             -       -          -      -      -      -
  Case 4 (PW5-3)    4.2 × 10^-6   9.6 × 10^-6   2.0 × 10^-7   -             -       -          -      -      -      -
3. Transition Probability/Markov Chain model
  Case 1 (PW1-3)                                              1.3 × 10^-6   6.5     6.5        1.4    0.9    0.5    Exponential
  Case 2 (PW3-3)                                              5.6 × 10^-8   25      25         2.3    1.85   0.5    Exponential
  Case 3 (PW4-3)                                              6.5 × 10^-7   15.2    15.3       1.4    1.4    0.5    Exponential
  Case 4 (PW5-3)                                              3.5 × 10^-7   13.8    13.6       1.4    1.4    0.5    Exponential
4. Geological model
  Case 1 (PW1-3)                                              1.6 × 10^-7   3.7     3.8        11.7   11.7   2.3    Exponential
  Case 2 (PW3-3)                                              9.0 × 10^-8   0.2     0.2        8.1    7.2    1.4    Exponential
  Case 3 (PW4-3)                                              2.3 × 10^-7   5.5     5.2        13.5   12.2   3.2    Exponential
  Case 4 (PW5-3)                                              1.6 × 10^-7   4.1     4.0        21.1   18.5   4.1    Exponential
5. Stochastic inverse model with conditioning
  Case 1 (PW1-3)                                              1.1 × 10^-6   3.7     3.5        15.3   14.0   4.1    Exponential
  Case 2 (PW3-3)                                              9.2 × 10^-7   3.3     3.5        14.9   14.9   4.1    Exponential
  Case 3 (PW4-3)                                              1.4 × 10^-6   4.1     4.0        13.5   11.7   4.1    Exponential
  Case 4 (PW5-3)                                              9.8 × 10^-7   4.0     4.0        8.6    8.6    3.2    Exponential
6. Transient hydraulic tomography (unconditioned; PW1-3, PW3-3, PW4-3, PW5-3)
                                                              7.0 × 10^-6   4.3     4.7        5.4    5.4    1.8    Exponential
7. Transient hydraulic tomography (conditioned; PW1-3, PW3-3, PW4-3, PW5-3)
                                                              1.3 × 10^-6   4.8     5.3        9.5    9.5    2.3    Exponential

KG = geometric mean of K; σ²lnK = variance of ln K; λx, λy, λz = correlation lengths; * = Calculated from raw data; KG for the EPM-calibrated case is the geometric mean of the K in the principal directions; - = data not available; z = vertical orientation.

Table S2: Geometric mean, variance, and correlation lengths of ln Ss for each approach.

Approach / case     Ss (m^-1)     SsG (m^-1)    σ²lnSs      Sill   λx     λy     λz     Model
2. Effective parameter model
  Case 1 (PW1-3)    1.0 × 10^-7   -             -           -      -      -      -      -
  Case 2 (PW3-3)    6.8 × 10^-4   -             -           -      -      -      -      -
  Case 3 (PW4-3)    1.9 × 10^-6   -             -           -      -      -      -      -
  Case 4 (PW5-3)    1.4 × 10^-7   -             -           -      -      -      -      -
3. Transition Probability/Markov Chain model
  Case 1 (PW1-3)                  1.3 × 10^-7   0.2         0.3    1.8    1.4    0.9    Exponential
  Case 2 (PW3-3)                  6.5 × 10^-8   0.3         0.3    1.4    1.4    0.5    Exponential
  Case 3 (PW4-3)                  1.3 × 10^-6   0.1         0.1    1.4    1.4    0.5    Exponential
  Case 4 (PW5-3)                  1.3 × 10^-6   0.1         0.1    1.4    1.4    0.5    Exponential
4. Geological model
  Case 1 (PW1-3)                  1.4 × 10^-4   1.0         1.0    6.3    6.3    0.9    Exponential
  Case 2 (PW3-3)                  7.8 × 10^-5   0.2         0.2    5.9    5.9    1.8    Exponential
  Case 3 (PW4-3)                  4.3 × 10^-5   4.6         4.5    13.1   12.2   5.0    Exponential
  Case 4 (PW5-3)                  1.5 × 10^-4   0.7         0.6    2.3    1.8    0.5    Exponential
5. Stochastic inverse model with conditioning
  Case 1 (PW1-3)                  8.3 × 10^-5   0.5         0.7    10.4   10.4   5.0    Exponential
  Case 2 (PW3-3)                  7.2 × 10^-5   0.4         0.5    11.7   11.7   4.1    Exponential
  Case 3 (PW4-3)                  1.1 × 10^-4   0.6         0.7    10.8   10.8   8.1    Exponential
  Case 4 (PW5-3)                  8.7 × 10^-5   0.6         0.9    9.5    8.1    3.2    Exponential
6. Transient hydraulic tomography (unconditioned; PW1-3, PW3-3, PW4-3, PW5-3)
                                  8.9 × 10^-5   0.7 (1.3z)  0.7    9.9    9.9    5.9    Gaussian
7. Transient hydraulic tomography (conditioned; PW1-3, PW3-3, PW4-3, PW5-3)
                                  1.1 × 10^-4   1.1         1.3    8.6    8.6    3.6    Exponential

SsG = geometric mean of Ss; σ²lnSs = variance of ln Ss; λx, λy, λz = correlation lengths; * = Calculated from raw data; - = data not available; z = vertical orientation.

Table S3: Statistics of the linear model fit and coefficient of determination (R2) from scatterplots of simulated versus observed drawdowns during model calibration.

Approach                                             Statistic   PW1-3   PW3-3   PW4-3   PW5-3   Average
2. Effective parameter model                         Slope       0.64    0.06    0.60    0.41    0.43
                                                     Intercept   0.08    0.06    0.06    0.06    0.07
                                                     R2          0.62    0.05    0.57    0.39    0.41
3. Transition Probability/Markov Chain model         Slope       0.15    0.23    0.74    0.08    0.30
                                                     Intercept   0.19    0.05    0.50    0.04    0.20
                                                     R2          0.16    0.26    0.14    0.25    0.20
4. Geological model                                  Slope       0.43    0.04    1.54    0.19    0.55
                                                     Intercept   0.03    0.00    0.00    0.00    0.01
                                                     R2          0.38    0.36    0.34    0.31    0.35
5. Stochastic inverse model with conditioning        Slope       0.18    0.58    0.71    0.58    0.51
                                                     Intercept   0.02    0.02    0.03    0.01    0.02
                                                     R2          0.39    0.66    0.79    0.90    0.69
6. Transient hydraulic tomography (unconditioned)    Slope       0.96    0.87    0.83    0.58    0.81
                                                     Intercept   -0.01   0.01    0.04    0.01    0.01
                                                     R2          0.89    0.86    0.95    0.93    0.91
7. Transient hydraulic tomography (conditioned)      Slope       0.29    0.82    0.76    0.56    0.61
                                                     Intercept   -0.01   0.01    0.02    0.00    0.01
                                                     R2          0.57    0.94    0.85    0.88    0.81

Table S4: Statistics of the linear model fit and coefficient of determination (R2) from scatterplots of simulated versus observed drawdowns during model validation. Slope Max 7.58 Approach Min Mean 1. Kriging 0.08 3.15 2. Effective parameter model Case 1 (PW1-3) 0.20 1.19 0.55 Case 2 (PW3-3) -0.05 0.32 0.10 Case 3 (PW4-3) 0.09 0.55 0.25 Case 4 (PW5-3) 0.36 2.12 0.88 3. Transition Probability/Markov Chain model Case 1 (PW1-3) 0.07 0.38 0.22 Case 2 (PW3-3) 0.02 1.29 0.38 Case 3 (PW4-3) 0.03 2.77 0.37 Case 4 (PW5-3) 0.03 0.47 0.18 4.
Geological model Case 1 (PW1-3) 0.27 1.06 0.57 Case 2 (PW3-3) 0.36 4.02 1.30 Case 3 (PW4-3) 0.27 2.90 1.11 Case 4 (PW5-3) 0.04 31.69 10.69 5. Stochastic inverse model with conditioning Case 1 (PW1-3) 0.11 1.28 0.46 Case 2 (PW3-3) Case 3 (PW4-3) Case 4 (PW5-3) 6. Transient hydraulic tomography (unconditioned) 7. Transient hydraulic tomography (conditioned) Min 0.03 R2 Max 0.25 Mean 0.14 0.04 0.28 0.02 0.07 0.09 0.00 0.19 0.14 0.57 0.04 0.63 0.67 0.29 0.02 0.39 0.35 0.40 0.55 12.13 0.09 0.11 0.18 1.38 0.04 0.05 0.01 0.01 0.00 0.24 0.29 0.18 0.48 0.17 0.14 0.09 0.26 -0.01 0.09 0.02 -2.64 0.09 1.00 0.14 0.35 0.03 0.32 0.08 -0.29 0.04 0.02 0.08 0.00 0.70 0.13 0.37 0.73 0.43 0.08 0.23 0.35 Min 0.02 Intercept Max 0.61 Mean 0.13 0.01 0.04 0.01 0.01 0.08 0.98 0.05 0.13 0.02 0.01 0.11 0.00 0.17 0.10 0.11 1.74 1.08 1.87 0.65 0.34 0.61 0.02 0.03 0.01 0.02 0.28 0.42 0.09 0.27 0.02 0.12 0.04 0.09 0.09 0.03 0.09 0.02 0.30 0.40 0.48 0.75 0.15 0.14 0.20 0.21 0.05 0.93 0.46 0.00 0.12 0.03 0.00 0.82 0.50 0.16 0.87 0.44 0.01 0.14 0.05 0.01 0.83 0.36 Figure S1: Location of core samples used for permeameter analysis to create the kriged K field in Alexander et al. (2011) and this study. These data are also utilized to condition some of the models in this study. Figure S2: Transition probability matrix for the horizontal direction. The dots are the measured transition probabilities and the solid line is the data fit by the Markov chain. Figure S3: Transition probability matrix in the vertical direction. The dots are the measured transition probabilities and the solid line is the data fit by the Markov chain. Figure S4: Observed vs. simulated drawdown for each of the 10 TPROGs realizations. The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot. 
Figure S5: K and Ss fields from the stochastic inversion of a pumping test performed at PW3-3 conditioned to permeameter K data (Approach 5, Case 2): a) K field; b) variance associated with the estimated K field; c) Ss field; and d) variance associated with the estimated Ss field. Note that the square formed by the slices in the main figure corresponds to the outer edges of the field plot. The inset image for each figure is a cross-section through the middle of the central square. This corresponds to cross-sections through CMT2 to CMT1 (section oriented N to S), and CMT4 to CMT3 (section oriented E to W). The black open circle indicates the pumped location.

Figure S6: K and Ss fields from the stochastic inversion of a pumping test performed at PW4-3 conditioned to permeameter K data (Approach 5, Case 3): a) K field; b) variance associated with the estimated K field; c) Ss field; and d) variance associated with the estimated Ss field. Note that the square formed by the slices in the main figure corresponds to the outer edges of the field plot. The inset image for each figure is a cross-section through the middle of the central square. This corresponds to cross-sections through CMT2 to CMT1 (section oriented N to S), and CMT4 to CMT3 (section oriented E to W). The black open circle indicates the pumped location.

Figure S7: K and Ss fields from the stochastic inversion of a pumping test performed at PW5-3 conditioned to permeameter K data (Approach 5, Case 4): a) K field; b) variance associated with the estimated K field; c) Ss field; and d) variance associated with the estimated Ss field. Note that the square formed by the slices in the main figure corresponds to the outer edges of the field plot. The inset image for each figure is a cross-section through the middle of the central square. This corresponds to cross-sections through CMT2 to CMT1 (section oriented N to S), and CMT4 to CMT3 (section oriented E to W). The black open circle indicates the pumped location.

Approach                                                                        PW1-3   PW3-3   PW4-3   PW5-3   Average   Rank
2. Effective Parameter Model                                                    0.04    0.01    0.02    0.06    0.034     2
3. Transition Probability/Markov Chain model                                    0.10    0.01    0.37    0.06    0.135     6
4. Geological model                                                             0.09    0.01    0.25    0.06    0.102     5
5. Stochastic inverse model with conditioning                                   0.12    0.003   0.012   0.01    0.039     4
6. Transient hydraulic tomography (unconditioned; PW1-3, PW3-3, PW4-3, PW5-3)   0.01    0.001   0.003   0.01    0.008     1
7. Transient hydraulic tomography (conditioned; PW1-3, PW3-3, PW4-3, PW5-3)     0.11    0.001   0.009   0.02    0.034     3
Color scale: Max = 0.37; 60th Percentile = 0.04; Min = 0.001

Figure S8: L2 norms of observed versus simulated drawdowns from the four pumping tests used for model calibration at the NCRS. The minimum L2 norm is assigned a color of dark green, the maximum value a color of dark red, and the 60th percentile value a color of yellow.

Approach                                                                        PW1-3   PW3-3   PW4-3   PW5-3   Average   Rank
2. Effective Parameter Model                                                    0.79    0.23    0.75    0.44    0.55      5
3. Transition Probability/Markov Chain model                                    0.40    0.51    0.38    0.50    0.45      6
4. Geological model                                                             0.62    0.60    0.58    0.56    0.59      4
5. Stochastic inverse model with conditioning                                   0.62    0.81    0.89    0.95    0.82      3
6. Transient hydraulic tomography (unconditioned; PW1-3, PW3-3, PW4-3, PW5-3)   0.95    0.93    0.98    0.97    0.95      1
7. Transient hydraulic tomography (conditioned; PW1-3, PW3-3, PW4-3, PW5-3)     0.76    0.97    0.92    0.94    0.90      2
Color scale: Max = 0.98; 60th Percentile = 0.81; Min = 0.23

Figure S9: Correlation (R) of observed versus simulated drawdowns from the four pumping tests used for model calibration at the NCRS. The minimum R is assigned a color of dark red, the maximum value a color of dark green, and the 60th percentile value a color of yellow.

Figure S10: Scatterplots of observed vs. simulated drawdown for a) PW1-3, b) PW3-3, c) PW4-3, and d) PW5-3 at observation ports for various times. Simulated drawdown values are computed with the estimated Keff and Sseff values obtained using the calibrated effective parameter modeling method (Approach 2).
The solid line is the 45 degree line, while the dashed line is the linear model fit to the data.

Figure S11: Scatterplots of observed vs. simulated drawdown for a) PW1-3, b) PW3-3, c) PW4-3, and d) PW5-3 at observation ports for various times. Simulated drawdown values are computed with the heterogeneous K and Ss distributions obtained using the calibrated Transition Probability/Markov Chain modeling method (Approach 3). The solid line is the 45 degree line, while the dashed line is the linear model fit to the data.

Figure S12: Scatterplots of observed vs. simulated drawdown for a) PW1-3, b) PW3-3, c) PW4-3, and d) PW5-3 at observation ports for various times. Simulated drawdown values are computed with the heterogeneous K and Ss distributions obtained using the calibrated geological modeling method (Approach 4). The solid line is the 45 degree line, while the dashed line is the linear model fit to the data.

Figure S13: Scatterplots of observed vs. simulated drawdown for a) PW1-3, b) PW3-3, c) PW4-3, and d) PW5-3 at observation ports for various times. Simulated drawdown values are computed with the heterogeneous K and Ss distributions obtained using the stochastic inverse modeling method (Approach 5) conditioned to permeameter K data. The solid line is the 45 degree line, while the dashed line is the linear model fit to the data.

Figure S14: Scatterplots of observed vs. simulated drawdown for a) PW1-3, b) PW3-3, c) PW4-3, and d) PW5-3 at observation ports for various times. Simulated drawdown values are computed with the heterogeneous K and Ss tomograms obtained using the transient hydraulic tomography method (Approach 6), not conditioned to permeameter K data. The solid line is the 45 degree line, while the dashed line is the linear model fit to the data.

Figure S15: Scatterplots of observed vs. simulated drawdown for a) PW1-3, b) PW3-3, c) PW4-3, and d) PW5-3 at observation ports for various times.
Simulated drawdown values are computed with the heterogeneous K and Ss tomograms obtained using the transient hydraulic tomography method (Approach 7) conditioned to permeameter K data. The solid line is the 45 degree line, while the dashed line is the linear model fit to the data.

Approach / case     PW1-3    PW1-4    PW1-5    PW3-3    PW3-4    PW4-3    PW5-3    PW5-4    PW5-5    Average  Rank
1. Kriging          0.13     0.03     0.63     0.10     0.08     18.89    0.56     0.39     1.01     2.42     6
2. Effective parameter model
  Case 1 (PW1-3)    0.04     0.02     0.007    0.01     0.0007   0.21     0.05     0.02     0.02     0.04
  Case 2 (PW3-3)    0.04     0.02     0.01     0.01     0.00     0.21     0.05     0.02     0.02     0.04
  Case 3 (PW4-3)    0.10     0.03     0.002    0.02     0.0005   0.04     0.07     0.01     0.02     0.03
  Case 4 (PW5-3)    0.05     0.03     0.017    0.01     0.0011   0.51     0.04     0.03     0.03     0.08
  Mean of cases                                                                                      0.05     3
3. Transition Probability/Markov Chain model
  Case 1 (PW1-3)    0.08     0.03     0.01     0.02     0.0004   0.30     0.06     0.01     0.01     0.06
  Case 2 (PW3-3)    0.11     0.07     0.02     0.01     0.0014   0.65     0.05     0.05     0.02     0.11
  Case 3 (PW4-3)    0.11     0.03     0.002    0.02     0.001    0.06     0.08     0.01     0.02     0.04
  Case 4 (PW5-3)    0.11     0.02     0.001    0.02     0.0006   0.06     0.07     0.01     0.02     0.03
  Mean of cases                                                                                      0.06     4
4. Geological model
  Case 1 (PW1-3)    0.16     0.11     0.30     0.01     0.0159   6.89     0.06     0.08     0.52     0.91
  Case 2 (PW3-3)    0.08     0.45     3.52     0.54     0.32     189.87   3.85     1.01     4.42     22.67
  Case 3 (PW4-3)    0.16     0.11     0.31     0.01     0.0165   0.07     0.07     0.08     0.44     0.14
  Case 4 (PW5-3)    0.16     0.11     0.31     0.01     0.0165   0.07     0.07     0.08     0.44     0.14
  Mean of cases                                                                                      5.96     7
5. Stochastic inverse model with conditioning
  Case 1 (PW1-3)    0.08     0.03     0.02     0.01     0.0006   0.37     0.06     0.01     0.02     0.07
  Case 2 (PW3-3)    0.10     0.04     0.04     0.01     0.0026   0.95     0.06     0.03     0.03     0.14
  Case 3 (PW4-3)    0.11     0.03     0.006    0.02     0.0004   0.03     0.06     0.01     0.01     0.03
  Case 4 (PW5-3)    0.08     0.05     0.03     0.01     0.001    0.35     0.02     0.04     0.02     0.07
  Mean of cases                                                                                      0.08     5
6. Transient hydraulic tomography (unconditioned; PW1-3, PW3-3, PW4-3, PW5-3)
                    0.02     0.02     0.001    0.004    0.0004   0.03     0.03     0.04     0.01     0.017    1
7. Transient hydraulic tomography (conditioned; PW1-3, PW3-3, PW4-3, PW5-3)
                    0.06     0.04     0.002    0.003    0.0004   0.03     0.02     0.04     0.01     0.025    2
Color scale: Max = 1.00; 60th Percentile = 0.05; Min = 0.0003

Figure S16: L2 norms of observed versus simulated drawdowns from the nine pumping tests used for model validation at the NCRS. The minimum L2 norm is assigned a color of dark green, the maximum value a color of dark red, and the 60th percentile value a color of yellow.

Approach / case     PW1-3    PW1-4    PW1-5    PW3-3    PW3-4    PW4-3    PW5-3    PW5-4    PW5-5    Average  Rank
1. Kriging          0.30     0.17     0.49     0.39     0.39     0.50     0.29     0.31     0.40     0.36     7
2. Effective parameter model
  Case 1 (PW1-3)    0.79     0.51     0.62     0.77     0.40     0.68     0.61     0.33     0.44     0.57
  Case 2 (PW3-3)    -0.06    0.08     -0.09    0.15     0.16     0.11     0.17     0.20     0.15     0.10
  Case 3 (PW4-3)    0.76     0.53     0.62     0.80     0.41     0.72     0.64     0.39     0.45     0.59
  Case 4 (PW5-3)    0.72     0.56     0.57     0.82     0.37     0.70     0.62     0.41     0.38     0.57
  Mean of cases                                                                                      0.46     3
3. Transition Probability/Markov Chain model
  Case 1 (PW1-3)    0.46     0.26     0.32     0.41     0.28     0.23     0.36     0.40     0.32     0.34
  Case 2 (PW3-3)    0.52     0.08     0.43     0.42     0.18     0.54     0.46     0.13     0.17     0.32
  Case 3 (PW4-3)    0.59     0.63     0.58     0.39     0.30     0.44     0.46     0.44     0.05     0.43
  Case 4 (PW5-3)    0.53     0.62     0.69     0.49     0.33     0.47     0.52     0.59     0.06     0.48
  Mean of cases                                                                                      0.39     5
4. Geological model
  Case 1 (PW1-3)    0.54     0.25     0.58     0.55     0.60     0.63     0.48     0.30     0.35     0.47
  Case 2 (PW3-3)    0.50     0.16     0.21     0.29     0.34     0.30     0.24     0.19     0.18     0.27
  Case 3 (PW4-3)    0.73     0.42     0.39     0.56     0.48     0.49     0.54     0.48     0.14     0.47
  Case 4 (PW5-3)    0.53     0.25     0.59     0.52     0.60     0.60     0.46     0.28     0.35     0.46
  Mean of cases                                                                                      0.42     4
5. Stochastic inverse model with conditioning
  Case 1 (PW1-3)    0.55     0.30     0.46     0.44     0.30     0.40     0.40     0.31     0.30     0.38
  Case 2 (PW3-3)    0.41     0.23     0.45     0.63     0.18     0.41     0.32     0.31     0.28     0.36
  Case 3 (PW4-3)    0.38     0.30     0.58     0.49     0.33     0.70     0.42     0.30     0.33     0.42
  Case 4 (PW5-3)    0.59     0.19     0.50     0.32     0.37     0.51     0.87     0.12     0.19     0.41
  Mean of cases                                                                                      0.39     5
6. Transient hydraulic tomography (unconditioned; PW1-3, PW3-3, PW4-3, PW5-3)
                    0.90     0.56     0.74     0.88     0.44     0.80     0.84     0.02     0.73     0.66     1
7. Transient hydraulic tomography (conditioned; PW1-3, PW3-3, PW4-3, PW5-3)
                    0.68     0.20     0.65     0.91     0.31     0.79     0.83     0.08     0.26     0.52     2
Color scale: Max = 0.91; 60th Percentile = 0.48; Min = -0.09

Figure S17: Correlation (R) of observed versus simulated drawdowns from the nine pumping tests used for model validation at the NCRS. The minimum R is assigned a color of dark red, the maximum value a color of dark green, and the 60th percentile value a color of yellow.

Figure S18: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the kriged K field and a homogeneous Ss value (Approach 1). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S19: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the effective K and Ss values from the calibration of an effective parameter groundwater model to the pumping test at PW1-3 (Approach 2, Case 1). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S20: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the effective K and Ss values from the calibration of an effective parameter groundwater model to the pumping test at PW3-3 (Approach 2, Case 2). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S21: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the effective K and Ss values from the calibration of an effective parameter groundwater model to the pumping test at PW4-3 (Approach 2, Case 3). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.
Figure S22: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the effective K and Ss values from the calibration of an effective parameter groundwater model to the pumping test at PW5-3 (Approach 2, Case 4). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S23: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the calibration of a realization generated using the Transition Probability/Markov Chain method to the pumping test at PW1-3 (Approach 3, Case 1). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S24: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the calibration of a realization generated using the Transition Probability/Markov Chain method to the pumping test at PW3-3 (Approach 3, Case 2). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S25: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the calibration of a realization generated using the Transition Probability/Markov Chain method to the pumping test at PW4-3 (Approach 3, Case 3). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S26: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the calibration of a realization generated using the Transition Probability/Markov Chain method to the pumping test at PW5-3 (Approach 3, Case 4). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S27: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the calibration of a geological model to the pumping test at PW1-3 (Approach 4, Case 1). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S28: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the calibration of a geological model to the pumping test at PW3-3 (Approach 4, Case 2). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S29: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the calibration of a geological model to the pumping test at PW4-3 (Approach 4, Case 3). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S30: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the calibration of a geological model to the pumping test at PW5-3 (Approach 4, Case 4). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S31: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the stochastic inverse modeling of a pumping test at PW1-3 (Approach 5, Case 1) conditioned to permeameter K data. The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S32: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the stochastic inverse modeling of a pumping test at PW3-3 (Approach 5, Case 2) conditioned to permeameter K data. The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S33: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the stochastic inverse modeling of a pumping test at PW4-3 (Approach 5, Case 3) conditioned to permeameter K data. The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S34: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss distributions from the stochastic inverse modeling of a pumping test at PW5-3 (Approach 5, Case 4) conditioned to permeameter K data. The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S35: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss tomograms obtained through the transient hydraulic tomography analysis of 4 pumping tests (Approach 6). The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.

Figure S36: Scatterplots of simulated versus observed drawdowns for all 9 pumping tests using the K and Ss tomograms obtained through the transient hydraulic tomography analysis of 4 pumping tests (Approach 7) conditioned to permeameter K data. The solid line is a 1:1 line indicating a perfect match. The dashed line is a best fit line, and the parameters describing this line are on each plot.
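
The scatterplot statistics reported in Tables S3 and S4 and in Figures S8, S9, S16, and S17 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' processing script: the function name and sample drawdowns are hypothetical, the linear model is assumed to regress simulated on observed drawdown by ordinary least squares, and the L2 norm is assumed to be the mean squared difference between observed and simulated values (its definition is not given in this supplement).

```python
import math


def scatter_statistics(observed, simulated):
    """Slope, intercept, R, R2 and L2 norm for simulated vs. observed values.

    Assumptions (not taken from the supplement): the best fit line is an
    ordinary least-squares regression of simulated (y) on observed (x),
    and the L2 norm is the mean squared difference between the two.
    """
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(simulated) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    syy = sum((y - mean_y) ** 2 for y in simulated)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(observed, simulated))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    l2 = sum((x - y) ** 2 for x, y in zip(observed, simulated)) / n
    return {"slope": slope, "intercept": intercept,
            "R": r, "R2": r * r, "L2": l2}


if __name__ == "__main__":
    # Hypothetical drawdowns (m) at a few observation ports and times.
    obs = [0.05, 0.12, 0.20, 0.31, 0.44]
    sim = [0.04, 0.10, 0.22, 0.27, 0.40]
    for name, value in scatter_statistics(obs, sim).items():
        print(name, round(value, 4))
```

Under these assumptions, a perfect match gives a slope of 1, an intercept of 0, R2 of 1, and an L2 norm of 0, which corresponds to all points lying on the 1:1 line in the scatterplots.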