Distribution Function Estimation in Small Areas for Aquatic Resources
Spatial Ensemble Estimates of Temporal Trends in Acid Neutralizing Capacity

Mark Delorey, F. Jay Breidt, Colorado State University

Abstract

Data from a surface water monitoring program, the Temporally Integrated Monitoring of Ecosystems (TIME), are used to study trends in acid deposition in surface water. The TIME data consist of a probability sample of lakes and streams. One of the tools used to evaluate characteristics of acidity is the cumulative distribution function of trend slopes (acid concentration/year). An understanding of the distribution of these slopes helps evaluate the impact of the Clean Air Act Amendments of 1990. For example, the proportion of lakes whose acid neutralizing capacity (ANC) has been decreasing can be estimated. A hierarchical model is constructed to describe these slopes as functions of available auxiliary information; constrained Bayes techniques are used to estimate the ensemble of slope values. Spatial relationships are represented by incorporating a conditional autoregressive model into the constrained Bayes methods.

Objective

• Evaluation of the Clean Air Act Amendments of 1990
  – examine acid neutralizing capacity (ANC)
  – surface waters are acidic if ANC < 0
  – supply of acids from atmospheric deposition and watershed processes exceeds buffering capacity
• Temporal trends in ANC within watersheds (8-digit HUCs, i.e. hydrologic unit codes)
  – characterize the spatial ensemble of trends
  – make a map, construct a histogram, plot an empirical distribution function

Two Inferential Goals

• Interested in estimating individual HUC-specific slopes
• Also interested in ensemble:
  spatially-indexed true values: $\{\beta_h\}_{h=1}^{m}$
  spatially-indexed estimates: $\{\beta_h^{est}\}_{h=1}^{m}$
  – subgroup analysis: what proportion of HUCs have ANC increasing over time?
  – "empirical" distribution function (edf):
    $$F_\beta(z) = \frac{1}{m} \sum_{h=1}^{m} I\{\beta_h \le z\}$$

Hierarchical Area-Level Model

$$\hat{\beta}_h = \beta_h + e_h, \qquad e_h \sim \mathrm{NID}(0, \psi_h)$$
$$\beta_h = x_h^T \gamma + \nu_h, \qquad \nu_h \sim \mathrm{NID}(0, \sigma^2)$$

Bayesian Inference

• Extend model specification by describing parameter uncertainty
• Prior specification: $f(\gamma, \sigma^2) = f(\gamma)\, f(\sigma^2)$
• Posterior quantities can be estimated using BUGS or other software

Constrained Bayes Estimates

• Bayes estimates (posterior means):
  $$\beta_h^B = E(\beta_h \mid \hat{\beta}) = E\{\lambda_h \hat{\beta}_h + (1 - \lambda_h)\, x_h^T \gamma \mid \hat{\beta}\}, \qquad \lambda_h = \frac{\sigma^2}{\sigma^2 + \psi_h}$$
  – yield good individual estimates
  – do not yield a good ensemble estimate
  – too little variability to give good representation of edf, since
    $$\sum_{h=1}^{m} (\beta_h^B - \bar{\beta}^B)^2 < E\Big[\sum_{h=1}^{m} (\beta_h - \bar{\beta})^2 \,\Big|\, \hat{\beta}\Big]$$
    where $\bar{\beta}^B = m^{-1} \sum_{h=1}^{m} \beta_h^B$ and $\bar{\beta} = m^{-1} \sum_{h=1}^{m} \beta_h$
• Compute the scalars
  $$H_1(\hat{\beta}) = \mathrm{tr}\, \mathrm{Var}(\beta - \bar{\beta}\mathbf{1} \mid \hat{\beta}), \qquad H_2(\hat{\beta}) = \sum_{h=1}^{m} (\beta_h^B - \bar{\beta}^B)^2$$
• Form the constrained Bayes (CB) estimates as
  $$\beta_h^{CB} = a\, \beta_h^B + (1 - a)\, \bar{\beta}^B, \qquad \text{where } a = \sqrt{1 + \frac{H_1(\hat{\beta})}{H_2(\hat{\beta})}}$$
• This improves the ensemble estimate by reducing shrinkage
  – sample mean of Bayes estimates already matches posterior mean of $\bar{\beta}$
  – this adjusts shrinkage so that sample variance of estimates matches posterior variance of true values

HUC and Neighborhood Structure

• First level (2-digit) divides U.S. into 21 major geographic regions
• Second level (4-digit) identifies area drained by a river system, closed basin, or coastal drainage area
• Third level (6-digit) creates accounting units of surface drainage basins or combination of basins
• Fourth level (8-digit) distinguishes parts of drainage basins and unique hydrologic features
• All watersheds within the same HUC-6 region were considered part of same neighborhood
• No spatial relationship among HUC-4 regions or HUC-2 regions considered at this point

Spatial Model: Incorporating Spatial Structure

• Adjacency matrix C can reflect watershed structure
• Conditional autoregressive (CAR) model for the slopes:
  $$\beta \mid \gamma, \rho, \tau^2 \sim N(X\gamma, \Phi), \qquad \Phi = (I - \rho C)^{-1} \Delta\, \tau^2$$
  where $\gamma$ is an unknown coefficient vector, $C = (c_{ij})$ represents the adjacency matrix, $\rho$ is a parameter measuring spatial dependence, $\Delta$ is a known diagonal matrix of scaling factors for the variance in each HUC, and $\tau^2$ is an unknown parameter.
• Let $A_h$ denote a set of neighboring HUCs for HUC h
• The spatial model is equivalent to (Cressie and Stern, 1991):
  $$\beta_h \mid \beta_k, k \ne h \sim N\Big(x_h^T \gamma + \rho \sum_{k \in A_h} c_{hk} (\beta_k - x_k^T \gamma),\ \delta_{hh} \tau^2\Big), \qquad h = 1, \ldots, m$$
• $\Delta$ = diag(inverse number of neighbors): $\delta_{hh} = 1/m_h$, $h = 1, \ldots, m$, and $\delta_{hk} = 0$, $h \ne k$, where $m_h$ is the number of neighbors of HUC h
• Adjacency matrix:
  $$c_{hk} = \begin{cases} \dfrac{1}{\text{distance between HUC-8 centroids}}, & h \text{ and } k \text{ neighbors} \\ 0, & \text{otherwise} \end{cases}$$

Conditional Auto-Regressive with Constrained Bayes

• Sampling model for the direct estimates: $\hat{\beta} \mid \beta \sim N(\beta, D)$, with D the sampling covariance matrix of the direct estimates
• In the expressions below, Y denotes the observed data and $\hat{\beta}$ the constrained estimator being solved for
• If $\Phi$ is known, Stern and Cressie (1999) show how to solve for $H_1(Y)$ and $H_2(Y)$ under the mean and variance constraints
  $$P\, E(\beta \mid Y) = P \hat{\beta}$$
  and
  $$E[(\beta - P\beta)^T \Phi^{-1} (\beta - P\beta) \mid Y] = \hat{\beta}^T (I - P)^T \Phi^{-1} (I - P) \hat{\beta},$$
  respectively, where
  $$P = X (X^T \Phi^{-1} X)^{-1} X^T \Phi^{-1}$$
• We place a uniform prior on $\gamma$ and minimize the Lagrangian formed from the posterior expected loss $E\{E[(\beta - \hat{\beta})^T \Phi^{-1} (\beta - \hat{\beta}) \mid Y, \Phi] \mid Y\}$ together with multiplier terms for the mean and variance constraints, to get a system of equations that can be used to solve for $\hat{\beta}$
• If $\Phi$ is unknown, the expectations in the loss and the constraints are additionally averaged over the posterior distribution of $\Phi$ given Y, and the CB estimators are found numerically

Estimated EDFs of the Slope Ensemble

[Figure: estimated edfs of the slope ensemble. Horizontal axis: slope in ug/L/year; vertical axis: cumulative probability; curves: Bayes, CB, posterior mean of F.]

Shrinkage Comparisons for the Slope Ensemble

[Figure: shrinkage comparisons for the slope ensemble, Bayes versus constrained Bayes.]

Summary

• In Bayesian context, posterior means are overshrunk; in order to obtain estimates appropriate for ensemble, need to adjust
• In CAR, if $\Phi$ is known, can find CB estimators following Stern and Cressie (1999); if $\Phi$ is unknown, can still find CB estimators numerically
• Contour plot indicates that trend slopes of ANC are smoothed and somewhat homogenized within HUCs

Future Work

• Restrict to acid-sensitive waters
• Combine probability and convenience samples
• Other covariates?
  – useful sub-watershed covariates?
• Modify spatial structure
  – spatial scales: HUC to HUC, site to site
• Site-level model?
  – more concern with design, normality assumptions

Funding/Disclaimer

This research is funded by U.S. EPA – Science To Achieve Results (STAR) Program Cooperative Agreements # CR-829095 and # CR-829096.

The work reported here was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and the STARMAP, the Program he represents. EPA does not endorse any products or commercial services mentioned in this presentation.
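
Illustrative sketch

The constrained Bayes adjustment described in this poster can be sketched in a few lines of Python. This is a minimal illustration, assuming posterior draws of the HUC-level slopes are already available (for example, exported from BUGS). The function names `constrained_bayes` and `edf`, the simulated data, and the known-variance, zero-prior-mean posterior used to generate the draws are all assumptions made for the example; none of it reproduces the TIME analysis or the CAR model.

```python
import numpy as np


def constrained_bayes(draws):
    """Return (Bayes estimates, constrained Bayes estimates, a).

    draws: array of shape (n_draws, m); each row is one posterior draw of
    the m area-level slopes.
    """
    draws = np.asarray(draws, dtype=float)
    beta_b = draws.mean(axis=0)          # posterior means beta_h^B
    beta_b_bar = beta_b.mean()           # sample mean of the Bayes estimates

    # H1 = tr Var(beta - beta_bar * 1 | data): sum over areas of the
    # posterior variance of each slope's deviation from the ensemble mean.
    deviations = draws - draws.mean(axis=1, keepdims=True)
    h1 = deviations.var(axis=0).sum()

    # H2 = sum_h (beta_h^B - mean of beta^B)^2
    h2 = np.sum((beta_b - beta_b_bar) ** 2)

    a = np.sqrt(1.0 + h1 / h2)
    beta_cb = a * beta_b + (1.0 - a) * beta_b_bar
    return beta_b, beta_cb, a


def edf(values, z):
    """Empirical distribution function of `values` evaluated at points z."""
    values = np.asarray(values, dtype=float)
    z = np.atleast_1d(np.asarray(z, dtype=float))
    return (values[:, None] <= z[None, :]).mean(axis=0)


# Simulated example (assumption): zero prior mean, known variances.
rng = np.random.default_rng(0)
m, sigma2, psi = 50, 1.0, 1.0
beta_true = rng.normal(0.0, np.sqrt(sigma2), size=m)
beta_hat = beta_true + rng.normal(0.0, np.sqrt(psi), size=m)

lam = sigma2 / (sigma2 + psi)            # shrinkage factor
post_mean = lam * beta_hat               # posterior mean of each slope
post_sd = np.sqrt(lam * psi)             # posterior standard deviation
draws = rng.normal(post_mean, post_sd, size=(4000, m))

beta_b, beta_cb, a = constrained_bayes(draws)
print("a =", a)
print("proportion of slopes > 0, Bayes:", 1.0 - edf(beta_b, 0.0)[0])
print("proportion of slopes > 0, CB:   ", 1.0 - edf(beta_cb, 0.0)[0])
```

Because the CB estimates are a linear rescaling of the Bayes estimates about their mean, the two sets share the same sample mean, while the CB set is more spread out by the factor a, which is the reduction in shrinkage that the ensemble (edf) estimate needs.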