Distribution Function Estimation in Small Areas for Aquatic Resources

Spatial Ensemble Estimates of Temporal Trends in Acid Neutralizing Capacity
Mark Delorey, F. Jay Breidt, Colorado State University
Abstract
Bayesian approach
Data from a surface water monitoring program, the Temporally Integrated
Monitoring of Ecosystems (TIME), are used to study trends in acid deposition in
surface water. The TIME data consist of a probability sample of lakes and streams.
One of the tools used to evaluate characteristics of acidity is the cumulative
distribution function of slope trends (acid concentration/year). An understanding of
the distribution of these slopes helps evaluate the impact of the Clean Air Act
Amendments of 1990. For example, the proportion of lakes whose acid
neutralizing capacity (ANC) has been decreasing can be estimated. A hierarchical model is
constructed to describe these slopes as functions of available auxiliary information;
constrained Bayes techniques are used to estimate the ensemble of slope values.
Spatial relationships are represented by incorporating a conditional autoregressive
model into the constrained Bayes methods.
Incorporating Spatial Structure
Bayesian Inference
• Bayes estimates (posterior means): $\theta_h^{B} = E(\theta_h \mid \hat\theta)$, $h = 1, \dots, m$
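• Adjacency matrix $C = (c_{hk})$:
$c_{hk} = 1/(\text{distance between HUC-8 centroids})$ if $h$ and $k$ are neighbors; $c_{hk} = 0$ otherwise
• Scaling matrix $\Delta = (\delta_{hk})$: $\delta_{hh} = 1/m_h$, $h = 1, \dots, m$; $\delta_{hk} = 0$, $h \ne k$, where $m_h$ is the number of neighbors of HUC $h$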
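The adjacency and scaling matrices can be sketched in a few lines of Python. This is only an illustration: the function name `car_matrices`, the toy centroids, and the group labels are hypothetical. It assumes Euclidean distances, distinct centroids, and that two HUCs are neighbors when they share a group label. Giving an isolated HUC a scaling entry of 1 is a convention chosen here, not taken from the poster.

```python
import numpy as np

def car_matrices(centroids, groups):
    """Return (C, Delta): inverse-distance adjacency and diag(1 / number of neighbors)."""
    xy = np.asarray(centroids, dtype=float)
    labels = np.asarray(groups)
    m = len(xy)
    # Pairwise Euclidean distances between centroids
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    # h and k are neighbors when they share a label and h != k
    neighbors = (labels[:, None] == labels[None, :]) & ~np.eye(m, dtype=bool)
    C = np.zeros((m, m))
    C[neighbors] = 1.0 / dist[neighbors]
    m_h = neighbors.sum(axis=1)
    # Assumption: an isolated HUC (no neighbors) gets a scaling entry of 1
    Delta = np.diag(1.0 / np.where(m_h > 0, m_h, 1))
    return C, Delta

# Hypothetical toy data: three HUCs in one group, one isolated HUC
centroids = [(0.0, 0.0), (3.0, 4.0), (0.0, 10.0), (6.0, 8.0)]
groups = ["A", "A", "A", "B"]
C, Delta = car_matrices(centroids, groups)
```

Because every HUC in a group has the same number of neighbors, `C @ Delta` is symmetric in this setup, which is the usual compatibility condition for a conditional autoregressive covariance.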
• If the spatial dependence parameter $\rho$ is known, Stern and Cressie (1999) show how to solve for $H_1(Y)$ and
$H_2(Y)$ under the mean and variance constraints:
• First level (2-digit) divides U.S. into 21 major geographic regions
• Second level (4-digit) identifies area drained by a river system, closed
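• Form the constrained Bayes (CB) estimates as
$\theta_h^{CB} = a\,\theta_h^{B} + (1 - a)\,\bar\theta^{B}$, where $\bar\theta^{B} = m^{-1}\sum_{h=1}^{m}\theta_h^{B}$ and $a = [1 + H_1(\hat\theta)/H_2(\hat\theta)]^{1/2}$
In the spatial model, $\beta$ is an unknown coefficient vector, $C = (c_{hk})$ represents the adjacency
matrix, $\rho$ is a parameter measuring spatial dependence, $\Delta$ is a known
diagonal matrix of scaling factors for the variance in each HUC, and $\sigma^2$ is an
unknown parameter.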
Adjacency matrix C can reflect watershed structure
HUC and Neighborhood Structure
  

• Let Ah denote a set of neighboring HUCs for HUC h
• The spatial model is equivalent to (Cressie and Stern, 1991):
$\theta_h \mid \theta_k, k \ne h \sim N\Big(X_h\beta + \rho\sum_{k \in A_h} c_{hk}(\theta_k - X_k\beta),\ \sigma^2\delta_{hh}\Big)$, $h = 1, \dots, m$
• $\Delta$ = diag(inverse number of neighbors)
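• Compute the scalars
$H_1(\hat\theta) = \operatorname{tr}\,\mathrm{Var}(\theta - \bar\theta\mathbf{1} \mid \hat\theta)$ and $H_2(\hat\theta) = \sum_{h=1}^{m}(\theta_h^{B} - \bar\theta^{B})^2$, so that $E\{\sum_{h=1}^{m}(\theta_h - \bar\theta)^2 \mid \hat\theta\} = H_1(\hat\theta) + H_2(\hat\theta)$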
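The scalars can be approximated from posterior draws. The sketch below is a minimal illustration, not the authors' code. It assumes an array of posterior samples of the slope vector, for example exported from BUGS. It also assumes the standard constrained Bayes form: $H_1$ is the trace of the posterior variance of the centered slopes, $H_2$ is the sum of squared deviations of the Bayes estimates from their mean, and $a = (1 + H_1/H_2)^{1/2}$. The function name `constrained_bayes` is hypothetical.

```python
import numpy as np

def constrained_bayes(draws):
    """draws: (n_draws, m) posterior samples of the m slopes."""
    draws = np.asarray(draws, dtype=float)
    theta_b = draws.mean(axis=0)                           # Bayes estimates (posterior means)
    centered = draws - draws.mean(axis=1, keepdims=True)   # theta - theta_bar * 1, per draw
    h1 = centered.var(axis=0).sum()                        # Monte Carlo tr Var(theta - theta_bar 1 | data)
    h2 = ((theta_b - theta_b.mean()) ** 2).sum()           # spread of the Bayes estimates
    a = np.sqrt(1.0 + h1 / h2)                             # a >= 1 expands the spread
    theta_cb = a * theta_b + (1.0 - a) * theta_b.mean()
    return theta_b, theta_cb, a

# Simulated draws stand in for real posterior output
rng = np.random.default_rng(0)
draws = rng.normal(size=(500, 6)) + np.arange(6)
theta_b, theta_cb, a = constrained_bayes(draws)
```

The CB estimates keep the mean of the Bayes estimates, while their sum of squared deviations equals the Monte Carlo estimate of the posterior expected spread of the true slopes.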
Objective
• Evaluation of the Clean Air Act Amendments of 1990
– examine acid neutralizing capacity (ANC)
– surface waters are acidic if ANC < 0
– supply of acids from atmospheric deposition and watershed processes exceeds buffering capacity
• Temporal trends in ANC within watersheds (8-digit HUCs)
– characterize the spatial ensemble of trends
– make a map, construct a histogram, plot an empirical distribution function
Two Inferential Goals
• Interested in estimating individual HUC-specific slopes
• Also interested in ensemble:
– spatially-indexed true values: $\{\theta_h\}_{h=1}^{m}$
– spatially-indexed estimates: $\{\theta_h^{est}\}_{h=1}^{m}$
Constrained Bayes Estimates
• Sampling model: $\hat\theta \mid \theta, \psi \sim N(\theta, D)$
• Bayes estimates: $\theta_h^{B} = E(\theta_h \mid \hat\theta) = E\{\gamma_h\hat\theta_h + (1 - \gamma_h)\,x_h^T\hat\beta \mid \hat\theta\}$, with $\gamma_h = \sigma^2/(\psi_h + \sigma^2)$
– yield good individual estimates
– do not yield a good ensemble estimate
– too little variability to give good representation of edf since $\sum_{h=1}^{m}(\theta_h^{B} - \bar\theta^{B})^2 < E\{\sum_{h=1}^{m}(\theta_h - \bar\theta)^2 \mid \hat\theta\}$
Conditional Auto-Regressive with Constrained Bayes
Spatial Model
$\theta \mid \beta, \sigma^2, \rho \sim N\big(X\beta,\ \sigma^2 (I - \rho C)^{-1}\Delta\big)$
basin, or coastal drainage area
• Third level (6-digit) creates accounting units of surface drainage basins or combination of basins
• Fourth level (8-digit) distinguishes parts of drainage basins and unique hydrologic features
• All watersheds within the same HUC-6 region were considered part of the same neighborhood
• No spatial relationship among HUC-4 regions or HUC-2 regions is considered at this point
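Because the hydrologic unit codes are nested, the coarser levels can be read off as prefixes of the 8-digit code. The sketch below illustrates the neighborhood rule described above, where HUC-8 units sharing a HUC-6 prefix are neighbors. The helper names and the example codes are hypothetical.

```python
from collections import defaultdict

def huc_levels(huc8):
    """Split an 8-digit hydrologic unit code into its nested 2-, 4-, 6- and 8-digit levels."""
    code = str(huc8).zfill(8)
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"not an 8-digit HUC: {huc8!r}")
    return {"huc2": code[:2], "huc4": code[:4], "huc6": code[:6], "huc8": code}

def neighborhoods(huc8_codes):
    """Map each HUC-8 code to the other codes that share its HUC-6 prefix."""
    groups = defaultdict(list)
    for code in huc8_codes:
        levels = huc_levels(code)
        groups[levels["huc6"]].append(levels["huc8"])
    return {c: sorted(set(g) - {c}) for g in groups.values() for c in g}

# Hypothetical codes, for illustration only
nbrs = neighborhoods(["10190005", "10190006", "10250001"])
```

Here the first two codes share the prefix `101900` and are neighbors of each other, while the third has no neighbors in the list.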
P E  β | Y   P βˆ
and
est m
h
h 1





T

ˆ
L  E E β  β Φ 1 β  βˆ | Y,   Y



  E X Φ
|
T
•
T
1



1
T
Hierarchical Area-Level Model
$\hat\theta_h = \theta_h + e_h$, $e_h \sim \mathrm{NID}(0, \psi_h)$
$\theta_h = x_h^T\beta + \nu_h$, $\nu_h \sim \mathrm{NID}(0, \sigma^2)$
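For intuition, the shrinkage implied by an area-level model of this kind can be sketched with the hyperparameters treated as known. This is a simplification: the poster averages over parameter uncertainty. The symbol names below (`psi` for sampling variances, `sigma2` for the model variance, `beta` for regression coefficients) and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def bayes_estimates(theta_hat, psi, X, beta, sigma2):
    """Posterior means under a normal area-level model with known beta and sigma2."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    psi = np.asarray(psi, dtype=float)
    gamma = sigma2 / (sigma2 + psi)          # weight on the direct estimate
    synthetic = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    return gamma * theta_hat + (1.0 - gamma) * synthetic, gamma

# Toy example: intercept-only regression with beta = 0
theta_b, gamma = bayes_estimates(
    theta_hat=[10.0, -4.0], psi=[1.0, 9.0], X=[[1.0], [1.0]], beta=[0.0], sigma2=1.0
)
```

The noisier direct estimate (sampling variance 9) is pulled much harder toward the regression prediction than the precise one, which is why the ensemble of Bayes estimates is less spread out than the true slopes.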


 
Constrained Bayes with CAR
• Extend model specification by describing parameter uncertainty:
f  ,  2  f   f  2  f  2
[Figure: Estimated EDFs of the Slope Ensemble. Curves: Bayes, CB, Posterior Mean of F. Horizontal axis: slope in ug/L/year (-1000 to 1500). Vertical axis: cumulative probability (0.0 to 1.0).]
• In Bayesian context, posterior means are overshrunk; in order to obtain estimates appropriate for ensemble, need to adjust
Funding/Disclaimer
This research is funded by U.S. EPA – Science To Achieve Results (STAR) Program Cooperative Agreements # CR – 829095 and # CR – 829096.
The work reported here was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S.
Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed
by EPA. The views expressed here are solely those of the presenter and the STARMAP, the Program he represents. EPA
does not endorse any products or commercial services mentioned in this presentation.
Future Work
Summary
In CAR, if the spatial dependence parameter $\rho$ is known, CB estimators can be found following Stern and Cressie
(1999); if $\rho$ is unknown, CB estimators can still be found numerically
Contour plot indicates that trend slopes of ANC are smoothed and somewhat
homogenized within HUCs
|
to get a system of equations that can be used to solve for β̂
Posterior quantities can be estimated using BUGS or other software
Constrained Bayes
0
0

X X Φ E β | Y,   X Φ X XT Φ 1 βˆ Y
T
0
• Prior specification:
|
T
T
 0 E E  β  P β  Φ 1  β  P β  | Y,   βˆ T I  P  Φ 1 I  P βˆ Y
Shrinkage Comparisons for the Slope Ensemble
Empirical distribution function (edf) of $\{\theta_h\}_{h=1}^{m}$: $F(z) = \frac{1}{m}\sum_{h=1}^{m} I\{\theta_h \le z\}$
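The empirical distribution function of an ensemble of slopes, and a subgroup proportion derived from it, can be computed directly. The sketch below uses made-up slope values. The function name `edf` is hypothetical.

```python
import numpy as np

def edf(values, z):
    """Empirical distribution function: fraction of values <= z, for each z."""
    values = np.asarray(values, dtype=float)
    z = np.atleast_1d(np.asarray(z, dtype=float))
    # Compare every value with every evaluation point, then average over values
    return (values[None, :] <= z[:, None]).mean(axis=1)

# Made-up ensemble of five slopes
slopes = [-30.0, -5.0, 0.0, 12.0, 40.0]
F_at_zero = edf(slopes, 0.0)[0]          # proportion of slopes <= 0
prop_increasing = 1.0 - F_at_zero        # proportion of slopes > 0
```

With these values three of the five slopes are at or below zero, so `F_at_zero` is 0.6 and the proportion with increasing trend is 0.4. Applying the same function to Bayes and constrained Bayes estimates gives the competing ensemble summaries.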
T
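• We place a uniform prior on the spatial dependence parameter $\rho$ and minimize the Lagrangian $L$
• Projection matrix appearing in the mean and variance constraints: $P = X(X^T\Phi^{-1}X)^{-1}X^T\Phi^{-1}$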
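A generalized least squares projection of the form $X(X^T\Phi^{-1}X)^{-1}X^T\Phi^{-1}$ is easy to check numerically. The sketch below assumes a symmetric positive definite $\Phi$ and a full-column-rank $X$. The function name `gls_projection` and the toy matrices are illustrative only.

```python
import numpy as np

def gls_projection(X, Phi):
    """P = X (X' Phi^-1 X)^-1 X' Phi^-1, computed with solves instead of explicit inverses."""
    X = np.asarray(X, dtype=float)
    Phi = np.asarray(Phi, dtype=float)
    Phi_inv_X = np.linalg.solve(Phi, X)                 # Phi^-1 X
    # For symmetric Phi, (Phi^-1 X)' equals X' Phi^-1
    return X @ np.linalg.solve(X.T @ Phi_inv_X, Phi_inv_X.T)

# Toy design: intercept and one covariate, unequal variances
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
Phi = np.diag([1.0, 2.0, 4.0])
P = gls_projection(X, Phi)
```

The matrix is idempotent and leaves the columns of `X` unchanged, so a constraint written in terms of `P` fixes only the part of the estimate that lies in the regression space.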
– subgroup analysis: what proportion of HUCs have ANC increasing over time?
– "empirical" distribution function (edf)
• This improves the ensemble estimate by reducing shrinkage
– sample mean of Bayes estimates already matches posterior mean
– this adjusts shrinkage so that sample variance of estimates matches posterior variance of true values
• Variance constraint: $E\{(\beta - P\beta)^T \Phi^{-1} (\beta - P\beta) \mid Y\} = \hat\beta^T (I - P)^T \Phi^{-1} (I - P)\hat\beta$
• Restrict to acid-sensitive waters
• Combine probability and convenience samples
• Other covariates?
• Modify spatial structure
• Site-level model?
– useful sub-watershed covariates?
– spatial scales: HUC to HUC, site to site
– more concern with design, normality assumptions