A Comparison of Variance Estimates of Stream Network Resources
Sarah J. Williams
Candidate for the degree of Master of Science
Colorado State University
Department of Statistics
October 25, 2006
But First: What have I been doing since June?
…Not Insurance
…Not Pharmaceutical Industry
…Not Government
Noel-Levitz
‘Your Partner in Higher Education’
  Targeted market research
  Student satisfaction surveys
  E-Learning Tools
  Retention strategy
  Effective recruitment strategies
Noel-Levitz
  Two main offices: Denver and Iowa City, IA
  Full-time consultants
  Associate consultants
  Marketing
  Sales
  Administrative
  IT
  The rest of us work in analysis in differing capacities
Forecast PLUS (F+)
  Recruitment models
    SmartApproach
    Prospect
    Inquiry
    Admit
  Econometric models
    FAS
What else do I do?
  Product Enhancements
    Software testing
    Methodology research
  Emerging Marketing Product
    Discrete Choice Modeling
  Assisting Consultants and TPSS staff
What else do I do?
National Conference
Campus Visits
SAS programming classes
Meetings…meetings…meetings!
Today’s Outline
1. The problem
2. Two methods of variance estimation
     Local Area Neighborhood estimate
     Linear model and Components of Variance
3. Comparing the two estimates
     Practical Implications
Part 1: The Problem
Surveys of aquatic resources pose challenges in both sampling and analysis
Studies of aquatic resources need a proper temporal design that allows for trend detection
  The revisit structure may be the most difficult part of the study to define
  Waterways are highly variable and volatile
Studies of aquatic resources need a proper spatial design that preserves spatial proximity
  Generalized Random Tessellation Stratified (GRTS) sample
By using well-planned panel designs, researchers can achieve both of these very important objectives
Part 2: Two methods of variance estimation
Of course, there are many methods for estimating the variance of a sample
Today, we will focus on two of these methods:
  Local Area Neighborhood Estimate (NBH)
  Linear model components of variance
Local Area Neighborhood Estimate
  Design-based
  Compares well to a Horvitz-Thompson estimate
$S^2_{NBH}(\hat{Z}_T) = \sum_{s_i \in R} \sum_{s_j \in D(s_i)} w_{ij} \left( \frac{z(s_j)}{\pi(s_j)} - \bar{z}_{D(s_i)} \right)^2$
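To make the structure of the double sum concrete, here is a minimal Python sketch of the local neighborhood estimate. The function name `nbh_variance`, the toy data, and the equal weights within each neighborhood are assumptions made for illustration only; in practice the neighborhoods D(s_i) and weights w_ij come from the spatial design, and that construction is not reproduced here.

```python
def nbh_variance(z, pi, neighborhoods, weights):
    """Sketch of the local neighborhood (NBH) variance estimate.

    z[j]             response observed at site j
    pi[j]            inclusion probability of site j
    neighborhoods[i] list of site indices forming D(s_i)
    weights[i][j]    weight w_ij of site j within D(s_i)
                     (assumed here to sum to 1 over each neighborhood)
    """
    total = 0.0
    for i, nbhd in enumerate(neighborhoods):
        # Weighted local mean of the expanded values z/pi over D(s_i)
        local_mean = sum(weights[i][j] * z[j] / pi[j] for j in nbhd)
        # Weighted squared deviations from that local mean
        total += sum(
            weights[i][j] * (z[j] / pi[j] - local_mean) ** 2 for j in nbhd
        )
    return total


# Hypothetical toy sample: four sites along a stream, equal inclusion probability
z = [12.0, 15.0, 9.0, 11.0]
pi = [0.1, 0.1, 0.1, 0.1]
neighborhoods = [[0, 1], [0, 1, 2], [1, 2, 3], [2, 3]]
# Equal weights inside each neighborhood (an illustrative simplification)
weights = [{j: 1.0 / len(d) for j in d} for d in neighborhoods]

print(nbh_variance(z, pi, neighborhoods, weights))
```

Because each squared deviation is taken from a local mean rather than the overall mean, sites that resemble their neighbors contribute little to the estimate.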
Mixed Linear Model and Components of Variance
$y = X\beta + Z\gamma + \varepsilon$
Why use this method?
  Design-based estimators do not account for time
  Specifications of the study
Mixed Linear Model and Components of Variance
$Y_{ij} = S_i + T_j + E_{ij}$
$S_i \sim (0, \sigma^2_{site})$
$T_j \sim (\mu + \tau(t_j), \sigma^2_{year})$
$E_{ij} \sim (0, \sigma^2_{resid})$

$Y_{ij} = \mu + \tau(t_j) + S_i + T_j + E_{ij}$
$S_i \sim (0, \sigma^2_{site})$
$T_j \sim (0, \sigma^2_{year})$
$E_{ij} \sim (0, \sigma^2_{resid})$
In either case above we have that for a single observation:
$E[Y_{ij}] = \mu + \tau(t_j)$
$Var[Y_{ij}] = \sigma^2_{site} + \sigma^2_{year} + \sigma^2_{resid}$
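The variance of a single observation is the sum of the three components because the site, year, and residual effects are taken to be independent. The short simulation below is a rough check of that decomposition. The component values, the normal distributions, and the constant mean (no time trend) are all assumptions chosen for illustration; the model itself only specifies means and variances.

```python
import random


def simulate_observation(rng, mu, var_site, var_year, var_resid):
    """Draw one Y = mu + S + T + E with independent effects (normal by assumption)."""
    s = rng.gauss(0.0, var_site ** 0.5)    # site effect
    t = rng.gauss(0.0, var_year ** 0.5)    # year effect
    e = rng.gauss(0.0, var_resid ** 0.5)   # residual
    return mu + s + t + e


rng = random.Random(2006)                        # fixed seed for repeatability
var_site, var_year, var_resid = 4.0, 1.0, 2.0    # made-up variance components
n = 100_000
draws = [
    simulate_observation(rng, 10.0, var_site, var_year, var_resid)
    for _ in range(n)
]

empirical_mean = sum(draws) / n
empirical_var = sum((y - empirical_mean) ** 2 for y in draws) / n
theoretical_var = var_site + var_year + var_resid

print("empirical mean:    ", round(empirical_mean, 3))
print("empirical variance:", round(empirical_var, 3))
print("sum of components: ", theoretical_var)
```

Each draw uses a fresh site and a fresh year, so the simulated values reflect the marginal distribution of one observation rather than repeated visits to the same site.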
Part 3: Comparing the two methods
Coho Salmon of the Pacific Northwest
The Oregon Plan for Salmon and Watersheds
  GRTS design (40 panels)
  3-year salmon lifecycle
  Each monitoring area is a stratum
The Data
  35 responses of interest (Landscape & habitat)
  6 regions of interest
  8 years available
  1,535 site visits (1,055 distinct sites)
Linear model estimates (Site, Year, Residual, Site+Res), with the Local and SRS variance estimates:
Response    Region  Count  Site       Year      Residual  Site+Res   Local      SRS Var
LWDPIECE1   1-NC    329    77.73      13.54     149.59    227.32     138.47     234.91
LWDPIECE1   2-MC    308    81.11      -0.43     14.43     95.54      52.19      96.13
SECCHNAREA  1-NC    314    247805.84  -9.22     92810.91  340616.75  225939.60  342010.30
SECCHNAREA  2-MC    304    191217.93  -3020.04  73495.43  264713.36  77529.41   262645.77

Linear model estimates, expressed as ratios to the SRS variance:
Response    Region  Site/SRS  Year/SRS  Res/SRS  Site+Res/SRS  Local/SRS
LWDPIECE1   1-NC    0.331     0.058     0.637    0.968         0.589
LWDPIECE1   2-MC    0.844     -0.004    0.150    0.994         0.543
SECCHNAREA  1-NC    0.725     0.000     0.271    0.996         0.661
SECCHNAREA  2-MC    0.728     -0.011    0.280    1.008         0.295
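The ratio columns follow directly from the variance estimates, and the short Python sketch below reproduces them. The numbers are transcribed from the slides; the dictionary layout and the function name `ratios_to_srs` are illustrative choices, not part of the original analysis.

```python
# Variance estimates transcribed from the slides:
# (response, region) -> (site, year, residual, local NBH, SRS variance)
estimates = {
    ("LWDPIECE1", "1-NC"): (77.73, 13.54, 149.59, 138.47, 234.91),
    ("LWDPIECE1", "2-MC"): (81.11, -0.43, 14.43, 52.19, 96.13),
    ("SECCHNAREA", "1-NC"): (247805.84, -9.22, 92810.91, 225939.60, 342010.30),
    ("SECCHNAREA", "2-MC"): (191217.93, -3020.04, 73495.43, 77529.41, 262645.77),
}


def ratios_to_srs(site, year, resid, local, srs):
    """Express each variance estimate as a fraction of the SRS variance."""
    return {
        "Site/SRS": site / srs,
        "Year/SRS": year / srs,
        "Res/SRS": resid / srs,
        "Site+Res/SRS": (site + resid) / srs,
        "Local/SRS": local / srs,
    }


for key, values in estimates.items():
    ratios = ratios_to_srs(*values)
    print(key, {name: round(value, 3) for name, value in ratios.items()})
```

In all four rows the site and residual components together come to within about 3% of the SRS variance, while the local estimate is noticeably smaller.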
Results and Conclusions
[Figure: Local Variance plotted against Residual Variance Component, All Regions; both axes run from 0.0 to 1.0]
As the residual component of variance increases, so too does NBH
[Figure: Local Variance plotted against Site Variance Component, All Regions; both axes run from 0.0 to 1.0]
As the site component of variance increases, NBH decreases
Results and Conclusions
$\frac{1}{35} \sum_{i=1}^{35} \frac{s^2_{resid,i}}{s^2_{NBH,i}} = 0.63$
Indicating that, on average, the residual component of variance is 0.63 times the NBH estimate
$\frac{1}{35} \sum_{i=1}^{35} \frac{s^2_{site,i}}{s^2_{NBH,i}} = 1.32$
Indicating that, on average, the site component of variance is 1.32 times the NBH estimate
It was also of interest to express the NBH estimate as a linear combination of the residual and site components of variance:
$\hat{s}^2_{NBH,i} = 0.42 \, s^2_{site,i} + 0.72 \, s^2_{resid,i}$
This relation is forced through the origin. The squared correlation $\mathrm{corr}(s^2_{NBH,i}, \hat{s}^2_{NBH,i})^2 = 0.517$, interpretable as $R^2$, shows moderate strength of this relation
What does it all mean?
Practical Implications
When is it appropriate to use the linear model method over the local area estimate of variance?
  The local estimate will capture all variance due to residual effect, and we have seen that time variance is relatively negligible
  The results of this study show that the NBH accounts for roughly 60% to 70% of the variance due to site
  Note again the inverse relationship of site variance with NBH
This project is an intermediate step in the larger problem of accurately modeling the trend of an environmental indicator
Acknowledgements
N. Scott Urquhart
Don Stevens
Tom Kincaid
Kim Jones
Oregon Department of Fish and Wildlife
The work reported here was developed under the STAR Research Assistance
Agreements CR-829095 awarded to Colorado State University and CR-829096
awarded to Oregon State University by the U.S. Environmental
Protection Agency (EPA). This presentation has not been formally reviewed
by the EPA. The work done and views expressed here are solely those of
the author and STARMAP. EPA does not endorse any products or
commercial services mentioned in this report.
Thank you for listening today
Resources
Hocking, R.R. 2002. Methods and Applications of Linear Models. Wiley.
Horvitz, D.G., and Thompson, D.J. 1952. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association 47: 663–685.
Sarndal, C., Swensson, B., and Wretman, J. 1992. Model Assisted Survey Sampling. Springer-Verlag.
Stevens, D.L., and Olsen, A.R. 1999. Spatially Restricted Surveys Over Time for Aquatic Resources. Journal of Agricultural, Biological and Environmental Statistics 4(4): 415-428.
Stevens, D.L. 2003. Sampling Design and Statistical Analysis Methods for the Integrated Biological and Physical Monitoring of Oregon Streams. The Oregon Plan for Salmon and Watersheds.
Stevens, D.L., and Olsen, A.R. 2003. Variance estimation for spatially balanced samples of environmental resources. Environmetrics, submitted.
Strahler, A.N. 1957. Quantitative analysis of watershed geomorphology. Transactions of the American Geophysical Union 38: 913-920.
Urquhart, N.S., Overton, W.S., and Birkes, D.S. 1993. Comparing Sampling Designs for Monitoring Ecological Status and Trends: Impact of Temporal Patterns. Statistics for the Environment, chapter 3: 71-85.
Urquhart, N.S., Paulsen, S.G., and Larsen, D.P. 1998. Monitoring for policy-relevant regional trends over time. Ecological Applications 8: 246-257.
Urquhart, N.S., and Kincaid, T.M. 1999. Designs for detecting trend from repeated surveys of ecological resources. Journal of Agricultural, Biological and Environmental Statistics 4: 404-414.
The Oregon Plan for Watersheds and Salmon, Biennial Report, 2005.
www.epa.gov/emap
www.dfw.state.or.us
Map obtained from http://nrimp.dfw.state.or.us/crl/default.aspx?PN=GCAs