WE PROBABLY COULD HAVE MORE FUN TALKING ABOUT THESE TRAFFIC STOPPERS

KSU Monitoring Designs # 1
WHO CLEARLY HAVE
THE RIGHT OF WAY!
BUT…
KSU Monitoring Designs # 2
DESIGNING MONITORING
SURVEYS OVER TIME
(PANEL SURVEYS)
POWER, VARIANCE and RELATED
TOPICS
N. Scott Urquhart
Senior Research Scientist
Department of Statistics
Colorado State University
Fort Collins, CO 80523-1877
KSU Monitoring Designs # 3
OUTLINE
• Anatomy of Sampling Studies of Ecological Responses Through Time
  – Collaborator = Tony Olsen, EPA, WED
  – http://www.oregonstate.edu/instruct/st571/urquhart/anatomy/index.htm
  – Urquhart, N.S. (1981). Anatomy of a study. HortScience 16:621-627.
• Elaboration on
  – Survey Designs – GRTS – Work of Don Stevens
  – Temporal Designs
• Power to detect trend – joint with Tom Kincaid
  – Uses components of variance
• Current work = estimating variance
  – Work of Sarah Williams, finishing MS this month
KSU Monitoring Designs # 4
A CONTEXT
• “EMAP-TYPE SITUATIONS”
  EMAP = US EPA’s Environmental Monitoring and Assessment Program
• Estimate status, changes, and trends in selected indicators of our nation’s ecological resources on a regional scale with known confidence.
• Estimate status, changes, and trends in the extent and geographic coverage of our nation’s ecological resources on a regional scale with known confidence.
• Describe associations between indicators of anthropogenic stress and indicators of condition.
KSU Monitoring Designs # 5
WHO MUST COMMUNICATE
• Ecologists & Other Biologists
• Statisticians
• Geographers
• Geographic Information Specialists
• Information Managers
• Quality Assurance Personnel
• Managers, at Various Levels
KSU Monitoring Designs # 6
“SAMPLING”
• A WORD OF MANY MEANINGS
  – A statistician often associates it with survey sampling
  – An ecologist may associate it with the selection of local sites or material
  – A laboratory scientist may associate it with the selection of material to be analyzed from the material supplied
  – Common general meaning, varied specific meanings
KSU Monitoring Designs # 7
THE SPECIAL NEED
• Communication Demands a Distinction Between
  – The local process of evaluating a response, and
  – The statistical selection of a sampling unit, for example,
    – A lake
    – A point on a stream
    – A point in vegetation
• The terms
  – Response design
  – Sampling design or survey design
• Can be used to make this distinction
KSU Monitoring Designs # 8
BASIC ROLES
• Survey Design Tells Us Where To Go to Collect Sample Information or Material
• Response Design Tells Us What To Do Once We Get There
• But These Two Components Exist in a Broader Context
KSU Monitoring Designs # 9
AN IMPORTANT DISTINCTION
• Monitoring Strategy
  – Conceptual
  – Impacted by objectives
  – Addressable without regard to the inference strategy
• Inference Strategy
  – Places to evaluate the response
  – Relation between points evaluated and the population, i.e., the basis for inference
KSU Monitoring Designs # 10
SAMPLING STUDIES OF ECOLOGICAL RESPONSES THROUGH TIME HAVE
• Monitoring Strategy (these components exist regardless of the inference strategy)
  – Universe model
  – Statistical population
  – Domain design
  – Response design
• Inference Strategy (these components exist for any monitoring strategy)
  – Survey design
  – Temporal design
  – Quality assurance design
KSU Monitoring Designs # 11
The UNIVERSE MODEL
• Reality (Universe): Ecological Entity Within a Defined Geographic Area to be Monitored
• Model of the Universe:
  – Development of monitoring approach requires construction of a model for the universe
• Elements Of The Universe Model: Set of Entities Composing the Entire Universe of Concern
KSU Monitoring Designs # 12
The UNIVERSE MODEL
• Population Description And Its Sampling Require Definition Of the “Units” in the Population
  – Discrete units:
    – Lakes may be viewed this way
    – Individual trees can be viewed this way, too
  – Continuous structure in space of some dimension:
    – 2-SPACE: Forests or Agroecosystems
    – 1-SPACE: Streams
    – 3-SPACE: Groundwater
KSU Monitoring Designs # 13
A CONTINUOUS MODEL FOR STREAMS
[Figure: stream network with segments labeled by Strahler order; several first-order segments join to form a second-order segment.]
KSU Monitoring Designs # 14
The STATISTICAL POPULATION
• The Collection of Units (as modeled) Over Some Region of Definition
  – Spatial
  – Temporal
  – Spatial and Temporal
• Population Definition Could Include Features Which Depend on Response Values
  – EX: acid sensitive streams at upper elevations
KSU Monitoring Designs # 15
The DOMAIN Design
• Specifies Subpopulations or “Domains” of Special Interest
• May Specify Meaningful Comparisons Between Domains
  – Similar to “planned comparisons” in experimental design situations
  – Domain design may depend on response values
    – EX: Warm Versus Cold Water Lakes
KSU Monitoring Designs # 16
The RESPONSE DESIGN
• The Response Design Specifies
  – The process of obtaining a response
    – At an individual element (site)
    – Of the resource
    – During a single monitoring period
  – Response: What Will Be Determined on an Element
    – Needs to be responsive to the objectives of the monitoring activity
KSU Monitoring Designs # 17
The INFERENCE STRATEGY
• Is The Basis For Scientific Inference
• Provides The Connection Between Objectives and the Monitoring Strategy
• Monitoring Strategy Usually Must Rely On Obtaining Information on a Subset Of All Possible Elements in the Universe
• Specifies Which Elements of the Universe Will Have Responses Determined on Them
• Can Be Based on Either
  – Judgment selection of units
    – Inferential validity rests on knowledge of relation between the universe and the units evaluated
    – Why do a study if you know this much about the population?
  – Probability selection of units
    – The focus here
KSU Monitoring Designs # 18
The SURVEY Design
• Probability Based Survey Designs are Considered Here
  – May Be Somewhat Limited To Sedentary Resources
• Positive Features -- As An Observational Study
  – Permit clear statistical inference to well-defined populations
  – Measurements often can be made in natural settings, giving greater realism to results
KSU Monitoring Designs # 19
The SURVEY DESIGN - CONTINUED
• Disadvantages
  – Limited control over predictor variables
  – Restricts causative inference
  – Usually will produce inaccessible sampling points
    – Good - for inference
    – Bad - for logistics
KSU Monitoring Designs # 20
The TEMPORAL Design
• The TEMPORAL DESIGN specifies the pattern of revisits to sites selected by the Survey Design
  – Sampled population units are partitioned into one (degenerate case) or more PANELS.
  – Each population unit in the same panel has the same temporal pattern of revisits.
  – Panel definition could be probabilistic or systematic
• Several temporal designs follow after a brief discussion of the rest of the Anatomy, and a bit on site selection.
KSU Monitoring Designs # 21
QUALITY ASSURANCE DESIGN
• Defines Those Activities Intended to Provide Data of Known Quality:
  – Blind duplicates
  – Accepted chemical standards, etc.
• Can Provide Valid Estimates of the Variance Of Pure Measurement Error
KSU Monitoring Designs # 22
ON SITE SELECTION
• Systematically Selected Sites
  – Good for means & totals, but do not support design-based estimates of variance
  – Probably OK for large areas like national forests
  – Systematic designs can systematically miss things that have a natural layout.
    – EX: Triangular grid (deliberately skewed) in early EMAP got fouled up with
      – Coastline in the Northeast
      – The canal network in Florida
      – Lakes east of the Cascade Mountain Range in Oregon
• How to select spatially balanced, but random sites?
KSU Monitoring Designs # 23
GENERALIZED RANDOM TESSELLATION
STRATIFIED (GRTS) DESIGN
• Due to Don Stevens – see references
• Allows
  – A continuous population model
  – Variable density sampling by defined areas
  – Accommodates an “imperfect frame” = reality
  – Sequential addition of points while maintaining spatial balance
• Differing measurements
  – Lots of points for inexpensive measures
  – A subset for more expensive measures
  – A further subset for very expensive measures
  – Implemented in Southern California Bight
KSU Monitoring Designs # 24
GENERALIZED RANDOM TESSELLATION
STRATIFIED (GRTS) DESIGN
• Two GIS-based implementations
  – EMAP R code operates on ARC “Shape” files, and returns points there
    – Begin at http://www.epa.gov/nheerl/arm/
    – http://www.epa.gov/nheerl/arm/designpages/monitdesign/monitoring_design_info.htm
    – http://www.epa.gov/nheerl/arm/documents/design_doc/psurvey.design_2.2.1.zip
  – STARMAP – Dave Theobald
    – RRQRR operates completely in ArcGIS
    – http://www.nrel.colostate.edu/projects/starmap/rrqrr_index.htm
• Both Allow Variable (spatial) Sampling Rates
  – Generally much better than stratification
  – (We can talk about this more if you want)
KSU Monitoring Designs # 25
THE FOLLOWING MATERIAL WAS
ADAPTED FROM
Urquhart, N.S. and T.M. Kincaid (1999). Designs for detecting trend from repeated surveys of ecological resources. Journal of Agricultural, Biological and Environmental Statistics 4: 404-414.
Initially presented at the invited conference Environmental Monitoring Surveys Over Time, held at the University of Washington, Seattle, in 1998
KSU Monitoring Designs # 26
MOTIVATING SITUATION
• In 1986 the Oregon Department of Fish and Wildlife Sought a “One Time” Probability Sampling Design To Survey Coastal Salmon. They Used It In 1990.
  – It showed earlier estimates of salmon returns to spawn to have been grossly overstated.
  – Consequence: continue to repeat an available design.
• How Good Is The Repeated Use Of Such a Design For Estimating Trend?
KSU Monitoring Designs # 27
CONCLUSIONS
• General: Power for Trend Detection
  – Planned revisits are far superior to obtaining revisits from random “hits”
• Year Variance: Power Deteriorates Fast as $\sigma^2_{\mathrm{YEAR}}$ Increases
• Site Variance:
  – No problem with revisit designs.
  – Without revisits it increases residual variance.
• Sampling Rate: Power Increases with Sampling Rate (No surprise!)
KSU Monitoring Designs # 28
EVALUATION CONTEXT
• General Perspective
  – Finite population sampling
  – But model assisted
    – A generalization of the “error analysis” perspective of samplers
    – But recognizing realities of natural resource sampling
• Specific Perspective
  – Finite population, such as stream segments.
  – Response exists continuously in time, or at least for recurring blocks of time.
  – Take independent samples at different points in time (during an “index window”)
KSU Monitoring Designs # 29
EVALUATION CONTEXT
(CONTINUED)
• Model:
  – Sites (or stream segments) = a random effect
  – Years = a random effect, but may contain trend
  – Residual = a random effect
    – Specific evaluation time
    – Variation introduced by collection protocol
    – Crew effect, if present (often present for large surveys)
    – “Measurement error” - broadly interpreted
KSU Monitoring Designs # 30
PANEL PLANS
= “TEMPORAL DESIGNS”
• Sampled Population Units are Partitioned into One (Degenerate Case) or More Panels
  – Each population unit in the same panel has the same temporal pattern of revisits.
  – Panel definition could be probabilistic or systematic
• Specific Plans
  – Always revisit
  – Never revisit = repeated surveys
  – Random revisits and other plans
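
A panel plan can be written down as a panel-by-year table of visits. The sketch below is only an illustration in plain Python; the function names are hypothetical and do not come from the talk, and the four-year cycle for the serially alternating plan is an assumption taken from the four panels shown for that design.

```python
def always_revisit(n_years):
    """One panel, visited in every year (the 'pure panel')."""
    return [[1] * n_years]


def never_revisit(n_years):
    """A new panel each year: panel p is visited only in year p."""
    return [[1 if year == panel else 0 for year in range(n_years)]
            for panel in range(n_years)]


def serially_alternating(n_years, cycle=4):
    """`cycle` panels; panel p is visited in years p, p + cycle, p + 2*cycle, ..."""
    return [[1 if year % cycle == panel else 0 for year in range(n_years)]
            for panel in range(cycle)]


def show(plan):
    """Print a plan with X for a visit and '.' for no visit."""
    for number, row in enumerate(plan, start=1):
        print(f"{number:>5} " + " ".join("X" if cell else "." for cell in row))


if __name__ == "__main__":
    for name, plan in [("always revisit", always_revisit(13)),
                       ("never revisit", never_revisit(9)),
                       ("serially alternating", serially_alternating(13))]:
        print(name)
        show(plan)
```

Each row is a panel and each column a time period, so every site in a panel shares the same row of visits.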
KSU Monitoring Designs # 31
TEMPORAL DESIGN #1:
ALWAYS REVISIT = ONE PANEL
(This is Wayne Fuller’s “PURE PANEL”)
       TIME PERIOD (ex: YEARS)
PANEL  1  2  3  4  5  6  7  8  9 10 11 12 13 ...
    1  X  X  X  X  X  X  X  X  X  X  X  X  X
KSU Monitoring Designs # 32
TEMPORAL DESIGN #2:
NEVER REVISIT = NEW PANEL EACH YEAR
(INDEPENDENT SURVEYS IN A LARGE POPULATION)
       TIME PERIOD (ex: YEARS)
PANEL  1  2  3  4  5  6  7  8  9 10 11 12 13 ...
    1  X
    2     X
    3        X
    4           X
    5              X
    6                 X
    7                    X
    8                       X
    9                          X
KSU Monitoring Designs # 33
TEMPORAL DESIGN #3:
ROTATING PANEL like NASS
[Table: panels 1 through 9 (rows) by time period, ex: years 1 through 13 (columns), with X marking the years in which each panel is visited. Each panel is visited for a run of consecutive years and then rotates out; the exact cell layout is not recoverable.]
KSU Monitoring Designs # 34
TEMPORAL DESIGN #3:
ROTATING PANEL
• A Rotating Panel Design Is The Temporal Design Used By The National Agricultural Statistics Service (US - “NASS”)
• This Temporal Design Is “Connected” In The Experimental Design Sense
  – It is fairly well suited for estimating “status,”
  – But not particularly powerful for detecting trend over intermediate time spans
KSU Monitoring Designs # 35
TEMPORAL DESIGN #4:
SERIALLY ALTERNATING
(ORIGINAL EMAP)
       TIME PERIOD (ex: YEARS)
PANEL  1  2  3  4  5  6  7  8  9 10 11 12 13 ...
    1  X           X           X           X
    2     X           X           X
    3        X           X           X
    4           X           X           X
• This Temporal Design Is “Unconnected” in the Experimental Design Sense.
KSU Monitoring Designs # 36
TEMPORAL DESIGN #5:
AUGMENTED SERIALLY ALTERNATING
(CURRENTLY USED BY EMAP FOR SURFACE WATERS)
[Table: panels 1, 2, 3, 4, 1A, 1B, …, 2A, … (rows) by time period, ex: years 1 through 13 (columns), with X marking visits; the exact cell layout is not recoverable.]
• This Temporal Design Is “Connected” in the Experimental Design Sense.
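
One way to read “connected” is through the graph that links each panel to the years in which it is visited: the design is connected when that graph forms a single piece. The following is a minimal sketch under that reading. The function name is hypothetical, and the augmentation panels in the example are an assumed illustration (three extra panels, each visited in two consecutive years), not the actual EMAP layout.

```python
def is_connected(plan):
    """plan[p][j] is truthy when panel p is visited in year j."""
    n_panels, n_years = len(plan), len(plan[0])
    # Union-find over nodes: panels first, then years.
    parent = list(range(n_panels + n_years))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p in range(n_panels):
        for j in range(n_years):
            if plan[p][j]:
                parent[find(p)] = find(n_panels + j)

    # Count the components among panels and years that are actually used.
    roots = {find(p) for p in range(n_panels) if any(plan[p])}
    roots |= {find(n_panels + j) for j in range(n_years)
              if any(row[j] for row in plan)}
    return len(roots) == 1


if __name__ == "__main__":
    n_years = 12
    serial = [[1 if j % 4 == p else 0 for j in range(n_years)] for p in range(4)]
    # Hypothetical augmentation: panels revisited in consecutive years.
    extra = [[1 if j in (k, k + 1) else 0 for j in range(n_years)] for k in range(3)]
    print("serially alternating connected:", is_connected(serial))
    print("augmented connected:", is_connected(serial + extra))
```

In the plain serially alternating plan no year is shared between panels, so panel and year effects cannot be separated across panels; any panel that is visited in years belonging to different cycles ties the pieces together.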
KSU Monitoring Designs # 37
TEMPORAL DESIGN #6:
RANDOM PANELS
                          NUMBERS OF OCCURRENCES
            YEAR          N = 240              N = 600
PANEL      1  2  3   SAMPLE 1  SAMPLE 2   SAMPLE 1  SAMPLE 2
1          X               37        36         46        48
2             X            38        35         46        48
3                X         35        34         46        49
4          X  X             9         9          6         6
5          X     X         12        10          6         5
6             X  X         11        11          6         5
7          X  X  X          2         5          2         1
NO VISIT                   96       100        442       438
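
The occurrence counts for random panels can be compared with what chance alone would give. This is a small sketch, assuming each year's sample is an independent simple random sample of n = 60 sites from the N sites, so a site is included in any one year with probability n/N; the function name is hypothetical.

```python
from itertools import product


def expected_pattern_counts(N, n, n_years=3):
    """Expected number of sites showing each visit pattern over n_years.

    Assumes independent simple random samples of size n each year,
    so a site is included in a given year with probability n / N.
    """
    f = n / N
    counts = {}
    for pattern in product([1, 0], repeat=n_years):
        visits = sum(pattern)
        label = "".join("X" if v else "-" for v in pattern)
        counts[label] = N * f ** visits * (1 - f) ** (n_years - visits)
    return counts


if __name__ == "__main__":
    for N in (240, 600):
        print(f"N = {N}")
        for label, value in expected_pattern_counts(N, 60).items():
            print(f"  {label}: {value:.2f}")
```

For N = 240 this gives 33.75 sites for each single-visit pattern, 11.25 for each two-visit pattern, 3.75 visited in all three years, and 101.25 never visited.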
KSU Monitoring Designs # 38
STATISTICAL MODEL
• Consider A Finite Population Of Sites
  $\{S_1, S_2, \ldots, S_N\}$
• and a Time Series Of Response Values At Each Site:
  $\{Y_1(t), Y_2(t), \ldots, Y_N(t)\}$ and their average: $\bar{Y}(t)$
  – A finite population of time series
• Time is continuous, but suppose
  – Only a sample can be observed in any year, and
  – Only during an index window of, say, 10% of a year
KSU Monitoring Designs # 39
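STATISTICAL MODEL -- II
AGAIN CONSIDER THE UNDERLYING TIME SERIES DURING AN INDEX WINDOW
$\{Y_1(t), Y_2(t), \ldots, Y_N(t)\}$
and their averages: $\bar{Y}_i(\cdot)$, $\bar{Y}(t)$, and $\bar{Y}(\cdot)$.
$\sigma^2_{\mathrm{SITE}} = \mathrm{var}\{\bar{Y}_i(\cdot)\}$,
$\sigma^2_{\mathrm{YEAR}} = \mathrm{var}\{\bar{Y}(t)\}$,
$\sigma^2_{\mathrm{RESIDUAL}} = \mathrm{var}\{Y_i(t) - \bar{Y}_i(\cdot) - \bar{Y}(t) + \bar{Y}(\cdot)\}$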
KSU Monitoring Designs # 40
PART OF A TIME SERIES
DURING AN INDEX WINDOW
[Figure: response values (axis ticks 5 to 20) plotted against time in years (axis ticks 3.4 to 3.7), with the spread labeled $\sigma^2_{\mathrm{RESIDUAL}}$.]
KSU Monitoring Designs # 41
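STATISTICAL MODEL -- III
$\{Y_i(t)\} \rightarrow \{Y_{ij}\}$, where $i$ indexes sites and $j$ indexes "years"
$Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})$
$\phantom{Y_{ij}} = \bar{Y}_{\cdot\cdot} + S_i + T_j + E_{ij}$
and $S_i \sim (0, \sigma^2_{\mathrm{SITE}})$, $T_j \sim (0, \sigma^2_{\mathrm{YEAR}})$, and $E_{ij} \sim (0, \sigma^2_{\mathrm{RESIDUAL}})$,
with these random variables otherwise uncorrelated.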
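
To make the decomposition concrete, the sketch below simulates responses from the additive site + year + residual model and recovers the three variance components with the usual balanced two-way ANOVA (method of moments) estimators. It assumes NumPy, normal random effects, and every site observed in every year; the function names and the mean of 10 are illustrative choices, not values from the talk.

```python
import numpy as np


def simulate(n_sites, n_years, var_site, var_year, var_resid, mu=10.0, seed=0):
    """Draw Y[i, j] = mu + S[i] + T[j] + E[i, j] with normal random effects."""
    rng = np.random.default_rng(seed)
    site = rng.normal(0.0, np.sqrt(var_site), size=(n_sites, 1))
    year = rng.normal(0.0, np.sqrt(var_year), size=(1, n_years))
    resid = rng.normal(0.0, np.sqrt(var_resid), size=(n_sites, n_years))
    return mu + site + year + resid


def anova_components(y):
    """Method-of-moments estimates for a balanced sites-by-years table."""
    n_sites, n_years = y.shape
    grand = y.mean()
    site_means = y.mean(axis=1, keepdims=True)
    year_means = y.mean(axis=0, keepdims=True)
    resid = y - site_means - year_means + grand
    ms_site = n_years * np.sum((site_means - grand) ** 2) / (n_sites - 1)
    ms_year = n_sites * np.sum((year_means - grand) ** 2) / (n_years - 1)
    ms_resid = np.sum(resid ** 2) / ((n_sites - 1) * (n_years - 1))
    return {
        "site": (ms_site - ms_resid) / n_years,
        "year": (ms_year - ms_resid) / n_sites,
        "residual": ms_resid,
    }


if __name__ == "__main__":
    y = simulate(200, 200, var_site=1.875, var_year=0.3, var_resid=1.0)
    print(anova_components(y))
```

The residual term in `anova_components` is exactly the last bracket of the decomposition: the observation minus its site mean and year mean, plus the grand mean.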
KSU Monitoring Designs # 42
STATISTICAL MODEL -- IV
• If $p$ Indexes Panels, Then
  – Sites are nested in panels: $p(i)$, and
  – Years of visit are indicated by panel with $n_{pj} > 0$ or $n_{pj} = 0$ for panels visited or not visited in year $j$
  – The vector of cell means (of “visited” cells) has a covariance matrix $\Sigma$:
    $\mathrm{cov}(\bar{Y}_{pj}) = \Sigma(\sigma^2_{\mathrm{SITE}}, \sigma^2_{\mathrm{YEAR}}, \sigma^2_{\mathrm{RESIDUAL}}, n_{pj})$
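
A minimal sketch of how such a covariance matrix could be assembled for the visited cell means. It makes two simplifying assumptions that are not from the talk: the population is large enough to ignore finite population corrections, and a panel has the same number of sites at every visit. NumPy is assumed and the function name is hypothetical.

```python
import numpy as np


def panel_mean_covariance(n_pj, var_site, var_year, var_resid):
    """Covariance matrix of the panel-by-year cell means for visited cells.

    n_pj[p][j] is the number of sites of panel p visited in year j (0 if none).
    Assumes no finite population correction and a constant panel size.
    """
    n_pj = np.asarray(n_pj, dtype=float)
    cells = [(p, j) for p in range(n_pj.shape[0])
             for j in range(n_pj.shape[1]) if n_pj[p, j] > 0]
    sigma = np.zeros((len(cells), len(cells)))
    for a, (p, j) in enumerate(cells):
        for b, (q, k) in enumerate(cells):
            if p == q:              # same sites: shared site effects
                sigma[a, b] += var_site / n_pj[p, j]
            if j == k:              # same year: shared year effect
                sigma[a, b] += var_year
            if p == q and j == k:   # same cell: averaged residuals
                sigma[a, b] += var_resid / n_pj[p, j]
    return cells, sigma


if __name__ == "__main__":
    always = [[60, 60, 60]]                       # one panel, three years
    never = [[60, 0, 0], [0, 60, 0], [0, 0, 60]]  # new panel each year
    for name, plan in [("always revisit", always), ("never revisit", never)]:
        cells, sigma = panel_mean_covariance(plan, 1.875, 0.3, 1.0)
        print(name, cells)
        print(np.round(sigma, 4))
```

With always revisit the site variance appears in every entry of the matrix, while with never revisit it appears only on the diagonal.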
KSU Monitoring Designs # 43
STATISTICAL MODEL -- V
• Now Let $X$ Denote a Regressor Matrix Containing a Column Of 1’s and a Column of the Numbers of the Time Periods Corresponding to the Filled Cells. The Second Elements of
  $\hat{\beta} = (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1}\bar{Y}$ and $\mathrm{cov}(\hat{\beta}) = (X'\Sigma^{-1}X)^{-1}$
  Contain an Estimate Of Trend and its Variance; the Square Root of That Variance Is the Standard Error.
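
The generalized least squares step can be sketched directly from these two formulas. This assumes NumPy and that the covariance matrix of the cell means is already available; the function and argument names are hypothetical.

```python
import numpy as np


def gls_trend(years, cell_means, sigma):
    """Return the GLS trend (slope) estimate and its standard error.

    years      : time period of each visited cell
    cell_means : mean response of each visited cell
    sigma      : covariance matrix of the cell means
    """
    years = np.asarray(years, dtype=float)
    y = np.asarray(cell_means, dtype=float)
    X = np.column_stack([np.ones_like(years), years])  # intercept, time
    sigma_inv = np.linalg.inv(np.asarray(sigma, dtype=float))
    cov_beta = np.linalg.inv(X.T @ sigma_inv @ X)
    beta = cov_beta @ X.T @ sigma_inv @ y
    return beta[1], np.sqrt(cov_beta[1, 1])


if __name__ == "__main__":
    years = [1, 2, 3, 4, 5]
    means = [2.5, 3.0, 3.5, 4.0, 4.5]   # illustrative values only
    slope, se = gls_trend(years, means, np.eye(5))
    print(slope, se)
```

With an identity covariance matrix this reduces to ordinary least squares, which gives a quick sanity check on the implementation.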
KSU Monitoring Designs # 44
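TOWARD POWER
• Ability of a Panel Plan to Detect Trend Can Be Expressed As Power.
• We Will Evaluate Power in Terms of Ratios of Variance Components:
  $\sigma^2_{\mathrm{SITE}} / \sigma^2_{\mathrm{RESIDUAL}}$ and $\sigma^2_{\mathrm{YEAR}} / \sigma^2_{\mathrm{RESIDUAL}}$
• and of the trend slope relative to $\sigma_{\mathrm{RESIDUAL}}$, so approximately, $\hat{\beta} \sim N(\beta, \sigma^2_{\hat{\beta}})$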
KSU Monitoring Designs # 45
A SIMULATION STUDY
TO MAKE POWER COMPARISONS
• $\sigma^2_{\mathrm{SITES}} / \sigma^2_{\mathrm{RESIDUAL}} = 0, 1.875, 2.5$
• $\sigma^2_{\mathrm{YEARS}} / \sigma^2_{\mathrm{RESIDUAL}} = 0, 0.075, 0.15, 0.3$
• n = 60
• N = 60, 240, 600, 1200, 10,000
  ==> Sampling rates of 100%, 25%, 10%, 5%, ~ 0%
KSU Monitoring Designs # 46
POWER FOR DETECTING TREND
SAMPLING A FINITE POPULATION OF SIZE N
$\sigma^2_{\mathrm{SITES}} = 1.875$ and $\sigma^2_{\mathrm{YEARS}} = 0.000$
[Figure: power for trend (0 to 1) versus time in years (0 to 20). Labels: ALWAYS REVISIT, or EMAP-LIKE; N = 60, n = 60.]
KSU Monitoring Designs # 47
POWER FOR DETECTING TREND
SAMPLING A FINITE POPULATION OF SIZE N
$\sigma^2_{\mathrm{SITES}} = 1.875$ and $\sigma^2_{\mathrm{YEARS}} = 0.000, 0.075, 0.15, 0.30$
[Figure: power for trend (0 to 1) versus time in years (0 to 20). Labels: ALWAYS REVISIT, or EMAP-LIKE; N = 60, n = 60.]
KSU Monitoring Designs # 48
POWER FOR DETECTING TREND
SAMPLING A FINITE POPULATION OF SIZE N
$\sigma^2_{\mathrm{SITES}} = 1.875$ and $\sigma^2_{\mathrm{YEARS}} = 0.000$
[Figure: power for trend (0 to 1) versus time in years (0 to 20). Labels: ALWAYS REVISIT, or EMAP-LIKE; N = 60, n = 60.]
KSU Monitoring Designs # 49
POWER FOR DETECTING TREND
SAMPLING A FINITE POPULATION OF SIZE N
$\sigma^2_{\mathrm{SITES}} = 1.875$ and $\sigma^2_{\mathrm{YEARS}} = 0.000$
[Figure: power for trend (0 to 1) versus time in years (0 to 20). Labels: ALWAYS REVISIT, or EMAP-LIKE; NEVER REVISIT; N = 60, n = 60; N = 10,000, n = 60.]
KSU Monitoring Designs # 50
POWER FOR DETECTING TREND
SAMPLING A FINITE POPULATION OF SIZE N
$\sigma^2_{\mathrm{SITES}} = 1.875$ and $\sigma^2_{\mathrm{YEARS}} = 0.000$
[Figure: power for trend (0 to 1) versus time in years (0 to 20). Labels: ALWAYS REVISIT, or EMAP-LIKE; RANDOM REVISIT; NEVER REVISIT; N = 60, n = 60; N = 600, n = 60; N = 10,000, n = 60.]
KSU Monitoring Designs # 51
POWER FOR DETECTING TREND
SAMPLING A FINITE POPULATION OF SIZE N
$\sigma^2_{\mathrm{SITES}} = 1.875$ and $\sigma^2_{\mathrm{YEARS}} = 0.000$
[Figure: power for trend (0 to 1) versus time in years (0 to 20). Labels: ALWAYS REVISIT, or EMAP-LIKE; RANDOM REVISIT; NEVER REVISIT; N = 60, n = 60; N = 600, n = 60; N = 10,000, n = 60.]
KSU Monitoring Designs # 52
POWER FOR DETECTING TREND:
AS A FUNCTION OF TEMPORAL DESIGN
[Figure: power for trend (0 to 1) versus time in years (0 to 20), with curves labeled by temporal design number: 4&5, 1, 2, and 3 (ROTATING PANEL).]
KSU Monitoring Designs # 53
CONCLUSIONS
• General: Power for Trend Detection
  – Planned revisits are far superior to obtaining revisits from random “hits”
• Year Variance: Power Deteriorates Fast as $\sigma^2_{\mathrm{YEAR}}$ Increases
• Site Variance:
  – No problem with revisit designs.
  – Without revisits it increases residual variance.
• Sampling Rate: Power Increases with Sampling Rate (No surprise!)
KSU Monitoring Designs # 54
CURRENT WORK
• Stevens, D.L. Jr. and A.R. Olsen (2003). Variance estimation for spatially balanced samples of environmental resources. Environmetrics 14: 593-610.
  – Proposed a local estimator for variance.
  – I have been using some variance component estimators.
    – How do these two approaches relate?
    – Should one be used rather than the other?
• MS Student – Sarah Williams
  – Use local estimator for things like status measures
    – Because it includes some site variance
  – Use components of variance for trend studies
    – Revisits to sites remove most of the effect of that component
  – Currently investigating variance component of trend
    – And its impact on trend detection
KSU Monitoring Designs # 55
FUNDING ACKNOWLEDGEMENT
The work reported here today was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and STARMAP, the Program he represents. EPA does not endorse any products or commercial services mentioned in this presentation.
This research is funded by the U.S. EPA Science To Achieve Results (STAR) Program, Cooperative Agreement # CR-829095.
KSU Monitoring Designs # 56
DISTRIBUTION OF SIMULATED POWER: 10 YEARS
SITE VARIANCE = 1.875; YEAR VARIANCE: 0.30, 0.10, 0.075
[Figure: histograms of simulated power (percent, 0 to 40, versus power) for sampling rates of 25% (N = 240), 10% (N = 600), and 5% (N = 1,200). Power-axis tick labels run from 0.0606 to 0.0899.]
KSU Monitoring Designs # 57
DISTRIBUTION OF SIMULATED POWER: 20 YEARS
SITE VARIANCE = 1.875; YEAR VARIANCE: 0.30, 0.10, 0.075
[Figure: histograms of simulated power (percent, 0 to 40, versus power) for sampling rates of 25% (N = 240), 10% (N = 600), and 5% (N = 1,200). Power-axis tick labels run from 0.139 to 0.382.]
KSU Monitoring Designs # 58
DISTRIBUTION OF SIMULATED POWER: 10 YEARS
SITE VARIANCE = 2.50; YEAR VARIANCE: 0.30, 0.10, 0.075
[Figure: histograms of simulated power (percent, 0 to 40, versus power) for sampling rates of 25% (N = 240), 10% (N = 600), and 5% (N = 1,200). Power-axis tick labels run from 0.0606 to 0.0899.]
KSU Monitoring Designs # 59
DISTRIBUTION OF SIMULATED POWER: 20 YEARS
SITE VARIANCE = 2.50; YEAR VARIANCE: 0.30, 0.10, 0.075
[Figure: histograms of simulated power (percent, 0 to 40, versus power) for sampling rates of 25% (N = 240), 10% (N = 600), and 5% (N = 1,200). Power-axis tick labels run from 0.139 to 0.382.]
KSU Monitoring Designs # 60