Multi-Lag Cluster Enhancement of Fixed Grids for Variogram Estimation for Near Coastal Systems

Kerry J. Ritter, SCCWRP
Molly Leecaster, SCCWRP
N. Scott Urquhart, CSU
Ken Schiff, SCCWRP
Dawn Olsen, City of San Diego
Tim Stebbins, City of San Diego
Project Funding
• The work reported here was developed under the
STAR Research Assistance Agreement CR-829095
awarded by the U.S. Environmental Protection
Agency (EPA) to Colorado State University. This
presentation has not been formally reviewed by
EPA. The views expressed here are solely those of
the presenter and STARMAP, the Program they
represent. EPA does not endorse any products or
commercial services mentioned in this presentation.
• Southern California Coastal Water Research
Project (SCCWRP)
Background
• Maps of sediment condition are important for
making decisions regarding pollutant discharge
• Maps in marine systems are rare
• Special study by San Diego Municipal Wastewater
Treatment Plant
• Objective: To build statistically defensible maps of
chemical constituents and biological indices
around two sewage outfalls
– Point Loma
– South Bay
Point Loma and South Bay Outfalls
TYPICAL DESIGN SITUATION
• Many features of the real situation are
unknown.
– Here: The nature of the semivariogram
• Multiple Responses
– What is a good design for one response
may not be a good design for another!
• Time constraint
– Answer was required by June 14, 2004
Two-Phase Approach
• Phase I: Model spatial variability at various
spatial scales (e.g., variogram)
– This summer (2004)
• Phase II: Use information from Phase I to
design a survey that meets accuracy
requirements
– Next summer (2005)
• Current focus is on Phase I
Variogram
[Figure: example variogram, gamma versus distance (0 to 50), with the nugget, sill, and range marked]
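The three quantities marked on the variogram plot (nugget, sill, range) can be made concrete with a model function. The sketch below uses the spherical form, which is the model named later in the simulation study. The function name, the parameter names, and the convention that `sill` means the total sill (nugget included) are assumptions made for illustration, not details taken from the slides.

```python
import numpy as np

def spherical_variogram(h, nugget, sill, vrange):
    """Spherical semivariogram gamma(h).

    Assumed convention: `sill` is the total sill (nugget included),
    and `vrange` is the distance at which the sill is reached.
    """
    h = np.asarray(h, dtype=float)
    partial_sill = sill - nugget
    s = np.minimum(h / vrange, 1.0)          # scaled distance, capped at the range
    gamma = nugget + partial_sill * (1.5 * s - 0.5 * s**3)
    return np.where(h > 0, gamma, 0.0)       # gamma(0) is 0 by definition

# Semivariance at a few distances for nugget 0.10, sill 1, range 4
distances = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
values = spherical_variogram(distances, nugget=0.10, sill=1.0, vrange=4.0)
```

The curve jumps from 0 to the nugget just above zero distance, rises, and flattens at the sill once the distance reaches the range.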
Design Considerations for
Modeling the Variogram
• Sufficient replication at various spatial scales
– Variogram model
– Parameter estimates
• Adequate spatial coverage to support investigating
– Stationarity
– Isotropy vs. Anisotropy
– Strata
• Allow for multiple responses
Empirical Variograms
(Point Loma 2000 Regional Survey)
[Figure: empirical variograms (gamma versus distance) for TOC, COPPER, ZINC, and CHROMIUM, plus a lag distribution panel (number of pairs versus lag distance in km, 0 to 8). Fitted values printed under the variogram panels, in the order extracted: R=5.09 S=36.27 N=0.00; R=8.8 S=0.077 N=0.0242; R=6.14 S=218.55 N=0.00; R=2.75 S=22.53 N=0.00 (R, S, N presumably range, sill, nugget)]
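An empirical variogram like those in the figure pairs up all sites, bins the pairs by separation distance, and averages half the squared differences within each bin. The sketch below uses the classical method-of-moments estimator. The function name and arguments are hypothetical, and the slides do not say which estimator or binning was actually used.

```python
import numpy as np

def empirical_variogram(coords, values, bin_edges):
    """Classical (method-of-moments) semivariogram estimate per distance bin.

    Returns (gamma, n_pairs): the semivariance and the number of site pairs
    in each bin. Bins with no pairs get NaN.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)           # every unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1           # bin index for each pair
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)
    n_pairs = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = which == b
        n_pairs[b] = mask.sum()
        if n_pairs[b] > 0:
            gamma[b] = 0.5 * sqdiff[mask].mean()
    return gamma, n_pairs

# Four sites on a line, 1 km apart, with a linear trend in the response
coords = [[0, 0], [1, 0], [2, 0], [3, 0]]
gamma, n_pairs = empirical_variogram(coords, [0.0, 1.0, 2.0, 3.0],
                                     bin_edges=[0.5, 1.5, 2.5, 3.5])
```

The pair counts returned alongside the estimates are what a lag distribution plot displays: bins with few pairs give unstable semivariance estimates.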
Multi-Lag Cluster (MLC)
Enhancements to Fixed Grids
• Clusters of sites, spaced at various lag
distances, are placed around fixed locations
on an existing grid.
• Allows the current monitoring grid to remain
intact.
• Provides replication at multiple spatial scales.
There are many ways to
allocate resources within the
MLC
• Economic constraints limit the total number of
samples
– (e.g., 100 in Point Loma)
• More clusters with fewer sites within a cluster,
or fewer clusters with more sites?
• Shorter sample spacing or larger sample spacing?
• What is the best (or at least a decent!) design configuration?
Choosing the Best Design
Case Study: Point Loma
• Three design configurations
– S, STAR, and S with satellites
• Two sets of lag classes
– Shorter vs. larger sample spacing
• Compare lag distributions
• Simulation study
– Simulate response
– Consider different models of spatial variability
• Compare relative performance of designs for
estimating parameters
“STAR” and “S” Cluster Designs
[Figure: two panels, STAR DESIGN and S DESIGN, each on X (km) versus Y (km) axes running 0 to 100]
“S” and “S with Satellites” Designs
[Figure: two panels, S DESIGN and S with SATELLITES DESIGN, each on X (km) versus Y (km) axes running 0 to 100]
[Figure: three panels, STAR DESIGN, S DESIGN, and S with SATELLITES DESIGN, each on X (km) versus Y (km) axes running 0 to 100]
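The idea behind a multi-lag cluster, sites placed at several lag distances around a fixed station, can be sketched in a few lines. The slides do not give the exact geometry of the STAR or S clusters, so the layout below (straight arms radiating from the center, one site per lag on each arm) is purely illustrative. The function name, the center coordinates, and the arm directions are all hypothetical.

```python
import numpy as np

def cluster_sites(center, lags, directions_deg):
    """Center site plus one site per (direction, lag) combination.

    Hypothetical layout: each direction is a straight arm from the center,
    with sites at the given lag distances along it.
    """
    cx, cy = center
    sites = [(cx, cy)]
    for angle in np.deg2rad(directions_deg):
        for lag in lags:
            sites.append((cx + lag * np.cos(angle), cy + lag * np.sin(angle)))
    return np.array(sites)

lags_km = [0.05, 0.10, 0.20, 0.50]                        # one of the two lag sets
four_arm = cluster_sites((470.0, 3618.0), lags_km, [0, 90, 180, 270])
two_arm = cluster_sites((470.0, 3618.0), lags_km, [0, 180])
```

With four lags, a four-arm layout gives 17 sites and a two-arm layout gives 9, which matches the cluster sizes quoted in the sample allocation, but that agreement does not confirm that the real designs look like this.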
Sample Allocation
• Star
– Grid stations = 12
– 5 “STAR” clusters of size 17 (3 at grid stations, 2 at sites of interest)
– 1 “S” cluster of size 9
– Field duplicates = 9
– Total samples = 12 + 3*(17-1) + 2*(17) + 9 + 9 = 112
• S
– Grid stations = 12
– 11 “S” clusters of size 9 (5 at grid stations, 6 at sites of interest)
– Field duplicates = 6
– Total samples = 12 + 5*(9-1) + 6*(9) + 6 = 112
• S with Satellites
– Grid stations = 12
– 8 “S” clusters of size 9 (4 at grid stations, 4 at sites of interest)
– 8 satellites added to each of 3 “S” clusters
– Field duplicates = 8
– Total samples = 12 + 4*(9-1) + 4*(9) + 3*(8) + 8 = 112
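The sample totals can be checked with a small helper. A cluster centered on a grid station adds one site fewer than its size, because the station itself is already counted among the 12. The helper name and argument layout are hypothetical. For the S with Satellites design, the sketch reads "8 satellites added to 3 S" as 8 extra sites at each of 3 clusters, with 4 clusters at grid stations, 4 at sites of interest, and 8 duplicates. That reading is an assumption, chosen because it is the one under which the listed counts sum to 112.

```python
def total_samples(grid_stations, clusters_at_grid, clusters_elsewhere,
                  extra_sites, duplicates):
    """Total samples for one design.

    clusters_at_grid / clusters_elsewhere: lists of (count, cluster_size).
    A cluster at a grid station reuses that station as one of its sites.
    """
    total = grid_stations + extra_sites + duplicates
    for count, size in clusters_at_grid:
        total += count * (size - 1)
    for count, size in clusters_elsewhere:
        total += count * size
    return total

star = total_samples(12, [(3, 17)], [(2, 17), (1, 9)], extra_sites=0, duplicates=9)
s_design = total_samples(12, [(5, 9)], [(6, 9)], extra_sites=0, duplicates=6)
# Assumed reading: 8 satellites at each of 3 clusters -> 24 extra sites
s_sat = total_samples(12, [(4, 9)], [(4, 9)], extra_sites=3 * 8, duplicates=8)
```

Under these readings all three designs use the same budget, so the comparison between them is at equal sampling effort.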
“Star” Cluster Design
[Figure: two panels, each titled "Point Loma 5 Star + 1 S Cluster", X (km) 466 to 474 versus Y (km) 3610 to 3625]
“S” Cluster Design
[Figure: two S DESIGN panels, one for Lag = 0.05, 0.10, 0.20, 0.50 and one for Lag = 0.05, 0.25, 1.00, 3.00; X (km) 466 to 474 versus Y (km) 3610 to 3625]
“S” Cluster with Satellites
[Figure: two S with SATELLITES DESIGN panels; X (km) 466 to 474 versus Y (km) 3610 to 3625]
Omnidirectional Lag Dist.
[Figure: two panels titled "Omnidirectional Lag Dist", one for Lag = 0.05, 0.10, 0.20, 0.50 and one for Lag = 0.05, 0.25, 1.00, 3.00; number of pairs (0 to 400) versus pairwise lag distance (0 to 8); legends S / Star / SSAT and SD3 / StarD5 / SSATD3]
Directional Lag Dist
Lag = 0.05, 0.10, 0.20, 0.50
{ Lag = 0.05, 0.25, 1.00, 3.00 is similar}
[Figure: four panels, Direction = 0, 45, 90, and 135; number of pairs (0 to 120) versus pairwise lag distance (0 to 8) for the S, STAR, and SSAT designs]
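A lag distribution is a count of site pairs per distance bin, optionally restricted to pairs whose orientation lies near a chosen direction. The sketch below is one way to compute it. The function name, the angular tolerance of 22.5 degrees, and the convention that directions are measured counterclockwise from the X axis are assumptions, since the slides do not state the convention behind "Direction = 0, 45, 90, 135".

```python
import numpy as np

def lag_distribution(coords, bin_edges, direction_deg=None, tol_deg=22.5):
    """Number of site pairs per distance bin.

    If direction_deg is given, keep only pairs whose orientation is within
    tol_deg of it. Orientations are folded to [0, 180) because a pair and
    its reverse describe the same direction.
    """
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)
    delta = coords[j] - coords[i]
    dist = np.hypot(delta[:, 0], delta[:, 1])
    if direction_deg is not None:
        angle = np.degrees(np.arctan2(delta[:, 1], delta[:, 0])) % 180.0
        diff = np.abs(angle - (direction_deg % 180.0))
        diff = np.minimum(diff, 180.0 - diff)      # wrap-around at 0/180
        dist = dist[diff <= tol_deg]
    counts, _ = np.histogram(dist, bins=bin_edges)
    return counts

# Three sites forming a right angle, 1 km legs
sites = [[0, 0], [1, 0], [0, 1]]
edges = [0.0, 1.2, 2.0]
omni = lag_distribution(sites, edges)
east_west = lag_distribution(sites, edges, direction_deg=0)
diagonal = lag_distribution(sites, edges, direction_deg=135)
```

Comparing such counts across candidate designs shows where each design leaves distance bins, or directions, with few or no pairs.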
Simulation Study
• 3 Grid Enhancements: S, STAR, S with Satellites
• Two sets of lag classes of size 4
– 0.05, 0.10, 0.20, 0.50 (km)
– 0.05, 0.25, 1, 3 (km)
• Spherical variogram
– Range = 1, 2, 4, 6
– Nugget = 0.00, 0.10
– Sill = 1
• 1000 simulations
• Fit using an automated procedure in S-PLUS
– This may have introduced artifacts
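The simulation step can be illustrated with a short sketch that draws Gaussian responses at a set of sites under a spherical model. This is not the S-PLUS procedure used in the study. The Gaussian assumption, the Cholesky approach, the function names, and the example grid are all choices made here for illustration. Sites are assumed to be at distinct locations; co-located field duplicates would need separate handling.

```python
import numpy as np

def spherical_covariance(h, nugget, sill, vrange):
    """Covariance implied by a spherical variogram with total sill `sill`."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / vrange, 1.0)
    cov = (sill - nugget) * (1.0 - 1.5 * s + 0.5 * s**3)
    return np.where(h == 0, sill, cov)      # nugget contributes only at h = 0

def simulate_fields(coords, nugget, sill, vrange, n_sims, seed=0):
    """Draw n_sims zero-mean Gaussian realizations at the given sites."""
    coords = np.asarray(coords, dtype=float)
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    cov = spherical_covariance(h, nugget, sill, vrange)
    chol = np.linalg.cholesky(cov)          # requires distinct site locations
    z = np.random.default_rng(seed).standard_normal((len(coords), n_sims))
    return (chol @ z).T                     # shape (n_sims, n_sites)

# Hypothetical 5 x 5 grid with 0.5 km spacing; sill 1, nugget 0.10, range 2
xs, ys = np.meshgrid(np.arange(5) * 0.5, np.arange(5) * 0.5)
grid = np.column_stack([xs.ravel(), ys.ravel()])
sims = simulate_fields(grid, nugget=0.10, sill=1.0, vrange=2.0, n_sims=1000)
```

Each simulated realization would then be passed to a variogram-fitting routine, and the fitted range, sill, and nugget collected across the 1000 runs.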
Percent Difference from Target Range
(Median Range)
S=1, N= 0.10
[Figure: two panels, Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50; percent of target (-10 to 40) versus true range (1 to 6) for the S, Star, and SSAT designs]
Percent Difference from Target Sill
(Median Sill)
S=1, N= 0.10
[Figure: two panels, Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50; percent of target (-10 to 20) versus true range (1 to 6) for the S, Star, and SSAT designs]
Percent Difference from Target Nugget
(Median Nugget)
S=1, N= 0.10
[Figure: two panels, Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50; median percent difference (-100 to 100) versus true range (1 to 6) for the S, STAR, and SSAT designs]
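The quantity plotted in these comparisons appears to be the median of the fitted parameter across simulations, expressed as a percent difference from the true (target) value. A minimal sketch under that reading follows. The function name and the example numbers are hypothetical, and the measure is undefined when the target is zero (for instance a nugget of 0.00).

```python
import numpy as np

def percent_difference_from_target(estimates, target):
    """100 * (median estimate - target) / target; target must be nonzero."""
    if target == 0:
        raise ValueError("percent difference is undefined for a zero target")
    return 100.0 * (np.median(estimates) - target) / target

# Made-up fitted ranges from three simulations with a true range of 2
example = percent_difference_from_target([1.8, 2.2, 2.6], target=2.0)
```

Using the median rather than the mean keeps a few badly failed fits from dominating the comparison between designs.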
Summary
• STAR performed better than S and S with
Satellites for estimating variogram
parameters
– Robust to different lag classes
• Multiple lag distances are better than increased
replication at fewer lag distances
• Larger lag classes generally did better than
shorter lag classes (eliminates “holes”)
Final Design
• Five “S” clusters
• Includes 10 duplicates: five at star centers and
five elsewhere
Further Research
• Choose another variogram model
– Exponential
• Choose another variogram fitting algorithm
– REML
• Simulate anisotropy
• Investigate robustness to model misspecification
• Explore other designs
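For reference, the exponential model mentioned as an alternative differs from the spherical one in that it approaches the sill asymptotically instead of reaching it at a finite distance. The sketch below uses the common "practical range" parameterization, in which about 95% of the partial sill is reached at the stated range. That parameterization, the function name, and the total-sill convention are assumptions, not something specified in the slides.

```python
import numpy as np

def exponential_variogram(h, nugget, sill, practical_range):
    """Exponential semivariogram; `sill` is the total sill (nugget included).

    Assumed parameterization: gamma reaches about 95% of the partial sill
    at h = practical_range.
    """
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / practical_range))
    return np.where(h > 0, gamma, 0.0)

at_range = float(exponential_variogram(4.0, nugget=0.0, sill=1.0, practical_range=4.0))
```

Because the two models differ mainly at short and intermediate distances, a design's coverage of small lags matters for telling them apart.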
STARMAP and CITY OF SAN DIEGO?
• Outreach to a member of the EPA affiliates
• Research opportunity – real problem
– Mapping consequences
– Apparently no other US data exist that are both
spatially intense and near coastal
– This mapping requirement resulted from SD’s
permit renewal
– Similar repeats are very likely
MORE GENERAL QUESTION
• How much spatial correlation is there in
aquatic systems, after accounting for habitat
features?
– I am trying to assemble spatially intense
relevant data sets in a number of settings
– Ask for such data sets at EMAP 2004
Symposium in May
• Have located a few
SPATIALLY INTENSE DATA SETS OF ENVIRONMENTAL RESPONSES
• Ohio River
– Have 400+ sites
– Josh French is looking at these data
• Have about 60 Virginia stream sites
– On two streams
• Access to a northeast estuary study, 100+ points
– Some spatial correlation demonstrated
• Detroit River – fairly short segment, 60+ points
• San Diego study = near coastal
SPATIALLY INTENSE DATA SETS OF ENVIRONMENTAL RESPONSES
• Have nothing on wetlands
• Other possibilities
– San Francisco Bay
• Preliminary observation – SD data show a
greater range in the semivariogram than I had
expected
– Even after accounting for depth or particle size
– Why had I expected a shorter range? Effluent is fresh
water; it rises fast from the outfall. Coastal and tidal
currents are strong there.
END OF PLANNED PRESENTATION
• Questions and suggestions are welcome