Ecological Modelling 196 (2006) 256–264
available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/ecolmodel
Methods for interpolating stream width, depth, and current velocity
Jud F. Kratzer a,∗, Daniel B. Hayes a, Bradley E. Thompson b
a Department of Fisheries and Wildlife, 12 Natural Resources Building, Michigan State University, East Lansing, MI 48824, USA
b Washington Department of Fish and Wildlife, 600 Capitol Way North, Olympia, WA 98501, USA
Article info
Article history:
Received 30 September 2004
Received in revised form 23 January 2006
Accepted 1 February 2006
Published on line 20 March 2006
Keywords:
Interpolation
Stream habitat
Loess
Kriging
Moving average
Abstract
Interpolation is a type of modeling that can be used to estimate habitat variables throughout
a stream based on measurements distributed along the stream’s length, but little guidance
is available to select the best method of interpolation. Thus, we compared several methods
to determine which produced the most accurate interpolation of width, depth, and current
velocity, separately. We also determined whether interpolation should be performed
using separate datasets for riffles, runs, and pools or unstratified datasets. We measured
stream width, maximum depth, and mean current velocity in a northern Michigan watershed.
We tested seven methods of interpolation including global average, linear regression,
cubic spline, moving average, Lagrange polynomials, Kriging, and Loess smoother. Accuracy
of different methods was determined by comparing interpolated habitat conditions to actual
values measured at points along the river. This study produced two main recommendations.
First, when performing interpolations, data should be stratified by meso-habitat type
(riffles, runs, and pools) only when habitat variables are different for each meso-habitat
type and stratification does not increase distance between points such that interpolation
accuracy is reduced. If habitat variables are similar for all meso-habitat types, knowing the
meso-habitat type within which a point falls does not add information that will increase
interpolation accuracy. Second, the Loess smoother with a smoothing parameter from 0.2
to 0.4 generally produced the most accurate interpolated values and is the method we recommend for similar situations.
© 2006 Elsevier B.V. All rights reserved.
1. Introduction
Knowing the habitat available in a stream is useful for modeling the distribution and production of fish. To completely
describe available habitat, habitat conditions need to be measured throughout the stream. Because of the expense of sampling at a fine scale throughout an entire river system, two
approaches are often used to sample stream fish habitat. In the
representative reach approach, habitat conditions are measured at a fine scale (e.g., sample grid of 1–2 m), but sampling is limited to a relatively short reach (e.g., 1 km or less).
∗ Corresponding author. Tel.: +1 517 353 6697; fax: +1 517 432 1699.
E-mail address: kratzer1@msu.edu (J.F. Kratzer).
0304-3800/$ – see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.ecolmodel.2006.02.004
This approach is limited because habitat conditions between
selected reaches are unknown. Another approach is to sample at points spaced broadly along the length of a stream (e.g.,
100 m), thereby covering greater lengths of stream. A limitation with this approach, however, is that habitat conditions
are not directly sampled at a fine scale, thereby preventing
development of detailed maps of stream habitat. The goal of
this study was to explore how interpolation could be used with
this broad-scale sampling approach to estimate habitat conditions, in effect creating a one-dimensional map of stream
habitat based on habitat measurements scattered along the
stream’s length. Interpolation does not describe the relationship between habitat variables such as width, depth, and
velocity at a given point, but rather, it uses measurements at
selected sampling points to estimate conditions at unknown
points.

Fig. 1 – Example of sampling transects along a river reach,
illustrating the need for interpolation to estimate
conditions at intervening points.

For example, in a study of the movement behavior of
steelhead, Oncorhynchus mykiss, in the Pine River, Michigan,
Thompson (2004) found it necessary to describe habitat conditions over large reaches of the stream. Stream width, maximum depth, and mean water velocity were collected at transects dispersed along the river reach (Fig. 1). Thus, habitat conditions were known at each sampling transect. Because fish
movement depends on habitat conditions throughout their
travel route, it was necessary to estimate conditions at a finer
scale than the data were collected (Fig. 1), or would even be
feasible to collect. The motivation for our study was how best
to estimate conditions along the entire river reach. This is a
common problem because there are always limitations on the
number and scale of measurements that can be taken.
Interpolation has been used to describe aquatic habitat and
the distribution of aquatic species. Lehmann et al. (1997) interpolated texture, nutrients, and organic content from point
sediment samples in the littoral zone of Lake Geneva, Switzerland. Battista (2001) used a GIS-based habitat suitability index
to interpolate suitability of habitat for the eastern oyster,
Crassostrea virginica, in Chesapeake Bay. On a smaller scale,
Beebe (1996) interpolated water velocities at channel edges,
and around large woody debris (Beebe, 2001). The PHABSIM
model also relies heavily on interpolation of points collected at
a fine scale to estimate available habitat (Milhous et al., 1989).
Hankin (1984) and Hankin and Reeves (1988) estimated fish
abundance in reaches distributed along a stream in order to
estimate total fish abundance. However, interpolation has not
been widely used to describe the longitudinal distribution of
habitat in rivers, and consequently, there are few guidelines
for choosing the appropriate method and protocols for interpolating river habitat variables.
There are many methods that can be used to interpolate
habitat variables in streams. Thus, our first objective was to
determine which method produces the most accurate interpolation of several stream habitat variables. The second objective
was to determine whether interpolation should be performed
using separate datasets for riffles, runs, and pools or unstratified datasets. The third objective was to determine whether
there were inherent differences in interpolation accuracy for
the three habitat variables: width, depth, and velocity. The
objective was not to describe relationships between habitat
variables, but rather to create a separate one-dimensional map
for each of the three habitat variables.

Fig. 2 – Pine River watershed location within Michigan.

2. Study area
The Pine River, a tributary to the AuSable River and Lake
Huron, drains a 756 km² watershed located in the southeast quarter of Alcona County, Michigan, USA (Fig. 2). Backus
Creek, and the East, West, and South branches converge to
form the Main Branch of the Pine before emptying into Van
Etten Lake, a 5.7 km² impoundment. The South Branch is
the longest tributary of the Main Branch and contributes
the largest amount of discharge relative to other branches
(Table 1). The majority of each sub-catchment is forested
and contains minimal proportions of urban and agricultural
areas. The majority of stream habitat is classified as run, with
lesser amounts of riffle and pool habitat (Table 1). The river
meanders freely, has a low gradient, and has surficial substrate composed mostly of sand and gravel. Summer
stream temperatures of the major tributaries rarely exceed
23 °C and support a coldwater fish community dominated by
juvenile steelhead (O. mykiss), brown trout (Salmo trutta), brook
trout (Salvelinus fontinalis), mottled sculpin (Cottus bairdii),
creek chub (Semotilus atromaculatus), white sucker (Catostomus commersoni), and northern brook lamprey (Ichthyomyzon
fossor).

Table 1 – Mean width, maximum depth, velocity, and
discharge; percent habitat composition; number of
transects; total length surveyed in Backus Creek and the
East, Main, South, and West Branches of the Pine River
watershed

Branch                 Backus   East     Main     South    West
Width (m)              5.4      7.0      14.3     8.6      7.7
Max. depth (cm)        34.8     53.8     68.0     52.7     42.9
Velocity (m/s)         0.132    0.192    0.152    0.199    0.141
Discharge (m³/s)       0.702    1.286    2.476    2.328    1.211
Habitat composition
  Pool                 17%      17%      22%      27%      26%
  Riffle               30%      38%      19%      25%      24%
  Run                  52%      45%      58%      49%      50%
Number of transects    68       137      87       183      146
Total (km)             7.5      15.2     14.5     20.2     16.5

3. Methods
3.1. Data collection
We measured stream width, maximum depth, and mean current velocity during summer base flow conditions in the East
Branch, West Branch, South Branch, and Mainstem Pine River
and Backus Creek, Alcona County, Michigan (Fig. 1). To collect habitat data, we walked upstream a randomly determined
distance from the downstream end of each branch and measured hydraulic variables at random distances thereafter. We
determined distance from a random number generator with a
uniform distribution from 10 to 190 m for all branches but the
mainstem, for which the distribution ranged from 10 to 290 m.
We measured wetted width at the selected point. Depth (cm)
and current velocity (m/s) were measured at 50 cm intervals in
stream reaches <15 m wide and at 100 cm intervals in stream
reaches >15 m wide. We also recorded the meso-habitat type
as either riffle, run, or pool (Hicks and Watson, 1985). We then
randomly selected another distance and repeated the process.
We stopped moving upstream when the stream became <2 m wide
or we reached a barrier to upstream fish movement, because
these data were collected as part of a study on juvenile steelhead.
3.2. Data preparation
For each branch, we divided data into four datasets. There
were three “stratified” datasets, one for each meso-habitat
(i.e., riffles, runs, and pools). The “unstratified” dataset consisted of data from all meso-habitat types combined. To assess
the accuracy of interpolation, we subsampled each dataset
by leaving out every other sample point. This allowed us to
produce interpolation estimates that could be compared to
known values at points deleted from the interpolation.
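The hold-out scheme described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `split_every_other`, the choice to retain the first point of each series, and the sample distances and widths are all assumptions made for the example.

```python
def split_every_other(distances, values):
    """Split a transect series into retained and held-out points.

    Points are ordered by upstream distance. Points at positions
    0, 2, 4, ... are retained for interpolation; points at positions
    1, 3, 5, ... are held out so estimates can be checked against them.
    (Keeping the first point is an assumption of this sketch.)
    """
    pairs = sorted(zip(distances, values))
    retained = pairs[0::2]
    held_out = pairs[1::2]
    return retained, held_out


# Hypothetical upstream distances (m) and stream widths (m).
distances = [40, 150, 310, 395, 520, 700, 745]
widths = [6.1, 7.4, 6.8, 8.0, 7.7, 9.2, 8.5]
retained, held_out = split_every_other(distances, widths)
```

For a stratified dataset, the same split would be applied separately to the riffle, run, and pool series of each branch.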
3.3. Interpolation methods
We tested seven methods of interpolation including global
average, linear regression, cubic spline, moving average,
Lagrange polynomials, Kriging, and Loess smoother. We did
not describe relationships between the three habitat variables,
but rather used width to interpolate width, depth to interpolate depth, and velocity to interpolate velocity. The global average method simply used the overall mean value of observed
data to estimate conditions at unknown points. We used linear
regression to take into account longitudinal trends and
make estimates of habitat variables at unknown points using
regression predictions.
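The two simplest methods can be written in a few lines. The sketch below assumes ordinary least squares of the habitat variable on upstream distance; the function names and the example data are hypothetical.

```python
def global_average(values):
    """Estimate every unknown point with the mean of the observed values."""
    return sum(values) / len(values)


def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(slope, intercept, x0):
    """Regression prediction at upstream distance x0."""
    return intercept + slope * x0


# Hypothetical observed distances (m) and widths (m).
obs_x = [0, 100, 200]
obs_y = [5.0, 6.0, 7.0]
slope, intercept = fit_line(obs_x, obs_y)
estimate = predict_line(slope, intercept, 150)
```

When the fitted slope is close to zero, the regression prediction is nearly identical to the global average at every point.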
In the cubic spline interpolation method (Press et al., 1992),
the first set of four points was used to make estimations
for deleted points within the span of the data. The “window” was then advanced one point (i.e., the second through
fifth points) and an estimation was made for the second
deleted point. The cubic spline interpolation calculations
were performed in Microsoft Excel using a macro found at
www.srs1software.com.
We evaluated moving averages with varying windows to
interpolate habitat variables. For a moving average with a
window width of two, the average value of the two data
points surrounding a deleted point was used to provide
an interpolation estimate. For a width of four, four surrounding observed data points were used, two on each side.
Widths ranged from 2 to 10 for the unstratified dataset, but
small sample size of some meso-habitats reduced the largest
possible moving average width for stratified data in some
branches.
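A moving average of this kind can be sketched as below. The function name and data are hypothetical, and returning `None` when too few observed points lie on one side of the target is a choice made for this sketch only.

```python
from bisect import bisect_left


def moving_average(obs_x, obs_y, x0, width):
    """Average the width/2 nearest observed points on each side of x0.

    obs_x must be sorted in ascending order and width must be even.
    Returns None when fewer than width/2 observed points lie on either
    side of x0 (edge handling is left to the caller in this sketch).
    """
    half = width // 2
    i = bisect_left(obs_x, x0)          # count of observed points below x0
    if i < half or len(obs_x) - i < half:
        return None
    window = obs_y[i - half:i + half]
    return sum(window) / len(window)


# Hypothetical observed distances (m) and depths (cm).
obs_x = [0, 100, 200, 300]
obs_y = [40.0, 60.0, 80.0, 100.0]
mid_estimate = moving_average(obs_x, obs_y, 150, 2)
```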
We used Lagrange polynomials ranging from linear to
fourth order (Press et al., 1992). The general equation for an
N − 1 order Lagrange polynomial interpolation is
$$
P(x) = y_1 \frac{(x - x_2)(x - x_3) \cdots (x - x_N)}{(x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_N)}
     + y_2 \frac{(x - x_1)(x - x_3) \cdots (x - x_N)}{(x_2 - x_1)(x_2 - x_3) \cdots (x_2 - x_N)}
     + \cdots
     + y_N \frac{(x - x_1)(x - x_2) \cdots (x - x_{N-1})}{(x_N - x_1)(x_N - x_2) \cdots (x_N - x_{N-1})}
$$
where y is the habitat variable, x (without a subscript) is the
upstream distance for the point to be interpolated, and the
subscripts 1 to N index the observed data points
used in the interpolation. As with cubic spline interpolation,
sequential sets of observed points were used to interpolate for
deleted points.
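The Lagrange formula translates directly into code. The sketch below evaluates the polynomial through one set of N observed points; the function name and the example values are hypothetical, and the sliding of the point set along the stream is left out.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the (N - 1)-order Lagrange polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                # Each factor is zero at x = xj and one at x = xi.
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Two points give linear interpolation; three give a quadratic.
linear = lagrange_interpolate([0, 10], [2.0, 4.0], 5)
quadratic = lagrange_interpolate([0, 1, 2], [1.0, 3.0, 7.0], 1.5)
```

Because the polynomial passes exactly through every observed point, a single extreme observation can swing the estimates at neighboring points, and the effect grows with the order of the polynomial.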
We performed Kriging using the KRIGE2D procedure in
SAS. The KRIGE2D procedure requires that the shape, range,
and scale of the sample variogram be estimated according
to the procedure outlined in the SAS/STAT User’s Guide (SAS
Institute Inc., 2000). The KRIGE2D procedure performs two-dimensional Kriging and requires three values for every data
point: an x-coordinate, a y-coordinate, and the variable of
interest. Our stream habitat data were one-dimensional, so
we simply assigned an x-coordinate value of zero to all data
points. A Kriging interpolation is essentially a weighted average, with the weights assigned to observed data values according to their distance from the point of interest and their redundancy (Isaaks and Srivastava, 1989). Observed points that are
closer to the deleted point and observed points that are more
isolated from other observed points receive higher weights.
Global Kriging makes use of all observed data points, but in
local Kriging, the observed data points used in the interpolation are limited to points falling within a specified radius. For
one-dimensional data, the radius is actually just a distance
along the line. We performed global Kriging and local Kriging
with radii of 100, 500, 1000, 2000, and 4000 m. In each local Kriging
interpolation, the minimum number of points to include
in the interpolation was set at two.
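The weighted-average nature of Kriging can be illustrated with a small one-dimensional ordinary Kriging sketch. It is not the KRIGE2D procedure: the exponential variogram form, its `scale` and `range_` values, the helper names, and the decision to return `None` when too few points fall within the radius are all assumptions made for illustration. In practice the variogram would be estimated from each dataset.

```python
import math


def exp_variogram(h, scale=1.0, range_=500.0):
    """Assumed exponential variogram with no nugget (placeholder values)."""
    return scale * (1.0 - math.exp(-h / range_))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def krige_1d(obs_x, obs_y, x0, radius=None, min_points=2):
    """Ordinary Kriging estimate at x0 along a line.

    radius=None uses all points (global); otherwise only points within
    `radius` of x0 are used (local).
    """
    pts = [(x, y) for x, y in zip(obs_x, obs_y)
           if radius is None or abs(x - x0) <= radius]
    if len(pts) < min_points:
        return None
    n = len(pts)
    # Variogram matrix bordered by ones: forces the weights to sum to one.
    a = [[exp_variogram(abs(pts[i][0] - pts[j][0])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [exp_variogram(abs(x - x0)) for x, _ in pts] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * y for w, (_, y) in zip(weights, pts))
```

With two observed points, a target midway between them receives weights of one half each; with no nugget, an estimate made exactly at an observed location reproduces the observed value.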
The final method of interpolation was the Loess smoother.
We performed Loess interpolation using the LOESS procedure
in SAS. In the LOESS procedure, the habitat variable at an
unknown data point is estimated from linear regression of
observed data points within a neighborhood of chosen size
surrounding each unknown point. The smoothing parameter (s) is used to change the radius of the neighborhood used
for local regression. When s < 1, the local neighborhood consists of the s fraction of the observed data closest to the
given unknown point. When s ≥ 1, all observed data points
are used in the local regression. We used smoothing parameters of 0.1, 0.2, 0.3, 0.4, 0.5, and 1 for the unstratified data.
Small sample size of some meso-habitats made using small
smoothing parameters impossible for the stratified data of
some branches. Data points in a given local neighborhood
are weighted based on their distance from the center of the
neighborhood, with points further from the center receiving
less weight in the local regression. The weight of a data point
is given by: $w_i = (32/5)\,(1 - (d_i/Q)^3)^3$, where $d_i$ is the distance
of point i from the center of the neighborhood, and Q is the
distance between the center of the neighborhood and the
point furthest from the center (SAS Institute Inc., 2000). As s
decreases, the wi of a given point will tend to decrease because
Q will decrease.
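The local regression described above can be sketched in plain Python. This is an illustration under stated assumptions, not the SAS LOESS procedure: the function names, the rounding rule for neighborhood size, the minimum of three neighbors, and the fallbacks for degenerate neighborhoods are all choices made for this sketch.

```python
def tricube(d, q_dist):
    """Weight for a point at distance d when the farthest neighbor is at q_dist."""
    return (32 / 5) * (1 - (d / q_dist) ** 3) ** 3


def loess_estimate(obs_x, obs_y, x0, s):
    """Locally weighted linear regression estimate at x0."""
    n = len(obs_x)
    # Neighborhood size: all points when s >= 1, otherwise about s * n
    # (at least three, an assumption that avoids degenerate fits).
    k = n if s >= 1 else min(n, max(3, round(s * n)))
    nearest = sorted(zip(obs_x, obs_y), key=lambda p: abs(p[0] - x0))[:k]
    ys = [y for _, y in nearest]
    q_dist = max(abs(x - x0) for x, _ in nearest)
    if q_dist == 0:
        return sum(ys) / len(ys)
    w = [tricube(abs(x - x0), q_dist) for x, _ in nearest]
    sw = sum(w)
    if sw == 0:
        return sum(ys) / len(ys)
    mean_x = sum(wi * x for wi, (x, _) in zip(w, nearest)) / sw
    mean_y = sum(wi * y for wi, (_, y) in zip(w, nearest)) / sw
    sxx = sum(wi * (x - mean_x) ** 2 for wi, (x, _) in zip(w, nearest))
    if sxx == 0:
        return mean_y          # weighted points share one location
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, (x, y) in zip(w, nearest))
    return mean_y + (sxy / sxx) * (x0 - mean_x)


# Hypothetical observed distances (m) and a perfectly linear width trend (m).
obs_x = list(range(0, 1000, 100))
obs_y = [2 + 0.01 * x for x in obs_x]
local_fit = loess_estimate(obs_x, obs_y, 450, 0.4)
global_fit = loess_estimate(obs_x, obs_y, 450, 1)
```

Note that the point farthest from the target always receives a weight of zero under this weight function, so the effective number of points in the fit is smaller than the neighborhood size.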
3.4. Comparison of methods
We used mean squared errors to determine which method
provided the most accurate interpolations for each habitat
variable. Squared error was calculated as the squared difference between the actual value of deleted data points and their
interpolated values. To determine if interpolation accuracy
was increased by stratifying data by meso-habitat types, we
compared mean squared errors for interpolations using stratified and unstratified data for each interpolation method individually. The difference between interpolations using stratified and unstratified datasets is that in stratified interpolation,
only riffle data were used to estimate habitat in riffles, only
runs were used for runs, and only pools were used for pools,
whereas unstratified interpolation used data from all meso-habitat types. In order to explain the differences between
stratified and unstratified interpolations, we used ANOVA to
test for differences in habitat variables among riffles, runs, and
pools.
We also used the coefficient of variation (CV, square root of
mean squared error divided by the mean value of the habitat
variable) as a dimensionless measure to compare interpolation accuracy for stream width, depth, and velocity.
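Both accuracy measures are simple to compute once interpolated values exist for the held-out points. In the sketch below the function names and example values are hypothetical, and the mean in the CV denominator is taken over the held-out actual values, which is an assumption about how the mean was defined.

```python
import math


def mean_squared_error(actual, predicted):
    """Mean of squared differences between held-out and interpolated values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def coefficient_of_variation(actual, predicted):
    """Root mean squared error divided by the mean of the actual values."""
    rmse = math.sqrt(mean_squared_error(actual, predicted))
    return rmse / (sum(actual) / len(actual))


# Hypothetical held-out widths (m) and their interpolated estimates.
actual = [8.0, 10.0, 12.0]
predicted = [9.0, 10.0, 11.0]
mse = mean_squared_error(actual, predicted)
cv = coefficient_of_variation(actual, predicted)
```

Because the CV is dimensionless, it allows errors in width (m), depth (cm), and velocity (m/s) to be compared on a common scale.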
4. Results and discussion
4.1. Stratification
Over all branches, stream width did not vary by meso-habitat
type (P > 0.1), but velocity and water depth did (P < 0.01 in each
case). Stream width averaged between eight and nine meters
regardless of habitat type. As expected, riffles were shallower
and had faster current than runs, which were shallower and
faster than pools (Table 2). When interpolating stream width, it
was generally better to not stratify the data by habitat type for
all interpolation methods (Table 3). In other words, knowing
the width of the river at surrounding points was more important than knowing whether these points fell in riffles, runs,
or pools. This result was not unexpected because riffles, runs,
and pools are generally defined by their depths and current
velocities but not by their widths. The South Branch was the
only branch for which stratified data provided a more accurate interpolation (based on median mean squared error of
the different methods), and this was also the only branch that
had significantly different widths in riffles, runs, and pools
(P = 0.04).

Table 2 – Mean and standard deviation of width, depth,
and current velocity in pools, riffles, and runs from all
studied reaches of the Pine River system

                  Pool (n = 143)      Riffle (n = 190)    Run (n = 301)
                  Mean      S.D.      Mean      S.D.      Mean      S.D.
Width (m)         8.87      4.58      8.03      2.95      8.20      3.82
Depth (cm)        74.10     28.13     36.91     16.61     48.03     18.50
Velocity (m/s)    0.12      0.05      0.22      0.08      0.15      0.06

For water depth, stratified data provided more accurate
interpolations for the East Branch, the South Branch, and
Backus Creek, but unstratified data generally provided more
accurate interpolations for the West Branch and the Mainstem
(Table 4). Riffles, runs, and pools are partially defined by depth,
so interpolation should intuitively be aided by stratifying the
data. For the West Branch and the Mainstem, depths in the
area surrounding the estimated point, regardless of the habitat type, were better predictors of depth than depths in similar
habitat types.
Stratified data generally provided more accurate interpolations of stream velocity. For the Mainstem, unstratified data
provided the better interpolation (based on median mean
squared errors of the different methods), and for the East
Branch, stratified and unstratified interpolations performed
similarly (Table 5). Mean velocity differed significantly among
riffles, runs, and pools within all branches, thus knowing that
an unknown data point fell in a riffle, run, or pool generally
increased the accuracy of interpolation.
4.2. Comparison of methods
There was no single method of interpolation that consistently produced the most accurate interpolations. However,
three methods generally performed best: moving average with
a window width of four to eight points, Kriging with radius from
1000 to 4000 m, and the Loess smoother with a smoothing
parameter from 0.2 to 0.4. These three methods are conceptually similar, as each one uses data within a predetermined
distance from the point to be interpolated. They also statistically smooth the data because an interpolation of a sampled
data point usually estimates a value that is not equal to the
actual measured value. They differ in that the moving average simply assigns the average of the surrounding data, while
local Kriging assigns a weighted average of the surrounding
data, and the Loess smoother assigns a value based on a
weighted linear regression of the local data. An advantage of
the Loess smoother and Kriging over the moving average is
that they can make interpolations regardless of how close the
point to be estimated is to the edge of the dataset. For example, in a moving average with a width of six, three observed
data points are required on each side of the point to be estimated. Therefore, a moving average of width six cannot be
used when there are only two or fewer observed data points
between the point to be estimated and the end of the dataset.
To overcome this problem, we simply applied the value of
the interpolation closest to the edge to all remaining points
for which moving average interpolation was impossible. One
major drawback of Kriging is that it requires the estimation of
more parameters than moving average or the Loess smoother.
The shape, range, and scale of the sample variogram must
be estimated for each dataset. Also, Kriging was much more
sensitive to the radius than the Loess smoother was to the
smoothing parameter. It is favorable to have less sensitivity to
these parameters because it would make the outcome of the
interpolation less dependent on the researcher’s choice of the
parameter. Because of these weaknesses of the moving average and Kriging methods, we feel that the Loess smoother is
preferable.

Table 3 – Mean squared errors for the different methods of interpolation for stream width (m)

Interpolation       East Branch     West Branch     Mainstem        South Branch    Backus Creek    Average
                    U       S       U       S       U       S       U       S       U       S       U       S
Global average      3.15    3.37    12.55   13.46   13.06   13.53   13.15   11.59   1.95    1.57    8.77    8.70
Linear regression   2.44    3.53    12.60   12.67   8.16    8.31    12.84   11.04   1.92    1.69    7.59    7.45
Cubic spline        3.51    4.00    13.52   14.89   8.25    17.63   10.95   9.24    2.68    2.10    7.78    9.57
Moving average
  Width = 2         2.94    3.54    12.70   13.56   8.52    8.52    9.74    6.97    2.24    2.46    7.23    7.01
  Width = 4         2.51    3.20    11.61   12.75   7.72    6.88    10.33   8.33    1.94    1.94    6.82    6.62
  Width = 6         2.49    3.31    11.85   12.45   7.19    7.42    10.48   10.10   1.87    NA      6.78    8.32
  Width = 8         2.49    3.25    12.43   12.74   7.78    8.31    10.69   9.71    1.95    NA      7.07    8.50
  Width = 10        2.52    NA      13.07   12.75   7.79    9.44    10.79   9.50    1.99    NA      7.23    10.56
Lagrange
  Linear            3.08    3.44    12.41   12.75   7.49    8.11    9.87    5.31    2.40    1.93    7.05    6.31
  Quadratic         3.40    3.76    15.48   15.59   8.67    19.41   11.65   11.49   2.63    2.08    8.37    10.46
  3rd order         3.44    4.89    13.93   14.52   10.91   160.35  11.57   13.04   2.88    2.23    8.55    39.01
  4th order         4.20    9.78    14.44   38.13   13.45   474.42  13.83   332.74  3.29    3.44    9.84    171.70
Kriging
  Global            3.02    3.47    12.62   12.96   10.13   10.54   9.22    7.93    2.15    1.54    7.43    7.29
  Radius = 100      3.09    3.65    12.08   11.90   7.71    8.10    9.88    6.02    2.44    1.91    7.04    6.32
  Radius = 500      3.18    3.59    11.73   11.85   7.12    8.10    9.27    5.93    2.24    1.91    6.71    6.28
  Radius = 1000     3.04    3.55    12.39   12.22   6.57    7.81    8.98    6.63    2.18    1.86    6.63    6.41
  Radius = 2000     3.07    3.44    12.20   12.63   7.38    8.05    8.79    6.67    2.17    1.67    6.72    6.49
  Radius = 4000     3.04    3.44    12.42   12.74   7.55    8.05    8.84    6.46    2.13    1.58    6.80    6.45
Loess smoother
  Smooth = 0.1      2.46    NA      11.88   NA      7.44    NA      9.88    5.17    2.40    NA      6.81    5.17
  Smooth = 0.2      2.36    3.07    12.40   13.23   7.28    7.71    10.64   6.87    1.96    NA      6.93    7.72
  Smooth = 0.3      2.37    3.09    12.37   12.61   7.54    7.68    10.75   8.12    1.90    NA      6.99    7.87
  Smooth = 0.4      2.42    3.06    12.38   13.14   7.92    8.12    10.99   9.09    1.94    NA      7.13    8.36
  Smooth = 0.5      2.42    3.10    12.45   12.78   8.01    8.13    11.19   9.23    1.96    2.10    7.21    7.07
  Smooth = 1        2.41    3.18    12.58   12.99   8.10    7.40    12.26   9.88    1.92    1.88    7.45    7.06
Median              2.98    3.44    12.43   12.75   7.79    8.12    10.66   8.71    2.14    1.91    7.10    7.37

The “U” column reports error when data from riffles, runs, and pools were combined in one dataset (unstratified), and the “S” column reports
error when interpolation was performed using data from riffles, runs, and pools as separate datasets (stratified).

A Loess smoother with a smoothing parameter of 0.2 was
generally one of the more accurate interpolation methods
(Tables 3–5). For this interpolation, 20% of the observed data
points centered around a given unknown data point were
used to estimate habitat conditions at the unknown point. In
a sense, the smoothing parameter determines how sensitive
the interpolation is to extremes in the data. With a smoothing parameter of 0.1, the interpolation is based on only 10%
of the total dataset, making the interpolation more responsive to peaks and valleys in the observed data (Fig. 3). As the
smoothing parameter increases, the interpolation becomes
less influenced by local highs and lows in the observed data. If
autocorrelation is high and sampled points are close together,
a smaller smoothing parameter may produce more accurate
interpolations.
The Loess interpolation was less accurate for depth than for
width (Table 6). The lower accuracy of the depth interpolation
was at least partially caused by the higher variability of maximum depth (Table 7). The coefficient of variation for depths of
unknown points was consistently greater than that of widths
in all branches. The difference in variability was not caused
solely by differences in width and depth along the river, as
differences in depth from one unknown data point to the next
were also consistently greater than differences in width. The
Loess interpolation of velocity was less accurate than that of
width, but the accuracies of velocity and depth interpolations
were more similar (Table 6). Unknown data points had similar
coefficients of variation (CV) and point-to-point variation for
velocity and depth, while the CV and point-to-point variation
of width were much smaller (Table 7).

Table 4 – Mean squared errors for the different methods of interpolation for maximum water depth (cm)

Interpolation       East Branch     West Branch     Mainstem        South Branch    Backus Creek    Average
                    U       S       U       S       U       S       U       S       U       S       U       S
Global average      487     253     385     696     587     853     578     324     199     109     447     447
Linear regression   483     261     386     675     575     839     520     305     190     139     431     444
Cubic spline        500     384     919     880     1142    1831    646     438     248     173     691     741
Moving average
  Width = 2         420     302     717     698     772     916     552     335     205     141     533     478
  Width = 4         391     243     534     722     549     855     494     294     178     93      429     441
  Width = 6         415     230     449     764     481     860     517     282     162     NA      405     534
  Width = 8         434     233     432     788     497     896     520     293     161     NA      409     553
  Width = 10        432     256     427     781     538     920     493     304     172     NA      412     565
Lagrange
  Linear            408     235     741     749     826     841     581     377     207     147     553     470
  Quadratic         531     401     1083    891     902     1890    624     429     246     173     677     757
  3rd order         505     729     1152    862     1387    25162   648     522     270     192     792     5494
  4th order         551     1009    1288    1817    2122    85908   768     1281    376     263     1021    18056
Kriging
  Global            423     227     363     719     549     871     571     336     168     97      415     450
  Radius = 100      408     261     748     717     771     915     583     368     208     135     544     479
  Radius = 500      434     258     541     717     747     915     555     362     181     137     492     478
  Radius = 1000     429     262     447     738     500     890     550     312     165     126     418     466
  Radius = 2000     424     247     388     728     525     861     544     293     165     111     409     448
  Radius = 4000     426     243     383     755     558     888     541     304     163     108     414     460
Loess smoother
  Smooth = 0.1      382     NA      481     NA      583     NA      510     349     207     NA      433     349
  Smooth = 0.2      418     245     409     745     501     836     510     300     164     NA      400     531
  Smooth = 0.3      435     242     391     661     513     830     518     273     161     NA      404     501
  Smooth = 0.4      444     228     385     695     542     876     514     269     165     NA      410     517
  Smooth = 0.5      454     234     386     730     555     855     513     271     172     117     416     441
  Smooth = 1        487     264     393     723     567     835     515     285     189     121     430     445
Median              433     253     439     730     562     876     543     308     179     135     430     478

The “U” column reports error when data from riffles, runs, and pools were combined in one dataset (unstratified), and the “S” column reports
error when interpolation was performed using data from riffles, runs, and pools as separate datasets (stratified).

The global average approach performed surprisingly well,
presumably because there was relatively little longitudinal
trend in width, depth, or velocity within the length of stream
sampled in each branch. Slopes of regression lines were relatively shallow, ranging from −0.0014 to 0.0002 for width, depth,
and velocity in all branches; thus linear regression performed
similarly to the global average because slopes were near zero.
In streams where stronger longitudinal trends are present, a
local interpolation method such as the Loess smoother would
be preferable.
Cubic spline and Lagrange polynomials were consistently
poor methods of interpolation. These methods were too sensitive to extreme values in the observed data. In contrast to the
Loess smoother, moving average, and Kriging, these methods
do not statistically smooth the data.
Fig. 3 – Observed stream width used in interpolations and
interpolated stream width using Loess smoother with
smoothing parameters of 0.1 and 0.2.

4.3. Distances between points
The distance between sampled points will determine the
accuracy of any interpolation method. We would expect to
see smaller fluctuations in habitat conditions and more accurate interpolations between sampled points that are closer to
each other. Determining the ideal distance between sampled
points was not a goal of this study, but it must be considered
262
e c o l o g i c a l m o d e l l i n g 1 9 6 ( 2 0 0 6 ) 256–264
Table 5 – Mean squared errors for the different methods of interpolation for average water velocity (m/s)
Interpolation
East Branch
West Branch
U
S
U
Global average
Linear regression
Cubic spline
0.0075
0.0060
0.0068
0.0058
0.0054
0.0113
0.0053
0.0038
0.0062
Moving average
Width = 2
Width = 4
Width = 6
Width = 8
Width = 10
0.0059
0.0052
0.0053
0.0051
0.0048
0.0036
0.0033
0.0031
0.0031
0.0036
LaGrange
Linear
Quadratic
3rd order
4th order
0.0059
0.0067
0.0070
0.0073
Kriging
Global
Radius = 100
Radius = 500
Radius = 1000
Radius = 2000
Radius = 4000
S
Mainstem
South Branch
Backus Creek
U
U
S
U
S
0.0028
0.0020
0.0029
0.0016
0.0016
0.0029
0.0017
0.0017
0.0030
0.0052
0.0046
0.0069
0.0039
0.0030
0.0050
0.0137
0.0137
0.0142
0.0045
0.0031
0.0033
0.0032
0.0035
0.0021
0.0017
0.0017
0.0017
0.0017
0.0019
0.0018
0.0014
0.0017
0.0018
0.0026
0.0021
0.0021
0.0019
0.0017
0.0048
0.0040
0.0042
0.0043
0.0039
0.0036
0.0033
0.0032
0.0029
0.0027
0.0037
0.0054
0.0095
0.0220
0.0046
0.0055
0.0072
0.0079
0.0023
0.0031
0.0039
0.0094
0.0023
0.0028
0.0037
0.0027
0.0028
0.0031
0.0032
0.0038
0.0055
0.0064
0.0071
0.0097
0.0061
0.0059
0.0061
0.0061
0.0061
0.0061
0.0053
0.0097
0.0103
0.0105
0.0102
0.0099
0.0047
0.0045
0.0033
0.0035
0.0035
0.0034
0.0025
0.0022
0.0021
0.0019
0.0019
0.0019
0.0020
0.0023
0.0021
0.0019
0.0020
0.0020
0.0023
0.0027
0.0027
0.0026
0.0025
0.0025
Loess smoother
Smooth = 0.1
Smooth = 0.2
Smooth = 0.3
Smooth = 0.4
Smooth = 0.5
Smooth = 1
0.0052
0.0051
0.0051
0.0052
0.0054
0.0059
NA
0.0086
0.0070
0.0057
0.0052
0.0049
0.0033
0.0034
0.0034
0.0034
0.0034
0.0035
NA
0.0020
0.0018
0.0018
0.0017
0.0018
0.0017
0.0015
0.0016
0.0017
0.0016
0.0016
Median
0.0059
0.0057
0.0035
0.0020
0.0018
Average
S
U
S
0.0012
0.0012
0.0018
0.0067
0.0060
0.0074
0.0031
0.0027
0.0048
0.0140
0.0142
0.0138
0.0139
0.0139
0.0013
0.0010
NA
NA
NA
0.0062
0.0057
0.0056
0.0057
0.0056
0.0027
0.0023
0.0025
0.0024
0.0024
0.0041
0.0048
0.0057
0.0097
0.0029
0.0180
0.0129
0.0151
0.0013
0.0015
0.0029
0.0112
0.0042
0.0079
0.0076
0.0085
0.0028
0.0036
0.0050
0.0112
0.0045
0.0056
0.0051
0.0047
0.0047
0.0046
0.0034
0.0039
0.0038
0.0037
0.0033
0.0029
0.0137
0.0139
0.0137
0.0137
0.0137
0.0136
0.0012
0.0014
0.0014
0.0012
0.0012
0.0013
0.0062
0.0064
0.0061
0.0060
0.0060
0.0059
0.0029
0.0040
0.0041
0.0040
0.0038
0.0037
NA
0.0027
0.0026
0.0024
0.0022
0.0019
0.0040
0.0041
0.0042
0.0042
0.0042
0.0046
0.0040
0.0036
0.0033
0.0032
0.0031
0.0030
0.0136
0.0138
0.0137
0.0137
0.0138
0.0138
NA
NA
NA
NA
0.0010
0.0010
0.0056
0.0056
0.0056
0.0056
0.0057
0.0059
0.0040
0.0042
0.0037
0.0033
0.0026
0.0025
0.0025
0.0046
0.0035
0.0137
0.0013
0.0060
0.0034
The “U” column reports error when data from riffles, runs, and pools were combined in one dataset (unstratified), and the “S” column reports
error when interpolation was performed using data from riffles, runs, and pools as separate datasets (stratified).
when planning any interpolation. Spacing of sampled points
is probably site-specific, with more heterogeneous conditions
requiring closer spacing. Limited funding will tend to favor
more widely spaced points. Before interpolating stream habitat, a pilot study to determine the most appropriate spacing
of sample points may be helpful. Because the goals (and thus
the acceptable precision) and the available funds are specific
to each application, we suggest that pilot-study data be
analyzed with the approach taken in this paper: collect data at
a “reasonable” scale for the application, then estimate the
approximate precision by dropping out every other point and
comparing the interpolated values with the measurements at the
dropped points.
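The drop-every-other-point check described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, piecewise-linear interpolation stands in for whichever method is being evaluated (it was chosen only for brevity), the example measurements are invented, and the CV is taken as the square root of the mean squared error divided by the mean of the dropped observations, which is one reading of the definition given with Table 6.

```python
import math

def linear_interpolate(x_known, y_known, x_new):
    """Piecewise-linear interpolation at x_new; x_known must be sorted ascending."""
    if x_new < x_known[0] or x_new > x_known[-1]:
        raise ValueError("x_new lies outside the range of retained points")
    for i in range(1, len(x_known)):
        if x_new <= x_known[i]:
            x0, x1 = x_known[i - 1], x_known[i]
            y0, y1 = y_known[i - 1], y_known[i]
            return y0 + (y1 - y0) * (x_new - x0) / (x1 - x0)

def holdout_precision(distance, value, interpolate=linear_interpolate):
    """Drop every other point, predict it from the rest, and return (MSE, CV)."""
    kept_x, kept_y = distance[::2], value[::2]
    # Odd-indexed points are dropped; a trailing dropped point with no
    # retained neighbour downstream cannot be interpolated and is skipped.
    dropped = [(x, y) for x, y in zip(distance[1::2], value[1::2])
               if x <= kept_x[-1]]
    sq_errors = [(interpolate(kept_x, kept_y, x) - y) ** 2 for x, y in dropped]
    mse = sum(sq_errors) / len(sq_errors)
    mean_obs = sum(y for _, y in dropped) / len(dropped)
    return mse, math.sqrt(mse) / mean_obs

# Hypothetical pilot data: distance along the stream (m) and width (m).
distance = [0, 100, 200, 300, 400, 500, 600]
width = [5.0, 5.4, 6.0, 5.8, 5.2, 5.6, 6.2]
mse, cv = holdout_precision(distance, width)
print(f"MSE = {mse:.4f}, CV = {cv:.3f}")
```

Passing a different function as `interpolate` would let the same hold-out comparison be repeated for other methods or other point spacings.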
Table 6 – Coefficient of variation (CV = square root of mean squared error/mean) for interpolations of the three habitat
variables in each branch using a Loess smoother with a smoothing parameter of 0.2

Branch     Width                      Depth                      Velocity
           Unstratified  Stratified   Unstratified  Stratified   Unstratified  Stratified
East       0.23          0.24         0.39          0.33         0.36          0.44
West       0.44          0.43         0.48          0.57         0.43          0.30
Main       0.19          0.19         0.33          0.40         0.27          0.35
South      0.39          0.34         0.42          0.30         0.33          0.30
Backus     0.25          –            0.35          –            0.88          –
Average    0.30          0.30         0.39          0.40         0.45          0.35

Table 7 – Coefficient of variation (CV = standard deviation/mean) and mean point-to-point variation (ptp) of width, depth,
and velocity of deleted data points for stratified and unstratified datasets

Branch     Width                       Depth                       Velocity
           Unstratified  Stratified    Unstratified  Stratified    Unstratified  Stratified
           CV    ptp     CV    ptp     CV    ptp     CV    ptp     CV    ptp     CV    ptp
East       0.26  0.23    0.24  0.21    0.41  0.38    0.42  0.34    0.43  0.35    0.43  0.31
West       0.26  0.26    0.21  0.23    0.71  0.48    0.46  0.31    0.57  0.42    0.55  0.26
Main       0.23  0.20    0.23  0.16    0.43  0.42    0.35  0.35    0.32  0.38    0.28  0.25
South      0.37  0.27    0.41  0.29    0.35  0.39    0.40  0.30    0.40  0.40    0.39  0.26
Backus     0.15  0.17    –     –       0.40  0.41    –     –       0.30  0.33    –     –
Average    0.25  0.23    0.27  0.22    0.46  0.42    0.41  0.33    0.40  0.38    0.41  0.27

4.4. Falls River, Upper Peninsula, Michigan
In order to determine whether the results of this study would
be applicable to streams in other geologic settings, we performed
interpolations of stream width with the moving average, Kriging,
and Loess methods, using unstratified data from the Falls River,
which is in Michigan’s Upper Peninsula (Bryan Burroughs,
Department of Fisheries and Wildlife, Michigan State University,
personal communication). The geology of this region is much
different from that of the Pine River. While the Pine River has
a low gradient and sandy bottom, the Falls River has a rocky
bottom and several waterfalls. Despite the different geologies,
the three methods performed similarly, with the Loess smoother
with a smoothing parameter of 0.5 yielding the lowest mean
squared error (Table 8).
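To make the role of the smoothing parameter concrete, the following is a minimal sketch of a Loess-type estimate at a single location. It is not the procedure used in this study. It assumes that the smoothing parameter is the fraction of data points entering each local fit, that the local fit is a straight line, and that points are weighted with the tricube function; the function name `loess_point` and the example data are hypothetical.

```python
def loess_point(x, y, x0, span):
    """Estimate y at x0 with a tricube-weighted local linear fit.

    span is the fraction of points in the local window (an assumed reading
    of the smoothing parameter); x must hold at least three distinct positions.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least three points")
    k = min(n, max(3, round(span * n)))            # window size in points
    h = sorted(abs(xi - x0) for xi in x)[k - 1]    # distance to k-th nearest point
    if h == 0:
        raise ValueError("window has zero width; positions must be distinct")
    # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
    w = [max(0.0, 1 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:                                   # one location carries all the weight
        return my
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    return my + (sxy / sxx) * (x0 - mx)

# Hypothetical widths (m) measured every 100 m along a stream.
position = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900]
width = [10.2, 11.0, 12.5, 11.8, 13.1, 12.0, 11.4, 12.9, 12.2, 11.7]
for span in (0.3, 0.5, 1.0):
    print(span, round(loess_point(position, width, 450, span), 2))
```

Under these assumptions a small span follows local variation closely, while a span of 1 draws on nearly the whole dataset for every estimate, so the choice of span trades local detail against smoothness.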
Table 8 – Mean squared errors (M.S.E.) for the different methods
of interpolation for stream width (m) in the Falls River,
Upper Peninsula, Michigan. Mean width = 11.8 m

Interpolation          M.S.E.
Moving average
  Width = 4            19.09
  Width = 6            18.84
  Width = 8            19.08
  Width = 10           19.27
Kriging
  Radius = 500         18.70
  Radius = 1000        18.57
  Radius = 2000        18.67
  Radius = 4000        18.80
Loess smoother
  Smooth = 0.2         18.93
  Smooth = 0.3         18.50
  Smooth = 0.4         18.13
  Smooth = 0.5         18.05
  Smooth = 1           18.32

4.5. Relation to physical models
Many investigations of stream habitat rely on physical models
(e.g., PHABSIM) to estimate conditions within a study reach.
The focus of such models is on the interrelationship among
variables such as stream width, depth, velocity, and substrate
size. These models provide useful estimates of the multivariate
structure of stream environments and are based on fundamental
laws of physics. Although the relationship among these variables
can be thought of as following deterministic physical laws, the
actual conditions encountered at a small scale (e.g., local
variation in substrate leading to differences in local stream
slope, width, etc.) have a stochastic component, so input data
must be available at an appropriate scale. In this study, we
investigated methods for interpolating individual stream habitat
variables that could then be used directly, or as inputs into
physical models such as PHABSIM. Although the sampling scale,
and thus the appropriate interpolation method, may vary across
studies, the need for some application of interpolation is a
consistent feature of such research.

5. Conclusions
This study produced two main recommendations. First, data
should be stratified by habitat type only when habitat variables
are different for each habitat type and stratification does
not increase distance between points such that interpolation
accuracy is significantly reduced. If habitat variables are
similar for all meso-habitat types, knowing the meso-habitat type
within which a point falls does not add information that will
increase interpolation accuracy. Second, the Loess smoother
with a smoothing parameter from 0.2 to 0.4 may generally be
the most favorable method.

Acknowledgement
We would like to thank the Michigan Department of Natural
Resources and Michigan State University for funding this
study.

References
Battista, T.A., 2001. Habitat suitability index modeling to
support marine resource restoration efforts—a geographic
information system approach. In: Proceedings of the Second
Biennial Coastal GeoTools Conference, Charleston, SC,
January 8–11.
Beebe, J.T., 1996. Fluid speed variability and the importance to
managing fish habitat in rivers. Regul. Rivers: Res. Manage.
12 (1), 63–79.
Beebe, J.T., 2001. Flow disturbance caused by cross-stream
coarse woody debris. Phys. Geogr. 22 (3), 222–236.
Hankin, D.G., 1984. Multistage sampling designs in fisheries
research: applications in small streams. Can. J. Fish. Aquat.
Sci. 41, 1575–1591.
Hankin, D.G., Reeves, G.H., 1988. Estimating total fish
abundance and total habitat area in small streams based on
visual estimation methods. Can. J. Fish. Aquat. Sci. 45,
834–844.
Hicks, B.J., Watson, N.R.N., 1985. Seasonal changes in
abundance of brown trout (Salmo trutta) and rainbow
trout (S. gairdneri) assessed by drift diving in the
Rangitikei River, New Zealand. N. Z. J. Mar. Freshwater Res.
19, 1–10.
Isaaks, E.H., Srivastava, R.M., 1989. Applied Geostatistics.
Oxford University Press, New York.
Lehmann, A., Jaquet, J.M., Lachavanne, J.B., 1997. A GIS
approach of aquatic plant spatial heterogeneity in relation
to sediment and depth gradients, Lake Geneva, Switzerland.
Aquat. Bot. 58, 347–361.
Milhous, R.T., Updike, M.A., Schneider, D.M., 1989. Physical
habitat simulation system reference manual—version II.
Instream Flow Information Paper No. 26, U.S. Fish and
Wildlife Service. Biological Report 89 (16).
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.,
1992. Numerical Recipes in C: the Art of Scientific
Computing, 2nd ed. Cambridge University Press, New York.
SAS Institute Inc., 2000. Chapter 38: The LOESS Procedure.
SAS OnlineDoc, Version 8.
http://gsbwww.uchicago.edu/computing/research/SASManual/main.htm.
Thompson, B.E., 2004. Modeling of juvenile steelhead growth
and movement in the Pine River, Michigan. Ph.D.
Dissertation. Michigan State University, East Lansing, MI,
USA.