REALIGNING THE LAKE TAHOE INTERAGENCY MONITORING PROGRAM Vol. II Technical Appendices

Hydroikos Ltd.
2512 Ninth Street, Ste. 7
Berkeley, CA 94710
Phone: (510) 295-4094
coats@hydroikos.com
www.hydroikos.com
By Robert Coats and Jack Lewis
August 25, 2014
Cover Photo
The late Bob Leonard of the UC Davis Tahoe Research Group collecting a water sample from
Blackwood Creek using a DH-48 depth-integrating sampler, January 14, 1980. Discharge at the
time was about 400 cfs. Photo by Bob Richards, Tahoe Environmental Research Center.
Organization of Appendixes
A-1 Analytical Methods and correction of nitrate-N data
A-2 Detailed results for all simulations
  A-2.1 Simulations from synthetic populations
    A-2.1.1 SSC
    A-2.1.2 FS by mass
    A-2.1.3 FS by count
    A-2.1.4 TP
    A-2.1.5 TKN
  A-2.2 Simulations from the worked records
    A-2.2.1 NO3
    A-2.2.2 SRP
    A-2.2.3 SS
    A-2.2.4 THP
  A-2.3 Confidence limits as a function of sample size
    A-2.3.1 Without turbidity
    A-2.3.2 With Turbidity
A-3 Time-of-sampling bias
A-4 Bias in loads estimated from a standard sediment rating curve
  A-4.1 Simulation results
    A-4.1.1 Simulations from synthetic populations
    A-4.1.2 Simulations from the worked records
  A-4.2 Historic loads
A-5 Time Trend Analysis
  A-5.1 Partial regression plots for loads computed from daytime samples
  A-5.2 Adjusted Mann-Kendall p-values
A-6 Power analysis methodology and results
  A-6.1 Propagation of measurement error
  A-6.2 Power analysis methodology
  A-6.3 Power analysis results
A-7 Treatment of values less than Method Detection Limit
Appendix A-1. Adjustment of old nitrate-N data to correct for change in analytic method
Since the late 1970s, the Lake Tahoe Interagency Monitoring Program (LTIMP) has been
sampling Tahoe basin streams and analyzing the samples for nitrate-nitrogen (among other
constituents). Before 1976, samples were run through columns packed with cadmium
amalgamated with copper, reducing the nitrate to nitrite. In May 1976, the Cd columns were
replaced by the hydrazine reduction method described by Kamphake et al. (1967). The samples
have been analyzed in the laboratory of the Tahoe Research Group at Tahoe City, or at the High
Sierra Water Laboratory near Truckee. The hydrazine reduction, however, can be subject to
interference from dissolved Ca and Mg ions and from dissolved oxygen, which inhibit the
reduction of nitrate to nitrite by hydrazine (HS) (Kempers and Luft, 1989; Kempers and Van
Der Velde, 1992). This interference probably explains the poor spike recoveries seen for stream
samples: when a known amount of nitrate is added to a sample, its remeasured concentration is
often less than its calculated concentration. Samples from the Lake do not seem to be much
affected by interference. In the mid-1990s, in consultation with the USGS, water chemists at the
Tahoe Research Group conducted a series of experiments to develop an improved methodology
for measuring nitrate in basin streams. The method that showed the most promise involves
addition of pyrophosphate with copper as a catalyst; it is based on the method recommended by
Kempers and Luft (1988).
In 2004, we used two years (2003 and 2004 water years) of data from samples analyzed by the
hydrazine reduction method, both with and without the pyrophosphate catalyst. Samples from
18 stations were included, with a total of 575 samples. In order to establish the relationship
between the two methods, we used linear regression for all stations and years together, all
stations together for each water year, and for each individual station, for both years together. We
then tested for homogeneity of the regression coefficients, to test for significant differences in
the regression relationships between stations, and between years. We tried log-transformed data
as well as untransformed data. We treated the pyrophosphate method data as the dependent (Y)
variable, and the data from the old method as the independent (X) variable, since the goal is to
adjust the old data to the new method. We found that the regression coefficients varied
significantly among LTIMP streams, so separate adjustment equations were needed for each
stream.
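The comparison between a single pooled regression and separate per-station regressions can be sketched as below. This is only an illustration under stated assumptions: it uses a general F-test of one common straight line against separate lines per station, which may differ from the homogeneity test actually used in the study, and the function names and data layout are hypothetical.

```python
import numpy as np

def rss_line(x, y):
    """Residual sum of squares of a least-squares straight line y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def homogeneity_f(groups):
    """F statistic for H0: one common line fits every station.

    groups: dict mapping a station name to (x, y), where x is the old-method
    value and y the pyrophosphate-method value (hypothetical layout).
    Returns (F, df1, df2).
    """
    xs = np.concatenate([np.asarray(x, dtype=float) for x, _ in groups.values()])
    ys = np.concatenate([np.asarray(y, dtype=float) for _, y in groups.values()])
    rss_pooled = rss_line(xs, ys)                       # one line for all stations
    rss_separate = sum(rss_line(x, y) for x, y in groups.values())
    k, n = len(groups), len(xs)
    df1 = 2 * (k - 1)          # extra intercepts and slopes in the separate fit
    df2 = n - 2 * k            # residual degrees of freedom of the separate fit
    f_stat = ((rss_pooled - rss_separate) / df1) / (rss_separate / df2)
    return f_stat, df1, df2
```

A large F relative to the F(df1, df2) distribution would indicate that the stations do not share one regression line, which is the situation that calls for separate adjustment equations.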
In order to develop equations for adjusting the old nitrate data in this study, we used 2370 data
pairs from 18 stations, collected from 2003 to 2008. For the final equations we used a
second-order polynomial fit (quadratic equations), even though the improvement over linear
equations was usually slight. Table A-1 shows the derived adjustment equations and R² values. The
nitrate-N data from 1976 through April 2003 were adjusted using these equations. Since the
regressions were not forced through the origin, a few negative values were created in the
adjustment. These were set to zero. All nitrate data since April 2003 reflect the use of
pyrophosphate catalyst with the hydrazine reduction step.
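The adjustment step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name is hypothetical, the coefficients are those of the first two equations listed in Table A-1, and their pairing with stations BC-1 and ED-5 assumes the equations are listed in the same order as the stations.

```python
def adjust_nitrate(x, a, b, c):
    """Apply y = a*x**2 + b*x + c to an old-method nitrate-N value x.

    Negative results are set to zero, as described in the text.
    """
    y = a * x ** 2 + b * x + c
    return max(y, 0.0)

# First equation in Table A-1: y = 0.0049x^2 + 1.367x + 0.5878 (assumed BC-1)
BC1 = (0.0049, 1.367, 0.5878)
# Second equation in Table A-1: y = -0.0011x^2 + 1.6723x - 1.3544 (assumed ED-5)
ED5 = (-0.0011, 1.6723, -1.3544)

adjusted = adjust_nitrate(10.0, *BC1)   # 0.49 + 13.67 + 0.5878
clamped = adjust_nitrate(0.5, *ED5)     # raw value is negative, so zero is returned
```

The clamp only matters at very low concentrations, where a negative constant term outweighs the linear term.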
Nitrate data for two secondary stations, WC-7A and ED-3, with records from 1992-2001 and
1990-2001 respectively, were not adjusted, since sampling at these stations was discontinued
before the use of the catalyst was introduced. The nitrate-N loads reported in this study for those
two stations may be biased low, but are internally consistent over the periods of
record. No time trends in nitrate-N were detected for those stations.
Table A-1. Regression equations for correcting old nitrate-N data to current method (with
pyrophosphate)

                                                            Coefficients
Station  Equation                              R²     n     X2      X       Constant
BC-1     y = 0.0049x^2 + 1.367x + 0.5878       0.90   177   0.00    1.37    0.59
ED-5     y = -0.0011x^2 + 1.6723x - 1.3544     0.97   120   0.00    1.67    -1.35
ED-9     y = 0.001x^2 + 2.0923x - 7.769        0.85   122   0.00    2.09    -7.77
GC-1     y = 0.0362x^2 + 3.5525x - 6.9866      0.64   170   0.04    3.55    -6.99
GL-1     y = -0.0022x^2 + 1.7903x - 3.6596     0.91   134   0.00    1.79    -3.66
IN-1     y = 0.0072x^2 + 1.3984x - 1.59        0.93   151   0.01    1.40    -1.59
IN-2     y = 0.0105x^2 + 1.279x + 0.6727       0.90   75    0.01    1.28    0.67
IN-3     y = 0.0024x^2 + 1.3917x + 0.5088      0.91   114   0.00    1.39    0.51
LH-1     y = -0.0446x^2 + 2.7554x - 3.7429     0.75   132   -0.04   2.76    -3.74
TC-1     y = 0.0458x^2 + 1.0669x + 0.3007      0.78   151   0.05    1.07    0.30
TC-2     y = 0.0107x^2 + 1.7334x - 2.0809      0.73   117   0.01    1.73    -2.08
TC-3     y = -0.037x^2 + 1.9848x - 1.3726      0.75   74    -0.04   1.98    -1.37
TH-1     y = 0.0087x^2 + 1.3006x - 1.351       0.85   142   0.01    1.30    -1.35
UT-1     y = -0.0044x^2 + 1.8258x - 1.271      0.68   161   0.00    1.83    -1.27
UT-3     y = -0.0004x^2 + 1.3272x + 0.5448     0.78   119   0.00    1.33    0.54
UT-5     y = -0.0076x^2 + 1.5043x - 0.0603     0.83   114   -0.01   1.50    -0.06
WC-3A    y = 0.0078x^2 + 1.1913x + 0.9787      0.84   116   0.01    1.19    0.98
WC-8     y = 0.0054x^2 + 1.6508x - 0.523       0.78   181   0.01    1.65    -0.52
Appendix A-2. Detailed results for all simulations
Simulation results are shown for SS, FS by mass and count, TP, and TKN from resampling the
synthetic populations; and for NO3, SRP, SS, and THP from resampling the worked records.
The load estimation methods shown in the legends are defined in Section 8.3 of the body of this
report.
For constituents resampled from synthetic populations (Appendix A-2.1), each subsection of the
appendix first shows results from the individual synthetic populations, followed by 4 summary
graphs expressing accuracy in terms of (1) RMSE with turbidity, (2) MAPE with turbidity, (3)
RMSE without turbidity methods, and (4) MAPE without turbidity methods. In the summary
graphs, average results across populations are shown in terms of RMSE and MAPE. The
summary graphs show results for the top 6 methods, ranked by the mean of RMSE or MAPE
(both expressed as percentages of the true load), averaging first across populations and then
across sample sizes. The standard method that has been used historically, rcload.mdq, is shown
in all the summary graphs regardless of its ranking.
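The two accuracy measures and the ranking rule can be sketched as follows. This is an illustration only: the function names and data layout are hypothetical, and MAPE is taken to be the median absolute error as a percentage of the true load, following the axis label "Median absolute error (%)" on the graphs. The formal definitions are in the body of the report.

```python
import numpy as np

def rmse_pct(estimates, true_load):
    """Root mean square error of simulated load estimates, as % of the true load."""
    est = np.asarray(estimates, dtype=float)
    return 100.0 * np.sqrt(np.mean((est - true_load) ** 2)) / true_load

def mape_pct(estimates, true_load):
    """Median absolute error of simulated load estimates, as % of the true load."""
    est = np.asarray(estimates, dtype=float)
    return 100.0 * np.median(np.abs(est - true_load)) / true_load

def rank_methods(scores, top=6):
    """Return the `top` methods with the lowest mean score.

    scores: dict mapping a method name to a 2-D array of RMSE or MAPE values
    (rows = synthetic populations, columns = sample sizes). Scores are averaged
    first across populations and then across sample sizes.
    """
    means = {}
    for method, table in scores.items():
        by_sample_size = np.mean(np.asarray(table, dtype=float), axis=0)
        means[method] = float(np.mean(by_sample_size))
    return sorted(means, key=means.get)[:top]
```

With equal numbers of populations at every sample size, the order of averaging does not change the overall mean; the two-step form simply mirrors the description in the text.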
Statistics for the worked records are computed across all stations and years, so only summary
graphs are shown in Appendix A-2.2, and these are analogous to those presented in Appendix A-2.1.
Appendix A-2.3 shows confidence limits derived from the simulations as a function of sample
size.
A-2.1 Simulations from synthetic populations
A-2.1.1 SSC
[Figures not reproduced. X axis in all plots: Sample size. Legend entries: rcload.turb, rcload.turb2, rcload, rcload.mdq, rcload.mdq2, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg, pdmean, pdlinear, pdinstant, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2, rcb3, rcb4. The two "w/o turbidity" plots omit rcload.turb, rcload.turb2, rcb3, and rcb4.]
[Figure: WC-8 1999 SSC: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: WC-8 1999 SSC: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: TC-R1 2011 SSC: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: TC-R1 2011 SSC: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: HWD 2010 SSC: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: HWD 2010 SSC: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: TC-2 2011 SSC: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: TC-2 2011 SSC: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: TC-2 2010 SSC: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: TC-2 2010 SSC: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: TH-1 2005 SSC: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: TH-1 2005 SSC: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: SSC: Best methods by mean of RMSE. Y axis: mean of RMSE]
[Figure: SSC: Best methods by mean of MAPE. Y axis: mean of MAPE]
[Figure: SSC: Best methods by mean of RMSE w/o turbidity. Y axis: mean of RMSE]
[Figure: SSC: Best methods by mean of MAPE w/o turbidity. Y axis: mean of MAPE]
A-2.1.2 FS by mass
[Figures not reproduced. X axis in all plots: Sample size. Legend entries: rcload.turb, rcload.turb2, rcload, rcload.mdq, rcload.mdq2, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg, pdmean, pdlinear, pdinstant, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2, rcb3, rcb4. The two "w/o turbidity" plots omit rcload.turb, rcload.turb2, rcb3, and rcb4.]
[Figure: HWD 2010 FS: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: HWD 2010 FS: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: TC-4 2010 FS: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: TC-4 2010 FS: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: Rosewood Above 2010 FS: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: Rosewood Above 2010 FS: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: FS: Best methods by mean of RMSE. Y axis: mean of RMSE]
[Figure: FS: Best methods by mean of MAPE. Y axis: mean of MAPE]
[Figure: FS: Best methods by mean of RMSE w/o turbidity. Y axis: mean of RMSE]
[Figure: FS: Best methods by mean of MAPE w/o turbidity. Y axis: mean of MAPE]
A-2.1.3 FS by count
[Figures not reproduced. X axis in both plots: Sample size. Legend entries: rcload.turb, rcload.turb2, rcload, rcload.mdq, rcload.mdq2, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg, pdmean, pdlinear, pdinstant, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2, rcb3, rcb4.]
[Figure: AC2 2010 FSP: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: AC2 2010 FSP: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
A-2.1.4 TP
[Figures not reproduced. X axis in all plots: Sample size. Legend entries: rcload.turb, rcload.turb2, rcload, rcload.mdq, rcload.mdq2, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg, pdmean, pdlinear, pdinstant, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2, rcb3, rcb4. The two "w/o turbidity" plots omit rcload.turb, rcload.turb2, rcb3, and rcb4.]
[Figure: WC-8 1999 TP: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: WC-8 1999 TP: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: AC-2 2008 TP: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: AC-2 2008 TP: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: HWD 2010 TP: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: HWD 2010 TP: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: TC-2 2011 TP: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: TC-2 2011 TP: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: TP: Best methods by mean of RMSE. Y axis: mean of RMSE]
[Figure: TP: Best methods by mean of MAPE. Y axis: mean of MAPE]
[Figure: TP: Best methods by mean of RMSE w/o turbidity. Y axis: mean of RMSE]
[Figure: TP: Best methods by mean of MAPE w/o turbidity. Y axis: mean of MAPE]
A-2.1.5 TKN
[Figures not reproduced. X axis in all plots: Sample size. Legend entries: rcload.turb, rcload.turb2, rcload, rcload.mdq, rcload.mdq2, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg, pdmean, pdlinear, pdinstant, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2, rcb3, rcb4. The two "w/o turbidity" plots omit rcload.turb, rcload.turb2, rcb3, and rcb4.]
[Figure: WC-8 1999 TKN: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: WC-8 1999 TKN: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: AC-2 2008 TKN: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: AC-2 2008 TKN: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: HWD 2010 TKN: top-ranking methods by RMSE. Y axis: Root mean square error (%)]
[Figure: HWD 2010 TKN: top-ranking methods by MAPE. Y axis: Median absolute error (%)]
[Figure: TKN: Best methods by mean of RMSE. Y axis: mean of RMSE]
[Figure: TKN: Best methods by mean of MAPE. Y axis: mean of MAPE]
[Figure: TKN: Best methods by mean of RMSE w/o turbidity. Y axis: mean of RMSE]
[Figure: TKN: Best methods by mean of MAPE w/o turbidity. Y axis: mean of MAPE]
A-2.2 Simulations from the worked records
A-2.2.1 NO3
[Figures not reproduced. X axis in both plots: Sample size. Legend entries: rcload.mdq, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg.boot, pdmean, pdlinear, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2.]
[Figure: NO3: Best methods by RMSE. Y axis: RMSE (%)]
[Figure: NO3: Best methods by MAPE. Y axis: MAPE (%)]
A-2.2.2 SRP
[Figures not reproduced. X axis in both plots: Sample size. Legend entries: rcload.mdq, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg.boot, pdmean, pdlinear, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2.]
[Figure: SRP: Best methods by RMSE. Y axis: RMSE (%)]
[Figure: SRP: Best methods by MAPE. Y axis: MAPE (%)]
A-2.2.3 SSC
[Figures not reproduced. X axis in both plots: Sample size. Legend entries: rcload.mdq, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg.boot, pdmean, pdlinear, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2.]
[Figure: SSC: Best methods by RMSE. Y axis: RMSE (%)]
[Figure: SSC: Best methods by MAPE. Y axis: MAPE (%)]
A-2.2.4 THP
[Figures not reproduced. X axis in both plots: Sample size. Legend entries: rcload.mdq, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg.boot, pdmean, pdlinear, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2.]
[Figure: THP: Best methods by RMSE. Y axis: RMSE (%)]
[Figure: THP: Best methods by MAPE. Y axis: MAPE (%)]
A-2.3 Confidence limits as a function of sample size
Confidence limits are simulated percentiles of the absolute value of errors, expressed as
percentages of the estimated load. Mean confidence limits are the average across simulations.
The fourth graph in each set redisplays the means from the first 3 graphs together in one plot.
SS, FS, and TP were simulated from synthetic populations using method rcb2 (best regression
model selected by Gilroy's MSE). FSP (fine sediment particle counts) were simulated from just
one synthetic population using a standard rating curve. TKN was simulated from synthetic
populations using the period-weighted sample estimator. Sampling from synthetic populations
was limited to a 9 am to 6 pm workday. NO3 and SRP were simulated from the worked records using the
period-weighted sample estimator. For those constituents where turbidity improves estimation,
confidence limits are shown with turbidity in Appendix A-2.3.2. Optimal methods using
turbidity were linear regression of sqrt(c) on sqrt(turb) for SS and FS, and rcb3 (best regression
model selected by AIC) for TP.
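The percentile calculation behind these confidence limits can be sketched as follows. This is a minimal illustration, not the study's code: the function name and inputs are hypothetical, and errors are divided by the estimated load, following the wording of the paragraph above.

```python
import numpy as np

def confidence_limits(estimates, true_load, levels=(80, 90, 95)):
    """Percentiles of absolute error, expressed as % of the estimated load.

    estimates: simulated load estimates for one population and one sample size.
    Returns a dict mapping each percentile level to the error limit in percent.
    """
    est = np.asarray(estimates, dtype=float)
    abs_err_pct = 100.0 * np.abs(est - true_load) / est
    return {level: float(np.percentile(abs_err_pct, level)) for level in levels}
```

Averaging the returned limits across populations at each sample size would give the mean confidence limits described in the text.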
A-2.3.1 Confidence limits without turbidity
[Figures not reproduced. Each set has four panels plotted against Sample size: 80% Confidence Limits (80th Percentile of Error, % of load), 90% Confidence Limits (90th Percentile of Error, % of load), 95% Confidence Limits (95th Percentile of Error, % of load), and Mean Confidence Limits (Mean Error, % of load, with curves for the 95th, 90th, and 80th percentiles).]
[Figure: SS 80%, 90%, 95%, and Mean Confidence Limits. Series: HWD10, TC210, TC211, TCR11, TH105, WC899, Mean]
[Figure: FSP Confidence Limits based on AC210 only. Y axis: Mean Error (% of load); curves: 95th percentile, 90th percentile, 80th percentile]
The only simulation performed for fine sediment particle counts used the 2010 synthetic record
for station AC-2.
[Figure: TP 80%, 90%, 95%, and Mean Confidence Limits. Series: AC208, HWD10, TC211, WC899, Mean]
[Figure: TKN 80%, 90%, 95%, and Mean Confidence Limits. Series: AC208, HWD10, WC899, Mean]
[Figure: NO3 80%, 90%, 95%, and Mean Confidence Limits. Series: BC-1, GC-1, SN-1, TC-1, TH-1, UT-1, WC-8, Mean]
[Figure: SRP 80%, 90%, 95%, and Mean Confidence Limits. Series: BC-1, GC-1, SN-1, TC-1, TH-1, UT-1, WC-8, Mean]
A-2.3.2 Confidence limits with turbidity
[Figures not reproduced. Each set has four panels plotted against Sample size: 80% Confidence Limits Using Turbidity (80th Percentile of Error, % of load), 90% Confidence Limits Using Turbidity (90th Percentile of Error, % of load), 95% Confidence Limits Using Turbidity (95th Percentile of Error, % of load), and Mean Confidence Limits Using Turbidity (Mean Error, % of load, with curves for the 95th, 90th, and 80th percentiles).]
[Figure: SS 80%, 90%, 95%, and Mean Confidence Limits Using Turbidity. Series: HWD10, TC210, TC211, TCR11, TH105, WC899, Mean]
[Figure: FS 80%, 90%, 95%, and Mean Confidence Limits Using Turbidity. Series: HWD10, RWA10, TC410, Mean]
[Figure: TP 80%, 90%, 95%, and Mean Confidence Limits Using Turbidity. Series: AC208, HWD10, TC211, WC899, Mean]
APPENDIX A-3
Time-of-sampling bias
Loads were recomputed for NO3, SRP, TKN, SSC, and TP using (1) unrestricted sampling and
(2) only samples collected between 9am and 6pm. The graphs in this Appendix show the change
in load as a function of the proportion of samples omitted. Loads associated with sediment (SS,
TP, and TKN) are most sensitive to time-of-sampling. The dissolved loads (NO3 and SRP) are
largely unaffected, except for WC-8, where daytime-only sampling could result in an
underestimate of total load by up to 40 percent.
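The comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the load estimator (simple_load), the sample tuples, and the treatment of 9:00 up to but not including 18:00 as "daytime" are hypothetical stand-ins, not the estimation methods used in the report.

```python
from datetime import datetime

def simple_load(samples):
    """Stand-in load estimator (an assumption): mean of concentration x discharge."""
    return sum(c * q for _, c, q in samples) / len(samples)

def daytime_effect(samples, start_hour=9, end_hour=18, estimator=simple_load):
    """Return (proportion of samples omitted, proportional change in load)."""
    # Keep only samples taken from start_hour up to (not including) end_hour.
    daytime = [s for s in samples if start_hour <= s[0].hour < end_hour]
    omitted = 1.0 - len(daytime) / len(samples)
    full_load = estimator(samples)
    day_load = estimator(daytime)
    return omitted, (day_load - full_load) / full_load

# Hypothetical samples: (timestamp, concentration, discharge)
samples = [
    (datetime(2010, 5, 1, 10, 0), 12.0, 3.0),
    (datetime(2010, 5, 1, 14, 0), 15.0, 3.5),
    (datetime(2010, 5, 1, 21, 0), 40.0, 6.0),
    (datetime(2010, 5, 2, 2, 0), 35.0, 5.5),
    (datetime(2010, 5, 2, 11, 0), 14.0, 3.2),
]
omitted, change = daytime_effect(samples)
print(f"proportion omitted = {omitted:.2f}, proportional change in load = {change:.2f}")
```

In this made-up record the night samples carry the highest concentrations and flows, so restricting to daytime samples lowers the estimated load, which is the kind of effect the graphs in this Appendix display.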
[Figure: Proportional change in SSC load (y-axis, -0.6 to 0.4) versus proportion of samples omitted (x-axis, 0.0 to 0.5). Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: Proportional change in TP load (y-axis, -0.4 to 0.2) versus proportion of samples omitted (x-axis, 0.0 to 0.6). Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: Proportional change in TKN load (y-axis, -0.4 to 0.4) versus proportion of samples omitted (x-axis, 0.0 to 0.5). Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: Proportional change in NO3 load (y-axis, -0.4 to 0.4) versus proportion of samples omitted (x-axis, 0.0 to 0.5). Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: Proportional change in SRP load (y-axis, -0.4 to 0.4) versus proportion of samples omitted (x-axis, 0.0 to 0.5). Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
APPENDIX A-4
Bias in loads estimated from a standard sediment rating curve
We examined the bias in loads estimated from standard sediment rating curves in two ways.
Appendix A-4.1 uses simulation results to compare the bias with that of our selected methods.
Appendix A-4.2 looks at the differences between estimates of historic loads computed by the
standard rating curve and our selected methods.
A-4.1 Simulation results
The simulations provided estimates of the bias of all load estimation methods. This Appendix
shows the bias of the standard rating curve method (rcload.mdq), compared to that of our
selected methods for SS, TP, and TKN from resampling the synthetic data; and for NO3, SRP,
THP, and SS from resampling the worked records.
A-4.1.1 Simulations from synthetic populations
[Figure: Bias (% of true SS load) versus sample size (20 to 80) for the standard rating curve and the best model by GRMSE. Panels: HWD10, TC210, TC211, TCR11, TH105, WC899.]
[Figure: Bias (% of true TP load) versus sample size (20 to 80) for the standard rating curve and the best model by GRMSE. Panels: AC208, HWD10, TC211, WC899.]
[Figure: Bias (% of true TKN load) versus sample size (20 to 80) for the standard rating curve and the period-weighted method. Panels: AC208, HWD10, WC899.]
A-4.1.2 Simulations from the worked records
[Figure: Bias (% of true NO3 load) versus sample size (20 to 80) for the standard rating curve and the period-weighted method. Panels: BC-1, GC-1, SN-1, TC-1, TH-1, UT-1, WC-8.]
[Figure: Bias (% of true SRP load) versus sample size (20 to 80) for the standard rating curve and the period-weighted method. Panels: BC-1, GC-1, SN-1, TC-1, TH-1, UT-1, WC-8.]
[Figure: Bias (% of true THP load) versus sample size (20 to 80) for the standard rating curve and the period-weighted method. Panels: BC-1, GC-1, SN-1, TC-1, TH-1, UT-1, WC-8.]
[Figure: Bias (% of true SSC load) versus sample size (20 to 80) for the standard rating curve and the best model by GRMSE. Panels: BC-1, GC-1, SN-1, TC-1, TH-1, UT-1, WC-8.]
A-4.2 Historical loads
The following graphs show the distribution of differences between historical loads estimated
using standard sediment rating curves and our selected methods. Positive differences indicate
that the sediment rating curve estimates are higher.
[Figure: SS. Density of the difference between the rating curve estimate and the best model selection by GRMSE (% of latter); x-axis 0 to 150. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: TP. Density of the difference between the rating curve estimate and the best model selection by GRMSE (% of latter); x-axis 0 to 150. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: TKN. Density of the difference between the rating curve estimate and the period-weighted estimate (% of PWE); x-axis 0 to 150. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: NO3. Density of the difference between the rating curve estimate and the period-weighted estimate (% of PWE); x-axis 0 to 150. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: SRP. Density of the difference between the rating curve estimate and the period-weighted estimate (% of PWE); x-axis 0 to 150. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
APPENDIX A-5
Time Trend Analysis
This appendix shows results of Alley's adjusted variable Kendall test, recommended by Helsel
and Hirsch (2002, p. 335). Alley's test is basically a Mann-Kendall test on the partial regression
plot of log(load) versus water year. In the partial regression plots (Appendix A-5.1), log(load)
and water year are both regressed on the same set of predictors and the two sets of residuals are
plotted against one another. For TP, SRP, and NO3, we used multistation models of the form
$\log(\mathrm{load}_i) = b_{0i} + b_{1i}\log(\mathrm{flow}) + b_2\log(\mathrm{peak})$
where the subscript i denotes the gaging station. For TKN, the interaction between station and
log(flow) was not significant, so there was only one coefficient for log(flow); the slightly
simplified model is written by dropping the i subscript from the coefficient $b_{1i}$:
$\log(\mathrm{load}_i) = b_{0i} + b_1\log(\mathrm{flow}) + b_2\log(\mathrm{peak})$
For SS, the interaction between station and log(peak) improved the model more than that
between station and log(flow), hence the model is written by adding a subscript i to the coefficient $b_2$:
$\log(\mathrm{load}_i) = b_{0i} + b_1\log(\mathrm{flow}) + b_{2i}\log(\mathrm{peak})$
The partial regression plots of Appendix A-5.1 look very similar to more familiar plots (Figures
11.1-1 to 11.1-5) in which the x variable is water year. Mann-Kendall tests were also carried out
on the familiar plots. As stated by Helsel and Hirsch, these tests were less powerful, failing to
reject the null hypothesis for (1) SS at ED-3, and (2) TP at IN-1, IN-2, and WC-8. Otherwise
both tests identified the same set of trends for all constituents. Only the adjusted plots based on
daytime samples are included in Appendix A-5.1. These are more "correct" because they
eliminate doubts about time-of-sampling bias and the influence of correlations between water
year and the other predictors.
Tables in Appendix A-5.2 contain the p-values from the adjusted Mann-Kendall tests for trends
in loads computed from both daytime samples and the full set of samples.
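The partial regression idea described above can be sketched for a single station as follows. This is a minimal illustration, not the report's code: the helper names (ols_residuals, kendall_p), the synthetic record, and the use of an unadjusted normal approximation for the p-value are all assumptions, and the multistation terms (station-specific intercepts and slopes) are omitted.

```python
import math
import random

def ols_residuals(y, X):
    """Residuals of y regressed on the columns of X (intercept added), via normal equations."""
    n = len(y)
    A = [[1.0] + list(row) for row in X]                  # design matrix with intercept
    p = len(A[0])
    # Augmented normal equations [A'A | A'y]
    M = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(p)]
         + [sum(A[k][i] * y[k] for k in range(n))] for i in range(p)]
    for c in range(p):                                    # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][p] / M[i][i] for i in range(p)]
    return [y[k] - sum(b * a for b, a in zip(beta, A[k])) for k in range(n)]

def kendall_p(x, y):
    """Kendall S for paired data and a two-sided p-value (normal approximation, no ties)."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            s += ((dx > 0) - (dx < 0)) * ((dy > 0) - (dy < 0))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = 0.0 if s == 0 else (abs(s) - 1) / math.sqrt(var_s)
    return s, math.erfc(z / math.sqrt(2))

# Hypothetical record with a built-in decline of 0.05 log units per year
rng = random.Random(1)
years = list(range(1981, 2011))
log_flow = [rng.gauss(3.0, 0.5) for _ in years]
log_peak = [f + rng.gauss(1.0, 0.3) for f in log_flow]
log_load = [1.0 + 0.8 * f + 0.3 * p - 0.05 * (yr - 1981) + rng.gauss(0.0, 0.1)
            for f, p, yr in zip(log_flow, log_peak, years)]

predictors = list(zip(log_flow, log_peak))
load_resid = ols_residuals(log_load, predictors)          # log(load) on log(flow), log(peak)
year_resid = ols_residuals([float(y) for y in years], predictors)  # water year on the same
s_stat, p_value = kendall_p(year_resid, load_resid)
print(s_stat, p_value)
```

Plotting load_resid against year_resid gives the kind of partial regression plot described above; the Kendall statistic is computed on those same pairs.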
A-5.1 Partial regression plots for loads computed from daytime samples
[Figure: SS loads computed from daytime samples only. y-axis: SS Residual from Regression: log(dayload) ~ log(flow) + stn*log(peak); x-axis: Water Year Residual on Same Predictors. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: TP loads computed from daytime samples only. y-axis: TP Residual from Regression: log(dayload) ~ stn*log(flow) + log(peak); x-axis: Water Year Residual on Same Predictors. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: TKN loads computed from daytime samples only. y-axis: TKN Residual from Regression: log(dayload) ~ stn + log(flow) + log(peak); x-axis: Water Year Residual on Same Predictors. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: NO3 loads computed from daytime samples only. y-axis: NO3 Residual from Regression: log(dayload) ~ stn*log(flow) + log(peak); x-axis: Water Year Residual on Same Predictors. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
[Figure: SRP loads computed from daytime samples only. y-axis: SRP Residual from Regression: log(dayload) ~ stn*log(flow) + log(peak); x-axis: Water Year Residual on Same Predictors. Panels: BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8.]
A-5.2 Adjusted Mann-Kendall p-values for loads computed from both daytime samples and the full set of samples
For significance testing of the trend in each partial regression plot, the rejection level for the
Mann-Kendall test was set at alpha=0.05/20= 0.0025, to keep the overall experiment-wise error
for each constituent at approximately 0.05. Significant tests are highlighted in yellow. All
significant trends were declines.
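As a quick check on the rejection level above, the following sketch computes the experiment-wise error that results from 20 tests at alpha = 0.0025. It assumes the 20 tests are independent, which is an idealization; the function name is hypothetical.

```python
def familywise_error(alpha_per_test, n_tests):
    """Probability of at least one false rejection among n independent tests."""
    return 1.0 - (1.0 - alpha_per_test) ** n_tests

alpha = 0.05 / 20                       # per-test rejection level for 20 stations
print(alpha)
print(familywise_error(alpha, 20))      # slightly below 0.05 under independence
```

Under that assumption the overall error rate comes out just under 0.05, which is why the text describes it as "approximately 0.05".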
SS        BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only  0.3754  0.0000  0.3686  0.6158  0.6369  0.1137  0.0021  0.0795  0.0091  0.1888
full set  1.0000  0.0010  0.4543  0.1289  0.0135  0.1554  0.0004  0.0472  0.0059  0.0498

SS        TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only  0.0870  0.0033  0.0030  0.0003  0.0006  0.0187  0.7429  0.4605  0.6007  0.0053
full set  0.0781  0.0021  0.0014  0.0065  0.0004  0.0221  1.0000  0.6771  0.4843  0.0159

TP        BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only  0.0075  0.0210  0.1124  0.4343  0.0000  0.0812  0.0082  0.0020  0.4223  0.3194
full set  0.0111  0.0088  0.1311  0.2628  0.0011  0.2944  0.0025  0.0020  0.0855  0.3457

TP        TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only  0.5071  0.2422  0.0548  0.0008  0.0007  0.3536  0.8815  0.7246  0.2164  0.0002
full set  0.5722  0.2945  0.0283  0.0013  0.0008  0.4196  0.8815  0.6771  0.2912  0.0022

TKN       BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only  0.6062  0.0018  1.0000  0.7381  0.9804  0.5304  0.0870  0.5183  0.6308  0.8755
full set  0.4461  0.0311  0.5009  0.4680  0.9024  0.2483  0.0339  0.2792  0.5006  0.9168

TKN       TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only  0.6410  0.4196  0.1650  0.0072  0.1590  0.3858  0.4550  0.8227  0.0466  0.9168
full set  0.6062  0.2675  0.1458  0.0051  0.2269  0.3232  0.5706  0.7732  0.0047  0.5653

NO3       BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only  0.0000  0.3807  0.1124  0.3712  0.0007  0.1404  0.0012  0.6672  0.0018  0.1554
full set  0.0000  0.3807  0.1124  0.1970  0.0002  0.1715  0.0070  0.6672  0.0014  0.0812

NO3       TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only  0.3371  0.6545  0.0135  0.0002  0.0010  0.0090  0.0852  0.2086  0.8618  0.0011
full set  0.3627  0.8347  0.0164  0.0004  0.0012  0.0026  0.0745  0.1284  0.6007  0.0083

SRP       BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only  0.0431  0.3807  0.0395  0.9113  0.0001  0.0335  0.5392  0.0193  0.0468  0.0566
full set  0.0223  0.3807  0.0395  0.5031  0.0001  0.0139  0.8637  0.0356  0.0164  0.0812

SRP       TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only  0.0025  0.1763  0.0638  0.0002  0.0328  0.2945  0.9287  0.0740  0.0286  0.1391
full set  0.0030  0.1245  0.1126  0.0058  0.1162  0.2422  1.0000  0.0548  0.0091  0.0401
APPENDIX A-6
Power analysis methodology and results
A-6.1. Propagation of load estimation error to regression error
Trend of regression residuals is to be tested using the Mann-Kendall test or an adjusted version of Mann-Kendall that works on residuals from an "added-variable" (partial regression) plot.
The regression model without measurement error can be written
$\log(y_i) = f(x_i) + \varepsilon_i, \quad \mathrm{Var}(\varepsilon_i) = \sigma^2$
If there is error in the load estimate the model can be written
$\log(y_i^*) = f(x_i) + \varepsilon_i^*$
where $y_i^* = y_i + m_i$ represents the load estimate with measurement error and $\varepsilon_i^*$ is the associated
residual error. Let's assume that the measurement error is approximately normally distributed
with mean $B_i$ and standard deviation $\sigma_{m,i}$, i.e. $m_i \sim N(B_i, \sigma_{m,i})$. (The normal distribution is not
strictly possible because the $m_i$ can never be less than $-y_i$.) The load with error is then
$y_i^* = y_i + m_i = y_i \left(1 + \frac{m_i}{y_i}\right) = y_i \delta_i, \qquad \delta_i \sim N(1 + B_r, \sigma_r)$
where $\delta_i$ is defined to be $1 + m_i/y_i$, $B_r = B_i/y_i$ is the relative bias and $\sigma_r = \sigma_{m,i}/y_i$ is the
relative measurement error, which we assume to be constant for all measurements $y_i$. Then
$\log(y_i^*) = \log(y_i \delta_i) = \log(y_i) + \log(\delta_i) = f(x_i) + \varepsilon_i + \log(\delta_i) = f(x_i) + \gamma_i$
Assuming that the regression and estimation errors are independent, the residual variance for the
regression with estimation error is
$\mathrm{Var}(\gamma_i) = \mathrm{Var}[\varepsilon_i + \log(\delta_i)] = \mathrm{Var}(\varepsilon_i) + \mathrm{Var}[\log(\delta_i)]$
The last term can be approximated using the delta method (best when $\sigma_r \ll 1$), so
$\mathrm{Var}(\gamma_i) = \sigma^2 + \mathrm{Var}[\log(\delta_i)] \cong \sigma^2 + \frac{\mathrm{Var}[\delta_i]}{E^2[\delta_i]}$
$\mathrm{Var}(\gamma_i) \cong \sigma^2 + \frac{\sigma_r^2}{(1 + B_r)^2}$
The last expression shows how measurement error inflates residual error, which will in turn
erode statistical power. Interestingly, when the bias is negative it increases the variance, and
when it is positive it decreases it. This becomes a problem, because methods like the simple
rating curve often have positive bias, which makes them look better than methods with zero bias
in a statistical power analysis.
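The delta-method approximation and the effect of the sign of the bias can be checked numerically. The sketch below is an illustration only: the function names, the number of draws, and the choice to discard non-positive multipliers are assumptions, not part of the original analysis.

```python
import math
import random

def simulated_log_variance(rel_bias, rel_sd, n_draws=50_000, seed=0):
    """Monte Carlo estimate of Var[log(delta)] for delta ~ N(1 + rel_bias, rel_sd)."""
    rng = random.Random(seed)
    logs = []
    while len(logs) < n_draws:
        delta = rng.gauss(1.0 + rel_bias, rel_sd)
        if delta > 0:                     # log is undefined for non-positive multipliers
            logs.append(math.log(delta))
    mean = sum(logs) / n_draws
    return sum((v - mean) ** 2 for v in logs) / n_draws

def delta_method_variance(rel_bias, rel_sd):
    """Approximation Var[log(delta)] ~= sigma_r^2 / (1 + B_r)^2."""
    return rel_sd ** 2 / (1.0 + rel_bias) ** 2

for b in (-0.2, 0.0, 0.2):
    print(b, simulated_log_variance(b, 0.1), delta_method_variance(b, 0.1))
```

For a fixed relative precision, the added variance is larger with negative bias than with positive bias, consistent with the remark above.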
A-6.2. Power analysis methodology
Code was developed in R to estimate the power of the Mann-Kendall test. For any specified
sample size (number of years), the algorithm randomly resamples the residuals that were tested
in the adjusted Mann-Kendall tests for trend. (These are equivalent to residuals from complete
models that include water year.) All stations are pooled in these models. The residuals from the
models were reasonably well distributed, but all except those for TKN were somewhat long in the
tails (Figure 12.1-1 in the body of the report). Resampling the model residuals in some cases
gives different results than generating normally distributed errors; the former is more
representative of reality. Normally distributed measurement errors were generated based on
specified values of relative precision (0.1, 0.3, or 0.5) and bias (-0.2, 0.0, or +0.2). Finally, a trend
of a specified magnitude (1, 3, or 5% per year; the direction, up or down, does not matter) was added,
and the Mann-Kendall test was applied to see if the trend was significant at levels of 0.0025,
0.0050, and 0.05. The first two significance levels correspond to the Bonferroni rule for testing
groups of 20 and 10 stations, respectively. The statistical power is estimated as the proportion of
significant tests in 5000 resamplings, and appears to be repeatable to nearly 3 decimal places.
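The procedure just described can be sketched as follows. The original code was written in R and is not reproduced in this appendix; this Python version is a simplified illustration in which the function names, the synthetic "pooled residuals", the redrawing of non-positive error multipliers, and the reduced number of resamplings are all assumptions.

```python
import math
import random

def mann_kendall_p(values):
    """Two-sided Mann-Kendall p-value (normal approximation, no tie correction)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    if s == 0:
        return 1.0
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (abs(s) - 1) / math.sqrt(var_s)          # continuity-corrected |z|
    return math.erfc(z / math.sqrt(2))

def estimate_power(residuals, n_years, slope, cv, bias, shrink, alpha, n_sims, rng):
    """Proportion of simulated series in which the Mann-Kendall test rejects at alpha."""
    hits = 0
    for _ in range(n_sims):
        series = []
        for year in range(n_years):
            resid = shrink * rng.choice(residuals)   # resampled (shrunken) model residual
            delta = rng.gauss(1.0 + bias, cv)        # multiplicative measurement error
            while delta <= 0:                        # redraw non-physical multipliers
                delta = rng.gauss(1.0 + bias, cv)
            series.append(slope * year + resid + math.log(delta))
        if mann_kendall_p(series) < alpha:
            hits += 1
    return hits / n_sims

rng = random.Random(2014)
pooled = [rng.gauss(0.0, 0.3) for _ in range(400)]   # hypothetical stand-in residuals
for n_years in (20, 40):
    power = estimate_power(pooled, n_years, slope=-0.03, cv=0.3, bias=0.0,
                           shrink=0.96, alpha=0.0025, n_sims=200, rng=rng)
    print(n_years, power)
```

With real residuals in place of the synthetic ones and n_sims raised toward 5000, the same loop over sample size, slope, CV, bias, and alpha would produce power curves of the kind shown in the figures of this appendix.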
The next graph contrasts bias at levels of -10%, 0%, and +10%. It shows the power of the Mann-Kendall
test to detect trends (slope) from 1% to 5% per year, with measurement errors (CV)
varying from 10% to 50% of the load. The significance level for this analysis is fixed at
0.0025, reflecting the Bonferroni adjustment when conducting a family of 20 tests. At this level,
bias does not strongly affect statistical power.
[Figure: SSC (alpha=0.0025). Mann-Kendall Statistical Power (y-axis, 0.0 to 1.0) versus sample size (x-axis, 20 to 60). Curves: Bias of -10%, 0%, and 10%. Panels: CV = 0.1, 0.3, and 0.5 by Slope = -0.01, -0.03, and -0.05.]
Note that the methodology as described so far adds measurement error to residuals that already
contain measurement error. The issue was addressed by shrinking the residual variance from the
models, i.e. multiplying the residuals by a constant less than 1 (determined by trial and error) to
make the final residual variance, with added measurement error, match the original residual
variance. For example, if we shrink the original residuals to 88% of their original standard error,
then add 10% measurement error, the result matches the original residual variance. The reason
the shrinkage factor is not exactly 100 minus the measurement error is that we are working with
logarithms of measurements. The appropriate shrinkage factors, determined by trial and error,
are shown in the following table, and were used in all subsequent power analyses.
Parameter  Mean LTIMP   Estimation       RSE used in    Relative precision     Shrinkage factor
           sample size  method           Mann-Kendall   from simulations for   needed to match RSE
                                         test           given sample size      used in Mann-Kendall test
SS         31.7         Best regression  0.516          0.139                  0.96
TP         31.5         Best regression  0.287          0.073                  0.96
TKN        26.9         Period-weighted  0.321          0.104                  0.94
NO3        34.7         Period-weighted  0.386          0.081                  0.98
SRP        34.1         Period-weighted  0.219          0.079                  0.92
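The shrinkage factors in the table were determined by trial and error. As a rough cross-check, a closed-form approximation follows from the delta-method variance in A-6.1: if the shrunken residual variance plus the added measurement variance must equal the original residual variance, the factor is about sqrt(1 - (sigma_r / RSE)^2) when the bias is zero. The sketch below applies that approximation to the tabulated values; the function name is hypothetical and the formula is an assumption of this sketch, not the procedure used in the report.

```python
import math

def shrinkage_factor(rse, rel_precision, rel_bias=0.0):
    """Approximate factor s with s^2 * RSE^2 + Var[log(delta)] = RSE^2."""
    added = (rel_precision / (1.0 + rel_bias)) ** 2   # delta-method Var[log(delta)]
    if added >= rse ** 2:
        raise ValueError("measurement error exceeds the residual variance")
    return math.sqrt(1.0 - added / rse ** 2)

# (RSE used in Mann-Kendall test, relative precision) as listed in the table
table = {
    "SS": (0.516, 0.139),
    "TP": (0.287, 0.073),
    "TKN": (0.321, 0.104),
    "NO3": (0.386, 0.081),
    "SRP": (0.219, 0.079),
}
for name, (rse, precision) in table.items():
    print(name, round(shrinkage_factor(rse, precision), 3))
```

The approximate factors land within about 0.01 to 0.02 of the tabulated trial-and-error values, which is the level of agreement one would expect given that the approximation ignores higher-order terms in the logarithm.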
Preliminary analyses using a shrinkage factor of 92% for SRP, a 2% trend and alpha=0.0025
showed that it would take 40 years with 25% measurement errors (ME) or 30 years with 10%
ME to have power of 0.82 (i.e. 81% chance of detecting the trend). If the trend were 3% per
year, one would need 23 years with 10% ME or 30 years with 30% error to have power of 0.79.
To put these trends in perspective, here are the trends (in natural-log units per year) that we
found to be significant. For small slopes, each 0.01 in slope units corresponds to about 1%
change per year.
SS    ED-3     IN-1     TC-2     TC-3     TH-1     UT-1
      -0.1167  -0.0614  -0.0464  -0.0585  -0.0624  -0.0276
TP    GC-1     IN-1     IN-2     TH-1     UT-1     WC-8
      0.0242   -0.0159  -0.041   -0.0321  -0.015   -0.0241
TKN   ED-3
      -0.0567
SRP   GC-1     TC-1     TH-1
      -0.0172  0.012    -0.0104
NO3   BC-1     GC-1     IN-1     IN-3     TH-1     UT-1
      -0.0239  -0.0334  -0.0186  -0.028   -0.0311  -0.0174
Percent change per year can be calculated precisely from slope as $100(e^{\mathrm{slope}} - 1)$:
slope     -0.01  -0.02  -0.03  -0.04  -0.05  -0.06  -0.07  -0.08  -0.09  -0.10  -0.11   -0.12
% change  -1.00  -1.98  -2.96  -3.92  -4.88  -5.82  -6.76  -7.69  -8.61  -9.52  -10.42  -11.31
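The conversion from slope to percent change can be reproduced directly. The tabulated values carry a negative sign for declines, which corresponds to 100(e^slope - 1); the function name below is a hypothetical choice for this sketch.

```python
import math

def percent_change_per_year(slope):
    """Percent change per year implied by a slope in natural-log units per year."""
    return 100.0 * (math.exp(slope) - 1.0)

for k in range(1, 13):
    slope = -k / 100
    print(f"{slope:.2f}  {percent_change_per_year(slope):.2f}")
```

For small slopes the result is close to 100 times the slope, which is the rule of thumb stated above; the two diverge noticeably once the slope reaches about -0.1.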
A-6.3. Power analysis results
[Figures: SS Loads. Mann-Kendall Statistical Power (y-axis, 0.0 to 1.0) versus sample size (x-axis, 20 to 80), one figure each for Alpha = 0.0025, 0.005, and 0.05. Curves: CV of 10%, 30%, and 50%. Panels: Bias = -0.2, 0, and 0.2 by Slope = -0.01, -0.03, and -0.05.]
[Figures: TP Loads. Mann-Kendall Statistical Power (y-axis, 0.0 to 1.0) versus sample size (x-axis, 20 to 80), one figure each for Alpha = 0.0025, 0.005, and 0.05. Curves: CV of 10%, 30%, and 50%. Panels: Bias = -0.2, 0, and 0.2 by Slope = -0.01, -0.03, and -0.05.]
[Figures: TKN Loads. Mann-Kendall Statistical Power (y-axis, 0.0 to 1.0) versus sample size (x-axis, 20 to 80), one figure each for Alpha = 0.0025, 0.005, and 0.05. Curves: CV of 10%, 30%, and 50%. Panels: Bias = -0.2, 0, and 0.2 by Slope = -0.01, -0.03, and -0.05.]
[Figures: NO3 Loads. Mann-Kendall Statistical Power (y-axis, 0.0 to 1.0) versus sample size (x-axis, 20 to 80), one figure each for Alpha = 0.0025, 0.005, and 0.05. Curves: CV of 10%, 30%, and 50%. Panels: Bias = -0.2, 0, and 0.2 by Slope = -0.01, -0.03, and -0.05.]
[Figures: SRP Loads. Mann-Kendall Statistical Power (y-axis, 0.0 to 1.0) versus sample size (x-axis, 20 to 80), one figure each for Alpha = 0.0025, 0.005, and 0.05. Curves: CV of 10%, 30%, and 50%. Panels: Bias = -0.2, 0, and 0.2 by Slope = -0.01, -0.03, and -0.05.]
APPENDIX A-7
Treatment of Values Less Than Method Detection Limit
Triangular distribution to impute left-censored values for water quality constituents
RB Thomas-June 10, 2002
Sets of some water quality constituents measured at Lake Tahoe contain subsets of missing
values due to being below minimum detection limit (MDL) of measuring devices. Missing
values should be accounted for to avoid biasing estimates of distribution parameters. Missing
values should not be omitted since they depend on the constituent values themselves. This report
describes one method to "impute" missing values when the purpose is to estimate parameters of
the constituent distributions and when there aren't "too many" missing values and compares it to
the MDL/2 imputation. Neither method should be used when the variables are part of a
multivariate complex where variate values for each case will be used to investigate relationships
among them.
We would expect the constituent distributions to be right skewed, bounded on the left by zero,
and unbounded on the right. Left censoring removes a portion of the left of the distribution from
the sample. The common practices of substituting zero, MDL/2, or MDL itself for the missing
values concentrates too much probability mass in the wrong places. Using a random method
might be expected to work better, but it is necessary to use a realistic distribution to reasonably
model the missing portion. For this reason, uniform or normal random variates would not work
well. Since we expect the missing portion of the distribution to drop to zero at the left and to rise
toward the right a triangular distribution is more realistic. This model has lower probability of
measurements occurring toward the left and higher probability toward the right (i.e., at the MDL)
and is simple to compute. We know the upper limit of the missing portion and how many values
are missing. We can use this information to generate "missing" values based on the triangle
distribution. Figure 1 on the next page illustrates the method.
The triangular distribution is bounded by the blue, red, and horizontal black lines and has height
2/MDL (not on the same scale as the probability density axis) so that the area of the triangle is 1.
This is a special case of this distribution where the mode is at its right boundary making it a
"right triangular distribution." At least for this concentration distribution the red line
approximates the left limb of the lognormal distribution reasonably well. This technique could
only be expected to work when the MDL is less than the mode; otherwise the red line would cut
across the "true" unknown distribution and poorly describe the missing portion. The distribution
(not density) function for this distribution, which gives the probability of the random variable, x,
being less than the specified value, can be written:
$F(x) = \frac{x^2}{\mathrm{MDL}^2}$
Note that this function is 0 when x is zero and 1 when x = MDL, as required. To estimate a
missing value, x, select a uniform random number, RN, between zero and one and substitute it
for the value of the distribution function. Solving for x, we can estimate x as:
$x = \mathrm{MDL}\sqrt{RN}$
[Figure 1: Illustration of missing data imputation using triangular distribution. x-axis: Constituent concentration (0 to 2, with X and MDL marked); y-axis: probability density (0.00 to 0.04), with the triangle height 2/MDL marked.]
This can be done in Excel by selecting a uniform random number for each missing value, taking
its square root, and using the product with the MDL as the triangular estimate.
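The same calculation can be sketched outside Excel. This is an illustration only: the function name and the use of None to mark censored values (rather than the zero coding used in the Excel example) are assumptions of this sketch.

```python
import math
import random

def impute_triangular(values, mdl, rng=random):
    """Replace censored entries (None) with MDL * sqrt(U), where U ~ Uniform(0, 1)."""
    return [mdl * math.sqrt(rng.random()) if v is None else v for v in values]

rng = random.Random(42)
field = [0.82, None, 1.4, None, 0.35]    # hypothetical data; None means below the MDL
restored = impute_triangular(field, mdl=0.3, rng=rng)
print(restored)
```

Because the imputed values follow the right triangular distribution on (0, MDL), their long-run average is 2MDL/3, and measured values pass through unchanged.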
This is illustrated in the attached Excel file <LogNormal Example>. The LN(0,1) column
contains 100 values generated from the lognormal distribution whose associated normal
distribution is the standard normal. An IF statement was then used to simulate the Field Data
column, where all values less than an MDL of 0.3 are coded as zero. This should work in this case since there are
no legitimate measured zeros as long as MDL > 0. The final Restored Data column is constructed
with another IF statement that creates a triangle-distributed value for each missing (i.e., 0) value
and otherwise just copies the measured data.
The last two pages contain output from a simulation to compare the triangle and MDL/2
imputation methods. Two thousand samples of size 100 were selected from the lognormal
distribution associated with the standard normal distribution. The MDL was set at 0.3. The mean
and standard deviation were calculated for each complete sample, for the sample "restored" by the
triangle method, and for the sample restored by the MDL/2 method. Then the percentage differences
of the means and standard deviations of the two restoration methods from those for the
corresponding full sample were calculated and the values plotted as a "dotplot" for comparison.
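A simulation along these lines can be sketched as follows. It is not the original simulation: the function names, the random seed, and the summary by simple averages are assumptions, so the numbers it prints will differ in detail from the output reproduced in this appendix.

```python
import math
import random

def mean_sd(values):
    """Sample mean and standard deviation (n - 1 denominator)."""
    n = len(values)
    m = sum(values) / n
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))

def percent_diffs(n_samples=2000, size=100, mdl=0.3, seed=0):
    """Average percent difference from the full-sample mean and SD for each method."""
    rng = random.Random(seed)
    totals = {"tri_mean": 0.0, "mdl2_mean": 0.0, "tri_sd": 0.0, "mdl2_sd": 0.0}
    for _ in range(n_samples):
        full = [rng.lognormvariate(0.0, 1.0) for _ in range(size)]
        # Values below the MDL are treated as censored and then "restored".
        tri = [x if x >= mdl else mdl * math.sqrt(rng.random()) for x in full]
        half = [x if x >= mdl else mdl / 2 for x in full]
        m, s = mean_sd(full)
        tm, ts = mean_sd(tri)
        hm, hs = mean_sd(half)
        totals["tri_mean"] += 100 * (tm - m) / m
        totals["mdl2_mean"] += 100 * (hm - m) / m
        totals["tri_sd"] += 100 * (ts - s) / s
        totals["mdl2_sd"] += 100 * (hs - s) / s
    return {key: value / n_samples for key, value in totals.items()}

results = percent_diffs()
for key, value in results.items():
    print(key, round(value, 3))
```

Under these assumptions the triangle method's mean difference stays near zero while the MDL/2 method's is negative by a few tenths of a percent, in line with the discussion that follows.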
The population of means estimated by the triangle method centers on zero which one would hope
to see while those estimated by the MDL/2 method are centered left of zero. The percent
difference is not large; only about a third of a percentage point, but there is a slight negative bias.
This is because this method puts too much probability mass too low. Using the mean of the
triangular distribution, 2MDL/3, would probably work better. For the standard deviations the
points from the triangular distribution again cluster around zero, but this time those for the
MDL/2 method are higher. This can be expected since the low probability mass for this method
puts too many points too far from the mean, thus increasing the variance and standard deviation.
Again, this bias is not large, only about 0.25 percent.
[Character dotplot: percent differences of the means for the triangle method (trimn%, each dot represents 13 points) and the MDL/2 method (mdl/2mn%, each dot represents 14 points), plotted on a common axis from -1.20 to 0.80.]
Descriptive Statistics
Variable   N     Mean      Median    TrMean    StDev    SEMean
trimn%     2000  0.01828   0.01360   0.01806   0.20962  0.00469
mdl/2mn%   2000  -0.34662  -0.33157  -0.34083  0.17858  0.00399

Variable   Min       Max      Q1        Q3
trimn%     -0.82913  1.03259  -0.11196  0.14885
mdl/2mn%   -1.13847  0.17348  -0.45772  -0.21853