Hydroikos Ltd.
2512 Ninth Street, Ste. 7
Berkeley, CA 94710
Phone: (510) 295-4094
coats@hydroikos.com
www.hydroikos.com

REALIGNING THE LAKE TAHOE INTERAGENCY MONITORING PROGRAM
Vol. II Technical Appendices

By Robert Coats and Jack Lewis
August 25, 2014

Cover Photo
The late Bob Leonard of the UC Davis Tahoe Research Group collecting a water sample from Blackwood Creek using a DH-48 depth-integrating sampler, January 14, 1980. Discharge at the time was about 400 cfs. Photo by Bob Richards, Tahoe Environmental Research Center.

Organization of Appendixes
A-1 Analytical Methods and correction of nitrate-N data
A-2 Detailed results for all simulations
  A-2.1 Simulations from synthetic populations
    A-2.1.1 SSC
    A-2.1.2 FS by mass
    A-2.1.3 FS by count
    A-2.1.4 TP
    A-2.1.5 TKN
  A-2.2 Simulations from the worked records
    A-2.2.1 NO3
    A-2.2.2 SRP
    A-2.2.3 SS
    A-2.2.4 THP
  A-2.3 Confidence limits as a function of sample size
    A-2.3.1 Without turbidity
    A-2.3.2 With Turbidity
A-3 Time-of-sampling bias
A-4 Bias in loads estimated from a standard sediment rating curve
  A-4.1 Simulation results
    A-4.1.1 Simulations from synthetic populations
    A-4.1.2 Simulations from the worked records
  A-4.2 Historic loads
A-5 Time Trend Analysis
  A-5.1 Partial regression plots for loads computed from daytime samples
  A-5.2 Adjusted Mann-Kendall p-values
A-6 Power analysis methodology and results
  A-6.1 Propagation of measurement error
  A-6.2 Power analysis methodology
  A-6.3 Power analysis results
A-7 Treatment of values less than Method Detection Limit

Appendix A-1. Adjustment of old nitrate-N data to correct for change in analytic method

Since the late 1970s, the Lake Tahoe Interagency Monitoring Program (LTIMP) has been sampling Tahoe basin streams, and analyzing the samples for nitrate-nitrogen (among other constituents). Before 1976, samples were run through columns packed with cadmium amalgamated with copper, reducing the nitrate to nitrite.
In May 1976, the Cd columns were replaced by the hydrazine reduction method described by Kamphake et al. (1967). The samples have been analyzed in the laboratory of the Tahoe Research Group at Tahoe City, or at the High Sierra Water Laboratory near Truckee. The hydrazine reduction, however, can be subject to interference from dissolved Ca and Mg ions, and from dissolved oxygen, which inhibit the reduction of nitrate to nitrite by hydrazine (HS) (Kempers and Luft, 1989; Kempers and Van Der Velde, 1992). This interference probably explains the poor spike recoveries seen for stream samples: when a known amount of nitrate is added to a sample, the remeasured concentration is often less than the calculated concentration. Samples from the Lake do not seem to be much affected by interference.

In the mid-1990s, in consultation with the USGS, water chemists at the Tahoe Research Group conducted a series of experiments to develop an improved methodology for measuring nitrate in basin streams. The method that showed the most promise involves addition of pyrophosphate with copper as a catalyst; it is based on the method recommended by Kempers and Luft, 1988.

In 2004, we used two years (2003 and 2004 water years) of data from samples analyzed by the hydrazine reduction method both with and without the pyrophosphate catalyst. Samples from 18 stations were included, for a total of 575 samples. To establish the relationship between the two methods, we fitted linear regressions for all stations and years together, for all stations together within each water year, and for each individual station with both years together. We then tested for homogeneity of the regression coefficients, to determine whether the regression relationships differed significantly between stations and between years. We tried log-transformed data as well as untransformed data.
We treated the pyrophosphate method data as the dependent (Y) variable, and the data from the old method as the independent (X) variable, since the goal is to adjust the old data to the new method. We found that the regression coefficients varied significantly among LTIMP streams, so separate adjustment equations were needed for each stream.

To develop the equations for adjusting the old nitrate data in this study, we used 2370 data pairs from 18 stations, collected from 2003 to 2008. For the final equations we used a second-order polynomial fit (quadratic equations), even though the improvement over linear equations was usually slight. Table A-1 shows the derived adjustment equations and R² values. The nitrate-N data from 1976 through April 2003 were adjusted using these equations. Since the regressions were not forced through the origin, the adjustment created a few negative values; these were set to zero. All nitrate data since April 2003 reflect the use of the pyrophosphate catalyst with the hydrazine reduction step.

Nitrate data for two secondary stations, WC-7A and ED-3 (with records from 1992-2001 and 1990-2001, respectively), were not adjusted, since sampling of these stations was discontinued before the catalyst was introduced. The nitrate-N loads reported in this study for those two stations may be biased low, but are internally consistent over the periods of record. No time trends in nitrate-N for those stations were detected.
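The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `adjust_nitrate` and the dictionary layout are hypothetical, only two stations are included, and the coefficients are copied from the equations in Table A-1. Concentrations are assumed to be in whatever units the table's equations were fitted in, which this appendix does not state.

```python
# Quadratic coefficients (a, b, c) for y = a*x**2 + b*x + c, copied from the
# equations in Table A-1 for two stations only (illustrative subset).
ADJUSTMENT_COEFFS = {
    "BC-1": (0.0049, 1.367, 0.5878),
    "GC-1": (0.0362, 3.5525, -6.9866),
}


def adjust_nitrate(station, old_value):
    """Adjust an old-method nitrate-N value to the pyrophosphate method.

    Hypothetical helper: applies the station's quadratic equation and
    sets any negative result to zero, as described in the text.
    """
    a, b, c = ADJUSTMENT_COEFFS[station]
    adjusted = a * old_value ** 2 + b * old_value + c
    return max(adjusted, 0.0)


if __name__ == "__main__":
    # A low old-method value at GC-1 yields a negative result, reported as 0.
    print(adjust_nitrate("GC-1", 1.0))
    print(adjust_nitrate("BC-1", 10.0))
```

Truncating at zero keeps the adjusted record physically meaningful, at the cost of a small upward shift in the lowest values at stations whose equations have a negative constant.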
Table A-1. Regression equations for correcting old nitrate-N data to current method (with pyrophosphate). In each equation, x is the old-method value and y is the value adjusted to the pyrophosphate method.

                                                             Coefficients
Station  Equation                               R²     n     x²      x      Constant
BC-1     y = 0.0049x² + 1.367x + 0.5878         0.90   177    0.00   1.37    0.59
ED-5     y = -0.0011x² + 1.6723x - 1.3544       0.97   120    0.00   1.67   -1.35
ED-9     y = 0.001x² + 2.0923x - 7.769          0.85   122    0.00   2.09   -7.77
GC-1     y = 0.0362x² + 3.5525x - 6.9866        0.64   170    0.04   3.55   -6.99
GL-1     y = -0.0022x² + 1.7903x - 3.6596       0.91   134    0.00   1.79   -3.66
IN-1     y = 0.0072x² + 1.3984x - 1.59          0.93   151    0.01   1.40   -1.59
IN-2     y = 0.0105x² + 1.279x + 0.6727         0.90    75    0.01   1.28    0.67
IN-3     y = 0.0024x² + 1.3917x + 0.5088        0.91   114    0.00   1.39    0.51
LH-1     y = -0.0446x² + 2.7554x - 3.7429       0.75   132   -0.04   2.76   -3.74
TC-1     y = 0.0458x² + 1.0669x + 0.3007        0.78   151    0.05   1.07    0.30
TC-2     y = 0.0107x² + 1.7334x - 2.0809        0.73   117    0.01   1.73   -2.08
TC-3     y = -0.037x² + 1.9848x - 1.3726        0.75    74   -0.04   1.98   -1.37
TH-1     y = 0.0087x² + 1.3006x - 1.351         0.85   142    0.01   1.30   -1.35
UT-1     y = -0.0044x² + 1.8258x - 1.271        0.68   161    0.00   1.83   -1.27
UT-3     y = -0.0004x² + 1.3272x + 0.5448       0.78   119    0.00   1.33    0.54
UT-5     y = -0.0076x² + 1.5043x - 0.0603       0.83   114   -0.01   1.50   -0.06
WC-3A    y = 0.0078x² + 1.1913x + 0.9787        0.84   116    0.01   1.19    0.98
WC-8     y = 0.0054x² + 1.6508x - 0.523         0.78   181    0.01   1.65   -0.52

Appendix A-2. Detailed results for all simulations

Simulation results are shown for SS, FS by mass and count, TP, and TKN from resampling the synthetic populations; and for NO3, SRP, SS, and THP from resampling the worked records. The load estimation methods shown in the legends are defined in Section 8.3 of the body of this report.
For constituents resampled from synthetic populations (Appendix A-2.1), each subsection first shows results for the individual synthetic populations, followed by four summary graphs expressing accuracy as (1) RMSE with turbidity methods, (2) MAPE with turbidity methods, (3) RMSE without turbidity methods, and (4) MAPE without turbidity methods. The summary graphs show RMSE and MAPE averaged across populations for the top 6 methods, ranked by the mean of RMSE or MAPE (both expressed as percentages of the true load), averaging first across populations and then across sample sizes. The standard method that has been used historically, rcload.mdq, is shown in all the summary graphs regardless of its ranking.

Statistics for the worked records are computed across all stations and years, so only summary graphs are shown in Appendix A-2.2; these are analogous to those presented in Appendix A-2.1. Appendix A-2.3 shows confidence limits derived from the simulations as a function of sample size.
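As a rough illustration of the two accuracy measures, the sketch below computes RMSE and MAPE as percentages of a known true load from a set of simulated load estimates. It assumes that MAPE here means the median absolute error expressed as a percentage, which is how the figure axes in this appendix label it; the function names and example numbers are hypothetical and not taken from the report.

```python
import math
import statistics


def rmse_percent(estimates, true_load):
    """Root mean square error of the estimates, as a percentage of the true load."""
    mean_sq_error = statistics.fmean([(e - true_load) ** 2 for e in estimates])
    return 100.0 * math.sqrt(mean_sq_error) / true_load


def mape_percent(estimates, true_load):
    """Median absolute error of the estimates, as a percentage of the true load."""
    median_abs_error = statistics.median([abs(e - true_load) for e in estimates])
    return 100.0 * median_abs_error / true_load


if __name__ == "__main__":
    # Hypothetical simulated load estimates for one method and sample size.
    simulated = [90.0, 100.0, 110.0, 130.0]
    print(rmse_percent(simulated, 100.0))
    print(mape_percent(simulated, 100.0))
```

Because RMSE squares the errors, a single large overestimate raises it much more than it raises MAPE, which is one reason the two rankings can differ for the same method.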
A-2.1 Simulations from synthetic populations

[Figure legends in Appendix A-2.1 list the methods rcload.turb, rcload.turb2, rcload, rcload.mdq, rcload.mdq2, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg, pdmean, pdlinear, pdinstant, pdlocal2, pdlocal2a, pdlocal4, rcb1, rcb2, rcb3, and rcb4. The "w/o turbidity" summary graphs omit rcload.turb, rcload.turb2, rcb3, and rcb4. In all graphs the x-axis is Sample size.]

A-2.1.1 SSC

[Figures: "SSC: top-ranking methods by RMSE" (y-axis: Root mean square error (%)) and "SSC: top-ranking methods by MAPE" (y-axis: Median absolute error (%)) for each synthetic population: WC-8 1999, TC-R1 2011, HWD 2010, TC-2 2011, TC-2 2010, and TH-1 2005.]

[Summary figures: "SSC: Best methods by mean of RMSE", "SSC: Best methods by mean of MAPE", "SSC: Best methods by mean of RMSE w/o turbidity", and "SSC: Best methods by mean of MAPE w/o turbidity".]

A-2.1.2 FS by mass

[Figures: "FS: top-ranking methods by RMSE" (y-axis: Root mean square error (%)) and "FS: top-ranking methods by MAPE" (y-axis: Median absolute error (%)) for each synthetic population: HWD 2010, TC-4 2010, and Rosewood Above 2010.]

[Summary figures: "FS: Best methods by mean of RMSE", "FS: Best methods by mean of MAPE", "FS: Best methods by mean of RMSE w/o turbidity", and "FS: Best methods by mean of MAPE w/o turbidity".]

A-2.1.3 FS by count

[Figures: "AC2 2010 FSP: top-ranking methods by RMSE" (y-axis: Root mean square error (%)) and "AC2 2010 FSP: top-ranking methods by MAPE" (y-axis: Median absolute error (%)).]

A-2.1.4 TP

[Figures: "TP: top-ranking methods by RMSE" (y-axis: Root mean square error (%)) and "TP: top-ranking methods by MAPE" (y-axis: Median absolute error (%)) for each synthetic population: WC-8 1999, AC-2 2008, HWD 2010, and TC-2 2011.]

[Summary figures: "TP: Best methods by mean of RMSE", "TP: Best methods by mean of MAPE", "TP: Best methods by mean of RMSE w/o turbidity", and "TP: Best methods by mean of MAPE w/o turbidity".]

A-2.1.5 TKN

[Figures: "TKN: top-ranking methods by RMSE" (y-axis: Root mean square error (%)) and "TKN: top-ranking methods by MAPE" (y-axis: Median absolute error (%)) for each synthetic population: WC-8 1999, AC-2 2008, and HWD 2010.]

[Summary figures: "TKN: Best methods by mean of RMSE", "TKN: Best methods by mean of MAPE", "TKN: Best methods by mean of RMSE w/o turbidity", and "TKN: Best methods by mean of MAPE w/o turbidity".]

A-2.2 Simulations from the worked records

[Figure legends in Appendix A-2.2 list the methods rcload.mdq, rcload.mdq3, rcload.mdq4, rcload.mdq5, rcload.mdq6, loess.g, loess.s, ace, avas, areg.boot, pdmean, pdlinear, pdlocal2, pdlocal2a, pdlocal4, rcb1, and rcb2. In all graphs the x-axis is Sample size and the y-axis is RMSE (%) or MAPE (%).]

A-2.2.1 NO3

[Figures: "NO3: Best methods by RMSE" and "NO3: Best methods by MAPE".]

A-2.2.2 SRP

[Figures: "SRP: Best methods by RMSE" and "SRP: Best methods by MAPE".]

A-2.2.3 SSC

[Figures: "SSC: Best methods by RMSE" and "SSC: Best methods by MAPE".]

A-2.2.4 THP

[Figures: "THP: Best methods by RMSE" and "THP: Best methods by MAPE".]

A-2.3 Confidence limits as a function of sample size

Confidence limits are simulated percentiles of the absolute value of errors, expressed as percentages of the estimated load. Mean confidence limits are the average across simulations. The fourth graph in each set redisplays the means from the first 3 graphs together in one plot. SS, FS, and TP were simulated from synthetic populations using method rcb2 (best regression model selected by Gilroy's MSE). FSP (fine sediment particle counts) were simulated from just one synthetic population using a standard rating curve. TKN was simulated from synthetic populations using the period-weighted sample estimator. Sampling from synthetic populations was limited to a 9-6 workday.
NO3 and SRP were simulated from the worked records using the period-weighted sample estimator. For those constituents where turbidity improves estimation, confidence limits are shown with turbidity in Appendix A-2.3.2. Optimal methods using turbidity were linear regression of sqrt(c) on sqrt(turb) for SS and FS, and rcb3 (best regression model selected by AIC) for TP.

A-2.3.1 Confidence limits without turbidity

[Figures: "SS 80% Confidence Limits", "SS 90% Confidence Limits", "SS 95% Confidence Limits", and "SS Mean Confidence Limits". x-axis: Sample size; y-axes: 80th, 90th, and 95th Percentile of Error (% of load), and Mean Error (% of load). Curves in the first three graphs: HWD10, TC210, TC211, TCR11, TH105, WC899, and Mean. Curves in the fourth graph: 95th, 90th, and 80th percentile.]

[Figure: "FSP Confidence Limits based on AC210 only". x-axis: Sample size; y-axis: Mean Error (% of load). Curves: 95th, 90th, and 80th percentile.]

The only simulation performed for fine sediment particle counts used the 2010 synthetic record for station AC-2.
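The confidence limits defined at the start of Appendix A-2.3 (simulated percentiles of the absolute error, expressed as percentages of the estimated load) can be sketched as follows. This is an illustrative reading of that definition rather than the authors' code: the function name is hypothetical, the example numbers are invented, and the nearest-rank percentile rule is an assumption, since the report does not say which percentile rule was used.

```python
import math


def error_percentile(estimates, true_load, pct):
    """Nearest-rank percentile of absolute errors, each as a % of its estimate.

    Assumes every estimated load is positive. The nearest-rank rule is an
    assumption made for this sketch.
    """
    errors = sorted(100.0 * abs(e - true_load) / e for e in estimates)
    rank = max(1, math.ceil(pct * len(errors) / 100))
    return errors[rank - 1]


if __name__ == "__main__":
    # Hypothetical simulated load estimates for a true load of 100.
    simulated = [80.0, 100.0, 125.0, 200.0, 50.0]
    for pct in (80, 90, 95):
        print(pct, error_percentile(simulated, 100.0, pct))
```

Averaging such percentiles across the simulated populations at each sample size would give values comparable to the "Mean" curves in the graphs.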
68 40 50 60 70 40 WC899 Mean 10 20 30 AC208 HWD10 TC211 0 10 30 80 10 20 30 40 50 60 Sample size Sample size TP 95% Confidence Limits TP Mean Confidence Limits 70 80 40 40 20 90th Percentile of Error (% of load) 40 WC899 Mean 20 30 AC208 HWD10 TC211 10 35 30 25 20 10 10 95th percentile 90th percentile 80th percentile 15 Mean Error (% of load) WC899 Mean 20 30 AC208 HWD10 TC211 0 95th Percentile of Error (% of load) TP 90% Confidence Limits 0 80th Percentile of Error (% of load) TP 80% Confidence Limits 10 20 30 40 50 60 70 80 Sample size 10 20 30 40 50 Samples size 69 60 70 80 30 40 50 60 70 35 30 25 20 10 15 5 80 10 20 30 40 50 60 70 Sample size TKN 95% Confidence Limits TKN Mean Confidence Limits 80 10 15 20 25 95th percentile 90th percentile 80th percentile 0 5 Mean Error (% of load) 0 5 10 15 20 25 30 AC208 HWD10 WC899 Mean 30 35 Sample size 35 20 AC208 HWD10 WC899 Mean 0 90th Percentile of Error (% of load) 35 5 10 15 20 25 30 AC208 HWD10 WC899 Mean 10 95th Percentile of Error (% of load) TKN 90% Confidence Limits 0 80th Percentile of Error (% of load) TKN 80% Confidence Limits 10 20 30 40 50 60 70 80 Sample size 10 20 30 40 50 Samples size 70 60 70 80 20 30 40 50 60 70 40 TH-1 UT-1 WC-8 Mean 20 30 BC-1 GC-1 SN-1 TC-1 10 90th Percentile of Error (% of load) 30 10 80 10 20 30 40 50 60 70 Sample size Sample size NO3 95% Confidence Limits NO3 Mean Confidence Limits 80 20 30 20 0 10 95th percentile 90th percentile 80th percentile 10 TH-1 UT-1 WC-8 Mean 30 40 BC-1 GC-1 SN-1 TC-1 Mean Error (% of load) 50 40 10 95th Percentile of Error (% of load) TH-1 UT-1 WC-8 Mean 15 20 25 BC-1 GC-1 SN-1 TC-1 NO3 90% Confidence Limits 5 80th Percentile of Error (% of load) NO3 80% Confidence Limits 10 20 30 40 50 60 70 80 Sample size 10 20 30 40 50 Samples size 71 60 70 80 20 30 40 50 60 70 50 TH-1 UT-1 WC-8 Mean 20 30 40 BC-1 GC-1 SN-1 TC-1 10 90th Percentile of Error (% of load) 20 80 10 20 30 40 50 60 70 Sample size Sample size SRP 95% Confidence Limits SRP Mean Confidence 
Limits 80 30 20 0 20 95th percentile 90th percentile 80th percentile 10 TH-1 UT-1 WC-8 Mean 40 60 BC-1 GC-1 SN-1 TC-1 Mean Error (% of load) 80 40 10 95th Percentile of Error (% of load) TH-1 UT-1 WC-8 Mean 10 15 BC-1 GC-1 SN-1 TC-1 SRP 90% Confidence Limits 5 80th Percentile of Error (% of load) SRP 80% Confidence Limits 10 20 30 40 50 60 70 80 Sample size 10 20 30 40 50 Samples size 72 60 70 80 A-2.3.2 Confidence limits with turbidity 30 40 50 60 70 50 TCR11 TH105 WC899 Mean 10 20 30 40 HWD10 TC210 TC211 0 20 20 SS 90% Confidence Limits Using Turbidity 90th Percentile of Error (% of load) 50 Mean 10 10 80 10 20 30 40 50 60 70 80 SS 95% Confidence Limits Using Turbidity SS Mean Confidence Limits Using Turbidity 50 Sample size 50 Sample size 95th percentile 90th percentile 80th percentile 30 20 0 10 20 10 40 Mean Mean Error (% of load) TCR11 TH105 WC899 30 40 HWD10 TC210 TC211 0 95th Percentile of Error (% of load) TCR11 TH105 WC899 30 40 HWD10 TC210 TC211 0 80th Percentile of Error (% of load) SS 80% Confidence Limits Using Turbidity 10 20 30 40 50 60 70 80 Sample size 10 20 30 40 50 Sample size 73 60 70 80 20 30 40 50 60 70 40 FS 90% Confidence Limits Using Turbidity TC410 Mean 10 20 30 HWD10 RWA10 0 90th Percentile of Error (% of load) 40 10 10 80 10 20 30 40 50 60 70 80 FS 95% Confidence Limits Using Turbidity FS Mean Confidence Limits Using Turbidity 40 Sample size 30 0 10 95th percentile 90th percentile 80th percentile 20 Mean Error (% of load) TC410 Mean 20 30 HWD10 RWA10 10 40 Sample size 0 95th Percentile of Error (% of load) TC410 Mean 20 30 HWD10 RWA10 0 80th Percentile of Error (% of load) FS 80% Confidence Limits Using Turbidity 10 20 30 40 50 60 70 80 Sample size 10 20 30 40 50 Sample size 74 60 70 80 50 40 30 WC899 Mean 20 20 0 10 20 30 40 50 60 70 80 10 20 30 40 50 60 70 Sample size TP 95% Confidence Limits Using Turbidity TP Mean Confidence Limits Using Turbidity 30 40 95th percentile 90th percentile 80th percentile 20 20 0 10 0 80 10 Mean Error (% 
of load) WC899 Mean 30 40 AC208 HWD10 TC211 50 Sample size 50 10 95th Percentile of Error (% of load) AC208 HWD10 TC211 10 90th Percentile of Error (% of load) 50 WC899 Mean 30 40 AC208 HWD10 TC211 TP 90% Confidence Limits Using Turbidity 0 80th Percentile of Error (% of load) TP 80% Confidence Limits Using Turbidity 10 20 30 40 50 60 70 80 Sample size 10 20 30 40 50 Sample size 75 60 70 80 APPENDIX A-3 Time-of-sampling bias Loads were recomputed for NO3, SRP, TKN, SSC, and TP using (1) unrestricted sampling and (2) only samples collected between 9am and 6pm. The graphs in this Appendix show the change in load as a function of the proportion of samples omitted. Loads associated with sediment (SS, TP, and TKN) are most sensitive to time-of-sampling. The dissolved loads (NO3 and SRP) are largely unaffected, except for WC-8, where daytime-only sampling could result in an underestimate of total load by up to 40 percent. 0.0 0.1 0.2 0.3 0.4 0.5 UT-3 UT-5 0.0 0.1 0.2 0.3 0.4 0.5 WC-3A WC-7A WC-8 Proportional change in SSC load 0.4 0.2 0.0 -0.2 -0.4 -0.6 TC-1 TC-2 TC-3 TH-1 UT-1 GL-1 IN-1 IN-2 IN-3 LH-1 0.4 0.2 0.0 -0.2 -0.4 -0.6 0.4 0.2 0.0 -0.2 -0.4 -0.6 BC-1 ED-3 ED-5 ED-9 GC-1 0.4 0.2 0.0 -0.2 -0.4 -0.6 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.1 0.2 0.3 0.4 0.5 Proportion of samples omitted 76 0.0 0.1 0.2 0.3 0.4 0.5 0.0 UT-3 0.2 0.4 0.6 UT-5 0.0 WC-3A 0.2 0.4 0.6 WC-7A WC-8 0.2 0.0 -0.2 Proportional change in TP load -0.4 TC-1 TC-2 TC-3 TH-1 UT-1 GL-1 IN-1 IN-2 IN-3 LH-1 0.2 0.0 -0.2 -0.4 0.2 0.0 -0.2 -0.4 BC-1 ED-3 ED-5 ED-9 GC-1 0.2 0.0 -0.2 -0.4 0.0 0.2 0.4 0.6 0.0 0.2 0.4 0.6 Proportion of samples omitted 77 0.0 0.2 0.4 0.6 0.0 0.1 0.2 0.3 0.4 0.5 UT-3 UT-5 0.0 0.1 0.2 0.3 0.4 0.5 WC-3A WC-7A WC-8 0.4 0.2 0.0 -0.2 Proportional change in TKN load -0.4 TC-1 TC-2 TC-3 TH-1 UT-1 GL-1 IN-1 IN-2 IN-3 LH-1 0.4 0.2 0.0 -0.2 -0.4 0.4 0.2 0.0 -0.2 -0.4 BC-1 ED-3 ED-5 ED-9 GC-1 0.4 0.2 0.0 -0.2 -0.4 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.1 0.2 0.3 0.4 0.5 Proportion of samples omitted 78 0.0 
[Figure: Proportional change in NO3 load versus proportion of samples omitted; one panel per station (BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8).]

[Figure: Proportional change in SRP load versus proportion of samples omitted; same 20 stations.]

APPENDIX A-4 Bias in loads estimated from a standard sediment rating curve

We examined the bias in loads estimated from standard sediment rating curves in two ways. Appendix A-4.1 uses simulation results to compare the bias with that of our selected methods. Appendix A-4.2 looks at the differences between estimates of historic loads computed by the standard rating curve and our selected methods.

A-4.1 Simulation results

The simulations provided estimates of the bias of all load estimation methods. This Appendix shows the bias of the standard rating curve method (rcload.mdq), compared to that of our selected methods for SS, TP, and TKN from resampling the synthetic data; and for NO3, SRP, THP, and SS from resampling the worked records.
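The bias calculation behind the graphs in this Appendix can be illustrated with a minimal Python sketch. It is not the authors' rcload.mdq code. It assumes that a standard rating curve is a least-squares fit of log(concentration) on log(discharge), that no retransformation bias correction is applied (this Appendix does not say whether one is), and that bias is the mean estimation error over repeated random samples, expressed as a percentage of the true load. All function names are hypothetical.

```python
import math
import random

def fit_rating_curve(q, c):
    """Least-squares fit of log(c) = a + b*log(q); returns (a, b)."""
    x = [math.log(v) for v in q]
    y = [math.log(v) for v in c]
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def rating_curve_load(a, b, daily_q):
    """Sum of flow times predicted concentration (units left to the caller)."""
    # Assumption: naive back-transformation, no bias correction factor.
    return sum(qd * math.exp(a + b * math.log(qd)) for qd in daily_q)

def percent_bias(estimates, true_load):
    """Mean estimation error as a percentage of the true load."""
    mean_est = sum(estimates) / len(estimates)
    return 100.0 * (mean_est - true_load) / true_load

def simulate_bias(daily_q, daily_c, n, reps, seed=0):
    """Resample n days (n >= 2, distinct flows), refit, and return percent bias."""
    rng = random.Random(seed)
    true_load = sum(q * c for q, c in zip(daily_q, daily_c))
    days = range(len(daily_q))
    estimates = []
    for _ in range(reps):
        idx = rng.sample(days, n)  # sampling without replacement
        a, b = fit_rating_curve([daily_q[i] for i in idx],
                                [daily_c[i] for i in idx])
        estimates.append(rating_curve_load(a, b, daily_q))
    return percent_bias(estimates, true_load)
```

Calling simulate_bias over a range of sample sizes would give one bias curve of the kind plotted against sample size in A-4.1.1 and A-4.1.2, for whatever synthetic or worked record is supplied.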
A-4.1.1 Simulations from synthetic populations

[Figure: Bias (% of true SS load) versus sample size (20 to 80); standard rating curve compared with best model by GRMSE; stations HWD10, TC210, TC211, TCR11, TH105, WC899.]

[Figure: Bias (% of true TP load) versus sample size; standard rating curve compared with best model by GRMSE; stations AC208, HWD10, TC211, WC899.]

[Figure: Bias (% of true TKN load) versus sample size; standard rating curve compared with period-weighted; stations AC208, HWD10, WC899.]

A-4.1.2 Simulations from the worked records

[Figure: Bias (% of true NO3 load) versus sample size (20 to 80); standard rating curve compared with period-weighted; stations BC-1, GC-1, SN-1, TC-1, TH-1, UT-1, WC-8.]

[Figure: Bias (% of true SRP load) versus sample size; standard rating curve compared with period-weighted; same seven stations.]

[Figure: Bias (% of true THP load) versus sample size; standard rating curve compared with period-weighted; same seven stations.]

[Figure: Bias (% of true SSC load) versus sample size; standard rating curve compared with best model by GRMSE; same seven stations.]

A-4.2 Historical loads

The following graphs show the distribution of differences between historical loads estimated using standard sediment rating curves and our selected methods. Positive differences indicate that the sediment rating curve estimates are higher.
[Figure: SS. Density of the difference between rating curve estimate and best model selection by GRMSE (% of latter), 0 to 150; one panel per station (BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8).]

[Figure: TP. Density of the difference between rating curve estimate and best model selection by GRMSE (% of latter); same 20 stations.]

[Figure: TKN. Density of the difference between rating curve estimate and period-weighted estimate (% of PWE); same 20 stations.]

[Figure: NO3. Density of the difference between rating curve estimate and period-weighted estimate (% of PWE); same 20 stations.]

[Figure: SRP. Density of the difference between rating curve estimate and period-weighted estimate (% of PWE); same 20 stations.]

APPENDIX A-5 Time Trend Analysis

This appendix shows results of Alley's adjusted variable Kendall test, recommended by Helsel and Hirsch (2002, p. 335).
Alley's test is basically a Mann-Kendall test on the partial regression plot of log(load) versus water year. In the partial regression plots (Appendix A-5.1), log(load) and water year are both regressed on the same set of predictors, and the two sets of residuals are plotted against one another. For TP, SRP, and NO3, we used multistation models of the form

$$\log(\mathrm{load}_i) = b_{0i} + b_{1i}\log(\mathrm{flow}) + b_2\log(\mathrm{peak}),$$

where the subscript $i$ denotes the gaging station. For TKN, the interaction between station and log(flow) was not significant, so there was only one coefficient for log(flow); the slightly simplified model is written by dropping the $i$ subscript from the coefficient $b_{1i}$:

$$\log(\mathrm{load}_i) = b_{0i} + b_1\log(\mathrm{flow}) + b_2\log(\mathrm{peak}).$$

For SS, the interaction between station and log(peak) improved the model more than that between station and log(flow); hence the model is written by adding a subscript $i$ to the coefficient $b_2$:

$$\log(\mathrm{load}_i) = b_{0i} + b_1\log(\mathrm{flow}) + b_{2i}\log(\mathrm{peak}).$$

The partial regression plots of Appendix A-5.1 look very similar to the more familiar plots (Figures 11.1-1 to 11.1-5) in which the x variable is water year. Mann-Kendall tests were also carried out on the familiar plots. Consistent with Helsel and Hirsch, these tests were less powerful, failing to reject the null hypothesis for (1) SS at ED-3, and (2) TP at IN-1, IN-2, and WC-8. Otherwise both tests identified the same set of trends for all constituents.

Only the adjusted plots based on daytime samples are included in Appendix A-5.1. These are more "correct" because they eliminate doubts about time-of-sampling bias and the influence of correlations between water year and the other predictors. Tables in Appendix A-5.2 contain the p-values from the adjusted Mann-Kendall tests for trends in loads computed from both daytime samples and the full set of samples.

A-5.1 Partial regression plots for loads computed from daytime samples

[Figure: SS loads computed from daytime samples only. SS residual from regression log(dayload) ~ log(flow) + stn*log(peak), plotted against water year residual on the same predictors; one panel per station (BC-1, ED-3, ED-5, ED-9, GC-1, GL-1, IN-1, IN-2, IN-3, LH-1, TC-1, TC-2, TC-3, TH-1, UT-1, UT-3, UT-5, WC-3A, WC-7A, WC-8).]

[Figure: TP loads computed from daytime samples only. TP residual from regression log(dayload) ~ stn*log(flow) + log(peak), plotted against water year residual on the same predictors; same 20 stations.]

[Figure: TKN loads computed from daytime samples only. TKN residual from regression log(dayload) ~ stn + log(flow) + log(peak), plotted against water year residual on the same predictors; same 20 stations.]

[Figure: NO3 loads computed from daytime samples only. NO3 residual from regression log(dayload) ~ stn*log(flow) + log(peak), plotted against water year residual on the same predictors; same 20 stations.]

[Figure: SRP loads computed from daytime samples only. SRP residual from regression log(dayload) ~ stn*log(flow) + log(peak), plotted against water year residual on the same predictors; same 20 stations.]

A-5.2 Adjusted Mann-Kendall p-values for loads computed from both daytime samples and the full set of samples

For significance testing of the trend in each partial regression plot, the rejection level for the Mann-Kendall test was set at alpha = 0.05/20 = 0.0025, to keep the overall experiment-wise error for each constituent at approximately 0.05. Significant tests are highlighted in yellow. All significant trends were declines.

SS         BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only   0.3754  0.0000  0.3686  0.6158  0.6369  0.1137  0.0021  0.0795  0.0091  0.1888
full set   1.0000  0.0010  0.4543  0.1289  0.0135  0.1554  0.0004  0.0472  0.0059  0.0498

SS         TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only   0.0870  0.0033  0.0030  0.0003  0.0006  0.0187  0.7429  0.4605  0.6007  0.0053
full set   0.0781  0.0021  0.0014  0.0065  0.0004  0.0221  1.0000  0.6771  0.4843  0.0159

TP         BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only   0.0075  0.0210  0.1124  0.4343  0.0000  0.0812  0.0082  0.0020  0.4223  0.3194
full set   0.0111  0.0088  0.1311  0.2628  0.0011  0.2944  0.0025  0.0020  0.0855  0.3457

TP         TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only   0.5071  0.2422  0.0548  0.0008  0.0007  0.3536  0.8815  0.7246  0.2164  0.0002
full set   0.5722  0.2945  0.0283  0.0013  0.0008  0.4196  0.8815  0.6771  0.2912  0.0022

TKN        BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only   0.6062  0.0018  1.0000  0.7381  0.9804  0.5304  0.0870  0.5183  0.6308  0.8755
full set   0.4461  0.0311  0.5009  0.4680  0.9024  0.2483  0.0339  0.2792  0.5006  0.9168

TKN        TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only   0.6410  0.4196  0.1650  0.0072  0.1590  0.3858  0.4550  0.8227  0.0466  0.9168
full set   0.6062  0.2675  0.1458  0.0051  0.2269  0.3232  0.5706  0.7732  0.0047  0.5653

NO3        BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only   0.0000  0.3807  0.1124  0.3712  0.0007  0.1404  0.0012  0.6672  0.0018  0.1554
full set   0.0000  0.3807  0.1124  0.1970  0.0002  0.1715  0.0070  0.6672  0.0014  0.0812

NO3        TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only   0.3371  0.6545  0.0135  0.0002  0.0010  0.0090  0.0852  0.2086  0.8618  0.0011
full set   0.3627  0.8347  0.0164  0.0004  0.0012  0.0026  0.0745  0.1284  0.6007  0.0083

SRP        BC-1    ED-3    ED-5    ED-9    GC-1    GL-1    IN-1    IN-2    IN-3    LH-1
day only   0.0431  0.3807  0.0395  0.9113  0.0001  0.0335  0.5392  0.0193  0.0468  0.0566
full set   0.0223  0.3807  0.0395  0.5031  0.0001  0.0139  0.8637  0.0356  0.0164  0.0812

SRP        TC-1    TC-2    TC-3    TH-1    UT-1    UT-3    UT-5    WC-3A   WC-7A   WC-8
day only   0.0025  0.1763  0.0638  0.0002  0.0328  0.2945  0.9287  0.0740  0.0286  0.1391
full set   0.0030  0.1245  0.1126  0.0058  0.1162  0.2422  1.0000  0.0548  0.0091  0.0401

APPENDIX A-6 Power analysis methodology and results

A-6.1. Propagation of load estimation error to regression error

The trend in the regression residuals is tested using the Mann-Kendall test, or the adjusted version of the Mann-Kendall test that works on residuals from an "added-variable" (partial regression) plot. The regression model without measurement error can be written

$$\log(y_i) = f(x_i) + \varepsilon_i, \qquad \mathrm{Var}(\varepsilon_i) = \sigma^2.$$

If there is error in the load estimate, the model can be written

$$\log(y_i^*) = f(x_i) + \varepsilon_i^*,$$

where $y_i^* = y_i + m_i$ represents the load estimate with measurement error and $\varepsilon_i^*$ is the associated residual error. Let's assume that the measurement error is approximately normally distributed with mean $B_i$ and standard deviation $\sigma_{m,i}$, i.e. $m_i \sim N(B_i, \sigma_{m,i})$. (The normal distribution is not strictly possible because the $m_i$ can never be less than $-y_i$.) The load with error is then

$$y_i^* = y_i + m_i = y_i\left(1 + \frac{m_i}{y_i}\right) = y_i\,\delta_i, \qquad \delta_i \sim N(1 + B_r, \sigma_r),$$

where $\delta_i$ is defined to be $1 + m_i/y_i$, $B_r = B_i/y_i$ is the relative bias, and $\sigma_r = \sigma_{m,i}/y_i$ is the relative measurement error, which we assume to be constant for all measurements $y_i$.
Then

$$\log(y_i^*) = \log(y_i\,\delta_i) = \log(y_i) + \log(\delta_i) = f(x_i) + \varepsilon_i + \log(\delta_i) = f(x_i) + \gamma_i.$$

Assuming that the regression and estimation errors are independent, the residual variance for the regression with estimation error is

$$\mathrm{Var}(\gamma_i) = \mathrm{Var}\left[\varepsilon_i + \log(\delta_i)\right] = \mathrm{Var}(\varepsilon_i) + \mathrm{Var}\left[\log(\delta_i)\right].$$

The last term can be approximated using the delta method (best when $\sigma_r \ll 1$), so

$$\mathrm{Var}(\gamma_i) = \sigma^2 + \mathrm{Var}\left[\log(\delta_i)\right] \cong \sigma^2 + \frac{\mathrm{Var}[\delta_i]}{E^2[\delta_i]},$$

$$\mathrm{Var}(\gamma_i) \cong \sigma^2 + \frac{\sigma_r^2}{(1 + B_r)^2}.$$

The last expression shows how measurement error inflates residual error, which will in turn erode statistical power. Interestingly, a negative bias increases the variance, and a positive bias decreases it. This becomes a problem, because methods like the simple rating curve often have positive bias, which makes them look better than methods with zero bias in a statistical power analysis.

A-6.2. Power analysis methodology

Code was developed in R to estimate the power of the Mann-Kendall test. For any specified sample size (number of years), the algorithm randomly resamples the residuals that were tested in the adjusted Mann-Kendall tests for trend. (These are equivalent to residuals from complete models that include water year.) All stations are pooled in these models. The residuals from the models were not badly distributed, but all except those for TKN were somewhat long in the tails (Figure 12.1-1 in the body of the report). Resampling the model residuals in some cases gives different results than generating normally distributed errors; the former is more representative of reality. Normally distributed measurement errors were generated based on specified values of relative precision (0.1, 0.3, or 0.5) and bias (-0.2, 0.0, or +0.2). Finally, a trend of a specified magnitude (1, 3, or 5% per year; the direction, up or down, does not matter) was added, and the Mann-Kendall test was applied to see if the trend was significant at levels of 0.0025, 0.0050, and 0.05.
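The resampling procedure described above can be outlined in Python. This is a minimal sketch, not the authors' R code. It assumes a Mann-Kendall test based on the normal approximation with a continuity correction and no tie correction, a trend expressed in natural-log units per year, power taken as the fraction of simulated series with a significant test, and a redraw of any non-positive error factor (the text notes that a normal error is not strictly possible). The function and argument names are hypothetical.

```python
import math
import random

def mann_kendall_p(y):
    """Two-sided Mann-Kendall p-value (normal approximation, no tie correction)."""
    n = len(y)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = y[j] - y[i]
            s += (d > 0) - (d < 0)  # sign of each pairwise difference
    if s == 0:
        return 1.0
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - 1 if s > 0 else s + 1) / math.sqrt(var_s)  # continuity correction
    return math.erfc(abs(z) / math.sqrt(2.0))

def simulate_power(residuals, n_years, slope, cv, bias, alpha, reps=1000, seed=0):
    """Fraction of simulated series in which the added trend is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        series = []
        for t in range(n_years):
            resid = rng.choice(residuals)      # resample pooled model residuals
            delta = rng.gauss(1.0 + bias, cv)  # multiplicative measurement error
            while delta <= 0.0:                # assumption: redraw non-positive factors
                delta = rng.gauss(1.0 + bias, cv)
            series.append(slope * t + resid + math.log(delta))
        if mann_kendall_p(series) < alpha:
            hits += 1
    return hits / reps
```

For example, simulate_power(residuals, 30, -0.03, 0.1, 0.0, 0.0025) would estimate the power to detect a decline of about 3% per year over 30 years with 10% measurement error, given a list of model residuals supplied by the user.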
The first two significance levels correspond to the Bonferroni rule for testing groups of 20 and 10 stations, respectively. The statistical power is estimated as the proportion of significant tests in 5000 resamplings, and seems to be repeatable to nearly 3 decimals.

The next graph contrasts bias at levels of -10%, 0%, and +10%. It shows the power of the Mann-Kendall test to detect trends (slope) from 1% to 5% per year, with measurement errors (CV) varying from 10% to 50% of the load. The significance level for this analysis is fixed at 0.0025, reflecting the Bonferroni adjustment when conducting a family of 20 tests. At this level, bias does not strongly affect statistical power.

[Figure: SSC (alpha = 0.0025). Mann-Kendall statistical power versus sample size (20 to 60), with separate curves for bias of -10%, 0%, and 10%; one panel for each combination of CV (0.1, 0.3, 0.5) and slope (-0.01, -0.03, -0.05).]

Note that the methodology as described so far adds measurement error to residuals that already contain measurement error. The issue was addressed by shrinking the residual variance from the models, i.e. multiplying the residuals by a constant less than 1 (determined by trial and error) to make the final residual variance, with added measurement error, match the original residual variance. For example, if we shrink the original residuals to 88% of their original standard error, then add 10% measurement error, the result matches the original residual variance. The reason the shrinkage factor is not exactly 100 minus the measurement error is that we are working with logarithms of measurements. The appropriate shrinkage factors, determined by trial and error, are shown in the following table, and were used in all subsequent power analyses.
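As a rough cross-check on the trial-and-error shrinkage factors, the delta-method expression from A-6.1 gives a closed-form approximation: if the added measurement error contributes a variance of about sigma_r^2/(1 + B_r)^2 on the log scale, the factor k must satisfy (k*RSE)^2 + sigma_r^2/(1 + B_r)^2 = RSE^2. The Python sketch below is an illustration under that assumption, not the procedure the authors used; the function names are hypothetical.

```python
import math

def inflated_variance(sigma, sigma_r, bias_r=0.0):
    """Delta-method residual variance: sigma^2 + sigma_r^2 / (1 + B_r)^2."""
    return sigma ** 2 + (sigma_r / (1.0 + bias_r)) ** 2

def shrinkage_factor(rse, sigma_r, bias_r=0.0):
    """Factor k such that shrunken residuals plus measurement error match rse."""
    added = (sigma_r / (1.0 + bias_r)) ** 2
    if added >= rse ** 2:
        raise ValueError("measurement error variance exceeds residual variance")
    return math.sqrt(1.0 - added / rse ** 2)

# Example with a residual standard error of 0.516 and relative precision 0.139
k = shrinkage_factor(0.516, 0.139)
print(round(k, 2))  # 0.96
```

Because the delta method is only an approximation for larger measurement errors, factors found this way could differ slightly from values tuned by simulation.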
Parameter  Estimation method  RSE used in        Mean LTIMP   Relative precision     Shrinkage factor needed
                              Mann-Kendall test  sample size  from simulations for   to match RSE used in
                                                              given sample size      Mann-Kendall test
SS         Best regression    0.516              31.7         0.139                  0.96
TP         Best regression    0.287              31.5         0.073                  0.96
TKN        Period-weighted    0.321              26.9         0.104                  0.94
NO3        Period-weighted    0.386              34.7         0.081                  0.98
SRP        Period-weighted    0.219              34.1         0.079                  0.92

Preliminary analyses using a shrinkage factor of 92% for SRP, a 2% trend, and alpha = 0.0025 showed that it would take 40 years with 25% measurement errors (ME) or 30 years with 10% ME to have power of about 0.82 (i.e., a better than 80% chance of detecting the trend). If the trend were 3% per year, one would need 23 years with 10% ME or 30 years with 30% ME to have power of 0.79.

To put these trends in perspective, here are the trends (log to the base e units per year) that we found to be significant. For small slopes, each 0.01 in slope units corresponds to about 1% change per year.

SS     ED-3     IN-1     TC-2     TC-3     TH-1     UT-1
       -0.1167  -0.0614  -0.0464  -0.0585  -0.0624  -0.0276

TP     GC-1     IN-1     IN-2     TH-1     UT-1     WC-8
       0.0242   -0.0159  -0.041   -0.0321  -0.015   -0.0241

TKN    ED-3
       -0.0567

SRP    GC-1     TC-1     TH-1
       -0.0172  0.012    -0.0104

NO3    BC-1     GC-1     IN-1     IN-3     TH-1     UT-1
       -0.0239  -0.0334  -0.0186  -0.028   -0.0311  -0.0174

Percent change per year can be calculated precisely from slope as $100\,(e^{\mathrm{slope}} - 1)$:

slope     -0.01  -0.02  -0.03  -0.04  -0.05  -0.06  -0.07  -0.08  -0.09  -0.10  -0.11   -0.12
% change  -1.00  -1.98  -2.96  -3.92  -4.88  -5.82  -6.76  -7.69  -8.61  -9.52  -10.42  -11.31

A-6.3. Power analysis results

[Figures: Mann-Kendall statistical power versus sample size (20 to 80) for SS, TP, TKN, NO3, and SRP loads, in that order. For each constituent there are three figures, for alpha = 0.0025, 0.005, and 0.05. Each figure has nine panels, one for each combination of bias (-0.2, 0, 0.2) and slope (-0.01, -0.03, -0.05), with separate curves for CV = 10%, 30%, and 50%.]

APPENDIX A-7 Treatment of Values Less Than Method Detection Limit

Triangular distribution to impute left-censored values for water quality constituents

RB Thomas, June 10, 2002

Sets of some water quality constituents measured at Lake Tahoe contain subsets of missing values because the values fell below the minimum detection limit (MDL) of the measuring devices. Missing values should be accounted for to avoid biasing estimates of distribution parameters. They should not simply be omitted, since whether a value is missing depends on the constituent value itself. This report describes one method to "impute" missing values when the purpose is to estimate parameters of the constituent distributions and when there aren't "too many" missing values, and compares it to the MDL/2 imputation.
Neither method should be used when the variables are part of a multivariate complex where variate values for each case will be used to investigate relationships among them.

We would expect the constituent distributions to be right skewed, bounded on the left by zero, and unbounded on the right. Left censoring removes a portion of the left of the distribution from the sample. The common practices of substituting zero, MDL/2, or MDL itself for the missing values concentrate too much probability mass in the wrong places. Using a random method might be expected to work better, but it is necessary to use a realistic distribution to reasonably model the missing portion. For this reason, uniform or normal random variates would not work well. Since we expect the missing portion of the distribution to drop to zero at the left and to rise toward the right, a triangular distribution is more realistic. This model has lower probability of measurements occurring toward the left and higher probability toward the right (i.e., at the MDL) and is simple to compute.

We know the upper limit of the missing portion and how many values are missing. We can use this information to generate "missing" values based on the triangle distribution. Figure 1 illustrates the method. The triangular distribution is bounded by the blue, red, and horizontal black lines and has height 2/MDL (not on the same scale as the probability density axis) so that the area of the triangle is 1. This is a special case of the triangular distribution in which the mode is at the right boundary, making it a "right triangular distribution." At least for this concentration distribution, the red line approximates the left limb of the lognormal distribution reasonably well. This technique could only be expected to work when the MDL is less than the mode; otherwise the red line would cut across the "true" unknown distribution and poorly describe the missing portion.
The distribution (not density) function for this distribution, which gives the probability of the random variable, x, being less than the specified value, can be written:

$$F(x) = \frac{x^2}{\mathrm{MDL}^2}$$

Note that this function is 0 when x is zero and 1 when x = MDL, as required. To estimate a missing value, x, select a uniform random number, RN, between zero and one and substitute it for the value of the distribution function. Solving for x, we can estimate x as:

$$x = \mathrm{MDL}\,\sqrt{RN}$$

[Figure 1: Illustration of missing data imputation using triangular distribution. Probability density versus constituent concentration, with the triangle height 2/MDL and the points X and MDL marked on the axes.]

This can be done in Excel by selecting a uniform random number for each missing value, taking its square root, and using the product with the MDL as the triangular estimate. This is illustrated in the attached Excel file <LogNormal Example>. The LN(0,1) column contains 100 values generated from the lognormal distribution whose associated normal distribution is the standard normal. An IF statement was then used to simulate the Field Data column, where all values less than an MDL of 0.3 are coded as zero. This should work in this case since there are no legitimate measured zeros as long as MDL > 0. The final Restored Data column is constructed with another IF statement that creates a triangle-distributed value for each missing (i.e., 0) value and just copies the measured data.

The last two pages contain output from a simulation to compare the triangle and MDL/2 imputation methods. Two thousand samples of size 100 were selected from the lognormal distribution associated with the standard normal distribution. The MDL was set at 0.3. The mean and standard deviation were calculated for each complete sample, for the sample "restored" by the triangle method, and for the sample restored by the MDL/2 method. Then the percentage differences of the means and standard deviations of the two restoration methods from those of the corresponding full sample were calculated, and the values were plotted as a "dotplot" for comparison.
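The square-root transformation and the IF-style restoration described above can also be written in a few lines of Python. This is a minimal sketch of the same steps, assuming (as in the Excel example) that censored values are coded as zero; the function names are hypothetical.

```python
import math
import random

def impute_triangular(mdl, rng):
    """Draw one value from the right-triangular distribution on [0, mdl)."""
    # Inverse of F(x) = x**2 / mdl**2 applied to a uniform random number
    return mdl * math.sqrt(rng.random())

def restore(field_data, mdl, rng=None):
    """Replace zeros (censored values) with triangular draws; copy the rest."""
    if rng is None:
        rng = random.Random()
    return [impute_triangular(mdl, rng) if v == 0 else v for v in field_data]

# Example: two censored values with an MDL of 0.3
restored = restore([0, 0.52, 0, 1.87], 0.3, random.Random(42))
```

Passing a seeded random.Random instance makes the imputed values reproducible, which is convenient when the restored data set must be regenerated later.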
The population of means estimated by the triangle method centers on zero, which is what one would hope to see, while those estimated by the MDL/2 method are centered left of zero. The percent difference is not large, only about a third of a percentage point, but there is a slight negative bias. This is because the MDL/2 method puts too much probability mass too low. Using the mean of the triangular distribution, 2MDL/3, would probably work better. For the standard deviations, the points from the triangular distribution again cluster around zero, but this time those for the MDL/2 method are higher. This can be expected, since the low probability mass for this method puts too many points too far from the mean, thus increasing the variance and standard deviation. Again, this bias is not large, only about 0.25 percent.

[Character dotplots of trimn% (each dot represents 13 points) and mdl/2mn% (each dot represents 14 points): percent differences of the means for the triangle and MDL/2 methods, on a common axis from -1.20 to 0.80.]

Descriptive Statistics

Variable   N     Mean      Median    TrMean    StDev    SEMean
trimn%     2000  0.01828   0.01360   0.01806   0.20962  0.00469
mdl/2mn%   2000  -0.34662  -0.33157  -0.34083  0.17858  0.00399

Variable   Min       Max      Q1        Q3
trimn%     -0.82913  1.03259  -0.11196  0.14885
mdl/2mn%   -1.13847  0.17348  -0.45772  -0.21853
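The comparison of means summarized above can be outlined in Python as follows. This is a sketch of the described simulation, not the original code: it assumes values below the MDL are censored, that each percent difference is taken relative to the mean of the corresponding full sample, and that only the means (not the standard deviations) are compared. The function name and defaults are hypothetical.

```python
import math
import random
import statistics

def percent_diff_of_means(n=100, mdl=0.3, reps=2000, seed=1):
    """Average percent difference of restored-sample means from full-sample means.

    Returns (triangle method, MDL/2 method).
    """
    rng = random.Random(seed)
    tri, half = [], []
    for _ in range(reps):
        # Lognormal sample whose associated normal is the standard normal
        full = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
        full_mean = statistics.fmean(full)
        # Restore censored values (below the MDL) by each method
        tri_sample = [v if v >= mdl else mdl * math.sqrt(rng.random()) for v in full]
        half_sample = [v if v >= mdl else mdl / 2.0 for v in full]
        tri.append(100.0 * (statistics.fmean(tri_sample) - full_mean) / full_mean)
        half.append(100.0 * (statistics.fmean(half_sample) - full_mean) / full_mean)
    return statistics.fmean(tri), statistics.fmean(half)

tri_mean, half_mean = percent_diff_of_means()
```

With these settings the triangle-method value would be expected to fall near zero and the MDL/2 value to be negative, in line with the descriptive statistics above, although exact values depend on the random seed.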