An Investigation of Techniques to Predict and Quantify Stormwater Chemical Concentrations in a Karst Aquifer System

Abstract
The purpose of this report is to investigate different techniques used to predict
concentrations of chemicals through a storm event. Sampling of storm events can be
done electronically, using data loggers of limited capability that continuously record such
variables as pH and temperature, or by using automatic samplers that take water samples
at set intervals but must be loaded and unloaded frequently. If a parameter that is
continuously being recorded can be shown to be a proxy for a water constituent that must
be sampled and analyzed, then that constituent can be successfully predicted. This paper
examines attempts to predict NO3 based on specific conductivity and stage, which can be
measured continuously. Multiple regression and step multiple regression equations are
generated and then compared to the actual nitrate concentrations over a storm event. The
techniques are not sufficient for simulating the nitrate concentrations at Millstone Spring,
KY. Cluster analysis is examined as a technique to quantify chemical variability, and
appears to be an effective tool for describing chemical changes through a storm event.
Background
Karst aquifers are notably different from other types of aquifers. Karst systems
are characterized by highly soluble bedrock, generally either limestone or dolomite. As
a result, these aquifers have open-conduit flow through fractures and dissolution features,
such as sinkholes and caves, as well as diffuse flow. Because of this
open, fractured nature of karst, surface contamination can readily impact these aquifers.
Understanding and quantifying this surface contamination is important for a variety of
water-quality issues.
Because of the various flow paths available to groundwater and the extreme
heterogeneity, it is also incredibly difficult to model these aquifers. These aquifers
cannot be quantified in the same way that a sandstone aquifer can, for example.
Traditional modeling packages that simulate groundwater flow are effectively useless in
karst systems. Figure 1 is a schematic cross-section of a karst aquifer.
Figure 1: Karst aquifer cross section. From
http://www.forester.net/images/sw0111_49.gif
Springs offer a source of information in karst systems. These discharge points
represent an average view of the aquifer characteristics.
Storm water sampling of springs provides a “slice” of the aquifer behavior.
Sampling of storm events can be done electronically, using data loggers of limited
capability that continuously record such variables as pH and temperature, or by using
automatic samplers that take water samples at set intervals but must be loaded and
unloaded frequently.
The first portion of the storm water to be sampled is the base flow, or that water
that is not affected by recharge. This is the water that was already in the system before
the storm event. In a karst system, baseflow water chemistry is dominated by the
presence of dissolved calcium carbonate and, to a lesser extent, magnesium. Calcium
buffers pH and increases specific conductivity (Peterson, Davis and Brahana, 2000, p.
47).
The second part of the sampling captures the storm water itself. These samples
are characterized by an increase in water level, a dilution of chemical constituents that are
part of the bedrock or epikarst, and an increase in those chemical constituents that are
from the surface, being flushed into the system with the storm water.
The third part of the sample is the recession as the influence of the storm water
decreases in the system and the spring begins to revert back to base flow conditions.
Surface constituents will be decreasing in concentration, but it is possible that there will
be a lag time before bedrock constituents are at base flow levels.
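The three portions described above can be sketched as a simple labeling rule on the stage record. This is only an illustration under stated assumptions, none of which come from the report: samples are in time order, the first sample represents base flow, the hydrograph has a single peak, and `rise_tol` is a hypothetical tolerance for deciding when the stage has started to rise.

```python
def label_phases(stage, rise_tol=0.05):
    """Label each sample 'B' (base flow), 'S' (storm) or 'R' (recession).

    Assumptions (illustrative only): time-ordered samples, the first
    sample is base flow, and the event has a single stage peak.
    """
    base = stage[0]
    # Index of the (first) maximum stage, taken as the end of the storm rise.
    peak_idx = max(range(len(stage)), key=lambda i: stage[i])
    labels = []
    for i, h in enumerate(stage):
        if i < peak_idx and h <= base + rise_tol:
            labels.append("B")   # stage has not yet risen above base level
        elif i <= peak_idx:
            labels.append("S")   # rising limb up to and including the peak
        else:
            labels.append("R")   # everything after the peak
    return labels
```

For example, a hypothetical stage series `[1.0, 1.0, 1.5, 3.0, 2.0, 1.2]` is labeled B, B, S, S, R, R. A real event with several peaks would need a more careful rule.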
Methods of Prediction
There are many possible sources of contamination in a karst aquifer. Figure 2 is
an illustration of some of the various contaminants and paths in a karst system.
Figure 2: Contamination in a karst aquifer. From
http://www.dyetracing.com/karst/ka01013.html
Contamination from agricultural practices, specifically in the case of nitrates (NO3), is the
focus of this report.
Peterson, Davis and Brahana propose a multiple regression method for predicting
nitrate concentration based on stage and specific conductivity (2000, p. 43). The rationale
behind using stage as a proxy for nitrate concentration is that stage is a measure of
volume of discharge, and as discharge increases, so do nitrate concentrations (Peterson,
Davis and Brahana, 2000, p.47).
Specific conductance (SC) is a measure of ionic strength and ionic strength is
controlled primarily by calcium in a carbonate system (Peterson, Davis and Brahana,
2000, p.47).
SC can thus be thought of as an indicator of water with a longer residence time in
the system, as opposed to rainwater, which has a low SC. As SC decreases through a
storm event, nitrates should increase.
The first step for predicting nitrate concentration in a karst system would be a
visual inspection of a storm water hydrograph. This seems like an elementary exercise
but it is important to determine if SC and stage do fluctuate according to the model.
After visual examination of the data, the data should be sorted and nitrate versus
stage should be plotted. This plot will exhibit various points where the slope radically
changes, and at these points the line should be divided into steps (Peterson, Davis and
Brahana, 2000, p.49). Figure 3 is the sorted data from the spring measured by Peterson,
Davis and Brahana (2000, p.55).
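The sorting and step-delineation procedure can be sketched in code. Peterson, Davis and Brahana pick the breaks by eye; the rule below (a sign change in slope, or a slope that changes by more than a hypothetical factor `ratio`) is an assumption introduced here for illustration, not their criterion.

```python
def slope_breaks(stage, no3, ratio=3.0):
    """Return stage values where the sorted nitrate-vs-stage slope changes abruptly.

    The 'abrupt' test (sign flip, or magnitude change by more than `ratio`)
    is a hypothetical stand-in for visual inspection.
    """
    pairs = sorted(zip(stage, no3))          # sort the data by stage
    slopes = []
    for (s0, n0), (s1, n1) in zip(pairs, pairs[1:]):
        slopes.append((n1 - n0) / (s1 - s0))  # assumes distinct stage values
    breaks = []
    for i in range(1, len(slopes)):
        prev, cur = slopes[i - 1], slopes[i]
        sign_flip = prev * cur < 0
        jump = abs(cur) > ratio * abs(prev) or abs(prev) > ratio * abs(cur)
        if sign_flip or jump:
            breaks.append(pairs[i][0])        # the point shared by both segments
    return breaks
```

With noisy field data a rule like this will flag many more breaks than a person would, so the threshold would need tuning or the data smoothing first.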
After the stage data is separated into steps, a multiple regression is performed on
the data set that relates to each step. The resulting regression equations are included
below:
Step 1: NO3 = 2.88 - 0.106 Stage + 0.00696 SC
Step 2: NO3 = 6.54 + 0.208 Stage - 0.0167 SC
Step 3: NO3 = 6.88 - 0.0434 Stage - 0.00510 SC
In addition, a multiple regression equation was generated for the entire data set.
This equation is:
NO3-N = 5.49 - 0.0311 Stage - 0.00204 SC
[Figure 3 plot: Stafford Spring, AR. NO3-N (y-axis) versus sorted stage (x-axis), divided into step 1 (stage 9.00-16.70), step 2 (16.70-18.54) and step 3 (18.54-73.00).]
Figure 3: Sorted stage versus nitrate concentration.
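As an illustration of how the step equations are applied, the sketch below selects the equation by the stage ranges labeled in Figure 3 and evaluates it. The coefficients and ranges are those reported above; the handling of a stage value that falls exactly on a shared boundary, and the input values in the usage note, are assumptions made here.

```python
# (stage_low, stage_high, constant, stage_coef, sc_coef) for steps 1 to 3,
# taken from the step equations and the stage ranges in Figure 3.
STEP_MODELS = [
    (9.00, 16.70, 2.88, -0.106, 0.00696),
    (16.70, 18.54, 6.54, 0.208, -0.0167),
    (18.54, 73.00, 6.88, -0.0434, -0.00510),
]


def predict_no3(stage, sc):
    """Evaluate the step regression that covers the given stage."""
    for low, high, const, b_stage, b_sc in STEP_MODELS:
        # Assumption: a stage on a shared boundary is assigned to the lower step.
        if low <= stage <= high:
            return const + b_stage * stage + b_sc * sc
    raise ValueError("stage outside the range covered by the step models")
```

For a hypothetical reading of stage 10.0 and SC 300, the step 1 equation gives 2.88 - 1.06 + 2.088 = 3.908.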
The computer program MINITAB explains the regression equation with the
following relationship:
Response = constant + coefficient (predictor) + … + coefficient (predictor)
where the response (Y) is the value of the response variable. The constant is the value of
the response variable when the predictor variable(s) are zero. MINITAB identifies the
constant as the intercept because it determines where the regression line intercepts
(meets) the Y-axis. The predictors (X) are the values of the predictor variable(s). The
coefficients are the estimated change in average response for a unit change in the
corresponding predictor.
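The constant and coefficients in this relationship are obtained by least squares. The sketch below is not MINITAB's implementation; it is a minimal version that solves the normal equations directly, and it assumes the predictors are not collinear. The function name and the data in the usage note are hypothetical.

```python
def fit_multiple_regression(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    rows: one tuple of predictor values per observation.
    Returns [b0, b1, ..., bk]. Assumes the predictors are not collinear.
    """
    X = [[1.0] + list(r) for r in rows]      # leading 1.0 carries the constant
    k = len(X[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]
```

Called with rows of (stage, SC) pairs and the matching nitrate values, it returns the constant, the stage coefficient and the SC coefficient in that order.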
A portion of the summary table from this multiple regression is included below.
The P is the probability and should be lower than the α-level, chosen in this case to be
0.05. SC may be a borderline predictor, because its P-value is above the α-level, but only slightly.
Predictor   Coef        SE Coef     P
Constant    7.830       1.177       0.001
Stage       -0.044250   0.006665    0.001
SC          -0.008774   0.003773    0.059
The R-squared and adjusted R-squared values can be thought of as the proportion of the
variation in the response that is explained by the model. The adjusted R-squared is
adjusted for the number of terms in the model.
R-Sq = 89.7%
R-Sq(adj) = 86.3%
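The adjustment can be written out with the standard formula, adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The report does not state n for this regression; the value of 9 used in the usage note is an assumption, chosen only because it reproduces the reported figures.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared for a model with an intercept.

    r2 is a fraction (0.897, not 89.7).
    """
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)
```

With R-Sq = 0.897, two predictors (stage and SC) and an assumed n of 9, this gives approximately 0.863, matching the reported R-Sq(adj) of 86.3%.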
Figure 4 shows the simulated nitrate concentration plotted with the real
concentration. With the exception of step 1, the plots of the simulated and actual
concentrations follow each other very well. Even step 1 follows the shape of the actual
nitrate concentration, but its simulated magnitude appears to be too high.
[Figure 4 plot: Stafford Spring Simulated vs. Actual Nitrate Concentrations. Nitrate concentration (y-axis) versus time in minutes (x-axis); series: NO3, step 1, step 2, step 3.]
Figure 4: Simulated and actual nitrate concentrations over a storm event.
Peterson, Davis and Brahana also suggest creating a multiple regression equation
for the entire data set (2000, p.48). A partial summary of the results is included below.
The regression equation is
NO3-N = 5.68 - 0.0324 Stage - 0.00252 SC

Predictor   Coef        SE Coef     T       P
Constant    5.6818      0.9444      6.02    0.000
Stage       -0.032367   0.008542    -3.79   0.001
SC          -0.002517   0.002209    -1.14   0.264

R-Sq = 61.3%   R-Sq(adj) = 58.7%
The P-value for SC is well above the α-level of 0.05, indicating that it may not be a good
predictor. The R-squared values are also rather low, indicating that the variation of nitrates is not
fully explained by the SC and stage when analyzed as a whole data set.
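The T column in the summary is the coefficient divided by its standard error, and each predictor is judged by comparing its P-value with the α-level. The sketch below reproduces that check using the values from the whole-data-set summary above; the function and variable names are illustrative.

```python
ALPHA = 0.05  # alpha-level used in this report

# (Coef, SE Coef, P) copied from the whole-data-set summary table.
SUMMARY = {
    "Constant": (5.6818, 0.9444, 0.000),
    "Stage": (-0.032367, 0.008542, 0.001),
    "SC": (-0.002517, 0.002209, 0.264),
}


def summarize(table, alpha=ALPHA):
    """Return {predictor: (T rounded to 2 places, significant at alpha?)}."""
    out = {}
    for name, (coef, se, p) in table.items():
        out[name] = (round(coef / se, 2), p < alpha)
    return out
```

Running `summarize(SUMMARY)` recovers the tabulated T values (6.02, -3.79 and -1.14) and flags SC as the only term that is not significant at the 0.05 level.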
Figure 5 is the plot of the multiple regression equation as well as the actual nitrate
concentration. Even though the summary statistics indicate that multiple regressions may
not be an effective way to simulate nitrate concentration, the plot still looks visually
rather convincing.
[Figure 5 plot: Stafford Spring NO3 Concentrations. NO3 in mg/L (y-axis) versus time (x-axis); series: simulated NO3 using multiple regression (MR) and measured NO3.]
Figure 5: Multiple regression plots of nitrates and actual nitrate concentrations
Application of Methods
After investigating the methods used by Peterson, Davis and Brahana, the
techniques were applied to a spring in Kentucky, Millstone Spring. The spring is located
in a karst area. The data was collected over a storm event in 1999.
The first step is a visual inspection of the data. Figures 6 and 7 are provided
below.
Figure 6: Nitrate and Stage
Figure 7: Nitrate and SC
It is hard to draw firm conclusions from these plots. SC appears to
decrease as nitrates increase, and stage may increase with nitrates, but neither
figure shows a clear-cut linear relationship.
The plot of sorted stage versus nitrate, captioned Figure 7 below, was used to
delineate steps at abrupt slope changes. These rather arbitrary choices are labeled
on the plot. Although the first point appears to be an outlier, it is included in the step
calculation.
[Figure 7 plot: Millstone Spring, KY. NO3 (y-axis) versus sorted stage (x-axis), divided into Step 1 (stage 95.3-101.3), Step 2 (101.3-105.8) and Step 3 (105.8-111.9); one point is marked "outlier?".]
Figure 7: Sorted stage and nitrates.
The partial summary of the multiple regression (without breaking into steps) is listed
below:
The regression equation is
NO3_1 = 25563 + 3.17 SC - 178 stage

Predictor   T       P
Constant    3.59    0.003
SC          0.60    0.557
stage       -2.86   0.012

S = 1040.26   R-Sq = 41.1%   R-Sq(adj) = 32.7%
The SC p-value is much too high. The R-squared values are low, and the equation does
not explain very much of the variance. This does not appear to be an effective model for nitrates.
The step regressions should be better models for nitrate behavior. The partial
summary for step 1 is listed below:
The regression equation is
NO3_1_1 = 19013 + 14.6 SPC_1 - 158 stage_1_1

Predictor   T       P
Constant    0.22    0.843
SPC_1       0.26    0.811
stage_1_1   -0.24   0.826

S = 1026.57   R-Sq = 43.0%   R-Sq(adj) = 4.9%
The p-values for step 1 are too high for an α-level of 0.05. The r-squared value
shows that the equation does not fully explain the variation and, once adjusted for the
number of variables, it explains even less.
The partial summary for step 2 is listed below. The p-values are still too high, but
not as high as step 1. The R values account for more of the variation than in the previous
step.
The regression equation is
NO3_1_2 = 25270 - 3.41 SPC_2 - 154 stage_1_2

Predictor   T       P
Constant    4.38    0.022
SPC_2       -1.86   0.160
stage_1_2   -2.75   0.071

S = 202.739   R-Sq = 79.1%   R-Sq(adj) = 65.1%
The partial summary for step 3 is listed below. The p-values are much too high,
but the r-squared values do indicate that a majority of the variance is explained by this
equation.
The regression equation is
NO3_1_2_1 = - 481 + 8.41 SPC_2_1 - 41.0 stage_1_2_1

Predictor      Coef     SE Coef   T       P
Constant                          -0.31   0.771
SPC_2_1                           1.22    0.291
stage_1_2_1    -40.95   25.33     -1.62   0.181

S = 3.95823   R-Sq = 80.5%   R-Sq(adj) = 70.8%
Figure 8 is the plot of the single multiple regression with the actual measured
nitrates. The data sets seem to follow a similar trend, but there is quite a bit of variability
in the data sets. The simulated nitrate line does not reflect the same variability as the
actual values, a point made by the calculated r-squared values.
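The link between the plot and the r-squared value can be made explicit: r-squared compares the squared misfit of the simulated values with the total variation of the measured values. The sketch below uses that standard definition; the function name is illustrative and no Millstone Spring data are included.

```python
def r_squared(measured, simulated):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - s) ** 2 for m, s in zip(measured, simulated))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

A simulated line that only tracks the mean of the measured data scores 0, and a perfect match scores 1, so a smooth simulated curve through highly variable measurements yields a low value.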
[Figure 8 plot: Simulated and Measured NO3, Millstone Spring. NO3 (y-axis) versus time as Julian date (x-axis); series: measured NO3 and simulated NO3.]
Figure 8: Single Multiple Regression
Figure 9 is the plot of the multi-step regressions. This plot is of little use.
The negative SC values are meaningless, and even when plotted as absolute values they
do not correlate well. Step 1 follows the measured data more closely than any other step. Based
solely on the p-values and r-squared values, step 2 would have been expected to predict
the nitrate concentration most closely.
[Figure 9 plot: Simulated Nitrates Using Step Regression. Axis labels NO3 and SC, with values ranging from -20000 to 30000, versus time as Julian date (x-axis); series: step 1, step 2, step 3.]
Figure 9: Simulated Nitrate Concentration, Millstone Spring.
Analysis of Regression Techniques
Based on the p-values, r-squared values and plots of the results, step multiple regression
does not appear to be an effective model for nitrate concentrations at Millstone Spring.
There are several possible reasons for this.
It is possible that a larger data set would produce more reliable regression
equations. Peterson, Davis and Brahana do not address data set size, but a small data set
could be a factor, because it gives each individual point more weight than is appropriate.
The selection of steps, and determining what counts as an obvious slope change, is a
weak point in this model. This is not a quantifiable evaluation, and is instead subject to
individual interpretation. Figure 7 had at least one outlier, and determining obvious
changes in slope was difficult. The data points did not break into slope changes as
identifiable as those in Figure 3. Even in Figure 3, slopes other than those that Peterson,
Davis and Brahana selected could have been isolated.
There are also geological parameters at work that can impact the effectiveness of
this model. Peterson, Davis and Brahana conclude that the step regression methods work
best for springs fed primarily by diffuse flow (2000, p. 61). It is quite possible that
Millstone Spring is supplied by conduit flow instead, and in fact the lack of effective
modeling may indicate that.
The method of step multiple regressions seems like an effective way to quantify
nitrate concentration in areas of heavy agricultural use. The springs in Arkansas analyzed
by Peterson, Davis and Brahana are heavily impacted by poultry farming. Millstone
Spring, on the other hand, is in an area of less heavy use. It may be that the aquifer must
be profoundly impacted to be predictable using this model.
This method may be effective only for a very select kind of karst aquifer. However,
predicting nitrate contamination in a heavily impacted area is extremely important.
Developing other techniques that are not so dependent on the type of flow would be
valuable. Investigating other easily measured parameters, such as pH and
temperature, would also be an effective next step.
Other Prediction Techniques
Another approach to understanding how chemical concentrations vary during a storm
event is cluster analysis. If suites of chemicals can be shown to vary with, or
inversely to, each other, then they can help to explain chemical fluctuations. These
techniques are not able to quantitatively predict chemical concentrations through a storm
event, but can identify basic relationships.
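Whether two constituents vary together or inversely can be measured with a correlation coefficient, which is one common similarity measure for clustering variables. The sketch below computes Pearson's r and finds the most similar pair, which corresponds to the first merge of an agglomerative clustering. It is an illustration only and does not reproduce the MINITAB settings used in this report; the function names and the data in the usage note are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: +1 varies together, -1 varies inversely."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def most_similar_pair(series):
    """series: {name: concentrations over the same samples}.

    Returns (name_a, name_b, r) for the pair with the highest correlation.
    """
    names = sorted(series)
    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(series[a], series[b])
            if best is None or r > best[2]:
                best = (a, b, r)
    return best
```

Given made-up series in which Ca and Mg rise together while NO3 falls, `most_similar_pair` returns the Ca and Mg pair, and `pearson_r` between Ca and NO3 is close to -1.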
Figure 10 is a cluster analysis of chemical concentrations through the 1999 storm
event at Millstone Spring, KY. The constituents being compared are arsenic, barium,
cadmium, magnesium, strontium, calcium, chromium, iron, silicon, manganese, lead,
copper, potassium and sodium. The data is divided into baseflow (B), storm (S) and
recession (R). The recession and baseflow are clustered close to each other, while the
storm flows are more dispersed. It is intuitive that the storm waters would have the most
chemical variability, because the storm water has the greatest stage fluctuations. Storm
water can be thought of as the flux component, while baseflow is more of an average
flow. Recession is a median value between the two.
[Figure 10 plot: Scatterplot of Chemical Concentrations, Millstone Spring. C20 (y-axis) versus C19 (x-axis); points labeled B, R and S, with annotated Baseflow and Recession groups.]
Figure 10: Scatterplot of chemical concentrations. B-baseflow, R-recession, S-storm.
Figure 11 is the dendrogram of recession and baseflow chemical concentrations.
These two portions cluster similarly, so it is feasible that they could be analyzed using the
same dendrogram. Chromium, iron, manganese, potassium, silica, lead and copper have
the highest similarity to each other. It is this suite of elements that would be found in
spring water at higher concentrations during baseflow and recession periods. This suite
of elements is probably associated with bedrock, because this water is not as impacted by
surface input of water.
[Figure 11 plot: Cluster Analysis of Baseflow and Recession Water, Millstone Spring. Dendrogram of the variables As, Ba (D), Ca, Cd, Cr, Cu, Fe, K, Mg, Mn, Na, Pb, Si and Sr (D) against similarity (axis from 65.05 to 100.00).]
Figure 11: Dendrogram of recession and baseflow, Millstone Spring.
Figure 12 is the dendrogram of storm water during the Millstone Spring 1999
storm event. Chromium, iron, silica, manganese, lead, copper and potassium form the suite of
elements with the greatest similarity. This is the same suite of elements with the highest
level of similarity in Figure 11.
The main difference between Figures 11 and 12 is that Figure 12, the storm water
dendrogram, has less variability. The chemical constituents are more clustered, which
may indicate that they are always present in the system, but increase during storm events.
Notice that calcium and magnesium, primary constituents of karst waters, are less clustered at
baseflow/recession. That is probably due to the high concentrations of these ions relative
to the other constituents.
[Figure 12 plot: Cluster Analysis of Storm Water Constituents, Millstone Spring. Dendrogram of the variables As, Ba (D), Ca, Cd, Cr, Cu, Fe, K, Mg, Mn, Na, Pb, Si and Sr (D) against similarity (axis from 64.60 to 100.00).]
Figure 12: Dendrogram of storm water, Millstone Spring.
Analysis of Cluster Analysis
Cluster analysis is an effective technique for distinguishing different types of spring
water. Storm and baseflow waters can be effectively separated based on their chemical
constituents, while the recession waters cluster somewhere between these two
extremes.
Dendrograms can identify chemical similarities and suites of elements that occur
in different waters. The effect of fresh water, in the form of storm runoff, is apparent in
the difference in the dendrograms.
These techniques are effective for characterizing the different components of a storm
event. They cannot predict concentrations, but they indicate how samples are likely
to group. Understanding how the baseflow, storm and recession
portions of a storm event will plot can be very important.
If samples vary from this predetermined cluster, it could indicate that there was a
sampling or analysis problem. It could also indicate that there has been some sort of
change in the aquifer, such as change in land use.
Deviations from this cluster analysis could also indicate subsurface changes, such
as a change to diffuse rather than conduit flow. Cluster analysis is an important way to
quantify chemical variability in a karst aquifer.
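One way to make "varies from the predetermined cluster" concrete is to compare a new sample's distance from the cluster centroid with the spread of the samples already in that cluster. The sketch below is a hypothetical screening rule, not a method from this report; the `factor` threshold and the function names are assumptions.

```python
from math import dist


def centroid(points):
    """Mean position of a set of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def outside_cluster(sample, cluster_points, factor=2.0):
    """True if `sample` lies farther from the centroid than `factor` times
    the distance of the most distant existing cluster member.

    `factor` is an arbitrary illustrative threshold.
    """
    c = centroid(cluster_points)
    radius = max(dist(p, c) for p in cluster_points)
    return dist(sample, c) > factor * radius
```

A flagged sample would then be checked for sampling or analysis problems before being interpreted as a real change in the aquifer.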
Works Cited
Croft, A., 2003, Introduction to Karst Environmental Problems,
http://www.dyetracing.com/karst/ka01013.html
Forester Communications, 2003, Karst Cross-Section,
http://www.forester.net/images/sw0111_49.gif
Peterson, E.W., Davis, R.K. and Brahana, J.V., 2000, The use of Regression
Analysis to Predict Nitrate-Nitrogen Concentrations in Springs of Northwest
Arkansas, in Sasowsky, I.D. and Wicks, C.M. (eds), Groundwater Flow and
Contaminant Transport in Carbonate Aquifers, A.A. Balkema, Rotterdam, p.43-63.
In addition to the sources listed above, the statistical programs MINITAB, PSI-Plot and
Excel were used in the analysis of the data.
Dr. Dorothy Vespers provided the chemical data for Millstone Spring.
An Investigation of Techniques to Predict and Quantify Stormwater Chemical
Concentrations in a Karst Aquifer System
Rachel Grand
11 December 2003