job description - University of Massachusetts Boston

Office of Research and Standards
Summer Intern Project Description
Characterization of Manganese Concentration Temporal Variability in
Groundwater and Surface Water
Project Description
The objective of this project will be to statistically characterize the temporal variability of
manganese concentrations in ground and surface waters that may serve as source waters for
public drinking water supplies. Concentrations of manganese do vary over time as characteristics
of surface waters or groundwater flow change seasonally or annually. We seek to assign
statistically based limits to how much those concentrations may vary using existing data from
several databases. The project work will be computer based and very quantitative in nature. It
will involve manipulation of sizeable data files, use of preprogrammed functions to manipulate
the data and use of powerful statistical software.
Project Components:
Data Acquisition: Data will be sought from several sources to complement some data that we
already have in our electronic files. Data may also be downloaded from a US Geological Survey
national web site.
Data Manipulations: Data files may need some cleaning up prior to further analysis. Appendix
A to this Project Description describes some of the data cleaning steps that we have had to
perform with other evaluations that we have performed. The data files will be structured to
identify cases where results have been reported as less than the detection limit of the method
used to quantify the manganese concentrations. We’ll calculate the distributional characteristics
of the overall data set so as to provide a picture of groundwater and surface water manganese
concentration distributions in New England. We’ll apply more advanced methods for handling
non-detects in the analysis of the data sets, such as survival analysis or maximum likelihood
estimation (the intern will be instructed in the basis for these methods and how to use them).
The primary objective will be to identify, within each data file, cases where the same location
has been sampled multiple times over a period of time. For each such
case, sample statistics will be calculated over the time series (mean, standard deviation,
coefficient of variation, minimum and maximum). The set of statistics over all the cases will then
be used to calculate an overall coefficient of variation for manganese variability in the type of
water source (groundwater or surface water).
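The per-location calculation described above can be sketched as follows. This is a minimal illustration, not the project's actual procedure: the record layout, the location names, the concentrations, and the `min_samples` threshold are all hypothetical, and non-detects are assumed to have been handled before this step.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_location(records, min_samples=3):
    """Group (location_id, concentration) pairs and summarize each time series.

    min_samples is a hypothetical cutoff for treating a location as
    "sampled multiple times"; the project description does not set one.
    """
    series = defaultdict(list)
    for location_id, concentration in records:
        series[location_id].append(concentration)

    summaries = {}
    for location_id, values in series.items():
        if len(values) < min_samples:
            continue  # too few repeat samples to describe variability
        m = mean(values)
        s = stdev(values)  # sample standard deviation (n - 1 denominator)
        summaries[location_id] = {
            "n": len(values),
            "mean": m,
            "sd": s,
            "cv_percent": 100.0 * s / m if m > 0 else float("nan"),
            "min": min(values),
            "max": max(values),
        }
    return summaries


def overall_cv(summaries):
    """Average the per-location CVs to get one figure for the water type."""
    return mean(s["cv_percent"] for s in summaries.values())


# Hypothetical records: (location id, manganese concentration)
records = [
    ("well-A", 40.0), ("well-A", 50.0), ("well-A", 60.0),
    ("well-B", 100.0), ("well-B", 100.0), ("well-B", 130.0),
    ("well-C", 15.0),  # sampled once, so excluded from the summaries
]
summaries = summarize_by_location(records)
print(summaries["well-A"])
print(round(overall_cv(summaries), 1))
```

Averaging the per-location CVs mirrors the approach taken in the radon report in Appendix B; other ways of pooling variability are possible.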
Output: The anticipated output from the project will be a brief report similar to the appended
report (Appendix B) on radon variability in groundwater which we prepared a few years ago.
Desired Skills:
• Must have working knowledge of use of spreadsheets including use of built-in functions
for manipulating data.
• Very desirable to have some comfort working with larger data files (hundreds to
thousands of records).
• Would be helpful, but not necessary to have working knowledge of Visual Basic
programming language.
• Exposure through course work to elementary statistics.
Skills and Knowledge to Be Developed:
• The intern will be instructed in the use of a more powerful standalone software package
for data processing (Statistica).
• Instruction will also be provided on the basic statistical principles which will govern the
processing of the data.
• The topic of “non-detects” in analytical data reporting will be introduced to the intern and
alternate ways for dealing with NDs in the analysis of environmental data sets will be
reviewed.
• The output of this project will be used by our department’s drinking water program in
guidance that they issue to public water supplies for resampling frequency after initial
manganese concentration determinations. We will instruct the intern on how the drinking
water program operates with respect to water quality standards and how they are enforced
or applied.
Appendix A – Example of Data File Cleaning Prior to Analyses for Another Related
Project
Data Content and Processing Notes:
The manganese concentration data in this file were imported from an Excel file provided
by USGS. The two variables of interest were “Manganese water filtered” and “Manganese water
unfiltered recoverable”. The second variable was closest to the type of sample that would be
analyzed by Public Water Suppliers when determining the manganese content of their source
water. It would reflect both the dissolved and mineral content of the water samples. There were
many more data records for the first variable.
I had to do some cleaning up of the imported data file. The Excel file had “- -” in cells where
there was no data, and I had to remove all of them. The data file contained alphanumeric
representations of results that were NDs, expressed as “< ##”. In addition, letter codes were also
present (“E #” for estimated and “M” for present but not quantified). In order to perform any
statistics on the data, I had to recode the data to make it all numeric. There were also a number of
data entries which were “0” rather than an ND indicator. In the variable for unfiltered recoverable
Mn, I substituted < 10 for the zeros, as this was the detection limit for that variable across the data
set. For the Mn filtered variable, there was only one 0 value, which I recoded as < 10 since it
came from 1958, when I imagine detection limits were not very good.
Rather than recoding NDs as one half the detection limit, as is often done when analyzing data, I
plan to use survival analysis to determine the distributional characteristics of the data, following
guidance provided by Helsel (2005). NDs were recoded to permit their use in the survival
analysis module of Statistica. The steps that I went through were:
1. Created two new variables for each of the variables noted above: i) a censored
column where a code of 1 or 0 was used to indicate if the original data entry was a
censored (< ##) or non-censored value; ii) a column in which the original data values were
recoded to show the < ## values as numeric values with the same numeric value as the
number in the < expression (e.g., < 10 would be shown as 10); numeric values
were left as is.
2. “E” and “M” entries were recoded in the original variable to < ## values. The < ##
value assigned was the same as any other below detection limit entries reported
around the same date. If it wasn’t possible to do this, a numeric value was entered
(e.g., E .8 was given 1; E .1 was given 1; E 1.1 was given 3.2). There was an
element of judgment involved in doing these assignments, as often around the
same time there were a couple of detection limits reported.
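The recoding steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name `recode`, the tuple layout, and the handling of the “E” and “M” codes are hypothetical. Because the text says those two codes needed case-by-case judgment, the sketch only flags them for manual review rather than assigning a value automatically.

```python
MISSING = {"", "- -", "--"}  # placeholders for cells with no data


def recode(raw, zero_detection_limit=10.0):
    """Recode one raw result string.

    Returns None for missing cells, otherwise a tuple
    (value, censored, needs_review), where censored is 1 for a
    non-detect, 0 for a quantified result, and None when undecided.
    """
    text = raw.strip()
    if text in MISSING:
        return None
    if text.startswith("<"):
        # "< 10" becomes value 10 with the censored flag set
        return (float(text[1:]), 1, False)
    if text.startswith("E"):
        # estimated value: keep the number, leave the decision to a person
        return (float(text[1:]), None, True)
    if text == "M":
        # present but not quantified: no number to keep
        return (None, None, True)
    value = float(text)
    if value == 0:
        # a reported zero is treated as a non-detect at the detection limit
        return (zero_detection_limit, 1, False)
    return (value, 0, False)


for cell in ["< 10", "25", "0", "E .8", "M", "- -"]:
    print(repr(cell), "->", recode(cell))
```

The default `zero_detection_limit` of 10 follows the substitution described in the notes; a different data set would need its own value.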
Reference
Helsel, D. R. (2005). Nondetects and Data Analysis: Statistics for Censored Environmental Data.
Hoboken, NJ, Wiley-Interscience: 250 pp.
Appendix B - Example Output of Project
Recommendations for Radon Monitoring and Decision Making at
Transient Non-Community Public Water Supplies
Prepared by Office of Research and Standards
For
The Drinking Water Program
Massachusetts Department of Environmental Protection
Boston, MA
November 2008
BACKGROUND
Public Community Water Systems in Massachusetts have well-prescribed protocols for water
quality testing and frequency of testing (310 CMR 22.00). These types of operations serve at
least 15 service connections used by year-round residents or regularly serve at least 25 year-round
residents. Systems serving these numbers of connections or customers less than year-round
are non-community water systems. There are two subgroups of non-community systems
classified on the basis of the frequency with which people use the water from those supplies:
non-transient non-community (NTNC) water systems and transient non-community (TNC) water
systems. Public water systems classified as Transient Non-Community (TNC) and Non-Transient
Non-Community (NTNC) have limited testing requirements compared to full-fledged
community water systems. MassDEP provides private well testing guidance (MassDEP 2008)
which notes that baseline assessment of chemical quality for some major groups of chemicals be
determined every 10 years. The private well guidance for radionuclides indicates that after initial
monitoring, future sampling frequency should be based upon the results. In order to provide
additional guidance on resampling frequency after initial testing for radon in NTNC, TNC and
private wells, this document presents a recommendation for resampling frequency and an
improved decision process for determining whether or not reported radon concentrations are
greater than the health-based exposure guideline for radon in water [1] used by MassDEP. The
decision process is based upon an assessment of temporal variability in groundwater radon
concentrations from published data for New England.
NATURAL VARIABILITY IN GROUNDWATER RADON CONCENTRATIONS IN
NEW ENGLAND
Larson and Rydell (1992) summarized radon occurrence data and temporal variability
information for wells in New England. They present graphs of well radon concentrations versus
time for 16 influents for domestic water supplies, public water supplies and a USGS well in New
Hampshire, Maine and Connecticut. Data points for these graphs were digitized after visual
examination and the readings entered into a data file. Means and standard deviations for each
time series were calculated and the coefficient of variation (CV) of each data set calculated as the
standard deviation divided by the mean times 100. The overall mean of the CVs for each system
was also calculated.
Dupuy et al. (1992) present a summary of data on radionuclide concentrations in Connecticut
groundwater wells and time series data for a single well for radon and radium 226.
The MassDEP Water Quality Testing System (WQTS) database was queried in July 2008 for all
records for radionuclide results for water systems classed as transient non-community supplies
that had data for multiple sampling dates. The selected records were visually scanned to identify
systems having results for several years. None of the radon data were used in this analysis
because almost all of the readings were of finished water, rather than raw source groundwater. It
was therefore unclear whether the systems had any sort of treatment in place specific to
radon and whether the finished water reflected this treatment or more closely resembled source water.
Radon concentrations in groundwater in New England vary widely as illustrated by the data from
any one of these wells (Figure 1). Concentrations range over three orders of magnitude. Within
any one well, there is substantial radon concentration variation about the mean radon
concentrations with time, with the standard deviations averaging 28% of the mean concentrations
(Table 1) and values remaining generally within an order of magnitude of each other over time.
IDENTIFICATION OF RADON SCREENING LEVELS FOR TREATMENT OR
RESAMPLING
A health based drinking water guideline for radon is employed to indicate when radon
concentrations have the potential to cause unacceptable adverse health effects either through
direct ingestion of water or inhalation of radon that volatilizes from water into indoor air of
dwellings where radon-containing water is used. Under the Department’s present guidance,
indoor air testing for radon is recommended when radon
concentrations are greater than the Office of Research and Standards’ (ORS) guideline of 10,000
pCi/L [2]. If indoor air concentrations of radon in a residence are above the current US EPA guidance
value of 4 pCi/L, then radon mitigation is recommended. Mitigation is usually most
cost-effectively accomplished through basement venting systems, but may also be accomplished
through treating the water coming into the home with air stripping.
[1] http://www.mass.gov/dep/water/drinking/standards/dwstand.htm#rads
Given the wide variation in water radon concentrations that can occur over time in a particular
well, a single determination of radon concentration as a basis for deciding whether or not radon
concentrations are truly greater than a health-based guideline may not give an accurate picture of
average concentrations of radon in the water and hence average exposures over time that may
occur with use of the water. For example, because of the temporal variation in radon
concentrations, it may not be appropriate to conclude from a single sample with a radon
concentration less than 10,000 pCi/L that the radon in the water will not likely translate into an
indoor air problem. The average concentration over time could be greater than the 10,000 pCi/L.
Figure 1. Time Series of Groundwater Well Radon Concentrations, pCi/L
(Source: Larson et al. 1992; Dupuy et al. 1992)
[Figure not reproduced. Vertical axis: radon concentration, pCi/L, logarithmic scale from 1,000 to
1,000,000. Horizontal axis: Sampling Day, from -50 to 450. Series: Mt Vernon NH PWS; Lee NH
priv well; Derry NH PWS; ME private well; ME private well #3; ME private well #4; USGS well;
CT GW well.]
[2] ORS plans to review the basis for this value in the near future in light of more recent assessments of radon’s
carcinogenic potency, which might suggest that a lower value would be supportable.
Table 1. Summary statistics for radon time series in groundwater, pCi/L.

study                   mean     n     s        CV %*    source
Mt Vernon NH PWS        261333   15    42865    16       Larson et al. (1992)
Lee NH priv well        2359     16    195      8        Larson et al. (1992)
Derry NH PWS            32821    19    4200     13       Larson et al. (1992)
ME private well #3      35458    12    4031     11       Larson et al. (1992)
ME private well #4      40114    14    23028    57       Larson et al. (1992)
USGS well               85625    16    58559    68       Larson et al. (1992)
CT GW well              328889   9     225413   69       Dupuy et al. (1992)
ME domestic well #1     -        -     -        17       Larson et al. (1992)
ME domestic well #2     -        -     -        15       Larson et al. (1992)
ME domestic well #3     -        -     -        13       Larson et al. (1992)
ME domestic well #4     -        -     -        55       Larson et al. (1992)
ME domestic well #5     -        -     -        20       Larson et al. (1992)
ME domestic well #6     -        -     -        23       Larson et al. (1992)
ME domestic well #7     -        -     -        14       Larson et al. (1992)
ME domestic well #8     -        -     -        30       Larson et al. (1992)
ME domestic well #9     -        -     -        28       Larson et al. (1992)
ME domestic well #10    -        -     -        18       Larson et al. (1992)
Overall                 -        101   -        28

* CV % = s/mean x 100. All CVs for ME domestic wells read directly from graph of CV versus well
number in source document. No stats available.
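As a check on Table 1, the CV column can be recomputed from the reported means and standard deviations for the seven wells that have full statistics, then averaged with the ten graph-read CVs to reproduce the overall value. The sketch below uses only numbers from the table; the variable names are illustrative.

```python
from statistics import mean

# (study, mean, s) for the wells in Table 1 that report full statistics
wells = [
    ("Mt Vernon NH PWS", 261333, 42865),
    ("Lee NH priv well", 2359, 195),
    ("Derry NH PWS", 32821, 4200),
    ("ME private well #3", 35458, 4031),
    ("ME private well #4", 40114, 23028),
    ("USGS well", 85625, 58559),
    ("CT GW well", 328889, 225413),
]

# CVs for ME domestic wells #1 to #10, read from a graph in the source document
me_domestic_cvs = [17, 15, 13, 55, 20, 23, 14, 30, 28, 18]

# CV % = s / mean x 100, rounded to the nearest whole percent
computed_cvs = [round(100 * s / m) for _, m, s in wells]
overall_cv = mean(computed_cvs + me_domestic_cvs)

print(computed_cvs)
print(round(overall_cv))
```

The recomputed values agree with the table, and the mean of the 17 CVs rounds to the 28% used in the analysis that follows.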
If one uses the information on groundwater radon variability summarized earlier to define a
statistical confidence zone about 10,000 pCi/L, then one can more confidently ascertain whether
any single sample is most likely to come from a population with a mean over time of 10,000
pCi/L. In the case of radon, the average CV of 28% means that standard deviations average 28%
of the mean for radon in groundwater wells. The 95% confidence interval around the mean
would therefore be ±1.96 times the standard deviation. Since the standard deviation is 0.28 times
the mean, the 95% confidence interval about a mean with this degree of variability would be
± 1.96 x 0.28 x mean, or approximately ± 0.55 x mean. For a mean of 10,000, the lower and upper
95% confidence limits would therefore be 10,000 ± 0.55 x 10,000, or 4,500 and 15,500 pCi/L.
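The arithmetic behind these limits can be written out as a short calculation. The function name `confidence_zone` is illustrative; the inputs (a guideline of 10,000 pCi/L, a CV of 28%, and the 1.96 multiplier) come from the text.

```python
Z_95 = 1.96  # two-sided 95% multiplier for a normal distribution


def confidence_zone(guideline, cv_fraction, z=Z_95):
    """Return (lower, upper) limits of guideline +/- z * CV * guideline."""
    half_width = z * cv_fraction * guideline
    return guideline - half_width, guideline + half_width


lower, upper = confidence_zone(10_000, 0.28)
print(round(lower), round(upper))  # 4512 15488
```

Without rounding the multiplier, 1.96 x 0.28 = 0.5488, which gives limits of about 4,512 and 15,488 pCi/L. The report rounds the multiplier to 0.55, which yields 4,500 and 15,500.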
For ease of implementation, round these limits to 5,000 and 16,000. Therefore any single radon
concentration less than 5,000 pCi/L likely comes from a population with a mean less than
10,000 pCi/L and no additional near-term sampling should be recommended. If a single
concentration is between 5,000 and 16,000 pCi/L, then additional sampling is warranted to better
establish the longer-term average concentration, since one can’t confidently conclude that the
long-term mean is less than 10,000 pCi/L given the expected temporal variance in the radon
concentrations. If a single measurement is greater than 16,000 pCi/L, then the long-term mean is
likely greater than 10,000 pCi/L and indoor air testing should be pursued. These decision
criteria are summarized in Table 2.
Table 2. Radon Well Sampling Data Interpretation Guide

Single radon            Implication of Result       Follow-up                   Resampling Frequency
concentration, pCi/L

< 5,000                 Long-term average radon     None immediately.           Recommend resampling
                        concentration most likely                               every 5 years
                        no greater than 10,000
                        pCi/L guideline

≥ 5,000 – 16,000        Not possible to conclude    Resample                    Sample quarterly to get
                        whether the long-term                                   annual average to compare
                        average is under or over                                with 10,000 pCi/L guideline
                        the guideline of 10,000
                        pCi/L

> 16,000                Long-term average radon     Sample indoor air or        Confirm treatment is
                        concentration most likely   implement radon             effective. No need to
                        greater than 10,000 pCi/L   mitigation of water         resample water when
                        guideline                                               treatment is in place.
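The decision rule in Table 2 can be expressed as a small function. This is a sketch only: the category labels and the function name are hypothetical shorthand, and the cutoffs are the rounded limits of 5,000 and 16,000 pCi/L given in the text.

```python
FOLLOW_UP = {
    "likely below guideline": "None immediately; resample every 5 years.",
    "inconclusive": "Resample quarterly and compare the annual average "
                    "with the 10,000 pCi/L guideline.",
    "likely above guideline": "Sample indoor air or implement radon "
                              "mitigation of water.",
}


def interpret_radon_result(concentration, lower=5_000, upper=16_000):
    """Map a single radon result (pCi/L) to a Table 2 category."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    if concentration < lower:
        return "likely below guideline"
    if concentration <= upper:
        return "inconclusive"
    return "likely above guideline"


for result in (2_400, 9_000, 40_000):
    category = interpret_radon_result(result)
    print(result, category, "-", FOLLOW_UP[category])
```

Keeping the cutoffs as parameters would let the same rule be reused if the guideline or the variability estimate is revised.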
Reference List
Dupuy, C. J.; Healy, D.; Thomas, M. A.; Brown, D. R.; Siniscalchi, A. J., and Dembek, Z. F. A
survey of naturally occurring radionuclides in groundwater in selected bedrock aquifers
in Connecticut and implications for public health policy. In: Gilbert, C. E. and
Calabrese, E. J., eds. Regulating Drinking Water Quality. Chelsea, MI: Lewis Publishers;
1992; pp. 95-119.
Larson, C. D. and Rydell, S. Regional Perspective on Radon in Drinking Water. In: Gilbert, C. E.
and Calabrese, E. J., eds. Regulating Drinking Water Quality. Boca Raton, FL: Lewis
Publishers; 1992; pp. 83-93.
MassDEP. 2008. Private Well Guidelines. Bureau of Resource Protection, Drinking Water
Program, Massachusetts Department of Environmental Protection. Boston, MA.
(available at: http://www.mass.gov/dep/water/laws/prwellgd.pdf)