Office of Research and Standards
Summer Intern Project Description

Characterization of Manganese Concentration Temporal Variability in Groundwater and Surface Water

Project Description

The objective of this project is to statistically characterize the temporal variability of manganese concentrations in ground and surface waters that may serve as source waters for public drinking water supplies. Concentrations of manganese vary over time as the characteristics of surface water or groundwater flow change seasonally or annually. We seek to assign statistically based limits to how much those concentrations may vary, using existing data from several databases.

The project work will be computer based and very quantitative in nature. It will involve manipulation of sizeable data files, use of preprogrammed functions to manipulate the data, and use of powerful statistical software.

Project Components:

Data Acquisition: Data will be sought from several sources to complement some data that we already have in our electronic files. Data may also be downloaded from a US Geological Survey national web site.

Data Manipulations: Data files may need some cleaning up prior to further analysis. Appendix A to this Project Description describes some of the data cleaning steps that we have had to perform in other evaluations. The data files will be structured to identify cases where results have been reported as less than the detection limit of the method used to quantify the manganese concentrations. We’ll calculate the distributional characteristics of the overall data set so as to provide a picture of groundwater and surface water manganese concentration distributions in New England. We’ll use more advanced methods, such as survival analysis or maximum likelihood estimation, to incorporate information on non-detects into the analysis of the data sets (the intern will be instructed in the basis for these methods and how to use them).
The primary objective will be to identify, within each data file, cases where the same location has been sampled multiple times over a period of time. For each such case, sample statistics will be calculated over the time series (mean, standard deviation, coefficient of variation, minimum and maximum). The set of statistics over all the cases will then be used to calculate an overall coefficient of variation for manganese variability in the type of water source (groundwater or surface water).

Output: The anticipated output from the project will be a brief report similar to the appended report (Appendix B) on radon variability in groundwater, which we prepared a few years ago.

Desired Skills:
Must have working knowledge of spreadsheets, including use of built-in functions for manipulating data.
Very desirable to have some comfort working with larger data files (hundreds to thousands of records).
Would be helpful, but not necessary, to have working knowledge of the Visual Basic programming language.
Exposure through course work to elementary statistics.

Skills and Knowledge to Be Developed: The intern will be instructed in the use of a more powerful stand-alone software package for data processing (Statistica). Instruction will also be provided on the basic statistical principles which will govern the processing of the data. The topic of “non-detects” (NDs) in analytical data reporting will be introduced to the intern, and alternate ways of dealing with NDs in the analysis of environmental data sets will be reviewed. The output of this project will be used by our department’s drinking water program in guidance that it issues to public water supplies on resampling frequency after initial manganese concentration determinations. We will instruct the intern on how the drinking water program operates with respect to water quality standards and how they are enforced or applied.
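As a rough illustration of the primary objective described above, the sketch below groups records by location, keeps only locations sampled more than once, and computes the per-location statistics. The records, location names and the function name are hypothetical, non-detects are ignored for simplicity, and summarizing the overall variability as the mean of the per-location CVs is an assumption borrowed from the approach used in Appendix B.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (location id, sampling date, manganese in ug/L).
records = [
    ("well-A", "2001-03-01", 40.0),
    ("well-A", "2001-09-01", 60.0),
    ("well-A", "2002-03-01", 50.0),
    ("well-B", "2001-05-01", 120.0),  # sampled once, so it is excluded
    ("pond-C", "2001-04-01", 10.0),
    ("pond-C", "2001-10-01", 30.0),
]


def per_location_stats(rows, min_samples=2):
    """Return summary statistics for each location sampled at least min_samples times."""
    by_location = defaultdict(list)
    for location, _date, concentration in rows:
        by_location[location].append(concentration)

    stats = {}
    for location, values in by_location.items():
        if len(values) < min_samples:
            continue  # a single sample carries no information on temporal variability
        m = mean(values)
        s = stdev(values)  # sample standard deviation (n - 1 denominator)
        stats[location] = {
            "n": len(values),
            "mean": m,
            "sd": s,
            "cv_pct": 100.0 * s / m,  # coefficient of variation, percent
            "min": min(values),
            "max": max(values),
        }
    return stats


location_stats = per_location_stats(records)

# Assumption: overall variability summarized as the mean of per-location CVs.
overall_cv_pct = mean(v["cv_pct"] for v in location_stats.values())

for location, v in location_stats.items():
    print(location, v)
print("overall CV (%):", round(overall_cv_pct, 1))
```

In practice the same grouping would be done separately for groundwater and surface water records, and censored results would need the treatment described in Appendix A before any statistics are computed.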
Appendix A – Example of Data File Cleaning Prior to Analyses for Another Related Project

Data Content and Processing Notes:

The data file on manganese concentrations in this file was imported from an Excel file provided by USGS. The two variables of interest were “Manganese water filtered” and “Manganese water unfiltered recoverable”. The second variable was closest to the type of sample that would be analyzed by Public Water Suppliers when determining the manganese content of their source water. It would reflect both the dissolved and mineral content of the water samples. There were many more data records for the first variable.

I had to do some cleaning up of the imported data file. The Excel file had “- -” in spaces where there was no data. I had to remove all of them. The data file contained alphanumeric representations of results that were NDs, expressed as “< ##”. In addition, letter codes were also present (“E #” for estimated and “M” for present but not quantified). In order to perform any statistics on the data, I had to recode the data to make it all numeric. There were also a number of data entries which were “0” rather than an ND indicator. In the variable for unfiltered recoverable Mn, I substituted < 10 for the zeros, as this was the detection limit for that variable across the data set. For the Mn filtered variable, there was only one 0 value, which I recoded as < 10 since it came from 1958, when I imagine detection limits were not very good.

Rather than recoding NDs as one half the detection limit, as is often done when analyzing data, I plan to use survival analysis to determine the distributional characteristics of the data, following guidance provided by Helsel (2005). NDs were recoded to permit their use in the survival analysis module of the statistics software. The steps that I went through were:

1. Created two new variables for each of the variables noted above: i) a censored column where a code of 1 or 0 was used to indicate if the original data entry was a censored (< ##) or non-censored value; ii) the original data values were recoded to show the < ## values as numeric values with the same numeric value as the number in the < expression (i.e., < 10 would be shown as 10); numeric values were left as is.

2. “E” and “M” entries were recoded in the original variable to < ## values. The < ## value assigned was the same as any other below-detection-limit entries reported around the same date. If it wasn’t possible to do this, a numeric value was entered (e.g., E .8 was given 1; E .1 was given 1; E 1.1 was given 3.2). There was an element of judgment involved in doing these assignments, as there were often a couple of detection limits reported around the same time.

Reference

Helsel, D. R. (2005). Nondetects and Data Analysis: Statistics for Censored Environmental Data. Hoboken, NJ: Wiley-Interscience. 250 pp.

Appendix B - Example Output of Project

Recommendations for Radon Monitoring and Decision Making at Transient Non-Community Public Water Supplies

Prepared by Office of Research and Standards
For The Drinking Water Program
Massachusetts Department of Environmental Protection
Boston, MA
November 2008

BACKGROUND

Public Community Water Systems in Massachusetts have well-prescribed protocols for water quality testing and frequency of testing (310 CMR 22.00). These types of operations serve at least 15 service connections used by year-round residents or regularly serve at least 25 year-round residents. Systems serving these numbers of connections or customers less than year-round are non-community water systems. There are two subgroups of non-community systems, classified on the basis of the frequency with which people use the water from those supplies: non-transient non-community (NTNC) water systems and transient non-community (TNC) water systems.
Public water systems classified as Transient Non-Community (TNC) and Non-Transient Non-Community (NTNC) have limited testing requirements compared to full-fledged community water systems. MassDEP provides private well testing guidance (MassDEP 2008), which notes that a baseline assessment of chemical quality for some major groups of chemicals should be performed every 10 years. The private well guidance for radionuclides indicates that after initial monitoring, future sampling frequency should be based upon the results. In order to provide additional guidance on resampling frequency after initial testing for radon in NTNC, TNC and private wells, this document presents a recommendation for resampling frequency and an improved decision process for determining whether or not reported radon concentrations are greater than the health-based exposure guideline for radon in water [1] used by MassDEP. The decision process is based upon an assessment of temporal variability in groundwater radon concentrations from published data for New England.

NATURAL VARIABILITY IN GROUNDWATER RADON CONCENTRATIONS IN NEW ENGLAND

Larson and Rydell (1992) summarized radon occurrence data and temporal variability information for wells in New England. They present graphs of well radon concentrations versus time for 16 influents for domestic water supplies, public water supplies and a USGS well in New Hampshire, Maine and Connecticut. Data points for these graphs were digitized after visual examination and the readings entered into a data file. Means and standard deviations for each time series were calculated, and the coefficient of variation (CV) of each data set was calculated as the standard deviation divided by the mean, times 100. The overall mean of the CVs across the systems was also calculated. Dupuy et al. (1992) present a summary of data on radionuclide concentrations in Connecticut groundwater wells and time series data for a single well for radon and radium 226.
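The CV calculation described above can be checked against the summary values that appear later in Table 1. The sketch below is only an illustration: the pairing of study names with means and standard deviations follows the order in which they are listed in Table 1, the CVs for the ten Maine domestic wells are the values the report read from a graph, and the variable and function names are hypothetical.

```python
from statistics import mean

# (study, mean pCi/L, standard deviation pCi/L), in the order listed in Table 1.
series = [
    ("Mt Vernon NH PWS", 261333, 42865),
    ("Lee NH private well", 2359, 195),
    ("Derry NH PWS", 32821, 4200),
    ("ME private well #3", 35458, 4031),
    ("ME private well #4", 40114, 23028),
    ("USGS well", 85625, 58559),
    ("CT GW well", 328889, 225413),
]

# CVs (%) for ME domestic wells #1 to #10, read from a graph in the source document.
graph_cvs = [17, 15, 13, 55, 20, 23, 14, 30, 28, 18]


def cv_percent(series_mean, series_sd):
    """Coefficient of variation: standard deviation divided by the mean, times 100."""
    return 100.0 * series_sd / series_mean


computed_cvs = [round(cv_percent(m, s)) for _name, m, s in series]
overall_mean_cv = mean(computed_cvs + graph_cvs)

print(computed_cvs)            # per-series CVs, rounded to whole percent
print(round(overall_mean_cv))  # overall mean of the 17 CVs
```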
The MassDEP Water Quality Testing System (WQTS) database was queried in July 2008 for all records of radionuclide results for water systems classed as transient non-community supplies that had data for multiple sampling dates. The selected records were visually scanned to identify systems having results for several years. None of the radon data were used in this analysis because almost all of the readings were of finished water, rather than raw source groundwater. It was therefore unclear whether the systems had any sort of treatment in place specific to radon and whether the finished water reflected this treatment or more closely reflected source water.

Radon concentrations in groundwater in New England vary widely, as illustrated by the data from these wells (Figure 1). Concentrations range over three orders of magnitude. Within any one well, there is substantial variation in radon concentration about the mean over time, with the standard deviations averaging 28% of the mean concentrations (Table 1) and values remaining generally within an order of magnitude of each other over time.

[1] http://www.mass.gov/dep/water/drinking/standards/dwstand.htm#rads

IDENTIFICATION OF RADON SCREENING LEVELS FOR TREATMENT OR RESAMPLING

A health-based drinking water guideline for radon is employed to indicate when radon concentrations have the potential to cause unacceptable adverse health effects, either through direct ingestion of water or through inhalation of radon that volatilizes from water into the indoor air of dwellings where radon-containing water is used. Under the Department’s present guidance, a recommendation to do indoor air testing for radon is provided when radon concentrations are greater than the Office of Research and Standards’ (ORS) guideline of 10,000 pCi/L [2]. If indoor air concentrations of radon in a residence are above the current US EPA guidance value of 4 pCi/L, then radon mitigation is recommended.
Mitigation is usually most cost-effectively accomplished through basement venting systems, but may also be accomplished by treating the water coming into the home with air stripping.

Given the wide variation in water radon concentrations that can occur over time in a particular well, a single determination of radon concentration may not give an accurate picture of the average concentration of radon in the water, and hence of the average exposures over time that may occur with use of the water. It is therefore a weak basis for deciding whether or not radon concentrations are truly greater than a health-based guideline. For example, because of the temporal variation in radon concentrations, it may not be appropriate to conclude from a single sample with a radon concentration less than 10,000 pCi/L that the radon in the water will not likely translate into an indoor air problem. The average concentration over time could be greater than 10,000 pCi/L.

[2] ORS plans to review the basis for this value in the near future in light of more recent assessments of radon’s carcinogenic potency, which might suggest that a lower value would be supportable.

Figure 1. Time Series of Groundwater Well Radon Concentrations, pCi/L (Source: Larson and Rydell 1992; Dupuy et al. 1992)
[Figure not reproduced. Vertical axis: radon concentration, pCi/L (ticks from 1,000 to 1,000,000). Horizontal axis: Sampling Day (-50 to 450). Series: Mt Vernon NH PWS; Lee NH private well; Derry NH PWS; ME private well; ME private well #3; ME private well #4; USGS well; CT GW well.]

Table 1. Summary statistics for radon time series in groundwater (pCi/L).
study                  mean     n    s        CV %* (s/mean x 100)   source
Mt Vernon NH PWS       261333   15   42865    16                     Larson and Rydell (1992)
Lee NH private well    2359     16   195      8                      Larson and Rydell (1992)
Derry NH PWS           32821    19   4200     13                     Larson and Rydell (1992)
ME private well #3     35458    12   4031     11                     Larson and Rydell (1992)
ME private well #4     40114    14   23028    57                     Larson and Rydell (1992)
USGS well              85625    16   58559    68                     Larson and Rydell (1992)
CT GW well             328889   9    225413   69                     Dupuy et al. (1992)
ME domestic well #1    -        -    -        17                     Larson and Rydell (1992)
ME domestic well #2    -        -    -        15                     Larson and Rydell (1992)
ME domestic well #3    -        -    -        13                     Larson and Rydell (1992)
ME domestic well #4    -        -    -        55                     Larson and Rydell (1992)
ME domestic well #5    -        -    -        20                     Larson and Rydell (1992)
ME domestic well #6    -        -    -        23                     Larson and Rydell (1992)
ME domestic well #7    -        -    -        14                     Larson and Rydell (1992)
ME domestic well #8    -        -    -        30                     Larson and Rydell (1992)
ME domestic well #9    -        -    -        28                     Larson and Rydell (1992)
ME domestic well #10   -        -    -        18                     Larson and Rydell (1992)
Overall                -        101  -        28

* All CVs for ME domestic wells were read directly from a graph of CV versus well number in the source document. No other statistics were available.

If one uses the information on groundwater radon variability summarized earlier to define a statistical confidence zone about 10,000 pCi/L, then one can more confidently ascertain whether any single sample is most likely to come from a population with a mean over time of 10,000 pCi/L. In the case of radon, the average CV of 28% means that standard deviations average 28% of the mean for radon in groundwater wells. The 95% confidence interval around the mean would therefore be ±1.96 times the standard deviation. Since the standard deviation is 0.28 times the mean, the 95% confidence interval about a mean with this degree of variability would be ±1.96 x 0.28 x mean, or ±0.55 x mean. For a mean of 10,000, the lower and upper 95% confidence limits would therefore be 10,000 ± 0.55 x 10,000, or 4,500 and 15,500 pCi/L. For ease of implementation, these limits are rounded to 5,000 and 16,000.
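The arithmetic above can be reproduced with a few lines of code. This is a minimal sketch under the report's own assumptions (a normal approximation with a 28% average CV); the function name and parameters are illustrative and not part of any MassDEP procedure.

```python
Z_95 = 1.96  # two-sided 95% multiplier for a normal distribution


def screening_limits(guideline, cv_fraction, z=Z_95):
    """Return (lower, upper) limits of guideline +/- z * CV * guideline."""
    half_width = z * cv_fraction * guideline
    return guideline - half_width, guideline + half_width


# Guideline of 10,000 pCi/L and the average CV of 28% from Table 1.
lower, upper = screening_limits(10_000, 0.28)

print(f"lower limit: {lower:,.0f} pCi/L")
print(f"upper limit: {upper:,.0f} pCi/L")
```

Without the intermediate rounding of 1.96 x 0.28 to 0.55, the limits come out at roughly 4,512 and 15,488 pCi/L, which is consistent with the 4,500 and 15,500 quoted above and with the rounded working values of 5,000 and 16,000.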
Therefore, any single radon concentration less than 5,000 pCi/L likely comes from a population with a mean less than 10,000 pCi/L, and no additional near-term sampling should be recommended. If a single concentration is between 5,000 and 16,000 pCi/L, then additional sampling is warranted to better establish the longer-term average concentration, since one can’t confidently conclude that the long-term mean is less than 10,000 pCi/L given the expected temporal variance in the radon concentrations. If a single measurement is greater than 16,000 pCi/L, then the long-term mean is likely greater than 10,000 pCi/L and indoor air testing should be pursued. These decision criteria are summarized in Table 2.

Table 2. Radon Well Sampling Data Interpretation Guide

Single radon concentration: < 5,000 pCi/L
  Implication of Result: Long-term average radon concentration most likely no greater than the 10,000 pCi/L guideline.
  Follow-up: None immediately.
  Resampling Frequency: Recommend resampling every 5 years.

Single radon concentration: 5,000 – 16,000 pCi/L
  Implication of Result: Not possible to conclude whether the long-term average is under or over the guideline of 10,000 pCi/L.
  Follow-up: Resample.
  Resampling Frequency: Sample quarterly to get an annual average to compare with the 10,000 pCi/L guideline.

Single radon concentration: > 16,000 pCi/L
  Implication of Result: Long-term average radon concentration most likely greater than the 10,000 pCi/L guideline.
  Follow-up: Sample indoor air or implement radon mitigation of water.
  Resampling Frequency: Confirm treatment is effective. No need to resample water when treatment is in place.

Reference List

Dupuy, C. J.; Healy, D.; Thomas, M. A.; Brown, D. R.; Siniscalchi, A. J., and Dembek, Z. F. A survey of naturally occurring radionuclides in groundwater in selected bedrock aquifers in Connecticut and implications for public health policy. In: Gilbert, C. E. and Calabrese, E. J., eds. Regulating Drinking Water Quality. Chelsea, MI: Lewis Publishers; 1992; pp. 95-119.

Larson, C. D. and Rydell, S. Regional perspective on radon in drinking water. In: Gilbert, C. E. and Calabrese, E. J., eds. Regulating Drinking Water Quality. Boca Raton, FL: Lewis Publishers; 1992; pp. 83-93.

MassDEP. 2008. Private Well Guidelines. Bureau of Resource Protection, Drinking Water Program, Massachusetts Department of Environmental Protection. Boston, MA. (available at: http://www.mass.gov/dep/water/laws/prwellgd.pdf)