Comparison of EPA Sum06 Secondary Ozone Exposure
Estimates with Observed Data
by Stefan Falke, Bret Schichtel, and Luis Vasconcelos
CAPITA
August 11, 1997
DRAFT
Introduction
Data Used
Methodology
  Sum06 calculation
  Differences in Sum06 calculation
  EPA 3-month maximum Sum06 grid
  EPA Grid to Data 3-month max. Sum06 comparison
Results
  EPA Grid Evaluation
Discussion
Acknowledgments
References
Introduction
EPRI is reviewing the U.S. EPA's economic analysis for the proposed secondary ozone
standard. In support of this review, CAPITA is conducting an evaluation of EPA ozone
exposure estimates used in EPA's analysis. This paper presents the comparison of three
month maximum ozone Sum06 values calculated from the AIRS database with those
estimated by the EPA in their review of the secondary ozone standard. The objectives of
this evaluation are twofold: first, to reproduce the Sum06 (the summation of hourly ozone
concentrations greater than or equal to 60 ppb) at monitoring stations used in the EPA
analysis and second, to assess the performance of the EPA estimation with monitoring
sites not included in the EPA analysis.
Data Used
Two sets of ozone data were used in this analysis: an integrated hourly ozone database
derived from a number of monitoring networks, and Sum06 3-month maximum ozone
exposure estimates derived from the Environmental Research Laboratory's Geographic
Information System (GIS).
The ozone data used in this report were collected from multiple sources:
Data Set          Supplying Organization                    Years
AIRS              EPA                                       1991, 1995
CASTNet           EPA                                       1991, 1995
SCION             Southern Oxidant Study                    1993, 1995
LADCO             Lake Michigan Air Directors Consortium    1991 (88, 93, 95)
GEORGIA           State of Georgia                          1988, 91, 93, 95
NORTH CAROLINA    State of North Carolina
Data from each network were extracted and combined into a single integrated data set.
The details of the data sources and quality control procedures are discussed in the report
"Preparation of Ozone Files for Data Analysis" by Husar and Husar 1996.
An initial examination of average daily maximum ozone maps revealed anomalous
ozone "holes" and peaks at unexpected locations. For those sites the hourly and daily
maximum ozone values were re-examined for possible inconsistencies. Sudden
systematic changes in the ozone concentrations, as well as major deviations from
neighboring sites, were the main clues for anomalous behavior. As a result of this quality
control process, 6 out of ~1000 monitoring sites were discarded. The database described
in the Husar and Husar 1996 document contains data from the Eulerian Model Evaluation
and Field Study (EMEFS) for 1988 through June 1990. The EMEFS ozone concentrations
were found to be systematically low and were removed for this analysis (see “Validation
of an Ozone Integrated Database”). The remaining data were used in all the subsequent
computations exactly as submitted by the networks.
Methodology
The calculation of the ozone Sum06 followed the method used by the EPA. Monthly and
3-month maximum Sum06 values were calculated for each of the EPA and non-EPA
monitoring sites. Values from the 1990 EPA Sum06 grid were extracted for each grid
cell in which monitoring sites were located and were compared with 1990 Sum06 values
calculated from observed data to evaluate the accuracy of the EPA estimates.
Sum06 calculation
A 3-month Sum06 value is the summation of all daytime hourly ozone concentrations
greater than or equal to 0.06 ppm (60 ppb) over a continuous three-month period. The
three-month period was contained within the EPA-defined ozone season. The ozone season
varies for different regions of the U.S. For example, the southern states have a year-long
ozone season while Montana’s ozone season runs from June through September. Because the
Sum06 metric is a summation, it is necessary to correct for missing data. The EPA (1996)
recommended corrections were used and are presented below:
• Daily Sum06 values were created by summing hourly ozone observations greater than
  or equal to 60 ppb for the 12-hour period from 8:00 AM to 8:00 PM for each day. All
  days that had 70% or more valid hourly data in that period were flagged as valid days.
• Monthly Sum06 values were computed by summing all daily Sum06 values for a
  given month. A monthly Sum06 value was calculated for a monitoring site only if at
  least 70% of the days in the month were valid.
• Valid monthly values were corrected for missing data by multiplying by M/m, where
  M is the total number of hours in the month between 8 AM and 8 PM and m is the
  number of those hours with a reported ozone concentration.
• Three-month Sum06 values were calculated for each site by summing together the
  monthly values for every three consecutive months, i.e. May - July, June - August,
  etc. If a monthly Sum06 value was missing but the two months adjacent to it had at
  least 90% valid days, then a Sum06 value was calculated for the missing month as the
  weighted average of the adjacent months, where the weight was the number of days
  in the month. The maximum 3-month Sum06 was found from the 3-month values.
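The steps above can be sketched in Python as follows. This is a minimal illustration, not the code used by CAPITA or the EPA. The function names, the input layout (one dictionary of hourly values per day), the hour-labeling convention (hour h covers h:00 to h+1:00, so 8 AM to 8 PM is hours 8 through 19) and the reading of the weighted average (each adjacent month weighted by its own number of days) are all assumptions made for the example. It also assumes that every day contributes to the monthly sum and that the valid-day flag only decides whether the month is reported.

```python
THRESHOLD_PPB = 60          # Sum06 threshold, 0.06 ppm
DAY_HOURS = range(8, 20)    # assumed labeling: hours 8..19 cover 8 AM to 8 PM


def daily_sum06(hourly_ppb):
    """hourly_ppb: dict {hour: ppb or None} for one day.
    Returns (sum06 in ppb-hrs, number of valid daytime hours, valid-day flag)."""
    valid = [hourly_ppb.get(h) for h in DAY_HOURS]
    valid = [v for v in valid if v is not None]
    total = sum(v for v in valid if v >= THRESHOLD_PPB)
    return total, len(valid), len(valid) >= 0.70 * len(DAY_HOURS)


def monthly_sum06(days, days_in_month, correct=True):
    """days: list of per-day dicts. Returns Sum06 in ppm-hrs, or None if
    fewer than 70% of the days in the month are valid."""
    total = m = valid_days = 0
    for day in days:
        s, n, ok = daily_sum06(day)
        total += s
        m += n
        valid_days += ok
    if m == 0 or valid_days < 0.70 * days_in_month:
        return None
    if correct:
        M = days_in_month * len(DAY_HOURS)   # all daytime hours in the month
        total *= M / m                       # missing-data correction M/m
    return total / 1000.0                    # ppb-hrs -> ppm-hrs


def max_three_month_sum06(months, min_adjacent_valid=0.90):
    """months: calendar-ordered list of dicts with keys 'sum06' (ppm-hrs or
    None), 'valid_frac' (fraction of valid days) and 'days' (days in month)."""
    filled = [mo["sum06"] for mo in months]
    for i in range(1, len(months) - 1):
        prev, nxt = months[i - 1], months[i + 1]
        if (months[i]["sum06"] is None
                and prev["sum06"] is not None and nxt["sum06"] is not None
                and prev["valid_frac"] >= min_adjacent_valid
                and nxt["valid_frac"] >= min_adjacent_valid):
            # Day-weighted average of the two adjacent months (assumed reading).
            filled[i] = ((prev["sum06"] * prev["days"] + nxt["sum06"] * nxt["days"])
                         / (prev["days"] + nxt["days"]))
    windows = [sum(filled[i:i + 3]) for i in range(len(filled) - 2)
               if None not in filled[i:i + 3]]
    return max(windows) if windows else None
```

The `correct` and `min_adjacent_valid` parameters are exposed so that the same sketch can be run with or without the M/m correction and with a different valid-day requirement for the adjacent months.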
Differences in Sum06 calculation
The above outlined methodology was applied to 1990 hourly ozone observations at AIRS
monitoring sites. The resulting three month maximum Sum06 values (called CAPITA
Sum06 in this discussion) differed from those used by the EPA in the generation of their
exposure estimates (EPA Sum06). Figure A is a correlation plot of the EPA Sum06 with
the CAPITA Sum06. The scatter shows that for many sites, the CAPITA Sum06 is larger
than the EPA Sum06. The CAPITA Sum06 is lower than the EPA Sum06 for only a few
sites, and these are at 3-month Sum06 values less than 30 ppm-hrs.
Figure A.
Analysis of the EPA 3-month maximum indicated that it was derived without the
correction of monthly Sum06 for missing hourly ozone as outlined in the methodology
section above. The CAPITA Sum06 was recalculated without the correction of monthly
values and the results are correlated with the EPA Sum06 in Figure B. Many of the
CAPITA Sum06 points which were higher than the EPA Sum06 were corrected but there
are still differences between the two calculations.
Figure B.
Further evaluation of the EPA Sum06 values revealed that the correction for missing
monthly values in the 3-month Sum06 calculation was different than that outlined in the
methodology. The methodology states that months adjacent to a missing month require at
least 90% valid days to be used in imputing a value for the missing month, but in the
calculation of the EPA Sum06 it appeared that this requirement was less strict, at about
70% valid days. Using a 70% restriction instead of 90%, the CAPITA Sum06 was
recalculated and correlated with the EPA Sum06 in Figure C. All but a few sites were
corrected. The sites which still had differences had different monthly
Sum06 values, and the cause of these differences is unknown. Because no missing-data
correction was applied to the monthly Sum06 values in this comparison, it would seem that
the hourly ozone concentrations used in the monthly Sum06 calculation were different.
Figure C.
A question remaining after this analysis of the EPA Sum06 values is whether the EPA
Sum06 values described in this section were used in generating the EPA gridded Sum06
estimates described in the next section. One way to determine this is to examine
differences between the data-derived Sum06 and the gridded Sum06. The gridded Sum06
corresponding to each monitoring site location was extracted from the EPA GIS grid and
compared with the CAPITA Sum06 and EPA Sum06. The average Sum06 over all
monitoring sites was calculated. The EPA GIS Sum06 values averaged 20.68 ppm-hrs,
the CAPITA Sum06 averaged 21.59 ppm-hrs, and the EPA Sum06 averaged
20.66 ppm-hrs. The EPA Sum06 average is very close to the EPA GIS average, suggesting
that the EPA Sum06 values were used in the generation of the EPA gridded Sum06 estimates.
Averaging Sum06 values over the entire U.S. is a crude and very simple way of comparing
the data, but it does provide some initial insight into the source of the gridded Sum06
values.
EPA 3-month maximum Sum06 grid
The EPA used a GIS (Geographic Information System) to derive its Sum06 maps from
the monitoring station point values. The GIS uses a potential exposure surface (PES) as a
model of the spatial variation of ozone. The PES incorporates factors such as temperature,
cloud cover, elevation, wind direction, and ozone precursor emission sources. The
GIS-generated 1990 3-month maximum Sum06 grid, with 10 km x 10 km grid cells, is shown in
Figure 1. The San Joaquin Valley in California had the largest 3-month max. Sum06
(>50 ppm-hrs). The Southeast was also a region of high Sum06, with values above 30
ppm-hrs. The southern parts of Illinois and Ohio also show elevated Sum06, as do New
Jersey, Delaware and Maryland.
Figure 1. GIS generated EPA grid of 3-month maximum ozone Sum06 for 1990.
EPA Grid to Data 3-month max. Sum06 comparison
The 1990 3-month max. Sum06 values calculated directly from the observed data were
compared with those of EPA’s grid. Grid values at those locations where monitoring
stations exist were extracted and compared to the data derived Sum06 using correlation
plots, differences, and ratios. The point values for differences and ratios were
interpolated to a grid using an inverse distance weighting interpolation.
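The interpolation step can be illustrated with a short Python sketch of inverse distance weighting. The report does not state the distance exponent, search radius or coordinate system that were used, so the exponent of 2, the use of all sites, the planar coordinates and the function name `idw` are assumptions made only for this example.

```python
import math


def idw(sites, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    sites: list of (x, y, value) tuples in planar coordinates, where value is
    e.g. the grid-minus-data Sum06 difference at a monitoring site.
    power: distance exponent (assumed; not given in the report).
    """
    num = den = 0.0
    for sx, sy, value in sites:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value            # exactly on a site: return its value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Two hypothetical sites with grid-data differences of 10 and 20 ppm-hrs.
sites = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
midpoint = idw(sites, 5.0, 0.0)     # equal distances, so the plain average
```

Because every site receives a nonzero weight, an estimate far from any monitor is still driven by the nearest sites, however distant they are.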
Results
EPA Grid Evaluation
The comparison of EPA Sum06 values with those calculated from the data was done at
the sites used in the EPA analysis as well as at a set of stations not used by the EPA. The
locations of the EPA sites along with a contoured grid of their data-derived 3-month
Sum06 values are displayed in Figure 2. The underlying contour was generated using
inverse distance interpolation and exhibits unrealistic spreading of Sum06 values into
areas with no monitoring sites, e.g. the high Sum06 values in Nevada and central Texas.
These areas differ from the estimates in EPA’s GIS grid, but the Sum06 spatial patterns in
areas of high monitoring-site density are similar to those in EPA’s grid (Figure 1), although
the elevated values in the East appear to be higher than what is indicated by the EPA grid.
Figure 3 contains a map of the non-EPA site locations. The size of the squares is
proportional to the three month maximum Sum06 at each monitoring site. The spatial
density of the monitoring sites is adequate except in the Central Plains States. The
highest Sum06 values are in California, the Southeast, the Midwest, and on the Atlantic
Coast.
Figure 2. Three month maximum Sum06 values at sites included in EPA’s analysis.
Figure 3. Three month maximum Sum06 values at sites not included in EPA’s analysis.
Figure 4 contains scatterplots for the 3-month maximum Sum06 obtained from the grid
and their measured data counterparts. Grid values for EPA sites tend to agree with the
data. At lower values (<25 ppm-hrs), the grid was mostly larger than the data, whereas
higher Sum06 data values were underestimated by the grid. More scatter is seen at non-EPA
sites (Figure 4b) and the grid values are biased low. The two outliers at EPA GIS
values near 100 ppm-hrs cause the regression line fit to give a slope less than one. The
scatter of the data points indicates a bias in the grid to underestimate the data Sum06.
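A minimal Python sketch of the two quantities discussed above, the regression slope and the mean bias of the grid relative to the data, is given below. It assumes an ordinary least-squares fit with the data-derived Sum06 on the x-axis and the grid Sum06 on the y-axis, which the report does not specify. The function name and the sample values are hypothetical.

```python
def fit_and_bias(data, grid):
    """Least-squares line grid = slope * data + intercept, plus mean bias.

    data: data-derived Sum06 values (ppm-hrs), one per site
    grid: grid Sum06 values extracted at the same sites (ppm-hrs)
    Returns (slope, intercept, mean of grid minus mean of data).
    """
    n = len(data)
    mean_x = sum(data) / n
    mean_y = sum(grid) / n
    sxx = sum((x - mean_x) ** 2 for x in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(data, grid))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, mean_y - mean_x


# Hypothetical values: the grid is 20% lower than the data at every site.
slope, intercept, bias = fit_and_bias([10.0, 20.0, 30.0], [8.0, 16.0, 24.0])
```

A negative mean bias indicates that the grid underestimates the data on average. Because a least-squares slope is sensitive to a few extreme points, it is worth reporting alongside the mean bias.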
One reason for the scatter seen in Figure 4 is that the values extracted from the EPA grid
were at the centroid of the 10 km x 10 km grid cell and not at the exact coordinates of the
monitoring stations. A site’s Sum06 and its grid value correspond to exactly the same
location only if the site is located exactly at the center of the grid cell.
This rarely, if ever, occurs and, therefore, an inherent uncertainty exists in the extraction
of grid Sum06 values.
Figure 4. Correlation of EPA grid vs. data Sum06 a) at EPA sites b) at non-EPA sites.
The differences between the EPA gridded Sum06 and data Sum06 values are displayed in
Figure 5a for the EPA stations and Figure 5b for the non-EPA sites. Most of the EPA
sites have a grid-data difference between –3 and +3 ppm-hrs. A few sites are outside of
this range, namely sites in California, eastern Utah, southeastern Missouri, and South
Carolina. It is possible that the western sites have such large differences due to the
highly textured topography in their areas. The grid values used in the difference were
obtained at the centroids of the 10 km x 10 km grid cells and not at the sites’
geographical coordinates. The extracted Sum06 value could therefore apply to a point up
to about 7 km away from, and at a substantially higher or lower elevation than, the
monitoring site. The contoured plots in
the figures were created by first calculating the Sum06 difference at each of the
monitoring sites and then interpolating this difference.
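The distance of about 7 km quoted above is the half-diagonal of a 10 km x 10 km cell, sqrt(5^2 + 5^2) ≈ 7.07 km. The Python sketch below shows how the containing cell and the site-to-centroid offset could be computed. It assumes site coordinates already expressed in kilometers in the grid’s own projection, with a hypothetical grid origin at (0, 0). The function name and parameters are illustrative and are not taken from the EPA GIS.

```python
import math


def centroid_offset_km(x_km, y_km, cell_km=10.0, x0_km=0.0, y0_km=0.0):
    """Return ((col, row), distance in km from the site to its cell centroid).

    x_km, y_km: site coordinates in the grid projection (assumed)
    cell_km: grid cell edge length
    x0_km, y0_km: lower-left corner of the grid (hypothetical origin)
    """
    col = math.floor((x_km - x0_km) / cell_km)
    row = math.floor((y_km - y0_km) / cell_km)
    cx = x0_km + (col + 0.5) * cell_km      # centroid of the containing cell
    cy = y0_km + (row + 0.5) * cell_km
    return (col, row), math.hypot(x_km - cx, y_km - cy)


# A site at a cell corner is the worst case for a 10 km cell.
cell, worst = centroid_offset_km(0.0, 0.0)
```

Tabulating this offset for every monitoring site would indicate how much of the grid-data difference could plausibly be attributed to the centroid extraction.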
Figure 5. Difference between grid and data 3-month max. Sum06 values a) at EPA sites and b) at non-EPA sites.
The non-EPA sites show larger differences than the EPA sites. The grid values obtained
at most sites are lower than their calculated Sum06, particularly in much of the eastern
half of the U.S., where the EPA grid estimates are more than 9 ppm-hrs lower than the
measured data.
Discussion
A concern arising from this analysis is that the EPA Sum06 values almost always
underestimate what is calculated from the measured data. Currently, we are verifying
that the method used to calculate the 3-month max. Sum06 was identical to that used in
the generation of the EPA grids, and we are investigating whether any of the bias is caused
by extracting grid values at grid cell centroids rather than at exact station locations.
Acknowledgments
References