INTERNATIONAL JOURNAL OF CLIMATOLOGY
Int. J. Climatol. (2014)
Published online in Wiley Online Library
(wileyonlinelibrary.com) DOI: 10.1002/joc.4127
Creating a topoclimatic daily air temperature dataset
for the conterminous United States using homogenized station
data and remotely sensed land skin temperature
Jared W. Oyler,a,b* Ashley Ballantyne,b Kelsey Jencso,b Michael Sweetb and Steven W. Runninga
a Numerical Terradynamic Simulation Group, Department of Ecosystem and Conservation Sciences, University of Montana, Missoula, MT, USA
b Montana Climate Office, Montana Forest and Conservation Experiment Station, University of Montana, Missoula, MT, USA
ABSTRACT: Gridded topoclimatic datasets are increasingly used to drive many ecological and hydrological models and
assess climate change impacts. The use of such datasets is ubiquitous, but their inherent limitations are largely unknown
or overlooked particularly in regard to spatial uncertainty and climate trends. To address these limitations, we present
a statistical framework for producing a 30-arcsec (∼800-m) resolution gridded dataset of daily minimum and maximum
temperature and related uncertainty from 1948 to 2012 for the conterminous United States. Like other datasets, we use weather
station data and elevation-based predictors of temperature, but also implement a unique spatio-temporal interpolation that
incorporates remotely sensed 1-km land skin temperature. The framework is able to capture several complex topoclimatic
variations, including minimum temperature inversions, and represent spatial uncertainty in interpolated normal temperatures.
Overall mean absolute errors for annual normal minimum and maximum temperature are 0.78 and 0.56 °C, respectively.
Homogenization of input station data also allows interpolated temperature trends to be more consistent with US Historical
Climate Network trends compared to those of existing interpolated topoclimatic datasets. The framework and resulting
temperature data can be an invaluable tool for spatially explicit ecological and hydrological modelling and for facilitating
better end-user understanding and community-driven improvement of these widely used datasets.
KEY WORDS
kriging; air temperature; land skin temperature; homogenization; MODIS
Received 28 January 2014; Revised 31 May 2014; Accepted 14 July 2014
1. Introduction
Given that climate is a key driver of many ecological and
hydrological processes (Running et al., 1987), the effects
of climate change have increasingly become a central
focus within different areas of environmental research,
conservation, and natural resource management (Wiens
and Bachelet, 2010; Glick et al., 2011; Millard et al.,
2012; Morisette, 2012). As a result, the demand for accurate and spatially continuous climate data that match the
scales of local environmental processes and land management decision-making has continued to rise (Daly, 2006;
Wiens and Bachelet, 2010; Beier et al., 2011). Assessments of climate change impacts across smaller regions
present a challenge, however, owing to the mismatch in
scale between local topoclimatic factors and synoptic outputs from global climate models (GCMs) and atmospheric
reanalyses (Beniston, 2006; Daly, 2006). This mismatch
is especially apparent in mountainous landscapes where
topography frequently drives rapid changes in temperature
and precipitation over relatively small spatial scales
(Beniston, 2006; Barry, 2008).
* Correspondence to: J. W. Oyler, Numerical Terradynamic Simulation
Group, Department of Ecosystem and Conservation Sciences, University
of Montana, 32 Campus Drive, Missoula, MT 59812, USA. E-mail:
jared.oyler@ntsg.umt.edu
© 2014 Royal Meteorological Society
Accordingly, gridded topoclimatic datasets (TCDs) that
account for local topoclimatic factors are often necessary
to assess local environmental impacts. TCDs generally
exist at spatial resolutions ≤10 km, the scale at which the
influence of topoclimatic factors such as elevation, cold
air drainage potential, and coastal zones becomes greatest (Daly, 2006). Within the conterminous United States
(CONUS), the most frequently used TCDs are the interpolated PRISM (Daly et al., 2002; Daly et al., 2008) and
Daymet (Thornton et al., 1997) datasets. Both datasets use
point-source weather station data and a digital elevation
model (DEM) to incorporate the effects of topoclimatic
factors and statistically interpolate climate variables to a
regular grid. The use of PRISM and Daymet is ubiquitous
and recent environmental modelling applications include
various GCM statistical downscaling efforts (e.g. Maurer
and Hidalgo, 2008; Abatzoglou and Brown, 2012), climate
impact assessments (e.g. Elsner et al., 2010; Littell et al.,
2010), wildfire hazard and risk assessments (e.g. Keane
et al., 2010), and analyses of trends in ecosystem productivity (e.g. Turner et al., 2011) and plant species distributions (e.g. Crimmins et al., 2011).
Figure 1. Process flow diagram of the TopoWx (‘Topography Weather’) statistical framework. Numbers above components represent sections where
components are described. GWR is geographically weighted regression.
While TCDs like PRISM and Daymet are clearly valuable and have been diligently maintained over many
years, their inherent limitations are often overlooked by
end-users, particularly in regard to spatial interpolation
uncertainty and their appropriateness in assessing interdecadal and long-term climate trends (Beier et al., 2011;
Bishop and Beier, 2013). Although model validation and
performance evaluations have been conducted by TCD
developers (Thornton et al., 1997; Daly et al., 2008), there
are currently no grid-cell-specific metrics of uncertainty.
It is consequently difficult for end-users to determine the
quality of a TCD for a specific region of interest or to
incorporate uncertainty into subsequent analyses.
Additionally, while TCDs usually have basic quality
assurance (QA) checks on input station data, they do
not account for changes in station siting, instrumentation, exposure or observation and data processing practices
through time – types of changes, termed inhomogeneities,
that can result in significant artificial jumps and trends in
climate (Menne et al., 2009; Trewin, 2010). The climate
community has conducted substantial research in this area
(e.g. Alexandersson, 1986; Peterson et al., 1998; Reeves
et al., 2007; Menne et al., 2009) and four global gridded
temperature datasets that account for inhomogeneities are
now available (Smith et al., 2008; Hansen et al., 2010;
Jones et al., 2012; Rohde et al., 2013), but inhomogeneity
detection and correction algorithms (i.e. homogenization
algorithms) have not yet been integrated into TCDs. Lastly,
most TCD models require expert knowledge to run and
are closed-source systems that cannot be easily extended
or improved by the general climate impacts research
community.
Addressing these limitations, we present an open source
statistical framework for modelling topoclimatic air temperature (Figure 1). Targeted to create a 30-arcsec (∼800
m) resolution CONUS dataset of 1948–2012 daily minimum and maximum temperatures (Tmin, Tmax), the
objectives of the framework, termed TopoWx (‘Topography Weather’), are to provide (1) improved temporal and
spatial representations of topoclimatic air temperature; (2)
grid-cell level uncertainty estimations; and (3) an impetus
to increase both end-user understanding of TCD limitations and end-user involvement in TCD development.
2. Materials and methods
2.1. Overview
Similar to existing TCDs (Thornton et al., 1997; Daly
et al., 2008), we use weather station data and spatial
grids of auxiliary predictors to model the influence of
topoclimatic factors and spatially interpolate daily Tmin
and Tmax. However, to address the limitations of existing TCDs and meet the framework objectives, we differentiate the TopoWx framework through several carefully
constructed components (Figure 1). A first component
consists of comprehensive QA procedures (Durre et al.,
2010) that better ensure the overall quality of the input station observation records (Section 2.2.2.). The second component consists of homogenization procedures (Menne
and Williams, 2009) that we apply to the quality assured
station data (Section 2.2.3.). Without homogenization,
inhomogeneities in the station records have the potential
to significantly bias temperature trends in the final gridded
output (Menne et al., 2009). The third component consists of missing value infilling procedures (Schafer, 1997;
Stacklies et al., 2007) that generate a serially complete
record at each station location (Section 2.2.4.). Missing
value infilling ensures a spatially consistent set of input
stations throughout the entire 1948–2012 time period,
yet still allows for the incorporation of important data
from short-term or incomplete station records. Lastly, a
set of several interpolation components consists of the
main spatio-temporal interpolation procedures that take
the homogenized, serially complete station data as input
and produce the final gridded topoclimatic temperature and
uncertainty estimates (Section 2.3.). The spatio-temporal
interpolation procedures include geostatistical kriging (Isaaks and Srivastava, 1989; Hengl, 2009), geographically weighted regression (GWR) (Fotheringham et al.,
2002), and a novel application of remotely sensed land skin
temperature (LST) as a spatial predictor of topoclimatic air
temperature (Wan and Li, 2011).
Figure 2. Maps of (a) the final set of 14 087 stations used as input to TopoWx and (b) underlying topography of the conterminous US. Station
networks include the daily Global Historical Climatology Network (GHCN-D), Remote Automatic Weather Stations (RAWS) network, and the
Snowpack Telemetry (SNOTEL) network. Each station has ≥5 years of raw data for each month. Boundaries represent US climate divisions.
2.2. Weather station data
2.2.1. Data sources
As our primary weather station data source, we use the
daily Global Historical Climatology Network (GHCN-D;
Menne et al., 2012), a global weather station dataset consisting of observations from a multitude of different networks and sources. We spatially limit GHCN-D stations
to North America between 53 and 22°N latitude and 126
and 64°W longitude, resulting in a total of 14 729 potential stations with temperature observations (Figure 2(a)).
To gain better spatial coverage in the topographically complex areas of the western CONUS (Figure 2(b)), we
also obtain 764 potential station records from the more
remote Natural Resources Conservation Service (NRCS)
Snowpack Telemetry (SNOTEL) network and 1308 potential station records from the US Forest Service and Bureau
of Land Management Remote Automatic Weather Stations
(RAWS) network.
For inclusion in the TopoWx framework, we require a
station to have at least 5 years of observations in each
month, a threshold much shorter than the 20-year threshold imposed by other longer-term TCDs (e.g. Livneh et al.,
2013). Our 5-year threshold was chosen based on the finding that at least 5–7 years of observations are required
before pairwise relationships between stations begin to stabilize (Hubbard, 1994; Camargo and Hubbard, 1999). The
ability to reliably model relationships between a station
and its neighbours is critical for infilling and extending
shorter station records back to 1948 (see Section 2.2.4.).
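The inclusion rule above can be illustrated with a minimal sketch. It assumes that a year counts towards a calendar month as soon as that month holds at least one non-missing observation; the text does not state how a "year of observations" is tallied, so this counting rule and the function names are hypothetical.

```python
from collections import defaultdict

MIN_YEARS_PER_MONTH = 5  # threshold stated in the text


def years_per_month(obs_dates):
    """Count, for each calendar month, the distinct years that have data.

    obs_dates: iterable of (year, month) pairs, one per non-missing
    observation. Counting a year as soon as it has one observation in the
    month is an assumption made for this sketch.
    """
    years = defaultdict(set)
    for year, month in obs_dates:
        years[month].add(year)
    return {m: len(years[m]) for m in range(1, 13)}


def passes_threshold(obs_dates, min_years=MIN_YEARS_PER_MONTH):
    """True if every calendar month has at least min_years of data."""
    counts = years_per_month(obs_dates)
    return all(n >= min_years for n in counts.values())
```

The same check would be repeated after QA-flagged observations are set to missing, since a station can fall below the threshold at that stage.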
2.2.2. Quality assurance
To check for possible duplicate observations, outliers and
numerous internal, temporal, and spatial inconsistencies
in the GHCN-D, RAWS, and SNOTEL station data, we
use the QA procedures of Durre et al. (2010). After we
mark any QA-flagged observation as missing, if a station falls below the 5-year threshold in any 1 month,
we drop it from the framework. Similar to PRISM (Daly
et al., 2008), we additionally check all station elevations
for consistency with corresponding location elevations
from a DEM. We manually investigate any station elevation having a discrepancy greater than 200 m (Daly
et al., 2008) and modify either the station elevation or
location.
2.2.3. Homogenization
Although the QA procedures remove bad observations,
they address neither potential inhomogeneities nor the
occurrence of time of observation departures (TODs)
where a station’s reported daily Tmin or Tmax is off by a
calendar day (Janis, 2002). TOD can be a significant problem for daily spatial interpolations given that the various
input station time series are assumed to be aligned at a daily
time step. To reduce TOD and inhomogeneities in the input
station data, we apply two adjustment procedures: a simple
daily time-step correction of Tmax observations having a
morning observation time and a homogenization algorithm
developed by Menne and Williams (2009).
Many US stations within GHCN-D are part of the
National Oceanic and Atmospheric Administration
(NOAA) Cooperative Observer Program (COOP) Network and staffed by volunteers. For convenience, Tmin
and Tmax observations at COOP stations are often manually taken once daily over a 24-h period that does not
directly correspond to the typical midnight-to-midnight
calendar day (Karl et al., 1986). For these non-midnight
observation times, the most common and consistent
instance of TOD is for morning observations of Tmax
as the recorded Tmax is likely for the previous calendar
day (Janis, 2002; Holder et al., 2006). Therefore, we shift
all morning Tmax observations back a day. For morning
Tmax observations at COOP stations in North Carolina,
Holder et al. (2006) found that this simple shift significantly improved correlations with midnight-to-midnight
observations at automated collocated stations. While less
frequent, depending on the time of year and the passage
of fronts, TOD issues can also occur for other observation
times of Tmin and Tmax, but their detection and correction are more complex and would likely require the use of
hourly data (Janis, 2002). Consequently, we limit explicit
daily TOD corrections to the more consistent and frequent
TOD occurrence within morning observations of Tmax.
We also do not apply the Tmax TOD correction to the
7.9% of input GHCN-D COOP Tmax observations that
are missing a documented observation time (n = 8 706 353
of 110 051 460).
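As a rough illustration of the morning Tmax shift described above, the sketch below re-dates readings by one day. The record layout, the function name, and the noon cut-off used to define "morning" are assumptions for illustration only; readings without a documented observation time are left unchanged, as in the text.

```python
from datetime import date, timedelta


def shift_morning_tmax(records, morning_end_hour=12):
    """Re-date morning Tmax readings to the previous calendar day.

    records: list of (obs_date, obs_hour, tmax) tuples, where obs_hour is
    the documented local observation hour (0-23) or None when unknown.
    The noon cut-off for "morning" is an assumption of this sketch.
    Midnight (hour 0) readings are treated as calendar-day observations.
    """
    shifted = []
    for obs_date, obs_hour, tmax in records:
        is_morning = obs_hour is not None and 0 < obs_hour < morning_end_hour
        if is_morning:
            obs_date = obs_date - timedelta(days=1)
        shifted.append((obs_date, obs_hour, tmax))
    return shifted
```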
In addition to TOD, non-midnight observations can also
result in a time-of-observation bias (TOB) where a single
Tmin during a very cold morning or a single Tmax during a
very warm late afternoon is recorded over two successive
days (DeGaetano, 1999; Janis, 2002). Even slight 1-day
shifts can result in seasonal biases for monthly temperature
(Karl et al., 1986). The TOB issue is particularly evident
when trying to assess temperature trends at stations whose
time-of-observation has changed through time and is one
of the main network-wide inhomogeneities in the US temperature record (Menne et al., 2009). Adjustment methods
have been developed to correct for TOB at a monthly time
step (e.g. Karl et al., 1986), but there has been less focus
on corrections for daily data.
For simplicity, we consolidate corrections for TOB
changes and all other network-wide and local inhomogeneities within the monthly time-step pairwise homogenization algorithm (PHA) of Menne and Williams (2009).
PHA uses a recursive implementation of the standard normal homogeneity test (SNHT; Alexandersson, 1986) and
numerous pairwise comparisons of temperature time series
to identify inhomogeneities in a station’s observations relative to surrounding stations (Menne and Williams, 2009).
Once specific artificial changepoints in a station’s temperature series are identified, PHA estimates their magnitude
and adjusts the segments between changepoints relative
to the most current identifiable homogeneous segment.
Although these adjustments effectively remove the trend
bias at a station, it is important to note that PHA does
not adjust for a station’s mean temperature bias (Menne
et al., 2009). For instance, if a station switches to a morning time-of-observation that causes an artificial drop in
monthly temperature, PHA will adjust all previous observations downward to remove the trend bias caused by the
change. Nonetheless, the station will still have a cool bias
in its mean monthly temperatures relative to stations that
observe on a midnight-to-midnight schedule. In the end,
the purpose of PHA is not to adjust all station records to a
theoretical set of standard observation practices, siting, and
instrumentation. Instead, the purpose of PHA is to remove
trend biases caused by individual station changes in such
items.
Within the TopoWx framework, we use the default configuration of the PHA v52i software (Menne and Williams,
2009; Williams et al., 2012), which is currently applied to
homogenize monthly station data for the US Historical Climatology Network (USHCN) v2.5 dataset (Menne et al.,
2009) and the GHCN-Monthly v3.2 dataset (Lawrimore
et al., 2011). As PHA runs on a monthly time step, we
first aggregate the daily station data to monthly means,
apply PHA, and then scale the daily values to match the
PHA-adjusted monthly means. This is similar to the procedure of Vincent et al. (2002) who homogenized daily
data at stations in Canada by adjusting daily observations to match homogenized monthly and annual data.
Although scaling daily observations to match the homogenized monthly data only corrects the mean and not the
variance or skewness of a station’s temperature distribution (Della-Marta and Wanner, 2006; Kuglitsch et al.,
2009), the approach is relatively straightforward and provides daily temperature series that match the trends and
variations in the homogenized monthly data without the
added complexities and uncertainties in detecting and
correcting inhomogeneities at a daily time step (Vincent
et al., 2002).
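The daily adjustment step can be sketched as follows, assuming the correction is a constant additive offset applied to every day in a month. That assumption is consistent with the statement that only the mean is corrected, but the exact adjustment used with PHA is not given here, and the data structures and names are hypothetical.

```python
from collections import defaultdict


def adjust_daily_to_monthly(daily, adjusted_monthly_means):
    """Shift daily values so each month's mean equals its adjusted mean.

    daily: dict mapping (year, month, day) -> temperature.
    adjusted_monthly_means: dict mapping (year, month) -> homogenized mean.
    An additive offset per month is assumed here; it changes the mean but
    leaves the within-month variance untouched.
    """
    by_month = defaultdict(list)
    for (year, month, _day), temp in daily.items():
        by_month[(year, month)].append(temp)

    offsets = {}
    for key, temps in by_month.items():
        raw_mean = sum(temps) / len(temps)
        offsets[key] = adjusted_monthly_means[key] - raw_mean

    return {k: t + offsets[(k[0], k[1])] for k, t in daily.items()}
```

Because each month receives a single offset, day-to-day differences within a month are preserved, which is why the variance and skewness of the daily distribution are not corrected.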
2.2.4. Missing value infilling
The frequent incompleteness of weather station observations creates an additional challenge for climate research
and spatio-temporal interpolations (Huth and Nemesova,
1995). Simply interpolating raw incomplete data could
produce inhomogeneities in the gridded output as the number of stations and station spatial coverage vary during
the 1948–2012 time period (Guentchev et al., 2010). The
issue is particularly acute in the mountainous areas of the
western CONUS where many remote and higher elevation
SNOTEL and RAWS stations have only come online in the
past 30 years.
Non-missing neighbouring observations have regularly been used
to infill missing data at a target station and
create serially complete station data (DeGaetano et al.,
1995; Huth and Nemesova, 1995; Eischeid et al., 2000).
The most generally accurate infilling method, termed spatial regression (Durre et al., 2010), uses overlapping observation periods to develop regression models between a
target station and neighbouring stations and then uses
the models to infill the target’s missing values (Kemp
et al., 1983; Huth and Nemesova, 1995; Hubbard and You,
2005). Quantified with a correlation metric, more weight
is often given to those neighbours having a stronger relationship with the target (Kemp et al., 1983; Hubbard and
You, 2005; Durre et al., 2010).
Building upon the spatial regression assumption that
there is a useful correlation structure between a target
station and its neighbours, we adopt a two-step statistical
procedure (Appendix S1) to infill missing temperature
values in the homogenized station records using not
only neighbouring longer-term stations, but also synoptic atmospheric conditions as provided by the National
Centers for Environmental Prediction/National Center
for Atmospheric Research reanalysis dataset (Kalnay
et al., 1996). The procedure is identical for Tmin and
Tmax and we complete it separately for each variable.
Using both an expectation maximization-based infilling (Schafer, 1997) and a principal component analysis
method robust to missing values (Stacklies et al., 2007),
we estimate the 1948–2012 daily temperature mean and
variance for an incomplete station time series and then
infill the daily anomalies around the mean (Appendix
S1). We found that this approach reduces mean absolute error (MAE) and maintains observed temperature
variance better than the pure spatial regression methods.
To ensure that station time series are consistent through
time, for any station that has more than 5 continuous
years of missing data during 1948–2012, we replace all
the station’s temperature observations with values from
the station’s infill model. While this will likely have
some effect on daily interpolation accuracy, the accuracy
trade-off allows us to still incorporate valuable data from
short-term stations while avoiding the introduction of even
slight artificial changepoints in temperature means and
variances.
2.3. Temperature interpolation
2.3.1. Auxiliary spatial predictors
At smaller spatial scales, synoptic-scale atmospheric conditions are mediated in the boundary layer by several main
topoclimatic factors, namely elevation, topographic convergence and cold air drainage potential, slope and aspect,
water bodies, and land cover (Daly, 2006; Dobrowski
et al., 2009). To define the interpolation grid and represent the main topoclimatic factor of elevation, we use the
30-arcsec PRISM DEM derived from the National Elevation Dataset (Gesch et al., 2002) by Daly et al. (2008).
We chose the DEM used by PRISM because it facilitates straightforward comparisons between TopoWx and
PRISM and allows for easier development of models that
can combine the two datasets.
To account for other topoclimatic factors not completely
represented by the DEM, we use spatially continuous
remotely sensed observations of LST. Compared to the
thermodynamic temperature that is typically measured 1.5
to 2 m above the ground, LST is the radiometric temperature of the ground surface (Jin and Dickinson, 2010).
Properties of the land surface, such as land cover, topography, albedo, and soil characteristics, and their interaction
with atmospheric conditions, control spatial patterns of
LST (Mostovoy et al., 2006; Jin and Dickinson, 2010).
While LST and air temperature have different physical
meanings, LST spatial and temporal variability have been
found to be highly correlated with air temperature and
LST has been used to inform air temperature interpolations where weather station observations are sparse (e.g.
Mostovoy et al., 2006; Vancutsem et al., 2010; Hengl
et al., 2011; Benali et al., 2012).
For observations of LST, we use the Moderate Resolution Imaging Spectroradiometer (MODIS), 8-day, 1-km
LST product (MYD11A2; Dozier, 1996; Wan, 2008).
MYD11A2 estimates LST using the thermal infrared signal received by the MODIS sensor and a split-window
algorithm that uses differential absorption in adjacent
infrared bands to correct for atmospheric attenuation and
land cover classification-based emissivities to account for
variability in surface emissivity (Dozier, 1996; Snyder
et al., 1998). The 8-day product is an average of daily
clear-sky LST during a respective 8-day period. We use
MYD11A2 from the Aqua satellite since its day and night
overpass times more closely correspond to the diurnal timing of Tmax and Tmin in the CONUS (Crosson et al.,
2012).
As Aqua MYD11A2 is only available from mid-2002
and we are interpolating temperature back to 1948, we
calculate 10-year (2003–2012) monthly LST means for
both day and night observations and use them as static
auxiliary predictors analogous to the elevation predictor,
but monthly-varying. In other words, we use a different
mean LST predictor for each month and temperature variable (Tmin or Tmax) for a total of 24 auxiliary mean
LST predictors. We quantify mean LST using the eight
MYD11A2 8-day periods centred around each respective
month.
Similar to the station data, MYD11A2 also suffers from
a significant amount of missing data largely due to cloud
contamination (Crosson et al., 2012). For a single 8-day
pixel value, if the MODIS QA flags indicate cloud contamination or other possible issues resulting in an average
emissivity error >0.02 or average LST error >2 °C, we
do not consider the value in the 10-year mean. We also
completely remove any grid cell missing more than two
thirds of its 2003–2012 8-day values. When only using
non-missing data to calculate mean LST, we found that
missing data, especially during regional cloudy periods
in winter, resulted in discontinuities and spatial artefacts.
Using the three nearest stations to each MYD11A2 grid
cell, we consequently apply the same mean estimation procedure used for the station data (Appendix S1) to better
estimate 2003–2012 mean LST values.
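The screening rules in this paragraph can be sketched as below. The sketch covers only the QA filter and the two-thirds missing-data rule; the plain average it returns is a stand-in, since the text reports that a simple mean of non-missing values produced artefacts and that the Appendix S1 estimator was used instead. Passing per-composite error estimates as plain lists is a hypothetical simplification and does not reflect the actual MODIS QA bit layout.

```python
def mean_lst(values, emis_err, lst_err):
    """Screened mean LST for one grid cell from its 8-day composites.

    values, emis_err, lst_err: equal-length, non-empty lists; values[i]
    is None when the composite is missing. Composites whose emissivity
    error exceeds 0.02 or whose LST error exceeds 2 degrees C are
    discarded. Returns None when more than two thirds of the composites
    are unusable.
    """
    usable = [
        v for v, e, t in zip(values, emis_err, lst_err)
        if v is not None and e <= 0.02 and t <= 2.0
    ]
    n_missing = len(values) - len(usable)
    # Integer comparison avoids floating-point trouble at exactly 2/3.
    if 3 * n_missing > 2 * len(values):
        return None
    return sum(usable) / len(usable)
```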
To further characterize the influence of topography on
daily cold air drainage, we derive a multi-scale topographic
dissection index (TDI; Holden et al., 2011a) from the
PRISM DEM:
\[ \mathrm{TDI}(s_0) = \sum_{i=1}^{n} \frac{z(s_0) - z_{\min}(i)}{z_{\max}(i) - z_{\min}(i)} \qquad (1) \]
where TDI(s0) is the final multi-scale TDI value for
grid-cell location s0, z(s0) is the elevation of grid-cell location s0, zmin(i) is the overall minimum grid-cell elevation in
spatial window i, zmax(i) is the overall maximum grid-cell
elevation in spatial window i and n is the number of spatial windows (Holden et al., 2011a). The TDI for a specific
window size reflects the height of a grid cell relative to
the surrounding terrain and ranges from 0 (lower than the
surrounding terrain) to 1 (higher than the surrounding terrain). Across a network of temperature sensors in complex
terrain, Holden et al. (2011a) found a multi-scale TDI to
be well correlated with daily patterns of Tmin anomalies
influenced by cold air drainage. We calculate our multi-scale TDI,
which ranges in value from 0 to 5, across a total of five
spatial window sizes (3, 6, 9, 12, and 15 km). Although
our selection of these five window sizes is subjective, the
window sizes account for spatial variations in an optimal
TDI scale across the CONUS domain, yet still maintain a
spatially static definition of the TDI predictor.
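Equation (1) can be illustrated with a small sketch on a toy elevation grid. The window half-widths are given in grid cells and are hypothetical values; the treatment of grid edges (windows are clipped) and of flat windows (they contribute 0) is not specified in the text and is assumed here.

```python
def tdi(dem, row, col, half_widths=(1, 2, 3)):
    """Multi-scale topographic dissection index for one grid cell.

    dem: 2-D list of elevations. half_widths: window half-sizes in cells
    (hypothetical values; the text uses windows of 3 to 15 km). Windows
    are clipped at the grid edge, and a flat window contributes 0, both
    of which are choices made for this sketch.
    """
    n_rows, n_cols = len(dem), len(dem[0])
    z0 = dem[row][col]
    total = 0.0
    for hw in half_widths:
        window = [
            dem[r][c]
            for r in range(max(0, row - hw), min(n_rows, row + hw + 1))
            for c in range(max(0, col - hw), min(n_cols, col + hw + 1))
        ]
        z_min, z_max = min(window), max(window)
        if z_max > z_min:
            total += (z0 - z_min) / (z_max - z_min)
    return total
```

With n windows, each term lies between 0 and 1, so the sum ranges from 0 to n, which matches the 0 to 5 range quoted for five window sizes.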
2.3.2. Monthly normal temperature interpolation
Similar to the two-step infilling algorithm, we use a
two-step interpolation procedure that first interpolates the
monthly temperature normals at a grid cell and then the
1948–2012 daily variation around the normals. The procedure is again identical for both Tmin and Tmax. We define
a month’s normal Tmin or Tmax as the month’s mean value
from 1981–2010, the latest 30-year normal period defined
by NOAA’s National Climatic Data Center. We adopt a
regression-kriging (RK) framework (Hengl et al., 2004)
that assumes monthly normal temperature represents a spatial process that can be expressed by the sum of deterministic and spatially autocorrelated stochastic components:
\[ T(s_0, m_0) = T_{\mu}(s_0, m_0) + T_e(s_0, m_0) \qquad (2) \]
where T(s0, m0) is the final interpolated normal temperature at grid-cell location s0 and month m0, Tμ(s0, m0) is
the deterministic spatial trend or drift in normal temperature modelled by station horizontal locations and auxiliary
predictors, and Te(s0, m0) is a stochastic spatially autocorrelated residual with mean zero (Hengl et al., 2004;
Webster and Oliver, 2007).
Following the RK framework of Hengl et al. (2004) and
the multiple linear regression model of Florio et al. (2004),
we use linear regression to fit Tμ(s0, m0) and ordinary
kriging (OK) to interpolate Te(s0, m0):
\[ T(s_0, m_0) = \beta_0 + \beta_1 x + \beta_2 y + \beta_3 z + \beta_4\,\mathrm{lst}(m_0) + \sum_{i=1}^{n} w_i(s_0, m_0) \cdot T_e(s_i, m_0) \qquad (3) \]
where β0, β1, β2, β3, and β4 are the estimated regression
trend model coefficients for the intercept, longitude, latitude, elevation, and monthly average LST, respectively;
x, y, z, and lst(m0) are the longitude, latitude, elevation,
and average LST for m0 at grid-cell location s0; wi(s0, m0)
are weights defined by residual spatial covariance; and
Te(si, m0) are the regression residuals for n stations.
In addition to the interpolation of T, RK provides an
important estimate of kriging prediction standard error
(𝜎 k ) at every grid cell and month, which is a straightforward method to represent general spatial uncertainty
in interpolated monthly normals. RK 𝜎 k is a composite
uncertainty measure that reflects not only the interpolation
error associated with the regression trend model, but also
the geographical arrangement of stations (Hengl et al.,
2004). For instance, RK 𝜎 k will be higher for grid cells
that are located further away from station locations and
from the centre of the station predictor space (Hengl
et al., 2004). We estimate RK 𝜎 k through the calculation
of the universal kriging 𝜎 k (Cressie, 1993; Hengl et al.,
2004).
Like most traditional kriging analyses, we use a variogram model to define the spatial covariance structure
of T e (Isaaks and Srivastava, 1989). However, given the
large and diverse landscape of the CONUS, it is likely not
valid to use a single global variogram that assumes the
covariance structure is the same within the entire domain
and across months (Lloyd, 2009). Additionally, performing RK at each grid cell using a global regression model
and the entire population of stations would be computationally inefficient (Hengl, 2009) and would tend to oversmooth
the interpolations or yield less accurate predictions and
uncertainty estimates than more locally defined
models (Lloyd, 2009). To account for non-stationarity in
regression parameters and T e covariance, we use a local
moving window kriging (MWK; Haas, 1990) implementation of RK (MW-RK), a kriging approach that fits a
separate local regression and variogram around each and
every interpolation point using only n surrounding stations
(Appendix S2).
2.3.3. Daily temperature anomaly interpolation
Similar to the MW-RK monthly normal temperature interpolations, we assume that 1948–2012 daily temperature
anomalies from the 1981–2010 normals can be expressed
as the sum of a regression-modelled spatial trend and an
interpolated residual. However, because there are 23 742
days from 1948–2012, a daily varying MW-RK approach
is not computationally practical. We therefore apply a
simpler daily-varying interpolation model where we use
a moving window GWR for the spatial trend and inverse
distance weighting (IDW) for the residual interpolation.
GWR is similar to regular linear regression except observation points are weighted according to their distance
from a prediction location (Fotheringham et al., 2002).
The GWR model is identical to the regression component in Equation (3) except we add TDI as an additional
auxiliary predictor. For each month, we use the optimization procedures discussed in Appendix S2 to obtain locally
optimal n values for the number of surrounding stations to
use in the GWR and IDW. We calculate station weights for
the GWR via a bisquare weighting function:
\[ w_i(s_0) = \left[ 1 - \left( \frac{h(s_0)_i}{r} \right)^2 \right]^2 \qquad (4) \]
where wi (s0 ) is the distance-based weight of station i at
interpolation location s0 , r is the interpolation window
radius defined as the distance to the (n + 1)th closest station,
and h(s0 )i is the distance between station i and interpolation location s0 . We use a power parameter of 2 for the
IDW. While the GWR model and IDW interpolations vary
daily, both the locally optimal n and the GWR and IDW
weights remain constant for each month. We obtain a final
estimate of actual daily temperature by combining T and
the interpolated anomaly:
$$
T(s_0, d_0) = T(s_0, m_0) + \delta T(s_0, d_0) \tag{5}
$$
where T(s0 , d0 ) is the temperature at interpolation point s0
for day d0 within m0 and 𝛿T(s0 , d0 ) is the daily temperature
anomaly at interpolation point s0 for day d0 .
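The anomaly step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the TopoWx implementation: the function names, the single-predictor example, and the choice to zero out stations at or beyond the window radius are ours, and the IDW step assumes no station sits exactly on the interpolation point.

```python
import numpy as np

def bisquare_weights(dist, n):
    """Bisquare weights in the form of Equation (4) for the n nearest stations.

    The window radius r is taken as the distance to the (n + 1)th closest
    station. Setting weights to zero at or beyond r is an assumption of
    this sketch.
    """
    dist = np.asarray(dist, dtype=float)
    r = np.sort(dist)[n]                  # index n is the (n + 1)th closest
    w = (1.0 - (dist / r) ** 2) ** 2
    w[dist >= r] = 0.0
    return w

def gwr_idw_anomaly(dist, X, anom, x0, n, power=2.0):
    """Anomaly at one location: locally weighted regression trend
    plus inverse distance weighting (IDW) of the station residuals."""
    dist = np.asarray(dist, dtype=float)
    anom = np.asarray(anom, dtype=float)
    w = bisquare_weights(dist, n)
    A = np.column_stack([np.ones(len(anom)), X])   # intercept + predictors
    sw = np.sqrt(w)                                # weighted least squares
    beta, *_ = np.linalg.lstsq(A * sw[:, None], anom * sw, rcond=None)
    resid = anom - A @ beta
    use = w > 0                                    # only the n nearest stations
    idw = 1.0 / dist[use] ** power                 # assumes dist > 0
    resid0 = np.sum(idw * resid[use]) / np.sum(idw)
    trend0 = np.concatenate([[1.0], np.atleast_1d(x0)]) @ beta
    return float(trend0 + resid0)

def daily_temperature(normal, anomaly):
    """Equation (5): monthly normal plus interpolated daily anomaly."""
    return normal + anomaly
```

Because the text states that n and the weights are fixed per month, a practical version would compute `bisquare_weights` once per month and location and reuse it for every day in that month.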
Our combined use of a more complex procedure to interpolate T and a simpler, faster method to interpolate 𝛿T can
be considered a form of climatologically aided interpolation (CAI; Willmott and Robeson, 1995). However, unlike
traditional implementations of CAI that model 𝛿T with
univariate methods like pure IDW (Willmott and Robeson,
1995), we incorporate auxiliary predictors that can be critical for properly representing topoclimatic spatial patterns
of 𝛿T. Holden et al. (2011b) showed topoclimatic factors
in a mountainous region to be directly related to spatial
patterns of 𝛿T, especially during stable atmospheric conditions favourable for cold air inversions.
2.4. Validation
2.4.1. Basic error statistics
For a basic validation of the infilled daily station temperatures, interpolated monthly normal temperatures, and
interpolated daily temperatures, we use three main model
performance metrics: MAE (Willmott and Matsuura,
2005), bias, and the refined index of agreement (dr ), a
dimensionless measure of average error (Willmott et al.,
2012; Appendix S3). The dr metric (Equation S5) ranges
from −1.0 to 1.0 with a value >0.5 indicating a predictive ability greater than the observed mean (Willmott
et al., 2012; Legates and McCabe, 2013). Unlike basic
correlation measures, dr is sensitive to differences in
magnitude and variance between observed and modelled
values (Legates and McCabe, 1999). Since the largest
mode of variability in a station’s time series is normally
the seasonal cycle, we also apply a baseline adjustment
to dr (Legates and McCabe, 1999; Willmott et al., 2012;
Appendix S3). This effectively avoids inflated dr values
that are simply the result of the model capturing the main
seasonality, but not necessarily day-to-day variability
(Legates and McCabe, 1999; Willmott et al., 2012).
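As a rough illustration of the three metrics, the sketch below computes MAE, bias, and dr. The dr formula follows the commonly cited definition of Willmott et al. (2012) with scaling constant c = 2; the exact form in Equation S5 is not reproduced here, so treat this as an assumption. The optional `baseline` argument is our own device for the baseline adjustment: passing a seasonal climatology instead of the overall observed mean is one plausible reading of Appendix S3.

```python
import numpy as np

def mae(pred, obs):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(pred, float) - np.asarray(obs, float))))

def bias(pred, obs):
    """Mean error (predicted minus observed)."""
    return float(np.mean(np.asarray(pred, float) - np.asarray(obs, float)))

def refined_index_of_agreement(pred, obs, baseline=None, c=2.0):
    """Refined index of agreement dr, bounded by -1 and 1.

    baseline defaults to the overall observed mean; supplying a seasonally
    varying baseline (hypothetical usage) removes credit for merely
    reproducing the seasonal cycle.
    """
    pred = np.asarray(pred, float)
    obs = np.asarray(obs, float)
    if baseline is None:
        baseline = np.full_like(obs, obs.mean())
    err = np.sum(np.abs(pred - obs))
    dev = c * np.sum(np.abs(obs - baseline))
    if dev == 0:
        raise ValueError("observations do not deviate from the baseline")
    if err <= dev:
        return float(1.0 - err / dev)
    return float(dev / err - 1.0)
```

With this definition, predicting the observed mean everywhere gives dr = 0.5, which is why values above 0.5 indicate skill beyond the observed mean.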
We use three separate sets of stations to validate the daily
temperature infill models: long-term GHCN-D stations
that are part of USHCN and at least 95% complete for the
1948–2012 time period and SNOTEL and RAWS stations
that have at least 20 years of data. Assuming a worst-case
missing data scenario, for each station, we set all but its last
5 years of observations to missing, build the 1948–2012
temperature infill models and then compare the infilled
values with the observed values that were artificially set
to missing. We calculate an overall daily MAE, bias, and
mean dr (d̄r) for the three networks. We also calculate the
MAE of average station temperatures (AVG-MAE), which
is essentially the mean of the absolute station biases.
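The difference between daily MAE and AVG-MAE is easy to see in a toy case. The sketch below is illustrative only; the function names and the list-of-arrays input layout are assumptions.

```python
import numpy as np

def daily_mae(pred_by_station, obs_by_station):
    """MAE pooled over every station-day."""
    err = np.concatenate([np.asarray(p, float) - np.asarray(o, float)
                          for p, o in zip(pred_by_station, obs_by_station)])
    return float(np.mean(np.abs(err)))

def avg_mae(pred_by_station, obs_by_station):
    """MAE of station-average temperatures, i.e. the mean absolute station bias."""
    station_bias = [np.mean(np.asarray(p, float) - np.asarray(o, float))
                    for p, o in zip(pred_by_station, obs_by_station)]
    return float(np.mean(np.abs(station_bias)))

# Station 1 has offsetting daily errors; station 2 is uniformly 1 degree warm.
pred = [[1.0, 3.0], [5.0, 5.0]]
obs = [[2.0, 2.0], [4.0, 4.0]]
print(daily_mae(pred, obs), avg_mae(pred, obs))
```

Day-to-day errors that cancel within a station inflate daily MAE but not AVG-MAE, so AVG-MAE isolates persistent station-level offsets.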
To evaluate model performance in the interpolation
of 1981–2010 temperature normals and 1948–2012
daily Tmin and Tmax, we perform a leave-one-out
cross-validation (LOOCV; Willmott and Robeson, 1995)
with every station in the interpolation domain. We summarize MAE, bias, and dr by US climate division (Guttman
and Quayle, 1996) and, following Abatzoglou (2013),
October–April (‘cold’ season) and May–September
(‘warm’ season) time periods. For daily temperature,
we limit the LOOCV to only non-missing, non-infilled
observations to provide a better indication of the error
associated with actual observed temperature and not the
infilled values.
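The leave-one-out procedure itself is independent of the interpolator. The sketch below uses plain IDW as a stand-in predictor (an assumption for brevity; it is not the MW-RK or GWR model) and assumes no two stations share the same coordinates.

```python
import numpy as np

def idw_predict(xy, values, xy0, power=2.0):
    """Stand-in interpolator: inverse distance weighting with power 2."""
    d = np.hypot(*(xy - xy0).T)          # distances to the target point
    w = 1.0 / d ** power                 # assumes no zero distances
    return float(np.sum(w * values) / np.sum(w))

def loocv(xy, values, predict=idw_predict):
    """Withhold each station in turn and predict it from the others."""
    xy = np.asarray(xy, float)
    values = np.asarray(values, float)
    preds = np.empty(len(values))
    for i in range(len(values)):
        keep = np.arange(len(values)) != i
        preds[i] = predict(xy[keep], values[keep], xy[i])
    return preds

xy = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
values = [1.0, 2.0, 3.0]
preds = loocv(xy, values)
print(preds, np.mean(np.abs(preds - np.asarray(values))))
```

For the daily case described above, the same loop would be restricted to station-days with actual (non-infilled) observations before the errors are summarized.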
2.4.2. Homogenization
Since TopoWx is the first CONUS-scale TCD to use
homogenized station data, it is important to specifically
validate the homogenization process. As a validation
dataset, we use homogenized monthly observations
from the official USHCN v2.5 product (Menne et al.,
2009; version 2.5.0 20130622). For each USHCN station (n = 1218), we extract the TopoWx interpolated
1948–2012 daily temperatures from the nearest 30-arcsec
grid cell. Following Menne et al. (2009), we then calculate
annual temperature anomalies (1981–2010 base period)
for each USHCN station location for both the USHCN
v2.5 and TopoWx data and interpolate the anomalies to a
0.25∘ grid using the IDW method of Willmott et al. (1985).
From the 0.25∘ anomaly grids, we calculate and compare the area-weighted CONUS mean annual anomalies of
TopoWx and USHCN v2.5. If TopoWx effectively homogenizes the input station data, 1948–2012 CONUS annual
anomaly differences between TopoWx and USHCN v2.5
should be small. Although USHCN v2.5 is the official
homogenized station dataset for the CONUS, it also uses
PHA and is not completely independent from TopoWx.
Therefore, we also compare TopoWx CONUS annual
anomalies to those of Berkeley Earth, a global temperature dataset that uses an entirely different procedure to
account for station record inhomogeneities (Rohde et al.,
2013).
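The area-weighted mean over a regular latitude-longitude grid can be sketched as follows. The text does not state the weighting scheme, so the cosine-of-latitude weights used here are an assumption, as are the function name and the use of NaN to mark cells outside the CONUS.

```python
import numpy as np

def area_weighted_mean(grid, lats):
    """Area-weighted mean of a regular lat-lon grid.

    grid: array of shape (nlat, nlon), NaN outside the domain.
    lats: cell-centre latitudes in degrees, length nlat.
    Weights are proportional to cos(latitude), an assumption of this sketch.
    """
    grid = np.asarray(grid, float)
    w = np.cos(np.deg2rad(np.asarray(lats, float)))[:, None] * np.ones_like(grid)
    valid = ~np.isnan(grid)
    return float(np.sum(grid[valid] * w[valid]) / np.sum(w[valid]))

anomalies = np.array([[1.0, 1.0], [3.0, np.nan]])
print(area_weighted_mean(anomalies, lats=[0.0, 60.0]))
```

Applying this to each year's 0.25° anomaly grid yields one CONUS value per year per dataset, and the comparison reduces to differencing those annual series.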
To examine whether the homogenized TopoWx TCD
does in fact improve upon non-homogenized TCDs,
we additionally apply the same USHCN v2.5 and
Berkeley Earth annual anomaly comparison to three
non-homogenized datasets: TopoWx interpolations based
on non-homogenized station data (TopoWx Raw), the
Daymet 1-km product (Thornton et al., 2012), and the
PRISM 2.5-min monthly product (PRISM Climate Group,
2013a). In conducting these comparisons, we acknowledge that most non-homogenized TCDs were never
intended to be used to analyse temperature trends (PRISM
Climate Group, 2013b). Nonetheless, the use of TCDs in
such a context continues to occur (e.g. Diaz and Eischeid,
2007; van Mantgem et al., 2009; Crimmins et al., 2011).
2.4.3. Land skin temperature
Within the TopoWx framework, the application of
remotely sensed LST as an auxiliary predictor likely
has the greatest potential to improve spatial representations of temperature normals (Hengl et al., 2011). To
quantify the influence of the LST predictors and whether
the influence differs between Tmin and Tmax, we first
compare temperature normal biases between TopoWx
and three TCDs that do not use LST: TopoWx without
an LST predictor (TopoWx-No-LST), the Daymet 1-km
product (Thornton et al., 2012), and the PRISM 30-arcsec
1981–2010 monthly normals product (PRISM Climate
Group, 2012). Both Daymet and PRISM use a GWR
approach to interpolation, but Daymet only accounts for
elevation (Thornton et al., 1997), while PRISM has a
sophisticated station weighting scheme to account for
numerous other topoclimatic factors (Daly et al., 2002;
Daly et al., 2008). We focus the bias analysis on stations
in the more topographically complex western CONUS
(n = 4923; Figure 2(a)). For all four datasets, we calculate
bias in relation to an index of station LST spatial setting
(LST-I). We generate LST-I values for each station by
applying the TDI calculation in Equation (1) to the LST
grids. An LST-I value of 0 represents an area with an LST
value relatively colder than surrounding terrain while a
value of 5 represents an area with an LST value relatively
warmer than surrounding terrain.
In addition to the bias analysis, we also analyse the absolute and relative influence of LST and the other MW-RK
predictors (longitude, latitude, and elevation) on interpolations of western CONUS monthly normals. At each station
location, we perform basic monthly multiple linear regressions of the predictors and monthly normals. We quantify
relative predictor influence by partitioning the proportion of model variance explained (R2 ) accounted for by
each predictor using the ‘lmg’ method (Lindeman et al.,
1980) of the relaimpo package (Grömping, 2006) within
the R environment for statistical computing (R Core Team,
2012). The lmg method averages the sequential sum of
squares over different predictor orderings to better account
for multicollinearity.
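The idea behind the lmg decomposition can be sketched directly: for every ordering of the predictors, record how much R2 rises when each predictor enters the model, then average those increments over all orderings. The brute-force version below is a minimal Python illustration of that idea, not the R package's code; the function names and synthetic data are assumptions.

```python
import itertools
import numpy as np

def r_squared(X, y):
    """R2 of an ordinary least squares fit with intercept."""
    if X.shape[1] == 0:
        return 0.0
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(1.0 - resid @ resid / np.sum((y - y.mean()) ** 2))

def lmg_shares(X, y):
    """Average sequential R2 contribution of each predictor over all orderings."""
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    p = X.shape[1]
    shares = np.zeros(p)
    orders = list(itertools.permutations(range(p)))
    for order in orders:
        prev = 0.0
        for k, j in enumerate(order, start=1):
            r2 = r_squared(X[:, list(order[:k])], y)
            shares[j] += r2 - prev       # gain in R2 when predictor j enters
            prev = r2
    return shares / len(orders)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(size=50)
shares = lmg_shares(X, y)
print(shares, shares.sum())
```

The shares sum to the full-model R2, so dividing by that total gives the proportion of explained variance attributed to each predictor. With four predictors there are only 24 orderings, so brute force is adequate.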
2.4.4. Uncertainty
In addition to improved temporal and spatial representations of topoclimatic air temperature, one of the main
objectives of TopoWx is to provide accurate grid-cell
level estimations of uncertainty. To assess the accuracy
of MW-RK prediction standard error (𝜎 k ), we evaluate
the relationship between station LOOCV monthly normal
MAE and 𝜎 k . If 𝜎 k properly accounts for local variability in station monthly normals, 𝜎 k should have a strong
positive correlation with MAE (Harris et al., 2010). We
also examine the relationship between 𝜎 k and MAE at a
regional scale by quantifying the correlation between climate division average MAE and 𝜎 k .
Besides correlating 𝜎 k to MAE, by assuming normality,
we can use 𝜎 k to estimate symmetric prediction confidence
intervals (PCIs). If the PCIs are accurate, a given n% of
LOOCV predictions should fall within their n% PCI (Harris et al., 2010). For instance, 95% of LOOCV predictions
should fall within their respective 95% PCI. We quantify
PCI accuracy across the full range of interval probabilities with the G-statistic (Goovaerts, 2001; Harris et al.,
2010). The G-statistic ranges from 0.0 to 1.0 with values
closer to 1.0, indicating higher PCI accuracy. As previously described, 𝜎 k is a composite measure that incorporates uncertainty from both the deterministic and spatially
autocorrelated stochastic components of the MW-RK procedure (Hengl et al., 2004). Because most other TCDs
use forms of GWR that only model a deterministic spatial trend (Thornton et al., 1997; Daly et al., 2008), we
also compare 𝜎 k to uncertainty estimates from a GWR
version of the MW-RK trend model (Equation (3)). We
use Equation (4) to define local GWR weights and calculate GWR prediction standard errors (𝜎 GWR ) according to
Leung et al. (2000). We compare the 𝜎 GWR MAE correlations, G-statistics, and average PCI widths to those of 𝜎 k
and also examine differences in spatial patterns between
the two uncertainty measures.
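The PCI check can be sketched as below. Intervals are symmetric and built from a normal quantile, as the text assumes. The G-statistic uses the asymmetric weighting usually attributed to Goovaerts (2001), in which shortfalls in coverage count twice as much as excesses; whether TopoWx used exactly this variant, and the discretization of the interval probabilities, are assumptions of this sketch.

```python
from statistics import NormalDist
import numpy as np

def pci_coverage(obs, pred, sigma, p):
    """Fraction of observations inside the symmetric p-probability interval."""
    obs, pred, sigma = (np.asarray(a, float) for a in (obs, pred, sigma))
    z = NormalDist().inv_cdf(0.5 + p / 2.0)      # e.g. about 1.96 for p = 0.95
    return float(np.mean(np.abs(obs - pred) <= z * sigma))

def g_statistic(obs, pred, sigma, probs=None):
    """Accuracy of prediction intervals across interval probabilities.

    G = 1 - integral of [3a(p) - 2][coverage(p) - p] dp, where a(p) = 1 when
    coverage(p) >= p and 0 otherwise. The integral is approximated by a mean
    over evenly spaced probabilities.
    """
    if probs is None:
        probs = np.linspace(0.01, 0.99, 99)
    cover = np.array([pci_coverage(obs, pred, sigma, p) for p in probs])
    diff = cover - probs
    weight = np.where(diff >= 0, 1.0, -2.0)      # 3a(p) - 2
    return float(1.0 - np.mean(weight * diff))

rng = np.random.default_rng(1)
obs = rng.normal(size=20000)
pred = np.zeros_like(obs)
g_good = g_statistic(obs, pred, np.ones_like(obs))        # well-calibrated sigma
g_narrow = g_statistic(obs, pred, 0.5 * np.ones_like(obs))  # overconfident sigma
print(pci_coverage(obs, pred, np.ones_like(obs), 0.95), g_good, g_narrow)
```

The same coverage function can be applied to 𝜎 k and 𝜎 GWR in turn, and the average interval width at a given probability is simply twice the normal quantile times the mean standard error.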
3. Results and discussion
3.1. Basic error statistics
3.1.1. Infilled missing values
After the removal of QA-flagged observations (Table S1),
14 087 stations met the minimum criteria of 5 years of
Table 1. Infill error statistics. (a) Cross-validation error statistics for daily 1948–2012 temperature infilling based on using only 5
years of data to build the infill models. (b) Error statistics for daily 1948–2012 temperature infilling on days with both infilled and
observed values for stations within the CONUS.
| Network (number of stations) | Tmin bias (°C) | Tmin AVG-MAE (°C) | Tmin daily MAE (°C) | Tmin dr | Tmax bias (°C) | Tmax AVG-MAE (°C) | Tmax daily MAE (°C) | Tmax dr |
|---|---|---|---|---|---|---|---|---|
| (a) GHCN-D (626) | +0.00 | 0.22 | 1.36 | 0.82 | +0.01 | 0.23 | 1.48 | 0.82 |
| (a) RAWS (376) | −0.12 | 0.19 | 1.58 | 0.76 | +0.04 | 0.18 | 1.40 | 0.83 |
| (a) SNOTEL (541) | −0.07 | 0.15 | 1.67 | 0.75 | +0.09 | 0.23 | 1.77 | 0.78 |
| (b) GHCN-D (9480) | +0.00 | 0.03 | 1.06 | 0.85 | +0.00 | 0.03 | 1.03 | 0.87 |
| (b) RAWS (1244) | −0.02 | 0.06 | 1.10 | 0.83 | −0.00 | 0.06 | 1.01 | 0.88 |
| (b) SNOTEL (691) | −0.01 | 0.04 | 1.10 | 0.83 | +0.00 | 0.05 | 1.14 | 0.86 |
Error metrics are defined in Section 2.4.1.
observations for Tmin and/or Tmax (Figure 2(a)). Of
these, a total of 626 USHCN stations, 376 RAWS stations, and 541 SNOTEL stations were used for infill
model cross-validation based on their longer periods of
record. Overall, cross-validation errors for the infill models appeared to be reasonable, especially considering that
the cross-validation procedure limited model building to
only 5 years of data (Table 1(a)). Except for a RAWS
Tmin bias of −0.12 ∘ C, temperature bias for each network
was within ±0.10 ∘ C. For all three networks, AVG-MAE
was <0.25 ∘ C, daily MAE was <2.0 ∘ C, and dr was ≥0.75
(Table 1(a)).
As described in Section 2.2.4, to minimize even slight
artificial mean and variance changepoints, for any station
with more than 5 continuous years of missing data from
the period 1948–2012, we replace all the station’s temperature observations with values from the station’s infill
model. For both Tmin and Tmax, around 80% of stations
fell into this category and had their observations replaced
with infilled values (Tmin n = 11 289; Tmax n = 11 315).
This still resulted in around 3000 long-term stations retaining non-infilled observations. To make sure that the infilled
values adequately represented the original observations at
the shorter-term stations, we calculated error summaries
for all stations in the CONUS (Table 1(b)). These error
statistics are different than those from the cross-validation
procedure as they represent residuals between infilled values and the observations from which the infill models
were actually built. For all three networks and both Tmin
and Tmax, bias was within ±0.02 ∘ C, AVG-MAE was
<0.10 ∘ C, daily temperature MAE was <1.15 ∘ C, and dr
was ≥0.83.
3.1.2. Interpolated monthly normal temperatures
Across the CONUS, overall LOOCV monthly normal
Tmin MAE (0.80–0.84∘ C) was higher than overall
monthly normal Tmax MAE (0.60–0.62 ∘ C; Figure 3).
Likely a reflection of the multifaceted relationship between
Tmin and elevation (e.g. Bolstad et al., 1998; Lundquist
et al., 2008; Daly et al., 2010; Holden et al., 2011a),
higher monthly normal Tmin MAE was most apparent in
the topographically complex areas of the western CONUS
(Figure 3). Monthly normal Tmin MAE was less elevated
in climate divisions with relatively flat and homogeneous
landscapes, especially in the interior plains of the central CONUS (Figure 3). The highest monthly normal
Tmax MAE was during the May–September time period
within climate divisions along the California Pacific coast
(Figure 3). During the summer, owing to the relatively
cool California current and the position of the North
Pacific High, coastal marine inversion layers and stratus
clouds produce a strong Tmax gradient from the coast to
more inland areas and a greatly complicated relationship
between elevation and Tmax (Daly et al., 2008; Iacobellis
and Cayan, 2013).
Overall monthly normal Tmin was slightly positively
biased (+0.01 ∘ C) for the CONUS while monthly normal Tmax was slightly negatively biased (−0.03 to
−0.01 ∘ C; Figure 3). At the scale of individual climate
divisions, Tmin generally had marginally larger biases
than Tmax. For instance, 33% of climate divisions had
a monthly normal Tmin absolute bias >0.1 ∘ C for both
the October–April and May–September time periods
compared to 23% of climate divisions for Tmax.
3.1.3. Interpolated daily temperatures
Compared to the monthly normals, daily temperature
LOOCV MAE was greater with overall daily Tmin
(Tmax) MAE ranging from 1.43 ∘ C (1.34 ∘ C) in the
May–September time period to 1.75 ∘ C (1.61 ∘ C) in
the October–April time period (Figure 4). Similar to the
monthly normals, higher daily Tmin MAE was noticeable in the topographically complex areas of the western
CONUS and daily Tmax MAE had higher values along
the Pacific coast during the May–September time period
(Figure 4). From October–April, a north-south swath of
higher daily Tmax MAE was also evident through portions
of the Rocky Mountains and Great Plains (Figure 4). This
region of higher daily Tmax MAE could be a result of
both the occurrence of wintertime Tmax inversions (Daly
et al., 2010) and the higher frequency and magnitude of
wintertime cold and warm fronts (Camargo and Hubbard,
Figure 3. Leave-one-out cross-validation error statistics for interpolated 1981–2010 monthly temperature normals summarized by US climate
division. Statistics are based on all input GHCN-D, SNOTEL, and RAWS stations within the conterminous United States (n = 11 589 for minimum
temperature; n = 11 619 for maximum temperature). MAE is mean absolute error. Point maps of individual station MAE and bias can be found in
Figures S1–S4.
1999). Given the relatively flat, open prairie and agricultural landscapes of the Great Plains, it is also likely that
winter spatial patterns of daily Tmax in the region are
less a function of the underlying terrain than they are a
function of air mass and front positions.
Overall LOOCV Tmin dr ranged from 0.73 to 0.78
while Tmax dr ranged from 0.79 to 0.82 (Figure 4).
Spatial patterns of dr were generally similar to those of
daily MAE with weaker dr values in the western CONUS
(Figure 4). In contrast to daily MAE patterns, weaker dr
values were also found in the Florida peninsula during
the May–September time period where climate division
Tmin dr ranged from 0.59 to 0.66 and Tmax dr ranged
from 0.60 to 0.72 (Figure 4). We found summer station
observations in Florida to have the lowest temporal standard deviations out of any stations in the CONUS. Thus,
even though daily MAE is relatively low in Florida during
the summer (Figure 4), small differences between interpolated and observed values have greater potential to reduce
dr than in regions or time periods with greater observation seasonality and daily variability (Hubbard, 1994). Sea
breezes along the Florida coast and associated convective
activity are also strongest in summer (e.g. Pielke, 1974),
likely making spatial patterns in daily Tmax harder to
resolve.
3.2. Homogenization
Compared to the non-homogenized TCDs, TopoWx
CONUS annual temperature anomalies appeared to be
more temporally consistent with USHCN v2.5 data and
Berkeley Earth, especially for Tmax (Figure 5). The
TopoWx 1948–2012 CONUS Tmax trend of 0.123 ∘ C
decade−1 was nearly identical to the USHCN v2.5 Tmax
trend of 0.125 ∘ C decade−1 and only slightly warmer than
the 0.118 ∘ C decade−1 Berkeley Earth trend (Table 2).
In contrast, TopoWx Raw and PRISM 1948–2012 Tmax
trends were non-significant and much less positive
(Table 2). The cold bias in the TopoWx Raw, PRISM,
and Daymet Tmax trends is a well-known attribute of the
non-homogenized US Tmax record and is attributed to the
general conversion from evening to morning observation
times and the switch from liquid-in-glass thermometers
to the maximum–minimum temperature system (Menne
et al., 2009).
Homogenization also appeared to improve the correspondence in CONUS Tmin anomalies between TopoWx
and both USHCN v2.5 and Berkeley Earth, but not to
the extent of Tmax (Figure 5). The TopoWx 1948–2012
Tmin trend of 0.160 ∘ C decade−1 , while greater than the
0.134 ∘ C decade−1 TopoWx Raw and 0.142 ∘ C decade−1
PRISM trends, was still biased cold in relation to the
Figure 4. Leave-one-out cross-validation error statistics for interpolated 1948–2012 daily temperatures summarized by US climate division. Statistics
are based on observed, non-missing observations at all input GHCN-D, SNOTEL, and RAWS stations within the conterminous United States
(n = 11 589 for minimum temperature; n = 11 619 for maximum temperature). MAE is mean absolute error. Mean dr is the mean refined index
of agreement. Note inverted color bar for dr. Point maps of individual station MAE and dr can be found in Figures S5–S8.
Table 2. Annual temperature trends for the CONUS based on
USHCN v2.5 data, Berkeley Earth, TopoWx, TopoWx Raw,
PRISM, and Daymet.
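| Dataset | 1948–2012 Tmin trend (°C decade−1) | 1948–2012 Tmax trend (°C decade−1) | 1981–2010 Tmin trend (°C decade−1) | 1981–2010 Tmax trend (°C decade−1) |
|---|---|---|---|---|
| USHCN v2.5 | +0.185* | +0.125* | +0.199* | +0.266* |
| Berk Earth | +0.181* | +0.118* | +0.182* | +0.252* |
| TopoWx | +0.160* | +0.123* | +0.177* | +0.272* |
| TopoWx Raw | +0.134* | +0.025 | +0.169* | +0.117 |
| PRISM | +0.142* | +0.000 | +0.193* | +0.080 |
| Daymet | NA | NA | +0.191* | +0.077 |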
*p-value ≤ 0.10.
USHCN v2.5 0.185 ∘ C decade−1 and Berkeley Earth
0.181 ∘ C decade−1 trends (Table 2). Additionally, over the
1981–2010 time period, both Daymet and PRISM had
Tmin trends more similar to USHCN v2.5
while TopoWx was closer to Berkeley Earth (Table 2).
The remaining cold bias in the TopoWx Tmin trend in
relation to USHCN v2.5 and Berkeley Earth could be the
result of PHA not entirely adjusting for TOB inhomogeneities in the Tmin record. For the USHCN v2.5 data,
a specific monthly TOB correction (Karl et al., 1986)
is applied before PHA. Additionally, in contrast to the
1895–present USHCN v2.5 period-of-record, we only
run PHA over the 1948–2012 time period. Nevertheless,
given the closer match in annual anomalies between
TopoWx and both USHCN v2.5 and Berkeley Earth,
the 1948–2012 PHA-only homogenization still appears
to largely account for the main network-wide inhomogeneities in the raw station data and is a clear improvement
over the non-homogenized TCDs (Figure 5).
3.3. Land skin temperature
Unlike TCDs without an LST predictor, TopoWx had
consistently low monthly normal Tmin bias across
seasons and different LST spatial settings (Figure 6).
TopoWx-No-LST, PRISM, and Daymet tended to overestimate Tmin in areas with colder LST values and
underestimate Tmin in areas with warmer LST values
(Figure 6). Averaged across all western CONUS stations,
LST was also the most important predictor of monthly normal Tmin accounting for >50% of the variance explained
across all months (Figure 7(b)). In contrast, the relative
importance of the elevation predictor remained near or
below 20% for most months and only rose to near 30%
during the spring (Figure 7(b)).
Throughout the mountainous western CONUS, microclimate influences on Tmin can be strong and Tmin cold
Figure 5. Differences in average mean annual temperature anomalies for the conterminous United States. (a) Difference from USHCN v2.5 minimum
temperature anomalies, (b) difference from USHCN v2.5 maximum temperature anomalies, (c) difference from Berkeley Earth minimum temperature
anomalies, and (d) difference from Berkeley Earth maximum temperature anomalies. Differences are the respective dataset values minus USHCN
v2.5 or Berkeley Earth values. TopoWx Raw is TopoWx driven by non-homogenized station data. Daymet is only available from 1980 onwards.
Figure 6. Dataset bias for stations in the western United States (n = 4923) grouped by an index of land skin temperature (LST). (a) Cold season
minimum temperature, (b) cold season maximum temperature, (c) warm season minimum temperature, and (d) warm season maximum temperature.
An LST index value of 0 represents an area with an LST value relatively colder than surrounding terrain while a value of 5 represents an area with
an LST value relatively warmer than surrounding terrain. TopoWx-No-LST is TopoWx without an LST predictor.
Figure 7. Diagnostic statistics of moving window monthly multiple linear regressions relating the moving window regression kriging auxiliary
predictors (elevation, land skin temperature, latitude, and longitude) and 1981–2010 monthly temperature normals within the western United States.
(a) Overall variance explained (R2 ); and proportion of R2 attributed to each predictor for (b) minimum temperature and (c) maximum temperature.
Statistics are averaged across 4923 western US station locations.
air pools and inversions are a common phenomenon, especially during periods of atmospheric stability and significant radiative cooling (e.g. Lundquist et al., 2008; Daly
et al., 2010; Holden et al., 2011b). As a result, Tmin often
does not have a simple linear relationship with elevation,
which limits the ability of an individual elevation predictor to properly represent Tmin spatial patterns (Daly et al.,
2008; Dobrowski et al., 2009). On the basis of the high relative importance of the LST predictor and the decreased
bias of TopoWx over TopoWx-No-LST, the addition of
LST appears to help overcome the limitations of the elevation predictor and provides significant added value to the
monthly normal Tmin interpolations (Figure 7(b)).
Although the LST predictor appeared to decrease Tmax
bias at the lowest LST-I values, differences between
TopoWx and TCDs without LST were not as significant as those seen for Tmin (Figure 6). Except for the
lowest LST-I values, Tmax bias for all the datasets was
<±0.25 ∘ C (Figure 6). Furthermore, in contrast to Tmin,
the relative importance of the LST predictor in predicting western CONUS Tmax normals was less than that
of elevation in all months except for December and January (Figure 7(c)). There are likely two main reasons for
this result. First, since Tmax generally displays a simpler linear decrease with elevation, elevation is already
a strong predictor of Tmax without the addition of LST
(Daly, 2006; Daly et al., 2008; Dobrowski et al., 2009).
This was not the case for Tmin where elevation was a relatively weak predictor (Figure 7(b)). Second, owing to solar
radiation effects on the thermal infrared signal (Vancutsem et al., 2010; Benali et al., 2012), different mediating
effects of land cover and moisture regimes on the surface
energy balance (e.g. Mildrexler et al., 2011), and increased
daytime convective turbulence and advection compared
to nighttime conditions (Pielke et al., 2007; Kloog et al.,
2012), the relationship between Tmax and LST is often
more complex than that of Tmin and LST (Vancutsem
et al., 2010; Benali et al., 2012; Kloog et al., 2012). Given
that MODIS LST can only be retrieved under a relatively
cloudless atmosphere, the maximum LST predictor is also
likely biased to clear-sky conditions when the difference
between maximum LST and Tmax is normally greatest
because of increased insolation (Jin et al., 1997). For the
winter months that did display slightly higher relative
influence values for LST (Figure 7(c)), lower wintertime
insolation is likely resulting in a more linear correspondence between LST and Tmax across different surface
conditions. In the winter, climatological Tmax inversions
and snow cover in many mountainous regions of the western CONUS (Whiteman et al., 1999; Pepin et al., 2011)
could also be lessening the predictive power of elevation
and increasing that of LST. Ultimately, even though the
MW-RK linear model had overall greater predictive power
for Tmax than Tmin (Figure 7(a)), the added value of LST
on interpolations of monthly normal Tmax in the western
Table 3. Performance metrics of monthly normal prediction standard error (𝜎) for moving window regression kriging (MW-RK) and
geographically weighted regression (GWR).
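| Variable | Model | G-statistic | MAE and 𝜎 correlation | Average climate division MAE and 𝜎 correlation | Average PCI width (°C) |
|---|---|---|---|---|---|
| Tmin | MW-RK | 0.995 | 0.41 | 0.89 | 1.636 |
| Tmin | GWR | 0.984 | 0.36 | 0.90 | 1.735 |
| Tmax | MW-RK | 0.992 | 0.35 | 0.81 | 1.218 |
| Tmax | GWR | 0.981 | 0.33 | 0.82 | 1.274 |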
Metrics are defined in Section 2.4.4.
CONUS appears to be mainly confined to specific months
(Figure 7(c)) or environmental settings (Figure 6) and is
less significant than the added value seen for Tmin.
3.4. Uncertainty
The 1981–2010 monthly normal temperature PCIs derived
from 𝜎 k displayed high accuracy with G-statistics of 0.995
and 0.992 for Tmin and Tmax, respectively (Table 3). For
instance, the actual percentage of LOOCV monthly normal predictions within the 95% PCI was 94.6% for Tmin
and 94.5% for Tmax. Compared to the high PCI accuracy, the correlation between individual station LOOCV
MAE and 𝜎 k was positive, but not overwhelmingly strong
(Table 3). The MAE and 𝜎 k correlation was 0.41 for Tmin
and 0.35 for Tmax. Nonetheless, the correlations are similar to those from the best performing kriging models
reviewed by Harris et al. (2010). At the scale of US climate divisions, the correlations between MAE and 𝜎 k were
also much stronger with Tmin and Tmax at 0.89 and 0.81,
respectively (Table 3). These results are similar to those
of Daly et al. (2008) who found PCIs derived from the
PRISM interpolation model to be more highly correlated
with MAE at larger regional aggregations.
Although accuracy metrics for 𝜎 k were favourable, they
were not significantly better than those of the deterministic
GWR model (Table 3). The GWR G-statistics and individual station MAE and 𝜎 GWR correlations were lower than
those of 𝜎 k , but nearly indistinguishable. For similarly performing uncertainty models, the one with the smaller average PCI widths is normally preferred (Harris et al., 2010).
While 𝜎 k again performed better than 𝜎 GWR in this regard,
differences were not substantial (Table 3). The average 𝜎 k
PCI widths were 4.4–5.7% smaller than the average 𝜎 GWR
PCI widths.
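As a check on the quoted range, and assuming the percentages are expressed relative to the GWR widths in Table 3, the Tmin and Tmax reductions work out to:

$$
\frac{1.735 - 1.636}{1.735} \approx 0.057 \;(\text{Tmin}), \qquad
\frac{1.274 - 1.218}{1.274} \approx 0.044 \;(\text{Tmax})
$$

That is, roughly 5.7% for Tmin and 4.4% for Tmax, matching the stated 4.4–5.7% range.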
In contrast to the accuracy metrics, differences in local
spatial patterns between 𝜎 k and 𝜎 GWR were much more distinguishable. As a local example, in the western climate
division of Montana, USA, August monthly normal Tmin
𝜎 k displayed bullseyes of decreased uncertainty around
station locations while 𝜎 GWR did not (Figure 8). Unlike
𝜎 GWR , which only represents model goodness of fit (Daly
et al., 2008), 𝜎 k accounts for the geographical arrangement of stations (Hengl et al., 2004). The 𝜎 k field was
also smooth while 𝜎 GWR had circular arcs of discontinuities likely resulting from specific stations moving in or out
of the local GWR radius. In contrast to these differences,
both 𝜎 k and 𝜎 GWR had spatial patterns that followed the
underlying elevation and/or LST values of the grid cells.
In the end, the uncertainty spatial patterns are reflective of
the advantages of MW-RK 𝜎 k over not only GWR 𝜎 GWR ,
but also OK 𝜎 k . The GWR 𝜎 GWR measure only represents
model goodness of fit while OK 𝜎 k only accounts for the
geographical arrangement of stations. As evident in the
local example, MW-RK 𝜎 k is able to combine both components of uncertainty into a single composite measure
(Hengl et al., 2004).
3.5. Example output and comparison with other datasets
As an example of the final TopoWx output for the
CONUS, we concentrate on the summer month of August.
In August, nighttime microclimate influences and Tmin
inversions are more consistent in many mountainous
regions of the western CONUS due to increased nighttime
atmospheric stability (e.g. Finklin, 1986; Holden et al.,
2011b). Coastal marine inversion layers also increase
Tmax spatial complexity along the Pacific coast (Daly
et al., 2008; Iacobellis and Cayan, 2013). We examine
spatial patterns in both August Tmin and Tmax normals
and corresponding uncertainty. We also compare TopoWx
August Tmin and Tmax normals within the western
CONUS to those of the Daymet 1-km product (Thornton
et al., 2012), and PRISM 30-arcsec 1981–2010 monthly
normals product (PRISM Climate Group, 2012).
In August, TopoWx Tmax displayed a strong correspondence to elevation gradients, especially in the western
CONUS (Figure 9). Cooler Tmax temperatures were also
noticeable along the Pacific coast. In contrast, TopoWx
August Tmin displayed more complexity with relationships to not only elevation, but also convergent valleys,
large inland lakes and rivers, and urban areas (Figure 9).
Uncertainty patterns for both August Tmin and Tmax
(Figure 9) directly corresponded to warm season MAE
(Figure 3). Higher August Tmin 𝜎 k values were seen
throughout the topographically complex western CONUS,
while higher Tmax 𝜎 k values were mainly confined to the
Pacific coast (Figure 9). Although regional differences in
𝜎 k dominated the spatial patterns at the scale of CONUS,
uncertainty patterns related to station locations and topographical patterns were still discernible (Figure 9).
Differences in western CONUS August Tmin normals
between TopoWx and the existing PRISM and Daymet
Figure 8. Maps of prediction standard error for 1981–2010 August minimum temperature normals for the western climate division of Montana,
USA. (a) Climate division topography and geographical context; (b) TopoWx moving window regression kriging prediction standard error; and (c)
geographically weighted regression prediction standard error. Dots in (a) are weather station locations.
Figure 9. Conterminous US maps of TopoWx 1981–2010 August temperature normals and corresponding uncertainty. Note different scales for Tmin
normals and Tmax normals.
Figure 10. Western US maps of differences in 1981–2010 August temperature normals between PRISM and TopoWx and between Daymet and
TopoWx.
TCDs were substantial (Figure 10). Only 47% of western CONUS grid cells for Daymet were within 1.0 °C of
TopoWx Tmin, and Daymet had an overall −0.83 °C cold
bias in relation to TopoWx western CONUS Tmin. Differences between PRISM and TopoWx western CONUS
Tmin were smaller, but still significant (Figure 10). PRISM
Tmin displayed an overall −0.30 °C cold bias in relation to
TopoWx Tmin, and 57% of PRISM grid cells were within
1.0 °C of TopoWx Tmin.
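The two summary statistics quoted above (overall bias and the share of grid cells within 1.0 °C) can be illustrated with a minimal sketch. It assumes two co-registered grids flattened to equal-length lists, an unweighted mean over cells, and an inclusive threshold; the function name `compare_normals` and all sample values are hypothetical and are not taken from the TopoWx code or data.

```python
from statistics import mean


def compare_normals(candidate, reference, threshold=1.0):
    """Return (mean bias, fraction within threshold) for candidate minus reference.

    Cells where either grid is missing (None) are skipped.
    """
    diffs = [c - r for c, r in zip(candidate, reference)
             if c is not None and r is not None]
    if not diffs:
        raise ValueError("no overlapping valid grid cells")
    bias = mean(diffs)  # negative values mean the candidate is colder
    within = sum(1 for d in diffs if abs(d) <= threshold) / len(diffs)
    return bias, within


# Hypothetical August Tmin normals (deg C) for five co-located grid cells.
reference_tmin = [10.0, 8.5, 12.0, 5.0, 7.0]
candidate_tmin = [9.5, 6.0, 12.5, None, 5.5]

bias, within = compare_normals(candidate_tmin, reference_tmin)
print(f"bias = {bias:+.2f} deg C, within 1.0 deg C = {within:.0%}")
```

Whether the published percentages use area weighting or an inclusive threshold is not stated in this section, so both choices above are assumptions.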
In mountainous terrain, both Daymet and PRISM generally displayed warmer valley and cooler mountain Tmin
than TopoWx, but PRISM tended to better match TopoWx
spatial patterns. For instance, within the undulating basin
and range topography of the northeastern climate division
of Nevada, USA, Daymet Tmin significantly smoothed out
terrain influences while PRISM displayed Tmin inversion
patterns more similar to TopoWx (Figure 11). Elevation,
the only topoclimatic factor accounted for by Daymet, is
a poor predictor of August Tmin in this Nevada climate
division. For the TopoWx MW-RK Tmin model, the average relative importance of elevation within the region was
only 6% of variance explained. Conversely, LST was the
dominant predictor at over 77% of variance explained.
Daymet differences from TopoWx were subsequently
negatively correlated with LST (r = −0.82). While the
PRISM model does not use LST as a predictor, its sophisticated station weighting scheme better accounts for Tmin
inversions and other topoclimatic factors (Daly et al.,
2008). Nevertheless, with a negative correlation between
PRISM differences from TopoWx and the LST predictor
(r = −0.61), PRISM Tmin still tended to be warmer in the
valleys and cooler in the mountains than TopoWx Tmin
(Figure 11).
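The correlations reported above relate per-cell dataset differences to the LST predictor. A minimal sketch of that calculation follows, assuming a plain Pearson correlation over co-located grid cells; the helper `pearson_r` and the sample values are hypothetical and chosen only to produce a strongly negative relationship.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: LST predictor (K) and dataset-minus-TopoWx Tmin (deg C).
lst = [280.0, 282.0, 284.0, 286.0, 288.0]
diff = [1.2, 0.7, 0.1, -0.6, -1.1]

r = pearson_r(lst, diff)
print(f"r = {r:.2f}")
```

A negative r here means the comparison dataset is relatively warmer where LST is low and relatively cooler where LST is high, which is the pattern the text describes for both Daymet and PRISM.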
In addition to these differences in mountainous terrain,
Daymet and PRISM August Tmin were also cooler than
TopoWx over large inland water bodies like the Great Salt
Lake (Figure 10). While TCDs are usually only used over
land and not expected to be valid over water, an ability to
better represent temperature patterns directly over water
bodies would be a beneficial advancement. However, given
the significant differences in LST values over water bodies
compared to their surrounding terrestrial landscapes, a lack
of station observations over many lakes, and generally
higher water body σk uncertainty values (e.g. Figure 8),
further validation is likely required to confirm the accuracy
of Tmin spatial patterns over water.
Compared to water bodies, differences related to urban
areas in the western CONUS were less visually discernable
Figure 11. Comparison of TopoWx, PRISM, and Daymet 1981–2010 August minimum temperature normals within the northeastern climate division
of Nevada, USA.
except in the Central Valley of California. Within the Central Valley, Daymet and PRISM were generally warmer
than TopoWx except for islands of warmer TopoWx Tmin
over urban areas like Fresno (Figure 10). Differences likely
unrelated to underlying terrain or land cover were also
noticeable, especially in northwestern California where
Daymet August Tmin was more than 10 °C colder
than both TopoWx and PRISM throughout much of the
region (Figure 10).
In contrast to Tmin, differences between TopoWx
August Tmax and the other TCDs were not as significant
(Figure 10). The percentage of western CONUS grid cells
within 1.0 °C of TopoWx Tmax was 91% for Daymet and
90% for PRISM. In relation to TopoWx Tmax, Daymet
was biased +0.06 °C and PRISM was biased +0.27 °C
within the western CONUS. Corresponding to TopoWx
Tmax uncertainty (Figure 9), the most substantial Tmax
differences were mainly confined to areas near and along
the Pacific coast (Figures 10 and 12).
Owing to the frequent onshore presence of a marine layer
in the summer (Johnstone and Dawson, 2010; Iacobellis
and Cayan, 2013), the California Pacific coast represents
one of the few areas where Tmax and elevation do not
have a simple linear relationship (Daly et al., 2008). For
example, within the north coast drainage climate division
of California (Figure 12), elevation only had an average
relative importance of 14% within the TopoWx MW-RK
Tmax model while LST relative importance was 47%.
Correspondingly, in viewing TCD outputs for the climate
division, Daymet Tmax normals were oversmoothed
in certain areas while both PRISM and TopoWx displayed more realistic coastal and topographic influences
(Figure 12). Similar to the Nevada example for Tmin,
TopoWx relies on LST to overcome limitations of the
Figure 12. Comparison of TopoWx, PRISM, and Daymet 1981–2010 August maximum temperature normals within the north coast drainage climate
division of California, USA.
elevation predictor while PRISM uses a station weighting
scheme based on coastal proximity and terrain blockage of
the marine layer (Daly et al., 2008). Within this context,
TopoWx Tmax tended to display a deeper inland penetration of the cooling maritime influence on the Pacific
coast than PRISM Tmax (Figure 12). One reason for
this difference could be related to the fog and low stratus
clouds that frequently accompany the marine layer (Johnstone and Dawson, 2010; Iacobellis and Cayan, 2013). As
LST observations can only be retrieved under relatively
cloudless conditions, the TopoWx Tmax spatial patterns
could be biased to what is more frequently seen under a
clear-sky atmosphere. While a more detailed validation
would be required to determine the exact advantages and
disadvantages of PRISM and TopoWx along the Pacific
coast, this represents a good example of one potential
limitation of the LST predictor and the spatial patterns it
produces.
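The relative importance percentages cited for elevation and LST can be illustrated with a small sketch. It assumes an LMG-style decomposition of explained variance (Lindeman et al., 1980; Grömping, 2006), in which each predictor's share is its sequential R² contribution averaged over predictor orderings. Whether this matches the exact computation behind the MW-RK figures is an assumption, as is the reduction to only two predictors; the function names and sample values are hypothetical.

```python
import math


def _corr(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lmg_two_predictors(x1, x2, y):
    """Return (share of x1, share of x2, full-model R^2) for y ~ x1 + x2."""
    r1 = _corr(x1, y)
    r2 = _corr(x2, y)
    r12 = _corr(x1, x2)
    if abs(r12) >= 1.0:
        raise ValueError("predictors are perfectly collinear")
    # R^2 of the two-predictor linear model, from pairwise correlations.
    r2_full = (r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 ** 2)
    # Average each predictor's R^2 gain over the two possible entry orders.
    share1 = 0.5 * (r1 ** 2 + (r2_full - r2 ** 2))
    share2 = 0.5 * (r2 ** 2 + (r2_full - r1 ** 2))
    return share1, share2, r2_full


# Hypothetical station values: elevation (m), LST (K), Tmin normal (deg C).
elevation = [1500.0, 1800.0, 2100.0, 2400.0, 2700.0, 3000.0]
lst = [278.0, 283.0, 279.0, 285.0, 281.0, 286.0]
tmin = [4.0, 7.5, 3.5, 8.0, 4.5, 8.5]

elev_share, lst_share, r2_full = lmg_two_predictors(elevation, lst, tmin)
print(f"elevation: {elev_share:.0%}, LST: {lst_share:.0%}, total R^2: {r2_full:.0%}")
```

By construction the two shares sum to the full-model R², so they can be read directly as percentages of variance explained.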
Overall, the differences in western CONUS August
Tmin and Tmax normals between TopoWx and the other
TCDs are consistent with the results of the LST predictor analysis (Figures 6 and 7). As a strong predictor of
Tmin, LST is likely driving many of the spatial differences
between TopoWx and the other TCDs. The influence of the
LST predictor was clearly evident in the Nevada example
(Figure 11) where it had high relative importance and was
negatively correlated with Daymet and PRISM differences
from TopoWx. In contrast, with the overall lower relative
importance of LST in Tmax interpolations, there were consequently fewer differences between the datasets except in
regions where the linear relationship between Tmax and
elevation was not as strong (Figure 12).
4. Conclusion
As evident in our validation, TopoWx contributes three
main advancements to topoclimatic temperature interpolation: (1) an improved representation of interdecadal and
long-term temperature trends; (2) an improved representation of complex temperature spatial patterns, particularly
for Tmin; and (3) a spatial representation of uncertainty
that accounts for both model goodness of fit and the
geographical arrangement of stations. These advancements were made through the use of previously developed
homogenization procedures (Menne and Williams, 2009),
remotely sensed LST as an auxiliary predictor of topoclimatic air temperature, and a unique implementation of
MW-RK.
In the context of these advancements, several caveats
should still be noted. Homogenization procedures largely
remove artificial jumps and trends, but they can also
smooth out finer-scale trend variations by imposing the
regional climate signal on each station (Pielke et al., 2007).
Additionally, as illustrated by differences in USHCN
trends and those of relatively finer resolution reanalysis datasets (Vose et al., 2012), there are still uncertainties in regional climate signals even within homogenized
datasets. TopoWx is also a daily product, but homogenized on a monthly time step. Future work should look to
improve corrections for daily time-of-observation departures and biases and to incorporate daily homogenization
schemes (e.g. Della-Marta and Wanner, 2006; Kuglitsch
et al., 2009) that additionally correct for artificial changes
in temperature distributions, not just the mean. More rigorous inter-comparisons with other TCDs that use truly
independent station data are also warranted to fully understand the advantages and disadvantages of TopoWx in
specific regions of interest, particularly in regions of
high uncertainty like the California Pacific coast. Since
the RK approach already lends itself to using any arbitrary model for the deterministic trend component (Hengl
et al., 2007), more sophisticated modelling methods that
move beyond linear regression should also be investigated.
Regression trees or generalized additive models could be
used to account for complex, nonlinear predictor relationships and possibly improve LST predictive power. Lastly,
while the σk metric provides a good indication of spatial uncertainty in temperature normals, it does not propagate uncertainty from the station data infilling step nor
does it reflect changes in daily temperature uncertainty
through time. TopoWx will remain a work-in-progress
and we encourage community-driven enhancements, feedback, and derivative datasets. All associated TopoWx
input/output data, software code, validation metrics, and
station QA, homogenization and infill statistics will be
available at http://www.ntsg.umt.edu/project/TopoWx.
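The point that regression kriging (RK) accepts an arbitrary deterministic trend model can be made concrete with the generic RK decomposition, written here in the general form discussed by Hengl et al. (2007). The symbols are illustrative and are not necessarily the notation used elsewhere in this article.

```latex
\hat{T}(s_0) = \hat{m}(s_0) + \hat{e}(s_0), \qquad
\hat{e}(s_0) = \sum_{i=1}^{n} \lambda_i(s_0)\,\bigl[T(s_i) - \hat{m}(s_i)\bigr]
```

Here \hat{m} is the fitted trend at location s, T(s_i) are the station observations, and \lambda_i(s_0) are kriging weights derived from the variogram of the trend residuals. Only \hat{m} changes if the linear regression is replaced by a regression tree or a generalized additive model; the residual kriging step keeps the same form.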
Even with the model’s remaining caveats, TopoWx takes
an important next step in addressing the main limitations
of current TCDs, particularly in regard to representing
topoclimatic variations in Tmin, improving upon issues
stemming from non-homogenized station data, and quantifying spatial uncertainty. The TopoWx methods developed for temperature should also be applicable to other
climate variables. For instance, the station data record
extension methods that combine atmospheric reanalysis
and local long-term station data could be key for better representing interdecadal temporal variability in precipitation
at higher elevation station locations (Luce et al., 2013).
Ultimately, TopoWx should help advance climate-driven
ecological and hydrological modelling and facilitate more
openness in TCDs and a better end-user understanding of
their uncertainties and limitations.
Acknowledgements
We thank Dr. Anna Klene and three anonymous reviewers
for invaluable feedback on previous drafts. This study
is based on work supported by the National Science
Foundation under EPSCoR Grant EPS-1101342, the US
Geological Survey North Central Climate Science Center Grant G-0734-2 and the US Geological Survey Energy
Resources Group Grant G11AC20487. Any opinions, findings, and conclusions or recommendations expressed in
this article are those of the authors and do not necessarily
reflect the views of the National Science Foundation.
Supporting Information
The following supporting information is available as part
of the online article:
Appendix S1. Methods for missing value infilling.
Appendix S2. Methods for moving window regression
kriging.
Appendix S3. Model performance metrics.
Table S1. Number of daily temperature observations from
1948 to 2012 flagged by Durre et al. (2010) quality assurance procedures for GHCN-D, SNOTEL and RAWS station networks.
Figure S1. Leave-one-out cross-validation mean absolute
error (MAE) for interpolated 1981–2010 monthly minimum temperature normals. Points are all input GHCN-D,
SNOTEL, and RAWS stations within the contiguous
United States (n = 11 589).
Figure S2. Leave-one-out cross-validation mean absolute
error (MAE) for interpolated 1981–2010 monthly maximum temperature normals. Points are all input GHCN-D,
SNOTEL, and RAWS stations within the contiguous
United States (n = 11 619).
Figure S3. Leave-one-out cross-validation bias for interpolated 1981–2010 monthly minimum temperature normals. Points are all input GHCN-D, SNOTEL, and RAWS
stations within the contiguous United States (n = 11 589).
Figure S4. Leave-one-out cross-validation bias for interpolated 1981–2010 monthly maximum temperature normals. Points are all input GHCN-D, SNOTEL, and RAWS
stations within the contiguous United States (n = 11 619).
Figure S5. Leave-one-out cross-validation mean absolute
error (MAE) for interpolated 1948–2012 daily minimum
temperatures. MAE is based on observed, non-missing
observations at all input GHCN-D, SNOTEL, and RAWS
stations within the contiguous United States (n = 11 589).
Figure S6. Leave-one-out cross-validation mean absolute
error (MAE) for interpolated 1948–2012 daily maximum
temperatures. MAE is based on observed, non-missing
observations at all input GHCN-D, SNOTEL, and RAWS
stations within the contiguous United States (n = 11 619).
Figure S7. Leave-one-out cross-validation refined index
of agreement (dr ) for interpolated 1948–2012 daily minimum temperatures. The dr is based on a monthly-varying
baseline and observed, non-missing observations at all
input GHCN-D, SNOTEL, and RAWS stations within the
contiguous United States (n = 11 589).
Figure S8. Leave-one-out cross-validation refined index
of agreement (dr ) for interpolated 1948–2012 daily maximum temperatures. The dr is based on a monthly-varying
baseline and observed, non-missing observations at all
input GHCN-D, SNOTEL, and RAWS stations within the
contiguous United States (n = 11 619).
References
Abatzoglou JT. 2013. Development of gridded surface meteorological
data for ecological applications and modelling. Int. J. Climatol. 33:
121–131, doi: 10.1002/joc.3413.
Abatzoglou JT, Brown TJ. 2012. A comparison of statistical downscaling
methods suited for wildfire applications. Int. J. Climatol. 32: 772–780,
doi: 10.1002/joc.2312.
Alexandersson H. 1986. A homogeneity test applied to precipitation
data. J. Climatol. 6: 661–675, doi: 10.1002/joc.3370060607.
Barry RG. 2008. Mountain Weather and Climate, 3rd edn. Cambridge
University Press: Cambridge, UK.
Beier CM, Signell SA, Luttman A, DeGaetano AT. 2011.
High-resolution climate change mapping with gridded historical
climate products. Landsc. Ecol. 27: 327–342, doi: 10.1007/s10980-011-9698-8.
Benali A, Carvalho AC, Nunes JP, Carvalhais N, Santos A. 2012.
Estimating air surface temperature in Portugal using MODIS LST
data. Remote Sens. Environ. 124: 108–121, doi: 10.1016/j.rse.2012.04.024.
Beniston M. 2006. Mountain weather and climate: a general overview
and a focus on climatic change in the Alps. Hydrobiologia 562: 3–16,
doi: 10.1007/s10750-005-1802-0.
Bishop DA, Beier CM. 2013. Assessing uncertainty in high-resolution
spatial climate data across the US Northeast. PLoS One 8: e70260,
doi: 10.1371/journal.pone.0070260.
Bolstad PV, Swift L, Collins F, Régnière J. 1998. Measured and predicted air temperatures at basin to regional scales in the southern
Appalachian mountains. Agric. For. Meteorol. 91: 161–176, doi:
10.1016/S0168-1923(98)00076-8.
Camargo MB, Hubbard KG. 1999. Spatial and temporal variability
of daily weather variables in sub-humid and semi-arid areas of the
United States high plains. Agric. For. Meteorol. 93: 141–148, doi:
10.1016/S0168-1923(98)00122-1.
Cressie N. 1993. Statistics for Spatial Data. Wiley: New York, NY.
Crimmins SM, Dobrowski SZ, Greenberg JA, Abatzoglou JT, Mynsberge AR. 2011. Changes in climatic water balance drive downhill
shifts in plant species’ optimum elevations. Science 331: 324–327,
doi: 10.1126/science.1199040.
Crosson WL, Al-Hamdan MZ, Hemmings SNJ, Wade GM. 2012. A daily
merged MODIS Aqua–Terra land surface temperature data set for the
conterminous United States. Remote Sens. Environ. 119: 315–324,
doi: 10.1016/j.rse.2011.12.019.
Daly C. 2006. Guidelines for assessing the suitability of spatial climate
data sets. Int. J. Climatol. 26: 707–721, doi: 10.1002/joc.1322.
Daly C, Gibson WP, Taylor GH, Johnson GL, Pasteris P. 2002. A
knowledge-based approach to the statistical mapping of climate. Clim.
Res. 22: 99–113.
Daly C, Halbleib M, Smith JI, Gibson WP, Doggett MK, Taylor GH,
Curtis J, Pasteris PP. 2008. Physiographically sensitive mapping of
climatological temperature and precipitation across the conterminous
United States. Int. J. Climatol. 28: 2031–2064, doi: 10.1002/joc.1688.
Daly C, Conklin DR, Unsworth MH. 2010. Local atmospheric decoupling in complex topography alters climate change impacts. Int. J.
Climatol. 30: 1857–1864, doi: 10.1002/joc.2007.
DeGaetano AT. 1999. A method to infer observation time based on
day-to-day temperature variations. J. Clim. 12: 3443–3456, doi:
10.1175/1520-0442(1999)012<3443:AMTIOT>2.0.CO;2.
DeGaetano AT, Eggleston KL, Knapp WW. 1995. A method to estimate
missing daily maximum and minimum temperature observations. J.
Appl. Meteorol. 34: 371–380, doi: 10.1175/1520-0450-34.2.371.
Della-Marta PM, Wanner H. 2006. A method of homogenizing the
extremes and mean of daily temperature measurements. J. Clim. 19:
4179–4197, doi: 10.1175/JCLI3855.1.
Diaz HF, Eischeid JK. 2007. Disappearing “alpine tundra” Köppen
climatic type in the western United States. Geophys. Res. Lett. 34:
L18707, doi: 10.1029/2007GL031253.
Dobrowski SZ, Abatzoglou JT, Greenberg JA, Schladow S. 2009. How
much influence does landscape-scale physiography have on air temperature in a mountain environment? Agric. For. Meteorol. 149:
1751–1758, doi: 10.1016/j.agrformet.2009.06.006.
Dozier J. 1996. A generalized split-window algorithm for retrieving
land-surface temperature from space. IEEE Trans. Geosci. Remote
Sens. 34: 892–905, doi: 10.1109/36.508406.
Durre I, Menne MJ, Gleason BE, Houston TG, Vose RS. 2010.
Comprehensive automated quality assurance of daily surface
observations. J. Appl. Meteorol. Climatol. 49: 1615–1633, doi:
10.1175/2010JAMC2375.1.
Eischeid JK, Pasteris PA, Diaz HF, Plantico MS, Lott NJ. 2000. Creating
a serially complete, national daily time series of temperature and
precipitation for the western United States. J. Appl. Meteorol. 39:
1580–1591, doi: 10.1175/1520-0450(2000)039<1580:CASCND>2.0.CO;2.
Elsner MM, Cuo L, Voisin N, Deems JS, Hamlet AF, Vano JA, Mickelson
KEB, Lee S, Lettenmaier DP. 2010. Implications of 21st century
climate change for the hydrology of Washington State. Clim. Change
102: 225–260, doi: 10.1007/s10584-010-9855-0.
Finklin AI. 1986. A climatic handbook for Glacier National Park – with
data for Waterton Lakes National Park. General Technical Report
INT-204, US Department of Agriculture Forest Service Intermountain
Research Station, Ogden, UT.
Florio EN, Lele SR, Chi Chang Y, Sterner R, Glass GE. 2004. Integrating
AVHRR satellite data and NOAA ground observations to predict
surface air temperature: a statistical approach. Int. J. Remote Sens. 25:
2979–2994, doi: 10.1080/01431160310001624593.
Fotheringham AS, Brunsdon C, Charlton M. 2002. Geographically
Weighted Regression: The Analysis of Spatially Varying Relationships.
Wiley: Chichester, UK.
Gesch D, Oimoen M, Greenlee S, Nelson C, Steuck M, Tyler D. 2002.
The National Elevation Dataset. Photogramm. Eng. Remote Sens. 68:
5–11.
Glick P, Stein BA, Edelson NA (eds). 2011. Scanning the Conservation Horizon: A Guide to Climate Change Vulnerability Assessment.
National Wildlife Federation: Washington, DC.
Goovaerts P. 2001. Geostatistical modelling of uncertainty in soil science. Geoderma 103: 3–26, doi: 10.1016/S0016-7061(01)00067-2.
Grömping U. 2006. Relative importance for linear regression in R: the
package relaimpo. J. Stat. Softw. 17: 1–27.
Guentchev G, Barsugli JJ, Eischeid J. 2010. Homogeneity of gridded
precipitation datasets for the Colorado River Basin. J. Appl. Meteorol.
Climatol. 49: 2404–2415, doi: 10.1175/2010JAMC2484.1.
Guttman NB, Quayle RG. 1996. A historical perspective of U.S.
Climate Divisions. Bull. Am. Meteorol. Soc. 77: 293–303, doi:
10.1175/1520-0477(1996)077<0293:AHPOUC>2.0.CO;2.
Haas TC. 1990. Kriging and automated variogram modeling within
a moving window. Atmos. Environ. 24A: 1759–1769, doi:
10.1016/0960-1686(90)90508-K.
Hansen J, Ruedy R, Sato M, Lo K. 2010. Global surface temperature
change. Rev. Geophys. 48: RG4004, doi: 10.1029/2010RG000345.
Harris P, Charlton M, Fotheringham AS. 2010. Moving window kriging
with geographically weighted variograms. Stoch. Environ. Res. Risk
Assess. 24: 1193–1209, doi: 10.1007/s00477-010-0391-2.
Hengl T. 2009. A Practical Guide to Geostatistical Mapping. Lulu
Publishers: Raleigh, NC.
Hengl T, Heuvelink GBM, Stein A. 2004. A generic framework for spatial prediction of soil variables based on regression-kriging. Geoderma
120: 75–93, doi: 10.1016/j.geoderma.2003.08.018.
Hengl T, Heuvelink GBM, Rossiter DG. 2007. About regression-kriging:
from equations to case studies. Comput. Geosci. 33: 1301–1315, doi:
10.1016/j.cageo.2007.05.001.
Hengl T, Heuvelink GBM, Perčec Tadić M, Pebesma EJ. 2011.
Spatio-temporal prediction of daily temperatures using time-series
of MODIS LST images. Theor. Appl. Climatol. 107: 265–277, doi:
10.1007/s00704-011-0464-2.
Holden ZA, Abatzoglou JT, Luce CH, Baggett LS. 2011a. Empirical
downscaling of daily minimum air temperature at very fine resolutions in complex terrain. Agric. For. Meteorol. 151: 1066–1073, doi:
10.1016/j.agrformet.2011.03.011.
Holden ZA, Crimmins MA, Cushman SA, Littell JS. 2011b. Empirical
modeling of spatial and temporal variation in warm season nocturnal
air temperatures in two North Idaho mountain ranges, USA. Agric.
For. Meteorol. 151: 261–269, doi: 10.1016/j.agrformet.2010.10.006.
Holder C, Boyles R, Syed A, Niyogi D, Raman S. 2006. Comparison of collocated automated (NCECONet) and manual (COOP) climate observations in North Carolina. J. Atmos. Oceanic Technol. 23:
671–682, doi: 10.1175/JTECH1873.1.
Hubbard KG. 1994. Spatial variability of daily weather variables in
the high plains of the USA. Agric. For. Meteorol. 68: 29–41, doi:
10.1016/0168-1923(94)90067-1.
Hubbard KG, You J. 2005. Sensitivity analysis of quality assurance
using the spatial regression approach – a case study of the maximum/minimum
air temperature. J. Atmos. Oceanic Technol. 22: 1520–1530,
doi: 10.1175/JTECH1790.1.
Huth R, Nemesova I. 1995. Estimation of missing daily temperatures: Can a weather categorization improve its accuracy? J. Clim.
8: 1901–1916, doi: 10.1175/1520-0442(1995)008<1901:EOMDTC>2.0.CO;2.
Iacobellis SF, Cayan DR. 2013. The variability of California summertime marine stratus: impacts on surface air temperatures. J. Geophys.
Res. Atmos. 118: 9105–9122, doi: 10.1002/jgrd.50652.
Isaaks EH, Srivastava RM. 1989. Applied Geostatistics. Oxford University Press: Oxford, UK.
Janis MJ. 2002. Observation-time-dependent biases and departures for
daily minimum and maximum air temperatures. J. Appl. Meteorol.
41: 588–603, doi: 10.1175/1520-0450(2002)041<0588:OTDBAD>2.0.CO;2.
Jin M, Dickinson RE. 2010. Land surface skin temperature climatology:
benefitting from the strengths of satellite observations. Environ. Res.
Lett. 5: 044004, doi: 10.1088/1748-9326/5/4/044004.
Jin M, Dickinson RE, Vogelmann AM. 1997. A comparison of
CCM2–BATS skin temperature and surface-air temperature with
satellite and surface observations. J. Clim. 10: 1505–1524, doi:
10.1175/1520-0442(1997)010<1505:ACOCBS>2.0.CO;2.
Johnstone JA, Dawson TE. 2010. Climatic context and ecological implications of summer fog decline in the coast redwood region. Proceedings of the National Academy of Sciences of the United States of America. 107: 4533–4538, doi: 10.1073/pnas.0915062107.
Jones PD, Lister DH, Osborn TJ, Harpham C, Salmon M, Morice
CP. 2012. Hemispheric and large-scale land-surface air temperature
variations: an extensive revision and an update to 2010. J. Geophys.
Res. 117: D05127, doi: 10.1029/2011JD017139.
Kalnay E, Kanamitsu M, Kistler R, Collins W, Deaven D, Gandin
L, Iredell M, Saha S, White G, Woollen J, Zhu Y, Chelliah M,
Ebisuzaki W, Higgins W, Janowiak J, Mo K, Ropelewski C, Wang J,
Leetmaa A, Reynolds R, Jenne R, Joseph D. 1996. The NCEP/NCAR
40-year reanalysis project. Bull. Am. Meteorol. Soc. 77: 437–471, doi:
10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2.
Karl TR, Williams CN, Young PJ, Wendland WM. 1986. A model to estimate the time of observation bias associated with monthly mean maximum, minimum and mean temperatures for the United States. J. Clim.
Appl. Meteorol. 25: 145–160, doi: 10.1175/1520-0450(1986)025<0145:AMTETT>2.0.CO;2.
Keane RE, Drury SA, Karau EC, Hessburg PF, Reynolds KM. 2010.
A method for mapping fire hazard and risk across multiple scales
and its application in fire management. Ecol. Model. 221: 2–18, doi:
10.1016/j.ecolmodel.2008.10.022.
Kemp WP, Burnell DG, Everson DO, Thomson AJ. 1983. Estimating missing daily maximum and minimum temperatures. J. Clim.
Appl. Meteorol. 22: 1587–1593, doi: 10.1175/1520-0450(1983)022<1587:EMDMAM>2.0.CO;2.
Kloog I, Chudnovsky A, Koutrakis P, Schwartz J. 2012. Temporal and
spatial assessments of minimum air temperature using satellite surface
temperature measurements in Massachusetts, USA. Sci. Total Environ.
432: 85–92, doi: 10.1016/j.scitotenv.2012.05.095.
Kuglitsch FG, Toreti A, Xoplaki E, Della-Marta PM, Luterbacher J,
Wanner H. 2009. Homogenization of daily maximum temperature
series in the Mediterranean. J. Geophys. Res. 114: D15108, doi:
10.1029/2008JD011606.
Lawrimore JH, Menne MJ, Gleason BE, Williams CN, Wuertz DB, Vose
RS, Rennie J. 2011. An overview of the Global Historical Climatology
Network monthly mean temperature data set, version 3. J. Geophys.
Res. 116: D19121, doi: 10.1029/2011JD016187.
Legates DR, McCabe G. 1999. Evaluating the use of “goodness-of-fit”
measures in hydrologic and hydroclimatic model validation. Water
Resour. Res. 35: 233–241, doi: 10.1029/1998WR900018.
Legates DR, McCabe GJ. 2013. A refined index of model performance:
a rejoinder. Int. J. Climatol. 33: 1053–1056, doi: 10.1002/joc.3487.
Leung Y, Mei C-L, Zhang W-X. 2000. Statistical tests for spatial nonstationarity based on the geographically weighted regression model.
Environ. Plann. A 32: 9–32, doi: 10.1068/a3162.
Lindeman R, Merenda P, Gold R. 1980. Introduction to Bivariate and
Multivariate Analysis. Scott Foresman: Glenview, IL.
Littell JS, Oneil EE, McKenzie D, Hicke JA, Lutz JA, Norheim RA,
Elsner MM. 2010. Forest ecosystems, disturbance, and climatic
change in Washington State, USA. Clim. Change 102: 129–158, doi:
10.1007/s10584-010-9858-x.
Livneh B, Rosenberg EA, Lin C, Nijssen B, Mishra V, Andreadis
KM, Maurer EP, Lettenmaier DP. 2013. A long-term hydrologically
based dataset of land surface fluxes and states for the conterminous
United States: update and extensions. J. Clim. 26: 9384–9392, doi:
10.1175/JCLI-D-12-00508.1.
Lloyd CD. 2009. Nonstationary models for exploring and mapping
monthly precipitation in the United Kingdom. Int. J. Climatol. 30:
390–405, doi: 10.1002/joc.1892.
Luce CH, Abatzoglou JT, Holden ZA. 2013. The missing mountain
water: slower westerlies decrease orographic enhancement in the
Pacific Northwest USA. Science 342: 1360–1364, doi: 10.1126/science.1242335.
Lundquist JD, Pepin N, Rochford C. 2008. Automated algorithm for
mapping regions of cold-air pooling in complex terrain. J. Geophys.
Res. 113: D22107, doi: 10.1029/2008JD009879.
van Mantgem PJ, Stephenson NL, Byrne JC, Daniels LD, Franklin JF,
Fulé PZ, Harmon ME, Larson AJ, Smith JM, Taylor AH, Veblen TT.
2009. Widespread increase of tree mortality rates in the western United
States. Science 323: 521–524, doi: 10.1126/science.1165000.
Maurer EP, Hidalgo HG. 2008. Utility of daily vs. monthly large-scale
climate data: an intercomparison of two statistical downscaling
methods. Hydrol. Earth Syst. Sci. 12: 551–563, doi: 10.5194/hess-12-551-2008.
Menne MJ, Williams CN. 2009. Homogenization of temperature
series via pairwise comparisons. J. Clim. 22: 1700–1717, doi:
10.1175/2008JCLI2263.1.
Menne MJ, Williams CN, Vose RS. 2009. The U.S. Historical Climatology Network Monthly Temperature Data, Version 2. Bull. Am. Meteorol. Soc. 90: 993–1007, doi: 10.1175/2008BAMS2613.1.
Menne MJ, Durre I, Vose RS, Gleason BE, Houston TG. 2012. An
overview of the Global Historical Climatology Network-Daily
Database. J. Atmos. Oceanic Technol. 29: 897–910, doi: 10.1175/JTECH-D-11-00103.1.
Mildrexler DJ, Zhao M, Running SW. 2011. A global comparison
between station air temperatures and MODIS land surface temperatures reveals the cooling role of forests. J. Geophys. Res. 116: G03025,
doi: 10.1029/2010JG001486.
Millard MJ, Czarnecki CA, Morton JM, Brandt LA, Briggs JS, Shipley FS, Sayre R, Sponholtz PJ, Perkins D, Simpkins DG, Taylor
J. 2012. A national geographic framework for guiding conservation on a landscape scale. J. Fish Wildl. Manage. 3: 175–183, doi:
10.3996/052011-JFWM-030.
Morisette JT (ed). 2012. North Central Climate Science Center – Science
agenda 2012–2017: U.S. Geological Survey Open-File Report
2012–1265, USGS, Reston, VA, 19 pp.
Mostovoy GV, King RL, Reddy KR, Kakani VG, Filippova MG. 2006.
Statistical estimation of daily maximum and minimum air temperatures from MODIS LST data over the state of Mississippi. GISci.
Remote Sens. 43: 78–110, doi: 10.2747/1548-1603.43.1.78.
Pepin NC, Daly C, Lundquist J. 2011. The influence of surface versus free-air decoupling on temperature trend patterns in the western United States. J. Geophys. Res. 116: D10109, doi: 10.1029/2010JD014769.
Peterson TC, Easterling DR, Karl TR, Groisman P, Nicholls N, Plummer N, Torok S, Auer I, Boehm R, Gullett D, Vincent L, Heino R,
Tuomenvirta H, Mestre O, Szentimrey T, Salinger J, Førland EJ,
Hanssen-Bauer I, Alexandersson H, Jones P, Parker D. 1998. Homogeneity adjustments of in situ atmospheric climate data: a review.
Int. J. Climatol. 18: 1493–1517, doi: 10.1002/(SICI)1097-0088(19981115)18:13<1493::AID-JOC329>3.0.CO;2-T.
Pielke RA. 1974. A three-dimensional numerical model of the sea
breezes over south Florida. Mon. Weather Rev. 102: 115–139, doi:
10.1175/1520-0493(1974)102<0115:ATDNMO>2.0.CO;2.
Pielke RA, Davey CA, Niyogi D, Fall S, Steinweg-Woods J, Hubbard K, Lin X, Cai M, Lim Y-K, Li H, Nielsen-Gammon J, Gallo
K, Hale R, Mahmood R, Foster S, McNider RT, Blanken P. 2007.
Unresolved issues with the assessment of multidecadal global land
surface temperature trends. J. Geophys. Res. 112: D24S08, doi:
10.1029/2006JD008229.
PRISM Climate Group. 2012. Norm81m dataset, Oregon State University, Corvallis, OR. ftp://prism.nacse.org/normals_800m (accessed 5
April 2014).
PRISM Climate Group. 2013a. AN81m dataset, Oregon State University,
Corvallis, OR. ftp://prism.nacse.org/monthly (accessed 5 April 2014).
PRISM Climate Group. 2013b. Descriptions of PRISM spatial climate datasets for the conterminous United States, Oregon State University, Corvallis, OR. http://www.prism.oregonstate.edu/documents/PRISM_datasets_aug2013.pdf (accessed 27 May 2014).
R Core Team. 2012. R: A Language and Environment for Statistical
Computing. R Foundation for Statistical Computing: Vienna.
Reeves J, Chen J, Wang XL, Lund R, Lu Q. 2007. A review and
comparison of changepoint detection techniques for climate data. J.
Appl. Meteorol. Climatol. 46: 900–915, doi: 10.1175/JAM2493.1.
Rohde R, Muller R, Jacobsen R, Perlmutter S, Rosenfeld A, Wurtele J,
Curry J, Wickham C, Mosher S. 2013. Berkeley Earth temperature
averaging process. Geoinform. Geostat. 1(2): 1–13, doi: 10.4172/2327-4581.1000103.
Running SW, Nemani RR, Hungerford RD. 1987. Extrapolation of
synoptic meteorological data in mountainous terrain and its use for
simulating forest evapotranspiration and photosynthesis. Can. J. For.
Res. 17: 472–483, doi: 10.1139/x87-081.
Schafer JL. 1997. Analysis of Incomplete Multivariate Data. Chapman
and Hall/CRC: Boca Raton, FL.
Smith TM, Reynolds RW, Peterson TC, Lawrimore J. 2008. Improvements to NOAA’s historical merged land–ocean surface temperature analysis (1880–2006). J. Clim. 21: 2283–2296, doi: 10.1175/2007JCLI2100.1.
Snyder WC, Wan Z, Zhang Y, Feng Y. 1998. Classification-based
emissivity for land surface temperature measurement from space. Int.
J. Remote Sens. 19: 2753–2774, doi: 10.1080/014311698214497.
Stacklies W, Redestig H, Scholz M, Walther D, Selbig J. 2007.
pcaMethods–a bioconductor package providing PCA methods for
incomplete data. Bioinformatics 23: 1164–1167, doi: 10.1093/bioinformatics/btm069.
Thornton PE, Running SW, White MA. 1997. Generating surfaces of
daily meteorological variables over large regions of complex terrain.
J. Hydrol. 190: 214–251, doi: 10.1016/S0022-1694(96)03128-9.
Thornton PE, Thornton MM, Mayer BW, Wilhelmi N, Wei Y, Cook
RB. 2012. Daymet: Daily surface weather on a 1 km grid for
North America, 1980–2012, Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, TN, doi: 10.3334/ORNLDAAC/Daymet_V2. http://daymet.ornl.gov/ (accessed 19 November
2013).
Trewin B. 2010. Exposure, instrumentation, and observing practice
effects on land temperature measurements. WIREs: Clim. Change 1:
490–506, doi: 10.1002/wcc.46.
Turner DP, Ritts WD, Yang Z, Kennedy RE, Cohen WB, Duane
MV, Thornton PE, Law BE. 2011. Decadal trends in net ecosystem production and net ecosystem carbon balance for a regional
socioecological system. For. Ecol. Manage. 262: 1318–1325, doi:
10.1016/j.foreco.2011.06.034.
Vancutsem C, Ceccato P, Dinku T, Connor SJ. 2010. Evaluation of
MODIS land surface temperature data to estimate air temperature
in different ecosystems over Africa. Remote Sens. Environ. 114:
449–465, doi: 10.1016/j.rse.2009.10.002.
Vincent LA, Zhang X, Bonsal BR, Hogg WD. 2002. Homogenization
of daily temperatures over Canada. J. Clim. 15: 1322–1334, doi:
10.1175/1520-0442(2002)015<1322:HODTOC>2.0.CO;2.
Vose RS, Applequist S, Menne MJ, Williams CN, Thorne P. 2012. An
intercomparison of temperature trends in the U.S. Historical Climatology Network and recent atmospheric reanalyses. Geophys. Res. Lett.
39: L10703, doi: 10.1029/2012GL051387.
Wan Z. 2008. New refinements and validation of the MODIS
land-surface temperature/emissivity products. Remote Sens. Environ.
112: 59–74, doi: 10.1016/j.rse.2006.06.026.
Wan Z, Li Z. 2011. MODIS land surface temperature and emissivity. In
Land Remote Sensing and Global Environmental Change, Ramachandran B, Justice CO, Abrams MJ (eds). Springer: New York, NY.
Webster R, Oliver MA. 2007. Geostatistics for Environmental Scientists,
2nd edn. Wiley: Chichester, UK.
Whiteman CD, Bian X, Zhong S. 1999. Wintertime evolution of
the temperature inversion in the Colorado plateau basin. J. Appl.
Meteorol. 38: 1103–1117, doi: 10.1175/1520-0450(1999)038<1103:WEOTTI>2.0.CO;2.
Wiens JA, Bachelet D. 2010. Matching the multiple scales of conservation with the multiple scales of climate change. Conserv. Biol. 24:
51–62, doi: 10.1111/j.1523-1739.2009.01409.x.
Williams CN, Menne MJ, Thorne PW. 2012. Benchmarking the performance of pairwise homogenization of surface temperatures in
the United States. J. Geophys. Res. 117: D05116, doi: 10.1029/2011JD016761.
Willmott CJ, Matsuura K. 2005. Advantages of the mean absolute error
(MAE) over the root mean square error (RMSE) in assessing average
model performance. Clim. Res. 30: 79–82, doi: 10.3354/cr030079.
Willmott CJ, Robeson SM. 1995. Climatologically aided interpolation
(CAI) of terrestrial air temperature. Int. J. Climatol. 15: 221–229, doi:
10.1002/joc.3370150207.
Willmott CJ, Rowe CM, Philpot WD. 1985. Small-scale climate maps:
a sensitivity analysis of some common assumptions associated with
grid-point interpolation and contouring. Am. Cartogr. 12: 5–16.
Willmott CJ, Robeson SM, Matsuura K. 2012. A refined index of
model performance. Int. J. Climatol. 32: 2088–2094, doi: 10.1002/joc.2419.