Geophysical Research Letters
Supporting Information for
Short-tailed Temperature Distributions over North America and Implications for Future
Changes in Extremes
Paul C. Loikith1,2 and J. David Neelin3
1. Portland State University, Portland, Oregon
2. Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California
3. University of California Los Angeles, Department of Atmospheric and Oceanic
Sciences, Los Angeles, California
Contents of this file
Text S1.
Figures S1 to S5.
Introduction
This supplementary information provides:

Text S1 that elaborates on the figures and the statistical significance test;

two figures (S1-S2) that compare the primary data set used in the main text
(MERRA-CRU) with results from two other data sets (PRISM 4km daily
temperature and Global Historical Climatology Network daily station data);

two figures (S3-S4) that elaborate on the statistical significance test for non-normality presented in the manuscript. The data and data processing used for
these figures are the same as those used in the manuscript;

one figure (S5) that shows the effect on extreme exceedances of a smaller warm
shift than that used in the main text.
Text S1.
Comparison between the MERRA-CRU and two other datasets
A leading aim in the main text is to show the widespread geographic occurrence of
shorter-than-Gaussian tails. The MERRA-CRU reanalysis data set is thus used so that
spatial patterns can be consistently examined over North America. While a full
comparison of MERRA-CRU to other data sets, and validation against station data where
these exist, is beyond the scope of this paper, we briefly provide comparisons to two
other relevant data sets (Figs. S1-S2). Over the continental US, PRISM 4km resolution
daily temperature data [Daly et al. 1994] analyzed as in Figs. 3-4 of the main text are
shown in Fig. S1. The MERRA-CRU dataset, despite its much lower resolution (~50 km
compared with 4km for PRISM), generally agrees well with PRISM, with some differences in JJA
over Texas and the Great Plains. This difference is consistent with previously
documented erroneously positive skewness in the underlying MERRA reanalysis
examined in Loikith et al. (2015b), where it was noted that the CRU bias correction
greatly reduces the skewness error. Overall, the higher-resolution PRISM data reinforce
the point that shorter-than-Gaussian warm tails are prevalent over geographically
coherent regions in both winter and summer. For comparison to station data, Fig. S2
shows examples of stations from the Global Historical Climatology Network (GHCN;
Menne et al. 2012). These stations were chosen because they have
long records (all cases shown here contain 31 years, 1979-2009) and lie in regions
exhibiting shorter-than-Gaussian warm tails. They also serve as additional examples of
non-Gaussian tails in the MERRA-CRU data beyond those shown in the main text.
In one instance, Boise in DJF, the GHCN has a warm-side tail that is not well
distinguished from the Gaussian core, although the distribution exhibits similar non-Gaussianity overall, with a longer cold-side tail than the MERRA-CRU. In
another instance, Miami in DJF, the GHCN exhibits a notably shorter warm-side tail than
the MERRA-CRU. Overall, the MERRA-CRU appears to reflect the nature of the non-Gaussianity in the station data. However, we emphasize that the widespread
existence of non-Gaussian tails found here, and the apparent potential importance of
these for global warming applications, underscore the need for systematic validation of
both models and reanalysis data sets for these features.
Measures of Departures from Gaussian
Here we provide additional background and details regarding the tests for non-Gaussianity discussed in Section 3c of the main text. While physically motivated by the
shift of the distribution that is likely to be a leading contribution to changes under global
warming, we underline that the statistics tested under the shift of the distribution are
purely properties of the present-day distribution and correspond to variants of known
tests for non-normality. As background, the Lilliefors test for normality is an extension of
the Kolmogorov-Smirnov test to circumstances where the values of the population
parameters (mean, standard deviation) must be estimated from the sample (Sheskin
2003). If at any point the empirical cumulative distribution function (CDF) lies outside a
given range obtained by sampling the reference distribution, the null hypothesis that they
are drawn from the same distribution is considered to be rejected; the greatest vertical
distance at any point between the two CDFs is used as the statistic.
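As an illustration of the statistic just described, the following Python sketch computes the greatest vertical distance between an empirical CDF and a Gaussian CDF whose mean and standard deviation are estimated from the sample. The function name and the synthetic samples are assumptions made for illustration; this is not the analysis code used for the paper.

```python
import random
from statistics import NormalDist, fmean, stdev


def lilliefors_statistic(sample):
    """Greatest vertical distance between the empirical CDF of `sample`
    and a Gaussian CDF with mean and standard deviation estimated from it."""
    xs = sorted(sample)
    n = len(xs)
    reference = NormalDist(fmean(xs), stdev(xs))
    distance = 0.0
    for i, x in enumerate(xs):
        f = reference.cdf(x)
        # The empirical CDF steps from i/n to (i+1)/n at x, so check both sides.
        distance = max(distance, (i + 1) / n - f, f - i / n)
    return distance


# Synthetic data only (assumption): one Gaussian and one strongly skewed sample.
rng = random.Random(0)
gaussian_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
skewed_sample = [rng.expovariate(1.0) for _ in range(2000)]

d_gaussian = lilliefors_statistic(gaussian_sample)
d_skewed = lilliefors_statistic(skewed_sample)
print(d_gaussian, d_skewed)
```

The skewed sample should give a much larger statistic than the Gaussian one. Whether a given value is significant still has to be judged against a range obtained by sampling the reference distribution.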
Here, a random sampling procedure is performed to gauge the extent to which each of the
three measures discussed in the main text differs significantly from the reference
Gaussian. A sample of size equal to that of the observed time series is constructed by
randomly sampling the reference distribution, and then analyzed as for the observed
series. This procedure is then repeated 10000 times, and the 5th-95th percentile range is
constructed for comparison to the observed distribution. The null hypothesis that the
empirical distribution is drawn from the Gaussian distribution is tested under these
measures of the separation between the CDF of the observed time series and the large
ensemble of CDFs sampled from the comparison Gaussian.
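The random sampling procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the reference Gaussian is taken in standardized units, samples are drawn independently, and the helper names (`percentile`, `gaussian_envelope`, `p95`) are hypothetical. The example uses reduced sizes so that it runs quickly.

```python
import random


def percentile(values, p):
    """Percentile p (0-100) of `values`, by linear interpolation."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def gaussian_envelope(statistic, n_days, n_trials=10000, seed=0):
    """5th-95th percentile range of `statistic` over synthetic Gaussian series.

    Each trial draws `n_days` independent standard-normal values and is
    analyzed with the same `statistic` function as the observed series.
    """
    rng = random.Random(seed)
    trial_values = []
    for _ in range(n_trials):
        series = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
        trial_values.append(statistic(series))
    return percentile(trial_values, 5), percentile(trial_values, 95)


def p95(series):
    """Threshold temperature Tt taken as the 95th percentile of the series."""
    return percentile(series, 95)


# Reduced sizes keep this example fast; the text uses 10000 repetitions
# and a sample size equal to that of the observed time series.
low, high = gaussian_envelope(p95, n_days=500, n_trials=1000)
print(low, high)
```

An observed value of the statistic, computed from a series expressed in the same standardized units, would be judged significantly different from Gaussian if it fell outside the range `(low, high)`.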
Figure S3 shows the relationship of the Kolmogorov-Smirnov/Lilliefors (KS/L) statistic
to the first of the three measures: the separation of the threshold temperature Tt between
the empirical and reference CDF, with Tt the 95th percentile P95. Both statistics are
shown in Fig. S3 for CDFs corresponding to Fig. 1 of the main text, while Fig. S4 shows
the same for CDFs corresponding to Fig. 2 of the main text. The KS/L statistic is the
maximum vertical separation between the empirical and reference CDF, while the
separation of the threshold temperature measures the distance between the two CDFs in
the horizontal at a location chosen to be relevant to the warm side tail. The caveat must
be noted, however, that contributions to the departure from Gaussian leading to the CDF
separation can arise in other parts of the PDF — in these examples, this includes effects
from the long cold-side tail seen in Figs. 1 and 2 of the main text.
The second measure, the fraction of days exceeding Tt as the distribution is shifted
compared to that of the sampled reference distribution, is likewise a variant of the KS/L
procedure. To see this, note that the fraction of days as a function of the temperature shift
shown in Figs. 1c,d,g,h and 2c,d,g,h of the main text (i.e., lower panels for each location)
is equivalent to displaying the CDF of the shift variable s = (Tt – T)/σ, where σ is the
standard deviation of the core of the empirical distribution. This is also equivalent to the
complementary CDF of T in units of standard deviation with the axes reversed, and
referenced to Tt. In other words, to get the corresponding panels of Figs. 1 and 2 of the
main text from Figs. S3 and S4, invert both axes and make Tt the origin. If at any s the
vertical separation of the empirical CDF(s) lies outside the given range created by
sampling the reference distribution, the null hypothesis is rejected, just as in KS/L. In
Figs. 1c,d,g,h and 2 c,d,g,h of the main text, this vertical separation may be seen to
emerge smoothly as one moves away from the origin in s. This difference in slope
between the empirical and reference CDF(s) allows this measure to be identified with
non-Gaussianity of the PDF arising in the neighborhood of Tt, in this case specifically
with warm-side short tails.
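The second measure can be sketched in Python as follows. The sample here is synthetic (uniform values, which have bounded tails), the full-sample standard deviation stands in for the core standard deviation, and all function names are hypothetical. The sketch is meant only to show how the exceedance fraction under a shift compares with the Gaussian reference.

```python
import random
from statistics import NormalDist, stdev


def percentile(values, p):
    """Percentile p (0-100) of `values`, by linear interpolation."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def exceedance_after_shift(temps, sigma, s, pct=95.0):
    """Fraction of days above the fixed threshold Tt once every day is
    warmed by s * sigma; Tt is the `pct` percentile of the unshifted series."""
    tt = percentile(temps, pct)
    return sum(1 for t in temps if t + s * sigma > tt) / len(temps)


def gaussian_exceedance(s, pct=95.0):
    """Same fraction for a Gaussian: 1 - Phi(z_pct - s)."""
    std_normal = NormalDist()
    return 1.0 - std_normal.cdf(std_normal.inv_cdf(pct / 100.0) - s)


# Synthetic short-tailed sample (assumption): uniform values have bounded tails.
rng = random.Random(1)
temps = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
sigma = stdev(temps)  # stand-in for the core standard deviation

for s in (0.0, 0.5, 1.0):
    print(s, exceedance_after_shift(temps, sigma, s), gaussian_exceedance(s))
```

For this short-tailed sample the exceedance fraction grows faster with s than in the Gaussian case, which is the behavior the second measure is designed to detect.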
Because this separation between the empirical and reference CDF(s) emerges smoothly
and is statistically significant over a large range of the shift variable s, we can choose the
separation at any s value within this range to create a third measure to display on maps.
For Figs. 3 and 4 of the main text, we use the separation between the empirical and
reference CDF(s) evaluated at s = 1, i.e., Tt – T = σ. However, the separation at s = 0.5, or
even smaller values, would have worked equally well for these distributions. This is
illustrated in Figure S5, which shows the s = 0.5 case and
the expected similarity of spatial pattern to the s = 1 case in Figs. 3a and 4a
of the main text. Displaying the shift variable in units of σ has the advantage that there
is a single value corresponding to the Gaussian case that applies at all points on the map,
and that the percentile range within which map values are left unshaded can
be determined from the random sampling of the Gaussian.
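The single Gaussian reference value follows from a short calculation. As a sketch, take a Gaussian with mean μ and standard deviation σ (μ, Φ and z95 are symbols introduced here for illustration, with Φ the standard normal CDF). The threshold is then Tt = μ + z95 σ with z95 = Φ⁻¹(0.95) ≈ 1.645, and warming every day by sσ gives an exceedance fraction that depends only on s:

```latex
\begin{aligned}
P\left(T + s\sigma > T_t\right)
  &= P\!\left(\frac{T-\mu}{\sigma} > z_{95} - s\right)
   = 1 - \Phi\left(z_{95} - s\right), \\
s = 1:\quad & 1 - \Phi(0.645) \approx 0.26, \\
s = 0.5:\quad & 1 - \Phi(1.145) \approx 0.13 .
\end{aligned}
```

Because μ and σ cancel, the same value applies at every grid point. The s = 0.5 value of roughly 13% lies inside the 11%-14% range quoted in the caption of Figure S5.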
To summarize, all three measures of the departures from Gaussian can be simply related
to variants of the KS/L test. The second measure is the most fundamental for identifying
departures from Gaussian that arise in the warm-side tail (the third measure is an index
created from the second). This measure arose from the physical question of the fraction
of days that exceed a given threshold under a warming shift of the distribution, but it
provides an additional strong justification for considering simple shifts as a standard step
in evaluating present-day distributions: namely, the relationship to existing tests for non-Gaussianity.
The value of the standard deviation σ used in the reference Gaussian is estimated from
the core of the distribution here. This choice was made to be conservative in testing for
shorter-than-Gaussian tails, given the asymmetric distributions often encountered, so that
σ would not be artificially inflated by a long tail on the other side of the distribution. If
distributions are so non-Gaussian that a core cannot be easily defined, as occurs in a few
instances in this application, we simply revert to the standard deviation of the entire
distribution.
The above methods can easily be adapted to address cold-side tails. The threshold Tt
would be chosen on the cold side, say the 5th percentile, with an analogous shift variable,
but with the fractional reduction in cold extremes for the empirical PDF relative to the
Gaussian as the focus.
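A cold-side variant might look like the following Python sketch. It assumes that the analogous shift is again a uniform warming by sσ, and that the quantity of interest is the fraction of days remaining below the 5th percentile threshold. The function names, the synthetic uniform sample, and the use of the full-sample standard deviation are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, stdev


def percentile(values, p):
    """Percentile p (0-100) of `values`, by linear interpolation."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def cold_fraction_after_shift(temps, sigma, s, pct=5.0):
    """Fraction of days still below the cold threshold Tt (the `pct`
    percentile of the unshifted series) after warming each day by s * sigma."""
    tt = percentile(temps, pct)
    return sum(1 for t in temps if t + s * sigma < tt) / len(temps)


def gaussian_cold_fraction(s, pct=5.0):
    """Same fraction for a Gaussian: Phi(z_pct - s)."""
    std_normal = NormalDist()
    return std_normal.cdf(std_normal.inv_cdf(pct / 100.0) - s)


# Synthetic sample with a short cold tail (assumption): uniform values.
rng = random.Random(2)
temps = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
sigma = stdev(temps)  # stand-in for the core standard deviation

empirical = cold_fraction_after_shift(temps, sigma, 1.0)
reference = gaussian_cold_fraction(1.0)
print(empirical, reference)
```

In this synthetic case the cold extremes disappear entirely after a 1σ shift, whereas a Gaussian retains a small fraction, so the reduction relative to the Gaussian is the quantity to display.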
References
Daly, C., R. P. Neilson, and D. L. Phillips (1994), A statistical-topographic model for
mapping climatological precipitation over mountainous terrain. J. Appl. Meteor., 33,
140-158.
Loikith, P. C., D. E. Waliser, H. Lee, J. Kim, J. D. Neelin, B. R. Lintner, S. McGinnis, C.
Mattmann, and L. O. Mearns (2015b), Surface Temperature Probability Distributions in
the NARCCAP Hindcast Experiments: Evaluation Methodology, Metrics and Results, J.
Climate, 28, 978-997.
Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston (2012), An
overview of the Global Historical Climatology Network-Daily Database. Journal of
Atmospheric and Oceanic Technology, 29, 897-910, doi:10.1175/JTECH-D-11-00103.1.
Sheskin, D. (2003), Handbook of parametric and nonparametric statistical procedures,
3rd ed., Chapman & Hall/CRC.
Figure S1. Comparison between the MERRA-CRU dataset described in Section 2 and
PRISM 4km resolution daily temperature data. Shaded values are the percentage of days
exceeding the 95th percentile of the current temperature distribution after a 1σ warm shift. The left
(DJF) panels correspond to Figure 3 and the right (JJA) panels correspond to Figure 4.
PRISM is only available over the continental US limiting this comparison to the common
domain. PRISM data obtained from the PRISM Climate Group, Oregon State
University, http://prism.oregonstate.edu, created 18 June 2015.
Figure S2. Comparison between MERRA-CRU and Global Historical Climatology
Network-Daily (GHCN) station data at (left) three select locations exhibiting non-Gaussian short warm tails in DJF and (right) JJA. The MERRA-CRU results are from
the grid cell nearest to the observation station.
Figure S3. Cumulative distributions for the four examples presented in Figure 1. The
blue (green) lines are for the observed (Gaussian) distributions. The shaded region
around the Gaussian curve is the 5th-95th percentile range obtained by randomly
sampling a Gaussian distribution 10000 times and computing the CDF. The red arrows
show the maximum displacement in the vertical between the observed and Gaussian
envelope, representing the Kolmogorov-Smirnov/Lilliefors statistic. The blue arrow
shows the displacement in the horizontal between the location of the threshold
temperature Tt for the observed and reference distribution, where CDF(Tt)= 0.95. The
shaded gray region shows the 5th-95th percentile range of 95th percentiles of the randomly
generated Gaussian distributions.
Figure S4. Same as in Figure S3 except for JJA.
Figure S5. Same as (left) Figure 3a and (right) Figure 4a of the main text except with a
0.5σ shift. Values less than 14% and greater than 11% (not shaded) are within the 5th-95th percentile range of expected Gaussian exceedances as determined by randomly
sampling the normal distribution.