METO630 Midterm test 2004
Answers: Generally, as in real life, there is not a single “correct” answer (although there
are incorrect or incomplete answers!)
1) Assume you have a long time series, for example a 100 year long NAO index.
You find that over the last 7 years, the index has increased monotonically,
something that had not been observed before. Design a bootstrapping test to check
if such a sequence could not have happened randomly with a level of significance
of 95%.
The question asks about 7 years of monotonic increase. Null hypothesis: you can get
monotonic increases just by chance. Several students used trends, which is OK, but
strictly speaking we should test for monotonic change. Whether we use a two-tailed (2.5%)
or a one-tailed (5%) test depends on whether we have an alternative hypothesis that, for
example, climate change is producing an increase. If we have no idea, we should use a
two-tailed test.
Several resampling strategies were offered. The one I prefer is to scramble the original
data and then test for the presence of 7-year monotonic sequences. Another reasonable
strategy is to sample 7 years with replacement and then test for monotonicity. One
student pointed out that if the data are monthly, we should also account for persistence,
and subsample seasonally or annually to avoid its effect.
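The preferred strategy (scramble the series, then count 7-year monotonic runs) can be sketched as follows. The white-noise "NAO index", the function names, and the 1000 resamples are all illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)

def has_monotonic_run(x, length=7):
    """True if x contains a strictly increasing run of `length` values."""
    d = np.diff(x) > 0
    run = 0
    for up in d:
        run = run + 1 if up else 0
        if run >= length - 1:          # length-1 consecutive increases
            return True
    return False

def bootstrap_run_test(index, length=7, n_resamples=1000):
    """Fraction of scrambled series containing a monotonic run.

    Scrambling destroys any real trend, so this estimates the null
    probability of seeing such a run purely by chance.
    """
    hits = sum(has_monotonic_run(rng.permutation(index), length)
               for _ in range(n_resamples))
    return hits / n_resamples

# Synthetic 100-year "NAO index" (white noise -- an assumption for the demo)
index = rng.standard_normal(100)
p_null = bootstrap_run_test(index)
# Reject the null at the one-tailed 5% level if p_null < 0.05
```

For a two-tailed test at 95% one would also count 7-year monotonic decreases and compare against 2.5% in each tail.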
2) You have the results of an ensemble of 10 one month long forecasts made with
climatological SSTs and another 10 one month long forecasts made using the SST
with the initial SST anomalies. Design a significance test to determine whether
the impact of the SST anomalies is positive on the forecasts. Assume you verify
500hPa geopotential heights, and use a parametric test.
Null hypothesis: no improvement; alternative hypothesis: an improvement (one-tailed).
Compare the forecast errors with climatological SST (E1) and with observed SST (E2) at
every grid point. Parametric approach: I would use a paired t test,

    t = ( mean|E1| - mean|E2| ) / sqrt( (s1^2 + s2^2) / n ),

where

    s_i^2 = 1/(n-1) * sum_{k=1..n} ( |E_{i,k}| - mean|E_i| )^2,

with n the number of forecasts (here 10), and test at the 5% level.
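A minimal numpy sketch of this grid-point statistic (the error fields, the 500-point grid, and the helper name grid_t are made up for illustration):

```python
import numpy as np

def grid_t(E1, E2):
    """t statistic from the notes at every grid point: difference of
    mean absolute errors over sqrt((s1^2 + s2^2) / n)."""
    n = E1.shape[0]                     # number of forecasts (10 here)
    a1, a2 = np.abs(E1), np.abs(E2)
    s1 = a1.var(axis=0, ddof=1)         # sample variance of |E1|
    s2 = a2.var(axis=0, ddof=1)
    return (a1.mean(axis=0) - a2.mean(axis=0)) / np.sqrt((s1 + s2) / n)

# Made-up error fields: 10 forecasts x 500 grid points; the observed-SST
# runs (E2) are given slightly smaller errors so some points "improve"
rng = np.random.default_rng(1)
E1 = 1.2 * rng.standard_normal((10, 500))
E2 = rng.standard_normal((10, 500))
t = grid_t(E1, E2)
# One-tailed 5% critical value for roughly 18 degrees of freedom is about 1.73
improved = t > 1.73
```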
If using a parametric approach, after finding the points at which there is an improvement
(there may be other points where there is a significant deterioration as well), we have to
make an additional judgment about whether they represent a real improvement by testing for
multiplicity: is the number of grid points we obtained significantly larger than what we
would expect from N grid points, each of which has a 0.05 probability of passing by
chance, under the binomial distribution?
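The binomial multiplicity check can be sketched as follows. The counts (40 improved points out of N = 500) are made-up numbers, and binom_sf is an illustrative helper, not a library call:

```python
from math import comb

def binom_sf(k, N, p=0.05):
    """P[X >= k] for X ~ Binomial(N, p): the chance that k or more of N
    grid points pass a local 5% test purely at random."""
    return sum(comb(N, j) * p**j * (1 - p)**(N - j) for j in range(k, N + 1))

# Suppose 40 of N = 500 grid points passed the local test (made-up numbers)
p_field = binom_sf(40, 500, 0.05)
# If p_field < 0.05, the number of improved points is itself significant
```

Note that this treats grid points as independent; spatial correlation reduces the effective number of points, so in practice the check is optimistic.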
A non-parametric approach is similar: create a 1000-member null-hypothesis sample by
lumping both types of forecasts together and repeatedly drawing from the 20 forecasts
sets of 10 “E1” and 10 “E2” results. Find the grid points for which the real |E1| - |E2|
is larger than 95% of the null-hypothesis |E1| - |E2| distribution. You would still have
to test whether the total number of improved points is larger than what would be expected
at the p = 0.05 level.
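This resampling scheme can be sketched as follows. The ensemble sizes (10 + 10) come from the question; the error fields and the 200-point grid are made up:

```python
import numpy as np

rng = np.random.default_rng(2)

def perm_null(E1, E2, n_resamples=1000):
    """Null distribution of mean|E1| - mean|E2| at every grid point,
    built by pooling the 20 forecasts and redrawing two groups of 10."""
    pooled = np.abs(np.concatenate([E1, E2], axis=0))   # (20, n_points)
    diffs = np.empty((n_resamples, pooled.shape[1]))
    for k in range(n_resamples):
        idx = rng.permutation(pooled.shape[0])
        diffs[k] = pooled[idx[:10]].mean(axis=0) - pooled[idx[10:]].mean(axis=0)
    return diffs

E1 = 1.3 * rng.standard_normal((10, 200))   # made-up climatological-SST errors
E2 = rng.standard_normal((10, 200))          # made-up observed-SST errors
null = perm_null(E1, E2)
real = np.abs(E1).mean(axis=0) - np.abs(E2).mean(axis=0)
# Grid points where the real difference exceeds 95% of the null draws
improved = real > np.quantile(null, 0.95, axis=0)
```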
3) Describe how you would derive a prediction equation for the surface minimum
temperature in Washington DCA assuming you have access to the previous day
max and min temperatures, and to a dependent sample of 50 years of surface
observations for Mid-Atlantic stations.
First, stratify by season. Then choose the potential predictors: a number of reasonable
stations and physically meaningful variables (e.g., Tmin and Tmax, but also wind,
cloudiness and precipitation could be considered).
Then do stepwise screening regression for each season, reserving 10% of the data for
cross-validation. Ideally, do complete cross-validation (jackknifing).
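A sketch of forward screening with complete (leave-one-out) cross-validation, assuming ordinary least squares; the data, the helper names press and forward_screen, and the stopping rule are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def press(X, y):
    """Leave-one-out (jackknife) sum of squared prediction errors
    for an OLS fit with intercept on predictors X."""
    n = len(y)
    err = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        A = np.column_stack([np.ones(keep.sum()), X[keep]])
        beta, *_ = np.linalg.lstsq(A, y[keep], rcond=None)
        pred = np.concatenate([[1.0], X[i]]) @ beta
        err += (y[i] - pred) ** 2
    return err

def forward_screen(X, y, max_predictors=5):
    """Greedy forward screening: add the predictor that most reduces the
    cross-validated error, stop when no candidate helps."""
    chosen, best = [], press(np.empty((len(y), 0)), y)
    while len(chosen) < max_predictors:
        scores = {j: press(X[:, chosen + [j]], y)
                  for j in range(X.shape[1]) if j not in chosen}
        j, score = min(scores.items(), key=lambda kv: kv[1])
        if score >= best:
            break
        chosen.append(j)
        best = score
    return chosen

# Made-up winter sample: 50 years, 8 candidate predictors; the "true"
# predictand depends only on predictors 0 and 3
X = rng.standard_normal((50, 8))
y = 2 * X[:, 0] - X[:, 3] + 0.3 * rng.standard_normal(50)
selected = forward_screen(X, y)
```

The stopping rule (quit when cross-validated error stops falling) is what keeps the screening from selecting a preset number of predictors, which matters again in question 5.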
4) Suppose that you have a hydrological model and you want to run it with past
observed precipitation, but you have only access to monthly averages, except for a
single representative station, for which you have daily amounts. How would you
disaggregate the monthly precipitation into daily amounts, as required by the
hydrological model?
There is no “perfect” approach. If the stations are sufficiently close, I would simply
assume that the daily precipitation at each station n is

    p_n^daily = p_n^month * ( p_a^daily / p_a^month ),

where a is the representative station for which we have daily values. If not, I would fit a Markov chain
to station a, and assume that it represents well the persistence patterns for all stations. I
would also fit a gamma distribution to station a, assuming that the shape parameter α is
the same for all stations, and get the scale parameter β from each station’s mean
precipitation. I would then assign amounts to the rainy days (from the Markov chain) by
drawing from that gamma distribution.
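The ratio-scaling option above can be sketched as follows (the station amounts are made-up numbers):

```python
import numpy as np

def disaggregate(p_month, a_daily):
    """Scale the representative station's daily series so it sums to
    station n's monthly total: p_n_daily = p_n_month * a_daily / sum(a_daily)."""
    return p_month * a_daily / a_daily.sum()

# Made-up daily record (mm) at the representative station a, one 30-day month
a_daily = np.array([0, 5.2, 0, 0, 12.1, 3.0, 0, 0.8] + [0] * 22)
p_n = disaggregate(80.0, a_daily)   # station n reported 80 mm this month
```

By construction the disaggregated days reproduce station n's monthly total, and dry days at station a stay dry at station n, which is the implicit assumption of this approach.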
5) What is wrong with the following procedure to choose among 50 potential
predictors xi to construct a 10 predictor multiple regression for the predictand y
(monthly SST anomalies) for which we have 50 years of observations. We
perform correlations between each predictor and the predictand over the
dependent sample (50 years), and choose the 10 predictors with highest
correlation. (There may be more than one thing wrong!).
Several things are wrong, the most important being that the procedure ignores that there
may be correlation among the predictors, so that, for example, a predictor may have the
second-highest correlation with the predictand but also be highly correlated with the
first predictor. In this case, adding it as a second predictor may do more harm than good.
Other problems:
We shouldn’t choose the number of predictors (10) a priori; the screening regression and
cross-validation should guide this decision. Also, we should not say “highest correlation”
but highest squared correlation, so as not to ignore negative correlations. There was no
attempt to stratify the data, and no cross-validation.
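A small numerical illustration of the collinearity problem (all data made up): x2 is nearly a copy of x1, so it ranks high by marginal correlation yet adds almost nothing to a regression that already contains x1, while the lower-correlation x3 carries genuinely new information.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
x1 = rng.standard_normal(n)
x2 = x1 + 0.1 * rng.standard_normal(n)           # highly collinear with x1
x3 = rng.standard_normal(n)                       # independent predictor
y = x1 + 0.5 * x3 + 0.2 * rng.standard_normal(n)

# Marginal correlations: x2 ranks second, x3 ranks last
corr = [abs(np.corrcoef(x, y)[0, 1]) for x in (x1, x2, x3)]

def r2(X, y):
    """R^2 of an OLS fit with intercept on the predictor list X."""
    A = np.column_stack([np.ones(len(y))] + list(X))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - resid.var() / y.var()

gain_x2 = r2([x1, x2], y) - r2([x1], y)   # tiny: x2 is redundant with x1
gain_x3 = r2([x1, x3], y) - r2([x1], y)   # substantial: x3 is new information
```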