Further Development of Bartlett-Lewis model for fine-resolution rainfall

Jo Kaczmarska
Department of Statistical Science
University College London, Gower Street, London, WC1E 6BT, UK.
(jo@stats.ucl.ac.uk)
April 15, 2011
Abstract
In a recent development in the literature, a new temporal rainfall model, based on
the Bartlett-Lewis clustering mechanism, and intended for sub-hourly application,
was introduced. That model replaced the rectangular rain cells of the original
Bartlett-Lewis model with a Poisson process of instantaneous pulses, in order to
allow greater variability in rainfall intensity over small time intervals. A version
with two superposed processes provided a good fit to five-minute data from New
Zealand, but required a large number of parameters. In the present paper the
basic instantaneous pulse model is extended, following the approach developed in
an earlier study, by randomising the cell duration parameter, thus allowing the
durations of cells within a single storm to be dependent. Moments up to 3rd order
for the aggregated rainfall process are developed for the new model, which is then
fitted to 69 years of 5-minute data from Bochum, Germany. The new model is
compared with a number of other Bartlett-Lewis variants, and found to perform
well, improving on the non-random version, and providing a parameter-efficient
method of allowing for different storm types. A further improvement is found by
fitting a model variant in which pulse depths within cells are dependent.
∗ Research Report No. 312, Department of Statistical Science, University College London. Date: April 2011
1 Introduction
Temporal rainfall models based on the clustered Poisson process approach introduced by
Rodriguez-Iturbe et al. (1987) have been used for over twenty years, in order to simulate
the artificial rainfall series required as input for hydrological models, for example for flood
risk analysis, sewerage system design, and the design of reservoirs. The models assume
that rain-events arrive in a Poisson process. Each rain event consists of a cluster of rain
cells, with the temporal location of cells relative to the event origin specified by one of
two clustering mechanisms - Bartlett-Lewis or Neyman-Scott. In the most commonly
used models, each cell is assumed to have a random duration, during which rain with a
constant random intensity is deposited, giving rise to their description as “rectangular
pulse models”. The models’ ability to generate simulations in continuous time is one
of their principal advantages, allowing aggregation of the properties and simulations to
different timescales in a consistent way. A further important feature of the models is their
representation of the physical rainfall process in a realistic, if simplified way, such that
the hierarchical structure of rainfall is represented, and the parameters have interpretable
meanings. This means that fitted model parameters can provide some insight into the
nature of differences between sites, or indeed between different potential future climate
conditions.
Since their introduction, many refinements have been introduced. Key amongst these
have been those which have allowed for different types of rainfall. These include models
with multiple cell-types (Cowpertwait 1994), or multiple superposed processes (Cowpertwait 2004, Cowpertwait et al. 2007). In order to keep parameter numbers manageable,
these methods have generally limited the number of cell types or processes to just two,
which can be thought of as representing heavy, short-duration convective and lighter,
long-duration stratiform types of rainfall. An alternative modification to enable variation between storms is the randomisation of the cell duration parameter between storms
(Rodriguez-Iturbe et al. 1988, Entekhabi et al. 1989). In effect this allows a continuous
range of storm types. The primary motivation was to improve the fit of the models to the
probability of no rain in an interval (“proportion dry”), particularly for longer periods
of several hours or more. In the Bartlett-Lewis case (Rodriguez-Iturbe et al. 1988) the
model was re-parameterised such that different storms essentially had the same structure,
but operated on different timescales. This model was recommended by Wheater et al.
(2005), following a practical review of several models, for combining good performance
with a relatively parsimonious model structure. A more recent approach from Cowpertwait (2010) addresses the issue of different types of rainfall, by assuming a continuum of
storm types of random type Z, whose parameters are functions of Z, and investigates this
approach in the special case where Z is uniformly distributed. This approach again has
the advantage of being relatively parsimonious, but choices for the distribution of Z and
parameter functions of Z are likely to be limited by tractability.
Other variations of the basic models include the addition of a jitter to make the cell
intensity more realistically irregular (Rodriguez-Iturbe et al. 1987, Gyasi-Agyei & Willgoose 1997), the introduction of dependence between cell duration and intensity (Kakou
& Onof 1996) and a more realistic assumption for the shape of rainfall intensity within
cells (Northrop & Stone 2005).
The models are fitted to discrete data from rain-gauges, typically using the generalised
method of moments (the complexity of the models, particularly when aggregated, making
a maximum likelihood method impracticable). This is a fairly subjective method, for
which there is considerable flexibility, particularly in terms of the number and types of
properties chosen for fitting and the weights applied to these. Examples of practical
application are numerous (Onof et al. 2000, Wheater et al. 2005, Cowpertwait 2006,
Kilsby et al. 2007, Burton et al. 2008), with generally very good performance, and little
to choose between the two clustering mechanisms (Wheater et al. 2005). Although some
shortcomings in performance are found, it is not always clear to what extent these are due
to the models or to the fitting. The subjective fitting method is at once a disadvantage
and an advantage (since weights and properties can be selected to focus specifically on
areas in which the hydrologist is most interested for a particular application). The most
commonly noted shortcomings relate to the reproduction of wet/dry properties and to
extremes. The former was addressed to some extent by randomising the cell duration
parameter, as discussed; the latter by the introduction of the skewness coefficient as one
of the fitting properties (Cowpertwait 1997).
Parameter identifiability can be a problem, particularly with the model variants with relatively high numbers of parameters, such as those with multiple cell types or superposed
processes. Cowpertwait (2010) suggests that any more than eight parameters per season is
likely to be excessive since the sample moments used in model fitting are highly correlated.
This is backed up by empirical studies: for example, Wheater et al. (2005), in a comparison of models, found that the Bartlett-Lewis model with random cell duration and a
one-parameter cell intensity distribution (6 parameters) had reasonably well-identified
parameters, whereas a model with two cell types, each with a two-parameter intensity
function (10 parameters) did not.
Another shortcoming of these models is that they are stationary, and much recent development has focused on simulating future rainfall, allowing for potential impacts of climate
change. However, to date no straightforward approach has been found, and many of the
approaches in the existing literature continue to use the clustered Poisson models within
their methodology. For example, such models are used for downscaling climate model
output (Kilsby et al. 2007, Burton et al. 2010), or for the disaggregation of simulations
produced using alternative modelling strategies from daily to sub-daily (see for example
Glasbey et al. (1995), Koutsoyiannis & Onof (2000, 2001) for methodology and Chandler
et al. (2007) for application). An alternative approach from Fowler et al. (2000) used the
Neyman-Scott rectangular pulse model to simulate rainfall within a given weather-state,
the states themselves being modelled using a semi-Markov process.
While much of the application of the models has been at hourly or longer timescales,
there is also a significant requirement for sub-hourly resolution, in particular for the
design of stormwater sewerage systems. This was the motivation for the development of
the Bartlett-Lewis Pulse model (Cowpertwait et al. 2007), which replaces the rectangular
rain cells of the original Bartlett-Lewis model with a Poisson process of instantaneous
pulses (thus incorporating two levels of clustering, and allowing greater variability in
rain intensity at short timescale). We will refer to this model as the Bartlett-Lewis
Instantaneous Pulse model (BLIP). The model achieved a very good fit to a time-series of
five-minute rainfall data from a site near Wellington, New Zealand, using two superposed
storm processes.
In this paper, we examine the performance of the BLIP model on another long series
of five-minute rainfall, from a single rain-gauge in Bochum, Germany. We confirm the
problems of parameter identifiability of the 11-parameter model with two superposed
processes, and therefore the need to develop a parsimonious model structure that is still
capable of allowing for different types of precipitation. We therefore go on to develop a
version of the BLIP model with a random cell duration parameter, following the approach
of the Random Parameter (or Random η) Bartlett-Lewis model of Rodriguez-Iturbe et al.
(1988), and compare the fit using this model against the non-random version, and against
other key variants.
Note that although we are focusing here purely on temporal models, i.e. those fitted to a
single site, such models may readily be extended to the spatial dimension and fitted to
rain-gauges following the approach of Cowpertwait (1995), Cowpertwait et al. (2002).
2 Specification of the Bartlett-Lewis suite of models
2.1 Summary of Existing Models
In the basic Bartlett-Lewis Rectangular Pulse (BLRP) model, rain-events arrive in a Poisson process of rate λ, each event generating a cluster of cell arrivals. The Bartlett-Lewis
clustering mechanism assumes that the time intervals between successive cells are independent, identically distributed random variables (whereas in the Neyman-Scott model,
it is the temporal distances of the cells from their storm origin which are independent
and identically distributed). It is normally assumed that the intervals between cells are
exponentially distributed, so that the cell arrivals constitute a secondary Poisson process
of rate β. Each cell is associated with a rectangular pulse of rain, of random duration,
L, and with random intensity, X. In the simplest version of the model, these are both
assumed to be exponentially distributed with parameters η and 1/µX respectively, and
are independent of each other. The cell origin process terminates after a time that is also
exponentially distributed with rate γ. This basic version thus has five parameters in total.
Additional flexibility can be added by allowing for a distribution with more parameters
for pulse intensities. A distribution with a longer tail may help in particular with the fit
of extreme values, and popular variants include the Gamma and Weibull distributions.
One additional parameter is required in order to use either of these.
Both storms and cells may overlap, and the total intensity of rain at any point in time,
Y (t) is given by the sum of all pulses “active” at time t.
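The construction of the basic BLRP model described above can be sketched as a short simulation. This is a minimal illustration only, assuming time is measured in hours; the function names, argument names and parameter values are hypothetical and are not taken from any fitted model in this paper.

```python
import random

def simulate_blrp(lam, beta, gamma, eta, mu_x, t_end, seed=None):
    """Simulate rectangular cells as (start, end, intensity) tuples.

    Storm origins arrive at rate lam on [0, t_end). Each storm has a cell at
    its origin, followed by further cells at rate beta until the cell-origin
    process stops after an exponential time with rate gamma.
    """
    rng = random.Random(seed)
    cells = []
    storm = 0.0
    while True:
        storm += rng.expovariate(lam)            # next storm origin
        if storm >= t_end:
            break
        storm_end = storm + rng.expovariate(gamma)
        cell_start = storm                       # BLRP: first cell at the storm origin
        while cell_start < storm_end:
            duration = rng.expovariate(eta)      # cell duration L, mean 1/eta
            intensity = rng.expovariate(1.0 / mu_x)  # cell intensity X, mean mu_x
            cells.append((cell_start, cell_start + duration, intensity))
            cell_start += rng.expovariate(beta)  # interval to the next cell origin
    return cells

def intensity_at(cells, t):
    """Total intensity Y(t): the sum of all cells active at time t."""
    return sum(x for (start, end, x) in cells if start <= t < end)

def aggregate(cells, t_end, h):
    """Rainfall depth in consecutive intervals of length h covering [0, t_end)."""
    n = int(t_end / h)
    depths = [0.0] * n
    for start, end, x in cells:
        first = int(start // h)
        last = min(int(end // h), n - 1)
        for i in range(first, last + 1):
            lo, hi = max(start, i * h), min(end, (i + 1) * h)
            if hi > lo:
                depths[i] += x * (hi - lo)
    return depths

# Hypothetical parameter values (rates per hour, mu_x in mm/h).
cells = simulate_blrp(lam=0.05, beta=2.0, gamma=0.5, eta=4.0, mu_x=1.5,
                      t_end=2000.0, seed=1)
hourly = aggregate(cells, t_end=2000.0, h=1.0)
print(len(cells), "cells;", sum(hourly), "mm in total")
```

Because the cells are held in continuous time, the same simulated storm sequence can be aggregated to any timescale by changing h.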
The Random Parameter Bartlett-Lewis model (Rodriguez-Iturbe et al. 1988) extends this
basic model by allowing the parameter η, that specifies the duration of cells, to vary randomly between storms. This is achieved by assuming that the η values for distinct storms
are independent, identically distributed random variables from a Gamma distribution
with index α and scale parameter ν. The model is re-parameterised so that, rather than
keeping the cell arrival rate, β, and the storm termination rate, γ constant for each storm,
it is the ratio of both of these parameters to η that is kept constant. Thus, for a higher
η (i.e. typically shorter cell durations), we have correspondingly shorter storm durations,
and shorter cell interarrival times. This is desirable as it is in line with what we observe
in practice - that short duration convective rain is more intense than the longer duration
stratiform rain. Essentially the effect is that all storms have a common structure, but
distinct storms occur on different (random) timescales.
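The re-parameterisation can be illustrated by drawing the rates for individual storms. The sketch below assumes the Gamma density for η is proportional to η^(α−1) e^(−νη), as in the integral used in Section 2.2, so that ν enters as a rate; the names beta_over_eta and gamma_over_eta for the two fixed ratios, and all numerical values, are hypothetical.

```python
import random

def sample_storm_rates(alpha, nu, beta_over_eta, gamma_over_eta, rng):
    """Draw eta for one storm and derive that storm's beta and gamma."""
    # random.gammavariate takes (shape, scale). With a density proportional to
    # eta**(alpha - 1) * exp(-nu * eta), the scale argument is 1 / nu.
    eta = rng.gammavariate(alpha, 1.0 / nu)
    return {
        "eta": eta,                       # cell duration parameter
        "beta": beta_over_eta * eta,      # cell arrival rate scales with eta
        "gamma": gamma_over_eta * eta,    # storm termination rate scales with eta
    }

rng = random.Random(42)
for _ in range(3):
    storm = sample_storm_rates(alpha=3.0, nu=2.0, beta_over_eta=0.5,
                               gamma_over_eta=0.1, rng=rng)
    # Mean cell duration, cell inter-arrival time and storm duration all
    # shrink together as eta increases.
    print(1.0 / storm["eta"], 1.0 / storm["beta"], 1.0 / storm["gamma"])
```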
An issue exists with the original random η model (Verhoest et al. 2010), which led to
the development of the Truncated Random η model (Onof, C., T. Meca-Figueras, J. M.
Kaczmarska, R. E. Chandler, and L. Hege, Modelling rainfall with a Bartlett-Lewis process: third-order moments, proportion dry, and a truncated random parameter version,
(manuscript in preparation, 2011)), where the Gamma distribution for the cell duration
parameter, η, is truncated, with support (ε, ∞). The issue arises due to the divergence at
zero of integrals over η for the variance and skewness of the aggregated series, for certain
values of the shape parameter α. In fact, if the skewness coefficient is to be included in
the fitting, α in the original model would need to be greater than 4, which in practice is
an undesirable constraint. The lower limit, ε, for the integrals over η can be pre-specified,
or alternatively, as in the Truncated Random η model, can constitute a further parameter
to be determined.
The Bartlett-Lewis Instantaneous Pulse model (Cowpertwait et al. 2007), intended for
fitting to fine-scale (of the order of five to fifteen minute) data, has a minimum of six
parameters (one more than the original Bartlett-Lewis model), and is defined and parameterised as follows:
• Storm origins arrive in a Poisson process of rate λ.
• Each storm origin initiates a Poisson process of cell origins of rate β; in contrast
to the basic Bartlett-Lewis model, it is not assumed that there is a cell at the
storm origin itself, so a storm may have no rainfall. This is purely for mathematical
convenience and does not lead to any loss of generality.
• Each cell origin initiates a further Poisson process of rainfall pulses of rate ξ. Again,
it is not assumed that there is a pulse at the cell origin, so a cell may have no rainfall.
Note that the pulses are instantaneous - they have a depth, but no duration. This
Poisson process of instantaneous pulses replaces the rectangular pulse assumption
of the original Bartlett-Lewis model.
• Both the storm duration (the duration of the cell origin process), and the cell duration are assumed to be exponentially distributed, the former with rate γ, and the
latter with rate η. The process of pulses terminates with the cell or storm lifetime,
whichever is the sooner.
• Associated with each pulse is a depth, X, so the pulse process is a marked point
process (Cox & Isham 1980). The model developed by Cowpertwait et al. (2007)
allows pulse depths from a single cell to be dependent, but those from distinct
cells are assumed independent. No specific dependence structure is specified, and
the model fitted in the paper assumed independent, exponentially distributed pulse
depths, with mean depth µX .
The fitted model also assumed two superposed processes, with a common depth parameter
across the two storm types, giving a total of eleven parameters.
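The bullet points above translate fairly directly into a simulation of a single (not superposed) BLIP process with independent, exponentially distributed pulse depths. The following is a sketch under those assumptions, with time in hours; the function names and parameter values are hypothetical.

```python
import random

def simulate_blip(lam, beta, gamma, eta, xi, mu_x, t_end, seed=None):
    """Return a time-ordered list of (pulse_time, depth) pairs."""
    rng = random.Random(seed)
    pulses = []
    storm = 0.0
    while True:
        storm += rng.expovariate(lam)              # storm origins at rate lam
        if storm >= t_end:
            break
        storm_end = storm + rng.expovariate(gamma)
        cell = storm + rng.expovariate(beta)       # no cell at the storm origin
        while cell < storm_end:
            # Pulses stop at the cell or storm lifetime, whichever is sooner.
            cell_end = min(cell + rng.expovariate(eta), storm_end)
            pulse = cell + rng.expovariate(xi)     # no pulse at the cell origin
            while pulse < cell_end:
                pulses.append((pulse, rng.expovariate(1.0 / mu_x)))
                pulse += rng.expovariate(xi)
            cell += rng.expovariate(beta)
    pulses.sort()
    return pulses

def aggregate_pulses(pulses, t_end, h):
    """Sum instantaneous pulse depths into intervals of length h on [0, t_end)."""
    n = int(t_end / h)
    depths = [0.0] * n
    for t, x in pulses:
        i = int(t // h)
        if i < n:
            depths[i] += x
    return depths

# Hypothetical parameter values (rates per hour, mu_x in mm).
pulses = simulate_blip(lam=0.05, beta=3.0, gamma=0.4, eta=5.0, xi=60.0,
                       mu_x=0.02, t_end=2000.0, seed=7)
hourly = aggregate_pulses(pulses, t_end=2000.0, h=1.0)
print(len(pulses), "pulses;", sum(hourly), "mm in total")
```

Since a storm may contain no cells and a cell may contain no pulses, some simulated storms contribute no rainfall at all, as noted in the model definition.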
2.2 Developing a Random η Version of the Bartlett-Lewis Instantaneous Pulse model
For the randomisation of η in the BLIP model, we take the same approach as for the
original Bartlett-Lewis model, but now with the additional assumption that the ratio of
the pulse arrival rate to the cell duration parameter (ι = ξ/η) is kept constant.
In order to calculate the moments, it is helpful to think of the random η model as the
superposition of a continuum of independent processes with random cell duration parameter, η, and storm origin rate, λf (η), where f (η) is the density function of η. Now, the
rth cumulant of a sum of independent random variables is the sum of their rth cumulants. Therefore the mean, variance and 3rd central moment (which are the first three
cumulants) can simply be obtained by replacing λ with λ f (η) in their original equations,
and integrating over possible values of η.
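The substitution can be written out explicitly. The notation here is introduced purely for illustration and does not appear in the original equations: κ_r(λ, η) denotes the rth cumulant of the aggregated rainfall for the fixed-η model (with β, γ and ξ expressed through their fixed ratios to η), and κ_r^R denotes the corresponding cumulant for the random η model.

```latex
\kappa_r^{R}
  = \int_0^\infty \kappa_r\bigl(\lambda f(\eta),\, \eta\bigr)\, d\eta
  = \int_0^\infty \kappa_r(\lambda, \eta)\, f(\eta)\, d\eta
  = E_\eta\bigl[\kappa_r(\lambda, \eta)\bigr],
  \qquad r = 1, 2, 3 .
```

The second equality relies on each fixed-η cumulant being proportional to the storm arrival rate λ, which is the case when storms arrive in a Poisson process and are independent of one another. The random η moments are therefore expectations over η of the fixed-η expressions.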
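The integration approach described requires some expectations of functions of η. In
particular, we will need to use $E_\eta\left[(1/\eta)^k e^{-\eta x}\right]$ for k = 1 and various values of x, given
by:
\[
E_\eta\left[\left(\frac{1}{\eta}\right)^{k} e^{-\eta x}\right]
  = \frac{\nu^{\alpha}}{\Gamma[\alpha]} \int_0^\infty \eta^{\alpha-1-k}\, e^{-(\nu+x)\eta}\, d\eta
  = \frac{\nu^{\alpha}}{\Gamma[\alpha]} \times \frac{\Gamma[\alpha-k]}{(\nu+x)^{\alpha-k}}
\]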
Note that, in order for the integral not to diverge at zero, we require α > k. This proved to
be an issue for the original Bartlett-Lewis model, as discussed above, where the skewness
integral included elements with k = 4. For the Bartlett-Lewis Instantaneous Pulse model,
we only need k = 1, so that we require α > 1, which does not significantly prejudice the
fit, and a “truncated” version is thus not required.
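The closed form above is easy to check numerically. The sketch below compares it with a simple midpoint-rule approximation of the integral; the function names, the integration limit and the parameter values are illustrative assumptions only.

```python
import math

def closed_form(alpha, nu, k, x):
    """nu^alpha / Gamma(alpha) * Gamma(alpha - k) / (nu + x)^(alpha - k)."""
    if alpha <= k:
        raise ValueError("the integral diverges at zero unless alpha > k")
    return (nu ** alpha / math.gamma(alpha)
            * math.gamma(alpha - k) / (nu + x) ** (alpha - k))

def midpoint_integral(alpha, nu, k, x, upper=60.0, n=200_000):
    """Approximate the same expectation by a midpoint rule on (0, upper)."""
    step = upper / n
    total = 0.0
    for i in range(n):
        eta = (i + 0.5) * step
        total += eta ** (alpha - 1 - k) * math.exp(-(nu + x) * eta)
    return nu ** alpha / math.gamma(alpha) * total * step

# Hypothetical values: alpha = 3, nu = 2, k = 1, x = 0.5.
print(closed_form(3.0, 2.0, 1, 0.5))
print(midpoint_integral(3.0, 2.0, 1, 0.5))
```

With k = 1 the only requirement is α > 1, which is the constraint imposed on α in the fitting.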
The moments are derived from the original equations of Cowpertwait et al. (2007), by
taking expectations over η and using the formula above, as discussed. All the moments
can be expressed exactly, which is an advantage for this type of model where numerical
approximations can lead to slow computational speeds. The moments for the new model
are given in the Appendix.
3 Fitting the Models
The generalised method of moments (GMM) is used for fitting. This is an extension of the
method of moments which estimates parameters by equating expressions for population
moments with their sample values. In the GMM, the number of properties that we want
to fit to exceeds the number of unknown parameters, and our estimator is given by the
value of θ that minimises:
\[
S(\theta \mid T) = (T - \tau(\theta))'\, W\, (T - \tau(\theta))
\]
for some positive definite weighting matrix W , where θ is the unknown parameter vector,
T is the vector of observed values for a set of k properties, and τ (θ) is the vector of their
expected values under the model. S is referred to as the “objective function”. Here,
we take W to be a diagonal matrix, so that the objective function becomes
\[
S(\theta \mid T) = \sum_{i=1}^{k} w_i\, [T_i(y) - \tau_i(\theta)]^2 ,
\]
with the $w_i$ equal to $1/\mathrm{Var}(T_i(y))$. This is a slight simplification
of the theoretically optimal approach (in terms of the identifiability of parameters) of
Hansen (1982), where W is the inverse of the covariance matrix of statistics.
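The diagonal-weight objective function can be sketched in a few lines. To keep the example self-contained, the model moments used here are those of a toy compound Poisson process with exponential marks (for which the rth cumulant over a unit interval is the rate times r! times the rth power of the mean mark); this is a stand-in chosen for illustration, not the BLIP moment expressions of the Appendix, and all names and values are hypothetical.

```python
def gmm_objective(theta, observed, variances, model_moments):
    """S(theta | T) = sum_i w_i * (T_i - tau_i(theta))**2 with w_i = 1 / Var(T_i)."""
    expected = model_moments(theta)
    return sum((t - tau) ** 2 / v
               for t, tau, v in zip(observed, expected, variances))

def toy_moments(theta):
    """Mean, variance and third central moment of a toy compound Poisson sum."""
    rate, mean_depth = theta
    return [rate * mean_depth,
            2.0 * rate * mean_depth ** 2,
            6.0 * rate * mean_depth ** 3]

observed = [1.0, 1.0, 1.5]        # hypothetical sample statistics T
variances = [0.01, 0.04, 0.25]    # hypothetical Var(T_i) across years

print(gmm_objective((2.0, 0.5), observed, variances, toy_moments))
print(gmm_objective((2.0, 1.0), observed, variances, toy_moments))
```

With three properties and two parameters, the objective function is zero only if all three properties can be matched simultaneously, which mirrors the over-identified setting described in the text.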
Note that, since the number of properties included in S exceeds the number of parameters,
there is no guarantee that there will be a good fit to all the fitting properties. The
adequacy of the fit is thus assessed by considering properties used in the fitting procedure,
as well as others that are of interest in hydrological applications. Some properties will
need to be assessed using simulations, for example, extreme values.
We follow Cowpertwait et al. (2007) in our choice of fitting properties - the hourly mean,
plus the coefficient of variation, lag-1 correlation and skewness at timescales of 5 minutes,
1 hour, 6 hours and 24 hours.
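The sample versions of these thirteen fitting properties can be computed from a five-minute series as sketched below. The estimator conventions (dividing by n rather than n − 1, and computing over a single series rather than averaging over years) are assumptions made for brevity, and the function names are hypothetical.

```python
import random

# Number of 5-minute intervals in each aggregation timescale.
FACTORS = {"5 min": 1, "1 hour": 12, "6 hours": 72, "24 hours": 288}

def aggregate_series(depths, factor):
    """Sum consecutive blocks of `factor` values, dropping any incomplete block."""
    n = len(depths) // factor
    return [sum(depths[i * factor:(i + 1) * factor]) for i in range(n)]

def summary_stats(series):
    """Mean, coefficient of variation, lag-1 autocorrelation and skewness."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    var = sum(d * d for d in dev) / n
    sd = var ** 0.5
    lag1 = sum(dev[i] * dev[i + 1] for i in range(n - 1)) / (n * var)
    skewness = (sum(d ** 3 for d in dev) / n) / sd ** 3
    return {"mean": mean, "cv": sd / mean, "lag1": lag1, "skewness": skewness}

def fitting_properties(depths_5min):
    """Hourly mean plus cv, lag-1 correlation and skewness at four timescales."""
    props = {"mean (1 hour)": summary_stats(aggregate_series(depths_5min, 12))["mean"]}
    for label, factor in FACTORS.items():
        stats = summary_stats(aggregate_series(depths_5min, factor))
        for name in ("cv", "lag1", "skewness"):
            props[f"{name} ({label})"] = stats[name]
    return props

# Synthetic, mostly dry 5-minute series of ten days, for illustration only.
rng = random.Random(3)
series = [rng.expovariate(5.0) if rng.random() < 0.1 else 0.0 for _ in range(2880)]
for name, value in fitting_properties(series).items():
    print(name, value)
```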
Minimisation of S requires a numerical optimisation routine. The approach followed here
to fit the Bochum data, is that of Wheater et al. (2005), and we have used the optimisation
routines developed for that project. Firstly, a set number of optimisations are carried out
using the Nelder-Mead method, each starting with a different initial value for the set of
parameters. This set of initial values is generated by random perturbation about a single
user-supplied value. The best parameter set is then used as a new starting value for a
further set of optimisations, which now use a Newton-type algorithm. The reason for the
use of two different optimisation routines is that the first is more robust and thus well
suited to identifying promising regions of the parameter space, whereas the second is more
powerful if given good starting values.
We used the method outlined by Wheater et al. (2005), which is based on the theory
of estimating equations, to estimate standard errors. However, we found that numerical
instabilities in the calculation of the standard errors could give very different answers
for different iterations, even when broadly the same solution for the parameter set was
found, and for some of the more complex models, standard errors could not be found
at all (due to singularity of the Hessian matrix, required in the calculations). In terms
of assessing parameter uncertainty, we therefore preferred the alternative approach suggested by Wheater et al. (2005), which is the examination of profile objective functions.
Each parameter in turn is fixed at each of a set of values, and the objective function is
optimised over the remaining parameters. The resulting plot for each parameter showing
the optimised objective function against the set of parameter values provides a useful
means for assessing the identifiability of the parameter - for example, a very flat objective
function indicates a wide range of plausible values. Approximate 95% confidence intervals
can also be calculated using the objective function itself (although here again there may
be problems with numerical instabilities). We will use this approach to give an idea of
parameter uncertainty for our new model.
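The idea of a profile objective function can be illustrated with two toy two-parameter objectives, one well identified and one that depends on its parameters only through their product. The sketch uses a golden-section search for the inner one-dimensional optimisation purely to stay self-contained; it is not the optimisation routine used for the fits reported here, and all names are hypothetical.

```python
def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimise a unimodal function f on [lo, hi]; return (argmin, minimum)."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d       # the minimum lies in [a, d]
        else:
            a = c       # the minimum lies in [c, b]
    x = (a + b) / 2
    return x, f(x)

def profile(objective, fixed_values, lo, hi):
    """Fix the first parameter at each value and optimise over the second."""
    return [(v, golden_section_min(lambda free: objective(v, free), lo, hi)[1])
            for v in fixed_values]

def identified(a, b):
    # Both parameters are pinned down separately.
    return (a - 1.0) ** 2 + (b - 2.0) ** 2

def product_only(a, b):
    # Only the product a * b is identified, so the profile in a is flat.
    return (a * b - 2.0) ** 2

grid = [0.5, 1.0, 1.5, 2.0]
print(profile(identified, grid, 0.01, 100.0))
print(profile(product_only, grid, 0.01, 100.0))
```

A flat profile of the second kind is what one would expect to see for a parameter that enters the fitted properties only through a composite term.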
4 Comparison of Models on Bochum Data
4.1 Models Fitted
The models were fitted, using the methodology and fitting properties discussed, to 69 years
of five-minute rainfall data from a single site in Bochum in Germany. The measurements
were obtained using a Hellmann rain gauge, in which rain displaces a float and a marking
pen attached to the float makes a continuous trace on a recording chart. A separate fit
was produced for each month, to allow for seasonality. In each case, we assume that
σX /µX = 1, and that the skewness coefficient of X is 2 (effectively X is exponentially
distributed). For the Instantaneous Pulse models, initially we also assume that all pulse
depths are independent, and, for the two storm type version, we follow Cowpertwait et al.
(2007) in assuming a common mean depth for both types. Initially, no constraints were
imposed on the parameters other than that they should be greater than zero. The six
models initially fitted were:
Rectangular Pulse Models
1. the Bartlett-Lewis Rectangular Pulse model (BLRP)
2. the Bartlett-Lewis Truncated random η Model (BLRPR)
3. the Bartlett-Lewis Rectangular Pulse model with two superposed processes (BLRP2);
Instantaneous Pulse Models
1. the Bartlett-Lewis Instantaneous Pulse Model (BLIP)
2. the Bartlett-Lewis Instantaneous Pulse Random η model, developed in Section 2.2
(BLIPR)
3. the Bartlett-Lewis Instantaneous Pulse model with two superposed processes (BLIP2)
For the Bartlett-Lewis Rectangular Pulse model, on randomising η, the fitted solution
gave such a high precision to the mean cell duration, that it effectively replicated the
non-random solution. Thus, the fitted parameter set for the BLRPR model is simply a
re-parameterised version of the set of BLRP parameters, and there is thus no improvement
in the fit compared with the fixed η version. This appears to contradict examples in the
literature where the randomised η version had shown an improved fit compared to the fixed
η model (Rodriguez-Iturbe et al. 1988, Wheater et al. 2005). On further investigation,
we concluded that the improvement in the fit to proportion dry that had previously been
found by randomising η was at the expense of a deterioration in the fit to the skewness,
which had not been included as a fitting property in these earlier analyses.
Fitting the models with two superposed processes proved problematic. Although the
BLRP2 model with no parameter constraints gave a very good fit in terms of a low minimum objective function value, the parameters thus obtained were highly unstable, unrealistic and inconsistent from month to month, and no standard errors could be found. It
was clear that there was insufficient information in our observed data to identify the large
number of required parameters. Introducing constraints for the parameters increased the
minimum objective function values, and did not resolve the situation, with resulting solutions having many parameters lying on the constraint boundaries. We therefore concluded
that ensuring realistic and reasonably smooth parameters across months would require
constraints on the relationships between parameters, rather than just setting bounds on
individual parameters. Although it has two more parameters than the BLRP2 model, the
Bartlett-Lewis Instantaneous Pulse model with two superposed processes (BLIP2) proved
slightly less problematic. For this model, with minimal constraints, we found parameters
for most months which were within realistic bounds and which gave a very low minimum
objective function value. However, here also we found solutions quite unstable, with issues
of parameter identifiability, particularly in the summer months, and again no standard
errors could be found. We came to the conclusion that both of these models’ parameter
identifiability issues made them unsuitable for practical application.
Given the above findings, we present results here for the following three models only:
BLRP, BLIP, BLIPR.
For the Bartlett-Lewis Instantaneous Pulse Random η model (BLIPR), the unconstrained
solution gave an extremely high number of pulses per hour, so for practical reasons, we
constrained µX to be above 0.001. This resulted in the fitted µX being at the constraint
level for all months (effectively reducing the number of parameters by one), with all other
parameters broadly as before, except for a corresponding change in ι. The quality of the
fit was unchanged with this constraint, as the product term µX ι effectively forms a single
composite parameter over most of the possible parameter space, as we will see in Section
6 from the profile objective functions. We also constrained α to be above 1, as discussed
in Section 2.2.
In the next section we will compare these three models, firstly in terms of the moments
and the minimum objective function value, and then by considering wet/dry properties,
which were not included within the objective function.
4.2 Performance Comparison of the Fitted Models
4.2.1 Moments
Plots of the fits of the models (BLRP, BLIP, BLIPR) against the observed data for each
month in respect of the mean, variance, lag-1 correlation and skewness coefficient are
shown in Figures 1-4.
All the models generally perform well with respect to the properties included in the fitting.
They reproduce the mean exactly (this is not a given, since the number of properties fitted
exceeds the number of parameters), and fit the variance well at all timescales. All tend
to underestimate the lag-1 auto-correlation at longer timescales, and the skewness at the
shorter ones.
It is interesting that the BLRP model generally outperforms the BLIP model, with a lower
minimum objective function value in all months except January and December. The model
with rectangular pulses has generally been considered unsuitable for timescales shorter
than the mean cell duration, due to the unrealistic intensity shape. However, when fine-scale data is available for fitting, the fitted model tends to have shorter, more frequent
cells than if only hourly data is available (of the order of 5-10 minutes, compared with
20-40 minutes for most months), which are arguably more realistic, and which broadly
resolve the problem.
The best fit, however, is achieved by the new BLIPR model, and this has a lower minimum
objective function value for all months than the BLRP or BLIP models.
4.2.2 Wet/dry properties
The proportion of dry intervals is a very important property for hydrological applications.
Although this could have been included as one of the fitting properties, it is useful to
reserve an important feature for subsequent model validation, as this gives an independent
test of the appropriateness of the model structure. Plots of the fits of the models against
the observed data for each month in respect of the proportion dry are shown in Figure
5. The BLIPR model can be seen to outperform the other models (including the BLIP2
model) strongly with respect to the fit to proportion dry, across all timescales.
It is also of interest to consider the wet and dry spell transition probabilities (i.e. the
probability that a wet interval is followed by another wet interval, or a dry by another
dry), which are important for the accurate modelling of antecedent conditions. Figure 6
shows that the BLIPR model again outperforms the other models with respect to the wet
spell transition probability. While the BLRP model has a good fit at the hourly timescale,
and the BLIP2 model at five minutes, these both perform poorly at other timescales, with
only the BLIPR model showing consistency of performance across timescales. There is
less difference between models for the dry spell transition probabilities, with all models
providing a reasonable fit at all timescales.
Based on the properties examined so far, the BLIPR model gives the best performance.
Finally, in the next Section, we consider the fit of this model to extreme values, and
include one further minor modification to the structure of pulses within cells to improve
this aspect.
5 Extreme Value Performance
In the derivation of moments for the BLIP model (Cowpertwait et al. 2007), pulse depths
for pulses within the same cell were allowed to be dependent, although the empirical fits
assumed independence, as have the fits we have carried out so far. Intuitively, dependent
pulse depths should allow higher values of extremes at short timescales, which is desirable
since we are currently understating five-minute skewness. We suppose here the most
extreme form of dependence in which pulse depths within the same cell have a common
depth, with depths in different cells still allowed to vary (denoted the BLIPRd model).
This was found to give a lower minimum objective function value than the independent
pulse version. The fit to five-minute skewness was much improved, albeit with a slight
deterioration in the variance at the 24-hour timescale.
Table 1 shows the minimum objective function value for each of the models that we have
successfully fitted, for each month. Since the same set of moments and weights was used
for each model, these are directly comparable.
Month   BLRP   BLIP   BLIPR   BLIPRd
Jan       83     67      45       40
Feb       38     56      30       24
Mar      100    113      58       48
Apr      110    168      85       66
May      141    239      93       76
Jun      152    275      92       72
Jul      162    345     110       95
Aug      140    268      86       76
Sep      149    271      87       65
Oct       92    150      71       50
Nov       68     76      30       25
Dec       68     67      32       28
Table 1: Comparison of minimum objective function value
For our data, the months with the highest rainfall, rainfall variability and skewness are the
summer months, and these are also the months with the highest extremes. A comparison
of the fit of extremes for July for the BLIPR model is given in Figure 7, using Gumbel
plots. These compare the observed annual maxima (for the month of July) against fifteen
simulations, where each simulation is of the same length as our observed data. The
maximum rainfall per unit-time is plotted against the “reduced-variate” − ln(− ln(1 −
1/R)) where R is the return period i.e. the average time period within which rainfall of
the specified magnitude can be expected to occur once. The graphs for July show that
the model has a tendency slightly to underestimate extremes, as has been noted before
for this type of model. Results for other months give a fairly similar picture.
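The coordinates of a Gumbel plot can be computed as sketched below. The reduced variate follows the expression given above; the plotting position i/(n + 1) used to assign a non-exceedance probability to the ith smallest annual maximum is an assumption made for this illustration, since the plotting position used for Figure 7 is not stated here. The function names and the sample values are hypothetical.

```python
import math

def reduced_variate(return_period):
    """Gumbel reduced variate -ln(-ln(1 - 1/R)) for a return period R > 1."""
    if return_period <= 1.0:
        raise ValueError("return period must exceed 1")
    return -math.log(-math.log(1.0 - 1.0 / return_period))

def gumbel_plot_points(annual_maxima):
    """Return (reduced variate, maximum) pairs, ordered from smallest to largest."""
    ordered = sorted(annual_maxima)
    n = len(ordered)
    points = []
    for rank, value in enumerate(ordered, start=1):
        p = rank / (n + 1)                  # assumed plotting position
        return_period = 1.0 / (1.0 - p)     # R such that 1 - 1/R = p
        points.append((reduced_variate(return_period), value))
    return points

# Hypothetical annual maxima of 5-minute rainfall (mm) for one calendar month.
maxima = [3.1, 5.4, 2.2, 7.9, 4.0, 6.3, 2.8, 9.5, 3.6]
for variate, value in gumbel_plot_points(maxima):
    print(round(variate, 3), value)
```

Applying the same function to the observed maxima and to each simulated series gives points that can be overlaid on a common set of axes.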
A comparison showing mean annual extremes (averaged over fifteen simulations) for a
number of alternative models at the five-minute and hourly timescales is also shown in
Figure 8. At the five-minute timescale, the BLIPRd model gives the best performance,
although all the models underestimate the extremes. Results are closer at the one-hour
timescale, and for longer timescales, there is essentially no difference between models.
Note that, although the simulated extremes under-estimate the observed values for all
timescales, this is partly due here to sampling variation. We have fitted to the mean
observed properties, including the skewness coefficient, averaging over each of the 69
years of data. For our data, this gave a lower observed skewness coefficient than we would
have obtained by calculating over all 69 years, although in practice the difference could
go either way. The latter would have given us a slightly better fit to our extremes, but
does not permit the calculation of the covariance matrix for the observed statistics, since
it is just a single sample.
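The distinction between averaging a statistic over years and computing it once over the pooled record can be illustrated with a small sketch. The data below are hypothetical toy values, and the moment-based skewness coefficient is an assumed definition; in this particular toy case the pooled value is the larger one, but as noted above the difference can go either way.

```python
def skewness(values):
    """Moment-based skewness coefficient m3 / m2**1.5 (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def yearly_mean_skewness(series_by_year):
    """Average of the skewness coefficients computed separately for each year."""
    per_year = [skewness(values) for values in series_by_year.values()]
    return sum(per_year) / len(per_year)


def pooled_skewness(series_by_year):
    """Skewness coefficient computed once over all years combined."""
    pooled = [v for values in series_by_year.values() for v in values]
    return skewness(pooled)


# Hypothetical toy data: two "years" with the same shape but different scale.
toy = {"year_a": [0.0, 0.0, 0.0, 1.0], "year_b": [0.0, 0.0, 0.0, 5.0]}
print(yearly_mean_skewness(toy), pooled_skewness(toy))
```

Only the year-by-year version yields a sample of statistics (one per year), which is what makes an empirical covariance matrix of the fitted properties available.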
Based on our analysis, the BLIPRd is shown to be the best performing of the models
compared, both in terms of the moments fitted, and more importantly, in respect of the
wet/dry properties and extreme values not included in the fit.
The fitted parameter set for the BLIPRd model is given in Table 2.
It is interesting to consider the parameters in terms of their physical realism, and to
consider also the intuition behind our results. Comparing with empirical observations
from Houze & Hobbs (1982), the parameter values seem reasonable. Winter storms last
several hours and have around 20 cells, each lasting around 22 minutes on average. In summer,
storms have a similar mean duration, but only around 8 cells. However, these have a
correspondingly much higher pulse rate, giving broadly the same amount of rainfall per
storm over all months. In terms of the intuition behind our results, we conclude that
it is not the replacement of rectangular pulses by instantaneous ones that leads to the
improved performance of the BLIPR model, compared with the BLRPR model. This is
clear from the fact that the better-performing version of the instantaneous pulse model is
the one where all pulses have the same depth. With its very short pulse inter-arrival times,
the model thus effectively simply replicates rectangular pulses. Instead, we attribute the
improved performance of the BLIPR and BLIPRd models to the fact that, unlike the
BLRPR model, these allow rainfall intensity to vary with cell duration, since the pulse
rate effectively drives the intensity and is proportional to the cell duration parameter, η.
Our model thus gives a simple, but effective way of introducing dependence between cell
duration and intensity.

Month   λ        µX       α        α/ν       κ        φ        ι
Jan     0.0236   0.0010   2.1468    4.5905   1.0273   0.0458   173
Feb     0.0235   0.0010   3.6795    4.3936   1.0958   0.0582   187
Mar     0.0227   0.0010   1.9210    5.5057   0.7161   0.0436   203
Apr     0.0240   0.0010   1.9902    6.7368   0.5176   0.0387   248
May     0.0274   0.0010   1.5185    7.7972   0.5094   0.0604   393
Jun     0.0321   0.0010   1.2407   10.1391   0.4679   0.0675   530
Jul     0.0308   0.0010   1.6347   10.3442   0.1883   0.0452   899
Aug     0.0298   0.0010   1.2842   11.1444   0.5649   0.0796   521
Sep     0.0256   0.0010   1.3859    8.9390   0.4163   0.0540   505
Oct     0.0206   0.0010   2.1263    6.6978   0.5801   0.0406   286
Nov     0.0251   0.0010   1.9307    5.3808   1.0629   0.0491   181
Dec     0.0264   0.0010   2.0346    4.5838   1.0926   0.0544   188
Table 2: Parameters for the Bartlett-Lewis Pulse random η model, with common pulse depths
within the same cell
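The physical quantities quoted above can be recovered approximately from the Table 2 parameters. The sketch below rests on several assumptions that are not stated in this section: time is measured in hours; cells arrive at rate κη over a storm of mean duration 1/(φη), giving κ/φ cells per storm; and cells are cut off when their storm ends, giving a mean cell duration of E(1/η)/(1 + φ) with E(1/η) = ν/(α − 1). These conventions were chosen because they appear consistent with the leading constants in Appendix A; a different convention (for example 1 + κ/φ cells per storm) would shift the numbers slightly.

```python
def derived_storm_properties(alpha, alpha_over_nu, kappa, phi):
    """Rough physical summaries of one month's fitted parameters.

    Assumptions (not stated in the text): time unit is hours, kappa/phi cells
    per storm, and cells truncated at the end of their storm.
    """
    nu = alpha / alpha_over_nu
    mean_inv_eta = nu / (alpha - 1.0)  # E[1/eta]; requires alpha > 1
    return {
        "cells_per_storm": kappa / phi,
        "storm_duration_hours": mean_inv_eta / phi,
        "cell_duration_minutes": 60.0 * mean_inv_eta / (1.0 + phi),
    }


# Parameter values taken from the January and July rows of Table 2.
january = derived_storm_properties(alpha=2.1468, alpha_over_nu=4.5905,
                                   kappa=1.0273, phi=0.0458)
july = derived_storm_properties(alpha=1.6347, alpha_over_nu=10.3442,
                                kappa=0.1883, phi=0.0452)
for name, summary in (("January", january), ("July", july)):
    print(name, {key: round(value, 2) for key, value in summary.items()})
```

Under these assumptions the January row gives storms of several hours with roughly twenty cells of a little over twenty minutes each, in line with the description in the text.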
6
Parameter Identifiability and Confidence Intervals
for the Bartlett-Lewis Random η Pulse model
Finally, we explore the parameter identifiability of the new model using profile objective
functions as described in Section 3. For these, we fitted the model to the natural log
of parameters, which gave the same fitted solution as we had before, but with greater
stability, so that we could derive a Hessian and approximate confidence intervals for all
months. 95% intervals are given for illustration for the month of January in Table 3.
Profile objective functions for the log of all parameters, again for the month of January,
are shown in Figure 9. These illustrate the large range over which the profile objective
functions for µX and ι are flat, as discussed in Section 4. Over this range, the product
of these two parameters effectively constitutes a single parameter, such that an increase
in one of them can be directly compensated for by a corresponding decrease in the other.
Once ι gets too small, however, terms in its reciprocal in the skewness equation start to be
significant and the relationship changes. The spikes in some of the plots are an indication
of numerical difficulties in the fitting. Re-running the plots would tend not to replicate
these, but might produce other spikes at different locations.
Parameter   95% interval
λ           (0.021, 0.027)
µX          (NA, 0.0030)
α           (1.611, 2.970)
α/ν         (3.642, 5.671)
κ           (0.842, 1.233)
φ           (0.037, 0.057)
ι           (54.206, NA)
Table 3: Approximate confidence intervals for January’s parameter estimates for the Bartlett-Lewis Pulse random η model, with common pulse depths within the same cell
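The flat profile for µX and ι, and the one-sided intervals in Table 3, can be mimicked with a toy example. The objective below is hypothetical and is not the paper's fitting criterion: it depends mainly on the product µX·ι, plus a term in 1/ι that only matters when ι is small. The weight 10.0 and the search bounds are arbitrary illustrative choices, and a simple golden-section search stands in for whatever optimiser was actually used.

```python
import math


def toy_objective(log_mu, log_iota):
    """Hypothetical objective: misfit in log(mu * iota) plus a 1/iota term."""
    product_misfit = (log_mu + log_iota) ** 2           # target product is 1
    reciprocal_term = (10.0 * math.exp(-log_iota)) ** 2  # negligible for large iota
    return product_misfit + reciprocal_term


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimise a unimodal one-dimensional function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def profile_over_log_mu(log_mu_grid, iota_bounds=(-5.0, 15.0)):
    """For each fixed log(mu), minimise the objective over log(iota)."""
    profile = []
    for log_mu in log_mu_grid:
        def inner(log_iota, fixed=log_mu):
            return toy_objective(fixed, log_iota)
        best = golden_section_min(inner, *iota_bounds)
        profile.append((log_mu, inner(best)))
    return profile


for log_mu, value in profile_over_log_mu([-10.0, -6.0, -3.0, 0.0]):
    print(f"log(mu) = {log_mu:6.1f}   profile objective = {value:.6f}")
```

The profile is essentially zero over the whole range of small µ (large ι), where only the product is identified, and rises once µ is large enough that the compensating ι becomes small.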
7
Discussion and Conclusions
In this paper, an extension to the Bartlett-Lewis Pulse model has been developed, the
BLIP Random η model, which allows the cell duration parameter to vary randomly between storms, following the approach of the original Random η Bartlett-Lewis Rectangular Pulse model. A version of the model in which all pulses within the same cell have
the same depth is found to be an improvement on the assumption of independent pulse
depths. The new model is found to perform well at all timescales, with a marked improvement compared with the fixed η version of the fit to skewness at short timescales
and to proportion dry at all timescales. The model also outperforms the original Random
η Bartlett-Lewis model, which is found to give the same solution as the fixed η version
if skewness is included as one of the fitting properties. The fit to extremes at very short
timescales, although better than for the BLRP and BLIP models, remains a potential
area for improvement, but for most months, the fit at timescales of one hour or more is
satisfactory. It is possible that an alternative distribution for the pulse intensities might
improve the fit to extremes further. This was not investigated in depth here, other than
replacing the Exponential distribution with the more flexible Gamma, which had no positive impact on the minimum objective function value nor on the fit to extremes. In fact,
the fitted parameter σX/µX, allowed to vary from its previous constrained value of 1, was
close to zero for most months, such that the pulse depths were effectively fixed at their
mean value. A longer-tailed distribution, such as the Pareto or Weibull, might be more
effective, but this was not pursued here.
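To see what a fitted ratio σX/µX near zero implies, the raw moments of a Gamma-distributed depth can be written in terms of its mean and coefficient of variation. This is a minimal sketch using standard Gamma results; the function name is hypothetical and the mean of 0.001 is simply the rounded value reported in Table 2.

```python
def gamma_raw_moments(mean, cv):
    """E(X), E(X^2), E(X^3) for a Gamma depth with given mean and cv = sigma/mean.

    With shape k = 1/cv^2 and scale mean*cv^2, the raw moments reduce to the
    expressions below; cv = 1 is the Exponential case, cv = 0 a fixed depth.
    """
    second = mean ** 2 * (1.0 + cv ** 2)
    third = mean ** 3 * (1.0 + cv ** 2) * (1.0 + 2.0 * cv ** 2)
    return mean, second, third


mu = 0.001
print(gamma_raw_moments(mu, 1.0))  # Exponential: (mu, 2 mu^2, 6 mu^3)
print(gamma_raw_moments(mu, 0.0))  # degenerate:  (mu, mu^2, mu^3)
```

Moving from a ratio of 1 to a ratio near 0 therefore removes the depth distribution's own contribution to the second and third moments, leaving the clustering structure to account for the variability and skewness.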
The BLIPRd model has seven parameters, effectively reduced to six, since for most of the
parameter space the product of the mean depth, µX, and ι, the ratio of the mean cell
duration to the mean pulse inter-arrival time, constitutes a single parameter. The model
is therefore far more stable than the alternative “two superposed processes” version, which
also aims to allow for different storm types. Even so, our profile objective function plots
show that issues of parameter identifiability remain, and constraints may be desirable to
ensure that parameters are physically realistic.
The BLIPRd model is therefore our preferred model for practical application, improving
on the fit of the commonly used BLRPR model, with no greater complexity. Although
not pursued further here, it would be interesting also to consider linking the cell intensity
random variable X in the BLRPR model to the cell duration parameter η, by assuming
that the mean intensity µX varies in proportion to η. This is expected to give similar
results.
8
Acknowledgements
Deutsche Montan Technologie and Emschergenossenschaft/Lippeverband in Germany are
gratefully acknowledged for providing the data. I would also like to thank Valerie Isham,
Christian Onof, Richard Chandler and Joao Jesus for helpful advice.
Appendices
A
Moments for the Bartlett-Lewis Instantaneous Pulse
Random η model
Parameter definitions
• λ - storm arrival rate
• α - shape parameter for the Gamma distribution of the cell duration parameter, η
• ν - scale parameter for the Gamma distribution of η
• κ - ratio of the cell arrival rate to η (i.e. β/η)
• φ - ratio of the storm duration parameter to η (i.e. γ/η)
• ι - ratio of the pulse arrival rate to η (i.e. ξ/η)
• µX - mean cell intensity
• E(Xijk Xijl ) - product moment of the depths of 2 pulses within the same cell
• E(Xijk Xijl Xijm ) - product moment of the depths of 3 pulses within the same cell
A.1
Mean
\[ E[Y_i^h] = \lambda \mu_p \mu_X h \]
A.2
Variance
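\[
\begin{aligned}
\mathrm{Var}[Y_i^h]
&= \lambda\mu_p \Bigg\{ E(X^2)h
 + \frac{2\mu_X^2\kappa\iota}{\phi^2}\,
   E_\eta\!\left( \frac{1}{\eta}e^{-\phi\eta h} - \frac{1}{\eta} + \phi h \right) \\
&\qquad + \frac{2\iota}{(\phi+1)^2}
   \left[ E(X_{ijk}X_{ijl}) - \mu_X^2\kappa\,\frac{\phi}{\phi+2} \right]
   E_\eta\!\left( \frac{1}{\eta}e^{-(\phi+1)\eta h} - \frac{1}{\eta} + (\phi+1)h \right) \Bigg\} \\
&= \lambda\mu_p \Bigg\{ E(X^2)h
 + \frac{2\mu_X^2\kappa\iota}{\phi^2}
   \left( \frac{\nu^{\alpha}}{(\alpha-1)(\nu+\phi h)^{\alpha-1}} - \frac{\nu}{\alpha-1} + \phi h \right) \\
&\qquad + \frac{2\iota}{(\phi+1)^2}
   \left[ E(X_{ijk}X_{ijl}) - \mu_X^2\kappa\,\frac{\phi}{\phi+2} \right]
   \left( \frac{\nu^{\alpha}}{(\alpha-1)(\nu+(\phi+1)h)^{\alpha-1}} - \frac{\nu}{\alpha-1} + (\phi+1)h \right) \Bigg\}
\end{aligned}
\]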
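The passage from the expectation over η to the closed form in the variance rests on a standard Gamma integral. The short derivation below is a sketch that assumes η has density proportional to η^(α−1) e^(−νη) (so that ν enters as it does in the closed-form expression) and that α > 1; the constant c stands for φh or (φ + 1)h.

```latex
% Assumed density of \eta: f(\eta) = \nu^{\alpha}\eta^{\alpha-1}e^{-\nu\eta}/\Gamma(\alpha), \alpha > 1
\begin{aligned}
E_\eta\!\left[\frac{e^{-c\eta}}{\eta}\right]
 &= \frac{\nu^{\alpha}}{\Gamma(\alpha)}\int_0^\infty \eta^{\alpha-2}e^{-(\nu+c)\eta}\,d\eta
  = \frac{\nu^{\alpha}}{\Gamma(\alpha)}\cdot\frac{\Gamma(\alpha-1)}{(\nu+c)^{\alpha-1}}
  = \frac{\nu^{\alpha}}{(\alpha-1)(\nu+c)^{\alpha-1}}, \\
E_\eta\!\left[\frac{1}{\eta}\right]
 &= \frac{\nu}{\alpha-1} \qquad (\text{the case } c = 0).
\end{aligned}
```

The same identity, applied with c equal to multiples of φh and (φ + 1)h, gives the powers of ν/(ν + ·) that appear in the covariance and third-moment expressions.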
A.3
Covariance (k ≥ 1)
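\[
\begin{aligned}
\mathrm{Cov}(Y_i^h, Y_{i+k}^h)
&= \lambda\mu_p\iota \Bigg[ \frac{\mu_X^2\kappa}{\phi^2}\,
   E_\eta\!\left( \frac{e^{-\phi\eta(k-1)h} - 2e^{-\phi\eta kh} + e^{-\phi\eta(k+1)h}}{\eta} \right) \\
&\qquad + \left( E(X_{ijk}X_{ijl}) - \mu_X^2\kappa\,\frac{\phi}{\phi+2} \right)
   E_\eta\!\left( \frac{e^{-(\phi+1)\eta(k-1)h} - 2e^{-(\phi+1)\eta kh} + e^{-(\phi+1)\eta(k+1)h}}{(1+\phi)^2\,\eta} \right) \Bigg] \\
&= \lambda\mu_p\iota\,\frac{\nu}{\alpha-1} \Bigg[ \frac{\mu_X^2\kappa}{\phi^2}
   \left\{ \left(\frac{\nu}{\nu+\phi(k-1)h}\right)^{\alpha-1}
         - 2\left(\frac{\nu}{\nu+\phi kh}\right)^{\alpha-1}
         + \left(\frac{\nu}{\nu+\phi(k+1)h}\right)^{\alpha-1} \right\} \\
&\qquad + \frac{1}{(1+\phi)^2}\left( E(X_{ijk}X_{ijl}) - \mu_X^2\kappa\,\frac{\phi}{\phi+2} \right) \\
&\qquad\quad \times \left\{ \left(\frac{\nu}{\nu+(\phi+1)(k-1)h}\right)^{\alpha-1}
         - 2\left(\frac{\nu}{\nu+(\phi+1)kh}\right)^{\alpha-1}
         + \left(\frac{\nu}{\nu+(\phi+1)(k+1)h}\right)^{\alpha-1} \right\} \Bigg]
\end{aligned}
\]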
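As a sketch of how the variance and covariance expressions might be evaluated, and as a consistency check between them, the code below implements the closed forms written out from their E_η versions and verifies the general identity Var(Y^{2h}) = 2 Var(Y^h) + 2 Cov(Y_i^h, Y_{i+1}^h), which holds for any stationary aggregated series. Several inputs are assumptions: the time unit is taken to be hours, depths are taken to be Exponential with a common depth within each cell (so E(X_ijk X_ijl) = E(X²) = 2µX²), and µp is taken to be κι/(φ(1 + φ)), a value inferred from the E(X³) term of A.4 rather than stated in the parameter list. The function and dictionary key names are hypothetical.

```python
def inv_eta_laplace(c, alpha, nu):
    """E[exp(-c*eta)/eta] for eta with density ~ eta**(alpha-1) * exp(-nu*eta).

    Requires alpha > 1; c = 0 gives E[1/eta] = nu / (alpha - 1).
    """
    return nu ** alpha / ((alpha - 1.0) * (nu + c) ** (alpha - 1.0))


def variance(h, p):
    """Closed-form Var[Y_i^h] at aggregation level h."""
    def bracket(rate):
        return (inv_eta_laplace(rate * h, p["alpha"], p["nu"])
                - p["nu"] / (p["alpha"] - 1.0) + rate * h)

    phi, kappa, iota, mu_x = p["phi"], p["kappa"], p["iota"], p["mu_x"]
    between_cells = 2.0 * mu_x ** 2 * kappa * iota / phi ** 2 * bracket(phi)
    within_cell = (2.0 * iota / (phi + 1.0) ** 2
                   * (p["exx"] - mu_x ** 2 * kappa * phi / (phi + 2.0))
                   * bracket(phi + 1.0))
    return p["lam"] * p["mu_p"] * (p["ex2"] * h + between_cells + within_cell)


def covariance(h, k, p):
    """Closed-form Cov(Y_i^h, Y_{i+k}^h) for lag k >= 1."""
    def second_difference(rate):
        return sum(weight * inv_eta_laplace(rate * j * h, p["alpha"], p["nu"])
                   for weight, j in ((1.0, k - 1), (-2.0, k), (1.0, k + 1)))

    phi, kappa, mu_x = p["phi"], p["kappa"], p["mu_x"]
    first = mu_x ** 2 * kappa / phi ** 2 * second_difference(phi)
    second = ((p["exx"] - mu_x ** 2 * kappa * phi / (phi + 2.0))
              / (1.0 + phi) ** 2 * second_difference(phi + 1.0))
    return p["lam"] * p["mu_p"] * p["iota"] * (first + second)


# January row of Table 2; time unit assumed to be hours.
january = {"lam": 0.0236, "mu_x": 0.0010, "alpha": 2.1468, "kappa": 1.0273,
           "phi": 0.0458, "iota": 173.0}
january["nu"] = january["alpha"] / 4.5905       # Table 2 reports alpha/nu
january["ex2"] = 2.0 * january["mu_x"] ** 2     # assumption: Exponential depths
january["exx"] = january["ex2"]                 # common depth within a cell
# Assumption: mean number of pulses per storm, inferred from A.4.
january["mu_p"] = january["kappa"] * january["iota"] / (
    january["phi"] * (1.0 + january["phi"]))

h = 1.0
lhs = variance(2.0 * h, january)
rhs = 2.0 * variance(h, january) + 2.0 * covariance(h, 1, january)
print(lhs, rhs)
```

The two printed numbers should agree to rounding error; a mismatch would indicate a transcription error in one of the two formulas.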
A.4
3rd Central Moment
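\[
\begin{aligned}
E\big[(Y^h - E(Y^h))^3\big]
&= \lambda\kappa\iota^3 \Bigg\{
 \frac{6}{(1+\phi)^3}
 \left[ \frac{E(X_{ijk}X_{ijl}X_{ijm})}{\phi}
      + \frac{2E(X_{ijk}X_{ijl})\mu_X\kappa}{\phi(2+\phi)}
      - \frac{\mu_X^3\kappa^2}{2+\phi} \right] \\
&\qquad\quad \times \left[ h - \frac{2\nu}{(\alpha-1)(1+\phi)}
      + \frac{2\nu}{(1+\phi)(\alpha-1)}\left(\frac{\nu}{\nu+(1+\phi)h}\right)^{\alpha-1}
      + h\left(\frac{\nu}{\nu+(1+\phi)h}\right)^{\alpha} \right] \\
&\quad + \frac{6}{(1+\phi)(2+\phi)^2}
 \left[ -\frac{2E(X_{ijk}X_{ijl})\mu_X\kappa}{1+\phi}
      + \frac{\mu_X^3\kappa^2}{3+\phi} \right] \\
&\qquad\quad \times \left[ h - \frac{\nu}{\alpha-1}
   \left\{ \frac{3+2\phi}{(1+\phi)(2+\phi)}
         - \left(\frac{2+\phi}{1+\phi}\right)\left(\frac{\nu}{\nu+(1+\phi)h}\right)^{\alpha-1}
         + \left(\frac{1+\phi}{2+\phi}\right)\left(\frac{\nu}{\nu+(2+\phi)h}\right)^{\alpha-1} \right\} \right] \\
&\quad + \frac{6\mu_X^3\kappa^2}{\phi^3(1+\phi)}
 \left[ h - \frac{2\nu}{\phi(\alpha-1)}
      + \frac{2\nu}{\phi(\alpha-1)}\left(\frac{\nu}{\nu+\phi h}\right)^{\alpha-1}
      + h\left(\frac{\nu}{\nu+\phi h}\right)^{\alpha} \right] \\
&\quad + \frac{6}{\phi(1+\phi)^2}
 \left[ \frac{2E(X_{ijk}X_{ijl})\mu_X\kappa}{\phi}
      - \frac{\mu_X^3\kappa^2}{2+\phi} \right] \\
&\qquad\quad \times \left[ h - \frac{\nu}{\alpha-1}
   \left\{ \frac{1+2\phi}{\phi(1+\phi)}
         - \frac{1+\phi}{\phi}\left(\frac{\nu}{\nu+\phi h}\right)^{\alpha-1}
         + \frac{\phi}{1+\phi}\left(\frac{\nu}{\nu+(1+\phi)h}\right)^{\alpha-1} \right\} \right] \\
&\quad + \frac{6E(X_{ijk}^2X_{ijl})}{\iota\phi(1+\phi)^2}
 \left[ h - \frac{\nu}{(1+\phi)(\alpha-1)}
   \left\{ 1 - \left(\frac{\nu}{\nu+(1+\phi)h}\right)^{\alpha-1} \right\} \right] \\
&\quad + \frac{6E(X^2)\mu_X\kappa}{\iota\phi^2(1+\phi)}
 \Bigg[ h - \frac{\nu}{\phi(\alpha-1)}
      + \frac{\nu}{\phi(\alpha-1)}\left(\frac{\nu}{\nu+\phi h}\right)^{\alpha-1} \\
&\qquad\quad - \frac{\phi^2}{(1+\phi)(2+\phi)}
   \left( h - \frac{\nu}{(1+\phi)(\alpha-1)}
        + \frac{\nu}{(1+\phi)(\alpha-1)}\left(\frac{\nu}{\nu+(1+\phi)h}\right)^{\alpha-1} \right) \Bigg] \\
&\quad + \frac{E(X^3)h}{\iota^2\phi(1+\phi)} \Bigg\}
\end{aligned}
\]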
B
Figures
[Plot: 1 hr mean (mm) against month; series: obs, BLRP, BLIP, BLIPR]
Figure 1: The mean 1 hour rainfall by month, fitted v observed
[Plot: Var (mm) against month; panels: 5-min, 1 hour, 6 hour, 24 hour; series: obs, BLRP, BLIP, BLIPR]
Figure 2: Variance by month, fitted v observed
[Plot: lag-1 autocorrelation against month; panels: 5-min, 1 hour, 6 hour, 24 hour; series: obs, BLRP, BLIP, BLIPR]
Figure 3: Lag-1 correlation by month, fitted v observed
[Plot: skewness coefficient against month; panels: 5-min, 1 hour, 6 hour, 24 hour; series: obs, BLRP, BLIP, BLIPR, BLIP2]
Figure 4: Coefficient of skewness by month, fitted v observed
[Plot: proportion dry against month; panels: 5-min, 1 hour, 6 hour, 24 hour; series: obs, BLRP, BLIP, BLIPR]
Figure 5: Proportion dry by month, fitted v observed
[Plot: wet spell transition probability against month; panels: 5-min, 1 hour, 6 hour, 24 hour; series: obs, BLRP, BLIP, BLIPR]
Figure 6: Transition probability of a wet interval being followed by another wet interval, by
month, fitted v observed
[Gumbel plots; panels: 5-min, 1 hour, 6 hour, 24 hour; return periods marked from 2 to 100]
Figure 7: Gumbel plots of observed v simulated extremes for July, using the Bartlett-Lewis
Instantaneous Pulse random η model; pulses within the same cell assumed to have a common
depth
[Plot: Rainfall (mm) against Gumbel reduced variate, with return period (years) on the upper axis; panels: (a) 5 minute, (b) 1 hour; series: obs, BLRP, BLIP, BLIPR (indep), BLIPR (dep)]
Figure 8: Annual Gumbel plots of observed v simulated extremes for variants of the Bartlett-Lewis model
[Plot: objective function against parameter value; panels: log(λ), log(µX), log(α), log(α/ν), log(κ), log(φ), log(ι); reference lines: approx 95% CI, approx 99% CI]
Figure 9: Profile Objective Function Plots for January for the Bartlett-Lewis Pulse random η
model; pulses within the same cell assumed to have a common depth
References
Burton, A., Fowler, H., Blenkinsop, S. & Kilsby, C. (2010), ‘Downscaling transient climate
change using a Neyman-Scott Rectangular Pulses stochastic rainfall model’, Journal of
Hydrology 381 (1-2).
Burton, A., Kilsby, C. G., Fowler, H. J., Cowpertwait, P. S. P. & O’Connell, P. E.
(2008), ‘Rainsim: A spatial-temporal stochastic rainfall modelling system’, Environmental Modelling & Software 23.
Chandler, R. E., Isham, V. S., Wheater, H. S., Onof, C. J., Leith, N., Frost, A. J. &
Segond, M.-L. (2007), Spatial-temporal rainfall modelling with climate change scenarios, Technical Report FD2113, DEFRA/EA.
Cowpertwait, P., Isham, V. & Onof, C. (2007), ‘Point process models of rainfall: Developments for fine-scale structure’, Proc. R. Soc. Lond. A 463.
Cowpertwait, P., Kilsby, C. & O’Connell, P. (2002), ‘A space-time Neyman-Scott model
of rainfall: empirical analysis of extremes’, Water Resources Research 38 (8), 1131.
doi:10.1029/2001WR000709.
Cowpertwait, P. S. P. (1994), ‘A generalized point process model for rainfall’, Proc. R.
Soc. Lond. A 447, 23–37.
Cowpertwait, P. S. P. (1995), ‘A generalized spatial-temporal model of rainfall based on
a clustered point process’, Proc. R. Soc. Lond. A 450, 163–175.
Cowpertwait, P. S. P. (1997), ‘A Poisson-cluster model of rainfall: high-order moments
and extreme values’, Proc. R. Soc. Lond. A 454, 885–898.
Cowpertwait, P. S. P. (2004), ‘Mixed rectangular pulses models of rainfall’, Hydrology and
Earth System Sciences 8(5).
Cowpertwait, P. S. P. (2006), ‘A spatial-temporal point process model of rainfall for the
Thames catchment, UK’, Journal of Hydrology 330.
Cowpertwait, P. S. P. (2010), ‘A Neyman-Scott model with continuous distribution of
storm types’, Australian and New Zealand Industrial and Applied Mathematics Journal
51, 97–108.
Cox, D. & Isham, V. (1980), Point Processes, Chapman and Hall.
Entekhabi, D., Rodriguez-Iturbe, I. & Eagleson, P. (1989), ‘Probabilistic representation
of the temporal rainfall process by a modified Neyman-Scott rectangular pulses model:
parameter estimation and validation’, Water Resources Research 25(2), 295–302.
Fowler, H., Kilsby, C. & O’Connell, P. (2000), ‘A stochastic rainfall model for the assessment of regional water resource systems under changed climatic conditions’, Hydrol.
Earth Sys. Sci. 4(2), 263–282.
Glasbey, C. A., Cooper, G. & McGechan, M. B. (1995), ‘Disaggregation of daily rainfall
by conditional simulation from a point process model’, Journal of Hydrology 165.
Gyasi-Agyei, Y. & Willgoose, G. R. (1997), ‘A hybrid model for point rainfall modelling’,
Water Resources Research 33(7).
Hansen, L. P. (1982), ‘Large sample properties of generalized method of moments estimators’, Econometrica 50, 1029–1054.
Houze, R. A. & Hobbs, P. V. (1982), ‘Organization and structure of precipitating cloud
systems’, Advances in Geophysics 24, 225–315.
Kakou, A. & Onof, C. (1996), ‘A point process model for rainfall with duration intensity
dependence’, Annales Geophysicae . Suppl. II to vol. 14: part II, C302.
Kilsby, C., Jones, P., Burton, A., Ford, A., Fowler, H., Harpham, C., James, P., Smith,
A. & Wilby, R. (2007), ‘A daily weather generator for use in climate change studies’,
Environmental Modelling and Software 22.
Koutsoyiannis, D. & Onof, C. (2000), ‘HYETOS - a computer program for stochastic
disaggregation of fine-scale rainfall’. http://www.itia.ntua.gr/e/softinfo/3/.
Koutsoyiannis, D. & Onof, C. (2001), ‘Rainfall disaggregation using adjusting procedures
on a Poisson cluster model’, Journal of Hydrology 246, 109–122.
Northrop, P. J. & Stone, T. M. (2005), ‘A point process model for rainfall with truncated Gaussian rain cells’. Research Report No. 251, Department of Statistical Science,
University College London.
Onof, C., Chandler, R., Kakou, A., Northrop, P., Wheater, H. & Isham, V. (2000), ‘Rainfall modelling using Poisson-cluster processes: a review of developments’, Stochastic
Environmental Research and Risk Assessment 14, 384–411.
Rodriguez-Iturbe, I., Cox, D. & Isham, V. (1987), ‘Some models for rainfall based on
stochastic point processes’, Proc. R. Soc. Lond. A 410, 269–288.
Rodriguez-Iturbe, I., Cox, D. & Isham, V. (1988), ‘A point process model for rainfall:
further developments’, Proc. R. Soc. Lond. A 417, 283–298.
Verhoest, N., Vandenberghe, S., Cabus, P., Onof, C., Meca-Figueras, T. & Jameleddine, S.
(2010), ‘Are stochastic point rainfall models able to preserve extreme flood statistics?’,
Hydrological Processes 24, 3439–3445.
Wheater, H. S., Chandler, R. E., Onof, C. J., Isham, V. S., Bellone, E., Yang, C., Lekkas,
D., Lourmas, G. & Segond, M.-L. (2005), ‘Spatial-temporal rainfall modelling for flood
risk estimation’, Stoch Environ Res Risk Assess 19.