Journal of Hydrology 227 (2000) 1–20
www.elsevier.com/locate/jhydrol
Review
Chaos theory in hydrology: important issues and interpretations
B. Sivakumar*
Department of Hydrology and Water Resources, The University of Arizona, Tucson, AZ 85721, USA
Received 1 June 1999; received in revised form 24 September 1999; accepted 22 October 1999
Abstract
The application of chaos theory in hydrology has been gaining considerable interest in recent times. However,
studies reporting the existence of chaos in hydrological processes are often criticized because the chaos identification
methods were developed under fundamental assumptions, i.e. infinite and noise-free time series, that conflict with the
inherent limitations of hydrological time series, i.e. finite and noisy. This paper is designed: (1) to address some of the
important issues in the application of chaos theory in hydrology; and (2) to provide possible interpretations of the results
reported by past studies of chaos in hydrological processes. A brief review of some of the past studies investigating chaos in
hydrological processes is presented. An examination of these studies reveals that most of the problems in the application of
chaos theory, such as data size, noise, and delay time, have been addressed by past studies, and that caution was taken in the
application of the methods and the interpretation of the results. The study also reveals that the problem of data size is not
as severe as it was assumed to be, whereas the presence of noise seems to have much more influence on the nonlinear
prediction method than on the correlation dimension method. The study indicates that the presence of noise in the data could
be an important reason for the low prediction accuracy estimates achieved in some of the past studies. These observations,
together with the fact that most of the past studies used the correlation dimension either as proof or as preliminary evidence
of chaos, suggest that the hypothesis of deterministic chaos, which forms the basis of those studies, is valid for hydrological
processes and has great practical potential. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: Chaos theory; Hydrological data; Identification methods; Correlation dimension; Nonlinear prediction; Data size; Noise
1. Introduction
One aspect which hydrologists have been extensively working on is the structure of hydrological
processes, such as rainfall and runoff. Although a number of mathematical models have been
proposed for modeling hydrological processes during the past few decades, there is no unified
mathematical approach. In part, this difficulty stems
from the fact that hydrological processes exhibit
considerable spatial and temporal variability.
However, another part of this difficulty is due to the
limited availability of ‘appropriate’ mathematical tools to exploit the structure underlying the
hydrological processes. The latter aspect has gained
considerable interest in recent times.
* Present address: Department of Land, Air and Water Resources,
Veihmeyer Hall, University of California, Davis, CA 95616, USA.
Fax: +1-530-752-5262.
E-mail address: sbellie@ucdavis.edu (B. Sivakumar).
The tremendous spatial and temporal variability of
hydrological processes has been believed, until
recently, to be due to the influence of a large number
of variables. Consequently, the majority of the
previous investigations on modeling hydrological
processes have essentially employed the concept of
a stochastic process. However, recent studies have
indicated that even simple deterministic systems,
influenced by a few nonlinear interdependent
variables, might give rise to very complicated structures (i.e. deterministic chaos). Therefore, it is now
believed that the dynamic structures of the seemingly
complex hydrological processes, such as rainfall and
runoff, might be better understood using nonlinear
deterministic chaotic models than the stochastic ones.
The investigation of the existence of chaos in
hydrological processes has been of much interest
lately (e.g. Hense, 1987; Rodriguez-Iturbe et al.,
1989; Sharifi et al., 1990; Tsonis et al., 1993;
Jayawardena and Lai, 1994; Koutsoyiannis and
Pachakis, 1996; Porporato and Ridolfi, 1996, 1997;
Sivakumar et al., 1998, 1999a). The outcomes of the
investigations are very encouraging as they provided
evidence regarding the existence of low-dimensional
chaos, implying the possibility of accurate short-term
predictions. However, such studies and the reported
results have very often been subject to intense debate
(e.g. Ghilardi and Rosso, 1990; Koutsoyiannis and
Pachakis, 1996) because of the inherent limitations
in employing the chaos identification methods for
hydrological processes. The low prediction accuracy
estimates achieved for the rainfall and streamflow,
which have also been identified to exhibit low-dimensional chaos (Jayawardena and Lai, 1994; Sivakumar
et al., 1998, 1999a), only raise further questions.
In view of the above findings, there is a need to
bridge the gap between the theoretical notions of
deterministic chaos on one hand, and practical
hydrology on the other. Therefore, the present paper
has two main objectives. The first is to address some
of the important issues in the application of chaos
identification methods in hydrology. The second, but
equally important, objective of this paper is to seek
implications of the studies investigating and reporting
the existence of chaos in hydrological processes.
The organization of this paper is as follows. In
Section 2, a brief review of the previous studies
investigating the existence of chaos in hydrological
processes is furnished. Section 3 addresses some of
the important issues in the application of the chaos
identification methods to hydrological data. An
attempt is then made in Section 4 to discuss the important results reported by past studies and to provide
possible interpretations. Such interpretations lead to
the general discussion, in Section 5, concerning the
question of whether a hypothesis of deterministic
chaos is valid for hydrological processes.
2. Review of studies investigating chaos in
hydrology
The last decade has witnessed a number of studies
employing the concept of chaos theory in hydrology
(e.g. Hense, 1987; Rodriguez-Iturbe et al., 1989; Sharifi
et al., 1990; Jayawardena and Lai, 1994; Georgakakos
et al., 1995; Sangoyomi et al., 1996; Puente and Obregon, 1996; Porporato and Ridolfi, 1996, 1997; Liu et al.,
1998; Wang and Gan, 1998; Sivakumar et al., 1998,
1999a,c). Even though the primary objective of those
studies was to investigate the existence of chaos in
hydrological processes, other aspects such as prediction
(e.g. Jayawardena and Lai, 1994; Porporato and Ridolfi,
1996, 1997; Liu et al., 1998; Sivakumar et al., 1999a),
noise level determination (e.g. Sivakumar et al.,
1999b,c), and noise reduction (e.g. Porporato and
Ridolfi, 1997; Sivakumar et al., 1999c) were also
given due consideration. In this section, only a brief
account of some of the studies employing the concept
of chaos theory in hydrology is presented, so as to help address the important issues in implementation and subsequently discuss the validity of
such studies and the reported results.
The possible existence of chaos in hydrological
processes was first investigated by Hense (1987),
who applied the correlation dimension method to a
series of 1008 values of monthly rainfall recorded in
Nauru Island. The existence of chaos in the rainfall
time series was indicated based on the low correlation
dimension value (between 2.5 and 4.5) obtained.
Rodriguez-Iturbe et al. (1989) investigated the
existence of chaos in rainfall using the correlation
dimension method and the Lyapunov exponent
method. They analyzed two rainfall records: (1)
weekly rainfall data over a period of 148 years
observed in Genoa; and (2) a record of 1990 rainfall
values, measured with a sampling frequency of 8 Hz
and then aggregated at equally spaced intervals of
15 s, from a single storm event in Boston. Observation
of a finite low-correlation dimension of about 3.78
provided preliminary evidence on the existence of
chaos in the storm data. The presence of chaos in
the storm data was supported further by the observation of a positive Lyapunov exponent (0.0002 bits/s).
However, the application of the correlation dimension
method to the weekly rainfall data did not indicate the
existence of chaos.
Further evidence on the presence of chaos in storm
rainfall was presented by Sharifi et al. (1990), who
employed the correlation dimension method to
examine fine-increment data from three storms. The
numbers of data points for the three
storms were 4000, 3991, and 3316, and the correlation
dimensions obtained were 3.35, 3.75, and 3.60,
respectively. Tsonis et al. (1993) applied the correlation
dimension method to data representing the time between
successive raingage signals, each corresponding to a
collection of 0.01 mm of rain.
The presence of a low correlation dimension
of about 2.4 indicated the possible existence of chaos.
Islam et al. (1993) obtained a low dimension on
analyzing simulated rainfall intensity data using the
correlation dimension method. From a data set of
7200 points, generated at 10-s time steps from a
three-dimensional cloud model, they obtained a
value as low as about 1.5 for the dimension.
Jayawardena and Lai (1994) investigated the daily
rainfall and streamflow data from three and two
stations, respectively, in Hong Kong for the purpose
of identifying the existence of chaos. The correlation
dimension method, the Lyapunov exponent method,
the Kolmogorov entropy method, and the nonlinear
prediction method were applied to data sets containing 4015 points (for rainfall), and 7300 and 6205
points (for streamflow), respectively. Their study
provided convincing evidence of the existence of
chaos in the daily rainfall and streamflow data in
Hong Kong. Although the rainfall and streamflow
prediction accuracy estimates were found to be low,
they showed the superiority of the nonlinear
prediction method over the traditional linear autoregressive moving average (ARMA) method. Using
the nonlinear prediction method, Waelbroeck et al.
(1994) observed that the prediction skill for daily rainfall dropped off quickly within a time scale of 2 days.
However, the prediction skill of 10-day rainfall
accumulations was found to be much better. Using
the correlation dimension method, Georgakakos et
al. (1995) analyzed data from 11 storm events in
Iowa City, and reported the possible existence of
chaos (except for data from one of these storms).
The correlation exponents were found to range from
2.8 to 7.9 in the high-intensity scaling region, while in
the low-intensity scaling region they ranged from 0.5
to 1.6. The possibility of the existence of chaos in the
volume of the Great Salt Lake was studied by
Sangoyomi et al. (1996). The analysis of a 144-year,
biweekly time series of the Great Salt Lake volume
yielded a correlation dimension of about 3.4.
Puente and Obregon (1996) reported the existence
of chaos in storm events observed in Boston by
analyzing a time series of 1990 points using the correlation dimension method, the Kolmogorov entropy
method, the false neighbors algorithm, and the
Lyapunov exponent method. They presented a deterministic fractal-multifractal (FM) approach for
modeling the storm event. A detailed comparison of
the real and FM fitted time series revealed the
possibility of the use of a deterministic FM approach
for a faithful representation of the Boston storm event,
and led them to hint that a stochastic framework for
rainfall modeling might not be necessary. However, at
the same time, Koutsoyiannis and Pachakis (1996)
defended the use of stochastic models in modeling
hydrological processes. They concluded, while
analyzing incremental rainfall depths measured
every 15 min, that a synthetic continuous rainfall
series generated by a well-structured stochastic rainfall model might be practically indistinguishable from
a historic rainfall series even if one used the tools of
the chaotic dynamics theory to characterize and
compare the two rainfall series.
Porporato and Ridolfi (1996) provided clues to the
existence of deterministic chaos in the daily flow data
of Dora Baltea, a tributary of the river Po, in Italy. The
application of the correlation dimension method and
the nonlinear prediction method to a time series
consisting of 14,246 points indicated the existence
of a strong deterministic component. The study also
paved the way for a more detailed analysis of the flow
phenomenon (Porporato and Ridolfi, 1997), such as
noise reduction, interpolation, and nonlinear
prediction, which provided important confirmations
of the nonlinear deterministic behavior of the flow
phenomenon. Liu et al. (1998) analyzed, using the
nonlinear prediction method, the daily streamflow
data observed in 28 selected stations from the continental United States and reported that the daily
streamflow signals spanned a wide dynamical range
between deterministic chaos and periodic signal
contaminated with additive noise. Further studies
regarding the existence of deterministic chaos in
streamflow data were carried out by Wang and Gan
(1998), who estimated the correlation dimensions of
the unregulated streamflow data of six rivers in the
Canadian prairies to be about 3.0. However, based on
their observation of the consistent underestimation of
the correlation dimension for the randomly resampled data by an amount of 4–6, they interpreted
that the actual dimensions of the streamflow data
should be between 7 and 9.
Sivakumar et al. (1998, 1999a) investigated the
daily rainfall data of different record lengths observed
from each of six stations in Singapore using the correlation dimension method and the nonlinear prediction
method, and provided convincing evidence regarding
the existence of chaos. They also employed the
surrogate data method, which indicated the absence
of linearity in the rainfall time series. Subsequently,
Sivakumar et al. (1999b,c) studied the problem of the
influence of the presence of noise (measurement
error) on the correlation dimension and prediction
accuracy estimates, by proposing a systematic
approach for noise reduction, coupling a noise level
determination method and a noise reduction method.
The outcomes provided additional support regarding
the existence of a deterministic component in the rainfall phenomenon and possible reasons for the low
prediction accuracy estimates achieved in the earlier
study (Sivakumar et al., 1999a).
3. Issues in the investigation of chaos in hydrology
As the application of the concept of chaos theory to
hydrological processes has been gaining momentum
lately, so have the questions about the validity of such
studies and the reported results. This section addresses
the possible bases for such questions. It is not the
intent of this section to discuss, in detail, all the issues
pertaining to such questions; rather, the intent is to
address only those issues that have been recognized
or suspected to significantly influence the outcomes.
The questions regarding the applicability of chaos
theory in hydrology (or any natural phenomenon) may
broadly be divided into two categories. The first is
concerned with the lack of investigative methods,
which provide sufficient conditions to identify the
existence of chaotic dynamics in hydrological
phenomena. The second is concerned with the validity
of chaos identification methods to hydrological data
due to practical limitations such as small sample size,
insufficient sampling frequency, and presence of
noise. More importantly, all the above
issues play major roles when one deals with hydrological data and, therefore, make the problem of
chaos identification much more difficult. A brief
discussion of the above problems is provided below.
3.1. Chaos identification methods
The science of chaos is a burgeoning field and the
available methods to investigate the existence of
chaos in a time series are still in their infancy.
Though a wide variety of methods are available, such
as the correlation dimension method (e.g. Grassberger
and Procaccia, 1983a), the Lyapunov exponent
method (e.g. Wolf et al., 1985), the Kolmogorov
entropy method (e.g. Grassberger and Procaccia,
1983b), the nonlinear prediction method (e.g. Farmer
and Sidorowich, 1987; Casdagli, 1989, 1991; Sugihara
and May, 1990), and the surrogate data method (e.g.
Theiler et al., 1992a,b; Schreiber and Schmitz, 1996),
there is no single method that can provide an infallible
distinction between a chaotic and a stochastic system.
For instance: (1) a finite correlation dimension,
usually understood as the principal, if not unique,
sign of deterministic chaos, may be observed also
for a stochastic process (e.g. Osborne and Provenzale,
1989); (2) an autoregressive (AR) stochastic process
can also produce accurate short-term prediction,
which is a typical characteristic of a chaotic process;
(3) a positive Lyapunov exponent may be observed
also for random and ARMA processes (e.g.
Jayawardena and Lai, 1994); (4) random noises with
power law spectra may provide convergence of the
Kolmogorov entropy (e.g. Provenzale et al., 1991);
and (5) phase-randomized surrogates can produce
spurious identifications of non-random structure
(e.g. Rapp et al., 1994). Consequently, a conclusive
resolution of whether or not a given finite data set is
chaotic is difficult to provide.
On one hand, these problems have motivated
improvements on existing methods for the diagnosis
of chaos and the proposal of new ones. Popular among
these are nonlinear prediction (e.g. Farmer and Sidorowich, 1987; Casdagli, 1989; Sugihara and May,
1990) including deterministic versus stochastic
(DVS) diagrams (e.g. Casdagli, 1991), surrogate
data (e.g. Theiler et al., 1992a,b; Schreiber and
Schmitz, 1996), and linear and nonlinear redundancies (e.g. Palus, 1995; Prichard and Theiler, 1995). On
the other hand, they have highlighted the caution
needed in studying natural phenomena. Only the
application of diverse techniques, each one in some
way complementary to the others, and their critical
analysis can enable us to confirm or
exclude the existence of chaotic dynamics in a
phenomenon (e.g. Porporato and Ridolfi, 1997).
Having said the above, it is crucial to note, at this
point, that only a few studies employed more than one
method to investigate the existence of chaos
in hydrological processes and to verify or confirm the
results (e.g. Rodriguez-Iturbe et al., 1989; Jayawardena
and Lai, 1994; Porporato and Ridolfi, 1996, 1997;
Sivakumar et al., 1999a). Also, all the other studies,
except those of Waelbroeck et al. (1994) and Liu et
al. (1998), based their conclusions on the correlation
dimension method, where the presence of a finite low correlation dimension was taken as an indicator of
chaos (e.g. Hense, 1987; Rodriguez-Iturbe et al.,
1989; Sharifi et al., 1990; Islam et al., 1993; Tsonis
et al., 1993; Georgakakos et al., 1995; Koutsoyiannis
and Pachakis, 1996; Sangoyomi et al., 1996; Sivakumar
et al., 1998; Wang and Gan, 1998). The observation,
as previously mentioned, that even stochastic
processes may yield finite low-correlation dimensions
therefore brings an inevitable question on the validity
of the reported results. The failure to continue investigations either to provide further support, or confirmation on the existence of chaos, or to try to
make short-term predictions at least based on those
preliminary results only raises additional concerns.
3.2. Limitations of data
A fundamental limitation of the applicability of
chaos theory in hydrology arises from the basic
assumptions with which the chaos identification
methods are developed, i.e. the time series is infinite
and noise-free. This is because hydrological data are
always finite and inherently contaminated by noise,
such as errors arising from measurement. A finite
and small data set may result in an underestimation of the actual dimension of the process (e.g.
Havstad and Ehlers, 1989). The presence of noise may
affect the scaling behavior in the correlation dimension
estimate and the prediction accuracy in the nonlinear
prediction method (e.g. Schreiber and Kantz, 1996).
There are also other issues such as the sampling
frequency, delay time and critical embedding dimension. Since the correlation dimension method and the
nonlinear prediction method have been widely
employed in studies investigating the existence of
chaos in hydrological processes, much of the discussion below on the data limitations is restricted to these
two methods. However, most of these limitations apply
to other methods as well.
3.2.1. Data size and sampling frequency
The problem of data size is believed to be much
more serious in the correlation dimension method
than in the nonlinear prediction method. The correlation exponent, and hence the correlation dimension,
is computed from the slope of the scaling region in
the log C(r) versus log r plot, where C(r) is the correlation integral. It is always desirable to
have a larger scaling region to determine the slope,
since the determination of the slope for a smaller
scaling region may be difficult and may give
rise to errors. An infinitely long data set yields
a larger scaling region owing to the inclusion of a
large number of points (or vectors) in the reconstructed phase–space. However, if the data set were
finite and small, there would be only a few points in
the reconstructed phase–space, which makes slope
determination difficult. Therefore, it may be necessary
to have a large data size for dimension estimation.
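The slope estimation described above can be sketched in a few lines. The following is only an illustrative sketch under stated assumptions, not the procedure of any of the studies cited: it uses the Henon map as a stand-in for a low-dimensional signal, a delay embedding with m = 2 and a delay of one step, the maximum norm for distances, and a hand-picked range of radii as the scaling region. The function names (`henon_series`, `correlation_sums`, `slope`) are hypothetical.

```python
import math
from bisect import bisect_left


def henon_series(n, a=1.4, b=0.3, discard=100):
    """x-component of the Henon map (assumed test signal, not hydrological data)."""
    x, y = 0.0, 0.0
    out = []
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:  # drop the initial transient
            out.append(x)
    return out


def correlation_sums(series, m, tau, radii):
    """C(r): fraction of delay-vector pairs closer than r (maximum norm)."""
    n_vec = len(series) - (m - 1) * tau
    vectors = [[series[i + k * tau] for k in range(m)] for i in range(n_vec)]
    dists = []
    for i in range(n_vec):
        vi = vectors[i]
        for j in range(i + 1, n_vec):
            vj = vectors[j]
            dists.append(max(abs(p - q) for p, q in zip(vi, vj)))
    dists.sort()
    n_pairs = len(dists)
    # bisect_left counts the distances strictly smaller than r
    return [bisect_left(dists, r) / n_pairs for r in radii]


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


series = henon_series(1200)
radii = [0.01 * (20 ** (k / 7)) for k in range(8)]  # 0.01 ... 0.2, assumed scaling region
c_r = correlation_sums(series, m=2, tau=1, radii=radii)
exponent = slope([math.log(r) for r in radii], [math.log(c) for c in c_r])
print(f"correlation exponent (m=2): {exponent:.2f}")
```

In practice the exponent would be computed for several embedding dimensions and inspected for saturation, and the scaling region would be chosen from the plot rather than fixed in advance. With fewer points the small-radius end of the plot contains few pairs, which is the difficulty the paragraph above describes.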
The belief that a large data size would be necessary
created a lot of debate on the minimum data size
required for the computation of the correlation dimension. Numerous attempts have been and are being
made to provide some guidelines on this issue (e.g.
Smith, 1988; Havstad and Ehlers, 1989; Nerenberg
and Essex, 1990; Ramsey and Yuan, 1990). A brief
overview of some of the important suggestions and
recommendations is given below.
The painful exercise of determining the minimum
number of data points ($N_{\min}$) was first tackled by Smith
(1988), who concluded that this number was equal to
$42^m$, where m is the smallest integer above the dimension of the attractor (an attractor is a geometric form
that characterizes long-term behavior of a system in
the phase–space). Nerenberg and Essex (1990)
demonstrated that Smith’s procedure to obtain the
$42^m$ estimate was flawed and that the data requirements
might not be so extreme. They suggested that the
minimum number of points required for the dimension estimate is $N_{\min} \sim 10^{2+0.4m}$. Havstad and Ehlers
(1989) used a variant of the nearest neighbor dimension algorithm to compute the dimension of the time
series generated from the Mackey–Glass equation
(Mackey and Glass, 1977), whose actual dimension
is 7.5. Using a data set as small as 200 points, the
study resulted in an underestimation of the dimension
by about 11%. Ramsey and Yuan (1990) concluded
that for small sample sizes, dimension could be
estimated with upward bias for chaotic systems and
with downward bias for random noise as the embedding dimension is increased. They proved that, due to
these bias effects, a correlation dimension estimate of
0.214 could imply an actual correlation dimension
value of as high as 1.68.
Though none of the studies addressing the issue of
data size has been able to provide a clear-cut guideline
on the minimum data size for the correlation dimension estimation, what is clear is that a large (if not
infinite) data set may be necessary to obtain realistic
results. However, a large data size alone does not
solve the overall problem, as other factors, such as
the sampling frequency, may also be important. This
is because the theorems that justify the use of delay
(or other) embedding vectors recovered from a scalar
measurement as a replacement for the ‘original’
dynamical variables are themselves strictly valid
only for data of infinite (or at least very high)
resolution.
It is important to recognize that large data sets with
infinite resolution are generally not available in the
field of hydrology (the problem of deriving high-resolution data from low-resolution data has been of
much interest in recent times). For instance, regarding
data size, let us consider the recommendation by
Smith (1988) on the minimum data size, i.e. $N_{\min} = 42^m$.
This means that, when $m = 4$, if $N_{\min}$ is not at
least equal to 3,111,696, no accurate estimate of the
dimension can be obtained. In other words, if one is
dealing with daily data, then one has to have data
collected over a period of about 8500 years. Such a
restrictive figure questions all the studies claiming
low-dimensional chaos in hydrology (e.g. Hense,
1987; Rodriguez-Iturbe et al., 1989; Sharifi et al.,
1990; Islam et al., 1993; Tsonis et al., 1993;
Jayawardena and Lai, 1994; Georgakakos et al.,
1995; Koutsoyiannis and Pachakis, 1996; Sangoyomi
et al., 1996; Puente and Obregon, 1996; Porporato and
Ridolfi, 1996, 1997; Liu et al., 1998; Wang and Gan,
1998; Sivakumar et al., 1998, 1999a,c). In fact,
Smith’s results effectively eliminate the possibility
of estimating the dimension of any hydrological
phenomena, since no time series contains such a
large number of values. A similar kind of problem is
faced regarding the sampling frequency, as the rainfall
and runoff data generally available have a sampling
frequency as low as daily, though recent advances in technology, such as high-resolution measurement gages and
remote sensing, might solve this problem to a certain
extent.
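The arithmetic behind these data-size requirements is easy to reproduce. The short sketch below assumes the two rules as read from the text: Smith's (1988) requirement of 42 to the power m, and the Nerenberg and Essex (1990) estimate taken here as 10 to the power (2 + 0.4m). The function names and the year length of 365.25 days are assumptions made for illustration only.

```python
def smith_min_points(m):
    """Smith (1988) requirement as quoted in the text: N_min = 42^m."""
    return 42 ** m


def nerenberg_essex_min_points(m):
    """Nerenberg and Essex (1990) estimate, read here as N_min ~ 10^(2 + 0.4 m)."""
    return 10 ** (2 + 0.4 * m)


DAYS_PER_YEAR = 365.25  # assumption: average year length for daily records

for m in range(1, 6):
    n_smith = smith_min_points(m)
    n_ne = nerenberg_essex_min_points(m)
    years = n_smith / DAYS_PER_YEAR
    print(f"m={m}: Smith {n_smith:>11,d} points (~{years:,.0f} years of daily data), "
          f"Nerenberg-Essex ~{n_ne:,.0f} points")
```

For m = 4 the first rule gives 3,111,696 points, which corresponds to several thousand years of daily observations, whereas the second gives a few thousand points, a size comparable to many of the hydrological records reviewed in Section 2.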
3.2.2. Noise
Noise affects the performance of many techniques
of identification, modeling, prediction, and control of
deterministic systems. Some of the most characteristic
examples of the effects of noise are: (1) self-similarity
of the attractor is broken; (2) phase–space reconstruction appears as high-dimensional on small length
scales; (3) nearby trajectories diverge diffusively
rather than exponentially; and (4) prediction error is
found to be bounded from below no matter which
prediction method is used and to how many digits
the data are recorded (e.g. Kantz and Schreiber,
1997). The severity of the influence of noise on
chaos identification and prediction methods depends
largely on the level and the nature of noise. In general,
when the noise level approaches a few percent,
estimates can become quite unreliable (e.g. Schreiber
and Kantz, 1996; Kantz and Schreiber, 1997).
The presence of noise influences the estimation of
the correlation dimension primarily through the identification of the scaling region. Noise may corrupt the
scaling behavior at all length scales, but its effects are
significant especially at smaller length scales. If the
data are noisy, then below a length scale of a few
multiples of the noise level, the data points are not
confined to the fractal structure but smeared out over
the whole available phase–space. Thus, the local
scaling exponents may increase. It has been observed
that even small levels of noise significantly
complicate estimates of dimension, a quantity that in
principle should be straightforward to measure (e.g.
Schreiber and Kantz, 1996).
Noise is one of the most prominent limiting factors
for the predictability of deterministic systems. Noise
limits the accuracy of predictions in three possible
ways: (1) the prediction error cannot be smaller than
the noise level, since the noise part of the future
measurement cannot be predicted; (2) the values on
which the predictions are based are themselves noisy,
inducing an error proportional to and of the order of
the noise level; and (3) in the generic case, where the
dynamical evolution has to be estimated from the
data, this estimate will be affected by noise (Schreiber
and Kantz, 1996). In the presence of the above three
effects, the prediction error will increase faster than
linearly with the noise level.
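The first two effects can be illustrated with a small numerical experiment. The sketch below is a hedged illustration only: it assumes a logistic map as a stand-in deterministic system, additive Gaussian measurement noise, and a predictor that knows the true map exactly, so that the third effect (estimating the dynamics from data) is deliberately excluded. The function names and parameter values are hypothetical.

```python
import math
import random


def logistic(x, r=3.9):
    """Logistic map, used here only as an assumed stand-in for a deterministic system."""
    return r * x * (1.0 - x)


def rms_prediction_error(noise_std, n=5000, seed=0):
    """One-step RMS error when the map is known exactly but the data are noisy."""
    rng = random.Random(seed)
    x = 0.3
    clean = []
    for _ in range(n + 100):
        x = logistic(x)
        clean.append(x)
    clean = clean[100:]  # drop the initial transient
    # Measurement noise is added to the observations, not to the dynamics.
    noisy = [v + rng.gauss(0.0, noise_std) for v in clean]
    # Predict each next observation from the current noisy observation.
    sq = [(logistic(noisy[i]) - noisy[i + 1]) ** 2 for i in range(n - 1)]
    return math.sqrt(sum(sq) / len(sq))


for level in (0.0, 0.001, 0.01, 0.05):
    print(f"noise std {level:<6}: RMS one-step error {rms_prediction_error(level):.4f}")
```

Even with the dynamics known perfectly, the error exceeds the noise level, because the unpredictable noise in the future observation adds to the error propagated from the noisy starting value. When the dynamics must also be estimated from the noisy data, as in the nonlinear prediction method, the error can be expected to grow further.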
The sensitivity of the correlation dimension (or any
other invariant) and the prediction accuracy to the
presence of noise is the price one has to pay for
using these to identify chaos. The definitions of
these involve the limit of small length scales because
it is only then that the quantity becomes independent
of the details of the measurement technique, the data
processing and the phase–space reconstruction
method. The permissible noise level for a practical
application of these methods depends, in a complicated way, on the details of the underlying system
and the measurement.
The foregoing discussion clearly indicates that
noise present in hydrological data cannot be ignored
if the analysis is to remain realistic. The important
first step is to be aware of the problem and to recognize its effects on the data analysis techniques by
estimating the level and the nature of noise. If it is
found that the level of noise is only moderate, and
there are hints that there is a strong deterministic
component in the signal, then one can attempt the
second step of separating the deterministic signal
from the noise. However, none of the studies investigating chaos in hydrology, except a very recent study
by Sivakumar et al. (1999c), attempted to determine
the level of noise in the data and, therefore, it is very
difficult to comprehend the possible effects of noise on
the reported results. Although a wide variety of
nonlinear noise reduction methods have been made
available in the literature over the past decade (e.g.
Schreiber and Grassberger, 1991; Schreiber, 1993;
Grassberger et al., 1993), their applicability to hydrological data has been tested only recently (Porporato
and Ridolfi, 1997; Sivakumar et al., 1999c). The failure of the majority of the studies to address the
problem of noise and its possible effects on chaos
identification in hydrological data forms another
side of the criticism of the validity of such studies.
3.2.3. Delay time
An appropriate delay time, τ, for the reconstruction
of the phase–space is necessary because an optimum
selection of τ gives the best separation of neighboring
trajectories within the minimum embedding phase–
space (e.g. Frison, 1994). If τ is too small, then
there is little new information contained in each
subsequent datum and this may result in an underestimation of the correlation dimension (e.g. Havstad
and Ehlers, 1989). On the contrary, if τ is too large,
and the dynamics are chaotic, all relevant information
for phase–space reconstruction is lost since neighboring trajectories diverge, and averaging in time and/or
space is no longer useful (e.g. Sangoyomi et al.,
1996). This may result in an overestimation of the
correlation dimension (e.g. Havstad and Ehlers,
1989).
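The role of the delay time in the reconstruction can be made concrete with a minimal sketch of delay embedding. The function name `delay_embed` and the short series of daily values are hypothetical and serve only to show how the delay time and the embedding dimension m determine the reconstructed vectors.

```python
def delay_embed(series, m, tau):
    """Return the delay vectors (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    n_vec = len(series) - (m - 1) * tau
    if n_vec <= 0:
        raise ValueError("series too short for this m and tau")
    return [tuple(series[i + k * tau] for k in range(m)) for i in range(n_vec)]


rainfall = [0.0, 1.2, 3.4, 0.0, 0.0, 5.1, 2.2, 0.4]  # hypothetical daily values
for tau in (1, 2, 3):
    vectors = delay_embed(rainfall, m=3, tau=tau)
    print(f"tau={tau}: {len(vectors)} vectors, first = {vectors[0]}")
```

A larger delay spreads the components of each vector further apart in time and also reduces the number of vectors available from a finite record, which is one reason the choice of delay interacts with the data-size problem discussed earlier.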
Many researchers have addressed the problem of
the selection of an appropriate delay time and
proposed various methods. Well known among these
are the autocorrelation function method (e.g. Holzfuss
and Mayer-Kress, 1986; Schuster, 1988; Tsonis and
Elsner, 1988), the mutual information method (e.g.
Frazer and Swinney, 1986) and the correlation
integral method (e.g. Liebert and Schuster, 1989).
The autocorrelation function method is the most
commonly used one due to its computational ease.
Holzfuss and Mayer-Kress (1986) suggested using a
value of delay time at which the autocorrelation
function first crosses the zero line. Other approaches
consider the lag time at which the autocorrelation
function attains a certain value, say 0.1 (Tsonis and
Elsner, 1988) or 0.5 (Schuster, 1988). According to
Frazer and Swinney (1986), however, the autocorrelation function method measures the linear dependence between successive points and, thus, may not
be appropriate for nonlinear dynamics. They
suggested the use of the local minimum of the mutual
information, which measures the general dependence
between successive points. They reasoned that if t is
chosen to coincide with the first minimum of the
mutual information, then the recovered state vector
would consist of components that possess minimal
mutual information between them. The mutual
information method is a more comprehensive method
of determining proper delay time values (e.g. Tsonis,
1992). However, the method has the disadvantage of
requiring a large number of data, unless the dimension
is small, and is computationally cumbersome. A
somewhat similar approach, which does not demand
as much data as the mutual information method, was
proposed by Liebert and Schuster (1989). According
to this approach, the first minimum of the logarithm of
the generalized correlation integral provides a proper
choice of the delay time.
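The two most widely used criteria described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of any of the cited studies: the function names, the 16-bin histogram estimate of the mutual information, and the maximum lag of 30 are assumptions made here for the example, and the sine wave in the demonstration merely stands in for a measured series.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(xc, xc)
    n = len(xc)
    return np.array([np.dot(xc[: n - k], xc[k:]) / denom for k in range(max_lag + 1)])


def first_zero_crossing(acf):
    """Smallest lag at which the autocorrelation is zero or negative (None if never)."""
    below = np.flatnonzero(acf <= 0.0)
    return int(below[0]) if below.size else None


def mutual_information(x, lag, bins=16):
    """Histogram estimate (in nats) of the mutual information of x(t) and x(t+lag), lag >= 1."""
    x = np.asarray(x, dtype=float)
    joint, _, _ = np.histogram2d(x[:-lag], x[lag:], bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal distribution of x(t)
    py = pxy.sum(axis=0, keepdims=True)   # marginal distribution of x(t+lag)
    nz = pxy > 0                          # empty cells contribute nothing
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def first_local_minimum(values):
    """Index of the first interior local minimum of a sequence (None if there is none)."""
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] <= values[i + 1]:
            return i
    return None


def delay_candidates(x, max_lag=30, bins=16):
    """Return (delay from the ACF zero crossing, delay from the first MI minimum)."""
    tau_acf = first_zero_crossing(autocorrelation(x, max_lag))
    mi = [mutual_information(x, lag, bins) for lag in range(1, max_lag + 1)]
    idx = first_local_minimum(mi)
    tau_mi = None if idx is None else idx + 1   # mi[0] corresponds to lag 1
    return tau_acf, tau_mi


if __name__ == "__main__":
    # Illustrative signal only: a sine wave with a period of 40 samples.
    t = np.arange(4000)
    signal = np.sin(2.0 * np.pi * t / 40.0)
    print(delay_candidates(signal))
```

For a periodic signal such as this one, the autocorrelation criterion is expected to return a delay close to one quarter of the period. The histogram estimate of the mutual information is sensitive to the number of bins and to the data size, which reflects the data requirement noted above.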
For some attractors, it really does not matter
whether the autocorrelation function or the mutual
information or the correlation integral is used. For
example, when applied to the Rossler system (Rossler,
1976), all approaches provided a value of t approximately equal to one-fourth of the mean orbital period
(Tsonis, 1992). However, for some other attractors,
the estimation of t might depend strongly on the
approach employed. Evidently, none of the aforementioned rules has emerged as the definitive rule
for choosing t, but the mutual information approach
appears to have the edge. In the absence of clear-cut
guidelines, a practical approach is to experiment with
different t to ascertain its effect on the correlation
dimension (e.g. Rodriguez-Iturbe et al., 1989; Tsonis,
1992; Tsonis et al., 1993).
The problem of the selection of an appropriate t
has been addressed by some of the studies investigating the existence of chaos in hydrological data
(e.g. Tsonis et al., 1993; Jayawardena and Lai,
1994; Sangoyomi et al., 1996; Sivakumar et al.,
1999a). However, on one hand, most of these studies,
except that of Sangoyomi et al. (1996), have
employed only the autocorrelation function method
to determine the appropriate t and, therefore, there
is no way to know whether the delay times
used are in fact appropriate. On the other hand, even
when employing the autocorrelation function method,
not many studies could ascertain the effect of t on the
correlation dimension estimate, as they failed to carry
out the analysis with different t values.
3.2.4. Other problems
The problems of data size, data sampling
frequency, delay time, critical embedding dimension,
and presence of noise are encountered in almost every
field of natural and physical phenomena, including
hydrology, and this is why they have received considerable attention. However, there could also be
other problems, as serious as the above, that might
not have received the necessary attention because of
their association with a particular field. One such
problem that is commonly encountered in the field
of hydrology is the presence of a large number of
zeros in the measurements. One possible influence
of this problem is that in the presence of a large
number of zeros (or any other single value), the reconstructed hyper-surface in phase–space will tend to a
point and may result in an underestimation of the
correlation dimension (e.g. Tsonis et al., 1993). In
fact, some of the criticism on studies reporting low-dimensional chaos in hydrological data, particularly
low-resolution data such as daily, revolves around
the problem of the presence of a large number of
zeros.
The various issues discussed above on the inability
of the investigative methods to provide sufficient
conditions to identify chaos, and the inherent limitations of the hydrological data, clearly indicate the
potential difficulties and uncertainties on chaos
identification in hydrological processes. The failure
of most of the past studies to address, in detail, all
the pertinent and important issues only raises further
concern on the basis for the application of the chaos
theory in hydrology and the validity of the reported
results.
4. Reported results and interpretations
It is clear, from the foregoing discussion, that there
cannot be any second opinion on the inherent
problems in the application of chaos theory in hydrology and the associated uncertainties on the outcomes.
However, whether such problems are serious enough
to warrant criticism of the use of
chaos theory in hydrology and the evidence provided
by past studies on the existence of chaos in hydrological processes, is a question that needs to be immediately addressed. In this regard, the emphasis should
not be on dwelling much on the limitations of the
methods and the data, but rather should be on trying
to provide possible interpretations of the results
obtained, keeping in mind the limitations. Therefore,
this section is dedicated to providing possible
interpretations of the results reported by the past
studies, most of which have also to some extent
addressed, and even taken care of, the limitations.
4.1. Use of diverse techniques
As discussed previously, each of the available
chaos identification methods has its own limitations
and, therefore, it is absolutely not possible to provide
irrefutable proof regarding the existence of chaotic
dynamics in a phenomenon. Having this in mind,
when looking for the possible existence of a deterministic component in a phenomenon, the goal must be to
try to acquire clues that allow us not to exclude its
existence rather than to ensure its existence. One
possible way to achieve this is to employ diverse
techniques, in order to verify whether the results
from each one of them are complementary to the
others. Unfortunately, this has not often been the
case as far as studies investigating the existence of
chaos in hydrological processes are concerned. Most
of the studies reported existence of chaos based on the
finite low-dimensions achieved using the correlation
dimension method and, as a result, paved the way for
criticisms of such studies since finite correlation
dimensions may be observed also for linear stochastic
processes (e.g. Osborne and Provenzale, 1989).
Though there cannot be any argument on the possibility of linear stochastic processes providing finite
correlation dimensions, a pertinent question is
whether this alone is sufficient to form a
strong basis for interpreting the low correlation
dimensions (resulting from stochastic processes)
reported by past studies investigating chaos in hydrological processes. The importance of this question lies
in the fact that not many (artificial) stochastic systems
have been identified to yield finite and low correlation
dimensions, whereas low dimensions have been
observed for every (artificial) chaotic system, e.g.
Lorenz system (Lorenz, 1963), Henon map (Henon,
1976), Rossler system (Rossler, 1976), Mackey–
Glass delay differential equation (Mackey and Glass,
1977) and the Ikeda map (Ikeda, 1979). An implication of this is that, though it may not be possible to
conclude based on the correlation dimension results
reported by past studies, whether or not chaos exists in
hydrological processes, such an existence cannot be
excluded altogether. This point has been reflected in
almost all the studies investigating chaos in hydrological processes as they either recommended or
employed other methods to verify and confirm the
results. Tsonis et al. (1993), for example, established
that due to the weaknesses of the existing algorithms,
such as the Grassberger–Procaccia algorithm, the
results from the correlation dimension method could
be considered to present just evidence rather than
proof of existence of chaos. They recommended that
evidence for chaos should be fortified by additional
evidence using other methods, such as Lyapunov
exponent and nonlinear prediction.
Though only a few studies (e.g. Rodriguez-Iturbe et
al., 1989; Jayawardena and Lai, 1994; Puente and
Obregon, 1996; Porporato and Ridolfi, 1996, 1997;
Sivakumar et al., 1999a) have employed one or
more methods in addition to the correlation dimension
method, it is important to note that the outcomes
clearly provided additional evidence to those
achieved using the correlation dimension method,
regarding the existence of chaos in hydrological
processes. Among the methods used, the nonlinear
prediction method was found to be very promising
(e.g. Jayawardena and Lai, 1994; Porporato and
Ridolfi, 1996, 1997; Sivakumar et al., 1999a),
although the method yielded only low-prediction
accuracy in some of the studies (e.g. Jayawardena
and Lai, 1994; Sivakumar et al., 1999a). (This could
possibly be due to the presence of noise in the data,
details of which will be discussed below.) The advantages of this method are: (1) the existence of chaos can
be identified by comparing the prediction accuracy
against the number of neighbors (e.g. Casdagli,
1991), the embedding dimension (e.g. Casdagli,
1989), and the lead time (e.g. Sugihara and May,
1990); and (2) it does not require a large data size
and can provide reasonably good results even when
the data size is small. The studies of Jayawardena and
Lai (1994) and Porporato and Ridolfi (1996, 1997)
checked the prediction accuracy against the lead
time and embedding dimension, whereas all the
above three were used by Sivakumar et al. (1999a)
in their investigation of chaos in the daily rainfall
data observed in Singapore. All the above studies
provided convincing evidence regarding the existence
of chaos in rainfall and streamflow.
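The way in which prediction accuracy can be checked against the number of neighbors, the embedding dimension, and the lead time may be illustrated with a minimal local approximation scheme. The sketch below is an assumption-laden illustration rather than the algorithm used in the studies cited above: it forecasts each value as the plain average of the futures of the k nearest delay vectors (maximum norm), uses the Henon map as test data, and all function names, the training size of 1000, and the parameter values are chosen here for the example only.

```python
import numpy as np


def henon_series(n, a=1.4, b=0.3, discard=500):
    """x-component of the Henon map, with an initial transient discarded."""
    x, y = 0.0, 0.0
    out = np.empty(n)
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            out[i - discard] = x
    return out


def embed(x, m, tau=1):
    """Delay vectors (x[j], x[j+tau], ..., x[j+(m-1)tau]) as rows."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(m)])


def local_prediction_accuracy(x, m, k, lead=1, tau=1, n_train=1000):
    """Correlation coefficient between observed values and local-average forecasts."""
    x = np.asarray(x, dtype=float)
    last = (m - 1) * tau                      # offset of the newest sample in a vector
    usable = len(x) - last - lead             # vectors whose future value is known
    vecs = embed(x, m, tau)[:usable]
    targets = x[last + lead: last + lead + usable]
    train_v, train_t = vecs[:n_train], targets[:n_train]
    test_v, test_t = vecs[n_train:], targets[n_train:]
    preds = np.empty(len(test_v))
    for i, v in enumerate(test_v):
        dist = np.max(np.abs(train_v - v), axis=1)    # max-norm distances
        nearest = np.argpartition(dist, k - 1)[:k]    # indices of the k nearest vectors
        preds[i] = train_t[nearest].mean()
    return float(np.corrcoef(preds, test_t)[0, 1])


if __name__ == "__main__":
    series = henon_series(1500)
    for k in (1, 5, 50, 500):
        print("neighbors", k, local_prediction_accuracy(series, m=2, k=k))
    for m in (1, 2, 3, 4):
        print("embedding dimension", m, local_prediction_accuracy(series, m=m, k=5))
    for lead in (1, 2, 4, 8):
        print("lead time", lead, local_prediction_accuracy(series, m=2, k=5, lead=lead))
```

In such an experiment, best predictions with a small number of neighbors and a rapid loss of accuracy with lead time are the kind of behavior usually read as pointing towards low-dimensional deterministic dynamics, whereas best predictions with a very large number of neighbors point towards stochastic modeling.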
On the contrary, since finite correlation dimensions
may be observed even for linear stochastic processes,
it is necessary to confirm the absence of linearity in
the data to verify the results achieved using the above
methods. One possible approach to achieve this is to
reject a null hypothesis that the data could be the
outcome of a linear stochastic process. Such an
approach, popularly known as the surrogate data
method (e.g. Theiler, 1992a,b), makes use of the
substitute data generated in accordance with the probabilistic structure underlying the original data. This
means that the surrogate data possess some of the
properties specified in a null hypothesis. The rejection
of the null hypothesis can be made based on some
discriminating statistics, such as the correlation
dimension. If the discriminating statistics obtained
for the surrogate data are significantly different from
those of the original time series, then the null hypothesis can be rejected and the original time series may
be considered to have come from a nonlinear process.
However, if the discriminating statistics obtained for
the original data and the surrogate data are not significantly different, then the null hypothesis cannot be
rejected and the original time series is considered to
have come from a linear stochastic process.
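The logic of the surrogate data test can be made concrete with a short sketch. Several choices below are assumptions made for illustration and are not taken from the studies discussed here: the surrogates are generated by Fourier phase randomization (one common way of preserving the linear correlation structure), the discriminating statistic is a nearest-neighbor one-step prediction error rather than the correlation dimension (chosen only to keep the example short), and the significance is expressed as the number of standard deviations separating the original statistic from the surrogate mean.

```python
import numpy as np


def henon_series(n, a=1.4, b=0.3, discard=500):
    """x-component of the Henon map, with an initial transient discarded."""
    x, y = 0.0, 0.0
    out = np.empty(n)
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            out[i - discard] = x
    return out


def phase_randomized_surrogate(x, rng):
    """Surrogate with the same power spectrum as x but random Fourier phases."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spec = np.fft.rfft(x - x.mean())
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spec))
    phases[0] = 0.0                 # keep the zero-frequency term real
    if n % 2 == 0:
        phases[-1] = 0.0            # the Nyquist term must stay real for even n
    surrogate = np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=n)
    return surrogate + x.mean()


def prediction_error(x, m=2):
    """RMS error of one-step forecasts copied from the nearest neighbor in the first half."""
    x = np.asarray(x, dtype=float)
    vecs = np.column_stack([x[i: len(x) - m + i] for i in range(m)])
    targets = x[m:]                                   # value following each vector
    half = len(vecs) // 2
    dist = np.abs(vecs[half:, None, :] - vecs[None, :half, :]).max(axis=2)
    nearest = dist.argmin(axis=1)
    return float(np.sqrt(np.mean((targets[:half][nearest] - targets[half:]) ** 2)))


def surrogate_significance(x, n_surrogates=19, seed=0):
    """Distance, in surrogate standard deviations, between data and surrogate statistics."""
    rng = np.random.default_rng(seed)
    q_data = prediction_error(x)
    q_surr = np.array([prediction_error(phase_randomized_surrogate(x, rng))
                       for _ in range(n_surrogates)])
    return abs(q_data - q_surr.mean()) / q_surr.std(ddof=1)


if __name__ == "__main__":
    print("significance for Henon data:", surrogate_significance(henon_series(1000)))
```

A large significance value would lead to rejection of the null hypothesis of a linear stochastic process, while a small value would mean that the null hypothesis cannot be rejected on the basis of the chosen statistic.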
To the author’s knowledge, only two studies
employed the concept of surrogate data in order to
confirm the absence of linear stochasticism in
hydrological data (Koutsoyiannis and Pachakis,
1996; Sivakumar et al., 1999a). Koutsoyiannis and
Pachakis (1996) generated a synthetic time series
using a stochastic model capable of preserving
important properties of the rainfall process, such as
intermittency, seasonality and scaling behavior.
Based on the correlation dimensions obtained for
both the original and the synthetic data, they
concluded that a synthetic rainfall series might be
practically indistinguishable from a historic time
series even if one used tools of the chaotic dynamics
theory to characterize the rainfall time series. Their
study, though rejecting the possibility of chaos in
hydrological processes, drew its conclusion only
based on the results obtained from the correlation
dimension method and, therefore, is similar to most
of the other studies, as it failed to verify the results
using other methods. Recently, Sivakumar et al.
(1999a), in their investigation of the daily rainfall
data in Singapore, generated surrogate data sets
preserving the major probabilistic characteristics of
the original data. The (number of) zero and non-zero
values in the rainfall data were modeled using a
Bernoulli random variable. The high significance
values of the statistic (correlation dimension) indicated that the null hypothesis (i.e. the data arose
from a linear stochastic process) could be rejected
and hence the original (rainfall) data were possibly
derived from a nonlinear process (for further details,
see Sivakumar et al., 1999a). The results provided
additional evidence, to those obtained using the
correlation dimension method and the nonlinear
prediction method, regarding the existence of chaos
in the rainfall data. Also, the observation of no
saturation of the correlation exponent for the surrogate data sets, having almost the same number of
zeros and non-zeros as the original data, reveals that
the low correlation dimensions obtained for the original
rainfall data are not due to the presence of a large
number of zeros, an issue raised above in Section 3.2.4.
4.2. Is a large data size necessary?
One reason for the general belief that a large data
size is required for the correlation dimension estimate
is the assumption that the data size is a function of the
embedding dimension used to obtain the vectors by
phase–space reconstruction (Smith, 1988; Nerenberg
and Essex, 1990). However, this is not entirely true,
since the data size required may depend largely on the
dynamics of the phenomenon. In practice, for a
particular data size, the number of reconstructed
vectors may not differ much whether an embedding
dimension of, for example, 4 or 10 is used. For
example, according to Nerenberg and Essex (1990),
for a four and ten-dimensional embedding phase–
space the number of points required is, respectively,
about 4000 ($10^{2+0.4 \times 4}$) and 1,000,000 ($10^{2+0.4 \times 10}$).
However, a fairly accurate estimation of the correlation dimension can be obtained with as low as 5000
points even for an embedding dimension of 10, as a
large scaling region is evident in the correlation
dimension plots shown in Fig. 1 for an artificial
(noise-free) chaotic (Henon) data. It seems, in this
case, that accurate estimation of the correlation dimension may be obtained for even higher embedding
dimensions. These observations suggest that the data
size may not be a function of the embedding dimension.
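The kind of computation behind local-slope plots for Henon data can be sketched as follows. This is a minimal illustration in the spirit of the Grassberger–Procaccia correlation sum, not the code used to produce the results discussed here: the data size of 2000 points, the maximum norm, the range of radii, and all function names are assumptions made for the example.

```python
import numpy as np


def henon_series(n, a=1.4, b=0.3, discard=500):
    """x-component of the Henon map, with an initial transient discarded."""
    x, y = 0.0, 0.0
    out = np.empty(n)
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            out[i - discard] = x
    return out


def embed(x, m, tau=1):
    """Delay vectors (x[j], x[j+tau], ..., x[j+(m-1)tau]) as rows."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(m)])


def correlation_sum(vecs, radii):
    """Fraction of distinct vector pairs closer than each radius (maximum norm)."""
    n = len(vecs)
    dist = np.zeros((n, n))
    for col in vecs.T:                                   # build max-norm distances
        np.maximum(dist, np.abs(col[:, None] - col[None, :]), out=dist)
    pair_dist = np.sort(dist[np.triu_indices(n, k=1)])   # each pair counted once
    return np.searchsorted(pair_dist, radii, side="right") / pair_dist.size


def correlation_exponent(x, m, radii, tau=1):
    """Least-squares slope of log C(r) versus log r, and the local slopes."""
    c = correlation_sum(embed(np.asarray(x, dtype=float), m, tau), radii)
    log_r, log_c = np.log(radii), np.log(c)
    local_slopes = np.gradient(log_c, log_r)
    slope = np.polyfit(log_r, log_c, 1)[0]
    return slope, local_slopes


if __name__ == "__main__":
    series = henon_series(2000)
    radii = np.logspace(-2.0, np.log10(0.2), 12)
    for m in range(1, 7):
        slope, _ = correlation_exponent(series, m, radii)
        print("m =", m, "correlation exponent =", round(slope, 2))
```

The single least-squares slope is only a summary; in practice the local slopes should be inspected for a plateau (scaling region) before any value is reported, and the range of radii used here would need to be adapted to the data at hand.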
Fig. 1. Local slopes versus log r for Henon data.
The argument that the data size required for the
correlation dimension estimate may not be a function
of the embedding dimension can also be explained as
follows. Assuming that we have a time series (Henon
data) of dimension $d = 1.22$, in all embedding dimensions $m < 1.22$, the object is space filling. Thus, for
$m < 2$, $d = m$, while for $m \geq 2$, $d = 1.22$. Thus, when the
correlation exponent is plotted against the embedding
dimension, its first deviation from the diagonal
(i.e. $d = 1.22$ starting at $m = 2$ and remaining
constant for higher values of $m$) should provide an
estimate of the correlation dimension (Fig. 2(a)).
This, however, may not be what is usually observed
when data from measurements or from known
dynamical systems are analyzed. For such systems,
such as rainfall, the correlation exponent may deviate
from the diagonal for values of $m < 2$ (e.g. $m = 1$)
and gradually increase with an increase in embedding
dimension up to a certain value (e.g. $m = 10$), which is
higher than the minimum dimension required ($m = 2$)
to embed the attractor (Fig. 2(b)). Surely, in such
cases, the first deviation from the diagonal does not
correspond to the dimension of the underlying
attractor for which one needs to go to higher embedding dimensions. This is the case in most of the
studies employing the correlation dimension method
for hydrological data (e.g. Tsonis et al., 1993; Islam et
al., 1993; Jayawardena and Lai, 1994; Sangoyomi et
al., 1996; Koutsoyiannis and Pachakis, 1996; Porporato and Ridolfi, 1996; Wang and Gan, 1998; Sivakumar et al., 1998, 1999a). Also, this is the reason for the
proposal of minimum and sufficient dimensions of the
embedding phase–space (or number of variables) to
model the dynamics of the system (e.g. Fraedrich,
1986), rather than $2d + 1$ dimensions (e.g. Takens,
1981) or $d + 1$ dimensions (e.g. Abarbanel et al.,
1990).
The above observations imply that the minimum
data size required for the correlation dimension estimation may largely depend on the type and dimension
of the attractor, rather than the embedding dimension.
The calculations and derivations, presented thus far,
relating the data size and the embedding dimension
may be valid only for $m < d$. The need for data size
may increase when $m > d$, but at a much slower rate.
Therefore, in cases where saturation is observed for
$m > d$, one may need $N_{\min} \sim f(d)$ points rather than
$N_{\min} \sim f(m)$ points (provided $m$ is not much larger
than d). However, this conclusion may not be valid
for every dynamical system or data set.
According to Lorenz (1991), different variables
could yield different estimates of correlation dimension and suitably selected variables could sometimes
yield a fairly good estimate even if the number of
points were not large. Such an interpretation was
also supported by Zeng and Pielke (1993), who
reported that the finding of low-dimensional atmospheric attractors
might reflect the weak nonlinear interaction between
the analyzed variable and the other variables in the
atmosphere.
Fig. 2. Relationship between correlation exponent and embedding dimension for: (a) Henon data; (b) Singapore rainfall data.
Islam et al. (1993) offered an explanation
that if the single variable time series chosen for
analysis depended on physical constraints and thresholds, then its correlation dimension would be significantly less than that of the underlying dynamical
system. They argued that this could be the reason,
why variables like pressure and vertical wind velocity
yielded high correlation dimension values, while
derived variables, like sunshine duration and rainfall,
resulted in low dimension estimates. While attempting to compare the behavior of rainfall with that of the
vertical wind velocity, they reported correlation
dimension values of about 1.5 for the rainfall data,
and an infinite dimension for the vertical wind
velocity data. Their studies suggest that a low number
of variables, resulting from a low correlation dimension, may capture the important dynamical aspects of
the analyzed time series rather than the entire underlying dynamical system. The results from the studies
investigating chaos in hydrological processes seem to
indicate that such processes may be strongly coupled
with only a few dominant variables of the underlying
systems. Therefore, it is more important to try to identify
those strongly coupled variables than to worry about
data requirements.
Table 1
Results of correlation dimension analysis: daily rainfall data in Singapore with different data sizes

| Record length (year) | Number of data points | Time delay (day) | Correlation dimension | Minimum embedding dimension | Sufficient embedding dimension |
|---|---|---|---|---|---|
| 30 | 10 958 | 10 | 1.01 ± 0.02 | 2 | 12 |
| 20 | 7305 | 10 | 1.03 ± 0.03 | 2 | 16 |
| 10 | 3653 | 12 | 1.03 ± 0.03 | 2 | 15 |
| 5 | 1826 | 8 | 1.03 ± 0.03 | 2 | 16 |
| 4 | 1461 | 8 | 1.01 ± 0.03 | 2 | 16 |
| 3 | 1096 | 7 | 1.01 ± 0.04 | 2 | 15 |
| 2 | 731 | 8 | 0.91 ± 0.08 | 1 | 14 |
| 1 | 365 | 5 | 0.87 ± 0.06 | 1 | 16 |

In the absence of clear-cut guidelines, one reasonable
way to determine the minimum data size is to compute
the correlation dimensions for different sample sizes
until significant changes are observed below a certain
sample size (e.g. Rodriguez-Iturbe et al., 1989;
Lorenz, 1991; Tsonis, 1992; Tsonis et al., 1993).
Such an approach has been employed by Sivakumar
et al. (1998, 1999a) in their analysis of the daily rainfall data observed in Singapore, where the correlation
dimensions were estimated for data of 30, 20, 10, 5, 4,
3, 2, and 1 years from each of six stations in Singapore. The dimension results achieved for data from
one of the stations (Station 05) are presented in
Table 1. The results indicate that, in general, significant variations in the dimension of rainfall data seem
to occur when the rainfall record length is less than
4 years (equivalent to 1461 points), suggesting that
the minimum number of data points essential to
reasonably represent the dynamics of the daily rainfall
process in Singapore might be taken to be about 1500.
Although this does not provide a general guideline on
the minimum data size, as it depends on the properties
of the attractor, the results indicate that such an analysis would be very useful for determining an approximate data size for the computation of the correlation
dimension. The minimum data size estimated (about
1500 or equivalent to 4 years) by Sivakumar et al.
(1998, 1999a) for the computation of the correlation
dimension of the daily rainfall data in Singapore
seems to be reasonable since there is no significant
variation in rainfall observed in Singapore and, therefore, a record length of about 4 years is sufficient to
reasonably represent the dynamics of the daily rainfall
process. Having this in mind, it can be suggested
that the data size cannot be the reason for the low
correlation dimension achieved for the Singapore
rainfall data, as record lengths much longer than
4 years (i.e. 30 years) yielded almost the same dimensions. Given that almost all the studies investigating
the existence of chaos in hydrological processes used
at least a few thousand points (e.g. Rodriguez-Iturbe et al., 1989; Sharifi et al., 1990; Jayawardena
and Lai, 1994; Sangoyomi et al., 1996; Porporato and
Ridolfi, 1996, 1997; Sivakumar et al., 1998, 1999a),
the low correlation dimensions achieved might not be
due to the data size used in the analysis but could
actually be representations of the true dimensions of
the processes investigated. Therefore, criticisms that
the results of low correlation dimensions achieved for
hydrological processes are due to the small data size
used (e.g. Ghilardi and Rosso, 1990) may not always
be correct.
4.3. Noise has more influence on prediction than
dimension estimation
As mentioned previously, noise affects the performance of many techniques of identification and
prediction of chaotic deterministic systems. The
influence of noise on the outcomes of studies investigating chaos in hydrological processes can be readily
explained from the correlation dimension and
prediction accuracy results reported by the studies.
The observations of: (1) small scaling region in the
correlation dimension plots (e.g. Sangoyomi et al.,
1996; Porporato and Ridolfi, 1996, 1997); and (2)
low prediction accuracy even for short lead times
(e.g. Jayawardena and Lai, 1994; Sivakumar et al.,
1999a) indicate only less than perfect characteristics
of chaos.

Table 2
Comparison of results for noise-free and noise added Henon data: correlation dimension and prediction accuracy

| Noise level (%) | Correlation dimension | Prediction accuracy: correlation coefficient | Prediction accuracy: number of neighbors | Prediction accuracy: optimal embedding dimension | Prediction accuracy: maximum lead time |
|---|---|---|---|---|---|
| 0 | 1.22 | 0.97 | 20 | 4 | 7 |
| 4 | 1.26 | 0.86 | 40 | 4 | 4 |
| 8 | 1.31 | 0.72 | 70 | 4 | 3 |
| 16 | 1.34 | 0.55 | 160 | 4 | 1 |

Though such problems have already been
identified (e.g. Jayawardena and Lai, 1994), most of
the studies failed to either investigate the extent of its
influence or to reduce it. One possible reason for this
is that dealing with the problem of noise present in
hydrological data is not a straightforward task due to
the lack of prior information on: (1) the level and the
nature of noise; and (2) the noise-free signal and the
dynamics of the system. As a result: (1) it is difficult to
determine the extent of the influence of noise on data
analysis; and (2) the appropriate noise reduction
method and the extent of improvement that can be
achieved after noise reduction are difficult to determine.
To the author’s knowledge, the tremendous task of
reducing the noise present in a hydrological time
series was first attempted by Porporato and Ridolfi
(1997), who employed a simple noise reduction
method developed by Schreiber and Grassberger
(1991) to the (chaotic) flow series of the river Dora
Baltea (Porporato and Ridolfi, 1996, 1997). In their
study, a local averaging procedure was applied
iteratively until the mean absolute corrections
between successive iterations became insignificant.
The procedure was stopped after 200 iterations since
it was noted that above 200 iterations an unjustifiable
calculation time was necessary to produce significant
corrections. The improvements achieved in the
estimates of correlation dimension and prediction
accuracy for the noise-reduced river flow series are
indeed encouraging.
However, Sivakumar et al. (1999b,c) identified
some of the potential problems in the noise reduction
method of Schreiber and Grassberger (1991) applied
by Porporato and Ridolfi (1997), or any other method
for that matter, to hydrological time series. They
stressed the importance of the determination of the
level of noise present in the hydrological time series,
because of the problems faced in the selection of the
optimal values of the parameters involved in the
method, such as the size of the neighborhood, and
the number of iterations of the procedure required to
achieve optimal noise reduction. Subsequently, they
also demonstrated that, in the absence of prior
knowledge on the level of noise in the time series,
the application of the noise reduction method could
have serious consequences, as the deterministic
component that drives the dynamics of the system
might also be removed. To overcome such problems,
they proposed a systematic noise reduction approach,
by coupling a noise level determination method
(Schouten et al., 1994) and a noise reduction method
(Schreiber, 1993). The approach was demonstrated
first on different levels (4, 8, and 16%) of an additive
and uniformly distributed noise-added artificial
chaotic (Henon) time series and then tested on a
hydrological time series, the daily rainfall data
observed in Singapore. The prediction accuracy was
considered as the main diagnostic tool to verify the
success of the noise reduction, since the prediction
accuracy can be determined without any knowledge
of the noise-free signal or the underlying dynamics of
the system and is also sensitive to under- or over-removal of noise. The correlation dimension was
used as a supplementary tool. An important feature
of this approach is that the noise reduction results
themselves may provide some guidelines on the
most probable level of noise present in the data. In
the following paragraphs, some of the important
results achieved by Sivakumar et al. (1999c) are highlighted.
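The idea of noise reduction by local averaging in the reconstructed phase–space can be sketched as follows. The code is an illustration in the spirit of the simple locally averaging method of Schreiber (1993), not a reproduction of it or of the systematic approach described above: the window of two past and two future values, the neighborhood radius of twice the noise half-range, the single pass, and the definition of the noise level as the ratio of the standard deviation of the noise to that of the signal are all assumptions made here.

```python
import numpy as np


def henon_series(n, a=1.4, b=0.3, discard=500):
    """x-component of the Henon map, with an initial transient discarded."""
    x, y = 0.0, 0.0
    out = np.empty(n)
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            out[i - discard] = x
    return out


def simple_noise_reduction(x, half_width=2, radius=0.2):
    """One pass of local averaging: each value is replaced by the mean of the
    central values of all delay vectors lying within `radius` (maximum norm)."""
    x = np.asarray(x, dtype=float)
    m = 2 * half_width + 1                        # past, present and future values
    n = len(x) - m + 1
    vecs = np.column_stack([x[i: i + n] for i in range(m)])
    dist = np.zeros((n, n))
    for col in vecs.T:                            # build max-norm distances
        np.maximum(dist, np.abs(col[:, None] - col[None, :]), out=dist)
    neighbors = (dist <= radius).astype(float)    # every vector is its own neighbor
    centers = vecs[:, half_width]
    cleaned = x.copy()                            # the first and last values stay unchanged
    cleaned[half_width: half_width + n] = (neighbors @ centers) / neighbors.sum(axis=1)
    return cleaned


def rms(values):
    """Root mean square of an array."""
    return float(np.sqrt(np.mean(np.square(values))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = henon_series(2000)
    noise_level = 0.08                            # assumed: std(noise) / std(signal)
    half_range = noise_level * clean.std() * np.sqrt(3.0)   # uniform on [-h, h] has std h/sqrt(3)
    noisy = clean + rng.uniform(-half_range, half_range, clean.size)
    reduced = simple_noise_reduction(noisy, half_width=2, radius=2.0 * half_range)
    print("RMS error before:", rms(noisy - clean), "after:", rms(reduced - clean))
```

The comparison against the noise-free series is possible only for artificial data. For measured series no such reference exists, which is why the neighborhood size and the number of iterations are difficult to select without an estimate of the noise level.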
Table 3
Noise reduction results for Henon data: correlation dimension and prediction accuracy

| Noise level (%) | Correlation dimension: original | Correlation dimension: noise-reduced | Correlation coefficient: original | Correlation coefficient: noise-reduced |
|---|---|---|---|---|
| 0 | 1.22 | – | 0.97 | – |
| 4 | 1.26 | 1.23 | 0.86 | 0.93 |
| 8 | 1.31 | 1.23 | 0.72 | 0.88 |
| 16 | 1.34 | 1.26 | 0.55 | 0.79 |

A summary of the correlation dimension and
prediction accuracy estimates obtained for the noise-free and different levels of noisy Henon data is shown
in Table 2, whereas Table 3 presents a comparison of
the correlation dimension and prediction accuracy
results obtained for noisy- and noise-reduced data.
The results shown for the noise-reduced data, in
Table 3, are those achieved at the optimal level of
noise reduction obtained through the systematic
noise reduction approach (see Sivakumar et al.
(1999c) for details). The results indicate that, in the
presence of noise: (1) an overestimation of the correlation dimension occurs, and the dimension increases
as the noise level increases; (2) the prediction
accuracy decreases with an increase in the noise
level; (3) when the noise level increases, a relatively
large number of neighbors (indicating stochastic
modeling) is required to obtain the best predictions;
(4) the prediction accuracy decreases when the
embedding dimension is increased beyond the optimal
embedding dimension; and (5) the lead time for which
good predictions are possible decreases with an
increase in the noise level. An important observation
from the results is that while the presence of even
small levels of noise significantly influences the
prediction accuracy estimates, the correlation
dimension estimates do not seem to be significantly
influenced even when the noise levels are high. This
suggests that: (1) the correlation dimension method
may be used as a preliminary approach for the investigation of the existence of chaos in hydrological data,
before attempting detailed analysis such as application of noise reduction procedures; and (2) the
nonlinear prediction method may not provide accurate
results when applied to hydrological data, unless the
noise is reduced. The results achieved for the noise-reduced data indicate that: (1) the correlation
dimension estimates achieved for the noise-reduced
data are very close to those of the noise-free data;
and (2) in general, significant improvement in the
prediction accuracy estimates is achieved after noise
reduction. These observations suggest that the
application of a noise reduction procedure is always
desirable before employing any of the chaos identification methods, but may be necessary if the nonlinear
prediction method is employed.
Table 4
Noise reduction results for Singapore rainfall data: correlation dimension, prediction accuracy, and most probable noise level

| Station no. | Correlation dimension: original | Correlation dimension: noise-reduced | Correlation coefficient: original | Correlation coefficient: noise-reduced | Most probable noise level (%) |
|---|---|---|---|---|---|
| 05 | 1.02 ± 0.02 | 0.97 ± 0.01 | 0.291 | 0.484 | 6.9–11.5 |
| 07 | 1.03 ± 0.03 | 0.96 ± 0.01 | 0.248 | 0.431 | 8.3–13.8 |
| 22 | 1.06 ± 0.03 | 1.02 ± 0.03 | 0.262 | 0.410 | 8.0–13.3 |
| 23 | 1.03 ± 0.02 | 1.01 ± 0.01 | 0.421 | 0.587 | 7.0–10.5 |
| 31 | 1.02 ± 0.02 | 0.95 ± 0.02 | 0.326 | 0.478 | 9.2–15.3 |
| 43 | 1.03 ± 0.02 | 0.97 ± 0.01 | 0.288 | 0.471 | 9.6–14.4 |

Table 4 presents a summary of the correlation
dimension and prediction accuracy results achieved
for the original and noise-reduced Singapore rainfall
data. The ranges of the most probable levels of noise
in the rainfall data observed in the six stations are also
presented in Table 4. The results achieved for the
original and noise-reduced rainfall data indicate that:
(1) the correlation dimension estimates for the rainfall
data are not significantly affected by noise reduction;
and (2) significant improvements are achieved in the
prediction accuracy estimates after noise reduction.
These results could possibly have the following implications: (1) the influence of noise on the correlation
dimension estimate is not significant; and (2) the
presence of noise has significant effects on the
prediction results. It may be intrinsically impossible
to provide strong proof for these results, since the
noise-free rainfall data is not available, but the
emphasis is that such implications cannot be excluded
altogether. All these observations suggest that the
application of a noise reduction procedure is always
desirable before employing any of the chaos identification methods, but may be necessary if the nonlinear
prediction method is employed.
The most probable noise levels estimated for the
rainfall data observed in the six stations in Singapore
are in the range between 6.9 and 15.3%. The magnitudes of the estimated noise levels seem to be quite
reasonable considering the fact that the rainfall
measurement is influenced by a large number of
factors, such as wind, wetting, evaporation, gage
exposure, instrumentation, and human error in reading
rainfall data. The data used in the study is also influenced by a certain imprecision due to the round-off
errors resulting from the conversion of the hourly data
to daily data. The noise level range achieved in the
present study, using the systematic noise reduction
approach proposed, is in good agreement with the
one observed by other means (e.g. Sevruk, 1996).
The presence of noise in the rainfall data in the
order of the above magnitudes could significantly
influence the outcomes of the chaos identification
methods, in particular nonlinear prediction, as is
evident from the results obtained for the Henon
data, particularly with noise levels of 8 and 16%.
This could be the main reason for the low prediction
accuracy achieved for the (noisy) original rainfall data
(Table 4, see also Sivakumar et al. (1999a)). The
significant improvement in the prediction results
achieved for the noise-reduced rainfall data (Table
4) provides additional support to the above. However,
the failure to achieve accurate predictions could be
attributed to the following: (1) the noise reduction
method (of Schreiber (1993)) might not be so effective
when the noise levels are high; (2) the noise levels
estimated might only be the most probable noise
levels and not the exact ones; and (3) the dynamical
noise might also have some influence on prediction,
but unfortunately could not be studied using the noise
reduction method of Schreiber (1993).
Also, it is very important to note that even a simple
additive, independent, and uniformly distributed noise
could have significant influence on the prediction
accuracy, as the results achieved for the Henon data
indicate. Since the noise present in the rainfall data
might not be of the simple additive, independent, and
uniformly distributed type, but could be more
complicated or even a combination of several types,
this only underlines the uncertainties of the problem one
is dealing with and the possible extent of the influence of noise.
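The sensitivity of prediction accuracy to simple additive noise can be illustrated with a small numerical experiment on the Henon map. The following is only a sketch under stated assumptions: the noise level is taken here as the ratio of the noise standard deviation to the signal standard deviation, the predictor is a plain k-nearest-neighbour average standing in for the local approximation methods used in the cited studies, and all function names and parameter values (series length, embedding dimension 2, delay 1, k = 3) are hypothetical choices.

```python
import numpy as np


def henon_series(n, a=1.4, b=0.3, discard=100):
    """x-component of the Henon map after discarding an initial transient."""
    x, y = 0.0, 0.0
    out = np.empty(n)
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            out[i - discard] = x
    return out


def add_uniform_noise(series, level, rng):
    """Add independent uniform noise whose std is `level` times the signal std."""
    # A uniform variable on [-h, h] has standard deviation h / sqrt(3).
    half_width = level * series.std() * np.sqrt(3.0)
    return series + rng.uniform(-half_width, half_width, size=series.size)


def embed(series, m, tau):
    """Delay-coordinate vectors (s[j], s[j + tau], ..., s[j + (m - 1) * tau])."""
    n_vec = series.size - (m - 1) * tau
    return np.column_stack([series[i * tau: i * tau + n_vec] for i in range(m)])


def prediction_correlation(series, m=2, tau=1, k=3):
    """Correlation between one-step-ahead predictions and observed values.

    The first half of the series is the learning set; each vector in the
    second half is predicted by averaging the successors of its k nearest
    neighbours in the learning set.
    """
    vectors = embed(series, m, tau)[:-1]      # vectors that have a successor
    targets = series[(m - 1) * tau + 1:]      # value one step after each vector
    half = vectors.shape[0] // 2
    train_v, train_t = vectors[:half], targets[:half]
    test_v, test_t = vectors[half:], targets[half:]
    dist = np.linalg.norm(test_v[:, None, :] - train_v[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    predicted = train_t[nearest].mean(axis=1)
    return np.corrcoef(predicted, test_t)[0, 1]


rng = np.random.default_rng(0)
clean = henon_series(2000)
for level in (0.0, 0.08, 0.16):
    noisy = add_uniform_noise(clean, level, rng)
    print(f"noise level {level:.0%}: r = {prediction_correlation(noisy):.3f}")
```

Because the noise enters both the learning set and the values being predicted, the correlation coefficient falls as the noise level grows, even though the underlying dynamics are unchanged.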
However, the small variations between the
correlation dimension estimates for the
original and noise-reduced rainfall data seem to
imply that the influence of noise on the correlation
dimension is fairly small, if not negligible. The
consistency of these observations with those for the
artificial chaotic (Henon) data provides additional support for such implications. The improvements
in the correlation dimension (larger scaling
region) and prediction accuracy (higher correlation
coefficient) estimates for the noise-reduced
Singapore rainfall data confirm and reinforce, with
greater clarity and consistency, the evidence found in
the earlier studies (Sivakumar et al., 1998, 1999a)
regarding the existence of chaos.
The above observations suggest that the low prediction accuracy reported in past studies investigating
chaos in hydrological processes (e.g. Jayawardena
and Lai, 1994; Sivakumar et al., 1999a) could be due
to the presence of noise and, therefore, the reported
results regarding the existence of chaos might be
acceptable.
4.4. Delay time
Fig. 3. Correlation exponent versus embedding dimension for various delay time values: Singapore rainfall data.
With respect to the delay time, τ, a possible criticism
of the low correlation dimensions reported by past
studies investigating chaos in hydrological processes
could be that the delay time used was not appropriate,
in other words that it was too small, because a small τ may
result in an underestimation of the dimension.
However, this is not necessarily the case, as most of
the studies used an appropriate τ computed using one
of the widely accepted methods. Among these, most
employed the autocorrelation function method, and τ
was taken as the lag time at which the autocorrelation
function first crossed the zero line (e.g. Jayawardena
and Lai, 1994; Sangoyomi et al., 1996; Porporato and
Ridolfi, 1996, 1997; Wang and Gan, 1998; Sivakumar
et al., 1998, 1999a). In fact, a few of these studies
employed more than one method to verify the results.
For example, Sangoyomi et al. (1996) used the autocorrelation function method and the mutual information method to determine the appropriate τ for the
volume time series of the Great Salt Lake, and
found no significant difference between the τ values
obtained. The autocorrelation function method yielded
a τ value of 13, whereas a value of τ between 9 and 13
was obtained using the mutual information method.
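The autocorrelation criterion described above (take the delay as the first lag at which the autocorrelation function crosses the zero line) can be sketched as follows. This is an illustrative implementation, not code from any of the cited studies: the function names are hypothetical, the biased sample autocorrelation estimator is an assumed choice, and the sine wave is a synthetic stand-in for a hydrological series.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Sample autocorrelation function for lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:x.size - k], x[k:]) / denom
                     for k in range(max_lag + 1)])


def first_zero_crossing_lag(x, max_lag=None):
    """First lag at which the autocorrelation function is zero or negative.

    Returns None when the function stays positive up to max_lag.
    """
    if max_lag is None:
        max_lag = len(x) // 2
    acf = autocorrelation(x, max_lag)
    crossing = np.nonzero(acf <= 0.0)[0]
    return int(crossing[0]) if crossing.size else None


# Synthetic example: a sine wave with a period of 42 samples.
t = np.arange(4200)
series = np.sin(2.0 * np.pi * t / 42.0)
print("delay from first zero crossing:", first_zero_crossing_lag(series, max_lag=100))
```

For the synthetic sine wave the selected delay falls close to a quarter of the period, which is where the autocorrelation of a periodic signal first changes sign.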
Rodriguez-Iturbe et al. (1989) recommended, in the
absence of clear-cut guidelines, estimating correlation
dimensions using different delay time values to
ascertain its effect. This approach was employed by
Sivakumar et al. (1999a) to investigate the effect of
delay time on the correlation dimension estimates for
the daily rainfall data observed in Singapore. For
rainfall data from one of the stations (Station 05) in
Singapore, the delay time computed using the
autocorrelation function method was 10 days, and
the correlation dimension obtained was about 1.01.
Subsequently, using other τ values of 1, 2, 8, 12,
20, and 50 days, they observed an underestimation
or overestimation of the dimension when τ was
considerably smaller or larger than 10 days. Fig. 3
shows the relationship between the correlation exponent values and embedding dimension values for 30 years of data from one of the stations (Station 05) in
Singapore with different values of τ. Based on these
results, they recommended the selection of the lag
time at which the autocorrelation function first crosses
the zero line, if the autocorrelation function method is
used. These observations suggest that the low correlation dimensions reported by past studies investigating
chaos in hydrological processes, which employed
the autocorrelation function method to compute τ for
the phase-space reconstruction, are unlikely to be a result of
the selection of an inappropriate τ, and could be the
true dimensions of the processes investigated.
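The kind of check described in this section, recomputing the correlation exponent for several embedding dimensions and delay times, can be sketched with a basic correlation-sum estimate in the spirit of Grassberger and Procaccia (1983a). The code below is a simplified illustration under stated assumptions: the Henon map stands in for the rainfall data (which are not reproduced here), the maximum norm and the range of radii are arbitrary choices, no correction for temporally correlated pairs is applied, and all function names are hypothetical.

```python
import numpy as np


def henon_series(n, a=1.4, b=0.3, discard=100):
    """x-component of the Henon map after discarding an initial transient."""
    x, y = 0.0, 0.0
    out = np.empty(n)
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            out[i - discard] = x
    return out


def embed(series, m, tau):
    """Delay-coordinate vectors (s[j], s[j + tau], ..., s[j + (m - 1) * tau])."""
    n_vec = series.size - (m - 1) * tau
    return np.column_stack([series[i * tau: i * tau + n_vec] for i in range(m)])


def correlation_sum(vectors, radii):
    """Fraction of distinct vector pairs closer than each radius (maximum norm)."""
    n = vectors.shape[0]
    dist = np.zeros((n, n))
    for j in range(vectors.shape[1]):
        col = vectors[:, j]
        # Accumulate the largest coordinate-wise difference for every pair.
        np.maximum(dist, np.abs(col[:, None] - col[None, :]), out=dist)
    pair_dist = dist[np.triu_indices(n, k=1)]
    return np.array([(pair_dist < r).mean() for r in radii])


def correlation_exponent(series, m, tau, radii):
    """Slope of log C(r) against log r over the supplied radii."""
    c = correlation_sum(embed(series, m, tau), radii)
    slope, _ = np.polyfit(np.log(radii), np.log(c), 1)
    return slope


series = henon_series(1500)
radii = np.logspace(np.log10(0.02), np.log10(0.2), 10)
for tau in (1, 2):
    for m in (1, 2, 3):
        nu = correlation_exponent(series, m, tau, radii)
        print(f"tau = {tau}, m = {m}: correlation exponent = {nu:.2f}")
```

Plotting the exponent against the embedding dimension for each delay gives a diagram of the same type as Fig. 3. In practice the scaling region should be selected by inspecting the log C(r) versus log r curve rather than fixed in advance as it is here.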
5. Summary and conclusions
Though the science of chaos has been receiving
considerable attention in hydrology, there have also
been widespread criticisms of the application of
chaos theory in hydrology and suspicions about studies
reporting the existence of chaos in hydrological
processes. Important reasons for this are: (1) the
assumptions with which the chaos identification
methods have been developed, i.e. infinite and
noise-free time series; and (2) the inability of the
investigative methods to provide irrefutable proof
regarding the existence of chaos. The fact that hydrological time series are always finite and are inherently
contaminated by noise, such as errors arising from
measurements, necessitates addressing the above
issues in the application of chaos theory in hydrology
and, therefore, formed the basis for this paper. The
paper followed a systematic approach to address
these issues by: (1) reviewing some of the important
studies investigating the existence of chaos in hydrological processes; (2) presenting the critical issues that
have been raised in the application of chaos theory in
hydrology; and (3) discussing some of the notable
results reported by past studies and providing possible
interpretations to those. The important conclusions
are as follows.
Due to the limitations of each of the chaos identification methods, it may not be possible to provide a
definitive resolution of whether or not hydrological
processes exhibit chaotic behavior based on the
results achieved from the application of a single
method. It is necessary to employ diverse techniques
so that we can verify whether the results from the
different methods complement one another and, hence,
confirm the results. Although only the correlation
dimension method was used in most of the studies
investigating existence of chaos in hydrological
processes, some of the studies (e.g. Rodriguez-Iturbe
et al., 1989; Jayawardena and Lai, 1994; Puente and
Obregon, 1996; Porporato and Ridolfi, 1996, 1997;
Sivakumar et al., 1999a) employed more than one
method and reported evidence of chaos indicating
that the reported results could be very meaningful.
The studies indicate that, among the methods available,
the nonlinear prediction method seems to provide
better results, since the existence of chaos can be
identified using three different approaches. In
addition, the method has been found to be effective
even when the data size is small.
The investigations carried out on the issue of the
minimum data size required for the correlation dimension estimation indicate that this issue may not be as
severe as it is believed to be. The minimum data size
may largely depend on the type and dimension of the
attractor and, therefore, it may be possible to obtain
reasonably accurate results even with a small data
size. This suggests that the use of a small data size
for the correlation dimension estimation cannot be
considered as the sole cause for low correlation
dimensions achieved for hydrological processes, as
commented, for example by Ghilardi and Rosso
(1990). Since most of the past studies used at least a
few thousand points for dimension estimation, the
results achieved may be considered reasonable and,
therefore, the dimensions could well be the actual
dimensions of the underlying systems. Regarding
the delay time, the selection of delay time using the
autocorrelation function method, where the delay time
is taken as the lag time at which the autocorrelation
function first crosses the zero line, provides reasonable results on dimension estimation. This indicates
that the dimension estimates reported by past studies
may not be significantly affected due to the problem of
delay time, since most of the studies employed the
autocorrelation function method to determine the
delay time.
The studies on the influence of noise revealed that
the correlation dimension was not significantly influenced by the presence of noise, whereas noise had
significant effect on the prediction accuracy. These
observations, together with the fact that all the past
studies investigating chaos in hydrological data used
the correlation dimension either as a proof or as a
preliminary evidence of chaos, suggested that the
outcomes of such studies might still be valid, though
the influence of noise was not considered. The study
also indicated that the low prediction accuracy
achieved in past studies (e.g. Jayawardena and
Lai, 1994; Sivakumar et al., 1999a) for chaotic hydrological data could well be due to the presence of noise,
and could be significantly improved if the noise were reduced.
On the one hand, the basis for the criticisms, among the
majority of the hydrological community, of studies
investigating and reporting the existence of chaos in
hydrological processes is the strong belief that these processes
are influenced by a large number of variables and,
therefore, are stochastic. On the other hand, the
outcomes of the present study provide strong support
to the claims that the (seemingly) highly irregular
hydrological processes could be the result of simple
deterministic systems with a few degrees of freedom.
Therefore, the hypothesis of chaos in hydrology is
reasonable and can provide an alternative approach
for characterizing and modeling the dynamics of
hydrological processes. There is no doubt that the
significant inroads that have been made in the past
decade into the application of chaos theory in
hydrology will reach out to a wider audience.
References
Abarbanel, H.D.I., Brown, R., Kadtke, J.B., 1990. Prediction in
chaotic nonlinear systems: methods for time series with broadband Fourier spectra. Phys. Rev. A 41 (4), 1782–1807.
Casdagli, M., 1989. Nonlinear prediction of chaotic time series.
Physica D 35, 335–356.
Casdagli, M., 1991. Chaos and deterministic versus stochastic nonlinear modeling. J. R. Stat. Soc. B 54 (2), 303–328.
Farmer, J.D., Sidorowich, J.J., 1987. Predicting chaotic time series.
Phys. Rev. Lett. 59, 845–848.
Fraedrich, K., 1986. Estimating the dimensions of weather and
climate attractors. J. Atmos. Sci. 43 (5), 419–432.
Fraser, A.M., Swinney, H.L., 1986. Independent coordinates for
strange attractors from mutual information. Phys. Rev. A 33
(2), 1134–1140.
Frison, T., 1994. Nonlinear data analysis techniques. In: Deboeck,
G.J. (Ed.). Trading on the Edge: Neural, Genetic, and Fuzzy
Systems for Chaotic Financial Markets, Wiley, New York,
pp. 280–296.
Georgakakos, K.P., Sharifi, M.B., Sturdevant, P.L., 1995. Analysis
of high-resolution rainfall data. In: Kundzewicz, Z.W. (Ed.).
New Uncertainty Concepts in Hydrology and Water Resources,
Cambridge University Press, New York, pp. 114–120.
Ghilardi, P., Rosso, R., 1990. Comment on chaos in rainfall. Water
Resour. Res. 26 (8), 1837–1839.
Grassberger, P., Procaccia, I., 1983a. Measuring the strangeness of
strange attractors. Physica D 9, 189–208.
Grassberger, P., Procaccia, I., 1983b. Estimation of the Kolmogorov
entropy from a chaotic signal. Phys. Rev. A 28, 2591–2593.
Grassberger, P., Hegger, R., Kantz, H., Schaffrath, C., 1993. On
noise reduction methods for chaotic data. Chaos 3 (2), 127–141.
Havstad, J.W., Ehlers, C.L., 1989. Attractor dimension of
nonstationary dynamical systems from small data sets. Phys.
Rev. A 39 (2), 845–853.
Henon, M., 1976. A two-dimensional mapping with a strange
attractor. Commun. Math. Phys. 50, 69–77.
Hense, A., 1987. On the possible existence of a strange attractor for
the southern oscillation. Beitr. Phys. Atmos. 60 (1), 34–47.
Holzfuss, J., Mayer-Kress, G., 1986. An approach to error-estimation
in the application of dimension algorithms. In: Mayer-Kress, G.
(Ed.). Dimensions and Entropies in Chaotic Systems, Springer,
New York, pp. 114–122.
Ikeda, K., 1979. Multiple valued stationary state and its instability
of the transmitted light by a ring cavity system. Opt. Commun.
30, 257–261.
Islam, S., Bras, R.L., Rodriguez-Iturbe, I., 1993. A possible
explanation for low correlation dimension estimates for the
atmosphere. J. Appl. Meteor. 32, 203–208.
Jayawardena, A.W., Lai, F., 1994. Analysis and prediction of chaos
in rainfall and stream flow time series. J. Hydrol. 153, 23–52.
Kantz, H., Schreiber, T., 1997. Nonlinear Time Series Analysis,
Cambridge University Press, Cambridge.
Koutsoyiannis, D., Pachakis, D., 1996. Deterministic chaos versus
stochasticity in analysis and modeling of point rainfall series. J.
Geophys. Res. 101 (D21), 26 441–26 451.
Liebert, W., Schuster, H.G., 1989. Proper choice of the time delay
for the analysis of chaotic time series. Phys. Lett. A 141, 386–
390.
Liu, Q., Islam, S., Rodriguez-Iturbe, I., Le, Y., 1998. Phase-space
analysis of daily streamflow: characterization and prediction.
Adv. Water Resour. 21, 463–475.
Lorenz, E.N., 1963. Deterministic nonperiodic flow. J. Atmos. Sci.
20, 130–141.
Lorenz, E.N., 1991. Dimension of weather and climate attractors.
Nature 353, 241–244.
Mackey, M.C., Glass, L., 1977. Oscillations and chaos in physiological control systems. Science 197, 287–289.
Nerenberg, M.A.H., Essex, C., 1990. Correlation dimension and
systematic geometric effects. Phys. Rev. A 42 (12), 7065–7074.
Osborne, A.R., Provenzale, A., 1989. Finite correlation dimension
for stochastic systems with power-law spectra. Physica D 35,
357–381.
Palus, M., 1995. Testing for nonlinearity using redundancies:
quantitative and qualitative aspects. Physica D 80, 186–205.
Porporato, A., Ridolfi, L., 1996. Clues to the existence of deterministic chaos in river flow. Int. J. Mod. Phys. B 10, 1821–1862.
Porporato, A., Ridolfi, L., 1997. Nonlinear analysis of river flow
time sequences. Water Resour. Res. 33 (6), 1353–1367.
Prichard, D., Theiler, J., 1995. Generalized redundancies for time
series analysis. Physica D 84, 476–493.
Provenzale, A., Osborne, A.R., Soj, R., 1991. Convergence of the
K2 entropy for random noises with power law spectra. Physica D
47, 361–372.
Puente, C.E., Obregon, N., 1996. A deterministic geometric
representation of temporal rainfall: results for a storm in Boston.
Water Resour. Res. 32 (9), 2825–2839.
Ramsey, J.B., Yuan, H.J., 1990. The statistical properties of
dimension calculations using small data sets. Nonlinearity 3,
155–176.
Rapp, P.E., Albano, A.M., Zimmerman, I.D., Jimenez-Montano,
M.A., 1994. Phase-randomised surrogates can produce spurious
identifications of non-random structure. Phys. Lett. A 192, 27–33.
Rodriguez-Iturbe, I., De Power, F.B., Sharifi, M.B., Georgakakos,
K.P., 1989. Chaos in rainfall. Water Resour. Res. 25 (7), 1667–
1675.
Rossler, O.E., 1976. An equation for continuous chaos. Phys. Lett.
A 57, 397–398.
Sangoyomi, T.B., Lall, U., Abarbanel, H.D.I., 1996. Nonlinear
dynamics of the Great Salt Lake: dimension estimation. Water
Resour. Res. 32 (1), 149–159.
Schouten, J.C., Takens, F., van den Bleek, C.M., 1994. Estimation
of the dimension of a noisy attractor. Phys. Rev. E 50 (3), 1851–
1861.
Schreiber, T., 1993. Extremely simple nonlinear noise reduction
method. Phys. Rev. E 47 (4), 2401–2404.
Schreiber, T., Grassberger, P., 1991. A simple noise reduction
method for real data. Phys. Lett. A 160, 411–418.
Schreiber, T., Kantz, H., 1996. Observing and predicting chaotic
signals: is 2% noise too much? In: Kravtsov, Yu.A., Kadtke, J.B.
(Eds.). Predictability of Complex Dynamical Systems, Springer
Series in Synergetics, Springer, Berlin, pp. 43–65.
Schreiber, T., Schmitz, A., 1996. Improved surrogate data for
nonlinearity tests. Phys. Rev. Lett. 77 (4), 635–638.
Schuster, H.G., 1988. Deterministic Chaos, VCH, Weinheim.
Sevruk, B., 1996. Adjustment of tipping-bucket precipitation gage
measurement. Atmos. Res. 42, 237–246.
Sharifi, M.B., Georgakakos, K.P., Rodriguez-Iturbe, I., 1990.
Evidence of deterministic chaos in the pulse of storm rainfall.
J. Atmos. Sci. 47, 888–893.
Sivakumar, B., Liong, S.-Y., Liaw, C.-Y., 1998. Evidence of chaotic
behavior in Singapore rainfall. J. Am. Water Resour. Assoc. 34
(2), 301–310.
Sivakumar, B., Liong, S.-Y., Liaw, C.-Y., Phoon, K.-K., 1999a.
Singapore rainfall behavior: chaotic? J. Hydrol. Engng, ASCE
4 (1), 38–48.
Sivakumar, B., Phoon, K.-K., Liong, S.-Y., Liaw, C.-Y., 1999b.
Comment “on nonlinear analysis of riverflow time series” by
Amilcare Porporato and Luca Ridolfi. Water Resour. Res. 35
(3), 895–897.
Sivakumar, B., Phoon, K.-K., Liong, S.-Y., Liaw, C.-Y., 1999c. A
systematic approach to noise reduction in observed chaotic time
series. J. Hydrol. 219 (3,4), 103–135.
Smith, L.A., 1988. Intrinsic limits on dimension calculations. Phys.
Lett. A 133 (6), 283–288.
Sugihara, G., May, R.M., 1990. Nonlinear forecasting as a way of
distinguishing chaos from measurement error in time series.
Nature 344, 734–741.
Takens, F., 1981. Detecting strange attractors in turbulence. In:
Rand, D.A., Young, L.S. (Eds.). Dynamical Systems and
Turbulence, Lecture Notes in Mathematics, 898. Springer,
Berlin, pp. 366–381.
Theiler, J., Galdrikian, B., Longtin, A., Eubank, S., Farmer, J.D.,
1992a. Using surrogate data to detect nonlinearity in time series.
In: Casdagli, M., Eubank, S. (Eds.). Nonlinear Modeling and
Forecasting, pp. 163–185.
Theiler, J., Eubank, S., Longtin, A., Galdrikian, B., Farmer, J.D.,
1992b. Testing for nonlinearity in time series: the method of
surrogate data. Physica D 58, 77–94.
Tsonis, A.A., 1992. Chaos: from Theory to Applications, Plenum
Press, New York.
Tsonis, A.A., Elsner, J.B., 1988. The weather attractor over very
short timescales. Nature 333, 545–547.
Tsonis, A.A., Elsner, J.B., Georgakakos, K.P., 1993. Estimating the
dimension of weather and climate attractors: important issues
about the procedure and interpretation. J. Atmos. Sci. 50, 2549–
2555.
Waelbroeck, H., Lopez-Pena, R., Morales, T., Zertuche, F., 1994.
Prediction of tropical rainfall by local phase space reconstruction. J. Atmos. Sci. 51 (22), 3360–3364.
Wang, Q., Gan, T.Y., 1998. Biases of correlation dimension
estimates of streamflow data in the Canadian prairies. Water
Resour. Res. 34 (9), 2329–2339.
Wolf, A., Swift, J.B., Swinney, H.L., Vastano, A., 1985. Determining Lyapunov exponents from a time series. Physica D 16, 285–
317.
Zeng, X., Pielke, R.A., 1993. What does a low-dimensional weather
attractor mean? Phys. Lett. A 175, 299–304.