Some Statistical Models of Periodic Phenomenon William A. Sethares March 19, 2006

Models of statistical periodicity do not presume that the signal itself is periodic; rather, they assume that there is a periodicity in the underlying statistical
distributions. For example, a record of temperature versus time might experience
daily and yearly cycles. A record of power consumption in a city might have daily
fluctuations in both mean and variance. Midnight usage might be very predictable
(small variance) while mid-afternoon usage might be highly variable depending
on the amount of air conditioning needed.
Department of Electrical and Computer Engineering, University of Wisconsin-Madison, Madison, WI 53706-1691 USA. 608-262-5669 sethares@ece.wisc.edu
Statistical models of periodicity need not be much more complex than the
ball and urn. This section presents four models that generate repetitive behavior:
the lattice model presumes a fixed grid on which random variations are imposed,
the successive model builds the rungs of its ladder from realizations of positive
random variables, the additive model assumes an underlying periodicity in the
mean of the stochastic process, while the AM model presumes a periodic change in
the variances. The model which generates outputs most resembling the character
of the data is the best choice for a particular application. There are two kinds of
models. In the lattice and successive models, the interval of repetition (i.e., the
period) is random. In the additive and AM models, the period is fixed and the
values of the process are random. The models are compared and contrasted in
Sect. 5.
1 Lattice Model
Perhaps the simplest statistical model with repetitive behavior assumes a fixed
grid of times $\tau + nT$ with period $T$ and phase (or starting
point) $\tau$. A collection of random variables $\delta_n$ with $n = 1, 2, \ldots$ is assumed to
have known distribution. The output of the model is the process $x(t)$ defined by the
grid and the $\delta_n$ by
$$x(t) = \begin{cases} 1 & \text{if } t = \tau + nT + \delta_n \\ 0 & \text{otherwise} \end{cases}$$
The process is zero almost everywhere except in the neighborhood of the grid
points $\tau + nT$, where it takes on the value one at $\tau + nT + \delta_n$. Thus the process
$\delta_n$ defines deviations from the regular lattice. If the mean of the $\delta_n$ is zero then the
expected time of the nth event is $\tau + nT$. If the $\delta_n$ were degenerate (equal to zero
for all $n$) then $x(t)$ would be exactly periodic. The upper half of Fig. 1 diagrams
the parameters in the lattice model and shows how the realizations cluster around
the underlying grid.
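The construction above can be sketched in a few lines of Python. This is only an illustration: it assumes zero-mean normal deviations (the text requires only a known distribution), and the names `lattice_event_times`, `tau`, `T`, and `sigma` are chosen here rather than taken from the original.

```python
import random

def lattice_event_times(tau, T, sigma, n_events, rng):
    """Event times tau + n*T + delta_n for n = 1..n_events,
    with delta_n drawn from N(0, sigma^2) (an assumed distribution)."""
    return [tau + n * T + rng.gauss(0.0, sigma) for n in range(1, n_events + 1)]

# Hypothetical parameter values, chosen only for illustration.
times = lattice_event_times(tau=0.5, T=2.0, sigma=0.1, n_events=5,
                            rng=random.Random(0))
for n, t in enumerate(times, start=1):
    print(f"event {n}: grid point {0.5 + n * 2.0:.2f}, realized time {t:.3f}")
```

Setting `sigma=0.0` makes the deviations degenerate, and the event times fall exactly on the grid.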
This model is generative in the sense that if the values for all the parameters
are known it is easy to determine the probability that a candidate data set may
have arisen (or been generated) from the model. For example, suppose that $\delta_n$ is
distributed normally with the same mean $\mu$ and variance $\sigma^2$ for all $n$. The model
is fully specified by the parameter vector $\theta = (\tau, T, \mu, \sigma^2)$. For a given set of data
$D$, the probability that the data arose from a model of this form is $P(D \mid \theta)$.
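As a hedged illustration of this generative view, the sketch below scores a list of observed event times under the lattice model with normal deviations. It makes two assumptions that the text does not state: the deviations are independent, and the nth observed event belongs to the nth grid point. The function name and the symbols `tau`, `T`, `mu`, and `sigma` are illustrative.

```python
import math

def lattice_log_likelihood(event_times, tau, T, mu, sigma):
    """Log-density of the observed event times, assuming independent
    deviations delta_n ~ N(mu, sigma^2) and that the n-th observation
    belongs to grid point tau + n*T (n = 1, 2, ...)."""
    total = 0.0
    for n, t in enumerate(event_times, start=1):
        delta = t - (tau + n * T)  # deviation from the n-th grid point
        total += (-0.5 * math.log(2 * math.pi * sigma ** 2)
                  - (delta - mu) ** 2 / (2 * sigma ** 2))
    return total

# Hypothetical observations, scored against a few candidate periods.
observed = [2.48, 4.55, 6.49]
for T in (1.9, 2.0, 2.1):
    print(T, lattice_log_likelihood(observed, tau=0.5, T=T, mu=0.0, sigma=0.1))
```

Comparing the scores across candidate parameter vectors is one simple way to ask which model instance the data most resembles.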
[Figure 1. Upper panel (Lattice Model): events deviate by $\delta_1, \ldots, \delta_5$ from the grid points $\tau + T, \ldots, \tau + 5T$. Lower panel (Successive Model): intervals $s_1, \ldots, s_5$ mark the times $\tau$, $\tau + s_1$, $\tau + s_1 + s_2$, $\ldots$, $\tau + \sum_{i=1}^{5} s_i$.]
Figure 1: The lattice model is built from a discrete stochastic process $\delta_n$ that defines the deviation from an underlying periodic grid. The successive model is built from a collection of positive random variables $s_n$ that directly define the time between events. While similar over short times, the long term behavior of the two models is quite different.
2 Successive Model
The successive model is built from a collection of positive random variables $s_n$ which are assumed to have known distribution. Let $S_n = \sum_{i=1}^{n} s_i$ be the cumulative sum of the $s_i$'s and let $\tau$
be the starting point. Then the output is
$$x(t) = \begin{cases} 1 & \text{if } t = \tau + S_n \\ 0 & \text{otherwise} \end{cases}$$
The process is zero almost everywhere except when $t = \tau + S_n$, where it is one.
If the $s_n$'s had mean $T$ and variance zero, then the output would be strictly periodic
with period $T$. A diagram of the construction of the successive model is given in
the lower half of Fig. 1. If the distribution of the $s_n$ is known (for instance, it might
be the absolute value of a normal random variable with mean $\mu$ and variance $\sigma^2$),
the model is fully specified by the parameter vector $\theta = (\tau, \mu, \sigma^2)$.
Despite the similarities in the short term, the long term behavior of the successive model is very different from the long term behavior of the lattice model. The
nth event in the lattice model is always close to $\tau + nT$, whereas the nth event in
the successive model depends on the values of all previous $s_i$'s. Effectively, the
lattice approach models deviations from a single periodicity whereas the successive approach begins each new repetition where the previous one ended.
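The difference in long term behavior can be seen in a small simulation. The sketch below is illustrative only: it assumes independent normal deviations for the lattice model and intervals equal to the absolute value of a normal random variable for the successive model, and every name and parameter value is hypothetical. Under these assumptions the spread of the nth lattice event stays near `sigma`, while the spread of the nth successive event grows roughly like `sigma` times the square root of n.

```python
import random
import statistics

def lattice_event_times(tau, T, sigma, n_events, rng):
    """Event times tau + n*T + delta_n, with delta_n ~ N(0, sigma^2)."""
    return [tau + n * T + rng.gauss(0.0, sigma) for n in range(1, n_events + 1)]

def successive_event_times(tau, mu, sigma, n_events, rng):
    """Event times tau + s_1 + ... + s_n, with s_i = |N(mu, sigma^2)|."""
    times, t = [], tau
    for _ in range(n_events):
        t += abs(rng.gauss(mu, sigma))  # each interval starts where the last ended
        times.append(t)
    return times

rng = random.Random(1)
trials = 2000
for n in (1, 10, 100):
    # Time of the n-th event across many independent realizations.
    lat = [lattice_event_times(0.0, 1.0, 0.05, n, rng)[-1] for _ in range(trials)]
    suc = [successive_event_times(0.0, 1.0, 0.05, n, rng)[-1] for _ in range(trials)]
    print(f"n={n}: lattice spread {statistics.pstdev(lat):.3f}, "
          f"successive spread {statistics.pstdev(suc):.3f}")
```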
3 Additive Model
The additive model presumes that there is an underlying $N$-periodic sequence $a_i$
and a random process $\delta_i$ that defines the deviations of the output from the periodic
sequence. The values of the output, given by
$$x_i = a_{i \bmod N} + \delta_i,$$
fluctuate around the periodic sequence. This is illustrated in the upper half of
Fig. 2 for the $N = 3$ case using the three-periodic sequence $a_0, a_1, a_2$.
If the $\delta_i$ is degenerate (zero for all $i$), the output simply repeats the periodic $a_i$. This
model represents a periodic signal corrupted by additive noise (the $\delta_i$) and is
amenable to a large variety of analytical techniques. The parameters needed to
fully specify the model are the distribution of the $\delta_i$, the periodic sequence $a_i$, and
a starting time $\tau$.
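A minimal sketch of the additive model follows, assuming zero-mean normal noise; the text leaves the noise distribution open. The function name `additive_output` and the example sequence are hypothetical.

```python
import random

def additive_output(a, sigma, n_samples, rng):
    """Output a[i mod N] + delta_i, where N = len(a) is the period and
    delta_i ~ N(0, sigma^2) is the (assumed normal) additive noise."""
    N = len(a)
    return [a[i % N] + rng.gauss(0.0, sigma) for i in range(n_samples)]

# A three-periodic sequence with illustrative values.
a = [1.0, -0.5, 2.0]
x = additive_output(a, sigma=0.2, n_samples=9, rng=random.Random(0))
for i, value in enumerate(x):
    print(f"i={i}: periodic value {a[i % 3]:+.1f}, output {value:+.3f}")
```

With `sigma=0.0` the noise is degenerate and the output simply repeats the periodic sequence.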
4 Amplitude Modulation (AM) Model
Like the additive model, the AM model assumes an underlying $N$-periodic sequence $\sigma_i^2$.
Rather than defining mean values as above, the periodic sequence defines the variance of the zero mean process $s_i$, that is, the variance of $s_i$ is $\sigma_{i \bmod N}^2$. The output of the process is a realization of $s_i$.
This is illustrated in the lower half of Fig. 2 for the three-periodic case with
$s_0 \sim N(0, \sigma_0^2)$, $s_1 \sim N(0, \sigma_1^2)$, and $s_2 \sim N(0, \sigma_2^2)$. The model is fully specified by the distribution
of the $s_i$, the variances $\sigma_i^2$, and the starting time $\tau$.
[Figure 2. Upper panel (Additive Model): outputs deviate by $\delta_0, \delta_1, \delta_2, \ldots$ from the repeating values $a_0, a_1, a_2$. Lower panel (Amplitude Modulation Model): $s_0, s_3, s_6 \sim N(0, \sigma_0^2)$; $s_1, s_4 \sim N(0, \sigma_1^2)$; $s_2, s_5 \sim N(0, \sigma_2^2)$. Time axis: $\tau, \tau + k, \tau + 2k, \ldots, \tau + 6k, \ldots$]
Figure 2: Two models representing cyclostationary stochastic processes. The additive model is most simply viewed as a periodic signal corrupted by additive
noise $\delta_i$. Outputs of the AM model are generated from a stochastic process $s_i$ defined with a periodic pattern of variances $\sigma_i^2$.
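The AM model can be sketched in the same way. The example below assumes normally distributed outputs, as in the three-periodic case of Fig. 2; the function name `am_output` and the variance values are hypothetical. Grouping the samples by phase (index modulo the period) gives a rough estimate of the periodic variance pattern.

```python
import math
import random
import statistics

def am_output(variances, n_samples, rng):
    """Zero-mean normal samples whose variance is variances[i mod N],
    where N = len(variances) is the period."""
    N = len(variances)
    return [rng.gauss(0.0, math.sqrt(variances[i % N])) for i in range(n_samples)]

variances = [0.01, 1.0, 25.0]  # illustrative three-periodic variance pattern
s = am_output(variances, n_samples=9000, rng=random.Random(0))

# Samples that share a phase share a variance, so estimate it per phase.
for phase, var in enumerate(variances):
    est = statistics.pvariance(s[phase::len(variances)])
    print(f"phase {phase}: model variance {var}, sample variance {est:.3f}")
```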
5 Discussion of the Models
Stochastic variations in repetition are important in many fields. For example, in
communications the message is effectively random while the modulation, synchronization, and frame structure impose periodic fluctuations. In mechanics, rotating elements provide periodicity while cavitation, turbulence, and varying loads
impose randomness. Rhythmic physiological processes such as the heartbeat and
brainwaves are clearly repetitive but are neither completely periodic nor fully predictable. There are two kinds of repetitive behavior shown by these models. In
the lattice and successive models, the rate of repetition depends on a random process. In the additive and AM models the process is locked to a grid on which the
statistics are defined by an underlying periodic sequence.
Such processes have been studied extensively in the mathematical literature.
A discrete-time stochastic process $x_i$ (see footnote 1) is called stationary if both the expectation $E\{x_i\}$
and the autocorrelation $R_x(i, k) = E\{x_i x_{i+k}\}$ are independent of $i$.
Stationarity captures the idea that while a process may be random from moment to
moment, it has an underlying unity in that the distribution of the process remains
fixed through time. Only slightly more complex is the idea that the mean and
autocorrelation may be periodic in time. A process $x_i$ is called cyclostationary
if both the expectation $E\{x_i\}$ and the autocorrelation $R_x(i, k) = E\{x_i x_{i+k}\}$ are periodic
functions of $i$ [1]. Both the additive and AM models are cyclostationary.
More generally, suppose that $a_i$ is a periodic sequence and that $z_i$ is a stationary process. Then the sum $a_i + z_i$ is cyclostationary, which generalizes the
additive model. Similarly, the product $a_i z_i$ is cyclostationary, which generalizes
the AM model.
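The claim for the sum and the product can be checked directly. The notation in this short derivation is assumed for illustration: $a_i$ is an $N$-periodic sequence, and $z_i$ is a stationary process with mean $m$ and autocorrelation $R_z(k) = E\{z_i z_{i+k}\}$.

```latex
\begin{align*}
E\{a_i + z_i\} &= a_i + m, \\
E\{(a_i + z_i)(a_{i+k} + z_{i+k})\} &= a_i a_{i+k} + m\,(a_i + a_{i+k}) + R_z(k), \\
E\{a_i z_i\} &= a_i\, m, \\
E\{(a_i z_i)(a_{i+k} z_{i+k})\} &= a_i a_{i+k}\, R_z(k).
\end{align*}
```

Each right-hand side depends on $i$ only through $a_i$ and $a_{i+k}$, so it is unchanged when $i$ is replaced by $i + N$. The mean and autocorrelation are therefore periodic in $i$ for both the sum and the product.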
In modeling a particular phenomenon or data set, certain models may be more
appropriate than others. For example, the lattice model specifies the time at which
events occur and hence may be best applied to event-driven signals. A drum
sequence recorded in MIDI might be an ideal fit. The successive model might
be useful to predict a heartbeat (or other physiological signal) because each beat
begins when the previous one ends; there is not necessarily an underlying rigid
lattice to which the beats must conform. Either the additive or the AM model
is more appropriate when trying to represent audio signals, because they deal
explicitly with amplitudes and not solely with timing.
Footnote 1: Analogous definitions apply to continuous-time stochastic processes.
Of course, many such models are possible. Correlations may occur in the various stochastic processes. Distributions may change over time, effectively combining various models such as the sum (additive) and product (AM) models. Or
the timing of events within one of the cyclostationary models may be modified by
a random variable (combining, for example, the timing structure of the successive
model with the amplitude variations of the additive model).
In applications where the parameters are not known beforehand, there is an
essential tradeoff between the accuracy of the model and the number of parameters
required. The step of choosing an appropriate model (one that is complex enough
to capture the essence of the phenomenon of interest, yet simple enough to remain
tractable) is probably the most difficult step in any application.
References
[1] W. A. Gardner and L. E. Franks, “Characterization of cyclostationary random
signal processes,” IEEE Trans. Inform. Theory, Vol. IT-21, pp. 4-14, 1975.