A Low-Rank Kernel-Particle Kalman Filter
for Data Assimilation with High Dimensional Systems
Ibrahim Hoteit
Scripps Institution of Oceanography, La Jolla, CA 92093-0230, USA
Email: ihoteit@ucsd.edu
Dinh-Tuan Pham
Laboratoire de Modélisation et Calcul, P.B. 53, 38041, Grenoble, France
Email: dinh-tuan.pham@imag.fr
George Triantafyllou
Hellenic Center for Marine Research, PO BOX 712, 19013, Anavyssos, Greece
Email: gkorres@hcmr.gr
Abstract: We introduce a simplified discrete solution of the optimal nonlinear filter suitable for data
assimilation with high dimensional systems. The method is based on a local linearization in a low-rank
Kernel representation of the nonlinear filter's prior probability density functions. This leads to a new filter,
called Low-Rank Kernel Particle Kalman Filter (LR-KPKF), in which the standard (weight-type) particle
filter correction is complemented by a Kalman-type correction for each particle. The Kalman-type
correction attenuates the particle degeneracy problem, which allows the filter to operate efficiently with small-size ensembles. Combined with the low-rank approximation, it enables the implementation of the
LR-KPKF with computationally demanding models. The new filter is described and its relevance
demonstrated using a realistic configuration of the Princeton Ocean Model in the Mediterranean Sea.
1. Kalman, Ensemble and Particle Filters
The Kalman filter provides the optimal
(minimum-variance) solution of the linear Gaussian
sequential data assimilation problem. Since most
dynamical and observational systems encountered
in practice are nonlinear, the system equations are
often linearized about the most recent estimate,
leading to the popular, but no longer optimal,
Extended Kalman (EK) Filter. Several studies have, however, demonstrated that this linearization can produce instabilities, and even divergence, when applied to strongly nonlinear systems. The
solution of the nonlinear sequential assimilation
problem is actually given by the optimal nonlinear
filter which involves the estimation of the
probability density functions (PDFs), not necessarily Gaussian, of the system state (Doucet et al., 2001). Like the Kalman filter, this filter operates in two steps: an analysis step at measurement times, which uses Bayes' rule to update the filtering density $p_k(x_k \mid y_{1:k})$, the conditional density of the state vector $x_k$ given all measurements $y_{1:k} = (y_1, \ldots, y_k)$ up to time $k$; and a forecast step, which propagates the predictive density $p_{k+1|k}(x_{k+1} \mid y_{1:k})$ to the time of the next available observations.
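For reference, these two steps can be written compactly as the standard Bayesian filtering recursion (a textbook statement of the recursion described above, in the notation of the text):

$$p_k(x_k \mid y_{1:k}) = \frac{p(y_k \mid x_k)\, p_{k|k-1}(x_k \mid y_{1:k-1})}{\int p(y_k \mid x)\, p_{k|k-1}(x \mid y_{1:k-1})\, dx}, \qquad p_{k+1|k}(x_{k+1} \mid y_{1:k}) = \int p(x_{k+1} \mid x_k)\, p_k(x_k \mid y_{1:k})\, dx_k .$$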
The particle filter is a discrete approximation
of the optimal nonlinear filter and it is based on
point mass representations (mixture of Dirac
distributions), called particles, of the state's PDFs.
In this filter, each particle is assigned a weight that
is updated by the filter's analysis step and the
solution is then the weighted-mean of the particles
ensemble. In practice, this filter suffers from a
major problem known as the degeneracy
phenomenon; after several iterations most weights
are concentrated on very few particles, and
therefore only a tiny fraction of the ensemble
contributes to the mean. This leads very often to
the divergence of the filter. The use of more
particles could only attenuate this problem over
short time periods, and the only possible way to
get around it is resampling (Doucet et al., 2001). This technique basically consists of drawing new particles according to the distribution of the ensemble and then assigning them all equal weights. Besides being computationally intensive,
this approach introduces Monte Carlo fluctuations
which can seriously degrade the filter's
performance. In practice, even with resampling,
the filter still requires a large number of particles
to provide acceptable performance. This makes
brute-force implementation of the particle filter
problematic with high dimensional systems.
Interesting discussions about the use of the
optimal nonlinear filter for atmospheric and oceanic
data assimilation systems can be found in
(Anderson and Anderson, 1999), (Kivman, 2003)
and (Van Leeuwen, 2003).
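To make the weight-update and degeneracy mechanism concrete, the following minimal Python/NumPy sketch implements a generic, bootstrap-type particle-filter analysis step with multinomial resampling. It is illustrative only: the Gaussian likelihood, the scalar observation-error level obs_std, and the effective-sample-size trigger (a common alternative to the entropy criterion used later in this paper) are assumptions, not part of the methods discussed above.

import numpy as np

def particle_analysis(particles, weights, y, obs_op, obs_std):
    """Weight-type (particle filter) analysis step -- illustrative sketch.

    particles: (N, n) array of state particles
    weights:   (N,)   array of normalized weights
    y:         (p,)   observation vector
    obs_op:    function mapping a state vector to observation space
    obs_std:   observation-error standard deviation (scalar, for simplicity)
    """
    # Gaussian likelihood of each particle given the new observation
    innov = np.array([y - obs_op(x) for x in particles])          # (N, p)
    loglik = -0.5 * np.sum((innov / obs_std) ** 2, axis=1)
    w = weights * np.exp(loglik - loglik.max())                   # avoid underflow
    w /= w.sum()

    # Degeneracy diagnostic: effective ensemble size
    n_eff = 1.0 / np.sum(w ** 2)

    # Multinomial resampling when too few particles carry the weight
    if n_eff < 0.5 * len(w):
        idx = np.random.choice(len(w), size=len(w), p=w)
        particles, w = particles[idx], np.full(len(w), 1.0 / len(w))
    return particles, w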
The popular Ensemble Kalman (EnK) filter, introduced by Evensen (1994), also makes use of an ensemble of particles. More precisely, it has the same forecast step as the particle filter, but not the same analysis step. The EnK filter actually retains the "linearity aspect" of the Kalman filter in the analysis, in that it applies the Kalman correction to all particles using forecast error covariances estimated from the particle cloud. It therefore depends only on the first two moments of the ensemble, meaning it is suboptimal for non-Gaussian PDFs. In practice, however, the EnK filter was found (Kivman, 2003; Van Leeuwen, 2003) to be more robust than the particle filter for small-size ensembles, thanks to the Kalman update of its particles, which significantly reduces the risk of ensemble collapse.
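The Kalman update of every particle that makes the EnK filter robust can be sketched as follows. This is a stochastic, perturbed-observation EnKF analysis written for illustration with a linear observation operator H; it is a simplified sketch, not the exact scheme of Evensen (1994), and forming the full forecast covariance explicitly is only viable for the small toy dimensions assumed here.

import numpy as np

def enkf_analysis(E, y, H, R):
    """Apply the Kalman correction to every particle (ensemble member).

    E: (n, N) ensemble of state vectors (columns are particles)
    y: (p,)   observation vector
    H: (p, n) linear observation operator
    R: (p, p) observation-error covariance
    """
    n, N = E.shape
    A = E - E.mean(axis=1, keepdims=True)           # ensemble anomalies
    Pf = A @ A.T / (N - 1)                          # sample forecast covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)  # Kalman gain from the particle cloud
    # Perturbed observations keep the analysis spread statistically consistent
    Y = y[:, None] + np.linalg.cholesky(R) @ np.random.randn(len(y), N)
    return E + K @ (Y - H @ E)                      # same Kalman correction for all members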
2. Low-Rank Kernel Particle Kalman Filter
The Kernel Particle Kalman (KPK) filter has been introduced by Pham et al. (2004) for an efficient discrete implementation of the optimal nonlinear filter. This filter makes use of a mixture of N Gaussian distributions in a Kernel representation to approximate the filter's PDFs. A Gaussian mixture has already been considered by Anderson and Anderson (1999) and Chen and Liu (2000) and is expected to provide a better approximation than the Dirac mixture used in the particle filter. A local linearization about each particle is then applied, under the assumption of small covariance matrices, which leads to a Kalman-type correction for each particle complementing the usual particle-type correction. Basically, the KPK filter runs an ensemble of EK filters, and the analysis state is then the weighted mean of all the sub-filters' analyses. As in the EnK filter, the Kalman-type correction attenuates the degeneracy problem and therefore allows the filter to efficiently operate with small-size ensembles.
The KPK filter requires manipulating an ensemble of N error covariance matrices (one associated with each particle), which is computationally not feasible for high dimensional systems. We therefore follow the formulation of the singular evolutive extended and interpolated Kalman (SEEK and SEIK) filters, which make use of low-rank error covariance matrices, to develop a simplified variant of the KPK filter called the Low-Rank Kernel Particle Kalman (LR-KPK) filter. The basic idea behind this new filter is to approximate the nonlinear filter predictive density at time $k$ by a Gaussian mixture with the same small, in some sense, low-rank ($\leq N-1$) covariance matrix $P_{k|k-1} = L_{k|k-1} U_{k|k-1} L_{k|k-1}^{T}$ for all the mixture components, i.e.

$$p_{k|k-1}(x \mid y_{1:k-1}) = \sum_{i=1}^{N} w_{k-1}^{i}\, \Phi\!\left(x - x_{k|k-1}^{i} \mid P_{k|k-1}\right),$$

where $x_{k|k-1}^{i}$ is the $i$-th forecast particle, $w_{k-1}^{i}$ the associated weight, and $\Phi(\cdot \mid P)$ denotes the centered Gaussian density with covariance matrix $P$. It can then be shown that the predictive and the analysis densities at the next time can always be approximated by mixtures of Gaussian densities of the same form,

$$p_{k}(x \mid y_{1:k}) = \sum_{i=1}^{N} w_{k}^{i}\, \Phi\!\left(x - x_{k}^{i} \mid P_{k} = L_{k} U_{k} L_{k}^{T}\right).$$

The parameters $x^{i}$, $w^{i}$, $L$, and $U$ of both mixtures are updated by the filter as follows. To present the filter's algorithm, we consider the nonlinear dynamical system

$$x_{k} = M_{k}(x_{k-1}) + \eta_{k}, \qquad y_{k} = H_{k}(x_{k}) + \varepsilon_{k},$$

where $M_{k}$ and $H_{k}$ represent the transition and the observational operators, respectively, and $\eta_{k}$ and $\varepsilon_{k}$ denote the dynamical and the observational Gaussian noise of mean zero and covariance matrices $Q_{k}$ and $R_{k}$.

Initialization
Based on Kernel density estimation, the filter's PDF is initialized by

$$p_{0}(x) = \sum_{i=1}^{N} w_{0}^{i}\, \Phi\!\left(x - x_{0}^{i} \mid P_{0}\right),$$

where the initial particles $x_{0}^{i}$ are sampled from the unconditional distribution of the initial state estimate, $w_{0}^{i} = 1/N$, and $P_{0} = h_{0}^{2}\,\mathrm{cov}(x_{0}^{i})$, with $h_{0}$ a bandwidth parameter. Here $p_{0}$ is assumed to be Gaussian with mean $x_{0}$ and low-rank covariance matrix $P_{0}$. The $x_{0}^{i}$ can then be randomly sampled using the second-order exact drawing scheme (Pham, 2001). Estimates of $x_{0}$ and $P_{0} = L_{0} U_{0} L_{0}^{T}$ are obtained as the mean and sample covariance matrix of a set of model realizations. More precisely, the latter is computed via an empirical orthogonal functions (EOF) analysis.
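A minimal sketch of this initialization is given below, assuming the set of model realizations is available as the columns of a matrix. The EOF analysis is carried out through a thin SVD, the bandwidth is applied to the covariance of the realization set, and the initial particles are drawn with a plain Gaussian perturbation of the mean; the paper instead uses the second-order exact drawing scheme of Pham (2001), so the sampling line is only a stand-in, and the function name and default bandwidth are illustrative.

import numpy as np

def initialize_lr_kpkf(X_hist, N, h0=0.4, rank=None, rng=None):
    """Build x0, the low-rank factors of P0 and an initial particle set (sketch).

    X_hist: (n, M) matrix of model realizations (columns are state snapshots)
    N:      number of particles
    h0:     kernel bandwidth parameter
    rank:   number of retained EOFs (defaults to N - 1)
    """
    rng = np.random.default_rng() if rng is None else rng
    n, M = X_hist.shape
    r = (N - 1) if rank is None else rank

    x0 = X_hist.mean(axis=1)
    A = X_hist - x0[:, None]
    # EOF analysis of the realization set via thin SVD
    U_svd, s, _ = np.linalg.svd(A, full_matrices=False)
    L0 = U_svd[:, :r]                              # leading EOFs, (n, r)
    U0 = np.diag((h0 * s[:r]) ** 2 / (M - 1))      # so that P0 = h0^2 * sample covariance

    # Plain Gaussian draw of the initial particles around x0 with covariance P0
    # (the paper uses the second-order exact drawing scheme of Pham (2001) instead)
    coeffs = rng.standard_normal((r, N))
    particles = x0[:, None] + L0 @ np.sqrt(U0) @ coeffs
    w0 = np.full(N, 1.0 / N)
    return x0, L0, U0, particles, w0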
Analysis step
Kalman-type correction: Each forecast particle is corrected with the new observations according to

$$x_{k}^{i} = x_{k|k-1}^{i} + L_{k|k-1} U_{k} (HL)_{k|k-1}^{T} R_{k}^{-1} \left[\, y_{k} - H_{k}(x_{k|k-1}^{i}) \,\right],$$

where

$$(HL)_{k|k-1} = \left[\, H_{k}(x_{k|k-1}^{1}) \;\cdots\; H_{k}(x_{k|k-1}^{N}) \,\right] T,$$

and $U_{k}$ is computed from

$$U_{k}^{-1} = \left[\, U_{k|k-1} + \Pi_{L} Q_{k} \Pi_{L}^{T} \,\right]^{-1} + (HL)_{k|k-1}^{T} R_{k}^{-1} (HL)_{k|k-1},$$

with $T$ an $N \times (N-1)$ full-rank matrix with zero column sums, and $\Pi_{L} = (L_{k|k-1}^{T} L_{k|k-1})^{-1} L_{k|k-1}^{T}$ the projection operator onto $L_{k|k-1}$. The associated covariance matrix of the mixture is

$$P_{k} = L_{k|k-1} U_{k} L_{k|k-1}^{T} = L_{k} \tilde{U}_{k} L_{k}^{T},$$

with $L_{k} = \left[\, x_{k}^{1} \;\cdots\; x_{k}^{N} \,\right] T$, and $\tilde{U}_{k}$ is determined via

$$\tilde{U}_{k}^{-1} = B_{k} U_{k}^{-1} B_{k}^{T},$$

with

$$B_{k} = I_{d} + U_{k} (HL)_{k|k-1}^{T} R_{k}^{-1} \left[\, y_{k} \mathbf{1}^{T} - \big(\, H_{k}(x_{k|k-1}^{1}) \;\cdots\; H_{k}(x_{k|k-1}^{N}) \,\big) \right] T,$$

where $\mathbf{1}$ denotes the vector of ones of length $N$.

Particle-type correction: The particles' weights are updated as in the particle filter,

$$w_{k}^{i} = \frac{ w_{k-1}^{i}\, \Phi\!\left( y_{k} - H_{k}(x_{k|k-1}^{i}) \mid \Lambda_{k} \right) }{ \sum_{j=1}^{N} w_{k-1}^{j}\, \Phi\!\left( y_{k} - H_{k}(x_{k|k-1}^{j}) \mid \Lambda_{k} \right) },$$

where

$$\Lambda_{k} = (HL)_{k|k-1} U_{k|k-1} (HL)_{k|k-1}^{T} + R_{k}.$$

Forecast step
The forecast particles $x_{k+1|k}^{i}$ are obtained by integrating the model forward in time starting from $x_{k}^{i}$. The weights remain unchanged, and the associated covariance matrix is approximated by $P_{k+1|k} = L_{k+1|k} U_{k+1|k} L_{k+1|k}^{T}$ with $L_{k+1|k} = \left[\, x_{k+1|k}^{1} \;\cdots\; x_{k+1|k}^{N} \,\right] T$.

Resampling
A resampling step is applied every $m$ filtering cycles to avoid (i) the degeneracy of the particles' weights, in which case a "full resampling" is performed in which the weights are redistributed uniformly, and (ii) a possible increase of $P_{k+1|k}$, in which case a "partial resampling" is performed in which the weights remain unchanged to reduce Monte Carlo fluctuations. The resampling step is summarized as follows. First, the entropy criterion

$$\mathrm{Ent}_{k} = \log N + \sum_{i=1}^{N} w_{k}^{i} \log w_{k}^{i}$$

is used to decide whether a full or a partial resampling is needed. Then $N$ random Gaussian vectors $v^{i}$ are drawn with zero mean and covariance matrix $P_{k+1|k} - h^{2} \Sigma_{k+1|k}$, where

$$\Sigma_{k+1|k} = \sum_{j=1}^{N} w_{k}^{j}\, P_{k+1|k} + \mathrm{cov}\big( x_{k+1|k}^{i} ; w_{k} \big)$$

is the covariance matrix of the predictive density and $h$ is a tuning parameter chosen such that $P_{k+1|k} - h^{2} \Sigma_{k+1|k} \geq 0$.

Full resampling: Select $N$ particles among the $x_{k+1|k}^{i}$ according to the probabilities $w_{k}^{i}$, then add to each one of them a vector $v^{i}$ to obtain the new particles. Set $w_{k}^{i} = 1/N$ and $P_{k+1|k} = h^{2} \Sigma_{k+1|k}$.

Partial resampling: Add the $v^{i}$ to the $x_{k+1|k}^{i}$ to obtain the new particles. Keep the weights unchanged and set $P_{k+1|k} = h^{2} \Sigma_{k+1|k}$.
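The analysis step above can be condensed into the following NumPy sketch of a single LR-KPK update (Kalman-type correction, particle-type weight update, and the entropy diagnostic). It is a simplified illustration under the assumptions that the model-error term Q_k is neglected, that one particular (non-unique) choice of the matrix T is made, and that the resampling itself is left out; obs_op, R_inv and the function name are illustrative.

import numpy as np

def lr_kpkf_analysis(Xf, w, L, U, y, obs_op, R_inv, R):
    """One LR-KPK analysis step (illustrative sketch, notation as in the text).

    Xf:       (n, N) forecast particles x^i_{k|k-1} (columns)
    w:        (N,)   forecast weights
    L:        (n, r) low-rank factor L_{k|k-1}, with r = N - 1
    U:        (r, r) factor U_{k|k-1}
    y:        (p,)   observation vector
    obs_op:   maps a state vector to observation space (H_k)
    R_inv, R: (p, p) observation-error precision / covariance
    """
    n, N = Xf.shape
    HX = np.column_stack([obs_op(Xf[:, i]) for i in range(N)])     # H_k applied to each particle

    # T: N x (N-1) full-rank matrix with zero column sums (one common choice)
    T = np.vstack([np.eye(N - 1), -np.ones((1, N - 1))])
    HL = HX @ T                                                    # (HL)_{k|k-1}

    # Update of U (model-error term Q_k omitted in this sketch)
    U_new = np.linalg.inv(np.linalg.inv(U) + HL.T @ R_inv @ HL)

    # Kalman-type correction applied to every particle
    gain = L @ U_new @ HL.T @ R_inv                                # (n, p)
    Xa = Xf + gain @ (y[:, None] - HX)

    # Particle-type correction of the weights
    S = HL @ U @ HL.T + R                                          # innovation covariance
    innov = y[:, None] - HX
    loglik = -0.5 * np.einsum('pi,pq,qi->i', innov, np.linalg.inv(S), innov)
    w_new = w * np.exp(loglik - loglik.max())
    w_new /= w_new.sum()

    # Entropy criterion used to decide between full and partial resampling
    ent = np.log(N) + np.sum(w_new * np.log(w_new))
    return Xa, w_new, U_new, ent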
3. First Application
The Model
We use the Princeton Ocean Model (POM), a primitive-equation finite-difference model formulated under the hydrostatic and Boussinesq approximations. POM solves the 3-D Navier-Stokes equations on an Arakawa-C grid using a
numerical scheme that conserves mass and
energy. Time stepping is achieved using a
leapfrog scheme associated with an Asselin filter.
The numerical computation is split into an external
barotropic mode with a short time step solving for
the time evolution of the free surface elevation
and the depth averaged velocities, and an internal
baroclinic mode which solves the vertical velocity
shear. Horizontal mixing is parameterized using
nonlinear viscosities and diffusivities while vertical
mixing is calculated using the Mellor and Yamada
2.5 turbulence closure scheme. The reader is
referred to (Blumberg and Mellor, 1987) for a
detailed description of POM.
The model domain covers the entire
Mediterranean basin. The horizontal resolution is
1/4o1/4o with 25 sigma levels in the vertical. The
model bathymetry was obtained from the US Navy
Digital Bathymetric Data Bases. The surface
forcing, which includes monthly wind stress, heat
flux, net shortwave radiation and evaporation rate,
was derived from the ECMWF reanalysis, except
for the precipitation which was derived from Jaeger
climatology. Bulk formulae were used to compute
the surface momentum, heat and freshwater fluxes
at each time step of model integration taking into
account the SST predicted by the model.
The model dynamics were first adjusted to
achieve a perpetually repeated seasonal cycle by
integrating the model climatologically for 20 years.
This run started from rest with the MODB-MED4
temperature and salinity profiles. Next, another 2-year (1980-1981) integration was carried out to
adjust the model dynamics to the inter-annual
ECMWF forcing.
Filter Initialization and Experiments design
A representative set of model realizations was
obtained from a 4-year run between 1982 and
1985. The state variables were normalized, as they
are not of the same nature, by the inverse of the
square-root of their domain-averaged variances
before applying the EOF analysis. The filter's rank was set to 50, as the first 50 EOFs explained more than 90% of the total variance of the set.
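The choice of rank by cumulative explained variance can be expressed in a few lines; a minimal sketch, assuming the normalized state snapshots are stored as the columns of a matrix (the function name and the 90% threshold argument are illustrative):

import numpy as np

def select_rank(X, var_threshold=0.90):
    """Smallest number of EOFs whose cumulative explained variance exceeds the threshold.

    X: (n, M) matrix of normalized state snapshots (columns).
    """
    A = X - X.mean(axis=1, keepdims=True)
    s = np.linalg.svd(A, compute_uv=False)          # singular values of the anomalies
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)  # cumulative explained variance
    return int(np.searchsorted(explained, var_threshold) + 1)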
A reference model run was first carried out
over 1986. A set of 73 reference states was formed
by retaining one vector every 5 days to provide the
pseudo-data and to be later compared with the
fields produced by the filter. Twin experiments
were carried out assimilating observations of sea
surface height extracted from the reference states
every 4 grid points. Random Gaussian errors of
zero mean and 3cm standard deviation were
added to the observations. The assimilation
experiment was initialized from the mean state
vector of the 4-year period used for the calculation
of the EOFs. The model was assumed perfect.
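Generating the pseudo-observations described above amounts to subsampling the reference sea surface height and adding noise; a minimal sketch, assuming the reference SSH fields are available as 2-D arrays in metres (the function name and argument names are illustrative):

import numpy as np

def make_pseudo_ssh_obs(ssh_true, stride=4, noise_std=0.03, rng=None):
    """Extract pseudo SSH observations from a reference field.

    ssh_true:  2-D sea surface height field from the reference run (metres)
    stride:    keep one value every `stride` grid points in each direction
    noise_std: observation-error standard deviation (3 cm in the experiments)
    """
    rng = np.random.default_rng() if rng is None else rng
    obs = ssh_true[::stride, ::stride].copy()
    obs += noise_std * rng.standard_normal(obs.shape)
    return obs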
Assimilation Results
Fig. 1 plots the evolution in time of the relative
analysis errors for the model state variables as
they result from the LR-KPK and SEEK filters. Both
filters were implemented with the same rank (50),
so that their computational costs were practically
the same. The LR-KPK filter therefore used 51
particles only. The behavior of the LR-KPK filter is
quite satisfactory. It provides better estimates for
all model state variables and seems to be more
robust than the SEEK filter. The good performance
of the SEEK filter, which is a Kalman based filter,
suggests that the model is not strongly nonlinear.
Fig. 1. RMS for physical variables.
More applications with different setups and
different ocean and atmospheric models are still
needed to further assess the qualities of the new
filter and the benefit of the nonlinear analysis.
References:
Anderson, J., and S. Anderson, A Monte Carlo
implementation of the nonlinear filtering problem to
produce ensemble assimilations and forecasts. Mon.
Wea. Rev., 127, 2741-2758, 1999.
Blumberg, A. F., and G. L. Mellor, A description of a
three-dimensional coastal ocean circulation model. In
N.S. Heaps, editor, Three-dimensional coastal ocean
circulation models, Coastal and Estuarine Sciences, Vol. 4, pages 1-16. AGU, Washington, D.C., 1987.
Chen, R., and J. Liu, Mixture Kalman filters. J. Roy.
Statist. Soc., 62, 493-508, 2000.
Doucet, A., N. de Freitas, and N. Gordon (Eds.), Sequential Monte Carlo Methods in Practice. Springer, New York, 581 pp., 2001.
Evensen, G., Sequential data assimilation with a
nonlinear quasi-geostrophic model using Monte Carlo
methods to forecast error statistics. J. Geophys. Res.,
99, 10,143-10,162, 1994.
Kivman, G., Sequential parameter estimation for
stochastic systems. Nonlin. Proc. Geophys., 10, 253-259, 2003.
Pham, D.-T., Stochastic methods for sequential data
assimilation in strongly nonlinear systems. Mon. Wea.
Rev., 129, 1194-1207, 2001.
Pham, D.-T., K. Dahia, and C. Musso, A Kalman-Particle Kernel filter and its application to terrain
navigation. Proc. 6th Int. Conf. Inf. Fus., 2004.
Van Leeuwen, P. J., A variance-minimizing filter for
large-scale applications. Mon. Wea. Rev., 131, 2071-2084, 2003.