A Low-Rank Kernel-Particle Kalman Filter for Data Assimilation with High Dimensional Systems

Ibrahim Hoteit, Scripps Institution of Oceanography, La Jolla, CA 92093-0230, USA. Email: ihoteit@ucsd.edu
Dinh-Tuan Pham, Laboratoire de Modélisation et Calcul, B.P. 53, 38041 Grenoble, France. Email: dinh-tuan.pham@imag.fr
George Triantafyllou, Hellenic Center for Marine Research, PO Box 712, 19013 Anavyssos, Greece. Email: gkorres@hcmr.gr

Abstract: We introduce a simplified discrete solution of the optimal nonlinear filter suitable for data assimilation with high dimensional systems. The method is based on a local linearization in a low-rank kernel representation of the nonlinear filter's prior probability density functions. This leads to a new filter, called the Low-Rank Kernel Particle Kalman Filter (LR-KPKF), in which the standard (weight-type) particle filter correction is complemented by a Kalman-type correction for each particle. The Kalman-type correction attenuates the particle degeneracy problem, which allows the filter to operate efficiently with small ensembles. Combined with the low-rank approximation, it enables the implementation of the LR-KPKF with computationally demanding models. The new filter is described and its relevance demonstrated using a realistic configuration of the Princeton Ocean Model in the Mediterranean Sea.

1. Kalman, Ensemble and Particle Filters

The Kalman filter provides the optimal (minimum-variance) solution of the linear Gaussian sequential data assimilation problem. Since most dynamical and observational systems encountered in practice are nonlinear, the system equations are often linearized about the most recent state estimate, leading to the popular, but no longer optimal, Extended Kalman (EK) filter. Several studies have, however, demonstrated that this linearization can produce instabilities, and even divergence, when the filter is applied to strongly nonlinear systems.
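As a point of reference for the corrections discussed below, the linear Kalman analysis step can be sketched in a few lines. This is an illustrative sketch only; the function name and the dense-matrix formulation are our own choices, not part of the paper:

```python
import numpy as np

def kalman_update(x, P, y, H, R):
    """One Kalman analysis step: correct the forecast state x (error
    covariance P) with an observation y = H x + noise of covariance R."""
    S = H @ P @ H.T + R                   # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    x_a = x + K @ (y - H @ x)             # analysis state
    P_a = (np.eye(len(x)) - K @ H) @ P    # analysis error covariance
    return x_a, P_a
```

The EK filter applies the same update with H replaced by the Jacobian of the nonlinear observation operator evaluated at the forecast state, which is where the instabilities mentioned above originate.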
The solution of the nonlinear sequential assimilation problem is actually given by the optimal nonlinear filter, which involves the estimation of the probability density functions (PDFs), not necessarily Gaussian, of the system state (Doucet et al., 2001). Like the Kalman filter, this filter operates in two steps: an analysis step at measurement times, which updates the filtering density p_k(x_k | y_{1:k}) — the conditional density of the state vector x_k given all measurements y_{1:k} = (y_1, ..., y_k) up to time k — with Bayes' rule, and a forecast step, which propagates the predictive density p_{k+1|k}(x_{k+1} | y_{1:k}) to the time of the next available observations. The particle filter is a discrete approximation of the optimal nonlinear filter based on point-mass representations (mixtures of Dirac distributions), called particles, of the state's PDFs. In this filter, each particle is assigned a weight that is updated by the filter's analysis step, and the solution is then the weighted mean of the particle ensemble. In practice, this filter suffers from a major problem known as the degeneracy phenomenon: after several iterations most of the weight is concentrated on very few particles, so that only a tiny fraction of the ensemble contributes to the mean. This very often leads to the divergence of the filter. Using more particles can only attenuate this problem over short time periods, and the only practical way around it is resampling (Doucet et al., 2001). This technique basically consists of drawing new particles according to the distribution of the ensemble and then assigning them uniform weights. Besides being computationally intensive, this approach introduces Monte Carlo fluctuations which can seriously degrade the filter's performance. In practice, even with resampling, the filter still requires a large number of particles to achieve acceptable performance. This makes brute-force implementation of the particle filter problematic with high dimensional systems.
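The weight update and the degeneracy diagnosis described above can be sketched for a scalar observation. This is an illustrative sketch; the function names, the Gaussian likelihood and the effective-sample-size diagnostic are our own choices, not notation from the paper:

```python
import numpy as np

def update_weights(weights, particles, y, obs_op, obs_var):
    """Particle-filter analysis step: reweight each particle by the
    Gaussian likelihood of the observation y given the particle's
    predicted observation obs_op(particle)."""
    log_lik = -0.5 * (y - obs_op(particles)) ** 2 / obs_var
    w = weights * np.exp(log_lik - log_lik.max())   # stabilized in log space
    return w / w.sum()

def effective_sample_size(weights):
    """1 / sum(w_i^2): close to N for uniform weights and close to 1
    when the ensemble has degenerated onto a single particle."""
    return 1.0 / np.sum(weights ** 2)
```

A common rule is to trigger resampling whenever the effective sample size falls below some fraction of the ensemble size N.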
Interesting discussions about the use of the optimal nonlinear filter in atmospheric and oceanic data assimilation systems can be found in Anderson and Anderson (1999), Kivman (2003) and Van Leeuwen (2003). The popular Ensemble Kalman (EnK) filter, introduced by Evensen (1994), also makes use of an ensemble of particles. More precisely, it has the same forecast step as the particle filter, but a different analysis step. The EnK filter retains the "linearity aspect" of the Kalman filter in the analysis, in that it applies the Kalman correction to all particles using forecast error covariances estimated from the particle cloud. It therefore depends only on the first two moments of the ensemble, which makes it suboptimal for non-Gaussian PDFs. In practice, however, the EnK filter was found (Kivman, 2003; Van Leeuwen, 2003) to be more robust than the particle filter for small ensembles, thanks to the Kalman update of its particles, which significantly reduces the risk of ensemble collapse.

2. Low-Rank Kernel Particle Kalman (LR-KPK) Filter

The Kernel Particle Kalman (KPK) filter was introduced by Pham et al. (2004) as an efficient discrete implementation of the optimal nonlinear filter. This filter makes use of a mixture of N Gaussian distributions in a kernel representation to approximate the filter's PDFs. A Gaussian mixture has already been considered by Anderson and Anderson (1999) and Chen and Liu (2000), and is expected to provide a better approximation than the Dirac mixture used in the particle filter. A local linearization about each particle is then applied, under the assumption of small covariance matrices, which leads to a Kalman-type correction for each particle complementing the usual particle-type correction. Basically, the KPK filter runs an ensemble of EK filters, and the analysis state is then the weighted mean of all the sub-filters' analyses. As in the EnK filter, the Kalman-type correction attenuates the degeneracy problem and therefore allows the filter to operate efficiently with small ensembles. The KPK filter, however, requires manipulating an ensemble of N error covariance matrices (one associated with each particle), which is computationally not feasible for high dimensional systems. We therefore follow the formulation of the singular evolutive extended and interpolated Kalman (SEEK and SEIK) filters, which make use of low-rank error covariance matrices, to develop a simplified variant of the KPK filter called the Low-Rank Kernel Particle Kalman (LR-KPK) filter.

The basic idea behind this new filter is to approximate the nonlinear filter's predictive density at time k by a Gaussian mixture with the same small (in a suitable sense) low-rank (rank N - 1) covariance matrix P_{k|k-1} = L_{k|k-1} U_{k|k-1} L_{k|k-1}^T for all the mixture components, i.e.

  p_{k|k-1}(x | y_{1:k-1}) = \sum_{i=1}^N w_{k|k-1}^i \phi(x - x_{k|k-1}^i | P_{k|k-1}),

where x_{k|k-1}^i is the i-th forecast particle, w_{k|k-1}^i the associated weight, and \phi(. | P) denotes the centered Gaussian density with covariance matrix P. It can then be shown that the predictive and analysis densities at the next times can always be approximated by mixtures of Gaussian densities of the same form; in particular,

  p_k(x | y_{1:k}) = \sum_{i=1}^N w_k^i \phi(x - x_k^i | P_k),   with P_k = L_k U_k L_k^T.

The parameters x^i, w^i, L and U of both mixtures are updated by the filter as follows. To present the filter's algorithm, we consider the nonlinear dynamical system

  x_k = M_k(x_{k-1}) + \eta_k,
  y_k = H_k(x_k) + \epsilon_k,

where M_k and H_k represent the transition and observational operators, respectively, and \eta_k and \epsilon_k denote the dynamical and observational Gaussian noise, with zero mean and covariance matrices Q_k and R_k.

Initialization

Based on kernel density estimation, the filter's PDF is initialized by

  p_0(x) = \sum_{i=1}^N w_0^i \phi(x - x_0^i | P_0),

where the initial particles x_0^i are sampled from the unconditional distribution of the initial state estimate, w_0^i = 1/N, and P_0 = h_0^2 cov(x_0^i).
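The EOF-based estimation of the mean state and of low-rank factors L_0, U_0 from a set of model realizations, as used in this initialization, can be sketched as follows. This is an illustrative sketch; the function name, the SVD route to the EOFs and the explained-variance rank selection are our own choices:

```python
import numpy as np

def eof_low_rank(X, variance_frac=0.9):
    """Low-rank covariance factorization via an EOF (PCA) analysis.

    X: (n_state, n_samples) matrix of model realizations.
    Returns the mean state x0, the EOF basis L0 (n_state, r) and the
    diagonal matrix U0 (r, r) such that L0 @ U0 @ L0.T approximates the
    sample covariance, with the rank r chosen so that the retained EOFs
    explain the requested fraction of the total variance."""
    x0 = X.mean(axis=1, keepdims=True)
    A = (X - x0) / np.sqrt(X.shape[1] - 1)          # scaled anomalies
    L, s, _ = np.linalg.svd(A, full_matrices=False)
    var = s ** 2                                     # variance per EOF
    r = int(np.searchsorted(np.cumsum(var) / var.sum(), variance_frac)) + 1
    return x0[:, 0], L[:, :r], np.diag(var[:r])
```

With all EOFs retained, L0 @ U0 @ L0.T recovers the sample covariance exactly; truncating to the leading EOFs gives the low-rank approximation used by the filter.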
Here h_0 is a bandwidth parameter, and p_0 is assumed to be Gaussian with mean x_0 and low-rank covariance matrix P_0. The x_0^i can then be randomly sampled using the second-order exact drawing scheme of Pham (2001). Estimates of x_0 and P_0 = L_0 U_0 L_0^T are obtained as the mean and sample covariance matrix of a set of model realizations; more precisely, the latter is computed via an empirical orthogonal functions (EOF) analysis.

Analysis step

Kalman-type correction: Each forecast particle is corrected with the new observations according to

  x_k^i = x_{k|k-1}^i + L_{k|k-1} U_k (HL)_{k|k-1}^T R_k^{-1} (y_k - H_k(x_{k|k-1}^i)),

where

  (HL)_{k|k-1} = [H_k(x_{k|k-1}^1), ..., H_k(x_{k|k-1}^N)] T,

with T an N x (N - 1) full-rank matrix with zero column sums, and U_k is computed from

  U_k^{-1} = (U_{k|k-1} + L^+ Q_k L^{+T})^{-1} + (HL)_{k|k-1}^T R_k^{-1} (HL)_{k|k-1},

where L^+ = (L_{k|k-1}^T L_{k|k-1})^{-1} L_{k|k-1}^T is the projection operator onto L_{k|k-1}. The associated covariance matrix of the mixture is

  P_k = L_{k|k-1} U_k L_{k|k-1}^T = L_k U'_k L_k^T,

with L_k = [x_k^1, ..., x_k^N] T, and U'_k determined via U'_k^{-1} = B_k U_k^{-1} B_k^T, with B_k = I - U_k (HL)_{k|k-1}^T R_k^{-1} (HL)_{k|k-1}.

Particle-type correction: The particle weights are updated as in the particle filter,

  w_k^i = w_{k|k-1}^i \phi(y_k - H_k(x_{k|k-1}^i) | \Sigma_k) / \sum_{j=1}^N w_{k|k-1}^j \phi(y_k - H_k(x_{k|k-1}^j) | \Sigma_k),

where \Sigma_k = (HL)_{k|k-1} U_{k|k-1} (HL)_{k|k-1}^T + R_k.

Forecast step

The forecast particles x_{k+1|k}^i are obtained by integrating the model forward in time starting from the x_k^i. The weights remain unchanged, and the associated covariance matrix is approximated by P_{k+1|k} = L_{k+1|k} U_{k+1|k} L_{k+1|k}^T, with L_{k+1|k} = [x_{k+1|k}^1, ..., x_{k+1|k}^N] T and U_{k+1|k} = U'_k.

Resampling

A resampling step is applied every m filtering cycles to avoid (i) the degeneracy of the particle weights, in which case a "full resampling" is performed in which the weights are redistributed uniformly, and (ii) a possible increase of P_{k+1|k}, in which case a "partial resampling" is performed in which the weights remain unchanged, so as to reduce Monte Carlo fluctuations. The resampling step is summarized as follows. First, the entropy criterion

  Ent_k = log N + \sum_{i=1}^N w_k^i log w_k^i

is used to decide whether a full or a partial resampling is needed. Then N random Gaussian vectors v^i are drawn with zero mean and covariance matrix h^2 \Pi_{k+1|k}, where

  \Pi_{k+1|k} = P_{k+1|k} + cov(x_{k+1|k}^i ; w_k)

is the covariance matrix of the predictive density, cov(. ; w_k) denoting the weighted sample covariance of the particles, and h is a tuning parameter such that 0 < h < 1.

Full resampling: Select N particles among the x_{k+1|k}^i according to the probabilities w_k^i, then add to each one of them a vector v^i to obtain the new particles. Set w_k^i = 1/N and P_{k+1|k} = h^2 \Pi_{k+1|k}.

Partial resampling: Add the v^i to the x_{k+1|k}^i to obtain the new particles. Keep the weights unchanged and set P_{k+1|k} = h^2 \Pi_{k+1|k}.

3. First Application

The Model

We use the Princeton Ocean Model (POM), a primitive-equations finite-difference model formulated under the hydrostatic and Boussinesq approximations. POM solves the 3-D Navier-Stokes equations on an Arakawa C grid using a numerical scheme that conserves mass and energy. Time stepping is achieved with a leapfrog scheme combined with an Asselin filter. The numerical computation is split into an external barotropic mode with a short time step, which solves for the time evolution of the free-surface elevation and the depth-averaged velocities, and an internal baroclinic mode, which solves for the vertical velocity shear. Horizontal mixing is parameterized using nonlinear viscosities and diffusivities, while vertical mixing is calculated using the Mellor and Yamada 2.5 turbulence closure scheme. The reader is referred to Blumberg and Mellor (1987) for a detailed description of POM. The model domain covers the entire Mediterranean basin. The horizontal resolution is 1/4° x 1/4°, with 25 sigma levels in the vertical. The model bathymetry was obtained from the US Navy Digital Bathymetric Data Bases.
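The leapfrog/Asselin time stepping mentioned above can be illustrated on a scalar ODE. This is a generic sketch of the scheme, not anything POM-specific; the function name and the default filter coefficient are our own choices:

```python
import numpy as np

def leapfrog_asselin(f, x0, dt, nsteps, alpha=0.05):
    """Leapfrog time stepping x_{n+1} = x_{n-1} + 2 dt f(x_n) for dx/dt = f(x),
    with the Asselin filter damping the leapfrog computational mode:
    x_n <- x_n + alpha * (x_{n+1} - 2 x_n + x_{n-1})."""
    x_prev = x0
    x_curr = x0 + dt * f(x0)           # forward Euler start-up step
    traj = [x_prev, x_curr]
    for _ in range(nsteps - 1):
        x_next = x_prev + 2.0 * dt * f(x_curr)
        # Asselin (Robert) filter applied to the central time level
        x_curr = x_curr + alpha * (x_next - 2.0 * x_curr + x_prev)
        x_prev, x_curr = x_curr, x_next
        traj.append(x_next)
    return np.array(traj)
```

Without the filter, the leapfrog scheme's spurious computational mode can grow and contaminate the solution; the filter damps it at the cost of slight damping of the physical mode.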
The surface forcing, which includes monthly wind stress, heat flux, net shortwave radiation and evaporation rate, was derived from the ECMWF reanalysis, except for the precipitation, which was taken from the Jaeger climatology. Bulk formulae were used to compute the surface momentum, heat and freshwater fluxes at each time step of the model integration, taking into account the SST predicted by the model. The model dynamics were first adjusted to a perpetually repeated seasonal cycle by integrating the model climatologically for 20 years. This run started from rest with the MODB-MED4 temperature and salinity profiles. Next, another two-year (1980-1981) integration was carried out to adjust the model dynamics to the inter-annual ECMWF forcing.

Filter Initialization and Experiment Design

A representative set of model realizations was obtained from a 4-year run over 1982-1985. Since the state variables are not of the same nature, they were normalized by the inverse of the square root of their domain-averaged variances before applying the EOF analysis. The filter's rank was set to 50, as the first 50 EOFs explained more than 90% of the total variance of the set. A reference model run was then carried out over 1986. A set of 73 reference states was formed by retaining one state vector every 5 days, both to provide the pseudo-observations and to be compared later with the fields produced by the filter. Twin experiments were carried out assimilating observations of sea surface height extracted from the reference states at every fourth grid point. Random Gaussian errors of zero mean and 3 cm standard deviation were added to the observations. The assimilation experiment was initialized from the mean state vector of the 4-year period used for the calculation of the EOFs. The model was assumed perfect.

Assimilation Results

Fig. 2 plots the evolution in time of the relative analysis errors for the model state variables as obtained with the LR-KPK and SEEK filters.
Both filters were implemented with the same rank (50), so that their computational costs were practically the same; the LR-KPK filter therefore used only 51 particles. The behavior of the LR-KPK filter is quite satisfactory. It provides better estimates for all model state variables and appears to be more robust than the SEEK filter. The good performance of the SEEK filter, which is a Kalman-based filter, suggests that the model is not strongly nonlinear.

Fig. 1. RMS for physical variables.

More applications with different setups and with different ocean and atmospheric models are still needed to further assess the qualities of the new filter and the benefits of the nonlinear analysis.

References:

Anderson, J., and S. Anderson, A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741-2758, 1999.
Blumberg, A. F., and G. L. Mellor, A description of a three-dimensional coastal ocean circulation model. In N. S. Heaps, editor, Three-Dimensional Coastal Ocean Models, Coastal and Estuarine Sciences, Vol. 4, pages 1-16. AGU, Washington, D.C., 1987.
Chen, R., and J. Liu, Mixture Kalman filters. J. Roy. Statist. Soc. B, 62, 493-508, 2000.
Doucet, A., N. de Freitas, and N. Gordon, editors, Sequential Monte Carlo Methods in Practice. Springer, New York, 581 pp., 2001.
Evensen, G., Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10,143-10,162, 1994.
Kivman, G., Sequential parameter estimation for stochastic systems. Nonlin. Proc. Geophys., 10, 253-259, 2003.
Pham, D.-T., Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194-1207, 2001.
Pham, D.-T., K. Dahia, and C. Musso, A Kalman-Particle Kernel filter and its application to terrain navigation. Proc. 6th Int. Conf. Inf. Fusion, 2004.
Van Leeuwen, P. J., A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131, 2071-2084, 2003.