Dual Kalman Filters for Simultaneous Assimilation of Physical and Biochemical Data into a Marine Ecosystem Model

George Triantafyllou
Hellenic Center for Marine Research, PO BOX 712, 19013, Anavyssos, Greece
Email: email@example.com

Ibrahim Hoteit
Scripps Institution of Oceanography, La Jolla, CA 92093-0230, USA
Email: firstname.lastname@example.org

Gerasimos Korres
Hellenic Center for Marine Research, PO BOX 712, 19013, Anavyssos, Greece
Email: email@example.com

Abstract: We present a reduced dual Kalman filter approach to simultaneously assimilate physical and biochemical data into a complex three-dimensional ecosystem model of the Eastern Mediterranean. The ecosystem model is composed of two online coupled sub-models: the Princeton Ocean Model (POM) and the European Regional Seas Ecosystem Model (ERSEM). In the dual approach, two Kalman filters acting independently on the physics and the ecology assimilate the data available to each subsystem. Here we use the Singular Evolutive Extended Kalman (SEEK) filter, which operates with low-rank error covariance matrices to reduce the heavy computational burden of the extended Kalman filter. Results of preliminary twin experiments are presented and discussed.

1. Introduction

Marine ecosystem modeling requires the coupling of two complex models: the physical model, which describes the currents of the modeled area, and the biochemical model, which describes the interactions between the different ecological species. To date, most studies have considered the assimilation problem for only one of the two models while assuming that the other is perfect. However, assimilating data into one system only may result in misalignments of the physical and biological fronts, giving rise to spurious cross-frontal fluxes of biological quantities. For instance, assimilation of biological data alone often leads to spurious ecological responses (e.g. enhanced productivity).
Likewise, a perfect-model assumption is far too optimistic: obtaining reliable estimates of the ecology using imperfect physical forcing, for example, can be very difficult. It is therefore necessary to constrain both models simultaneously with physical and biological observations, to improve their behavior and to ensure consistency between their respective analyses. In other words, a successful ecosystem assimilation system requires the coupling of a biological and a hydrodynamical assimilation system capable of producing relevant physical fields that support the newly analyzed biology.

In the context of Kalman filtering, this problem can be addressed following either the "joint approach" or the "dual approach" (Wan and Nelson, 2000). Both approaches were originally designed to estimate the model state concurrently with the model parameters using an analogous filter. We generalize them here to the problem of estimating the state of the ecological model concurrently with the physical forcing, which evolves in time according to a dynamical model. Another potential difference in our case is that ecological as well as physical data can be available for assimilation.

The joint approach is the simpler of the two to conceptualize: the physical state vector is appended to the ecological state vector to form a single state vector for the coupled system. The physical and ecological observations are likewise appended into one observation vector. The time update of the ecological part and of the physical part is performed by each model, but the entire augmented covariance matrix is propagated as one. The dual filtering approach adopted in this study intertwines a pair of distinct Kalman filters, one estimating the ecology and the other estimating the physics.
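To make the structure of the dual approach concrete, the following is a minimal sketch of one forecast/analysis cycle with two separate filters and one-way coupling. It is an illustration under stated assumptions, not the system used in this study: the models are toy linear operators, the filters are plain full-rank Kalman filters, the ecology is forced by the analysed physics, and all names (`analysis`, `dual_step`, the `toy` dictionary and its keys) are hypothetical.

```python
import numpy as np

def analysis(x_f, P_f, y, H, R):
    """Kalman analysis: correct the forecast x_f (covariance P_f) with data y."""
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)   # Kalman gain
    x_a = x_f + K @ (y - H @ x_f)
    P_a = (np.eye(x_f.size) - K @ H) @ P_f
    return x_a, P_a

def dual_step(x_p, P_p, x_e, P_e, y_p, y_e, toy):
    """One cycle of two distinct filters (hypothetical toy linear models)."""
    # Physical filter: forecast, then correct with physical data only.
    x_p = toy["M_p"] @ x_p
    P_p = toy["M_p"] @ P_p @ toy["M_p"].T + toy["Q_p"]
    x_p, P_p = analysis(x_p, P_p, y_p, toy["H_p"], toy["R_p"])
    # Ecological filter: forecast forced by the analysed physics (assumption),
    # then corrected with ecological data only. No joint covariance is formed.
    x_e = toy["M_e"] @ x_e + toy["F"] @ x_p
    P_e = toy["M_e"] @ P_e @ toy["M_e"].T + toy["Q_e"]
    x_e, P_e = analysis(x_e, P_e, y_e, toy["H_e"], toy["R_e"])
    return x_p, P_p, x_e, P_e

# Scalar toy setting: identity models and observation operators, unit errors.
one = np.eye(1)
toy = {"M_p": one, "Q_p": 0 * one, "H_p": one, "R_p": one,
       "M_e": one, "F": one, "Q_e": 0 * one, "H_e": one, "R_e": one}
x_p, P_p, x_e, P_e = dual_step(np.zeros(1), one, np.zeros(1), one,
                               y_p=np.array([2.0]), y_e=np.array([3.0]), toy=toy)
```

In a joint filter the two states would instead be stacked into one vector with a single augmented covariance matrix; here each filter keeps its own, smaller covariance, which is what allows a different simplification for each subsystem.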
In this sense, the dual approach better respects the one-way coupling of our ecological model by allowing the filter to correct the ecology independently of the physics, resulting in more degrees of freedom to better fit the data. Another advantage of the dual approach is that it allows different degrees of simplification to be applied to each filter according to the needs of the system and the user. This was of crucial importance for the present study, as it significantly reduced the heavy computational load associated with applying two advanced Kalman filters to two state-of-the-art ecological and ocean circulation models. Hereafter we briefly describe the ecosystem model and the assimilation method before presenting the results of the assimilation experiments.

[Figure 1: diagram of the pelagic and benthic food web (phytoplankton, zooplankton, bacteria, nutrient and carbon pools), driven by the physical model and by irradiation, wind and heat flux forcing.]
Fig. 1. A schematic description of the ecosystem functional groups and their trophic relations.

2. The Ecosystem Model

The ecosystem model consists of two online coupled sub-models: the Princeton Ocean Model (POM) (Blumberg and Mellor, 1987), which describes the hydrodynamics of the area and provides the physical forcing to the second sub-model, the European Regional Seas Ecosystem Model (ERSEM) (Baretta et al., 1995), as summarized in Fig. 1. POM is a three-dimensional, time-dependent, primitive-equation ocean model; the equations are solved with an Arakawa-C differencing scheme and a $\sigma$-coordinate discretization in the vertical. Time integration is achieved through an explicit scheme in which the barotropic and baroclinic modes are integrated separately using a leapfrog scheme with different time steps.

ERSEM describes the biogeochemical cycles and has been successfully applied in a wide variety of regimes, from coastal eutrophic to open-sea oligotrophic systems, and on a variety of spatial scales. The use of functional groups (rather than individual species), in which organisms with similar properties are grouped together, increases the portability of ERSEM to any area. The biotic system encompasses the three major types (producers, consumers and decomposers), each further subdivided, which raises the model complexity to 88 state variables.

The modeled area covers the entire eastern Mediterranean (Fig. 2). A horizontal resolution of 6 minutes was chosen, producing a grid of $165 \times 106$ horizontal points. The $\sigma$-layers were distributed logarithmically near the surface and the bottom in order to better resolve the surface and bottom boundary layers, respectively. The model was initialized with climatological, objectively analyzed temperature and salinity profiles from the Mediterranean Ocean Database (MODB-MED4). Initial velocities were set to zero. Wind stress fields were derived from the ECMWF 6-hour reanalysis data. Biological variables were initialized through a 1D ecological model as described by Triantafyllou et al. (2004).

Fig. 2. Model domain.

3. The Assimilation Method

The assimilation scheme is based on the Singular Evolutive Extended Kalman (SEEK) filter, which was developed by Pham et al. (1997) as a reduced-rank Extended Kalman (EK) filter with application to systems of very high dimension ($n$) in mind. It reduces the prohibitive computational burden of the EK filter, associated with the huge size of the filter's error covariance matrix $P$, by operating with low-rank ($r$) error covariance matrices, i.e. $P = L U L^T$, where $L$ and $U$ are $n \times r$ and $r \times r$ matrices.
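As an illustration of what operating with $P = L U L^T$ buys, the sketch below performs a Kalman analysis step using only $L$ and $U$, never forming the $n \times n$ matrix $P$, and checks it against the full-rank formula. It is a minimal sketch assuming a linear observation operator `H` and observation error covariance `R`; it omits any forgetting factor or other refinement of the actual SEEK filter, and the function name `lowrank_analysis` and the random test matrices are hypothetical.

```python
import numpy as np

def lowrank_analysis(x_f, L, U, y, H, R):
    """Kalman analysis with P = L U L^T, working only with r x r matrices."""
    HL = H @ L                                    # (m, r): basis seen by the data
    R_inv = np.linalg.inv(R)
    # Reduced covariance update: U_a^{-1} = U^{-1} + (HL)^T R^{-1} (HL)
    U_a = np.linalg.inv(np.linalg.inv(U) + HL.T @ R_inv @ HL)
    K = L @ U_a @ HL.T @ R_inv                    # gain acts only along columns of L
    x_a = x_f + K @ (y - H @ x_f)
    return x_a, U_a

# Hypothetical small test problem: n state variables, rank r, m observations.
rng = np.random.default_rng(0)
n, r, m = 200, 5, 20
L = rng.standard_normal((n, r))
U = np.diag(rng.uniform(0.5, 2.0, r))
H = rng.standard_normal((m, n))
R = 0.1 * np.eye(m)
x_f = rng.standard_normal(n)
y = rng.standard_normal(m)

x_a, U_a = lowrank_analysis(x_f, L, U, y, H, R)

# Reference: the usual gain built from the explicit covariance P = L U L^T.
P = L @ U @ L.T
K_full = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
x_a_full = x_f + K_full @ (y - H @ x_f)
```

The two analyses agree by the matrix inversion lemma, but the low-rank form only inverts $r \times r$ matrices and the (typically diagonal) `R`, and the correction `x_a - x_f` is by construction a combination of the columns of `L`.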
Under this assumption, the algorithm of the EK filter remains mostly unchanged. Only the evolution of $P$ is avoided and replaced by that of $L$ and $U$. The EK filter correction is then applied only along the directions of $L$, which we refer to as the "correction basis" of the filter. At the initial time, the correction basis is initialized through an Empirical Orthogonal Functions (EOF) analysis, generally applied to a historical set of model outputs. The evolution of $L$ is then performed with the tangent linear model. This can be numerically very demanding, as it requires $r + 1$ model integrations to linearize the model around every column of $L$. Several studies have demonstrated, however, that the evolution of $L$ can be omitted for weakly variable models (Hoteit et al., 2002) without significantly affecting the filter's performance. The resulting Singular Fixed Extended Kalman (SFEK) filter, which uses a set of EOFs as an invariant correction basis, can be numerically $r + 1$ times faster than the SEEK filter.

In separate recent studies, the authors noticed that the SFEK filter behaved fairly well when applied to ERSEM (Triantafyllou et al., 2004), while the evolution of $L$ was needed for the assimilation with POM (Korres et al., 2005). These findings were very beneficial for setting up a computationally reasonable dual filtering system in which the assimilation system ERSEM/SFEK was coupled with the system POM/SEEK. This allows a significant reduction in computing time with respect to the joint approach, since the latter uses a single state vector and a single filter (SEEK or SFEK), and the SEEK filter was necessary to obtain satisfactory performance with POM. It is worth noting that, in its general form, the dual approach should also assimilate ecological data into the physical POM/SEEK system.
However, this was not considered in the present study, in order to avoid the heavy computational load required to linearize the biological model with respect to the physics.

4. Experiments and Discussion

The effectiveness of the dual assimilation system was evaluated following a "twin-experiments" approach in which the "truth" is assumed to be provided by the model itself. Twin experiments allow the filters' behavior to be assessed on non-observed variables, as all uncertain parameters are known by design. The model statistics were also used to initialize the filters. After a four-year integration of the ecosystem model to achieve a quasi-adjustment of the model dynamics, a further two-year integration was carried out to generate two historical sequences of 365 state vectors each, one for POM and one for ERSEM. The states were sampled every two days. The "physical filter" and the "ecological filter" were then initialized with the means of the POM and ERSEM sequences, respectively. Reduced-rank approximations of the filters' initial error covariance matrices were obtained by applying separate EOF analyses to the two sequences. Prior to the analysis, the model variables were normalized by the inverse of the square root of their domain-averaged variances, to make the distance between different model state variables independent of the units of measure. The ranks of the physical filter and the ecological filter were set to 50 and 20, as the first 50 physical EOFs and 20 ecological EOFs explain more than 80% and 90% of the systems' total variance, respectively.

              Physics   Ecology
Model         POM       ERSEM
Filter        SEEK      SFEK
Data          SSH       CHL
Filter Rank   50        20
Table 1. Characteristics of each component of the dual assimilation system.

For the twin experiments, pseudo-observations of sea surface height (SSH) and chlorophyll (CHL) were assumed to be available every two days over the whole surface of the model domain.
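The EOF-based initialization described above (sequence mean as initial state, normalization of each variable, truncation at a chosen fraction of explained variance) can be sketched as follows. This is a minimal illustration on synthetic data, not the procedure applied to POM or ERSEM: the function `eof_basis`, the `var_slices` layout of the state vector, the synthetic "temperature-like" and "chlorophyll-like" blocks, and the choice of scaling the EOFs back to physical units are all assumptions.

```python
import numpy as np

def eof_basis(snapshots, var_slices, threshold):
    """Mean state, EOF basis L (n x r) and U (r x r) from a set of model states.

    snapshots : (n, N) array, one model state per column
    var_slices: one slice per model variable (hypothetical state layout)
    threshold : fraction of total variance the retained EOFs must explain
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    anomalies = snapshots - mean
    # Scale each variable by the square root of its domain-averaged variance
    # so that variables with different units contribute comparably.
    scale = np.ones(snapshots.shape[0])
    for s in var_slices:
        scale[s] = np.sqrt(anomalies[s].var(axis=1, ddof=1).mean())
    scaled = anomalies / scale[:, None]
    # EOFs are the left singular vectors of the scaled anomaly matrix.
    E, sing, _ = np.linalg.svd(scaled, full_matrices=False)
    eigvals = sing**2 / (snapshots.shape[1] - 1)
    explained = np.cumsum(eigvals) / eigvals.sum()
    r = min(int(np.searchsorted(explained, threshold)) + 1, eigvals.size)
    L = E[:, :r] * scale[:, None]        # retained EOFs, back in physical units
    U = np.diag(eigvals[:r])             # variance carried by each retained EOF
    return mean.ravel(), L, U, explained[r - 1]

# Synthetic sequence of 365 states with two variables of very different scale.
rng = np.random.default_rng(1)
N = 365
temp_like = rng.standard_normal((30, 3)) @ rng.standard_normal((3, N))
chl_like = 1e-3 * (rng.standard_normal((20, 2)) @ rng.standard_normal((2, N)))
snapshots = np.vstack([temp_like, chl_like])

mean, L, U, frac = eof_basis(snapshots, [slice(0, 30), slice(30, 50)], 0.90)
```

Without the normalization step, the small-amplitude variable would contribute almost nothing to the leading EOFs, which is the unit-dependence the normalization is meant to remove.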
The pseudo-observations were extracted from a set of 45 reference physical and ecological states simulated by the coupled model over a three-month period (from March 5th to June 5th). Random Gaussian noise was added to the pseudo-observations to build a more realistic framework. The reference states were afterwards used to evaluate the filters' estimates relative to those of the model free-run. The free-run refers to a model integration without any assimilation over the assimilation period, initialized from the filters' initial conditions. The quality of the filters' estimates was measured for all physical and ecological state variables by the relative root-mean-square (RMS) error, defined as the ratio between the filter/reference and free-run/reference domain misfits. Relative errors smaller than unity indicate that the solution of the assimilation system is closer to the truth than that of the model free-run.

Assimilation experiments were performed to assess the performance of the filters. Fig. 3 and Fig. 4 show the evolution of the filters' RMS errors resulting from the assimilation run for the physical and the ecological variables, respectively, and compare them to those obtained from the model free-run. For both sub-systems, the dual assimilation clearly improves the models' fit to the data for all variables and throughout the assimilation window. After a large reduction of the estimation error at the first analysis step, the subsequent analyses bring smaller corrections.

Fig. 3. RMS for physical variables.
Fig. 4. RMS for ecological variables.

Overall, the state variable estimates were improved by almost 70% for the physical model and by more than 45% for the ecological model. Better results could have been obtained for the ecology had the SEEK filter been used, but this would have entailed a significant increase in the total computational cost. The assimilation results also suggest that the analyses of the two sub-systems are consistent, as no unstable behavior was detected in the evolution of either filter's forecasts. In other experiments not shown here, the assimilation of SSH alone was also found to have an important impact on the ecology, although the additional assimilation of CHL data leads to further improvements. This is in accordance with the oligotrophic conditions that characterize the system, in which the evolution of the ecological variables strongly depends on the physical forcing, and it shows the importance of improving the physical forcing for the estimation of the ecology.

Compared to the joint approach, the dual approach better respects the one-way coupling of our ecosystem model. It also offers more flexibility, e.g. more degrees of freedom to fit the data and different choices for the parameters of the assimilation scheme according to the needs of each sub-system, which is very important for an efficient implementation in terms of cost and performance. In our experiments, the dual filtering system was able to improve the overall behavior of both subsystems. However, more experiments are needed to closely assess the impact of omitting the assimilation of the ecological data into the physical model, which can be important when real data are assimilated to guarantee consistency between the analyses of both filters.

References:

Baretta, J. W., W. Ebenhoh, and P. Ruardij. The European regional seas ecosystem model, a complex marine ecosystem model. Netherlands Journal of Sea Research, 33, 233-246, 1995.
Blumberg, A. F., and G. L. Mellor. A description of a three-dimensional coastal ocean circulation model. In N.S. Heaps, editor, Three-dimensional coastal ocean circulation models, Coastal Estuarine Science, pages 1-16. AGU, Washington, D.C., 4th edition, 1987.
Hoteit, I., D.T. Pham, and J. Blum. A simplified reduced order Kalman filtering and application to altimetric data assimilation in the Tropical Pacific. J. Mar.
Sys., 36, 101-127, 2002.
Korres, G., I. Hoteit, and G. Triantafyllou. Data assimilation into a Princeton Ocean model using advanced Kalman filters. J. Mar. Sys., accepted, 2006.
Pham, D. T., J. Verron, and M. C. Roubaud. Singular evolutive Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Sys., 16, 323-340, 1997.
Triantafyllou, G., I. Hoteit, and A.I. Pollani. Toward a pre-operational data assimilation system for the E. Mediterranean using Kalman filtering techniques. Lect. Ser. Comp. comp. sci., 1, 500-505, 2004.
Wan, E., and T. Nelson. Dual Kalman Filtering Methods for Nonlinear Prediction, Smoothing and Estimation. In: Adv. Neur. Inf. Proc. Sys., Vol. 9, 1997.