Ocean Modelling 33 (2010) 87–100. doi:10.1016/j.ocemod.2009.12.004

Implementation of a reduced rank square-root smoother for high resolution ocean data assimilation

E. Cosme a,*, J.-M. Brankart a, J. Verron a, P. Brasseur a, M. Krysta a,b
a LEGI, CNRS/UJF/INPG, BP53, 38041 Grenoble Cedex, France
b LJK, CNRS/UJF/INPG/INRIA, BP53, 38041 Grenoble Cedex, France
* Corresponding author. E-mail address: Emmanuel.Cosme@hmg.inpg.fr (E. Cosme).

Article history: Received 5 February 2009; Received in revised form 1 December 2009; Accepted 7 December 2009; Available online 24 December 2009.

Keywords: High resolution ocean modelling; Retrospective data assimilation; Kalman filtering; Smoothing

Abstract

Optimal smoothers enable the use of future observations to estimate the state of a dynamical system. In this paper, a square-root smoother algorithm is presented, extended from the Singular Evolutive Extended Kalman (SEEK) filter, a square-root Kalman filter routinely used for ocean data assimilation. With this filter algorithm, the smoother extension appears almost cost-free. A modified algorithm implementing a particular parameterization of model error is also described. The smoother is applied with an ocean circulation model in a double-gyre, 1/4° configuration, able to represent mid-latitude mesoscale dynamics. Twin experiments are performed: the true fields are drawn from a simulation at a 1/6° resolution and perturbed with noise. Then, altimetric satellite tracks and sparse vertical profiles of temperature are extracted to form the observations. The smoother is efficient in reducing errors, particularly in the regions poorly covered by the observations at the filter analysis time. It results in a significant reduction of the global error: the Root Mean Square Error in Sea Surface Height from the filter is further reduced by 20% by the smoother. The actual smoothing of the global error through time is also verified. Three essential issues are then investigated: (i) the time distance within which observations may be favourably used to correct the state estimates is found to be 8 days with our system. (ii) The impact of the model error parameterization is stressed. When this parameterization is spuriously neglected, the smoother can deteriorate the state estimates. (iii) Iterations of the smoother over a fixed time interval are tested. Although this procedure improves the state estimates over the assimilation window, it also makes the subsequent forecast worse than the filter in our experiment.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

The development of data assimilation in geophysics has been triggered by the need for an accurate representation of the current atmospheric state to initialize a numerical weather forecast. In the framework of estimation theory, this is a filtering problem: estimating a dynamical state given a (numerical) model and past and present observations (the only data available for a forecast). The spearhead of dynamical estimation methods is the Kalman filter (Kalman, 1960). The Kalman filter has received much interest in geophysics. This is due to its solid roots in estimation theory on the one hand, and to its possible implementation – provided some simplifications are made – with large numerical models on the other hand [e.g. Parrish and Cohn, 1985; Todling and Cohn, 1994; Evensen, 1994; Fukumori and Malanotte-Rizzoli, 1995; Houtekamer and Mitchell, 1998].
However, many oceanographic applications now expect more from data assimilation than merely initializing a prediction. The study of climate variability and evolution, targeted case studies of ocean dynamics, or biogeochemistry, require reanalyses of the ocean circulation: complete, consistent, and accurate datasets of the variables that describe the ocean, over a continuous time period in the past. The estimation of past ocean states implies that posterior observations exist (up to present), and may clearly be used retrospectively to improve the data assimilation outputs. Such an estimation problem is tackled by smoothers.

Optimal linear smoothers stemming from estimation theory can be considered as an extension of the Kalman filter that takes future observations into account. Actually, all optimal linear smoother algorithms involve the Kalman filter. For a detailed description of the various types of smoothers based on Kalman's theory, and their algorithms, we refer the reader to textbooks such as Anderson and Moore (1979) or Simon (2006). In this paper, we are concerned with the sequential approach to smoothing, as presented by Cohn et al. (1994) or Evensen and van Leeuwen (2000): the Kalman filter analysis is followed by retrospective analyses, i.e. corrections of the past state estimates using the Kalman filter innovation.

Smoother algorithms involve the Kalman filter; however, it is well known that the Kalman filter cannot be implemented in its canonical form with realistic models of the ocean. First, models are generally nonlinear. Linearity is an essential hypothesis for the Kalman filter to be optimal. Various strategies can be adopted to extend the Kalman filter to nonlinear systems. The Extended Kalman filter is the straightforward, first order generalization of the Kalman filter. Higher order approaches exist, such as the unscented Kalman filter (Julier and Uhlmann, 1997). Evensen (1994) proposed an ensemble approach to nonlinear Kalman filtering. The second hindrance to the application of the standard Kalman filter is due to computer storage and CPU requirements. The Kalman filter is computationally intractable, and approximations have to be made for implementation. The most common strategy in oceanography consists in reducing the dimension of the error space. The Ensemble Kalman Filter (EnKF) of Evensen (1994) does it implicitly, by handling an ensemble of states far smaller than the state vector dimension. The Reduced-Rank SQuare-RooT (RRSQRT) filter (Verlaan and Heemink, 1997), the Error Subspace Statistical Estimation (ESSE) algorithm (Lermusiaux and Robinson, 1999), and the Singular Evolutive Extended Kalman (SEEK) filter (Pham et al., 1998; Brasseur and Verron, 2006) are explicitly founded on the order reduction.

Some algorithms approximating the Kalman filter have been extended for smoothing and applied to problems connected to oceanography. It is the case of the EnKF (van Leeuwen and Evensen, 1996; Evensen and van Leeuwen, 2000), with applications to a two-layer quasigeostrophic model (van Leeuwen, 1999, 2001). Lermusiaux and Robinson (1999) have derived an ESSE smoother. This algorithm has been run in real data assimilation experiments at high resolution by Lermusiaux (1999a,b) and Lermusiaux et al. (2002).
Fukumori (2002) has tested a Kalman filter and an approximation of the RTS smoother (Rauch et al., 1965), based on a state-partitioning approach, with a one-dimensional shallow-water model. Todling and Cohn (1996) and Todling et al. (1998) have introduced various strategies to make the Kalman filter and optimal smoothers applicable to large systems, and tested some of them with a linear shallow-water model. Gaspar and Wunsch (1989) have experimented with the RTS smoother using an over-simplified equation of ocean dynamics. Recently, Ravela and McLaughlin (2007) have illustrated fast ensemble smoothing algorithms in identical twin experiments with the Lorenz-95 100-variable model.

The work presented in this paper basically aims at pioneering the implementation of a smoother algorithm, based on the SEEK filter algorithm, in a high resolution ocean data assimilation system embedding a primitive equation ocean circulation model. The mid-term perspective is to make high quality, high resolution reanalyses of the ocean circulation. With this in mind, the chosen smoothing algorithm is sequential, in the spirit of Evensen and van Leeuwen (2000) or Cohn et al. (1994). The first part of the paper is dedicated to the theoretical derivation and the algorithmic aspects of the SEEK extension to smoothing, including a method to handle a common parameterization of the model error. It is shown that the implementation of this sequential smoother is straightforward, and results in a negligible additional cost, when the SEEK filter is already in place. In the second part of the paper, the application of the SEEK smoother with a mesoscale ocean circulation model is demonstrated and evaluated. Three issues expected to be tricky or relevant in a real data assimilation context are examined.

Section 2 recapitulates the Kalman filter and the sequential smoothing ingredients and algorithm. Section 3 presents the SEEK filter and the SEEK smoother algorithm. The SEEK filter is well known (Pham et al., 1998; Rozier et al., 2007). In Section 3.1, we stress the aspects particularly relevant or problematic for the smoother extension. In Section 3.2, the equations of the new SEEK smoother algorithm are written down. For clarity's sake, the perfect model and imperfect model cases are considered separately. Two practical aspects are examined in Section 4: the computational complexity and the numerical implementation. In Section 5, we present the set-up of a twin experiment carried out with the SEEK smoother and an ocean circulation model in a high resolution, idealized configuration. Results are examined in Section 6, and Section 7 reports three examinations and numerical experiments concerning key issues with the SEEK smoother. Section 8 concludes.

2. Optimal linear smoothers

The Kalman filter and the most common optimal linear smoothers are derived and described in many textbooks (Anderson and Moore, 1979; Simon, 2006; Evensen, 2007). Here we first recall the well-known Kalman filter algorithm to set down the notations that will be used next. Then we provide an intuitive view and the equations of the less-known optimal smoothers. Note that we are interested here in smoothers of the sequential type.

2.1. The Kalman filter

The Kalman filter is initialized with an analysis state vector x^a_0 and the associated error covariance matrix P^a_0. The assimilation sequence is performed according to the Kalman filter equations:

Initialization: x^a_0 and P^a_0.

Forecast step:

x^f_{k|k-1} = M_{k-1,k} x^a_{k-1|k-1},    (1a)
P^f_{k|k-1} = M_{k-1,k} (M_{k-1,k} P^a_{k-1|k-1})^T + Q_{k-1,k}.    (1b)

Analysis step:

G_k = H_k (H_k P^f_{k|k-1})^T + R_k,    (2a)
d_k = y_k - H_k x^f_{k|k-1},    (2b)
K_{k|k} = (H_k P^f_{k|k-1})^T G_k^{-1},    (2c)
x^a_{k|k} = x^f_{k|k-1} + K_{k|k} d_k,    (2d)
P^a_{k|k} = (I - K_{k|k} H_k) P^f_{k|k-1}.    (2e)

The subscript and superscript notations are those of Cohn et al. (1994). Superscripts f and a, respectively, mean 'forecast' and 'analysis'. k-1 and k indicate two consecutive times, t_{k-1} and t_k, at which observations are available. The subscript notation k|k-1 is inherited from estimation theory. In the Kalman filter theory, the state vector and the associated error covariance matrix are the first and second moments of a probability distribution function for the state, conditioned on observations. The conditioning is symbolized by the vertical bar. Then, x^f_{k|k-1} represents the forecast state at time t_k, i.e. the state estimate at time t_k given the observations up to time t_{k-1}. x^a_{k|k} is the analysis state at time t_k, i.e. the state estimate at time t_k given the observations up to time t_k. P^f_{k|k-1} and P^a_{k|k} are the associated state error covariance matrices. Eqs. (1a) and (1b) perform the propagation between times t_{k-1} and t_k. They involve the linear dynamical model M_{k-1,k} and the model error covariance matrix Q_{k-1,k}. The other equations perform the observational updates of the state estimate and error statistics at time t_k. They use the observation vector y_k, the observation error covariance matrix R_k, and the observation operator H_k. Three elements are internally defined: the innovation d_k, the innovation error covariance matrix G_k, and the Kalman gain K_{k|k}. Note that the superscript (f, a) and the conditioning notation in the subscript (k|k-1, k|k) are redundant. Indeed, a state with subscript k|k-1 is obviously a forecast estimate. We use the superscripts anyway to stick to the usual notation.

2.2. Sequential smoothers and cross-covariance matrices

Smoothers use the observations at a time t_k to improve the estimation of states at times t_i, with t_i ≤ t_k. The analysis state at t_i that includes the information of all observations up to time t_k is noted x^a_{i|k}, in agreement with the notations introduced previously. We denote by R_k the set of time indices at which the retrospective analysis is produced, i.e. the set that i must span when the observations at t_k are considered. The nature of R_k determines the type of smoother. If R_k is a singleton, it corresponds to the fixed-point smoother: the user is interested in the optimal estimation of the state at one single date (if R_k = {0} for instance, it is the initial state of a time interval). If R_k is a fixed and homogeneous series, R_k = {0, 1, ..., M-1, M} for instance, the smoother is of the fixed-interval type. Finally, if R_k contains L indices (L fixed) preceding k, R_k = {k-L, ..., k-1}, then the smoother is of the fixed-lag type, and L is the lag. In the numerical experiments presented next, the fixed-lag type is used, but all the theoretical considerations are valid for the three types. In particular, there is no major difference in their implementations. Finally, we call the lag either the number of retrospective analyses, or the time length covered by the maximum number of retrospective analyses. When the assimilation cycle length is known, no confusion is possible.
Let e^f_{k|k-1} represent the forecast error at time t_k (a random vector), and e^a_{i|k-1} the analysis error at time t_i (t_i < t_k), given all observations up to time t_{k-1}. The filter is not concerned with times previous to t_k, and the filter algorithm handles only the covariance matrices (E is the statistical expectation operator):

P^f_{k|k-1} = E[e^f_{k|k-1} (e^f_{k|k-1})^T]   and   P^a_{k|k} = E[e^a_{k|k} (e^a_{k|k})^T].

In the 'standard' sequential smoothing algorithm, as presented in textbooks (Anderson and Moore, 1979; Simon, 2006) and implemented by Cohn et al. (1994) with a linear shallow-water model for instance, the novelty with respect to the Kalman filter lies in the use of cross-covariance matrices. Crossing takes place in the time dimension. The t_k forecast / t_i analysis cross-covariance matrix is defined as:

P^{fa}_{k,i|k-1} = E[e^f_{k|k-1} (e^a_{i|k-1})^T].

The sequential smoothers also involve cross-covariance matrices of analysis errors only:

P^{aa}_{k,i|k} = E[e^a_{k|k} (e^a_{i|k})^T].

The cross-covariance matrices are easily recognizable by the superscript "aa" or "fa" in their notation. These matrices make it possible to infer a correction at a past time t_i from information at time t_k. They are involved in the calculation of the smoother gains, and the smoother gains are themselves used for the smoother analyses.

The sequential smoothers are initialized as the Kalman filter is. Adjustments in the algorithm may be necessary at the beginning of the time integration. In the case of the fixed-lag smoother for example, while k is lower than the lag L, the retrospective analyses are performed only on the k-1 previous states. In the Kalman filter forecast step, the covariance propagation (Eq. (1b)) is split into the two Eqs. (3b) and (3c) below, because the intermediate results (cross-covariance matrices) are needed for the smoother analyses:

Forecast step:

x^f_{k|k-1} = M_{k-1,k} x^a_{k-1|k-1},    (3a)
P^{fa}_{k,i|k-1} = M_{k-1,k} P^{aa}_{k-1,i|k-1},   i ∈ R_k,    (3b)
P^f_{k|k-1} = M_{k-1,k} (P^{fa}_{k,k-1|k-1})^T + Q_{k-1,k}.    (3c)

The Kalman filter analysis equations remain unchanged, but are completed with the smoother equations:

Smoother analysis step:

K_{i|k} = (H_k P^{fa}_{k,i|k-1})^T G_k^{-1},   i ∈ R_k,    (4a)
x^a_{i|k} = x^a_{i|k-1} + K_{i|k} d_k,   i ∈ R_k,    (4b)
P^{aa}_{k,i|k} = (I - K_{k|k} H_k) P^{fa}_{k,i|k-1},   i ∈ R_k,    (4c)
P^a_{i|k} = P^a_{i|k-1} - K_{i|k} H_k P^{fa}_{k,i|k-1},   i ∈ R_k.    (4d)

Note that, theoretically, Eq. (4d) is useful only for performance diagnostics, for the result is not used in the following smoother calculations. The smoother equations are easily derived using an augmented state vector approach of Kalman filtering, where the state vector is augmented with the desired past states (Anderson and Moore, 1979).

3. Square-root transformation and order reduction

The square-root transformation and the order reduction are introduced in the smoother equations. This is done here with the Singular Evolutive Extended Kalman (SEEK) filter, but the adaptation to any other square-root filter is similar [e.g. Ravela and McLaughlin, 2007].

3.1. The SEEK filter

The SEEK filter is a Kalman filter algorithm designed to be applicable to large systems. It was introduced by Pham et al. (1998) and used for various oceanic applications, in particular with real data [e.g. Verron et al., 1999; Testut et al., 2003; Brankart et al., 2003; Castruccio et al., 2006]. An operational proxy of the SEEK filter is implemented in the MERCATOR-océan prediction system (Brasseur et al., 2005). Recent syntheses on the SEEK filter are provided by Brasseur and Verron (2006) and Rozier et al. (2007).

3.1.1. Analysis step

The SEEK analysis equations are obtained by a reformulation of the Kalman filter analysis equations, followed by a low rank approximation for the state error covariance matrix. First, a square root decomposition is introduced: the state error covariance matrix P^f_{k|k-1}, real and symmetric, can be written as:

P^f_{k|k-1} = S^f_{k|k-1} (S^f_{k|k-1})^T,

where S^f_{k|k-1} is an n × n matrix, n being the length of the state vector. Then, the Kalman gain expression (Eqs. (2a) and (2c)) is transformed using the previous decomposition of P^f_{k|k-1} and the Sherman–Morrison–Woodbury formula:

(A + UDV)^{-1} = A^{-1} - A^{-1} U (D^{-1} + V A^{-1} U)^{-1} V A^{-1},    (5)

with A ≡ R_k, U ≡ H_k S^f_{k|k-1}, V = U^T, and D ≡ I, the identity matrix. Then the SEEK analysis equations are:

C_k = (H_k S^f_{k|k-1})^T R_k^{-1} (H_k S^f_{k|k-1}),    (6a)
d_k = y_k - H_k x^f_{k|k-1},    (6b)
K_{k|k} = S^f_{k|k-1} [I + C_k]^{-1} (H_k S^f_{k|k-1})^T R_k^{-1},    (6c)
x^a_{k|k} = x^f_{k|k-1} + K_{k|k} d_k,    (6d)
S^a_{k|k} = S^f_{k|k-1} (I + C_k)^{-1/2}.    (6e)

Two conditions are required to make this analysis algorithm numerically efficient. On the one hand, the observation error covariance matrix R_k must be numerically invertible at low cost. A simple hypothesis is to consider it diagonal. Other forms have been used by Testut et al. (2003) and Brankart et al. (2003). Brankart et al. (2009) also introduced an elegant way of handling observation error correlations while keeping a diagonal matrix. On the other hand, the matrix I + C_k must also be invertible at low cost. Without any approximation, this matrix has dimension n × n, usually too large to be inverted. To fix this, the SEEK filter makes the assumption that errors only occur in a low-dimensional subspace of the state space. This means that most of the eigenvalues of P^f_{k|k-1} are assumed to be negligible. If r is the rank of P^f_{k|k-1}, the square-root transformation can be applied so that, instead of being n × n, the matrix S^f_{k|k-1} is only n × r. The matrix I + C_k to be inverted is now r × r, and the inversion can be numerically fast if r is small enough. The r columns of S^f_{k|k-1} hold the nonzero principal components of the state error. To initialize the matrix at the beginning of an assimilation experiment, these principal components must be identified. In practice, the first Empirical Orthogonal Functions (EOFs) are computed from a time series of model states. The underlying assumption is that variability mimics the effects of the (random) processes that are responsible for the initial error (Rozier et al., 2007).
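To make the analysis step above concrete, the following sketch applies Eqs. (6a)–(6e) with NumPy under the simplifying assumption of a diagonal observation error covariance matrix R_k (stored as a vector of variances). All sizes and variable names are illustrative; this is a minimal sketch, not the SESAM implementation described in Section 4.2.

```python
import numpy as np

def seek_analysis(xf, Sf, y, H, r_diag):
    """Reduced-rank SEEK analysis (Eqs. (6a)-(6e)), assuming diagonal R.
    xf: forecast state (n,); Sf: forecast error modes (n, r);
    y: observations (p,); H: observation operator (p, n); r_diag: diag of R (p,)."""
    HS = H @ Sf                                    # observed error modes, (p, r)
    C = HS.T @ (HS / r_diag[:, None])              # Eq. (6a), r x r
    d = y - H @ xf                                 # Eq. (6b), innovation
    I_C = np.eye(C.shape[0]) + C
    # Eqs. (6c)-(6d): the correction is computed in the r-dimensional subspace
    w = np.linalg.solve(I_C, HS.T @ (d / r_diag))
    xa = xf + Sf @ w
    # Eq. (6e): update the error modes with the symmetric inverse square root of I + C
    evals, evecs = np.linalg.eigh(I_C)
    Sa = Sf @ (evecs @ np.diag(evals ** -0.5) @ evecs.T)
    return xa, Sa

# Toy usage with random numbers, only to check shapes and consistency.
n, r, p = 50, 5, 12
rng = np.random.default_rng(0)
xf, Sf = rng.normal(size=n), 0.1 * rng.normal(size=(n, r))
H = np.zeros((p, n)); H[np.arange(p), rng.choice(n, p, replace=False)] = 1.0
y = H @ xf + 0.03 * rng.normal(size=p)
xa, Sa = seek_analysis(xf, Sf, y, H, np.full(p, 0.03 ** 2))
```

Only r × r matrices are factorized, which is the key to the numerical efficiency discussed above.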
3.1.2. Forecast step and model error

The SEEK formulation does not alter the state propagation equation of the Kalman filter (Eq. (1a)). The square-root transformation of Eq. (1b) leads to:

P^f_{k|k-1} = (M_{k-1,k} S^a_{k-1|k-1}) (M_{k-1,k} S^a_{k-1|k-1})^T + Q_{k-1,k}.    (7)

This calculation implies a number of model integrations equal to the number of columns in S^a_{k-1|k-1}. It is possible only with the low rank approximation already introduced for the analysis step.

The model error Q_{k-1,k} can be processed in several ways. In the early introduction of the RRSQRT filter (Verlaan and Heemink, 1997), the model error is defined a priori (of any possible dimension), Eq. (7) is applied, and a reduction step is performed to obtain a new, reduced square root matrix S^f_{k|k-1}. In a similar line, Lermusiaux (1999b) (with the ESSE filter and real data) builds a new mode based on the innovation residual (y_k - H_k x^a_{k|k}), concatenates this new mode with the others, and applies a reduction (Singular Value Decomposition) to recover the initial dimension. Brasseur et al. (1999) (with the SEEK filter) put this new mode in place of the least significant mode of the analysis. In the ensemble approach, the error can be introduced through perturbations in the model during the propagation of the ensemble members (Evensen, 2003). With the ESSE system again, Lermusiaux and Robinson (1999) propose to add a stochastic forcing to each mode. Possible methods to define this forcing are detailed in Lermusiaux (2006). In the first presentation of the SEEK filter (Pham et al., 1998), the forgetting factor strategy (also called covariance inflation) is introduced: the model error Q_{k-1,k} is assumed proportional to M_{k-1,k} (M_{k-1,k} P^a_{k-1|k-1})^T with a factor (1-ρ)/ρ, ρ being the forgetting factor, between 0 (propagated error statistics are neglected) and 1 (perfect model case). This formulation leads to a simple propagation equation for the error covariances:

S^f_{k|k-1} = (1/√ρ) M_{k-1,k} S^a_{k-1|k-1}.    (8)

Actually, the various strategies to deal with model error can be applied to any square-root filter algorithm (RRSQRT, ensemble, SEEK, etc.). Although very simple, the forgetting factor approach offers an interesting framework for adaptive parameterizations of the model error (Testut et al., 2003), which is still the subject of new developments [e.g. Li et al., 2009]. In what follows, only the forgetting factor method is considered.

The SEEK filter is initialized with a state x^a_0 and the associated reduced, square root error matrix S^a_0. Then the equations are (1a) and (8) for the forecasts, and Eqs. (6a)–(6e) for the analyses.

3.2. The SEEK smoother

In this section, the SEEK smoother is derived, based on the material presented in Sections 3.1 and 2.2. We first address the perfect model case, starting with the forecast step and following with the analysis step. The case with model errors (forgetting factor formulation) is addressed in Section 3.2.4. The smoother algorithm presented here is of the sequential type, like those of Cohn et al. (1994) and Evensen and van Leeuwen (2000). Other algorithms can also be derived, the RTS smoother for instance, but they are not considered in this work.

The SEEK smoother equations may be established from a probabilistic formulation (Evensen and van Leeuwen, 2000). Here we opt for another, recursive method, which exhibits first the concrete implications for the numerical implementation, and then the limits due to the model error parameterization. To establish the SEEK smoother equations, we proceed recursively: starting from the outputs of a SEEK filter analysis at a time t_{k-1}, we apply the smoother forecast and analysis equations at the following observation time t_k. We show that the smoother equations are easily rewritten in the SEEK form, and that they provide, in addition to the filter analysis outputs at time t_k, the smoother analysis outputs at t_{k-1}, conditioned on the observations at t_k. Then we generalize the SEEK smoother equations to times prior to t_{k-1}.

3.2.1. Forecast step

We begin with the SEEK filter analysis outputs at t_{k-1}: a state x^a_{k-1|k-1} and a square root S^a_{k-1|k-1} of the covariance matrix. Introducing the square-root decomposition of the Kalman filter analysis covariance matrix, the smoother forecast equations (3a)–(3c) yield

x^f_{k|k-1} = M_{k-1,k} x^a_{k-1|k-1},    (9a)
P^{fa}_{k,k-1|k-1} = M_{k-1,k} S^a_{k-1|k-1} (S^a_{k-1|k-1})^T = S^f_{k|k-1} (S^a_{k-1|k-1})^T,    (9b)
P^f_{k|k-1} = M_{k-1,k} S^a_{k-1|k-1} (S^f_{k|k-1})^T = S^f_{k|k-1} (S^f_{k|k-1})^T,    (9c)

where S^f_{k|k-1} is defined by S^f_{k|k-1} = M_{k-1,k} S^a_{k-1|k-1} as in the SEEK filter (Eq. (8) with ρ = 1). It is remarkable that the cross-covariance matrix P^{fa}_{k,k-1|k-1} is determined only from outputs of the filter.
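The factorization in Eq. (9b) can be checked numerically on toy matrices. The sketch below uses a random linear model M and random analysis modes (nothing here comes from the NEMO configuration); it only illustrates that the cross-covariance is available from filter outputs at no extra cost.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 30, 4
M = np.eye(n) + 0.05 * rng.normal(size=(n, n))   # toy linear(ized) model
Sa = 0.1 * rng.normal(size=(n, r))               # analysis error modes at k-1

Sf = M @ Sa                                      # Eq. (8) with rho = 1
P_fa_direct = M @ Sa @ Sa.T                      # Eq. (9b), covariance form
P_fa_sqroot = Sf @ Sa.T                          # Eq. (9b), square-root form
print(np.allclose(P_fa_direct, P_fa_sqroot))     # True
```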
3.2.2. Analysis step

From the previous analysis and forecast steps we have in hand x^a_{k-1|k-1}, S^a_{k-1|k-1}, x^f_{k|k-1}, S^f_{k|k-1}. The SEEK filter equations (6a)–(6e) provide x^a_{k|k} and S^a_{k|k}. We now focus on the smoother components.

Smoother gain: From expression (4a) and the decomposition of P^{fa}_{k,k-1|k-1} obtained at the forecast step, the smoother gain writes

K_{k-1|k} = (H_k S^f_{k|k-1} (S^a_{k-1|k-1})^T)^T G_k^{-1} = S^a_{k-1|k-1} (H_k S^f_{k|k-1})^T G_k^{-1}.    (10)

The Sherman–Morrison–Woodbury formula (5) may be used, as it was for deriving the SEEK filter, to provide

K_{k-1|k} = S^a_{k-1|k-1} [I + C_k]^{-1} (H_k S^f_{k|k-1})^T R_k^{-1},    (11)

where C_k is the same as in the SEEK filter.

Smoother state: The smoothed state x^a_{k-1|k} is directly computed using the smoother gain (Eq. (4b)).

Analysis covariance: Introducing the decompositions of P^a_{k-1|k-1} and P^{fa}_{k,k-1|k-1}, and expression (11) for the smoother gain, into the smoother expression (4d), we compute

P^a_{k-1|k} = S^a_{k-1|k-1} (S^a_{k-1|k-1})^T - K_{k-1|k} H_k S^f_{k|k-1} (S^a_{k-1|k-1})^T = S^a_{k-1|k-1} [I + C_k]^{-1} (S^a_{k-1|k-1})^T.    (12)

Now, defining

S^a_{k-1|k} = S^a_{k-1|k-1} [I + C_k]^{-1/2},    (13)

a square-root decomposition of the smoother error covariance is obtained.

Analysis cross-covariances: Introducing again the decomposition (9b) of P^{fa}_{k,k-1|k-1} from the forecast step, and expression (6c) for the Kalman filter gain, into the smoother expression (4c), we compute

P^{aa}_{k,k-1|k} = (I - K_{k|k} H_k) S^f_{k|k-1} (S^a_{k-1|k-1})^T = S^f_{k|k-1} [I + C_k]^{-1} (S^a_{k-1|k-1})^T,    (14)

and using the definitions (6e) and (13), it appears that the cross-covariance matrix P^{aa}_{k,k-1|k} can be decomposed using the square roots of P^a_{k|k} and P^a_{k-1|k}:

P^{aa}_{k,k-1|k} = S^a_{k|k} (S^a_{k-1|k})^T.    (15)

At the end of the assimilation cycle, the analysis covariance and cross-covariance matrices of the smoother are fully determined with the square root matrices S^a_{k|k} and S^a_{k-1|k}.

3.2.3. Past state estimates

We have just shown that, from the SEEK filter state vector and square root error covariance matrix at time t_{k-1}, it is possible to determine the smoothed analysis state vector and square root error covariance matrix for time t_{k-1} given observations at t_k. The strong point is that the square root matrices not only lead to the covariance matrices, but also provide the cross-covariance matrix. Proceeding then recursively, the smoother equations may be applied to compute the smoother estimates x^a_{i|k} and S^a_{i|k} (i < k-1) from the filter estimates x^a_{k-1|k-1}, S^a_{k-1|k-1}, and the smoother estimates x^a_{i|k-1} and S^a_{i|k-1} (i < k-1). The smoother equations may be applied involving the smoother estimate at time t_i. This strictly follows the steps of Sections 3.2.1 and 3.2.2. The forecast/analysis cross-covariance is given by

P^{fa}_{k,i|k-1} = M_{k-1,k} S^a_{k-1|k-1} (S^a_{i|k-1})^T = S^f_{k|k-1} (S^a_{i|k-1})^T;    (16)

the smoother gain and the square root error covariance matrix of the smoothed estimate are computed as:

K_{i|k} = S^a_{i|k-1} [I + C_k]^{-1} (H_k S^f_{k|k-1})^T R_k^{-1},    (17)
S^a_{i|k} = S^a_{i|k-1} [I + C_k]^{-1/2}.    (18)

Finally, it can be verified that the analysis error covariance and cross-covariance matrices are decomposed as

P^a_{i|k} = S^a_{i|k} (S^a_{i|k})^T,    (19)
P^{aa}_{k,i|k} = S^a_{k|k} (S^a_{i|k})^T.    (20)

This finalizes the full set of the SEEK smoother equations with a perfect model, summarized in Table 1.

Table 1. Equations of the SEEK smoother with a perfect model. R_k defines the set of time indices at which the retrospective analysis is produced (see Section 2.2 for details).

Initialization: x^a_0 and S^a_0.
Forecast step:
  x^f_{k|k-1} = M_{k-1,k} x^a_{k-1|k-1}    (state propagation)
  S^f_{k|k-1} = M_{k-1,k} S^a_{k-1|k-1}    (error propagation)
Filter analysis step:
  C_k = (H_k S^f_{k|k-1})^T R_k^{-1} (H_k S^f_{k|k-1})
  d_k = y_k - H_k x^f_{k|k-1}    (innovation)
  K_{k|k} = S^f_{k|k-1} [I + C_k]^{-1} (H_k S^f_{k|k-1})^T R_k^{-1}    (Kalman gain)
  x^a_{k|k} = x^f_{k|k-1} + K_{k|k} d_k    (filter analysis)
  S^a_{k|k} = S^f_{k|k-1} [I + C_k]^{-1/2}    (filter analysis, cov.)
Smoother analysis step:
  K_{i|k} = S^a_{i|k-1} [I + C_k]^{-1} (H_k S^f_{k|k-1})^T R_k^{-1},   i ∈ R_k    (smoother gains)
  x^a_{i|k} = x^a_{i|k-1} + K_{i|k} d_k,   i ∈ R_k    (smoother analyses)
  S^a_{i|k} = S^a_{i|k-1} [I + C_k]^{-1/2},   i ∈ R_k    (smoother analyses, cov.)
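As an illustration of Table 1, the sketch below strings together one fixed-lag assimilation cycle, assuming a perfect model (ρ = 1), a diagonal R, and error modes propagated by finite differences with the nonlinear model (anticipating Eq. (23) of Section 3.3). The names `model`, `H` and the stored pairs are placeholders; the point is that the same reduced-space correction w and transform L = [I + C_k]^{-1/2} produced by the filter analysis are reused for every past estimate.

```python
import numpy as np
from collections import deque

def smoother_cycle(xa, Sa, past, y, H, r_diag, model, lag=4):
    """past: deque of (x^a_{i|k-1}, S^a_{i|k-1}) pairs, i.e. the states of R_k."""
    # Forecast step: propagate the state and each error mode.
    xf = model(xa)
    Sf = np.column_stack([model(xa + Sa[:, j]) - xf for j in range(Sa.shape[1])])
    # Filter analysis step (Eqs. (6a)-(6e)).
    HS = H @ Sf
    C = HS.T @ (HS / r_diag[:, None])
    I_C = np.eye(C.shape[0]) + C
    w = np.linalg.solve(I_C, HS.T @ ((y - H @ xf) / r_diag))   # reduced-space correction
    evals, evecs = np.linalg.eigh(I_C)
    L = evecs @ np.diag(evals ** -0.5) @ evecs.T               # [I + C]^(-1/2)
    xa_new, Sa_new = xf + Sf @ w, Sf @ L
    # Smoother analysis step: apply w and L to every stored past estimate
    # (last three lines of Table 1), then keep the most recent `lag` analyses.
    past = deque([(xi + Si @ w, Si @ L) for (xi, Si) in past], maxlen=lag)
    past.append((xa_new, Sa_new))
    return xa_new, Sa_new, past
```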
From Eqs. (6a) (C_k is positive definite), (18) and (19), it can be verified that each diagonal element of P^a_{i|k} is smaller than its counterpart in P^a_{i|k-1}. In other words, each smoother analysis step is expected to reduce the error variances of the estimate. A fortiori, the total error variance (the trace of the covariance matrix) is also reduced.

3.2.4. The SEEK smoother in presence of model error

In the standard form of the smoother, model error covariances affect the forecast state error covariances P^f_{k|k-1} (Eq. (3c)) but not the cross-covariances P^{fa}_{k,i|k-1} (Eq. (3b)). This comes from the Kalman hypotheses according to which the model error is white in time and uncorrelated with the initial state error (see Cohn et al., 1994). If the model is perfect, the decomposition of both matrices involves the same forecast square root error covariance S^f_{k|k-1} (Eqs. (9b) and (9c)). But that simple mathematical formulation is lost in presence of model errors, because the matrix S^f_{k|k-1} includes, by definition, the model error parameterization, which should not affect P^{fa}_{k,i|k-1}. The necessary upgrade of the smoother algorithm appears quite simple when the forgetting factor parameterization is used. Dealing with other model error parameterizations in a square-root smoother algorithm is a challenge, for it introduces theoretical and practical difficulties, the full discussion of which would go beyond the primary scope of this paper. This issue should be specifically addressed in a future work.

Considering that S^f_{k|k-1} is given by Eq. (8), which includes the forgetting factor, the cross-covariance matrix is now:

P^{fa}_{k,i|k-1} = √ρ S^f_{k|k-1} (S^a_{i|k-1})^T,    (21)

and this must be accounted for in the smoother analysis step. The upgraded SEEK smoother analysis equations are reported in Table 2. The major change from the perfect model case arises in that the definition of the matrix S^a_{i|k} (modified so that the cross-covariance matrix P^{aa}_{k,i|k} can still be written as in Eq. (20), see Table 2) becomes inconsistent with the corresponding expression of the smoother analysis covariance matrix P^a_{i|k} (Eq. (19)). However, it is easy to see that the rank of P^a_{i|k} remains equal to r, so that P^a_{i|k} can be rewritten in the form:

P^a_{i|k} = S^a_{i|k} T_{i|k} T_{i|k}^T (S^a_{i|k})^T,    (22)

with T_{i|k} defined by a recursive formula (see Table 2). Note that the smoother analysis covariance matrices P^a_{i|k} are not used in the algorithm; they are useful only for a posteriori diagnostics.

Table 2. Analysis equations of the SEEK smoother with the forgetting factor parameterization of model error.

K_{i|k} = √ρ S^a_{i|k-1} [I + C_k]^{-1} (H_k S^f_{k|k-1})^T R_k^{-1},   i ∈ R_k    (smoother gains)
x^a_{i|k} = x^a_{i|k-1} + K_{i|k} d_k,   i ∈ R_k    (smoother analyses)
S^a_{i|k} = √ρ S^a_{i|k-1} [I + C_k]^{-1/2},   i ∈ R_k    (smoother analyses, cov.)
T_{k|k} = I
T_{i|k} = (1/√ρ) { [I + C_k]^{1/2} ( T_{i|k-1} T_{i|k-1}^T - ρ [I + C_k]^{-1} C_k ) [I + C_k]^{1/2} }^{1/2}
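In code, the forgetting-factor variant of Table 2 changes only two lines of the retrospective update sketched earlier: the stored modes and the correction are scaled by √ρ. In this sketch, `w` and `L` denote the reduced-space correction and the [I + C_k]^{-1/2} transform produced by the filter analysis; all names are illustrative.

```python
import numpy as np

def retrospective_update(x_past, S_past, w, L, rho=0.9):
    """Retrospective analysis of one past estimate (Table 2)."""
    K_d = np.sqrt(rho) * (S_past @ w)          # K_{i|k} d_k
    x_smoothed = x_past + K_d                  # x^a_{i|k} = x^a_{i|k-1} + K_{i|k} d_k
    S_smoothed = np.sqrt(rho) * (S_past @ L)   # S^a_{i|k} = sqrt(rho) S^a_{i|k-1} [I+C]^(-1/2)
    return x_smoothed, S_smoothed
```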
In practice, the computation of T_{i|k} may then be avoided. Finally, it is easy to show, using the recursive definition of T_{i|k}, that the error variances in P^a_{i|k} are smaller than the error variances in P^a_{i|k-1}.

3.3. SEEK smoothing with nonlinear dynamics

When the dynamical model is nonlinear, the propagation of the error covariance matrix involves the linearized model. In the SEEK filter approach, it is common to apply a finite difference scheme using the full nonlinear model, rather than a linearized version. The jth error mode is then propagated by

S^f_{k|k-1,j} = M_{k-1,k}(x^a_{k-1|k-1} + S^a_{k-1|k-1,j}) - x^f_{k|k-1}.    (23)

This method avoids the derivation of the linearized model, and is often more robust than the use of the linearized model (Brasseur and Verron, 2006). The extension to nonlinear dynamics is similar in the SEEK smoother and the SEEK filter. The smoother part concerns only the analysis step, so that model nonlinearity can be processed as in the filter. The observation operator is involved in the smoother equations in no different way than it is in the filter equations. Thus, nonlinear filtering approaches also apply to the smoother (see, for instance, Zhu et al., 1999, for a technical description of the extended fixed-lag smoother).

3.4. Comparison with other smoother schemes

We now briefly come back to other smoother algorithms already used in oceanography, for comparison: the Ensemble Smoother (EnS) (van Leeuwen and Evensen, 1996; van Leeuwen, 1999, 2001); the Ensemble Kalman Smoother (EnKS) (Evensen and van Leeuwen, 2000), implemented for twin experiments by Ngodock et al. (2006); and the ESSE smoother (Lermusiaux and Robinson, 1999), used by Lermusiaux (1999b) and Lermusiaux et al. (2002) with real data.

The EnKS is the closest to the SEEK smoother algorithm described previously. The smoother part is sequential in both algorithms. The differences lie in the handling of an ensemble rather than the mean and covariance of the state, and in the analysis scheme used to compute the Kalman gain. Both algorithms also differ in their formal presentation. An advantage of the recursive derivation presented in this paper is to naturally tackle the problem posed by the covariance inflation approach used to parameterize the model error term.

The EnS is of the fixed-interval type and not sequential. In a first step, the ensemble of states is integrated over the interval and defines a four-dimensional background trajectory. Then a four-dimensional analysis is performed using all the observations in the interval. Error statistics (cross-covariances) are estimated from the background ensemble. In practice though, four-dimensional statistics are not explicitly handled. van Leeuwen and Evensen (1996) describe the following strategy: (i) During the initial, forward ensemble integration, only the innovations (for each member) and the ensemble departures from the mean, projected onto the observation space, are stored. With our notations, these could be denoted d̂_i and Ĥ Ŝ^b, "i" indicating the ensemble member and the hat denoting the four-dimensional extension of the vectors and matrices. (ii) The coefficient vectors b̂_i defined by Bennett (1992) for the representer method can be computed for each member; b̂_i is the solution of

(Ĥ Ŝ^b (Ĥ Ŝ^b)^T + R̂) b̂_i = d̂_i.    (24)

(iii) The background ensemble is integrated a second time and corrected online. At each assimilation step k, the analysis of member i is computed as

x^a_{i,k} = x^b_{i,k} + S^b_k (Ĥ Ŝ^b)^T b̂_i,    (25)

but it does not feed the next forecast as in the filter. Here the superscript b stands for "background".

The ESSE smoother is also of the fixed-interval type and not sequential. Smoothing occurs in a backward run after a filtering pass. The filter analysis at t_k is corrected based on the difference between the filter forecast at t_{k+1} and its smoothed counterpart calculated previously. Denoting N the number of assimilation steps in the interval:

x^a_{k|N} = x^a_{k|k} + A_k (x^a_{k+1|N} - x^f_{k+1|k}),    (26)

where A_k is the smoother gain. As mentioned by Lermusiaux and Robinson (1999), taking A_k = P^a_{k|k} M^T_{k,k+1} (P^f_{k+1|k})^{-1} leads to the RTS smoother formulation. With a square-root formulation and the notations adopted so far, this expression becomes:

A_k = S^a_{k|k} (S^f_{k+1|k})^T (S^f_{k+1|k} (S^f_{k+1|k})^T)^{-1}.    (27)

This expression involves the inversion of a huge and singular matrix. The ESSE strategy lies in the Singular Value Decomposition (SVD) and truncation of the matrices of error modes, as:

SVD(S^a_{k|k}) = E^a_{k|k} Σ^a_{k|k} (V^a_{k|k})^T,    (28a)
SVD(S^f_{k+1|k}) = E^f_{k+1|k} Σ^f_{k+1|k} (V^f_{k+1|k})^T,    (28b)

where the matrices Σ are diagonal and positive definite. The smoother gain then takes the following form:

A_k = E^a_{k|k} Σ^a_{k|k} (V^a_{k|k})^T V^f_{k+1|k} (Σ^f_{k+1|k})^{-1} (E^f_{k+1|k})^T,    (29)

which is computationally tractable. More details, in particular concerning the update of the covariances, are given in Lermusiaux and Robinson (1999).

4. Practical aspects

4.1. Algorithm complexity

Considering that the expensive operations in the Kalman filter are the model integrations (Eqs. (1a) and (1b)) and the inversion of the innovation error covariance (Eqs. (2a) and (2c)), the Kalman filter algorithm complexity approaches:

AC(KF) ≈ (n + 1) N + s^3,

where n is the size of the state vector, N the number of operations in one model integration, and s the number of observations. The full rank fixed-lag smoother is far more expensive than the Kalman filter, for it involves L × n extra model integrations (Eq. (3b)), where L is the smoother lag. Then

AC(KS) ≈ [n(L + 1) + 1] N + s^3.

In the SEEK filter, the various hypotheses lead to a significant reduction of the algorithm complexity in comparison with the Kalman filter. First, it requires only r + 1 model integrations instead of n + 1 (r, the size of the reduced state error space, is taken smaller than n by several orders of magnitude). Also, the size of the matrices to be inverted is r, instead of s. Then

AC(SF) ≈ (r + 1) N + r^3.

As regards computational complexity, the smoother largely benefits from the square-root formulation. The number of model integrations is similar to the SEEK filter's, and the smoother analysis requires no additional matrix inversion. Thus, the SEEK smoother algorithm complexity also approaches (only the dominant orders of magnitude are considered):

AC(SS) ≈ (r + 1) N + r^3.
If both the forecast and analysis results are stored, the filter requires 2M þ 1 SU (this includes the initial state). The storage required by the smoother depends on the type of smoother and the user’s needs. In the case of the fixed-point type and the storage of the last smoother estimate only (xaijM , the intermediate estimates xaijk being ruled out), the smoother requires only 2 SU in addition to the filter. If a fixed-interval smoother is used, and each intermediate estimates is stored, the figure reaches MðM þ 1Þ=2 SU in addition to the filter. If M becomes large, this can be rapidly prohibitive. With the fixed-lag (L) smoother storing each intermediate estimate, the smoother needs approximately ðL + MÞLMSU, what remains reasonable if L is not too large, even if M is large. 4.2. Implementation issue The SEEK smoother differs from the SEEK filter only by a few additional operations in the analysis step. If correctly implemented with the SEEK filter analysis, the smoother analysis is almost costfree. In the present work, the filter analysis is performed using SESAM software (Brankart et al., 2002) according to the following steps (the subscripts are dropped for conciseness, although each element still depends on time): 1. Computation of C: ! "T ! " C ¼ HSf R!1 HSf : ð30Þ C ¼ UKUT : ð31Þ 2. U ! D decomposition of C: 3. Computation of the innovation in the reduced space: ! "T ! " d ¼ HSf R!1 y ! Hxf : ð32Þ 4. Computation of the correction in the reduced space: c ¼ U½I þ K)!1 UT d: f a 5. Computation of S to S transformation matrix: L ¼ U½I þ K) !1=2 T U : ð33Þ ð34Þ 6. Computation of the analysis state: xa ¼ xf þ Sf c: ð35Þ 7. Computation of the analysis error covariance matrix: Sa ¼ Sf L: ð36Þ The smoother require only two extra steps: 1. Computation of the retrospective analysis states: xaijk ¼ xaijk!1 þ Saijk!1 c: ð37Þ 2. Computation of the retrospective analysis error covariance matrix: Saijk ¼ Saijk!1 L: ð38Þ 5. Twin experiment with a nonlinear, ocean circulation model: description The smoother algorithm is implemented and applied with an ocean circulation model. In this section, the experimental set-up is described. Results are presented in the next section. 5.1. Model The ocean circulation model is the NEMO (Nucleus for European Modelling of the Ocean) system (Madec, 2008). NEMO uses an Arakawa C grid and a ‘free surface’ formulation. The idealized configuration used in this work is inspired from previous works on the middle latitude ocean circulation by Holland (1978) and Le Provost and Verron (1987). The domain extends from 25!N to 45!N, and over 30! in longitude. The grid contains 120 points in longitude, 94 in latitude. The projection on the sphere is of the MERCATOR type. It is regular in longitude, with a resolution of 0.2544!. In latitude, the resolution is of 0.2544! at the Equator (out of the domain) and varies from 0.23 to 0.18 between the Southern and the Northern borders. The ocean is sliced into 11 vertical levels described with a z coordinate. The domain is closed and its lateral boundaries are frictionless. The bottom is flat and exerts a linear friction with a coefficient C D ¼ 2:65 & 10!4 s!1 . No heat nor freshwater flux is introduced. The circulation is only forced by a zonal wind. The zonal wind stress (N s!2 , defined on the u points of Arakawa grid) is prescribed as: & sx ð/Þ ¼ !10!1 cos 2p ' / ! /min ; /max ! /min where / is the latitude of the point, /min and /max are the latitudes of the Southern and Northern boundaries. 
Lateral dissipation is performed with a biharmonic diffusion operator with a coefficient A_h = -8 × 10^{10} m^4 s^{-1}. Time integration is done with a leap-frog scheme and an Asselin filter (γ = 0.1). The time step is 15 min. This double-gyre configuration allows the development of (nonlinear) features typical of the middle latitudes, like the Gulf Stream system. Fig. 4 (fully discussed later) presents snapshots of the Sea Surface Height (SSH) field from the model. We clearly identify the central jet and its hydrodynamical destabilization that leads to meanders and eddies.

The spin-up phase is initialized with resting water, homogeneous salinity (35 g kg^{-1}) and an analytical temperature profile (Chassignet and Gent, 1991):

T(z) = 25 + ( 5.9 × 10^{-5} × 800 / (9.81 × 2 × 10^{-4}) ) (e^{-z/800} - 1).

The model is run over 85 years. Diagnosing the central jet meridional position, we estimated that a statistical equilibrium was reached after 50 years. Only outputs beyond this time are used to perform the assimilation experiment.

5.2. Observations

A twin experiment is performed: observations are synthetic, extracted from another model simulation. However, to make things not so perfect, we built the observational dataset as follows: (i) The NEMO model is run in the same configuration as described previously, but at a 1/6° resolution instead of 1/4°; year 70 is taken as the true ocean state. This change in resolution makes the 1/4° resolution model imperfect, and introduces a representativity error that must be considered in the observation error covariance matrix. (ii) SSH from the true state is perturbed with a spatially uncorrelated Gaussian noise with a standard deviation of 3 cm, consistently with the error levels obtained from altimetry. (iii) SSH observations are extracted along a satellite track simulated using the actual TOPEX/POSEIDON flight parameters. The observational dataset for one analysis step gathers the SSH of the grid points covered by the satellite track during one assimilation cycle (2 days). Like TOPEX/POSEIDON, the repeat period is 10 days. Fig. 1 depicts the SSH observation field for one analysis step (2 days of data), and five consecutive analysis steps (10 days of tracks, the repeat period).

Fig. 1. SSH observation network of the experiment. Left: SSH tracks covered in 2 days by a satellite with TOPEX/POSEIDON flight parameters (the observations of day 12 are shown). Right: SSH tracks covered in 10 days.

Temperature observations are also considered. No noise is added to the true field, for errors in temperature measurements by ARGO profilers are supposed to be very small (∼0.005 °C) and, here, negligible in comparison with representativity errors. Observations are then distributed on an idealized network that mimics the ARGO network. Only the first 2000 m are observed. Every 2 days, a set of vertical profiles, 6° apart from each other in longitude and latitude, is available. This 6° × 6° pattern is shifted by 2° from one assimilation step to the next, to coarsely reflect the evolutive nature of the true ARGO network. The resulting network density is one profile every 2° every 18 days, close to the true, average ARGO density of one profile every 2° every 15 days. Fig. 2 exhibits the horizontal coverage of a temperature observation field for one analysis step (2 days of data), and nine consecutive analysis steps (18 days of data).

Fig. 2. Horizontal distribution of the observations of temperature. Left: Observation network for day 12. Right: Observation network covered in 18 days.
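A minimal sketch of the SSH sampling described in this section is given below: the "true" 1/6° SSH field is subsampled at the grid points crossed by the simulated track during one 2-day cycle and perturbed with 3 cm Gaussian noise. The variable names (ssh_true, track_i, track_j) are placeholders, not taken from the actual processing chain.

```python
import numpy as np

def extract_ssh_obs(ssh_true, track_i, track_j, noise_std=0.03, seed=None):
    """ssh_true: 2-D 'true' SSH field (m); track_i, track_j: arrays of grid
    indices covered by the simulated TOPEX/POSEIDON track over one cycle."""
    rng = np.random.default_rng(seed)
    return ssh_true[track_i, track_j] + noise_std * rng.normal(size=track_i.size)
```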
5.3. Assimilation settings

The assimilation experiment is initialized using a 10 year interval of the simulation with the free model, with outputs every 5 days: the mean state is taken as the initial state. Then, an EOF analysis is performed with the ensemble of states, and the first 20 modes are retained to form the initial, reduced error basis. Observation errors are considered uncorrelated in space and time, with standard deviations of 6 cm for SSH and 0.3 °C for temperature. These values are larger than the true errors (of 3 cm and 0 °C, respectively) to take representativity errors into account. The assimilation cycle is 2 days and the smoother is of the fixed-lag type, with a maximum lag of 16 days. The process starts with an analysis step.

A method of analysis localization is used in order to rule out corrections due to distant observations. Such corrections occur in presence of error correlations between distant grid points. And for several reasons (nature of the sources of error, truncation of the state error space, in particular), these correlations are most often unreliable. For the Ensemble Kalman Filter, the localization approach was introduced by Houtekamer and Mitchell (2001) and discussed in further detail by Hamill et al. (2001). Here, we apply the method adapted to the SEEK filter, described by Brankart et al. (2003) and Testut et al. (2003): to compute the correction at each water column, the observations are weighted by a factor exp(-r^2/d^2), with d ≈ 200 km. In brief, only observations within a 6° × 6° square centered on the water column are used for the correction.

Our experiment is run over 360 days. The filter actually runs over 376 days, so as to get a 16-day smoother analysis at day 360. This time period is long enough to evaluate the smoother, but short enough to avoid the use of a perfectly (adaptively) tuned model error parameterization. Error modes are propagated following the finite difference approach given by Eq. (23). Model error is introduced through the forgetting factor parameterization presented previously. This factor is fixed at 0.9. Although not optimal, this approach makes the assimilation procedure stable through time. Fig. 3 depicts the time trajectory of the total SSH error variance, computed as the trace of the error covariance matrix (for SSH only). The error variance quickly reaches and keeps an equilibrium, neither collapsing (resulting in filter divergence) nor diverging. Such convergence is theoretically expected with the Kalman filter when the network and quality of observations are constant and the model error is constant and additive. Also, the examination of the error mode evolution, not detailed here, shows that each mode always holds a significant amount of error information. In other words, the error ensemble does not collapse. These facts argue for the validity of the forgetting factor here.

Fig. 3. Time evolution of the total SSH error variance of the SEEK filter, in m^2 (forecast/analysis steps). The total variance is computed as the trace of the error covariance matrix.
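The observation weighting used for the local analyses of Section 5.3 can be sketched as follows. Down-weighting an observation by exp(-r^2/d^2) amounts to inflating its error variance by the inverse factor. The great-circle distance formula used here is an assumption of this sketch, not a description of the SESAM implementation.

```python
import numpy as np

def local_obs_weights(lat0, lon0, lat_obs, lon_obs, d_km=200.0):
    """Weights exp(-r^2/d^2) of the observations around a water column at (lat0, lon0)."""
    R = 6371.0  # Earth radius (km)
    lat0, lon0 = np.radians(lat0), np.radians(lon0)
    la, lo = np.radians(lat_obs), np.radians(lon_obs)
    # great-circle distance (haversine formula), in km
    a = np.sin((la - lat0) / 2) ** 2 + np.cos(lat0) * np.cos(la) * np.sin((lo - lon0) / 2) ** 2
    r = 2.0 * R * np.arcsin(np.sqrt(a))
    return np.exp(-(r / d_km) ** 2)

# Usage: divide the diagonal observation error variances by these weights
# before performing the analysis of the water column.
```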
6. Results

Concerning the smoother, we focus on the 8-day retrospective analysis in this section. The reason will be given in Section 7.1, where the choice of the lag is discussed.

6.1. Smoother effects on the spatial distribution of errors

To illustrate how the smoother affects the spatial distribution of errors, we examine the effects of the filter and the smoother on the SSH field of the 38th day of the experiment. Fig. 4 presents the SSH from the filter forecast and the truth. The filter provides a satisfactory result: visually, the forecast has many features in common with the truth. There are differences, though. The two rectangular boxes that overlay the graphs emphasize regions where errors in the forecast are apparent. Fig. 5 depicts the SSH error for the filter forecast, the filter analysis, and the 4-day and 8-day retrospective analyses. These error fields were calculated by interpolating the model fields onto the true grid, then by subtracting the true SSH field. On the upper left graph, the two rectangular boxes of Fig. 4 are reported and clearly exhibit large errors in the forecast SSH field. On the other graphs, the SSH colored fields are overlaid with the simulated TOPEX/POSEIDON traces used for the corresponding analysis updates. Then, on the filter analysis graph (upper right) only the traces of day 38 are shown. The 4-day smoother analysis graph (lower left) also displays the satellite traces of days 40 and 42, used to retrospectively update the state estimate of day 38. The 8-day smoother analysis graph (lower right) presents 10 days (38 to 46) of satellite observations. Ten days being the repeat period of TOPEX/POSEIDON, the coverage reaches its maximum.

What strikes most in this figure is the poor impact of the filter analysis step on the forecast error field. The error patches in the rectangular boxes are still present, and with the same magnitude as in the forecast field. This is obviously due to the observational configuration: no satellite trace crosses these regions at day 38, as can be seen on the upper right graph. However, during the next two assimilation cycles, a few satellite traces come close to or cross these regions. Errors initially present in the big rectangular box are then significantly reduced by the 4-day, then the 8-day smoother steps. On the contrary, the small rectangular box exhibits an error patch that persists through the filter and smoother analysis updates. This likely results from the fact that there are no very close observations before the 8-day smoother step, combined with low error cross-correlations between days 38 and 46.

Fig. 4. SSH field at day 38, in meters: filter forecast (left) and truth (right). The two rectangular boxes emphasize regions especially discussed in the text.

Fig. 5. SSH error field at day 38, in meters: filter forecast (upper left), filter analysis (upper right), 4-day (lower left) and 8-day (lower right) lagged smoother analysis. The associated color bars are displayed on the right side. The two rectangular boxes emphasize regions especially discussed in the text. The simulated TOPEX/POSEIDON traces used for the corresponding analyses are superimposed (see text for more details).

6.2. Reduction of the global error

To get a global synthesis of error levels we compute Root Mean Square Errors (RMSEs) as:

RMSE = [ (1/N) Σ (X - X_true)^2 ]^{1/2},    (39)

where X is the assimilation-produced field, interpolated on the true state grid (1/6°), X_true is the true field, and N is the number of grid points (for SSH: on the horizontal plane). Fig. 6 shows the time evolution of the RMSE in SSH over 1 year, for the filter and the 8-day retrospective analysis. The filter error lies near 2.5 cm most of the time. We can notice that this corresponds to a mean square error of 6.25 cm^2 per grid point, which makes a total mean square error near 7 m^2 (there are about 10,000 grid points on the horizontal). This is quite consistent with the total error variance diagnosed with the covariance matrix (Fig. 3), near 10 m^2. The smoother error is close to 2 cm. The filter RMSE is thus reduced by about 20% thanks to the smoother.
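As a minimal illustration of Eq. (39), and assuming the model field has already been interpolated onto the 1/6° truth grid (the interpolation itself is not shown), the diagnostic reduces to:

```python
import numpy as np

def rmse(field, field_true):
    """Root Mean Square Error of Eq. (39); both fields on the truth grid."""
    return np.sqrt(np.mean((field - field_true) ** 2))
```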
Where the filter analysis generates a weak correction to the forecast, the smoother correction can be several times larger than the filter correction. Most of the correction is therefore due to the smoother, and not to the filter. The main smoother contribution occurs in the first days of the experiment, where the absence of past observations yields weight to subsequent observations.

Fig. 6. Time evolution of the RMSE in SSH. The line represents the Kalman filter trajectory, with the alternation of RMSE growth during forecast steps and RMSE reduction at analysis steps. The dots are the results of the 8-day lagged smoother.

The time evolution of the RMSE in temperature is presented in Fig. 7. Both the filter and the smoother quickly converge to a constant value close to 0.75 °C. Thus, the error in temperature remains high in comparison with the observation (representativity) error introduced in the system, but this must be appreciated considering the sparsity of the temperature observation network and the errors of the model, due to resolution. Optimistically, we note that the filter prevents the growth of the RMSE in temperature, and that the smoother, though not really more efficient, at least does not spoil the filter analysis with inappropriate corrections. Complementary experiments revealed that the assimilation of temperature profiles is necessary to preserve the stability of the assimilation process and keep the model dynamics on good tracks.

Fig. 7. Time evolution of the RMSE in temperature. Symbols are the same as in Fig. 6.

The action of the filter and the smoother on the unobserved dynamical variables (U and V currents) is examined in Figs. 8 and 9. For both variables, and at all times, the filter analysis RMSE is reduced by the smoother. In fact, as with SSH, the corrections due to the smoother often exceed those due to the filter. The filter analysis RMSEs are reduced by 5% and 6%, respectively, by the smoother.

Fig. 8. Time evolution of the RMSE in zonal current. Symbols are the same as in Fig. 6.

Fig. 9. Time evolution of the RMSE in meridional current. Symbols are the same as in Fig. 6.

6.3. Smoothing effect

A common and expected feature of the RMSE series for SSH, U and V is that the smoother seems to actually smooth the filter RMSE trajectories. This is a good point, because for oceanic problems, we may appreciate that the error level be relatively stable in time, so that time series of physical fields are of homogeneous quality. In other words, we would like the RMSE time derivative to be as close to zero as possible.
The forward time derivatives of the SSH RMSE series of the filter analysis and of the 8-day lagged smoother are presented in Fig. 10. Though not zero, the retrospective analysis signal displays lower amplitudes than the filter analysis signal. The smoothing effect of the smoother is thus clear. A measure of roughness using an integral function of the time derivative of the RMSE,

Ro = ∫ (dRMSE/dt)^2 dt,    (40)

enables us to quantify this: the roughness of the filter forecast, the filter analysis, and the 8-day lagged smoother are, respectively, 13.69, 11.95, and 1.9 cm^2 day^{-1}. Thus, the smoother reduces the filter analysis roughness by a factor of 6. The smoother also reduces the filter analysis roughness by a factor of 7 for U, and 2 for V.

Fig. 10. Forward time derivative of the RMSE in SSH. Thin, dotted line: the filter analysis; thick line: the 8-day lagged smoother.

7. Key issues with the SEEK smoother

7.1. On the smoother lag

In the framework of the Kalman filter hypotheses, the largest lag theoretically provides the best error reduction and smoothing. But it generally makes sense to consider a limited number of retrospective analyses. In presence of unstable or dissipative dynamics in particular, most of the smoother improvements are due to the first few retrospective analyses, the others having a minor impact on the results (Cohn et al., 1994). High resolution ocean circulation models develop nonlinear, unstable dynamics. Lower resolution models must be dissipative to a certain degree, at least for stability. Also, the storage requirements associated with a large lag (or a fixed-interval smoother over a long time period) may be prohibitive. In the present work, the number of retrospective analyses has been limited to 8 (16 days) a priori.

The identification of the best smoother lag is easy in such a twin experiment framework, for the full error fields can be calculated. The appropriate lag is simply the lag that yields assimilation results with minimal errors. Fig. 11 presents the evolution of the RMSE in SSH, for 7 different days in the time interval, as a function of the assimilation step. For instance, the RMSE of the initial state (forecast at day 0) is close to 15 cm. The filter analysis reduces this RMSE down to 13 cm. Then, the 2-day smoother analysis takes the RMSE down to 9.5 cm, and so on. The RMSE reaches an asymptotic value at the 8-day smoother analysis: the following smoother analyses have very little effect, if any. It is actually the case for each day shown here. Thus, 8 days (4 assimilation cycles) seems to be the appropriate smoother lag here.

Fig. 11. RMSE in SSH, plotted against the assimilation step, for 7 different days (days 0, 2, 4, 6, 30, 90 and 180).

7.2. On the model error

It stands to reason that the cross-correlations of state errors must fade with the time distance between the two states, either because of actual model errors, or because of nonlinearities in the dynamics. This results in the convergence of the smoother analyses when the lag increases sufficiently. Cohn et al. (1994) clearly exhibit this property with a linear shallow-water model. Fig. 11 reflects this convergence in our experiment.

As the fading of cross-correlations is partly due to model errors, it can be expected that a spurious parameterization of this model error leads to undesirable behaviour of the smoother. This is illustrated with another experiment, similar to the one discussed so far, except that the forgetting factor is set to 1 instead of 0.9. This leads to the slow divergence of the filter, but this is not the point here. We focus on the first days only. Also, the smoother corrections are applied over 16 cycles (32 days) instead of 8. Fig. 12 is the counterpart of Fig. 11 for this new experiment. In the first few smoother steps, corrections seem appropriate and the RMSE decreases. Then, when observations beyond 10 days in the future are introduced, the smoother deteriorates the state estimates. These smoother corrections testify to the existence of significant but spurious cross-covariances beyond a lag of 10 days. Here, they are due to a bad parameterization of the model error.

Fig. 12. Same as Fig. 11 but with a forgetting factor equal to 1 and a lag of 32 days. Only days 0–6 are shown.
As the fading of cross-correlations is partly due to model errors, a spurious parameterization of this model error can be expected to lead to undesirable behaviour of the smoother. This is illustrated with another experiment, similar to the one discussed so far, except that the forgetting factor is set to 1 instead of 0.9. This leads to the slow divergence of the filter, but that is not the point here; we focus on the first days only. Also, the smoother corrections are applied over 16 cycles (32 days) instead of 8. Fig. 12 is the counterpart of Fig. 11 for this new experiment. In the first few smoother steps, the corrections seem appropriate and the RMSE decreases. Then, when observations more than 10 days in the future are introduced, the smoother deteriorates the state estimates. These smoother corrections testify to the existence of significant but spurious cross-covariances beyond a lag of 10 days. Here, they are due to a poor parameterization of the model error.

Fig. 12. Same as Fig. 11 but with a forgetting factor equal to 1 and a lag of 32 days. Only days 0–6 are shown.

7.3. Iterative fixed-interval smoothing

If Kalman's hypotheses are verified, the final smoother results are the best possible and cannot be further improved: performing an additional analysis with an already used observation should not have any impact. In practice, though, the SEEK smoother is suboptimal and such a re-analysis may be considered, leading to the concept of iterative smoothing. This has been discussed by Jazwinski (1970) and already tackled by Zhu et al. (2003) for atmospheric data assimilation. We test an iterative smoothing over a fixed time interval covering the first 10 days of the experiment. This window length is chosen in relation to the best lag identified previously (Figs. 11 and 12). After the first smoother pass, the smoothed initial state and covariances (corrected with observations from days 0 to 10) are used to initialize a new filter and smoother pass over the interval. This is done five times.

Fig. 13. Evolution of the RMSE in SSH in the iterative smoothing experiments. Iterative smoothing is performed over the first 10 days. Beyond day 10, the system is in forecast mode. In black: 1st iteration (line: filter; dots: smoother); green: 2nd iteration; blue: 5th iteration; red: 6th iteration with the free model.

The evolution of the RMSE in SSH is shown in Fig. 13. Unexpectedly, the estimates from the first smoother pass (black dots) are better than those of the second filtering pass (green line). This inclines us to be cautious when iterating the smoother: it seems better to perform too many iterations than too few. In our case, convergence is reached after 5 iterations. Iterating the smoother is nonetheless successful: the filter and smoother analyses at the 5th iteration (blue line and dots) provide better estimates than the smoother at the first pass. In the first two days, the error reduction is more than 1 cm.
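The iteration procedure tested above, namely re-initializing each pass with the smoothed initial state and covariances of the previous one, can be sketched schematically as follows. Here run_filter_and_smoother is a hypothetical stand-in for one SEEK filter plus retrospective-analysis pass over the window, not the authors' implementation.

```python
import numpy as np

def run_filter_and_smoother(x0, S0, observations):
    """Hypothetical stand-in for one filter + smoother pass over the window.
    Returns a smoothed initial state, a smoothed covariance square-root factor,
    and the state estimates over the window. Here it merely nudges the state
    toward the observations, for illustration only."""
    x0_smoothed = 0.5 * (x0 + observations.mean(axis=0))
    S0_smoothed = 0.9 * S0                       # pretend the analyses shrank the errors
    window_estimates = np.tile(x0_smoothed, (observations.shape[0], 1))
    return x0_smoothed, S0_smoothed, window_estimates

def iterative_fixed_interval_smoothing(x0, S0, observations, n_iterations=5):
    """Re-run the filter/smoother over the same window, re-initializing each
    pass with the smoothed initial state and covariance of the previous one."""
    for _ in range(n_iterations):
        x0, S0, estimates = run_filter_and_smoother(x0, S0, observations)
    return x0, S0, estimates

x0 = np.zeros(4)                                 # toy initial state
S0 = np.eye(4)                                   # toy covariance square-root factor
obs = np.ones((5, 4))                            # 5 analysis times in the 10-day window
x0_final, S0_final, _ = iterative_fixed_interval_smoothing(x0, S0, obs)
print(x0_final)
```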
If a 6th iteration is performed without assimilation, the resulting error (red line) is larger than with the 5th filter or smoother pass. This is due to the imperfect model, which naturally deviates from the true trajectory if it is not tightly guided. Iterating the smoother also improves the estimate of the final state, though to a limited extent here. Iterations could then be thought useful even for prediction. To address this point more precisely, three forecasts are performed over the next 8 days, starting from the filter solutions at the 1st and 5th iterations and from the free model 6th iteration (the colors of Fig. 13 are used for guidance). The free model result being the worst at day 10, the corresponding forecast displays errors larger than for the other initializations. Surprisingly, in the forecast initialized with the 5th filter iteration, the error grows faster than with the 1st filter iteration. The analysis of the 5th filter iteration, although more accurate, is probably not a numerically balanced initial state for the model. Again, this takes us back to the quality of the model, and to the fact that improvements in the analysis performance do not necessarily result in better forecasts. Using a smoother to improve forecasts is probably relevant, but our experiments suggest that it must be considered carefully in regard to the quality of the model.

8. Conclusion

In theory, the Kalman filter is well designed to provide an initial state for a prediction. To build a re-analysis of the ocean circulation, optimal smoothers are more appropriate. In particular, the impacts of gaps in the observation network are efficiently smoothed out by smoothers. In this paper, a smoother algorithm has been derived based on a reduced rank square-root filter, the SEEK filter, and implemented with a high resolution ocean circulation model. The theoretical derivation of the smoother from the SEEK filter is straightforward when the model is considered perfect. The smoother equations lend themselves to the square-root decomposition of the state error covariance matrix, and their implementation results in a limited extra computational burden. Once the reduced rank square-root filter is implemented, the smoother algorithm is virtually cost-free in terms of computing time. This contrasts with the standard optimal smoother, whose computational complexity is several times larger than the Kalman filter's.

As with the Kalman filter, the presence of model errors complicates the theoretical derivation of square-root smoothers. A solution has been proposed, based on a forgetting factor approach that consists in inflating the forecast error covariance matrix by a factor larger than 1 to mimic the addition of a model error covariance matrix. This approach only moderately modifies the smoother algorithm, and can be implemented strictly.
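As a simple illustration of the forgetting factor idea (not the full modified algorithm of the paper), the sketch below inflates a square-root covariance factor S, with P = S Sᵀ, by a factor 1/ρ where ρ < 1. The state dimension, reduced rank and variable names are assumptions made for illustration.

```python
import numpy as np

def inflate_sqrt_covariance(S, forgetting_factor=0.9):
    """Scale the square-root factor S so that P = S S^T is inflated by a factor
    1/rho, mimicking the addition of a model error covariance."""
    return S / np.sqrt(forgetting_factor)

n_state, rank = 1000, 30                         # illustrative dimensions
S_forecast = np.random.default_rng(1).standard_normal((n_state, rank))
S_inflated = inflate_sqrt_covariance(S_forecast, forgetting_factor=0.9)

# trace(P) equals the sum of squared entries of S, so the ratio is 1/rho:
ratio = np.sum(S_inflated**2) / np.sum(S_forecast**2)
print(ratio)                                     # ~1.11, i.e. about 11% inflation
```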
The reduced rank square-root smoother was tested in a twin experiment with the NEMO circulation model in an idealized double-gyre configuration at a 1/4° resolution. The observations were synthetic but distributed according to realistic sampling strategies, and contained errors. Representativity and model errors were also introduced by the fact that the simulation from which the observations were extracted was performed at a 1/6° resolution.

The smoother has proven to act positively on the spatial distribution of errors. If a region is not observed at the filter analysis time, errors in this region may not be reduced by the filter. The smoother gives these errors another chance to be reduced, provided appropriate observations exist in the subsequent assimilation steps. The smoother reduces by nearly 20% the RMSE in SSH obtained with the filter alone. This error reduction occurs for the velocities as well, though to a lesser degree. The smoother also efficiently reduces the roughness of the filter RMSE time series, thus improving the constancy of the quality of the state estimates through time. This demonstrates that smoothing can be advantageously applied with high resolution ocean circulation models.

In optimal smoother theory, the larger the smoother lag, the better the results. But generally, because of model errors and nonlinearities in the dynamics, the best result is obtained with a finite lag. The identification of the appropriate lag (the smallest lag that provides the minimum error) is straightforward in a twin experiment approach; in our experiment it was found to be 8 days. It may become a more complex task in a real data assimilation context. A cross-validation technique, such as used by Cane et al. (1996) or Frolov et al. (2008) in the framework of Kalman filtering, may prove able to identify such unknown parameters. It consists of a series of experiments in which only a part of the observations is assimilated, while the other part is withheld for verification.
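A minimal sketch of such a cross-validation loop is given below, under strong assumptions: run_assimilation is a hypothetical stand-in for a full filter-plus-smoother run with a given lag, and the synthetic observations and candidate lags are purely illustrative.

```python
import numpy as np

def run_assimilation(assimilated_obs, lag_days):
    """Hypothetical stand-in for a filter + smoother run using a given lag.
    Returns predicted values at the withheld observation locations. For
    illustration, it pretends that a longer lag reduces a small residual bias."""
    return assimilated_obs.mean() + 0.1 / (1.0 + lag_days)

def select_lag(observations, candidate_lags, withheld_fraction=0.2, seed=0):
    """Pick the lag minimizing the RMS misfit to a withheld subset of observations."""
    rng = np.random.default_rng(seed)
    withheld_mask = rng.random(observations.size) < withheld_fraction
    withheld = observations[withheld_mask]
    assimilated = observations[~withheld_mask]
    misfits = [np.sqrt(np.mean((withheld - run_assimilation(assimilated, lag)) ** 2))
               for lag in candidate_lags]
    return candidate_lags[int(np.argmin(misfits))]

obs = np.random.default_rng(2).normal(0.0, 0.05, size=500)   # synthetic SSH-like data
print(select_lag(obs, candidate_lags=[2, 4, 8, 16]))
```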
The best-lag issue can hardly be disconnected from the model error problem. Indeed, with misspecified or approximate error statistics, the smoother may easily come to spoil the state estimates. This was illustrated with an experiment in which the model error was set to zero, whereas the model is nonlinear and imperfect. A good smoother therefore requires a good parameterization of the model error. Covariance inflation, as used in the present paper, only inflates the error statistics in the subspace spanned by the dynamical component of the forecast errors. In general, model errors have components that are not part of that subspace. More sophisticated modelling of model errors is then required [e.g. Lermusiaux, 2006; Houtekamer et al., 2009]. Testing such models will be a priority in the forthcoming developments of the SEEK smoother. Optimistically, the monitoring of the smoother corrections might be used to check the quality of the model error parameterization, but this is speculative and was not addressed in this work.

Acknowledging that assimilation systems are never optimal, and hence that the solution can always be further improved, iterating the smoother over a fixed time interval is tempting, and was tested here. Iterations improve the representation of the ocean state in the interval where observations are available for assimilation. However, in our experiments performed with an imperfect model, the forecast from the analysis improved by iterations was poorer than the forecast from the standard filter analysis. These results lead us to express reservations about using the smoother for prediction purposes with an imperfect model.

The main benefit of the reduced rank square-root smoother is expected for reanalyses of the ocean circulation. But its use in an operational context will raise new challenges. In our opinion, the most challenging issue is that present operational systems of ocean prediction often neglect the forecast of state error statistics, and use pre-defined, fixed error statistics. To account for the known, long-term variations of the dynamical system (typically the seasonality), these statistics are sometimes changed over the assimilation time interval. But as long as nothing is done to represent the error cross-covariances over time correctly, such simplified statistics can only lead to incorrect retrospective analyses when the smoother is applied. The best strategy will then have to be chosen: adapting the smoothing algorithm to such a steady-state error scheme (this requires an ad hoc fading of the cross-correlations), or introducing the dynamical propagation of the error statistics (this requires outstanding adaptive schemes to parameterize model errors). New perspectives might also arise from smoothing. With a good model, iterative smoothing may become useful for prediction. Finally, the smoother makes the best of observations with a non-homogeneous space-time distribution, where the filter may sometimes fail. The development of efficient smoothers might then bring new insight into strategies to observe the ocean.

Acknowledgements

We thank Jean-Marc Molines for sharing his skills with the NEMO model. This work was supported by CNRS/INSU through the LEFE/ASSIM program and by the ANR program. Partial support of the European Commission under Grant Agreement FP7-SPACE2007-1-CT-218812-MYOCEAN is gratefully acknowledged. Calculations were performed using HPC resources from GENCI-IDRIS (Grant 2009-011279). The original manuscript has been much improved thanks to Prof. Pierre Lermusiaux and two other anonymous referees.

References

Anderson, B.D.O., Moore, J.B., 1979. Optimal Filtering. Prentice-Hall. 357 pp.
Bennett, A.F., 1992. Inverse Methods in Physical Oceanography. Cambridge Monographs on Mechanics and Applied Mathematics. Cambridge University Press. 347 pp.
Brankart, J.-M., Testut, C.-E., Parent, L., 2002. An integrated system of sequential assimilation modules: SESAM reference manual. Tech. Rep., Office Note. LEGI/MEOM, Grenoble, France.
Brankart, J.-M., Testut, C.-E., Brasseur, P., Verron, J., 2003. Implementation of a multivariate data assimilation scheme for isopycnic coordinate ocean models: application to a 1993–96 hindcast of the North Atlantic Ocean circulation. Journal of Geophysical Research 108 (19), 1–20. Available from: <http://wwwmeom.hmg.inpg.fr/Web/Outils/SESAM/sesam.html>.
Brankart, J.-M., Ubelmann, C., Testut, C.-E., Cosme, E., Brasseur, P., Verron, J., 2009. Efficient parameterization of the observation error covariance matrix for square root or ensemble Kalman filters: application to ocean altimetry. Monthly Weather Review 137, 1908–1927.
Brasseur, P., Ballabrera, J., Verron, J., 1999. Assimilation of altimetric data in the mid-latitude oceans using the SEEK filter with an eddy-resolving primitive equation model. Journal of Marine Systems 22, 269–294.
Brasseur, P., Verron, J., 2006. The SEEK filter method for data assimilation in oceanography: a synthesis. Ocean Dynamics. doi:10.1007/s10236-006-0080-3.
Brasseur, P. et al., 2005. Data assimilation for marine monitoring and prediction: the MERCATOR operational assimilation systems and the MERSEA developments. Quarterly Journal of the Royal Meteorological Society 131, 3561–3582.
Cane, M.A., Kaplan, A., Miller, R.N., Tang, B., Hackert, E.C., Busalacchi, A.J., 1996. Mapping tropical Pacific sea level: data assimilation via a reduced state space Kalman filter. Journal of Geophysical Research 101, 22599–22617.
Castruccio, F., Verron, J., Gourdeau, L., Brankart, J.-M., Brasseur, P., 2006. On the role of the GRACE mission in the joint assimilation of altimetry and TAO data in a tropical Pacific ocean model. Geophysical Research Letters 33, L14616.
Chassignet, E.P., Gent, P.R., 1991. The influence of boundary conditions on midlatitude jet separation in ocean numerical models. Journal of Physical Oceanography 21, 1290–1299.
Cohn, S.E., Sivakumaran, N.S., Todling, R., 1994. A fixed-lag Kalman smoother for retrospective data assimilation. Monthly Weather Review 122, 2838–2867.
Evensen, G., 1994. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. Journal of Geophysical Research 99 (C5), 10143–10162.
Evensen, G., 2003. The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dynamics 53, 343–367.
Evensen, G., 2007. Data Assimilation: The Ensemble Kalman Filter. Springer.
Evensen, G., van Leeuwen, P.J., 2000. An ensemble Kalman smoother for nonlinear dynamics. Monthly Weather Review 128, 1852–1867.
Frolov, S., Baptista, A.M., Leen, T.K., Lu, Z., van der Merwe, R., 2008. Fast data assimilation using a nonlinear Kalman filter and a model surrogate: an application to the Columbia River estuary. Dynamics of Atmospheres and Oceans. doi:10.1016/j.dynatmoce.2008.10.004.
Fukumori, I., 2002. A partitioned Kalman filter and smoother. Monthly Weather Review 130, 1370–1383.
Fukumori, I., Malanotte-Rizzoli, P., 1995. An approximate Kalman filter for ocean data assimilation: an example with an idealized Gulf Stream model. Journal of Geophysical Research 100, 6777–6793.
Gaspar, P., Wunsch, C., 1989. Estimates from altimeter data of barotropic Rossby waves in the Northwestern Atlantic Ocean. Journal of Physical Oceanography 19, 1821–1844.
Hamill, T.M., Whitaker, J.S., Snyder, C., 2001. Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Monthly Weather Review 129, 2776–2790.
Holland, W.R., 1978. The role of mesoscale eddies in the general circulation of the ocean—numerical experiments using a wind-driven quasi-geostrophic model. Journal of Physical Oceanography 8, 363–392.
Houtekamer, P.L., Mitchell, H.L., 1998. Data assimilation using an ensemble Kalman filter technique. Monthly Weather Review 126, 796–811.
Houtekamer, P.L., Mitchell, H.L., 2001. A sequential ensemble Kalman filter for atmospheric data assimilation. Monthly Weather Review 129, 123–137.
Houtekamer, P.L., Mitchell, H.L., Deng, X., 2009. Model error representation in an operational ensemble Kalman filter. Monthly Weather Review 137, 2126–2143.
Jazwinski, A.H., 1970. Stochastic Processes and Filtering Theory. Mathematics in Science and Engineering, vol. 64. Academic Press. 376 pp.
Julier, S.J., Uhlmann, J.K., 1997. A new extension of the Kalman filter to nonlinear systems. In: SPIE Proceedings of the 11th International Symposium on Aerospace/Defense Sensing, Simulation and Controls, Orlando, Florida.
Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. Journal of Basic Engineering 82, 35–45.
Le Provost, C., Verron, J., 1987. Wind-driven mid-latitude circulation – transition to barotropic instability. Dynamics of Atmospheres and Oceans 11, 175–201.
Lermusiaux, P.F.J., Robinson, A.R., 1999. Data assimilation via error subspace statistical estimation. Part I: theory and schemes. Monthly Weather Review 127, 1385–1407.
Lermusiaux, P.F.J., 1999a. Data assimilation via error subspace statistical estimation. Part II: Middle Atlantic Bight shelfbreak front simulations and ESSE validation. Monthly Weather Review 127, 1408–1432.
Lermusiaux, P.F.J., 1999b. Estimation and study of mesoscale variability in the Strait of Sicily. Dynamics of Atmospheres and Oceans 29, 255–303.
Lermusiaux, P.F.J., 2006. Uncertainty estimation and prediction for interdisciplinary ocean dynamics. Journal of Computational Physics 217, 176–199.
Lermusiaux, P.F.J., Robinson, A.R., Haley, P.J.H., Leslie, W.G., 2002. Advanced interdisciplinary data assimilation: filtering and smoothing via error subspace statistical estimation. In: Proceedings of the OCEANS 2002 MTS/IEEE Conference. IEEE, Holland Publications, pp. 795–802.
Li, H., Kalnay, E., Miyoshi, T., 2009. Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter. Quarterly Journal of the Royal Meteorological Society 135, 523–533.
Madec, G., 2008. NEMO reference manual, ocean dynamics component: NEMO-OPA. Preliminary version. Tech. Rep. No. 27, Institut Pierre-Simon Laplace (IPSL), France. ISSN 1288-1619, 91 pp. Available from: <http://www.loceanipsl.upmc.fr/NEMO/general/manual/index.html>.
Ngodock, H.E., Jacobs, G.A., Chen, M., 2006. The representer method, the ensemble Kalman filter and the ensemble Kalman smoother: a comparison study using a nonlinear reduced gravity ocean model. Ocean Modelling 12, 378–400.
Parrish, D.F., Cohn, S.E., 1985. A Kalman filter for a two-dimensional shallow-water model: formulation and preliminary experiments. Tech. Rep., Office Note 304. National Meteorological Center, Washington, DC.
Pham, D.T., Verron, J., Roubaud, M.C., 1998. A singular evolutive extended Kalman filter for data assimilation in oceanography. Journal of Marine Systems 16, 323–340.
Ravela, S., McLaughlin, D., 2007. Fast ensemble smoothing. Ocean Dynamics 57, 123–134.
Rauch, H.E., Tung, F., Striebel, C.T., 1965. Maximum likelihood estimates of linear dynamic systems. AIAA Journal 3 (8), 1445–1450.
Rozier, D., Cosme, E., Birol, F., Brasseur, P., Brankart, J.-M., Verron, J., 2007. A reduced-order Kalman filter for data assimilation in physical oceanography. SIAM Review 49 (3), 449–465.
Simon, D., 2006. Optimal State Estimation. Wiley & Sons. 530 pp.
Testut, C.-E., Brasseur, P., Brankart, J.-M., Verron, J., 2003. Assimilation of sea-surface temperature and altimetric observations during 1992–1993 into an eddy permitting primitive equation model of the North Atlantic Ocean. Journal of Marine Systems 40–41, 291–316.
Todling, R., Cohn, S.E., 1994. Suboptimal schemes for atmospheric data assimilation based on the Kalman filter. Monthly Weather Review 122, 2530–2557.
Todling, R., Cohn, S.E., 1996. Some strategies for Kalman filtering and smoothing. In: Proceedings of the ECMWF Seminar on Data Assimilation. ECMWF, Reading, United Kingdom, pp. 91–111.
Todling, R., Cohn, S.E., Sivakumaran, N.S., 1998. Suboptimal schemes for retrospective data assimilation based on the fixed-lag Kalman smoother. Monthly Weather Review 126, 2274–2286.
van Leeuwen, P.J., 1999. The time mean circulation in the Agulhas region determined with the ensemble smoother. Journal of Geophysical Research 104, 1393–1404.
van Leeuwen, P.J., 2001. An ensemble smoother with error estimates. Monthly Weather Review 129, 709–728.
van Leeuwen, P.J., Evensen, G., 1996. Data assimilation and inverse methods in terms of a probabilistic formulation. Monthly Weather Review 124, 2898–2913.
Verlaan, M., Heemink, A.W., 1997. Tidal flow forecasting using reduced-rank square root filter. Stochastic Hydrology and Hydraulics 11, 349–368.
Verron, J., Gourdeau, L., Pham, D.T., Murtugudde, R., Busalacchi, A.J., 1999. An extended Kalman filter to assimilate satellite altimeter data into a non-linear numerical model of the Tropical Pacific: method and validation. Journal of Geophysical Research 104, 5441–5458.
Zhu, Y., Todling, R., Cohn, S.E., 1999. Technical remarks on smoother algorithms. NASA/GSFC Data Assimilation Office, Office Note 99-02. Available from: <http://dao.gsfc.nasa.gov/pubs/on/>.
Zhu, Y., Todling, R., Guo, J., Cohn, S.E., Navon, I.M., Yang, Y., 2003. The GEOS-3 retrospective data assimilation system: the 6-hour lag case. Monthly Weather Review, 2129–2150.