A singular vector perspective of 4D-Var

Christine Johnson, Nancy K. Nichols* and Brian J. Hoskins

*Corresponding author, School of Mathematics, Meteorology and Physics, The University of Reading, Reading UK, n.k.nichols@rdg.ac.uk

Four-dimensional variational data assimilation (4D-Var) combines the information from a time-sequence of observations with model dynamics and a background state to produce an analysis. It is shown that the 4D-Var analysis increments can be written as a linear combination of the singular vectors of a matrix, known as the observability matrix of the system, which is a function of both the observational and the forecast model systems. This singular vector perspective is used to examine the filtering and interpolating properties of 4D-Var. The results of the case studies with the 2D Eady model demonstrate clearly how the 4D-Var algorithm is able to interpolate through observations distributed in time to infer the state in unobserved regions whilst filtering the components with small spatial scales that correspond to noise. The results also show that the appropriate specification of the a priori statistics is vital in order to extract the maximal amount of useful information from the observations. The optimal signal-to-noise parameter is estimated using Tikhonov regularization theory.

1. Introduction

The purpose of data assimilation is to find an accurate estimate of the true state of a system using observations. With observations that are both noisy and sparse, the data assimilation algorithm must be able to interpolate through the observations, whilst filtering the noise. In atmospheric and oceanic data assimilation, we also have extra information available in the form of a numerical model of the system dynamics. This information is used in the algorithm known as four-dimensional variational assimilation (4D-Var), which finds the estimate by minimizing a cost function that combines both the observations and the model dynamics.
This use of dynamical information allows 4D-Var to reconstruct the state in unobserved regions (e.g. Courtier and Talagrand, 1987). In this paper we examine how this is achieved and how this benefit can be exploited. Simple 4D-Var identical twin experiments using the 2D Eady model (Eady, 1949) are presented. This is followed by an analysis of the qualitative information content of observations in 4D-Var using the singular vectors of the observability matrix. It is found that the specification of the signal-to-noise parameter is critical in extracting the information from the observations. The L-Curve method is presented as a technique to compute the optimal parameter value. Further studies related to this work may be found in Johnson (2003) and Johnson et al. (2005a, b).

2. Experiment description

The aim of 4D-Var assimilation is to find the maximum likelihood Bayesian estimate of the initial state given the observations, the prior estimate of the state and the model dynamics. To find the solution, we minimize the cost function

$$J(\mathbf{x}_0) = \tfrac{1}{2}\mu^2 (\mathbf{x}_0 - \mathbf{x}_0^b)^T (\mathbf{x}_0 - \mathbf{x}_0^b) + \tfrac{1}{2}\sum_{i=0}^{N} (\mathbf{H}_i\mathbf{x}_i - \mathbf{y}_i)^T (\mathbf{H}_i\mathbf{x}_i - \mathbf{y}_i) \tag{1}$$

subject to the system equations

$$\mathbf{x}_{i+1} = \mathbf{M}(t_{i+1}, t_i)\,\mathbf{x}_i, \qquad i = 0, \ldots, N-1, \tag{2}$$

where $\mu^2 = \sigma_o^2 / \sigma_b^2$ is the signal-to-noise parameter or error variance ratio, $\sigma_o^2$ and $\sigma_b^2$ are the observation and background error variances respectively, $\mathbf{x}_i$ is the estimate of the state at time $t_i$, $\mathbf{x}_0^b$ is the prior estimate or background state, $\mathbf{H}_i$ is the linear observation operator which transforms from state space to observation space, and $\mathbf{y}_i$ is the set of observations at time $t_i$. (We note that the effects of background and observation error correlations, model errors and nonlinearity are ignored in the present study.)

Using identical twin experiments with the 2D Eady model, we investigate the ability of 4D-Var to reconstruct an upper level temperature wave using only observations of a lower level wave at two points in time.
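The structure of the cost function (1) and the constraint (2) can be illustrated with a minimal Python sketch. It uses a hypothetical two-variable linear model in which only the first variable is observed at two times, loosely mimicking the observation of the lower level wave alone. The model matrix `M`, the observation operator `H`, the states `x_true` and `xb`, and all function names are assumptions made for this illustration; this is not the Eady model configuration used in the experiments. Because the toy model is linear, the minimizer is obtained here by solving the normal equations directly rather than by an iterative descent method.

```python
import numpy as np

def propagate(M, x0, n_steps):
    """Return [x_0, ..., x_n] with x_{i+1} = M x_i, as in Eq. (2)."""
    states = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        states.append(M @ states[-1])
    return states

def cost(x0, xb, mu2, M, H, ys):
    """Evaluate J(x0) of Eq. (1); the same H is used at every time (assumption)."""
    dx = x0 - xb
    states = propagate(M, x0, len(ys) - 1)
    J_b = 0.5 * mu2 * (dx @ dx)                       # background term
    J_o = 0.5 * sum((H @ x - y) @ (H @ x - y)         # observation term
                    for x, y in zip(states, ys))
    return J_b + J_o

def analysis(xb, mu2, M, H, ys):
    """Exact minimizer of J for a linear model, via the normal equations."""
    # Stack H M^i for each observation time (the observability matrix).
    H_hat = np.vstack([H @ np.linalg.matrix_power(M, i) for i in range(len(ys))])
    d = np.concatenate(ys) - H_hat @ xb               # innovations
    A = mu2 * np.eye(xb.size) + H_hat.T @ H_hat
    return xb + np.linalg.solve(A, H_hat.T @ d)

# Hypothetical toy system: a growing rotation; only component 0 is observed.
theta = 0.3
M = 1.1 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
H = np.array([[1.0, 0.0]])
x_true = np.array([1.0, 0.5])
xb = np.array([0.6, 0.9])                             # background with errors
ys = [H @ x for x in propagate(M, x_true, 1)]         # perfect observations
mu2 = 0.01

xa = analysis(xb, mu2, M, H, ys)
print("J(xb) =", cost(xb, xb, mu2, M, H, ys))
print("J(xa) =", cost(xa, xb, mu2, M, H, ys))
print("analysis:", xa, "truth:", x_true)
```

In this toy setting the second component is never observed directly, yet the analysis moves it towards the truth because the model dynamics couple the two components between the observation times.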
Perfect observations are generated from a model run over a time interval corresponding to 6h, initiated with the most rapidly growing normal mode. This initial temperature field exhibits an eastward tilt with height that leads to exponential growth of the solution. The prior estimate is equal to the true state with a phase shift. We investigate the impact of observational noise and the size of the signal-to-noise parameter on the reconstruction of the upper level wave.

3. Results

We first consider the impact of the specification of $\mu^2$ when there is no observational noise. Fig. 1a-b shows the true state (dotted), background state (dashed) and analysis (solid), at the end of the 6h assimilation window, for the case with $\mu^2 = 0.01$. In the assimilation the algorithm extracts the time evolution information contained in the observations of the lower level wave (circles) to infer the unobserved upper level wave, so that both the upper and lower level waves are moved from the background state closer to the true state. When $\mu^2$ is increased (Fig. 2a-b, $\mu^2 = 0.1$), the lower level wave is still corrected, but the upper level wave is reconstructed less accurately. In particular, because the background state has a phase error, the analysis has large amplitude and phase errors.

We next consider the impact of the specification of $\mu^2$ when there is observational noise. The observations have added Gaussian noise with standard deviation 1. When the statistically correct weights are specified ($\mu^2 = 0.08$) the analysis (not shown) is similar to that without observational noise: the lower level wave is reconstructed accurately, but the upper level wave contains phase and amplitude errors, again due to the weighting of the background term. If less weight is given to the background term (Fig. 3a-b, $\mu^2 = 0.01$), a nonphysical wave is generated by the assimilation algorithm on the upper boundary.
Thus the results are sensitive to the noise in the observations, especially in the unobserved regions.

To summarize, weighting the background state too heavily may filter information needed to reconstruct the state in the unobserved regions. However, the analysis in the unobserved regions is sensitive to the observational noise if the background state is not weighted heavily enough. The specification of the appropriate value for $\mu^2 = \sigma_o^2 / \sigma_b^2$ is therefore critical in extracting the maximum amount of useful information from the observations.

Figure 1. 4D-Var analyses using perfect observations and a relatively small signal-to-noise ratio ($\mu^2 = 0.01$). The panels show the (a) upper and (b) lower level temperature fields.

Figure 2. 4D-Var analyses using perfect observations and a relatively large signal-to-noise ratio ($\mu^2 = 0.1$). The details are as Fig. 1.

Figure 3. 4D-Var analyses using noisy observations and a relatively small signal-to-noise ratio ($\mu^2 = 0.01$). The details are as Fig. 1.

4. Singular vector interpretation

To analyse the critical features in the 4D-Var assimilation process, we consider the singular value decomposition (SVD) of the observability matrix. The observability matrix is defined as

$$\hat{\mathbf{H}} = \left[ \mathbf{H}_0^T,\ (\mathbf{H}_1 \mathbf{M}(t_1, t_0))^T,\ \ldots,\ (\mathbf{H}_N \mathbf{M}(t_N, t_0))^T \right]^T \tag{3}$$

and the SVD is defined as

$$\hat{\mathbf{H}} = \sum_{j=1}^{r} \lambda_j \mathbf{u}_j \mathbf{v}_j^T \tag{4}$$

where $\lambda_j$, $\mathbf{u}_j$ and $\mathbf{v}_j$ are the singular values, left singular vectors and right singular vectors (RSVs). Applying the SVD to the solution of the minimization problem gives

$$\mathbf{x}_0 = \mathbf{x}_0^b + \sum_{j=1}^{r} f_j c_j \mathbf{v}_j \tag{5}$$

The increments made to the prior estimate by the 4D-Var algorithm are thus given by a linear combination of the RSVs of the observability matrix, weighted by the two factors

$$f_j = \frac{\lambda_j^2}{\mu^2 + \lambda_j^2}, \qquad c_j = \frac{\mathbf{u}_j^T \mathbf{d}}{\lambda_j} \tag{6}$$

where $\mathbf{d}$ is the vector of innovations, that is, the observations at all times minus the background state mapped to observation space by $\hat{\mathbf{H}}$. The RSVs define the structures that can form the analysis increments and the weights determine the contribution of these structures to the analysis for a given set of observations.

In the previous experiments there are only two pairs of RSVs that are needed to reconstruct the solution (as determined by the values of $c_j$ that are non-zero). These RSVs are shown in Fig. 4. The RSVs form pairs with the same singular value due to the zonal symmetry of the model and have the same spatial structure apart from a phase shift in the horizontal. Here, the first pair of RSVs has a singular value of 1.45 and a maximum amplitude on the lower boundary. The second pair of RSVs has a singular value of 0.27 and a maximum amplitude on the upper boundary. Thus it is the second pair of RSVs that is needed to reconstruct the unobserved upper level wave.

Figure 4. The streamfunction fields for the right singular vectors (RSVs) that are required to form the analysis increment. The vertical axis is height (km).

Figure 5 shows the values of the projection coefficients $c_j$ for perfect and noisy observations. With perfect observations only 4 RSVs are selected, but with noisy observations, many more RSVs are selected. In particular, the coefficients are seen to grow as the singular values decay. If the corresponding singular vectors are not sufficiently filtered, then the analysis may be inaccurate due to the large projection of the observational noise, as seen in Fig. 3. However, if the corresponding singular vectors are filtered too much then the second pair of significant RSVs may be filtered out, causing the reconstructed upper level wave to lose accuracy, as seen in Fig. 2.

Figure 5. The values of the projection coefficients $c_j$ for perfect (dashed) and noisy (solid) observations.

5. Tikhonov regularization

If the value of $\mu^2$ is relatively small, the solution is sensitive to the noise, whilst if $\mu^2$ is relatively large, the useful information in the observations is filtered. Thus, a vital part of 4D-Var is the specification of the value of $\mu^2$. Accurate estimates of the background error variances are not easily available.
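The filtering role of $\mu^2$ in (5)-(6) can be illustrated with a small numerical sketch. The matrix below is a synthetic stand-in for the observability matrix, built with prescribed singular values: 1.45 and 0.27 are the values quoted above, while the third value 0.01, the random orthogonal factors, the noise amplitude and the function name `filtered_increment` are assumptions made purely for illustration. The true increment is placed along the two leading RSVs and the noise is placed, as a worst case, along the left singular vector with the smallest singular value.

```python
import numpy as np

def filtered_increment(H_hat, d, mu2):
    """Increment x0 - xb from Eqs. (5)-(6): sum_j f_j c_j v_j."""
    U, lam, Vt = np.linalg.svd(H_hat, full_matrices=False)
    f = lam**2 / (mu2 + lam**2)        # filter factors f_j
    c = (U.T @ d) / lam                # projection coefficients c_j
    return Vt.T @ (f * c), f, c

# Synthetic observability matrix with prescribed singular values (assumption).
rng = np.random.default_rng(0)
Q1, _ = np.linalg.qr(rng.standard_normal((4, 4)))    # orthonormal left factors
Q2, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # orthonormal right factors
sing_vals = np.array([1.45, 0.27, 0.01])
H_hat = Q1[:, :3] @ np.diag(sing_vals) @ Q2.T

dx_true = Q2[:, 0] + Q2[:, 1]          # true increment: two leading RSVs only
d_clean = H_hat @ dx_true              # noise-free innovations
d_noisy = d_clean + 0.05 * Q1[:, 2]    # noise along the weakest direction

errors = {}
for mu2 in (1e-6, 1e-2, 1.0):
    dx, f, c = filtered_increment(H_hat, d_noisy, mu2)
    errors[mu2] = np.linalg.norm(dx - dx_true)
    print(f"mu2 = {mu2:g}: filter factors = {f}, error = {errors[mu2]:.3f}")
```

With a very small $\mu^2$ the noise component is divided by the smallest singular value and passes almost unfiltered; with a large $\mu^2$ the component with singular value 0.27 is damped along with the noise. In this constructed example the intermediate value gives the smallest error, which is the trade-off described above.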
We show here how good choices for $\mu^2$ can be determined directly from the observations. The 4D-Var problem (1) can be viewed as a Tikhonov regularization, which is used to solve discrete ill-posed inverse problems, where the solution is sensitive to noise even when there are sufficient data to define it uniquely. In such problems, a term similar to the background term in 4D-Var is added to regularize the problem and the parameter $\mu^2$ is known as the regularization parameter.

A simple method to compute the optimal value for $\mu^2$ is the L-Curve (Hansen and O'Leary, 1993), illustrated in Fig. 6. The L-Curve is a parametric plot of the logs of the two separate least-squares terms at the minimum of the cost function. As we wish to minimize the sensitivity of the solution whilst minimizing the loss in accuracy due to the extra constraint, the optimal choice for $\mu^2$ is found at the point of maximum curvature (the corner of the L). For the Eady problem examined here, we see that the optimal value should be in the region $\mu^2 = 0.08$ to $0.1$, which is the range for the best values found experimentally.

Figure 6. The L-Curve: a parametric plot of the two terms of the cost function as a function of $\mu^2$, the values of which are written beside each point.

6. Conclusions

A new mathematical insight into the use of observations within 4D-Var is presented. It is shown that the 4D-Var analysis increments can be written as a linear combination of the singular vectors of a matrix, known as the observability matrix of the system, which is a function of both the observational and the forecast model systems. For the simple case study here it is found that the information needed to reconstruct the state in the unobserved region corresponds to relatively small singular values. This means that this reconstruction is a delicate task. If too much weight is given to the observations, the state becomes noisy, but if too little weight is given to the observations, the state is not reconstructed adequately. This study demonstrates that to exploit the use of the model dynamics in 4D-Var, and hence to extract the information needed to reconstruct the state in unobserved regions, the appropriate value for the signal-to-noise ratio must be specified. The use of the L-Curve shows that even if the error statistics are unknown, it is still possible to find the appropriate value for $\mu^2$ from the data.

Acknowledgements: The authors are grateful to S. P. Ballard from the Met Office and A. S. Lawless from The University of Reading for their contributions to this research.

References

Eady ET. Long waves and cyclone waves. Tellus 1:33-52, 1949.

Courtier P and Talagrand O. Variational assimilation of meteorological observations with the adjoint vorticity equation (ii): Numerical results. Q. J. R. Meteorol. Soc. 113:1329-1368, 1987.

Hansen PC and O'Leary DP. The use of the L-Curve in the regularization of discrete ill-posed problems. SIAM J. Sci. Comput. 14:1487-1503, 1993.

Johnson C. Information content of observations in variational data assimilation. Ph.D. Thesis, The University of Reading, 2003.

Johnson C, Hoskins BJ and Nichols NK. A singular vector perspective of 4D-Var: Filtering and interpolation. Q. J. R. Meteorol. Soc. 131:1-20, 2005a.

Johnson C, Nichols NK and Hoskins BJ. Very large inverse problems in atmosphere and ocean modeling. Int. J. Num. Meth. Fluids 47:759-771, 2005b.