Exercise number 4

Variational Data Assimilation

Jean-Michel Brankart, LEGI, Grenoble, France
Jean-Michel.Brankart@hmg.inpg.fr

The exercise is based on the ‘Double Pendulum 4D-VAR Demonstration Package’ by Ross Bannister, Data Assimilation Research Center, University of Reading, UK
http://www.met.rdg.ac.uk/~ross/DARC/DocDPVar.html

1. Introduction

In order to produce a realistic ocean forecast, all operational centres need data assimilation. The general idea is to use all available (past) observations of the ocean to drive the model phase trajectory as close as possible to the “true” trajectory (at present time), in order to best initialise a deterministic ocean forecast. The objective of this exercise is to get a better understanding of the problem by playing with a toy assimilation system. This should lead you to examine closely the basic hypotheses and limitations of assimilation and forecasting systems.

The toy system that you will operate is the double pendulum mechanical system, controlled by a 4DVAR assimilation scheme. Though this dynamical system may seem very simplistic, it shares some important similarities with the ocean, so that the conclusions that you will draw about assimilation problems will often also be valid for the ocean. In this exercise, we will ask you to experiment with the double pendulum and to generalise the results to the ocean.

2. The double pendulum system

The double pendulum (illustrated in Fig. 1) is a mechanical system consisting of 2 masses suspended from a fixed point C, in a constant and homogeneous gravity field. The first mass $m_1$ is constrained to be at distance $\ell_1$ from the fixed point and the second mass $m_2$ is constrained to be at distance $\ell_2$ from the first mass. The double pendulum has 2 degrees of freedom: the angles $\theta_1$ and $\theta_2$ of the 2 pendulum sections with respect to the vertical. The equations of motion for the double pendulum are derived in Annex A.
They fully determine the future evolution of the system given an initial condition for the 4 state variables: the angles $\theta_1$ and $\theta_2$, and the angular speeds $\dot{\theta}_1$ and $\dot{\theta}_2$. These equations can be integrated numerically (see Annex A), as illustrated in Fig. 2 for the initial condition $\theta_1^0 = \theta_2^0 = \pi/2$, $\dot{\theta}_1^0 = \dot{\theta}_2^0 = 0$. (In this exercise, we always use $m_1 = m_2$ and $\ell_1 = \ell_2 = 0.1$ meter.)

Instead of representing the evolution of the state variables as a function of time, we can also follow the evolution of the system in the phase space (with the 4 dimensions $\theta_1$, $\theta_2$, $\dot{\theta}_1$ and $\dot{\theta}_2$), by using 2D projections on several sections of the phase space (such as the plane ($\theta_1$, $\dot{\theta}_1$) in Fig. 3). Figs 2 and 3 are the 2 kinds of figures that you will use in this exercise to analyse the behaviour of the system. (Animations illustrating the movement of the double pendulum can be found on the web sites listed in Annex B.)

[Figure 1: the double pendulum]
[Figure 2: time evolution of the state variables]
[Figure 3: projection of the trajectory on the plane ($\theta_1$, $\dot{\theta}_1$)]

The double pendulum is an interesting system for us because it exhibits a wide range of different dynamical behaviours, from quasi-periodic oscillations (for small oscillations, or for high energy movement, as gravity becomes unimportant), to chaotic evolution (at intermediate energy levels, for specific domains of initial conditions). In the chaotic regions of the phase space, the movement is very sensitive to the initial conditions. This is the first important similarity with oceanic systems that will be exploited in this exercise.

Fig. 4 shows the power spectrum of the $\theta_1$ time series (for the same example as above), in the case of a chaotic movement ($\theta_1^0 = \theta_2^0 = \pi/2$, $\dot{\theta}_1^0 = \dot{\theta}_2^0 = 0$). The presence of energy at many different time scales, with peaks and troughs in the spectrum, is a second similarity with oceanic systems.

Of course the double pendulum is not like the ocean.
The 2 main differences that we will have to keep in mind here are the absence of dissipation (the system is conservative) and the absence of time-dependent external forcing (the system is autonomous), so that the system is reversible.

3. 4DVAR data assimilation

Assume that we observe the movement of a double pendulum with unknown initial conditions. The problem of estimating the initial conditions that optimise the fit between the observations and the model solution can be formulated using a variational formalism. The optimal initial conditions $x(0)$ are defined as the minimum of the cost (or penalty) function:

$$J(x) = \frac{1}{2}\,[x_B - x(0)]^T B^{-1} [x_B - x(0)] + \frac{1}{2} \sum_{t} [y(t) - H_0^t x(0)]^T E^{-1} [y(t) - H_0^t x(0)]$$

where the sum runs over the observation times $t \ge 0$, and
$y(t)$ is the set of observations at time $t$;
$H_0^t$ is the model operator between time 0 and time $t$;
$x_B$ is the first guess for the initial conditions;
$B$ is the background (first guess) error covariance matrix;
$E$ is the observational error covariance matrix.

The solution of this problem can be found using a steepest descent algorithm (the main difficulty being the computation of the gradient of the cost function), as in most 4DVAR schemes used in meteorology and oceanography. (The term 4DVAR is certainly a misnomer for the double pendulum, which is not a 4-dimensional system.)

In the double pendulum implementation of this “4DVAR” algorithm, the following particular choices have been made:

The observations $y(t)$ are available regularly in time, every $n_y$ time steps. There are 2 possible kinds of observations: (i) all state variables are observed (the angles and the angular speeds), (ii) only the angles are observed.

The observation error standard deviation is the same for all angle observations, and the same for all angular speed observations. The observation errors are all statistically independent, so that the $E$ matrix is diagonal.
The background error covariance matrix $B$ is parameterised as the time covariance of the model forecast starting from the first guess $x_B$:

$$B = \frac{1}{t} \int_0^t (x(t') - \bar{x})\,(x(t') - \bar{x})^T \, dt' \quad \text{with} \quad \bar{x} = \frac{1}{t} \int_0^t x(t') \, dt' \quad \text{and} \quad x(0) = x_B$$

Due to the non-linearity of the double pendulum, there is no guarantee that the above cost function has a single minimum (which would be the global minimum). There can be a large number of local minima, which are not the solution of our problem, and to which the steepest descent algorithm can be led to converge. In order to circumvent this difficulty, the full estimation period ($n$ time steps) must be divided into a series of shorter assimilation cycles ($n_c$ time steps, with $n_y \le n_c \le n$), in which the effect of the model non-linearities will be negligible. The first guess state $x_B$ for each cycle is the final condition from the previous cycle.

4. Twin assimilation experiments

In this exercise, you will not assimilate observations from a real double pendulum into the double pendulum model, as is done with real observations in ocean forecasting systems. Instead, you will sample observations from a reference model simulation, which will be considered as the “true” movement of the system. These observations will then be assimilated by the 4DVAR scheme in your assimilation experiments. Of course, in such experiments, we must assume that we do not know what the initial conditions are, because it is the job of the assimilation scheme to estimate them. As a first guess $x_B$, you will choose values that are (slightly) different from the true initial conditions.

This is the basic idea of twin assimilation experiments, which can also be used with ocean models, when developing new assimilation schemes or to help optimise the ocean observing system. They are used to analyse the behaviour of an assimilation system. This is also the objective of this exercise.
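The cost function of Section 3 and the division of the estimation period into assimilation cycles can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code of the exercise package: the function names (`cost`, `cycles`, `propagate`) are hypothetical, the model is a stand-in linear oscillator rather than the double pendulum, all state variables are observed, and $B$ is given illustrative fixed values instead of the time covariance parameterisation described above.

```python
import numpy as np

def cost(x0, x_b, B, obs, propagate):
    """Evaluate J for the candidate initial condition x0.

    obs is a list of (step, y, E): the observed state y at time step `step`
    and its error covariance E. propagate(x0, step) stands for H_0^t x(0).
    """
    d_b = x_b - x0
    J = 0.5 * d_b @ np.linalg.solve(B, d_b)        # background term
    for step, y, E in obs:
        d_o = y - propagate(x0, step)
        J += 0.5 * d_o @ np.linalg.solve(E, d_o)   # observation term
    return float(J)

def cycles(n, n_c):
    """Split n time steps into consecutive cycles of at most n_c steps."""
    return [(start, min(start + n_c, n)) for start in range(0, n, n_c)]

def propagate(x0, step, dt=0.01, omega=10.0):
    """Stand-in model (assumption): linear oscillator, state (angle, speed)."""
    c, s = np.cos(omega * step * dt), np.sin(omega * step * dt)
    return np.array([[c, s / omega], [-omega * s, c]]) @ x0

truth = np.array([0.1, 0.0])              # "true" initial condition
x_b = np.array([0.12, 0.0])               # slightly wrong first guess
B = np.diag([0.05**2, 0.5**2])            # illustrative background errors
E = np.diag([0.01**2, 0.1**2])            # diagonal: independent obs errors
# Error-free observations sampled from the reference run every 10 steps
obs = [(k, propagate(truth, k), E) for k in range(0, 101, 10)]

print(cost(truth, x_b, B, obs, propagate))  # only the background term remains
print(cost(x_b, x_b, B, obs, propagate))    # only the observation term remains
print(cycles(1000, 300))
```

In a cycled experiment, the minimisation would be run on each `(start, end)` interval in turn, with the final state of one cycle used as the first guess of the next.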
The software that you will operate (“Assim.f”, see Annex D) can work in 2 different modes: the “MakeObs” mode, to perform the reference simulation and to produce the observations, and the “Analyse” mode, to run 4DVAR assimilation experiments. The mode is the first parameter to set in the “Assim.conf” parameter file.

In both modes, you have to specify the following parameters:
1. Dimensions ($\ell_1$, $\ell_2$, in meters) and masses ($m_1$, $m_2$, in kg) of the pendulum,
2. Acceleration due to gravity (in m/s²),
3. Time step (in seconds),
4. Length of integration (in time steps),
5. Double pendulum initial condition ($\theta_1$, $\theta_2$, $\dot{\theta}_1$ and $\dot{\theta}_2$, in deg or deg/s).

In the “MakeObs” mode, you have to specify the following additional parameters:
1. Frequency of observation output (in number of time steps),
2. Standard deviation of the simulated observation error for angles,
3. Standard deviation of the simulated observation error for angular speeds (if set to zero, the angular speeds are assumed unobserved).

In the “Analyse” mode, you have to specify the following additional parameters:
1. Length of the assimilation cycles (in number of time steps),
2. Observations to assimilate (obtained from the “MakeObs” mode): value, type (1 for $\theta_1$, 2 for $\theta_2$, 3 for $\dot{\theta}_1$, 4 for $\dot{\theta}_2$), error, time (in seconds).

You will find an example configuration file in Annex C.

The following exercises should lead you to examine the sensitivity of the system to the initial conditions, the sensitivity of the assimilation to the system characteristics (non-linearity, first guess error, model error), and the sensitivity of the assimilation to the quantity and quality of the observations (controllability of the system). In order to facilitate the implementation of the exercises, a few hints are given in Annex F.
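To make the role of the “MakeObs” parameters concrete, here is a minimal Python sketch of a reference simulation followed by the sampling of noisy observations. It is not the “Assim.f” code and does not reproduce the “Assim.conf” format. Several choices are assumptions: the equations of motion are the standard textbook form for point masses on massless rods (Annex A may write them differently), the integration uses a fourth-order Runge-Kutta scheme, the masses are set to 1 kg, angles are in radians, and the names `rhs`, `integrate` and `make_obs` are hypothetical.

```python
import numpy as np

def rhs(x, m1=1.0, m2=1.0, l1=0.1, l2=0.1, g=9.81):
    """Time derivative of the state x = (theta1, theta2, omega1, omega2)."""
    th1, th2, w1, w2 = x
    d = th1 - th2
    den = 2 * m1 + m2 - m2 * np.cos(2 * d)
    a1 = (-g * (2 * m1 + m2) * np.sin(th1)
          - m2 * g * np.sin(th1 - 2 * th2)
          - 2 * np.sin(d) * m2 * (w2**2 * l2 + w1**2 * l1 * np.cos(d))) / (l1 * den)
    a2 = (2 * np.sin(d) * (w1**2 * l1 * (m1 + m2)
                           + g * (m1 + m2) * np.cos(th1)
                           + w2**2 * l2 * m2 * np.cos(d))) / (l2 * den)
    return np.array([w1, w2, a1, a2])

def integrate(x0, dt, n):
    """Integrate n time steps with RK4; returns an array of shape (n+1, 4)."""
    traj = np.empty((n + 1, 4))
    traj[0] = x0
    for k in range(n):
        x = traj[k]
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        traj[k + 1] = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj

def make_obs(traj, dt, n_y, sd_angle, sd_speed, rng):
    """Sample observations every n_y steps as (value, type, error, time).

    Types follow the text: 1, 2 for the angles, 3, 4 for the angular speeds.
    As in the text, sd_speed == 0 means that angular speeds are unobserved.
    """
    records = []
    for k in range(0, len(traj), n_y):
        for i in range(4):
            if i >= 2 and sd_speed == 0.0:
                continue
            sd = sd_angle if i < 2 else sd_speed
            value = traj[k, i] + rng.normal(0.0, sd)
            records.append((value, i + 1, sd, k * dt))
    return records

rng = np.random.default_rng(0)                   # fixed seed for repeatability
x0 = np.array([np.pi / 2, np.pi / 2, 0.0, 0.0])  # initial condition of Fig. 2
traj = integrate(x0, dt=0.001, n=2000)           # reference ("true") run
obs = make_obs(traj, 0.001, 100, 0.01, 0.0, rng) # angles only
print(len(obs), obs[0])
```

Keeping the reference trajectory `traj` alongside the observation records makes it possible to compare the assimilated solution with the “truth” afterwards, which is the point of a twin experiment.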
Exercise 1

Use the “MakeObs” mode of the program to generate two reference simulations: the first one using initial conditions leading to quasi-periodic oscillations, and the second one using initial conditions leading to a chaotic movement. In order to estimate the sensitivity of these trajectories to the initial conditions, redo the simulations after adding a very small perturbation to the initial conditions. Does the new trajectory diverge from the reference one? In case of instability, can you evaluate (roughly) the error doubling time (i.e. the predictability time scale)? What can you say about the validity of the numerical solution for such systems? Do oceanic systems exhibit the same kind of behaviour? Under which conditions? Why is it much more difficult to estimate the predictability time scale for the ocean?

Exercise 2

Use the observations (with small observation error and high observation frequency) generated by the chaotic reference simulation of Exercise 1 in order to control the divergent simulation with the 4DVAR (the “Analyse” mode of the program). Use assimilation cycles that are (i) short and (ii) long with respect to the predictability time scale. Do you observe a different behaviour? Why? Now, increase the error on the first guess (keeping the assimilation cycles short enough). What is the effect on the solution? By controlling the initial condition only, the 4DVAR scheme (as implemented here) assumes that there is no model error in the system. Is it robust to the presence of a small model error (slight perturbations that you will add to the mass ratio or to the length of the pendulum)? Do you think this would be important in ocean assimilation systems?

Exercise 3

Again on the same case study, you will now analyse the effect of decreasing the number and the quality of the observations.
Perform several simulations to test the effect of (i) increasing the observation error (for this purpose, you need to regenerate the observation data sets from the reference simulation, using a larger observation noise), (ii) reducing the observation frequency, and (iii) suppressing the observations of the angular speeds. What do you observe? Can you find situations for which the control of the system becomes ineffective? Can you infer (rough) necessary conditions of controllability? Using these conditions, can you deduce a list of ocean processes that would certainly be impossible to control using only altimetric data, only ARGO data, or both? To what extent is it possible to compensate for observation error by increasing the observation frequency? Why is this effect much more difficult to exploit in practice for oceanic applications (e.g. with altimetry)?
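As a rough aid for Exercise 1, one possible way of estimating an error doubling time from two diverging trajectories is to fit a straight line to the base-2 logarithm of their distance. The Python sketch below is an illustration only: the function name `doubling_time` is hypothetical, the distance series is synthetic (not an output of the double pendulum), and the fit is meaningful only over the period where the error is still small and growing roughly exponentially.

```python
import numpy as np

def doubling_time(t, dist):
    """Return 1/slope of a least-squares line fitted to log2(dist) versus t."""
    slope, _intercept = np.polyfit(t, np.log2(dist), 1)
    return 1.0 / slope

# Synthetic example: an error that doubles every 0.25 s (assumed values)
t = np.linspace(0.0, 2.0, 201)
dist = 1e-6 * 2.0 ** (t / 0.25)
print(doubling_time(t, dist))
```

With real trajectories, `dist` would be the distance in phase space between the reference and the perturbed simulation, restricted to the time window before the error saturates.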