Exercise number 4: variational data assimilation

Jean-Michel Brankart, LEGI, Grenoble, France
Jean-Michel.Brankart@hmg.inpg.fr
The exercise is based on the ‘Double Pendulum 4D-VAR Demonstration Package’
by Ross Bannister, Data Assimilation Research Center, University of Reading, UK
http://www.met.rdg.ac.uk/~ross/DARC/DocDPVar.html
1. Introduction
In order to produce a realistic ocean forecast, all operational centres need data assimilation.
The general idea is to use all available (past) observations of the ocean to drive the model
phase trajectory as close as possible to the “true” trajectory (at present time) in order to best
initialise a deterministic ocean forecast. The objective of this exercise is to get a better
understanding of the problem by playing with a toy assimilation system. This should lead
you to examine closely the basic hypotheses and limitations of the assimilation and
forecasting systems.
The toy system that you will operate is the double pendulum mechanical system, controlled
by a 4DVAR assimilation scheme. Though this dynamical system may seem very simplistic,
there are some important similarities with the ocean, so that the conclusions that you will
draw about assimilation problems will often also be valid for the ocean. In this exercise, we
will ask you to experiment with the double pendulum and to generalise the results to the ocean.
2. The double pendulum system
The double pendulum (illustrated in Fig. 1) is a mechanical system consisting of 2 masses
suspended from a fixed point C, in a constant and homogeneous gravity field. The first mass
$m_1$ is constrained to be at distance $\ell_1$ from the fixed point and the second mass $m_2$ is
constrained to be at distance $\ell_2$ from the first mass. The double pendulum has 2 degrees of
freedom: the angles $\theta_1$ and $\theta_2$ of the 2 pendulum sections with respect to the vertical.
The equations of motion for the double pendulum are derived in Annex A. They fully
determine the future evolution of the system given an initial condition for the 4 state
variables: the angles $\theta_1$ and $\theta_2$, and the angular speeds $\dot{\theta}_1$ and $\dot{\theta}_2$. These equations can be
integrated numerically (see Annex A) as illustrated in Fig. 2 for the initial condition
$\theta_{10} = \theta_{20} = \pi/2$, $\dot{\theta}_{10} = \dot{\theta}_{20} = 0$. (In this exercise, we always use
$m_1 = m_2$ and $\ell_1 = \ell_2 = 0.1$ meter.) Instead of representing the evolution of the state variables
as a function of time, we can also follow the evolution of the system in the phase space (with the 4
dimensions $\theta_1$, $\theta_2$, $\dot{\theta}_1$ and $\dot{\theta}_2$), by using 2D projections on several sections in the
phase space (such as the plane ($\theta_1$, $\dot{\theta}_1$) in Fig. 3). Figs 2 and 3 are the 2 kinds of figures that
you will use in this exercise to analyse the behaviour of the system. (Animations illustrating the
movement of the double pendulum can be found on the web sites listed in Annex B.)
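The numerical integration mentioned above can be sketched as follows. This is only an illustration, not the scheme of Annex A: it uses the standard textbook form of the frictionless double pendulum equations and a fourth-order Runge-Kutta step, both chosen here for convenience. The function names, the 1 kg masses and the value g = 9.81 m/s² are assumptions (the text only states that the two masses are equal and that both lengths are 0.1 m). Angles are in radians.

```python
import math

def derivatives(state, m1=1.0, m2=1.0, l1=0.1, l2=0.1, g=9.81):
    """Time derivative of (theta1, theta2, omega1, omega2); angles in radians."""
    th1, th2, w1, w2 = state
    d = th1 - th2
    den = 2.0 * m1 + m2 - m2 * math.cos(2.0 * d)  # always >= 2*m1 > 0
    a1 = (-g * (2.0 * m1 + m2) * math.sin(th1)
          - m2 * g * math.sin(th1 - 2.0 * th2)
          - 2.0 * math.sin(d) * m2 * (w2 * w2 * l2 + w1 * w1 * l1 * math.cos(d))
          ) / (l1 * den)
    a2 = (2.0 * math.sin(d) * (w1 * w1 * l1 * (m1 + m2)
                               + g * (m1 + m2) * math.cos(th1)
                               + w2 * w2 * l2 * m2 * math.cos(d))
          ) / (l2 * den)
    return (w1, w2, a1, a2)

def rk4_step(state, dt, **params):
    """Advance the state by one time step with classical Runge-Kutta (assumed scheme)."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = derivatives(state, **params)
    k2 = derivatives(shifted(state, k1, 0.5 * dt), **params)
    k3 = derivatives(shifted(state, k2, 0.5 * dt), **params)
    k4 = derivatives(shifted(state, k3, dt), **params)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + e)
                 for s, a, b, c, e in zip(state, k1, k2, k3, k4))

def integrate(state0, dt, n_steps, **params):
    """Return the trajectory as a list of n_steps + 1 states."""
    traj = [tuple(state0)]
    for _ in range(n_steps):
        traj.append(rk4_step(traj[-1], dt, **params))
    return traj

def energy(state, m1=1.0, m2=1.0, l1=0.1, l2=0.1, g=9.81):
    """Total mechanical energy, conserved by the exact equations of motion."""
    th1, th2, w1, w2 = state
    kinetic = 0.5 * m1 * (l1 * w1) ** 2 + 0.5 * m2 * (
        (l1 * w1) ** 2 + (l2 * w2) ** 2
        + 2.0 * l1 * l2 * w1 * w2 * math.cos(th1 - th2))
    potential = -(m1 + m2) * g * l1 * math.cos(th1) - m2 * g * l2 * math.cos(th2)
    return kinetic + potential

# Initial condition used in the text: both angles at pi/2, at rest.
trajectory = integrate((math.pi / 2, math.pi / 2, 0.0, 0.0), dt=1e-3, n_steps=2000)
```

Because the system is conservative, monitoring `energy` along the trajectory gives a simple check on the accuracy of the integration.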
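[Figure 1: the double pendulum]
[Figure 2: time evolution of the state variables]
[Figure 3: projection of the trajectory on the ($\theta_1$, $\dot{\theta}_1$) plane of the phase space]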
The double pendulum is an interesting system for us because it exhibits a wide range of
different dynamical behaviours, from quasi-periodic oscillations (for small oscillations, or for
high-energy movement, as gravity becomes unimportant), to chaotic evolution (at
intermediate energy levels, for specific domains of initial conditions). In the chaotic regions of the
phase space, the movement is very sensitive to the initial conditions. This is the first important
similarity with oceanic systems that will be exploited in this exercise. Fig. 4 shows the power
spectrum of the $\theta_1$ time series (on the same example as above), in the case of a chaotic
movement ($\theta_{10} = \theta_{20} = \pi/2$, $\dot{\theta}_{10} = \dot{\theta}_{20} = 0$). The presence of energy at many
different time scales, with peaks and troughs, is a second similarity with oceanic systems.
[Figure 4: power spectrum of the $\theta_1$ time series]
Of course the double pendulum is not like the ocean. The 2 main differences that we will have
to keep in mind here are the absence of dissipation (the system is conservative) and the
absence of time-dependent external forcing (the system is autonomous), so that the system is
reversible.
3. 4DVAR data assimilation
Assume that we observe the movement of a double pendulum with unknown initial
conditions. The problem of estimating the initial conditions that optimise the fit between the
observations and the model solution can be formulated using a variational formalism. The
optimal initial conditions $x(0)$ are defined as the minimum of the cost (or penalty) function:
$$
J(x) = \frac{1}{2}\,[x_B - x(0)]^T B^{-1} [x_B - x(0)]
     + \frac{1}{2} \sum_{t} [y(t) - H_{t \leftarrow 0}\, x(0)]^T E^{-1} [y(t) - H_{t \leftarrow 0}\, x(0)]
$$
where
the sum runs over the observation times $t \ge 0$;
$y(t)$ is the set of observations at time $t$;
$H_{t \leftarrow 0}$ is the model operator between time 0 and time $t$;
$x_B$ is the first guess for the initial conditions;
$B$ is the background (first guess) error covariance matrix;
$E$ is the observational error covariance matrix.
The solution of this problem can be found iteratively using a steepest descent algorithm (the main
difficulty being the computation of the gradient of the cost function); gradient-based
minimisation of this kind is used in most 4DVAR schemes in meteorology and oceanography.
(Strictly speaking, the name 4DVAR is a misnomer for the double pendulum, which is not a
4-dimensional (space and time) system.)
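The structure of the cost function and of its minimisation can be sketched on a much simpler problem. In the example below, the model operator is a linear harmonic oscillator (not the double pendulum), B and E are diagonal, and the gradient is approximated by finite differences instead of being computed with an adjoint model, which is the usual approach in real 4DVAR systems. All names and numerical values (`OMEGA`, `model`, `cost`, the variances, the first guess) are hypothetical choices made for this illustration.

```python
import math

OMEGA = 2.0  # hypothetical oscillator frequency (rad/s)

def model(x0, t):
    """Model operator: propagate x0 = (position, velocity) from time 0 to time t."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (c * x0[0] + s * x0[1] / OMEGA, -OMEGA * s * x0[0] + c * x0[1])

def cost(x0, xb, b_var, obs, e_var):
    """Background term plus observation term, for diagonal B and E."""
    jb = 0.5 * sum((xb[i] - x0[i]) ** 2 / b_var[i] for i in range(len(x0)))
    jo = 0.0
    for t, y in obs:
        hx = model(x0, t)
        jo += 0.5 * sum((y[i] - hx[i]) ** 2 / e_var[i] for i in range(len(y)))
    return jb + jo

def gradient(f, x, eps=1e-6):
    """Central finite-difference gradient (a stand-in for an adjoint model)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((f(xp) - f(xm)) / (2.0 * eps))
    return grad

def steepest_descent(f, x_start, n_iter=200):
    """Steepest descent with a backtracking (Armijo) line search."""
    x = list(x_start)
    for _ in range(n_iter):
        g = gradient(f, x)
        g2 = sum(gi * gi for gi in g)
        fx = f(x)
        step = 1.0
        while step > 1e-12:
            trial = [xi - step * gi for xi, gi in zip(x, g)]
            if f(trial) <= fx - 1e-4 * step * g2:  # sufficient decrease
                x = trial
                break
            step *= 0.5
    return x

# Twin setup: perfect observations of a "true" run, and a wrong first guess.
truth = (1.0, 0.0)
obs = [(0.1 * k, model(truth, 0.1 * k)) for k in range(1, 11)]
xb = (0.7, 0.5)
j = lambda x0: cost(x0, xb, (1.0, 1.0), obs, (0.01, 0.01))
analysis = steepest_descent(j, xb)
```

Since this toy model is linear, the cost function is quadratic and has a single minimum; this is precisely what is no longer guaranteed for the double pendulum.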
In the double pendulum implementation of this “4DVAR” algorithm, the following particular
choices have been made:
• The observations $y(t)$ are available regularly in time, every $n_y$ time steps. There are 2
kinds of possible observations: (i) all state variables are observed (the angles and the
angular speeds), (ii) only the angles are observed.
• The observation error standard deviation is the same for all angle observations, and the
same for all angular speed observations. The observation errors are all statistically
independent, so that the $E$ matrix is diagonal.
• The background error covariance matrix $B$ is parameterised as the time covariance of the
model forecast starting from the first guess $x_B$:
$$
B = \frac{1}{t} \int_0^t \big(x(t') - \bar{x}\big)\big(x(t') - \bar{x}\big)^T \, dt'
\quad \text{with} \quad
\bar{x} = \frac{1}{t} \int_0^t x(t') \, dt'
\quad \text{and} \quad x(0) = x_B
$$
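The time covariance defining B can be approximated from a model trajectory sampled at regular time steps. The following is a minimal sketch, assuming that the time integrals are replaced by plain averages over the stored states; the function name `time_covariance` is hypothetical and the actual discretisation used in the software may differ.

```python
def time_covariance(trajectory):
    """Time mean and time covariance of a list of state vectors.

    The integrals over the forecast period are approximated by averages
    over the equally spaced states of the trajectory.
    """
    n = len(trajectory)
    dim = len(trajectory[0])
    mean = [sum(x[i] for x in trajectory) / n for i in range(dim)]
    cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in trajectory) / n
            for j in range(dim)]
           for i in range(dim)]
    return mean, cov

# Tiny two-variable example with two time samples.
mean, cov = time_covariance([(0.0, 0.0), (2.0, 4.0)])
```

With this parameterisation, the background error statistics are those of the natural variability of the model forecast, rather than statistics of actual first guess errors.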
Due to the non-linearity of the double pendulum, there is no guarantee that the above cost
function has a single minimum (which would be the global minimum). There can be a large
number of local minima, which are not the solution of our problem, and to which the steepest
descent algorithm may nevertheless converge. In order to circumvent this difficulty, the full
estimation period ($n$ time steps) must be divided into a series of shorter assimilation cycles ($n_c$
time steps, with $n_y \le n_c \le n$), in which the effect of the model non-linearities remains
negligible. The first guess state $x_B$ for each cycle is the final condition from the previous
cycle.
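The cycling strategy can be sketched as follows. The helper names (`assimilation_cycles`, `run_cycles`, `analyse_cycle`) are hypothetical, and the minimisation performed inside each cycle is left to a user-supplied function; only the bookkeeping is shown, namely the splitting of the period and the handing over of the final state as the next first guess.

```python
def assimilation_cycles(n, nc):
    """Split the time steps 0..n into consecutive windows of at most nc steps."""
    if n <= 0 or nc <= 0:
        raise ValueError("n and nc must be positive")
    return [(start, min(start + nc, n)) for start in range(0, n, nc)]

def run_cycles(first_guess, windows, analyse_cycle):
    """Chain the cycles: the final state of one cycle is the next first guess.

    analyse_cycle(xb, start, end) is assumed to return a pair
    (analysed initial state of the cycle, state at the end of the cycle).
    """
    xb = first_guess
    analyses = []
    for start, end in windows:
        x0, x_end = analyse_cycle(xb, start, end)
        analyses.append(x0)
        xb = x_end
    return analyses

windows = assimilation_cycles(10, 4)
```

In this sketch, a last cycle shorter than `nc` is kept when `n` is not a multiple of `nc`; this is a design choice of the example, not something stated in the text.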
4. Twin assimilation experiments
In this exercise, you will not assimilate observations from a real double pendulum into the
double pendulum model, as is done with real observations in ocean forecasting systems. Instead,
you will sample observations from a reference model simulation, which will be considered as
the “true” movement of the system. These observations will then be assimilated by the
4DVAR scheme in your assimilation experiments. Of course, in such experiments, we must
assume that we do not know what the initial conditions are, because it is the job of the
assimilation scheme to estimate them. As a first guess $x_B$, you will choose values that are
(slightly) different from the true initial conditions. This is the basic idea of twin assimilation
experiments, which can also be used with ocean models when developing new assimilation
schemes, or to help optimise the ocean observing system. They are used to analyse the
behaviour of an assimilation system. This is also the objective of this exercise.
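The sampling of synthetic observations from a reference simulation can be sketched as follows. The function name `make_observations`, the use of Gaussian noise, the fixed random seed and the fact that sampling starts at the first time step are assumptions of this example. The record layout (value, type, error, time) and the convention that a zero error on the angular speeds means they are unobserved follow the description of the parameter file given below.

```python
import random

def make_observations(trajectory, dt, ny, sigma_angle, sigma_speed, seed=0):
    """Sample noisy observations every ny time steps from a reference trajectory.

    Each state is (theta1, theta2, omega1, omega2). Each observation is a tuple
    (value, type, error, time), with type 1 or 2 for the angles and 3 or 4 for
    the angular speeds. If sigma_speed is zero, the speeds are not observed.
    """
    rng = random.Random(seed)  # fixed seed so that the data set is reproducible
    observations = []
    for k in range(0, len(trajectory), ny):
        time = k * dt
        state = trajectory[k]
        for kind in (1, 2):
            value = rng.gauss(state[kind - 1], sigma_angle)
            observations.append((value, kind, sigma_angle, time))
        if sigma_speed > 0.0:
            for kind in (3, 4):
                value = rng.gauss(state[kind - 1], sigma_speed)
                observations.append((value, kind, sigma_speed, time))
    return observations

# A short artificial "reference" trajectory of 5 states.
reference = [(0.1 * k, 0.2 * k, 1.0, -1.0) for k in range(5)]
angles_only = make_observations(reference, dt=0.01, ny=2,
                                sigma_angle=0.05, sigma_speed=0.0)
```

The first guess of the assimilation run is then obtained independently, by perturbing the initial condition of the reference simulation.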
The software that you will operate (“Assim.f”, see Annex D) can work in 2 different
modes: the “MakeObs” mode, to perform the reference simulation and to produce the
observations, and the “Analyse” mode to run 4DVAR assimilation experiments. The mode
is the first parameter to set in the “Assim.conf” parameter file.

• In both modes, you have to specify the following parameters:
1. Dimensions ($\ell_1$, $\ell_2$, in meters) and masses ($m_1$, $m_2$, in kg) of the pendulum,
2. Acceleration due to gravity (in m/s²),
3. Time step (in seconds),
4. Length of integration (in time steps),
5. Double pendulum initial condition ($\theta_1$, $\theta_2$, $\dot{\theta}_1$ and $\dot{\theta}_2$, in deg or deg/s).
• In the “MakeObs” mode, you have to specify the following additional parameters:
1. Frequency of observation output (in number of time steps),
2. Standard deviation of the simulated observation error for angles,
3. Standard deviation of the simulated observation error for angular speeds (if set to zero,
the angular speeds are assumed unobserved).
• In the “Analyse” mode, you have to specify the following additional parameters:
1. Length of the assimilation cycles (in number of time steps),
2. Observations to assimilate (obtained from the “MakeObs” mode): value, type (1 for
$\theta_1$, 2 for $\theta_2$, 3 for $\dot{\theta}_1$, 4 for $\dot{\theta}_2$), error, time (in seconds).
You will find an example configuration file in Annex C.
The following exercises should lead you to examine the sensitivity of the system to the initial
conditions, the sensitivity of the assimilation to the system characteristics (non linearity, first
guess error, model error), and the sensitivity of the assimilation to the quantity and quality of
the observations (controllability of the system). In order to facilitate the implementation of the
exercises, a few hints are given in Annex F.
Exercise 1
Use the “MakeObs” mode of the program to generate two reference simulations: the first one
using initial conditions leading to quasi-periodic oscillations, and the second one using initial
conditions leading to a chaotic movement. In order to estimate the sensitivity of these
trajectories to the initial conditions, redo the simulations by adding a very small perturbation
to the initial conditions. Does the new trajectory diverge from the reference one? In case of
instability, can you evaluate (roughly) the error doubling time (i.e. the predictability time
scale)? What can you say about the validity of the numerical solution for such systems? Do
oceanic systems exhibit the same kind of behaviour? Under which conditions? Why is it much
more difficult to estimate the predictability time scale for the ocean?
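One possible way to obtain a rough error doubling time is sketched below, assuming that you have already computed the distance between the reference and the perturbed trajectories at each time step. The function name `doubling_time` and the choice of a least-squares fit of the logarithm of the separation are suggestions of this example, not part of the exercise software. The estimate is only meaningful over the period of roughly exponential growth, before the separation saturates at the size of the attractor.

```python
import math

def doubling_time(separations, dt):
    """Estimate the error doubling time from a series of positive separations.

    A straight line is fitted to log(separation) as a function of time;
    the doubling time is ln(2) divided by the fitted growth rate.
    Returns math.inf if the separation does not grow.
    """
    times = [k * dt for k in range(len(separations))]
    logs = [math.log(d) for d in separations]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
             / sum((t - t_mean) ** 2 for t in times))
    if slope <= 0.0:
        return math.inf
    return math.log(2.0) / slope

# Synthetic check: a separation that doubles every 0.5 s, sampled every 0.01 s.
synthetic = [1e-6 * 2.0 ** (k * 0.01 / 0.5) for k in range(100)]
estimate = doubling_time(synthetic, dt=0.01)
```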
Exercise 2
Use the observations (with small observation error, and high observation frequency),
generated by the reference simulation of Exercise 1 (the chaotic one), in order to control the
divergent simulation by the 4DVAR (the “Analyse” mode of the program). Use
assimilation cycles that are (i) short and (ii) long with respect to the predictability time scale.
Do you observe a different behaviour? Why? Now, increase the error on the first guess
(keeping short enough assimilation cycles). What is the effect on the solution? By controlling
the initial condition only, the 4DVAR scheme (as implemented here) assumes that there is no
model error in the system. Is the scheme robust to the presence of a small model error (slight
perturbations that you will add to the mass ratio or to the lengths of the pendulum)? Do you
think this would be important in ocean assimilation systems?
Exercise 3
Again on the same case study, you will now analyse the effect of decreasing the number and
the quality of the observations. Perform several simulations to test the effect of (i) increasing
the observation error (for this purpose, you need to regenerate observation data sets from the
reference simulation, using larger observation noise), (ii) reducing the observation frequency,
and (iii) suppressing the observations of the angular speeds. What do you observe? Can you
find situations for which the control of the system becomes ineffective? Can you infer
(rough) necessary conditions of controllability? Using these conditions, can you deduce a list
of ocean processes that would certainly be impossible to control using only altimetric data,
only ARGO data, or both? To what extent is it possible to compensate for observation error by
increasing the observation frequency? Why is this effect much more difficult to exploit in
practice for oceanic applications (e.g. with altimetry)?