Assimilation of non-linear observations using approximate
background error covariances. Part II: Radar data
assimilation into WRF ARW for short-term forecasting.
Chang-Hwan Park, Mark S. Kulie, and Ralf Bennartz
Atmospheric and Oceanic Sciences
University of Wisconsin – Madison
Abstract
Temporally and spatially highly resolving radar measurements are the only means to
continuously observe dynamically evolving meteorological phenomena such as severe
thunderstorms. Assimilation of radar data into mesoscale models might be a key factor to
improve precipitation forecasting especially at shorter time scales.
However, major obstacles for the assimilation of radar data lie in the strong non-linearity
of the observation operators and the intermittent nature of the precipitation processes.
These result in a severe violation of the assumption of Gaussian error characteristics in
the data assimilation schemes, which manifests itself in unrealistic background error
covariance matrices and in unstable solutions.
In this paper, we propose a new method to address this problem and to assimilate radar
observations into mesoscale models. The proposed solution includes two steps to obtain a
well-conditioned background error covariance matrix: A normalizing step and a rescaling
step. We also introduce a new weighting technique to avoid filter divergence, another
common issue especially in Ensemble Kalman Filter (EnKF) type assimilation schemes.
The proposed method is applied to simulate an intensive precipitation event near
Madison, WI in July 2006 using the Weather Research and Forecasting (WRF) Model in
an EnKF mode. The new method results in a significant improvement of short-range
forecasting skills for this severe weather event.
Keywords: non-Gaussianity, non-linearity, discontinuity, filter divergence, flow-dependent optimization, perturbation method, inverse method, WRF and EnKF
1) Introduction
Forecasting severe thunderstorms and convective systems plays an important role in
weather prediction. Severe thunderstorms can develop very quickly and can produce high
winds, hail, and flash flooding, which are potentially life-threatening and can cause
substantial property damage in a short time. According to the U.S. Natural Hazard
Statistics over 30 year average (1977-2006), “Flash floods/floods are the #1 cause of
deaths associated with thunderstorms.” (Sky warn review by Phil Hysell). Also, NOAA’s
annual compilations of flood loss statistics show that flood damage tends
to increase with time, possibly owing largely to urban spread. Fast and accurate nowcasting
and short-term forecasting systems, including the prediction of timing, location, and
intensity of the severe storm, are becoming more and more important in this regard.
Mesoscale models together with advanced methods to incorporate radar and other real-time observations have become an increasingly popular research avenue in this area.
Only radars allow for observing convective scale thunderstorms at high temporal and
spatial resolution (Sun 2005). Several studies indicate that in particular Ensemble Kalman
Filter (EnKF) approaches are able to successfully integrate radar and model data for
short-term prediction of mesoscale convective precipitation (Snyder and Zhang 2003;
Dowell, Zhang et al. 2004; Zhang, Snyder et al. 2004; Tong and Xue 2005; Xue, Jung et
al. 2007; Meng and Zhang 2008). The rapid evolution of mesoscale systems as well as
the non-linear relation between radar reflectivity and precipitation intensity makes the
assimilation of radar data into models a challenging problem, both mathematically and
physically.
Specifically, problems include the non-linearity of the observation operator (Evensen
2003), non-Gaussian background errors (Harlim and Hunt 2007), filter divergence
(Houtekamer and Mitchell 1998), as well as the flow-dependence of the background-error
covariances (Houtekamer and Mitchell 1998; Houtekamer and Mitchell 2001;
Houtekamer, Mitchell et al. 2005).
There are two interrelated key factors in EnKF approaches: Firstly, the ensemble
perturbation, which determines the initial spread of the ensemble and hence bears
significant importance in determining the background error covariance; secondly, the
actual ensemble update in which observations are used to steer the ensemble into a
direction consistent with the observations. Both are interrelated because the ensemble
spread crucially determines the weight given to observations in the update.
Various approaches to incorporate radar data into mesoscale models have been
reported. Lin et al. (1993) initialized a cloud-scale model using radar data, with a
thermodynamic retrieval method (Galchen and Kropfli 1984; Hane and Ray 1985)
providing pressure and potential temperature and with the model velocity retrieved from
the radial velocity. Snyder and Zhang (2003) demonstrated an EnKF for the
assimilation of single-Doppler radar observations in cloud-scale models by initially
assimilating radial velocity. The radial velocity observation operator is linear and based
directly on prognostic model variables (i.e. wind). The simulation of radar reflectivity is
more challenging than that of radial velocity, because the observation operator of radar
reflectivity is highly non-linear and has a non-Gaussian error probability density function
(PDF). Recent work (Tong and Xue 2005; Xue, Tong et al. 2006) employed radar
reflectivities within the EnKF framework in the log domain because of the severe non-linearity of radar reflectivity. However, Xue, Jung et al. (2007) indicate that the
observation sampling error of radar reflectivity in the log domain violates the Gaussian
assumption. One of the most serious problems in assimilating non-Gaussian variables is
the non-physical update due to the non-Gaussian PDFs in the EnKF. Several methods for
this problem are currently being investigated, e.g. localization (Anderson 2003; Ott,
Hunt et al. 2004; Chen and Snyder 2007) and regularization (Beezley and Mandel 2008;
Johns and Mandel 2008). Another problem in the forecast update using the EnKF is filter
divergence. The divergence problem can be mitigated by the double EnKF (Houtekamer and
Mitchell 1998), covariance inflation (Anderson 2001), multiplicative covariance
inflation (Hamill and Whitaker 2005; Houtekamer, Mitchell et al. 2005), covariance
relaxation (Zhang, Snyder et al. 2004), or weighting between static and ensemble
covariances in a hybrid ensemble transform Kalman filter (Wang, Barker et al. 2008).
In this study we explore a new method to incorporate non-linear observations with
non-Gaussian error covariances. This new method approximates the ill-defined
background error covariance matrix of the observations using correlative information
from other model variables. The approach is tested using radar observations of a
convective precipitation event observed in Madison, WI in July 2006.
In section 2) a synoptic and observational overview of this case study is provided,
together with a brief description of WRF and the model initiation for the convective case.
Section 3) briefly reviews the theory of the EnKF and introduces the new method,
the background error approximation. The forecast improvement obtained by the EnKF
with the new method is then validated in section 4).
2) The Madison Flood Case
a) Overview of Synoptic situation and Radar observations
The heavy convective rain event modeled in this study occurred in the southern tier of
the state of Wisconsin on 27 July 2006. As shown in Figure 1, the main synoptic feature
at 15 UTC on 27 July 2006 was a stationary front that stretched across southern
Wisconsin and served as the focal point for initial convective development.
Figure 1. Surface weather map for 15 UTC on 27 July 2006
National Weather Service Weather Surveillance Radar (WSR-88D) Level II data from
the Milwaukee/Sullivan, WI (KMKX) site were obtained from the National Climatic
Data Center data archive. Base reflectivity data from 1600 to 1700 UTC were utilized in
this study. Figure 2 illustrates the light clutter and clear air echoes that existed near the
radar site early in the analyzed time period. Quality control of the radar reflectivity
was performed manually by visual inspection. In this study, manual quality control is
sufficiently reliable, because the reflectivity used in the assimilation
cycles, which can be regarded as precipitation, appears far from the ground clutter, as
shown in Figure 2. Therefore, it is easy to remove non-precipitation features from the radar
observation.
Figure 2. Base radar reflectivity from the KMKX radar site at 1630 UTC on 27 July 2006. The
approximate location of the stationary front at 15 UTC is also indicated.
The synoptic environment was not tremendously dynamic. The stationary front moved
gradually between 1200 UTC and 0000 UTC. In addition, the vertical wind profiles from nearby
1200 UTC soundings indicated that very little vertical shear existed in the low to mid levels.
Surface winds were very light, and a slight wind direction shift was evident in the vicinity
of the boundary. Also, upper level winds were not excessive in the region, so severe
organized convection was not forecast by the National Weather Service, and WRF
does not simulate it either.
Nevertheless, the radar shows that convection initiated around 1630 UTC, as
shown in Figure 2, and developed very quickly into severe deep convection. For
example, by 1730, Doppler radar rainfall estimates exceeded 6 inches in isolated locales
throughout south-central Wisconsin, and severe urban flash flooding was reported in the
city of Madison. Additional intense rainfall formed after about 1900 UTC along a lake
breeze front that progressed steadily inland through the western suburbs of Milwaukee
(about 30 km from the shore of Lake Michigan), and further flash flooding was reported
from these convective cells at 2000 UTC. The deep convection over the flood regions of
south-central Wisconsin was driven by large convective available potential energy (CAPE).
According to the morning National Weather Service Forecast Discussion on 27 July 2006
(available from the National Climatic Data Center archives), mixed layer CAPE values
based on modified soundings were near 2000 J kg-1 in the region with minimal to no
convective inhibition. CAPE increases when an unsaturated lower layer contains a large amount of water
vapor. By the definition of potential instability (Saucier 1955), if an air column with a drier
upper layer and a moist lower layer is raised by stationary frontal lifting, the
instability of the atmospheric state will increase until the lower layer has become
entirely saturated. Even weak frontal lifting can therefore produce an unstable
atmospheric state and trigger severe convection: the dry upper air cools quickly at the
dry-adiabatic lapse rate, whereas the humid lower air cools slowly along the
saturation-adiabatic lapse rate. This synoptic and observational overview indicates that the
cause of the severe convection that generated the flood over south-central Wisconsin was
the high potential instability due to warm and moist conditions in the surface
layer. In order to improve the forecasting capability of WRF for this event, we need to
modify the potential instability in the WRF states using radar reflectivity and the EnKF.
b) The WRF-ARW model
The Weather Research and Forecasting Model (WRF) has been developed for
advanced mesoscale numerical weather prediction. In this study, WRF is used to
simulate a severe mesoscale convective system that produced heavy rain in south-central
Wisconsin and urban street flooding in Madison, WI.
Table 1. WPS & WRF Model setup for the Madison Flood case

WRF Preprocessing System (WPS)   Option           Description
Data                             GFS-ANL          NCEP high-resolution Global Forecast System (1-degree GFS)
Center point                     43.071N          Latitude
                                 89.37W           Longitude
Spatial resolution               7.12 km          dx, dy in the mother domain
                                 1.42 km          dx, dy in the nest domain
Domain size                      700 × 1582 km2   Mother domain
                                 142 × 285 km2    Nest domain

WRF                              Option           Description
Microphysics                     WSM6             Mother/nest domain
Cumulus parameterization         Grell-3          Mother domain
Time control                     18 hr            2006/07/27 06:00 ~ 07/28 00:00
History interval                 15 min           Corresponding to the radar observation
Vertical levels                  27               Number of vertical levels

Table 1 shows some of the important WRF model characteristics used in this study.
The microphysics option chosen is the WRF Single-Moment (WSM) 6-class graupel scheme.
The WSM scheme has been developed in 3-class (vapor, cloud ice/water, snow/rain
based on a simple freezing/melting process), 5-class (vapor, rain, snow, cloud ice, and cloud
water as five different arrays), and 6-class (similar to 5-class with graupel added) versions.
The WSM6 scheme is used in this study to compare the rain distribution between the radar
observation and the WRF simulation, because the six-class scheme is believed to show the
most proper behavior for cloud-resolving grids (Hong 2006). The mother domain
has a grid spacing of about 7 km. For the mother domain we use the Grell (G3)
cumulus convection scheme, which represents sub-grid
convective effects at 5~10 km grid resolutions. The nested domain has a grid spacing
of 1.4 km.
c) Ensemble Forecast using WRF
The timeline of the ensemble forecast is described in Figure 3. After the initial
perturbation of the potential temperature field for the 15 ensemble members, sufficient
spin-up time (10 hours in this study) is required to establish hydrological balance in the model. From 1600 to 1700
UTC, radar reflectivity is assimilated into WRF every 15 minutes. After assimilation,
the 15 ensemble members predict precipitation in the targeted forecasting cycles.
Figure 3. Forecast Timeline modified by application of EnKF (R: Radar data assimilation in WRF,
S: Spin-up time after the initial perturbation and after the initialization by observation assimilation,
F: Forecast after Radar data assimilation)
Table 2 shows the restart options used in this study as well as the time interval between
restarts. The restart option makes it possible to assimilate radar observations into the model. After the
first model spin-up, WRF is stopped and updated through the EnKF. Because the update is
performed only on the nested domain, the feedback and smooth options in WRF are
used to indirectly update the mother domain from the updated nested domain as well.
These options help to provide proper boundary conditions for the nested domain from the mother
domain.
Table 2. WRF Model setup for the radar data assimilation

WRF                    Option    Description
Restart                .true.,   Restart after imposing spin-up time & RDA
Restart_interval (1)   1600      10 hr spin-up time from 06:00 to 16:00
Restart_interval (2)   15        Every 15 min RDA from 16:00 to 17:00
Feedback & Smooth      1         Reflect the nested domain’s assimilation product also into the mother domain to reduce the difference between the mother and nested domains

We represent the forecast ensemble by 15 ensemble members generated from the potential
temperature field at 0600 UTC. The stability associated with the different ensemble
members was varied by adding random perturbations, drawn from a Gaussian distribution,
to the potential temperature field.
Figure 4. Sampling 15 ensemble members of the model variables by the initial perturbation of
potential temperature field.
As can be seen from Figure 4, the perturbation in this study changes linearly with
height, thus modifying the stability of the model domain. Positively perturbed members
are pushed toward greater potential instability by the increased surface temperature. In
contrast, negatively perturbed members have decreased potential instability.
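The perturbation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text specifies a Gaussian perturbation that varies linearly with height, but the amplitude (`sigma_surface`), the choice to taper the perturbation to zero at the model top, and the idealized background profile are hypothetical and not taken from the study.

```python
import numpy as np

def perturb_theta(theta, heights, n_members=15, sigma_surface=1.0, seed=0):
    """Return an (n_members, n_levels) ensemble of potential temperature profiles.

    theta         : 1-D potential temperature profile (K) on model levels
    heights       : 1-D level heights (m), increasing upward
    sigma_surface : std. dev. (K) of the surface perturbation (assumed value)
    """
    rng = np.random.default_rng(seed)
    # One Gaussian amplitude per ensemble member.
    amplitudes = rng.normal(0.0, sigma_surface, size=n_members)
    # Linear taper (assumption): full amplitude at the lowest level, zero at the top.
    taper = 1.0 - (heights - heights[0]) / (heights[-1] - heights[0])
    return theta[None, :] + amplitudes[:, None] * taper[None, :]

# Idealized background profile on 27 levels (the level count follows Table 1).
heights = np.linspace(0.0, 15000.0, 27)
theta = 300.0 + 0.004 * heights
ensemble = perturb_theta(theta, heights)
```

A member with a positive amplitude is warmer near the surface and unchanged aloft, so its lapse rate of potential temperature is reduced and the column is less stable; a negative amplitude has the opposite effect.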
Table 3. Experimental design of statistical components in EnKF implementation for radar
reflectivity assimilation

                                   Control                EXP.1             EXP.2           EXP.3
Radar observation                  without assimilation   Z surface         Z surface       Z surface
Model variables                    -                      θ, Qr             θ, Qr           θ, Qr
Perturbation of model state        -                      A(θ 3D, Qr 3D)    A(θ 3D)         A(θ 3D)
Perturbation of model simulation   -                      HA(Qr surface)    HA(θ surface)   HA(θ surface)
BG. and OB. error                  -                      Unweighted        Unweighted      Weighted

In Table 3, we present three experiments in addition to a control run. The control run is an
ensemble run without data assimilation (Control). In this case WRF could not simulate
deep convective precipitation in the forecasting cycles. Therefore, in the first experiment
(EXP.1) we performed data assimilation of radar reflectivity to initiate deep
convection in WRF, but without any approximation to the non-linear observation operator in
the Kalman gain. The second experiment (EXP.2) is the radar reflectivity assimilation
case using the approximate observation operator to resolve the non-Gaussian problem in the EnKF.
The third experiment (EXP.3) additionally applies the weighting of the background and
observation errors.
3) Methods
a) The Ensemble Kalman Filter
The general optimal estimation can be expressed as the following equation (Kalman
1960; Gelb 1974):
X  X  K  D  h( X ) 
(1)
Using the Jacobian matrix H the observation operator can be approximated linearly as:
 
h X  HX
(2)
leading to a fully linearized form of Equation (1):
X  X  K  D  HX 
(3)
The optimal filtering is implemented mainly in two parts, the Kalman gain K and the
observation increment D  h( X ) . The observation increment is the only component,
which includes absolute information of the observation and the model state. The Kalman
gain matrix K only holds information about the uncertainties and error covariances and
does not contain information about the current state of the observation or the model. Thus,
K determines the degree to which observational information gets filtered into the model.
The Kalman gain itself can be written as:
K   BH T  HBH T  R 
1
(4)
If the observation-error matrix R is small compared to the background error, the update
from the observation increment will become large through K. In other words, the model
state will be effectively improved by the observation. On the contrary, if the observation
error covariance is relatively large, the observations will be largely ignored. Large
uncertainties in the observation correspond to a high R. As a result, the diminished
Kalman gain will not update the model state from the observation data and the forecast
model will not be modified by the observation.
In the EnKF framework, the background error covariance matrix is calculated from
the sampling uncertainty of a forecast ensemble using a Monte-Carlo method (Evensen
1994):
$$B = \frac{1}{s-1}\,A A^{T} \tag{5}$$
where $A = X - \bar{X}$ are the ensemble perturbations and $s$ is the number of ensemble
members. The Kalman gain K in the EnKF can then be expressed as:
$$K = \frac{1}{s-1}\,A A^{T} H^{T}\left(\frac{1}{s-1}\,H A A^{T} H^{T} + R\right)^{-1} \tag{6}$$
To avoid the high computational cost of constructing the H matrix, an implementation
that does not require the linearized observation operator explicitly can be applied, and K can be rewritten as:
$$K = \frac{1}{s-1}\,A\,(HA)^{T}\left(\frac{1}{s-1}\,HA\,(HA)^{T} + R\right)^{-1} \tag{7}$$
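As an illustration of Eq. (7), the following NumPy sketch computes the gain directly from the state ensemble and the simulated observations of each member, without forming H. It assumes, as is common in EnKF practice but not stated explicitly in the text, that HA is obtained by applying the observation operator to every member and subtracting the ensemble mean of the result. The function name and array layout are hypothetical.

```python
import numpy as np

def enkf_gain(X, HX, R):
    """Kalman gain of Eq. (7).

    X  : (n, s) state ensemble, one member per column
    HX : (m, s) simulated observations h(X_i), one member per column
    R  : (m, m) observation error covariance
    """
    s = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)      # state perturbations
    HA = HX - HX.mean(axis=1, keepdims=True)   # perturbations in observation space
    BHt = A @ HA.T / (s - 1)                   # approximates B H^T
    HBHt = HA @ HA.T / (s - 1)                 # approximates H B H^T
    # (HBHt + R) is symmetric, so K = BHt (HBHt + R)^{-1} can be obtained by a solve.
    return np.linalg.solve(HBHt + R, BHt.T).T
```

For a linear operator, `enkf_gain(X, H @ X, R)` reproduces Eq. (6); for a non-linear operator the same call uses the simulated observations in place of `H @ X`.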
Data assimilation using an ensemble approach (Eq. (5)) often suffers from issues related
to ensemble sizes that are too small (Whitaker and Hamill 2002). If the number of ensemble
members is too small, the updated ensemble often becomes biased or unrepresentative, or
the ensemble spread no longer represents the true uncertainty in the model background
state. This issue will partly be addressed later on, when the relative magnitudes of the
background and observation error covariances are discussed. Next, we deal
with the approximation of HA under conditions with non-linear forward operators h(…).
b) Background error approximation
The matrix HA can be interpreted as the deviation of the ensemble members from the
ensemble mean state mapped into observation space. The crucial assumption underlying
the matrix HA is that the observation operator can be approximated by a linear function
around the mean state. This requires the deviations between the individual ensemble
members and the mean state to be small, an assumption, which is frequently violated
when dealing for example with intermittent phenomena such as clouds and precipitation.
Furthermore, when dealing with strongly non-linear observation operators, the
distribution of HA will most likely not be Gaussian, even if A is Gaussian. In addition, in
many instances the distribution of A itself will rather be bi-modal or skewed. For
example, in convective precipitation situations some ensemble members will produce precipitation and
some will not. As a consequence, A as well as HA will also be bi-modal, even if the
observation operator h(…) is not too strongly non-linear, as is the case for example with
radiative transfer models. In case of a strongly non-linear forward model, such as radar
reflectivity, the issue becomes even more relevant. The impact of a non-linear
observation operator is schematically depicted in Figure 5. The strongly non-linear
observation operator in Figure 5c will map the entire negative part of the model PDF
onto zero in observation space and create a highly skewed PDF in observation space.
This is a common problem with radar observations. The assimilation of intermittent
observations might also lead to rank-deficiencies in B, for example in cases where none
of the ensemble members produces precipitation.
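The effect of a thresholding operator on a Gaussian sample can be checked numerically. The sketch below is a toy example, not the radar reflectivity operator used in the study: the operator `max(x, 0)` is a hypothetical stand-in for any forward model that maps the negative part of the model PDF onto zero.

```python
import numpy as np

def sample_skewness(x):
    """Third standardized moment of a 1-D sample."""
    d = x - x.mean()
    return np.mean(d**3) / np.mean(d**2) ** 1.5

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=100_000)   # Gaussian model-state perturbations

y_linear = 2.0 * x                # linear operator: the distribution stays Gaussian
y_clipped = np.maximum(x, 0.0)    # toy thresholded operator (hypothetical)

skew_linear = sample_skewness(y_linear)
skew_clipped = sample_skewness(y_clipped)
frac_zero = np.mean(y_clipped == 0.0)    # fraction of the sample mapped onto zero
```

The linear case keeps a skewness close to zero, whereas the thresholded case piles roughly half of the sample onto zero and becomes strongly positively skewed, which is the situation sketched in Figure 5c.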
Figure 5. Schematic distributions of the simulated observation perturbations after
perturbation of the model states: a) linear observation operator forming a Gaussian distribution, b)
weakly non-linear observation operator forming a skewed Gaussian distribution, c) strongly non-linear observation operator forming a highly non-Gaussian distribution
The above considerations make it clear that a stable estimate of HA is crucial for the
successful assimilation of a large set of observations, especially regarding clouds and
precipitation. In particular, $HBH^{T} \approx \frac{1}{s-1}\,HA\,(HA)^{T}$ needs to be well conditioned.
The physical interpretation of the background error covariance matrix B is two-fold.
Firstly, B determines the error correlations of different components of the state vector.
Thus, B will determine how much a certain change in an observed variable will affect
variables in the state vector that are not observed. Secondly, the magnitude of $HBH^{T}$
compared to R determines the relative weight that is put onto the observations in the
update. This relative weighting between ensemble mean and observations depends mostly
on the initial ensemble spread as well as on the development of the ensemble and can
thus not be determined objectively, if the a-priori distribution of the ensemble is chosen
arbitrarily. The two purposes of B are independent of each other and can be decoupled
explicitly in a slightly revised formulation of the EnKF problem.
Assume an observable linked to the state of the model via a strongly non-linear
forward operator, such as precipitation rate. In this case HA represents the deviation of
the individual ensemble members from the ensemble mean. Since the distribution of
precipitation rate is typically highly skewed, the resulting distribution of HA will be
highly skewed and not well suited for assimilation purposes. While precipitation rate is
observed, the underlying process (e.g. ‘convection triggering precipitation’) might not be
observed directly. However, various other model variables will be correlated with this
process. For a typical convective precipitation situation, deviations in temperature in the
boundary layer might be positively correlated with deviations in precipitation.
In this paper, a new method is proposed to address this issue. The basic idea relies on the
aforementioned correlative model variables and can be put into very simple terms:
‘If the background error correlation in observation space cannot be used directly, it might
be approximated by another variable, which is correlated with the observations and has
better error characteristics.’
This method corresponds to a translation of the non-Gaussian PDF into a Gaussian
form, maintaining its original physical units. The process is schematically
explained in Figure 6. Two variables are shown: the observable variable y and the observation proxy Y.
The PDF of the observable variable y is non-Gaussian (blue
curve), whereas the observation proxy Y (red curve) has a Gaussian distribution, and the
positive part of its distribution is well correlated with the observable y (not shown).
Figure 6. New method in general form: Modulation of the Probability Density Function (PDF) between
different physical variables by a normalization process with rescaling factors
The proxy background error covariance in observation space can now be derived via a
rescaling. The ratio of the traces of the original background error covariance and the
proxy are used to renormalize the total variance of the approximated background error
covariance to match the physical units of R in the subsequent EnKF calculations. Note,
that the ratio of the traces of these matrices is only a scalar factor and other, more
sophisticated methods might be devised too.
Another issue related to ensemble assimilation is the triggering of clouds or precipitation
in cases where initially none of the ensemble members produces any cloud or precipitation.
In such cases HA collapses to identically zero and no assimilation can be performed.
Replacing HA with a proxy variable will yield a well-defined background error
covariance and will potentially allow clouds or precipitation to be initiated in the assimilation.
This is highlighted in Figure 6, middle panel. Assume the PDF of precipitation (black
curve) to be approximated by an ensemble whose members are all identically zero. Then the background
error covariance will collapse. Assume the red curve to be the distribution of potential
temperature in the boundary layer. Areas with high potential temperature will correspond
to high precipitation and areas with lower potential temperature to zero precipitation. By
replacing the black curve with the red curve, a better proxy of the true error covariance will be
obtained.
Using the variable $A_o$ to denote the proxy for HA, this leads to the following
approximation for HA in the Kalman gain:
$$HA \approx A_o\,t \tag{8}$$
The revised Kalman gain can now be expressed as:
$$K \approx \frac{1}{s-1}\,A A_o^{T}\,t\left(\frac{1}{s-1}\,A_o A_o^{T}\,t^{2} + R\right)^{-1} \tag{9}$$
The rescaling factor t allows us to translate HA into $A_o$ while maintaining the units of the
observation space. The next step is to find a suitable proxy for HA. This proxy can,
for example, be part of the ensemble itself as long as certain criteria are fulfilled. A
variable needs to be picked that is well correlated with the model simulation HA. In
the present study, we use potential temperature deviations in the boundary layer, which
are well correlated with the initiation of convective precipitation. HA and $A_o$ need to have
identical dimensions. The whole process described in Equation (8) can be thought of as
replacing H by a sparse matrix of dimensions $m \times n$ with one value t in each row and all
other values in that row set to zero. To maintain the correct dimension, only m columns
can contain an entry larger than zero. More complicated observation proxies are
conceivable, but are not pursued in this initial publication. The rescaling factor t is
recalculated in every assimilation cycle, so that the EnKF can effectively adjust to
changing model states.
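A minimal NumPy sketch of Eqs. (8) and (9) follows. The text states only that t is derived from the ratio of the traces of the original and proxy covariances; the specific form used here, $t^2 = \mathrm{tr}(HA\,(HA)^T) / \mathrm{tr}(A_o A_o^T)$, is an assumption, and the function names are hypothetical.

```python
import numpy as np

def rescaling_factor(HA, Ao):
    """Scalar t from the ratio of traces (assumed form, see lead-in).

    tr(M M^T) equals the sum of squared entries of M, so no m x m matrix is formed.
    """
    return np.sqrt(np.sum(HA**2) / np.sum(Ao**2))

def proxy_gain(A, Ao, t, R):
    """Kalman gain of Eq. (9), with HA replaced by the proxy Ao * t.

    A  : (n, s) state perturbations
    Ao : (m, s) proxy perturbations (same shape as HA)
    R  : (m, m) observation error covariance
    """
    s = A.shape[1]
    BHt = A @ Ao.T * t / (s - 1)
    HBHt = Ao @ Ao.T * t**2 / (s - 1)
    # (HBHt + R) is symmetric, so a solve gives BHt (HBHt + R)^{-1}.
    return np.linalg.solve(HBHt + R, BHt.T).T
```

In the special case where HA happens to be an exact multiple of the proxy, this gain coincides with Eq. (7); in general the two differ, which is the point of the approximation.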
The Kalman gain can be slightly reformulated to gain insight into the data assimilation
process. When we extract the background error covariance matrix from the background
error influence term by inserting the identity $HBH^{T}\,(HBH^{T})^{-1}$ between the two terms in
brackets in Eq. (4), and write $B_o = \frac{1}{s-1}\,A_o A_o^{T}$ for the proxy background error
covariance, we can rewrite the Kalman gain as:
$$K \approx \underbrace{A A_o^{T}\left(A_o A_o^{T}\right)^{-1}}_{\text{Correlation Information between BG and OB}}\;\underbrace{\left(I + R\,B_o^{-1}\,t^{-2}\right)^{-1}}_{\text{Error Information for the OB increment}}\;\underbrace{t^{-1}}_{\text{Rescaling factor}} \tag{10}$$
This reformulation allows us clearly to see the correlation information (CI) between the
background state and the observation and the error information (EI) for the observation
increment. The possible situations in the data assimilation study are presented in Table 4.
Table 4. The influence of the background and the observation errors on the observation assimilation

Case   Error norms       Error information            Error-reflected observation   Observation update
                         N = I + R Bo^-1 t^-2         increment N^-1 (D - HX)       K (D - HX)
C.1    R << Bo t^2       ≈ I                          ≈ (D - HX)                    Large-update
C.2    R = Bo t^2        2 I                          (D - HX) / 2                  Intermediate-update
C.3    R > Bo t^2        Large                        Moderate                      Moderate-update
C.4    R >> Bo t^2       Extremely large              ≈ 0                           Non-update

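The regimes in Table 4 can be illustrated with the scalar version of the error information term, $N^{-1} = 1 / (1 + R / (B_o t^2))$. The numerical values of R chosen below for each case are hypothetical and serve only to show the trend.

```python
def error_reflected_weight(r, b_o, t):
    """Scalar N^{-1} = 1 / (1 + r / (b_o * t**2)): fraction of the increment retained."""
    return 1.0 / (1.0 + r / (b_o * t**2))

b_o, t = 1.0, 1.0  # proxy background variance and rescaling factor (illustrative)
weights = {
    "C.1": error_reflected_weight(1e-3, b_o, t),  # R much smaller than Bo t^2
    "C.2": error_reflected_weight(1.0, b_o, t),   # R equal to Bo t^2
    "C.3": error_reflected_weight(10.0, b_o, t),  # R larger than Bo t^2
    "C.4": error_reflected_weight(1e4, b_o, t),   # R much larger than Bo t^2
}
```

The retained fraction drops from nearly the full increment in C.1, to one half in C.2, to a small fraction in C.3, and to essentially nothing in C.4.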
If the observation error is much smaller than the background error (Table 4, C.1), the
EnKF actively transfers the observation increment to the model states. On the contrary, if
the Kalman gain K includes an extremely small background error (Table 4, C.4), the
EnKF transfers nearly no information from the observation increment, mistakenly
interpreting the current model state as highly trustworthy because of its small
background error. The C.4 case is the most common situation in data assimilation
studies, for several reasons. First, the ensemble members produced by the initial model
perturbations usually have only a small spread because of limitations of the model
physics. Therefore, the physically acceptable background errors tend to be much
smaller than the observation errors. Second, the background error suffers from
systematic underestimation in the sequential update cycles, which is known as the filter
divergence problem. Finally, because of the different temporal evolution of the errors, the magnitude
of the background error can be underestimated compared to the observation error in the
data assimilation. This problem is schematically illustrated in Figure 7.
Figure 7. Background error underestimation problem a) absolute error simulation, b) relative error
simulation (interval indication in blue curves - BG error in the convective initiation stage, in red
curves - OB error in developing stage)
Due to the non-linear relationship in the perturbation process, the interval between the
blue lines, HA, in Figure 7 grows non-linearly with time, even though the
perturbation A maintains a constant interval over all cycles. The observation
error R is also not constant; it is a relative error that is proportional to the variation
of the true state. Therefore, the constant absolute error shown in Figure 7a is an unrealistic error
model, even though it does not suffer from the underestimation of the BG error. On
the other hand, with realistic error modeling (Figure 7b), the underestimation of the BG
error is an inevitable problem in the data assimilation. For example, in the Madison flood
case study, the model simulation produced the initial precipitation
approximately 1 hour later than the KMKX radar measurement. This time difference
causes the Kalman filter to consistently ignore the well quality-controlled and highly
resolved radar observations during the assimilation cycles. Consequently, the radar data
assimilation by the EnKF could not improve the forecast results. Therefore, the next step of
the new method is a weighting technique that accounts for the different temporal evolution of the
error sources in the data assimilation.
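Weighting of the background error covariance matrix:
$$BH^{T} \rightarrow (2-f)\,BH^{T} \approx \frac{1}{s-1}\left(\sqrt{2-f}\,A\right)\left(\sqrt{2-f}\,A_o\,t\right)^{T}$$
$$HBH^{T} \rightarrow (2-f)\,HBH^{T} \approx \frac{1}{s-1}\left(\sqrt{2-f}\,A_o\,t\right)\left(\sqrt{2-f}\,A_o\,t\right)^{T} \tag{11}$$
Weighting of the observation error covariance matrix:
$$R \rightarrow f\,R \tag{12}$$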
The weighting factor f linearly modulates the magnitude of the observation and background
errors (see Eqs. (11) and (12)). In the assimilation cycles (see Figure 8), the observation error
is at a more advanced stage of its evolution than the background model error in this study.
Figure 8. Weighting factor for the OB error and the underestimated BG error to improve the update
process in the data assimilation cycles
Therefore, f in Figure 8 is chosen to be less than 1 to reduce the difference between the OB and
BG errors. This means that the weighting technique can change case C.4 in Table 4
into C.3, allowing an effective update through multiple assimilation cycles without the
abrupt model imbalance that can occur in C.1 and C.2. In the opposite case (the model
simulation develops earlier than the observation), f is simply chosen to be larger than 1
to increase the difference between the BG and OB errors. The weighting term can be
simplified in the EnKF formulation as follows:
$$
K = \underbrace{A A_o^{T}\left(A_o A_o^{T}\right)^{-1}}_{\text{Correlation Information}}\;
\underbrace{\left[\,I + \frac{f}{2-f}\,R\,B_o^{-1}\,t^{-2}\right]^{-1} t^{-1}}_{\text{Renormalized Error Information}} \tag{1}
$$
This expression tells us that the introduced weighting technique adjusts only the relative
error ratio ($R B_o^{-1} t^{-2}$) in the error normalization term, while conserving the
magnitudes of the ensemble perturbations A and HA ($A_o t$ in the new method) in the
observation influence term. This means that the modification only renormalizes the
observation increment. The method thereby addresses the inactive update problems caused by
an insufficiently perturbed ensemble spread, by filter divergence, and by the BG error
underestimation that results from the time lag of the BG error evolution in the data
assimilation.
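The equivalence between the weighted covariances of Eq.(11) and (12) and the factored gain can be checked numerically. The following is a minimal sketch, assuming random toy ensembles and a diagonal R; the sizes, the values of f and t, and all variable names are illustrative assumptions and are not taken from the experiments in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 6, 4, 50          # state size, observation size, ensemble size (toy values)
f, t = 0.1, 2.5             # weighting factor and rescaling factor (toy values)

def perturbations(rows, cols):
    """Random ensemble perturbations with zero ensemble mean."""
    a = rng.standard_normal((rows, cols))
    return a - a.mean(axis=1, keepdims=True)

A = perturbations(n, s)                  # model-space perturbations
Ao = perturbations(m, s)                 # proxy for HA, so that HA is replaced by Ao * t
R = np.diag(rng.uniform(0.5, 2.0, m))    # observation error covariance (assumed diagonal)

# Direct form: weighted covariances as in Eq.(11) and (12)
BHt = (2 - f) * A @ (Ao * t).T / (s - 1)
HBHt = (2 - f) * (Ao * t) @ (Ao * t).T / (s - 1)
K_direct = BHt @ np.linalg.inv(HBHt + f * R)

# Factored form: correlation information times renormalized error information
Bo = Ao @ Ao.T / (s - 1)
correlation = A @ Ao.T @ np.linalg.inv(Ao @ Ao.T)
error_term = np.linalg.inv(np.eye(m) + f / (2 - f) * R @ np.linalg.inv(Bo) / t**2) / t
K_factored = correlation @ error_term

print(np.allclose(K_direct, K_factored))
```

In this toy setup the ensemble size exceeds the number of observations, so that $A_o A_o^{T}$ is invertible; with fewer members than observations the factored form would need a pseudo-inverse or the low-rank formulation.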
Using the Sherman–Morrison–Woodbury formula to reduce the computational
expense when dealing with a large number of data points (Mandel 2006), the final
Kalman gain in EnKF with the weighting factor and the rescaling factor is derived as
$$
\dot{K} = K\,t = \alpha\,A A_o^{T} R^{-1}\left[\,I - \alpha\,A_o\left(I + \alpha\,A_o^{T} R^{-1} A_o\right)^{-1} A_o^{T} R^{-1}\right] \tag{2}
$$
where
$$
\alpha = \frac{2-f}{f \cdot t \cdot (s-1)}
$$
This newly expressed Kalman gain is non-dimensional. Also, all factors in it are
incorporated into just one constant, α.
After applying the new method, the full expression of the approximate EnKF for the
highly non-linear case is
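The computational benefit of the Sherman–Morrison–Woodbury form can be illustrated with a generic low-rank inverse. This is a minimal sketch, assuming a diagonal R and a hypothetical positive scalar `beta` that stands in for the combined weighting; it demonstrates the matrix identity only and does not reproduce the constant defined in this paper.

```python
import numpy as np

rng = np.random.default_rng(1)
m, s = 200, 20                        # many observations, few ensemble members (toy values)
beta = 0.3                            # hypothetical positive scalar weight
Ao = rng.standard_normal((m, s))
Ao -= Ao.mean(axis=1, keepdims=True)  # zero-mean perturbations in observation space
r_diag = rng.uniform(0.5, 2.0, m)     # diagonal of R (assumed diagonal here)

# Direct m x m inverse of (R + beta * Ao Ao^T)
direct = np.linalg.inv(np.diag(r_diag) + beta * Ao @ Ao.T)

# Sherman-Morrison-Woodbury: only an s x s system has to be solved
Rinv_Ao = Ao / r_diag[:, None]                  # R^{-1} Ao
small = np.eye(s) + beta * Ao.T @ Rinv_Ao       # I + beta Ao^T R^{-1} Ao
woodbury = np.diag(1.0 / r_diag) - beta * Rinv_Ao @ np.linalg.solve(small, Rinv_Ao.T)

print(np.allclose(direct, woodbury))
```

The direct route inverts an m × m matrix, whereas the second route solves an s × s system, which is the smaller problem whenever the number of observations exceeds the ensemble size.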
$$
\hat{X} = X + \dot{K}\,(D - HX)\,/\,t \tag{3}
$$
4) Forecast Results and Analysis
In order to validate the new method, three assimilation experiments are compared with a
control case: Control (without assimilation), EXP.1 (with assimilation using the
non-approximate HA), EXP.2 (with assimilation using the approximate HA) and EXP.3 (with
assimilation using the approximate HA and weighted errors).
a) Filter Stabilization and Forecast Correlation by the HA approximation
The HA approximation addresses two important considerations: a Gaussian form of
HA for stable recursive filter assimilation, and proper forecast correlation
information for the forecast improvement.
Figure 9. Filter Stabilization test: EXP.1 (without HA approximation): a) Original HA (radar
reflectivity, Z), b) Non-Gaussian spread update at 1630UTC (potential temperature, Kelvin),
c) Non-Gaussian spread update at 1645UTC; EXP.2 (with HA approximation): a) Approximate HA,
b) Gaussian spread update at 1630UTC, c) Gaussian spread update at 1645UTC
Firstly, the EnKF with the non-linear observation operator suffers from an unstable
filter update process. For instance, the non-Gaussian HA (Figure 9-a in EXP.1), caused by the
non-linearity between HA and A, creates a skewed spread update at 1630 UTC (Figure 9-b in
EXP.1). Through the recursive filter process, the spread update term at 1645 UTC produces
an unrealistically large θ update of over 100 Kelvin (□- profile in Figure 9-c), which does
not allow further model runs. Also, K loses consistent control of the filter, showing different
spread update patterns at 1630 and 1645 UTC (compare the □- profiles in Figure 9-b and
c). On the other hand, in EXP.2 in Figure 9, the data assimilation by
EnKF with the approximate HA consistently maintains the Gaussian distribution at
1630 UTC and 1645 UTC; during the recursive filter updates the data assimilation
continues without any filter instability.
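The loss of Gaussianity in HA can be reproduced with a toy operator. This is a minimal sketch, assuming a hypothetical threshold-type operator (zero response below a threshold, quadratic above it) in place of the actual reflectivity operator, and a rescaled copy of A as the linear proxy; neither is taken from the paper's implementation.

```python
import numpy as np

def skewness(x):
    """Sample skewness: the third standardized moment."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3))

rng = np.random.default_rng(2)
A = rng.standard_normal(10_000)      # Gaussian perturbations in model space

# Hypothetical threshold-type operator: no response below zero, quadratic above
HA_nonlinear = np.maximum(A, 0.0) ** 2
# Hypothetical linear proxy: a rescaled copy of the model-space perturbations
HA_proxy = 2.5 * A

print(f"skewness of nonlinear HA: {skewness(HA_nonlinear):.2f}")
print(f"skewness of linear proxy: {skewness(HA_proxy):.2f}")
```

The thresholded response is strongly skewed even though its input is Gaussian, while the linear proxy stays symmetric, which is the property the approximate HA is meant to preserve.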
Figure 10. Forecast Correlation test: a) Positive innovation based on higher observation and lower
current model state, b) Inverted innovation controlled by the correlation and error information of K,
c) Improved forecast by the input innovation (EXP.1: blue – no approximation with f=1.999;
EXP.3: red – approximation with f=0.1)
The second problem is the inappropriate forecast correlation between HA and A. For
example, if there is no approximation for HA, K naturally adopts the instant negative
correlation: a positive ensemble member of HA(Qr) originates from instant
condensation in a negatively perturbed ensemble member of A(θ), and vice versa.
Through this negative HA-A correlation, the positive input innovation (D-HX) (orange
curve of Figure 10-a in the 1hr assimilation cycles) inversely induces a θ decrease in all
ensemble members of A(θ) (blue curves in the assimilation cycles of Figure 10-b).
Consequently, this cooling effect produces the intensified bubbles in the early assimilation
cycles (blue curves in Figure 10-c). As discussed in the previous synoptic overview,
however, the cause of the Madison flood is the potential instability. This means that a
positively perturbed ensemble member in the current A(θ) (high potential instability) will
create a positive ensemble member in the future HA(Qr) (high convective
precipitation). Therefore, in this study the negative instant correlation (condensation
tendency) of the original K is replaced with the positive forecast correlation (convective
tendency) by selecting an HA proxy that is positively correlated with A. Based on this
HA-A relationship, the positive input innovation (D-HX) is assimilated to increase θ in the
current model state (red curves after spin-up in Figure 10-b) and to produce a large
improvement in the Qr forecast (red curves after spin-up in Figure 10-c).
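The role of the correlation sign can be seen in a scalar update. This is a minimal sketch, assuming a single observation, toy perturbations, and a hypothetical helper `theta_increment`; it only shows that the sign of the θ increment follows the sign of the HA-A covariance for a given positive innovation.

```python
import numpy as np

rng = np.random.default_rng(3)
s = 40                                # ensemble size (toy value)
theta = rng.standard_normal(s)
theta -= theta.mean()                 # A(theta): perturbations of potential temperature

def theta_increment(HA, innovation, r=1.0):
    """Scalar EnKF-style mean update of theta from one observation."""
    cov = theta @ HA / (s - 1)        # cov(A, HA)
    var = HA @ HA / (s - 1)           # var(HA)
    return cov / (var + r) * innovation

noise = 0.1 * rng.standard_normal(s)
noise -= noise.mean()
HA_instant = -theta + noise           # negative instant correlation (condensation tendency)
HA_forecast = theta + noise           # positive forecast correlation (convective tendency)

innovation = 5.0                      # observation exceeds the model state
print(theta_increment(HA_instant, innovation))
print(theta_increment(HA_forecast, innovation))
```

With the negatively correlated HA the positive innovation cools the model state, and with the positively correlated proxy the same innovation warms it.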
b) Renormalization of the error information by the Weighting technique
Another important feature of the new method is the error renormalization by the
weighting technique. The results of the weighting experiments are examined in three
temporal stages.
Figure 11. Error weighting test: a) the model state updated by the inverted innovation, b) the mean
reflectivity intensity improvements for various weighting factors, c) the storm coverage
distribution improvement with a threshold value of 10 dBZ for meaningful storm-scale
precipitation, excluding the unrealistic bubble feature.
The contribution of the radar observation to the data assimilation appears as a
convective initiation in the potential temperature field (Figure 11-a). From 1600 UTC to
1700 UTC, the observation innovations renormalized by the various weighting factors
produce different θ increases at the surface layer. The improvement
of the forecasting performance is evaluated in terms of the storm intensity and the storm
coverage in Figure 11-b and c. In the forecasting cycles, the intensity and the
coverage of the thunderstorm are closest to the radar measurement when f is equal to
0.1, which heightens the BG error and reduces the OB error. This simple choice of the
weighting factor provides an effective means of modulating the forecast.
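The effect of f on the update strength can be seen in the scalar case, where the gain reduces to the weighted BG variance divided by the sum of the weighted BG and OB variances. This is a worked sketch with assumed toy variances (b = 1 for the BG error, r = 4 for the OB error); the function name and values are illustrative only.

```python
def scalar_gain(f, b=1.0, r=4.0):
    """Scalar Kalman gain with BG variance scaled by (2 - f) and OB variance by f."""
    return (2.0 - f) * b / ((2.0 - f) * b + f * r)

for f in (1.0, 0.5, 0.1):
    print(f"f = {f:3.1f}  gain = {scalar_gain(f):.3f}")
```

With these toy values the gain grows from 0.200 at f = 1.0 to 0.826 at f = 0.1, so the same innovation produces a much larger update when the BG error is heightened and the OB error is reduced.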
Moreover, the spin-up cycles, when the thermal energy is converted into convective
dynamic energy, might be a practically important period for monitoring
whether the newly initialized model state will develop into severe thunderstorms or not. Also,
at the end of the spin-up the weighting experiments create a tipping point in b)
that corresponds to the moment at which the coverage score in c) first appears for the
meaningful precipitation regions. These mathematically simple patterns, which appear before
the main event, could be indexed to support an earlier weather hazard warning system.
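The intensity and coverage scores discussed above can be sketched as simple threshold statistics. This is a minimal sketch, assuming that coverage is the fraction of grid points at or above 10 dBZ and that intensity is the mean reflectivity over those points (averaged in dBZ for simplicity); the function names and the toy field are hypothetical, and the exact score definitions used for Figure 11 may differ.

```python
import numpy as np

def storm_coverage(dbz, threshold=10.0):
    """Fraction of grid points whose reflectivity reaches the threshold (dBZ)."""
    dbz = np.asarray(dbz, dtype=float)
    return float(np.mean(dbz >= threshold))

def mean_intensity(dbz, threshold=10.0):
    """Mean reflectivity (dBZ) over the points that reach the threshold; NaN if none."""
    dbz = np.asarray(dbz, dtype=float)
    mask = dbz >= threshold
    return float(dbz[mask].mean()) if mask.any() else float("nan")

# Toy 3 x 3 reflectivity field in dBZ
field = np.array([[0.0,  5.0, 12.0],
                  [8.0, 35.0, 40.0],
                  [2.0, 15.0,  9.0]])
print(storm_coverage(field), mean_intensity(field))
```

Tracking such scores cycle by cycle gives time series in which the first non-zero coverage value marks the onset of meaningful precipitation.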
Figure 12. Observation assimilation effect in the forecast system (Control vs. EXP.3): WRF (control
without the radar assimilation), ENKF(EXP.3 with the radar assimilation based on the new method),
RADAR (base radar reflectivity observation); 16:30UTC (assimilation), 17:30UTC (spin-up),
19:00UTC & 21:00UTC (forecasts)
After applying the HA approximation and the weighting technique introduced in the new
method, we can examine how the forecasting capability is improved by the radar
reflectivity assimilation. The ENKF panel (EXP.3 in Figure 12) shows the radar data
assimilation at 16:30 UTC. In the Madison flood case study, 1600~1700 UTC is the most
effective period for the quality control of the radar reflectivity. Because the precipitation in
these cycles is still distributed far from the center of the radar, where the ground clutter
of the radar reflectivity is located, non-precipitation features can be easily removed from
the precipitation distribution. Also, if precipitation does not appear in the model
simulation, the link between the observation and the model state (rescaling factor t) is
absent in the EnKF implementation. Therefore, the radar reflectivity observation starts
being assimilated into WRF at 1600 UTC.
After assimilating the radar observation, a spin-up time (1700~1730 UTC in this study)
is needed for hydrological balance because of the disparity between the present background
features and the newly introduced observational features in the model states. This process
is well displayed in the 1730 UTC scene of ENKF (EXP.3) in Figure 12. Compared to the control
run, weak convective precipitation appears on the right side of the middle part of the domain.
In contrast, the precipitation that was simulated in the control run tends to be
suppressed in EXP.3 on the right side of the lower part of the domain. Even though EXP.3
still does not produce as much precipitation as the radar observation displays, the model
conditions at this stage appear to be in transition from the background
state toward the observed state.
The main difference between the assimilation (ENKF) and non-assimilation (WRF) cases
manifests at 1900 UTC. At this stage, the control run simulates only weak frontal
precipitation with an unrealistic bubble feature (WRF in Figure 12). In contrast, instead of
bubbles, the ENKF shows well-organized deep convection developing with
high simulated radar reflectivity. Also, the orientation of this convection is very similar to the
distribution pattern of the radar reflectivity measurement at 2100 UTC. The fully developed
supercell storm forecast in ENKF displays an outflow boundary feature, which expands
rapidly due to the strong convective updraft. During 1900~2200 UTC, the model domain
includes the whole convective cell. Therefore, the forecast results in these cycles
match the real radar observation well. However, after 2200 UTC the forecast results do
not correspond well quantitatively to the radar observation, because the
convective cell disappears as it crosses the boundary of the model domain.
5) Summary
In this study we find that, in order to use a highly non-linear observation operator
and intermittent physical properties in data assimilation, the following considerations
are essential for the background error covariance matrix: an initial perturbation that is
associated with the future targeted observation, an appropriate forecast correlation in the
Kalman gain, the maintenance of a Gaussian distribution during sequential updates, and the
error renormalization for the underestimated background error. All these considerations aim
to design a well-conditioned background error covariance matrix. For instance, the
perturbation method needs to produce initial ensemble members in the model space
that clearly show a correlation with what we want to forecast. The observational and
synoptic overviews show that the potential instability is the main cause of the
convective precipitation leading to the Madison flood in July 2006. Therefore, the initial
ensemble members need to be spread randomly over various potential
instabilities to provide statistical information for the data assimilation. Specifically, we
perturbed the potential temperature with a perturbation varying linearly with height to
impose various potential instabilities. However, the ensemble perturbation in the observation
space (HA) forms a non-Gaussian distribution due to its non-linear physical response to the
initial ensemble perturbation in the model space (A). This non-Gaussian distribution leads to
unstable filter behavior during the assimilation cycles. The background error covariance
matrix that is approximated with the HA proxy satisfies the basic Gaussian assumption of
EnKF. Also, from the physical point of view, the correlation in the background error covariance
matrix is conditioned as the forecast correlation to steer the forecast
system consistently toward a desirable state. After the filter is endowed with stability, by
approximating HA to have a Gaussian distribution, and with controllability, by imposing the
forecast correlation between HA and A, the proposed weighting technique boosts the
update process by resolving the underestimation problem of the BG error.
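The initial perturbation of potential temperature varying linearly with height can be sketched as follows. This is a minimal sketch under stated assumptions: the height grid, the mean profile, the perturbation spread, and the choice of a perturbation that is largest at the surface and decays linearly to zero at the model top are all hypothetical and are not taken from the WRF configuration used in this study.

```python
import numpy as np

rng = np.random.default_rng(4)
s = 30                                      # ensemble size (toy value)
z = np.linspace(0.0, 10_000.0, 21)          # model level heights in metres (toy grid)
theta_mean = 300.0 + 0.004 * z              # hypothetical mean theta profile (K)

# One random surface amplitude per member, decaying linearly to zero at the top
surface_amp = rng.normal(0.0, 1.0, size=s)  # K, hypothetical spread
surface_amp -= surface_amp.mean()           # keep the ensemble mean unchanged
shape = 1.0 - z / z[-1]                     # 1 at the surface, 0 at the top
A_theta = np.outer(shape, surface_amp)      # perturbations, levels x members
ensemble = theta_mean[:, None] + A_theta

# Members warmed near the surface have a weaker d(theta)/dz, i.e. are less stable
lapse = (ensemble[-1] - ensemble[0]) / z[-1]
print(lapse.min(), lapse.max())
```

Each member then carries a different vertical θ gradient, so the ensemble samples a range of stabilities while the ensemble mean profile is left unchanged.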
The results in Figure 12 show that the convective and suppressive patterns in
“ENKF” originate from the positive input innovation, which is the difference between
“WRF” and “RADAR”, and from their uncertainty information, which determines where the
forecast state lies between the “WRF” and “RADAR” states. Then, through the correlation
information in K, the present model state (the potential instability in this study) is updated
from the error-weighted innovation. We presented the following experiments systematically
to solve each problem.
Table 5: Summary of the problems, solutions and results for each experiment

| | Control | EXP.1 | EXP.2 | EXP.3 |
|---|---|---|---|---|
| HA in K | - | Original | Approximate | Approximate |
| HA–A relationship | - | Instant correlation | Forecast correlation | Forecast correlation |
| HA response to A perturbation | - | Non-linear | Linear | Linear |
| HA PDF | - | Non-Gaussian | Gaussian | Gaussian |
| Filter behavior | - | Unstable | Stable | Stable |
| B in K | - | Ill-conditioned | Well-conditioned | Well-conditioned |
| Weighting factor | - | f = 1.0 | f = 1.0 | f = 0.1 |
| B in N | - | Physically unrealistic | Underestimated initial BGE | Enhanced initial BGE |
| R in N | - | Original scale | Large matured OBE | Diminished matured OBE |
| Improvement | - | Problematic model update | Insufficient update | Significant update |
| Forecast | No convection | Intensified bubble feature | Shallow convection | Severe deep convection |
Owing to the new method, we can fully explore the real case of the
Madison flood forecast in July 2006. The results of the experiments in this study show
that the WRF system improved by EnKF predicts a greater instability for the overall
ensemble mean, even though the initial random perturbations, before assimilation, were
spread statistically equally between stable and unstable situations. They also show that
even though the initialization by the radar observation in the assimilation cycles was
performed for only one hour, the WRF simulation over the 4~5 hr forecast cycles changed
tremendously. The results of the new method emphasize how important a proper model
initialization by highly resolved observation data during assimilation is in the forecasting
system. Furthermore, the mathematically derived solution presents meaningful
patterns that could potentially provide an early, decisive forecast of whether the weather
situation will develop to the natural hazard scale or not.
In summary, for a fast but accurate forecast of severe convective precipitation, the
radar observation and EnKF are used together in this study, which is quite a challenge
because of the “non-linearity” of the observation operator and the
“Gaussian assumption” of EnKF. The new method proposed in this paper does not smooth or
generalize the nonlinear nature of the observation. It approximates the
background error covariance matrix to linearize the Kalman gain (relative information) and
renormalizes the observation increment (absolute information), conserving the non-linearity
and/or discontinuity in the observation increment. This means that the new method aims for a
linear response of the ensemble spread among the various model variables and a non-linear
response of the ensemble mean in the model space to the observation increment variance.
The EnKF with the well-conditioned background error covariance matrix remarkably
improves the short-term forecast performance for the mesoscale convective precipitation.
This result provides a good initial forecast for the Madison flood case of July 27, 2006.
Appendix A.
| Symbol | Description | Type | Dimension |
|---|---|---|---|
| $\hat{X}$ | Updated ensemble | Matrix | n × s |
| $X$ | Initial ensemble | Matrix | n × s |
| $D$ | Observations | Matrix | m × s |
| $A$ | Ensemble perturbations | Matrix | n × s |
| $h(\ldots)$ | Observation operator | Operator | (n × s) → (m × s) |
| $\overline{(\ldots)}$ | Ensemble averaging operator | Operator | (k × s) → (k × s) |
| $\mathrm{Tr}(\ldots)$ | Trace operator | Operator | (k × k) → 1 |
| $H$ | Linearized observation operator (Jacobian matrix) | Matrix | m × n |
| $B$ | Background error covariance | Matrix | n × n |
| $HA$ | Ensemble perturbations in observation space | Matrix | m × s |
| $Y$ | Observation vector | Vector | m |
| $A_o$ | Proxy for HA | Matrix | m × s |
| $y$ | Proxy for observation vector | Vector | m |
| $B_o$ | Background error covariance in observation space based on proxy | Matrix | m × m |
| $R$ | Observation error covariance | Matrix | m × m |
| $K$ | Kalman gain | Matrix | n × m |
| $\dot{K}$ | Final modified Kalman gain | Matrix | n × m |
| $n$ | Number of elements in state vector | Scalar | 1 |
| $m$ | Number of elements in observation vector | Scalar | 1 |
| $k$ | Unspecified dimension | Scalar | 1 |
| $s$ | Number of ensemble members | Scalar | 1 |
| $f$ | Weighting factor | Scalar | 1 |
| $\alpha$ | Renormalization factor combining s, t, and f | Scalar | 1 |
| $t$ | Rescaling factor | Scalar | 1 |
| $I$ | Identity matrix | Matrix | Square |
Acknowledgements
References
Anderson, J. L. (2001). "An ensemble adjustment Kalman filter for data assimilation."
Monthly Weather Review 129(12): 2884-2903.
Anderson, J. L. (2003). "A local least squares framework for ensemble filtering."
Monthly Weather Review 131(4): 634-642.
Beezley, J. D. and J. Mandel (2008). "Morphing ensemble Kalman filters." Tellus Series
A: Dynamic Meteorology and Oceanography 60(1): 131-140.
Chen, Y. and C. Snyder (2007). "Assimilating vortex position with an ensemble Kalman
filter." Monthly Weather Review 135: 1825-1845.
Dowell, D. C., F. Q. Zhang, et al. (2004). "Wind and temperature retrievals in the 17 May
1981 Arcadia, Oklahoma, supercell: Ensemble Kalman filter experiments." Monthly
Weather Review 132(8): 1982-2005.
Evensen, G. (1994). "Sequential data assimilation with a nonlinear quasi-geostrophic
model using Monte-Carlo methods to forecast error statistics." Journal of Geophysical
Research-Oceans 99(C5): 10143-10162.
Evensen, G. (2003). "The ensemble Kalman filter: Theoretical formulation and practical
implementation." Ocean Dyn. 53: 343–367.
Galchen, T. and R. A. Kropfli (1984). "Buoyancy and pressure perturbations derived
from dual-Doppler radar observations of the planetary boundary-layer - Applications for
matching models with observations." Journal of the Atmospheric Sciences 41(20): 3007-3020.
Gelb, A., ed. (1974). "Applied Optimal Estimation." M.I.T. Press, Cambridge, USA.
Hamill, T. M. and J. S. Whitaker (2005). "Accounting for the error due to unresolved
scales in ensemble data assimilation: A comparison of different approaches." Monthly
Weather Review 133: 3132-3147.
Hane, C. E. and P. S. Ray (1985). "Pressure and buoyancy fields derived from Doppler
radar data in a tornadic thunderstorm." Journal of the Atmospheric Sciences 42(1): 18-35.
Harlim, J. and B. R. Hunt (2007). "A non-Gaussian ensemble filter for assimilating
infrequent noisy observations." Tellus Series A: Dynamic Meteorology and Oceanography
59(2): 225-237.
Hong, S.-Y. and J. Lim (2006). "The WRF Single-Moment 6-class Microphysics Scheme
(WSM6)." J. Korean Meteor. Soc. 42: 129-151.
Houtekamer, P. L. and H. L. Mitchell (1998). "Data assimilation using an ensemble
Kalman filter technique." Monthly Weather Review 126(3): 796-811.
Houtekamer, P. L. and H. L. Mitchell (2001). "A sequential ensemble Kalman filter for
atmospheric data assimilation." Monthly Weather Review 129(1): 123-137.
Houtekamer, P. L., H. L. Mitchell, et al. (2005). "Atmospheric data assimilation with an
ensemble Kalman filter: Results with real observations." Monthly Weather Review
133(3): 604-620.
Johns, C. J. and J. Mandel (2008). "A two-stage ensemble Kalman filter for smooth data
assimilation." Environmental and Ecological Statistics 15(1): 101-110.
Kalman, R. E. (1960). "A new approach to linear filtering and prediction problems."
Journal of Basic Engineering 82(1): 35-45.
Lin, Y., P. S. Ray, et al. (1993). "Initialization of a modeled convective storm using
Doppler radar derived fields." Monthly Weather Review 121(10): 2757-2775.
Mandel, J. (2006). "Efficient implementation of the ensemble Kalman filter." University
of Colorado at Denver and Health Sciences Center CCM Report 231.
Meng, Z. Y. and F. Q. Zhang (2008). "Tests of an ensemble Kalman filter for mesoscale
and regional-scale data assimilation. Part III: Comparison with 3DVAR in a real-data
case study." Monthly Weather Review 136(2): 522-540.
Ott, E., B. R. Hunt, et al. (2004). "A local ensemble Kalman filter for atmospheric data
assimilation." Tellus Series A: Dynamic Meteorology and Oceanography 56(5): 415-428.
Saucier, W. J. (1955). "Principles of Meteorological Analysis." 76–78.
Snyder, C. and F. Q. Zhang (2003). "Assimilation of simulated Doppler radar
observations with an ensemble Kalman filter." Monthly Weather Review 131(8): 1663-1677.
Sun, J. Z. (2005). "Convective-scale assimilation of radar data: Progress and challenges."
Quarterly Journal of the Royal Meteorological Society 131(613): 3439-3463.
Tong, M. J. and M. Xue (2005). "Ensemble Kalman filter assimilation of Doppler radar
data with a compressible nonhydrostatic model: OSS experiments." Monthly Weather
Review 133(7): 1789-1807.
Wang, X. G., D. M. Barker, et al. (2008). "A Hybrid ETKF-3DVAR Data Assimilation
Scheme for the WRF Model. Part I: Observing System Simulation Experiment." Monthly
Weather Review 136(12): 5116-5131.
Whitaker, J. S. and T. M. Hamill (2002). "Ensemble data assimilation without perturbed
observations." Monthly Weather Review 130(7): 1913-1924.
Xue, M., Y. S. Jung, et al. (2007). "Error modeling of simulated reflectivity observations
for ensemble Kalman filter assimilation of convective storms." Geophysical Research
Letters 34(10): 5.
Xue, M., M. J. Tong, et al. (2006). "An OSSE framework based on the ensemble square
root Kalman filter for evaluating the impact of data from radar networks on thunderstorm
analysis and forecasting." Journal of Atmospheric and Oceanic Technology 23(1): 46-66.
Zhang, F., C. Snyder, et al. (2004). "Impacts of initial estimate and observation
availability on convective-scale data assimilation with an ensemble Kalman filter."
Monthly Weather Review 132(5): 1238-1253.
The U.S. Natural Hazard Statistics, NOAA, http://www.nws.noaa.gov/om/hazstats.shtml
Hydrological information center, NOAA,
http://www.nws.noaa.gov/oh/hic/flood_stats/Flood_loss_time_series.shtml