Parameter estimation: To what extent can data assimilation techniques correctly uncover stochasticity?
Jim Hansen, MIT EAPS, jhansen@mit.edu
(with lots of help from Cecile Penland and Greg Lawson)

Indistinguishable?

Accounting for vs. reducing model inadequacy
• Accounting for model inadequacy
  – "If you can show me how I can make better forecasts using chicken bones and voodoo dolls, then I'm going to use them!" » Harold Brooks, NSSL
  – initial conditions (Q-term)
  – forecasts (multi-model, stochastic, MOS, forecast 4D-Var)
• Reducing model inadequacy
  – Making changes to our model so that it becomes a better representation of the true system
  – parametric error
  – structural error

Reducing model inadequacy
• Reducing model inadequacy is best framed as an off-line, or "reanalysis", activity
  – The process of attempting to identify model inadequacy tends to make both initial conditions and forecasts worse
  – The aim is to quantify how the model is wrong, fix it, and only then worry about data assimilation and forecasting

A proposed approach
• Use data assimilation tools to alter model parameters to better fit observations
• Identify relationships between the fit parameters and the prognostic variables (a parametric MOS)
• Change the model to reflect those relationships
• Repeat
Once all relationships have been uncovered, the history of fit parameter values provides a distribution from which to (carefully) draw for the purpose of stochastic parameterizations.

Use data assimilation tools to alter model parameters to better fit observations
• Augment the control vector with the unknown parameters:
$$ \mathbf{x} \;\rightarrow\; \begin{bmatrix} \mathbf{x} \\ \boldsymbol{\alpha} \end{bmatrix} $$
• Augmentation removes the nonlinearity from the observation operator and inserts it into the specification of the control vector.
• The augmented control vector sample covariance has the block form
$$ \begin{bmatrix} \langle \mathbf{x}'\mathbf{x}'^{T}\rangle & \langle \mathbf{x}'\boldsymbol{\alpha}'^{T}\rangle \\ \langle \boldsymbol{\alpha}'\mathbf{x}'^{T}\rangle & \langle \boldsymbol{\alpha}'\boldsymbol{\alpha}'^{T}\rangle \end{bmatrix} $$
(a minimal numerical sketch of this augmentation appears after the Lorenz '96 Model II example below)

Parametric error example: Lorenz '63
• System equations:
$$ \dot{x} = \sigma_0(y - x), \qquad \dot{y} = r x - xz - y, \qquad \dot{z} = xy - bz $$
• Model equations, with $\sigma$ treated as an uncertain parameter and estimated through the augmented control vector $\mathbf{x} = [x \; y \; z \; \sigma]^{T}$:
$$ \dot{x} = \sigma(y - x), \qquad \dot{y} = r x - xz - y, \qquad \dot{z} = xy - bz $$

Importance of a state-dependent background error covariance
[Figure: estimated parameter vs. time for ensemble 4D-Var (state-dependent covariance) and for 4D-Var with a static covariance.]

Structural error example: Lorenz '96
• System:
$$ \dot{x}_i = x_{i-1}(x_{i+1} - x_{i-2}) - x_i^2 + F $$
• Model, with a linear damping coefficient $\alpha$ estimated through the augmented control vector $\mathbf{x} = [\mathbf{x} \; \alpha]^{T}$:
$$ \dot{x}_i = x_{i-1}(x_{i+1} - x_{i-2}) - \alpha x_i + F $$
[Figure: scatter plot of the fitted parameter against the prognostic variable x(1).]
Regressing the fitted parameter against the prognostic variable gives an approximately linear relationship, $\alpha \approx \gamma x_i$.

Alter the model equations with the new information
• Original model:
$$ \dot{x}_i = x_{i-1}(x_{i+1} - x_{i-2}) - \alpha x_i + F $$
• New model (substituting the fitted relationship for $\alpha$):
$$ \dot{x}_i = x_{i-1}(x_{i+1} - x_{i-2}) - (\gamma x_i)\, x_i + F $$

Example: Lorenz '96 Model II
• System (two scales):
$$ \dot{x}_i = x_{i-1}(x_{i+1} - x_{i-2}) - x_i + F - \frac{h_x c}{b}\sum_{j=1}^{J} y_{j,i} $$
$$ \dot{y}_{j,i} = c b\, y_{j+1,i}\left(y_{j-1,i} - y_{j+2,i}\right) - c\, y_{j,i} + \frac{h_y c}{b}\, x_i $$
• Model, with a time-varying forcing $F_i(t)$ estimated through the augmented control vector $\mathbf{x} = [\mathbf{x} \; \mathbf{F}]^{T}$:
$$ \dot{x}_i = x_{i-1}(x_{i+1} - x_{i-2}) - x_i + F_i(t) $$
[Figure: scatter plot of the fitted forcing against the prognostic variable x(1).]
Regressing the fitted parameter against the prognostic variable gives an approximately linear relationship, $F_i \approx \gamma x_i + G$.
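The "parametric MOS" step just shown is simply a least-squares regression of the assimilation history of the fitted parameter on the prognostic variable. Below is a minimal sketch of that step; it is my own illustration, not code from the talk, and the arrays `x1` and `F_est` are random stand-ins for a stored analysis history.

```python
import numpy as np

# Hypothetical assimilation history: the estimated forcing F_i and the
# corresponding prognostic variable x_i at each analysis time (stand-in data).
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
F_est = 10.0 - 0.8 * x1 + 0.3 * rng.normal(size=500)

# Least-squares fit F_i ~ gamma * x_i + G, the relationship used to rewrite
# the model's damping/forcing term.
gamma, G = np.polyfit(x1, F_est, deg=1)
print(f"F_i ~= {gamma:.2f} * x_i + {G:.2f}")
```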
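Stepping back to the control-vector augmentation introduced earlier, the sketch below (again my own illustration, with made-up array names such as `ens_state` and `ens_param`) builds an augmented ensemble and forms the block sample covariance; the off-diagonal state-parameter blocks are what allow an ensemble filter to update the parameters from observations of the state alone.

```python
import numpy as np

# Hypothetical ensemble: n_ens members, each carrying an n_x-dimensional state
# and n_alpha uncertain parameters (e.g. sigma in Lorenz '63, F_i in Lorenz '96).
n_ens, n_x, n_alpha = 50, 3, 1
rng = np.random.default_rng(0)
ens_state = rng.normal(size=(n_x, n_ens))                 # columns are members
ens_param = 10.0 + rng.normal(scale=0.5, size=(n_alpha, n_ens))

# Augmented control vector: stack state and parameters member by member.
ens_aug = np.vstack([ens_state, ens_param])               # (n_x + n_alpha, n_ens)

# Sample covariance of the augmented control vector; its blocks are
# <x'x'^T>, <x'a'^T>, <a'x'^T>, <a'a'^T>.
perturbations = ens_aug - ens_aug.mean(axis=1, keepdims=True)
P = perturbations @ perturbations.T / (n_ens - 1)

P_xx = P[:n_x, :n_x]      # state-state block
P_xa = P[:n_x, n_x:]      # state-parameter block (drives the parameter update)
P_ax = P[n_x:, :n_x]
P_aa = P[n_x:, n_x:]
print(P_xa.ravel())
```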
Alter the model equations with the new information
• Original model:
$$ \dot{x}_i = x_{i-1}(x_{i+1} - x_{i-2}) - x_i + F_i(t) $$
• New model (substituting $F_i \approx \gamma x_i + G$):
$$ \dot{x}_i = x_{i-1}(x_{i+1} - x_{i-2}) - (1 - \gamma)\, x_i + G $$

SDE crash course
• The type of calculus used to integrate the stochastic bits of SDEs matters
  – Stratonovich calculus: the noise process is continuous (the typical assumption for geophysical fluid flows)
  – Ito calculus: the noise process is discrete (like data assimilation!)
• SDEs can be tricky (and expensive) to integrate
  – a stochastic RK4 scheme was used here (Hansen and Penland, 2005)
(a minimal sketch contrasting the two calculi appears after the forecast configurations below)

What if the system really is stochastic?
• The system is an SDE:
$$ dx = \sigma_0(y - x)\,dt + \sigma_s(y - x)\circ dW, \qquad dy = (rx - xz - y)\,dt, \qquad dz = (xy - bz)\,dt $$
• The model is an ODE, with $\sigma$ estimated through the augmented control vector $\mathbf{x} = [x \; y \; z \; \sigma]^{T}$:
$$ \dot{x} = \sigma(y - x), \qquad \dot{y} = rx - xz - y, \qquad \dot{z} = xy - bz $$

Can DA uncover the correct form of the stochasticity? NO.
[Figure: estimated parameter vs. time for the EnKF and for ensemble 4D-Var. The two schemes give $\langle\sigma\rangle = 10.08$, $\mathrm{std}(\sigma) = 0.36$ and $\langle\sigma\rangle = 10.02$, $\mathrm{std}(\sigma) = 0.32$, while the true values are $\sigma_0 = 10$, $\sigma_s = 0.1$.]

Why can't DA uncover the correct form of the stochasticity?
• The stochasticity operates at different time-scales
  – the SDE has an infinitesimal time-scale
  – the ODE with DA has a 6-hourly time-scale
• The system uses Stratonovich calculus; the DA uses Ito calculus
• Model error! All of which leads to a danger of misinterpretation.

How should we use this information for forecasting?
1. Deterministic model using the constant, tuned parameter value $\langle\sigma\rangle$ ("tuned deterministic")
2. Stochastic model using the mean and standard deviation of the tuned parameter, $\langle\sigma\rangle$ and $\mathrm{std}(\sigma)$ ("incorrect SDE")
3. Deterministic, multi-model ensemble with parameters drawn from $\mathrm{N}(\langle\sigma\rangle, \mathrm{var}(\sigma))$ ("multi-model")
4. Deterministic model where the parameter varies in the same manner as it was estimated ("hybrid")
(a sketch contrasting options 3 and 4 follows the model definitions below)

Tuned deterministic:
$$ \dot{x} = \langle\sigma\rangle(y - x), \qquad \dot{y} = rx - xz - y, \qquad \dot{z} = xy - bz $$

Incorrect SDE:
$$ dx = \langle\sigma\rangle(y - x)\,dt + \mathrm{std}(\sigma)(y - x)\,dW, \qquad dy = (rx - xz - y)\,dt, \qquad dz = (xy - bz)\,dt $$

Multi-model:
$$ \dot{x} = \mathrm{N}\!\left(\langle\sigma\rangle, \mathrm{var}(\sigma)\right)(y - x), \qquad \dot{y} = rx - xz - y, \qquad \dot{z} = xy - bz $$
where the draw from $\mathrm{N}(\langle\sigma\rangle, \mathrm{var}(\sigma))$ is held constant over the entire forecast period.

Hybrid:
$$ \dot{x} = \mathrm{N}\!\left(\langle\sigma\rangle, \mathrm{var}(\sigma)\right)(y - x), \qquad \dot{y} = rx - xz - y, \qquad \dot{z} = xy - bz $$
where the draw from $\mathrm{N}(\langle\sigma\rangle, \mathrm{var}(\sigma))$ is made every 6 model hours.
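To make the difference between the multi-model and hybrid configurations concrete, here is a minimal sketch (my own illustration, not from the talk) that integrates the Lorenz '63 model with a parameter drawn from $\mathrm{N}(\langle\sigma\rangle, \mathrm{var}(\sigma))$ either once per forecast or afresh every 6 model hours. The values of `sigma_mean`, `sigma_std`, and the number of steps corresponding to "6 model hours" are placeholders.

```python
import numpy as np

r, b = 28.0, 8.0 / 3.0              # standard Lorenz '63 parameters
sigma_mean, sigma_std = 10.0, 0.3   # placeholder tuned statistics <sigma>, std(sigma)
dt = 0.01
steps_per_6h = 5                    # placeholder: steps making up "6 model hours"

def l63_rhs(state, sigma):
    x, y, z = state
    return np.array([sigma * (y - x), r * x - x * z - y, x * y - b * z])

def rk4_step(state, sigma, dt):
    k1 = l63_rhs(state, sigma)
    k2 = l63_rhs(state + 0.5 * dt * k1, sigma)
    k3 = l63_rhs(state + 0.5 * dt * k2, sigma)
    k4 = l63_rhs(state + dt * k3, sigma)
    return state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def forecast(state0, n_steps, redraw_every, rng):
    """redraw_every=None: multi-model (one draw held for the whole forecast);
       redraw_every=steps_per_6h: hybrid (new draw every 6 model hours)."""
    state = state0.copy()
    sigma = rng.normal(sigma_mean, sigma_std)
    for step in range(n_steps):
        if redraw_every is not None and step % redraw_every == 0:
            sigma = rng.normal(sigma_mean, sigma_std)
        state = rk4_step(state, sigma, dt)
    return state

rng = np.random.default_rng(2)
x0 = np.array([1.0, 1.0, 20.0])
multi_model = forecast(x0, 500, None, rng)
hybrid = forecast(x0, 500, steps_per_6h, rng)
print(multi_model, hybrid)
```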
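Picking up the SDE crash-course point that the choice of stochastic calculus matters, the sketch below (again my own, assuming the multiplicative-noise Lorenz '63 SDE shown earlier) contrasts an Euler-Maruyama step, which converges to the Ito solution, with a stochastic Heun predictor-corrector step, which converges to the Stratonovich solution. It is a toy substitute for the stochastic RK4 of Hansen and Penland (2005).

```python
import numpy as np

sigma0, sigma_s, r, b = 10.0, 0.1, 28.0, 8.0 / 3.0

def drift(s):
    x, y, z = s
    return np.array([sigma0 * (y - x), r * x - x * z - y, x * y - b * z])

def diffusion(s):
    x, y, z = s
    return np.array([sigma_s * (y - x), 0.0, 0.0])   # noise only in dx

def euler_maruyama_step(s, dt, dW):
    # Converges to the Ito interpretation of the SDE.
    return s + drift(s) * dt + diffusion(s) * dW

def heun_step(s, dt, dW):
    # Stochastic Heun (predictor-corrector); converges to the Stratonovich
    # interpretation, the one assumed for continuous geophysical noise.
    s_pred = s + drift(s) * dt + diffusion(s) * dW
    return s + 0.5 * (drift(s) + drift(s_pred)) * dt \
             + 0.5 * (diffusion(s) + diffusion(s_pred)) * dW

rng = np.random.default_rng(3)
dt, n_steps = 0.001, 10000
s0 = np.array([1.0, 1.0, 20.0])
s_ito, s_strat = s0.copy(), s0.copy()
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt))
    s_ito = euler_maruyama_step(s_ito, dt, dW)
    s_strat = heun_step(s_strat, dt, dW)
print(s_ito, s_strat)
```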
Median of ensemble mean forecast distributions
[Figure: normalized RMSE vs. forecast lead (model days) for the tuned deterministic, incorrect SDE, multi-model, hybrid, and perfect configurations.]

[Figure: std(err/ens_std) vs. forecast lead (model days) for the same configurations.]
Must assess probabilistically!

Relative (to perfect) entropy
[Figure: relative entropy vs. forecast lead (model days) for the multi-model and hybrid configurations.]

What if we use a stochastic model?
• The system is an SDE:
$$ dx = \sigma_0(y - x)\,dt + \sigma_s(y - x)\circ dW, \qquad dy = (rx - xz - y)\,dt, \qquad dz = (xy - bz)\,dt $$
• The model is an SDE of the same form, with both parameters estimated through the augmented control vector $\mathbf{x} = [x \; y \; z \; \sigma_0 \; \sigma_s]^{T}$:
$$ dx = \sigma_0(y - x)\,dt + \sigma_s(y - x)\circ dW, \qquad dy = (rx - xz - y)\,dt, \qquad dz = (xy - bz)\,dt $$

Now can DA uncover the correct form of the stochasticity? NO.
[Figure: estimated parameter mean and parameter standard deviation vs. time.]
• Estimated: $\sigma_0 = 10.11$, $\sigma_s = 0.26$; truth: $\sigma_0 = 10$, $\sigma_s = 0.1$
• $\sigma_0$ and $\sigma_s$ are not unique

What's the problem this time?
• The model sees the wrong trajectory of random numbers
• Model error!
[Figure: SDE forecast errors as a function of $\sigma_s$ and $\sigma_0$.]

What does it all mean?
• Deterministic-model DA approaches alone are not enough to uncover the correct form of stochasticity
  – This implies that we cannot attach physical significance to tuned parameter values or distributions
• Our efforts to reduce model inadequacy ultimately lead to a sensible way to account for model inadequacy
• Synoptic time-scale, Ito-like stochasticity via parameter estimation does a great job of accounting for model inadequacy during forecasting

The future(?) of data assimilation
• Model error issues
• Nonlinearity
• New disciplines: e.g. paleo, climate
• Improved image
• DA is part of a larger problem

The future(?) of data assimilation
• Nonlinearity
  – Implementing nonlinear approaches
  – Extending minimum error variance approaches a bit further into the nonlinear regime
    • Feature-based non-Gaussianity

The future(?) of data assimilation
• Improved image
  – DA has a bad/boring reputation
  – Ensemble methods are bringing DA to the masses
    • University research can be quasi-operational
    • Reasonable DA now exists where there was none before

The future(?) of data assimilation
• DA is part of a larger problem
  – The future of DA is not independent of the future of observations, ensemble forecasting, verification, calibration, etc.
  – Ensemble forecasting
  – Targeting
  – Increasing ensemble forecast size at low cost
  – Ensemble synoptic analysis

Transformed Lag Ensemble Forecasting (TLEF)
• Ensemble size is increased by using ensemble-based data assimilation techniques to transform (scale and rotate) old forecasts using new observations.
[Figure: schematic of old forecasts being transformed by new observations along the time axis.]

Ensemble synoptic analysis (Hakim and Torn)
• WRF, 100 ensemble members, surface pressure observations
• Ensembles make PV inversion fun and easy! Ertel PV and the model state are linked by a linear regression operator built from ensemble covariances (approach defined by Hakim and Torn; a minimal numerical sketch appears at the end of these notes)
• Note: no worries about balance assumptions or boundary conditions

The future(?) of data assimilation
• Model error issues
• Nonlinearity
• New disciplines: e.g. paleo, climate
• Sales/Marketing
• DA is part of a larger problem
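Finally, the ensemble-based PV regression referenced in the Hakim and Torn slide above can be illustrated with a small least-squares sketch. This is my own construction under the assumption that the operator is the usual ensemble regression $\mathbf{A} = \langle \mathrm{pv}'\,\mathbf{x}'^{T}\rangle\langle \mathbf{x}'\,\mathbf{x}'^{T}\rangle^{-1}$, so a state perturbation maps to a statistically consistent PV perturbation without invoking balance assumptions or boundary conditions; the arrays are random stand-ins, not WRF output.

```python
import numpy as np

# Stand-in ensemble: n_ens members of an n_x-dimensional state and the
# corresponding Ertel PV diagnosed at n_pv points (random numbers here,
# in place of the 100-member WRF ensemble mentioned in the talk).
rng = np.random.default_rng(4)
n_ens, n_x, n_pv = 100, 40, 10
x_ens = rng.normal(size=(n_x, n_ens))
true_map = rng.normal(size=(n_pv, n_x))
pv_ens = true_map @ x_ens + 0.1 * rng.normal(size=(n_pv, n_ens))

# Ensemble perturbations about the mean.
xp = x_ens - x_ens.mean(axis=1, keepdims=True)
pvp = pv_ens - pv_ens.mean(axis=1, keepdims=True)

# Regression operator A = <pv' x'^T> <x' x'^T>^-1 (solved rather than
# explicitly inverted; a real application would need localization or
# regularization when n_x approaches or exceeds n_ens).
C_pvx = pvp @ xp.T / (n_ens - 1)
C_xx = xp @ xp.T / (n_ens - 1)
A = np.linalg.solve(C_xx.T, C_pvx.T).T   # solves A @ C_xx = C_pvx

# Map a state perturbation to its regressed PV perturbation.
dx = xp[:, 0]
dpv = A @ dx
print(dpv[:3])
```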