D.E. Rivera, Introduction to System Identification, ChE 494/598, January 20, 2004

ChE 494/598: Introduction to System Identification
Daniel E. Rivera, Associate Professor
Control Systems Engineering Laboratory
Department of Chemical and Materials Engineering
Arizona State University, Tempe AZ 85287-6006
daniel.rivera@asu.edu, (480) 965-9476
© Copyright 1998-2004 by D.E. Rivera, All Rights Reserved

Course Objectives
• Present fundamental background to allow students to make judicious choices of design variables in system identification.
• Provide lab exercises that will give students a working feel for the course topics. MATLAB (particularly the System Identification Toolbox) will be the program of choice.
• Provide a glimpse of cutting-edge identification research at ASU and other academic institutions around the world.

System Identification
“Identification is the determination, on the basis of input and output, of a system within a specified class of systems, to which the system under test is equivalent.” - L. Zadeh (1962)
[Block diagram: Inputs and Disturbances act on the System, which produces Outputs]
System identification focuses on the modeling of dynamical systems from experimental data.

Some System Identification Facts
• problem not exclusively associated with control design, although it forms a significant part of control implementation
• oftentimes, the system identification task is the most expensive and time-consuming part of advanced control implementation
• broadly applicable technology with applications in many diverse fields
Shell Heavy Oil Fractionator Example
• Manipulate top, side draw and/or bottoms reflux duty to maintain top and side endpoints at setpoint.
• Reject disturbances from the upper and intermediate reflux duties.
• Keep Bottoms Reflux Temperature above constraints.
[Figure: fractionator schematic with labels Top Draw, Top Endpoint, Upper Reflux Duty, Intermediate Reflux Duty, Side Draw, Side Endpoint, Side Stripper, Bottoms Reflux Duty, Bottoms Reflux Temp, Feed, Bottoms, and Control]

Distillation Column Data
[Figure: OUTPUT #1 (top panel) and INPUT #1 (bottom panel) versus time, 0 to 200]
• response of overhead temperature (top) to changes in reflux flowrate (bottom)

Epi Reactor Temperature Control
• keep center, front, side and rear temperatures constant by adjusting power to the lamp banks

Epi Reactor Identification Data
[Figure: Power [%] (top panel) and Temp. Deviation [C] (bottom panel) versus Time [seconds], 0 to 2000. Plot legends: solid: master; dashed: side; dotted: front; dash-dotted: rear]
solid: center; dashed: side; dotted: front; dash-dotted: rear
Semiconductor Mfg Supply Chain Management
[Figure: supply chain schematic with labels Fab/Sort starts, Load, Controller, LT, ADI, Forecast, A/T starts, Real Demand, SFGI, CW, and Shipments]
A/T: Assembly/Test Facility
ADI: Assembly-Die Inventory
SFGI: Semi-Finished Goods Inventory
CW: Component Warehouse

Fab/Test Node Dynamic Response
[Figure: Starts and Outs versus Time, annotated with D1, D2, and D3]

Wing Flutter Example
[Figure: filtered data used for modeling; Excitation (top) and Response (bottom) versus Time [s], 0 to 8]
• artificial mechanical vibrations (top) introduced to a wing at certain flight conditions; responses shown on bottom

Other Challenging Application Areas
• Economic/financial systems
  • modeling economic indicators such as the Dow Jones, S&P 500 indices
• Behavioral/social systems
  • time-varying adaptive interventions for the prevention of chronic, relapsing disorders (such as alcoholism, smoking and drug abuse)

Stages of System Identification
• Experimental Design and Execution (Step, Pulse, or PRBS-Generated Data)
• Data Preprocessing
• Model Structure Selection (Linear Plant and Disturbance Models)
• Parameter Estimation
• Model Validation (Simulation, Residual auto- and cross-correlation, step-response)

[Flowchart: Start (a priori process information); Experimental Design and Execution; "Identification" (Data Preprocessing, Model Structure Determination, Parameter Estimation); Model Validation; Does the model meet validation criteria? If No, repeat; if Yes, End]
Prior system knowledge: physics, linguistics, first-hand, etc.
Stages of System Identification - II
[Flowchart: Experiment design; Pre-treat data; Choose model structure; Choose performance criterion; Parameter estimation; Validate model. Not OK: revise! (revise prior?). OK: accept model! Then Controller Design & Commissioning]
"the classical statistical approach," per Ljung...
• courtesy P. Lindskog, ISY, Linköping University, Sweden

Keys to Successful System Identification in Practice
• Understanding the various identification methods and associated decision variables in terms of bias-variance tradeoffs
• Effective use of a priori knowledge regarding the system to be identified and the intended application (e.g., simulation, prediction, control)

System Identification Challenges
• Skill-level issues: many system identification methods assume the user has extensive background in statistics, signal processing, discrete-time systems, and optimization.
• Large number of design variables.
• Process operating restrictions make identification one of the most time-consuming tasks in advanced control implementation projects.

Furnace Control Example
[Figure: furnace schematic with CONTROLLER, labeled (Input), (Output), and (Disturbance)]
Objective: Use fuel gas flow to keep outlet temperature under control, in spite of significant changes in the feed flowrate.

The "Shower Problem"
Consider the problem of adjusting hot water flow to maintain shower temperature despite cold water fluctuations...
[Figure: shower with Hot and Cold water streams]
Transportation lag makes this a difficult control problem...

Graphical System Identification Using Step Testing
Response of a first-order with deadtime model for a step input of magnitude A:
$p(s) = \dfrac{K e^{-\theta s}}{\tau s + 1}$
[Figure: step response versus Time, annotated with $\theta$, $\tau$, and $KA$]
Many references for this technique, example: Seborg, Edgar, and Mellichamp, Process Dynamics and Control, Wiley, 1989, Chapter 7.
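As a rough numerical counterpart to the graphical step-test procedure, the sketch below fits the first-order-with-deadtime model to a noise-free step response. It uses a common two-point rule (times to reach 28.3% and 63.2% of the final change), which is a substitute for reading the plot by hand and is not taken from the slides. The plant values K = 1, τ = 10, θ = 5 and all function names are illustrative assumptions.

```python
import math

def fopdt_step(t, K, tau, theta, A=1.0):
    """Step response of K*exp(-theta*s)/(tau*s + 1) to a step of size A at t = 0."""
    if t < theta:
        return 0.0
    return K * A * (1.0 - math.exp(-(t - theta) / tau))

def crossing_time(times, ys, level):
    """First time the (rising) response reaches `level`, by linear interpolation."""
    for i in range(1, len(ys)):
        if ys[i - 1] < level <= ys[i]:
            frac = (level - ys[i - 1]) / (ys[i] - ys[i - 1])
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("response never reaches the requested level")

def fit_fopdt(times, ys, A=1.0):
    """Two-point (28.3% / 63.2%) estimate of K, tau, theta from a step test."""
    y_final = ys[-1]                      # assumes the record ends at steady state
    K = y_final / A
    t28 = crossing_time(times, ys, 0.283 * y_final)
    t63 = crossing_time(times, ys, 0.632 * y_final)
    tau = 1.5 * (t63 - t28)
    theta = t63 - tau
    return K, tau, theta

# Illustrative, noise-free "plant" (assumed values): K = 1, tau = 10, theta = 5
times = [0.1 * i for i in range(1001)]    # 0 to 100 time units
ys = [fopdt_step(t, 1.0, 10.0, 5.0) for t in times]
K_hat, tau_hat, theta_hat = fit_fopdt(times, ys)
print(round(K_hat, 3), round(tau_hat, 2), round(theta_hat, 2))
```

With clean data the estimates land close to the assumed values. The rule relies on a steady final value, so a drifting disturbance can be expected to distort all three parameters.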
Furnace Example (Continued)
[Figure: Open-loop disturbance (no control); Temperature (Controlled Variable) (top) and Fuel Gas Flow (Manipulated Variable) (bottom) versus Time[Min], 0 to 2000]

Perils of Step Testing
Consider the application of step testing to a system subject to a drifting, nonstationary disturbance.
[Figure: Measured Output Temperature Response to a Step Increase in Fuel Gas Flow; Measured Output and Input versus Time[Min] over 2000 minutes]
[Figure: Compare Step Responses: FOPDT Model[--], PLANT Data[-], TRUE PLANT[-.], versus Time[Min] over 40 minutes]
$p(s)\ \text{(true)} = \dfrac{e^{-5s}}{10 s + 1}$
$p(s)\ \text{(est)} = \dfrac{0.748\, e^{-6s}}{7.095 s + 1}$

Design Variable Selection Issues
• Input Signal Selection: Random Binary, Pseudo-Random Binary, or Multiple Step/Pulse Inputs?
• Input Signal Parameters: Example: PRBS - number of shift registers, switching time, signal magnitude, and signal duration
• Data Preprocessing: Detrending, control-relevant prefiltering, outlier removal, etc.
• Model Structure Selection and Parameter Estimation: ARX, ARMAX, Output Error, Box-Jenkins
• Model Validation: Simulation, Crossvalidation, Correlation Analysis; Examination of Step, Impulse, Frequency Responses.

Principal Sources of Error in System Identification
ERROR = BIAS + VARIANCE
• BIAS. Systematic errors caused by
  - input signal characteristics (i.e., excitation)
  - choice of model structure
  - mode of operation (i.e., closed-loop instead of open-loop)
• VARIANCE. Random errors introduced by the presence of noise in the data, which do not allow the model to exactly reproduce the plant output. It is affected by the following factors:
  - number of model parameters
  - duration of the identification test
  - signal-to-noise ratio
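One common way to make the ERROR = BIAS + VARIANCE statement precise is the mean-squared-error decomposition for a scalar parameter estimate. The symbols $\hat\theta$ (estimate) and $\theta_0$ (true value) are assumed here for illustration and do not appear on the slide.

```latex
\mathrm{E}\big[(\hat\theta - \theta_0)^2\big]
  = \underbrace{\big(\mathrm{E}[\hat\theta] - \theta_0\big)^2}_{\text{bias}^2}
  + \underbrace{\mathrm{E}\big[(\hat\theta - \mathrm{E}[\hat\theta])^2\big]}_{\text{variance}}
```

In this form the cross term vanishes because $\mathrm{E}[\hat\theta - \mathrm{E}[\hat\theta]] = 0$, so the two sources of error can be examined separately.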
Course Outline
• Signals and Systems Overview
• Input Signal Design and Nonparametric Estimation
• Parametric Model Estimation and Validation
• Control-Relevant and Closed-Loop Identification
• Multivariable Identification
• Issues in nonlinear and semiphysical identification

From Identification to Controller Implementation
[Figure: Measured Output (top) and Input (bottom) versus Time[Min], 0 to 5000, with segments labeled OPEN LOOP RESPONSE, IDENTIFICATION DATA, and CLOSED LOOP RESPONSE]
Furnace example with PRBS input, PID with filter controller

Course Focus
• Very broad subject
  - Linear or Nonlinear? (Mostly) LINEAR
  - Continuous or Discrete? DISCRETE
  - Parametric or nonparametric? BOTH
  - Time or frequency domain? BOTH

Systems Representations
[Diagram relating: Nonlinear Lumped Parameter System; State-Space Model; Discrete-time S-S Model; s-domain Transfer Function Model; z-domain Transfer Function Model (difference equation); Step/Impulse Response and Frequency Response; Discrete-time Step/Impulse Response and Frequency Response. Links are labeled Linearization, Sampling (period T), Laplace transforms, and Realization]
Discrete Model Representations
Nonparametric
• Step Response
• Impulse Response
Parametric
• Difference Equations
• Z-Transforms
[Figure: input U(k) and output Y(k+1) versus time for each representation; U(z) and Y(z) related through G(z)]

Pulse Transfer Functions
[Diagram: computer control algorithm sends the discrete input $u_k$ to a Zero-order Hold, giving the continuous input $u(t)$ to the PLANT $P(s)$; the continuous output $y(t)$ is sampled with period T to give the discrete output $y_k$ returned to the computer. The hold, plant, and sampler together form the ZOH-equivalent Pulse Transfer Function]

Z-Transforms Examples
Signal or system | s-domain | z-domain
Impulse $\delta(t)$ | $1$ | $1$
Step $s(t)$ ($1$ for $t \ge 0$, $0$ for $t < 0$) | $\dfrac{1}{s}$ | $\dfrac{z}{z-1}$
First-Order Lag | $\dfrac{K}{\tau s + 1}$ | ZOH Pulse Transfer Function: $\dfrac{K(1 - e^{-T/\tau})}{z - e^{-T/\tau}}$
Integrating/Ramp | $\dfrac{K}{s}$ | ZOH Pulse Transfer Function: $\dfrac{KT}{z-1}$
First-Order with Delay ($\theta = NT$) | $\dfrac{K e^{-\theta s}}{\tau s + 1}$ | ZOH Pulse Transfer Function: $\dfrac{K(1 - e^{-T/\tau})\, z^{-N}}{z - e^{-T/\tau}}$

System Identification Structure
$y(t) = P(z)u(t) + H(z)a(t)$
[Block diagram: Input Signal $u$ passes through $P(z)$; Random Signal $a$ passes through $H(z)$ to give the Disturbance Signal $\nu$; the two are summed to give the Output Signal $y$]
P(z) and H(z) are discrete-time (z-domain) transfer functions
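The ZOH pulse transfer function entries for the first-order lag with delay correspond to the difference equation y[k+1] = a·y[k] + K(1 − a)·u[k − N], with a = exp(−T/τ). The sketch below simulates that recursion and compares it with the continuous step response at the sampling instants. The values K = 2, τ = 10, T = 1, N = 3 and the function names are assumptions chosen for illustration.

```python
import math

def zoh_first_order(K, tau, T):
    """Coefficients of y[k+1] = a*y[k] + b*u[k], the ZOH equivalent of K/(tau*s + 1)."""
    a = math.exp(-T / tau)
    b = K * (1.0 - a)
    return a, b

def simulate(a, b, u, delay_samples=0):
    """Simulate y[k+1] = a*y[k] + b*u[k - N] from rest, with N = delay_samples."""
    y = [0.0]
    for k in range(len(u) - 1):
        u_delayed = u[k - delay_samples] if k >= delay_samples else 0.0
        y.append(a * y[-1] + b * u_delayed)
    return y

# Assumed illustrative values: K = 2, tau = 10, T = 1, theta = N*T with N = 3
K, tau, T, N = 2.0, 10.0, 1.0, 3
a, b = zoh_first_order(K, tau, T)
u = [1.0] * 50                            # unit step applied at k = 0
y = simulate(a, b, u, delay_samples=N)

# Continuous-time step response evaluated at the sampling instants
y_exact = [K * (1.0 - math.exp(-(k - N) * T / tau)) if k >= N else 0.0
           for k in range(len(u))]
max_err = max(abs(yd - yc) for yd, yc in zip(y, y_exact))
print(max_err)
```

Because a step is constant between samples, the hold introduces no approximation here, so the two responses should agree to rounding error.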
Rivera, Introduction to System Identification, ChE 494/598, January 20, 2004 Signals Overview Example 1: White-noise signal at • Deterministic versus stochastic signals xt 1 x t = at • Stationary versus nonstationary signals ρk = σ 2a 1.5 AUTOCORRELATION OF WHITE NOISE 0.5 Mean, auto and cross-covariance, power and cross-spectra will be the measures/tools utilized here =1 0.5 0 0 2 0 4 0 1 Lag Example 1: White Noise Signal Analysis 2 Frequency [Radians] Example 3: AR(1) signal (From Sample Estimators) Signal 1.5 σ 2a 1 phi • Crosscorrelated versus uncorrelated signals 1.5 POWER SPECTRUM OF WHITE NOISE Φ(ω ) 1 rhok • White versus autocorrelated signals γk at = N(0, σ 2a ) Signal 3 1 2 0.5 1 0 0 -1 -0.5 -2 -1 0 1 200 400 600 800 1000 1200 Sample Number Autocorrelation Coefficients 10 0 10 -1 10 -2 1400 1600 1800 -3 0 2000 Power Spectral Density 1 0 -0.5 -20 0 20 Lag k 40 10 400 600 800 1000 1200 Sample Number 10 1 0.5 10 0 0 10 -1 10 -2 1400 1600 1800 2000 Power Spectral Density rhok rhok 0.5 200 Autocorrelation Coefficients -2 -1 10 Frequency [Radians/Time] 10 0 -0.5 -20 0 20 Lag k 40 10 -2 © Copyright 1998-2004 by D.E. Rivera, All Rights Reserved 9 -1 10 Frequency [Radians/Time] 10 0 ω 3 D.E. Rivera, Introduction to System Identification, ChE 494/598, January 20, 2004 System Identification, Revisited "Plant Friendly" Input Signal Design a white noise signal A plant friendly input signal should: • be as short as possible H(z) Input Signal (Random or Deterministic) u P(z) Disturbance Signal (random, autocorrelated) υ + + • not take actuators to limits, or exceed move size restrictions • cause minimum disruption to the controlled variables (i.e., low variance, small deviations from setpoint) y Output Signal (random, autocorrelated) • u and y are crosscorrelated • a and y are crosscorrelated • If u and a are statistically independent, then u and ν will be uncrosscorrelated... Note that theoretical requirements may strongly conflict with "plant-friendly" operation! 
Inputs to Consider
• Step/Pulse Inputs
• Gaussian White Noise
• Random Binary Signal (RBS)
• Pseudo-Random Binary Signal (PRBS)
• multi-level Pseudo-Random Signals
• Multisine inputs (e.g., Schroeder-phased, minimum crest factor)

Pseudo-Random Binary Sequence
The PRBS is a periodic, deterministic input which can be generated using shift registers and Boolean algebra
[Diagram: $n_r$ Shift Registers with feedback through an Exclusive OR (Modulo 2 Adder), producing the Test Signal]
The main design variables are switching time (Tsw), number of shift registers (nr), and signal amplitude

PRBS, continued
[Figure: One cycle of the PRBS input signal versus Time[Min], 0 to 45; Power Spectrum of the PRBS input (AR versus Radians/Min)]
PRBS design for Tsampl = 1, Tsw = 3, n (registers) = 4, and signal magnitude = +/- 1.0. One cycle duration is 45 minutes long.

Nonparametric Methods
• Correlation Analysis:
  - direct estimation of impulse response coefficients from identification data
• Spectral Analysis:
  - direct estimation of frequency response from identification data

Correlation Analysis Results, Hairdryer Data
[Figure: four panels versus lag, -20 to 20: Covf for prewhitened u; Covf for filtered y; Correlation from u to y (prewh); Impulse response estimate]

Wing Flutter Example, Spectral Analysis
[Figure: Amplitude [dB] (top) and Phase [degree] (bottom) versus Frequency [Hz], 4 to 11. Smoothed SPA model (solid). Raw ETFE (*).]
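The shift-register construction of the PRBS can be sketched as follows for the design quoted on the slide (four registers, Tsw = 3 samples, magnitude ±1). The feedback taps (4, 3) and the all-ones initial state are assumptions, since the slide does not say which were used; the function name `prbs` is likewise illustrative.

```python
def prbs(n_reg=4, taps=(4, 3), t_sw=3, amplitude=1.0):
    """One cycle of a PRBS from an n_reg-bit shift register with XOR feedback.

    `taps` are 1-based register positions fed to the modulo-2 adder; (4, 3)
    is a maximal-length choice for four registers (assumed, not from the slide).
    """
    state = [1] * n_reg                   # any nonzero initial state works
    n_bits = 2 ** n_reg - 1               # period of a maximal-length sequence
    signal = []
    for _ in range(n_bits):
        out = state[-1]                   # output is taken from the last register
        signal.extend([amplitude if out else -amplitude] * t_sw)
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]    # exclusive OR of the tapped registers
        state = [feedback] + state[:-1]   # shift, feeding the XOR result back in
    return signal

u = prbs()
print(len(u))
```

One cycle contains (2^4 − 1) × 3 = 45 samples, which at Tsampl = 1 minute matches the 45-minute cycle quoted above.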
Prediction-Error Model Structures
Smoothing, Filtering, Prediction
• In the prediction problem, current and previous measurements from the plant are used to obtain estimates k+1 (or beyond) time steps in the future

Prediction-Error Family of Models
[Block diagram: $u$ passes through $z^{-nk} B(z)/F(z)$; $e$ passes through $C(z)/D(z)$; the two are summed and passed through $1/A(z)$ to give $y$]
$A(z)\,y(t) = \dfrac{B(z)}{F(z)}\,u(t - nk) + \dfrac{C(z)}{D(z)}\,e(t)$
$A(z) = 1 + a_1 z^{-1} + \ldots + a_{na} z^{-na}$
$B(z) = b_1 + b_2 z^{-1} + \ldots + b_{nb} z^{-nb+1}$
$C(z) = 1 + c_1 z^{-1} + \ldots + c_{nc} z^{-nc}$
$D(z) = 1 + d_1 z^{-1} + \ldots + d_{nd} z^{-nd}$
$F(z) = 1 + f_1 z^{-1} + \ldots + f_{nf} z^{-nf}$
In transfer function form:
$y(t) = \tilde p(z)\,u(t) + \tilde p_e(z)\,e(t)$
$\tilde p(z) = \dfrac{B(z)}{A(z)F(z)}\, z^{-nk}, \qquad \tilde p_e(z) = \dfrac{C(z)}{A(z)D(z)}$

Popular PEM Structures
Method | $\tilde p(z)$ | $\tilde p_e(z)$
ARX | $\dfrac{B(z)}{A(z)} z^{-nk}$ | $\dfrac{1}{A(z)}$
ARMAX | $\dfrac{B(z)}{A(z)} z^{-nk}$ | $\dfrac{C(z)}{A(z)}$
FIR | $B(z)\, z^{-nk}$ | $1$
Box-Jenkins | $\dfrac{B(z)}{F(z)} z^{-nk}$ | $\dfrac{C(z)}{D(z)}$
Output Error | $\dfrac{B(z)}{F(z)} z^{-nk}$ | $1$

ARX Parameter Estimation
The one-step ahead predictor for y
$\hat y(t|t-1) = -a_1 y(t-1) - \ldots - a_{na} y(t-na) + b_1 u(t-nk) + \ldots + b_{nb} u(t-nk-nb+1)$
can be expressed as a linear regression problem via
$\varphi(t) = [\, -y(t-1) \;\ldots\; -y(t-na) \;\; u(t-nk) \;\ldots\; u(t-nk-nb+1) \,]^T$
and θ, the vector of parameter estimates:
$\theta = [\, a_1 \;\ldots\; a_{na} \;\; b_1 \;\ldots\; b_{nb} \,]^T$
Rewriting the objective (“loss”) function as
$\min_\theta V = \min_\theta \dfrac{1}{N} \sum_{t=1}^{N} \left( y(t) - \varphi^T(t)\theta \right)^2$
leads to the well-established linear least-squares solution
$\hat\theta = \left[ \dfrac{1}{N} \sum_{t=1}^{N} \varphi(t)\varphi^T(t) \right]^{-1} \dfrac{1}{N} \sum_{t=1}^{N} \varphi(t)\,y(t)$

Model Validation Techniques
• Simulation (plot the measured output time series versus the predicted output from the model).
• Crossvalidation (simulate on a data set different than the one used for parameter estimation; for a number of different model structures, plot the loss function and select the minimum).
• Impulse, step, and frequency responses (compare with physical insight regarding process).
• Scatter Plots/correlation analysis on the prediction errors (make sure they resemble white noise).
• Information criteria (Akaike or Rissanen's Minimum Description Length)

Modeling Requirements for Process Control
[Diagram: "Decomposed" approach (START, Modeling, then Control) versus "Integrated/Synergistic" approach (Modeling/Control)]
Same result is not obtained from both approaches!

Control-Relevant Identification
• Some general ideas behind control-relevant modeling
• Design variables for control-relevant id
  - Control-relevant prefiltering
  - Control-relevant input signals
• Brief comments on uncertainty estimation from id data
• Integrated system id and PID controller design

Control-Relevant Prefiltering
[Figure: Overhead Temperature (top) and Reflux Flow (bottom) versus Time, 0 to 200. Solid: Raw Data; Dashed: Prefiltered Data]
The purpose of c-r prefiltering is to emphasize information in the data most important for control purposes
Problems in Closed-Loop Identification
[Block diagram: feedback loop with setpoint $r$, controller $C$, filter $C_F$, external signal $u_d$ added at the plant input $u$, plant $P$, disturbance $d$ entering through $P_d$ to give $\nu$, and output $y$]
• crosscorrelation will exist between disturbance (d) and input (u) as a result of the control
• control action will introduce additional bias by "eating away" at excitation

Refinery Debutanizer
[Figure: debutanizer schematic with labels REFLUX FLOW (FC), FEED FLOW (F), FEED TEMP (T), BOTTOMS-TO-FEED DIFFERENTIAL PRESSURE (P), BOTTOMS TEMP (T), REBOIL FEED TEMP (T), FUEL GAS FLOW (FC), and FUEL GAS SPECIFIC GRAVITY (G)]
MPC loop between Bottoms Temperature and Fuel Gas Flow SP

Debutanizer Closed-Loop Testing
[Figure: PRBS Signal and Input Series (Fuel Gas Flowrate Setpoint) (top) and Output Series (Bottoms Temperature) (bottom) versus time, 0 to 300]
Closed-loop data set generated by signal injection at the Fuel Gas Flowrate Setpoint; dashed line shows external signal (ud); solid lines show u and y, respectively

Multivariable System Identification
• Motivation for multivariable identification
• Multiple input extensions to:
  - PRBS, RBS design
  - ARX estimation
  - PEM
• Brief overviews of Bayard’s, Zhu’s, and subspace methods
• Overview of ASU’s MIMO control-relevant methodology
  - “zippered” multisine signals
Illustrations from various applications
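As a single-input reference point for the ARX estimation that the multiple-input extensions above build on, the linear least-squares solution can be sketched for the smallest case, na = nb = nk = 1. The "true" parameters a1 = −0.9, b1 = 0.1, the noise level, the random binary input, and the function name `fit_arx_111` are all assumptions made for illustration.

```python
import random

def fit_arx_111(u, y):
    """Least-squares fit of y(t) = -a1*y(t-1) + b1*u(t-1) + e(t) (na = nb = nk = 1)."""
    # Accumulate the 2x2 normal equations: (sum phi phi^T) theta = sum phi y.
    # The 1/N factors on both sides cancel, so they are omitted.
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(1, len(y)):
        p1, p2 = -y[t - 1], u[t - 1]      # regressor phi(t)
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * y[t]
        r2 += p2 * y[t]
    det = s11 * s22 - s12 * s12
    a1 = (r1 * s22 - r2 * s12) / det      # Cramer's rule for the 2x2 system
    b1 = (s11 * r2 - s12 * r1) / det
    return a1, b1

# Assumed "true" system for illustration: a1 = -0.9, b1 = 0.1
random.seed(1)
N = 2000
u = [random.choice((-1.0, 1.0)) for _ in range(N)]   # random binary input
y = [0.0] * N
for t in range(1, N):
    y[t] = 0.9 * y[t - 1] + 0.1 * u[t - 1] + random.gauss(0.0, 0.01)

a1_hat, b1_hat = fit_arx_111(u, y)
print(round(a1_hat, 3), round(b1_hat, 3))
```

The data are generated in open loop with white equation error, which is the situation where the ARX least-squares estimate behaves best. Under feedback, the correlation between disturbance and input noted above would generally bias the same computation.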
MIMO PRBS Input Experimental Data (Noise free)
[Figure: MIMO PRBS Input Experiment Data; y1[T21], y2[T7], u1[L] (x 10^-3), and u2[V] (x 10^-3) versus Time[Min], 0 to 800]
PRBS: Specifying $\tau_{dom}^{L} = 5$, $\tau_{dom}^{H} = 33$, $\alpha_s = 2$, $\beta_s = 3$, $T_{settle}^{max} = 165$, and $T_{sampl} = 2$ min leads to $n_r = 7$, $T_{sw} = 6$ min, and $D = 168$ min. Signal magnitude set at $u_{sat} = 0.001$.

Control-Relevant Identification Methodology
[Flowchart: “Plant-Friendly” Input Design (Schroeder-Phased, PRBS, Random Binary Sequence; SIMO or MIMO); Nonparametric Estimation (DFT Analysis, High-Order ARX Estimation; SIMO or MIMO); Control-Relevant Parameter Estimation (Frequency-Weighted Curvefitting) and Controller Design]
SIMO: Single-Input, Multi-Output
MIMO: Multi-Input, Multi-Output

Model Predictive Control
[Figure: Past and Future; predicted output $\hat y(k)$ over the Prediction Horizon and input $u(k)$ over the Move Horizon; time axis k, k+1, k+2, ..., k+M, ..., k+P]
Following prediction, the MPC controller solves the following multiobjective optimization problem,
$\min_{[\Delta u(k),\ldots,\Delta u(k+m)]} \; \sum_{\ell=1}^{p} \left\| \Gamma_Y \left( y(k+\ell \,|\, k) - r(k+\ell) \right) \right\|^2 + \sum_{\ell=1}^{m} \left\| \Gamma_u \Delta u(k+\ell-1) \right\|^2$
where the first sum keeps controlled variables at setpoint and the second sum is move suppression.

Semiphysical Modeling: Brine-Water Mixing Tank Example
Consider the dynamics of a tank mixing fresh and brine flow streams
System Identification Toolbox (SITB) Graphical User Interface (GUI)

Mixing Tank Example, Continued
The first-principles model for this system is:
$V \dfrac{dc}{dt} = q_c c_c - (q_c + q_w)\, c$
Using a forward-difference approximation on the derivative leads to
$\dfrac{c(t+1) - c(t)}{T} = \dfrac{q_c(t)\, c_c(t)}{V} - \dfrac{(q_c(t) + q_w(t))\, c(t)}{V}$
which solving for c(t + 1) yields
$c(t+1) = c(t) + \dfrac{q_c(t)\, c_c(t)\, T}{V} - \dfrac{(q_c(t) + q_w(t))\, c(t)\, T}{V}$
Rearranging and consolidating terms leads to the semiphysical structure
$c(t) = \theta_1 c(t-1) + \theta_2 q_c(t-1)\, c_c(t-1) + \theta_3 q_c(t-1)\, c(t-1) + \theta_4 q_w(t-1)\, c(t-1)$
θ1, θ2, θ3, and θ4 can be estimated via linear regression.
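The linear regression for θ1 through θ4 can be sketched as below. Data are simulated from the forward-difference model itself, so the estimates should recover θ = [1, T/V, −T/V, −T/V]. The value T/V = 0.1, the random flow and concentration ranges, and the helper name `solve` are assumptions for illustration only.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# Assumed illustrative values: T/V = 0.1, random flows and brine concentration
random.seed(2)
T_over_V, N = 0.1, 300
qc = [random.uniform(0.5, 1.5) for _ in range(N)]
qw = [random.uniform(0.5, 1.5) for _ in range(N)]
cc = [random.uniform(0.8, 1.2) for _ in range(N)]
c = [0.5]
for t in range(N - 1):
    c.append(c[t] + T_over_V * (qc[t] * cc[t] - (qc[t] + qw[t]) * c[t]))

# Regressors [c, qc*cc, qc*c, qw*c] evaluated at t-1; the target is c(t)
rows = [[c[t - 1], qc[t - 1] * cc[t - 1], qc[t - 1] * c[t - 1], qw[t - 1] * c[t - 1]]
        for t in range(1, N)]
targets = c[1:]

# Normal equations (A^T A) theta = A^T y
AtA = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
Aty = [sum(r[i] * yt for r, yt in zip(rows, targets)) for i in range(4)]
theta = solve(AtA, Aty)
print([round(v, 4) for v in theta])
```

Although the model is nonlinear in the measured signals, it is linear in the parameters, which is why ordinary least squares applies once the product regressors are formed.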