LTI System Identification
Majid Mirbagheri, Postdoctoral Fellow, Institute for Learning and Brain Sciences

System identification
• What is system identification?
• What is the goal of system identification?
• How is system identification carried out?

Example
[Diagram: Sound Signal U(n) → Auditory Processing → + (Noise) → EEG Observations Y(n)]

Systems and signals
[Diagram: u → System T → y]
A description of the system T should specify how the output signal(s) y depend on the input signal(s) u. The signals depend on time: continuous, or at discrete time instants. Several system descriptions are available:
• continuous versus discrete time,
• time domain versus frequency domain.
Examples for LTI systems:
• frequency response function (FRF) or Bode plot,
• impulse response or step response,
• state-space model,
• transfer function.
Ronald Aarts PeSi/2/1

Frequency response
The Bode plot of a system G(iω) is the amplitude and phase plot of the complex function G as a function of the real (angular) frequency ω. It specifies for each frequency ω the input-output relation for harmonic signals (after the transient behaviour has vanished):
• The amplification equals the absolute value |G(iω)|.
• The phase shift follows from the phase angle ∠G(iω).
The output y(t) for an arbitrary input signal u(t) can be found by considering all frequencies in the input signal: the Fourier transform. Procedure (in principle): transform the input into the frequency domain, multiply by the Bode plot or FRF, and transform the result back to the time domain.

Impulse response function
The impulse response g(t) of an LTI system can also be used to compute the output y(t) for an arbitrary input signal u(t): the input signal can be considered as a sequence of impulses with some (time-dependent) amplitude u(t).
The outputs due to all impulses are added.
Continuous-time signals and systems, convolution integral:
  y(t) = (g ∗ u)(t) = ∫₀^∞ g(τ) u(t − τ) dτ
Discrete-time signals and systems, summation:
  y(k) = (g ∗ u)(k) = Σ_{l=0}^{∞} g(l) u(k − l),  k = 0, 1, 2, …

Signal characterisation
MATLAB's identification toolbox ident works with time-domain data. Even then, the frequency domain will appear to be very important. Furthermore, identification can also be applied in the frequency domain.
• Frequency content: Fourier transform
• Energy
• Power
Deterministic or stochastic signals?

Fourier transforms
Continuous-time deterministic signals u(t), Fourier integral:
  U(ω) = ∫_{−∞}^{∞} u(t) e^{−iωt} dt,   u(t) = (1/2π) ∫_{−∞}^{∞} U(ω) e^{iωt} dω
For a finite number (N) of discrete-time samples u_d(t_k), Fourier summation:
  U_N(ω_l) = Σ_{k=0}^{N−1} u_d(t_k) e^{−iω_l t_k},   u_d(t_k) = (1/N) Σ_{l=0}^{N−1} U_N(ω_l) e^{iω_l t_k}
with ω_l = (l/N) ω_s = (l/N)(2π/T_s), l = 0, …, N − 1. U_N(ω_l) is the discrete Fourier transform (DFT) of the signal u_d(t_k) with t_k = kT_s, k = 0, …, N − 1. For N equal to a power of 2, the fast Fourier transform (FFT) algorithm can be applied.

Example: 32768 data samples of the piezo mechanism (left); the DFT (right), at 16384 frequencies, computed with MATLAB's fft command. Horizontal axis in steps of 1/(total measurement time) = 0.925 Hz.
[Figure: y versus t (left); |Y_N| versus f on logarithmic axes (right).]

Energy and power (continuous signals)
Energy spectrum: Ψ_u(ω) = |U(ω)|²
Energy: E_u = ∫_{−∞}^{∞} u(t)² dt = (1/2π) ∫_{−∞}^{∞} Ψ_u(ω) dω
Power spectrum: Φ_u(ω) = lim_{T→∞} (1/T) |U_T(ω)|²
Power: P_u = lim_{T→∞} (1/T) ∫₀^T u(t)² dt = (1/2π) ∫_{−∞}^{∞} Φ_u(ω) dω
[U_T(ω) is the Fourier transform of a continuous-time signal with a finite duration T.]

Signal types
• Deterministic with finite energy: E_u = ∫ Ψ_u(ω) dω bounded; Ψ_u(ω) = |U(ω)|² bounded.
• Deterministic with finite power: P_u = ∫ Φ_u(ω) dω finite; Ψ_u(ω) = |U_T(ω)|² unbounded for T → ∞.
• Stochastic with finite power: P_u = ∫ Φ_u(ω) dω finite; Φ_u(ω) = lim_{T→∞} (1/T)|U_T(ω)|² bounded.
[Figure: example realisations of the three signal types, t = 0 … 100 s.]

Energy and power (discrete-time signals)
Energy spectrum (from the DFT): Ψ_u(ω) = |U(ω)|²
Energy: E_u = Σ_{k=−∞}^{∞} u_d(k)² = ∫_{ω_s} Ψ_u(ω) dω
Power spectrum ("periodogram"): Φ_u(ω) = lim_{N→∞} (1/N) |U_N(ω)|²
Power: P_u = lim_{N→∞} (1/N) Σ_{k=0}^{N−1} u_d(k)² = ∫_{ω_s} Φ_u(ω) dω

Convolutions
Continuous time:
  y(t) = (g ∗ u)(t) = ∫₀^∞ g(τ) u(t − τ) dτ
After Fourier transform: Y(ω) = G(ω) · U(ω)
Discrete time:
  y(k) = (g ∗ u)(k) = Σ_{l=0}^{∞} g(l) u(k − l),  k = 0, 1, 2, …
After Fourier transform: Y(ω) = G(ω) · U(ω)
Example: u is the input and g(k), k = 0, 1, 2, … is the impulse response of the system, that is, the response to an input signal that equals 1 for t = 0 and 0 elsewhere. Then, with the expressions above, y(k) is the output of the system.

Stochastic signals
A realisation of a signal x(t) is not only a function of time t, but depends also on the ensemble behaviour. An important property is the expectation E{f(x(t))}.
Examples: mean E{x(t)}; power E{(x(t) − E{x(t)})²}.
Cross-covariance: R_xy(τ) = E{[x(t) − E{x(t)}][y(t − τ) − E{y(t − τ)}]}
Autocovariance: R_x(τ) = E{[x(t) − E{x(t)}][x(t − τ) − E{x(t − τ)}]}
White noise: e(t) is not correlated with the signals e(t − τ) for any τ ≠ 0. Consequence: R_e(τ) = 0 for τ ≠ 0.

Power density or power spectral density (PSD):
  Φ_x(ω) = ∫_{−∞}^{∞} R_x(τ) e^{−iωτ} dτ,   Φ_{x_d}(ω) = Σ_{k=−∞}^{∞} R_{x_d}(k) e^{−iωkT}
With the (inverse) Fourier transform:
  R_x(τ) = (1/2π) ∫_{−∞}^{∞} Φ_x(ω) e^{iωτ} dω,   R_{x_d}(k) = (T/2π) ∫_{ω_s} Φ_{x_d}(ω) e^{iωkT} dω
Power:
  E{(x(t) − E{x(t)})²} = R_x(0) = (1/2π) ∫_{−∞}^{∞} Φ_x(ω) dω
  E{(x_d(t) − E{x_d(t)})²} = R_{x_d}(0) = (T/2π) ∫_{ω_s} Φ_{x_d}(ω) dω
[Figure: a white noise signal and its (flat) PSD.]

Systems and models
A system is defined by a number of external variables (signals) and the relations that exist between those variables (causal behaviour).
[Diagram: u → G → + (v) → y]
Signals:
• measurable input signal(s) u
• measurable output signal(s) y
• unmeasurable disturbances v (noise, non-linearities, …)

Estimators
Suppose we would like to determine a vector θ with n real coefficients whose unknown (true) values equal θ0. An estimator θ̂_N is computed from N measurements. This estimator is
• unbiased if E{θ̂_N} = θ0;
• consistent if for N → ∞ the distribution of θ̂_N approaches a δ-function, in other words the certainty of the estimator improves for increasing N. The estimator is consistent if it is unbiased and the asymptotic covariance vanishes:
  lim_{N→∞} cov θ̂_N = lim_{N→∞} E{(θ̂_N − E{θ̂_N})(θ̂_N − E{θ̂_N})ᵀ} = 0

Non-parametric (system) identification
• t-domain: impulse or step response
• f-domain: Bode plot
• These give models with "many" numbers, not models with a "small" number of parameters.
• The results are no "simple" mathematical relations.
• The results are often used to check the "simple" mathematical relations that are found with (subsequent) parametric identification.
• Non-parametric identification is often the first step.

Correlation analysis
Ident manual: Tutorial pages 3-9, 10, 15; function reference cra (4-42, 43).
[Diagram: u → G0 → + (v) → y]
  y(t) = G0(z) u(t) + v(t)
Using the impulse response g0(k), k = 0, 1, 2, … of the system G0(z):
  y(t) = Σ_{k=0}^{∞} g0(k) u(t − k) + v(t),  t = 0, 1, 2, …
So the transfer function can be written as
  G0(z) = Σ_{k=0}^{∞} g0(k) z^{−k}
⇒ Impulse response of infinite length.
⇒ Assumption that the "real" system is linear and v is a disturbance (noise, not related to the input u).
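This correlation approach can be sketched in a few lines of NumPy (a hand-rolled illustration, not the ident toolbox's cra; the impulse response g0 and the noise level are invented for the demo). With a white-noise input, R_u(τ) = σ_u² δ(τ), so the sample cross-covariance R̂_yu(τ), divided by σ_u², estimates g0(τ):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20000
g0 = np.array([0.0, 0.5, 0.25, 0.125, 0.0625])  # "true" impulse response (invented)

u = rng.standard_normal(N)        # white-noise input
v = 0.1 * rng.standard_normal(N)  # disturbance, uncorrelated with u
y = np.convolve(u, g0)[:N] + v    # y(t) = sum_k g0(k) u(t-k) + v(t)

# Sample cross-covariance R_yu(tau) = (1/N) sum_t y(t) u(t - tau),
# divided by the sample input variance sigma_u^2.
M = 8
g_hat = np.array([y[tau:] @ u[:N - tau] for tau in range(M)]) / (N * np.var(u))
print(np.round(g_hat, 2))  # close to g0, then approximately zero
```

Note the role of the white-noise assumption: for a coloured input the full Wiener-Hopf equation would have to be deconvolved instead of read off directly.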
The finite impulse response (FIR) ĝ(k), k = 0, 1, 2, …, M is a model estimate for the system G0(z) for a sufficiently high order M:
  y(t) ≈ Σ_{k=0}^{M} ĝ(k) u(t − k),  t = 0, 1, 2, …
Note: in an analysis the lower limit of the summation can be taken less than 0 (e.g. −m) to verify the (non-)existence of a non-causal relation between u(t) and y(t).
How do we compute the estimate ĝ(k)?
• u(t) and v(t) are uncorrelated (e.g. no feedback from y to u!).
• Multiply the expression y(t) = Σ g0(k) u(t − k) + v(t) by u(t − τ) and compute the expectation.
• This leads to the Wiener-Hopf equation:
  R_yu(τ) = Σ_{k=0}^{∞} g0(k) R_u(τ − k)
If u(t) is a white-noise signal, then R_u(τ) = σ_u² δ(τ), so
  g0(τ) = R_yu(τ) / σ_u²  and  ĝ(τ) = R̂_yu(τ) / σ_u²
How do we compute the estimate of the cross-covariance R̂_yu(τ) from N measurements? The sample covariance function
  R̂_yu^N(τ) = (1/N) Σ_{t=τ}^{N} y(t) u(t − τ)
is asymptotically unbiased (so for N → ∞).

Spectral analysis
Ident manual: Tutorial pages 3-10, 15, 16; function reference etfe (4-53, 54), spa (4-193–195).
[Diagram: u → G0 → + (v) → y]
  y(t) = G0(z) u(t) + v(t)
Fourier transform (without v): Y(ω) = G0(e^{iωT}) U(ω), so G0(e^{iωT}) = Y(ω)/U(ω).
Estimator for G0(e^{iωT}) using N measurements: Ĝ_N(e^{iωT}) = Y_N(ω)/U_N(ω).
Effect of v: Ĝ_N(e^{iωT}) = G0(e^{iωT}) + V_N(ω)/U_N(ω).
The estimator Ĝ_N
(a) is unbiased,
(b) has an asymptotic variance Φ_v(ω) / ((1/N)|U_N(ω)|²), which does not go to 0,
(c) is asymptotically uncorrelated for different frequencies ω.
Difficulty: for N → ∞ there is more data, but there are also estimates at more (= N/2) frequencies, all with a finite variance. Solutions:
1. Define a fixed period N0 and consider an increasing number of measurements N = rN0 with r → ∞.
Carry out the spectral analysis for each period and compute the average, to obtain a "good" estimator at N0/2 frequencies.
2. Smooth the spectrum in the f-domain.

Parametric analysis
Going from "many" to "just a few" parameters: a first step.
Idea: try to recognise "features" in the data.
• Directly as a function of u(k) and y(k).
• In the spectral models: are there "features", e.g. peaks, as expected in the Bode plots / FRF (eigenfrequency, …) of a system with a complex pole pair?
• In the impulse response (measured or identified): (1) recognise "features" (settling time, overshoot, …); (2) realisation algorithms → to be discussed next.

Intermezzo: linear regression and least-squares estimate
[Diagram: regressors φ_i → G0 → + (v) → y]
Regression:
• Prediction of a variable y on the basis of information provided by other measured variables φ1, …, φd.
• Collect φ = [φ1 … φd]ᵀ.
• Problem: find a function of the regressors g(φ) that minimises the difference y − g(φ) in some sense, so that ŷ = g(φ) is a good prediction of y.
• Example in a stochastic framework: minimise E[y − g(φ)]².
Linear regression:
• The regression function g(φ) is parameterised: it depends on a set of parameters θ = [θ1 … θd]ᵀ.
• Special case: the regression function g(φ) is linear in the parameters θ. Note that this does not imply any linearity with respect to the variables in φ.
• Special case: g(φ) = θ1φ1 + θ2φ2 + … + θdφd, so g(φ) = φᵀθ.

Linear regression — examples:
• Linear fit y = ax + b. Then g(φ) = φᵀθ with input vector φ = [x 1]ᵀ and parameter vector θ = [a b]ᵀ, so g(φ) = ax + b.
• Quadratic function y = c2x² + c1x + c0. Then g(φ) = φᵀθ with input vector φ = [x² x 1]ᵀ and parameter vector θ = [c2 c1 c0]ᵀ, so g(φ) = c2x² + c1x + c0.
[Figure: linear and quadratic fits of y versus x.]

Least-squares estimate (LSE):
• N measurements y(t), φ(t), t = 1, …, N.
• Minimise V_N(θ) = (1/N) Σ_{t=1}^{N} [y(t) − g(φ(t))]².
• So a suitable θ is θ̂_N = arg min_θ V_N(θ).
• Linear case: V_N(θ) = (1/N) Σ_{t=1}^{N} [y(t) − φᵀ(t)θ]².
Linear least-squares estimate (1):
• In the linear case the "cost" function V_N(θ) = (1/N) Σ [y(t) − φᵀ(t)θ]² is a quadratic function of θ.
• It can be minimised analytically: all partial derivatives ∂V_N(θ)/∂θ have to be zero in the minimum:
  (1/N) Σ_{t=1}^{N} 2 φ(t) [y(t) − φᵀ(t)θ] = 0
The solution of this set of equations is the parameter estimate θ̂_N.
Linear least-squares estimate (2):
• A global minimum is found for the θ̂_N that satisfies a set of linear equations, the normal equations:
  [(1/N) Σ_{t=1}^{N} φ(t)φᵀ(t)] θ̂_N = (1/N) Σ_{t=1}^{N} φ(t) y(t)
• If the matrix on the left is invertible, the LSE is
  θ̂_N = [(1/N) Σ φ(t)φᵀ(t)]^{−1} (1/N) Σ φ(t) y(t)
Linear least-squares estimate — matrix formulation:
• Collect the output measurements in the vector Y_N = [y(1) … y(N)]ᵀ, and the inputs in the N × d regression matrix Φ_N = [φᵀ(1); …; φᵀ(N)].
• Normal equations: Φ_Nᵀ Φ_N θ̂_N = Φ_Nᵀ Y_N.
• Estimate θ̂_N = Φ_N† Y_N with the (Moore-Penrose) pseudoinverse Φ_N† = (Φ_Nᵀ Φ_N)^{−1} Φ_Nᵀ. Note: Φ_N† Φ_N = I.

Example
[Figure: e(n), the envelope of a speech signal (amplitude versus time, 0–1.2 s).]
The sound signal's envelope provides the regressors φ_i(n) = e(n − i); the auditory processing, together with noise, yields the EEG observations Y(n). The estimated parameters form the impulse response h(i) = θ_i.
[Figure: estimated response h(i), amplitude of order 10⁻⁵, over 0–500 ms.]

A priori considerations
Physical insight regarding the (minimal) order of the model.
Physical insight regarding the nature of the noise disturbance.
Relation between the number of data points N and the number of parameters to be estimated Nθ:
General: N ≫ Nθ. Rule of thumb: N > 10 Nθ.
Note that the required number of points depends strongly on the signal-to-noise ratio.
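The least-squares machinery above can be sketched for the quadratic-fit example (a minimal NumPy illustration; the coefficient values and noise level are invented for the demo, and N = 200 ≫ Nθ = 3 follows the rule of thumb):

```python
import numpy as np

rng = np.random.default_rng(1)
theta0 = np.array([0.2, -1.0, 3.0])  # "true" [c2, c1, c0] (invented for the demo)

N = 200                              # N >> N_theta = 3, per the rule of thumb
x = np.linspace(0.0, 10.0, N)
Phi = np.column_stack([x**2, x, np.ones(N)])  # rows are phi^T(t) = [x^2, x, 1]
y = Phi @ theta0 + 0.1 * rng.standard_normal(N)

# Normal equations: (Phi^T Phi) theta_hat = Phi^T y
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
print(np.round(theta_hat, 1))  # close to theta0
```

Solving the normal equations directly is fine for small, well-conditioned problems; numerically it is often safer to use a least-squares routine (e.g. np.linalg.lstsq) or a QR/pseudoinverse route, which avoids squaring the condition number of Φ_N.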