2.160 System Identification, Estimation, and Learning

Lecture Notes No. 15

April 12, 2006

Part 3 System Identification

Perspective of System Identification Theory

[Block diagram: input u(t) and disturbance e(t) drive the true process S, producing output y(t); experiment design determines the data set Z^N = { u(t), y(t) }. A model is selected from the model set M by minimizing a criterion, θ̂_N = arg min_θ V_N(θ); consistency asks whether θ̂_N → θ_0.]

Key Questions:

Q1: Is a given data set informative enough to uniquely determine a model from a given model set?

Does Z^N contain sufficient information to distinguish any two models in M?

Q2: Is merely minimizing V_N(θ) good enough to obtain the true (unbiased) model?

What if the true model is not contained in the model set?

How is the model-data fitting influenced by noise characteristics and input properties?

Q3: How accurate is the estimated model?

How much variance, expected error, etc.?

How much data needed?

How should experiments be designed?

Key Results

Informative experiment and persistent excitation

Consistent (unbiased) estimate

Signal-to-noise ratio

Asymptotic variance

Input design: Pseudo-Random Binary Signal

Accuracy-variance trade-off

System order estimate: Model selection


Mathematical tools for Part 3 system identification

Discrete Fourier transform and spectral analysis

Central limit theorems

Random processes: wide-sense stationary process, ergodic process, etc.

10 Frequency Domain Analysis

10.1 Discrete Fourier Transform and Power Spectrum

Discrete Fourier transform of a sampled-data signal x(k):

X(ω) = Σ_{k=−∞}^{+∞} x(k) e^{−ikω}    (1)

Note that X(ω) is a 2π-periodic function:

X(ω + 2πn) = Σ_{k=−∞}^{+∞} x(k) e^{−i(ω+2πn)k} = Σ_{k=−∞}^{+∞} x(k) e^{−iωk} · e^{−i·2πnk}    (2)

Since e^{−i·2πnk} = 1 for all integers n and k,

X(ω + 2πn) = X(ω)    (3)

A periodic function can be expanded in a Fourier series. Therefore, we can write

∫_{−π}^{π} X(ω) e^{ikω} dω = ∫_{−π}^{π} Σ_{l=−∞}^{+∞} x(l) e^{−ilω} e^{ikω} dω = Σ_{l=−∞}^{+∞} x(l) ∫_{−π}^{π} e^{−i(l−k)ω} dω = 2π x(k)    (4)

since the last integral equals 2π for l = k and 0 for l ≠ k. This means that the inverse transform of X(ω) exists:

x(k) = (1/2π) ∫_{−π}^{π} X(ω) e^{ikω} dω    (5)
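These transform pairs are easy to check numerically. The sketch below is a minimal illustration, not part of the notes: the sequence x and the grid size are arbitrary choices. It verifies the 2π-periodicity of Eq. (3) and recovers x(k) via the inverse transform of Eq. (5), approximating the integral with a rectangle rule:

```python
import numpy as np

def dtft(x, omega):
    """X(w) = sum_k x[k] e^{-ikw} (Eq. 1), for a finite sequence x[0..K-1]."""
    k = np.arange(len(x))
    return np.sum(x * np.exp(-1j * k * omega))

x = np.array([1.0, -2.0, 0.5, 3.0])   # arbitrary test sequence

# 2*pi-periodicity (Eq. 3): X(w + 2*pi*n) = X(w)
w = 0.7
assert np.isclose(dtft(x, w), dtft(x, w + 2 * np.pi * 5))

# Inverse transform (Eq. 5): x[k] = (1/2pi) * integral of X(w) e^{ikw} over [-pi, pi],
# here a rectangle-rule sum; the mean over a uniform grid equals (1/2pi) * integral.
ws = np.linspace(-np.pi, np.pi, 2000, endpoint=False)
Xw = np.array([dtft(x, wi) for wi in ws])
for k in range(len(x)):
    xk = np.real(np.mean(Xw * np.exp(1j * k * ws)))
    assert np.isclose(xk, x[k], atol=1e-6)
```

The rectangle rule is exact here because the integrand is a trigonometric polynomial of low degree.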

Power Spectrum

Consider a deterministic, bounded sequence { s(t) } for which the following limit exists:

R_s(τ) = lim_{N→∞} (1/N) Σ_{t=1}^{N} s(t+τ) s(t)    (6)

This limit is a time average.

The power spectrum of { s(t) } is defined as the Fourier transform of the auto-covariance function R_s(τ):

Φ_s(ω) = Σ_{τ=−∞}^{+∞} R_s(τ) e^{−iτω}    (7)

Inverse Transform:

R_s(τ) = (1/2π) ∫_{−π}^{π} Φ_s(ω) e^{iτω} dω    (8)

A special case is:

lim_{N→∞} (1/N) Σ_{t=1}^{N} s(t)² = R_s(0) = (1/2π) ∫_{−π}^{π} Φ_s(ω) dω    (9)

Important! We will often use this formula, the special case τ = 0, to obtain the mean of the squared signal s(t).
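As a quick numerical check (illustrative only; the signal, ω₀, and tolerances are arbitrary choices not taken from the notes), the time-average definition of R_s(τ) and the mean-square identity of Eq. (9) can be verified for the bounded sequence s(t) = cos(ω₀ t), whose autocovariance is R_s(τ) = ½ cos(ω₀ τ):

```python
import numpy as np

# Deterministic, bounded test sequence s(t) = cos(w0 * t); w0 and N are arbitrary.
w0 = 0.9
N = 200000
t = np.arange(1, N + 1)
s = np.cos(w0 * t)

def R_s(tau):
    """Time-average autocovariance of Eq. (6), truncated at N samples."""
    if tau == 0:
        return np.mean(s * s)
    return np.mean(s[tau:] * s[:-tau])   # averages s(t + tau) * s(t)

# For this signal, R_s(tau) = 0.5 * cos(w0 * tau); check a few lags.
for tau in (0, 1, 5):
    assert abs(R_s(tau) - 0.5 * np.cos(w0 * tau)) < 1e-2

# Special case (Eq. 9): the mean of the squared signal equals R_s(0) = 0.5.
assert abs(np.mean(s**2) - 0.5) < 1e-2
```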

White Noise

We have seen “White Noise” in many sections of the previous lectures.

We defined { e(t) } as a sequence of independent random variables with zero mean and covariance λ:

E[ e(t) e(s) ] = λ for t = s, and 0 for t ≠ s    (10)

Now its characteristics are formally defined using Power Spectrum.

The auto-covariance of the random process e ( t ) is given by


R_e(τ) = lim_{N→∞} (1/N) Σ_{t=1}^{N} e(t) e(t+τ) = E[ e(t) e(t+τ) ] = λ for τ = 0, and 0 for τ ≠ 0    (11)

The left hand side of the above expression is a time average, while the right hand side is an ensemble average. If these two averages are the same, the process is called ergodic.

We assume this ergodicity for most of the processes. See the discussion at the end of these lecture notes.

For the above equation the power spectrum is given by

Φ_e(ω) = Σ_{τ=−∞}^{+∞} R_e(τ) e^{−iτω} = Σ_{τ=−∞}^{+∞} λδ(τ) e^{−iτω} = λ    (12)

[Figure: the auto-covariance R_e(τ) plotted against τ (a single spike of height λ at τ = 0), and the corresponding power spectrum Φ_e(ω) plotted against frequency (a constant line at level λ).] Note that the power spectrum plot is constant over the entire frequency range. In optics, this means that the light has a uniform distribution over the entire wavelength range, that is, “white”. This is why the random process e(t) is called “White Noise”.
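A simulation makes the flat-spectrum property concrete. This sketch is illustrative (λ, the seed, and the segment sizes are arbitrary choices): it draws Gaussian white noise of variance λ, checks that the sample autocovariance is λ at lag 0 and near zero elsewhere (Eq. 11), and checks that an averaged periodogram is flat at level λ (Eq. 12):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0                       # noise variance λ (arbitrary choice)
N = 200000
e = rng.normal(0.0, np.sqrt(lam), N)

# Sample autocovariance (Eq. 11): ≈ λ at τ = 0, ≈ 0 at other lags.
assert abs(np.mean(e * e) - lam) < 0.05
assert abs(np.mean(e[3:] * e[:-3])) < 0.05

# Averaged periodogram: Φ_e(ω) = λ, constant across frequency (Eq. 12).
segs = e.reshape(400, 500)      # 400 segments of length 500
per = np.mean(np.abs(np.fft.fft(segs, axis=1))**2 / 500, axis=0)
assert abs(per.mean() - lam) < 0.05   # level is λ
assert per.std() < 0.3                # flat, up to estimation noise
```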

A colored random signal can be created with the White noise going through a dynamical process. The following theorem plays a major role in many of the analyses involved in system identification.


Theorem

Let H(q) be the transfer function of a (BIBO) stable process driven by a white noise input e(t) of variance λ:

v(t) = H(q) e(t)    (13)

The power spectrum of v(t) is given by

Φ_v(ω) = λ | H(e^{iω}) |²    (14)

where | · | is the norm of a complex number.

[Block diagram: stationary white noise e(t), with E[ e(t) e(s) ] = λ for t = s and 0 for t ≠ s, passes through H(q) to produce v(t).]

Proof

Writing the filtered outputs as convolutions,

v(k) = Σ_{l=0}^{∞} h(l) e(k−l),   v(k−τ) = Σ_{m=0}^{∞} h(m) e(k−τ−m)    (15)

and combining these two,

R_v(τ) = lim_{N→∞} (1/N) Σ_{k=1}^{N} v(k) v(k−τ)
       = lim_{N→∞} (1/N) Σ_{k=1}^{N} [ Σ_{l=0}^{∞} h(l) e(k−l) ] [ Σ_{m=0}^{∞} h(m) e(k−τ−m) ]
       = Σ_{l=0}^{∞} Σ_{m=0}^{∞} h(l) h(m) lim_{N→∞} (1/N) Σ_{k=1}^{N} e(k−l) e(k−τ−m)

Since e(t) is white,

lim_{N→∞} (1/N) Σ_{k=1}^{N} e(k−l) e(k−τ−m) = λδ(l−τ−m) = λ for l = τ + m, and 0 otherwise

Hence

R_v(τ) = λ Σ_{l=max(0,τ)}^{∞} h(l) h(l−τ),   using h(l) = 0 for l < 0

The power spectrum is then given by


Φ_v(ω) = Σ_{τ=−∞}^{+∞} R_v(τ) e^{−iωτ} = λ Σ_{τ=−∞}^{+∞} Σ_{l=max(0,τ)}^{∞} h(l) e^{−iωl} h(l−τ) e^{iω(l−τ)}    (16)

Replacing l − τ by s,

Φ_v(ω) = λ [ Σ_{l=0}^{∞} h(l) e^{−ilω} ] [ Σ_{s=0}^{∞} h(s) e^{iωs} ] = λ H(e^{iω}) H(e^{−iω}) = λ | H(e^{iω}) |²
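The theorem is easy to check by simulation. In this sketch (illustrative only; the filter H(q) = 1 + 0.5 q⁻¹, λ, and the tolerances are arbitrary choices), white noise is filtered through a short FIR filter, the sample autocovariances are compared with R_v(τ) = λ Σ h(l) h(l−τ) from the proof, and Eq. (14) gives the spectrum Φ_v(ω) = λ(1.25 + cos ω) in closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 1.0
N = 200000
e = rng.normal(0.0, np.sqrt(lam), N)

# v(t) = H(q) e(t) with H(q) = 1 + 0.5 q^{-1} (Eq. 13), i.e. h = [1, 0.5].
h = np.array([1.0, 0.5])
v = np.convolve(e, h)[:N]

# From the proof: R_v(0) = λ(h0² + h1²) = 1.25λ and R_v(1) = λ h0 h1 = 0.5λ.
assert abs(np.mean(v * v) - 1.25 * lam) < 0.05
assert abs(np.mean(v[1:] * v[:-1]) - 0.5 * lam) < 0.05

# Eq. (14): Φ_v(ω) = λ|H(e^{iω})|² = λ(1.25 + cos ω) for this filter.
w = np.linspace(-np.pi, np.pi, 7)
H = h[0] + h[1] * np.exp(-1j * w)
assert np.allclose(lam * np.abs(H)**2, lam * (1.25 + np.cos(w)))
```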

More generally, for a stationary signal { w(t) } with spectrum Φ_w(ω) passed through a stable filter G(q),

s(t) = G(q) w(t)

the power spectrum and cross spectrum are given by

Φ_s(ω) = | G(e^{iω}) |² Φ_w(ω)    (17)

Φ_sw(ω) = G(e^{iω}) Φ_w(ω)    (18)
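Eq. (18) underlies a classic nonparametric identification trick: with spectra estimated from data, the ratio Φ_sw(ω)/Φ_w(ω) recovers the frequency response G(e^{iω}). A minimal sketch under stated assumptions (the filter g, segment length, and tolerance are arbitrary, and the input is taken white so Φ_w = 1):

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 200000, 500               # total samples and segment length (arbitrary)
w_in = rng.normal(0.0, 1.0, N)   # stationary input w(t); white here, so Φ_w = 1
g = np.array([0.8, 0.4, -0.2])   # impulse response of G(q) (arbitrary stable FIR)
s = np.convolve(w_in, g)[:N]     # s(t) = G(q) w(t)

# Cross- and auto-periodograms averaged over 400 segments.
S = np.fft.fft(s.reshape(-1, M), axis=1)
W = np.fft.fft(w_in.reshape(-1, M), axis=1)
Phi_sw = np.mean(S * np.conj(W), axis=0) / M    # estimate of Φ_sw(ω)
Phi_w = np.mean(np.abs(W)**2, axis=0) / M       # estimate of Φ_w(ω)

# Eq. (18): Φ_sw(ω) = G(e^{iω}) Φ_w(ω), so the ratio recovers G.
G_hat = Phi_sw / Phi_w
freqs = 2 * np.pi * np.arange(M) / M
G_true = g[0] + g[1] * np.exp(-1j * freqs) + g[2] * np.exp(-2j * freqs)
assert np.max(np.abs(G_hat - G_true)) < 0.1
```

Averaging over segments reduces the variance of the raw periodograms, which is what makes the ratio usable.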

10.2 Applying Spectral Analysis to System Identification

y(t) = G(q) u(t) + H(q) e(t)    (19)

where G(q) u(t) is the deterministic part and H(q) e(t) is the stochastic part.

Two issues to clarify for mathematical rigor:

1) Strictly speaking, the process is not stationary; the input u(t) drives the system. Therefore, the covariance function

R_s(τ) = lim_{N→∞} (1/N) Σ_{t=1}^{N} s(t) s(t−τ)

cannot be defined in general. The stationarity property that we need is the existence of this covariance, so we extend the definition to one that assumes only the existence of R_s(τ): Quasi-Stationary (wide-sense stationary).

Important theories and techniques of system identification are dependent upon the spectra of the involved signals, i.e. only the second-order properties, and not on any higher-order properties.


2) Deterministic and stochastic processes are mixed: { u(t) } is deterministic, while { e(t) } and { y(t) } are stochastic processes.

The covariance function for this type of variable must be defined with an ensemble mean:

R_s(τ) = lim_{N→∞} (1/N) Σ_{t=1}^{N} E[ s(t) s(t−τ) ]    (20)

If this is equivalent to

R_s(τ) = lim_{N→∞} (1/N) Σ_{t=1}^{N} s(t) s(t−τ)    (21)

the treatment will be very convenient, since we need to consider only a single realization of the stochastic process, rather than the whole ensemble.
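The equivalence of Eqs. (20) and (21) can be illustrated by simulation. In this sketch (illustrative only; the filter, seed, and sample sizes are arbitrary choices), unit-variance white noise is passed through the stable filter G(q) = 1 − 0.6 q⁻¹, and the lag-1 covariance is computed both as a time average over one long realization and as an ensemble average over many independent realizations; both approach R_s(1) = −0.6:

```python
import numpy as np

rng = np.random.default_rng(3)
g = np.array([1.0, -0.6])       # uniformly stable filter G(q) = 1 - 0.6 q^{-1}
tau = 1                          # lag at which to compare the two averages

# Time average over a single long realization (Eq. 21).
N = 100000
e = rng.normal(0.0, 1.0, N)
s = np.convolve(e, g)[:N]
time_avg = np.mean(s[tau:] * s[:-tau])

# Ensemble average at a fixed time over many realizations (Eq. 20);
# only two consecutive outputs per realization are needed for lag 1.
reals = 5000
E = rng.normal(0.0, 1.0, (reals, 3))
S = g[0] * E[:, 1:] + g[1] * E[:, :-1]   # s(t) = e(t) - 0.6 e(t-1)
ens_avg = np.mean(S[:, -1] * S[:, -2])

# Both estimates approach R_s(1) = g(0) * g(1) = -0.6.
assert abs(time_avg + 0.6) < 0.05
assert abs(ens_avg + 0.6) < 0.1
```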

This is an ergodicity problem. For dynamical systems, this ergodicity holds for signals generated through uniformly stable filters G(q):

s(t) = G(q) e(t)    (22)

Under this assumption, the theoretical boundary between stochastic and deterministic processes becomes insignificant.
