

Review of Probability


Probability Theory:

 Many techniques in speech processing require the manipulation of probabilities and statistics.

 The two principal application areas we will encounter are:

 Statistical pattern recognition.

 Modeling of linear systems.


Events:

 It is customary to refer to the probability of an event.

 An event is a certain set of possible outcomes of an experiment or trial.

 Outcomes are assumed to be mutually exclusive and, taken together, to cover all possibilities.


Axioms of Probability:

To any event A we can assign a number, P(A), which satisfies the following axioms:

P(A) ≥ 0.

P(S) = 1, where S is the certain event (the set of all possible outcomes).

If A and B are mutually exclusive, then P(A+B) = P(A) + P(B).

The number P(A) is called the probability of A.

Axioms of Probability (some consequences):

Some immediate consequences:

If $\bar{A}$ is the complement of A, then $A + \bar{A} = S$ and
$$P(\bar{A}) = 1 - P(A)$$

P(∅), the probability of the impossible event, is 0.

P(A) ≤ 1.

If two events A and B are not mutually exclusive, we can show that
$$P(A+B) = P(A) + P(B) - P(AB)$$

Conditional Probability:

The conditional probability of an event A, given that event B has occurred, is defined as:
$$P(A \mid B) = \frac{P(AB)}{P(B)}$$

We can infer P(B|A) by means of Bayes' theorem:
$$P(B \mid A) = \frac{P(A \mid B)\, P(B)}{P(A)}$$
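As a quick numerical check, the sketch below simulates two dependent events and compares a direct estimate of P(B|A) with the value given by Bayes' theorem. The events and all probabilities (0.3, 0.9, 0.2) are assumed for illustration only.

```python
import numpy as np

# Minimal sketch: checking Bayes' theorem by simulation.
# Assumed toy events: B ~ "frame is voiced speech", A ~ "energy above threshold".
rng = np.random.default_rng(0)
N = 100_000
B = rng.random(N) < 0.3                  # P(B) = 0.3
A = np.where(B, rng.random(N) < 0.9,     # P(A|B)  = 0.9
                rng.random(N) < 0.2)     # P(A|~B) = 0.2

p_B_given_A = (A & B).mean() / A.mean()  # direct estimate of P(B|A)
bayes = 0.9 * 0.3 / A.mean()             # P(A|B) P(B) / P(A)
print(p_B_given_A, bayes)                # the two agree up to sampling noise
```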

Independence:

Events A and B may have nothing to do with each other; in that case they are said to be independent.

 Two events are independent if

P(AB)=P(A)P(B).

From the definition of conditional probability, two events are independent exactly when
$$P(A \mid B) = P(A), \qquad P(B \mid A) = P(B),$$
since then
$$P(AB) = P(A \mid B)\, P(B) = P(A)\, P(B)$$

Independence:

Three events A, B, and C are independent only if:
$$P(AB) = P(A)\,P(B), \qquad P(AC) = P(A)\,P(C), \qquad P(BC) = P(B)\,P(C),$$
$$P(ABC) = P(A)\,P(B)\,P(C)$$
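All four conditions are needed: pairwise independence alone does not give the product rule for all three events. A standard counterexample (assumed here, not from the slides) takes A and B to be fair coin flips and C their exclusive-or:

```python
import numpy as np

# Minimal sketch: pairwise independent events that are not jointly independent.
# A and B are fair coin flips; C = A XOR B.
rng = np.random.default_rng(1)
N = 200_000
A = rng.random(N) < 0.5
B = rng.random(N) < 0.5
C = A ^ B

print((A & B).mean(), A.mean() * B.mean())                 # ~0.25 vs 0.25
print((A & C).mean(), A.mean() * C.mean())                 # ~0.25 vs 0.25
print((A & B & C).mean(), A.mean() * B.mean() * C.mean())  # 0.0 vs ~0.125
```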

Random Variables:

 A random variable is a number chosen at random as the outcome of an experiment.

Random variables may be real or complex, and may be discrete or continuous.

In speech processing, the random variables we encounter are most often real and discrete.

We can characterize a random variable by its probability distribution or by its probability density function (pdf).

Random Variables (distribution function):

The distribution function for a random variable y is the probability that y does not exceed some value u:
$$F_y(u) = P(y \le u)$$

and
$$P(u < y \le v) = F_y(v) - F_y(u)$$

Random Variables (probability density function):

The probability density function is the derivative of the distribution:
$$f_y(u) = \frac{d}{du} F_y(u)$$

and
$$P(u < y \le v) = \int_u^v f_y(y)\, dy$$

Since $F_y(\infty) = 1$,
$$\int_{-\infty}^{\infty} f_y(y)\, dy = 1$$
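A short numerical sketch of these relations, using a standard Gaussian as an assumed example density: the pdf integrates to 1, and an interval probability can be read either from the pdf or from the distribution function.

```python
import numpy as np

# Minimal sketch: pdf/distribution relations for a standard Gaussian.
u = np.linspace(-8.0, 8.0, 20_001)
du = u[1] - u[0]
f = np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)   # pdf f_y(u)

F = np.cumsum(f) * du                        # crude distribution function F_y(u)
print(f.sum() * du)                          # integral of f over the line ~ 1.0

# P(-1 < y <= 1) two ways: integrating the pdf, and F_y(1) - F_y(-1)
mask = (u > -1) & (u <= 1)
print(f[mask].sum() * du)                    # ~0.6827
print(F[np.searchsorted(u, 1.0)] - F[np.searchsorted(u, -1.0)])  # ~0.6827
```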

Random Variables (expected value):

We can also characterize a random variable by its statistics.

The expected value of g(x) is written E{g(x)} or ⟨g(x)⟩ and is defined as:

Continuous random variable:
$$\langle g(x) \rangle = \int_{-\infty}^{\infty} g(x)\, f(x)\, dx$$

Discrete random variable:
$$\langle g(x) \rangle = \sum_x g(x)\, p(x)$$

Random Variables (moments):

The statistics of greatest interest are the moments of X.

The kth moment of X is the expected value of $X^k$.

For a discrete random variable:
$$m_k = \langle X^k \rangle = \sum_x x^k\, p(x)$$

Random Variables (mean & variance):

The first moment of X is its mean, $m_1 = \bar{x}$.

Continuous:
$$\bar{X} = \int_{-\infty}^{\infty} x\, f(x)\, dx$$

Discrete:
$$\bar{X} = \langle X \rangle = \sum_x x\, p(x)$$

The second central moment, also known as the variance of p(x), is given by
$$\sigma^2 = \sum_x (x - \bar{x})^2\, p(x) = m_2 - \bar{x}^2$$

Random Variables …:

 To estimate the statistics of a random variable, we repeat the experiment which generates the variable a large number of times.

If the experiment is run N times, then each value x will occur approximately Np(x) times; thus
$$\hat{m}_k = \frac{1}{N} \sum_{i=1}^{N} x_i^k, \qquad \hat{\bar{x}} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
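A minimal sketch of this estimator, using a fair die as an assumed example distribution; the sample moments approach the exact values E{X} = 3.5 and E{X²} = 91/6.

```python
import numpy as np

# Minimal sketch: estimating moments by repeating the experiment N times.
rng = np.random.default_rng(0)
N = 100_000
x = rng.integers(1, 7, size=N)         # fair die, p(x) = 1/6 for x in 1..6

m1_hat = x.mean()                      # (1/N) sum x_i       ~ 3.5
m2_hat = (x.astype(float)**2).mean()   # (1/N) sum x_i^2     ~ 15.167
var_hat = m2_hat - m1_hat**2           # m_2 - mean^2        ~ 2.917
print(m1_hat, m2_hat, var_hat)
```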

Random Variables (uniform density):

A random variable has a uniform density on the interval (a, b) if:
$$F_X(x) = \begin{cases} 0, & x < a \\ (x - a)/(b - a), & a \le x \le b \\ 1, & x > b \end{cases}$$
$$f_X(x) = \begin{cases} 1/(b - a), & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$

Its variance is
$$\sigma^2 = \frac{1}{12}(b - a)^2$$
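A quick sampling check of the variance formula; a = 2 and b = 5 are arbitrary assumed values.

```python
import numpy as np

# Minimal sketch: the uniform-density variance is (b - a)^2 / 12.
rng = np.random.default_rng(0)
a, b = 2.0, 5.0
x = rng.uniform(a, b, size=200_000)

print(x.var())           # sample variance
print((b - a)**2 / 12)   # 0.75, the closed form
```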

Random Variables (Gaussian density):

The Gaussian, or normal, density function is given by:
$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \mu)^2 / 2\sigma^2}$$

Random Variables (…Gaussian density):

The distribution function of a normal variable is:
$$N(x; \mu, \sigma) = \int_{-\infty}^{x} n(u; \mu, \sigma)\, du$$

If we define the error function as
$$\mathrm{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\, du$$

then
$$N(x; \mu, \sigma) = \mathrm{erf}\!\left(\frac{x - \mu}{\sigma}\right)$$
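Note that the "erf" defined above is the standard normal distribution function (often written Φ), not the math-library erf; the two are related by Φ(x) = ½(1 + erf(x/√2)). The sketch below uses that relation to evaluate N(x; μ, σ) for assumed example parameters.

```python
from math import erf, sqrt

# Minimal sketch: the slides' "erf" is the standard normal distribution
# function Phi; math.erf uses the other convention, so convert:
def phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

mu, sigma = 1.0, 2.0            # assumed example parameters
x = 3.0
print(phi((x - mu) / sigma))    # N(3; 1, 2) = Phi(1) ~ 0.8413
```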

Two Random Variables:

 If two random variables x and y are to be considered together, they can be described in terms of their joint probability density f(x, y) or, for discrete variables, p(x, y).

Two random variables are independent if
$$p(x, y) = p(x)\, p(y)$$

Two Random Variables (…continued):

Given a function g(x, y), its expected value is defined as:

Continuous:
$$\langle g(x, y) \rangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f(x, y)\, dx\, dy$$

Discrete:
$$\langle g(x, y) \rangle = \sum_x \sum_y g(x, y)\, p(x, y)$$

And the joint moment for two discrete random variables is:
$$m_{ij} = \sum_x \sum_y x^i y^j\, p(x, y)$$

Two Random Variables (…continued):

Moments are estimated in practice by averaging repeated measurements:
$$\hat{m}_{ij} = \frac{1}{N} \sum_{k=1}^{N} x_k^i\, y_k^j$$

A measure of the dependence of two random variables is their correlation, which is their joint second moment:
$$m_{11} = \langle xy \rangle = \sum_x \sum_y x\, y\, p(x, y)$$

Two Random Variables (…continued):

The joint second central moment of x, y is their covariance:
$$\sigma_{xy} = \langle (x - \bar{x})(y - \bar{y}) \rangle = m_{11} - \bar{x}\,\bar{y}$$

If x and y are independent, then their covariance is zero.

The correlation coefficient of x and y is their covariance normalized by their standard deviations:
$$r_{xy} = \frac{\sigma_{xy}}{\sigma_x\, \sigma_y}$$
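A sampling sketch of covariance and the correlation coefficient; the linear dependence of y on x (coefficient 0.6) is an assumed example.

```python
import numpy as np

# Minimal sketch: covariance and correlation coefficient from samples.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=50_000)
y = 0.6 * x + rng.normal(0.0, 1.0, size=50_000)   # y depends on x

cov_xy = ((x - x.mean()) * (y - y.mean())).mean() # sigma_xy
r_xy = cov_xy / (x.std() * y.std())               # correlation coefficient
print(cov_xy)                                     # ~0.6
print(r_xy, np.corrcoef(x, y)[0, 1])              # both ~0.51
```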

Two Random Variables (…Gaussian random variables):

Two random variables x and y (taken here as zero-mean) are jointly Gaussian if their density function is:
$$n(x, y) = \frac{1}{2\pi\, \sigma_x \sigma_y \sqrt{1 - r^2}} \exp\left\{ -\frac{1}{2(1 - r^2)} \left[ \frac{x^2}{\sigma_x^2} - \frac{2 r x y}{\sigma_x \sigma_y} + \frac{y^2}{\sigma_y^2} \right] \right\}$$

where
$$r = \frac{\sigma_{xy}}{\sigma_x\, \sigma_y}$$

Two Random Variables (…sum of random variables):

The expected value of the sum of two random variables is:
$$\langle x + y \rangle = \langle x \rangle + \langle y \rangle$$

This is true whether x and y are independent or not.

We also have:
$$\langle c\,x \rangle = c\, \langle x \rangle, \qquad \left\langle \sum_i x_i \right\rangle = \sum_i \langle x_i \rangle$$

Two Random Variables (…sum of random variables):

The variance of the sum of two independent random variables is:
$$\sigma_{x+y}^2 = \sigma_x^2 + \sigma_y^2$$

If two random variables are independent, the probability density of their sum is the convolution of the densities of the individual variables:

Continuous:
$$f_{x+y}(z) = \int_{-\infty}^{\infty} f_x(u)\, f_y(z - u)\, du$$

Discrete:
$$p_{x+y}(z) = \sum_u p_x(u)\, p_y(z - u)$$
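The discrete case in one picture: the pmf of the sum of two fair dice (an assumed example) is the convolution of the two individual pmfs, and a simulation agrees.

```python
import numpy as np

# Minimal sketch: pmf of the sum of two independent dice by convolution.
p = np.full(6, 1/6)                 # p_x = p_y: fair die on 1..6
p_sum = np.convolve(p, p)           # pmf of x + y on 2..12

rng = np.random.default_rng(0)
s = rng.integers(1, 7, 100_000) + rng.integers(1, 7, 100_000)
emp = np.bincount(s, minlength=13)[2:] / s.size

print(np.round(p_sum, 4))           # 1/36, 2/36, ..., 6/36, ..., 1/36
print(np.round(emp, 4))             # empirical pmf, close to the above
```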

Central Limit Theorem

Central Limit Theorem (informal paraphrase): if many independent random variables are summed, the probability density function (pdf) of the sum tends toward the Gaussian density, no matter what their individual densities are.
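A minimal illustration: sums of 30 independent uniform variables (an assumed choice) standardize to something very close to a Gaussian.

```python
import numpy as np

# Minimal sketch: sums of many independent uniform variables look Gaussian.
rng = np.random.default_rng(0)
n_terms = 30
sums = rng.random((100_000, n_terms)).sum(axis=1)  # each row: sum of 30 U(0,1)

# Standardize and compare a few empirical quantiles with Gaussian ones.
z = (sums - sums.mean()) / sums.std()
print(np.quantile(z, [0.16, 0.5, 0.84]))           # ~[-1, 0, 1] for a Gaussian
```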

Multivariate Normal Density

The normal density function can be generalized to any number of random variables. Let $\mathbf{X} = \mathrm{col}[X_1, X_2, \ldots, X_n]$ be the random vector; then
$$N(\mathbf{x}) = (2\pi)^{-n/2}\, |R|^{-1/2} \exp\left[ -\tfrac{1}{2}\, Q(\mathbf{x} - \bar{\mathbf{x}}) \right]$$

where
$$Q(\mathbf{x} - \bar{\mathbf{x}}) = (\mathbf{x} - \bar{\mathbf{x}})^T R^{-1} (\mathbf{x} - \bar{\mathbf{x}})$$

The matrix R is the covariance matrix of X (R is positive definite):
$$R = \left\langle (\mathbf{x} - \bar{\mathbf{x}})(\mathbf{x} - \bar{\mathbf{x}})^T \right\rangle$$
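A direct transcription of the density into code, with an assumed example mean and covariance matrix.

```python
import numpy as np

# Minimal sketch: evaluating the multivariate normal density from the formula.
def mvn_density(x, mean, R):
    n = mean.size
    d = x - mean
    Q = d @ np.linalg.inv(R) @ d   # quadratic form (x - mean)^T R^-1 (x - mean)
    return (2 * np.pi) ** (-n / 2) / np.sqrt(np.linalg.det(R)) * np.exp(-0.5 * Q)

mean = np.array([0.0, 1.0])
R = np.array([[2.0, 0.5],
              [0.5, 1.0]])         # assumed positive-definite covariance
x = np.array([0.5, 0.5])
print(mvn_density(x, mean, R))
```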

Random Functions:

 A random function is one arising as the outcome of an experiment.

Random functions need not be functions of time, but in all cases of interest to us they will be.

A discrete stochastic process is characterized by a family of probability densities of the form
$$p(x_1, x_2, x_3, \ldots, x_n;\ t_1, t_2, t_3, \ldots, t_n)$$

Random Functions:

If the individual values of the random signal are independent, then
$$p(x_1, x_2, \ldots, x_n;\ t_1, t_2, \ldots, t_n) = p(x_1, t_1)\, p(x_2, t_2) \cdots p(x_n, t_n)$$

If these individual probability densities are all the same, then we have a sequence of independent, identically distributed (i.i.d.) samples.

mean & autocorrelation

The mean is the expected value of x(t):
$$\bar{x}(t) = \langle x(t) \rangle = \sum_x x\, p(x, t)$$

The autocorrelation function is the expected value of the product $x(t_1)\, x(t_2)$:
$$r(t_1, t_2) = \langle x(t_1)\, x(t_2) \rangle = \sum_{x_1, x_2} x_1 x_2\, p(x_1, x_2;\ t_1, t_2)$$

ensemble & time average

 Mean and autocorrelation can be determined in two ways:

The experiment can be repeated many times and the average taken over all the resulting functions. Such an average is called an ensemble average.

Alternatively, we can take any one of these functions as representative of the ensemble and find the average from a number of samples of this one function. This is called a time average.

ergodicity & stationarity

 If the time average and ensemble average of a random function are the same, it is said to be ergodic .

 A random function is said to be stationary if its statistics do not change as a function of time.

 Any ergodic function is also stationary.


ergodicity & stationarity

For a stationary signal we have $\bar{x}(t) = \bar{x}$ and
$$p(x_1, x_2;\ t_1, t_2) = p(x_1, x_2;\ \tau), \qquad \tau = t_2 - t_1$$

The autocorrelation function is then:
$$r(\tau) = \sum_{x_1, x_2} x_1 x_2\, p(x_1, x_2;\ \tau)$$

ergodicity & stationarity

When x(t) is ergodic, its mean and autocorrelation are:
$$\bar{x} = \lim_{N \to \infty} \frac{1}{2N} \sum_{t=-N}^{N} x(t)$$
$$r(\tau) = \langle x(t)\, x(t + \tau) \rangle = \lim_{N \to \infty} \frac{1}{2N} \sum_{t=-N}^{N} x(t)\, x(t + \tau)$$
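A finite-record sketch of these time averages for an assumed correlated signal; r_hat is the obvious truncated-sum estimator.

```python
import numpy as np

# Minimal sketch: time-average mean and autocorrelation from one record.
rng = np.random.default_rng(0)
N = 50_000
x = rng.normal(0.0, 1.0, N)
x = x + 0.8 * np.roll(x, 1)         # correlate neighboring samples a little

def r_hat(x, tau):
    """Time-average estimate of r(tau) over the available samples."""
    return np.mean(x[: x.size - tau] * x[tau:])

print(x.mean())                                   # ~0
print([round(r_hat(x, t), 3) for t in range(4)])  # ~[1.64, 0.8, 0, 0]
```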

cross-correlation

The cross-correlation of two ergodic random functions is:
$$r_{xy}(\tau) = \langle x(t)\, y(t + \tau) \rangle = \lim_{N \to \infty} \frac{1}{2N} \sum_{t=-N}^{N} x(t)\, y(t + \tau)$$

The subscript xy indicates a cross-correlation.

Random Functions (power & cross-spectral density):

The Fourier transform of r(τ), the autocorrelation function of an ergodic random function, is called the power spectral density of x(t):
$$S(\omega) = \sum_{\tau=-\infty}^{\infty} r(\tau)\, e^{-j\omega\tau}$$

The cross-spectral density of two ergodic random functions is:
$$S_{xy}(\omega) = \sum_{\tau=-\infty}^{\infty} r_{xy}(\tau)\, e^{-j\omega\tau}$$

Random Functions (…power density):

For an ergodic signal x(t), r(τ) can be written as:
$$r(\tau) = x(\tau) * x(-\tau)$$

Then, from elementary Fourier transform properties,
$$S(\omega) = X(\omega)\, X(-\omega) = X(\omega)\, X^*(\omega) = |X(\omega)|^2$$

Random Functions (white noise):

If all values of a random signal are uncorrelated,
$$r(\tau) = \sigma^2\, \delta(\tau)$$

then this random function is called white noise.

The power spectrum of white noise is constant:
$$S(\omega) = \sigma^2$$

White noise is a mixture of all frequencies.
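A quick check of both properties, with σ = 2 as an assumed value: the sample autocorrelation is large at lag 0 and near zero elsewhere, and an averaged periodogram is roughly flat at σ².

```python
import numpy as np

# Minimal sketch: white noise has r(tau) ~ sigma^2 delta(tau) and a flat
# power spectrum (checked with a periodogram averaged over segments).
rng = np.random.default_rng(0)
sigma = 2.0
x = rng.normal(0.0, sigma, 1 << 16)

# Autocorrelation at a few lags: large at 0, near zero elsewhere.
print([round(np.mean(x[: x.size - t] * x[t:]), 2) for t in range(4)])

# Averaged periodogram |X_k|^2 / L over 256 segments: ~ sigma^2 everywhere.
segs = x.reshape(256, 256)
S = (np.abs(np.fft.rfft(segs, axis=1)) ** 2 / 256).mean(axis=0)
print(S.mean(), sigma ** 2)   # both ~4
```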

Random Signals in Linear Systems:

Let T[·] represent a linear operation; then
$$\langle T[x(t)] \rangle = T[\langle x(t) \rangle]$$

Given a system with impulse response h(n),
$$\langle y(n) \rangle = \langle x(n) * h(n) \rangle = \langle x(n) \rangle * h(n)$$

A stationary signal applied to a linear system yields a stationary output:
$$r_{yy}(\tau) = r_{xx}(\tau) * h(\tau) * h(-\tau), \qquad S_{yy}(\omega) = S_{xx}(\omega)\, |H(\omega)|^2$$
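To close, a sketch passing unit-variance white noise through a short FIR filter (the impulse response [1, 0.5, 0.25] is an assumed example) and checking that the measured output spectrum tracks |H(ω)|² S_xx.

```python
import numpy as np

# Minimal sketch: white noise through an FIR filter; the output spectrum
# should follow S_yy = S_xx |H|^2 (here S_xx = sigma^2 = 1).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 1 << 16)
h = np.array([1.0, 0.5, 0.25])            # assumed example impulse response
y = np.convolve(x, h, mode="same")        # y(n) = x(n) * h(n)

nfft = 256
segs = y[: (y.size // nfft) * nfft].reshape(-1, nfft)
S_yy = (np.abs(np.fft.rfft(segs, axis=1)) ** 2 / nfft).mean(axis=0)
H = np.fft.rfft(h, nfft)
print(np.round(S_yy[:5], 2))
print(np.round(np.abs(H[:5]) ** 2, 2))    # close to the measured S_yy
```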
