# Document 13477651

```16.322 Stochastic Estimation and Control, Fall 2004
Prof. Vander Velde
Lecture 4
Last time: Left off with characteristic function.
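4. Prove φ_S(t) = ∏ φ_{X_i}(t), where S = X_1 + X_2 + ... + X_n (X_i independent)
Let S = X_1 + X_2 + ... + X_n, where the X_i are independent.
$$
\begin{aligned}
\phi_S(t) &= E\left[e^{jtS}\right] = E\left[e^{jt(X_1 + X_2 + \dots + X_n)}\right] \\
&= E\left[e^{jtX_1}\right] E\left[e^{jtX_2}\right] \cdots E\left[e^{jtX_n}\right] \\
&= \prod_{i=1}^{n} \phi_{X_i}(t)
\end{aligned}
$$
The expectation of the product factors into the product of expectations because the X_i are independent.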
This product rule is the main reason why use of the characteristic function is convenient.
It also follows, less directly, from the fact that the density function for the sum
of n independent random variables is the n-fold convolution of the individual
density functions, together with the knowledge that convolution in the
direct variable domain becomes multiplication in the transform domain.
5. MacLaurin series expansion of φ(t)
Because f(x) is non-negative and
$$\int_{-\infty}^{\infty} f(x)\,dx = 1 \quad \left(\text{or, even better, } \int_{-\infty}^{\infty} |f(x)|\,dx = 1\right),$$
it follows that the integral of |f(x)| converges, so that f(x) is Fourier transformable. Thus
the characteristic function φ(t) exists for all distributions and the inverse relation
φ(t) → f(x) holds for all distributions.
If, in addition, all the moments of the distribution exist and do not grow too quickly
with their order, φ(t) is analytic at t = 0. Then it can be expanded in a power series in t.
$$\phi(t) = \phi(0) + \phi^{(1)}(0)\,t + \frac{1}{2!}\,\phi^{(2)}(0)\,t^2 + \dots + \frac{1}{n!}\,\phi^{(n)}(0)\,t^n + \dots$$
$$\phi(t) = \int_{-\infty}^{\infty} f(x)\,e^{jtx}\,dx, \qquad \phi(0) = 1$$
$$\frac{d^n \phi(t)}{dt^n} = \int_{-\infty}^{\infty} f(x)\,(jx)^n\,e^{jtx}\,dx$$
$$\phi^{(n)}(0) = j^n \int_{-\infty}^{\infty} x^n f(x)\,dx = j^n\,\overline{X^n}$$
$$\phi(t) = 1 + j\,\overline{X}\,t + \frac{1}{2!}\,j^2\,\overline{X^2}\,t^2 + \dots + \frac{1}{n!}\,j^n\,\overline{X^n}\,t^n + \dots$$
The coefficients of the expansion are given by the moments of the distribution.
Thus the characteristic function can be determined from the moments.
Similarly, the moments can be determined from the characteristic function
directly by
$$\overline{X^n} = \frac{1}{j^n} \left. \frac{d^n \phi(t)}{dt^n} \right|_{t=0}$$
or by expanding φ(t) into its power series in some other way and identifying the
coefficients of the various powers of t.
The Generating Function
The generating function has its most useful application to random variables
which take integer values only. Examples of such would be the number of
telephone calls into a switchboard in a certain time interval, the number of cars
entering a toll station in a certain time interval, the number of times a 7 is thrown
in n tosses of 2 dice, etc.
For integer-valued random variables, the Generating Function yields the same
advantages as the Characteristic Function and is of simpler form.
Consider a random variable which takes the integer values k:
$$P(X = k) = p_k \qquad (k = 0, 1, 2, \dots)$$
For a discrete distribution you can sum in lieu of integration. The Characteristic
Function for this random variable is
$$\phi(t) = E\left[e^{jtX}\right] = \sum_{k=0}^{\infty} e^{jtk}\,p_k = \sum_{k=0}^{\infty} p_k \left(e^{jt}\right)^k$$
If we define a new variable s = e^{jt}, we have
$$G(s) = \sum_{k=0}^{\infty} p_k\,s^k$$
which is called the Generating Function. It has all the interesting properties of
the characteristic function. Note that t → 0 corresponds to s → 1 .
Let’s establish the connection between moments of a distribution and the
generating function:
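$$\frac{dG}{ds} = \sum_{k=0}^{\infty} k\,p_k\,s^{k-1}$$
$$\frac{d^2G}{ds^2} = \sum_{k=0}^{\infty} k(k-1)\,p_k\,s^{k-2} = \sum_{k=0}^{\infty} k^2\,p_k\,s^{k-2} - \sum_{k=0}^{\infty} k\,p_k\,s^{k-2}$$
Just calculate dG/ds and d^2G/ds^2 at s = 1 and reorganize them in terms of \overline{X} and \overline{X^2}:
$$\left. \frac{dG}{ds} \right|_{s=1} = \sum_{k=0}^{\infty} k\,p_k = \overline{X} \qquad \leftarrow \text{1st moment expression}$$
$$\left. \frac{d^2G}{ds^2} \right|_{s=1} = \sum_{k=0}^{\infty} k^2\,p_k - \sum_{k=0}^{\infty} k\,p_k$$
$$\overline{X^2} = \left. \frac{d^2G}{ds^2} \right|_{s=1} + \left. \frac{dG}{ds} \right|_{s=1} \qquad \leftarrow \text{2nd moment expression}$$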
The nth moment is a linear combination of the nth-order derivative of G(s) at s = 1 and the
lower-order derivatives. The generating function for the sum of independent integer-valued
variables is the product of their generating functions. This is harder to prove
than the same property of the characteristic function, but it does, in fact, hold
true.
Multiple Random Variables
To characterize a joint set of random variables X_1, ..., X_n, define the joint probability
distribution function of x = (x_1, ..., x_n):
$$F(x) = P(X_1 \le x_1,\ X_2 \le x_2,\ \dots,\ X_n \le x_n)$$
Properties:
If any of the arguments x_i goes to −∞, then F(x) → 0:
$$\lim_{\text{any } x_i \to -\infty} F(x) = 0$$
If all of the x_i go to ∞, then F(x) → 1:
$$\lim_{\text{all } x_i \to \infty} F(x) = 1$$
F ( x ) is monotonically non-decreasing in each xi.
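Define the joint density function by differentiation:
$$f(x) = \frac{\partial^n}{\partial x_1\,\partial x_2 \cdots \partial x_n}\,F(x)$$
$$f(x) \ge 0, \quad \forall x$$
$$F_{x_1 \dots x_n}(x_1, \dots, x_n) = \int_{-\infty}^{x_1} du_1 \cdots \int_{-\infty}^{x_n} du_n\; f_{x_1 \dots x_n}(u_1, \dots, u_n)$$
Setting each x_i → ∞,
$$\int_{-\infty}^{\infty} du_1 \cdots \int_{-\infty}^{\infty} du_n\; f_{x_1 \dots x_n}(u_1, \dots, u_n) = 1$$
The joint distribution of the first k of the n variables follows by letting the remaining
arguments go to ∞:
$$
\begin{aligned}
F_{x_1,\dots,x_k}(x_1, \dots, x_k) &= P(X_1 \le x_1, \dots, X_k \le x_k) \\
&= P(X_1 \le x_1, \dots, X_k \le x_k,\ X_{k+1} \le \infty, \dots, X_n \le \infty) \\
&= F_{x_1,\dots,x_n}(x_1, \dots, x_k, \infty, \dots, \infty)
\end{aligned}
$$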
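For the density function:
$$
\begin{aligned}
f_{x_1,\dots,x_k}(x_1, \dots, x_k) &= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k}\,F_{x_1,\dots,x_k}(x_1, \dots, x_k) \\
&= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k}\,F_{x_1,\dots,x_n}(x_1, \dots, x_k, \infty, \dots, \infty) \\
&= \frac{\partial^k}{\partial x_1\,\partial x_2 \cdots \partial x_k} \int_{-\infty}^{x_1} du_1 \cdots \int_{-\infty}^{x_k} du_k \int_{-\infty}^{\infty} du_{k+1} \cdots \int_{-\infty}^{\infty} du_n\; f_{x_1,\dots,x_n}(u_1, \dots, u_n) \\
&= \int_{-\infty}^{\infty} du_{k+1} \cdots \int_{-\infty}^{\infty} du_n\; f_{x_1,\dots,x_n}(x_1, \dots, x_k, u_{k+1}, \dots, u_n) \\
&= \int_{-\infty}^{\infty} dx_{k+1} \cdots \int_{-\infty}^{\infty} dx_n\; f_{x_1,\dots,x_n}(x_1, \dots, x_n)
\end{aligned}
$$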
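Marginal density
If you integrate the joint density over all variables but one, the result is referred to as
the marginal density of the remaining variable.
$$f_{x_i}(x_i) = \underbrace{\int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n}_{n-1 \text{ terms: all except } x_i}\; f_{x_1,\dots,x_n}(x_1, \dots, x_n)$$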
Mutually independent sets of random variables
Definition of independence:
$$P[X_1 \in s_1,\ X_2 \in s_2,\ \dots] = P[X_1 \in s_1]\,P[X_2 \in s_2] \cdots$$
for any sets s_1, s_2, …
The product rule holds for joint probability distribution and density functions for
independent random variables.
$$F_{x_1,x_2,x_3,\dots}(x_1, x_2, x_3, \dots) = F_{x_1}(x_1)\,F_{x_2}(x_2)\,F_{x_3}(x_3) \cdots$$
$$f_{x_1,x_2,x_3,\dots}(x_1, x_2, x_3, \dots) = f_{x_1}(x_1)\,f_{x_2}(x_2)\,f_{x_3}(x_3) \cdots$$
Expectations
$$E[g(x)] = \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; g(x)\,f(x)$$
For the sum of multiple random variables:
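$$
\begin{aligned}
E[X_1 + X_2 + \dots + X_n] &= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; (x_1 + x_2 + \dots + x_n)\, f_{x_1,\dots,x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; x_1\, f_{x_1,\dots,x_n}(x_1, \dots, x_n) + \dots + \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; x_n\, f_{x_1,\dots,x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} x_1 f_{x_1}(x_1)\,dx_1 + \int_{-\infty}^{\infty} x_2 f_{x_2}(x_2)\,dx_2 + \dots + \int_{-\infty}^{\infty} x_n f_{x_n}(x_n)\,dx_n \\
&= E[X_1] + E[X_2] + \dots + E[X_n]
\end{aligned}
$$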
This relation is true whether or not the X_i are independent.
For the product of multiple independent random variables:
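$$
\begin{aligned}
E[X_1 X_2 \cdots X_n] &= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; (x_1 x_2 \cdots x_n)\, f_{x_1,\dots,x_n}(x_1, \dots, x_n) \\
&= \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n\; (x_1 x_2 \cdots x_n)\, f_{x_1}(x_1)\, f_{x_2}(x_2) \cdots f_{x_n}(x_n) \\
&= \int_{-\infty}^{\infty} x_1 f_{x_1}(x_1)\,dx_1 \int_{-\infty}^{\infty} x_2 f_{x_2}(x_2)\,dx_2 \cdots \int_{-\infty}^{\infty} x_n f_{x_n}(x_n)\,dx_n \\
&= E[X_1]\,E[X_2] \cdots E[X_n]
\end{aligned}
$$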
```
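
The generating-function results in the notes above can be checked numerically. The following is a minimal sketch, not part of the original lecture: it uses the notes' example of counting how many times a 7 is thrown in n tosses of two dice (a binomial count with p = 1/6), and the function names and the choices n = 3 and n = 5 are illustrative assumptions. It computes the first two moments from the derivatives of G(s) at s = 1, and compares the generating and characteristic functions of a sum of two independent counts against the products of the individual functions.

```python
import cmath
from math import comb


def binomial_pmf(n, p):
    """Return [P(X=0), ..., P(X=n)] for a binomial count (hypothetical helper)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def generating_function(pmf, s):
    """G(s) = sum_k p_k s^k; s may be real or complex."""
    return sum(p_k * s**k for k, p_k in enumerate(pmf))


def characteristic_function(pmf, t):
    """phi(t) = G(e^{jt}) for an integer-valued random variable."""
    return generating_function(pmf, cmath.exp(1j * t))


def moments_from_generating_function(pmf):
    """First and second moments from dG/ds and d2G/ds2 evaluated at s = 1."""
    dG = sum(k * p_k for k, p_k in enumerate(pmf))              # sum k p_k
    d2G = sum(k * (k - 1) * p_k for k, p_k in enumerate(pmf))   # sum k(k-1) p_k
    return dG, d2G + dG  # mean, and second moment = G''(1) + G'(1)


def convolve(pmf_a, pmf_b):
    """pmf of A + B for independent integer-valued A and B."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out


# Number of 7s in n tosses of two dice: P(7) = 6/36 = 1/6 per toss.
pmf_3 = binomial_pmf(3, 1 / 6)
pmf_5 = binomial_pmf(5, 1 / 6)
pmf_sum = convolve(pmf_3, pmf_5)  # count of 7s over all 8 tosses

mean, second = moments_from_generating_function(pmf_5)

# Arbitrary test points (assumptions) for comparing both sides of the product rule.
s, t = 0.7, 1.3
g_gap = abs(
    generating_function(pmf_sum, s)
    - generating_function(pmf_3, s) * generating_function(pmf_5, s)
)
phi_gap = abs(
    characteristic_function(pmf_sum, t)
    - characteristic_function(pmf_3, t) * characteristic_function(pmf_5, t)
)

print("mean:", mean, "second moment:", second)
print("generating function gap:", g_gap)
print("characteristic function gap:", phi_gap)
```

Both gaps should be zero up to floating-point rounding, and the moments should agree with the binomial values np and np(1 − p) + (np)². The sketch uses only the standard library so that the sums in the notes map directly onto the code.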