3.3 The multinomial distribution

The multinomial distribution is in many ways the most natural
distribution to consider in the context of a polytomous response
variable. We introduce the properties of the multinomial distribution
in this section.
(a) Source
There are two derivations of the multinomial distribution. One is based
on simple random sampling and the other is based on the conditional
distribution of Poisson random variables.
1. Simple random sampling:
Suppose there are $k$ attributes $A_1, A_2, \ldots, A_k$. The attributes might
be “color of hair”, “socio-economic status”, “family size”, “cause of
death” and so on. Let $\pi_j$ denote the proportion of the population with
attribute $A_j$. If the population is effectively infinitely large and
a simple random sample of size $m$ is taken, the probability that
$y_1, y_2, \ldots, y_k$ individuals are observed to have attributes
$A_1, A_2, \ldots, A_k$, respectively, is
\[
P(Y_1 = y_1, Y_2 = y_2, \ldots, Y_k = y_k)
= \frac{m!}{\prod_{j=1}^{k} y_j!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k}
= \frac{m!}{y_1! \cdots y_k!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k},
\]
where $\sum_{i=1}^{k} y_i = m$ and $0 \le y_i \le m$.
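The probability formula above can be checked numerically. The following is a minimal sketch, assuming illustrative probabilities and a small index $m$; the function names `multinomial_pmf` and `outcomes` are hypothetical helpers introduced here, not part of the notes.

```python
from itertools import product
from math import factorial, prod

def multinomial_pmf(y, pi):
    """P(Y_1 = y_1, ..., Y_k = y_k) with index m = sum(y) and probabilities pi."""
    m = sum(y)
    coef = factorial(m)
    for y_j in y:
        coef //= factorial(y_j)  # exact integer multinomial coefficient
    return coef * prod(p ** y_j for p, y_j in zip(pi, y))

def outcomes(m, k):
    """All count vectors (y_1, ..., y_k) with 0 <= y_i <= m and sum equal to m."""
    return [y for y in product(range(m + 1), repeat=k) if sum(y) == m]

pi = [0.5, 0.3, 0.2]  # assumed category probabilities (illustration only)
m = 4
total = sum(multinomial_pmf(y, pi) for y in outcomes(m, len(pi)))
print(multinomial_pmf((2, 1, 1), pi))  # probability of one particular outcome
print(total)                           # the probabilities sum to 1 up to rounding
```

Enumerating every outcome is only practical for small $m$ and $k$, but it makes the constraint $\sum y_i = m$ explicit.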
2. Conditional distribution of Poisson random variables:
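Let $Y_1, Y_2, \ldots, Y_k$ be independent with $Y_i \sim P(\lambda_i)$,
$i = 1, \ldots, k$. Denote
\[
Y_{\cdot} = \sum_{i=1}^{k} Y_i, \qquad
\lambda_{\cdot} = \sum_{i=1}^{k} \lambda_i, \qquad
\pi_i = \frac{\lambda_i}{\lambda_{\cdot}} .
\]
Then, the conditional joint distribution of $Y_1, Y_2, \ldots, Y_k$ given
$Y_{\cdot} = m$ is
\[
P\Bigl(Y_1 = y_1, Y_2 = y_2, \ldots, Y_k = y_k \Bigm| \sum_{i=1}^{k} Y_i = m\Bigr)
= \frac{m!}{y_1!\, y_2! \cdots y_k!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k} .
\]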
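The conditioning argument can be verified directly: divide the joint Poisson probability by the probability of the total, using the standard fact that a sum of independent Poisson variables is Poisson with mean $\sum \lambda_i$. A minimal sketch with assumed rates follows; all function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** y / factorial(y)

def conditional_poisson_pmf(y, lams):
    """P(Y_1 = y_1, ..., Y_k = y_k | sum Y_i = m) for independent Poisson counts."""
    m = sum(y)
    joint = 1.0
    for y_i, lam_i in zip(y, lams):
        joint *= poisson_pmf(y_i, lam_i)
    # The total of independent Poisson variables is Poisson with mean sum(lams).
    return joint / poisson_pmf(m, sum(lams))

def multinomial_pmf(y, pi):
    """Multinomial probability with index m = sum(y) and probabilities pi."""
    value = float(factorial(sum(y)))
    for y_i, p_i in zip(y, pi):
        value *= p_i ** y_i / factorial(y_i)
    return value

lams = [2.0, 1.5, 0.5]                   # assumed Poisson means (illustration only)
pi = [lam / sum(lams) for lam in lams]   # pi_i = lambda_i / lambda_.
y = (3, 1, 2)
print(conditional_poisson_pmf(y, lams), multinomial_pmf(y, pi))
```

The two printed values agree up to floating-point rounding, and the factor $e^{-\lambda_{\cdot}}$ cancels exactly as in the algebraic derivation.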
(b) Moments and cumulants
The moment generating function of the multinomial distribution is
\[
M_Y(t) = M_Y(t_1, t_2, \ldots, t_k)
= E\Bigl[\exp\Bigl(\sum_{i=1}^{k} t_i Y_i\Bigr)\Bigr]
= \Bigl(\sum_{i=1}^{k} \pi_i \exp(t_i)\Bigr)^{m}
\]
and the cumulant generating function is
\[
K_Y(t) = K_Y(t_1, t_2, \ldots, t_k)
= \log M_Y(t_1, t_2, \ldots, t_k)
= m \log\Bigl(\sum_{i=1}^{k} \pi_i \exp(t_i)\Bigr) .
\]
Then,
\[
E(Y_r) = \left.\frac{\partial K_Y(t_1, t_2, \ldots, t_k)}{\partial t_r}\right|_{t=0}
= \left.\frac{m \pi_r \exp(t_r)}{\sum_{i=1}^{k} \pi_i \exp(t_i)}\right|_{t=0}
= m \pi_r ,
\]
and for $r \ne s$
\[
\mathrm{Cov}(Y_r, Y_s)
= \left.\frac{\partial^2 K_Y(t_1, t_2, \ldots, t_k)}{\partial t_r \, \partial t_s}\right|_{t=0}
= \left.-\frac{m \pi_r \exp(t_r)\, \pi_s \exp(t_s)}{\bigl(\sum_{i=1}^{k} \pi_i \exp(t_i)\bigr)^{2}}\right|_{t=0}
= -m \pi_r \pi_s ,
\]
and
\[
\mathrm{Var}(Y_r)
= \left.\frac{\partial^2 K_Y(t_1, t_2, \ldots, t_k)}{\partial t_r^2}\right|_{t=0}
= \left[\frac{m \pi_r \exp(t_r)}{\sum_{i=1}^{k} \pi_i \exp(t_i)}
- \frac{m \pi_r^2 \exp(2 t_r)}{\bigl(\sum_{i=1}^{k} \pi_i \exp(t_i)\bigr)^{2}}\right]_{t=0}
= m \pi_r - m \pi_r^2
= m \pi_r (1 - \pi_r) .
\]
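In addition, define the cumulative totals
$Z_1 = Y_1,\ Z_2 = Y_1 + Y_2,\ \ldots,\ Z_k = Y_1 + Y_2 + \cdots + Y_k$, that is,
\[
Z = \begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_k \end{pmatrix}
= \begin{pmatrix}
1 & 0 & \cdots & 0 \\
1 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
1 & 1 & \cdots & 1
\end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_k \end{pmatrix}
= LY ,
\]
where $L$ is a lower-triangular matrix containing unit values. Then, with
$r_j = \pi_1 + \pi_2 + \cdots + \pi_j$ denoting the cumulative probabilities,
\[
E(Z_j) = m r_j
\]
and for $j \le l$
\[
\mathrm{Cov}(Z_j, Z_l) = m r_j (1 - r_l) .
\]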
Note:
For $j < l < t$, the conditional distribution of $Z_j$ given $Z_l = z_l$ is
\[
Z_j \mid Z_l = z_l \sim B\Bigl(z_l, \frac{r_j}{r_l}\Bigr) .
\]
In addition,
\[
Z_t - z_l \mid Z_l = z_l \sim B\Bigl(m - z_l, \frac{r_t - r_l}{1 - r_l}\Bigr) .
\]
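Note:
For a linear combination $\sum_{i=1}^{k} s_i Y_i$ with fixed coefficients
$s_1, \ldots, s_k$, define
\[
\mu_s = E\Bigl(\sum_{i=1}^{k} s_i \frac{Y_i}{m}\Bigr) = \sum_{i=1}^{k} \pi_i s_i .
\]
Then
\[
\mathrm{Var}\Bigl(\sum_{i=1}^{k} s_i Y_i\Bigr)
= m \Bigl(\sum_{i=1}^{k} \pi_i s_i^2 - \mu_s^2\Bigr)
= m \Bigl[\sum_{i=1}^{k} \pi_i s_i^2 - \Bigl(\sum_{i=1}^{k} \pi_i s_i\Bigr)^{2}\Bigr] .
\]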
(c) Marginal and conditional distributions
The multinomial distribution has the following important properties:
1. The marginal distribution of $Y_j$ is $Y_j \sim B(m, \pi_j)$.
2. The joint marginal distribution of $(Y_1, Y_2, m - Y_1 - Y_2)$ is
multinomial on 3 categories with index $m$ and parameter
$(\pi_1, \pi_2, 1 - \pi_1 - \pi_2)$.
3. The conditional distribution of $(Y_1, \ldots, Y_{i-1}, Y_{i+1}, \ldots, Y_k)$
given that $Y_i = y_i$ is multinomial with index $m - y_i$ and
probabilities
\[
\Bigl(\frac{\pi_1}{1 - \pi_i}, \ldots, \frac{\pi_{i-1}}{1 - \pi_i},
\frac{\pi_{i+1}}{1 - \pi_i}, \ldots, \frac{\pi_k}{1 - \pi_i}\Bigr) .
\]
4. The marginal distribution of $Z_j$ is $Z_j \sim B(m, r_j)$.
5. The conditional distribution of $Z_i$ given $Z_j = z_j$ is
$B\bigl(z_j, r_i / r_j\bigr)$ for $i < j$.
6. The conditional distribution of $Y_{j+1}$ given $Z_j = z_j$ is
$B\bigl(m - z_j, \pi_{j+1} / (1 - r_j)\bigr)$.
7. The multinomial distribution can be expressed as a product of $k-1$
binomial factors
\[
P(Y_1 = y_1, \ldots, Y_k = y_k)
= f(y_1 \mid z_0)\, f(y_2 \mid z_1) \cdots f(y_{k-1} \mid z_{k-2}),
\]
where
\[
f(y_j \mid z_{j-1})
= \binom{m - z_{j-1}}{y_j}
\Bigl(\frac{\pi_j}{1 - r_{j-1}}\Bigr)^{y_j}
\Bigl(\frac{1 - r_j}{1 - r_{j-1}}\Bigr)^{m - z_{j-1} - y_j}
\]
and $z_0 = r_0 = 0$.
8. The sequence $Z_1, \ldots, Z_k$ has the Markov property. That is,
\[
P(Z_j \mid Z_{j-1} = z_{j-1}, \ldots, Z_1 = z_1) = P(Z_j \mid Z_{j-1} = z_{j-1}) .
\]
(d) Quadratic forms
To test $H_0 : \pi = \pi^0 = (\pi_1^0, \pi_2^0, \ldots, \pi_k^0)$, the quadratic form
in the residuals (Pearson’s statistic),
\[
X^2 = \sum_{j=1}^{k} \frac{(Y_j - m \pi_j^0)^2}{m \pi_j^0} ,
\]
can be used. When $m$ is large, $X^2$ is approximately distributed as
$\chi^2_{k-1}$. In addition, we can also use the cumulative multinomial vector:
\[
\frac{1}{m} \sum_{j=1}^{k-1} (Z_j - m r_j^0)^2
\Bigl(\frac{1}{\pi_j^0} + \frac{1}{\pi_{j+1}^0}\Bigr)
- \frac{2}{m} \sum_{j=1}^{k-2}
\frac{(Z_j - m r_j^0)(Z_{j+1} - m r_{j+1}^0)}{\pi_{j+1}^0} ,
\]
with $r_j^0$ computed under $H_0 : \pi = \pi^0 = (\pi_1^0, \pi_2^0, \ldots, \pi_k^0)$.
Note that the above quadratic form is identical to $X^2$.
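The equality of the two quadratic forms can be checked on a small data set. The following sketch computes Pearson's statistic from the counts and then again from the cumulative totals; the observed counts and null probabilities are assumed values, and the function names are illustrative.

```python
def pearson_statistic(y, pi0):
    """X^2 = sum_j (Y_j - m*pi_j^0)^2 / (m*pi_j^0)."""
    m = sum(y)
    return sum((y_j - m * p) ** 2 / (m * p) for y_j, p in zip(y, pi0))

def cumulative_statistic(y, pi0):
    """The same quadratic form written in terms of Z_j - m*r_j^0, j = 1, ..., k-1."""
    m, k = sum(y), len(y)
    d, z, r = [], 0, 0.0
    for j in range(k - 1):
        z += y[j]    # cumulative count Z_{j+1} (0-based j)
        r += pi0[j]  # cumulative null probability r_{j+1}^0
        d.append(z - m * r)
    diag = sum(d[j] ** 2 * (1 / pi0[j] + 1 / pi0[j + 1]) for j in range(k - 1))
    cross = sum(d[j] * d[j + 1] / pi0[j + 1] for j in range(k - 2))
    return (diag - 2 * cross) / m

y, pi0 = (3, 5, 2), (0.2, 0.5, 0.3)  # assumed counts and null probabilities
print(pearson_statistic(y, pi0), cumulative_statistic(y, pi0))
```

Both calls return the same value up to rounding; for a formal test the value would be referred to the $\chi^2_{k-1}$ distribution, which is not computed here.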