3.3 The multinomial distribution
The multinomial distribution is in many ways the most natural
distribution to consider in the context of a polytomous response
variable. We introduce the properties of the multinomial distribution
in this section.
(a) Source
There are two derivations of the multinomial distribution. One is based
on simple random sampling and the other is based on the conditional
distribution of Poisson random variables.
1. Simple random sampling:
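Suppose there are $k$ attributes $A_1, A_2, \ldots, A_k$. The attributes might
be “color of hair”, “socio-economic status”, “family size”, “cause of
death” and so on. If the population is effectively infinitely large and if
a simple random sample of size $m$ is taken, the probability that
$y_1, y_2, \ldots, y_k$ individuals are observed to have attributes
$A_1, A_2, \ldots, A_k$, respectively, is
$$P(Y_1 = y_1, Y_2 = y_2, \ldots, Y_k = y_k) = \frac{m!}{\prod_{j=1}^{k} y_j!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k} = \frac{m!}{y_1! \cdots y_k!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k},$$
where $\pi_j$ is the proportion of the population with attribute $A_j$,
$\sum_{i=1}^{k} y_i = m$ and $0 \le y_i \le m$.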
2. Conditional distribution of Poisson random variables:
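Let $Y_1, Y_2, \ldots, Y_k$ be independent with $Y_i \sim P(\mu_i)$, $i = 1, \ldots, k$. Denote
$$Y_{\cdot} = \sum_{i=1}^{k} Y_i, \qquad \mu_{\cdot} = \sum_{i=1}^{k} \mu_i, \qquad \pi_i = \frac{\mu_i}{\mu_{\cdot}}.$$
Then, the conditional joint distribution of $Y_1, Y_2, \ldots, Y_k$
given $Y_{\cdot} = m$ is
$$P\left(Y_1 = y_1, Y_2 = y_2, \ldots, Y_k = y_k \,\Big|\, \sum_{i=1}^{k} Y_i = m\right) = \frac{m!}{y_1!\, y_2! \cdots y_k!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k}.$$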
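As a numerical check that the two derivations agree, the following Python sketch evaluates the multinomial probability directly and via conditioning independent Poisson counts on their total. The function names, the Poisson means `mu` and the counts `y` are illustrative assumptions, not part of the notes.

```python
from math import exp, factorial, prod

def multinomial_pmf(y, pi):
    """P(Y_1=y_1,...,Y_k=y_k) with index m = sum(y) and probabilities pi."""
    coef = factorial(sum(y))
    for yi in y:
        coef //= factorial(yi)          # stays an integer at every step
    return coef * prod(p ** yi for p, yi in zip(pi, y))

def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson variable with mean mu."""
    return exp(-mu) * mu ** y / factorial(y)

def poisson_conditional(y, mu):
    """P(Y_1=y_1,...,Y_k=y_k | sum Y_i = m) for independent Poisson Y_i."""
    joint = prod(poisson_pmf(yi, mi) for yi, mi in zip(y, mu))
    # The total of independent Poisson variables is Poisson with mean sum(mu).
    total = poisson_pmf(sum(y), sum(mu))
    return joint / total

mu = [2.0, 3.0, 5.0]                    # hypothetical Poisson means
pi = [mi / sum(mu) for mi in mu]        # pi_i = mu_i / mu.
y = [1, 2, 3]                           # hypothetical counts, m = 6

print(multinomial_pmf(y, pi))
print(poisson_conditional(y, mu))
```

The two printed values should coincide up to floating-point rounding.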
(b) Moments and cumulants
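The moment generating function of the multinomial distribution is
$$M_Y(t) = M_Y(t_1, t_2, \ldots, t_k) = E\left[\exp\left(\sum_{i=1}^{k} t_i Y_i\right)\right] = \left[\sum_{i=1}^{k} \pi_i \exp(t_i)\right]^{m}$$
and the cumulant generating function is
$$K_Y(t) = K_Y(t_1, t_2, \ldots, t_k) = \log M_Y(t_1, t_2, \ldots, t_k) = m \log\left[\sum_{i=1}^{k} \pi_i \exp(t_i)\right].$$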
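Then,
$$E(Y_r) = \left.\frac{\partial K_Y(t_1, t_2, \ldots, t_k)}{\partial t_r}\right|_{t=0} = \left.\frac{m \pi_r \exp(t_r)}{\sum_{i=1}^{k} \pi_i \exp(t_i)}\right|_{t=0} = m \pi_r,$$
and for $r \ne s$,
$$Cov(Y_r, Y_s) = \left.\frac{\partial^2 K_Y(t_1, t_2, \ldots, t_k)}{\partial t_r\, \partial t_s}\right|_{t=0} = \left.-\frac{m\, \pi_r \exp(t_r)\, \pi_s \exp(t_s)}{\left[\sum_{i=1}^{k} \pi_i \exp(t_i)\right]^{2}}\right|_{t=0} = -m \pi_r \pi_s,$$
and
$$Var(Y_r) = \left.\frac{\partial^2 K_Y(t_1, t_2, \ldots, t_k)}{\partial t_r^2}\right|_{t=0} = \left.\left\{\frac{m \pi_r \exp(t_r)}{\sum_{i=1}^{k} \pi_i \exp(t_i)} - \frac{m \pi_r^2 \exp(2 t_r)}{\left[\sum_{i=1}^{k} \pi_i \exp(t_i)\right]^{2}}\right\}\right|_{t=0} = m \pi_r - m \pi_r^2 = m \pi_r (1 - \pi_r).$$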
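The derivatives of the cumulant generating function can be checked numerically. The sketch below approximates them at $t = 0$ by central finite differences and compares the results with $m\pi_r$, $-m\pi_r\pi_s$ and $m\pi_r(1-\pi_r)$. The step size `h`, the probabilities `pi` and the index `m` are arbitrary illustrative choices, and the function names are hypothetical.

```python
from math import exp, log

def cgf(t, pi, m):
    """K_Y(t) = m * log(sum_i pi_i * exp(t_i))."""
    return m * log(sum(p * exp(ti) for p, ti in zip(pi, t)))

def first_derivative(pi, m, r, h=1e-3):
    """Central-difference estimate of dK/dt_r at t = 0."""
    up = [0.0] * len(pi)
    down = [0.0] * len(pi)
    up[r], down[r] = h, -h
    return (cgf(up, pi, m) - cgf(down, pi, m)) / (2 * h)

def second_derivative(pi, m, r, s, h=1e-3):
    """Central-difference estimate of d^2 K / (dt_r dt_s) at t = 0."""
    def K(dr, ds):
        t = [0.0] * len(pi)
        t[r] += dr
        t[s] += ds          # when r == s the two shifts add up
        return cgf(t, pi, m)
    return (K(h, h) - K(h, -h) - K(-h, h) + K(-h, -h)) / (4 * h * h)

pi = [0.2, 0.3, 0.5]        # hypothetical category probabilities
m = 10                      # hypothetical index

print(first_derivative(pi, m, 0), m * pi[0])                    # E(Y_1)
print(second_derivative(pi, m, 0, 1), -m * pi[0] * pi[1])       # Cov(Y_1, Y_2)
print(second_derivative(pi, m, 0, 0), m * pi[0] * (1 - pi[0]))  # Var(Y_1)
```

Each line prints the finite-difference estimate followed by the closed-form value; the two should agree to several decimal places.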
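In addition, let $Z_1 = Y_1$, $Z_2 = Y_1 + Y_2$, $\ldots$, $Z_k = Y_1 + Y_2 + \cdots + Y_k$, that is,
$$Z = \begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_k \end{pmatrix} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_k \end{pmatrix} = LY,$$
where $L$ is a lower-triangular matrix containing unit values. Then, writing
$r_j = \pi_1 + \pi_2 + \cdots + \pi_j$ for the cumulative probabilities,
$$E(Z_j) = m r_j$$
and for $j \le l$,
$$Cov(Z_j, Z_l) = m r_j (1 - r_l).$$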
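The moments of the cumulative totals can be confirmed by brute force for a small example. The sketch below enumerates every outcome of a multinomial with a small index, forms the cumulative sums, and accumulates exact means and covariances for comparison with $m r_j$ and $m r_j (1 - r_l)$, where $r_j = \pi_1 + \cdots + \pi_j$. The helper names and the values of `pi` and `m` are assumptions chosen for illustration.

```python
from itertools import accumulate
from math import factorial, prod

def compositions(m, k):
    """Yield every tuple (y_1,...,y_k) of non-negative integers summing to m."""
    if k == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in compositions(m - first, k - 1):
            yield (first,) + rest

def multinomial_pmf(y, pi):
    coef = factorial(sum(y))
    for yi in y:
        coef //= factorial(yi)
    return coef * prod(p ** yi for p, yi in zip(pi, y))

def cumulative_moments(pi, m):
    """Exact mean vector and covariance matrix of Z = L Y by enumeration."""
    k = len(pi)
    mean = [0.0] * k
    second = [[0.0] * k for _ in range(k)]
    for y in compositions(m, k):
        p = multinomial_pmf(y, pi)
        z = list(accumulate(y))             # Z_j = Y_1 + ... + Y_j
        for j in range(k):
            mean[j] += p * z[j]
            for l in range(k):
                second[j][l] += p * z[j] * z[l]
    cov = [[second[j][l] - mean[j] * mean[l] for l in range(k)]
           for j in range(k)]
    return mean, cov

pi = [0.2, 0.3, 0.5]                        # hypothetical probabilities
m = 5                                       # hypothetical index
r = list(accumulate(pi))                    # cumulative probabilities r_j
mean, cov = cumulative_moments(pi, m)

print(mean[1], m * r[1])                    # E(Z_2) against m r_2
print(cov[0][1], m * r[0] * (1 - r[1]))     # Cov(Z_1, Z_2) against m r_1 (1 - r_2)
```

Enumeration is only practical for small $m$ and $k$; it is used here because it involves no sampling error.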
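Note:
For $j < l < t$, the conditional distribution of $Z_j$ given $Z_l = z_l$ is
$$Z_j \mid Z_l = z_l \sim B\left(z_l, \frac{r_j}{r_l}\right).$$
In addition,
$$Z_t - z_l \mid Z_l = z_l \sim B\left(m - z_l, \frac{r_t - r_l}{1 - r_l}\right).$$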
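The two conditional binomial results in this note can be checked exactly on a small case. The sketch below conditions on $Z_l = z_l$ by enumeration and compares the resulting distributions of $Z_j$ and $Z_t - z_l$ with the stated binomials, taking $r_j = \pi_1 + \cdots + \pi_j$. All names and numerical values are illustrative assumptions; indices in the code are 0-based.

```python
from itertools import accumulate
from math import comb, factorial, prod

def compositions(m, k):
    """Yield every tuple (y_1,...,y_k) of non-negative integers summing to m."""
    if k == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in compositions(m - first, k - 1):
            yield (first,) + rest

def multinomial_pmf(y, pi):
    coef = factorial(sum(y))
    for yi in y:
        coef //= factorial(yi)
    return coef * prod(p ** yi for p, yi in zip(pi, y))

def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def conditional_pmf(pi, m, a, b, zb):
    """Distribution of Z_a given Z_b = zb, as a dict value -> probability."""
    num, den = {}, 0.0
    for y in compositions(m, len(pi)):
        z = list(accumulate(y))
        if z[b] != zb:
            continue
        p = multinomial_pmf(y, pi)
        num[z[a]] = num.get(z[a], 0.0) + p
        den += p
    return {value: p / den for value, p in num.items()}

pi = [0.1, 0.2, 0.3, 0.4]           # hypothetical probabilities, k = 4
m, zl = 6, 3                        # hypothetical index and conditioning value
r = list(accumulate(pi))
j, l, t = 0, 1, 2                   # 0-based positions with j < l < t

below = conditional_pmf(pi, m, j, l, zl)    # Z_j | Z_l = zl
above = conditional_pmf(pi, m, t, l, zl)    # Z_t | Z_l = zl

for x in range(zl + 1):
    print(x, below[x], binom_pmf(x, zl, r[j] / r[l]))
for x in range(m - zl + 1):
    print(x, above[zl + x], binom_pmf(x, m - zl, (r[t] - r[l]) / (1 - r[l])))
```

In each printed row the enumerated conditional probability and the binomial probability should match up to rounding.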
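Note:
For the linear combination $\sum_{i=1}^{k} s_i Y_i$ with fixed constants $s_1, \ldots, s_k$,
$$\mu_s = E\left(\sum_{i=1}^{k} s_i \frac{Y_i}{m}\right) = \sum_{i=1}^{k} \pi_i s_i$$
and
$$Var\left(\sum_{i=1}^{k} s_i Y_i\right) = m\left(\sum_{i=1}^{k} \pi_i s_i^2 - \mu_s^2\right) = m\left[\sum_{i=1}^{k} \pi_i s_i^2 - \left(\sum_{i=1}^{k} \pi_i s_i\right)^{2}\right].$$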
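The variance formula for a linear combination follows from the variances and covariances in (b). As a worked check, the sketch below computes $\sum_i \sum_j s_i s_j Cov(Y_i, Y_j)$ term by term and compares it with the closed form $m(\sum_i \pi_i s_i^2 - \mu_s^2)$. The constants `s`, the probabilities `pi` and the index `m` are made-up values, and the function names are hypothetical.

```python
def variance_from_covariances(s, pi, m):
    """Var(sum s_i Y_i) as the double sum of s_i s_j Cov(Y_i, Y_j)."""
    k = len(pi)
    total = 0.0
    for i in range(k):
        for j in range(k):
            if i == j:
                cov = m * pi[i] * (1 - pi[i])   # Var(Y_i)
            else:
                cov = -m * pi[i] * pi[j]        # Cov(Y_i, Y_j), i != j
            total += s[i] * s[j] * cov
    return total

def variance_closed_form(s, pi, m):
    """m * (sum pi_i s_i^2 - mu_s^2) with mu_s = sum pi_i s_i."""
    mu_s = sum(p * si for p, si in zip(pi, s))
    return m * (sum(p * si ** 2 for p, si in zip(pi, s)) - mu_s ** 2)

s = [1.0, 2.0, 3.0]         # hypothetical constants s_i
pi = [0.2, 0.3, 0.5]        # hypothetical probabilities
m = 10                      # hypothetical index

print(variance_from_covariances(s, pi, m))
print(variance_closed_form(s, pi, m))
```

With these values $\mu_s = 2.3$ and $\sum_i \pi_i s_i^2 = 5.9$, so both expressions evaluate to $10 \times (5.9 - 5.29) = 6.1$ up to rounding.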
(c) Marginal and conditional distributions
The multinomial distribution has the following important properties:
1. The marginal distribution of $Y_j$ is $Y_j \sim B(m, \pi_j)$.
2. The joint marginal distribution of $(Y_1, Y_2, m - Y_1 - Y_2)$ is
multinomial on 3 categories with index $m$ and parameter
$(\pi_1, \pi_2, 1 - \pi_1 - \pi_2)$.
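3. The conditional distribution of $(Y_1, \ldots, Y_{i-1}, Y_{i+1}, \ldots, Y_k)$
given that $Y_i = y_i$ is multinomial with index $m - y_i$ and
probabilities
$$\left(\frac{\pi_1}{1 - \pi_i}, \ldots, \frac{\pi_{i-1}}{1 - \pi_i}, \frac{\pi_{i+1}}{1 - \pi_i}, \ldots, \frac{\pi_k}{1 - \pi_i}\right).$$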
4. The marginal distribution of $Z_j$ is $Z_j \sim B(m, r_j)$.
5. The conditional distribution of $Z_i$ given $Z_j = z_j$ is
$B\left(z_j, \dfrac{r_i}{r_j}\right)$ for $i < j$.
6. The conditional distribution of $Y_{j+1}$ given $Z_j = z_j$ is
$B\left(m - z_j, \dfrac{\pi_{j+1}}{1 - r_j}\right)$.
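7. The multinomial distribution can be expressed as a product of $k-1$
binomial factors
$$P(Y_1 = y_1, \ldots, Y_k = y_k) = f(y_1 \mid z_0)\, f(y_2 \mid z_1) \cdots f(y_{k-1} \mid z_{k-2}),$$
where
$$f(y_j \mid z_{j-1}) = \binom{m - z_{j-1}}{y_j} \left(\frac{\pi_j}{1 - r_{j-1}}\right)^{y_j} \left(\frac{1 - r_j}{1 - r_{j-1}}\right)^{m - z_{j-1} - y_j}$$
and $z_0 = r_0 = 0$.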
8. The sequence $Z_1, \ldots, Z_k$ has the Markov property. That is,
$$P(Z_j \mid Z_{j-1} = z_{j-1}, \ldots, Z_1 = z_1) = P(Z_j \mid Z_{j-1} = z_{j-1}).$$
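Property 7 lends itself to a direct numerical check. The sketch below multiplies the $k-1$ binomial factors, taking $z_0 = r_0 = 0$ and $r_j = \pi_1 + \cdots + \pi_j$, and compares the product with the multinomial probability computed from the formula in (a). The function names and the example values are assumptions made for illustration.

```python
from math import comb, factorial, prod

def multinomial_pmf(y, pi):
    coef = factorial(sum(y))
    for yi in y:
        coef //= factorial(yi)
    return coef * prod(p ** yi for p, yi in zip(pi, y))

def binomial_factor_product(y, pi):
    """Product of the k-1 binomial factors f(y_j | z_{j-1})."""
    m = sum(y)
    z_prev, r_prev = 0, 0.0                 # z_0 = r_0 = 0
    result = 1.0
    for j in range(len(y) - 1):             # only k-1 factors are needed
        n = m - z_prev                      # trials left after the first j categories
        r_j = r_prev + pi[j]
        p_in = pi[j] / (1 - r_prev)         # chance of category j among those left
        p_out = (1 - r_j) / (1 - r_prev)    # chance of a later category
        result *= comb(n, y[j]) * p_in ** y[j] * p_out ** (n - y[j])
        z_prev += y[j]
        r_prev = r_j
    return result

pi = [0.2, 0.3, 0.5]        # hypothetical probabilities
y = [1, 2, 3]               # hypothetical counts, m = 6

print(binomial_factor_product(y, pi))
print(multinomial_pmf(y, pi))
```

The last count $y_k$ needs no factor of its own because it is determined by $m - z_{k-1}$.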
(d) Quadratic forms
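In order to test $H_0: \pi = \pi^0 = (\pi_1^0, \pi_2^0, \ldots, \pi_k^0)$, the quadratic form
(Pearson’s statistic) in the residuals,
$$X^2 = \sum_{j=1}^{k} \frac{(Y_j - m \pi_j^0)^2}{m \pi_j^0},$$
can be used. When $m$ is large, $X^2$ is approximately distributed as
$\chi^2_{k-1}$. In addition, we can also use the quadratic form in the
cumulative multinomial vector $Z$,
$$\sum_{j=1}^{k-1} (Z_j - m r_j^0)^2 \left(\frac{1}{m \pi_j^0} + \frac{1}{m \pi_{j+1}^0}\right) - 2 \sum_{j=1}^{k-2} \frac{(Z_j - m r_j^0)(Z_{j+1} - m r_{j+1}^0)}{m \pi_{j+1}^0},$$
with $r_j^0 = \pi_1^0 + \cdots + \pi_j^0$ computed under
$H_0: \pi = \pi^0 = (\pi_1^0, \pi_2^0, \ldots, \pi_k^0)$. Note that
the above quadratic form is identical to $X^2$.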
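The identity between Pearson’s statistic and the quadratic form in the cumulative vector can be verified numerically. The sketch below computes $X^2$ from the residuals $Y_j - m\pi_j^0$ and then from the cumulative deviations $Z_j - m r_j^0$, with $r_j^0 = \pi_1^0 + \cdots + \pi_j^0$. The observed counts and null probabilities are invented for illustration, and the function names are hypothetical.

```python
from itertools import accumulate

def pearson_x2(y, pi0):
    """X^2 = sum_j (Y_j - m pi0_j)^2 / (m pi0_j)."""
    m = sum(y)
    return sum((yj - m * p) ** 2 / (m * p) for yj, p in zip(y, pi0))

def cumulative_quadratic_form(y, pi0):
    """The same statistic written in terms of Z_j - m r0_j."""
    m, k = sum(y), len(y)
    z = list(accumulate(y))                 # cumulative counts Z_j
    r0 = list(accumulate(pi0))              # cumulative null probabilities
    d = [zj - m * rj for zj, rj in zip(z, r0)]
    squares = sum(d[j] ** 2 * (1 / (m * pi0[j]) + 1 / (m * pi0[j + 1]))
                  for j in range(k - 1))
    cross = sum(d[j] * d[j + 1] / (m * pi0[j + 1]) for j in range(k - 2))
    return squares - 2 * cross

y = [12, 25, 63]            # hypothetical observed counts, m = 100
pi0 = [0.2, 0.3, 0.5]       # hypothetical null probabilities

print(pearson_x2(y, pi0))
print(cumulative_quadratic_form(y, pi0))
```

The agreement holds because $Y_j - m\pi_j^0$ is the difference of consecutive cumulative deviations, with the deviation at $j = 0$ and at $j = k$ both equal to zero.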