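3.3 The multinomial distribution

The multinomial distribution is in many ways the most natural distribution to consider in the context of a polytomous response variable. We introduce its properties in this section.

(a) Source

There are two derivations of the multinomial distribution. One is based on simple random sampling and the other on the conditional distribution of Poisson random variables.

1. Simple random sampling: Suppose there are $k$ attributes $A_1, A_2, \dots, A_k$. The attributes might be "color of hair", "socio-economic status", "family size", "cause of death" and so on. If the population is effectively infinitely large and a simple random sample of size $m$ is taken, the probability that $y_1, y_2, \dots, y_k$ individuals are observed to have attributes $A_1, A_2, \dots, A_k$, respectively, is

$$P(Y_1 = y_1, Y_2 = y_2, \dots, Y_k = y_k) = \frac{m!}{\prod_{j=1}^{k} y_j!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k} = \frac{m!}{y_1! \cdots y_k!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k},$$

where $\pi_j$ is the probability of attribute $A_j$, $\sum_{i=1}^{k} y_i = m$ and $0 \le y_i \le m$.

2. Conditional distribution of Poisson random variables: Let $Y_1, Y_2, \dots, Y_k$ be independent with $Y_i \sim P(\mu_i)$, $i = 1, \dots, k$. Denote

$$Y_\bullet = \sum_{i=1}^{k} Y_i, \qquad \mu_\bullet = \sum_{i=1}^{k} \mu_i, \qquad \pi_i = \frac{\mu_i}{\mu_\bullet}.$$

Then the conditional joint distribution of $Y_1, Y_2, \dots, Y_k$ given $Y_\bullet = m$ is

$$P\Big(Y_1 = y_1, Y_2 = y_2, \dots, Y_k = y_k \,\Big|\, \sum_{i=1}^{k} Y_i = m\Big) = \frac{m!}{y_1!\, y_2! \cdots y_k!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_k^{y_k}.$$

(b) Moments and cumulants

The moment generating function of the multinomial distribution is

$$M_Y(t) = M_Y(t_1, t_2, \dots, t_k) = E\Big[\exp\Big(\sum_{i=1}^{k} t_i Y_i\Big)\Big] = \Big[\sum_{i=1}^{k} \pi_i \exp(t_i)\Big]^{m}$$

and the cumulant generating function is

$$K_Y(t) = K_Y(t_1, t_2, \dots, t_k) = \log M_Y(t_1, t_2, \dots, t_k) = m \log\Big[\sum_{i=1}^{k} \pi_i \exp(t_i)\Big].$$

Then

$$E(Y_r) = \frac{\partial K_Y(t_1, t_2, \dots, t_k)}{\partial t_r}\bigg|_{t=0} = \frac{m \pi_r \exp(t_r)}{\sum_{i=1}^{k} \pi_i \exp(t_i)}\bigg|_{t=0} = m \pi_r,$$

and for $r \ne s$,

$$\mathrm{Cov}(Y_r, Y_s) = \frac{\partial^2 K_Y(t_1, t_2, \dots, t_k)}{\partial t_r\, \partial t_s}\bigg|_{t=0} = -\frac{m\, \pi_r \exp(t_r)\, \pi_s \exp(t_s)}{\big[\sum_{i=1}^{k} \pi_i \exp(t_i)\big]^2}\bigg|_{t=0} = -m \pi_r \pi_s,$$

and

$$\mathrm{Var}(Y_r) = \frac{\partial^2 K_Y(t_1, t_2, \dots, t_k)}{\partial t_r^2}\bigg|_{t=0} = \bigg[\frac{m \pi_r \exp(t_r)}{\sum_{i=1}^{k} \pi_i \exp(t_i)} - \frac{m \pi_r^2 \exp(2 t_r)}{\big[\sum_{i=1}^{k} \pi_i \exp(t_i)\big]^2}\bigg]_{t=0} = m \pi_r - m \pi_r^2 = m \pi_r (1 - \pi_r).$$

In addition, define the cumulative totals $Z_1 = Y_1$, $Z_2 = Y_1 + Y_2$, $\dots$, $Z_k = Y_1 + Y_2 + \cdots + Y_k$, that is,

$$Z = \begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_k \end{pmatrix} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_k \end{pmatrix} = LY,$$

where $L$ is a lower-triangular matrix containing unit values. Writing $r_j = \pi_1 + \pi_2 + \cdots + \pi_j$ for the cumulative probabilities, we have

$$E(Z_j) = m r_j$$

and, for $j \le l$,

$$\mathrm{Cov}(Z_j, Z_l) = m r_j (1 - r_l).$$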
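The moment formulas of part (b) can be checked by brute force for a small case. The sketch below enumerates every count vector for an illustrative choice of $m$ and $\pi$ (these values, and all function names, are assumptions made for the example and do not come from the notes), computes the exact multinomial probabilities with rational arithmetic, and compares the resulting means, variances and covariances of $Y$ and of the cumulative totals $Z$ with the closed forms.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def multinomial_pmf(y, pi):
    """Exact P(Y_1 = y_1, ..., Y_k = y_k) with index m = sum(y)."""
    coef = factorial(sum(y))
    for yj in y:
        coef //= factorial(yj)  # m! / (y_1! ... y_k!), an integer at every step
    prob = Fraction(coef)
    for yj, pj in zip(y, pi):
        prob *= pj ** yj
    return prob


# Illustrative values only (assumed for this example, not taken from the notes).
m = 4
pi = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
k = len(pi)
r = [sum(pi[: j + 1]) for j in range(k)]  # cumulative probabilities r_j

# Every count vector with nonnegative entries summing to m.
points = [y for y in product(range(m + 1), repeat=k) if sum(y) == m]
probs = [multinomial_pmf(y, pi) for y in points]


def expect(g):
    """Exact expectation of g(Y) by summing over the whole support."""
    return sum(p * g(y) for y, p in zip(points, probs))


def cov(g, h):
    """Exact covariance of g(Y) and h(Y)."""
    return expect(lambda y: g(y) * h(y)) - expect(g) * expect(h)


def Y(i):
    """Coordinate function y -> y_i (0-based index)."""
    return lambda y: y[i]


def Z(j):
    """Cumulative total y -> y_1 + ... + y_j (0-based index j)."""
    return lambda y: sum(y[: j + 1])


total = sum(probs)  # probabilities over the support
mean_ok = all(expect(Y(i)) == m * pi[i] for i in range(k))
var_ok = all(cov(Y(i), Y(i)) == m * pi[i] * (1 - pi[i]) for i in range(k))
cov_ok = all(
    cov(Y(i), Y(j)) == -m * pi[i] * pi[j]
    for i in range(k) for j in range(k) if i != j
)
cum_ok = all(
    expect(Z(j)) == m * r[j] and cov(Z(j), Z(l)) == m * r[j] * (1 - r[l])
    for j in range(k) for l in range(j, k)
)
print(total, mean_ok, var_ok, cov_ok, cum_ok)
```

Because the arithmetic uses `Fraction`, the comparisons are exact equalities rather than floating-point approximations. Full enumeration is only practical for small $m$ and $k$.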
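Note: For $j < l < t$, the conditional distribution of $Z_j$ given $Z_l = z_l$ is

$$Z_j \mid Z_l = z_l \sim B\Big(z_l, \frac{r_j}{r_l}\Big).$$

In addition,

$$Z_t - z_l \mid Z_l = z_l \sim B\Big(m - z_l, \frac{r_t - r_l}{1 - r_l}\Big).$$

Note: For a linear combination $\sum_{i=1}^{k} s_i Y_i$ with fixed coefficients $s_1, \dots, s_k$,

$$\bar{s} = E\Big(\sum_{i=1}^{k} s_i \frac{Y_i}{m}\Big) = \sum_{i=1}^{k} \pi_i s_i$$

and

$$\mathrm{Var}\Big(\sum_{i=1}^{k} s_i Y_i\Big) = m \sum_{i=1}^{k} \pi_i (s_i - \bar{s})^2 = m \bigg[\sum_{i=1}^{k} \pi_i s_i^2 - \Big(\sum_{i=1}^{k} \pi_i s_i\Big)^2\bigg].$$

(c) Marginal and conditional distributions

The multinomial distribution has the following important properties:

1. The marginal distribution of $Y_j$ is $Y_j \sim B(m, \pi_j)$.

2. The joint marginal distribution of $(Y_1, Y_2, m - Y_1 - Y_2)$ is multinomial on 3 categories with index $m$ and parameter $(\pi_1, \pi_2, 1 - \pi_1 - \pi_2)$.

3. The conditional distribution of $(Y_1, \dots, Y_{i-1}, Y_{i+1}, \dots, Y_k)$ given that $Y_i = y_i$ is multinomial with index $m - y_i$ and probabilities

$$\Big(\frac{\pi_1}{1 - \pi_i}, \dots, \frac{\pi_{i-1}}{1 - \pi_i}, \frac{\pi_{i+1}}{1 - \pi_i}, \dots, \frac{\pi_k}{1 - \pi_i}\Big).$$

4. The marginal distribution of $Z_j$ is $Z_j \sim B(m, r_j)$.

5. The conditional distribution of $Z_i$ given $Z_j = z_j$ is $B\big(z_j, r_i / r_j\big)$ for $i < j$.

6. The conditional distribution of $Y_{j+1}$ given $Z_j = z_j$ is $B\big(m - z_j, \pi_{j+1} / (1 - r_j)\big)$.

7. The multinomial distribution can be expressed as a product of $k - 1$ binomial factors,

$$P(Y_1 = y_1, \dots, Y_k = y_k) = f(y_1 \mid z_0)\, f(y_2 \mid z_1) \cdots f(y_{k-1} \mid z_{k-2}),$$

where

$$f(y_j \mid z_{j-1}) = \binom{m - z_{j-1}}{y_j} \Big(\frac{\pi_j}{1 - r_{j-1}}\Big)^{y_j} \Big(\frac{1 - r_j}{1 - r_{j-1}}\Big)^{m - z_{j-1} - y_j}$$

and $z_0 = r_0 = 0$.

8. The sequence $Z_1, \dots, Z_k$ has the Markov property. That is,

$$P(Z_j \mid Z_{j-1} = z_{j-1}, \dots, Z_1 = z_1) = P(Z_j \mid Z_{j-1} = z_{j-1}).$$

(d) Quadratic forms

To test $H_0: \pi = \pi^0 = (\pi_1^0, \pi_2^0, \dots, \pi_k^0)$, the quadratic form in the residuals (Pearson's statistic),

$$X^2 = \sum_{j=1}^{k} \frac{(Y_j - m \pi_j^0)^2}{m \pi_j^0},$$

can be used. When $m$ is large, $X^2$ is approximately distributed as $\chi^2_{k-1}$ under $H_0$.

In addition, we can use the cumulative multinomial vector $Z$, with $r_j^0 = \pi_1^0 + \cdots + \pi_j^0$, in the quadratic form

$$\frac{1}{m} \sum_{j=1}^{k-1} (Z_j - m r_j^0)^2 \Big(\frac{1}{\pi_j^0} + \frac{1}{\pi_{j+1}^0}\Big) - \frac{2}{m} \sum_{j=1}^{k-2} \frac{(Z_j - m r_j^0)(Z_{j+1} - m r_{j+1}^0)}{\pi_{j+1}^0},$$

computed under $H_0: \pi = \pi^0 = (\pi_1^0, \pi_2^0, \dots, \pi_k^0)$. Note that the above quadratic form is identical to $X^2$: since $Y_j - m \pi_j^0 = (Z_j - m r_j^0) - (Z_{j-1} - m r_{j-1}^0)$ and $Z_k - m r_k^0 = 0$, expanding the squares in $X^2$ gives exactly this expression.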
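The equality of the two quadratic forms in part (d) can be illustrated numerically. The sketch below computes Pearson's statistic directly and via the cumulative totals $Z_j$. The counts and null probabilities are made up for the illustration, and the function names `pearson_x2` and `cumulative_form` are hypothetical. Rational arithmetic is used so that the two values can be compared exactly.

```python
from fractions import Fraction
from itertools import accumulate


def pearson_x2(y, pi0):
    """Pearson's X^2 = sum_j (Y_j - m pi_j^0)^2 / (m pi_j^0)."""
    m = sum(y)
    return sum((yj - m * pj) ** 2 / (m * pj) for yj, pj in zip(y, pi0))


def cumulative_form(y, pi0):
    """The same statistic written in terms of the cumulative totals Z_j."""
    m = sum(y)
    k = len(y)
    z = list(accumulate(y))      # Z_1, ..., Z_k
    r0 = list(accumulate(pi0))   # r_1^0, ..., r_k^0
    # d[j] holds Z_{j+1} - m r_{j+1}^0 (Python indices are 0-based).
    d = [z[j] - m * r0[j] for j in range(k)]
    squares = sum(
        d[j] ** 2 * (1 / pi0[j] + 1 / pi0[j + 1]) for j in range(k - 1)
    )
    cross = sum(d[j] * d[j + 1] / pi0[j + 1] for j in range(k - 2))
    return (squares - 2 * cross) / m


# Made-up counts and null probabilities, for illustration only.
y = [18, 9, 8, 5]
pi0 = [Fraction(2, 5), Fraction(3, 10), Fraction(1, 5), Fraction(1, 10)]

x2_direct = pearson_x2(y, pi0)
x2_cumulative = cumulative_form(y, pi0)
print(x2_direct, x2_cumulative)  # both print 5/4
```

For these illustrative counts $m = 40$ and $k = 4$, so the value $5/4$ would be referred to the $\chi^2_3$ distribution.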