Binomial, Multinomial & Poisson
Stat 557 Fall 2012

Outline
• Coverage of C.I. for π
• Multinomial
• Poisson
• Sampling Types
• Contingency Tables

Confidence Intervals
• Wald C.I., with $\hat p = Y/n$:
  $$\hat p \pm z_{\alpha/2} \sqrt{\tfrac{1}{n}\, \hat p (1 - \hat p)}$$
  Example: $p_U = 0.6 + 1.96 \cdot \sqrt{0.6 \cdot 0.4 / 115} = 0.6 + 0.0895 = 0.6895$
• Exact C.I.: $p_L$ and $p_U$ solve
  $$\frac{\alpha}{2} = P(Y \ge y) = \sum_{j=y}^{n} \binom{n}{j} p^j (1-p)^{n-j} \quad \text{(for } p = p_L\text{)}$$
  $$\frac{\alpha}{2} = P(Y \le y) = \sum_{j=0}^{y} \binom{n}{j} p^j (1-p)^{n-j} \quad \text{(for } p = p_U\text{)}$$
  which gives
  $$p_L = \left[ 1 + \frac{n-y+1}{y\, F_{2y,\,2(n-y+1)}(1-\alpha/2)} \right]^{-1}$$
  $$p_U = \left[ 1 + \frac{n-y}{(y+1)\, F_{2(y+1),\,2(n-y)}(\alpha/2)} \right]^{-1}$$
  where $F_{a,b}(c)$ is the value with right-tail probability $c$ under the F distribution with $a$ and $b$ degrees of freedom.

Coverage
Definition: for a fixed value of a parameter, the actual coverage probability of an interval estimator is the probability that the interval contains the parameter:
$$C(p, n) = \sum_{j=0}^{n} I(j, p) \binom{n}{j} p^j (1-p)^{n-j}$$
• $I(j, p)$ is 1 if the interval contains $p$ for observation $j$, and 0 otherwise

$$M \le 1.96 \cdot \sqrt{\tfrac{1}{n}\, p (1 - p)}$$

Wald & Exact 95% C.I.s
[Figure: Coverage of Confidence Intervals for Binomial. Panels: 5, 10, 50; x-axis: p; y-axis: coverage; methods: Wald, Score, Exact, adj_Wald]
• Wald is excessively liberal
• Clopper-Pearson (Exact) is impractically conservative

Score
• Invert normal test, using null rather than estimated standard error:
  $$\frac{\hat p - p_o}{\sqrt{\tfrac{1}{n}\, p_o (1 - p_o)}} = \pm z_{\alpha/2}$$
• Solving for $p_o$ gives the interval
  $$\left( \hat p + \frac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2} \sqrt{\left[ \hat p (1 - \hat p) + \frac{z_{\alpha/2}^2}{4n} \right] / n} \right) \left( 1 + \frac{z_{\alpha/2}^2}{n} \right)^{-1}$$
  Wilson, 1927

Adjusted Wald
• 'add x failures and x successes', good values for x = 1, 2 (shown for x = 2):
  $$\tilde p = \frac{y+2}{n+4}, \qquad \tilde p \pm z_{\alpha/2} \sqrt{\tfrac{1}{n}\, \tilde p (1 - \tilde p)}$$
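The coverage definition C(p, n) can be evaluated exactly, because it is a finite sum over the n + 1 possible outcomes. The following is a minimal sketch, not the code behind the coverage figures: the function names are hypothetical, z is fixed at the rounded value 1.96, and only the Wald and Score intervals are included.

```python
from math import comb, sqrt

Z_975 = 1.96  # rounded 0.975 standard normal quantile (assumption: 95% intervals)


def wald_interval(y, n, z=Z_975):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = y / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def score_interval(y, n, z=Z_975):
    """Score (Wilson) interval, using the closed-form solution for p_o."""
    p_hat = y / n
    centre = p_hat + z**2 / (2 * n)
    half_width = z * sqrt((p_hat * (1 - p_hat) + z**2 / (4 * n)) / n)
    scale = 1 + z**2 / n
    return (centre - half_width) / scale, (centre + half_width) / scale


def coverage(interval, p, n):
    """C(p, n): binomial probability of all outcomes j whose interval contains p."""
    total = 0.0
    for j in range(n + 1):
        lower, upper = interval(j, n)
        if lower <= p <= upper:  # indicator I(j, p)
            total += comb(n, j) * p**j * (1 - p) ** (n - j)
    return total


if __name__ == "__main__":
    # Compare the two methods at a few values of p for n = 10.
    for p in (0.1, 0.3, 0.5):
        print(p,
              round(coverage(wald_interval, p, 10), 4),
              round(coverage(score_interval, p, 10), 4))
```

Passing the interval function as an argument keeps `coverage` independent of the method, so an exact or adjusted Wald interval could be plugged in the same way.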
Score, adjusted Wald
[Figure: Coverage of Confidence Intervals for Binomial. Panels: 5, 10, 50; x-axis: p; y-axis: coverage; methods: Wald, Score, Exact, adj_Wald]
• Score C.I. has average coverage close to nominal value
• adjusted Wald is a 'simple fix', makes liberal Wald slightly more conservative than Score
see also: Agresti & Coull, American Statistician, 1998

Multinomial Distribution
• Series of n independent and identical trials, where the outcome for each trial falls into one of K mutually exclusive categories with
  $$\pi_j = P(Y_i = j), \quad 1 \le j \le K, \; 1 \le i \le n$$
• then $\sum_{i=1}^{K} \pi_i = 1$

Multinomial
• Let $y_j$ be the number of times that we observe outcome j, then $0 \le y_j \le n$ (for all j) and the sum of the $y_j$ is n
• $$P(Y_1 = y_1, Y_2 = y_2, ..., Y_K = y_K) = \frac{n!}{y_1!\, y_2! \cdots y_K!}\, \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_K^{y_K}$$

• $H_a: 0 \le \pi_i \le 1$ with $\pi_1 + \pi_2 + \pi_3 = 1$; $H_o: \pi = \pi_o$ with $\pi_o = (0.75, 0.1875, 0.0625)$
• $G^2 = 2 \sum_i y_i \log \frac{m_i}{m_{0,i}} = 0.74$, $\quad X^2 = \sum_i \frac{(m_i - m_{0,i})^2}{m_{0,i}} = 0.69$

Multinomial
• Marginals $Y_i \sim B_{n,\pi_i}$ have Binomial distributions, and
  $$E[Y_i] = n\pi_i, \qquad Var[Y_i] = n\pi_i(1 - \pi_i), \qquad Cov(Y_i, Y_j) = -n\pi_i\pi_j \quad (i \ne j)$$
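The multinomial probability mass function and the moment formulas for the marginals can be checked by brute force for small n. This is an illustrative sketch only: the helper names are hypothetical, and n = 4 with π = (0.75, 0.1875, 0.0625) is an arbitrary choice that reuses the probabilities appearing in the slides.

```python
from itertools import product
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """P(Y1 = y1, ..., YK = yK) = n! / (y1! ... yK!) * prod(pi_j ** y_j)."""
    coefficient = factorial(sum(counts))
    for y in counts:
        coefficient //= factorial(y)  # stays an integer at every step
    return coefficient * prod(p**y for p, y in zip(probs, counts))


def outcomes(n, k):
    """Yield every count vector of length k with non-negative entries summing to n."""
    for head in product(range(n + 1), repeat=k - 1):
        if sum(head) <= n:
            yield head + (n - sum(head),)


def moments(n, probs):
    """Mean vector and covariance matrix obtained by enumerating all outcomes."""
    k = len(probs)
    mean = [0.0] * k
    second = [[0.0] * k for _ in range(k)]
    for y in outcomes(n, k):
        weight = multinomial_pmf(y, probs)
        for i in range(k):
            mean[i] += weight * y[i]
            for j in range(k):
                second[i][j] += weight * y[i] * y[j]
    cov = [[second[i][j] - mean[i] * mean[j] for j in range(k)] for i in range(k)]
    return mean, cov


if __name__ == "__main__":
    n, pi = 4, (0.75, 0.1875, 0.0625)
    mean, cov = moments(n, pi)
    print(mean)       # compare with n * pi_i
    print(cov[0][1])  # compare with -n * pi_1 * pi_2
```

Enumeration grows quickly with n and K, so this is only meant as a check of the formulas, not as a way to compute moments in practice.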
Let $L(\pi_1, \pi_2, ..., \pi_k) = \prod_{i=1}^{k} \pi_i^{n_i}$ be the multinomial likelihood. With the additional constraint $\sum_{i=1}^{k} \pi_i = 1$, the likelihood function is actually
$$L(\pi_1, \pi_2, ..., \pi_{k-1}) = \prod_{i=1}^{k-1} \pi_i^{n_i} \cdot \left( 1 - \sum_{i=1}^{k-1} \pi_i \right)^{n_k}$$

• limiting distribution:
  $$\sqrt{n}\,(p - \pi) \longrightarrow N(0, \; diag(\pi) - \pi\pi^t)$$
  with $p = (n_1/n, n_2/n, ..., n_K/n)$

Poisson Distribution
Simeon Poisson, 1781-1840
• X = # of insects trapped overnight in tent,
  X = # of mosquito bites in an hour,
  X = # of stranded vehicles in 10 km of I-35
  ⇒ no upper limit for n
• X in {0, 1, 2, 3, 4, ...} (no upper limit)
• Poisson probability mass function
  $$P(X = x) = p(x) = e^{-\lambda} \frac{\lambda^x}{x!}$$
  where λ is the rate parameter (remember that the rate depends on the unit used).
• $E[X] = \lambda$, $\; Var[X] = \lambda$, $\;$ skewness $= 1/\sqrt{\lambda}$, $\;$ (excess) kurtosis $= 1/\lambda$

Poisson distribution
[Figure: Poisson distribution. Panels: lambda = 1, lambda = 3, lambda = 5, lambda = 10]
• approximates Binomial for large n, small p
• Normal approximation holds for large values of lambda

Properties of Poisson
Relationship to Multinomial
• Let Y1, ..., YK be ind. Poisson with parameters µ1, ..., µK
• Y1 + ... + YK is Poisson with parameter ∑µi

Contingency Tables
Let X and Y be two categorical variables, with I, J categories respectively and X ∈ {x1, ..., xI}, Y ∈ {y1, ..., yJ}. Then the pair (X, Y) is a categorical variable with IJ outcomes.
• X and Y are categorical variables with I and J categories:

  X\Y   y1    y2    ...   yJ
  x1    n11   n12   ...   n1J   n1.
  x2    n21   n22   ...   n2J   n2.
  ...   ...   ...   ...   ...   ...
  xI    nI1   nI2   ...   nIJ   nI.
        n.1   n.2   ...   n.J   n
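Returning to the Poisson properties listed earlier, two of them can be checked numerically: the sum of independent Poisson counts is again Poisson, and the Poisson approximates the Binomial for large n and small p. The sketch below uses hypothetical function names, and the rates (1 and 3) and the Binomial setting (n = 1000, p = 0.003) are arbitrary illustrative choices.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) = exp(-lam) * lam**x / x!"""
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(x, n, p):
    """P(Y = x) for Y ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_sum_pmf(s, lam1, lam2):
    """P(Y1 + Y2 = s) for independent Poisson Y1, Y2, by direct convolution."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(s - j, lam2) for j in range(s + 1))


if __name__ == "__main__":
    # Sum of Poisson(1) and Poisson(3) against Poisson(1 + 3)
    for s in range(6):
        print(s, poisson_sum_pmf(s, 1.0, 3.0), poisson_pmf(s, 4.0))

    # Binomial(1000, 0.003) against Poisson(n * p = 3)
    for x in range(6):
        print(x, binomial_pmf(x, 1000, 0.003), poisson_pmf(x, 3.0))
```

The convolution check works for any number of terms by applying it repeatedly, which is the content of the statement that Y1 + ... + YK is Poisson with parameter ∑µi.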
Contingency Tables
The table of counts is called a two dimensional contingency table or cross-classification table.
• X and Y are categorical variables with I and J categories; if $\pi_{ij}$ is the probability of pair $(x_i, y_j)$, then the table is

  X\Y   y1    y2    ...   yJ
  x1    π11   π12   ...   π1J   π1.
  x2    π21   π22   ...   π2J   π2.
  ...   ...   ...   ...   ...   ...
  xI    πI1   πI2   ...   πIJ   πI.
        π.1   π.2   ...   π.J   1

Poisson vs Multinomial Sampling
• Poisson Sampling:
  • each cell $n_{ij}$ is assumed to be Poisson distributed
  • Overall sum is random
• Multinomial
  • If overall sum is fixed, conditional probabilities become multinomial

Product Multinomial Sampling
• one of the margins in the contingency table is fixed
• e.g. rare disease, 'pairing' of combinations
• given the fixed margins, the other direction still has a multinomial distribution, resulting in a product of multinomials.
• set-up of traditional case-control study

Margins are fixed
• Both margins are fixed
• less frequent in studies, more frequent in inferential methods
• Hypergeometric distribution

Example: Cholesterol/Heart Disease
• 1329 patients of same age/sex

                        Coronary Disease
  Cholesterol (mg/l)    present      absent
  ≤ 220                 y11 = 20     y12 = 553
  > 220                 y21 = 72     y22 = 684

Cholesterol/Heart Disease
• Y = (y11, y12, y21, y22) ~ Mult(1329, π)
• Is incidence of heart disease independent of cholesterol levels? i.e. is incidence rate of heart disease the same for both levels of cholesterol?

Cholesterol/Heart Disease
• Ho: P(heart disease | Cholesterol ≤ 220) = P(heart disease | Cholesterol > 220)
• equivalent to
  $$\frac{\pi_{11}}{\pi_{11} + \pi_{12}} = \frac{\pi_{21}}{\pi_{21} + \pi_{22}}$$
• equivalent to $\pi_{ij} = \pi_{i+} \cdot \pi_{+j}$
• $\chi^2_{1,0.05} = 3.84$, $\; \chi^2_{1,0.01} = 6.634897$
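Under the independence hypothesis $\pi_{ij} = \pi_{i+} \cdot \pi_{+j}$, the estimated expected count in each cell is (row total × column total) / n. The sketch below applies this to the cholesterol table and then forms Pearson's X² and the likelihood ratio statistic G² in their standard forms; the function names are hypothetical, and the G² helper assumes every observed count is positive.

```python
from math import log

# Observed counts: rows are cholesterol <= 220 and > 220,
# columns are coronary disease present and absent.
observed = [[20, 553], [72, 684]]


def expected_under_independence(table):
    """Expected counts m_ij = (row total i) * (column total j) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_x2(table, expected):
    """X^2 = sum over cells of (observed - expected)^2 / expected."""
    return sum((y - m) ** 2 / m
               for row, exp_row in zip(table, expected)
               for y, m in zip(row, exp_row))


def likelihood_ratio_g2(table, expected):
    """G^2 = 2 * sum over cells of observed * log(observed / expected); needs observed > 0."""
    return 2 * sum(y * log(y / m)
                   for row, exp_row in zip(table, expected)
                   for y, m in zip(row, exp_row))


if __name__ == "__main__":
    expected = expected_under_independence(observed)
    for row in expected:
        print([round(m, 2) for m in row])
    print(round(pearson_x2(observed, expected), 1))
    print(round(likelihood_ratio_g2(observed, expected), 1))
```

Both statistics can then be compared with the chi-square critical values for 1 degree of freedom, 3.84 and 6.634897.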
Cholesterol/Heart Disease
• Expected Values under independence

                        Coronary Disease
  Cholesterol (mg/l)    present    absent     Total
  ≤ 220                 39.67      533.33     573
  > 220                 52.33      703.67     756
  Total                 92         1237       1329

Cholesterol/Heart Disease
• loglikelihood ratio test G² = 19.8
• Pearson test X² = 18.4
• under Ho both G² and X² are approximately chi-square distributed with df = 3 - 2 = 1
independence seems to be violated

Visualizing Contingency Tables
Mosaicplots
• Area plots (i.e. area represents # combinations)
• Built hierarchically, i.e. order of variables matters
• in R:
  mosaicplot() (base package)
  imosaic() (iplots package)
  productplots package
http://cran.r-project.org/doc/contrib/Short-refcard.pdf

Odds ratio

         X=0   X=1
  Y=0    a     b
  Y=1    c     d

• Measure of association between X and Y
• odds ratio = (ad)/(bc)
• odds ratio = 1 is independence
• log odds ratio is symmetric around 0
• log odds ratio is approx. Normal with variance 1/a + 1/b + 1/c + 1/d

Visualizing Associations
[Figure: Reading the Odds. Probability scale against log odds scale]

Odds ratios in 2 x 2 x K tables
X and Y are binary variables, Z is categorical with K categories
Death Penalties in Florida:
X death penalty (yes/no)
Y defendant's race (black/white)
Z victim's race (black/white)

Marginal Table of X/Y

           yes    no
  white    53     430    483
  black    15     176    191
           68     606    674

Marginal odds ratio: 53*176/(430*15) = 1.45 (±0.59)
slight indication in favor of black defendants

Odds ratios in 2 x 2 x K tables
Conditional Tables of X/Y

Z = white
           yes    no
  white    53     414    467
  black    11     37     48
           64     451    515

Z = black
           yes    no
  white    0      16     16
  black    4      139    143
           4      155    159

Conditional odds ratios: 0.43 (Z = white), 0 (Z = black)
very strong indication against black defendants

Simpson's paradox
• Simpson's paradox: marginal association between X and Y is opposite to conditional associations between X and Y for each level of Z
• due to: very strong association between X and Z or Y and Z

Physicians' study
• Myocardial Infarction among 22071 physicians in 5 year period

             Fatal   Non-fatal   No
  Placebo    18      171         10,845
  Aspirin    5       99          10,933
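The reversal described under Simpson's paradox can be reproduced from the death penalty counts: collapsing the two conditional tables over the victim's race gives the marginal table, and the odds ratio changes side of 1. This is a minimal sketch with hypothetical names; each table is stored as (a, b, c, d) in the layout of the odds ratio slide, with white defendants in the first row and "yes" in the first column.

```python
from math import sqrt


def odds_ratio(a, b, c, d):
    """Odds ratio (a * d) / (b * c) for a 2 x 2 table with rows (a, b) and (c, d)."""
    return (a * d) / (b * c)


def log_odds_ratio_se(a, b, c, d):
    """Approximate standard error of the log odds ratio; needs all four cells > 0."""
    return sqrt(1 / a + 1 / b + 1 / c + 1 / d)


# Conditional tables: (white yes, white no, black yes, black no)
victim_white = (53, 414, 11, 37)
victim_black = (0, 16, 4, 139)

# Marginal table: add the conditional tables cell by cell.
marginal = tuple(w + b for w, b in zip(victim_white, victim_black))

if __name__ == "__main__":
    print(marginal)
    print(round(odds_ratio(*marginal), 2))      # marginal odds ratio
    print(round(odds_ratio(*victim_white), 2))  # conditional, Z = white
    print(round(odds_ratio(*victim_black), 2))  # conditional, Z = black
    print(round(log_odds_ratio_se(*marginal), 3))
```

The standard error helper is not applied to the Z = black table, because that table contains a zero cell and 1/a is undefined there.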