Always be mindful of the kindness and not the faults of others.

One-way ANOVA: Inferences about More than Two Population Means

• Model and test for one-way ANOVA
• Assumption checking
• Nonparametric alternative

Analysis of Variance & One-Factor Designs (One-Way ANOVA)

Y = RESPONSE VARIABLE (of numerical type) (e.g. battery lifetime)
X = EXPLANATORY VARIABLE (of categorical type) (a possibly influential FACTOR) (e.g. brand of battery)

OBJECTIVE: To determine the impact of X on Y

Completely Randomized Design (CRD)

• Goal: to study the effect of Factor X
• The same number of observations is taken randomly and independently from the individuals at each level of Factor X, i.e. $n_1 = n_2 = \cdots = n_C$ (C levels)

Example: Y = LIFETIME (HOURS), 3 replications per level

BRAND      1     2     3     4     5     6     7     8
         1.8   4.2   8.6   7.0   4.2   4.2   7.8   9.0
         5.0   5.4   4.6   5.0   7.8   4.2   7.0   7.4
         1.0   4.2   4.2   9.0   6.6   5.4   9.8   5.8
Mean     2.6   4.6   5.8   7.0   6.2   4.6   8.2   7.4     (grand mean = 5.8)

Analysis of Variance

Statistical Model

C "levels" of BRAND, R observations for each level:

Level      1       2      ...     R
1        Y_11    Y_12     ...    Y_1R
2        Y_21     ...
...               Y_ij
C        Y_C1     ...            Y_CR

$Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$,   i = 1, ..., C;   j = 1, ..., R

where
$\mu$ = OVERALL AVERAGE
i = index for FACTOR (Brand) LEVEL
j = index for "replication"
$\tau_i$ = differential effect associated with the ith level of X (Brand i) = $\mu_i - \mu$
$\varepsilon_{ij}$ = "noise" or "error" due to other factors associated with the (i,j)th data value
$\mu_i$ = AVERAGE associated with the ith level of X (Brand i)
$\mu$ = AVERAGE of the $\mu_i$'s

$Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

By definition, $\sum_{i=1}^{C} \tau_i = 0$.

The experiment produces R x C data values $Y_{ij}$. The analysis produces estimates of $\mu, \tau_1, \tau_2, \ldots, \tau_C$. (We can then get estimates of the $\varepsilon_{ij}$ by subtraction.)
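The estimation step can be illustrated on the battery data from the example. This is a minimal sketch, not part of the original slides: it assumes the usual least-squares estimates for a balanced design (the grand mean for the overall average, and each brand mean minus the grand mean for that brand's effect), and the names `lifetimes` and `estimate_effects` are hypothetical.

```python
# Battery lifetimes (hours) from the example: one list per brand, 3 replications each.
lifetimes = {
    1: [1.8, 5.0, 1.0],
    2: [4.2, 5.4, 4.2],
    3: [8.6, 4.6, 4.2],
    4: [7.0, 5.0, 9.0],
    5: [4.2, 7.8, 6.6],
    6: [4.2, 4.2, 5.4],
    7: [7.8, 7.0, 9.8],
    8: [9.0, 7.4, 5.8],
}


def estimate_effects(data):
    """Estimate mu, tau_i and the residuals for a balanced one-factor design.

    Assumption (illustrative): mu is estimated by the grand mean and tau_i by
    (level mean - grand mean); residuals follow by subtraction.
    """
    level_means = {i: sum(ys) / len(ys) for i, ys in data.items()}
    # With equal replication, the grand mean is the average of the level means.
    mu_hat = sum(level_means.values()) / len(level_means)
    tau_hat = {i: m - mu_hat for i, m in level_means.items()}
    residuals = {i: [y - level_means[i] for y in ys] for i, ys in data.items()}
    return mu_hat, tau_hat, residuals


mu_hat, tau_hat, residuals = estimate_effects(lifetimes)
print(f"estimate of mu: {mu_hat:.2f}")
for brand in lifetimes:
    shown = ", ".join(f"{e:+.2f}" for e in residuals[brand])
    print(f"brand {brand}: tau_hat = {tau_hat[brand]:+.2f}, residuals = {shown}")
```

The estimated effects sum to zero, mirroring the constraint placed on the brand effects in the model, and the residuals within each brand also sum to zero.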
Let $\bar{Y}_{1\cdot}, \bar{Y}_{2\cdot}$, etc., be the level means.

$\bar{Y}_{\cdot\cdot} = \sum_{i=1}^{C} \bar{Y}_{i\cdot} / C$ = "GRAND MEAN"

(assuming the same number of data points at each level; otherwise, $\bar{Y}_{\cdot\cdot}$ = mean of all the data)

MODEL: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

$\bar{Y}_{\cdot\cdot}$ estimates $\mu$
$\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}$ estimates $\tau_i$ (= $\mu_i - \mu$) (for all i)

These estimates are based on Gauss' (1796) PRINCIPLE OF LEAST SQUARES and on COMMON SENSE.

MODEL: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

If you insert the estimates into the MODEL,

(1)  $Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + \hat{\varepsilon}_{ij}$

it follows that our estimate of $\varepsilon_{ij}$ is

(2)  $\hat{\varepsilon}_{ij} = Y_{ij} - \bar{Y}_{i\cdot}$, called the residual.

Then,

$Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot})$

or,

(3)  $(Y_{ij} - \bar{Y}_{\cdot\cdot}) = (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot})$

TOTAL VARIABILITY in Y = Variability in Y associated with X + Variability in Y associated with all other factors

If you square both sides of (3) and double sum both sides (over i and j), you get [after some unpleasant algebra, but lots of terms which "cancel"]:

$\sum_{i=1}^{C} \sum_{j=1}^{R} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = R \sum_{i=1}^{C} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + \sum_{i=1}^{C} \sum_{j=1}^{R} (Y_{ij} - \bar{Y}_{i\cdot})^2$

TSS = SSB + SSW (SSE)

TOTAL SUM OF SQUARES = SUM OF SQUARES BETWEEN SAMPLES + SUM OF SQUARES WITHIN SAMPLES

ANOVA TABLE

SOURCE OF VARIABILITY             SSQ    DF         Mean square (M.S.)
Between samples (due to brand)    SSB    C-1        MSB = SSB / (C-1)
Within samples (due to error)     SSW    (R-1)C     MSW = SSW / ((R-1)C)
TOTAL                             TSS    RC-1

Example: Y = LIFETIME (HOURS), 3 replications per level

BRAND      1     2     3     4     5     6     7     8
         1.8   4.2   8.6   7.0   4.2   4.2   7.8   9.0
         5.0   5.4   4.6   5.0   7.8   4.2   7.0   7.4
         1.0   4.2   4.2   9.0   6.6   5.4   9.8   5.8
Mean     2.6   4.6   5.8   7.0   6.2   4.6   8.2   7.4     (grand mean = 5.8)

SSB = 3 ( [2.6 - 5.8]^2 + [4.6 - 5.8]^2 + ... + [7.4 - 5.8]^2 ) = 3 (23.04) = 69.12

SSW = ?

Brand 1: (1.8 - 2.6)^2 = .64,   (5.0 - 2.6)^2 = 5.76,  (1.0 - 2.6)^2 = 2.56;  total 8.96
Brand 2: (4.2 - 4.6)^2 = .16,   (5.4 - 4.6)^2 = .64,   (4.2 - 4.6)^2 = .16;   total .96
...
Brand 8: (9.0 - 7.4)^2 = 2.56,  (7.4 - 7.4)^2 = 0,     (5.8 - 7.4)^2 = 2.56;  total 5.12

Total of (8.96 + .96 + ... + 5.12): SSW = 46.72

ANOVA TABLE

Source of Variability    SSQ       df                   M.S.
BRAND                    69.12      7  (= 8 - 1)        9.87
ERROR                    46.72     16  (= 2 (8))        2.92
TOTAL                   115.84     23  (= (3 x 8) - 1)

We can show:

E(MSB) = $\sigma^2 + V_{COL}$, where $V_{COL} = \frac{R}{C-1} \sum_{i} (\mu_i - \mu)^2$ is a MEASURE OF DIFFERENCES AMONG LEVEL MEANS

E(MSW) = $\sigma^2$

(Assuming the $Y_{ij}$ follow $N(\mu_i, \sigma^2)$ and are independent)

E(MSB) = $\sigma^2 + V_{COL}$
E(MSW) = $\sigma^2$

This suggests that:
• if MSB / MSW > 1, there is some evidence of non-zero $V_{COL}$, or "level of X affects Y";
• if MSB / MSW < 1, there is no evidence that $V_{COL} > 0$, or that "level of X affects Y".

With
$H_0$: Level of X has no impact on Y
$H_1$: Level of X does have impact on Y,
we need MSB / MSW >> 1 to reject $H_0$.

More formally,

$H_0$: $\tau_1 = \tau_2 = \cdots = \tau_C = 0$
$H_1$: not all $\tau_i = 0$

OR

$H_0$: $\mu_1 = \mu_2 = \cdots = \mu_C$ (all level means are equal)
$H_1$: not all $\mu_i$ are EQUAL

The distribution of MSB / MSW = "$F_{calc}$" is the F distribution with (C-1, (R-1)C) degrees of freedom, assuming $H_0$ is true. The critical value C is the table value.

In our problem:

ANOVA TABLE

Source of Variability    SSQ       df     M.S.    F_calc
BRAND                    69.12      7     9.87    3.38  (= 9.87 / 2.92)
ERROR                    46.72     16     2.92

F table (table 8): at $\alpha$ = .05 with (7, 16) DF, C = 2.66 < 3.38 = $F_{calc}$.

Hence, at $\alpha$ = .05, reject $H_0$ (i.e., conclude that level of BRAND does have an impact on battery lifetime).

MINITAB INPUT

life    brand
1.8     1
5.0     1
1.0     1
4.2     2
5.4     2
4.2     2
. . .
9.0     8
7.4     8
5.8     8

ONE FACTOR ANOVA (MINITAB)

MINITAB: STAT >> ANOVA >> ONE-WAY

Analysis of Variance for life
Source    DF      SS       MS      F       P
brand      7     69.12    9.87    3.38    0.021
Error     16     46.72    2.92
Total     23    115.84

The Error mean square (2.92) is the estimate of the common variance, $\hat{\sigma}^2$.

[Figure: Boxplots of life by brand (means are indicated by solid circles); x-axis: brand, y-axis: life]

Assumptions

MODEL: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

1.) The $\varepsilon_{ij}$ are independent random variables. (Check: run order plot)
2.) Each $\varepsilon_{ij}$ is Normally distributed, with $E(\varepsilon_{ij}) = 0$ for all i, j. (Check: normality plot & test)
3.) $\sigma^2(\varepsilon_{ij})$ = constant for all i, j. (Check: residual plot & test)

Diagnosis: Normality

• The points on the normality plot must more or less follow a line to claim "normally distributed".
• There are statistical tests to verify it formally.
• The ANOVA method we learn here is not sensitive to the normality assumption. That is, a mild departure from the normal distribution will not change our conclusions much.

Normal probability plot & normality test of residuals

Minitab: Stat >> Basic Statistics >> Normality Test

[Figure: Probability Plot of RESI1 (Normal); x-axis: RESI1, y-axis: Percent. Mean = -1.48030E-16, StDev = 1.425, N = 24, AD = 0.481, P-Value = 0.212]

Diagnosis: Constant Variances

• The points on the residual plot must be more or less within a horizontal band to claim "constant variances".
• There are statistical tests to verify it formally.
• The ANOVA method we learn here is not sensitive to the constant variances assumption. That is, slightly different variances within groups will not change our conclusions much.

Tests and residual plot: fitted values vs. residuals

Minitab: Stat >> ANOVA >> One-way

[Figure: Residuals Versus the Fitted Values (response is life); x-axis: Fitted Value, y-axis: Residual]

Minitab: Stat >> ANOVA >> Test for Equal Variances

[Figure: Test for Equal Variances for life; 95% Bonferroni Confidence Intervals for StDevs, by brand]

Bartlett's Test:  Test Statistic = 4.20,  P-Value = 0.757
Levene's Test:    Test Statistic = 0.31,  P-Value = 0.938

Diagnosis: Randomness/Independence

• The run order plot must show no "systematic" patterns to claim "randomness".
• There are statistical tests to verify it formally.
• The ANOVA method is sensitive to the randomness assumption. That is, even a small degree of dependence between data points can change our conclusions a lot.

Run order plot: order vs. residuals

Minitab: Stat >> ANOVA >> One-way

[Figure: Residuals Versus the Order of the Data (response is life); x-axis: Observation Order, y-axis: Residual]

KRUSKAL-WALLIS TEST (Non-Parametric Alternative)

$H_0$: The probability distributions are identical for each level of the factor
$H_1$: Not all the distributions are the same

BATTERY LIFETIME (hours) (each column rank ordered, for simplicity)

Brand      A       B       C
          32      32      28
          30      32      21
          30      26      15
          29      26      15
          26      22      14
          23      20      14
          20      19      14
          19      16      11
          18      14       9
          12      14       8
Mean:    23.9    22.1    14.9    (here, irrelevant!!)

$H_0$: no difference in distribution among the three brands with respect to battery lifetime
$H_1$: at least one of the 3 brands differs in distribution from the others with respect to lifetime

Ranks in ( )

Brand       A            B            C
         32 (29)      32 (29)      28 (24)
         30 (26.5)    32 (29)      21 (18)
         30 (26.5)    26 (22)      15 (10.5)
         29 (25)      26 (22)      15 (10.5)
         26 (22)      22 (19)      14 (7)
         23 (20)      20 (16.5)    14 (7)
         20 (16.5)    19 (14.5)    14 (7)
         19 (14.5)    16 (12)      11 (3)
         18 (13)      14 (7)        9 (2)
         12 (4)       14 (7)        8 (1)
         T1 = 197     T2 = 178     T3 = 90
         n1 = 10      n2 = 10      n3 = 10

TEST STATISTIC:

$H = \frac{12}{N(N+1)} \sum_{j=1}^{K} \frac{T_j^2}{n_j} - 3(N+1)$

$n_j$ = number of data values in column j
$N = \sum_{j=1}^{K} n_j$
K = number of columns (levels)
$T_j$ = SUM OF RANKS OF THE DATA IN COLUMN j, when all the DATA are COMBINED

(There is a slight adjustment in the formula as a function of the number of ties in rank.)

$H = \frac{12}{30(31)} \left[ \frac{197^2}{10} + \frac{178^2}{10} + \frac{90^2}{10} \right] - 3(31) = 8.41$

(with adjustment for ties, we get 8.47)

What do we do with H?

We can show that, under $H_0$, H is well approximated by a $\chi^2$ distribution with df = K - 1.
Here, df = 2, and at $\alpha$ = .05, the critical value is $\chi^2_{\alpha,df}$ = 5.99 (note that $\chi^2_{\alpha,df} / df = F_{\alpha,df,\infty}$).

Since H = 8.41 > 5.99, reject $H_0$; conclude that the lifetime distribution is NOT the same for all 3 BRANDS.

Minitab: Stat >> Nonparametrics >> Kruskal-Wallis

Kruskal-Wallis Test: life versus brand

Kruskal-Wallis Test on life

brand       N    Median    AveRank       Z
1           3    1.800       4.5      -2.09
2           3    4.200       7.8      -1.22
3           3    4.600      11.8      -0.17
4           3    7.000      16.5       1.05
5           3    6.600      13.3       0.22
6           3    4.200       7.8      -1.22
7           3    7.800      20.0       1.96
8           3    7.400      18.2       1.48
Overall    24               12.5

H = 12.78  DF = 7  P = 0.078
H = 13.01  DF = 7  P = 0.072 (adjusted for ties)
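The rank-sum calculation behind the Kruskal-Wallis statistic can be sketched in a few lines of plain Python. This is an illustrative sketch rather than Minitab's implementation: the function names `average_ranks` and `kruskal_wallis` are hypothetical, the tie adjustment is assumed to be the common one that divides H by 1 - sum(t^3 - t) / (N^3 - N) over groups of t tied values, and no P-value is computed (that would need a chi-square distribution function).

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks.

    Returns the ranks (in the original order) and the size of each tie group.
    """
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    tie_sizes = []
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        tie_sizes.append(end - pos + 1)
        pos = end + 1
    return ranks, tie_sizes


def kruskal_wallis(groups):
    """Return (H, tie-adjusted H, rank sums T_j) for a list of samples."""
    pooled = [y for g in groups for y in g]
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    rank_sums, start = [], 0
    for g in groups:
        rank_sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    h = 12 / (n * (n + 1)) * sum(t ** 2 / len(g) for t, g in zip(rank_sums, groups)) - 3 * (n + 1)
    # Assumed tie adjustment: divide by 1 - sum(t^3 - t) / (N^3 - N).
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    return h, h / correction, rank_sums


# Three-brand example (brands A, B, C).
three_brands = [
    [32, 30, 30, 29, 26, 23, 20, 19, 18, 12],
    [32, 32, 26, 26, 22, 20, 19, 16, 14, 14],
    [28, 21, 15, 15, 14, 14, 14, 11, 9, 8],
]
# Eight-brand battery data used in the ANOVA example.
eight_brands = [
    [1.8, 5.0, 1.0], [4.2, 5.4, 4.2], [8.6, 4.6, 4.2], [7.0, 5.0, 9.0],
    [4.2, 7.8, 6.6], [4.2, 4.2, 5.4], [7.8, 7.0, 9.8], [9.0, 7.4, 5.8],
]

h3, h3_adj, sums3 = kruskal_wallis(three_brands)
h8, h8_adj, sums8 = kruskal_wallis(eight_brands)
print(f"3 brands: rank sums = {sums3}, H = {h3:.2f}")
print(f"8 brands: H = {h8:.2f}, adjusted for ties = {h8_adj:.2f}")
```

On the three-brand data this reproduces the rank sums 197, 178 and 90 and H of about 8.41; on the eight-brand data it gives H of about 12.78, and about 13.01 after the tie adjustment, in line with the Minitab output.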