Hierarchical (nested) ANOVA

• In some two-factor experiments the levels of one factor, say B, are not "crossed" (cross-classified) with the levels of the other factor, say A, but are "NESTED" within them.
• The levels of B are different for different levels of A.
  – For example: 2 Areas (Study vs. Control)
    • 4 sites per area, each with 5 replicates.
    • There is no link between any site in one area and any site in the other area.
• That is, there are 8 distinct sites, not 4 sites shared by both areas.

            Study Area (A)                     Control Area (B)
    S1(A)   S2(A)   S3(A)   S4(A)      S5(B)   S6(B)   S7(B)   S8(B)
      X       X       X       X          X       X       X       X
      X       X       X       X          X       X       X       X
      X       X       X       X          X       X       X       X
      X       X       X       X          X       X       X       X
      X       X       X       X          X       X       X       X

    X = one replicate

The number of sites (S) per area and the number of replicates per site need not be equal.
The analysis is carried out using a nested ANOVA, not a two-way ANOVA.

• A nested design is not the same as a two-way ANOVA, which is represented by:

            A1       A2       A3
    B1    XXXXX    XXXXX    XXXXX
    B2    XXXXX    XXXXX    XXXXX
    B3    XXXXX    XXXXX    XXXXX

Nested, or hierarchical, designs are very common in environmental effects monitoring studies. There are several "Study" and several "Control" areas.

Objectives

• The nested design allows us to test two things: (1) the difference between "Study" and "Control" areas, and (2) the variability of the sites within areas.
• If we fail to find significant variability among the sites within areas, then a significant difference between areas would suggest that there is an environmental impact.
• In other words, the variability is due to differences between areas and not to variability among the sites.
• In this kind of situation, however, it is highly likely that we will find variability among the sites.
• Even if that variability is significant, we can still test whether the difference between the areas is significantly larger than the variability among the sites within areas.
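The difference between a nested and a crossed layout can be checked directly from the factor labels. The following is a minimal Python sketch, not part of the original slides: the helper name classify_design and the label strings are hypothetical, chosen only to mirror the Study/Control example and the A1..A3 by B1..B3 grid above.

```python
from collections import defaultdict


def classify_design(pairs):
    """Classify factor B relative to factor A from (a_level, b_level) pairs.

    Returns "nested" if every B level occurs under exactly one A level,
    "crossed" if every B level occurs under every A level, else "mixed".
    (Hypothetical helper for illustration only.)
    """
    a_levels = {a for a, _ in pairs}
    parents = defaultdict(set)  # B level -> set of A levels it appears under
    for a, b in pairs:
        parents[b].add(a)
    if all(len(p) == 1 for p in parents.values()):
        return "nested"
    if all(p == a_levels for p in parents.values()):
        return "crossed"
    return "mixed"


# Nested layout from the slides: 2 areas, 4 distinct sites in each (8 sites).
nested_layout = (
    [("Study", f"S{s}") for s in range(1, 5)]
    + [("Control", f"S{s}") for s in range(5, 9)]
)

# Crossed (two-way) layout: every B level appears under every A level.
crossed_layout = [(a, b) for a in ("A1", "A2", "A3") for b in ("B1", "B2", "B3")]

print(classify_design(nested_layout))   # nested
print(classify_design(crossed_layout))  # crossed
```

If the same site labels (say S1 to S4) were reused in both areas, this check would report "crossed", which is why distinct labels for physically distinct sites matter when the data are coded.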
Statistical Model

\[ Y_{ijk} = \mu + \rho_i + \tau_{(i)j} + \varepsilon_{(ij)k} \]

• i indexes "A" (often called the "major factor")
• (i)j indexes "B" within "A" (B is often called the "minor factor")
• (ij)k indexes replication
• i = 1, 2, …, M;  j = 1, 2, …, m;  k = 1, 2, …, n

Model (continued)

Each deviation from the grand mean splits into three parts:

\[ Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot} = (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot}) + (Y_{ijk} - \bar{Y}_{ij\cdot}) \]

and

\[ \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2 = \sum_i \sum_j \sum_k (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + \sum_i \sum_j \sum_k (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})^2 + \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij\cdot})^2 \]

Or,

\[ \mathrm{TSS} = \mathrm{SSA} + \mathrm{SS(A)B} + \mathrm{SSW}_{\text{error}} \]
\[ = m\,n \sum_{i=1}^{M} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + n \sum_{i=1}^{M} \sum_{j=1}^{m} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})^2 + \sum_{i=1}^{M} \sum_{j=1}^{m} \sum_{k=1}^{n} (Y_{ijk} - \bar{Y}_{ij\cdot})^2 \]

Degrees of freedom:

\[ Mmn - 1 = (M - 1) + M(m - 1) + Mm(n - 1) \]

Example

M = 3, m = 4, n = 3: 3 areas, 4 sites within each area, 3 replications per site, for a total of M·m·n = 36 data points.

    Areas           M1                  M2                  M3
    Sites      1   2   3   4       5   6   7   8       9  10  11  12
    Repl.     10  12   8  13      11  13   9  10      13  14   7  10
              14   8  10  12      14  11  10   9      10  13   9   7
               9  10  12  11       8   9   8   8      16  12   5   4
    Ȳij.      11  10  10  12      11  11   9   9      13  13   7   7
    Ȳi..          10.75               10.0                10.0
    Ȳ...                              10.25

Example (continued)

\[ \mathrm{SSA} = 4 \times 3\,[(10.75 - 10.25)^2 + (10.0 - 10.25)^2 + (10.0 - 10.25)^2] = 12\,(0.25 + 0.0625 + 0.0625) = 4.5 \]

\[ \mathrm{SS(A)B} = 3\,[(11 - 10.75)^2 + (10 - 10.75)^2 + (10 - 10.75)^2 + (12 - 10.75)^2 + (11 - 10)^2 + (11 - 10)^2 + (9 - 10)^2 + (9 - 10)^2 + (13 - 10)^2 + (13 - 10)^2 + (7 - 10)^2 + (7 - 10)^2] = 3\,(42.75) = 128.25 \]

\[ \mathrm{TSS} = 240.75, \qquad \mathrm{SSW}_{\text{error}} = 108.0 \]

ANOVA Table for Example

Nested ANOVA: Observations versus Area, Sites

    Source        DF    SS (sum of squares)    MS (mean square)    F        P
    Area           2          4.50                  2.25           0.158    0.856
    Sites (A)B     9        128.25                 14.25           3.167    0.012**
    Error         24        108.00                  4.50
    Total         35        240.75

What are the "proper" ratios?

    Mean square    Expected value          F ratio
    MSA            σ² + V(A)B + VA         MSA / MS(A)B
    MS(A)B         σ² + V(A)B              MS(A)B / MSWerror
    MSWerror       σ²

Summary

• Nested designs are very common in environmental monitoring.
• The nested ANOVA is a refinement of the one-way ANOVA.
• All assumptions of ANOVA hold: normality of residuals, constant variance, etc.
• It can be easily computed using SAS, MINITAB, etc.
• Be careful to use the proper ratio of the mean squares.
• Always use graphical methods, e.g. boxplots and normal plots, as visual aids in the analysis.

Sample: Hierarchical (nested) ANOVA

    Length   mosquito   cage
    58.5        1         1
    59.5        1         1
    77.8        2         1
    80.9        2         1
    84.0        3         1
    83.6        3         1
    70.1        4         1
    68.3        4         1
    69.8        1         2
    69.8        1         2
    56.0        2         2
    54.5        2         2
    50.7        3         2
    49.3        3         2
    63.8        4         2
    65.8        4         2
    56.6        1         3
    57.5        1         3
    77.8        2         3
    79.2        2         3
    69.9        3         3
    69.2        3         3
    62.1        4         3
    64.5        4         3

\[ \text{Length} = \beta_0 + \beta_{\text{cage}} \times \text{cage} + \beta_{\text{mosquito(cage)}} \times \text{mosquito(cage)} + \text{error} \]

df?

SAS program:

    /* Read one observation per line: length, mosquito within cage, cage */
    data anova6;
      input length mosquito cage;
      cards;
    58.5 1 1
    59.5 1 1
    77.8 2 1
    80.9 2 1
    84.0 3 1
    83.6 3 1
    70.1 4 1
    68.3 4 1
    69.8 1 2
    69.8 1 2
    56.0 2 2
    54.5 2 2
    50.7 3 2
    49.3 3 2
    63.8 4 2
    65.8 4 2
    56.6 1 3
    57.5 1 3
    77.8 2 3
    79.2 2 3
    69.9 3 3
    69.2 3 3
    62.1 4 3
    64.5 4 3
    ;
    /* Nested model: mosquito is nested within cage */
    proc glm data=anova6;
      class cage mosquito;
      model length = cage mosquito(cage);
      /* Test cage against mosquito(cage), not against the residual error */
      test h=cage e=mosquito(cage);
      output out=out1 r=res p=pred;
    /* Residual diagnostics */
    proc print data=out1; var res pred;
    proc plot data=out1; plot res*pred;
    proc univariate data=out1 normal plot; var res;
    run;

SAS output:

    Class Level Information
    Class       Levels   Values
    mosquito       4     1 2 3 4
    cage           3     1 2 3

    Number of observations   24

    Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model              11     2386.353333     216.941212    166.66   <.0001
    Error              12       15.620000       1.301667
    Corrected Total    23     2401.973333

    Source             DF      Type I SS     Mean Square   F Value   Pr > F
    cage                2     665.675833      332.837917    255.70   <.0001
    mosquito(cage)      9    1720.677500      191.186389    146.88   <.0001

    Tests of Hypotheses Using the MS for mosquito(cage) as an Error Term
    Source             DF      Type I SS     Mean Square   F Value   Pr > F
    cage                2     665.675833      332.837917      1.74   0.2295

[Figure: plot of res1 against res]

[Figure: residuals (res) against predicted values (pred)]

    Tests for Normality
    Test                   Statistic            p Value
    Shapiro-Wilk           W       0.978828     Pr < W       0.8733
    Kolmogorov-Smirnov     D       0.093842     Pr > D      >0.1500
    Cramer-von Mises       W-Sq    0.038078     Pr > W-Sq   >0.2500
    Anderson-Darling       A-Sq    0.22057      Pr > A-Sq   >0.2500
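The sums of squares and F ratios in the PROC GLM output above can be cross-checked by hand. The following is a minimal Python sketch using only the standard library; it is not the SAS procedure itself. The function name nested_anova and the returned dictionary keys are hypothetical, and the sketch assumes a balanced design. It computes no p-values, since those need the F distribution.

```python
from collections import defaultdict
from statistics import fmean


def nested_anova(records):
    """Two-level nested ANOVA for a balanced design (illustrative sketch).

    records: iterable of (a_level, b_level, y), where b_level labels the
    minor factor B within the major factor level a_level.
    """
    cells = defaultdict(list)    # (a, b) -> replicate values in that cell
    for a, b, y in records:
        cells[(a, b)].append(y)
    groups = defaultdict(list)   # a -> all values under that level of A
    for (a, _), ys in cells.items():
        groups[a].extend(ys)

    all_y = [y for ys in cells.values() for y in ys]
    grand = fmean(all_y)

    # SSA: level means of A around the grand mean
    ss_a = sum(len(ys) * (fmean(ys) - grand) ** 2 for ys in groups.values())
    # SS(A)B: cell means around the mean of their own A level
    ss_b = sum(len(ys) * (fmean(ys) - fmean(groups[a])) ** 2
               for (a, _), ys in cells.items())
    # SSWerror: replicates around their own cell mean
    ss_e = sum((y - fmean(ys)) ** 2 for ys in cells.values() for y in ys)

    df_a = len(groups) - 1
    df_b = len(cells) - len(groups)
    df_e = len(all_y) - len(cells)
    ms_a, ms_b, ms_e = ss_a / df_a, ss_b / df_b, ss_e / df_e
    return {
        "SS": (ss_a, ss_b, ss_e),
        "DF": (df_a, df_b, df_e),
        "MS": (ms_a, ms_b, ms_e),
        "F_A": ms_a / ms_b,   # A is tested against B within A
        "F_B": ms_b / ms_e,   # B within A is tested against the error
    }


# Mosquito data in the order listed above: 3 cages x 4 mosquitoes x 2 measurements.
lengths = [58.5, 59.5, 77.8, 80.9, 84.0, 83.6, 70.1, 68.3,
           69.8, 69.8, 56.0, 54.5, 50.7, 49.3, 63.8, 65.8,
           56.6, 57.5, 77.8, 79.2, 69.9, 69.2, 62.1, 64.5]
records = [(i // 8 + 1, (i % 8) // 2 + 1, y) for i, y in enumerate(lengths)]

result = nested_anova(records)
print("DF :", result["DF"])
print("SS :", [round(s, 4) for s in result["SS"]])
print("F cage           = %.2f" % result["F_A"])
print("F mosquito(cage) = %.2f" % result["F_B"])
```

The ratio for cage uses MS for mosquito(cage) as its denominator, which corresponds to the `test h=cage e=mosquito(cage)` statement and not to the default F value of 255.70 computed against the residual error.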
Two-way ANOVA vs. nested ANOVA