Hierarchical (nested) ANOVA

• In some two-factor experiments the levels of
one factor, say B, are not “crossed” (or “cross-classified”)
with the other factor, say A, but are
“NESTED” within it.
• The levels of B are different for different
levels of A.
– For example: 2 Areas (Study vs Control)
• 4 sites per area, each with 5 replicates.
• No site in one area is linked to any site
in the other area.
• That is, there are 8 sites, not 2.
Study Area (A)                  Control Area (B)
S1(A)  S2(A)  S3(A)  S4(A)      S5(B)  S6(B)  S7(B)  S8(B)
  X      X      X      X          X      X      X      X
  X      X      X      X          X      X      X      X
  X      X      X      X          X      X      X      X
  X      X      X      X          X      X      X      X
  X      X      X      X          X      X      X      X
X = replications
The number of sites (S) per area and the number of replications per site need not be equal.
The analysis is carried out using a nested ANOVA, not a
two-way ANOVA.
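To make the nesting concrete, the sketch below builds the layout from the example above (2 areas, 4 sites per area, 5 replicates per site) in Python. The variable names, and the choice to identify each site by an (area, site) pair, are illustrative assumptions and not part of the original slides.

```python
# Nested layout: each site belongs to exactly one area.
areas = {
    "Study": ["S1", "S2", "S3", "S4"],
    "Control": ["S5", "S6", "S7", "S8"],
}
replicates_per_site = 5

# One record per replicate: (area, site, replicate number).
layout = [
    (area, site, rep)
    for area, sites in areas.items()
    for site in sites
    for rep in range(1, replicates_per_site + 1)
]

# A site is only meaningful together with its area.
distinct_sites = {(area, site) for area, site, _ in layout}

print(len(layout))          # 40 observations
print(len(distinct_sites))  # 8 distinct sites
```

In a crossed (two-way) layout the same site labels would appear under every area; here no label is shared between areas.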
• A nested design is not the same as a two-way ANOVA, which is represented by:
        A1       A2       A3
B1    XXXXX    XXXXX    XXXXX
B2    XXXXX    XXXXX    XXXXX
B3    XXXXX    XXXXX    XXXXX
Nested, or hierarchical, designs are very common in
environmental effects monitoring studies, where there
are several “Study” and several “Control” Areas.
Objectives
• The nested design allows us to test two
things: (1) difference between “Study” and
“Control” areas, and (2) the variability of the
sites within areas.
• If we find no significant variability among
the sites within areas, then a significant
difference between areas would suggest that
there is an environmental impact.
• In other words, the variability is due to
differences between areas and not to
variability among the sites.
• In this kind of situation, however, it is highly
likely that we will find variability among the
sites.
• Even if that variability is significant, we
can still test whether the difference
between the areas is significantly larger than
the variability among the sites within areas.
Statistical Model
$$Y_{ijk} = \mu + \rho_i + \tau_{(i)j} + \varepsilon_{(ij)k}$$
i indexes “A” (often called the “major factor”)
(i)j indexes “B” within “A” (B is often called the “minor factor”)
(ij)k indexes replication
i = 1, 2, …, M
j = 1, 2, …, m
k = 1, 2, …, n
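As a hedged illustration of how the model generates data, the sketch below simulates one balanced data set in Python. The function name `simulate_nested`, the normal distributions, the default values for the overall mean and standard deviations, and the treatment of both the area and site effects as random draws are all assumptions made for the example; the slides do not specify them.

```python
import random


def simulate_nested(M, m, n, mu=10.0, sd_area=1.0, sd_site=2.0, sd_error=1.5, seed=0):
    """Simulate Y_ijk = mu + rho_i + tau_(i)j + e_(ij)k (all parameter values hypothetical)."""
    rng = random.Random(seed)
    rows = []  # each row is (i, j, k, y)
    for i in range(1, M + 1):
        rho = rng.gauss(0.0, sd_area)            # effect of level i of the major factor A
        for j in range(1, m + 1):
            tau = rng.gauss(0.0, sd_site)        # effect of level j of B within level i of A
            for k in range(1, n + 1):
                eps = rng.gauss(0.0, sd_error)   # replicate-level error
                rows.append((i, j, k, mu + rho + tau + eps))
    return rows


rows = simulate_nested(M=3, m=4, n=3)
print(len(rows))  # M * m * n = 36
```

Note that a new site effect is drawn inside each area, so site j in one area shares nothing with site j in another; that is what the subscript (i)j expresses.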
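Model (continued)
$$Y_{ijk} - \bar{Y} = (\bar{Y}_{i..} - \bar{Y}) + (\bar{Y}_{ij.} - \bar{Y}_{i..}) + (Y_{ijk} - \bar{Y}_{ij.})$$
and
$$\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y})^2 = \sum_i \sum_j \sum_k (\bar{Y}_{i..} - \bar{Y})^2 + \sum_i \sum_j \sum_k (\bar{Y}_{ij.} - \bar{Y}_{i..})^2 + \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij.})^2$$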
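The step from the first identity to the second is not spelled out on the slide. A brief sketch of the usual argument, for a balanced design: squaring the first identity and summing gives the three squared terms plus cross-product terms, and each cross-product sum vanishes because deviations from a mean sum to zero. For instance:

```latex
\sum_i \sum_j \sum_k (\bar{Y}_{ij.} - \bar{Y}_{i..})(Y_{ijk} - \bar{Y}_{ij.})
  = \sum_i \sum_j (\bar{Y}_{ij.} - \bar{Y}_{i..}) \sum_k (Y_{ijk} - \bar{Y}_{ij.})
  = 0,
\qquad \text{since } \sum_k (Y_{ijk} - \bar{Y}_{ij.}) = 0 .
```

The other two cross-product sums vanish in the same way, using $\sum_j (\bar{Y}_{ij.} - \bar{Y}_{i..}) = 0$ within each level of A.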
Model (continued)
Or,
TSS = SSA + SS(A)B + SSWerror
$$\mathrm{TSS} = mn \sum_{i=1}^{M} (\bar{Y}_{i..} - \bar{Y})^2 + n \sum_{i=1}^{M} \sum_{j=1}^{m} (\bar{Y}_{ij.} - \bar{Y}_{i..})^2 + \sum_{i=1}^{M} \sum_{j=1}^{m} \sum_{k=1}^{n} (Y_{ijk} - \bar{Y}_{ij.})^2$$
Degrees of freedom:
Mmn - 1 = (M - 1) + M(m - 1) + Mm(n - 1)
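The degrees-of-freedom partition can be checked mechanically. The short Python sketch below assumes a balanced design; the function name `nested_df` is introduced here only for illustration.

```python
def nested_df(M, m, n):
    """Degrees of freedom for a balanced two-level nested design."""
    df_a = M - 1                 # major factor A
    df_b_in_a = M * (m - 1)      # minor factor B within A
    df_error = M * m * (n - 1)   # replicates within each level of B
    df_total = M * m * n - 1
    # The three components must add up to the total.
    assert df_a + df_b_in_a + df_error == df_total
    return df_a, df_b_in_a, df_error, df_total


print(nested_df(3, 4, 3))  # (2, 9, 24, 35)
```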
Example
M=3, m=4, n=3; 3 Areas, 4 sites within each area, 3
replications per site, total of (M.m.n = 36) data points
Areas:        M1                    M2                    M3
Sites:     1    2    3    4      5    6    7    8      9   10   11   12
Repl. 1   10   12    8   13     11   13    9   10     13   14    7   10
Repl. 2   14    8   10   12     14   11   10    9     10   13    9    7
Repl. 3    9   10   12   11      8    9    8    8     16   12    5    4
Yij.      11   10   10   12     11   11    9    9     13   13    7    7
Yi..          10.75                 10.0                  10.0
Y = 10.25 (grand mean)
Example (continued)
SSA = 4 x 3 [(10.75 - 10.25)^2 + (10.0 - 10.25)^2 + (10.0 - 10.25)^2]
= 12 (0.25 + 0.0625 + 0.0625) = 4.5
SS(A)B = 3 [(11 - 10.75)^2 + (10 - 10.75)^2 + (10 - 10.75)^2 + (12 - 10.75)^2 +
(11 - 10)^2 + (11 - 10)^2 + (9 - 10)^2 + (9 - 10)^2 +
(13 - 10)^2 + (13 - 10)^2 + (7 - 10)^2 + (7 - 10)^2]
= 3 (42.75) = 128.25
TSS = 240.75
SSWerror = 108.0
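These hand calculations can be reproduced from the raw data. The following is a minimal Python sketch, assuming the data table is read with rows as replications and columns as sites 1 to 12 (sites 1-4 in area M1, 5-8 in M2, 9-12 in M3); the variable names are illustrative.

```python
# Rows are replications; columns are sites 1-12.
reps = [
    [10, 12, 8, 13, 11, 13, 9, 10, 13, 14, 7, 10],
    [14, 8, 10, 12, 14, 11, 10, 9, 10, 13, 9, 7],
    [9, 10, 12, 11, 8, 9, 8, 8, 16, 12, 5, 4],
]
M, m, n = 3, 4, 3  # areas, sites per area, replications per site


def mean(xs):
    return sum(xs) / len(xs)


# site_obs[s] holds the n replicate values for site s (0-based index).
site_obs = [[row[s] for row in reps] for s in range(M * m)]
site_means = [mean(obs) for obs in site_obs]
area_means = [mean(site_means[a * m:(a + 1) * m]) for a in range(M)]
grand_mean = mean(area_means)  # valid only because the design is balanced

ss_a = m * n * sum((am - grand_mean) ** 2 for am in area_means)
ss_b_in_a = n * sum((site_means[s] - area_means[s // m]) ** 2 for s in range(M * m))
ss_error = sum((y - site_means[s]) ** 2 for s in range(M * m) for y in site_obs[s])
tss = sum((y - grand_mean) ** 2 for obs in site_obs for y in obs)

print(ss_a, ss_b_in_a, ss_error, tss)
```

The three components should add up to the total sum of squares, which is a useful check on any hand calculation.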
ANOVA Table for Example
Nested ANOVA: Observations versus Area, Sites
Source        DF   SS (sum of squares)   MS (mean square)   F       P
Area           2     4.50                  2.25              0.158   0.856
Sites (A)B     9   128.25                 14.25              3.167   0.012**
Error         24   108.00                  4.50
Total         35   240.75
What are the “proper” ratios?
E(MSA) = σ² + V(A)B + VA        F = MSA/MS(A)B
E(MS(A)B) = σ² + V(A)B          F = MS(A)B/MSWerror
E(MSWerror) = σ²
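The effect of choosing the proper denominator can be seen by recomputing the F ratios from the ANOVA table for the example. This Python sketch takes the sums of squares and degrees of freedom from that table; the dictionary keys are illustrative names, and the last ratio is included only to show what happens when Area is tested against the error mean square, as a fixed-effects crossed analysis would do.

```python
# Sums of squares and degrees of freedom from the example ANOVA table.
ss = {"area": 4.50, "sites_in_area": 128.25, "error": 108.00}
df = {"area": 2, "sites_in_area": 9, "error": 24}

ms = {name: ss[name] / df[name] for name in ss}

f_area = ms["area"] / ms["sites_in_area"]    # Area tested against Sites within Area
f_sites = ms["sites_in_area"] / ms["error"]  # Sites within Area tested against error
f_area_vs_error = ms["area"] / ms["error"]   # NOT the proper ratio for a nested design

print(round(f_area, 3), round(f_sites, 3), round(f_area_vs_error, 3))
# 0.158 3.167 0.5
```

The first two values match the F column of the table; the third differs by more than a factor of three, which is why the choice of denominator matters.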
Summary
• Nested designs are very common in
environmental monitoring
• It is a refinement of the one-way ANOVA
• All assumptions of ANOVA hold: normality of
residuals, constant variance, etc.
• Can be easily computed using SAS, MINITAB,
etc.
• Be careful to use the proper ratio of
the mean squares.
• Always use graphical methods, e.g. boxplots
and normal plots, as visual aids in the analysis.
Sample: Hierarchical (nested) ANOVA
Length   mosquito   cage
58.5     1          1
59.5     1          1
77.8     2          1
80.9     2          1
84.0     3          1
83.6     3          1
70.1     4          1
68.3     4          1
69.8     1          2
69.8     1          2
56.0     2          2
54.5     2          2
50.7     3          2
49.3     3          2
63.8     4          2
65.8     4          2
56.6     1          3
57.5     1          3
77.8     2          3
79.2     2          3
69.9     3          3
69.2     3          3
62.1     4          3
64.5     4          3
Length = β0 + βcage × cage
+ βmosquito(cage) × mosquito(cage)
+ error
df?
data anova6;
  /* one record per measurement: length, mosquito (within cage), cage */
  input length mosquito cage;
  cards;
58.5 1 1
59.5 1 1
77.8 2 1
80.9 2 1
84.0 3 1
83.6 3 1
70.1 4 1
68.3 4 1
69.8 1 2
69.8 1 2
56.0 2 2
54.5 2 2
50.7 3 2
49.3 3 2
63.8 4 2
65.8 4 2
56.6 1 3
57.5 1 3
77.8 2 3
79.2 2 3
69.9 3 3
69.2 3 3
62.1 4 3
64.5 4 3
;
proc glm data=anova6;
  class cage mosquito;
  /* mosquito is nested within cage */
  model length = cage mosquito(cage);
  /* test cage against mosquito(cage) instead of the residual error */
  test h=cage e=mosquito(cage);
  /* save residuals and predicted values for the diagnostics below */
  output out=out1 r=res p=pred;
proc print data=out1; var res pred;
proc plot data=out1; plot res*pred;
proc univariate data=out1 normal plot; var res;
run;
Class       Levels   Values
mosquito    4        1 2 3 4
cage        3        1 2 3
Number of observations   24
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              11   2386.353333      216.941212    166.66    <.0001
Error              12     15.620000        1.301667
Corrected Total    23   2401.973333

Source             DF   Type I SS        Mean Square   F Value   Pr > F
cage                2    665.675833      332.837917    255.70    <.0001
mosquito(cage)      9   1720.677500      191.186389    146.88    <.0001

Tests of Hypotheses Using the MS for mosquito(cage) as an Error Term
Source             DF   Type I SS        Mean Square   F Value   Pr > F
cage                2    665.675833      332.837917      1.74    0.2295
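As a cross-check on the SAS output, the same sums of squares and F ratios can be recomputed directly from the 24 measurements. The sketch below uses plain Python rather than SAS, which is a substitution made here for illustration; the variable names are assumptions, and p-values are not computed because that would require an F-distribution routine outside the standard library.

```python
# (length, mosquito, cage) triples copied from the data listing.
data = [
    (58.5, 1, 1), (59.5, 1, 1), (77.8, 2, 1), (80.9, 2, 1),
    (84.0, 3, 1), (83.6, 3, 1), (70.1, 4, 1), (68.3, 4, 1),
    (69.8, 1, 2), (69.8, 1, 2), (56.0, 2, 2), (54.5, 2, 2),
    (50.7, 3, 2), (49.3, 3, 2), (63.8, 4, 2), (65.8, 4, 2),
    (56.6, 1, 3), (57.5, 1, 3), (77.8, 2, 3), (79.2, 2, 3),
    (69.9, 3, 3), (69.2, 3, 3), (62.1, 4, 3), (64.5, 4, 3),
]


def mean(xs):
    return sum(xs) / len(xs)


grand = mean([y for y, _, _ in data])
by_cage, by_cell = {}, {}
for y, mosq, cage in data:
    by_cage.setdefault(cage, []).append(y)
    by_cell.setdefault((cage, mosq), []).append(y)  # a mosquito is identified within its cage

ss_cage = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in by_cage.values())
ss_mosq = sum(len(ys) * (mean(ys) - mean(by_cage[cage])) ** 2
              for (cage, _), ys in by_cell.items())
ss_error = sum((y - mean(ys)) ** 2 for ys in by_cell.values() for y in ys)

# Degrees of freedom: 2 for cage, 9 for mosquito(cage), 12 for error.
ms_cage, ms_mosq, ms_error = ss_cage / 2, ss_mosq / 9, ss_error / 12

print(round(ss_cage, 4), round(ss_mosq, 4), round(ss_error, 4))
print(round(ms_cage / ms_mosq, 2))   # cage tested against mosquito(cage)
print(round(ms_mosq / ms_error, 2))  # mosquito(cage) tested against error
```

Testing cage against the error mean square gives the very large F of 255.70 in the Type I table, whereas the proper ratio against mosquito(cage) gives 1.74; the conclusion about cages changes completely.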
[Figure: scatter plot of res1 (vertical axis, -2 to 2) against res (horizontal axis, -2 to 2)]
[Figure: residuals (res, vertical axis, -2 to 2) plotted against predicted values (pred, horizontal axis, 45 to 85)]
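The points in the residual-versus-predicted plot can be reconstructed from the data. For the model with cage and mosquito(cage), the fitted value of each observation is the mean of its (cage, mosquito) cell, so the residual is the deviation from that cell mean. The Python sketch below relies on that fact; the variable names are illustrative.

```python
# Replicate pairs for each (cage, mosquito) cell, copied from the data listing.
cells = {
    (1, 1): (58.5, 59.5), (1, 2): (77.8, 80.9), (1, 3): (84.0, 83.6), (1, 4): (70.1, 68.3),
    (2, 1): (69.8, 69.8), (2, 2): (56.0, 54.5), (2, 3): (50.7, 49.3), (2, 4): (63.8, 65.8),
    (3, 1): (56.6, 57.5), (3, 2): (77.8, 79.2), (3, 3): (69.9, 69.2), (3, 4): (62.1, 64.5),
}

points = []  # (pred, res) pairs, one per observation
for ys in cells.values():
    pred = sum(ys) / len(ys)          # fitted value = cell mean
    for y in ys:
        points.append((pred, y - pred))

preds = [p for p, _ in points]
resids = [r for _, r in points]

print(round(min(preds), 2), round(max(preds), 2))    # 50.0 83.8
print(round(min(resids), 2), round(max(resids), 2))  # -1.55 1.55
```

Because each cell has only two replicates, the residuals occur in pairs of equal size and opposite sign, so the plot is necessarily symmetric about zero.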
Tests for Normality
Test                  --Statistic---     -----p Value------
Shapiro-Wilk          W      0.978828    Pr < W      0.8733
Kolmogorov-Smirnov    D      0.093842    Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.038078    Pr > W-Sq   >0.2500
Anderson-Darling      A-Sq   0.22057     Pr > A-Sq   >0.2500
Two-way ANOVA vs. nested ANOVA