AOV-Summary

Analyses of Variance
Review
Simple Situation
Genotype A: 135
Genotype B: 34
Simple Situation
Genotype      Observations           Mean
Genotype A    135  115  102  110     115.5
Genotype B     34   76   83   64      64.2
t-test
$t = \dfrac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{2\left[(\sigma_1^2 + \sigma_2^2)/(n_1 + n_2)\right]}}$
More than two treatments
                    Genotype
Rep.   Brundage   Lambert   Croft   Stephens
1         64         78       75       55
2         72         91       93       66
3         68         97       78       49
4         77         82       71       64
5         56         85       63       70
6         95         77       76       68
Multiple t-tests
• Brundage v Lambert; Brundage v Croft; Brundage v Stephens; Lambert v Croft; Lambert v Stephens; Croft v Stephens.
• Problems?
• If all six tests are done at the 5% significance level (95% confidence), 1 test in 20 is expected to be significant at random, so a single significant difference among the six tests may be due to chance alone.
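The size of the problem can be sketched numerically. The calculation below assumes the six tests are independent, which is only approximately true because the tests share data, so the figure is a rough guide rather than an exact error rate.

```python
from math import comb

genotypes = ["Brundage", "Lambert", "Croft", "Stephens"]
alpha = 0.05  # per-test significance level

n_tests = comb(len(genotypes), 2)            # number of pairwise t-tests
expected_false = alpha * n_tests             # expected number of chance "significant" results
familywise = 1 - (1 - alpha) ** n_tests      # chance of at least one false positive

print(n_tests, round(expected_false, 2), round(familywise, 3))
```

With four genotypes the chance of at least one spurious significant difference is roughly one in four, far above the nominal 5%.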
Analysis of Variance
• Is an elegant and quicker way to calculate a pooled error term.
• Analysis is simple in simple designs but can be complicated and lengthy in some designs (e.g. rectangular lattices).
• In some experimental designs the ANOVA is the only method to estimate a pooled error term.
Analysis of Variance
• It can provide an F-test to test specific hypotheses (e.g. to test general differences between treatments).
• Can be an invaluable initial contribution to the interpretation of experiments.
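Theory of Analysis of Variance
$\sum_{ij}(x_{ij}-\bar{x}_{..})^2 = \sum_{ij}[(x_{ij}-\bar{x}_{i.}) + (\bar{x}_{i.}-\bar{x}_{..})]^2$
$= \sum_{ij}[(x_{ij}-\bar{x}_{i.})^2 + 2(x_{ij}-\bar{x}_{i.})(\bar{x}_{i.}-\bar{x}_{..}) + (\bar{x}_{i.}-\bar{x}_{..})^2]$
$\sum_{ij}(x_{ij}-\bar{x}_{..})^2 = \sum_{ij}(x_{ij}-\bar{x}_{i.})^2 + k\sum_{i}(\bar{x}_{i.}-\bar{x}_{..})^2$
$k\sum_{i}(\bar{x}_{i.}-\bar{x}_{..})^2$ = Between Treatment SS
$\sum_{ij}(x_{ij}-\bar{x}_{i.})^2$ = Within Treatment SS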
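Theory of Analysis of Variance
BTMS $\sim \chi^2$ with $n-1$ df : WTMS $\sim \chi^2$ with $nk-n$ df
$\dfrac{\chi^2_{n-1}/(n-1)}{\chi^2_{nk-n}/(nk-n)} \sim F$ distribution with $n-1,\ nk-n$ df
Theory of Analysis of Variance
Source of variation    df      EMS
Between treatments     n-1     $\sigma_e^2 + k\sigma_t^2$
Within treatments      nk-n    $\sigma_e^2$
Total                  nk-1
$[\sigma_e^2 + k\sigma_t^2]/\sigma_e^2 = 1$, if $k\sigma_t^2 = 0$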
Assumptions behind the ANOVA
• Assumption of data being normally distributed.
• Homogeneity of error variance.
• Additivity of variance effects.
• Data collected from a properly randomized experiment.
Analyses of CRB Designs
$Y_{ij} = \mu + t_i + e_{ij}$
Analysis of Variance of CRB
Source                df      SS
Between treatments    k-1     $[G_1^2/n_1 + G_2^2/n_2 + \dots + G_k^2/n_k] - CF$
Within treatments     jk-k    By difference
Total                 jk-1    $[x_{11}^2 + x_{12}^2 + \dots + x_{jk}^2] - CF$
$CF = [\sum x_{ij}]^2/jk$
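The computational formulas translate directly into code. The following is a sketch only: `crd_anova` is an illustrative name, and the data are the four-genotype values from the earlier slide, read on the assumption that each genotype has the six values shown.

```python
def crd_anova(groups):
    """One-way ANOVA from treatment totals and the correction factor (CF)."""
    observations = [x for xs in groups.values() for x in xs]
    n_total = len(observations)
    cf = sum(observations) ** 2 / n_total                  # CF = (sum of x)^2 / jk
    total_ss = sum(x * x for x in observations) - cf
    between_ss = sum(sum(xs) ** 2 / len(xs) for xs in groups.values()) - cf
    within_ss = total_ss - between_ss                      # "by difference"
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    between_ms = between_ss / df_between
    within_ms = within_ss / df_within
    return {"between_ss": between_ss, "within_ss": within_ss, "total_ss": total_ss,
            "within_ms": within_ms, "F": between_ms / within_ms}

# Assumed layout: six replicate values per genotype
genotype_data = {
    "Brundage": [64, 72, 68, 77, 56, 95],
    "Lambert":  [78, 91, 97, 82, 85, 77],
    "Croft":    [75, 93, 78, 71, 63, 76],
    "Stephens": [55, 66, 49, 64, 70, 68],
}
result = crd_anova(genotype_data)
print({key: round(value, 2) for key, value in result.items()})
```

The F value is compared with the tabulated F on k-1 and jk-k degrees of freedom.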
Analyses of RCB Designs
$Y_{ij} = \mu + b_i + t_j + e_{ij}$
Analysis of Variance of RCB
Source        df            SS
Blocks        r-1           $[B_1^2 + B_2^2 + \dots + B_r^2]/t - CF$
Treatments    t-1           $[T_1^2 + T_2^2 + \dots + T_t^2]/r - CF$
Error         (r-1)(t-1)    By difference
Total         rt-1          $[x_{11}^2 + x_{12}^2 + \dots + x_{rt}^2] - CF$
$CF = [\sum x_{ij}]^2/rt$
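A short sketch of the RCB calculation follows. The function name `rcb_anova` and the 3 x 3 data table are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def rcb_anova(table):
    """RCB ANOVA; table[i][j] is the observation in block i, treatment j."""
    r, t = len(table), len(table[0])
    cf = sum(map(sum, table)) ** 2 / (r * t)
    total_ss = sum(x * x for row in table for x in row) - cf
    block_ss = sum(sum(row) ** 2 for row in table) / t - cf        # block totals B_i
    treat_ss = sum(sum(col) ** 2 for col in zip(*table)) / r - cf  # treatment totals T_j
    error_ss = total_ss - block_ss - treat_ss                      # "by difference"
    error_ms = error_ss / ((r - 1) * (t - 1))
    return {"block_ss": block_ss, "treat_ss": treat_ss, "error_ss": error_ss,
            "F_treatments": (treat_ss / (t - 1)) / error_ms}

# Hypothetical data: 3 blocks (rows) x 3 treatments (columns)
example = [
    [10, 12, 14],
    [11, 13, 18],
    [12, 14, 16],
]
rcb = rcb_anova(example)
print(rcb)
```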
Analyses of Latin Designs
$Y_{ijk} = \mu + r_i + c_j + t_{k(ij)} + e_{ijk}$
Analysis of Variance of Latin
Source        df            SS
Rows          t-1           $[R_1^2 + R_2^2 + \dots + R_t^2]/t - CF$
Columns       t-1           $[C_1^2 + C_2^2 + \dots + C_t^2]/t - CF$
Treatments    t-1           $[T_1^2 + T_2^2 + \dots + T_t^2]/t - CF$
Error         (t-1)(t-2)    By difference
Total         $t^2-1$       $[x_{11}^2 + x_{12}^2 + \dots + x_{tt}^2] - CF$
$CF = [\sum x_{ij}]^2/t^2$
Efficiency of Latin Squares
compared with CRB Design
$[MS_r + MS_c + (t-1)EMS] / [(t+1)EMS]$
If the resulting value is 325%, then the Latin square will increase precision by 225% over the CRB design, and a CRD would have needed 2.25 x 4 = 9 replicates to be as accurate.
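The efficiency formula is simple enough to sketch as a function. The mean squares below are hypothetical values chosen to give the 325% figure mentioned above; they are not taken from any experiment in these slides.

```python
def latin_vs_crd_efficiency(ms_rows, ms_cols, ms_error, t):
    """Relative efficiency (%) of a t x t Latin square over a completely randomized layout."""
    return 100 * (ms_rows + ms_cols + (t - 1) * ms_error) / ((t + 1) * ms_error)

# Hypothetical mean squares for a 4 x 4 Latin square
efficiency = latin_vs_crd_efficiency(ms_rows=60.0, ms_cols=72.5, ms_error=10.0, t=4)
gain = efficiency - 100  # percentage increase in precision

print(f"efficiency = {efficiency:.0f}%, gain = {gain:.0f}%")
```

When the row and column mean squares are no larger than the error mean square, the value falls to 100% or below and the blocking has not paid for the degrees of freedom it used.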
Efficiency of Latin Squares
compared with RCB Design
Row (RCB) = $[MS_r + (t-1)EMS] / [t \cdot EMS]$
Col (RCB) = $[MS_c + (t-1)EMS] / [t \cdot EMS]$
-19%
+266%
☺
+226%
-19%
☺
Analyses of Lattice Squares
$Y_{ijk} = \mu + r_i + b_j^{a} + t_k^{a} + e_{ijk}$
Lattice Square ANOVA
Source         df    SS        MS       F
Reps            4     5,946    1,486    4.03 *
Blk(adj)       15    11,382      759    2.35 ns
Intra error    45    14,533      323
T(adj)         15    24,030    1,602    4.34 **
Eff. Error     45    16,605      369    -
Efficiency of Lattice Design
$100 \times [\text{Blk(adj) SS} + \text{Intra error SS}] / [k(k^2-1)\,EMS]$
100 x [11,382 + 14,533]/[4(15)369]
= 117%
Dealing with Wrongful Data
• It is usually assumed that the data collected are correct!
• Why would data not be correct?
  Mis-recording, mis-classification, transcription errors, errors in data entry.
  Outliers.
Dealing with Wrongful Data
• What things can help?
  Keep detailed records on each experimental unit.
  Decide beforehand what values would arouse suspicion.
Dealing with Wrongful Data
• What do you do with suspicious data?
  If correct and discarded, valuable information is lost, and this will bias the results.
  If wrong and included, it will bias the results and may have extreme consequences.
Checking ANOVA Accuracy
• Coefficient of variation: $[\sigma_e/\mu] \times 100$.
  CV = (√100.9/73.75) x 100 = 13.6%
• R² value = {[TSS-ESS]/TSS} x 100.
  R² = (1636/3654) x 100 = 44.8%.
• Compare the effect of blocking or sub-blocking (discussed later).
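Both checks can be sketched from the summary figures. The error mean square (100.9), grand mean (73.75) and total SS (3654) are the values quoted on the slide; the error df of 20 is an assumption (4 genotypes x 6 replicates in a completely randomized layout), used here to recover the error SS.

```python
from math import sqrt

error_ms = 100.9     # error mean square from the slide
grand_mean = 73.75   # overall mean from the slide
total_ss = 3654      # total sum of squares from the slide
error_df = 20        # assumed: 24 observations minus 4 genotypes

error_ss = error_ms * error_df
cv = sqrt(error_ms) / grand_mean * 100           # coefficient of variation, %
r_squared = (total_ss - error_ss) / total_ss * 100

print(f"CV = {cv:.1f}%, R2 = {r_squared:.1f}%")
```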
ANOVA of Factorial Designs
Factorial AOV Example
Source          df    SS       MS        F
Reps             2     0.01     0.005    ns
Seed Density     2     2.75     1.375    33.9 ***
Nitrogen         5    81.56    16.312    401.9 ***
SxN             10     1.33     0.133    3.28 ***
Error           34     1.38     0.041
Total           53    87.03
Factorial AOV Example
Source              df    SS        MS        F
Reps x Seed Rate     4    0.2268    0.0567    1.63 ns
Rep x N rate        10    0.4528    0.0453    1.30 ns
Rep x Seed x N      20    0.6936    0.0347
Split-plot AOV
Source          df    SS        MS        F
Reps             2     0.01      0.005    ns
Seed Density     2     2.75      1.375    24.2 ***
Error (1)        4     0.2268    0.057
Nitrogen         5    81.56     16.312    426.9 ***
SxN             10     1.33      0.133    3.5 ***
Error (2)       30     1.1464    0.038    -
Total           53    87.03
Strip-plot AOV
Source            df    SS        MS        F
Reps               2     0.01      0.005    ns
Seed Density       2     2.75      1.375    24.2 ***
Error 1 (Seed)     4     0.2268    0.0567
Nitrogen           5    81.56     16.312    360.1 ***
Error 2 (N)       10     0.4528    0.0453
SxN               10     1.33      0.133    3.83 ***
Error 3 (SxN)     20     0.6936    0.0347   -
Total             53    87.03
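The difference between the split-plot and strip-plot analyses lies in which Rep interaction serves as the error term for each F-test. The sketch below recomputes the F values from the sums of squares in the tables above; the dictionary keys and the helper name `pooled_ms` are illustrative.

```python
treatment_ms = {"seed": 1.375, "nitrogen": 16.312, "SxN": 0.133}

# (SS, df) for the three Rep interactions from the factorial breakdown
rep_x_seed = (0.2268, 4)
rep_x_n = (0.4528, 10)
rep_x_seed_x_n = (0.6936, 20)

def pooled_ms(*terms):
    """Pool (SS, df) pairs into a single error mean square."""
    return sum(ss for ss, _ in terms) / sum(df for _, df in terms)

# Split-plot: seed density is tested against Error (1); nitrogen and SxN against pooled Error (2)
error_1 = pooled_ms(rep_x_seed)
error_2 = pooled_ms(rep_x_n, rep_x_seed_x_n)
split_plot_f = {"seed": treatment_ms["seed"] / error_1,
                "nitrogen": treatment_ms["nitrogen"] / error_2,
                "SxN": treatment_ms["SxN"] / error_2}

# Strip-plot: each effect is tested against its own Rep interaction
strip_plot_f = {"seed": treatment_ms["seed"] / pooled_ms(rep_x_seed),
                "nitrogen": treatment_ms["nitrogen"] / pooled_ms(rep_x_n),
                "SxN": treatment_ms["SxN"] / pooled_ms(rep_x_seed_x_n)}

print(split_plot_f)
print(strip_plot_f)
```

Small differences from the tabulated F values come from rounding of the mean squares on the slides.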
Fixed and Random Effects
Expected Mean Squares
• Dependent on whether factor effects are Fixed or Random.
• Necessary to determine which F-tests are appropriate and which are not.
Setting Expected Mean Squares
• The expected mean square for a source of variation (say X) contains:
  the error term;
  a term in $\sigma_X^2$ (or $S_X^2$);
  a variance term for other selected interactions involving the factor X.
Coefficients for EMS
• Coefficient for the error mean square is always 1.
• Coefficient of other expected mean squares is n times the product of the levels of the factors that do not appear in the factor name.
Expected Mean Squares
• Which interactions to include in an EMS?
  All the letters (i.e. A, B, C, …) in X appear in the interaction.
  All the other letters in the interaction (except those in X) are Random Effects.
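These rules can be sketched as a small function. This is an illustration of the rules as stated on the slides, assuming single-letter factor names and a balanced design; `expected_mean_square` and its arguments are hypothetical names, and the output does not distinguish a variance component from a fixed-effect term.

```python
from itertools import combinations
from math import prod

def expected_mean_square(source, levels, random_factors, reps):
    """Return the EMS of `source` as (coefficient, term) pairs, error term first.

    source:         letters of the source of variation, e.g. "A" or "AB"
    levels:         dict mapping each factor letter to its number of levels
    random_factors: set of factor letters treated as random effects
    reps:           number of replicates (n)
    """
    in_source = set(source)
    others = [f for f in sorted(levels) if f not in in_source]
    terms = [(1, "error")]  # coefficient of the error term is always 1
    # Largest interactions first, finishing with the source itself (no extra letters)
    for size in range(len(others), -1, -1):
        for extra in combinations(others, size):
            # Include the term only if every letter outside the source is a random effect
            if all(f in random_factors for f in extra):
                name = in_source | set(extra)
                coefficient = reps * prod(levels[f] for f in levels if f not in name)
                terms.append((coefficient, "".join(sorted(name))))
    return terms

# Example: A fixed (3 levels), B random (4 levels), 2 replicates
levels = {"A": 3, "B": 4}
ems_a = expected_mean_square("A", levels, random_factors={"B"}, reps=2)
ems_b = expected_mean_square("B", levels, random_factors={"B"}, reps=2)
print(ems_a)
print(ems_b)
```

In this example the EMS for A picks up the AxB term because B is random, so A is tested against AxB, while B has no interaction term and is tested against error.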
A and B Fixed Effects
Source    d.f.           EMSq
A (a)     a-1            $\sigma_e^2 + rb\,S_A^2$
B (b)     b-1            $\sigma_e^2 + ra\,S_B^2$
AxB       (a-1)(b-1)     $\sigma_e^2 + r\,S_{AB}^2$
Error     r(a-1)(b-1)    $\sigma_e^2$
Model yield=A B A*B;
A and B Random Effects
Source    d.f.           EMSq
A (a)     a-1            $\sigma_e^2 + r\sigma_{AB}^2 + rb\,\sigma_A^2$
B (b)     b-1            $\sigma_e^2 + r\sigma_{AB}^2 + ra\,\sigma_B^2$
AxB       (a-1)(b-1)     $\sigma_e^2 + r\sigma_{AB}^2$
Error     r(a-1)(b-1)    $\sigma_e^2$
Model yield=A B A*B;
Test h = A B e=A*B;
A Fixed and B Random
Source    d.f.           EMSq
A (a)     a-1            $\sigma_e^2 + r\sigma_{AB}^2 + rb\,S_A^2$
B (b)     b-1            $\sigma_e^2 + ra\,\sigma_B^2$
AxB       (a-1)(b-1)     $\sigma_e^2 + r\sigma_{AB}^2$
Error     r(a-1)(b-1)    $\sigma_e^2$
Model yield=A B A*B;
Test h = A e=A*B;
Multiple Comparisons
• Multiple Range Tests: Tukey’s and Duncan’s.
• Orthogonal Contrasts.
Tukey’s Multiple Range Test
$W = q(p,f) \times se[\bar{x}]$
$se[\bar{x}] = \sqrt{\sigma^2/n}$
$\sqrt{94{,}773/4} = 153.9$
W = 4.64 x 153.9 = 714.1
Tukey’s Multiple Range Test
Comparison                Result
A = 2678 – W = 1963.9     A=B, C & D; A>E, F & G
B = 2552 – W = 1837.9     B=C & D; B>E, F & G
C = 2128 – W = 1413.9     C=D, E & F; C>G
D = 2127 – W = 1412.9     D=E & F; D>G
E = 1796 – W = 1081.9     E=F & G
F = 1681 – W = 966.9      F=G
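The Tukey comparisons can be reproduced in a few lines. The error mean square, n and q are the values on the slide; the mean for genotype G (1316) is taken from the summary table later in the deck. Two means are declared different when they differ by more than W.

```python
from itertools import combinations
from math import sqrt

error_ms = 94773   # error mean square from the slide
n = 4              # observations per genotype mean
q = 4.64           # studentized range value q(p, f) quoted on the slide

se_mean = sqrt(error_ms / n)
W = q * se_mean    # minimum significant difference

means = {"A": 2678, "B": 2552, "C": 2128, "D": 2127, "E": 1796, "F": 1681, "G": 1316}

# Pairs of genotypes whose means differ by more than W
significant = [(a, b) for a, b in combinations(means, 2)
               if abs(means[a] - means[b]) > W]

print(f"se = {se_mean:.1f}, W = {W:.1f}")
print(significant)
```

Unlike Duncan's test, a single critical value W is used for every pair, whatever the number of means ranked between them.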
Duncan’s Multiple Range Test
p      2       3       4       5       6       7
rp     2.94    3.02    3.18    3.24    3.30    3.33
$R_p = r_p \times se[\bar{x}]$
p      2       3       4       5       6       7
Rp     453     465     489     499     508     513
Duncan’s Multiple Range Test
Comparison               Result
A = 2678 – R7 = 2165     A=B; A>C, D, E, F & G
B = 2552 – R6 = 2044     B=C & D; B>E, F & G
C = 2128 – R5 = 1629     C=D, E & F; C>G
D = 2127 – R4 = 1638     D=E & F; D>G
E = 1796 – R3 = 1331     E=F; E>G
F = 1681 – R2 = 1228     F=G
Multiple Comparisons
Genotype    Tukey        Duncan
A           2678 a       2678 a
B           2552 ab      2552 ab
C           2128 abc     2128 bc
D           2127 abcd    2127 bcd
E           1796 cde     1796 cde
F           1681 cdef    1681 cdef
G           1316 ef      1316 ef
Orthogonal Contrasts
• Maximum number of orthogonal contrasts is the df for treatment.
• SS of all contrasts must equal the SS of the treatment effect.
• Remainder SS is the difference between the treatment SS and the sum of the contrast SS.
• Contrasts can help understand main effects and interactions.
Orthogonality
$\sum c_i = 0$
$\sum [c_{1i} \times c_{2i}] = 0$
-1 -1 +1 +1     $\sum c_i = 0$
-1 +1 -1 +1     $\sum c_i = 0$
+1 -1 -1 +1     $\sum c_i = 0$
Calculating Orthogonal Contrasts
d.f. (single contrast) = 1
S.Sq(contrast) = M.Sq = $[\sum c_i \times Y_i]^2 / [n \sum c_i^2]$
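The two orthogonality conditions and the sum-of-squares formula can be sketched as follows. The three coefficient sets are those listed above; the treatment totals in the usage example are hypothetical, and Y_i is taken to be a treatment total based on n observations.

```python
from itertools import combinations

contrasts = [
    [-1, -1, +1, +1],
    [-1, +1, -1, +1],
    [+1, -1, -1, +1],
]

def is_contrast(c):
    """Coefficients of a contrast must sum to zero."""
    return sum(c) == 0

def are_orthogonal(c1, c2):
    """Two contrasts are orthogonal when the sum of cross-products is zero."""
    return sum(a * b for a, b in zip(c1, c2)) == 0

def contrast_ss(c, totals, n):
    """SS (= MS, 1 df) of a contrast: [sum(c_i * Y_i)]^2 / [n * sum(c_i^2)]."""
    return sum(ci * yi for ci, yi in zip(c, totals)) ** 2 / (n * sum(ci * ci for ci in c))

all_valid = all(is_contrast(c) for c in contrasts)
all_orthogonal = all(are_orthogonal(c1, c2) for c1, c2 in combinations(contrasts, 2))

# Hypothetical treatment totals, each from n = 2 observations
example_ss = contrast_ss(contrasts[0], totals=[10, 20, 30, 40], n=2)
print(all_valid, all_orthogonal, example_ss)
```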
Analyses of Variance
• Detect significant differences between treatment means.
• Determine trends that may exist as a result of varying specific factor levels.
Trend Analyses
[Figure: four panels showing the shape of the Linear, Quadratic, Cubic and Quartic trend contrasts]
When rabbits come to dinner
• Two carrot cultivars (‘Orange Gold’ and ‘Bugs Delight’).
• Four seeding rates (1.5, 2.0, 2.5 and 3.0 lb/acre).
• Three replicates.
• TQ #1 p. 155.
Analysis of Variance
Source          d.f.    S.Sq       M.Sq      F-val
Replicates       2       0.3575    0.1787     0.50 ns
Cultivar         1       0.0122    0.0122     0.03 ns
Seeding Dens     3      12.2496    4.0832    11.9 ***
C x SD           3       6.4490    2.1497     6.27 ***
Error           14       4.7967    0.3426
When rabbits come to dinner
                        Seeding Rate (lb/acre)
Cultivar          1.5        2.0        2.5        3.0
Orange Gold       4.53 bc    4.01 cd    5.23 ab    4.48 bc
Bug’s Delight     3.25 d     3.70 cd    5.41 ab    6.08 a
Mean              3.89 B     3.85 B     5.32 A     5.28 A
When rabbits come to dinner
                       Seeding rate (lb/acre)
Cultivars          1.5      2.0      2.5      3.0
Orange Gold        4.53     4.01     5.23     4.48
Bug’s Delight      3.25     3.70     5.41     6.08
Total             23.34    23.13    31.92    31.68     SSq
Linear             -3       -1        1        3       1143/120
Quadratic           1       -1       -1        1       0.0/24
Cubic              -1        3       -3        1       325/120
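The trend sums of squares follow from the seeding-rate totals and the polynomial coefficients. In this sketch each total is assumed to be the sum of n = 6 plots (2 cultivars x 3 replicates), which is what gives the divisors 120 and 24. Because the totals are rounded, the results differ slightly in the second decimal from the values in the ANOVA table.

```python
totals = [23.34, 23.13, 31.92, 31.68]   # seeding-rate totals at 1.5, 2.0, 2.5, 3.0 lb/acre
n = 6                                   # assumed plots per total: 2 cultivars x 3 replicates

coefficients = {
    "linear":    [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic":     [-1, 3, -3, 1],
}

trend_ss = {}
for name, c in coefficients.items():
    contrast_total = sum(ci * yi for ci, yi in zip(c, totals))
    divisor = n * sum(ci * ci for ci in c)          # 120 for linear and cubic, 24 for quadratic
    trend_ss[name] = contrast_total ** 2 / divisor

print({name: round(value, 4) for name, value in trend_ss.items()})
print(round(sum(trend_ss.values()), 2))  # close to the seeding-density SS with 3 df
```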
Analysis of Variance
Source          d.f.    S.Sq      M.Sq      F-val
Replicates       2      0.3575    0.1787     0.50 ns
Cultivar         1      0.0122    0.0122     0.03 ns
Seeding (L)      1      9.5316    9.5316    27.82 ***
        (Q)      1      0.0000    0.0000     0.00 ns
        (C)      1      2.7180    2.7180     7.93 **
C x SD           3      6.4490    2.1497     6.27 ***
Error           14      4.7967    0.3426
When rabbits come to dinner
Cultivar         Seeding Rate    Yield    Cv’s    Linear    C x L
Orange Gold      1.5             4.53     -1      -3        +3
                 2.0             4.01     -1      -1        +1
                 2.5             5.23     -1      +1        -1
                 3.0             4.48     -1      +3        -3
Bug’s Delight    1.5             3.25     +1      -3        -3
                 2.0             3.70     +1      -1        -1
                 2.5             5.41     +1      +1        +1
                 3.0             6.08     +1      +3        +3
Analysis of Variance
Source          d.f.    S.Sq      M.Sq      F-val
Replicates       2      0.3575    0.1787     0.50 ns
Cultivar         1      0.0122    0.0122     0.03 ns
Seeding (L)      1      9.5316    9.5316    27.82 ***
        (Q)      1      0.0000    0.0000     0.00 ns
        (C)      1      2.7180    2.7180     7.93 **
C x (L)          1      6.2199    6.2199    18.15 ***
    (Q)          1      0.0794    0.0794     0.23 ns
    (C)          1      0.1498    0.1498     0.44 ns
Error           14      4.7967    0.3426
When rabbits come to dinner
[Figure: Yield (3 to 6.5) plotted against Seeding Rate (1.5 to 3 lb/acre) for Orange Gold and Bug’s Delight]
End of Analyses of Variance Section