L_10_Split_plot

Stat 512
Chapter 18 Split-Plot Designs and Repeated Measures
Split Plot:
When an experiment has a factorial structure and one of the factors
is hard to implement (time consuming, expensive, etc.), one often
uses a split-plot design, in which the randomization is done in
two steps.
Hence there are two types of experimental units: the whole units
(whole plots) and the subunits (subplots). The researcher must identify
the size of the experimental units, along with the associated design and
treatment structures, in order to properly define the model and analyze
the observed data.
Example 1:
Consider a situation where we are interested in 2 factors: a
nitrogen treatment applied to the soil (Factor A, 3 levels) and the
variety of wheat (Factor B, 4 levels). Our response is the yield of wheat.
A true Factorial Design would require we randomize the 12 treatment
combinations on the 36 units available to us. However, there is a
practical problem. The Nitrogen Fertilizer is applied using a tractor
and a particular setting of the tractor allows a certain level of the
fertilizer to be put on the soil.
So let's say random assignment requires the following assignment for
A1B1:
[Figure: field layout in which the plots assigned to A1B1 are scattered across the field]
As you can see from the plot, it is fairly difficult to apply A1,
because after each plot the tractor has to be moved to the next plot
receiving A1B1. A more convenient choice is to first select the
tracts that get A1, A2 and A3, treat each tract in one go, and then
randomize the varieties within each tract.
[ A1 A1 A1 A1 ]   [ A3 A3 A3 A3 ]   [ A3 A3 A3 A3 ]
[ A2 A2 A2 A2 ]   [ A1 A1 A1 A1 ]   [ A2 A2 A2 A2 ]
[ A3 A3 A3 A3 ]   [ A2 A2 A2 A2 ]   [ A1 A1 A1 A1 ]
So the part that is marked with a box is considered a WHOLE plot and
the 4 plots within it are called subplots.
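The two-step randomization of Example 1 can be sketched in a few lines of Python. This is only an illustration, assuming 9 whole plots (3 per nitrogen level) of 4 subplots each; the function name `randomize_split_plot` and the level labels are hypothetical and do not come from the course material.

```python
import random

def randomize_split_plot(whole_levels, sub_levels, reps, seed=None):
    """Two-stage randomization: whole-plot levels first, then
    sub-plot levels independently inside every whole plot."""
    rng = random.Random(seed)
    # Stage 1: each whole-plot level goes on `reps` whole plots, in random order.
    whole_plots = [lvl for lvl in whole_levels for _ in range(reps)]
    rng.shuffle(whole_plots)
    layout = []
    for wp in whole_plots:
        # Stage 2: a fresh random order of the sub-plot levels in this whole plot.
        subs = list(sub_levels)
        rng.shuffle(subs)
        layout.append((wp, subs))
    return layout

# Example 1: nitrogen (A, 3 levels) on whole plots, wheat variety (B, 4 levels) on subplots.
layout = randomize_split_plot(["A1", "A2", "A3"], ["B1", "B2", "B3", "B4"], reps=3, seed=512)
for wp, subs in layout:
    print(wp, subs)
```

Every whole plot receives one nitrogen level and all four varieties, so the tractor setting changes only between whole plots.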
Example 2: Consider a CRD with a one-way treatment structure.
In this case, suppose the treatment structure consists of three (3)
varieties of wheat (V1, V2 and V3), planted on twelve (12) randomly
selected farms (large experimental units). The response for this
experiment might be wheat yield in bushels per acre. We randomize
the varieties to the farms. This design layout might appear as follows:
V1   V2   V1
V3   V1   V3
V3   V1   V2
V2   V3   V2
However, the researcher might also be interested in the effects of two
different fertilizers ( F1 and F2 ) on yield. The CRD presented earlier can
be modified by splitting each farm (exp. unit) in half and then
randomly assigning the fertilizers: one fertilizer to each half
experimental unit (sub-unit). This modified design might appear as
follows:
V1F1 V1F2 | V4F2 V4F1 | V2F1 V2F2
V2F2 V2F1 | V3F1 V3F2 | V4F2 V4F1
V1F1 V1F2 | V3F2 V3F1 | V3F1 V3F2
V4F1 V4F2 | V1F2 V1F1 | V2F2 V2F1
In this experiment there are two different sizes of experimental units:
the large units are the farms; and the small units (sub-units) are the
half-farms.
This experimental design consists of two components:
1) Whole Plot Design and Treatment Structure;
2) Subplot Design and Treatment Structure.
Whole Plot Design and Treatment Structure:
A CRD (exp. unit = farm) with a one-way treatment structure (Wheat
Variety).
Subplot Design and Treatment Structure:
An RCBD (exp. unit = half-farm) with a one-way treatment structure
(Fertilizer).
Effects Model
$$
Y_{ijk} = \mu + \text{wholeplot\_TRT}_i + \text{wholeplot\_error}_{i(j)}
        + \text{subplot\_TRT}_k + (\text{subplot\_TRT} * \text{wholeplot\_TRT})_{ik}
        + \text{subplot\_error}_{(ij)k}
$$
i = 1, 2, ..., a
j = 1, 2, ..., r
k = 1, 2, ..., b
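Expected Mean Squares
$$
\begin{aligned}
\text{EMS(Wholeplot)}          &= \sigma_e^2 + b\,\sigma_{\text{WholePlot}}^2 + rb\,\sigma_{\text{Variety}}^2 \\
\text{EMS(Whole Plot Error)}   &= \sigma_e^2 + b\,\sigma_{\text{WholePlot}}^2 \\
\text{EMS(subplot)}            &= \sigma_e^2 + ra\,\sigma_{\text{Fertilizer}}^2 \\
\text{EMS(wholeplot*subplot)}  &= \sigma_e^2 + r\,\sigma_{\text{Fertilizer*Variety}}^2 \\
\text{EMS(Subplot Error)}      &= \sigma_e^2
\end{aligned}
$$
Here wholeplot = wheat variety and subplot = fertilizer.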
Anova Table
Source     SS                    df            MS                                  F0
W          SS_Wheat              a-1           SS_Wheat/(a-1)                      MS_Wheat/MS_WholePlotError
WP Error   SS_WholePlotError     (r-1)a        SS_WholePlotError/((r-1)a)
F          SS_Fertilizer         b-1           SS_Fertilizer/(b-1)                 MS_Fertilizer/MS_Error
F*W        SS_Fertilizer*Wheat   (a-1)(b-1)    SS_Fertilizer*Wheat/((a-1)(b-1))    MS_Fertilizer*Wheat/MS_Error
SP Error   SS_Error              a(r-1)(b-1)   SS_Error/(a(r-1)(b-1))
W = Wheat; F = Fertilizer; WP Error = Whole Plot Error; SP Error = Subplot Error
Example 3:
In an experiment on the preparation of chocolate cakes, conducted at
Iowa State College, 3 recipes for preparing the batter were compared.
Recipes I and II differed in that the chocolate was added at 40°C
and 60°C, respectively, while Recipe III contained extra sugar. In
addition, 6 different baking temperatures were tested: these ranged
in 10°C steps from 175°C to 225°C. Each time that a mix was made by a
recipe, enough batter was prepared for 6 cakes, each of which was
baked at a different temperature. In this way, 5 replicates (batches) of
each recipe were made.
The data from this experiment are shown in the following table:
Breaking Angle for Cakes (Degrees, Cochran and Cox, 1957).
                    Temperature
Recipe   Rep    175   185   195   205   215   225
1        1       42    46    47    39    53    42
1        2       47    29    35    47    57    45
1        3       32    32    37    43    45    45
1        4       26    32    37    43    39    26
1        5       28    30    31    37    41    47
2        1       39    46    51    49    55    42
2        2       35    46    47    39    52    61
2        3       34    30    42    35    42    35
2        4       25    26    28    46    37    37
2        5       31    30    29    35    40    36
3        1       46    44    45    46    48    63
3        2       43    43    43    46    47    58
3        3       33    24    40    37    41    38
3        4       38    41    38    30    36    35
3        5       21    25    31    35    33    23
Model:
$$
Y_{ijk} = \mu + \text{Recipe}_i + \text{batch(recipe)}_{i(j)}
        + \text{temperature}_k + (\text{recipe} * \text{temperature})_{ik} + e_{(ij)k}
$$
i = 1, 2, 3
j = 1, 2, ..., 5
k = 1, 2, ..., 6
SAS results:
The GLM Procedure
Source               Type III Expected Mean Square
Recipe               Var(Error) + 6 Var(Batch(Recipe)) + Q(Recipe,Recipe*Temperature)
Batch(Recipe)        Var(Error) + 6 Var(Batch(Recipe))
Temperature          Var(Error) + Q(Temperature,Recipe*Temperature)
Recipe*Temperature   Var(Error) + Q(Recipe*Temperature)

The GLM Procedure
Tests of Hypotheses for Mixed Model Analysis of Variance
Dependent Variable: Angle
Source      DF    Type III SS    Mean Square   F Value   Pr > F
* Recipe     2       1.800000       0.900000      0.00   0.9969
Error       12    3457.466667     288.122222
Error: MS(Batch(Recipe))
* This test assumes one or more other fixed effects are zero.

Source                DF    Type III SS    Mean Square   F Value   Pr > F
Batch(Recipe)         12    3457.466667     288.122222      8.68   <.0001
* Temperature          5    1149.166667     229.833333      6.92   <.0001
Recipe*Temperature    10     183.533333      18.353333      0.55   0.8452
Error: MS(Error)      60    1992.133333      33.202222
* This test assumes one or more other fixed effects are zero.
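The SAS sums of squares for the cake example can be checked by hand. The sketch below is a plain-Python recomputation from the breaking-angle table, not the SAS procedure itself; the variable names are illustrative. It uses the usual correction-factor formulas for a split plot with a CRD on the whole plots, with recipe tested against batch(recipe) and temperature tested against the subplot error.

```python
# angles[recipe][rep] = breaking angles at 175, 185, ..., 225 degrees C
angles = [
    [[42, 46, 47, 39, 53, 42], [47, 29, 35, 47, 57, 45], [32, 32, 37, 43, 45, 45],
     [26, 32, 37, 43, 39, 26], [28, 30, 31, 37, 41, 47]],
    [[39, 46, 51, 49, 55, 42], [35, 46, 47, 39, 52, 61], [34, 30, 42, 35, 42, 35],
     [25, 26, 28, 46, 37, 37], [31, 30, 29, 35, 40, 36]],
    [[46, 44, 45, 46, 48, 63], [43, 43, 43, 46, 47, 58], [33, 24, 40, 37, 41, 38],
     [38, 41, 38, 30, 36, 35], [21, 25, 31, 35, 33, 23]],
]
a, r, b = 3, 5, 6  # recipes, batches per recipe, temperatures

all_y = [y for rec in angles for batch in rec for y in batch]
cf = sum(all_y) ** 2 / (a * r * b)  # correction factor

# Totals at each level of the design
recipe_tot = [sum(sum(batch) for batch in rec) for rec in angles]
batch_tot = [sum(batch) for rec in angles for batch in rec]
temp_tot = [sum(angles[i][j][k] for i in range(a) for j in range(r)) for k in range(b)]
cell_tot = [sum(angles[i][j][k] for j in range(r)) for i in range(a) for k in range(b)]

ss_total = sum(y * y for y in all_y) - cf
ss_recipe = sum(t * t for t in recipe_tot) / (r * b) - cf
ss_batch = sum(t * t for t in batch_tot) / b - cf - ss_recipe      # batch(recipe)
ss_temp = sum(t * t for t in temp_tot) / (a * r) - cf
ss_rt = sum(t * t for t in cell_tot) / r - cf - ss_recipe - ss_temp
ss_error = ss_total - ss_recipe - ss_batch - ss_temp - ss_rt       # subplot error

ms_batch = ss_batch / (a * (r - 1))             # whole-plot error, 12 df
ms_error = ss_error / (a * (r - 1) * (b - 1))   # subplot error, 60 df

f_recipe = (ss_recipe / (a - 1)) / ms_batch              # whole-plot test
f_temp = (ss_temp / (b - 1)) / ms_error                  # subplot test
f_rt = (ss_rt / ((a - 1) * (b - 1))) / ms_error          # interaction test

print(round(ss_recipe, 4), round(ss_batch, 4), round(ss_temp, 4))
print(round(f_recipe, 2), round(f_temp, 2), round(f_rt, 2))
```

The whole-plot factor (recipe) has only 12 error degrees of freedom and a much larger error mean square than the subplot factor (temperature), which is the pattern the expected mean squares predict.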
Least Squares Means for effect Temperature
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: Angle
i/j        1         2         3         4         5         6
1          -    0.8996    0.0580    0.0077    <.0001    0.0007
2     0.8996         -    0.0759    0.0108    <.0001    0.0010
3     0.0580    0.0759         -    0.4133    0.0092    0.1047
4     0.0077    0.0108    0.4133         -    0.0664    0.4133
5     <.0001    <.0001    0.0092    0.0664         -    0.2999
6     0.0007    0.0010    0.1047    0.4133    0.2999         -
Note: To ensure overall protection level, only probabilities associated with pre-planned
comparisons should be used.
Split-Plot vs. Split-Block Designs
Some authors distinguish between the "split-block" and the "split-plot"
designs. The distinction is at the whole-plot level of the design:
Whole-Plot Design
The whole-plot design can be a CRD, an RCBD or a Latin Square
design.
The whole-plot treatment structure can be one-way, two-way, etc.
Sub-Plot Design
The sub-plot design is always an RCBD.
The sub-plot treatment structure can be one-way, two-way, etc.
In addition, sub-plot designs can be split multiple times to produce
designs which are split-split plot, split-split-split plot, etc.
Example
A researcher is interested in comparing the yield among four (4)
varieties of Oats which are planted in combination with four (4) seed
treatments. The design chosen consists of four (4) field strips, each of
which is divided into four (4) equal sized units. The four oat varieties
are randomly assigned to the four plots within each field strip.
Following the assignment of the oat varieties, each experimental
unit containing a variety of oat was subdivided into four sub-units.
The four seed treatments were randomly assigned to the four sub-units.
This structure can be visualized in the following field strips:
         Seed                 Variety
Block    Treatment    Oat 1   Oat 2   Oat 3   Oat 4
1        1             42.9    53.3    62.3    75.4
1        2             53.8    57.6    63.4    70.3
1        3             49.5    59.8    64.5    68.8
1        4             44.4    64.1    63.6    71.6
2        1             41.6    69.6    58.5    65.6
2        2             58.5    69.6    50.4    67.3
2        3             53.8    65.8    46.1    65.3
2        4             41.8    57.4    56.1    69.4
3        1             28.9    45.4    44.6    54.0
3        2             43.9    42.4    45.0    57.6
3        3             40.7    41.4    62.6    45.6
3        4             28.3    44.1    52.7    56.6
4        1             35.1    35.1    50.3    52.7
4        2             51.9    51.9    46.7    58.5
4        3             45.4    45.4    50.3    51.0
4        4             51.6    51.6    51.8    47.4
Split Block Design - Effects Model
$$
Y_{ijk} = \mu + \text{Block}_i + \text{Oat}_j + \text{WholePlotError}_{ij}
        + \text{Seed}_k + (\text{Seed} * \text{Oat})_{jk} + \text{SubplotError}_{ijk}
$$
$$
\text{WholePlotError}_{ij} = (\text{Block} * \text{Oat})_{ij}
$$
$$
\text{SubplotError}_{ijk} = (\text{Block} * \text{Oat} * \text{Seed})_{ijk} + (\text{Block} * \text{Seed})_{ik}
$$
i = 1, 2, ..., r
j = 1, 2, ..., a
k = 1, 2, ..., b
Anova Table
Source     SS                   df            MS                                F0
Block      SS_Block             r-1           SS_Block/(r-1)
Oat        SS_Oat               a-1           SS_Oat/(a-1)                      MS_Oat/MS_WholePlotError
WP Error   SS_WholePlotError    (r-1)(a-1)    SS_WholePlotError/((r-1)(a-1))
Seed       SS_Seed              b-1           SS_Seed/(b-1)                     MS_Seed/MS_Error
Seed*Oat   SS_Seed*Oat          (a-1)(b-1)    SS_Seed*Oat/((a-1)(b-1))          MS_Seed*Oat/MS_Error
SP Error   SS_Error             a(r-1)(b-1)   SS_Error/(a(r-1)(b-1))
WP Error = Whole Plot Error; SP Error = Subplot Error
Source             DF    Sum of Squares    Mean Square   F Value   Pr > F
Model              27       7066.191875     261.710810     12.89   <.0001
Error              36        731.202500      20.311181
Corrected Total    63       7797.394375

R-Square    Coeff Var    Root MSE    yield Mean
0.906225     8.534077    4.506793      52.80938

Source       DF      Type I SS     Mean Square   F Value   Pr > F
block         3    2842.873125      947.624375     46.66   <.0001
oat           3    2848.021875      949.340625     46.74   <.0001
block*oat     9     618.294375       68.699375      3.38   0.0042
seed          3     170.536875       56.845625      2.80   0.0539
oat*seed      9     586.465625       65.162847      3.21   0.0059
Source       DF    Type III SS     Mean Square   F Value   Pr > F
block         3    2842.873125      947.624375     46.66   <.0001
oat           3    2848.021875      949.340625     46.74   <.0001
block*oat     9     618.294375       68.699375      3.38   0.0042
seed          3     170.536875       56.845625      2.80   0.0539
oat*seed      9     586.465625       65.162847      3.21   0.0059

Tests of Hypotheses Using the Type III MS for block*oat as an Error Term
Source       DF    Type III SS     Mean Square   F Value   Pr > F
oat           3    2848.021875      949.340625     13.82   0.0010
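The two F values reported for oat differ only in the denominator. As a quick arithmetic check (a sketch using the mean squares printed in the SAS output; the variable names are illustrative), the default tables divide by the residual mean square, while the whole-plot test divides by MS(block*oat):

```python
ms_oat = 949.340625
ms_block_oat = 68.699375   # whole-plot error, 9 df
ms_residual = 20.311181    # subplot (residual) error, 36 df

f_default = ms_oat / ms_residual       # F reported in the Type I / Type III tables
f_whole_plot = ms_oat / ms_block_oat   # F using block*oat as the error term

print(round(f_default, 2), round(f_whole_plot, 2))  # prints: 46.74 13.82
```

Because oat is applied to whole plots, the second ratio (13.82 on 3 and 9 df) is the appropriate test.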
The formulae for the split plot with main plots organized in a CRD or
a Latin Square (LS) are similar to those given in the table above for
the RCBD. The different whole-plot designs do not affect the last three
rows of the previous table (Factor B, A x B, Error B); only the upper
lines change:

CRD
Source      df
A           a-1
Error A     a(r-1)
Total       ra-1
Factor B    b-1
A x B       (a-1)(b-1)
Error B     a(r-1)(b-1)
Total       rab-1

RCBD
Source      df
Blocks      r-1
A           a-1
Error A     (r-1)(a-1)
Total       ra-1
Factor B    b-1
A x B       (a-1)(b-1)
Error B     a(r-1)(b-1)
Total       rab-1

Latin Square (r = a)
Source      df
Rows        a-1
Columns     a-1
A           a-1
Error A     (a-1)(a-2)
Total       ra-1
Factor B    b-1
A x B       (a-1)(b-1)
Error B     a(r-1)(b-1)
Total       rab-1

For the RCBD:
Error B (B*Block+A*B*Block) df = (b-1)*(r-1) + (b-1)*(r-1)*(a-1)=
(b-1)*(r-1)*[1+(a-1)]= a*(b-1)*(r-1)
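The degrees-of-freedom bookkeeping for the three whole-plot designs can be checked mechanically. The helper below is a hypothetical sketch (the name `split_plot_df` and the design labels are not from the notes); it returns the whole-plot and subplot partitions and lets us confirm that they add up to ra-1 and rab-1.

```python
def split_plot_df(design, a, b, r):
    """Degrees of freedom for a split plot whose whole plots follow `design`.
    a = whole-plot levels, b = subplot levels, r = replicates (blocks)."""
    if design == "CRD":
        whole = {"A": a - 1, "Error A": a * (r - 1)}
    elif design == "RCBD":
        whole = {"Blocks": r - 1, "A": a - 1, "Error A": (r - 1) * (a - 1)}
    elif design == "LS":
        if r != a:
            raise ValueError("a Latin square needs r == a")
        whole = {"Rows": a - 1, "Columns": a - 1, "A": a - 1,
                 "Error A": (a - 1) * (a - 2)}
    else:
        raise ValueError(f"unknown design: {design}")
    # The subplot part is the same for all three whole-plot designs.
    sub = {"B": b - 1, "A*B": (a - 1) * (b - 1), "Error B": a * (r - 1) * (b - 1)}
    return whole, sub

# Cake example: CRD whole plots, a = 3 recipes, b = 6 temperatures, r = 5 batches.
cake_whole, cake_sub = split_plot_df("CRD", a=3, b=6, r=5)
# Oat example: RCBD whole plots, a = 4 varieties, b = 4 seed treatments, r = 4 blocks.
oat_whole, oat_sub = split_plot_df("RCBD", a=4, b=4, r=4)
print(cake_whole, cake_sub)
print(oat_whole, oat_sub)
```

The error degrees of freedom agree with the SAS output shown earlier: 12 and 60 for the cake data, 9 and 36 for the oat data.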
For CRD the Model Statement is:
Y = A Rep(A) B A*B
Random Rep(A)
For RCBD the Model Statement is:
Y = A Block Block*A B A*B
Random Block Block*A
For LSD the Model Statement is:
Y = A Row Column Error_A B A*B
Random Row Column Error_A
Split-Split-Plot Design Structure
By beginning with a split-block or split-plot design, a split-split-block
or split-split-plot design can be constructed. This construction consists
of a second split (randomization restriction) at the sub-plot level.
Example:
A meat scientist wants to study the effect of temperature (T) with
three levels, type of packaging (P) with two levels, and lighting
intensity (I) with four levels on the color of meat stored in a meat
cooler for seven days. Six coolers are available for the experiment, and
each of the three temperatures (34°F, 40°F, and 46°F) is assigned at
random to two coolers.
Each cooler is partitioned into 4 columns. Because light intensities are
regulated by distance, all partitions in the column are assigned, at
random, the same light intensity (100 watts, 150 watts, 200 watts, and
300 watts). Each column is then partitioned into two areas in which
the two types of packaging are randomly assigned.
The partial ANOVA table for the above design is as follows:
Source of Variation                                          df
Cooler Analysis (CRD)
  Mean (μ)                                                    1
  T_i                                                         2
  Error(Cooler) = C(T)_(i)j                                   3
  Whole Plot Total                                            6
Intensity Analysis (RCBD)
  I_k                                                         3
  I*T_ik                                                      6
  Error(Column) = C(T)*I_(i)jk                                9
  Sub-Plot Total                                             18
Packaging Analysis (RCBD)
  P_l                                                         1
  P*T_il                                                      2
  P*I_kl                                                      3
  P*T*I_ikl                                                   6
  Error(Partition) = C(T)*P_(i)jl + C(T)*I*P_(i)jkl          12
  Sub-Sub-Plot Total                                         24
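The degrees of freedom in the meat-cooler table follow directly from the numbers of levels. The following sketch recomputes them; the variable names are illustrative, and the only inputs are the 3 temperatures, 2 coolers per temperature, 4 light intensities and 2 packaging types described in the example.

```python
t, c, i, p = 3, 2, 4, 2   # temperatures, coolers per temperature, intensities, packagings

whole = {                 # cooler level (CRD)
    "Mean": 1,
    "T": t - 1,
    "Error(Cooler)": t * (c - 1),
}
sub = {                   # column level (first split)
    "I": i - 1,
    "I*T": (i - 1) * (t - 1),
    "Error(Column)": t * (c - 1) * (i - 1),
}
subsub = {                # partition level (second split)
    "P": p - 1,
    "P*T": (p - 1) * (t - 1),
    "P*I": (p - 1) * (i - 1),
    "P*T*I": (p - 1) * (t - 1) * (i - 1),
    # C(T)*P and C(T)*I*P are pooled into one error term
    "Error(Partition)": t * (c - 1) * (p - 1) + t * (c - 1) * (i - 1) * (p - 1),
}

for name, part in [("Whole plot", whole), ("Sub-plot", sub), ("Sub-sub-plot", subsub)]:
    print(name, part, "total =", sum(part.values()))
```

The three totals (6, 18 and 24) add up to the 48 partitions in the experiment: 6 coolers, 4 columns per cooler and 2 areas per column.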
Split Plot Design - Standard Errors and LSD
CRD - Whole Plot Factor
$$
SE(\bar{Y}_{i..}) = \sqrt{\frac{MS_{\text{WholePlotError}}}{r\,b}}
$$
$$
SE(\bar{Y}_{i..} - \bar{Y}_{i'..}) = \sqrt{\frac{2\,MS_{\text{WholePlotError}}}{r\,b}}
$$
$$
LSD = t_{\alpha/2,\ df_{\text{WholePlotError}}} \sqrt{\frac{2\,MS_{\text{WholePlotError}}}{r\,b}}
$$
RCBD - Sub-Plot Factor
$$
SE(\bar{Y}_{..k}) = \sqrt{\frac{MS_{\text{SubPlotError}}}{r\,a}}
$$
$$
SE(\bar{Y}_{..k} - \bar{Y}_{..k'}) = \sqrt{\frac{2\,MS_{\text{SubPlotError}}}{r\,a}}
$$
$$
LSD = t_{\alpha/2,\ df_{\text{SubPlotError}}} \sqrt{\frac{2\,MS_{\text{SubPlotError}}}{r\,a}}
$$
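As a numerical illustration of the main-effect standard errors, the sketch below plugs in the mean squares from the cake example (a = 3 recipes, r = 5 batches, b = 6 temperatures). The function names are hypothetical, and the critical t values are passed in by the user, here taken from a standard t table at the two-sided 5% level (an assumption, not part of the SAS output).

```python
import math

def se_mean(ms_error, n_per_mean):
    """Standard error of a marginal mean based on n_per_mean observations."""
    return math.sqrt(ms_error / n_per_mean)

def se_diff(ms_error, n_per_mean):
    """Standard error of the difference between two such means."""
    return math.sqrt(2 * ms_error / n_per_mean)

def lsd(t_crit, ms_error, n_per_mean):
    """Least significant difference for a user-supplied critical t value."""
    return t_crit * se_diff(ms_error, n_per_mean)

a, r, b = 3, 5, 6
ms_whole = 288.122222   # MS(Batch(Recipe)), 12 df
ms_sub = 33.202222      # MS(Error), 60 df

se_recipe_diff = se_diff(ms_whole, r * b)   # whole-plot factor: r*b cakes per recipe mean
se_temp_diff = se_diff(ms_sub, r * a)       # sub-plot factor: r*a cakes per temperature mean

# Tabulated t values (alpha = 0.05, two-sided): about 2.179 for 12 df, 2.000 for 60 df.
lsd_recipe = lsd(2.179, ms_whole, r * b)
lsd_temp = lsd(2.000, ms_sub, r * a)
print(se_recipe_diff, se_temp_diff, lsd_recipe, lsd_temp)
```

Even though each recipe mean averages twice as many cakes as each temperature mean, its standard error is roughly twice as large, because it is driven by the whole-plot error.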
Significant Interaction in a Split-Plot Experiment
Suppose that the interaction between the whole plot treatment and the
sub-plot treatment is significant.
Analysis must be based on the two-way cell means, not the marginal
means.
Comparing two sub-plot treatments at the same level of the whole
plot treatment (each cell mean is an average of r observations):
$$
LSD = t_{\alpha/2,\ df_{\text{SubPlotError}}} \sqrt{\frac{2\,MS_{\text{SubPlotError}}}{r}}
$$
Comparing two whole plot treatments at the same level (or different
levels) of the sub-plot treatment:
$$
LSD = t^{*}_{\alpha/2} \sqrt{\frac{2\left[(b-1)\,MS_{\text{SubPlotError}} + MS_{\text{WholePlotError}}\right]}{r\,b}}
$$
Where
$$
t^{*}_{\alpha/2} =
\frac{(b-1)\,MS_{\text{SubPlotError}}\; t_{\alpha/2,\ (r-1)a(b-1)} + MS_{\text{WholePlotError}}\; t_{\alpha/2,\ (r-1)a}}
     {(b-1)\,MS_{\text{SubPlotError}} + MS_{\text{WholePlotError}}}
$$
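The comparison of two whole-plot treatments at a given sub-plot level mixes both error terms, so its critical value is a weighted average of two t values. A minimal sketch of that calculation follows, assuming the weights (b-1)·MS(SubPlotError) and MS(WholePlotError); the function names are hypothetical, and the mean squares and tabulated t values (2.179 for 12 df, 2.000 for 60 df) are taken from the cake example purely for illustration, since the interaction there was not significant.

```python
import math

def weighted_t(t_whole, t_sub, ms_whole, ms_sub, b):
    """Weighted critical value t* combining the whole-plot and sub-plot t values."""
    w_sub = (b - 1) * ms_sub
    return (w_sub * t_sub + ms_whole * t_whole) / (w_sub + ms_whole)

def lsd_whole_at_sub(t_whole, t_sub, ms_whole, ms_sub, r, b):
    """LSD for two whole-plot treatments at the same (or different) sub-plot level."""
    se = math.sqrt(2 * ((b - 1) * ms_sub + ms_whole) / (r * b))
    return weighted_t(t_whole, t_sub, ms_whole, ms_sub, b) * se

r, b = 5, 6
ms_whole, ms_sub = 288.122222, 33.202222
t_whole, t_sub = 2.179, 2.000   # tabulated t values for 12 and 60 df, alpha = 0.05

t_star = weighted_t(t_whole, t_sub, ms_whole, ms_sub, b)
lsd_value = lsd_whole_at_sub(t_whole, t_sub, ms_whole, ms_sub, r, b)
print(t_star, lsd_value)
```

By construction t* always lies between the two tabulated t values, moving toward the whole-plot value as the whole-plot error grows relative to the sub-plot error.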
Strip-Plot Designs
Consider an agricultural field trial involving "s" varieties of wheat
(Factor A) and "t" types of fertilizers (Factor B). Both seeding and
fertilizing are most easily performed in strips. By placing the wheat
varieties in rows and the fertilizers in columns a "strip plot"
experimental design is produced. Before inferences can be made for
either of the factors, the design must be replicated (Rep), say n ≥ 2
times.
The linear model for the strip plot design is
$$
\begin{aligned}
Y_{ijk} = \mu + \text{Rep}_k
  &+ A_i + (\text{Rep}*A)_{ik}   && \text{(Row)} \\
  &+ B_j + (\text{Rep}*B)_{jk}   && \text{(Column)} \\
  &+ (A*B)_{ij} + e_{ijk}        && \text{(Interaction)}
\end{aligned}
$$
i = 1, 2, ..., s
j = 1, 2, ..., t
k = 1, 2, ..., n
Note: e_ijk = (Rep*A*B)_ijk, so that \sigma_e^2 = \sigma_{R*A*B}^2.
With Rep (R) random and A and B fixed:
Source   df                   EMS
Rep      n-1                  \sigma_e^2 + t \sigma_{R*A}^2 + s \sigma_{R*B}^2 + st \sigma_R^2
A        s-1                  \sigma_e^2 + t \sigma_{R*A}^2 + nt \sigma_A^2
R*A      (n-1)(s-1)           \sigma_e^2 + t \sigma_{R*A}^2
B        t-1                  \sigma_e^2 + s \sigma_{R*B}^2 + ns \sigma_B^2
R*B      (n-1)(t-1)           \sigma_e^2 + s \sigma_{R*B}^2
A*B      (s-1)(t-1)           \sigma_e^2 + n \sigma_{A*B}^2
R*A*B    (n-1)(s-1)(t-1)      \sigma_e^2
Split Plot
Advantages
1. It permits the efficient use of some factors which require large
experimental units in combination with other factors which require
small experimental units.
2. It provides increased precision in the comparison of some factors.
3. It permits the introduction of new treatments into an experiment
which is already in progress.
Disadvantages
1. Statistical analysis is complicated because different comparisons
have different error variances.
2. Low precision on the whole plots can result in large differences
being nonsignificant, while small differences on the subplots may be
statistically significant even though they are of no practical
significance.
Uses of Split-plot designs
1. Split-plot designs, and a variation, the split-block, are frequently
used for factorial experiments in which the nature of the experimental
material or the operations involved make it difficult to handle all
factor combinations in the same manner. It may be used when the
treatments associated with the levels of one or more of the factors
require larger amounts of experimental material in an experimental
unit than do treatments for other factors.
2. These designs are also used when the investigator wishes to
increase precision in estimating certain effects and is willing to
sacrifice precision in estimating certain others. The design usually
sacrifices precision in estimating the average effects of the treatments
assigned to main plots. It often improves the precision for comparing
the average effects of treatments assigned to subplots and, when
interactions exist, for comparing the effects of subplot treatments for a
given main plot treatment. This arises from the fact that experimental
error for main plots is usually larger than the experimental error used
to compare subplot treatments. Usually, the error term for subplot
treatments is smaller than would be obtained if all treatment
combinations were arranged in a randomized complete block design.
3. The design may be used when an additional factor is to be
incorporated in an experiment to increase its scope. For example,
suppose that the major purpose of an experiment is to compare the
effects of several seed protectants. To increase the scope of the
experiment several varieties are used as main plots and the seed
protectants are used as subplots.
Remark:
The basic split-plot design involves assigning the treatments of one
factor to main plots arranged in a CRD, RCBD or a Latin-Square
design and then assigning the second factor to subplots within
each main plot.
Note that the randomization has two stages. First, levels of factor A
are randomized over the main plots and then levels of factor B are
randomized over the subplots. Each main plot may be considered as a
block as far as factor B is concerned but only as an incomplete block
as far as the full set of treatments is concerned because not every
subplot has the same chance of getting every treatment combination.
This restriction in randomization results in the presence of two error
terms, one for main plots and one for subplots. Ordinarily the error
term for the main plots is larger than it would be in a complete design
since the main plots are larger and further apart, while the subplot
error is smaller than it would be in a complete design. Since the
interactions are compared using the smaller subplot error, the
precision in estimating interactions is usually increased.
A classical example of a split plot is an irrigation experiment where
irrigation levels are applied to large areas, and factors like varieties
and fertilizers are assigned to smaller areas within a particular
irrigation treatment. The proper analysis of a split-plot design
recognizes that treatments applied to main plots are subject to larger
experimental error than those applied to subplots; hence, different
mean squares are used as denominators for the corresponding F ratios.
This concept is discussed in terms of expected mean squares in this
topic.
Generally, the error associated with the subplots is smaller than that
for the whole plots. This is because
1. Small units within the large units tend to be positively correlated.
This has the effect of reducing experimental error.
2. Error degrees of freedom for the whole plots are usually less than
those for the subplots. This has the effect of increasing the whole-plot
error relative to that of the subplots.
In summary, the factors that require a smaller amount of
experimental material, that are of major importance, that are
expected to exhibit smaller differences, or for which greater
precision is desired are assigned to the subplots.
The distinction between the two-factor split-plot design and the standard
two-factor experiment lies in the randomization. In a split-plot
design, there are two stages to the randomization process; first levels
of factor A are randomized to the wholeplots within each block, and
then levels of factor B are randomized to the subplot units within each
whole plot of every block. In contrast, for a two-factor experiment laid
off in a randomized block design, the randomization is a one-step
procedure; treatments (factor level combinations of the two factors)
are randomized to the experimental units in each block.
Note: the whole-plot tests are similar whether blocks are random or
fixed. In the subplot analysis, if blocks are fixed, all interactions
with block are pooled into the error. If blocks are random, this
pooling may or may not be done.