Fractional factorial designs
Material in this lecture is based on Chapter 8 of the textbook "Design and Analysis of Experiments" by Douglas Montgomery.
The 2^k factorial design is efficient, but as the number of factors, k, increases, the total number of observations for a complete replicate of all possible 2^k combinations will grow beyond the available resources. In these cases, rather than running all 2^k combinations, we run a carefully selected fraction of the combinations. Typically, the fraction is 1/2, 1/4, or 1/8 of all the possible combinations.
The one-half fraction of the 2^k design
Consider a situation in which three factors, each at two levels, are of interest, but the researchers cannot afford to run all 2^3 = 8 treatment combinations. They can, however, afford 4 runs. This suggests a one-half fraction of a 2^3 design. Because the design contains 2^(3-1) = 4 treatment combinations, a one-half fraction of the 2^3 design is often called a 2^(3-1) design.
Here is the design matrix for all combinations of a 2^3 design.

Treatment combination   I   A   B   C   AB  AC  BC  ABC
a                       +   +   -   -   -   -   +   +
b                       +   -   +   -   -   +   -   +
c                       +   -   -   +   +   -   -   +
abc                     +   +   +   +   +   +   +   +
ab                      +   +   +   -   +   -   -   -
ac                      +   +   -   +   -   +   -   -
bc                      +   -   +   +   -   -   +   -
(1)                     +   -   -   -   +   +   +   -
Suppose we select the four treatment combinations a, b, c, abc as our 2^(3-1) one-half fraction. These are shown in the top half of the table, with ABC = +.
Notice that, in the first 4 rows of the table,
the + and – signs are the same for column A and column BC.
the + and – signs are the same for column B and column AC.
the + and – signs are the same for column C and column AB.
We say
A is aliased with BC
B is aliased with AC
C is aliased with AB
Terminology (useful but not required to know): Notice that the 2^(3-1) design is formed by selecting only those treatment combinations that have a + sign in the ABC column. Thus, ABC is called the generator of this design. Sometimes we will refer to a generator such as ABC as a word. Furthermore, the identity column I is also always +, so we call I = ABC the defining relation.
In the 2^(3-1) design, we estimate the main effects of A, B, and C using the + and – signs in the table:
[A] = ½ (a – b – c + abc)
[B] = ½ (–a + b – c + abc)
[C] = ½ (–a – b + c + abc)
We estimate the two-factor interactions AB, AC, and BC using the + and – signs in the table:
[BC] = ½ (a – b – c + abc)
[AC] = ½ (–a + b – c + abc)
[AB] = ½ (–a – b + c + abc)
We saw that, in the first 4 rows of the table, the + and – signs are the same for A and BC.
So
[A] = [BC]
[B] = [AC]
[C] = [AB]
Consequently,
The apparent effect of A is actually the effect of A plus the interaction [BC].
The apparent effect of B is actually the effect of B plus the interaction [AC].
The apparent effect of C is actually the effect of C plus the interaction [AB].
This is why we say
A is aliased with BC
B is aliased with AC
C is aliased with AB
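These sign-pattern claims are easy to check numerically. Here is a short base-R sketch (the data frame and column names are ours, purely illustrative) that builds the full 2^3 table, keeps the ABC = + half, and confirms the aliasing:

```r
# Build the full 2^3 design and take the ABC = +1 half fraction
d <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))
d$AB  <- d$A * d$B
d$AC  <- d$A * d$C
d$BC  <- d$B * d$C
d$ABC <- d$A * d$B * d$C
half <- d[d$ABC == 1, ]   # the four runs a, b, c, abc

all(half$A == half$BC)    # TRUE: A is aliased with BC
all(half$B == half$AC)    # TRUE: B is aliased with AC
all(half$C == half$AB)    # TRUE: C is aliased with AB
```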
Aliasing is the price we pay when we use a fractional factorial design. We use fewer total runs, but some effects will be aliased. Aliasing is also called confounding.
Example fractional factorial design for 3 factors in 4 runs
If we want to look at all the possible interactions among k factors, then we need to run 2^k experiments. However, we may want to use fewer than 2^k experiments, and may be willing to give up the ability to test for all those interactions.
Suppose we are baking cookies, and we are interested in the factors that may affect the yield of good cookies. The 3 factors we examine are flour, oven, and sifting. We believe that any interactions among these factors have effects on yield that are small compared to the main effects. We can examine this assumption later.
Usually, we will use a pre-defined design matrix to look at 3 factors in 4 runs, which we can get at a website that has tables of fractional factorial designs: http://www.itl.nist.gov/div898/handbook/pri/section3/pri3347.htm
Just to see how it is done, we'll generate our own design matrix. Here is the design matrix for 3 factors in 8 runs, with all interactions. To get a one-half fraction, we'll select the rows where ABC values are 1.
 A   B   C   AB  AC  BC  ABC
-1  -1  -1    1   1   1   -1
 1  -1  -1   -1  -1   1    1
-1   1  -1   -1   1  -1    1
 1   1  -1    1  -1  -1   -1
-1  -1   1    1  -1  -1    1
 1  -1   1   -1   1  -1   -1
-1   1   1   -1  -1   1   -1
 1   1   1    1   1   1    1
Here is R code to create a design matrix we can use to examine the effects of 3 factors in 4 runs:

mystring.levels = "run, oven, flour, sift, yield
1,-1,-1,1,50
2,-1,1,-1,50
3,1,-1,-1,60
4,1,1,1,60"
OFS.data = read.table(textConnection(mystring.levels), header=TRUE, sep=",", row.names="run")
OFS.data
> OFS.data
oven flour sift yield
1 -1 -1 1 50
2 -1 1 -1 50
3 1 -1 -1 60
4 1 1 1 60
> lm.3factor = lm(yield ~ oven + flour + sift, data=OFS.data)
> summary(lm.3factor)
Call: lm(formula = yield ~ oven + flour + sift, data = OFS.data)
Residuals:
ALL 4 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.500e+01         NA      NA      NA
oven         5.000e+00         NA      NA      NA
flour       -8.882e-16         NA      NA      NA
sift         8.882e-16         NA      NA      NA
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 3 and 0 DF, p-value: NA
This experiment indicates that oven type has a big effect (Estimate = 5), but neither sifting nor flour type has an effect (Estimates on the order of 1e-16, i.e., essentially zero up to numerical round-off). We don't know whether these effects are statistically significant; at this point we just know which factor appears to be most important.
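The lm() coefficients are half of the classical factorial effect (average yield at the high level minus average at the low level). As a sketch, we can compute the apparent effects by hand from the four yields above:

```r
# Apparent main effects computed by hand for the 4-run design
yield <- c(50, 50, 60, 60)
oven  <- c(-1, -1, 1, 1)
flour <- c(-1, 1, -1, 1)
sift  <- c(1, -1, -1, 1)
effect <- function(x) mean(yield[x == 1]) - mean(yield[x == -1])
effect(oven)   # 10: twice the lm() coefficient of 5
effect(flour)  # 0
effect(sift)   # 0
```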
But this design matrix looks exactly like the design matrix we used previously to look at 2 factors with interaction:

mystring.levels = "run, oven, flour, oven.flour, yield
1,-1,-1,1,50
2,-1,1,-1,50
3,1,-1,-1,60
4,1,1,1,60"
OF.interaction.data = read.table(textConnection(mystring.levels), header=TRUE, sep=",", row.names="run")
OF.interaction.data
Design matrix with the interaction term (oven*flour) in the 3rd column:
> OF.interaction.data
oven flour oven.flour yield
1 -1 -1 1 50
2 -1 1 -1 50
3 1 -1 -1 60
4 1 1 1 60
>
Design matrix with an additional factor (sift) in the 3rd column:
> OFS.data
oven flour sift yield
1 -1 -1 1 50
2 -1 1 -1 50
3 1 -1 -1 60
4 1 1 1 60
>
If we wanted to test for an interaction between "Flour type" and "Oven", we would use
Column 3, labeled "oven.flour".
When we use Column 3 to look at another factor (sift), the calculated effect of that factor is actually the sum of the effect of that factor plus the effect of any interaction.
We say that the factor "sift" is aliased with the oven*flour interaction.
The apparent effect of "sift" is the combined effect of "sift" and any oven*flour interaction.
apparent sift effect = sift + flour*oven interaction
apparent oven effect = oven + flour*sift interaction
apparent flour effect = flour + oven*sift interaction
Fractional factorial design for 7 factors in 8 runs
Here is the design matrix for a full factorial for 3 factors in 8 runs including the interaction terms.
A       B      C           D             E     F     G
Length  Width  Paper type  Length*Width  L*P   W*P   L*W*P
 1      -1     -1          -1            -1     1     1
 1      -1      1          -1             1    -1    -1
 1       1     -1           1            -1    -1    -1
 1       1      1           1             1     1     1
-1      -1     -1           1             1     1    -1
-1      -1      1           1            -1    -1     1
-1       1     -1          -1             1    -1     1
-1       1      1          -1            -1     1    -1
Column D is the interaction term for Length*Width: D = A*B. Similarly,
E = A*C
F = B*C
G = A*B*C
We could use all 2^3 = 8 runs to test for main effects and interactions of the 3 factors, as we did in the full factorial experiment.
If we are willing to assume that 2-way and 3-way interactions are small compared to the main effects, we can test 7 factors in 8 runs, as in the cookie baking example.
The design matrix is identical to the design matrix for looking at 3 factors with all possible interactions.
       A     B      C     D     E       F     G
Batch  Temp  Water  Sift  Time  Butter  Soda  Sugar
1       1    -1     -1    -1    -1       1     1
2       1    -1      1    -1     1      -1    -1
3       1     1     -1     1    -1      -1    -1
4       1     1      1     1     1       1     1
5      -1    -1     -1     1     1       1    -1
6      -1    -1      1     1    -1      -1     1
7      -1     1     -1    -1     1      -1     1
8      -1     1      1    -1    -1       1    -1
In this design matrix, we have the same aliasing between columns:
D = A*B. So column D, Time, is aliased with the interaction term for Temp*Water =A*B.
There are other aliases.
       A     B      C     D     E       F     G
Batch  Temp  Water  Sift  Time  Butter  Soda  Sugar
1       1    -1     -1    -1    -1       1     1
2       1    -1      1    -1     1      -1    -1
3       1     1     -1     1    -1      -1    -1
4       1     1      1     1     1       1     1
5      -1    -1     -1     1     1       1    -1
6      -1    -1      1     1    -1      -1     1
7      -1     1     -1    -1     1      -1     1
8      -1     1      1    -1    -1       1    -1
In this design matrix, notice that the Temp column is the same as the Soda*Sugar interaction column: A = F*G. So Temp is aliased with the Soda*Sugar interaction.
The Temp column is also the same as the Water*Time and Sift*Butter columns:
A = B*D
A = C*E
So Temp is aliased with 3 two-way interactions: Soda*Sugar (A = F*G), Water*Time (A = B*D), and Sift*Butter (A = C*E).
The apparent effect of Temp is aliased with these three interactions, so the apparent effect of Temp is actually the sum of the main effect of Temp plus these 2-way interactions (plus higher-order interactions):
apparent Temp effect = Temp + Soda*Sugar + Water*Time + Sift*Butter + higher-order interactions
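We can verify these alias relations directly from the generators. Here is a base-R sketch using the generic column labels A through G from the table (our own construction, not a package function):

```r
# Generate the 7-factors-in-8-runs design from the 2^3 base design
d <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))
d$D <- d$A * d$B          # Time   = Temp*Water
d$E <- d$A * d$C          # Butter = Temp*Sift
d$F <- d$B * d$C          # Soda   = Water*Sift
d$G <- d$A * d$B * d$C    # Sugar  = Temp*Water*Sift

all(d$A == d$F * d$G)     # TRUE: Temp aliased with Soda*Sugar
all(d$A == d$B * d$D)     # TRUE: Temp aliased with Water*Time
all(d$A == d$C * d$E)     # TRUE: Temp aliased with Sift*Butter
```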
The other columns have similar aliasing patterns. When we choose this design, we can examine which interactions are aliased, and decide if we think the interactions may be real and large. If so, we can consider them in further experiments.
Tables of fractional factorial designs are available at this website: http://www.itl.nist.gov/div898/handbook/pri/section3/pri3347.htm
Here is an example from the website, showing the design matrix that we used above for
7 factors in 8 runs. The Resolution of a design tells us what level of interactions the design is able to resolve, for example all main effects, or all 2-way, etc.
=======
2**(7-4) FRACTIONAL FACTORIAL DESIGN
NUMBER OF LEVELS FOR EACH FACTOR = 2
NUMBER OF FACTORS = 7
NUMBER OF OBSERVATIONS = 8 (A SATURATED DESIGN)
RESOLUTION = 3 (CAUTION! MAIN EFFECTS ARE
CONFOUNDED WITH 2-TERM INTERACTIONS)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
FACTOR DEFINITION CONFOUNDING STRUCTURE
1 1 1 + 24 + 35 + 67 + HIGHER-ORDER INTERACTIONS
2 2 2 + 14 + 36 + 57 + HIGHER-ORDER INTERACTIONS
3 3 3 + 15 + 26 + 47 + HIGHER-ORDER INTERACTIONS
4 12 4 + 12 + 37 + 56 + HIGHER-ORDER INTERACTIONS
5 13 5 + 13 + 27 + 46 + HIGHER-ORDER INTERACTIONS
6 23 6 + 17 + 23 + 45 + HIGHER-ORDER INTERACTIONS
7 123 7 + 16 + 25 + 34 + HIGHER-ORDER INTERACTIONS
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
12 12 + 4 + 37 + 56 + HIGHER-ORDER INTERACTIONS
13 13 + 5 + 27 + 46 + HIGHER-ORDER INTERACTIONS
14 14 + 2 + 36 + 57 + HIGHER-ORDER INTERACTIONS
15 15 + 3 + 26 + 47 + HIGHER-ORDER INTERACTIONS
16 16 + 7 + 25 + 34 + HIGHER-ORDER INTERACTIONS
17 17 + 6 + 23 + 45 + HIGHER-ORDER INTERACTIONS
23 23 + 6 + 17 + 45 + HIGHER-ORDER INTERACTIONS
24 24 + 1 + 35 + 67 + HIGHER-ORDER INTERACTIONS
25 25 + 7 + 16 + 34 + HIGHER-ORDER INTERACTIONS
26 26 + 3 + 15 + 47 + HIGHER-ORDER INTERACTIONS
27 27 + 5 + 13 + 46 + HIGHER-ORDER INTERACTIONS
34 34 + 7 + 16 + 25 + HIGHER-ORDER INTERACTIONS
35 35 + 1 + 24 + 67 + HIGHER-ORDER INTERACTIONS
36 36 + 2 + 14 + 57 + HIGHER-ORDER INTERACTIONS
37 37 + 4 + 12 + 56 + HIGHER-ORDER INTERACTIONS
45 45 + 6 + 17 + 23 + HIGHER-ORDER INTERACTIONS
46 46 + 5 + 13 + 27 + HIGHER-ORDER INTERACTIONS
47 47 + 3 + 15 + 26 + HIGHER-ORDER INTERACTIONS
56 56 + 4 + 12 + 37 + HIGHER-ORDER INTERACTIONS
57 57 + 2 + 14 + 36 + HIGHER-ORDER INTERACTIONS
67 67 + 1 + 24 + 35 + HIGHER-ORDER INTERACTIONS
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
DEFINING RELATION = I = 124 = 135 = 236 = 347 = 257 = 167 = 456
DEFINING RELATION (CONT.) = 1237 = 2345 = 1346 = 1256 = 1457
DEFINING RELATION (CONT.) = 2467 = 3567 = 1234567
REFERENCE--BOX, HUNTER & HUNTER, STAT. FOR EXP., PAGE 410, 391, 392.
NOTE--THIS DESIGN IS EQUIVALENT TO A TAGUCHI L8 DESIGN
REFERENCE--TAGUCHI, SYS. OF EXP. DES., VOL. 2.
NOTE--IF POSSIBLE, THIS (AS WITH ALL EXPERIMENT DESIGNS) SHOULD BE
RUN IN RANDOM ORDER (SEE DATAPLOT'S RANDOM PERMUTATION FILES).
NOTE--TO READ THIS FILE INTO DATAPLOT--
SKIP 75
READ 2TO7M4.DAT X1 TO X7
DATE--DECEMBER 1988
NOTE--IN THE DESIGN BELOW, "-1" REPRESENTS THE "LOW" SETTING OF A FACTOR
"+1" REPRESENTS THE "HIGH" SETTING OF A FACTOR
NOTE--ALL FACTOR EFFECT ESTIMATES WILL BE OF THE FORM
AVERAGE OF THE "HIGH" - AVERAGE OF THE "LOW"
X1 X2 X3 X4 X5 X6 X7
--------------------------
-1 -1 -1 +1 +1 +1 -1
+1 -1 -1 -1 -1 +1 +1
-1 +1 -1 -1 +1 -1 +1
+1 +1 -1 +1 -1 -1 -1
-1 -1 +1 +1 -1 -1 +1
+1 -1 +1 -1 +1 -1 -1
-1 +1 +1 -1 -1 +1 -1
+1 +1 +1 +1 +1 +1 +1
==========
Let's use this design to analyze a cookie baking experiment with 7 factors in 8 runs.

mystring.levels = "batch, flour, temp, soda, time, water, butter, sift, yield
1,brown,low,low,high,high,sift,low,47
2,white,low,low,low,low,sift,high,74
3,brown,high,low,low,high,no,high,84
4,white,high,low,high,low,no,low,62
5,brown,low,high,high,low,no,high,53
6,white,low,high,low,high,no,low,78
7,brown,high,high,low,low,sift,low,87
8,white,high,high,high,high,sift,high,60"
cookie.data = read.table(textConnection(mystring.levels), header=TRUE, sep=",", row.names="batch")
cookie.data
flour temp soda time water butter sift yield
1 brown low low high high sift low 47
2 white low low low low sift high 74
3 brown high low low high no high 84
4 white high low high low no low 62
5 brown low high high low no high 53
6 white low high low high no low 78
7 brown high high low low sift low 87
8 white high high high high sift high 60
>
plot.design(cookie.data)
lm.cookie = lm(yield ~ ., data=cookie.data)
summary(lm.cookie)

Call: lm(formula = yield ~ ., data = cookie.data)
Residuals:
ALL 8 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)  61.50   NA  NA  NA
flourwhite    0.75   NA  NA  NA
templow     -10.25   NA  NA  NA
sodalow      -2.75   NA  NA  NA
timelow      25.25   NA  NA  NA
waterlow      1.75   NA  NA  NA
buttersift   -2.25   NA  NA  NA
siftlow       0.75   NA  NA  NA
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 7 and 0 DF, p-value: NA
We are estimating 8 parameters in the model (intercept + 7 factors) with only 8 batches. This fits the data perfectly, so R cannot calculate a standard error, and R-squared = 1. But we can look at the magnitudes of the coefficients (labeled "Estimate" in the summary), which are informative.

coef(lm.cookie)
(Intercept) flourwhite templow sodalow
61.50 0.75 -10.25 -2.75
timelow waterlow buttersift siftlow
25.25 1.75 -2.25 0.75
>
Time appears to have the biggest effect (25.25), followed by temperature (-10.25), with lesser effects for the other factors. Sifting and type of flour have very little influence on yield.
If "sifting" was a labor-intensive or expensive step, this experiment would be evidence that the step could be eliminated without affecting quality.
If the two types of "flour" had different cost, this experiment would be evidence that the less expensive type could be used without affecting quality.
We can't directly test the statistical significance of the coefficients, because we have estimated 8 coefficients with 8 runs. But there is another way to get an idea of what is significant.
If there were no significant factors, then their effect sizes should be randomly distributed around zero in a roughly normal distribution. We would expect them all to fall on the normal line in the qq plot. Two factors fall far from the line: time and temperature.

qqnorm(coef(lm.cookie)[2:8], pch=names(coef(lm.cookie)[2:8]))
qqline(coef(lm.cookie)[2:8])
This can be easier to see in a half-normal qq plot (halfnorm() is in the faraway package):

library(faraway)
halfnorm(coef(lm.cookie)[2:8], labs=names(coef(lm.cookie)[2:8]))
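If the faraway package is not available, the same half-normal picture can be sketched in base R; the coefficient values below are copied from the summary output above:

```r
# Base-R half-normal plot of the 7 effect estimates
effects <- c(flourwhite = 0.75, templow = -10.25, sodalow = -2.75,
             timelow = 25.25, waterlow = 1.75, buttersift = -2.25,
             siftlow = 0.75)
a <- sort(abs(effects))                     # effect magnitudes, ascending
q <- qnorm(0.5 + 0.5 * ppoints(length(a)))  # half-normal quantiles
plot(q, a, type = "n", xlab = "Half-normal quantiles", ylab = "|effect|")
text(q, a, labels = names(a), cex = 0.8)
# timelow and templow sit far above the trend of the small effects
```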
In this screening example, we assume that interaction effect (such as an interaction of time and temperature) are small compared to the main effects of each factor. We can examine this further.
A post-hoc look at the interaction terms
In the screening study for the cookies, we looked at 7 variables in 8 batches. We assumed interaction effects (such as an interaction of time and temperature) were small compared to the main effects of each factor.
The two most important factors according to the screening study were time, and temp.
A good next step would be further experiments on these variables.
However, at some risk of data dredging, before doing that, we can re-analyze the cookie data and look for interactions.

lm.cookie.2vars = lm(yield ~ time * temp, data=cookie.data)
summary(lm.cookie.2vars)
Call: lm(formula = yield ~ time * temp, data = cookie.data)
Residuals:
1 2 3 4 5 6 7 8
-3.0 -2.0 -1.5 1.0 3.0 2.0 1.5 -1.0
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)      61.000      2.016  30.264  7.1e-06 ***
timelow          24.500      2.850   8.595  0.00101 **
templow         -11.000      2.850  -3.859  0.01816 *
timelow:templow   1.500      4.031   0.372  0.72869
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.85 on 4 degrees of freedom
Multiple R-squared: 0.9786, Adjusted R-squared: 0.9626
F-statistic: 60.98 on 3 and 4 DF, p-value: 0.0008523
The interaction term timelow:templow gives a p-value of 0.73, indicating that there is not a significant interaction, at least at the times and temperatures we used for this experiment. We could analyze other pairs of factors in a similar way.

with(cookie.data, interaction.plot(temp, time, yield))
The lines on the time * temp interaction plot are about as parallel as they can get, again indicating no significant interaction.
A compromise
So far we've looked at two extreme cases:
1. Test for all possible interactions among k factors using 2^k runs
2. Assume no interactions, and study the factors using the minimum possible number of runs.
We can compromise between these two extremes.
We can design experiments that let us test for, say, 2-way interactions, but assume that 3-way and higher interactions are small compared to the main effects and 2-way interactions. The price we will pay is doing more runs, though still many fewer than 2^k.
We can design experiments that let us test for 2-way and 3-way interactions, but assume that 4-way and higher interactions are small compared to the main effects and the 2-way and 3-way interactions. Again, the price we will pay is doing more runs, though fewer than 2^k.
Here is another example design matrix. In this design matrix, we examine 5 factors in 16 runs. Having fewer factors and/or more runs allows us to resolve interactions. In this design, no main effects are confounded with any 2-factor interactions or 3-factor interactions. Main effects are confounded with 4-factor interactions.
Fractional factorial designs from the website: http://www.itl.nist.gov/div898/handbook/pri/section3/pri3347.htm
=====
THIS IS DATAPLOT DESIGN-OF-EXPERIMENT FILE 2TO5M.DAT
2**(5-1) FRACTIONAL FACTORIAL DESIGN
NUMBER OF LEVELS FOR EACH FACTOR = 2
NUMBER OF FACTORS = 5
NUMBER OF OBSERVATIONS = 16
RESOLUTION = 5 (THEREFORE NO MAIN EFFECTS ARE
CONFOUNDED WITH ANY 2-FACTOR INTERACTIONS
OR 3-FACTOR INTERACTIONS;
MAIN EFFECTS ARE CONFOUNDED WITH
4-FACTOR INTERACTIONS)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
FACTOR DEFINITION CONFOUNDING STRUCTURE
1 1 1 + 2345
2 2 2 + 1345
3 3 3 + 1245
4 4 4 + 1235
5 1234 5 + 1234
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
12 12 + 345
13 13 + 245
14 14 + 235
15 15 + 234
23 23 + 145
24 24 + 135
25 25 + 134
34 34 + 125
35 35 + 124
45 45 + 123
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
DEFINING RELATION = I = 12345
REFERENCE--BOX, HUNTER & HUNTER, STAT. FOR EXP., PAGE 410,
NOTE--IF POSSIBLE, THIS (AS WITH ALL EXPERIMENT DESIGNS) SHOULD BE
RUN IN RANDOM ORDER (SEE DATAPLOT'S RANDOM PERMUTATION FILES).
NOTE--TO READ THIS FILE INTO DATAPLOT--
SKIP 50
READ 2TO5M1.DAT X1 TO X5
DATE--DECEMBER 1988
NOTE--IN THE DESIGN BELOW, "-1" REPRESENTS THE "LOW" SETTING OF A FACTOR
"+1" REPRESENTS THE "HIGH" SETTING OF A FACTOR
NOTE--ALL FACTOR EFFECT ESTIMATES WILL BE OF THE FORM
AVERAGE OF THE "HIGH" - AVERAGE OF THE "LOW"
X1 X2 X3 X4 X5
------------------
-1 -1 -1 -1 +1
+1 -1 -1 -1 -1
-1 +1 -1 -1 -1
+1 +1 -1 -1 +1
-1 -1 +1 -1 -1
+1 -1 +1 -1 +1
-1 +1 +1 -1 +1
+1 +1 +1 -1 -1
-1 -1 -1 +1 -1
+1 -1 -1 +1 +1
-1 +1 -1 +1 +1
+1 +1 -1 +1 -1
-1 -1 +1 +1 +1
+1 -1 +1 +1 -1
-1 +1 +1 +1 -1
+1 +1 +1 +1 +1
======
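The 2^(5-1) matrix above can be reproduced in base R with the generator X5 = X1*X2*X3*X4 (defining relation I = 12345). We can also confirm the key resolution property numerically: no main effect is confounded with any 2-factor interaction, because every 2-factor interaction column is orthogonal to every main-effect column. This sketch is ours, not from the website:

```r
# Generate the 2^(5-1) design and check main effects vs 2-factor interactions
d <- expand.grid(X1 = c(-1, 1), X2 = c(-1, 1), X3 = c(-1, 1), X4 = c(-1, 1))
d$X5 <- d$X1 * d$X2 * d$X3 * d$X4   # generator, so I = 12345

pairs <- combn(names(d), 2)          # all 2-factor interactions
ok <- apply(pairs, 2, function(p) {
  twofi <- d[[p[1]]] * d[[p[2]]]
  all(colSums(twofi * d) == 0)       # orthogonal to all 5 main effects
})
all(ok)   # TRUE: no main effect aliased with a 2-factor interaction
```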
In general we want to choose designs that, at least, can distinguish (resolve) main effects from 2-way interactions. Usually, we'll also want at least 8, preferably 16 runs or more to give us power to detect modest effects.
Center points
When we run factorial designs with only two levels for a factor, we are not able to detect curvature in the response surface.
One way to help deal with this problem is to add center points to the design. If the original two levels for a factor are coded as {-1, 1}, a center point would be put at 0, halfway between the high and low settings.
For example, here is a center point we could use for the factor Temperature in cookie baking.

Actual value   350°   400°   450°
Coding          -1      0      1
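Coded values are just a linear rescaling of the actual settings. A tiny helper function (ours, purely illustrative) makes the mapping explicit:

```r
# Map actual factor settings onto the coded -1..+1 scale
to_coded <- function(x, low, high) {
  (x - (low + high) / 2) / ((high - low) / 2)
}
to_coded(c(350, 400, 450), low = 350, high = 450)   # -1 0 1
```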
When we have several quantitative factors, we use a center point where each factor is set to a zero coded value. Here's an example of using center points for the cookie baking.

mystring.levels = "Temp,Time,GoodCookies
-1,-1,50
-1,-1,51
-1,1,60
-1,1,61
1,-1,60
1,-1,61
1,1,70
1,1,71
0,0,61
0,0,62
0,0,61
0,0,59"
cookies.data = read.table(textConnection(mystring.levels), header=TRUE, sep=",")
cookies.data
> cookies.data
Temp Time GoodCookies
1 -1 -1 50
2 -1 -1 51
3 -1 1 60
4 -1 1 61
5 1 -1 60
6 1 -1 61
7 1 1 70
8 1 1 71
9 0 0 61
10 0 0 62
11 0 0 61
12 0 0 59
> lm.centerpoints = lm(GoodCookies ~ Temp*Time, data = cookies.data)
> summary(lm.centerpoints)
Call: lm(formula = GoodCookies ~ Temp * Time, data = cookies.data)
Residuals:
Min 1Q Median 3Q Max
-1.5833 -0.5833 0.4167 0.4167 1.4167
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.058e+01 2.684e-01 225.71 < 2e-16 ***
Temp 5.000e+00 3.287e-01 15.21 3.46e-07 ***
Time 5.000e+00 3.287e-01 15.21 3.46e-07 ***
Temp:Time 9.735e-15 3.287e-01 0.00 1
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9298 on 8 degrees of freedom
Multiple R-squared: 0.983, Adjusted R-squared: 0.9766
F-statistic: 154.2 on 3 and 8 DF, p-value: 2.040e-07
In this case, there does not appear to be a significant interaction. Using 4 center points has given us a better estimate of the error. Because we have replicates, we can use the center points to do a test for lack of fit. If we see evidence of curvature, we would use response surface methods.
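One common way to run the curvature check mentioned above is to add an indicator variable that is 1 for the center runs and 0 for the factorial runs; its coefficient estimates the center-point mean minus the factorial-point mean, which is a measure of curvature. Here is a sketch using the data above (the indicator trick is a standard device; the variable names are ours):

```r
# Curvature check: compare center-point mean to factorial-point mean
Temp <- c(-1, -1, -1, -1, 1, 1, 1, 1, 0, 0, 0, 0)
Time <- c(-1, -1, 1, 1, -1, -1, 1, 1, 0, 0, 0, 0)
GoodCookies <- c(50, 51, 60, 61, 60, 61, 70, 71, 61, 62, 61, 59)
center <- as.numeric(Temp == 0 & Time == 0)   # 1 for center runs

fit <- lm(GoodCookies ~ Temp * Time + center)
coef(fit)["center"]   # 0.25: center mean barely exceeds factorial mean
```

A center coefficient near zero (here 0.25, small relative to the residual error) suggests the linear model is adequate; a large one is evidence of curvature and a reason to move to response surface methods.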