6.6 Response surface designs

Response surface designs
We use response surface designs when there is curvature in the response surface
(interactions, quadratic terms). We usually want to find a maximum or a minimum, or
target a specific value of the response variable.
We've already seen most of the tools that we'll use: linear models with interactions and
quadratic terms, interaction plots, contour, wireframe and level plots. We'll introduce a
couple of new types of design matrices.
One factor response surface
Suppose we want to optimize a value, such as yield, as a function of temperature. We
think that yield will reach a maximum at a particular temperature, which implies that
the relationship is not a simple linear function of temperature. We'll model the
relationship using an x^2 (quadratic) term.
mystring.levels="temperature,temp.squared,yield
0,0,24.21
1,1,37.41
2,4,36.66
3,9,50.83
4,16,53.49
5,25,52.49
6,36,45.38
7,49,48.79
8,64,38.97
9,81,32.99
10,100,29.29"
temp.data=read.table(textConnection(mystring.levels),header=TRUE,sep=",")
temp.data
> temp.data
   temperature temp.squared yield
1            0            0 24.21
2            1            1 37.41
3            2            4 36.66
4            3            9 50.83
5            4           16 53.49
6            5           25 52.49
7            6           36 45.38
8            7           49 48.79
9            8           64 38.97
10           9           81 32.99
11          10          100 29.29
with( temp.data, plot(yield ~ temperature))
lm.quadratic = lm(yield ~ temperature + temp.squared, data = temp.data)
summary(lm.quadratic)
> summary(lm.quadratic)

Call:
lm(formula = yield ~ temperature + temp.squared, data = temp.data)

Residuals:
   Min     1Q Median     3Q    Max
-5.234 -2.539  1.474  2.947  3.883

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   25.7531     2.8896   8.912 1.99e-05 ***
temperature   10.0827     1.3444   7.500 6.93e-05 ***
temp.squared  -1.0060     0.1295  -7.770 5.39e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.793 on 8 degrees of freedom
Multiple R-squared:  0.883,  Adjusted R-squared:  0.8537
F-statistic: 30.19 on 2 and 8 DF,  p-value: 0.0001875
We'll create a graph of the data with the estimated regression function superimposed.
First create the scatterplot.
with( temp.data, plot(yield ~ temperature))
To superimpose the estimated regression function, form a vector that encompasses the
range of the horizontal axis, incremented by a suitably small value. It appears that the
horizontal axis ranges between 0 and 10, so we will use an increment value of 0.1 to
form the vector:
vec <- seq(0, 10, by=0.1)
Then use the lines() function to superimpose the regression curve, which is obtained by
inputting the vector into the estimated regression function:
lines( vec, 25.7531 + 10.08*vec -1.006*vec^2 )
The result is a plot of the data with the fitted regression function superimposed.
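Since the goal of the one-factor example is to find the temperature that maximizes yield, the fitted quadratic can also be used to estimate that optimum directly. The following is a minimal sketch, not part of the original notes: it refits the same model with I(temperature^2) in place of the precomputed temp.squared column, so that predict() needs only temperature. The names temp.df, fit.q, t.opt, y.opt and curve.fit are introduced here for illustration.

```r
# Data from the one-factor example above
temp.df <- data.frame(
  temperature = 0:10,
  yield = c(24.21, 37.41, 36.66, 50.83, 53.49, 52.49,
            45.38, 48.79, 38.97, 32.99, 29.29)
)

# I(temperature^2) builds the quadratic term inside the formula
fit.q <- lm(yield ~ temperature + I(temperature^2), data = temp.df)
b <- coef(fit.q)

# Vertex of the fitted parabola b0 + b1*t + b2*t^2 is at t = -b1 / (2 * b2)
t.opt <- unname(-b["temperature"] / (2 * b["I(temperature^2)"]))
y.opt <- unname(predict(fit.q, newdata = data.frame(temperature = t.opt)))

# Fitted curve on a fine grid, using the full-precision coefficients
vec <- seq(0, 10, by = 0.1)
curve.fit <- predict(fit.q, newdata = data.frame(temperature = vec))

plot(yield ~ temperature, data = temp.df)
lines(vec, curve.fit)
```

Using predict() rather than typing rounded coefficients into lines() avoids transcription and rounding errors. The vertex formula gives a maximum only when the quadratic coefficient is negative, as it is here.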
Box-Behnken Designs
For screening designs, we usually have 2 levels of each factor. To fit a response surface
to two factors, we'd like to have observations at three or more levels for each factor.
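If you can only run the experiment conveniently with 3 levels (-1, 0, 1) of the factors,
you can use a three-level RSM design such as a Box-Behnken design. (Strictly, Box-Behnken
designs are defined for three or more factors; the two-factor layout below is a full
3 x 3 factorial, used here to illustrate the analysis.)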
mystring.levels="x1,x1.sq,x2,x2.sq,x1.x2,y
-1,1,-1,1,1,1.09
-1,1,0,0,0,2.88
-1,1,1,1,-1,1.20
0,0,-1,1,0,2.88
0,0,0,0,0,4.15
0,0,1,1,0,4.79
1,1,-1,1,-1,1.76
1,1,0,0,0,4.82
1,1,1,1,1,5.80"
Box.Behnken.data=read.table(textConnection(mystring.levels),header=TRUE,sep=",")
Box.Behnken.data
> Box.Behnken.data
  x1 x1.sq x2 x2.sq x1.x2    y
1 -1     1 -1     1     1 1.09
2 -1     1  0     0     0 2.88
3 -1     1  1     1    -1 1.20
4  0     0 -1     1     0 2.88
5  0     0  0     0     0 4.15
6  0     0  1     1     0 4.79
7  1     1 -1     1    -1 1.76
8  1     1  0     0     0 4.82
9  1     1  1     1     1 5.80
# To create our graphs more easily later, we use a slightly different version of the formula.
# I(x1^2) tells R to compute the x1^2 term from x1, so predict() only needs x1 and x2.
# x1*x2 expands to x1 + x2 + x1:x2; the repeated main effects are dropped automatically.
lm.BB = lm(y ~ x1 + I(x1^2) + x2 + I(x2^2) + x1*x2, data = Box.Behnken.data)
summary(lm.BB)
> summary(lm.BB)

Call:
lm(formula = y ~ x1 + I(x1^2) + x2 + I(x2^2) + x1 * x2, data = Box.Behnken.data)

Residuals:
        1         2         3         4         5         6         7         8         9
-0.262500  0.470000 -0.207500  0.293333 -0.476667  0.183333 -0.030833  0.006667  0.024167

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.6267     0.3552  13.025 0.000977 ***
x1            1.2017     0.1946   6.177 0.008545 **
I(x1^2)      -1.0150     0.3370  -3.012 0.057116 .
x2            1.0100     0.1946   5.191 0.013882 *
I(x2^2)      -1.0300     0.3370  -3.057 0.055140 .
x1:x2         0.9825     0.2383   4.123 0.025860 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4766 on 3 degrees of freedom
Multiple R-squared:  0.971,  Adjusted R-squared:  0.9227
F-statistic:  20.1 on 5 and 3 DF,  p-value: 0.01632
The fitted response function is
y = 4.6267 + 1.2017 * x1 - 1.015 * x1^2 + 1.01 * x2 - 1.03 * x2^2 + 0.9825 * x1 * x2
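A fitted second-order surface has a single stationary point, which can be found by setting both partial derivatives to zero. The sketch below is an illustration rather than part of the original notes. It writes the fitted model as y = b0 + x'b + x'Bx, where b holds the linear coefficients and B is the symmetric matrix with the quadratic coefficients on the diagonal and half the interaction coefficient off the diagonal, then solves for x.stat = -0.5 * B^(-1) b. The names bb, fit, b.lin, B, x.stat and eig are assumptions made for this example.

```r
# Data from the three-level example above (coded units)
bb <- data.frame(
  x1 = rep(c(-1, 0, 1), each = 3),
  x2 = rep(c(-1, 0, 1), times = 3),
  y  = c(1.09, 2.88, 1.20, 2.88, 4.15, 4.79, 1.76, 4.82, 5.80)
)

fit <- lm(y ~ x1 + x2 + I(x1^2) + I(x2^2) + x1:x2, data = bb)
b <- coef(fit)

# Linear part and symmetric quadratic part of the fitted surface
b.lin <- c(b["x1"], b["x2"])
B <- matrix(c(b["I(x1^2)"], b["x1:x2"] / 2,
              b["x1:x2"] / 2, b["I(x2^2)"]), nrow = 2)

# Stationary point: gradient b.lin + 2 * B %*% x = 0
x.stat <- -0.5 * solve(B, b.lin)

# Both eigenvalues negative -> maximum; both positive -> minimum;
# mixed signs -> saddle point
eig <- eigen(B)$values

print(x.stat)
print(eig)
```

If the stationary point falls outside the range of the design, as it may here for x1, the estimate is an extrapolation and is best treated as a pointer to where further runs should go rather than as the optimum itself.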
We'll use lattice to create some graphs.
Load the lattice library if you have not already done so.
library(lattice)
We want to be able to display the regression surface. To do so, we first evaluate the
regression function (lm.BB) on a regular grid of predictor values. See the textbook
"Lattice" by Deepayan Sarkar for more details and advanced examples.
We first create the vectors of values for each predictor that define the margins of the
grid. do.breaks(range, 50) splits each range into 50 equal intervals, which gives 51
points per axis.
x1.mesh = with(Box.Behnken.data, do.breaks(range(x1), 50))
x2.mesh = with(Box.Behnken.data, do.breaks(range(x2), 50))
x1.mesh
x2.mesh
> x1.mesh
[1] -1.00 -0.96 -0.92 -0.88 -0.84 -0.80 -0.76 -0.72 -0.68 -0.64 -0.60 -0.56
[13] -0.52 -0.48 -0.44 -0.40 -0.36 -0.32 -0.28 -0.24 -0.20 -0.16 -0.12 -0.08
[25] -0.04 0.00 0.04 0.08 0.12 0.16 0.20 0.24 0.28 0.32 0.36 0.40
[37] 0.44 0.48 0.52 0.56 0.60 0.64 0.68 0.72 0.76 0.80 0.84 0.88
[49] 0.92 0.96 1.00
> x2.mesh
[1] -1.00 -0.96 -0.92 -0.88 -0.84 -0.80 -0.76 -0.72 -0.68 -0.64 -0.60 -0.56
[13] -0.52 -0.48 -0.44 -0.40 -0.36 -0.32 -0.28 -0.24 -0.20 -0.16 -0.12 -0.08
[25] -0.04 0.00 0.04 0.08 0.12 0.16 0.20 0.24 0.28 0.32 0.36 0.40
[37] 0.44 0.48 0.52 0.56 0.60 0.64 0.68 0.72 0.76 0.80 0.84 0.88
[49] 0.92 0.96 1.00
We'll use the base R function expand.grid() to create the grid.
grid=expand.grid(x1=x1.mesh,x2=x2.mesh)
head(grid)  # the full grid has 51 * 51 = 2601 rows, so print only the first few
Finally, we add a column to the grid data frame containing the predicted values for the
model.
grid[["fit.lm"]] = predict(lm.BB , newdata=grid)
We'll use this grid data frame to create some plots.
wireframe(fit.lm ~ x1 * x2, grid, zlab="Response")
contourplot(fit.lm ~ x1 * x2, grid, zlab="Response")
levelplot (fit.lm ~ x1 * x2, grid, zlab="Response", col.regions = topo.colors)
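The same grid of predictions can be searched numerically for the best predicted response inside the experimental region, which is a useful check on what the plots show. This is a minimal sketch under stated assumptions: it rebuilds the data and model so that it runs on its own, and it uses seq() instead of do.breaks() so that only base R is needed. The names bb, fit, mesh, pred.grid and best are hypothetical.

```r
# Data from the three-level example above (coded units)
bb <- data.frame(
  x1 = rep(c(-1, 0, 1), each = 3),
  x2 = rep(c(-1, 0, 1), times = 3),
  y  = c(1.09, 2.88, 1.20, 2.88, 4.15, 4.79, 1.76, 4.82, 5.80)
)
fit <- lm(y ~ x1 + I(x1^2) + x2 + I(x2^2) + x1:x2, data = bb)

# 51 equally spaced points per axis, matching do.breaks(range, 50)
mesh <- seq(-1, 1, length.out = 51)
pred.grid <- expand.grid(x1 = mesh, x2 = mesh)
pred.grid$fit.lm <- predict(fit, newdata = pred.grid)

# Grid point with the largest predicted response
best <- pred.grid[which.max(pred.grid$fit.lm), ]
print(best)
```

A maximum that lands on the edge of the grid suggests that the true optimum may lie outside the region covered by the design.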
Central Composite Designs
If we are not limited to 3 levels of each X variable, but can instead choose more than 3
levels, Central Composite Designs (CCD) are a common choice.
Here is an example of a design matrix for a two-factor CCD, listed as x1,x2 pairs in coded units.
0,2
-1,1
-2,0
1,1
0,0
-1,-1
2,0
1,-1
0,-2
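A two-factor CCD combines three sets of points: the four factorial corners, four axial (star) points on the axes, and a center point. The sketch below shows one way to build the listing above in base R; it is an illustration, not the method used in the original notes. The axial distance alpha = 2 is read off the listing; a rotatable two-factor CCD would instead use alpha = sqrt(2). The names alpha, factorial.pts, axial.pts, center.pt and ccd.design are introduced for this example.

```r
alpha <- 2  # axial distance, taken from the listing above (assumption: coded units)

# Factorial (corner) points at +/- 1
factorial.pts <- expand.grid(x1 = c(-1, 1), x2 = c(-1, 1))

# Axial (star) points at distance alpha along each axis
axial.pts <- data.frame(x1 = c(-alpha, alpha, 0, 0),
                        x2 = c(0, 0, -alpha, alpha))

# Single center point
center.pt <- data.frame(x1 = 0, x2 = 0)

ccd.design <- rbind(factorial.pts, axial.pts, center.pt)
print(ccd.design)
```

In practice the center point is often replicated several times to give an estimate of pure error, and the run order is randomized before the experiment is carried out.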
We can use this design for the following data.
mystring.levels="x1,x1.sq,x2,x2.sq,x1.x2,y
-2,4,0,0,0,-1.90
-1,1,1,1,-1,1.88
-1,1,-1,1,1,1.20
0,0,2,4,0,2.88
0,0,0,0,0,4.15
0,0,-2,4,0,-1.20
1,1,1,1,1,5.76
1,1,-1,1,-1,1.82
2,4,0,0,0,2.80"
CCD.data=read.table(textConnection(mystring.levels),header=TRUE,sep=",")
CCD.data
> CCD.data
  x1 x1.sq x2 x2.sq x1.x2     y
1 -2     4  0     0     0 -1.90
2 -1     1  1     1    -1  1.88
3 -1     1 -1     1     1  1.20
4  0     0  2     4     0  2.88
5  0     0  0     0     0  4.15
6  0     0 -2     4     0 -1.20
7  1     1  1     1     1  5.76
8  1     1 -1     1    -1  1.82
9  2     4  0     0     0  2.80
lm.CCD = lm(y ~ x1 + I(x1^2) + x2 + I(x2^2) + x1*x2, data = CCD.data)
# Same model formula as before: I() computes the squared terms from x1 and x2.
summary(lm.CCD)
Call:
lm(formula = y ~ x1 + I(x1^2) + x2 + I(x2^2) + x1 * x2, data = CCD.data)

Residuals:
        1         2         3         4         5         6         7         8         9
-0.092778  0.242222  0.062222 -0.149444 -0.237778  0.030556  0.175556 -0.004444 -0.026111

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.38778    0.18383  23.869 0.000161 ***
x1           1.15833    0.07120  16.270 0.000505 ***
I(x1^2)     -0.96958    0.06893 -14.065 0.000778 ***
x2           1.06500    0.07120  14.959 0.000648 ***
I(x2^2)     -0.87208    0.06893 -12.651 0.001065 **
x1:x2        0.81500    0.12331   6.609 0.007053 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2466 on 3 degrees of freedom
Multiple R-squared:  0.9961,  Adjusted R-squared:  0.9895
F-statistic: 151.5 on 5 and 3 DF,  p-value: 0.000838
Prepare data for graphic display:
Create the vectors of values for each predictor that define the margins of the grid.
As before, we use 50 intervals (51 points) along each axis.
x1.mesh = with(CCD.data, do.breaks(range(x1), 50))
x2.mesh = with(CCD.data, do.breaks(range(x2), 50))
x1.mesh
x2.mesh
> x1.mesh
[1] -2.00 -1.92 -1.84 -1.76 -1.68 -1.60 -1.52 -1.44 -1.36 -1.28 -1.20 -1.12
[13] -1.04 -0.96 -0.88 -0.80 -0.72 -0.64 -0.56 -0.48 -0.40 -0.32 -0.24 -0.16
[25] -0.08 0.00 0.08 0.16 0.24 0.32 0.40 0.48 0.56 0.64 0.72 0.80
[37] 0.88 0.96 1.04 1.12 1.20 1.28 1.36 1.44 1.52 1.60 1.68 1.76
[49] 1.84 1.92 2.00
> x2.mesh
[1] -2.00 -1.92 -1.84 -1.76 -1.68 -1.60 -1.52 -1.44 -1.36 -1.28 -1.20 -1.12
[13] -1.04 -0.96 -0.88 -0.80 -0.72 -0.64 -0.56 -0.48 -0.40 -0.32 -0.24 -0.16
[25] -0.08 0.00 0.08 0.16 0.24 0.32 0.40 0.48 0.56 0.64 0.72 0.80
[37] 0.88 0.96 1.04 1.12 1.20 1.28 1.36 1.44 1.52 1.60 1.68 1.76
[49] 1.84 1.92 2.00
Use the base R function expand.grid() to create the grid.
grid=expand.grid(x1=x1.mesh,x2=x2.mesh)
head(grid)  # the full grid has 51 * 51 = 2601 rows, so print only the first few
# Add a column to the grid data frame containing the predicted values for the model.
grid[["fit.lm"]] = predict(lm.CCD , newdata=grid)
# We'll use this grid data frame to create some plots.
wireframe(fit.lm ~ x1 * x2, grid, zlab="Response")
contourplot(fit.lm ~ x1 * x2, grid, zlab="Response")
levelplot (fit.lm ~ x1 * x2, grid, zlab="Response", col.regions = topo.colors)
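Relative to the center point (0, 0), both the Box-Behnken and the CCD fits suggest that
the response increases as x1 and x2 both increase; solving for the stationary point of
each fitted surface puts the estimated maximum near x1 = 1, x2 = 1 in coded units. So
our next step could be additional runs in that direction.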
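One common way to choose where to place follow-up runs is to step along the direction of steepest ascent from the center of the design. The following sketch is an illustration under stated assumptions, not a procedure given in the original notes. At the origin the gradient of the fitted quadratic is just the pair of linear coefficients, so the direction is that vector scaled to unit length. The step sizes and the names ccd.df, fit, grad0, direction, steps and path are hypothetical choices for this example.

```r
# CCD data from the example above (coded units)
ccd.df <- data.frame(
  x1 = c(-2, -1, -1, 0, 0, 0, 1, 1, 2),
  x2 = c(0, 1, -1, 2, 0, -2, 1, -1, 0),
  y  = c(-1.90, 1.88, 1.20, 2.88, 4.15, -1.20, 5.76, 1.82, 2.80)
)
fit <- lm(y ~ x1 + I(x1^2) + x2 + I(x2^2) + x1:x2, data = ccd.df)
b <- coef(fit)

# Gradient of the fitted surface at the center point (0, 0)
grad0 <- c(x1 = unname(b["x1"]), x2 = unname(b["x2"]))

# Unit vector in the direction of steepest ascent
direction <- grad0 / sqrt(sum(grad0^2))

# Candidate points at increasing distance from the center (coded units)
steps <- c(0, 0.5, 1, 1.5, 2)
path <- data.frame(step = steps,
                   x1 = steps * direction[["x1"]],
                   x2 = steps * direction[["x2"]])
path$predicted <- predict(fit, newdata = path)
print(path)
```

The predicted values along the path are only as trustworthy as the fitted model, so in practice each candidate point would be run and the observed responses compared before moving the design.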
When we start working with more than 2 factors we want graphics specific for DOE.
Such graphics are available in commercial DOE software packages such as Design-Expert,
JMP, Statistica, Minitab, and others. These packages also help find minima, maxima, and
targets in high dimensional space. They also allow multiple response variables (such as
yield, taste, and cost for cookies) to be simultaneously optimized.
A useful tool is the rsm library, written by Russell Lenth at the University of Iowa:
cran.r-project.org/web/packages/rsm/vignettes/rsm.pdf
From the rsm library documentation:
"Functions are provided to generate central-composite and Box-Behnken designs. For
analysis of the resulting data, the package provides for estimating the response surface,
testing its lack of fit, displaying an ensemble of contour plots of the fitted surface, and
doing follow-up analyses such as steepest ascent, canonical analysis, and ridge analysis.
It also implements a coded-data structure to aid in this essential aspect of the
methodology. The functions are designed in hopes of providing an intuitive and effective
user interface."