RESPONSE SURFACE METHODS AND DESIGNS

Introduction
First-order model
The method of steepest ascent
Second-order model. Central composite design
Characterizing the response surface
Response surface methods ... with Minitab

1. Introduction

Response surface methodology (RSM) is a collection of mathematical and statistical techniques for modeling and analyzing problems in which a response of interest is influenced by several variables and the aim is to optimize this response. How is a particular response affected by a given set of input variables over some specified region of interest? What values of the inputs will yield a maximum (or minimum) for a specific response? What does the relationship between the response and the factors look like close to this maximum (or minimum)?

For instance, suppose we wish to find the levels of two factors x1, x2 that maximize the response variable y of a process:

$$y = f(x_1, x_2) + \varepsilon \qquad (\varepsilon:\ \text{noise})$$

The surface represented by f(x1, x2) is called a response surface, graphically represented as a solid surface in a three-dimensional space. In the contour plot, lines of constant response are drawn in the x1, x2 plane, which helps to visualize the shape of the response surface. Each contour corresponds to a particular height of the response surface. Such a plot is helpful in studying the levels of x1 and x2 that result in changes in the shape or height of the response surface.

Steps in RSM

1st Step: To find a suitable approximation for the true functional relationship between y and the set of independent variables (usually a low-order polynomial in some region of the independent variables: a first-order model, or a second-order model if there is curvature in the system).
2nd Step: To estimate the parameters in the approximating polynomials (to find the maximum response, for instance).
3rd Step: To do the response surface analysis in terms of the fitted surface.
If the fitted surface is an adequate approximation of the true response function, then analysis of the fitted surface will be approximately equivalent to analysis of the actual system.

Features of RSM

- Sequential procedure.
- The experimental problem can be understood in geometric terms.
- A) approximating model, B) estimation of the parameters (design), C) data analysis, D) fitting of the model.

Often, when we are at a point on the response surface that is remote from the optimum, there is little curvature in the system and the first-order model will be appropriate. The objective is then to lead the experimenter rapidly and efficiently to the general vicinity of the optimum. Once the region of the optimum has been found, a more elaborate model, such as the second-order model, may be employed, and an analysis may be performed to locate the optimum. The eventual objective is to determine the optimum operating conditions for the system, or to determine a region of the factor space in which operating specifications are satisfied. RSM guarantees convergence to a local optimum only.

The model parameters can be estimated most effectively if proper experimental designs are used to collect the data.

1st-order strategies
- $2^{k-p}$ designs (+ add or eliminate factors, replicated experiments, etc.).
- The method of least squares.
- Linear model: $y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
- Contour plots.
- The method of steepest ascent.

2nd-order strategies
- Central composite design.
- The method of least squares.
- Quadratic model: $y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \beta_{12} X_1 X_2$
- Contour plots and canonical analysis.

2. First-order model

Which temperature and pressure give the minimal porosity index?

Porosity Index = F(Temperature, Pressure)

The design is a $2^2$ factorial augmented by three center points. Repeat observations at the center allow us to estimate the experimental error and to check the adequacy of the first-order model.
Temperature (ºC)   Pressure (kg/cm2)   Porosity Index
-1 (640)           -1 (950)            6.09
+1 (660)           -1 (950)            5.53
-1 (640)           +1 (1000)           6.78
+1 (660)           +1 (1000)           6.16
 0 (650)            0 (975)            5.93
 0 (650)            0 (975)            6.12
 0 (650)            0 (975)            5.92

Advised working region for temperature 600-900 ºC; for pressure 700 kg/cm2.

Writing the first-order model in matrix notation, we have Y = Xβ + ε, where

$$Y = \begin{pmatrix} 6.09 \\ 5.53 \\ 6.78 \\ 6.16 \\ 5.93 \\ 6.12 \\ 5.92 \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & -1 & -1 \\ 1 & 1 & -1 \\ 1 & -1 & 1 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}$$

That is to say, we obtain the following model in the coded variables:

Y = β0 + β1·Temperature + β2·Pressure

Before exploring further, the adequacy of the model should be investigated with the usual linear regression tools.

Coefficients and overall check
H0: nonsignificant coefficient versus H1: significant coefficient. All the coefficients turn out to be significant.
H0: linear model is nonsignificant versus H1: linear model is significant. Overall regression is significant. There is no reason to question the adequacy of the first-order model.

Interaction check
H0: β12 is nonsignificant versus H1: β12 is significant. The interaction between the variables, measured by the coefficient β12 of the cross-product term, is nonsignificant.

Curvature check
H0: there is no curvature versus H1: there is curvature. There is no indication of a pure quadratic effect → curvature is nonsignificant.

Estimation of error
The repeat observations at the center can be used to calculate an estimate of σ² that does not depend on the model. This estimate can be compared with the estimate obtained from the $2^2$ design by means of a test for comparing two variances. If the two estimates are significantly different → the model is not adequate, and a quadratic model or a data transformation is required to approximate the response.

3. The method of steepest ascent

The initial estimate of the optimum operating conditions for the system is frequently far from the actual optimum. Thus, we aim at moving rapidly to the general vicinity of the optimum.
When we are remote from the optimum, we usually assume that a first-order model is an adequate approximation to the true surface in a small region of the x's.

The method of steepest ascent is a procedure for moving sequentially along the path in the direction of the maximum increase in the response. If minimization is desired → the method of steepest descent.

The steps along the path are proportional to the regression coefficients ($\beta_i$). The actual step size is determined by the experimenter based on process knowledge or other practical considerations.

Experiments are conducted along the path of steepest ascent (or descent) until no further increase (or decrease) in response is observed. Then a new first-order model may be fit, a new path of steepest ascent determined, and the procedure continued. Eventually, the experimenter arrives in the vicinity of the optimum, which is usually indicated by lack of fit of a first-order model. Then additional experiments are conducted to obtain a more precise estimate of the optimum.

From the example above we consider the first-order model:

Y = 6.076 - 0.295 T + 0.33 P

The components of the gradient are:

$$\frac{\partial Y}{\partial T} = -0.295, \qquad \frac{\partial Y}{\partial P} = 0.33$$

Since we are interested in the direction of steepest descent, we take the opposite of the gradient:

d = [ 0.295 , -0.33 ]

To move away from the design center (x1, x2) = (0, 0) along the path of steepest descent, we would move 0.295 units in the x1 direction for every -0.33 units in the x2 direction. Hence the path of steepest descent has slope -0.33/0.295.

We can choose the basic step size to move along the path. In this case we decide to use the normalized vector:

u = [ 0.295/0.44 , -0.33/0.44 ] = [ 0.67 , -0.75 ], where $0.44 = \sqrt{0.295^2 + (-0.33)^2}$.

We compute points along this path and observe the porosity index at these points until an increase in response is noted.
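The fit and the direction above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names and the helper `to_natural` are illustrative and not part of the original notes. It uses the first design's data and the coding x1 = (T - 650)/10, x2 = (P - 975)/25 of this example.

```python
import numpy as np

# Coded 2^2 factorial plus three center points (x1 = temperature, x2 = pressure).
x1 = np.array([-1, 1, -1, 1, 0, 0, 0], dtype=float)
x2 = np.array([-1, -1, 1, 1, 0, 0, 0], dtype=float)
y = np.array([6.09, 5.53, 6.78, 6.16, 5.93, 6.12, 5.92])

# Least-squares fit of the first-order model y = b0 + b1*x1 + b2*x2.
X = np.column_stack([np.ones_like(x1), x1, x2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Steepest descent is the negative gradient; u is its unit-length version.
d = -np.array([b1, b2])
u = d / np.linalg.norm(d)

def to_natural(x):
    """Map coded (x1, x2) to (temperature in ºC, pressure in kg/cm2)."""
    return np.array([650.0 + 10.0 * x[0], 975.0 + 25.0 * x[1]])

# Candidate runs along the path at 3, 5, 7 and 9 unit steps from the center.
for step in (3, 5, 7, 9):
    coded = step * u
    print(step, np.round(coded, 2), np.round(to_natural(coded), 1))
```

The printed points may differ slightly from a hand calculation that rounds u to two decimals first, and in practice the natural levels would be rounded to convenient settings before running the experiments.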
Step   Move (coded units)      Temperature   Pressure   Porosity Index
1      3u = (2.01 , -2.25)     670           920        4.53
2      5u = (3.35 , -3.75)     685           880        3.28
3      7u = (4.69 , -5.25)     700           845        2.91
4      9u = (6.03 , -6.75)     710           805        4.15

If T and P are the natural variables, the coded variables are:

$$x_1 = \frac{T - 650}{10}, \qquad x_2 = \frac{P - 975}{25}$$

A decrease in response is observed through the third step. Beyond this point the porosity index increases. Therefore, another first-order model should be fit in the general vicinity of the point (700, 845). Once again, a $2^2$ design with three center points is used.

Temperature   Pressure    Porosity
-1 (690)      -1 (820)    2.57
+1 (710)      -1 (820)    4.08
-1 (690)      +1 (870)    3.23
+1 (710)      +1 (870)    3.86
 0 (700)       0 (845)    2.90
 0 (700)       0 (845)    2.67
 0 (700)       0 (845)    2.91

From these new data we obtain the following first-order model:

Y = 3.174 + 0.535 T + 0.11 P

Coefficients and overall check
β1 and β2 are nonsignificant. Overall regression is nonsignificant. The first-order model is not an adequate approximation.

Curvature check
Curvature is significant → a quadratic model is appropriate.

4. Second-order model. Central composite design

We augment the design with enough points to fit a second-order model. The second design is called a star design, and its experiments are distributed as follows:

$$X = \begin{pmatrix} 1 & -\alpha & 0 \\ 1 & \alpha & 0 \\ 1 & 0 & -\alpha \\ 1 & 0 & \alpha \\ 1 & 0 & 0 \\ \vdots & \vdots & \vdots \end{pmatrix}$$

Desirable features of this design: orthogonality (minimal variance of the regression coefficients) and rotatability (equal precision of estimation in all directions). The star design is made orthogonal and rotatable by the choice of α and of the number of center points. Then,

factorial design (1st design) + star design (2nd design) = central composite design

The table below displays how to choose the value of α and the number of center points for some central composite designs:

k     2        3        4     5        5, ½   6        6, ½     7        7, ½
nf    4        8        16    32       16     64       32       128      64
n0f   3        4        4     8        6      8        8        16       8
ne    4        6        8     10       10     12       12       14       14
n0e   3        2        2     4        1      6        2        11       4
α     1.4142   1.6818   2     2.3784   2      2.8284   2.3784   3.3636   2.8284

k: number of factors.
nf: experiments in factorial design.
n0f: center points in factorial design.
ne: experiments in star design.
n0e: center points in star design.

In the example above:

Temperature   Pressure     Porosity Index
-√2 (685)      0 (845)     2.66
+√2 (715)      0 (845)     4.04
 0 (700)      -√2 (810)    3.54
 0 (700)      +√2 (880)    3.40
 0 (700)       0 (845)     2.84
 0 (700)       0 (845)     2.92
 0 (700)       0 (845)     2.81

From these new data we can estimate a second-order model by using the method of least squares.

Analysis of Regression: Checks and Coefficients
Since the adjusted R² is 95.39%, the quadratic model is adequate to represent the variability of the porosity index as a function of pressure and temperature. However, the coefficient β2 (pressure) is nonsignificant → we can eliminate it from the model. The resulting model is:

$$Y = 2.84 + 0.51\,T - 0.22\,TP + 0.26\,T^2 + 0.32\,P^2$$

Blocking
It is often necessary to consider blocking to eliminate nuisance variables. This may occur when a second-order design is assembled sequentially from a first-order design. Considerable time may elapse between the running of the first-order design and the running of the second-order design. During this time test conditions may change → blocking.

5. Characterizing the response surface

How do we use this fitted model to find the optimum set of operating conditions for the x's and to characterize the nature of the response surface? Once we have found the stationary point, we must decide whether it is a maximum, a minimum, a saddle point or a ridge.

The stationary point, if it exists, is the solution to

$$\frac{\partial Y}{\partial x_1} = \frac{\partial Y}{\partial x_2} = \dots = 0$$

and could represent a point of maximum response, a point of minimum response, or a saddle point.

5.1. Contour plots

The most straightforward way to characterize the nature of the response surface is to examine a contour plot of the fitted model. If there are only two or three process variables, the construction and interpretation of this contour plot is relatively easy.
For instance, when we deal with two factors, these represent the coordinate axes. From the evaluation of the quadratic model, we get a range of values corresponding to the response variable.

[Figure: 3D plot of the fitted response along with its contour plot]

In the example above, the minimal porosity is reached at a temperature of about 690 ºC and a pressure of about 835 kg/cm2:

$$\frac{\partial y}{\partial x_1} = 0.51 + 0.52\,x_1 - 0.22\,x_2 = 0, \qquad \frac{\partial y}{\partial x_2} = 0.64\,x_2 - 0.22\,x_1 = 0$$

Hence, x1 = -1.15, x2 = -0.39. The coordinates of this point in the natural variables and the estimated porosity there are:

Pc = (688.5 ºC, 835.25 kg/cm2), YPc = 2.55

However, even when there are relatively few variables, a more formal analysis, called canonical analysis, can be useful.

5.2. Canonical analysis

Surfaces can be classified according to their canonical form. To obtain the canonical form of a surface, it is helpful first to transform the model into a new coordinate system with the origin at the stationary point, and then to rotate the axes of this system until they are parallel to the principal axes of the fitted response surface.

If we write the second-order model in matrix notation, we have

$$Y = \beta_0 + X^t \beta + X^t B X$$

where

$$\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \vdots \\ \beta_k \end{pmatrix}, \qquad X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_k \end{pmatrix}, \qquad B = \begin{pmatrix} \beta_{11} & \beta_{12}/2 & \cdots & \beta_{1k}/2 \\ \beta_{12}/2 & \beta_{22} & \cdots & \beta_{2k}/2 \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{1k}/2 & \beta_{2k}/2 & \cdots & \beta_{kk} \end{pmatrix}$$

The previous transformation results in the canonical form of the model

$$y = y_0 + \lambda_1 \tilde{x}_1^2 + \lambda_2 \tilde{x}_2^2 + \dots + \lambda_k \tilde{x}_k^2$$

where y0 is the value of the response at the stationary point x0, the λi are the eigenvalues of the matrix B, and the $\tilde{x}_i$ are the transformed independent variables. The variables xi are related to the canonical variables $\tilde{x}_i$ by

$$\tilde{x} = M^t (x - x_0)$$

The columns of M are the normalized eigenvectors associated with the λi's.

We focus again on the example about the minimal porosity, for which the stationary point is (-1.15, -0.39):

$$\beta_0 = 2.84, \qquad X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0.51 \\ 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0.26 & -0.11 \\ -0.11 & 0.32 \end{pmatrix}$$

The eigenvalues of the matrix B are λ1 = 0.4 and λ2 = 0.18. The normalized eigenvectors associated with λ1 = 0.4 and with λ2 = 0.18 are (-0.61, 0.79) and (0.79, 0.61), respectively.
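These quantities can be reproduced from the 14 runs of the central composite design. The sketch below is illustrative only: it assumes NumPy, takes the axial distance as α = √2 (the value tabulated for k = 2), and fits the reduced quadratic model without the linear pressure term, as in the text. The variable names are not from the original notes.

```python
import numpy as np

s = np.sqrt(2.0)  # assumed axial distance alpha for k = 2

# Coded runs: factorial block (4 corners + 3 centers), then star block (4 axial + 3 centers).
x1 = np.array([-1, 1, -1, 1, 0, 0, 0, -s, s, 0, 0, 0, 0, 0])
x2 = np.array([-1, -1, 1, 1, 0, 0, 0, 0, 0, -s, s, 0, 0, 0])
y = np.array([2.57, 4.08, 3.23, 3.86, 2.90, 2.67, 2.91,
              2.66, 4.04, 3.54, 3.40, 2.84, 2.92, 2.81])

# Least-squares fit of y = b0 + b1*x1 + b12*x1*x2 + b11*x1^2 + b22*x2^2.
X = np.column_stack([np.ones_like(x1), x1, x1 * x2, x1**2, x2**2])
b0, b1, b12, b11, b22 = np.linalg.lstsq(X, y, rcond=None)[0]

b = np.array([b1, 0.0])                          # linear part (pressure term dropped)
B = np.array([[b11, b12 / 2], [b12 / 2, b22]])   # symmetric quadratic part

# Stationary point: the gradient b + 2*B*x vanishes.
x0 = -0.5 * np.linalg.solve(B, b)
y0 = b0 + 0.5 * b @ x0   # fitted response at the stationary point

# eigh returns the eigenvalues in ascending order, so lam[1] plays the role of
# lambda_1 in the text; each eigenvector (column of M) is defined up to sign.
lam, M = np.linalg.eigh(B)
print(np.round(x0, 2), round(y0, 2), np.round(lam, 2))
```

Because the linear pressure column is orthogonal to the other columns in this design, dropping it leaves the remaining coefficient estimates unchanged.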
Hence:

$$M = \begin{pmatrix} -0.61 & 0.79 \\ 0.79 & 0.61 \end{pmatrix}$$

The canonical form of the fitted model is:

$$y = 2.55 + 0.4\,\tilde{x}_1^2 + 0.18\,\tilde{x}_2^2$$

The relationship between the coded variables, xi, and the canonical variables, $\tilde{x}_i$, is:

$$\tilde{x}_1 = -0.61\,(x_1 + 1.15) + 0.79\,(x_2 + 0.39) = -0.61\,x_1 + 0.79\,x_2 - 0.39$$
$$\tilde{x}_2 = 0.79\,(x_1 + 1.15) + 0.61\,(x_2 + 0.39) = 0.79\,x_1 + 0.61\,x_2 + 1.15$$

The nature of the response surface can be determined from the stationary point and the sign and magnitude of the eigenvalues of the matrix B. Suppose that we are working with two factors.

If the stationary point is within the region of exploration for fitting the second-order model:
1) If λ1 and λ2 are both negative, x0 is a single point of maximum response. If λ1 and λ2 are both positive, x0 is a single point of minimum response.
2) If λ1 and λ2 have different signs, x0 is a saddle point.
3) If λ2 = 0 (or λ1 = 0), the surface presents a stationary ridge (not a single point of maximum or minimum, but a line of maxima or minima).

If the stationary point is far outside the region of exploration for fitting the second-order model and one or more eigenvalues are near zero, another canonical form may be helpful:
4) If λ2 = 0 (or λ1 = 0) and there is a linear term in $\tilde{x}_1$ (or in $\tilde{x}_2$), the surface presents a rising (or falling) ridge.

We illustrate this discussion using again the example about the minimal porosity.

- Both eigenvalues are positive → the stationary point is a minimum, with coordinates (688.5 ºC, 835.1 kg/cm2).
- If we were interested in predicting the response for certain levels of the factors in the vicinity of the stationary point, we should convert the xi (coded variables) into the natural variables through the relationship:

$$x_1 = \frac{T - 700}{10}, \qquad x_2 = \frac{P - 845}{25}$$

Therefore, the equation of the model, using the natural variables, is:

$$Y = 1087.6 - 2.85\,T - 0.25\,P + 0.0026\,T^2 + 0.00051\,P^2 - 0.00088\,T \cdot P$$

Local approximation: the further we move from the stationary point, the less precise the predicted response is.
- Porosity changes more rapidly in the $\tilde{x}_1$ direction ($\tilde{x}_2 = 0$) than in the $\tilde{x}_2$ direction ($\tilde{x}_1 = 0$), because λ1 = 0.4 is larger than λ2 = 0.18. The direction of smallest change in porosity is

$$\tilde{x}_1 = 0 = -0.61\,x_1 + 0.79\,x_2 - 0.39$$

and that of greatest change is

$$\tilde{x}_2 = 0 = 0.79\,x_1 + 0.61\,x_2 + 1.15$$

- For a given porosity index, there is a whole range of combinations of temperature and pressure that yield a porosity no larger than that value. We just have to find the contour for this particular value and choose any combination within the region enclosed by the contour.

6. Response surface methods... with Minitab

1) Create response surface design

» Options → randomize runs

Central Composite Design

Factors:       4     Replicates:     1
Base runs:     30    Total runs:     30
Base blocks:   2     Total blocks:   2

Two-level factorial: Full factorial

Cube points:              16
Center points in cube:    4
Axial points:             8
Center points in axial:   2

Alpha: 2

2) Analyze response surface design

1st Trial

» Include blocks in the model

Response Surface Regression: Prod versus Block; T; Comp1; Comp2; rpm

The analysis was done using coded units.
Estimated Regression Coefficients for Prod

Term           Coef       SE Coef   T        P
Constant       9,55825    0,3673    26,025   0,000
Block          0,03525    0,1721    0,205    0,841
T              0,04333    0,1814    0,239    0,815
Comp1          0,73000    0,1814    4,025    0,001
Comp2          0,47667    0,1814    2,628    0,020
rpm            0,02417    0,1814    0,133    0,896
T*T            -1,52500   0,1697    -8,988   0,000
Comp1*Comp1    0,21375    0,1697    1,260    0,228
Comp2*Comp2    0,10875    0,1697    0,641    0,532
rpm*rpm        0,17875    0,1697    1,054    0,310
T*Comp1        -0,10750   0,2221    -0,484   0,636
T*Comp2        0,10750    0,2221    0,484    0,636
T*rpm          -0,18375   0,2221    -0,827   0,422
Comp1*Comp2    0,23625    0,2221    1,063    0,306
Comp1*rpm      0,18500    0,2221    0,833    0,419
Comp2*rpm      -0,05500   0,2221    -0,248   0,808

S = 0,8886   R-Sq = 89,5%   R-Sq(adj) = 78,3%

Analysis of Variance for Prod

Source           DF   Seq SS    Adj SS    Adj MS    F       P
Blocks           1    0,033     0,0331    0,0331    0,04    0,841
Regression       14   94,630    94,6299   6,7593    8,56    0,000
  Linear         4    18,302    18,3017   4,5754    5,79    0,006
  Square         4    73,929    73,9291   18,4823   23,41   0,000
  Interaction    6    2,399     2,3991    0,3998    0,51    0,794
Residual Error   14   11,054    11,0544   0,7896
  Lack-of-Fit    10   10,596    10,5959   1,0596    9,24    0,023
  Pure Error     4    0,459     0,4585    0,1146
Total            29   105,717

Unusual Observations for Prod

Obs   StdOrder   Prod     Fit      SE Fit   Residual   St Resid
10    10         8,970    7,353    0,688    1,617      2,88 R
27    27         11,260   10,190   0,716    1,070      2,04 R

R denotes an observation with a large standardized residual.

Estimated Regression Coefficients for Prod using data in uncoded units

Term           Coef
Constant       -223,7936
Block          0,0352
T              2,5617
Comp1          -0,1345
Comp2          -1,2590
rpm            0,0009
T*T            -0,0068
Comp1*Comp1    0,0021
Comp2*Comp2    0,0272
rpm*rpm        0,0000
T*Comp1        -0,0007
T*Comp2        0,0036
T*rpm          -0,0000
Comp1*Comp2    0,0118
Comp1*rpm      0,0000
Comp2*rpm      -0,0001

» No significant interaction → interaction terms OUT
» No significant blocking effect → blocks OUT (do not include blocks in the model)

2nd Trial

Response Surface Regression: Prod versus T; Comp1; Comp2; rpm

The analysis was done using coded units.
Estimated Regression Coefficients for Prod

Term           Coef       SE Coef   T        P
Constant       9,57000    0,3272    29,251   0,000
T              0,04333    0,1636    0,265    0,794
Comp1          0,73000    0,1636    4,463    0,000
Comp2          0,47667    0,1636    2,914    0,008
rpm            0,02417    0,1636    0,148    0,884
T*T            -1,52500   0,1530    -9,966   0,000
Comp1*Comp1    0,21375    0,1530    1,397    0,177
Comp2*Comp2    0,10875    0,1530    0,711    0,485
rpm*rpm        0,17875    0,1530    1,168    0,256

S = 0,8014   R-Sq = 87,2%   R-Sq(adj) = 82,4%

» Neither rpm nor rpm*rpm is significant → OUT

3rd Trial

Response Surface Regression: Prod versus T; Comp1; Comp2

The analysis was done using coded units.

Estimated Regression Coefficients for Prod

Term           Coef       SE Coef   T         P
Constant       9,77429    0,2728    35,831    0,000
T              0,04333    0,1614    0,269     0,791
Comp1          0,73000    0,1614    4,523     0,000
Comp2          0,47667    0,1614    2,954     0,007
T*T            -1,55054   0,1494    -10,377   0,000
Comp1*Comp1    0,18821    0,1494    1,260     0,220
Comp2*Comp2    0,08321    0,1494    0,557     0,583

S = 0,7906   R-Sq = 86,4%   R-Sq(adj) = 82,9%

» Neither Comp1*Comp1 nor Comp2*Comp2 is significant → OUT

4th Trial

Response Surface Regression: Prod versus T; Comp1; Comp2

The analysis was done using coded units.

Estimated Regression Coefficients for Prod

Term       Coef      SE Coef   T         P
Constant   10,0156   0,1854    54,017    0,000
T          0,0433    0,1606    0,270     0,789
Comp1      0,7300    0,1606    4,546     0,000
Comp2      0,4767    0,1606    2,969     0,007
T*T        -1,5807   0,1466    -10,784   0,000

S = 0,7866   R-Sq = 85,4%   R-Sq(adj) = 83,0%

» Model in coded units (T is included, though not significant, because T² is significant):

$$\text{Prod} = 10{,}016 + 0{,}043\,T + 0{,}73\,\text{Comp1} + 0{,}477\,\text{Comp2} - 1{,}581\,T^2$$

3) Contour and surface plots

» A factor involved in the model → include it as one of the axes.
» A factor with few values → repeat plots for its different levels.
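Before drawing the plots, the final model can be explored numerically. The sketch below is only an illustration in coded units: it assumes NumPy, assumes a grid over [-2, 2] (the axial distance α = 2 of this design), and uses hypothetical coded hold values -1, 0, 1 for Comp2, since the notes do not give the coded-to-natural conversion for these factors.

```python
import numpy as np

def prod(t, comp1, comp2):
    """4th-trial fitted model for Prod, all factors in coded units."""
    return 10.016 + 0.043 * t + 0.73 * comp1 + 0.477 * comp2 - 1.581 * t**2

# The model is quadratic only in T, so the best coded T solves dProd/dT = 0.
t_best = 0.043 / (2 * 1.581)

# Grid over the assumed coded region; Comp1 and Comp2 enter linearly with
# positive coefficients, so Prod keeps growing toward their upper limits.
t_grid = np.linspace(-2, 2, 81)
c1_grid = np.linspace(-2, 2, 81)
T, C1 = np.meshgrid(t_grid, c1_grid)

for comp2 in (-1.0, 0.0, 1.0):  # hypothetical coded hold values
    Z = prod(T, C1, comp2)
    i, j = np.unravel_index(np.argmax(Z), Z.shape)
    print(comp2, T[i, j], C1[i, j], round(Z[i, j], 3))
```

The array `Z` is what a contour or surface routine would draw for each hold value, which mirrors the advice to repeat the plot for each level of a factor with few values.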
» T versus Comp1

[Figure: surface plot and contour plot of Prod vs Comp1 and T, with Comp2 held at its low level (Comp2 = 4). Point flagged on the contour plot: T = 184,378; Comp1 = 69,6586; Prod = 10,9694.]

[Figure: surface plot and contour plot of Prod vs Comp1 and T, with Comp2 held at its middle level (Comp2 = 6). Point flagged on the contour plot: T = 184,891; Comp1 = 69,3163; Prod = 11,4252.]

[Figure: surface plot and contour plot of Prod vs Comp1 and T, with Comp2 held at its high level (Comp2 = 8). Point flagged on the contour plot: T = 185,533; Comp1 = 69,9153; Prod = 11,9456.]

References

Box, Hunter and Hunter (1978), pp. 510-539.
Montgomery (1991), pp. 521-563.
Prat et al. (1994), pp. 301-332.

See also:
Grima, P., Marco, Ll., Tort-Martorell, J. (2004) Estadística Práctica con Minitab. Madrid: Pearson Prentice Hall.
NIST/SEMATECH e-Handbook of Statistical Methods (2006) [http://www.itl.nist.gov/div898/handbook/]