Electronic Supplementary Information
1. Introduction to the response surface method and details of the procedure for selecting statistically significant variables in the model
The response surface method (RSM) examines the relationship between one or more response variables and a set of quantitative experimental variables or factors. It is typically employed after a vital few controllable factors have been identified, in order to optimize the response. Designs of this type are usually chosen when curvature in the response surface is suspected. RSM can be used to find factor settings (operating conditions) that produce the best response while satisfying operating or process specifications. It also helps to identify new operating conditions that yield a demonstrated improvement in product quality over that achieved under the current conditions, and to model the relationship between the quantitative factors and the response.
Using RSM requires deciding which design is most appropriate for the experiment. Choosing the design correctly ensures that the response surface is fitted in the most efficient manner. There are two main types of response surface experimental design: the central composite design and the Box-Behnken design. The latter, which is used in the present research, has treatment combinations at the midpoints of the edges of the experimental space and requires at least three factors. Because a Box-Behnken design often has fewer design points, it can be less expensive to run than a central composite design with the same number of factors. This design allows efficient estimation of the first- and second-order coefficients. Under this design, a quadratic regression model is used to develop second-order response surface models, as shown in Eq. 1:
R = b0 + Σ bi xi + Σ bii xi2 + Σ bij xi xj        (1)
where R denotes the predicted response of the process, xi the coded levels of the factors (independent or control variables), and b0, bi, bii and bij the regression coefficients. The equation consists of four kinds of term: one constant term, linear terms, squared terms and two-way interaction terms. For three independent variables, the equation expands to Eq. 2:
R = b0 + b1 x1 + b2 x2 + b3 x3 + b11 x12 + b22 x22 + b33 x32 + b12 x1 x2 + b13 x1 x3 + b23 x2 x3        (2)
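As an illustration only (not part of the original analysis), the full three-factor quadratic model above can be fitted by ordinary least squares; the design points and data below are hypothetical, laid out as the edge midpoints and center points of a three-factor Box-Behnken design.

```python
import numpy as np

def quadratic_design_matrix(x1, x2, x3):
    """Columns: 1, x1, x2, x3, x1^2, x2^2, x3^2, x1x2, x1x3, x2x3 (Eq. 2)."""
    return np.column_stack([
        np.ones_like(x1), x1, x2, x3,
        x1**2, x2**2, x3**2,
        x1*x2, x1*x3, x2*x3,
    ])

def fit_quadratic(x1, x2, x3, y):
    """Least-squares estimates b0, b1, b2, b3, b11, b22, b33, b12, b13, b23."""
    X = quadratic_design_matrix(x1, x2, x3)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b
```

With the 15 runs of a three-factor Box-Behnken design (12 edge midpoints plus 3 center replicates), all ten coefficients of Eq. 2 are estimable.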
Solving the above quadratic model requires at least ten experimental runs, and further experimental data are necessary to establish the validity of the fitted model. After the experimental runs have been performed and the coefficients of the model determined, the fitted model may contain insignificant coefficients for terms that do not influence the response variation; these can be removed. Removing insignificant coefficients reduces the model, simplifies the regression, and makes the model easier to work with while maintaining its accuracy. Indeed, insignificant coefficients left in the model may impair its predictive ability. A model can be reduced manually, or automatically using an algorithmic procedure such as stepwise regression, which is offered by experimental-design software packages such as Minitab® 15 (Minitab Inc.). This software performs Student's t-test and calculates a p-value for every coefficient. The p-value is a suitable criterion for deciding which coefficients are statistically insignificant and should be removed from the model. It measures the strength of the evidence for rejecting the null hypothesis in a hypothesis test, a procedure that evaluates two mutually exclusive statements about a population and uses sample data to determine which statement is better supported; the two statements are called the null hypothesis and the alternative hypothesis. To establish the significance of a coefficient, the null hypothesis stating that the coefficient is zero (i.e. not significant) must be rejected. For this purpose, the p-value must be less than or equal to the significance level, or α-level. Alpha (α) is the maximum acceptable risk of rejecting a true null hypothesis and is expressed as a probability between 0 and 1. If the p-value is less than or equal to the α-level, the null hypothesis is rejected in favor of the alternative, i.e. the coefficient is significant; if the p-value is greater than the α-level, the null hypothesis cannot be rejected. The most commonly used α-level is 0.05; at this level, the chance of declaring an effect that does not really exist is only 5%. Therefore, when a p-value exceeds the α-level, the model should be reduced by removing the corresponding insignificant coefficient. After the first insignificant coefficient is removed, the regression is repeated, removing one insignificant coefficient at a time, until only statistically significant coefficients remain. Note that a linear term with an insignificant coefficient cannot be removed from the model unless all second-order and two-way interaction terms involving that factor have already been removed.
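The backward-elimination loop described above can be sketched as follows. This is a minimal illustration, not Minitab's exact algorithm; the hierarchy rule for linear terms is omitted for brevity, and the term names are arbitrary labels.

```python
import numpy as np
from scipy import stats

def ols_pvalues(X, y):
    """OLS coefficients and two-sided t-test p-values for each coefficient."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    mse = resid @ resid / (n - p)              # residual mean square
    cov = mse * np.linalg.inv(X.T @ X)         # covariance of the estimates
    t = b / np.sqrt(np.diag(cov))
    pvals = 2 * stats.t.sf(np.abs(t), df=n - p)
    return b, pvals

def backward_eliminate(X, names, y, alpha=0.05):
    """Drop the least significant term (p > alpha) one at a time.
    The intercept column, named '1', is always kept."""
    X, names = X.copy(), list(names)
    while True:
        b, pvals = ols_pvalues(X, y)
        worst, worst_p = None, alpha
        for j, name in enumerate(names):
            if name != "1" and pvals[j] > worst_p:
                worst, worst_p = j, pvals[j]
        if worst is None:          # every remaining term is significant
            return names, b
        X = np.delete(X, worst, axis=1)
        del names[worst]
```

Tracking the adjusted R2 at each step, as recommended below, would only require one extra line per iteration.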
On the other hand, the percentage of response variation explained by the model can be adjusted for the number of terms it contains. This adjustment is important because the coefficient of determination (R2) always increases when a new term is added to a model: a model with more terms may appear to fit better simply because it has more terms. The adjusted R2, by contrast, increases only if the new term improves the model more than would be expected by chance. Therefore, if the adjusted R2 decreases when the model is reduced, it is recommended that the coefficient in question not be removed. It is also necessary to evaluate the goodness-of-fit of the reduced model. Performing additional runs beyond the minimum required to fit the model permits establishing the validity of the fitted model. In this situation there are two groups of experimental data: the first is used to fit the regression model, and the second is used to evaluate the residual error of the fitted model. The sum of squares (SS) is the sum of squared deviations: the regression SS is the portion of the variation explained by the fitted model, while the residual-error SS is the portion not explained by the fitted model and is attributed to error. The degrees of freedom (DF) indicate the number of independent pieces of information in the response data needed to calculate the mean square (MS). If n is the number of experimental data points and p the number of terms in the fitted model, the regression DF = p - 1 and the residual-error DF = n - p. The MS is calculated by dividing SS by DF, so two MS values can be defined, one for the regression and one for the residual error.
The adequacy (significance) of the model is determined by performing a Fisher test, which yields an F-value. The regression F-value is calculated by dividing the MS for regression by the MS for residual error. Larger F-values support rejecting the null hypothesis that there is no significant model. If the F-value is greater than the tabulated F(DF1, DF2) value, the model is adequate for explaining the response variation; in this condition the p-value is less than the α-level (i.e. 0.05). The F(DF1, DF2) value is obtained from F-distribution tables, in which DF1 and DF2 are the degrees of freedom of the numerator and denominator, respectively: the rows of the tables give the denominator degrees of freedom and the columns the numerator degrees of freedom. Here, DF1 and DF2 are the degrees of freedom of the regression model and of the related residual error, respectively. In addition to the regression F-value, the F-value for the lack-of-fit is a criterion for evaluating the adequacy (significance) of the fitted model. If n is the number of replicated experimental data points and m the number of experimental data points for the residual error, the pure-error DF = n - 1 and the lack-of-fit DF = m - n - 1. Two MS values can therefore be defined, one for the lack-of-fit and one for the pure error (noise). The F-value for the lack-of-fit is calculated by dividing the MS for lack-of-fit by the MS for pure error. If this F-value is less than the tabulated F(DF1, DF2) value, the model shows an adequate goodness-of-fit; equivalently, at a significance level of 0.05 the corresponding p-value is larger than 0.05. Again, the F(DF1, DF2) value is obtained from F-distribution tables; here, DF1 and DF2 are the degrees of freedom of the lack-of-fit and of the pure error, respectively.
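The SS/DF/MS bookkeeping and the two F-tests described above can be sketched as follows. This is an illustrative implementation, not the Minitab output; it assumes replicate runs carry a group label so that pure error can be separated from lack-of-fit.

```python
import numpy as np
from scipy import stats

def anova_f_tests(X, y, groups, alpha=0.05):
    """Regression and lack-of-fit F-tests for an OLS fit.
    groups[i] labels replicate runs that share one factor setting."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ b
    ss_res = np.sum((y - fitted) ** 2)           # residual-error SS
    ss_reg = np.sum((fitted - y.mean()) ** 2)    # regression SS
    # pure error: scatter of replicates about their own group means
    ss_pe = sum(np.sum((y[groups == g] - y[groups == g].mean()) ** 2)
                for g in np.unique(groups))
    df_reg, df_res = p - 1, n - p
    df_pe = n - len(np.unique(groups))
    df_lof = df_res - df_pe
    f_reg = (ss_reg / df_reg) / (ss_res / df_res)          # MS_reg / MS_res
    f_lof = ((ss_res - ss_pe) / df_lof) / (ss_pe / df_pe)  # MS_lof / MS_pe
    return {
        "F_reg": f_reg, "F_reg_crit": stats.f.ppf(1 - alpha, df_reg, df_res),
        "p_reg": stats.f.sf(f_reg, df_reg, df_res),
        "F_lof": f_lof, "F_lof_crit": stats.f.ppf(1 - alpha, df_lof, df_pe),
        "p_lof": stats.f.sf(f_lof, df_lof, df_pe),
    }
```

`stats.f.ppf` replaces the printed F-distribution tables: for example, `stats.f.ppf(0.95, 7, 7)` returns the tabulated 5% critical value for 7 and 7 degrees of freedom.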
It is important to note that if the model shows an adequate goodness-of-fit, the residuals should be normally distributed. This can be checked with a normality test based on the Anderson-Darling statistic, calculated by the software. The Anderson-Darling statistic measures how well the residuals follow a particular distribution, e.g. a normal distribution: the better the distribution fits the data, the smaller the statistic. In practice, the test compares the empirical cumulative distribution function of the residuals with the distribution expected if the data were normal; if the observed difference is sufficiently large, the test rejects the null hypothesis of population normality. The hypotheses for the Anderson-Darling test are that the residuals follow a normal distribution (null hypothesis) and that they do not (alternative hypothesis). If the p-value for the Anderson-Darling test is lower than the chosen significance level (usually 0.05), it is concluded that the data do not follow the normal distribution. The normality test therefore generates a normal probability plot and performs a hypothesis test to examine whether or not the residuals follow a normal distribution. Along with the probability plot of the residuals, the test displays a table with the distribution parameter estimates, the Anderson-Darling statistic and the related p-value, allowing the fit of the distribution to the data to be evaluated.1,2
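The Anderson-Darling check on residuals can be sketched with SciPy. Note that, unlike Minitab, `scipy.stats.anderson` reports the statistic together with critical values at fixed significance levels rather than a p-value, so the comparison here is made against the tabulated 5% threshold.

```python
import numpy as np
from scipy import stats

def residuals_look_normal(residuals, level=5.0):
    """Anderson-Darling check: True if the residuals are consistent with
    a normal distribution at the given significance level (in percent)."""
    result = stats.anderson(np.asarray(residuals), dist="norm")
    crit = result.critical_values[list(result.significance_level).index(level)]
    # the better the normal fit, the smaller the statistic
    return result.statistic < crit
```

For a p-value in the Minitab sense, a dedicated implementation such as `statsmodels.stats.diagnostic.normal_ad` could be substituted.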
2. Estimated regression models for nickel(II) α-diimine complex 1 in the presence of MAO and EASC
The performance of the nickel(II) α-diimine complex 1 was evaluated in ethylene polymerization using MAO and EASC as cocatalysts.
Stepwise regression with backward elimination was applied in order to exclude insignificant coefficients from Eq. 1. The coefficients and their p-values are tabulated in Tables 1 and 2. In this way, three partial quadratic models with adequate goodness-of-fit were obtained for the responses of activity (RA), weight-average molecular weight (RMw) and crystallinity (%) (RXtl) for each of the cocatalysts. The models for nickel(II) α-diimine complex 1 activated with MAO are shown in Eq. (2) to (4).
𝑅A(MAO) = 513.33 − 76.38 𝑥T + 55.75 𝑥P + 32.62𝑥C + 394.46 𝑥T2 − 54.29 𝑥P2 + 26.96𝑥C2 + 47.50𝑥P 𝑥C
(2)
𝑅Mw(MAO) = 289366 − 450882𝑥T + 63726𝑥P + 13926𝑥C + 282697𝑥T2 + 134399𝑥P2 + 109958𝑥C2
(3)
𝑅xtl(MAO) = 9.62 − 12.90𝑥T + 0.89𝑥P − 0.64𝑥C + 6.54𝑥T2 − 1.54𝑥P2 +7.53𝑥C2 + 1.91𝑥T 𝑥C
(4)
Also the models for EASC activated complex 1 are shown in Eq. (5) to (7).
𝑅A(EASC) = 1792.2 −386.7 𝑥T + 302.2 𝑥P + 370.3𝑥C + 771.7𝑥T2 − 367.3 𝑥P2 + 299.4𝑥C2 − 318.3𝑥T 𝑥C
(5)
𝑅Mw(EASC) = 321780 − 506173𝑥T + 36792𝑥P + 69194𝑥C + 404803𝑥T2 − 173854𝑥T 𝑥C
(6)
𝑅xtl(EASC) = 9.722 − 16.46𝑥T + 1.03𝑥P + 2.47𝑥C + 9.67𝑥T2 + 4.0𝑥P2 − 3.64𝑥T 𝑥C
(7)
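For illustration only, a fitted model such as Eq. 2 (activity of the 1/MAO system, shown above) can be coded directly and evaluated at coded factor levels; xT, xP and xC are the coded levels of temperature, ethylene pressure and cocatalyst concentration in [-1, 1], and at the center point the prediction reduces to the constant term.

```python
def activity_mao(xT, xP, xC):
    """Predicted activity of complex 1/MAO (Eq. 2), coded factor levels."""
    return (513.33 - 76.38 * xT + 55.75 * xP + 32.62 * xC
            + 394.46 * xT**2 - 54.29 * xP**2 + 26.96 * xC**2
            + 47.50 * xP * xC)

# center-point prediction equals the constant term, 513.33
center = activity_mao(0.0, 0.0, 0.0)
```

The other five fitted models (Eq. 3 to 7) can be coded the same way to map each response over the design space.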
TABLE 1. Estimated regression coefficients for various regression models in catalyst 1/MAO system

Terms of        Activity                    Mw                          Crystallinity
the models      Coefficient   p-value       Coefficient   p-value       Coefficient   p-value
xT              -76.38        0.017         -450882       0.000         -12.90        0.000
xP              55.75         0.057         63726         0.057         0.89          0.154
xC              32.62         0.225         13926         0.640         -0.64         0.292
xT2             394.46        0.000         282697        0.000         6.54          0.000
xP2             -54.29        0.176         134399        0.013         -1.54         0.101
xC2             26.96         0.479         109958        0.031         -             -
xTxP            -             -             -             -             -             -
xTxC            -             -             -             -             1.91          0.045
xPxC            47.50         0.213         -             -             -             -
TABLE 2. Estimated regression coefficients for various regression models in catalyst 1/EASC system

Terms of        Activity                    Mw                          Crystallinity
the models      Coefficient   p-value       Coefficient   p-value       Coefficient   p-value
xT              -386.7        0.010         -506173       0.000         -16.46        0.000
xP              302.2         0.029         36792         0.481         1.03          0.189
xC              370.3         0.012         69194         0.200         2.47          0.009
xT2             771.7         0.002         404803        0.000         9.67          0.000
xP2             -367.3        0.058         -             -             4.00          0.005
xC2             299.4         0.107         -             -             -             -
xTxP            -             -             -             -             -             -
xTxC            -318.3        0.080         -173854       0.036         -3.64         0.007
xPxC            -             -             -             -             -             -
Tables 3 and 4 show the ANOVA results at the 95% confidence level for complex 1 in the presence of MAO and EASC, respectively.
In the activity regression model for MAO (Eq. 2), there are two significant coefficients, those of xT2 and xT, because their p-values are less than 0.05 (Table 1). Although the p-values of the other coefficients are greater than 0.05, removing them caused a reduction in the adjusted R2 value (90.72%); it is therefore better to retain these coefficients in the model. The significance of a coefficient depends on its p-value: the smaller the p-value, the more significant the corresponding coefficient.40 Thus, the squared temperature term (xT2) has the most significant coefficient in this model. The suggested model for the activity variation was adequate because not only was the F-value for the regression model greater than Ftab(7, 7) (i.e. 20.55 > 3.78), but the F-value for the lack-of-fit was also less than Ftab(5, 2) (i.e. 10.21 < 19.29) (Table 3). Also, according to Fig 1(a), the residuals are normally distributed, because the p-value for normal distribution of the residuals is 0.520 (i.e. greater than 0.05). This supports the adequacy of the fitted activity model.
In the activity regression model for EASC (Eq. 5), there are four significant coefficients, those of xT2, xT, xC and xP (Table 2). Because of the presence of the significant xT and xC coefficients in the fitted model, it is not possible to remove the insignificant coefficient of the temperature-CC interaction (xTxC). As in the activity model for MAO, xT2 has the most significant coefficient in this model. The suggested model for the activity variation was adequate because not only was the F-value for the regression model greater than Ftab(7, 7) (i.e. 9.70 > 3.78), but the F-value for the lack-of-fit was also less than Ftab(5, 2) (i.e. 3.14 < 19.29) (Table 4). Also, Fig 2(a) shows that the residuals are normally distributed.
In the Mw regression model for MAO (Eq. 3), there are four significant coefficients with p-values less than 0.05 (Table 1); xT and xT2 are the most significant. The fitted model was adequate because the F-value for the regression model was greater than Ftab(6, 8) (i.e. 51.55 > 3.58) and the F-value for the lack-of-fit was less than Ftab(6, 2) (i.e. 18.58 < 19.32) (Table 3). Fig 1(b) indicates that the p-value for normal distribution of the residuals is 0.130, so the model has an adequate goodness-of-fit.
In the Mw regression model for EASC (Eq. 6), there are three significant coefficients, those of xT2, xT and xTxC (Table 2). Removing the xP-related coefficient caused a decrease in the adjusted R2 value (97.84%), so it was retained. The most significant coefficients in this model are the same as in the Mw model for MAO. The fitted model was adequate because the F-value for the regression model was greater than Ftab(5, 9) (i.e. 28.29 > 3.48) and the F-value for the lack-of-fit was less than Ftab(7, 2) (i.e. 1.57 < 19.35) (Table 4). According to Fig 2(b), the model also has an adequate goodness-of-fit.
In the crystallinity regression model for MAO (Eq. 4), the significant coefficients are those of xT, xT2 and xTxC (Table 1). Some coefficients with p-values higher than 0.05 were retained in the fitted model because the adjusted R2 value (97.67%) decreased after their removal. Because the F-value for the regression model was greater than Ftab(6, 8) (i.e. 98.57 > 3.58) and the F-value for the lack-of-fit was less than Ftab(6, 2) (i.e. 0.10 < 19.32), the fitted model was adequate (Table 3). Also, according to Fig 1(c), the model has an adequate goodness-of-fit because the p-value is 0.061.
In the crystallinity regression model for EASC (Eq. 7), all coefficients are significant except that of xP (Table 2). It is not possible to remove the xP-related coefficient because of the presence of the significant squared ethylene pressure (xP2) coefficient in the fitted model. The F-value for the regression model was greater than Ftab(9, 5) (i.e. 46.68 > 4.77) and the F-value for the lack-of-fit was less than Ftab(3, 2) (i.e. 7.99 < 19.16), so the fitted model was adequate (Table 4). The model also has an adequate goodness-of-fit because its p-value is 0.875 (Fig 2(c)). The most significant coefficients of the crystallinity regression models for MAO and EASC are the same, namely those of xT and xT2. This shows that the cocatalyst structure has no influence on the most significant coefficients of the fitted models for activity, Mw and crystallinity.
TABLE 3. ANOVA table for the regression models in catalyst 1/MAO system

                 Activity                             Mw                                   Crystallinity
Source           DF   F       F tab.    p-value      DF   F       F tab.    p-value      DF   F       F tab.    p-value
Regression       7    20.55   3.787     0.000        6    51.55   3.580     0.000        6    98.57   3.580     0.000
Residual error   7                                   8                                   8
Lack-of-fit      5    10.21   19.296    0.092        6    18.58   19.329    0.052        6    0.10    19.32     0.988
Pure error       2                                   2                                   2
Total            14                                  14                                  14
TABLE 4. ANOVA table for the regression models in catalyst 1/EASC system

                 Activity                             Mw                                   Crystallinity
Source           DF   F       F tab.    p-value      DF   F       F tab.    p-value      DF   F       F tab.    p-value
Regression       7    9.70    3.787     0.004        5    28.29   3.481     0.000        9    46.68   4.77      0.000
Residual error   7                                   9                                   5
Lack-of-fit      5    3.14    19.296    0.259        7    1.57    19.353    0.444        3    7.99    19.16     0.113
Pure error       2                                   2                                   2
Total            14                                  14                                  14
Fig 1 Probability plots of residuals for the regression models in the complex 1/MAO system: (a) activity, (b) Mw, (c) crystallinity
Fig 2 Probability plots of residuals for the regression models in the complex 1/EASC system: (a) activity, (b) Mw, (c) crystallinity
References
1. Minitab® Release 15. Design of Experiments, User’s Manual. Minitab Inc.; 2007.
2. Asiaban, S.; Moradian, S. Dyes Pigm. 2011, 92, 642.