

Generalized pointwise bias error bounds for response surface approximations

Tushar Goel, Raphael T. Haftka†, Melih Papila‡, Wei Shyy¥

Department of Mechanical and Aerospace Engineering, University of Florida, Gainesville, FL 32611

Abstract

Surrogate models such as response surface approximations are commonly employed for design optimization and sensitivity evaluations. With growing reliance on first-principle based, often expensive computations to support the construction of response surfaces, cost considerations can prevent one from generating enough data to ascertain their accuracy. Since response surfaces usually employ low-order polynomials, bias (modeling) errors may be substantial. This paper proposes a generalized pointwise bias error bound estimation method for polynomial-based response surface approximations. The method is demonstrated with the help of an analytical example where the model is a quadratic polynomial while the true function is assumed to be a cubic polynomial. A relaxation parameter is introduced to account for inconsistencies between the data and the assumed true model. The effects of noise in the data and of variation in the relaxation parameter are studied. It is demonstrated that when bias errors dominate, the bias error bounds characterize the actual error field better than the prediction variance. The results also help identify regions in the design space where the accuracy of the response surface approximations is inadequate. Based on such information, local improvements in the maximum bias error bound predictions were accomplished with the aid of selectively generated new data.

I. Nomenclature

$\mathbf{b}$ : Vector of estimated coefficients of basis functions
$A$ : Alias matrix
$b_j$ : Estimated coefficients of basis functions
$\mathbf{e}$ : Vector of true errors at the data points
$e_b(\mathbf{x})$ : True bias error at the design point $\mathbf{x}$
$e_{es}(\mathbf{x})$ : Estimated standard error at the design point $\mathbf{x}$
$\mathbf{F}(\mathbf{x})$, $\mathbf{F}^{(1)}(\mathbf{x})$, $\mathbf{F}^{(2)}(\mathbf{x})$ : Vectors of basis functions at $\mathbf{x}$
$f_j(\mathbf{x})$ : Basis function
$N$ : Number of data points
$n_1$ : Number of basis functions in the regression model
$n_2$ : Number of missing basis functions in the regression model
$\mathbf{x}$ : Design point
$X$, $X^{(1)}$, $X^{(2)}$ : Gramian (design) matrices
$x_1, x_2, \ldots, x_n$ : Design variables vectors
$x_1^{(i)}, x_2^{(i)}, \ldots, x_n^{(i)}$ : Design variables
$\mathbf{y}$ : Vector of observed responses
$\hat{y}(\mathbf{x})$ : Prediction of the response surface approximation

Graduate Student, Student Member AIAA

† Distinguished Professor, Fellow AIAA

‡ Post Doctoral Research Associate, Member AIAA

¥ Distinguished Professor and Department Chair, Fellow AIAA

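$\boldsymbol{\beta}$, $\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$ : Vectors of basis function coefficients
$\beta_j$, $\beta_j^{(1)}$, $\beta_j^{(2)}$ : Coefficients of basis functions
$\eta(\mathbf{x})$ : True mean response at $\mathbf{x}$
$\sigma^2$ : Noise variance
$\varepsilon_r$ : Minimum relaxation
$\varepsilon$ : Relaxation parameter
$k$ : Degree of relaxation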

II. Introduction and Literature review

Response surface approximations (RSAs) are widely accepted for solving optimization problems with high computational or experimental cost. RSAs offer a computationally less expensive way of evaluating designs.

Polynomial response surfaces are the most popular among all the response surface approximation techniques. There are numerous applications of RSAs to practical design and optimization problems in different areas. A few examples are given as follows. Kaufman et al. (1996) [1] fitted quadratic polynomial RSAs to estimate the structural weight of a high speed civil transport (HSCT), obtained with multiple structural optimizations via the commercial code GENESIS. Balabanov et al. [2, 3] investigated RSA construction for the wing bending material weight of the HSCT based on structural optimization results of a number of different configurations, and used RSAs in the configuration optimization problem. Papila and Haftka [4, 5] constructed RSAs for weight equations based on structural optimizations. Madsen et al. [6], Vaidyanathan et al. [7, 8], Papila et al. [9, 10] and Shyy et al. [11, 12] used polynomial-based RSAs as design evaluators for the optimization of propulsion components, including a turbulent flow diffuser, a supersonic turbine, a swirl coaxial injector element and liquid rocket injector designs. Redhe et al. [13] determined an efficient number of data points when using the RSA methodology in crashworthiness problems where numerical simulations were carried out using LS-DYNA. Goel et al. [14] used polynomial-based RSAs to approximate the Pareto optimal front in multi-objective optimization of a liquid rocket injector design.

It is important to ensure the accuracy of RSAs before using them for design and optimization. The accuracy of an RSA is primarily affected by three factors: (i) limitations on the number of experiments, (ii) noise in the data, and (iii) inadequacy of the fitting model (bias error).

Limitations on the number of experiments or simulations arise when their cost is prohibitive. Design of experiments is used to minimize the number of experiments. In experiments, noise may appear due to measurement errors and other experimental errors. Noise in computer simulations is numerical noise, which is small in most simulations. Quite often the true model representing the data is unknown, and a simple model is fitted to the data. Thus for simulation-based RSAs, modeling error (also called bias error) is mainly responsible for the error in prediction. Prediction variance is an established tool to characterize noise-related prediction errors [15, 16]. It is often inappropriately employed to estimate bias errors. A good amount of work has been done on design of experiments for minimizing the mean squared error averaged over the design domain, combining noise and bias errors [17–20]. The bias component of the averaged or integrated mean squared error can also be minimized to obtain so-called minimum-bias designs. The fundamentals of minimizing integrated mean squared error and its components can be found in Myers and Montgomery [15] and Khuri and Cornell [16]. Bias error averaged over the domain has been studied extensively, but relatively little work has been done to account for point-to-point variation of bias errors. Studying point-to-point bias error can help in two ways:

1. Identification of regions of large errors

2. Selection of design of experiments: maximum bias error can be minimized rather than the average error

An approach for estimating pointwise bounds on bias errors in RSAs was presented by Papila and Haftka [21]. The traditional decomposition of the mean squared error into variance and the square of the bias was followed, but point-to-point rather than averaged over the domain. The bounds do not depend on the data and can be used to identify the regions in space where the response surface approximations are poor. Papila et al. [22] applied this approach to obtain designs of experiments that minimize the maximum bias error. Once the data is available, the point-to-point bias error bounds can be tightened based on the data. Their method assumes that the true model is a higher degree polynomial than the approximating polynomial and that it satisfies the given data exactly. Then, using the given data, the bounds on the bias errors are estimated.


In most problems the true model is not known. Inconsistencies between the data and the assumed true model may arise due to two causes: (i) there may be random errors (noise) in the data and (ii) the assumed “true” model may not be exact. When the matrix of the equations used in polynomial regression has full rank, these inconsistencies manifest themselves in inaccuracies of the bounds on the bias error. However, when, as commonly happens, this matrix is rank deficient, the method proposed by Papila et al. [22] fails to produce bias error estimates. One objective of the present paper is to address this shortcoming.

In the current work, a generalized pointwise bias error bound estimation approach is presented to address discrepancies between the assumed true model and the data. The proposed method allows the assumed true model to satisfy the design data within the limit of a small relaxation parameter $\varepsilon$. The bounds can be applied to identify the zones where the response surface approximation is inadequate, and a few more simulations (if feasible) can be conducted there to improve the response surface approximation. The proposed method is demonstrated with the help of an analytical example problem with various data inconsistencies.

The paper is organized as follows. The next section presents the theoretical development of the generalized pointwise bias error bound estimation method. Subsequently a two-dimensional polynomial test problem is presented. Different variations of this analytical example are considered. The results demonstrating the efficiency of the generalized pointwise bias error bound estimation method on different problems are presented next. The proposed method is also compared with the prediction variance based method. The last section recapitulates the major conclusions.

III. Theoretical Model for Estimating Bias Error Bounds

A polynomial response surface can be developed for a given set of data using linear regression. Prediction variance is often used to estimate the errors in the response surface approximations. The basics of the response surface approximations and the prediction variance are summarized in Appendix A. A method to estimate the bias error bounds in the response surface approximations when there was no noise error was developed by Papila et al. [22]. In this section, the generalized pointwise bias error bound estimation method is discussed in detail.

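Let $\mathbf{F}^{(1)}(\mathbf{x})$ be the vector of basis functions used for linear regression, and let $\mathbf{F}^{(2)}(\mathbf{x})$ be the additional vector of basis functions in the true model which are missing in the linear regression model. Let $\boldsymbol{\beta}^{(1)}$ and $\boldsymbol{\beta}^{(2)}$ be the actual coefficient vectors associated with the basis function vectors $\mathbf{F}^{(1)}(\mathbf{x})$ and $\mathbf{F}^{(2)}(\mathbf{x})$, respectively. For any design point $\mathbf{x}$ the true response is given as:

$$\eta(\mathbf{x}) = (\mathbf{F}^{(1)}(\mathbf{x}))^T \boldsymbol{\beta}^{(1)} + (\mathbf{F}^{(2)}(\mathbf{x}))^T \boldsymbol{\beta}^{(2)} \tag{1}$$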

Since the noise error is assumed to be zero, the true response $\mathbf{y}$ at the set of $N$ design points is given as:

$$X^{(1)} \boldsymbol{\beta}^{(1)} + X^{(2)} \boldsymbol{\beta}^{(2)} = \mathbf{y} \tag{2}$$

where $X$ is the Gramian design matrix, superscript 1 denotes that it is constructed using only the basis functions corresponding to $\mathbf{F}^{(1)}$, superscript 2 denotes that it is constructed using the missing basis functions corresponding to $\mathbf{F}^{(2)}$, and $\mathbf{y}$ is the vector of observed responses at the data points. A typical Gramian matrix is shown in Equation (3).

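$$X = \begin{bmatrix}
1 & x_1^{(1)} & x_2^{(1)} & (x_1^{(1)})^2 & x_1^{(1)} x_2^{(1)} & (x_2^{(1)})^2 & \cdots & (x_n^{(1)})^2 \\
1 & x_1^{(2)} & x_2^{(2)} & (x_1^{(2)})^2 & x_1^{(2)} x_2^{(2)} & (x_2^{(2)})^2 & \cdots & (x_n^{(2)})^2 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\
1 & x_1^{(i)} & x_2^{(i)} & (x_1^{(i)})^2 & x_1^{(i)} x_2^{(i)} & (x_2^{(i)})^2 & \cdots & (x_n^{(i)})^2 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\
1 & x_1^{(N)} & x_2^{(N)} & (x_1^{(N)})^2 & x_1^{(N)} x_2^{(N)} & (x_2^{(N)})^2 & \cdots & (x_n^{(N)})^2
\end{bmatrix} \tag{3}$$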


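For a design point $\mathbf{x}$, the predicted response (using the polynomial response surface) is given by

$$\hat{y}(\mathbf{x}) = (\mathbf{F}^{(1)}(\mathbf{x}))^T \mathbf{b} \tag{4}$$

where $\mathbf{b}$ is the coefficient vector associated with the approximating basis functions. For a set of $N$ design points $(\mathbf{x}_i, y_i)$, the coefficient vector $\mathbf{b}$ can be evaluated as follows:

$$\mathbf{b} = (X^{(1)T} X^{(1)})^{-1} X^{(1)T} \mathbf{y} \tag{5}$$

Substituting for $\mathbf{y}$ from Equation (2) in Equation (5),

$$\mathbf{b} = (X^{(1)T} X^{(1)})^{-1} X^{(1)T} [X^{(1)} \boldsymbol{\beta}^{(1)} + X^{(2)} \boldsymbol{\beta}^{(2)}] \tag{6}$$

which can be rearranged as

$$\mathbf{b} = \boldsymbol{\beta}^{(1)} + A \boldsymbol{\beta}^{(2)}, \quad \text{where } A = (X^{(1)T} X^{(1)})^{-1} X^{(1)T} X^{(2)} \tag{7}$$

$A$ is called the alias matrix. From Equation (7),

$$\boldsymbol{\beta}^{(1)} = \mathbf{b} - A \boldsymbol{\beta}^{(2)} \tag{8}$$

The error at the $N$ design data points is given as

$$\mathbf{e} = \mathbf{y} - X^{(1)} \mathbf{b} \tag{9}$$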

Substituting for $\mathbf{y}$ from Equation (2), using Equation (7), and rearranging terms gives

$$\mathbf{e} = [X^{(2)} - X^{(1)} A] \boldsymbol{\beta}^{(2)} \tag{10}$$

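Thus the error at the $N$ design points is a function of the coefficient vector $\boldsymbol{\beta}^{(2)}$ only. The error at a general point $\mathbf{x}$ is obtained using Equations (1) and (4) as follows:

$$e_b(\mathbf{x}) = \eta(\mathbf{x}) - \hat{y}(\mathbf{x}) = (\mathbf{F}^{(1)}(\mathbf{x}))^T \boldsymbol{\beta}^{(1)} + (\mathbf{F}^{(2)}(\mathbf{x}))^T \boldsymbol{\beta}^{(2)} - (\mathbf{F}^{(1)}(\mathbf{x}))^T \mathbf{b} \tag{11}$$

Using Equation (7), the error can be given as

$$e_b(\mathbf{x}) = [\mathbf{F}^{(2)}(\mathbf{x}) - A^T \mathbf{F}^{(1)}(\mathbf{x})]^T \boldsymbol{\beta}^{(2)} \tag{12}$$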
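Equations (5), (7) and (12) can be checked numerically on a small example. The sketch below is a hypothetical one-variable illustration, not the paper's test problem: a straight line ($\mathbf{F}^{(1)} = [1, x]$) is fitted to noise-free data from a quadratic ($\mathbf{F}^{(2)} = [x^2]$). The design points, coefficient values and helper names are assumptions chosen only for this illustration.

```python
# Hypothetical 1-D illustration (not from the paper): fit a straight line
# to noise-free data from a quadratic, then check Equations (7) and (12).

def solve_2x2(m, rhs):
    """Solve the 2x2 linear system m z = rhs by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det,
        (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det,
    ]

def normal_solve(x_pts, rhs):
    """Return (X1^T X1)^-1 X1^T rhs for the basis F1(x) = [1, x]."""
    n = len(x_pts)
    xtx = [[n, sum(x_pts)], [sum(x_pts), sum(x * x for x in x_pts)]]
    xtr = [sum(rhs), sum(x * r for x, r in zip(x_pts, rhs))]
    return solve_2x2(xtx, xtr)

x_pts = [0.0, 0.5, 1.0]   # assumed design points
beta1 = [1.0, 2.0]        # assumed true coefficients of F1 = [1, x]
beta2 = 0.2               # assumed true coefficient of F2 = [x^2]

# Noise-free data from the true model, Equation (2)
y = [beta1[0] + beta1[1] * x + beta2 * x * x for x in x_pts]

b = normal_solve(x_pts, y)                        # Equation (5)
A = normal_solve(x_pts, [x * x for x in x_pts])   # alias matrix (single column)

def bias_error(x):
    """Pointwise bias error from Equation (12)."""
    return (x * x - (A[0] + A[1] * x)) * beta2

def direct_error(x):
    """True response minus prediction, Equation (11)."""
    eta = beta1[0] + beta1[1] * x + beta2 * x * x
    return eta - (b[0] + b[1] * x)
```

In this example the fitted coefficients equal $\boldsymbol{\beta}^{(1)} + A\boldsymbol{\beta}^{(2)}$, and `bias_error` agrees with `direct_error` at any point, which is the content of Equations (7) and (12).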

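To estimate the error bounds, the absolute bias error $|e_b(\mathbf{x})|$ has to be maximized over the possible $\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$ [22].

If there are no noise errors, the bias error bounds at any design point $\mathbf{x}$ can be computed by solving the following optimization problem:

$$\begin{aligned}
\underset{\boldsymbol{\beta}^{(1)},\, \boldsymbol{\beta}^{(2)}}{\text{Maximize}} \quad & |e_b(\mathbf{x})| \\
\text{Subject to:} \quad & X^{(1)} \boldsymbol{\beta}^{(1)} + X^{(2)} \boldsymbol{\beta}^{(2)} = \mathbf{y} \\
& \mathbf{c}_l^{(1)} \le \boldsymbol{\beta}^{(1)} \le \mathbf{c}_u^{(1)} \\
& \mathbf{c}_l^{(2)} \le \boldsymbol{\beta}^{(2)} \le \mathbf{c}_u^{(2)}
\end{aligned} \tag{13}$$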


where $\mathbf{c}_l$ and $\mathbf{c}_u$ are lower and upper bounds, respectively, on the true coefficients $\boldsymbol{\beta}^{(1)}$ and $\boldsymbol{\beta}^{(2)}$.

This optimization problem is a linear programming (LP) problem except that the objective function involves an absolute value. So the bias error bounds are computed by solving two LP problems, one for finding the minimum error $\min e_b(\mathbf{x})$ and the other for finding the maximum error $\max e_b(\mathbf{x})$. The estimate of the bias error bound at a given design point $\mathbf{x}$ is given by $\max(|\min e_b(\mathbf{x})|, |\max e_b(\mathbf{x})|)$. It is assumed that the data set is exactly satisfied by the assumed true model and that the Gramian design matrix $X$ (the matrix constructed by adjoining the matrices $X^{(1)}$ and $X^{(2)}$) is of full rank [22]. This optimization problem does not have a solution when the Gramian matrix $X$ is rank deficient and the data is inconsistent with the assumed true model.
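As a worked restatement of the two-LP split, Equation (11) can be written with a stacked coefficient vector. The stacked symbols $\mathbf{c}(\mathbf{x})$ and $\boldsymbol{\beta}$ and the names $e_b^{\min}$, $e_b^{\max}$ are introduced here only for illustration. Because $\mathbf{b}$ is fixed by the data, the objective is linear in $\boldsymbol{\beta}$:

```latex
\begin{aligned}
e_b(\mathbf{x}) &= \mathbf{c}(\mathbf{x})^T \boldsymbol{\beta}
                 - (\mathbf{F}^{(1)}(\mathbf{x}))^T \mathbf{b},
\qquad
\mathbf{c}(\mathbf{x}) =
\begin{bmatrix} \mathbf{F}^{(1)}(\mathbf{x}) \\ \mathbf{F}^{(2)}(\mathbf{x}) \end{bmatrix},
\quad
\boldsymbol{\beta} =
\begin{bmatrix} \boldsymbol{\beta}^{(1)} \\ \boldsymbol{\beta}^{(2)} \end{bmatrix} \\
e_b^{\min}(\mathbf{x}) &= \min_{\boldsymbol{\beta}} \; \mathbf{c}(\mathbf{x})^T \boldsymbol{\beta}
                 - (\mathbf{F}^{(1)}(\mathbf{x}))^T \mathbf{b} \\
e_b^{\max}(\mathbf{x}) &= \max_{\boldsymbol{\beta}} \; \mathbf{c}(\mathbf{x})^T \boldsymbol{\beta}
                 - (\mathbf{F}^{(1)}(\mathbf{x}))^T \mathbf{b} \\
\max_{\boldsymbol{\beta}} |e_b(\mathbf{x})| &=
  \max\left( |e_b^{\min}(\mathbf{x})|, \; |e_b^{\max}(\mathbf{x})| \right)
\end{aligned}
```

Both LPs are solved over the same feasible set, namely the constraints of Equation (13).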

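Such inconsistency may reflect noise or the fact that the assumed true model may not be exact. Then the polynomial representing the assumed true model should be allowed to pass within the limit of a small relaxation parameter $\varepsilon$ of the data. This can be implemented by relaxing the constraints in the LP optimization problem by $\varepsilon$.

Now the bias error bound at any design point $\mathbf{x}$ can be approximated by solving the following optimization problem:

$$\begin{aligned}
\underset{\boldsymbol{\beta}^{(1)},\, \boldsymbol{\beta}^{(2)}}{\text{Maximize}} \quad & |e_b(\mathbf{x})| \\
\text{Subject to:} \quad & \mathbf{y} - \mathbf{1}\varepsilon \le X^{(1)} \boldsymbol{\beta}^{(1)} + X^{(2)} \boldsymbol{\beta}^{(2)} \le \mathbf{y} + \mathbf{1}\varepsilon \\
& \mathbf{c}_l^{(1)} \le \boldsymbol{\beta}^{(1)} \le \mathbf{c}_u^{(1)} \\
& \mathbf{c}_l^{(2)} \le \boldsymbol{\beta}^{(2)} \le \mathbf{c}_u^{(2)}
\end{aligned} \tag{14}$$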

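This optimization largely depends on the value of $\varepsilon$. Too small a value of $\varepsilon$ may not relax the constraints enough to provide a feasible solution of the LP, and too large a value of $\varepsilon$ may negate the usefulness of the data.

Fortunately the data can be used to approximate the appropriate value of $\varepsilon$.

The minimum amount of relaxation $\varepsilon_r$ can be obtained from the requirement that there exists a solution to the preceding optimization problem. $\varepsilon_r$ is found by solving the following optimization problem:

$$\begin{aligned}
\underset{\boldsymbol{\beta}^{(1)},\, \boldsymbol{\beta}^{(2)}}{\text{Minimize}} \quad & \varepsilon_r \\
\text{Subject to:} \quad & \mathbf{y} - \mathbf{1}\varepsilon_r \le X^{(1)} \boldsymbol{\beta}^{(1)} + X^{(2)} \boldsymbol{\beta}^{(2)} \le \mathbf{y} + \mathbf{1}\varepsilon_r \\
& \varepsilon_r \ge 0 \\
& \mathbf{c}_l^{(1)} \le \boldsymbol{\beta}^{(1)} \le \mathbf{c}_u^{(1)} \\
& \mathbf{c}_l^{(2)} \le \boldsymbol{\beta}^{(2)} \le \mathbf{c}_u^{(2)}
\end{aligned} \tag{15}$$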
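For use with a generic LP solver, the minimum-relaxation problem can be rewritten in one-sided inequality form. This is only a rearrangement of the two-sided data constraint; the stacked matrix $X = [X^{(1)} \; X^{(2)}]$, the stacked vector $\boldsymbol{\beta}$ and the symbol $\varepsilon_r$ for the minimum relaxation are notational assumptions made for this sketch:

```latex
\begin{aligned}
\underset{\boldsymbol{\beta},\, \varepsilon_r}{\text{Minimize}} \quad & \varepsilon_r \\
\text{Subject to:} \quad
 & \phantom{-}X \boldsymbol{\beta} - \mathbf{1}\varepsilon_r \le \phantom{-}\mathbf{y} \\
 & -X \boldsymbol{\beta} - \mathbf{1}\varepsilon_r \le -\mathbf{y} \\
 & \varepsilon_r \ge 0, \qquad \mathbf{c}_l \le \boldsymbol{\beta} \le \mathbf{c}_u
\end{aligned}
```

Written this way, the relaxation is simply one more decision variable, and the problem stays linear in all unknowns.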

With $\varepsilon = \varepsilon_r$, it is possible to exactly satisfy the data and the choice of the polynomial coefficients is likely to be unique. To allow variation in the coefficients ($\beta_i$'s), the required relaxation $\varepsilon$ is increased as follows:

$$\varepsilon = k \varepsilon_r \tag{16}$$

where $k$ is the degree of relaxation. In this study $k = 1.5$ is chosen, unless mentioned otherwise.

With this relaxation parameter $\varepsilon$, the bias error bounds can be computed even when the assumed true model does not exactly satisfy the data and the Gramian matrix is rank deficient. If the data has high random noise, this method of estimating the relaxation parameter $\varepsilon$ may not give accurate results. Then, an estimate of the magnitude of the noise can be taken as the relaxation parameter. Finally, the assumed true model may have a combination of low noise and bias error. In such a case, a small fraction of the root mean square error in the response surface can be taken as the relaxation parameter. The method of estimating the relaxation parameter is summarized as follows:

1. If there is no information on the noise and confidence in the assumed true model is high, the relaxation parameter $\varepsilon$ is given by Equation (16).

2. If the magnitude of the noise can be estimated, the relaxation parameter $\varepsilon$ is given as $\max(k\varepsilon_r, b_n)$, where $b_n$ is the estimated bound on the noise.

3. If the confidence in the assumed true model is not high, the relaxation parameter $\varepsilon$ is chosen as $\max(k\varepsilon_r, k^* e_{rms})$, where $e_{rms}$ is the root mean square error and $k^*$ is a small fraction. In this study, $k^*$ is taken as 0.1.

4. If both conditions 2 and 3 are true, the relaxation parameter $\varepsilon$ is chosen as $\max(k\varepsilon_r, k^* e_{rms}, b_n)$.
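The four selection rules above can be collapsed into a single maximum over whichever terms are available. The following is a minimal sketch under that reading; the function and argument names are hypothetical, and the minimum relaxation is assumed to have been obtained beforehand from the LP in Equation (15).

```python
def relaxation_parameter(eps_r, k=1.5, noise_bound=None, e_rms=None, k_star=0.1):
    """Pick the relaxation parameter from the four rules (illustrative sketch).

    eps_r       : minimum relaxation from Equation (15)
    k           : degree of relaxation (1.5 in the study)
    noise_bound : estimated bound on the noise, if available (rule 2)
    e_rms       : RMS error of the response surface, supplied when
                  confidence in the assumed true model is low (rule 3)
    k_star      : small fraction applied to e_rms (0.1 in the study)
    """
    candidates = [k * eps_r]                  # rule 1, Equation (16)
    if noise_bound is not None:
        candidates.append(noise_bound)        # rule 2
    if e_rms is not None:
        candidates.append(k_star * e_rms)     # rule 3
    return max(candidates)                    # rule 4 when both are given
```

For example, `relaxation_parameter(1e-3, noise_bound=2.6e-3)` returns the noise bound because it exceeds $k\varepsilon_r$.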

The proposed approach to estimate bias error bounds is demonstrated with the help of analytical test problems. The method is compared with the errors predicted using the estimated standard error (ESE) and with the actual errors.

IV. Application to Analytical Problems

The approach discussed in the previous section is demonstrated with the help of a simple polynomial problem. Various cases of the polynomial example are considered here. The problems investigated are described as follows:

Polynomial problem

A two-variable polynomial problem is constructed to demonstrate the proposed approach. The assumed true model is a cubic polynomial, and a quadratic polynomial response surface is fit to the data in a least squares sense. The true function, the assumed true model, and the response surface model are given as follows:

True function:

$$y = \beta_1^{(1)} + \beta_2^{(1)} x_1 + \beta_3^{(1)} x_2 + \beta_4^{(1)} x_1^2 + \beta_5^{(1)} x_1 x_2 + \beta_6^{(1)} x_2^2 + \beta_1^{(2)} x_1^3 + \beta_2^{(2)} x_1^2 x_2 + \beta_3^{(2)} x_1 x_2^2 + \beta_4^{(2)} x_2^3 + C x_1^2 x_2^4 + D\, U(\mathbf{x}) \tag{17}$$

Assumed true model:

$$\tilde{y} = \beta_1^{(1)} + \beta_2^{(1)} x_1 + \beta_3^{(1)} x_2 + \beta_4^{(1)} x_1^2 + \beta_5^{(1)} x_1 x_2 + \beta_6^{(1)} x_2^2 + \beta_1^{(2)} x_1^3 + \beta_2^{(2)} x_1^2 x_2 + \beta_3^{(2)} x_1 x_2^2 + \beta_4^{(2)} x_2^3 \tag{18}$$

Response surface model:

$$\hat{y} = b_1 + b_2 x_1 + b_3 x_2 + b_4 x_1^2 + b_5 x_1 x_2 + b_6 x_2^2 \tag{19}$$

$U(\mathbf{x})$ is a uniformly distributed random variable in [0, 1], uncorrelated with and independent of the design point $\mathbf{x}$. Different combinations of $C$ and $D$ are used to introduce different types of inconsistencies between the data generated by the true function and the assumed true model. The range of the design variables is $x_1, x_2 \in [0, 1]$. The errors in approximation are expected to be smaller than the response, so the components of the coefficient vector $\boldsymbol{\beta}^{(2)}$ are smaller than those of $\boldsymbol{\beta}^{(1)}$. In all test cases, the coefficients $C$ and $D$ are selected such that the primary bias error (between the assumed true model and the response surface) is larger than the errors in the assumed true model. A brief summary of all the test cases is given in Table 1.

Cases                                                   C        D
Case 1: Assumed true model is true function             0.0000   0.0000
Case 2: Assumed true model has bias error               0.0026   0.0000
Case 3: Assumed true model has noise error              0.0000   0.0026
Case 4: Assumed true model has bias and noise error     0.0013   0.0013

Table 1 Summary of different test cases
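The assumed true model and the response surface model differ only in the four cubic basis functions. A minimal sketch of the two basis vectors and the corresponding evaluations is given below; the function names and the coefficient values in the usage note are hypothetical, and the perturbation terms involving $C$ and $D$ are deliberately left out.

```python
def f1(x1, x2):
    """Quadratic basis functions used by the response surface model."""
    return [1.0, x1, x2, x1 ** 2, x1 * x2, x2 ** 2]

def f2(x1, x2):
    """Cubic basis functions present only in the assumed true model."""
    return [x1 ** 3, x1 ** 2 * x2, x1 * x2 ** 2, x2 ** 3]

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * c for a, c in zip(u, v))

def assumed_true_model(x1, x2, beta1, beta2):
    """Cubic assumed true model: quadratic part plus missing cubic terms."""
    return dot(f1(x1, x2), beta1) + dot(f2(x1, x2), beta2)

def response_surface(x1, x2, b):
    """Quadratic response surface prediction."""
    return dot(f1(x1, x2), b)
```

As a usage example with made-up coefficients, `assumed_true_model(1.0, 1.0, [1.0] * 6, [0.1] * 4)` sums six quadratic terms and four cubic terms, while `response_surface(1.0, 1.0, [1.0] * 6)` sums only the six quadratic terms.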

Testing procedure


Bias error bounds were computed using the proposed method. The bound estimates were compared with the actual error field. The proposed method was also compared with the established prediction variance based method by assessing the ability of each to characterize the actual error field. Coefficients of correlation were computed between the bias error bounds (BEB) and the actual errors, and between the estimated standard errors (ESE) and the actual errors. A higher coefficient of correlation is considered to indicate a better characterization of the actual error. The stepwise testing procedure is described as follows.

Step 1: Select the design points for response surface approximation.

Step 2: Generate the response data $\mathbf{y}$ at the design points: It is tempting to select random values for the data, but this in general leads to poor approximations. This is because a random selection of the response $\mathbf{y}$ may correspond to large values of the coefficient vector $\boldsymbol{\beta}^{(2)}$ or be inconsistent with the true function. Instead, response data can be generated by selecting the coefficients of the vector $\mathbf{b}$ and the error vector $\mathbf{e}$ randomly. Random selection of the components of $\mathbf{e}$ may yield inconsistency with the true model. Hence Equation (10) is used to ensure consistency between the error vector $\mathbf{e}$ and the true model by selecting random values of the coefficient vector $\boldsymbol{\beta}^{(2)}$.

Step 3: Generate the true model coefficient vectors $\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$: The true polynomial coefficient vectors $\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$ are unique if there is a sufficient number of points and the Gramian matrix $X$ is not rank deficient. Then, the bias error bounds and the estimated standard error can be compared with the actual errors. However, often these conditions are violated and not all the components of the coefficient vectors $\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$ can be determined using the data. Then there are infinitely many candidate coefficient vectors $\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$ which satisfy the data $\mathbf{y}$. It should be noted that the polynomials which satisfy the response data ($\mathbf{y}$) give different responses at points other than the selected design points. Hence it is not possible to compute the actual error at an arbitrary design point. However, the bias error bounds are worst case error measures and characterize the worst case scenario of the actual error field.

Similarly, the estimated standard errors are average error measures and can be compared with the averaged actual error field. The worst case error field and the averaged error field can be computed by generating a large number of candidate coefficient vectors $\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$ and then computing the actual errors. The worst error field is the largest error over the candidates, and the average error field is the averaged error over the candidates, at each design point in the space.

Step 4: Determine the bounds on the true coefficient vector: Selection of proper bounds is essential to estimate the pointwise bias errors (Papila et al. [22]).

Step 5: Estimate the relaxation parameter $\varepsilon$.

Step 6: Compute errors in design space: Error is evaluated at a uniform grid of 11x11 (121 nodes) points.

Step 6a: Estimate pointwise bias error bounds.

Step 6b: Calculate estimated standard error.

Step 6c: Compute actual worst error and actual average error (RMS error).

Step 7: Determine coefficient of correlation between the actual worst error and the bias error bounds.

Step 8: Determine the coefficient of correlation between estimated standard error and actual average error.
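Steps 7 and 8 reduce to computing a correlation coefficient between two error fields sampled on the same grid. The sketch below assumes the Pearson product-moment coefficient is meant, which the text does not state explicitly; the function name is hypothetical.

```python
import math

def correlation(u, v):
    """Pearson correlation coefficient between two equal-length error fields.

    Assumes neither field is constant (otherwise the denominator is zero).
    """
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (c - mean_v) for a, c in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((c - mean_v) ** 2 for c in v)
    return cov / math.sqrt(var_u * var_v)
```

Here `u` would hold, for example, the bias error bounds at the 121 grid nodes flattened into a list, and `v` the actual worst errors at the same nodes.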

Important steps in the testing procedure are further elaborated below.

Selection of design points and generation of the response (y)

Once the functional forms of the true function and the response surface model are selected, the coefficients of the response surface model are computed using the function data at appropriately selected design points. Two sets of design points were used to construct the response surface models for different polynomial cases. The first set of design points was a three-level design as shown in Figure 1A. The second set has an additional design point such that the Gramian matrix for the assumed true model was rank deficient (Figure 1B).

The response data ($\mathbf{y}$) at the design points was selected by picking the vectors $\mathbf{b}$ and $\boldsymbol{\beta}^{(2)}$. There can be infinitely many combinations of response data ($\mathbf{y}$) at the design points; however, in this study five distinct combinations of coefficients (datasets) were selected to demonstrate the approach. The response surface coefficient vector $\mathbf{b}$ was selected randomly between -1 and 1, and the true coefficient vector $\boldsymbol{\beta}^{(2)}$ was selected randomly between -0.2 and 0.2. Smaller $\boldsymbol{\beta}^{(2)}$ coefficients ensured that the bias errors were smaller than the true response. The coefficients $b_i$ and $\beta_j^{(2)}$ for the different datasets are given in Table 2.


Coefficient       Dataset A   Dataset B   Dataset C   Dataset D   Dataset E
$b_1$              0.23       -0.88       -0.97        0.68       -0.61
$b_2$              0.58       -0.29        0.49       -0.96        0.36
$b_3$              0.84        0.63       -0.11        0.36       -0.39
$b_4$              0.48       -0.98        0.86       -0.24        0.08
$b_5$             -0.65       -0.72       -0.07        0.66       -0.70
$b_6$             -0.19       -0.59       -0.16        0.01        0.40
$\beta_1^{(2)}$    0.17       -0.12        0.14        0.08       -0.05
$\beta_2^{(2)}$    0.17        0.04        0.01       -0.03        0.14
$\beta_3^{(2)}$   -0.04       -0.09       -0.12       -0.08        0.14
$\beta_4^{(2)}$    0.16       -0.12        0.07       -0.12        0.04

Table 2 Case 1 function data for five randomly selected datasets

For each dataset, the vector $\boldsymbol{\beta}^{(1)}$ was evaluated using Equation (8). This true coefficient vector ($\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$) is denoted by $\boldsymbol{\beta}^*$. The actual response $\mathbf{y}$ was calculated for all the cases by adding the appropriate perturbations. The response surface coefficient vector $\mathbf{b}$ was recomputed for Cases 2, 3 and 4 to account for inconsistencies between the true function and the assumed true model. The procedure for generating the data is summarized as follows:

1. Pick the random coefficient vector $\mathbf{b}$ between [-1, 1].

2. Pick the true coefficient vector $\boldsymbol{\beta}^{(2)}$ between [-0.2, 0.2].

3. Evaluate the true coefficient vector $\boldsymbol{\beta}^{(1)}$ using Equation (8).

4. Calculate the response vector $\tilde{\mathbf{y}}$ using Equation (2).

5. Calculate the actual response vector $\mathbf{y}$ by adding appropriate perturbations.

6. Compute the response surface coefficient vector $\mathbf{b}$ using the actual response vector $\mathbf{y}$.

Generating true model coefficient vectors $\boldsymbol{\beta}$:

The first set of design points has 9 points and the rank of the corresponding Gramian design matrix was 8. For the second set of design points (10 points), the corresponding Gramian design matrix was also rank deficient. This suggests that the true coefficient vector $\boldsymbol{\beta}^*$ was not unique. In this case, the bias error bounds and the estimated standard errors can be compared with the actual worst error and the actual average error, respectively. A procedure for computing the actual worst and actual average errors is outlined here:

1. Generate a large number ($P$, $P \to \infty$) of candidate true coefficient vectors $\boldsymbol{\beta}_c$ which satisfy the response data ($\tilde{\mathbf{y}}$). In this study, 1000 candidate true coefficient vectors ($P = 1000$) were used to estimate the actual error field.

2. Estimate the error at any arbitrary design point using each candidate coefficient vector:

$$e_c(\mathbf{x}) = (\mathbf{F}^{(1)}(\mathbf{x}))^T \boldsymbol{\beta}_c^{(1)} + (\mathbf{F}^{(2)}(\mathbf{x}))^T \boldsymbol{\beta}_c^{(2)} - (\mathbf{F}^{(1)}(\mathbf{x}))^T \mathbf{b} \tag{20}$$

3. Find the actual worst error (AWE) at this design point as

$$\text{AWE}(\mathbf{x}) = \max_c |e_c(\mathbf{x})|; \quad (c = 1, \ldots, P;\; P \to \infty) \tag{21}$$

4. Compute the actual average error (AAE) as the root mean square error:

$$\text{AAE}(\mathbf{x}) = \sqrt{\frac{\sum_{c=1}^{P} e_c^2(\mathbf{x})}{P}} \tag{22}$$
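Once the candidate errors at a point are available, the worst and average error measures are simple reductions over the candidates. A minimal sketch follows; the function names are hypothetical, and `errors` stands for the list of candidate errors at one design point.

```python
import math

def actual_worst_error(errors):
    """Largest absolute candidate error at one design point (AWE)."""
    return max(abs(e) for e in errors)

def actual_average_error(errors):
    """Root mean square of the candidate errors at one design point (AAE)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

Applying both functions at every node of the evaluation grid yields the two actual error fields that are compared with the bias error bounds and the estimated standard errors.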

This approach is feasible if it is possible to generate a large number of candidate true coefficient vectors which satisfy the data. For the first set of design points, Equation (10) indicated that the errors at the design points depend only on the coefficients $\beta_2^{(2)}$ and $\beta_3^{(2)}$. Thus the coefficients $\beta_1^{(2)}$ and $\beta_4^{(2)}$ can be selected randomly without affecting the response data. This is utilized to generate candidate true coefficient vectors as follows:

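1. Generate random values of the coefficients $\beta_1^{(2)}$ and $\beta_4^{(2)}$.

2. Use $\beta_2^{(2*)}$ and $\beta_3^{(2*)}$ from the original $\boldsymbol{\beta}^*$ vector to construct the new vector $\boldsymbol{\beta}^{(2)} = [\beta_1^{(2)}, \beta_2^{(2*)}, \beta_3^{(2*)}, \beta_4^{(2)}]$.

3. Use the $\boldsymbol{\beta}^{(2)}$ vector generated in the previous step to find the vector $\boldsymbol{\beta}^{(1)}$.

4. The candidate true coefficient vector is $\boldsymbol{\beta}_c = (\boldsymbol{\beta}^{(1)}, \boldsymbol{\beta}^{(2)})$.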

This approach is not applicable to the second set of design points, as the error vector $\mathbf{e}$ depends on all the coefficients of the vector $\boldsymbol{\beta}^{(2)}$. In this case the approach to find the candidate true coefficient vectors is more involved. This is discussed in Appendix B.

Bounds on data:

Selection of proper bounds is essential to estimate the pointwise bias errors (Papila et al. [22]). Since $\boldsymbol{\beta}^{(1)}$, $\boldsymbol{\beta}^{(2)}$, and $\mathbf{b}$ are related by Equation (8), the bounds on the coefficient vector $\boldsymbol{\beta}^{(1)}$ were estimated using the bounds on the vector $\boldsymbol{\beta}^{(2)}$ and the vector $\mathbf{b}$ as follows:

$$\beta_{il}^{(1)} = b_{il} + \sum_j |A_{ij}| \beta_{jl}^{(2)}, \qquad \beta_{iu}^{(1)} = b_{iu} + \sum_j |A_{ij}| \beta_{ju}^{(2)} \tag{23}$$

In this equation subscript $l$ denotes the lower bound and subscript $u$ denotes the upper bound on the corresponding component of a vector. For example, $\beta_{il}^{(1)}$ represents the lower bound on the $i$th component of the vector $\boldsymbol{\beta}^{(1)}$ and $\beta_{iu}^{(1)}$ represents the upper bound on the $i$th component of the vector $\boldsymbol{\beta}^{(1)}$. The bounds on the components of the vector $\boldsymbol{\beta}^{(2)}$ were [-0.2, 0.2]. Then Equation (23) was simplified as follows:

$$\beta_{il}^{(1)} = b_{il} - 0.2 \sum_j |A_{ij}|, \qquad \beta_{iu}^{(1)} = b_{iu} + 0.2 \sum_j |A_{ij}| \tag{24}$$
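The bounds on the first coefficient vector follow from applying interval arithmetic to Equation (8) when every component of the second coefficient vector lies in a symmetric interval. A minimal sketch of that computation is given below; the function name, the argument layout (one alias-matrix row per component) and the example numbers are assumptions for illustration.

```python
def beta1_bounds(b_lower, b_upper, alias, c2=0.2):
    """Bounds on beta^(1) from bounds on b and |beta^(2)_j| <= c2.

    b_lower, b_upper : per-component bounds on the vector b
    alias            : alias matrix A as a list of rows
    c2               : symmetric bound on each beta^(2) component
    """
    lower, upper = [], []
    for bl, bu, row in zip(b_lower, b_upper, alias):
        spread = c2 * sum(abs(a) for a in row)   # largest value of |sum_j A_ij beta_j|
        lower.append(bl - spread)
        upper.append(bu + spread)
    return lower, upper
```

For instance, with a single component bounded by [-1, 1] and a hypothetical alias row `[0.5, -1.5]`, the spread is 0.2 × 2.0 = 0.4, so the resulting interval is [-1.4, 1.4].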

V. Results

In this section the key results obtained using the point-to-point bias error bound estimation method are discussed. The method is implemented in MATLAB® [23].

Quadratic cubic problem: The first set of design points (9 points)

The summary of the results for the first set of design points is given in Table 3. There was a very strong correlation (~0.99) between the bias error bounds and the actual worst errors. The correlation coefficient deteriorated slightly with an increase in random noise in the data. This result was expected, as the assumption of zero noise error in the data is violated. The magnitude of the maximum bias error bound was of the order of $10^{-2}$, and the location of the maximum bias error bound was found to match well with the location of the maximum actual worst case error. The order of the maximum bias error bound estimated by the proposed method justified the selection of the values of the coefficients $C$ and $D$. It is noteworthy that when the assumed true model and the true function were the same, the relaxation parameter value was very small.

The correlation between the estimated standard error and the actual average error (AAE) was relatively weaker, and there were large variations in the coefficient of correlation between ESE and AAE. The location of the maximum error did not correspond well with the actual error. This suggests that when the bias error is dominant, the prediction variance based approach is not effective in characterizing the actual error field, but the bias error bounds represent the actual errors very well.

Case #   Dataset   Epsilon ($\varepsilon$)   Maximum BEB   Correlation coefficient   Correlation coefficient
                                                           between BEB and AWE       between ESE and AAE
1        A         1.31e-14                  0.025         0.999                     0.545
1        B         5.73e-15                  0.022         0.999                     0.697
1        C         1.43e-14                  0.023         0.999                     0.639
1        D         3.71e-15                  0.022         0.999                     0.705
1        E         6.51e-15                  0.026         0.999                     0.594
2        A         8.40e-4                   0.026         0.999                     0.538
2        B         4.74e-4                   0.023         0.999                     0.689
2        C         5.79e-4                   0.024         0.999                     0.634
2        D         4.11e-4                   0.022         0.999                     0.708
2        E         9.53e-4                   0.027         0.999                     0.596
3        A         2.60e-3                   0.029         0.996                     0.549
3        B         2.60e-3                   0.026         0.997                     0.696
3        C         2.60e-3                   0.027         0.995                     0.653
3        D         2.60e-3                   0.026         0.990                     0.723
3        E         2.60e-3                   0.030         0.997                     0.604
4        A         1.30e-3                   0.027         0.999                     0.530
4        B         1.30e-3                   0.024         0.998                     0.692
4        C         1.30e-3                   0.025         0.998                     0.634
4        D         1.30e-3                   0.024         0.997                     0.716
4        E         1.30e-3                   0.028         0.998                     0.597

Table 3 Coefficients of correlations and maximum bias error bound for first set of design points

A typical distribution pattern of different errors in space is shown in Figure 2. Case 4 (Dataset C), where the assumed true model has both random noise and bias error, was chosen as the example. Figure 2A shows the bias error bounds, Figure 2B depicts the actual worst errors, Figure 2C exhibits the estimated standard error and Figure 2D shows the actual average error. The black square dots in each figure show the location of the existing data points. It can be seen from Figure 2A and Figure 2B that the actual worst errors and the bias error bounds had very similar patterns. This is consistent with the high correlation between the actual worst errors and the bias error bounds. The patterns of the estimated standard errors (Figure 2C) and the actual average errors (Figure 2D) were quite dissimilar, which is consistent with the weak correlation obtained between the two errors.

In summary, when the bias error dominates, the bias error bound estimates give a closer representation of the actual errors than the prediction variance based method. The high values of the correlation coefficients between the bias error bounds and the actual worst errors (~0.99) also support this conclusion. Additional points can be added at the locations where the bias error bound is high to improve the response surfaces locally.

Effect of adding data point: The second set of design points (10 points but rank deficient Gramian matrix)

An additional point was placed in the vicinity of high bias error bounds. The results for the different cases are summarized in Table 4. As observed previously, the correlation between the bias error bounds and the actual worst errors was very high. Table 4 shows that when there was no error in the assumed true model, the maximum bias error bounds decreased and the correlation between the bias error bounds and the actual worst errors increased. When the assumed true model was not accurate, however, the correlation coefficient decreased slightly and the maximum bias error bound increased slightly in some cases. As observed previously, the correlation coefficient decreased as the random noise increased.

Case #  Dataset  Epsilon (ε)  Maximum BEB  Corr. (BEB, AWE)  Corr. (ESE, AAE)
1       A        1.00e-6      0.024        1.000             0.437
1       B        1.00e-6      0.021        1.000             0.499
1       C        1.00e-6      0.019        1.000             0.327
1       D        1.00e-6      0.014        1.000             0.544
1       E        1.00e-6      0.022        1.000             0.419
2       A        8.05e-4      0.026        0.997             0.427
2       B        4.55e-4      0.022        0.999             0.484
2       C        5.50e-4      0.021        0.997             0.330
2       D        4.56e-4      0.015        0.996             0.532
2       E        9.10e-4      0.025        0.994             0.422
3       A        2.60e-3      0.028        0.963             0.427
3       B        2.60e-3      0.025        0.971             0.503
3       C        2.60e-3      0.026        0.966             0.359
3       D        2.60e-3      0.020        0.942             0.560
3       E        2.60e-3      0.029        0.968             0.445
4       A        1.30e-3      0.026        0.993             0.441
4       B        1.30e-3      0.023        0.991             0.490
4       C        1.30e-3      0.023        0.987             0.340
4       D        1.30e-3      0.017        0.977             0.543
4       E        1.30e-3      0.025        0.991             0.418

Table 4 Coefficients of correlation and maximum bias error bounds after adding one more design point

The correlation coefficient between the actual average errors and the estimated standard errors was statistically insignificant for all the experiments. A typical distribution of the actual worst errors and the bias error bounds in the design space is shown in Figure 3. As expected, the errors were reduced in the region where the additional point was placed.

Effect of increase in relaxation parameter:

As discussed earlier, a very small value of the relaxation parameter does not always give a feasible solution, and a very large value may cause the fitted response surface to represent the data very poorly. The effect of the relaxation parameter ε on the correlation between the maximum actual error and the bias error bound was therefore studied. Case 4 of the example problem was selected because it includes both bias error and random noise in the assumed true model. Function data from Dataset C were used. The results are summarized in Figure 4. Figure 4A shows the variation of the correlation coefficient between the bias error bounds and the actual worst errors: the correlation coefficient decreased as the relaxation parameter increased. Figure 4B shows the variation of the maximum bias error bound with ε: the maximum bias error bound increased linearly with the relaxation parameter. These results were consistent with expectations, because increasing ε relaxes the constraints further and expands the search space, which potentially lets the optimization find a better solution.
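The two effects described above (infeasibility for a very small relaxation parameter, and a maximum that cannot decrease as the constraints are relaxed) can be seen in a toy problem. The sketch below is not the optimization problem of this paper: it assumes a single scalar coefficient `beta`, hypothetical data, and a closed-form feasible interval, and every name in it is illustrative.

```python
def feasible_interval(xs, ys, eps, c):
    """Interval of scalar beta with |x*beta - y| <= eps at every data point
    and |beta| <= c, or None if no such beta exists. Assumes every x > 0."""
    lo, hi = -c, c
    for x, y in zip(xs, ys):
        lo = max(lo, (y - eps) / x)
        hi = min(hi, (y + eps) / x)
    return (lo, hi) if lo <= hi else None

def max_abs(g, interval):
    """Largest |g*beta| over the interval (attained at an endpoint)."""
    lo, hi = interval
    return max(abs(g * lo), abs(g * hi))

# Hypothetical data that no single beta fits exactly (toy example only).
xs, ys = [1.0, 2.0], [1.0, 2.1]
results = {}
for eps in (0.0, 0.05, 0.1, 0.2):
    interval = feasible_interval(xs, ys, eps, c=5.0)
    results[eps] = None if interval is None else max_abs(1.0, interval)
    print(eps, results[eps])
```

In this toy setting the feasible interval is empty when the relaxation is zero, and once it is non-empty each increase in `eps` widens it, so the maximum of the objective can only stay the same or grow.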

VI. Concluding remarks

In this work, a generalized pointwise bias error bounds estimation method was presented to handle cases where the assumed true model does not satisfy the data exactly. The method was demonstrated on analytical polynomial problems. Comparison of the bias error bounds and the estimated standard error with the actual errors showed that the bias error bounds better characterize the actual error field. The bias error bounds characterized the actual worst errors even when the assumed true model was not accurate. The locations of the maximum bias error bound and the maximum actual worst-case error were found to be very strongly related. It was also shown that adding another data point at the location of the maximum bias error bound refines the response surface locally. Although the second set of design points produced a rank-deficient Gramian matrix, the proposed method estimated the bias error bounds without difficulty. The correlation between the bias error bounds and the actual worst errors was very high, whereas the correlation between the estimated standard error and the actual RMS error was relatively weak.

The bias error bound estimates depend on the value of the relaxation parameter. Very high and very low values of the relaxation parameter may not give a correct solution. The effect of increasing the relaxation parameter on the bias error bounds was therefore studied. As the relaxation parameter increased, the pattern of the bias error bounds was preserved, but the coefficient of correlation between the bias error bounds and the actual worst errors decreased, and the magnitude of the maximum bias error bound increased linearly.

VII. References

1. Kaufman, M., Balabanov, V., Burgee, S.L., "Variable-Complexity Response Surface Approximations for Wing Structural Weight in HSCT Design", Computational Mechanics, 18, 1996, pp. 112-126.

2. Balabanov, V., Kaufman, M., Knill, D.L., Haim, D., Golovidov, O., Giunta, A.A., Haftka, R.T., Grossman, G., Mason, W.H., and Watson, L.T., "Dependence of Optimal Structural Weight on Aerodynamic Shape for a High Speed Civil Transport", in Proceedings, 6th AIAA/NASA/USAF Symposium on Multidisciplinary Analysis and Optimization, Bellevue, WA, AIAA paper 96-4046, September 1996, pp. 599-612.

3. Balabanov, V.O., Giunta, A.A., Golovidov, O., Grossman, B., Mason, W.H., Watson, L.T., and Haftka, R.T., "Reasonable Design Space Approach to Response Surface Approximation", Journal of Aircraft, 36(1), 1999, pp. 308-315.

4. Papila, M. and Haftka, R.T., "Uncertainty and Wing Structural Weight Approximations", in Proceedings, 40th AIAA/ASME/ASCE/ASC Structures, Structural Dynamics, and Material Conference, St. Louis, MO, Paper AIAA-99-1312, April 1999, pp. 988-1002.

5. Papila, M. and Haftka, R.T., "Response Surface Approximations: Noise, Error Repair and Modeling Errors", AIAA Journal, 38(12), 2000, pp. 2336-2343.

6. Madsen, J.I., Shyy, W. and Haftka, R.T., "Response Surface Techniques for Diffuser Shape Optimization", AIAA Journal, Vol. 38, (2000), pp. 1512-1518.

7. Vaidyanathan, R., Papila, N., Shyy, W., Tucker, K. P., Griffin, L. W., Haftka, R. T., and Fitz-Coy, N., "Neural Network and Response Surface Methodology for Rocket Engine Component Optimization", 8th AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, (2000), Paper No. 2000-4480, Long Beach, CA.

8. Vaidyanathan, R., Tucker, K. P., Papila, N. and Shyy, W., "CFD Based Design Optimization for a Single Element Rocket Injector", 41st Aerospace Sciences Meeting and Exhibit, Reno, NV, (2003), Paper No. 2003-0296, (also accepted for publication in Journal of Fluids Engineering).

9. Papila, N., Shyy, W., Griffin, L. W., Huber, F., Tran, K., "Preliminary Design Optimization for a Supersonic Turbine for Rocket Propulsion", 36th AIAA/ASME/SAE/ASEE Joint Propulsion Conference and Exhibit, Huntsville, Alabama, (2000).

10. Papila, N., Shyy, W., Griffin, L. W., Dorney, D. J., "Shape Optimization of Supersonic Turbines Using Response Surface and Neural Network Methods", Journal of Propulsion and Power, (2001), Vol. 18, pp. 509-518.

11. Shyy, W., Tucker, P.K., and Vaidyanathan, R., "Response surface and neural network techniques for rocket engine injector optimization", Journal of Propulsion and Power, 17(2), 2001, pp. 391-401.

12. Shyy, W., Papila, N., Vaidyanathan, R. and Tucker, K., "Global Design Optimization for Aerodynamics and Rocket Propulsion Components", Progress in Aerospace Sciences, (2001), Vol. 37, pp. 59-118.

13. Redhe, M., Forsberg, J., Jansson, T., Marklund, P.O., Nilsson, L., "Using the response surface methodology and the D-optimality criterion in crashworthiness related problems: An analysis of the surface approximation error versus the number of function evaluations", Structural and Multidisciplinary Optimization, Vol. 24(3), (2002), pp. 185-194.

14. Goel, T., Vaidyanathan, R. V., Haftka, R. T., Queipo, N. V., Shyy, W., Tucker, K. P., "Response Surface Approximation of Pareto Optimal Front in Multi-objective Optimization", 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, (2004), 30 Aug-1 Sep (accepted for publication).

15. Myers, R.H., and Montgomery, D.C., Response Surface Methodology-Process and Product Optimization Using Designed Experiments, New York: John Wiley & Sons, Inc., 1995, pp. 208-279.

16. Khuri, A.I. and Cornell, J.A., Response Surfaces: Designs and Analyses, 2nd edition, New York, Marcel Dekker Inc., 1996, pp. 207-247.

17. Box, G.E.P. and Draper, N.R., "The choice of a second order rotatable design", Biometrika, 50 (3), 1963, pp. 335-352.

18. Draper, N. R. and Lawrence, W. E., "Designs which minimize model inadequacies: cuboidal regions of interest", Biometrika, 52 (1-2), (1965), pp. 111-118.

19. Kupper, L. L. and Meydrech, E. F., "A new approach to mean squared error estimation of response surfaces", Biometrika, 60 (3), (1973), pp. 573-579.

20. Welch, W. J., "A mean squared error criterion for the design of experiments", Biometrika, 70 (1), 1983, pp. 205-213.

21. Papila, M. and Haftka, R.T., "Uncertainty and Response Surface Approximations", in Proceedings, 42nd AIAA/ASME/ASCE/ASC Structures, Structural Dynamics, and Material Conference, Seattle, WA, (2001), Paper AIAA-01-1680.

22. Papila, M., Haftka, R. T. and Watson, L. T., "Bias Error Bounds for Response Surface Approximations and Min-Max Bias Design", AIAA Journal, (to appear 2004).

23. MATLAB®, The Language of Technical Computing, Version 6.5 Release 13. © 1984-2002, The MathWorks, Inc.

Figures:

(A) First set of design points (B) Second set of design points

Figure 1 Datasets used to construct the response surfaces for polynomial example problem

(A) Bias error bounds (B) Actual worst error (C) Estimated standard error (D) Actual average error

Figure 2 Different errors in the design space for quadratic-cubic problem (case 4, dataset 1)

(A) Bias error bounds (B) Actual worst case error

Figure 3 Distribution of bias error bounds and actual worst case error in the design space (dataset 2)

(A) Effect of ε on correlation between BEB and AWE (panel title: Effect of increasing epsilon on correlation; horizontal axis: epsilon) (B) Effect of ε on maximum BE bounds (panel title: Max Bias Error Bounds vs. Epsilon; horizontal axis: epsilon)

Figure 4 Effect of relaxation parameter on correlation coefficient and maximum BE bounds

Appendix A: Response surface approximation and prediction variance

A scalar function $\eta(\mathbf{x})$ is computed at $N$ design points $\mathbf{x}^{(i)}$, $i = 1, 2, 3, \ldots, N$. The data at these points have random errors $\varepsilon_i$, which are assumed to be uncorrelated and normally distributed random variables with zero mean and variance $\sigma^2$. The function value at a design point $\mathbf{x}^{(i)}$ is given by

$$y_i = \eta(\mathbf{x}^{(i)}) + \varepsilon_i \tag{A1}$$

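The true response $\eta(\mathbf{x})$ can be written in terms of basis functions $f_j(\mathbf{x})$ and the associated coefficient vector $\boldsymbol{\beta}$ as follows:

$$\eta(\mathbf{x}) = \sum_{j=1}^{n_1} \beta_j f_j(\mathbf{x}) = \mathbf{F}^T(\mathbf{x})\,\boldsymbol{\beta} \tag{A2}$$

where $\mathbf{F}^T(\mathbf{x}) = (f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_{n_1}(\mathbf{x}))$ and $\boldsymbol{\beta}^T = (\beta_1, \beta_2, \ldots, \beta_{n_1})$. The basis functions are often chosen as monomials.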

A response surface is fit to the $N$ data points ($N > n_1$) such that the approximate function is given by

$$\hat{y}(\mathbf{x}) = \sum_{j=1}^{n_1} b_j f_j(\mathbf{x}) = \mathbf{F}^T(\mathbf{x})\,\mathbf{b} \tag{A3}$$

where $\mathbf{b}$ is the coefficient vector, whose expected value is the vector $\boldsymbol{\beta}$. The difference between the actual function value and the estimated value at the $i$-th data point is

$$e_i = y_i - \hat{y}(\mathbf{x}^{(i)}) \tag{A4}$$

For $N$ data points, the residuals can be written in matrix form as

$$\mathbf{e} = \mathbf{y} - \mathbf{X}\mathbf{b} \tag{A5}$$

where $\mathbf{e}^T = (e_1, e_2, \ldots, e_N)$ and $\mathbf{X}$ is the Gramian matrix with entries $X_{i,j} = f_j(\mathbf{x}^{(i)})$. One typical Gramian matrix is given in Equation 4. The coefficient vector $\mathbf{b}$ is usually obtained by least-squares regression, so that the sum of squared residuals $\mathbf{e}^T\mathbf{e}$ is minimized. This gives

$$\mathbf{b} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y} \tag{A6}$$

The quality of fit of different surfaces can be compared using the adjusted RMS error, defined as

$$\sigma_a = \sqrt{\frac{\sum_{i=1}^{N} e_i^2}{N - n_1}} = \sqrt{\frac{\mathbf{y}^T\mathbf{y} - \mathbf{b}^T\mathbf{X}^T\mathbf{y}}{N - n_1}} \tag{A7}$$

Here $\sigma_a$ is the adjusted RMS error incurred while mapping the surface over the data set. The measure of error given by $\sigma_a$ is normalized to account for the degrees of freedom in the model. The adjusted RMS error thus accounts for the nominal effect of higher order terms, providing a better overall comparison among the different surface fits.
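The sum of squared residuals in Equation (A7) can also be written without forming the residuals explicitly. The short derivation below assumes that the second expression in (A7) is the usual least-squares form, and uses only the quantities defined in Equations (A5) and (A6):

```latex
\begin{align*}
\mathbf{e}^T\mathbf{e}
  &= (\mathbf{y} - \mathbf{X}\mathbf{b})^T(\mathbf{y} - \mathbf{X}\mathbf{b}) \\
  &= \mathbf{y}^T\mathbf{y} - 2\,\mathbf{b}^T\mathbf{X}^T\mathbf{y}
     + \mathbf{b}^T\mathbf{X}^T\mathbf{X}\mathbf{b} \\
  &= \mathbf{y}^T\mathbf{y} - \mathbf{b}^T\mathbf{X}^T\mathbf{y}
  \qquad \text{since } \mathbf{X}^T\mathbf{X}\mathbf{b} = \mathbf{X}^T\mathbf{y}
  \text{ from (A6)}.
\end{align*}
```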

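The prediction variance is the most widely accepted measure for characterizing the noise error. At any design point $\mathbf{x}$, the prediction variance is given as [15, 16]:

$$\mathrm{Var}[\hat{y}(\mathbf{x})] = \sigma_a^2\,\mathbf{F}^T(\mathbf{x})(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{F}(\mathbf{x}) \tag{A8}$$

and the estimated standard error, given as the square root of the prediction variance, is

$$e_{es}(\mathbf{x}) = \sqrt{\mathrm{Var}[\hat{y}(\mathbf{x})]} \tag{A9}$$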
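The chain from the least-squares fit (A6) through the adjusted RMS error (A7) to the estimated standard error (A8, A9) can be sketched in a few lines. The example below is only an illustration under stated assumptions: a one-dimensional quadratic monomial basis, noise-free hypothetical cubic data, and a small hand-written linear solver in place of a numerical library. None of the function names come from the paper.

```python
import math

def basis(x):
    """Monomial basis for a 1-D quadratic model: F(x) = (1, x, x^2)."""
    return [1.0, x, x * x]

def solve(A, rhs):
    """Solve A z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit(xs, ys):
    """Return coefficients b (A6), adjusted RMS error (A7), and X^T X."""
    X = [basis(x) for x in xs]
    n = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(n)]
    b = solve(XtX, Xty)
    resid = [y - sum(bj * fj for bj, fj in zip(b, row)) for row, y in zip(X, ys)]
    sigma_a = math.sqrt(sum(e * e for e in resid) / (len(xs) - n))
    return b, sigma_a, XtX

def estimated_standard_error(x, sigma_a, XtX):
    """Square root of the prediction variance at x (A8, A9)."""
    F = basis(x)
    z = solve(XtX, F)  # z = (X^T X)^{-1} F(x)
    return sigma_a * math.sqrt(sum(f * zi for f, zi in zip(F, z)))

# Hypothetical example: quadratic fit to noise-free cubic data.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [x ** 3 for x in xs]
b, sigma_a, XtX = fit(xs, ys)
print(b, sigma_a, estimated_standard_error(0.75, sigma_a, XtX))
```

Because the data here contain no noise, the non-zero adjusted RMS error comes entirely from the missing cubic term, which is the situation in which the estimated standard error is a poor guide to the actual error.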

Appendix B: Generating family of candidate true coefficient vectors

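For the second set of design points, the Gramian matrix is rank deficient (rank 9). Let the eigenvector associated with the zero eigenvalue, termed the null vector, be denoted as $\mathbf{V}$. Multiplying a rank-deficient matrix by its null vector yields zero ($\mathbf{X}\mathbf{V} = \mathbf{0}$). This property can be used to generate the family of polynomials satisfying the data points as follows:

$$\mathbf{X}\boldsymbol{\beta} = \mathbf{y} \quad\text{and}\quad \mathbf{X}\mathbf{V} = \mathbf{0} \;\;\Rightarrow\;\; \mathbf{X}(\boldsymbol{\beta} + \alpha\mathbf{V}) = \mathbf{y} \tag{D1}$$

since $\mathbf{X}(\boldsymbol{\beta} + \alpha\mathbf{V}) = \mathbf{X}\boldsymbol{\beta} + \alpha\mathbf{X}\mathbf{V} = \mathbf{y}$ for any scalar $\alpha$.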


One candidate true coefficient vector $\boldsymbol{\beta}$ was obtained by randomly selecting the value of $\beta_4^{(2)}$. Random selection of the value of the scalar $\alpha$ generates a family of candidate true coefficient vectors ($\boldsymbol{\beta} + \alpha\mathbf{V}$) which satisfy the data.

The selection of bounds on $\alpha$ is very important. An arbitrary selection of bounds on $\alpha$ may create a coefficient vector $\boldsymbol{\beta} + \alpha\mathbf{V}$ which either violates the bounds on $\boldsymbol{\beta}$ or does not explore the complete feasible design space. One method of selecting the bounds is to identify the largest positive value $\alpha_{max}$ with which the original coefficient vector $\boldsymbol{\beta}$ can be perturbed as given in Equation (D1) such that all components of the vector ($\boldsymbol{\beta} + \alpha_{max}\mathbf{V}$) satisfy the upper constraint on the true coefficient vector $\boldsymbol{\beta}$. Similarly, find the most negative value $\alpha_{min}$ with which the original coefficient vector $\boldsymbol{\beta}$ can be perturbed such that all components of the vector ($\boldsymbol{\beta} + \alpha_{min}\mathbf{V}$) satisfy the lower constraint on the true coefficient vector $\boldsymbol{\beta}$. Using the bounds $\alpha_{min} \le \alpha \le \alpha_{max}$ generated a family of candidate true coefficient vectors.

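The procedure of Appendix B can be sketched as follows. This is an illustration under assumptions rather than the implementation used in the study: the scalar multiplier in Equation (D1) is called `alpha` here, the bounds are computed so that every component respects both its lower and its upper limit (a slightly more conservative reading than the description above), and the 2x2 matrix, null vector, and coefficient limits are hypothetical.

```python
import random

def alpha_bounds(beta, V, lower, upper):
    """Range of alpha for which every component of beta + alpha*V
    stays within [lower, upper]."""
    a_min, a_max = float("-inf"), float("inf")
    for b, v, lo, hi in zip(beta, V, lower, upper):
        if v > 0:
            a_min, a_max = max(a_min, (lo - b) / v), min(a_max, (hi - b) / v)
        elif v < 0:
            a_min, a_max = max(a_min, (hi - b) / v), min(a_max, (lo - b) / v)
        # v == 0: this component is unaffected by alpha
    return a_min, a_max

def candidate_family(beta, V, a_min, a_max, count, seed=0):
    """Random members beta + alpha*V with alpha drawn from [a_min, a_max]."""
    rng = random.Random(seed)
    family = []
    for _ in range(count):
        alpha = rng.uniform(a_min, a_max)
        family.append([b + alpha * v for b, v in zip(beta, V)])
    return family

def matvec(X, z):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(xij * zj for xij, zj in zip(row, z)) for row in X]

# Hypothetical rank-deficient example: X V = 0 for V = (1, -1).
X = [[1.0, 1.0], [2.0, 2.0]]
V = [1.0, -1.0]
beta = [0.5, 0.2]
a_min, a_max = alpha_bounds(beta, V, lower=[-1.0, -1.0], upper=[1.0, 1.0])
family = candidate_family(beta, V, a_min, a_max, count=5)
y = matvec(X, beta)
for cand in family:
    print(cand, matvec(X, cand))
```

Every member of the family reproduces the same data vector `y`, because the perturbation lies along the null vector, while the computed range of `alpha` keeps each coefficient inside its limits.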
