TTÜ, Department of Machinery
Lead researcher Jüri Majak
E-mail: juri.majak@ttu.ee
MER9020
Structure of the course
Introduction
• Mathematical models and tools for building these models. Use of experimental-data-based models.
Engineering statistics
– Engineering statistics and design. Statistical design of experiments. Use of statistical models in design.
– Artificial neural network models.
– Accuracy of the model. Statistical tests of hypotheses. Analysis of variance. Risk, reliability, and safety.
• Basics of optimisation methods and search for solutions
– Types of optimization problems.
– Multicriteria optimal decision theory. Pareto optimality; use of multicriteria decision theory in engineering design.
– Traditional gradient-based optimization. Lagrange multipliers. Examples of the use of classical optimization methods in engineering.
– Mathematical programming methods and their use for engineering design, process planning and manufacturing resource planning. Direct and dual optimization tasks. Sensitivity analysis.
– Evolutionary algorithms.
• Design of products and processes. Applications, samples.
Basic activities and relations
[Diagram: engineering design optimization workflow]
• Design of Experiment (DOE)
• Experimental study, computer simulations
• Process monitoring, statistical analysis of experimental data, model evaluation, etc.
• Simulation, analysis
• Meta-modeling: development of the response surface, its evaluation and validation
• Design optimization: decision making, linear and nonlinear programming
Homework topics
• Descriptive statistics: statistical estimates, relation between two test series (correlation), variance.
• Response modeling
– Regression analysis, model validation.
– ANN-based models
• Design optimization.
Engineering problems with one and multiple optimality criteria.
Overall Goal in Selecting Methods
The overall goal in selecting basic research method(s) in engineering
design is to get the most useful information to key decision makers
in the most cost-effective and realistic fashion. Consider the
following questions:
1. What information is needed to make current decisions about a
product or technology?
2. How much information can be collected and analyzed using experiments, questionnaires, surveys and checklists?
3. How accurate will the information be?
4. Will the methods get all of the needed information?
5. What additional methods should and could be used if
additional information is needed?
6. Will the information appear as credible to decision makers, e.g., to
engineers or top management?
7. How can the information be analyzed?
Definition of Engineering tasks
Simulation – used for analysis of an object with a given structure; the aim is to evaluate the behaviour of the object depending on given values of the input parameters.
Analysis – the parameter set and structure of the object are given; the aim is to analyse the behaviour of the object caused by changes of the parameter values.
Diagnostics – inverse to simulation; the aim is to determine the parameters providing a prescribed behaviour.
Synthesis – inverse to analysis; the aim is to determine the structure and the values of the input parameters based on required output values.
Basic Guidelines to Problem Solving and
Decision Making
1. Define the problem.
If the problem still seems overwhelming, break it down by repeating the steps. Prioritize the problems.
2. Look at potential causes for the problem.
3. Identify alternatives for approaches to resolve the problem.
At this point, it is useful to keep others involved.
4. Select a method, tool, technique, etc. to solve the problem.
5. Plan the implementation of the best alternative solution (action plan).
6. Monitor implementation of the plan
7. Verify if the problem has been resolved or not
Response Surface Methodology (RSM)
There is a difference between data and information. To extract information from data
you have to make assumptions about the system that generated the data. Using these
assumptions and physical theory you may be able to develop a mathematical model of
the system. Generally, even rigorously formulated models have some unknown parameters.
Identifying those unknown constants and fitting an appropriate response surface model from experimental data requires knowledge of Design of Experiments, regression modelling techniques, and optimization methods.
The response surface equations give the response in terms of the several independent variables of the problem. If the response is plotted as a function of these variables, we obtain a response surface.
Response surface methodology (RSM) has two objectives:
1. To determine with one experiment where to move in the next experiment so as to
continually seek out the optimal point on the response surface.
2. To determine the equation of the response surface near the optimal point.
Response surface methodology (RSM) uses a two-step procedure aimed at rapid movement from the current position into the region of the optimum, followed by the characterization of the response surface in the vicinity of the optimum by a mathematical model. The basic tools used in RSM are two-level factorial designs and the method of least squares (regression), usually with simple polynomial models.
The Purpose of Modelling
1. To make an idea concrete. This is done by representing it mathematically, pictorially
or symbolically.
2. To reveal possible relationships between ideas. Relationships of hierarchy, support,
dependence, cause, effect, etc. can be revealed by constructing a model.
We have to be careful, then, how much we let our models control our thinking.
3. To simplify the complex design problem to make it manageable or understandable.
Almost all models are simplifications because reality is so complex.
4. The main purpose of modelling, which often includes all of the above three purposes, is to present a problem in a way that allows us to understand and solve it.
Types of Models
A. Visual. Draw a picture of it. If the problem is or contains something physical, draw a
picture of the real thing--the door, road, machine, bathroom, etc. If the problem is not
physical, draw a symbolic picture of it, either with lines and boxes or by representing
aspects of the problem as different items--like cars and roads representing information
transfer in a company.
B. Physical. The physical model takes the advantages of a visual model one step further by producing a three-dimensional visual model.
C. Mathematical. Many problems are best solved mathematically.
Complexity theory for engineering design
Three types of complexity:
• Description complexity
• Numerical complexity
• Understanding (recognition) complexity
• Numerical complexity theory is part of the theory of computation dealing
with the resources required during computation to solve a given problem.
The most common resources are time (how many steps does it take to
solve a problem) and space (how much memory does it take to solve a
problem). Other resources can also be considered, such as how many
parallel processors are needed to solve a problem in parallel. Complexity
theory differs from computability theory, which deals with whether a
problem can be solved at all, regardless of the resources required.
Complexity classes
(reference-functions Big O notation) :
• Logarithmic complexity, O(log n)
• Linear complexity, O(n)
• Polynomial complexity, O(n^q)
• Exponential complexity
• Factorial complexity
• Double-exponential complexity
Algorithmic complexity is concerned with how fast or slow a particular algorithm performs. We define complexity as a numerical function T(n): time versus the input size n. We want to define the time taken by an algorithm without depending on the implementation details.
Complexity classes
http://www.cs.cmu.edu/~adamchik/15121/lectures/Algorithmic%20Complexity/complexity.html
Asymptotic Notations
The goal of computational complexity is to classify algorithms according to their
performances. We will represent the time function T(n) using the "big-O" notation
to express an algorithm's runtime complexity. For example, the statement
T(n) = O(n^2)
says that the algorithm has quadratic time complexity.
Definition of "big Oh"
For any monotonic functions f(n) and g(n) from the positive integers to the positive integers, we say that f(n) = O(g(n)) when there exist constants c > 0 and n0 > 0 such that
f(n) ≤ c·g(n), for all n ≥ n0.
Intuitively, this means that the function f(n) does not grow faster than g(n), or that g(n) is an upper bound for f(n), for all sufficiently large n.
Complexity classes
Exercise. Let us prove n^2 + 2n + 1 = O(n^2).
We must find c and n0 such that n^2 + 2n + 1 ≤ c·n^2 for all n ≥ n0.
Let n0 = 1; then for n ≥ 1 (using 1 ≤ n ≤ n^2):
1 + 2n + n^2 ≤ n + 2n + n^2 ≤ n^2 + 2n^2 + n^2 = 4n^2
Therefore, c = 4.
Complexity classes
Constant Time: O(1)
An algorithm is said to run in constant time if it requires the same amount of
time regardless of the input size.
Example:
array: accessing any element
Linear Time: O(n)
An algorithm is said to run in linear time if its time execution is directly
proportional to the input size, i.e. time grows linearly as input size increases.
Examples:
array: linear search, traversing, find minimum
i := 1
p := 1
for i := 1 to n
    p := p*i
    i := i+1
endfor
Ex: Find the complexity of the algorithm.
Operation count f(n) = 4n + 2, hence O(n).
Complexity classes
Logarithmic Time: O(log n)
An algorithm is said to run in logarithmic time if its time execution is
proportional to the logarithm of the input size.
Example: binary search
Quadratic Time: O(n2)
An algorithm is said to run in quadratic time if its time execution is proportional
to the square of the input size.
Examples: bubble sort, selection sort, insertion sort
for i := 1 to n
    for j := 1 to n
        A(i,j) := x
    endfor
endfor
Ex: Find the complexity of the algorithm.
Complexity classes
Ex: Find complexity of the algorithm
s := 0
for i := 1 to n
    for j := 1 to i
        s := s + j*(i-j+1)
    endfor
endfor
Complexity classes
Complexity for recursive algorithms
An initial problem with data capacity n is divided into b subproblems of equal capacity. Only a of them (a < b) are solved (the others are not needed):

T(n) = a·T(n/b) + f(n)    (1)

Theorem: Assume a ≥ 1 and b > 1 are constants, f(n) is a function, and T(n) is defined for non-negative n by formula (1). Then:
a) T(n) is O(n^(log_b a)) if f(n) is O(n^(log_b a − e)) (e a positive constant);
b) T(n) is O(n^(log_b a)·log n) if f(n) is O(n^(log_b a));
c) T(n) is O(f(n)) if f(n) is O(n^(log_b a + e)) (e a positive constant, and a·f(n/b) ≤ c·f(n)).
Ex1: apply the theorem to binary search.
Ex2: apply the theorem to a problem where a=2, b=4, f(n) = n^2 + 2n + 9.
Ex3: apply the theorem to a problem where a=2, b=4, f(n) = 3.
Ex4: find the number of operations for a=2, b=4, f(n)=2, n=2000.
Ex2: apply the theorem to a problem where a=2, b=4, f(n) = n^2 + 2n + 9:
T(n) = 2·T(n/4) + n^2 + 2n + 9
The complexity of f(n) is O(n^2 + 2n + 9) = O(n^2).
log_b a = log_4 2 = 0.5, so n^(log_b a) = n^0.5.
Since O(n^(0.5+1.5)) = O(n^2), case c) applies with constant e = 1.5.
Asymptotic estimate from case c):
O(f(n)) = O(n^2 + 2n + 9) = O(n^2)
Ex3: apply the theorem to a problem where a=2, b=4, f(n) = 3:
T(n) = 2·T(n/4) + 3
The complexity of f(n) is O(1).
log_b a = log_4 2 = 0.5, so n^(log_b a) = n^0.5.
Since O(n^(0.5−0.5)) = O(1), case a) applies with constant e = 0.5.
Asymptotic estimate from case a):
O(n^(log_b a)) = O(n^0.5)
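As a numeric illustration of Ex4, the recurrence can be expanded directly in MATLAB. This is a sketch, not part of the slides: the base case T(1) = 1 and rounding n/4 up to an integer are assumptions.

% recurrence.m: expand T(n) = 2*T(n/4) + 2 numerically (Ex4: a=2, b=4, f(n)=2)
function T = recurrence(n)
    if n <= 1
        T = 1;                            % assumed base case
    else
        T = 2*recurrence(ceil(n/4)) + 2;  % two quarter-size subproblems plus f(n) = 2
    end
end
% Usage: recurrence(2000); case a) of the theorem predicts growth of order n^0.5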
Descriptive Statistics (Excel,...)
Descriptive Statistics
Find the mean, median, mode, and range for the following list of values:
13, 18, 13, 14, 13, 16, 14, 21, 13
The mean is the usual average, so:
(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15
Note that the mean isn't a value from the original list. This is a common result. You
should not assume that your mean will be one of your original numbers.
The median is the middle value, so I'll have to rewrite the list in order:
13, 13, 13, 13, 14, 14, 16, 18, 21
There are nine numbers in the list, so the middle one will be the (9 + 1) ÷ 2 = 10 ÷ 2
= 5th number:
13, 13, 13, 13, 14, 14, 16, 18, 21
So the median is 14.
The mode is the number that is repeated more often than any other, so 13 is the
mode.
The largest value in the list is 21, and the smallest is 13, so the range is 21 – 13 = 8.
mean: 15
median: 14
mode: 13
range: 8
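The same estimates can be reproduced in MATLAB (a sketch; the range is computed as max minus min so that no toolbox is required):

x = [13 18 13 14 13 16 14 21 13];
mean(x)          % 15
median(x)        % 14
mode(x)          % 13
max(x) - min(x)  % 8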
Descriptive Statistics
Standard deviation: s = sqrt( Σ(x_i − x̄)² / (n − 1) )
Standard error: SE = s / sqrt(n)
The variance of a random variable X is its second central moment, the expected value of the squared deviation from the mean μ = E[X]:
Var(X) = E[(X − μ)²]
The variance is the square of the standard deviation.
Sample variance: s² = Σ(x_i − x̄)² / (n − 1)
In probability theory and statistics, the variance is used as a measure of how far a set of numbers is spread out. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value).
Descriptive Statistics
Kurtosis
In probability theory and statistics, kurtosis (from the Greek word κυρτός, kyrtos or
kurtos, meaning bulging) is a measure of the "peakedness" of the probability
distribution of a real-valued random variable, although some sources are insistent
that heavy tails, and not peakedness, is what is really being measured by kurtosis.
Higher kurtosis means more of the variance is the result of infrequent extreme
deviations, as opposed to frequent modestly sized deviations.
Kurtosis is a measure of how outlier-prone a distribution is. The kurtosis of the
normal distribution is 3. Distributions that are more outlier-prone than the normal
distribution have kurtosis greater than 3; distributions that are less outlier-prone
have kurtosis less than 3.
The kurtosis of a distribution is defined as Kurt[X] = E[(X − μ)⁴] / σ⁴.
Descriptive Statistics
Skewness
In probability theory and statistics, skewness is a measure of the asymmetry of
the probability distribution of a real-valued random variable. The skewness value
can be positive or negative, or even undefined. Qualitatively, a negative skew
indicates that the tail on the left side of the probability density function is longer
than the right side and the bulk of the values (possibly including the median) lie
to the right of the mean. A positive skew indicates that the tail on the right side is
longer than the left side and the bulk of the values lie to the left of the mean. A
zero value indicates that the values are relatively evenly distributed on both
sides of the mean, typically but not necessarily implying a symmetric distribution.
The skewness of a random variable X is the third standardized moment, denoted γ1 and defined as γ1 = E[(X − μ)³] / σ³.
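Both measures are available in the MATLAB Statistics Toolbox; a short sketch with an arbitrary data vector:

x = randn(1000,1);  % arbitrary sample
skewness(x)         % close to 0 for symmetric data
kurtosis(x)         % close to 3 for normally distributed data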
Correlation
Correlation is a statistical measure that indicates the extent to which two or
more variables fluctuate together. A positive correlation indicates the extent to
which those variables increase or decrease in parallel; a negative correlation
indicates the extent to which one variable increases as the other decreases.
Samples: http://www.mathsisfun.com/data/correlation.html
The local ice cream shop keeps track of how much ice cream they sell versus the temperature on that day. Here are their figures:
Correlation
Ice Cream Sales vs Temperature

Temperature °C    Ice Cream Sales
14.2°             $215
16.4°             $325
11.9°             $185
15.2°             $332
18.5°             $406
22.1°             $522
19.4°             $412
25.1°             $614
23.4°             $544
18.1°             $421
22.6°             $445
17.2°             $408
Correlation
Example of correlation using Excel
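The same coefficient can be computed in MATLAB; a sketch using the ice cream data above:

temp  = [14.2 16.4 11.9 15.2 18.5 22.1 19.4 25.1 23.4 18.1 22.6 17.2];
sales = [215 325 185 332 406 522 412 614 544 421 445 408];
R = corrcoef(temp, sales);
R(1,2)   % approx. 0.96, a strong positive correlation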
Data analysis
Estimation of the variance of data series using F-test
http://en.wikipedia.org/wiki/F-test_of_equality_of_variances
This F-test is known to be extremely sensitive to non-normality
Estimation of the variance of data series using F-test
Excel Data Analysis function: F-Test Two-Sample for Variances.
Example: F-Test Two-Sample for Variances

                       Variable 1    Variable 2
Mean                   4.642857      4.071429
Variance               6.708791      3.917582
Observations           14            14
df                     13            13
F                      1.712482
P(F<=f) one-tail       0.172124
F Critical one-tail    2.576927

P(F<=f) one-tail = 0.172124. Generally, if this value is less than 0.05, you assume that the variances are NOT equal.
Estimation of the variance of data series using F-test
https://controls.engin.umich.edu/wiki/index.php/Factor_analysis_and_ANOVA
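In MATLAB, the counterpart of Excel's F-Test Two-Sample for Variances is vartest2 from the Statistics Toolbox (a sketch; the two data series here are placeholders):

x1 = randn(14,1);  x2 = randn(14,1);  % placeholder data series
[h, p] = vartest2(x1, x2)             % h = 0: equal variances not rejected at the 5% level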
Homework 1: Statistical evaluation of test data
Descriptive Statistics
1. Calculate basic descriptive statistics for two series of test data (data series
selected by yourself).
2. Estimate the relations between these two data series (correlation, covariance).
3. Estimate the variance of data series using F-test.
If F_observed ≤ F_critical, then the null hypothesis is valid and the difference between the variances of the two data series is random. F_critical is the table value taken at a given confidence level (95% or 99%, etc.). If F_observed > F_critical, we conclude with 95% confidence that the null hypothesis is false.
Null hypothesis H0: all sample means arising from different factors are equal.
Alternative hypothesis Ha: the sample means are not all equal.
Homework 1: all work done should be explained (meaning of parameters, etc.)
1.Explain results of descriptive statistics
2. Explain relations between data series
3. Explain variance of data series
MATLAB Statistics
Descriptive Statistics – Data summaries
Statistical Visualization – Data patterns and trends
Probability Distributions – Modeling data frequency
Hypothesis Tests – Inferences from data
Analysis of Variance – Modeling data variance
Regression Analysis – Continuous data models
Multivariate Methods – Visualization and reduction
Cluster Analysis – Identifying data categories
Model Assessment – Identifying data categories
Classification – Categorical data models
Hidden Markov Models – Stochastic data models
Design of Experiments – Systematic data collection
Statistical Process Control – Production monitoring
Linear regression analysis
The earliest form of regression was the method of least squares, which was published by
Legendre in 1805, and by Gauss in 1809.
Regression models
Regression models involve the following variables:
• The unknown parameters, denoted as β, which may represent a
scalar or a vector.
• The independent variables, X.
• The dependent variable, Y.
A regression model relates Y to a function of X and β.
The approximation is usually formalized as E(Y | X) = f(X, β).
To carry out regression analysis, the form of the function f must be
specified. Sometimes the form of this function is based on knowledge
about the relationship between Y and X that does not rely on the data.
If no such knowledge is available, a flexible or convenient form for f is
chosen.
Linear regression models
Regression analysis: plan of observations (experiments)
Regression analysis (experiment plan)

Experiment    Inputs: x1    x2    x3    Output: y
1             1             12    10    1
2             2             14    9     3
3             6             23    8     6
4             5             12    7     8
5             6             35    6     10
6             7             36    4     12
7             8             31    6     15
8             9             32    3     21
9             12            12    2     23
10            13            16    2     40
Regression analysis (transformed experiment plan)

x1*x1    sqrt(x2)    x3*x3*x3    y
1        3.464102    1000        1
4        3.741657    729         3
36       4.795832    512         6
25       3.464102    343         8
36       5.916080    216         10
49       6           64          12
64       5.567764    216         15
81       5.656854    27          21
144      3.464102    8           23
169      4           8           40
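The transformed plan can be fitted directly with regress from the Statistics Toolbox. A sketch, assuming the linear-in-parameters model y = b0 + b1*x1^2 + b2*sqrt(x2) + b3*x3^3 implied by the table:

x1 = [1 2 6 5 6 7 8 9 12 13]';
x2 = [12 14 23 12 35 36 31 32 12 16]';
x3 = [10 9 8 7 6 4 6 3 2 2]';
y  = [1 3 6 8 10 12 15 21 23 40]';
X  = [ones(size(y)), x1.^2, sqrt(x2), x3.^3];  % transformed regressors
b  = regress(y, X)                             % coefficients b0..b3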
Example of regression analysis results:
http://www.excel-easy.com/examples/regression.html
Linearization
y = a0 + a1·x1 + a2·x2 + a3·x3 + a12·x1·x2 + a13·x1·x3 + a23·x2·x3 + a11·x1² + a22·x2² + a33·x3²

Nonlinear equation               Linearized equation       W             V
y = a + b·x (linear)             y = a + b·x               y             x
y = a·x^b (logarithmic)          ln y = ln a + b·ln x      ln y          ln x
y = a·e^(b·x) (exponential)      ln y = ln a + b·x         ln y          x
y = 1 − e^(−b·x) (exponential)   ln(1/(1−y)) = b·x         ln(1/(1−y))   x
y = a + b·√x (square root)       y = a + b·v               y             √x
y = a + b/x (inverse)            y = a + b·v               y             1/x

Linearized variables: W (transformed response) and V (transformed input).
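For example, the exponential row of the table can be fitted with plain polyfit after taking logarithms (a sketch with synthetic data):

x = (0:0.5:4)';
y = 2*exp(0.8*x);            % synthetic data for y = a*exp(b*x)
c = polyfit(x, log(y), 1);   % fits ln y = ln a + b*x
b = c(1)                     % slope gives b
a = exp(c(2))                % intercept gives a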
MATLAB regression analysis
Regression Plots, Linear Regression,
Nonlinear Regression, Regression Trees,
Ensemble Methods
REGRESSION ANALYSIS. Matlab
Simple linear regression: y = b1·x + b0

% data
xx=[2.38 2.44 2.70 2.98 3.32 3.12 2.14 2.86 3.5 3.2 2.78 2.7 2.36 2.42 2.62 2.8 2.92 3.04 3.26 2.30];
yy=[51.11 50.63 51.82 52.97 54.47 53.33 49.90 51.99 55.81 52.93 52.87 52.36 51.38 50.87 51.02 51.29 52.73 52.81 53.59 49.77];
% sort the data
[x,ind]=sort(xx);
y=yy(ind);
% linear regression
[c,s]=polyfit(x,y,1);           % structure s contains fields R, df, normr, ...
[Y,delta]=polyconf(c,x,s,0.05); % fit and confidence half-widths
% plot
plot(x,Y,'k-',x,Y-delta,'k--',x,Y+delta,'k--',x,y,'ks',[x;x],[Y;y],'k-')
xlabel('x (input)');
ylabel('y (response)');
REGRESSION ANALYSIS. Matlab
Multiple linear regression:

y = b0 + Σ_{j=1..k} b_j·x_j

Example 1: linear model for a cubic polynomial in one independent variable.
Let x1 = x, x2 = x², x3 = x³.
Linear model: y = b0 + b1·x1 + b2·x2 + b3·x3

Example 2: linear model for a quadratic polynomial in two independent variables.
Let x1 = x1, x2 = x2, x3 = x1², x4 = x2², x5 = x1·x2.
Linear model: y = b0 + b1·x1 + b2·x2 + b3·x3 + b4·x4 + b5·x5
Example

x1=[7.3 8.7 8.8 8.1 9.0 8.7 9.3 7.6 10.0 8.4 9.3 7.7 9.8 7.3 8.5 9.5 7.4 7.8 7.8 10.3 7.8 7.1 7.7 7.4 7.3 7.6]';
x2=[0.0 0.0 0.7 4.0 0.5 1.5 2.1 5.1 0.0 3.7 3.6 2.8 4.2 2.5 2.0 2.5 2.8 2.8 3.0 1.7 3.3 3.9 4.3 6.0 2.0 7.8]';
x3=[0.0 0.3 1.0 0.2 1.0 2.8 1.0 3.4 0.3 4.1 2.0 7.1 2.0 6.8 6.6 5.0 7.8 7.7 8.0 4.2 8.5 6.6 9.5 10.9 5.2 20.7]';
y=[0.222 0.395 0.422 0.437 0.428 0.467 0.444 0.378 0.494 0.456 0.452 0.112 0.432 0.101 0.232 0.306 0.0923 0.116 0.0764 0.439 0.0944 0.117 0.0726 0.0412 0.251 0.00002]';  % y must also be a column vector
X=[ones(length(y),1),x1,x2,x3,x1.*x2,x1.*x3,x2.*x3,x1.^2,x2.^2,x3.^2];
% Regression
[b,bcl,er,ercl,stat]=regress(y,X,0.05)
disp(['R2=' num2str(stat(1))])
disp(['F0=' num2str(stat(2))])
disp(['p-value=' num2str(stat(3))])
rcoplot(er,ercl)
In cases where the distribution of errors is asymmetric or prone to outliers, the statistics computed with the function regress() become unreliable. In that case the function robustfit() may be preferred.
% if the distribution of errors is asymmetric or prone to outliers
X2=X(:,2:end);               % predictor columns only: robustfit() adds the intercept itself
robustbeta = robustfit(X2,y)
Design of experiment
The aim in general:
to extract as much information as possible from a limited set of experiments or computer simulations;
to maximize the information content of the measurements/simulations in the context of their utilization for estimating the model parameters.
In particular:
the selection of the points where the response should be evaluated.
Can be applied for model design with:
experimental data,
simulation data.
Can be applied for fitting models of:
objective functions,
constraint functions.
Why DOE
DOE allows the simultaneous investigation of the effects of a set of variables on a response in a cost-effective manner.
DOE is superior to the traditional one-variable-at-a-time method, which fails to consider possible interactions between the factors.
DOE techniques selection
DOE was introduced for describing real-life problems:
in the 1920s by R. Fisher, for agricultural experiments;
by G. Box in the 1950s, for modeling chemical experiments;
nowadays in various engineering applications, production planning, etc.
A huge number of DOE methods are available in the literature, and selecting the most suitable method is not always a simple task. The preparative activities needed are:
– formulation of the problem to be modelled by DOE,
– selection of the response variable(s),
– choice of factors (design variables),
– determining ranges for the design variables.
If this preliminary analysis is successfully done, then the selection of a suitable DOE method is simpler. The selection of the levels of the factors is also often classified as preliminary work.
In the following, two DOE methods are selected and discussed in more detail:
the Taguchi method,
which allows one to obtain a preliminary robust design with a small number of experiments; it is most often applied at early stages of process development or used as an initial design;
full factorial design,
which is resource-consuming but leads to more accurate results.
DOE techniques selection
Note that the Taguchi design can be obtained from a full factorial design by omitting certain design points. There are also several approaches based on full factorial design; for example, the central composite design can be obtained from a 2^N full factorial design by including additional centre and axial points.
Other well-known DOE techniques include the D-optimality criterion, Latin hypercube, van Keulen scheme, etc. The D-optimality criterion is based on maximization of the determinant of XᵀX, where X stands for the matrix of the design variables. Application of the D-optimality criterion yields the minimum of the maximum variance of predicted responses (the errors of the model parameters are minimized). The Latin hypercube design maximizes the minimum distance between design points, but requires even spacing of the levels of each factor [12]. The van Keulen scheme is useful in cases where the model building is to be repeated within an iterative scheme, since it adds points to an existing plan.
DOE methods vs selection criteria
Full-Factorial Experiments
Full factorial design
In order to overcome shortcomings of the Taguchi method, the full factorial design can be applied (Montgomery, 1997). This approach captures interactions between design variables, including all possible combinations. According to the full factorial design strategy, the design variables are varied together, instead of one at a time. First the lower and upper bounds of each design variable are determined (estimated values are used if exact values are not known). Next the design space is discretized by selecting level values for each design variable. The experimental design is then classified in the following manner:
2^N full factorial design – each design variable is defined at only the lower and upper bounds (two levels);
3^N full factorial design – each design variable is defined at the lower and upper bounds and also at the midpoints (three levels).
In the case of N=3, the 3^N full factorial design contains 27 design points, shown in Fig. 1.
Full factorial design
The full factorial design considered includes all possible combinations of design variables and can be presented in the form of a general second-order polynomial:

y = c0 + Σ_{i=1..N} ci·xi + Σ_{i=1..N} cii·xi² + Σ_{i,j=1..N; j>i} cij·xi·xj    (2)

In (2), xi stand for the design variables and c0, ci, cii, cij are the model parameters.
It should be noted that the second-order polynomial based mathematical model given by formula (2) is just one possibility for response modeling. This model is widely used due to its simplicity. In the current study the full factorial design is used for determining the dataset where the response should be evaluated, but instead of (2), artificial neural networks are employed for response modeling.
Evidently, the number of experiments grows exponentially in the case of a 3^N full factorial design (also for 2^N). Thus, such an approach becomes impractical in the case of a large number of design variables. A full factorial design is typically used for five or fewer variables.
Fractional factorial design
In the case of a large number of design variables, a fraction of a full factorial design is most commonly used, considering only a few combinations between variables [11]. The one-third fractions for a 3^3 factorial design are depicted in Fig. 2 [11].
Figure 2. One-third fractions for a 3^3 full factorial design ([11], [31])
Thus, in the latter case the number of experiments is reduced to one third in comparison with the 3^3 full factorial design (from 27 to 9). The cost of this simplification is that only a few combinations between variables are considered.
Central composite design
CCDs are first-order (2^N) designs with additional centre and axial points to allow estimation of the tuning parameters of a second-order model. A CCD for 3 design variables is shown in Figure 3.
Figure 3. Central composite design for 3 design variables at 2 levels
The CCD shown involves 8 factorial points, 6 axial points and 1 central point. CCD presents an alternative to 3^N designs in the construction of second-order models, because the number of experiments is reduced compared to a full factorial design (15 in the case of CCD compared to 27 for a full factorial design). In the case of problems with a large number of design variables, the experiments may be time-consuming even with the use of CCD.
Parameter study
One at a Time
Latin Hypercubes
The Taguchi method
The Taguchi approach is a more effective method than traditional design of experiment methods such as factorial design, which are resource- and time-consuming. For example, a process with 8 variables, each with 3 states, would require 3^8 = 6561 experiments to test all variables (full factorial design). However, using Taguchi's orthogonal arrays, only 18 experiments are necessary, i.e. less than 0.3% of the original number of experiments.
It is fair to point out also the limitations of the Taguchi method. The most critical drawback of the Taguchi method is that it does not account for higher-order interactions between design parameters. Only main effects and two-factor interactions are considered.
Taguchi methods, developed by Dr. Genichi Taguchi, are based on the following two ideas:
Quality should be measured by the deviation from a specified target value, rather than by conformance to preset tolerance limits;
Quality cannot be ensured through inspection and rework, but must be built in through the appropriate design of the process and product.
In the Taguchi method, two kinds of factors, control factors and noise factors, are considered to study their influence on the output parameters. The control factors are used to select the best conditions for a manufacturing process, whereas the noise factors denote all factors that cause variation. The signal-to-noise (SN) ratio is used to find the best set of design variables.
The Taguchi method
According to the performance characteristics analysis, the Taguchi approach is classified
into three categories:
Nominal-the-Better (NB),
Higher-the-Better (HB),
Lower-the-Better (LB).
In the following, the Lower-the-Better (LB) approach is employed in order to minimize the objective functions. The SN ratio is calculated as follows:

SN_i = −10·log10( Σ_{k=1..N_i} y_k² / N_i )

where i, k and N_i stand for the experiment number, the trial number and the number of trials for experiment i, respectively.
The results obtained from the Taguchi Method can (should) be validated by the
confirmation tests. The validation process is performed by conducting the
experiments with a specific combination of the factors and levels not considered in
initial design data.
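A minimal sketch of the Lower-the-Better SN ratio in MATLAB; Y is assumed to be a matrix whose i-th row holds the N_i trial results of experiment i:

Y = [2.1 2.3 2.0; 3.4 3.1 3.3];  % placeholder trial results: 2 experiments, 3 trials each
SN = -10*log10(mean(Y.^2, 2))    % SN_i = -10*log10( sum_k y_k^2 / N_i ), one value per row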
The Taguchi method
Array Selector
(https://controls.engin.umich.edu/wiki/index.php/Design_of_experiments_via_taguchi_m
ethods:_orthogonal_arrays)
L4 array:
The Taguchi method
Impeller model: A, B, or C; Mixer speed: 300, 350, or 400 RPM
Control algorithm: PID, PI, or P;
Valve type: butterfly or globe
There are 4 parameters, and each one has 3 levels with the exception of valve type. The
highest number of levels is 3, so we will use a value of 3 when choosing our orthogonal
array.
Using the array selector above, we find that the appropriate orthogonal array is L9:
When we replace P1, P2, P3, and P4 with our parameters and begin filling in the
parameter values, we find that the L9 array includes 3 levels for valve type, while our
system only has 2. The appropriate strategy is to fill in the entries for P4=3 with 1 or 2
in a random, balanced way.
The Taguchi method
If the array selected based on the number of parameters and levels includes more
parameters than are used in the experimental design, ignore the additional parameter
columns. For example, if a process has 8 parameters with 2 levels each, the L12 array
should be selected according to the array selector. As can be seen below, the L12
Array has columns for 11 parameters (P1-P11). The right 3 columns should be ignored.
Design of experiment. Factorial Designs
EXPERIMENT PLAN (allows finding the interaction terms x12 = x1·x2, x13 = x1·x3, x23 = x2·x3, x1·x2·x3)

Run    x1    x2    x3    x12    x13    x23    x1·x2·x3
1       1     1     1     1      1      1      1
2      -1     1     1    -1     -1      1     -1
3       1    -1     1    -1      1     -1     -1
4      -1    -1     1     1     -1     -1      1
5       1     1    -1     1     -1     -1     -1
6      -1     1    -1    -1      1     -1      1
7       1    -1    -1    -1     -1      1      1
8      -1    -1    -1     1      1      1     -1
Additional runs at the centre point: x1 = x2 = x3 = 0

We obtain the equation y = a0 + a1·x1 + a2·x2 + a3·x3, and additionally the terms a12·x12, a13·x13, a23·x23, a123·x1·x2·x3.
To use the standard plans, the following coding of the parameters is needed:
x_i = +1 corresponds to X_i,max
x_i = −1 corresponds to X_i,min
This coding corresponds to the following linear transformation of the initial parameters X_i:

x_i = 2·(X_i − X_i,min) / (X_i,max − X_i,min) − 1
Calculation of Effects
Calculation of the main effects. With a factorial design, the average main effect of changing the thrombin level from low to high can be calculated as the average response at the high level minus the average response at the low level:

Main effect of X_i = ( Σ responses at high X_i − Σ responses at low X_i ) / (half the number of runs in the experiment)
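A sketch of this formula for the 2^3 plan above; the response vector y is made up purely for illustration:

x1 = [ 1 -1  1 -1  1 -1  1 -1]';  % x1 column of the 2^3 plan
y  = [10  8 12  7 11  9 13  6]';  % hypothetical responses
effect_x1 = mean(y(x1 == 1)) - mean(y(x1 == -1))  % equals (sum at high - sum at low)/(runs/2)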
Response modeling
Response modelling (RM) is a widely used technique in engineering design. Most commonly, response modeling is used to compose a mathematical model describing the relation between the input data and the objective functions. In general such an approach can be applied in the case of time-consuming and expensive experimental studies, but also for modeling complex numerical simulations. The mathematical model is composed on the basis of "learning data" and can be used for evaluating the objective function(s) for any set of input data in the design space considered. Some special cases can also be pointed out where it is reasonable to apply response modeling:
* Experimental study is not expensive or time-consuming, but it cannot be performed in a certain sub-domain of the design space (technological or other limitations);
* Numerical simulations are not expensive or time-consuming, but there are singularities in a certain sub-domain of the design space.
The RM techniques are commonly used for describing objective functions, but can actually be applied with the same success for describing constraint functions (any functions).
Finally, it is fair to point out some special situations where the application of ANN or other meta-modeling techniques may not be successful:
* The design space is poorly covered by "learning data" (data used in DOE): there are too few results, or the results are non-uniformly distributed and some sub-domains are not covered;
* The initial design space is well covered by DOE data, but there is a need to evaluate function values outside of the initial design space.
Artificial neural networks (ANN)
An Artificial Neural Network (ANN) is an information processing paradigm inspired by the human brain. ANNs consist of a large number of highly interconnected neurons working together to solve a particular problem. Each particular ANN is designed and configured for solving a certain class of problems, such as
data classification,
function approximation,
pattern recognition,
through a learning process. In the following, ANNs are considered for function approximation. The theoretical concepts of ANN were introduced in the 1940s; the first neural model was proposed by McCulloch and Pitts in 1943 using a simple neuron (the values of the input, output and weights were restricted to binary values, etc.). Significant progress in ANN development was achieved by Rosenblatt in 1957, who introduced the one-layer perceptron and a neurocomputer (Mark I Perceptron). A multi-layer perceptron (MLP) model was taken into use in the 1960s. The learning algorithm (backpropagation algorithm) for the three-layer perceptron was proposed by Werbos in 1974. The MLP became popular after 1986, when Rumelhart and McClelland generalized the backpropagation algorithm for the MLP. Radial Basis Function (RBF) networks were first introduced by Broomhead & Lowe in 1988, and the Self-Organizing Map (SOM) network model by Kohonen in 1982.
ANN: neuron and transfer functions (McCulloch and Pitts, 1943)
Logistic sigmoid transfer function: f(x) = 1/(1 + e^(−x))
Artificial neural networks (ANN)
A simple two-layer perceptron is shown in Figure 3:

g = y3 = f3( w31·f1(w11·x1 + w12·x2 + w10) + w32·f2(w21·x1 + w22·x2 + w20) + w30 )

Briefly, the mathematical model of the two-layer network is obtained by:
1. summarizing the inputs multiplied by the weights of the input layer, adding the bias;
2. applying the transfer function for each neuron;
3. summarizing the obtained results multiplied by the weights of the hidden layer, adding the bias;
4. applying the transfer function of the output layer.
Artificial neural networks (ANN)
Most commonly for function approximation, radial basis and linear transfer functions are used in the hidden and output layers, respectively.
The architecture of the ANN considered is shown in the figure: a two-layer feedforward neural network.
Note that in the literature the input layer is commonly not counted, so the ANN shown in Fig. 3 is considered a two-layer network (the input layer has no transfer function, just input data and weights).
Artificial neural networks (ANN)
The most commonly used backpropagation learning algorithm is the steepest descent method (one of the gradient methods). However, the shortcoming of this method is its slow convergence. Newton's method has a good convergence rate (quadratic), but it is sensitive with respect to the initial data. For that reason, the Levenberg–Marquardt learning algorithm, which has a second-order convergence rate, is used herein. The update rule of the Levenberg–Marquardt algorithm is a blend of simple gradient descent and the Gauss–Newton method and is given as

x_{i+1} = x_i − (H + μ·diag(H))^(−1)·∇f(x_i)

where H is the Hessian matrix evaluated at x_i, and μ and ∇f(x_i) stand for the scaling coefficient and the gradient vector, respectively. The Levenberg–Marquardt algorithm is faster than the pure gradient method and less sensitive with respect to the selection of the starting point than the Gauss–Newton method.
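A sketch of this update rule for a least-squares problem, with H approximated by J'J (Gauss–Newton); the model y = b1*exp(b2*x) and the data are illustrative assumptions, not from the slides:

f = @(b,x) b(1)*exp(b(2)*x);                  % assumed model
x = (0:0.5:3)';  y = 2*exp(0.7*x);            % synthetic data
b = [1; 1];  mu = 0.01;                       % initial guess and scaling coefficient
for iter = 1:50
    r = y - f(b,x);                           % residual vector
    J = [exp(b(2)*x), b(1)*x.*exp(b(2)*x)];   % Jacobian of the model
    H = J'*J;                                 % Gauss-Newton approximation of the Hessian
    b = b + (H + mu*diag(diag(H))) \ (J'*r);  % Levenberg-Marquardt update
end
b                                             % approaches [2; 0.7]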
Sensitivity analysis for ANN models
The sensitivity analysis applied on the trained neural network model allows identifying
all relevant and critical parameters from a total set of input parameters. Special
attention should be paid to calculation of the sensitivities in points corresponding to
optimal designs.
The output of the above-described three-layer perceptron network can be computed as

Y = G2( W2·G1( W1·X + Θ1 ) + Θ2 )

where X = [x1, ..., xn]ᵀ is the input vector, Y = [y1, ..., ym]ᵀ is the output vector, W1 (k×n) and W2 (m×k) stand for the weight matrices, Θ1 = [θ1_1, ..., θ1_k]ᵀ and Θ2 = [θ2_1, ..., θ2_m]ᵀ for the bias vectors, and G1, G2 are the transfer functions.
The sensitivity matrix S can be computed as the gradient of the output vector Y:

S = ∇Y

For the MLP:

S_MLP = ∂Y/∂X = (∂F2/∂Z2)·W2·(∂F1/∂Z1)·W1,
where Z1 = W1·X + Θ1 and Z2 = W2·G1(Z1) + Θ2.

For a network with N layers:

∂Y/∂X = (∂FN/∂ZN)·WN · ... · (∂F2/∂Z2)·W2·(∂F1/∂Z1)·W1,
where Z1 = W1·X + Θ1, Z2 = W2·G1(Z1) + Θ2, ..., ZN = WN·G_{N−1}(Z_{N−1}) + ΘN.
ANN applications
Real-life applications. The tasks artificial neural networks are applied to tend
to fall within the following broad categories:
• Function approximation, or regression analysis, including time series
prediction, modeling.
• Classification, including pattern and sequence recognition, novelty detection
and sequential decision making.
• Data processing, including filtering, clustering and compression.
• Robotics, including directing manipulators
• Control, including Computer numerical control
Application areas include system identification and control (vehicle control,
process control, natural resources management), game-playing and decision
making, pattern recognition (radar systems, face identification, object
recognition and more), sequence recognition (speech, handwritten text
recognition), medical diagnosis, financial applications (automated trading
systems), data mining (or knowledge discovery in databases), visualization
and e-mail spam filtering.
Neural Network Toolbox
NN Toolbox provides functions for modeling complex
nonlinear systems that are not easily modeled with
traditional methods.
Neural Network Toolbox supports supervised learning with feedforward and dynamic networks. It also supports
unsupervised learning with self-organizing maps and
competitive layers. With the toolbox you can design, train,
visualize, and simulate neural networks.
Neural network model: using newff

Program example (a new example is needed): Reinforcement: yes, no
% Initial data
p=[
3 3 1 2 3 2 1 3 3 2 2 1 2 1 1 2 3 2 1 2; ...
2 1 1 0 2 2 2 0 2 0 1 1 1 0 2 2 1 2 2 0; ...
2 3 2 2 1 2 3 3 1 1 1 0 0 2 3 2 3 1 1 2; ...
2 0 1 2 2 0 1 0 1 1 2 2 2 0 0 2 0 2 1 0; ...
2 3 2 1 1 2 3 3 1 2 2 3 2 1 2 3 2 1 2 3; ...
2 2 2 1 1 1 1 2 2 1 1 2 2 2 1 2 1 2 1 1]
t=[1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0]
% Define the neural network model
net=newff(p,t,[4,1],{'logsig','logsig'},'trainlm');
% Change the training parameters
net.trainParam.show=100;
net.trainParam.lr=0.05;
net.trainParam.epochs=3000;
net.trainParam.goal=0.00001;
[net,tr]=train(net,p,t);
% Check the solution
a=round(sim(net,p))
% Output the network parameters
bias1=net.b{1}
bias2=net.b{2}
weights1=net.IW{1,1}
weights2=net.LW{2,1}
Neural network model: using newff
P = [0 1 2 3 4 5 6 7 8 9 10];
T = [0 1 2 3 4 3 2 1 2 3 4];
% two-layer feed-forward network; input ranges from [0 to 10].
% The first layer has five tansig neurons, the second layer has one purelin neuron.
% The trainlm network training function is to be used.
net = newff([0 10],[5 1],{'tansig' 'purelin'});
%Here the network is simulated
Y = sim(net,P);
plot(P,T,P,Y,'o')
%network is trained for 50 epochs.
net.trainParam.epochs = 50;
net = train(net,P,T);
% new simulation and plot
Y2 = sim(net,P);
plot(P,T,P,Y,'o',P,Y2,'rx')
Neural network model: using newff
%1. Initial data, Input data (variables, levels)
x1L=[1, 1.5, 2, 2.5]
x2L=[1, 1.5, 2, 2.5]
x3L=[0.5, 1.0, 1.5, 2]
% a full factorial design would need 4^3 = 64 experiments
%2. input corresponding to Taguchi design L'16
x1=[1,1,1,1,1.5,1.5,1.5,1.5,2,2,2,2,2.5,2.5,2.5,2.5]
x2=[1,1.5,2,2.5,1,1.5,2,2.5,1,1.5,2,2.5,1,1.5,2,2.5]
x3=[0.5,1,1.5,2,1,0.5,2,1.5,1.5,2,0.5,1,2,1.5,1,0.5]
S=[x1;x2;x3]
%3. we use this formula instead of experiments
V=x1.^2+x2.^2+x3.^2
%4. Function approximation using ANN
global net;
% defining network
net=newff(S,V, [15,1],{'tansig','purelin'},'trainlm');
net.trainParam.lr=0.05; % learning rate
net.trainParam.epochs=50;
%Train network
net = train(net, S, V);
Neural network model: model validation
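One way to carry out the validation step, continuing the example above: evaluate the trained network at points not used in training and compare with the known function. The test points here are arbitrary assumptions:

St = [1.2 1.8 2.2; 1.4 2.1 1.1; 0.7 1.3 1.9];  % three test points (rows: x1, x2, x3)
Vt = sum(St.^2, 1);                            % exact values from the same formula
Vp = sim(net, St);                             % network predictions
err = max(abs(Vp - Vt))                        % simple accuracy estimate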
Homework 2
Complete the MATLAB application started at the lecture (see slides 76–77).
% 1. Improve response surface accuracy
% a) use Taguchi design with 5 levels L25
% b) use full factorial design (remain on 4 levels)
% 2. test results with some test data
• Explain the DOE methods used (advantages, disadvantages)
• Solve the problem by applying linear regression, compare results
• Estimate the accuracy of the model
• Search for a better configuration of the ANN (number of neurons, layers, selection of transfer fn)
• Try other functions for ANN models (newgrnn or feedforward)
• Describe the meaning of the parameters, compare results
NB: Finally, generate a new dataset of capacity
a) 100; b) 200; c) 300
using random values (S=2*rand(3,100)+0.5);
use the same function to obtain "test" results; compare and estimate the results.
Optimal design of engineering systems
In general, optimal design in engineering can be defined as the search for solutions providing the minimum (or maximum) value of the objective function while satisfying the given constraints.
Typical objective functions: cost of the product, service time, net profit, strength (stiffness) characteristics, quality characteristics, etc.
The constraints may be technological, resource-related, etc.
Optimal design is especially needed in the case of complex problems, and also when, instead of a particular product, systems and/or processes are subjected to optimization.
Types of Optimization Tasks
Some specific research areas: operations research, decision theory, ...
Problems may be classified into:
• Decision making
• Planning (design of system parameters)
Depending on the model used, the problem may be classified as:
• Deterministic
• Stochastic
or
• One-step design
• Iterative design
Classification depending on the solution method:
• Analytical solution
• Numeric solution
• Semianalytical solution
Planning can be divided into:
• Linear planning
• Integer planning
• Mixed integer planning
• Nonlinear planning
• Game theory
Decision making
In science and engineering, preference refers to the set of assumptions related to ordering some alternatives, based on the degree of satisfaction or utility they provide, a process which results in an optimal "choice" (whether real or theoretical).
1. Deterministic search
• Simple, with one criterion
• Multicriteria
2. Stochastic search (incomplete information)
Deterministic search has two steps: search for a feasible solution, then search for the best solution.
Alternative:                     1  2  3  4  5  6  7  8  9  10
Selection criterion
(objective function) max G1(x):  2  4  5  7  4  3  6  4  3  5
Search in the case of limited/incomplete information:
1. Variants are evaluated by mean values
2. Confidence levels and variance are considered
3. Correlation is considered, etc.
Multicriteria search (decision making), optimal portfolio selection

Alternative:   1  2  3  4  5  6  7  8  9  10
max G1(x):     2  4  5  7  4  3  6  4  3  5
min G2(x):     1  2  3  4  5  3  4  3  2  5
Pareto concept

Decision:   A  B  C  D  E
Max G1:     1  2  3  4  5
Max G2:     2  4  3  4  4
Min G3:     1  1  2  2  4

• Solution B dominates A
• Solution D dominates C
Pareto optimal set: B, D, E

Decision:   B  D  E
Max G1:     2  4  5
Max G2:     4  4  4
Min G3:     1  2  4
Normalization of optimality criteria

For maximization: f_i = (F_i(x) − F_i,min) / (F_i,max − F_i,min)
For minimization: f_i = (F_i,max − F_i(x)) / (F_i,max − F_i,min)

Selection:   A  B     C     D     E
Max G1:      0  0.25  0.5   0.75  1
Max G2:      0  1     0.5   1     1
Max(−G3):    1  1     0.67  0.67  0
Approach 3. Preferred functions

Preference scale:
               G1               G2               −G3
Ideal          G1 ≥ 5.0         G2 ≥ 4.0         −G3 ≥ 4.0
Very good      4.0 ≤ G1 < 5.0   3.5 ≤ G2 < 4.0   3.25 ≤ −G3 < 4.0
Good           3.0 ≤ G1 < 4.0   3.0 ≤ G2 < 3.5   2.5 ≤ −G3 < 3.25
Satisfactory   2.0 ≤ G1 < 3.0   2.5 ≤ G2 < 3.0   1.75 ≤ −G3 < 2.5
Not suggested  1.0 < G1 < 2.0   2.0 < G2 < 2.5   1.0 < −G3 < 1.75
Not allowed    G1 ≤ 1.0         G2 ≤ 2.0         −G3 ≤ 1.0

Each solution by criteria:
Decision (choice alternative)   Max G1        Max G2       Max(−G3)
A                               Not allowed   Not allowed  Ideal
B                               Satisfactory  Ideal        Ideal
C                               Good          Good         Very good
D                               Very good     Ideal        Good
Weighted summation based search
Gsum = W1·X1 + W2·X2 + W3·X3, where W1 + W2 + W3 = 1.0

Productivity of each person:
          a     b     c
Person 1  0.8   0.5   0.35
Person 2  0.15  0.4   0.9
Person 3  0.2   0.3   0.1
MATHEMATICAL optimization PROBLEM
STATEMENT
We describe a step-by-step procedure for creating optimization models
The procedure for formulating an optimization model is as follows:
• Decision variables. Identify those aspects of a problem that can be adjusted to improve performance, and represent them as decision variables.
• Design constraints. Identify the restrictions or constraints that bound the problem. Express the restrictions/constraints as mathematical equations.
• Design objectives. Express the system effectiveness measures as one or more objective functions.
Definition of Optimization problem
The goal of an optimization problem can be formulated as follows: find the
combination of parameters (independent variables) which optimize a given
quantity, possibly subject to some restrictions on the allowed parameter
ranges. The quantity to be optimized (maximized or minimized) is termed the
objective function; the parameters which may be changed in the quest for the
optimum are called control or decision variables; the restrictions on allowed
parameter values are known as constraints.
The general optimization problem may be stated mathematically as:

minimize f(x), x = (x1, ..., xn)ᵀ
subject to c_i(x) = 0 or c_i(x) ≥ 0, i = 1, ..., m

where f(x) is the objective function, x is the column vector of the independent variables, and c_i(x) is the set of constraint functions. Constraint equations of the form c_i(x) = 0 are termed equality constraints, and those of the form c_i(x) ≥ 0 are inequality constraints. Taken together, f(x) and c_i(x) are known as the problem functions.
Classical problems
Determining the extremum of a one-variable function:

df(x)/dx = 0
d²f(x)/dx² < 0 → local maximum
d²f(x)/dx² > 0 → local minimum
Lagrange method
Objective function: f(x, y)
Constraints φ1 and φ2 in the form φ1(x, y) = 0, φ2(x, y) = 0.
Lagrange function L:

L(x, y, λ1, λ2) = f(x, y) + λ1·φ1(x, y) + λ2·φ2(x, y)

The optimal solution satisfies the following equations:

∂L/∂x = 0;  ∂L/∂y = 0;  ∂L/∂λ1 = 0;  ∂L/∂λ2 = 0
Linear planning
Given: matrix A and vectors b and c.
Find the maximum of the function f(x) = cᵀx, where Ax ≤ b and x ≥ 0.
Alternatively: max { cᵀx : Ax ≤ b, x ≥ 0 }

Example
Max (x1)
where
x1 <= 3
x1 + x2 <= 5
x => 0
Linear planning (2)
Example. Suppose that we wish to maximize
Objective = 10*x1 + 11*x2
subject to the constraint pair:
5*x1 + 4*x2 <= 40
2*x1 + 4*x2 <= 24
and x1 >= 0 and x2 >= 0. Together these four constraints define the feasible domain shown in the figure.
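The same example can be solved with linprog from the Optimization Toolbox (a sketch; linprog minimizes, so the objective vector is negated):

f  = -[10; 11];                    % negated: linprog minimizes by convention
A  = [5 4; 2 4];
b  = [40; 24];
lb = [0; 0];
x  = linprog(f, A, b, [], [], lb)  % optimal vertex of the feasible domain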
Dual problem
Initial problem: max { cᵀx : Ax ≤ b, x ≥ 0 }
Dual problem: min { yᵀb : Aᵀy ≥ c, y ≥ 0 }
The primal linear program solution answers the tactical question when it tells us how much to produce. But the dual can have far greater impact, because it addresses strategic issues regarding the structure of the business itself.
Car protection system for Tarmetec OÜ (EU normatives)
Optimized component
FEA model of the fastening component
• FEA system: LS-Dyna
• >2000 fully integrated shell
elements
• Multi-linear material model
• Load case 1 – static
stiffness of the structure:
LS-Dyna implicit solver
(displacement caused by
forces Fx and Fy)
• Load case 2 – dynamic
properties of the structure:
LS-Dyna explicit solver
(reaction forces caused by
dynamic force Fz)
Design optimization
• The objectives: Min( (Fz,max − Fz,final), Fz,max ),
where Fz is the axial component of the reaction force under dynamic loading
• Design constraint: u_max ≤ [u_max],
where u_max is the maximum deformation under static loading (loads Fx and Fy) and [u_max] its allowed value
• Design variables: 4 (a, b, c and e)
Results
The optimization study helped to streamline the whole product portfolio:
• Reduced cost
• Reduced mass
• Improved safety

Initial design:
• Heavy tubes
• Stiff brackets
• Does not pass the tests

Optimized design:
• Light tubes (same diameter, thinner wall)
• Brackets with calibrated stiffness
• Passes the tests

[Plot: Force (N) versus Time (s) for the initial and optimised designs]
Design of large composite parts
Topology Optimization
• Software: OptiStruct (module of HyperWorks)
• Goal: to find the optimal thickness distribution of the reinforcement layer
• Constraints:
– Thickness variation 1...8 mm (up to 1...40 mm in the different studies)
– Max equivalent stress 80 MPa
– Max displacement 5 mm
• 7 different optimization studies (different values of maximum layer thickness)
• 50 to 100 iterations in each study
Design of reinforcement layer
The objectives:

F1(x) = C(x1, x2, ..., xn),  F2(x) = T(x1, x2, ..., xn)

subject to linear constraints:

x_i,min ≤ x_i ≤ x_i,max, i = 1, ..., n

and non-linear constraints:

u(x1, x2, ..., xn) ≤ u*,  σ_e(x1, x2, ..., xn) ≤ σ_e*

where
C(x) and T(x) are the cost and manufacturing time of the glass-fiber-epoxy layer;
x is the vector of design variables;
x_i,min and x_i,max are the lower and upper bounds of the i-th design variable;
u is the displacement;
σ_e is the effective stress;
u* and σ_e* are the corresponding upper limits.
Integer Programming
Integer programming: branch and bound
LP solution: x1 = 0.8, x2 = 2.4
Branching on x1 (x1 ≤ 0 or x1 ≥ 1) gives:
x1 = 0, x2 = 3, obj: 3
x1 = 1, x2 = 2, obj: 3
[Plot: feasible region of the LP with the constraint x1 + x2 ≤ 6 and the objective, axes x1 and x2]
Integer programming: branch and bound
http://www.sce.carleton.ca/faculty/chinneck/po.html
Integer programming: branch and bound
Max f = 5x1 + 8x2
x1 + x2 ≤ 6; 5x1 + 9x2 ≤ 45
x1, x2 ≥ 0, integer
Relaxation → solution: (x1, x2) = (2.25, 3.75); f = 41.25
Branching on the variable x2 yields two subproblems:

Subproblem 1:
Max f = 5x1 + 8x2
s.t. x1 + x2 ≤ 6
5x1 + 9x2 ≤ 45
x2 ≤ 3
x1, x2 ≥ 0
Opt. solution: (3, 3); f = 39

Subproblem 2:
Max f = 5x1 + 8x2
s.t. x1 + x2 ≤ 6
5x1 + 9x2 ≤ 45
x2 ≥ 4
x1, x2 ≥ 0
Opt. solution: (1.8, 4); f = 41

Solution tree:
All: (2.25, 3.75), f = 41.25
S1 (x2 ≤ 3): (3, 3), f = 39
S2 (x2 ≥ 4): (1.8, 4), f = 41

If further branching on a subproblem will yield no useful information, then we can fathom (dismiss) the subproblem: currently Subproblem 1.
Integer programming: branch and bound
The best integer solution found so far is stored as the incumbent. The value of the incumbent is denoted by f*. In our case, the first incumbent is (3, 3), and f* = 39.
Branch Subproblem 2 on x1:
Subproblem 3: new restriction x1 ≤ 1. Opt. solution (1, 4.44) with value 40.55.
Subproblem 4: new restriction x1 ≥ 2. The subproblem is infeasible.

Solution tree:
All: (2.25, 3.75), f = 41.25
S1 (x2 ≤ 3): (3, 3), f = 39
S2 (x2 ≥ 4): (1.8, 4), f = 41
S3 (x1 ≤ 1): (1, 4.44), f = 40.55
S4 (x1 ≥ 2): infeasible
Integer programming: branch and bound
Branch Subproblem 3 on x2:
Subproblem 5: new restriction x2 ≤ 4. Opt. solution (1, 4); f = 37.
Subproblem 6: new restriction x2 ≥ 5. Opt. solution (0, 5); f = 40.

Solution tree:
All: (2.25, 3.75), f = 41.25
S1 (x2 ≤ 3): (3, 3), f = 39
S2 (x2 ≥ 4): (1.8, 4), f = 41
S3 (x1 ≤ 1): (1, 4.44), f = 40.55
S4 (x1 ≥ 2): infeasible
S5 (x2 ≤ 4): (1, 4), f = 37
S6 (x2 ≥ 5): (0, 5), f = 40

If the optimal value of a subproblem is ≤ f*, then it is fathomed.
• In our case, Subproblem 5 is fathomed because 37 ≤ 39 = f*.
If a subproblem has an integral optimal solution x*, and its value > f*, then x* replaces the current incumbent.
• In our case, Subproblem 6 has an integral optimal solution, and its value 40 > 39 = f*. Thus, (0, 5) is the new incumbent, and the new f* = 40.
If there are no unfathomed subproblems left, then the current incumbent is an optimal solution for (IP).
• In our case, (0, 5) is an optimal solution with optimal value 40.
Nonlinear planning
min { f(x) : F(x) ≤ 0, x ≥ 0 }
Optimization Toolbox MATLAB
bintprog – Solve binary integer programming problems
fgoalattain – Solve multiobjective goal attainment problems
fminbnd – Find minimum of single-variable function on fixed interval
fmincon – Find minimum of constrained nonlinear multivariable function
fminimax – Solve minimax constraint problem
fminsearch – Find minimum of unconstrained multivariable function using derivative-free method
fminunc – Find minimum of unconstrained multivariable function
fseminf – Find minimum of semi-infinitely constrained multivariable nonlinear function
ktrlink – Find minimum of constrained or unconstrained nonlinear multivariable function using KNITRO third-party libraries
linprog – Solve linear programming problems
quadprog – Solve quadratic programming problems
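A short usage sketch for fmincon; the objective and the linear constraint are illustrative, not from the slides:

fun = @(x) (x(1)-1)^2 + (x(2)-2)^2;  % illustrative objective
A = [1 1];  b = 2;                   % linear inequality constraint x1 + x2 <= 2
x0 = [0 0];                          % starting point
x = fmincon(fun, x0, A, b)           % constrained minimum, here approx. (0.5, 1.5)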
New models
In artificial intelligence, an evolutionary algorithm (EA) is a generic population-based metaheuristic
optimization algorithm. An EA uses some mechanisms inspired by biological evolution:
reproduction, mutation, recombination, and selection. Candidate solutions to the optimization
problem play the role of individuals in a population, and the fitness function determines the
environment within which the solutions "live" (see also cost function). Evolution of the population
then takes place after the repeated application of the above operators.
Evolutionary algorithms often perform well approximating solutions to all types of problems; this
generality is shown by successes in fields as diverse as engineering, art, biology, economics,
marketing, genetics, operations research, robotics, social sciences, physics, politics and chemistry
A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This
heuristic is routinely used to generate useful solutions to optimization and search problems.
Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate
solutions to optimization problems using techniques inspired by natural evolution, such as
inheritance, mutation, selection, and crossover.
Genetic algorithms are a sub-field of:
• Evolutionary algorithms
• Stochastic optimization
• Optimization
Simple generational genetic algorithm procedure:
• Choose the initial population of individuals
• Evaluate the fitness of each individual in that population
• Repeat on this generation until termination (time limit, sufficient fitness achieved, etc.):
– Select the best-fit individuals for reproduction
– Breed new individuals through crossover and mutation operations to give birth to offspring
– Evaluate the individual fitness of the new individuals
– Replace the least-fit population with the new individuals
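With the toolbox described below, this whole procedure reduces to a single call; a sketch with the Rastrigin function, a standard multimodal test problem (an illustrative choice, not from the slides):

rast = @(x) 20 + x(1)^2 + x(2)^2 - 10*(cos(2*pi*x(1)) + cos(2*pi*x(2)));  % Rastrigin function
nvars = 2;                    % number of decision variables
[x, fval] = ga(rast, nvars)   % GA search; the global minimum is at (0,0)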
Genetic Algorithm and Direct Search Toolbox MATLAB
Genetic Algorithm and Direct Search Toolbox functions extend the
capabilities of Optimization Toolbox™. These algorithms enable you to
solve a variety of optimization problems that lie outside the scope of
Optimization Toolbox solvers
They include routines for solving optimization problems using
• Direct search
• Genetic algorithm
• Simulated annealing
What Is Direct Search?
Direct search is a method for solving optimization problems that does not
require any information about the gradient of the objective function. Unlike more traditional optimization methods, a direct search algorithm searches a
set of points around the current point, looking for one where the value of the
objective function is lower than the value at the current point. You can use
direct search to solve problems for which the objective function is not
differentiable, or is not even continuous.
All pattern search algorithms that compute a sequence of points that
approach an optimal point. At each step, the algorithm searches a set of
points, called a mesh, around the current point—the point computed at the
previous step of the algorithm. The mesh is formed by adding the current
point to a scalar multiple of a set of vectors called a pattern. If the pattern
search algorithm finds a point in the mesh that improves the objective
function at the current point, the new point becomes the current point at the
next step of the algorithm.
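As a minimal sketch of a pattern search run (the objective function and starting point below are assumptions; patternsearch itself is the toolbox routine):

% Pattern search on a smooth test function; no gradient is supplied
obj = @(x) (1 - x(1))^2 + 100*(x(2) - x(1)^2)^2;   % Rosenbrock function
x0  = [-1 2];                                      % starting point (assumed)
[x, fval] = patternsearch(obj, x0)                 % mesh-based direct search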
What Is the Genetic Algorithm?
The genetic algorithm is a method for solving both constrained and
unconstrained optimization problems that is based on natural selection, the
process that drives biological evolution. The genetic algorithm repeatedly
modifies a population of individual solutions. At each step, the genetic algorithm
selects individuals at random from the current population to be parents and
uses them to produce the children for the next generation. Over successive
generations, the population "evolves" toward an optimal solution.
You can apply the genetic algorithm to solve a variety of optimization problems
that are not well suited for standard optimization algorithms, including problems
in which the objective function is discontinuous, nondifferentiable, stochastic, or
highly nonlinear.
The genetic algorithm uses three main types of rules at each step to create the
next generation from the current population (a sample call follows the list):
• Selection rules select the individuals, called parents, that contribute to the
population at the next generation.
• Crossover rules combine two parents to form children for the next
generation.
• Mutation rules apply random changes to individual parents to form
children.
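In the toolbox these rules are controlled through options. A minimal ga call with a few tuned settings might look as follows; the objective rastriginsfcn ships with the toolbox, while the option values here are assumptions for illustration:

% ga on the multimodal Rastrigin function with tuned operator settings
opts = gaoptimset('PopulationSize', 50, 'Generations', 100, ...
                  'CrossoverFraction', 0.8);   % fraction of children from crossover
lb = [-5 -5]; ub = [5 5];                      % bound constraints
[x, fval] = ga(@rastriginsfcn, 2, [], [], [], [], lb, ub, [], opts)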
GA: http://www.myreaders.info/09_Genetic_Algorithms.pdf
Optimization, GA
Multicriteria design optimization (MDO): HGA, B&B, ACO, ...
Strategies for MDO: Pareto concept, physical programming techniques
[Figure: Pareto front, cost vs combined mechanical characteristics]
GA roulette wheel selection operator
GA crossover operator for binary coding
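The two operators named above can be sketched in a few lines of plain MATLAB; the fitness values and parent chromosomes below are assumptions for illustration:

% Roulette wheel selection: pick individuals proportionally to fitness
fit = [1.2 0.5 2.3 0.8];            % fitness of 4 individuals (assumed)
p   = fit / sum(fit);               % selection probabilities
sel = find(cumsum(p) >= rand, 1)    % spin the wheel once

% Single-point crossover for binary coding
p1 = [1 0 1 1 0 1];  p2 = [0 1 0 0 1 0];
k  = randi(length(p1) - 1);         % crossover point
c1 = [p1(1:k), p2(k+1:end)];        % child 1
c2 = [p2(1:k), p1(k+1:end)]         % child 2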
What Is Simulated Annealing?
Simulated annealing is a method for solving unconstrained and bound-constrained
optimization problems. The method models the physical
process of heating a material and then slowly lowering the temperature
to decrease defects, thus minimizing the system energy.
At each iteration of the simulated annealing algorithm, a new point is
randomly generated. The distance of the new point from the current
point, or the extent of the search, is based on a probability distribution
with a scale proportional to the temperature. The algorithm accepts all
new points that lower the objective, but also, with a certain probability,
points that raise the objective. An annealing schedule is selected to
systematically decrease the temperature as the algorithm proceeds. As
the temperature decreases, the algorithm reduces the extent of its
search to converge to a minimum.
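In the toolbox this algorithm is available as simulannealbnd. A minimal call, with an objective, start point, and bounds that are assumptions for illustration:

% Simulated annealing on a multimodal function with bound constraints
obj = @(x) x(1)^2 + x(2)^2 + 10*cos(3*x(1));   % many local minima (assumed)
x0  = [2 2];  lb = [-5 -5];  ub = [5 5];
[x, fval] = simulannealbnd(obj, x0, lb, ub)    % accepts some uphill moves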
GA sample
Sometimes the goal of an optimization is to find the global minimum or maximum of a function.
However, optimization algorithms sometimes return a local minimum. The genetic algorithm can
sometimes overcome this deficiency with the right settings.
GA sample2
Genetic algorithms belong to the evolutionary algorithms (EA), which generate
solutions to optimization problems. Genetic algorithms find application in
engineering, economics, chemistry, manufacturing, and other fields.
The 2006 NASA spacecraft antenna. This complicated shape was found by an
evolutionary computer design program to create the best radiation pattern.
Multidisciplinary design optimization
Multi-disciplinary design optimization (MDO) is a field of
engineering that uses optimization methods to solve design problems
incorporating a number of disciplines.
MDO allows designers to incorporate all relevant disciplines
simultaneously. The optimum of the simultaneous problem is superior
to the design found by optimizing each discipline sequentially, since it
can exploit the interactions between the disciplines. However, including
all disciplines simultaneously significantly increases the complexity of
the problem.
These techniques have been used in a number of fields, including
automobile design, naval architecture, electronics, architecture,
computers, and electricity distribution. However, the largest number of
applications has been in the field of aerospace engineering, such as
aircraft and spacecraft design.
Multicriteria optimization
In the following, the multicriteria optimization problem is formulated in a general form
covering the different particular engineering applications (case studies) considered here.
1. Problem formulation
Practical engineering problems often include several objectives (strength, stiffness
characteristics, cost, time, etc.) as well as different constraints (technological, geometric,
limitation of resources, etc.). Thus, the multicriteria optimization problem can be
formulated, in standard notation, as

$$\min_{x} \; f_i(x), \quad i = 1, \dots, k \qquad (10)$$

$$g_j(x) \le 0, \quad j = 1, \dots, m \qquad (11)$$

$$x^L \le x \le x^U \qquad (12)$$

In (10)-(12), $f_1(x), \dots, f_k(x)$ stand for the objective functions (describing
stiffness/strength, electrical properties, cost, etc.) and $x$ is an n-dimensional vector of
design variables. The upper and lower bounds of the design variables are denoted by
$x^U$ and $x^L$, respectively. The functions $g_1(x), \dots, g_m(x)$ stand for constraint functions, including both
linear and nonlinear constraints. Note that equality constraints can be converted into
inequality constraints and are covered by (11). The n-dimensional design space is
defined by the lower and upper bounds (12) of the design variables.
Multicriteria optimization
The objective functions subjected to maximum and minimum can be normalised by formulas (13) and (14), respectively.
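A standard form of these normalizations, consistent with the description below, is (the exact expressions are an assumption; $f_i^{\max}$ and $f_i^{\min}$ denote the estimated extreme values of the objective $f_i$):

$$\bar{f}_i(x) = \frac{f_i^{\max} - f_i(x)}{f_i^{\max} - f_i^{\min}} \qquad (13)$$

$$\bar{f}_i(x) = \frac{f_i(x) - f_i^{\min}}{f_i^{\max} - f_i^{\min}} \qquad (14)$$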
Obviously, after normalization with (13)-(14), both functions should be subjected to minimization. In (13)
the difference of the maximum and current values of the objective function is minimized; in (14) the
difference of the current and minimum values of the objective function is minimized. It should be
pointed out that the values of the normalized objective functions do not necessarily lie in the interval [0;1],
since the maximal and minimal values of the objective functions used in (13)-(14) are estimated values.
Multicriteria optimization
Physical programming
According to the previous section, the solution technique should be selected after analysis of the optimality
criteria. Furthermore, it was pointed out that in the current study the physical programming techniques will be
applied for objectives which are not conflicting [46], [47], [48].
According to the weighted summation technique, the optimality criteria are first scaled, then multiplied by
weights and summed into a new combined objective.
According to the compromise programming technique, the combined objective is defined as a family of
distance functions.
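In standard form the two combined objectives can be written as follows (these particular expressions are an assumption; $\bar{f}_i$ are the scaled objectives of (13)-(14) and $w_i \ge 0$ the weights):

$$F(x) = \sum_{i=1}^{k} w_i \, \bar{f}_i(x), \qquad \sum_{i=1}^{k} w_i = 1$$

$$F_p(x) = \left( \sum_{i=1}^{k} w_i^p \, \bigl|\bar{f}_i(x)\bigr|^p \right)^{1/p}, \qquad 1 \le p < \infty$$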
Multicriteria optimization
Pareto optimality concept
As pointed out above, the use of the Pareto optimality concept is justified in cases when contradictory
behaviour can be perceived between the objective functions [50], [51]. The physical programming techniques
discussed above are based on combining multiple objectives into one objective and solving the latter problem as a
single-objective optimisation problem.
Model simulation and evaluation
1. Variance analysis. Sensitivity analysis
2. Robustness analysis
ROBUST OPTIMIZATION
The Taguchi optimization method minimises the variability of the performance
under uncertain operating conditions. Therefore, in order to perform an
optimisation with uncertainties, the fitness function(s) should be associated with
two statistical measures (a sketch follows the list of sample objectives below):
• the mean value
• the variance or standard deviation
Sample objectives: minimise
1. the total weight,
2. the normalised mean displacement,
3. the standard deviation of the displacement
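A minimal sketch of such a robust (Taguchi-style) objective in MATLAB, assuming a placeholder response model, Gaussian scatter on the design variables, and equal weights (all names and values below are assumptions):

% Robust objective: mean and standard deviation of a response under scatter
resp  = @(X) X(:,1).^2 + X(:,2).^2 + X(:,3).^2;  % placeholder response model
x0    = [1.8 1.7 1.9];                           % nominal design (assumed)
sigma = 0.05;  N = 200;                          % scatter level, sample size
X  = repmat(x0, N, 1) + sigma*randn(N, 3);       % Monte Carlo perturbations
v  = resp(X);                                    % simulated responses
mu = mean(v);  sd = std(v);                      % the two statistical measures
f  = 0.5*mu + 0.5*sd                             % weighted robust objective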
Homework 3
1. Solve optimization problem with one optimality criterion
a) use function fmincon
b) use function ga
c) use hybrid approach ga+fmincon
At least one sample solution should use the ANN model built in Homework 2.
2. Solve a multicriteria optimization problem using function gamultiobj
a) use the weighted summation technique
(sample where objectives are not conflicting)
b) use the Pareto optimality concept
(sample where objectives are conflicting)
3. Robust optimization. A sample with one optimality criterion can be
considered as a multicriteria optimization problem by adding the standard
deviation or variance of some output value as a new criterion
clear all
%1. Initial data, Input data (variables, levels)
x1L=[1, 1.5, 2, 2.5];
x2L=[1, 1.5, 2, 2.5];
x3L=[0.5, 1.0, 1.5, 2];
% a full factorial design would need 4^3 = 64 experiments
%2. input corresponding to Taguchi design L16 (16 runs)
x1=[1,1,1,1,1.5,1.5,1.5,1.5,2,2,2,2,2.5,2.5,2.5,2.5];
x2=[1,1.5,2,2.5,1,1.5,2,2.5,1,1.5,2,2.5,1,1.5,2,2.5];
x3=[0.5,1,1.5,2,1,0.5,2,1.5,1.5,2,0.5,1,2,1.5,1,0.5];
S=[x1;x2;x3];
%3. this formula is used instead of real experiments
V=x1.^2+x2.^2+x3.^2;
%4. Function approximation using ANN
global net;
% defining network
net=newff(S,V, [10],{'tansig','purelin'},'trainlm');
net.trainParam.lr=0.05; % learning rate
net.trainParam.epochs=50;
%Train network
net = train(net, S, V);
%optimization
x0=[1.8,1.7,1.9];
A=[];
b=[];
% example: 2*x1+3*x2<=20 and 5*x1-x2<=15 give A=[2 3; 5 -1], b=[20;15]
Aeq=[];
beq=[];
lb=[1, 1, 0.5];
ub=[2.5, 2.5, 2];
% using fmincon (gradient method)
[x,f,error_flag]=fmincon(@mer9020_ob1,x0,A,b,Aeq,beq,lb,ub)
% using ga (genetic algorithm)
[x,f,error_flag]=ga(@mer9020_ob1,3,A,b,Aeq,beq,lb,ub)
%hybrid approach: ga provides the starting point xxx for fmincon
[xxx,f,error_flag]=ga(@mer9020_ob1,3,A,b,Aeq,beq,lb,ub)
[x,f,error_flag]=fmincon(@mer9020_ob1,xxx,A,b,Aeq,beq,lb,ub)
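Equivalently, the hybrid step can be attached to ga itself through the HybridFcn option, so that fmincon is started automatically from the best point ga finds; the option is part of the toolbox, and its use here is a sketch reusing the data defined above:

% Hybrid ga + fmincon in a single call via the HybridFcn option
opts = gaoptimset('HybridFcn', @fmincon);
[x,f,error_flag]=ga(@mer9020_ob1,3,A,b,Aeq,beq,lb,ub,[],opts)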
Supporting material for homework 3
% multiobjective optimization
options = gaoptimset('PopulationSize',100,'Generations',30,'Display','iter',...
'ParetoFraction',0.7,'PlotFcns', {@gaplotpareto});
[x,f,error_flag]=gamultiobj(@mer9020_ob2,3,A,b,Aeq,beq,lb,ub,options)
% each of the following objective functions must be saved in a separate file
a) File : mer9020_ob1.m
function [ f ] = mer9020_ob1( x )
global net;
f(1)=sim(net,x');
% f(2)=..... Needed only in case of multiobjective problems
end
b) File : mer9020_ob2.m
function [ f ] = mer9020_ob2( x )
global net;
f(1)=sim(net,x');
f(2)=x(1)+x(2)^2+x(3)*x(2);
end