Prof. emer. Rein Küttner & Lead Research Scientist Jüri Majak
TTÜ, Department of Mechanical Engineering (masinaehituse instituut)
E-mail: juri.majak@ttu.ee

MER9020

Course programme

Introduction
• Mathematical models and tools for creating and describing them. Use of experimental models in optimisation. Engineering statistics
– Statistics and design. Use of models determined from experiments in optimisation. Design of experiments. (Engineering statistics and design. Statistical design of experiments. Use of statistical models in design.)
– Surrogate models and their use.
– Modern methods for building surrogate models. Use of neural network methodology.
– Assessing model accuracy. Accounting for risk; reliability and safety. (Statistical tests of hypotheses. Analysis of variance. Risk, reliability, and safety.)
• Optimisation of engineering solutions, fundamentals (Basics of optimisation methods and search of solutions)
– Types of optimisation problems.
– Multicriteria optimal choice problems. Pareto-optimal solutions. (Multicriteria optimal decision theory. Pareto optimality. Use of multicriteria decision theory in engineering design.)
– Classical optimisation problems. (Optimisation by differential calculus. Lagrange multipliers. Examples of the use of classical optimisation methods in engineering.)
– Mathematical programming methods. Primal and dual programming problems. Use of the dual problem. Sensitivity analysis of the solutions of optimisation problems. (Mathematical programming methods and their use for engineering design, process planning and manufacturing resource planning. Direct and dual optimisation tasks. Sensitivity analysis.)
– Genetic optimisation algorithms.
• Simulation of technical systems and processes. Simulation using random number generators and SIMULINK.
• Optimal design of products and manufacturing processes; developments. Examples of the use of optimisation.
General scheme of the course
[Diagram: course scheme, with the following blocks]
• Experiments, computer experiments, process monitoring, statistical evaluation of test results, evaluation of models, etc.
• Simulation, analysis
• Design of experiments (DOE)
• Surrogate model (response surface), use of surrogate models
• Optimisation: optimal choice problems, linear and nonlinear programming
• Optimisation of technical systems (scheme)

The independent practical work consists of the following assignments:
• Statistical evaluation of test results: finding the main statistical estimates, evaluating the relationship between two different test series (correlation), evaluating the difference in variability of the test series.
• Building a regression analysis model; analysis of the suitability of the chosen model.
• Finding a neural network model.
• Solving a linear (nonlinear) optimisation problem and analysing the results.
• Statistical simulation of a model on a computer.

Overall Goal in Selecting Methods
The overall goal in selecting basic research method(s) in engineering design is to get the most useful information to key decision makers in the most cost-effective and realistic fashion. Consider the following questions:
1. What information is needed to make current decisions about a product or technology?
2. How much information can be collected and analyzed using experiments, questionnaires, surveys and checklists?
3. How accurate will the information be?
4. Will the methods get all of the needed information?
5. What additional methods should and could be used if additional information is needed?
6. Will the information appear credible to decision makers, e.g. to engineers or top management?
7. How can the information be analyzed?

Basic Guidelines to Problem Solving and Decision Making
1. Define the problem. If the problem still seems overwhelming, break it down by repeating this step. Prioritize the problems.
2. Look at potential causes for the problem.
3. Identify alternative approaches to resolve the problem. At this point, it is useful to keep others involved.
4.
Select a method, tool, technique, etc. to solve the problem.
5. Plan the implementation of the best alternative solution (action plan).
6. Monitor implementation of the plan.
7. Verify whether the problem has been resolved or not.

1. Complexity

Complexity of engineering problems
Three types of complexity:
• Numerical complexity
• Description complexity
• Understanding (recognition) complexity

Complexity classes (reference functions, big-O notation):
• Logarithmic complexity, $O(\log n)$
• Linear complexity, $O(n)$
• Polynomial complexity, $O(n^q)$
• Exponential complexity
• Factorial complexity
• Double-exponential complexity

Algorithmic complexity is concerned with how fast or slow a particular algorithm performs. We define complexity as a numerical function $T(n)$: time versus the input size $n$. We want to define the time taken by an algorithm without depending on the implementation details, although $T(n)$ itself does depend on the implementation.

Complexity classes
http://www.cs.cmu.edu/~adamchik/15121/lectures/Algorithmic%20Complexity/complexity.html

Asymptotic Notations
The goal of computational complexity is to classify algorithms according to their performance. We represent the time function $T(n)$ using the "big-O" notation to express an algorithm's runtime complexity. For example, the statement $T(n) = O(n^2)$ says that an algorithm has quadratic time complexity.

Definition of "big Oh"
For any monotonic functions $f(n)$ and $g(n)$ from the positive integers to the positive integers, we say that $f(n) = O(g(n))$ when there exist constants $c > 0$ and $n_0 > 0$ such that
$f(n) \le c \cdot g(n)$ for all $n \ge n_0$.
Intuitively, this means that $f(n)$ does not grow faster than $g(n)$, or that $g(n)$ is, up to a constant factor, an upper bound for $f(n)$ for all sufficiently large $n$.

Exercise. Let us prove $n^2 + 2n + 1 = O(n^2)$.
We must find $c$ and $n_0$ such that $n^2 + 2n + 1 \le c \cdot n^2$ for all $n \ge n_0$.
Let $n_0 = 1$; then for $n \ge 1$
$1 + 2n + n^2 \le n + 2n + n^2 \le n^2 + 2n^2 + n^2 = 4n^2$.
Therefore, $c = 4$.
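The inequality used in the exercise above can also be checked numerically. The following is a minimal sketch in Python (not a proof; the function name `bound_holds` and the tested range are illustrative assumptions), which simply tests $n^2 + 2n + 1 \le c \cdot n^2$ for every integer in a finite range.

```python
def bound_holds(n_max: int, c: int = 4, n0: int = 1) -> bool:
    """Check n^2 + 2n + 1 <= c * n^2 for every integer n0 <= n <= n_max."""
    return all(n * n + 2 * n + 1 <= c * n * n for n in range(n0, n_max + 1))


# Constants from the exercise: c = 4, n0 = 1.
print(bound_holds(10_000))
# A constant that is too small (c = 1) fails already at n = 1.
print(bound_holds(10_000, c=1))
```

A finite check like this can only refute a candidate pair $(c, n_0)$; the algebraic argument is still needed to show that the bound holds for all $n \ge n_0$.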
Complexity classes

Constant Time: $O(1)$
An algorithm is said to run in constant time if it requires the same amount of time regardless of the input size.
Example: array: accessing any element.

Linear Time: $O(n)$
An algorithm is said to run in linear time if its execution time is directly proportional to the input size, i.e. time grows linearly as the input size increases.
Examples: array: linear search, traversing, finding the minimum.

i:=1
p:=1
for i:=1 to n
  p:=p*i
  i:=i+1
endfor

Ex: Find the complexity of the algorithm. The operation count is $f(n) = 4n + 2$, hence $O(n)$.

Logarithmic Time: $O(\log n)$
An algorithm is said to run in logarithmic time if its execution time is proportional to the logarithm of the input size.
Example: binary search.

Quadratic Time: $O(n^2)$
An algorithm is said to run in quadratic time if its execution time is proportional to the square of the input size.
Examples: bubble sort, selection sort, insertion sort.

for i:=1 to n
  for j:=1 to n
    A(i,j):=x
  endfor
endfor

Ex: Find the complexity of the algorithm.

Ex: Find the complexity of the algorithm

s:=0
for i:=1 to n
  for j:=1 to i
    s:=s+j*(i-j+1)
  endfor
endfor

Complexity of recursive algorithms
The initial problem with data size $n$ is divided into $b$ subproblems of equal size. Only $a$ ($a < b$) of the subproblems are solved (the others are not needed).

$T(n) = a\,T(n/b) + f(n)$   (1)

Theorem: Assume that $a \ge 1$ and $b > 1$ are constants, $f(n)$ is a function, and $T(n)$ is defined for non-negative $n$ by formula (1). Then:
a) $T(n)$ is $O(n^{\log_b a})$ if $f(n)$ is $O(n^{\log_b a - e})$ ($e$ is a positive constant);
b) $T(n)$ is $O(n^{\log_b a} \log n)$ if $f(n)$ is of order $n^{\log_b a}$;
c) $T(n)$ is $O(f(n))$ if $f(n)$ grows at least as fast as $n^{\log_b a + e}$ ($e$ is a positive constant) and $a f(n/b) \le c f(n)$ for some constant $c < 1$.

Ex1: apply the theorem to binary search.
Ex2: apply the theorem to the problem where a=2, b=4, $f(n) = n^2 + 2n + 9$.
Ex3: apply the theorem to the problem where a=2, b=4, f(n)=3.
Ex4: find the number of operations for a=2, b=4, f(n)=2, n=2000.
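The three cases of the theorem above can be sketched for the special case where $f(n)$ behaves like a power $n^k$. This is a hedged illustration in Python: the function name `master_case` and the restriction to polynomial $f(n)$ are assumptions made for the sketch, not part of the course material. For such $f$, the extra condition of case c) holds automatically, because $a f(n/b) = (a/b^k) f(n)$ and $a/b^k < 1$ whenever $k > \log_b a$.

```python
import math


def master_case(a: int, b: int, k: float) -> str:
    """Classify T(n) = a*T(n/b) + f(n), assuming f(n) grows like n**k."""
    crit = math.log(a, b)  # critical exponent log_b(a)
    if math.isclose(k, crit, abs_tol=1e-12):
        return f"case b: O(n^{crit:g} log n)"
    if k < crit:
        return f"case a: O(n^{crit:g})"
    return f"case c: O(n^{k:g})"


print(master_case(1, 2, 0))  # binary search: T(n) = T(n/2) + const
print(master_case(2, 4, 2))  # Ex2: f(n) = n^2 + 2n + 9 grows like n^2
print(master_case(2, 4, 0))  # Ex3: f(n) = 3 is constant
```

For binary search the critical exponent is $\log_2 1 = 0$, so case b) gives $O(n^0 \log n) = O(\log n)$.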
Ex2: apply the theorem to the problem where a=2, b=4, $f(n) = n^2 + 2n + 9$
$T(n) = 2T(n/4) + n^2 + 2n + 9$
Complexity of $f(n)$: $O(n^2 + 2n + 9) = O(n^2)$
$\log_b a = \log_4 2 = 0.5$, so $n^{\log_b a} = n^{0.5}$
Case c) with constant $e = 1.5$: $f(n)$ grows as $n^{0.5 + 1.5} = n^2$
Asymptotic estimate from case c): $O(f(n)) = O(n^2 + 2n + 9) = O(n^2)$

Ex3: apply the theorem to the problem where a=2, b=4, f(n)=3
$T(n) = 2T(n/4) + 3$
Complexity of $f(n)$: $O(1)$
$\log_b a = \log_4 2 = 0.5$, so $n^{\log_b a} = n^{0.5}$
Case a) with constant $e = 0.5$: $O(n^{0.5 - 0.5}) = O(1)$
Asymptotic estimate from case a): $O(n^{\log_b a}) = O(n^{0.5})$

Descriptive Statistics (Excel, ...)

Descriptive Statistics
Find the mean, median, mode, and range for the following list of values:
13, 18, 13, 14, 13, 16, 14, 21, 13
The mean is the usual average, so:
(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15
Note that the mean isn't a value from the original list. This is a common result. You should not assume that your mean will be one of your original numbers.
The median is the middle value, so the list has to be rewritten in order:
13, 13, 13, 13, 14, 14, 16, 18, 21
There are nine numbers in the list, so the middle one will be the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number. So the median is 14.
The mode is the number that is repeated more often than any other, so 13 is the mode.
The largest value in the list is 21, and the smallest is 13, so the range is 21 – 13 = 8.
mean: 15
median: 14
mode: 13
range: 8
(Example: Copyright © Elizabeth Stapel 2004-2011 All Rights Reserved)

Descriptive Statistics
Standard deviation (sample): $s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
Standard error (of the mean): $SE = s / \sqrt{n}$
The variance of a random variable $X$ is its second central moment, the expected value of the squared deviation from the mean $\mu = E[X]$:
$\mathrm{Var}(X) = E[(X - \mu)^2]$
The variance is the square of the standard deviation.
Sample variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
In probability theory and statistics, the variance is used as a measure of how far a set of numbers is spread out. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value).
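The worked example and the variance definitions above can be reproduced with a few lines of code. The sketch below uses Python's standard `statistics` module instead of Excel (an assumption made for illustration; the course itself refers to Excel and MATLAB). `statistics.variance` and `statistics.stdev` use the sample (n − 1) form.

```python
import statistics

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

mean = statistics.mean(values)            # arithmetic mean
median = statistics.median(values)        # middle value of the sorted list
mode = statistics.mode(values)            # most frequent value
value_range = max(values) - min(values)   # largest minus smallest
sample_var = statistics.variance(values)  # sample variance, divisor n - 1
sample_std = statistics.stdev(values)     # square root of the sample variance
std_error = sample_std / len(values) ** 0.5  # standard error of the mean

print(mean, median, mode, value_range)
print(sample_var, sample_std, std_error)
```

The squared deviations from the mean sum to 64, so the sample variance is 64 / 8 = 8 and the standard deviation is $\sqrt{8} \approx 2.83$.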
Descriptive Statistics
Kurtosis
In probability theory and statistics, kurtosis (from the Greek word κυρτός, kyrtos or kurtos, meaning bulging) is a measure of the "peakedness" of the probability distribution of a real-valued random variable, although some sources insist that heavy tails, and not peakedness, are what kurtosis really measures. Higher kurtosis means that more of the variance is the result of infrequent extreme deviations, as opposed to frequent modestly sized deviations.
Kurtosis is a measure of how outlier-prone a distribution is. The kurtosis of the normal distribution is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3.
The kurtosis of a distribution is defined as
$k = \dfrac{E[(X - \mu)^4]}{\sigma^4}$
where $\mu$ is the mean and $\sigma$ the standard deviation of $X$.

Descriptive Statistics
Skewness
In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable. The skewness value can be positive or negative, or even undefined. Qualitatively, a negative skew indicates that the tail on the left side of the probability density function is longer than the right side and the bulk of the values (possibly including the median) lie to the right of the mean. A positive skew indicates that the tail on the right side is longer than the left side and the bulk of the values lie to the left of the mean. A zero value indicates that the values are relatively evenly distributed on both sides of the mean, typically but not necessarily implying a symmetric distribution.
The skewness of a random variable $X$ is the third standardized moment, denoted $\gamma_1$ and defined as
$\gamma_1 = E\left[\left(\dfrac{X - \mu}{\sigma}\right)^3\right]$

Correlation
Correlation is a statistical measure that indicates the extent to which two or more variables fluctuate together.
A positive correlation indicates the extent to which those variables increase or decrease in parallel; a negative correlation indicates the extent to which one variable increases as the other decreases.
Samples: http://www.mathsisfun.com/data/correlation.html
The local ice cream shop keeps track of how much ice cream they sell versus the temperature on that day. Here are their figures:

Correlation: Ice Cream Sales vs Temperature
Temperature (°C)    Ice Cream Sales ($)
14.2                215
16.4                325
11.9                185
15.2                332
18.5                406
22.1                522
19.4                412
25.1                614
23.4                544
18.1                421
22.6                445
17.2                408

Correlation
Example of correlation using Excel Data Analysis

Estimation of the variance of data series using the F-test
http://en.wikipedia.org/wiki/F-test_of_equality_of_variances
This F-test is known to be extremely sensitive to non-normality.

Estimation of the variance of data series using the F-test
Excel Data Analysis function: F-Test Two-Sample for Variances. Example:

F-Test Two-Sample for Variances
                       Variable 1    Variable 2
Mean                   4.642857      4.071429
Variance               6.708791      3.917582
Observations           14            14
df                     13            13
F                      1.712482
P(F<=f) one-tail       0.172124
F Critical one-tail    2.576927

Estimation of the variance of data series using the F-test
https://controls.engin.umich.edu/wiki/index.php/Factor_analysis_and_ANOVA

Homework 1: Statistical evaluation of test data
Descriptive Statistics
1. Calculate basic descriptive statistics for two series of test data (data series selected by yourself).
2. Estimate the relations between these two data series (correlation, covariance).
3. Estimate the variance of the data series using the F-test.
If $F_{observed} \le F_{critical}$, the null hypothesis is not rejected: the difference between the variances of the two data series can be attributed to chance. $F_{critical}$ is the tabulated value taken for the given confidence level (95%, 99%, etc.).
If $F_{observed} > F_{critical}$, we conclude with 95% confidence that the null hypothesis is false.
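The two calculations in this part, the correlation of the ice cream data and the F-test decision from the Excel example, can be sketched in plain Python as follows. This is an illustrative sketch only: the helper name `pearson_r` is an assumption, and the critical value 2.576927 is copied from the Excel table above rather than computed.

```python
from math import sqrt

temperature = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1, 19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [215, 325, 185, 332, 406, 522, 412, 614, 544, 421, 445, 408]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(temperature, sales)
print(f"correlation r = {r:.4f}")

# F-test of equal variances, using the sample variances from the Excel example.
var1, var2 = 6.708791, 3.917582
f_observed = max(var1, var2) / min(var1, var2)  # larger variance in the numerator
f_critical = 2.576927                            # taken from the table, not computed
reject_equal_variances = f_observed > f_critical
print(f"F = {f_observed:.6f}, reject H0: {reject_equal_variances}")
```

Since the observed F is below the critical value, the example data give no reason to reject the hypothesis of equal variances.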
Null hypothesis $H_0$: all sample means arising from different factors are equal.
Alternative hypothesis $H_a$: the sample means are not all equal.

Homework 1: all work done should be explained (meaning of parameters, etc.)
1. Explain the results of the descriptive statistics.
2. Explain the relations between the data series.
3. Explain the variance of the data series.

MATLAB Statistics
• Descriptive Statistics: data summaries
• Statistical Visualization: data patterns and trends
• Probability Distributions: modeling data frequency
• Hypothesis Tests: inferences from data
• Analysis of Variance: modeling data variance
• Regression Analysis: continuous data models
• Multivariate Methods: visualization and reduction
• Cluster Analysis: identifying data categories
• Model Assessment: identifying data categories
• Classification: categorical data models
• Hidden Markov Models: stochastic data models
• Design of Experiments: systematic data collection
• Statistical Process Control: production monitoring

Linear regression analysis
The earliest form of regression was the method of least squares, which was published by Legendre in 1805 and by Gauss in 1809.

Regression models
Regression models involve the following variables:
• The unknown parameters, denoted as β, which may represent a scalar or a vector.
• The independent variables, X.
• The dependent variable, Y.
A regression model relates Y to a function of X and β. The approximation is usually formalized as E(Y | X) = f(X, β). To carry out regression analysis, the form of the function f must be specified. Sometimes the form of this function is based on knowledge about the relationship between Y and X that does not rely on the data. If no such knowledge is available, a flexible or convenient form for f is chosen.
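As a small illustration of the regression model E(Y | X) = f(X, β) described above, the sketch below fits the simplest choice, a straight line f(x, β) = β0 + β1·x, by least squares using the closed-form formulas. The function name `fit_line` and the data are hypothetical and chosen so that the result is easy to check by hand.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the model y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx      # slope
    b0 = my - b1 * mx   # intercept
    return b0, b1


# Hypothetical noise-free data generated from y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

b0, b1 = fit_line(x, y)
print(b0, b1)
```

With real measurements the fitted parameters would only approximate the underlying relation, and the residuals would need to be examined to judge whether the chosen form of f is adequate.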
Linear regression models

Regression analysis: plan of observations (experiments)

Regression analysis (experimental plan)
Run    Inputs               Output
       x1     x2     x3     y
1      1      12     10     1
2      2      14     9      3
3      6      23     8      6
4      5      12     7      8
5      6      35     6      10
6      7      36     4      12
7      8      31     6      15
8      9      32     3      21
9      12     12     2      23
10     13     16     2      40

Regression analysis (transformed experimental plan)
x1*x1    sqrt(x2)    x3*x3*x3    y
1        3.464102    1000        1
4        3.741657    729         3
36       4.795832    512         6
25       3.464102    343         8
36       5.91608     216         10
49       6           64          12
64       5.567764    216         15
81       5.656854    27          21
144      3.464102    8           23
169      4           8           40

Example of regression analysis results
http://www.excel-easy.com/examples/regression.html

Linearization
$y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + a_{12} x_1 x_2 + a_{13} x_1 x_3 + a_{23} x_2 x_3 + a_{11} x_1^2 + a_{22} x_2^2 + a_{33} x_3^2$

Nonlinear equation                      Linearized equation            Linearized variables
                                                                       W                 V
$y = a + bx$ (linear)                   $y = a + bx$                   $y$               $x$
$y = a x^b$ (logarithmic)               $\ln y = \ln a + b \ln x$      $\ln y$           $\ln x$
$y = a e^{bx}$ (exponential)            $\ln y = \ln a + bx$           $\ln y$           $x$
$y = 1 - e^{-bx}$ (exponential)         $\ln\frac{1}{1-y} = bx$        $\ln\frac{1}{1-y}$  $x$
$y = a + b\sqrt{x}$ (square root)       $y = a + bV$                   $y$               $\sqrt{x}$
$y = a + b/x$ (inverse)                 $y = a + bV$                   $y$               $1/x$

MATLAB regression analysis
Regression Plots, Linear Regression, Nonlinear Regression, Regression Trees, Ensemble Methods

REGRESSION ANALYSIS. Matlab
Simple linear regression: $y = b_1 x + b_0$

% data
xx=[2.38 2.44 2.70 2.98 3.32 3.12 2.14 2.86 3.5 3.2 2.78 2.7 2.36 2.42 2.62 2.8 2.92 3.04 3.26 2.30]
yy=[51.11 50.63 51.82 52.97 54.47 53.33 49.90 51.99 55.81 52.93 52.87 52.36 51.38 50.87 51.02 51.29 52.73 52.81 53.59 49.77]
% sort the data by x
[x,ind]=sort(xx);
y=yy(ind);
% linear regression
[c,s]=polyfit(x,y,1);
% structure s contains fields R, df, normr, ...
[Y,delta]=polyconf(c,x,s,0.05);
% plot: fitted line, confidence bounds, data points and residual segments
plot(x,Y,'k-',x,Y-delta,'k--',x,Y+delta,'k--',x,y,'ks',[x;x],[Y;y],'k-')
xlabel('x (input)');
ylabel('y (response)');

REGRESSION ANALYSIS.
Matlab: multiple linear regression
$y = b_0 + \sum_{j=1}^{k} b_j x_j$

Example 1: linear model for a cubic polynomial in one independent variable
Let: $x_1 = x$, $x_2 = x^2$, $x_3 = x^3$
Linear model: $y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$

Example 2: linear model for a quadratic polynomial in two independent variables
Let: $x_1 = x_1$, $x_2 = x_2$, $x_3 = x_1^2$, $x_4 = x_2^2$, $x_5 = x_1 x_2$
Linear model: $y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + b_4 x_4 + b_5 x_5$

Example
x1=[7.3 8.7 8.8 8.1 9.0 8.7 9.3 7.6 10.0 8.4 9.3 7.7 9.8 7.3 8.5 9.5 7.4 7.8 7.8 10.3 7.8 7.1 7.7 7.4 7.3 7.6]'
x2=[0.0 0.0 0.7 4.0 0.5 1.5 2.1 5.1 0.0 3.7 3.6 2.8 4.2 2.5 2.0 2.5 2.8 2.8 3.0 1.7 3.3 3.9 4.3 6.0 2.0 7.8]'
x3=[0.0 0.3 1.0 0.2 1.0 2.8 1.0 3.4 0.3 4.1 2.0 7.1 2.0 6.8 6.6 5.0 7.8 7.7 8.0 4.2 8.5 6.6 9.5 10.9 5.2 20.7]'
% y must be a column vector with one entry per row of X
y=[0.222 0.395 0.422 0.437 0.428 0.467 0.444 0.378 0.494 0.456 0.452 0.112 0.432 0.101 0.232 0.306 0.0923 0.116 0.0764 0.439 0.0944 0.117 0.0726 0.0412 0.251 0.00002]'
% design matrix: constant, linear, interaction and quadratic terms
X=[ones(length(y),1),x1,x2,x3,x1.*x2,x1.*x3,x2.*x3, x1.^2, x2.^2, x3.^2]
% Regression
[b,bcl,er,ercl,stat]=regress(y,X,0.05)
disp(['R2=' num2str(stat(1))])
disp(['F0=' num2str(stat(2))])
disp(['p-value=' num2str(stat(3))])
rcoplot(er,ercl)

In cases where the distribution of errors is asymmetric or prone to outliers, the statistics computed with the function regress() become unreliable. In that case the function robustfit() may be preferred.

% if the distribution of errors is asymmetric or prone to outliers
% (robustfit adds the constant term itself, so the column of ones is dropped)
X2=X(:,2:end)
robustbeta = robustfit(X2,y)

Design of experiment
The aim in general:
• to extract as much information as possible from a limited set of experiments or computer simulations;
• to maximize the information content of the measurements/simulations in the context of their utilization for estimating the model parameters.
In particular: the selection of the points where the response should be evaluated.
Can be applied for model design with:
• Experimental data
• Simulation data
Can be applied for model fitting of:
• Objective functions
• Constraint functions

Why DOE
DOE allows the simultaneous investigation of the effect of a set of variables on a response in a cost-effective manner.
DOE is superior to the traditional one-variable-at-a-time method, which fails to consider possible interactions between the factors.

DOE techniques selection
DOE was introduced for describing real-life problems:
• in the 1920s by R. Fisher, for agricultural experiments;
• in the 1950s by G. Box, for modeling chemical experiments;
• nowadays it is used in various engineering applications, production planning, etc.
A huge number of DOE methods are available in the literature, and selecting the most suitable method is not always a simple task. Preparatory activities needed:
• formulation of the problem to be modelled by DOE,
• selection of the response variable(s),
• choice of factors (design variables),
• determining ranges for the design variables.
If this preliminary analysis is done successfully, the selection of a suitable DOE method is simpler. The selection of the levels of the factors is also often classified as preliminary work.
In the following, two DOE methods are selected and discussed in more detail:
• the Taguchi methods, which allow a preliminary robust design to be obtained with a small number of experiments and are most often applied at early stages of process development or used as an initial design;
• the full factorial design, which is resource consuming but leads to more accurate results.

DOE techniques selection
Note that the Taguchi design can be obtained from the full factorial design by omitting certain design points. There are also several approaches based on the full factorial design. For example, the central composite design can be obtained from the $2^N$ full factorial design by including additional centre and axial points.
Alternative well-known DOE techniques include the D-optimality criterion, the Latin hypercube, the van Keulen scheme, etc. The D-optimality criterion is based on maximization of the determinant of $X^T X$, where $X$ stands for the matrix of the design variables. Application of the D-optimality criterion yields the minimum of the maximum variance of the predicted responses (the errors of the model parameters are minimized). The Latin hypercube design maximizes the minimum distance between design points, but requires even spacing of the levels of each factor [12]. The van Keulen scheme is useful in cases where the model building is to be repeated within an iterative scheme, since it adds points to an existing plan.

DOE methods vs selection criteria

Full-Factorial Experiments

Full factorial design
In order to overcome the shortcomings of the Taguchi methods, the full factorial design can be applied (Montgomery, 1997). This approach captures interactions between design variables, including all possible combinations. According to the full factorial design strategy, the design variables are varied together instead of one at a time. First, the lower and upper bounds of each of the design variables are determined (estimated values are used if exact values are not known). Next, the design space is discretized by selecting level values for each design variable. Depending on the number of levels, the experimental design is classified in the following manner:
• $2^N$ full factorial design: each design variable is defined at only the lower and upper bounds (two levels);
• $3^N$ full factorial design: each design variable is defined at the lower and upper bounds and also at the midpoints (three levels).
In the case of $N = 3$, the $3^N$ full factorial design contains 27 design points, shown in Fig. 1.
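The construction of the $2^N$ and $3^N$ full factorial designs described above amounts to enumerating all combinations of the chosen levels. A minimal sketch in Python follows; the function name `full_factorial` and the coded levels −1, 0, +1 (lower bound, midpoint, upper bound) are assumptions made for illustration.

```python
from itertools import product


def full_factorial(levels, n_vars):
    """All combinations of the given levels for n_vars design variables."""
    return list(product(levels, repeat=n_vars))


design_2n = full_factorial((-1, 1), 3)      # two levels: lower and upper bound
design_3n = full_factorial((-1, 0, 1), 3)   # three levels: bounds and midpoint

print(len(design_2n), len(design_3n))
print(design_3n[:3])  # first few design points in coded units
```

The number of design points is (number of levels) raised to the power N, which is why the approach quickly becomes expensive as the number of design variables grows.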
Full factorial design
The full factorial design considered includes all possible combinations of design variables and can be presented in the form of a general second-order polynomial as

$y = c_0 + \sum_{i=1}^{N} c_i x_i + \sum_{i=1}^{N} c_{ii} x_i^2 + \sum_{i,j=1;\, i<j}^{N} c_{ij} x_i x_j$   (2)

In (2), $x_i$ and $x_j$ stand for the design variables and $c_0$, $c_i$, $c_{ii}$, $c_{ij}$ are model parameters. It should be noted that the second-order polynomial model given by formula (2) is just one possibility for response modeling. This model is widely used due to its simplicity. In the current study the full factorial design is used for determining the dataset where the response should be evaluated, but instead of (2) artificial neural networks are employed for response modeling.
Evidently, the number of experiments grows exponentially in the case of the $3^N$ full factorial design (and also for $2^N$). Thus, such an approach becomes impractical for a large number of design variables. A full factorial design is typically used for five or fewer variables.

Fractional factorial design
In the case of a large number of design variables, a fraction of a full factorial design is most commonly used, considering only a few combinations between variables [11]. The one-third fractions for a $3^3$ factorial design are depicted in Fig. 2 [11].
Figure 2. One-third fractions for a $3^3$ full factorial design ([11], [31])
Thus, in this case the number of experiments is reduced to one third in comparison with the $3^3$ full factorial design (from 27 to 9). The cost of such a simplification is that only a few combinations between variables are considered.

Central composite design
Central composite designs (CCD) are first-order ($2^N$) designs with additional centre and axial points that allow estimation of the tuning parameters of a second-order model. A CCD for 3 design variables is shown in Figure 3.
Figure 3. Central composite design for 3 design variables at 2 levels
The CCD shown involves 8 factorial points, 6 axial points and 1 central point.
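The point count of the central composite design above (8 factorial + 6 axial + 1 centre point for three variables) can be reproduced with a short sketch. The Python function below is an illustration under stated assumptions: the name `central_composite` is hypothetical, and the axial distance `alpha` is set to 1.0 (a face-centred design) because the slides do not specify it.

```python
from itertools import product


def central_composite(n_vars, alpha=1.0):
    """Coded CCD points: 2^N factorial corners, 2N axial points, one centre point."""
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=n_vars)]
    axial = []
    for i in range(n_vars):
        for value in (-alpha, alpha):
            point = [0.0] * n_vars
            point[i] = value          # move along one axis only
            axial.append(tuple(point))
    centre = [tuple([0.0] * n_vars)]
    return factorial + axial + centre


ccd = central_composite(3)
print(len(ccd))  # factorial + axial + centre points
```

In general the design has $2^N + 2N + 1$ points, compared with $3^N$ for the three-level full factorial design.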
CCD presents an alternative to $3^N$ designs in the construction of second-order models because the number of experiments is reduced as compared to a full factorial design (15 in the case of CCD compared to 27 for a full factorial design). For problems with a large number of design variables, the experiments may be time-consuming even with the use of CCD.

Parameter study
One at a Time
Latin Hypercubes

The Taguchi method
The Taguchi approach is more economical than traditional design of experiment methods such as the factorial design, which is resource and time consuming. For example, a process with 8 variables, each with 3 states, would require $3^8 = 6561$ experiments to test all variables (full factorial design). However, using Taguchi's orthogonal arrays, only 18 experiments are necessary, or less than 0.3% of the original number of experiments.
The limitations of the Taguchi method should also be pointed out. The most critical drawback of the Taguchi method is that it does not account for higher-order interactions between design parameters. Only main effects and two-factor interactions are considered.
Taguchi methods, developed by Dr. Genichi Taguchi, are based on the following two ideas:
• Quality should be measured by the deviation from a specified target value, rather than by conformance to preset tolerance limits;
• Quality cannot be ensured through inspection and rework, but must be built in through the appropriate design of the process and product.
In the Taguchi method, two kinds of factors, control factors and noise factors, are considered in order to study their influence on the output parameters. The control factors are used to select the best conditions for a manufacturing process, whereas the noise factors denote all factors that cause variation.
The signal-to-noise (SN) ratio is used to find the best set of design variables.

The Taguchi method
According to the performance characteristics analysis, the Taguchi approach is classified into three categories:
• Nominal-the-Better (NB),
• Higher-the-Better (HB),
• Lower-the-Better (LB).
In the following, the Lower-the-Better (LB) approach is employed in order to minimize the objective functions. The SN ratio is calculated as follows:

$SN_i = -10 \log \left( \frac{1}{N_i} \sum_{k=1}^{N_i} y_k^2 \right)$

where $i$, $k$ and $N_i$ stand for the experiment number, the trial number and the number of trials for experiment $i$, respectively.
The results obtained from the Taguchi method can (and should) be validated by confirmation tests. The validation is performed by conducting experiments with a specific combination of factors and levels not considered in the initial design data.

The Taguchi method
Array Selector
(https://controls.engin.umich.edu/wiki/index.php/Design_of_experiments_via_taguchi_methods:_orthogonal_arrays)
L4 array:

The Taguchi method
Example parameters:
• Impeller model: A, B, or C
• Mixer speed: 300, 350, or 400 RPM
• Control algorithm: PID, PI, or P
• Valve type: butterfly or globe
There are 4 parameters, and each one has 3 levels with the exception of valve type. The highest number of levels is 3, so we will use a value of 3 when choosing our orthogonal array.
Using the array selector above, we find that the appropriate orthogonal array is L9.
When we replace P1, P2, P3, and P4 with our parameters and begin filling in the parameter values, we find that the L9 array includes 3 levels for valve type, while our system only has 2. The appropriate strategy is to fill in the entries for P4=3 with 1 or 2 in a random, balanced way.

The Taguchi method
If the array selected based on the number of parameters and levels includes more parameters than are used in the experimental design, ignore the additional parameter columns.
For example, if a process has 8 parameters with 2 levels each, the L12 array should be selected according to the array selector. As can be seen below, the L12 array has columns for 11 parameters (P1-P11). The right 3 columns should be ignored.

Design of experiment. Factorial Designs

EXPERIMENTAL PLAN
                                             ALLOWS INTERACTIONS TO BE FOUND
Run no.                       x1    x2    x3    x12=x1*x2    x13=x1*x3    x23=x2*x3    x1*x2*x3
1                             1     1     1     1            1            1            1
2                             -1    1     1     -1           -1           1            -1
3                             1     -1    1     -1           1            -1           -1
4                             -1    -1    1     1            -1           -1           1
5                             1     1     -1    1            -1           -1           -1
6                             -1    1     -1    -1           1            -1           1
7                             1     -1    -1    -1           -1           1            1
8                             -1    -1    -1    1            1            1            -1
Additional runs at 0 point    0     0     0     0            0            0            0

We obtain the equation $y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3$, with the additional terms $a_{12} x_{12}$, $a_{13} x_{13}$, $a_{23} x_{23}$, $a_{123} x_1 x_2 x_3$.

To use the standard plans, the following coding of parameters is needed:
$x_i = +1$ corresponds to $X_{i,max}$
$x_i = -1$ corresponds to $X_{i,min}$
This coding corresponds to the following linear transformation of the initial parameters $X_i$:

$x_i = \dfrac{2 (X_i - X_{i,min})}{X_{i,max} - X_{i,min}} - 1$

Calculation of Effects
Calculation of the Main Effects. With a factorial design, the average main effect of changing a factor (in the source example, thrombin level) from low to high can be calculated as the average response at the high level minus the average response at the low level:

Main effect of $X_i$ = ($\sum$ responses at high $X_i$ − $\sum$ responses at low $X_i$) / (half the number of runs in the experiment)

Response modeling
Response modelling (RM) is a widely used technique in engineering design. Most commonly, response modeling is used to compose a mathematical model describing the relation between input data and objective functions. In general, such an approach can be applied in the case of a time-consuming and expensive experimental study, but also for modeling complex numerical simulations. The mathematical model is composed on the basis of "learning data" and can be used for evaluating the objective function(s) for any set of input data in the design space considered.
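The coding transformation and the main-effect formula given earlier in this part can be sketched as follows. This is an illustrative Python sketch: the function names `code_level` and `main_effect` are assumptions, the bounds 300 and 400 are borrowed from the mixer speed example only to have numbers to code, and the response values are hypothetical (generated from y = 10 + 3·x1, so the main effect of x1 should be 6).

```python
def code_level(X, X_min, X_max):
    """Map a physical value X in [X_min, X_max] to the coded range [-1, +1]."""
    return 2.0 * (X - X_min) / (X_max - X_min) - 1.0


def main_effect(column, responses):
    """(sum of responses at high level - sum at low level) / (half the number of runs)."""
    high = sum(r for c, r in zip(column, responses) if c > 0)
    low = sum(r for c, r in zip(column, responses) if c < 0)
    return (high - low) / (len(responses) / 2)


# Coding of a factor varied between 300 and 400 (illustrative bounds).
print(code_level(300, 300, 400), code_level(350, 300, 400), code_level(400, 300, 400))

# x1 column of the 2^3 plan above and hypothetical responses y = 10 + 3*x1.
x1 = [1, -1, 1, -1, 1, -1, 1, -1]
y = [13, 7, 13, 7, 13, 7, 13, 7]
effect_x1 = main_effect(x1, y)
print(effect_x1)
```

Note that the main effect is twice the regression coefficient of the coded variable, since the coded variable changes by 2 units between its low and high levels.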
Some special cases can also be pointed out where it is reasonable to apply response modeling:
* The experimental study is not expensive or time consuming, but it cannot be performed in a certain sub-domain of the design space (technological or other limitations);
* Numerical simulations are not expensive or time consuming, but there are singularities in a certain sub-domain of the design space.
The RM techniques are commonly used for describing objective functions, but can be applied with the same success for describing constraint functions (or any functions).
Finally, it should be pointed out that there are some special situations where the application of ANN or other meta-modeling techniques may not be successful:
* The design space is poorly covered by "learning data" (data used in DOE). There are too few results, or the results are non-uniformly distributed and some sub-domains are not covered;
* The initial design space is well covered by DOE data, but there is a need to evaluate function values outside of the initial design space.

Artificial neural networks (ANN)
An Artificial Neural Network (ANN) is an information processing paradigm inspired by the human brain. ANNs consist of a large number of highly interconnected neurons working together to solve a particular problem. Each particular ANN is designed and configured, through a learning process, for solving a certain class of problems such as data classification, function approximation or pattern recognition. In the following, ANNs are considered for function approximation.
The theoretical concepts of ANN were introduced in the 1940s, and the first neural model was proposed by McCulloch and Pitts in 1943 using a simple neuron (the values of the input, output and weights were restricted to certain fixed values, etc.). Significant progress in ANN development was achieved by Rosenblatt in 1957, who introduced the one-layer perceptron and the neurocomputer (Mark I Perceptron). A multi-layer perceptron (MLP) model came into use in 1960.
The learning algorithm (backpropagation algorithm) for the three-layer perceptron was proposed by Werbos in 1974. The MLP became popular after 1986, when Rumelhart and McClelland generalized the backpropagation algorithm for MLP. Radial Basis Function (RBF) networks were first introduced by Broomhead & Lowe in 1988, and the Self-Organizing Map (SOM) network model by Kohonen in 1982.

ANN: neuron and transfer functions (McCulloch and Pitts (1943))
$f(x) = \dfrac{1}{1 + e^{-x}}$

Artificial neural networks (ANN)
A simple two-layer perceptron is shown in Figure 3 (Leeventurf):

$y_3 = f_3\big( w_{31} f_1(w_{11} x_1 + w_{12} x_2 + w_{10}) + w_{32} f_2(w_{21} x_1 + w_{22} x_2 + w_{20}) + w_{30} \big)$

Briefly, the mathematical model of the two-layer network can be obtained by:
• summing the inputs multiplied by the weights of the input layer and adding the bias;
• applying the transfer function for each neuron;
• summing the obtained results multiplied by the weights of the hidden layer and adding the bias;
• applying the transfer function of the output layer.

Artificial neural networks (ANN)
Most commonly, for function approximation, the radial basis and linear transfer functions are used in the hidden and output layers, respectively. The architecture of the ANN considered is shown in the figure.
[Figure: Architecture of the two-layer feedforward neural network]
Note that in the literature the input layer is commonly not counted, and the ANN shown in Fig. 3 is considered a two-layer network (the input layer has no transfer function, just input data and weights).

Artificial neural networks (ANN)
The most commonly used backpropagation learning algorithm is the steepest descent method (one of the gradient methods). However, the shortcoming of this method is its slow convergence. Newton's method has a good (quadratic) convergence rate, but it is sensitive with respect to the initial data. For that reason, the Levenberg–Marquardt learning algorithm, which has a second-order convergence rate, is used here.
The update rule of the Levenberg–Marquardt algorithm is a blend of the simple gradient descent and Gauss-Newton methods and is given as
$$x_{i+1} = x_i - \left(H + \lambda\,\mathrm{diag}[H]\right)^{-1} \nabla f(x_i)$$
where $H$ is the Hessian matrix evaluated at $x_i$, and $\lambda$ and $\nabla f$ stand for the scaling coefficient and the gradient vector, respectively. The Levenberg–Marquardt algorithm is faster than the pure gradient method and is less sensitive to the choice of the starting point than the Gauss-Newton method.

Sensitivity analysis for ANN models
Sensitivity analysis applied to the trained neural network model allows the relevant and critical parameters to be identified among the full set of input parameters. Special attention should be paid to the sensitivities at points corresponding to optimal designs.
The output of the three layer perceptron network described above can be computed as
$$Y = G_2\big(W_2\, G_1(W_1 X + \Theta_1) + \Theta_2\big)$$
where
• $X = (x_1, \dots, x_n)^T$ is the input vector and $Y = (y_1, \dots, y_m)^T$ is the output vector;
• $W_1$, $W_2$ stand for the weight matrices (the hidden layer has $k$ neurons) and $\Theta_1$, $\Theta_2$ for the bias vectors;
• $G_1$, $G_2$ are the transfer functions.
The sensitivity matrix $S$ can be computed as the gradient of the output vector $Y$:
$$S = \frac{\partial Y}{\partial X}$$
For the MLP:
$$S_{MLP} = \frac{\partial Y}{\partial X} = \frac{\partial G_2}{\partial Z_2}\, W_2\, \frac{\partial G_1}{\partial Z_1}\, W_1, \qquad Z_1 = W_1 X + \Theta_1, \quad Z_2 = W_2 G_1(Z_1) + \Theta_2$$
and for a network with $N$ layers:
$$\frac{\partial Y}{\partial X} = \frac{\partial G_N}{\partial Z_N}\, W_N \cdots \frac{\partial G_2}{\partial Z_2}\, W_2\, \frac{\partial G_1}{\partial Z_1}\, W_1, \qquad Z_1 = W_1 X + \Theta_1, \quad Z_2 = W_2 G_1(Z_1) + \Theta_2, \ \dots, \ Z_N = W_N G_{N-1}(Z_{N-1}) + \Theta_N$$

ANN applications
Real-life applications. The tasks artificial neural networks are applied to tend to fall within the following broad categories:
• Function approximation, or regression analysis, including time series prediction and modeling.
• Classification, including pattern and sequence recognition, novelty detection and sequential decision making.
• Data processing, including filtering, clustering and compression.
• Robotics, including directing manipulators.
• Control, including computer numerical control.
Application areas include system identification and control (vehicle control, process control, natural resources management), game-playing and decision making, pattern recognition (radar systems, face identification, object recognition and more), sequence recognition (speech, handwritten text recognition), medical diagnosis, financial applications (automated trading systems), data mining (or knowledge discovery in databases), visualization and e-mail spam filtering.

Neural Network Toolbox
The Neural Network Toolbox provides functions for modeling complex nonlinear systems that are not easily modeled with traditional methods. It supports supervised learning with feedforward and dynamic networks, and unsupervised learning with self-organizing maps and competitive layers. With the toolbox you can design, train, visualize, and simulate neural networks.

Neural network model: using newff
Program example (a new example is needed). Reinforcement: yes, no

    % Input data
    p=[ 3 3 1 2 3 2 1 3 3 2 2 1 2 1 1 2 3 2 1 2; ...
        2 1 1 0 2 2 2 0 2 0 1 1 1 0 2 2 1 2 2 0; ...
        2 3 2 2 1 2 3 3 1 1 1 0 0 2 3 2 3 1 1 2; ...
        2 0 1 2 2 0 1 0 1 1 2 2 2 0 0 2 0 2 1 0; ...
        2 3 2 1 1 2 3 3 1 2 2 3 2 1 2 3 2 1 2 3; ...
        2 2 2 1 1 1 1 2 2 1 1 2 2 2 1 2 1 2 1 1];
    t=[1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0];
    % Define the neural network model: one hidden layer with 4 neurons.
    % In the newff(P,T,S,...) form S lists the hidden layers only;
    % the size of the output layer is taken from t.
    net = newff(p,t,4,{'logsig','logsig'},'trainlm');
    % Adjust the training parameters
    net.trainParam.show=100;
    net.trainParam.lr=0.05;      % learning rate (used by gradient-descent trainers; trainlm adapts mu instead)
    net.trainParam.epochs=3000;
    net.trainParam.goal=0.00001;
    [net,tr]=train(net,p,t);
    % Check the solution
    a=round(sim(net,p))
    % Output the parameters
    bias1=net.b{1}
    bias2=net.b{2}
    weights1=net.IW{1,1}
    weights2=net.LW{2,1}

Neural network model: using newff

    P = [0 1 2 3 4 5 6 7 8 9 10];
    T = [0 1 2 3 4 3 2 1 2 3 4];
    % Two-layer feed-forward network; the input ranges from 0 to 10.
    % The first layer has five tansig neurons, the second layer has one purelin neuron.
    % The trainlm network training function is to be used.
    % (Older newff(PR,S,...) form: S lists all layers, including the output layer.)
    net = newff([0 10],[5 1],{'tansig' 'purelin'});
    % Here the network is simulated before training
    Y = sim(net,P);
    plot(P,T,P,Y,'o')
    % The network is trained for 50 epochs.
    net.trainParam.epochs = 50;
    net = train(net,P,T);
    % New simulation and plot
    Y2 = sim(net,P);
    plot(P,T,P,Y,'o',P,Y2,'rx')

Neural network model: using newff

    % 1. Initial data, input data (variables, levels)
    x1L=[1, 1.5, 2, 2.5];
    x2L=[1, 1.5, 2, 2.5];
    x3L=[0.5, 1.0, 1.5, 2];
    % A full factorial design needs 4^3=64 experiments
    % 2. Input corresponding to the Taguchi design L16
    x1=[1,1,1,1,1.5,1.5,1.5,1.5,2,2,2,2,2.5,2.5,2.5,2.5];
    x2=[1,1.5,2,2.5,1,1.5,2,2.5,1,1.5,2,2.5,1,1.5,2,2.5];
    x3=[0.5,1,1.5,2,1,0.5,2,1.5,1.5,2,0.5,1,2,1.5,1,0.5];
    S=[x1;x2;x3];
    % 3. This formula is used instead of experiments
    V=x1.^2+x2.^2+x3.^2;
    % 4. Function approximation using ANN
    global net;
    % Define the network: one hidden layer with 15 neurons
    % (the output layer size is taken from V)
    net=newff(S,V,15,{'tansig','purelin'},'trainlm');
    net.trainParam.lr=0.05;      % learning rate
    net.trainParam.epochs=50;
    % Train the network
    net = train(net, S, V);

Neural network model: model validation
Homework 2
Complete the MATLAB application started at the lecture (see slides 76-77).
1. Improve the response surface accuracy:
   a) use a Taguchi design with 5 levels (L25);
   b) use a full factorial design (keeping 4 levels).
2. Test the results with some test data.
• Explain the DOE methods used (advantages, disadvantages).
• Solve the problem by applying linear regression and compare the results.
• Estimate the accuracy of the model.
• Search for a better configuration of the ANN (number of neurons, layers, selection of transfer functions).
• Try other functions for ANN models (newgrnn or feedforwardnet).
• Describe the meaning of the parameters and compare the results.
NB: Finally, generate new datasets of size a) 100; b) 200; c) 300 using random values (S=2*rand(3,100)+0.5), use the same function to obtain the "test" results, then compare and assess the results.

Optimal design of engineering systems
In general, optimal design in engineering can be defined as the search for solutions that give the minimum (or maximum) value of the objective function and satisfy the given constraints.
Typical objective functions: cost of the product, service time, net profit, strength (stiffness) characteristics, quality characteristics, etc.
The constraints may be technological, resource-related, etc.
Optimal design is especially needed for complex problems, and also when systems and/or processes, rather than a particular product, are subjected to optimization.

Types of Optimization Tasks
• Some specific research areas: operations research, decision theory, ...
Problems may be classified into
• Decision making
• Planning (design of system parameters)
Depending on the model used, the problem may be classified as
• Deterministic
• Stochastic
or
• One step design
• Iterative design
Classification depending on the solution method
• Analytical solution
• Numerical solution
• Semianalytical solution
Planning can be divided into
• Linear planning
• Integer planning
• Mixed integer planning
• Nonlinear planning
• Game theory

Decision making
In sciences and engineering, preference refers to the set of assumptions related to ordering some alternatives, based on the degree of satisfaction or utility they provide, a process which results in an optimal "choice" (whether real or theoretical).
1. Deterministic search
   • Simple, with one criterion
   • Multicriteria
2. Stochastic search (incomplete information)

Deterministic search, two steps: search for a feasible solution, then search for the best solution

    Alternative                                          1  2  3  4  5  6  7  8  9  10
    Selection criterion (objective function) max G1(x)   2  4  5  7  4  3  6  4  3  5

Search in the case of limited/incomplete information
1. Variants are evaluated by mean values
2. Confidence levels and variance are considered
3. Correlation is considered, etc.

Multicriteria search (decision making), optimal portfolio selection

    Alternative                                          1  2  3  4  5  6  7  8  9  10
    Selection criterion (objective function) max G1(x)   2  4  5  7  4  3  6  4  3  5
    Selection criterion min G2(x)                        1  2  3  4  5  3  4  3  2  5

Pareto concept

    Decision   A  B  C  D  E
    Max G1     1  2  3  4  5
    Max G2     2  4  3  4  4
    Min G3     1  1  2  2  4

Solution B dominates A. Solution D dominates C. Pareto optimal set: B, D, E.

    Decision   B  D  E
    Max G1     2  4  5
    Max G2     4  4  4
    Min G3     1  2  4

Normalization of optimality criteria
For maximization: $f_i = \dfrac{F_i^* - F_i(x)}{F_i^* - F_{i*}}$
For minimization: $f_i = \dfrac{F_i(x) - F_{i*}}{F_i^* - F_{i*}}$
where $F_i^*$ and $F_{i*}$ are the maximum and minimum values of the i-th criterion.
Normalized criteria for the example, scaled so that 1 marks the best value of each criterion (i.e. the values $1 - f_i$):

    Decision   A     B     C     D     E
    Max G1     0     0.25  0.5   0.75  1
    Max G2     0     1     0.5   1     1
    Max(-G3)   1     1     0.67  0.67  0

Approach 3.
Preferred functions

    Class           G1                  G2                  -G3
    Ideal           G1 >= 5.0           G2 >= 4.0           -G3 >= 4.0
    Very good       4.0 <= G1 < 5.0     3.5 <= G2 < 4.0     3.25 <= -G3 < 4.0
    Good            3.0 <= G1 < 4.0     3.0 <= G2 < 3.5     2.5 <= -G3 < 3.25
    Satisfactory    2.0 <= G1 < 3.0     2.5 <= G2 < 3.0     1.75 <= -G3 < 2.5
    Not suggested   1.0 < G1 < 2.0      2.0 < G2 < 2.5      1.0 < -G3 < 1.75
    Not allowed     G1 <= 1.0           G2 <= 2.0           -G3 <= 1.0

Each solution by criteria

    Decision (alternative)   Max G1         Max G2         Max(-G3)
    A                        Not allowed    Not allowed    Ideal
    B                        Satisfactory   Ideal          Ideal
    C                        Good           Good           Very good
    D                        Very good      Ideal          Good

Weighted summation based search
Gsum = W1*X1 + W2*X2 + W3*X3, where W1 + W2 + W3 = 1.0

Productivity of each person

         Person 1   Person 2   Person 3
    a    0.8        0.15       0.2
    b    0.5        0.4        0.3
    c    0.35       0.9        0.1

MATHEMATICAL OPTIMIZATION PROBLEM STATEMENT
The step-by-step procedure for formulating an optimization model is as follows:
• Decision variables. Identify those aspects of a problem that can be adjusted to improve performance. Represent them as decision variables.
• Design constraints. Identify the restrictions or constraints that bound a problem. Express the restrictions/constraints as mathematical equations.
• Design objectives. Express the system effectiveness measures as one or more objective functions.

Definition of Optimization problem
The goal of an optimization problem can be formulated as follows: find the combination of parameters (independent variables) which optimizes a given quantity, possibly subject to some restrictions on the allowed parameter ranges. The quantity to be optimized (maximized or minimized) is termed the objective function; the parameters which may be changed in the quest for the optimum are called control or decision variables; the restrictions on allowed parameter values are known as constraints.
The general optimization problem may be stated mathematically in terms of f(x), x and ci(x), where f(x) is the objective function, x is the column vector of the independent variables, and ci(x) is the set of constraint functions.
Constraint equations of the form $c_i(x) = 0$ are termed equality constraints, and those of the form $c_i(x) \ge 0$ (or $c_i(x) \le 0$) are inequality constraints. Taken together, f(x) and ci(x) are known as the problem functions.

Classical problems
Determining the extremum of a function of one variable:
$$\frac{df(x)}{dx} = 0$$
$$\frac{d^2 f(x)}{dx^2} < 0 \quad \text{local maximum}, \qquad \frac{d^2 f(x)}{dx^2} > 0 \quad \text{local minimum}$$

Lagrange method
Objective function: $f(x, y)$
Constraints 1 and 2 in the form: $\varphi_1(x, y) = 0; \ \varphi_2(x, y) = 0.$
Lagrange function L:
$$L(x, y, \lambda_1, \lambda_2) = f(x, y) + \lambda_1 \varphi_1(x, y) + \lambda_2 \varphi_2(x, y)$$
The optimal solution satisfies the following equations:
$$\frac{\partial L}{\partial x} = 0; \quad \frac{\partial L}{\partial y} = 0; \quad \frac{\partial L}{\partial \lambda_1} = 0; \quad \frac{\partial L}{\partial \lambda_2} = 0$$

Linear planning
Given: matrix A and vectors b and c.
Find the maximum of the function $f(x) = c^T x$, where $Ax \le b$ and $x \ge 0$.
Alternatively: $\max\{ c^T x : Ax \le b,\ x \ge 0 \}$
Example:
Max (x1) where
x1 <= 3
x1 + x2 <= 5
x >= 0

Linear planning (2)
Example. Suppose that we wish to maximize
Objective = 10*x1 + 11*x2
subject to the constraint pair
5*x1 + 4*x2 <= 40
2*x1 + 4*x2 <= 24
and x1 >= 0 and x2 >= 0.
Together these four constraints define the feasible domain.

Dual problem
Initial (primal) problem: $\max\{ c^T x : Ax \le b,\ x \ge 0 \}$
Dual problem: $\min\{ y^T b : A^T y \ge c,\ y \ge 0 \}$
The solution of the primal linear program answers the tactical question: it tells us how much to produce. The dual can have a far greater impact, because it addresses strategic issues regarding the structure of the business itself.
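The primal and dual statements above can be checked on the two-variable example (maximize 10*x1 + 11*x2). The sketch below assumes the Optimization Toolbox function linprog is available. Since linprog minimizes, the primal objective is negated, and the dual constraint A'*y >= c is rewritten as -A'*y <= -c. Variable names are illustrative.

```matlab
% Primal: max 10*x1 + 11*x2  s.t.  5*x1 + 4*x2 <= 40,  2*x1 + 4*x2 <= 24,  x >= 0
c = [10; 11];
A = [5 4; 2 4];
b = [40; 24];

% linprog minimizes, so the objective is negated for maximization
[x, fneg] = linprog(-c, A, b, [], [], zeros(2,1), []);
fPrimal = -fneg;

% Dual: min b'*y  s.t.  A'*y >= c,  y >= 0   (written as -A'*y <= -c)
[y, fDual] = linprog(b, -A', -c, [], [], zeros(2,1), []);

fprintf('primal: x = (%.4f, %.4f), f = %.4f\n', x(1), x(2), fPrimal);
fprintf('dual:   y = (%.4f, %.4f), f = %.4f\n', y(1), y(2), fDual);
```

A hand calculation with both constraints active gives x = (16/3, 10/3) with objective value 90, and the dual solution y = (1.5, 1.25) with the same value 90. The dual variables can be read as shadow prices: relaxing the first constraint by one unit would raise the optimal objective by about 1.5, which links the dual problem to the sensitivity analysis mentioned in the course outline.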
Car protection system for Tarmetec OÜ (EU norms)
Optimized component: fastening component
FEA model
• FEA system: LS-Dyna
• >2000 fully integrated shell elements
• Multi-linear material model
• Load case 1, static stiffness of the structure: LS-Dyna implicit solver (displacement caused by forces Fx and Fy)
• Load case 2, dynamic properties of the structure: LS-Dyna explicit solver (reaction forces caused by dynamic force Fz)

Design optimization
• The objectives: $\min\big( (F_{z\,\max} - F_{z\,\mathrm{final}}),\ F_{z\,\max} \big)$, where $F_z$ is the axial component of the reaction force under dynamic loading
• Design constraint: $u_{\max} \le u_{\max}^{*}$, where $u_{\max}$ is the maximum deformation under static loading (loads Fx and Fy)
• Design variables: 4 (a, b, c and e)

Results
The optimization study helped to streamline the whole product portfolio:
• Reduced cost
• Reduced mass
• Improved safety
Initial design:
• Heavy tubes
• Stiff brackets
• Does not pass the tests
Optimized design:
• Light tubes (same diameter, thinner wall)
• Brackets with calibrated stiffness
• Passes the tests
[Figure: reaction force (Force, N) versus time (Time, s) for the initial design and the optimised design]

Design of large composite parts
Topology Optimization
• Software: OptiStruct (module of HyperWorks)
• Goal: to find the optimal thickness distribution of the reinforcement layer
• Constraints:
  – Thickness variation 1...8 mm
  – Max equivalent stress 80 MPa
  – Max displacement 5 mm
• 7 different optimization studies (different values of maximum layer thickness)
• 50 to 100 iterations
[Figure: thickness variation 1...40 mm]

Design of reinforcement layer
The objectives:
$$F_1(x) = C(x_1, x_2, \dots, x_n), \qquad F_2(x) = T(x_1, x_2, \dots, x_n)$$
subjected to linear constraints:
$$x_i \le x_i^{*}, \quad x_i \ge x_{i*}, \quad i = 1, \dots, n$$
and non-linear constraints:
$$u(x_1, x_2, \dots, x_n) \le u^{*}, \qquad \sigma_e(x_1, x_2, \dots, x_n) \le \sigma_e^{*}$$
where
C(x) and T(x) are the cost and manufacturing time of the glass-fiber-epoxy layer,
x is the vector of design variables,
$x_i^{*}$ and $x_{i*}$ are the upper and lower bounds of the i-th design variable,
u is the displacement,
$\sigma_e$ is the effective stress,
$u^{*}$ and $\sigma_e^{*}$ are the corresponding upper limits.

Integer Programming
Integer programming: branch and bound
[Figure: feasible region and branch-and-bound tree for an example whose LP solution is x1 = 0.8, x2 = 2.4, with branching on x1 (0 or 1) and on x2 (2 or 3)]
Source: http://www.sce.carleton.ca/faculty/chinneck/po.html (slide sample)

Integer programming: branch and bound
Max f = 5x1 + 8x2
x1 + x2 ≤ 6; 5x1 + 9x2 ≤ 45
x1, x2 ≥ 0 integer
Relaxation -> solution: (x1, x2) = (2.25, 3.75); f = 41.25
Branching on the variable x2 yields two subproblems.
Subproblem 1: Max f = 5x1 + 8x2 s.t. x1 + x2 ≤ 6, 5x1 + 9x2 ≤ 45, x2 ≤ 3, x1, x2 ≥ 0. Opt. solution: (3, 3); f = 39
Subproblem 2: Max f = 5x1 + 8x2 s.t. x1 + x2 ≤ 6, 5x1 + 9x2 ≤ 45, x2 ≥ 4, x1, x2 ≥ 0. Opt. solution: (1.8, 4); f = 41
Solution tree:

    All: (2.25, 3.75); f = 41.25
      S1: x2 ≤ 3 -> (3, 3); f = 39
      S2: x2 ≥ 4 -> (1.8, 4); f = 41

If further branching on a subproblem will yield no useful information, then we can fathom (dismiss) the subproblem: currently subproblem 1, whose solution is already integral.
The best integer solution found so far is stored as the incumbent. The value of the incumbent is denoted by f*. In our case, the first incumbent is (3, 3), and f* = 39.
Branch Subproblem 2 on x1:
Subproblem 3: the new restriction is x1 ≤ 1. Opt. solution (1, 4.44) with value 40.56
Subproblem 4: the new restriction is x1 ≥ 2. The subproblem is infeasible
Solution tree:

    All: (2.25, 3.75); f = 41.25
      S1: x2 ≤ 3 -> (3, 3); f = 39
      S2: x2 ≥ 4 -> (1.8, 4); f = 41
        S3: x1 ≤ 1 -> (1, 4.44); f = 40.56
        S4: x1 ≥ 2 -> infeasible

Branch Subproblem 3 on x2:
Subproblem 5: the new restriction is x2 ≤ 4. Opt. solution (1, 4); f = 37
Subproblem 6: the new restriction is x2 ≥ 5. Opt. solution (0, 5); f = 40
Solution tree:

    All: (2.25, 3.75); f = 41.25
      S1: x2 ≤ 3 -> (3, 3); f = 39
      S2: x2 ≥ 4 -> (1.8, 4); f = 41
        S3: x1 ≤ 1 -> (1, 4.44); f = 40.56
          S5: x2 ≤ 4 -> (1, 4); f = 37
          S6: x2 ≥ 5 -> (0, 5); f = 40
        S4: x1 ≥ 2 -> infeasible

If the optimal value of a subproblem is ≤ f*, then it is fathomed.
• In our case, Subproblem 5 is fathomed because 37 ≤ 39 = f*.
If a subproblem has an integral optimal solution x*, and its value is > f*, then x* replaces the current incumbent.
• In our case, Subproblem 6 has an integral optimal solution, and its value is 40 > 39 = f*. Thus, (0, 5) is the new incumbent, and the new f* = 40.
If there are no unfathomed subproblems left, then the current incumbent is an optimal solution for (IP).
• In our case, (0, 5) is an optimal solution with optimal value 40.

Nonlinear planning
$$\min\{ f(x) : F(x) \le 0,\ x \ge 0 \}$$

Optimization Toolbox MATLAB

    bintprog      Solve binary integer programming problems
    fgoalattain   Solve multiobjective goal attainment problems
    fminbnd       Find minimum of single-variable function on fixed interval
    fmincon       Find minimum of constrained nonlinear multivariable function
    fminimax      Solve minimax constraint problem
    fminsearch    Find minimum of unconstrained multivariable function using derivative-free method
    fminunc       Find minimum of unconstrained multivariable function
    fseminf       Find minimum of semi-infinitely constrained multivariable nonlinear function
    ktrlink       Find minimum of constrained or unconstrained nonlinear multivariable function using KNITRO third-party libraries
    linprog       Solve linear programming problems
    quadprog      Solve quadratic programming problems

New models
In artificial intelligence, an evolutionary algorithm (EA) is a generic population-based metaheuristic optimization algorithm. An EA uses mechanisms inspired by biological evolution: reproduction, mutation, recombination, and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and the fitness function determines the environment within which the solutions "live" (see also cost function).
Evolution of the population then takes place through the repeated application of the above operators.
Evolutionary algorithms often perform well in approximating solutions to all types of problems; this generality is shown by successes in fields as diverse as engineering, art, biology, economics, marketing, genetics, operations research, robotics, social sciences, physics, politics and chemistry.

A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover.
Genetic algorithms are a sub-field of evolutionary algorithms, which in turn belong to stochastic optimization and, more broadly, to optimization.

Simple generational genetic algorithm procedure:
• Choose the initial population of individuals
• Evaluate the fitness of each individual in that population
• Repeat on this generation until termination (time limit, sufficient fitness achieved, etc.):
  – Select the best-fit individuals for reproduction
  – Breed new individuals through crossover and mutation operations to give birth to offspring
  – Evaluate the individual fitness of the new individuals
  – Replace the least-fit population with the new individuals

Genetic Algorithm and Direct Search Toolbox MATLAB
Genetic Algorithm and Direct Search Toolbox functions extend the capabilities of Optimization Toolbox™. These algorithms enable you to solve a variety of optimization problems that lie outside the scope of Optimization Toolbox solvers. They include routines for solving optimization problems using
• Direct search
• Genetic algorithm
• Simulated annealing

What Is Direct Search?
Direct search is a method for solving optimization problems that does not require any information about the gradient of the objective function.
Unlike more traditional optimization methods, a direct search algorithm searches a set of points around the current point, looking for one where the value of the objective function is lower than the value at the current point. You can use direct search to solve problems for which the objective function is not differentiable, or is not even continuous.

What Is the Genetic Algorithm?
The genetic algorithm is a method for solving both constrained and unconstrained optimization problems that is based on natural selection, the process that drives biological evolution. The genetic algorithm repeatedly modifies a population of individual solutions. At each step, the genetic algorithm selects individuals at random from the current population to be parents and uses them to produce the children for the next generation. Over successive generations, the population "evolves" toward an optimal solution. You can apply the genetic algorithm to solve a variety of optimization problems that are not well suited for standard optimization algorithms, including problems in which the objective function is discontinuous, nondifferentiable, stochastic, or highly nonlinear.
The genetic algorithm uses three main types of rules at each step to create the next generation from the current population:
• Selection rules select the individuals, called parents, that contribute to the population at the next generation.
• Crossover rules combine two parents to form children for the next generation.
• Mutation rules apply random changes to individual parents to form children.
GA: http://www.myreaders.info/09_Genetic_Algorithms.pdf

What Is Simulated Annealing?
Simulated annealing is a method for solving unconstrained and bound-constrained optimization problems. The method models the physical process of heating a material and then slowly lowering the temperature to decrease defects, thus minimizing the system energy.
At each iteration of the simulated annealing algorithm, a new point is randomly generated. The distance of the new point from the current point, or the extent of the search, is based on a probability distribution with a scale proportional to the temperature. The algorithm accepts all new points that lower the objective, but also, with a certain probability, points that raise the objective. An annealing schedule is selected to systematically decrease the temperature as the algorithm proceeds. As the temperature decreases, the algorithm reduces the extent of its search to converge to a minimum.

GA sample
Sometimes the goal of an optimization is to find the global minimum or maximum of a function. However, optimization algorithms sometimes return a local minimum. The genetic algorithm can sometimes overcome this deficiency with the right settings.

GA sample 2
Genetic algorithms belong to the evolutionary algorithms (EA), which generate solutions to optimization problems. Genetic algorithms find application in engineering, economics, chemistry, manufacturing, and other fields.
[Figure: the 2006 NASA spacecraft antenna. This complicated shape was found by an evolutionary computer design program to create the best radiation pattern.]

Multidisciplinary design optimization
Multidisciplinary design optimization (MDO) is a field of engineering that uses optimization methods to solve design problems incorporating a number of disciplines. MDO allows designers to incorporate all relevant disciplines simultaneously. The optimum of the simultaneous problem is superior to the design found by optimizing each discipline sequentially, since it can exploit the interactions between the disciplines. However, including all disciplines simultaneously significantly increases the complexity of the problem.
These techniques have been used in a number of fields, including automobile design, naval architecture, electronics, architecture, computers, and electricity distribution.
However, the largest number of applications have been in the field of aerospace engineering, such as aircraft and spacecraft design.

Multicriteria optimization
In the following, the multicriteria optimization problem is formulated in a general form covering the different engineering applications (case studies) considered in the current thesis.
1. Problem formulation
Practical engineering problems often include several objectives (strength and stiffness characteristics, cost, time, etc.) as well as different constraints (technological, geometric, limitation of resources, etc.). Thus, the multicriteria optimization problem can be formulated as the optimization of the objectives (10) subject to the constraints (11) and the bounds on the design variables (12).
In (10)-(12), $F_i(x)$ stand for the objective functions (describing stiffness/strength, electrical properties, cost, etc.) and $x$ is an n-dimensional vector of design variables. The upper and lower bounds of the design variables are denoted by $x_i^{*}$ and $x_{i*}$, respectively. The constraint functions include both linear and nonlinear constraints. Note that equality constraints can be converted into inequality constraints and are therefore covered by (11). The n-dimensional design space is defined by the lower and upper bounds of the design variables.

The objective functions subjected to maximization and minimization can be normalised by formulas (13) and (14), respectively:
$$f_i(x) = \frac{F_i^{*} - F_i(x)}{F_i^{*} - F_{i*}} \qquad (13)$$
$$f_i(x) = \frac{F_i(x) - F_{i*}}{F_i^{*} - F_{i*}} \qquad (14)$$
where $F_i^{*}$ and $F_{i*}$ are the (estimated) maximum and minimum values of the i-th objective function. After normalization with (13)-(14), both kinds of functions are subjected to minimization. In (13) the difference between the maximum and current values of the objective function is minimized; in (14) the difference between the current and minimum values is minimized. Note that the values of the normalized objective functions are not necessarily in the interval [0; 1], since the maximal and minimal values used in (13)-(14) are estimates.
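The closing remark about values leaving the interval [0; 1] can be seen in a small numerical sketch. It assumes that formulas (13) and (14) have the form described in the text (maximum minus current value, and current minus minimum value, each divided by the range). The data are the G1 and G3 rows of the Pareto example earlier in these notes, and the underestimated maximum 4.5 is a made-up value for illustration.

```matlab
% Normalization of objectives with exact and with estimated extreme values
G1 = [1 2 3 4 5];          % criterion to be maximized
G3 = [1 1 2 2 4];          % criterion to be minimized

% Formula (13), maximization: f = (Fmax - F) / (Fmax - Fmin)
normMax = @(F, Fmin, Fmax) (Fmax - F) ./ (Fmax - Fmin);
% Formula (14), minimization: f = (F - Fmin) / (Fmax - Fmin)
normMin = @(F, Fmin, Fmax) (F - Fmin) ./ (Fmax - Fmin);

f1    = normMax(G1, 1, 5);     % exact extremes: values lie in [0, 1]
f3    = normMin(G3, 1, 4);     % exact extremes: values lie in [0, 1]
f1est = normMax(G1, 1, 4.5);   % underestimated maximum: a value drops below 0

disp([f1; f3; f1est]);
```

With exact extremes the best alternative of each criterion gets the value 0 and the worst gets 1. With the underestimated maximum, alternative E (G1 = 5) receives a negative normalized value, which is harmless for minimization but should be kept in mind when weights are assigned.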
Multicriteria optimization
Physical programming
According to the previous section, the solution techniques should be chosen after an analysis of the optimality criteria. Furthermore, it was pointed out that in the current study the physical programming techniques are applied to objectives which are not conflicting [46], [47], [48].
According to the weighted summation technique, the optimality criteria are first scaled, then multiplied by weights and summed into a new objective.
According to the compromise programming technique, the combined objective is defined as a family of distance functions.

Pareto optimality concept
As pointed out above, the use of the Pareto optimality concept is justified when contradictory behaviour can be perceived between the objective functions [50], [51]. The physical programming techniques discussed above are based on combining multiple objectives into one objective and solving the latter problem as a single objective optimisation problem.

Model simulation and evaluation
1. Variance analysis.
2. Sensitivity analysis.
Robustness analysis

ROBUST OPTIMIZATION
The Taguchi optimization method minimises the variability of the performance under uncertain operating conditions. Therefore, in order to perform an optimisation with uncertainties, the fitness function(s) should be associated with two statistical quantities:
• the mean value
• the variance or standard deviation
Sample objectives: minimise
1. the total weight,
2. the normalised mean displacement,
3. the standard deviation of the displacement.

Homework 3
1. Solve an optimization problem with one optimality criterion:
   a) use function fmincon
   b) use function ga
   c) use the hybrid approach ga+fmincon
   At least one sample solution should use the ANN model built in Homework 2.
2. Solve a multicriteria optimization problem using function gamultiobj:
   a) use the weighted summation technique (sample where the objectives are not conflicting)
   b) use the Pareto optimality concept (sample where the objectives are conflicting)
3. Robust optimization. A sample with one optimality criterion can be considered as a multicriteria optimization problem by adding the standard deviation or variance of some output value as a new criterion.

    clear all
    % 1. Initial data, input data (variables, levels)
    x1L=[1, 1.5, 2, 2.5];
    x2L=[1, 1.5, 2, 2.5];
    x3L=[0.5, 1.0, 1.5, 2];
    % A full factorial design needs 4^3=64 experiments
    % 2. Input corresponding to the Taguchi design L16
    x1=[1,1,1,1,1.5,1.5,1.5,1.5,2,2,2,2,2.5,2.5,2.5,2.5];
    x2=[1,1.5,2,2.5,1,1.5,2,2.5,1,1.5,2,2.5,1,1.5,2,2.5];
    x3=[0.5,1,1.5,2,1,0.5,2,1.5,1.5,2,0.5,1,2,1.5,1,0.5];
    S=[x1;x2;x3];
    % 3. This formula is used instead of experiments
    V=x1.^2+x2.^2+x3.^2;
    % 4. Function approximation using ANN
    global net;
    % Define the network (one hidden layer with 10 neurons)
    net=newff(S,V,[10],{'tansig','purelin'},'trainlm');
    net.trainParam.lr=0.05;      % learning rate
    net.trainParam.epochs=50;
    % Train the network
    net = train(net, S, V);
    % Optimization
    x0=[1.8,1.7,1.9];
    % Linear inequality constraints A*x <= b:
    % 2*x1+3*x2<=20 ; 5*x1-x2<=15
    % (A needs one column per design variable, so x3 gets a zero column)
    A=[2, 3, 0; 5, -1, 0];
    b=[20;15];
    Aeq=[]; beq=[];
    lb=[1, 1, 0.5]; ub=[2.5, 2.5, 2];
    % Using fmincon (gradient method)
    [x,f,error_flag]=fmincon(@mer9020_ob1,x0,A,b,Aeq,beq,lb,ub)
    % Using ga (genetic algorithm)
    [x,f,error_flag]=ga(@mer9020_ob1,3,A,b,Aeq,beq,lb,ub)
    % Hybrid approach: the ga result is the starting point for fmincon
    [xxx,f,error_flag]=ga(@mer9020_ob1,3,A,b,Aeq,beq,lb,ub)
    [x,f,error_flag]=fmincon(@mer9020_ob1,xxx,A,b,Aeq,beq,lb,ub)

Supporting material for homework 3

    % Multiobjective optimization
    options = gaoptimset('PopulationSize',100,'Generations',30,'Display','iter',...
        'ParetoFraction',0.7,'PlotFcns', {@gaplotpareto});
    [x,f,error_flag]=gamultiobj(@mer9020_ob2,3,A,b,Aeq,beq,lb,ub,options)

The following functions should be in separate files.
a) File: mer9020_ob1.m

    function [ f ] = mer9020_ob1( x )
    global net;
    f(1)=sim(net,x');
    % f(2)=..... needed only in case of multiobjective problems
    end

b) File: mer9020_ob2.m

    function [ f ] = mer9020_ob2( x )
    global net;
    f(1)=sim(net,x');
    f(2)=x(1)+x(2)^2+x(3)*x(2);
    end
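Item 3 of Homework 3 (robust optimization) needs an objective that returns both the mean and the spread of a response. The sketch below is one possible starting point, not the course solution. It is based on assumptions: it uses the analytic test function V = x1^2 + x2^2 + x3^2 in place of the trained network, and the name robust_objectives, the noise level 0.05 and the sample size 2000 are hypothetical choices. The same noise sample is reused for every design so that the objective is deterministic.

```matlab
% Robust objectives by Monte Carlo sampling (illustrative sketch)
rng(1);                              % fixed seed: reproducible noise sample
nSamples = 2000;                     % hypothetical sample size
sigma    = 0.05;                     % hypothetical std of the input perturbation
noise    = sigma * randn(nSamples, 3);

V = @(X) sum(X.^2, 2);               % test function used instead of experiments

% Each row of bsxfun(@plus, noise, x) is one perturbed copy of the design x;
% the handle returns [mean, standard deviation] of the response
robust_objectives = @(x) [mean(V(bsxfun(@plus, noise, x))), ...
                          std(V(bsxfun(@plus, noise, x)))];

f_small = robust_objectives([1.0, 1.0, 0.5]);
f_large = robust_objectives([2.5, 2.5, 2.0]);
fprintf('design [1 1 0.5]:   mean = %.4f, std = %.4f\n', f_small(1), f_small(2));
fprintf('design [2.5 2.5 2]: mean = %.4f, std = %.4f\n', f_large(1), f_large(2));
```

For this test function both the mean and the standard deviation grow with the distance of the design from the origin, so the two objectives do not conflict. A handle of this form could be passed to gamultiobj in place of @mer9020_ob2, with sim(net, ...) substituted for V once the network has been trained.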