Lecture 8: Neurofuzzy Systems

Fuzzy Inference Systems
Review Fuzzy Models
If <antecedent> then <consequent>.
Basic Configuration of a Fuzzy Logic System
[Figure: block diagram. Input → Fuzzification → Inferencing → Defuzzification → Output; during training, Error = Target − Output.]
Types of Rules
Mamdani-Assilian Model
R1: If x is A1 and y is B1 then z is C1
R2: If x is A2 and y is B2 then z is C2
Ai, Bi and Ci are fuzzy sets defined on the universes of x, y and z respectively.
Takagi-Sugeno Model
R1: If x is A1 and y is B1 then z = f1(x, y)
R2: If x is A2 and y is B2 then z = f2(x, y)
For example: fi(x, y) = ai x + bi y + ci
Mamdani Fuzzy Models
The Reasoning Scheme
Both antecedent and consequent are fuzzy.
[Figure: graphical Mamdani reasoning for a mineral-prospectivity example. Antecedent membership functions: FeO (ticks at 30%, 50%, 70%), SiO2 (40%, 55%, 70%), distance to Granite (0, 10, 20 km), distance to Fault (0, 5, 10 km). Each rule's firing strength is applied to its consequent set over metal tonnage (ticks at 0 t, 100 t, 1000 t) by the implication (max) operator.]
Rule 1: IF FeO is high AND SiO2 is low AND Granite is proximal AND Fault is proximal, THEN metal is high
Rule 2: IF FeO is average AND SiO2 is high AND Granite is intermediate AND Fault is proximal, THEN metal is average
Rule 3: IF FeO is low AND SiO2 is high AND Granite is distal AND Fault is distal, THEN metal is low
Inputs: FeO = 60%, SiO2 = 60%, Granite = 5 km, Fault = 1 km. Metal = ?
Defuzzifier
Since the consequent is fuzzy, it has to be defuzzified.
• Converts the fuzzy output of the inference engine to a crisp value, using membership functions analogous to the ones used by the fuzzifier.
• Five commonly used defuzzifying methods:
– Centroid of area (COA)
– Bisector of area (BOA)
– Mean of maximum (MOM)
– Smallest of maximum (SOM)
– Largest of maximum (LOM)
Defuzzifier
[Figure: the clipped output sets of Rule 1, Rule 2 and Rule 3 are aggregated (max) into a single fuzzy set, whose centroid gives the crisp output.]
Formula for the centroid:
$x^{*} = \frac{\sum_{i=0}^{n} x_i \, \mu(x_i)}{\sum_{i=0}^{n} \mu(x_i)}$
For the example above, the centroid is 125 tonnes of metal.
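A minimal numerical sketch of this scheme: clip each rule's consequent set at its firing strength, aggregate with max, and take the centroid. The triangular membership shapes and the firing strengths below are illustrative assumptions, not the exact values from the figure, so the output will not be exactly 125 t.

```python
import numpy as np

# Minimal Mamdani inference with centroid defuzzification.
# The triangular membership shapes and the rule firing strengths are
# illustrative assumptions, not the exact values from the figure.

def tri(v, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    return np.maximum(np.minimum((v - a) / (b - a), (c - v) / (c - b)), 0.0)

metal = np.linspace(0.0, 1000.0, 1001)   # output universe: 0-1000 t

# Assumed firing strengths of Rules 1-3 (normally obtained by
# AND-combining the antecedent membership degrees of the inputs).
s = [0.2, 0.6, 0.3]

# Consequent sets: metal is high / average / low.
high = tri(metal, 500.0, 1000.0, 1500.0)
aver = tri(metal, 0.0, 500.0, 1000.0)
low = tri(metal, -500.0, 0.0, 500.0)

# Implication: clip each consequent at its rule's firing strength,
# then aggregate the clipped sets with max.
agg = np.maximum.reduce([np.minimum(si, mf) for si, mf in zip(s, [high, aver, low])])

# Centroid of area: x* = sum(x * mu(x)) / sum(mu(x)).
centroid = np.sum(metal * agg) / np.sum(agg)
print(f"Defuzzified metal estimate: {centroid:.0f} t")
```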
Sugeno Fuzzy Models
• Also known as TSK fuzzy model
– Takagi, Sugeno & Kang, 1985
Fuzzy Rules of TSK Model
While antecedent is fuzzy, consequent is crisp
If x is A and y is B then z = f(x, y)
Fuzzy Sets: A and B are fuzzy sets on the input universes.
Crisp Function: f(x, y) is very often a polynomial function w.r.t. x and y.
The order of a Takagi-Sugeno type fuzzy inference system = the order of the polynomial used.
The Reasoning Scheme
Examples
R1: if X is small and Y is small then z = x + y + 1
R2: if X is small and Y is large then z = y + 3
R3: if X is large and Y is small then z = x + 3
R4: if X is large and Y is large then z = x + y + 2
TAKAGI-SUGENO SYSTEM
1. IF x is f1x(x) AND y is f1y(y) THEN z1 = p10 + p11x + p12y
2. IF x is f2x(x) AND y is f1y(y) THEN z2 = p20 + p21x + p22y
3. IF x is f1x(x) AND y is f2y(y) THEN z3 = p30 + p31x + p32y
4. IF x is f2x(x) AND y is f2y(y) THEN z4 = p40 + p41x + p42y
The firing strength (= output of the IF part) of each rule is:
s1 = f1x(x) AND f1y(y)
s2 = f2x(x) AND f1y(y)
s3 = f1x(x) AND f2y(y)
s4 = f2x(x) AND f2y(y)
Output of each rule (= firing strength × consequent function):
1. o1 = s1 ∙ z1
2. o2 = s2 ∙ z2
3. o3 = s3 ∙ z3
4. o4 = s4 ∙ z4
Overall output of the fuzzy inference system is:
$z = \frac{o_1 + o_2 + o_3 + o_4}{s_1 + s_2 + s_3 + s_4}$
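A minimal sketch of this computation for the four example rules R1-R4 from the previous slide. The Gaussian membership functions standing in for "small" and "large" (centres 0 and 10, spread 3) and the product realisation of AND are illustrative assumptions:

```python
import numpy as np

# Minimal Takagi-Sugeno inference for the four example rules R1-R4.
# The "small" and "large" membership functions are assumed Gaussians.

def small(v):
    return np.exp(-(v - 0.0) ** 2 / (2 * 3.0 ** 2))

def large(v):
    return np.exp(-(v - 10.0) ** 2 / (2 * 3.0 ** 2))

def tsk_output(x, y):
    # Firing strengths: AND realised as the product T-norm.
    s = np.array([
        small(x) * small(y),   # R1: X small, Y small
        small(x) * large(y),   # R2: X small, Y large
        large(x) * small(y),   # R3: X large, Y small
        large(x) * large(y),   # R4: X large, Y large
    ])
    # Crisp consequents from the rules above.
    z = np.array([x + y + 1, y + 3, x + 3, x + y + 2])
    # Weighted average of the rule outputs.
    return np.sum(s * z) / np.sum(s)

print(tsk_output(3.0, 7.0))
```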
Sugeno system
Rule 1: IF FeO is high AND SiO2 is low AND Granite is proximal AND Fault is proximal,
THEN Gold = p1(FeO%) + q1(SiO2%) + r1(Distance2Granite) + s1(Distance2Fault) + t1
Rule 2: IF FeO is average AND SiO2 is high AND Granite is intermediate AND Fault is proximal,
THEN Gold = p2(FeO%) + q2(SiO2%) + r2(Distance2Granite) + s2(Distance2Fault) + t2
Rule 3: IF FeO is low AND SiO2 is high AND Granite is distal AND Fault is distal,
THEN Gold = p3(FeO%) + q3(SiO2%) + r3(Distance2Granite) + s3(Distance2Fault) + t3
Sugeno system
[Figure: graphical Sugeno reasoning for the gold example. The antecedent membership functions are the same as in the Mamdani example (FeO: 30-70%, SiO2: 40-70%, Granite: 0-20 km, Fault: 0-10 km). Each rule yields a firing strength (s1, s2, s3) and a crisp consequent Gold(R1), Gold(R2), Gold(R3) computed from its polynomial.]
Inputs: FeO = 60%, SiO2 = 60%, Granite = 5 km, Fault = 1 km. Metal = ?
Sugeno system: Output
Firing strength s1, rule output: Gold(R1) = p1(FeO%) + q1(SiO2%) + r1(Distance2Granite) + s1(Distance2Fault) + t1
Firing strength s2, rule output: Gold(R2) = p2(FeO%) + q2(SiO2%) + r2(Distance2Granite) + s2(Distance2Fault) + t2
Firing strength s3, rule output: Gold(R3) = p3(FeO%) + q3(SiO2%) + r3(Distance2Granite) + s3(Distance2Fault) + t3
$\text{Output} = \frac{s_1 \cdot \text{Gold}(R1) + s_2 \cdot \text{Gold}(R2) + s_3 \cdot \text{Gold}(R3)}{s_1 + s_2 + s_3}$
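As a purely illustrative numeric check (these firing strengths and rule outputs are assumed, not read off the figure): with $s_1 = 0.6$, $s_2 = 0.3$, $s_3 = 0.1$ and Gold(R1) = 200 t, Gold(R2) = 100 t, Gold(R3) = 20 t,
$\text{Output} = \frac{0.6 \cdot 200 + 0.3 \cdot 100 + 0.1 \cdot 20}{0.6 + 0.3 + 0.1} = \frac{152}{1.0} = 152 \text{ t}$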
A neural fuzzy system
Implements an FIS in the framework of NNs.
[Figure: network with inputs x and y feeding fuzzification nodes, then antecedent nodes, then output nodes.]
Fuzzification Nodes
Represent the term sets of the features.
If we have two features x and y, with two linguistic values, say BIG and SMALL, defined on each of them, then we have 4 fuzzification nodes.
[Figure: membership functions BIG and SMALL on the universe of x, and BIG and SMALL on the universe of y.]
We use Gaussian membership functions for fuzzification because they are differentiable; triangular and trapezoidal membership functions are NOT differentiable (at their breakpoints).
Fuzzification Nodes (Contd.)
$z = \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
μ and σ are the two free parameters of the membership function, which need to be determined.
How to determine μ and σ? Two strategies:
1) Fix μ and σ in advance.
2) Update μ and σ through a tuning algorithm.
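A one-function sketch of such a node; the centre and spread values in the call are arbitrary illustrations:

```python
import numpy as np

# Gaussian membership function of a fuzzification node. mu (centre)
# and sigma (spread) are its two free parameters; here they are simply
# fixed by hand (strategy 1 above). The values are illustrative.

def gaussian_mf(x, mu, sigma):
    return np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Degree to which x = 60 belongs to a set centred at 70 with spread 10:
print(gaussian_mf(60.0, mu=70.0, sigma=10.0))  # ~0.607
```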
Consequent nodes
z = px + qy + k
p, q and k are the three free parameters of the consequent polynomial function.
How to determine p, q and k? Two strategies:
1) Fix them in advance.
2) Update them through a tuning algorithm.
[Figure: the complete neural fuzzy network. Fuzzification nodes compute membership degrees μx1, μx2 (BIG, SMALL on x) and μy1, μy2 (SMALL, BIG on y); antecedent nodes (e.g. "If x is SMALL and y is SMALL") combine them into weights w1, ..., w4; consequent nodes compute e.g. z4 = p4x + q4y + k4; the output node computes O = (w1z1 + w2z2 + w3z3 + w4z4) / (w1 + w2 + w3 + w4). Training compares O against the target t via the error E = ½(t − o)².]
ANFIS Architecture
[Figure: ANFIS network. Squares: adaptive nodes; circles: fixed nodes.]
ANFIS Architecture
Layer 1 (Adaptive)
Contains adaptive nodes, each with a Gaussian membership function:
$f(x) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right)$
Number of nodes = number of variables × number of linguistic values.
In the previous example there are 4 nodes (2 variables × 2 linguistic values each).
Two parameters to be estimated per node: mean (centre) and standard deviation (spread).
These are called premise parameters.
Number of premise parameters = 2 × number of nodes = 8 in the example.
ANFIS Architecture
Layer 2 (Fixed)
Contains fixed nodes, each applying the product (T-norm) operator. Returns the firing strength of each If-Then rule:
$s_1 = f_{1x} \cdot f_{1y}; \quad s_2 = f_{2x} \cdot f_{1y}; \quad s_3 = f_{1x} \cdot f_{2y}; \quad s_4 = f_{2x} \cdot f_{2y}$
The firing strength can be normalized. In ANFIS, each node returns a normalized firing strength:
$\bar{s}_1 = \frac{s_1}{s_1 + s_2 + s_3 + s_4}$
Fixed nodes: no parameters to be estimated.
ANFIS Architecture
Layer 3 (Adaptive)
Each node contains an adaptive polynomial, and returns the output of one fuzzy If-Then rule:
$z_1 = \bar{s}_1 (p_{10} + p_{11}x + p_{12}y)$
$z_2 = \bar{s}_2 (p_{20} + p_{21}x + p_{22}y)$
$z_3 = \bar{s}_3 (p_{30} + p_{31}x + p_{32}y)$
$z_4 = \bar{s}_4 (p_{40} + p_{41}x + p_{42}y)$
Number of nodes = number of If-Then rules.
The parameters $p_{ki}$ are called consequent parameters.
ANFIS Architecture
Layer 4 (Fixed)
A single node that sums the outputs of the previous layer:
$z = z_1 + z_2 + z_3 + z_4$
No parameters to be estimated.
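Putting Layers 1-4 together, a minimal forward-pass sketch for the two-input, two-label case; all centres, spreads and consequent parameters below are illustrative values, not trained ones:

```python
import numpy as np

# Minimal ANFIS forward pass: 2 inputs x 2 linguistic values = 4 rules.
# All parameter values below are illustrative assumptions.

c = np.array([2.0, 8.0, 2.0, 8.0])     # centres: x-small, x-big, y-small, y-big
sig = np.array([2.0, 2.0, 2.0, 2.0])   # spreads
P = np.array([                          # consequent params [pk0, pk1, pk2]
    [1.0, 1.0, 1.0],
    [3.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
    [2.0, 1.0, 1.0],
])

def anfis_forward(x, y):
    # Layer 1: Gaussian membership degrees.
    f = np.exp(-((np.array([x, x, y, y]) - c) ** 2) / (2 * sig ** 2))
    f1x, f2x, f1y, f2y = f
    # Layer 2: product T-norm firing strengths, then normalization.
    s = np.array([f1x * f1y, f2x * f1y, f1x * f2y, f2x * f2y])
    s_bar = s / s.sum()
    # Layer 3: rule outputs weighted by normalized firing strengths.
    z = s_bar * (P[:, 0] + P[:, 1] * x + P[:, 2] * y)
    # Layer 4: sum of rule outputs.
    return z.sum()

print(anfis_forward(3.0, 7.0))
```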
ANFIS Training
$z = z_1 + z_2 + z_3 + z_4$, where $z_k = \bar{s}_k (p_{k0} + p_{k1}x + p_{k2}y)$ for $k = 1, \dots, 4$.
The output is linear in the consequent parameters $p_{ki}$ if the premise parameters, and therefore the firing strengths $s_k$ of the fuzzy if-then rules, are fixed.
ANFIS uses a hybrid learning procedure (Jang and Sun, 1995) for estimation of the premise and consequent parameters.
The hybrid learning procedure estimates the consequent parameters (keeping the premise parameters fixed) in a forward pass, and the premise parameters (keeping the consequent parameters fixed) in a backward pass.
ANFIS Training
The forward pass: propagate information forward up to Layer 3, then estimate the consequent parameters by the least-squares estimator.
The backward pass: propagate the error signals backwards and update the premise parameters by gradient descent.
[Figure: ANFIS network. Squares: adaptive nodes; circles: fixed nodes.]
ANFIS Training : Least Square Estimation
1. Data is assembled in the form (xn, yn).
2. We assume that there is a linear relation between x and y: y = ax + b.
3. This can be extended to n dimensions: y = a1x1 + a2x2 + a3x3 + … + b.
The problem: given the data, find values of the coefficients ai such that the linear combination best fits the data.
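A minimal numpy sketch of this fit (the five data points are illustrative; they happen to lie exactly on y = 2x + 5):

```python
import numpy as np

# Least-squares fit of y = a*x + b. The data points are illustrative.
x = np.array([2.0, 3.0, 4.0, 6.0, 8.0])
y = np.array([9.0, 11.0, 13.0, 17.0, 21.0])

# Design matrix: one column for x (slope a), one column of ones (intercept b).
A = np.column_stack([x, np.ones_like(x)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b)  # -> 2.0 5.0
```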
ANFIS Training : Least Square Estimation
Given data {(x1, y1), ..., (xN, yN)}, we may define the error associated with saying y = ax + b by:
$E(a, b) = \sum_{n=1}^{N} \left( y_n - (a x_n + b) \right)^2$
This is just N times the variance of the data {y1 − (ax1 + b), ..., yN − (axN + b)}.
The goal is to find the values of a and b that minimize the error; in other words, to set the partial derivatives of the error w.r.t. a and b to zero.
ANFIS Training : Least Square Estimation
Which gives us:
$\frac{\partial E}{\partial a} = -2 \sum_{n=1}^{N} x_n \left( y_n - (a x_n + b) \right) = 0, \qquad \frac{\partial E}{\partial b} = -2 \sum_{n=1}^{N} \left( y_n - (a x_n + b) \right) = 0$
We may rewrite them as:
$a \sum x^2 + b \sum x = \sum xy, \qquad a \sum x + b \sum 1 = \sum y$
The values of a and b which minimize the error satisfy the following matrix equation:
$\begin{pmatrix} \sum x^2 & \sum x \\ \sum x & \sum 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum xy \\ \sum y \end{pmatrix}$
Hence a and b are estimated using:
$\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum x^2 & \sum x \\ \sum x & \sum 1 \end{pmatrix}^{-1} \begin{pmatrix} \sum xy \\ \sum y \end{pmatrix}$
ANFIS Training : Least Square Estimation
For the following data, find the least-squares estimator, i.e. evaluate
$\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum x^2 & \sum x \\ \sum x & \sum 1 \end{pmatrix}^{-1} \begin{pmatrix} \sum xy \\ \sum y \end{pmatrix}$

SNo     X     Y    X^2    XY
1       2     9      4    18
2       3    11      9    33
3       4    13     16    52
4       6    17     36   102
5       8    21     64   168
6       1     7      1     7
7       2     9      4    18
8      11    27    121   297
9      14    33    196   462
TOTAL  51   147    451  1157
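Plugging the column totals into the matrix equation (with $\sum 1 = N = 9$):
$\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 451 & 51 \\ 51 & 9 \end{pmatrix}^{-1} \begin{pmatrix} 1157 \\ 147 \end{pmatrix} = \frac{1}{451 \cdot 9 - 51^2} \begin{pmatrix} 9 & -51 \\ -51 & 451 \end{pmatrix} \begin{pmatrix} 1157 \\ 147 \end{pmatrix} = \frac{1}{1458} \begin{pmatrix} 2916 \\ 7290 \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \end{pmatrix}$
so the fitted line is y = 2x + 5, which reproduces every (X, Y) pair in the table exactly.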
ANFIS Training : Least Square Estimation
$z_k = \bar{s}_k (p_{k0} + p_{k1}x + p_{k2}y), \quad k = 1, \dots, 4$
$\text{Output } o = z_1 + z_2 + z_3 + z_4 = \bar{s}_1(p_{10} + p_{11}x + p_{12}y) + \bar{s}_2(p_{20} + p_{21}x + p_{22}y) + \bar{s}_3(p_{30} + p_{31}x + p_{32}y) + \bar{s}_4(p_{40} + p_{41}x + p_{42}y)$
Let the training data be:
$\begin{pmatrix} x_1 & y_1 & t_1 \\ x_2 & y_2 & t_2 \\ x_3 & y_3 & t_3 \\ x_4 & y_4 & t_4 \\ x_5 & y_5 & t_5 \\ x_6 & y_6 & t_6 \end{pmatrix}$
The squared error on each sample is:
$E_1 = \left( t_1 - [\bar{s}_1(p_{10} + p_{11}x_1 + p_{12}y_1) + \bar{s}_2(p_{20} + p_{21}x_1 + p_{22}y_1) + \bar{s}_3(p_{30} + p_{31}x_1 + p_{32}y_1) + \bar{s}_4(p_{40} + p_{41}x_1 + p_{42}y_1)] \right)^2$
$\vdots$
$E_6 = \left( t_6 - [\bar{s}_1(p_{10} + p_{11}x_6 + p_{12}y_6) + \bar{s}_2(p_{20} + p_{21}x_6 + p_{22}y_6) + \bar{s}_3(p_{30} + p_{31}x_6 + p_{32}y_6) + \bar{s}_4(p_{40} + p_{41}x_6 + p_{42}y_6)] \right)^2$
Set the partial derivatives of the total error $E = \sum_n E_n$ w.r.t. each consequent parameter to zero:
$\frac{\partial E}{\partial p_{10}} = -2 \sum_{n} \left( t_n - [\,\cdots\,] \right) \bar{s}_1 = 0$
$\vdots$
$\frac{\partial E}{\partial p_{42}} = -2 \sum_{n} \left( t_n - [\,\cdots\,] \right) \bar{s}_4 y_n = 0$
Simplify and use LSE.
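One way to read "simplify and use LSE": stack one row per training sample with columns $\bar{s}_k, \bar{s}_k x_n, \bar{s}_k y_n$ for each rule, so the output is a linear system $A\,p = t$ in the 12 consequent parameters. The data and the placeholder firing strengths below are illustrative assumptions:

```python
import numpy as np

# Consequent-parameter estimation as one linear least-squares problem.
# Each training sample contributes a row of the design matrix A:
#   [s1, s1*x, s1*y, s2, s2*x, s2*y, s3, s3*x, s3*y, s4, s4*x, s4*y]
# (normalized firing strengths), so that A @ p = t for
# p = [p10, p11, p12, p20, ..., p42].

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=6)   # illustrative training inputs
Y = rng.uniform(0.0, 10.0, size=6)
T = 2.0 * X + 3.0 * Y + 1.0          # illustrative targets

rows = []
for x, y in zip(X, Y):
    # The normalized firing strengths would come from Layers 1-2;
    # placeholder values summing to 1 are used here.
    s_bar = np.array([0.4, 0.3, 0.2, 0.1])
    rows.append(np.concatenate([[s, s * x, s * y] for s in s_bar]))

A = np.vstack(rows)                  # 6 samples x 12 parameters
# With fewer samples than parameters the system is under-determined;
# lstsq then returns the minimum-norm solution.
p, *_ = np.linalg.lstsq(A, T, rcond=None)
print(p.reshape(4, 3))               # one row [pk0, pk1, pk2] per rule
```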
ANFIS Training : Gradient descent
After the least-squares estimation of all consequent parameters, plug in the parameter values, the firing-strength values, and the variable (x, y) values.
Error at Layer 1: $E = \frac{(\text{Target Output} - \text{Actual Output})^2}{2}$
Let the centre and spread parameters in Layer 1 be represented by $c_i$ (centre) and $\sigma_i$ (spread). By the chain rule,
$\frac{\partial E}{\partial c_i} = \frac{\partial E}{\partial O} \frac{\partial O}{\partial z_i} \frac{\partial z_i}{\partial \bar{s}_i} \frac{\partial \bar{s}_i}{\partial f_i} \frac{\partial f_i}{\partial c_i}$
where O is the output and $f_i$ is the fuzzy membership function. Similarly,
$\frac{\partial E}{\partial \sigma_i} = \frac{\partial E}{\partial O} \frac{\partial O}{\partial z_i} \frac{\partial z_i}{\partial \bar{s}_i} \frac{\partial \bar{s}_i}{\partial f_i} \frac{\partial f_i}{\partial \sigma_i}$
The above expressions are used to update the centre and spread parameters.
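A minimal sketch of one such update for a single Gaussian node, using the analytic derivatives of $f(x) = \exp(-(x-c)^2/(2\sigma^2))$ with respect to c and σ. The upstream factor dE_df (the product of the first four chain-rule terms) is an assumed placeholder value that the backward pass would supply:

```python
import numpy as np

# One gradient-descent step for the centre and spread of a Gaussian
# membership node. dE_df stands for the product of the upstream
# chain-rule factors (dE/dO * dO/dz_i * dz_i/ds_i * ds_i/df_i); here
# it is an assumed placeholder, not a computed value.

def update_premise(x, c, sigma, dE_df, lr=0.01):
    f = np.exp(-((x - c) ** 2) / (2 * sigma ** 2))
    df_dc = f * (x - c) / sigma ** 2            # df/dc
    df_dsigma = f * (x - c) ** 2 / sigma ** 3   # df/dsigma
    c_new = c - lr * dE_df * df_dc
    sigma_new = sigma - lr * dE_df * df_dsigma
    return c_new, sigma_new

print(update_premise(x=60.0, c=70.0, sigma=10.0, dE_df=0.5))
```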