An analytic approach for obtaining maximal entropy OWA operator weights

Robert Fullér
Institute for Advanced Management Systems Research, Åbo Akademi University, Lemminkäinengatan 14B, Åbo, Finland, e-mail: rfuller@abo.fi
and Department of Operations Research, Eötvös Loránd University, Rákóczi út 5, H-1088 Budapest, Hungary, e-mail: rfuller@cs.elte.hu

Péter Majlender
Department of Operations Research, Eötvös Loránd University, Rákóczi út 5, H-1088 Budapest, Hungary, e-mail: mpeter@cs.elte.hu

TUCS Technical Report No 328, January 2000

Abstract

One important issue in the theory of Ordered Weighted Averaging (OWA) operators is the determination of the associated weights. One of the first approaches, suggested by O'Hagan, determines a special class of OWA operators having maximal entropy of the OWA weights for a given level of orness; algorithmically it is based on the solution of a constrained optimization problem. In this paper, using the method of Lagrange multipliers, we shall solve this constrained optimization problem analytically and derive a polynomial equation which is then solved to determine the optimal weighting vector.

Keywords: OWA operator, dispersion, degree of orness

1 Introduction

An OWA operator of dimension n is a mapping F : R^n → R that has an associated weighting vector W = (w_1, …, w_n)^T with the properties

    w_1 + ⋯ + w_n = 1,   0 ≤ w_i ≤ 1,   i = 1, …, n,

and such that

    F(a_1, …, a_n) = ∑_{i=1}^n w_i b_i,

where b_j is the jth largest element of the collection of the aggregated objects {a_1, …, a_n}.

In [3], Yager introduced two characterizing measures associated with the weighting vector W of an OWA operator. The first one, the measure of orness of the aggregation, is defined as

    orness(W) = (1/(n−1)) ∑_{i=1}^n (n−i) w_i,

and it characterizes the degree to which the aggregation is like an or operation. It is clear that orness(W) ∈ [0, 1] holds for any weighting vector. The second one, the measure of dispersion of the aggregation, is defined as

    disp(W) = −∑_{i=1}^n w_i ln w_i,

and it measures the degree to which W takes into account all information in the aggregation.

It is clear that the actual type of aggregation performed by an OWA operator depends upon the form of the weighting vector. A number of approaches have been suggested for obtaining the associated weights, e.g., quantifier guided aggregation [3, 4], exponential smoothing [6] and learning [5]. Another approach, suggested by O'Hagan [2], determines a special class of OWA operators having maximal entropy of the OWA weights for a given level of orness. This approach is based on the solution of the following mathematical programming problem:

    maximize    −∑_{i=1}^n w_i ln w_i

    subject to  (1/(n−1)) ∑_{i=1}^n (n−i) w_i = α,   0 ≤ α ≤ 1,        (1)

                ∑_{i=1}^n w_i = 1,   0 ≤ w_i ≤ 1,   i = 1, …, n.

Using the method of Lagrange multipliers we shall transform problem (1) into a polynomial equation which is then solved to determine the optimal weighting vector.

2 Obtaining maximal entropy weights

First we note that disp(W) is meaningful only if w_i > 0; by setting w_i ln w_i equal to zero whenever w_i = 0, problem (1) turns into

    disp(W) → max;  subject to  orness(W) = α,  w_1 + ⋯ + w_n = 1,  0 ≤ α ≤ 1.

If n = 2 then from orness(w_1, w_2) = α we get w_1 = α and w_2 = 1 − α. Furthermore, if α = 0 or α = 1 then the associated weighting vectors are uniquely defined as (0, 0, …, 0, 1)^T and (1, 0, …, 0, 0)^T, respectively, both with dispersion zero.

Suppose now that n ≥ 3 and 0 < α < 1. Let

    L(W, λ_1, λ_2) = −∑_{i=1}^n w_i ln w_i + λ_1 ( ∑_{i=1}^n w_i − 1 ) + λ_2 ( ∑_{i=1}^n ((n−i)/(n−1)) w_i − α )

denote the Lagrange function of constrained optimization problem (1), where λ_1 and λ_2 are real numbers.
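Since everything that follows turns on the two characterizing measures, it may help to have them in executable form. The following minimal Python sketch (the function names are ours, not the paper's) implements orness(W) and disp(W) exactly as defined in the Introduction:

```python
import math

def orness(w):
    """orness(W) = (1/(n-1)) * sum_{i=1..n} (n - i) * w_i  (here i is 0-based)."""
    n = len(w)
    return sum((n - 1 - i) * wi for i, wi in enumerate(w)) / (n - 1)

def dispersion(w):
    """disp(W) = -sum_i w_i ln w_i, with 0 * ln 0 taken to be 0."""
    return -sum(wi * math.log(wi) for wi in w if wi > 0)
```

For example, the uniform vector (1/n, …, 1/n) attains orness 0.5 and dispersion ln n, while (1, 0, …, 0) has orness 1 and dispersion 0.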
Then the partial derivatives of L are computed as

    ∂L/∂w_j = −ln w_j − 1 + λ_1 + λ_2 (n−j)/(n−1) = 0,   j = 1, …, n,        (2)

    ∂L/∂λ_1 = ∑_{i=1}^n w_i − 1 = 0,

    ∂L/∂λ_2 = ∑_{i=1}^n ((n−i)/(n−1)) w_i − α = 0.

For j = n, equation (2) turns into

    −ln w_n − 1 + λ_1 = 0  ⟺  λ_1 = ln w_n + 1,

and for j = 1 we get

    −ln w_1 − 1 + λ_1 + λ_2 = 0,

and, therefore,

    λ_2 = ln w_1 + 1 − λ_1 = ln w_1 + 1 − ln w_n − 1 = ln w_1 − ln w_n.

For 1 ≤ j ≤ n we find

    ln w_j = ((j−1)/(n−1)) ln w_n + ((n−j)/(n−1)) ln w_1  ⟹  w_j = (w_1^{n−j} w_n^{j−1})^{1/(n−1)}.        (3)

If w_1 = w_n then (3) gives

    w_1 = w_2 = ⋯ = w_n = 1/n  ⟹  disp(W) = ln n,

which is the optimal solution to (1) for α = 0.5 (in fact, ln n is the globally optimal value of the dispersion over all OWA operators of dimension n).

Suppose now that w_1 ≠ w_n. Let us introduce the notations

    u_1 = w_1^{1/(n−1)},   u_n = w_n^{1/(n−1)}.

Then we may rewrite (3) as

    w_j = u_1^{n−j} u_n^{j−1},   for 1 ≤ j ≤ n.

From the first condition, orness(W) = α, we find

    ∑_{i=1}^n ((n−i)/(n−1)) w_i = α  ⟺  ∑_{i=1}^n (n−i) u_1^{n−i} u_n^{i−1} = (n−1)α,

and from

    ∑_{i=1}^n (n−i) u_1^{n−i} u_n^{i−1}
      = (1/(u_1 − u_n)) [ (n−1) u_1^n − ∑_{i=1}^{n−1} u_1^i u_n^{n−i} ]
      = (1/(u_1 − u_n)) [ (n−1) u_1^n − u_1 u_n (u_1^{n−1} − u_n^{n−1})/(u_1 − u_n) ]
      = (1/(u_1 − u_n)^2) [ (n−1) u_1^n (u_1 − u_n) − u_1^n u_n + u_1 u_n^n ]
      = (1/(u_1 − u_n)^2) [ (n−1) u_1^{n+1} − n u_1^n u_n + u_1 u_n^n ],

we get

    (n−1) u_1^{n+1} − n u_1^n u_n + u_1 u_n^n = (n−1)α (u_1 − u_n)^2.

Using the identity u_1^n − u_n^n = u_1 − u_n (which follows from the second condition; see (5) below), this reduces to

    n u_1^n − u_1 = (n−1)α (u_1 − u_n),

and, since u_1^n = w_1 u_1,

    (n−1)α u_n = ((n−1)α + 1) u_1 − n u_1^n  ⟺  u_n / u_1 = ((n−1)α + 1 − n w_1) / ((n−1)α).        (4)

From the second condition, w_1 + ⋯ + w_n = 1, we get

    ∑_{j=1}^n u_1^{n−j} u_n^{j−1} = 1  ⟺  (u_1^n − u_n^n)/(u_1 − u_n) = 1

      ⟺  u_1^n − u_n^n = u_1 − u_n        (5)

      ⟺  u_1^{n−1} − (u_n/u_1) u_n^{n−1} = 1 − u_n/u_1.        (6)

Comparing equations (4) and (6) we find

    w_1 − (n w_1 − 1)/((n−1)α) = ((n−1)α + 1 − n w_1)/((n−1)α) × w_n

and, therefore,

    w_n = ( ((n−1)α − n) w_1 + 1 ) / ( (n−1)α + 1 − n w_1 ).        (7)

Let us rewrite equation (5) as

    u_1^n − u_n^n = u_1 − u_n  ⟺  u_1 (w_1 − 1) = u_n (w_n − 1)  ⟹  w_1 (w_1 − 1)^{n−1} = w_n (w_n − 1)^{n−1}.

Substituting (7), and noting that w_n − 1 = (n−1)α (w_1 − 1) / ((n−1)α + 1 − n w_1), we obtain

    w_1 (w_1 − 1)^{n−1} = ( ((n−1)α − n) w_1 + 1 ) / ( (n−1)α + 1 − n w_1 ) × ( (n−1)α (w_1 − 1) / ((n−1)α + 1 − n w_1) )^{n−1},

that is,

    w_1 [(n−1)α + 1 − n w_1]^n = ((n−1)α)^{n−1} [((n−1)α − n) w_1 + 1].        (8)

So the optimal value of w_1 must satisfy equation (8). Once w_1 is computed, w_n can be determined from equation (7), and the other weights are obtained from equation (3).

Remark 2.1. If n = 3 then from (3) we get w_2 = √(w_1 w_3) independently of the value of α, which means that the optimal value of w_2 is always the geometric mean of w_1 and w_3.

3 Computing the optimal weights

Let us introduce the notations

    f(w_1) = w_1 [(n−1)α + 1 − n w_1]^n,   g(w_1) = ((n−1)α)^{n−1} [((n−1)α − n) w_1 + 1].

Then, to find the optimal value of the first weight, we have to solve the equation

    f(w_1) = g(w_1),

where g is a line and f is a polynomial in w_1 of degree n + 1. Without loss of generality we can assume that α < 0.5, because if a weighting vector W is optimal for problem (1) under some given degree of orness α < 0.5, then its reverse W^R, defined by w_i^R = w_{n−i+1}, is optimal for problem (1) under degree of orness 1 − α. Indeed, as was shown by Yager [4],

    disp(W^R) = disp(W)   and   orness(W^R) = 1 − orness(W).

Therefore, for any α > 0.5, we can solve problem (1) by solving it with level of orness 1 − α and then taking the reverse of that solution.

From the equations

    f(1/n) = g(1/n)   and   f′(1/n) = g′(1/n)

we get that g is always a tangent line to f at the point w_1 = 1/n. But if w_1 = 1/n then w_1 = ⋯ = w_n = 1/n also holds, and that is the optimal solution for α = 0.5.

Consider now the graph of f. It is clear that f(0) = 0, and by solving the equation

    f′(w_1) = [(n−1)α + 1 − n w_1]^n − n^2 w_1 [(n−1)α + 1 − n w_1]^{n−1} = 0

we find that its unique solution on [0, 1/n] is

    ŵ_1 = ((n−1)α + 1) / (n(n+1)) < 1/n,

and the second derivative f′′(ŵ_1) is negative, which means that ŵ_1 is the only maximizing point of f on the segment [0, 1/n].

We prove now that g can intersect f only once in the open interval (0, 1/n). This guarantees the uniqueness of the optimal solution of problem (1).
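The tangency claim stated above can be checked numerically. The sketch below (the helper name fg and the sample values of n and α are our choices) encodes f, g and their closed-form derivatives, and confirms that both the values and the slopes agree at w_1 = 1/n, and that f′ vanishes at ŵ_1:

```python
def fg(n, alpha):
    """Return f, g and their closed-form derivatives from Section 3."""
    c = (n - 1) * alpha
    f  = lambda w: w * (c + 1 - n * w) ** n
    df = lambda w: (c + 1 - n * w) ** n - n * n * w * (c + 1 - n * w) ** (n - 1)
    g  = lambda w: c ** (n - 1) * ((c - n) * w + 1)
    dg = lambda w: c ** (n - 1) * (c - n)   # g is a line, so its slope is constant
    return f, df, g, dg

# Sample instance with alpha < 0.5, as assumed in the uniqueness argument.
n, alpha = 5, 0.4
f, df, g, dg = fg(n, alpha)
```

In exact arithmetic f(1/n) = g(1/n) = ((n−1)α)^n / n and f′(1/n) = g′(1/n) = ((n−1)α)^n − n((n−1)α)^{n−1}; the code reproduces this up to floating-point rounding.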
Indeed, from the equation

    f′′(w_1) = −2 n^2 [(n−1)α + 1 − n w_1]^{n−1} + n^3 (n−1) w_1 [(n−1)α + 1 − n w_1]^{n−2} = 0

we find that its unique solution on (0, 1/n) is

    w̄_1 = 2((n−1)α + 1) / (n(n+1)) = 2 ŵ_1 < 1/n   (since α < 0.5),

which means that f is strictly concave on (0, w̄_1), has an inflexion point at w̄_1, and is strictly convex on (w̄_1, 1/n). Therefore, the graph of g lies below the graph of f for w̄_1 < w_1 < 1/n, and g can cross f only once, in the interval (0, w̄_1).

4 Illustrations

Let us suppose that n = 5 and α = 0.6. Then from the equation

    w_1 [4 × 0.6 + 1 − 5 w_1]^5 = (4 × 0.6)^4 [1 − (5 − 4 × 0.6) w_1]

we find

    w_1^* = 0.2884,

    w_5^* = ((4 × 0.6 − 5) w_1^* + 1) / (4 × 0.6 + 1 − 5 w_1^*) = 0.1278,

    w_2^* = ((w_1^*)^3 w_5^*)^{1/4} = 0.2353,

    w_3^* = ((w_1^*)^2 (w_5^*)^2)^{1/4} = 0.1920,

    w_4^* = (w_1^* (w_5^*)^3)^{1/4} = 0.1566,

and disp(W^*) = 1.5692.

Using exponential smoothing [1], Filev and Yager [6] obtained the following weighting vector

    W′ = (0.41, 0.10, 0.13, 0.16, 0.20),

with disp(W′) = 1.48 and orness(W′) = 0.5904. We first note that the weights computed from the constrained optimization problem have better dispersion than those obtained by Filev and Yager in [6]; however, the (heuristic) technique suggested in [6] requires less computational effort. Another interesting property is that small changes in the required level of orness α can cause large variations in weighting vectors of near-optimal dispersion (for example, compare the weighting vectors W^* and W′).

References

[1] R.G. Brown, Smoothing, Forecasting and Prediction of Discrete Time Series (Prentice-Hall, Englewood Cliffs, 1963).

[2] M. O'Hagan, Aggregating template or rule antecedents in real-time expert systems with fuzzy set logic, in: Proc. 22nd Annual IEEE Asilomar Conf. Signals, Systems, Computers, Pacific Grove, CA, 1988, 681-689.

[3] R.R. Yager, Ordered weighted averaging aggregation operators in multi-criteria decision making, IEEE Trans. on Systems, Man and Cybernetics, 18(1988) 183-190.
[4] R.R. Yager, Families of OWA operators, Fuzzy Sets and Systems, 59(1993) 125-148.

[5] R.R. Yager and D. Filev, Induced ordered weighted averaging operators, IEEE Trans. on Systems, Man and Cybernetics – Part B: Cybernetics, 29(1999) 141-150.

[6] D. Filev and R.R. Yager, On the issue of obtaining OWA operator weights, Fuzzy Sets and Systems, 94(1998) 157-169.

Lemminkäisenkatu 14 A, 20520 Turku, Finland | www.tucs.fi

University of Turku • Department of Information Technology • Department of Mathematics
Åbo Akademi University • Department of Computer Science • Institute for Advanced Management Systems Research
Turku School of Economics and Business Administration • Institute of Information Systems Sciences

ISBN 952-12-0609-8
ISSN 1239-1891