An analytic approach for obtaining maximal entropy OWA operator weights

TUCS Technical Report No 328, January 2000

Robert Fullér
Institute for Advanced Management Systems Research,
Åbo Akademi University, Lemminkäinengatan 14B, Åbo, Finland
e-mail: rfuller@abo.fi
and
Department of Operations Research, Eötvös Loránd University,
Rákóczi út 5, H-1088 Budapest, Hungary
e-mail: rfuller@cs.elte.hu

Péter Majlender
Department of Operations Research, Eötvös Loránd University,
Rákóczi út 5, H-1088 Budapest, Hungary
e-mail: mpeter@cs.elte.hu
Abstract
One important issue in the theory of Ordered Weighted Averaging (OWA) operators is
the determination of the associated weights. One of the first approaches, suggested by
O’Hagan, determines a special class of OWA operators having maximal entropy of the
OWA weights for a given level of orness; algorithmically it is based on the solution of a
constrained optimization problem. In this paper, using the method of Lagrange multipliers,
we shall solve this constrained optimization problem analytically and derive a polynomial
equation which is then solved to determine the optimal weighting vector.
Keywords: OWA operator, dispersion, degree of orness
1 Introduction
An OWA operator of dimension n is a mapping F : Rⁿ → R that has an associated weighting vector W = (w1, . . . , wn)ᵀ having the properties

w1 + · · · + wn = 1, 0 ≤ wi ≤ 1, i = 1, . . . , n,

and such that

\[
F(a_1, \dots, a_n) = \sum_{i=1}^{n} w_i b_i,
\]

where bj is the jth largest element of the collection of the aggregated objects {a1, . . . , an}.
In [3], Yager introduced two characterizing measures associated with the weighting
vector W of an OWA operator. The first one, the measure of orness of the aggregation, is
defined as
\[
\mathrm{orness}(W) = \frac{1}{n-1} \sum_{i=1}^{n} (n-i) w_i,
\]

and it characterizes the degree to which the aggregation is like an or operation. It is clear
that orness(W ) ∈ [0, 1] holds for any weighting vector.
The second one, the measure of dispersion of the aggregation, is defined as
\[
\mathrm{disp}(W) = -\sum_{i=1}^{n} w_i \ln w_i,
\]
and it measures the degree to which W takes into account all information in the aggregation.
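In Python, the two measures can be sketched as follows (our illustration, not part of the report; a weighting vector is a plain list):

```python
import math

def orness(w):
    """orness(W) = 1/(n-1) * sum_{i=1}^{n} (n-i) * w_i  (Yager [3])."""
    n = len(w)
    return sum((n - i) * wi for i, wi in enumerate(w, start=1)) / (n - 1)

def disp(w):
    """disp(W) = -sum w_i * ln(w_i), with 0 * ln(0) taken as zero."""
    return -sum(wi * math.log(wi) for wi in w if wi > 0)
```

For the pure "or" vector (1, 0, . . . , 0) these give orness 1 and dispersion 0, while the uniform vector has orness 0.5 and the maximal dispersion ln n.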
It is clear that the actual type of aggregation performed by an OWA operator depends
upon the form of the weighting vector. A number of approaches have been suggested
for obtaining the associated weights, e.g., quantifier guided aggregation [3, 4], exponential smoothing [6] and learning [5].
Another approach, suggested by O’Hagan [2], determines a special class of OWA operators having maximal entropy of the OWA weights for a given level of orness. This approach
is based on the solution of the following mathematical programming problem:

\[
\begin{aligned}
\text{maximize} \quad & -\sum_{i=1}^{n} w_i \ln w_i \\
\text{subject to} \quad & \frac{1}{n-1} \sum_{i=1}^{n} (n-i) w_i = \alpha, \quad 0 \le \alpha \le 1, \\
& \sum_{i=1}^{n} w_i = 1, \quad 0 \le w_i \le 1, \ i = 1, \dots, n.
\end{aligned} \tag{1}
\]
Using the method of Lagrange multipliers we shall transform problem (1) into a polynomial
equation which is then solved to determine the optimal weighting vector.
2 Obtaining maximal entropy weights
First we note that disp(W ) is meaningful only if wi > 0; by taking wi ln wi to be zero
whenever wi = 0, problem (1) turns into

disp(W ) → max; subject to {orness(W ) = α, w1 + · · · + wn = 1, 0 ≤ α ≤ 1}.
If n = 2 then from orness(w1 , w2 ) = α we get w1 = α and w2 = 1 − α. Furthermore, if α =
0 or α = 1 then the associated weighting vectors are uniquely defined as (0, 0, . . . , 0, 1)T
and (1, 0, . . . , 0, 0)T , respectively, with value of dispersion zero.
Suppose now that n ≥ 3 and 0 < α < 1. Let

\[
L(W, \lambda_1, \lambda_2) = -\sum_{i=1}^{n} w_i \ln w_i
+ \lambda_1 \Big( \sum_{i=1}^{n} w_i - 1 \Big)
+ \lambda_2 \Big( \frac{1}{n-1} \sum_{i=1}^{n} (n-i) w_i - \alpha \Big)
\]

denote the Lagrange function of the constrained optimization problem (1), where λ1 and λ2 are
real numbers. Then the partial derivatives of L are computed as
\[
\frac{\partial L}{\partial w_j} = -\ln w_j - 1 + \lambda_1 + \frac{n-j}{n-1}\,\lambda_2 = 0, \quad \forall j, \tag{2}
\]
\[
\frac{\partial L}{\partial \lambda_1} = \sum_{i=1}^{n} w_i - 1 = 0,
\]
\[
\frac{\partial L}{\partial \lambda_2} = \frac{1}{n-1} \sum_{i=1}^{n} (n-i) w_i - \alpha = 0.
\]
For j = n equation (2) turns into
− ln wn − 1 + λ1 = 0 ⇐⇒ λ1 = ln wn + 1,
and for j = 1 we get
− ln w1 − 1 + λ1 + λ2 = 0,
and, therefore,
λ2 = ln w1 + 1 − λ1 = ln w1 + 1 − ln wn − 1 = ln w1 − ln wn .
For 1 ≤ j ≤ n we find

\[
\ln w_j = \frac{j-1}{n-1} \ln w_n + \frac{n-j}{n-1} \ln w_1
\;\Rightarrow\;
w_j = \sqrt[n-1]{w_1^{\,n-j} w_n^{\,j-1}}. \tag{3}
\]
If w1 = wn then (3) gives

\[
w_1 = w_2 = \dots = w_n = \frac{1}{n} \;\Rightarrow\; \mathrm{disp}(W) = \ln n,
\]
which is the optimal solution to (1) for α = 0.5 (actually, this is the globally optimal value
for the dispersion of all OWA operators of dimension n). Suppose now that w1 ≠ wn. Let
us introduce the notations

\[
u_1 = w_1^{1/(n-1)}, \qquad u_n = w_n^{1/(n-1)}.
\]

Then we may rewrite (3) as $w_j = u_1^{\,n-j} u_n^{\,j-1}$, for 1 ≤ j ≤ n.
From the first condition, orness(W ) = α, we find

\[
\frac{1}{n-1} \sum_{i=1}^{n} (n-i) w_i = \alpha
\iff
\sum_{i=1}^{n} (n-i) u_1^{\,n-i} u_n^{\,i-1} = (n-1)\alpha,
\]
and from

\[
\begin{aligned}
\sum_{i=1}^{n} (n-i) u_1^{\,n-i} u_n^{\,i-1}
&= \frac{1}{u_1 - u_n} \Big( (n-1) u_1^{\,n} - \sum_{i=1}^{n-1} u_1^{\,i} u_n^{\,n-i} \Big) \\
&= \frac{1}{u_1 - u_n} \Big( (n-1) u_1^{\,n} - u_1 u_n \, \frac{u_1^{\,n-1} - u_n^{\,n-1}}{u_1 - u_n} \Big) \\
&= \frac{1}{(u_1 - u_n)^2} \Big( (n-1) u_1^{\,n} (u_1 - u_n) - u_1^{\,n} u_n + u_1 u_n^{\,n} \Big) \\
&= \frac{1}{(u_1 - u_n)^2} \Big( (n-1) u_1^{\,n+1} - n u_1^{\,n} u_n + u_1 u_n^{\,n} \Big),
\end{aligned}
\]
we get

\[
(n-1) u_1^{\,n+1} - n u_1^{\,n} u_n + u_1 u_n^{\,n} = (n-1)\alpha (u_1 - u_n)^2.
\]

Taking into account that $u_1^{\,n} - u_n^{\,n} = u_1 - u_n$ (which follows from the second condition, w1 + · · · + wn = 1, and is derived below as equation (5)), we can substitute $u_1 u_n^{\,n} = u_1^{\,n+1} - u_1^{\,2} + u_1 u_n$ and divide by $u_1 - u_n$ to obtain

\[
n u_1^{\,n} - u_1 = (n-1)\alpha (u_1 - u_n),
\]

that is,

\[
u_n = \frac{((n-1)\alpha + 1) u_1 - n u_1^{\,n}}{(n-1)\alpha},
\qquad
\frac{u_n}{u_1} = \frac{(n-1)\alpha + 1 - n w_1}{(n-1)\alpha}. \tag{4}
\]
From the second condition, w1 + · · · + wn = 1, we get

\[
\sum_{j=1}^{n} u_1^{\,n-j} u_n^{\,j-1} = 1
\iff \frac{u_1^{\,n} - u_n^{\,n}}{u_1 - u_n} = 1
\iff u_1^{\,n} - u_n^{\,n} = u_1 - u_n \tag{5}
\]
\[
\iff u_1^{\,n-1} - \frac{u_n}{u_1} \times u_n^{\,n-1} = 1 - \frac{u_n}{u_1}. \tag{6}
\]
Comparing equations (4) and (6) we find

\[
w_1 - \frac{(n-1)\alpha + 1 - n w_1}{(n-1)\alpha} \times w_n = \frac{n w_1 - 1}{(n-1)\alpha},
\]
and, therefore,

\[
w_n = \frac{((n-1)\alpha - n) w_1 + 1}{(n-1)\alpha + 1 - n w_1}. \tag{7}
\]

Let us rewrite equation (5) as
\[
\begin{aligned}
u_1^{\,n} - u_n^{\,n} &= u_1 - u_n \\
u_1 (w_1 - 1) &= u_n (w_n - 1) \\
w_1 (w_1 - 1)^{n-1} &= w_n (w_n - 1)^{n-1} \\
w_1 (w_1 - 1)^{n-1} &= \frac{((n-1)\alpha - n) w_1 + 1}{(n-1)\alpha + 1 - n w_1} \times \left( \frac{(n-1)\alpha (w_1 - 1)}{(n-1)\alpha + 1 - n w_1} \right)^{n-1} \\
w_1 \big[ (n-1)\alpha + 1 - n w_1 \big]^n &= ((n-1)\alpha)^{n-1} \big[ ((n-1)\alpha - n) w_1 + 1 \big].
\end{aligned} \tag{8}
\]
So the optimal value of w1 should satisfy equation (8). Once w1 is computed then wn
can be determined from equation (7) and the other weights are obtained from equation (3).
Remark 2.1. If n = 3 then from (3) we get

\[
w_2 = \sqrt{w_1 w_3}
\]

independently of the value of α, which means that the optimal value of w2 is always the
geometric mean of w1 and w3.
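Equation (3) is easy to evaluate once trial values of w1 and wn are available; a minimal Python sketch (the function name is ours, not from the report):

```python
def weights_from_endpoints(w1, wn, n):
    """Eq. (3): w_j = (w1^(n-j) * wn^(j-1))^(1/(n-1)) for j = 1, ..., n."""
    return [(w1 ** (n - j) * wn ** (j - 1)) ** (1.0 / (n - 1))
            for j in range(1, n + 1)]
```

For n = 3 the middle entry is exactly the geometric mean of w1 and w3, as in Remark 2.1, and equal endpoints reproduce the uniform vector.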
3 Computing the optimal weights
Let us introduce the notations

\[
f(w_1) = w_1 \big[ (n-1)\alpha + 1 - n w_1 \big]^n,
\qquad
g(w_1) = ((n-1)\alpha)^{n-1} \big[ ((n-1)\alpha - n) w_1 + 1 \big].
\]

Then to find the optimal value of the first weight we have to solve the equation

\[
f(w_1) = g(w_1),
\]

where g is a line and f is a polynomial in w1 of degree n + 1.
Without loss of generality we can assume that α < 0.5, because if a weighting vector
W is optimal for problem (1) under some given degree of orness, α < 0.5, then its reverse,
denoted by W^R and defined as

\[
w_i^R = w_{n-i+1},
\]

is also optimal for problem (1) under degree of orness (1 − α). Indeed, as was shown by
Yager [4],

\[
\mathrm{disp}(W^R) = \mathrm{disp}(W)
\quad \text{and} \quad
\mathrm{orness}(W^R) = 1 - \mathrm{orness}(W).
\]
Therefore, for any α > 0.5, we can solve problem (1) by solving it with level of orness
(1 − α) and then taking the reverse of that solution.
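The whole procedure can be sketched in Python (our illustration, not the authors' code): solve (8) for w1 by bisection, recover wn from (7) and the remaining weights from (3), and handle α > 0.5 by the reversal just described. The bisection bracket relies on f and g crossing exactly once in (0, 1/n), as argued in the analysis that follows.

```python
def max_entropy_owa(n, alpha):
    """Maximal-entropy OWA weights for dimension n and orness alpha,
    via equations (8), (7) and (3).  Assumes 0 < alpha < 1; the
    degenerate boundary cases alpha = 0, 1 are handled in the text."""
    assert 0 < alpha < 1
    if n == 2:
        return [alpha, 1.0 - alpha]            # closed form for n = 2
    if abs(alpha - 0.5) < 1e-12:
        return [1.0 / n] * n                   # uniform vector, disp = ln n
    if alpha > 0.5:
        # reversal trick: solve for 1 - alpha and reverse the weights
        return list(reversed(max_entropy_owa(n, 1.0 - alpha)))
    a = (n - 1) * alpha
    f = lambda w: w * (a + 1 - n * w) ** n                 # left side of (8)
    g = lambda w: a ** (n - 1) * ((a - n) * w + 1)         # right side of (8)
    lo, hi = 1e-9, 1.0 / n - 1e-9   # f - g goes from negative to positive here
    for _ in range(100):            # bisection for the root of (8)
        mid = 0.5 * (lo + hi)
        if f(mid) < g(mid):
            lo = mid
        else:
            hi = mid
    w1 = 0.5 * (lo + hi)
    wn = ((a - n) * w1 + 1) / (a + 1 - n * w1)             # equation (7)
    return [(w1 ** (n - j) * wn ** (j - 1)) ** (1.0 / (n - 1))
            for j in range(1, n + 1)]                      # equation (3)
```

Calling max_entropy_owa(5, 0.6) reproduces the weighting vector of Section 4 to four decimal places.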
From the equations

\[
f\Big(\frac{1}{n}\Big) = g\Big(\frac{1}{n}\Big)
\quad \text{and} \quad
f'\Big(\frac{1}{n}\Big) = g'\Big(\frac{1}{n}\Big)
\]

we get that g is always a tangent line to f at the point w1 = 1/n. But if w1 = 1/n then
w1 = · · · = wn = 1/n also holds, and that is the optimal solution for α = 0.5.
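This tangency can be confirmed numerically; in the sketch below (ours, for arbitrary n and α) the derivatives are approximated by central differences:

```python
def tangency_gap(n, alpha, h=1e-6):
    """Return |f(1/n) - g(1/n)| and |f'(1/n) - g'(1/n)|,
    the derivatives taken by central differences of step h."""
    a = (n - 1) * alpha
    f = lambda w: w * (a + 1 - n * w) ** n
    g = lambda w: a ** (n - 1) * ((a - n) * w + 1)
    x = 1.0 / n
    df = (f(x + h) - f(x - h)) / (2 * h)   # central-difference derivative of f
    dg = (g(x + h) - g(x - h)) / (2 * h)   # (g is a line, so this is exact)
    return abs(f(x) - g(x)), abs(df - dg)
```

Both gaps vanish (up to rounding) for any n and α, since f(1/n) = g(1/n) = ((n − 1)α)ⁿ/n and f′(1/n) = g′(1/n) = ((n − 1)α)ⁿ⁻¹((n − 1)α − n).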
Consider the graph of f. It is clear that f(0) = 0, and by solving the equation

\[
f'(w_1) = \big[ (n-1)\alpha + 1 - n w_1 \big]^n - n^2 w_1 \big[ (n-1)\alpha + 1 - n w_1 \big]^{n-1} = 0
\]

we find that its unique solution is

\[
\hat{w}_1 = \frac{(n-1)\alpha + 1}{n(n+1)} < \frac{1}{n},
\]

and its second derivative, f′′(ŵ1), is negative, which means that ŵ1 is the only maximizing
point of f on the segment [0, 1/n].
We prove now that g can intersect f only once in the open interval (0, 1/n); this
guarantees the uniqueness of the optimal solution of problem (1). Indeed, from the equation

\[
f''(w_1) = -2n^2 \big[ (n-1)\alpha + 1 - n w_1 \big]^{n-1} + n^3 (n-1) w_1 \big[ (n-1)\alpha + 1 - n w_1 \big]^{n-2} = 0
\]

we find that its unique solution is

\[
\bar{w}_1 = 2\,\frac{(n-1)\alpha + 1}{n(n+1)} = 2\hat{w}_1 < \frac{1}{n} \quad (\text{since } \alpha < 0.5),
\]

which means that f is strictly concave on (0, w̄1), has an inflection point at w̄1, and is
strictly convex on (w̄1, 1/n). Therefore, the graph of g should lie below the graph of f if
w̄1 < w1 < 1/n, and g can cross f only once, in the interval (0, w̄1).
4 Illustrations
Let us suppose that n = 5 and α = 0.6. Then from the equation

\[
w_1 \big[ 4 \times 0.6 + 1 - 5 w_1 \big]^5 = (4 \times 0.6)^4 \big[ 1 - (5 - 4 \times 0.6) w_1 \big]
\]

we find

\[
w_1^* = 0.2884,
\qquad
w_5^* = \frac{(4 \times 0.6 - 5) w_1^* + 1}{4 \times 0.6 + 1 - 5 w_1^*} = 0.1278,
\]

and, from (3),

\[
w_2^* = \sqrt[4]{(w_1^*)^3 w_5^*} = 0.2353,
\qquad
w_3^* = \sqrt[4]{(w_1^*)^2 (w_5^*)^2} = 0.1920,
\qquad
w_4^* = \sqrt[4]{w_1^* (w_5^*)^3} = 0.1566,
\]
and, disp(W ∗ ) = 1.5692.
Using exponential smoothing [1], Filev and Yager [6] obtained the following weighting
vector
W ′ = (0.41, 0.10, 0.13, 0.16, 0.20),
with disp(W ′ ) = 1.48 and orness(W ′ ) = 0.5904.
We first note that the weights computed from the constrained optimization problem have
a better dispersion than the ones obtained by Filev and Yager in [6]; however, the (heuristic)
technique suggested in [6] requires less computational effort.
Another interesting property is that small changes in the required level of orness, α,
can cause large variations in weighting vectors of near-optimal dispersion (compare, for
example, the weighting vectors W ∗ and W ′).
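The figures quoted above are easy to recheck (the weights of W ′ are copied from the text; the two-decimal outputs agree with the reported 0.5904 and 1.48 up to the rounding of the printed weights):

```python
import math

W_prime = [0.41, 0.10, 0.13, 0.16, 0.20]   # Filev-Yager vector from [6]
n = len(W_prime)
orn = sum((n - i) * w for i, w in enumerate(W_prime, start=1)) / (n - 1)
d = -sum(w * math.log(w) for w in W_prime)
print(round(orn, 2), round(d, 2))   # prints: 0.59 1.48
```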
References
[1] R.G. Brown, Smoothing, Forecasting and Prediction of Discrete Time Series
(Prentice-Hall, Englewood Cliffs, 1963).
[2] M. O’Hagan, Aggregating template or rule antecedents in real-time expert systems
with fuzzy set logic, in: Proc. 22nd Annual IEEE Asilomar Conf. on Signals, Systems
and Computers, Pacific Grove, CA, 1988 681-689.
[3] R.R. Yager, Ordered weighted averaging aggregation operators in multi-criteria decision making, IEEE Trans. on Systems, Man and Cybernetics, 18(1988) 183-190.
[4] R.R. Yager, Families of OWA operators, Fuzzy Sets and Systems, 59(1993) 125-148.
[5] R.R. Yager and D. Filev, Induced ordered weighted averaging operators, IEEE Trans.
on Systems, Man and Cybernetics – Part B: Cybernetics, 29(1999) 141-150.
[6] D. Filev and R.R. Yager, On the issue of obtaining OWA operator weights, Fuzzy
Sets and Systems, 94(1998) 157-169.
ISBN 952-12-0609-8
ISSN 1239-1891