On obtaining minimal variability OWA
operator weights
Robert Fullér
Department of Operations Research
Eötvös Loránd University
Pázmány Péter sétány 1C,
P. O. Box 120, H-1518 Budapest, Hungary
e-mail: rfuller@cs.elte.hu
and
Institute for Advanced Management Systems Research
Åbo Akademi University,
Lemminkäinengatan 14B, Åbo, Finland
e-mail: rfuller@mail.abo.fi
Péter Majlender
Turku Centre for Computer Science
Institute for Advanced Management Systems Research
Åbo Akademi University,
Lemminkäinengatan 14B, Åbo, Finland
e-mail: peter.majlender@mail.abo.fi
TUCS Technical Report
No 429, November 2001
Abstract
One important issue in the theory of Ordered Weighted Averaging (OWA) operators is the determination of the associated weights. One of the first approaches,
suggested by O’Hagan, determines a special class of OWA operators having maximal entropy of the OWA weights for a given level of orness; algorithmically it is
based on the solution of a constrained optimization problem. Another consideration that may be of interest to a decision maker involves the variability associated
with a weighting vector. In particular, a decision maker may desire low variability
associated with a chosen weighting vector. In this paper, using the Kuhn-Tucker
second-order sufficiency conditions for optimality, we shall analytically derive the
minimal variability weighting vector for any level of orness.
Keywords: Multiple criteria analysis, Fuzzy sets, OWA operator, Lagrange multiplier
1 Introduction
The process of information aggregation appears in many applications related to
the development of intelligent systems. One sees aggregation in neural networks,
fuzzy logic controllers, vision systems, expert systems and multi-criteria decision
aids. In [6] Yager introduced a new aggregation technique based on the ordered
weighted averaging operators.
Definition 1.1. An OWA operator of dimension n is a mapping F : R^n → R that
has an associated weighting vector W = (w_1, \dots, w_n)^T having the properties
w_1 + \dots + w_n = 1, 0 ≤ w_i ≤ 1, i = 1, \dots, n, and such that

F(a_1, \dots, a_n) = \sum_{i=1}^{n} w_i b_i,

where b_j is the jth largest element of the collection of the aggregated objects
{a_1, \dots, a_n}.
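For illustration only (not part of the original report), the aggregation in Definition 1.1 can be sketched in a few lines of Python; the function name owa and the use of NumPy are our own choices:

import numpy as np

def owa(weights, values):
    # Re-ordering step: b_j is the jth largest of the aggregated values.
    b = np.sort(np.asarray(values, dtype=float))[::-1]
    w = np.asarray(weights, dtype=float)
    assert np.isclose(w.sum(), 1.0) and np.all(w >= 0), "W must be a weighting vector"
    return float(np.dot(w, b))

# Example: with equal weights the OWA operator reduces to the arithmetic mean,
# e.g. owa([0.25, 0.25, 0.25, 0.25], [3, 1, 4, 2]) == 2.5.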
A fundamental aspect of this operator is the re-ordering step; in particular, an
aggregate a_i is not associated with a particular weight w_i, but rather a weight is
associated with a particular ordered position of the aggregate. When we view the OWA
weights as a column vector we shall find it convenient to refer to the weights with
the low indices as weights at the top and to those with the higher indices as weights
at the bottom. It is noted that different OWA operators are distinguished by their
weighting function.
In [6], Yager introduced a measure of orness associated with the weighting
vector W of an OWA operator, defined as

orness(W) = \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i,

and it characterizes the degree to which the aggregation is like an or operation. It
is clear that orness(W) ∈ [0, 1] holds for any weighting vector.
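The orness measure is equally direct to compute. The helper below is our own sketch (not from the report) and is reused in later examples:

import numpy as np

def orness(weights):
    # orness(W) = sum_{i=1}^{n} ((n - i) / (n - 1)) * w_i
    w = np.asarray(weights, dtype=float)
    n = len(w)
    i = np.arange(1, n + 1)
    return float(np.dot((n - i) / (n - 1), w))

# orness([1, 0, 0]) == 1.0 (pure "or"), orness([0, 0, 1]) == 0.0 (pure "and"),
# and orness([1/3, 1/3, 1/3]) == 0.5 (the arithmetic mean).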
It is clear that the actual type of aggregation performed by an OWA operator
depends upon the form of the weighting vector [9]. A number of approaches have
been suggested for obtaining the associated weights, e.g., quantifier guided aggregation [6, 7], exponential smoothing [3] and learning [10].
Another approach, suggested by O'Hagan [5], determines a special class of
OWA operators having maximal entropy of the OWA weights for a given level
of orness. This approach is based on the solution of the following mathematical
programming problem:

maximize   disp(W) = -\sum_{i=1}^{n} w_i \ln w_i

subject to   orness(W) = \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i = \alpha,  0 ≤ α ≤ 1,    (1)

             w_1 + \dots + w_n = 1,  0 ≤ w_i,  i = 1, \dots, n.
In [4] we transformed problem (1) into a polynomial equation, which was then
solved to determine the maximal entropy OWA weights.
Another interesting question is to determine the minimal variability weighting
vector under a given level of orness [8]. We shall measure the variance of a given
weighting vector as

D^2(W) = \frac{1}{n} \sum_{i=1}^{n} (w_i - E(W))^2
       = \frac{1}{n} \sum_{i=1}^{n} w_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} w_i \right)^2
       = \frac{1}{n} \sum_{i=1}^{n} w_i^2 - \frac{1}{n^2},

where E(W) = (w_1 + \dots + w_n)/n stands for the arithmetic mean of the weights.
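In code, D^2(W) is a one-liner; the sketch below (again our own helper, not the report's) mirrors the last form of the identity above:

import numpy as np

def variance(weights):
    # D^2(W) = (1/n) * sum_i w_i^2 - 1/n^2
    w = np.asarray(weights, dtype=float)
    n = len(w)
    return float(np.dot(w, w) / n - 1.0 / n ** 2)

# variance([0.2] * 5) == 0.0 and variance([1, 0, 0, 0, 0]) == 1/5 - 1/25 = 0.16.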
To obtain minimal variability OWA weights under a given level of orness we need
to solve the following constrained mathematical programming problem:

minimize   D^2(W) = \frac{1}{n} \sum_{i=1}^{n} w_i^2 - \frac{1}{n^2}

subject to   orness(W) = \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i = \alpha,  0 ≤ α ≤ 1,    (2)

             w_1 + \dots + w_n = 1,  0 ≤ w_i,  i = 1, \dots, n.

Using the Kuhn-Tucker second-order sufficiency conditions for optimality, we shall
solve problem (2) analytically and derive the exact minimal variability OWA weights
for any level of orness.
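Before turning to the analytic solution, problem (2) can also be solved numerically as a cross-check. The sketch below is our own and is not the method of the paper; it assumes SciPy's SLSQP solver, and the helper name min_var_weights_numeric is hypothetical:

import numpy as np
from scipy.optimize import minimize

def min_var_weights_numeric(n, alpha):
    # Problem (2): minimize (1/n) sum w_i^2 - 1/n^2 subject to
    # orness(W) = alpha, w_1 + ... + w_n = 1 and w_i >= 0.
    i = np.arange(1, n + 1)
    c = (n - i) / (n - 1)                      # orness coefficients
    objective = lambda w: np.dot(w, w) / n - 1.0 / n ** 2
    constraints = [
        {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
        {"type": "eq", "fun": lambda w: np.dot(c, w) - alpha},
    ]
    w0 = np.full(n, 1.0 / n)                   # start from the equal-weight vector
    res = minimize(objective, w0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * n, constraints=constraints)
    return res.x

# For n = 5 and alpha = 0.3 this reproduces (0.04, 0.12, 0.20, 0.28, 0.36)
# up to numerical tolerance (cf. the example in Section 3).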
2 Obtaining minimal variability OWA weights
Let us consider the constrained optimization problem (2). First we note that if
n = 2 then from orness(w_1, w_2) = α the optimal weights are uniquely defined as
w_1^* = α and w_2^* = 1 - α. Furthermore, if α = 0 or α = 1 then the associated
weighting vectors are uniquely defined as (0, 0, \dots, 0, 1)^T and (1, 0, \dots, 0, 0)^T,
respectively, with variability

D^2(1, 0, \dots, 0, 0) = D^2(0, 0, \dots, 0, 1) = \frac{1}{n} - \frac{1}{n^2}.
Suppose now that n ≥ 3 and 0 < α < 1. Let

L(W, \lambda_1, \lambda_2) = \frac{1}{n} \sum_{i=1}^{n} w_i^2 - \frac{1}{n^2}
  + \lambda_1 \left( \sum_{i=1}^{n} w_i - 1 \right)
  + \lambda_2 \left( \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i - \alpha \right)

denote the Lagrange function of constrained optimization problem (2), where
λ_1 and λ_2 are real numbers. Then the partial derivatives of L are computed as
\frac{\partial L}{\partial w_j} = \frac{2 w_j}{n} + \lambda_1 + \frac{n-j}{n-1} \cdot \lambda_2 = 0,  1 ≤ j ≤ n,

\frac{\partial L}{\partial \lambda_1} = \sum_{i=1}^{n} w_i - 1 = 0,    (3)

\frac{\partial L}{\partial \lambda_2} = \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i - \alpha = 0.
We shall suppose that the optimal weighting vector has the following form

W = (0, \dots, 0, w_p, \dots, w_q, 0, \dots, 0)^T    (4)

where 1 ≤ p < q ≤ n, and use the notation

I_{p,q} = \{p, p+1, \dots, q-1, q\}

for the indexes from p to q. So, w_j = 0 if j \notin I_{p,q} and w_j ≥ 0 if j \in I_{p,q}.
For j = p we find that

\frac{\partial L}{\partial w_p} = \frac{2 w_p}{n} + \lambda_1 + \frac{n-p}{n-1} \cdot \lambda_2 = 0,

and for j = q we get

\frac{\partial L}{\partial w_q} = \frac{2 w_q}{n} + \lambda_1 + \frac{n-q}{n-1} \cdot \lambda_2 = 0.

That is,

\frac{2(w_p - w_q)}{n} + \frac{q-p}{n-1} \cdot \lambda_2 = 0,
and therefore, the optimal values of λ_1 and λ_2 (denoted by λ_1^* and λ_2^*) should
satisfy the following equations

\lambda_1^* = \frac{2}{n} \left( \frac{n-q}{q-p} \cdot w_p - \frac{n-p}{q-p} \cdot w_q \right)
  \quad and \quad
\lambda_2^* = \frac{n-1}{q-p} \cdot \frac{2}{n} \cdot (w_q - w_p).    (5)
Substituting λ_1^* for λ_1 and λ_2^* for λ_2 in (3) we get

\frac{2}{n} \cdot w_j + \frac{2}{n} \left( \frac{n-q}{q-p} \cdot w_p - \frac{n-p}{q-p} \cdot w_q \right)
  + \frac{n-j}{n-1} \cdot \frac{n-1}{q-p} \cdot \frac{2}{n} \cdot (w_q - w_p) = 0.

That is, the jth optimal weight should satisfy the equation

w_j^* = \frac{q-j}{q-p} \cdot w_p + \frac{j-p}{q-p} \cdot w_q,  j \in I_{p,q}.    (6)
From representation (4) we get

\sum_{i=p}^{q} w_i^* = 1,

that is,

\sum_{i=p}^{q} \left( \frac{q-i}{q-p} \cdot w_p + \frac{i-p}{q-p} \cdot w_q \right) = 1,

i.e.

w_p + w_q = \frac{2}{q-p+1}.
From the constraint orness(W) = α we find

\sum_{i=p}^{q} \frac{n-i}{n-1} \cdot w_i
  = \sum_{i=p}^{q} \left( \frac{n-i}{n-1} \cdot \frac{q-i}{q-p} \cdot w_p
  + \frac{n-i}{n-1} \cdot \frac{i-p}{q-p} \cdot w_q \right) = \alpha,

that is,

w_p^* = \frac{2(2q + p - 2) - 6(n-1)(1-\alpha)}{(q-p+1)(q-p+2)}    (7)

and

w_q^* = \frac{2}{q-p+1} - w_p^* = \frac{6(n-1)(1-\alpha) - 2(q + 2p - 4)}{(q-p+1)(q-p+2)}.    (8)
The optimal weighting vector
W ∗ = (0, . . . , 0, wp∗ , . . . , wq∗ , 0 . . . , 0)T
is feasible if and only if wp∗ , wq∗ ∈ [0, 1], because according to (6) any other wj∗ ,
j ∈ I{p,q} is computed as their convex linear combination.
Using formulas (7) and (8) we find

w_p^*, w_q^* \in [0, 1] \iff \alpha \in \left[ 1 - \frac{1}{3} \cdot \frac{2q+p-2}{n-1},\; 1 - \frac{1}{3} \cdot \frac{q+2p-4}{n-1} \right].
The following (disjunctive) partition of the unit interval (0, 1) will be crucial in
finding an optimal solution to problem (2):

(0, 1) = \bigcup_{r=2}^{n-1} J_{r,n} \cup J_{1,n} \cup \bigcup_{s=2}^{n-1} J_{1,s},    (9)

where

J_{r,n} = \left[ 1 - \frac{1}{3} \cdot \frac{2n+r-2}{n-1},\; 1 - \frac{1}{3} \cdot \frac{2n+r-3}{n-1} \right],  r = 2, \dots, n-1,

J_{1,n} = \left[ 1 - \frac{1}{3} \cdot \frac{2n-1}{n-1},\; 1 - \frac{1}{3} \cdot \frac{n-2}{n-1} \right],

J_{1,s} = \left[ 1 - \frac{1}{3} \cdot \frac{s-1}{n-1},\; 1 - \frac{1}{3} \cdot \frac{s-2}{n-1} \right],  s = 2, \dots, n-1.
Consider again problem (2) and suppose that α ∈ J_{r,s} for some r and s from
partition (9). Such r and s always exist for any α ∈ (0, 1); furthermore, r = 1 or
s = n should hold.
Then

W^* = (0, \dots, 0, w_r^*, \dots, w_s^*, 0, \dots, 0)^T    (10)

where

w_j^* = 0,  if j \notin I_{r,s},

w_r^* = \frac{2(2s + r - 2) - 6(n-1)(1-\alpha)}{(s-r+1)(s-r+2)},

w_s^* = \frac{6(n-1)(1-\alpha) - 2(s + 2r - 4)}{(s-r+1)(s-r+2)},    (11)

w_j^* = \frac{s-j}{s-r} \cdot w_r^* + \frac{j-r}{s-r} \cdot w_s^*,  if j \in I_{r+1,s-1},
and I_{r+1,s-1} = \{r+1, \dots, s-1\}. Furthermore, from the construction of W^* it
is clear that

\sum_{i=1}^{n} w_i^* = \sum_{i=r}^{s} w_i^* = 1,  w_i^* ≥ 0,  i = 1, 2, \dots, n,

and orness(W^*) = α, that is, W^* is feasible for problem (2).
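To make the construction concrete, the following Python sketch is our own transcription of formulas (10)-(11); the function name min_var_weights is an assumption, and instead of an explicit interval lookup in the partition (9) it scans the candidate index pairs (r, s) with r = 1 or s = n, ordered so that the first pair with nonnegative boundary weights is the one selected by (9):

import numpy as np

def min_var_weights(n, alpha):
    # Analytic minimal variability OWA weights for problem (2).
    w = np.zeros(n)
    if alpha <= 0.0:                            # degenerate corner cases
        w[-1] = 1.0
        return w
    if alpha >= 1.0:
        w[0] = 1.0
        return w
    # By (9) the support of W* is {r, ..., s} with r = 1 or s = n.
    candidates = ([(1, n)]
                  + [(r, n) for r in range(2, n)]
                  + [(1, s) for s in range(n - 1, 1, -1)])
    for r, s in candidates:
        d = (s - r + 1) * (s - r + 2)
        wr = (2 * (2 * s + r - 2) - 6 * (n - 1) * (1 - alpha)) / d
        ws = (6 * (n - 1) * (1 - alpha) - 2 * (s + 2 * r - 4)) / d
        if wr >= 0.0 and ws >= 0.0:             # alpha lies in J_{r,s}
            j = np.arange(r, s + 1)
            w[r - 1:s] = (s - j) / (s - r) * wr + (j - r) / (s - r) * ws   # formula (11)
            return w
    raise ValueError("alpha must lie in [0, 1]")

For instance, min_var_weights(5, 0.1) returns approximately (0, 0, 0.0333, 0.3333, 0.6333), in line with the worked example in Section 3.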
We will show now that W^*, defined by (10), satisfies the Kuhn-Tucker second-order
sufficiency conditions for optimality ([2], page 58). Namely,
(i) There exist λ_1^*, λ_2^* ∈ R and μ_1^* ≥ 0, \dots, μ_n^* ≥ 0 such that

\frac{\partial}{\partial w_k} \left[ D^2(W) + \lambda_1^* \left( \sum_{i=1}^{n} w_i - 1 \right)
  + \lambda_2^* \left( \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i - \alpha \right)
  + \sum_{j=1}^{n} \mu_j^* (-w_j) \right]_{W = W^*} = 0    (12)

for 1 ≤ k ≤ n, and μ_j^* w_j^* = 0, j = 1, \dots, n.
(ii) W^* is a regular point of the constraints.

(iii) The Hessian matrix

\frac{\partial^2}{\partial W^2} \left[ D^2(W) + \lambda_1^* \left( \sum_{i=1}^{n} w_i - 1 \right)
  + \lambda_2^* \left( \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i - \alpha \right)
  + \sum_{j=1}^{n} \mu_j^* (-w_j) \right]_{W = W^*}

is positive definite on

\hat{X} = \{ y \mid h_1^T y = 0,\; h_2^T y = 0 \text{ and } g_j^T y = 0 \text{ for all } j \text{ with } \mu_j^* > 0 \},

where

h_1 = \left( \frac{n-1}{n-1}, \frac{n-2}{n-1}, \dots, \frac{1}{n-1}, 0 \right)^T    (13)

and

h_2 = (1, 1, \dots, 1, 1)^T    (14)

are the gradients of the linear equality constraints, and

g_j = (0, 0, \dots, 0, -1, 0, 0, \dots, 0)^T  (with the -1 in the jth position)    (15)

is the gradient of the jth linear inequality constraint of problem (2).
Proof. (i) According to (5) we get

\lambda_1^* = \frac{2}{n} \left( \frac{n-s}{s-r} \cdot w_r^* - \frac{n-r}{s-r} \cdot w_s^* \right)

and

\lambda_2^* = \frac{n-1}{s-r} \cdot \frac{2}{n} \cdot (w_s^* - w_r^*),

and

\frac{2}{n} \cdot w_k^* + \lambda_1^* + \frac{n-k}{n-1} \cdot \lambda_2^* - \mu_k^* = 0

for k = 1, \dots, n. If k \in I_{r,s} then

\mu_k^* = \frac{2}{n} \cdot \frac{s-k}{s-r} \cdot w_r^* + \frac{2}{n} \cdot \frac{k-r}{s-r} \cdot w_s^*
  + \frac{2}{n} \left( \frac{n-s}{s-r} \cdot w_r^* - \frac{n-r}{s-r} \cdot w_s^* \right)
  + \frac{n-k}{n-1} \cdot \frac{n-1}{s-r} \cdot \frac{2}{n} \cdot (w_s^* - w_r^*)

  = \frac{2}{n} \cdot \frac{1}{s-r} \left[ (s-k+n-s-n+k) w_r^* + (k-r-n+r+n-k) w_s^* \right]

  = 0.
If k \notin I_{r,s} then w_k^* = 0. Then from the equality

\lambda_1^* + \frac{n-k}{n-1} \cdot \lambda_2^* - \mu_k^* = 0,

we find

\mu_k^* = \lambda_1^* + \frac{n-k}{n-1} \cdot \lambda_2^*
  = \frac{2}{n} \left( \frac{n-s}{s-r} \cdot w_r^* - \frac{n-r}{s-r} \cdot w_s^* \right)
  + \frac{n-k}{n-1} \cdot \frac{n-1}{s-r} \cdot \frac{2}{n} \cdot (w_s^* - w_r^*)
  = \frac{2}{n} \cdot \frac{1}{s-r} \left[ (k-s) w_r^* + (r-k) w_s^* \right].

We need to show that μ_k^* ≥ 0 for k \notin I_{r,s}. That is,

(k-s) w_r^* + (r-k) w_s^* = (k-s) \cdot \frac{2(2s+r-2) - 6(n-1)(1-\alpha)}{(s-r+1)(s-r+2)}
  + (r-k) \cdot \frac{6(n-1)(1-\alpha) - 2(s+2r-4)}{(s-r+1)(s-r+2)} ≥ 0.    (16)
If r = 1 and s = n then we get that μ_k^* = 0 for k = 1, \dots, n. Suppose now that
r = 1 and s < n. In this case the inequality k > s > 1 should hold and (16) leads
to the following requirement for α,

\alpha ≥ 1 - \frac{(s-1)(3k-2s-2)}{3(n-1)(2k-s-1)}.

On the other hand, from α ∈ J_{1,s} and s < n we have

\alpha \in \left[ 1 - \frac{1}{3} \cdot \frac{s-1}{n-1},\; 1 - \frac{1}{3} \cdot \frac{s-2}{n-1} \right],

and, therefore,

\alpha ≥ 1 - \frac{1}{3} \cdot \frac{s-1}{n-1}.

Finally, from the inequality

1 - \frac{1}{3} \cdot \frac{s-1}{n-1} ≥ 1 - \frac{(s-1)(3k-2s-2)}{3(n-1)(2k-s-1)}

we get that (16) holds. The proof of the remaining case (r > 1 and s = n) is
carried out analogously.
(ii) The gradient vectors of the linear equality and inequality constraints are computed
by (13), (14) and (15), respectively. If r = 1 and s = n then w_j^* ≠ 0 for
all j = 1, \dots, n. Then it is easy to see that h_1 and h_2 are linearly independent. If
r = 1 and s < n then w_j^* = 0 for j = s+1, \dots, n, and in this case

g_j = (0, 0, \dots, 0, -1, 0, 0, \dots, 0)^T  (with the -1 in the jth position)

for j = s+1, \dots, n. Consider the matrix

G = [h_1, h_2, g_{s+1}, \dots, g_n] \in R^{n \times (n-s+2)}.

Then the determinant of the lower-left submatrix of dimension (n-s+2) × (n-s+2)
of G is

\begin{vmatrix}
\frac{n-s+1}{n-1} & 1 & 0 & \cdots & 0 \\
\frac{n-s}{n-1} & 1 & 0 & \cdots & 0 \\
\frac{n-s-1}{n-1} & 1 & -1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 1 & 0 & \cdots & -1
\end{vmatrix}
= (-1)^{n-s} \begin{vmatrix} \frac{n-s+1}{n-1} & 1 \\ \frac{n-s}{n-1} & 1 \end{vmatrix}
= \frac{(-1)^{n-s}}{n-1} \neq 0,

which means that the columns of G are linearly independent, and therefore, the
system

\{h_1, h_2, g_{s+1}, \dots, g_n\}

is linearly independent.
If r > 1 and s = n then w_j^* = 0 for j = 1, \dots, r-1, and in this case

g_j = (0, 0, \dots, 0, -1, 0, 0, \dots, 0)^T  (with the -1 in the jth position)

for j = 1, \dots, r-1. Consider the matrix

F = [h_1, h_2, g_1, \dots, g_{r-1}] \in R^{n \times (r+1)}.

Then the determinant of the upper-left submatrix of dimension (r+1) × (r+1)
of F is

\begin{vmatrix}
\frac{n-1}{n-1} & 1 & -1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{n-r+1}{n-1} & 1 & 0 & \cdots & -1 \\
\frac{n-r}{n-1} & 1 & 0 & \cdots & 0 \\
\frac{n-r-1}{n-1} & 1 & 0 & \cdots & 0
\end{vmatrix}
= (-1)^{r-1} \begin{vmatrix} \frac{n-r}{n-1} & 1 \\ \frac{n-r-1}{n-1} & 1 \end{vmatrix}
= \frac{(-1)^{r-1}}{n-1} \neq 0,

which means that the columns of F are linearly independent, and therefore, the
system

\{h_1, h_2, g_1, \dots, g_{r-1}\}

is linearly independent. So W^* is a regular point for problem (2).
(iii) Let us introduce the notation

K(W) = D^2(W) + \lambda_1^* \left( \sum_{i=1}^{n} w_i - 1 \right)
  + \lambda_2^* \left( \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i - \alpha \right)
  + \sum_{j=1}^{n} \mu_j^* (-w_j).

The Hessian matrix of K at W^* is

\left. \frac{\partial^2 K(W)}{\partial w_k \partial w_j} \right|_{W = W^*}
  = \left. \frac{\partial^2 D^2(W)}{\partial w_k \partial w_j} \right|_{W = W^*}
  = \frac{2}{n} \cdot \delta_{kj},

where

\delta_{jk} = \begin{cases} 1 & if j = k \\ 0 & otherwise. \end{cases}

That is,

K''(W^*) = \frac{2}{n} \cdot \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix},

which is a positive definite matrix on R^n.
So, the objective function D^2(W) has a local minimum at the point W = W^* on

X = \left\{ W \in R^n \,\middle|\, W ≥ 0,\; \sum_{i=1}^{n} w_i = 1,\; \sum_{i=1}^{n} \frac{n-i}{n-1} \cdot w_i = \alpha \right\},    (17)

where X is the set of feasible solutions of problem (2). Taking into consideration
that D^2 : R^n → R is a strictly convex, bounded and continuous function, and X
is a convex and compact subset of R^n, we can conclude that D^2 attains its (unique)
global minimum on X at the point W^*.
3 Example
In this Section we will determine the minimal variability five-dimensional weighting vector under orness levels α = 0, 0.1, . . . , 0.9 and 1.0. First, we construct the
corresponding partition as
(0, 1) = \bigcup_{r=2}^{4} J_{r,5} \cup J_{1,5} \cup \bigcup_{s=2}^{4} J_{1,s},

where

J_{r,5} = \left[ \frac{1}{3} \cdot \frac{5-r-1}{5-1},\; \frac{1}{3} \cdot \frac{5-r}{5-1} \right] = \left[ \frac{4-r}{12},\; \frac{5-r}{12} \right]

for r = 2, 3, 4, and

J_{1,5} = \left[ \frac{1}{3} \cdot \frac{5-2}{5-1},\; \frac{1}{3} \cdot \frac{10-1}{5-1} \right] = \left[ \frac{3}{12},\; \frac{9}{12} \right],

and

J_{1,s} = \left[ 1 - \frac{1}{3} \cdot \frac{s-1}{5-1},\; 1 - \frac{1}{3} \cdot \frac{s-2}{5-1} \right] = \left[ \frac{13-s}{12},\; \frac{14-s}{12} \right]

for s = 2, 3, 4, and, therefore, we get

(0, 1) = \left[ 0, \frac{1}{12} \right] \cup \left[ \frac{1}{12}, \frac{2}{12} \right] \cup \left[ \frac{2}{12}, \frac{3}{12} \right] \cup \left[ \frac{3}{12}, \frac{9}{12} \right]
  \cup \left[ \frac{9}{12}, \frac{10}{12} \right] \cup \left[ \frac{10}{12}, \frac{11}{12} \right] \cup \left[ \frac{11}{12}, \frac{12}{12} \right].
Without loss of generality we can assume that α < 0.5, because if a weighting
vector W is optimal for problem (2) under some given degree of orness, α < 0.5,
then its reverse, denoted by W R , and defined as
wiR = wn−i+1
is also optimal for problem (2) under degree of orness (1 - α). Indeed, as was
shown by Yager [7], we find that
D 2 (W R ) = D 2 (W ) and orness(W R ) = 1 − orness(W ).
Therefore, for any α > 0.5, we can solve problem (2) by solving it for level of
orness (1 − α) and then taking the reverse of that solution.
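A tiny sketch of this reversal (our own helper; orness and variance refer to the helpers sketched in Section 1):

import numpy as np

def reverse(weights):
    # w_i^R = w_{n-i+1}
    return np.asarray(weights, dtype=float)[::-1]

# One can then check numerically that variance(reverse(W)) == variance(W) and
# orness(reverse(W)) == 1 - orness(W) for any weighting vector W, which is the
# property used above.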
Then we obtain the optimal weights from (11) as follows:

• if α = 0 then W^*(α) = W^*(0) = (0, 0, \dots, 0, 1)^T and, therefore,
W^*(1) = (W^*(0))^R = (1, 0, \dots, 0, 0)^T.

• if α = 0.1 then

\alpha \in J_{3,5} = \left[ \frac{1}{12}, \frac{2}{12} \right],

and the associated minimal variability weights are

w_1^*(0.1) = 0,
w_2^*(0.1) = 0,
w_3^*(0.1) = \frac{2(10 + 3 - 2) - 6(5-1)(1-0.1)}{(5-3+1)(5-3+2)} = \frac{0.4}{12} = 0.0333,
w_5^*(0.1) = \frac{2}{5-3+1} - w_3^*(0.1) = 0.6334,
w_4^*(0.1) = \frac{1}{2} \cdot w_3^*(0.1) + \frac{1}{2} \cdot w_5^*(0.1) = 0.3333.

So,

W^*(α) = W^*(0.1) = (0, 0, 0.033, 0.333, 0.633)^T,

and, consequently,

W^*(0.9) = (W^*(0.1))^R = (0.633, 0.333, 0.033, 0, 0)^T,

with variance D^2(W^*(0.1)) = 0.0625.
• if α = 0.2 then

\alpha \in J_{2,5} = \left[ \frac{2}{12}, \frac{3}{12} \right],

and in a similar manner we find that the associated minimal variability weighting vector is

W^*(0.2) = (0.0, 0.04, 0.18, 0.32, 0.46)^T,

and, therefore,

W^*(0.8) = (0.46, 0.32, 0.18, 0.04, 0.0)^T,

with variance D^2(W^*(0.2)) = 0.0296.
• if α = 0.3 then

\alpha \in J_{1,5} = \left[ \frac{3}{12}, \frac{9}{12} \right],

and in a similar manner we find that the associated minimal variability weighting vector is

W^*(0.3) = (0.04, 0.12, 0.20, 0.28, 0.36)^T,

and, therefore,

W^*(0.7) = (0.36, 0.28, 0.20, 0.12, 0.04)^T,

with variance D^2(W^*(0.3)) = 0.0128.
• if α = 0.4 then

\alpha \in J_{1,5} = \left[ \frac{3}{12}, \frac{9}{12} \right],

and in a similar manner we find that the associated minimal variability weighting vector is

W^*(0.4) = (0.12, 0.16, 0.20, 0.24, 0.28)^T,

and, therefore,

W^*(0.6) = (0.28, 0.24, 0.20, 0.16, 0.12)^T,

with variance D^2(W^*(0.4)) = 0.0032.
• if α = 0.5 then

W^*(0.5) = (0.2, 0.2, 0.2, 0.2, 0.2)^T,

with variance D^2(W^*(0.5)) = 0.
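The computations above can be regenerated by looping the analytic sketch from Section 2 over the grid of orness levels; this usage example assumes the hypothetical helpers min_var_weights and variance introduced earlier and is not part of the original report:

import numpy as np

# Assumes min_var_weights() and variance() as sketched earlier in this report.
n = 5
for alpha in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    W = min_var_weights(n, alpha)
    print(f"alpha = {alpha:.1f}  W* = {np.round(W, 4)}  D^2 = {variance(W):.4f}")
# For alpha > 0.5, take the reverse of the solution obtained at 1 - alpha.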
4 Summary
We have extended the power of decision making with OWA operators by introducing considerations of minimizing variability into the process of selecting the
optimal alternative. Of particular significance in this work is the development of a
methodology to calculate the minimal variability weights in an analytic way.
References
[1] R.G. Brown, Smoothing, Forecasting and Prediction of Discrete Time Series
(Prentice-Hall, Englewood Cliffs, 1963).
[2] V. Chankong and Y. Y. Haimes, Multiobjective Decision Making: Theory
and Methodology (North-Holland, 1983).
[3] D. Filev and R. R. Yager, On the issue of obtaining OWA operator weights,
Fuzzy Sets and Systems, 94(1998) 157-169.
[4] R. Fullér and P. Majlender, An analytic approach for obtaining maximal
entropy OWA operator weights, Fuzzy Sets and Systems, 124(2001) 53-57.
[5] M. O'Hagan, Aggregating template or rule antecedents in real-time expert
systems with fuzzy set logic, in: Proc. 22nd Annual IEEE Asilomar Conf.
Signals, Systems, Computers, Pacific Grove, CA, 1988, 681-689.
[6] R. R. Yager, Ordered weighted averaging aggregation operators in multicriteria decision making, IEEE Trans. on Systems, Man and Cybernetics,
18(1988) 183-190.
[7] R. R. Yager, Families of OWA operators, Fuzzy Sets and Systems, 59(1993)
125-148.
[8] R. R. Yager, On the inclusion of variance in decision making under uncertainty, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 4(1995) 401-419.
[9] R. R. Yager, On the analytic representation of the Leximin ordering and its
application to flexible constraint propagation, European Journal of Operational Research, 102(1997) 176-192.
[10] R. R. Yager and D. Filev, Induced ordered weighted averaging operators,
IEEE Trans. on Systems, Man and Cybernetics – Part B: Cybernetics,
29(1999) 141-150.