On possibilistic correlation∗
Christer Carlsson
christer.carlsson@abo.fi
Robert Fullér
rfuller@abo.fi, rfuller@cs.elte.hu
Péter Majlender
peter.majlender@abo.fi
Abstract
In 2004 Fullér and Majlender introduced the notion of covariance between fuzzy numbers by their joint possibility distribution to measure the
degree to which they interact. Based on this approach, in this paper we will
present the concept of possibilistic correlation representing an average degree
of interaction between marginal distributions of a joint possibility distribution as compared to their respective dispersions. Moreover, we will formulate
the classical Cauchy-Schwarz inequality in this possibilistic environment and
show that the measure of possibilistic correlation satisfies the same property
as its probabilistic counterpart. In particular, applying the idea of transforming level sets of possibility distributions into uniform probability distributions, we will point out a fundamental relationship between our proposed
possibilistic approach and the classical probabilistic approach to measuring
correlation.
1 Introduction
In probability theory the notion of mean value of functions of random variables
plays a fundamental role in defining the basic characteristic measures of probability distributions. For instance, the measure of covariance, variance and correlation
of random variables can all be computed as probabilistic means of their appropriately chosen real-valued functions. For variance and covariance of fuzzy random
variables the reader can consult, e.g. Puri and Ralescu [12], and Feng, Hu and Shu
[6].
Using the concept of joint possibility distribution, Fullér and Majlender [8] introduced the interactivity function between level sets of marginal distributions of a
∗ The final version of this paper appeared in: C. Carlsson, R. Fullér and P. Majlender, On possibilistic correlation, Fuzzy Sets and Systems, 155(2005) 425-445.
joint possibility distribution. Marginal possibility distributions are always uniquely
defined by their joint possibility distribution by the principle of falling shadows.
Applying the principle of average values of functions on (classical) sets they formulated the notion of expected value of functions on fuzzy sets. Furthermore, they
defined a measure of covariance between marginal distributions of a joint possibility distribution as the expected value of an appropriately chosen function (the
interactivity function) on the joint distribution.
In this paper, using the definitions of possibilistic covariance and variance [8], we
will introduce a measure of possibilistic correlation and present several forms of
the Cauchy-Schwarz inequality for possibility distributions.
The paper is organized as follows. First we will recall the definitions of covariance
and correlation between two random variables and then summarize some of the basic properties of possibility distributions. In Section 2 we will recall the definition
of the expected value operator [8] and present its relation to probabilistic means. In
Section 3 we will interpret the basic normative measures, covariance and variance,
of possibility distributions from a pure probabilistic point of view. We will see that
the possibilistic covariance between fuzzy numbers A and B is nothing else but the
weighted average of the probabilistic covariances between random variables with
uniform joint distribution on the level sets of the joint possibility distribution of A
and B. Coincidentally, the possibilistic variance of a fuzzy number computes the
weighted average of the probabilistic variances of uniformly distributed random
variables on its level sets.
In Section 4 we will introduce a measure of possibilistic correlation between fuzzy
numbers by their joint possibility distribution as an average measure of their interaction compared to their respective marginal variances. In particular, we will
present the concept of possibilistic correlation in a probabilistic setting and point
out the fundamental difference between the standard probabilistic approach and
our proposed possibilistic approach to computing and interpreting the correlation
coefficient in these environments.
In Sections 5 and 6 we will present and prove the weak and the strong forms of the
possibilistic Cauchy-Schwarz inequality. From these results we will obtain that the
possibilistic correlation coefficient satisfies the same property as the probabilistic
one: it lies in the range [−1, +1]. Furthermore, we will analyze the two unique
cases when the correlation coefficient equals 1 or −1.
Finally, in Section 7 we will illustrate the case of non-interactive fuzzy numbers
(when the correlation coefficient is zero) and the two extremal cases when the
correlation coefficient equals 1 or −1.
Let us recall the definition and the fundamental property of the correlation coefficient between random variables. That is, let X and Y be random variables. Then,
the probabilistic Cauchy-Schwarz inequality states that the following relationship
holds between the covariance and the variances of X and Y
$$(\operatorname{cov}(X, Y))^2 \le \sigma_X^2\,\sigma_Y^2. \tag{1}$$
If σX, σY ≠ 0 then the correlation coefficient of X and Y is defined as
$$\operatorname{cor}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sigma_X\,\sigma_Y},$$
and from (1) it is obvious that −1 ≤ cor(X, Y) ≤ 1.
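As a quick numerical sanity check (our illustration, not part of the paper), the following Python sketch estimates cov(X, Y) and cor(X, Y) from a sample of two linearly related random variables and verifies the bound forced by (1):

```python
import random
import statistics as st

# Illustrative sketch: estimate cov(X, Y) and the correlation coefficient
# from a sample, and check the Cauchy-Schwarz bound (1).
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(50_000)]
ys = [0.5 * x + random.gauss(0.0, 1.0) for x in xs]

mx, my = st.fmean(xs), st.fmean(ys)
cov_xy = st.fmean((x - mx) * (y - my) for x, y in zip(xs, ys))
cor_xy = cov_xy / (st.pstdev(xs) * st.pstdev(ys))

# (cov(X, Y))^2 <= var(X) var(Y), hence -1 <= cor(X, Y) <= 1.
assert cov_xy ** 2 <= st.pvariance(xs) * st.pvariance(ys)
assert -1.0 <= cor_xy <= 1.0
```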
In the following we will briefly recall some of the basics of possibility distributions. A fuzzy number A is a fuzzy set in R that has a normal, fuzzy convex
and continuous membership function of bounded support. The family of all fuzzy
numbers will be denoted by F. Fuzzy numbers can be considered as possibility distributions [13]. If C is a fuzzy set in R^n then its γ-level set is defined by [C]^γ = cl{(x_1, ..., x_n) ∈ R^n : C(x_1, ..., x_n) > γ} for any γ ∈ [0, 1], where cl stands for the closure of sets. It is clear that if A ∈ F is a fuzzy number then [A]^γ is a compact interval for all γ ∈ [0, 1].
Let Ai ∈ F, i = 1, . . . , n, be fuzzy numbers, and let C be a fuzzy set in Rn . Then,
C is said to be a joint possibility distribution of Ai , i = 1, . . . , n, if the following
relationships hold
$$A_i(x_i) = \max_{x_j \in \mathbb{R},\ j \neq i} C(x_1, \ldots, x_n), \qquad \forall x_i \in \mathbb{R},\ i = 1, \ldots, n. \tag{2}$$
In this case we will call Ai the i-th marginal possibility distribution of C and use
the notation Ai = πi (C), where πi denotes the projection operator in Rn onto the
i-th axis, i = 1, . . . , n. We will refer to property (2) as the principle of falling
shadows [8].
The concept of conditional independence has been studied in depth in possibility theory, for good surveys see, e.g. Campos and Huete [1, 2]. The notion of
non-interactivity in possibility theory was introduced by Zadeh [13]. Hisdal [11]
demonstrated the difference between conditional independence and non-interactivity.
In the sense of subsethood of fuzzy sets the largest joint possibility distribution
defines the concept of non-interaction. That is, fuzzy numbers Ai ∈ F, i =
1, . . . , n, are said to be non-interactive if their joint possibility distribution is given
by
$$C(x_1, \ldots, x_n) = \min\{A_1(x_1), \ldots, A_n(x_n)\}, \qquad \forall x_1, \ldots, x_n \in \mathbb{R}.$$
If A1 , A2 ∈ F are non-interactive then their joint membership function is defined
by A1 × A2. It is clear that in this case any change in the membership function of A1 does not affect the second marginal possibility distribution A2, and vice versa.
3
On the other hand, A1 and A2 are said to be interactive if they cannot take their values independently of each other [4, 5].
A function f : [0, 1] → R is called a weighting function [7] if it is nonnegative, monotone increasing and normalized over the unit interval, i.e.
$$\int_0^1 f(\gamma)\,d\gamma = 1.$$
Different weighting functions can assign different (case-dependent) importances to the γ-level sets of fuzzy numbers. This is motivated in part by the desire to give less importance to the lower levels of fuzzy sets [10] (which is why f should be monotone increasing).
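A standard example of a weighting function is f(γ) = 2γ, which is nonnegative, increasing, and integrates to 1. The following sketch (our illustration) verifies the normalization condition with a midpoint Riemann sum:

```python
# Sketch: f(gamma) = 2*gamma is a valid weighting function -- nonnegative,
# monotone increasing, and normalized over [0, 1].  We verify the
# normalization integral with a midpoint Riemann sum (exact value: 1).
def f(gamma):
    return 2.0 * gamma

n = 100_000
integral = sum(f((k + 0.5) / n) for k in range(n)) / n
assert abs(integral - 1.0) < 1e-9
```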
2 The concepts of central value and expected value
In this section we will recall the definitions of central value and expected value of
possibility distributions introduced in [8], and explain their relation to probabilistic
mean values.
Let C be a joint possibility distribution in Rn , let g : Rn → R be an integrable
function, and let γ ∈ [0, 1]. Then, the central value of g on [C]γ is defined by [8]
$$C_{[C]^\gamma}(g) = \frac{1}{\int_{[C]^\gamma} dx}\int_{[C]^\gamma} g(x)\,dx = \frac{1}{\int_{[C]^\gamma} dx_1 \cdots dx_n}\int_{[C]^\gamma} g(x_1, \ldots, x_n)\,dx_1 \cdots dx_n. \tag{3}$$
Furthermore, if [C]^γ is a degenerate set then we compute C_{[C]^γ}(g) as the limit case of a uniform approximation of [C]^γ by non-degenerate sets [9]. That is, let
$$S(\varepsilon) = \{x \in \mathbb{R}^n \mid \exists c \in [C]^\gamma : \|x - c\| \le \varepsilon\}, \qquad \varepsilon > 0.$$
Then obviously
$$\int_{S(\varepsilon)} dx > 0, \qquad \forall \varepsilon > 0,$$
and we define the central value of g on [C]γ as
$$C_{[C]^\gamma}(g) = \lim_{\varepsilon \to 0} C_{S(\varepsilon)}(g) = \lim_{\varepsilon \to 0} \frac{1}{\int_{S(\varepsilon)} dx}\int_{S(\varepsilon)} g(x)\,dx. \tag{4}$$
In the case where [C]^γ is a non-degenerate set in R^n, it is clear that C_{[C]^γ}(g) gives the
probabilistic mean value of g(Xγ ), where Xγ is a uniformly distributed random
variable on [C]γ ; namely,
$$M(g(X_\gamma)) = C_{[C]^\gamma}(g). \tag{5}$$
Especially, if n = 1 and g ≡ id is the identity function over R then for any fuzzy
number A ∈ F with [A]γ = [a1 (γ), a2 (γ)], γ ∈ [0, 1]
$$C_{[A]^\gamma}(\mathrm{id}) = \frac{1}{\int_{[A]^\gamma} dx}\int_{[A]^\gamma} x\,dx = \frac{a_1(\gamma) + a_2(\gamma)}{2},$$
which is the mean value of a random variable Xγ that is uniformly distributed on
[A]γ . Furthermore, this relationship also remains valid if a2 (γ) − a1 (γ) = 0 for
some γ ∈ [0, 1]. In this limit case the density of the associated random variable formally equals a Dirac delta function, and C_{[A]^γ}(id) = M(X_γ) = a_1(γ) = a_2(γ). In the following we will use the notation C([A]^γ) for C_{[A]^γ}(id). It is obvious that for any fixed possibility distribution C and γ ∈ [0, 1], C_{[C]^γ} is a linear operator.
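The identity C([A]^γ) = (a_1(γ) + a_2(γ))/2 can be checked numerically. In this sketch (ours, with arbitrarily chosen endpoints), we compare a Monte Carlo estimate of the mean of a uniform random variable on the level set with the midpoint formula:

```python
import random

# Sketch: for an interval level set [A]^gamma = [a1, a2], the central value
# of the identity function equals the mean of a uniformly distributed
# random variable on the interval, i.e. its midpoint.
random.seed(1)
a1, a2 = 2.0, 5.0

# Monte Carlo estimate of M(X_gamma) for X_gamma uniform on [a1, a2].
n = 200_000
sample_mean = sum(random.uniform(a1, a2) for _ in range(n)) / n

midpoint = (a1 + a2) / 2          # C([A]^gamma) = (a1(gamma)+a2(gamma))/2
assert abs(sample_mean - midpoint) < 0.02
```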
Let C be a joint possibility distribution in Rn , let g : Rn → R be an integrable
function, and let f be a weighting function. The expected value of g on C with
respect to f is defined by [8]
$$E_f(g; C) = \int_0^1 C_{[C]^\gamma}(g)\, f(\gamma)\,d\gamma. \tag{6}$$
That is, E_f(g; C) computes the f-weighted average of the central values of function g on the level sets of C.
Note 2.1. As a matter of fact, function g in (3) and (6) can depend on C and γ as well. However, to avoid over-complicated notations we will always write g for g_{C;γ}, that is, g implicitly carries its dependence on both C and γ.
From (5) we can see that in a probabilistic aspect Ef (g; C) is nothing else but the
f -weighted average of the probabilistic means of random variables g(Xγ ), where
for all γ ∈ [0, 1] Xγ is uniformly distributed on [C]γ ; namely,
$$E_f(g; C) = \int_0^1 M(g(X_\gamma))\, f(\gamma)\,d\gamma. \tag{7}$$
In particular, for any fixed weighting function f and possibility distribution C, E_f(· ; C) is a linear operator.
Let us denote the projection functions on R2 by πx and πy , i.e. πx (u, v) = u and
πy (u, v) = v for all u, v ∈ R. We will show two important properties of the central
value operator presented in [8], and explain their relationship with some classical
results of probability theory. That is, let C be a joint possibility distribution in
R2 with marginal possibility distributions A = πx (C) and B = πy (C), and let
γ ∈ [0, 1] be fixed. If [C]^γ = [A]^γ × [B]^γ, for instance when A and B are non-interactive, then from Theorems 2.1 and 2.2 of [8] we have
$$C_{[C]^\gamma}(\pi_x + \pi_y) = C([A]^\gamma) + C([B]^\gamma) \tag{8}$$
and
$$C_{[C]^\gamma}(\pi_x \pi_y) = C([A]^\gamma)\, C([B]^\gamma). \tag{9}$$
Let X_γ and Y_γ be two random variables with densities f_{X_γ} and f_{Y_γ}, and with a uniform joint density f_{X_γ,Y_γ} on [C]^γ. Then, we can write
$$f_{X_\gamma, Y_\gamma}(x, y) = \frac{1}{\int_{[C]^\gamma} du\,dv}\,\chi_{[C]^\gamma}(x, y) = \frac{1}{\int_{[A]^\gamma} du}\,\chi_{[A]^\gamma}(x) \cdot \frac{1}{\int_{[B]^\gamma} dv}\,\chi_{[B]^\gamma}(y) = f_{X_\gamma}(x)\, f_{Y_\gamma}(y), \qquad x, y \in \mathbb{R},$$
which implies that Xγ and Yγ are independent. Hence, from probability theory we
can apply the following well-known relationships for the expected value operator
$$M(X_\gamma + Y_\gamma) = \int_{\mathbb{R}^2} (x + y) f_{X_\gamma, Y_\gamma}(x, y)\,dx\,dy = \int_{\mathbb{R}} x f_{X_\gamma}(x)\,dx + \int_{\mathbb{R}} y f_{Y_\gamma}(y)\,dy = M(X_\gamma) + M(Y_\gamma)$$
and
$$M(X_\gamma Y_\gamma) = \int_{\mathbb{R}^2} xy f_{X_\gamma, Y_\gamma}(x, y)\,dx\,dy = \int_{\mathbb{R}} x f_{X_\gamma}(x)\,dx \int_{\mathbb{R}} y f_{Y_\gamma}(y)\,dy = M(X_\gamma)\, M(Y_\gamma),$$
wherefrom we obtain (8) and (9), respectively.
From (5) we have that the central value of the identity function on a level set is the
mean of the corresponding uniform probability distribution on that level set. Furthermore, from (7) it is also clear that the expected value of the identity function
on a fuzzy number is nothing else but the weighted average of the probabilistic
means of the respective uniform distributions on the level sets of that fuzzy number. Hence, in a probabilistic aspect our proposal is to first turn the level sets
into uniform probability distributions, then apply their standard probabilistic calculation, and then define measures on possibility distributions by integrating these
probabilistic notions over the set of all membership grades.
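The recipe above can be sketched numerically. In this illustration (ours, with a hypothetical triangular fuzzy number parametrized by peak a, left width alpha and right width beta, and weighting function f(γ) = 2γ), we turn each level set into a uniform distribution, take its mean (the midpoint), and f-average over the grades:

```python
# Sketch: E_f(id; A) for a hypothetical triangular fuzzy number with level
# sets [A]^gamma = [a - (1-gamma)*alpha, a + (1-gamma)*beta]: each level set
# becomes a uniform distribution (mean = midpoint), then the means are
# f-averaged over all membership grades, with f(gamma) = 2*gamma.
a, alpha, beta = 0.0, 1.0, 2.0

def level_set(gamma):
    return a - (1.0 - gamma) * alpha, a + (1.0 - gamma) * beta

def f(gamma):
    return 2.0 * gamma

n = 100_000
Ef = sum(
    f(g) * (sum(level_set(g)) / 2.0)   # f(gamma) * midpoint of [A]^gamma
    for g in ((k + 0.5) / n for k in range(n))
) / n

# Closed form: a + (beta - alpha)/2 * int_0^1 2*gamma*(1-gamma) dgamma
#            = a + (beta - alpha)/6
assert abs(Ef - (a + (beta - alpha) / 6.0)) < 1e-6
```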
3 The measure of possibilistic interaction
In this section we will recall the definitions and some basic properties of the measure of covariance and variance of possibility distributions introduced in [8], and
present their probabilistic interpretation.
Let C be a joint possibility distribution in R2 with marginal possibility distributions
A = πx (C) and B = πy (C), and let γ ∈ [0, 1]. Then, the measure of interactivity
between the γ-level sets of A and B (with respect to [C]γ ) is defined by [8]
$$R_{[C]^\gamma}(\pi_x, \pi_y) = C_{[C]^\gamma}\big((\pi_x - C_{[C]^\gamma}(\pi_x))(\pi_y - C_{[C]^\gamma}(\pi_y))\big).$$
In a possibilistic sense R[C]γ (πx , πy ) computes the central value of the interactivity
function
$$g(u, v) = (u - C_{[C]^\gamma}(\pi_x))(v - C_{[C]^\gamma}(\pi_y)) \tag{10}$$
on [C]γ . Using the definition of the central value operator we obtain
$$R_{[C]^\gamma}(\pi_x, \pi_y) = C_{[C]^\gamma}(\pi_x \pi_y) - C_{[C]^\gamma}(\pi_x)\, C_{[C]^\gamma}(\pi_y) = \frac{1}{\int_{[C]^\gamma} dx\,dy}\int_{[C]^\gamma} xy\,dx\,dy - \frac{1}{\int_{[C]^\gamma} dx\,dy}\int_{[C]^\gamma} x\,dx\,dy \cdot \frac{1}{\int_{[C]^\gamma} dx\,dy}\int_{[C]^\gamma} y\,dx\,dy$$
for any γ ∈ [0, 1]. In particular, R[C]γ (πx , πy ) actually computes the probabilistic covariance between random variables Xγ and Yγ with a uniform joint density
fXγ ,Yγ on [C]γ ; namely,
$$R_{[C]^\gamma}(\pi_x, \pi_y) = \operatorname{cov}(X_\gamma, Y_\gamma) = \int_{\mathbb{R}^2} xy f_{X_\gamma, Y_\gamma}(x, y)\,dx\,dy - \int_{\mathbb{R}} x f_{X_\gamma}(x)\,dx \int_{\mathbb{R}} y f_{Y_\gamma}(y)\,dy. \tag{11}$$
Especially, if [C]γ = [A]γ × [B]γ then the associated random variables Xγ and Yγ
are independent, and we obtain R[C]γ (πx , πy ) = cov(Xγ , Yγ ) = 0.
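This probabilistic reading can be illustrated directly (our sketch, with two hypothetical level-set shapes): the covariance of a uniform joint distribution vanishes on a product set, while on a non-rectangular convex set it need not:

```python
import random

# Sketch: R_{[C]^gamma}(pi_x, pi_y) is the covariance of a uniform joint
# distribution on the level set.  On a product set the coordinates are
# independent (covariance 0); on a triangle they are not.
random.seed(2)

def mc_cov(sampler, n=200_000):
    pts = [sampler() for _ in range(n)]
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    return sum((x - mx) * (y - my) for x, y in pts) / n

def rect():                        # uniform on [0,1] x [0,1]
    return random.random(), random.random()

def tri():                         # uniform on {0 <= y <= x <= 1}, by rejection
    while True:
        x, y = random.random(), random.random()
        if y <= x:
            return x, y

cov_rect = mc_cov(rect)            # product set: covariance ~ 0
cov_tri = mc_cov(tri)              # triangle: exact covariance is 1/36 > 0
assert abs(cov_rect) < 0.01
assert cov_tri > 0.02
```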
Now let A be a possibility distribution in R, and let γ ∈ [0, 1]. Then, the measure
of dispersion of [A]γ is defined by
$$R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}) = C_{[A]^\gamma}\big((\mathrm{id} - C_{[A]^\gamma}(\mathrm{id}))^2\big).$$
From a possibilistic point of view R[A]γ (id, id) represents the central value of the
dispersion function
$$h(u) = (u - C([A]^\gamma))^2 \tag{12}$$
on [A]γ . If A ∈ F is a fuzzy number with [A]γ = [a1 (γ), a2 (γ)], γ ∈ [0, 1] then
from the definition of the central value operator we get
$$R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}) = C_{[A]^\gamma}(\mathrm{id}^2) - C_{[A]^\gamma}^2(\mathrm{id}) = \frac{1}{\int_{[A]^\gamma} dx}\int_{[A]^\gamma} x^2\,dx - \left(\frac{1}{\int_{[A]^\gamma} dx}\int_{[A]^\gamma} x\,dx\right)^2 = \frac{(a_2(\gamma) - a_1(\gamma))^2}{12}.$$
That is, the measure of possibilistic dispersion on a level set [A]γ is nothing else but
the probabilistic variance of a random variable Uγ with a uniform density function
fUγ on [A]γ ; namely,
$$R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}) = \sigma_{U_\gamma}^2 = \int_{\mathbb{R}} x^2 f_{U_\gamma}(x)\,dx - \left(\int_{\mathbb{R}} x f_{U_\gamma}(x)\,dx\right)^2. \tag{13}$$
Let C be a joint possibility distribution with marginal possibility distributions A =
πx (C) and B = πy (C), and let f be a weighting function. Then, the measure
of covariance between A and B (with respect to their joint distribution C and
weighting function f ) is defined by [8]
$$\operatorname{Cov}_f(A, B) = E_f(g; C) = \int_0^1 R_{[C]^\gamma}(\pi_x, \pi_y)\, f(\gamma)\,d\gamma,$$
where g ≡ g_{[C]^γ} stands for the interactivity function (10) associated with [C]^γ, γ ∈ [0, 1]. That is, the covariance of A and B is computed as the expected value
of the interactivity function on the joint distribution C.
However, from (11) we also have
$$\operatorname{Cov}_f(A, B) = \int_0^1 \operatorname{cov}(X_\gamma, Y_\gamma)\, f(\gamma)\,d\gamma,$$
where Xγ and Yγ are random variables whose joint distribution is uniform on [C]γ
for all γ ∈ [0, 1].
Now let A ∈ F be a fuzzy number with [A]γ = [a1 (γ), a2 (γ)], γ ∈ [0, 1], and let f
be a weighting function. The measure of variance of A with respect to f is defined
as [8]
$$\operatorname{Var}_f(A) = E_f(h; A) = \int_0^1 R_{[A]^\gamma}(\mathrm{id}, \mathrm{id})\, f(\gamma)\,d\gamma = \int_0^1 \frac{(a_2(\gamma) - a_1(\gamma))^2}{12}\, f(\gamma)\,d\gamma,$$
where h ≡ h_{[A]^γ} denotes the dispersion function (12) of the level set [A]^γ, γ ∈ [0, 1].
Nevertheless, from (13) it is also clear that
$$\operatorname{Var}_f(A) = \int_0^1 \sigma_{U_\gamma}^2 f(\gamma)\,d\gamma,$$
where Uγ is a uniformly distributed random variable on [A]γ for all γ ∈ [0, 1].
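The variance formula can be checked numerically. In this sketch (ours, with a hypothetical symmetric triangular fuzzy number and weighting function f(γ) = 2γ), we f-average the per-level variances width²/12 and compare with the resulting closed form:

```python
# Sketch: Var_f(A) for a symmetric triangular fuzzy number with level sets
# [A]^gamma = [a - (1-gamma)*alpha, a + (1-gamma)*alpha], computed as the
# f-weighted average (f(gamma) = 2*gamma) of the variances (width^2 / 12)
# of the uniform distributions on the level sets.
a, alpha = 1.0, 3.0

n = 100_000
var_f = sum(
    2.0 * g * (2.0 * alpha * (1.0 - g)) ** 2 / 12.0
    for g in ((k + 0.5) / n for k in range(n))
) / n

# Closed form: (alpha^2 / 3) * 2 * int_0^1 gamma*(1-gamma)^2 dgamma = alpha^2 / 18
assert abs(var_f - alpha ** 2 / 18.0) < 1e-6
```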
4 The possibilistic correlation
In this section we will define the concept of possibilistic correlation between fuzzy
numbers and analyze its conceptual links to probability theory.
Definition 4.1. Let C be a joint possibility distribution with marginal possibility
distributions A, B ∈ F, and let f be a weighting function. If Varf(A) ≠ 0 and Varf(B) ≠ 0 then the possibilistic correlation coefficient of A and B (with respect
to their joint distribution C and weighting function f ) is defined by
$$\rho_f(A, B) = \frac{\operatorname{Cov}_f(A, B)}{\sqrt{\operatorname{Var}_f(A)\operatorname{Var}_f(B)}}.$$
In a possibilistic environment the correlation between two fuzzy numbers can be
interpreted as a relative measure indicating the degree of their interaction (implied
by their joint possibility distribution) compared to their individual (marginal) variances. Thus, the definition of possibilistic correlation essentially incorporates the
principle of falling shadows (2).
In the following we will point out a fundamental difference between the notions of
probabilistic and possibilistic correlation. Let C be a joint possibility distribution
with marginal possibility distributions A = πx (C) ∈ F and B = πy (C) ∈ F, and
let Xγ and Yγ be random variables with a uniform joint distribution on [C]γ . Then,
the probabilistic correlation of Xγ and Yγ is defined by
$$\operatorname{cor}(X_\gamma, Y_\gamma) = \frac{\operatorname{cov}(X_\gamma, Y_\gamma)}{\sigma_{X_\gamma}\,\sigma_{Y_\gamma}},$$
and it measures the strength of linear relationship between Xγ and Yγ (as compared
to their standard deviations), γ ∈ [0, 1].
Even though Xγ and Yγ have a uniform joint distribution on [C]γ , it is clear that
they are not necessarily uniformly distributed on [A]γ and [B]γ , respectively.
Let Uγ and Vγ denote the uniformly distributed random variables on [A]γ = πx ([C]γ )
and [B]γ = πy ([C]γ ), respectively, for any γ ∈ [0, 1]. Then, we can formulate the
possibilistic correlation as
$$\rho_f(A, B) = \frac{\int_0^1 \operatorname{cov}(X_\gamma, Y_\gamma)\, f(\gamma)\,d\gamma}{\left(\int_0^1 \sigma_{U_\gamma}^2 f(\gamma)\,d\gamma\right)^{1/2} \left(\int_0^1 \sigma_{V_\gamma}^2 f(\gamma)\,d\gamma\right)^{1/2}}. \tag{14}$$
Thus, the possibilistic correlation represents an average degree to which Xγ and
Yγ are linearly associated as compared to the dispersions of Uγ and Vγ .
It is clear that we do not run a standard probabilistic calculation in (14). A standard
probabilistic calculation might be the following
$$\frac{\int_0^1 \operatorname{cov}(X_\gamma, Y_\gamma)\, f(\gamma)\,d\gamma}{\left(\int_0^1 \sigma_{X_\gamma}^2 f(\gamma)\,d\gamma\right)^{1/2} \left(\int_0^1 \sigma_{Y_\gamma}^2 f(\gamma)\,d\gamma\right)^{1/2}}.$$
That is, the standard probabilistic approach would use the marginal distributions X_γ and Y_γ of a uniformly distributed random variable on the level set [C]^γ.
Let C be a joint possibility distribution with marginal distributions A, B ∈ F.
In the following we will prove that if [C]γ is convex for all γ ∈ [0, 1] then the
correlation coefficient of A and B can never exceed 1 in absolute value; namely,
−1 ≤ ρf (A, B) ≤ 1 for any weighting function f .
5 The weak forms of the possibilistic Cauchy-Schwarz inequality
In this section we will formulate the weak forms of the possibilistic Cauchy-Schwarz
inequality.
First, let us recall the bilinearity property of the interactivity relation operator R
presented in [8], Theorem 2.4. That is, if C is a joint possibility distribution in R2
and λ, µ ∈ R are real numbers then
$$R_{[C]^\gamma}(\lambda\pi_x + \mu\pi_y, \lambda\pi_x + \mu\pi_y) = \lambda^2 R_{[C]^\gamma}(\pi_x, \pi_x) + 2\lambda\mu R_{[C]^\gamma}(\pi_x, \pi_y) + \mu^2 R_{[C]^\gamma}(\pi_y, \pi_y)$$
for any γ ∈ [0, 1].
The following theorem states the weak form of the possibilistic Cauchy-Schwarz
inequality for the γ-level sets of possibility distributions.
Theorem 5.1. Let C be a joint possibility distribution in R2 . Then
$$(R_{[C]^\gamma}(\pi_x, \pi_y))^2 \le R_{[C]^\gamma}(\pi_x, \pi_x)\, R_{[C]^\gamma}(\pi_y, \pi_y) \tag{15}$$
holds for all γ ∈ [0, 1].
Proof. Let γ ∈ [0, 1] be fixed. From the definition and the bilinearity of the interactivity relation we have that for any λ ∈ R
$$0 \le R_{[C]^\gamma}(\lambda\pi_x + \pi_y, \lambda\pi_x + \pi_y) = \lambda^2 R_{[C]^\gamma}(\pi_x, \pi_x) + 2\lambda R_{[C]^\gamma}(\pi_x, \pi_y) + R_{[C]^\gamma}(\pi_y, \pi_y),$$
which implies that the discriminant of the quadratic polynomial on the right-hand
side satisfies the following inequality
$$(R_{[C]^\gamma}(\pi_x, \pi_y))^2 - R_{[C]^\gamma}(\pi_x, \pi_x)\, R_{[C]^\gamma}(\pi_y, \pi_y) \le 0,$$
which ends the proof.
From (11) and (13) we get that the weak form of the possibilistic Cauchy-Schwarz
inequality for the γ-level sets of possibility distributions (15) is actually a particular case of the probabilistic Cauchy-Schwarz inequality for uniform densities.
Indeed, if Xγ and Yγ are random variables on [A]γ = πx ([C]γ ) and [B]γ =
πy ([C]γ ), respectively, with a uniform joint density on [C]γ , then
$$R_{[C]^\gamma}(\pi_x, \pi_x) = \sigma_{X_\gamma}^2, \qquad R_{[C]^\gamma}(\pi_y, \pi_y) = \sigma_{Y_\gamma}^2,$$
and from the probabilistic Cauchy-Schwarz inequality we obtain
$$(R_{[C]^\gamma}(\pi_x, \pi_y))^2 = (\operatorname{cov}(X_\gamma, Y_\gamma))^2 \le \sigma_{X_\gamma}^2 \sigma_{Y_\gamma}^2 = R_{[C]^\gamma}(\pi_x, \pi_x)\, R_{[C]^\gamma}(\pi_y, \pi_y).$$
Now we will formulate the weak form of the possibilistic Cauchy-Schwarz inequality for possibility distributions.
Theorem 5.2. Let C be a joint possibility distribution in R2 , and let f be a weighting function. Then
$$(E_f(g; C))^2 \le E_f(h_x; C)\, E_f(h_y; C), \tag{16}$$
where g stands for the interactivity function (10) of the level sets of C, and
$$h_x(u) = (u - C_{[C]^\gamma}(\pi_x))^2, \qquad h_y(v) = (v - C_{[C]^\gamma}(\pi_y))^2.$$
Proof. Using the triangle inequality for integrals and (15) we can write
$$|E_f(g; C)| = \left|\int_0^1 R_{[C]^\gamma}(\pi_x, \pi_y)\, f(\gamma)\,d\gamma\right| \le \int_0^1 |R_{[C]^\gamma}(\pi_x, \pi_y)|\, f(\gamma)\,d\gamma \le \int_0^1 (R_{[C]^\gamma}(\pi_x, \pi_x))^{1/2} (R_{[C]^\gamma}(\pi_y, \pi_y))^{1/2} f(\gamma)\,d\gamma,$$
wherefrom, by applying the classical Cauchy-Schwarz inequality for integrals, we
obtain
$$|E_f(g; C)| \le \int_0^1 (R_{[C]^\gamma}(\pi_x, \pi_x) f(\gamma))^{1/2} (R_{[C]^\gamma}(\pi_y, \pi_y) f(\gamma))^{1/2}\,d\gamma \le \left(\int_0^1 R_{[C]^\gamma}(\pi_x, \pi_x) f(\gamma)\,d\gamma\right)^{1/2} \left(\int_0^1 R_{[C]^\gamma}(\pi_y, \pi_y) f(\gamma)\,d\gamma\right)^{1/2} = (E_f(h_x; C))^{1/2} (E_f(h_y; C))^{1/2},$$
which ends the proof.
6 The strong forms of the possibilistic Cauchy-Schwarz inequality
We saw that (15) is a special case of the classical Cauchy-Schwarz inequality for uniform density functions. Notice, however, that in (15), as well as in (16), only the joint distribution C is taken into consideration on both sides of the inequality. The strong forms of the possibilistic Cauchy-Schwarz inequality will incorporate the principle of falling shadows by including the marginal distributions of C as well.
Hence, let C be a joint possibility distribution in R2 , let
$$[C]^\gamma = \{(x, y) \in \mathbb{R}^2 \mid x \in [u, v],\ y \in [w_1(x), w_2(x)]\} \tag{17}$$
be a representation of [C]γ , and let
$$F(x) = w_2(x) - w_1(x), \qquad x \in [u, v]. \tag{18}$$
Then, applying the Fubini theorem we have
$$\int_{[C]^\gamma} dx\,dy = \int_u^v \int_{w_1(x)}^{w_2(x)} dy\,dx = \int_u^v (w_2(x) - w_1(x))\,dx = \int_u^v F(x)\,dx.$$
We will need the following technical result.
Lemma 6.1. If [C]γ is a convex subset of R2 then F is a concave function.
Proof. Let us assume that F is not concave. Then, there exist x1 , x2 ∈ [u, v],
x_1 < x_2, and λ ∈ (0, 1) such that for x* = λx_1 + (1 − λ)x_2
$$F(x^*) < \lambda F(x_1) + (1 - \lambda) F(x_2),$$
that is,
$$w_2(x^*) - w_1(x^*) < \lambda(w_2(x_1) - w_1(x_1)) + (1 - \lambda)(w_2(x_2) - w_1(x_2)). \tag{19}$$
Let T be the convex hull of the points (xi , wj (xi )), i, j = 1, 2, i.e.
$$T = \operatorname{conv}\{(x_1, w_1(x_1)), (x_1, w_2(x_1)), (x_2, w_1(x_2)), (x_2, w_2(x_2))\}.$$
Since [C]γ is convex, T ⊆ [C]γ , and therefore
$$\{y \in \mathbb{R} \mid (x, y) \in T\} \subseteq \{y \in \mathbb{R} \mid (x, y) \in [C]^\gamma\}$$
for all x ∈ [x1 , x2 ]. Applying this relationship at x = x∗ we obtain
$$w_2(x^*) - w_1(x^*) \ge (\lambda w_2(x_1) + (1 - \lambda) w_2(x_2)) - (\lambda w_1(x_1) + (1 - \lambda) w_1(x_2)) = \lambda(w_2(x_1) - w_1(x_1)) + (1 - \lambda)(w_2(x_2) - w_1(x_2)),$$
which contradicts (19).
We note that since F is concave, it is continuous.
The strong forms of the possibilistic Cauchy-Schwarz inequality are based on the
following theorem.
Theorem 6.1. Let C be a joint possibility distribution with marginal possibility
distributions A = πx (C) ∈ F, B = πy (C) ∈ F, and let γ ∈ [0, 1]. If [C]γ is
convex then
$$R_{[C]^\gamma}(\pi_x, \pi_x) \le R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}). \tag{20}$$
Proof. See: Fuzzy Sets and Systems, 155(2005) 425-445.
Note 6.1. From the proof of Theorem 6.1 we obtain that in (20) equality holds if and only if G(x) = 1/(v − u), x ∈ [u, v] (where G denotes the function introduced in that proof), which gives
$$[C]^\gamma = \{(x, y) \in \mathbb{R}^2 \mid x \in [u, v],\ y \in [w_1(x), w_2(x)]\},$$
where F(x) = w_2(x) − w_1(x) = w is constant.
Hence, using our findings above we can see that in a probabilistic environment
inequality (20) states
$$\sigma_{X_\gamma} \le \sigma_{U_\gamma}, \tag{21}$$
where X_γ is the first marginal of a uniform distribution on some convex set [C]^γ ⊂ R^2, and U_γ is a uniformly distributed random variable on [A]^γ = π_x([C]^γ). Furthermore, in (21) equality holds if and only if X_γ ∼ U_γ, i.e. X_γ itself is uniformly distributed on [A]^γ.
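Inequality (21) can be observed numerically. In this sketch (ours, with a hypothetical triangular level set), the first marginal of a uniform distribution on {0 ≤ y ≤ x ≤ 1} has variance 1/18, smaller than the variance 1/12 of a uniform distribution on the projection [0, 1]:

```python
import random

# Sketch: for a uniform joint distribution on the convex set
# {(x, y): 0 <= y <= x <= 1}, the first marginal X_g has variance 1/18,
# below the variance 1/12 of a uniform U_g on the projection [0, 1].
random.seed(3)

def tri():                         # uniform on the triangle, by rejection
    while True:
        x, y = random.random(), random.random()
        if y <= x:
            return x, y

n = 200_000
xs = [tri()[0] for _ in range(n)]
mx = sum(xs) / n
var_x = sum((x - mx) ** 2 for x in xs) / n

assert abs(var_x - 1.0 / 18.0) < 0.005    # variance of the marginal X_g
assert var_x < 1.0 / 12.0                  # sigma_{X_g} < sigma_{U_g}
```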
Now we are in the position to state the strong form of the possibilistic Cauchy-Schwarz inequality for the γ-level sets of possibility distributions.
Theorem 6.2. Let C be a joint possibility distribution in R2 with marginal possibility distributions A = πx (C), B = πy (C) ∈ F, and let γ ∈ [0, 1]. If [C]γ is
convex then
$$(R_{[C]^\gamma}(\pi_x, \pi_y))^2 \le R_{[A]^\gamma}(\mathrm{id}, \mathrm{id})\, R_{[B]^\gamma}(\mathrm{id}, \mathrm{id}). \tag{22}$$
Proof. Since [C]γ is convex, from Theorem 6.1 we have
$$R_{[C]^\gamma}(\pi_x, \pi_x) \le R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}), \qquad R_{[C]^\gamma}(\pi_y, \pi_y) \le R_{[B]^\gamma}(\mathrm{id}, \mathrm{id}).$$
Hence, from Theorem 5.1 we obtain
$$(R_{[C]^\gamma}(\pi_x, \pi_y))^2 \le R_{[C]^\gamma}(\pi_x, \pi_x)\, R_{[C]^\gamma}(\pi_y, \pi_y) \le R_{[A]^\gamma}(\mathrm{id}, \mathrm{id})\, R_{[B]^\gamma}(\mathrm{id}, \mathrm{id}),$$
which ends the proof.
Note 6.2. Since R[C]γ (πx , πy ) = 0 whenever [C]γ = [A]γ × [B]γ , from Note 6.1
we find that in (22) equality holds if and only if
$$\max\{v \in \mathbb{R} \mid (x, v) \in [C]^\gamma\} - \min\{v \in \mathbb{R} \mid (x, v) \in [C]^\gamma\} = 0, \qquad \forall x \in [A]^\gamma,$$
$$\max\{u \in \mathbb{R} \mid (u, y) \in [C]^\gamma\} - \min\{u \in \mathbb{R} \mid (u, y) \in [C]^\gamma\} = 0, \qquad \forall y \in [B]^\gamma.$$
In this case [C]^γ is a line segment in R^2, which can be represented by either
$$[C]^\gamma = \{t(a_1(\gamma), b_1(\gamma)) + (1 - t)(a_2(\gamma), b_2(\gamma)) \mid t \in [0, 1]\} \tag{23}$$
or
$$[C]^\gamma = \{t(a_1(\gamma), b_2(\gamma)) + (1 - t)(a_2(\gamma), b_1(\gamma)) \mid t \in [0, 1]\}, \tag{24}$$
where [A]^γ = [a_1(γ), a_2(γ)] and [B]^γ = [b_1(γ), b_2(γ)].
Definition 6.1. Let C be a joint possibility distribution with marginal possibility
distributions A = πx (C) ∈ F, B = πy (C) ∈ F. Then, A and B are said to
be completely positively correlated if (23) holds for all γ ∈ [0, 1]. On the other
hand, A and B are said to be completely negatively correlated if (24) holds for all
γ ∈ [0, 1].
The following theorem states the strong form of the possibilistic Cauchy-Schwarz
inequality for possibility distributions.
Theorem 6.3. Let C be a joint possibility distribution with marginal possibility
distributions A = πx (C) ∈ F, B = πy (C) ∈ F, and let f be a weighting function.
If [C]γ is convex for all γ ∈ [0, 1] then the following inequality holds
$$(\operatorname{Cov}_f(A, B))^2 \le \operatorname{Var}_f(A)\operatorname{Var}_f(B). \tag{25}$$
Proof. Since [C]γ is convex for any γ ∈ [0, 1], from Theorem 6.2 we have
$$|R_{[C]^\gamma}(\pi_x, \pi_y)| \le (R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}))^{1/2} (R_{[B]^\gamma}(\mathrm{id}, \mathrm{id}))^{1/2}$$
for all γ ∈ [0, 1]. Using the triangle inequality and the classical Cauchy-Schwarz
inequality for integrals, we obtain
$$|E_f(g; C)| = \left|\int_0^1 R_{[C]^\gamma}(\pi_x, \pi_y)\, f(\gamma)\,d\gamma\right| \le \int_0^1 |R_{[C]^\gamma}(\pi_x, \pi_y)|\, f(\gamma)\,d\gamma \le \int_0^1 (R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}))^{1/2} (R_{[B]^\gamma}(\mathrm{id}, \mathrm{id}))^{1/2} f(\gamma)\,d\gamma$$
$$\le \int_0^1 (R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}) f(\gamma))^{1/2} (R_{[B]^\gamma}(\mathrm{id}, \mathrm{id}) f(\gamma))^{1/2}\,d\gamma \le \left(\int_0^1 R_{[A]^\gamma}(\mathrm{id}, \mathrm{id}) f(\gamma)\,d\gamma\right)^{1/2} \left(\int_0^1 R_{[B]^\gamma}(\mathrm{id}, \mathrm{id}) f(\gamma)\,d\gamma\right)^{1/2} = (E_f(h; A))^{1/2} (E_f(h; B))^{1/2},$$
where g and h stand for the respective interactivity and dispersion functions. That is,
$$|\operatorname{Cov}_f(A, B)| = |E_f(g; C)| \le (E_f(h; A))^{1/2} (E_f(h; B))^{1/2} = (\operatorname{Var}_f(A))^{1/2} (\operatorname{Var}_f(B))^{1/2},$$
which ends the proof.
Note 6.3. Let f be an almost everywhere positive weighting function. Then, from
Note 6.2 we obtain that in (25) equality holds if and only if A and B are either
completely positively or completely negatively correlated.
The following theorem is a straightforward corollary of Theorem 6.3 and points
out a fundamental property of the possibilistic correlation coefficient.
Theorem 6.4. Let C be a joint possibility distribution in R2 with marginal possibility distributions A, B ∈ F, and let f be a weighting function. Let us assume that
Varf(A) ≠ 0 and Varf(B) ≠ 0, and that [C]^γ is a convex set for all γ ∈ [0, 1]. Then,
$$-1 \le \rho_f(A, B) = \frac{\operatorname{Cov}_f(A, B)}{\sqrt{\operatorname{Var}_f(A)\operatorname{Var}_f(B)}} \le 1.$$
Moreover, ρf (A, B) = −1 if and only if A and B are completely negatively correlated, i.e. their joint possibility distribution is defined by (24) for all γ ∈ [0, 1];
and ρf (A, B) = 1 if and only if A and B are completely positively correlated, that
is, their joint possibility distribution is given by (23) for all γ ∈ [0, 1].
7 Illustrations
In this section we will illustrate three important cases of possibilistic correlation.
That is, let C be a joint possibility distribution in R2 with marginal possibility
distributions A = πx (C), B = πy (C) ∈ F, and let [A]γ = [a1 (γ), a2 (γ)] and
[B]γ = [b1 (γ), b2 (γ)], γ ∈ [0, 1].
(i) First, let us assume that A and B are non-interactive, i.e. C = A × B. This
situation is depicted in Fig. 1. Then [C]γ = [A]γ × [B]γ for any γ ∈ [0, 1], and
from the definition of the interactivity relation and (9) we have Covf (A, B) = 0
(see [8]) and
ρf (A, B) = 0
for any weighting function f .
In [3] we have shown that zero covariance does not always imply non-interactivity.
Figure 1: If A and B are non-interactive then ρf (A, B) = 0.
(ii) Now let us assume that A and B are completely positively correlated, that is,
their joint possibility distribution C is given by (23), γ ∈ [0, 1]. This situation is
depicted in Fig. 2. It can be shown that in this case the covariance between A and
B with respect to their joint possibility distribution C is [8]
$$\operatorname{Cov}_f(A, B) = \int_0^1 R_{[C]^\gamma}(\pi_x, \pi_y)\, f(\gamma)\,d\gamma = \frac{1}{12}\int_0^1 [a_2(\gamma) - a_1(\gamma)][b_2(\gamma) - b_1(\gamma)]\, f(\gamma)\,d\gamma.$$
Furthermore, as we have already seen
$$\operatorname{Var}_f(A) = \frac{1}{12}\int_0^1 [a_2(\gamma) - a_1(\gamma)]^2 f(\gamma)\,d\gamma, \qquad \operatorname{Var}_f(B) = \frac{1}{12}\int_0^1 [b_2(\gamma) - b_1(\gamma)]^2 f(\gamma)\,d\gamma$$
for any weighting function f . In particular, from the definition (23) of joint possibility distribution C we find that there exists a constant ϑ ∈ R, ϑ ≥ 0 such
that
$$b_2(\gamma) - b_1(\gamma) = \vartheta(a_2(\gamma) - a_1(\gamma)), \qquad \forall \gamma \in [0, 1]. \tag{26}$$
Figure 2: The case of ρf (A, B) = 1.
Thus, we obtain
$$\operatorname{Cov}_f(A, B) = \vartheta\operatorname{Var}_f(A), \qquad \operatorname{Var}_f(B) = \vartheta^2\operatorname{Var}_f(A),$$
which implies
$$\rho_f(A, B) = \frac{\operatorname{Cov}_f(A, B)}{\sqrt{\operatorname{Var}_f(A)\operatorname{Var}_f(B)}} = 1$$
for any weighting function f .
In this case if u ∈ [A]γ for some u ∈ R then there exists a unique v ∈ R that
B can take. Furthermore, if u is moved to the left (right) then the corresponding
value (that B can take) will also move to the left (right). This property can serve
as a justification of the principle of (complete positive) correlation of A and B.
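The computation in case (ii) can be sketched numerically. In this illustration (ours, with hypothetical symmetric triangular marginals [A]^γ = [−(1−γ), 1−γ], [B]^γ = [−ϑ(1−γ), ϑ(1−γ)] and f(γ) = 2γ), cov(X_γ, Y_γ) = ϑ · width_A²/12 on each level set, so formula (14) yields ρ_f(A, B) = 1:

```python
# Sketch of case (ii): completely positively correlated triangular marginals,
# f(gamma) = 2*gamma.  Per level: cov = theta * width_A^2 / 12, and the
# marginal variances are width^2 / 12, so rho_f(A, B) = 1.
theta = 2.5

def f(g):
    return 2.0 * g

n = 100_000
grid = [(k + 0.5) / n for k in range(n)]

num = sum(f(g) * theta * (2 * (1 - g)) ** 2 / 12 for g in grid) / n
var_a = sum(f(g) * (2 * (1 - g)) ** 2 / 12 for g in grid) / n
var_b = sum(f(g) * (2 * theta * (1 - g)) ** 2 / 12 for g in grid) / n

rho = num / (var_a * var_b) ** 0.5
assert abs(rho - 1.0) < 1e-9
```

Replacing the covariance numerator by its negative (case (iii)) gives ρ_f(A, B) = −1 in the same way.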
(iii) Finally, let us assume that A and B are completely negatively correlated, i.e.
their joint possibility distribution C is defined by (24) for any γ ∈ [0, 1]. This
situation is depicted in Fig. 3. It can be shown that in this case the covariance of A
and B with respect to their joint possibility distribution C equals [8]
$$\operatorname{Cov}_f(A, B) = \int_0^1 R_{[C]^\gamma}(\pi_x, \pi_y)\, f(\gamma)\,d\gamma = -\frac{1}{12}\int_0^1 [a_2(\gamma) - a_1(\gamma)][b_2(\gamma) - b_1(\gamma)]\, f(\gamma)\,d\gamma.$$
Figure 3: The case of ρf (A, B) = −1.
Furthermore, from the representation (24) of joint possibility distribution C we can
see that there exists a constant ϑ ∈ R, ϑ ≥ 0 such that (26) holds. Hence, we have
$$\operatorname{Cov}_f(A, B) = -\vartheta\operatorname{Var}_f(A), \qquad \operatorname{Var}_f(B) = \vartheta^2\operatorname{Var}_f(A),$$
wherefrom we find
$$\rho_f(A, B) = \frac{\operatorname{Cov}_f(A, B)}{\sqrt{\operatorname{Var}_f(A)\operatorname{Var}_f(B)}} = -1$$
for any weighting function f .
In this case if u ∈ [A]γ for some u ∈ R then there exists a unique v ∈ R that
B can take. Furthermore, if u is moved to the left (right) then the corresponding
value (that B can take) will move to the right (left). This property can serve as a
justification of the principle of (complete negative) correlation of A and B.
Acknowledgments
We are greatly indebted to Prof. Tamás Móri of the Department of Probability Theory and Statistics, Eötvös Loránd University, Budapest, for his help in simplifying the proof of Theorem 6.1. The authors are thankful to the anonymous referees of the earlier versions of this paper for their very useful comments and suggestions.
References
[1] L. M. de Campos and J. F. Huete, Independence concepts in possibility theory: Part I, Fuzzy Sets and Systems, 103(1999) 127-152.
[2] L. M. de Campos and J. F. Huete, Independence concepts in possibility theory: Part II, Fuzzy Sets and Systems, 103(1999) 487-505.
[3] C. Carlsson, R. Fullér and P. Majlender, A normative view on possibility distributions, in: Masoud Nikravesh, Lotfi A. Zadeh and Victor Korotkikh eds., Fuzzy Partial Differential Equations and Relational Equations: Reservoir Characterization and Modeling, Studies in Fuzziness and Soft Computing Series, Vol. 142, Springer Verlag, 2004 186-205.
[4] D. Dubois and H. Prade, Additions of interactive fuzzy numbers, IEEE Transactions on Automatic Control, 26(1981) 926-936.
[5] D. Dubois and H. Prade, Possibility Theory: An Approach to Computerized Processing of Uncertainty, Plenum Press, New York, 1988.
[6] Y. Feng, L. Hu and H. Shu, The variance and covariance of fuzzy random variables and their applications, Fuzzy Sets and Systems, 120(2001) 487-497.
[7] R. Fullér and P. Majlender, On weighted possibilistic mean and variance of fuzzy numbers, Fuzzy Sets and Systems, 136(2003) 363-374.
[8] R. Fullér and P. Majlender, On interactive fuzzy numbers, Fuzzy Sets and Systems, 143(2004) 355-369.
[9] R. Fullér and P. Majlender, Correction to: "On interactive fuzzy numbers" [Fuzzy Sets and Systems, 143(2004) 355-369], Fuzzy Sets and Systems, 152(2005) 159.
[10] R. Goetschel and W. Voxman, Elementary fuzzy calculus, Fuzzy Sets and Systems, 18(1986) 31-43.
[11] E. Hisdal, Conditional possibilities independence and noninteraction, Fuzzy Sets and Systems, 1(1978) 283-297.
[12] M. L. Puri and D. A. Ralescu, Fuzzy random variables, J. Math. Anal. Appl., 114(1986) 409-422.
[13] L. A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, I, II, III, Information Sciences, 8(1975) 199-249, 301-357; 9(1975) 43-80.
Main results
The main results of this paper are:
The f -weighted possibilistic correlation of A, B ∈ F, (with respect to their joint
distribution C) is defined as
\[
\rho_f(A,B) = \frac{\mathrm{Cov}_f(A,B)}{\sqrt{\mathrm{Var}_f(A)\,\mathrm{Var}_f(B)}}
= \frac{\displaystyle\int_0^1 \mathrm{cov}(X_\gamma, Y_\gamma)\, f(\gamma)\, d\gamma}{\left(\displaystyle\int_0^1 \sigma^2_{U_\gamma}\, f(\gamma)\, d\gamma\right)^{1/2} \left(\displaystyle\int_0^1 \sigma^2_{V_\gamma}\, f(\gamma)\, d\gamma\right)^{1/2}},
\]
where Uγ is a uniform probability distribution on [A]γ and Vγ is a uniform probability distribution on [B]γ, and Xγ and Yγ are random variables whose joint probability distribution is uniform on [C]γ for all γ ∈ [0, 1]. Furthermore, cov(Xγ , Yγ )
denotes the probabilistic covariance between marginal random variables Xγ and
Yγ for all γ ∈ [0, 1].
Thus, the possibilistic correlation represents an average degree to which Xγ and
Yγ are linearly associated as compared to the dispersions of Uγ and Vγ .
It is clear that we do not run a standard probabilistic calculation here. A standard
probabilistic calculation might be the following
\[
\frac{\displaystyle\int_0^1 \mathrm{cov}(X_\gamma, Y_\gamma)\, f(\gamma)\, d\gamma}{\left(\displaystyle\int_0^1 \sigma^2_{X_\gamma}\, f(\gamma)\, d\gamma\right)^{1/2} \left(\displaystyle\int_0^1 \sigma^2_{Y_\gamma}\, f(\gamma)\, d\gamma\right)^{1/2}}.
\]
That is, the standard probabilistic approach would use the marginal distributions,
Xγ and Yγ , of a uniformly distributed random variable on the level sets of [C]γ .
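To see concretely that the two normalizations differ, fix a single level and take as a hypothetical convex level set (our own example, not from the paper) the triangle [C]^γ = {(x, y) : 0 ≤ y ≤ x ≤ 1}, so that [A]^γ = [B]^γ = [0, 1]. The marginals of the uniform distribution on this triangle are not uniform: each has variance 1/18, whereas the uniform distributions Uγ, Vγ on [0, 1] have variance 1/12. A quick Monte Carlo sketch:

```python
import numpy as np

# Compare the possibilistic and the standard probabilistic normalization
# on a single hypothetical convex level set: the triangle 0 <= y <= x <= 1.
rng = np.random.default_rng(0)
u = rng.random((2, 500_000))
x, y = u.max(axis=0), u.min(axis=0)   # (max, min) of two iid uniforms is
                                      # uniform on the lower triangle

cov_xy = np.cov(x, y)[0, 1]           # exact value: 1/36
var_x, var_y = x.var(), y.var()       # exact value: 1/18 each
var_u = 1.0 / 12.0                    # variance of a uniform on [0, 1]

ratio_possibilistic = cov_xy / np.sqrt(var_u * var_u)   # about 1/3
ratio_probabilistic = cov_xy / np.sqrt(var_x * var_y)   # about 1/2
```

The two ratios (≈ 1/3 versus ≈ 1/2, up to sampling error) differ precisely because the possibilistic version divides by the dispersions of Uγ and Vγ rather than by those of the marginals Xγ and Yγ.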
Theorem 6.4 Let C be a joint possibility distribution in R² with marginal possibility distributions A, B ∈ F, and let f be a weighting function. Let us assume that Varf (A) ≠ 0 and Varf (B) ≠ 0, and that [C]γ is a convex set for all γ ∈ [0, 1].
Then,
\[
-1 \le \rho_f(A,B) = \frac{\mathrm{Cov}_f(A,B)}{\sqrt{\mathrm{Var}_f(A)\,\mathrm{Var}_f(B)}} \le 1.
\]
Moreover, ρf (A, B) = −1 if and only if A and B are completely negatively correlated, and ρf (A, B) = 1 if and only if A and B are completely positively correlated. Furthermore, if C = A × B then ρf (A, B) = 0 for any weighting function
f.
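The last claim can also be checked numerically: when C = A × B, each level set is the rectangle [A]^γ × [B]^γ, so Xγ and Yγ are independent uniforms, cov(Xγ, Yγ) = 0 for every γ, and the numerator of ρ_f vanishes. A Monte Carlo sketch with symmetric triangular level sets (an illustrative choice of ours):

```python
import numpy as np

# Non-interactive case C = A x B: level sets are rectangles, so the
# per-level covariance is zero and hence rho_f(A, B) = 0.  The symmetric
# triangular level sets below are our own illustrative choice.
rng = np.random.default_rng(1)

def level_cov(g, n=200_000):
    # [A]^gamma = [-(1-g), 1-g] and [B]^gamma = [-2(1-g), 2(1-g)];
    # on a rectangle the two coordinates are independent uniforms.
    x = rng.uniform(-(1.0 - g), 1.0 - g, n)
    y = rng.uniform(-2.0 * (1.0 - g), 2.0 * (1.0 - g), n)
    return np.cov(x, y)[0, 1]

covs = np.array([level_cov(g) for g in np.linspace(0.0, 0.95, 20)])
print(np.abs(covs).max())   # near 0 at every level, so the integral vanishes
```

Since every level contributes (approximately) zero covariance, the integral in the numerator of ρ_f is zero for any weighting function f, in line with the theorem.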
Citations
[A6] Christer Carlsson, Robert Fullér and Péter Majlender, On possibilistic correlation, FUZZY SETS AND SYSTEMS, 155(2005) 425-445. [MR2181000]
in journals
A6-c5 Chi-Chi Chen, Hui-Chin Tang, Degenerate Correlation and Information
Energy of Interval-Valued Fuzzy Numbers, INTERNATIONAL JOURNAL
OF INFORMATION AND MANAGEMENT SCIENCES, 19(2008) pp. 119-130. 2008
http://jims.ms.tku.edu.tw/PDF/M19N18.pdf
A6-c4 B. Bede, T.G. Bhaskar, V. Lakshmikantham, Perspectives of fuzzy initial
value problems, COMMUNICATIONS IN APPLIED ANALYSIS, 11 (3-4),
pp. 339-358. 2007
A6-c3 Hong, D.H., Kim, K.T., A maximal variance problem, APPLIED MATHEMATICS LETTERS, 20 (10), pp. 1088-1093. 2007
http://dx.doi.org/10.1016/j.aml.2006.12.008
In this work, we provide a direct proof concerning a special type
of concave density function on a bounded closed interval with
minimal variance. This proof involves elementary methods, without using any advanced theories such as Weierstrass’s Approximation Theorem, from which the technical core result of the paper [C. Carlsson, R. Fullér, P. Majlender, On possibilistic correlation, Fuzzy Sets and Systems 155 (2005) 425-445] comes. (page
1088)
Recently, Carlsson, Fullér and Majlender [A6] presented the concept of a possibilistic correlation representing an average degree
of interaction between the marginal distribution of a joint possibility distribution as compared to the respective dispersions. They
also formulated the weak and strong forms of the possibilistic
Cauchy-Schwarz inequality. (page 1088)
The strong forms of the possibilistic Cauchy-Schwarz inequality
of the paper [A6] are based on the following theorem.
Theorem 1 ([A6]). Let C be a joint possibility distribution with
marginal possibility distribution A = πx (C) ∈ F, B = πy (C) ∈
F, and let γ ∈ [0, 1]. If [C]γ is convex then
R[C]γ (πx , πy ) ≤ R[A]γ (id, id).
(page 1090)
A6-c2 Mizukoshi, M.T., Barros, L.C., Chalco-Cano, Y., Roman-Flores, H., Bassanezi, R.C., Fuzzy differential equations and the extension principle, INFORMATION SCIENCES, 177 (17), pp. 3627-3635. 2007
http://dx.doi.org/10.1016/j.ins.2007.02.039
In this section we will discuss the fuzzy differential equations obtained from a deterministic differential equation introducing an
uncertainty coefficient and fuzzy initial condition. For this, we
will consider that parameter w and initial condition x0 are uncorrelated [A6]. (page 3632)
A6-c1 Matia F, Jimenez A, Al-Hadithi BM, et al. The fuzzy Kalman filter: State
estimation using possibilistic techniques, FUZZY SETS AND SYSTEMS,
157 (16): 2145-2170 AUG 16 2006
http://dx.doi.org/10.1016/j.fss.2006.05.003