Rough Sets Similarity Based Learning
Jaroslaw Stepaniuk
Institute of Computer Science
Bialystok University of Technology
Wiejska 45A, 15-351 Bialystok, Poland
email: jstepan@ii.pb.bialystok.pl
ABSTRACT: The first part of this paper presents the basic rough set model and methodology, and thus serves as an introduction to the other parts. In the second part we discuss the similarity based rough set model; we define similarity relations and the significance of attributes in this model. In the third part we present some applications of the introduced notions in similarity based learning.
1 ROUGH SETS
Rough sets (Pawlak 1991) have been introduced as a tool to deal with inexact, uncertain or vague knowledge in
artificial intelligence applications. In this section we recall some basic notions related to information systems and rough
sets.
An information system is a pair $A = (U, A)$, where $U$ is a non-empty, finite set called the universe and $A$ is a non-empty, finite set of attributes, i.e. $a: U \to V_a$ for $a \in A$, where $V_a$ is called the value set of $a$. Elements of $U$ are called objects and are interpreted as, for example, cases, states, processes, patients, observations. Attributes are interpreted as features, variables, characteristic conditions, etc.
Every information system $A = (U, A)$ and non-empty set $B \subseteq A$ determine a $B$-information function
$$Inf_B: U \to P\left(B \times \bigcup_{a \in B} V_a\right)$$
defined by $Inf_B(x) = \{(a, a(x)): a \in B\}$.
We define the $B$-indiscernibility relation as follows: $x \, IND(B) \, y$ iff $Inf_B(x) = Inf_B(y)$.
For every subset $X \subseteq U$ we define the lower approximation $\underline{IND(B)}(X)$ and the upper approximation $\overline{IND(B)}(X)$ as follows:
$$\underline{IND(B)}(X) = \{x \in U : [x]_B \subseteq X\},$$
$$\overline{IND(B)}(X) = \{x \in U : [x]_B \cap X \neq \emptyset\},$$
where $[x]_B$ denotes the $B$-indiscernibility class of $x$.
[Figure 1: The lower and the upper approximations of a set $X$ in the basic rough set model.]
An illustration of the approximations is presented in Figure 1 (the set $U$ of all objects is represented as the large rectangle and the $B$-indiscernibility classes are represented as small rectangles).
We consider a special case of information systems called decision tables. A decision table (Pawlak 1991) is any information system of the form $A = (U, A \cup \{d\})$, where $d \notin A$ is a distinguished attribute called the decision. The elements of $A$ are called conditions. One can interpret a decision attribute as a kind of classification of the universe of objects given by an expert, decision-maker, operator, physician, etc. The cardinality of the image $d(U) = \{k: d(x) = k \text{ for some } x \in U\}$ is called the rank of $d$ and is denoted by $r(d)$. We assume that the set $V_d$ of values of the decision $d$ is equal to $\{1, \ldots, r(d)\}$. Let us observe that the decision $d$ determines the partition $CLASS_A(d) = \{X_1, \ldots, X_{r(d)}\}$ of the universe $U$, where $X_k = \{x \in U: d(x) = k\}$ for $1 \le k \le r(d)$. $CLASS_A(d)$ will be called the classification of objects in $A$ determined by the decision $d$. The set $X_k$ is called the $k$-th decision class of $A$. The set $POS(B, \{d\})$ is called the positive region of the classification $CLASS_A(d)$ and is equal to the union of the lower approximations of all decision classes. An example of a positive region is presented in Figure 2 (the set $U$ of all objects is represented as the large rectangle, the indiscernibility classes are represented as small rectangles, and there are three decision classes).
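These definitions translate directly into elementary set operations. The following Python sketch is illustrative only; the table representation (a list of per-object attribute dictionaries, with objects identified by row index) and all helper names are our assumptions, not notation from the paper:

```python
from collections import defaultdict

def indiscernibility_classes(table, U, B):
    """Group object indices in U by their B-information vector Inf_B(x)."""
    groups = defaultdict(set)
    for x in U:
        groups[tuple(table[x][a] for a in B)].add(x)
    return list(groups.values())

def lower_approximation(classes, X):
    """Union of the indiscernibility classes entirely contained in X."""
    return set().union(*(c for c in classes if c <= X))

def upper_approximation(classes, X):
    """Union of the indiscernibility classes that intersect X."""
    return set().union(*(c for c in classes if c & X))

def positive_region(table, U, B, d):
    """POS(B, {d}): union of the lower approximations of all decision classes."""
    classes = indiscernibility_classes(table, U, B)
    decision_classes = defaultdict(set)
    for x in U:
        decision_classes[table[x][d]].add(x)
    return set().union(*(lower_approximation(classes, X)
                         for X in decision_classes.values()))

# Toy decision table: conditions 'a', 'b'; decision 'd'.
table = [{'a': 1, 'b': 0, 'd': 1},
         {'a': 1, 'b': 0, 'd': 2},   # indiscernible from row 0, different decision
         {'a': 2, 'b': 1, 'd': 2}]
U = range(len(table))
print(positive_region(table, U, ['a', 'b'], 'd'))   # -> {2}
```

Rows 0 and 1 are indiscernible but carry different decisions, so only row 2 can be classified without ambiguity: the positive region is {2}.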
Different attributes may play different roles in determining the dependency relationship between the condition and decision attributes. The basic idea behind calculating the weight of each attribute is that the more information an attribute provides about the decision attribute, the greater its weight. Rough set theory provides the background for calculating attribute weights and supplies a variety of tools which can measure the amount of information each attribute gives to the other attributes in the form of significance. Let $B \subseteq A$. The relative significance of an attribute $a \in B$ can be defined in many ways. Here we present two natural coefficients:
$$SRC(B, d, a) = \frac{card(POS(B, \{d\})) - card(POS(B \setminus \{a\}, \{d\}))}{card(U)}$$
and
$$SGF(B, d, a) = \frac{card(POS(B, \{d\})) - card(POS(B \setminus \{a\}, \{d\}))}{card(POS(B, \{d\}))}.$$
In both cases the significance of an attribute reflects the degree of decrease of the positive region resulting from removing attribute $a$ from $B$. In practice, the stronger the influence of attribute $a$ on the relationship between $B$ and $d$, the higher the value of both coefficients. Let us observe that the following properties are also satisfied:
$$0 \le SRC(B, d, a) \le SGF(B, d, a) \le 1.$$
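Given a routine for the positive region, both coefficients become one-line computations. A minimal sketch, reusing the hypothetical positive_region helper from the previous listing:

```python
def src(table, U, B, d, a):
    """SRC(B, d, a): drop in positive-region size after removing a, normalized by card(U)."""
    full = positive_region(table, U, B, d)
    reduced = positive_region(table, U, [b for b in B if b != a], d)
    return (len(full) - len(reduced)) / len(U)

def sgf(table, U, B, d, a):
    """SGF(B, d, a): the same drop, normalized by card(POS(B, {d})) instead of card(U)."""
    full = positive_region(table, U, B, d)
    reduced = positive_region(table, U, [b for b in B if b != a], d)
    return (len(full) - len(reduced)) / len(full)  # assumes POS(B, {d}) is non-empty
```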
[Figure 2: Positive region of a partition in the basic rough set model.]
[Figure 3: General scheme for similarity based rough set data mining: problem analysis; attribute determination and decision table construction; basic rough set analysis; if the results of the analysis are good enough, stop; otherwise, definition of similarity measures for attributes and similarity based rough set analysis.]
2 SIMILARITY BASED ROUGH SET MODEL
It has been observed that considering a similarity relation instead of an indiscernibility relation is often quite relevant. The main argument for using a similarity relation instead of the indiscernibility relation is connected with the existence of quantitative attributes in the decision table. Very often these attributes carry uncertain information because of inadequate definition, imprecise measurement or random fluctuation of some parameters. On the other hand, in order to create a generalized description of the decision table and to discover some regularities in the data, the user may wish to translate numerical values of attributes into qualitative terms. Therefore, when using the indiscernibility relation, the quantitative attributes should be discretized using norms translating the attribute domains into sub-intervals corresponding to qualifiers: very low, low, medium, high, very high, etc. In medicine, for example, the use of norms is quite frequent and there are many conventions establishing them. In those applications, however, where the definition of norms is arbitrary, it is more natural to define a relative similarity with respect to a given value of the attribute. Moreover, the use of norms introduces an undesirable phenomenon: very close objects may be separated into two consecutive sub-intervals.
The similarity based extension of rough set theory should be applied when the results obtained by standard rough set methods are not satisfactory. Figure 3 provides a summary of rough set based methods of data mining (the standard model and the similarity based model).
In the next two subsections we describe the construction of similarity relations and the basic properties of the similarity based rough set model.
2.1 CONSTRUCTION OF SIMILARITY RELATIONS
The construction of a similarity relation starts from setting relations between attribute values for each attribute. We propose to use similarity measures as a base, which can be adapted to different types of attributes.
Let $A = (U, A \cup \{d\})$ be a decision table and let $r(d)$ be the number of decision values. We can define similarity measures between two values of a given attribute $a \in A$. For example, for an attribute $a \in A$ with numeric values one can define the similarity measure
$$s_a(v_i, v_j) = 1 - \frac{|v_i - v_j|}{a_{\max} - a_{\min}},$$
where $a_{\min}$ and $a_{\max}$ denote the minimum and maximum values of attribute $a$, respectively.
For more examples of similarity measures see (Stepaniuk 1996).
We assume that value $v_i$ is similar to $v_j$ when $s_a(v_i, v_j) \ge t(a)$, where $t(a) \in [0, 1]$ is a similarity threshold for values of attribute $a$.
Next we describe the aggregation process leading to the definition of the similarity relation on the set of objects.
Let $B \subseteq A$. To construct a global similarity relation (i.e., between objects) we use, for example, the following operators (Stepaniuk 1996):
$$x \, SIM_B \, y \text{ iff } \min_{a \in B} s_a(a(x), a(y)) \ge t,$$
$$x \, SIM_B \, y \text{ iff } \frac{1}{card(B)} \sum_{a \in B} s_a(a(x), a(y)) \ge t,$$
where $t \in [0, 1]$ is a similarity threshold for objects.
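A sketch of the numeric similarity measure and the two aggregation operators above, under the same assumed table representation; the ranges argument (mapping each attribute to its (min, max) pair) and the function names are illustrative assumptions:

```python
def numeric_similarity(v_i, v_j, a_min, a_max):
    """s_a(v_i, v_j) = 1 - |v_i - v_j| / (a_max - a_min)."""
    return 1 - abs(v_i - v_j) / (a_max - a_min)

def similar_min(table, x, y, B, ranges, t):
    """x SIM_B y iff min over a in B of s_a(a(x), a(y)) >= t (B must be non-empty)."""
    return min(numeric_similarity(table[x][a], table[y][a], *ranges[a])
               for a in B) >= t

def similar_avg(table, x, y, B, ranges, t):
    """x SIM_B y iff the average of s_a(a(x), a(y)) over a in B >= t."""
    avg = sum(numeric_similarity(table[x][a], table[y][a], *ranges[a])
              for a in B) / len(B)
    return avg >= t
```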
2.2 ROUGH SETS AND SIMILARITY RELATIONS
In this section we present the basic notions of the rough set concept based on similarity relations (Skowron and Stepaniuk 1995). The standard rough set model can be generalized by considering any type of binary relation on attribute values, instead of the trivial equality relation (Skowron and Stepaniuk 1995, Slowinski and Vanderpooten 1995). We propose a similarity relation on attribute values in an information system (Skowron and Stepaniuk 1995, Stepaniuk and Kretowski 1995).
Let $A = (U, A \cup \{d\})$ be a decision table, let $V_a$ be the set of values of attribute $a \in A$ and let $r(d)$ be the number of decision values.
A similarity based decision table is defined by $(A, SIM_A)$, where $SIM_A$ is a similarity relation on the set of objects. We define $SIM_A(x) = \{y \in U: y \, SIM_A \, x\}$; the set $SIM_A(x)$ contains all objects similar to $x$.
The set approximations (Skowron and Stepaniuk 1995, Slowinski and Vanderpooten 1996) are defined below.
The lower approximation of $X \subseteq U$ by $SIM_A$ is defined as follows:
$$\underline{SIM_A}(X) = \{x \in X : SIM_A(x) \subseteq X\}.$$
The upper approximation of $X \subseteq U$ by $SIM_A$ is defined as follows:
$$\overline{SIM_A}(X) = \bigcup_{x \in X} SIM_A(x).$$
The set $\underline{SIM_A}(X)$ is the set of all elements of $U$ which can be classified with certainty as elements of $X$, with respect to $SIM_A$. The set $\overline{SIM_A}(X)$ is the set of elements of $U$ which can possibly be classified as elements of $X$, employing the knowledge included in $SIM_A$.
Let $X_i = \{x \in U: d(x) = i\}$. The set
$$POS(SIM_A, \{d\}) = \bigcup_{i=1}^{r(d)} \underline{SIM_A}(X_i)$$
is called the $SIM_A$-positive region of the partition $\{X_i: i = 1, \ldots, r(d)\}$. The positive region, as the union of the lower approximations of the decision classes, includes only those objects which belong to the corresponding decision classes without any ambiguity.
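The similarity based approximations differ from the standard ones only in that the similarity class $SIM_A(x)$ replaces the indiscernibility class of $x$. A sketch under the same assumptions as the earlier listings, using the min-aggregation operator of Section 2.1 (any other aggregation could be substituted):

```python
from collections import defaultdict

def similarity_class(table, U, x, B, ranges, t):
    """SIM(x): all objects y with y SIM_B x (symmetric here, since s_a is symmetric)."""
    return {y for y in U if similar_min(table, x, y, B, ranges, t)}

def sim_lower_approximation(table, U, X, B, ranges, t):
    """{x in X : SIM(x) is a subset of X}."""
    return {x for x in X if similarity_class(table, U, x, B, ranges, t) <= X}

def sim_upper_approximation(table, U, X, B, ranges, t):
    """Union of SIM(x) over x in X."""
    return set().union(*(similarity_class(table, U, x, B, ranges, t) for x in X))

def sim_positive_region(table, U, B, d, ranges, t):
    """Union of the similarity based lower approximations of the decision classes."""
    decision_classes = defaultdict(set)
    for x in U:
        decision_classes[table[x][d]].add(x)
    return set().union(*(sim_lower_approximation(table, U, X, B, ranges, t)
                         for X in decision_classes.values()))
```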
Now we introduce the notion of a relative reduct in the similarity based rough set model.
A subset $R \subseteq A$ is a relative reduct for $(SIM_A, d)$ iff
1) $POS(SIM_A, \{d\}) = POS(SIM_R, \{d\})$,
2) for every proper subset $R' \subset R$, condition 1) does not hold with $R'$ in place of $R$.
Let $B \subseteq A$. The relative significance of an attribute $a \in B$ can be defined as follows:
$$SRC(SIM_B, d, a) = \frac{card(POS(SIM_B, \{d\})) - card(POS(SIM_{B \setminus \{a\}}, \{d\}))}{card(U)}$$
and
$$SGF(SIM_B, d, a) = \frac{card(POS(SIM_B, \{d\})) - card(POS(SIM_{B \setminus \{a\}}, \{d\}))}{card(POS(SIM_B, \{d\}))}.$$
Thus in both cases the significance of an attribute reflects the degree of decrease of the positive region resulting from removing attribute $a$ from $B$. Let us observe that if $R \subseteq A$ is a relative reduct, then for every $a \in R$ we obtain
$$SRC(SIM_R, d, a) > 0 \text{ and } SGF(SIM_R, d, a) > 0.$$
3 SIMILARITY BASED LEARNING
Many data mining algorithms are based on inductive learning methods. Much less are based on similarity-based
learning. However, similarity-based learning accrues advantages, such as simple representations for decision classes
descriptions, low incremental learning costs, small storage requirements.
Similarity based learning algorithms consist of at least the following main components:
• Similarity function: given two normalized objects, this yields their numeric-valued similarity.
• Classification function: given an object $x_{new}$ to be classified and its similarity with each saved object, this yields a classification for $x_{new}$.
In our approach similarity based learning is performed in three steps.
1. Compute some reduct (with a minimal number of attributes).
2. Calculate the weights of the attributes (using the significance of attributes).
3. Calculate the similarity of the new object to each object in the reduced decision table and assign the new object to the corresponding decision class.
Similarity between objects can be defined in many ways (Stepaniuk 1996). Let $x$ and $y$ be objects described by the attribute set $A$ and let $R \subseteq A$ be some reduct. We consider, for example, the following similarity functions:
$$sim(x, y) = \min_{a \in R} w_a \cdot s_a(a(x), a(y)),$$
$$sim(x, y) = \sum_{a \in R} w_a \cdot s_a(a(x), a(y)),$$
where $0 \le w_a \le 1$ is a weight assigned to attribute $a \in R$. We use the significance of attribute $a$ as $w_a$ (see Section 2).
For new objects, the value of the decision attribute $d$ is computed as follows: let $x_{new}$ be a new object; the value of the decision attribute $d$ is the same as $d(x)$ for the object $x \in U$ most similar to $x_{new}$. In other words, we classify $d(x_{new}) = d(x)$, where $sim(x_{new}, x) = \max\{sim(x_{new}, y): y \in U\}$.
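Putting the three steps together, classification reduces to a weighted nearest-neighbour lookup over the reduced decision table. A sketch using the weighted-sum similarity function, with helper names assumed from the earlier listings; x_new is a dictionary mapping attribute names to values:

```python
def weighted_similarity(table, x, x_new, R, weights, ranges):
    """sim(x_new, x) = sum over a in R of w_a * s_a(a(x_new), a(x))."""
    return sum(weights[a] * numeric_similarity(x_new[a], table[x][a], *ranges[a])
               for a in R)

def classify(table, U, x_new, R, d, weights, ranges):
    """d(x_new) = d(x) for the stored object x maximizing sim(x_new, x)."""
    best = max(U, key=lambda x: weighted_similarity(table, x, x_new, R, weights, ranges))
    return table[best][d]
```

The weights would be obtained in step 2 from the significance coefficients SRC or SGF computed on the reduct.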
CONCLUSIONS
This paper has focused on the standard rough set model and the similarity based rough set model. We discussed the properties of the basic notions in both models. We also presented an application of the introduced notions in similarity based learning.
REFERENCES
Hu X., Cercone N. 1995. Rough Sets Similarity-Based Learning from Databases, Proceedings of the First International
Conference on Knowledge Discovery and Data Mining, Montreal, Canada, August 20-21 1995, pp. 162-167.
Krawiec K., Slowinski R., Vanderpooten D. 1996. Construction of Rough Classifiers Based on Application of a
Similarity Relation, Proceedings of the Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine
Discovery, November 6-8, 1996, Tokyo, Japan, pp. 23-30.
Kretowski M., Stepaniuk J. 1996. Selection of Objects and Attributes, a Tolerance Rough Set Approach, Proceedings of
the Poster Session of Ninth International Symposium on Methodologies for Intelligent Systems, June 10-13, 1996,
Zakopane, Poland, pp. 169-180.
Pawlak Z. 1991. Rough Sets. Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, 1991.
Skowron A., Stepaniuk J. 1995. Generalized Approximation Spaces, In: Soft Computing, T.Y.Lin, A.M.Wildberger
(eds.), San Diego Simulation Councils, Inc., 1995, pp. 18-21.
Skowron A., Stepaniuk J. 1996. Tolerance Approximation Spaces, Fundamenta Informaticae, 27 (1996) pp. 245-253.
Slowinski R., Vanderpooten D. 1995. Similarity Relation as a Basis for Rough Approximations, Proceedings of the Second Annual Joint Conference on Information Sciences, Wrightsville Beach, N. Carolina, USA, September 28 - October
1, 1995, pp. 249-250, also ICS Research Report 53, 1995.
Slowinski R., Vanderpooten D. 1996. A Generalized Definition of Rough Approximations, ICS Research Report 4,
1996.
Stepaniuk J., Kretowski M. 1995. Decision System Based on Tolerance Rough Sets, Proceedings of the Fourth
International Workshop on Intelligent Information Systems, Augustow, Poland, June 5-9, 1995, pp. 62-73.
Stepaniuk J. 1996. Similarity Based Rough Sets and Learning, Proceedings of the Fourth International Workshop on
Rough Sets, Fuzzy Sets, and Machine Discovery, November 6-8, 1996, Tokyo, Japan, pp. 18-22.