
From: Proceedings of the Eleventh International FLAIRS Conference. Copyright © 1998, AAAI (www.aaai.org). All rights reserved.
Concept Formation taking into account Property Relevance
UNIFOR - Universidade de Fortaleza
Centro de Ciências Tecnológicas - CCT
Av. Washington Soares 1321
Fortaleza - CE, Brazil
vasco@~c.br
Abstract
We propose a method to incrementally compute and use, during concept formation, the relevance of a concept property. This relevance is computed by accounting for the correlation of the property with the other ones, and it is used by the concept quality function in order to improve predictive accuracy. The proposed approach is analyzed concerning both the prediction power of the generated concepts and the time and space complexity of the concept formation algorithm. Initial results show that, in the task of predicting values for several attributes, the proposed method improves the prediction power of the generated concepts.
Introduction
The general aim of concept formation is to construct, based on entity descriptions (observations), a (usually hierarchical) categorization of entities. Each category is provided with a definition, called a concept, which summarizes its elements. A further aim is to use concepts to categorize new entities and to make predictions concerning unknown values of attributes of these entities. Therefore, the quality of a concept can be measured in terms of its ability to make good predictions about unknown values of attributes (its prediction power). Unlike supervised systems, in which concept quality is measured by the capacity to discover the value of a single property, in concept formation the quality of a concept is measured by its capacity to allow the prediction of values for several attributes.

An important aspect that must be considered in concept formation concerns the relevance (sometimes called salience) of particular attribute/value pairs (or properties). Researchers in cognitive psychology (Tversky 1978), (Seifert 1989) and in machine learning (Gennari, Langley and Fisher 1989), (Stepp and Michalski 1986), (Decaestecker 1991) have pointed out the necessity of determining how relevant a certain property is within a concept.
In this paper, we propose a method to compute the relevance of a concept property based on the correlation between the properties of the entities covered by the concept. We describe our approach using a COBWEB-based algorithm, called FORMVIEW, which can generate several hierarchies of categories describing different perspectives (Vasco 1997). In a multi-perspective context, the relevance of a property is crucial because it determines the hierarchical organization of categories. Since FORMVIEW uses a probabilistic representation, the correlation between properties is computed from conditional probabilities. However, the proposed approach is generic and can be employed in algorithms that use other representations.

The prediction power of the category hierarchies generated by the proposed method is computed and compared with that of the hierarchies generated by COBWEB. In particular, we analyze the capacity of the categories generated by these algorithms to predict values for several unknown properties of entities. We show that, in such a situation, property relevance based on the correlation between properties improves the prediction power of the hierarchies produced by concept formation systems.
Concept Formation Systems
Concept formation (CF, for short) or incremental conceptual clustering systems (Stepp and Michalski 1986), (Fisher 1987) recognize regularities among a set of non-preclassified entities and induce a concept hierarchy that summarizes these entities. A CF algorithm is reduced to a search, in the space of all possible concept hierarchies, for the one that covers the observed entities and optimizes an evaluation function measuring a quality criterion. In concept formation, entities are treated one after another, as soon as they are observed, and new entities are classified according to their adequacy for the existing conceptual categories.
Typically, the concept representation is probabilistic (Fisher 1987): concepts have a set of attributes and all the possible values for them. Each concept has the probability that an observation is classified into the concept, and each value of a concept attribute has an associated predictability and predictiveness (Fisher 1987). The predictability is the conditional probability that an observation x has value v for an attribute a, given that x is a member of a category C, or P(a=v|C). The predictiveness is the conditional probability that x is a member of C given that x has value v for a, or P(C|a=v).
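For illustration, the sketch below (our own, not FORMVIEW's implementation; the class and field names are hypothetical) shows how a probabilistic concept could maintain the counts from which predictability P(a=v|C) and predictiveness P(C|a=v) are read:

```python
from collections import Counter, defaultdict

class ProbabilisticConcept:
    """Counts from which predictability and predictiveness are derived."""
    def __init__(self, parent=None):
        self.parent = parent
        self.n_observations = 0
        self.value_counts = defaultdict(Counter)   # attribute -> Counter of values

    def incorporate(self, observation):
        """observation: dict mapping attribute -> value."""
        self.n_observations += 1
        for attribute, value in observation.items():
            self.value_counts[attribute][value] += 1

    def predictability(self, attribute, value):
        """P(a = v | C)"""
        return self.value_counts[attribute][value] / self.n_observations

    def predictiveness(self, attribute, value):
        """P(C | a = v), estimated against the parent's counts."""
        total = self.parent.value_counts[attribute][value]
        return self.value_counts[attribute][value] / total if total else 0.0

root = ProbabilisticConcept()
child = ProbabilisticConcept(parent=root)
for obs in ({"covering": "hair", "legs": "4"}, {"covering": "feathers", "legs": "2"}):
    root.incorporate(obs)
child.incorporate({"covering": "hair", "legs": "4"})
print(child.predictability("covering", "hair"),   # 1.0
      child.predictiveness("covering", "hair"))   # 1.0 (the only 'hair' observation is in child)
```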
Frame 1 sketches an algorithm for concept formation.
FUNCTION PRINCIPAL(Root, Observation)
1. Incorporate Observation in Root
2. Choose the best operator to employ on the partition P of the Root's direct sub-concepts, among the following:
   a) Incorporate Observation into a concept of P
   b) Create a new concept under Root to receive Observation
   c) Merge the 2 best concepts of P into a new concept that includes Observation
   d) Split a concept of P into its children, adding Observation to the best of these
3. If the operation was 2b, read another observation
   Else return to 1 with Root = the concept of P in which Observation was inserted
Frame 1 A CF's structure of control
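The control structure of Frame 1 can be paraphrased by the following skeleton (our own sketch; the operator and quality functions are placeholders for the system's actual operators and evaluation function, not FORMVIEW code):

```python
def categorize(root, observation, quality, operators):
    """Skeleton of Frame 1's control loop: at each level, try the operators
    (incorporate into a child, create a new child, merge, split) and keep the
    partition that maximizes the quality function.  Each operator is assumed to
    return (new_partition, child_to_descend_into_or_None)."""
    node = root
    while True:
        node.incorporate(observation)                                   # step 1
        if not node.children:                                           # leaf: stop here
            return node
        outcomes = [op(node, observation) for op in operators]          # step 2
        partition, next_node = max(outcomes, key=lambda o: quality(o[0]))
        node.children = partition
        if next_node is None:          # operator 2b created a new concept
            return node                # step 3: go read another observation
        node = next_node               # otherwise descend and reapply step 1
```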
Concept Formation taking into account the Relevance of a Property
Computing property relevance

To compute property relevance, FORMVIEW uses a strategy that relies on attribute dependence (or attribute correlation) in the way defined by Fisher (Fisher 1987). Formally, the dependence Mdep of an attribute A_m on another attribute A_i can be defined as:

Mdep(A_m, A_i) = \sum_j \sum_k P(A_i = V_{ij}) P(A_m = V_{mk} | A_i = V_{ij})^2 - \sum_k P(A_m = V_{mk})^2

where V_{ij} signifies the jth value of attribute A_i and A_i \neq A_m. In fact, this function measures the average increase in the ability to guess a value of A_m given the value of a second attribute A_i. We consider that this strategy accounts for the relationship between attribute dependence and the ability to correctly infer an attribute's value using a probabilistic concept hierarchy. We can thus determine those attributes that depend on others and, as a consequence, those that influence the prediction of others. By an influent attribute, we mean one whose known value allows a good prediction of the values of the others. We have thus defined the total influence Tinfl of an attribute A_k on the other n attributes A_i as follows:

Tinfl(A_k) = \frac{1}{n} \sum_{A_i \neq A_k} Mdep(A_i, A_k)

Our claim is that attribute dependence gives a measure with which to ponder attribute relevance, which, in this context, means how much an attribute correlates with the others.
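As an illustration, the following sketch (our own, with hypothetical helper names; not FORMVIEW code) computes this pairwise dependence and the resulting total influence from raw value counts:

```python
from collections import Counter

def mdep(observations, m, i):
    """Dependence of attribute m on attribute i (Fisher-style): average gain in
    guessing a value of m once the value of i is known."""
    n = len(observations)
    p_i = Counter(o[i] for o in observations)               # counts of A_i's values
    p_m = Counter(o[m] for o in observations)               # counts of A_m's values
    base = sum((c / n) ** 2 for c in p_m.values())          # sum_k P(A_m = V_mk)^2
    cond = 0.0
    for v_i, c_i in p_i.items():
        covered = [o for o in observations if o[i] == v_i]
        p_m_given_i = Counter(o[m] for o in covered)
        cond += (c_i / n) * sum((c / len(covered)) ** 2 for c in p_m_given_i.values())
    return cond - base

def tinfl(observations, k):
    """Total influence of attribute k on the other attributes."""
    attrs = [a for a in observations[0] if a != k]
    return sum(mdep(observations, a, k) for a in attrs) / len(attrs)

# toy data in the spirit of Frame 2
obs = [
    {"A1": "v1", "A2": "v3", "A3": "v5", "A4": "v7"},
    {"A1": "v2", "A2": "v4", "A3": "v6", "A4": "v8"},
    {"A1": "v1", "A2": "v3", "A3": "v6", "A4": "v7"},
    {"A1": "v2", "A2": "v3", "A3": "v6", "A4": "v8"},
]
print({a: round(tinfl(obs, a), 3) for a in ["A1", "A2", "A3", "A4"]})
```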
Computing property relevance for each concept

To compute attribute dependence, we have defined the probability of predicting a property p given another property p', P(p|p'). Actually, for each concept C within a hierarchy, we have P(p|p' and C). The acquisition of this probability is problematic, since it cannot be computed only from the predictability and predictiveness stored by FORMVIEW. Instead of keeping all the 2x2-property correlations, which would take too much space, or of computing such a correlation for each new observation, which would be computationally expensive, we have defined a procedure that implements a tradeoff between time and space requirements.

Our procedure consists of maintaining two triangular arrays which keep the 2x2-property correlation: T-root and T-son. These arrays keep such a correlation for all the observations already seen. T-root is updated once for each new observation; it allows computing the relevance of the root's properties. T-son accompanies, side by side, the path followed by the observation during its categorization; it is updated at each hierarchical level until the observation arrives at the leaves. Frame 2 exemplifies the format of the arrays after four observations.
        A1  A2  A3  A4
  Obs1  v1  v3  v5  v7
  Obs2  v2  v4  v6  v8
  Obs3  v1  v3  v6  v7
  Obs4  v2  v3  v6  v8

        v1  v2  v3  v4  v5  v6  v7  v8
  v1     2   0   2   0   1   1   2   0
  v2         2   1   1   0   2   0   2
  v3             3   0   1   2   2   1
  v4                 1   0   1   0   1
  v5                     1   0   1   0
  v6                         3   1   2
  v7                             2   0
  v8                                 2

Frame 2 Example of the triangular array used by FORMVIEW after 4 observations (a1 = {v1, v2}, a2 = {v3, v4}, a3 = {v5, v6}, a4 = {v7, v8}; each cell holds the co-occurrence frequency of two values, and the diagonal holds the frequency of each value)
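A minimal sketch (our own illustration; a dictionary stands in for the paper's Sub/Shift index arithmetic, and the names are hypothetical) of how such a triangular co-occurrence array can be built incrementally and used to read off P(p|p'):

```python
from collections import Counter
from itertools import combinations

class TriangularArray:
    """Pairwise frequency table over property values (diagonal = value frequency)."""
    def __init__(self):
        self.cells = Counter()

    def _key(self, p, q):
        return tuple(sorted((p, q)))          # store each unordered pair once

    def update(self, observation):
        """Add one observation, given as a list of (attribute, value) properties."""
        props = sorted(observation)
        for p in props:
            self.cells[self._key(p, p)] += 1                  # diagonal cell
        for p, q in combinations(props, 2):
            self.cells[self._key(p, q)] += 1                  # co-occurrence cell

    def cond_prob(self, p, q):
        """P(p | q) estimated as count(p and q) / count(q)."""
        return self.cells[self._key(p, q)] / self.cells[self._key(q, q)]

t_root = TriangularArray()
for obs in ([("A1", "v1"), ("A2", "v3"), ("A3", "v5"), ("A4", "v7")],
            [("A1", "v2"), ("A2", "v4"), ("A3", "v6"), ("A4", "v8")],
            [("A1", "v1"), ("A2", "v3"), ("A3", "v6"), ("A4", "v7")],
            [("A1", "v2"), ("A2", "v3"), ("A3", "v6"), ("A4", "v8")]):
    t_root.update(obs)
print(t_root.cond_prob(("A4", "v7"), ("A1", "v1")))   # 2/2 = 1.0
```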
Using the relevance of a property in concept formation

FORMVIEW's concept formation process is similar to that illustrated in the previous section. It constructs probabilistic concepts while privileging their prediction power. The category quality function is, like many of its predecessors, based on Gluck and Corter's work in cognitive psychology (Gluck and Corter 1985), which defined a function to discover, within a hierarchical classification tree, the basic-level category. Category quality is defined as the increase in the expected number of properties that can be correctly predicted given knowledge of a category over the expected number of correct predictions without such knowledge. FORMVIEW, in addition, takes into account the relevance of the properties. We consider that the increase in the expected number of properties to be predicted from a category depends on the relevance of its properties. Formally, the utility UC_B of a category C is defined as:

UC_B(C) = \sum_i A(p_i) [ P(p_i) P(p_i | C) P(C | p_i) - P(C) P(p_i)^2 ]

where A(p_i) is the relevance of the category property p_i.
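The sketch below (our own; the dictionaries and field names are illustrative, not FORMVIEW's data structures) shows how such a relevance-weighted utility could be evaluated. Note that, by Bayes' rule, P(p_i) P(p_i|C) P(C|p_i) equals P(C) P(p_i|C)^2, which is what the code uses:

```python
def category_utility_with_relevance(concept, parent, relevance):
    """Relevance-weighted category utility in the spirit of the UC_B formula above.

    concept / parent: dicts with 'p_class' (P(C)) and 'p_value' mapping each
    property to its within-node probability; relevance maps a property to A(p_i)."""
    score = 0.0
    p_c = concept["p_class"]                                  # P(C)
    for prop, p_cond in concept["p_value"].items():
        p_prior = parent["p_value"].get(prop, 0.0)            # P(p_i)
        gain = p_c * p_cond ** 2 - p_c * p_prior ** 2         # P(C)[P(p_i|C)^2 - P(p_i)^2]
        score += relevance.get(prop, 1.0) * gain
    return score

parent = {"p_class": 1.0, "p_value": {("color", "red"): 0.5, ("size", "big"): 0.5}}
child  = {"p_class": 0.5, "p_value": {("color", "red"): 1.0, ("size", "big"): 0.5}}
print(category_utility_with_relevance(child, parent, {("color", "red"): 2.0}))   # 0.75
```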
Procedures for computing the relevance of a property

FORMVIEW's procedures for computing dependence and correlation are shown in Frame 3. They were inserted into the procedure which performs the incorporation of an observation (steps 4 and 5 in Frame 3).

Step four concerns the update of the arrays, which maintain the frequencies of property correlations. Two procedures are responsible for that activity: UpdateArray and RefineTson. In UpdateArray, the frequency of a property correlation is computed for all the concept properties. The retrieval of this frequency for a specific property is done in step 1.1.1. The function Shift gives access to the cell which stores the correlation between the properties. Consider the example of Frame 2: for the properties (A1=v1) and (A4=v7), Sub returns 1 for v1 and Shift shifts 6 columns, which correspond to the sum of the numbers of values of all the attributes that precede (considering the array's order) attribute A4. Thus, we can access cell (1,7) of the array.

Step five in Frame 3 concerns the computation of the relevance of each concept attribute of the hierarchy. The procedure ComputeRelevance computes the relevance of a property using the conditional probability that an entity has a property given that another property is known. These probabilities are computed from the frequencies of property correlations represented in T-son.

The procedure RefineTson refines T-son each time FORMVIEW descends the concept hierarchy. This refinement consists of updating T-son so that it keeps only the account of the correlations between the properties of the observations covered by the current root concept. For each concept, T-son's actualization is based on the following strategy: if the quantity of observations covered by the concept is greater than the total covered by its brothers (the children of the concept's father), T-son is updated by subtracting the observations covered by the latter; otherwise, T-son is rebuilt so that it only stores the property correlations of the observations covered by the concept.
Incorporate(C, O)
1. Update C's conditional probabilities
2. Include O in the list of observations covered by C
3. Update the number of observations covered by C
4. If C is the root {father(C) = nil}
   4.1 UpdateArray(T-root, O)
   4.2 T-son = T-root
   Else
   4.3 RefineTson(T-son, C, C's brothers, O)
   4.4 UpdateArray(T-son, O)
5. ComputeRelevance(T-son)

UpdateArray(Array, O)
1. For each property p_i of O
   1.1 For each property p_x of O (x ≠ i)
       1.1.1 Add 1 to Array((Shift(Sub(p_i's attribute)) + Sub(p_i's value)),
                            (Shift(Sub(p_x's attribute)) + Sub(p_x's value)))
/* Sub(x) retrieves the array subscript of an attribute or of a value */
/* Shift(Sub(x)) returns the total number of values of the attributes whose subscript is less than Sub(x) */

RefineTson(Array, C, FC, O)
1. If |C's observations| > |FC's observations|
   1.1 Subtract from Array the account of the correlations between properties of the observations covered by FC
   Else
   1.2 Create Array with the account of the correlations between properties of the observations covered by C

ComputeRelevance(Array)
1. For each property p_i of a concept
   1.1 For each of the other properties p_x of this same concept (p_x ≠ p_i)
       1.1.1 SubAtmin = min(Sub(p_x's attribute), Sub(p_i's attribute))
       1.1.2 SubValmin = Sub of the value of the attribute having SubAtmin
       1.1.3 SubAtmax = max(Sub(p_x's attribute), Sub(p_i's attribute))
       1.1.4 SubValmax = Sub of the value of the attribute having SubAtmax
       1.1.5 P(p_i|p_x) = Array((Shift(SubAtmin) + SubValmin), (Shift(SubAtmax) + SubValmax)) /
                          Array((Shift(Sub(p_x's attribute)) + Sub(p_x's value)),
                                (Shift(Sub(p_x's attribute)) + Sub(p_x's value)))
       1.1.6 Pertp_i = Pertp_i + (P(p_i|p_x) - P(p_x))
   1.2 Pertp_i = Pertp_i / (Nb_properties - 1)

Frame 3 Procedures for computing a property relevance
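A compact sketch of how the relevance of each property could be derived from such an array (our own illustration, reusing the TriangularArray helper sketched after Frame 2; FORMVIEW's actual bookkeeping differs in its index arithmetic):

```python
def compute_relevance(array, properties, n_observations):
    """Relevance of each property, following the steps of ComputeRelevance in
    Frame 3: average, over the other properties p_x, of P(p_i|p_x) - P(p_x)."""
    relevance = {}
    for p_i in properties:
        pert = 0.0
        for p_x in properties:
            if p_x == p_i:
                continue
            p_i_given_x = array.cond_prob(p_i, p_x)                  # step 1.1.5
            p_x_prior = array.cells[(p_x, p_x)] / n_observations     # P(p_x)
            pert += p_i_given_x - p_x_prior                          # step 1.1.6
        relevance[p_i] = pert / (len(properties) - 1)                # step 1.2
    return relevance

# e.g., with t_root built in the earlier sketch:
# compute_relevance(t_root, [("A1", "v1"), ("A2", "v3"), ("A4", "v7")], 4)
```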
Performance Task
In our evaluation of FORMVIEW, we pay attention to the prediction power of the hierarchies it generates. The basic idea is to submit a set of "questions" (normally, incomplete observations) to the system, whose answers are based on the generated representation. The quality of the representation is measured according to its capacity to give "good" answers (i.e., to infer values for attributes).

We have used three test domains: two animal classification domains (the ZOO domains) and the Pittsburgh bridges domain.
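A sketch of this kind of evaluation loop (our own; predict_value is a hypothetical stand-in for the system's inference procedure, not part of FORMVIEW):

```python
import random

def prediction_power(hierarchy, test_observations, n_hidden, predict_value):
    """Hide n_hidden attribute values per observation and count how many the
    hierarchy predicts correctly; predict_value(hierarchy, partial_obs, attr)
    is assumed to return the inferred value for a hidden attribute."""
    correct = total = 0
    for obs in test_observations:
        hidden = random.sample(sorted(obs), n_hidden)
        partial = {a: v for a, v in obs.items() if a not in hidden}
        for attr in hidden:
            total += 1
            if predict_value(hierarchy, partial, attr) == obs[attr]:
                correct += 1
    return correct / total
```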
The ZOO domains
The zoo domain was taken from the UCI machine learning repository; it consists of 53 observations with 18 attributes describing animals. We have divided this dataset into two sets of observations described by 10 and 11 properties, respectively. In fact, we have adapted this domain through the insertion of attributes in order to represent two perspectives: the pet and the physiologic perspectives.
Prediction of several properties in this domain was useful to show more clearly the contribution of predictive influence as a heuristic to compute the relevance of a property. Indeed, hierarchies generated with this heuristic have a better prediction power than those generated by other systems. This is due to the fact that concepts are organized around properties having a strong predictive influence: when one infers the value of a property, one increases the probability of inferring values for the other properties influenced by the first one. Figure 1 and Figure 2 illustrate the tests done in the two ZOO domains.
Fig. 1 Prediction of several attributes in FORMVIEW and COBWEB: ZOO Physiologic (prediction accuracy vs. number of observations, for 2 to 5 predicted attributes)

Fig. 2 Prediction of several attributes in FORMVIEW and COBWEB: ZOO Pet (prediction accuracy vs. number of observations, for 2 to 5 predicted attributes)
The Pittsburgh Bridges domain
The prediction of several properties often appears in design problems. Therefore, we decided to use the domain of the Pittsburgh bridges in our tests. Design domains are characterized by having a set of specification properties, which define the user's needs, and a set of design properties, which describe the artifact's characteristics. The bridge domain contains descriptions of 108 bridges built in Pittsburgh since 1818. Each observation is described by 12 properties, of which seven are specification properties and five are design properties. This domain was largely explored by (Reich 1994) and (Reich and Fenves 1991), which show the suitability of COBWEB-like systems for design domains.

We have again defined attribute relevance as a function of attribute dependence. The task consists of inferring values for all the product properties. We suppose that the specification properties are less susceptible to noise, since they are informed by the user guiding the process of construction of the artifact.

Figure 3 shows the predictive accuracy of FORMVIEW against that of COBWEB. The results of these tests indicated to us the adequacy of FORMVIEW for constructing hierarchies for design tasks. Actually, the specification properties are highly correlated and, consequently, very influent. This causes good inferences on the properties of the product.

Fig. 3 Prediction power in the Pittsburgh bridges domain (FORMVIEW vs. COBWEB, prediction accuracy vs. number of observations)
Complexity

To examine the time and space complexity required by FORMVIEW's approach, we consider:
n: the number of categorized entities
L: the average branching factor of the concept tree

The cost of the procedure for computing property relevance is bound to the cost of updating T-root and T-son. It should be remembered that FORMVIEW stores in these arrays the 2x2-correlation for each concept property. First, we determine the space necessary to store these correlations. Let a cell be the unit where the frequency of correlation between two properties is kept. In each array that maintains the frequencies of correlation between properties, for n observations having on average nbAT attributes with on average nbV values each, we need

\sum_{i=1}^{nbAT \times nbV} i

cells, that is to say, O((nbAT \times nbV)^2) cells.

As for the time complexity, we have the cost of updating T-root and T-son when a new observation is categorized. The cost of T-root's update is the same as that required in space. T-son's update is more expensive than that of T-root because it must be done at each level of the hierarchy (on average log_L n levels). For each level, only the frequencies of correlation between the properties of the observations covered by the current node must be represented. Thus, T-son must be actualized m times (m < n), where m is the minimum between the number of observations which are not covered by the current node and the number of observations covered by it. In the worst case, we have m = n/2, which gives the geometric progression (n/2, n/4, n/8, ..., 1) over the depth of the tree. The cost of T-son's actualization is thus of the order

O(\frac{n}{2} \times (nbAT \times nbV)^2).

Finally, it is necessary to mention that the cost of computing the predictive influence for every concept also requires a time effort of the order O((nbAT \times nbV)^2). The total cost of the procedure for computing the predictive influence of a property is:

O((nbAT \times nbV)^2 \times (\frac{n}{2} + \log_L n)).
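As an illustrative calculation (with made-up sizes, not figures from the paper): for nbAT = 10 attributes of nbV = 2 values each, an array indexes nbAT \times nbV = 20 value rows and therefore needs \sum_{i=1}^{20} i = 210 cells, which is roughly (nbAT \times nbV)^2 / 2 = 200 and hence of the order O((nbAT \times nbV)^2) stated above.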
Related Work

The definition of a property relevance has been treated in early incremental concept formation systems. ADECLU (Decaestecker 1993) uses a statistical measure to quantify the correlation between a property of a concept and the variable "membership of the concept". It maintains a 2x2-contingency table for each property of each concept. However, there is no account of the correlation between the properties.

In ECOBWEB (Reich and Fenves 1991), property relevance is taken into account in the categorization process in the same way we have implemented here. However, the information about which properties are relevant is given by the user. Cluster/CA (Stepp and Michalski 1986) equally uses information about property relevance defined by the user in the GDN (Goal Dependency Network). Early versions of FORMVIEW also follow this same idea (Vasco, Faucher and Chouraqui 1995), (Vasco, Faucher and Chouraqui 1996).

The non-incremental system WITT (Hanson and Bauer 1989) computes the correlation between properties to create categories. It keeps, for each concept and each property pair, a contingency table containing the frequency of simultaneous occurrence of the property pair. For A attributes, WITT thus keeps A(A-1)/2 contingency tables per concept. This can be a very tough requirement in terms of space.

Conclusion and Future Research

We have defined a method to compute and use the relevance of a property in concept formation systems. The first results obtained with the use of property relevance are encouraging. They show us that FORMVIEW's approach can provide the representations it generates with better prediction power than those generated by systems that do not take into account the relevance of properties. However, additional tests are necessary, mainly with regard to evaluating the performance of FORMVIEW with more data. In order to analyze the generality of the proposed method, future research consists of applying this method to systems which use different representations.

References

Decaestecker, C. 1991. Description Contrasting in Incremental Concept Formation. In Proceedings of the European Working Session on Learning.

Decaestecker, C. 1993. Apprentissage et outils statistiques en classification conceptuelle incrémentale. Revue d'Intelligence Artificielle, 7(1).

Fisher, D.H. 1987. Knowledge Acquisition via Incremental Conceptual Learning. Machine Learning, 2(2).

Fisher, D.; Pazzani, M.; and Langley, P. eds. 1991. Concept Formation: Knowledge and Experience in Unsupervised Learning. Morgan Kaufmann.

Gennari, J.H.; Langley, P.; and Fisher, D. 1989. Models of Incremental Concept Formation. Artificial Intelligence, 40.

Gluck, M.A. and Corter, J.E. 1985. Information, Uncertainty, and the Utility of Categories. In Proceedings of the 7th Annual Conference of the Cognitive Science Society. Irvine, CA: Lawrence Erlbaum.

Hanson, S. and Bauer, M. 1989. Conceptual Clustering, Categorization, and Polymorphy. Machine Learning, 3: 343-372.

Michalski, R.; Carbonell, J.; and Mitchell, T. eds. 1986. Machine Learning: An Artificial Intelligence Approach. Vol. II. Morgan Kaufmann, CA.

Reich, Y. and Fenves, S. 1991. The Formation and Use of Abstract Concepts in Design. In (Fisher, Pazzani and Langley 1991).

Reich, Y. 1994. Macro and Micro Perspectives of Multistrategy Learning. In Michalski and Tecuci eds., Machine Learning: A Multistrategy Approach. Vol. IV. Morgan Kaufmann.

Seifert, C. 1989. A Retrieval Model Using Feature Selection. In Proceedings of the Sixth International Workshop on Machine Learning. Morgan Kaufmann.

Smith, E. and Medin, D.L. 1981. Categories and Concepts. Harvard University Press. Cognitive Science Series 4.

Stepp, R. and Michalski, R. 1986. Conceptual Clustering: Inventing Goal-Oriented Classifications of Structured Objects. In Michalski, R.; Carbonell, J.; and Mitchell, T. eds., Machine Learning: An Artificial Intelligence Approach. Vol. II. Morgan Kaufmann, CA.

Tversky, A. and Gati, I. 1978. Studies of Similarity. In Rosch, E. and Lloyd, B. eds., Cognition and Categorization, 79-98. Erlbaum.

Vasco, J.J.P.F.; Faucher, C.; and Chouraqui, E. 1995. Knowledge Acquisition Based on Concept Formation Using a Multi-Perspective Representation. In Florida AI Research Symposium, Melbourne, FL.

Vasco, J.J.F.; Faucher, C.; and Chouraqui, E. 1996. A Knowledge Acquisition Tool for Multi-perspective Concept Formation. In Proceedings of the European Knowledge Acquisition Workshop (EKAW-96), Shadbolt, Schreiber and O'Hara, eds. Springer Verlag.

Vasco, J.J.F. 1997. Formation de concepts dans le contexte des langages de schémas. Ph.D. diss., Université d'Aix-Marseille III, IUSPIM/DIAM.