A Case Study of Incremental Concept Induction

From: AAAI-86 Proceedings. Copyright ©1986, AAAI (www.aaai.org). All rights reserved.
Abstract
Application
domains
of machine
promises
search
effectiveness
in complex
of incremental,
induction
domains
effective
requires
lacking.
for characterizing
induction
which
systems
learning.
relate
The dimensions
1ivtl Irlt:rit,s
of 4 incremental
effective
rlific-;Llktly
used
ID3.
induction
detracting
to compare
This
comparison
the
quality
cremerltal
concept
intensive
(depth-first
which
and/or
objects
makes
concept
tive
techniques
rrial
hypothesis
is obtained.
the
of update,
of
respec-
reduce
learning
keeping
without
sig-
of induct4
knowt-
require
and
more
a search
isfici?i y.
Introduction
sporist~
process
titrle
precludes
tid preclude
OII t Iit> problem
of concept,
esuttlples,
comeptutrl
of concept
induction
ttltly
rquire
frorrl
for
rc~;ilizClt.iori
that
irlcrt~rrlcrital
riurllber
1~6).
as
new
basis
pararrlount
SctlIlIrllll.
world
x
tIIJIIl~~,
01 ( II rrcarit learriirlg
for rcactirlg
t,rlvirt)rlltlr~rll,s
lW5)
which
sys t t>rlis.
white
on
with
prorrlisth
to
in-
pTYJl.'t!l-tie.5
duct.iori
for
st,orcb 111ay be
to nt’w sLirnuli;
Lhe ticvclol)rrle~lt
X: tloocl,
ltlc
As iliterest,
ttlc
~lorl-
motivation
to push
which
Simon
which
possibility
places
can allow
are of high
hypothesis,
has
or opti-
This
optirrial
such
saton re-
solutions,
hypotheses.
rapid
control
termed
comt.raints
for optimal
for
a
will generally
in search
(1960)
of
(i.e.,
a stable
of obtaining
search
strategy
increasingly
Mo-
usable
is encountered,
(Carbonell
the
a satisficing
on
for satisfactory
the explicit
hypotheses
to deal
of observations,
a knowledge
otily
inforrnatiou
systems
to
solutions,
solutions.
In fact,
update
rrlerilory
and
does
and
quality.
that
of time.
are required
iIISt,ilIICCa
iJrol”:rty
5irrl~Jl;ilc4
a span
t hc: primary
~1151~1iriirrg ,L corit.iriual
is btbcorliirlg
iri
is Lo occur
t)(a c.orri~)~~tatiorlally
is 1,hat
c;Ktl
rnaj0rit.y
triIigc& prilnarily
divctrsity
riot
Specifically,
induction
updated
the
execution,
over
systems
and
may
far,
induction
systems
its learning
systems
incremental
rapidly
objects
but
learm’r~y Jrom
tlorlincre~rleP1fuI,
which
01’ systcrn
acccspt
ir~crcrrlt~nt.al
wil II in great,c:r
(,Ilic tlalski,
over
significantly
(e.g.,
‘t’hus
arc
syskrrls
t.tie outsel
(.rett1etltul systems
t,iv;Ltioils
irlducf,ion
clusterr’rly).
all objects
t)tl prtw~11
has corlcerltrated
seek
luxury
of the correctness
searching
a scarctl
or opti-
the
A reduction
which
erivirori~rierit,
At1
riccessil.att3
learning
tiypoltiesis.
exhaus-
a correct
systems
equivalent
to converge
instances
of hypothe-
precludirlg
tile guarantee
of a filial
that
Iricrernerital
objects
111rty sacrifice
or
of hy-
tfowever,
lncrcmentat
or
search
is converged
frorltier
costly.
thus
be
observed
hypothesis
guarantee
instances
to
a frontier
the
incorporation
cost
past,
stable
con-
Nonirl-
breadth-first,
previously
servca to expand
frorit ier of hypottit5es).
yields
in machine
of
generally
tend
maintaining
some
New
rriality
n’ork
urltit
of the
update.
backtracking,
requires
a list
1982)
rapid
systems
with
which
CTlgt~.
I
induction
systenls
as a result
perf’orrning
ses
indicates
car1 be obtained,
frorrl
by
of illcrernentat
emerge
OH.
lhe
of Quirilari’s
mandated
(Mitchell,
incre-
cost anti qualily
to the
straints
which
we intro-
incremental
variants
program,
Irol~~ t>x;llrlples
I tlat, cost
are
paper
cost advantages
the
di5advantagcs
potheses
discussion
of diffcrirrg
with
come
versiorl-space)
the developrrlerlt
111 this
(iuc(~ :j dimensions
of nofi1,earning
However,
I,he utilily
teas beers
corriplcx
in
limits
methods.
methods.
fi>r corriparirig
rr~e~~tal melhods
l,cx-tlrliques
the computational
ilitensive
cost
01 ciinierisioris
inductiori
to push
incrc~rnerila.1,
Along
Induction
tcor irisLance,
vt~rsiiofI space)
1julul107l.
t,tiis
differeritiatt:
1985;
lirriits
quality)
not. limit
atiy
c>xhaustive
cari bt: irlipI(~r~icrlted
iIlc.rc!rrlt,lltally,
iLed ulilit,y
concept
arid
but
suctl
rrlour~ts,
explicit
we discuss
incrcrnentat
and
WCII its
serving
rrierital
systm~s.
as a basis
These
accepted
scarcti
tectinique
several
for evaluating
are
rliay
have
dirrlerlsions
learners,
competing
to
at a
(e.g.,
instances
irlcrerrlental
riorlilic:rerrlerltal
dimensions
one
so iib Lo accept,
;tri irIl~~I~jr~t~ritat,ioll
ttlis paper
iri-
characterization
are
it1 an c~r~vir-o~irIlerrl tit~rnaridirig
trl
it becorlles
thra cu~r~pututiord
of increrriental
their
property ttlat. objects
t,hta l~el~uvwrd
tirric.
learning
to Itlake
cost,
(e.g.,
Wctijiiqut5,
lhUS
of
in irlcrernerltal
irrlportarll
lirncufllwhich
as
incre-
• The number of observations required by a learning system to obtain a 'stable' set of concept descriptions.
occasional
noisy
every
0 The
u/ u$utifly
c o st
served
objet
memory
to accorrirtlodate
an ob-
t.
two
tree
of currlulative
expended
izing
can
cost,
during
which
irlto
the
A last
induction
quulity
‘I‘ht~
reflects
Iearning.
iricrernental
0
be corrlbirled
a single
amount
a high
derived
by a con-
of instarlcc3
tal
discussing
arc
these
used
variants
tcrrl
of the
rriaririer
dinrensions.
to compare
is of the
builds
decision
of t hesrb systems
indicates
object
without
rived
arid
cost
implies
cau
decrease
In general,
an increase
in the
entries
each
distin-
number
re-
quality
of tie-
rrieritcd
approprial
of observed
objects
not
a root
inforrrialiori
II
A case
study:
ID3
1heIi
arisen
Quinlan's
tree
that
(1983,
1985)
distinguishes
ID3 constructs
between
examples
ples of a particular
concept.
The
au
tree
a collection
empty
decision
riorit~xarriples
nurribt~r
of those
applied
The
ttlr
rrlost
to each
of the
informative
value
are
then
for this
plied
positive
or negative.
of I tit: coiict3pt
‘l‘tl(b choice
root:,
ID3
which
is critical
uses
an
attribute
each subtree.
starts
and
is red).
A
is used
to form
a brarich
for each
thus
groups
the
buildirlg
At this
point,
bc~twecn
will
the
root
ot
‘I’tlcl
to t hcbir
is recursively
subtrees.
examples
urltil
eitlker
all iristaIrct:s
itive
or negative)
‘I’able
1 dt3pictS
root.
ap-
pm-
tht>n t hc
iIlf0rIIli~-
‘I’his attribute
as tile
root
is
attl‘il)utt~.
‘I‘he process
sul)trrw
aw
of’
canriot,
foI]OWtd
l.hC
conI iriuc3
orit2 lype
relial)ly
be
irl
is tc~rrried
gloar~eti
over
rlur~ib~~r
ofsut~stqut~r~l
cali
credit
root
(pas-
choscri.
IliOdifit’d
vf:rsi()li
I 1)4.
previous
be rccliost~n.
inslarlccs
i~i:,t;~rices
rrlorc’ stabl<~.
ttir>
roots
iIIlpOrtiirIt
previous
k.ul,l rW
card
t r(‘e corn-
I,ht> 5111ic titx isiori
occurs
IIltmur-e
<irid ctinllgcs
arc) r:il.hcr
( tiosc~I1 root
ill
rtquires
rtnd
t)clf’ortl
‘l‘liib prot.c+s
of I llc iriforrrlaliorl
progrmst3,
a poorly
01’ a su bl,t.t~c tliscartls
tlxarrlirling
irit‘rc’clut’rllly,
lleuristic..
(highcbr
(stiap
irllorrrlatiorl
:\s
a
roots
(Iw.~I)(~Tsubtret~
lo
the
I~~;trIiirig
irl t tic t rtY>) bc~corlit~
iri su Gl rt’tf root
of’ aI1 ctff>V.t. lri ttlta lollowirig
‘I’hc proce~
arid
arc: both
if it, is ur1lik~~ly to Iiavta
tbrnpty.
root,
llle
dot3
the rrlosl
corriputc~
ttlc
which
instances,
it is cllost~rl
aI, a
is irlc-rcldowll
tticre
attrilutc5.
c.llarIgirlg
the
the
exanlplcs
to
is Ith
Sk[JS
allows
(:harigirlg
in a subtree
decision
used
or a ri(‘w
the
proceeds
arid
with
positive>
was all ~xarIipl<~
is t~ncouutt~red
for classilied
rriodific~atiorr
II)-1 also
3d).
examples.
of its values.
process
I.liis irlstarlct
tIlerI
suril-
irlslct11ces
tlie
ill
of
and
is processed,
1Ike k2 ksl,;
this
of 1113. This
negative
ail
‘I’htt ht>arl
attribufcls
arid
at1 ributc,
thctri
by chance,
iIi
by a
rrieasurc
according
urilesled
uriused
usilrg
ion
tirritb
loc.atcd
test
Otherwise,
a value
how
at tribute
into
with
arid
is described
color)
negative
and
out
to determine
positive
all of the
plet,t~ly discriminates
size
attributes
Jlvided
group,
until
shape,
size,
rloncxarn-
of exarllples
and
attribute,
for each
continues
with
algorilhrn
(e.g.,
between
(14acisiori tree
instdnces
(e.g.,
attributes
t lley discriminate
and
tt>ac:h object
corlcc:pl,.
of attribut,es
for each
is
of a
arid
a discrimination
ii
consists
If’a 511btrtIe
previously
evali~alcxl
al,
of tables
root.
ills1 aric(a
rIirsasurc> is
tivtt of the
to rrlodificaf
orie
14:ach tabIt
coilrits
negative
ontk
1 is olm~rveii.
lies in a series
Classilicatioli
have
arId
increrri6Lri-
triaIllit:r.
of all
(Asu bLrt>e.
yet
itive
1irritl.
ilI1
in
is arntlrlable
to wlictllr~r
or rionexarrlplc.
~iurt~-
;iL. oue
for caacti of its attril,iitc~-values
accordirig
updatcl
a largct
lo be prcsetitc~d
proc~twd
of positive
count
II)3
of’
ilic.rt~rIlcrltal
As a new
or negative
reducing
tree.
valur3
thra riurrlbclr
val lit?.
wlic~rc
proct3sirlg
objects
iitl d~~c~isiorl tree
for ItIc
each
aua lysis which
in the
potent
of in-
(1985).
I I)3 as eat II objet
I,(,
c.orril,iltal,ioriall~
rriarixm
analysis
be considerably
however,
an ‘optimal
and
which
rt’ruli
Irid)
l.lli:,
to insure
iI Quirllari
for
be to allow
iIIsl,iirlcc5
is us4
thcb sarriplirig
‘tl algorittlrrl
dvailltblc
of tticsc> rrioclifications
encli
tllat
cdn tw touuti
sirrlply
iu a dtaci‘1’0 limit
A 11101c de1 ailchd account
t hc I I)3 franlework
Ltlat
l+:ach sys-
A formal
by an empirical
incorporation
trees.
to find
objects
so
efficier~t,
irtc,rerrlerital
variety,
instances.
a significant
decision
reqllired
observed
for
dirrlerisions
1113 program.
examples
negative
is bolstered
that
duct4
over
from
the
of several
1085)
frorrl
trees
positive
behavior
(1983,
learning
on a cast‘ study
Specifically,
the
of Quinlan’s
guish
focuses
test
111 a
betwcerl
resultirlg
rr~t~ltlod of applyirlg
would
at a tirrle,
paper
are
instances.
large.
statistical
to cllarlce.
force’
llowever,
‘1‘11e rclrnaindcr
irlstanco,
d e b~rt7’
_ _ 0.f c0nfidcrit.e)
is rrot due
Orit’ ‘brute
system.
concept
to discrirrlinate
t,c> uriricac.msarily
11)X is a rloriillc.rcrrlclrIt
bcr
uf concept
cep t induction
for character-
is:
in the
atterript
rictgativct
rndy
II):< arici its rrlcasIIL‘(3
of resources
diniension
systems
mjl’se)
a x2 (clli-squared)
stances
measure
arid
which
growth,
factors
(or
I I)3 will
positive
siori
(with
These
errors
situation,
ctloic,c3
tla\‘t‘ loh!~
ilIlillySC’S. II) 1 \Cd5,11)1(1to tiis-
I ril)ul.c~ t,lloic (3. ;trlcl t.OJiVt’rge
011
LIXY ds I IJ3.
rlon~~xarrlplcs
to be acquired.
of a rrieasure
if good
for selectiug
decision
irlforrnation
values
1113 has
lhcoretic
besl
also
discriniirialiou
trees
divide
are
rri6lasure
all
beer1 designed
trc’r
to be ot,lainetl.
ot)ject
to
determine
(sub)set
at
to acconlrlloclate
is
Inputs: A decision tree, One instance.
Output: A decision tree.
1. If this instance is positive, increment the total number of positive instances. Otherwise, increment the number of negative instances.
2. If all of the irdanccs
tlie decision
are positive
Compute
then
return
the expected
negative
If thcrc
information
scores
or Lhe maximal
aLtribuLe
is riot
the
a new tree.
a test link from the root
the root
for every
then
of updating
In
1113, to
update
tree
1 : I’seudo
is augmented
code
(I f?j
;Lrld 16).
st!rvt’s
to
r>rnpiric.al
an
refine
for increrrlerital
cart
effecting
an enipirical
incremental
‘I’he introduction
the
analysis
instance
ca~illy
with
additional
space
indicates
quality
variants
where
/II is the
and
number
d is the
that,
analysis,
are
during
the
cost
reduced
tree
number
(which
of
cannot
. . . . for a total
after
every
instance,
over a single object,
then the above
over two objects,
of
and
without,
signifi-
Asyrrlpt,otir:ally,
of learning.
where the required
r~urrlber
rooted
at node j of’ level
the
t~unlbc~r of instances
of
i.
objects
lcor an
i, of t,tie decision
tree, r~, (
JL, TL,,) represents
of objects
required
for all nodes at that
level
a stable discriminating attribute.
A level cannot stabilize until previous levels have achieved stability, and thus $n_i \geq n_{i-1}$. Since ID3 retains all instances seen, the number of objects needed to construct a decision tree of depth $d$
IA ( is lhe
of the
of incorporating
depth
methods
techniques,
An important computational measure is the number of instances required to construct an optimal tree. ID3 chooses an attribute to form the test for the root based on the information that attribute contains over the observed instances. A sample of objects (from the environment) of sufficient size, $n_1$, must be seen for ID3 to choose the root attribute whose values best discriminate objects of the environment as a whole. This is true in the creation of all
Lo attain
stal)ilk
of instances,
introduced
most
iruportant
is presuluably
term
much
is ) /I2 since
greater
than
the
the
of attributes.”
In 1111, building
ire level,
t,ht: 11urrlber
frorn
j A I).
t,he nurriber
attributes,
cl111
tree
a-- (‘4 /
riurribcr
subtrtlt:
roots as well,
is 71,, lor the subtree
a new
PI (1
exceed
1114.
of thcxse l&or
of incremental
be signilicantly
the
build
attri butt:.
If a tree is built
expense
is incurred
whic,l~ two
is to
each node of the tree requires
that
to determine
their values for previThe cost of constructing
an entire
is
attributes,
allalysis
required
to construct
choice points,
or
rnerrrory
a tree
scratch.
CotistructiIig
instances
be examined
ously unused
attributes.
value of
Go to step 1 with the subtree found by following the link for the root attribute's value in this instance.
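The update steps of the ID4 table can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the names `Node`, `update`, and `classify` are hypothetical, instances are dictionaries mapping attribute names to values, the chi-squared dependence check on the maximal attribute is omitted, and counts are recorded before the purity test so that no observation is lost.

```python
import math
from collections import defaultdict


def entropy(pos, neg):
    """Information (in bits) needed to classify an instance given class counts."""
    if pos == 0 or neg == 0:
        return 0.0
    total = pos + neg
    p, q = pos / total, neg / total
    return -p * math.log2(p) - q * math.log2(q)


class Node:
    """A decision-tree node that keeps counts only, never the instances."""

    def __init__(self):
        self.pos = 0
        self.neg = 0
        # counts[attribute][value] -> [positive count, negative count]
        self.counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
        self.root_attr = None  # attribute tested at this node, if any
        self.children = {}     # value of root_attr -> subtree

    def expected_info(self, attr):
        """Expected information still needed after testing attr (lower is better)."""
        total = self.pos + self.neg
        return sum((p + n) / total * entropy(p, n)
                   for p, n in self.counts[attr].values())


def update(node, instance, is_positive, used=frozenset()):
    """Fold one instance into the tree rooted at node."""
    # Step 1: update the class totals at this node.
    if is_positive:
        node.pos += 1
    else:
        node.neg += 1
    # Assumption: record attribute-value counts before the purity test.
    for attr, value in instance.items():
        if attr not in used:
            node.counts[attr][value][0 if is_positive else 1] += 1
    # Step 2: a node that has seen only one class needs no test.
    if node.pos == 0 or node.neg == 0:
        return node
    candidates = [a for a in instance if a not in used]
    if not candidates:
        return node
    # Step 3: find the most informative attribute; replace the root only
    # when the new attribute is strictly better than the current one.
    best = min(candidates, key=node.expected_info)
    if node.root_attr is None or (
            node.expected_info(best) < node.expected_info(node.root_attr)):
        node.root_attr = best
        node.children = {}  # subtrees grown under the old test are discarded
    # Final step: continue with the subtree for this instance's value.
    child = node.children.setdefault(instance[node.root_attr], Node())
    update(child, instance, is_positive, used | {node.root_attr})
    return node


def classify(node, instance):
    """Follow test links, then answer with the majority class at the last node."""
    while node.root_attr is not None and instance[node.root_attr] in node.children:
        node = node.children[instance[node.root_attr]]
    return node.pos >= node.neg
```

Calling `update(tree, instance, label)` once per observation grows the tree one instance at a time. Because only counts are stored, replacing a root throws away its subtrees, which must then be relearned from later instances.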
Table
ID4
for all attributes.
i. If the maximal attribute is $\chi^2$ dependent, make it the root of this tree.
ii. Make
Cost
B.
is no rout
build
sample,
in order to choose the root attribute in an optimal manner. However, because ID4 does not store all instances encountered, at the next level it must examine another $n_2$ instances because the first $n_1$ instances are not available for inspection.
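The difference in instance requirements can be illustrated with a small Python sketch. It assumes one reading of the analysis: a learner that retains instances can reuse them at every level, so the largest per-level requirement dominates, while a learner that discards instances must wait for fresh observations at each level. The function name `instances_needed` and the sample sizes used with it are hypothetical.

```python
def instances_needed(per_level, retains_instances):
    """Instances needed before every level of the tree has a stable test.

    per_level[i] is the (hypothetical) sample size needed to choose a stable
    attribute at level i + 1; it is assumed not to decrease with depth.
    """
    if retains_instances:
        # Stored instances are reused at each level, so the deepest
        # (largest) requirement dominates.
        return max(per_level)
    # Without stored instances, each level waits for fresh observations.
    return sum(per_level)
```

With made-up per-level sizes of 50, 80 and 120, the retaining learner needs 120 instances and the non-retaining learner needs 250.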
Con-
for each value present in the ineither the number
of positive
or
the information
then
object
score.
instances.
Compute
a representative
same
n o instances
sequently,
the number
of instances
the tree is the bum of all of the root
For each attribute,
stance,
increment
root,
or negative
tree.
Again,
assumitlg
must examine
the
a decision
of ot?jects
times
as shown
below
tree
the
is pr.oporLional
square
of the
only
number
to
of
of objects
this
to an
efficient
is substituted
into
characterization
the
cost
is rod 1.
equation
When
for 1113 we have
rank
or
the
sq~arc
edge,
file,
diagonal,
tyPc?
or
otherwise)
where
or ot,ht>rwise)
each
(6 attributes),
piece
(4 attrihutts).
resides
There
and
(i.e.,
corner,
are a total of sixteen attributes, each with three values. Although there are $6^3 \times 6^3 \times 4^3 = 2{,}985{,}984$ objects possible, an exhaustive enumeration
For ID4, the number of instances needed to reach an optimal decision tree is larger: $\sum_{i=1}^{d} n_i$. Substituting this into the expression for object incorporation yields
that
these
attributes.
Four
are
which
stances
total
required
expense
to select
71; 1 :- >;f;,’ Tli then
is very
likely
quirc~tl
the
of the
Our
0 bjects
empirical
ari
the
subtree
number
root
expensive
is probably
attribute.
than
number
of inIf
1114. ‘l’his
of instances
greater
re-
than
the
or TL~ r > d.
assurned
discussion
sample
of regularity
in an environment,
we now
of
and
actual
are
The
constructs
new
only
reconstructs
been
misclassified.
instance
instance.
counts
tested
tree
third
variant
fourth
were
randomly
is depicted
tree
6‘3,
instance
only
has
counts
of
new
updates
at-
is made
hoards
presented
formed
in figure
perform
after
for each
generated
were
algoof 1113
scratch
updated
164,
of
version,
au
in classification
positive)
decision
from
is 1114; the
variant
indi-
II)3
version
when
are
an error
‘1‘1rc same
The
variants.
the
of the
force
tree
decision
pins
in terms
A smarter
instances
when
ilar to IT,).
variants
decision
The
knight
objects
is a brute
is received.
the
I~‘inally,
tribute
first
rregative
and
distinct
3,251) actual
a new
each
positive
95,480
orily
incremental
tested.
of the instances
a ‘representative’
a rigorous
of objects
on
the
tree
tree,
has
lacking
distribution
since
the
deecision
analysis
each
11)s is more
case,
to construct
depth
hinges
there
behaviorally
rithrn
Comparing
of the
cates
(sim-
(KS 69%
to these
four
by ull of the variations
2.’ ’
Idist-bk-knight1
analysis.
dist-wk-knight
IV
Consider
a board
ation
the
task
position,
and
of classifying
task
versus
safety
or loss of the
black
to move.
chess
attempts
Following
attainment
knight
performance
a classifier
as a win or loss.
a corrcept
king
Empirical
Figure
(1979),
we defjne
Quinlan
king
knight
whether
and
rook
or king
1 depicts
Given
the situ-
as determining
a white
black
endgames.
to identify
results
in two
a sample
a black
in the
moves
board
diag/rectl
\pther
Figure
with
2: Decision
For
the
three
observations
the
least
over
more
for
nurrrber
ple
picture
knight.
Boards
were
randomly
(6 attributes
every
pair
generated
(in squares)
and
between
of this
type),
board
of pieces
(i.e.,
whether
described
each
pair
relationship
they
lie on
there
quickly
forms
an
Figure
variant
t1ut
nurrrhcr
ranges
of
from
3 depicts
2 (averaged
as a function
a rough
should
11ot
and
tw
01
sim-
equated
builds
more
a coniplcte
tree, while
_instances.
111-l rtquires
the
on the
complete
tree
after
timc
each
irrstances.
is a substantial
its
ttic
in figure
gives
consistently
20,000
Classification
in terms
speed,
substantially
constructs
tree
l>epth
_--
variant
aiits
of’the distance
pin.
tree
for f-64.
by each
ID3 rapidly
converging
Though
algorithms,
decision
built
of learning
approximately
black
of the
of instances.
1 I)4 requires
pinned
knight
this decision
11% to ttle greatest
depth
correctrress.
largest,
of a safe,
to forrn
50 execulions)
the
with
efficient
required
the average
I: Example
for a safe
configura-
tion.
l<‘igure
tree
decision
etftlctivc
perforrnarrctt
range
tree,
classification
of tlie
three
iI1 the
each
of the
for
more
variants
t,htl instances.
efficient
vari-
(averaged
instances.
over 50 executions)
was measured
over 1000
_-_
For ID3 and ID4, a 90% effective
classification
of pieces
between
the
same
[Figures: DEPTH OF DECISION TREE and COST PER INSTANCE plotted against INSTANCES PROCESSED; curves labeled ID3 and ID4.]
is Ir)r’rrred
after
~~t:r Iorrnance
as few as 100 instances.
of’ 164
rt’d( tit3 75’i’: correct
!#‘I
c-orrect
t I~ough
it
sp~etf
seri
first,
which
is due
tests
for
to the
of
order
al-
tIstablisheci
learns
by JIM.
form
of the
until
In figure
an
four
of updating
the
cant II Iitfw instance.
corrll>;lrisorks
~xec’ut ions
coricept
number
decisiori
depicted.
are
of comparisons
deeper
Icigures
per irddfic-e
perforrried)
it rit’w clccisiori
tree
desc.riptioris
of corriparisons
do-
in the
4 aritl
for each
after
each
is measureci
rrracie
5 depict
for a lypicd
by
to irlcorporatt:
the
cumulative
a typical
‘l‘he vert,ic’al
(among
most
IIC~ irislance
arid
of
instarices
the
clecisiori
fied
instar1c.e
exllibits
low,
the
high,
tree
from
vtxrl ical scale
O( ,.,t;‘)
_a .
of II)3,
for
bound
curve.
Ifii
which
updates
I I)3 and
becri
of 1114 is greater
^_
1113 (nol~
rrlagnified
thaIi
the
400
that
times).
usually
the
‘l’tie
The
low cost
it is always
ads
wtiil(k r t~rriainirig
however,
bounded
expc~nse
per
instance
important
ion
(i.e.,
of the
IL)3
of classifying
expense
of rebuilding
an incorrectly
this
classi-
O(jAl’)
is clearly
expensive
attribute
The
work
refiertecI
curve
counts
orily
iI1
results
if the
test
Conclusion
learning
domains,
intoIlsive
which
gorithm
wit hiri the
rriachine
increased
nature
performs
least
four.
0( / II’ x let)
expense
when
and
of
of 11)s
is incorrect.
corrlpticated
sc1arc.h
the
function
ID4
is
number
version
of the
reflects
irlterrrlittelit
processe<i,
linear
considerably less than the latter's peaks. The fourth variant displays the least expense of the three. It asymptotes to a value as small as ID3
500
hut
IIF1 has
to
in II&l is nearly
step
of the
as a t’uriction
force
approach
scratch
nearly
As
compartd
1%
irislances
cumulatl’~e
brute
curve
is encountered.
iristarlc32
the
intermediate
V
when
is that
of each
algorithm
The
The
the
arid
for each
by each
expensive
bound.
reveals
from
iitlt
efficierlcy
of 1000
refiects
accelerating
classific~idion
coast
instance.
performance
axis
rriade
the
50
1113 reconstructs
ari instmcc
of this
sequence
of’ instances.
asyrrlptotic
its
‘l‘tlth t’xperisc% of processing
price
over
number
curve
t tie riurriber
execution
variant.
The
6 the
variants
displays
‘1’11~cost
per
slowly.
geometrically
couritirlg
PROCESSED
.-.
1113 cost
and
1.000
anti
quickly.
attributes
value
It5XY
4: II>3
I
750
perfornlarlce,
algorithms
decisive
I;igure
it
incrcrnerl-
relatively
these
off;
‘l’hub,
these
classificatiofl
important,
leaving
instances.
was achieved
with
classificatiori
t MY’S construction;
750
kvel
irisLarlct3,
275
time
perfect
c.I ;issificatioIl
‘I’tltl apparent
effective
after
corlsiderable
hrns to achieve
(2 WC)
after
to
500
INSTANCES
classilicatim
lorlger
classificatiorl
take
The
somewhat
c Iassificdtiori
may
t.al dlgorit
good
takes
I
methods
interest
can
have
to
as they
my
behave
in
behavior
one
tloes
in
evidcrlt.
concept
is that
observations
incremerital
applied
more
of rlorlirlcrerrlerltal,
become
observatiom
thougli,
be rriatle
process
are
deficiencies
in incremental
process
point
rrlethods
the
This
induction
are
leas
rneth-
observed.
An
noriiIlc~rerrierlta1
an
iIicrcrrienta1
at a time).
riot insure
alfasti-
In general,
the
cornpu-
[Figures: COST PER INSTANCE and CUMULATIVE COST PER INSTANCE plotted against INSTANCES PROCESSED; curves labeled ID3 and ID4.]
Figure
tatiollal
5: ID4 and
efficacy
for clkal uating
of such
an
inc’rernc~rital
t)t:crl outlined.
These
cost
per
Figure
dimensions
have
Carbonell,
are:
memory
corlcept
to accommodate
necessary
of objects
PROCESSED
Cumulative
cost, per
J. & flood,
Objectives
It;).
a new
and
,
1,000
instance.
to obtain
a stable
G. (1985). The World Modelers
Simulator
Kutgers
II.
LeartLing
Workshop
of the
(pp.
14
Ilrliversrty.
K nowledge
(1985).
ftepair
t i0n Versus Revolution.
Proceedings
tio72 ul Machine
Workshop
description.
f’roject:
Proceedings
Architecture.
Muchine
lt~ter71Uti07~ul
Michalski,
0 ‘l’tr(~ riurriber
800
dirnensions
methods
Third
of updating
- _ -_
References
Three
induction
600
INSTANCES
6:
instance.
algorithm.
concept
ID4
1’.
Leurning
Mechanlsr~ls:
of the
(pp.
ll:volulntt r71u-
Third
1 IG 119).
fiutgers
IJniversity.
‘[‘he quality
l
of derived
concept
descriptions.
Mitchull,
‘J’trr>se dimensions
have
ior of 1 increrriental
cascx sludy
in the
a prorrlising
od:,
cari
domain
indication
meet
c:rlviroIIrrrerrts,
been
variants
the
used
of chess
that
meeting
the
endgames
has
served
induction
constraints
high
t)ehCiv-
standards
I)iumverirlg
Exarlrptes.
systerrls
and
Quinlarl,
.I. It.
chine
Acknowledgements
IJisc~ussious
with
tat irlg to ttie
was
arid
quality
in
part
by
tichr grants
the
[ST-81-20685
trlstitute
Naval
tfle
Ocean
National
arid
Systems
raised
of learning.
the
under- grmt
initially
specifically
NO0014-84-K-0391,
grants
N0001.f-85-K-01154,
smrctl
KiLlcr
paper,
cost
supported
urltlthr
Ikrlrlis
irk this
Altu,
Office
of
Naval
(Lge.
I ,earrllrlg
(19X3)
Research
SiirIlrrlut,
c
un-
Muchine
Re-
vcrslty.
Voundation
I24 19, the ArItly
contract
(Ed.),
khliliburgh
fhiirlburgh.
ttfficirml
icatiori
tu ct1ess
, .
~tmrIt!lI
& ‘I‘ pvl
A 71 urtijiclui
classific.at.iorr
irilrlligerice
‘l’ic )g;i I’ut)lishing
prom-
uyp
Co~ripaiiy.
re-
Srience
under
lrltlllctloIl
resclarcti
and
lS’I‘-85-
Center
leurn~r~g:
.
C:alllorrila:
NOO014-84-K-0345,
MI)h903-X5-~X)3~4,
by
a number
the criteria
‘I‘his
Rules
l’rcss.
..
Universit,y
of complex
lttfrl-
,4 rtific.iul
In 11. Michitb
ifb the micro
(‘OI‘I‘t’CtIleSS.
of’ itltlas cbxpressed
as Searcll.
226.
A
meth-
of quality
18. 203
as
II)3 program.
irrcrerrrental
computational
while
to cornpare
of Quinlarr’s
<:erreralixation
‘I‘. (19832).
liyence,
and
by
N66001-
h
I1ur11e, I>. (1985)
plex Kot)o1 World.
~illlOIl,
t1.
Mass.:
Workshop
Lcurniny
(11)6’3)
Procredings
‘h!
‘l’l~e M.l.‘l’.
s
cirrices
1xarriirlg
~hr~c:~:pts
In
a
Corrl-
of the ‘I’hlrd ltltcrnutmnul
(pp.
173 176). ftutgcrs Ilni-
of
the .4rtificd.
(Tarrlbritfgr,
l’ress
t-u-( :-o2s5.