Beyond incremental processing: Tracking concept drift

From: AAAI-86 Proceedings. Copyright ©1986, AAAI (www.aaai.org). All rights reserved.

Jeffrey C. Schlimmer and Richard H. Granger, Jr.
Department of Information and Computer Science
University of California, Irvine 92717
ArpaNet: Schlimmer@ICS.UCI.EDU, Granger@ICS.UCI.EDU
Abstract
machine
luarrlirlg
systems
are
able
drift,
and hence
ca~noi,
deal with
ronments
containing
these qualities.
I,c~arriilig
in
rrltlt hods
that
Iixc t I:edback)
complex,
changing
are able
a11d &if1
to tolerate
( concepts
‘I‘tic5e
two aspecfs
of
cm.h oltier:
w heri some
corrc~cl,ly
predict
the
(‘011ic’
oc’c urs without
prtlclic.Lor),
a learner
1 t\ts siLuaL.ion
is an
iIistan<e
of noise
to
learrl
tolerating
illust.rates
coiiccpts
over
Sorirc~Lirnt~s
,ill(br
rili~i
(1 volcanic:
111a.y become
drift.
We j)rescnt,
complex
Boolean
ttlat
a learning
character-
An analysis
of
desirable
bellav-
from
an irrij)lerrlcntatiorl
to show its ability
to t,rack
Lo
Idit
lfl~l
dll(
~JarOIwk!r
it doesll’t.
criil)t
ion,
pc~(Jr
is
reading
indicates
rain cornlc’urt.tierrriore,
for trioriths
j)rctviously
good
indicators
of
while
[JI’t’diCtkJrS,
or
C’OlIli)(Jll
il ;tt
is
it,
ai
IlCi~‘Ci
poii~t
sonit:
Lhe
IJrtdict.
t’,
olher
(previously
Attempting
to
tJet.W(!(!ll
bccaust>
CVC’lltS
(a)
rt~ost,
irlt,erldtxi
ir~dic:aLiorl
l,y Llle t‘ac.t
a part,ic~ular
outxo~llt:,
Ll1a1, the
is
tllat
gooci
this
corlctlpl
,just
uoistb aI1d
irlclicatot
a
noisy
is begirirririg
Nature has solved this problem in humans and animals: in classical conditioning experiments
are able to tolerate noise
rrlcrll5
with
and many
We have implemented this method in a computer program called STAGGER, and have tested it in a variety of environments, ranging from animal learning tasks to blocksworlds to chess endgames. We present some empirical findings reflecting the program's ability to track drifting concepts.

Related work
Many successful learning systems have failed to deal with the issue of concept drift over time. Quinlan's ID3 program (1986), for example, constructs a discrimination tree to characterize instances of a concept. This representation allows conjunctive, disjunctive, and negated characterizations.
Quinlan has examined the ability of this method to accommodate varying levels of noise, concluding that its performance is close to optimal (Quinlan, 1986). However, the method is nonincremental, for it requires examining (and re-examining) a relatively large number of instances and does not have mechanisms for modifying an existing tree to incorporate new instances. It is unable, therefore, to track changes in concept definitions over time.

The incremental nature of a learning algorithm does not guarantee that it will be able to deal with concept drift over time.
even
competing
in extremely
cues.
complex
environ-
however,
few current
information
the strong
accurately
,Mit,cht:ll
( IOH~),
for example,
reports
vcrsioll
spactk Ittarriirlg
rr~el hod iri which
an aj)propriate
sc.ription
searcti
IO tlrill?
rat>
We have
on Bayesian statistics, it tolerates systematic noise, but not random noise, distinguishes between noise and drift, and is able to track changing concepts over time.
II
;t~50c.icLLiolls
art’ riot j)erftTLly
coilsistent,
(11Cnt.c~ obst~rvetl
iti~li~~~ct’s
of Ltlc5cs a~mciatiorls
coiltail
‘Iioistf’),
arld (b) abI,cbarriitig
ill t hesc
or u’rt.Jl ov(lr
tirrlc,.
50~ i,Ltiolis
c.llarlgt’
i ro[tlrit~lits
(ir il t ir1tcarac.t:
and
the
Ijoor)
iildicat,ors
may
I~ecorrlt~
j,retiictive.
I(siif II I’rollr c~xjJcrit:lice
AtJOllt
assoCiiLt,iO~lS
like> 1 ti(+,cb irl ttlcb rtB;tl world
is corrfollrldtd
(‘Iiv
liaise
that
tolerates
noise and drift,
and we otfer an anaccount
of why it behaves
as well as it, does.
The
is able to keep track
of, and hence distinguish
bedifferent
types of noisy
instances.
Via formula
based
(called
chang-
Introductiou
a low
d11d sornct.irrlcs
irldication
rriethod
alytical
rrlethod
twcerl,
tolerate
time.
I
irig,
or an
noise
and drift.
why it teas these
ions, a.r~ti tmpirical
results
S’I‘A( ;( ;EI~) are prc:seI~!mi
irrg
requires
noise
(less
than
perthat change
over time).
complex
environments
iriteract
with
particular
learned
predictor
fails to
expected
outcome
(or when
the outhavitig
been preceded
by the learIled
must,
be able to dcterrriine
whethet
t ht> c.of~c.t~pL is beginning
trlt~lhoci
that, is able to
iz.aliollb
while
I tit, aIgorit,hrri
erivironrrlents
to
complex
reactive
enviWe present
a learning
of observed
illstarlc’es
is forriled
Lhrougti
it space of possibililic5.
does
Ilot
drift
on the
de-
via a bidirectional
‘l’hough
relational
is utilized,
the version
sjmce
rnc!ttlod
assurtles
bias t,hal a conjullrtive
characterir/,aliorl
can accapture
Ltle c.o~lc.elJt,
to be learned.
lrr later work
(MiLchelI,
IJtgofl’,
powd
w hicli would
AL Hanerji,
forrri
19X3),
cjisj urictive
a modification
descriptions
was proor toler-
of a concept
drifts
over
time.
Langley's discrimination learning method (in press) is able to track changes in a concept definition over time. The learned concepts are expressed as a set of production rules, one of which influences expectation at a time. If the applicability conditions for an operator change, presumably recently learned productions would be weakened via strengthening while discrimination would propose new ones. Eventually, the new productions would be strengthened and overwhelm the old ones.
this
t’i~usc1
method
however,
I’unction,
rioiscl.
III
tf~lally
irisLance
would
learning.
learning
method:
of STA(;GEfl’s
learning
concepL
rc~prcseIltaLion
rriclrit. of the
iLctc>rixalions.
111ort’
specific,
Xl.i~JtiOll
tse-
by using
the
I,carning
weights
This
and
latter
method
composed
is hased
on
of a se1 of
and
invert.4
‘I‘ht5t~
t’k’~~l~m~S.
I’or iIlc.lusiori
in tlic
I tl;Lt w(‘rc cornbincd
As
each new
of iLs iden-
pair of weights
associaled
occurs
at. two levels:
generation
process
of’
new
constructs
description
them.
the
stirriuli
about,
occur
alone eve11 a few rlurnber
of
their
associatiorl
is severely
impaired.
tteal-world
ample,
the
tasks
also
descriptions
rumforrl
or
noise would
withill
IO%
systemutic
be a tempera1
of its operating
ally
weighted,
represented
symbolic
would
would
blue)
xal iorr5 dre dually
rltlgiit
ivcl iniplication.
or
tJt: represcrlLt4
shape
Oritf
01’ I1 ctiarac.terixatioil
lor
illIll
111th other
represent,s
/JO.\).
itllti
results.
In a classical
as a set
Ii:actl
of’ du-
c~lomenL
of
function
of atlrihuteof c*oIljuncts.
Ali exblue figures
or square
small
cl~tcl
as (size
These
characterisquare.
in order
to capture:
positive
and
wr%igliL represefits
t,titl sufficiency
prediclion,
or
it.s necessity,
measures
are based
conditioliing
(mulched
:, j~,o,s),
or ( Irrlutched >
leariling
atlons
sure
t~xperirrieiit,
learning
events.
subject
to
For exeither
variation.
An example
of random
ure sensor
which
is accurate
Lo
range.
It may read too high 011
occurs
in systematic
with
raIidom
variatiori.
this
in
necessity.
ranges
mind,
‘I’hey
from
cases
but
defined
is dubious
uses
S’I‘A(;GEK
are
zero
in situ-
logical
sufficiency
as a measure
1979). Similarly,
ratio,
serves
of suffilogical
to mea-
as:
rt~lationship.
I’ronj zero
zero
Ltiari
l,N,
and
is iiitt:rpreLed
correlatioli,
grtlattlr
t ban
I,N also represents
Lo posiLivt>
iilfinil,y.
unity
odds
llowt~vtfr,
irtdicaLcls
a positive
and takes
ati /,A’
on values
value
near
to Lhtk conf ingt1nc.y
law, fi>r it can
tnanip~Jlat.io~~s
t,lliit
I,S’
’ I a11d
if p(II,SiNC,‘)
C;ivcli
il
Llie
list.
’ p(/‘,Sl
of
disLrit)uled
liy(,‘)
aLtribute-value
(SchliIrl~rlc~r,
pairs
coricepl
shown
1,-V <
IJc
via alI if and
19%).
describing
rel)rt’s”Iit,;rt.ioIl
ali
in-
as a whole
infl uencds
expc~ctation
of a positive
or negative
instance.
Following the mechanism used by Duda, Gaschnig, and Hart (1979), the dual weights associated
wiLh each charac-
sufliciency
learning
a subjecL
infiility
be easily
corrvert,cd
to probftr1 1,s ValuC less than
unity
unity
iJldiciLtt3
iridepen-
iildic.aLc:,
a positivtl
corrtllatiori,
aricl ii L’iilklt’
grtxaltar
uirity
iritlicates
negative
correlatiori.
If’or hotti
l,.S anti
unity
indicates
irrelt~varlcc.
The I,S a~itl I,N rrit~asures
ad tlt>re
gebraic
orlly
to positive
(Odds
may
t odds).)
indicates
a negative
deuce,
aud a valise
b t,arice,
chosen
for the
on psychological
times,
elements
is a I3oolean
l)y a disjunct
either
small
weighLt:ti
‘1‘11~ rIiathernatic:al
nt05sity
weights
STAC;(:K:H
charac.terixations.
I II(~ c.oiic~t~pl. descript,iorr
VillllC’ /‘airs
represented
,iril~)le t~lcrrlt:lltS rriatchi~~g
on(+i
color
in
p((lSI
INC).
or the other
still learns
an
if each of the
one occasion and too low on another; the direction of its error is random. Only a few authors have dealt with this possibility (e.g., Quinlan, 1986). However, it may often be the case that errors in description are the result of a systematic variation. For example, a rain gauge may leak and sometimes read lower, but never higher, than it should. The errors of this latter instrument are systematically of one type (only too low), though they may occur with an unpredictable frequency.
The contingency law states that,
LS
are
and an
Rescorla
iI1 tcrrus
of odds.
ability
I,
otlds/(l
(:onc.tlpt.s
contain
spurious
of instances
be
(M),
or positive
likelihood
ratio,
ciency
(Duda,
Casthnig,
Kr tlart,
necessity
(I,N),
or negative
likelihood
concept
deCornpete
with
novel
cue thal, wilhout
it, or p(USlNC>)
b~
In behavioral
terms,
Lhis nieans
that, if one
slimulus
frequently
occurb
alone,
the subject
association
t,ctweell
t,he two cues.
Ilowever,
With
with
adjust-
IJoolean
charinore
general,
versions
of existing
11ew cliaracterixatioris
concept
to fornl
of
STACCEK
wclighled,
sy rtlholic
c-tlarac.teri~ations.
is processed,
a cumulative
expectation
t.ity is formed
c.tiarat:terizatioris.
be
is based
on a slrengtlieniiig
evaluation
it does not distinguish
between
tyyes
A new
‘l‘tit: heart
a distritjuted
characterixations
any previous
cue (NC)
testing,
(1968) formulated the contingency law which states that subjects will learn an association between the two events only if the unpleasant stimulus is more likely following the
limited noise in instances (but not both, interestingly). Though this method is incremental, learned characterizations may not change and recross the search boundaries previously established in the version space as the definition
ilt,t’
given repeated presentations of a novel unpleasant stimulus (US). After extensive
is
terixaLior1
are used together with estimated prior odds to calculate the odds that a given instance is positive. Expectation is the product of the prior odds of a positive instance and the LS values of all matched characterizations and LN values of all unmatched ones.

$$\mathrm{Odds}(pos \mid instance) = \mathrm{Odds}(pos) \times \prod_{\forall\, matched} LS \times \prod_{\forall\, unmatched} LN$$

The resulting number represents the odds in favor of a positive instance. This holistic approach differs from most machine learning systems in which a single characterization completely influences concept prediction.
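The expectation computation described above can be illustrated with a short sketch. This is not the original STAGGER code: the function names, the tuple layout for a characterization, and the example weights are all assumptions chosen for illustration. The sketch multiplies the prior odds by LS for each matched characterization and by LN for each unmatched one, and converts odds to a probability as odds/(1 + odds).

```python
def expectation_odds(prior_odds, characterizations, instance):
    """Odds that `instance` is positive.

    `characterizations` is a list of (matches, ls, ln) triples, where
    `matches` is a predicate over instances (hypothetical layout).
    """
    odds = prior_odds
    for matches, ls, ln in characterizations:
        # Matched characterizations contribute LS, unmatched ones LN.
        odds *= ls if matches(instance) else ln
    return odds


def odds_to_probability(odds):
    """Convert odds in favor of an event to a probability."""
    return odds / (1.0 + odds)


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
characterizations = [
    (lambda x: x["color"] == "red", 4.0, 0.5),
    (lambda x: x["size"] == "small", 2.0, 0.25),
]
instance = {"color": "red", "size": "large"}

odds = expectation_odds(1.0, characterizations, instance)
print(odds, odds_to_probability(odds))  # 1.0 * 4.0 * 0.25 = 1.0, probability 0.5
```

Because every characterization contributes a factor, no single element decides the prediction on its own, which is the holistic behavior the text describes.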
The
prior
as (C,
odds
for
t lr)/(I~
If STAGGER
cll;lractt~rizatior1
tatiou
“linearly
limited
weights,
concepts
measurt:s
to
tic, t~xJJ(~ctaliou,
S’I‘AC:(; CR iucrtmenlally
wciigllls
ilsbO<:iat~ld
with
individual
thra b1ructure
of the (.tlarac.tc~ixations
two
l;lI,ter
dt5criptioti
abilities
to t,etter
allow
S’I‘AC;C.;l~11 to
reflect
the concepl.
‘1‘11tb sufliciency
and
necessity
taac-tI of’ the c011cept
descripliori
ac1justt.d.
Consider
the possit~lr
in a distributed
compute
a holis-
modifies
charactttrixatiom
themselves.
adapt
troth
the
aud
‘l’hesc
its
are
its
the
learning
to
distributed
would
be sufficient
to accurately
separable”
concepts
(llarnpson
easily
estimated
adjustment
concept
of the
represerl-
describe
the class of
& Kibler,
1983).
characterizations
functions.
representing
Bayesian
instance
In this respect STAGGER is similar to connectionist models of learning when those models do not have any “hidden” units. The purpose of the hidden, internal units is to allow the encoding of more complicated concepts. Search processes in STAGGER serve an analogous purpose:
individual
prediction.
13.
III addition
to
Irlafln(‘r
;trkcl using
a positive
i CN).
concept
weights
elerrlents
situations
associated
with
may
be easily
that
rr~ay arise
wl1c111 rrlatctiirig
a characterixatiori
against
lowirlg
t11e terminology
used
by ljruner,
Au:,Iirl
(IYW),
a positive
instance
is
an instance.
lcolGoodrlow,
arid
positive
evidence
are combined into more complex Boolean
STAGGER searches through a space of possible characterizations as it refines its initial distributed representation of the concept into a unified, accurate one. Each possible Boolean characterization of attribute-value pairs may be viewed as a node in the space of all such functions. Figure 1 depicts a small portion of this space over a simple domain (each ellipse represents a Boolean function). Any two of the possible Boolean functions are partially ordered along a dimension of generality (Mitchell, 1982).
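To make the idea of a space of Boolean characterizations ordered by generality concrete, the following sketch represents characterizations as nested tuples and tests the generality relation by enumerating a small domain. The tuple encoding, the function names, and the example domain are assumptions made for illustration; the paper does not specify a data structure.

```python
from itertools import product


def matches(char, instance):
    """Evaluate a characterization (nested tuple) against an instance (dict)."""
    op = char[0]
    if op == "attr":  # ("attr", attribute, value): an attribute-value pair
        _, attribute, value = char
        return instance[attribute] == value
    if op == "NOT":
        return not matches(char[1], instance)
    if op == "AND":
        return all(matches(c, instance) for c in char[1:])
    if op == "OR":
        return any(matches(c, instance) for c in char[1:])
    raise ValueError(f"unknown operator: {op!r}")


def more_general_or_equal(a, b, domain):
    """True if every instance matched by b is also matched by a."""
    names = list(domain)
    for values in product(*(domain[n] for n in names)):
        instance = dict(zip(names, values))
        if matches(b, instance) and not matches(a, instance):
            return False
    return True


# Hypothetical two-attribute domain.
domain = {"color": ["red", "blue"], "size": ["small", "large"]}
red = ("attr", "color", "red")
small = ("attr", "size", "small")

print(more_general_or_equal(("OR", red, small), red, domain))   # disjunction is more general
print(more_general_or_equal(red, ("AND", red, small), domain))  # conjunction is more specific
print(more_general_or_equal(red, small, domain))                # unrelated pair
```

The last call returns False in both directions for `red` and `small`, which is why the ordering is only partial.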
which may either confirm the predictiveness of a characterization (if it is matched in this instance) or infirm the characterization's predictiveness (if it is unmatched). Similarly, a negative instance is negative evidence which either confirms an unmatched element or infirms a matched one. Table 1 summarizes these possibilities.

Table 1: Possible situations in matching a characterization to an instance.

                    Instance
Characterization    Positive      Negative
Matched             confirming    infirming
Unmatched           infirming     confirming
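As a hedged illustration of how counts of the four situations in Table 1 could yield the two weights, the sketch below uses the standard likelihood-ratio definitions from Duda, Gaschnig, and Hart (1979): LS = p(matched | positive) / p(matched | negative) and LN = p(unmatched | positive) / p(unmatched | negative). The function and parameter names are hypothetical, and the sketch assumes all four counts are nonzero.

```python
def likelihood_ratios(conf_pos, inf_pos, inf_neg, conf_neg):
    """Return (LS, LN) from counts of the four situations in Table 1.

    conf_pos: matched,   positive instance (confirming)
    inf_pos:  unmatched, positive instance (infirming)
    inf_neg:  matched,   negative instance (infirming)
    conf_neg: unmatched, negative instance (confirming)
    Assumes every count is nonzero (an assumption of this sketch).
    """
    positives = conf_pos + inf_pos
    negatives = inf_neg + conf_neg
    # LS = p(matched | pos) / p(matched | neg), written with cross-multiplication.
    ls = (conf_pos * negatives) / (inf_neg * positives)
    # LN = p(unmatched | pos) / p(unmatched | neg).
    ln = (inf_pos * negatives) / (conf_neg * positives)
    return ls, ln


ls, ln = likelihood_ratios(conf_pos=8, inf_pos=2, inf_neg=1, conf_neg=9)
print(ls, ln)  # LS above unity and LN below unity: a predictive characterization
```

With equal counts in all four cells both ratios are exactly unity, the value the text associates with irrelevance.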
III 1t’rms
of’ these
irrilJlit3
I,l~at learrlirlg
rrratctiir~g
occurs
f,j pt’ of’ irrfirrriirig
cvitieuce.
it~nou~~t,h 01’ I~0111 positive
sut~j~~c-t.s fail I,0 learu
art
dt~lirtitiorl
of
systcxlrlatic
cfc~lillt~cl
;is both
typt5
t)le situatiolis
oL’ ir~firriiirig
listed
the coutingency
involving
at triost
III siluatioris
and negative
association.
variation
‘I‘tlt~ weighting
rrieasurt3
( II 1‘11t4 by keeping
counts
pobi
everits,
iii cases
with
everi small
infirming
evidellce,
The corrt~sporitlirig
is the
presence
of
1.
h’(h
c:N(G
<CzEJ
Vigure
i CN)
I I]‘)
calthe
’
I : I’artinl
CJ~A~~LC lerixatiorr
S’I‘AGC:ER’s
initial
sirrrple
cllarac.terir/,atioris
wit11 initially
is more
thari
f,N
rriay be easily
characterization
f,N
of’ 011ly
evidence.
f,S and
for each
iri Table
law
oric
unbiased
twice
the
a corljuIlct,iorl-orlly
1982) _ A uotlier
c-onct~pt
description
in the rriidiile
Notice
weights.
size
of’ that.
method
iuterestirig
space
both
rnetl1od
searches
sides
toward
the
from
both
the sirriplest
boundaries.
st~arct,
points
its space
middle;
in
the
corrsists
of the
ot’ b‘igure
1 eacll
tllat
this
space
typically
like version
difference
spact’.
searched
spaces
is that
by
(Mitchell,
the versiou
of characteri~atiorls
from
S’I’AC:CHl~
hearn-searches
middle
outward
toward
STAGGER's three search operators correspond to specializing, generalizing, or inverting characterizations. To make a concept description element more specific, search proceeds down a conjunctive path. Conversely, to make a more general element, search proceeds to a new disjunction. Lastly, a poorly scoring characterization may be negated; this does not raise or lower its degree
Table
IICW
t~~t~Irlt:nts
(Jrror.
(atl
only
W hcri
t’rror
when
ii negative
of’
S’l‘AGCKtZ
inst.arlce
the
OR [c 1 , ~23
Cortlrriissio~i
corrlrrlissiorl),
AND [c 1, ~23
expectatioIl
is
too
though
ctiaract,er-
positive
they
is
‘t’tlis
instarlce
(a11 error
01’ ortlisbion)
is overly
specific:;
to irlc.ludt:
a t110r~‘ general
c.llaract~~rir/,atioll.
of’ error
albo c’auscs
S’I‘AC;G Eli to cxpatld
1,~ proposirlg
lhe
lJl1~
_” surilrriarixes
LN
searcll
b:isctarch
riegatioll
01 a poor characlerixaliorl.
die operators’
precorlditions.
Ta-
2:
Stlarcli
operalor
,c2]
rhtects
atont:
ele~lit~rits
second
f’rolltitar
(iict
tors
ill 11ew c:ttaracleri~itt,ioris.
‘t‘h(a rlorninaliorl
hchuristic
specities
atlerrlative
groups
of’
(,11~lr;L(.t,t’rixatiolis
from
wtlich
to form
compounds.
Af’tcr
~S”I‘A~:( ;lCti has rriadc
afl error
of’ corrilrlissiori,
ch;tractc:riL;it ioils
rrlatc:)ltlcl
iI1 t.his rlchgative
instance
may
IJt:
par1 inlly
t1t’cc5silry,
t)bit dre ctedrty
Iiot sufficit!nl.
Sorric> ett’tllr’llt :, rIlust
h;ivc
suggested
(vid
the
rllatchirlg
t teal 1 his iristarlcta
was likraly
l,o be posilive,
but,
t Ilib itlstallce
wa5 riegativth
, sortie
tlr:c.ess;try
elerrlenl,
step
New
candidate
etecl.iorl
:
rtl;tlchcd
ones.
;it’(’ II r~tt~at~hed
cli5jL1tlc.t
IIO
iorl
hllflic.ic:rlt
IIMVI
IO
If
iri
d
two
art’
two
iziLIiotlb
hcburist
its
w)iicti
apply
S’l’AC;(;l2ti.‘s
ions
art> IIiiltctid
for au terror
tly
similar
rca-
i,.S,
fbrrrling
t~leds
high
new,
dis-
are
f,N(ci)
.c2]
t
A:, I
L/V(c)
>
are
I or I,Y(c)
the
‘I’he
are
t~slat~tisht~ci
search
scdrch
wtlich
or
I j
into
ItlarlIlt’r.
f’rolltictr
<
introduced
(.t~arac.tt~r.izat,iolls
the
1
I
Lsyci)
c:-alld-test
new
rrieasurr
s
as
opera-
tllell
part
eittirr
of
it,.
‘I’0
ii ticw
ctiaract,er-ixatiori
Iilust
be Irlorc
cflkct,ivc
t,han its sporisorilrg
corriporierits.
If ttie 11ew etemerit, surpassc3
a weigtit,
tlirestiotd,
it is estal,)ished
aild
its
c.oIrlporlerlls
iLI’t2 pruud.
IrlkriItl
pdor~rlallce
is
assessed
by
c:xaIrritiirig
recellt
CtliiIlgc’S
ill
its
wtbights.
‘l‘tlese
changes
avoid
htitlg
prulltd,
avcragtvl,
arid
t0
t)t2
un-
if’
this
average
rt~iLI.~liIlg
an
is
itSy~Il[>tOt
t tic> ~tlarac.teri~ation
very
small,
tllc
etcrrlellt
11’ it. is still
(3.
t)cloLc
is pruritd.
elt!Irlents,
alorig
with
elements
u~iwhich
norriiIiatd
present.
whkh
heuristic..
I<teclion
,c2]
gcrlt>riLf
frorrl
appears
wits
il
gcllerate
art’
t)t%c.ausc
in
prurltd
process)
predict,
f’rorn Ihostt
it1 ttlis rlorlt~xarrlplt~.
01’ omissiorl.
‘t’ablc:
irlg
alid
Negation
~lon(~xi~tt~-
ctlarac-Ltar-
S’I‘.AC;GKH
rlit’;thlIreb
siricct
ctlaractcrixal,iorls,
is norrlirlatd
rlorrlill;tlion
forriled,
were
(~harac~lerixat
I Is corrlponeri1,
is
sufficient
<:haract,cri~c,atiorls
illverl,
Ilecessary
riornirialecl
1’~s;~ullc~t.ioil
’
t.his riorlexarnple
(
corrtbirids
ptt3.
riz/,c3
corrlt)itlr:s
ctiarac,t,c~t.i’c,;lt,ioris
1).
thcref’orcl
heuristics.
I+;lt!ctiorl
c,haritc.1c!rixatiorls
thrcs~lold,
Co~ljlIll~~t,iorl
and
corljurlctiorls.
male).
preconditions.
!i’l’.A( ;(; tC:il l’oltows
a t.wo-step
process
of
choosing
good argllrtrt~llts
f;)r the operators;
oti(’ sel of’ hrlurist,ics
~~orfli~~ute~
potc:ril iat argurnc~nts,
iiud 2~ sccortd
set, elects t hct rrlosl, prth-
tttdLch(d.
evidenct!,
for
NOTIc]
so trldt(.timl
is
Table
to
jurlctive
c~lara<,teri~at,iolls.
New negateci
~tlar,a~t,erizatiorls
are elecl,cd
equally
by tmt h measures.
‘l’ablc 4 summarizes
OR[cl
irrctusiori
brother
woightirlg
Iiieasurt~,
to be used in
Function
for
(a
(refer
irlfirrrlirlg
1,tre converse
ctlaractclrixatic)tls
AND[cl
011es
J
evidence
tlegative
criteria1
soniIig,
scmirlg
occur
irlfirrrlirlg
‘I‘a1,le
ib(l
Ma~chcci
IJnluat,cld
Ilrllllatctled
sorrletirrles
rlegative
tolerates
these
‘t’able
OR[cl
NOT[c]
l~rlIllalctletf
and a mate.
The two charact~~iil;at,iorls
(parent
arid
are always
matched
iri a posilive
instance
(hthcr)
il~dlc)
gerlerat.
specific
a
parerit
to be posit,ivo
cxpcctation
‘l‘hub h(1im.h is c:xparltied
toward
a more
iLnliol1.
On the other
hand,
a guess
that
is Iltlgdtive
is (~xp;~t~dd
t tl(sr type
an
Urltllatctled,
Matched
Matched,
Matched,
NOT[cj
t)e
opralors
art:
by proposing
makes
is predicted
hrurisl,ic..
1
-1
ot’ generality.
‘t‘tltl coiijunctiou,
disjuIlct,ion,
a.nd rlegatioll
;ipplied
exhaustively;
search
is limited
Nomination
a
OIuissiun
riot
3:
worse
potlt~rlts
is
did
wilt
Irigger
indicate
than
are
the
wtlerl
it
react,ivaletl
~Iloves
t,licl opposil,(J
rlew
wab
amou~lts
Lhrollgll
order
f’rotn
who11
(,he
wtlighting
chariic’leri~,atioll
is
t5tat)lished.
arid
This
More.
kailse
backtracking
t,tiat
cortlpett:
Its
ab
tflts
to chro~lologic.al
t)icl st’drch
space
which
tllchy wtlrtf
pthrf’ortn-
pruntd
f’aititlg
cortlett~Irlent
backtracking
art’ retracttd
~~roposetl.
iii
Sirrlilar
3 surn~na-
heuristics.

IV Tracking concept drift

An important feature of a learning mechanism is its responsiveness to changes in the environment. For instance, a fox learns to look for a changed coat color in his prey as the seasons change.
First, the learner must distinguish between randomness and genuine change. For a failed expectation, the question arises as to whether it was simply a noisy instance, and should be tolerated, or whether it indicates that the learned concept has drifted. STAGGER uses the Bayesian weighting measures to distinguish between events that indicate a change in the definition of a concept, and those which are probably the result of noise.

Secondly, does the amount of previous learning about a given concept definition affect subsequent relearning of a new definition? In humans and animals it does. The adage “It’s hard to teach an old dog new tricks” roughly captures a main finding in learning (e.g., Siegel & Domjan, 1971). These studies indicate that the resiliency of learned concept definitions is inversely proportional to the amount of training; briefly trained concepts are more readily abandoned in the face of change than extensively trained ones. Keeping counts of the evidence types in table 1 amounts to retaining a history of association, allowing STAGGER to model resiliency appropriately.
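The effect of retained counts on relearning can be sketched numerically. The example below is an illustration under stated assumptions, not the paper's procedure: it starts every count at one (to avoid division by zero), adds confirming evidence during training, then adds the same small amount of infirming evidence after a hypothetical change, and reports the resulting LS using the standard likelihood-ratio definition.

```python
def ls_after_drift(training_steps, drift_steps):
    """LS of a characterization after training and then contradictory evidence.

    Counts start at 1 in every cell (an assumption of this sketch).
    Each training step adds one confirming positive and one confirming negative;
    each drift step adds one infirming positive and one infirming negative.
    """
    conf_pos = conf_neg = 1 + training_steps
    inf_pos = inf_neg = 1 + drift_steps
    positives = conf_pos + inf_pos
    negatives = inf_neg + conf_neg
    # LS = p(matched | pos) / p(matched | neg)
    return (conf_pos * negatives) / (inf_neg * positives)


brief = ls_after_drift(training_steps=5, drift_steps=4)
extensive = ls_after_drift(training_steps=50, drift_steps=4)
print(brief, extensive)
```

After the same four contradictory steps, the briefly trained weight sits much closer to unity than the extensively trained one, so a briefly trained characterization would lose its influence sooner.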
[Figure 2: Tracking concept drift. Axes: % correctly classified versus instances processed.]
Figure 2 depicts STAGGER's performance on three successive definitions for the same concept: (1) color red and shape squarish, (2) size small or shape circular, (3) color (blue or green). The dashed vertical lines indicate when the definition of the concept was changed. Notice how performance falls immediately following the change because the previously acquired definition was not sufficient to characterize new, changed instances.
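The drifting task described for figure 2 can be sketched as a stream of labelled instances whose labelling rule changes at fixed points. The attribute domains, the number of instances per definition, and all names below are assumptions for illustration; only the three concept definitions come from the text.

```python
import random

# Hypothetical attribute domains; the text names only some of these values.
DOMAIN = {
    "size": ["small", "medium", "large"],
    "color": ["red", "green", "blue"],
    "shape": ["square", "circular", "triangular"],
}

# The three successive definitions described for figure 2.
CONCEPTS = [
    lambda x: x["color"] == "red" and x["shape"] == "square",
    lambda x: x["size"] == "small" or x["shape"] == "circular",
    lambda x: x["color"] in ("blue", "green"),
]


def drifting_stream(instances_per_concept, seed=0):
    """Yield (instance, label) pairs; the labelling concept changes over time."""
    rng = random.Random(seed)
    for concept in CONCEPTS:
        for _ in range(instances_per_concept):
            instance = {name: rng.choice(values) for name, values in DOMAIN.items()}
            yield instance, concept(instance)


stream = list(drifting_stream(instances_per_concept=40))
print(len(stream))  # 120
```

A learner evaluated on such a stream would be expected to lose accuracy right after each change point and then recover, as the figure's description indicates.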
III each of llte three
cases
STAGGl5lZ
f’or~r~c:tl t,he explicit,,
cts\)(‘s
clc:fillition
and
011 I I1t1 3bart.h
f’roIit,ier.
S’ilAc ;(; EK
of
symbolic
evaluated
addresses
tile
l,hroIlgll
t,he use of its
/,,Y itldicale
a change
1 rigg:tllbacktracking
tIculd,
Illore
01’ the
1 II(~ rr~odifit:ation
of
S’l‘nc I(; tq;le’s acquisition
represerltation
it as the best,
noise
versus
change
weighl,irrg
measures.
iri 1,tie type
of noise
011
Figure
red
other
lead
to
3 depicts
or size
sh ctrarac.l.ttri~atiorl
as in figure 2. After the dashed vertical line, positive instances were subjected to 25% negative infirming, systematic noise. That is, 25% of the positive instances were randomly assigned to either the positive or negative class; a situation similar to the leaky rain gauge. Notice that unlike figure 2, performance is not adversely affected, indicating that STAGGER is correctly distinguishing between noise and concept change.
counts
of
siluatiori
Figure
issue
Lht:
non
l,ypt:s,
---.-
25
INSTANCES
When
/,S and
prewnt,
they
as explained
above.
sanle
type
of’ rloise
does
t~harat~t,erixalior~s.
of’ t,ht color
AI_---
of the COIIamong
tllose
squari
vcsrl
giLlg;(l.
I
’
it is in effect keeping an abbreviated history of the correlation between a characterization and a concept definition. This allows the program to model the effects of varying amounts of previous learning on relearning resiliency at a gross level. Contrast figure 4, in which the program was given more than four times the amount of training for each concept before each change than in figure 2. Notice that the recovery learning is considerably faster (higher resiliency) in the minimal training case (figure 2). In short, the heuristic demonstrated here is that briefly trained concepts are less likely to be stable and should therefore be abandoned more quickly in the face of change. On the other hand, extensively trained concepts are more stable and have a longer history of past success; they should be less resilient in the face of new evidence. Psychological studies indicate that natural learning mechanisms behave in this manner (Siegel & Domjan, 1971).

[Figure 3: 25% systematic noise.]
[Figure 4 axes: % correctly classified versus instances processed.]
Figure
V
4:
‘l’rackittg
conc.ept.
over titrie.
rtt<:asures
I)~~t.wet~tt ttoise attd gertuitte
rt~cbrical
histories
of evettts,
ovr~ri raitbitig
itlg rttottiods
t~urtherrnore,
affords
the
concept
STAGGER
seen in psyc:hological
employed
in ~‘I’AGGEI~
the
proper
rcquirc~s
feedback,
as all
unable
given
overtraining.
drift.
fly
models
retairtitrg
nuthe efl’ec ts of’
experittlents.
are far frorn
‘I‘lte leartia COIIl~Jlek
concept
attaitment
to conceptually
systctrts
cluster
I)utla,
l-t.., Gaschllig,
tlie l’rospecbor
In I). Michie
uyc,
b2illl)urgll:
its
bngley,
irig.
tion
system
in part
by t,ho
NO001 4-X4-K-(139
I~ourrtiat,iolI
OHice
1 artd
UII-
tltir grdtlts
1YrL‘-81-20~iH5
and lS’l’-85124 19, ttte Arttty
IIest1arc.h Ittstitute
urtdctgrant
Ml)A<303-85-(:-0:1’L4,
at~d by
I tit> Nnval
Ocean
Syst.errrs
(:(:tt ler under
contract
NCiCiOO IH3- ( :-0255.
We would
likt: to thank
Michal
Young
who
wd> itlvolved
in the early
fortrlulalion
of these
ideas,
Ross
LI1eir
vigorous
discussions
a tiat,urat
mac.hitict
arid
exletisiotl
learttittg
cotlsistettl
I’rcss.
OJ ltarniny
(1982).
1 K, 203
theory
of cfis~.rtrtlitlatiutl
& It. Nec~hus (I~>cls.),
und
( :eneralizat,iori
226.
learnProduc-
development.
as search.
Arfificial
do,
J. li. (1986).
Ii. S. Illictlalski,
Muchine
lcurnlny:
UW~C II.
I,os Alt,os,
t’rb,
for suggestitig
and Ihe etitire
Ilniversity
inputs.
Acknowledgements
Quitilitrt
j~roc~~ss,
A getteral
1’. I,angley,
models
MiLchelI,
‘1‘. M.
Intelliyencr,
III
Science
l’hlirll)llrgll
I’. (in press).
11) II. litahr,
Qurrhl,
‘I‘lli>
research
was supporl.ed
01‘ Naval
Research
uttder
grants
NO001 l-X5-K-0854, l.he N;Ll,iollal
J., & llart,,
k’. (1979).
Motir:l design
in
consull,anL
system
for tlliner;tl
exploratiott.
(I’ll.),
I!‘rpert
sya!ems
,in tht rt~icro electronic
use of ttte
distittct.iort
5olut iotl t,o the prot~tt:rris
of leartting
it1 complex,
reactive
c~tlvit‘ot~trlettts.
So far,
it, is littlited
t,o learning
tboleatt
c~otrtt)itlat,iotis
of attribute
values
and cannot,
acquire
rela1iorl;tl
descriptions
of structured
objects.
STAG(:I*~R
also
;tt~cl is tlterefore
500
References
Conclusions

STAGGER is an incremental learning method which tolerates systematic noise and concept drift. It begins with simple characterizations and learns complex characterizations by conducting a middle-out beam search through the space of possible conjunctive, disjunctive, and negated characterizations. Backtracking allows tracking changes in definitions
weighting
drift
c.ottc.chjjt,
lbyc~siatt
400
PROCESSED
to the rrtatchittg
group
at. lrvitte
for
eticouragernertt.
‘i’t if2 effect uf tloise 011 colicxpt
leartlltlg.
J C. Cart~onclt,
& ‘I‘. hl MiLchelI
(Us.),
An urtijiciul
irltrlliyerlcr
uyyroach,
uol(:;Jifortlta:
Morgau
Kaufr~~ar~rl
I’ublish-
IIIU.
Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66, 1-5.
Schlimmer, J. C. (1986). A note on correlational measures (Technical report #86-13). Irvine, California: The University of California, Department of Information and Computer Science.
Siegel, S., & Domjan, M. (1971). Backward conditioning as an inhibitory procedure. Learning and Motivation, 2, 1-11.