From: AAAI-86 Proceedings. Copyright ©1986, AAAI (www.aaai.org). All rights reserved.

Beyond incremental processing: Tracking concept drift

Jeffrey C. Schlimmer and Richard H. Granger, Jr.
Department of Information and Computer Science
University of California, Irvine 92717
ArpaNet: Schlimmer@ICS.UCI.EDU, Granger@ICS.UCI.EDU

Abstract

Learning in complex, changing environments requires methods that are able to tolerate noise (less than perfect feedback) and drift (concepts that change over time). These two aspects of complex environments interact with each other: when some particular learned predictor fails to correctly predict the expected outcome (or when the outcome occurs without having been preceded by the learned predictor), a learner must be able to determine whether the situation is an instance of noise or an indication that the concept is beginning to drift. We present a learning method that is able to learn complex Boolean characterizations while tolerating noise and drift. An analysis of the algorithm illustrates why it has these desirable behaviors, and empirical results from an implementation (called STAGGER) are presented to show its ability to track changing concepts over time.

I Introduction

Sometimes a low barometer reading indicates rain and sometimes it doesn't. Furthermore, for months after a volcanic eruption, previously good indicators of rain may become poor predictors, while other (previously poor) indicators may become predictive. Attempting to learn from experience about associations like these in the real world is confounded by the fact that (a) most associations are not perfectly consistent (hence observed instances of these associations contain "noise"), and (b) associations change or drift over time. Learning in these environments is difficult because these two aspects interact: when a previously good indicator fails to predict a particular outcome, the learner must decide whether this is just noise or an indication that the concept is beginning to drift.

Nature has solved this problem in humans and animals: subjects in classical conditioning experiments are able to tolerate noise and drift, even in extremely complex environments with competing cues. However, few current machine learning systems are able to tolerate noise and drift, and hence cannot deal with complex reactive environments containing these qualities.

We present a learning method that tolerates noise and drift, and we offer an analytical account of why it behaves as well as it does. The method is able to keep track of, and hence distinguish between, different types of noisy instances. Via formulas based on Bayesian statistics, it tolerates systematic noise, but not random noise, distinguishes between noise and drift, and is able to track changing concepts over time. We have implemented this method in a computer program called STAGGER, and have tested it in a variety of environments, ranging from animal learning tasks to blocksworlds to chess endgames. We present some empirical findings reflecting the program's ability to track drifting concepts.

II Related work

Many successful learning systems have failed to deal with the issue of concept drift over time. Quinlan's ID3 program (1986), for example, constructs a discrimination tree to characterize instances of a concept. This representation allows conjunctive, disjunctive, and negated characterizations. Quinlan has examined the ability of this method to accommodate varying levels of noise, concluding that its performance is close to optimal (Quinlan, 1986). However, the method is nonincremental, for it requires examining (and re-examining) a relatively large number of instances, and it does not have mechanisms for modifying an existing tree to incorporate new instances. It is unable, therefore, to track changes in concept definitions over time.

The incremental nature of a learning algorithm does not guarantee that it will be able to deal with concept drift over time. Mitchell (1982), for example, reports the version space learning method in which an appropriate description of observed instances is formed via a bidirectional search through a space of possibilities. Though a relational representation is utilized, the version space method assumes the strong bias that a conjunctive characterization can accurately capture the concept to be learned. In later work (Mitchell, Utgoff, & Banerji, 1983), a modification was proposed which would form disjunctive descriptions or tolerate limited noise in instances (but not both, interestingly). Though this method is incremental, learned characterizations may not change and recross the search boundaries previously established in the version space as the definition of a concept drifts over time.

Langley's discrimination learning method (in press) is able to track changes in a concept definition over time. The learned concepts are expressed as a set of production rules, one of which influences expectation at a time. If the applicability conditions for an operator change, presumably recently learned productions would be weakened via strengthening while discrimination would propose new ones. Eventually, the new productions would be strengthened and overwhelm the old ones. Because this method is based on a strengthening evaluation function, however, it does not distinguish between types of noise.

III A new learning method

The heart of STAGGER's learning method is a distributed concept representation composed of a set of dually weighted, symbolic characterizations. As each new instance is processed, a cumulative expectation of its identity is formed by using the pair of weights associated with each characterization. Learning occurs at two levels: adjustment of the weights and generation of new Boolean characterizations. This latter process constructs more general, more specific, and inverted versions of existing elements. These new characterizations compete for inclusion in the concept description with the elements that were combined to form them.

STAGGER's concept description is represented as a set of dually weighted, symbolic characterizations. Each element of the concept description is a Boolean function of attribute-value pairs represented by a disjunct of conjuncts. An example element matching either small blue figures or square ones would be represented as (size small and color blue) or shape square. These characterizations are dually weighted in order to capture positive and negative implication. One weight represents the sufficiency of a characterization for prediction, (matched ⇒ pos), and the other represents its necessity, (¬matched ⇒ neg).

The mathematical measures chosen for the sufficiency and necessity weights are based on psychological results. In a classical conditioning experiment, a subject learns an association given repeated presentations of a novel cue (NC) followed by an unpleasant stimulus (US). After extensive testing, Rescorla (1968) formulated the contingency law, which states that subjects will learn an association between the two events only if the unpleasant stimulus is more likely following the novel cue than without it, or $p(US \mid NC) > p(US \mid \neg NC)$. In behavioral terms, this means that, if one stimulus or the other frequently occurs alone, the subject still learns an association between the two cues. However, if each of the stimuli occurs alone even a small number of times, their association is severely impaired.

Real-world tasks also contain spurious events: descriptions of instances may be subject to either random or systematic variation. An example of random noise would be a temperature sensor which is accurate to within 10% of its operating range. It may read too high on one occasion and too low on another; the direction of its error is random. Only a few authors have dealt with this possibility (e.g., Quinlan, 1986). However, it may often be the case that errors in description are the result of a systematic variation. For example, a rain gauge may leak and sometimes read lower, but never higher, than it should. The errors of this latter instrument are systematically of one type (only too low), though they may occur with an unpredictable frequency. The contingency law states that learning occurs in cases of systematic variation but is dubious in situations with random variation.

With this in mind, STAGGER uses logical sufficiency (LS), or positive likelihood ratio, as a measure of sufficiency (Duda, Gaschnig, & Hart, 1979). Similarly, logical necessity (LN), or negative likelihood ratio, serves to measure necessity. They are defined as:

\[ LS = \frac{p(\mathit{matched} \mid \mathit{pos})}{p(\mathit{matched} \mid \mathit{neg})}, \qquad LN = \frac{p(\neg\mathit{matched} \mid \mathit{pos})}{p(\neg\mathit{matched} \mid \mathit{neg})} \]

LS ranges from zero to positive infinity and is interpreted in terms of odds. (Odds may be easily converted to probability: $p = \mathit{odds}/(1 + \mathit{odds})$.) An LS value less than unity indicates a negative correlation, unity indicates independence, and a value greater than unity indicates a positive correlation. LN also takes on values from zero to positive infinity. However, an LN value near zero indicates a positive correlation, and a value greater than unity indicates negative correlation. For both LS and LN, unity indicates irrelevance. The LS and LN measures adhere to the contingency law, for it can be shown via algebraic manipulations that $LS > 1$ and $LN < 1$ if and only if $p(US \mid NC) > p(US \mid \neg NC)$ (Schlimmer, 1986).

Given the list of attribute-value pairs describing an instance, the distributed concept representation as a whole influences expectation of a positive or negative instance. Following the mechanism used by Duda, Gaschnig, and Hart (1979), the dual weights associated with each characterization are used together with estimated prior odds to calculate the odds that a given instance is positive. Expectation is the product of the prior odds of a positive instance, the LS values of all matched characterizations, and the LN values of all unmatched ones:

\[ \mathit{Odds}(\mathit{pos} \mid \mathit{instance}) = \mathit{Odds}(\mathit{pos}) \times \prod_{\forall\,\mathit{matched}} LS \times \prod_{\forall\,\neg\mathit{matched}} LN \]

The resulting number represents the odds in favor of a positive instance.
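The expectation computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not STAGGER's actual code: the names `expected_odds`, `odds_to_probability`, and `concept`, the dictionary encoding of instances, and the numeric weights are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, str]
# (match predicate, LS weight, LN weight); an illustrative encoding only
Characterization = Tuple[Callable[[Instance], bool], float, float]


def expected_odds(prior_odds: float,
                  characterizations: List[Characterization],
                  instance: Instance) -> float:
    """Multiply the prior odds by LS for each matched characterization
    and by LN for each unmatched one."""
    odds = prior_odds
    for matches, ls, ln in characterizations:
        odds *= ls if matches(instance) else ln
    return odds


def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
concept = [
    (lambda x: x["color"] == "red", 4.0, 0.25),
    (lambda x: x["size"] == "small", 2.0, 0.5),
]
instance = {"color": "red", "size": "large"}

# prior 1.0 * LS 4.0 (color matched) * LN 0.5 (size unmatched)
odds = expected_odds(1.0, concept, instance)
print(odds, odds_to_probability(odds))
```

Because every characterization contributes a factor, no single element decides the prediction; a strongly sufficient matched element can be offset by an unmatched necessary one.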
This holistic approach differs from most machine learning systems, in which a single characterization completely influences concept prediction. The prior odds for a positive instance are easily estimated as $(C_P + I_P)/(I_N + C_N)$, using the event counts defined in Table 1.

If STAGGER were limited to adjustment of the weights, the distributed concept representation would be sufficient to accurately describe the class of "linearly separable" concepts (Hampson & Kibler, 1983). In this respect STAGGER is similar to connectionist models of learning when those models do not have any "hidden" units. The purpose of the hidden, internal units is to allow the encoding of more complicated concepts. Search processes in STAGGER serve an analogous purpose: individual elements are combined into more complex Boolean functions.

In addition to representing concepts in a distributed manner and using Bayesian measures to compute a holistic expectation, STAGGER incrementally modifies both the weights associated with individual characterizations and the structure of the characterizations themselves. These two latter abilities allow STAGGER to adapt its concept description to better reflect the concept.

The sufficiency and necessity weights associated with each of the concept description elements may be easily adjusted. Consider the possible situations that may arise when matching a characterization against an instance. Following the terminology used by Bruner, Goodnow, and Austin (1956), a positive instance is positive evidence which may either confirm the predictiveness of a characterization (if it is matched in this instance) or infirm the characterization's predictiveness (if it is unmatched). Similarly, a negative instance is negative evidence which either confirms an unmatched element or infirms a matched one. Table 1 summarizes these possibilities.

Table 1: Possible situations in matching a characterization to an instance.

                    Characterization
    Instance        Matched      Unmatched
    Positive        $C_P$        $I_P$
    Negative        $I_N$        $C_N$

In terms of the situations listed in Table 1, the contingency law implies that learning occurs in situations involving at most one type of infirming evidence. In cases with even small amounts of both positive and negative infirming evidence, subjects fail to learn an association. The corresponding definition of systematic variation is the presence of only one type of infirming evidence; random variation is defined as the presence of both types. The weighting measures LS and LN may be easily calculated by keeping counts of the events for each situation in Table 1. Substituting count-based estimates of the conditional probabilities into the definitions of LS and LN gives:

\[ LS = \frac{C_P\,(I_N + C_N)}{I_N\,(C_P + I_P)}, \qquad LN = \frac{I_P\,(I_N + C_N)}{C_N\,(C_P + I_P)} \]

STAGGER searches through a space of possible characterizations as it refines its initial distributed representation of the concept into a unified, accurate one. Each possible Boolean characterization of attribute-value pairs may be viewed as a node in the space of all such functions. Figure 1 depicts a small portion of this space over a simple domain (each ellipse represents a Boolean function). Any two of the possible Boolean functions are partially ordered along a dimension of generality (Mitchell, 1982).

Figure 1: Partial characterization space.

STAGGER's initial concept description consists of the simple characterizations in the middle of Figure 1, each with initially unbiased weights. Notice that this space is more than twice the size of that typically searched by a conjunction-only method like version space (Mitchell, 1982). Another interesting difference between the searched spaces is that the version space method searches from both sides toward the middle; STAGGER beam-searches from the simplest points in the middle outward toward the boundaries.
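The count-based weight estimates tied to Table 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `Counts`, the field names, and in particular the handling of zero denominators (an unbiased weight of 1.0 when there is no evidence at all, infinity otherwise) are assumptions of the sketch.

```python
from dataclasses import dataclass


def _ratio(num: int, den: int) -> float:
    # Assumption of this sketch: no evidence at all yields an unbiased
    # weight of 1.0; a zero denominator with evidence yields infinite odds.
    if den == 0:
        return 1.0 if num == 0 else float("inf")
    return num / den


@dataclass
class Counts:
    cp: int = 0   # positive instance, characterization matched   (confirming)
    ip: int = 0   # positive instance, characterization unmatched (infirming)
    in_: int = 0  # negative instance, characterization matched   (infirming)
    cn: int = 0   # negative instance, characterization unmatched (confirming)

    def update(self, matched: bool, positive: bool) -> None:
        """Record one instance in the appropriate cell of Table 1."""
        if positive:
            if matched:
                self.cp += 1
            else:
                self.ip += 1
        elif matched:
            self.in_ += 1
        else:
            self.cn += 1

    def ls(self) -> float:
        """LS = p(matched | pos) / p(matched | neg), estimated from counts."""
        return _ratio(self.cp * (self.in_ + self.cn),
                      self.in_ * (self.cp + self.ip))

    def ln(self) -> float:
        """LN = p(unmatched | pos) / p(unmatched | neg), estimated from counts."""
        return _ratio(self.ip * (self.in_ + self.cn),
                      self.cn * (self.cp + self.ip))

    def prior_odds(self) -> float:
        """Prior odds of a positive instance: (C_P + I_P) / (I_N + C_N)."""
        return _ratio(self.cp + self.ip, self.in_ + self.cn)


counts = Counts()
for matched, positive, n in [(True, True, 8), (False, True, 2),
                             (True, False, 1), (False, False, 9)]:
    for _ in range(n):
        counts.update(matched, positive)
print(counts.ls(), counts.ln(), counts.prior_odds())
```

Keeping four integers per characterization is enough to recompute both weights after every instance, which is what makes the weight adjustment incremental.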
STAGGER's three search operators correspond to specializing, generalizing, or inverting characterizations. To make a concept description element more specific, search proceeds down a conjunctive path. Conversely, to make a more general element, search proceeds to a new disjunction. Lastly, a poorly scoring characterization may be negated; this does not raise or lower its degree of generality.

The conjunction, disjunction, and negation operators are not applied exhaustively; search is limited by proposing new elements only when STAGGER makes an expectation error. When a negative instance is predicted to be positive (an error of commission), the expectation is too general. Thus search is expanded toward a more specific characterization. On the other hand, a guess that a positive instance is negative (an error of omission) is overly specific; search is expanded to include a more general characterization. Either type of error also causes STAGGER to expand search by proposing the negation of a poor characterization. Table 2 summarizes the search operators' preconditions.

Table 2: Search operator preconditions.

    Error         Operators proposed
    Commission    AND[c1, c2], NOT[c]
    Omission      OR[c1, c2], NOT[c]

STAGGER follows a two-step process of choosing good arguments for the operators; one set of heuristics nominates potential arguments, and a second set elects the most predictive ones.

The nomination heuristics specify alternative groups of characterizations from which to form compounds. After STAGGER has made an error of commission, characterizations matched in this negative instance may be partially necessary, but are clearly not sufficient. Some element must have suggested (via the matching process) that this instance was likely to be positive, but this instance was negative; some necessary element must therefore have been unmatched in this nonexample. Conjunction thus combines characterizations matched in the instance with unmatched ones. For example, consider the concept of a father: a parent and a male. The two characterizations (parent and male) are always matched in a positive instance (father). By similar reasoning, after an error of omission no sufficient element was matched, and disjunction is nominated to form a new, more general characterization. Table 3 summarizes the nomination heuristics.

Table 3: Nomination heuristics.

The election heuristics apply STAGGER's weighting measures to the nominated characterizations. Since the components of a conjunction should be necessary, characterizations are elected for conjunction by the necessity measure, LN. Disjunction, by similar reasoning, needs components with high LS, forming new, disjunctive characterizations. New negated characterizations are elected equally by both measures. Table 4 summarizes the election heuristics.

Table 4: Election heuristics.

    Function       Criterion
    AND[c1, c2]    LN(ci) indicates necessity
    OR[c1, c2]     LS(ci) indicates sufficiency
    NOT[c]         LN(c) > 1 or LS(c) < 1

New characterizations are introduced into the search frontier in a generate-and-test manner: each is either established as part of the concept description or pruned. To avoid being pruned, a new characterization must be more effective than its sponsoring components. If the new element surpasses a weight threshold, it is established and its components are pruned. Inferior performance is assessed by examining recent changes in its weights. These changes are averaged, and if this average is very small, the characterization is taken to be reaching an asymptote. If it is still below the threshold, the element is pruned.

Backtracking is triggered when the weighting measures indicate that an established characterization is performing worse than its components did when it was established. The failing element is retracted, and its pruned components are reactivated to compete as before. This amounts to chronological backtracking: search moves back through the search space in the opposite order from which elements were proposed.
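The error-driven proposal of new characterizations can be sketched as below. This is a loose illustration under several assumptions rather than the paper's exact heuristics: the function `propose`, the `Element` record, the fixed cutoff of 1.0 for "necessary" and "sufficient", and the rule of pairing one matched with one unmatched element for conjunction are all choices made for this sketch, and the example weights are invented.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Element(NamedTuple):
    name: str
    matched: bool   # whether the element matched the current instance
    ls: float       # sufficiency weight
    ln: float       # necessity weight


def propose(error: str, elements: List[Element]) -> List[Tuple[str, ...]]:
    """Return candidate compounds for one expectation error."""
    proposals: List[Tuple[str, ...]] = []
    if error == "commission":
        # Too general: specialize by conjoining necessary-looking elements,
        # one matched and one unmatched in this negative instance.
        necessary = [e for e in elements if e.ln < 1.0]
        for a, b in combinations(necessary, 2):
            if a.matched != b.matched:
                proposals.append(("AND", a.name, b.name))
    elif error == "omission":
        # Too specific: generalize by disjoining sufficient-looking elements.
        sufficient = [e for e in elements if e.ls > 1.0]
        for a, b in combinations(sufficient, 2):
            proposals.append(("OR", a.name, b.name))
    else:
        raise ValueError("error must be 'commission' or 'omission'")
    # Either error: propose negating poorly scoring elements.
    for e in elements:
        if e.ln > 1.0 or e.ls < 1.0:
            proposals.append(("NOT", e.name))
    return proposals


# Hypothetical weights for the father example (parent and male).
elements = [
    Element("parent", True, 1.5, 0.1),
    Element("male", False, 1.5, 0.1),
    Element("tall", True, 0.5, 1.8),
]
print(propose("commission", elements))
print(propose("omission", elements))
```

Proposing compounds only on errors keeps the frontier small; the establishment and pruning tests described above would then decide which proposals survive.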
IV Tracking concept drift

An important feature for a learning mechanism is its responsiveness to changes in the environment. For instance, a fox learns to look for its prey in a changed coat color as the seasons change.

First, the learner must distinguish between randomness and genuine change. For a failed expectation, the question arises as to whether it was simply a noisy instance, and should be tolerated, or whether it indicates that the learned concept has drifted. STAGGER uses the Bayesian weighting measures to distinguish between events that indicate a change in the definition of a concept and those which are probably the result of noise.

Secondly, does the amount of previous learning about a given concept definition affect subsequent relearning of a new definition? In humans and animals it does. The adage "It's hard to teach an old dog new tricks" roughly captures a main finding in learning (e.g., Siegel & Domjan, 1971). These studies indicate that the resiliency of learned concept definitions is inversely proportional to the amount of training; briefly trained concepts are more readily abandoned in the face of change than extensively trained ones. Keeping counts of the evidence types in Table 1 amounts to retaining a history of association, allowing STAGGER to model resiliency appropriately.

Figure 2: Tracking concept drift. (Axes: instances processed, 0 to 100, versus % correctly classified.)

Figure 2 depicts STAGGER's performance on three successive definitions for the same concept: (1) color red and shape squarish, (2) size small or shape circular, and (3) color (blue or green). The dashed vertical lines indicate when the definition of the concept was changed. Notice how performance falls immediately following the change because the previously acquired definition was not sufficient to characterize new, changed instances. In each of the three cases STAGGER formed the explicit, symbolic representation of the concept definition and evaluated it as the best among those on the search frontier.

STAGGER addresses the issue of noise versus change through the use of its LS and LN weighting measures. When the LS and LN counts indicate a change in the types of noise present, they trigger backtracking, as explained above. More of the same type of noise does not lead to the modification of STAGGER's characterizations.

Figure 3: 25% systematic noise. (Axes: instances processed versus % correctly classified.)

Figure 3 depicts performance on the first characterization used in Figure 2. After the dashed vertical line, positive instances were subjected to 25% negative infirming, systematic noise. That is, 25% of the positive instances were randomly assigned to either the positive or negative class, a situation similar to the leaky rain gauge. Notice that unlike Figure 2, performance is not adversely affected, indicating that STAGGER is correctly distinguishing between noise and concept change.

Because STAGGER retains counts of situation types, it is in effect keeping an abbreviated history of the correlation between a characterization and a concept definition. This allows the program to model, at a gross level, the effects of varying amounts of previous learning on relearning, that is, resiliency. Contrast Figure 2 with Figure 4, in which the program was given more than four times the amount of training for each concept before each change than in Figure 2. Notice that the recovery learning is considerably faster (higher resiliency) in the minimal training case (Figure 2). In short, the heuristic demonstrated here is that briefly trained concepts are less likely to be stable and should therefore be abandoned more quickly in the face of change. On the other hand, extensively trained concepts are more stable and have a longer history of past success; they should be less resilient in the face of new evidence. Psychological studies indicate that natural learning mechanisms behave in this manner (Siegel & Domjan, 1971).
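The link between training history and resiliency can be illustrated with the Table 1 counts alone. The sketch below is a deliberately idealized assumption, not a reproduction of the reported experiments: a characterization is perfectly predictive during training, then perfectly anti-predictive after the concept drifts, and the hypothetical function `contrary_pairs_to_neutral` counts how much contrary evidence is needed before its LS weight falls to unity (irrelevance). Backtracking and search are ignored.

```python
def contrary_pairs_to_neutral(n_training: int) -> int:
    """Pairs of contrary instances needed before LS drops to 1 or below.

    Training: n positive instances matched (C_P) and n negative instances
    unmatched (C_N). After drift, each pair adds one positive instance that
    is unmatched (I_P) and one negative instance that is matched (I_N).
    """
    cp = cn = n_training
    ip = in_ = 0
    pairs = 0
    while True:
        ip += 1
        in_ += 1
        pairs += 1
        # LS = C_P (I_N + C_N) / (I_N (C_P + I_P)); the test LS <= 1 is
        # done in integers to avoid floating-point comparison issues.
        if cp * (in_ + cn) <= in_ * (cp + ip):
            return pairs


for n in (5, 20, 80):
    print(n, contrary_pairs_to_neutral(n))
```

Under these assumptions LS equals n/k after k contrary pairs, so the evidence needed to neutralize a weight grows in direct proportion to the amount of earlier training, which is the qualitative behavior described above.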
Figure 4: Tracking concept drift given overtraining. (Axes: instances processed, 0 to 500, versus % correctly classified.)

V Conclusions

STAGGER is an incremental learning method which tolerates systematic noise and concept drift. It begins with simple characterizations and learns complex characterizations by conducting a middle-out beam search through the space of possible conjunctive, disjunctive, and negated characterizations. Backtracking allows tracking changes in concept definitions over time. Its Bayesian weighting measures afford the proper distinction between noise and genuine concept drift. Furthermore, by retaining numerical histories of events, STAGGER models the effects of overtraining seen in psychological experiments.

The learning methods employed in STAGGER are far from a complete solution to the problems of learning in complex, reactive environments. So far, it is limited to learning Boolean combinations of attribute values and cannot acquire relational descriptions of structured objects. STAGGER also requires feedback, as all concept attainment systems do, and is therefore unable to conceptually cluster inputs.

Acknowledgements

This research was supported in part by the Office of Naval Research under grants N00014-84-K-0391 and N00014-85-K-0854, the National Science Foundation under grants IST-81-20685 and IST-85-12419, the Army Research Institute under grant MDA903-85-C-0324, and by the Naval Ocean Systems Center under contract N66001-83-C-0255. We would like to thank Michal Young, who was involved in the early formulation of these ideas, Ross Quinlan for suggesting a natural extension to the matching process, and the entire machine learning group at Irvine for their vigorous discussions and consistent encouragement.

References

Duda, R., Gaschnig, J., & Hart, P. (1979). Model design in the Prospector consultant system for mineral exploration. In D. Michie (Ed.), Expert systems in the micro electronic age. Edinburgh: Edinburgh University Press.

Langley, P. (in press). A general theory of discrimination learning. In D. Klahr, P. Langley, & R. Neches (Eds.), Production system models of learning and development. Cambridge, MA: MIT Press.

Mitchell, T. M. (1982). Generalization as search. Artificial Intelligence, 18, 203-226.

Quinlan, J. R. (1986). The effect of noise on concept learning. In R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.), Machine learning: An artificial intelligence approach, volume II. Los Altos, California: Morgan Kaufmann Publishers, Inc.

Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66, 1-5.

Schlimmer, J. C. (1986). A note on correlational measures (Technical report 86-13). Irvine, California: The University of California, Department of Information and Computer Science.

Siegel, S., & Domjan, M. (1971). Backward conditioning as an inhibitory procedure. Learning and Motivation, 2, 1-11.