[...] and converge on [...] as ID3. nonexamples to be acquired. of a measure if good for selecting decision information values ID3 has theoretic best also discrimination trees divide are measure all been designed tree to be obtained. object to determine (sub)set at to accommodate is

Inputs: A decision tree, One instance.
Output: A decision tree.
1. If this instance is positive, increment the total number of positive instances. Otherwise increment the number of negative instances.
2. If all of the instances the decision are positive Compute then return the expected negative If there information scores or the maximal attribute is not the a new tree. a test link from the root the root for every then of updating In ID3, to update tree 1: Pseudo is augmented code [...].

serves to [...] an refine for incremental can effecting an empirical incremental The introduction the analysis instance easily with additional space indicates quality variants where |I| is the and number d is the that, analysis, are during the cost reduced tree number (which of cannot [...] for a total after every instance, over a single object, then the above over two objects, of and without, signifi- Asymptotically, of learning. where the required number rooted at node j of level the number of instances of i. objects For an i, of the decision tree, $n_i$ [...] represents of objects required for all nodes at that level a stable discriminating attribute. A level cannot until previous levels have achieved stability, and all instances seen, the thus $n_i \ge n_{i-1}$. Since ID3 retains number of objects to construct a decision tree of depth d |A| is the of the of incorporating sufficient size, $n_1$, must be seen for ID3 to choose the root attribute whose values best discriminate objects of the environment as a whole.
This is true in the creation of all depth methods techniques, An important computational measure is the number ID3 of instances required to construct an optimal tree. chooses an attribute to form the test for the root based on the information that attribute contains over the observed (from the environment) of A sample of objects instances. to attain stability of instances, introduced most important is presumably term much is $|I|^2$ since greater than the the of attributes. In [...], building [...] level, the number from [...] the number attributes, [...] tree [...] number subtree roots as well, is $n_i$, for the subtree a new [...] exceed ID4. of these For of incremental be significantly the build attribute. If a tree is built expense is incurred which two is to each node of the tree requires that to determine their values for previ- The cost of constructing an entire is attributes, analysis required to construct choice points, or memory a tree scratch. Constructing instances be examined ously unused attributes. value of Go to step 1 with the subtree found by following the link for the root attribute's value in this instance. Table ID4 for all attributes. i. If the maximal attribute is χ² dependent make it the root of this tree. ii. Make Cost B. is no root build sample, in order to choose the root attribute in an optimal manner. However, because ID4 does not store all instances encountered, at the next level it must examine another [...] instances because the first $n_1$ instances are not available for inspection. Con- for each value present in the in- either the number of positive or the information then object score. instances. Compute a representative same [...] instances sequently, the number of instances the tree is the sum of all of the root For each attribute, stance, increment root, or negative tree.
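The count-keeping and scoring steps that the ID4 pseudo code describes (increment positive or negative counts for each attribute value in the instance, then compute expected information scores) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the class name `NodeCounts`, its methods, and the dictionary representation of an instance are all assumptions, the information measure is the usual two-class entropy, and the χ² dependence test is omitted. The sketch treats the attribute with the lowest expected information (equivalently, the largest information gain) as the candidate root.

```python
import math
from collections import defaultdict


def entropy(pos, neg):
    """Information (in bits) needed to classify an instance given class counts."""
    if pos == 0 or neg == 0:
        return 0.0
    total = pos + neg
    p, q = pos / total, neg / total
    return -(p * math.log2(p) + q * math.log2(q))


class NodeCounts:
    """Hypothetical per-node bookkeeping: class counts for every attribute value."""

    def __init__(self):
        self.pos = 0
        self.neg = 0
        # counts[attribute][value] -> [positive count, negative count]
        self.counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def update(self, instance, is_positive):
        """Fold one instance into the counts; the instance itself is not stored."""
        if is_positive:
            self.pos += 1
        else:
            self.neg += 1
        for attribute, value in instance.items():
            self.counts[attribute][value][0 if is_positive else 1] += 1

    def expected_information(self, attribute):
        """Weighted average entropy of the subsets induced by the attribute."""
        total = self.pos + self.neg
        return sum(
            (p + n) / total * entropy(p, n)
            for p, n in self.counts[attribute].values()
        )

    def best_attribute(self):
        """Attribute with the lowest expected information, or None if no data."""
        if not self.counts:
            return None
        return min(self.counts, key=self.expected_information)
```

Because only counts are kept, each update touches every attribute of the instance once, which is what lets an incremental learner avoid retaining the instances themselves.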
Again, assuming must examine the a decision of objects times as shown below tree the is proportional square of the only number to of of objects this to an efficient is substituted into characterization the cost is [...] equation When for ID3 we have rank or the square edge, file, diagonal, type or otherwise) where or otherwise) each (6 attributes), piece (4 attributes). resides There and (i.e., corner, are a total of sixteen attributes, each with three values. Although there are [...] = 2,985,984 objects possible, an exhaustive enumeration For ID4, the number of instances tree is larger: $\sum_i n_i$. Substituting sion for object incorporation yields to an optimal decision this into the expres- that these attributes. Four are which stances total required expense to select [...] then is very likely quired the of the Our objects empirical [...] the subtree number root expensive is probably attribute. than number of in- If ID4. This of instances greater re- than the or [...]. assumed discussion sample of regularity in an environment, we now of and actual are The constructs new only reconstructs been misclassified. instance instance. counts tested tree third variant fourth were randomly is depicted tree 6‘3, instance only has counts of new updates at- is made boards presented formed in figure perform after for each generated were algo- of ID3 scratch updated 164, of version, an in classification positive) decision from is ID4; the variant indi- ID3 version when are an error The same The variants. the of the force tree decision pins in terms A smarter instances when ilar to [...]). variants decision The knight objects is a brute is received. the Finally, tribute first negative and distinct 3,251) actual a new each positive 95,480 only incremental tested.
of the instances a 'representative' a rigorous of objects on the tree tree, has lacking distribution since the decision analysis each ID3 is more case, to construct depth hinges there behaviorally rithm Comparing of the cates (sim- [...] 69% to these four by all of the variations [...] dist-bk-knight analysis. dist-wk-knight IV Consider a board ation the task position, and of classifying task versus safety or loss of the black to move. chess attempts Following attainment knight performance a classifier as a win or loss. a concept king Empirical Figure (1979), we define Quinlan king knight whether and rook or king 1 depicts Given the situ- as determining a white black endgames. to identify results in two a sample a black in the moves board diag/rect other [...] Figure with 2: Decision For the three observations the least over more for number ple picture knight. Boards were randomly (6 attributes every pair generated (in squares) and between of this type), board of pieces (i.e., whether described each pair relationship they lie on there quickly forms an Figure variant but number ranges of from 3 depicts 2 (averaged as a function a rough should not and [...] sim- equated builds more a complete tree, while instances. 111-l requires the on the complete tree after time each instances. is a substantial its the in figure gives consistently 20,000 Classification in terms speed, substantially constructs tree Depth variant ants of the distance pin. tree for f-64. by each ID3 rapidly converging Though algorithms, decision built of learning approximately black of the of instances. ID4 requires pinned knight this decision 11% to the greatest depth correctness. largest, of a safe, to form 50 executions) the with efficient required the average 1: Example for a safe configuration. Figure tree decision effective performance range tree, classification of the three in the each of the for more variants the instances.
efficient vari- (averaged instances. over 50 executions) was measured over 1000 For ID3 and ID4, a 90% effective classification of pieces between the same

[Figure panels: "Depth of Decision Tree" and "Cost per Instance", each plotted against instances processed.]

is formed after performance as few as 100 instances. of 164 [...] 75% correct [...] correct though it [...] first, which is due tests for to the of order al- established learns by JIM. form of the until In figure an four of updating the cant [...] instance. comparisons executions concept number decision depicted. are of comparisons deeper Figures per instance performed) it new decision tree descriptions of comparisons do- in the 4 and for each after each is measured made 5 depict for a typical by to incorporate the cumulative a typical The vertical (among most new instance and of instances the decision fied instance exhibits low, the high, tree from vertical scale [...] of ID3, for bound curve. Ifii which updates ID3 and been of ID4 is greater ID3 (note magnified than the 400 that times). usually the The The low cost it is always [...] while remaining however, bounded expense per instance important ion (i.e., of the ID3 of classifying expense of rebuilding an incorrectly this classi- O(jAl’) is clearly expensive attribute The work reflected curve counts only in results if the test Conclusion learning domains, intensive which gorithm within the machine increased nature performs least four. 0( / II’ x let) expense when and of of ID3 is incorrect. complicated search the function ID4 is number version of the reflects intermittent processed, linear considerably less than the latter's peaks. The fourth variant, IIM, displays the least expense of the three.
It asymptotes to a value as small as ID3 500 but IIF1 has to in II&l is nearly step of the as a function force approach scratch nearly As compared 1% instances cumulative brute curve is encountered. instance the intermediate V when is that of each algorithm The The the and for each by each expensive bound. reveals from [...] efficiency of 1000 reflects accelerating classification cost instance. performance axis made the 50 ID3 reconstructs an instance of this sequence of instances. asymptotic its The expense of processing price over number curve the number execution variant. The 6 the variants displays The cost per slowly. geometrically counting ID3 cost and 1,000 and quickly. attributes value [...] 4: ID3 [...] 750 performance, algorithms decisive Figure it incremen- relatively these off; Thus, these classification important, leaving instances. was achieved with classification tree's construction; 750 level instances, 275 time perfect classification The apparent effective after considerable rithms to achieve [...] after to 500 classification longer classification take The somewhat classification may algorit- good takes methods interest can have to as they may behave in behavior one does in evident. concept is that observations incremental applied more of nonincremental, become observations though, be made process are deficiencies in incremental process point methods the This induction are less meth- observed.

References

Carbonell, J. & Hood, G. (1985). The World Modelers Project: Proceedings of the Third International Machine Learning Workshop (pp. 14-19). Rutgers University.

Knowledge Repair: Evolution Versus Revolution. Proceedings of the Third International Machine Learning Workshop. Rutgers University.

Michalski, R. (1983). A Theory and Methodology of Inductive Learning. Artificial Intelligence, 20, 111-161.

Mitchell, T. (1982). Generalization as Search. Artificial Intelligence, 18, 203-226.

Quinlan, J. R. (1983). Learning Efficient Classification Procedures and Their Application to Chess End Games. In R. Michalski, J. Carbonell, & T. Mitchell (Eds.), Machine Learning: An Artificial Intelligence Approach. Palo Alto, California: Tioga Publishing Company.

Acknowledgements

This research was supported in part by the National Science Foundation under grants IST-81-20685 and IST-85-12419, the Office of Naval Research under grants N00014-84-K-0391, N00014-85-K-0116, N00014-84-K-0345, and N66001-85-C-0235, the Naval Ocean Systems Center under grant N00014-85-K-0154, and by the Army Research Institute under contract MDA903-85-C-0324.

Hume, D. (1985). Learning Concepts in a Complex Robot World. Proceedings of the Third International Machine Learning Workshop (pp. 173-176). Rutgers University.

Simon, H. (1969). The Sciences of the Artificial. Cambridge, Mass.: M.I.T. Press.