Suggestion and Advice

VISION FLASH 43

by

Eugene C. Freuder

Massachusetts Institute of Technology
Artificial Intelligence Laboratory
Robotics Section

March 1973

Abstract

Results of scene analysis, as they are achieved, direct and advise the flow of subsequent processing.

Work reported herein was conducted at the Artificial Intelligence Laboratory, a Massachusetts Institute of Technology research program supported in part by the Advanced Research Projects Agency of the Department of Defense and monitored by the Office of Naval Research under Contract Number N00014-70-A-0362-0005.

Vision flashes are informal papers intended for internal use.

This memo is located in TJ6-able form on file VIS;VF43 >.

This vision flash is only a "progress report". Essentially it makes more accessible two earlier internal memos on the control aspects of my vision work. These are embodied in sections I and II below.

The ideas in section I were presented last fall at a vision group seminar. The details of this section do not match the present state of research. However, the basic ideas of the "suggestion list" control structure should stand out clearly in this initial formulation, and serve as a basis by which to gauge future refinements.

Section II discusses the related issue of "advice", which is not fully appreciated in section I, and briefly reformulates and extends the basic suggestion structure.

Section III hints a bit at the present state of implementation and some of the problems faced.

I think it most essential that critical but presently vague notions like suggestion and advice, first implied in the initial discussions of heterarchy by Minsky and Papert, be given detailed, concrete examination. Hopefully this work is a step in that direction.

I Suggestion

Analysis:

What to do next. Visual processing faces this issue perhaps even more strongly than other problem areas. In most "intelligent processing" problems, the input data, at least, is highly organized.
In fact the more "intelligent" the problem, the more organized this data is likely to be. Even language processing, not to belittle its many difficulties, at least receives its input in an ordered string. If perceived in written form it is clearly organized into words and sentence units. The problem of analyzing a visual scene might be more closely compared to picking out a spoken phrase delivered at a noisy cocktail party.

An enormous amount of information is received in parallel. We must decide what to consider, and in what order to do so.

Conceivably we can examine every potentially interesting segment of the scene, apply all known predicates to it, and compute all known relations between all combinations of picture segments, then see if we can spot familiar patterns in this mass of processed data.

This is a biased account of a description network model matching approach. I need not dwell on its limitations. To discern interesting scene segments and compute predicates and relationships at all may be impossible without using information, as we gather it, to guide further processing. Certainly much unnecessary computation can be spared by informed choices of where to look and what to look for.

Briefly, each processing step should be a function of general knowledge, plus the particular knowledge generated to date about the scene.

Hopefully a control structure can be generated which will make some basic decisions on future processing automatically. The scope of the problem demands that some decisions be made routinely. Upon that basis we can always elaborate.

We propose therefore a simple mechanism for making suggestions about what to do next. Whenever we establish a particular fact about the scene, we note that this fact might be useful for establishing other facts. We suggest trying to establish these other facts in the future. That is, we suggest doing things that we already know something about.
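The mechanism just proposed can be illustrated with a minimal sketch. It is not the memo's implementation: the knowledge table, the part names other than "handle", and the function names are all hypothetical, and general knowledge is reduced to a table of which parts help establish which objects.

```python
from collections import defaultdict

# Hypothetical general knowledge: each object lists parts that help establish it.
GENERAL_KNOWLEDGE = {
    "hammer": ["handle", "hammer-head"],
    "screwdriver": ["handle", "blade"],
}


def build_relevance_index(knowledge):
    """Invert the knowledge table: part -> objects that the part helps establish."""
    index = defaultdict(list)
    for goal, parts in knowledge.items():
        for part in parts:
            index[part].append(goal)
    return index


def suggest(fact, index):
    """Return the goals worth investigating once `fact` has been established."""
    # .get avoids creating empty entries for facts nothing is known about.
    return list(index.get(fact, []))


RELEVANCE = build_relevance_index(GENERAL_KNOWLEDGE)
```

With this table, establishing "handle" yields suggestions to investigate both "hammer" and "screwdriver", while a fact that general knowledge says nothing about yields no suggestions at all.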
When we learn a fact we ask: what is it relevant to, what is it useful for, under what conditions can it occur, and we suggest investigating appropriately. We may have partially established something already; it makes sense to tackle this next.

For example, if we discover a handle in the scene, we may have a hammer, or a screwdriver in view; we suggest exploring these two possibilities.

Well and good you say, but if we do this conscientiously will we not soon have an unmanageable number of suggestions? I appeal to what I term the "Waltz Effect": Additional information will eventually intersect and produce greater specification rather than combinatorial explosion. (Conjectured to be correct about 3/4 of the time.)

Valid contentions about the scene will be suggested more often as relevant facts are discovered. This will increase their "priority" for execution. The truth will rise to the top, the dross sink to the bottom, of an ordered priority list of suggestions for future investigation. A simple priority list should suffice for a start, though of course higher order interactions among the suggestions are of interest.

Furthermore repeated suggestions will tend to reinforce one another and become more specific. "Suggest left table something" and "suggest left something chair" can combine to become "suggest left table chair".

Data:

We can distinguish three subproblems to our major goal:

    next step = f(general knowledge, particular knowledge)

First the particular knowledge, PK, must be able to relate to the general knowledge, GK. Next the GK must make suggestions. Finally these suggestions must be ordered and executed.

The key to the first two subproblems especially is a good uniform representation of our knowledge to facilitate communication. For example, we can represent our GK as processes and our PK as instantiations of these processes, i.e. of their calling patterns. Or GK may be relations and PK membership statements.

Basically we are building a model of our knowledge. The particular metaphors we use will depend on our background, computational, mathematical, whatever. Each can lend some insight, but none is crucial for this discussion. The major point is a natural communication between fact and processing.

Consider a bit of general knowledge organized as procedures:

To prove P1 about x and y prove P69 about x then P72 about y:

    (P1 x y)
        (P69 x)
        (P72 y)

If we establish the fact (P69 Fred), this associates with the general procedure P69, which in turn is recognized as being useful for establishing P1. Carrying through the arguments in the patterns we make the suggestion: try to show (P1 Fred something).

If we organize our general knowledge perspicuously the above process will:

1) obtain all relevant GK procedures for a given established fact. E.g. (P69 x) may occur in other processes than P1.

2) find the most relevant points in larger procedures to suggest for beginning further study. E.g. (P1 x y) might look like:

    (P1 x y)
        (P43 x)
        (P82 x)
        (P69 x)
        (P72 y)

(P69 Fred) would then suggest (P43 Fred).

All this can occur rather automatically. P69 will know what it is relevant to. (P69 Fred) can trigger appropriate suggestions. These can combine with already existing suggestions to produce more specific ones.

The organization of our GK indicated above simply employs "good programming practice": using subroutines whenever a procedure is used more than once, breaking up into subprocesses for clarity.

Essentially we are saying that if we program clearly enough, even our program can understand itself. The program can then tailor itself to the specific scene by choosing specific processing steps in an appropriate order. The processing is data driven.

II Advice

Advice is a primary issue for heterarchy freaks and knowledge hackers. We need detailed practical analysis of the concept.
This section identifies certain dimensions along which we can analyze advice, and relates the issues to the problems of definition. We conclude that an "active" form of definition will embody "advice", and that the suggestion list control structure described in section I is appropriate for implementing this type of advice.

Consider the following examples of advice:

1) Suspect x is a hammer. Show x has a handle.
2) x is a handle. Suggest x is a part of a hammer.
3) x is a hammer head. Suggest look for a handle.
4) x is a hammer head. Suggest x is next to a handle.
5) x is a hammer head. Suggest a handle lies at right angles to x.
6) Hammer head has brightness y. Suggest handle nearby with brightness > y.

These examples all involve a few simple facts about hammers. Consider the following top-level definition of hammer:

    hammer:
        head
        handle
        relation head handle

We want a definition that can be executed as a procedure. The problem with the definition above is that it does not have the flexibility to benefit from advice. Suppose, for example, we have difficulty finding the hammer head. Can we go find the handle first? Can we then use the knowledge about the handle to make our search for the head easier?

We cannot easily do so if we regard the definition as a passive process to be entered and executed. The knowledge in the definition must be employed in a more active fashion, that:

1) allows flexibility in the order of execution
2) uses the information gained from each partial success to dynamically alter the execution, in a word: employs advice.

The relationships between different elements of a definition constitute a basis for advice within the execution of the definition. We cannot afford to find the handle, find the head, and then check their relationship to each other as the definition above would imply. The relationship is information that can be actively employed to facilitate successful discovery of head and handle.
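To make the contrast between passive and active use of a definition concrete, here is a minimal sketch under heavy assumptions: regions live on a small grid, the `looks_like` field stands in for a real part detector, and "next to" is taken as the relation between head and handle. All names (`Region`, `next_to`, `find_part`, `find_hammer`) are hypothetical. Whichever part is found first, the relation is turned into advice that prunes the search for the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    x: int
    y: int
    looks_like: str  # hypothetical stand-in for a real part detector


def next_to(a, b, max_dist=1):
    """Assumed form of 'relation head handle': adjacent cells on a grid."""
    return a != b and abs(a.x - b.x) + abs(a.y - b.y) <= max_dist


def find_part(kind, regions, advice=None):
    """Find regions of a given kind; advice, when given, prunes the search first."""
    examined = [r for r in regions if advice is None or advice(r)]
    matches = [r for r in examined if r.looks_like == kind]
    return matches, len(examined)


def find_hammer(regions, start_with="handle"):
    """Run the hammer definition starting from either part."""
    other = "head" if start_with == "handle" else "handle"
    firsts, _ = find_part(start_with, regions)
    for part in firsts:
        # The relation is used as advice, not as a check after the fact.
        seconds, examined = find_part(
            other, regions, advice=lambda r: next_to(r, part)
        )
        if seconds:
            return {start_with: part, other: seconds[0]}, examined
    return None, 0
```

The second return value counts how many regions the advised search had to examine, which is the saving the relation buys. The order of execution is a parameter rather than something fixed by the definition.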
In a sense, when we establish one part of a definition, the rest of the definition changes. "Has-handle x" is really dependent, considered as an executable process, on "has-head y". These relationships must be analyzed and explicitly embodied in a successful definition. The definition is more like:

    hammer:
        head(handle)
        handle(head)

where the parentheses indicate dependence.

Relative predicates can be seen in use as advice in examples 4 and 6 above: "next to", "brighter than". Elements of a definition can also make "absolute" suggestions, generally in an "if-then" format: if x is red, then y is fuzzy.

When we make suggestions that involve the definition of an item (or more generally knowledge about an item) we are tacitly assuming that a definition applies, and then attempting to simultaneously justify and benefit from our assumption by establishing its consequences.

For example, when we discover a handle, we guess that it is part of a hammer. A hammer, by definition, implies, and requires, the existence of a hammer head. This head bears a specified relationship to the hammer handle. Therefore we attempt to prove that a hammer head exists with the requisite relationship to the handle. (As in examples 4, 5 above.)

This relationship is required to establish that our hammer hypothesis is correct, to fulfill the definition of hammer. However the relationship also serves to guide our attempts to establish the presence of a hammer head.

On the simplest level, if x is defined:

    foo x:
        bar y
        boo z

a) bar y - hypothesize foo x - prove boo z

However as we have seen, we may have definitions like:

    foo x:
        bar y
        boo z
        relationship y z

or

    foo x:
        bar y
        if boo z then zam y

In which case:

b) boo z - hypothesize foo x - prove bar y with advice from relationship

or

c) boo z - hypothesize foo x - prove bar y and prove zam y.

In the previous section we employed the reasoning illustrated in a) to suggest what to do next.
Now we see how the conditions of examples b) and c) can help suggest how to do it.

We are really moving up and down the definitional structure to obtain suggestions. When we find a handle, we move "up" to suggest a hammer, but we might also suggest a screwdriver. If one suggestion does not pan out another might, and partial successes generate new suggestions.

When we wish to establish a hammer, rather than plowing through a rigid program, we again make certain suggestions. For example, to prove "hammer x" we suggest showing "hammer-head y" and "handle z", both suggestions giving the advice that they are to look within x. When handle z succeeds it suggests hammer-head y, with the advice that y bears certain relationships to z. If the original suggestion for hammer-head y has already been tried and failed, no problem, it now is suggested again. If it has not been tried yet, this second suggestion reinforces the first with added advice, and a higher priority for execution.

This control structure allows great flexibility very simply. We do not have to specifically program every order of execution, every contingency of failure. Conjunction of advice can be handled in a uniform manner.

Furthermore, we are never trapped inside a specific parochial program. We never have to worry how the hammer program can communicate with the rest of the world. We do not have to worry whether we have spent enough time flailing around in pursuing the hammer, whether we should leave, how and when to return later, how to use our partial successes to possible advantage elsewhere. Basically everything is thrown into the pot, everything interacts, the fittest rise to the top, are tested, and succeed.

When something does succeed, e.g. we discover a handle x:

1) we have alternative suggestions about what else to look for
2) we have advice on how to carry out those suggestions.
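The behaviour described above, where a repeated suggestion gains both priority and advice, can be sketched as follows. This is an illustrative toy and not the implementation discussed in section III: the class name `SuggestionList`, the integer priorities, and the representation of advice as plain strings are all assumptions.

```python
class SuggestionList:
    """A priority-ordered pool of suggestions, each carrying accumulated advice."""

    def __init__(self):
        # goal -> {"priority": int, "advice": list of str}
        self._entries = {}

    def suggest(self, goal, advice=None, boost=1):
        """Add a suggestion, or reinforce an existing one with priority and advice."""
        entry = self._entries.setdefault(goal, {"priority": 0, "advice": []})
        entry["priority"] += boost
        if advice is not None and advice not in entry["advice"]:
            entry["advice"].append(advice)

    def pop_best(self):
        """Remove and return the highest-priority suggestion as (goal, advice)."""
        if not self._entries:
            return None
        goal = max(self._entries, key=lambda g: self._entries[g]["priority"])
        entry = self._entries.pop(goal)
        return goal, entry["advice"]
```

Suggesting "hammer-head y" once with the advice "look within x" and again with "y is next to z" leaves a single entry with doubled priority and both pieces of advice. Because `pop_best` removes the entry it returns, a suggestion that was tried and failed can simply be suggested again later, matching the "no problem, it now is suggested again" case.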
Past and future successes in dealing with the scene:

1) direct our priorities on when to handle these suggestions
2) add to our advice on how to handle them.

III Onward

The ideas put forward above, particularly in section I, have been extended and to a degree superseded in the implementation. Suggestion and advice are still the basis of our investigations.

The explicit control structures of Planner/Conniver, antecedent and consequent theorems, have been largely replaced with a structure more specifically tailored to the present issues.

The central problem of relating particular and general knowledge has been viewed in terms of "what has been done" -- statements of fact -- and "what might be done" -- conjectures. Facts and conjectures are essentially identical assertions appropriately marked in a common data network.

Generator assertions, using the Conniver tag mechanism, and an explicit failure function, serve for backup and parallel investigations.

Assertions generally are single argument predicates. Multi-place relations are embedded in the advice and alternative programs they advise. They may also be placed explicitly in the data base "quantified" as predicates:

    x left-of y

may become

    left x        right y
    L of: y       L of: x

using property lists.

Advice is handled by explicit advice routines associated with each assertion. This encourages us to think in terms of advice: not how can I tell if x is furry, but what could help me tell if x is furry.

Several key conflicting priorities have required careful study. The need to share all discoveries conflicts with the need to distinguish each distinct investigation. Returning to an interrupted investigation we want to take advantage of any new discoveries, but do not want to waste time going over old ground.

We want to be able to accomplish subgoals flexibly in differing order, leaving and returning whenever appropriate. On the other hand we have to know when to give up or when to declare success with the main goal.
Our priority system must reward successful progress, but not lock us into an almost complete, but hopeless investigation.

We desire to combine facts to specify suggestions more completely. Yet we cannot always know when facts are really independent, even contradictory in combination. Take the table and chair example at the end of section I. "Left table chair" may be just what to look for. On the other hand we must still consider the possibility that the table is to the left of something else, and likewise that the chair is not to the right of the table.

The implementation of these ideas and the handling of these conflicts is still in progress, and continues to modify and extend the basic insights. Issues such as "chaining of advice" through several levels of definitional structure, will hopefully make interesting section headings.

At the moment enough of the essential framework has been roughed out to permit us to link with the region handling routines of earlier work, and begin experimentation on "live" data.
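The tension in the table and chair example, combining suggestions while keeping the uncombined alternatives alive, can be sketched as below. The encoding is an assumption made for illustration: a suggestion is a tuple of slots, with None standing for "something", and the function names are hypothetical.

```python
def combine(a, b):
    """Merge two partial suggestions slot by slot; None marks an unfilled slot."""
    if len(a) != len(b):
        return None
    merged = []
    for x, y in zip(a, b):
        if x is not None and y is not None and x != y:
            return None  # the two suggestions contradict each other
        merged.append(x if x is not None else y)
    return tuple(merged)


def add_suggestion(pool, new):
    """Return a new pool holding the originals, `new`, and any combinations."""
    result = list(pool)
    if new not in result:
        result.append(new)
    for old in pool:
        merged = combine(old, new)
        if merged is not None and merged not in result:
            result.append(merged)
    return result
```

Adding ("left", None, "chair") to a pool containing ("left", "table", None) produces the more specific ("left", "table", "chair") but also retains both originals, so the possibility that the table is to the left of something other than the chair is not lost.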