From: AAAI Technical Report FS-93-03. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.
What Good Is Your Vacuuming Robot’s Intelligence?
R. Peter Bonasso
Space Systems Division
The MITRE Corporation
7525 Colshire Drive
McLean, Virginia 22102
pbonasso@mitre.org
The situation is the same with actuators: if they can’t
perform the perfect actions needed for the task, some
combination of computation and action can approximate
the needed actions.
Introduction
Robot control is about choosing actions over time to carry
out a task. How does one measure the efficacy of software
used for robot control? If we are to realize an instantiated
real-world agent, e.g., a vacuuming robot, we need to
understand the contribution of the control software,
particularly if we are to compare one method with
another. The question is complex, because controlling
robots is complex. To a certain extent, the ability or
inability of the robot to carry out a task in a natural
environment depends on the quality of the sensors and
actuators and the rate of change of the environment. With
perfect sensors and perfect actuators, the problem is to find
the mapping between states and actions which
accomplishes the goal. This problem -- the AI planning
problem -- is known to be theoretically intractable
[Chapman 87]. But let’s say the mapping can be found for
our vacuuming task, i.e., we have pruned the states and
actions to those we really care about, and the mapping can
be discovered in a reasonable amount of time. Then if the
robot receives perfect state information and can perform all
actions, but the time of its sense-act cycle is longer than
that of the environmental change, the robot will ultimately
fail. Let’s assume this is not the case, and that we have
found a mapping that has a response time well within that
of the environment.
So the state of robot control is as follows:
• We cannot accurately measure every state that is of
interest for the task.
• We cannot execute perfectly every action required for
the task.
(A corollary to the above is that while we can endeavor
to improve the sensing and actuation, there will always be
environments and tasks which will defeat them.)
• Computation and robot motion are needed to overcome
sensing and actuator limitations.
• We must limit the computations of at least the most
critical sense-decide-act cycle (e.g., obstacle avoidance) in
order to stay within the frequency of variability of the
environment (e.g., the fastest moving object).
• We cannot find even a "satisficing" mapping of states
to actions a priori (at compile time) to carry out a task,
since the environment is not completely predictable.
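
The timing constraint in the list above can be made concrete with a small check. The sketch below is illustrative only: the function names, the safety margin, and the numbers are assumptions introduced here, not part of the methodology. It estimates how quickly the environment can change (taken here as the time for the fastest object to cross the sensed range) and tests whether the critical sense-decide-act cycle fits within that period.

```python
def time_to_contact(sensor_range_m: float, closing_speed_mps: float) -> float:
    """Time for the fastest moving object to cross the sensed range.

    Hypothetical stand-in for the 'frequency of variability' of the environment.
    """
    if sensor_range_m <= 0 or closing_speed_mps <= 0:
        raise ValueError("range and speed must be positive")
    return sensor_range_m / closing_speed_mps


def cycle_keeps_up(cycle_time_s: float, change_period_s: float,
                   margin: float = 2.0) -> bool:
    """True if at least `margin` cycles fit within one period of change.

    The default margin of 2 is an assumed safety factor, not a derived one.
    """
    if cycle_time_s <= 0 or change_period_s <= 0:
        raise ValueError("times must be positive")
    return cycle_time_s * margin <= change_period_s


if __name__ == "__main__":
    # Assumed numbers: 1.5 m sensor range, a person walking at 1.0 m/s.
    period = time_to_contact(sensor_range_m=1.5, closing_speed_mps=1.0)
    print(cycle_keeps_up(0.1, period))  # fast obstacle-avoidance cycle
    print(cycle_keeps_up(1.0, period))  # cycle slowed by heavy computation
```

A check of this kind belongs to the off-line analysis: it can rule out a design before any robot is run, but it says nothing about whether the actions chosen within the cycle are good ones.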
Despite these problems, we are seeing an increase in the
number of examples of robots carrying out tasks reliably in
field environments. There are a variety of reasons for this
increase in competent robots. In some cases (e.g.,
[Dickmanns 86]), advantage is taken of a single reliably
predictable part of the environment (the curvature of
Autobahn highways). In others, the environment is partially
engineered (placing of barcodes) in the vicinity of the task
execution (the 1992 AAAI robot competition [Bonasso &
Dean 92]). In all cases, there is a reliance on
near-continuous sensing of the environment along with the use
of software processes that can be executed in parallel, and
whose outputs to the actuators are selected with a
prioritization (e.g., [Brooks 86], [Bonasso 92]). And it is
clear that more than any advances in sensors or actuators, it
is the software which makes these robots succeed.
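
As an illustration of outputs "selected with a prioritization", the following minimal sketch arbitrates between two behaviors. It is not the mechanism of [Brooks 86] or [Bonasso 92]; the behavior names, the sensor key `range_m`, and the thresholds are assumptions made for the example, and the behaviors are evaluated one after another within a cycle rather than truly in parallel.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Command = Tuple[float, float]  # (forward speed in m/s, turn rate in rad/s)
Behavior = Callable[[Dict[str, float]], Optional[Command]]


def avoid_obstacle(sensors: Dict[str, float]) -> Optional[Command]:
    """Stop and turn when something is closer than an assumed 0.3 m."""
    if sensors.get("range_m", float("inf")) < 0.3:
        return (0.0, 1.0)
    return None  # nothing to say; defer to lower-priority behaviors


def vacuum_forward(sensors: Dict[str, float]) -> Optional[Command]:
    """Default behavior: keep moving forward while vacuuming."""
    return (0.3, 0.0)


def arbitrate(behaviors: Sequence[Behavior],
              sensors: Dict[str, float]) -> Command:
    """Return the command of the highest-priority behavior that proposes one.

    `behaviors` is ordered from highest to lowest priority.
    """
    for behavior in behaviors:
        command = behavior(sensors)
        if command is not None:
            return command
    return (0.0, 0.0)  # no behavior active: stay still


if __name__ == "__main__":
    stack = [avoid_obstacle, vacuum_forward]
    print(arbitrate(stack, {"range_m": 2.0}))
    print(arbitrate(stack, {"range_m": 0.2}))
```

Each behavior is cheap to evaluate, which is what lets the most critical cycle run at a near-continuous rate.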
The problem would then be solved, except that this solution
(finding the mapping) is based on the environment being
predictable enough that the actions carried out will result in
persistent changes in the environment. Else, the robot
trying to stack Block B on Block C and Block A on Block
B will never finish its task if Block B keeps getting moved
by a mischievous agent. So a good deal of research into
intelligent robot control centers on re-planning (finding a
new mapping) based on sensed changes in the environment
that do not conform to the model of the environment
embodied in the mapping of states to actions.
But the problem is even tougher: the sensors and actuators
are limited in the states they can sense and the actions that
can be taken. If the sensors fall short of being perfect, some
combination of computation, sensing and action (purposive
sensing) needs to be carried out to approximate or infer the
missing states from past states and a priori knowledge.
More recently, there has been an emphasis on integrating
the more traditional AI hallmarks of deliberative planning
with what have been mostly successful reactive systems (e.g.,
[Bonasso 91], [Gat 91], [Connell 92], [McDermott 92],
[Slack 92]). These efforts involve systems which have
rapidly executing skills (chunks of sensing, computation
and acting) which succeed in a wide variation of
environments, asynchronous deliberative planning which
will find alternate mappings of states to actions with partial
orderings, and some mechanism by which to transform the
discrete event reasoning of the planner to the continuous
activity of the skills.
But will any of this make a difference in the performance
of our vacuum cleaning robot? What is the value of adding
deliberative planning, or eventually, learning and perhaps
natural language capabilities to reactive robots? A
reasonable conjecture is that robots which remember and
can reason and carry on a discourse about their actions and
the results of those actions over time will exhibit more
intelligent behavior than purely reactive robots, which,
while robust, may not be efficient when trying to achieve a
goal (e.g., the random walk of Scarecrow [Bonasso & Dean
92]).
We have been developing a methodology for measuring,
understanding, and thus predicting the contribution of the
software in endowing a robot with task achieving
capabilities in dynamic environments. This paper outlines
that methodology with an example from the household
vacuuming domain.
1. Define the task within the context of the specific
environment.
2. Define/describe the variability of the environment in
the aspects that are critical to the task.
3. Define/describe the limits on the robot sensors and
actuators.
4. Specify the extent to which the environment may be
engineered.
5. State the specific robot capability one wishes to
measure.
6. Define the measures by which "success" will be
evaluated.
7. Determine through off-line analysis, to the extent
possible, that it is feasible for the proposed system to
accomplish the described task under the described
environmental conditions.
8. Structure test cases tailored to capturing the data
necessary for proving or disproving the claimed capability
in step 5.
9. Run the tests, gathering the data necessary for
analysis.
10. Analyze the results with respect to the robot hardware
and the claims made for the software.
(The last two steps are iterated, concentrating on the areas
in which the robot is having problems.)
The Methodology
Step five is the step which separates this methodology
from simply a specification writing and acquisition testing
procedure. We are concerned with measuring the
contribution of the parts of the robot software system to the
overall system behavior. Which parts we are interested in
evaluating will dictate the measures we must use, the tests
to be run, and the types of analysis to be performed. For
example, if we are interested in determining whether
algorithm A for determining the location of a barcode is
better than algorithm B, we might run tests concerned with
accuracy of barcode location (testing the individual skill),
and tests of a full system task (find and visit) to understand
whether the new algorithm’s computation time slows or
speeds up the robot’s overall performance. If, however, we
wish to understand the value added of a control system
which uses long-term memory versus a reactive system
without memory (as will be the case in the detailed
example to follow), we may not be concerned at all about
individual skill tests, but more with tests of speed,
efficiency, and improved performance over time.
An insight gained at a recent AAAI symposium on mobile
robots was that the scientific method, so prevalent in the
physical sciences, is perhaps not appropriate for the
development of intelligent robots [Kuipers 92]. This is
because in the physical sciences the emphasis is on
theorizing about the nature of an existing phenomenon,
whereas in robotics we are actually creating the
phenomenon. In our methodology there is room for
hypotheses and tests once the phenomenon is functioning
in a task environment, particularly when the behavior
exhibited is unexpected based on preliminary analysis.
Another insight gained in other discussions is that we are
attempting to make statements about the behavior of a
complex system, and thus we are limited in what we can
say about the effects of the individual parts of the system as
they contribute to the overall behavior.
Since we are evaluating complex phenomena and their
interaction with complex environments, our methodology
espouses an engineering philosophy, combining
quantitative and qualitative analysis, the use of
observational data, and controlled experimentation. It also
relies on a detailed description of the task and the
variability of the environment within which the task is to be
carried out. Most importantly, in describing the
methodology, we focus on measuring the value added of a
given piece of software to a given robot hardware suite.
In the description that follows, a mythical robot
(Cinderella) is to vacuum the floors of a one-story house.
Define the Task (1)
Here one defines the minimum essential requirements for
the task. Aesthetically pleasing performance may or may
not be a requirement.
Our methodology has ten steps:
• Define the stepwise execution of the task, expected
frequency and required constraints. For example,
Cinderella is expected to keep the floors in the two
bedrooms, the living room, dining room and den vacuumed
at all times, but is not to vacuum at night or when there is
anyone present in a given room.
humans or furniture up, so we might allow no more than a
wireless tether.
Specify the Allowable Environmental
Engineering (4)
The question here is how much do you want your house
altered? If we are going to push for a high degree of
intelligence, no altering should be allowed. But one might
get a robot to do the job in this century if a few visual cues
were allowed. However, putting baffles around tables
because of inadequate proximity sensors is probably
unacceptable.
• Define the limits of the environment, materials, and
environmental conditions. A floor plan could be provided
which would include the kind of lighting, the material
of the furniture, etc. Perhaps a better approach would be to
invite the potential designers over to "experience" the
house.
Define/describe the Variability of the
Environment (2)
State the Specific Robot Capability to be
Measured (5)
Here it is most important to describe the "nominal"
conditions under which the robot is required to operate.
In this step we are focusing on what we expect the robot to
do from the standpoint of intelligent behavior. We aren’t
really that interested in the area of the floor that was
covered or the quality of the cleaning, but how intelligently
the job was done. In step six we need to carefully define
the measures of goodness of that intelligence. An example
might be that we want to determine the value added by the
addition of long-term memory to a reactive system. In
essence, the robot will carry out the task with and without a
map of information obtained in previous runs. We might
hypothesize that we should see an improvement in the time
it takes the robot to vacuum all the floors when the robot
has a memory of where everything was the last time.
• Climate, lighting, weather, e.g., indoor lighting will
prevail except during the hours of 9 pm - 6 am, when
many or all lights will be off.
• Number, type, speed, density, frequency of movement of
objects. The frequency of movement of the furniture, the
number of people who normally occupy the house and how
fast they will be expected to move, etc., could be detailed,
but "experiencing" a typical household environment (as in
step 1) would be more practical (we’re talking about
Everyfamily here, not a government contract).
Define the Measures of Success (6)
• Number, type, speed, activity of humans. The number
and ages of the household occupants and a range of their
activities could be specified. This is important since many
of the measures of "goodness" for intelligent behavior
hinge on interaction with humans.
Carl Friedlander [Friedlander 92] has suggested that there
are behavior measures, inspection measures, and reduced
functionality measures which could be brought to bear in
evaluating robot performance. Behavior measures relate to
the observed behavior of the total system as reflected by
the software logic. Inspection measures are direct
measurements of the outputs of the software module of
concern. Reduced functionality measures are the behavior
and inspection measures applied when parts of the software
are removed or disabled. Of course the reverse of reduced
functionality is improved functionality, the latter designed
to measure a value added, and the former being perhaps
more oriented to which parts of the system are the most
critical.
Define the Limits on the Robot System (3)
• Size and speed of platform. We may allow, for example,
a three-foot robot that moves at about 1 foot per second
(fps) when humans are present or 3 fps when no one is
home.
• Sensor limitations. We probably don’t want laser
scanners used in a household situation.
• End-effector limitations. A typical requirement might
read: The robot shall not use an end-effector whose
operation will prove harmful to humans or to furniture or
kitchen appliances.
We believe AI tests concern improved functionality and the
measures are essentially all behavioral.
• Task completed. With regard to the addition of long-term
memory and our expectations of value added, we would
add to completing the task the following: the less time the
robot spends on average doing the floors, the better. The time
to complete a single round of vacuuming should increase
only in proportion to the number of changes in the
environment since the previous round.
• Power; e.g., only available wall outlets can be used for
power.
• Autonomy. Although some tasks might be allowed to be
done with a tether, we’d want Cinderella not to tangle
might run Cinderella twice a day, once in the morning and
once in the afternoon at times the robot will typically be
expected to do the task. We would analyze the results and
if there are regularly occurring errors, we would return to
this step and design subtask tests. So for example, the robot
may have no trouble getting to the rooms, but doesn’t
always get through the door without catching the door
frame. This could suggest a problem with the obstacle
avoidance routine or an odometry problem. A set of test
cases could then be designed involving running of the
navigation system as Cinderella went from outside to inside
a room.
A purely reactive system might relocate and home in on
each room on each round, thus ostensibly taking more time
than a system which "remembered" the floor plan. The
system with memory would move as directly as possible to
the vicinity of the known previous location of each room
before homing in on the door of the room.
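
The expectation that the round time of the system with memory should grow only in proportion to the number of environmental changes can be examined with a simple least-squares line. The sketch below is one possible way to do so, not a prescribed part of the methodology; the function names and the sample figures in the usage lines are invented for illustration.

```python
from typing import Sequence, Tuple


def fit_line(changes: Sequence[float],
             times_s: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of times_s = intercept + slope * changes.

    Returns (slope, intercept). The slope estimates the extra seconds
    per change in the environment since the previous round.
    """
    n = len(changes)
    if n != len(times_s) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(changes) / n
    mean_y = sum(times_s) / n
    sxx = sum((x - mean_x) ** 2 for x in changes)
    if sxx == 0:
        raise ValueError("all rounds had the same number of changes")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(changes, times_s))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def max_residual(changes: Sequence[float], times_s: Sequence[float]) -> float:
    """Largest deviation of any round from the fitted line, in seconds."""
    slope, intercept = fit_line(changes, times_s)
    return max(abs(y - (intercept + slope * x))
               for x, y in zip(changes, times_s))


if __name__ == "__main__":
    # Invented stopwatch data: changes since last round, round time in seconds.
    n_changes = [0, 1, 2, 3]
    round_times = [600.0, 630.0, 660.0, 690.0]
    print(fit_line(n_changes, round_times))
    print(max_residual(n_changes, round_times))
```

Large residuals, or a slope that differs sharply between software versions, would be a cue to return to step five with a new hypothesis rather than a conclusion in themselves.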
It is important to note here the interplay of the environment
and the allowable amount of environmental engineering in
the expectations of outcome. If the environment was just a
large great-room (no interior walls), or if markers were
allowed to be placed on the rooms in such a way as to
allow viewing by a long range sensor, the expected
outcome would not be clear. A reactive system with a long
range sensor might perform as well as the system which
memorized the floor plan. As well, if the furniture was
rearranged often (as by a wild bunch of ankle biters),
knowing the floor plan might not be as important as
maneuvering among obstacles.
Assuming the robot is generally successful in the runs
described above, a second set of runs will involve taking
observations of the robot’s performance over a specified
period of time during which there are more complex
environmental conditions that cannot be predicted or easily
controlled, or which would be too costly to examine (the
presence of humans carrying out daily activities, climate,
e.g., creating a humidity problem indoors during the winter
season). In the example, Cinderella would perhaps be
required to carry out the vacuuming for three weeks during
the fall, winter, spring and summer seasons.
• Safety. Part of intelligent behavior is cognizant failure
[Gat 91]. Safety is not just a matter of halting or shutting
down in light of an unsafe situation; it also involves some kind
of "unwind-protect" on the current activity as well as user
notification.
For the purposes of evaluating portions of the software
architecture, the above runs would be repeated under
conditions as nearly identical as possible, using different
versions of the software.
Off-line Analysis (7)
Somewhere in the process, the details of Cinderella’s
hardware and software need to be presented in order to
conduct some analysis (computations and logical
reasoning) prior to actual on-line tests. It can be augmented
with a simulation of the gross system performance to get a
sense of where the shortcomings might be. This analysis
would include following through the logic of the software,
comparing the sensor and end-effector limitations to the
environmental conditions under which the robot must
function, and simply matching claimed capabilities to the
desired capabilities listed in the previous steps.
Run the Robot Capturing the Data Necessary
for Further Analysis (9)
The "data necessary" is usually a logging of the robot
inputs, outputs, and configuration (position and orientation)
at each cycle of a given run. This is the data to be used to
inspect the outputs of the parts of the software system with
which we are concerned. It is also used to debug software
fixes; we can get an initial idea of how the robot will
perform by running the new software on this data.
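
A per-cycle log of this kind, and its reuse for trying out a software fix, might look like the sketch below. The record fields, the JSON-lines encoding, and the `controller` callable are assumptions chosen for the example; the paper does not prescribe a log format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class CycleRecord:
    """One sense-decide-act cycle: inputs, outputs, and configuration."""
    cycle: int
    inputs: Dict[str, float]    # sensor readings seen by the software
    outputs: Dict[str, float]   # commands sent to the actuators
    x_m: float                  # position
    y_m: float
    heading_rad: float          # orientation


def dump_log(records: List[CycleRecord]) -> str:
    """Serialize a run as one JSON object per line."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


def load_log(text: str) -> List[CycleRecord]:
    """Parse a run written by dump_log."""
    return [CycleRecord(**json.loads(line))
            for line in text.splitlines() if line.strip()]


def replay(records: List[CycleRecord],
           controller: Callable[[Dict[str, float]], Dict[str, float]]
           ) -> List[Tuple[int, Dict[str, float], Dict[str, float]]]:
    """Feed logged inputs to new software; report cycles whose output changed.

    Returns (cycle, logged output, new output) for each difference.
    """
    differences = []
    for record in records:
        new_output = controller(record.inputs)
        if new_output != record.outputs:
            differences.append((record.cycle, record.outputs, new_output))
    return differences
```

Such a replay is open loop: the logged inputs do not react to the new outputs, which is why it gives only an initial idea of how the revised software will perform.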
Structure Test Cases to Prove or Disprove the
Claimed Capability (8)
For detailed testing of individual skills, physical
measurements must be taken such as the position of objects
in the area of interest. An independent locating system for
tracking the robot’s global position may also be in order.
But for answering questions of improved functionality via
system behavior tests, a stopwatch and a set of observations
is all that is initially necessary.
For testing individual robot skills, a set of runs for each
skill could be specified at the middle and
extremes of each variable’s range for those variables which
can be easily controlled by the testers (lighting, number and
separation of obstacles). In general, this is the method
currently in use to debug skills. If the robot performs
properly in these tests and if time permits, further tests can
be conducted, for example to determine the limits of these
skills beyond those stated for the tasks at hand.
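
The set of runs at the middle and extremes of each controllable variable can be enumerated mechanically. The sketch below is a minimal illustration; the variable names, units, and ranges are hypothetical, and a full cross-product of levels is only one way to combine them.

```python
from itertools import product
from typing import Dict, List, Tuple


def levels(low: float, high: float) -> Tuple[float, float, float]:
    """Low extreme, midpoint, and high extreme of a variable's range."""
    if low > high:
        raise ValueError("low must not exceed high")
    return (low, (low + high) / 2, high)


def build_run_matrix(ranges: Dict[str, Tuple[float, float]]
                     ) -> List[Dict[str, float]]:
    """One run per combination of (low, middle, high) across all variables."""
    names = list(ranges)
    grids = [levels(*ranges[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*grids)]


if __name__ == "__main__":
    # Hypothetical controllable variables for a skill test.
    runs = build_run_matrix({
        "lighting_lux": (50, 500),
        "n_obstacles": (0, 10),
        "separation_m": (0.5, 2.0),
    })
    print(len(runs))
    print(runs[0])
```

The number of runs grows as three to the power of the number of variables, which is one practical reason the detailed runs are reserved for individual skills.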
Analyze the Results (10)
Here we must remember that we are comparing/contrasting
configurations of software to understand what advantage
one configuration has over another. The temptation to
understand why a robot failed a given test run must be avoided.
In other words, don’t dilute the measurements required by
adding additional measurements that will not answer the
test criteria.
But for examining overall system behavior, there are
usually too many variables to practically control in a field
setting. A more abstract level of runs will most likely be
more useful in pointing out system problems which, in
turn, would suggest what more detailed controlled runs are
necessary. For example: for one week of workdays, we
For instance, perhaps in analyzing the data collected we
found that Cinderella with memory vacuumed the first
floor of the house on average faster than without memory
except when the number of people present in the house
exceeded a certain value. Now we move into a hypothesize
and test mode, perhaps hypothesizing that the additional
information from the map becomes less useful when any
path from one point to another in the room is not very
straight. We can then start from step five and repeat the
process.
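
A finding such as "faster with memory except above a certain number of people" could come from grouping the stopwatch data by occupancy. The following sketch shows one way to do that grouping; the tuple layout and the sample figures are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# (people present, memory enabled, round time in seconds)
Run = Tuple[int, bool, float]


def memory_advantage_by_people(runs: Iterable[Run]) -> Dict[int, float]:
    """Mean time without memory minus mean time with memory, per occupancy.

    Positive values mean the memory-equipped configuration was faster.
    Occupancy levels lacking runs for either configuration are skipped.
    """
    with_memory: Dict[int, List[float]] = defaultdict(list)
    without_memory: Dict[int, List[float]] = defaultdict(list)
    for people, memory_enabled, time_s in runs:
        bucket = with_memory if memory_enabled else without_memory
        bucket[people].append(time_s)
    shared = sorted(set(with_memory) & set(without_memory))
    return {p: mean(without_memory[p]) - mean(with_memory[p]) for p in shared}


if __name__ == "__main__":
    # Invented observations.
    sample = [
        (0, True, 500.0), (0, False, 700.0),
        (4, True, 900.0), (4, False, 850.0),
    ]
    print(memory_advantage_by_people(sample))
```

A sign change in the advantage as occupancy rises is the kind of observation that prompts a new hypothesis and a return to step five.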
[Chapman87] D. Chapman. Planning for conjunctive
goals. AI Vol. 32, No 3. July 1987. Elsevier Science
Publishers.
[Connell 92] Connell, Jonathan. 1992. SSS: A hybrid
architecture applied to robot navigation, in Proceedings of
the IEEE International Conference on Robotics and
Automation, April.
[Dickmanns 86] E. D. Dickmanns and A. Zapp, A
curvature-based scheme for improving road vehicle
guidance by computer vision. In Mobile Robotics, SPIE
Proc., Vol. 727, Cambridge, MA, 1986, pp. 161-168.
Often in the analysis phase we might be trying to find out
why the robot, while not failing, did something that was not
expected based on the testers’ understanding of the software
and hardware specifications of the system (Step 7). For
example, maybe Cinderella didn’t seem to perform any
better with the map than without it. To conclude that
memory is not an essential part of intelligent vacuuming
may be premature. At this point, the users of this
methodology would move to step 5 with a new hypothesis
about the phenomenon in question, devising new measures
and tests to verify or refute the hypothesis.
[Friedlander 92] Friedlander, Carl. Position paper on robot
control metrics. DARPA UGV Workshop, Winter 1992.
[Gat 91] Gat, Erann. 1991. Taking the Second Left:
Reliable Goal-Directed Reactive Control for Real-World
Autonomous Robots, PhD Dissertation, VPI.
[Kuipers 92] Kuipers, Ben. Comments during a
presentation of the AAAI Fall Symposium on Real World
Robots, October 1992.
Summary
We have presented a methodology for determining the
value added of a software module in the control system of a
robot. This methodology stresses off-line analysis, test case
observations and post-test analysis based on the logic of the
software design. Because of the complexity of determining
why a robot system performs better or worse when an
"intelligent" component is added or subtracted, this
methodology also stresses examining the overall system
behavior at the outset. In this manner, there is a good
chance of seeing a clear improvement or non-improvement
in system performance without requiring expensive
instrumentation and empirical data acquisition.
[McDermott 92] McDermott, Drew. Transformational
Planning of Reactive Behavior. YALEU/CSD/RR #941.
Dec 1992.
[Slack 92] Slack, Marc G. Sequencing Formally Defined
Reactions for Robotic Activity: Integrating RAPS and
GAPPS. Proceedings of SPIE OE/Technology conference
on Sensor Fusion, Boston, November 1992.
References
[Bonasso 91] R. P. Bonasso. Integrating Reaction Plans
and Layered Competences Through Synchronous Control,
in Proceedings of the 12th International Joint Conference
on Artificial Intelligence. Sydney, Australia. Morgan
Kaufmann. 1991.
[Bonasso 92] Bonasso, R. P. Using Parallel Program
Specifications For Reactive Control of Underwater
Vehicles, in Journal of Applied Intelligence, June 1992.
[Bonasso & Dean 92] A Review of the First AAAI
Robotics Competition. AAAI Proceedings of the Fall
Symposium on Real World Robots, October 1992.
[Brooks 86] Rodney A. Brooks. A Robust Layered Control
System for a Mobile Robot. IEEE Journal of Robotics and
Automation, RA-2:14-23, April 1986.