Applying a General-Purpose Planning and Learning Architecture to Process Planning*

From: AAAI Technical Report FS-94-01. Compilation copyright © 1994, AAAI (www.aaai.org). All rights reserved.

Yolanda Gil
Information Sciences Institute
University of Southern California
Marina del Rey, CA 90292
gil@isi.edu

M. Alicia Pérez
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
aperez@cs.cmu.edu

Abstract

Process planning poses significant computational requirements due to the variety of alternative processes, their complexity, and their interactions. General-purpose planners are generally not considered a practical approach, and most current research focuses on special-purpose planning systems. Research within the PRODIGY framework aims to provide expressive general-purpose planners together with learning algorithms that can improve their efficiency, the accuracy of their domain model, and the quality of their plans. Process planning is one of the large-scale complex domains that we have implemented in PRODIGY to demonstrate the feasibility of our approach. Our current model of process planning is still far from comprehensive and is limited in many ways, but it reflects many of the complexities involved in the task. This paper describes how PRODIGY learns control knowledge, acquires domain knowledge, and improves the quality of its plans for this application domain using general-purpose planning and learning algorithms.

*We would like to thank all the members of the PRODIGY group for many years of collaborations and discussions. This research was partially sponsored by the Wright Laboratory, Aeronautical Systems Center, Air Force Materiel Command, USAF, and the Advanced Research Projects Agency (ARPA) under grant numbers F33615-90-C-1465 and F33615-93-1-1330. Views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing official policies or endorsements, either expressed or implied, of Wright Laboratory or the United States Government. The second author holds a scholarship from the Ministerio de Educación y Ciencia of Spain.

Introduction

Current research on automation of manufacturing processes includes CAD aids, assembly automation, and process planning tools. The automation of process planning in particular is becoming a serious need in industry, due to the increasing scarcity of experts on technology that is rapidly changing, the need for lowering manufacturing costs, and the desire to make customized products widely available. The variety of alternative processes, their complexity, and their interactions make the planning task very complex. In addition, a good process plan minimizes resource consumption and execution time. These issues are part of the research agenda of the AI planning community. However, the automation of different aspects of process planning (see (Chang & Wysk 1985) for an overview) has focused on special-purpose systems that address the complexity of the task with mechanisms specific to process planning. Some of these systems use AI techniques (Hayes 1990; Descotte & Latombe 1985a; Nau 1987), and there are approaches that use general-purpose problem solvers coupled with special-purpose systems (Kambhampati et al. 1993). All this body of work suggests that general-purpose AI planning techniques cannot handle the complexity inherent to process planning tasks.

Research within the PRODIGY framework aims to provide expressive general-purpose planners together with learning algorithms that can improve their efficiency, the accuracy of their domain model, and the quality of their plans. Process planning is one of the large-scale complex domains that we have implemented in PRODIGY as a useful testbed for our planning and learning research.

Our current model of process planning¹ represents machining, joining, and finishing operations. Although this model is still far from comprehensive and is limited in many ways, it reflects many of the complexities involved in the task. We asked an expert job shop machinist to assist in the construction of the domain so it would be as accurate as possible. The machinist also helped with the description of a real machine shop and sample parts for constructing problems. For some problems we used actual requests that were submitted to the job shop that serves the Mechanical Engineering Department of Carnegie Mellon University.

The paper begins with a brief overview of PRODIGY, followed by a presentation of our model of process planning. Finally, we describe how the learning mechanisms applied to this process planning domain improve PRODIGY's performance in several respects.

¹ The domain is described in detail in (Gil 1991), and is available upon request from prodigy@cs.cmu.edu.

Planning and Learning in PRODIGY

The PRODIGY system (Minton et al. 1989a; Veloso 1989; Carbonell et al. 1992) is an evolving general-purpose problem solving architecture that integrates several learning mechanisms to improve performance. Domain knowledge is represented in a set of operators and inference rules, and a type hierarchy for the objects in the domain. The operators are models of the available actions and they specify the effects of the actions under different conditions. Inference rules are used to deduce additional information from the state. A problem is given by an initial state, representing the current state of the world, and a goal state. PRODIGY searches for a solution using a casual commitment strategy for every decision in the search process. Decisions include choosing a goal, choosing an operator, selecting bindings to instantiate an operator, and deciding whether to subgoal or apply an operator whose conditions are satisfied. Search control rules that express definitive selections or heuristic recommendations are applied at each decision point. The problem solver has a very powerful language to express both domain and control knowledge.
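
The way control rules act at a decision point can be illustrated with a small sketch. It is written in Python rather than in PRODIGY's own rule language, and it is a simplification under stated assumptions: the `ControlRule` class, the `decide` function, the three rule kinds as modelled here, and the example fluids are hypothetical names chosen for illustration, not part of the PRODIGY implementation. Select and reject rules stand for definitive selections, and prefer rules for heuristic recommendations that only reorder the remaining candidates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ControlRule:
    name: str
    kind: str  # "select", "reject", or "prefer" (simplified rule kinds)
    applies: Callable[[Dict[str, str], str], bool]  # (context, candidate) -> bool


def decide(candidates: List[str], rules: List[ControlRule],
           context: Dict[str, str]) -> List[str]:
    """Return the candidates in the order the search would try them."""
    # Select rules, when any fire, restrict the candidate set.
    selected = [c for c in candidates
                if any(r.kind == "select" and r.applies(context, c) for r in rules)]
    pool = selected if selected else list(candidates)
    # Reject rules remove candidates definitively.
    pool = [c for c in pool
            if not any(r.kind == "reject" and r.applies(context, c) for r in rules)]
    # Prefer rules only reorder: preferred candidates are tried first.
    preferred = [c for c in pool
                 if any(r.kind == "prefer" and r.applies(context, c) for r in rules)]
    rest = [c for c in pool if c not in preferred]
    return preferred + rest


# Hypothetical rules for choosing a cutting fluid binding.
rules = [
    ControlRule("dont-use-mineral-oil", "reject",
                lambda ctx, c: c == "mineral-oil"
                and ctx["material"] in {"STEEL", "ALUMINUM"}),
    ControlRule("prefer-soluble-oil", "prefer",
                lambda ctx, c: c == "soluble-oil"),
]
fluids = ["mineral-oil", "water", "soluble-oil"]
steel_order = decide(fluids, rules, {"material": "STEEL"})
brass_order = decide(fluids, rules, {"material": "BRASS"})
```

With these assumed rules, mineral oil is dropped for a steel part but merely tried after soluble oil for a brass part, which is the difference between a definitive rejection and a heuristic preference.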

Learning Control Knowledge

Control rules can be learned automatically by the system by static analysis of the domain operators (Etzioni 1990), analysis of problem-solving traces (Minton 1988; Borrajo & Veloso 1994), or a combination of both (Pérez & Etzioni 1992).

In addition to learning control rules, PRODIGY can also control the search using derivational analogy with similar previously solved problems (Veloso & Carbonell 1993). Search is also more efficient when PRODIGY is used as a hierarchical problem solver that learns to structure the search in multiple abstraction levels automatically (Knoblock 1991).

Learning Domain Knowledge

PRODIGY can acquire new domain knowledge by interaction with the environment and experimentation (Gil 1992). Given an initial description of the domain operators, PRODIGY can acquire additional preconditions and effects autonomously by executing the plans that it builds with the currently available knowledge. The system has expectations that emerge from its current knowledge. Plan execution is monitored, and when the expectations and the observations diverge, learning is triggered. When there are several possible modifications of the domain knowledge that could potentially fix the problem, PRODIGY designs and executes experiments to discern which modification is appropriate. The experimentation process is carried out efficiently thanks to a set of domain-independent hypothesis-selection heuristics that are available to the learning system.

Domain knowledge can also be acquired directly from a domain expert. PRODIGY can engage in an apprentice-like dialogue (Joseph 1992), or learn from observing the expert solving problems (Wang 1994).

Learning to Improve Plan Quality

PRODIGY can learn to improve the quality of the plans it generates (Pérez & Carbonell 1994). Given a domain-dependent objective function that can evaluate the quality of plans, the learning algorithm compares the search trace for the planner solution given the current control knowledge, and another search trace corresponding to a better solution (better according to the evaluation function). The latter trace is obtained by letting the problem solver search further until a better solution is found, or by asking a human expert for modifications on the first solution or a completely new one, and then building a corresponding search trace. The algorithm explains why one solution is better than the other and its output is search control knowledge that leads future problem solving towards better quality plans. The learning algorithm is effectively operationalizing the objective function into knowledge that the planner can use during plan generation by transforming it into control rules.

Process Planning in PRODIGY

In this domain, PRODIGY generates plans to produce parts given a request that specifies the material, the shape (rectangular or cylindrical), the size along each dimension, the surface quality (roughness), the surface finish (metal coatings and polishing), and the features (holes that can be reamed, tapped, counterbored, etc.). This specification forms the goal state. A description of a shop with machines, tools, and parts forms the initial state of any problem. Parts have six sides, and the location of a feature is determined with x and y coordinates in a given side. Besides the machining operations themselves, a plan consists of operations to secure the part with a holding device in a certain orientation, to clean metal burrs from its surface, and to install an appropriate tool in the machine.

Domain Knowledge

In our model, most operators correspond to machining, joining, and finishing actions, as well as to the steps to prepare the part and tool set-ups. Consider, for example, an operator for face milling a part. We need to represent the fact that if we use a milling cutter on a milling machine the size of the part will change along a dimension corresponding to the part side facing up, that the part must be held by a holding device in such a way that the desired dimension can be machined, and that the new size of the part must be smaller than the current size. Also, any surface properties of the side being machined will disappear, and the part will have dirt and burrs. Figure 1 shows the corresponding operator. The notation means that if the preconditions are true in the current state then we can perform the milling operation, which changes the state according to the effects listed.

The domain implementation makes use of PRODIGY's ability to represent infinite types and to do arbitrary Lisp function calls. Infinite types, i.e. types with infinitely many instances, are used to represent numeric quantities, such as part sizes, hole depths, diameters, and angles. Functions can be used to denote facts that never change in the state, as generators for the infinite types, and to perform numeric calculations. In the FACE-MILL operator, the function smaller represents the restriction that the part size never increases after milling. Inference rules are used to specify the availability of machines, parts, tools, tool holders, and holding devices. They are also used to determine which sides should be used to hold a part.

(Operator FACE-MILL
 (params <machine> <part> <cutter> <hold-dev>
         <side> <side-pair> <dim> <value-old> <value>)
 (preconds
  ((<machine> MILLING-MACHINE) (<cutter> MILLING-CUTTER)
   (<hold-dev> (or 4-JAW-CHUCK VISE COLLET-CHUCK TOE-CLAMP))
   (<part> Part)
   (<dim> Dimension)
   (<side-pair> Side-Pair)
   (<side> Side)
   (<value-old>
    (and Size (gen-from-pred (size-of <part> <dim> <value-old>))))
   (<value> (and Size (smaller <value> <value-old>))))
  (and (shape-of <part> RECTANGULAR)
       (side-up-for-machining <dim> <side>)
       (sides-for-holding-device <side> <side-pair>)
       (holding-tool <machine> <cutter>)
       (holding <machine> <hold-dev> <part> <side> <side-pair>)))
 (effects ((<surface-coating> SURFACE-COATING)
           (<surface-finish> SURFACE-FINISH))
  ((del (is-clean <part>))
   (add (has-burrs <part>))
   (del (surface-coating-side <part> <side> <surface-coating>))
   (del (surface-finish-side <part> <side> <surface-finish>))
   (add (surface-finish-side <part> <side> ROUGH-MILL))
   (add (size-of <part> <dim> <value>))
   (del (size-of <part> <dim> <value-old>)))))

Figure 1: The FACE-MILL operator.
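
The add and delete lists of an operator such as FACE-MILL can be read as a state transition over a set of ground facts. The following Python sketch is only an illustration of that reading, under several assumptions: facts are encoded as tuples, the type checks on the parameters are omitted, and the function name `face_mill`, the object names (`mill1`, `vise1`, and so on), and the example state are hypothetical. It is not the PRODIGY matcher.

```python
from typing import FrozenSet, Optional, Tuple

Fact = Tuple
State = FrozenSet[Fact]


def face_mill(state: State, machine: str, part: str, cutter: str, hold_dev: str,
              side: str, side_pair: str, dim: str,
              new_size: float) -> Optional[State]:
    """Return the successor state, or None if some precondition does not hold."""
    # Bind <value-old> from the current size-of fact (the generator in Figure 1).
    old = next((f[3] for f in state if f[:3] == ("size-of", part, dim)), None)
    preconds = [
        old is not None and new_size < old,  # the "smaller" restriction
        ("shape-of", part, "RECTANGULAR") in state,
        ("side-up-for-machining", dim, side) in state,
        ("sides-for-holding-device", side, side_pair) in state,
        ("holding-tool", machine, cutter) in state,
        ("holding", machine, hold_dev, part, side, side_pair) in state,
    ]
    if not all(preconds):
        return None
    # Delete list: cleanliness, the old size, and any surface property of the side.
    deletes = {f for f in state
               if f == ("is-clean", part)
               or f == ("size-of", part, dim, old)
               or f[:3] in {("surface-coating-side", part, side),
                            ("surface-finish-side", part, side)}}
    # Add list: burrs, the rough-mill finish, and the new size.
    adds = {("has-burrs", part),
            ("surface-finish-side", part, side, "ROUGH-MILL"),
            ("size-of", part, dim, new_size)}
    return frozenset((state - deletes) | adds)


# Hypothetical initial state for one part already held in a milling machine.
state = frozenset({
    ("shape-of", "part1", "RECTANGULAR"),
    ("size-of", "part1", "HEIGHT", 3.0),
    ("side-up-for-machining", "HEIGHT", "SIDE1"),
    ("sides-for-holding-device", "SIDE1", "SIDE2-SIDE5"),
    ("holding-tool", "mill1", "cutter1"),
    ("holding", "mill1", "vise1", "part1", "SIDE1", "SIDE2-SIDE5"),
    ("is-clean", "part1"),
    ("surface-finish-side", "part1", "SIDE1", "POLISHED"),
})
args = ("mill1", "part1", "cutter1", "vise1", "SIDE1", "SIDE2-SIDE5", "HEIGHT")
after = face_mill(state, *args, 2.5)
rejected = face_mill(state, *args, 3.5)  # would increase the size
```

In this sketch the attempt to mill the part to a larger size fails at the precondition check, mirroring the role of the smaller function in the operator.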

Some qualitative and quantitative measures of the complexity of this domain are:
• The effects of most operators are not reversible.
• The precondition expression of some operators and inference rules includes negations, disjunctions and universal quantification. Some of the preconditions correspond to predicates derived by inference rules.
• There are context-dependent effects of operators.
• There are 117 rules, which include 73 operators and 44 inference rules. 38 of the operators correspond to machining operations, and 35 to set-ups.
• The average number of parameters for an operator is 7, the average number of preconditions is 5, and the average number of effects is 3.
• There are 41 different predicates. 7 of them are static (i.e., do not change during problem solving). 11 Lisp functions are used to perform numerical computations and constrain variable values.
• There are 85 different types and subtypes of objects in the type hierarchy, 5 of which are infinite types.
• The length of many solutions is over one hundred rules (including operators and inference rules).
• The initial state that represents the machine shop includes more than 500 facts.

Control Knowledge for Process Planning

Learned or handwritten control rules guide the search for solutions along the more promising paths. For example, the rule in Figure 2 rejects certain kinds of cutting fluid for some machining operators according to the material of the part.

(control-rule DONT-USE-MINERAL-OIL
 (if (and (current-goal-first-arg <part>)
          (current-ops (DRILL-WITH-HIGH-HELIX-DRILL
                        DRILL-WITH-GUN-DRILL REAM
                        ROUGH-GRIND FINISH-GRIND
                        CUT-WITH-CIRCULAR-FRICTION-SAW ...))
          (or (known (material-of <part> STEEL))
              (known (material-of <part> ALUMINUM)))
          (type-of-object <f> mineral-oil)))
 (then reject bindings ((<fluid> . <f>))))

Figure 2: Control rule that rejects bindings for the cutting fluid depending on the part's material.

Plan Quality in Process Planning

Plan quality is crucial in process planning to minimize both resource consumption and execution time (Doyle 1969; Descotte & Latombe 1985a). For instance, it may be advantageous to execute several cuts on the same machine with the same fixing to reduce the time spent setting up the work on the machines; or, if a hole H1 opens into another hole H2, then H2 should be machined before H1 in order to avoid the risk of damaging the drill.

Sharing parts of the set-ups among operations on one or more parts usually reduces the total plan cost. Plan length is usually not an accurate metric of plan quality, as different operators have different costs. For example, a tool can be switched automatically but holding the part requires human assistance (Hayes 1990). Therefore plans that share set-ups are cheaper than plans that share tools. The next section describes how quality-enhancing control knowledge can be acquired automatically.
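
The difference between plan length and plan cost can be made concrete with a small Python sketch. The numeric costs and step names below are hypothetical values chosen only so that a set-up is more expensive than a tool switch; they are not taken from the domain, and `plan_cost` is an illustrative helper rather than the evaluation function used in our experiments.

```python
# Hypothetical per-step costs: holding the part needs human assistance,
# while a tool can be switched automatically.
COST = {"set-up": 8, "switch-tool": 1, "machine": 3}


def plan_cost(plan):
    """Additive quality metric: the sum of the costs of the individual steps."""
    return sum(COST[kind] for kind, _ in plan)


# Two cuts that share one set-up but need different tools.
shares_setup = [
    ("set-up", "hold part"),
    ("switch-tool", "install tool A"),
    ("machine", "cut 1"),
    ("switch-tool", "install tool B"),
    ("machine", "cut 2"),
]

# Two cuts that share one tool but need different set-ups.
shares_tool = [
    ("set-up", "hold part, orientation 1"),
    ("switch-tool", "install tool A"),
    ("machine", "cut 1"),
    ("set-up", "hold part, orientation 2"),
    ("machine", "cut 2"),
]

costs = {"shares_setup": plan_cost(shares_setup),
         "shares_tool": plan_cost(shares_tool)}
```

Both plans have five steps, yet under these assumed costs the plan that shares the set-up is cheaper, which is why an additive cost metric is a better guide than plan length.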

Learning to Improve Performance in Process Planning

This section describes how PRODIGY's learning techniques described in the second section can be used to improve the planner's performance in our process planning domain.

Efficient Process Planning through Learning

There are two main approaches to building process planning systems (Chang & Wysk 1985). Generative approaches combine elementary process planning operations to produce the final plan. Variant approaches retrieve complete plans from a plan library and adapt them to suit the needs of the current problem. In the implementation just described, PRODIGY finds solutions for process planning problems in a generative fashion, i.e., by constructing plans given a set of possible operators. PRODIGY's analogical engine (Veloso & Carbonell 1993) could be used to implement a variant approach using predefined planning episodes associated with families of parts, modifying them for the particular part wanted.

Abstraction planning has been applied to process planning and scheduling domains successfully (Fox & Smith 1984; Nau 1987). PRODIGY's domain-independent techniques (Knoblock 1991) should provide useful abstractions to handle the interactions within subproblems in a process planning application.

PRODIGY currently uses many control rules to guide the search in the process planning domain. These control rules are hand-coded, and continue to grow in number as we continue to understand how to control the search complexity of the domain. Some of the work on automatically learning control knowledge in PRODIGY has been applied to the process planning domain (Borrajo & Veloso 1994).

Learning to Generate Process Plans of Good Quality

The performance of the planner can also be improved by learning new rules to guide the search towards better quality solutions. The mechanism to learn quality-enhancing control knowledge described previously has been applied to the process planning domain. The following simple example illustrates the learning process. The domain-dependent quality metric used is additive on the cost of the individual operators, and the operations to set up the part on the machine are more expensive than those to switch the tool.

Suppose the goal is to reduce the height of a part and have a spot hole at certain coordinates, and the planner chooses the drill press to drill the spot hole. A domain expert may input modifications to improve that solution so that it uses the milling machine to drill the spot hole, and shares the same set-up (orientation, machine and holding device) for the drill and mill operations. The learning mechanism comes up with the control rule in Figure 3 and a similar bindings preference rule. Goal preferences are also learned from other problems.

(control-rule pref-drill-with-spot-drill-in-milling-machine30
 (if (and
      (current-goal (has-spot <part> <hole> <side> <loc-x> <loc-y>))
      (pending-goal
       (holding <mach> <holding-dev> <part> <side> <side-pair>))
      (type-of-object <mach> milling-machine)))
 (then prefer operator drill-with-spot-drill-in-milling-machine
       drill-with-spot-drill))

Figure 3: Search control rule learned from the example problem.

As the explanation is built from a single example and does not consider all possible hypothetical scenarios, it may be incomplete and the learned rules may be overgeneral. Upon unexpected failures the system refines the learned knowledge incrementally, adding new rules if needed, and may set priorities among rules.

Table 1 shows the effect of the learned knowledge on the solution cost over 70 randomly-generated problems. Each column corresponds to a set of 10 problems with common parameters: number and type of goals, parts, etc. The training set consisted of 60 randomly generated problems with the same parameters as for sets 1 to 6 in the table. In many of the training and test problems the planner did not require control knowledge to obtain a good solution. Consequently, for each problem set we have only recorded those problems for which the solution was actually improved. The number of nodes and total CPU time were also reduced due to shorter solution lengths. However, we plan to further analyze the possible tradeoff between the learned-knowledge matching cost and the savings obtained by using it. We are exploring the effect of this learning mechanism on other domains and on other types of evaluation functions.

Problem set (10 probs per set)        1     2     3     4     5     6     7
# problems with improvement           3     9     3    10    10     4     9
Without learned control knowledge   107   202   190   431   362   442   732
With learned control knowledge       91   132   166   350   220   409   665
Cost decrease                       44%   48%   33%   24%   47%   17%    8%

Table 1: Improvement on the quality of the plans obtained for 70 randomly-generated problems in the process planning domain. The third and fourth rows show solution cost according to the evaluation function.

Learning Domain Knowledge for Process Planning

PRODIGY acquires new preconditions by experimentation while planning using the process planning operators that are initially given to the system (Gil 1992). Table 2 presents some results obtained when PRODIGY learns preconditions that are missing from its initially given specification of the process planning domain. The tests were run in domains with 10% and 30% incompleteness using two training sets and two test sets.

[Plots not recoverable from the source text. Panel (a): cumulative number of unexpected action outcomes during training. Panel (b): number of plans successfully executed in the test set. Horizontal axis in each panel: training problems, 10 to 100; curves labelled Train 1 and Train 2.]

Table 2: PRODIGY learns new preconditions in the process planning domain.

Conclusion

Process planning is often considered too complex to be handled by general-purpose mechanisms. The work presented here illustrates the application of PRODIGY's general-purpose planner, augmented with learning techniques that improve its performance in a process planning domain along several dimensions.

References

Borrajo, D., and Veloso, M. 1994. Incremental learning of control knowledge for nonlinear problem solving. In Proceedings of the European Conference on Machine Learning, ECML-94. Sicily, Italy: Springer Verlag.

Carbonell, J. G.; the PRODIGY Research Group: Blythe, J.; Etzioni, O.; Gil, Y.; Joseph, R.; Kahn, D.; Knoblock, C.; Minton, S.; Pérez, A. (editor); Reilly, S.; Veloso, M.; and Wang, X. 1992. PRODIGY4.0: The manual and tutorial. Technical Report CMU-CS-92-150, School of Computer Science, Carnegie Mellon University.

Chang, T. C., and Wysk, R. A. 1985. An Introduction to Automated Process Planning Systems. Englewood Cliffs, NJ: Prentice Hall.

Descotte, Y., and Latombe, J.-C. 1985b. Making compromises among antagonist constraints in a planner. Artificial Intelligence 27:183-217.

Doyle, L. E. 1969. Manufacturing Processes and Materials for Engineers. Englewood Cliffs, NJ: Prentice-Hall, second edition.

Etzioni, O. 1990. A Structural Theory of Explanation-Based Learning. Ph.D. Dissertation, Carnegie Mellon University, School of Computer Science. Also appeared as Technical Report CMU-CS-90-185.

Fox, M., and Smith, S. 1984. ISIS: A knowledge-based system for factory scheduling. International Journal of Expert Systems 1(1).

Gil, Y. 1991. A specification of process planning for PRODIGY. Technical Report CMU-CS-91-179, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.

Gil, Y. 1992. Acquiring Domain Knowledge for Planning by Experimentation. Ph.D. Dissertation, Carnegie Mellon University, School of Computer Science. Available as technical report CMU-CS-92-175.

Hayes, C. 1990. Machining Planning: a Model of an Expert Level Planning Process. Ph.D. Dissertation, The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.

Joseph, R. L. 1992. Knowledge Acquisition for Visually Oriented Planning. Ph.D. Dissertation, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA. Available as technical report CMU-CS-92-188.

Kambhampati, S.; Cutkosky, M. R.; Tenenbaum, J. M.; and Lee, S. H. 1993. Integrating general purpose planners and specialized reasoners: Case study of a hybrid planning architecture. IEEE Transactions on Systems, Man and Cybernetics, Special Issue on Planning, Scheduling, and Control 23(6).

Knoblock, C. 1991. Automatically Generating Abstractions for Problem Solving. Ph.D. Dissertation, Carnegie Mellon University, School of Computer Science. Also appeared as Technical Report CMU-CS-91-120.

Minton, S.; Carbonell, J. G.; Knoblock, C. A.; Kuokka, D. R.; Etzioni, O.; and Gil, Y. 1989a. Explanation-based learning: A problem-solving perspective. Artificial Intelligence 40:63-118. Available as technical report CMU-CS-89-103.

Minton, S. 1988. Learning Effective Search Control Knowledge: An Explanation-based Approach. Ph.D. Dissertation, Carnegie Mellon University, School of Computer Science. Also appeared as Technical Report CMU-CS-88-133.

Nau, D. 1987. Automated process planning using hierarchical abstraction. In 1987 Texas Instruments call for papers on AI for Industrial Automation.

Pérez, M. A., and Carbonell, J. G. 1994. Control knowledge to improve plan quality. In Proceedings of the Second International Conference on AI Planning Systems, AIPS-94.

Pérez, M. A., and Etzioni, O. 1992. DYNAMIC: A new role for training problems in EBL. In Sleeman, D., and Edwards, P., eds., Machine Learning: Proceedings of the Ninth International Conference (ML92). San Mateo, CA: Morgan Kaufmann.

Veloso, M. M., and Carbonell, J. G. 1993. Derivational analogy in PRODIGY: Automating case acquisition, storage, and utilization. Machine Learning 10:249-278.

Veloso, M. M. 1989. Nonlinear problem solving using intelligent casual-commitment. Technical Report CMU-CS-89-210, School of Computer Science, Carnegie Mellon University.

Wang, X. 1994. Learning planning operators by observation and practice. In Proceedings of the Second International Conference on AI Planning Systems, AIPS-94.