Effective Approaches for Partial
Satisfaction (Over-subscription) Planning
Romeo Sanchez *
Menkes van den Briel **
Subbarao Kambhampati *
* Department of Computer Science and Engineering
** Department of Industrial Engineering
Arizona State University
Tempe, Arizona
Outline
Background
Example
Approaches
Optiplan
AltAltps
Sapaps
Planning graph heuristics
Results
Background
In one day achieve the following 100
goals: RockData at WP 1, high-res
pics at WP 2 & 3, …., SoilData at WP
100
Given: Actions with costs, and
goals with utilities, find a plan
that has the highest {utility – cost}
No way I can achieve
that many goals in
one day
For all your demands,
you could’ve bought me
a better flash memory
stick at least!
It’s hard but here is
the best I can do:
Goal1, Goal5, Goal99
Previous Approaches:
Highest utility goal first
Estimating the set of most beneficial goals
Background
Complete satisfaction (traditional) planning
Goal state G is a conjunction of fluents: G = g1 ∧ g2 ∧ … ∧ gn
A plan that achieves n – 1 goal fluents is as good as a plan that
achieves 0 goal fluents
Partial satisfaction planning (PSP)
Goal state G is a list of fluents: G = {g1, g2 , …, gn}
Goal fluents might have utilities, actions might have costs,
therefore achieving a partial plan might be more beneficial than the
“null” plan.
Achieving all goal fluents might be impossible…
The goal state G may contain logically conflicting fluents
(:goal (and (pointing satellite1 moon) (pointing satellite1 mars) ))
There might not be enough resources to achieve all fluents in G
(:goal (and (have_rock rover1 waypoint1) (have_rock rover1 waypoint2) ))
PSP problems
PSP Net benefit:
Given a planning problem $P = (F, A, I, G)$, a “cost” $c_a \ge 0$ for each action $a$,
a “utility” $u_f \ge 0$ for each goal fluent $f \in G$, and a positive number $k$:
is there a finite sequence of actions $\Delta = (a_1, a_2, \dots, a_n)$ that, starting
from $I$, leads to a state $S$ that has net benefit
$\sum_{f \in (S \cap G)} u_f - \sum_{a \in \Delta} c_a \ge k$?
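The net benefit test above can be checked mechanically once a plan and its final state are known. The following is a minimal sketch under stated assumptions: the fluent names, action names, utilities, costs, and the threshold k = 7 are all hypothetical and do not come from the slides.

```python
from typing import Dict, Iterable, Set


def net_benefit(final_state: Set[str],
                goal_utility: Dict[str, float],
                plan: Iterable[str],
                action_cost: Dict[str, float]) -> float:
    """Sum of utilities of goal fluents true in the final state,
    minus the summed cost of every action in the plan."""
    utility = sum(u for f, u in goal_utility.items() if f in final_state)
    cost = sum(action_cost[a] for a in plan)
    return utility - cost


# Hypothetical instance: three goal fluents, a two-action plan.
goal_utility = {"g1": 10.0, "g2": 6.0, "g3": 4.0}
action_cost = {"a1": 3.0, "a2": 5.0}
final_state = {"g1", "g2", "p"}          # g3 is not achieved

value = net_benefit(final_state, goal_utility, ["a1", "a2"], action_cost)
meets_threshold = value >= 7.0           # the "net benefit >= k" question with k = 7
```

Only goal fluents that hold in the final state contribute utility, so a plan that skips g3 is still a valid answer as long as the threshold is met.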
[Figure: hierarchy of planning problems: PLAN EXISTENCE, PLAN LENGTH, PSP GOAL, PSP GOAL LENGTH, PLAN COST, PSP UTILITY, PSP NET BENEFIT, PSP UTILITY COST]
Example
Getting from Las Vegas (LV) to San Jose (SJ)
C: action cost
U(G): utility of goal G
G1,G2,G3,G4: goals
P = {travel(LV,DL), travel(DL,SJ), travel(SJ,SF)} achieves G1, G2, G3
Approaches
Optiplan
Integer programming based STRIPS planner
Solves the PSP problem by encoding it as an integer program
AltAltps
Heuristic regression planner
Solves the PSP problem through a goal selection heuristic
Sapaps
Heuristic forward state space planner
Solves the PSP problem using an anytime A* algorithm
Optiplan
Optiplan planning system:
Combines Graphplan (Blum & Furst, 1995) with State Change
Encoding (Vossen et al., 1999)
As in the Blackbox planning system, Graphplan reduces the
encoding size generated by Optiplan
Computes optimal plans for a given parallel length
Objective:
$\sum_{f \in G} U_f \, (x^{add}_{f,n} + x^{preadd}_{f,n} + x^{maintain}_{f,n}) - \sum_{l \in L} \sum_{a \in A} C_a \, y_{a,l}$
Sum of goal utilities
– Sum of action cost
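To make the two terms of the objective concrete, here is a small sketch that evaluates it for a fixed 0/1 assignment. It is not Optiplan's encoding: the dictionaries standing in for the state-change variables at the final level n and for the action variables per level, and all numbers, are assumptions made for illustration.

```python
from typing import Dict, Tuple


def objective_value(U: Dict[str, float],
                    C: Dict[str, float],
                    x_add: Dict[str, int],
                    x_preadd: Dict[str, int],
                    x_maintain: Dict[str, int],
                    y: Dict[Tuple[str, int], int]) -> float:
    """Goal utilities collected at the final level minus the cost of
    every action instance that is switched on at some level."""
    goal_term = sum(
        U[f] * (x_add.get(f, 0) + x_preadd.get(f, 0) + x_maintain.get(f, 0))
        for f in U
    )
    cost_term = sum(C[a] * on for (a, _level), on in y.items())
    return goal_term - cost_term


# Hypothetical assignment: g1 is added at the last level, g2 is not true there.
U = {"g1": 10.0, "g2": 6.0}
C = {"a1": 3.0, "a2": 5.0}
obj = objective_value(
    U, C,
    x_add={"g1": 1}, x_preadd={}, x_maintain={"g2": 0},
    y={("a1", 1): 1, ("a2", 2): 0},
)
```

A goal fluent earns its utility when any one of its three variables is 1 at the final level, which is why the three are summed per fluent.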
Optiplan and partial satisfaction
Total satisfaction planning:
goal satisfaction is treated as
a hard constraint
Objective
0 / Minimize #actions
Constraints
Fluent changes
Satisfy initial state
Satisfy goal
Fluent implications
Action implications
Partial satisfaction planning:
goal satisfaction is treated as
a soft constraint
Objective
Maximize net benefit
Goal utility – action cost
Constraints
Fluent changes
Satisfy initial state
Fluent implications
Action implications
Graphplan based cost propagation
AltAltps
AltAlt planning system
Heuristic state-space search planner (Nguyen, Kambhampati &
Sanchez, 2002)
Combines Graphplan (Blum & Furst, 1995) with heuristic state-space
search techniques (Bonet, Loerincs & Geffner, 1997; Bonet &
Geffner, 1999; McDermott, 1999)
AltAltps planning system
Total enumeration of the 2^n goal subsets is too costly
Selects a promising subset of the top-level goals upfront
Searches for a plan using a regression state space search combined
with cost-sensitive planning graph heuristics.
AltAltps cost propagation
Using a planning graph structure
Propositions in the initial state come for free (they have zero cost)
Other propositions have costs computed as follows:
[Figure: planning graph with propagated costs at levels l = 0, l = 1, and l = 2]
hl(p) = Cost of proposition p at level l
$h_l(p) = \begin{cases}
0 & \text{if } p \in I \\
\min\{h_{l-1}(p),\ cost(a) + C_l(a)\} & \text{if } l > 0 \\
\infty & \text{otherwise}
\end{cases}$
Propagation procedures
Max-propagation
$C_l(a) = \max\{h_{l-1}(q) : q \in prec(a)\}$
Sum-propagation
$C_l(a) = \sum_{q \in prec(a)} h_{l-1}(q)$
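The propagation rules can be sketched as a level-by-level loop over a relaxed planning graph. This is an illustrative sketch, not the AltAltps implementation: the `Action` type, the four-action rover-style domain, and its costs are hypothetical, and delete effects are ignored as in a relaxed graph. Passing `max` or `sum` as `combine` switches between the two propagation procedures.

```python
from math import inf
from typing import Callable, Dict, FrozenSet, Iterable, List, NamedTuple


class Action(NamedTuple):
    name: str
    prec: FrozenSet[str]   # preconditions
    add: FrozenSet[str]    # add effects
    cost: float


def propagate_costs(init: Iterable[str],
                    actions: List[Action],
                    levels: int,
                    combine: Callable[[List[float]], float] = max
                    ) -> List[Dict[str, float]]:
    """Return [h_0, ..., h_levels]; a proposition missing from h_l has cost inf."""
    h = [{p: 0.0 for p in init}]                 # initial-state propositions are free
    for _ in range(levels):
        prev = h[-1]
        cur = dict(prev)                         # costs never increase with the level
        for a in actions:
            if all(q in prev for q in a.prec):   # a is applicable at this level
                c_l = combine([prev[q] for q in a.prec]) if a.prec else 0.0
                for p in a.add:
                    cur[p] = min(cur.get(p, inf), a.cost + c_l)
        h.append(cur)
    return h


# Hypothetical domain.
actions = [
    Action("move_A_B", frozenset({"at_A"}), frozenset({"at_B"}), 4.0),
    Action("move_A_C", frozenset({"at_A"}), frozenset({"at_C"}), 3.0),
    Action("sample_B", frozenset({"at_B"}), frozenset({"have_soil"}), 5.0),
    Action("send", frozenset({"have_soil", "at_C"}), frozenset({"sent"}), 1.0),
]
h_max = propagate_costs({"at_A"}, actions, levels=3, combine=max)
h_sum = propagate_costs({"at_A"}, actions, levels=3, combine=sum)
```

In this instance `send` has two preconditions with costs 9 and 3, so max-propagation charges it 9 while sum-propagation charges it 12; the sum variant can overcount when preconditions share supporting actions.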
AltAltps goal set selection
Main idea
Start with the original goal set G and an empty goal set G’
Iteratively add goals to G’ as long as the estimated NET BENEFIT
increases
The cost of adding another goal g to G’ depends on the goals that
are already in G’
G’: cost for achieving G’, taken from the relaxed plan for G’ (R’p)
G’ ∪ {g}: residual cost for g, taken from Rp for G’ ∪ {g} biased to re-use actions in R’p
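The iterative selection can be sketched as a greedy loop. This is a simplification under stated assumptions: each goal's relaxed plan is given up front as a fixed set of supporting actions (the hypothetical `support` map), and the residual cost of a goal is the cost of its supporting actions that are not yet in the relaxed plan for G’. All names and numbers are invented for illustration.

```python
from typing import Dict, Set, Tuple


def select_goals(utility: Dict[str, float],
                 support: Dict[str, Set[str]],
                 action_cost: Dict[str, float]) -> Tuple[Set[str], Set[str]]:
    """Grow G' one goal at a time while the estimated net benefit increases."""
    selected: Set[str] = set()       # G'
    plan_actions: Set[str] = set()   # actions of the relaxed plan for G'
    while True:
        best_goal, best_gain = None, 0.0
        for g in sorted(utility):
            if g in selected:
                continue
            # Residual cost: only actions not already paid for by G'.
            residual = sum(action_cost[a] for a in support[g] - plan_actions)
            gain = utility[g] - residual
            if gain > best_gain:
                best_goal, best_gain = g, gain
        if best_goal is None:        # no remaining goal improves net benefit
            return selected, plan_actions
        selected.add(best_goal)
        plan_actions |= support[best_goal]


# Hypothetical instance.
utility = {"G1": 10.0, "G2": 6.0, "G3": 5.0}
action_cost = {"a1": 4.0, "a2": 3.0, "a3": 9.0}
support = {"G1": {"a1"}, "G2": {"a1", "a2"}, "G3": {"a3"}}
chosen, relaxed_plan = select_goals(utility, support, action_cost)
```

G2 alone costs 7 for a utility of 6, so it is not worth pursuing by itself; once G1 is in G’ the shared action a1 is already paid for and the residual cost of G2 drops to 3, which is the dependence on G’ described above.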
AltAltps cost-sensitive relaxed plan heuristic
General procedure
States are ranked during search using the relaxed plan heuristic
and the propagated costs
The idea is to compute the cost of a relaxed plan Rp in terms of the
costs of the actions composing it.
1. Given a state S, remove from S the (sub)goal g that has the highest hl(g)
2. Select the action a that supports g with the lowest cost (cost(a) + Cl(a))
3. Regress S over a to get S’ = S ∪ prec(a) \ eff(a)
4. Stop when each proposition q ∈ S is present in the initial state
The heuristic value for S is $h(S) = \sum_{a \in R_p} cost(a)$
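The four steps can be sketched as a short extraction loop. This is a minimal sketch with several assumptions: propagated costs are supplied as one precomputed table `h` rather than per level, Cl(a) uses max-propagation, eff(a) is taken to be the add effects, each action is charged once, and the domain and numbers are hypothetical.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Set


class Action(NamedTuple):
    name: str
    prec: FrozenSet[str]
    add: FrozenSet[str]
    cost: float


def relaxed_plan_cost(state: Set[str], init: Set[str],
                      actions: List[Action], h: Dict[str, float]) -> float:
    """Regress the (sub)goals in `state` back to `init`, summing action costs."""
    open_goals = set(state)
    relaxed_plan: Dict[str, float] = {}          # action name -> cost, counted once
    while not open_goals <= init:                # step 4: stop when all are in I
        # Step 1: pick the most expensive open (sub)goal.
        g = max(sorted(open_goals - init), key=lambda p: h[p])
        # Step 2: cheapest supporter by cost(a) + C(a), with C(a) = max over prec.
        supporters = [a for a in actions if g in a.add]
        a = min(supporters,
                key=lambda s: s.cost + max((h[q] for q in s.prec), default=0.0))
        relaxed_plan[a.name] = a.cost
        # Step 3: regress over a.
        open_goals = (open_goals | a.prec) - a.add
    return sum(relaxed_plan.values())


# Hypothetical domain with precomputed (max-propagated) costs.
actions = [
    Action("move_A_B", frozenset({"at_A"}), frozenset({"at_B"}), 4.0),
    Action("move_A_C", frozenset({"at_A"}), frozenset({"at_C"}), 3.0),
    Action("sample_B", frozenset({"at_B"}), frozenset({"have_soil"}), 5.0),
    Action("send", frozenset({"have_soil", "at_C"}), frozenset({"sent"}), 1.0),
]
h = {"at_A": 0.0, "at_B": 4.0, "at_C": 3.0, "have_soil": 9.0, "sent": 10.0}
h_value = relaxed_plan_cost({"sent"}, {"at_A"}, actions, h)
```

The extracted relaxed plan here contains all four actions, so the heuristic value (13) is higher than the max-propagated cost of `sent` (10), since it accounts for both preconditions of `send`. The sketch assumes every open goal has at least one supporter and raises an error otherwise.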
Sapaps
SAPAPS: a forward A* approach for PSP
[Figure: search tree over actions A1: Navigate(X,Y), A2: SampleSoil(Y), A3: TakePicture, A4: Navigate(Y,Z), A5: SampleRock]
Anytime A* Algorithm:
Search through best beneficial nodes
Node evaluation:
g(S) = U(S) – C(S)
h(S) = U(RP(S)) – C(RP(S))
A*: f(S) = g(S) + h(S)
Example:
g(S) = Util(HasSoilData) – Cost(A1,A2)
h(S) = Util(Apply(A3,S)) – Cost(A3)
Beneficial Node:
g(S) > 0 or U(S) > C(S)
Termination Node:
∀ S’: g(S) > f(S’)
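The node evaluation and termination test can be sketched as a best-first loop that keeps the best node found so far and stops when no open node has a higher f value. This is not the SAPAPS implementation: the relaxed-plan heuristic is replaced here by a simpler optimistic bound (the summed utility of goals not yet achieved), and the rover-style actions, costs, and utilities are hypothetical.

```python
import heapq
from itertools import count
from math import inf
from typing import Dict, FrozenSet, List, NamedTuple, Tuple


class Action(NamedTuple):
    name: str
    prec: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]
    cost: float


def anytime_search(init: FrozenSet[str], actions: List[Action],
                   utility: Dict[str, float]) -> Tuple[float, List[str]]:
    def g_of(state: FrozenSet[str], cost: float) -> float:
        return sum(u for f, u in utility.items() if f in state) - cost

    def h_of(state: FrozenSet[str]) -> float:
        # Assumption: optimistic stand-in for the relaxed-plan heuristic.
        return sum(u for f, u in utility.items() if f not in state)

    best_value, best_plan = g_of(init, 0.0), []     # the empty plan is always available
    tie = count()                                   # tie-breaker for the heap
    frontier = [(-(best_value + h_of(init)), next(tie), init, 0.0, [])]
    cheapest = {init: 0.0}
    while frontier:
        neg_f, _, state, cost, plan = heapq.heappop(frontier)
        if best_value >= -neg_f:                    # no open node can beat the incumbent
            break
        for a in actions:
            if a.prec <= state:
                nxt = frozenset((state - a.delete) | a.add)
                ncost = cost + a.cost
                if cheapest.get(nxt, inf) <= ncost:
                    continue                        # state already reached as cheaply
                cheapest[nxt] = ncost
                g = g_of(nxt, ncost)
                if g > best_value:                  # new best beneficial node
                    best_value, best_plan = g, plan + [a.name]
                heapq.heappush(frontier, (-(g + h_of(nxt)), next(tie),
                                          nxt, ncost, plan + [a.name]))
    return best_value, best_plan


# Hypothetical instance.
actions = [
    Action("Navigate(X,Y)", frozenset({"at_X"}), frozenset({"at_Y"}),
           frozenset({"at_X"}), 4.0),
    Action("SampleSoil(Y)", frozenset({"at_Y"}), frozenset({"HasSoilData"}),
           frozenset(), 3.0),
    Action("TakePicture", frozenset({"at_Y"}), frozenset({"HasPicture"}),
           frozenset(), 6.0),
]
utility = {"HasSoilData": 10.0, "HasPicture": 5.0}
value, plan = anytime_search(frozenset({"at_X"}), actions, utility)
```

Because the best node so far is updated as soon as it is generated, the loop can be interrupted at any point and still return a valid plan, which is the anytime property. In this instance taking the picture costs more than its utility, so the search settles on sampling soil only.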
SAPAPS: heuristic
Heuristic: Variation of SAPA’s Approach
Heuristically extracting the least cost relaxed plan using cost-function
Remove “unbeneficial” goals and related actions
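[Figure: relaxed plan with actions A1, A2, A3, A4 and goals G1, G2, G3 → relaxed plan with actions A1, A3 and goals G1, G2; annotation: C(A1) + C(A2) > U(G3)]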
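Removing "unbeneficial" goals from a relaxed plan can be sketched as follows. This is an assumed reading of the step, not the SAPAPS code: a goal is dropped when the cost of the actions that only it needs exceeds its utility, and those actions are dropped with it. The `support` map and all costs and utilities are hypothetical.

```python
from typing import Dict, Set, Tuple


def prune_unbeneficial(utility: Dict[str, float],
                       support: Dict[str, Set[str]],
                       action_cost: Dict[str, float]) -> Tuple[Set[str], Set[str]]:
    """Return the goals kept and the relaxed-plan actions that still support them."""
    support = {g: set(acts) for g, acts in support.items()}   # work on a copy
    changed = True
    while changed:
        changed = False
        for g in sorted(support):
            others = set().union(*(acts for k, acts in support.items() if k != g))
            exclusive = support[g] - others        # actions needed by g alone
            if sum(action_cost[a] for a in exclusive) > utility[g]:
                del support[g]                     # g costs more than it is worth
                changed = True
                break                              # recompute sharing after removal
    kept_actions = set().union(*support.values())
    return set(support), kept_actions


# Hypothetical relaxed plan: G3 relies on two actions no other goal uses.
utility = {"G1": 6.0, "G2": 5.0, "G3": 7.0}
action_cost = {"A1": 2.0, "A2": 4.0, "A3": 3.0, "A4": 5.0}
support = {"G1": {"A1"}, "G2": {"A1", "A3"}, "G3": {"A2", "A4"}}
goals_kept, actions_kept = prune_unbeneficial(utility, support, action_cost)
```

Sharing is recomputed after every removal because dropping one goal can leave an action supporting only a single remaining goal, which may in turn make that goal unbeneficial.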
Empirical results
Future work