Over-subscription Planning with Numeric Goals

J. Benton
Computer Sci. & Eng. Dept.
Arizona State University
Tempe, AZ
Minh Do
Palo Alto Research Center (PARC)
Palo Alto, CA
Subbarao Kambhampati
Computer Sci. & Eng. Dept.
Arizona State University
Tempe, AZ
Over-subscription Planning
 Goals optional & have utility
 Actions have cost
 Maximize utility-cost: “Benefit”
Rovers Example [“The Mystery Talk”, Smith 2003]
Initial: At A
Goals: Soil_Sample @ B & C
[Figure: map of waypoints A, B, and C, labeled with cost = 200, cost = 300, cost = 500, Util = 500, Util = 200, and the values 300, 200, -100.]
Motivation
 Numeric goals also have utility
 More soil gives better instrument reading
 More packages give more profit
 Cost for achieving varying values differs
 More soil requires more weight
 More packages require more deliveries
Objective
Satisfy numeric goals at different values to give varying utility
 Want more/less
G = soil-sample ∈ [2,4]
U(G) = (* (soil-sample) 2)
 Challenge – A measurable level of numeric goal achievement: degree of satisfaction
[Figure: benefit vs. soil collected, marking the best benefit value. Each Collect action adds 1 gram; the successive Collect actions cost 1, 2, and 3.]
2 grams: cost=3, util=2*2=4, Benefit=4-3=1
3 grams: cost=6, util=3*2=6, Benefit=6-6=0
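The benefit arithmetic on this slide can be reproduced with a short sketch. It assumes, as read off the figure, that the first, second, and third Collect actions cost 1, 2, and 3 and each gathers 1 gram. The names `COLLECT_COSTS`, `utility`, and `benefit` are illustrative and do not come from the planner.

```python
# Cost of the 1st, 2nd, and 3rd Collect action (values taken from the figure).
COLLECT_COSTS = [1, 2, 3]

def utility(grams):
    # U(G) = (* (soil-sample) 2)
    return 2 * grams

def benefit(grams):
    # Benefit = utility - total cost of the Collect actions used.
    return utility(grams) - sum(COLLECT_COSTS[:grams])

# Evaluate the two goal levels worked out on the slide.
benefits = {grams: benefit(grams) for grams in (2, 3)}
best_grams = max(benefits, key=benefits.get)
print(benefits, best_grams)
```

Under these assumptions, collecting 2 grams gives the better benefit, because the third Collect costs more than the utility it adds.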
Modeling Numeric Goal Over-subscription
 Achieve with a given utility
 Specify a goal range
G = soil-sample ∈ [2,4]
Infinity on range OK
1. Fixed utility for satisfying level
2. Linear: U(G) = (* (soil-sample) 2)
3. Hard bounds
4. Model as a separate goal
[Figure: Utility (0 to 8) plotted against Sample (1 to 4).]
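The fixed and linear utility models over a goal range might be sketched as follows. This is a minimal illustration, not the planner's representation. It assumes that a value outside the goal range earns no utility, which the slide does not state. The function names and the `reward` and `rate` parameters are hypothetical.

```python
import math

def fixed_utility(value, lo, hi, reward):
    """Fixed utility: the same reward for any value inside [lo, hi]."""
    # Assumption: values outside the range earn nothing.
    return reward if lo <= value <= hi else 0.0

def linear_utility(value, lo, hi, rate):
    """Linear utility: rate * value for any value inside [lo, hi]."""
    # Assumption: values outside the range earn nothing.
    return rate * value if lo <= value <= hi else 0.0

# G = soil-sample in [2,4], U(G) = (* (soil-sample) 2)
in_range = linear_utility(3, 2, 4, rate=2)
below_range = linear_utility(1, 2, 4, rate=2)
# An infinite upper end of the range is allowed.
open_ended = linear_utility(10, 2, math.inf, rate=2)
print(in_range, below_range, open_ended)
```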
SapaMps Architecture
Based on SapaPS
Anytime A* Search
[Flowchart: input (over-subscribed planning problem, initial state); queue of time-stamped states; select state with best f-value; better benefit plan? (Yes: output plan; No); generate states by applying actions; build RTPG, propagate cost, find utility.]
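The select, compare, and expand loop in the flowchart can be illustrated with a generic anytime best-first search. This is a sketch under stated assumptions, not the SapaMps implementation. The toy domain reuses the Collect costs 1, 2, 3 from the Objective slide, a state is a hypothetical pair (grams collected, cost spent), and the heuristic is a simple optimistic bound that ignores the remaining action costs.

```python
import heapq
from itertools import count

def anytime_best_first(initial, successors, benefit, heuristic):
    """Yield (benefit, plan) whenever a plan beats the best one found so far."""
    tie = count()  # tie-breaker so the heap never compares states or plans
    best = float("-inf")
    queue = [(-(benefit(initial) + heuristic(initial)), next(tie), initial, [])]
    while queue:
        neg_f, _, state, plan = heapq.heappop(queue)
        if -neg_f <= best:
            continue  # optimistic f-value cannot beat the best plan so far
        if benefit(state) > best:
            best = benefit(state)
            yield best, plan  # "Better benefit plan? Yes: output plan"
        for action, nxt in successors(state):
            f = benefit(nxt) + heuristic(nxt)
            heapq.heappush(queue, (-f, next(tie), nxt, plan + [action]))

# Toy domain (hypothetical): state = (grams collected, cost spent).
COLLECT_COSTS = [1, 2, 3]

def utility(grams):
    # Assumption: no utility below the goal range [2,4].
    return 2 * grams if grams >= 2 else 0

def benefit(state):
    grams, spent = state
    return utility(grams) - spent

def heuristic(state):
    # Optimistic: utility still obtainable, ignoring remaining action costs.
    return utility(len(COLLECT_COSTS)) - utility(state[0])

def successors(state):
    grams, spent = state
    if grams < len(COLLECT_COSTS):
        yield "Collect", (grams + 1, spent + COLLECT_COSTS[grams])

plans = list(anytime_best_first((0, 0), successors, benefit, heuristic))
print(plans)
```

The search first reports the empty plan (benefit 0) and later the two-Collect plan (benefit 1), which is the anytime behavior: a usable plan is available early and is replaced as better ones are found.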
Challenge – Heuristic Support
 Heuristic needs to…
 Estimate cost of achieving variable values
 Find the utility of the values
 Extend current state-of-the-art techniques
 Planning graph structure
 Reachability estimation
 Cost propagation
Challenge – Find Goal Achievement Cost
 Propagate reachable values with cost
[Figure: planning graph over time points 0, 1, 2, 2.5 with actions Move(Waypoint1), Sample_Soil, Sample_Soil, Communicate.]
v1: [0,0], [0,1], [0,2] (a range of possible values)
Cost of achieving each value bound: 0, 1, 2
Cost Propagation on Variable Bounds
 Bound cost dependent upon
 action cost
 previous bound cost
- current bound cost adds to the next
 Cost of all bounds in expressions
Example 1: Sample_Soil, Effect: v1+=1
v1: [0,0], then [0,1], then [0,2], with one Sample_Soil between levels
Cost(v1=2) = C(Sample_Soil)+Cost(v1=1)
Example 2: Sample_Soil, Effect: v1+=v2
v1: [0,0], then [0,3], then [0,6], with v2: [0,3]
Cost(v1=6) = C(Sample_Soil)+Cost(v2=3)+Cost(v1=3)
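The two cost equations above follow one recurrence: the cost of a new upper bound is the action cost, plus the cost of every bound used in the effect expression, plus the cost of the previous bound. A minimal sketch of that recurrence follows. The function name and parameters are illustrative, and the numeric costs C(Sample_Soil) = 1 and Cost(v2=3) = 2 are assumptions chosen for the example.

```python
def propagate_upper_bound(action_cost, increment, increment_cost, steps):
    """Return [(upper_bound, cost)] for a variable that starts at 0 with cost 0.

    increment      -- amount the effect adds each time the action is applied
    increment_cost -- cost of the bound used in the effect expression
                      (0 when the increment is a constant)
    """
    levels = [(0, 0)]
    for _ in range(steps):
        bound, cost = levels[-1]
        # New bound cost = action cost + cost of bounds in the expression
        #                  + cost of the previous bound.
        levels.append((bound + increment, action_cost + increment_cost + cost))
    return levels

# Effect v1 += 1, assuming C(Sample_Soil) = 1.
constant_effect = propagate_upper_bound(1, increment=1, increment_cost=0, steps=2)

# Effect v1 += v2 with v2's upper bound 3, assuming Cost(v2=3) = 2.
variable_effect = propagate_upper_bound(1, increment=3, increment_cost=2, steps=2)
print(constant_effect, variable_effect)
```

With these assumed costs the first call yields bound costs 0, 1, 2 for v1 = 0, 1, 2, and the second yields Cost(v1=6) = 1 + 2 + 3 = 6.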
Extracting Relaxed Plan with Numeric Info
 Start with best benefit bounds
 Relaxed plan includes
 Actions
 Supporting bounds
[Figure: benefit vs. value, marking the best benefit value.]
v1 – soil sample in rover’s store
v2 – soil sample communicated
Goal: v2 ∈ [5,∞], U(v2 ∈ [5,∞]) = v2 * 3
Sample_Soil 1 (Sa1): Dur = 1, Cost: 1, (at end) V1 += 1
Sample_Soil 2 (Sa2): Dur = 1.25, Cost: 2, (at end) V1 += 2
Communicate (Com): Dur = 1.5, Cost: 3, (at start) V1 ≥ 1, (at start) V2 := V1
[Figure: upper bounds of v1 and v2 at each time point (t = 0, 1, 1.25, 2, 2.5, 3, 3.75, 4). Each bound is labeled with its value, its cost (C:1, C:2, C:4), and its supporting action (Sa1, Sa2, Com); the bound that satisfies the goal is marked.]
h(S) = U(G) - (cost of actions + cost of bounds)
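A worked instance of this formula, as a hedged sketch: the action costs 1, 2, 3 are the Cost labels of Sa1, Sa2, and Com on the slide, while the goal value of 5 and the bound cost of 4 are hypothetical numbers chosen only to make the arithmetic concrete. The function name is illustrative.

```python
def relaxed_plan_heuristic(goal_utility, action_costs, bound_costs):
    # h(S) = U(G) - (cost of actions + cost of bounds)
    return goal_utility - (sum(action_costs) + sum(bound_costs))

goal_value = 5                 # hypothetical value of v2 reached by the relaxed plan
goal_utility = goal_value * 3  # utility of the goal on v2: v2 * 3
h = relaxed_plan_heuristic(goal_utility, action_costs=[1, 2, 3], bound_costs=[4])
print(h)
```

Here h = 15 - (6 + 4) = 5.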
Results – Modified Rovers
 Added numeric variables:
 Soil and rock sample amount in rover store
 More communicated soil/rock - greater utility
Results – Modified Rovers
Average improvement: 3.06
Anytime A* Search Behavior
Results – Modified Logistics
 Added numeric variables:
 Number of packages at location
 More packages - greater utility
Results – Modified Logistics
Average improvement: 2.88
Summary
 Over-subscription planning in the presence of
 Numeric goals
 Durative actions
 Propagating cost over numeric values
Future Work
 Delayed satisfaction of goals
 Goal utility dependency
Questions.