Over-subscription Planning with Numeric Goals

J. Benton, Computer Sci. & Eng. Dept., Arizona State University, Tempe, AZ
Minh Do, Palo Alto Research Center (PARC), Palo Alto, CA
Subbarao Kambhampati, Computer Sci. & Eng. Dept., Arizona State University, Tempe, AZ

Over-subscription Planning
  Goals are optional and have utility
  Actions have cost
  Maximize utility - cost ("benefit")
  Rovers example ["The Mystery Talk", Smith 2003]
    Initial: At A
    Goals: Soil_Sample @ B & C
  [Figure: rover map with waypoints A, B and C, labeled with utilities (Util = 500, Util = 200), costs (200, 300, 500) and the values 300, 200 and -100.]

Motivation
  Numeric goals also have utility
    More soil gives better instrument reading
    More packages give more profit
  Cost for achieving varying values differs
    More soil requires more weight
    More packages require more deliveries

Objective
  Satisfy numeric goals at different values to give varying utility
  Want more/less
    G = soil-sample ∈ [2,4]
    U(G) = (* (soil-sample) 2)

Challenge: A measurable level of numeric goal achievement (degree of satisfaction)
  [Figure: benefit plotted against soil collected, with the best benefit value marked.]
  Each Collect action adds 1 gram; the action costs are 1, 2 and 3.
    Collect (1 gram): action cost = 1
    Collect (1 gram): action cost = 2; total cost = 3; util = 2*2 = 4; benefit = 4 - 3 = 1
    Collect (1 gram): action cost = 3; total cost = 6; util = 3*2 = 6; benefit = 6 - 6 = 0

Modeling Numeric Goal Over-subscription
  Achieve with a given utility
  Specify a goal range: G = soil-sample ∈ [2,4]
    Infinity on range OK
  1. Fixed utility for satisfying level
  2. Linear: U(G) = (* (soil-sample) 2)
     [Figure: utility (0 to 8) plotted against sample (1 to 4).]
  3. Hard bounds
  4. Model as a separate goal

SapaMps Architecture
  Based on SapaPS
  [Figure: flowchart. Box labels, in order of appearance:]
    Planning problem input (over-subscribed planning); initial state
    Queue of time-stamped states
    Select state with best f-value
    Better benefit plan? (Yes: output plan / No)
    Build RTPG; propagate cost; find utility
    Generate states by applying actions
    Anytime A* search

Challenge: Heuristic Support
  Heuristic needs to...
    Estimate cost of achieving variable values
    Find the utility of the values
  Extend current state-of-the-art techniques
    Planning graph structure
    Reachability estimation
    Cost propagation

Challenge: Find Goal Achievement Cost
  Propagate reachable values with cost
    A range of possible values
    Cost of achieving each value bound
  [Figure: planning graph over the actions Move(Waypoint1), Sample_Soil, Sample_Soil and Communicate; v1 takes the ranges [0,0], [0,1], [0,2], each annotated with a cost( ) value.]

Cost Propagation on Variable Bounds
  Bound cost is dependent upon
    Action cost
    Previous bound cost: the current bound cost adds to the next
    Cost of all bounds in expressions
  Sample_Soil, effect: v1 += 1
    v1: [0,0], then [0,1] after Sample_Soil, then [0,2] after Sample_Soil
    Cost(v1=2) = C(Sample_Soil) + Cost(v1=1)
  Sample_Soil, effect: v1 += v2
    v1: [0,0], v2: [0,3]; v1 becomes [0,3] after Sample_Soil, then [0,6] after Sample_Soil
    Cost(v1=6) = C(Sample_Soil) + Cost(v2=3) + Cost(v1=3)

Extracting Relaxed Plan with Numeric Info
  Start with best benefit bounds
  Relaxed plan includes
    Actions
    Supporting bounds
  [Figure: benefit curve with the best benefit value marked.]
  Example actions
    Sample_Soil 1 (Sa1): Dur = 1, Cost: 1, (at end) V1 += 1
    Sample_Soil 2 (Sa2): Dur = 1.25, Cost: 2, (at end) V1 += 2
    Communicate (Com): Dur = 1.5, Cost: 3, (at start) V1 ≥ 1, (at start) V2 := V1
  v1: soil sample in rover's store
  v2: soil sample communicated
  Goal: v2 ∈ [5,∞], U(v2 ∈ [5,∞]) = v2 * 3
  [Figure: upper bound of v1 and v2 at each time point t (0, 1, 1.25, 2, 2.5, 3, 3.75, 4); each bound is annotated with its value, its cost (C:1, C:2, C:4) and its supporting action (Sa1, Sa2 or Com), and the v2 bound that satisfies the goal is marked.]
  h(S) = U(G) - (cost of actions + cost of bounds)

Results: Modified Rovers
  Added numeric variables: soil and rock sample amount in rover store
  More communicated soil/rock gives greater utility
  Average improvement: 3.06

Anytime A* Search Behavior

Results: Modified Logistics
  Added numeric variables: number of packages at location
  More packages give greater utility
  Average improvement: 2.88

Summary
  Over-subscription planning in the presence of
    Numeric goals
    Durative actions
  Propagating cost over numeric values

Future Work
  Delayed satisfaction of goals
  Goal utility dependency

Questions
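
The rule on the "Cost Propagation on Variable Bounds" slide can be illustrated with a small sketch. This is not the SapaMps implementation; the function names and the action cost of 1 are assumptions chosen for illustration. It shows only the two recurrences stated on the slide: for `v1 += 1`, the new bound costs the action cost plus the cost of the previous bound; for `v1 += v2`, the cost of the v2 bound used in the expression is added as well.

```python
# Hypothetical sketch of the slide's bound-cost recurrences.
# Names and the example action cost are illustrative assumptions.

def propagate_increment(action_cost, steps, start_value=0, start_cost=0.0):
    """Costs of successive upper bounds when an effect `v1 += 1` is applied repeatedly.

    Cost(v1 = k + 1) = C(action) + Cost(v1 = k)
    """
    costs = {start_value: start_cost}
    value = start_value
    for _ in range(steps):
        costs[value + 1] = action_cost + costs[value]
        value += 1
    return costs


def cost_after_add(action_cost, v1_bound_cost, v2_bound_cost):
    """Cost of the new upper bound of v1 after an effect `v1 += v2`.

    Cost(new v1 bound) = C(action) + Cost(v2 bound) + Cost(old v1 bound)
    """
    return action_cost + v2_bound_cost + v1_bound_cost


# Two applications of Sample_Soil (assumed cost 1) starting from v1 = 0.
v1_costs = propagate_increment(action_cost=1.0, steps=2)
print(v1_costs)
```

With an action cost of 1, the bounds [0,0], [0,1] and [0,2] receive costs 0, 1 and 2, matching Cost(v1=2) = C(Sample_Soil) + Cost(v1=1).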
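
The idea of starting from the "best benefit" bound can be sketched with the numbers from the soil-collection example (Collect costs 1, 2, 3 and U(G) = 2 * soil-sample over the goal range [2,4]). This is a minimal illustration, not the planner's procedure: the helper names are hypothetical, and the choice to award zero utility outside the goal range is an assumption made here.

```python
# Hypothetical sketch: pick the amount of soil with the best benefit,
# where benefit = utility - cumulative action cost.

def utility(value, lo=2, hi=4, per_unit=2):
    # Assumption: utility is earned only when the value lies in [lo, hi].
    return per_unit * value if lo <= value <= hi else 0


def benefit_table(collect_costs):
    """Return rows of (grams, utility, total_cost, benefit), starting at 0 grams."""
    rows = [(0, utility(0), 0, utility(0))]
    total_cost = 0
    for grams, action_cost in enumerate(collect_costs, start=1):
        total_cost += action_cost
        u = utility(grams)
        rows.append((grams, u, total_cost, u - total_cost))
    return rows


def best_benefit(collect_costs):
    """Row with the highest benefit (first one wins on ties)."""
    return max(benefit_table(collect_costs), key=lambda row: row[3])


rows = benefit_table([1, 2, 3])
best = best_benefit([1, 2, 3])
print(rows)
print(best)
```

Under these assumptions, collecting 2 grams gives utility 4 at total cost 3 (benefit 1), while a third gram raises the total cost to 6 and drops the benefit to 0, so the 2-gram bound is the one selected.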