Cost Based Satisficing Search Considered Harmful
William Cushing
J. Benton
Subbarao Kambhampati
Performance Bug: ε-Cost `Trap’
 High cost variance: ε = $0.01 / $100.00




Board/Fly
Load/Drive
Labor/Precious Material
Mode Switch/Machine Operation
Search depth:
0^-1 · (heuristic error) = ∞
ε^-1 · (heuristic error) = huge




Optimal: cost=$1000.00, size=100,000
Runner-up: cost=$1000.10, size=20
Trillions of nodes expanded: when does the search ever get to exhaust depth 20? (back-of-the-envelope sketch below)
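To make the trap concrete, here is a rough arithmetic sketch. The cheapest-action cost matches the slide ($0.01); the branching factor and the heuristic error (cost slack) are illustrative assumptions, not numbers from the talk.

```python
# Back-of-the-envelope sketch (assumed numbers) of the depth bound above:
# within a heuristic error of delta, cost-based search can wander delta/eps
# cheap steps deep, and the cheap paths below every frontier node multiply.
eps = 0.01        # cheapest action, e.g. board/debark ($0.01)
delta = 0.10      # assumed heuristic error / cost slack ($0.10)
b = 5             # assumed number of applicable cheap actions per state

depth = int(delta / eps)       # "free" depth of cheap wandering
print(depth, b ** depth)       # 10, 9765625 cheap paths under ONE frontier node
# With delta on the scale of the $100.00 actions, delta/eps is in the
# thousands and the node counts reach the trillions quoted on this slide.
```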
Outline
 Inevitability of ε-cost Traps
 Cycle Trap
 Branching Trap
 Travel Domain
 If Cost is Bad, then what?
 Surrogate Search
 Simple First: Size
 Then: Cost-Sensitive Size-Based Search
Cycle Trap
 Effective search graph
[Figure: effective search graph, a simple cycle; edge labels of 1]
g’ = f = g + h
Edge weights = changes in f
0 = ideal
- = over-estimated earlier, or under-estimating now
+ = under-estimated earlier, or over-estimating now
 Simple subgraph
 Heuristic plateau
 1 choice: Which way?
Cycle Trap
 Even providing a heuristic perfect for all but 1 edge…
 Cost-based search fails
[Figure: the cycle's effective search graph; edge weights 2, 0, 2]
 Reversible operators are one way in which heuristic penalty can end up being bounded from above
 “Unbounded f along unbounded paths”, to have completeness, also forces a heuristic upper bound
 Fantastically over-estimating (weighting) could help, but the required weight is (2ε)^-1 ≫ 1
 Suppose the right edge actually costs 1−ε
 Then both directions would have identical heuristic value
 Weighting would be fruitless
Branching Trap
x = # of cost-1 children
y = # of cost-ε children
d/2 + d·ε/2 = C
d = 2C/(1+ε)
x + y^(1/ε) = ways to spend 1 unit of cost
(x + y^(1/ε))^C = ways to spend C
(x + y)^d = # of paths at the same depth
(x + y)^(2C/(1+ε)) ≪ (x + y^(1/ε))^C
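As a sanity check on the inequality above, the following sketch plugs in assumed values for x, y, ε, and C (chosen for illustration, not taken from the paper) and compares the two counts.

```python
# Sketch of the branching-trap arithmetic above, with assumed values.
from fractions import Fraction

x, y = 2, 2                  # cost-1 children and cost-eps children per node
eps = Fraction(1, 10)        # so 1/eps = 10 cheap steps per unit of cost
C = 5                        # cost horizon

d = (2 * C) / (1 + eps)                      # depth of a half-and-half path
paths_by_depth = (x + y) ** int(d)           # (x + y)^(2C/(1+eps))
ways_to_spend_1 = x + y ** int(1 / eps)      # one dear step, or 1/eps cheap ones
paths_by_cost = ways_to_spend_1 ** C         # (x + y^(1/eps))^C

print(int(d), paths_by_depth)   # 9, 262144 paths at the same depth
print(paths_by_cost)            # 1026^5 ~ 1.1e15 paths at the same cost: vastly more
```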
Travel
[Figure: travel domain map with locations 1, 2, R, A, B]
Straight Fly = 10,000 cents
Diag. Fly = 7,000 cents
Board/Debark = 1 cent
Various Solutions:
 Cheapest Plan
 Fastest Plan
 Smallest Plan
Travel – Cheapest Plan
[Figure: the cheapest plan traced on the map]
Travel – Decent Start
[Figure: partial plan on the map]
g = 1 fly + 4 board + 1 debark
h = 2 fly + 4 debark + 1 board
f ~ 3 fly
Travel – Begin Backtracking
[Figure: partial plan on the map]
g = 2 fly + 4 board + 1 debark
h = 2 fly + 4 debark + 1 board
f ~ 4 fly
Travel – Backtracking
[Figure: cheap rearrangements explored on the map]
g = 1 fly + 4 board + 2 debark
h = 2 fly + 4 debark + 2 board
Travel – Backtracking
[Figure: map]
g = 1 fly + 4 board + 3 debark
h = 2 fly + 3 debark + 2 board
Fly 1-2-B, then teleport passengers
Travel – Backtracking
[Figure: map]
g = 1 fly + 6 board + 3 debark
h = 2 fly + 4 debark + 1 board
 8 people: 3^4 · 2^4 = 1296
 (1+0)^8, (1+1)^8, (1+2)^8, (1+4)^8 = 1, 256, 6561, 390625
Travel Calculations
 4 planes located in 5 cities
 54 = 625 plane assignments
 4k passengers, located in 9 places
 94k passenger assignments globally
 Cheap subspace
 Product over each city
 (1 + city-local planes) (city-local passengers)
 e.g., (1+2)4(1+1)4 = 1296
 Stop exploring
 Large evaluation
 Exhaustion of possibilities
 Cost-based search exhausts cheap subspaces
 Eventually
 Assuming an upper bound on the heuristic
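The cheap-subspace count is just a product over cities; the sketch below reproduces the slide's 1296 example. The city layout passed in (2 planes and 4 passengers in one city, 1 plane and 4 passengers in another) is the configuration the slide's example assumes.

```python
# Cheap-subspace size from this slide: at each city, every passenger can be
# on the ground or aboard any locally parked plane (board/debark cost only
# 1 cent each), so the product below counts the reachable cheap states.
def cheap_subspace_size(cities):
    """cities: list of (planes_here, passengers_here) pairs."""
    size = 1
    for planes, passengers in cities:
        size *= (1 + planes) ** passengers
    return size

print(cheap_subspace_size([(2, 4), (1, 4)]))   # (1+2)^4 * (1+1)^4 = 1296
```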
Outline
 Inevitability of ε-cost Traps
 Cycle Trap
 Branching Trap
 Travel Domain
 If Cost is Bad, then what?
 Surrogate Search
 Simple First: Size
 Then: Cost-Sensitive Size-Based Search
Surrogate Search
 Replace ill-behaved Objective with a well-behaved Evaluation
 Tradeoff: Trap Defense versus Quality Focus
 Evaluation Function: “Go no further”
 Force ε ~ 1
 Make g and f grow fast enough: at least linearly in size, Ω(size)
 Normalize costs for hybrid methods
 Heuristic: “Go this way”
 Calculate h in the same units as g
 Retain true Objective
 branch-and-bound
 duplicates elimination + re-expansion
 Re-expansion of duplicates should be done carefully
 Can wait till future iterations, cache heuristics, use path-max, …
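One possible way to realize "force ε ~ 1", "grow at least linearly in size", and "normalize costs for hybrid methods" together is sketched below; the specific formula is an assumption of this write-up, not necessarily the one used in the experiments.

```python
# Minimal sketch (assumed formula) of a surrogate evaluation: the size term
# guarantees every step contributes at least 1 (trap defense), while the
# normalized cost term keeps the evaluation quality-sensitive (quality focus).
def surrogate_g(path_size, path_cost, max_action_cost):
    # each step contributes between 1 and 2 to g', so search depth stays bounded
    return path_size + path_cost / max_action_cost

def surrogate_f(path_size, path_cost, h_size, h_cost, max_action_cost):
    # heuristic reported in the same units as g ("calculate h in the same units as g")
    return surrogate_g(path_size, path_cost, max_action_cost) \
        + h_size + h_cost / max_action_cost
```

The true cost objective is still kept on the side for branch-and-bound pruning and careful duplicate re-expansion, as the next slides spell out.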
Size-based Search
 Replace ill-behaved Objective with a well-behaved Evaluation
 Pure Size
 Evaluation Function: “Go no further”
 Force ε = 1
 Heuristic: “Go this way”
 Replace cost metric with size metric in relaxed problem
 Retain true Objective, for pruning
 Re-solve the heuristic with the real objective (to obtain h_cost)
 branch-and-bound: gcost+hcost >= best-known-cost
 duplicates: new.gcost >= old.gcost
 Re-expand better cost paths discovered
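A minimal sketch of the recipe on this slide: best-first search ordered purely by size, with the true cost retained for branch-and-bound pruning and for deciding when to re-expand duplicates. The problem interface (successors, is_goal, h_size, h_cost) is hypothetical.

```python
# Size-based best-first search with the real objective kept for pruning.
import heapq
import itertools

def size_based_search(start, successors, is_goal, h_size, h_cost,
                      best_known_cost=float("inf")):
    """Best-first on g_size + h_size; prune and re-open duplicates by real cost."""
    counter = itertools.count()                 # tie-breaker so states never compare
    open_list = [(h_size(start), next(counter), start, 0, 0.0)]
    best_g_cost = {}                            # cheapest real cost seen per state
    best_plan_cost = best_known_cost

    while open_list:
        f_size, _, state, g_size, g_cost = heapq.heappop(open_list)
        # branch-and-bound on the *real* objective: g_cost + h_cost >= best-known-cost
        if g_cost + h_cost(state) >= best_plan_cost:
            continue
        # duplicate elimination: skip unless this path improves the real cost
        if state in best_g_cost and g_cost >= best_g_cost[state]:
            continue
        best_g_cost[state] = g_cost             # re-expansion of better-cost paths
        if is_goal(state):
            best_plan_cost = g_cost
            yield state, g_cost                 # anytime: keep looking for cheaper plans
            continue
        for child, step_cost in successors(state):
            heapq.heappush(open_list, (g_size + 1 + h_size(child), next(counter),
                                       child, g_size + 1, g_cost + step_cost))
```

Because the evaluation is size, every step contributes exactly 1 to g, so ε = 1 and the search depth is bounded by the evaluation rather than only by duplicate checking.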
Cost-sensitive Size-Based Heuristic
 Replace ill-behaved Objective with a well-behaved Evaluation
 Evaluation Function: “Go no further”
 Heuristic: “Go this way”
 Estimate cheapest/best, but calculate size
 sum/max/… propagation of real objective for heuristic
 make minimization choices with respect to real objective
 Last minute change:
 Recalculate value of minimization choices by surrogate
 Retain true Objective, for pruning
 Also calculate the relaxed solution’s cost
 Faster than totally re-solving the heuristic
 branch-and-bound: gcost+hcost >= best-known-cost
 If heuristic is inadmissible, force it to be admissible eventually
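A simplified sketch of the heuristic described here, in a delete-relaxation flavor: supporters are chosen by real cost, but the value handed to the search is the relaxed plan's size, with its real cost computed alongside for pruning. The interface (cheapest_achiever, preconditions, action_cost) is hypothetical; a real implementation would extract the relaxed plan from a relaxed planning graph.

```python
# Minimal sketch: minimization choices by real cost, heuristic value by size,
# real cost of the relaxed plan kept as a side product for pruning.
def cost_sensitive_size_heuristic(goals, state_facts, cheapest_achiever,
                                  preconditions, action_cost):
    relaxed_plan, queue = set(), list(goals)
    while queue:
        fact = queue.pop()
        if fact in state_facts:
            continue                          # already true in the current state
        action = cheapest_achiever[fact]      # choice made w.r.t. the real objective
        if action not in relaxed_plan:
            relaxed_plan.add(action)
            queue.extend(preconditions[action])
    h_size = len(relaxed_plan)                           # surrogate heuristic value
    h_cost = sum(action_cost[a] for a in relaxed_plan)   # real cost, for pruning
    return h_size, h_cost
```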
Results – LAMA
 LAMA
 Greedy best-first: bad plans
 (iterative) WA*: no plan, time out
 LAMA-size
 Greedy best-first: same bad plans
 (iterative) WA*: direct plans, time out
 Better cost! … but no rendezvous
Expected Result:
Only one kind of object
Costs not widely varying
Portfolio approach possible
Results – SapaReplan
 WA*-cost
 Weight 5: one bad plan, time out
 Weight 2: no plan, memory out
 WA*-size
 Weight 1-2: better plans, memory out
Quality-sensitive evaluation function: cost+size
Conclusion
 ε-cost traps are inevitable
 Typical:
 Large variation in cost
 Large cheap subspaces
 Upper-bounded heuristics
 Large plateaus in objective
 Cost-based systematic approaches are susceptible
 Even with all kinds of search enhancements: LAMA
 Because search depth is “unbounded” by the cost-based evaluation function
 ε^-1 · (h-error) ≈ 0^-1 · (h-error)
 That is, search depth is bounded only by duplicate checking
 Force good behavior:
 Evaluation ≠ Objective
 Force ε~1
 Quality Focus versus Trap Defense
 Simplest surrogate:
 Size-based Search
 Force ε=1
 Performs surprisingly well
 Despite total lack of Quality Focus
 Easy variation:
 Cost-sensitive Size-based Heuristic
 Still force ε=1
 Recalculate heuristic by surrogate
 Performs yet better
Conclusion (Polemic)
 Lessons best learnt and then forgotten:
 goto is how computers work efficiently
 (Go enthusiasts: joseki)
 A* is how search works efficiently
 Both are indispensable
 Both are best-possible
 In just the right context
 Both are fragile
 If the context changes
If size doesn’t work…
Speed Everything Up
Reduce All Memory Consumption
Improve anytime approach: Iterated, Portfolio, Multi-Queue
Guess (search over) upper bounds
Decrease weights
Delay duplicate detection
Delay re-expansion
Delay heuristic computation
Exploit external memory
Use symbolic methods
Learn better heuristics: from search, from inference
Precompute/Memoize anything slow: the heuristic
Impose hierarchy (state/task abstraction)
Accept knowledge (LTL)
Use more hardware: (multi-)core/processor/computer, GPU
Related Work: The Best Approach?
 The Best Surrogate? The Best Approach Over All?
Improve Exploitation:
(Dynamic) Heuristic Weighting (Pohl, Thayer+Ruml)
Real-time A* (Korf)
Beam search (Zhou)
Quality-sensitive probing/lookahead (Benton et al., PROBE)
Improve Exploration:
Path-max, A** (Dechter+Pearl)
Multi-queue approaches (Thayer+Ruml, Richter+Westphal, Helmert)
Iterated search (Richter+Westphal)
Portfolio methods (Rintanen, Streeter)
Breadth-first search [as a serious contender] (Edelkamp)
Directly Address Heuristic Error:
h_cea, h_ff, h_lama, h_vhpop, h_lpg, h_crikey, h_sapa, …
Pattern Databases (Culberson+Schaeffer, Edelkamp)
Limited Discrepancy Search (Ginsberg)
Negative Result: “How Good is Almost Perfect?” (Helmert+Röger)
`See’ the Structure (remove the traps):
Factored Planning (Brafman+Domshlak)
Direct Symmetry Reductions (Korf, Long+Fox)
Symbolic Methods, Indirect Symmetry Reduction (Edelkamp)
Related Fields
 Reinforcement Learning: Exploration/Exploitation
 Markov Decision Processes: Off-policy/On-policy
 Reward Shaping, Potential Field Methods (Path-search)
 Prioritized Value Iteration
 Decision Theory: Heuristic Errors
 “Decision-Theoretic Search” (?)
 k-armed Bandit Problems (UCB)
 Game-tree Search: Traps, Huge Spaces
Without traps, game-tree pathology (Pearl)
Upper Confidence Bounds on Trees (UCT)
Quiescence search
Proof-number search (Allis?)
 Machine Learning: Really Huge Spaces
 Surrogate Loss Functions
 Continuous/Differentiable relaxations of 0/1
 Probabilistic Reasoning: Extreme Values are Dangerous
 that 0/1 is bad is well known
 but also ε is numerically unstable
What isn’t closely related?
Typical Puzzles: Rubik’s Cube, Sliding Tiles, …
Prove Optimality / Small Problems
Tightly Bounded Memory: IDDFS, IDA*, SMA*
Unbounded Memory, but:
Delayed/Relaxed Duplicate Detection (Zhou, Korf)
External Memory (Edelkamp, Korf)
More than one problem:
D*, D*-Lite, Lifelong Planning A* (Koenig)
Case-based planning
Learned heuristics
State-space isn’t a blackbox:
Bidirectional/Perimeter Search
Randomly expanding trees for continuous path planning in low dimensions
Waypoint/abstraction methods
Any-angle path planning (Koenig)
State-space is far from a blackbox; Planning isn’t (only) State-space search (Kambhampati):
Explanation Based Learning
Theorem Proving (Clause/Constraint Learning)
Forward Checking (Unit Propagation)
Engineering:
Subroutine speedup via Precomputation/Memoization
Python vs C
Priority Queue implementation (bucket heaps!)
Quotes
 “… if in some problem instance we were to allow B to skip even one node that is expanded by A, one could immediately present an infinite set of instances when B grossly outperforms A. (This is normally done by appending to the node skipped a variety of trees with negligible costs and very low h.)”
 Rina Dechter, Judea Pearl
 “I strongly advise that you do not make road movement free (zero-cost). This confuses pathfinding algorithms such as A*, …”
 Amit Patel
 “Then we could choose an ĥ somewhat larger than the one defined by (3). The algorithm would no longer be admissible, but it might be more desirable, from a heuristic point of view, than any admissible algorithm.”
 Peter Hart, Nils Nilsson, Bertram Raphael
 Roughly: `… inordinate amount of time selecting among equally meritorious options’ – Ira Pohl