Logistics
• Problem Set 1: due 10/13 at the start of class
• Office hours: Monday 3:30pm, or email for another time
• Reading: R&N ch. 6 (skip ch. 5 for now)
• Mailing list: last reminder to sign up via the course web page
• Dan's travel: off email until Monday morning; if problems come up on the problem set, make the best assumption you can

573 Topics
Perception, NLP, Robotics, Multi-agent, Reinforcement Learning, MDPs, Supervised Learning, Planning, Search, Uncertainty, Knowledge Representation, Problem Spaces, Agency

Search
• Problem spaces
• Blind search: depth-first, breadth-first, iterative deepening, iterative broadening
• Informed search: best-first, Dijkstra's, A*, IDA*, SMA*, DFB&B, beam search, hill climbing, limited discrepancy, AO*, LAO*, RTDP
• Local search
• Heuristics: evaluation, construction, learning
• Pattern databases
• Online methods
• Techniques from operations research
• Constraint satisfaction
• Adversary search

Combinatorial Optimization
• Nonlinear programs
• Convex programs
• Linear programs (poly time)
• Integer programming (NP-complete)
• Flow & matching (fast!)

Genetic Algorithms
• Start with a random population
  Representation is serialized (each state encoded as a string)
  States are ranked with a "fitness function"
• Produce the new generation
  Select random pair(s), with probability proportional to fitness
  Randomly choose a "crossover point"; offspring mix halves of the two parents
  Randomly mutate bits
Example population: 174629844710, 650122942870, 510413310889, 094001133281, 776511094281, 900077644555
  Selection: 174629844710 and 776511094281
  Crossover: 174611094281 and 776529844710
  Mutation:  164611094281 and 776029844210

Properties
• Randomized, parallel beam search guided by the fitness function
• Importance of careful representation

Experiment
• Fitness
  Given 166 random problem instances
  Fitness = number of these problems solved
• Population initialized to 300 random examples
• Results after 10 generations
  The following discovered program solves all 166:
  (EQ (DU (MT CS) (NOT CS))      ; move all blocks to the table
      (DU (MS NN) (NOT NN)))     ; build the correct stack

Koza Block Stacking
• Learn a program which stacks blocks
  Initial blocks can be in any orientation
  The program should build a tower spelling "Universal"
• Clever representation
  CS = name of the block on top of the current stack
  TB = name of the topmost block such that it and all blocks below it are correct
  NN = name of the next block needed above TB
• Imagine if the blocks were described using x,y coordinates instead!

Available Actions
• (MS x): if x is on the table, move it to the top of the stack
• (MT x): if x is somewhere in the stack, move the topmost block to the table
• (EQ x y): returns T if x equals y
• (NOT x)
• (DU x y): do x until y returns T
Any problems?

Eisenstein's Representation (adapted from Jacob Eisenstein's presentation)
[Diagram: event inputs (onScan, onHit, onRammed) feed AFSMs that drive the gun, base, and other actuators]
• Each AFSM is a REX-like program
• Fixed-length encoding
  64 operations per AFSM
  ~2000 bits per genome

Training
• Scaled fitness
• Mutation rate pegged to population diversity
• Typical parameters: 200-500 individuals; 10% copy, 88% crossover, 2% elitism (a sketch of one such generation follows below)
• This takes a LONG TIME!
  Sample from ~25 starting positions
  Up to 50,000 battles per generation
  0.2-1.0 seconds per battle
  20 minutes to 3 hours per generation
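Below is a minimal, illustrative sketch (not Eisenstein's actual system) of how one generation could be produced with the copy/crossover/elitism proportions above. The genome length, mutation rate, and fitness function are placeholder assumptions for illustration only.

```python
import random

# Illustrative placeholders (not from the slides): a genome is a fixed-length
# bit string and `fitness` is any problem-specific score (e.g., battles won).
GENOME_LEN = 200
POP_SIZE = 300

def fitness(genome):
    return sum(genome)                      # toy fitness: count of 1-bits

def select_pair(population, scores):
    # Fitness-proportionate ("roulette wheel") selection of two parents.
    return random.choices(population, weights=scores, k=2)

def crossover(a, b):
    point = random.randrange(1, len(a))     # random crossover point
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(genome, rate=0.01):
    return [bit ^ 1 if random.random() < rate else bit for bit in genome]

def next_generation(population, elitism=0.02, copy=0.10):
    scores = [fitness(g) for g in population]
    ranked = sorted(population, key=fitness, reverse=True)
    n = len(population)
    new_pop = [g[:] for g in ranked[:max(1, int(elitism * n))]]   # 2% elitism
    while len(new_pop) < int((elitism + copy) * n):               # 10% copied as-is
        new_pop.append(select_pair(population, scores)[0][:])
    while len(new_pop) < n:                                       # ~88% via crossover
        for child in crossover(*select_pair(population, scores)):
            if len(new_pop) < n:
                new_pop.append(mutate(child))
    return new_pop

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
for _ in range(10):
    population = next_generation(population)
print(max(map(fitness, population)))
```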
Results
• Fixed starting position, one opponent
  GP crushes all opposition
  Beats the "showcase" tank
• Randomized starting positions
  Wins 80% of battles against the "learning" tank
  Wins 50% against the "showcase" tank
• Multiple opponents
  Beats 4 out of 5 "learning" tanks
• Both randomized positions and multiple opponents: unsuccessful

Example Program
  Line  Function         Input 1         Input 2    Output
   1    Random           ignore          ignore       0.87
   2    Divide           Const_1         Const_2      0.5
   3    Greater Than     Line 1          Line 2       1
   4    Normalize Angle  Enemy bearing   ignore     -50
   5    Absolute Value   Line 4          ignore      50
   6    Less Than        Line 4          Const_90     1
   7    And              Line 6          Line 3       1
   8    Multiply         Const_10        Const_10   100
   9    Less Than        Enemy distance  Line 8       0
  10    And              Line 9          Line 7       0
  11    Multiply         Line 10         Line 4       0
  12    Output           Turn gun left   Line 11      0

Functions
• Greater than, less than, equal
• + - * / %
• Absolute value
• Random number
• Constant
• And, or, not
• Normalize relative angle

Search
• Problem spaces
• Blind search: depth-first, breadth-first, iterative deepening, iterative broadening
• Informed search: best-first, Dijkstra's, A*, IDA*, SMA*, DFB&B, beam search, hill climbing, limited discrepancy, AO*, LAO*, RTDP
• Local search
• Heuristics: evaluation, construction via relaxation
• Pattern databases
• Online methods
• Techniques from operations research
• Constraint satisfaction
• Adversary search

Admissible Heuristics
• f(x) = g(x) + h(x)
• g: cost so far
• h: underestimate of the remaining cost
Where do heuristics come from?

Relaxed Problems
• Derive an admissible heuristic from the exact cost of a solution to a relaxed version of the problem
  For transportation planning, relax the requirement that the car stay on the road → Euclidean distance heuristic
  For the blocks world, distance = number of move operations; heuristic = number of misplaced blocks. What is the relaxed problem?
  (Example: number of blocks out of place = 2, but true distance to the goal = 3)
• Cost of the optimal solution to the relaxed problem ≤ cost of the optimal solution to the real problem

Simplifying Integrals
• Vertices = formulas
• Goal = a closed-form formula without integrals
• Arcs = mathematical transformations, e.g. ∫ x^n dx = x^(n+1) / (n+1)
• Heuristic = number of integrals still in the formula
• What is being relaxed?

Heuristics for the Eight Puzzle
[Figure: a scrambled start state and the goal state with tiles 1-8 in order]
• What can we relax?

Importance of Heuristics
• h1 = number of tiles in the wrong place
• h2 = sum of the distances of the tiles from their correct locations
   d    IDS        A*(h1)    A*(h2)
   2    10         6         6
   4    112        13        12
   6    680        20        18
   8    6384       39        25
  10    47127      93        39
  12    364404     227       73
  14    3473941    539       113
  18    –          3056      363
  24    –          39135     1641

Need More Power! (adapted from Richard Korf's presentation)
Performance of the Manhattan-distance heuristic:
  8 Puzzle: < 1 second
  15 Puzzle: 1 minute
  24 Puzzle: 65,000 years
Need even better heuristics!

Subgoal Interactions
• Manhattan distance assumes each tile can be moved independently of the others
• It underestimates because it doesn't consider interactions between tiles

Pattern Databases [Culberson & Schaeffer 1996]
• Pick any subset of tiles
  E.g., tiles 3, 7, 11, 12, 13, 14, 15
• Precompute a table
  Optimal cost of solving just these tiles, for all possible configurations (57 million in this case)
  Use breadth-first search back from the goal state
  State = positions of just these tiles (and the blank)
  (A small illustrative sketch of this precomputation follows.)
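As a concrete, deliberately simplified sketch of the table-building step, the following Python builds a tiny pattern database by breadth-first search backward from the goal. The board layout, goal convention, pattern choice, and function names are assumptions for illustration; a real 7-tile table (~57 million entries) would be packed into a byte array rather than a Python dict.

```python
from collections import deque

N = 4                                  # 15-puzzle: a 4x4 board, cells 0..15
PATTERN = (3, 7)                       # small subset so the sketch runs fast;
                                       # the slide's example is (3, 7, 11, 12, 13, 14, 15)

# Assumed goal convention (an assumption, not stated on the slides):
# tile t sits on cell t and the blank sits on cell 0.
GOAL = (0,) + PATTERN                  # (blank cell, cell of each pattern tile)

def neighbors(state):
    """Slide the blank into each adjacent cell; a pattern tile there moves back."""
    blank, tiles = state[0], state[1:]
    r, c = divmod(blank, N)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < N and 0 <= nc < N:
            dest = nr * N + nc
            yield (dest,) + tuple(blank if p == dest else p for p in tiles)

def build_pdb():
    """Breadth-first search backward from the goal over abstract states.

    The table is keyed by the pattern tiles' positions only.  Because BFS
    reaches states in order of distance, the first value stored for a key is
    the minimum over blank positions, so the lookup stays admissible.
    """
    db, dist, queue = {}, {GOAL: 0}, deque([GOAL])
    while queue:
        state = queue.popleft()
        db.setdefault(state[1:], dist[state])
        for nxt in neighbors(state):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return db

def pdb_heuristic(db, tile_positions):
    """h(n): index the table with the current positions of the chosen tiles."""
    return db[tuple(tile_positions)]

db = build_pdb()
print(pdb_heuristic(db, (7, 3)))       # heuristic when tiles 3 and 7 have swapped cells
```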
Using a Pattern Database
• As each state is generated:
  Use the positions of the chosen tiles as an index into the DB
  Use the lookup value as the heuristic, h(n)
• Admissible?

Combining Multiple Databases
• We can choose another set of tiles and precompute multiple tables
• How do we combine the table values?
• E.g., optimal solutions to Rubik's cube
  First found with IDA* using pattern-DB heuristics
  Multiple DBs were used (different subsets of cubies)
  Most problems were solved optimally in 1 day
  Compare with 574,000 years for IDDFS

Drawbacks of Standard Pattern DBs
• Since we can only take the max of the DB values, additional DBs give diminishing returns
• We would like to be able to add values

Disjoint Pattern DBs
• Partition the tiles into disjoint sets, and precompute a table for each set
  E.g., for the 15 puzzle, an 8-tile DB has 519 million entries and a 7-tile DB has 58 million
  [Figure: the 15-puzzle tiles partitioned into a 7-tile set and an 8-tile set]
• During search
  Look up the heuristic value for each set
  The values can be added without overestimating! (see the sketch after the Performance slide)
• Manhattan distance is a special case of this idea in which each set is a single tile

Performance
• 15 Puzzle: 2000x speedup vs. Manhattan distance
  IDA* with the two DBs shown previously solves 15 Puzzles optimally in 30 milliseconds
• 24 Puzzle: 12-million-fold speedup vs. Manhattan distance
  IDA* can solve random instances in 2 days
  Requires 4 DBs as shown; each DB has 128 million entries
  Without PDBs: 65,000 years
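To make the max-versus-sum distinction concrete, here is a small illustrative continuation of the earlier sketch; the names `tables`, `tile_to_cell`, and the tile-to-cell mapping are assumptions, not Korf's code. Note that adding values is only admissible when each disjoint table counts moves of its own tiles only; the simple all-moves BFS sketched earlier is fine for the max combination but would have to be modified for the additive one.

```python
def combined_max(tables, tile_to_cell):
    """Standard PDBs: the tile sets may overlap; each lookup alone is
    admissible, so only the maximum of the lookups is safe to use."""
    return max(db[tuple(tile_to_cell[t] for t in pattern)]
               for db, pattern in tables)

def combined_additive(tables, tile_to_cell):
    """Disjoint PDBs: the tile sets partition the tiles and each table counts
    only its own tiles' moves, so the lookups can be added without
    overestimating (Manhattan distance = one tile per set)."""
    return sum(db[tuple(tile_to_cell[t] for t in pattern)]
               for db, pattern in tables)

# `tables` is a list of (db, pattern) pairs as returned by a PDB builder;
# `tile_to_cell` maps each tile number to its current cell in the search state.
```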