Automated Puzzle Generation
Simon Colton
Universities of Edinburgh and York

Background
• Train journey with Jeremy Gow
– To meet Herb Simon
– Puzzle generation rather than problem solving
– Wrote some puzzles for Jeremy
– Jeremy kept getting the “wrong” answer
– Puzzle generation is a difficult task
• Reviewer’s comment
– View puzzles independently of implementation

Some Example Puzzles
• Which is the odd one out?
– Hair, triangles, squares, plants, words, trees
– Answer: triangles (others have roots)
• Jingle is to corporation as ? is to politician
– Campaign, platform, slogan, promises
– Answer: slogan
• What is next in the sequence?
– 4, 3, 6, 6, 2, 9, ?
– Answer: later

Overview of What’s Needed
• Structure for puzzles
– Characterisation of puzzles
• Puzzles must have single solutions
– Theory formation helps here
• Puzzles must be of correct difficulty
– Methods for disguising the answer

Queendom.com Examples
• What’s the odd one out?
– Coconuts, oysters, clams, eggs, walnuts, haddock
– A: haddock (the others have shells)
• Hair is to stubble as potatoes are to ?
– French fries, sweet potatoes, potato skins, vegetable
– A: French fries
• What’s next in the sequence?
– 3, 8, 15, 24, 35, ?
– A: 48 (square integers and subtract 1)

A Characterisation of Puzzles
• Three (of many) types of puzzle are:
– Odd one out, analogy, next in sequence
• Have (almost) the same structure:
– Question statement
– Set of choices, one of which is answer
– Solution which is an embedded concept
• Some tweaking necessary to make a fit
– Next in sequence puzzles have no choices
– Analogy puzzles have no solution concept

Solutions to Puzzles
• Solution is a single embedded concept
– Fairly simple and positively stated
• Which is the odd one out: 4, 9, 8, 36?
– A: 9 (even numbers), A: 8 (square numbers)
– Puzzle is unsatisfying if there are two answers
• Which is the odd one out: 2, 3, 9, 20?
– A: 9 (it is a square number)
• Which is the odd one out: 23, 25, 27, 29?
– A: 27 (others are primes or squares)

The Difficulty of Puzzles
• Embedded concept is usually not complex
– Probably in order to ensure single solution
• Number of possible answers
– Increases the search space for answer
– Could make the problem easier
• Disguising concepts
– Odd one out: haddock puzzle, they’re all foodstuffs
– Next in sequence (from queendom): 2, 7, 4, 14, 6, ?
– Another concept interleaved (or stuck on)

The HR Program
• Automated theory formation
– Concepts (examples & definitions), conjectures, proofs
– Theory is a collection of concepts (in this case)
• Concept formation via 8 production rules
– Builds new concepts from old ones
– Compose, disjunct, exists, forall, match, negate, size, split
• Complexity of a concept:
– Number of production rule steps
• Specialisation concepts important
– Specialisation of objects of interest (e.g., prime numbers)

Extension for Puzzles (General)
• HR generates theory, then builds puzzles
– Embed each concept, make all puzzles, choose rep.
• From characterisation of solution:
– Don’t use negate or disjunct production rules in ATF
• From single solution:
– Exhaust theory up to a complexity limit
– Check for alternative solutions and discard
• From difficulty consideration:
– Present puzzles in order of concept complexity, disguise
– Actively add disguise where possible

Extension for Puzzles (Special)
• User: chooses the number of possible answers (n)
– Answers are presented in random order
• Odd one out:
– Choose n positive and 1 negative example of specialisation concept
– Check all other concepts for a different solution
• Next in sequence (only in domain of integers)
– Embed number type (e.g. primes: 2, 3, 5, 7, ?)
– Embed function (e.g. number of divisors: 1, 2, 2, 3, ?)
– Actively disguise by interleaving simple sequences
• Analogy: A is to B as C is to: D, E, F, G?
– A, B, C and D share specialisation property, E, F and G do not

Experiment 1: Animals
• Animals dataset (distributed with Progol)
– 18 animals (dog, platypus, snake, eagle, etc.)
– 12 properties (class, homeothermic, eggs, etc.)
• Theory formation up to complexity limit 5
– Compose, exists, forall, match, size, split
• Asked for all odd one out & analogy puzzles
– User specifies: 4 answers possible

Animals Results
• 31 puzzles about animals formed
• Good examples
[15] Which is the odd one out: penguin, ostrich, cat, bat?
[31] Eel is to platypus as shark is to: snake, eagle, turtle, lizard?
• Bad example
[27] Cat is to dog as eagle is to: lizard, eel, ostrich, trout?
• Observations:
– Low complexity of concepts, little disguise found
– Need more examples of animals
• Conclusion:
– Single solutions worked OK, but fairly easy to solve

Experiment 2: Integer Sequences
• Integers 1 to 30 provided
– Addition, multiplication, digits, divisors
– Compose, exists, match, size, split
• Theory formed up to complexity 4
• Disguise simple concepts (complexity < 3)
– By interleaving other simple concepts
• All next in sequence puzzles asked for
– User specifies: 6 terms of the sequence given

Sequences Results
• 24 next in sequence puzzles generated
• Good examples:
[2] 4, 3, 6, 6, 2, 9, ? [numdiv, 27, mult. of 3]
[3] 21, 3, 24, 6, 27, 9, ? [mult 3, mult 3]
[10] 21, 22, 24, 25, 26, 28, ? [digit is a div]
• Bad examples:
[20] 6, 0, 2, 0, 4, 0, ? [# even divisors of 24, …]
[22] 11, 12, 12, 13, 13, 14, ?
• Observations
– Functions should start earlier on number line
– Embedded concepts are in general too complex

Remarks about Creativity
• Setter: creative act is finding concept/examples
• Solver: creative act is finding the answer/solution
• Having a single solution:
– Want the solver to be P-creative, not H-creative
• Difference between answer and solution
– IQ tests: interested in answer, not solution
• More will come to light after field testing
– Comments very welcome

Conclusions and Future Work
• Characterisation of puzzles
– Single, positively stated, simple solution; difficulty (disguise)
• Puzzle generation can be automated
– Results not stunning, but still preliminary
• Puzzle generation needs improvement
• Also needs hand crafting of input files
• More answers/questions about puzzle solver/setter creativity
– After a field test of HR’s puzzles
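
A minimal sketch of two ideas from the slides: checking an odd one out puzzle for alternative solutions, and disguising a next in sequence puzzle by interleaving. This is not HR's implementation. All function names are hypothetical, and the three hand-written concepts (even, square, prime) merely stand in for a theory that HR would form. The reading of puzzle [2]'s label "[numdiv, 27, mult. of 3]" as "number of divisors, starting at 27, interleaved with multiples of 3" is also an assumption.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Concept = Callable[[int], bool]


def is_even(n: int) -> bool:
    return n % 2 == 0


def is_square(n: int) -> bool:
    root = math.isqrt(n)
    return root * root == n


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))


# Hypothetical stand-in for a theory of positively stated concepts.
CONCEPTS: Dict[str, Concept] = {
    "even": is_even,
    "square": is_square,
    "prime": is_prime,
}


def odd_one_out_solutions(
    choices: Sequence[int], concepts: Dict[str, Concept]
) -> List[Tuple[int, str]]:
    """Return (answer, concept) pairs where every choice except `answer`
    satisfies the concept."""
    solutions = []
    for name, holds in concepts.items():
        misfits = [c for c in choices if not holds(c)]
        if len(misfits) == 1 and len(choices) > 1:
            solutions.append((misfits[0], name))
    return solutions


def has_single_answer(choices: Sequence[int], concepts: Dict[str, Concept]) -> bool:
    """A puzzle is kept only if all solutions point at the same answer."""
    answers = {answer for answer, _ in odd_one_out_solutions(choices, concepts)}
    return len(answers) == 1


def num_divisors(n: int) -> int:
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def interleave(a: Sequence[int], b: Sequence[int], terms: int) -> List[int]:
    """Alternate terms of a and b, starting with a, truncated to `terms`."""
    out: List[int] = []
    for x, y in zip(a, b):
        out.extend((x, y))
    return out[:terms]


# The two-answer puzzle from the "Solutions to Puzzles" slide.
print(odd_one_out_solutions([4, 9, 8, 36], CONCEPTS))

# Assumed reading of puzzle [2]: divisor counts of 27, 28, 29, 30
# interleaved with multiples of 3.
divisor_counts = [num_divisors(n) for n in range(27, 31)]
multiples_of_3 = [3 * k for k in range(1, 5)]
print(interleave(divisor_counts, multiples_of_3, 7))
```

With this small concept set, 4, 9, 8, 36 is rejected because it has two answers. 2, 3, 9, 20 yields no solution at all, since "9 is a square" is not a property shared by the other three choices. Under the assumed reading, the seventh term of puzzle [2] would be 8.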