General Game Playing (GGP)
Marco Adelfio
CMSC 828N – Spring 2009

Classic Game Playing AI
- Deep Blue
- TD-Gammon
- Poki

General Game Playing AI
- GGP Agent

General Game Playing
Goals:
- Create systems to play arbitrary games (given formal game definitions)
- Eliminate game-specific strategies
- Emphasize generic strategy formulation
Competition created by Stanford Logic Group
- Hosted during AAAI conference since 2005
- $10,000 Grand Prize

General Game Playing
Questions:
- What additional challenges arise for GGP agents?
- How should a GGP agent evaluate game states?
- Can a GGP agent transfer knowledge between games?

General Game Playing
- Finitely many players, states
- Game play controlled by Game Manager over network
- Players act synchronously (noops allowed)
- Time limits enforced
Basic agent must:
- Understand rule specification
- Respond to game states with legal actions
- Recognize a terminal state and its payoffs

Game Definition Language
A game definition must logically define:
- Set of states in the game
- Legal actions for each player from a given game state
- Transition function
- Initial state
- Terminal states and their payoffs

Game Definition Language - Example
    (role p1)
    (role p2)
    (init (cell 1 1 b))
    (init (cell 1 2 b))
    …
    (init (control p1))
    …
    (<= (legal ?w (mark ?x ?y))
        (true (cell ?x ?y b))
        (true (control ?w)))
    …
    (<= (next (cell ?m ?n x))
        (does p1 (mark ?m ?n))
        (true (cell ?m ?n b)))
    …
    (<= (row ?m ?x)
        (true (cell ?m 1 ?x))
        (true (cell ?m 2 ?x))
        (true (cell ?m 3 ?x)))
    …
    (<= (line ?x) (row ?m ?x))
    (<= (line ?x) (column ?m ?x))
    (<= (line ?x) (diagonal ?x))
    …
    (<= (goal p1 100)
        (line x))
    (<= (goal p1 0)
        (line o))
    …
    (<= terminal
        (line x))

Game Communication
Game Manager Message                         Game Player Response
(START MATCH.435 WHITE description 90 30)    READY
(PLAY MATCH.435 (NIL NIL))                   (MARK 2 2)
(PLAY MATCH.435 ((MARK 2 2) NOOP))           NOOP
(PLAY MATCH.435 (NOOP (MARK 1 3)))           (MARK 1 2)
(PLAY MATCH.435 ((MARK 1 2) NOOP))           NOOP
...                                          ...
(STOP MATCH.435 ((MARK 3 3) NOOP))           DONE

General Game Playing
Design Challenges:
- Indeterminacy
- Size
- Multi-game Commonalities
- Opponent Recognition

AAAI Competition – Past Winners
- 2005 - ClunePlayer (UCLA)
- 2006 - FluxPlayer (Technical University of Dresden)
- 2007 - CADIA (Reykjavik University)
- 2008 - CADIA (Reykjavik University)

Agent 1: ClunePlayer
Approach: Minimax
Problem: Needs to assign values to intermediate game states in arbitrary games.
Solution:
1. Calculate a vector of generic features at each node
2. Simulate games to determine which features are “stable” and correlated with either payoff or control
3. When running minimax, use a combination of those feature scores as the evaluation heuristic

Agent 2: CADIA-Player
Approach: UCT (Variant of Monte Carlo simulation)
Monte Carlo:
- Pick random actions for each player to descend the tree
- After reaching a terminal state, update expected payoff Q(s,a) for each visited state s and action a
- Introduces explore/exploit tradeoff

Agent 2: CADIA-Player
UCT (Upper Confidence bounds applied to Trees)
- Balance exploration and exploitation
- Give “bonus” to less travelled paths

Agent 3: UTexas LARG
Approach: Knowledge Transfer
- Uses lessons from past games to improve play in new games
War Games!
- Determines whether a new game is isomorphic or similar to a previous game
- If so, transfers estimated rewards

Summary
- General Game Playing introduces a different set of challenges than designing game-specific AI
- Biggest challenge is evaluating states in a novel game
- Better understanding of general strategy formation has many applications

References
- GGP Website: http://games.stanford.edu/
- Finnsson, Hilmar. CADIA-Player: A General Game Playing Agent. MSc Thesis, School of Computer Science, Reykjavik University. 2007.
- Kuhlmann, Gregory and Peter Stone. Graph-Based Domain Mapping for Transfer Learning in General Games. Lecture Notes in Computer Science, Volume 4701/2007.
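
Appendix: UCT action selection (illustrative sketch)
The CADIA-Player slides describe UCT as keeping an expected payoff Q(s,a) per state-action pair and giving a “bonus” to less travelled paths. The sketch below shows one common way to write that rule, the standard UCB1 form Q(s,a) + C * sqrt(ln N(s) / N(s,a)), for a single state. It is not taken from the CADIA-Player source: the function names `uct_select` and `update`, the dictionaries `q` and `n_sa`, and the default constant `c=1.4` are all assumptions made for illustration. Since GGP payoffs range from 0 to 100, a real agent would likely scale the constant or the payoffs.

```python
import math


def uct_select(q, n_sa, actions, c=1.4):
    """Choose an action at one state using the UCB1 rule (hypothetical helper).

    q     -- dict mapping action -> average payoff Q(s, a) seen so far
    n_sa  -- dict mapping action -> visit count N(s, a)
    c     -- exploration constant (assumed value, tune per game)
    """
    # Try every action once before applying the formula, which would
    # otherwise divide by zero for unvisited actions.
    for a in actions:
        if n_sa.get(a, 0) == 0:
            return a

    n_s = sum(n_sa[a] for a in actions)  # total visits to this state

    def score(a):
        bonus = c * math.sqrt(math.log(n_s) / n_sa[a])  # larger for rarely tried actions
        return q[a] + bonus

    return max(actions, key=score)


def update(q, n_sa, action, payoff):
    """Fold one simulation result into the running average for an action."""
    n_sa[action] = n_sa.get(action, 0) + 1
    old = q.get(action, 0.0)
    q[action] = old + (payoff - old) / n_sa[action]
```

With `c=0` the rule reduces to pure exploitation (always the highest Q); larger values of `c` push the search toward actions that have been sampled less often.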