Playing "Invisible Chess" with Information-Theoretic Advisors

Playing "Invisible Chess" with Information-Theoretic Advisors A.E. Bud, D.W.Albrecht, A.E. Nicholson and I. Zukerman {bud,dwa,annn,ingrid}@csse.monash, edu. au School of Computer Science and Software Engineering, MonashUniversity Clayton, Victoria 3800, AUSTRALIA phone: +61 3 9905-5225 fax: +61 3 9905-5146 From: AAAI Technical Report SS-01-03. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved. Abstract Makingdecisions under uncertainty remains one of the central problemsin AI research. Unfortunately, most uncertain real-world problemsare so complexthat any progress in them is extremelydif cult. Gamesmodelsomeelementsof the real world, and offer a morecontrolled environmentfor exploring methodsfor dealing with uncertainty. Chess and chess-like gameshave long been used as a strategically complextestbed for general AI research, and we extendthat tradition by introducing an imperfect information variant of chess with someuseful properties such as the ability to scale the amount of uncertainty in the game.Wediscuss the complexityof this gamewhichwecall invisible chess, and present results outlining the basic values of invisible pieces in this game.Wemotivate and describe the implementation and application of two information-theoretic advisorsthat assist a player of invisible chess to control the uncertainty in the game.Wedescribe our decision-theoretic approachto combiningthese informationtheoretic advisors with a basic strategic advisor. Finally we discuss promisingpreliminary results that wehave obtained with these advisors. 1 Introduction Makingdecisions under uncertainty remains one of the central problems in AI research. An agent in an uncertain world needs to select actions from the action search space -- the set of all possible actions in that world. As the uncertainty increases, this task can becomeincreasingly dif cult. The numberof possible actions may increase, the numberof possible situations in which those actions may be applied may increase, or both. The effects of these growingsearch spaces are ampli ed as the agent tries to search further ahead. Each action on each possible world state requires more possible world states to be evaluated for future moves. This property of imperfect information domains makes tackling real-world problems extremely dif cult. Games and game theory model some of these properties of real-world situations in a more controlled environment and thereby allow analysis and empirical testing of decisionmakingstrategies in these domains. In this capacity, games have long been used as a testbed for general arti cial intelligence research and ideas. In particular, chess has a long Copyright(~) 2001, AmericanAssociation for Arti cial Intelligence(www.aaai.org).All rights reserved. Nolicense: PDFproducedby PStin (c) F, Siegert - http://www.this.net/~frank/pstill.html history of use in the AI communitybecause of its strategic complexity, and well-studied and understood properties. Despite these advantages, chess has two signi cant drawbacks as a general AI testbed: the rst is the success of computer chess players that use hard-coded, domain specific rules and strategies to play; and the secondis the fact that standard chess is a perfect information domain. A number of researchers have tackled the rst of these drawbacks. For example (Berliner 1974) investigated generalised strategies used in chess play as a model for problemsolving, and (Pell 1993) introduced metagame. Metagameis a system for generating new games arbitrarily generated from a set of games known as Symmetric Chess-Like games (SCL Games). Thus there is no point hand-training a computer program to play one particular game, as each game requires different strategic planning and position evaluation. A good computer metagame player has to somehow encapsulate a higher level of "strategic knowledge"than is possible in a single gamesuch as chess. In this paper, we address the second drawback of chess as a general AI testbed -- that of perfect information. Wedescribe a missing (or imperfect) information variant of standard chess which we call invisible chess (Bud et al. 11999). Invisible chess involves a con gurable numberof invisible pieces, i.e., pieces that a player’s opponent cannot see. Invisible chess is thus a representative of the general class of strategically complex, imperfect information, two player, zero-sum games. Manyresearchers have investigated games with missing information including poker ((Findler 1977), (Korb, Nicholson, & Jitnah 1999), (Koller &Pfeffer 1997)), bridge ((Gambiick, Rayner, & Pell 1991), (Ginsberg 1999), (Smith, & Throop 1996)), and multi-user domains (Albrecht et al. 1997). With the exception of Albrecht et al., whouse a large uncontrollable domain, all of these domainsare strategically simple given perfect information. Onthe other hand, invisible chess retains all of the strategic complexity of standard chess with the addition of a controllable element of missing information. Invisible chess 1Invisible chess is in the set of Invisible SCLGames,an extension to the set of SCLGamesintroduced by Pell. Wehave chosen invisible chess as the rst gameto explorebecauseof the availability of standard chess programs. is related to kriegspiel ((Li 1994) and (Ciancarini, DallaLibera, &Maran1997)), a chess variant in which all the opponent’spieces are invisible and a third party referee determines whetheror not each moveis valid. In addition to the complexityof playing standard chess, a player of invisible chess mustmaintainthe possible positions of the opponent’sinvisible pieces. Thesepositions may be representedas a probability distribution over the possible squares on the chess board. Section 4 describes our design that enables a central moduleto maintainapproximationsto the invisible piece probabilitydistributions for bothplayers. Section5 presents a brief analysis of the advantagesof playing with various invisible pieces. Interestingly, we show that the relative value of the minorinvisible pieces differs from the standard chess piece ranking. Wereport on the uncertainty causedby the different invisible pieces and the effect that those pieces have on the opponent’sability to play strategically. Section6 discusses the relationship betweenuncertainty and the information-theoretic concept of entropy,andthe applicationof entropyto our results. It also motivatesthe design of information-theoreticadvisors, describes these advisors and presents results that demonstrate the ef cac y of our approach.Section 7 contains conclusions and ideas for further work. 2 Related ions do not tend to occur in a signi cant numberof nodes in the gametree, 2 and therefore partition search is not likely to be very effective on any variants of chess. Morerecently (Ginsberg 1999) included MonteCarlo methods his bridge playing programto simulate manypossible outcomes,choosingthe action with the highest expectedutility over the simulations. Ginsbergclaims that trying to glean or hide information from an opponentis probably not useful for bridge. In contrast, wepresent results that showthat both informationhiding and gleaning can be useful in invisible chess. (Frank &Basin 1998) and (Frank &Basin 2000) have formeda detailed investigation of search in imperfectinformation games. They have concentrated almost exclusively on bridge. Frank &Basin focus on a numberof search techniques including attening the gametree which they have shownto be an NP-Completeproblem, and MonteCarlo methods. Their investigation suggests that MonteCarlo methodsare not appropriate for imperfect information games.This problemis exacerbatedin invisible chess given the relative lack of symmetry and the size of the gametree. In the closest workto that presentedhere, (Ciancarini, DallaLibera, & Maran 1997) considered king and pawn end gamesin the gameof kriegspiel. Kriegspiel is an existing chess variant where neither player can see the opponent’s pieces or moves. They used a game-theoretic approach involving substantive rationality and have shownpromising results in this trivial versionof kriegspiel. Thusfar they have not applied their approachto a completegameof kriegspiel. Kriegspiel differs frominvisible chess in a numberof important areas. All the opponent’spieces are invisible to a player of kriegspiel. Thusthere is no wayto reduce the uncertainty in the domainwithout reducing the strategic complexity.By contrast, in invisible chess, the numberand types of invisible pieces is con gurable. As all pieces are invisible, every moveinvolvessubstantial increases in uncertainty. In invisible chess, a player maychoose whetherto movea visible or invisible piece. Finally, in kriegspiel, a player attempts movesuntil a legal moveis performed.A third party arbitrator indicates whetheror not each moveis legal. In invisible chess, the player is given specie information whena move is impossibleor illegal, and maymiss a turn as a result of attempting an impossiblemove(see Section 3). Thesedifferences makekriegspiel a substantially morecomplex and less controllable domainthan invisible chess. Speci cally, exploring the relationship betweenuncertainty and strategic play wouldbe moredif cult in kriegspiel. Ultimately, the pathological case of invisible chess whereall pieces are invisible approachesthe complexityof kriegspiel. Thusinvisible chess provides a convenientstepping stone to a muchmore difcult problem. To our knowledgethere exist no computerkriegspiel players that are able to play a full gameof kriegspiel with whichto compareour system. Work Since the 1950s, when Shannonand Turing designed the rst chess playing programs (Russell & Norvig 1995), computershave becomebetter at playing certain gamessuch as chess using large amountsof hand-codeddomainspecific information. Continuingthe tradition of using gamesas a testbed for general AI investigation, (Berliner 1974)proposeda tactical analyser for chess whichused strategies and tactics, but did not play as well as the existing hard-coded systems. In answer to the success of hard-coded algorithms, (Pell 1993) introduceda class of gamesknownas Symmetric Chess Like (SCL) Games,and a system called Metagamer that plays gamesarbitrarily generatedfromthis class using a set of advisorsrepresentingstrategies in the class of games. Peli did not consider imperfect information games.However, his class of SCLGamesis easily extendedto the class of Invisible SCLGames,where one or more of a player’s pieces is hiddenfromtheir opponent. (Koller &Pfeffer 1995)investigated simple imperfectinformationgameswith an initial goal of solving them. However, their approachdoes not scale up to morecomplexgames such as invisible chess. (Smith, Nan, &Throop1996) wrote a bridge playing program using a modi ed form of gametree with enumerated strategies rather than actions, effecting a formof forward pruning of the gametree. Smithet al. stated that forward pruningworkswell for bridge, but not for chess. (Ginsberg 1996)introducedpartition search to reducethe effective size of the gametree. He showedthat this approach workswell for bridge and other gameswith a high degree of symmetry. Partly becausein chess pawnsonly moveforward, and partly becauseof the strategic nature of the game,the sameposit- 2In fact if the samepositionoccursthree times in a gameof chess,the gameis declareda draw. 7 Nolicense: PDFproducedby PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html quently the invisible black bishop on c4 is revealed, and white must supply another move. 4: If a player is not in check, and attempts to moveinto check, the invisible piece causing the checkis revealed, and the player’s tumis forfeited. In standard chess, the movewouldbe disallowed and the player wouldprovide an alternate move.In Figure 4, white attempts to move their king from dl to d2. However, black’s invisible bishop on a5 is threatening d2; consequentlythe invisible black bishop on a5 is revealed, and white forfeits their turn. 5: Ifa player attemptsto castle throughcheckfroman invisible piece or throughan invisible piece itself, the invisible pieceis revealedandthe turn is forfeited. 3 Domain: Invisible Chess Invisible chess is basedon standardchess with the following difference. In invisible chess we differentiate betweenvisible and invisible pieces and de ne themas follows: a visible piece is one that both players can see; an invisible piece is one that a player’s opponentcannot see. Thus, every time a visible piece is moved,the boardis updatedas per standard chess. However,whena player movesan invisible piece, their opponentis informedwhichinvisible piece has moved, but not where it has movedfrom or to. This information enables each player to maintaina probability distribution of their opponent’sinvisible pieces across the squares on the board. In this paper, we frame invisible chess pieces ~ to distinguish themfromvisible chess pieces (~). Wede ne three terms for referring to movesin invisible chess: A possible moveis any legal chess movegiven complete information about the board. That is, no invisible pieces are in the wayof the move,and the piece can moveto the desired position. An impossiblemoveis an attempted movethat violates the laws of chess because of an incorrect assumptionas to the whereaboutsof an invisible piece. For example, a player attempts to movea piece "through" an invisible piece, or attempts to use a pawnto capture an invisible piece that is not diagonallyin front of it. See Figure 1. An illegal moveis an impossible movethat would not allow the gameto continue. For example,a player is not allowedto movetheir king into checkby an invisible piece. See Figure 3. Therules of invisible chess are basedon the rules of chess. Theonly modications pertain directly to invisible pieces and their impact on the game.In general if a moveis possible then it is accepted;if a moveis impossiblethen it is rejected and the player’s turn is forfeited; and if a moveis illegal, someinformationregarding the reason the moveis illegal is revealedto the player attemptingthe move,and the player must supply another move.Note that we do not allow invisible kings in this basic invisible chess, as a version of invisible chess with invisible kings woulddrastically modify the goal of the original game. (Budet al. 2000) gives details of all scenarios wherenew rules comeinto effect, but in summary, onlythe rules for the followingscenarios differ fromthose of standard chess. 1: Impossiblemovesare disallowed,the position of the rst invisible piece that caused the moveto be impossible is revealed, and the player’s turn is forfeited. Figure 1 showsan exampleof an impossible move. 2: If a player’s kingis threatened,the invisible piece causing the checkis revealed, and the threatenedplayer moves normally. Additionally, the player can infer that there are no invisible pieces betweenthe threatening piece and their king (unless the king is being threatened by a knight). Figure 2 showsan exampleof this scenario. Blackis told that the invisible whitebishopis on g5. 3: If a player is in check, and attempts to moveinto check again or does not moveout of check, any invisible pieces causing the check are revealed, and the player must supply another move.In Figure 3, white is in checkand attemptsto capture the invisible black knight on fl. Conse- 3.1 The complexityof standard chess is well understood. Chess has an average branching factor of around 35. A typical gamelasts approximately 50 movesper player so the entire gametree has approximately351°° nodes ((Russell Norvig1995), p 123). In the trivial case wherethe opponenthas no invisible pieces, the gametree for invisible chessis exactlyas large as that for standardchess. Onceinvisible pieces are introducedinto the game,each node of the gametree must be expandedfor each possible movefor each possible combinationof positions of the opponent’sinvisible pieces. In a gameof invisible chess the player and opponent each have minvisible pieces. If each invisible piece has a positive probability of occupying n~ squares, then the branchingfactor is approximatelythe branchingfactor of chess multiplied by the combinationof invisible squaresthat could be occupiedas shownby the following formula: Branching Factor = 35 x nl x n2 x ... x nra (1) If each player has two invisible pieces, and each of those pieces has an average of only four squares with a positive probability of occupation, then the average approximate branching factor of that gameof invisible chess is 35 x 16 = 560. For three invisible pieces each, the branching factor is around35 × 64 = 2240. Assumingthat the invisible pieces are on the board and movingfor approximately half the game, then the complete expandedgametree for three invisible pieces each is in the order of 22405°whichis around 1015° nodes. To makethe domaineven morecomplex,chess has virtually no axes of symmetrythat allow the size of the gametree to be reduced as in highly symmetricalgamessuch as tictac-toe, and Smith and Nan (Smith &Nan 1993) claim that forward pruningto reduce the branchingfactor of chess has beenshownto be relatively ineffective. In addition to the combinatorially expansivenature of the invisible chess gametree, in order to play invisible chess, a player mustmaintain beliefs about the positions of the opponent’sinvisible pieces. A player using a simplistic belief updating schemesuch as assumingthe most likely destination will ignore manyprobable squares. Further, a player that uses any belief updating 8 Nolicense: PDFproducedby PSfill (c)F. Siegert - http://www.this.net/-frank/pstill.html Domain Complexity L...... Y/_//~ ....~ .,,.’//////,~ ...... 3 2 1 a Figure 1: An impossible move. The black bishop on b2 is invisible. Whiteattempts an impossible moveby trying to movethe bishopon cl to a3. schemethat does not take into account every possible destination square whenan invisible piece is moved,will lose important information about the o w of the game. Suppose a player incorrectly assumesthat the opponent’sinvisible piece is on a certain square; if the opponentthen movesa visible piece to that square, the player nowhas no Wayof backtracking, and has no information about the location of the invisible piece. Assuming no strategic knowledge(i.e., each square that an invisible piece can moveto is as likely to be visited as any other), and only one invisible piece, a player can easily maintainthe precise distribution for that piece (see (Budet al. 2000)). In addition, wheneverany piece (other than knight)moves,the probability distribution for each invisible piece maybe updated to re ect the fact that other squares in the movingpiece’s path are vacant. Similarly, whenever a player’s king is threatened by any piece (except a knight), visible or invisible, the probabilitydistributionsfor all invisible pieces maybe updated to re ect the knownvacancies betweenthe piece causing check, and the threatened king. However,for multiple invisible pieces, the positions of invisible pieces are not conditionallyindependentwith respect to each other, so the probability of one invisible piece occupyingany particular squareaffects the probability of another invisible piece occupyingthat square. Maintainingthe probabilitydistributions of multipleinvisible pieces involves combinatorialcalculations in the numberof invisible pieces or the storage of combinatoriallylarge amountsof data in order to maintainthe completejoint distributions of all invisible piecesover the entire chess board. Weaddress this problemby utilising a central moduleto obtain an approximationof the distribution that can be calculated quickly enoughto use in real time (Section 4). The GCDMM calculates the distributions for both players, and has access to the true positionsof all invisible piecesat all times. Weuse the true positions of other invisible pieces Nolicense: PDFproducedby PStin (c) F. Siegert - http:flwww.this,nel/~frank/pstill.html b c d e f g h Figure 2: Aninvisible piece is revealed to warnof the check. Black is in check becauseof the invisible white bishop on g5. as an approximation to the joint distribution of all invisible pieces across all squares. Thus, the calculation of each invisible piece’s moveis reducedto the calculation to be performedwhenthere is only one invisible piece. Becausethe resulting approximationis moreaccurate than the distribution that could be calculated by a real player of invisible chess, our results slightly underestimatethe advantagesof playingchess with invisible pieces. For the purposesof this research, using this methodto maintainprobability distributions allowsus to focus our efforts on the effects of manipulating the amountof uncertainty inherent in these distributions. 4 Basic Design In Section 3.1 we estimate the branchingfactor of the game tree for invisible chess with 2 invisible pieces to be around 560, and for 3 invisible pieces to be in the order of 2000, comparedto the branchingfactor of standard chess whichis about 35 ((Russell &Norvig 1995), p. 123). To cope this combinatorial explosion, and the strategic complexity of invisible chess, we employa "divide and conquer" approach. Wesplit the problemof choosingthe next moveinto a numberof simpler sub-problems,and then use utility theory (Raiffa 1968) to recombinethe calculations performed for these sub-problems into a move.Weuse informationtheoretic ideas (R6nyi1984)to deal with the uncertainty the domain, and standard chess reasoning to deal with the strategic elements. This modular, hybrid approach is implementedwith advisors or experts connected and controlled by a GameController and Distribution MaintenanceModule(GCDMM). the current implementation,we include three advisors: (1) the strategic advisor; (2) the movehide advisor; and (3) the move seek advisor. The GCDMM controls the game state, has knowledge of the positions of all invisible pieces and maintainsdistributions of invisible pieces on behalf of N®N 61 ........ N....... N N N 67 ,/,.A~ ~ ~ ........ t 4 3 3 2 2 1 1 a b c d e f g h a N 7 6 c d e d~5’ 4 4 b2a3 I a2a4 25 a2a3l 21 e2h5 -24 c3o41 4 5 4 e2fl 3 -32 399 -32 -32 386 397 378 -28 376 376 374 363 -24 81.50 -24 87.75 61 107.50 64 107,25 21 189.25 29 198.25 24 384 376 54 209.50 30,395 374 55 213.50 b263 Ialbl , 26 395 376 58 213.75 2 glh3 I 33 397 374 52 214.00 e2e4 t 24 387 376 54 210.25 62 216.25 dld3[ 31~395377 h2h3 31 395 376 63 216.25 e2a6l 241 397 376 54 212.75 glf3 34 394 376 64 217.00 e2d3I 32 395 377 63 216.75 1 b c d e f g g h Hide Seek Total IStrat (W1:1) - [[~]{Pos Move b4 a3 c5 d6 EU W2:5 Ws:5 N N °N a f Figure 4: Attempting to moveinto check. Whiteattempts to movethe king from dl to d2. This is impossiblebecauseit results in the king being in check fromthe invisible black bishop on a5. Figure 3: An illegal move. White is in check fromthe black bishop on a5, and attempts to capture the invisible black knight on fl whereit wouldbe in check from the invisible black bishop on c4. 8x b h Figure 5: Aninvisible bishop each. White believesthe invisible black bishopon b4 is on one of a3, b4, c5 or d6. Black believes the invisible white bishop on e2 has possible squarese2, d3, c4, b5 and a6. 0.77 0 0 0 0.77 -0.22 0 ~ 0.77 0 0.77 0 213.35 O, 213.50 0 213.75 0~ 214,00 0 214,10 O! 215,15 0 216.25 0l 216.60 0 217.00 0 ] 220.60 Figure 6: Possible Movessorted by their relative total scores. the two players. The GCDMM is responsible for deciding whethera moveis legal, impossibleor illegal and ensuring that the gameprogresses correctly. Whenit is a player’s turn to move,the GCDMM requests the next movefrom the Maximiserfor that player. The Maximiser,responsible for choosingthe best movesuggestedby the available advisors, generatesall possible boardsand all possible moves,and requests utility values fromeach of the advisorsfor each move. Eachadvisor evaluates the possible moves, across as many boards as possible in the available time, accordingto an internal evaluation function. The strategic advisor, a modied version of GNU Chess whichreturns all possible moves 10 Nolicense: PDFproducedby PSIJll (c) F. Siegert http:llwww, this.netl-franldpstill.html 0 0.56 84,30 0 0.56~ 90.55 0 0 107150 0 0.77 111.10 0.77 0 193,10 0 0 198,25 and their utilities (evaluatedto a speci ed depth), gives the ExpectedUtility (EU) of each movefrom a strategic perspective. 3 The EUfor each moveor action (A) is calculated by multiplyingits utility value by the probability of the gamestate (X) for whichthe utility value was calculated, 3Notethat GNU Chesshas no knowledge of invisible pieces. Themodications to GNU Chessare to performx ed depthsearching, andto returnall moves andutilities rather thanonlythe best move.In standardchess, a poormovemaybe discountedearly in the search so that the searchon the gametree canfocus on more promising moves.However, in invisible chess, a movethat is poor in oneboardpositionmaybe goodin other positions. and summing this across all possible gamestates (G) as per Equation2. EU(A) = ~ (Utility(AlX) x Prob(X)) (2) VX6 G The moveseek advisor scores movesaccording to howmuch they reduce the player’s uncertainty. For example, a move that fails becauseit "collides" with an invisible piece removesthe uncertaintyas to the position of that piece. Similaxly, the movehide advisor scores movesaccording to how muchthey increase the opponent’suncertainty. For example, movinga previouslyrevealedinvisible piece will greatly increase the opponent’s uncertainty. The movehide and seek advisors are describedin Section6. Eachadvisor has a weightassociated with it that re ects the relative value of its advice. The Maximisermultiplies the value returned from each of the advisors by its weight (W0 and sumsthese values. The movewith the highest overall value is passed back to the GCDMM which implementsthat moveand requests a movefromthe other player. If there are multiple moveswith equal highest score, one of these moves is chosen at randomby the Maximiser. 4.1 Example Figure 5 showsa typical position in a gameof invisible chess with each player playing with one invisible bishop; it is white’s turn to move.Black’sbelief aboutthe position of white’s invisible bishopis representedby the probability distribution: e2 0.2, d3 0.2, c4 0.2, ]35 0.2, a6 0.2, and white believes that black’s invisible bishop has the probability distribution: a3 0.25, b4 0.25, c5 0.25, d6 0.25. White knows that the invisible black bishop cannot be on e7 as the black queen traversed e7 to get to its current position at h4. TheGCDMM asks the white Maximiserfor its next move. The white Maximiser passes each possible boardposition, i.e., the current board with black’s invisible black bishop in each of b4, a3, c5 and d6 to the strategic advisor. Figure 6 showssomepossible movesacross the four possible boardstogether with their utilities and the EUof eachmovefromthe strategic advisor’s perspective. Noticethat the moveglf3 has the highest strategic EUof 217. However,movingthe knight on gl has no effect on either player’s information.If there wereno other advisors present, the Maximiserwouldchoosethis move. Somemovesdo assist in discovering the location of the invisible black bishop, e.g. d4c5, a2a3 and b2a3, however, noneof these movesis strategically strong. For this reason, and becauseof the small numberof squares in the invisible black bishop’s distribution, the moveseek advisor has little effect at this point in the game.Later in the game,once the invisible black bishop has movedseveral times, many squareswill have a probability of being occupiedby the invisible black bishop, and the moveseek advisor will have moreeffect. The movehide advisor evaluates each movebased on the increase in the opponent’suncertainty. Whenthe invisible white bishopmoves,its destination fromthe opponent’sperspective, maybe any square accessible fromany square currently in the invisible white bishop’sdistribution. Thusall Y.I. X 39.0 35.0 65.0 82.8 58.5 56.4 10.2 61.0 X 46.6 60.8 86.8 62.6 72.4 11.0 65.0 53.4 X 61.8 74.6 65.6 72.6 7.0 35.0 39.2 38.2 X 56.2 58.6 52.4 7.0 17.2 13.2 25.4 43.8 X 28.0 42.4 4.4 41.5 37.4 34.4 41.4 72.0 X 54.2 3.8 43.6 27.6 27.4 47.6 57.6 45.8 X 1.8 89.8 89.0 93.0 93.0 95.6 96.2 98.2 X Table 1: Winpercentages for different combinationsof invisible pieces. possible destination squares need to be addedto its distribution, and regardless of wherethe invisible white bishop movesto (unless it causes check), the opponent’suncertainty is increased. Onthe other hand, whena visible piece traverses a square that has a positive probability of occupation by an invisible piece, the opponent’suncertainty decreases as they can infer that the traversed squareis actually vacant. For example,the movedld3 tells the opponentthat the square d3 is empty. The Maximisermultiplies each movevalue from each advisor by the advisor’s weight(Wi)and sumsover all the advisors, giving the "Total" columnin Figure 6. Notice that the movewith the highest overall value is nowe2d3 (220.60). The extra addeduncertainty has beenenoughin this case to makethis movebetter than the strategic choice of glf3. 5 The Strategic Advisor In this section wepresent results for playinginvisible chess with a single strategic advisor for each player. Eachplayer played with a different con guration of invisible pieces, using only a strategic advisor to decide the next move.Each result is obtained from500gamesrun with a particular conguration of invisible pieces that endedwith a win.4 White has a slight advantagein standard chess, and the strategic advisor movespieces differently dependingon whetherit is playing white or black. To removecolour biases fromthe resuits, each set of 500gameswasbrokeninto two runs of 250 gameseach, the invisible piece con gurations were swapped betweenwhite and black for each run, and the results were averagedbetweenthe two runs. Table 1 showsresults for gamesplayed with major (nonpawn)invisible pieces against each other and against no invisible pieces (N.I.). Readingacross a row, it showsthe percentage of gameswonby the combinationof invisible pieces in the rowheadingagainst the invisible piece combinationof a particular column.For example,one invisible bishop (1B) won65%of the time against one invisible rook (1R), but only 43.6%of the time against twoinvisible rooks (2R). Theseresults showthat the values of several invisible piece combinationsdiffer betweeninvisible chess and standard chess. For example,in standard chess, a rook is considered morevaluable than a bishop. However,in invisible chess, 4Drawn gamesare largelythe result of repeatingmoves continuously.Consistingof less than 10%of gamesplayed,drawsare notcounted in theseresults. 11 Nolicense: PDFproducedby PStill (¢) F. Siegert - http:flwww.this.net/~frank/pstill.html 1B IN 1R 1Q 2B 2N 2R Further examination of our results showsthat the more a player movestheir invisible pieces, the moreimpossible movestheir opponent attempts (see (Bud et aL 2000) for details). This movement of invisible pieces and the corresponding opponentuncertainty can be summarisedusing entropy to quantify each player’s uncertainty in a game.Figures 7 and 8 showthe uncertainty of each player regarding the positions of the opponent’sinvisible pieces for games played with one invisible bishop against two invisible bishops.6 For each game,the entropy of each player’s distribution of invisible pieces is summed across all moves.Thusthe graphs showthe total amountof uncertainty each player had to deal with over the course of each game.Figure 7 shows each player’s uncertainty in the 190gamesthat werewonby white; the solid line showsblack’s uncertainty, sorted by entropy, whilethe dashedline showswhite’s uncertainty in the corresponding game. Figure 8 shows each player’s uncertainty in the 60 gamesthat were wonby black; the dashed line showswhite’s uncertainty, sorted by entropy, while the solid line showsblack’s uncertainty in the corresponding game. As can be seen from Figure 7, all the gameswonby white, except for one (gamenumber86) showgreater uncertainty for black. The gameswonby black (Figure 8) are muchcloser in uncertainty and there are manygameswhere white had greater uncertainty than black.7 This examplecorroborates our intuition that players withless uncertaintyare morelikely to win, and underpinsour information-theoretic advisors. one bishop beat one rook, and two bishops beat every other combinationof invisible pieces consideredincluding rooks. This apparent anomalyis due to the bishop’s early involvementin the game, while rooks tend to comeinto play later. By causing uncertainty early in the game,a player with twoinvisible bishopshas an early advantageagainst a player withtwoinvisible rooks. Further analysis of these results is presented in (Budet aL 2000). 6 Building Information Theoretic Advisors A reasonableinference fromthe results shownaboveis that players with more information about the gametend to win moreoften; i.e., the closer a player’sbelief abouta boardposition is to the true boardposition, the morelikely the player is to play well strategically. In this section, wedescribeour use of informationtheory (R6nyi1984) to quantify a player’s uncertainty about the positions of an opponent’sinvisible pieces (Section 6.1), present two information-theoretic advisors (Section 6.2), and discuss somepreliminaryresults obtainedwith these advisors (Section 6.3). 6.1 Uncertainty and Entropy Informationtheory provides a measurefor quantifying information represented by a sequence of symbols. That is, using information theory we can determine the minimum numberof bits required to transmit the sequenceof symbols to someoneelse. This numberrepresents a measureof the amountof informationintrinsic to the sequenceof symbols. Thecalculation of this numberrequires a distribution of the probability that any particular symbolto be transmitted will occur. Thusany probability distribution can be said to have an entropy or information measureassociated with it. This entropy measureis boundedfrom belowby zero, whenthere is only one possibility and therefore no uncertainty, and increases as the distribution spreads. In invisible chess, given a probability distribution of each of the opponent’sinvisible pieces, it is possible to derive a 5probabilitydistribution across all possible boardpositions. Thus we can calculate the entropy (H) of a set of board states togetherwiththeir associatedprobabilities as follows: H = - E Prob(X) × log2(Prob(X)) (3) 6.2 Information-Theoretic Advisors Followingan analysis of the results in Section 5, we have implementedtwo information-theoretic advisors. These advisors are movehide and moveseek. YXEG whereX represents a single gamestate, from the set of possible gamestates (G), whichis one possible combination positions of the opponent’s invisible pieces, and Prob(X) represents the probability of that gamestate. As invisible pieces move,the numberof squares they may occupyincreases. This leads to an increase in the numberof possibleboardstates, and thereforethe entropyof the distribution of thoseboardstates, i.e., it increasesthe opponent’s uncertainty. 5Assuming that the piece distributions are independent,this canbe calculatedbycombinatorially cyclingthroughthe invisible piecepositions.In ourimplementation, thesedistributionsof invisible piecesare independent becauseof the waythey are maintained (Budet al. 2000). 12 No license: PDFproduced by PSdU(c) F. Siegert - http:l/www.this.neff~frank/pstill.html Move Hide Advisor. Working on the premise that the moreuncertain the opponentis, the worsethey will play, the movehide advisor advises a player to perform moves that hide informationfrom the opponent.That is, each move is scored accordingto its expectedeffect on the opponent’s perceiveduncertaintyabout the positions of the player’s invisible pieces. This effect maybe an increase, a decrease or no change in the opponent’s uncertainty. Movesby invisible pieces that do not cause checkresult in an increase in the opponent’s uncertainty. Movesthat cause check or movesby visible pieces maycause a decrease in the opponent’s uncertainty by revealing vacant squares or invisible pieces themselves. As discussed in Section 6.1, we use the entropyof the distribution of the positions of the opponent’s invisible pieces as a measureof the opponent’suncertainty. In order to calculate this entropy, the movehide advisor needsto modelthe opponent’sdistribution update strategy. Of course a player can use the real positions of each nonmovinginvisible piece and thereby avoid the cost of storing 6Themanygameswithzero entropyfor whiteare generallydue to the invisiblebishop(queen’s bishop)capturinga pieceonits rst moveand subsequentlybeingcapturedwithoutmovingagain. 7Notethat the samerelationship betweenentropyand wincan be seen for gamesbetweenmoreevenlymatchedinvisible pieces (Budet al. 2000). aO Figure8: Uncertainty(as entropy) in gameswith one invisible black bishop against twoinvisible white bishops that were wonby black. Figure7: Uncertainty(as entropy)in gameswith one invisible black bishop against two invisible white bishops that were wonby white. or calculating the completejoint distribution of the positions of all invisible pieces. This methodprovidesa modelof the best distribution the opponentcould havewithout taking strategic informationinto account. To incorporate strategic informationinto the model,a player could use the combined expectedutility for each possible destination squarethat an invisible piece could moveto. For a proposedmove,the movehide advisor uses Equation3 to calculate the entropy of the opponentmodelof the distribution of the player’s invisible pieces prior to and following the move.The exponential of the perceived change in entropy is returnedas the movehide utility. Theexponentialis taken in order to allow a comparisonbetweenthe log based utilities returned by the movehide and moveseek advisors, and the utility returnedby the strategic advisor whichscores moveson a linear scale. MoveSeek Advisor. The moveseek advisor suggests that a player performmovesthat are morelikely to discover information about the positions of the opponent’s invisible pieces. That is, each moveis scoredaccordingto the expected decreasein the entropy of the opponent’sinvisible pieces following the move.This expected decrease in entropy must be greater than or equal to zero as no movecan makea player less certain aboutthe position of the opponent’sinvisible pieces. A movethat covers a large numberof squares with a positive probability of occupationby the opponent’sinvisible pieces will yield a certain amountof informationwhetherit is successful or not. Clearly this type of movewill yield more informationif it is successful, as the player nowknowsthat all of those squaresare vacant. Onfailure, a player can only concludethat at least one of those squares is occupied. On the other hand, a movethat traverses only one square with a positive probability of occupationby the opponent’sinvisible pieces will yield moreinformationif unsuccessful. That is, the player nowknowsthat an invisible piece is on that square. Thus, the moveseek advisor multiplies the projected decrease in entropy for each outcomeby the probability of that outcometo get an expected utility. Theexponential 13 Nolicense:PDF produced byPStill(c) F. Siegert- http://www.this.neff~frank/pstill.html ]Weight] 0.5 5.0 50.0 Seek Hide ] J B vsB Qvs Q BvsB QvsQ 57.2 77.2 47.2 51.0 71.6 43.6 54.6 45.0 13.6 30.3 47.6 67.0 Table2: Theeffect of the movehide and moveseek advisors. of this expected changein entropy is returned as the move seekutility. 6.3 Advisor Results This section presents and discusses somepreliminary results that showthe individual effects of the movehide and moveseek advisors with varying weights. Eachresult was obtained by playing 500 gamesseparated into runs of 250 gamesas before, with one player using the strategic advisor only, against an opponentusing one of the informationtheoretic advisors and the strategic advisor. The invisible piece con gurations in this section were chosen because their results are typical of those obtained using our information-theoreticadvisors. The move hide column of Table 2 shows the results of adding the movehide advisor to an invisible queeneach (Q vs Q) and an invisible white bishopversus an invisible black bishop (B vs B). The moveseek columnshowsthe results of adding the moveseek advisor to an invisible queeneach and and one invisible white bishop versus one invisible black bishop (B vs B). The rst columnheaded "Weight" shows the weightof the information-theoretic advisor relative to the strategic advisor.Thus, with a weight of 0.5, in games played with an invisible queeneach, the player playingwith the movehide advisor won57.2%of the games. MoveHide Results. With a weight of 0.5, the movehide advisor signi cantly assists, increasingthe winrate to 57.2% and 72.2%for queen versus queen and bishop versus bishop respectively. However,as the movehide advisor weight increases past 0.5, the winrate decreasesquite rapidly. This behaviouris typical of all observedmovehide runs, and results fromthe playertaking less notice of the strategic advisor’s advice. Althoughthe opponentmaybe slightly more uncertain whenthe movehide advisor is weighted highly, the player is makingenoughstrategically poor movesto counter that advantage. Nonetheless, the movehide advisor is de nitely helpful with small weight, and the results shownfor a weightof 0.5 are statistically signi cantly different from the base case of 50%. Figure 9 showsthe difference in entropy betweenwhite and black, each playing with an invisible queen, averagedover the entire gamefor each of 250 games.To makethe trends clearer, the data points are sorted by entropy. The middie curve represents gamesplayed betweeninvisible queens with no information-theoretic advisors present, and shows that the entropydifference is fairly even. In approximately half the gamesit is negative, andin the other half it is positive, and it ranges betweenaround -1 and +1. The bottom curve represents gameswhere white played with the move hide advisor. This represents an entropy advantageto white. The difference betweenwhite and black’s uncertainty ranges from under -2 to around +1. The rest of the curve is also shifted downwards indicating the relative increase in black’s uncertainty. This increase in black’s uncertainty leads to morewins for white while using the movehide advisor. Whenan invisible piece moves,the uncertainty increases in the same mannerregardless of the nal destination of the invisible piece. Thusthe movehide advisor is mostvaluable whenany strategically advantageousmoveby an invisible piece is possible. MoveSeek Results. Table 2 (column 5) shows the move seek advisor’s effectiveness against an opponent’sinvisible bishop. As the weightincreases to around5.0, the percentage of winsincreases up to 71.6%.Thisis largely a result of the moveseek advisor assisting the player to nd and capture the opponent’sinvisible bishop early in the game.The player then has an informationadvantageequivalentto an invisible bishopversus no invisible piecesand is verylikely to win. Figure 10 showsthe entropy advantageof playing with the moveseek advisor. Thedata points are sorted by entropy to clarify the trends. Thetop curveshowsthe entropydifference with no moveseek advisor, and the bottomcurve shows that the entropy is signi cantly lower whenplaying with a moveseek advisor. In this example, the moveseek advisor assists white to reduceuncertaintyoverthe courseof the gameand therefore play strategically better than black. Table 2 (column4) showsthe moveseek advisor as applied to a gamebetweentwo invisible queens. In this situation, the moveseek advisor is muchless effective and wouldappear to probablybe detrimentalin somecases. It seemslikely that the large numberof squares an invisible queenmay have a positive probability of occupying,and the frequency of queen movement,makeit dif cult for the opponent to nd an invisible queen. The top curve in Figure 9 represents gameswherewhite played with the moveseek advisor. Anexaminationof the entropy difference slightly favours black comparedto the base case. This is almost certainly because white is spending movestrying to nd black’s in- 14 Nolicense: PDFproducedby PStill (c) F. Siegert - http://www.this.nel/~frank/pstill.htnd visible queenrather than movingtheir owninvisible queen. Onlymovesthat traverse squares that havea positive probability of occupationby the opponent’sinvisible pieces are valued by the moveseek advisor. These movesmayor may not correspond to good strategic moves. The more a player listens to the moveseek advisor, the fewergoodstrategic movesthey are able to perform.The difference betweenthe bishop-seeking behaviour and the queen-seekingbehaviour dependson the difculty of nding the opponent’sinvisible piece. Aninvisible bishopcan havea positive probability of occupyingat most half the available squares on the board, while an invisible queencan have a positive probability of occupyingall the available squares. Thus,the adviceprovided by the moveseek advisor often aids in the early capture of the invisible bishop, thereby removingthe uncertainty from the game.In contrast, following this advice whenthe invisible piece is an invisible queenleads to wastedmoves. Thusthe moveseek advisor assists whenthe uncertainty in the gameis low, but maybe useless or detrimental whenthe uncertaintyis high. 7 Conclusion and Future Work Theresults presentedin this paper are preliminaryand further exploration of the domainis required. Thereare a number of areas that require morein depth investigation. Specifically, moreaccurate prediction of the likely positioning of the opponent’sinvisible pieces is needed. This prediction could take the formof using strategic informationabout the likely destination of a movinginvisible piece, or involve evaluating the completesearch tree for more ply. Improving this distribution wouldreduce its entropy and therefore improvestrategic performance.Aside effect of this type of distribution improvement is the possibility of incorporating blufng into invisible chess. That is, movingan invisible piece to an unlikely position in order to confuse an opponent. As indicated above, one wayto improvethe prediction of the positions of the opponent’sinvisible pieces wouldbe to modelthe uncertainty regarding a player’s pieces from the opponent’sperspective for moreply. However,the problem of the combinatorial expansionin the search required as a result of this prediction needsto be resolved. The mostobvious wayto managethis explosion is to nd effective ways to prunethe gametree. As our systemcurrently stands, a non-integrated player of invisible chess (whether humanor machine)could not have the bene t of our GCDMM which maintains the approximationto the distribution of the positions of the opponent’s invisible pieces by using the true positions of all other invisible pieces whenupdatingthe distribution. Sucha player wouldneed to nd other waysof approximatingthat distribution. In summary,we have presented and discussed a complex, but controlled domainfor exploring automatedreasoning in an uncertain environmentwith a high degree of strategic complexity. Wehave motivated and introduced the use of information-theoretic advisors in the strategically complex imperfect information domainof invisible chess. Wehave shownthat our distributed-advisor approachusing a combi- Invisible ~e*n vs Invlsible Oug~. Diff*r*ncs in Entropy blt~en Black and Whir* 2 ¯ [n~ropy Diff*’rsn=s with No ~dvi=orl m ~ntropy Difference with ~Thits MovaSeeks" 1.5 ® 1 Znv Black Bishop vl 1 inv whirs Bishop, Dlff In Entropy beb~sn Black & Whirs 2 g Entropy Dlf f@renc’e wl~h ~ito s~k m ~tropy Difference with No Advienrn .... 1.5 O.5 ....... gn~rop¥~i~Dlff*r*n¢~ with~cnlts HoveHidg......... 0 ~-0.~ ~ .,"""" .... .... ..... .r" o .......,.........-,.’..... ~ -0.5 i -1.5 i -1.5 -a.5 -2.5 I lOO ! 15o | 2oo 25o Figure 10: The difference in entropy between white and black for a base case, and with moveseek, sorted by entropy. Figure 9: Entropy difference between white and black for a base case, move seek (weight 1.0) and move hide (weight 0.5), sorted by entropy. nation of information-theoretic and strategic aspects of the domain lead to performance advantages compared to using strategic expertise alone. Given the simplicity and generality of this approach, our results point towards the potential applicability to a range of strategically complex imperfect information domains. Acknowledgements The authors would like to thank Chris Wallace for his insight, advice and patience. References Albrecht, D. W.; Zukerman,I.; Nicholson, A. E.; and Bud, A. 1997. Towardsa bayesian model for keyhole plan recognition in large domains.In User Modelling. Proceedingsof the Sixth International Conference UM97,number 383 in CISMCourses and Lectures, 365-376. Springer Wien, NewYork NYUSA. Berliner, H.J. 1974. Chess as ProblemSolving. Ph.D. Dissertation, CarnegieMellonUniversity, Pittsburgh, PA. Bud, A.; Nicholson, A.; Zukerman,I.; and Albrecht, D. 1999. A hybrid architecture for strategically compleximperfect information games. In Proceedingsof the 3rd International Conference on Knowledge-BasedIntelligent Information Engineering Systems., 42--45. IEEEPress. Bud, A.; Zukerman,I.; Albrecht, D.; and Nicholson, A. 2000. Invisible chess: Rules, complexity and implementation. Technical Report forthcoming, Dept. of ComputerScience, Monash University. Ciancarini, P.; DallaLibera, E; and Maran, E 1997. Decision Makingunder Uncertainty: A Rational Approachto Kriegspiel. In van denHerik, J., and Uiterwijk,J., eds., Advancesin Computer Chess 8, 277-298. Univ. of Rulimburg. Findler, N. V. 1977. Studies in machinecognition using the game of poker. Communicationsof the ACM20(4):230-245. Frank, I., and Basin, D. 1998. Search in gameswith incomplete information:A case study using bridge card play. Arti cial Intelligence 100:87-123. Frank, I., and Basin, D. 2000.A theoretical and empirical investigation of search in imperfect informationgames.In Theoretical ComputerScience -- To appear. 15 Nolicense:PDF produced byPStill(c) F. Siegert- http://www.this.net/~frank/pstill.html 1 Gamb~ick, B.; Rayner,M.;and Pell, B. 1991. Anarchitecture for a sophisticated mechanicalBridgeplayer. In Levy,D. N., and Beal, D. F., eds., Heuristic Programming in Arti cial Intelligence -The Second ComputerOlympiad2, 87-107. Chichester, England: Ellis Horwood. Ginsberg,M. 1996. Partition search. In Proceedingsof the Thirteenth National Conferenceon Arti cial Intelligence, 228-233. AAAIPress / MITPress. Ginsberg, M. 1999. GIB:steps toward an expert-level bridgeplaying program.In Proceedingsof the Sixteenth International Joint Conferenceon Arti cial Intelligence, 584-589. Morgan Kaufmann. Koller, D., and Pfeffer, A. 1995. Generatingand solving imperfect information games.In Proceedingsof the FourteenthInternational Joint Conferenceon Arti cial Intelligence, 1185-1192. Morgan Kaufmann. Koller, D., andPfeffer, A. 1997.Representationsandsolutions for game-theoreticproblems.Arti cial Intelligence 94(1):167-215. Korb, K. B.; Nicholson, A. E.; and Jitnah, N. 1999. Bayesian poker. In Laskey, K. B., and Prade, H., eds., Proceedings of the Fifteenth Conferenceon Uncertaintyin Arti cial Intelligence, 343-350. Stockholm, Sweden: MorganKaufmann. Li, D. 1994. Kriegspiel: Chess Under Uncertainty. Premier Publishing. Pell, B. 1993. Strategy Generation and Evaluation for MetaGamePlaying. Ph.D. Dissertation, University of Cambridge. Raiffa, H. 1968. Decision analysis: introductory lectures on choices under uncertainty. Addison-Wesley. R6nyi, A. 1984. A diary on information theory. John Wiley and sons. Russell, S., and Norvig,P. 1995. Arti cial Intelligence: A Modern Approach.Prentice Hall. Smith, S. J. J., and Nau, D.S. 1993. Strategic planning for imperfect-information games. In Games:Planning and Learning, Papers from the 1993Fall Symposium,84-91. AAAIPress. Smith,S. J. J.; Nau, D. S.; and Throop,T. A. 1996. Total-order multi-agent task-network planning for contract bridge. In Proceedings of the Thirteenth National Conferenceon Arti cial Intelligence, 108-113. AAAIPress / MITPress.

Playing "Invisible Chess" with Information-Theoretic Advisors

Related documents

Products

Support

Playing &#34;Invisible Chess&#34; with Information-Theoretic Advisors

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib

Playing "Invisible Chess" with Information-Theoretic Advisors