How to Choose a Random Sudoku Board Joshua Cooper USC Department of Mathematics Rules: Place the numbers 1 through 9 in the 81 boxes, but do not let any number appear twice in any row, column, or 3 3 “box”. You start with a subset of the cells labeled, and try to finish it. 1 6 5 4 3 9 7 2 8 7 8 3 2 6 1 4 5 9 9 2 4 8 5 7 6 1 3 4 7 9 5 2 8 1 3 6 3 1 2 9 4 6 5 8 7 6 5 8 7 1 3 2 9 4 5 9 6 1 8 4 3 7 2 2 3 7 6 9 5 8 4 1 8 4 1 3 7 2 9 6 5 A Sudoku puzzle designer has two main tasks: 1. Come up with a board to use as the solution state. 2. Designate some subset of the board’s squares as the initially exposed numbers (“givens”). For example: BOX ROW CELL 1 7 9 4 3 6 5 2 8 COLUMN 6 8 2 7 1 5 9 3 4 5 3 4 9 2 8 6 7 1 4 2 8 5 9 7 1 6 3 3 6 5 2 4 1 8 9 7 9 1 7 8 6 3 4 5 2 BOARD 7 4 6 1 5 2 3 8 9 2 5 1 3 8 9 7 4 6 8 9 3 6 7 4 2 1 5 STACK 1 7 6 8 3 9 7 4 8 5 9 2 8 1 8 7 1 2 8 4 7 1 3 7 PUZZLE We’re going to focus on task #1: How to choose a good Sudoku board? 8 BAND 6 GIVEN 1 5 Not all boards are created equal. Some make lousy puzzles: 1 2 3 4 5 6 7 8 9 4 5 6 7 8 9 1 2 3 7 8 9 1 2 3 4 5 6 2 3 4 5 6 7 8 9 1 5 6 7 8 9 1 2 3 4 8 9 1 2 3 4 5 6 7 3 4 5 6 7 8 9 1 2 6 7 8 9 1 2 3 4 5 9 1 2 3 4 5 6 7 8 It would be preferable to generate random Sudoku boards when designing a puzzle. Furthermore, there are many mathematical questions one can ask about the “average” Sudoku board that require that we be able to generate random ones. For example: 1. How often are the 1 and 2 in the upper-left 3X3 box in the same column? 2. What is the average length of the longest increasing sequence of numbers that appear in any row? 3. What is the probability that the permutation of {1,…,9} that the first two rows provide is cyclic? 1 1 6 5 4 3 9 7 2 8 7 8 3 2 6 1 4 5 9 7 9 4 8 1 2 3 4 5 6 7 8 9 2 6 1 2 3 4 5 6 7 8 9 3 5 Furthermore, there are many mathematical questions one can ask about the “average” Sudoku board that require that we be able to generate random ones. For example: 1. How often are the 1 and 2 in the upper-left 3X3 box in the same column? 2. What is the average length of the longest increasing sequence of numbers that appear in any row? 3. What is the probability that the permutation of {1,…,9} that the first two rows provide is cyclic? 4. What about the “generalized Sudoku board”? For example, 16X16: Furthermore, there are many mathematical questions one can ask about the “average” Sudoku board that require that we be able to generate random ones. For example: 1. How often are the 1 and 2 in the upper-left 3X3 box in the same column? 2. What is the average length of the longest increasing sequence of numbers that appear in any row? 3. What is the probability that the permutation of {1,…,9} that the first two rows provide is cyclic? 4. What about the “generalized Sudoku board”? For example, 16X16: In order to get an approximate answer to these questions, one could: a.) Generate lots of random examples. b.) Compute the relevant statistic for each of them. c.) Average the answers. This general technique is called the “Monte Carlo” method. It is very useful for mathematical experimentation, and it comes up all the time in applied mathematics (usually to approximate some sort of integral). Attempt #1: Fill an empty board with random numbers between 1 and 9. If the result is not a valid Sudoku board, discard the result and try again. Problem #1: The chances that a random board is actually a Sudoku board is about 3 X 10-56. Even if we could check a trillion examples every second, it would still take 7 X 1025 times longer than the universe has been around before we expect to see a single valid board. Attempt #1b: Each row is actually a permutation (i.e., no number occurs twice), so generate 9 random permutations until a valid Sudoku board results. Problem #1: The chances that a random board is actually a Sudoku board is about 6 X 10-29. Again, even if we could check a trillion examples every second, it would still take 500 billion years before we expect to see a single valid board. Attempt #1c: Start with an empty board. Randomly choose an unoccupied location and fill it with a random number, chosen from among those that can legally live there. Problem #1: We may run out of legal moves! Attempt #1c addendum: Okay, so just start over if you get stuck. Problem #2: Not every board is equally likely to emerge from this process. Despite this fact, most board generating software out there uses this strategy. Attempt #2: Generate all Sudoku boards and pick one uniformly at random from the list of all of them. Problem #1: There are 6,670,903,752,021,072,936,960 (~6.7×1021 = 6.7 sextrillion) different Sudoku boards (Felgenhauer-Jarvis 2005). Even at 4 bits per symbol, this translates to about 270 billion terabytes = approx. $18 trillion ($68 per 1TB hard drive, says Google) = approx. 130% of US annual GDP Problem #2: This generalizes very poorly to larger boards. (There are about 6×1098 16X16 boards >> number of atoms in the known universe.) Attempt #3: Generate a list of one representative of each orbit of Sudoku boards under the natural symmetries: rotation, transposition, permuting symbols, permuting rows within a horizontal band, permuting columns within a vertical band, permuting horizontal bands, and permuting vertical bands. Attempt #3: Generate a list of one representative of each orbit of Sudoku boards under the natural symmetries: rotation, transposition, permuting symbols, permuting rows within a horizontal band, permuting columns within a vertical band, permuting horizontal bands, and permuting vertical bands. The operations: 1. Permuting the rows and columns of each band/stack (X 3!6) 2. Permuting bands I, II, and III, and I and stacks A, B, and C (X 3!2) 3. Permuting the numbers/colors (X 9!) II III A B C Attempt #3: Generate a list of one representative of each orbit of Sudoku boards under the natural symmetries: rotation, transposition, permuting symbols, permuting rows within a horizontal band, permuting columns within a vertical band, permuting horizontal bands, and permuting vertical bands. The operations: 1. Permuting the rows and columns of each band/stack (X 3!6) 2. Permuting bands I, II, and III, I and stacks A, B, and C (X 3!2) 3. Permuting the numbers/colors (X 9!) II 4. Rotating the board (X 2) III A B C Attempt #3: Generate a list of one representative of each orbit of Sudoku boards under the natural symmetries: rotation, transposition, permuting symbols, permuting rows within a horizontal band, permuting columns within a vertical band, permuting horizontal bands, and permuting vertical bands. The operations: 1. Permuting the rows and columns of each band/stack (X 3!6) 2. Permuting bands I, II, and III, I and stacks A, B, and C (X 3!2) 3. Permuting the numbers/colors (X 9!) II 4. Rotating the board (X 2) generate a group of order 1,218,998,108,160. III The number of orbits of this group (i.e., the number of “truly distinct” boards) A B C = 5,472,706,619. Attempt #3: Generate a list of one representative of each orbit of Sudoku boards under the natural symmetries: rotation, transposition, permuting symbols, permuting rows within a horizontal band, permuting columns within a vertical band, permuting horizontal bands, and permuting vertical bands. Problem #1: You can’t just pick a uniformly random choice of orbit: some orbits are bigger than others. In fact, you have to choose them with probability proportional to their sizes. This means doing a big computation using “Burnside’s Lemma.” Problem #2: Again, this scales very poorly. The number of orbits for the 16X16 board is approximately 2.25 × 1071. Still ridiculously large. Attempt #4: Start with some Sudoku board and make small, random changes for a while. The result should be close to uniformly random. This general strategy is known as a “random walk” or “Markov chain.” When paired with Monte-Carlo type calculations, we have “Markov Chain Monte Carlo”, or MCMC. Why is it called a “random walk”? Why is it called a “random walk”? Why is it called a “Markov chain”? Andrey Markov (Андрей Андреевич Марков) 1856 – 1922 Consider the 4X4 case (there are 288 boards, but only 2 essentially distinct ones!) 1 2 3 4 1 2 3 4 3 4 1 2 3 4 1 2 2 3 4 1 2 1 4 3 4 1 2 3 4 3 2 1 What “small changes” can we make to get between them? Consider the 4X4 case (there are 288 boards, but only 2 essentially distinct ones!) 1 2 3 4 1 2 3 4 3 4 1 2 3 4 1 2 2 3 4 1 2 1 4 3 4 1 2 3 4 3 2 1 What “small changes” can we make to get between them? 1 2 2 3 4 2 1 3 4 2 1 3 4 3 4 1 2 3 4 1 2 3 4 1 2 2 3 4 1 1 3 4 1 1 3 4 2 4 1 2 3 4 1 2 3 4 2 2 3 1 2 2 3 4 2 1 3 4 2 1 3 4 3 4 1 2 3 4 1 2 3 4 1 2 2 3 4 1 1 3 4 1 1 3 4 2 4 1 2 3 4 1 2 3 4 2 2 3 1 2 2 3 4 2 1 3 4 2 1 3 4 3 4 1 2 3 4 1 2 3 4 1 2 2 3 4 1 1 3 4 1 1 3 4 2 4 1 2 3 4 1 2 3 4 2 2 3 2 1 3 4 2 1 3 4 3 4 2 1 3 4 1 1 1 3 4 2 1 3 4 2 4 2 1 3 4 2 1 3 All we did was relabel the board by switching 1’s and 2’s! 1 2 3 4 3 2 3 4 3 2 1 4 3 4 1 2 3 4 1 2 1 4 1 2 2 3 4 1 2 3 4 1 2 3 4 1 4 1 2 3 4 1 2 3 4 1 2 3 3 2 1 4 1 4 3 2 2 3 4 1 4 1 2 3 Prop. If the sequence of moves terminates before reaching every vertex, the result is a truly different sudoku board. Proof. Let G be the group of Latin square isotopies: the group generated by relabelings, rotations, and all row and column permutations (not just in-band or in-stack). It’s not hard to see that each element g of G can be factored uniquely into a product of a relabeling L, a column permutation C, a row permutation R, and (possibly) a quarter-turn Q: g LCRQ j where j = 0 or 1. Note that the Sudoku isotopy group G0 is a subgroup of G. Suppose that g in G0 exchanges some reds and blues, but not all – and otherwise fixes the content of every cell. Write g as g LCRQ j By permuting rows and columns to group together cycles of reds and blues, we get that the action of g looks something like: g Suppose j = 0. Whether or not L flips the colors red and blue, some one of these cycles is flipped, while another is not. f a b c d e f g h j k l m n o o n m c b a e d h g k l i j i b c f e h o i n n l d a o g m The sequence of row and column permutations required to flip the colors either reverses rows or columns. Therefore, the relabeling L must permute symbols a—o. But this changes the contents of other cells – a contradiction. It’s easy to check the j = 1 case as well (and deal with the cases where the cycles are only 4 or 6 in length). But, does every Sudoku board have a cycle that terminates “early”? To restate: Define a graph H on the set of cells with a complete subgraph in each row, column, and box. Color vertices according to the contents of the cells. Define Hij to be the subgraph of H induced by vertices of color i and j. Conjecture: For any Sudoku board, there are an i and a j so that Hij is disconnected. But, does every Sudoku board have a cycle that terminates “early”? To restate: Define a graph H on the set of cells with a complete subgraph in each row, column, and box. Color vertices according to the contents of the cells. Define Hij to be the subgraph of H induced by vertices of color i and j. Conjecture: For any Sudoku board, there are an i and a j so that Hij is disconnected. Question: Can one get from any Sudoku board to any other via a sequence of such moves? (If so, then this MCMC strategy will work!) Attempt #5: Relax a linear program. Use the edges of the resulting polytope as the “moves” to make in the random walk. Write xijk for a variable that indicates whether or not cell (i, j) is occupied by color k. (So xijk = 1 if so, xijk = 0 if not.) Then, letting i, j, and k vary over {1,…,9} we have the following constraints that describe a valid Sudoku board. Attempt #5: Relax a linear program. Use the edges of the resulting polytope as the “moves” to make in the random walk. Write xijk for a variable that indicates whether or not cell (i, j) is occupied by color k. (So xijk = 1 if so, xijk = 0 if not.) Then, letting i, j, and k vary over {1,…,9} we have the following constraints that describe a valid Sudoku board. xijk 0,1 Attempt #5: Relax a linear program. Use the edges of the resulting polytope as the “moves” to make in the random walk. Write xijk for a variable that indicates whether or not cell (i, j) is occupied by color k. (So xijk = 1 if so, xijk = 0 if not.) Then, letting i, j, and k vary over {1,…,9} we have the following constraints that describe a valid Sudoku board. xijk 0,1 9 x i 1 for j,k ijk 1 = 1,…,9 9 x j 1 for i,k ijk 1 = 1,…,9 9 x k 1 for i, j ijk 1 = 1,…,9 Attempt #5: Relax a linear program. Use the edges of the resulting polytope as the “moves” to make in the random walk. Write xijk for a variable that indicates whether or not cell (i, j) is occupied by color k. (So xijk = 1 if so, xijk = 0 if not.) Then, letting i, j, and k vary over {1,…,9} we have the following constraints that describe a valid Sudoku board. xijk 0,1 9 x i 1 for j,k ijk 9 x 1 j 1 = 1,…,9 3 3 m 3 3 n x i 1 3 m j 1 3 n ijk ijk 9 x 1 for i,k = 1,…,9 1 for m,n k 1 for i, j ijk 1 = 1,…,9 = 0,1,2; k = 1,…,9 The set of these equations defines an integer program, the set of whose solutions correspond exactly to valid Sudoku boards. Attempt #5: Relax a linear program. Use the edges of the resulting polytope as the “moves” to make in the random walk. Write xijk for a variable that indicates whether or not cell (i, j) is occupied by color k. (So xijk = 1 if so, xijk = 0 if not.) Then, letting i, j, and k vary over {1,…,9} we have the following constraints that describe a valid Sudoku board. 0 xijk 1 9 x i 1 for j,k ijk 9 x 1 j 1 = 1,…,9 3 3 m 3 3 n x i 1 3 m j 1 3 n ijk ijk 9 x 1 for i,k = 1,…,9 1 for m,n k 1 for i, j ijk 1 = 1,…,9 = 0,1,2; k = 1,…,9 The set of these equations defines an integer program, the set of whose solutions correspond exactly to valid Sudoku boards. If we “relax” the first constraint, the result is a linear program, the set of whose solutions include all valid Sudoku boards. Note that there are indeed solutions to the linear program which are not solutions to the integer program. For example, set xijk = 1/9 for all i, j, and k. The set of solutions (which lives in 729-dimensional space, since there are 9X9X9 variables xijk) is a “polytope”, the higher-dimensional analogue of 2-D polygons and 3-D polyhedra. ? All valid Sudoku boards lie at vertices of this polyhedron. Each vertex is connected to some other vertices via edges of the polyhedron. Using standard linear programming techniques, it is not hard to compute these “neighboring” vertices. If we take a random walk along the resulting (automatically connected) graph, we have MCMC! Problem #1: Are there any other vertices than “proper” Sudoku boards? Problem #2: What is the diameter and expansion constant of the resulting graph? In other words, how long must one wander around the graph to ensure something close to a uniform distribution? Interested in studying any of these questions? Email me at cooper@math.sc.edu.