Parameterized complexity of Sudoku
(Jena, 26 May 2008) (http://compalg.inf.elte.hu/~tony/Jena-2008)

1. History

Let n be a positive integer, N = {0, 1, 2, …, n} and N+ = {1, 2, …, n}. A Latin square L of order n is an n×n array (grid, matrix) filled with the elements of N+ (that is, L[i,j] ∈ N+ for 1 ≤ i, j ≤ n), in which each row and each column contains every element of N+ exactly once. Here is a Latin square of order 4 (the so-called back circulant, cyclic or shifted square of order 4):

1 2 3 4
2 3 4 1
3 4 1 2
4 1 2 3

Let n be a square number, that is n = m^2, and let M = {0, 1, 2, …, m^2} and M+ = {1, 2, …, m^2}. Since n is a square number, an n×n array can be divided into m·m disjoint subarrays (called blocks), each of size m×m. A sudoku square S of order m is an m^2×m^2 array filled with the elements of M+ (that is, S[i,j] ∈ M+ for 1 ≤ i, j ≤ m^2), in which each row, each column and each block contains every element of M+ exactly once. Here is a sudoku square of order 2 (the so-called shifted sudoku square of order 2):

[Figure: shifted sudoku square of order 2]

Rows, columns and blocks are called houses. The elements of the array are contained in cells.

A sudoku puzzle P of order m is an m^2×m^2 array filled with the elements of M (that is, P[i,j] ∈ M for 1 ≤ i, j ≤ m^2), where 0 marks an empty cell. To solve a sudoku puzzle P means to find all sudoku squares S that can be obtained by replacing the zeros of P with elements of M+.

According to these definitions there are 5^16 Latin puzzles of order 4 and sudoku puzzles of order 2, and 10^81 Latin puzzles of order 9 and sudoku puzzles of order 3. It is known that there are 576 Latin squares of order 4 and 288 sudoku squares of order 2 (50%), and that there are 5,524,751,496,156,892,842,531,225,600 ≈ 5.5×10^27 Latin squares of order 9 and 6,670,903,752,021,072,936,960 ≈ 6.7×10^21 sudoku squares of order 3 (≈ 0.00012%).

Here is a sudoku puzzle P of order 2 (empty cells are shown instead of 0's):

[Figure: sudoku puzzle P of order 2]
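The definitions of Latin squares and sudoku squares given above can be checked mechanically. The following Python sketch is only an illustration and is not taken from the talk; the function names is_latin_square and is_sudoku_square are chosen here, and a grid is assumed to be a list of rows holding integers.

```python
from math import isqrt


def is_latin_square(grid):
    """True if every row and column of the n x n grid holds 1..n exactly once."""
    n = len(grid)
    full = set(range(1, n + 1))
    # A row of length n whose set of values equals {1..n} has no repeats.
    if not all(len(row) == n and set(row) == full for row in grid):
        return False
    return all({grid[i][j] for i in range(n)} == full for j in range(n))


def is_sudoku_square(grid):
    """True if the grid is a Latin square of order m*m whose m x m blocks
    also hold every element of M+ exactly once."""
    n = len(grid)
    m = isqrt(n)
    if m * m != n or not is_latin_square(grid):
        return False
    full = set(range(1, n + 1))
    for br in range(0, n, m):          # top row of a block
        for bc in range(0, n, m):      # left column of a block
            block = {grid[br + i][bc + j] for i in range(m) for j in range(m)}
            if block != full:
                return False
    return True


# The cyclic Latin square of order 4 from the text.
cyclic4 = [[1, 2, 3, 4],
           [2, 3, 4, 1],
           [3, 4, 1, 2],
           [4, 1, 2, 3]]

print(is_latin_square(cyclic4))    # True
print(is_sudoku_square(cyclic4))   # False: the top-left block holds 1, 2, 2, 3
```

The cyclic square shows that the block condition is a genuine restriction: it is a Latin square of order 4 but not a sudoku square of order 2.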
According to the majority of books, the most popular form of sudoku puzzles was proposed in 1979 by Howard Garns, an American architect and puzzle maker, under the name "Number Place". Its popularity and its present name, sudoku, are due to the Japanese puzzle publisher Nikoli. We remark that the special case without blocks was investigated by Leonhard Euler, and a more general case (where the grid is divided into m regions of m cells each) was investigated by the German statistician Behrens in 1956. Similar squares and cubes were investigated more than 2000 years ago in China as magic squares [Wikipedia2008] and recently as perfect squares and cubes [Knuth2005, HorváthI2008].

2. Classification of Sudoku puzzles (according to size, number of given data, number of solutions, complexity)

A natural parameter of a sudoku puzzle is its order (m). Another parameter is the number of its given (nonzero) elements (d). The number of solutions is either zero (unsolvable or Bad puzzle) or positive (solvable puzzle). If a puzzle has exactly one solution, it is called Good. If a puzzle has more than one solution, it is solvable but Not good. According to the order m = m(P) and the number of solutions s = s(P), the sudoku puzzles can be classified as follows:

m \ s    s=0    s=1    s≥2
m=2      2B     2G     2N
m=3      3B     3G     3N
m=4      4B     4G     4N
m=5      5B     5G     5N
...      mB     mG     mN

It is easy to show that each class contains puzzles.

1) Let m ≥ 2. If P[1,1] = P[1,2] = 1, then the first row of P cannot contain all elements of M+, therefore the set mB is nonempty.

2) Now we construct an element of mG. Let P[1,i] = i (1 ≤ i ≤ m^2). Then we fill the (jm+1)-th row (1 ≤ j ≤ m−1) by shifting the first row left by j. Finally we fill the (jm+k)-th row (0 ≤ j ≤ m−1, 2 ≤ k ≤ m) by shifting the previous row left by m (in increasing order of k). Using this method we get a completely filled m^2×m^2 sudoku square for any m; regarded as a puzzle it has exactly one solution, namely itself, so it is good.
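The shifting construction of item 2) can be sketched in Python as follows. This is a minimal illustration, not code from the source: the function names shifted_sudoku and houses_ok are chosen here, and the upper bound on k is read as m (the number of rows in a band), so that row jm+k is the first row shifted left by j + (k−1)m positions.

```python
def shifted_sudoku(m):
    """Build the m^2 x m^2 square of item 2).

    Row j*m + k (1-based, 0 <= j <= m-1, 1 <= k <= m) is the first row
    1, 2, ..., m^2 shifted cyclically to the left by j + (k - 1) * m.
    """
    n = m * m
    first = list(range(1, n + 1))
    grid = []
    for j in range(m):              # index of the band of m rows
        for k in range(1, m + 1):   # position of the row inside the band
            shift = j + (k - 1) * m
            grid.append(first[shift:] + first[:shift])
    return grid


def houses_ok(grid, m):
    """Check that every row, column and block holds 1..m^2 exactly once."""
    n = m * m
    full = set(range(1, n + 1))
    rows = [set(row) for row in grid]
    cols = [{grid[i][j] for i in range(n)} for j in range(n)]
    blocks = [{grid[br + i][bc + j] for i in range(m) for j in range(m)}
              for br in range(0, n, m) for bc in range(0, n, m)]
    return all(house == full for house in rows + cols + blocks)


for row in shifted_sudoku(2):
    print(*row)
# 1 2 3 4
# 3 4 1 2
# 2 3 4 1
# 4 1 2 3

print(all(houses_ok(shifted_sudoku(m), m) for m in range(2, 6)))  # True
```

The m^2 shifts j + (k−1)m are pairwise different, which makes the rows and columns Latin, and within a block the entries run through m^2 consecutive residues, which is why the block condition also holds.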
3) We get an element of mN, the set of puzzles with several solutions, by removing all 1's and 2's from the complete square constructed in 2). One solution is the original square, and we get a second solution by writing 2's into the cells that earlier contained 1's and 1's into the cells that earlier contained 2's.

Printed and digital sources very often classify the puzzles according to their complexity, e.g. as easy, medium, hard and very hard. Our system is similar but contains more classes, namely 63 (0, 1, 2, …, 62). Class 0 contains the puzzles that have a unique solution which can be found by algorithm A0. If 1 ≤ i ≤ 62, then class i contains the puzzles that have a unique solution which can be found by algorithm Ci (which applies the algorithms A1, …, Ai) but cannot be found by algorithm Ci-1. The result is the following classification:

m \ s    s=0            s=1            s≥2
m=2      i=0, …, 62     i=0, …, 62     i=0, …, 62
m=3      i=0, …, 62     i=0, …, 62     i=0, …, 62
m=4      i=0, …, 62     i=0, …, 62     i=0, …, 62
m=5      i=0, …, 62     i=0, …, 62     i=0, …, 62
...      i=0, …, 62     i=0, …, 62     i=0, …, 62

In connection with this table we introduce a new type of puzzle (besides G, B and N): U (Undetermined) means that a given algorithm Ci could not determine the type of the investigated puzzle.

Another natural parameter of a sudoku puzzle S is the number of its given elements, d(S). The trivial bounds are 0 ≤ d(S) ≤ m(S)^4. The corresponding classification:

m \ s    s=0             s=1             s≥2
m=2      d=0, …, 16      d=0, …, 16      d=0, …, 16
m=3      d=0, …, 81      d=0, …, 81      d=0, …, 81
m=4      d=0, …, 256     d=0, …, 256     d=0, …, 256
m=5      d=0, …, 625     d=0, …, 625     d=0, …, 625
...      d=0, …, m^4     d=0, …, m^4     d=0, …, m^4

Some of these subclasses are empty. For example, a bad puzzle has at least 2 given elements. We know that a good sudoku puzzle of order 2 contains at least four given elements. It is conjectured that if a sudoku puzzle of order 3 has a unique solution, then it has at least 17 given elements. It is also known that if a sudoku puzzle has at least two solutions, then it contains at least 4 empty cells.
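For small orders, the types B, G and N used in the tables above can be determined by exhaustive search. The sketch below is only an illustration under stated assumptions: plain backtracking is not one of the algorithms Ai or Ci of the source, the names houses, count_solutions and classify are chosen here, and puzzles are assumed to be lists of rows with 0 for an empty cell. The search stops as soon as two solutions are found, which is enough to separate s = 0, s = 1 and s ≥ 2.

```python
def houses(m):
    """All rows, columns and blocks of an order-m grid as lists of (r, c)."""
    n = m * m
    rows = [[(r, c) for c in range(n)] for r in range(n)]
    cols = [[(r, c) for r in range(n)] for c in range(n)]
    blocks = [[(br + i, bc + j) for i in range(m) for j in range(m)]
              for br in range(0, n, m) for bc in range(0, n, m)]
    return rows + cols + blocks


def count_solutions(puzzle, m, limit=2):
    """Number of sudoku squares completing `puzzle`, capped at `limit`."""
    n = m * m
    grid = [row[:] for row in puzzle]
    # A given value repeated inside a house already makes the puzzle bad.
    for house in houses(m):
        given = [grid[r][c] for r, c in house if grid[r][c] != 0]
        if len(given) != len(set(given)):
            return 0

    def allowed(r, c, v):
        br, bc = r - r % m, c - c % m
        return (all(grid[r][j] != v for j in range(n))
                and all(grid[i][c] != v for i in range(n))
                and all(grid[br + i][bc + j] != v
                        for i in range(m) for j in range(m)))

    def search(pos):
        # Cells are visited in row-major order; filled cells are skipped.
        while pos < n * n and grid[pos // n][pos % n] != 0:
            pos += 1
        if pos == n * n:
            return 1
        r, c = divmod(pos, n)
        found = 0
        for v in range(1, n + 1):
            if allowed(r, c, v):
                grid[r][c] = v
                found += search(pos + 1)
                grid[r][c] = 0
                if found >= limit:
                    break
        return min(found, limit)

    return search(0)


def classify(puzzle, m):
    """Return 'B' (s = 0), 'G' (s = 1) or 'N' (s >= 2)."""
    return "BGN"[count_solutions(puzzle, m)]


full = [[1, 2, 3, 4], [3, 4, 1, 2], [2, 3, 4, 1], [4, 1, 2, 3]]
bad = [[1, 1, 0, 0]] + [[0] * 4 for _ in range(3)]
# Remove all 1's and 2's from the full square (d = 8 given elements remain).
holes = [[0 if v in (1, 2) else v for v in row] for row in full]

print(classify(bad, 2), classify(full, 2), classify(holes, 2))  # B G N
```

The three examples have d = 2, d = 16 and d = 8 given elements respectively, so each falls inside the range of d that is possible for its type.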
Using these facts, the trivial bounds on d can be improved:

m \ s    s=0             s=1              s≥2
m=2      d=2, …, 16      d=4, …, 16       d=0, …, 12
m=3      d=2, …, 81      d=17?, …, 81     d=0, …, 77
m=4      d=2, …, 256     d=?, …, 256      d=0, …, 252
m=5      d=2, …, 625     d=?, …, 625      d=0, …, 621
...      d=2, …, m^4     d=?, …, m^4      d=0, …, m^4−4

Summarizing these aspects, we give the class (type) of a puzzle in the form nti-j, where n is the parameter (order) of the puzzle, t is its type (t ∈ {R, A, T, B}), i is the minimal index of the algorithms Ci that solve the puzzle, and j is the number of given data.

3. Parameterized sudoku algorithms [Bach2006, Berthier2007, Iványi2008, Stuart2007, VanderWerl2008]

The next table contains the basic data of some parameterized sudoku algorithms:

Short name   Full name   Usual values of k   Indices in ROSA   Concrete names
Nk           NAKEDk      1, 2, 3, 4          A1, A4, A7, A10   Naked Single, Naked Pair, …
Hk           HIDDENk     1, 2, 3, 4          A2, A5, A8, A11   Hidden Single, Hidden Pair, …
Xk           Xk-WING     2, 3, 4             A6, A9, A12       X-Wing, Sword-Fish, …
Yk           Yk-WING     2, 3, 4             A13, A15, A17     XY-Wing
Wk           Wk-WING     2, 3, 4             A14, A16, A18     XYZ-Wing, XYZW-Wing

Algorithm NAKEDk (A1 in ROSE) investigates k (1 ≤ k ≤ m−1) chosen cells of a chosen house H. It counts how many different candidates (f) are allowed (not excluded) by the neighbours. There are 3 cases. If f < k, then the investigated puzzle is unsolvable. If f = k, then the given cells and candidates form a naked k-tuple, and these candidates can be excluded from the remaining cells of H. Finally, if f > k, then algorithm NAKEDk is not useful in the given situation.

Algorithm HIDDENk (A2 in ROSE) counts in how many cells (f) of a chosen house H the k chosen numbers are permitted as candidates. If f = 0, then the puzzle is unsolvable. If f = k, then the chosen numbers and cells form a hidden k-tuple: the numbers are forced to fill these k cells of H, therefore the remaining candidates can be removed from the chosen cells.

Algorithm Xk investigates a fixed candidate j in k parallel lines (rows or columns) of a given sudoku puzzle P.
First it counts in how many lines (p) perpendicular to the chosen lines the candidate j appears. If p < k, then P is bad. If p = k, then the algorithm determines how many (c) independent candidates j (belonging to different rows and different columns) are in the k^2 crosspoints of the 2k lines (the given data must be chosen in any case). If c < k, then P is bad. If c = k, then we can remove the candidate j from the remaining cells (those different from the crosspoints) of the perpendicular lines.

Algorithm Yk investigates k+2 chosen cells. One of these cells is the pivot cell (V), containing the candidates j1, j2, …, jk. Another cell is the target cell (T), which contains a candidate J different from the previous ones plus at least one further candidate (+). The remaining k cells are the pincer cells (C1, …, Ck), which are neighbours of both the pivot cell and the target cell (that is, they are in the same house as each of them), and Ci contains ji and J. The pairs V-Ci are called Y-wings. If V contains ji, then ji is excluded from Ci, therefore the content of Ci will be J. So for every possible content of V, J can be excluded from T. If k = 2, then the special name of the algorithm is XY-Wing.

Algorithm Wk is similar. The only difference is that the pivot cell contains J too, and that the pivot cell and the target cell are also neighbours. The special name of algorithm W2 is XYZ-Wing, and the special name of W3 is XYZW-Wing or WXYZ-Wing.

4. Sudoku programs [Balázs2007, Juillerat2008, VanderWerl2008]

5. Open problems

5.1. Complexity of Sudoku versions

General [Knuth2005], ASP [Yato2003], Consistent [Thom2007], Unique [????]

5.2. Spectrum of sizes of critical sets

5.3. Spectrum of sizes of reduced uniqueness sets

References

[Bach2006] Bach, Claudia: Sudoku-Trick-Kiste. AEGIS GmbH., Berlin, 2006. ISBN 978-3-9811369-13, 156 pages, no references. Claudia.Bach@sudoku4u.biz.
[Bailey2008] Bailey, R.
A.; Cameron, Peter J.; Connelly, Robert: Sudoku, gerechte designs, resolutions, affine space, spreads, reguli, and Hamming codes. The American Mathematical Monthly 115 (5) (2008), 383–404. http://www.math.cornell.edu/~connelly/.
[Balázs2007] Balázs, Viktor: Sudoku algorithms (in Hungarian). BSc thesis. Eötvös University, Budapest, 2007.
[Behrens1956] Behrens, W. U.: Feldversuchsanordnungen mit verbessertem Ausgleich der Bodenunterschiede. Zeitschrift für Landwirtschaftliches Versuchs- und Untersuchungswesen 2 (1956), 176–193.
[Berthier2007] Berthier, Denis: The Hidden Logic of Sudoku. 2nd edition. Lulu (http://www.lulu.com/content/7877877), 2007. 416 pages, 31 references. ISBN 978-1-84799-214-7.
[ColbournCS1984] Colbourn, Charles J.; Colbourn, M. J.; Stinson, D. R.: The computational complexity of recognizing critical sets. Graph Theory in Singapore 1983. Lecture Notes in Mathematics 1073 (1984), 248–253.
[Colbourn1984] Colbourn, Charles J.: The complexity of completing partial Latin squares. Discrete Appl. Math. 8 (1984), no. 1, 25–30. MR0739595 (85d:05055)
[DénesK1991] Dénes, József; Keedwell, A. Donald: Latin Squares. New Developments in the Theory and Applications. North-Holland, Amsterdam, 1991. ISBN 0 444 88899 3. 454 pages, 756 references. MR1096296 (92h:05022)
[EastonP2001] Easton, T.; Parker, R. Gary: On completing Latin squares. Discrete Appl. Math. 113 (2001), no. 2-3, 167–181. MR1857774 (2002j:05029)
[Elsholz2007] Elsholtz, Ch.; Mütze, A.: Sudoku im Mathematikunterricht. Math. Semesterber. 54 (2007), no. 1, 69–93. MR2293996. http://www.springerlink.com/content/b85310747k374329/.
[Gordon2006] Gordon, Peter: Solving Sudoku. Sterling Publishing Co., Inc., New York, 2006.
[HerzbergM2007] Herzberg, Agnes M.; Murty, M. Ram: Sudoku squares and chromatic polynomials. Notices Amer. Math. Soc. 54 (2007), no. 6, 708–717. 10 pages, 6 references. MR2327972. http://www.ams.org/notices/200706/tx070600708p.pdf.
[HorváthI2008] Horváth, Márk; Iványi, Antal: Growing perfect cubes. Discrete Math. 308 (2008). http://www.sciencedirect.om.hu/.
[Iványi2008] Iványi, Antal: Systematic Sudoku. Manuscript in Hungarian. Budapest, 2008. 152 pages, 174 references. http://compalg.inf.elte.hu/~tony/Oktatas/Rozsa/.
[Juillerat2008] Juillerat, Nicolas: Explainer. Sudoku solver program. University of Fribourg, 2008. http://diuf.unifr.ch/people/juillera/Sudoku/Sudoku.html.
[Keedwell2003] Keedwell, A. D.: Critical sets in latin squares and related matters: an update. Util. Math. 65 (2004), 97–131. MR2048415 (2005k:05047)
[Keedwell2007] Keedwell, A. Donald: On Sudoku squares. Bull. Inst. Combin. Appl. 50 (2007), 52–60. MR2317088 (2007m:05044)
[Knuth2005] Knuth, Donald E.: The Art of Computer Programming. Vol. 4, Fasc. 2. Generating all tuples and permutations. Addison-Wesley, Upper Saddle River, NJ, 2005. vi+127 pages, 124 references. ISBN 0-201-85393-0. MR2251595 (2007f:68004b). http://www.cs.utsa.edu/~wagner/knuth/.
[Quistorff2007] Quistorff, J.: Sudoku puzzles in the context of extremal problems for Latin squares and codes. Geombinatorics 17 (2007), no. 1, 19–33. MR2375676
[Stuart2007] Stuart, Andrew: The Logic of Sudoku. MM Multimedia Ltd., 2007. ISBN 978-0955-4841-0-0, XII + 284 pages, 2 references. http://www.org.uk/LogicofSudokuResources.htm.
[Thom2007] Thom, Dennis: SUDOKU ist NP-vollständig. Stuttgart, 2007. 23 pages, 10 references. http://w3studi.informatik.uni-stuttgart.de/~thomds/SudokuinNPC.pdf.
[VanderWerl2008] Van der Werl, Ruud (editor): Sudopedia. http://www.sudopedia.org/index.php?title=Main_Page/.
[Wikipedia2008] Magisches Quadrat (Lo Shu). http://de.wikipedia.org/wiki/Magisches_Quadrat.
[Yato2003] Yato, Takayuki: Complexity and Completeness of Finding Another Solution and Its Application to Puzzles. Master Thesis. The University of Tokyo, 2003. 43 pages, 26 references. http://www-imai.is.s.u-tokyo.ac.jp/~yato/data2/MasterThesis.pdf.

Budapest, May 17, 2008.