Constraint Satisfaction & Constraint Programming: Advanced Issues
My thanks to Toby Walsh and Roman Bartak (for "stealing some slides")
Knowledge Representation - Lecture 1

Overview
  Higher consistencies for binary constraints
    Path Consistency, k-consistency, SAC, PIC, NIC, RPC
  Non-binary constraints
    Search algorithms and consistencies for non-binary constraints
      GAC, Bounds Consistency, PWC
      specialized filtering algorithms
    Encodings of non-binary constraints into binary
      decomposable constraints
      encodings of non-binary into binary constraints
  Alternative search methods
    Limited Discrepancy Search
  Optimization

Path Consistency (PC)
  A CSP is path consistent iff every pair of assignments (x,a) and (y,b), where a and b are compatible, can be extended to a consistent assignment of every third variable
  A PC algorithm adds binary no-goods
    the assignments (x,a) and (y,b) cannot be made simultaneously
  Plus/Minus
    + detects more inconsistencies than AC
    - extensional representation of constraints
    - changes in graph connectivity
  Directional PC, Restricted PC

Path Consistency
  [Figure: constraint graph over variables X, Y and Z, each with domain {1,2}]
  Path consistency will detect that there is no solution in this problem by adding new binary constraints
  Naturally, PC is more expensive than AC
    O(n^3 d^3) time complexity compared to O(n^2 d^2)
  PC algorithms have high space complexity as well
    O(n^3 d^2)

k-consistency
  k-consistency = (k-1,1)-consistency
    a consistent assignment of (k-1) variables can be extended to any k-th variable
  strong k-consistency
    j-consistency for each j ≤ k
  NC: strong 1-consistency
  AC: strong 2-consistency
  PC: strong 3-consistency
  What is 6-consistency?
  The cost of k-consistency is exponential in k
    impractical for large k

Consistency Completeness
  strongly N-consistent constraint graph with N nodes => solution
  strongly K-consistent constraint graph with N nodes (K<N) => ???
  [Figure: variables A, B, C and D, each with domain {1,2,3}; path consistent but no solution]
  Special graph structures
    tree-structured graph => (D)AC is enough
    cycle cutset, MACE

Domain Filtering Consistencies
  An important disadvantage of path consistency, and of k-consistency in general, is that they alter the structure of the constraint graph and the constraints' relations
    this implies an exponential (in k) space complexity
  Domain filtering consistencies are consistencies that only remove values from the domains of variables
    i.e. they only add unary constraints
  Arc consistency is the most widely used such consistency
  Many more have been proposed
    inverse and singleton consistencies

Singleton Arc Consistency (SAC)
  A value a of a variable x is singleton arc consistent (SAC) if, after reducing the domain of x to {a} and applying AC, there is no domain wipe-out
  A CSP is SAC iff all values in the domains of all variables are SAC
  A SAC algorithm deletes all values that are not SAC
  Singleton consistency is a generic notion that can be combined with many consistencies
    singleton path consistency
    singleton k-consistency
  Singleton consistencies do not alter the constraints
    they only remove inconsistent values

Singleton Arc Consistency (SAC)
  [Figure: constraint graph over variables X, Y and Z, each with domain {1,2}]
  Singleton arc consistency will detect that there is no solution in this problem without adding new binary constraints
  Naturally, SAC is more expensive than AC
    O(e n d^3) time complexity compared to O(e d^2)
  The space complexity is O(e n d^2)

Path Inverse Consistency (PIC)
  k-consistency is otherwise known as (k-1,1)-consistency
    a consistent assignment of (k-1) variables can be extended to any k-th variable
    path consistency is (2,1)-consistency
  What about the inverse: k-inverse-consistency = (1,k-1)-consistency?
    a consistent assignment of 1 variable can be extended to any (k-1) variables
  The cost of k-inverse-consistency is exponential in k
    impractical for large k
  Path inverse consistency is (1,2)-consistency
    a consistent assignment of 1 variable can be extended to any 2 variables

Path Inverse Consistency (PIC)
  [Figure: the example constraint graph, with domains {1,2}]
  Path inverse consistency will detect that there is no solution in this problem without adding new binary constraints
  Naturally, PIC is more expensive than AC
    O(e d^2 + c d^3) time complexity, where c is the number of 3-cliques
  The space complexity is O(e d + c d)

Neighborhood Inverse Consistency (NIC)
  A value a of a variable x is neighborhood inverse consistent (NIC) if it can be extended to a consistent instantiation of all the neighbors of x (i.e. all variables connected to x)
  NIC deletes values that cannot be part of a solution to the sub-problem defined by the neighbors of x
  The behavior of NIC depends on the structure of the constraint graph
    what happens if the graph is complete (i.e. each variable is constrained with all the others)?
  NIC is cost effective when used on problems with sparse graphs

Neighborhood Inverse Consistency (NIC)
  [Figure: the example constraint graph, with domains {1,2}]
  NIC will detect that there is no solution in this problem
  NIC can be very expensive
    its time complexity depends on the maximum number of neighbors that a variable has
    this can be as much as n-1
  The only proposed algorithm has complexity O(e g^2 d^(g+1)), where g is the maximum degree

Restricted Path Consistency (RPC)
  Restricted Path Consistency (RPC) is an intermediate level of consistency between AC and PC
    it removes all arc inconsistent values
    and it checks the path consistency of all pairs of values (x,a), (y,b) such that (y,b) is the only support for (x,a) in the domain of y.
    if such a pair is path inconsistent, its deletion would lead to the arc inconsistency of value a of x
    RPC only removes a (it does not add the binary constraint)
  RPC performs a few more checks than AC, while deleting more values without changing the structure of the constraint graph

Restricted Path Consistency (RPC)
  [Figure: the example constraint graph, with domains {1,2}]
  RPC will detect that there is no solution in this problem
  RPC is relatively cheap
    O(e d^2 + c d^2), where c is the number of 3-cliques
  Preliminary experiments have shown that it is the most promising alternative to AC

Relations between Local Consistencies
  [Figure: diagram relating NIC, SRPC, SAC, PIC, RPC, AC and strong PC; the legend distinguishes "stronger consistency" from "incomparable consistencies"]

Non-binary Constraints: Outline
  Definition of non-binary constraints
  Modeling with non-binary constraints
  Solving non-binary CSPs
    Search and constraint propagation with non-binary constraints
    Encodings of non-binary CSPs into binary
  Practical benefits: case studies
    Golomb rulers
    Crossword puzzles
    Sudoku
  A non-exhaustive list of specialized non-binary constraints…

Definition
  Binary constraint
    Relation on 2 variables identifying those pairs of values disallowed (no-goods)
    E.g. not-equals constraint: X1 ≠ X2
  Non-binary constraint
    Relation on 3 or more variables identifying tuples of values disallowed
    E.g. alldifferent(X1,X2,X3), X1+X2+X3>X4

Some non-binary examples
  Timetabling
    Variables: Lecture1, Lecture2, …
    Values: time1, time2, …
    Constraint that lectures do not conflict: alldifferent(Lecture1,Lecture2,…)
  Scheduling
    Variables: Job1, Job2, …
    Values: machine1, machine2, …
    Constraint on number of jobs on each machine: atmost(2,[Job1,Job2,…],machine1), atmost(1,[Job1,Job2,…],machine2)

Why use non-binary constraints?
  We know that any non-binary constraint can be represented using binary constraints
    E.g.
    alldifferent(X1,X2,X3) is "equivalent" to X1 ≠ X2, X1 ≠ X3, X2 ≠ X3
  In theory, therefore, they are not needed
  But in practice, they are!
    most real problems are naturally represented using non-binary constraints

Modeling with non-binary constraints
  Benefits include:
    Natural representation of real constraints
    Compact, declarative specifications
    Efficient constraint propagation
  However, non-binary constraints pose some challenges:
    Most algorithms and techniques have been developed for binary constraints
    Can we adapt them to the non-binary case?

Modeling with non-binary constraints
  Consider the all-different constraint
  It can be represented using binary constraints
  This representation is not very compact
    alldifferent([X1,…,Xn]) expands into n(n-1)/2 binary not-equals constraints, Xi ≠ Xj
  One non-binary constraint or O(n^2) binary constraints?
  [Figure: variables x1,…,x6 linked pairwise by binary not-equals constraints, next to a single non-binary constraint over x1,…,x6]

Solving non-binary CSPs
  There are two approaches we can follow:
  Extend algorithms and heuristics for binary CSPs to deal with non-binary constraints
    Search algorithms, local consistencies, variable/value ordering heuristics, local search techniques, etc.
    Devise new algorithms if necessary
  Translate any given non-binary CSP into a binary one and solve it using standard techniques for binary constraints
    How can we do the translation? Is this efficient?
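As a small illustration of the simplest translation, the all-different expansion discussed above can be sketched in Python. This is only a sketch under assumptions: the function names `decompose_alldifferent` and `satisfies_decomposition` are illustrative and do not come from the slides.

```python
from itertools import combinations

def decompose_alldifferent(variables):
    """Return one (x, y) pair per binary not-equals constraint x != y."""
    return list(combinations(variables, 2))

def satisfies_decomposition(assignment, pairs):
    """Check a complete assignment against every binary not-equals constraint."""
    return all(assignment[x] != assignment[y] for x, y in pairs)

# Six variables give 6*5/2 = 15 binary constraints.
pairs = decompose_alldifferent(["x1", "x2", "x3", "x4", "x5", "x6"])
```

The decomposition accepts exactly the same complete assignments as the non-binary constraint; what differs, as later slides discuss, is how much propagation each form allows.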

Local Consistencies for Non-binary Constraints
  Generalized arc consistency (GAC) for non-binary constraints
  A non-binary constraint is GAC iff for every value a of a variable x there is a consistent and valid tuple including (x,a) and values for all other variables in the constraint
    A tuple is consistent if it is allowed by the constraint
    A tuple is valid if none of the values in the tuple has been removed from the corresponding domain
  Supports are not single values, but tuples of values
  We can prune values that are not supported
  GAC = AC on binary constraints
  Example: X1+X2+X3>X4 with x1, x2, x3 in {0,1,2} and x4 in {0,..,6}
    value 5 of X4 has the support <2,2,2>
    value 6 of X4 has no support

GAC is stronger than AC
  Domains: X1 {2,3}, X2 {2,3}, X3 {2,3}
  Non-binary model
    alldifferent(X1,X2,X3) is not GAC
  Binary model
    X1 ≠ X2, X1 ≠ X3, X2 ≠ X3 are all AC
  But GAC is, in general, much more expensive
    O(e k d^k) optimal time complexity, where k is the maximum arity of the constraints

GAC-3: An Algorithm for GAC
  procedure GAC-3(G)
    Let Q be the set of (undirected) constraints of G
    while Q not empty do
      select and remove any constraint c from Q
      for each variable x participating in c do
        if REVISE(x,c) changed the domain of x then
          add to Q the set of all constraints that include x (except c)

  procedure REVISE(x,c)
    for each value a in the domain of x do
      if there is no tuple in the relation of c that includes (x,a) and is consistent and valid then
        delete a from the domain of x

Algorithms for GAC
  The time complexity of GAC-3 is O(e k^2 d^(k+1))
    this can be reduced to O(e k^2 d^k) using a data structure similar to the one in AC-2001 (algorithm GAC-2001/3.1)

  procedure REVISE-2001/3.1(x,c)
    for each value a in the domain of x do
      if there is no consistent and valid tuple τ of c including (x,a) such that τ > Last(x,a,c) then
        delete a from the domain of x
      else Last(x,a,c) = first such tuple

  Other
  binary AC algorithms have been extended to the non-binary case
    GAC-4
    GAC-Schema (a generalization of AC-7)
      achieves multi-directionality and has optimal worst-case complexity O(e k d^k)
      but uses complicated data structures and is difficult to implement

Achieving GAC
  By exploiting the "semantics" of certain constraints, we can often enforce GAC much more efficiently than with a generic algorithm
  Consider alldifferent([X1,…,Xn]) with each Xi having a domain of size d
    A generic GAC algorithm runs in O(n d^n)
    A specialized GAC algorithm for the alldifferent constraint runs in O(d n √n)
      based on network flow
  Designing such algorithms is a hot topic in constraint programming research
    more on this later…

Alldifferent GAC pruning
  How to make an all-different constraint GAC?
  Given the domains, create the variable/value bipartite graph
  [Figure: bipartite graph between variables x1,…,x5 and values 1,…,5]

Alldifferent GAC pruning
  Pruning?
    Which edges are in no matching?
    Find them and prune the corresponding values from the domains
  [Figure: the same bipartite graph after pruning; the domains are sharply reduced]

Bounds Consistency
  Local consistencies for non-binary constraints are expensive in the general case
    GAC is exponential in the arity of the constraints
  Bounds consistency (BC) is a restricted (and cheap) form of (G)AC that applies (G)AC only to the values at the bounds of the variables' domains
    e.g. BC will only check whether values 0 and 9 are AC for a variable with domain {0,…,9}
  BC can be very cost effective for certain types of constraints
    can you think of such a constraint?
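Linear inequalities are one family where reasoning on bounds alone is cheap. The following is a minimal sketch, assuming integer interval domains given as (lo, hi) pairs; the function name `propagate_sum_greater` is illustrative and not taken from the slides. It tightens bounds for a constraint of the form sum(xs) > y and, on the earlier example X1+X2+X3>X4, removes value 6 from X4 without enumerating any tuples.

```python
def propagate_sum_greater(xs, y):
    """Tighten bounds for the constraint sum(xs) > y over integer intervals.

    xs is a list of (lo, hi) pairs and y is a (lo, hi) pair.
    Returns the tightened (xs, y), or None on a domain wipe-out.
    """
    xs = list(xs)
    y_lo, y_hi = y
    total_hi = sum(hi for _, hi in xs)
    # y must stay strictly below the largest achievable sum.
    y_hi = min(y_hi, total_hi - 1)
    if y_lo > y_hi:
        return None
    for i, (lo, hi) in enumerate(xs):
        # x_i must be large enough even when all other terms take their maximum.
        others_hi = total_hi - hi
        lo = max(lo, y_lo + 1 - others_hi)
        if lo > hi:
            return None
        xs[i] = (lo, hi)
    return xs, (y_lo, y_hi)

# Domains from the GAC example: x1, x2, x3 in 0..2 and x4 in 0..6.
result = propagate_sum_greater([(0, 2), (0, 2), (0, 2)], (0, 6))
```

Only the interval ends are inspected, so the cost is linear in the number of variables rather than exponential in the arity.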

Other Local Consistencies for Non-binary CSPs
  Local consistencies for non-binary constraints are not as well studied as in the binary case
    because they are expensive in general
  Some strong consistencies have been proposed
    relational consistencies
    pairwise consistency
      will say more about it later
    hyper k-consistencies
  In all these cases, the primary entities on which the consistencies operate are the constraint relations, not the variables

Search Algorithms for Non-binary CSPs
  Some search algorithms are easily generalized to the non-binary case
    MAC => MGAC
  while for others it is less obvious
    FC => ?
  Let's start with the simplest algorithm. How can we generalize chronological backtracking to handle non-binary problems?
    perform a constraint check only when the current variable is the last variable in a constraint
    e.g. constraint x1+x2+x3>x4 will be checked when the algorithm reaches x4 (assuming a static variable ordering)

FC for non-binary constraints
  Forward checking has been generalized to non-binary CSPs in many ways:
  nFC0
    check a constraint when there is exactly one variable unassigned
    e.g.
    constraint x1+x2+x3>x4 will be checked when the algorithm reaches x3
  nFC1
    apply one pass of AC to each constraint or constraint projection involving the current variable and exactly one future variable
  nFC2
    apply one pass of AC to the set of constraints involving the current variable and at least one future variable
  nFC3
    apply AC to the set of constraints involving the current variable and at least one future variable

FC for non-binary constraints (continued)
  nFC4
    apply one pass of AC to the set of constraints involving at least one past variable (or the current variable) and at least one future variable
  nFC5
    apply AC to the set of constraints involving at least one past variable (or the current variable) and at least one future variable

nFC - Example
  Allowed tuples of the three constraints:
    c1 (x y z)    c2 (u v w)    c3 (x y w)
    a a a         a a a         a a c
    a b c         a b b         a b c
    a c b         c c c
  Assume the assignments (x,a) and (u,a) are made. What pruning do algorithms nFC0-nFC5 achieve?

Constraint Programming Solvers
  CP solvers offer:
    A rich constraint language
      Arithmetic, higher-order, logical constraints
      Global constraints for natural substructures
    Easy specification of a search procedure
      Definition of the search tree to explore through modeling decisions
      Specification of the search strategy
      Choice of branching heuristics
  The user
    Models the problem as a CSP by specifying variables, domains, and constraints
    Selects the search strategy to be used
    Feeds the problem to the solver

Illustrative artificial example
  Color a map of (part of) Europe: Belgium, Denmark, France, Germany, Netherlands, Luxembourg
  No two adjacent countries may have the same color
  Are four colors enough?
    enum Country {Belgium,Denmark,France,Germany,Netherlands,Luxembourg};
    enum Colors {blue,red,yellow,gray};
    var Colors color[Country];
    solve {
      color[France] <> color[Belgium];
      color[France] <> color[Luxembourg];
      color[France] <> color[Germany];
      color[Luxembourg] <> color[Germany];
      color[Luxembourg] <> color[Belgium];
      color[Belgium] <> color[Netherlands];
      color[Belgium] <> color[Germany];
      color[Germany] <> color[Netherlands];
      color[Germany] <> color[Denmark];
    };
  Note: the variables are non-numeric and the constraints are non-linear

Constraint Programming
  CP solvers consist of:
  Domain store
    For each variable: what is the set of possible values?
    If empty for any variable, then infeasible
    If singleton for every variable, then solution
  Constraint store
    Captures interesting and well-studied substructures called global constraints
    Need to
      determine whether a constraint is feasible with respect to the domain store
      prune "impossible" values from the domains
    This is done using
      specialized or generic GAC algorithms
      bounds consistency algorithms

Turning non-binary constraints into binary
  Two methods
    Encodings: replace with binary constraints by introducing new variables
    Decompositions: (for restricted classes of non-binary constraints) replace with binary constraints on the same variables
  Theoretical results are informative
    Comparing non-binary constraint propagation with binary
    Suggest where non-binary constraints are valuable

Let's start with the easy case!
  Decomposable constraints: non-binary constraints that can be represented by binary constraints without introducing new variables
  It is a special case that sometimes occurs, and about which we can be (theoretically) quite precise
  Certain non-binary constraints decompose into binary constraints on the same variables
    sometimes called "network decomposable"

Binary decompositions
  Two examples:
    all-different(x1,x2,x3) is x1≠x2, x1≠x3, x2≠x3
    monotone(x1,x2,x3) is x1 < x2, x2 < x3
  One non-example:
    even(x1+x2+x3)
    Can you see why not?
  Can you think of any other decomposable constraints?

Binary decompositions
  Theoretical comparison
    directly compare the pruning of variables in the binary decomposition with that in the non-binary model
  Empirical experiments reinforce the theory
    decomposing non-binary constraints can add orders of magnitude to solution cost

Binary decompositions
  Upper and lower bound on FC
    nFC1 on non-binary > FC on decomposition > nFC0 on non-binary
    Gaps can be exponential
      Consider n-ary all-different with n-1 values
      nFC1 takes (n-1) branches
      FC on decomposition takes (n-1)! branches
  GAC lower bound
    GAC on non-binary > AC on decomposition
    Gap again can be exponential
    But if we decompose too much, GAC=AC!
  GAC upper bound
    In general, GAC ~ NIC, GAC ~ PIC, ..
    BUT if decomposition to clique, NIC > GAC

Binary decompositions
  Tighter results are provable for stricter classes
    Tree decomposable constraints
      the constraint graph is a tree
    Triangle preserving constraints
      non-binary constraints on all triangles

Binary decompositions
  Tree decomposable constraints
    e.g. monotone(x1,x2,x3)
  GAC=AC
    not surprising, as AC is enough to solve the problem!
  Decomposition here doesn't lose us anything
    but even one cycle is enough for GAC>AC

Binary decompositions
  Triangle preserving decomposition
    e.g.
    all-different(x1,x2,x3), quasigroups, ...
  GAC > PIC, gap can again be exponential
  GAC ~ SAC, strong PC
  PIC is a very strong consistency to be achieving at each node
    GAC can do even better than this!
    decomposition carries a very large price

Binary decompositions
  Experimental results
    quasigroup completion
    quasigroup existence
  A quasigroup is a Latin square
    completion is completing a partially filled square
    existence is finding one with additional properties

Binary decompositions
  Modelling the quasigroup problem
  Non-binary model
    n^2 vars, each with domain of size n
    2n all-different constraints (one for each row and column)
  Binary decomposition
    2n cliques of not-equals constraints

Binary decompositions
  Quasigroup completion
    Gomes & Selman report "heavy-tailed" distributions
    Maintaining AC on binary decomposition
      problems often take a long time to solve
    Maintaining GAC on all-different
      almost all problems trivial
  Quasigroup existence
    of interest to design theory
    open results first proved by computer
      in some cases, only ever proved by computer
    Maintaining GAC is very competitive compared to specialized model finders like FINDER, SEM

Binary encodings
  Every non-binary constraint can be encoded into binary constraints using a polynomial number of additional (dual) variables
  Two well-known encodings
    hidden variable encoding
      add a dual variable for each non-binary constraint
      add constraints between original and dual variables
    dual encoding
      add a dual variable for each non-binary constraint
      throw away the original variables
      add constraints between dual variables

Binary encodings
  Dual encoding
    consider c1: even(x1+x2+x3), c2: odd(x2+x3+x4)
    dual variable c1 with domain {000,011,101,110}
    dual variable c2 with domain {001,010,100,111}
    R21 = <000,001> or <011,111> or <101,010> or <110,100>

Binary encodings
  Hidden variable encoding
    consider
    c1: even(x1+x2), c2: odd(x2+x3)
  [Figure: dual variables c1 {000,011,101,110} and c2 {001,010,100,111} linked to the original variables x1, x2, x3, each with domain {0,1}, by constraints r11, r12, r21, r22, r31, r32]
    r11 = <0,0**> or <1,1**>
    r21 = <0,*0*> or <1,*1*>
    r31 = <0,**0> or <1,**1>
    etc.

Double Encoding
  The double encoding combines the dual and the hidden variable encodings
    consider c1: even(x1+x2), c2: odd(x2+x3)
  [Figure: the hidden variable encoding graph above, with the additional constraint R21 between the dual variables c1 and c2]

Binary encodings
  Hidden variable encoding
    FC on hidden ~ nFC0 on original
      each can be exponentially better than the other
    FC+ propagates through hidden variables
      FC+ on hidden = nFC1 on original
  Hidden variable encoding
    AC on hidden = GAC on original
    Before looking for an efficient (specialized) GAC algorithm, try AC on the hidden variable encoding

Binary encodings
  Dual encoding
    FC on dual ~ nFC0 on original
      each can be exponentially better than the other
    Dual is better for tight constraints
      domains for hidden vars are then small
  Dual encoding
    AC on dual > GAC on original
    But the domains of hidden variables are very large when the non-binary constraints are loose
      AC on dual is prohibitively expensive
      Or is it???
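The dual encoding of the earlier example c1: even(x1+x2+x3), c2: odd(x2+x3+x4) can be reproduced with a few lines of Python. This is a sketch under assumptions: the helper names `allowed_tuples` and `dual_constraint` are illustrative, and the constraints are given extensionally over 0/1 domains. The domains of the dual variables are the allowed tuples, and two tuples are compatible when they agree on the shared original variables.

```python
from itertools import product

def allowed_tuples(scope, predicate, domain=(0, 1)):
    """Domain of the dual variable: all tuples over `scope` allowed by the constraint."""
    return [t for t in product(domain, repeat=len(scope)) if predicate(t)]

def dual_constraint(scope1, tuples1, scope2, tuples2):
    """Pairs of tuples that agree on every variable shared by the two scopes."""
    shared = [v for v in scope1 if v in scope2]
    idx1 = [scope1.index(v) for v in shared]
    idx2 = [scope2.index(v) for v in shared]
    return [(t1, t2) for t1 in tuples1 for t2 in tuples2
            if all(t1[i] == t2[j] for i, j in zip(idx1, idx2))]

scope_c1 = ("x1", "x2", "x3")
scope_c2 = ("x2", "x3", "x4")
dom_c1 = allowed_tuples(scope_c1, lambda t: sum(t) % 2 == 0)  # even(x1+x2+x3)
dom_c2 = allowed_tuples(scope_c2, lambda t: sum(t) % 2 == 1)  # odd(x2+x3+x4)
r21 = dual_constraint(scope_c1, dom_c1, scope_c2, dom_c2)
```

The size of each dual domain equals the number of allowed tuples, which is why the encoding is attractive for tight constraints and costly for loose ones.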

Binary Encodings
  Binary encodings offer a way to use all known algorithms and heuristics developed for binary constraints in non-binary CSPs
  However, there are serious drawbacks:
    the space cost can be unmanageable
      exponential in the arity of the constraints
    the special structure of the encodings makes it difficult to apply generic methods efficiently
      consider the cost of AC in the dual encoding
      specialized algorithms must be developed
  On a brighter note:
    strong propagation can be achieved in the dual and double encodings
    heuristics can be utilized in the double encoding

Translating non-binary CSPs into binary
  Non-binary v binary decompositions
    GAC on non-binary can be stronger than PIC on decomposition
  Non-binary v binary encodings
    GAC on non-binary = AC on hidden
    AC on dual > GAC on non-binary
  Non-binary v binary decompositions
    decomposition can add significantly to search cost
  Non-binary v binary encodings
    encoding pays in practice on tight constraints

Case Studies
  Let us examine how some problems can be modeled as non-binary CSPs
  Do the (theoretical) results affect constraint solving in practice?
  Case studies
    Golomb rulers
    Crossword puzzle generation
    Sudoku

Golomb rulers
  Mark ticks on a ruler
    The distance between any two (not necessarily consecutive) ticks is distinct
  Very hard combinatorial problem with applications in radio-astronomy

Golomb rulers
  There is a simple solution:
    Build an exponentially long ruler
    Ticks at 0,1,3,7,15,31,63,…
  The challenging goal is to find minimal length rulers
  We can turn the optimization problem into a sequence of satisfaction problems
    Start with a large m
    Is there a ruler of length m?
    Is there a ruler of length m-1?
    ….
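The "sequence of satisfaction problems" idea can be sketched in Python. This is a naive illustration, not one of the CP models discussed in the lecture: all function names are hypothetical, the satisfaction problem is solved by plain backtracking, and "length m" is read as "all ticks within 0..m" so that each successful search strictly shortens the ruler.

```python
from itertools import combinations

def is_golomb(ticks):
    """True if all pairwise distances between ticks are distinct."""
    diffs = [b - a for a, b in combinations(sorted(ticks), 2)]
    return len(diffs) == len(set(diffs))

def exponential_ruler(n):
    """The simple (but very long) ruler with ticks 0, 1, 3, 7, 15, ..."""
    return [2**i - 1 for i in range(n)]

def find_ruler(n, m, ticks=(0,)):
    """Backtracking search for an n-tick Golomb ruler with all ticks in 0..m."""
    if len(ticks) == n:
        return list(ticks)
    for pos in range(ticks[-1] + 1, m + 1):
        candidate = ticks + (pos,)
        if is_golomb(candidate):
            found = find_ruler(n, m, candidate)
            if found:
                return found
    return None

def shortest_ruler(n):
    """Repeatedly ask for a strictly shorter ruler until none exists."""
    best = exponential_ruler(n)   # a valid starting point with a large m
    if n < 2:
        return best
    while True:
        shorter = find_ruler(n, best[-1] - 1)
        if shorter is None:
            return best
        best = shorter
```

For example, `shortest_ruler(4)` starts from 0,1,3,7 and stops once no ruler fits within 0..5. This brute-force search is only practical for a handful of ticks, which is exactly why the choice of model matters.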

Optimal Golomb rulers
  Known for up to 23 ticks
  Large distributed internet project to find large rulers
    0,1
    0,1,3
    0,1,4,6
    0,1,4,9,11
    0,1,4,10,12,17
    0,1,4,10,18,23,25

Modeling the Golomb ruler problem
  There is a variable Xi for each tick
  The values are the possible positions on the ruler
    What is the domain size?
  Naïve model with quaternary constraints
    For all i,j,k,l: |Xi-Xj| ≠ |Xk-Xl|
  Large number of quaternary constraints
    O(n^4) constraints
  Problems with this model
    Looseness of the quaternary constraints
      Many values satisfy |Xi-Xj| ≠ |Xk-Xl|
      Limited pruning

A better non-binary model
  Introduce auxiliary variables for the inter-tick distances
    Dij = |Xi-Xj|
    O(n^2) ternary constraints
  Post a single large non-binary constraint
    alldifferent([D12,D13,…])
    relatively tight!
  Alternatively, post a binary ≠ constraint for every pair of auxiliary variables
    limited pruning compared to alldifferent

Other modeling issues
  Symmetry
    A ruler can always be reversed!
    Break this symmetry by adding the constraint: D12 < Dn-1,n
    Also break symmetry on the Xi
      X1 < X2 < … < Xn
    Such tricks are important in many problems
  Additional (implied) constraints
    Don't change the set of solutions
    But may reduce search significantly
    E.g. D12 < D13, D23 < D24, …

Experimental results
  Problem     Naïve model (sec)   Alldifferent model (sec)
  8-Find      2.0                 0.1
  8-Prove     12.0                10.2
  9-Find      31.7                1.6
  9-Prove     168                 9.7
  10-Find     657                 24.3
  10-Prove    > 10^5              68.3

Crossword Puzzle Generation - Binary model
  [Figure: crossword grid with word slots labelled X1, …, Xi]
  Each word to be filled is a variable
  Domains? Constraints?

Crossword Puzzle Generation - Non-binary model
  [Figure: crossword grid with blank squares labelled X1, X2, X3, X4, X13, X20, X32, X43]
  Each blank square is a variable
  Domains? Constraints?

Binary encodings - Experimental results
  Crossword puzzles
    "Original" model
      vars for letters, domains {A-Z}
    Dual model
      vars for words, domains = dictionary
    Dual sometimes 1,000 times faster on larger problems
  It depends!
  Golomb rulers
    encoded using hidden variable/double encoding
    competitive with the non-binary model

Sudoku
  Sudoku is a logic-based placement puzzle.
  It consists of placing numbers from 1 to 9 in a 9-by-9 grid made up of nine 3-by-3 subgrids (called regions, boxes or blocks), starting with various numerals given in some cells (the givens or clues).
  It can be described with a single rule:
    Each row, column and region must contain all numbers from 1 to 9
  We can immediately deduce that for each row, column, and region the values in the cells have to be different. Moreover, this condition is sufficient; thus, the unique rule could be reformulated as:
    Each row, column and region must contain numbers from 1 to 9 that are all different

Sudoku - Example
  [Figure: an easy Sudoku puzzle containing 26 givens, and a solution]

Case Studies - Conclusions
  Benefits of non-binary constraints
    Compact, declarative models
    Efficient and effective constraint propagation
    Supported by many constraint toolkits
      alldifferent, atmost, cardinality, …
  Modeling decisions:
    Auxiliary variables
    Implied constraints
    Symmetry breaking constraints
  More to constraints than just declarative problem specifications!
    finding the right model can be hard

Non-binary Constraints in Practice
  Not just all-different…
    Order constraints
    Constraints on values
    Partitioning constraints
    Timetabling constraints
    Graph constraints
    Scheduling constraints
    Bin-packing constraints
    And many more…
  All these are sometimes called global constraints
  Second International Summer School of the Association for Constraint Programming: Advanced School and International Workshop on GLOBAL CONSTRAINTS, Doryssa Bay Hotel, Samos, Greece, June 18-23, 2006

Order constraints
  min(X,[Y1,..,Yn]) and max(X,[Y1,..,Yn])
    X <= minimum(Y1,..,Yn)
    X >= maximum(Y1,..,Yn)
  min_mod(X,[Y1,..,Yn],m) and max_mod(X,[Y1,..,Yn],m)
    X mod m <= minimum(Y1 mod m,..,Yn mod m)
    X mod m >= maximum(Y1 mod m,..,Yn mod m)
  min_n(X,n,[Y1,..,Ym]) and max_n(X,n,[Y1,..,Ym])
    X is the nth smallest value in Y1,..,Ym
    X is the nth largest value in Y1,..,Ym

Value constraints
  among(N,[Y1,..,Yn],[val1,..,valm])
    N vars in [Y1,..,Yn] take values in val1,..,valm
    e.g. among(2,[1,2,1,3,1,5],[3,4,5])
  count(n,[Y1,..,Ym],op,X) where op is =, ≠, <, >, ≤ or ≥
    the relation "Yi op X" holds n times
    E.g. among(n,[Y1,..,Ym],[k]) = count(n,[Y1,..,Ym],=,k)

Value constraints
  balance(N,[Y1,..,Yn])
    N = occurrences of the most frequent value - occurrences of the least frequent value
    E.g. balance(2,[1,1,1,3,4,2])
    all-different([Y1,..,Yn]) => balance(0,[Y1,..,Yn])
  Can you think of an application for this constraint?
  How can we propagate this constraint?

Value constraints
  min_nvalue(N,[Y1,..,Yn]) and max_nvalue(N,[Y1,..,Yn])
    the least (most) common value in Y1,..,Yn occurs N times
    E.g. min_nvalue(2,[1,1,2,2,2,3,3,5,5])
    Can replace multiple count or among constraints
  common(X,Y,[X1,..,Xn],[Y1,..,Ym])
    X vars in Xi take a value in Yi
    Y vars in Yi take a value in Xi
    Can you think of an application?
    E.g.
    common(3,4,[1,9,1,5],[2,1,9,9,6,9])
    among(X,[Y1,..,Yn],[val1,..,valm]) = common(X,Y,[Y1,..,Yn],[val1,..,valm])

Partitioning constraints
  all-different([X1,..,Xn])
  Other flavors
    all-different_except_0([X1,..,Xn])
      Xi ≠ Xj unless Xi=Xj=0
      0 is often used for modeling purposes as a "dummy" value
        Don't use this slab
        Don't open this bin ..

Partitioning constraints
  all-different([X1,..,Xn])
  Other flavors
    symmetric-all-different([X1,..,Xn])
      Xi ≠ Xj and Xi=j iff Xj=i
      Very common in practice
        Team i plays j iff Team j plays i ..
      Regin has proposed a very efficient algorithm

Partitioning constraints
  nvalue(N,[X1,..,Xn])
    the Xi take N different values
    all-different([X1,..,Xn]) = nvalue(n,[X1,..,Xn])
  gcc([X1,..,Xn],Lo,Hi)
    global cardinality constraint
    values in Xi occur between Lo and Hi times
    all-different([X1,..,Xn]) = gcc([X1,..,Xn],0,1)

Timetabling constraints
  change(N,[X1,..,Xn],op) where op is one of {=,<,>,<=,>=,≠}
    "Xi op Xi+1" holds N times
    E.g. change(3,[4,4,3,4,1],≠)
    You may wish to limit the number of changes of classroom, shifts, …
  longest_changes(N,[X1,..,Xn],op) where op is one of {=,<,>,<=,>=,≠}
    the longest sequence "Xi op Xi+1" is of length N
    E.g. longest_changes(2,[4,4,4,3,3,2,4,1,1,1],=)
    You may wish to limit the length of a shift without break, …

Graph constraints
  Tours in a graph are often represented by the successors:
    [X1,..,Xn] means from node i we go to node Xi
    E.g. [2,1] represents the cycle (1)->(2)->(1)
  cycle(N,[X1,..,Xn])
    there are N cycles in Xi
    e.g. cycle(2,[2,1,4,5,3]) as we have the 2 cycles (1)->(2)->(1) and (3)->(4)->(5)->(3)
  Useful for TSP-like problems (e.g. sending engineers out to repair phones)

Graph constraints
  derangement([X1,..,Xn])
    there are no length 1 cycles in Xi
    e.g.
    derangement([2,1,4,5,3]) as the 2 cycles (1)->(2)->(1) and (3)->(4)->(5)->(3) have length 2 and 3

Scheduling constraints
  cumulative([S1,..,Sn],[D1,..,Dn],[E1,..,En],[H1,..,Hn],L)
    schedules n (concurrent) jobs, each with a height Hi
      the height can denote machine usage, for example
    the ith job starts at Si, runs for Di and ends at Ei
      Ei=Si+Di
    at any time, the accumulated height of the running jobs is less than L
  [Figure: jobs drawn as rectangles over a time axis from 1 to 25]

Misc constraints
  element(Index,[a1,..,an],Var)
    Var=a_Index
    constraint programming's answer to arrays!
    e.g. element(Item,[10,23,12,15],Cost)

Non-binary Constraints - Conclusions
  Real problems are full of non-binary constraints
    most problems are naturally modeled with non-binary constraints
  We can transform any non-binary constraint into a binary one, but we lose
    expressive power
    propagation
    search
  Efficient algorithms for many non-binary constraints exist
    there is still much work to be done

Alternative Search Strategies
  We have reviewed search algorithms for solving CSPs that are based either on depth-first search or on local search
    BT, FC, MAC, FC-CBJ
    Min-Conflicts, stochastic variations of Min-Conflicts
  Many other alternative strategies for solving CSPs have been proposed in the literature
    Limited Discrepancy Search, Depth Bounded Discrepancy Search
    Genetic Algorithms (GENET project)
    Hybrids of Backtracking and Local Search

Limited Discrepancy Search (LDS)
  Discrepancy = the heuristic is not followed (a value different from the heuristic's choice is chosen)
  Idea of Limited Discrepancy Search (LDS):
    first, follow the heuristic
    when a failure occurs, explore the paths where the heuristic is not followed at most once (start with earlier violations)
    after the next failure occurs, explore the paths where the heuristic is not followed at most
    twice
  - after the next failure, explore the paths where the heuristic is not followed at most three times
  - etc.

Limited Discrepancy Search (LDS)
Example: the heuristic proposes to use the left branches
[Figure: search tree, 1 discrepancy taken (one decision made contrary to the heuristic)]

Limited Discrepancy Search (LDS)
Example: the heuristic proposes to use the left branches
[Figure: search trees, 2 discrepancies taken and 3 discrepancies taken]

Limited Discrepancy Search (LDS)
procedure LDS-PROBE(Unlabelled,Labelled,Constraints,D)
    if Unlabelled = {} then return Labelled
    select X in Unlabelled
    ValuesX ← D(X) - {values inconsistent with Labelled using Constraints}
    if ValuesX = {} then return fail
    else
        select HV in ValuesX using heuristic
        if D=0 then return LDS-PROBE(Unlabelled-{X}, Labelled ∪ {X/HV}, Constraints, 0)
        for each value V from ValuesX - {HV} do
            R ← LDS-PROBE(Unlabelled-{X}, Labelled ∪ {X/V}, Constraints, D-1)
            if R ≠ fail then return R
        end for
        return LDS-PROBE(Unlabelled-{X}, Labelled ∪ {X/HV}, Constraints, D)
    end if
end LDS-PROBE

procedure LDS(Variables,Constraints)
    for D=0 to |Variables| do    % D is the number of allowed discrepancies
        R ← LDS-PROBE(Variables,{},Constraints,D)
        if R ≠ fail then return R
    end for
    return fail
end LDS

Limited Discrepancy Search (LDS)
- LDS can be used as a search scheme in a variety of search problems (not only CSPs)
- Issues about LDS:
  - what happens when a variable has more than 2 values? Which discrepancies are tried first?
  - how can we combine LDS with constraint propagation?
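The LDS-PROBE / LDS procedures can be sketched in Python roughly as follows. This is only an illustrative sketch: the names lds_probe, lds, queens_consistent and solve_queens are hypothetical, the "heuristic" is assumed to be simply the first consistent value in domain order, and consistency is checked by a user-supplied predicate instead of a constraint store. No constraint propagation is performed.

```python
def lds_probe(unlabelled, labelled, domains, consistent, d):
    """One probe that may deviate from the value heuristic at most d times."""
    if not unlabelled:
        return labelled
    var, rest = unlabelled[0], unlabelled[1:]
    # Keep only the values compatible with the assignments made so far.
    values = [v for v in domains[var] if consistent(var, v, labelled)]
    if not values:
        return None  # fail
    # Assumed heuristic: the first consistent value in domain order.
    heuristic_value, others = values[0], values[1:]
    if d > 0:
        # Spend one discrepancy here: try every non-heuristic value.
        for v in others:
            result = lds_probe(rest, {**labelled, var: v},
                               domains, consistent, d - 1)
            if result is not None:
                return result
    # Follow the heuristic and keep the remaining discrepancies for later.
    return lds_probe(rest, {**labelled, var: heuristic_value},
                     domains, consistent, d)


def lds(variables, domains, consistent):
    """Repeat the probe with 0, 1, 2, ... allowed discrepancies."""
    for d in range(len(variables) + 1):
        result = lds_probe(list(variables), {}, domains, consistent, d)
        if result is not None:
            return result
    return None


def queens_consistent(row, col, labelled):
    """n-queens check: no shared column or diagonal with placed queens."""
    return all(col != c and abs(col - c) != abs(row - r)
               for r, c in labelled.items())


def solve_queens(n):
    """Hypothetical usage example: one variable per row, value = column."""
    rows = list(range(n))
    domains = {r: list(range(n)) for r in rows}
    return lds(rows, domains, queens_consistent)
```

Calling solve_queens(4) returns a dictionary mapping each row to a column. Note that, as in the pseudocode, a probe with D allowed discrepancies revisits paths already explored with fewer discrepancies.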
- LDS is efficient only if we have a good value ordering heuristic
- Depth Bounded Discrepancy Search (DDS) is a variant that combines LDS with iterative deepening
  - it uses an increasing depth bound to search for discrepancies high in the search tree (where heuristic mistakes are more frequent and more important)

Optimization
- So far we have looked for feasible assignments only
- In many cases users require optimal assignments, where optimality is defined by an objective function
- Definition: a Constraint Satisfaction Optimisation Problem (CSOP) consists of a standard CSP P and an objective function f mapping feasible solutions of P to numbers
- A solution to a CSOP is a solution of P that minimises / maximises the value of the objective function f
- To find a solution of a CSOP we need, in general, to explore all the feasible valuations. Thus, techniques capable of providing all the solutions of a CSP are used

Branch & Bound
- Branch and bound is perhaps the most widely used optimisation technique; it is based on cutting sub-trees that cannot contain an optimal (better) solution
- It is based on a heuristic function h that approximates the objective function f
  - a sound heuristic for minimisation satisfies h(x) ≤ f(x); in the case of maximisation, f(x) ≤ h(x)
  - a function closer to the objective function is better
- During search, a sub-tree is cut if
  - there is no feasible solution in the sub-tree
  - there is no optimal solution in the sub-tree
    - bound ≤ h(x) (for minimisation), where bound is the max. value of a feasible solution
- How to get the bound?
  - It could be the objective value of the best solution so far
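A minimal sketch of branch and bound for minimisation, assuming depth-first search over the variables in a fixed order. All names here (branch_and_bound, assignment_example, COSTS) are hypothetical and the cost matrix is made up for illustration. The bound is the objective value of the best solution found so far, and lower_bound plays the role of the sound heuristic h with h(x) ≤ f(x).

```python
import math


def branch_and_bound(variables, domains, consistent, cost, lower_bound):
    """Depth-first branch and bound (minimisation)."""
    best_value = math.inf      # bound: value of the best solution so far
    best_solution = None

    def search(index, labelled):
        nonlocal best_value, best_solution
        # Cut the sub-tree when bound <= h(x): nothing better can be below.
        if lower_bound(labelled) >= best_value:
            return
        if index == len(variables):
            value = cost(labelled)
            if value < best_value:
                best_value, best_solution = value, dict(labelled)
            return
        var = variables[index]
        for value in domains[var]:
            if consistent(var, value, labelled):
                labelled[var] = value
                search(index + 1, labelled)
                del labelled[var]

    search(0, {})
    return best_solution, best_value


# Made-up data: COSTS[w][t] is the cost of giving task t to worker w.
COSTS = [[9, 2, 7],
         [6, 4, 3],
         [5, 8, 1]]


def assignment_example():
    """Assign 3 tasks to 3 workers (all-different) at minimal total cost."""
    workers = [0, 1, 2]
    domains = {w: [0, 1, 2] for w in workers}

    def consistent(worker, task, labelled):
        return task not in labelled.values()   # all-different on tasks

    def cost(labelled):
        return sum(COSTS[w][t] for w, t in labelled.items())

    def lower_bound(labelled):
        # Cost so far plus the cheapest task of every unassigned worker;
        # it ignores all-different, so it never overestimates.
        unassigned = (w for w in workers if w not in labelled)
        return cost(labelled) + sum(min(COSTS[w]) for w in unassigned)

    return branch_and_bound(workers, domains, consistent, cost, lower_bound)
```

In this example the third top-level branch (worker 0 takes task 2) is cut without being explored, because its lower bound already exceeds the best value found. A lower bound closer to the true cost would cut more sub-trees.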