Proceedings of the Tenth Symposium on Abstraction, Reformulation, and Approximation A Framework for Combining Set Variable Representations∗ Christian Bessiere Zeynep Kiziltan, Andrea Rappini CNRS University of Montpellier, France bessiere@lirmm.fr DISI University of Bologna, Italy {zeynep,rappini}@cs.unibo.it Abstract with multiset variables. In (Law, Lee, and Woo 2009), for instance, cardinality and variety bounds are combined with subset bounds (Kiziltan and Walsh 2002). In this paper, we present a formal framework with which we can study and compare such combined representations of sets and multisets. We define combination operators, notions of consistency between representations, and of propagation. Using this framework, we launch a systematic study of representations. Due to lack of space, we focus only on set variables but our results are directly applicable to multisets. Our study provides insight into existing combined representations, as well as new and potentially useful representations. Moreover, it uncovers a number of important relations. For example, we prove that the lengthlex representation for set variables (which combines lexicographic ordering and cardinality information and is designed to work well with cardinality and ordering constraints) is incomparable to a representation for set variables that deals with the lexicographic ordering and cardinality bounds either separately or simultaneously. As a second example, we prove that the hybrid representation of (Sadler and Gervet 2008) (which combines lexicographic, subset and cardinality bounds) is tighter than one that looks at pairs of bounds separately, but looser than one that looks at all the bounds simultaneously. We demonstrate the value of our framework via theoretical and experimental results. We show that, whilst it can be intractable in general to combine representations, some combinations are powerful but polynomial to reason about. For example, for some common set constraints, reasoning with subset and cardinality bounds simultaneously is not more difficult than reasoning about them separately (as Oz and Cardinal do). Our experiments demonstrate that combining another representation with subset bounds can result in a tight approximation. Hence, one of our conclusions is that developers of solvers should consider more tightly integrating subset bounds with representations like cardinality and lengthlex. Set and multiset variables are important modelling constructs in constraint programming. Several representations have been proposed for set and multiset variables, often based on combining together different representations. In this paper, we provide a formal framework with which we can study many existing combinations of representations and compare their strength. In addition, our framework opens the door to interesting new combinations, as well as to the construction of propagators with well defined properties. We illustrate the value of the framework via both theoretical and experimental results. 1 Toby Walsh NICTA and University of New South Wales Australia toby.walsh@nicta.com.au Introduction Many constraint satisfaction problems can be modelled naturally using set variables. Consider for instance the social golfers problem (prob010 at CSPLib.org) which partitions a set of golfers into foursomes. Another example is the ternary Steiner problem from combinatorial mathematics which requires to find triples of integers from 1 to n (Lindner and Rosa 1980). Several constraint toolkits therefore provide set variables as modelling and solving construct since the earliest constraint solvers (Puget 1992; Gervet 1994). An important issue regarding a set variable is how to represent it. As an extensional representation of the domain of a set variable is exponential, more compact representations are typically used which approximate the full powerset domain. The first representation proposed for set variables, which is still used in many solvers, is subset bounds. The lower bound is the set of definite elements, whilst the upper bound is the set of possible elements. Unfortunately this is a weak approximation. To better approximate domains, new representations have been proposed that combine together different representations (e.g. (Müller 2001; Azevedo 2007; Gervet and van Hentenryck 2006; Sadler and Gervet 2008; Malitsky, Sellmann, and van Hoeve 2008)). For instance, set constraint programming in Oz (Müller 2001) and the Cardinal set solver (Azevedo 2007) both reason with subset and cardinality bounds on the set. Similar issues arise 2 Background A set variable takes as value a subset of a universe U = {1, . . . , n}. The domain may contain up to 2n values. The representation R of a set domain is typically given by bounds that limit the domain to an interval in a (partial or total) order of the subsets of 2U . The approximation DR (S) ∗ The first author has been partially funded by the EU project ICON (FP7-284715). c 2013, Association for the Advancement of Artificial Copyright Intelligence (www.aaai.org). All rights reserved. 25 lbM L (S2 ) = {} and ubM L (S1 ) = ubM L (S2 ) = {3} and the constraint c ≡ |S1 ∩ S2 | = 2. We have c[S1 ]M L = {{1, 2}, {1, 2, 3}, {1, 3}, {2, 3}}, hence BC on minlex would update lbM L (S1 ) to glbM L ({{1, 2}, {1, 2, 3}, {1, 3}, {2, 3}, {3}}) which is {1, 2}, and ubM L (S1 ) to lubM L ({{1, 2}, {1, 2, 3}, {1, 3}, {2, 3}}) which is {2, 3}. Similarly for S2 . of a domain is given by a lower and upper bound, lbR (S) and ubR (S). The subsetbounds representation (denoted SB) uses the partial order over subsets (Puget 1992; Gervet 1994). The lower bound contains the definite elements, and the upper bound contains the possible elements. Given a variable S, DSB (S) = {s | lbSB (s) ⊆ s ⊆ ubSB (s)}. T Given a set of sets A S⊆ 2U , we define glbSB (A) = a∈A a and lubSB (A) = a∈A a. Another approach is to consider lower and upper bounds, lbT and ubT within a total ordering ≤T on 2U . As before, DT (S) = {s | lbT (S) ≤T s ≤T ubT (S)}. In minlex (denoted M L), the total ordering ≤M L is defined as follows: we list elements in each set from smallest to largest and then order lists obtained in this way lexicographically. In lengthlex (denoted LL (Gervet and van Hentenryck 2006)), the total ordering ≤LL first orders sets by size, breaking ties with ≤M L . In the hybrid domain of (Sadler and Gervet 2008), the total ordering ≤maxlex is defined as follows: we list elements in each set from largest to smallest and then order lists obtained in this way lexicographically. Given a set of sets A ⊆ 2U , we define glbT (A) = min≤T (A) and lubT (A) = max≤T (A). A third approach is based on the size or value of a set. We illustrate this with the cardinality representation (card) but we could also consider cost of the set. In card, DC (S) = {s | lbC (S) ≤ |s| ≤ ubC (S)}. Given a set of sets A ⊆ 2U , we define glbC (A) = min({|a| | a ∈ A}) and lubC (A) = max({|a| | a ∈ A}). 3 Framework To present our framework, we focus on combining two representations. When using two representations simultaneously, a desirable property is that both representations agree on their bounds. Given representations R1 and R2 , we write DR1 R2 (S) for DR1 (S) ∩ DR2 (S). Definition 2 (Synchronization) Given a variable S, and two representations R1 and R2 , then R1 is synchronized with the representation R2 on S iff: • lbR1 (S) = glbR1 (DR1 R2 (S)); • ubR1 (S) = lubR1 (DR1 R2 (S)); S is synchronized for R1 and R2 iff R1 is synchronized with R2 and R2 is synchronized with R1 . Example 3 Consider a variable S with U = {1, 2, 3} represented using R1 = minlex with lbM L (S) = {1, 2} and ubM L (S) = {3}, and using R2 = SB with lbSB (S) = {} and ubSB (S) = {1, 3}. We have DM L (S) = {{1, 2}, {1, 2, 3}, {1, 3}, {2}, {2, 3}, {3}} and DSB (S) = {{}, {1}, {3}, {1, 3}}. The intersecting domain DM LSB (S) = {{1, 3}, {3}}. For minlex to be synchronized with subsetbounds, its lower bound lbM L (S) must be glbM L (DM LSB (S)) = {1, 3}. Similarly, for subsetbounds to be synchronized with minlex, its lower bound lbSB (S) must be glbSB (DM LSB (S)) = {3}. Example 1 Consider a variable S whose domain D(S) = {{1}, {1, 2}, {3}}. Given a representation R among subsetbounds, minlex, lengthlex and card, the domain DR (S) is defined as follows: • DSB (S): lbSB (S) = {} and ubSB (S) = {1, 2, 3}; • DM L (S): lbM L (S) = {1} and ubM L (S) = {3} ; • DLL (S): lbLL (S) = {1} and ubLL (S) = {1, 2} ; • DC (S): lbC (S) = {1} and ubC (S) = {2}. We consider two different ways to propagate a combined representation. In the weak sense, we apply bound consistency on each representation independently and synchronize them. In the strong sense, we ensure the lower and upper bounds of a representation are bound consistent on their synchronized domains. Given two representations R1 and R2 , the weak combination will be denoted by R1 + R2 and the strong combination by R1 × R2 . We now give an uniform definition of bound consistency (BC). Given representation R, variables S1 , . . . , Sn and a constraint c(S1 , . . . , Sn ), we denote by c[Si ]R the set {si | c(s1 , . . . , si , . . . , sn ) ∧ sj ∈ DR (Sj )∀j ∈ 1..n}. Definition 3 (Bound consistency on R1 + R2 ) Given variables S1 , . . . , Sn , and a constraint c(S1 . . . , Sn ), Si is bound consistent on c for R1 +R2 iff Si is synchronized for R1 and R2 , Si is BC on c for R1 and Si is BC on c for R2 . Definition 1 (Bound consistency on R) Given a representation R, variables S1 , S2 , . . . , Sn , and a constraint c(S1 . . . , Sn ), we say that Si is bound consistent on c for R iff lbR (Si ) = glbR (c[Si ]R ) and ubR (Si ) = lubR (c[Si ]R ). c(S1 . . . , Sn ) is bound consistent for R if for every i ∈ 1..n Si is bound consistent on c for R. Example 4 Consider a variable S with U = {1, 2, 3} represented using R1 = minlex with lbM L (S) = {1} and ubM L (S) = {3}, and using R2 = subsetbounds with lbSB (S) = {1} and ubSB (S) = {1, 2, 3}, and the constraint c ≡ |S| = 2. The variable S is BC on c for SB but is not for minlex. Enforcing BC on c for minlex would update the bounds of minlex so that their cardinality is 2, hence we get lbM L (S) = {1, 2} and ubM L (S) = {2, 3}. With these domains, S is not synchronized for minlex and SB, because SB requires that 1 belongs to S. Ensuring synchronization, ubM L (S) is further updated to {1, 3}. Now, S is BC on c for minlex + SB. When a constraint c(S1 . . . , Sn ) is not bound consistent, we can enforce bound consistency on c to reduce the search space. This increases lbR (Si ) and decreases ubR (Si ), i ∈ 1..n until bound consistency holds, without discarding solutions of c. When no such lower and upper bounds exist (i.e., when DR (S1 )×. . .×DR (Sn ) does not contain any solution of c), we say that bound consistency has failed on c. Example 2 Consider two variables S1 and S2 with U = {1, 2, 3} represented using R = minlex with lbM L (S1 ) = 26 Definition 4 (Bound consistency on R1 × R2 ) Given variables S1 , . . . , Sn and a constraint c(S1 . . . , Sn ), Si is bound consistent on c from R1 to R2 iff: Proof: Let c(S1 , . . . , Sn ) be a constraint. Suppose BC does not fail on c for R1 × R2 . Definition 4 tells us that BC on R1 × R2 requires that for every Si , lbR1 (Si ) = glbR1 (c[Si ]R1 R2 ), lbR2 (Si ) = glbR2 (c[Si ]R1 R2 ), ubR1 (Si ) = lubR1 (c[Si ]R1 R2 ), and ubR2 (Si ) = lubR2 (c[Si ]R1 R2 ). These bounds are synchronized because c[Si ]R1 R2 ⊆ D(Si )R1 R2 . These bounds are also obviously BC for R1 and BC for R2 , so BC for R1 +R2 . Therefore, if BC does not fail for R1 × R2 , then R1 + R2 cannot fail, as in the worst case, it will converge on the same bounds as for R1 × R2 . • lbR1 (S) = glbR1 (c[Si ]R1 R2 ); • ubR1 (S) = lubR1 (c[Si ]R1 R2 ); Si is bound consistent on c for R1 ×R2 iff Si is synchronized for R1 and R2 and Si is bound consistent from R1 to R2 and vice versa. Example 5 Consider two variables S1 and S2 with U = {1, 2, 3} represented using R1 = minlex with lbM L (S1 ) = lbM L (S2 ) = {1} and ubM L (S1 ) = ubM L (S2 ) = {3}, and using R2 = card with lbC (S1 ) = lbC (S2 ) = ubC (S1 ) = ubC (S2 ) = {1}, and the constraint c ≡ {2, 3} ⊆ S1 ∪ S2 . Both S1 and S2 are synchronized for minlex and card, and c is BC for both representations. Hence, S1 and S2 are BC on c for minlex + card. However, S1 and S2 are not BC on c for minlex × card. We have c[S1 ]M LC = {{2}, {3}}, hence BC on minlex × card would update lbM L (S1 ) to glbM L ({{2}, {3}}) which is {2}. Similarly for S2 . 4 When representations can represent the same domains, a stronger relation based on the values pruned can be defined. Let fc denote the stronger relation in terms of failure as in Definition 5, and pc denote in terms of pruning. We have: Theorem 3 Given two representations R1 and R2 , a constraint c: • R1 fc R2 =⇒ R1 pc R2 • R1 ≡pc R2 =⇒ R1 ≡fc R2 Pruning based definition would be useful when comparing for instance R1 × R2 and R1 + R2 where the R1 and R2 are isomorphic. As an example, consider a unary constraint c. We have (R1 ×R2 ) ≡pc (R1 +R2 ) if one of R1 and R2 is defined by a total ordering. This is because, the bounds of that ordering must satisfy c and be synchronized with the other representation at the same time. Consequently, any value pruned by R1 × R2 will necessarily be pruned by R1 + R2 . By Theorem 3, we have (R1 ×R2 ) ≡fc (R1 +R2 ). A counter example when neither representation is defined by a total ordering is given in Section 5. Pruning based definition can also give us more information when (R1 × R2 ) ≡fc (R1 + R2 ). As an example, consider the binary not equals constraint c ≡ S1 6= S2 . We have (SB×card) ≡fc (SB+card) but (SB × card) pc (SB + card), as we will show later in Section 8. In the rest of the paper, for the specific representations that we consider, we either have R1 fc R2 or R1 ∼fc R2 for all constraints c (except Section 8 where we compare representations based on a specific constraint). We therefore consider only the failure based definition and stick to the notation . Comparing Combinations Informally, one representation is stronger than another if bound consistency reasoning with the first representation is more powerful than with the second. Definition 5 (Stronger relation ) Given two representations R1 and R2 and a constraint c(S1 , . . . , Sn ) such that ∀i, Si is synchronized for R1 and R2 , we say that R1 is stronger than R2 on c (R1 c R2 ) iff when BC fails on c for R2 then BC fails on c for R1 . We say that R1 is strictly stronger than R2 on c (R1 c R2 ) iff R1 c R2 and R2 6c R1 . We say that R1 and R2 are equivalent on c (R1 ≡c R2 ) iff R1 c R2 and R2 c R1 . We say that R1 and R2 are incomparable on c (R1 ∼c R2 ) iff R1 6c R2 and R2 6c R1 . We say that R1 R2 (resp. R1 R2 or R1 ≡ R2 or R1 ∼ R2 ) iff R1 c R2 (resp. R1 c R2 or R1 ≡c R2 or R1 ∼c R2 ) for all constraints c. Definitions for comparing propagation are often based on how they prune values (e.g. (Debruyne and Bessiere 1997)). However, such definitions are limited as both representations must represent the same domains (i.e., the representations are isomorphic). By using a definition based on detection of failure, we can compare different representations of domains that are not necessarily isomorphic. This choice is similar to that made in (Bacchus et al. 2002). The following theorems are useful to compare two (combined) representations. 5 An example To illustrate the potential of this framework, we consider pairwise combinations of three of the most popular representations for sets: SB, card and minlex. Synchronization The synchronization property of Definition 2 gives rise to the following synchronization rules. Theorem 1 Given two representations R1 and R2 : • (R1 + R2 ) R1 and (R1 + R2 ) R2 • (R1 × R2 ) R1 and (R1 × R2 ) R2 . Representations minlex and card Proof: Immediate from the definitions. • lbC (S) ≤ |lbM L (S)| ≤ ubC (S) and lbC (S) ≤ |ubM L (S)| ≤ ubC (S); Theorem 2 Given two representations R1 and R2 , (R1 × R2 ) (R1 + R2 ). • lbC (S) ∈ {|s| | s ∈ DM L (S)} and ubC (S) ∈ {|s| | s ∈ DM L (S)}. 27 Minlex x SB x Card Representations minlex and subsetbounds • lbSB (S) ⊆ lbM L (S) ⊆ ubSB (S) and lbSB (S) ⊆ ubM L (S) ⊆ ubSB (S); • lbSB (S) ⊇ ∪s∈DM L (S) {s}. ∩s∈DM L (S) {s} and ubSB (S) ~ (Minlex x Card) + SB ⊆ (Minlex x SB) + Card ~ ~ (SB x Card) + Minlex Minlex + SB + Card Figure 2: Comparison of triple combinations with minlex, SB and card. H1 → H2 means H1 H2 whilst H1 −̃H2 means that H1 ∼ H2 . Representations subsetbounds and card • lbC (S) = |ubSB (S)| → lbSB (S) = ubSB (S); • ubC (S) = |lbSB (S)| → ubSB (S) = lbSB (S); 6 • lbC (S) ≥ |lbSB (S)|, ubC (S) ≤ |ubSB (S)|. This framework can be generalized to combinations of more than two representations. We can then reason with, for instance, the hybrid domain representation (Sadler and Gervet 2008) and the multiset representations of (Law, Lee, and Woo 2009; Law et al. 2011). The extension of the strong combination operator is straightforward as we can take the domain of S in R1 × R2 to be DR1 R2 (S). The extension of the weak combination operator is more problematic as the domain of S in R1 + R2 is not defined explicitly. This necessitates a generalization of the definitions of synchronization and BC. In this generalization, we first define when S is synchronized for R1 × . . . × Rn and for (R1 × . . . × Rj ) + . . . + (Rk × . . . × Rn ). Then, we define when S is BC on c for these combinations. Finally, we give some algebraic identities that are useful in comparing (combined) representations. These synchronization rules are polynomial to enforce. Pairwise Comparison Theorem 4 (minlex, SB and card pairwise comparisons) Figure 1 gives the relations that hold when comparing minlex, SB and card pairwise. Similar results hold for maxlex instead of minlex. Minlex x Card Minlex x SB SB x Card Minlex + Card Minlex + SB SB + Card Minlex Card Minlex SB SB Generalizing the Framework Card Figure 1: Pairwise comparison of representations minlex, SB and card. H1 → H2 means H1 H2 . Theorem 5 (Ri × . . . × Rk × Rk+1 × . . . × Rn ) (Ri × . . . × Rk ) + (Rk+1 × . . . × Rn ). Theorem 6 (R1 + . . . + Rn ) Rk for all k ∈ {1, . . . , n}. Proof: By Theorem 1 and Theorem 2, we have all the relations claimed in Theorem 4. We thus only have to show strictness. We highlight the most important results. To show (minlex × card) (minlex + card), take the constraint S1 ∪ S2 ⊇ {1, 3} and U = {1, 2, 3}. Suppose DM L (S1 ) = DM L (S2 ) = {{1}, {1, 2}, {1, 2, 3}, {1, 3}, {2}}; lbC (S1 ) = ubC (S1 ) = lbC (S2 ) = ubC (S2 ) = 1. S1 and S2 are synchronized for both representations. BC on minlex × card fails whilst BC on minlex + card does not prune. To show that (minlex × SB) (minlex + SB), take the constraint |S1| + |S2| = 5 and U = {1, 2, 3, 4, 5}. Suppose DM L (S1 ) = DM L (S2 ) = {{1, 5}, {2}, {2, 3}, {2, 3, 4}, {2, 3, 4, 5}, {2, 3, 5}, {2, 4}, {2, 4, 5}, {2, 5}, {3}}; lbSB (S1 ) = lbSB (S2 ) = {}, ubSB (S1 ) = ubSB (S2 ) = {1, 3, 5}. S1 and S2 are synchronized for both representations. BC on minlex × SB fails whilst BC on minlex + SB does not prune. To show that (SB × card) (SB + card), we consider a unary constraint. Take the constraint on S which is satisfied when the sum of its elements is at least 4 and U = {1, 2, 3, 4, 5}. Suppose lbSB (S) = {} and ubSB (S) = {1, 2, 3}; lbC (S) = ubC (S) = 1. S is synchronized for both representations. BC on SB × card fails whilst BC on SB + card does not prune. Theorem 7 (R1 × . . . × Rn ) Rk for all k ∈ {1, . . . , n}. For reasons of space, we skip formal definitions and proofs. To illustrate this generalization, we consider minlex, SB, and card. By applying the weak and strong combination operators in all possible ways, we obtain 5 different combined representations: minlex×SB×card, (minlex× SB)+card, (minlex×card)+SB, (SB×card)+minlex, and minlex + SB + card. The necessary synchronization and BC rules can be obtained by instantiating their formal definitions. In addition, using Theorems 5–7, we can compare the strength of these combinations and contrast with (combined) representations involving a subset of the three representations. In Figure 2, we show how the triple combinations compare. Due to space, we prove here only one of the results. Theorem 8 (minlex × SB) + card ∼ (minlex × card) + SB. Proof: We first show (minlex×card)+SB 6 (minlex× SB) + card. Take the constraint |S1 ∩ S2 | = 1 on variables S1 and S2 , and U = {1, 2, 3, 4, 5}. Suppose DM L (S1 ) = DM L (S2 ) = {{1, 5}, {2}, {2, 3}, {2, 3, 4}, {2, 3, 4, 5}, {2, 3, 5}, {2, 4}, {2, 4, 5}, {2, 5}, {3}, {3, 4}}, lbSB (S1 ) = lbSB (S2 ) = {}, ubSB (S1 ) = ubSB (S2 ) = {1, 3, 4, 5}, lbC (S1 ) = lbC (S2 ) = ubC (S1 ) = ubC (S2 ) = 2. S1 and S2 28 Minlex x Card are synchronized for all representations. (minlex × SB) + card fails ({3} is pruned thanks to synchronization with card) but (minlex × card) + SB does not prune. We now show (minlex × SB) + card 6 (minlex × card) + SB. Take the constraint S1 ∪ S2 ⊇ {2, 3, 4} on variables S1 and S2 . Suppose DM L (S1 ) = DM L (S2 ) = {{1, 2}, {1, 2, 3}, {1, 2, 3, 4}, {1, 2, 4}, {1, 3}, {1, 3, 4}, {1, 4}}, lbSB (S1 ) = lbSB (S2 ) = {1}, ubSB (S1 ) = ubSB (S2 ) = {1, 2, 3, 4}, lbC (S1 ) = lbC (S2 ) = ubC (S1 ) = ubC (S2 ) = 2. S1 and S2 are synchronized for all representations. (minlex×card)+SB fails but (minlex×SB)+card does not prune. 7 ~ Minlex + Card ~ Lengthlex ~ Minlex Card Figure 3: Comparison of minlex and card combinations with lengthlex. H1 → H2 means H1 H2 whilst H1 −̃H2 means that H1 ∼ H2 . We now show (minlex × card) 6 lengthlex. Take again the constraint (S1 <M L S2 ∧ |S1 | = 2 ∧ |S2 | = 2) and U = {1, 2, 3, 4}. Suppose DM L (S1 ) = {{1, 2, 3}, {1, 2, 3, 4}, {1, 2, 4}, {1, 3}, {1, 3, 4}, {1, 4}, {2}, {2, 3}, {2, 3, 4}, {2, 4}, {3}, {3, 4}}, DM L (S2 ) = {{1, 2}, {1, 2, 3}, {1, 2, 3, 4}, {1, 2, 4}, {1, 3}, {1, 3, 4}, {1, 4}, {2}, {2, 3}, {2, 3, 4}, {2, 4}, {3}, {3, 4}, {4}}, DLL (S1 ) = {{2, 3}, {2, 4}, {3, 4}, {1, 2, 3}}, DLL (S2 ) = {{4}, {1, 2}, {1, 3}, {1, 4}}, lbC (S1 ) = 2, ubC (S1 ) = 3, lbC (S2 ) = 1, ubC (S2 ) = 2. S1 and S2 are synchronized for both representations. With BC on minlex × card, we obtain DM L (S1 ) = {{1, 3}, {1, 3, 4}, {1, 4}, {2}, {2, 3}, {2, 3, 4}, {2, 4}}, DM L (S2 ) = {{1, 4}, {2}, {2, 3}, {2, 3, 4}, {2, 4}, {3}, {3, 4}} lbC (S1 ) = ubC (S1 ) = lbC (S2 ) = ubC (S2 ) = 2. BC on lengthlex fails. Thus, (minlex × card) 6 lengthlex, and obviously we deduce (minlex + card) 6 lengthlex and minlex 6 lengthlex. Applying the Framework In the following, we show where the existing combined representations are precisely situated in our framework. More interestingly, we present new results regarding these representations, which help us understand better their pruning capabilities and how they compare to the similar representations that can be constructed using the framework. Combination of two representations The SB +card representation was used first in set constraint programming in Oz (Müller 2001) and later for instance in the Cardinal set solver (Azevedo 2007). The representation SB × card has been previously referred to as subsetcardinality (Malitsky, Sellmann, and van Hoeve 2008) or sbc-domain (Yip and Hentenryck 2010). A similar BC definition for SB ×card is given in (Yip and Hentenryck 2010). However, there was no formal study nor implementation of SB × card. Although the lengthlex representation uses both cardinality and lexicographic information, it can be defined in our framework by a single representation T defined by a total order. Given that both lengthlex and combinations of minlex and card representations focus on the same information, we might wonder how they compare in general. Interestingly, the lengthlex representation is incomparable to representations which deal with the cardinality and lexicographic ordering information either separately or simultaneously. Theorem 9 lengthlex is incomparable to minlex, minlex + card and minlex × card. Proof: We first show lengthlex 6 minlex. Consider a conjunction of a lexicographic ordering and two cardinality constraints (S1 <M L S2 ∧ |S1 | = 2 ∧ |S2 | = 2). Suppose DM L (S1 ) = {{2}, {2, 3}, {2, 3, 4}}, DM L (S2 ) = {{1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}, {1, 2, 4}, {1, 3}}, DLL (S1 ) = {{2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}}, DLL (S2 ) = {{1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}}. S1 and S2 are synchronized for all representations. BC on minlex fails for both domains but BC on lengthlex does not: DLL (S1 ) = {{1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}} and DLL (S2 ) = {{1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}} are BC. Thus, lengthlex 6 minlex, and obviously we deduce lengthlex 6 (minlex + card) and lengthlex 6 (minlex × card). Even though our proof used a conjunction of a lexicographic constraint together with cardinality constraints, there are many other basic constraints, like subset and disjoint, where lengthlex and the minlex and card combinations are incomparable. Figure 3 summarises these results. Combination of three representations (Malitsky, Sellmann, and van Hoeve 2008) introduces the length-lex∗ representation, which is lengthlex × SB in our framework. In (Yip and Hentenryck 2010), this is called the ls-domain, but experiments were only performed with lengthlex + SB. There was no formal study nor implementation of lengthlex × SB. As lengthlex × SB and combinations of the minlex, SB and card represent similar information, we might ask how they compare in general. In Figure 4, we compare the various combinations of minlex, SB and card against lengthlex × SB. lengthlex × SB is incomparable to any minlex based combination, including minlex alone. Theorem 10 lengthlex × SB is incomparable with minlex, minlex × SB, minlex × card, minlex + SB + card, (minlex × card) + SB, (minlex × SB) + card, (SB × card) + minlex and minlex × SB × card. Proof: First we show that (lengthlex × SB) 6 minlex. Take the constraint S ≥M L {1, 3} on the variable S and U = {1, 2, 3}. Suppose DM L (S) = {{}, {1}, {1, 2}, {1, 2, 3}}, DLL (S) = {{}, {1}, {2}, {3}, {1, 2}, {1, 3}, 29 (Minlex x Card) + SB Minlex x SB x Card ~ ~ ~ ||s|| ≤ ubV (S)} and given a set of multisets A ⊆ 2U , we define glbV (A) = min({||a||, a ∈ A}) and lubV (A) = max({||a||, a ∈ A}). In (Law et al. 2011), this representation is extended by considering 8 different lexicographical orderings on multisets. The authors compared these totally ordered representations in terms of expressiveness and compactness, as well as experimentally with respect to SB + card + variety. Our framework lets us combine SB, card, variety and these other totally ordered representations in new ways, as well as compare their pruning power. (Minlex x SB) + Card Lengthlex x SB ~ (SB x Card) + Minlex ~ Minlex + SB + Card Figure 4: Comparison of minlex based triplets against lengthlex × SB. H1 −̃H2 means that H1 ∼ H2 . {2, 3}, {1, 2, 3}}, lbSB (S) = {}, ubSB (S) = {1, 2, 3}. S1 is synchronized for all representations. Enforcing BC for minlex fails whilst for lengthlex × SB it does not. Thus lengthlex × SB is not stronger than any combination of minlex. We now show that (minlex × SB × card) 6 (lengthlex×SB). Take the constraint S1 ∪S2 = {1, 2, 3, 4} on two variables S1 and S2 and U = {1, 2, 3, 4}. Suppose DLL (S1 ) = DLL (S2 ) = {{3}, {4}, {1, 2}}, lbSB (S1 ) = lbSB (S2 ) = {}, ubSB (S1 ) = ubSB (S2 ) = {1, 2, 3, 4}, DM L (S1 ) = DM L (S2 ) = {{1, 2}, {1, 2, 3}, {1, 2, 3, 4}, {1, 2, 4}, {1, 3}, {1, 3, 4}, {1, 4}, {2}, {2, 3}, {2, 3, 4}, {2, 4}, {3}, {3, 4}, {4}}, lbC (S1 ) = lbC (S1 ) = 1, ubC (S2 ) = ubC (S2 ) = 2. S1 and S2 are synchronized for all representations. lengthlex×SB fails whilst minlex×card×SB does not (note that the lower bound of card is updated to 2). Thus all minlex based combinations with SB and card are not stronger than lengthlex × SB. 8 Complexity Results Despite its benefits, combining representations can be challenging even with just two representations. For instance, synchronization is intractable to achieve in general. As another example, enforcing BC even on unary constraints can be hard. Theorem 11 (Synchronization) Enforcing synchronization of S for R1 and R2 is NP-hard even if R1 and R2 are defined by total orderings for which computing the successor of an element is polynomial. Proof: Reduction from the NP-hard problem of deciding whether the overlapping alldifferent constraint has a solution (Kutz et al. 2008). Overlapping alldifferent is a conjunction of two alldifferent constraints involving integer variables that overlap on some of these variables. An integer variable Xi is a variable whose domain D(Xi ) is a finite set of integers. Suppose we have the integer variables X1 , . . . , Xn and two constraints alldifferent(X1 , . . . , Xp ) and alldifferent(Xk , . . . , Xn ) with k < p. We build the universe U so that each occurrence of value j in a domain is associated with a distinct element: U = {vji | 1 ≤ i ≤ n, j ∈ D(Xi )}. We define the total ordering <U on U such 0 that vji <U vji 0 if and only if i < i0 or i = i0 and j < j 0 . We define Cn as the subset of 2U that contains all sets of size n. We define Sol1 as the subset of Cn that contains sets {vj11 , . . . , vjpp , . . . , vjnn } such that j1 , . . . , jp is a solution of alldifferent(X1 , . . . , Xp ). Similarly, Sol2 is the subset of Cn that contains sets {vj11 , . . . , vjkk , . . . , vjnn } such that jk , . . . , jn is a solution of alldifferent(Xk , . . . , Xn ). Let Ceven be the set of all sets in 2U \Cn of even cardinality and let Codd be the set of all sets in 2U \ Cn of odd cardinality. Representation R1 is a total ordering <R1 on 2U defined as follows. Given two sets s1 and s2 , s1 <R1 s2 if and only if s1 ∈ Sol1 ∪ Ceven and s2 ∈ / Sol1 ∪ Ceven , or s1 and s2 both belong to Sol1 ∪Ceven or are both outside Sol1 ∪Ceven and s1 <M L s2 . Representation R2 is a total ordering <R2 on 2U defined as follows. Given two sets s1 and s2 , s1 <R2 s2 if and only if s1 ∈ Sol2 ∪ Codd and s2 ∈ / Sol2 ∪ Codd , or s1 and s2 both belong to Sol2 ∪ Codd or are both outside Sol2 ∪ Codd and s1 <M L s2 . We observe that computing the successor of a set is polynomial both in R1 and in R2 because going from one solution of an alldifferent constraint to another in lexicographic order is polynomial. Suppose now that the set variable S is represented by lbR1 (S) = {}, ubR1 (S) = {vjn , vjn0 }, lbR2 (S) = {v11 }, and ubR2 (S) = {vjn }. lbR1 (S) and ubR1 (S) both belong In (Sadler and Gervet 2008), a combination of maxlex, SB and card, called a hybrid domain, is introduced and associated inference rules are given to maintain consistency between the bounds as well as to enforce some form of local consistency on common set constraints. This hybrid domain is a representation between maxlex + SB + card and maxlex × SB × card. In the hybrid domain, each bound of two representations is updated by looking at the bounds of the other two representations at the same time. This is stronger than reasoning pairwise as we do in maxlex + SB + card. However, this reasoning does not consider the bounds of all three representations simultaneously. Consider for instance a variable S with the domain lbmaxlex (S) = {4, 2, 1}, ubmaxlex (S) = {4, 3, 1}, lbSB (S) = {4}, ubSB (S) = {1, 2, 3, 4}, lbC (S) = ubC (S) = 3. The inference rules for hybrid domains will not change the subset bounds, but 1 must be in lbSB for the bounds to be synchronized. Similarly, consider a variable S with the domain lbmaxlex (S) = {4, 3}, ubmaxlex (S) = {5, 1}, lbSB (S) = {}, ubSB (S) = {1, 2, 3, 4, 5}, lbC (S) = ubC (S) = 2. The inference rules will not change the subset bounds, but 2 must be removed from ubSB (S) for the bounds to be synchronized. Thus, the hybrid domain is between a weak and strong combination. Combination of four representations (Law, Lee, and Woo 2009) proposes the combined representation for multiset variables of SB + card + variety, where variety counts the number of distinct elements ||a||. Similar to card, in variety we have DV (S) = {s | lbV (S) ≤ 30 to Ceven . Now, for any s ∈ Sol1 we have lbR1 (S) <M L s <M L ubR1 (S). Thus, for any s ∈ Sol1 we have lbR1 (S) <R1 s <R1 ubR1 (S). In addition, for any s ∈ / Sol1 ∪ Ceven we have ubR1 (S) <R1 s. As a result, Sol1 ⊆ DR1 (S) ⊆ Sol1 ∪ Ceven . Similarly, we prove that Sol2 ⊆ DR2 (S) ⊆ Sol2 ∪ Codd . The only possible sets common to DR1 (S) and DR2 (S) are in Sol1 ∩ Sol2 because Ceven ∩ Codd = ∅, Sol1 ∩ Codd = ∅, and Sol2 ∩ Ceven = ∅. Now, sets in Sol1 ∩ Sol2 correspond to tuples that satisfy both alldifferent(X1 , . . . , Xp ) and alldifferent(Xk , . . . , Xn ). Therefore, the synchronization of the bounds of S for R1 and R2 will not fail if and only if there exists a solution to the overlapping alldifferent constraint. agating SB × card, the extra inference needed for BC on SB × card on top of BC on SB + card is polynomial for many common constraints. For reasons of space, we give examples for =, 6=, and ⊆ but similar results hold for ∩, ∪, \, and ⊕. For S1 = S2 , SB × card and SB + card are equivalent so we need to do nothing more. For S2 6= S2 , the only additional pruning needed over SB + card is when one set is ground and has cardinality 1, which can be captured by a simple rule. For S1 ⊆ S2 , the extra pruning over SB + card can be captured by the following rules. Theorem 12 (BC on a unary constraint for SB × card) There exists a unary constraint c such that enforcing BC on c for SB, card and SB + card are polynomial but for SB × card is NP-hard. – ∀e ∈ lbSB (S2 ) ∩ ubSB (S1 ), add e to lbSB (S1 ) – ∀e ∈ ubSB (S2 ) ∧ e ∈ / lbSB (S2 ) ∪ ubSB (S1 ), remove e from ubSB (S2 ) • Fail: lbC (S1 ) + |lbSB (S2 ) \ ubSB (S1 )| > ubC (S2 ) • Pruning ubSB (S2 ) and lbSB (S1 ): if lbC (S1 ) + |lbSB (S2 ) \ ubSB (S1 )| = ubC (S2 ) then: • Pruning lbC (S2 ): lbC (S2 ) ≥ |lbSB (S2 )| + lbC (S1 ) − |lbSB (S2 ) ∩ ubSB (S1 )| Proof: Reduction from 3-SAT. Suppose a 3-SAT formula F on n Boolean variables. S is a set variable that contains positive and negative literals and clauses from F . We say that S contains a truth assignment iff S contains exactly one literal (positive or negative) for each of the n variables appearing in F . Let c(S) be satisfied iff (S does not contain exactly n literals) or (S contains a truth assignment that satisfies all clauses in S). Now, enforcing BC on c for SB is polynomial. There are several cases to consider. If S is ground then we simply check c. If S has all its literals set but the clauses are not ground, then there are several subcases. In the first subcase, S contains a truth assignment. If some clause in lb(S) is not satisfied we fail. Else, we prune from ub(S) all clauses that are not satisfied. In the second subcase, S contains a number of literals different from n. We then have support for all clauses. In the third subcase, we simply check that S contains n literals that are not a truth assignment and fail. Other cases (i.e., when not all literals are set) are polynomial by similar arguments. Similarly BC on c for card bounds is polynomial as every cardinality has support. Thus, BC on SB + card is polynomial. However, enforcing BC on SB × card is NP-hard. We set S such that its upper and lower bound include the m clauses in F , its lower bound is set to include no literals, its upper bound is set to include all 2n possible literals, and the cardinality of S is set to n + m. Enforcing BC on c for SB × card will fail unless there is a satisfying assignment to F . • Pruning ubC (S1 ): ubC (S1 ) ≤ ubC (S2 ) − |lbSB (S2 )| + |lbSB (S2 ) ∩ ubSB (S1 )| No need to modify lbC (S1 ), ubC (S2 ), ubSB (S1 ), lbSB (S2 ). 9 Experimental Results To compare combinations of representations which are theoretically incomparable, we ran some experiments. We select minlex (M L), lengthlex (LL), SB, card (C), and their combinations, and evaluate the information lost when transforming a domain from one of these representations to another representation. The goal is to predict how much the pruning performed by BC on one representation propagates to the others. Given a pair of (combined) representations R1 and R2 , we randomly generate the bounds of a set S in R1 , initialize the bounds of S in R2 so that DR2 (S) = 2U , and then synchronize the bounds of S in R2 with the bounds of S in R1 . Finally we compute the approximation ratio|DR2 (S)|/|DR1 (S)|. We average over 1000 sets. This gives us the entry (R1 , R2 ) in Table 1. We report results for |U | = 10. However, similar observations hold for larger universes. As we indicate in bold, 4 out of 9 representations have mean approximation ratio less than 3. These representations, in order of increasing mean approximation ratio, are M L × SB × C, LL × SB, M L × SB, and SB × C. Surprisingly, all these representations include the subset bounds. The remaining representations are either SB or do not include subset bounds, and exhibit much looser approximations. In particular, a relatively simple combination like SB × C behaves much better than for instance LL. The best binary combination LL×SB significantly improves LL. Apart from these general hardness results regarding combining representations, there also exists the issue of difficulty in reaching fix point in constraint propagation (Sellmann 2009). Our framework helps us right here to give more precise results on specific cases. For instance, consider the SB and card representations used in most constraint solvers. When combining these representations using either + or ×, we can only make polynomial number of domain changes before reaching fix point, because only a polynomial number of iterations can be made on SB and card bounds, as opposed to for instance minlex bounds. Despite the general hardness result given in Theorem 12 on prop- 10 Related Work In his Ph.D. thesis (Jefferson 2007), Jefferson has provided a framework for comparing set representations and their combinations. For instance he presented a result closely related to Theorem 12 though he obtained it via a different way. 31 Table 1: Comparison of approximation ratios of different (combined) representations. |U | = 10 Original Representation ML LL SB ML × C M L × LL M L × SB LL × SB SB × C M L × SB × C Geometric mean ML 1 5.03 19.8 2.39 41.2 30.6 48.5 27.9 38.1 13.6 LL 7.05 1 30.7 4.15 19.3 50.6 24 21.2 39.3 13.7 SB 2.41 6.53 1 5.84 10.6 2.87 3.65 1.58 4.01 3.42 ML × C 1 1.97 16.1 1 18.7 19.1 22 12.8 17.6 7.26 Target Representation M L × LL M L × SB LL × SB 1 1 2.36 1 4.82 1 11.7 1 1 1 2.28 1.7 1 5.1 2.42 15.3 1 2.42 8.98 2.6 1 10.9 1.41 1 14.3 1.34 2.05 3.98 1.87 1.54 Differently from (Jefferson 2007), our framework allows the study of many different ways of combining multiple representations most of which were not previously known. In addition, the ordering we have defined on representations lets us compare the pruning power of the (combined) representations and consequently measure their ability to prune the search space. (Jefferson 2007) instead discusses search size comparison only briefly by showing representations which can represent more and produce smaller search trees. Altogether, in our work we can see what kind of combinations can be built, how they compare and which ones already exist or not. 11 SB × C 2.38 2.27 1 2.25 5.25 2.54 2.08 1 2.51 2.11 M L × SB × C 1 1.97 1 1 3.03 1 1.83 1 1 1.3 Second, our theoretical and experimental results suggest we should develop solvers that more tightly integrate subset bounds with representations like cardinality and lengthlex. All these necessitate the design and development of efficient synchronization and propagation algorithms. Last, our framework helps us to give more precise results on the reaching of fixed points when combining represetations (Sellmann 2009). An in-depth investigation is needed to understand how certain combinations impact on this issue. References Azevedo, F. 2007. Cardinal: A finite sets constraint solver. Constraints 12(1):93–129. Bacchus, F.; Chen, X.; van Beek, P.; and Walsh, T. 2002. Binary vs. non-binary constraints. Artif. Intell. 140(1/2):1– 37. Debruyne, R., and Bessiere, C. 1997. Some practicable filtering techniques for the constraint satisfaction problem. In Proc. of IJCAI’97, 412–417. Gervet, C., and van Hentenryck, P. 2006. Length-lex ordering for set CSPs. In Proc. of AAAI’96, 48–53. Gervet, C. 1994. Conjunto: Constraint propagation over set constraints with finite set domain variables. In Proc. of ICLP’94, 733. Jefferson, C. 2007. Representations in constraint programming. Ph.D. Dissertation, University of York. Kiziltan, Z., and Walsh, T. 2002. Constraint programming with multisets. In Proc. of SymCon’02, CP’02 Workshop, 9–21. Kutz, M.; Elbassioni, K.; Katriel, I.; and Mahajan, M. 2008. Simultaneous matchings: Hardness and approximation. J. Comput. Syst. Sci. 74(5):884–897. Law, Y.; Lee, J.; Woo, M.; and Walsh, T. 2011. A comparison of lex bounds for multiset variables in constraint programming. In Proc. of AAAI’11, 61–67. Law, Y.; Lee, J.; and Woo, M. 2009. Variety reasoning for multiset constraint propagation. In Proc. of IJCAI’09, 552– 558. Lindner, C., and Rosa, A. 1980. Topics on steiner systems. Annals of Discrete Mathematics 7. Conclusions and Future Work We have defined a general framework for combining together different representations of set and multiset variables. Central to this framework is the idea of synchronization between representations. We have proposed two different combinators, a weak method which considers each representation independently, and a strong method which considers them together. To compare representations, we have proposed an ordering on representations that measures their ability to prune the search space. We have undertaken a detailed study of combinations of representations (including several like SB×card and lengthlex×SB which have been proposed previously but not compared theoretically). We identified several new relations. For example, the lengthlex representation, which simultaneously captures cardinality and lexicographic ordering information, is not better than a representation which deals with cardinality and lexicographic ordering information separately or simultaneously. As a second example, the hybrid domain of (Sadler and Gervet 2008) provides between the weak and strong combination of the constituent representations. Finally, we have provided some complexity results regarding synchronization and propagation, and performed experiments showing how good or bad a representation is in approximating others. This paper opens the door to several interesting new directions. First, we believe that our framework along with the complexity results provide useful step towards focusing on particular combinations with promising theoretical properties and conforming their interest on practical problems. 32 Malitsky, Y.; Sellmann, M.; and van Hoeve, W. J. 2008. Length-lex bounds consistency for knapsack constraints. In Proc. of CP’08, 266–281. Müller, T. 2001. Constraint Propagation in Mozart. Ph.D. Dissertation, Saarland University. Puget, J. 1992. PECOS: a high level constraint programming language. In Proc. of Singapore International Conference on Intelligent Systems (SPICIS’92), 137–142. Sadler, A., and Gervet, C. 2008. Enhancing set constraint solvers with lexicographic bounds. J. Heuristics 14(1):23– 67. Sellmann, M. 2009. On decomposing knapsack constraints for length-lex bounds consistency. In Proc. of CP’09, 762– 770. Yip, J., and Hentenryck, P. V. 2010. Exponential propagation for set variables. In Proc. of CP’10, 499–513. 33