The Generalized Subgraph Problem: Complexity, Approximability and Polyhedra

Corinne Feremans, Martine Labbé, Adam Letchford, Juan José Salazar

September 10th 2003

Abstract

This paper is concerned with a problem on networks which we call the Generalized Subgraph Problem (GSP). The GSP is defined on an undirected graph where the vertex set is partitioned into clusters. The task is to find a subgraph which touches at most one vertex in each cluster so as to maximize the sum of vertex and edge weights. The GSP is a relaxation of several important problems of a ‘generalized’ type and, interestingly, has strong connections with various other well-known combinatorial problems, such as the quadratic semi-assignment, max-flow / min-cut, matching, stable set, uncapacitated facility location and max-cut problems. In this paper, we examine the GSP from a theoretical viewpoint. We show that the GSP is strongly NP-hard, but solvable in polynomial time in several special cases. We also give several approximation results. Finally, we examine two 0-1 integer programming formulations and derive new classes of valid and facet-inducing inequalities that could be useful in developing a cutting plane approach for the exact or heuristic solution of the problem.

1 Introduction

In recent years there have been several works on so-called generalized graph problems, such as the Generalized Spanning Tree Problem (GSTP) and the Generalized Travelling Salesman Problem (GTSP); see, for example, Golden, Levy, and Dahl (1981), Fischetti, Salazar, and Toth (1995, 1997), and Feremans, Labbé, and Laporte (2002a, 2002b). In all of these problems, one is given an undirected graph $G = (V, E)$, where $V$ is a vertex set partitioned into $m$ clusters $V_k$, $k \in K = \{1, \ldots, m\}$, and $E$ is the edge set; the task is to find a spanning tree, Hamiltonian cycle, or other structure which touches exactly (or at most, or at least) one vertex in each cluster and optimizes a given cost function.
More fundamental than a generalized spanning tree or generalized Hamiltonian cycle is what Feremans (2001) calls a generalized subgraph (GS). A generalized subgraph is a subgraph $G^* = (V^*, E^*)$ of $G$, not necessarily connected, such that $|V^* \cap V_k| \le 1$ for all $k \in K$. Figure 1 shows a graph and a GS. The small numbered circles represent vertices, the ovals represent the clusters and the lines represent the edges. The lines and circles in bold represent the GS.

[Figure 1: Graph with 8 vertices, 4 clusters and 7 edges.]

Consider the following problem: Given a graph $G = (V, E)$, a partition of $V$ into clusters, and weights on the vertices and edges, find a GS of maximum weight. We call this the Generalized Subgraph Problem (GSP). Notice that edges of non-positive weight can be deleted, so we assume without loss of generality that all edge-weights are positive. However, we do not require vertex-weights to be positive.

In our view the GSP is a ‘fundamental’ problem, in the sense that it is a natural relaxation of the majority of problems of a ‘generalized’ type. Moreover, as we will show, there are also interesting connections between the GSP and several other well-known combinatorial optimization problems, including the quadratic semi-assignment, max-flow / min-cut, matching, stable set, uncapacitated facility location, max-cut, vertex cover and maximum clique problems. This is the motivation for our study of the GSP.

The structure of the paper is as follows. Section 2 reviews results in Feremans (2001) and Section 3 establishes the above-mentioned connections with other known combinatorial problems. In Section 4 we examine the computational complexity of the GSP. We show that, even under very restrictive assumptions, the decision version of the GSP is strongly NP-complete.
However, we show polynomial solvability of some special cases of the GSP, including the case where the shrunk graph (the graph obtained by shrinking each cluster into a single vertex) is series-parallel. In Section 5 we examine the issue of approximability. In particular, we show that the GSP can be approximated to within a factor of $d/2$, if the degrees of all vertices in the shrunk graph are bounded from above by $d$, and within a factor of $2q$, if $q$ is the maximum cardinality of a cluster. Finally, in Section 6 we examine two 0-1 integer programming formulations of the GSP and derive new classes of valid and facet-inducing inequalities. Conclusions are given in Section 7.

The following notation will be used in the remainder of the paper. For any $S_1, S_2 \subset V$ with $S_1 \cap S_2 = \emptyset$, $E(S_1 : S_2)$ denotes the set of edges with one end-vertex in $S_1$ and the other in $S_2$. When $S_1$ contains only a single vertex $v$, we write $E(v : S_2)$ rather than $E(\{v\} : S_2)$ for brevity. We also write $\delta(S)$ for $E(S : V \setminus S)$ and $\delta(v)$ for $\delta(\{v\})$.

The shrunk graph is denoted by $G^S = (V^S, E^S)$. It is obtained by contracting each cluster into a single vertex (hence $|V^S| = m$) and by merging all parallel edges into single ones. That is, there exists an edge between two vertices of the shrunk graph if and only if there exists at least one edge between the two corresponding clusters in $G$.

Other useful definitions from graph theory are the following. A cut-vertex (or articulation point) in a graph is a vertex whose removal disconnects the graph. A connected graph with no cut-vertices is called biconnected. A maximal biconnected subgraph of a graph is called a block. Any graph can be decomposed into blocks, where each edge appears in exactly one block, and these blocks form a tree structure. The block decomposition and associated tree structure can be found in linear time (see Tarjan (1972)).
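The construction of the shrunk graph described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `shrunk_graph` and the input encoding (a dictionary `cluster_of` mapping each vertex to a cluster label, and a list of vertex pairs) are assumptions made here, not notation from the paper. Edges with both end-vertices in the same cluster are dropped, on the assumption that they can never appear in a generalized subgraph.

```python
def shrunk_graph(cluster_of, edges):
    """Contract each cluster to one vertex and merge parallel edges.

    cluster_of: dict mapping each vertex of G to its cluster label (hypothetical encoding).
    edges: iterable of (u, v) pairs, the edges of G.
    Returns (shrunk_vertices, shrunk_edges), where each shrunk edge is a
    frozenset of two distinct cluster labels.
    """
    shrunk_vertices = set(cluster_of.values())
    shrunk_edges = set()
    for u, v in edges:
        ku, kv = cluster_of[u], cluster_of[v]
        if ku != kv:  # intra-cluster edges are ignored (assumption)
            # A set stores each cluster pair once, which merges parallel edges.
            shrunk_edges.add(frozenset((ku, kv)))
    return shrunk_vertices, shrunk_edges


# Small hypothetical instance: clusters A = {1, 2}, B = {3}, C = {4}.
cluster_of = {1: "A", 2: "A", 3: "B", 4: "C"}
vertices, edges = shrunk_graph(cluster_of, [(1, 3), (2, 3), (3, 4)])
print(len(vertices), len(edges))
```

In this example the two edges between clusters A and B collapse into a single shrunk edge, so the shrunk graph is a path on three vertices.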
2 Known Results

As pointed out by Feremans (2001), the GSP can be formulated as the following 0-1 integer program. Let $x_e$ be a variable taking the value 1 if and only if edge $e$ is in the GS, and similarly $y_v$ for vertex $v$. The standard convention is also used: for any $F \subset E$, $x(F)$ denotes $\sum_{e \in F} x_e$ and, similarly, for any $S \subseteq V$, $y(S)$ denotes $\sum_{v \in S} y_v$. Then the GSP is equivalent to:

\begin{align}
\max \ & \sum_{e \in E} w_e x_e + \sum_{v \in V} p_v y_v \tag{1}
\end{align}

subject to:

\begin{align}
y(V_k) &\le 1 && \text{for all } k \in K, \tag{2} \\
x(E(v : V_k)) &\le y_v && \text{for all } k \in K \text{ and all } v \in V \setminus V_k, \tag{3} \\
x_e &\in \{0, 1\} && \text{for all } e \in E, \tag{4} \\
y_v &\in \{0, 1\} && \text{for all } v \in V. \tag{5}
\end{align}

In this paper we will denote by $P^{xy}(G)$ the convex hull of feasible solutions to (2)–(5). In Feremans (2001), it is shown that $P^{xy}(G)$ is full-dimensional and that the inequalities (2) and (3), along with the trivial inequalities $x_e \ge 0$ for all $e \in E$, induce facets. Two more complex classes of inequalities were also introduced. The first are known as odd cycle inequalities (see Figure 2 for an illustration).

Proposition 1 (Odd Cycle Inequalities, Feremans (2001)). Let $c \ge 3$ be an odd integer and let $V_1, \ldots, V_c$ be clusters. Suppose that each cluster $V_k$ is partitioned into two non-empty subsets $V_k^1$ and $V_k^2$. Then the odd cycle inequality

\begin{align}
\sum_{k=1}^{c} x(E(V_k^1 : V_{(k \bmod c)+1}^2)) \le \lfloor c/2 \rfloor \tag{6}
\end{align}

defines a facet of $P^{xy}(G)$.

The second are known as odd clique matching inequalities (see Figure 3 for an illustration).

Proposition 2 (Odd Clique Matching Inequalities (OCMI), Feremans (2001)). Let $c \ge 3$ be an odd integer and let $V_1, \ldots, V_c$ be clusters. Suppose that each cluster $V_k$ is partitioned into $c - 1$ subsets $V_k^l$ for $l = 1, \ldots, c - 1$. If the subsets $V_k^l$ are all non-empty then the odd clique matching inequality

\begin{align}
\sum_{k=1}^{c-1} \sum_{l=k}^{c-1} x(E(V_k^l : V_{l+1}^k)) \le \lfloor c/2 \rfloor \tag{7}
\end{align}

defines a facet of $P^{xy}(G)$.

We would like to point out that it is not necessary for all of the $V_k^l$ to be non-empty for an OCMI to induce a facet.
Indeed, the odd cycle inequalities already mentioned can be regarded as OCMIs in which some of the $V_k^l$ are empty. In fact, the conditions under which OCMIs are facet-inducing are rather subtle. We explore this issue in Subsection 6.4.

Feremans (2001) also considered the projection of $P^{xy}(G)$ into $x$-space, which we shall denote by $P^x(G)$. This is of interest for GSP instances in which the vertex-weights are zero, because then the $y$ variables are not necessary. Feremans shows that $P^x(G)$ is also full-dimensional and that the odd cycle and odd clique matching inequalities, together with the trivial inequalities $x_e \ge 0$ for each $e \in E$, induce facets.

[Figure 2: Support graph of the odd cycle constraint for c = 5.]

[Figure 3: Support graph of the odd clique-matching constraint for c = 5.]

3 Related Problems

As already mentioned, the GSP is a natural relaxation for many problems of a ‘generalized’ type. Interestingly, it is also related to several other well-known combinatorial optimization problems. This is the topic of this section.

We begin by pointing out a connection between the GSP and the so-called quadratic semi-assignment problem (QSAP), in which a quadratic function of 0-1 variables must be minimized subject to the semi-assignment constraints (see Burkard and Çela (1997), Malucelli and Pretolani (1995)). More precisely, this problem can be formulated as

\begin{align*}
\min \ F(x) := \ & \sum_{i} \sum_{k} f_{ik} x_{ik} + \sum_{i} \sum_{j} \sum_{k} \sum_{l} c_{ijkl} x_{ik} x_{jl}
\end{align*}

subject to

\begin{align*}
\sum_{k} x_{ik} &= 1 && \text{for all } i, \\
x_{ik} &\in \{0, 1\} && \text{for all } i, k.
\end{align*}

Then, by considering the cost transformation

\begin{align*}
p_{ik} &= M - f_{ik} && \text{for all } i, k, \\
w_{ijkl} &= M - c_{ijkl} && \text{for all } i, j, k, l,
\end{align*}

where $M$ is a sufficiently large constant, the objective function of QSAP can be replaced by

\begin{align*}
\max \ F'(x) := \ & \sum_{i} \sum_{k} p_{ik} x_{ik} + \sum_{i} \sum_{j} \sum_{k} \sum_{l} w_{ijkl} x_{ik} x_{jl}.
\end{align*}

This maximization version of QSAP can be seen as a GSP in which the vertices of cluster $V_i$ (for each $i$) correspond to variables $x_{ik}$ (for all $k$) and in which each cluster must be visited exactly once.

Conversely, a GSP instance can be transformed into a QSAP instance. To see this, note that, if $y_u$ and $y_v$ have been set to one, it is always optimal to set $x_{uv}$ to 1. On the other hand, if either $y_u$ or $y_v$ is set to zero, then $x_{uv}$ must be zero too. This means that $x_{uv} = y_u y_v$ in any optimal solution. Hence, we could formulate the GSP as the problem of maximizing the quadratic objective function $\sum_{v \in V} p_v y_v + \sum_{\{u,v\} \in E} w_{uv} y_u y_v$ subject to (2) and (5). In case clusters have different cardinality, dummy vertices and edges with zero weight can be added and, again using the standard ‘big $M$’ method, we can obtain an equivalent QSAP instance.

This connection with quadratic 0-1 programming proves to be useful in the following way. When $|V_k| = 1$ for all $k$, the GSP reduces to an unconstrained quadratic maximization problem in 0-1 variables. In general these problems are NP-hard, but when, as in our case, all of the quadratic terms have non-negative coefficients, they can be solved as max-flow / min-cut problems. The reverse is also true: any max-flow / min-cut problem can be transformed into a GSP instance with $|V_k| = 1$ for all $k$ (see Picard and Ratliff (1975) and Rhys (1970)). On the other hand, when $|V_k| = 2$ for all $k$, the GSP becomes strongly NP-hard; see Section 4.

Note that the QSAP formulation can be linearized by introducing variables $y_{ijkl} = x_{ik} x_{jl}$ and introducing the extra constraints

\begin{align*}
y_{ijkl} &\ge x_{ik} + x_{jl} - 1 && \text{for all } i, j, k, l, \\
y_{ijkl} &\le x_{ik} && \text{for all } i, j, k, l, \\
y_{ijkl} &\le x_{jl} && \text{for all } i, j, k, l.
\end{align*}

The associated integer polytope is equivalent to the so-called partial constraint satisfaction (PCS) polytope studied by Koster, Van Hoesel, and Kolen (1998). According to Koster et al.
(1998), many frequency assignment problems described in the literature are particular cases of the PCS problem, and therefore also of the GSP. Koster et al. (1998) propose a class of valid inequalities, the so-called cycle inequalities, for the PCS polytope. Expressed in terms of the GSP, these inequalities take the following form:

Proposition 3 (Cycle Inequalities, Koster, Van Hoesel, and Kolen (1998)). Let $c \ge 3$ be an integer and let $V_1, \ldots, V_c$ be clusters. Suppose that each cluster $V_k$ is partitioned into two non-empty subsets $V_k^1$ and $V_k^2$ for all $k \in \{1, \ldots, c\}$. Then the cycle inequality

\begin{align}
\sum_{k=1}^{c-1} x(E(V_k^1 : V_{k+1}^1)) + \sum_{k=1}^{c-1} x(E(V_k^2 : V_{k+1}^2)) + x(E(V_1^1 : V_c^2)) + x(E(V_1^2 : V_c^1)) \le c - 1 \tag{8}
\end{align}

is valid for $P^{xy}(G)$ and $P^x(G)$.

[Figure 4: Even cycle inequality with c = 4.]

See Figure 4 for an illustration of a cycle inequality with $c = 4$. When $c$ is even, these inequalities can be shown to induce facets of $P^{xy}(G)$ and $P^x(G)$ using the proof technique of Koster et al. (1998). However, when $c$ is odd, they are dominated by the odd cycle inequalities (6). Therefore from now on we will call (8) even cycle inequalities and assume that $c$ is even.

Next, we show that the special case of the GSP in which all vertex-weights are zero is intermediate in generality between two other well-known problems: the maximum weight matching problem of Edmonds (1965c) and the stable set problem (see, e.g., Grötschel, Lovász, and Schrijver (1988)). Figure 5 shows that any instance of the matching problem can be converted into a GSP instance by ‘exploding’ each vertex of degree $d$ into a cluster of $d$ vertices.

[Figure 5: Converting a matching instance into a GSP instance. (a) Matching instance. (b) GSP instance.]

To see that the GSP is in turn a special case of the stable set problem, let us say that two edges are in conflict if they cannot both be used simultaneously in a GSP solution.
It is easy to see that two edges $e$ and $f$ are in conflict if and only if there exist $k \in K$ and two distinct vertices $u, v \in V_k$ such that $e \in \delta(u)$ and $f \in \delta(v)$. So, given a GSP instance defined on a graph $G = (V, E)$, let us define an auxiliary graph $G^C = (V^C, E^C)$, called the conflict graph, as follows. There is one vertex in $V^C$ for each edge in $E$. Two vertices in $V^C$ are connected by an edge in $E^C$ if and only if the two corresponding members of $E$ are in conflict. Then, there is a one-to-one correspondence between feasible GSP solutions in $G$ and stable sets in $G^C$. (Figure 6 shows the conflict graph corresponding to the GSP instance displayed in Figure 1. Vertices in bold represent the bold solution in Figure 1.)

[Figure 6: Conflict graph for the GSP instance in Figure 1.]

Several other well-known problems can also be transformed into the GSP in a natural way. Examples include the uncapacitated facility location problem (UFLP), the max-cut problem, the vertex cover problem (VCP) and the maximum clique problem. The transformations are fairly straightforward, so we skip the details in the case of the UFLP and max-cut problem. The transformations from vertex cover and maximum clique are described and used in Sections 4 and 5, respectively.

4 Complexity

In this section we address certain issues related to the complexity of the GSP and some of its special cases. We begin with the bad news that the GSP is a strongly NP-hard problem even under very restrictive assumptions. Indeed, it is not difficult to show, by reduction from the max-cut problem, that the GSP is strongly NP-hard even when $|V_k| = 2$ for all $k$ and all vertex weights are zero. We will show an even stronger hardness result, by using the vertex cover problem (VCP). The VCP is defined as follows. Given an undirected graph $G' = (V', E')$, the VCP looks for a minimum-cardinality subset $V^* \subseteq V'$ such that each edge in $E'$ contains at least one vertex of $V^*$.

Theorem 1.
The VCP reduces to GSP.

Proof. Let $G' = (V', E')$ be a VCP instance. Define another undirected graph $G = (V, E)$ by associating a vertex $v_i$ in $V$ with each $i \in V'$, four vertices $t_e^i$, $t_e^j$, $u_e^i$ and $u_e^j$ in $V$ with each edge $e = \{i, j\} \in E'$, and two edges $\{t_e^i, v_i\}$ and $\{u_e^i, v_i\}$ in $E$ with each vertex $i \in V'$ and each edge $e = \{i, j\} \in E'$. The clusters are defined by $T_e = \{t_e^i, t_e^j\}$ and $U_e = \{u_e^i, u_e^j\}$ for each $e = \{i, j\} \in E'$, and $V_i = \{v_i\}$ for each $i \in V'$. The weights of the vertices $t_e^i$ and $u_e^i$ are zero, the weight of each vertex $v_i$ is $-1$, and the weight of each edge is $+1$. Then the problem of finding a minimum-cardinality subset of vertices in $V'$ coincides with the problem of finding a maximum-weight GS in $G$.

Notice that a slightly different reduction yields a GSP instance with zero vertex weights. Instead of taking the singleton clusters $V_i$ for all $i \in V'$, take clusters of cardinality two by inserting an additional vertex $v_i'$ in each $V_i$. Also add a last cluster with one vertex, $V_0 = \{v_0'\}$, and the additional edges $\{v_i', v_0'\}$ with weight equal to one. This gives a GSP instance with zero vertex weights and a bipartite shrunk graph, with at most one edge between each pair of clusters.

Corollary 1. The GSP is strongly NP-hard even if:
1. the shrunk graph is bipartite,
2. there is at most one edge between each pair of clusters on opposite sides of the bipartition,
3. all clusters on one side of the bipartition are singletons with unit cost (negative profit),
4. all clusters on the other side of the bipartition contain only two vertices, each with zero weight,
5. all edges have unit profit.

Corollary 2. The GSP is strongly NP-hard even if:
1. the shrunk graph is bipartite,
2. there is at most one edge between each pair of clusters on opposite sides of the bipartition,
3. all vertices have zero cost,
4. all edges have unit profit,
5. all clusters contain at most two vertices.
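The construction in the proof of Theorem 1 can be checked on tiny instances with the following Python sketch. The function names, the tuple-based vertex labels and the brute-force solver are illustrative assumptions made here, not part of the paper; the solver enumerates every choice of at most one vertex per cluster, so it is only usable on very small graphs. On the examples tried below, the optimal GS value equals $2|E'|$ minus the size of a minimum vertex cover, which is one way to read the statement that the two problems coincide.

```python
from itertools import product


def vcp_to_gsp(vcp_vertices, vcp_edges):
    """Build the GSP instance of Theorem 1 from a vertex cover instance.

    Vertex labels such as ("v", i) and ("t", e, i) are hypothetical encodings.
    """
    clusters, vertex_weight, edge_weight = [], {}, {}
    for i in vcp_vertices:
        clusters.append([("v", i)])          # singleton cluster V_i
        vertex_weight[("v", i)] = -1
    for i, j in vcp_edges:
        e = (i, j)
        for tag in ("t", "u"):               # clusters T_e and U_e
            clusters.append([(tag, e, i), (tag, e, j)])
            for end in (i, j):
                vertex_weight[(tag, e, end)] = 0
                edge_weight[frozenset({(tag, e, end), ("v", end)})] = 1
    return clusters, vertex_weight, edge_weight


def brute_force_gsp(clusters, vertex_weight, edge_weight):
    """Enumerate all generalized subgraphs (exponential; illustration only)."""
    best = 0  # the empty subgraph has value zero
    for choice in product(*[[None] + cluster for cluster in clusters]):
        chosen = {v for v in choice if v is not None}
        value = sum(vertex_weight[v] for v in chosen)
        # Given the chosen vertices, take every positive edge they span.
        value += sum(w for e, w in edge_weight.items()
                     if w > 0 and e.issubset(chosen))
        best = max(best, value)
    return best


# Triangle: 3 edges, minimum vertex cover of size 2.
triangle = vcp_to_gsp([1, 2, 3], [(1, 2), (2, 3), (1, 3)])
print(brute_force_gsp(*triangle))
```

For the triangle, selecting the cover vertices 1 and 2 lets every cluster $T_e$ and $U_e$ collect one unit edge, giving a value of 6 - 2 = 4.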
Notice that imposing the strongest conditions of the above two corollaries leads to a polynomially solvable GSP instance. Indeed, when all clusters on one side of the bipartition contain one single vertex with zero weight, the GSP is trivial. So the above two results can be seen as best possible.

Given these results, it is natural to look for special cases for which the GSP becomes polynomially solvable. Our approach for doing this is to specify a number of ‘reduction’ or ‘preprocessing’ operations, together with some simple well-solved GSP classes. If by using the preprocessing operations a GSP instance can be reduced to a small (polynomial) number of instances in the well-solved classes, then the original GSP instance is also polynomially solvable. Here are three simple well-solved cases:

Proposition 4. The GSP is solvable in polynomial time when each vertex $v \in V$ has degree at most one in $G$.

Proof. When this condition holds, the GSP instance reduces to solving a maximum-weight matching problem on the shrunk graph where the weight of an edge $\{u, v\}$ is $p_u + w_{uv} + p_v$.

Proposition 5. The GSP is solvable in polynomial time when $|V_k| = 1$ for all $k \in K$.

Proof. As mentioned in Section 3, this can be reduced to a max-flow / min-cut problem.

Proposition 6. The GSP is solvable in polynomial time if the number of clusters $m$ is bounded by some fixed constant $q$.

Proof. The number of candidates for $V^*$ is clearly $O(|V|^m)$, which is polynomial provided that $m$ is bounded by a constant. Once $V^*$ is selected, it is trivial to determine the optimal $E^*$.

Now we present our ‘reduction’ procedures. Our first reduction procedure involves the elimination of edges in $G$.

Procedure 1 (Edge Elimination): For a given pair of clusters $V_k$, $V_l$, suppose that there are at least two edges in $E(V_k : V_l)$ which are not adjacent to any edges in $E \setminus E(V_k : V_l)$.
Then all of these edges, along with their end-vertices, can be removed except the edge $\{u, v\}$ and the vertices $u, v$ such that $p_u + w_{uv} + p_v$ is maximized. Of course, even these can be removed when this value is non-positive.

A second reduction procedure is capable of eliminating a cluster from certain GSP instances. The effect is to remove vertices of degree two in the shrunk graph. (This procedure corresponds to the series reduction procedure described in Malucelli and Pretolani (1995).)

Procedure 2 (Cluster Elimination): Suppose that there exist three clusters $V_i$, $V_j$, $V_k$ such that $\delta(V_i) = E(V_i : V_j \cup V_k)$. (That is, the vertices in $V_i$ are connected only to vertices in $V_j$ and $V_k$.) Then cluster $V_i$ can be eliminated as follows. For all $u \in V_j$ and $v \in V_k$, compute the quantity
\[ d(u, v) := \max_{s \in V_i} \max\{0, \; w_{us} + p_s + w_{sv}\}, \]
where $w_e := 0$ for any $e \notin E$. Remove $V_i$ and all edges in $\delta(V_i)$ from $G$. For all $u \in V_j$ and $v \in V_k$ set $w_{uv} := w_{uv} + d(u, v)$. If $w_{uv} > 0$ and $\{u, v\} \notin E$ then update $E := E \cup \{\{u, v\}\}$.

The third reduction procedure enables us to eliminate a cluster of cardinality one if the vertex-weight is non-negative.

Procedure 3 (Eliminating a Singleton Cluster): Suppose that there exists a cluster $V_l$ with $V_l = \{v\}$ and $p_v \ge 0$. Then, we can eliminate $V_l$ as follows. For each $e = \{u, v\} \in E$, set $p_u := p_u + w_e$ and remove $e$ from $E$. (We can either leave $v$ as an isolated vertex with weight $p_v$ or remove it from $V$ and add $p_v$ to the objective.)

The fourth reduction procedure uses the idea of block decomposition mentioned at the end of Section 1:

Procedure 4 (Eliminating a Block): Suppose that the set $B \subset V^S$ induces a block in the shrunk graph $G^S$ and that it is a ‘leaf’ block, i.e., that there is a unique cut-vertex $i$ in the shrunk graph such that $i \in B$. Let $V_k$ be the cluster corresponding to $i$ and let $V(B)$ denote the set of vertices in $V$ which map to vertices in $B$.
Then, for all $v \in V_k$, let $p(V(B), v)$ be the optimal profit for a smaller GSP defined only on the subgraph of $G$ induced by $(V(B) \setminus V_k) \cup \{v\}$, where all weights and prizes remain the same apart from $p_v$, which is set to zero. Then the vertices in $V(B) \setminus V_k$ can be removed, along with the associated edges, and the vertex-prize for each $v \in V_k$ updated as $p_v := p_v + p(V(B), v)$.

This procedure can be applied iteratively. Indeed, if all blocks have a sufficiently simple structure, we can solve the entire problem in this way. (The tail reduction procedure described in Malucelli and Pretolani (1995) corresponds to a particular case of Procedure 4 when $|B| = 2$.)

These four procedures, coupled with the three basic classes of well-solved GSP instances, establish the polynomial solvability of a variety of GSP instances.

Theorem 2. The GSP is solvable in polynomial time when for each vertex $v \in V$ there exists $k(v) \in K$ such that $\delta(v) \subseteq E(v : V_{k(v)})$.

Proof. After Procedure 1 is applied, the GSP has the structure required in Proposition 4, and can therefore be solved as a matching problem.

Theorem 3. The GSP is solvable in polynomial time when the shrunk graph is a tree.

Proof. Using Procedure 4, this can be reduced to a trivial GSP instance with only two clusters, which is easy by Proposition 6.

More generally,

Theorem 4. The GSP is solvable in polynomial time when the shrunk graph is not contractible to $K_4$, i.e., when the shrunk graph is series-parallel.

Proof. Application of Procedure 2 to a ‘leaf’ block reduces such a block to only two clusters. Then, applying Procedure 4 eliminates the block entirely. Iterative application of this idea leads once again to a trivial GSP instance with only two clusters.

Other still more general results can be obtained by applying all four procedures simultaneously, in conjunction with Propositions 4 to 6.
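To make Theorem 3 concrete, the following Python sketch solves a GSP whose shrunk graph is a tree by dynamic programming from the leaves towards a root cluster. It is written in the spirit of applying Procedure 4 repeatedly to leaf blocks with two clusters, but it is not the authors' algorithm as stated; the function name, the input encoding (`clusters`, `tree_adj`, `frozenset` edge keys) and the use of memoization are all assumptions made for illustration. Cluster labels and vertex labels are assumed to differ from `None`, which the sketch uses to mean "nothing selected".

```python
from functools import lru_cache


def gsp_on_tree(clusters, vertex_weight, edge_weight, tree_adj, root):
    """Optimal GS value when the shrunk graph is a tree (illustrative sketch).

    clusters: dict cluster label -> list of vertices.
    edge_weight: dict frozenset({u, v}) -> weight.
    tree_adj: dict cluster label -> list of neighbouring cluster labels.
    """

    @lru_cache(maxsize=None)
    def best_below(k, parent, v):
        # Best total profit of everything hanging below cluster k, given that
        # vertex v (or nothing, if v is None) is selected in k. The prize of v
        # itself is not included here.
        total = 0
        for child in tree_adj[k]:
            if child == parent:
                continue
            options = [best_below(child, k, None)]  # skip the child cluster
            for u in clusters[child]:
                w = 0
                if v is not None:
                    w = edge_weight.get(frozenset((u, v)), 0)
                # An edge is only worth using when its weight is positive.
                options.append(vertex_weight[u] + max(w, 0)
                               + best_below(child, k, u))
            total += max(options)
        return total

    candidates = [best_below(root, None, None)]
    candidates += [vertex_weight[v] + best_below(root, None, v)
                   for v in clusters[root]]
    return max(candidates)


# Hypothetical instance whose shrunk graph is the path A - B - C.
clusters = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"]}
prizes = {"a1": 0, "a2": 0, "b1": -2, "b2": 0, "c1": 0}
weights = {frozenset(("a1", "b1")): 3, frozenset(("a2", "b2")): 5,
           frozenset(("b1", "c1")): 4, frozenset(("b2", "c1")): 1}
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(gsp_on_tree(clusters, prizes, weights, adjacency, "A"))
```

With memoization, each triple (cluster, parent, selected vertex) is evaluated once, so the running time of the sketch is polynomial in the size of the instance; the choice of root does not affect the optimal value.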
5 Approximability

Given that the GSP is NP-hard in general, it is natural to ask whether it can be approximated to within a constant factor in polynomial time. The following theorem shows that the answer is negative.

Theorem 5. There exists no polynomial time heuristic with any guaranteed performance for the GSP unless P = NP.

Proof. This is by reduction from the maximum clique problem. The question “Does a graph $G$ contain a clique of $m$ vertices?” can be reduced to the question “Does $G'$ contain a generalized subgraph of $\binom{m}{2}$ edges?”, where $G' = (V', E')$ is constructed from $G$ as follows. The vertex set $V'$ is partitioned into $m$ clusters, each one being a copy of $V$. There is an edge in $E'$ whenever the corresponding edge is in $E$ and the two end-vertices are not in the same cluster. Figure 7 illustrates the construction of $G'$ from a graph $G$ with $m = 3$. Further, each edge of $E'$ has a unit weight and each vertex has a profit equal to $-((m - 1)/2 - 1/m)$. The only solutions of this GSP instance having a positive value (equal to one) contain exactly $m$ vertices and $\binom{m}{2}$ edges, i.e., correspond to cliques in $G$. Hence, a polynomial heuristic with guaranteed worst case performance for the GSP would also provide a solution to the maximum clique problem.

However, we do have some positive results concerning approximability. In the remainder of this section we assume that all vertex weights are non-negative, and that isolated clusters have been preprocessed (i.e., the maximum weight vertex in every isolated cluster is selected in a solution). We begin with the following simple proposition, which relies on a recent result of Berman (2000).

Proposition 7. If all vertex-weights are zero, the GSP can be approximated to within a factor of $d$, where $d$ is the maximum degree of a vertex in the shrunk graph $G^S$.

[Figure 7: The maximum clique problem as a decision version of GSP. (a) Original graph G with m = 3. (b) GS instance on G'.]
[Figure 8: GSTP instance such that edge {1, 2} is the center of a 2d − 1 = 7 claw in the conflict graph.]

Proof. A $b$-claw is a graph with $b + 1$ vertices, in which the only edges are those connecting one vertex to all other vertices. It can be checked that the conflict graph of a GSP instance is $2d$-claw-free, i.e., it cannot contain a $2d$-claw as a vertex-induced subgraph (see Figure 8). Yet it was shown by Berman (2000) that the stable set problem in $b$-claw-free graphs is approximable to within a factor of $b/2$.

However, we can achieve a better ratio than this, even for more general vertex weights (but still non-negative), by exploiting the structure of the problem. First we will show that the GSP can be approximated to within a factor of $(d + 1)/2$, i.e., $m/2$ in the worst case. Then, by a slightly more complicated argument, we will show that a ratio of $d/2$ can be achieved.

Recall that $G^S = (V^S, E^S)$ denotes the shrunk graph. Let $F$ denote a set of subgraphs $f$ in $G^S$. For a given $e \in E^S$, let $F^e \subset F$ denote the set of all subgraphs in $F$ which include edge $e$. Consider the following linear program:

\begin{align*}
r^*(F) := \min \ & \sum_{f \in F} r_f
\end{align*}

subject to

\begin{align*}
\sum_{f \in F^e} r_f &\ge 1 && \forall e \in E^S, \\
r_f &\ge 0 && \forall f \in F.
\end{align*}

This LP calls for a fractional covering of the edges of $E^S$ by elements of $F$. We denote by $z^*$ the optimal value of a GSP and by $G^* = (V^*, E^*)$ the corresponding optimal GS. Let $z_f$ denote the optimal value of the modified version of the GSP instance in which all edges have been eliminated apart from those which map onto edges in $f$ in the shrunk graph. Further, let $z^H(F) := \max\{z_f : f \in F \text{ and } r_f^* > 0\}$, where $r_f^*$ denotes the optimal value of variable $r_f$ in the fractional covering problem associated with $F$. The following lemma provides approximability information about the general heuristic scheme $H$ which amounts to solving (easier) instances of the GSP and considering the worst case.

Lemma 1. $r^*(F) \cdot z^H(F) \ge z^*$.

Proof.
Let $h(v)$ be the vertex of the shrunk graph representing the cluster containing vertex $v$, and let $h(e)$ be the edge of the shrunk graph onto which edge $e$ maps. We have:

\begin{align}
z^* = \sum_{e \in E^*} w_e + \sum_{v \in V^*} p_v
&\le \sum_{e \in E^*} w_e \Bigl( \sum_{f : h(e) \in f} r_f^* \Bigr) + \sum_{v \in V^*} p_v \Bigl( \sum_{f : h(v) \in f} r_f^* \Bigr) \notag \\
&= \sum_{f \in F} r_f^* \Bigl( \sum_{e \in E^* : h(e) \in f} w_e + \sum_{v \in V^* : h(v) \in f} p_v \Bigr) \tag{9} \\
&\le \sum_{f \in F} r_f^* z_f \le \sum_{f \in F} r_f^* \cdot z^H(F) = r^*(F) \cdot z^H(F). \notag
\end{align}

The first inequality comes from the fact that $(r_f^*)$ is a fractional cover of the edges and thus also of the vertices of the shrunk graph $G^S$. The term between brackets in (9) represents the value of a feasible solution to the modified GSP corresponding to $f$; it is thus bounded by $z_f$. It follows that the weight of the optimal GSP solution is at most $r^*(F)$ times the weight of the best solution.

Theorem 6. The GSP can be approximated to within a factor of $(d + 1)/2$ in polynomial time.

Proof. Consider the special subgraph family $F$ consisting of all forests in $G^S$ (hence, computing $r^*(F)$ is a special case of the problem of fractionally covering the elements of a matroid by bases; see Edmonds (1965b)). Using the ellipsoid method and column generation, where columns correspond to forests, an optimal basic solution to the fractional covering problem associated with $F$ can be found in polynomial time. (Indeed, the pricing problem is in this case to find a maximum weight forest, which is polynomially solvable.) Moreover, in a basic optimal solution there are $p$ positive variables $r_f^*$ with $p \le |E^S|$. Now, for each positive $r_f^*$, we can compute $z_f$ in polynomial time from Theorem 3. Hence $z^H(F)$ can be determined in polynomial time.

It now remains to show that $r^*(F) \le (d + 1)/2$. For this, we use the characterization of Edmonds (1965b) of fractional matroid covering.
Edmonds’ result, translated into our notation, shows that:

\[ r^*(F) = \max_{W \in \mathcal{W}} \left\{ \frac{|E^S(W)|}{|W| - 1} \right\}, \]

where $\mathcal{W}$ contains all sets $W \subseteq V^S$ which induce connected subgraphs in $G^S$, and $E^S(W)$ is the set of edges in $E^S$ with both end-vertices in $W$. Now, for any $W$ we have:

\[ |E^S(W)| \le \frac{\min\{d, |W| - 1\} \, |W|}{2}, \]

and therefore

\[ \frac{|E^S(W)|}{|W| - 1} \le \frac{\min\{d, |W| - 1\} \, |W|}{2(|W| - 1)} = \frac{\min\{d, |W| - 1\}}{2} + \frac{1}{2} \min\left\{ \frac{d}{|W| - 1}, 1 \right\} \le \frac{d}{2} + \frac{1}{2}. \]

Note that the bound of $(d + 1)/2$ is tight for the polynomial-time method based on covering by forests described in Theorem 6, even when $G$ is a complete $m$-partite graph, all vertex-weights are zero, and all edge profits are 1. The optimum has profit $m(m - 1)/2$, but each of the solutions from a forest has profit only $m - 1$. So the ratio is $m/2$, which is $(d + 1)/2$.

This approximation result relies on the fact that the GSP is solvable in polynomial time when $G^S$ is a forest. From the previous section, the GSP is also solvable in polynomial time when $G^S$ is series-parallel. This suggests that one might be able to achieve a better approximation ratio by seeking a fractional covering of the edges by series-parallel graphs instead of by forests. Unfortunately, it seems unlikely that the resulting LP could be solved efficiently. So far all we have been able to achieve is the following strengthening:

Proposition 8. The GSP can be approximated to within a factor of $d/2$ in polynomial time (assuming $d > 1$).

Proof. The proof goes along the same lines as the previous theorem but instead of using a covering by forests, we use a covering by graphs in which each connected component contains at most one cycle. The associated independence system is again a matroid, and therefore the LP can be solved in polynomial time. Moreover, we can solve the necessary modified GSP instances in polynomial time. Now by Edmonds’ result:

\[ r^*(F) = \max_{W \in \mathcal{W}} \left\{ \frac{|E^S(W)|}{|W|} \right\}, \]

which is easily shown to be bounded above by $d/2$.
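The two density formulas above can be evaluated by brute force on small shrunk graphs, which gives a quick sanity check of the bounds $(d+1)/2$ and $d/2$. The Python sketch below is an illustration only: the function name `covering_bound` and its flag are hypothetical, and the enumeration is exponential in the number of clusters. It maximizes over all vertex subsets of size at least two rather than only connected ones; this should not change the maximum, since a disconnected subset is never denser than its densest component.

```python
from fractions import Fraction
from itertools import combinations


def covering_bound(vertices, edges, one_cycle_components=False):
    """Brute-force value of max |E(W)| / (|W| - 1), or of max |E(W)| / |W|
    when one_cycle_components is True (the variant used for Proposition 8)."""
    vertices = list(vertices)
    best = Fraction(0)
    for size in range(2, len(vertices) + 1):
        for subset in combinations(vertices, size):
            w = set(subset)
            # Count edges with both end-vertices inside W.
            inside = sum(1 for u, v in edges if u in w and v in w)
            denominator = size if one_cycle_components else size - 1
            best = max(best, Fraction(inside, denominator))
    return best


# Complete shrunk graph on m = 4 clusters, so d = 3.
k4_vertices = range(4)
k4_edges = list(combinations(k4_vertices, 2))
print(covering_bound(k4_vertices, k4_edges))
print(covering_bound(k4_vertices, k4_edges, one_cycle_components=True))
```

For the complete graph on four vertices the first value is 6/3 = 2 = (d + 1)/2 and the second is 6/4 = 3/2 = d/2, matching the tightness remark for complete $m$-partite instances.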
Finally, we give a rather different approximation result which relies on the maximum size of a cluster rather than properties of the shrunk graph. We first need another technical lemma.

Lemma 2. Let $K_m$ be the complete graph with $m$ vertices and $F^B$ be the family of bipartite subgraphs of $K_m$. There exists a solution to the fractional covering problem associated with $F^B$ in which at most $2^{l-1}$ variables are positive and equal to $2^{2-l}$, where $l := \lceil \log_2 m \rceil$. Furthermore, these bipartite graphs can be determined in polynomial time.

Proof. We first describe the procedure to obtain the bipartite graphs used in the fractional cover. Assume the vertices of $K_m$ are labelled from 1 to $m$ and let $M := \{1, \ldots, m\}$. For $i = 1, \ldots, m$ and $j = 1, \ldots, l$, let $b(i, j)$ denote the value of the $j$-th bit in a binary encoding of $i$. The right-most bit comes first. Now, for any subset $S \subseteq \{1, \ldots, l\}$, denote $T(S) := \{i \in M : \sum_{j \in S} b(i, j) \text{ is odd}\}$. We claim that the following bipartite graphs give us the required covering: for each set $S$ such that $|S|$ is odd, construct a bipartite graph $f_S$ in which $T(S)$ constitutes one side of the bipartition. Clearly, there are $2^{l-1}$ odd cardinality subsets $S$ and thus $2^{l-1}$ such bipartite graphs $f_S$. These bipartite graphs can be determined in polynomial time since, for $m = 2^l$, the number of odd subsets $S$ is $m/2$, which is polynomial in $m$.

It remains to prove that each edge of $K_m$ belongs to $2^{l-2}$ bipartite graphs $f_S$. Let $\{i_1, i_2\}$ be an edge of $K_m$ and $D := \{j : b(i_1, j) \ne b(i_2, j)\}$. Then, $\{i_1, i_2\}$ belongs to all bipartite graphs $f_S$ such that $|S \cap D|$ is odd and $|S \setminus D|$ is even. Indeed, assume w.l.o.g. that $i_1 \in T(S)$. We know that

\[ \sum_{j \in S} b(i_1, j) = \sum_{j \in S \cap D} b(i_1, j) + \sum_{j \in S \setminus D} b(i_1, j) = |S \cap D| - \sum_{j \in S \cap D} b(i_2, j) + \sum_{j \in S \setminus D} b(i_2, j) \]

is odd and $|S \cap D|$ is odd. Hence, $\sum_{j \in S} b(i_2, j)$ must be even, i.e., $i_2 \notin T(S)$ and $\{i_1, i_2\}$ is an edge of $f_S$.
Finally, the number of subsets $S$ such that $f_S$ contains a given edge is $2^{|D|-1} \cdot 2^{l-|D|-1} = 2^{l-2}$. Hence the solution

\[ r_f = \begin{cases} 2^{2-l} & \text{if } f \text{ is a bipartite graph } f_S \text{ with } |S| \text{ odd}, \\ 0 & \text{otherwise} \end{cases} \]

is a feasible fractional covering with value 2.

Theorem 7. Let $q$ be the largest cluster cardinality, i.e., $q := \max_{k \in K} |V_k|$. Then the GSP can be approximated to within a factor of $2q$ in polynomial time.

Proof. Using Lemmas 1 and 2, we have that $2 \cdot z^H(\mathcal{F}^B) \ge z^*$. The number of bipartite graphs which need to be considered to determine $z^H(\mathcal{F}^B)$ is $2^{l-1} \le m$, but each such GSP instance is unlikely to be solvable in polynomial time. However, a GSP instance whose shrunk graph is bipartite can be approximated to within a factor of $q$ in polynomial time as follows. Consider the clusters corresponding to one side of the bipartition of $V^S$. We construct instances in which each of these clusters is reduced to a singleton and such that each vertex of these clusters appears in exactly one of these instances. Further, the edge set of each instance consists of the edges having both end-vertices in the vertex set of that instance. The maximum number of such instances is $q$ and, since all vertex weights are nonnegative, Procedure 3 enables us to solve them in polynomial time. Denote by $z_B^H$ the largest optimal value of those instances. Now, using arguments similar to those in the proof of Lemma 1, we can conclude that a bipartite GSP can be approximated within a factor $q$ in polynomial time. Hence $q \cdot 2 \cdot z_B^H \ge 2 \cdot z^H(\mathcal{F}^B) \ge z^*$, which completes the proof.

6 Polyhedral Study

In this section we explore further the polytopes $P^x(G)$ and $P^{xy}(G)$. In order to do this it will be useful to define $P^x(G)$ explicitly, rather than implicitly (as the projection into $x$-space of $P^{xy}(G)$). This is easily done, using the idea of conflicts between pairs of edges mentioned in Section 3.

[Figure 9: Support graph of a star inequality defined by (12).]
Indeed, when all vertex-weights are zero, the GSP can be formulated as:

\[ \max \sum_{e \in E} w_e x_e \]

subject to

\[ x_e + x_f \le 1 \quad \text{for all conflicting pairs } e, f \in E, \tag{10} \]
\[ x_e \in \{0, 1\} \quad \text{for all } e \in E. \tag{11} \]

Then, $P^x(G)$ is the convex hull of feasible solutions to (10)–(11). Moreover, as mentioned in Section 3, the polytopes in $x$-space are intermediate in generality between matching polytopes and stable set polytopes.

6.1 The basic formulation and its projection

The inequalities (2) and (3), together with the non-negativity inequalities on the $x$ variables, define the feasible region of the basic LP relaxation of the GSP. In this section, we examine properties of this polytope. We begin by considering its projection onto $x$-space.

Proposition 9. The projection of the polytope $\{(x, y) \in [0, 1]^{|E|+|V|} : \text{(2), (3) hold}\}$ onto the space of the $x$ variables is defined by the non-negativity inequalities $x_e \ge 0$ for all $e \in E$, together with the inequalities

\[ \sum_{v \in V_l} x(E(v : V_{k(v)})) \le 1, \tag{12} \]

for all $l \in K$ and, for each $v \in V_l$, for any $k(v) \in K \setminus \{l\}$.

Proof. It suffices to apply Fourier-Motzkin elimination (Martin (1999)) to the GSP formulation given by (1)–(5).

We call the inequalities (12) star inequalities, because their support graph (when shrunk) is a star; see Figure 9. A particular and interesting case occurs when $k(i) = k$ for all $i \in V_j$, when the inequality reduces to $x(E(V_j : V_k)) \le 1$. Note that the star inequalities are much stronger than the inequalities (10). Indeed, we have the following proposition.

Proposition 10. An inequality of the form $x(F) \le 1$ induces a facet of $P^x(G)$ if and only if it is either a star inequality of the form (12) or a 3-cycle inequality (i.e., an odd cycle inequality of the form (6) with $c = 3$).

Proof. Recall that $P^x(G)$ is a special kind of stable set polytope. Padberg (1973) showed that an inequality of the form $x(F) \le 1$ is facet-inducing for the stable set polytope if and only if the vertices in the set $F$ form a maximal clique in the graph.
The result follows from simple enumeration of possible maximal cliques in the conflict graph $G^C$.

Using the ellipsoid method, one can optimize over either of the polytopes mentioned in Proposition 9 in polynomial time. This leads us to ask under what conditions these formulations give complete descriptions of $P^{xy}(G)$ and $P^x(G)$. In order to answer this question, we will need the following theorem, which is of independent interest.

Theorem 8. Let $G = (V, E)$ be the graph associated with a GSP instance. If the shrunk graph $G^S$ is not biconnected, then a complete description of $P^{xy}(G)$ is given by a complete description of the polytopes associated with each block.

Proof. By induction, it suffices to prove the following: let $G^1 = (V^1, E^1)$ be the subgraph of $G$ associated with a leaf block in $G^S$, let $V_k \subset V^1$ be the cluster corresponding to a cut node in $G^S$, and let $G^2 = (V^2, E^2)$ be the subgraph of $G$ induced by $V \setminus (V^1 \setminus V_k)$. Then a complete description of $P^{xy}(G)$ is given by a complete description of $P^{xy}(G^1)$ and $P^{xy}(G^2)$.

The proof relies on Procedure 4 (eliminating a block). Let $x^1$ (respectively $x^2$) be the subvector corresponding to the edges of $E^1$ (respectively $E^2$) and $y^1$ (respectively $y^2$) the subvector for vertices in $V^1 \setminus V_k$ (respectively $V^2 \setminus V_k$). Subvector $y^k$ corresponds to the vertices of $V_k$. Finally, the associated objective function coefficient subvectors are denoted by $w^1$, $w^2$, $p^1$, $p^2$ and $p^k$. Let $(x^{*1}, x^{*2}, y^{*1}, y^{*2}, y^{*k})$ be an optimal solution to

\[ \max\; w^1 x^1 + w^2 x^2 + p^1 y^1 + p^2 y^2 + p^k y^k \tag{13} \]
\[ (x^1, y^1, y^k) \in P^{xy}(G^1) \tag{14} \]
\[ (x^2, y^2, y^k) \in P^{xy}(G^2) \tag{15} \]

From (14) it follows that the subvector $(x^{*1}, y^{*1}, y^{*k})$ is a convex combination of, say $t$, vertices $(x^{i1}, y^{i1}, y^{ik})$ of $P^{xy}(G^1)$, i.e.:

\[ (x^{*1}, y^{*1}, y^{*k}) = \sum_{i=1}^{t} \lambda_i (x^{i1}, y^{i1}, y^{ik}), \]

where $\lambda_i \ge 0$ and $\sum_{i=1}^{t} \lambda_i = 1$. All those vertices have integer coordinates and either $y^i(V_k) = 1$ or $y^i(V_k) = 0$.
Hence,

\[ \sum \{\lambda_i : y^i_v = 1\} = y^*_v \tag{16} \]

and

\[ \sum \{\lambda_i : y^i(V_k) = 0\} = 1 - y^*(V_k). \tag{17} \]

Further, each vertex $(x^{i1}, y^{i1}, y^{ik})$ must be an optimal solution to

\[ \max\; w^1 x^1 + p^1 y^1 \tag{18} \]
\[ (x^1, y^1, y^k) \in P^{xy}(G^1) \tag{19} \]
\[ y^k = y^{ik} \tag{20} \]

because otherwise, replacing $(x^{i1}, y^{i1}, y^{ik})$ by the optimal solution to (18)–(20) in the convex combination would yield a solution to (13)–(15) with a better objective function value than $(x^{*1}, x^{*2}, y^{*1}, y^{*2}, y^{*k})$. Recall from Procedure 4 in Section 4 that $p(V^1, v)$ denotes the optimal value of (18)–(20) when $y^i_v = 1$. Let $p(V^1 \setminus V_k)$ denote the optimal value of (18)–(20) when $y^i(V_k) = 0$. We now have that

\[ \begin{aligned} w^1 x^{*1} + p^1 y^{*1} &= w^1 \Big( \sum_{i=1}^{t} \lambda_i x^{i1} \Big) + p^1 \Big( \sum_{i=1}^{t} \lambda_i y^{i1} \Big) \\ &= \sum_{v \in V_k} \sum_{i : y^i_v = 1} \lambda_i (w^1 x^{i1} + p^1 y^{i1}) + \sum_{i : y^i(V_k) = 0} \lambda_i (w^1 x^{i1} + p^1 y^{i1}) \\ &= \sum_{v \in V_k} \sum_{i : y^i_v = 1} \lambda_i\, p(V^1, v) + \sum_{i : y^i(V_k) = 0} \lambda_i\, p(V^1 \setminus V_k) \\ &= \sum_{v \in V_k} y^*_v\, p(V^1, v) + \Big( 1 - \sum_{v \in V_k} y^*_v \Big) p(V^1 \setminus V_k), \end{aligned} \]

using (16) and (17). In consequence, problem (13)–(15) can be rewritten as:

\[ \max\; w^2 x^2 + p^2 y^2 + \sum_{v \in V_k} \big( p_v + p(V^1, v) - p(V^1 \setminus V_k) \big) y_v + p(V^1 \setminus V_k) \]
\[ (x^2, y^2, y^k) \in P^{xy}(G^2). \]

This theorem implies that the shrunk support graph of an inequality inducing a facet of $P^{xy}(G)$ must be biconnected. It also immediately yields the following:

Corollary 3. The polytope $P^{xy}(G)$ is completely described by the non-negativity inequalities, constraints (2) and (3) if the shrunk graph is a forest.

Proof. When the shrunk graph is a forest, each block of the shrunk graph is merely an edge. The result is trivially true when the shrunk graph is an edge, so the result follows from Theorem 8.

Note that the condition in Corollary 3 is sufficient but, for general graphs $G$, it is not necessary. (It is easy to find counter-examples.) On the other hand, for what we call full GSP instances, we can obtain a necessary and sufficient condition.

Definition 1.
A GSP instance (and its associated graph $G$) is said to be full if the following holds for each pair $V_k$, $V_l$ of clusters: either every vertex in $V_k$ is connected to every vertex in $V_l$, or $E(V_k : V_l)$ is empty.

Theorem 9. Let $G$ be full. The inequalities in the IP formulation of the GSP completely describe $P^{xy}(G)$ if and only if all cycles of the shrunk graph involve only singleton clusters (equivalently, each block of the shrunk graph is either an edge or is formed entirely by singleton clusters).

Proof. When the shrunk graph is an edge, the result is trivially true. When all clusters are singletons, the inequalities in the formulation give a complete description because the corresponding matrix is totally unimodular (as mentioned in Section 3, the GSP is then a min-cut / max-flow problem). Sufficiency then follows from Theorem 8. On the other hand, if there is a cycle in the shrunk graph which contains at least one non-singleton cluster, then there is a facet-inducing odd ring inequality. (These inequalities will be introduced in Subsection 6.3.)

In the case of $P^x(G)$, a result analogous to Theorem 8 does not hold; the instance on the left of Figure 13 in Subsection 6.3 gives a counter-example, since the support graph of this inequality contains edges in different blocks. Nevertheless, we have the following two results.

Theorem 10. The polytope $P^x(G)$ is completely described by the non-negativity and star inequalities if the shrunk graph is a forest.

Proof. Follows from Corollary 3 and Proposition 9.

Theorem 11. Let $G$ be full. The star and non-negativity inequalities completely describe $P^x(G)$ if and only if there is no cycle in the shrunk graph containing only non-singleton clusters (equivalently, the non-singleton clusters induce a forest in the shrunk graph).

Proof. Given a full graph $G$ meeting this condition, perform the following operation on each singleton cluster which has degree $d > 1$ in the shrunk graph.
Suppose without loss of generality that the cluster, say $\{v\}$, is connected in $G$ to clusters $V_1, \dots, V_d$. Replace the cluster $\{v\}$ by $d$ new singleton clusters, say $\{v_i\}$ for $i = 1, \dots, d$. Then, replace each edge $\{u, v\}$, where $u \in V_i$, with the edge $\{u, v_i\}$. It can be easily seen that the polytope $P^x(G)$ is unchanged by this transformation (apart from a re-labelling of edges). The transformed graph meets the condition of Theorem 10, and therefore $P^x(G)$ is completely described by the star and non-negativity inequalities.

On the other hand, suppose that there is a cycle of non-singleton clusters in $G$. If this cycle has odd cardinality, then there is a facet-inducing odd cycle inequality of the form (6). If this cycle has even cardinality, then there is a facet-inducing even cycle inequality of the form (8).

Notice that the condition for $P^x(G)$ in Theorem 11 is weaker than the condition for $P^{xy}(G)$ in Theorem 9.

A related question is whether the upper bounds obtained by optimizing over the polyhedra mentioned in Proposition 9 are of good quality in general. Some experiments on small GSP instances have led us to believe that, in the case of $P^x(G)$, this issue is connected to the quantity $r^*(\mathcal{F})$ defined in the previous section, where $\mathcal{F}$ denotes the set of all forests in $G^S$:

Conjecture 1. In the case of no vertex weights, the upper bound obtained using the star and non-negativity inequalities is never greater than $r^*(\mathcal{F})$ times the optimum, and, provided that the number of vertices in each cluster is 'sufficiently large', this ratio is achievable.

In the case of general vertex weights, we have examples where the upper bound using (2) and (3) is positive, yet the optimum is zero. Therefore no similar result holds for $P^{xy}(G)$.

6.2 Some lifting theorems

In this subsection, we present some lifting results which enable new valid (and facet-inducing) inequalities to be derived from known ones.
When presenting these results we assume for simplicity that $G$ is a complete $m$-partite graph. That is, if clusters $V_i$ and $V_j$ are distinct, then every vertex in $V_i$ is connected to every vertex in $V_j$.

In order to motivate the first lifting result, we will need the following lemma, which is also of independent interest:

Lemma 3. All facet-inducing inequalities for $P^{xy}(G)$ can be written in the form $\alpha x - \beta y \le \gamma$, with $\alpha, \beta, \gamma \ge 0$, apart from the non-negativity inequalities $x_e \ge 0$ and the inequalities $\sum_{i \in V_k} y_i \le 1$.

Proof. Clearly, we can always assume that the inequality is in $\le$ form. Moreover, the origin is feasible, which implies that $\gamma \ge 0$. Now let $e \in E$ be an arbitrary edge. If we have an integer solution which lies on the facet, and $x_e = 1$, we can obtain another feasible integer solution by setting $x_e = 0$. Therefore $\alpha_e \ge 0$. On the other hand, if every integer solution on the facet satisfies $x_e = 0$, the facet must be induced by the non-negativity inequality $x_e \ge 0$ (given that $P^{xy}(G)$ is full-dimensional). Similarly, if we have an integer solution which lies on the facet, and there is a cluster $k$ such that $y_i = 0$ for all $i \in V_k$, we can obtain other feasible integer solutions by setting $y_i = 1$ for each $i \in V_k$ in turn. Therefore $\beta_i \ge 0$ for all $i \in V_k$. On the other hand, if every integer solution on the facet satisfies $\sum_{i \in V_k} y_i = 1$, the facet must be induced by the inequality $\sum_{i \in V_k} y_i \le 1$.

Proposition 11 (Trivial Lifting for $P^{xy}(G)$). Suppose that an inequality $\alpha x - \beta y \le \gamma$, with $\alpha, \beta, \gamma$ non-negative, is valid for $P^{xy}(G)$. Let $G'$ be a graph obtained from $G$ by adding another cluster $V_{k+1}$, consisting of a single vertex $w$ (together with the $|V|$ extra edges required to make $G'$ complete $(m + 1)$-partite). Then the inequality $\alpha' x - \beta' y \le \gamma$ is valid for $P^{xy}(G')$, where $\alpha'$ is defined by $\alpha'_e := \alpha_e$ (if $e \in E$), $\alpha'_e := 0$ (otherwise), and $\beta'$ is defined by $\beta'_v := \beta_v$ (if $v \in V$) and $\beta'_w := 0$. Moreover, the new inequality induces a facet of $P^{xy}(G')$ if the original induced a facet of $P^{xy}(G)$.

Proof. Validity follows from the fact that a subgraph of a generalized subgraph is itself a generalized subgraph. Now suppose that $\alpha x - \beta y \le \gamma$ induces a facet of $P^{xy}(G)$. Then (given that $P^{xy}(G)$ is full-dimensional) there are $|E| + |V|$ affinely independent generalized subgraphs in $G$ whose incidence vectors $(x^*, y^*)$ lie on the facet. From these it is trivial to construct $|E| + |V|$ affinely independent generalized subgraphs in $G'$ whose incidence vectors $(x^*, y^*)$ satisfy $\alpha' x^* - \beta' y^* = \gamma$, $y^*_w = 0$ and $x^*_e = 0$ for all $e \notin E$. Note also that, for every $v \in V$, there is at least one GS among these $|E| + |V|$ GSs which contains vertex $v$. (Otherwise, all points of $P^{xy}(G)$ in the facet would also satisfy $x(\delta(v)) = 0$, which is impossible because $P^{xy}(G)$ is full-dimensional.)

To prove the proposition, we need $|V| + 1$ more such affinely independent generalized subgraphs in $G'$. One of these is found by taking any of the previous solutions and setting $y^*_w = 1$. The remaining $|V|$ are constructed as follows: for each $v \in V$, choose one of the $|E| + |V|$ GSs which contains vertex $v$, and add vertex $w$ and edge $\{v, w\}$ to the GS.

Proposition 12 (Trivial Lifting for $P^x(G)$). Suppose that an inequality $\alpha x \le \beta$, with $\alpha$ and $\beta$ non-negative, is valid for $P^x(G)$. Let $G'$ and $\alpha'$ be defined as in Proposition 11. Then the inequality $\alpha' x \le \beta$ is valid for $P^x(G')$, and it induces a facet of $P^x(G')$ if the original induced a facet of $P^x(G)$.

Proof. Similar to the proof of Proposition 11.

Proposition 13 (Vertex Cloning for $P^{xy}(G)$). Suppose that an inequality $\alpha x - \beta y \le \gamma$, with $\alpha, \beta, \gamma$ non-negative, is valid for $P^{xy}(G)$. Let $V_k$ be a specified cluster and let $w \in V_k$ be a specified vertex. Let $G'$ be the graph obtained from $G$ by 'cloning' vertex $w$. That is, a new vertex $w'$ is added to $V_k$ and $|V \setminus V_k|$ extra edges are added to make $G'$ complete $m$-partite.
Then the inequality $\alpha' x - \beta' y \le \gamma$ is valid for $P^{xy}(G')$, where $\alpha'$ is defined by $\alpha'_e := \alpha_e$ (if $e \in E$), $\alpha'_e := \alpha_{uw}$ (if $e = \{u, w'\}$ for some $u \in V \setminus V_k$), and $\beta'$ is defined by $\beta'_v := \beta_v$ (if $v \in V$) and $\beta'_{w'} := \beta_w$. Moreover, the new inequality induces a facet of $P^{xy}(G')$ if the original induced a facet of $P^{xy}(G)$.

Proof. Validity follows from the fact that vertices $w$ and $w'$ cannot both appear in a GS, and by symmetry. The preservation of the property of being facet-inducing follows from a similar argument to that given in Proposition 11.

Proposition 14 (Vertex Cloning for $P^x(G)$). Suppose that an inequality $\alpha x \le \beta$, with $\alpha, \beta$ non-negative, is valid for $P^x(G)$. Let $G'$ and $\alpha'$ be defined as in Proposition 13. Then the inequality $\alpha' x \le \beta$ is valid for $P^x(G')$, and it induces a facet of $P^x(G')$ if the original induced a facet of $P^x(G)$.

Proof. Similar to the proof of Proposition 13.

Theorem 12 (Cluster addition for $P^{xy}(G)$). Suppose that an inequality $\alpha x - \beta y \le \gamma$, with $\alpha, \beta, \gamma$ non-negative, is valid (facet-inducing) for $P^{xy}(G)$ and not of type $x_{uv} \le y_u$. Let $G'$ be the graph obtained from $G = (V, E)$ by adding another cluster $V_{k+1} = \{w, q\}$ together with edges in $E(V_{k+1} : V)$ to make $G'$ complete $(m + 1)$-partite. Let $\{u, v\}$ be an edge of $E$ such that $\alpha_{uv} \ge 0$. Then $\alpha' x - \beta' y \le \gamma'$ is valid (facet-inducing) for $P^{xy}(G')$ if $\alpha'_e = \alpha_e$ for all $e \in E \setminus \{\{u, v\}\}$, $\alpha'_{uw} = \alpha'_{vw} = \alpha_{uv}$, $\alpha'_{qi} = 0$ for all $i \in V$, $\alpha'_{uv} = 0$ and $\beta'_w = \alpha_{uv}$, $\beta'_q = 0$, $\beta'_i = \beta_i$ for all $i \in V$.

Proof. The validity follows from the fact that if $\bar{\alpha}\bar{x} - \bar{\beta}\bar{y} + \alpha_{uv} x_{uv} \le \gamma$ is valid for $P^{xy}(G)$ then $\bar{\alpha}\bar{x} - \bar{\beta}\bar{y} + \alpha_{uv}(x_{uw} + x_{vw} - y_w) \le \gamma$ is valid for $P^{xy}(G')$, where $(\bar{\alpha}, \bar{\beta})$ and $(\bar{x}, \bar{y})$ are associated with $\bar{G} = (\bar{V}, \bar{E})$ for $\bar{V} = V$ and $\bar{E} = E \setminus \{\{u, v\}\}$. Indeed, let $(x^*, y^*)$ be the vector associated with a feasible GS solution of $G'$ with $x^*_{uw} = x^*_{vw} = 1$ (then $y^*_w = 1$). We have $\bar{\alpha}\bar{x}^* - \bar{\beta}\bar{y}^* + \alpha_{uv} \le \gamma$ since $(\bar{x}^*, \bar{y}^*, 1)$, where the last component denotes $x_{uv}$, is a feasible GS solution of $G$.
The proof that the inequality is facet-inducing is done in three steps. First, we prove the result when $G' = (V', E')$ is such that $V' = V \cup \{w\}$ and $E' = E \cup \{\{u, w\}, \{v, w\}\}$. Then we extend the result to the case where $\delta(w)$ is complete by lifting optimally on the remaining variables $x_{zw}$, $z \in V \setminus \{u, v\}$. The third step is obtained for $G' = (V', E')$ where $V' = V \cup \{w, q\}$ and $E' = E \cup \delta(w) \cup \delta(q)$. For $v \in V$, let $h(v)$ denote the index of the cluster containing $v$.

Step 1. Let $S$ be a set of $\dim(P^{xy}(G))$ affinely independent vectors $(x, y)$ of $P^{xy}(G)$ such that $\alpha x - \beta y = \gamma$. Let $G' = (V', E')$ be such that $V' = V \cup \{w\}$ and $E' = E \cup \{\{w, u\}, \{w, v\}\}$. We associate a vector $(x', y')$ of $P^{xy}(G')$ such that $\alpha' x - \beta' y = \gamma'$ with each vector $(x, y)$ from $S$. This extension is made as follows: $x'_f := x_f$ for $f \in E \setminus \{\{u, v\}\}$, $x'_{uv} := 0$, $x'_{uw} = x'_{vw} = y'_w := x_{uv}$ and $y'_t := y_t$ for $t \in V$. In other words, if edge $\{u, v\}$ is present in the solution corresponding to $(x, y)$ in $G$, then in the extension edge $\{u, v\}$ is erased and replaced by edges $\{u, w\}$ and $\{v, w\}$ in $G'$. If edge $\{u, v\}$ is not present in the vector in $G$ then nothing is added in the extension in $G'$ (see Figure 10).

[Figure 10: First extension from $G$ to $G'$. Panels: (a) if $\{u, v\} \notin E$; (b) if $\{u, v\} \in E$.]

[Figure 11: Second extension from $G$ to $G'$. Panels: (a) vector $(x^a, y^a)$; (b) vector $(x^b, y^b)$; (c) vector $(x^c, y^c)$.]

Since adding vertex $w$ and the two edges $\{u, w\}$ and $\{v, w\}$ to $G$ introduces three additional variables, we need three more vectors. In order to obtain them, we select three special vectors in $S$.

1. The first vector $(x^a, y^a)$ is such that $y^a_u = 1$ and $x^a_{uv} = 0$. Such a vector exists in $S$ since $\alpha x - \beta y \le \gamma$ is facet-inducing and $S$ cannot be contained in another hyperplane (in this case, the hyperplane would be $x_{uv} = y_u$). This vector is extended to $G'$ by setting $x^{\prime\prime a}_{uw} := 1$ and $y^{\prime\prime a}_w := 1$ (see Figure 11 (a)).

2. The second vector $(x^b, y^b)$ is such that $y^b_v = 1$. It is possible since $S$ cannot be contained in another hyperplane (in this case, the hyperplane would be $x_{uv} = y_v$). This vector is extended to $G'$ by setting $x^{\prime\prime b}_{vw} := 1$ and $y^{\prime\prime b}_w := 1$ (see Figure 11 (b)).

3. The third vector $(x^c, y^c)$ is such that $x^c_{uv} = 1$. It is possible since $S$ cannot be contained in $x_{uv} = 0$. This vector is extended to $G'$ by setting $x^{\prime\prime c}_{uw} = x^{\prime\prime c}_{vw} = y^{\prime\prime c}_w := 1$ (see Figure 11 (c)).

These $\dim(P^{xy}(G'))$ vectors are affinely independent, they correspond to feasible GS solutions and they satisfy $\alpha' x - \beta' y = \gamma'$. Indeed, let a generic inequality be $\bar{\lambda}\bar{x} - \bar{\mu}\bar{y} + \lambda_{uw} x_{uw} + \lambda_{vw} x_{vw} - \mu_w y_w \le \nu$ (where the notations are the same as in the validity part). Using the first $\dim(P^{xy}(G))$ vectors defined above, we have $\bar{\lambda} = C\alpha$, $\bar{\mu} = C\beta$, $\nu = C\gamma$ and $\lambda_{uw} + \lambda_{vw} - \mu_w = C\alpha_{uv}$. Using the last three vectors $(x^a, y^a)$, $(x^b, y^b)$ and $(x^c, y^c)$ that have been extended in two different ways to $G'$ we obtain $\lambda_{uw} = \mu_w$, $\lambda_{vw} = \mu_w$ and $\lambda_{uv} = 0$.

Step 2. The second step of the proof is reached by considering the lifting of the facet-inducing inequality for $P^{xy}(G')$ where $V' = V \cup \{w\}$ and $E' = E \cup \{\{u, w\}, \{v, w\}\}$. This lifting is done optimally with respect to variable $x_{zw}$ for all $z \in V \setminus \{u, v\}$. We show that the optimal lifting coefficient for each $x_{zw}$ is zero for $z \in V \setminus \{u, v\}$.

[Figure 12: Third extension from $G$ to $G'$. Panels: (a) if $\{u, v\} \notin E$; (b) if $\{u, v\} \in E$; (c) if $z \in V_{h(u)}$.]

1. If $z \notin V_{h(u)}$, there exists a vector $(x, y)$ in $P^{xy}(G)$ such that $\alpha x - \beta y = \gamma$ with $x_{uz} = 1$. This vector can be extended to $(x', y')$ in $P^{xy}(G')$, where $V' := V \cup \{w\}$ and $E' := E' \cup \{\{z, w\}\}$ with $E'$ initialized to $E \cup \{\{u, w\}, \{v, w\}\}$, by setting $x'_e := x_e$ for $e \in E$, $y'_i := y_i$ for $i \in V$ and $x'_{zw} = x'_{uw} = y'_w := 1$, $x', y' := 0$ otherwise. In case $x_{uv} = 1$ then $x'_{vw}$ is set to 1. These extensions satisfy $\alpha' x - \beta' y = \gamma'$ (see Figure 12 (a) and (b)).

2. If $z \in V_{h(u)}$, there exists a vector $(x, y)$ in $P^{xy}(G)$ such that $\alpha x - \beta y = \gamma$ with $x_{vz} = 1$. Extending this vector to $(x', y')$ in $P^{xy}(G')$, where $V' := V \cup \{w\}$ and $E' := E' \cup \{\{z, w\}\}$ with $E'$ initialized to $E \cup \{\{u, w\}, \{v, w\}\}$, is done by setting $x'_e := x_e$ for $e \in E$, $y'_i := y_i$ for $i \in V$ and $x'_{zw} = x'_{vw} = y'_w := 1$, $x', y' := 0$ otherwise (see Figure 12 (c)).

Step 3. In the third step it remains to show that $\alpha' x - \beta' y \le \gamma'$ is facet-inducing for $G'$ when $V' = V \cup \{w, q\}$, $q \in V_{h(w)}$, $E' = E \cup \delta(w) \cup \delta(q)$. To prove this, we show that the lifting coefficients associated with $x_e$, $e \in \delta(q)$, and with $y_q$ are equal to 0.

1. Indeed, there exists a solution $(x, y)$ in $P^{xy}(G)$ lying on the face defined by $\alpha x - \beta y \le \gamma$ with $x_{uv} = 0$. This vector can be extended to $(x', y')$ in $P^{xy}(G')$, where $V' := V \cup \{q, w\}$ and $E' := E \cup \delta(w)$, by setting $x'_e := x_e$ for $e \in E$, $y'_i := y_i$ for $i \in V$ and $y'_q := 1$, $x', y' := 0$ otherwise. This proves that $\beta'_q = 0$.

2. For each $z \in V$, there exists a solution $(x, y)$ in $P^{xy}(G)$ such that $\alpha x - \beta y = \gamma$ with $x_{zl} = 1$, $l \in V_{h(v)} \setminus \{v\}$ if $z \notin V_{h(v)}$, $l \in V_{h(u)} \setminus \{u\}$ otherwise. This vector can be extended to $(x', y')$ in $P^{xy}(G')$, where $V' := V \cup \{q, w\}$ and $E' := E' \cup \{\{z, q\}\}$ with $E'$ initialized to $E \cup \delta(w)$, by setting $x'_e := x_e$ for $e \in E$, $y'_i := y_i$ for $i \in V$ and $x'_{zq} = y'_q := 1$, $x', y' := 0$ otherwise. This shows that $\alpha'_{zq} = 0$ for all $z \in V$.

Notice from the proof of Theorem 12 that the result also holds in case $V_{k+1} = \{w\}$ is added instead of $V_{k+1} = \{w, q\}$.

These simple lifting results enable us to simplify what follows, since, when proving inequalities to be valid or facet-inducing, we only have to consider inequalities which do not arise from simpler inequalities by lifting. These simplest inequalities will be called primitive.

[Figure 13: Two configurations of edges leading to 5-holes in $G^C$.]
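The coefficient update in the cluster-addition result of Theorem 12 can be tried out on a toy instance. The sketch below is illustrative only: the instance (three two-vertex clusters with three pairwise incompatible edges), the vertex names, and the helpers `max_lhs` and `add_cluster` are all hypothetical, and the graph is assumed to be complete multipartite so that any two chosen vertices from different clusters are joined by an edge. The brute force confirms validity of the lifted inequality; it says nothing about whether the inequality is facet-inducing.

```python
from itertools import product


def max_lhs(clusters, alpha, beta):
    """Largest value of alpha*x - beta*y over all generalized subgraphs."""
    best = 0                                  # the empty subgraph is feasible
    # None means that no vertex of the cluster is used
    for choice in product(*[[None] + list(c) for c in clusters]):
        chosen = [v for v in choice if v is not None]
        value = -sum(beta.get(v, 0) for v in chosen)
        # alpha >= 0, so taking every edge between chosen vertices is optimal
        value += sum(alpha.get(frozenset((u, v)), 0)
                     for i, u in enumerate(chosen) for v in chosen[i + 1:])
        best = max(best, value)
    return best


def add_cluster(alpha, beta, u, v, w):
    """Move the coefficient of edge {u, v} onto {u, w}, {v, w} and y_w."""
    a_uv = alpha[frozenset((u, v))]
    new_alpha = {e: a for e, a in alpha.items() if e != frozenset((u, v))}
    new_alpha[frozenset((u, w))] = a_uv
    new_alpha[frozenset((v, w))] = a_uv
    new_beta = dict(beta)
    new_beta[w] = a_uv                        # y_w enters with coefficient alpha_uv
    return new_alpha, new_beta


clusters = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
alpha = {frozenset(e): 1 for e in [("a2", "b1"), ("b2", "c1"), ("c2", "a1")]}
beta = {}
gamma = 1                                     # base inequality: x(F) <= 1

base = max_lhs(clusters, alpha, beta)
lifted_alpha, lifted_beta = add_cluster(alpha, beta, "a2", "b1", "w")
lifted = max_lhs(clusters + [["w", "q"]], lifted_alpha, lifted_beta)
print(base, lifted)                           # both should not exceed gamma
```

Coefficients that are absent from the dictionaries are read as zero, which matches the zero coefficients on $x_{uv}$, on the edges at $q$ and on $y_q$ in the statement of the theorem.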
6.3 Holes, cycles and rings

In Subsection 6.1 we noted that the star and 3-cycle inequalities are equivalent to the well-known clique inequalities, defined by Padberg (1973) for the stable set polytope. Another well-known class of inequalities for the stable set polytope, also due to Padberg (1973), are the so-called odd hole inequalities. An odd hole in a graph is a chordless cycle with cardinality odd and at least 5. The associated odd hole inequality states that, if an odd hole is of cardinality $c$, then at most $\lfloor c/2 \rfloor$ vertices in the hole can appear in a stable set. Odd hole inequalities can be lifted to obtain facets of the stable set polytope (Padberg (1973)).

In the context of the GSP, the odd hole inequalities state that, if a set $F$ of edges with $|F| \ge 5$ and $|F|$ odd induces a hole in the conflict graph $G^C$, then $x(F) \le \lfloor |F|/2 \rfloor$ is valid for $P^x(G)$ (and therefore for $P^{xy}(G)$ also).

It should be immediately apparent that the primitive odd cycle inequalities (i.e., with $|V_k| = 2$) are odd hole inequalities and that non-primitive odd cycle inequalities are lifted odd hole inequalities. However, there are other configurations of edges in $G$ which lead to odd holes in $G^C$, which can be used to derive new facets for $P^x(G)$. Consider the two configurations shown in Figure 13. For the instance on the left in Figure 13, the odd hole inequality $x_{16} + x_{23} + x_{45} + x_{39} + x_{78} \le 2$ is valid. Similarly, for the instance on the right in Figure 13, the odd hole inequality $x_{13} + x_{15} + x_{24} + x_{48} + x_{67} \le 2$ is valid. These inequalities, which happen to be primitive, are not odd cycle inequalities; yet they can be easily shown to be facet-inducing for the associated polytope $P^x(G)$. (As we point out below, however, they are not facet-inducing for $P^{xy}(G)$!)

It is not difficult to characterize the possible configurations of edges in $G$ which lead to odd holes in $G^C$. However, we do not go into detail here, because we have not found it to be particularly illuminating.
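Whether a given edge set induces a hole in the conflict graph is easy to test mechanically. The sketch below assumes the following reading of the conflict notion from Section 3, which is not restated in this part of the paper: two edges conflict when they touch the same cluster at two different vertices, so that they cannot both belong to a generalized subgraph. The instance (five two-vertex clusters joined in a cycle) and the function names are hypothetical.

```python
from itertools import combinations


def conflict(e, f, cluster_of):
    """True when e and f touch one cluster at two different vertices (assumed rule)."""
    return any(cluster_of[u] == cluster_of[v] and u != v for u in e for v in f)


def is_hole(edges, cluster_of):
    """Check that the conflict graph on `edges` is one chordless cycle of length >= 4."""
    n = len(edges)
    adj = {i: {j for j in range(n)
               if j != i and conflict(edges[i], edges[j], cluster_of)}
           for i in range(n)}
    if n < 4 or any(len(nb) != 2 for nb in adj.values()):
        return False
    # every node has degree 2, so walk the cycle through node 0 and count its length
    prev, cur, steps = None, 0, 0
    while True:
        nxt = min(j for j in adj[cur] if j != prev)
        prev, cur, steps = cur, nxt, steps + 1
        if cur == 0:
            break
    return steps == n


def max_compatible(edges, cluster_of):
    """Largest number of pairwise non-conflicting edges, by enumeration."""
    for r in range(len(edges), 0, -1):
        for subset in combinations(edges, r):
            if not any(conflict(e, f, cluster_of) for e, f in combinations(subset, 2)):
                return r
    return 0


# hypothetical instance: cluster k holds ("a", k) and ("b", k), for k = 0..4
cluster_of = {(side, k): k for side in "ab" for k in range(5)}
edges = [(("b", k), ("a", (k + 1) % 5)) for k in range(5)]

print(is_hole(edges, cluster_of), max_compatible(edges, cluster_of))
```

In this instance consecutive edges meet a common cluster at different vertices, while non-consecutive edges share no cluster, so the conflict graph is a 5-cycle and at most $\lfloor 5/2 \rfloor$ of the edges can be used together.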
Moreover, it appears to be difficult to find a closed form for the lifted odd hole inequalities which induce facets of $P^x(G)$. Of course, given any facet-inducing odd hole inequality, we can apply vertex-cloning to obtain more general lifted odd hole inequalities. But there are lifted odd hole inequalities which cannot be obtained in this way. Indeed, inequalities obtained by vertex-cloning odd hole inequalities have coefficients of zero or one on the left hand side, yet we know of lifted odd hole inequalities in which one or more left hand side coefficients take the value two.

We now turn our attention to the polytope $P^{xy}(G)$. Paradoxically, the situation here is much simpler. It turns out that the only odd hole inequalities which induce facets of $P^{xy}(G)$ are those which happen to be odd cycle inequalities. This is because 'weird' odd hole inequalities such as the ones shown in Figure 13 are dominated by a much simpler class of inequalities involving both $x$ and $y$ variables. This new class of inequalities is introduced in the next two theorems.

First we will need a definition.

Definition 2 (Rings). Consider a GSP instance with $|K| = m \ge 3$ such that $|V_k| \le 2$ for all $k \in K$. A ring is a set $F \subset E$ such that:
• $|F| = |K|$,
• $|F \cap \delta(V_k)| = 2$ for all $k \in K$ and
• $F$ induces a Hamiltonian cycle in the shrunk graph.

Note that for a given ring, there are three possibilities for a cluster. If the cluster contains two vertices, either each of the two vertices in the cluster is incident to one edge in the ring, or one of the vertices in the cluster is incident to two edges in the ring and the other vertex in the cluster is incident to none. If the cluster contains one vertex, this vertex is incident to two edges in the ring. We will say that clusters with two vertices of degree one in the ring are odd, whereas clusters of the other types are even.

Theorem 13 (Primitive Odd Ring Inequalities). Let $G$ be such that $|V_k| \le 2$ for all $k \in K$ and let $F$ be a ring.
Suppose there are $c$ odd clusters, with $c$ odd, and let $W$ denote the set of vertices in even clusters which are incident on two edges in the ring. (Note that $c + |W| = m$.) Then the (primitive) odd ring inequality

\[ x(F) \le \lfloor c/2 \rfloor + y(W) \tag{21} \]

is valid for $P^{xy}(G)$.

Proof. For each edge $e = \{u, v\} \in F$, the inequalities $x_e \le y_u$ and $x_e \le y_v$ hold. Sum these together over all $e \in F$, together with the inequalities (2) for each odd cluster, to obtain $2x(F) \le c + 2y(W)$. Divide this by two and round down the right hand side to obtain the result.

Applying trivial lifting and vertex-cloning to the primitive inequalities (21) yields what we call odd ring inequalities.

Theorem 14 (Odd Ring Inequalities). Odd ring inequalities induce facets of $P^{xy}(G)$.

Proof. This needs to be proved for the primitive version. The result then follows from Propositions 11 and 13. The primitive odd ring inequalities with $c \ge 3$ are facet-defining inequalities since they are obtained from odd cycle inequalities using the lifting Theorem 12. To prove that odd ring inequalities with $c = 1$ are also facet-defining, it suffices to show that the primitive inequality involving three clusters (one odd and two even clusters) is facet-defining. This can be done by listing affinely independent vectors lying in the face. Theorem 12 can then be applied to obtain the facet-defining property for any primitive odd ring inequality with one odd cluster.

It should be noted that odd ring inequalities are a proper generalization of odd cycle inequalities. Moreover, as stated above, the odd ring inequalities dominate 'weird' odd hole inequalities such as the ones shown in Figure 13. Indeed, for the instance on the left in Figure 13, the odd hole inequality $x_{16} + x_{23} + x_{45} + x_{39} + x_{78} \le 2$ is dominated by the odd ring inequality $x_{16} + x_{23} + x_{39} + x_{78} \le 1 + y_3$ (together with the trivial inequalities $x_{45} \le y_4$ and $y_3 + y_4 \le 1$).
Similarly, for the instance on the right in Figure 13, the odd hole inequality $x_{13} + x_{15} + x_{24} + x_{48} + x_{67} \le 2$ is dominated by the odd ring inequality $x_{13} + x_{15} + x_{48} + x_{67} \le 1 + y_1$ (together with the trivial inequalities $x_{24} \le y_2$ and $y_1 + y_2 \le 1$).

6.4 On odd clique matching inequalities

We now return to the question of which OCMIs (7) are facet-inducing. In order to do this, it is useful to note that the OCMIs are analogous to the well-known blossom inequalities of Edmonds (1965a). The blossom inequalities state that, if $S$ is a set of vertices in a graph, $|S|$ is odd, and $E(S)$ is the set of all edges which have both end-vertices in $S$, then a (1-)matching can use at most $\lfloor |S|/2 \rfloor$ edges in $E(S)$. Looking at Figure 5 in Section 2, it is clear that the OCMIs have much the same flavour, except that each vertex in $S$ has become a cluster partitioned in a certain way.

This suggests that we might be able to 'borrow' the conditions for a blossom inequality to be facet-defining for the matching polytope (Pulleyblank and Edmonds (1974)), and apply them to the OCMIs. These conditions are based on the concepts of biconnectedness and hypomatchability. We have defined biconnectedness above. A graph $G = (V, E)$ is hypomatchable if, for each vertex $i$, the subgraph induced in $G$ by $V \setminus \{i\}$ possesses a perfect matching. Obviously this implies that $|V|$ is odd.

This leads to the following two results (which are true for both $P^{xy}(G)$ and $P^x(G)$):

Theorem 15. If an OCMI is facet-inducing, then its associated shrunk support graph must be biconnected.

Proof. If the shrunk graph $G^S(V^S, E^S)$ (with $|V^S|$ odd) is not biconnected, there exists at least one articulation point and it can be decomposed into blocks. Two cases arise.

1. At least one leaf block is odd (see Figure 14 (a)). It is denoted by $B(V_B^S, E_B^S)$ with $|V_B^S|$ odd. Let $k$ be the articulation point in $V_B^S$.
The OCMI is the sum of

\[ OCM(B) + OCM(G^S \setminus (V_B^S \setminus \{k\})) \le \frac{|V_B^S| - 1}{2} + \frac{|V^S \setminus V_B^S|}{2} = \frac{|V^S| - 1}{2}, \]

where $OCM(H)$ denotes the left hand side of the OCMI whose shrunk support graph is $H$.

2. All the leaf blocks are even (see Figure 14 (b)). Let $B$ be a leaf block and $k$ be the articulation point in $V_B^S$. Then the OCMI is the sum of

\[ OCM(B \setminus \{k\}) + OCM(G^S \setminus V_B^S) + star(k) \le \frac{|V_B^S \setminus \{k\}| - 1}{2} + \frac{|V^S \setminus V_B^S| - 1}{2} + 1 = \frac{|V^S| - 1}{2}, \]

where $star(k)$ is the left hand side of the star inequality (12) (which is still valid for $P^{xy}(G)$) whose nucleus is $V_k$ in $G$ and such that its shrunk support graph is included in $G^S$.

[Figure 14: Two cases for the proof of Theorem 15. Panels: (a) if at least one leaf block is odd; (b) if all the leaf blocks are even.]

Theorem 16. If an OCMI is facet-inducing, then its associated shrunk support graph must be hypomatchable.

We do not present the proof for this theorem, because we will now prove something stronger. It turns out that these two conditions are not sufficient to get a facet. Indeed, Figure 15 (a) shows an OCMI whose shrunk support graph is biconnected and hypomatchable, but which is not facet-defining. It is dominated by the inequality shown in Figure 15 (b), which is a lifted odd hole inequality. The right hand side of both inequalities is 2.

[Figure 15: An OCMI satisfying the conditions of Theorems 15 and 16, but not facet-inducing. Panels: (a) OCMI with shrunk graph biconnected and hypomatchable; (b) a stronger lifted odd hole inequality.]

In order to give a stronger necessary condition for an OCMI to induce a facet, we will need the following definitions.

Definition 3. Let $G = (V, E)$ be a graph. A set $S = \{u, v, w\} \subseteq V$ such that $|S| = 3$ is said to be a fork if $|E(\{u, v, w\})| \ge 2$.

Definition 4. A graph $G = (V, E)$ is said to be fork-hypomatchable if, for every fork $S \subseteq V$, the graph induced by $V \setminus S$ contains a perfect matching.
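Definitions 3 and 4 translate directly into a brute-force test for small graphs. The following is a minimal sketch with hypothetical function names; the perfect-matching routine is a plain recursive enumeration, suitable only for graphs with a handful of vertices. The three example graphs (a 5-cycle, a 5-clique, and a 5-cycle with one chord) are illustrations chosen here, not instances taken from the figures.

```python
from itertools import combinations


def has_perfect_matching(vertices, edges):
    """Recursive enumeration: match the first vertex with each possible partner."""
    vertices = sorted(vertices)
    if not vertices:
        return True
    first, rest = vertices[0], vertices[1:]
    return any(frozenset((first, u)) in edges
               and has_perfect_matching([v for v in rest if v != u], edges)
               for u in rest)


def forks(vertices, edges):
    """Yield every vertex triple that induces at least two edges (Definition 3)."""
    for triple in combinations(sorted(vertices), 3):
        induced = sum(1 for pair in combinations(triple, 2) if frozenset(pair) in edges)
        if induced >= 2:
            yield triple


def is_hypomatchable(vertices, edges):
    """Removing any single vertex must leave a perfectly matchable graph."""
    return all(has_perfect_matching(set(vertices) - {v}, edges) for v in vertices)


def is_fork_hypomatchable(vertices, edges):
    """Removing any fork must leave a perfectly matchable graph (Definition 4)."""
    return all(has_perfect_matching(set(vertices) - set(s), edges)
               for s in forks(vertices, edges))


nodes = range(5)
cycle5 = {frozenset((i, (i + 1) % 5)) for i in nodes}
clique5 = {frozenset(p) for p in combinations(nodes, 2)}
chorded = cycle5 | {frozenset((0, 2))}        # 5-cycle plus the chord {0, 2}

for name, graph in [("cycle5", cycle5), ("clique5", clique5), ("chorded", chorded)]:
    print(name, is_hypomatchable(nodes, graph), is_fork_hypomatchable(nodes, graph))
```

In the chorded 5-cycle the triple $\{0, 2, 3\}$ is a fork whose removal leaves the non-adjacent vertices 1 and 4, so that graph is hypomatchable without being fork-hypomatchable.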
Note that the shrunk support graph for the OCMI shown in Figure 15 (a) is biconnected and hypomatchable, but not fork-hypomatchable.

Theorem 17. If an OCMI is facet-inducing, then its associated shrunk support graph must be fork-hypomatchable.

Proof. Consider the OCMI (7) once more. Suppose without loss of generality that $V_1^1$, $V_2^1$, $V_2^2$, $V_3^2$ are all non-empty. That is, the clusters $V_1$, $V_2$ and $V_3$ correspond to a fork $\{u, v, w\}$ in the shrunk graph. Suppose also that removing $\{u, v, w\}$ from the shrunk graph leaves a graph with no perfect matching. Then (7) is dominated by the following inequality,

$$x(E(V_1^1 : V_2^2)) + x(E(V_2^1 : V_3^2)) + \sum_{i=1}^{c-1} \sum_{j=i}^{c} x(E(V_i^j : V_{j+1}^i)) \le \lfloor c/2 \rfloor,$$

which is easily shown to be valid.

Note that, if a graph is biconnected, then every vertex is contained in at least one fork. Indeed, any vertex $i$ has a neighbour $v$, and $v$ has a further neighbour $w \ne i$, so that $\{i, v, w\}$ is a fork; a perfect matching of the graph induced by $V \setminus \{i, v, w\}$, together with the edge $\{v, w\}$, is then a perfect matching of the graph induced by $V \setminus \{i\}$. Therefore, biconnectedness and fork-hypomatchability collectively imply hypomatchability, and Theorems 15 and 17 imply Theorem 16. Moreover, we propose the following conjecture.

Conjecture 2. An OCMI is facet-inducing (for both $P^{xy}(G)$ and $P^x(G)$) if and only if its associated shrunk support graph is biconnected and fork-hypomatchable.

Enumeration shows that the only graphs with 5 vertices which satisfy this condition are the 5-hole and the 5-clique. However, for seven or more vertices there are graphs which meet the conditions and which are neither odd holes nor odd cliques. Figure 16 shows two examples. The second is of particular interest, because it is not Hamiltonian. This implies that the associated OCMIs are not lifted odd hole inequalities.

Figure 16: Two graphs which satisfy the conditions of Conjecture 3.

It is important to note that, unlike ordinary hypomatchability, fork-hypomatchability is not preserved under the addition of edges.

6.5 $\{0, \frac{1}{2}\}$-cuts

It is instructive to note that many of the inequalities discussed in this paper can be derived as so-called $\{0, \frac{1}{2}\}$-cuts.
These are valid inequalities (see Caprara and Fischetti (1996)) obtained by summing together a number of other valid inequalities, dividing the result by two and rounding down the resulting coefficients and right hand side to integers.

In the case of $P^x(G)$, it is easy to show that the odd hole inequalities from the conflict graph can be derived as $\{0, \frac{1}{2}\}$-cuts from the inequalities (10) mentioned at the beginning of this section. Moreover, since the star inequalities (12) dominate the inequalities (10), we can expect to derive stronger $\{0, \frac{1}{2}\}$-cuts from them. Indeed, many (but not all) lifted odd hole inequalities can be derived as $\{0, \frac{1}{2}\}$-cuts from the star inequalities, and so can all of the OCMIs.

In general, we expect to be able to derive stronger $\{0, \frac{1}{2}\}$-cuts for $P^{xy}(G)$ than we can for $P^x(G)$, because for $P^{xy}(G)$ we can use the inequalities (2) and (3), whereas for $P^x(G)$ we are limited to the weaker star inequalities. Indeed, in the case of $P^{xy}(G)$, the $\{0, \frac{1}{2}\}$-cuts include not only certain lifted odd hole inequalities and all OCMIs, but also the odd ring inequalities (see the proof of Theorem 13).

In fact, in the case of $P^{xy}(G)$ at least, there exist non-redundant $\{0, \frac{1}{2}\}$-cuts which do not lie in any of the classes discussed so far.

Proposition 15. Suppose that $|V_k| > 1$ for all $k$. Label each cluster as type 1 or type 2. Let $F \subset E$ be a set of edges with the following property:
• If $V_k$ is a cluster of type 1, then for each $v \in V_k$, $|F \cap \delta(v)| = 1$.
• If $V_k$ is a cluster of type 2, then for each $v \in V_k$, $|F \cap \delta(v)| = 2$.
Suppose there are $t_1$ clusters of type 1 and that $t_1$ is odd. Finally, let $W \subset K$ be the set of clusters of type 2. Then the (primitive) inequality

$$x(F) \le \lfloor t_1/2 \rfloor + \sum_{k \in W} y(V_k) \qquad (22)$$

is a valid $\{0, \frac{1}{2}\}$-cut for $P^{xy}(G)$.

Proof. For each edge $e = \{u, v\} \in F$, the inequalities $x_e \le y_u$ and $x_e \le y_v$ hold. Sum these together over all $e \in F$, together with the inequalities (2) for each cluster of type 1, to obtain

$$2x(F) \le t_1 + 2 \sum_{k \in W} y(V_k).$$

Dividing this by two and rounding down the right hand side yields the inequality (22), thus showing that (22) is a valid $\{0, \frac{1}{2}\}$-cut.

These $\{0, \frac{1}{2}\}$-cuts can in fact be obtained by applying Theorem 12 to the OCMIs described in Proposition 2. Thus, they will induce facets whenever the 'underlying' OCMI does.

It seems that a further study of $\{0, \frac{1}{2}\}$-cuts for the GSP might be useful. Moreover, in the context of a practical cutting plane algorithm for the GSP, it may be more sensible to search for violated $\{0, \frac{1}{2}\}$-cuts, rather than limiting the search to predefined classes such as OCMIs.

7 Conclusions

In this paper we have explored the GSP from a theoretical point of view. We have studied its complexity and approximability, and have provided new polyhedral results.

It should be stressed that the polyhedral results given in this paper have implications for other problems related to the GSP. All of the valid inequalities for the GSP are also valid for any of the problems which are a restriction of the GSP, such as the GMSTP and GTSP. Of course, the inequalities are no longer guaranteed to induce facets for these other problems, but in many important cases they do. For example, many of the odd clique matching inequalities are facet-inducing for the GMSTP.

References

Berman, P. (2000). A d/2 Approximation for Maximum Weight Independent Set in d-Claw Free Graphs. Nordic Journal of Computing 7, 178–184.

Burkard, R. and E. Çela (1997). Quadratic and Three-Dimensional Assignments. In M. Dell'Amico, F. Maffioli, and S. Martello (Eds.), Annotated Bibliographies in Combinatorial Optimization, pp. 373–392. Wiley.

Caprara, A. and M. Fischetti (1996). $\{0, \frac{1}{2}\}$-Chvátal-Gomory Cuts. Mathematical Programming 74, 221–235.

Edmonds, J. (1965a). Maximum Matching and a Polyhedron with 0-1 Vertices. Journal of Research of the National Bureau of Standards 69B, 125–130.

Edmonds, J. (1965b). Minimum Partition of a Matroid into Independent Subsets.
Journal of Research of the National Bureau of Standards 69B, 67–72.

Edmonds, J. (1965c). Paths, Trees and Flowers. Canadian Journal of Mathematics 17, 449–467.

Feremans, C. (2001). On the Generalized Minimum Spanning Tree and Extensions. Ph.D. thesis, Université Libre de Bruxelles.

Feremans, C., M. Labbé, and G. Laporte (2002a). A Comparative Analysis of Several Formulations for the Generalized Minimum Spanning Tree Problem. Networks 39 (1), 29–34.

Feremans, C., M. Labbé, and G. Laporte (2002b). The Generalized Minimum Spanning Tree Problem: Polyhedral Analysis and Branch-and-Cut Algorithm. Technical Report 2002/04, Université Libre de Bruxelles, CP 210/01, B-1050 Bruxelles, Belgique. Available at http://smg.ulb.ac.be (Preprints).

Fischetti, M., J. Salazar, and P. Toth (1995). The Symmetric Generalized Traveling Salesman Polytope. Networks 26, 113–123.

Fischetti, M., J. Salazar, and P. Toth (1997). A Branch-and-Cut Algorithm for the Symmetric Generalized Traveling Salesman Problem. Operations Research 45, 378–394.

Golden, B., L. Levy, and R. Dahl (1981). Two Generalizations of the Traveling Salesman Problem. Omega 9, 439–441.

Grötschel, M., L. Lovász, and A. Schrijver (1988). Geometric Algorithms and Combinatorial Optimization. Springer-Verlag.

Koster, A., S. Van Hoesel, and A. Kolen (1998). The Partial Constraint Satisfaction Problem: Facets and Lifting Theorems. Operations Research Letters 23, 89–97.

Malucelli, F. and D. Pretolani (1995). Lower Bounds for the Quadratic Semi-Assignment Problem. European Journal of Operational Research 83, 365–375.

Martin, R. (1999). Large Scale Linear and Integer Optimization: A Unified Approach. Kluwer Academic Publishers.

Padberg, M. (1973). On the Facial Structure of Set Packing Polyhedra. Mathematical Programming 5, 199–215.

Picard, J.-C. and H. Ratliff (1975). Minimum Cuts and Related Problems. Networks 5, 357–370.

Pulleyblank, W. and J. Edmonds (1974). Facets of 1-Matching Polyhedra. In C. Berge and D. Ray-Chaudhuri (Eds.), Hypergraph Seminar, pp. 214–242. Springer, Berlin.

Rhys, J. (1970). A Selection Problem of Shared Fixed Costs and Network Flows. Management Science 17 (3), 200–207.

Tarjan, R. (1972). Depth First Search and Linear Graph Algorithms. SIAM Journal on Computing 1, 146–160.