Clique, independent set, and graph coloring

Jeff Pattillo and Sergiy Butenko

Abstract

This article introduces the closely related maximum clique, maximum independent set, graph coloring, and minimum clique partitioning problems. The survey includes some of the most important results concerning these problems, including their computational complexity, known bounds, mathematical programming formulations, and exact and heuristic algorithms to solve them.

1 Introduction

The maximum clique, maximum independent set, graph coloring, and minimum clique partitioning problems are classical problems in combinatorial optimization. Due to their important role in several theoretical fields and their applicability in a wide variety of practical settings, these problems have been extensively studied from different perspectives by mathematicians, computer scientists, operations researchers, engineers, biologists, and social scientists. This chapter surveys only a fraction of these research developments with the aim of providing a brief introduction to these problems for the operations research community. The reader is referred to surveys [19] and [86] for far more extensive information about each problem.

According to the WordNet dictionary [81], a clique is defined as “an exclusive circle of people with a common purpose”. Cliques, as described in the dictionary definition, represent a natural object of interest for social and behavioral sciences. Thus, it is not surprising that the first mention of this term in a graph-theoretic context is attributed to researchers in social network analysis: in their 1949 paper [74], Luce and Perry used complete subgraphs to model social cliques, which are defined as groups of people that know (are friends of) all other people in the group. It should be noted that complete subgraphs had been studied by mathematicians even before the term “clique” was introduced to graph theory.
Graph coloring problems have been around even longer, dating back to the famous four-color conjecture first attacked in the 19th century [67, 36], which states that the regions of any simple planar map can be colored with four colors so that any two adjacent regions have different colors. The proof of this seemingly simple statement required the efforts of several generations of mathematicians and the involvement of a computer [106, 51].

Clique-like structures frequently arise in many other applications, where one is interested in detecting large groups of elements that are all closely related to each other in some sense. Depending on the application of interest, such structures are often referred to as clusters, modules, complexes, or cohesive subgroups. If the elements in the application of interest are represented as vertices (nodes) and the relationships between the elements are represented as edges (links, arcs), then clusters can be naturally modeled as cliques in this graph-theoretic representation. Similarly, graph coloring can be used to solve many important practical problems arising, e.g., in scheduling, timetabling, and telecommunication networks.

In this paper, we seek to give an introduction to these problems and their typical applications, as well as a brief summary of the most popular methods of solving the problems and the capabilities and limitations of these methods.

2 Definitions and notations

Given a simple undirected graph G = (V, E), where V = {1, . . . , n} is the vertex set and E ⊆ V × V is its edge set, we denote by G(S) = (S, E ∩ (S × S)) the subgraph induced by a subset of vertices S, and we use the notation Ḡ to represent the complement graph, that is, Ḡ = (V, Ē) where Ē = {(i, j) : i, j ∈ V, i ≠ j and (i, j) ∉ E}. Vertices i and j are called adjacent if (i, j) ∈ E. The neighborhood N(i) of a vertex i is the set of vertices adjacent to i, N(i) = { j ∈ V : (i, j) ∈ E}.
By $A_G$ we denote the adjacency matrix of G, which is an n × n matrix $A_G = [a_{ij}]_{i,j=1}^{n}$ such that $a_{ij} = 1$ if (i, j) ∈ E and $a_{ij} = 0$ otherwise. The degree deg(i) of i in G is the size |N(i)| of the neighborhood of i in G. We will denote by δ(G) and ∆(G) the smallest and the largest degree of a vertex in G, respectively. A graph G = (V, E) is called complete if all its vertices are pairwise adjacent, i.e., if for all i, j ∈ V with i ≠ j we have (i, j) ∈ E. By definition, a clique C is a subset of vertices such that G(C) is complete. An independent set in G is a subset of vertices I such that the induced subgraph G(I) is edgeless. A clique (independent set) is called maximal if it is not a subset of a larger clique (independent set) in G, and it is called maximum if there is no larger clique (independent set) in the graph.

Given a simple undirected graph G, the maximum clique problem (maximum independent set problem) asks for a maximum clique (independent set) in G. The clique number, which is traditionally denoted by ω(G), is the number of vertices in the largest clique in G. The independence number α(G) is defined analogously. It is easy to see that C is a clique in G if and only if C is an independent set in Ḡ; hence, ω(G) = α(Ḡ). Due to this close relation between the two concepts, we will use the notions of cliques and independent sets interchangeably in this paper, since all the facts stated for a clique in G are true for an independent set in Ḡ, and vice versa.

A proper k-coloring of the vertices of G is an assignment of colors (e.g., {1, 2, . . . , k}) to the vertices of G so that no two adjacent vertices get the same color. A vertex k-coloring can also be thought of as a mapping f : V → {1, 2, . . . , k} such that $f(v_i) \neq f(v_j)$ if $(v_i, v_j) \in E$. A graph G is called k-colorable if there exists a proper k-coloring of its vertices. The graph coloring problem is to find a proper coloring of its vertices with the smallest possible number of colors.
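The definitions above can be checked mechanically on a small instance. The following is a minimal brute-force sketch, not an efficient method; the helper names (`complement`, `is_clique`, `is_independent`, `largest`, `is_proper_coloring`) and the example graph (a 5-cycle with one chord) are illustrative assumptions and do not come from the text.

```python
from itertools import combinations

def complement(n, edges):
    """Edge list of the complement graph on vertices 1..n."""
    present = {frozenset(e) for e in edges}
    return [(i, j) for i, j in combinations(range(1, n + 1), 2)
            if frozenset((i, j)) not in present]

def is_clique(subset, edges):
    """True if every pair of vertices in subset is adjacent."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(subset, 2))

def is_independent(subset, edges):
    """True if no pair of vertices in subset is adjacent."""
    present = {frozenset(e) for e in edges}
    return not any(frozenset(p) in present for p in combinations(subset, 2))

def largest(n, edges, predicate):
    """Size of the largest vertex subset satisfying predicate (exponential time)."""
    vertices = range(1, n + 1)
    for k in range(n, 0, -1):
        if any(predicate(s, edges) for s in combinations(vertices, k)):
            return k
    return 0

def is_proper_coloring(f, edges):
    """True if the mapping f gives different colors to the ends of every edge."""
    return all(f[i] != f[j] for i, j in edges)

# Hypothetical example: the cycle 1-2-3-4-5-1 with the chord (1, 3).
n = 5
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5), (1, 3)]
omega = largest(n, edges, is_clique)                          # clique number of G
alpha_bar = largest(n, complement(n, edges), is_independent)  # independence number of the complement
coloring = {1: 1, 2: 2, 3: 3, 4: 1, 5: 2}
```

On this instance the two brute-force values agree, which illustrates the identity ω(G) = α(Ḡ), and the mapping `coloring` is a proper 3-coloring.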
The chromatic number χ(G) of G is the smallest integer k for which G is k-colorable. A k-coloring partitions V into k different color classes, where each member of a class has the same color. To have the same color, the members of each class must be pairwise non-adjacent, which by definition makes them an independent set. Similarly, one can define a k-clique partitioning of G, which is a partitioning of the set of vertices of G into k disjoint subsets $C_1, \ldots, C_k$, each of which is a clique. The minimum clique partitioning problem asks for a k-clique partitioning of G using the smallest possible number k of cliques, which is denoted by χ̄(G) and is called the clique partitioning number. Note that a proper k-coloring of G is a k-clique partitioning of Ḡ; thus, χ(Ḡ) = χ̄(G).

The maximum clique, maximum independent set, vertex coloring, and minimum clique partitioning problems are all known to be NP-hard [48]. Moreover, these problems are known to be hard to approximate. Arora et al. showed in [5] and [6] that the maximum clique size cannot be approximated within a factor of $n^{\varepsilon}$ for any ε > 0 unless P = NP. Similarly, the graph coloring problem is not approximable within $n^{1-\varepsilon}$ for any ε > 0 [45].

3 Mathematical programming formulations

3.1 The maximum clique problem

The maximum clique problem can be formulated as the following integer program:

$$
\begin{aligned}
\max\ & \sum_{i=1}^{n} x_i \\
\text{s.t.}\ & x_i + x_j \le 1, \quad \forall (i, j) \in \bar{E}, \\
& x_i \in \{0, 1\}, \quad i = 1, \ldots, n.
\end{aligned}
\tag{1}
$$

In this edge formulation, any feasible solution x defines a clique C in G as follows: $C = \{i \in V : x_i = 1\}$. The linear relaxation of this formulation is significant as well. Nemhauser and Trotter [84] found that if a variable $x_i$ in any optimal solution to the linear relaxation has the value 1, then $x_i = 1$ in at least one optimal solution to the above formulation. Whenever such a variable is found, it can be fixed at 1, which cuts down the search space for an optimal clique.
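To make the edge formulation (1) concrete, the sketch below enumerates all 0-1 vectors, keeps those satisfying the non-edge constraints, and returns the best one. It is an exponential-time illustration only; the function name `solve_edge_formulation` and the example graph are assumptions introduced here, and a real computation would hand the model to an integer programming solver instead.

```python
from itertools import product

def solve_edge_formulation(n, edges):
    """Brute-force formulation (1): maximize sum(x) s.t. x_i + x_j <= 1 on non-edges."""
    present = {frozenset(e) for e in edges}
    # Constraints are indexed by the edges of the complement graph.
    non_edges = [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)
                 if frozenset((i, j)) not in present]
    best_value, best_x = -1, None
    for x in product((0, 1), repeat=n):
        feasible = all(x[i - 1] + x[j - 1] <= 1 for i, j in non_edges)
        if feasible and sum(x) > best_value:
            best_value, best_x = sum(x), x
    return best_value, best_x

# Hypothetical example: the cycle 1-2-3-4-5-1 with the chord (1, 3).
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5), (1, 3)]
value, x = solve_edge_formulation(5, edges)
clique = {i + 1 for i, xi in enumerate(x) if xi == 1}  # C = {i : x_i = 1}
```

The set `clique` recovered from the optimal vector is a maximum clique of the example graph, and `value` is its clique number.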
Shor [92] modified the edge formulation to obtain an equivalent nonlinear problem:

$$
\begin{aligned}
\max\ & \sum_{i=1}^{n} x_i \\
\text{s.t.}\ & x_i x_j = 0, \quad \forall (i, j) \in \bar{E}, \\
& x_i^2 - x_i = 0, \quad i = 1, \ldots, n.
\end{aligned}
\tag{2}
$$

He used this formulation to obtain dual estimates of good quality.

Motzkin and Straus [83] related ω(G) to the global optimal value of a certain quadratic function over the standard simplex. Before we state this classical formulation, some additional notation is in order. For a set of vertices S we use the notation $x^S$ to denote the characteristic vector of S, that is, a vector that satisfies $x_i^S = 1/|S|$ if i ∈ S and $x_i^S = 0$ otherwise, for i ∈ {1, . . . , n}. By $\mathcal{S}$ we will denote the standard simplex in $\mathbb{R}^n$:

$$
\mathcal{S} = \Big\{ x \in \mathbb{R}^n : \sum_{i=1}^{n} x_i = 1,\ x_i \ge 0,\ i = 1, \ldots, n \Big\}.
$$

Let $A_G$ be the adjacency matrix of a graph G, and for $x \in \mathbb{R}^n$ let $g(x) = x^T A_G x$. Then the global optimal solution $x^*$ to $\max_{x \in \mathcal{S}} g(x)$ is related to the clique number by the formula

$$
\omega(G) = \frac{1}{1 - g(x^*)}.
$$

The following generalization of the Motzkin-Straus theorem was introduced by Bomze [18] to guarantee a one-to-one correspondence between maximal cliques and local maximizers of the quadratic program. Let $f(x) = x^T A_G x + \frac{1}{2} x^T x$.

Theorem 3.1 Let S be a subset of vertices of a graph G, and let $x^S$ be its characteristic vector. Then the following statements hold:
• S is a maximum clique of G if and only if $x^S$ is a global maximizer of the function f over the simplex $\mathcal{S}$. In this case, $\omega(G) = \dfrac{1}{2(1 - f(x^S))}$.
• S is a maximal clique of G if and only if $x^S$ is a local maximizer of f in $\mathcal{S}$.
• All local (and hence global) maximizers x of f over $\mathcal{S}$ are strict, and of the form $x = x^S$ for some S ⊆ V.

The maximum independent set problem can be equivalently formulated using the following mathematical programs over the unit hypercube in $\mathbb{R}^n$ (see, e.g., [2, 9]):

$$
\alpha(G) = \max_{x \in [0,1]^n} \Big( \sum_{i \in V} x_i - \sum_{(i,j) \in E} x_i x_j \Big), \tag{3}
$$

$$
\alpha(G) = \max_{x \in [0,1]^n} \sum_{i \in V} x_i \prod_{j \in N(i)} (1 - x_j), \tag{4}
$$

$$
\alpha(G) = \max_{x \in [0,1]^n} \sum_{i \in V} \frac{x_i}{1 + \sum_{j \in N(i)} x_j}. \tag{5}
$$

In each of these formulations, the feasible region can be replaced with $\{0, 1\}^n$ without changing the optimal objective value. Several other similar formulations can be found in [55]. Recently, Martins [78] proposed discretized formulations for the maximum clique problem and reported tighter upper bounds based on linear programming relaxations than those for the known formulations on many benchmark graphs.

3.2 Formulations of the graph coloring problem

The following is perhaps the most straightforward integer programming formulation of the graph coloring problem:

$$
\begin{aligned}
\min\ & \sum_{k=1}^{n} y_k \\
\text{s.t.}\ & \sum_{k=1}^{n} x_{ik} = 1, \quad \forall i \in V, \\
& x_{ik} + x_{jk} \le 1, \quad \forall (i, j) \in E,\ k = 1, \ldots, n, \\
& y_k \ge x_{ik}, \quad \forall i \in V,\ k = 1, \ldots, n, \\
& y_k, x_{ik} \in \{0, 1\}, \quad \forall i \in V,\ k = 1, \ldots, n.
\end{aligned}
\tag{6}
$$

Any feasible solution induces a proper coloring by assigning each vertex i the unique color k for which $x_{ik} = 1$. The optimal objective function value gives the chromatic number of the graph. Another integer programming formulation,

$$
\begin{aligned}
\min\ & \sum_{k=1}^{r} \sum_{(i,j) \in E} x_{ik} x_{jk} \\
\text{s.t.}\ & \sum_{k=1}^{r} x_{jk} = 1, \quad j = 1, \ldots, n, \\
& x_{jk} \in \{0, 1\}, \quad \forall j, k,
\end{aligned}
\tag{7}
$$

has a quadratic objective but much simpler constraints than the previous formulation. Here r is a fixed number of colors and the objective counts the edges whose endpoints share a color, so G is r-colorable if and only if the optimal value is zero. Probably the most noteworthy continuous optimization formulation is

$$
\begin{aligned}
\min\ & \alpha \\
\text{s.t.}\ & q_{ij} \le \alpha, \quad \text{if } (i, j) \in E, \\
& q_{ij} = q_{ji}, \\
& q_{ii} = 1, \\
& Q = [q_{ij}] \succeq 0.
\end{aligned}
\tag{8}
$$

While this is not a semidefinite program because α is on the wrong side of an inequality, many have used relaxations of it to find the chromatic number of graphs using semidefinite programs [41].

4 Bounds

Since the number of vertices in any color class defined by a proper coloring of G cannot exceed α(G), we obtain the following inequality:

$$
\chi(G) \ge \frac{|V|}{\alpha(G)}. \tag{9}
$$

The Caro-Wei bound on the independence number is expressed in terms of vertex degrees as follows [34, 102]:

$$
\alpha(G) \ge \sum_{i \in V} \frac{1}{\deg(i) + 1}. \tag{10}
$$

This bound is sharp if and only if each connected component of G is a clique. Note that the Caro-Wei bound follows from formulation (5) by setting all variables to 1.
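The link between the Caro-Wei bound and the fractional formulation for α(G) (the one whose objective sums $x_i / (1 + \sum_{j \in N(i)} x_j)$) can be verified numerically on a small graph. The sketch below uses exact rational arithmetic; the helper names and the three-vertex path used as the example are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def neighborhoods(n, edges):
    """Map each vertex 1..n to its set of neighbors."""
    nbrs = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    return nbrs

def caro_wei(nbrs):
    """Caro-Wei lower bound: sum over vertices of 1 / (deg(i) + 1)."""
    return sum(Fraction(1, len(nbrs[v]) + 1) for v in nbrs)

def fractional_objective(x, nbrs):
    """Objective sum_i x_i / (1 + sum_{j in N(i)} x_j) for a 0-1 vector x."""
    return sum(Fraction(x[i], 1 + sum(x[j] for j in nbrs[i])) for i in nbrs)

# Hypothetical example: the path 1-2-3, whose independence number is 2.
n = 3
nbrs = neighborhoods(n, [(1, 2), (2, 3)])
all_ones = {v: 1 for v in nbrs}
bound = caro_wei(nbrs)
at_ones = fractional_objective(all_ones, nbrs)
# Maximizing over {0,1}^n suffices, since the binary restriction keeps the optimum.
best = max(fractional_objective(dict(zip(range(1, n + 1), bits)), nbrs)
           for bits in product((0, 1), repeat=n))
```

Evaluating the objective at the all-ones vector reproduces the Caro-Wei value, while the maximum over binary vectors gives α(G) for the path.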
In 1967, Wilf [105] showed that

$$
\omega(G) \le \rho(A_G) + 1, \tag{11}
$$

where $\rho(A_G)$ is the spectral radius of the adjacency matrix of G (the largest eigenvalue of $A_G$). Amin and Hakimi [4] proved that

$$
\omega(G) \le N_{-1} + 1 < n - N_0 + 1, \tag{12}
$$

where $N_{-1}$ is the number of eigenvalues of $A_G$ that do not exceed −1, and $N_0$ is the number of zero eigenvalues. This bound is sharp if G is a complete multipartite graph.

Applying a greedy coloring yields a simple upper bound on the chromatic number:

$$
\chi(G) \le \Delta(G) + 1. \tag{13}
$$

Brooks' theorem [26] slightly improves this bound for connected graphs: for a connected, simple graph G that is not a complete graph or an odd cycle, we have

$$
\chi(G) \le \Delta(G). \tag{14}
$$

Szekeres and Wilf [96] generalized bound (13) by proving that for any function λ(G) satisfying two properties, (i) G′ ⊂ G ⇒ λ(G′) ≤ λ(G) and (ii) λ(G) ≥ δ(G), we have χ(G) ≤ λ(G) + 1. Reed [88] conjectured that for every graph G,

$$
\chi(G) \le \frac{1}{2}\big(\Delta(G) + \omega(G)\big) + 1. \tag{15}
$$

The clique number ω(G) and chromatic number χ(G) satisfy the inequality ω(G) ≤ χ(G), since every vertex in the largest clique is adjacent to every other vertex in this clique, and thus requires a different color. What is remarkable is that there exists a polynomially computable graph function ϑ(G), known as the Lovász ϑ-function [73], that always satisfies the sandwich theorem [69]:

$$
\omega(G) \le \vartheta(\bar{G}) \le \chi(G). \tag{16}
$$

Hence, ϑ(Ḡ) provides a polynomially computable upper bound on ω(G) and lower bound on χ(G).

A graph G such that for any subset of vertices S ⊆ V the corresponding induced subgraph has equal clique and chromatic numbers, ω(G(S)) = χ(G(S)), is called perfect. Recently, Chudnovsky et al. [38] proved the strong perfect graph theorem (first conjectured by Claude Berge in 1960), stating that a graph is perfect if and only if it contains no odd hole (a hole is a chordless cycle of length at least four, and an odd hole is a hole of odd length) and no odd antihole (the complement of an odd hole). Hence, checking whether a graph G is perfect can be done in polynomial time.
On the other hand, Busygin and Pasechnik [31] proved that recognizing whether χ(G) − ω(G) = 0 is NP-hard, implying that any polynomially computable parameter that lies between ω(G) and χ(G) will provide the “provably best” upper bound on the clique number (equivalently, on the independence number of the complement graph) in the sense that no other polynomially computable bound can improve this bound whenever it can be improved. In particular, the Lovász theta is one such bound.

5 Computing Cliques and Colorings

5.1 Exact Solutions

The early algorithms for the maximum clique problem were motivated by various applications and, because of the nature of the considered applications, aimed to enumerate all maximal cliques in the graph rather than just finding a single maximum clique. In 1957, Harary and Ross [56] published the first algorithm for enumerating all cliques in a graph, which was motivated by an application in sociometry. Their method first reduces the problem on general graphs to a special case with graphs having at most three cliques, and then solves the problem for this special case. Many other algorithms for enumerating all maximal cliques followed [87, 77, 22, 12].

The progress in computing technologies in the 1960s motivated the development of many new enumerative algorithms. Bron and Kerbosch [25] proposed a backtracking method that requires only polynomial storage space and excludes the possibility of computing the same clique twice. Various versions of this algorithm are still used in many different applications, in particular in computational biology, where it was found to be very effective for problems related to matching 3-dimensional molecular structures [47]. Tomita et al. [100] developed a modification of this approach that has a time complexity of $O(3^{n/3})$. This complexity cannot be improved for enumerative algorithms, since there are graphs with $3^{n/3}$ maximal cliques [82]. More recently, Tomita et al.
[101] presented a depth-first search algorithm for generating all maximal cliques of an undirected graph, which uses the same pruning rules as the Bron-Kerbosch algorithm and also has $O(3^{n/3})$ worst-case time complexity. They report encouraging results of computational experiments.

While the above-mentioned algorithms were designed for finding all maximal cliques in a graph, solving the maximum clique problem requires finding only one maximum clique. Many exact algorithms for the maximum clique and related problems have been proposed in the literature, most of which can be viewed as variations of the branch-and-bound technique. Some of the algorithms were designed with worst-case time complexity in mind, while others emphasize practical performance. In 1977, Tarjan and Trojanowski [97] proposed a recursive algorithm for the maximum independent set problem with a time complexity of $O(2^{n/3})$. Later, Robson [89] improved this result to obtain a time complexity of $O(2^{0.276n})$. In 2001, Robson [90] further brought down the best known worst-case complexity to $O(2^{n/4})$.

As for more practical algorithms, Balas and Yu [7] used an interesting implementation of implicit enumeration, with which they were able to compute maximum cliques in graphs with up to 400 vertices and 30,000 edges. Carraghan and Pardalos [35] proposed an alternative, simpler approach that examines vertices in nondecreasing order of their degrees within the implicit enumeration framework. This method proved to be extremely efficient, especially for sparse graphs, and was selected as a benchmark for comparing different algorithms in the Second DIMACS Implementation Challenge [66]. Östergård [85] proposed a branch-and-bound algorithm that employs a vertex ordering based on approximate coloring.
This algorithm showed excellent performance on random graphs and DIMACS benchmark instances and is considered the state-of-the-art general-purpose exact algorithm for the maximum clique problem by many researchers. Tomita and Kameda [98] used similar ideas in their implementation of the so-called MCR algorithm, which also uses approximate coloring along with an appropriate sorting of vertices. They report superior computational results on graphs with up to 15,000 vertices, including instances arising in image processing, design of quantum circuits, and bioinformatics.

As we have seen in the previous section, the maximum clique and graph coloring problems can be formulated as integer or continuous nonconvex programs. Modern integer programming solvers typically combine branch-and-bound strategies with cutting plane methods, efficient preprocessing schemes, including fast heuristics, and sophisticated decomposition techniques in order to find an exact solution, and may be quite effective even with their default settings when used with an appropriate integer programming formulation. However, experience shows that developing specialized strategies based on detailed polyhedral studies of the specific problem helps to speed up the computations and tackle larger problem instances. Several attempts to use algorithms based on the polyhedral properties of the maximum clique problem have been reported in the literature. For example, Bourjolly et al. [23] propose a column generation algorithm for the maximum stable set problem, and Rossi [91] reports encouraging computational results with a branch-and-cut algorithm for the same problem. While these approaches show some promise and may still be improved in the future, at the moment they are, generally speaking, outperformed in practice by the above-mentioned branch-and-bound strategies.

The exact algorithms for graph coloring follow the same general line of development as those for the maximum clique problem.
Brown [27] proposed a branch-and-bound algorithm that uses a sequential greedy coloring heuristic described in the next section. Several improvements were suggested later [24, 66, 71]. Mehrotra and Trick [80] developed a column generation approach for graph coloring. Herrmann and Hertz [57] attempt to compute the chromatic number of a graph G by first determining a critical subgraph, which is the smallest induced subgraph of G that has the same chromatic number. Their method makes it possible to solve larger instances than those reported previously. Lucet et al. [75] proposed an exact algorithm for graph coloring based on a linear-decomposition of the graph. This method is exponential with respect to the linearwidth of the input graph, but only linear with respect to its number of vertices. The method was used to successfully solve some of the problem instances for the first time.

5.2 Approximation Algorithms and Heuristics

Due to the inapproximability of the maximum clique and graph coloring problems mentioned above, approximation algorithms for these problems normally have an approximation ratio that depends on the problem size, and they are of mostly theoretical value. For example, a greedy algorithm that builds a maximal independent set by recursively adding a minimum degree vertex and removing its neighbors has an approximation ratio of (∆(G) + 2)/3 [54]. Johnson [64] presented an algorithm for graph coloring with a performance guarantee of $O(n / \log n)$.

Heuristics for the maximum clique and graph coloring problems usually have little theoretical justification and few performance bounds. Thus, evaluations of which heuristics tend to work best in which situations are mostly based on experimentation. The heuristics listed below are among the most popular and successful.
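Before turning to those heuristics, the minimum-degree greedy rule for independent sets mentioned above can be sketched as follows. This is a minimal illustration, assuming that degrees are recomputed in the remaining graph at every step and that ties are broken by the smallest vertex label; the function name and the star graph example are hypothetical.

```python
def greedy_independent_set(n, edges):
    """Repeatedly pick a minimum-degree vertex and delete it with its neighbors."""
    adj = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    remaining = set(adj)
    chosen = []
    while remaining:
        # Degree is measured in the graph induced by the remaining vertices.
        v = min(remaining, key=lambda u: (len(adj[u] & remaining), u))
        chosen.append(v)
        remaining -= adj[v] | {v}
    return chosen

# Hypothetical example: a star with center 1 and leaves 2..5.
star_edges = [(1, 2), (1, 3), (1, 4), (1, 5)]
independent = greedy_independent_set(5, star_edges)
```

On the star, the rule first selects a leaf, which removes the center, and then collects the remaining leaves, so it returns the optimal independent set; on general graphs only the ratio quoted above is guaranteed.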
5.2.1 Greedy Construction Heuristics

Greedy heuristics for the maximum clique problem typically work in one of two ways: either by adding vertices to a partial clique until no more can be inserted, or by deleting vertices from a set of vertices that is not a clique until it induces a complete subgraph. The rule governing which vertex to add or delete next is usually based upon simple local information, such as the degree of the vertex. Heuristics such as these tend to run extremely fast. For examples of greedy heuristics, see [63, 70, 99].

The most commonly implemented greedy heuristic for the coloring problem is called the sequential greedy coloring heuristic (SGCH). The first step of this algorithm is to order the vertices. After that, each vertex is colored in the specified order in a greedy manner, using the smallest feasible color. There will always be an ordering that leads to an optimal coloring. To see this, suppose the color classes from an optimal coloring are $\{S_1, \ldots, S_k\}$. If SGCH is applied to an ordering where all of $S_1$ is colored, then all of $S_2$, then all of $S_3$, and so on, it is easy to see that, while the same exact coloring may not be obtained, it will certainly not use more than k colors. Because of this fact, heuristics with greedy coloring tend to concentrate heavily on the order in which the vertices are to be colored. Ordering schemes such as largest first in [103], smallest last in [79], and dynamic ordering schemes that do not pre-order, as in [24] and [72], have all been well studied. For research into how one can move gradually from an initial ordering to an optimal ordering, see, e.g., [14, 37, 104].

5.2.2 Local Search Heuristics

Local search methods work by finding an initial solution to a problem and then searching “neighbors” of this solution in the space of all solutions, looking for a better answer. Local search methods vary in how they define the neighborhood of a solution as well as in how they perform the search itself.
The most basic local search methods are based on one of two common rules for determining the next move: the best improvement strategy chooses the best neighbor, while the first improvement strategy accepts the first better solution found while searching the neighborhood of the current solution. The definition of the neighborhood is a critical element of a local search algorithm. The most common neighborhoods for the maximum clique and the graph coloring problems are the so-called exchange neighborhoods. An exchange neighborhood for the maximum clique problem is defined as follows: given a clique representing the current solution, its neighbors are obtained by removing one or more vertices from the clique and adding at least as many new vertices to the clique whenever possible. For graph coloring, the exchange is typically performed between different color classes. Namely, given a proper coloring, its neighbors are obtained by moving a certain number of vertices from one color class to another in an attempt to eventually reduce some of the color classes to the empty set, thus obtaining a better coloring. Oftentimes, exchanges resulting in infeasible solutions are allowed, with a penalty added to the objective for the infeasibility. This is done to diversify the search and increase the chances of hitting a high-quality solution.

In [39], an alternative iterative improvement heuristic for the coloring problem is proposed. The paper notes that if one takes a coloring $\{S_1, \ldots, S_r\}$, where the $S_i$'s are the color classes, then greedily coloring an ordering of the form $(S_{i_1}, \ldots, S_{i_r})$, that is, an ordering where each color class is colored entirely before the next class, cannot use more colors than the previous coloring. This observation is independent of the ordering of the vertices within each class, as well as of the order in which the classes appear, so in [39] various ordering schemes of the classes are explored.
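The sequential greedy coloring heuristic and the class-by-class recoloring observation can be sketched together. The code below is an illustration under stated assumptions: the helper names are hypothetical, and the choice of revisiting the classes in reverse order is just one possible scheme picked for the example, not necessarily one of those studied in [39].

```python
def build_adj(vertices, edges):
    """Adjacency sets for an undirected graph."""
    adj = {v: set() for v in vertices}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj

def greedy_coloring(order, adj):
    """Color vertices in the given order with the smallest feasible color."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

def order_by_classes(coloring, class_order):
    """List vertices so that each color class appears as one contiguous block."""
    classes = {}
    for v, c in coloring.items():
        classes.setdefault(c, []).append(v)
    return [v for c in class_order for v in classes[c]]

# Hypothetical example: the path 1-2-3-4 with a deliberately poor initial order.
adj = build_adj([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
first = greedy_coloring([1, 4, 2, 3], adj)
reverse_classes = sorted(set(first.values()), reverse=True)
second = greedy_coloring(order_by_classes(first, reverse_classes), adj)
```

Here the initial order wastes a color on the path, and one regrouping pass recovers an optimal coloring; in general the pass is only guaranteed not to increase the number of colors.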
Typically a mixture of the schemes is best because it allows the algorithm to search a “larger” neighborhood of potential solutions.

While traditional local search methods are typically capable of improving on the solutions generated by construction heuristics, they are known for their tendency to get trapped in local optima of rather poor quality. Several strategies commonly used to overcome this drawback are briefly reviewed in the next subsection. Various other heuristics can be found in, e.g., [66, 33].

5.2.3 Metaheuristics

Metaheuristic strategies were designed to allow a traditional local search algorithm to escape poor local optima in which a search might get stuck because of a poor initial solution. While many metaheuristic strategies, such as neural networks, genetic algorithms, and genetic computing, have been applied to the maximum clique and graph coloring problems, few offer very good results. For a discussion of this on the clique problem specifically, see [19]. Both problems do, however, yield good results with the same two local search methods: simulated annealing and tabu search. These methods specify particular ways of performing a search through the solution set.

In simulated annealing, after an initial solution has been created, the algorithm chooses a neighbor of it to potentially accept as the next solution. If the neighbor improves on the current solution, it is automatically accepted. If it is not an improvement, the neighbor is accepted with a probability that gets lower and lower as the annealing process proceeds. The rate at which the probability of acceptance decreases, called the cooling schedule, depends on a parameter called temperature, which largely determines the runtime of the simulated annealing process. While convergence to an exact solution can be guaranteed with a logarithmic cooling schedule, faster cooling schedules are often used so that the algorithm does not take exponential time to finish.
For different implementations of simulated annealing to solve the clique problem, see [1] and [60]. It is interesting to note that it has been shown in [62] that, theoretically, simulated annealing should perform poorly for the clique problem, but the results in [60] conclude just the opposite. Simulated annealing actually outperforms most other heuristics on the clique problem in the experiments in that paper. Pure versions of simulated annealing for the coloring problem, along with implementations where it is combined with greedy heuristics, exist in [37] and [65]. It seems to outperform most greedy heuristics both in time and in quality of the solution on random graphs of up to 1000 vertices.

Like simulated annealing, tabu search was designed to keep an algorithm from always ending up at the same answers as it searches the solution space, but it goes about this in a different way. Essentially, this method keeps a tabu list, a list of aspiration criteria, and a list of neighbors for the current solution. The algorithm chooses a neighbor and checks it against the tabu list. The tabu list keeps track of which vertices are illegal to visit, unless they meet an aspiration criterion. If the chosen vertex is not on the tabu list, it is visited. In this way, the set of solutions searched is highly diversified. The implementations of tabu search that have performed well for the maximum clique problem can be found in [11, 49, 94, 93]. For implementations of tabu search for the graph coloring problem, see [40, 58, 59]. For results comparing not only simulated annealing and tabu search but also greedy heuristics on random graphs of up to 500 vertices and varying densities, see [68].

5.2.4 Heuristics Based on Continuous Optimization

The Motzkin-Straus formulation yields another potential way of solving the maximum clique problem, shifting it from a discrete to a continuous setting.
Gibbons, Hearn, and Pardalos [50] relaxed the Motzkin-Straus formulation into a problem of optimizing a quadratic function over a sphere, which is solvable in polynomial time. The drawback is that solutions may become infeasible for the original problem and hence have to be rounded. Their algorithm performed well on DIMACS benchmark graphs, as recorded in [50].

Others have taken advantage of the fact that the Motzkin-Straus formulation is naturally associated with a replicator equation. A replicator equation, so named because it behaves like the fitness of a population as it replicates, defines an iterative update under which the objective value never decreases as each new solution is plugged back in. With enough starts in random locations, one hopes that recursively plugging solutions into the replicator equation will end in discovering the global optimum. For a discussion of algorithms that take advantage of the Motzkin-Straus formulation as a replicator equation, including how to remove the need for iteration in the procedure and an empirical analysis, see [20, 21]. Results on benchmark graphs are reported in [66].

Burer et al. [28] studied rank-one and rank-two relaxations of the Lovász semidefinite program [53] and obtained two continuous formulations for the maximum independent set problem. They used these formulations to develop new heuristics for finding large independent sets. Several other heuristics for the maximum independent set problem based on continuous optimization techniques have been proposed in the literature [2, 30, 10, 29].

As for the graph coloring problem, Dukanovic and Rendl [41] use information provided by near-optimal solutions of a semidefinite programming formulation for the Lovász ϑ-function to generate heuristic solutions for the graph coloring problem. Based on the results of numerical experiments, they suggest that this approach could be useful for coloring medium-sized instances.
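As an illustration of the replicator idea applied to the Motzkin-Straus objective $g(x) = x^T A_G x$, the sketch below iterates the standard discrete-time replicator update $x_i \leftarrow x_i (A_G x)_i / g(x)$ from the barycenter of the simplex. The update rule, the starting point, the iteration count, and the function name are assumptions chosen for this example; they are not taken from [20, 21], and the iteration may in general stop at a local rather than a global maximizer.

```python
def replicator_iterations(n, edges, iterations=200):
    """Discrete-time replicator dynamics for g(x) = x^T A x on the simplex."""
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i - 1][j - 1] = A[j - 1][i - 1] = 1.0
    x = [1.0 / n] * n          # start at the barycenter of the simplex
    history = []               # objective value before each update
    for _ in range(iterations):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        g = sum(x[i] * Ax[i] for i in range(n))
        history.append(g)
        if g == 0.0:           # edgeless graph: the update is undefined
            break
        x = [x[i] * Ax[i] / g for i in range(n)]
    return x, history

# Hypothetical example: triangle {1, 2, 3} with a pendant vertex 4 attached to 3.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
x, history = replicator_iterations(4, edges)
support = {i + 1 for i, xi in enumerate(x) if xi > 1e-6}
estimate = round(1.0 / (1.0 - history[-1]))  # Motzkin-Straus: omega = 1 / (1 - g*)
```

The recorded objective values form a nondecreasing sequence, the support of the final point identifies a clique, and the Motzkin-Straus formula converts the final objective value into an estimate of the clique number.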
6 Applications

The clique and coloring problems share more than just theoretical connections. They tend to show up in similar applications, as ways of dealing with conflicts or errors that are expected. The coloring problem tends to appear in applications where one simply wishes to detect issues. The solution gives classes that are not in conflict with each other, but if an error occurs, the whole thing tends to be scrapped. The clique problem tends to appear in applications where issues need to be corrected. To pinpoint the location of the error with high likelihood, information from as many sources as possible is needed. The largest clique tends to give the most information from which the location of an error can be deduced. The reader is referred to [32] for a survey of applications of clique-detection models in biochemistry and genomics. Applications of cliques and coloring in telecommunications are discussed in [8]. Some other applications are briefly reviewed below.

One of the best known applications of the clique problem is in coding theory. Suppose we have a code made up of binary vectors of length n. The Hamming distance between two vectors $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$ is the number of positions i in which $u_i \neq v_i$. If any two codewords in a code are at Hamming distance at least d, then it is well known that the code can correct up to $\lfloor (d-1)/2 \rfloor$ transmission errors by rounding to the nearest codeword. What would be helpful to know is how many codewords with pairwise Hamming distance at least d can exist among the binary vectors of length n. If a graph is made with all binary vectors of length n as vertices, and with edges joining all pairs of vertices at Hamming distance at least d, so that they can be put in the same code, then the clique number of this graph is precisely the largest number of codewords in such a binary code of length n. No formula exists in general for this value.
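For very small parameters, the graph just described can be built explicitly and its clique number computed by exhaustive search. The sketch below does this with a simple branch-and-bound routine; the function names and the pruning rule are illustrative assumptions, and the approach is only practical for tiny n because the graph has $2^n$ vertices.

```python
from itertools import product

def hamming(u, v):
    """Number of positions in which the two vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def code_graph(n, d):
    """Vertices: binary vectors of length n; edges: pairs at distance >= d."""
    words = list(product((0, 1), repeat=n))
    return {w: {v for v in words if v != w and hamming(w, v) >= d} for w in words}

def max_clique(adj):
    """Simple branch and bound returning one maximum clique as a list."""
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        while candidates:
            # Prune: even taking every candidate cannot beat the incumbent.
            if len(clique) + len(candidates) <= len(best):
                return
            v = candidates.pop()
            expand(clique + [v], candidates & adj[v])

    expand([], set(adj))
    return best

# Hypothetical small instance: length 5, minimum distance 3.
code = max_clique(code_graph(5, 3))
```

Each vector in the returned list is a codeword, and by construction any two of them differ in at least d positions.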
If one wants to build a code that corrects up to ⌊(d − 1)/2⌋ errors and has a desired number of codewords, the clique problem must be solved to obtain the most efficient such code.
Another application that involves solving the clique problem is fault diagnosis in multiprocessor systems. Pinpointing the faulty system or systems requires a great deal of information, and [13] presents a method that pinpoints the faulty system with probability at least $1 - n^{-1}$. A major portion of the algorithm is determining the largest clique, which carries information from which many deductions can be made.
Some of the most obvious applications of the coloring problem occur in scheduling with limited resources. Whether it is timetable scheduling to minimize the number of rooms needed for an event, allocation of variables to registers in a way that minimizes the number of wasted memory accesses caused by overwriting frequently used variables, or assignment of frequencies to users in a way that minimizes interference due to proximity, the coloring problem typically appears where many pieces must share resources. A plan in which not everything fits is usually scrapped. For instance, if the classes that need to run in a school do not fit in the set of available rooms under one schedule, a new schedule is sought.
Another application of the coloring problem is in circuit board testing. To test for shorts, one could test the nets of a circuit board pairwise. If n nets exist, this requires $\binom{n}{2}$ tests to be sure that the circuit board is sound. However, if the nets that cannot have shorts between them are all yoked together as a color class, one can simply test whether shorts exist between the k color classes pairwise. This takes only $\binom{k}{2}$ tests to determine whether the circuit board is sound. While the exact location of the short cannot be determined, which might allow one to fix the board, this allows one to detect an error much more quickly.
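The test counts in the circuit-board example can be checked with a small sketch. It assumes a hypothetical list of net pairs that could be shorted, groups the nets into color classes with a simple first-fit greedy coloring (a heuristic that does not in general find the minimum number of classes), and compares the $\binom{n}{2}$ pairwise tests with the $\binom{k}{2}$ tests between classes. The function name and the toy instance are illustrative.

```python
from math import comb


def greedy_color_classes(n, can_short):
    """First-fit greedy coloring of nets 0..n-1.

    can_short is a list of pairs (u, v) of nets that could be shorted;
    nets in the same returned class can never be shorted with each other.
    """
    adj = {v: set() for v in range(n)}
    for u, v in can_short:
        adj[u].add(v)
        adj[v].add(u)
    color = {}
    for v in range(n):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:  # smallest color not used by a colored neighbor
            c += 1
        color[v] = c
    k = max(color.values()) + 1 if color else 0
    return [[v for v in range(n) if color[v] == c] for c in range(k)]


# Toy instance (an assumption): six nets laid out in a row, where only
# neighboring nets can be shorted.
NETS = 6
POSSIBLE_SHORTS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

classes = greedy_color_classes(NETS, POSSIBLE_SHORTS)
pairwise_tests = comb(NETS, 2)        # test every pair of nets
class_tests = comb(len(classes), 2)   # test every pair of color classes
```

Here the six nets fall into two classes, so a single test between the classes replaces fifteen pairwise tests, at the cost of not locating the short.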
Thus, assuming the board will be scrapped if an error has occurred, the solution to the coloring problem tremendously speeds up the testing process.
6.1 Clique and chromatic numbers in uniform random graphs
In this section we provide some results concerning the asymptotic behavior of clique and chromatic numbers in random graphs. Erdös and Rényi [42, 43, 44] founded random graph theory by introducing several models of uniform random graphs. Two of the most popular such models are G(n, m) and G(n, p) [15]: the first assigns the same probability to all graphs with n vertices and m edges, while in the second each pair of vertices is connected by an edge randomly and independently with probability p. Studying the asymptotic behavior of various graph properties may provide reasonable theoretical estimates of graph invariants for massive graph instances that cannot be tackled by a computer. We say that almost every (a.e.) graph has property Q, or that property Q holds asymptotically almost surely (a.a.s.), if the probability that a random graph on n vertices has property Q tends to 1 as n → ∞. Erdös and Rényi observed that many graph properties either hold or do not hold for almost every graph.
6.2 Clique number
Assume that 0 < p < 1 is fixed. Then instead of the sequence of spaces {G(n, p), n ≥ 1} one can work with the single probability space $G(\mathbb{N}, p)$ containing graphs on $\mathbb{N}$ with the edges chosen independently with probability p. For a graph $G \in G(\mathbb{N}, p)$ we denote by $G_n$ the subgraph of G induced by the first n vertices {1, 2, ..., n}. Then the sequence $\omega(G_n)$ appears to be almost completely determined for a.e. $G \in G(\mathbb{N}, p)$. Bollobás and Erdös [17] proved that if p = p(n) satisfies $n^{-\varepsilon} < p \le c$ for every $\varepsilon > 0$ and some c < 1, then there exists a function $cl : \mathbb{N} \to \mathbb{N}$ such that a.a.s. $cl(n) \le \omega(G_n) \le cl(n) + 1$, i.e., the clique number is asymptotically distributed on at most two values.
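The G(n, p) model and the clique number of a sampled graph can be explored numerically for small n. The following sketch samples a graph from G(n, p) and computes its clique number by exhaustive branch and bound. The helper names are illustrative assumptions, and the comparison value $2\log_{1/p} n$ is only the classical leading term for fixed p; experiments at this scale say nothing rigorous about asymptotic behavior.

```python
import math
import random


def sample_gnp(n, p, rng):
    """Sample a graph on vertices 0..n-1: each pair is joined
    independently with probability p. Returns adjacency sets."""
    adj = {v: set() for v in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def clique_number(adj):
    """Exact clique number by a simple branch and bound (exponential time)."""
    best = 0

    def expand(size, candidates):
        nonlocal best
        best = max(best, size)
        candidates = set(candidates)
        # Stop once the remaining candidates cannot improve on best.
        while candidates and size + len(candidates) > best:
            v = candidates.pop()
            expand(size + 1, candidates & adj[v])

    expand(0, adj.keys())
    return best


def leading_term(n, p):
    """Leading term 2 * log_{1/p}(n) of the clique number for fixed p."""
    return 2.0 * math.log(n) / math.log(1.0 / p)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    for n in (20, 40, 60):
        omega = clique_number(sample_gnp(n, 0.5, rng))
        print(n, omega, round(leading_term(n, 0.5), 2))
```

For small n the lower-order terms are not negligible, so the observed clique numbers should be expected to fall noticeably below the leading term.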
The sequence cl(n) appears to be close to $l_0(n) = 2\log_{1/p} n + O(\log\log n)$: for a.e. $G \in G(\mathbb{N}, p)$, if n is large enough, then
\[ \lfloor l_0(n) - 2\log\log n/\log n \rfloor \le \omega(G_n) \le \lfloor l_0(n) + 2\log\log n/\log n \rfloor \]
and
\[ \left| \omega(G_n) - 2\log_{1/p} n + 2\log_{1/p}\log_{1/p} n - 2\log_{1/p}(e/2) - 1 \right| < \frac{3}{2}. \]
Frieze [46] and Janson et al. [61] extended these results by showing that for every $\varepsilon > 0$ there exists a constant $c_\varepsilon$ such that for $c_\varepsilon/n \le p(n) \le \log^{-2} n$, a.a.s.
\[ \lfloor 2\log_{1/p} n - 2\log_{1/p}\log_{1/p} n + 2\log_{1/p}(e/2) + 1 - \varepsilon/p \rfloor \le \omega(G_n) \le \lfloor 2\log_{1/p} n - 2\log_{1/p}\log_{1/p} n + 2\log_{1/p}(e/2) + 1 + \varepsilon/p \rfloor. \]
6.3 Chromatic number
The problem of coloring random graphs has been studied by several researchers [16, 3, 52, 95, 76]. Łuczak [76] improved the previous results on the concentration of χ(G(n, p)), proving that for every sequence p = p(n) such that $p \le n^{-6/7}$ there is a function ch(n) such that a.a.s. ch(n) ≤ χ(G(n, p)) ≤ ch(n) + 1. Alon and Krivelevich [3] have shown that for any positive constant γ the chromatic number of a uniform random graph G(n, p), where $p = n^{-1/2-\gamma}$, is a.a.s. distributed among two consecutive values. Moreover, a proper choice of p(n) may result in a one-point distribution. The function ch(n) is difficult to find in general; however, it can be characterized in some cases. In particular, Janson et al. [61] proved that there exists a constant $c_0$ such that for any p = p(n) satisfying $c_0/n \le p \le \log^{-7} n$, a.a.s.
\[ \frac{np}{2\log np - 2\log\log np + 1} \le \chi(G(n, p)) \le \frac{np}{2\log np - 40\log\log np}. \]
If p is constant, we have the following estimate:
\[ \chi(G(n, p)) = \frac{n}{2\log_b n - 2\log_b\log_b n + O_c(1)}, \]
where b = 1/(1 − p).
References
[1] E. Aarts and J. Korst. Simulated Annealing and Boltzmann Machines. John Wiley & Sons Incorporated, Chichester, UK, 1989. [2] J. Abello, S. Butenko, P. Pardalos, and M. Resende. Finding independent sets in a graph using continuous multivariable polynomial formulations. Journal of Global Optimization, 21:111–137, 2001. [3] N. Alon and M. Krivelevich.
The concentration of the chromatic number of random graphs. Combinatorica, 17:303–313, 1997. [4] A. T. Amin and S. L. Hakimi. Upper bounds on the order of a clique of a graph. SIAM J. Appl. Math., 22:569–573, 1972. [5] S. Arora, C. Lund, R. Motwani, and M. Szegedy. Proof verification and hardness of approximation problems. Journal of the ACM, 45:501–555, 1998. [6] S. Arora and S. Safra. Probabilistic checking of proofs: a new characterization of NP. Journal of the ACM, 45(1):70–122, 1998. [7] E. Balas and C. Yu. Finding a maximum clique in an arbitrary graph. SIAM Journal on Computing, 15:1054–1068, 1986. [8] B. Balasundaram and S. Butenko. Graph domination, coloring and cliques in telecommunications. In M. G. C. Resende and P. M. Pardalos, editors, Handbook of Optimization in Telecommunications, pages 865–890. Springer Science + Business Media, New York, 2006. [9] B. Balasundaram and S. Butenko. On a polynomial fractional formulation for independence number of a graph. Journal of Global Optimization, 35:405–421, 2006. [10] B. Balasundaram and S. Butenko. On a polynomial fractional formulation for independence number of a graph. Journal of Global Optimization, 35(3):405–421, 2006. [11] R. Battiti and M. Protasi. Reactive local search for the maximum clique problem. Algorithmica, 29:610–637, 2001. [12] A. R. Bednarek and O. E. Taulbee. On maximal chains. Roum. Math. Pres et Appl., 11:23–25, 1966. [13] P. Berman and A. Pelc. Distributed fault diagnosis for multiprocessor systems. In Proceedings of the 20th Annual International Symposium on Fault-Tolerant Computing, pages 340–346, Newcastle, UK, 1990. [14] N. Biggs. Some heuristics for graph coloring. In R. Nelson and R. Wilson, editors, Graph Colorings, Pitman Research Notes in Mathematics, pages 87–96. Wiley, 1990. [15] B. Bollobás. Random Graphs. Academic Press, New York, 1985. [16] B. Bollobás. The chromatic number of random graphs. Combinatorica, 8:49–56, 1988. [17] B. Bollobás and P. Erdös.
Cliques in random graphs. Math. Proc. Camb. Phil. Soc., 80:419–427, 1976. [18] I. M. Bomze. Evolution towards the maximum clique. Journal of Global Optimization, 10:143–164, 1997. [19] I. M. Bomze, M. Budinich, P. M. Pardalos, and M. Pelillo. The maximum clique problem. In D.-Z. Du and P. M. Pardalos, editors, Handbook of Combinatorial Optimization, pages 1–74, Dordrecht, The Netherlands, 1999. Kluwer Academic Publishers. [20] I. M. Bomze, M. Pelillo, and R. Giacomini. Evolutionary approach to the maximum clique problem: empirical evidence on a larger scale. In I. M. Bomze, T. Csendes, R. Horst, and P. M. Pardalos, editors, Developments of Global Optimization, pages 95–108. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997. [21] I.M. Bomze and V. Stix. Genetical engineering via negative fitness: Evolutionary dynamics for global optimization. Ann. Oper. Res., 89:279–318, 1999. [22] R. E. Bonner. On some clustering techniques. IBM J. Res. Develop., 8:22–32, 1964. [23] J.-M. Bourjolly, G. Laporte, and H. Mercure. A combinatorial column generation algorithm for the maximum stable set problem. Operations Research Letters, 20(1):21 – 29, 1997. [24] D. Brélaz. New methods to color the vertices of a graph. Communications of the ACM, 22(4):251– 256, 1979. [25] C. Bron and J. Kerbosch. Algorithm 457: Finding all cliques on an undirected graph. Communications of ACM, 16:575–577, 1973. [26] R. L. Brooks. On coloring the nodes of a network. Proc. Cambridge Philos. Soc., 37:194–197, 1941. [27] J.R. Brown. Chromatic scheduling and the chromatic number problem. Management Science, 19:456–463, 1972. [28] S. Burer, R. D. C. Monteiro, and Y. Zhang. Maximum stable set formulations and heuristics based on continuous optimization. Mathematical Programming, 94:137–166, 2002. [29] S. Busygin. A new trust region technique for the maximum weight clique problem. Discrete Applied Mathematics, 154:2080–2096, 2006. [30] S. Busygin, S. Butenko, and P. M. Pardalos. 
A heuristic for the maximum independent set problem based on optimization of a quadratic over a sphere. Journal of Combinatorial Optimization, 6:287– 297, 2002. [31] S. Busygin and D.V. Pasechnik. On NP-hardness of the clique partition – independence number gap recognition and related problems. Discrete Mathematics, 304:460–463, 2006. [32] S. Butenko and W. Wilhelm. Clique-detection models in computational biochemistry and genomics. European Journal of Operational Research, 173:1–17, 2006. [33] M. Caramia and P. Dell’Olmo. Coloring graphs by iterated local search traversing feasible and infeasible solutions. Discrete Appl. Math., 156(2):201–217, 2008. [34] Y. Caro. New results on the independence number. Technical report, Tel-Aviv University, Israel, 1979. [35] R. Carraghan and P. Pardalos. An exact algorithm for the maximum clique problem. Operations Research Letters, 9:375–382, 1990. [36] A. Cayley. On the colourings of maps. Proc. Royal Geographical Society, 1:259–261, 1879. [37] M. Chams, A. Hertz, and D. de Werra. Some experiments with simulated annealing for coloring graphs. European Journal of Operational Research, 32:260–266, 1987. [38] M. Chudnovsky, N. Robertson, P.D. Seymour, and R.Thomas. The strong perfect graph theorem. Ann. Math., 164:51–229, 2006. [39] J.C. Culberson. Iterated greedy graph coloring and the difficulty landscape. Technical Report TR 92-07, University of Alberta Dept of Computer Science, Edmonton, Alberta, Canada, 1992. [40] D. de Werra. Heuristics for graph coloring. Computing, 7:191–208, 1990. [41] I. Dukanovic and F. Rendl. A semidefinite programming-based heuristic for graph coloring. Discrete Appl. Math., 156(2):180–189, 2008. [42] P. Erdös and A. Rényi. On random graphs. Publicationes Mathematicae, 6:290–297, 1959. [43] P. Erdös and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci., 5:17–61, 1960. [44] P. Erdös and A. Rényi. On the strength of connectedness of a random graph. Acta Math. Acad. Sci. 
Hungar., 12:261–267, 1961. [45] U. Feige and J. Kilian. Zero knowledge and the chromatic number. Journal of Computer and System Sciences, 57:187–199, 1998. [46] A. Frieze. On the independence number of random graphs. Discrete Mathematics, 81:171–175, 1990. [47] E. J. Gardiner, P. J. Artymiuk, and P. Willett. Clique-detection algorithms for matching three-dimensional molecular structures. Journal of Molecular Graphics and Modeling, 15:245–253, 1998. [48] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-completeness. W.H. Freeman and Company, New York, 1979. [49] A. Gendreau, L. Salvail, and P. Soriano. Solving the maximum clique problem using a tabu search approach. Ann. Oper. Res., 41:385–403, 1993. [50] L. E. Gibbons, D. W. Hearn, and P. M. Pardalos. A continuous based heuristic for the maximum clique problem. In D. S. Johnson and M. A. Trick, editors, Cliques, Coloring, and Satisfiability: Second DIMACS Challenge, DIMACS Series in Discrete Mathematics and Theoretical Computer Science 26, pages 103–124. American Mathematical Society, Providence, RI, 1996. [51] G. Gonthier. Formal proof–the four-color theorem. Notices of the American Mathematical Society, 55:1382–1393, 2008. [52] G. R. Grimmett and C. J. H. McDiarmid. On coloring random graphs. Mathematical Proceedings of Cambridge Phil. Society, 77:313–324, 1975. [53] M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization. Springer-Verlag, Berlin, 2nd edition, 1993. [54] M. M. Halldórsson and J. Radhakrishnan. Greed is good: Approximating independent sets in sparse and bounded-degree graphs. Algorithmica, 18:145–163, 1997. [55] J. Harant. Some news about the independence number of a graph. Discussiones Mathematicae Graph Theory, 20:71–79, 2000. [56] F. Harary and I. C. Ross. A procedure for clique detection using the group matrix. Sociometry, 20:205–215, 1957. [57] F. Herrmann and A. Hertz.
Finding the chromatic number by means of critical graphs. J. Exp. Algorithmics, 7:10, 2002. [58] A. Hertz. Cosine: A new graph coloring algorithm. Operations Research Letters, 10:411–415, 1991. [59] A. Hertz and D. de Werra. Using tabu search techniques for graph coloring. Computing, 39:345–351, 1987. [60] S. Homer and M. Peinado. Experiments with polynomial-time clique approximation algorithms on very large graphs. In [66], pages 147–167. 1996. [61] S. Janson, T. Łuczak, and A. Ruciński. Random Graphs. John Wiley & Sons Incorporated, New York, 2000. [62] M. Jerrum. Large cliques elude the Metropolis process. Random Structures and Algorithms, 3:347–359, 1992. [63] D. S. Johnson. Approximation algorithms for combinatorial problems. J. Comput. Syst. Sci., 9:256–278, 1974. [64] D.S. Johnson. Worst-case behavior of graph-coloring algorithms. In Proceedings of 5th Southeastern Conference on Combinatorics, Graph Theory and Computing, pages 513–528, Winnipeg, 1974. [65] D.S. Johnson, C.R. Aragon, L.A. McGeoch, and C. Schevon. Optimization by simulated annealing: An experimental evaluation, part II, graph coloring and number partitioning. Operations Research, 39:378–406, 1991. [66] D.S. Johnson and M.A. Trick, editors. Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge, volume 26 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science. American Mathematical Society, Providence, RI, 1996. [67] A.B. Kempe. On the geographical problem of four colors. Amer. J. Math., 2:193–200, 1879. [68] D. Klimowicz and M. Kubale. Graph coloring by tabu search and simulated annealing. Archives of Control Sciences, 2:41–54, 1993. [69] D. E. Knuth. The sandwich theorem. Electronic Journal of Combinatorics, 1:A1, 1994. Accessed April 2010. [70] R. Kopf and G. Ruhe. A computational study of the weighted independent set problem for general graphs. Found. Control Engin., 12:167–180, 1987. [71] S. M. Korman. The graph-colouring problem. In N.
Christofides, A. Mingozzi, P. Toth, and C. Sandi, editors, Combinatorial Optimization, pages 211–235, New York, 1979. Wiley. [72] F.T. Leighton. A graph colouring algorithm for large scheduling problems. J. Res. Nat. Bur. Stand., 84:489–496, 1979. [73] L. Lovász. On the shannon capacity of a graph. IEEE Trans. Inform. Theory, 25:1–7, 1979. [74] R.D. Luce and A.D. Perry. A method of matrix analysis of group structure. Psychometrika, 14:95– 116, 1949. [75] C. Lucet, F. Mendes, and A. Moukrim. An exact method for graph coloring. Comput. Oper. Res., 33(8):2189–2207, 2006. [76] T. Łuczak. A note on the sharp concentration of the chromatic number of random graphs. Combinatorica, 11:295–297, 1991. [77] P. M. Marcus. Derivation of maximal compatibles using Boolean algebra. IBM J. Res. Develop., 8:537–538, 1964. [78] P. Martins. Extended and discretized formulations for the maximum clique problem. Computers and Operations Research, 37:1348–1358, 2010. [79] D.W. Matula, G. Marble, and J.D. Isaacson. Graph coloring algorithms. In Graph Theory and Computing, pages 109–122. Academic Press, Inc., 1972. [80] A. Mehrotra and M. A. Trick. A column generation approach for graph coloring. INFORMS Journal on Computing, 8(4):344–354, 1996. [81] G. A. Miller. WordNet, Princeton University. http://wordnet.princeton.edu, 2009. [82] J. W. Moon and L. Moser. On cliques in graphs. Israel Journal of Mathematics, 3:23–28, 1965. [83] T. S. Motzkin and E. G. Straus. Maxima for graphs and a new proof of a theorem of Turán. Canad. J. Math., 17:533–540, 1965. [84] G. L. Nemhauser and L. E. Trotter. Vertex packing: structural properties and algorithms. Mathematical Programming, 8:232–248, 1975. [85] P. R. J. Östergård. A fast algorithm for the maximum clique problem. Discrete Applied Mathematics, 120:197–207, 2002. [86] P.M. Pardalos, T. Mavridou, and J. Xue. The graph coloring problem: A bibliographic survey. In D.-Z. Du and P.M. 
Pardalos, editors, Handbook of Combinatorial Optimization, volume 2, pages 331–395. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1998. [87] M. C. Paull and S. H. Unger. Minimizing the number of states in incompletely specified sequential switching functions. IRE Transactions Electr. Comput., EC-8:356–367, 1959. [88] B. Reed. ω, ∆, and χ. Journal of Graph Theory, 27:177–212, 1998. [89] J. M. Robson. Algorithms for maximum independent sets. Journal of Algorithms, 7:425–440, 1986. [90] J. M. Robson. Finding a maximum independent set in time O(2^{n/4}). Technical Report 1251-01, LaBRI, Université Bordeaux, 2001. [91] F. Rossi and S. Smriglio. A branch-and-cut algorithm for the maximum cardinality stable set problem. Operations Research Letters, 28(2):63–74, 2001. [92] N. Z. Shor. Dual quadratic estimates in polynomial and Boolean programming. Annals of Operations Research, 25:163–168, 1990. [93] P. Soriano and M. Gendreau. Diversification strategies in tabu search algorithms for the maximum clique problem. Ann. Oper. Res., 63:189–207, 1996. [94] P. Soriano and M. Gendreau. Tabu search algorithms for the maximum clique problem. In [66], pages 221–242. 1996. [95] V. T. Sós and E. G. Straus. Extremal of functions on graphs with applications to graphs and hypergraphs. J. Combin. Theory B, 32:246–257, 1982. [96] G. Szekeres and H. S. Wilf. An inequality for the chromatic number of a graph. Journal of Combinatorial Theory, 4:1–3, 1968. [97] R. E. Tarjan and A. E. Trojanowski. Finding a maximum independent set. SIAM Journal on Computing, 6:537–546, 1977. [98] E. Tomita and T. Kameda. An efficient branch-and-bound algorithm for finding a maximum clique with computational experiments. J. of Global Optimization, 37(1):95–111, 2007. [99] E. Tomita, A. Tanaka, and H. Takahashi. Two algorithms for finding a near-maximum clique. Technical Report UEC-TR-C1, University of Electro-Communications, Tokyo, Japan, 1988. [100] E. Tomita, A. Tanaka, and H. Takahashi.
The worst-time complexity for finding all the cliques. Technical Report UEC-TR-C5, University of Electro-Communications, Tokyo, Japan, 1988. [101] E. Tomita, A. Tanaka, and H. Takahashi. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci., 363(1):28–42, 2006. [102] V. K. Wei. A lower bound on the stability number of a simple graph. Technical Report TM 8111217-9, Bell Laboratories, Murray Hill, NJ, 1981. [103] D.J.A. Welsh and M.B. Powell. An upper bound for the chromatic number of a graph and its applications to timetabling problems. The Computer Journal, 10:85–86, 1967. [104] A. T. White. Graphs, Groups and Surfaces. Elsevier Science Publishers B.V., Amsterdam, 1984. [105] H. S. Wilf. The eigenvalues of a graph and its chromatic number. J. London Math. Soc., 42:330–332, 1967. [106] R. Wilson. Four Colours Suffice. Allen Lane Science, 2002.