Monte-Carlo Algorithms for Enumeration and Reliability Problems

Richard M. Karp†
University of California at Berkeley

Michael Luby†
University of Toronto

† Research supported by NSF Grant MCS-81-05217.
0272-5428/83/0000/0056$01.00 © 1983 IEEE

1. Introduction

We present a simple but very general Monte-Carlo technique for the approximate solution of enumeration and reliability problems. Several applications are given, including:

1. Estimating the number of triangulated plane maps with a given number of vertices;
2. Estimating the cardinality of a union of sets;
3. Estimating the number of input combinations for which a boolean function, presented in disjunctive normal form, assumes the value true;
4. Estimating the failure probability of a system with faulty components.

1.1 Randomized Approximation Algorithms and Approximation Schemes

Let $f$ be a function from some domain $D$ into the positive reals. We shall be concerned with randomized algorithms which accept as input any $w \in D$ and produce as output a positive real number $\tilde f(w)$ which is an estimate of $f(w)$. Since the algorithm involves randomization, $\tilde f(w)$ is a random variable, rather than a constant, for each fixed $w$.

Such a randomized algorithm is called an $(\varepsilon,\delta)$ approximation algorithm for $f$ if, for every input $w \in D$,

$$\Pr\left[\frac{|\tilde f(w) - f(w)|}{f(w)} > \varepsilon\right] < \delta.$$

In a similar spirit, we can discuss randomized approximation methods in which $\varepsilon$ and $\delta$, as well as $w$, are part of the input. Such a randomized algorithm is called a randomized approximation scheme for $f$ if, for every input triple $(\varepsilon,\delta,w)$, where $w \in D$, $\varepsilon > 0$ and $0 < \delta < 1$, the algorithm produces as output a real number $\tilde f_{\varepsilon,\delta}(w)$ such that

$$\Pr\left[\frac{|\tilde f_{\varepsilon,\delta}(w) - f(w)|}{f(w)} > \varepsilon\right] < \delta.$$

In cases where the domain $D$ is a set of strings, a randomized approximation scheme is called fully polynomial if its execution time is bounded by a polynomial in $1/\varepsilon$, $1/\delta$ and the length of $w$.

We derive randomized approximation schemes for the problems mentioned above. In particular, we give a fully polynomial scheme for estimating the number of input combinations that make a disjunctive normal form boolean formula true. Thus we have a fully polynomial randomized approximation scheme for a #P-complete problem.

2. Counting Equivalence Classes

The general principles underlying all our results can be described abstractly as follows. Let $S$ be a finite set on which an equivalence relation $\sim$ is defined. We wish to estimate the number of equivalence classes into which $\sim$ partitions $S$. The number of equivalence classes will be denoted $|S/\!\sim|$. We assume that $|S|$, the cardinality of $S$, is known. Let $[x]$ denote the equivalence class containing $x$.

We give two Monte-Carlo methods for estimating $|S/\!\sim|$. Each of these methods executes $t$ trials. The estimator of $|S/\!\sim|$ is

$$\frac{X_1 + X_2 + \cdots + X_t}{t},$$

where $X_i$ is the result of the $i$th trial. The random variables $X_i$ are independent and identically distributed, and $E[X_i] = |S/\!\sim|$. Each of the methods requires a procedure for choosing elements at random from $S$. Method 1 assumes that, given $x$, we can determine the number of elements in $[x]$. Method 2 assumes that each equivalence class contains a canonical representative, and that it is possible to determine whether a given element $x$ is the canonical representative of its equivalence class.

In Method 1, the $i$th trial is conducted as follows:

    Choose a random element $x \in S$;
    $X_i \leftarrow |S| / |[x]|$.

In Method 2, the $i$th trial is conducted as follows:

    Choose a random element $x \in S$;
    If $x$ is the canonical representative of $[x]$ then $X_i \leftarrow |S|$ else $X_i \leftarrow 0$.

It is easily verified that, in each case, $E[X_i] = |S/\!\sim|$.

In determining the number of trials required in an $(\varepsilon,\delta)$ approximation scheme for estimating $|S/\!\sim|$, the quantity $\rho = |S/\!\sim| / |S|$ plays a crucial role. We require the following lemma.

Lemma 1: Let $X$ be a random variable which has mean $p$ and always assumes values in the range $[0,1]$. Let $\{X_i\}$ be a sequence of independent random variables, each of which has the same distribution as $X$. Then, for every $\varepsilon \in (0,1)$,

$$\Pr\left[\left|\frac{X_1 + X_2 + \cdots + X_t}{t} - p\right| > \varepsilon p\right] < 2\exp\left(-\frac{2\varepsilon^2 t p}{9(1-p)}\right).$$

The proof follows from Theorem 2, p. 41, of Rényi [5].

Corollary: If $|S/\!\sim| / |S| = \rho$, then the number of trials sufficient in Method 1 or Method 2 to obtain an $(\varepsilon,\delta)$ approximation scheme is

$$\frac{9(1-\rho)}{2\varepsilon^2\rho}\,\ln\frac{2}{\delta}.$$

Our various Monte-Carlo algorithms will be obtained by choosing a set $S$ and an equivalence relation $\sim$ such that the quantity we wish to estimate is $|S/\!\sim|$. In each case, the efficiency of the method will hinge on the observation that $\rho = |S/\!\sim| / |S|$ is not too small.

3. Estimating the Number of Plane Triangulations

A plane triangulation is a connected plane map in which every face is bounded by three edges and no two edges are incident with the same pair of vertices. Let $T_1$ and $T_2$ be plane triangulations. Then $T_1$ and $T_2$ are called isomorphic if there is a one-to-one mapping $g$ of the vertices of $T_1$ onto the vertices of $T_2$ such that

1. two distinct vertices $v$ and $w$ of $T_1$ are joined by an edge of $T_1$ if and only if $g(v)$ and $g(w)$ are joined by an edge of $T_2$, and
2. three distinct vertices $u$, $v$ and $w$ of $T_1$ lie on a common face of $T_1$ if and only if $g(u)$, $g(v)$ and $g(w)$ lie on a common face of $T_2$.

Let $U_n$ denote the number of isomorphism types of plane triangulations with $n+3$ vertices. For example, $U_3 = 2$, and the two isomorphism types are indicated in Figure 1.

[Figure 1. The Two Types of Plane Triangulations with Six Vertices]

The problem of computing $U_n$ is unsolved, but Tutte [6] has solved a related problem in which the three vertices of some face are distinguished by special labels. Define a labelled plane triangulation as a plane triangulation in which one face is distinguished, its three vertices are labelled $a$, $b$ and $c$, and the other vertices remain unlabelled. Let $T_1$ and $T_2$ be labelled plane triangulations.
Then $T_1$ and $T_2$ are called label isomorphic if there is a label-preserving one-to-one mapping of the vertices of $T_1$ onto the vertices of $T_2$ which preserves edges and faces, as in the preceding definition of isomorphism. Tutte proved that the number of label isomorphism equivalence classes is

$$L_n = \frac{2\,(4n+1)!}{(n+1)!\,(3n+2)!}, \qquad n = 1, 2, 3, \ldots$$

We show that Tutte's classic result can be exploited to yield an $(\varepsilon,\delta)$ approximation scheme for estimating $U_n$. The execution time is proportional to $\frac{1}{\varepsilon^2}\ln\frac{1}{\delta}$ and polynomial in $n$.

We use the formalism of the preceding section, choosing $S$ and $\sim$ as follows: $S$ is the set of all distinct (i.e., nonisomorphic) labelled plane triangulations with $n+3$ vertices. Hence, $|S| = L_n$. Two distinct labelled plane triangulations are equivalent under $\sim$ if these triangulations are isomorphic as unlabelled graphs. Hence, the number of equivalence classes is the number of distinct (nonisomorphic) unlabelled plane triangulations with $n+3$ vertices; $|S/\!\sim| = U_n$.

The efficient implementation of Monte-Carlo Method 1 depends on two observations. First, there is a randomized polynomial time algorithm for sampling from the labelled isomorphism types; i.e., for generating a labelled triangulation $T$ with $n+3$ vertices whose labelled isomorphism type is equally likely to be any one of the $L_n$ distinct types. Second, there is a polynomial time algorithm which computes certificates for labelled plane triangulations; two labelled triangulations are isomorphic if and only if their certificates are equal. Here, "polynomial time" means that the number of steps is bounded by a polynomial in $n$.

A single trial of Method 1 goes as follows.

1. Select a random labelled plane triangulation $T$ with $n+3$ vertices.
2. Let $G$ be the unlabelled map obtained by deleting the labels from $T$.
3. In all possible ways, select a face of $G$ and a labelling of that face. Since $G$ has $2(n+1)$ faces and each face can be labelled in $3!$ ways, this gives $12(n+1)$ labelled plane triangulations. Compute the certificate of each of these labelled triangulations.
4. Let $r$ be the number of distinct certificates so obtained. Then $r$ is the number of labelled isomorphism types contained in the same unlabelled isomorphism type as $T$, and our estimator of $U_n$ is $L_n / r$.

The value of $r$ never exceeds $12(n+1)$. Hence, $\rho \ge \frac{1}{12(n+1)}$, and the number of trials required for an $(\varepsilon,\delta)$ approximation to $U_n$ is less than or equal to

$$\frac{108n + 99}{2\varepsilon^2}\,\ln\frac{2}{\delta}.$$

For clarity, we illustrate a typical trial, in which $n = 3$, so $|S| = L_3 = 13$. Suppose the labelled map of Figure 2 is chosen in Step 1.

[Figure 2. A Labelled Triangulation]

There are 48 labelled triangulations derived from this map. Among them, twelve distinct labelled isomorphism types occur. Hence $r = 12$ and our estimator of $U_3$ for this trial is $L_3 / r = 13/12$.

We now show how to generate a random labelled plane triangulation, and how to compute the certificate of a given labelled plane triangulation. Both of these computations are recursive and require a generalization of the concept of a labelled plane triangulation. For $d = 3, 4, \ldots$ define a labelled $d$-map as a plane map in which

(a) one face $F$ with $d$ edges on its boundary is distinguished;
(b) each of the other faces has exactly three edges on its boundary;
(c) no two edges join the same pair of vertices;
(d) the vertices on the boundary of $F$ are labelled $a_1, a_2, \ldots, a_d$ in cyclic order;
(e) each edge which is not on the boundary of $F$, but joins two vertices on the boundary of $F$, is incident with $a_1$; $a_1$ is called the root of the labelled $d$-map.

We see that the labelled plane triangulations are just the labelled 3-maps. Two labelled $d$-maps $M_1$ and $M_2$ are called label isomorphic if there is a label-preserving one-to-one mapping of the vertices of $M_1$ onto the vertices of $M_2$ which preserves edges and faces.

[Figure 3. The Isomorphism Types of Labelled 4-Maps with Five Vertices]

Let $S(d,n)$ denote the number of label isomorphism equivalence classes of labelled $d$-maps with $n+d$ vertices. Then $L_n = S(3,n)$. Figure 3 illustrates that $S(4,1) = 3$. Also, $S(d,0) = 1$ for $d = 3, 4, \ldots$. By convention, $S(2,n) = S(d,-1) = 0$. Let $ab$ denote the edge joining $a$ and $b$. Let $\langle b_1, b_2, \ldots, b_r \rangle$ denote the bounded region with vertices $b_1, b_2, \ldots, b_r$ and edges $b_1b_2, b_2b_3, \ldots, b_rb_1$ on its boundary.

Theorem 1: For $d = 3, 4, 5, \ldots$ and $n = 1, 2, \ldots$

$$S(d,n) = S(d-1,n) + S(d+1,n-1) + \sum_{\substack{n_1 + n_2 = n-1 \\ 4 \le d_1 \le d \\ d_1 + d_2 = d+3}} S(d_1,n_1)\cdot S(d_2,n_2).$$

Proof: The distinct labelled $d$-maps with $n+d$ vertices correspond to the triangulations of $\langle a_1, a_2, \ldots, a_d \rangle$ such that every edge joining two vertices of the boundary is incident with the root $a_1$. We select a unique triangle $T$ in this triangulation, according to the following case analysis.

CASE 1. $\langle a_1, a_2, a_3 \rangle$ is a triangle. Then $T = \langle a_1, a_2, a_3 \rangle$. This case can occur in $S(d-1,n)$ ways, corresponding to the triangulations of $\langle a_1, a_3, \ldots, a_d \rangle$ with $n+d-1$ vertices and $a_1$ as root.

CASE $i$, $i = 3, 4, \ldots, d-1$. Root $a_1$ is adjacent to $a_i$ and is not adjacent to any $a_j$, $2 < j < i$. Within $\langle a_1, a_2, \ldots, a_i \rangle$, $T = \langle a_1, b, a_i \rangle$ is the unique triangle containing $a_1a_i$. This case can occur in

$$\sum_{n_1=0}^{n-1} S(i+1,n_1)\cdot S(d-i+2,\,n-1-n_1)$$

ways, corresponding to the ways of triangulating $\langle b, a_1, a_2, \ldots, a_i \rangle$ and $\langle a_1, a_i, a_{i+1}, \ldots, a_d \rangle$, with respective roots $b$ and $a_1$, and with $n-1$ vertices in addition to $a_1, a_2, \ldots, a_d$ and $b$.

CASE $d$. Root $a_1$ is adjacent to none of $a_3, a_4, \ldots, a_{d-1}$. In this case $a_1a_2$ lies in a unique triangle $T = \langle a_1, b, a_2 \rangle$. This case can occur in $S(d+1,n-1)$ ways, corresponding to the triangulations of $\langle b, a_2, a_3, \ldots, a_d, a_1 \rangle$ with $b$ as root. ∎

[Figure 4. Cases in the Proof of Theorem 1]

The case analysis in the proof of Theorem 1 suggests a recursive way of selecting at random a labelled isomorphism type of $d$-map with $n+d$ vertices. The first step is to select one of the cases. Case 1 is chosen with probability $\frac{S(d-1,n)}{S(d,n)}$, case $i$ with probability

$$\frac{\sum_{n_1=0}^{n-1} S(i+1,n_1)\cdot S(d-i+2,\,n-1-n_1)}{S(d,n)}, \qquad i = 3, 4, \ldots, d-1,$$

and case $d$ with probability $\frac{S(d+1,n-1)}{S(d,n)}$. If, for example, case $i$ is selected, $3 \le i \le d-1$, then a given value for $n_1$ is chosen with probability

$$\frac{S(i+1,n_1)\cdot S(d-i+2,\,n-1-n_1)}{\sum_{n_1'=0}^{n-1} S(i+1,n_1')\cdot S(d-i+2,\,n-1-n_1')}.$$

Then, recursively, one of the $S(i+1,n_1)$ types of triangulations of $\langle b, a_1, a_2, \ldots, a_i \rangle$ with root $b$ and $n_1$ internal vertices is selected at random, along with one of the $S(d-i+2,\,n-1-n_1)$ types of triangulations of $\langle a_1, a_i, a_{i+1}, \ldots, a_d \rangle$ with root $a_1$ and $n-1-n_1$ internal vertices.

The certificate of a given labelled $d$-map is obtained by numbering the vertices in a canonical way. Two $d$-maps are isomorphic if and only if their associated numbered maps have exactly the same vertices and edges. The canonical numbering procedure follows the case analysis of Theorem 1. Procedure NUM($\langle a_1, a_2, \ldots, a_d \rangle$) canonically numbers the vertices within or on the boundary of $\langle a_1, a_2, \ldots, a_d \rangle$, where $a_1$ is the root. Procedure INTNUM($\langle a_1, a_2, \ldots, a_d \rangle$) canonically numbers the interior vertices of $\langle a_1, a_2, \ldots, a_d \rangle$, where $a_1$ is the root. The command num($x$) numbers vertex $x$ with the least positive integer not previously assigned as the number of a vertex.

Procedure NUM($\langle a_1, a_2, \ldots, a_d \rangle$)
    For $i = 1$ to $d$ do num($a_i$);
    INTNUM($\langle a_1, a_2, \ldots, a_d \rangle$)

Procedure INTNUM($\langle a_1, a_2, \ldots, a_d \rangle$)
    Determine which case holds in the case analysis of Theorem 1.
    CASE 1.
        INTNUM($\langle a_1, a_3, \ldots, a_d \rangle$)
    CASE $i$, $i = 3, 4, \ldots, d-1$.
        num($b$);
        INTNUM($\langle b, a_1, a_2, \ldots, a_i \rangle$);
        INTNUM($\langle a_1, a_i, a_{i+1}, \ldots, a_d \rangle$)
    CASE $d$.
        num($b$);
        INTNUM($\langle b, a_2, a_3, \ldots, a_d, a_1 \rangle$)

An alternate way of assigning certificates to labelled $d$-maps is given in [7].

4. Estimating the Cardinality of a Union of Sets

The problem of computing the cardinality of a union of finite sets is a fundamental one in combinatorics, and it is usually attacked using the Principle of Inclusion and Exclusion:

$$\left|\bigcup_{i=1}^{m} S_i\right| = \sum_{i} |S_i| - \sum_{i<j} |S_i \cap S_j| + \sum_{i<j<k} |S_i \cap S_j \cap S_k| - \cdots$$

This expression for $\left|\bigcup_{i=1}^{m} S_i\right|$ entails $2^m - 1$ terms, and thus is inconvenient for computation when $m$ is large. Our Monte-Carlo methods provide a very attractive alternative if one is willing to settle for a reliable approximation rather than an exact count.

We assume that, for each $i$:

1. $|S_i|$ is known;
2. It is possible to choose a random element of $S_i$;
3. It is possible to decide whether a given element $s$ lies in $S_i$.

We apply Method 1, with $S = \{\langle s,i \rangle \mid s \in S_i\}$ and $\langle s,i \rangle \sim \langle s',i' \rangle$ if and only if $s = s'$. The number of equivalence classes is clearly equal to the number of elements in $\bigcup_{i=1}^{m} S_i$, and $|S| = \sum_{i=1}^{m} |S_i|$.

Then a trial in Method 1 can be implemented as follows.

1. Choose $i \in \{1, 2, \ldots, m\}$ with probability $\dfrac{|S_i|}{\sum_{j=1}^{m} |S_j|}$;
2. Choose a random element $s \in S_i$; (the pair $\langle s,i \rangle$ has now been chosen)
3. For all $j \ne i$ test whether $s \in S_j$;
4. $X \leftarrow \dfrac{\sum_{i=1}^{m} |S_i|}{|\{j \mid s \in S_j\}|}$.

Then $X$ is the required estimator. In this case, $\rho \ge 1/m$, since each element of the union lies in at most $m$ of the sets. Hence,

$$\frac{9(m-1)}{2\varepsilon^2}\,\ln\frac{2}{\delta}$$

trials are sufficient for an $(\varepsilon,\delta)$ approximation.

The foregoing method has the disadvantage that, in order to determine $|\{j \mid s \in S_j\}|$, it is necessary to test the membership of $s$ in each set $S_j$, $j \ne i$. To avoid $m-1$ membership tests in each trial we can resort to Monte-Carlo estimation of $|\{j \mid s \in S_j\}|$. The estimation is done by repeatedly drawing a $j$ at random from $\{1, 2, \ldots, m\}$ and testing whether $s \in S_j$. If it takes $l$ drawings to obtain an $S_j$ containing $s$, then the estimator of $|\{j \mid s \in S_j\}|$ is $m/l$ and, accordingly, the estimator of $\left|\bigcup S_i\right|$ is $\frac{l}{m}\sum_{i=1}^{m} |S_i|$.

Unlike Methods 1 and 2, which perform a predetermined number of trials, this method terminates when elements have been drawn at random from $\{1, 2, \ldots, m\}$ a specified number of times; since the number of drawings in a trial is a random variable, the number of trials executed is not fixed in advance.

Method 3
    $T \leftarrow \max\left(\dfrac{500m}{\varepsilon^2\delta},\ \dfrac{5m}{[\ldots]}\right)$;  [the denominator of the second term is illegible in the source]
    trials $\leftarrow 0$; draws $\leftarrow 0$;
    repeat until draws $\ge T$
    begin
        choose $i \in \{1, 2, \ldots, m\}$ with probability $|S_i| / \sum_{j=1}^{m} |S_j|$;
        choose a random element $s \in S_i$;
        $l \leftarrow 0$;
        do until a $j$ is chosen such that $s \in S_j$
        begin
            draws $\leftarrow$ draws $+ 1$; $l \leftarrow l + 1$;
            choose $j$ at random in $\{1, 2, \ldots, m\}$
        end;
        trials $\leftarrow$ trials $+ 1$; $Y_{\text{trials}} \leftarrow l$
    end;
    $X \leftarrow \dfrac{Y_1 + Y_2 + \cdots + Y_{\text{trials}}}{\text{trials}} \cdot \dfrac{\sum_{i=1}^{m} |S_i|}{m}$   { $X$ is the estimator of $\left|\bigcup_{i=1}^{m} S_i\right|$ }

The execution time of Method 3 is dominated by the time to perform tests of the form "is $s \in S_j$?" For fixed $\varepsilon$ and $\delta$ the expected number of such tests is $O(m)$, rather than $O(m^2)$, as in Method 1.

Theorem 2: Method 3 is an $(\varepsilon,\delta)$ approximation scheme.

Proof: Let $Y_k$ denote the number of drawings in the $k$th trial. The $Y_k$ are independent and identically distributed, since they correspond to independent repetitions of the same random experiment. Straightforward calculation shows that

$$E[Y_1] = \frac{m\left|\bigcup_{i=1}^{m} S_i\right|}{\sum_{i=1}^{m} |S_i|}$$

and $\mathrm{Var}[Y_1] \le m\,E[Y_1]$. Let $\mu$ denote $E[Y_1]$. The algorithm observes $Y_1, Y_2, \ldots, Y_{\text{trials}}$, where trials $= \min\{k \mid Y_1 + Y_2 + \cdots + Y_k \ge T\}$. It suffices to show that

$$\Pr\left[\left|\frac{Y_1 + Y_2 + \cdots + Y_{\text{trials}}}{\text{trials}} - \mu\right| > \varepsilon\mu\right] < \delta.$$

For this purpose we invoke Kolmogorov's inequality ([1], p. 220): let $\{Y_k\}$ be a sequence of independent identically distributed random variables having mean $\mu$ and variance $\sigma^2$. Then, for each positive integer $N$ and each positive real $x$,

$$\Pr\left[\exists i,\ 1 \le i \le N \text{ and } |Y_1 + Y_2 + \cdots + Y_i - i\mu| \ge x\right] \le \frac{N\sigma^2}{x^2}.$$

Applying Kolmogorov's inequality with $x = \sqrt{3Nm\mu/\delta}$, with integers $L \le N$ proportional to $T/\mu$ [their exact settings are illegible in the source], and recalling that $\mathrm{Var}[Y_1] \le m\mu$, we obtain:

$$\Pr[Y_1 + Y_2 + \cdots + Y_N < T] \le \frac{\delta}{3}, \qquad \Pr[Y_1 + Y_2 + \cdots + Y_L > T] \le \frac{\delta}{3},$$

$$\Pr\left[\exists i,\ 1 \le i \le N \text{ and } |Y_1 + Y_2 + \cdots + Y_i - i\mu| \ge \sqrt{\frac{3Nm\mu}{\delta}}\right] \le \frac{\delta}{3}.$$

Hence, with probability at least $1 - \delta$, $L \le \text{trials} \le N$ and

$$\left|\frac{Y_1 + Y_2 + \cdots + Y_{\text{trials}}}{\text{trials}} - \mu\right| \le \frac{1}{L}\sqrt{\frac{3Nm\mu}{\delta}} \le \varepsilon\mu;$$

the proof is complete. ∎

5. An Application to DNF Formulas

Given a Boolean formula in disjunctive normal form, we would like to estimate the number of input combinations that make the formula true.
Suppose the formula contains $n$ variables and is the disjunction of $m$ terms. Then we need to estimate $\left|\bigcup_{i=1}^{m} S_i\right|$, where $S_i$ is a subset of $\{0,1\}^n$ consisting of those input combinations that make the $i$th term true. The prerequisites for an efficient implementation hold. $|S_i| = 2^{n-n_i}$, where $n_i$ is the number of literals occurring in the $i$th term. It is easy to select an element of $S_i$ at random; the values of $n_i$ variables are forced, and the other $n - n_i$ variables can be chosen independently at random. To determine whether a given input combination lies in $S_i$ requires only $n_i$ bit inspections. With reasonable assumptions about the format in which the DNF formula is presented, Method 3 gives a fully polynomial randomized approximation scheme which, for each fixed $\varepsilon$ and $\delta$, runs in $O(m \cdot n)$ time. It is remarkable that such a scheme exists for a #P-complete problem. The reader should note, however, that the scheme depends crucially on the assumption that the formula is in disjunctive normal form; one can hardly expect such a favorable result for formulas in conjunctive normal form, since such a result would imply a randomized polynomial time algorithm for the Satisfiability problem.

6. Applications to Network Reliability

The most significant applications of our Monte-Carlo methods are in the area of network reliability. Here, one is given a graph in which the edges are normally on (i.e., operating or working) but may be off (i.e., disabled or failing). We assume that each edge $e$ is off with a small probability $p_e$, independently of the other edges. A criterion for correct operation of the network is specified, and the task is to estimate the probability that the network fails to operate correctly. For example, we might specify that the network fails if a specified pair of distinguished vertices $s$ and $t$ cannot communicate; i.e., in every path between $s$ and $t$ at least one edge is off. Define an $st$-cut as a minimal set of edges that intersects every $st$-path. Define a configuration as an assignment to each edge of one of the two states $\{on, off\}$. The probability of a configuration is

$$\prod_{\{e \,\mid\, \text{edge } e \text{ is off}\}} p_e \prod_{\{e \,\mid\, \text{edge } e \text{ is on}\}} (1 - p_e).$$

Let the number of $st$-cuts be $m$, and let $S_i$ be the set of configurations in which all the edges of the $i$th cut are off. Then the set of configurations in which the network fails is $\bigcup_{i=1}^{m} S_i$, and we wish to estimate $\Pr\left[\bigcup_{i=1}^{m} S_i\right]$.

If an explicit list of the $st$-cuts is available then our methods can be applied directly, with a few minor changes because we are estimating the probability of a union of events rather than the cardinality of a union of sets. For example, a trial in Method 1 takes the form:

1. Choose $i \in \{1, 2, \ldots, m\}$ with probability $\dfrac{\Pr[S_i]}{\sum_{j=1}^{m} \Pr[S_j]}$;
2. Choose configuration $s \in S_i$ with probability $\dfrac{\Pr[s]}{\Pr[S_i]}$;
3. For all $j \ne i$ test whether $s \in S_j$;
4. $X \leftarrow \dfrac{\sum_{i=1}^{m} \Pr[S_i]}{|\{j \mid s \in S_j\}|}$.

A number of trials sufficient for an $(\varepsilon,\delta)$ approximation is $\frac{9(m-1)}{2\varepsilon^2}\ln\frac{2}{\delta}$. A suitable modification of Method 3 runs in time proportional to the number of $st$-cuts times the number of edges. Further details and variations of this scheme, along with computational results, are reported in [2].

When the graph is planar the Monte-Carlo approach can be implemented without the explicit listing of $st$-cuts, and the running time can be bounded by a polynomial in the number of vertices and edges of the graph (rather than the number of $st$-cuts), provided we assume that the failure probabilities of the edges are sufficiently small [4].

Finally, we mention an unusual reliability problem to which our Monte-Carlo methods apply especially nicely. In the seepage problem every edge is normally off, but is on with a small probability $q_e$ (think of a pipe that normally does not conduct a noxious fluid, but may have a leaky valve). The network fails if the fluid can reach every node of the network from its source. This is equivalent to saying that all the edges of some spanning tree are on. Let the number of spanning trees be $m$, let the $i$th spanning tree be $T_i$, and let

$$q(T_i) = \prod_{e \in T_i} q_e.$$

Then $q(T_i)$ is the failure probability of $T_i$. In this case, each trial of the Monte-Carlo method gives an unbiased estimator $X$ of the failure probability of the network as follows:

1. Choose $T_i$ with probability $\dfrac{q(T_i)}{\sum_{j=1}^{m} q(T_j)}$;
2. Choose a configuration in which each edge of $T_i$ is on, and each edge $e$ not in $T_i$ is on with probability $q_e$ and off with probability $1 - q_e$;
3. In this configuration, let $r$ be the number of spanning trees all of whose edges are on;
4. $X \leftarrow \dfrac{\sum_{i=1}^{m} q(T_i)}{r}$.

With the help of Kirchhoff's matrix-tree theorem [3] we can express $\sum_{i=1}^{m} q(T_i)$ as a determinant that can be computed in polynomial time. Similarly, the matrix-tree theorem permits us to carry out all the steps within a trial in polynomial time. It can be shown that the number of trials needed for an $(\varepsilon,\delta)$ approximation is less than or equal to [expression not legible in the source]. Thus, the method is very effective when the expected number of edges that are on is small.

References

[1] Feller, W., An Introduction to Probability Theory and its Applications, Vol. I, Wiley, New York (1950).

[2] Karp, R.M. and Luby, M.G., "A New Monte-Carlo Method for Estimating the Failure Probability of an n-Component System", Report No. UCB/CSD 83/17, Computer Science Division (EECS), University of California at Berkeley.

[3] Kirchhoff, G., "Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Verteilung galvanischer Ströme geführt wird", Ann. Phys. Chem. 72 (1847), 497-508.

[4] Luby, M., "Monte Carlo Methods for Estimating System Reliability", Ph.D. Thesis, Computer Science Division, University of California at Berkeley (1983).

[5] Rényi, A., Probability Theory, North Holland, Amsterdam (1970).

[6] Tutte, W.T., "A Census of Planar Triangulations", Canad. J. Math. 14 (1962), 21-36.

[7] Weinberg, L., "Plane Representations and Codes for Planar Graphs", Proceedings of the Third Annual Allerton Conference on Circuit and System Theory (1965), 733-744.
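
As an illustration of Theorem 1 (Section 3), the following Python sketch evaluates the recurrence for S(d,n) with memoization and compares S(3,n) with Tutte's closed form for the number of labelled plane triangulations. The function names `S`, `tutte` and `case_weights` are illustrative choices and do not appear in the paper; the base cases encode the stated conventions S(d,0) = 1 for d >= 3 and S(2,n) = S(d,-1) = 0.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def S(d: int, n: int) -> int:
    """Number of label-isomorphism classes of labelled d-maps with n + d vertices."""
    if d < 3 or n < 0:          # conventions: S(2, n) = S(d, -1) = 0
        return 0
    if n == 0:                  # S(d, 0) = 1 for d = 3, 4, ...
        return 1
    total = S(d - 1, n) + S(d + 1, n - 1)      # Case 1 and Case d
    for d1 in range(4, d + 1):                 # Cases i = 3..d-1, with d1 = i + 1
        d2 = d + 3 - d1                        # d1 + d2 = d + 3
        for n1 in range(n):                    # n1 + n2 = n - 1
            total += S(d1, n1) * S(d2, n - 1 - n1)
    return total


def tutte(n: int) -> int:
    """Closed form 2 (4n+1)! / ((n+1)! (3n+2)!) for labelled triangulations."""
    return 2 * factorial(4 * n + 1) // (factorial(n + 1) * factorial(3 * n + 2))


def case_weights(d: int, n: int) -> dict:
    """Unnormalized weights of the cases used to pick a random d-map type.

    Keys are the case labels 1, 3, ..., d-1, d from the proof of Theorem 1;
    dividing each weight by S(d, n) gives the selection probability.
    """
    weights = {1: S(d - 1, n), d: S(d + 1, n - 1)}
    for i in range(3, d):
        weights[i] = sum(S(i + 1, n1) * S(d - i + 2, n - 1 - n1) for n1 in range(n))
    return weights


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, S(3, n), tutte(n))
```

The weights returned by `case_weights(d, n)` sum to `S(d, n)`, which is what makes the recursive sampling procedure uniform over label-isomorphism types.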
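
To make the union-of-sets trial of Section 4 and its DNF application in Section 5 concrete, here is a minimal Python sketch. It uses the simpler Method 1 trial (testing every term) rather than Method 3, and it assumes a hypothetical representation in which each term is a dict mapping a variable index to the truth value that the term forces; the helper names `satisfies`, `estimate_dnf_count`, `exact_dnf_count` and `trials_needed` are likewise illustrative and not taken from the paper.

```python
import math
import random
from itertools import product


def satisfies(assignment, term):
    """True if the assignment agrees with every literal of the term."""
    return all(assignment[v] == val for v, val in term.items())


def trials_needed(m, eps, delta):
    """Sufficient number of Method 1 trials: 9 (m - 1) / (2 eps^2) * ln(2 / delta)."""
    return math.ceil(9 * (m - 1) / (2 * eps ** 2) * math.log(2 / delta))


def estimate_dnf_count(terms, n, trials, rng):
    """Average of Method 1 trials for the number of satisfying assignments."""
    sizes = [2 ** (n - len(t)) for t in terms]       # |S_i| = 2^(n - n_i)
    total = sum(sizes)
    acc = 0.0
    for _ in range(trials):
        # Step 1: choose i with probability |S_i| / sum_j |S_j|.
        i = rng.choices(range(len(terms)), weights=sizes)[0]
        # Step 2: random element of S_i (forced variables fixed, others random).
        s = [rng.random() < 0.5 for _ in range(n)]
        for v, val in terms[i].items():
            s[v] = val
        # Step 3: count the terms satisfied by s (at least 1, namely term i).
        cover = sum(satisfies(s, t) for t in terms)
        # Step 4: X = sum_i |S_i| / |{j : s in S_j}|.
        acc += total / cover
    return acc / trials


def exact_dnf_count(terms, n):
    """Brute-force count, feasible only for small n; used as a check."""
    return sum(
        any(satisfies(a, t) for t in terms)
        for a in product([False, True], repeat=n)
    )


if __name__ == "__main__":
    terms = [{0: True, 1: True}, {1: False, 2: True}, {0: True, 3: False, 4: True}]
    rng = random.Random(0)
    t = trials_needed(len(terms), eps=0.1, delta=0.05)
    print(exact_dnf_count(terms, 6), estimate_dnf_count(terms, 6, t, rng))
```

Replacing the exhaustive count in step 3 by repeated random draws of j, as in Method 3, would remove the factor of m from the cost of a trial.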
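
For the seepage problem of Section 6, the sum of q(T_i) over all spanning trees can be obtained from the weighted form of the matrix-tree theorem: it equals the determinant of the weighted Laplacian with one row and the matching column deleted. The Python sketch below checks this on a small graph against brute-force enumeration. The edge list format `(u, v, q)` and the function names are assumptions made for illustration; exact rational arithmetic is used so that the two sides can be compared for equality.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Determinant by exact Gaussian elimination over Fractions."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result                 # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result


def tree_weight_sum(num_vertices, edges):
    """Sum over spanning trees T of prod_{e in T} q_e, via the reduced Laplacian."""
    lap = [[Fraction(0)] * num_vertices for _ in range(num_vertices)]
    for u, v, q in edges:
        lap[u][u] += q
        lap[v][v] += q
        lap[u][v] -= q
        lap[v][u] -= q
    reduced = [row[1:] for row in lap[1:]]   # delete row and column of vertex 0
    return det(reduced)


def is_spanning_tree(num_vertices, subset):
    """A set of num_vertices - 1 edges is a spanning tree iff it is acyclic."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in subset:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


def brute_force_sum(num_vertices, edges):
    """Same quantity by enumerating every candidate edge subset (small graphs only)."""
    total = Fraction(0)
    for subset in combinations(edges, num_vertices - 1):
        if is_spanning_tree(num_vertices, subset):
            weight = Fraction(1)
            for _, _, q in subset:
                weight *= q
            total += weight
    return total
```

With all weights equal to 1 the determinant counts spanning trees, which gives a quick sanity check on a complete graph.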