Monte-Carlo Algorithms for Enumeration and
Reliability Problems
Richard M. Karp†
University of California at Berkeley
Michael Luby†
University of Toronto
†Research supported by NSF Grant MCS-81-05217
0272-5428/83/0000/0056$01.00 © 1983 IEEE

1. Introduction
We present a simple but very general Monte-Carlo technique for the approximate solution of enumeration and reliability problems. Several applications are given, including:
1. Estimating the number of triangulated plane maps with a given number of vertices;
2. Estimating the cardinality of a union of sets;
3. Estimating the number of input combinations for which a boolean function, presented in disjunctive normal form, assumes the value true;
4. Estimating the failure probability of a system with faulty components.

1.1 Randomized Approximation Algorithms and Approximation Schemes
Let $f$ be a function from some domain $D$ into the positive reals. We shall be concerned with randomized algorithms which accept as input any $w \in D$ and produce as output a positive real number $\tilde f(w)$ which is an estimate of $f(w)$. Since the algorithm involves randomization, $\tilde f(w)$ is a random variable, rather than a constant, for each fixed $w$.
Such a randomized algorithm is called an $(\varepsilon,\delta)$ approximation algorithm for $f$ if, for every input $w \in D$,
$$\Pr\left[\frac{|\tilde f(w) - f(w)|}{f(w)} > \varepsilon\right] < \delta.$$
In a similar spirit, we can discuss randomized approximation methods in which $\varepsilon$ and $\delta$, as well as $w$, are part of the input. Such a randomized algorithm is called a randomized approximation scheme for $f$ if, for every input triple $(\varepsilon,\delta,w)$, where $w \in D$, $\varepsilon > 0$ and $0 < \delta < 1$, the algorithm produces as output a real number $\tilde f_{\varepsilon,\delta}(w)$ such that
$$\Pr\left[\frac{|\tilde f_{\varepsilon,\delta}(w) - f(w)|}{f(w)} > \varepsilon\right] < \delta.$$
In cases where the domain $D$ is a set of strings, a randomized approximation scheme is called fully polynomial if its execution time is bounded by a polynomial in $1/\varepsilon$, $1/\delta$ and the length of $w$. We derive randomized approximation schemes for the problems mentioned above. In particular, we give a fully polynomial scheme for estimating the number of input combinations that make a disjunctive normal form boolean formula true. Thus we have a fully polynomial randomized approximation scheme for a #P-complete problem.

2. Counting Equivalence Classes
The general principles underlying all our results can be described abstractly as follows. Let $S$ be a finite set on which an equivalence relation $\sim$ is defined. We wish to estimate the number of equivalence classes into which $\sim$ partitions $S$. The number of equivalence classes will be denoted $|S/{\sim}|$. We assume that $|S|$, the cardinality of $S$, is known. Let $[x]$ denote the equivalence class containing $x$.
We give two Monte-Carlo methods for estimating $|S/{\sim}|$. Each of these methods executes $t$ trials. The estimator of $|S/{\sim}|$ is $\frac{X_1 + X_2 + \cdots + X_t}{t}$, where $X_i$ is the result of the $i$-th trial. The random variables $X_i$ are independent and identically distributed, and $E[X_i] = |S/{\sim}|$.
Each of the methods requires a procedure for choosing elements at random from $S$. Method 1 assumes that, given $x$, we can determine the number of elements in $[x]$. Method 2 assumes that each equivalence class contains a canonical representative, and that it is possible to determine whether a given element $x$ is the canonical representative of its equivalence class. In Method 1, the $i$-th trial is conducted as follows:
    Choose a random element $x \in S$
    $X_i \leftarrow \frac{|S|}{|[x]|}$
In Method 2, the $i$-th trial is conducted as follows:
    Choose a random element $x \in S$
    If $x$ is the canonical representative of $[x]$ then $X_i \leftarrow |S|$ else $X_i \leftarrow 0$.
It is easily verified that, in each case, $E[X_i] = |S/{\sim}|$.
In determining the number of trials required in an $(\varepsilon,\delta)$ approximation scheme for estimating $|S/{\sim}|$, the quantity $p = \frac{|S/{\sim}|}{|S|}$ plays a crucial role. We require the following lemma.
Lemma 1: Let $X$ be a random variable which has mean $p$ and always assumes values in the range $[0,1]$. Let $\{X_i\}$ be a sequence of independent random variables, each of which has the same distribution as $X$. Then, for every $\varepsilon \in (0,1)$,
$$\Pr\left[\left|\frac{X_1 + X_2 + \cdots + X_t}{t} - p\right| > \varepsilon p\right] < 2\exp\left(-\frac{2\varepsilon^2 t p}{9(1-p)}\right).$$
The proof follows from Theorem 2, p. 41, of Rényi [5].
Corollary: If $\frac{|S/{\sim}|}{|S|} \ge p$, then the number of trials sufficient in Method 1 or Method 2 to obtain an $(\varepsilon,\delta)$ approximation scheme is
$$\frac{9(1-p)}{2\varepsilon^2 p}\ln\frac{2}{\delta}.$$
Our various Monte-Carlo algorithms will be obtained by choosing a set $S$ and an equivalence relation $\sim$ such that the quantity we wish to estimate is $|S/{\sim}|$. In each case, the efficiency of the method will hinge on the observation that $p = \frac{|S/{\sim}|}{|S|}$ is not too small.

3. Estimating the Number of Plane Triangulations
A plane triangulation is a connected plane map in which every face is bounded by three edges and no two edges are incident with the same pair of vertices. Let $T_1$ and $T_2$ be plane triangulations. Then $T_1$ and $T_2$ are called isomorphic if there is a one-to-one mapping $g$ of the vertices of $T_1$ onto the vertices of $T_2$ such that
1. two distinct vertices $v$ and $w$ of $T_1$ are joined by an edge of $T_1$ if and only if $g(v)$ and $g(w)$ are joined by an edge of $T_2$, and
2. three distinct vertices $u$, $v$ and $w$ of $T_1$ lie on a common face of $T_1$ if and only if $g(u)$, $g(v)$ and $g(w)$ lie on a common face of $T_2$.
Let $U_n$ denote the number of isomorphism types of plane triangulations with $n+3$ vertices. For example, $U_3 = 2$, and the two isomorphism types are indicated in Figure 1.

Figure 1. - The Two Types of Plane Triangulations with Six Vertices
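
As an illustration of Section 2, the two trial procedures and the corollary's trial count can be sketched in a few lines of Python. This is only a minimal sketch: the toy set (the integers 0 to 59 under congruence modulo 7), the helper names, and the choice of 5000 trials are assumptions made for the example and do not come from the paper. The trial count assumes the bound of Lemma 1 in the form $2\exp(-2\varepsilon^2 t p / (9(1-p)))$.

```python
import math
import random

def method1_trial(elements, class_size, rng):
    """One trial of Method 1: X = |S| / |[x]| for a uniformly random x in S."""
    x = rng.choice(elements)
    return len(elements) / class_size(x)

def method2_trial(elements, is_canonical, rng):
    """One trial of Method 2: X = |S| if x is a canonical representative, else 0."""
    x = rng.choice(elements)
    return len(elements) if is_canonical(x) else 0

def estimate(trial, t):
    """Average the results of t independent trials."""
    return sum(trial() for _ in range(t)) / t

def sufficient_trials(p, eps, delta):
    """A t for which 2*exp(-2*eps^2*t*p / (9*(1-p))) <= delta (assumed form of Lemma 1)."""
    return math.ceil(9 * (1 - p) / (2 * eps ** 2 * p) * math.log(2 / delta))

# Hypothetical toy instance: S = {0, ..., 59}, x ~ y iff x = y (mod 7), so 7 classes.
S_ELEMENTS = list(range(60))

def class_size(x):
    return sum(1 for y in S_ELEMENTS if y % 7 == x % 7)

def is_canonical(x):
    return x < 7  # the least element of each residue class

rng = random.Random(0)
est1 = estimate(lambda: method1_trial(S_ELEMENTS, class_size, rng), 5000)
est2 = estimate(lambda: method2_trial(S_ELEMENTS, is_canonical, rng), 5000)
```

Both `est1` and `est2` should land near 7, the true number of classes. Method 1 has much lower variance on this instance because every trial returns a value close to 7, whereas Method 2 returns either 60 or 0.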
The problem of computing $U_n$ is unsolved, but Tutte [6] has solved a related problem in which the three vertices of some face are distinguished by special labels. Define a labelled plane triangulation as a plane triangulation in which one face is distinguished, its three vertices are labelled $a$, $b$ and $c$, and the other vertices remain unlabelled. Let $T_1$ and $T_2$ be labelled plane triangulations. Then $T_1$ and $T_2$ are called label isomorphic if there is a label-preserving one-to-one mapping of the vertices of $T_1$ onto the vertices of $T_2$ which preserves edges and faces, as in the preceding definition of isomorphism. Tutte proved that the number of label isomorphism equivalence classes is
$$L_n = \frac{2\,(4n+1)!}{(n+1)!\,(3n+2)!}.$$
We show that Tutte's classic result can be exploited to yield an $(\varepsilon,\delta)$ approximation scheme for estimating $U_n$. The execution time is proportional to $\frac{1}{\varepsilon^2}\ln\frac{1}{\delta}$ and polynomial in $n$. We use the formalism of the preceding section, choosing $S$ and $\sim$ as follows: $S$ is the set of all distinct (i.e., nonisomorphic) labelled plane triangulations with $n+3$ vertices. Hence, $|S| = L_n$. Two distinct labelled plane triangulations are equivalent under $\sim$ if these triangulations are isomorphic as unlabelled graphs. Hence, the number of equivalence classes is the number of distinct (nonisomorphic) unlabelled plane triangulations with $n+3$ vertices; $|S/{\sim}| = U_n$.
The efficient implementation of Monte-Carlo Method 1 depends on two observations. First, there is a randomized polynomial time algorithm for sampling from the labelled isomorphism types; i.e., for generating a labelled triangulation $T$ with $n+3$ vertices whose labelled isomorphism type is equally likely to be any one of the $L_n$ distinct types. Second, there is a polynomial time algorithm which computes certificates for labelled plane triangulations; two labelled triangulations are isomorphic if and only if their certificates are equal. Here, "polynomial time" means that the number of steps is bounded by a polynomial in $n$.
A single trial of Method 1 goes as follows.
1. Select a random labelled plane triangulation $T$ with $n+3$ vertices.
2. Let $G$ be the unlabelled map obtained by deleting the labels from $T$.
3. In all possible ways, select a face of $G$ and a labelling of that face. Since $G$ has $2(n+1)$ faces and each face can be labelled in $3!$ ways, this gives $12(n+1)$ labelled plane triangulations.
4. Compute the certificate of each of these labelled triangulations. Let $r$ be the number of distinct certificates so obtained. Then $r$ is the number of labelled isomorphism types contained in the same unlabelled isomorphism type as $T$, and our estimator of $U_n$ is $\frac{L_n}{r}$.
The value of $r$ never exceeds $12(n+1)$. Hence, $p \ge \frac{1}{12(n+1)}$, and the number of trials required for an $(\varepsilon,\delta)$ approximation to $U_n$ is less than or equal to
$$\frac{108n + 99}{2\varepsilon^2}\ln\frac{2}{\delta}.$$
For clarity, we illustrate a typical trial, in which $n = 3$, so $|S| = L_3 = 13$. Suppose the labelled map of Figure 2 is chosen in Step 1.

Figure 2. - A Labelled Triangulation

There are 48 labelled triangulations derived from this map. Among them, twelve distinct labelled isomorphism types occur. Hence $r = 12$ and our estimator of $U_3$ for this trial is $\frac{L_3}{r} = \frac{13}{12}$.
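
To give a feel for the size of the trial bound, here is a rough worked instance. It assumes the corollary of Section 2 in the form $t \ge \frac{9(1-p)}{2\varepsilon^2 p}\ln\frac{2}{\delta}$ together with $p \ge \frac{1}{12(n+1)}$; the accuracy targets $\varepsilon = 0.1$ and $\delta = 0.05$ are hypothetical values chosen only for the arithmetic.

```latex
% With 1/p <= 12(n+1):  9(1-p)/(2 eps^2 p) = 9(1/p - 1)/(2 eps^2) <= (108n + 99)/(2 eps^2).
% For n = 3, eps = 0.1, delta = 0.05:
\[
  \frac{108 \cdot 3 + 99}{2 \cdot (0.1)^2}\,\ln\frac{2}{0.05}
  \;=\; \frac{423}{0.02}\,\ln 40
  \;\approx\; 21150 \times 3.689
  \;\approx\; 7.8 \times 10^{4}.
\]
```

So, under these assumptions, roughly 78,000 trials suffice for $n = 3$, and the count grows only linearly in $n$.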
We now show how to generate a random labelled plane triangulation, and how to compute the certificate of a given labelled plane triangulation. Both of these computations are recursive and require a generalization of the concept of a labelled plane triangulation. For $d = 3,4,\ldots$ define a labelled $d$-map as a plane map in which
(a) one face $F$ with $d$ edges on its boundary is distinguished;
(b) each of the other faces has exactly three edges on its boundary;
(c) no two edges join the same pair of vertices;
(d) the vertices on the boundary of $F$ are labelled $a_1, a_2, \ldots, a_d$ in cyclic order;
(e) each edge which is not on the boundary of $F$, but joins two vertices on the boundary of $F$, is incident with $a_1$; $a_1$ is called the root of the labelled $d$-map.
We see that the labelled plane triangulations are just the labelled 3-maps. Two labelled $d$-maps $M_1$ and $M_2$ are called label isomorphic if there is a label-preserving one-to-one mapping of the vertices of $M_1$ onto the vertices of $M_2$ which preserves edges and faces.

Figure 3. - The Isomorphism Types of Labelled 4-Maps with Five Vertices

Let $S(d,n)$ denote the number of label isomorphism equivalence classes of labelled $d$-maps with $n+d$ vertices. Then $L_n = S(3,n)$. Figure 3 illustrates that $S(4,1) = 3$. Also, $S(d,0) = 1$ for $d = 3,4,\ldots$. By convention, $S(2,n) = S(d,-1) = 0$. Let $ab$ denote the edge joining $a$ and $b$. Let $\langle b_1, b_2, \ldots, b_r \rangle$ denote the bounded region with vertices $b_1, b_2, \ldots, b_r$ and edges $b_1b_2, b_2b_3, \ldots, b_rb_1$ on its boundary.
Theorem 1: For $d = 3,4,5,\ldots$ and $n = 1,2,\ldots$
$$S(d,n) = S(d-1,n) + S(d+1,n-1) + \sum_{\substack{n_1+n_2=n-1 \\ 4 \le d_1 \le d \\ d_1+d_2=d+3}} S(d_1,n_1)\cdot S(d_2,n_2).$$
Proof: The distinct labelled $d$-maps with $n+d$ vertices correspond to the triangulations of $\langle a_1, a_2, \ldots, a_d \rangle$ such that every edge joining two vertices of the boundary is incident with the root $a_1$. We select a unique triangle $T$ in this triangulation, according to the following case analysis.
CASE 1. $\langle a_1, a_2, a_3 \rangle$ is a triangle. Then $T = \langle a_1, a_2, a_3 \rangle$. This case can occur in $S(d-1,n)$ ways, corresponding to the triangulations of $\langle a_1, a_3, \ldots, a_d \rangle$ with $n+d-1$ vertices and $a_1$ as root.
CASE $i$, $i = 3,4,\ldots,d-1$. Root $a_1$ is adjacent to $a_i$ and is not adjacent to any $a_j$, $2 < j < i$. Within $\langle a_1, a_2, \ldots, a_i \rangle$, $T = \langle a_1, b, a_i \rangle$ is the unique triangle containing $a_1a_i$. This case can occur in
$$\sum_{n_1=0}^{n-1} S(i+1,n_1)\cdot S(d-i+2,\,n-1-n_1)$$
ways, corresponding to the ways of triangulating $\langle b, a_1, a_2, \ldots, a_i \rangle$ and $\langle a_1, a_i, a_{i+1}, \ldots, a_d \rangle$, with respective roots $b$ and $a_1$, and with $n-1$ vertices in addition to $a_1, a_2, \ldots, a_d$ and $b$.
CASE $d$. Root $a_1$ is adjacent to none of $a_3, a_4, \ldots, a_{d-1}$. In this case $a_1a_2$ lies in a unique triangle $T = \langle a_1, b, a_2 \rangle$. This case can occur in $S(d+1,n-1)$ ways, corresponding to the triangulations of $\langle b, a_2, a_3, \ldots, a_d, a_1 \rangle$ with $b$ as root.
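
The recurrence of Theorem 1 is easy to evaluate mechanically, which also gives a quick consistency check against Tutte's closed form $L_n = S(3,n)$. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the index ranges ($4 \le d_1 \le d$, $d_1 + d_2 = d + 3$, $n_1 + n_2 = n - 1$) together with the conventions $S(2,n) = S(d,-1) = 0$ and $S(d,0) = 1$ are taken as read from the statement of the theorem.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def count_d_maps(d, n):
    """S(d, n): label-isomorphism types of labelled d-maps with n + d vertices."""
    if n < 0 or d < 3:
        return 0  # conventions S(2, n) = S(d, -1) = 0
    if n == 0:
        return 1  # S(d, 0) = 1 for d >= 3
    # Case 1 and Case d of the proof.
    total = count_d_maps(d - 1, n) + count_d_maps(d + 1, n - 1)
    # Cases i = 3, ..., d-1, written with d1 = i + 1 and d2 = d - i + 2.
    for d1 in range(4, d + 1):
        d2 = d + 3 - d1
        for n1 in range(n):
            total += count_d_maps(d1, n1) * count_d_maps(d2, n - 1 - n1)
    return total

def tutte_count(n):
    """Tutte's closed form L_n = 2 (4n+1)! / ((n+1)! (3n+2)!)."""
    return 2 * factorial(4 * n + 1) // (factorial(n + 1) * factorial(3 * n + 2))

small_values = [count_d_maps(3, n) for n in range(5)]  # S(3, 0), ..., S(3, 4)
```

For example, `count_d_maps(4, 1)` evaluates to 3, matching Figure 3, and `count_d_maps(3, 3)` evaluates to 13, matching $L_3 = 13$. The same table of values is what a recursive sampler would need in order to pick each case with the right probability.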
Figure 4. - Cases in the Proof of Theorem 1.

The case analysis in the proof of Theorem 1 suggests a recursive way of selecting at random a labelled isomorphism type of $d$-map with $n+d$ vertices. The first step is to select one of the cases. Case 1 is chosen with probability $\frac{S(d-1,n)}{S(d,n)}$, case $i$ with probability
$$\frac{\sum_{n_1=0}^{n-1} S(i+1,n_1)\cdot S(d-i+2,\,n-1-n_1)}{S(d,n)}, \quad i = 3,4,\ldots,d-1,$$
and case $d$ with probability $\frac{S(d+1,n-1)}{S(d,n)}$. If, for example, case $i$ is selected, $3 \le i \le d-1$, then a given value for $n_1$ is chosen with probability
$$\frac{S(i+1,n_1)\cdot S(d-i+2,\,n-1-n_1)}{\sum_{n_1'=0}^{n-1} S(i+1,n_1')\cdot S(d-i+2,\,n-1-n_1')}.$$
Then, recursively, one of the $S(i+1,n_1)$ types of triangulations of $\langle b, a_1, a_2, \ldots, a_i \rangle$ with root $b$ and $n_1$ internal vertices is selected at random, along with one of the $S(d-i+2,\,n-1-n_1)$ types of triangulations of $\langle a_1, a_i, a_{i+1}, \ldots, a_d \rangle$ with root $a_1$ and $n-1-n_1$ internal vertices.
The certificate of a given labelled $d$-map is obtained by numbering the vertices in a canonical way. Two $d$-maps are isomorphic if and only if their associated numbered maps have exactly the same vertices and edges. The canonical numbering procedure follows the case analysis of Theorem 1. Procedure NUM($\langle a_1, a_2, \ldots, a_d \rangle$) canonically numbers the vertices within or on the boundary of $\langle a_1, a_2, \ldots, a_d \rangle$, where $a_1$ is the root. Procedure INTNUM($\langle a_1, a_2, \ldots, a_d \rangle$) canonically numbers the interior vertices of $\langle a_1, a_2, \ldots, a_d \rangle$, where $a_1$ is the root. The command num($x$) numbers vertex $x$ with the least positive integer not previously assigned as the number of a vertex.

Procedure NUM($\langle a_1, a_2, \ldots, a_d \rangle$)
    For $i = 1$ to $d$ do num($a_i$);
    INTNUM($\langle a_1, a_2, \ldots, a_d \rangle$)

Procedure INTNUM($\langle a_1, a_2, \ldots, a_d \rangle$)
    Determine which case holds in the case analysis of Theorem 1.
    CASE 1.
        INTNUM($\langle a_1, a_3, \ldots, a_d \rangle$)
    CASE $i$, $i = 3,4,\ldots,d-1$.
        num($b$);
        INTNUM($\langle b, a_1, a_2, \ldots, a_i \rangle$);
        INTNUM($\langle a_1, a_i, a_{i+1}, \ldots, a_d \rangle$)
    CASE $d$.
        num($b$);
        INTNUM($\langle b, a_2, a_3, \ldots, a_d, a_1 \rangle$)

An alternate way of assigning certificates to labelled $d$-maps is given in [7].

4. Estimating the Cardinality of a Union of Sets
The problem of computing the cardinality of a union of finite sets is a fundamental one in combinatorics, and it is usually attacked using the Principle of Inclusion and Exclusion:
$$\left|\bigcup_{i=1}^{m} S_i\right| = \sum_{i} |S_i| - \sum_{i<j} |S_i \cap S_j| + \sum_{i<j<k} |S_i \cap S_j \cap S_k| - \cdots + (-1)^{m+1}\,|S_1 \cap S_2 \cap \cdots \cap S_m|.$$
This expression for $\left|\bigcup_{i=1}^{m} S_i\right|$ entails $2^m - 1$ terms, and thus is inconvenient for computation when $m$ is large. Our Monte-Carlo methods provide a very attractive alternative if one is willing to settle for a reliable approximation rather than an exact count.
We apply Method 1, with $S = \{\langle s,i \rangle \mid s \in S_i\}$ and $\langle s,i \rangle \sim \langle s',i' \rangle$ if and only if $s = s'$. The number of equivalence classes is clearly equal to the number of elements in $\bigcup_{i=1}^{m} S_i$, and $|S| = \sum_{i=1}^{m} |S_i|$.
We assume that, for each $i$:
1. $|S_i|$ is known;
2. It is possible to choose a random element of $S_i$;
3. It is possible to decide whether a given element $s$ lies in $S_i$.
Then a trial in Method 1 can be implemented as follows.
1. Choose $i \in \{1,2,\ldots,m\}$ with probability $\frac{|S_i|}{\sum_{j=1}^{m} |S_j|}$;
2. Choose a random element $s \in S_i$; (the pair $\langle s,i \rangle$ has now been chosen)
3. For all $j \ne i$ test whether $s \in S_j$;
4. $X \leftarrow \frac{\sum_{i=1}^{m} |S_i|}{|\{j \mid s \in S_j\}|}$.
Then $X$ is the required estimator. In this case, $p \ge \frac{1}{m}$. Hence,
$$\frac{9(m-1)}{2\varepsilon^2}\ln\frac{2}{\delta}$$
trials are sufficient for an $(\varepsilon,\delta)$ approximation.
The foregoing method has the disadvantage that, in order to determine $|\{j \mid s \in S_j\}|$, it is necessary to test the membership of $s$ in each set $S_j$, $j \ne i$. To avoid $m-1$ membership tests in each trial we can resort to Monte-Carlo estimation of $|\{j \mid s \in S_j\}|$. The estimation is done by repeatedly drawing a $j$ at random from $\{1,2,\ldots,m\}$ and testing whether $s \in S_j$. If it takes $l$ drawings to obtain an $S_j$ containing $s$, then the estimator of $|\{j \mid s \in S_j\}|$ is $\frac{m}{l}$ and, accordingly, the estimator of $\left|\bigcup_{i=1}^{m} S_i\right|$ is $\frac{l}{m}\sum_{i=1}^{m} |S_i|$. Unlike Methods 1 and 2, which perform a predetermined number of trials, this method terminates when elements have been drawn at random from $\{1,2,\ldots,m\}$ a specified number of times; since the number of drawings in a trial is a random variable, the number of trials executed is not fixed in advance.
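
The Method 1 trial for a union of sets, with its $m-1$ membership tests per trial, can be sketched as follows. This is a minimal illustration under stated assumptions: the sets are given as explicit Python `set` objects (so that sizes, sampling and membership are all trivially available), and the function name, the example sets and the number of trials are hypothetical.

```python
import random

def union_size_method1(sets, t, rng):
    """Estimate |S_1 u ... u S_m| by averaging t trials of Method 1."""
    pools = [sorted(s) for s in sets]      # fixed orderings for uniform sampling
    sizes = [len(p) for p in pools]
    total = sum(sizes)                     # |S| = sum of the |S_i|
    acc = 0.0
    for _ in range(t):
        # Step 1: choose i with probability |S_i| / sum_j |S_j|.
        i = rng.choices(range(len(sets)), weights=sizes)[0]
        # Step 2: choose a random element s of S_i.
        s = rng.choice(pools[i])
        # Step 3: count the sets containing s (membership tests).
        cover = sum(1 for other in sets if s in other)
        # Step 4: X = sum_i |S_i| / |{j : s in S_j}|.
        acc += total / cover
    return acc / t

# Hypothetical example: three overlapping intervals whose union is {1, ..., 12}.
example_sets = [set(range(1, 7)), set(range(4, 10)), set(range(6, 13))]
union_estimate = union_size_method1(example_sets, 4000, random.Random(0))
```

Here `union_estimate` should be close to 12. Each trial costs one membership test per set, which is the overhead the sampled estimate of $|\{j \mid s \in S_j\}|$ described above is meant to avoid.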
Method 3
    $T \leftarrow \max\left(\frac{500m}{\varepsilon^2\delta},\ \frac{5m}{[\text{illegible}]}\right)$; trials $\leftarrow 0$; draws $\leftarrow 0$;
    repeat until draws $\ge T$
    begin
        choose $i \in \{1,2,\ldots,m\}$ with probability $\frac{|S_i|}{\sum_{j=1}^{m} |S_j|}$;
        choose a random element $s \in S_i$;
        $l \leftarrow 0$;
        do until a $j$ is chosen such that $s \in S_j$
        begin
            draws $\leftarrow$ draws $+ 1$; $l \leftarrow l + 1$;
            choose $j$ at random in $\{1,2,\ldots,m\}$
        end;
        trials $\leftarrow$ trials $+ 1$; $Y_{\text{trials}} \leftarrow l$
    end;
    $X \leftarrow \frac{Y_1 + Y_2 + \cdots + Y_{\text{trials}}}{m \cdot \text{trials}} \sum_{i=1}^{m} |S_i|$
    { $X$ is the estimator of $\left|\bigcup_{i=1}^{m} S_i\right|$ }

The execution time of Method 3 is dominated by the time to perform tests of the form "is $s \in S_j$?" For fixed $\varepsilon$ and $\delta$ the expected number of such tests is $O(m)$, rather than $O(m^2)$, as in Method 1.
Theorem 2: Method 3 is an $(\varepsilon,\delta)$ approximation scheme.
Proof: The $Y_k$ are independent and identically distributed, since they correspond to independent repetitions of the same random experiment. Straightforward calculation shows that
$$E[Y_1] = \frac{m\left|\bigcup_{i=1}^{m} S_i\right|}{\sum_{i=1}^{m} |S_i|}$$
and $\mathrm{Var}[Y_1] \le m\,E[Y_1]$. Let $\mu$ denote $E[Y_1]$. The algorithm observes $Y_1, Y_2, \ldots, Y_{\text{trials}}$, where trials $= \min\{k \mid Y_1 + Y_2 + \cdots + Y_k \ge T\}$. It suffices to show that
$$\Pr\left[\left|\frac{Y_1 + Y_2 + \cdots + Y_{\text{trials}}}{\text{trials}} - \mu\right| > \varepsilon\mu\right] < \delta.$$
For this purpose we invoke Kolmogorov's inequality ([1], p. 220): let $\{Y_k\}$ be a sequence of independent identically distributed random variables having mean $\mu$ and variance $\sigma^2$. Then, for each positive integer $N$ and each positive real $x$,
$$\Pr\left[\exists \ell,\ 1 \le \ell \le N \text{ and } |Y_1 + Y_2 + \cdots + Y_\ell - \ell\mu| \ge x\sigma\sqrt{N}\right] \le \frac{1}{x^2}.$$
Applying Kolmogorov's inequality with $x = \sqrt{3/\delta}$ and setting $N = \lceil 1.1\,T/\mu \rceil$, and recalling that $\mathrm{Var}[Y_1] \le m\mu$, we obtain:
$$\Pr\left[\exists \ell,\ 1 \le \ell \le N \text{ and } |Y_1 + Y_2 + \cdots + Y_\ell - \ell\mu| > \sqrt{\frac{3Nm\mu}{\delta}}\right] \le \frac{\delta}{3}.$$
A second index $L$ is defined similarly (its definition is illegible in the source), and the two tail probabilities $\Pr[Y_1 + Y_2 + \cdots + Y_N < T]$ and $\Pr[Y_1 + Y_2 + \cdots + Y_L > T]$ are each bounded by $\frac{\delta}{3}$. Hence, with probability at least $1 - \delta$, $L \le \text{trials} \le N$ and
$$\left|\frac{Y_1 + Y_2 + \cdots + Y_{\text{trials}}}{\text{trials}} - \mu\right| \le \frac{1}{L}\sqrt{\frac{3Nm\mu}{\delta}}.$$
Since $\frac{1}{L}\sqrt{\frac{3Nm\mu}{\delta}} \le \varepsilon\mu$, the proof is complete.

5. An Application to DNF Formulas
Given a Boolean formula in disjunctive normal form, we would like to estimate the number of input combinations that make the formula true. Suppose the formula contains $n$ variables and is the disjunction of $m$ terms. Then we need to estimate $\left|\bigcup_{i=1}^{m} S_i\right|$, where $S_i$ is a subset of $\{0,1\}^n$ consisting of those input combinations that make the $i$-th term true. The prerequisites for an efficient implementation hold. $|S_i| = 2^{n-n_i}$, where $n_i$ is the number of literals occurring in the $i$-th term. It is easy to select an element of $S_i$ at random; the values of $n_i$ variables are forced, and the other $n - n_i$ variables can be chosen independently at random. To determine whether a given input combination lies in $S_i$ requires only $n_i$ bit inspections. With reasonable assumptions about the format in which the DNF formula is presented, Method 3 gives a fully polynomial randomized approximation scheme which, for each fixed $\varepsilon$ and $\delta$, runs in $O(m \cdot n)$ time. It is remarkable that such a scheme exists for a #P-complete problem. The reader should note, however, that the scheme depends crucially on the assumption that the formula is in disjunctive normal form; one can hardly expect such a favorable result for formulas in conjunctive normal form, since such a result would imply a randomized polynomial time algorithm for the Satisfiability problem.
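
The DNF application of Method 3 can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes each term is a Python dict mapping a variable index to its forced truth value (so every term is satisfiable), and it takes the draw budget $T$ as an explicit parameter `draw_budget` rather than computing it from $\varepsilon$ and $\delta$. All names and the example formula are hypothetical.

```python
import itertools
import random

def satisfies(term, assignment):
    """Inspect only the n_i variables that the term mentions."""
    return all(assignment[v] == b for v, b in term.items())

def dnf_count_method3(terms, n, draw_budget, rng):
    """Estimate the number of satisfying assignments of a DNF formula (Method 3)."""
    m = len(terms)
    sizes = [2 ** (n - len(term)) for term in terms]   # |S_i| = 2^(n - n_i)
    total = sum(sizes)
    draws = trials = 0
    while draws < draw_budget:
        # Choose term i with probability |S_i| / sum_j |S_j|.
        i = rng.choices(range(m), weights=sizes)[0]
        # Random element of S_i: forced literals fixed, other variables uniform.
        s = [rng.random() < 0.5 for _ in range(n)]
        for v, b in terms[i].items():
            s[v] = b
        # Draw terms uniformly until one is satisfied by s.
        while True:
            draws += 1
            j = rng.randrange(m)
            if satisfies(terms[j], s):
                break
        trials += 1
    # X = (Y_1 + ... + Y_trials) / (m * trials) * sum_i |S_i|, where the Y's sum to draws.
    return draws / (m * trials) * total

def brute_force_count(terms, n):
    """Exact count by enumeration, for checking small instances only."""
    return sum(
        any(satisfies(term, a) for term in terms)
        for a in itertools.product([False, True], repeat=n)
    )

# Hypothetical formula over x0..x3: (x0 & x1) | (~x0 & x2) | (x1 & x2).
example_terms = [{0: True, 1: True}, {0: False, 2: True}, {1: True, 2: True}]
dnf_estimate = dnf_count_method3(example_terms, 4, 20000, random.Random(0))
exact_count = brute_force_count(example_terms, 4)
```

On this small instance the estimate should land close to the exact count from `brute_force_count`. The inner loop always terminates because the term used to generate `s` is itself drawn with probability $1/m$ at each step.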
6. Applications to Network Reliability
The most significant applications of our Monte-Carlo methods are in the area of network reliability. Here, one is given a graph in which the edges are normally on (i.e., operating or working) but may be off (i.e., disabled or failing). We assume that each edge $e$ is off with a small probability $p_e$, independently of the other edges. A criterion for correct operation of the network is specified, and the task is to estimate the probability that the network fails to operate correctly. For example, we might specify that the network fails if a specified pair of distinguished vertices $s$ and $t$ cannot communicate; i.e., in every path between $s$ and $t$ at least one edge is off.
Define an st-cut as a minimal set of edges that intersects every st-path. Define a configuration as an assignment to each edge of one of the two states {on, off}. The probability of a configuration is
$$\prod_{\{e \,\mid\, \text{edge } e \text{ is off}\}} p_e \prod_{\{e \,\mid\, \text{edge } e \text{ is on}\}} (1 - p_e).$$
Let the number of st-cuts be $m$, and let $S_i$ be the set of configurations in which all the edges of the $i$-th cut are off. Then the set of configurations in which the network fails is $\bigcup_{i=1}^{m} S_i$, and we wish to estimate $\Pr\left[\bigcup_{i=1}^{m} S_i\right]$.
If an explicit list of the st-cuts is available then our methods can be applied directly, with a few minor changes because we are estimating the probability of a union of events rather than the cardinality of a union of sets. For example, a trial in Method 1 takes the form:
1. Choose $i \in \{1,2,\ldots,m\}$ with probability $\frac{\Pr[S_i]}{\sum_{j=1}^{m} \Pr[S_j]}$;
2. Choose configuration $s \in S_i$ with probability $\frac{\Pr[s]}{\Pr[S_i]}$;
3. For all $j \ne i$ test whether $s \in S_j$;
4. $X \leftarrow \frac{\sum_{i=1}^{m} \Pr[S_i]}{|\{j \mid s \in S_j\}|}$.
A number of trials sufficient for an $(\varepsilon,\delta)$ approximation is $\frac{9(m-1)}{2\varepsilon^2}\ln\frac{2}{\delta}$. A suitable modification of Method 3 runs in time proportional to the number of st-cuts times the number of edges. Further details and variations of this scheme, along with computational results, are reported in [2].
When the graph is planar the Monte-Carlo approach can be implemented without the explicit listing of st-cuts, and the running time can be bounded by a polynomial in the number of vertices and edges of the graph (rather than the number of st-cuts), provided we assume that the failure probabilities of the edges are sufficiently small [4].
Finally, we mention an unusual reliability problem to which our Monte-Carlo methods apply especially nicely. In the seepage problem every edge is normally off, but is on with a small probability $q_e$ (think of a pipe that normally does not conduct a noxious fluid, but may have a leaky valve). The network fails if the fluid can reach every node of the network from its source. This is equivalent to saying that all the edges of some spanning tree are on. Let the number of spanning trees be $m$, let the $i$-th spanning tree be $T_i$, and let
$$q(T_i) = \prod_{e \in T_i} q_e.$$
Then $q(T_i)$ is the failure probability of $T_i$. In this case, each trial of the Monte-Carlo method gives an unbiased estimator $X$ of the failure probability of the network as follows:
1. Choose $T_i$ with probability $\frac{q(T_i)}{\sum_{j=1}^{m} q(T_j)}$.
2. Choose a configuration in which each edge of $T_i$ is on, and each edge $e$ not in $T_i$ is on with probability $q_e$, and off with probability $1 - q_e$.
3. In this configuration, let $r$ be the number of spanning trees, all of whose edges are on.
4. $X \leftarrow \frac{\sum_{i=1}^{m} q(T_i)}{r}$.
With the help of Kirchhoff's matrix-tree theorem [3] we can express $\sum_{i=1}^{m} q(T_i)$ as a determinant that can be computed in polynomial time. Similarly, the matrix-tree theorem permits us to carry out all the steps within a trial in polynomial time. It can be shown that the number of trials needed for an $(\varepsilon,\delta)$ approximation is less than or equal to [expression illegible in the source]. Thus, the method is very effective when the expected number of edges that are off is small.

References
[1] Feller, W., An Introduction to Probability Theory and its Applications, V.1, Wiley, New York [1950].
[2] Karp, R.M. and Luby, M.G., "A New Monte-Carlo Method for Estimating the Failure Probability of an n-Component System", Report No. UCB/CSD 83/17, Computer Science Division (EECS), University of California at Berkeley.
[3] Kirchhoff, G., "Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Verteilung galvanischer Ströme geführt wird", Ann. Phys. Chem., 72 (1847), 497-508.
[4] Luby, M., "Monte Carlo Methods for Estimating System Reliability," Ph.D. Thesis, Computer Science Division, University of California at Berkeley [1983].
[5] Rényi, A., Probability Theory, North Holland, Amsterdam [1970].
[6] Tutte, W.T., "A Census of Planar Triangulations", Canad. J. Math. 14 (1962), 21-36.
[7] Weinberg, L., "Plane Representations and Codes for Planar Graphs", Proceedings of the Third Annual Allerton Conference on Circuit and System Theory, (1965), 733-744.