De««y ALFRED The P. WORKING PAPER SLOAN SCHOOL OF MANAGEMENT probabilistic Part I: minimum spajining tree problem complexity and combinatorial properties Dimitris Bertsimcis Sloan W.P. No. 2057-88 August 1988 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 50 MEMORIAL DRIVE CAMBRIDGE, MASSACHUSETTS 02139 The probabilistic Part I: minimum spanning tree problem complexity and combinatorial properties Dimitris Bertsimas Sloan W.P. No. 2057-88 August 1988 Abstract In this paper classical we consider a natural miaimum spanning probabilistic tree minimum spanning we consider the case where not probabilistic variation of the (MST) problem, which we PMST and problem (PMST). In particular, all the points are deterministically find a closed pected length of a given spanning we prove that the problem is tree. NP — PMST problem with the sign problem and in discuss the apfor the ex- Based on these expressions complete. MST We form expression some combinatorial properties of the problem, of the the tree present, but are present with certain probability. plications of the call We further examine establish the relation problem and the network de- examine some cases where the problem is solvable polynomial time. Key words: Probabilistic combinatorial optimization problems, ning tree, network design, NP — completeness. minimum span- Introduction 1 The classicad minimum spanning in combinatorial optimization. tree (MST) problem plays an important role possesses the matroidal property which It lows the greedy algorithm to solve the problem optimally, and thus summary prototype for problems solvable in polynomial time. For a properties and algorithms for From a tion, its practical point of view, problem. we the of its Steiglitz [7]. has important applications in transporta- communications, distribution systems, In this paper and solution see Papadimitriou it it is al- etc. consider a natural probabilistic variation of this classical In particular, we consider the case where not all the points are deterministically present, but are present with certain probability. Formally, given a weighted graph each subset S of V, minimum expected G = {V,E) and a probability of presence p{S) we want to construct an a priori spanning tree of length in the following sense: on any given instance of the problem delete the vertices and their adjacent edges vertices provided that the tree remains connected. an a priori spanning tree of minimum spanning of the is T PMST tree minimum expected (PMST) among length is for all i G V if in Figure every node is is part of a the probabilistic 1. If the a priori tree tree becomes present with probability Pi then the problem reduces to the classical This paper of finding problem. In order to clarify the definition problem, consider the example easily observe that the set of absent The problem and nodes 2,7,9 are the only ones not present, the One can for MST more general investigation Ti. = 1 problem. of the properties of combinatorial optimization problems when instances are modified probabilis- Figure The PMST [4] asymptotic theorems [1] Ph.D thesis of on the probabilistic traveling salesman problem (PTSP), where he posed the problem, examined some of PTSP methodology Interest in this class of problems started with the tically. Jaillet 1: are derived in the plane. In and the its combinatorial properties and proved Bertsimas results of Jaillet [4] [1] further properties of the are sharpened. Bertsima^ includes also results on the probabilistic vehicle routing problem, defined in Jaillet and Odoni and probabilistic [5] facility location problems. To our knowledge, the PMST ture despite intrinsic interest as well as its applicability. In which the its problem hcis never been defined before in the a sequel of the present paper, is PMST we perform and prove that surprisingly the PMST litera- Bertsimas [2], probabilistic analysis of problem equivalent to the strategy of re-optimization, in which is asymptotically we re-optimize every instance of the problem. In the next section while in section 3 for the we discuss some applications we address the question PMST problem, of finding an explicit expression expected length of an a priori tree T. 4 of the In section 4 we investigate we prove the complexity of the problem and the problem with all weights equal MST simplicity of the problem is is NP — complete, which in view of the a quite surprising result. In section 5 we examine some combinatorial properties PMST that even a restricted version of of the problem (bounds, relation of and MST, relation with re-optimization strategies and the network design problem). In section 6 the problem and find some we exploit these combinatorial properties of which are solvable in polynomial time. The 2 Discussion and Applications of the PMST special cases final section includes some concluding remarks. Problem It is natural to ask why the PMST problem is interesting. that the problem defines an efficient strategy to update tree solutions when problem's is first minimum observe spanning instances are modified probabilistically because of the absence of certain nodes from the graph. Ejp, where Tp We the optimal a priori tree. We Then denote this strategy with in the instance 5, i.e. when only nodes in the set S are present, the strategy produces a tree Tp{S) with length LjpiS), which is the length of the tree that connects nodes from the set S of present nodes using parts of Tp. In the context of this discussion the letter Two 1. S denotes the strategy used. possible other strategies are: A re-optimization strategy Sa/sTi in which ning tree (MST) we find the minimum of the set of present nodes in every instance. span- We denote with Lmst{S) the length of the 2. A MST of the nodes in the set 5. re-optimization strategy T^steiner-, in which we find the Steiner tree of the set of present nodes in every instance. minimum We denote with Lsteiner{S) the length of the Steiner tree of the nodes in the set S, using possibly nodes from the set Remark: The above V— S. definition of the re-optimization strategy ^steiner ap- network, as opposed to the case where the plies only for the case of a fixed points are located in the Euclidean plane. In this case Lsteiner{S) is the length of the Steiner tree in the plane of the points from the set S. Why don't we use these re-optimization strategies Ea/st, ^steineR', rather than the strategy Sj-p we are proposing? Concerning the 'E,steiner strategy, because the tree connecting the set S it is clear that Lsteiner{S) < using only parts of the tree Tp Ltj,{S), is also a solution to the Steiner problem. The disadvantage of the Steiner strategy is that which we have is to solve an NP — hard problem in every instance, something problem instances. feasible only for small With the strategy Sa/st it is clear that 0(|5|^), using the greedy algorithm, but it we can compute Lmst{S) is not clear that in LmstIS) < LxpiS). In fact in section 5 we construct examples where the probabilistic strategy Sj^ we prove we are proposing is better than the Ea/st- Furthermore, in [2] that asymptotically under reasonable probabilistic assumptions the probabilistic strategy Ej is at least as good as the Ea/st- What is more important is real-time strategy to modify the solution strategy satisfies this criterion, since the tree T{S) can be found 0(n) time as follows: 1. Start with the a priori tree T. 2. Until there are no if i an unmarked G 5 mark The 3. Since we it; instances we need a modified. PMST find when the applications sire Clearly, the in many the fact that in unmarked leaf in T; else delete resulting tree leaves in T: is i from T. the tree T{S). are only looking at every node at most once, this is an 0(n) algo- rithm. Note that the two re-optimization strategies are superlinear. In addition, we may not have the computer resources to re-optimize. important motivation in favor of the Ejp strategy is An even more that this strategy does not change the underlying network structure, while both the re-optimization strategies can result in a completely different problem instances, by adding new edges nication network for example, to create new communication it may network structure cind deleting old ones. In a the PMST let links for each us consider commu- be very expensive or even impractical problem instance. After this discussion of the various strategies available stances are modified, for different some potential areas when problem in- of application of problem. Consider for example a VLSI environment. Suppose on a circuit, there are n processors subject to failure and processor i becomes inactive with probability pi. Then we would connect the active pro- like to cessors using a spanning tree structure, which minimizes the manufacturing Communication cost. means that the cessors this two active processors through some inactive pro- of inactive processors allow communication. example changing the underlying network structure PMST strategy is is Since in impractical, the a good solution to the problem. communication network, nodes may represent communication cen- In a ters, arcs represent tion costs among communication centers. The links and link costs are the probability of failure blocked communication in center i. If p, is communica- the probability of the centers axe blocked, they can only be used to establish communication between unblocked centers. Then the problem is the of finding an a priori PMST A more unusual application of the problem following: minimum expected PMST problem is in the area of or- For instance, a rather intriguing paradigm for the in the area of organizational structure design might be the Suppose the n points that we wish to interconnect represent our agents or spies in a foreign country. They will undertake in the future a of missions, each mission involving a different subset of agents. in cost problem. ganizational structures. PMST network structure of our context, is We an instance of the problem. A series mission, are looking for an a priori organizational structure in which, for obvious rea.sons, each agent will know only the people immediately above or below him/her in the structure; this implies a spanning-tree-like structure. point i is The probability the a priori probability that agent any random mission undertaken bj' i will p, associated with have to participate in the network. For any given mission, only that part of the organization which is necessary to interconnect participating in that particuleir mission T of Figure 1 activated. (For example, is represents the network and agents and j j is will 1 the tree represents the network be activated.) The distance between points interpreted as the cost or risk of exposure incurred must communicate or work with each other. Given and the distance matrix if 6 and 8 must be in- 1, 3, 5, volved in a particular mission, the tree Ti of Figure and subset of agents that the agents all when agents p, for for all possible pairs (t,j), the = i PMST i i and 1,2, ... ,n gives the or- ganizational structure which, in the expected value sense, minimizes the risk random of exposure of the network on a In a routing context, a companj' tions, using a tree-like structure. mission. may want The cost an a having a demand at location priori spanning tree expected transportation for the p, represents the proba- The company would i. demand loca- between demand locations can represent transportation cost and the probability bility of demand to connect all locations, such as to like to find minimize the cost. Other areas of potential application can be in the areas of transportation and of strategic planning. One might object that all the examples idealization of reality. Nevertheless, the in many we have PMST is discussed represent some a generic problem, which applications where a particular type of randomness be a more appropriate model than the classical question of finding a spanning tree which than a solution which is is MST. It PMST present can also addresses the optimal on the average, rather optimal on a particular instance. characteristic therefore of the is problem is that it is The essential a more global problem than the is MST PMST problem; as well the optimal solution to the a robust solution. Unfortunately, as we prove in section 4, one pays for these nice properties (robustness, globality) by changing the complexity of the problem radically. While the MST problem is easily solvable, the The Expected Length 3 PMST problem is NP — hard. of a Given Spanning Tree As we noted in the previous section the PMST strategy for updating spanning tree solutions modified probabilistically in T we which connects nodes from the example of T. For in Figure when problem efficient instances axe response to the absence of certain nodes from the graph. Given an a priori tree tree problem defines an 1, S = define set S Lt{S) to be the length of the of present nodes using only parts {1,3,5,6,8} and Lt{S) is the length of the tree T^. Then if the set S of points present has probability p{S), the problem can be defined formally as follows: Problem definition: Given a graph E —* R, a. G = (V, E), not necessarily complete, a cost function c probability function p : 2^ —* [0, 1] we want to find a tree T : that minimizes the expected length E[Lt]'- E[Lt] = E PiS)LT{S), scv 10 (1) where the summation is taken over subsets of V, the instances of the all problem. at this level of generality Note that we can model dependencies cimong the An probabilities of presence of sets of nodes. additional observation with this formulation one would need 0(n2"), {\V\ We would expected length of a given tree T. The question we address efficiently. we can compute functions p{S) we define h{S) = Pr{ If Theorem none of the nodes is E[Lt] present } for = that n) effort to compute the be able to compute E[Lt] in this section efficiently in S like to = is is for which probability a given tree T. X^rcv-sPC-^)? then 1 Given an a T priori tree E[Lt] where Ke^V — obtained from = E A'e are T its expected length c(e){l - h{K,) is given by the expression - h{V - /C) + /i(V)}, (2) the subsets of nodes contained in the two subtrees by removing the edge e (see Figure 2). Proof: Given a tree E[Lt]- By T let how much each edge e ^ T the definition of the problem only the edges in this expectation. If A{Ke) = us consider at least we contributes to T contribute in define the events: one node in A'e is present, then the contribution of every edge e is c{e)Pr{A{K,)r]A{V-K\)}, because the edge e is used if and only if 11 there exists at least one node present V-K, Figure in A'e and at least 2: The one node present E\Lt] sets in K^, V— V- K, Ke- As a result, = X: c{e)Pr{A{K,) n A{V - K,)}. But Pr{.4(A'e)n/l(V-A'e)} = But we 1 - Pr{.4^(A',)} since = Pr{[A'{K,)UA%V-K,)Y} = l-Pr{A%K,)UA%V-K,)} - Pr{A'{V - Pr{A%K\)] = Prjnone A',)} + Pr{A'{h\) D A'{V - of the nodes in K^ is present} A%)}. = h{Ke), easily obtain (2). • Thus if instead of the probability function p{S) h{S) we can compute E[Lt] for compute h{S) we can in 0(1), since any given tree T we in are given the function 0(n), assuming we can find all the sets A'e for all e in 12 0(n) by An starting the computation at the leaves. in practice, when is the nodes are present independently. an explicit expression Theorem If node tree T interesting case, for and important Then we can find E[Lt]- 2 is i present with probability p,, then the expectation E[Lt] of a given given by the expression is £[ir] = E^(^){i- n(i-p.)}{i- n (i-p.)}- (3) Proof: In this case, because nodes are present independently h{S) = l[{l-p,). Substituting the above expression in From 0(|5|). 0(n) we can compute E[Lt] (3) By in we easily obtain (3). • 0{n^), since we can compute h{S) as follows: = Let a 2. Until node set - n.ev'(l is if i is a add to the set I E[Lt] in organizing the computation carefully, we can compute E[Lt] in 1. 3. (2), leaf, let = P.) : let a, = MARKED=set Let 1; empty: a,- = (1 - p.) MARKED Ee=(.,;)eT c{e){l - UjeMARKED, ; delete a,){\ - 13 i (.,»eT Oj; from T. a/a,). of leaves. An important special case E[Lt] If we is when = E^(^){1 - = pi p for all -Pr''}{l - (1 Then E[Lt] becomes i. (1 -pr''^''}. (4) define ^{k)^{i-{i-p)''}{i-{i-pr-'}, (5) then E[LT] = j:c{e)4>i\IQ). (6) eer Based on these closed form expressions we PMST the decision version of the c(e) is = 1 is NP — complete. An 4 we prove p, = = PMST 1 for all i and some key combinatorial properties and of problem. of the PMST Problem that the simplest possible case of the with equal weights c(e) p additional importance of the expressions (6) The Complexity In this section prove in the next section that problem, even with that they will assist us in deriving the optimal solution to the will p, = p is NP — complete. PMST We problem first define formally the decision version of this restricted problem. The Restricted Instance number Problem (RPMST) Given a graph : < p, Question PMST : Is p < 1 G= = 1 for all e £ E, a. and a bound B. there a spanning tree E[Lt] = {V,E), costs c(e) Z ^(^){1 - (1 - T for G with P)"^'''}{1 e6T 14 - (1 - P)"" "^'•'} < ^? rational In order to prove that the some properties RPMST of the function problem </>(/:) = (1 — NP — complete is x*)(l — a:""*), x we will = 1 —p need defined in (5). Proposition 3 The function < 1. If 2. (i>{k 3. 34>i^) it has the following properties: <f>{k) m < + m) < - f then , + (f){k) <l>{k) < 4>{m). <f){Tn). >0. 2(f>{i) Proof: These properties follow from elementary algebraic manipulations as easily follows: 1. - (f>{k) 2 + - -r 2 <i){m) = + 2. (j){k) 3. 3<j^(3) x"-^)(l (x'" - x*)(l - x"-'"-'^) < m m ^ < (1 - + x(l - x^) + x2(l - x)] > 0. RPMST is < if A: and 4- n. 4>{m) - = - 4>{k = 2(^(4) + 2x^ - (1 - x*)(l x^)[l - x"-3) + m) = 3(1 3j3) - = (1 - x"-'*)(l - - x"')(l - x)[l 2(1 + - x""''-'") x'')(l - > 0. x""") > • We now have all the required tools to prove that the NP — complete. Theorem The 4 RPMST problem is NP - complete. 15 problem . Proof: RPMST Cleaxly, belongs to the class NP, since given a tree pute E[Lt] in polynomial time (0(n)) and compare In order to prove the completeness of the B. NP - complete to [3]) problem EXACT COVER BY it T we can com- with the given bound problem we will reduce the 3-SETS (Garey and Johnson it. EXACT COVER BY Instance A : 3-SETS (E-3C) S = family {di,. . . ,a,] of 3-element subsets of a set C = .,C3c}. {Ci,. Question Is : there a subfamily 5i C 5 of pairwise disjoint sets such that Given an instance of the E-3C problem, we define the following instance of the RPMST problem: G={V,E), V = RliSuC, R= r {oo,. = + 3 E= . . ,ar}, 3c, {(a„ ao), 2 = 1, . . . , r) {r + (f>{k) = (1 Zc)<j>{\) - x'=)(l As an example 2,5 = 4, Let - if T"-''), G 5} U {(<7, c), c G a}, 1, 5 = x=l-p,n = r + l+s + 3c. {{ci, 02,03}, {02,03,05}, {02,04,05}, {04,05, ce}}, c be a feasible {E{Lt] show that <t), ct c4,[A), the corresponding graph T We now + {(gq, < p< p arbitrary rational with B= U if E[Lt\ is < B) < presented in Figure 3. spanning tree of G. Clearly B, then {ao,a) 6 16 T = for all (a,, oq) G T. a £ S. Suppose Figure E[Lt] > B. = example We si in Figure there exists (see Figure of nodes in = 4a, g, 4a). C— l,g^ some a £ for We i S. We G 5 and j show that C such that E. define {j} that are adjacent to = will / in T. In the 2. also define = number the vertices From 5i T G number the ^ T, Since {ao.a) (ao,t),(i, j),(i.cr) gi ^ T there exists only one (ao,cr) first RPMST Equivalent instances of E-3C and 3: from C in nodes T [l these definitions + 252 + 353 = We now E[Lt] of = 3c - = 5 — {i,cr] in T that are adjacent to exactly / 0,1,2,3). we 5, in get - <7^ - 1 =J> 53 = -(3c - 2^2 - s^ - g, - g„ - I). (7) write an expression for E[Lt]- r<^{l) + {3c-g,-g„-l)4>{l)-\-s,<i>{2) + S2<f>{3)-\-S34>{^) + 4>{9^+9c + 3)+ 17 where the first term (r^(l)) the second term in C is is from the contributions of the r edges from the contributions of the edges connecting the nodes except the ones that are connected with 3), (f){2-\-ga), i,(T, and the terms <f){g, 4>{\+g„) are from the contributions of the edges (ao,t), respectively. (a,, oq), (t, + g,, + j), {j,cr) Then E[Lt] > B= + (r + 3c)4>{l) c4>{i) ^ si4>i2)+s2(f>iZ)+s3(l>{4)+4>{9,+ga+^)+H^+ga)-\-Hi-\-g.)-H^)-c<f>{i) > 0. Substituting (7) we get E[Lt] > B^ - \s,[Z<t>{2) <?i(4)] + ^52[3<;5(3) - 2(^(4)] + ^[3<^(5.+5.+3)-(5.+5.)<^(4) + <^(2+5.)] + ^[2<^(2+5.)-<^(4)] + [<^(l+5.)-<i(l)]>0. Using proposition 3 we can positive a tree, there exist is Then if Tk-\ in If , i we add the edge (oo, k nodes in T tree T cri E S and (gq, <^k) which there are only k we denote the li,, the terms in j I in not connected to ao, Uj.u^^ be the number ^ C [ ] are strictly nodes (see Figure <7i, . . . , (j, cr^-i ak) we get a new tree not connected with oq. order to represent the fact that there are we claim that > E[Lt,_,]. of nodes in the subtrees respectively (see also Figure 4b). 4b). Since such that {ao,i),{i,j),{j,(Tk) € T. and delete the edge — with Tk ^T ),..., (accrj.) E[Lt,] Let all and thus E[Lt\ > B. Suppose now that there are T check that eaisily The from nodes i,j,crk contribution of edges in Tk,Tk-i that 18 Figure 4: The cases (oq, <t) ^ T and (oq, ctj), . . , . ^T (oq, (Tk) are not involved in the cycle created by adding the edge (cq, the same. (Tk) is Then E[Lt,]-E[Lt,_,] = (^(u. (f>{u, where <f){u, + + 1 Uj + l + Uj + H-u„, + l) + (/>(uj + H-u^, + l) + <A(u^, + l)- + + 1 + t/j + 1 + t;<,^ + - - <j>{u, + 1) 1), 4>{uj + 1 4- u,,;^ 1) + <?i(ti^, + 1), 1), 4>{u„^ + 1) are the contributions in Tk of (oo,i), (i,i) and {j,(Tk) respectively. Similarly in Tk-i the terms <f){u, + l + Uj + \), {ao,<^k) respectively. <f){u, +1+ E[Lt^_^]. Uj + 1) By and 4>{uj + l) and proposition 3 4>{uj +1+ u„^ we need u, + l) are from {ao,i), we have that + l) Note that we have used the proposition 3 to hold <f>{u„^ > <t>{uj fact r = <f>{ui + 5 1). + + l + Uj + H-ti<,* + 1 < 19 (i, j) and + l-\-Uj-\-l-\-u^^ + \) > As a result, E[Lt^] > 3c, since in order for 5 + 3c < ^ = i±l±2±3£. As a Making {O'Oi'^k)- = We will it now show one missing arc we have already proved (oq, cti), T to be feasible all edges (ao,(7,) e T. that using the quantities 5i find: >.. > E[LtA. E[Lt,_,] follows that for the tree The expected we > B. E[Lt] By > E[Lt,] tree Ti has only that in this case E[Lj^] Therefore, decreases by adding one missing arc this trajisformation inductively E[Lt] But since the T expected cost of result, the + T = is (r •<=>E>3C has a solution. = 1,2,3) defined above (/ + 2.':2 cost of E[Lt] si < B 353 0, = 3c, 5o + + 5i 52 + we have 53 = S. then given by + + 3c)<^(l) + s,4>{2) S2<f>{3) + 53<^(4). Thus E[Lt] <B^ ^-s,[3<p{2) From proposition 3, 3(^(2) inequality (8) holds is if + 5i<;5(2) - ,p{A)] <^(4) and only 52<^(3) and hence the RPMST (53 + ^52[3^(3) > and if Sj = equivalent to E-3C having a solution. solution, + problem 20 52 - - - 0. 2<^(4) > and hence Thus E[Lt] < is < < 2(^(4)] 3(?i(3) = c)<^(4) B NP — complete. <» (8) 0. 53 As a = c, result, which •O E-3C has a • We can add some insight to why the problem As p following remarkable fact. is easily solvable. <i>{k) — What is the limit as - (1 - = (1 hard by noticing the PMST approciches p — 0? In this case the 1 ^ is - the MST which > - p'k{n - k). (1 - p)"-*) \Ke\) is the objective function of another p)'){l .\s a result, — The expression Hegr c(e)|/Ve|(n famous problem, the NETWORK DESIGN PROBLEM on a tree, which is defined as follows: NETWORK DESIGN PROBLEM Instance Question sum the A : : Is graph G= there a spanning tree T for G such that, ^ W{{u,v)]) denotes — |A'e|). NP — complete in Johnson, Lenstra and Rinnooy problem approaches why as p The network ^ an the problem hard. In fact, originally led us to suspect that the PMST Kan is it is problem is [6]. f{T) = Thus the PMST this observation that hard. PMST NP — complete. We now 21 e that problem which gives some proved that the restricted version of the on a non-complete graph T, then design problem on a tree was proved NP — complete is in W{{u,v})<B? XIegjc(e)|A'el(7i We have and a bound B. by considering the contribution of every edge easily seen intuition as to if E on the path joining u and v of the weights of the edges f{T)= It is £ {V, E), a weight c(e) for each e with equal costs prove that even if the graph complete, but the costs are either small or large, the problem is is still hard. The PMST Instance A : probability Question Theorem The complete graph <p< p, Is : problem on a complete graph A'„, a cost c(e) € {1,M}, a bound B and a 1. there a spanning tree T with E[Lt] < Bl 5 PMST problem on a complete graph is NP — complete. Proof: Clearly the problem in is NP because of the closed form expressions we have found. To prove that the problem in the proof of theorem 4. complete we use the same reduction as is remaining edges but with very high if make the graph complete we add In order to = cost, i.e c(e) we include any edge of this type, its contribution = 5+ can not be included in the c(e)<^(l) 1, i.e. it the rp^^^ {r+3c)iii)+c<t>{4)+i _ would be c{e)4>{\Kt\) > tree. Therefore, the proof remains unchanged since edges with large costs never appear in a tree with E[Lt] < B. • In section 6 by exploiting some combinatorial properties of the problem, we examine some special Ccises of the PMST solved in polynomial time. For example with all costs equal the problem indicate that the problem costs are 1 or is M or the graph is hard is in which the problem can be we prove that solvable in 0(n). if in The previous theorems either the graph is complete and the non-complete but the costs are equal. combine these two requirements (complete graph, equal becomes a complete graph easy. 22 costs) the If we problem Combinatorial and Functional Properties 5 PMST of the we examine the case with equal In this section case we p. In this are trying to find a spanning tree that minimizes the expression f{p) ^ nun Mp) = min{ Yl c{em\K,\)]. Functional Properties of the 5.1 = probabilities p, Expression (9) is (9) PMST clearly a function of the coverage probability p. For different values of p the corresponding optimal probabilistic trees which minimize (9) We first are different. address the question of specifying the properties of the we have seen function f{p). From difficult to find /(p) for a particular value of p, but can the results of section 4 properties of this function which will give call some we would be that it find some global insight into the problem? We these properties functional because they are related to the function /(p). Some initial observations are stated in the following proposition. Proposition 6 The function np > 2 it is f{p) is continuous, increasing, piecewise differentiable. also concave if the costs are positive. Proof: We examine the properties of the function <^,(p)^(i-(i-p)'=)(i-(i-pr*^). 23 For We can easily check that = ^mp) (1 - p)'-'m - - rr'') + (1 n(i - pr-"(i - (i - p)')] > o, [n(n-l)(l-p)"-*-Jt(Jt-l)- ^Mp) = _(„ _ k){n ^ -k-l){\- pr-^'jil - p)'-\ k can be easily checked that j^4>k{p) if np > it is Thus the function /r(p) 2. a polynomial and furthermore since it is a weighted function f{p) of a finite is is sum concave number < for np > /jzCpa) it > < = 2 and ^4>\{p) < 2 is continuous and differentiable since is increasing and concave for > and continuous, since P ^ Pi+i for p2 if fip,) = fTXPi)-,^ /(Pj)- Finally there We Between successive breakpoints some T,. 2, Therefore, the 0). it is is = the minimum li2, then f{p\) a finite number which can possibly minimize /(p). Thus the function /(p) has a of breakpoints. np > and continuous functions. Furthermore, /(p) of concave < fT:i{pi) for all with positive weights (c(e) increasing because for pi /ti(Pi) it < 2; k=l. (n-l)(l-p)"-3(2-np), It > Hence /(p) is p,, p,+i, finite = /(p) = of trees, number /r,(p),Pi < piecewise differentiable. • can now combine the above proposition 6 and our previous observa- tions that as p p close to 1 is — > the 1 the PMST MST, and as tends to the p ^ MST, the optimal i.e. the optimal tree for PMST is the solution to the network design problem, to sketch a possible graph of the function /(p) in Figure 5. 24 MST fip) \ Lmst Network Design solution Figure 5: The PMST problem as a function of the coverage probability 25 p Bounds 5.2 for the PMST Based on the above functional properties of f{p) and some properties of <i>{k) from proposition 3 we can prove the following proposition. Proposition 7 If Tp is PMST the optimal - max[plA,sT,p(l and Lj is the length of the tree T, then -pr-')LT,] < E[Lt,] < (1 (1 - (l - py^^fL^fST. (10) Proof: From the concavity of the function f{p) we get that /(p)>p/(1) + (1-p)/(0)=pImst, where clearly L},fST solution of PMST =the length for p we = of the minimum spanning proposition 3 From the closed form expression (6) for E[Lt] Since E[LTp] = is the get <l>{l)j:c{t) < E[Lmst]^ we < E[Lt] ecisily = we find ^c(e)^(|A'el) PMST < <;5(L^J)Ir. derive (10). • Exploiting these bounds, we address the question of as a solution to the which 1. From ^(1)It tree, problem. The following the bounds (10). Proposition 8 26 is how good is the MST an obvious corollary of T, Figure 6: The E[Lmst] - E[Lt^] ^ (1 trees Ti,T2 - - p)(i (1 - p)lfJ-i) (11) Proof: < E[Lmst] < Since E[Lt,] > pLmst we E[Lt^] becomes 0(n). <?(LiJ)^mst which is (1 - —+ not informative. intuition that the for p large enough (say p > MST However, as p In fact, the following 1/2 l,...,n and c(e) example 1,2, . . . ,n + 1 = — satisfies and T2 then clearly Tj is the is ^ (z, z + 1). 27 = oo » example confirms our If PMST + the tree Ji the star tree rooted at node n E[Lt,] — problem. 1) = 1, i = Note that the cost function the triangle inequality. MST. Then MST problem, and n > can be a very poor solution to the 2 for all e the ) PMST Consider a complete graph A'„+i with cost function: c{i,i in this the bound • consistent with our intuition. is and p)^?J)i^A/sr, a good approximation for the solution of the is bound this - (1 can easily derive (11). Note that as p These bounds indicate that solution < (2n - + 1 1)<?S(1) is the path (see Figure and E[Lt,] 6), = 2E.li Hi) = "(1 + number). Then if (1 Tp - pT) - is the minimal "(1 E\Lmst\ ^ E[Lr,] > E[Lt,] If p = - 2tlz£l±Eiiz£lLitlz£i: (assuming n + PMST, we (1 E[Lt,] - p)") (2n ^ for some constant a > 2 we - - an even obtain oUrEllEilzfll^iirEn l)p(l - (1 easily obtain as E[L\jst] is n - p)n) — oo that >f)(n). E[Lt,] Thus from (11) we always have E[L\{st] = 0(n), E[Lt,, and we have found an example for which -m^ As a result, we conclude that the bound ''<'' (11) is the best possible. Furthermore, we can address the opposite question. PMST solution to the Similarly MST How good is the problem? we can show Proposition 9 Ljp — Lmst 1 — P ^ - p(l-(l-p)"-i) Lmst Proof: Inequality (12) follows from the inequality (10) as follows: p(l - (1 -p)"-^)Ztp < E[Lt,] < Lmst.* 28 (12) Relation of the 5.3 PMST problem and Re-optimization Strategies We an have suggested in section 2 the idea that the efficient strategy to lems, when problem PMST update the solution to minimum spanning tree prob- instances are modified probabilistically because of the absence of certain nodes from the graph. In section 2 we introduced two other alternative strategies Sa/sj and ^steineRi in which mum spanning nodes the in MST tree problem defines (MST) every instance. If or the minimum we find the mini- Steiner tree of the set of present Lmst{S) and LsTElNERiS) denote the length or the Steiner tree respectively of the nodes in the set S, of we then define the expectation of these re-optimization strategies as follows: EI^mst] = E PiS)LMSTiS), (13) ^ p{S)LsTEINERiS), (14) scv E[T,stEINEr] = scv be the probability that only nodes in where p{S) was defined earlier to present. In this section, we address the question of we have to it is difficult to find compute a sum PMST of 0(2") terms. Instead, we will find Proposition 10 every node is independently present with probability p then rrv , ^ strategy. a closed form expression for E[T,mst], since the E[Lmst]- If are comparing the expectation of the re-optimization strategies with the expectation of the In general S np+(l-p)"-l n — 29 1 , a bound on Proof: E[Uist] = E P'(l - We Dk = define Escv,|S|=fc claim that We will S, Lmst{S) and thus = Ep'(l-pr'Z?fc. (16) prove the above claim by backward induction. Consider the n sets = V-{i}. Then Lmst{S,) + ciKj) > LmsHV) = Lmst because the tree created by adding the edge feasible solution to the instance V. it LmsHS). SCV,|S|=it £:pA/5r] We E P)""* fc=2 holds for any edge in the edge (?', j) We MST, we the one with the is Summing over all i we minimum cost 1 = i choose for every — 6 {i,j) to the apply (18) for from the MST, such that the n one edge V(e, j) i MST, MST 1 . . . on (IS) S, is a n and since the corresponding edges {i,j) are distinct, and among all edges in the MST. get Yl^MST{Sr) + Y^c{iJ) > uLmst«=i i=\ In order to choose n leeLst 1. in cost — 1 edges we perform the Find the edge e' (f, j) to be distinct following algorithm: with smallest cost c(e*). 30 and the one remaining the r Until the node set 2. if I is a leaf in the non-empty, is MST l({i,j,) 7^ e* delete then Since there are n — 1 let e" be their corresponding edge. edges {i,j) that are distinct, then " ^c(z, j) = La/st + c(e*) < 1 + ——r)LMST' (1 " •=i As a ' result, " I>n-i = y^^MsriS,) > r— (n - 1 1 Consider now the A, let A,,j = A, t = - Adding with respect (l) {j}. to i subsets of -)Lmst = n — 1=1 all MST. I. For the two remaining nodes 3. be the unique edge in let (i, j,) \' — 2) —Lmstn — 1 n{n 1 of cardinality k, Ai, A-2,. . ., Af For Arguing as before we get E^mst(A.,)>^^^Z).. (19) But, E^A^5T(A,.,) since in the subsets of V summation in (20) of cardinality k - = (n-fc + we count each 1, n - [k — l)(n k + I l)Z),_i, distinct subset of the [k-ij times. find 31 — K + (20) 1) Combining (19), (20) we Applying (21) inductively we easily obtciin (17) Then from (16), (17) we find > tp'i^-Pr-'^f^fjLMST. ^[Ea/st] Therefore, a-pT-l LmST- np + ^[^MSTl Note that as n Ei^Msr] > + As Cj, ci will As a < E[Ltp]- Let C2 < be shown n — \ 1 oc the bound becomes ^ < not clear that E[E\fST] It is c, — d. . . . < G= c„. £'[i^Tp]- In fact, we give an example where A'„ with c{i,j) = the star tree rooted at node 1. {V,E) be a complete graph MST Then, the in proposition 12 is the optimal PMST is the same star tree. result, E[Lt^ = In this example we P(l - (1 - p)"-^)[(n - l)ci + f: c,]. be able to find a closed form expression for E[Emst] will by exploiting the special structure of the cost function. present and the 1st,..., i — node is 1th nodes are not present then the optimal tree is the star tree rooted at node form expression for From i. E[Emst] If the ith. we can write a this observation closed '• n-l E[^mst] = ^p{^ — py~^E[LT,\node i is present], 1=1 where E[Lx^] means the expected length rooted at node i with leaves z + 1, . . . , in the n. Since 32 PMST E[Lt, \i sense of the star tree is present] = pLr, = p[{n — 1 — i)c, 4- IZfc=,+i some algebraic manipulations, we after C/t], easily find that El^xfsr] = pf2c,\pin - 0(1 - py-' + 1 - - (1 p)'-']. t=i Choosing Ci = we I find £[Ir,l = "'"^''-V + n-? + 0((l-pr). = E[^mst] Letting np = and n c ^[^Tp] Thus c< 6 n as — > oc — oo ^ (c/2 > -Ei-Z^jp] we p)"). see + - 1 £[EA/sr] -^ cn/2. 3/c)n, > E[Emst] Some for c > 3, but E[LTf,] < £[Ea/5x] for Special Cases we exploit some proved in section 5, to find PMST problem graph and c(e) G {1, The Role first of the combinatorial properties, some special cases in which were which we can solve the polynomial time. In section 4 we have seen that the more in restricted versions of the The "% + 0((1 - 3. In this section 6.1 "'"^f M} PMST in a problem with c(e) complete graph are = 1 in a non-complete NP — complete. of the Star Tree natural question concerns the complexity of the problem combine the above restrictions, i.e. when we when we have a complete graph with 33 all costs c(e) = 1. We prove a more general theorem, which includes this case and chaxacterizes the optimal Theorem 11 In the case the MST solution. where problem = p, p for all t € V, whenever the optimum solution a star tree T., then T. is also the solution to the is of PMST problem. Proof: For all T trees E[LT] from proposition Since T. is 3. = Y,<^)^i\^<'\)^<f>i^)LT But by assumption the MST Lj > Lj. for all trees. Combining the above inequalities E[Lt.] Therefore, the star tree T. solves the Theorem star tree T.. < E[Lt]. PMST problem. • 11 characterizes the optimal solution But are there interesting examples in whenever the which T, is the MST is a MST? Proposition 12 In the following 11, r. is the examples the MST is a star tree T. and thus, by theorem PMST. 1. A complete graph, with c{i,j) = c, 2. A complete graph, with c{i,j) = c.Cj, c, 34 + Cj. > 0. 3. 4. A complete graph, with c{i,j) di = A complete graph, with c{i,j) = + c, Cj d,dj,w\t\i ci -|- = mine, and min<fi. — min(ci,Cj). Proof: Consider an arbitrary tree Ci = min, one at least Lt., i.e. Since c,. T, T is connected MST, the Without . Then Lj = ij is 1. is T cost its C2+c,j + . with T, rooted Lt = is c(2, 12) .-\-Cn+c,„ . we assume loss of generality in node > + - • •+c(n, i^), (n — l)ci-|-C2 + . . that where .+c„ = 1. With exactly the same argument we can prove that T, is the MST in the other ca^es. • A MST corollary of theorem 11 is that in a complete graph with c(e) a star tree T. and thus T. optimal solution can be found The Case 6.2 is p, ^ in T., whenever T. case p, 7^ Theorem If Pj? The PMST. Hence, the in this case the pj have shown that the optimum the also the 1 0(n) time. We is is = MST. Does PMST, in the case p, = p, is this result continue to hold a star tree even in the following theorem answers this question. 13 the probability of presence of node star tree T, rooted at node 1, then T. i is p,, is Proof: 35 the pi = minjp, and the PMST. MST is a From = \ — Pi and without n (i-n^.)(i- because n.6A-<-{i} we assume that loss of generality x.)-(i-xi)(i-nx.)=(x,- < ^i 1 and Xi > > i, n.€V-A'« Ml the assumption that Lj. < i^r we E[Lt] star tree T. As a is corollary, in a rooted at node ^.)>o, a;,. eeT /, the PMST. find that > E[Lt.]. • complete graph with c(e) where pi = min, = the 1 PMST the star tree is p,. investigate next the conditions under which the star tree T., which was optimal for certain special cases, the cost function We is n ^o(i- But Sensitivity Analysis 6.3 We /^e- -f[{l - p.))Y:<e). i=2 The n G 1 result, E[Lt] > By (i-p.)}. xGY-Ke i€K^ e€T Let Xi n = i:<^){i- n(i-p.)}{i- £:[lt] As a T the closed form expression (3) for ajiy tree is remains optimal in the case p, = p when arbitrary. define a node in a tree to be an outer one, an inner node if the degree of the node node is if the degree of the node two or more. outer nodes from a tree of n nodes then the remaining graph formed by inner nodes. This tree will If is we erase again a tree be referred to as the inner tree 36 all Tj. In some nodes the tree Tj, there again must be of degree one and these nodes axe called extreme inner nodes. Theorem 14 Let a,b,c be the costs of three sides of any triangle, nodes, in the n positive t < t — node > network (n 4), < with a b any i.e. < If c. set of three there exists a with (1 -p)(i _ (1 -p)L?J-^)Vp(L^J - + tb> c i)(i - (1 -p)"-') = 0(^), such that a for all triangles in the network, then there exists a PMST which is a star tree. Proof: Note that the smaller the value t = it Q, then of restricts all sides of it /, the more restrictive the inequality. any triangle to be of equal length. reduces to the regular triangle inequality. less than 1, this condition is Since the value of < = 1 must be t stronger than the triangle inequality, If If i.e. more restrictive. It is sufficient to any spanning cost. So, let Let A'', tree, T which is of inner nodes in not a star tree, without increasing the expected be any spanning tree which contains at least two inner nodes. be an extreme inner node in node. Since A^^ is be of degree one 7, show that we can reduce the number T with a neighbor Np which an extreme inner node, in T. Call these all its is an inner neighbors except Np must nodes Ni,... Nk-i- This where the distance between Np and Ni 37 is denoted by is c,. shown in Figure no< > k-i Figure Let us construct a that the nodes Ni tree T\, Ng is = {i The 7: new spanning I , . . . , k — I) tree tree Ti T which the sajne as is are connected to Np T except directly. In the new no longer an inner node. Thus the number of inner nodes has decreased by one. We show that will E[LtA < E[Lt]. If we apply tree T. this idea recursively to the trees TijTj,. which exactly the is same both T and Ti, - E[Lt,] = Ml - we need only — a; = 1 a:')(l - i"-*) for the part to the right oi Np. If E[Lt] we ., will finally get a Since the part of the tree to the a star tree. for . p, left of Np is calculate the expected cost then + X: a.(l - ar)(l - x"-^- 1=1 fc-i - - x"-M - E x){l - x"-^). net decrease of expected cost in changing from T to Ti 6o(l x)(l c.(l - 1=1 The ^"^ xCl - E[ir]-^[lTj = (l-x)(l-x"-^)$:[a.-c + 6or (it«=i 3S is x*-Mfl '^ 1)(1 - 1"-*=-^ -x)(l -x"-i)- The decrease be positive will , , "" For a fixed n, x(l when k - each term is positive, x(l-x^-M(l-x"-^-^) namely ^,'^ ^ ^(fc-l)(l-x)(l-x"-i) - x''-''){\ largest, that is if - x"-''-i)/(^ when k = is [n/2j. - - 1)(1 Thus x)(l it is (1-p)(1-(1-p)ItJ-1)^ - x""^) sufficient to is smallest have ^_ ^ •• "•^S(LtJ-i)(i-(i-p)'^-')Assuming a, < < 60 c, gives the strongest inequality, ment of the theorem. • 7 Some Concluding Remarks We have seen that and we have the a natural probabilistic variation of a classical combinatorial problem has the potential to model various practical situations, alternative way probabilistically its to offers an update solutions to problem instances which are modified and leads to very The deterministic counterpart. problem was proved that the state- to be NP different properties in comparison with simplest possible version of the — complete^m PMST sharp contrast with the fact MST problem is solved by a greedy, most straightforward algorithm. Surprisingly, our analysis of the combinatorial properties of the problems established some interesting connections with the network design and naturally with the MST. tends to 0, the PMST problem In particular^ as the probability of presence p approaches the solution to the network design prob- lem. This limiting behavior suggests the idea of solving the network design problem as a sequence of PMST problems, which 39 is a topic of future research. At a final step, we examined the the solution of the PMST problem under some conditions. As a general conclusion, rial special role of the star tree, which can be probabilistic variations of classical combinato- optimization problems raise interesting and entirely pared with their deterministic counterparts and ing of the properties of the probabilistic terministic problems, BlS it new questions com- in addition, understand- problem can add insight to de- was the case with the network design problem. Acknowledgments I would useful like to thank comments and my thesis advisor Professor interesting discussions. Patrick Jaillet for his constructive remarks. I Amedeo Odoni want also to for several thank Professor This work was partially sup- ported from the National Science Foundation under grant ECS-8717970. References [1] Bertsimas D., (1988), "Probabilistic Combinatorial Optimization Problems", Ph.D thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts. [2] Bertsimas D., (1988), "The Probabilistic lem, Part II: Minimum Spanning Tree Prob- Probabilistic Analysis and Asymptotic Results", submitted for publication. 40 Garey M. and Johnson D., (1979), Computers and [3] Guide to the Theory of NP-Completeness, Freeman, Sam Jaillet P., (1985), "Probabilistic [4] Intractability: A Fraaicisco. Traveling Salesman Problems", (Ph.D. Thesis) Technical Report No. 185, Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts. Jaillet P. [5] and Odoni A., (1988), The Probabilistic Vehicle Routing Prob- lem, in Vehicle Routing: Methods and Studies, edited by B.L. Golden and A. A. Assad, North Holland, Amsterdam. [6] Johnson D., Lenstra J.K., A. Rinnooy Kan, (1978), "The Complexity of the Network Design Problem", Networks, [7] 8, 279-285. Papadimitriou C.H. and Steiglitz K., (1982), Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, 41 kSkl Ui|5 New Jersey. Date Due m. 1 3 19^^ Lib-26-67 MIT LIBR»P!F^ 3 nun TOaO DDSb71b7 T