WORKING PAPER
ALFRED P. SLOAN SCHOOL OF MANAGEMENT

On Central Limit Theorems in Geometrical Probability

Florin Avram (1), Dimitris Bertsimas (2)

WP# 3312-MS, July 14, 1991

MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 50 MEMORIAL DRIVE, CAMBRIDGE, MASSACHUSETTS 02139

(1) Florin Avram, Department of Mathematics, Northeastern University. The research of the author was partially supported by a grant from Northeastern University.

(2) Dimitris Bertsimas, Sloan School of Management, MIT. The research of the author was partially supported by the National Science Foundation under grants DDM-9014751, DDM-9010332 and by a Presidential Young Investigator Award with matching funds from Draper Laboratories.

Abstract

We prove central limit theorems (CLT) for the following problems in geometrical probability when points are generated in the $[0,1]^d$ cube according to a Poisson point process with parameter $n$:

1. The length of the $k$-th nearest graph $N_{k,n}$, in which each point is connected to its $k$-th nearest neighbor.
2. The length of the Delaunay triangulation $D_n$ of the points.
3. The length of the Voronoi diagram $V_n$ of the points.

We show, using the technique of dependency graphs of Baldi and Rinott, that the dependence range in all these problems converges quickly to 0 with high probability. In addition, our approach leads to more efficient sequential and parallel algorithms for this class of problems on randomly distributed points.

1 Introduction

In a pioneering paper by Beardwood, Halton and Hammersley [4], continued in Steele [11], it was shown that the lengths $L_n$ of several combinatorial optimization problems (the traveling salesman, minimum matching, minimum spanning tree, minimum Steiner tree, etc.) satisfy laws of large numbers when their input consists of $n$ random iid points $X_k$, $k = 1, \ldots, n$, in the cube $[0,1]^d$. It is also believed (see for example Steele [12]), but not yet known, that they satisfy central limit theorems (CLT), since, while the edges of the above optimal graphs are not independent, the dependence "seems to be local".

In trying to make this intuitive idea precise we succeeded in proving CLTs for three graphs, which are among the most fundamental constructions in computational geometry (see, for example, Preparata and Shamos [9]):

1. The length $N_{k,n}$ of the $k$-th nearest graph, in which each point is connected to its $k$-th nearest neighbor.
2. The length $V_n$ of the Voronoi diagram of the points, which is the collection of Voronoi polygons for each point. Given a collection of points, the Voronoi polygon around a point $O$ is the set of all points $X$ in the plane which are closer to $O$ than to any other point in the collection (see figure 1).
3. The length $D_n$ of the Delaunay triangulation of the points, which is the graph defined on the given points with an edge between two of them if they are Voronoi neighbors (see figure 1). This graph is indeed a triangulation, i.e., a subdivision of the convex hull of the points into triangles.

Both the Voronoi diagram and the Delaunay triangulation are fundamental constructions in computational geometry, since many algorithms for solving geometrical problems are based on their construction. In particular, the Delaunay triangulation is quite important for the MST, since it contains the MST as a subgraph, i.e., points which are neighbors in the MST have to be also Voronoi neighbors (see figure 2). This property leads to an efficient algorithm for the MST: one first constructs the Delaunay triangulation in $O(n \log n)$ time and then runs the greedy algorithm on its graph, leading to an $O(n \log n)$ algorithm for the MST as well.

Figure 1: The Voronoi diagram and the Delaunay triangulation

We believe that our results give some partial insight on why a CLT might hold for the MST as well. Ramey [10] has attempted to prove a CLT for the MST, but his approach, although very interesting, did not succeed since he needed some unproven, but plausible, lemmas from continuous percolation.

Figure 2: The MST is a subgraph of the Delaunay triangulation

The CLT for $N_{1,n}$ has already been obtained by Bickel and Breiman [5], who used very complicated fourth moment estimates. They wrote: "Our proof is long. We believe that this is due to the complexity of the problem. Nearest neighbor distances are not independent". We used instead a simple geometrical approach, which works very well in all these problems and has the potential to work for other problems also.

Our results are established for the Poisson model, which is asymptotically equivalent to the usual Euclidean model, but more convenient to work with.

Poisson model. Let $X_i$, $i = 1, \ldots, N_n$, be the points of a Poisson process with intensity $n$ in $R^d$ that lie in a $d$-dimensional cube, so that $N_n$ is a Poisson random variable with mean $n$.

The paper is structured as follows. Section 2 contains the proof of our main result that for all three problems:

$$\lim_{n \to \infty} \Pr\left\{ \frac{L_n - E[L_n]}{\sqrt{Var[L_n]}} \le x \right\} = \Phi(x), \qquad (1)$$

where $\Phi(x)$ is the cumulative distribution of a standard $N(0,1)$ normal. In Section 3, we show that the lengths $L_n$ of the $k$-th nearest neighbor graph, the Delaunay triangulation and the Voronoi diagram satisfy:

$$\lim_{n \to \infty} \frac{E[L_n]}{n^{(d-1)/d}} = \beta_d. \qquad (2)$$

The constants $\beta_d$ appearing in (2) have been explicitly computed by Miles [7], which is in sharp contrast with the problems studied in [4], [11], with the exception of the MST constant, which was calculated recently in [1]. We include for convenience simple direct computations for the constants. In the final section we remark that our approach leads to more efficient sequential and parallel algorithms for this class of problems on randomly distributed points.
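As an informal illustration of the normalization in (1) and the scaling in (2), the following sketch simulates the nearest neighbor graph ($k = 1$, $d = 2$) under the Poisson model. It is a minimal Monte Carlo check under stated assumptions, not part of the proofs: the function names (`sample_poisson`, `torus_distance`, `nearest_neighbor_length`, `simulate`, `standardize`) are hypothetical, distances are taken on the unit torus to suppress boundary effects, the search is brute force, and the sample mean and variance stand in for $E[L_n]$ and $Var[L_n]$. For a planar Poisson process of intensity $n$ the mean nearest neighbor distance is $1/(2\sqrt{n})$, so $L_n/\sqrt{n}$ should be close to $1/2$.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) count by counting rate-1 exponential arrivals in [0, mean]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def torus_distance(p, q):
    """Euclidean distance on the unit torus (assumption: wrap-around removes boundary effects)."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return math.hypot(min(dx, 1.0 - dx), min(dy, 1.0 - dy))


def nearest_neighbor_length(n, rng):
    """Total length L_n of the nearest neighbor graph (k = 1, d = 2) for one Poisson sample."""
    points = [(rng.random(), rng.random()) for _ in range(sample_poisson(n, rng))]
    if len(points) < 2:
        return 0.0
    total = 0.0
    for i, p in enumerate(points):
        # Brute-force search: distance from p to its nearest other point.
        total += min(torus_distance(p, q) for j, q in enumerate(points) if j != i)
    return total


def simulate(n, replications, seed=0):
    """Independent replications of L_n for intensity n."""
    rng = random.Random(seed)
    return [nearest_neighbor_length(n, rng) for _ in range(replications)]


def standardize(samples):
    """Center and scale using the sample mean and sample standard deviation."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
    std = math.sqrt(var)
    return [(s - mean) / std for s in samples]


if __name__ == "__main__":
    n = 200
    lengths = simulate(n, replications=30)
    print("mean L_n / sqrt(n):", sum(lengths) / len(lengths) / math.sqrt(n))
    z = standardize(lengths)
    print("fraction with |z| <= 1:", sum(abs(v) <= 1 for v in z) / len(z))
```

With more replications, a histogram of the standardized values can be compared with the standard normal density; the small sample sizes here are chosen only to keep the quadratic-time search fast.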
2 Central limit theorems

In this section we prove CLTs for the 3 problems we consider. We first outline the methodology and comment on its applicability to other combinatorial problems.

2.1 Intuitive idea

We identify a certain geometrical configuration $A_n$, such that $\Pr\{A_n\} \to 1$, whose occurrence implies independence of the configuration at points which are far enough apart. More precisely, there is a cutoff distance such that, if the event $A_n$ happened, then deterministically the configuration around a given point is not influenced by that of points further away than the cutoff distance.

For all three problems, the event $A_n$ is the event that, in the subdivision of $[0,1]^d$ into $p_n = n/\log n$ equal subcubes, each subcube contains at least one and at most $e \log n$ of the points $X_k$. If all the neighboring subcubes around a point are nonempty, we then show that only a finite number of them determines the $k$-th nearest neighbors or the Voronoi neighbors of a point. Thus, conditional on the event $A_n$, the three problems exhibit "$m$-dependence" (at the subdivision level) for some finite $m$. Applying the theory of dependency graphs from [2] yields the CLTs.

As mentioned in the introduction, our approach was inspired by the Ph.D. thesis of Ramey [10], who attempted a similar approach for the minimum spanning tree. In order to see how the approach we described for the three problems we considered might generalize to the MST, consider the event $B_{n,k}$ that there exist 2 points which are Voronoi neighbors and lie at some distance $l$ of each other, but can also be connected by a chain of edges which are all shorter than $l$, and the shortest such chain between the two points uses more than $k$ edges for some large $k$. Two such points will not be neighbors in the MST, and the decision not to connect them is affected by the position of far away points. Let $B_{n,k}^c$ be the complement of the event $B_{n,k}$. Conditioning in this case on $A_n \cap B_{n,k}^c$ makes the problem $m$-dependent with $m$ finite, and thus showing that $\Pr\{B_{n,k}\} \to 0$ as $n \to \infty$ would establish the CLT for the MST (modulo some technicalities). Further analogies with continuous percolation make the above quite plausible when $d = 2$, even though a rigorous proof has not been submitted.

To summarize, in the nearest neighbor graph, the Voronoi diagram and the Delaunay triangulation, we have established the geometrical configuration whose probability of occurrence tends to 1 and which implies finite range dependence and thus the CLT. In contrast, for the MST we know the geometrical configuration, but we cannot show that its probability of occurrence tends to 1. Finally, in the TSP and minimum matching problems we do not even have a clear geometric argument of why dependence is local.

2.2 Dependency graphs

The fundamental concept that captures the idea of local dependence is that of dependency graphs. Introduced in Petrovskaya and Leontovitch [8] and applied to several problems by Baldi and Rinott [2], [3], dependency graphs are defined as follows:

Let $X_\alpha$, $\alpha \in V$, be a collection of random variables. The graph $G = (V, E)$ is said to be a dependency graph for $X_\alpha$ if, for any pair of disjoint sets $A_1, A_2 \subset V$ such that no edge in $E$ has one endpoint in $A_1$ and the other in $A_2$, the sets of random variables $\{X_\alpha, \alpha \in A_1\}$ and $\{X_\alpha, \alpha \in A_2\}$ are independent.

The following is the fundamental theorem that captures local dependence.

Theorem 1 ([3]) Let $\{X_{\alpha,n}, \alpha \in V_n\}$ be random variables having a dependency graph $G_n = (V_n, E_n)$, $n \ge 1$. Let $S_n = \sum_{\alpha \in V_n} X_{\alpha,n}$ and $\sigma_n^2 = Var[S_n] < \infty$. Let $D_n$ denote the maximum degree of $G_n$ and suppose $|X_{\alpha,n}| \le B_n$ almost surely. Then

$$\left| \Pr\left\{ \frac{S_n - E[S_n]}{\sigma_n} \le x \right\} - \Phi(x) \right| \le 32(1 + \sqrt{6}) \left( \frac{|V_n| D_n^2 B_n^3}{\sigma_n^3} \right)^{1/2}. \qquad (3)$$

Thus if $|V_n| D_n^2 B_n^3 / \sigma_n^3 \to 0$ as $n \to \infty$, then

$$\frac{S_n - E[S_n]}{\sigma_n} \to N(0,1). \qquad (4)$$

Remark: The above theorem quantifies the fact that enough independence ($D_n$ small), enough boundedness ($B_n$ small) and enough variability ($\sigma_n$ large) imply the CLT.

2.3 The CLT

Let $L_n$ be the length of the graphs $N_{k,n}$, $D_n$ and $V_n$. We subdivide the cube $[0,1]^d$ into a set $C$ of $m^d$ subcubes with $m^d = n/\log n$. The subcubes will be denoted by indices $i = (i_1, \ldots, i_d)$, where each $i_j$ takes values from 1 to $m$. We decompose the total length $L_n$ as

$$L_n = \sum_{i \in C} L_n^{i}, \qquad (5)$$

where $L_n^{i}$ is the sum of all edges with both ends in the subcube $i$, plus the portion within the subcube of the edges with only one endpoint in the subcube $i$. Let $N_n^{i}$ be the number of points falling in the subcube $i$ and $A_n$ be the event that the subcubes are non-empty and contain at most $e \log n$ points, i.e.,

$$A_n = \bigcap_{i \in C} \{1 \le N_n^{i} \le e \log n\}.$$

We first show the following.

Lemma 2

$$\lim_{n \to \infty} \Pr\{A_n\} = 1. \qquad (6)$$

Proof The random variable $N_n^{i}$ is Poisson with parameter $\lambda = n/m^d = \log n$. Thus

$$\Pr\{1 \le N_n^{i} \le e\lambda\} = 1 - \Pr\{N_n^{i} = 0\} - \Pr\{N_n^{i} > e\lambda\}.$$

But $\Pr\{N_n^{i} = 0\} = e^{-\lambda}$ and

$$\Pr\{N_n^{i} > e\lambda\} = \sum_{j > e\lambda} e^{-\lambda} \frac{\lambda^j}{j!} \le e^{-\lambda},$$

where we used Stirling's bounds for the last inequality. Therefore,

$$\Pr\{1 \le N_n^{i} \le e\lambda\} \ge 1 - 2e^{-\lambda} = 1 - \frac{2}{n}.$$

Since we are using the Poisson model, the random variables $N_n^{i}$ are independent and therefore

$$\Pr\{A_n\} \ge \left(1 - \frac{2}{n}\right)^{m^d} = \left(1 - \frac{2}{n}\right)^{n/\log n}.$$

Taking limits we obtain

$$\lim_{n \to \infty} \Pr\{A_n\} \ge \lim_{n \to \infty} e^{-2/\log n} = 1$$

and thus lemma 2 follows. □

We now introduce a distance between subcubes $i$ and $j$:

$$d(i,j) = \max_{1 \le r \le d} |i_r - j_r|.$$

Let

$$S_{i,R} = \{j \in C : d(i,j) \le R\},$$

i.e., $S_{i,R}$ denotes the sphere of subcubes of radius $R$ around $i$. Moreover, if $A$, $B$ are two sets of subcubes, we let

$$d(A,B) = \min_{i \in A,\, j \in B} d(i,j),$$

and we write $S_{A,R}$ for the union of the spheres $S_{i,R}$, $i \in A$. The following proposition is the heart of our development of the CLT and captures the idea of "local dependence" in the three problems we consider.

Proposition 3 Conditional on the event $A_n$ having occurred, there exists a fixed number $R$ (depending only on the dimension $d$) such that if $A$, $B$ is any pair of sets of subcubes with $d(A,B) > R$, then the corresponding collections of random variables $\{L_n^{i}, i \in A\}$ and $\{L_n^{j}, j \in B\}$ are independent.

Proof Let $X^{i}$ denote the set of random variables $\{X_v, v \in i\}$, i.e., the collection of the locations of the points in a subcube. We note first that the point process on $[0,1]^d$ obtained from the Poisson point process by conditioning on $A_n$ retains the property that the $X^{i}$ are independent. We will prove the proposition for $d = 2$, where the geometry is easier to visualize, but the proof holds with no changes in any $d$. Similarly, for ease of exposition, in the $k$ nearest neighbor graph we present the case $k = 1$.

For the nearest neighbor graph (with $d = 2$, $k = 1$), we prove the proposition with $R = 4$. Note first that if two points $X_k \in i$ and $X_l \in j$ are nearest neighbors, then $d(i,j) \le 2$, since if $d(i,j) \ge 3$, then a point in a subcube at distance one from $i$ is guaranteed to be closer to $X_k$ than $X_l$ (see also figure 3).

Figure 3: Local dependence in the nearest neighbor graph

Moreover, the decision which of the points $X_r$ belonging to subcubes in the sphere $S_{i,2}$ is the nearest neighbor of $X_k \in i$ is in no way affected by any point $X_t$ outside $S_{i,2}$. (Note that the analog of this property fails to hold for the MST, matching and TSP.) But, given the event $A_n$, $\{L_n^{i}, i \in A\}$ is some function $f(X^{j}, j \in S_{A,2})$ and similarly $\{L_n^{i}, i \in B\}$ is some function $f(X^{j}, j \in S_{B,2})$. Since $d(A,B) > 4$ implies that $S_{A,2}$, $S_{B,2}$ are disjoint, and since the $X^{j}$ are independent, the proposition follows.

We now turn our attention to the Voronoi diagram. Arguing as before, the Voronoi polygon of a point $X_k \in i$ is included in the sphere $S_{i,2}$, since the Voronoi polygon of $X_k$ is the set of points which are nearer to the point $X_k$ than to any other point $X_l$. Thus, if $X_k \in i$ and $X_l \in j$ are Voronoi neighbors then $d(i,j) \le 4$, since the midpoint of $X_k X_l$ belongs to the Voronoi polygon of both. In fact, with more effort we can check that $d(i,j) \le 3$, since if $d(i,j) > 3$, then any circle through $X_k$, $X_l$ has to contain a full subcube, where a point lies (since all subcubes are non-empty), and thus $X_k$, $X_l$ cannot be Voronoi neighbors.
The decision which of the points $X_r$ belonging to subcubes in the sphere $S_{i,4}$ are Voronoi neighbors of $X_k \in i$ is in no way affected by any point $X_t$ outside $S_{i,4}$. Thus, arguing as in the nearest neighbor case, $R = 8$ is sufficient to prove the proposition. □

Since we want to apply theorem 1, we want an upper bound on the length $L_n^{i}$, which is provided by the following proposition.

Proposition 4 Given the event $A_n$,

$$L_n^{i} \le B_n, \qquad (7)$$

where $B_n = c_d (\log n)^{a} n^{-1/d}$ for constants $c_d$ and $a$ that do not depend on $n$.

Proof Every distance is at most of the order of the length of the diagonal of the subcube, which is $\sqrt{d}/m$. Given the event $A_n$, $N_n^{i} \le e \log n$, and since $m = (n/\log n)^{1/d}$ we obtain (7). □

Finally, in order to apply theorem 1 we need a lower bound on the variance of $L_n$. This is provided in the following proposition.

Proposition 5 There exists a constant $f_d$ depending only on the dimension such that

$$Var[L_n] \ge f_d\, n^{(d-2)/d}. \qquad (8)$$

Proof We will use a general technique from Ramey [10], which we illustrate for $d = 2$. We subdivide the cube $[0,1]^2$ into $n$ squares of dimension $1/\sqrt{n}$ with area $1/n$ each. We identify a particular configuration, which has a strictly positive probability of occurring and has nonzero variability.

We describe the construction for the nearest neighbor graph. Let $\epsilon$ be a fixed small fraction of the side $1/\sqrt{n}$ of a subcube. Consider all the subcubes that have the following properties:

1. They contain exactly 2 points inside a circle with center the center of the subcube and radius $\epsilon$, and no other points in the subcube.
2. A ring of points exists near the outside of the boundary, at distance at most $2\epsilon$ of one another. In particular, we demand that all of the 36 smaller subcubes that surround the initial subcube are non-empty (see figure 4).

For the $k$-nearest graph the construction is similar, except that we require that there are $k + 1$ points inside the circle, so that all $k$-nearest neighbors are inside the circle.

For the Voronoi diagram and the Delaunay triangulation we consider the subcubes with the following properties: They contain exactly one point inside a circle with center the center of the subcube and radius smaller than $\epsilon$, exactly 3 points inside a second, larger circle, and no other points in the subcube. The three points in the second circle are located as follows: we subdivide the circle into six equal circular regions, and the 3 points are located in alternate regions. As in the nearest neighbor graph, a ring of points exists outside the boundary, at distance at most $2\epsilon$ of each other.

The probability of such a configuration is $p_1 > 0$. Note that the point inside has as Voronoi neighbors only the three points in the outside circle. Fixing the position of the three points in the outside circle, there is a certain variability for the Voronoi polygon and the Delaunay triangulation of the point inside, which is easily computable. Letting $N$ be the number of subcubes $i_1, \ldots, i_N$ satisfying the two properties above, we let $\mathcal{U}_n$ be the $\sigma$-algebra determined by the random variables $N$, $i_1, \ldots, i_N$, and by the positions of all the points lying outside these subcubes and of the points inside the larger circle. Then,

$$Var[L_n] = Var[E[L_n | \mathcal{U}_n]] + E[Var[L_n | \mathcal{U}_n]] \ge E[Var[L_n | \mathcal{U}_n]] = E\left[Var\left[\sum_{j=1}^{N} d_{i_j} \,\Big|\, \mathcal{U}_n\right]\right] = \frac{a}{n} E[N] = a p_1 > 0,$$

where $d_i$ is the length of the Voronoi polygon (Delaunay triangulation) of the inside point of subcube $i$ and $a/n$ is its conditional variance. □

We now have all the ingredients to prove the CLT.

Theorem 6 The length $L_n$ of the $k$-nearest neighbor graph, the Voronoi diagram and the Delaunay triangulation satisfies the CLT

$$\lim_{n \to \infty} \Pr\left\{ \frac{L_n - E[L_n]}{\sqrt{Var[L_n]}} \le x \right\} = \Phi(x), \qquad (9)$$

where $\Phi(x)$ is the cumulative distribution of a standard normal.

Proof We first establish that the random variable

$$U_n = \frac{L_n - E[L_n]}{\sqrt{Var[L_n]}}$$

is asymptotically normal, given that the event $A_n$ has occurred. Indeed, we define a dependency graph in the sense of section 2.2 between the $L_n^{i}$ by putting an edge between $L_n^{i}$ and $L_n^{j}$ if $d(i,j) \le R$, where $R$ is defined from proposition 3. Proposition 3 establishes the correctness of this construction. Applying theorem 1, we have that $|V_n| = m^d$ and the maximal degree $D_n$ of this dependency graph is at most $(2R+1)^d$, i.e., finite and independent of $n$. In addition, $B_n = c_d (\log n)^{a} n^{-1/d}$ from proposition 4 and $Var[L_n] \ge f_d\, n^{(d-2)/d}$ from proposition 5. Therefore, since $m^d = n/\log n$, we obtain

$$\frac{|V_n| D_n^2 B_n^3}{Var[L_n]^{3/2}} \le \omega_d \frac{(\log n)^{3a-1}}{n^{1/2}},$$

where $\omega_d$ is a constant depending on the dimension only. Hence

$$\frac{|V_n| D_n^2 B_n^3}{Var[L_n]^{3/2}} \to 0 \quad \text{as } n \to \infty.$$

Applying now theorem 1, we establish that $U_n$ is asymptotically normal $N(0,1)$ given that the event $A_n$ has occurred. We now need to show that $U_n$ is asymptotically normal unconditionally. But

$$\Pr\{U_n \le x\} = \Pr\{U_n \le x | A_n\} \Pr\{A_n\} + \Pr\{U_n \le x | A_n^c\} \Pr\{A_n^c\}.$$

Taking limits and using lemma 2 and the asymptotic normality of $U_n$ given the event $A_n$, we establish the theorem. □

Remark: We can establish rates of convergence to the normal distribution by using (3). We see that the error is of order $n^{-1/4}$, up to logarithmic factors.

3 Computations of the expectations

In this section we will find $E[L_n]$ exactly in the limit, i.e., we will compute the constants $\beta_d$ appearing in (2), using the toroidal rather than the usual Euclidean distances, so that there are no boundary effects. However, the boundary effects can be shown to be negligible (Jaillet [6]), and so the same results hold.

The k-th nearest neighbor graph

Consider a given point under the toroidal model. Then the length of the nearest neighbor graph satisfies $E[L_n] = n E[R_n]$, where $R_n$ is the distance of a given point to its $k$-th nearest neighbor. Also,

$$E[R_n] = \int_0^\infty \Pr\{R_n > r\}\, dr = \int_0^\infty \sum_{j=0}^{k-1} e^{-n c_d r^d} \frac{(n c_d r^d)^j}{j!}\, dr,$$

where $c_d = \pi^{d/2} / \Gamma(d/2 + 1)$ is the volume of the unit sphere in dimension $d$. Letting $u = n c_d r^d$ we obtain that

$$E[R_n] = \frac{1}{d\, (n c_d)^{1/d}} \sum_{j=0}^{k-1} \frac{\Gamma(j + 1/d)}{j!}.$$

Letting $n \to \infty$ we obtain that

$$\lim_{n \to \infty} \frac{E[L_n]}{n^{(d-1)/d}} = \frac{1}{d\, c_d^{1/d}} \sum_{j=0}^{k-1} \frac{\Gamma(j + 1/d)}{j!}. \qquad (10)$$

The length of the Voronoi diagram

Let $R_n$ be the length of the perimeter of the Voronoi polygon around a given point $O$. Then $E[L_n] = \frac{n}{2} E[R_n]$. We perform the calculation for $d = 2$. Let $E[h_n]$ be the expected contribution to $R_n$ from a point $C$ of the Poisson process. Then $E[R_n] = n E[h_n]$. Conditioning on the distance $OC = r$ (see figure 5), let $E[h_n(r)]$ be the expected contribution to $E[h_n]$ given that $OC = r$. Then,

$$E[h_n] = \int_0^\infty E[h_n(r)]\, 2\pi r\, dr.$$

Moreover,

$$E[h_n(r)] = 2 \int_0^\infty \Pr\{\text{the length } dl \text{ at distance } l \text{ from } A \text{ is part of the Voronoi diagram}\}\, dl.$$

But, with $B$ the point at distance $l$ from $A$,

Pr{the length $dl$ at distance $l$ from $A$ is part of the Voronoi diagram} = Pr{no perpendicular bisector cuts the segment $OB$} = Pr{no point of the Poisson process belongs in the circle with center $B$ and radius $BO$} $= e^{-n\pi((r/2)^2 + l^2)}$.

Thus

$$E[h_n] = 4\pi \int_0^\infty \int_0^\infty r\, e^{-n\pi((r/2)^2 + l^2)}\, dl\, dr.$$

Computing the integral we obtain

$$\lim_{n \to \infty} E[h_n]\, n^{3/2} = 4,$$

which leads to

$$\lim_{n \to \infty} \frac{E[L_n]}{\sqrt{n}} = 2.$$

Figure 5: The expectation of the length of the Voronoi diagram

The Delaunay triangulation

Let $R_n$ be the sum of the distances in the Delaunay triangulation from a given point $O$ to its Voronoi neighbors. Then, the expected length of the triangulation is $E[L_n] = \frac{n}{2} E[R_n]$. Let $h(r)\, r\, dr\, d\theta$ = Pr{$A$ is a Voronoi neighbor of $O$, and a point $A$ is in $r\, dr\, d\theta$}. Then

$$E[R_n] = \int_0^{2\pi} \int_0^\infty r\, h(r)\, r\, dr\, d\theta.$$

In order to compute $h(r)$ we consider the circle passing through $O$, $A$ and a third point $B$ of the Poisson process, which is to the right of the segment $OA$, with the property that it contains no other point of the process to the right of $OA$. It is well known (see for example [9]) that $O$, $A$ are Voronoi neighbors if and only if no points of the Poisson process lie in the segment of the circle to the left of $OA$ (see figure 6).

Figure 6: The expectation of the length of the Delaunay triangulation

Let $T_r$ be the radius of the circle and $L(r,t)$, $R(r,t)$ the areas shown in figure 6.
Therefore,

$$h(r) = 2 \int_0^\infty e^{-n L(r,t)}\, d\Pr\{T_r \le t\},$$

where we multiplied by 2 because we considered the point $C$ to the right of $OA$, whereas we could have considered the case in which $C$ is on the left. Moreover,

$$\Pr\{T_r > t\} = e^{-n R(r,t)},$$

where $R(r,t)$ is the area of the right segment of the circle. If $\phi$ is the angle $ACB$, then from simple geometry $\sin\phi = \frac{r}{2t}$ and

$$L(r,t) = \phi t^2 - t^2 \sin\phi \cos\phi, \qquad R(r,t) = (\pi - \phi) t^2 + t^2 \sin\phi \cos\phi.$$

Computing the integrals we find that

$$\lim_{n \to \infty} E[R_n] \sqrt{n} = \frac{64}{3\pi},$$

which leads to

$$\lim_{n \to \infty} \frac{E[L_n]}{\sqrt{n}} = \frac{32}{3\pi}.$$

4 On the relation of CLTs to efficient algorithms

The approach we used leads also to a faster algorithm to compute the Voronoi diagram and the Delaunay triangulation for randomly distributed points. From proposition 3, in order to find the portion of the Voronoi diagram inside a subcube, we need to consider only those subcubes that are at distance at most $R = 4$ from this subcube. Thus, if the event $A_n$ occurs, there are $O(\log n)$ points that contribute to the portion of the Voronoi diagram inside a subcube. Running an $O(k \log k)$ algorithm with $k = O(\log n)$ gives an $O(\log n \log\log n)$ algorithm to find the Voronoi diagram inside a subcube. Since there are $n/\log n$ subcubes, we obtain an $O(n \log\log n)$ algorithm. If the event $A_n$ does not occur, we simply run any $O(n \log n)$ algorithm. Since the probability of $A_n$ goes to 1, this algorithm runs in time $O(n \log\log n)$ with probability tending to 1. In addition, since this computation can be done in parallel, the above approach leads to an $O(\log n \log\log n)$ parallel algorithm with $n/\log n$ processors.

In general, if such an approach were to work for other combinatorial problems, we would obtain not only a CLT, but faster sequential as well as parallel algorithms for their solution.

References

[1] F. Avram and D. Bertsimas, (1990), "The minimum spanning tree constant in geometrical probability and under the independent model; a unified approach", to appear in Annals Appl. Prob.

[2] P. Baldi, Y. Rinott, (1989), "Asymptotic normality of some graph related statistics", Jour. Appl. Prob., 26, 171-175.

[3] P. Baldi, Y. Rinott, (1989), "On normal approximations of distributions in terms of dependency graphs", Annals Prob., 17, 1646-1650.

[4] J. Beardwood, J.H. Halton and J.M. Hammersley, (1959), "The shortest path through many points", Proc. Cambridge Philos. Soc., 55, 299-327.

[5] P. Bickel and L. Breiman, (1983), "Sums of functions of nearest neighbor distances, moment bounds, limit theorems and goodness of fit tests", Ann. Prob., 11, 185-214.

[6] P. Jaillet, (1990), personal communication.

[7] R. Miles, (1970), "On the homogeneous planar Poisson point process", Math. Biosciences, 6, 85-127.

[8] M. Petrovskaya and A. Leontovitch, (1982), "The central limit theorem for a sequence of random variables with a slowly growing number of dependencies", Theory Prob. Appl., 27, 815-825.

[9] F. Preparata and M. Shamos, (1985), Computational Geometry, Springer-Verlag, New York.

[10] M. Ramey, (1983), Ph.D. thesis, Department of Statistics, Yale University.

[11] J.M. Steele, (1981), "Subadditive Euclidean Functionals and Nonlinear Growth in Geometric Probability", Ann. Prob., 9, 365-376.

[12] J.M. Steele, (1988), "Growth Rates of Euclidean Minimal Spanning Trees with Power Weighted Edges", Annals of Prob., 1767-1787.