The Rectangle Enclosure and Point-Dominance Problems Revisited
Anatoli Uchitel
Based on an article by Prosenjit Gupta, Ravi Janardan, Michiel Smid, Bhaskar Dasgupta

Topics of Discussion
Preliminaries
A Divide & Conquer Algorithm
Red-Blue Dominance Reporting in Three Dimensions
Final Result

Introduction
Problem 1
Given a set R of n axes-parallel rectangles in the plane, report all pairs (r, r') of rectangles such that r encloses r'.
Transformation
$T : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^4$, $T([l, r] \times [b, t]) = (-l, -b, r, t)$
A rectangle encloses another exactly when its image under T dominates the image of the other, so Problem 1 reduces to Problem 2.
Problem 2
Given a set V of n points in $\mathbb{R}^4$, report all dominance pairs in V.

Terminology
Let S be a set of n points in $\mathbb{R}^d$, where $d \ge 2$.
Point p dominates point q if $p_i \ge q_i$ for all i, $1 \le i \le d$.
A point of S is called maximal in S if it is not dominated by any other point of S.
If S is a set of points in the plane ($d = 2$), then the maximal points, when sorted by their x-coordinates, form a staircase, also called a contour.
Ordering the maximal points by increasing x-coordinate is the same as ordering them by decreasing y-coordinate.
A point p is inside the contour if it is dominated by some point of the contour. Otherwise, we say that p is outside the contour.

Data Structures Used
PST (priority search tree)
Stores a planar point set S. For any query range $[l_x, r_x] \times [b_y, \infty)$ we can report all points of S that lie in the range in $O(\log n + k)$ time, where n = |S| and k is the number of reported points.
Radix PST
If all objects come from a finite universe of size u, we can use a radix PST, which is a simpler structure since there is no need for rebalancing.
– Query time is $O(\log n + k)$
– Update time is $O(\log u)$
– Space required is $O(n + u)$
Van Emde Boas tree (vEBT)
Stores a set $S \subseteq U = \{0, 1, \ldots, u-1\}$.
– Space required is $O(u)$
– Query time is $O(\log\log u)$
– Insertion and deletion time is $O(\log\log u)$
– Building the structure takes $O(u)$ time

A Divide and Conquer Algorithm for Solving the Point Dominance Problem
Input: $V \subseteq \mathbb{R}^4$
Output: all dominance pairs
First step: normalization
Sort the points by each coordinate in turn, and replace every coordinate value by its rank in that sort, a number in the range 0, ..., u-1.
Analysis of the normalization:
– This step takes $O(n \log n)$ time and $O(n)$ space.
– The normalized points have the same dominance relationships as the original points.
– No two different points share a value in any coordinate.

The Algorithm in a Simpler Form
Input: $S \subseteq U^4$, where U = {0, 1, ..., u-1}, |S| = n, and S is sorted by its third coordinate (this is in fact done in the normalization step)
Output: all dominance pairs

Algorithm Steps
Calculate the median m of the fourth-coordinate values and split S into:
$S_1 = \{p \in S \mid p_4 \le m\}$
$S_2 = \{p \in S \mid p_4 > m\}$
Run the algorithm recursively on $S_1$ and on $S_2$.
Define the sets of red and blue points:
$R = \{(p_1, p_2, p_3) \in U^3 \mid (p_1, p_2, p_3, p_4) \in S_1\}$
$B = \{(p_1, p_2, p_3) \in U^3 \mid (p_1, p_2, p_3, p_4) \in S_2\}$
Calculate all dominance pairs $(r, b) \in R \times B$, that is, pairs in which the blue point b dominates the red point r.

The Merge Step
Note that all the points are sorted by the third coordinate z.
We sweep a plane parallel to the xy-plane downward along the z-direction.
During the sweep we maintain a radix PST for the projections onto the sweep plane of all points from B visited already.
When the sweep plane visits a blue point $b = (b_x, b_y, b_z)$, we insert its projection $(b_x, b_y)$ into the radix PST.
When the sweep plane visits a red point $r = (r_x, r_y, r_z)$, we query the radix PST for all points $(b_x, b_y)$ such that $b_x \ge r_x$ and $b_y \ge r_y$.
For each such point, we report the corresponding pair (r, b).

Time & Space Complexity
Time Complexity
– The normalization step takes $O(n \log n)$
– The algorithm itself
• Let T(n,u) denote the total time
• Let M(n,u) denote the time for the merge
• $T(n,u) = 2T(n/2,u) + M(n,u)$
• $M(n,u) = (n/2) \cdot O(\log u) + (n/2) \cdot O(\log u) + O(k) = O(n \log u + k)$
• Without counting the reported pairs, $T(n,u) = 2T(n/2,u) + O(n \log u) = O(n \log n \log u) = O(n \log^2 n)$, since u = n
• Counting them, $T(n,u) = O(n \log^2 n + k)$
– Thus the total time is $O(n \log^2 n + k)$
Space Complexity
The size of the radix PST is $O(n + u)$. Since u = n, we get $O(n)$.
We will see a way to improve the time complexity at the expense of the space complexity, using the vEBT and an improved algorithm for red-blue dominance reporting in three dimensions.

Red-Blue Dominance Reporting in Three Dimensions
The problem restated: given two sets $R, B \subseteq U^3$, where U = {0, 1, ..., u-1} and both R and B are sorted by their third (z) coordinate, find all dominance pairs $(r, b) \in R \times B$.
We write $r \prec b$ when b dominates r.
First, we give two subroutines that will be used in the final algorithm.

The Cleaning Step
In this step we create
$R_1 = \{r \in R \mid \exists b \in B,\ r \prec b\}$
$B_1 = \{b \in B \mid \exists r \in R,\ r \prec b\}$
We sweep a plane parallel to the xy-plane downward along the z-direction.
During the sweep we maintain an initially empty vEBT for the contour formed by the projections onto the sweep plane of the points from B visited already, sorted by their x-coordinate.
When the sweep plane visits a blue point b, we update the contour and the vEBT as follows:
– We search in the vEBT with $b_x$ and determine whether b's projection is inside or outside the contour
– If it is outside, we delete from the vEBT all the blue points in the contour whose projections are dominated by b's projection, and we insert b as a new contour point
When the sweep plane visits a red point r, we add it to $R_1$ if its projection is inside the contour.
At the end of the sweep we have $R_1 = \{r \in R \mid \exists b \in B,\ r \prec b\}$.
The space complexity is $O(u)$.
The time complexity is $O(n \log\log u)$.
We shall refer to this algorithm as the routine Clean(R,B).

Building B1 Using This Algorithm
We introduce a transformation $F : U^3 \to U^3$, $F(a, b, c) = (u-1-a,\ u-1-b,\ u-1-c)$.
F is equal to its inverse.
F reverses all dominance relationships.
B1 = F(Clean(F(B), F(R)))

The Sweep & Report Step
The exact form of the algorithm depends on whether $|R_1| \ge |B_1|$ or not; the distinction is needed for the time bound.
We shall assume w.l.o.g. that $|R_1| \ge |B_1|$.
Let $B_1'$ be the set of maximal points of $B_1$.
We shall introduce an algorithm Sweep(R1,B1,0) that reports all dominance pairs $(r, b) \in R_1 \times B_1'$.

Step 1 - Create L = B1'
We sweep a plane parallel to the xy-plane downward along the z-direction.
During the sweep we maintain an initially empty vEBT for the contour formed by the projections onto the sweep plane of the points from $B_1$ visited already, sorted by their x-coordinate.
When the sweep plane visits a blue point b of $B_1$, we add b to the initially empty list L iff b's projection lies outside the current contour stored in the vEBT.
In this case we also update the vEBT (add b to the contour and remove all points whose projections are dominated by b's projection), and we append the sequence of updates made to the list M.
At the end of the sweep, L = B1'.

Intuition
L contains exactly those points of $B_1$ whose projections were on the contour at some time during the sweep.
Each such point has a smaller z-coordinate than the points swept before it, but its projection lies outside the contour they form, so none of their projections dominates its projection. Thus, none of the previously swept points dominates it.
On the other hand, it has a larger z-coordinate than the points swept after it. Thus, none of those points dominates it either.
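Step 1 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation from the original article: a plain sorted list searched with `bisect` stands in for the vEBT (so a search costs O(log n) instead of O(log log u), and list updates are linear), the name `staircase_maxima` is hypothetical, and the points are assumed to be normalized so that no two of them share a coordinate value.

```python
from bisect import bisect_right


def staircase_maxima(points):
    """Sweep downward in z and return (L, M).

    L is the list of 3D-maximal points in sweep order.
    M logs every contour update as (inserted_projection, removed_projections).
    Assumption: no two points share a value in any coordinate.
    """
    contour = []  # projections (x, y), sorted by increasing x, so decreasing y
    L, M = [], []
    for b in sorted(points, key=lambda p: -p[2]):  # downward along z
        x, y, _ = b
        # Index of the first contour point with a larger x-coordinate.
        i = bisect_right(contour, (x, float("inf")))
        # That point has the largest y among all contour points to the right,
        # so b's projection is inside the contour iff that y exceeds b's y.
        if i < len(contour) and contour[i][1] > y:
            continue
        # Outside: the contour points dominated by b's projection form a
        # contiguous run immediately to the left of position i.
        j = i
        while j > 0 and contour[j - 1][1] < y:
            j -= 1
        removed = contour[j:i]
        contour[j:i] = [(x, y)]
        L.append(b)
        M.append(((x, y), removed))
    return L, M


example = [(3, 0, 1), (1, 2, 3), (0, 1, 0), (2, 3, 2)]
maxima, updates = staircase_maxima(example)
print(maxima)  # [(1, 2, 3), (2, 3, 2), (3, 0, 1)]
```

In the example, the point (0, 1, 0) is skipped because its projection falls inside the contour, which matches the fact that (1, 2, 3) dominates it. The log M is kept because step 2 replays it in reverse order.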
Step 2 - Report (r,b) from R1 x B1'
We sweep over the points of $R_1 \cup B_1'$ upward along the z-direction.
The vEBT initially holds the final contour from step 1.
For every b of $B_1'$ on this contour we initialize an empty list $C_b$, which will hold the red points r of $R_1$ that are dominated by b.
When the sweep plane reaches a blue point b of $B_1'$, we do the following:
– Using M, we undo in the vEBT the changes made to the two-dimensional blue contour when we visited b during the sweep of step 1. That is, we remove the old point b and re-insert the points it had replaced (the new points)
– For each $r \in C_b$, report (r,b) as a dominance pair
– For each new blue point q that now lies on the contour, create a list $C_q$ as follows:
• For every point r in $C_b$ (the dominance list of the old point b, which has just been removed), search by $r_x$ in the vEBT to determine whether r is inside the new contour (notice that the new points form a contiguous staircase)
• If it is, insert r into $C_q$ for each new contour point q whose projection dominates r's projection
When the sweep plane reaches a red point r of $R_1$, we do the following:
– Search by $r_x$ in the vEBT to determine whether r is inside the current contour
– If it is, insert r into $C_p$ for each blue point p on the contour whose projection dominates r's projection

Main Result
During step 2, all dominance pairs (r,b) from R1 x B1' are reported, and only they.
Intuition
– If (r,b) is reported, then b was reached after r, so $b_z > r_z$, and b's projection dominates r's projection
– If (r,b) is a dominance pair, then r is inserted into $C_b$, or into $C_{b'}$ for some b' whose projection dominates b's projection. By the steps of the algorithm, r is then passed along between blue points, each of whose projections dominates the next one's, until it reaches $C_b$; thus (r,b) is reported

Time & Space Complexity
Let $k_{RB} = |\{(r, b) \mid r \prec b,\ r \in R_1,\ b \in B_1'\}|$ be the number of dominance pairs in $R_1 \times B_1'$. Then
SPACE(Sweep(R1,B1,0)) = $O(u + k_{RB})$
TIME(Sweep(R1,B1,0)) = $O(k_{RB} \log\log u)$
In brief
– Let $n = |B_1| + |R_1|$
– We keep a vEBT of size $O(u)$, and the $C_b$ lists, which may take $O(k_{RB})$ space
– In step 1, for each b in $B_1$ we search by its x-coordinate in $O(\log\log u)$ time, and we traverse $O(|B_1|)$ contour points in total. That gives $O(n \log\log u)$ time overall
– For step 2
• The total time for reporting pairs (r,b) is $O(k_{RB})$
• Replacing old points by new points in the contour during the sweep takes $O(n \log\log u)$ time in total
• Inserting red points into the $C_b$ lists takes $O(n \log\log u)$ for checking whether they are inside the contour, plus $O(k_{RB} \log\log u)$ for all the insertions
• Since $|R_1| \ge |B_1|$ and obviously $k_{RB} \ge |R_1|$ (every point of $R_1$ is dominated by at least one maximal point of $B_1$), we get $n \le 2k_{RB}$; hence the total time is $O(k_{RB} \log\log u)$

Algorithm 3Ddom(R,B)
R1 = Clean(R,B);
B1 = F(Clean(F(B),F(R)));
i = 1;
while (Ri != ∅ && Bi != ∅) {
    if (|Ri| >= |Bi|) {
        Sweep(Ri,Bi,0);
        Bi+1 = Bi - Bi';
        Ri+1 = Clean(Ri,Bi+1);
    } else {
        Sweep(F(Bi),F(Ri),1);
        H = F(Ri) - (F(Ri))';
        Bi+1 = F(Clean(F(Bi),H));
        Ri+1 = F(H);
    }
    i++;
}

Supportive arguments
At the end of the (i-1)st iteration of the while loop, $i \ge 1$, the sets Bi and Ri are clean w.r.t. one another
– Intuition: if w.l.o.g. we enter the if branch, then Ri+1 is cleaned explicitly w.r.t. Bi+1, and Bi+1 is clean w.r.t. Ri+1 because the only points removed from Ri are those dominated by points of Bi' alone
The algorithm 3Ddom terminates and reports all dominance pairs (r,b) from R x B, and only them
– Intuition:
• The algorithm terminates when Ri or Bi is empty. This eventually happens, because every iteration shrinks one of them by removing its maximal points
• If (r,b) is reported, it is reported in one of the calls to Sweep. We have already shown that Sweep works correctly, so b indeed dominates r
• Notice that a point is discarded only during a call to Clean, or right after a call to Sweep in which it is a maximal point. If b dominates r, both points are "important": if neither of them was removed before a given call to Clean, then neither of them is removed during that call
• Since the algorithm terminates, both points are eventually removed; assume w.l.o.g. that b is removed first
• So b is a maximal point during some call to Sweep and is removed right afterwards
• During that call r is still in Ri, so by the correctness of Sweep, (r,b) is reported

Algorithm 3Ddom Complexities
Let $R, B \subseteq U^3$ be sorted by the third (z) coordinate.
We are given an empty vEBT on the universe U = {0, 1, ..., u-1} of size u.
Let us define n = |R| + |B| and $k' = |\{(r, b) \in R \times B \mid r \prec b\}|$, the number of dominance pairs.
Algorithm 3Ddom finds all k' dominance pairs in $O((n + k') \log\log u)$ time and $O(u + k')$ space.

Proof
The correctness of the algorithm was proved above.
For the i-th iteration, let us define
– $k_i$, the number of reported pairs
– $n_i = |R_i| + |B_i|$
In each iteration we perform one Sweep and one Clean operation, which cost $O(k_i \log\log u)$ and $O(n_i \log\log u)$ respectively.
Since we make sure that $n_i \le 2k_i$ in each iteration, the whole iteration costs $O(k_i \log\log u)$.
The initial cleaning that produces R1 and B1 takes $O(n \log\log u)$.
Thus, the total time of the algorithm is
$O(n \log\log u) + \sum_i O(k_i \log\log u)$
$= O(n \log\log u) + O\big((\sum_i k_i) \log\log u\big)$
$= O(n \log\log u) + O(k' \log\log u)$
$= O((n + k') \log\log u)$
As far as space is concerned, we need $O(u)$ for the vEBT and $O(k')$ for the $C_b$ lists. Since u equals the initial n, we get a total of $O(n + k')$ space.

Analysis of the 4D Dominance Reporting Problem
Input: $S \subseteq U^4$, where U = {0, 1, ..., u-1}, |S| = n, and S is sorted by its third coordinate
Output: all dominance pairs
Let k denote the number of reported pairs.
How much space does it take?
Again $O(n + k)$ (the vEBT and the total size of the $C_b$ lists)
How much time does it take?
– The split takes $O(n)$
– If T(n,u) is the total running time, then T(n,u) = 2T(n/2,u) + 3Ddom(n/2,n/2,u), where 3Ddom(n/2,n/2,u) denotes the running time of 3Ddom on n/2 red and n/2 blue points
– 3Ddom(n/2,n/2,u) takes $O((n + k') \log\log u)$ time, where k' is the number of reported pairs for the given sets R and B.
Since each dominance pair is handled in only one merge, we do not carry k' through the recurrence for T(n,u); instead we add the total cost $O(k \log\log u)$ for the pairs at the end
– Thus $T(n,u) = 2T(n/2,u) + O(n \log\log u)$, excluding reporting
– Solving this recurrence gives $T(n,u) = O(n \log n \log\log u)$
– Adding the cost of handling the dominance pairs yields $T(n,u) = O(n \log n \log\log u + k \log\log u)$
– Since u equals the initial n, we get $O(n \log n \log\log n + k \log\log n)$
– The initial normalization step takes $O(n \log n)$ time and $O(n)$ space, so its cost is absorbed in the totals

Next Steps
In the original work a mistake was made: the authors did not take into account the space that the $C_b$ lists may consume, and mistakenly concluded that the space complexity is $O(n)$, which is not true.
Later, in an article by Bozanis, Kitsios, Makris and Tsakalidis, an improvement was made by using persistent data structures, reducing the space consumption to $O(n)$.
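To make the recursion analysed above concrete, here is a small Python sketch of its overall shape: map rectangles to 4D points with T, split on the fourth coordinate, recurse, and merge red against blue on the first three coordinates. It is a reference sketch under loud assumptions, not the algorithm from the original article: the merge is a quadratic brute-force check standing in for 3Ddom, so none of the time bounds above apply; the function names (`to_point`, `dominance_pairs`, `brute_force`) and the rectangle encoding `(l, r, b, t)` are hypothetical; and the points are assumed to have pairwise distinct values in every coordinate, as after normalization.

```python
def to_point(l, r, b, t):
    """Transformation T: the rectangle [l, r] x [b, t] becomes a 4D point."""
    return (-l, -b, r, t)


def dominates(p, q):
    """True if p dominates q (p >= q in every coordinate, p != q)."""
    return p != q and all(a >= c for a, c in zip(p, q))


def brute_force(points):
    """All pairs (dominated, dominating), by checking every pair."""
    return {(q, p) for p in points for q in points if dominates(p, q)}


def dominance_pairs(points):
    """Divide and conquer on the fourth coordinate.

    Returns the set of pairs (r, b) in which b dominates r.
    Assumption: coordinate values are pairwise distinct (normalized input).
    """
    def solve(s):
        if len(s) < 2:
            return set()
        mid = len(s) // 2
        s1, s2 = s[:mid], s[mid:]      # S1: small 4th coord, S2: large
        pairs = solve(s1) | solve(s2)  # pairs inside each half
        # Merge step: red points come from S1, blue points from S2.
        # The 4th coordinate is already decided by the split, so only the
        # first three coordinates are compared (stand-in for 3Ddom).
        for r in s1:
            for b in s2:
                if all(b[i] >= r[i] for i in range(3)):
                    pairs.add((r, b))
        return pairs

    return solve(sorted(points, key=lambda p: p[3]))


rects = [(0, 10, 0, 10), (2, 5, 3, 7), (4, 12, 1, 2)]
pairs = dominance_pairs([to_point(*rc) for rc in rects])
print(pairs)  # {((-2, -3, 5, 7), (0, 0, 10, 10))}
```

The single reported pair says that the first rectangle encloses the second. Comparing `dominance_pairs` with `brute_force` on small random inputs is a convenient way to check a faster merge before substituting it for the inner double loop.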