COSC 6114 Prof. Andy Mirzaian

References:
• [M. de Berg et al.] chapter 10

Data Structures:
• Interval Trees
• Priority Search Trees
• Segment Trees

Applications:
• Windowing Queries
• Vehicle navigation systems
• Geographic Information Systems
• Flight simulation in computer graphics
• CAD/CAM of printed circuit design

Windowing

PROBLEM 1: Preprocess a set S of non-crossing line-segments in the plane for efficient query processing of the following type:
Query: given an axis-parallel rectangular query window W, report all segments in S that intersect W.

PROBLEM 2: Preprocess a set S of horizontal or vertical line-segments in the plane for efficient query processing of the following type:
Query: given an axis-parallel rectangular query window W, report all segments in S that intersect W.

INTERVAL TREES

SUB-PROBLEM 1.1 & 2.1: Let S be a set of n line-segments in the plane. Given an axis-parallel query window W, the segments of S that have at least one end-point inside W can be reported in O(K + log n) time with a data structure that uses O(n log n) space and O(n log n) preprocessing time, where K is the number of reported segments.
Method: Use a 2D Range Tree on the segment end-points, with fractional cascading.

Now consider the horizontal (similarly, vertical) segments in S that intersect W but have both end-points outside W. They must all cross the left edge of W.

SUB-PROBLEM 2.2: Preprocess a set S_H of horizontal line-segments in the plane, so that the subset of S_H that intersects a query vertical line can be reported efficiently.
Method: Use Interval Trees.
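The interval tree named in the method above can be sketched as follows. This is a minimal illustration, not the course's reference implementation: it assumes each horizontal segment is reduced to its x-extent (a closed interval), it assumes the textbook layout in which each node splits at the median end-point and keeps the intervals containing that median in two sorted lists, and the names `Node`, `build` and `stab` are hypothetical. For clarity it re-sorts at every level, so it does not attempt to meet the O(n log n) construction bound.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # closed x-interval [left, right], left <= right


@dataclass
class Node:
    x_med: float                   # median end-point used as the split value
    by_left: List[Interval]        # intervals containing x_med, increasing left end-point
    by_right: List[Interval]       # same intervals, decreasing right end-point
    left: Optional["Node"] = None  # intervals entirely to the left of x_med
    right: Optional["Node"] = None # intervals entirely to the right of x_med


def build(intervals: List[Interval]) -> Optional[Node]:
    """Recursively build an interval tree (simple, not time-optimal)."""
    if not intervals:
        return None
    endpoints = sorted(e for iv in intervals for e in iv)
    x_med = endpoints[len(endpoints) // 2]
    i_left = [iv for iv in intervals if iv[1] < x_med]
    i_right = [iv for iv in intervals if iv[0] > x_med]
    i_med = [iv for iv in intervals if iv[0] <= x_med <= iv[1]]
    return Node(
        x_med,
        sorted(i_med, key=lambda iv: iv[0]),
        sorted(i_med, key=lambda iv: iv[1], reverse=True),
        build(i_left),
        build(i_right),
    )


def stab(node: Optional[Node], qx: float) -> List[Interval]:
    """Report every stored interval that contains the query x-coordinate qx."""
    out: List[Interval] = []
    while node is not None:
        if qx < node.x_med:
            # Every interval here ends at or after x_med > qx,
            # so it contains qx exactly when its left end-point is <= qx.
            for iv in node.by_left:
                if iv[0] > qx:
                    break
                out.append(iv)
            node = node.left
        else:
            # Symmetric case: scan by decreasing right end-point.
            for iv in node.by_right:
                if iv[1] < qx:
                    break
                out.append(iv)
            node = node.right
    return out
```

Each visited node costs O(1) plus the number of intervals it reports, because the scan of a sorted list stops at the first interval that misses the query. That is where a query bound of the form O(K + log n) comes from once the tree is balanced.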
Let x_med be the median of the interval end-points. The root splits the intervals into three groups:
• I_left = intervals that lie completely to the left of x_med (stored recursively in the left subtree),
• I_med = intervals that contain x_med (stored at the root),
• I_right = intervals that lie completely to the right of x_med (stored recursively in the right subtree).

Associated structure for I_med:
• L_left = list of segments in I_med sorted by their left end-points,
• L_right = list of segments in I_med sorted by their right end-points.

[Figure: example interval tree on segments 1 to 7. Root: L_left = 3,4,5 and L_right = 5,3,4. Left child: L_left = 1,2 and L_right = 1,2. Right child: L_left = 6,7 and L_right = 7,6.]

THEOREM: Interval Tree for a set of n horizontal intervals:
• O(n) storage space
• O(n log n) construction time
• O(K + log n) query time [report all K data intervals that contain a query x-coordinate.]

SUB-PROBLEM 2.3: Now suppose the query is a vertical line-segment q = q_x × [q_y : q'_y] instead of a vertical line. The primary structure of the Interval Tree is still valid. Modify the associated secondary structure.

SOLUTION:
• L_left = Range Tree on the left end-points of I_med,
• L_right = Range Tree on the right end-points of I_med.
For q_x ≤ x_med, query L_left with the range (−∞ : q_x] × [q_y : q'_y]; the case q_x > x_med is symmetric, using L_right.

THEOREM: Interval Tree for a set of n horizontal intervals:
• O(n log n) storage space
• O(n log n) construction time
• O(K + log² n) query time [report all K data intervals that intersect a query vertical line-segment.]

COROLLARY: Let S be a set of n horizontal or vertical line-segments in the plane. We can preprocess S for axis-parallel rectangular query window intersection with the following complexities:
• O(n log n) storage space
• O(n log n) construction time
• O(K + log² n) query time [report all K segments of S that intersect the query window.]

PRIORITY SEARCH TREES

Improving the previous solution: the associated structure can be implemented by Priority Search Trees instead of Range Trees.

P = {p1, p2, …, pn} ⊆ ℝ².
A Priority Search Tree (PST) T on P is:
• a binary tree, one point per node,
• heap-ordered by x-coordinates,
• (almost) symmetrically ordered by y-coordinates.

p_min = point in P with minimum x-coordinate
y_min = minimum y-coordinate of points in P
y_max = maximum y-coordinate of points in P
P' = P – {p_min}
y_med = y-median of points in P'
P_below = { p ∈ P' | p_y ≤ y_med }
P_above = { p ∈ P' | p_y > y_med }

The root of T stores p_min, y_min and y_max; its left subtree is a PST on P_below and its right subtree is a PST on P_above.

[Figure: example PST on points p1, …, p7.]

Priority Search Tree T on n points in the plane requires:
• O(n) storage space
• O(n log n) construction time: either recursively, or pre-sort P on the y-axis and then construct T in O(n) time bottom-up. (How?)

Priority Search Trees can replace the secondary structures (range trees) in Interval Trees:
• simpler (no fractional cascading),
• linear space for the secondary structure.

How to use a PST to search for a query range R = (−∞ : q_x] × [q_y : q'_y]?
(Below, p_min(v).x denotes the x-coordinate of the point p_min(v) stored at node v.)

ALGORITHM QueryPST (v, R)
   if v = nil or p_min(v).x > q_x or y_min(v) > q'_y or y_max(v) < q_y then return
   if p_min(v).x ≤ q_x and q_y ≤ y_min(v) ≤ y_max(v) ≤ q'_y
      then Report.In.Subtree (v, q_x)
      else do
         if p_min(v) ∈ R then report p_min(v)
         QueryPST (lc(v), R)
         QueryPST (rc(v), R)
      end else
end

PROCEDURE Report.In.Subtree (v, q_x)
   if v = nil then return
   if p_min(v).x ≤ q_x then do
      report p_min(v)
      Report.In.Subtree (lc(v), q_x)
      Report.In.Subtree (rc(v), q_x)
   end if
end

Report.In.Subtree is a truncated pre-order traversal of the heap: O(1 + K_v) time.

[Figure: query range R against the subtree rooted at v, with y_min, y_max, q_x, q_y and q'_y marked.]

LEMMA: Report.In.Subtree(v, q_x) takes O(1 + K_v) time to report all points in the subtree rooted at v whose x-coordinate is ≤ q_x, where K_v is the number of reported points.

THEOREM: Priority Search Tree for a set P of n points in the plane has complexities:
• O(n) storage space
• O(n log n) construction time
• O(K + log n) query time [report all K points of P in a query range R = (−∞ : q_x] × [q_y : q'_y].]

SEGMENT TREES

Back to Problem 1: Arbitrarily oriented line segments.
Solution 1: Bounding box method. Bad worst-case: many false hits.
Solution 2: Use Segment Trees.
a) Segments with end-points in W can be reported using range trees (as before).
b) Segments that intersect the boundary of W can be reported by Segment Trees.

SUB-PROBLEM 1.2: Preprocess a set S of n non-crossing line-segments in the plane into a data structure to report those segments in S that intersect a given vertical query segment q = q_x × [q_y : q'_y] efficiently.

Elementary x-intervals of S, where p1 < p2 < … < pm are the distinct x-coordinates of the segment end-points:
(−∞ : p1), [p1 : p1], (p1 : p2), [p2 : p2], …, (p_{m−1} : pm), [pm : pm], (pm : +∞).

Build a balanced search tree with each leaf corresponding (left-to-right) to an elementary interval (in increasing x-order).

Leaf v: Int(v) = the elementary interval corresponding to v, and S(v) = set of intervals (in S) that contain Int(v).

IDEA 1: Store S(v) with each leaf v.
Storage O(n²), because intervals in S that span many elementary intervals will be stored in many leaves.

IDEA 2: Internal node v: Int(v) = union of the elementary intervals corresponding to the leaf-descendants of v.
Store an interval [x : x'] of S at a node v iff Int(v) ⊆ [x : x'] but Int(parent(v)) ⊄ [x : x'].
Each interval of S is stored in at most 2 nodes per level (i.e., O(log n) nodes). Thus, storage space reduces to O(n log n).
What should the associated structure be?

[Figure: example segment tree on segments s1, …, s5, showing the intervals stored at each node.]

[Figure: canonical subsets in an example with segments s1, …, s7: S(v1) = {s3}, S(v2) = {s1, s2}, S(v3) = {s5, s7}.]

The associated structure of a node v is a balanced search tree based on the vertical ordering of the segments S(v) that cross the slab Int(v) × (−∞ : +∞). This ordering is well defined inside the slab because the segments are non-crossing.

[Figure: segments s1, …, s6 crossing a slab, with the balanced search tree on their vertical order.]

THEOREM: Segment Tree for a set S of n non-crossing line-segments in the plane:
• O(n log n) storage space
• O(n log n) construction time
• O(K + log² n) query time [report all K segments of S that intersect a vertical query line-segment.]

COROLLARY: Segment Trees can be used to solve Problem 1 with the above complexities.
That is, the above complexities apply if the query is an axis-parallel rectangular window.

Exercises

1. Let P be a set of n points in the plane, sorted on y-coordinate. Show that, because P is sorted, a Priority Search Tree of the points in P can be constructed in O(n) time.

2. Windowing queries in sets of non-crossing segments are performed using a range query on the set of end-points and intersection queries with the four boundary edges of the query window. Explain how to avoid reporting segments more than once. To this end, make a list of all possible ways in which an arbitrarily oriented segment can intersect a query window.

3. Segment Trees can be used for multi-level data structures.
(a) Let R be a set of n axis-parallel rectangles in the plane. Design a data structure for R such that the rectangles in R that contain a query point q can be reported efficiently. Analyze the amount of storage and the query time of your data structure. [Hint: Use a segment tree on the x-intervals of the rectangles, and store canonical subsets of the nodes in this segment tree in an appropriate associated structure.]
(b) Generalize this data structure to d-dimensional space. Here we are given a set of axis-parallel hyper-rectangles, i.e., polytopes of the form [x1 : x'1] × [x2 : x'2] × … × [xd : x'd], and we want to report the hyper-rectangles that contain a query point.

4. Let I be a set of n intervals on the real line. We want to store these intervals such that we can efficiently determine
(a) those intervals that are completely contained in a given interval [x : x']. Describe a data structure that uses O(n log n) storage and solves such queries in O(K + log n) time, where K is the output size. [Hint: Use a range tree.]
(b) The same question as in part (a), except that we want to report the intervals that contain a query interval [x : x'].

5.
Consider the following alternative approach to solve the 2-dimensional range searching problem: We construct a balanced search tree on the x-coordinate of the points. For a node v in the tree, let P(v) be the set of points stored in the subtree rooted at v. For each node v we store two associated priority search trees of P(v): a tree T_left(v) allowing for range queries that are unbounded to the left, and a tree T_right(v) for range queries that are unbounded to the right.
A query with a range [x : x'] × [y : y'] is performed as follows. We search for the node v_split where the search paths toward x and x' split in the tree. Now we perform a query with range [x : +∞) × [y : y'] on T_right(lc(v_split)) and a query with range (−∞ : x'] × [y : y'] on T_left(rc(v_split)). This gives all the answers (there is no need to search further down in the tree!).
(a) Work out the details of this approach.
(b) Prove that the data structure correctly solves range queries.
(c) What are the bounds for preprocessing time, storage space, and query time of this structure? Prove your answers.

END