Continuous Intersection Joins Over Moving Objects

  Rui Zhang, University of Melbourne
  Dan Lin, Purdue University
  Kotagiri Ramamohanarao, University of Melbourne
  Elisa Bertino, Purdue University

Outline
  Background
    Intersection joins on moving objects
    Indexes for moving objects
  Algorithms
    Adapting existing algorithms
    Our approach
      Time constrained processing
      Improvement techniques
  Experiments

Motivation
  (Traditional) intersection join
    Given two sets of spatial objects A and B, find all object pairs ‹i,j›, where i ∈ A, j ∈ B, such that i intersects j.
  Intersection join on moving objects
    Moving
    Continuous

Join Algorithms
  Nested loops join: basic, expensive
  Block nested loops join: efficient, dependent on buffer size
  Index nested loops join: efficient and robust
  Sort-merge join: efficient, difficult for spatial objects

Indexing Moving Objects
  Monitoring moving objects
    Sampling-based
    Trajectory-based
      p = p(t_ref) + v (t - t_ref)
      TM: maximum update interval
  R-tree [SIGMOD’84]
    Minimum bounding rectangle (MBR)
  TPR-tree [SIGMOD’00]
    Add time parameters to the R-tree
  Other indexes: Bx-tree [VLDB’04], STRIPES [SIGMOD’04]
    Only for points
  [Figure: example R-tree with nodes N1, N2, N3 over objects A, B, C, D, E, F]

Naive Algorithm (NaiveJoin)
  Join nodes from two TPR-trees recursively
    If two nodes intersect, check their children
    Otherwise, disregard the pair
  For an update, compute its join pairs and update the answer
  Join result
    ‹a1,b1›, [0,3]
    ‹a2,b2›, [1,4]
    ‹a3,b4›, [6,8]
  Node access (IO): roots, N1, N2, N3, N4
  Comparison (CPU): root A vs root B, N1 vs N3, N2 vs N4

Extended TP-Join Algorithm (ETP-Join)
  Time Parameterized Join (TP-Join) [SIGMOD’02]
    Current result: ‹a1,b1›
    Expiry time: 1
    Event that causes the change: ‹a2,b2›
  Join result
    ‹a1,b1›, [0,3]
    ‹a2,b2›, [1,4]
    ‹a3,b4›, [6,8]
  For the 1st TP-Join
    Node access (IO): roots, N1, N3
    Comparison (CPU): root A vs root B, N1 vs N3

Summary
  NaiveJoin: one tree traversal per update, but each traversal is expensive
  ETP-Join: cheaper traversals, but traversals are too frequent

                         NaiveJoin                               ETP-Join (for the 1st TP-Join)
  Node access (IO)       roots, N1, N2, N3, N4                   roots, N1, N3
  Comparison (CPU)       root A vs root B, N1 vs N3, N2 vs N4    root A vs root B, N1 vs N3
  Time range processed   Too long                                Too short

Key Problem
  Find a good time range for computing the join pairs

Observation
  Consider objects a and b
    Let their next update times be ta and tb
    The perfect time range for computing their join result is [tc, min(ta,tb)]
  How do we know ta or tb?
    TM gives a bound for them
    The time range is cut from [tc, ∞) to [tc, tc+TM]
  Is this correct for all objects?
    Yes. Proof in technical report: http://www.cs.mu.oz.au/~rui/publication/TR_mj.pdf

Time Constrained Processing (TC-Join)
  NaiveJoin with the processing time range constrained to [tc, tc+TM]
  Join result
    ‹a1,b1›, [0,3]
    ‹a2,b2›, [1,4]
    ‹a3,b4›, [6,8]
  Node access (IO): roots, N1, N3
  Comparison (CPU): root A vs root B, N1 vs N3

Further Optimization (MTB-Join)
  Many objects will not update at the time bound
  Put objects in time buckets
    Each time bucket has an associated TPR-tree
    An object is inserted into the tree whose time bucket contains the object’s latest update time
  Example: tc is in [TM, 3/2 TM]

Improvement on the Basic Join Algorithm
  Plane sweep (PS)
    Sorting based on the lower left corner in dimension x
    Two sequences: Sa = ‹a3, a4, a5›; Sb = ‹b1, b2, b3, b4›
  Two essential components for PS
    Lower bound
    Upper bound

Other Improvements
  Sorting dimension selection
    Choose the dimension with the smaller average speed
  Intersection check
    First intersection check, then plane sweep

Experiments: setting
  Computer: 2.6 GHz Pentium IV CPU, 1 GB RAM
  Datasets: Uniform, Gaussian, Battlefield
  Measure: IO and time

  Parameter                       Value
  Node capacity                   113
  Maximum update interval (TM)    60, 120, 240
  Maximum object speed            1, 2, 3, 4, 5
  Object size (% of space)        0.5, 0.1, 0.2, 0.4, 0.8
  Dataset size                    1K, 10K, 50K, 100K
  Dataset                         Uniform, Gaussian, Battlefield

Experiments: TC processing
  Up to 15 times improvement

Experiments: Improvement techniques
  Up to 6 times improvement

Comparison: Initial Join
  MTB-Join outperforms others
  Half an hour for NaiveJoin

Comparison: Maintenance
  Up to 10^4 times improvement
  Time for processing the join for one second

              1K               10K          100K
  MTB-Join    0.03 millisecs   0.05 secs    6 secs
  ETP-Join    6.3 secs         15 mins      hours

Conclusion and future work
  Conclusion
    Time constrained processing
    Further optimization by bucketing in time
    Improvement techniques
    Several orders of magnitude performance improvement
  Future work
    Applying TC processing to other queries

References
  R-tree [SIGMOD’84]
    A. Guttman. R-Trees: A Dynamic Index Structure for Spatial Searching. ACM SIGMOD Conference, 1984.
  TPR-tree [SIGMOD’00]
    S. Saltenis, C. S. Jensen, S. T. Leutenegger, and M. A. Lopez. Indexing the positions of continuously moving objects. ACM SIGMOD Conference, 2000.
  Bx-tree [VLDB’04]
    C. S. Jensen, D. Lin, and B. C. Ooi. Query and update efficient B+-tree based indexing of moving objects. International Conference on Very Large Data Bases, 2004.
  STRIPES [SIGMOD’04]
    J. M. Patel, Y. Chen, and V. P. Chakka. STRIPES: An efficient index for predicted trajectories. ACM SIGMOD Conference, 2004.
  TP-Join [SIGMOD’02]
    Y. Tao and D. Papadias. Time-parameterized queries in spatio-temporal databases. ACM SIGMOD Conference, 2002.

Questions
  Please send your questions to Rui Zhang: rui@csse.unimelb.edu.au
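
Supplementary sketch: time-constrained intersection check
  The core step of time constrained processing is to ask when two linearly moving rectangles intersect, but only within [tc, tc+TM]. The Python below is a minimal sketch of that check, not the authors' implementation. The names MovingRect, _clip and intersection_interval are hypothetical, and it assumes each object moves as p = p(t_ref) + v (t - t_ref) with a single velocity for the whole rectangle.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MovingRect:
    """Axis-aligned rectangle moving linearly (hypothetical helper type)."""
    lo: Tuple[float, ...]   # lower corner at time t_ref
    hi: Tuple[float, ...]   # upper corner at time t_ref
    v: Tuple[float, ...]    # velocity per dimension
    t_ref: float = 0.0      # reference time of the last update


def _clip(gap0: float, rate: float, start: float, end: float
          ) -> Optional[Tuple[float, float]]:
    """Restrict [start, end] to the times t where gap0 + rate * t <= 0."""
    if rate == 0:
        return (start, end) if gap0 <= 0 else None
    root = -gap0 / rate
    if rate > 0:
        end = min(end, root)      # constraint holds up to the root
    else:
        start = max(start, root)  # constraint holds from the root onward
    return (start, end) if start <= end else None


def intersection_interval(a: MovingRect, b: MovingRect, t_c: float, t_m: float
                          ) -> Optional[Tuple[float, float]]:
    """Interval within [t_c, t_c + t_m] during which a and b intersect, or None."""
    start, end = t_c, t_c + t_m
    for d in range(len(a.lo)):
        # In each dimension both "x.lo(t) <= y.hi(t)" constraints must hold.
        for x, y in ((a, b), (b, a)):
            gap0 = (x.lo[d] - x.v[d] * x.t_ref) - (y.hi[d] - y.v[d] * y.t_ref)
            rate = x.v[d] - y.v[d]
            clipped = _clip(gap0, rate, start, end)
            if clipped is None:
                return None
            start, end = clipped
    return (start, end)


# a moves right at speed 1 toward a stationary b.
a = MovingRect(lo=(0.0, 0.0), hi=(1.0, 1.0), v=(1.0, 0.0))
b = MovingRect(lo=(3.0, 0.0), hi=(4.0, 1.0), v=(0.0, 0.0))
print(intersection_interval(a, b, t_c=0.0, t_m=10.0))  # (2.0, 4.0)
print(intersection_interval(a, b, t_c=0.0, t_m=1.0))   # None
```

  With TM = 10 the pair is reported for [2, 4]. With TM = 1 the pair is pruned, because the intersection begins only after tc+TM, by which time at least one of the two objects must have sent an update.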