Data Stashing: Energy-Efficient Information Delivery to Mobile Sinks through Trajectory Prediction
HyungJune Lee, Martin Wicke, Branislav Kusy, Omprakash Gnawali, and Leonidas Guibas
Stanford University
ACM/IEEE IPSN’10, April 15, 2010

Traditional Data Delivery to Mobile Sinks in Wireless Ad-Hoc/Sensor Networks
• Immediate delivery from data source to mobile sinks
  – Proactive scheme: DSDV, OLSR
  – Reactive scheme: DSR, AODV
  – Performance degrades rapidly with increasing mobility
• Data MULEs collect data as they pass each of the sensor nodes
  – Wait until mobile sinks come to collect
  – Often infeasible if we cannot control the movement
• What’s a compromise between the two extremes?
• How to exploit the tolerated delay?
• How to use the regularity of mobility patterns?
• How to select only a partial set of effective relays?

Overview: Predictive Mobile Routing
1. Trajectory prediction
   • Anticipated trajectory nodes
2. Data request and trajectory announcement
3. Stashing node selection
   • To cover the likely paths and minimize the routing cost
4. Data stashing
5. Data collection by mobile nodes

Summary of Contributions
• Predictive Model of Users’ Trajectories
  – In the space of wireless connectivity
  – Captures long-term behavior (in minutes): a set of the future connected relays
• Predictive Data Delivery
  – Propose an energy-efficient data delivery scheme to mobile sinks
  – Turn even limited knowledge of future connectivity into networking benefit

Outline
[Off-line Learning Phase]
• Mobile Trajectory Model
  – In the space of wireless connectivity
  – For packet delivery purposes
[Routing]
• Prediction of Future Relay Connectivity
• Predictive Data Delivery to Mobile Users
[Evaluation]

Capturing Mobile Trajectory Patterns
• Background
  – Trajectory: a sequence of node associations on a given spatial path
  – Trajectories recorded along the same spatial path are not necessarily identical
    • Due to imperfect links and radio signal strength fluctuations
• Goal
  – To cluster similar mobile trajectories
  – To derive general trajectory pattern models from a number of spatial trajectories
[Figure: three trajectories observed along one spatial path through the network]
T   = a l o r t z b p y u
T’  = a l q o r z s p i u z
T’’ = a q r t z t s b y i x

Constructing Trajectory Clusters
• Step I. Similarity measure
  – How similar are T1 = a l o r t z t b o r t and T2 = t o p r b o t a?
• Step II. Hierarchical clustering
• Step III. Compact representation

Step I: Similarity Measure
• Similarity measure (normalized)
  $$\mathrm{sim}(T_1, T_2) = \frac{F(m, n)}{\min(m, n)}$$
  where $F(m, n)$ is the length of the longest common subsequence (LCS) of two trajectories of lengths $m$ and $n$
  – Not a distance metric
[Example 1]
  T1 = a l o r t z t b o r t
  T2 = t o p r b o t a
  LCS = o r b o t
  $\mathrm{sim}(T_1, T_2) = 5 / \min(11, 8) = 5/8$
[Example 2]
  T1 = a l o r t z t b o r t
  T2 = a z o t
  LCS = a z o t
  $\mathrm{sim}(T_1, T_2) = 4 / \min(11, 4) = 1$

Step II: Hierarchical Clustering
• Hierarchical clustering: initially, every trajectory is its own cluster
1. Find the most similar pair of clusters
2. Merge the pair into a parent cluster
3. Calculate the average similarity between objects in two clusters $r$ and $s$
   $$\mathrm{sim}(r, s) = \frac{1}{n_r n_s} \sum_{i=1}^{n_r} \sum_{j=1}^{n_s} \mathrm{sim}(x_{ri}, x_{sj}), \quad i \in (1, \dots, n_r),\ j \in (1, \dots, n_s)$$
4. Repeat

Step III: Probabilistic Representation
1. Execute multiple sequence alignment (using the ClustalW tool)
   – Computational complexity $O(N^2 L^2)$, where $N$ is the number of sequences and $L$ is the sequence length
2. Construct a profile: a probabilistic representation for efficient search in the usage phase
   – $P_{x,j}$: probability that column $j$ is character $x$
[Figure: trajectory sequences before and after gapped multiple sequence alignment (e.g. C-R--E-CEIGIPS---D--S) and the resulting per-column profile]

Summary: Mobility Trajectory Clusters in an Off-line Phase
[Figure: trajectory sequences]

Outline
[Off-line Learning Phase]
• Mobile Trajectory Model
[Routing]
• Prediction of Future Relay Connectivity
• Predictive Data Delivery to Mobile Users
[Evaluation]

Prediction of Future Relay Connectivity
• Given a partial test sequence,
• 1) First find the closest cluster
  – A variant of the Smith-Waterman algorithm for local matching
  – With the largest F(*,*) among all profiles
• 2) Find the highly overlapped region
Test sequence: RCECNC ? ...
Profile: overlapped region ends at column J (Mobility Profile Database)

Prediction of Future Relay Connectivity
• 3) Obtain the most probable subsequence starting from J+1 through J+W
[Figure: profile with the matched position J and the prediction window of length W]

Optimal Route Selection Using Predictive Knowledge
• Data stashing: given a set of future trajectories of multiple mobile users,
  – Find the optimal stashing nodes for each data source
  – Considering
    • Cover all possible future trajectories
    • Minimize routing cost to the selected relay nodes
[Figure: example network with sensor node A, trajectories T1 to T6, and labels N, M1, M2]

Optimal Route Selection Using Predictive Knowledge
• Optimization problem
  – For sensor node A,
  – Minimize total routing cost
    • From the sensor node itself
    • To the selected stashing nodes
  – Subject to
    • Stashing nodes cover all possible future paths of multiple mobile users
• Solved by LP/IP solvers such as CPLEX, Gurobi, GLPK, …

Outline
[Off-line Phase]
• Mobile Trajectory Model
[Routing]
• Prediction of Future Relay Connectivity
• Predictive Data Delivery to Mobile Users
[Evaluation]
• Dynamic mobility model
  – Prediction Accuracy
• Routing performance
  – Scalability
  – Tolerated Delay
  – Load Balance
  – Computation for Selecting Stashing Nodes

Prediction Accuracy of Mobile Trajectory Model
• Validated trajectory clustering using the UMass DieselNet real-world dataset: 34 buses, 4198 APs, 789 bus trips around the UMass campus
• The prediction method results in excellent stashing node selections for real-world data

Simulation Setup for Routing
• TOSSIM under ‘meyer-light’ interference
• 830 × 790 m² area
• 716 nodes
• 20 mobile trajectories
• Vehicle moves at a random speed drawn from N(30, 5²) km/h
• Vehicle sends a beacon every 1 sec
• Each sensor node has data to deliver to mobile sinks

Scalability Depending on # of Mobile Sinks
• Data stashing consumes less energy than immediate point-to-point routing
  – Scalable with # of mobile sinks!
(lower is better)
• Data stashing keeps packet delivery high even under network congestion
• Data stashing performs close to the upper bound given by perfect prediction
(higher is better)
  – Even limited knowledge of future trajectories can significantly improve routing performance!

Tolerated Delay W
• W: # of future trajectory hops
• Large W means more chance to exploit the data stashing scheme
(lower is better)
• As W → 1, data stashing should break down
• Implication: trade-off between tolerated delay and network performance
(higher is better)

Load Balance
• Data stashing balances load better than point-to-point routing that delivers immediately to mobile sinks
[Figure: Immediate Routing vs. Data Stashing]

Running Time for a Source to Compute Stashing Nodes
• PC: Dell Precision 390 (2.4 GHz Core 2 Duo)
• Small embedded: fit-PC2 (Intel Atom Z530, 1.6 GHz)
• Measured running time for solving the optimization problem (a binary integer program)
(lower is better)
• Feasible even on a small embedded platform, taking less than 500 ms

Conclusion
• Dynamic mobile trajectory model in the space of wireless connectivity, capturing wireless volatility
• Mobile data delivery can be improved through mobility pattern learning and prediction
• Even limited knowledge of the future trajectory can improve networking performance
• Take-home lesson: “If you know where someone is going (even uncertainly), you can deliver data to him more efficiently and reliably.”

Limitations & Future Work
• Two problems
  – Current delivery scheme is “best-effort”
  – Current clustering method cannot share common pieces of trajectories
• More robust packet delivery
  – When the system detects that delivery would fail, restashing can significantly improve robustness
  – Trajectory prediction and data stashing can be more intertwined
• Multi-tier clustering
  – Long trajectories can be partitioned into short pieces for efficient clustering
  – On-line clustering
  – A multi-tier clustering approach can deal with extremely large, complex networks
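
Backup: Similarity Measure Sketch
The normalized LCS similarity of Step I and the average similarity of Step II can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, trajectories are assumed to be strings with one character per relay node, and the similarity of an empty trajectory is assumed to be 0. The two calls at the end reproduce Example 1 and Example 2 from the Step I slide.

```python
def lcs_length(a, b):
    """Length F(m, n) of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)          # DP row for the previous symbol of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Normalized similarity F(m, n) / min(m, n).

    Assumption: an empty trajectory has similarity 0 to everything.
    """
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / min(len(a), len(b))


def average_similarity(cluster_r, cluster_s):
    """Average pairwise similarity between two non-empty clusters (Step II, item 3)."""
    total = sum(similarity(x, y) for x in cluster_r for y in cluster_s)
    return total / (len(cluster_r) * len(cluster_s))


T1 = "alortztbort"
print(similarity(T1, "toprbota"))  # 0.625 (Example 1: 5/8)
print(similarity(T1, "azot"))      # 1.0   (Example 2: 4/4)
```

Because the measure is normalized by the shorter length, a short trajectory that is fully contained in a longer one scores 1, which is one reason the slide notes that it is not a distance metric.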
Questions?
HyungJune Lee
abbado@stanford.edu
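
Backup: Stashing Node Selection Sketch
The optimization problem on the route selection slides (pick stashing nodes that cover every predicted trajectory at minimum routing cost) can be illustrated on a toy instance. The sketch below rests on several assumptions that are not in the slides: a trajectory counts as covered when at least one of its relay nodes is selected, the total routing cost is the sum of independent per-node costs from the source, and an exhaustive search stands in for the LP/IP solver. The function name, node labels, and cost values are all hypothetical.

```python
from itertools import combinations


def select_stashing_nodes(trajectories, cost):
    """Return (nodes, total_cost) of the cheapest node set covering all trajectories.

    trajectories: iterable of collections of relay-node ids (predicted paths).
    cost: dict mapping node id -> assumed routing cost from the data source.
    Exhaustive search, so only suitable for tiny illustrative instances.
    """
    trajectories = [set(t) for t in trajectories]
    candidates = sorted(set().union(*trajectories))
    best_set, best_cost = None, float("inf")
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            chosen = set(subset)
            # Coverage constraint: every trajectory contains a chosen node.
            if all(chosen & t for t in trajectories):
                total = sum(cost[n] for n in chosen)
                if total < best_cost:
                    best_set, best_cost = chosen, total
    return best_set, best_cost


# Toy instance: three predicted trajectories over four relay nodes.
paths = [{"a", "b"}, {"b", "c"}, {"d"}]
routing_cost = {"a": 1, "b": 3, "c": 1, "d": 2}
nodes, total = select_stashing_nodes(paths, routing_cost)
print(sorted(nodes), total)  # ['a', 'c', 'd'] 4
```

In this instance the shared node b covers two trajectories at once but costs 3, so stashing at a and c separately (cost 2) is cheaper. A real deployment would pass the same coverage constraints to a binary integer program solver rather than enumerate subsets.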