An RFID and Particle Filter-based Indoor Spatial Query Evaluation System
Jeff Ku
Auburn University

Introduction
Why research on indoor spatial queries?
1. People spend a significant amount of time daily in indoor spaces.
   1. Office buildings, shopping malls, etc.
   2. The New York City subway system delivered over 1.7 billion rides in 2015.
2. Accurate localization techniques based on RFID are not available.
3. Modeling methods are very different from those for outdoor space.

Introduction (Cont.)
Characteristics of RFID devices
1. They consist of RFID readers and tags.
2. Tags are very cheap.
3. Challenges include limited sensing range, false negatives, and the inability to cover the whole indoor space.
Figure: An example of an RFID reader and tag

Background: Indoor Spatial Queries
Previous works [1] and [2] proposed solutions for indoor range queries and kNN queries, respectively.
Underlying assumption: an object's location is uniformly distributed after it leaves a reader's detecting range.
The object is uniformly distributed in its uncertain region, a circle whose radius grows with time.

Symbolic Model

Symbolic Model (Cont.)

Background: RFID Data Cleansing
Due to the inherent unreliability of raw RFID readings, data cleansing is necessary.
[3, 4, 5, 6] adopted a sampling-based method, particle filters.
[7, 8] employed a different sampling method, Markov Chain Monte Carlo.
Particle filtering is more suitable for our setting.

Preliminary: Particle Filters
Represent the posterior probability by a set of samples (particles) associated with weights.
Particles update themselves from parent particles at the previous time step according to a motion model.
Weights are updated according to the observation.
Particles are resampled to replicate high-weight particles and eliminate low-weight particles.

Preliminary: Particle Filter-based Location Inference
Suppose an object is detected by d2 at t0. Its particles are initially uniformly distributed within the detecting range of d2 with random speeds and directions.
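The particle filter steps described above (initialization within a reader's detecting range, motion update, weight update from an observation, and resampling) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the function names, the straight-line constant-speed motion, the `max_speed` bound, and the `hit`/`miss` weights are all hypothetical choices made for the example.

```python
import math
import random


def init_particles(center, radius, n, rng, max_speed=1.5):
    """Spread n particles uniformly over a reader's circular detecting range,
    each with a random speed and heading (max_speed is an assumed bound)."""
    particles = []
    for _ in range(n):
        # sqrt() keeps the density uniform over the disc's area
        r = radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        particles.append({
            "x": center[0] + r * math.cos(theta),
            "y": center[1] + r * math.sin(theta),
            "speed": rng.uniform(0.0, max_speed),
            "heading": rng.uniform(0.0, 2.0 * math.pi),
        })
    return particles


def predict(particles, dt=1.0):
    """Motion update: move each particle along its own heading for dt seconds."""
    return [
        dict(p,
             x=p["x"] + p["speed"] * math.cos(p["heading"]) * dt,
             y=p["y"] + p["speed"] * math.sin(p["heading"]) * dt)
        for p in particles
    ]


def weigh(particles, reader, radius, hit=0.9, miss=0.1):
    """Weight update: particles inside the detecting reader's range get the
    (assumed) weight `hit`, the rest get `miss`; weights are normalized."""
    raw = [
        hit if math.hypot(p["x"] - reader[0], p["y"] - reader[1]) <= radius else miss
        for p in particles
    ]
    total = sum(raw)
    return [w / total for w in raw]


def resample(particles, weights, rng):
    """Resampling: draw a new set of the same size in proportion to the weights,
    so high-weight particles are replicated and low-weight ones tend to vanish."""
    return [dict(p) for p in rng.choices(particles, weights=weights, k=len(particles))]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical layout: two readers 4 m apart, each with a 2 m detecting range.
    d2, d3, detect_range = (0.0, 0.0), (4.0, 0.0), 2.0
    particles = init_particles(d2, detect_range, 64, rng)
    for _ in range(3):  # one update per second
        particles = predict(particles)
    weights = weigh(particles, d3, detect_range)  # object is now seen by d3
    particles = resample(particles, weights, rng)
    print(len(particles), "particles after resampling")
```

Keeping `miss` above zero is a deliberate choice in this sketch: RFID readings suffer from false negatives, so particles outside the reported range are down-weighted rather than ruled out.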
Figure: Initial distribution of particles at t0

Preliminary: Particle Filter-based Location Inference (Cont.)
The object is later detected by d3 at t1, by which time its particles have dispersed around d2.
After resampling, most particles become replicas of the ones within the detecting range of d3, which are moving from left to right.
Figure: Dispersed particles at time t1 before resampling

System Design: Overall System Structure

System Design: Raw Data Collector
Only store the readings of the two most recent detecting devices for each object.
Aggregate multiple readings per second into a single entry per second, since particle filters update once every second.

System Design: Indoor Walking Graph Model and Anchor Point Indexing Model
Indoor Walking Graph: simplifies particle movement.
Anchor Point: discretizes particles' locations.
Figure: An example of Indoor Walking Graph and Anchor Point

System Design: Query Aware Optimization
Range Query:
Figure: Filter out non-candidate objects for range query

System Design: Query Aware Optimization (Cont.)
kNN Query:
Let f be the k-th minimum of all objects' l_i values.
If o_i.s_i > f, o_i can be safely pruned.
Figure: Filter out non-candidate objects for kNN query

System Design: Particle Filter-based Preprocessing
Run particle filtering for every object o_i in the candidate set until the current time stamp.
Assign o_i's particles to their nearest anchor points and calculate p_i(o_i.location = ap) = n/N_s, where n is the number of particles falling on an anchor point ap and N_s is the total number of particles for o_i. p_i stands for the probability of o_i being at ap.

System Design: Particle Filter-based Preprocessing (Cont.)
Update the indexing hashtable APtoObjHT.
An example of APtoObjHT:
key    value
ap1    <o1, 0.12>, <o2, 0.3>, <o3, 0.2>
ap2    <o2, 0.2>, <o4, 0.6>
ap3    <o3, 0.5>
…      …

System Design: Indoor Range Query
For queries in hallways, identify the anchor points covered along the length of the query window, sum up the probabilities for each object, and also consider the ratio w_qh/w_h.
For queries in rooms, identify the anchor points within the room, sum up the probabilities for each object, and also consider the ratio Area_qr/Area_room.

System Design: Indoor kNN Query
Expand from the query point q and search for anchor points in ascending order of their distance to q.
Stop when the accumulated probability is no less than k.

System Design: Continuous Spatial Queries
Critical device idea

Experimental Validation
Conducted on an Ubuntu Linux server equipped with an Intel Xeon 2.4GHz processor and 16GB of memory.
The setting includes 30 rooms and 4 hallways on a single floor.
A total of 19 RFID readers are deployed in the hallways at uniform distances from each other.
Compared with the symbolic model-based methods [1, 2].
Metrics include top-k success rate (higher is better) and KL divergence (lower is better).
Figure: Haley Center Floor Plan

The Simulator Structure

Effects of Query Window Size

Effects of k

Effects of Number of Particles

Effect of Number of Moving Objects

Effect of Activation Range

Conclusion
Proposed the particle filter-based location inference method, the indoor walking graph model, and the anchor point indexing model for RFID data cleansing.
Proposed efficient algorithms for evaluating range and kNN queries on probabilistic data.
In the future, extend the current framework to support more spatial query types such as continuous range, continuous kNN, closest pairs, etc.

References
[1] Bin Yang, Hua Lu, and Christian S. Jensen. Scalable continuous range monitoring of moving objects in symbolic indoor space. In CIKM, pages 671–680, 2009.
[2] Bin Yang, Hua Lu, and Christian S. Jensen. Probabilistic threshold k nearest neighbor queries over moving objects in symbolic indoor space. In EDBT, pages 335–346, 2010.
[3] Thanh T. L. Tran, Charles Sutton, Richard Cocci, Yanming Nie, Yanlei Diao, and Prashant J. Shenoy. Probabilistic Inference over RFID Streams in Mobile Environments. In ICDE, pages 1096–1107, 2009.
[4] Christopher Re, Julie Letchner, Magdalena Balazinska, and Dan Suciu. Event queries on correlated probabilistic streams. In SIGMOD Conference, pages 715–728, 2008.
[5] Evan Welbourne, Nodira Khoussainova, Julie Letchner, Yang Li, Magdalena Balazinska, Gaetano Borriello, and Dan Suciu. Cascadia: a system for specifying, detecting, and managing RFID events. In MobiSys, pages 281–294, 2008.
[6] Julie Letchner, Christopher Re, Magdalena Balazinska, and Matthai Philipose. Access Methods for Markovian Streams. In ICDE, pages 246–257, 2009.
[7] Haiquan Chen, Wei-Shinn Ku, Haixun Wang, and Min-Te Sun. Leveraging spatiotemporal redundancy for RFID data cleansing. In SIGMOD Conference, pages 51–62, 2010.
[8] Wei-Shinn Ku, Haiquan Chen, Haixun Wang, and Min-Te Sun. A Bayesian Inference-Based Framework for RFID Data Cleansing. IEEE Trans. Knowl. Data Eng., Vol. 25, Issue 10, pp. 2177–2191, 2013.