Department of Computer Science, City University of Hong Kong

A Statistics-Based Sensor Selection Scheme for Continuous Probabilistic Queries in Sensor Networks

Song Han¹, Edward Chan¹, Reynold Cheng², and Kam-Yiu Lam¹
¹ Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong
² Department of Computing, Hong Kong Polytechnic University, PQ706, Mong Man Wai Building, Hung Hom, Kowloon, Hong Kong

Agenda
- Introduction
- Objective
- System Model
- Methodology
- Performance Analysis
- Conclusion

Statistics-Based Sensor Selection Scheme in Sensor Networks

Introduction
- Constantly-evolving environment
- Uncertainty of sensor data
  - Sensor data are erroneous, unreliable, and noisy
  - The database may store inaccurate values
  - Query results can be incorrect

Introduction
- Statistical Model of Sensor Uncertainty
  - A sensor value can be described more accurately as a Gaussian distribution
    - Mean µ
    - Variance σ²
    - Gaussian distribution N(µ, σ²)

Introduction
- Probabilistic Queries [SIGMOD03]
  - Represent the imprecision in the value of the data as a probability density function.
    - e.g., Gaussian
  - Augment query answers with probabilities
  - Give a correct (possibly less precise) answer instead of a potentially incorrect one

Introduction
- Query Quality and Variance
  - Query quality can be improved with lower variance
  - To obtain a smaller σ², a simple idea is to use more sensors and average their readings
  - N(µ, σ²) becomes N(µ, σ²/ns), where ns is the number of “redundant” sensors

Introduction
- Deploying Redundant Sensors
  - Exploit the fact that sensors are cheap
  - Example: 1000 sensors in a room to obtain the average temperature
  - Variance decreases by a factor of 1000
- Resource Limitation Problem
  - The wireless network has limited bandwidth
  - Sensors have limited battery power
  - Can’t afford too many sensors!

Introduction
- The Sensor Selection Problem
  - How to decide the sensors’ sampling period?
  - How many sensors to use for the guaranteed level of query quality?
  - Which sensors to select?
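The trade-off behind the sensor selection problem can be made concrete with a small sketch. It assumes the readings are independent and share a common variance σ², so that averaging ns of them gives variance σ²/ns. The function names are illustrative and do not come from the slides.

```python
import math


def averaged_variance(sigma2: float, ns: int) -> float:
    """Variance of the mean of ns independent readings, each with variance sigma2."""
    if ns < 1:
        raise ValueError("ns must be at least 1")
    return sigma2 / ns


def sensors_needed(sigma2: float, max_variance: float) -> int:
    """Smallest ns with sigma2 / ns <= max_variance, up to floating-point rounding."""
    if sigma2 < 0 or max_variance <= 0:
        raise ValueError("sigma2 must be >= 0 and max_variance must be > 0")
    return max(1, math.ceil(sigma2 / max_variance))


if __name__ == "__main__":
    # The room example: 1000 sensors cut the variance by a factor of 1000.
    print(averaged_variance(4.0, 1000))
    # How many sensors keep the variance of the average at or below 0.5?
    print(sensors_needed(4.0, 0.5))
```

Since every extra sensor costs bandwidth and battery power, the useful number is the smallest ns that still meets the variance target, not the largest number available.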

Objective
- Adaptive sampling period decision scheme
- Find the maximum allowable variance of each monitored entity that still meets the probabilistic query quality requirement
- Select the minimum number of “good” sensors that achieves the required variance
- Decide which sensors should be selected

System Model
- Figure: system model with the User, the Base Station, the Wireless Network, and four regions

System Model
- Figure: system model with the User, the coordinator, and the Base Station

Methodology
- Adaptive Sampling Period Decision
- Sensor Selection Process
  1. Obtain (µ, max σ²) from the sensors in the region
  2. Derive the max σ² for each item that satisfies the quality requirement
  3. Determine the sensor nodes to be used

Adaptive Sampling Period Decision
- The region’s value changes continuously
- Periodic sampling consumes excessive system resources
- Adaptive sampling scheme for MAX/MIN queries
- Essence: increase the sampling period for the regions whose values have little effect on the query result.
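One way to picture this idea for a MAX query is sketched below. It treats each region's value as lying within three standard deviations of its mean, then estimates how long a region would need to close the gap to the current maximum if both values drift at bounded rates. The `Region` dataclass, its field names, and the bounded drift rates are assumptions made for illustration, not definitions taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Region:
    mean: float      # current estimated mean of the region's value
    std: float       # standard deviation of that estimate
    max_rate: float  # assumed bound on how fast the value can change per second


def next_sample_delay(region: Region, leader: Region) -> float:
    """Seconds until `region` could plausibly overtake `leader` (0 means sample now)."""
    # Gap between the leader's lower 3-sigma bound and the region's upper 3-sigma bound.
    gap = (leader.mean - 3 * leader.std) - (region.mean + 3 * region.std)
    closing_speed = region.max_rate + leader.max_rate
    if gap <= 0:
        return 0.0
    if closing_speed <= 0:
        return float("inf")  # neither value changes, so the gap never closes
    return gap / closing_speed


if __name__ == "__main__":
    leader = Region(mean=30.0, std=0.5, max_rate=0.1)
    far = Region(mean=20.0, std=0.5, max_rate=0.1)
    near = Region(mean=28.0, std=0.5, max_rate=0.1)
    print(next_sample_delay(far, leader))   # long delay: little effect on the MAX result
    print(next_sample_delay(near, leader))  # zero: the 3-sigma bounds already overlap
```

A region far below the current maximum can be sampled rarely, while one whose bounds overlap the leader's must be sampled immediately. A MIN query would use the mirrored gap.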

Adaptive Sampling Period Decision
- Adaptive sampling scheme for MAX/MIN queries
- Predicted Sampling Time (PST):

  $$\mathrm{PST}_i = \max\left( \frac{(\mu_{\max} - \mu_i) - 3(\sigma_{\max} + \sigma_i)}{v_i + v_{\max}},\ 0 \right)$$

Sensor Selection Process
- Types of Probabilistic Queries
- Factors Affecting Query Quality
- Probabilistic Query Quality
- An Example: MAX Query
- Reselection of Sensors for Continuous Queries

Types of Probabilistic Queries
- MAX/MIN: Which region has the maximum or minimum temperature?
  - e.g., (A, 60%), (B, 30%), (C, 10%)
- AVG/SUM: What is the average temperature of regions A, B, and C?
- Range Count: How many objects are within 50 m of me?

    COUNT        1     2     3     4     5
    Probability  0.1   0.2   0.5   0.15  0.05

Factors Affecting Query Quality
- Error distribution of each sensor reading
  - Variance of the Gaussian distribution
- Each query type has its own correctness requirement
  1. MAX/MIN
  2. AVG/SUM
  3. Range Count Query

Probabilistic Query Quality
- Probabilistic queries allow specification of answer quality
  1. MIN/MAX: highest probability ≥ P
  2. AVG/SUM: variance of answer ≤ T
  3.
     Range count: the top K counts contribute a total probability ≥ P

Example: MAX Query
- Let $p_i$ be the probability that the i-th region has the maximum value, where $f_i(s)$ is the pdf of $N(\mu_i, \sigma_i^2)$ and $n$ is the number of regions:

  $$p_i = \int_{-\infty}^{\infty} f_i(s) \prod_{j=1,\, j \neq i}^{n} \left( \int_{-\infty}^{s} f_j(t)\, dt \right) ds$$

- Quality requirement: the maximum of the $p_i$ must be at least P

Finding variance for MAX
1. Set the variance of each region ($\sigma_1, \sigma_2, \ldots, \sigma_n$) to its maximum possible value
2. Find $p_{i_{\max}}$, the maximum of the $p_i$
3. Find $j_{\max}$, the index $k$ that maximizes $\left| \partial p_{i_{\max}}(\sigma_1, \sigma_2, \ldots, \sigma_n) / \partial \sigma_k \right|$, i.e., the sensor with the greatest impact on $p_{i_{\max}}$

Finding variance for MAX (Cont.)
4. Adjust the variance of the $j_{\max}$-th sensor: $\sigma_{j_{\max}} = \sigma_{j_{\max}} - \Delta\sigma$
5. Keep reducing variances until $p_{i_{\max}}(\sigma_1, \sigma_2, \ldots, \sigma_n) \geq P$
6. Return $\sigma_1, \sigma_2, \ldots, \sigma_n$ as the variances for the n regions

Deciding Set of Sensors
- The average of ns samples follows the normal distribution N(µ, σ²/ns)
- Compute ns satisfying σ²/ns ≤ max variance
- Compute the expected value E(s)
- Select the ns sensors whose readings differ least from E(s)
- Only these sensors send their sampled values to the coordinator for computing N(µ, σ²/ns)

Reselection of Sensors for Continuous Queries (CQ)
- Sensor selection runs again when:
  1. The probabilistic query quality cannot be met (e.g., due to a change of the mean)
  2.
     The coordinator detects that a sensor is faulty (e.g., its value deviates significantly from the majority) or that it gives no response within a timeout period

Simulation Model
- Continuous query length: 1000 sec
- Sensor sampling interval: 5 sec
- Number of regions: 4
- Number of sensors per region: U[100, 150]
- Sensor error variance range: 5-25%
- Difference in the values of different regions: 2-10%
- Quality requirement for MIN/MAX query: 95%
- Variance change step (∆σ): 0.3

Performance Analysis
- Figure: Percentage of Sensors Selected vs. Difference in Regions’ Values
- Figure: Accuracy vs. Difference in Regions’ Values

Performance Analysis
- Figure: Accuracy vs. Sensor Error Variance Percentage
- Figure: Percentage of Sensors Selected vs.
  Sensor Error Variance

Performance Analysis
- Figure: Changes in Value of Regions over Time
- Figure: Percentage of Sensors Selected over Time for Continuous Changes in Values of Regions

Conclusion
- Accuracy is improved by using multiple sensors
- Adaptive sampling period decision scheme
- Limited network bandwidth allows only a limited number of redundant sensors
- The sensor selection algorithm selects good sensors for reliable readings

Future Work
- Region selection
- Reducing the computational complexity of the sensor selection process
- Differentiating bad sensors from “good ones” that report true surprising events
- Hierarchical organization of coordinators
  - How to assign coordinators?

References
1. [VSSN04] K.Y. Lam, R. Cheng, B.Y. Liang, and J. Chau. Sensor Node Selection for Execution of Continuous Probabilistic Queries in Wireless Sensor Networks. In Proc. of the ACM 2nd Intl. Workshop on Video Surveillance and Sensor Networks, Oct. 2004.
2. [SIGMOD03] R. Cheng, D. Kalashnikov, and S. Prabhakar. Evaluating Probabilistic Queries over Imprecise Data. In Proc. of ACM SIGMOD, June 2003.
3. [Mobihoc04] D. Niculescu and B. Nath. Error Characteristics of Ad Hoc Positioning Systems. In Proc. of ACM MobiHoc 2004, Tokyo, Japan, May 2004.
4. [WSNA03] E. Elnahrawy and B. Nath. Cleaning and Querying Noisy Sensors. In Proc. of ACM WSNA’03, San Diego, California, Sept. 2003.

Thank you!
HAN Song
han_song@cs.cityu.edu.hk