Why is Fingerprint-based indoor localization still so hard? Brieuc VIEL, Mikael ASPLUND Department of Computer and Information Science Linköping University, Sweden brivi621@student.liu.se mikael.asplund@liu.se Abstract – Wireless indoor localization systems and especially signal strength fingerprinting techniques have been the subject of significant research efforts in the last decades. However, most of the proposed solutions require a costly site-survey to build the radio map which can be used to match radio signatures with specific locations. We investigate a novel indoor localization system that addresses the data collection problem by progressively and semi-autonomously creating a radio-map with limited interaction cost. Moreover, we investigate how spatiotemporal and hardware properties-based variations can affect the RSSI values collected and significantly influence the resulting localization. We show the impact of these fluctuations on our system and discuss possible mitigations. I. INTRODUCTION The market for location-based services has increased dramatically in the last few years and is projected to grow further in the coming years. Today, mobile devices are often able to determine their position with an accuracy of just a couple of meters under favorable conditions. However, in large buildings the location capabilities drop significantly due to unavailability of satellite signals (i.e., GPS). This gap is well known and many attempts have been made towards achieving indoor localization with acceptable accuracy, including major players such as Google, Bing and also Nokia. Despite these efforts, the indoor localization problem remains largely unsolved. There are three basic methods that can be used for localization (in isolation or in combination). Multilateration or multi-angulation (often are both referred to as triangulation) uses estimated distance or angle to fixed reference points to calculate the position, scene analysis uses knowledge about particular characteristics (such as RSSI values, or image data) for a particular location to match with current sensor values, and finally inertial navigation, which relies on motion sensors to derive the current location. For indoor localization, scene-analysis have been deemed a promising approach since most public and commercial buildings today have a number of Wi-Fi access points which can be used to create a sort of signal strength fingerprint for a given location. There are many successful implementations of this method [1, 2, 3, 4, 5] providing a positioning accuracy of as good as 1.5m. However, scene analysis and fingerprinting by definition relies on having collected a sufficient amount of information beforehand in order to match the current sensor readings with a previously recorded location. Without this information, a fingerprinting-based approach will not be able to provide any location information to the user. Unfortunately, collecting this information is both time-consuming and error prone. In this paper we take a closer look at the basic conditions for indoor localization through fingerprinting. In particular we focus on the collection and sharing of fingerprints among multiple devices. We present a novel approach for non-intrusive user input to match fingerprints with physical locations. Moreover, we discuss a previously neglected issue related to the sharing of fingerprints among multiple devices. Many works assume that the fingerprint made by one device can be used by another device. Despite the presence of results showing that this is not the case [6], this continues to be used as an implicit assumption. In this paper we strengthen previous results and show that RSSI variations can occur also between identical models of popular high-end smartphones. We show that our system still achieves acceptable results despite this difference, but we believe that it is a problem which needs further study by the research community. The contributions of this paper are two-fold. First, we present a novel fingerprinting-based indoor localization system which provides a reasonable tradeoff between the amount of user interaction and quality of service. Second, we show that the quality of information of RSSI signals is less than what is implicitly assumed by many similar approaches. The rest of this paper is organized as follows. Section II describes related works. Then in Section III, we present the system we designed. Section IV studies the impact of the RSSI variations. Finally, Section V concludes this paper. II. RELATED WORK Researchers have proposed different approaches to determine device location in various indoor environments. Cell-phone manufacturers have proposed to increase GPS performance by installing GPS-repeater modules inside buildings to provide more accuracy to GPS devices. These modules are still too expensive and do not consider complicated signal’s propagation inside buildings. Then either large empty RSSI (dBm) -55 Signal 1 -75 Signal 2 Signal 3 -95 1 51 101 151 201 251 301 351 Measurements 401 451 501 551 Fig. 1. Wi-Fi RSSI variations over the time at fixed position room or huge number of repeaters is required to provide high accuracy[2]. WLAN-based fingerprinting is another interesting technique proposed in solving the indoor localization problem. It proposes to use the received signal strength and properties from different access points to create a unique combination of data. Therefore, a constraining preliminary phase is required to collect fingerprints in the specified area. Such a system doesn’t manage environment changes like moving and unpredictable obstacles which may significantly deteriorate the relevance of the data set. Several predictive algorithms and probabilistic models have been implemented to achieve the matching between measured data and recorded fingerprints. A wide range of wireless technologies has also been exploited like Wi-Fi[8] network, GSM[9] or Bluetooth [10], but also environmental characteristics such as: color, light and sound [2]. Veljo Otsason [9] proposed an indoor GSMbased fingerprinting localization system with median accuracy of 5 meters in a large multi-floor building. To determine the actual location, the measured signals strengths are compared to the recorded data using the Weighted K-Nearest Neighbor algorithm to define the closest estimated position. This GSM-based system has shown good results and high accuracy in different types of multi-floor building. RADAR [8] project was the first attempt to Wi-Fi-based fingerprinting. Contrary to the other project existing at that time (year 2000), RADAR doesn’t require any special equipment, it only uses already deployed wireless networks. As Otsason’s system, a constraining data collection phase is needed before utilization. Matching is done using the Nearest Neighbors in Signal Space algorithm. However, these two systems present significant limitations: the long training and the low adaptability to environmental changes (wall modification, new furniture or addition of a new wireless network). In this case, a total reconfiguration of the fingerprints data set is required. A major innovation was presented by SurroundSense (2009) [2], which proposed to combine GSM, Wi-Fi and Bluetooth characteristics with environmental properties like light, color and sound intensities to achieve better accuracy. The resulting fingerprints present a higher uniqueness compare to other systems but it also complicates the data collection and matching. In 2009, PILS [3], a GSM and Wi-Fi combined project presented an important innovation while proposing user-friendly system that limit user interaction and reduce training-phase. The collection phase is now partially integrated in the running-time. This is the first step to site-survey suppression and user-friendly data-collection, providing fast and easy set-up and configuration. However, time is still required before the database get enough fingerprints to perform really efficient localization. Even if the global method is similar, our implementation is slightly different, focus is given on data collection and fingerprint generation to provide more reliable and stable signals information. So we defined a new and updated technique record more representative RSSI values by simplifying this data collection and reducing the user-interaction needed. More recently, Fingerprinting method has been combined with Pedestrian Dead Reckoning (PDR) technique to provide an hybrid localization approaches. Pazl [7] presents this innovative combination, but even with sufficient number of samples accurate location estimation is still difficult because the system only use one fingerprint and has to proceed rapidly to combined with PDR. Moreover, this system can be quite expensive from an energy consumption perspective if solely relied on for continuous location tracking. Fig. 2. System Architecture III. SYSTEM DESCRIPTION In this section, we give an overview of our system, we explain our motivation, present the main functionalities and explain the main principle and algorithms designed. We propose a novel adaptive indoor localization system dedicated to Android smartphones based on Radio-Frequency infrastructures such as GSM, 3G and Wi-Fi networks to provide room recognition and localization. Average Euclidean Distance Vs Success Rate 1 0,8 0,6 0,4 0,2 0 Average ED (dBm) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 Localization attempts per Location Fig. 3. Average Euclidean Distance and Success Rate computed during the evaluation Xperia Nexus Nexus Galaxy SII -10 -30 -50 -70 RSSI (dBm) -10 -30 -50 -70 RSSI (dBm) RSSI (dBm) -10 -30 -50 -70 Wi-Fi signals 1 2 3 4 5 6 7 8 9 10 Wi-Fi signals 1 2 3 4 5 6 7 8 9 10 Wi-Fi signals 1 2 3 4 5 6 7 8 9 10 SII #1 SII #2 Fig. 4-6. Wi-Fi RSSI pair analysis A – Main principle We designed a system which should work with any Android smartphone without distinction of brand or model; for this reason we had to apply some strong requirements. Based on hardware properties [9] and analysis [4] we choose to control the number of signals recorded per fingerprint to a maximum of 6 GSM and 10 Wi-Fi signals per fingerprint. According to related studies [5], uniqueness of fingerprint and probability of successful increase significantly with size up to 6 signals. GSM and Wi-Fi both share common properties like ID and RSSI values. They also present the same response to external factors and parameters that may cause significant variation to signal strength value. We observe that not only does the signal strength of GSM and Wi-Fi (Figure 1) present huge variations over time (a well-known fact) but also depending of the individual device used. Figure 1 illustrates the variations of three Wi-Fi signals strength (RSSI) at a fixed position over a period of 10min (1 sample per second). Unpredictable variations may affect signal strength, produced by simple environmental changes and factors, i.e. the number of people present or moving around the building, the walls, floors, ceilings or other architectural constraints [11], or simply the position of the user when taking the measurement. Even a simple weather condition change, may cause significant variations [12]. Properties of the device used may also affect the measurements: variations have been noticed even between phone of same brand and model. To reduce the impact of the variation of received signal strength, we collect several RSSI data per fingerprint and apply a smoothing algorithm to limit exposition to strong and sudden variations. This method results in more representative data, despite the possible environmental or materiel variations. The system is organized around four basics modules shown in Figure 2: the User interface (UI) module, the Localization module (fingerprint creation and localization algorithm), the Scanner module (GSM, Wi-Fi and motion captors) and finally the Database module. Due to hardware limitations , smartphones usually process and record only the six best GSM neighbor networks [14, 15]. Since our system is designed for commercial Android smartphones without any restrictions (except casual sensors like Wi-Fi, GSM and Accelerometer), we chose to consider a maximum of six GSM signals per fingerprint. On the other hand, Wi-Fi interface does not have such a restriction and may return up to fifty different networks. Previous work [9, 15] show that the localization efficiency is linked to the size of the fingerprint used, based on evaluations done on different localization systems and methods. The error rate usually strongly decrease when size increases up to 10 signals and then decrease slowly until size equals 15 signals. After 16, the error rate becomes rather stable, implying that no more performance increase is provided by adding more signals. Based on this study, we chose to define the size of our fingerprint to 6 GSM signals and 10 Wi-Fi signals (i.e., a total of 16 signals). The temptation is high to collect as much signals as possible for each fingerprint but this would lead to higher measurements, computation and storage costs without significantly increase the localization accuracy. Fingerprint is related to a relative location. This position is defined by Cartesian coordinates X and Y representing the position on a map corresponding to the current building area. These coordinates are relative coordinates; they only make sense on the corresponding map (Cartesian plans) and would be meaningless if used on another map or even with a deformed map. We assume that the recognition of the corresponding map and the possibility to manage (add and modify) a map in the system does not affect the accuracy or the precision of the system. For these reason, map management has been considered but not implemented. In order to save the collected data, we defined a database that contains the data relative to the fingerprints (signal ID + RSSI) and to the locations (x, y coordinates + related map). The system uses a local database stored in user’s smartphones, which contains all the data needed to be localized. B – Three-steps algorithm In order to propose a crowd-sourcing-based autonomous system, free of site-survey and requiring a minimized human interaction, we have designed a novel localization algorithm. The general process is the same as in other indoor localization process: when a localization attempt is launched, the system will record the fingerprint of the current network environment that will become the reference data. This reference fingerprint (R-Fingerprint) will be compared to the data already recorded into the database to find a possible matching and related location. Our localization algorithm is divided into three steps in order to achieve a more representative data collection, data processing and accurate localization estimation. In the first step, the system performs a pre-selection among the previously recorded fingerprints in the database set, focusing only on relevant data. For each fingerprint from the database, the system analyzes the number of signals in common with the reference fingerprint (RF) (based on signal ID) and keeps only those respecting a predefined threshold. By this method, we ensure that every fingerprint the system will compare has at least a predefined proportion of signal IDs in common with the R-Fingerprint. To ensure the accuracy of our comparison, the preselected fingerprints will be temporary truncated to keep only the common signals. So every pre-selected data has the same size and the same percentage of corresponding signals with the RF. The second step is the comparison between resulting set of step one and reference fingerprint. The comparison is done based on the Euclidean Distance (ED) calculated between the strength value of each RF signals and the other fingerprint signals. The unit of ED is dBm per signal. If no satisfying location is returned, the system proceeds to step three. The Weighted k-Nearest Neighbor estimation method is applied in step three to evaluate the current location based on the x and y position of the k closest fingerprints based on their respective ED. A weight is also applied to the position of each of the k-nearest neighbors based on its ED. The innovation of this algorithm lies first in the way it selects the data to process, making the result more accurate by comparing RSSI data of same common size fingerprints. We insure that the fingerprints compared have exactly the same number of signals. Novelty is also presented by the algorithm adaptation capacity and the flexibility of the result returned. Adaptation and flexibility are key points in indoor localization, since environmental changes are most likely to happen and impact the wireless signal propagation and so the corresponding fingerprint. Even the morphology of the user or the way he carries his phone may have an impact. In this case and in comparison with existing systems, our system will automatically adapt itself and overpass this change by progressively updating its database. The flexibility of our algorithm makes it possible since it doesn’t return a perfect value but a matching state based on the ED calculated. These three states are the Perfect Matching (we assume the returned location is true), the Possible state (confirmation is asked) and the Estimation State (step three of the algorithm). With these states, the system is able to manage environmental changes and limit the impact on the resulting localization. C – Data Collection and interval scanning We propose to suppress the site-survey phase and include data collection into the normal utilization process. The interactions required are then limited and no offline phase is required. Data collection is accomplished by two different methods. First, by giving the opportunity to add a new location in the database or correct the proposition of the localization process. The second way to collect data is the interval scanning module. Based on acceleration magnitude measurements, the system applies a motion detection (Fig. 7). When the device (and by extension the user) stays stationary over a period of time, the system performs a continuous data collection. If the fingerprint of the current position is not already recorded in the database, a new fingerprint is created compiling the average values collected until the user move away. Then the system asks the user to relate the fingerprint with his previous location on a map. The basic idea behind this behavior is that it is more convenient for user to remember his previous location during a period of time he stayed stationary rather than providing it precisely at a certain time. Fig. 7. Motion detection and interval scanning This system provides more fingerprints and locations to the database with a limited interaction. This also gives the system the freedom to report the confirmation to a future moment more convenient for the user (e.g., moment where he comes back to his desk/table). A similar system has been presented in PILS project [3], with less autonomy allowed to the system and more interactions imposed to the user. More details are asked to the user about the position to link the fingerprint to the corresponding location (name, location, function). Moreover, the PILS system doesn’t compare the data with the already recorded ones, creating this way doubles locations and fingerprints into the database. Fig. 8. Evaluation area and targeted locations , IDA building LiU. D – Evaluation and results To evaluate the reliability of our system, we defined a protocol to analyze the result of the localization process at 10 predefined positions, performing thirty attempts for each location, so a total of 300 measurements. This evaluation was done in an academic building of Linköping University (Fig. 8 800m²), in normal use conditions with full-capacity and performance. No consideration is given to the size of the user, to the position the user carries the phone in since the variations resulting from these parameters must be override by the system. To analyze the results of this evaluation, we recorded the closest Euclidean Distance generated during each localization process, the database size and the number of user interactions during the evaluation (Fig. 3). From the results of this evaluation, we observe two temporary period before achieving full-capacity. During the first ten localization attempts, the system collects the first data to provide localization and progressively fills the database. Then system keeps collecting data to provide complete overview of every location. Over the 300 values calculated during this evaluation, we achieved a success rate of 82% correct localization with roomlevel accuracy and an Euclidean Distance of 0.47 dBm. However, even after thirty attempts, localization failure may still happened due to access point signal punctual disappearance or exceptional meteorological conditions. By relating results values and figure 8 together, we can assume that our system is less efficient in high room-density area (locations 1, 4, 5 and 6) and especially with small size rooms (locations 5 and 6). We also observed an unpredicted failure, “false positive” localization may happen, especially during fifteen first localization attempts when data signal map is not complete enough to manage every situation and in high density areas previously described. Even if these “false positive” results have an short-term impact on the localization (false location is returned to the user), it will be automatically overwritten by the fingerprints collected at mid-term. The system will continuously record new data that will reduce the accuracy and ED of the “false positive” fingerprint. After evaluation, we define that ”False Positive” values only represent 4% of the final amount of localization attempts. Finally, we investigated the number of user interactions required during the utilization of our system. This number of interaction is closely related to the matching case encountered (1 to 3 interactions maximum). For each case, we assume that the launching of a localization attempt (i.e. press the start button) is considered as one interaction, so in the best case, this is the only interaction required. During the evaluation of the system, the number of interactions applied by the user has been recorded. The user interacts an average of 1.57 time with the system per localization attempt. As précised, in the worst case, the user would have to interact a maximum of 3 times per attempt. During the last 10 localization attempts corresponding to a normal utilization context, the average number of interaction required by the system to the user decreases under 1.3 time per localization. IV – IMPACT OF RSSI VARIATIONS The signal strength measured by a given device may fluctuate significantly depending of spatiotemporal and hardware-based properties. We now proceed to describe the impact these variations have on our system and especially on the matching resulting from our algorithm. This experiment consists of measuring simultaneously GSM and Wi-Fi fingerprints on different pairs of devices. We compare devices of different brands (Fig. 4.) devices of same brand but different models (Fig. 5.) and devices of same model (Fig. 6.). Four different devices have been considered, one Samsung galaxy Nexus, one Sony Ericsson Xperia and two Samsung Galaxy SII; all of them running on Android OS (v.4.2, API lvl. 17). For each pair of device, we measured the respective RSSI average per signal (Fig. 4-6.). We observe notable differences between data recorded, even with devices of the same brand and model. Difference of as much as 12 dBm can be observed between data from same signal recorded by different devices. Contrarily to what we expected, devices from the same brand with chipset from the same vendor will not record the same value. Figure 6 shows that variation of almost 9dBm may occurs between these two Samsung mobiles. Even if the size of our experimental set is not consequent, this further strengthens the work presented by Lui et al. from University of New South Wales, Sydney, AU [6]. They showed that different Wi-Fi devices (mostly wireless network adapters) show a considerable variation of RSSI readings (including one case of three identical network adapters). We demonstrate that the same problem can be seen for identical mobile phones of a popular model. Hardware properties-based fluctuations have a direct impact on RSSI value measured by a device and by consequence on the Euclidean Distance calculated during localization process to evaluate fingerprints matching. Pair of devices Nexus vs. Xperia Nexus vs. SII SII vs. SII ED (dBm) 0.6833 0.7858 0.3316 Table. 1. Simulated Euclidean Distance between the respective pair of devices Table 1 presents the simulated Euclidean Distance calculated from data collected during pair devices experiment. Based on evaluations, we assume that an ED of less than 0.5 dBm corresponds to a perfect matching, i.e. the location returned has a high probability to be correct. Between 0.5 and 1.0 dBm, probability is still positive but confirmation should be given by the user. So for two fingerprints taken simultaneously by two different devices in the same conditions, accuracy of the ED calculated may differ significantly and the resulting location returned by our system strongly diverge despite all the precautions taken for data collection. However, this observation is moderated when both devices are of same brand and model. It seems as if some of the RSSI-differences between different devices are systematic, which would suggest that they could be measured, modelled, and accounted for by the localization algorithms. However, considering for example Figure 6, we see that the values recorded by device 1 is higher for some of the signals and lower for the others compared to the readings by device 2. This suggest that one would need a fairly sophisticated learning algorithm to capture the change in RSSI values caused by an individual device. However, Figure 4 and 5, where the differences are similar across all signals, suggests that significant improvements could likely be achieved even without a perfect algorithm. V – CONCLUSIONS Indoor localization remains elusive. Despite the wealth of research papers claiming to have found the solution, something is holding back successful deployment. We believe that one of the main issues lies in the ability to collect relevant context information of high quality about each location. In this paper we present a prototype system where this type of data collection is integrated in the localization tool. We show that this prototype shows acceptable results with relatively simple algorithms. It would be reasonable to assume that pairing our approach with more sophisticated sensor-fusion algorithms could yield even better results. However, our evaluations also point out an inherent weakness with RSSI-based fingerprinting. If the quality of this information source is so low that differences of 12 dBm between the averaged data from two different devices can be observed, then it into question how well crowd-sourcing approaches can be made to work. Obviously, algorithms will need to be both robust and adaptive to cope with this problem. More research is needed to more precisely model variations between devices to be able to account for this source of inaccuracy. It should be noted that all the data used in the work have been collected in a specific academic building with its particular characteristics and properties. Data have been measured using a limited number of devices, with specific environmental conditions. More experiments will be required to fully validate the effectiveness of our algorithm. Moreover, we believe that our observations regarding differences in RSSI should be extended with more experimental data. REFERENCES [1] B. VIEL, “Adaptive Indoor Localization System for Android Smartphones”. Linkoping University, 2013. [2] M. Azizyan, R. R. Choudhury, “SurroundSense: mobile phone localization using ambient sound and light”. Mobile Computing and Communications Review 13, 2009. [3] P. Bolliger, K. Partridge, M. Chu , M. Langheinrich, “Improving Location Fingerprinting through Motion Detection and Asynchronous Interval Labeling”, Proceedings of the 4th International Symposium on Location and Context Awareness, May 2009, Tokyo, Japan [4] K. Kaemarungsi and P. Krishnarmurthy, "Modeling of indoor positioning system based on location fingerprinting", Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'04), China 2004. [5] V. Honkavirta , T. Perala , S. Ali-Loytty and R. Piche "A comparative survey of WLAN location fingerprinting methods", Positioning, Navigation Commun., 2009. [6] G. Lui, T. Gallagher, B. Li, A. G. Dempster, and C. Rizos, "Differences in RSSI readings made by different Wi-Fi chipsets: A limitation of WLAN localization," international Conference, 2011. [7] V. Radu, L. Kriara, and M. K. Marina. Pazl: “A Mobile Crowdsensing based Indoor WiFi Monitoring System”. In Proc. IEEE CNSM, 2013. [8]] P.Babl, et al, “RADAR: An in-Building RF-based User Location and Tracking System.” In Proceedings of the IEEE Infocom, Tel-Aviv, Israel. Mar 2000 [9] V. Otsason, A. Varshavsky, A. LaMarca and E. de Lara, "Accurate GSM indoor location", Springer-Verlag Berlin Heidelberg, UbiComp 2005.. [10] A. Bekkelien, “Bluetooth Indoor Positioning”, Master thesis, University of Geneva, Mar 2012 [11] P. Prasithsangaree, P. Krishnamurthy, P. Chrysanthis, “On indoor position location with wireless LANs”, in The 13th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, vol. 2, 2002. [12] C. A. Boano , J. Brown , Z. He , U. Roedig and T. Voigt "Low-power radio communication in industrial outdoor deployments: The impact of weather conditions and ATEXcompliance", Proc. 1st Int. Conf. Sens. Netw. Appl., Exp. Logistics (Sensappeal), 2009 [13] A. Bekkelien, “Bluetooth Indoor Positioning”, Master Thesis, 2012. [14] V. Otsason, “Accurate Indoor localization Using Wide GSM fingerprinting”, Master thesis, Tartu University, published 2005. [15] V. Honkavirta , T. Perala , S. Ali-Loytty and R. Piche, “A comparative survey of WLAN location fingerprinting methods”, Positioning, Navigation Commun., published 2009.