Masaryk University
Faculty of Informatics

Locating mobile phones using signal strength measurements

Master’s Thesis

Jakub Martinka

Brno, Fall 2019

This is where a copy of the official signed thesis assignment and a copy of the Statement of an Author is located in the printed version of the document.

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Jakub Martinka

Advisor: RNDr. Jiří Kůr, Ph.D.

Acknowledgements

I would like to thank my advisor, RNDr. Jiří Kůr, Ph.D., for all the patience, guidance, knowledge and valuable ideas he provided to me throughout the writing of this thesis. I would also like to thank my beloved wife for her patience and endless support. Last but not least, I would like to thank my family and friends for their encouragement and support.

Dakujem kotol

Abstract

Nowadays, mobile phones have become an essential part of our lives. The ability to track the position of a phone reveals the position of its owner. In this work, different methods of mobile phone location tracking are explained, with the main focus on methods using the signal strength of surrounding mobile network base stations. An Android application was developed for taking signal strength measurements. The data obtained from this scanner were used for the evaluation and testing of the implemented localization techniques: fingerprinting, weighted centroid and trilateration.

Keywords

location, mobile phone, position, tracking, triangulation, trilateration, RSSI, fingerprinting

Contents

1 Introduction
2 RF technology
2.1 Signal propagation
2.2 Mobile phone cellular network
3 Positioning techniques
3.1 Global Navigation Satellite System - GNSS
3.2 WiFi access point
3.3 Bluetooth beacon
3.4 Mobile network connection
3.4.1 RRLP and LPP
3.4.2 Cell-ID
3.4.3 Timing advance - TA
3.4.4 Angle of arrival - AoA
3.4.5 Time of arrival - ToA
3.4.6 Time difference of arrival - TDoA
3.4.7 Weighted centroid
3.4.8 Received signal strength - RSS
3.4.9 Fingerprinting
4 Molotras - Mobile phone location tracking system
4.1 Android scanner
4.2 Core
4.2.1 Cell database
4.2.2 Localization algorithms
4.2.3 GUI
5 Evaluation
6 Future work and possible extensions
7 Conclusions
Bibliography
A Evaluation
B Molotras source files and measured data

1 Introduction

In the last decades, huge technical progress in radio technologies and a change of industry orientation led to the building of new network infrastructures. These made wireless technologies available for public use, and they are nowadays part of everyday life for most people. We use mobile phones on a daily basis and have them constantly at hand. Because of this, the ability to track phones indirectly leads to tracking people.
Information that some user was at some place at some time is not very useful in general, but it may be very important in specific contexts. Mobile operators have to provide all of the information about their customers to law enforcement agencies if the agencies have a corresponding court permission. In some countries there are also data retention rules that order the operators to store user data for a certain period of time, usually 6 or 12 months. This includes not only location information but also communication metadata such as the time and duration of phone calls, from whom or to whom they were made, the text and participants of SMS messages, and other data. Investigators may obtain such data and process it to get detailed information about the user’s communication and also his position over a time period. It is possible to use the data to create a profile of the target including his daily habits, favorite places and the people he meets, and also to reveal relationships between them. Connecting this kind of data with public data from social networks may result in a very precise profile of the target. This may be a good data source for a security agency, but it may be easily misused, as there is sometimes a very thin line between helping and spying.

Some countries prohibit the usage of satellite phones, which do not use a terrestrial network, but allow the usage of phones connected to mobile cellular networks. This indicates that governments have control of the network to some extent. It can be monitored and shut down in case of a critical situation. Collecting the data about mobile phone users that is necessary for the correct functionality of a service is generally acceptable, but sometimes these data are gathered silently without the users’ knowledge and possibly against their will.

Location Based Services (LBS) are very well established in emergency services because the callers are often too young or too badly injured to report their precise position.
The position of the caller should be available at the public safety answering point at the time the emergency call starts or very shortly after. In some countries LBS are even anchored in law. For example, the Federal Communications Commission, the federal agency responsible for implementing and enforcing communications law and regulations in the USA, published an order effective from 2011 requiring telecommunication providers to deliver localization with a precision of 100 meters in 67% of emergency calls and 300 meters in 90% of emergency calls, allowing some exclusions for heavily forested areas [1].

The main topic of this thesis is mobile phone localization. To set this topic into a wider context, radio technology and radio signal propagation are described together with the basics of cellular networks and their topology. Mobile operators possess capabilities to obtain the position of a user’s phone. The methods that can be used by mobile operators, as well as positioning techniques that do not involve mobile operators, are explained in the theoretical part of this thesis.

The aim of this thesis is to evaluate and compare localization techniques based on measurements of the strengths of signals transmitted by a cellular network. These measurements are under some circumstances available to mobile operators. An Android application that simulates the generation of such a measurement report is implemented and used for data gathering in city and village environments. Three different localization methods are implemented and the collected data are used for their evaluation.

The rest of this thesis is organized as follows: In chapter 2 the radio technology is introduced with a focus on signal propagation and its use in cellular networks. Different mobile phone positioning techniques are presented and explained in chapter 3. Chapter 4 describes the implementation of the chosen localization algorithms, which are subsequently analyzed in chapter 5.
Finally, possible future directions and a summary of this research are discussed in chapters 6 and 7.

2 RF technology

Location based services are tightly connected with the physical characteristics of radio signals. The first part of this chapter describes the basic characteristics of a radio signal and introduces the factors influencing radio signal propagation. The second part explains the basics of mobile cellular networks necessary to understand how and what type of information may be used in localization algorithms.

2.1 Signal propagation

Radio frequency (RF) is the oscillation rate of an alternating electric current in the frequency range from approximately 20 kHz to 300 GHz. Energy from RF currents in conductors can be radiated into free space as electromagnetic (radio) waves. The basic characteristics of an electromagnetic wave are its frequency, amplitude and phase. For transmitting digital data, the carrier signal is combined with the data signal by changing these basic characteristics (modulation).

Electromagnetic waves that travel in a direct path from a transmitter to a receiver follow so-called line-of-sight (LOS) propagation [2]. In contrast, non-line-of-sight (NLOS) propagation occurs when the transmission path is partially obstructed and the signal travels along an indirect path. Some of the many phenomena that distort the signal are diffraction, refraction, reflection and absorption by obstacles in the path (e.g. mountains, buildings, trees). Radio waves, or parts of them, may be reflected and scattered multiple times on their way to the receiver, so the signals arriving at the receiver may come from different directions, with different strengths and with shifted phases, resulting in a signal that is hard or impossible to decode. Another effect causing an increase in the error rate is interference, which occurs if there are transmissions on the same frequency from two or more sources. Interference can be explained using the example of people talking.
If Alice and Bob communicate over a distance, they have to speak loudly enough to hear each other. If Cyril starts speaking near them, they have to either speak louder or come closer to each other. The very same applies in RF technologies when the receiver is unable to properly isolate the individual incoming signals. The communication channel and its quality may also vary in time when the transmitter or the receiver is moving through different environments.

All these effects influencing signal propagation have to be taken into account when designing and planning any radio network. Computing signal propagation in a real environment would be very time and resource consuming, as it would require the precise position and composition of all objects in the area together with information about RF propagation in particular materials. A map with precise signal propagation is usually not necessary for network planning, and the whole process is simplified by path loss models.

A model of radio signal propagation is a mathematical equation determining path loss, the measure describing how the signal is attenuated over a specific path. We can generally state that the signal strength at the receiver ($P_r$) is equal to the strength of the transmitted signal ($P_t$) diminished by the characteristics of the path, the path loss ($PL$). This relation is shown in equation 2.1, where $P_r$ and $P_t$ are in dBm and $PL$ is in dB.

\[ P_r = P_t - PL \tag{2.1} \]

The aim of using propagation models is the homogenization of a highly heterogeneous environment. This means that an area (e.g. a city) is modeled as if the signal propagated equally at all places and in all directions, even if the obstacles (e.g. buildings) are of different sizes and composed of different materials.

Free Space Path Loss

The most basic propagation model is the Free Space Path Loss (FSPL) model [2] defined by equation 2.2. FSPL represents how the LOS signal is attenuated over free space, typically over the air.
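To make equations 2.1 and 2.2 more concrete, the following minimal Python sketch computes the free space path loss and the resulting received power. The function names, the 900 MHz carrier and the 43 dBm transmit power are illustrative assumptions and do not come from the measurements in this thesis.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free space path loss in dB, following equation 2.2."""
    wavelength_m = SPEED_OF_LIGHT / frequency_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m)


def received_power_dbm(tx_power_dbm: float, path_loss_db: float) -> float:
    """Received power in dBm, following equation 2.1 (Pr = Pt - PL)."""
    return tx_power_dbm - path_loss_db


# Assumed example values: 900 MHz carrier, 43 dBm transmitter.
loss_1km = free_space_path_loss_db(1_000.0, 900e6)
loss_2km = free_space_path_loss_db(2_000.0, 900e6)
rx_1km = received_power_dbm(43.0, loss_1km)

print(f"FSPL at 1 km: {loss_1km:.2f} dB")
print(f"Extra loss when the distance doubles: {loss_2km - loss_1km:.2f} dB")
print(f"Received power at 1 km: {rx_1km:.2f} dBm")
```

Doubling the distance adds 20 · log 2 ≈ 6.02 dB of loss regardless of the carrier frequency, which is the logarithmic form of a fourfold attenuation.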
The intensity of electromagnetic radiation decreases with distance according to the inverse square law; in other words, path loss grows with the square of the distance. This means that if the distance between the transmitter and the receiver is doubled, the signal attenuation is four times higher, and so the strength at the receiver is lowered to one quarter of the reference signal.

\[ PL_{FS} = 10 \cdot \log \left( \frac{4 \pi d}{\lambda} \right)^2 = 20 \cdot \log \left( \frac{4 \pi d}{\lambda} \right) \tag{2.2} \]

where
$PL_{FS}$ - Free Space Path Loss [dB]
$d$ - distance from the transmitter
$\lambda$ - carrier wavelength in the same units as $d$

Okumura-Hata model

The Okumura model is an empirical model based on extensive measurements taken in Tokyo, Japan. The data from the Okumura model led to the further developed Hata model, also known as the Okumura-Hata model [3], defined in equation 2.3. The path loss equations describe signal propagation in a large city, and correction functions are applied for other environments. The environment categories used in the model are:

Rural area: Large open space, minimum of obstacles
Suburban area: Village or highway, obstacles typically trees and houses
Urban area: City with large buildings, many different obstacles

Parameters and constraints used in the Okumura-Hata model:

$L$ - path loss [dB]
$f_c$ - carrier frequency [MHz], 150 MHz ≤ $f_c$ ≤ 1500 MHz
$h_b$ - base station height [m], 30 m ≤ $h_b$ ≤ 200 m
$h_m$ - mobile station height [m], 1 m < $h_m$ < 10 m
$d$ - great circle distance between base station and mobile station [km]

\[
L = A + B \cdot \log d -
\begin{cases}
C & \text{rural areas} \\
D & \text{suburban areas} \\
E & \text{urban areas}
\end{cases}
\tag{2.3}
\]

where

\[
\begin{aligned}
A &= 69.55 + 26.16 \cdot \log f_c - 13.82 \cdot \log h_b \\
B &= 44.9 - 6.55 \cdot \log h_b \\
C &= 4.78 \cdot (\log f_c)^2 - 18.33 \cdot \log f_c + 40.94 \\
D &= 2 \cdot (\log(f_c / 28))^2 + 5.4 \\
E &=
\begin{cases}
3.2 \cdot (\log(11.75 \cdot h_m))^2 - 4.97 & \text{large cities, } f_c \geq 300\ \text{MHz} \\
8.29 \cdot (\log(1.54 \cdot h_m))^2 - 1.1 & \text{large cities, } f_c < 300\ \text{MHz} \\
(1.1 \cdot \log f_c - 0.7) \cdot h_m - (1.56 \cdot \log f_c - 0.8) & \text{medium / small cities}
\end{cases}
\end{aligned}
\]

Since the Okumura-Hata model is defined for frequencies up to 1.5 GHz, it was extended to cover the 1.5 - 2 GHz band in the COST 231-Hata model. Many radio propagation models are available and they are suitable for different types of applications. More information about propagation models may be found in [2] or [3].

Note on decibels

Signal strength is usually measured either in absolute units such as watts (W), milliwatts (mW) or decibel-milliwatts (dBm), or in some cases in the relative decibel (dB) unit. The decibel is a very important unit in propagation studies and it is also often a source of confusion. To avoid any confusion later in this work, the relations between the mentioned power units are explained in equations 2.4-2.7.

The bel (equation 2.4) is a logarithmic unit of power ratio representing an increase in power $P$ by a factor of 10 relative to the power $P_{ref}$. 1 bel therefore means that the power $P$ is 10 times higher than $P_{ref}$.

\[ P_{bel} = \log \frac{P_{mW}}{P_{ref\,mW}} \tag{2.4} \]

The decibel (equation 2.5) is one tenth of the bel and is more convenient to use than the bel. A power 10 times higher than the reference means 10 dB, a power 100 times higher means 20 dB, and so on. Decibel units are usually used for expressing the path loss, attenuation or gain of RF components (e.g. attenuator, power amplifier, antenna) with reference to the input power.

\[ P_{dB} = 10 \cdot \log \frac{P_{mW}}{P_{ref\,mW}} \tag{2.5} \]

While the bel and the decibel are relative units that can be used only if the reference power is known, the decibel-milliwatt (equation 2.6) is an absolute unit that is referenced to 1 mW.
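The relations between these units can be checked numerically. The short Python sketch below implements the relative decibel ratio (equation 2.5) and the conversions between milliwatts and dBm (equation 2.6 and its inverse); the function names are chosen only for this illustration.

```python
import math


def ratio_to_db(power_mw: float, reference_mw: float) -> float:
    """Relative power ratio in dB (equation 2.5)."""
    return 10.0 * math.log10(power_mw / reference_mw)


def mw_to_dbm(power_mw: float) -> float:
    """Absolute power in dBm, referenced to 1 mW (equation 2.6)."""
    return 10.0 * math.log10(power_mw / 1.0)


def dbm_to_mw(power_dbm: float) -> float:
    """Inverse conversion from dBm back to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


print(ratio_to_db(100.0, 1.0))  # power ratio of 100 expressed in dB
print(mw_to_dbm(1e-10))         # 1e-13 W, i.e. 1e-10 mW, expressed in dBm
print(dbm_to_mw(30.0))          # 30 dBm expressed in mW
```

Because both directions are a logarithm and its matching power of ten, converting a value to dBm and back returns the original power up to floating point rounding.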
Power in dBm therefore represents how many times (on the logarithmic scale) the signal is stronger or weaker than 1 mW. The dBm scale conveniently expresses both very low power ($10^{-13}$ W = −100 dBm) and very high power ($10^{7}$ W = 100 dBm) while the values stay in a "nice" range ⟨−100, 100⟩.

\[ P_{dBm} = 10 \cdot \log \frac{P_{mW}}{1\ \text{mW}} = 30 + 10 \cdot \log \frac{P_{W}}{1\ \text{W}} \tag{2.6} \]

To convert dBm values to mW (or W), equation 2.7 is used. Equations 2.6 and 2.7 are inverses of each other.

\[ P_{mW} = 1\ \text{mW} \cdot 10^{\frac{P_{dBm}}{10}} = 1\ \text{W} \cdot 10^{\frac{P_{dBm} - 30}{10}} \tag{2.7} \]

2.2 Mobile phone cellular network

Radio frequency technology is nowadays used for a large variety of services, but one of the most important is the mobile telecommunication network, which uses carrier frequencies usually in the range 300 MHz - 3 GHz. This section explains the basic principles of mobile phone networks with emphasis on those that are necessary to understand localization techniques, such as network topology and the handover process.

The cellular infrastructure built almost all over the world provides a reliable communication service with massive outreach. Each operator providing a telecommunication service uses its own base transceiver stations (BTS), but it is common practice for operators to share the infrastructure, especially at strategic geographic locations. For example, the only high building in a small village usually has antennas of the base stations of several operators mounted on its rooftop.

Base stations have dedicated frequency ranges at which they operate. This holds for stations of different operators as well as for neighboring cells of the same operator, to avoid co-channel interference. However, base stations may reuse the frequencies of other cells if they are out of reach of each other, so that interference cannot occur.

Figure 2.1: Cell density and size difference in different area types. Image from [5]

The topology and range of cells vary with the amount of expected traffic in the area.
Differences in cell sizes are pictured in figure 2.1. City centers are usually densely filled with base stations with a very low range, which makes frequency reuse possible while maximizing the number of served mobile stations as well as the network throughput. In contrast, rural areas are usually covered by a few strong transmitters, typically placed at high terrain points to maximize the signal outreach.

Moving through the areas of operation of multiple cells requires a mobile station to switch between different base stations without any decrease in the quality of service. The process of switching the serving cell is called a handover [4]. A handover should provide a smooth user experience during a phone call even if the user is moving at high speed. It avoids termination of the call or data session whenever one of the communicating parties moves out of the range of its serving cell. The network needs information from the mobile station (MS) regarding the signal quality of the surrounding base stations to decide whether a handover is needed at all and, if so, which station is the best candidate to be the new serving cell. The new base station is instructed to open a new communication channel for the MS and the old channel is closed at a specified point in time. The network then keeps tracking the new reports from the MS and is ready for another handover or other events.

Figure 2.2: A handover between two cells. Image from [6]

The signal strength information is gathered through messages called measurement reports [7], in which the MS reports the measured signal strength and identity of the surrounding cells. This message consists of the signal strength measurement of the registered cell followed by up to 6 neighboring cells, each with station identifier and signal level fields. The operator knows the positions of his BTSs, and from the report, which is sent multiple times per second, he also knows the signal strengths of the towers at the MS position.
Section 3.4 explains how different methods may be used to approximate the MS position from this information.

Knowing the antenna type of the base station may increase the precision of MS localization or, conversely, of the localization of the base station itself. There are many different types of antennae, and in general we can divide them into two large groups: omnidirectional and sector. They differ in their ability to transmit the signal in different directions. While an omnidirectional antenna forms a donut-shaped radiation pattern, spreading its signal almost equally in all directions, a sector antenna transmits only in the direction of its principal axis and into usually unwanted side lobes. The radiation patterns of both types of antennae are shown in figure 2.3. Both are used in mobile telecommunication networks, but sector antennae are prevalent.

Figure 2.3: Comparison of radiation patterns of omnidirectional and sector antenna. Images from [8], [9]

The telecommunication network is part of the critical infrastructure of every country. It is necessary for the successful coordination of operations as well as for providing emergency services. Official information about the base stations in the network is not accessible to the public because of potential strategic attacks and competition between telecommunication providers. However, antennae can be seen on rooftops or tracked down by signal strength, so the data are gathered into public crowdsourced databases by many contributors. Having information about the network topology opens space for commercial use of location based services built on the cell network as well as for further research into new positioning algorithms.

3 Positioning techniques

Because of the high demand for precise location tracking on both the developers’ and the customers’ sides, there are various methods that meet this demand, each with its specific pros and cons.
The most relevant factors are availability, power consumption and accuracy. Sometimes a combination of multiple methods is used to obtain better accuracy and therefore higher reliability of the system.

Triangulation and trilateration [10][11], both visualized in figure 3.1, are well-known techniques for determining the position of distant objects. They are used in many different applications and yet are sometimes misunderstood or confused with each other.

Triangulation is used in situations where we know the positions of two out of three points and the angles between the line connecting the known points and the lines towards the third point. For example, two observers A and B are on a coastline at known positions. Both can see a boat X under angles α and β respectively. With the known distance d between A and B and the two angles, a triangle can be constructed following basic geometry. The distance between X and the coastline, or between X and the observers, may then be easily computed using trigonometry. The main idea is therefore to use the distance between the known points and the directions from the known points towards the unknown point.

Trilateration uses known distances from at least three points A, B, C to the unknown point X. Each known point is set as the center of a circle with a radius equal to its distance to X. These three circles intersect more or less precisely at the position of X, depending on the precision of the measured distances to the unknown point X.

Figure 3.1: Trilateration and triangulation methods

The following sections contain the basics of the most common methods used nowadays for location tracking of mobile phones: GNSS and techniques for obtaining GNSS information from the MS, WiFi and Bluetooth based methods, and the largest group, formed by methods based on a connection to the cell network. The last three methods described in this chapter were implemented (chapter 4) and evaluated (chapter 5).
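The coastline example of triangulation from the beginning of this chapter can be worked through numerically. The following Python sketch is only an illustration under simple assumptions: a flat plane, angles measured between the baseline AB and the lines of sight towards X, and a hypothetical function name.

```python
import math


def triangulate(baseline_m: float, alpha_rad: float, beta_rad: float):
    """Return (|AX|, |BX|, distance of X from the line AB).

    alpha_rad is the angle at observer A and beta_rad the angle at
    observer B, both measured between the baseline AB and the line
    of sight towards the unknown point X.
    """
    gamma = math.pi - alpha_rad - beta_rad  # remaining angle at X
    if gamma <= 0.0:
        raise ValueError("the two lines of sight do not intersect")
    # Law of sines: each side divided by the sine of its opposite angle
    dist_a = baseline_m * math.sin(beta_rad) / math.sin(gamma)
    dist_b = baseline_m * math.sin(alpha_rad) / math.sin(gamma)
    # Perpendicular distance of X from the baseline (the coastline)
    offshore = dist_a * math.sin(alpha_rad)
    return dist_a, dist_b, offshore


# Assumed example: observers 100 m apart, both seeing the boat under 45 degrees.
dist_a, dist_b, offshore = triangulate(100.0, math.radians(45), math.radians(45))
print(f"|AX| = {dist_a:.2f} m, |BX| = {dist_b:.2f} m, offshore = {offshore:.2f} m")
```

In this symmetric case the boat lies 50 m from the coastline, opposite the midpoint between the two observers.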
3.1 Global Navigation Satellite System - GNSS

A GNSS [12] is a constellation of satellites orbiting the Earth and perpetually broadcasting a radio signal towards the surface. The data sent over the radio channel include accurate timing information obtained from the atomic clock on board the satellite, and position data. All satellites broadcast rough position information about themselves and all other satellites in the so-called almanac, which is considered valid for weeks or even several months. Each satellite sends its precise position information and health state in ephemeris data, valid for up to 30 minutes. Freshly started devices with no prior information about the satellites’ positions may improve their time to first fix by obtaining the almanac and ephemeris from a source other than the satellite, for example from a mobile network. These terrestrial transmitters resend the satellite data, providing the so-called Assisted GNSS (A-GNSS) service.

Probably the most popular global navigation satellite system is the American Global Positioning System (GPS). Other systems in use are the European Galileo, the Russian GLONASS and the Chinese COMPASS, also known as BeiDou-2. Satellite navigation is the most accurate localization method; its error is usually less than 8 meters in ideal conditions. Factors that influence GNSS accuracy are, for example, weather, indoor NLOS environments and high buildings. In cases when GNSS positioning is not available, applications may fall back to other types of localization.

Every GNSS should be designed so that the signal from at least 4 satellites is available at every place on the Earth. A receiver, a chip with an antenna, decodes the signal and computes distances from the received signal delay of at least 4 satellites. Even the theory of relativity has to be applied to compute the distances precisely. With known distances, a multilateration method is applied to compute the chip’s position.
The resulting position should be the intersection of the spheres virtually created around the satellites, each with a radius equal to the computed distance to the respective satellite, as shown in figure 3.2. Because spheres are used in the multilateration algorithm, the computed position consists of latitude, longitude and altitude, so the height of the receiver above sea level may be computed, too. This is also the reason why at least 4 satellites have to be visible, instead of the 3 reference points that are used for trilateration in a 2-dimensional plane, where the altitude parameter is omitted.

Figure 3.2: GNSS multilateration. Image from [10]

The geographic coordinate system using latitude and longitude is used for locating a point on a sphere. For computing the distance between two points on this sphere we use the Haversine formula 3.1, which computes the great circle distance, the shortest distance between two points on the surface of a sphere. The Earth is usually approximated as a perfect sphere with a radius of 6378.137 km, the radius at the equator. The radius at the Earth’s poles is actually about 21 km smaller. The errors of this approximation are negligible for relatively small distances between two points.

\[
\begin{aligned}
a &= \sin^2(\Delta\varphi / 2) + \cos \varphi_1 \cdot \cos \varphi_2 \cdot \sin^2(\Delta\lambda / 2) \\
c &= 2 \cdot \operatorname{atan2}\left(\sqrt{a}, \sqrt{1 - a}\right) \\
d &= R \cdot c
\end{aligned}
\tag{3.1}
\]

where
$\varphi$ is latitude [rad]
$\Delta\varphi$ equals $\varphi_2 - \varphi_1$
$\lambda$ is longitude [rad]
$\Delta\lambda$ equals $\lambda_2 - \lambda_1$
$R$ is the sphere radius (Earth) [m]
$d$ is the great circle distance between two points [m]

3.2 WiFi access point

Another possible technique for location tracking is based on the location of WiFi access points. The device has to have WiFi turned on, but it does not have to be connected to any access point. The mobile station receives information about all access points in close proximity, so if it also has information about the location of these access points, it is able to locate itself precisely. It is hard to obtain an accurate and large database of all WiFi access points all over the world without using the power of the crowd.
There are many crowdsourced databases of access points (identified by MAC address) and their GPS coordinates. Anybody may contribute to these projects, either by manually adding the location of their WiFi access point or by running an application that automatically monitors the nearby area and sends the measured data to a remote server, which filters the data and updates the database. There are applications focused specifically on this purpose and also applications that monitor the surrounding environment just as a side effect while providing other services to the user. Typically, applications that use this type of database also collect new data and contribute to it.

The WiFi access point based technique consumes less power on the MS than GNSS but is also less accurate. Applications usually use this method together with the Cell-ID method as an alternative to GNSS, for example in cases when users do not need very high precision but only a rough location estimate.

3.3 Bluetooth beacon

A Bluetooth beacon [13] is a hardware transmitter from the category of Bluetooth Low Energy (BLE) devices that transmits its universally unique identifier, optionally with other data, to the nearby area with low energy consumption. A beacon is a one-way communicator: it transmits data but does not receive any. It is therefore up to the receiver to react. Typically a mobile application monitors the Bluetooth devices in close proximity and, once it captures specific data, pushes a notification to the user. Beacons are installed, for example, in shops or other centers of interest and are mostly used for advertisement when the user has a compatible application active on his device. The important fact here is that an application has to be installed on the user’s device, and that application is the element responsible for determining the location.
Another use case involves multiple beacons in a building, so that a trilateration technique may be applied to find the user’s position and navigate him through the building.

3.4 Mobile network connection

A large group of phone localization algorithms is based on the phone’s connection to the cellular network. These algorithms typically use the fixed positions of base stations to determine the phone position. Mobile operators have the advantage of knowing the exact positions of the base stations together with their technical specifications. Some methods, e.g. Angle of Arrival, Time of Arrival and Time Difference of Arrival, require additional network cooperation such as additional synchronization messages or antenna modifications, so they are unusable for any third party.

Location tracking based on the cell network is certainly less accurate than positioning by GPS, but it may locate the user’s device with a precision better than 100 m, which makes it usable for many applications that do not need very high precision or real time availability, and in situations where a GPS position is not available.

Positioning methods may be divided into 2 groups according to the actor performing the actual computation:

Network-based - data from the MS are captured by one or multiple BTSs and then forwarded to a central processing point that estimates the MS position. In this case the network computes the MS position from the data provided by the MS.
Handset-based - the MS obtains signals from multiple (more than 3) sources with known positions and determines its own position. A well-known example is GPS.

Both approaches are actually symmetric: the former uses one transmitter (MS) with an unknown position and multiple receivers with fixed positions (BTS), and the latter uses multiple transmitters with known positions (BTS) and one receiver (MS) with an unknown position. It is possible to use some methods, e.g. AoA (subsection 3.4.4), in both network-based and handset-based modes.
3.4.1 RRLP and LPP

The Radio Resource Location services Protocol (RRLP), used in GSM/UMTS networks, and the LTE Positioning Protocol (LPP), used in LTE networks, are protocols for the exchange of position messages between the MS and the BTS [14]. The base station may request measurement data (signal strength, Doppler, etc.) from the phone and compute the position itself, or let the MS compute its position and request its coordinates directly. This request does not have to be part of any session (call, SMS), so it may be performed without the knowledge of the user.

The use of RRLP/LPP was specified for emergency services, but it can also be used by law enforcement agencies. Global navigation systems are used in mobile networks, providing much more precision than is needed for correct network functioning but making the whole localization process in critical situations much simpler and more accurate.

3.4.2 Cell-ID

Every phone connected to a mobile network is connected to one base station at a time. In the Cell-ID method, the position of the MS is taken to be equal to the position of the currently serving cell. When the cell identifier is known, the position of the station may be looked up in a crowdsourced database. The accuracy of this method is highly dependent on the physical infrastructure of the network, since base stations may have an effective radial range from 10 m to 30 km. Knowledge of the basic characteristics of the base station, including its output power (range) and antenna type (explained in section 2.2), therefore improves the precision. This method is often used as a fallback if no other method is applicable. It is also possible to combine it with other methods to improve accuracy, for example with Timing advance, described in the next section.

3.4.3 Timing advance - TA

While it is possible to obtain the center of an area and its maximum range with Cell-ID, this area of possible MS locations may be reduced with Timing advance.
In technologies using Time Division Multiple Access (TDMA) such as GSM and LTE, a TA parameter is used to correct signal delays caused by the distance between transmitter and receiver. The base station monitors these delays through specific messages and then commands the MS to increase or decrease its TA, the value specifying how far in advance the MS needs to send its data to fit its allocated time window. TA values are limited by the speed of the signal (the speed of light). Each TA step corresponds to approximately 554 m in GSM and 78 m in LTE. For localization purposes, a TA value gives the minimal and maximal distance from the base station, forming an annulus around it, or only a sector of the annulus if the base station uses a sector antenna. Signal reflection causes large inaccuracies: the NLOS path of the connection is longer than the LOS path, so the TA is higher, while the actual physical distance between the base station and the MS is shorter than the NLOS signal path.

3.4.4 Angle of arrival - AoA

Some antennae are able to determine the direction of the incoming signal. A similar result may be reached by using multiple directional receivers. The Angle of Arrival [16][15] may be used in both handset-based and network-based setups, but both require the mentioned special hardware solution. As the name of the method suggests, the underlying localization technique is triangulation. In the network-based approach, multiple stations track the direction of the signal transmitted from the MS and send these data to a central processing point, where the intersection of the lines following the angle of arrival directions is computed. This intersection is the MS position estimate.

Figure 3.3: Triangulation used in AoA method. Image from [15]
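The intersection step of the network-based AoA approach can be illustrated with a short sketch. It assumes a flat local coordinate system in metres and bearings measured clockwise from north; the function name and the coordinate convention are illustrative assumptions, not part of the thesis implementation.

```python
import math


def bearing_intersection(p1, bearing1_deg, p2, bearing2_deg):
    """Intersect two bearing lines in a flat x/y plane (metres).

    Assumption: bearings are measured clockwise from north (+y axis).
    Returns (x, y), or None when the two lines are (nearly) parallel.
    """
    # Unit direction vectors: bearing 0 points north (+y), 90 points east (+x).
    d1 = (math.sin(math.radians(bearing1_deg)), math.cos(math.radians(bearing1_deg)))
    d2 = (math.sin(math.radians(bearing2_deg)), math.cos(math.radians(bearing2_deg)))

    # Solve p1 + t * d1 = p2 + s * d2 for t using 2D cross products.
    cross_d = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross_d) < 1e-12:
        return None  # parallel bearings never intersect in a single point
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t = (rx * d2[1] - ry * d2[0]) / cross_d
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


# Two stations 1 km apart, one seeing the MS to the north-east,
# the other to the north-west.
estimate = bearing_intersection((0.0, 0.0), 45.0, (1000.0, 0.0), 315.0)
```

With more than two stations the lines rarely meet in one point because of measurement noise, so a practical system would more likely use a least-squares estimate over all bearings.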
Symmetrically, in the handset-based setup the MS tracks the direction of incoming signals from the surrounding base stations and computes its own position using the same triangulation algorithm as the network-based approach. This setup is feasible only if the precise positions of the BTSs are known and the MS is able to measure the angle of arrival. The latter is not trivial, as additional hardware is necessary. AoA is very sensitive to multipath distortion because the very same sample of the received signal arrives from multiple directions, so the angle cannot be determined precisely, only as a range. To overcome inaccuracies caused by multipath, the later samples are usually filtered out, as they are considered to have propagated along an NLOS path.

3.4.5 Time of arrival - ToA

Time of arrival (ToA) [11][15] is a popular technique for determining range. The most notable radiolocation system using ToA is GPS. The method is based on three pieces of information:

1. The exact time t_t when the signal is sent from the source
2. The exact time t_r when the signal is received at the reference point
3. The speed of signal propagation c (in most cases the speed of light)

The distance d between the transmitter and the receiver is computed by formula 3.2. The distance is the length of the path that the signal travels over the time t_r − t_t at speed c. The speed of a radio signal may vary according to the medium it is traveling through.

d = c \cdot (t_r - t_t)    (3.2)

For locating the receiver in n dimensions, at least n + 1 distance measurements are necessary. With known distances from reference points with known positions, trilateration or multilateration is used to compute the position of a node.

Figure 3.4: Trilateration with range measurements such as ToA or RSS. Image from [15]
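As a minimal sketch of formula 3.2 and of the trilateration step, the following code converts timestamps to a distance and then locates a node in two dimensions from exactly three ranges. It assumes a flat coordinate system in metres and noise-free ranges; the function names are hypothetical, and the linearised solution shown here is only one of several ways to trilaterate.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def toa_distance(t_sent, t_received, speed=C):
    """Formula 3.2: d = c * (t_r - t_t)."""
    return speed * (t_received - t_sent)


def trilaterate_2d(anchors, distances):
    """Locate a point from three anchors (x, y) and three ranges.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 + y2**2 - x1**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 + y3**2 - x1**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear, position is ambiguous")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Three anchors and the exact ranges to the point (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
ranges = [5.0, 65.0 ** 0.5, 45.0 ** 0.5]
position = trilaterate_2d(anchors, ranges)
```

Real ranges are noisy, so the circles usually do not meet in a single point; in that case an optimization over all measurements is the more robust choice.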
While the mentioned GPS is a handset-based use of ToA, a network-based variant may be implemented in the mobile network without any additional hardware, as opposed to the AoA method, because transmission times are already measured by the network thanks to synchronization messages. If synchronization cannot be maintained, it is possible to use the Round Trip Time (RTT) method: the base station transmits a ToA signal and the MS responds with another one, so the base station obtains twice the one-way travel time, which can then be halved. To counteract NLOS effects, the incoming signal should be analyzed to avoid positioning based on signals that have traveled a longer path than the direct one. There are different strategies to avoid NLOS, most of which exclude unreasonable measurements by comparing all the incoming signals and keeping only those that are likely to be LOS signals.

3.4.6 Time difference of arrival - TDoA

Time Difference of Arrival (TDoA) [11][15] is similar to ToA but uses relative instead of absolute times of arrival. Each station measures the time of arrival itself and these measurements are collected at a central processing point. For each pair of base stations, the difference of their measured times of arrival is computed. The difference cancels out a potential time desynchronization between the base stations and the mobile station, because the mobile network itself is synchronized and the error introduced by the MS is the same for all the base stations. Each computed difference defines a hyperbola on the map: the difference of the distances from any point of the hyperbola to the corresponding two base stations is the same. If the difference is zero, meaning the MS is equally distant from both base stations, the curve is not a hyperbola but a straight line.
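The cancellation of the MS clock error can be checked with a small sketch. It assumes a flat coordinate system in metres and perfectly synchronized base stations; the function names and the example coordinates are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def rtt_distance(round_trip_time, speed=C):
    """Round Trip Time: the signal covers the path twice, so halve it."""
    return speed * round_trip_time / 2.0


def tdoa_range_difference(t_arrival_a, t_arrival_b, speed=C):
    """Difference of the distances MS-A and MS-B implied by two arrival times."""
    return speed * (t_arrival_a - t_arrival_b)


def tdoa_residual(point, station_a, station_b, range_difference):
    """Zero when `point` lies on the hyperbola defined by the range difference."""
    dist_a = math.hypot(point[0] - station_a[0], point[1] - station_a[1])
    dist_b = math.hypot(point[0] - station_b[0], point[1] - station_b[1])
    return (dist_a - dist_b) - range_difference


# The MS clock offset is unknown, but it is added to both arrival times
# and therefore disappears from the difference.
ms, bts_a, bts_b = (300.0, 400.0), (0.0, 0.0), (1000.0, 0.0)
clock_offset = 0.123  # seconds, arbitrary
t_a = clock_offset + math.hypot(ms[0] - bts_a[0], ms[1] - bts_a[1]) / C
t_b = clock_offset + math.hypot(ms[0] - bts_b[0], ms[1] - bts_b[1]) / C
residual = tdoa_residual(ms, bts_a, bts_b, tdoa_range_difference(t_a, t_b))
```

A position solver would search for the point where the residuals of all station pairs are simultaneously close to zero.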
A set of 3 base stations forms 3 hyperbolas (one for each pair) and the MS should lie at their intersection, as can be seen in figure 3.5.

Figure 3.5: TDoA hyperbolas formed for each pair of base stations with mobile station at their intersection. Image from [15]

3.4.7 Weighted centroid

The Cell-ID method uses only the registered cell for MS localization. An obvious improvement is to take into consideration not only the position of the serving cell but also the neighboring cells. The measurement report presented earlier carries exactly this information, so it is directly applicable on the network side. The MS position is not identified with the position of the registered cell as in Cell-ID; instead, a cluster of cells visible to the MS is formed. The centroid of a polygon or a cluster is the arithmetic mean of the coordinates of its vertices. The weighted centroid [17] is computed with different vertices contributing to the result with different weights. In this case, the higher the signal strength of a station, the closer the centroid is to that station. This assumption is not always true: two BTSs may have equal measured signal strength at the receiver while one is located further away and transmits a stronger signal and the other is closer and transmits a weaker signal. Situations like this directly increase the position error. The coordinates (c_x, c_y) of the centroid of n vertices V_i = (V_{i,x}, V_{i,y}) with weights w_i are computed according to equation 3.3.

c_x = \frac{\sum_{i=1}^{n} V_{i,x} \cdot w_i}{\sum_{i=1}^{n} w_i}, \qquad c_y = \frac{\sum_{i=1}^{n} V_{i,y} \cdot w_i}{\sum_{i=1}^{n} w_i}    (3.3)

An interesting fact regarding the vertex weight is that the serving cell does not have to be the one with the strongest signal. The main factor affecting the accuracy of the weighted centroid technique is one-sided shadowing. A low measured signal of one or more cells from roughly the same direction, caused by an obstacle, may massively influence the computed coordinates of the centroid.
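Equation 3.3 translates directly into code. The sketch below assumes planar coordinates and positive weights that grow with signal strength; how the weights are derived from the measured power is deliberately left open here, and the function name is hypothetical.

```python
def weighted_centroid(vertices, weights):
    """Equation 3.3: weighted mean of the vertex coordinates.

    vertices: list of (x, y) cell positions
    weights:  list of weights w_i, one per vertex
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("the weights must not sum to zero")
    c_x = sum(x * w for (x, _), w in zip(vertices, weights)) / total
    c_y = sum(y * w for (_, y), w in zip(vertices, weights)) / total
    return c_x, c_y


cells = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
# The third cell is weighted twice as much, so the estimate moves towards it.
estimate = weighted_centroid(cells, [1.0, 1.0, 2.0])
```

With equal weights the function reduces to the plain centroid, which makes the effect of the weighting easy to inspect on a map.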
Some antennae in mobile stations are very sensitive to orientation, so even rotating the phone may produce a different measurement and therefore different centroid coordinates.

3.4.8 Received signal strength - RSS

Another approach to computing distance is measuring the signal strength at the receiver. As shown in equation 2.2, a radio signal in free space is attenuated according to the inverse square law. This can be turned to the benefit of a localization algorithm: once the received signal strength is measured, it can be compared to the signal strength at the transmitter to obtain the path attenuation. Following one of many signal propagation models, the distance can then be computed by inverting the model equation. Instead of computing the path loss given the distance, we compute the distance given the path loss. The obvious drawback of this method is that the transmit signal strength of the base station has to be known, and this information is not public. This is not a problem for a network provider that sets up the stations and designs the whole network topology. For applications without knowledge of the transmit power, the signal strength can be measured at a reference distance (e.g. 100 m) prior to the actual localization. This process may be directly included in the data gathering for crowdsourced databases, so we can generally state that the signal strength of the transmitter at the reference point is available if the position of the transmitter is also available. If the positions of the BTSs in the area are known, the unknown signal strength of the transmitters may also be estimated by performing at least one calibration measurement. The position of the calibration point is known, as is the position of the BTS, so the distance between them can be computed and the signal strength at that distance is measured.
At this point the signal strength at the reference point, the distance between the reference point and the BTS, and the propagation pattern are known, so the actual transmit power may be calculated. This process depends heavily on the accuracy of the propagation model and requires that signals from all the BTSs in the area are measured, so that all of them are calibrated. The core principle of this method is the transformation of received signal strength into distance and then the use of trilateration to obtain the position of the MS. This approach is very similar to ToA (subsection 3.4.5), where time is used as the measured metric instead of signal strength.

3.4.9 Fingerprinting

In the data collection phase of a cell database, a contributor moves through an area and signal strength measurements are taken at regular intervals. The purpose is to collect the globally unique identifiers of the BTSs, while the measured signal strength is used by algorithms that identify the positions of the BTSs. One such algorithm finds the central point of a cluster of measurements, where the signal strength is expected to reach its highest value. In fingerprinting [18] this algorithm is not applied; a database of measurements is created instead. Each signal measurement is bound to GPS coordinates, so the database is actually a map with reference points (places where the measurements were taken) describing which base stations are visible at these points together with their signal strengths. The reference points may be grouped to form small areas with an averaged signal strength for each BTS. These reference points are called fingerprints, and the process of collecting them is called war-driving, war-cycling or war-walking according to the means of transport. After the database is formed, any new measurement can be compared with all the fingerprints. This comparison may be implemented in many ways, but the point is to find the fingerprint that is as similar to the measurement as possible.
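One possible way to implement the comparison is sketched below. It assumes that a measurement and a fingerprint are both dictionaries mapping a cell identifier to a signal strength in dBm, that similarity is the Euclidean distance between the two signal vectors, and that a cell missing on either side is treated as if it were seen at a floor value of −113 dBm. The names and the symmetric handling of missing cells are assumptions made for illustration.

```python
import math

MISSING_RSS_DBM = -113.0  # assumed floor value for a cell that is not visible


def fingerprint_distance(measurement, fingerprint, missing=MISSING_RSS_DBM):
    """Euclidean distance between two {cell_id: rss_dbm} signal vectors."""
    cells = set(measurement) | set(fingerprint)
    return math.sqrt(sum(
        (measurement.get(cell, missing) - fingerprint.get(cell, missing)) ** 2
        for cell in cells
    ))


def best_fingerprint(measurement, fingerprints):
    """Return the key of the fingerprint most similar to the measurement."""
    return min(fingerprints,
               key=lambda key: fingerprint_distance(measurement, fingerprints[key]))


fingerprints = {
    "tile_a": {"cell_1": -70.0, "cell_2": -90.0},
    "tile_b": {"cell_1": -95.0, "cell_3": -80.0},
}
measurement = {"cell_1": -72.0, "cell_2": -88.0}
match = best_fingerprint(measurement, fingerprints)
```

Other similarity measures, such as the Manhattan distance or a rank-based comparison of the visible cells, would fit the same interface.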
The result of this method is therefore a fingerprint whose base station signal strengths are very close to those present in the given measurement. The accuracy of fingerprinting depends on the number of fingerprints. Long-term maintenance requires repeated scanning of all the areas of interest [18]. An error may be introduced if different devices are used for war-driving and for location tracking, because different antennae yield different power measurements, so the fingerprint may not be matched properly.

4 Molotras - Mobile phone location tracking system

Of all the techniques described in chapter 3, the last three were implemented and evaluated as a part of this thesis. Weighted centroid, RSS trilateration and fingerprinting were chosen because they are all based on signal strength measurement, which is crucial when we want to demonstrate localization using measurement reports on the mobile operator side. These methods also need neither precise synchronization with the network, nor additional communication with the network, nor additional hardware or hardware modifications. The implemented system Molotras consists of two standalone parts: a scanner application for Android phones that gathers power measurements of the surrounding cells, simulating the measurement reports, and a core application implementing the localization methods that use the data gathered by the scanner. These main parts of the system are described in greater detail in the following sections of this chapter.

4.1 Android scanner

The Android operating system introduced a permission system to protect the privacy of its users [19]. Every Android application that requests access to sensitive user data (contacts, SMS, photos, etc.) or certain system features (camera, internet access, sensors, etc.) has to be granted a permission of the corresponding class.
The idea behind this Android security architecture is that no application has, by default, permission to adversely impact other applications, the operating system itself, or the user and the user's data. This also includes features like keeping the device awake. Applications requesting location information are limited by the location permission. This is also the case for the Android part of this project. Android distinguishes two types of location permission: ACCESS_COARSE_LOCATION, which allows the application to access the approximate location, and ACCESS_FINE_LOCATION, which allows access to the precise location. Both of these permissions have the protection level dangerous, which means that the user has to grant them explicitly. Accessing the GPS location requires the fine location permission, while accessing information about the currently connected cell and neighboring cells requires the coarse location permission, which is weaker and allows access to a specific subset of classes. Those are of course also accessible with the stronger fine location permission. With respect to the aim of this project, this information about the Android security architecture and permission policy reveals several interesting facts. Location by cell network connectivity is not only theoretically possible but also functional and widely used. It is also part of the Android location API: applications may use the LocationManager class with multiple providers working in the background, including GPS, WiFi access points, cell towers or Internet access. Another fact is that the use of cell network data is guarded by a permission, so there is no privacy breach. The last thing that is obvious from the permission management is that GPS is the best option with respect to accuracy, and the other means may be useful but are expected to give less accurate results. After installing the app, the user has to grant permissions to access the precise position (GPS) and to manipulate data in the file system.
After the app is started, two files, "Documents/Molotras/cell_data.csv" and "Documents/Molotras/shots_data.csv", are created. If they already exist, they remain unchanged. These files are external and public, so anybody should be able to access them either directly in the phone file system (e.g. read them or send them using another application) or from an external device; for example, after connecting the phone to a PC, the data should be visible and available. The scanner app runs only in the foreground, so it stops gathering data once the user switches to another application or turns the screen off. To prevent an unintentional stop of the data gathering, the app prevents the display from turning off automatically. The scanner provides two control buttons: Start/Stop for starting and stopping the data gathering and One-shot for reference measurements. After the start button is pressed, the scanner app requests the GPS position of the device at one-second intervals and registers a callback for each incoming GPS fix. The callback then requests cell information about the neighboring cells. The measurements are shown on the screen in the CSV format and at the same time stored in the cell_data.csv file. After the One-shot button is pressed, the next measurement is stored in shots_data.csv. The app then continues to store new measurements in cell_data.csv. The one-shot feature is used to get reference measurements directly on site during the war-driving. These cannot be stored in cell_data.csv because that could result in an exact one-to-one mapping between the reference measurement and a fingerprint. The structure of a CSV measurement is: a measurement line with GPS coordinates and a timestamp, followed by lines of scanned cells that are visible at the time and position of the measurement.

Figure 4.1: Molotras-scanner in idle (left) and scanning (right) state
The structure is also demonstrated in figure 4.1. The data that can be collected are the following:

- RAT - Radio Access Technology - GSM / WCDMA (UMTS) / LTE
- Cell ID - identifier of the cell
- MCC - Mobile Country Code (e.g. 230 - Czech Republic, 231 - Slovakia)
- MNC - Mobile Network Code
- LAC/TAC - Location area code / Tracking area code
- ARFCN - Absolute Radio Frequency Channel Number
- RSS - Received Signal Strength, measured in dBm
- ASU level - Arbitrary (signal) Strength Unit
- TA - Timing Advance (2G, 4G)
- BSIC - Base Station Identity code (2G)
- PSC - Primary Scrambling Code (3G)
- PCI - Physical Cell Id (4G)
- CQI - Channel Quality Indicator (4G)
- RSRP - Reference Signal Received Power (4G)
- RSRQ - Reference Signal Received Quality (4G)
- RSSNR - Reference Signal Signal to Noise Ratio (4G)

Every cell tower should be globally uniquely identified by MCC, MNC, LAC and CID. MCC and MNC are public information maintained by the ITU (International Telecommunication Union), while LAC and CID are custom properties defined by the operator. For localization purposes it is enough to obtain these four values together with the received signal strength. Problems regarding the scanner include invalid or missing data in the measurement. The phone often responds with invalid values, e.g. the maximal possible value of the corresponding data type. The reason may be that the data are actually unavailable at the time of measurement or that the underlying implementation of the method is not provided. Typically these methods send a request to the baseband processor, which sends the requested data as a response. It may be the case that the baseband processor does not recognize the request, so the data remain in the default, invalid state. Because of this, additional data sanitization and filtering need to be done before use.
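A filtering step of this kind might look as follows. The sketch assumes that each scanned cell has already been parsed into a dictionary with hypothetical keys mcc, mnc, lac, cid and rss, that unavailable fields carry the maximal value of a 32-bit signed integer, and that a plausible signal strength lies between −140 and −30 dBm; all of these are assumptions for illustration, not the format used by the scanner.

```python
UNAVAILABLE = 2**31 - 1  # assumed "no data" marker: maximal 32-bit signed integer

# Fields needed for localization; the key names are hypothetical.
REQUIRED_FIELDS = ("mcc", "mnc", "lac", "cid", "rss")


def is_valid_cell(cell, rss_min=-140, rss_max=-30):
    """Accept a cell only if all required fields carry plausible values."""
    for key in REQUIRED_FIELDS:
        value = cell.get(key)
        if value is None or value == UNAVAILABLE:
            return False
    # Assumed plausibility bounds for a received signal strength in dBm.
    return rss_min <= cell["rss"] <= rss_max


def sanitize(cells):
    """Drop every cell record that cannot be used for localization."""
    return [cell for cell in cells if is_valid_cell(cell)]


scanned = [
    {"mcc": 230, "mnc": 1, "lac": 100, "cid": 1234, "rss": -81},
    {"mcc": 230, "mnc": 1, "lac": 100, "cid": UNAVAILABLE, "rss": -95},
    {"mcc": 230, "lac": 100, "cid": 5678, "rss": -90},  # MNC is missing
]
usable = sanitize(scanned)
```

Keeping the rejected records in a separate list would make it easier to see which phone or radio technology produces most of the invalid data.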
4.2 Core

The core part of Molotras can be further divided into three components: a crowdsourced database of cells, an implementation of the weighted centroid, trilateration and fingerprinting methods, and a GUI for visualizing the results of these methods.

4.2.1 Cell database

The implemented localization algorithms use a cell database for reference positions of cells in the area of measurement. For this purpose the OpenCellid [20] database, available at www.opencellid.org, was used; anybody can contribute to this database and use it for free. There are alternatives to OpenCellid, but they are very similar and are usually forks of OpenCellid, so no significant improvement is expected from using another open database. There are applications that use OpenCellid API queries to obtain a position (data consumers) and applications that contribute new measurements to the database (data providers). OpenCellid gathers the data from data providers, updates the database and releases a new updated version every day. Data consumers do not have to use online API queries, because the complete database with cells from all over the world is available for download and offline use. At the time of writing this thesis, the database is stored and distributed as a 3.5 GB CSV file. The data stored in the database are similar to the data gathered by the implemented Android scanner. The most important fields are RAT, MCC, MNC, LAC and CID, which uniquely identify the cell, and its GPS coordinates given by latitude and longitude. Other values, such as the number of measurement samples of each cell, the source of the measurement, the timestamp and others, are not important for the aim of the thesis. The cell database is integrated into the project by parsing the downloaded CSV file and loading all the data into a MySQL database. For performance reasons the database was first filtered to contain only cells of the Czech and Slovak Republics.
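The filtering and loading step can be sketched as follows. To keep the example self-contained it uses an in-memory SQLite database as a stand-in for MySQL, and it assumes that the CSV header uses the column names radio, mcc, net, area, cell, lon and lat; both the column names and the table layout are assumptions and should be checked against the downloaded file.

```python
import csv
import io
import sqlite3

WANTED_MCC = {"230", "231"}  # Czech Republic and Slovakia


def load_cells(csv_file, connection, wanted_mcc=WANTED_MCC):
    """Load only the cells with a wanted MCC into the `cells` table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS cells ("
        "radio TEXT, mcc INTEGER, mnc INTEGER, lac INTEGER, cid INTEGER, "
        "lat REAL, lon REAL, PRIMARY KEY (radio, mcc, mnc, lac, cid))"
    )
    rows = (
        (r["radio"], int(r["mcc"]), int(r["net"]), int(r["area"]),
         int(r["cell"]), float(r["lat"]), float(r["lon"]))
        for r in csv.DictReader(csv_file)
        if r["mcc"] in wanted_mcc  # filter while streaming, before inserting
    )
    connection.executemany(
        "INSERT OR REPLACE INTO cells VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    connection.commit()


def lookup(connection, radio, mcc, mnc, lac, cid):
    """Return (lat, lon) of a cell, or None if the database has no record."""
    return connection.execute(
        "SELECT lat, lon FROM cells "
        "WHERE radio = ? AND mcc = ? AND mnc = ? AND lac = ? AND cid = ?",
        (radio, mcc, mnc, lac, cid),
    ).fetchone()


# Made-up sample rows, used only to exercise the functions.
sample = io.StringIO(
    "radio,mcc,net,area,cell,lon,lat\n"
    "GSM,230,1,100,2000,16.6,49.2\n"
    "GSM,262,1,5,6,13.4,52.5\n"
    "LTE,231,2,300,4000,18.0,48.9\n"
)
conn = sqlite3.connect(":memory:")
load_cells(sample, conn)
```

Reading the file as a stream keeps memory use low, which matters for a CSV of several gigabytes.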
After the CSV is loaded into the MySQL database, it is ready to be used by the localization algorithms. A typical problem with a cell database is that it does not take into account the direction of the cell tower antennae. The position of the cell tower is usually computed as the centroid of measurements obtained from data providers. If the antenna of the cell is directional, the centroid is incorrectly placed somewhere in the main lobe of the transmission instead of at the origin of the lobe. This means that the centroid method is relatively precise for determining the position of omnidirectional antennae but introduces an error with directional antennae. Once the database is formed, it is impossible to recognize whether a cell record is a precisely determined record of an omnidirectional cell or an imprecise record of a directional cell. For localization of the MS, all cell records are considered to be omnidirectional, so this assumption necessarily introduces an accuracy error. Another issue is the crowdsourcing itself. As the database is open, anybody may contribute correct measurements, but many measurements may be insufficiently accurate, invalid or deliberately misleading. Some records in OpenCellid contain values that are invalid and cannot occur in a real environment, so it is probable that database inputs are sanitized weakly or not at all.

4.2.2 Localization algorithms

Molotras-core is a module developed in Python (version 3.7.3) containing implementations of the weighted centroid method, trilateration based on RSS, and the fingerprinting method, used for comparison and evaluation of these methods. All three methods expect one complete measurement with the same structure as in the Android scanner. The provided data therefore contain a list of cells and the corresponding measured signal strengths as well as the GPS coordinates of the measurement.
The cell list is used to estimate the GPS coordinates of the MS on which the measurement was performed. The haversine distance, explained in section 3.1, between the position estimate and the actual position is then computed and returned as the result of the method. The smaller the distance between the position estimate and the actual position, the more accurate the method is. The centroid technique is essentially an implementation of equation 3.3. For each cell in the input list, the cell record is looked up in the database. If no such record exists, the cell is omitted and the result is computed without it. Such situations usually decrease the accuracy of the result. The weight of the vertices (cells) is derived from the signal strength measured in dBm. Typically the signal strength falls in the interval from −112 dBm (weakest) to −50 dBm (strongest). To use the signal strength as a vertex weight when all the values are negative and a higher number should mean a higher weight, the inverse of the signal strength is used in the computation. For example, if a cell has a signal strength of −82 dBm, its weight is −1/82. The computation then follows equation 3.3: we multiply the position of each cell by its weight divided by the sum of all the weights, which is a measure of how much the cell affects the result. This method assumes that all the cells have the same transmit power, so the weight is computed in the same way for all of the cells. In a real environment, if a phone measures the signal strengths of cells A and B and A has the stronger signal, the phone is not necessarily closer to cell A. It may be, for example, that A is further away but transmits more strongly than B, which is located closer to the phone. Trilateration using received signal strength is technically an improvement of the centroid method. It translates the signal strength into distance.
Generally, if we use trilateration and assume that all the cells transmit at the same output power level, we should obtain results very similar to the centroid. This is because the transformation of signal strength to distance is performed in the same way for all the cells, so we are again in the situation where a stronger signal means a shorter distance. The approach therefore benefits only from additional information, namely the signal strength at the transmitter, in other words the output power of the cell. Usually this information is not available, so an estimate or a calibration is needed. This implementation uses the Okumura-Hata signal propagation model for both the calibration and the computation itself. The calibration is performed at the beginning of the algorithm. The calibration set provided as an input is a list of measurements from the area of interest. To calibrate a cell, the measurement where this cell has the strongest signal is chosen. Then the haversine distance between the cell and the point of measurement is computed, and the Okumura-Hata path loss is computed for this distance. Finally, knowing the expected path loss at the specific distance, we can add the path loss to the received signal strength to obtain the expected output power of the cell. This algorithm is applied to all the cells in the provided calibration list. Each cell from the input list is looked up in the cell database to obtain its position and then in the calibration set to obtain its output power. If at least one of these lookups fails, the cell is omitted from the actual positioning, which probably introduces an error in the resulting position estimate. Once a list of cells with computed distances is available, the trilateration is started. In the ideal case a single intersection of the circles would appear and this point would be the result.
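The calibration and the inversion of the propagation model can be sketched as follows. The code uses the standard urban Okumura-Hata formula for a small or medium city; the carrier frequency of 900 MHz, the base station height of 30 m and the phone height of 1.5 m are assumed example values, since the parameters of the actual implementation are not stated here, and the function names are hypothetical.

```python
import math


def hata_path_loss_db(distance_km, freq_mhz=900.0,
                      base_height_m=30.0, mobile_height_m=1.5):
    """Urban Okumura-Hata path loss (small/medium city) in dB.

    The default parameters are assumptions chosen for illustration. The
    model is usually quoted as valid for roughly 1-20 km and 150-1500 MHz.
    """
    log_f = math.log10(freq_mhz)
    # Correction factor for the height of the mobile antenna.
    a_hm = (1.1 * log_f - 0.7) * mobile_height_m - (1.56 * log_f - 0.8)
    return (69.55 + 26.16 * log_f - 13.82 * math.log10(base_height_m) - a_hm
            + (44.9 - 6.55 * math.log10(base_height_m)) * math.log10(distance_km))


def calibrate_tx_power_dbm(rss_dbm, distance_km, **model):
    """Expected output power: received strength plus the modelled path loss."""
    return rss_dbm + hata_path_loss_db(distance_km, **model)


def distance_from_rss_km(rss_dbm, tx_power_dbm, freq_mhz=900.0,
                         base_height_m=30.0, mobile_height_m=1.5):
    """Invert the model: distance at which the path loss equals tx - rss."""
    path_loss = tx_power_dbm - rss_dbm
    # Loss at 1 km, where the distance-dependent term vanishes (log10(1) = 0).
    loss_at_1km = hata_path_loss_db(1.0, freq_mhz, base_height_m, mobile_height_m)
    slope = 44.9 - 6.55 * math.log10(base_height_m)
    return 10 ** ((path_loss - loss_at_1km) / slope)


# Calibrate a cell from one measurement taken 1.2 km away, then reuse the
# calibrated power to turn a weaker reading into a distance.
tx_power = calibrate_tx_power_dbm(rss_dbm=-80.0, distance_km=1.2)
far_distance = distance_from_rss_km(-95.0, tx_power)
```

Because the same model is used in both directions, a calibrated cell reproduces the calibration distance exactly; the error of the method comes from how well the model matches the real environment.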
In reality there is usually a set of circles that either have a very large intersection area or do not intersect at all. Sometimes some of the cells form a large intersection while the rest do not intersect at all. To handle all these situations, the algorithm uses an optimization function that minimizes the distance to all the circles. The starting guess is the weighted centroid. This point is then moved in all directions to minimize the error function, where the error is the distance to the circles. The optimization function used is minimize from the module scipy.optimize. Fingerprinting is the last implemented method and uses a completely different approach. Instead of using the cell database to obtain the positions of cells, it uses a database of fingerprints. Before the fingerprinting method is started, the actual database of fingerprints has to be created. Again the input is a list of measurements, ideally dense measurements completely covering the area of interest. The coordinates with the lowest and highest longitude and latitude are taken as the border lines of the area. This rectangle is then divided into small square tiles of arbitrary size. For each tile a fingerprint is computed. A fingerprint groups all the measurements with GPS coordinates inside the tile, and an average signal strength is computed for each cell. The tile size has a direct impact on localization accuracy. Bigger tiles mean that each tile has more cell measurements bound to it. The drawback is that once a tile is determined as the result, it introduces an implicit error. The error is the distance between the center of the tile and the furthest point still belonging to the tile. In square tiles this is a corner vertex, so the error is a circle with a diameter equal to the length of the diagonal of the tile. The radius of this circle is half the length of the diagonal.
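The construction of the fingerprint database and the implicit tile error can be sketched as follows. For simplicity the sketch assumes that the measurement coordinates have already been converted to a local planar frame in metres and that each measurement carries a dictionary of cell identifiers and signal strengths; the data layout and the function names are assumptions for illustration.

```python
import math
from collections import defaultdict


def build_fingerprints(measurements, tile_size_m):
    """Group measurements into square tiles and average the RSS per cell.

    measurements: list of (x_m, y_m, {cell_id: rss_dbm}) tuples
    Returns {(column, row): {cell_id: mean_rss_dbm}}.
    """
    # The lowest coordinates define the border of the covered rectangle.
    min_x = min(m[0] for m in measurements)
    min_y = min(m[1] for m in measurements)

    readings = defaultdict(lambda: defaultdict(list))
    for x, y, cells in measurements:
        tile = (int((x - min_x) // tile_size_m), int((y - min_y) // tile_size_m))
        for cell_id, rss in cells.items():
            readings[tile][cell_id].append(rss)

    return {
        tile: {cell_id: sum(values) / len(values) for cell_id, values in cells.items()}
        for tile, cells in readings.items()
    }


def tile_error_radius_m(tile_size_m):
    """Distance from the tile centre to a corner: half of the diagonal."""
    return tile_size_m * math.sqrt(2) / 2


walk = [
    (0.0, 0.0, {"a": -70.0}),
    (5.0, 5.0, {"a": -80.0, "b": -90.0}),
    (15.0, 0.0, {"a": -60.0}),
]
database = build_fingerprints(walk, tile_size_m=10.0)
```

Smaller tiles reduce the implicit error but leave fewer measurements per tile, so the tile size is a trade-off that depends on how densely the area was scanned.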
For example, if the tile dimensions are 100 m × 100 m, the final error is a circle with a radius of approximately 70 m. After the fingerprint database is computed, an input measurement may be evaluated. The signal strengths of the cells in the measurement are compared to the signal strengths present in each fingerprint. The comparison computes the distance between the two vectors of signal strengths. If a cell present in the measurement is not present in the fingerprint, a penalty is introduced. In this implementation the penalty equals −113 dBm, as if the cell were visible with the lowest possible signal strength. It may happen that multiple tiles have exactly the same best score. In this case the resulting distance between the reference position of the measurement and the computed position is determined as the average of the distances between the reference point and the center of each of these tiles.

4.2.3 GUI

A graphical user interface with Molotras-core in the backend may be used to visualize the positioning algorithms. For this purpose the Python web framework CherryPy [21] was used. The frontend is written in the classic combination of JavaScript, HTML and CSS. The biggest benefit of the GUI is visualization on a map. The map service used in this project is the well-known Google Maps JavaScript API [22]. This service is no longer free, so the user has to pay per API request. Since Molotras is able to run as a web server, the GUI is accessible through any browser at port 8080. It consists of a map where all the methods are visualized and a right panel with settings for each of the methods. The map shows the tiles, circles and points specific to the technique. After the result is computed, both the reference point and the computed coordinates are displayed together with the distance error. An overview of the GUI is shown in figure 4.2.
Figure 4.2: Fingerprinting method with 10 m × 10 m tiles visualized in Molotras GUI

5 Evaluation

The implemented localization methods, weighted centroid, RSS trilateration and fingerprinting, were evaluated on three mobile phones and at two different locations. The data gathering, the evaluation process and the obtained results are described in this chapter. To reduce the error caused by specific hardware, three phones from different manufacturers were used for data gathering and subsequent evaluation. The tested phones are Nokia 6.1, Moto G6 Play and Google Pixel. All the phones had a SIM card of the same operator. Signal propagation depends on the surrounding environment, so different results are expected for different environments. Two such environments were chosen for evaluation, a city and a village. The city measurements were taken in the centre of Brno, Czech Republic, and the village measurements on the periphery of Trenčianska Turná, Slovak Republic. These two environments are further referred to as the city and the village. The measurement data were gathered by walking through the streets in the area with all three phones running the scanner app. In both environments, 10 measurement shots were taken on each phone at randomly chosen places. The data gathering process therefore results in two large data sets, each accompanied by 30 single-shot measurements. The collected data are shown in figure 5.1, where all the measurements are grouped into 10 m × 10 m tiles. Mobile phones do not scan all the available radio access technologies (RAT) all the time. Usually the currently active RAT, e.g. LTE, makes the phone scan only LTE cells in the area. For this reason, gathering data for all the RATs requires either multiple phones or more iterations of data collection in the area with a different RAT active on the phone.
GSM measurements ended up with mostly valid data; additional sanitization was necessary only for the data from the Pixel. UMTS and LTE measurement data, on the other hand, were valid only for the currently connected cell on all three phones. The neighboring cells necessary for localization could not be identified, thus the evaluation is performed on GSM data only. Neither the GSM nor the LTE measurements provided valid timing advance data, so the timing advance method could not be evaluated.

Figure 5.1: Maps of collected GSM data in Brno, CZ (left) and Trenčianska Turná, SK (right)

Data sanitization was necessary for the data gathered on the Pixel because the MNC value was missing and the currently registered cell appeared in the measurement list twice, with different signal strengths. The MNC was manually corrected to the corresponding value for the Czech and Slovak mobile operator, and the record with the lower signal strength was removed from the list.

Each method was evaluated with 30 measurement samples. The large data sets of measurements captured by the different phones were merged and used to create the fingerprint database for the fingerprinting method and to calibrate the RSS trilateration method.

The metric measured in the evaluation process is the haversine distance (great-circle distance) between the actual position where the measurement was taken and the estimated position. A method is therefore more accurate when the distance error is smaller.

The results are represented by box plots in figure 5.3 for the city and in figure 5.2 for the village. The corresponding values are listed in the tables in appendix A. Results for each phone are shown independently for each method, and there is also a box plot summarizing the results of all phones for a general comparison of the methods. As already stated, each phone is evaluated with 10 measurements, so the summarized box plot "all" is an evaluation of 30 measurements.
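The haversine (great-circle) distance used as the error metric in this chapter can be computed as sketched below. This is a minimal illustration, not the Molotras code: the function name `haversine_m` and the mean Earth radius constant are assumptions made for this example, and coordinates are assumed to be given in decimal degrees.

```python
import math

# Mean Earth radius in meters; an assumed constant for this sketch.
EARTH_RADIUS_M = 6371008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)

    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(d_phi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2.0) ** 2)

    # Clamp to 1.0 to guard against rounding errors for antipodal points.
    return 2.0 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))
```

The distance error of a single measurement would then be `haversine_m(ref_lat, ref_lon, est_lat, est_lon)`, where the reference position comes from GPS and the estimate from the evaluated method; these variable names are again only illustrative.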
Figure 5.2: Evaluation in the village environment

The errors in the city and in the village differ in magnitude for all the methods. While the error scale in the city is 130 m, the scale for the village is 900 m. This great difference is probably caused by multiple factors. Two cells measured by the phones are not present in the OpenCellid database and are therefore omitted by weighted centroid and RSS trilateration. There are also far fewer cell towers in the village than in the city, probably all mounted at two or three sites.

There is also a difference between the results of the individual phones within each method. The inconsistency is probably caused by the different hardware in the phones, namely the radio components, which could result in different sensitivity to the radio signal.

Fingerprinting has by far the best results compared to weighted centroid and RSS trilateration. The accuracy of this method is not influenced by any external database that could contain inaccurate data, nor by any assumption such as the one used by weighted centroid that all the transmitters have the same output power. Weighted centroid and RSS trilateration reach almost the same results in the city, while RSS trilateration is the more accurate one in the village. This is probably caused by better signal propagation: there are fewer and smaller obstacles, so the Okumura-Hata propagation model is more precise.

Figure 5.3: Evaluation in the city environment

6 Future work and possible extensions

The main goal of this thesis was to prove that mobile phone location tracking based on received signal strength is possible with publicly available resources and to compare and evaluate some of the methods using this approach. There are many possible ways to improve or further research positioning techniques; some of them are introduced in this chapter.
The scanner implemented for Android OS is already used in research studying the possibility of using machine learning to obtain better results or to compute more fingerprints from a given fingerprint set. Some places are hard or forbidden to enter, so measuring the data around the object while still obtaining reliable fingerprints inside the area may dramatically improve the position estimate.

Data collected during war-driving may be used not only for a fingerprint database but also for building a new cell database. The difference is that the measurements would be used to localize the transceiver stations, and characteristics not present in OpenCellid could possibly be determined as well, for example the output power of the transmitter or the direction of the antenna.

Another use for the war-driving data could be the calibration of RSS trilateration. Calibration for Molotras was performed on a single cell with the highest signal strength. More advanced algorithms could use multiple reference points and estimate the output power of the base station so that it is consistent with all the points. This would need to take the directivity of the antenna into consideration.

The Kalman filter is an estimator used for indirect measurements, where it is impossible to measure the subject directly, or where the measurements are taken in a noisy environment. The aim of the Kalman filter is to accurately estimate the real system state based on imprecise measurements and the estimate from the previous step. The estimator keeps no history: all previous estimates are aggregated into the most recent one. Jean-Pierre Dubois et al. used a Kalman filter in GSM networks [23] and successfully decreased the position error over time. This system is not a brand new approach; it is rather an optimization combining the results of already introduced methods to obtain the best possible location
estimate, assuming there are errors that can be smoothed out. The Kalman filter is used with iterative measurements to minimize the error over time, so it is not applicable to one-time measurements. Future research on position estimation should also be aimed in this direction, using consecutive independent measurements.

7 Conclusions

This thesis explains the basics of radio technology, radio signal propagation and the characteristics of mobile cellular networks that are necessary to understand localization techniques. Different location determination approaches were introduced, with the main focus on those based on signal strength measurements.

The implemented Android application, which gathers the signal strength measurements of the surrounding cells and captures the GPS position, was used to create a fingerprint database and reference measurements for testing. The measurements were taken at two different locations, in the center of Brno and in the village of Trenčianska Turná. To avoid hardware-specific errors, the measurements were taken on three phones, namely the Nokia 6.1, Moto G6 Play and Google Pixel.

Three localization methods were chosen and implemented. Fingerprinting, weighted centroid and trilateration are all based on signal strength measurements of cell towers in the nearby area. The data obtained by the Android scanner were used to evaluate the implemented methods. The evaluation was based on signals from the GSM network only. The most accurate results were obtained by fingerprinting, with a distance error of approximately 10 meters in the city and 50 meters in the village. The other two methods reach similar accuracy, approximately 70 meters in the city and 600 meters in the village.

Bibliography

1. FEDERAL COMMUNICATIONS COMMISSION. Wireless E911 Location Accuracy Requirements. 2011. https://www.fcc.gov/document/wirelesse911-location-accuracy-requirements-1.
2. SAUNDERS, Simon R; ARAGÓN-ZAVALA, Alejandro.
Antennas and propagation for wireless communication systems. 2nd ed. John Wiley & Sons, 2007. ISBN 978-0-470-84879-1.
3. A.MAWJOUD, Sami. Path Loss Propagation Model Prediction for GSM Network Planning. International Journal of Computer Applications. 2013, vol. 84, pp. 30–33. Available from DOI: 10.5120/14592-2830.
4. MUNIR, Muhammad. Different Generations of Cellular Networks System. 2005. Available from DOI: 10.13140/RG.2.1.3341.2004.
5. TNUDA. Cellular Communication Network Technologies [online]. 2016 [visited on 2019-11-27]. Available from: https://www.tnuda.org.il/en/physics-radiation/radio-frequency-rfradiation/cellular-communication-network-technologies.
6. LANAROL, Isuru Mahesh. Telecommunication Engineering Concepts: GSM Handover/Handoff [online]. 2012 [visited on 2019-12-08]. Available from: http://telecommunicationengineeringconcepts.blogspot.com/2012/05/gsm-handoverhandoff.html.
7. KEYSIGHT. Measurement Reports [online]. 2010 [visited on 2019-12-08]. Available from: http://rfmw.em.keysight.com/rfcomms/refdocs/gsm/gprsla_meas_reports.html.
8. MP ANTENNA, LTD. Omnidirectional Antenna Radiation Patterns [online]. 2019 [visited on 2019-11-27]. Available from: https://www.mpantenna.com/omnidirectional-antenna-radiationpatterns-of-different-antenna-designs/.
9. MIRO. Part 4 of 5: Introduction to Wireless Communication (Antennas) [online]. 2017 [visited on 2019-11-27]. Available from: https://www.miro.co.za/part-4-5introduction-wirelesscommunication-antennas/.
10. GISGEOGRAPHY. Trilateration vs Triangulation – How GPS Receivers Work [online]. 2019 [visited on 2019-11-28]. Available from: https://gisgeography.com/trilateration-triangulationgps/.
11. SHOJAIFAR, Alireza. Evaluation and Improvement of the RSSI-based Localization Algorithm: Received Signal Strength Indication (RSSI). In: 2015.
12. FERNÁNDEZ-PRADES, Carles; CLOSAS, Pau; VILÀ-VALLS, Jordi; FIGUEROA-NAHARRO, Marta; TORNÉ-SANJOSÉ, Albert.
Assisted GNSS in LTE-Advanced Networks and Its Application to Vector Tracking Loops. In: 2012, vol. 2.
13. DICKINSON, Patrick; CIELNIAK, Gregorz; SZYMANEZYK, Oliver; MANNION, Mike. Indoor positioning of shoppers using a network of Bluetooth Low Energy beacons. In: 2016, pp. 1–8. Available from DOI: 10.1109/IPIN.2016.7743684.
14. SCHÜTZ, Jan. LTE Location Based Services Technology Introduction White paper. In: 2013.
15. TAHAT, Ashraf; KADDOUM, Georges; YOUSEFI, Siamak; VALAEE, Shahrokh; GAGNON, François. A Look at the Recent Wireless Positioning Techniques With a Focus on Algorithms for Moving Receivers. IEEE Access. 2016, vol. 4, pp. 6652–6680. Available from DOI: 10.1109/ACCESS.2016.260648.
16. ZIMMERMANN, Lars; GOETZ, Alexander; FISCHER, Georg; WEIGEL, Robert. GSM mobile phone localization using time difference of arrival and angle of arrival estimation. 2012. Available from DOI: 10.1109/SSD.2012.6197970.
17. OLIVEIRA, Leonardo; DESSBESELL, Gustavo; MARTINS, Joao; MONTEIRO, José. Hardware implementation of a centroid-based localization algorithm for mobile sensor networks. In: 2011, pp. 2829–2832. Available from DOI: 10.1109/ISCAS.2011.5938194.
18. SARAIVA CAMPOS, Rafael; LOVISOLO, Lisandro. Fingerprinting Location Techniques. In: Handbook of Position Location. John Wiley and Sons, Ltd, 2019, chap. 15, pp. 497–529. ISBN 9781119434610. Available from DOI: 10.1002/9781119434610.ch15.
19. ANDROID DEVELOPERS. Permissions overview [online]. 2019 [visited on 2019-12-08]. Available from: https://developer.android.com/guide/topics/permissions/overview.
20. UNWIRED LABS. OpenCellid [online]. 2019 [visited on 2019-12-08]. Available from: https://opencellid.org.
21. CHERRYPY TEAM. CherryPy — A Minimalist Python Web Framework [online]. 2019 [visited on 2019-12-08]. Available from: https://cherrypy.org/.
22. GOOGLE MAPS. Maps JavaScript API [online]. 2019 [visited on 2019-12-08].
Available from: https://developers.google.com/maps/documentation/javascript/tutorial.
23. DUBOIS, Jean-Pierre; DABA, Jihad S.; NADER, M.; FERKH, C. El. GSM Position Tracking using a Kalman Filter. International Journal of Electronics and Communication Engineering. 2012, vol. 6, no. 8, pp. 867–876. eISSN 1307-6892. Available also from: https://publications.waset.org/vol/68.

A Evaluation

The following tables contain all measured values for all the methods and phones that were used in this thesis. All values are distance errors in meters.

Table A.1: City results

Method          Phone          Median    Mean      Min       Max
Centroid        Nokia 6.1      53.07     51.46     17.8      97.7
Centroid        Moto G6 Play   68.7      66.39     32.13     128.37
Centroid        Pixel          81.31     76.46     21.56     122.59
Centroid        All            66.34     64.77     17.8      128.37
Trilateration   Nokia 6.1      62.25     59.85     13.75     121.99
Trilateration   Moto G6 Play   66.46     69.58     37.05     117.32
Trilateration   Pixel          91.84     79.93     13.88     118.25
Trilateration   All            69.7      69.79     13.75     121.99
Fingerprinting  Nokia 6.1      9.03      8.4       3.67      14.67
Fingerprinting  Moto G6 Play   6.16      13.36     2.22      45.99
Fingerprinting  Pixel          23.38     26.73     7.47      61.37
Fingerprinting  All            9.23      16.16     2.22      61.37

Table A.2: Village results

Method          Phone          Median    Mean      Min       Max
Centroid        Nokia 6.1      620.88    621.54    208.17    904.03
Centroid        Moto G6 Play   615.91    648.12    339.3     941.86
Centroid        Pixel          577.92    805.4     554.81    2172.08
Centroid        All            605.93    691.69    208.17    2172.08
Trilateration   Nokia 6.1      450.76    458.74    154.97    901.04
Trilateration   Moto G6 Play   620.17    564.2     294.26    893.54
Trilateration   Pixel          549.53    738.6     474.74    1889.19
Trilateration   All            539.11    587.18    154.97    1889.19
Fingerprinting  Nokia 6.1      129.7     114.48    7.92      225.72
Fingerprinting  Moto G6 Play   27.84     85.39     5.93      306.63
Fingerprinting  Pixel          39.93     71.43     4.89      258.53
Fingerprinting  All            49.72     90.43     4.89      306.63

B Molotras source files and measured data

The electronic archive of this thesis contains the following data attachments:

∙ source files of the Android scanner - Molotras-scanner
∙ source files of the implemented localization methods - Molotras-core
∙ collected data that were used for evaluation of the methods