UMLI: An Unsupervised Mobile Location Extraction Approach with

advertisement

UMLI: An Unsupervised Mobile Location Extraction

Approach with Incomplete Data

Nam Tuan Nguyen, Rong Zheng

, and Zhu Han,

ECE Department,

CS Department, University of Houston, Houston, TX 77004

Abstract —Location extraction in an indoor environment is a challenging task, and yet, it is of great interest to retrieve locations information without cumbersome manual site surveys. Indoor location information, namely, which room a user is located, is valuable for applications such as location based services, mobility prediction, personal health care, network resource allocation, etc.

In absence of reliable GPS signals in indoor environments, WiFi is a potential candidate due to its wide availability. However, WiFi signal tends to be noisy and incomplete due to the limited range of access points and randomness in beaconing. We propose a twolayer clustering method that is able to i ) classify the rooms in an unsupervised manner; and ii ) handle missing data effectively.

Experiment results using smartphone traces show UMLI can achieves an identification rate of 99.84%.

I. I NTRODUCTION

Despite that humans spend most of time indoor, studies on indoor mobility are relatively scarce. In contrast to mature and popular outdoor location-based services utilizing GPS, accurate indoor localization remains elusive to this date. Fortunately, even though GPS signals are not available or are of poor quality, each indoor location may have its own signature that differentiates itself from other locations. A location signatures can be a combination of the RF environment, the surround sound, image, etc., which is unique to the location. By extracting the signatures, one can tell apart different locations and further infer logical functionality of individual locations.

Due to the prevalence of WLAN infrastructure, receive signal strength (RSS) scans from WiFi access points (APs) have been considered to provide location signature in literature.

However, as a result of the multipath effect between mobile devices and access points (APs), RSS from APs at the mobile devices is highly varying. Missing measurements from APs are commonplace. Within one building, the set of available APs at different locations differ a lot. Even at the same location, different scans may be from different set of APs due to the randomness in channel access. There are two types of missing data: Missing At Random (MAR) and Missing Not At Random

(MNAR). Data is said to be missing at random if the missing probability does not depend on the missing value [1]. Many data imputation techniques can be utilized to handle MAR. In contrast, in MNAR, the missing probability depends on the missing value. Both types of missing data are present in the collected WiFi scans. MAR can happen at any value of RSS above the sensitivity threshold, MNAR occurs at low SNR when the mobile user is outside the range of an AP.

Therefore, to classify indoor locations utilizing the RSS from neighbor APs, one has to resolve the problem of missing data. Moreover, it is too cumbersome for users to collect and label manually RSS from each location of interest, which is subject to changes over time. An unsupervised (profiling free) approach is needed. In this paper, we propose a novel Unsupervised Mobile Location extraction approach with

Incomplete data (UMLI) to overcome the challenges. Our contributions are three-fold. First, UMLI handles incomplete data in a systematic manner. Second, we propose a novel non-parametric categorical clustering (NCC) method to cluster the locations on a coarse grained scale such as a floor of a building. We call it the first classification layer. Third, we implement a non-parametric Bayesian classification (NBC) technique to further classify the coarse grained locations into fine grained room-level locations as the second classification layer. The term non-parametric here means the number of rooms is not known a priori.

We conduct extensive experiments using an HTC smart phone and collect data on the 3 rd floor of the Cullen College of Engineering building, University of Houston. UMLI is able to achieve 99.84% accuracy in room-level identification.

Utilizing the output of UMLI, further information can be inferred regarding the dwelling time in each location, the number of locations and the likelihood of a new observed data belonging to a specific location, e.g., identity matching. These information, together with other available sensor data on smart phones, can completely characterize users’ indoor activities.

UMLI differs from existing solutions utilizing RSS for indoor localization in its unsupervised and non-parametric nature. Moreover, our proposed solution to MNAR may be of interest in its own right in other classification problems.

The rest of the paper is organized as follows. In Section

II, the related works are reviewed. The location extraction problem is formally formulated in Section III, followed by a description of the proposed NCC method in Section IV.

The nonparametric Bayesian clustering method is detailed in Section V. Experiment setup and results are presented in

Section VI. Finally, the conclusion is drawn in Section VII.

II. R ELATED W ORK

Generally, there are two categories of approaches in wireless indoor localization or place identification, namely, profiling and profiling-free. In profiling approaches [5], [2], [3], [4], fingerprints for each location are first obtained through manual survey to construct a fingerprint map, which is later used to find the best match with the new measurements at unknown locations. In SurroundSense [5], fingerprints are generated from WiFi signal strength, ambient sound measurement, acceleration, color and light readings. SurroundSense is shown to be very accurate in identifying places though the good performance comes at the expense of utilizing sensing modalities with high energy consumption such as camera.

2

SensLoc [13] is an energy-efficient profiling approach for sensing everyday places. It bares some similarity with LOIRE in the use of accelerometer to detect motion to trigger WiFi scanning. In SensLoc, input Wi-Fi scans are treated as vectors so that the Tanimoto coefficient can be used. A group of scans within a window, the response rate of each AP (defined as the percentage of beacons observed) are used to determine the fingerprint. Signals from unobserved beacons are imputed with zero. The approach can identify correctly 96% of revisited places. However, the performance is sensitive to the threshold chosen for the Tanimoto distance between fingerprints. Moreover, the devices need to continuously measure WiFi signals whenever the users are stationary.

Profiling-free localization solutions [7], [6] do not require users to collect fingerprints from different locations beforehand. WILL [6] uses RSS from APs as fingerprints and computes a pairwise similarity score of any two fingerprints from their Euclidean distance. The distance is then used in K -means clustering to group measurements into virtual rooms. Missing data are not explicitly dealt with in WILL. Furthermore, in

K -mean clustering, the number of clusters (or, the number of places) needs to be specified before hand. ARIEL [7] is a profiling-free nonparametric approach for indoor localization, which does not require the number of locations to be known a priori. It utilizes n -gram [8] to dete rmine the distance between the WiFi sessions, and then group the sessions into one zone. Sessions are grouped into one zone if, i) they are within a certain distance threshold, Eps , and ii) their quantity exceeds a minimum number of points, MinPts . The zone-based clustering is followed by a motion-based clustering, which groups zones together into rooms . The main rationale behind

ARIEL is that the order of RSS from APs is mostly fixed within one location, and the order is independent of the device characteristics. However, we found through our measurement study that the first assumption is often violated. Moreover, the way they determine the thresholds is somewhat heuristics and can have a strong impact on the performance and the granularity (i.e. zones, rooms) of their clustering approaches.

And finally, because of its nature, ARIEL only performs well if users collect at least 400 samples per each room. Thus, it is not efficient in term of energy consumption.

UMLI is a novel location extraction framework in that it is unsupervised and does not assume the knowledge of the number of locations. Furthermore, UMLI recognizes and handles missing data in RSS measurements in a systemic manner, which to our knowledge, is the first work in doing so. However, it should be noted that like other unsupervised approaches, UMLI provides weaker location semantics in that, distinctive locations a user visits are provided rather the physical coordinates or the logical names of these locations.

The later has to be facilitate by users.

Let X = [ ~

1

III. P x

2

ROBLEM x

N

]

F ORMATION be a set of N observations. Each observation, ~ i

, has a variable length di and is denoted by, x i

= { AP 1 : RSS 1 , AP 2 : RSS 2 , ..., AP di

: RSS di

} .

(1)

TABLE I

O BSERVATIONS FROM TWO ROOMS . T HE FIRST TWO OBSERVATIONS COME

FROM THE SAME ROOM .

Observation 1

AP 1001 1002 1074 1073 1050 1075

RSS -67 -69 -89 -76 -69 -76

Observation 2

AP 1001 1002 1074 1073

RSS -66 -66 -89 -76

Observation 3

AP 1001 1002 1068 1070 1073 1076 1077 1078

RSS -63 -63 -86 -89 -96 -91 -87 -88

Our goal is to classify the observations into an unknown number of clusters. Each cluster corresponds to a physical location. An example of three observations is given in Table I.

The first two observations are collected in the same room close in time. The third observation is collected from a different room approximately 26 f eet away from the first room. One can observe that the lengths of these three observations are all different. Obviously MNAR is severe in the first two observations, since most of the APs available in the 3 rd observation are not observed in the first two observations although both rooms are on the same floor in the same building, and are not far away from each other. Motivated by the above observations, three problems need to be resolved in location extraction:

1) How to handle MNAR and MAR?

2) How to handle observations with variable length?

3) How to classify the observations without prior knowledge about the number of locations?

We propose UMLI to resolve the above three problems.

UMLI is described in Fig. 1. At the first classification layer,

NCC, observations with small pairwise distances are grouped into one coarse-grained zones. Within each coarse-grained zone, RSS from the most common APs are chosen as the features for the second classification layer, infinite Gaussian mixture model (IGMM). IGMM is a generative model that is suitable when the number of classes/clusters is unknown. It is a nonparametric Bayesian (NBC) classification approach. In the next two sections, NCC and IGMM will be discussed in details.

IV. A N ONPARAMETRIC

A

C ATEGORICAL

PPROACH

C LUSTERING

NCC is a categorical clustering method utilizing only the identifier of the APs observable (rather than the RSS measurements). The purpose of NCC is to identify coarse-grained zones, such as a floor or a common area of a building.

Locations that are well separated can be distinguished at this layer.

Given that adjacent RSSs in time tend to be strongly correlated, we first apply an averaging filter for missing data across all recorded RSS from the same AP. The filter only imputes the MAR values since it bases on the time adjacent

RSSs to fill in the missing value. If none of the adjacent RSSs

3 basic idea is to build a density function of the observations using the pairwise Jaccard distances. The density function is given by, f

ˆ l,κ

( ~ ) =

1

N l D

N

X k ( ~ − ~ i

) = i =1 c k,D

N l D

N

X k i =1

J ( ~ x i

) l

,

(4) where l is the diameter in the Jaccard coordinate of a cube centered at ~ , c k,D is a normalization factor and k ( · ) is called profile of the kernel function. Given the density function, the densest region should locate at the point where the gradient of the density function is 0. The gradient of the density function is as follows.

Fig. 1.

UMLI algorithm.

∇ f

ˆ l,κ

( ~ ) =

2 c k,D

N l D +2

" N

X g i =1

J ( ~ l x i

P

N i =1

J ( ~ x i

) g

P

N i =1 g

J ( ~ x i

) l

J ( ~ x i

) l

)

#

, (5) values is observed, no imputation is done. Formally, let x i the predicted missing values.

x i is calculated below: be x i

=

P m j =1 x i + j − m/ 2

P m j =1 w j

· o i + j − m/ 2

· o i + j − m/ 2

· w j

, (2) where m is the number of the taps of the filter and can be varied to change the smoothness.

w j tap and defined as 1 /m .

o j is the coefficient of each is an indicator whether the data is observed or not: it is 0 if data is missing and 1 otherwise.

Here, we choose m = 2 , which means we only fill in missing values for the timestamps that have at least one out of its 4 nearest neighbors observed. A small value of m will preserve the abrupt RSS changes, maintaining a good classification performance.

After imputing the MAR values, we proceed to the next step to cluster the observations into coarse-grained zones. Since observations are of variable length, the Jaccard distance is used as a measure for dissimilarity between any two observations.

The Jaccard distance between two sets is defined as,

J ( ~ i x j

) = 1 − x i x i

∩ ~ j

,

∪ ~ j

(3) where ∩ represents set intersection and ∪ stands for set union. In other words, the Jaccard distance is calculated as the fraction of the number of disjoint APs over the number of all available APs in the two observations. For example, the

Jaccard distance of the first two observations in Table I is 4 / 7 .

Unlike [7], we do not sort the APs according to their RSSs and take into account the order of the APs. This is because

RSS is very noisy and two neighboring APs can exchange their order frequently. Taking into account the order of APs will degrade in the performance. RSS measurements from the same locations are likely to contain the same set of APs, and are thus close to each other in the Jaccard coordinate. Variablelength observations are no longer a problem, since the Jaccard distance measures set differences.

Given the Jaccard distance, we propose to use a kernel density estimation approach to classify the observations. The where g ( x ) = − k

0

( x ) , and D is the number of dimensions.

(5) is a product of two components. The first component is proportional to the density function at point ~ , while the second one is called the mean shift vector , ~ .

m is the difference between the weighted mean by using kernel profile, g and the center of the kernel, ~ . This mean shift vector points toward the maximum in the density function.

The algorithm starts from an arbitrary observation, and groups all the observations within the coverage of a cube of diameter l and centered at the chosen observation to form a cluster. Next, it moves to the mean of these grouped observations. Since only pairwise distances are given, the mean is defined as the observation which has the smallest sum distance to all other observations in the same cluster. Finally, the algorithm is able to move to the densest region of the observations set. Repeat the whole process for all observations.

All observations within the true cluster should converge to the same mean, and thus, are grouped into a cluster. NCC does not assume any information about the number of cluster. At the end of the algorithm, the number of clusters is determined as the number of the means found.

NCC plays an important role in grouping locations into one big zone that share the same list of the strongest APs. An AP is selected as one of the strongest APs if it is observed 95% of the time. Otherwise, the AP will be discarded. As can be seen from the experiment result, the threshold 95% is large enough to select all necessary APs for the purpose of room classification. The common lists of APs within big zones are then formed. The above step removes MNAR, together with the low-pass filter mentioned above, both MNAR and MAR data are eliminated from the observations. Toward this end, a collections of big locations is classified, and every observation in a big zone shares the same list of available APs. It is then possible to utilize the second clustering layer – the IGMM to fine tune the layer one clustering result into smaller locations.

4

V. I NFINITE G AUSSIAN M IXTURE M ODEL

When a mobile user moves around a building, depending on its location, only a subset of WiFi APs with different signal strength can be detected. Within a specific location, there is a corresponding unique set of WiFi APs and their signal strengths. An observation set of a location/room is unique for two reasons: Firstly, it has a list of available APs that helps to differentiate itself from other observations from other locations. This property is utilized in the first classification layer, where big zones are extracted. Secondly, within one big zone, which shares one common list of available APs, signal strengths at different locations are much different due to multipath, reflection, etc. When we have a joint representation of RSS from all the APs, the variation within RSS values of different rooms is large enough to differentiate the rooms.

This observation motivates us to form a joint distribution of the RSS from the common set of APs in each big zone. The joint distribution can then represent each physical location, and later on can be analyzed to extract the room signature.

Hence, at this layer, the RSSs become important instead of

APs as in the first layer. Different signal strengths associated with the WiFi APs play an important role in classifying different locations. But adjacent locations may have very similar or overlapping signal strength profiles. Furthermore, the number of locations is unknown, and thus, classical methods where the number of classes is given a priori cannot be utilized.

We propose to use IGMM [9] to overcome the above challenges. IGMM has been proved to be advantageous in classifying overlapping clusters with an unknown number of classes [10]. IGMM is a generating model following which the observations are believed to be created. IGMM assumes that there are an unbounded number of joint distributions, each distribution generating a cluster of observations. The observations have D dimensions, which stand for different features of each observation. In IGMM, it is assumed that the number of clusters is infinite, but only a limited number of clusters are observed. The number of clusters can grow as large as needed when more data are obtained. Formally, the definition of IGMM is given below.

Definition 1: IGMM can be defined as

~ | α ∼ Stick ( α ); z i

| ~ ∼ Multinomial ( ·| w ); θ k

∼ x i

| ( z i

= k, Σ k

µ k

) ∼ N ( ·| ~ k

, Σ k

) , where θ j

∼ ~ stands for

(6)

Σ k

∼ Inverse Wishart v 0

0

); ~ k

∼ N ( ~

0

, Σ k

/K

0

) , (7) and ~ | α ∼ Stick ( α ) is a shorthand of: w

0 k

| α ∼ Beta (1 , α ); w k

= w

0 k

K − 1

Y

(1 − w

0 k

) .K

→ ∞ .

(8) l =1

H, ~ are called hyper parameters. The Stick ( α ) is a stick breaking process [11] used to generate an infinite number of weights, ~ , that sum up to 1 .

w is the mixing weight vector, which defines the contributions of each distribution in the observations set.

z i

= k with probability p ( z i k indicates that

= k ) = w k

.

~ i belongs to cluster

The observations are

Fig. 2.

Infinite Gaussian Mixture Model.

assumed to follow k joint Gaussian distributions which can be represented by θ k

, a parameter set of a cluster. Specifically, θ k includes a set of mean vector and covariance matrix for each cluster. The mean vector and the covariance matrix in turn are assumed to follow a Gaussian distribution and an Inverse

Wishart distribution,

~

, respectively. Each cluster corresponds to a location and the number of dimensions D is the number of available APs.

IGMM inference model has been fully investigated in [10],

[12] and given below. An inference process is a process of assigning an observation to it correct cluster. This involves two probabilities, the first probability represents the likelihood of an observation is assigned to a represented class, which already has some observations belong to:

P ( z i

= k | Z

− i

× t v l

, X ; α, ~ ) ∼

− D +1

µ l

, n k, − i

Λ l

( κ

α + N − 1 l

+ 1)

.

κ l

( υ l

− D + 1)

(9)

The second distribution is used in the case of assignment to a unrepresented class, which does not contribute any data point in the feature space:

P ( z i

× t v

0

− D +1

µ

0

,

α

= j, ∀ j = i | Z

− i

, X ; α, ~ ) ∼

Λ

0

( κ

α + N − 1

0

+ 1)

,

κ

0

( υ

0

− D + 1)

(10)

The posteriors consist of two parts, the prior and the likelihood

(the Student-t distribution). In the above equations, t is a multivariate Student-t distribution and its parameters are updated as:

µ

κ

Λ l l l

=

=

κ

0

κ

0

κ

0

+ L

µ

0

+ L, ν l

+

κ

0

= ν

0

L

+

+

L

L,

= Λ

0

+ S +

(11)

(12)

κ

κ

0

0

L

+ L

( ¯ − ~

0

)( ¯ − ~

0

)

T

, (13)

¯

= ( ~

1

+ ~

2

+ · · · + ~

L

) /L, (14)

S = i = L

X

( ~ i

− ¯

)( ~ i

− ¯

)

T

, i =1

(15)

5

1

0.8

0.6

0.4

0.2

0

0 5 10

Access Point IDs

15

Fig. 4.

Histogram of APs in the scan result.

20

Fig. 3.

Floor plan and phone placement layout.

where ~ l

, κ l

, ν l and Λ l are updated hyperparameters after observing L − 1 samples.

L is the number of observations in the particular cluster, and S is the scatter matrix.

Given the posterior distributions, we follow the Collapsed

Gibbs sampler [12] to sequentially sample the labels for the observations. After a few steps, the collapsed Gibbs sampler will converge and output the indicators z i

. After this step, the observations are fully classified into fine grained clusters. Each cluster corresponds to one physical location.

A. Experiment Setup

An Android app is developed and installed on an HTC smart phone for data collection. It collects three parameters, the available MAC address of APs, their RSSs, and timestamps at the sample point. Experiments have been conducted on the

3 rd floor of the Cullen College of Engineering building 1

(EB1) and 2 (EB2). Data are collected from seven different labs/offices/common areas. A floor map of the buildings and the phone placement are given in Fig. 3. A classroom, an office and some laboratories are selected on the same floor of Engineering building 2 to test the fine grained clustering algorithm. Note that the laboratories are in close proximity separated by 13 feet. Data are also collected on the same floor of a nearby building, one in a common area, the other inside an office. This helps test the coarse grained classification layer to see if it can differentiate different adjacent big zones.

B. NCC Result

VI. E

XPERIMENT

R

ESULTS

On the data set collected, after applying NCC, 100% of the time, UMLI is able to cluster the data set into three different big zones. Phone placements within a circle on Fig. 3 are grouped together into one big cluster. As one can see, all locations/rooms on EB2 are grouped together into one big zone. Interestingly enough, the two locations, one in common area and one inside an office in the EB1, can be differentiated by just applying the NCC. This is due to the fact that the common area has many more available APs compared to the offices. After classifying big zones, a common list of APs are extracted. Since the other two big zones in the EB1 building have been identified and each corresponds to one room, we only need to apply the fine tune clustering, IGMM, on the other big zone, e.g., the 3 rd floor of the EB2.

C. Extract Common List of APs

If an AP appears more frequently in the observations, we have more confidence in including it as part of the signature of the location. In the same big zone, for the purpose of using

IGMM, a common list of the strongest APs is needed. The common list is necessary since IGMM only operates with observations of the same length. Fig. 4 shows the histogram of 20 APs observable on the 3rd floor of EB2. 8 APs in black are the ones that almost always show up in the AP list on the mobile phone. In fact, the lowest percentage of appearance in the 8 APs is 99 .

2% . As the result, RSSs from the 8 APs are selected as the input to our IGMM algorithm. The other APs, in grey color, are discarded.

D. IGMM

The output from IGMM in classifying the 3 rd floor of

EB2 is shown in Fig. 5. Each rectangular is one room and there are totally 5 rooms in EB2 from which WiFi signals are collected. Observations within a rectangular are in fact collected within the same room. Observations marked with the same character/color are assigned to the same room by

UMLI. As we can see, the output of UMLI is very close to the ground truth. Experiment results show that even with very noisy measurements, UMLI still achieves 98% hit rate in correctly detecting the number of major locations. The hit rate is defined as the number of the times UMLI can detect the number of the rooms correctly over the total number of trials.

Furthermore, we evaluate the performance of UMLI in correctly assigning an observation to its correct room. Note that our approach is unsupervised, and thus, we do not have a library of labelled data. However, part of the clustering

−60

−65

−70

−75

−80

−85

−80

−40

−45

−50

−55

Original

Labels

Lab1

Lab2

Lab3

Lab4

Office1

TABLE II

C ONFUSION TABLE .

Predicted labels

Lab1 Lab2 Lab3 Lab4 Office1

50

0

0

49.85

0

0.15

0

0

0

0

0

0

0

0

0.05

0

49.4

0.51

0.25

0.4

49.39

0.05

0.2

0.05

49.7

Output of UMLI

−75 −70 −65 −60

AP1 RSS (dB)

−55 −50

Fig. 5.

Ground truth and output of the algorithm.

problem can be viewed as the problem of finding the most likely room of a new observation given the rooms are identified. In Table II, a confusion matrix is present. There are 50 observations per room. We can see that for every room, UMLI can achieve a very high identification probability, closely

100%. On average, the false positive rate is 0.16% and the true positive rate, or identification probability, is 99.84%.

VII. C ONCLUSION

In this paper, we propose UMLI, a novel framework for location extraction. The method is unsupervised, nonparametric and able to handle missing data. Experiment results demonstrate the superior performance of UMLI that it can correct determine the number of rooms 98% of the time. The rate of room detection is also considerably high compared to other work, at 99.84%. A novel nonparametric categorical clustering algorithm is proposed; a framework to handle missing data is constructed; and a statistical tool, IGMM is implemented to achieve this high performance. We believe UMLI serves as an important step in learning users indoor activity in an unsupervised manner.

R

EFERENCES

[1] D. B. Rubin, “Inference and missing data,” Biometrika , Volume 63, Issue

3, pp. 581–592, Dec. 1976.

[2] P. Bahl and V. N. Padmanabhan, “RADAR: An inbuilding RF-based user location and tracking system,” in proceedings of IEEE INFOCOM ,

Tel Aviv, Israel, vol. 2, pp. 775-784, Mar. 2000.

6

[3] J. Park, B. Charrow, D. Curtis, J. Battat, E. Minkov, J. Hicks, S.

Teller, and J. Ledlie, “Growing an organic indoor location system,” in proceedings of the ACM MobiSys , San Francisco, CA, pp. 271-284, Jun.

2010.

[4] A. Varshavsky, E. de Lara, J. Hightower, A. LaMarca, and V. Otsason,

“GSM indoor localization,” in proceedings of the IEEE PerCom , White

Plains, NY, USA, vol. 3, no. 6, pp. 698-720, Mar. 2007.

[5] M. Azizyan, I. Constandache, and R. Roy Choudhury, “Surroundsense: mobile phone localization via ambience fingerprinting,” in proceedings of the ACM MobiCom , Beijing, China, pp. 261-272, Sep. 2009.

[6] C. Wu, Z. Yang, Y. Liu, and W. Xi, “WILL: Wireless indoor localization without site survey,” in proceedings of IEEE INFOCOM , Orlando, FL,

Mar. 2012.

[7] Y. Jiang, X. Pan, K. Li, Q. Lv, R. Dick, M. Hannigan, and L.

Shang, “ARIEL: Automatic Wi-Fi based Room fingerprinting for indoor localization,” in proceedings of Ubicomp , Pittsburgh, PA, Sep. 2012.

[8] Y. Jiang, K. Li, L. Tian, R. Piedrahita, X. Yun, O. Mansata, Q. Lv, R.

P. Dick, M. Hannigan, and L. Shang, “MAQS: a personalized mobile sensing system for indoor air quality monitoring,” in proceedings of

Ubicomp , Beijing, China, Sep. 2011.

[9] Z. Ghahramani, “Nonparametric Bayesian methods,” Tutorial presentation at the UAI Conference , Edinburgh, Scotland, Jul. 2005.

[10] N. T. Nguyen, R. Zheng, and Z. Han, “On identifying primary user emulation attacks in cognitive radio systems using nonparametric Bayesian classification,” IEEE Transaction on Signal Processing , vol 60 , no 3, pp 1432–1445, Mar. 2012.

[11] Y. W. Teh, Dirichlet processes , Encyclopedia of Machine Learning,

Springer, New York, 2007.

[12] F. Wood and M. J. Black, “A nonparametric Bayesian alternative to spike sorting,” Journal of Neuroscience Methods , pp 173(1):1-12, Aug. 2008.

[13] Donnie H. Kim and Younghun Kim and Deborah Estrin and Mani B.

Srivastava, “SensLoc: Sensing Everyday Places and Paths using Less

Energy”, ACM SenSys, Nov., 2010.

Download