A Location-Based Personal Semantic Web Recommender
George-Leonard Chetreanu, Alina Elena Mihăilă, Iulian Vascu,
George-Alexandru Vlad, Sabin C. Buraga, Lenuta Alboaie
Faculty of Computer Science, “Alexandru Ioan Cuza” University of Iasi, Romania
Berthelot 16, Iasi 700483 – Romania
{george.chetreanu, elena.mihaila, iulian.vascu, george.vlad, busaco, adria}@info.uaic.ro
Abstract – Location-based services are widely used in both entertainment and business applications. This work
focuses on one particular area: search services for locating POIs (Points of Interest) or friends. The project’s goal
is to develop an application able to intelligently suggest resources such as gas stations, restaurants, hospitals,
stores, or banks, using a geo-location service.
Keywords – semantic Web, recommender system, knowledge modeling, social Web application, geo-location
I. INTRODUCTION
Semantic recommender systems use profiles to represent the long-term interests of their users. Collaborative
filtering systems give recommendations based on information provided by other users who share similar profiles.
Building an accurate user profile is therefore essential for efficient filtering in a recommender system,
since an incomplete or biased profile reduces the relevance of the recommendations made to the user.
The goal is to develop an application (called PeRe) that not only recommends resources from the categories a user
is interested in, but also provides, in an interactive way, information about their physical location. One possible
approach is a semantic recommender that shows users, on a map, the geo-location of items from the categories
they are interested in.
Several similar recommender systems are currently used by various businesses and industries offering public
online services:
• When a user consults a product page on an e-commerce Website such as Amazon.com, the application also
recommends other products, based on data about what other clients purchased together with the current item of
interest.
• Pandora Radio recommends playlists to users starting from their preferred song or musician name. Songs are
added to the playlist based on the keywords (tags) associated with the user input.
• Netflix suggests movies that a user might like to watch based on the user’s profile. When building the profile of
a user, his/her previous ratings and watching habits are compared to those of other users.
II. CONCEPTS
Recommender systems [1] are a subclass of information filtering systems that try to predict the rating or the
preference that a user would give to an item. To give accurate predictions, these systems can use algorithms
such as Pearson’s Correlation Algorithm [12], which measures the correlation between two continuous variables.
The Pearson correlation value falls between −1.00 and 1.00, where −1.00 is a strong negative correlation, 1.00 is
a strong positive correlation, and 0 means there is no correlation. In a social network, the neighborhood of users
whose preferences or interests are similar to those of a particular user can be found by calculating the Pearson
correlation coefficient. By collecting the preference data of the top N nearest neighbors of a particular user
(weighted by similarity), the user’s preference can be predicted.
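The prediction step described above can be sketched in Python as follows. This is a minimal illustration, not the PeRe implementation: the function names pearson and predict_rating, the top_n default, the choice to ignore neighbours with non-positive similarity, and the treatment of a constant profile as similarity 0 are all assumptions made for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant profile carries no correlation signal.
        return 0.0
    return cov / sqrt(var_x * var_y)

def predict_rating(target, neighbours, top_n=3):
    """Similarity-weighted average of the ratings of the top-N most similar users.

    neighbours is a list of (profile, rating) pairs; returns None when no
    neighbour is positively correlated with the target profile.
    """
    scored = [(pearson(target, profile), rating) for profile, rating in neighbours]
    # Assumption: only positively correlated neighbours contribute.
    positive = [pair for pair in scored if pair[0] > 0]
    nearest = sorted(positive, key=lambda pair: pair[0], reverse=True)[:top_n]
    total = sum(sim for sim, _ in nearest)
    if total == 0:
        return None
    return sum(sim * rating for sim, rating in nearest) / total
```

A neighbour whose profile is perfectly correlated with the target dominates the prediction, while a negatively correlated neighbour is ignored.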
The correlation values can be interpreted as follows:
• −1.0 to −0.7: a strong negative association.
• −0.7 to −0.3: a weak negative association.
• −0.3 to +0.3: little or no association.
• +0.3 to +0.7: a weak positive association.
• +0.7 to +1.0: a strong positive association.
These thresholds are, of course, somewhat arbitrary. In some situations the cut-off values might be moved closer
to 0 (e.g., 0.2 and 0.6), and in other situations closer to 1 (e.g., 0.4 and 0.8).
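The interpretation scale and its adjustable cut-off values can be expressed as a small helper. The function name describe_correlation and its parameter names are hypothetical. Because the listed ranges share their end points, the handling of values that fall exactly on a cut-off is an assumption of this sketch.

```python
def describe_correlation(r, weak=0.3, strong=0.7):
    """Map a Pearson coefficient to a verbal scale with adjustable cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if not 0 < weak < strong <= 1:
        raise ValueError("cut-offs must satisfy 0 < weak < strong <= 1")
    magnitude = abs(r)
    # Assumption: a value exactly on a cut-off belongs to the stronger band.
    if magnitude < weak:
        return "little or no association"
    strength = "strong" if magnitude >= strong else "weak"
    sign = "positive" if r > 0 else "negative"
    return f"{strength} {sign} association"
```

Calling describe_correlation(0.35, weak=0.4, strong=0.8) shows the effect of moving the cut-offs closer to 1: the same coefficient that counts as a weak positive association under the default thresholds is then reported as little or no association.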
The PeRe application focuses on a particular case: creating a user profile from the information gathered from
a social network, namely Facebook. Based on the created profile, it uses Pearson’s Correlation Algorithm to
compute the similarity with other user profiles. After the similarities are calculated, the user can search for
points of interest from a list of categories. The relevant POIs are the ones located in the user’s local area. They
can be filtered by the suggestions computed from similar profiles, filtered by a specific option entered by the
user, or not filtered at all, in which case all the available points of interest from that category are shown.
III. ARCHITECTURE
The application is built using a client–server model. The flow of the interaction between the client and the server
is represented in Fig. 1.
A. Initialization
At this step, the server loads the RDF (Resource Description Framework) [3, 4] data source into memory and
prepares for interaction with the client. Fig. 3 depicts an example of the RDF model. This RDF document stores
information about registered users and their friends, together with their last known physical location.
The preferences for each POI (Point of Interest) category are also stored as RDF triples.
B. Gathering social data
On first use, the user must allow the application to track his/her physical (geographical) location. Then, the
client connects to a social network to gather data using the credentials provided by the user. An internal user
profile is computed from the collected data. All this information is stored on the server side as RDF triples. The
similarity between the current user and all other users is also calculated and kept in memory (i.e., cached), so
that it can be provided to the client when needed.
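One way to keep the pairwise similarities in memory is a small cache keyed by user pair. The sketch below is only an assumption about how such a cache could look. The class SimilarityCache, its methods, and the pluggable similarity function (for example a Pearson correlation) are hypothetical names and not part of PeRe. It also assumes the similarity measure is symmetric.

```python
class SimilarityCache:
    """In-memory cache of pairwise similarity scores between user profiles."""

    def __init__(self, similarity):
        self._similarity = similarity  # function (profile_a, profile_b) -> float
        self._profiles = {}            # user_id -> profile (list of numbers)
        self._scores = {}              # frozenset({id_a, id_b}) -> score

    def add_user(self, user_id, profile):
        """Store a profile and (re)compute its similarity to every other user."""
        for other_id, other_profile in self._profiles.items():
            if other_id == user_id:
                continue  # a user is never compared with himself/herself
            key = frozenset((user_id, other_id))
            self._scores[key] = self._similarity(profile, other_profile)
        self._profiles[user_id] = profile

    def similar_to(self, user_id):
        """Return {other_user_id: cached score} without recomputing anything."""
        return {
            next(iter(key - {user_id})): score
            for key, score in self._scores.items()
            if user_id in key
        }
```

With this layout the cost of computing similarities is paid once, when a profile is added or refreshed, and later requests from the client are plain dictionary lookups.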
C. Searching recommended or custom POIs
Based on the user’s interest in a particular POI category, the server provides several suggestions obtained from
the RDF graph, taken from other users who have similar profiles (similarity index above 0). The suggested values
are shown in a list as available options for narrowing the POI search. The option selected by the user acts as a
filter for the POIs in his/her vicinity. The results are displayed on an interactive map by linking the POI
geo-location to the map coordinates.
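The suggestion and filtering steps can be illustrated with plain dictionaries standing in for the RDF graph. The function names, the data layout, and the name-substring matching rule are assumptions made for this sketch. Only the "similarity index above 0" threshold comes from the text.

```python
def suggest_options(similarities, preferences, category):
    """Return preferred values for a category from users with similarity > 0.

    similarities: {user_id: similarity index to the current user}
    preferences:  {user_id: {category: preferred value}}
    The result is ordered from the most to the least similar user.
    """
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    suggestions = []
    for user_id, score in ranked:
        if score <= 0:
            break  # every remaining user is at or below the threshold
        value = preferences.get(user_id, {}).get(category)
        if value is not None and value not in suggestions:
            suggestions.append(value)
    return suggestions

def filter_pois(pois, option):
    """Keep the POIs whose name contains the chosen option (case-insensitive).

    Assumption: matching on the POI name is an illustrative rule only.
    """
    needle = option.lower()
    return [poi for poi in pois if needle in poi["name"].lower()]
```

The list returned by suggest_options corresponds to the options offered to the user, and filter_pois narrows the nearby POIs once one of those options has been selected.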
D. Searching friends
The application also lets the user easily locate his/her friends who also use the PeRe system. The latest
geo-location information of these friends is read from the RDF data and used to pinpoint their location on a
map (similar to the POI search).
E. Signing out
The server updates the RDF data related to the current user when the client logs out of the application.
Figure 1. Overview of the application flow
IV. KNOWLEDGE MODEL
The following RDF vocabularies were used:
• foaf (Friend Of A Friend) [15] – the foaf:name property stores the name of the user and foaf:knows stores the
Facebook friends of the user.
• geo – the WGS84 geodetic reference datum [16], for geo-location information; the predicates geo:latitude and
geo:longitude represent the latest physical location of the user.
• contact – the PIM contact information [17]; the contact:city property stores the user’s latest location.
• pstcn [18] – pstcn:currentStatus stores the user status persistently. Currently, the possible values are Active
and Inactive. The Active value is stored for users who are currently logged in to the PeRe application. Once the
user signs out of the system, his/her status is updated to Inactive.
• pere – a namespace denoted by the http://PeRe.net/ URL, created for the PeRe application to store properties
that were not found in other vocabularies. For example, pere:interests stores the user activity needed to build
his/her profile, and pere:prefers holds the user preference for one of the following POI categories: gas_station,
store, bank, hotel, restaurant, hospital. The values are the names of the POIs (e.g., a particular gas station)
that the user last preferred in that category.
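The way these vocabularies combine into statements about one user can be sketched with plain Python tuples standing in for RDF triples. This is an illustration under stated assumptions, not the PeRe data model: the subject URI layout under http://PeRe.net/ and the "category=value" encoding of pere:prefers are hypothetical, and predicates are written as prefixed names exactly as the list above gives them.

```python
PERE = "http://PeRe.net/"  # namespace URL given in the text

def user_triples(user_id, name, friends, lat, lon, city, status, preferences):
    """Build (subject, predicate, object) tuples describing one user."""
    subject = PERE + "user/" + user_id  # hypothetical URI layout
    triples = [
        (subject, "foaf:name", name),
        (subject, "geo:latitude", str(lat)),
        (subject, "geo:longitude", str(lon)),
        (subject, "contact:city", city),
        (subject, "pstcn:currentStatus", status),
    ]
    triples += [(subject, "foaf:knows", PERE + "user/" + f) for f in friends]
    # Assumption: the text does not say how a preference is tied to its POI
    # category, so "category=value" is an illustrative encoding only.
    triples += [
        (subject, "pere:prefers", f"{category}={value}")
        for category, value in sorted(preferences.items())
    ]
    return triples

def objects(triples, subject, predicate):
    """All objects for a subject and predicate (a tiny triple-pattern lookup)."""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

Reading the latest city of a user is then a lookup such as objects(triples, subject, "contact:city"), which mirrors the kind of pattern matching a real RDF store performs.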
IV. TECHNICAL DETAILS
The Web interaction is facilitated by the new HTML5 [11] specifications, including the asynchronous model of
transferring data in JSON (JavaScript Object Notation) [9], a format used in conjunction with the AJAX
(Asynchronous JavaScript And XML) paradigm.
After the user allows the browser to detect the geo-location, a login form asks him/her to enter his/her Facebook
credentials. After successful authentication, a request is made to the Facebook API to obtain his/her
preferences and graph of friends via the Open Graph protocol [14].
At this point, the user can select the category of interest. The currently supported categories are: friends,
gas stations, restaurants, stores, banks, hospitals, and hotels. These, together with the geo-location
information, are stored in the RDF model on the server side using the Jena framework [2]. The system is generic
enough that a new POI category can be added by specifying only its name and graphical representation.
The user profile is composed of his/her activities on Facebook in several areas of interest (which can also have
semantic Web-based descriptions), such as Local business, Arts/humanities, Software, Website, Company, Movie,
Education, Camera/photo, Product/service, Application, University, Entertainment, Shopping/retail, Cause,
Internet/software, Book, Author, Musician/band, TV channel, Interest, Sport, Games/toys, etc. For each such
domain, his/her activity is quantified. The resulting user profile consists of a list of numbers used to compute
the degree of similarity between users. This approach could be useful in creating a knowledge base of user
profiles/personas.
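The quantification step can be sketched as counting a user's "likes" per domain. The input format (one domain name per liked page), the function name build_profile, and the shortened DOMAINS list are assumptions for illustration. The text does not describe how PeRe performs the counting.

```python
from collections import Counter

# A fixed domain order keeps profiles comparable position by position.
# Only a few of the domains named in the text are listed, for brevity.
DOMAINS = ["Local business", "Movie", "Book", "Musician/band", "Sport"]

def build_profile(liked_domains, domains=DOMAINS):
    """Count the user's likes per domain and return them as a list of numbers.

    liked_domains holds one domain name per liked page (assumed input format);
    domains outside the fixed list are ignored.
    """
    counts = Counter(liked_domains)
    return [counts[domain] for domain in domains]
```

Two profiles built over the same domain order can be passed directly to a correlation function, since position i in both lists refers to the same domain.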
The user can see the location of his/her friends who also use the PeRe application. Their locations are
pinpointed on a map, and a visual indicator shows whether each of them is online or offline.
To obtain the desired information, the user can choose from three available options: “Search all”, “Similar
searches” and “Own option”.
If “Search all” is selected, all the POIs returned by the Google Local Search API [8] that lie in the local area
of the user are pinpointed on the map, as shown in Fig. 2.
Figure 2. Pinpoints on the map showing restaurants around the user location
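The text does not state how the “local area” of the user is delimited; that is presumably handled by the search API. As a hedged illustration of one alternative, the sketch below keeps only the POIs within a fixed great-circle distance of the user. The 5 km default radius, the function names, and the POI dictionary layout are assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def pois_in_local_area(user_lat, user_lon, pois, radius_km=5.0):
    """Return the POIs within radius_km of the user, nearest first.

    Each POI is a dict with "lat" and "lon" keys (assumed layout).
    """
    with_distance = [
        (haversine_km(user_lat, user_lon, poi["lat"], poi["lon"]), poi)
        for poi in pois
    ]
    nearby = [(dist, poi) for dist, poi in with_distance if dist <= radius_km]
    return [poi for dist, poi in sorted(nearby, key=lambda pair: pair[0])]
```

Sorting by distance is a design choice of the sketch. It makes the nearest POI the first candidate when a route to a destination is requested.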
For the similarity option, the user can choose from a list that includes the preferences of other users.
Using Pearson’s Correlation Algorithm, a similarity index is computed between the current user and the other
users stored in the RDF files. Based on the results, a list of suggestions is built from the users that are the
“most similar” to the current user.
If the user chooses “Similar searches” or “Own option”, his/her input is stored in the RDF model as
his/her own preference, to ensure that this choice appears first among the suggested items the next time (s)he
performs a search. For example, if the user selects “Own option” for gas stations, (s)he additionally needs
to specify the name of the preferred gas station, which is then saved in the RDF model.
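The "own choice comes first" behaviour can be sketched with a dictionary standing in for the stored preferences. Both function names and the data layout are hypothetical. In PeRe the preference lives in the RDF model rather than in a Python dict.

```python
def remember_choice(own_preferences, category, choice):
    """Record the user's latest choice for a category, replacing any earlier one."""
    own_preferences[category] = choice

def ordered_suggestions(own_preferences, category, suggestions):
    """Put the user's own stored preference first, followed by the other suggestions."""
    own = own_preferences.get(category)
    rest = [s for s in suggestions if s != own]
    return ([own] if own is not None else []) + rest
```

Because the stored value is overwritten on every search, only the most recent choice is promoted, which matches the description of the value as the one the user last preferred.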
The user can receive a visual indication of how to get to the desired POI, starting from his/her current location
and using the easiest path. This path is the default suggestion from the Google API [7] and can also be used as
driving directions.
When the user logs out, his/her status is updated to “offline”, so his/her friends will know that (s)he has left
the application and that his/her geo-location might no longer be accurate.
Figure 3. Visual representation of a fragment from the RDF graph used by the PeRe system
Fig. 3 presents a simple example of the RDF graph used by the PeRe application. The model is explained in the
following paragraphs.
Each oval node in the above RDF graph represents an entity (subject). An edge (arrow) starting from an entity
represents a property (predicate) of that subject. A rectangular node represents a property value (literal).
Multiple entities may be connected by arrows, denoting the relationships established between them.
The only information available in the RDF graph about the user Maria Georgescu is that her current status is
Inactive (she has not yet logged in to the application).
Another user, Mihai Ionescu, knows six other people. His interests are encoded as a list of numbers, each
representing the number of “likes” (concerning an activity) for one of the domains presented above. His current
status is Active. He is located in Iasi at latitude 47.1221728 and longitude 27.5448174, and prefers Petrom as a
gas station.
The user Ion Popescu shares the same preference for the Petrom gas station and has the same location in Iasi. He
knows only three other people and shares one acquaintance with Mihai Ionescu. His status is also Active.
Pearson’s correlation algorithm uses the values of the users’ interests (lists of numbers). For the above users,
the similarity index is low.
The PeRe application provides a Web service [13], based on the SOAP paradigm, which allows other applications to
query the RDF data stored on the server and build their own business logic and user interface. A SPARQL
endpoint (also a Web service) [6] is available as well, so that queries can be tested before the Web service is
called directly. Fig. 4 shows the Web interface of the endpoint.
The PeRe Web service receives two input parameters: a string (i.e., a sequence of characters) representing the
SPARQL query [5], and the desired output format, which is one of the following values: XML, RDF, JSON, CSV (Comma
Separated Values) or TSV (Tab Separated Values). The Web service returns the set of data produced by executing
the given query on the RDF model stored on the server. The result set is formatted as specified by the format
parameter and returned as a string.
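The two-parameter contract of the service (a query string in, a formatted string out) can be illustrated as follows. This sketch does not execute SPARQL and is not the PeRe service code: EXAMPLE_QUERY is an illustrative query over the properties named in the knowledge model, the function format_result_set is hypothetical, the JSON layout is a simplification rather than the standard SPARQL results format, and the XML and RDF outputs are left out.

```python
import csv
import io
import json

# Illustrative query only; it is not run by this sketch.
EXAMPLE_QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX pere: <http://PeRe.net/>
SELECT ?name ?preferred
WHERE { ?user foaf:name ?name ; pere:prefers ?preferred . }
"""

def format_result_set(variables, rows, output_format):
    """Serialise rows of query results as a CSV, TSV or JSON string.

    variables: column names, e.g. the variables of a SELECT query
    rows:      list of tuples, one value per variable
    """
    fmt = output_format.upper()
    if fmt in ("CSV", "TSV"):
        buffer = io.StringIO()
        delimiter = "," if fmt == "CSV" else "\t"
        writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
        writer.writerow(variables)
        writer.writerows(rows)
        return buffer.getvalue()
    if fmt == "JSON":
        # Simplified layout: one object per row, keyed by variable name.
        return json.dumps([dict(zip(variables, row)) for row in rows])
    # XML and RDF are accepted by the service but omitted from this sketch.
    raise ValueError(f"format not covered by this sketch: {output_format}")
```

Returning a single string for every format keeps the interface uniform, so a client only has to choose the parser that matches the format it requested.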
V. CONCLUSIONS AND FURTHER WORK
The PeRe project is a semantic Web application that uses a recommender system to suggest, in an intelligent
manner, nearby resources of interest such as banks, gas stations, restaurants, hospitals, hotels, or stores, using
a public geo-location service.
In addition, the application can be easily extended with more search categories.
In the next version, we plan to add a feature that gives the user multimedia (video and/or audio) guidance to any
selected destination.
Figure 4. Web interface of the client application invoking the PeRe SPARQL endpoint
REFERENCES