GeoCrowd: Enabling Query Answering with Spatial Crowdsourcing
By GOPIKRISHNA KURRA, SRIKANTH GOLI, SIVATEJA KOTIPALLI

OUTLINE
Introduction. Related work. Problem Definition. Taxonomy of crowdsourcing. Preliminaries. Assignment protocol. Performance evaluation. Experimental Methodologies. Conclusion and Future work.

Introduction
Spatial crowdsourcing is emerging as a new platform that enables spatial tasks to be assigned to and performed by human workers.
Spatial crowdsourcing is the process of crowdsourcing a set of spatial tasks to a set of workers, which requires the workers to perform the tasks by physically traveling to the task locations.

Related work
Most existing work on spatial crowdsourcing focuses on a particular class of spatial crowdsourcing called participatory sensing.
Mobile Millennium project: uses GPS-enabled mobile phones to collect en route traffic information and upload it to a server in real time.

Related work (continued)
All the previous studies on participatory sensing focus on a single campaign and try to address challenges specific to that campaign. Examples of single campaigns include a campaign for watching petrol prices and a campaign for monitoring urban air pollution.
Examples of general-purpose crowdsourcing services: Amazon’s Mechanical Turk, CrowdFlower, etc.

Problem Definition
Maximum task assignment (MTA): In spatial crowdsourcing, the main optimization goal is to maximize the overall number of assigned tasks while conforming to the constraints of the workers.

Problem Definition (continued)
The paper proposes three solutions to the maximum task assignment problem: Greedy (GR), Least Location Entropy Priority (LLEP), and Nearest Neighbor Priority (NNP).

Taxonomy of Crowdsourcing
Spatial crowdsourcing classification: Based on the motivation of the workers, spatial crowdsourcing can be classified into two classes: reward-based spatial crowdsourcing and self-incentivised spatial crowdsourcing.
Spatial task publishing modes: With spatial crowdsourcing, tasks can be published in two different modes: Worker Selected Tasks (WST) mode and Server Assigned Tasks (SAT) mode.
Spatial task assignment modes: The authors define two modes of task assignment, distinguished by how the validity of the spatial tasks is verified: single task assignment and redundant task assignment.
[Figure: taxonomy of crowdsourcing]

Preliminaries
Definition 1 (Spatial Task): A spatial task t of the form <l, q, s, δ> is a query q to be answered at location l, where l is a point in 2D space. The query is asked at time s and expires at time s + δ.
Definition 2 (Spatial Crowdsourced Query): A spatial crowdsourced query of the form (<t1, t2, ...>, k) is a set of spatial tasks and a parameter k issued by a requester, where every spatial task ti is to be crowdsourced k times.
Definition 3 (Worker): A worker, denoted by w, is a carrier of a mobile device who volunteers to perform spatial tasks. A worker can be in either online or offline mode. A worker is online when ready to accept tasks.
Definition 4 (Task Inquiry or TI): A task inquiry is a request that an online worker w sends to the SC-server when ready to work. The inquiry includes the location l of w, along with two constraints: a spatial region R in which the worker accepts tasks, and the maximum number of acceptable tasks maxT.
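
Definitions 1 and 4 can be sketched as plain data records. This is only an illustrative sketch: the class and field names are hypothetical, the region R is assumed to be an axis-aligned rectangle, and a task is assumed to count as expired from time s + δ onward.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpatialTask:
    """Definition 1: a task <l, q, s, delta> (field names are hypothetical)."""
    location: Tuple[float, float]  # l, a point in 2D space
    query: str                     # q
    issued_at: float               # s
    duration: float                # delta

    def is_expired(self, now: float) -> bool:
        # Assumption: the task counts as expired from time s + delta onward.
        return now >= self.issued_at + self.duration


@dataclass(frozen=True)
class TaskInquiry:
    """Definition 4: what an online worker sends to the SC-server."""
    worker_id: str
    location: Tuple[float, float]               # l, the worker's location
    region: Tuple[float, float, float, float]   # R as (xmin, ymin, xmax, ymax); rectangle is an assumption
    max_tasks: int                              # maxT

    def accepts(self, task: SpatialTask, now: float) -> bool:
        # A task is eligible if it has not expired and lies inside R.
        xmin, ymin, xmax, ymax = self.region
        x, y = task.location
        inside = xmin <= x <= xmax and ymin <= y <= ymax
        return inside and not task.is_expired(now)
```

The maxT constraint is not checked in `accepts` because it limits how many tasks a worker receives in total, which is decided by the assignment step rather than per task.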

Preliminaries (continued)
Definition 5 (Task Assignment Instance Set): Let Wi = {w1, w2, ...} be the set of online workers at time si, and let Ti = {t1, t2, ...} be the set of available tasks at time si. The task assignment instance set, denoted by Ii, is the set of tuples of the form <w, t>, where a spatial task t is assigned to a worker w while satisfying the worker's constraints.
Definition 6 (Maximum Task Assignment (MTA)): Given a time interval ϕ = {s1, s2, ..., sn}, let |Ii| be the number of assigned tasks at time instance si. The maximum task assignment problem is the process of assigning tasks to the workers during the time interval ϕ such that the total number of assigned tasks, |I1| + |I2| + ... + |In|, is maximized.

ASSIGNMENT PROTOCOL
The SC-server should ideally have global knowledge of all the workers and tasks. In practice it only knows the workers and tasks available at the current time instance, so a globally optimal solution is not feasible. Using the spatial information and the capacity of the workers, the SC-server should instead arrive at a locally optimal solution. Three solutions are proposed based on this local optimization strategy.

ASSIGNMENT PROTOCOL (continued)
The three solutions of the assignment protocol are: the Greedy (GR) strategy, the Least Location Entropy Priority (LLEP) strategy, and the Nearest Neighbor Priority (NNP) strategy.

Greedy (GR) Strategy
At every time instance, GR tries to maximize the current assignment. It does not provide a globally optimal solution. The goal is to maximize the overall assignment by solving the maximum task assignment instance problem at every time instance. During task inquiry, every worker specifies two constraints: the spatial region R and the maximum number of tasks maxT.

Greedy (GR) Strategy (continued)
Theorem 1: The maximum task assignment instance problem is reducible to the maximum flow problem.
Consider a time instance si with Wi = {w1, w2, ...} as the set of online workers and Ti = {t1, t2, ...} as the set of available spatial tasks. Let Gi = (V, E) be the flow network graph. The set V contains |Wi| + |Ti| + 2 vertices (one per worker, one per task, plus a source and a destination), and the set E contains |Wi| + |Ti| + m edges, where m is the number of worker-task pairs in which the task lies inside the worker's region R.
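
The reduction in Theorem 1 can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the usual capacities for this kind of construction (source to worker with capacity maxT, worker to task with capacity 1 when the task lies in R, task to destination with capacity 1), assumes rectangular regions, and uses breadth-first augmenting paths. The function name and input layout are hypothetical.

```python
from collections import deque


def max_task_assignment(workers, tasks):
    """Assign tasks to workers for one time instance via maximum flow.

    workers: list of (region, max_t), region = (xmin, ymin, xmax, ymax).
    tasks:   list of (x, y) task locations.
    Returns a list of (worker_index, task_index) pairs.
    """
    n_w, n_t = len(workers), len(tasks)
    src, dst = n_w + n_t, n_w + n_t + 1
    # cap[u][v] holds the residual capacity of edge (u, v).
    cap = [dict() for _ in range(n_w + n_t + 2)]

    def add_edge(u, v, c):
        cap[u][v] = c
        cap[v].setdefault(u, 0)  # reverse edge for the residual graph

    for j, (region, max_t) in enumerate(workers):
        add_edge(src, j, max_t)  # worker j accepts at most maxT tasks
        xmin, ymin, xmax, ymax = region
        for k, (x, y) in enumerate(tasks):
            if xmin <= x <= xmax and ymin <= y <= ymax:
                add_edge(j, n_w + k, 1)  # task k lies inside worker j's region
    for k in range(n_t):
        add_edge(n_w + k, dst, 1)  # each task is assigned at most once

    while True:
        # Breadth-first search for a path with available capacity.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if dst not in parent:
            break  # no augmenting path left: the flow is maximum
        # Every path ends with a unit-capacity task->dst edge, so push 1 unit.
        v = dst
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u

    # A worker->task edge carries flow iff its reverse edge has residual capacity.
    return [(j, k) for j in range(n_w) for k in range(n_t)
            if cap[n_w + k].get(j, 0) > 0]
```

For example, with one worker covering both tasks and a second worker covering only the first task, the flow reroutes the first worker to the second task so that both tasks end up assigned.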

Greedy (GR) Strategy (continued)
[Figure: an example of Wi and Ti]

Greedy (GR) Strategy (continued)
[Figure: flow network graph Gi = (V, E)]

Greedy (GR) Strategy (continued)
Any algorithm that computes the maximum flow in the network can now be used. One such method is the Ford-Fulkerson method: keep sending flow from the source vertex to the destination vertex as long as there is a path between the two with available capacity.

Least Location Entropy Priority (LLEP) Strategy
The greedy strategy does not consider future optimizations. The idea here is to assign higher priority to tasks located in worker-sparse areas. Location entropy: a measure of the total number of workers in a location as well as the relative proportion of their future visits to that location.

Least Location Entropy Priority (LLEP) Strategy (continued)
Entropy(l) = - Σ_{w ∈ Wl} Pl(w) · log Pl(w), with Pl(w) = Ow,l / Ol, where Wl is the set of workers who visited l.
Pl(w) is the fraction of the total visits to l that belong to worker w. Ol is the total number of visits to location l. Ow,l is the total number of visits to location l that belong to worker w. A location visited by many workers in roughly equal proportions has high entropy; a location visited by only a few workers has low entropy.

Least Location Entropy Priority (LLEP) Strategy (continued)
The goal is to assign the maximum number of tasks at every time instance while keeping the total cost (location entropy) of the assigned tasks as low as possible.
Theorem 2: The minimum-cost maximum task assignment instance problem is reducible to the minimum-cost maximum flow problem.
Consider a time instance si with Wi = {w1, w2, ...} as the set of online workers and Ti = {t1, t2, ...} as the set of available spatial tasks, and let Gi = (V, E) be the flow network graph.

Least Location Entropy Priority (LLEP) Strategy (continued)
Every task is associated with a cost. Let vj be the vertex mapped to worker wj, and let v|Wi|+j be the vertex mapped to task tj. Every edge (u, v|Wi|+j) that connects a worker vertex u to the vertex of task tj has a cost associated with it, namely the location entropy of the location of tj.

Least Location Entropy Priority (LLEP) Strategy (continued)
The cost of all the other edges is set to 0.
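
To make the edge cost concrete, the location entropy of a single location can be computed from a log of visits. This is a small sketch assuming the standard entropy form, -Σ Pl(w) · log Pl(w), with the natural logarithm; the function name and the input layout (one worker id per visit) are hypothetical.

```python
import math
from collections import Counter


def location_entropy(visits):
    """Entropy of one location l.

    visits: iterable of worker ids, one entry per visit to l.
    """
    counts = Counter(visits)       # counts[w] plays the role of Ow,l
    total = sum(counts.values())   # Ol, the total number of visits to l
    if total == 0:
        return 0.0                 # assumption: an unvisited location has entropy 0
    # Pl(w) = Ow,l / Ol; the base of the logarithm is an assumption.
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

Under LLEP, a value of this kind would serve as the cost of each worker-to-task edge for a task at location l, so tasks in worker-sparse (low-entropy) locations are cheaper to assign.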
Hence, in the flow network Gi, we have to find the minimum-cost maximum flow. First, find the maximum flow with the Ford-Fulkerson method. Then minimize the cost of that flow by linear programming.

Least Location Entropy Priority (LLEP) Strategy (continued)
The total cost of the flow is defined as cost(f) = Σ_{(u,v) ∈ E} a(u,v) · f(u,v), where f(u,v) is the flow on the edge (u,v) and a(u,v) is the cost associated with that edge. The flow is subject to constraints such as f(u,v) <= c(u,v) (capacity) and f(u,v) = -f(v,u) (skew symmetry). All constraints are linear and the goal is to optimize a linear function, which can be done by linear programming.

Nearest Neighbor Priority (NNP) Strategy
GR and LLEP do not consider the travel cost (in time or distance). NNP incorporates the travel cost of the workers into the assignment process. Tasks that are closer to a worker have smaller travel costs. The goal is to maximize the task assignment at every time instance while minimizing the travel cost of the workers. Hence, higher priority is given to tasks that are closer.

PERFORMANCE EVALUATION
The authors conducted several experiments on both real-world and synthetic data to evaluate the performance of the proposed approaches: GR, LLEP, and NNP.

Experimental Methodology
In the first set of experiments, the authors evaluated the scalability of the proposed approaches by varying the number of spatial tasks from 50k to 200k. The figures show the results of the experiments on both the synthetic data and the real data.

Experimental Methodology (continued)
The figures show that LLEP outperforms both GR and NNP in terms of the number of assigned tasks.

Experimental Methodology (continued)
As the figures show, the average travel cost of the workers decreases in all cases as the number of tasks grows, because in a task-dense area there is a higher probability that an assigned task is close to a worker.

Effect of Maximum Acceptable Tasks Constraint
In the next set of experiments, the authors evaluated the impact of the maximum acceptable tasks (i.e., maxT) constraint using the synthetic data.

Effect of Maximum Acceptable Tasks Constraint (continued)
LLEP is the superior approach in terms of the number of assigned tasks, while NNP outperforms both GR and LLEP in terms of the travel cost.

Effect of Spatial Region Constraint
In the final set of experiments, the authors measured the performance of the approaches as the spatial region of every worker is expanded.

Effect of Spatial Region Constraint (continued)
LLEP outperforms both GR and NNP in terms of the number of assigned tasks, while the NNP approach is superior in terms of the travel cost.

RELATED WORK
With the increasing popularity of crowdsourcing, a set of crowdsourcing services such as Amazon’s Mechanical Turk (AMT) and CrowdFlower have recently emerged, which allow requesters to issue tasks that workers can perform for a certain reward.

RELATED WORK (continued)
One class of spatial crowdsourcing is known as participatory sensing, in which workers form a campaign to perform sensing tasks. For example, they use GPS-enabled mobile phones to collect traffic information.

RELATED WORK (continued)
Another class of spatial crowdsourcing is known as volunteered geographic information (VGI), in which the goal is to create geographic information provided voluntarily by individuals. Some examples include: 1. WikiMapia 2. OpenStreetMap 3. Google Map Maker

CONCLUSION AND FUTURE WORK
In this paper, the authors introduced spatial crowdsourcing as the process of crowdsourcing a set of spatial tasks to a set of workers. As future work, the authors aim to focus on the other classes of spatial crowdsourcing. Moreover, since location privacy is one of the major impediments that may hinder workers from participating in spatial crowdsourcing, they plan to extend their work to protect the location privacy of the workers.

REFERENCES
[1] Amazon Mechanical Turk. http://www.mturk.com.
[2] Center for Embedded Networked Sensing (CENS). http://urban.cens.ucla.edu/projects/.
[3] CrowdFlower. http://www.crowdflower.com.
[4] Google Map Maker. http://www.wikipedia.org/wiki/Google Map Maker.
[5] Gowalla. http://www.wikipedia.org/wiki/Gowalla.
[6] Minimum-cost maximum flow problem. http://www.wikipedia.org/wiki/Minimum-cost flow problem.
[7] OpenStreetMap. http://www.wikipedia.org/wiki/OpenStreetMap.

Thank You
http://students.cse.unt.edu/~gk0096/