Task Planning and Incentives in Ubiquitous Crowdsourcing
KSE801, Uichin Lee

Recruitment Framework for Participatory Sensing Data Collections
KSE801, Uichin Lee

Participatory Sensing
• Allowing people to investigate processes with mobile phones
• Community-based data collection and citizen science, offering automation, scalability, and real-time processing and feedback
• Examples: taking photos of assets that document recycling behavior, flora variety, and green resources in a university

Participatory Sensing: Challenges
• Diverse users and participatory sensing projects
• How do we match participants to projects?
• Goal: devise a new recruitment framework using availability and reputation
  – Spatio-temporal availability based on mobility and mode of transport
  – Reputation based on data collection performance

Sustainability Campaigns
• GarbageWatch: The campus needs to divert 75% of its waste stream from landfills, and effective recycling can help reach this goal. By analyzing photos, one can determine whether recyclables (paper, plastic, glass, or aluminum) are being disposed of in waste bins, and then identify regions and time periods with low recycling rates.
• What's Bloomin: Water conservation is a high-priority issue for the campus, and efficient landscaping can help. By collecting (geotagged) photos of blooming flora, facilities could later replace high-water-usage plants with drought-tolerant ones.
• AssetLog: For sustainable practices to thrive on a campus, the existence and locations of up-to-date "green" resources need to be documented (e.g., bicycle racks, recycle bins, and charging stations).

System Overview

Recruitment Framework
• Qualifier: minimum requirements
  – Availability: destinations and routes within space, time, and mode-of-transport constraints
  – Reputation: sampling likelihood, quality, and validity over several campaigns, or from campaign-specific calibration exercises
• Assessment: participant selection
  – Identify a subset of individuals who maximize coverage over a campaign area and time period while adhering to the required mode of transport
  – Cost may be considered when selecting participants
• Progress review: checking "consistency"
  – Review coverage and data collection performance periodically
  – If participants fall below a certain threshold, provide feedback or recruit more participants

Related Work
• Mobility models
  – Location summarization for personal analytics: from location traces to places (e.g., spatio-temporal clustering, density-based clustering, reverse geocoding)
  – Location prediction to adapt applications: mostly for location-based services (LBS); prediction methods include Markov models, time-series analysis, etc.
• Reputation systems
  – Summation and average (e.g., Amazon reviews)
  – Bayesian systems (e.g., the Beta reputation system)
• Selection services
  – Online labor markets: M-Turk, GURU.com
  – Sensor systems: traditional sensor networks focused on coverage (or sensing in a predefined zone)

Coverage Based Recruitment
• Mobility traces (sampled, say, every 30 seconds)
• Density-based clustering to find "destinations" (or places)
• Routes are the points between destinations
• Mode of transport is inferred (e.g., still, walking, running, biking, or driving)
• Qualifier filters: e.g., individuals with at least 5 destinations in a certain area in a week, or individuals with at least 7 unique walking routes during weekday daytime hours
• Assessment (sketches of the clustering and selection steps follow below):
  – Given (1) a set of participants with associated costs and spatial blocks with mode of transport over time, and (2) blocks with certain utilities
  – Maximize the utility under budget constraints (NP-hard); a greedy algorithm is known to achieve at least 63% of the optimum
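To make the destination-finding step concrete, here is a minimal sketch using scikit-learn's DBSCAN. The 30-second sampling interval comes from the slide; the eps_m and min_minutes values and the degrees-to-meters conversion are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: find "destinations" in a mobility trace with DBSCAN.
# Assumptions (illustrative): fixes are (lat, lon) pairs sampled every
# 30 s, and a destination is any sufficiently dense cluster of fixes.
import numpy as np
from sklearn.cluster import DBSCAN

def find_destinations(fixes, eps_m=50.0, min_minutes=10):
    """fixes: array of shape (n, 2) holding (lat, lon) in degrees."""
    # Roughly convert degrees to meters near the trace's mean latitude.
    lat0 = np.radians(fixes[:, 0].mean())
    xy = np.column_stack([
        fixes[:, 1] * 111_320 * np.cos(lat0),  # lon -> meters
        fixes[:, 0] * 110_540,                 # lat -> meters
    ])
    # With one fix per 30 s, min_minutes of dwell time = 2*min_minutes fixes.
    labels = DBSCAN(eps=eps_m, min_samples=2 * min_minutes).fit_predict(xy)
    # One centroid per cluster; label -1 marks in-transit points, which
    # form the "routes" between destinations.
    return [fixes[labels == k].mean(axis=0) for k in set(labels) - {-1}]
```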
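The assessment step can likewise be sketched as a cost-benefit greedy for budgeted coverage: repeatedly pick the affordable user with the highest marginal utility per unit cost. The input structures below (per-user block sets, block utilities, costs) are assumed for illustration; in the budgeted maximum-coverage literature, the (1 − 1/e) ≈ 63% guarantee the slide cites is obtained by returning the better of this greedy solution and the single best affordable user.

```python
# Sketch of greedy participant selection under a budget constraint.
# Assumed inputs (illustrative): blocks[u] is the set of spatio-temporal
# blocks user u covers under the campaign's mode-of-transport constraint,
# utility[b] is the utility of block b, and cost[u] is u's recruitment cost.

def greedy_select(users, blocks, utility, cost, budget):
    chosen, covered, spent = [], set(), 0.0
    while True:
        def gain(u):  # marginal utility of adding user u
            return sum(utility[b] for b in blocks[u] - covered)
        # Among affordable, not-yet-chosen users, pick the one with the
        # highest marginal utility per unit cost.
        candidates = [u for u in users
                      if u not in chosen and spent + cost[u] <= budget]
        best = max(candidates, key=lambda u: gain(u) / cost[u], default=None)
        if best is None or gain(best) == 0:
            break
        chosen.append(best)
        covered |= blocks[best]
        spent += cost[best]
    return chosen, covered
```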
Coverage Based Recruitment
• Reviewing an M×N spatio-temporal association matrix
  – M rows: spatial blocks (100 m × 100 m)
  – N columns: distinct time slots in a day (accumulated over a week)
  – Entry: the proportion of time spent in a spatial block (that satisfies the mode-of-transport and monitoring-period constraints)
• Comparing two consecutive weeks (to check deviation)
  – Singular Value Decomposition (SVD): A = U Σ V^T
  – U: patterns common across different time periods (days)
  – Σ: singular values (σ1 … σrank) show the variance captured by each pattern
  – Consistency is checked by comparing the dominant patterns of consecutive weeks (see the sketch at the end of this section)

Participation and Performance Based Recruitment
• Cross-campaign vs. campaign-specific reputation
• Focus on campaign-specific indicators:
  – Timeliness (latency)
  – Relevancy (whether a sample falls within the phenomenon of interest)
  – Quality
  – Participation likelihood: whether an individual took a sample when given the opportunity
• Beta reputation model with a Beta(α, β) distribution, where α counts successes and β counts failures
  – Expected reputation: E = α / (α + β)
  – Exponential averaging over time (with some aging factor w); see the sketch at the end of this section

Evaluation
• Campaign deployment information
• Ground truth: experts traversed the routes

Coverage Based Recruitment
• Evaluated assessment methods:
  – Random: select users from campaigns arbitrarily
  – Naïve: select users who cover the most blocks overall, without considering the coverage of existing participants
  – Greedy: select users who maximize utility, taking into account the coverage of existing participants

Coverage Based Recruitment
• Consistency check for campaign coverage (progress review)
  – Example of a detected deviation: a participant's mode of transport changed from walking to driving

Participation and Performance Based Recruitment
• Evaluated participation likelihood
  – Other metrics (e.g., timeliness, relevancy, quality) were not considered, due to the nature of the projects (i.e., automatic uploading)
• A user's reputation after the "AssetLog calibration exercise"

Participation and Performance Based Recruitment
• Re-evaluating reputation over two weeks, with and without exponential aging

Discussion
• Greedy vs. naïve: the more users' coverage overlaps, the larger the difference between the two methods
• Cross-campaign considerations: because of individual preferences, performance may differ across campaigns
• Participants grew tired of collecting samples
• Participants reported that the act of data capture should be streamlined so that it can be repeated rapidly
• Participants wanted visualization (e.g., a map)
• Participants were generally OK with "minor" deviations from their routes, but drastic changes may require some incentives
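Looking back at the progress-review slide, one plausible reading of the SVD-based consistency check is sketched below: measure how much of the current week's association matrix is explained by the previous week's top-k spatial patterns. The specific metric and the choice k=3 are assumptions, since the slide's own formula did not survive extraction.

```python
# Sketch of an SVD-based consistency check between two weeks.
# Assumption: consistency is the fraction of week 2's "energy" captured
# by week 1's dominant spatial patterns (top-k left singular vectors).
import numpy as np

def week_consistency(A_prev, A_curr, k=3):
    """A_prev, A_curr: M x N spatio-temporal association matrices
    (rows = 100 m x 100 m spatial blocks, columns = time slots)."""
    U, s, Vt = np.linalg.svd(A_prev, full_matrices=False)
    Uk = U[:, :k]                     # top-k spatial patterns of week 1
    projected = Uk @ (Uk.T @ A_curr)  # week 2 projected onto those patterns
    # Ratio in [0, 1]; a drop signals a deviating participant, e.g. one
    # whose mode of transport changed from walking to driving.
    return np.linalg.norm(projected) ** 2 / np.linalg.norm(A_curr) ** 2
```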
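The Beta reputation update with exponential aging, referenced above, fits in a few lines. The Beta(1, 1) prior and the aging factor w = 0.9 are illustrative defaults, not values from the paper.

```python
# Sketch of the Beta reputation model with exponential aging.
# alpha counts successes and beta counts failures, as on the slide.

class BetaReputation:
    def __init__(self, alpha=1.0, beta=1.0):
        # Beta(1, 1) is the uniform prior: no evidence yet, E = 0.5.
        self.alpha, self.beta = alpha, beta

    def update(self, success: bool):
        # One observation: e.g., did the user sample when given the chance?
        if success:
            self.alpha += 1
        else:
            self.beta += 1

    def age(self, w=0.9):
        # Exponential averaging: discount old evidence each period so
        # that recent behavior dominates the estimate.
        self.alpha *= w
        self.beta *= w

    @property
    def expected(self):
        # Expected reputation E = alpha / (alpha + beta).
        return self.alpha / (self.alpha + self.beta)
```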
Dynamic Pricing Incentive for Participatory Sensing
Juong-Sik Lee and Baik Hoh, Nokia Research
Pervasive and Mobile Computing, 2010

Introduction
• Dynamic return on investment (ROI) of participatory sensing applications (different data types, users' contexts, etc.)
• Fixed-price incentives may not work well; further, it is hard to come up with an optimal price
• Reverse auction: users bid to sell their data, and the buyer selects a predefined number of the lowest-bidding users
  – The selling price changes dynamically

Reverse Auction
• A user's utility: U(b) = (b − t) · p(b)
  – b: the bid price (credit received if the bid wins)
  – t: the base value of the data (as perceived by the user)
  – p(b): the probability of winning with bid b

Problems with Reverse Auction
• Losing users may drop out of the system
• Incentive cost explosion: when the number of active users falls to a threshold (here, m), the remaining users can raise their bids as much as they want
• Solution: for each loss, the buyer gives a virtual participation credit (of fixed amount α); credit accumulates over time
  – A seller can use the credit to lower its effective bid (thus increasing its winning probability)
• (Figure: bid prices over rounds for winners and losers, showing the incentive cost explosion and the "virtual credit" α given to losers)

Credit-based Incentives Summary
• Random Selection based Fixed Pricing (RSFP):
  – (+) Simple to implement
  – (+) Easy to predict total incentive cost
  – (−) Difficult to decide the optimal incentive price
  – (−) Unable to adapt to dynamic environments
• Reverse Auction based Dynamic Pricing with Virtual Participation Credit (RADP-VPC):
  – (+) Eliminates the complexity of deciding an incentive price
  – (+) Able to adapt to dynamic environments
  – (+) Minimizes incentive cost
  – (+) Better fairness of incentive distribution
  – (+) Higher social welfare
  – (−) Relatively harder to implement than RSFP

Evaluation Model
• ROI up to round r = [earnings so far] / ([# of participations up to round r] × [minimum reward])
• If ROI(r) drops below 0.5, the user drops out of the system
• A user's valuation is randomly generated from some distribution
• Evaluation items (a simulation sketch follows at the end of this section):
  – Incentive cost reduction
  – Fairness against true valuation
  – Service quality

Results (figures)
• Incentive cost comparison and reduction under RADP-VPC (reverse auction dynamic pricing with virtual participation credit)
• Fairness against true valuation
• Service quality guarantee

Discussion
• Privacy leak: one has to send data along with a bid
  – Data encryption prevents the buyer from validating quality (how about using homomorphic crypto?)
• A data broker between seller and buyer
  – Data collection, maintenance, processing/mining
• Handling different types of apps (e.g., real-time vs. asynchronous)
• How to guarantee data integrity and maintain sellers' reputations?
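To tie the RADP-VPC mechanics to the ROI-based dropout rule of the evaluation model, here is a toy round-by-round simulation. The bidding rule, the values of m, α, and the minimum reward, paying winners their bid, and resetting a winner's credit are all simplifying assumptions for illustration.

```python
# Toy sketch of RADP-VPC rounds with the ROI-based dropout rule.
import random
from dataclasses import dataclass

@dataclass(eq=False)  # eq=False keeps instances hashable for the bids dict
class Seller:
    valuation: float        # t: the seller's true valuation of its data
    credit: float = 0.0     # accumulated virtual participation credit
    earnings: float = 0.0
    rounds: int = 0

def run_round(sellers, m=3, alpha=1.0, min_reward=1.0, roi_threshold=0.5):
    # Simplistic bidding: each seller asks at least its true valuation;
    # virtual credit lowers the *effective* bid used for ranking.
    bids = {s: s.valuation * random.uniform(1.0, 1.5) for s in sellers}
    ranked = sorted(sellers, key=lambda s: bids[s] - s.credit)
    winners, losers = ranked[:m], ranked[m:]
    for s in winners:
        s.earnings += bids[s]   # winners are paid their bid price
        s.credit = 0.0          # credit is consumed upon winning
    for s in losers:
        s.credit += alpha       # virtual credit keeps losers engaged
    for s in sellers:
        s.rounds += 1
    # ROI(r) = earnings / (participations * min reward); drop out below 0.5.
    return [s for s in sellers
            if s.earnings / (s.rounds * min_reward) >= roi_threshold]

sellers = [Seller(valuation=random.uniform(1.0, 5.0)) for _ in range(10)]
for _ in range(20):
    sellers = run_round(sellers)
print(f"{len(sellers)} sellers remain after 20 rounds")
```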