Incentivize Crowd Labeling under Budget Constraint
Qi Zhang, Yutian Wen, Xiaohua Tian, Xiaoying Gan, Xinbing Wang
Shanghai Jiao Tong University, China

Outline
Introduction to Crowdsourcing Mechanism
Problem Formulation and Mechanism Setting
Mechanism Analysis
Performance Evaluation

Background
Crowdsourcing systems leverage human wisdom to perform tasks, such as:
  • Image classification
  • Character recognition
  • Data collection

Types of Tasks
Tasks can be divided into two categories:
Structured response format
  • Binary choice
  • Multiple choice
  • Real value
Unstructured response format
  • Logo design

Motivating Example
Example: Image classification
[Figure: workers, task allocation, crowdsourcing platform, inference algorithm; example labels: Dog, Cat]

Framework: Reverse Auction
(1) Tasks
(2) Bids
(3) Winning bids determination
(4) Winning bids
(5) Answers
(6) Payments

Major Challenges (1)
To design a successful crowdsourcing system:
Task allocation (winning bids)
  • Tasks should be allocated evenly
Payment determination
  • Must provide proper incentives (monetary rewards)
Inference algorithm
  • Should improve overall accuracy
  • Should address the diversity of the crowd

Major Challenges (2)
We need to model:
Diverse task difficulty
  • Dog or cat
  • Older than 30 or not
Diverse worker quality

Model on Tasks (1)
We focus on binary choice tasks
  • Each task is a 0-1 question
  • (Assumption) Each worker is uniformly reliable
Task soft label
  • Probability that the task is labeled as 1 (by a reliable worker)
Crowd label
  • 0 or 1

Model on Tasks (2)
The soft label is viewed as a random variable drawn from a Beta distribution
  • Prior: Beta distribution with parameters (a, b)
  • Update the parameters (a, b) by Bayes rule: posterior ∝ prior × likelihood
Inference
  • The task is inferred as 1 if more than half of the labels agree

Framework: Reverse Auction
The platform publicizes a set of binary tasks
Workers reply with a set of bids
  • Each bid is a task-price pair
(Allocation) The platform sequentially decides the winning bids
(Payment) Winning workers provide labels and get paid

Crowdsourcing Platform Utility
After all crowd labels are observed, the parameters (a, b) are updated and the soft-label distribution changes accordingly
Platform utility: KL divergence between the initial and the final distribution

Problem Formulation
We want:
  • Platform utility maximization under a budget constraint
  • Individual rationality
  • Truthfulness about the cost (truthful bid vs. untruthful bid)
  • Computational efficiency

Allocation Scheme (1)
The task allocation (winning bid determination) is sequential:
  1. Candidate selection: one candidate per round, taken from the remaining bids
  2. Proportional rule check: the candidate either becomes a winning bid or is discarded
  3. Answer collection and soft label update
The allocation scheme repeats the 3 steps until all bids are either discarded or selected as winning bids

Allocation Scheme (2)
The candidate selection is greedy
  • The candidate is the bid with the largest platform utility gain per unit price (PU gain / price)
  • Platform utility gain: defined in terms of the current distribution and the updated distribution

Allocation Scheme (3)
Proportional rule check
  • [Equation not preserved; its labeled terms are: budget, price, fraction, ratio]
Soft label update
  • Collect the answer from the winning bid
  • Update the soft label according to Bayes rule

Allocation Scheme (5)
Candidate selection
Proportional rule check
Soft label update
Computationally efficient!

Payment Scheme (1)
Original allocation: winning bids {A, B, C}, discarded {D, E, F}
Kick out C and rerun the allocation on {A, B, D, E, F}: winning bids {A, B, D, E}, discarded {F}
  • b1, b2, b3, b4 are the prices associated with A, B, D, E respectively
  • b1 is the minimum price at which bid C can replace bid A
  • p(C) = max{b1, b2, b3, b4}

Payment Scheme (2)
(Proposition) The winning bid C is paid the threshold payment.
p(C): C's payment; b(C): C's bid
  • If b(C) < p(C), C is a winning bid
  • If b(C) > p(C), C is discarded
With winning bids {A, B, D, E} and prices b1, b2, b3, b4: p(C) = max{b1, b2, b3, b4}

Payment Scheme (3)
(Proposition) The incentive mechanism is truthful
  • Each bid has a cost
  • Workers will truthfully reveal the cost as the asked price
Why?
Proof: Threshold payment + greedy candidate selection

Individual Rationality
(Proposition) The incentive mechanism is individually rational
  • The utility of a winning bid is nonnegative
Proof: Let us consider the winning bid C
  1. C is the 3rd winning bid.
  2. The first 2 bids are the same.
  3. b3 is the minimum price at which bid C can replace the new 3rd bid (D).
Original winning bids: {A, B, C}
New winning bids: {A, B, D, E}, with prices b1, b2, b3, b4
It is true that b3 > b(C).
p(C) = max{b1, b2, b3, b4}, so p(C) ≥ b3
Therefore p(C) ≥ b3 > b(C), that is, p(C) > b(C)

Budget Feasibility
(Proposition, Payment Bound) Payment to each winning bid is upper bounded by [equation not preserved]
  • Proportional rule: [equation not preserved]
  • Set [parameter value not preserved]

Performance Evaluation (1)
Benchmarks
  1. Untruthful allocation: workers' cost is public information
  2. Random allocation: candidate selection is random
[Table: Benchmark 1, Benchmark 2, and My Mechanism compared on truthfulness, running time, and platform utility; surviving cell values: High, Low, Low, High]

Performance Evaluation (2)
Metric 1: Platform utility
  • Platform utility vs. budget
[Figure annotations: price of truthfulness; gain over random allocation]

Performance Evaluation (3)
Metric 2: Budget utilization
  • Payment / budget
[Figure annotation: budget utilization gain over random allocation]

Thank you!
Presented by: Qi Zhang
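
Appendix sketch (not part of the original slides)
The Beta soft-label model, the KL-based platform utility, and the greedy candidate selection from the Model on Tasks and Allocation Scheme slides can be illustrated with a short Python sketch. Several choices below are assumptions, since the slide equations did not survive: the direction of the KL divergence (updated distribution relative to the current one), the definition of the gain as an expectation over the next label under the current posterior mean, tie-breaking in the inference step, and all function names (digamma, kl_beta, update, infer, expected_gain, select_candidate). The proportional rule check is deliberately left out because its formula is not recoverable from the slides.

```python
import math


def digamma(x: float) -> float:
    """Digamma function for x > 0: shift x upward, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # recurrence: psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def kl_beta(a1: float, b1: float, a2: float, b2: float) -> float:
    """KL( Beta(a1, b1) || Beta(a2, b2) ), standard closed form."""
    return (log_beta(a2, b2) - log_beta(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 - a1 + b2 - b1) * digamma(a1 + b1))


def update(a: float, b: float, label: int) -> tuple:
    """Bayes update of the Beta parameters after one crowd label (0 or 1)."""
    return (a + 1, b) if label == 1 else (a, b + 1)


def infer(a: float, b: float) -> int:
    """Infer 1 when a > b. With a symmetric prior this is a majority vote.
    Assumption: ties are resolved as 0."""
    return 1 if a > b else 0


def expected_gain(a: float, b: float) -> float:
    """Assumed utility gain: expected KL between updated and current Beta,
    with the next label drawn according to the current posterior mean."""
    p1 = a / (a + b)
    return (p1 * kl_beta(a + 1, b, a, b)
            + (1 - p1) * kl_beta(a, b + 1, a, b))


def select_candidate(bids, soft_labels):
    """Greedy step: return the (task, price) bid with the largest gain per unit price."""
    return max(bids, key=lambda bid: expected_gain(*soft_labels[bid[0]]) / bid[1])


# Hypothetical example: two tasks, one uncertain and one nearly settled.
soft_labels = {"t1": (1, 1), "t2": (5, 1)}
bids = [("t1", 2.0), ("t2", 2.0)]
candidate = select_candidate(bids, soft_labels)
```

At equal prices the sketch prefers the task whose soft label is still uncertain, because one more label moves its distribution further. A cheaper bid on the nearly settled task can still win the greedy comparison, which is the trade-off the gain-per-price ratio is meant to capture.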