Mechanisms for Making Crowds Truthful
Andrew Mao, Sergiy Nesterko

Improving Peer Prediction
- Weaknesses in the Miller et al. paper:
  - Honest reporting is not a unique equilibrium (or even Pareto-optimal)
  - Collusion is not limited to symmetric strategies, nontransferable utility
  - Does not give a minimum bound on the payoff difference between lying and truth-telling
    - Players may be indifferent if the difference in payoffs is less than ε
  - Scoring rules cannot be easily extended to accommodate new constraints

Overview
- Address cases of collusion
- Improve payment mechanism by creating unique NE, or at least Pareto-optimal NE
- Use multiple reference raters (>= 4)
  - "...By giving a higher reward for matching all but one of the reference reports, it is possible to give a higher expected payoff to the truthful reporting equilibrium..."
- Symmetric and asymmetric strategies
- Transferable / non-transferable utility
- Automated mechanism design approach
  - Payments computed by optimization, rather than closed form scoring rules

Some Features
- Only pure strategies are considered
  - Mixed strategy Bayes-Nash equilibria are too complicated to compute
- Initially, prove NE for truthful reporting, then extend to different collusive cases
- Payments to players for good or bad reports determine best-response strategies

The Model
- Many buyers experience the same product with varying levels of quality.
- Define type as product quality, with a discrete distribution.
  - We'll use just two types: Good and Bad.
- Buyers can rate what they get with either 1 (good) or 0 (bad).
- They get some reward for reporting.
- In sequential games, respondent rewards are computed in batches
  - Apply this model repeatedly to achieve sequential play

Model continued
- Common prior among players, center
- N respondents in each batch
- Possible strategies: (0, n) and (1, n); for n = 0 … N-1
  - n is the number of other players that submit a positive report
- Probability that n positive reports are submitted by the remaining N-1 reviewers, given my signal o_i:

Example of Incentive-Compatible Payments
- Plumber Bob has the following prior:
  - P(G) = 0.8, P(B) = 0.2
  - P(1|G) = 0.9, P(1|B) = 0.15
- Suppose Alice (customer) has a job done well. Then P(G|1) = 0.96.
- She is told: "the report is paid only if it matches the reference report. A negative report is paid $2.62, while a positive report is paid $1.54"
- Then Alice expects the next user to get good service from Bob with probability P(1|1) = P(1|G)P(G|1) + P(1|B)P(B|1) = 0.87.

Example continued
- Alice wants to match the expected report of the next customer
- If she tells the truth, her expected payoff is 0.87 * 1.54 + 0.13 * 0 = 1.34; if she lies, it is 0.87 * 0 + 0.13 * 2.62 = 0.34.
- So, no incentive to lie.
- Note that if we let P(G) = 0.001 and P(B) = 0.999, this is reversed!
  - It is important that payoffs correspond to the right prior!
- But even with smart payoffs, everyone reporting 1 is still an equilibrium!
  - This is addressed in a later section

Automated Mechanism Design
- i.e., how did we magically compute payments to Alice?
- First proposed by Conitzer and Sandholm (2003)
- In general, mechanisms are computed to satisfy specified design goals, instead of deriving closed form rules
- Allows variations within a class of mechanisms to be dynamically generated
- Mechanism can make use of specific available information
- In this case: computing payments by solving optimization problems

Incentive-Compatible Payment Mechanisms
- Payment mechanism is incentive-compatible if honest reporting is a Nash Equilibrium
- How do you compute the payment scheme so as to satisfy this?
- Can you create a unique NE?
- Is it efficient?
- We want:
  - minimize expected payment to each player
  - a reward margin between truthful and dishonest reports
  - all payments must be positive

Solving a Linear Program
- Simple case: no collusion resistance
- For this to make sense, everyone must have the same prior

Analytical Solution to the LP
- From constraints in the LP, we have two nonzero decision variables (payments)
- Lemma: ratio of Pr[n|1]/Pr[n|0] is monotonically increasing in n
- Must be for two separate reports: τ(0, n1) and τ(1, n2)
- From the dual, expected payment depends on this ratio
- Under cost minimization, incentive-compatible payments are driven to n1 = 0 and n2 = N-1, respectively
- Result: only τ(0, 0) and τ(1, N-1) are positive payments

Satisfying Incentive Compatibility
- Consider the conditions for incentive compatibility, with n1 = 0, n2 = N-1:
  - τ(0, 0) > τ(1, 0); τ(1, N-1) > τ(0, N-1)
- In the 2-player case, this becomes
  - τ(0, 0) > τ(1, 0); τ(1, 1) > τ(0, 1)
- Obviously, this introduces the "all-report-high" and "all-report-low" equilibria
- Now, how do we fix this?
  - Add more constraints to the optimization problem!

Extensions
- Coalition size (full coalition/fractional coalition)
- Symmetric vs. asymmetric strategies
- Transferable utility
- Some combinations of these conditions are unreasonable
  - e.g., it doesn't make sense if colluders can make side payments but not coordinate on asymmetric strategies
- Achieving unique or Pareto-optimal Nash equilibria

Extension: Full coalition, symmetric strategies, non-transferable utilities
- We want to get rid of the "all-report-X" Nash Equilibrium
- Extending the plumber example to N = 4 agents, look at probabilities
- Note the differences in distributions!
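
The probabilities for the N = 4 plumber example can be recomputed from the prior. The sketch below is an illustration only, not the authors' code. It assumes signals are conditionally independent given Bob's type, so that Pr[n | o_i] = sum over types θ of P(θ | o_i) * Binomial(n; N-1, P(1|θ)). As a cross-check on the earlier single-reference example (N = 2), it also solves the two binding incentive constraints for τ(0, 0) and τ(1, N-1) with an assumed margin of 1; that margin is a guess and is not stated in the slides. All function names are hypothetical.

```python
from math import comb

# Plumber prior from the example.
PRIOR = {"G": 0.8, "B": 0.2}    # P(type)
P_POS = {"G": 0.9, "B": 0.15}   # P(report 1 | type)


def posterior(signal):
    """P(type | own signal) by Bayes' rule."""
    like = {t: P_POS[t] if signal == 1 else 1.0 - P_POS[t] for t in PRIOR}
    norm = sum(PRIOR[t] * like[t] for t in PRIOR)
    return {t: PRIOR[t] * like[t] / norm for t in PRIOR}


def count_distribution(signal, n_agents):
    """Pr[n | signal] for n = 0..N-1 positive reports among the other N-1 agents.

    Assumption: signals are conditionally independent given the type.
    """
    post = posterior(signal)
    others = n_agents - 1
    dist = []
    for n in range(others + 1):
        p = 0.0
        for t in PRIOR:
            binom = comb(others, n) * P_POS[t] ** n * (1 - P_POS[t]) ** (others - n)
            p += post[t] * binom
        dist.append(p)
    return dist


def minimal_payments(n_agents, margin=1.0):
    """Solve the two binding incentive constraints for tau(0, 0) and tau(1, N-1).

    Assumptions: every other payment is zero, and both constraints hold with
    equality at the given margin (margin=1.0 is a guess, not from the slides).
    """
    pr1 = count_distribution(1, n_agents)
    pr0 = count_distribution(0, n_agents)
    top = n_agents - 1
    # Signal 1: pr1[top] * b - pr1[0] * a = margin
    # Signal 0: pr0[0] * a - pr0[top] * b = margin
    det = pr0[0] * pr1[top] - pr1[0] * pr0[top]
    a = margin * (pr1[top] + pr0[top]) / det   # tau(0, 0)
    b = margin * (pr0[0] + pr1[0]) / det       # tau(1, N-1)
    return a, b


if __name__ == "__main__":
    pr1 = count_distribution(1, 4)
    pr0 = count_distribution(0, 4)
    for n, (p1, p0) in enumerate(zip(pr1, pr0)):
        print(f"n={n}  Pr[n|1]={p1:.4f}  Pr[n|0]={p0:.4f}  ratio={p1 / p0:.3f}")
    a, b = minimal_payments(2)
    print(f"N=2: tau(0,0)={a:.3f}  tau(1,1)={b:.3f}")
```

Under these assumptions the ratio Pr[n|1]/Pr[n|0] increases with n, consistent with the lemma, and the N = 2 payments come out close to the $2.62 and $1.54 quoted to Alice.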
Example continued
- Optimal payment scheme:
- Reporter is encouraged to "even out" the 0 distribution, but the prior compensates
- This gives the incentive for one person to switch when everyone else is reporting the same
- Implicit collusion resistance to symmetric strategies

Extension: Partial Collusion, Asymmetric Strategies, Nontransferable Utility
- Theorem: When more than half of the agents collude, no incentive-compatible payment mechanism can make truth-telling a dominant strategy for the colluders
- Cost of payments rises exponentially as the coalition fraction increases

Extension: Partial Collusion, Asymmetric Strategies, Transferable Utility
- Note that the normalized cost rises much faster than before when participants can make side payments

Summary of Extensions
- Some conditions lead to MILPs, which are harder to solve
- Unique vs. Pareto-optimal NE
  - The latter is much cheaper
- Partial collusion: payment cost increases dramatically beyond a threshold of colluders

Improvements
- Extension to original peer prediction mechanism with automated mechanism design
- Dynamically generated payments, so rules don't have to be in closed form
- Expected payment from honest reporting better than lying by some guaranteed threshold
- Different conditions can generate Unique, Pareto-optimal, or even Dominant NE, with correspondingly different costs

Drawbacks
- Common prior still required for BNE
- Report space is discrete (binary, in fact)
- Sequential nature of report submission is not considered
- Need at least a certain size group
- Weird budget results if center has different prior from users
- Not necessarily incentivizing players to spend effort to uncover information
  - Why not just invent a report?

Discussion