Game Theoretical Modeling and Studies of Peer-Reviewing Methods
Marius C. Silaghi
Florida Tech CS Seminar, Fall 2012

Peer-Reviewing Games
• Game Theory Concepts
• Peer-Reviewing Background
• Model Peer-Reviewing as Game
• Theoretical analysis of PR Games
• Experimental analysis of PR Games

Game Theory
• Is not so much about computer games.
• It is about understanding motivations (utilities, $$$).

Fundamentals of Game Theory
• Understanding what will happen in a given situation
• Typical example: “Prisoner’s Dilemma”
• Each can rat out the other or remain silent, resulting in 4 possible outcomes

Golden Balls
• Trust
• Golden Balls strategies

Fundamentals of Game Theory
• Understanding what will happen in a given situation
• Typical example: “Prisoner’s Dilemma”
• Perfectly rational players only care for their own felicity (utility)
• Payoff matrix (each cell: utility of x, utility of y)

                 y defects    y cooperates
  x defects      -5, -5       0, -20
  x cooperates   -20, 0       -1, -1

• Mechanism design is about selecting the right payoffs to encourage a “social choice” function: W. Vickrey received the Nobel Prize in 1996 for 2nd Price Auctions, which encourage “truthful bidding”.

Iterated Prisoner’s Dilemma
• The game repeats every 20 years for 1000 years
• Strategies:
  o Tit for Tat
  o Forgiving Tit for Tat
  o Optimistic Tit for Tat
• Strategy equilibrium studies try to predict behavior in existing games by theoretically or experimentally analyzing the impact of strategies on a player’s utilities.
• A player’s utilities define its type.
• A player will select the best strategy given the strategies currently used by the other participants.
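The idea that a player picks the best strategy given what the others currently play can be checked mechanically on the Prisoner’s Dilemma payoffs from the slide. The following is a minimal sketch, not part of the original talk; the names `PAYOFF`, `best_responses_x`, `best_responses_y` and `pure_nash_equilibria` are hypothetical helpers introduced only for illustration.

```python
from itertools import product

ACTIONS = ("defect", "cooperate")

# Payoffs copied from the slide: (utility of x, utility of y),
# indexed by (action of x, action of y).
PAYOFF = {
    ("defect", "defect"): (-5, -5),
    ("defect", "cooperate"): (0, -20),
    ("cooperate", "defect"): (-20, 0),
    ("cooperate", "cooperate"): (-1, -1),
}


def best_responses_x(y_action):
    """Actions of x that maximize x's utility when y plays y_action."""
    best = max(PAYOFF[(a, y_action)][0] for a in ACTIONS)
    return {a for a in ACTIONS if PAYOFF[(a, y_action)][0] == best}


def best_responses_y(x_action):
    """Actions of y that maximize y's utility when x plays x_action."""
    best = max(PAYOFF[(x_action, a)][1] for a in ACTIONS)
    return {a for a in ACTIONS if PAYOFF[(x_action, a)][1] == best}


def pure_nash_equilibria():
    """Action pairs where each player already plays a best response."""
    return [
        (x, y)
        for x, y in product(ACTIONS, ACTIONS)
        if x in best_responses_x(y) and y in best_responses_y(x)
    ]


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

With these payoffs, defecting is the best response to either action of the opponent, so mutual defection is the only pair from which neither player wants to deviate, even though mutual cooperation would leave both better off.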
Rational Players (… = predictable)
• Rational players are ones that predictably try to maximize their utility.
• Utility can be expressed in $$$.
• Can most people be assumed rational? (if given enough time and help to think)
• One has to take into account the utilities as defined by the beliefs (type) of the given player.

Rational Players (… = predictable)
• Are avaricious/epicurean/workaholic people rational?
• One has to take into account the utilities as defined by the beliefs (type) of the given player.
  o obviously, avaricious people believe in the value of the dollar.
• If one is alive, one likely has (subjective) beliefs.
• Can they be manipulated? Yes, since they are predictable: promise ’em what they want.

Rational Players (… = predictable)
• Are seekers of fame rational?
[Image: Tiberius]
• One has to take into account the utilities as defined by the beliefs (type) of the given player.
  o a player that believes in immortality through fame will value fame (quantifiable in money).
• The Temple of Artemis in Ephesus was burned in July 356 BC by Herostratus, who wished that “his name be spread to the whole Earth.”
• Can they be manipulated? To dissuade copycats, the Ephesians ruled that his name should never be pronounced.

Rational Players (… = predictable)
• Is a “fanatical altruist” rational?
• One has to take into account the utilities as defined by the beliefs (type) of the given player.
  o an altruist will be happier if he believes that “others” (family, country, humanity, animals) are happier.
• Can he be manipulated? Coax him by claiming that “others” love to be bombed!

Rational Players (… = predictable)
• Is a religious person rational (maximizing a utility)?
• He (concurring with cynics) likes to claim he is not looking for $ (i.e., is not rational).
• One has to take into account the utilities as defined by the beliefs (type) of the given player.
  o for a player that has a degree of belief in an afterlife, his utilities are a function of that religion (believed mechanisms to reach the afterlife).
• Can he be manipulated? The devil lies in the details: claim that the correct interpretation of ‘A’ is ‘D’.

Game Theory
• It is about understanding motivations (utilities).
• Used in real war
• Used in designing how an enterprise/country works
• Used commonly in macro-economy
• Can be used for peer-reviewing
• For computer science:
  o Used in Multi-agent Systems
  o It is a computational problem (requires simulations, models, etc.)
[Image: Motivation Machine]

Game Theory: used in real war
• Why do soldiers in the enemy’s army obey their commanders?
  o Could one offer them something such that they defect?
  o Could one make them think that they (or their country) are better off capitulating? (whether that is true or false)
• Why does the president of the enemy country fight for that system?
  o Could one blackmail/bribe/menace/convince/confuse him into quitting?
• Most research and funding for game theory seems to be here.

Game Theory: used in designing a country
• Niccolò Machiavelli, 1469-1527
• Italian ex-politician of the Republic of Florence
• Discourses on Livy
• Il Principe

Game Theory: used in designing a country
• People obey laws (for fear of the police).
  o besides brainwashing in schools/media
• Police obey (for fear of the army).
• Army obeys (for fear of the secret services).
• Secret services obey (for high pay, or fear of another secret service).
  o checks and balances
  o good fences make good neighbors
• Poorly designed countries (with the incentives missing/disappearing for some ring in the chain) have been seen collapsing in spectacular manners:
  o Roman Empire (motivation of soldiers?)
  o USSR (motivation of KGB?)
  o Yugoslavia (brainwashing failure, motivation of states in confederation?)
[Figure labels: 15; 9 countries (11 states)]

Game Theory: used commonly in macro-economy
• What combination of fees/taxes/subsidies would lead to a strong economy?
  o (where resources end up in the hands of those who know/can make the most out of them, etc.)
• E.g., granting subsidies only to people owning over 50 ha concentrates farming land in the hands of those who have money for machines and technology.

Game Theory: can be used for peer-reviewing
• Peer-reviewing is the foundation of modern scientific research and
  o controls the speed of the development and
  o drives significant decisions on allocation of funding.

Index
• Game Theory Concepts
• Peer-Reviewing Background
• Model of Peer-Reviewing as Game
• Theoretical analysis of PR Games
• Experimental analysis of PR Games
• Conclusions

Peer-Review
[Cartoon caption: Most scientists regarded the new streamlined peer-review process as ‘quite an improvement.’ Source: scienceforseo.com]
• Features of PR mechanisms keep getting richer to improve and encourage research quality.
  o Blind reviewing,
  o Author's reply to comments by reviewers,
  o Reviewers bid for papers,
  o Authors rate reviewers,
  o Authors blacklist reviewers

Common Blind Peer-Reviewing for Conferences
1. Chair assigns each paper to a Senior PC (SPC) member
2. SPC distributes a paper to 3 PC members (bidding)
3. PC member gives a paper to a reviewer student
4. Each student reviews and assigns a score.
5. Author sees reviews and answers
6. Student/PC member may change the review
7. PC member forwards the review to the SPC
8. SPC gathers 3 reviews: rejects if any is a reject
9. Chair makes last changes:
10. Applies threshold (1/10 papers)
11. Answers complaints
[Figure: Chair, Senior Program Committee, Program Committee; example scores: Accept 7, Reject 3, Accept 5]

Open Peer-Review (e.g. Material Thinking Design Workshop, 2007)
• Mainly with journals (biology)
• Reviewers bid on papers.
• Papers are distributed to 3 reviewers.
• Each reviewer writes a short article with the review of the paper.
• Authors see reviews and:
  o can withdraw the paper, or
  o may write for each review a short article with an answer.
• Reviewers see answers and can withdraw their reviews.
• Papers are published together with reviews and answers.
• Papers with negative reviews (not withdrawn) are published as technical reports together with the reviews.
[Figure: proceedings containing the main article, review articles, and review answer articles]

Open Peer-Review: papers without reviews. What to do?
Understanding facts/possible motivations/conclusions
1. Nobody accepted to review
  • likely not relevant to the community, or
  • reluctance to write negative reviews
  • but may also be a boycott
  • fair: tech report for an archiving fee
2. All reviews are withdrawn / not submitted in time:
  • (paper’s fault) could be irrelevant or of poor quality
  • (reviewer’s fault) overcommitted, malicious strategy
  • assigned reviewer names should be published
• Accepted or rejected? A 3rd category!
• The number of non-reviewed reports is a measure of the quality of the symposium/community, usable for deciding whether to submit similar articles in the future.
[Figure: proceedings containing the main article, with review articles and review answer articles marked “?”]

Index
• Game Theory Concepts
• Peer-Reviewing Background
• Model of Peer-Reviewing as Game
• Theoretical analysis of PR Games
• Experimental analysis of PR Games
• Conclusions

Players
• Game players:
  o The researchers: authors and reviewers, in a repeated/iterated game at each conference
  o Funding agency: rewards researchers based on their publications; mechanism designer, or player in a game with the researchers
(next slides with help of R. Vishen)

Concepts
• Model paper quality by a paper's worth: its utility to society.
• Worth is evaluated by expert reviewers.
• Assumption: all reviewers in the symposium are equally expert. Reviewers have the same type (association paper → worth):
$$t : \text{papers} \to \mathbb{R}$$

Assumption of equally expert reviewers
• Note: we fail to model people emotionally attached to antagonistic scientific beliefs in a community:
  o scientists believing in climate warming vs. unbelievers
  o scientists believing (or not) in the relevance of a given metric:
    - is/isn’t privacy more important than verifiability (in voting)?
    - is network logic runtime more relevant than real runtime?
• Scientists with a given emotional belief should probably create their own communities/conferences.

Model
• Authors and reviewers expect rewards from a funding agency.
• Assumption: the funding agency intends to maximize social value.
• Social value is defined as the “sum of quality of the endorsed papers.”
• An article is endorsed if it is published with favorable reviews by experts.
• Given a set s of texts appearing in a community of type t, the social choice function is:
$$f(t, s):\quad \forall p \in s,\quad t(p) > 0 \Rightarrow \text{endorsed}(p),\qquad t(p) \le 0 \Rightarrow \neg\,\text{endorsed}(p)$$
• Maximizing the total utility:
$$\sum_{p \in \{p \,\mid\, \text{endorsed}(p)\}} t(p)$$

Publication venues and social value
• Conferences have multiple venues:
  o orally presented papers
  o posters
  o technical reports
• Venues give a way to automate the accounting of the paper worth, via its impact on the visibility (number of citations):
  o Technical reports are less endorsed than posters or orally presented papers.
  o The social value given a set of publication venues Ψ (posters, regular papers, etc.) consists of the weighted sum of the worth of the published papers (assumed measurable via citation influence):
$$\sum_{\psi \in \Psi} w_\psi \sum_{p \in \psi} t(p)$$

Utilities (Motivation)
Let us convert it all to:
• Funding agency settings for distributing funds (rewards):
  o Citation Influence,
  o Publications count.
• Assumption: the funding agency cannot access the worth of a paper directly.
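The weighted-sum definition of social value can be made concrete with a few lines of code. This is only an illustrative sketch: the weights are the ones quoted later in the talk for the experiments (0.5, 0.3, 0.1), their mapping to the venue names below is an assumption, and `social_value` and the sample papers are hypothetical.

```python
# Assumed mapping of the experiment weights w_o, w_p, w_t to venue names.
WEIGHTS = {"regular": 0.5, "poster": 0.3, "technical report": 0.1}


def social_value(papers, weights=WEIGHTS):
    """Weighted sum of paper worth over all venues.

    `papers` is an iterable of (venue, worth) pairs, one per published paper.
    """
    return sum(weights[venue] * worth for venue, worth in papers)


if __name__ == "__main__":
    # Hypothetical proceedings: a strong regular paper, a weak poster,
    # and a decent paper that only made it to a technical report.
    proceedings = [("regular", 10.0), ("poster", -2.0), ("technical report", 4.0)]
    print(social_value(proceedings))
```

The example shows why venue assignment matters to the funding agency: the same paper of worth 4 contributes 2.0 as a regular paper but only 0.4 as a technical report.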
Citation Influence
• The citation influence (CI) of an author at a given moment is a metric of the influence of his publications. It estimates the weighted sum of the worth of his publications in each of the three venues Ψ = {regular, poster, technical report}:
$$CI(\text{author}) = \sum_{\psi \in \Psi} w_\psi \sum_{p \in \psi(\text{author})} t(p)$$

Utilities (Motivation)
Let us convert it all to:
• A researcher gets reputation (positive utility) when papers are cited.
  o often one cannot automatically distinguish good vs. bad citations.
• A researcher can get bad reputation (negative utility):
  o for publishing erroneous articles (as pointed out by citations/reviews), or
  o if his/her review is proven to be incorrect (only with Open PR).
[Image: E=mc2]

Utilities (Motivation)
Let us convert it all to:
• What about reviewing (what is the motivation)?
• Being asked to review is like a citation (proof of reputation).
• To be asked again, one has to review promptly when requested.
• But why write good reviews rather than random ones?
  o Fear of “authors scoring reviewers” (not typically valuable: tit for tat)
  o The SPC cannot generally notice poor reviews.
  o (when noticed, there is no mechanism to disseminate it)

Utilities (Motivation)
Let us convert it all to:
• Deviation from truthful reviewing may pay.
  o Reviewing takes time (writing random reviews saves time)
  o Conferences with thresholds are zero-sum games.
  o A paper “p” is superseded by a newer paper “n” when the new paper “n” steals the show from “p” (reduces future citations of p).
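To see why superseding gives a reviewer something to lose, one can compute an author's CI with and without the superseded papers. This is a hedged sketch only: the `Paper` record, the `superseded` flag, the function name, and the venue-to-weight mapping (taken from the experiment weights 0.5, 0.3, 0.1 quoted later) are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed mapping of the experiment weights to the three venues.
VENUE_WEIGHTS = {"regular": 0.5, "poster": 0.3, "technical report": 0.1}


@dataclass
class Paper:
    author: str
    venue: str
    worth: float
    superseded: bool = False  # True once a newer paper "steals the show"


def citation_influence(author, papers, count_superseded=True):
    """Weighted sum of the worth of an author's papers.

    With count_superseded=False, superseded papers no longer contribute,
    which is the variant that models superseding.
    """
    return sum(
        VENUE_WEIGHTS[p.venue] * p.worth
        for p in papers
        if p.author == author and (count_superseded or not p.superseded)
    )


if __name__ == "__main__":
    papers = [
        Paper("a", "regular", 8.0),
        Paper("a", "poster", 5.0, superseded=True),
        Paper("b", "regular", 9.0),
    ]
    print(citation_influence("a", papers))
    print(citation_influence("a", papers, count_superseded=False))
```

In this toy data, author "a" loses the contribution of the superseded poster, which is exactly the loss a reviewer could try to avoid by rejecting the superseding submission.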
Citation Influence with superseding
• The citation influence (CI) of an author at a given moment is a metric of the influence of his publications, and is given by the weighted sum of the worth of his un-superseded publications in each of the three venues Ψ = {regular, poster, technical report}:
$$CI(\text{author}) = \sum_{\psi \in \Psi} w_\psi \sum_{p \in \text{unsuperseded}(\psi(\text{author}))} t(p)$$

Index
• Game Theory Concepts
• Peer-Reviewing Background
• Model of Peer-Reviewing as Game
• Theoretical analysis of PR Games (#-based)
• Experimental analysis of PR Games (CI-based)
• Conclusions

Funding Based on Counting articles (Trusted Peer-Reviewing)
• Rewards author i based on the number and venue of publications:
$$R_i = n_o^i \cdot w_o + n_p^i \cdot w_p + n_t^i \cdot w_t$$
• Paper superseding is not relevant.
• Conferences:
  o with a threshold on the paper acceptance rate (1/10, 1/20)
  o without thresholds on paper acceptance

Funding Based on Counting articles (Trusted Peer-Reviewing)
No thresholds on paper acceptance rates
• For a two-player conference, with one submission each and {accept, reject} decisions, the payoff matrix is:

                 accept x’s   reject x’s
  accept y’s     1, 1         0, 1
  reject y’s     1, 0         0, 0

• One-shot game: best strategy is “make random decision”.
• Iterated game: effective strategy is “(forgiving) Tit-for-Tat”.

Funding Based on Counting articles (Trusted Peer-Reviewing)
Paper Acceptance Thresholds
• In order to remain relevant to the funding agency, the conference puts a threshold on the ratio of accepted papers (CBR → CBRz).
• In this new version the actions available to players are not {accept, reject}, but the scores {low, high}.
• In case of a tie, a paper is randomly selected.
• 2-players case: zero-sum game.

                 high x’s     low x’s
  high y’s       0.5, 0.5     0, 1
  low y’s        1, 0         0.5, 0.5

(Trusted Peer-Reviewing) Multiple players
Paper Acceptance Thresholds
• n-players case (accepting n/k articles, equally worthy submissions):
• Zero-sum game.
• Not a pair-wise zero-sum game.
• Utility of rejecting a paper (one less competitor):
$$\frac{n/k}{n-1} - \frac{n/k}{n} = \frac{1}{k(n-1)}$$

                 high x’s                 low x’s
  high y’s       0, 0                     0, 1/(k(n-1))
  low y’s        1/(k(n-1)), 0            1/(k(n-1)), 1/(k(n-1))

• For small CBR communities, a dominant strategy is “always low”.
• For huge CBR communities, a dominant strategy tends to “random review”.
• Always “low” is a Nash equilibrium (observed in some small communities)!
• With Tit-for-Tat opponents, the Nash equilibrium is “always low” if the opponent is not met again for (n-1) rounds.
• With Hits-for-Tat opponents, the Nash equilibrium is “always high”.
  o Hits-for-Tat: an opponent can strike back many (m >> 1) times for one Tat
  o expected penalty:
$$\frac{m}{k(n-1)}$$

Truthful reviewing for TPR: Conclusion
• With counting of articles: truthful reviewing is never in equilibrium (under the working assumptions: mainly that nobody notices how you review).

Index
• Game Theory Concepts
• Peer-Reviewing Background
• Model of Peer-Reviewing as Game
• Theoretical analysis of PR Games (#-based)
• Experimental analysis of PR Games (CI-based)
• Conclusions

Experimental studies: Simulations
• Assumption: funding based on CI
• Compared mechanisms:
  o Open Peer-Review (SelectivitY, SY)
  o Common Blind Review (CBR)
  o Common Blind Review with paper acceptance threshold (CBRz)

Evaluation - Simulation Experiment
• Generate 100 random research communities (i.e. simulations)
  o 20 researchers
  o 20 conferences (i.e. 20 years)
• All participants are considered equally expert and inventive.
• Researchers get ideas for articles with a uniform distribution at an average of (only) 2 articles per year, and a worth that is uniformly random in [-10, 10].
• Each paper is superseded each year with a probability of 1/5.
• Experiments use the weights wo = 0.5, wp = 0.3, and wt = 0.1.

Evaluation - Reviewer Types
• Compared review strategies:
  i. Truthful reviewing.
  ii. Truthful reviewing except for papers superseding one’s work, which are rejected.
  iii. Random reviewing except for papers superseding one’s work, which are rejected.
  iv. Giving the opposite possible score to all papers (reject good papers and accept poor papers).
  v. Giving the lowest possible score to all papers.
(Tit-for-Tat was not explored here: assumed to have limited relevance)

Evaluation - Experiment
Assumptions about worth of reviews
• A misclassifying comment has a negative worth.
  o if published, it will be accounted only as bad reputation for the reviewer.
• Worth of a misclassifying comment: the difference between the corresponding value and the real worth of the paper (always negative).
• The worth of an author’s answer to a negatively misclassifying comment is the same as the “absolute value of the worth of the misclassifying comment”. Otherwise, the answer has worth zero.

Evaluation - Experiment Cases
Combinations of strategies tested for equilibria
  a. All reviewers review truthfully.
  b. All reviewers review truthfully, except for one reviewer who rejects articles superseding his work but reviews truthfully submissions not superseding his work.
  c. All reviewers review truthfully, except for one reviewer who rejects articles superseding his work and reviews randomly submissions not superseding his work.
  d. All reviewers review truthfully submissions not superseding their work and reject the other submissions.
  e. All reviewers review randomly submissions not superseding their work and reject the other submissions.

Evaluation - Experiment Results
• For both reviewing mechanisms the goal of the funding agency (social value) is maximized with truthful reviewing, case (a).
• It is reduced in the other cases.
• In SY with cases (b)-(e), even if all worthy papers are published, the total worth is reduced compared to case (a) [remember, technical reports have less weight].

Evaluation - Experiment Results for equilibria with CI
• To evaluate the equilibrium of truthful reviewing, researcher 1 performs non-truthful reviews.
• The experiments show the extent of the implications of the use of different strategies with CBR and SY.
• This confirms that truthful reviewing is not in Nash equilibrium using CBR, but it is in Nash equilibrium when SY is used, under the given assumptions and strategies.

Experiments with funding based on counting
• Settings:
  o 100 researchers
  o each paper reviewed by 4 people (assumed truthful except for 1)
  o 1/4 of the papers are selected for publication
  o 500,000 randomized simulation runs
• Truthful reviewing always led to fewer benefits for the reviewer:
  o with strategy iv (inverting), the reviewer gains 11.19% more publications
  o with strategy v (always low), the reviewer gains 15% more publications

Conclusions
• We gave an example of how to analyze peer-reviewing mechanisms.
• Introduced “Peer-Reviewing Games”, an abstraction of real peer-reviewing processes:
  o sufficiently complex to capture interesting trade-offs
  o sufficiently simple to enable some theoretical and experimental analysis
• Proved that truthful reviewing is not in Nash equilibrium for Common Blind Review under the given assumptions.
• Proved that truthful reviewing is in Nash equilibrium for the simplified Open Peer-Review SY under the studied assumptions and strategies.
• For OPR with a threshold on the acceptance rate, Tit-for-Tat is a rational strategy.
• For CBR, a rational strategy given the assumptions is: reject superseding papers, random review for the others.

Next?
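Sketch: the counting-based experiment as a toy Monte Carlo simulation.
The slides give only the settings (100 researchers, 4 reviewers per paper, 1/4 accepted), so everything else here is an assumption made for illustration: one paper per researcher, worth uniform in [-10, 10], truthful reviewers reporting the true worth, the paper score being the mean of its reviews, and researcher 0 as the only deviating reviewer. The function name `acceptance_rate` and the strategy labels are hypothetical, and the sketch is not expected to reproduce the percentages reported in the talk.

```python
import random


def acceptance_rate(strategy, runs=500, n=100, reviewers_per_paper=4,
                    accept_fraction=0.25, seed=0):
    """Fraction of runs in which researcher 0's own paper is accepted.

    strategy: how researcher 0 reviews ("truthful", "always_low", "invert").
    All other reviewers are assumed truthful.
    """
    rng = random.Random(seed)
    slots = int(n * accept_fraction)
    accepted = 0
    for _ in range(runs):
        # Assumption: one paper per researcher, worth uniform in [-10, 10].
        worth = [rng.uniform(-10, 10) for _ in range(n)]
        scores = []
        for paper in range(n):
            # Nobody reviews their own paper.
            candidates = [r for r in range(n) if r != paper]
            panel = rng.sample(candidates, reviewers_per_paper)
            marks = []
            for reviewer in panel:
                if reviewer == 0 and strategy == "always_low":
                    marks.append(-10.0)          # lowest possible score
                elif reviewer == 0 and strategy == "invert":
                    marks.append(-worth[paper])  # opposite of the true worth
                else:
                    marks.append(worth[paper])   # truthful review
            scores.append(sum(marks) / len(marks))
        # Researcher 0 is accepted if fewer than `slots` papers score higher.
        better = sum(1 for j in range(1, n) if scores[j] > scores[0])
        if better < slots:
            accepted += 1
    return accepted / runs


if __name__ == "__main__":
    for name in ("truthful", "always_low", "invert"):
        print(name, acceptance_rate(name))
```

Because the same seed produces the same papers and panels for every strategy, the runs are directly comparable: under "always_low" researcher 0 can only lower the scores of competitors, so his own paper is accepted at least as often as under truthful reviewing.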