Game Theoretical Modeling and Studies of Peer-Reviewing Methods Marius C. Silaghi

Florida Tech CS Seminar, Fall 2012
Peer-Reviewing Games
• Game Theory Concepts
• Peer-Reviewing Background
• Model Peer-Reviewing as Game
• Theoretical analysis of PR Games
• Experimental analysis of PR Games
Game Theory
Is not so much about computer games.
Game Theory
It is about understanding motivations (utilities $$$).
Fundamentals of Game Theory
Understanding what will happen in a given situation
Typical example: “Prisoner’s Dilemma”
Each can rat out the other or remain
silent, resulting in 4 possible outcomes
Golden Balls Trust
Golden Balls strategies
Fundamentals of Game Theory
Understanding what will happen in a given situation
Typical example: “Prisoner’s Dilemma”
Perfectly rational players
only care for their own felicity (utility)
Payoff matrix (each cell: utility of x, utility of y)
                 y defects    y cooperates
x defects        -5, -5       0, -20
x cooperates     -20, 0       -1, -1
Mechanism design is about selecting the right payoffs to encourage a “social choice” function:
W. Vickrey received the Nobel Prize in 1996; his 2nd Price Auctions encourage “truthful bidding”.
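The Prisoner's Dilemma payoffs above can be checked mechanically. The following is a minimal sketch, not part of the original talk: it encodes the payoff matrix and searches for pure-strategy Nash equilibria by testing mutual best responses. The function names are illustrative assumptions.

```python
# Payoffs from the slide, indexed by (x's action, y's action).
# Each value is (utility of x, utility of y).
PAYOFF = {
    ("defect", "defect"): (-5, -5),
    ("defect", "cooperate"): (0, -20),
    ("cooperate", "defect"): (-20, 0),
    ("cooperate", "cooperate"): (-1, -1),
}
ACTIONS = ("defect", "cooperate")


def best_response_x(y_action):
    """Action maximizing x's utility when y's action is fixed."""
    return max(ACTIONS, key=lambda a: PAYOFF[(a, y_action)][0])


def best_response_y(x_action):
    """Action maximizing y's utility when x's action is fixed."""
    return max(ACTIONS, key=lambda a: PAYOFF[(x_action, a)][1])


def pure_nash_equilibria():
    """Profiles where each action is a best response to the other."""
    return [
        (x, y)
        for x in ACTIONS
        for y in ACTIONS
        if best_response_x(y) == x and best_response_y(x) == y
    ]


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

With these payoffs, defecting is the best response to either action of the opponent, so mutual defection is the only equilibrium even though mutual cooperation would leave both players better off.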
Iterated Prisoner’s Dilemma
The game repeats every 20 years for 1000 years 
Strategies:
Tit for Tat
Forgiving Tit for Tat
Optimistic Tit for Tat
Strategy Equilibrium studies try to predict behavior in existing games by theoretically or
experimentally analyzing the impact of strategies on a player’s utilities.
A player’s utilities define its type.
A player will select the best strategy given the strategies currently used by other participants.
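A small simulation makes the iterated game concrete. This is a sketch under stated assumptions: it reuses the one-shot payoffs from the Prisoner's Dilemma slide, takes 50 rounds (one every 20 years for 1000 years), and implements only plain Tit for Tat plus an always-defect baseline. The forgiving and optimistic variants are not defined in the slides, so they are left out.

```python
# One-shot payoffs: (x's action, y's action) -> (utility of x, utility of y).
# "C" = cooperate, "D" = defect.
PAYOFF = {
    ("D", "D"): (-5, -5),
    ("D", "C"): (0, -20),
    ("C", "D"): (-20, 0),
    ("C", "C"): (-1, -1),
}


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the opponent's previous action."""
    return other_history[-1] if other_history else "C"


def always_defect(own_history, other_history):
    """Baseline strategy that ignores the history."""
    return "D"


def play(strategy_x, strategy_y, rounds=50):
    """Play the iterated game and return the total utilities (x, y)."""
    history_x, history_y = [], []
    total_x = total_y = 0
    for _ in range(rounds):
        action_x = strategy_x(history_x, history_y)
        action_y = strategy_y(history_y, history_x)
        payoff_x, payoff_y = PAYOFF[(action_x, action_y)]
        total_x += payoff_x
        total_y += payoff_y
        history_x.append(action_x)
        history_y.append(action_y)
    return total_x, total_y


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat))
    print(play(tit_for_tat, always_defect))
```

Two Tit for Tat players cooperate in every round, while against an always-defecting opponent Tit for Tat is exploited only in the first round.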
Rational Players
(… = predictable)
Rational players are ones that predictably try to maximize their utility.
Utility can be expressible in $$$.
Can most people be assumed rational?
(if given enough time and help to think)
One has to take into account the utilities as defined by the beliefs (type) of the given player
Rational Players
(… = predictable)
Are avaricious/epicurean/workaholic people rational?
One has to take into account the utilities as defined by the beliefs (type) of the given player.
- obviously, avaricious people believe in the value of the dollar.
If one is alive, one likely has (subjective) beliefs.
Can they be manipulated?
Yes, since they are predictable: promise’em what they want.
Rational Players
(… = predictable)
Are seekers of fame rational?
Tiberius
One has to take into account the utilities as defined by the beliefs (type) of the given player.
- a player that believes in immortality through fame will value fame (quantifiable in money).
The Temple of Artemis in Ephesus was burned in July 356 BC by
Herostratus who wished that:
”His name be spread to the whole Earth.”
Can they be manipulated?
To dissuade copycats, Ephesians ruled that his name should never be pronounced.
Rational Players
(… = predictable)
Is a “fanatical altruist” rational?
One has to take into account the utilities as defined by the beliefs (type) of the given player.
- an altruist will be happier if he believes that “others” (family, country, humanity, animals) are happier.
Can he be manipulated?
Coax him by claiming that “others” love to be bombed!
Rational Players
(… = predictable)
Is a religious person rational (maximizing a utility)?
He (concurring with cynics) likes to claim he is not looking for $ / (i.e. rational).
One has to take into account the utilities as defined by the beliefs (type) of the given player.
- for a player that has a degree of belief in afterlife,
his utilities are a function of that religion (believed mechanisms to reach afterlife).
Can he be manipulated?
Devil lies in details: claim that the correct interpretation of ‘A’ is ‘D’.
Game Theory
It is about understanding motivations (utilities).
Used in real war
Used in designing how an enterprise/country works
Used commonly in macro-economy
Can be used for peer-reviewing
For computer science:
• Used in Multi-agent Systems
• It is a computational problem (requires simulations, models, etc.)
Motivation Machine
Game Theory
Used in real war
Why do soldiers obey their commanders in the army of the enemy?
• Could one offer them something such that they defect?
• Could one make them think that they (or their country) are better off capitulating?
• (whether that is true or false)
Why does the president of the enemy country fight for that system?
• Could one blackmail/bribe/menace/convince/confuse him into quitting?
Most research and funding for game theory seems to be here.
Game Theory
Used in designing a country
Niccolò Machiavelli, 1469-1527
Italian ex-politician of the Republic of Florence
Discourses on Livy
Il Principe
Game Theory
Used in designing a country
People obey laws (for fear of police).
• besides brainwashing in schools/media
Police obeys (for fear of army).
Army obeys (for fear of secret services).
Secret services obey (for high pay, or fear of another secret service).
• checks and balances 
• good fences make good neighbors
Poorly designed countries (with the incentives missing/disappearing for some
ring in the chain), have been seen collapsing in spectacular manners:
Roman Empire (motivation of soldiers?)
USSR (motivation of KGB?)
Yugoslavia (brainwashing failure, motivation of states in confederation?)
15
9 countries (11 states)
Game Theory
Used commonly in macro-economy
What combination of fees/taxes/subsidies would lead to a strong economy?
• (where resources end up in the hands of those who know how to make the most of them, etc.)
E.g.: subsidies paid only to people owning over 50 ha
concentrate farming land in the hands of those who have money for machines and technology.
Game Theory
Can be used for peer-reviewing:
Peer-reviewing is the foundation of modern scientific research and
• controls the speed of the development and
• significant decisions on allocation of funding.
Index
• Game Theory Concepts
• Peer-Reviewing Background
• Model of Peer-Reviewing as Game
• Theoretical analysis of PR Games
• Experimental analysis of PR Games
• Conclusions
Peer-Review
Most scientists regarded the new streamlined
peer-review process as ‘quite an improvement.’
scienceforseo.com
Features of PR mechanisms keep getting richer to improve and encourage research quality.
o Blind reviewing,
o Author's reply to comments by reviewers,
o Reviewers bid for papers,
o Authors rate reviewers,
o Authors blacklist reviewers
Common Blind Peer-Reviewing
for Conferences
1. Chair assigns each paper to a Senior PC (SPC) member
2. SPC distributes a paper to 3 PC members (bidding)
3. PC member gives a paper to a reviewer student
4. Each student reviews and assigns a score.
5. Author sees reviews and answers
6. Student/PC may change review
7. PC forwards review to SPC
8. SPC gathers 3 reviews: rejects if any reject
9. Chair makes last changes:
10. Applies threshold (1/10 papers)
11. Answers complaints
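The decision steps of this pipeline (steps 8 and 10) can be sketched in a few lines. This is an illustration under assumptions that the slides do not state: reviews are numeric scores, a score below a hypothetical `reject_below` cutoff counts as a "reject", and the chair ranks surviving papers by mean score.

```python
from statistics import mean


def spc_accepts(review_scores, reject_below=5):
    """Step 8: the SPC rejects a paper if any single review is a reject.

    Assumption: a review counts as a reject when its score is below
    `reject_below`.
    """
    return all(score >= reject_below for score in review_scores)


def chair_selects(papers, one_in=10):
    """Step 10: keep at most 1 paper out of every `one_in` submissions.

    `papers` maps a paper id to its list of review scores. Papers that
    survive the SPC are ranked by mean score (an assumed tie-free ranking).
    """
    survivors = {
        paper: mean(scores)
        for paper, scores in papers.items()
        if spc_accepts(scores)
    }
    quota = len(papers) // one_in
    ranked = sorted(survivors, key=survivors.get, reverse=True)
    return ranked[:quota]


if __name__ == "__main__":
    submissions = {f"p{i}": [i, i, i] for i in range(1, 11)}
    # One low score is enough to reject an otherwise top-rated paper.
    submissions["p10"] = [10, 10, 3]
    print(chair_selects(submissions))
```

The example shows why a single reviewer matters so much in this scheme: one low score removes a paper regardless of the other two reviews.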
[Figure: hierarchy of Chair, Senior Program Committee and Program Committee, annotated with example decisions: Accept 7, Reject 3, Accept 5]
Open Peer-Review
(e.g. Material Thinking Design Workshop, 2007)
Mainly with journals (biology)
Reviewers bid on papers. Papers are distributed to 3 reviewers.
Each reviewer writes a short article with the review of the paper.
Authors see reviews and:
• can withdraw the paper, or
• may write for each review a short article with an answer.
Reviewers see answers and can withdraw their reviews.
Papers are published together with reviews and answers.
Papers with negative reviews (not withdrawn) are
published as technical reports together with the reviews.
[Figure: Proceedings containing the main article, review articles, and review answer articles]
Open Peer-Review
Papers without reviews. What to do?
Understanding facts/possible motivations/conclusions
1. nobody accepted to review
• likely not relevant to community, or
• reluctance to write negative reviews
• but may also be a boycott
→ fair: tech rep for an archiving fee
2. all reviews are withdrawn / not submitted in time:
• (paper’s fault) could be irrelevant or poor quality
• (reviewer’s fault) overcommitted, malicious strategy
• assigned reviewer names should be published
→ accepted or rejected? A 3rd category!
The number of non-reviewed reports is a measure of the quality of
the symposium/community.
→ usable for deciding whether to submit similar articles in the
future.
[Figure: Proceedings containing the main article, with review articles and review answer articles marked "?"]
Index
• Game Theory Concepts
• Peer-Reviewing Background
• Model of Peer-Reviewing as Game
• Theoretical analysis of PR Games
• Experimental analysis of PR Games
• Conclusions
Players
Game players:
o The researchers - Authors and reviewers,
  • a repeated/iterated game at each conference
o Funding Agency - rewards researchers based on their
publications.
  • mechanism designer, or
  • player in a game with the researchers
next slides with help of R. Vishen
Concepts
• Model paper quality by a paper's worth - utility to the society.
• Worth is evaluated by expert reviewers.
• Assumption: All reviewers in the symposium are equally expert.
• Reviewers have the same type (association paper → worth):
$t : \text{papers} \to \mathbb{R}$
Assumption of equally expert reviewers
Note: we fail to model people emotionally attached to antagonistic
scientific beliefs in a community:
• scientists believing in climate warming vs. unbelievers
• scientists believing (or not) in the relevance of a given metric:
• is/isn’t privacy more important than verifiability (in voting)
• is network logic runtime more relevant than real runtime?
scientists with a given emotional belief should probably create their
own communities/conferences.

Model
• Authors and Reviewers expect rewards from a funding agency.
• Assumption: funding agency intends to maximize social value
trouble
• Social value is defined as “sum of quality of the endorsed papers.”
• An article is endorsed if it is published with favorable reviews by experts.
• Given a set of texts appearing in a community of type t, the social choice
function is:
$f(t) = \{\, \forall s:\ t(s) > 0 \Rightarrow \mathrm{endorsed}(s),\ \ t(s) \le 0 \Rightarrow \neg\,\mathrm{endorsed}(s) \,\}$
Maximizing the total utility:
$\sum_{\{s \,\mid\, \mathrm{endorsed}(s)\}} t(s)$
Publication venues and social value
• Conferences have multiple venues:
• orally presented papers
• posters
• technical reports
Publication venues and social value
Venues give a way to automate the accounting of the paper worth, via
its impact on the visibility (number of citations):
o Technical reports are less endorsed than posters or orally
presented papers.
o The social value given a set of publication venues ψ (posters,
regular papers, etc.) consists of the weighted sum of the worth of
the published papers (assume measurable via citation influence):
$\sum_{\nu \in \psi} \sum_{s \in \nu} w_\nu \cdot t(s)$
Utilities (Motivation)
Let us convert it all to
• Funding agency settings for distributing funds (rewards):
o Citation Influence,
o Publications count.
Assumption: the funding agency cannot access the worth of a paper directly.
Citation Influence
o The citation influence (CI) of an author at a given moment is a
metric of the influence of his publications, and it estimates the
weighted sum of the worth of his publications in each of the three
venues Ψ, Ψ = {regular, poster, technical report}:
$CI = \sum_{\nu \in \Psi} \sum_{s \in \nu(\mathrm{author})} w_\nu \cdot t(s)$
Utilities (Motivation)
Let us convert it all to:
• A researcher gets reputation (positive utility) when papers are cited.
• often one cannot automatically distinguish good vs bad citations.
• A researcher can get bad reputation (negative utility):
• for publishing erroneous articles (as pointed out by citations/reviews), or
• if his/her review is proven to be incorrect. (only with Open PR)
E=mc2
Utilities (Motivation)
Let us convert it all to:
What about reviewing (what is the motivation?)
Being asked to review is like a citation (proof of reputation).
To be asked again one has to promptly review when requested.
But why write good reviews rather than random ones?
- Fear of “Authors scoring of reviewers” (not typically valuable: tit for tat)
- SPC cannot generally notice poor reviews.
- (when noticed, no mechanism to disseminate it)
Utilities (Motivation)
Let us convert it all to:
Deviation from truthful reviewing may pay.
• Reviewing takes time (writing random reviews earns you time)
• Conferences with thresholds are zero-sum games.
• A paper “p” is superseded by a newer paper “n” when the new paper
“n” steals the show for “p” (reduces future citations of p).
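The effect of superseding on an author's citation influence can be sketched as follows. This is an illustration, not the talk's code: CI is taken as the venue-weighted sum of paper worth, the weights are the ones used later in the experiments (0.5, 0.3, 0.1), and mapping "regular" to the orally presented weight is an assumption. The tuple layout and function name are hypothetical.

```python
# Venue weights; the values are those listed for the experiments
# (w_o = 0.5, w_p = 0.3, w_t = 0.1). Mapping "regular" to w_o is an assumption.
WEIGHTS = {"regular": 0.5, "poster": 0.3, "technical report": 0.1}


def citation_influence(papers, skip_superseded=False):
    """Venue-weighted sum of worth over one author's papers.

    `papers` is a list of (venue, worth, superseded) tuples. With
    `skip_superseded=True`, superseded papers no longer contribute.
    """
    return sum(
        WEIGHTS[venue] * worth
        for venue, worth, superseded in papers
        if not (skip_superseded and superseded)
    )


if __name__ == "__main__":
    author_papers = [
        ("regular", 8, False),
        ("poster", 4, True),  # superseded by a newer paper
        ("technical report", -2, False),
    ]
    print(citation_influence(author_papers))
    print(citation_influence(author_papers, skip_superseded=True))
```

An author whose paper is superseded loses that paper's contribution, which is what gives a reviewer a motive to reject submissions that supersede his own work.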
Citation Influence with superseding
o The citation influence (CI) of an author at a given moment is a
metric of the influence of his publications, and is given by the
weighted sum of the worth of his un-superseded publications in each
of the three venues Ψ, Ψ = {regular, poster, technical report}:
$CI = \sum_{\nu \in \Psi} \sum_{s \in \mathrm{unsuperseded}(\nu(\mathrm{author}))} w_\nu \cdot t(s)$
Index
• Game Theory Concepts
• Peer-Reviewing Background
• Model of Peer-Reviewing as Game
• Theoretical analysis of PR Games (#-based)
• Experimental analysis of PR Games (CI-based)
• Conclusions
Funding Based on Counting articles
(Trusted Peer-Reviewing)
• Rewards author i based on the number and venue of publications.
$R = n_o^i \cdot w_o + n_p^i \cdot w_p + n_t^i \cdot w_t$
• Paper superseding not relevant.

• Conferences:
• with threshold on paper acceptance rate (1/10, 1/20)
• without thresholds on paper acceptance
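As a worked instance of the counting-based reward, the sketch below evaluates the weighted count for two hypothetical authors. The default weights are the values listed for the experiments (0.5, 0.3, 0.1) and are used here only as an example; the function name is an assumption.

```python
def counting_reward(n_oral, n_poster, n_techrep, w_o=0.5, w_p=0.3, w_t=0.1):
    """Reward that depends only on how many papers appeared in each venue.

    Paper worth and superseding do not enter the formula at all.
    """
    return n_oral * w_o + n_poster * w_p + n_techrep * w_t


if __name__ == "__main__":
    # Hypothetical authors: one with few oral papers, one with many tech reports.
    print(counting_reward(2, 0, 0))
    print(counting_reward(0, 0, 12))
```

Because worth is absent from the formula, twelve technical reports outweigh two orally presented papers under these weights, whatever their quality.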
Funding Based on Counting articles
(Trusted Peer-Reviewing)
No thresholds on paper acceptance rates
• Two-players conference, with one submission each,
with {accept, reject} decisions, the payoff matrix is:
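(rows: x's decision on y's paper; columns: y's decision on x's paper; each cell: utility of x, utility of y)
                 accept x’s   reject x’s
accept y’s       1, 1         0, 1
reject y’s       1, 0         0, 0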
One-shot game: Best strategy is “make random decision”.
Iterated game: Effective strategy is “(forgiving) Tit-for-Tat”
Funding Based on Counting articles
(Trusted Peer-Reviewing)
Paper Acceptance Thresholds
• Conference in order to remain relevant to the funding agency puts a
threshold on the ratio of accepted papers. (CBR → CBRz)
• In this new version the actions available to players are not {accept,
reject}, but the scores {low, high}.
• In case of tie, a paper is randomly selected.
• 2-players case: Zero-sum game.
(rows: x's score for y's paper; columns: y's score for x's paper; each cell: utility of x, utility of y)
                 high x’s     low x’s
high y’s         0.5, 0.5     0, 1
low y’s          1, 0         0.5, 0.5
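With a threshold, the reviewer's own score now affects his own payoff. The following sketch (illustrative names, payoffs copied from the matrix above) checks two properties: the payoffs always sum to the same constant, which is what makes the game strategically zero-sum, and scoring "low" strictly dominates scoring "high" for both players.

```python
# (x's score for y's paper, y's score for x's paper) -> (u_x, u_y)
PAYOFF = {
    ("high", "high"): (0.5, 0.5),
    ("high", "low"): (0, 1),
    ("low", "high"): (1, 0),
    ("low", "low"): (0.5, 0.5),
}
SCORES = ("high", "low")


def is_constant_sum():
    """True if u_x + u_y is the same in every cell."""
    return len({u_x + u_y for u_x, u_y in PAYOFF.values()}) == 1


def x_strictly_prefers(a, b):
    """True if score `a` gives x more than score `b` whatever y does."""
    return all(PAYOFF[(a, y)][0] > PAYOFF[(b, y)][0] for y in SCORES)


def y_strictly_prefers(a, b):
    """True if score `a` gives y more than score `b` whatever x does."""
    return all(PAYOFF[(x, a)][1] > PAYOFF[(x, b)][1] for x in SCORES)


if __name__ == "__main__":
    print(is_constant_sum())
    print(x_strictly_prefers("low", "high"), y_strictly_prefers("low", "high"))
```

Both players therefore score each other "low", and the outcome is decided by the random tie-break.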
(Trusted Peer-Reviewing)
Multiple players
Paper Acceptance Thresholds
• n-players case (accepting n/k articles, equally worthy submissions):
• Zero-sum game.
• not pair-wise Zero-sum game
• Utility of rejecting a paper (one less competitor): $\frac{1}{k(n-1)} = \frac{n/k}{n-1} - \frac{n/k}{n}$
(rows: x's score for y's paper; columns: y's score for x's paper; each cell: utility of x, utility of y)
                 high x’s          low x’s
high y’s         0, 0              0, 1/k(n-1)
low y’s          1/k(n-1), 0       1/k(n-1), 1/k(n-1)
For small CBR communities, a dominant strategy is “always low”.
For huge CBR communities, a dominant strategy tends to “random review”.
Always “low” is a Nash equilibrium (observed in some small communities)!
With Tit-for-Tat opponents, Nash equilibrium is “always low”
if the opponent is not met again for (n-1) rounds.
With Hits-for-Tat opponents, Nash equilibrium is “always high”
Hits-for-Tat: an opponent can strike back many (m>>1) times for one Tat
expected penalty: $\frac{m}{k(n-1)}$
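The n-player quantities above are easy to evaluate exactly. This sketch uses Python's `fractions` module to compute the gain from removing one competitor, the difference between (n/k)/(n-1) and (n/k)/n, and the Hits-for-Tat expected penalty m/(k(n-1)). The function names and the sample values of n, k and m are illustrative assumptions.

```python
from fractions import Fraction


def acceptance_chance(n, k, competitors):
    """Chance of acceptance when n/k slots are shared by `competitors`
    equally worthy submissions."""
    return Fraction(n, k) / competitors


def gain_from_low_score(n, k):
    """Utility gained by eliminating one competitor out of n."""
    return acceptance_chance(n, k, n - 1) - acceptance_chance(n, k, n)


def hits_for_tat_penalty(n, k, m):
    """Expected penalty when the opponent strikes back m times."""
    return m * Fraction(1, k * (n - 1))


if __name__ == "__main__":
    for n in (20, 100, 1000):
        print(n, gain_from_low_score(n, k=10))
    print(hits_for_tat_penalty(20, 10, m=5))
```

The gain shrinks as the community grows, which matches the remark that "always low" pays mainly in small communities, and for any m > 1 the expected penalty exceeds the gain from a single low score.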
Truthful reviewing for TPR
Conclusion
With counting of articles: Truthful reviewing is never in equilibrium.
(under the working assumptions: mainly that nobody notices how you review).
Index
• Game Theory Concepts
• Peer-Reviewing Background
• Model of Peer-Reviewing as Game
• Theoretical analysis of PR Games (#-based)
• Experimental analysis of PR Games (CI-based)
• Conclusions
Experimental studies
Simulations
Assumption: Funding based on CI
Compared mechanisms:
Open Peer-Review (SelectivitY)
Common Blind Review (CBR)
Common Blind Review with paper acceptance threshold (CBRz)
Evaluation – Simulation Experiment
• Generate 100 random research communities (i.e. simulations)
• 20 researchers
• 20 conferences (i.e. 20 years)
• All participants are considered equally expert and inventive.
• Researchers get ideas for articles with a Uniform distribution at an
average of (only) 2 articles per year, and a worth that is uniformly
random in [-10, 10].
• Each paper is superseded each year with a probability of 1/5.
• Experiments for the weights wo = 0.5, wp = 0.3, and wt = 0.1.
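A generator for one such random community might look as follows. This is a hedged sketch, not the simulator used in the talk: the slides do not say how the number of ideas is drawn, so the code assumes a uniform draw from 0 to 4 per researcher per year (mean 2), and it assumes that superseding is checked once per year for every paper not yet superseded. All names are hypothetical.

```python
import random


def generate_community(rng, researchers=20, years=20, max_ideas=4,
                       supersede_p=0.2):
    """Return the list of papers produced by one simulated community.

    Assumptions: ideas per researcher per year are uniform on
    {0, ..., max_ideas}; each existing paper is superseded each year with
    probability `supersede_p`.
    """
    papers = []
    for year in range(years):
        # Papers from earlier years may be superseded this year.
        for paper in papers:
            if not paper["superseded"] and rng.random() < supersede_p:
                paper["superseded"] = True
        # Each researcher gets new ideas with a uniformly random worth.
        for author in range(researchers):
            for _ in range(rng.randint(0, max_ideas)):
                papers.append({
                    "author": author,
                    "year": year,
                    "worth": rng.uniform(-10, 10),
                    "superseded": False,
                })
    return papers


if __name__ == "__main__":
    # 100 independent communities, one seed each.
    communities = [generate_community(random.Random(seed)) for seed in range(100)]
    print(len(communities), len(communities[0]))
```

Passing an explicit `random.Random` instance keeps each community reproducible, which helps when comparing reviewing strategies on identical inputs.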
Evaluation - Reviewer Types
• Compared Review Strategies
i. Truthful reviewing.
ii. Truthful reviewing except for papers superseding one’s work, which
are rejected.
iii. Random reviewing except for papers superseding one’s work, which
are rejected.
iv. Giving the opposite possible score to all papers (reject good papers
and accept poor papers).
v. Giving the lowest possible score to all papers.
(Tit-for-Tat was not explored here: assumed to have limited relevance)
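The five strategies can be written as small decision functions. The sketch below assumes a binary accept/reject decision and that a truthful reviewer accepts exactly the papers with positive worth; the slides do not fix these details, and the function names are illustrative.

```python
import random


def truthful(worth, supersedes_mine, rng):
    """Strategy i: accept exactly the papers with positive worth."""
    return worth > 0


def truthful_protective(worth, supersedes_mine, rng):
    """Strategy ii: truthful, but reject papers superseding one's work."""
    return False if supersedes_mine else worth > 0


def random_protective(worth, supersedes_mine, rng):
    """Strategy iii: random, but reject papers superseding one's work."""
    return False if supersedes_mine else rng.random() < 0.5


def inverting(worth, supersedes_mine, rng):
    """Strategy iv: reject good papers and accept poor ones."""
    return not worth > 0


def always_low(worth, supersedes_mine, rng):
    """Strategy v: reject everything."""
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    for strategy in (truthful, truthful_protective, random_protective,
                     inverting, always_low):
        print(strategy.__name__, strategy(7.5, True, rng))
```

Giving every strategy the same signature lets a simulation swap the strategy of a single reviewer, which is how the deviation cases (b) and (c) in the experiments are set up.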
Evaluation – Experiment
Assumptions about worth of reviews
• A misclassifying comment has a negative worth.
o if published, it will be accounted only as bad reputation for the
reviewer.
• Misclassifying comment worth - the difference between the
corresponding value and the real worth of the paper (always
negative).
• The worth of an author’s answer to a negatively misclassifying
comment is the same as the “absolute value of the worth of the
misclassifying comment”. Otherwise, the answer has worth zero.
Evaluation - Experiment Cases
Combinations of strategies tested for equilibria
a. All reviewers review truthfully.
b. All reviewers review truthfully, except for one reviewer who rejects
articles superseding his work but reviews truthfully submissions not
superseding his work
c. All reviewers review truthfully, except for one reviewer who rejects
articles superseding his work and reviews randomly submissions not
superseding his work.
d. All reviewers review truthfully submissions not superseding their
work and reject the other submissions.
e. All reviewers review randomly submissions not superseding their
work and reject the other submissions.
Evaluation - Experiment Results
• For both reviewing mechanisms the goal of the funding agency
(social value) is maximized with truthful reviewing, case (a).
• Reduced in other cases.
• In SY with cases (b)-(e), even if all worthy papers are published,
the total worth is reduced compared to case (a) [remember,
technical reports have less weight]
Evaluation - Experiment Results for
equilibria with CI
• To evaluate the equilibrium of truthful reviewing, researcher 1
performs non-truthful reviews.
• The experiments show the extent of the implications of the use of
different strategies with CBR and SY.
• Confirms that truthful reviewing is not in Nash equilibrium using
CBR, but it is in Nash equilibrium when SY is used under given
assumptions and strategies.
Experiments with funding based on counting
Settings:
• 100 researchers
• each paper reviewed by 4 people (assumed truthful except for 1).
• ¼ are selected for publication
• 500000 randomized simulation runs.
Truthful reviewing always led to fewer benefits for the reviewer:
• with strategy iv (inverting), gain 11.19% more publications
• with strategy v (always low), gain 15% more publications
Conclusions
We gave an example of how to analyze peer-reviewing mechanisms.
Introduced “Peer-Reviewing Games”, an abstraction of real peer-reviewing processes:
• sufficiently complex to capture interesting trade-offs
• sufficiently simple to enable some theoretical and experimental analysis
Showed that truthful reviewing is not in Nash equilibrium for Common Blind Review with given
assumptions.
Showed that truthful reviewing is in Nash equilibrium for the simplified Open Peer-Review SY under
studied assumptions and strategies.
For OPR with threshold on acceptance rate, Tit-for-Tat is a rational strategy.
For CBR, a rational strategy given assumptions is: reject superseding, random review for others.
Next?