Scientific Community Game

Karl Lieberherr
4/13/2015
SCG
1
SCG Structure
• Virtual scientists competing/collaborating for
reputation
• Problem domain with niches (niche = subset
of problems)
• 2 player game
• Alice (Bob): I am better than Bob (Alice) at
solving problems in niche N
• Let’s play!
Purpose of SCG
• Improve the problem-solving capabilities of teams
working on CS/Math problems in a given
domain. Levels of abstraction:
– Algorithm design (solving problems means coming
up with algorithms and their analyses)
– Software development (solving problems means
providing reliable software implementing the
algorithms)
• Model scientific communities
How to influence creativity
• Overall guidance: follow rules of a scientific
community
• Design space for defining reputation gain
• Information release
– influences who participates
• Show that game is sound: winning is only
possible with strong problem solving and
prediction techniques. Otherwise people try
to cheat.
Example: Stress Testing
• You have a ladder with n rungs, and you want
to find the highest rung from which you can
drop a copy of the jar and not have it
break. We call this the highest safe rung.
• You have k=2 jars available to break to
determine the highest safe rung.
• How many experiments do you need?
Minimize.
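As a point of reference for this niche, here is a minimal sketch of how the worst-case number of experiments could be computed. It uses the standard recurrence for this kind of puzzle (with t experiments and j jars one can handle covered(t-1, j-1) + covered(t-1, j) + 1 rungs). The recurrence and the name `min_experiments` are assumptions for illustration and do not come from the slides.

```python
def min_experiments(n: int, k: int) -> int:
    """Smallest t such that t experiments with k jars always suffice
    to find the highest safe rung on a ladder with n rungs."""
    if k < 1:
        raise ValueError("need at least one jar")
    # covered[j] = number of rungs that can be handled with the current
    # number of experiments t and j jars.
    covered = [0] * (k + 1)
    t = 0
    while covered[k] < n:
        t += 1
        # Update from high j to low j so covered[j - 1] still holds
        # the value for t - 1 experiments.
        for j in range(k, 0, -1):
            covered[j] = covered[j] + covered[j - 1] + 1
    return t


print(min_experiments(25, 2))  # worst-case optimum for niche (25, 2)
```

With a single jar the only safe strategy is to test rung by rung, so `min_experiments(n, 1)` equals n.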
niche (n,k) = niche (25,2)
• Alice and Bob think of a secret highest rung and share
it with Nina (the administrator). I.e., Alice and Bob
prepare “hard” problems for each other.
• First Alice asks a sequence of questions to Bob.
Answers:
– (1,no) (2,no) (3,no) (4,no) (5,no) (6,no) (7,no) (8,no) (9,yes)
(9 questions)
• Now Bob asks a sequence of questions to Alice.
Answers:
– (7,no) (14,no) (21,no) (22,no) (23,yes) (5 questions)
• Nina checks answers. Bob wins because he needed
fewer questions. Bob wins reputation from Alice.
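The round above can be replayed with a small simulation. This is only a sketch under an assumption: both players use a fixed-step two-jar strategy (jump by `step` with the first jar, then scan rung by rung), with step 1 for Alice and step 7 for Bob, which is consistent with the question sequences shown. The function name `play` is hypothetical.

```python
from typing import List, Tuple


def play(n: int, highest_safe: int, step: int) -> Tuple[int, List[Tuple[int, bool]]]:
    """Replay a fixed-step strategy against a secret highest safe rung.

    Returns the rung the strategy identifies and the list of
    (rung, jar_broke) experiments it performed.
    """
    history: List[Tuple[int, bool]] = []
    safe = 0          # highest rung known to be safe so far
    rung = step
    # Phase 1: jump ahead by `step` until a jar breaks or the ladder ends.
    while rung <= n:
        broke = rung > highest_safe
        history.append((rung, broke))
        if broke:
            break
        safe = rung
        rung += step
    upper = min(rung, n + 1)  # lowest rung known to break (or n + 1)
    # Phase 2: scan the remaining gap one rung at a time.
    for r in range(safe + 1, upper):
        broke = r > highest_safe
        history.append((r, broke))
        if broke:
            break
        safe = r
    return safe, history


# Alice questions Bob (secret: highest safe rung 8) with step 1.
_, alice_history = play(25, 8, 1)
# Bob questions Alice (secret: highest safe rung 22) with step 7.
_, bob_history = play(25, 22, 7)
print(len(alice_history), len(bob_history))  # 9 5
```

Nina's check then amounts to comparing the two history lengths: the player with the shorter history wins the round.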
Second level: Stress Testing (continued)
• The virtual scientists must make a prediction.
• E.g., Bob claims that for any problem in niche
(25,2), he only needs 8 questions to
determine the highest safe rung.
• But Alice gives Bob a problem for which he
needs 9 questions to find the highest safe
rung, clearly contradicting Bob’s prediction.
• Alice wins reputation from Bob.
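How Alice might search for such a contradicting problem can be sketched as a brute-force scan over the niche. The sketch assumes Bob plays a fixed-step strategy with step 7 (an assumption consistent with his questions in the earlier round, not something the slides state); `questions_needed` and `find_counterexample` are hypothetical names.

```python
from typing import Optional


def questions_needed(n: int, highest_safe: int, step: int) -> int:
    """Number of questions a fixed-step two-jar strategy asks."""
    count, safe, rung = 0, 0, step
    # Phase 1: jump by `step` until a jar breaks or the ladder ends.
    while rung <= n:
        count += 1
        if rung > highest_safe:
            break
        safe = rung
        rung += step
    # Phase 2: scan the gap rung by rung.
    for r in range(safe + 1, min(rung, n + 1)):
        count += 1
        if r > highest_safe:
            break
    return count


def find_counterexample(n: int, step: int, claimed_bound: int) -> Optional[int]:
    """Return a secret highest safe rung that forces more questions
    than the claimed bound, or None if the claim holds on the niche."""
    for highest_safe in range(n + 1):
        if questions_needed(n, highest_safe, step) > claimed_bound:
            return highest_safe
    return None


secret = find_counterexample(25, 7, 8)
print(secret, questions_needed(25, secret, 7))  # 19 9
```

Under these assumptions, a claim of 9 questions would survive the same scan, since no problem in niche (25,2) forces the step-7 strategy beyond 9.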
Second level: Stress Testing (continued)
• Bob claims that for any problem in niche (25,2),
he needs at most 16 questions to determine the
highest safe rung.
• Alice claims that for any problem in niche (25,2),
she needs at most 14 questions to determine the
highest safe rung.
• Both can solve problems from the opponent
within the stated limit.
• But Alice is the better virtual scientist because
she made a tighter prediction.
• Alice wins reputation from Bob.
Summary Stress Testing Example
• Alice wins reputation from Bob if
– [better problem solver] she solves Bob’s problems
better than Bob solves her problems
– [points to problem in Bob’s prediction] she gives a
problem to Bob so that Bob’s solution contradicts
his own prediction
– [better predictor] she makes a stronger prediction
than Bob and she supports her own prediction
with the problems from Bob
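The three winning conditions can be written down as a small check. This is a sketch, not the game's actual rule engine: it assumes that "better" means fewer questions and "stronger" means a smaller claimed bound (as in the stress testing niche), and the function name and parameters are hypothetical.

```python
from typing import List, Optional


def reasons_alice_wins(
    alice_used: int,
    bob_used: int,
    alice_claim: Optional[int] = None,
    bob_claim: Optional[int] = None,
) -> List[str]:
    """List the conditions under which Alice wins reputation from Bob.

    *_used: questions each player needed on the opponent's problem.
    *_claim: each player's predicted upper bound (None if no prediction).
    """
    reasons = []
    if alice_used < bob_used:
        reasons.append("better problem solver")
    if bob_claim is not None and bob_used > bob_claim:
        reasons.append("points to problem in Bob's prediction")
    if (
        alice_claim is not None
        and bob_claim is not None
        and alice_claim < bob_claim      # Alice's prediction is tighter
        and alice_used <= alice_claim    # and she lives up to it
    ):
        reasons.append("better predictor")
    return reasons


# Bob claims 8 but needs 9 questions on Alice's problem.
print(reasons_alice_wins(alice_used=9, bob_used=9, bob_claim=8))
# Alice claims 14, Bob claims 16, both stay within their limits.
print(reasons_alice_wins(alice_used=12, bob_used=12, alice_claim=14, bob_claim=16))
```

Swapping the arguments gives the symmetric check for Bob.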
Comparison to Real Scientific Community
SCG: Virtual Scientific Comm.
• reputation gain
– better problem solver
– demonstrate problem with claim
– strengthen claims
Real Scientific Community
• reputation gain
– better problem solver
– demonstrate problem with claims of other scientists
– strengthen claims of other scientists
Problem Kinds we want to solve better
• Decision: Given p, does there exist a J such that pred(p,J) holds?
• Optimization: Given p, find the best J, i.e., one for which obj(p,J) is
maximal
• Translation: Provide a translator T such that for every p:
pred(p,T(p))
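The three problem kinds can be illustrated with brute-force checkers over a finite candidate set. This is a toy sketch: the helper names, the finite candidate sets, and the perfect-square and integer-square-root examples are all assumptions chosen for illustration.

```python
import math
from typing import Callable, Iterable, TypeVar

P = TypeVar("P")  # problem type
J = TypeVar("J")  # solution type


def decide(p: P, candidates: Iterable[J], pred: Callable[[P, J], bool]) -> bool:
    """Decision: is there a J among the candidates with pred(p, J)?"""
    return any(pred(p, j) for j in candidates)


def optimize(p: P, candidates: Iterable[J], obj: Callable[[P, J], float]) -> J:
    """Optimization: return a candidate J that maximizes obj(p, J)."""
    return max(candidates, key=lambda j: obj(p, j))


def translator_is_valid(
    problems: Iterable[P], translate: Callable[[P], J], pred: Callable[[P, J], bool]
) -> bool:
    """Translation: check pred(p, T(p)) on every given problem p."""
    return all(pred(p, translate(p)) for p in problems)


# Decision: is 16 a perfect square?
print(decide(16, range(17), lambda p, j: j * j == p))
# Optimization: index of the largest entry of a list.
print(optimize([3, 9, 4], range(3), lambda p, j: p[j]))
# Translation: math.isqrt as translator T, checked on problems 0..49.
print(translator_is_valid(range(50), math.isqrt,
                          lambda p, j: j * j <= p < (j + 1) ** 2))
```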
Dimensions (continued)
• Direct/Indirect
– Game is played by humans (direct)
– Game is played by software agents written by
humans (indirect)
• Formal/Informal
– The communication language is
• defined by a grammar (formal)
• English, maybe stylized (informal)
• Symmetric/Asymmetric
Dimensions (continued)
• Hypotheses are algorithmic predictions. To
check predictions, need
– execution history of algorithm
• algorithm needs to be submitted to admin.
– result of algorithm
hypotheses frozen before problems are provided
SCG
symmetric
• scientists propose and oppose hypotheses to each other
• unsuccessful opposition gains reputation for proposer
• strengthening gains reputation for strengthener
asymmetric
• Scientists are forced to oppose. Chief scientist proposes
• the same?
Hypothesis
• (niche, prediction)
• prediction is a function of the niche. Example:
niche (n,2). prediction: log(n).
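A hypothesis of this shape could be represented as a pair of a niche and a prediction function. The sketch below makes several assumptions: the class and method names are hypothetical, the prediction is read as an upper bound on the number of questions, and the logarithm is taken to base 2 because the slide does not specify a base.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Hypothesis:
    niche: Tuple[int, int]                    # (n, k): rungs and jars
    prediction: Callable[[int, int], float]   # bound as a function of the niche

    def bound(self) -> float:
        """Evaluate the prediction on this hypothesis's niche."""
        n, k = self.niche
        return self.prediction(n, k)

    def is_contradicted_by(self, questions_used: int) -> bool:
        """True if a solved problem needed more questions than predicted."""
        return questions_used > self.bound()


# Assumption: log means log base 2.
h = Hypothesis(niche=(25, 2), prediction=lambda n, k: math.log2(n))
print(round(h.bound(), 2))       # 4.64
print(h.is_contradicted_by(5))   # True
```

An opponent who forces the proposer to use 5 or more questions on a problem in niche (25,2) would contradict this particular hypothesis.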
SCG
symmetric
• propose: wide choice
• oppose: optional
– discount
– strengthen
• provide
• solve
asymmetric
• propose: niche of hypothesis is given, but not prediction. No
choice for niche.
• oppose: compulsory
– discount: cannot live up to prediction
– strengthen: if can live up to prediction, but stronger
prediction.
• provide
• solve
Crowdsourcing
Winning criteria are checkable
Crowdsourcing without SCG
1. Company broadcasts
problem online
2. Online “crowd” submits
solutions
3. Crowd vets solutions
4. Company rewards winning
solvers
5. Company owns winning
solutions and profits
Crowdsourcing with SCG
1. Company submits problem
definition to SCG server
2. Online “crowd” submits
solutions as SCG agents.
3. SCG tournaments vet
solutions
4. Company rewards winning
solvers
5. Company owns winning
agents and profits