Collective Learning in Social Coordination Games
Akira Namatame, Hiroshi Sato
Dept. of Computer Science, National Defense Academy
Yokosuka, 239-8686, JAPAN
E-mail: {nama, hsato}@cc.nda.ac.jp
Abstract

An important aspect of collective intelligence is the learning strategy adopted by each agent. An interesting problem which has been widely investigated is under what circumstances agents will converge to some particular equilibrium. We consider networks of agents in which each agent faces strategic decision problems. We investigate the dynamics of collective decision when each agent adapts its strategy of interaction to its neighbors. We are interested in showing how the society gropes its way towards an equilibrium situation. We show that the society selects the most efficient equilibrium among multiple equilibria when the agents composing it learn from each other as collective learning and co-evolve their strategies over time. The possibility of mutual learning allows social optimality where the model with random matching may result in inefficiency.

Keywords: collective learning, crossover, coordination games, equilibrium selection

1. Introduction

The concepts and techniques of game theory have been used extensively in dealing with strategic decision-making situations. However, game theory has been unsuccessful in explaining how agents know which equilibrium will be realized if a game has multiple equally plausible equilibria [2]. Introspective or deductive theories that attempt to explain the problem of equilibrium selection impose very strong informational assumptions on the decision-making process [3]. In most game-theoretic models, agents are assumed to have perfect knowledge of the consequences of their decisions.

As a consequence, attention has shifted to evolutionary explanations motivated by evolutionary game theory [9]. Evolutionary game theory assumes that the game is repeated by trial and error. The evolutionary selection process operates over time on the population, and more fit agents or strategies are selected by natural selection. That is, selection pressure favors agents which are fitter, i.e., whose strategy yields a higher payoff against the average of the population. Two features of the evolutionary approach distinguish it from the introspective approach of game theory. First, agents are not assumed to be so rational or knowledgeable as to correctly guess or anticipate the other agents' strategies. Second, an explicit dynamic process is specified describing how agents adjust their strategies over time as they learn from their experiences [11].

This paper considers the dynamics of collective decision when agents adapt their strategic decisions to their neighbors. We especially consider the networks of strategic decisions shown in Figure 1, in which each node represents an agent and each link represents social interaction between two agents. We model the strategic decisions of agents as follows: each agent faces a binary decision with externalities. An externality occurs if one cares about others' decisions, and others' decisions also affect each agent's strategic decision. We consider networks of social games in which each agent is repeatedly matched with its neighbors to play two-person coordination games. We investigate how the society gropes its way towards equilibrium in an imperfect world where agents interact locally. They do not perform optimization calculations; instead they observe the current performance of their neighbors and learn from the most successful one. We attempt to probe deeper into these issues by modeling the choices made by the agents with their own internal models. The term evolutionary dynamics often refers to systems that exhibit a time evolution in which the character of the dynamics may change due to internal mechanisms [4]. We focus on evolutionary dynamics that may change in time according to each agent's local rule of interaction. We then model the microscopic evolutionary dynamics induced by both agents' conscious decisions and adaptive decisions.

There is no presumption that the self-interested behavior of agents should usually lead to collectively satisfactory results. How well each agent does for itself in adapting to its social environment is not the same thing as how satisfactory a social environment they collectively create for themselves. We explore the mechanism by which emergent collective behavior can be steered in order to achieve desirable situations. We discuss the role of collective learning by considering the problem of equilibrium selection. We use the term emergent to denote stable macroscopic patterns arising from the local rules for interaction. We investigate the mechanism by which collective learning produces coherent and systematic emergent behavior of social efficiency.

Fig. 1. Networks of strategic decisions: social games
2. Social Coordination Games and Equilibrium Selection
Social coordination games are modeled to describe social situations in which agents are expected to select the strategy that the majority selects. Coordination games have multiple equilibria with the possibility of coordinating on different strategies, which is known as the problem of mis-coordination. Traditional game theory is silent on how agents know which equilibrium should be realized if a game has multiple equally plausible equilibria, where these can be Pareto ranked. Game theory has also been unsuccessful in explaining how each agent should learn in order to improve an equilibrium situation [2].
As a specific example of a coordination game, we formulate the endogenous choice problem of partners. Each link in Fig. 1 represents social interaction formulated as the following coordination game: agents periodically have the discretion to add or sever links to their neighbors. All possible outcomes between two agents are given by the payoff matrix in Table 1. If both agents add the link, they receive the payoff α. The parameter β represents the loss of a one-sided addition to the network: if the partner severs the link, the connection cannot be made, hence the effort of adding to the network becomes meaningless. The social interaction model is characterized by the fact that each agent can completely control with whom it interacts. With such endogenous interaction patterns, the question is whether the society may select an efficient equilibrium, in which each agent is linked as shown in Figure 1, starting from any initial configuration of the network.
The payoff matrix in Table 1 has two equilibria in pairs of pure strategies, (S1, S1) and (S2, S2), and one mixed-strategy equilibrium. The most preferable equilibrium, defined by Pareto-dominance, is (S1, S1), which dominates the other equilibria. There is another equilibrium concept, risk-dominance. If α − β < 0, the equilibrium (S2, S2) risk-dominates (S1, S1). Equivalently, if the value θ = β/(α + β), defined as the threshold, is less than 0.5, the Pareto-dominant equilibrium (S1, S1) also risk-dominates (S2, S2); however, if θ is greater than 0.5, the equilibrium (S2, S2) risk-dominates (S1, S1). The optimal situation for the society is realized if all agents choose the Pareto-optimal strategy S1. How do rational agents make their decisions when the Pareto-dominant and risk-dominant strategies differ? Absent an explanation of how agents coordinate their expectations on the multiple equilibria, some agents expect the equilibrium (S1, S1) and others expect (S2, S2), and then they often face the problem of mis-coordination.
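The threshold condition can be checked directly: against a neighbor who plays S1 with probability p, strategy S1 earns pα − (1 − p)β in expectation, while S2 earns 0 (the sever payoffs are normalized to zero, as the description of Table 1 implies). S1 is therefore the best reply exactly when p > β/(α + β) = θ. A minimal sketch of this calculation (function names are ours):

```python
def threshold(alpha, beta):
    """theta = beta / (alpha + beta): S1 is the best reply iff p > theta."""
    return beta / (alpha + beta)

def best_reply(p, alpha, beta):
    """Best reply when a fraction p of neighbors plays S1 (add).

    Expected payoffs (sever normalized to 0, as implied by Table 1):
      S1: p * alpha + (1 - p) * (-beta)
      S2: 0
    """
    payoff_s1 = p * alpha - (1 - p) * beta
    return "S1" if payoff_s1 > 0 else "S2"

# With alpha = 2, beta = 3 we get theta = 0.6 > 0.5,
# so (S2, S2) risk-dominates even though (S1, S1) is Pareto-dominant.
assert threshold(2, 3) == 0.6
assert best_reply(0.7, 2, 3) == "S1"   # enough neighbors add: adding pays
assert best_reply(0.5, 2, 3) == "S2"   # too few: severing is safer
```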
Even if we consider the social coordination gamesin an
evolutionary setting, the situation remainsthe same. Theriskdominantequlibrium is selected from amongthe set of Nash
equlibria, even if it is inefficient. The basin of attraction of
the risk-dominantequlibriumis larger than that of the Paretodominantequilibrium. In addition, if agents sometimesmake
mistakes, the society occasionally switches from one
equilibrium to another. As the likelihood of mistakes goes to
zero, it is knownthat the risk-dominant equilibrium becomes
to be stochasticallystable [6] .[8].
Table 1. The payoff matrix of a coordination game

                       The other's strategy
  Own strategy         S1 (add)        S2 (sever)
  S1 (add)             (α, α)          (−β, 0)
  S2 (sever)           (0, −β)         (0, 0)
3. Spatial Evolutionary Dynamics with Collective Learning
In order to formulate social interactions at large, we have two fundamental models: random matching and local matching [1]. The approach of random (or uniform) matching is modeled as follows: at each time period, every agent is assumed to match (interact) with one agent drawn at random from the society. An important assumption of random matching is that agents receive knowledge of the current strategy distribution. Agents can then calculate their best strategy based on information about what other agents have done in the past. What happens when an agent does not have all of this information?
We especially consider the structure of territories in which the entire territory is divided up so that each agent interacts with its eight neighbors in Fig. 1 [7]. Agents adopt the most successful strategies as guides for their own decisions (individual learning). Hence their success depends in large part on how well they learn from their neighbors. If a neighbor is doing well, its strategy can be imitated by all others (collective learning). Each agent is modeled to be matched several times with the same partner, and its strategies for the same partner are coded as a list. A part of the list is replaced with that of the most successful agent, as shown in Fig. 2 (crossover) [5][10]. Neighbors can serve another function as well: if a neighbor is doing well, its behavior can be imitated, and successful strategies can spread throughout a population from neighbor to neighbor. The consequences of their behaviors may take some time to have an effect on agents with whom they are not directly linked.
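The list-based crossover described above can be sketched as follows. The list encoding of strategies and the choice of a single random contiguous segment to copy are our reading of the text, not details it fixes:

```python
import random

def crossover_imitate(own, best, rng=random):
    """Replace a random contiguous part of the agent's strategy list
    with the corresponding part of the most successful neighbor's list."""
    assert len(own) == len(best)
    # Pick two distinct cut points; the segment between them is copied.
    i, j = sorted(rng.sample(range(len(own) + 1), 2))
    return own[:i] + best[i:j] + own[j:]

rng = random.Random(0)
child = crossover_imitate(["S2"] * 8, ["S1"] * 8, rng)
# The child mixes the two parent lists; length is preserved and at
# least one entry is copied from the successful neighbor.
assert len(child) == 8
assert "S1" in child
```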
Fig. 2. Illustration of crossover as individual learning: part of agent A's strategy list is replaced with the corresponding part of the list of the neighbor who acquired the highest payoff, yielding the updated strategy.
4. Simulation Results

We endow agents with a simple form of collective learning and describe the evolutionary dynamics that magnifies tendencies toward better situations. We arrange agents on an area of 50 × 50 (N = 2,500 agents) as the lattice model in Figure 2 with no gap, and the four corners and edges of the area are connected with the opposite side. At each time period, each agent plays the coordination game in Table 1 with its 8 neighbors. We investigate the role of collective learning in social games. We demonstrate various evolutionary phenomena by studying the behavior of the following two completely different models: the mean-field model in which all agents interact with all others (or equivalently the random matching model in which each agent interacts with a randomly chosen partner), and the local interaction on the lattice model.
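One period of the lattice model can be sketched as below: everyone plays all 8 Moore neighbors on a torus, then copies the strategy of the best-scoring agent in its neighborhood. The synchronous update and the imitate-the-best rule (including the agent itself among the candidates) are our reading of the text; the function names are ours:

```python
def neighbors(x, y, L):
    """The 8 Moore neighbors on an L x L torus (edges wrap around)."""
    return [((x + dx) % L, (y + dy) % L)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def payoff(s, t, alpha, beta):
    """Coordination-game payoff of strategy s against t (Table 1)."""
    if s == "S1":
        return alpha if t == "S1" else -beta
    return 0.0

def step(grid, alpha, beta):
    """Everyone plays all 8 neighbors, then imitates the best-scoring
    agent in its neighborhood (itself included) for the next period."""
    L = len(grid)
    score = {(x, y): sum(payoff(grid[x][y], grid[nx][ny], alpha, beta)
                         for nx, ny in neighbors(x, y, L))
             for x in range(L) for y in range(L)}
    new = [[None] * L for _ in range(L)]
    for x in range(L):
        for y in range(L):
            cells = [(x, y)] + neighbors(x, y, L)
            bx, by = max(cells, key=lambda c: score[c])
            new[x][y] = grid[bx][by]
    return new

# A uniform population is a fixed point of the dynamics.
grid = [["S1"] * 4 for _ in range(4)]
assert step(grid, 2.0, 1.0) == grid
```

Iterating `step` on a 50 × 50 grid until it stops changing reproduces the setting of the experiments reported below.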
In Fig. 4, we show the simulation results obtained by changing the parameters α, β in Table 1. The x-axis represents the threshold value θ = β/(α + β). The y-axis represents the ratio of Pareto-optimal strategies (S1) in the society at equilibrium. The dotted lines in Fig. 4 represent the results of the evolutionary approach. If θ < 0.5, the Pareto-optimal strategy S1 also risk-dominates S2, therefore all agents finally choose S1. However, if θ > 0.5, all agents finally choose the risk-dominant strategy S2 in the mean-field model. With the lattice model with collective learning, if the threshold value θ is less than 0.8, then almost every agent finally chooses S1. However, if the threshold θ is greater than 0.8, more agents come to choose the risk-dominant strategy S2.
We also considered the effect of mistakes on equilibrium selection. We assume that agents sometimes make mistakes in the implementation of their strategies. That is, with some probability ε, an agent does not use the current strategy determined by its rule of interaction, but the other one. We use the noise levels ε = 0.01, ε = 0.025, and ε = 0.05. With the high rate of mistakes (ε = 0.05), the risk-dominant strategy S2 comes to dominate if the threshold value θ is large (θ > 0.8). However, with low levels of mistakes, the Pareto-dominant strategy S1 still dominates the society.
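Mistakes enter as a simple mutation layer on top of the interaction rule. A sketch, where the independent per-agent, per-period flip with probability ε is our reading of the text:

```python
import random

def with_mistakes(strategy, epsilon, rng=random):
    """With probability epsilon, the agent plays the *other* strategy
    instead of the one its rule of interaction prescribes."""
    if rng.random() < epsilon:
        return "S2" if strategy == "S1" else "S1"
    return strategy

# Over many draws, roughly an epsilon fraction of intended plays flips.
rng = random.Random(1)
flips = sum(with_mistakes("S1", 0.05, rng) == "S2" for _ in range(10000))
assert 400 < flips < 600   # about 5% of 10,000
```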
Fig. 4. The ratio (in %) of the Pareto-dominant strategy (S1) at equilibrium as a function of the threshold θ (up to θ = 0.9), for noise levels ε = 0.00, 0.01, 0.025, and 0.05.
The differences between the mean-field model and the lattice model with collective learning are evident in all these cases. We found that the collective learning process leads the society of rational agents to the Pareto-dominant equilibrium situation in most cases. In this context, even when mistakes or errors in strategy choice are introduced, we found that social coordination games with collective learning always converge to the Pareto-dominant equilibrium.
5. Conclusion

This paper discussed strategic models in which agents interact locally with their neighbors and learn from the most successful one. We compared the global model with global interaction, in which all agents can interact with each other, and the lattice model with local interaction, in which agents can only interact with their immediate neighbors. As an example, we showed how social networks evolve to a socially optimal situation where everybody is fully connected, starting from a randomly disconnected situation. We also discussed the role of collective learning in social games. The hypotheses we employed here reflect the limited ability of agents. The learning strategy employed here is crossover, and agents learn from the most successful neighbor. The methodology of collective learning is very slow to evolve, but it is essential, especially in imperfect worlds. We investigated how the society selects the most efficient equilibrium among multiple equilibria when the agents composing it learn from each other, and showed that collective learning is essential for realizing social efficiency.
6. References

[1] Cohendet, P. et al. (eds.), The Economics of Networks, Springer, 1998.
[2] Fudenberg, D. and Levine, D., The Theory of Learning in Games, The MIT Press, 1998.
[3] Harsanyi, J. and Selten, R., A General Theory of Equilibrium Selection in Games, The MIT Press, 1988.
[4] Hofbauer, J. and Sigmund, K., Evolutionary Games and Population Dynamics, Cambridge University Press, 1998.
[5] Iwanaga, S. and Namatame, A., "Decentralized Decision Networks", The Third Int. Conf. on Information Fusion, pp. 571-577, 2000.
[6] Kandori, M. and Mailath, G., "Learning, Mutation, and Long Run Equilibria in Games", Econometrica, Vol. 61, pp. 29-56, 1993.
[7] Nowak, M. A. et al., "The Arithmetics of Mutual Help", Scientific American, June 1995.
[8] Samuelson, L., Evolutionary Games and Equilibrium Selection, The MIT Press, 1998.
[9] Smith, J. M., Evolution and the Theory of Games, Cambridge University Press, 1982.
[10] Uno, K. and Namatame, A., "An Evolutionary Design of the Networks of Mutual Reliability", CEC'99, Washington D.C., pp. 1717-1723, 1999.
[11] Uno, K. and Namatame, A., "Evolutionary Behaviors Emerged through Strategic Interactions in the Large", GECCO'99, Orlando, Florida, pp. 1414-1421, 1999.
[12] Weibull, J., Evolutionary Game Theory, The MIT Press, 1996.