Learning in Multiagent systems

Prepared by: Jarosław Szymczak
Based on: „Fundamentals of Multiagent Systems with
NetLogo Examples” by José M. Vidal
Scenarios of learning
 cooperative learning – e.g. each agent has its own map and, together with the other agents, aggregates a global view
 competitive learning – e.g. each selfish agent tries to maximize its own utility by learning about the behaviors and weaknesses of the other agents
 agents learn because:
 they don’t know everything about the environment
 they don’t know how the other agents behave
The Machine Learning Problem
 The goal of machine learning research is the
development of algorithms that increase the
ability of an agent to match a set of inputs to
their corresponding outputs (Mitchell, 1997)
 The input here could be, e.g., a set of photos depicting people and the output the set {man, woman}; the machine learning algorithm has to learn to assign the correct label to each photo
The Machine Learning Problem
 The input set is usually divided into a training set and a testing set; the two can be interleaved (a toy example of such a split is sketched below)
 Graphical representation of the machine learning problem (figure)
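For illustration, a minimal sketch of such an input-to-output mapping with a training/testing split, using a toy nearest-neighbour learner (all feature values and labels below are made up for the example):

```python
# Toy version of the machine-learning problem: learn a mapping from inputs
# (feature vectors standing in for photos) to outputs ({"man", "woman"}).
# The feature vectors and labels are invented for this sketch.

def nearest_neighbour(training_set, query):
    """Predict the label of the closest training example (1-NN)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(training_set, key=lambda ex: sq_dist(ex[0], query))[1]

data = [((0.9, 0.1), "man"), ((0.2, 0.8), "woman"),
        ((0.8, 0.3), "man"), ((0.1, 0.9), "woman"),
        ((0.7, 0.2), "man"), ((0.3, 0.7), "woman")]

training, testing = data[:4], data[4:]      # training / testing split
for features, true_label in testing:
    predicted = nearest_neighbour(training, features)
    print(features, "->", predicted, "| true:", true_label)
```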
The Machine Learning Problem
 Inductive bias – some learning algorithms appear to perform better than others in certain domains (e.g. two algorithms may both learn to classify the + and − examples perfectly, yet still end up with different hypothesis functions)
 No free lunch theorem – averaged over all possible learning problems, no learning algorithm outperforms all the others
 In a multiagent scenario some of the fundamental assumptions of machine learning are violated: the input is no longer fixed, it keeps changing because the other agents are also learning
Cooperative learning
 Suppose we are given two robots that are able to communicate; they can share their knowledge (their capabilities, knowledge about the terrain, etc.) – a sketch of merging such shared terrain maps is given below
 Sharing the information is easy if the robots are identical; if not, we need to somehow model their capabilities to decide which information would be useful to the other robot
 Most systems that share learned knowledge
among agents, such as (Stone, 2000), simply
assume that all agents have the same
capabilities.
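A minimal sketch of such knowledge sharing, assuming identical robots that each hold a partial grid map (the map representation, cell contents and tie-breaking rule are assumptions made for this sketch):

```python
# Each robot holds a partial map: cell (x, y) -> (obstacle?, timestamp).
# Because the robots are identical, aggregation is a simple union of
# observations; cells seen by both are resolved by the more recent reading.

def merge_maps(*local_maps):
    """Aggregate the agents' local maps into one global view."""
    global_view = {}
    for local in local_maps:
        for cell, (obstacle, timestamp) in local.items():
            if cell not in global_view or timestamp > global_view[cell][1]:
                global_view[cell] = (obstacle, timestamp)
    return global_view

robot_a = {(0, 0): (False, 1), (0, 1): (True, 2)}
robot_b = {(0, 1): (False, 5), (1, 1): (True, 3)}   # re-observed (0, 1) later

print(merge_maps(robot_a, robot_b))
# {(0, 0): (False, 1), (0, 1): (False, 5), (1, 1): (True, 3)}
```

If the robots were not identical, the merge would additionally have to translate each observation into terms the receiving robot can actually use.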
Repeated games
 Nash equilibrium – I choose what is best for me given what you are doing, and you choose what is best for you given what I am doing
 In repeated games two players face each other in the same game over and over, e.g. the prisoner's dilemma (used in the sketch below)
 A Nash equilibrium is based on the assumption of perfectly rational players; in learning in games the assumption is instead that agents use some kind of learning algorithm, and the theory determines the equilibrium strategies that the various learning mechanisms arrive at and, where possible, maps these equilibria to the standard solution concepts
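As a concrete check of the informal definition above, a minimal sketch that tests every action profile of the prisoner's dilemma for profitable unilateral deviations (the payoff numbers are the usual textbook values, assumed here rather than taken from the slides):

```python
# Verify a Nash equilibrium by checking for profitable unilateral deviations.
# Prisoner's dilemma payoffs (row, column); C = cooperate, D = defect.
payoffs = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}
actions = ["C", "D"]

def is_nash(row, col):
    """Neither player can gain by changing only their own action."""
    row_ok = all(payoffs[(r, col)][0] <= payoffs[(row, col)][0] for r in actions)
    col_ok = all(payoffs[(row, c)][1] <= payoffs[(row, col)][1] for c in actions)
    return row_ok and col_ok

for profile in payoffs:
    print(profile, is_nash(*profile))   # only ("D", "D") passes the check
```

Only (D, D) passes, which is why mutual defection is the Nash equilibrium of the one-shot prisoner's dilemma.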
Fictitious play
In fictitious play the agent remembers everything the other agents have done: it keeps counts of their past actions and plays a best response to the resulting empirical distribution. For example:
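A minimal sketch of this bookkeeping, assuming a two-player matrix game (matching pennies here) and arbitrary starting counts:

```python
# Fictitious play: each player keeps counts of the opponent's past actions
# and best-responds to the resulting empirical mixture.
from collections import Counter

# Row player's payoffs in matching pennies; the column player gets the negative.
row_payoff = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
actions = ["H", "T"]

def best_response(opponent_counts, payoff_of):
    """Action maximizing expected payoff against the opponent's empirical mix."""
    total = sum(opponent_counts.values())
    return max(actions, key=lambda a: sum(payoff_of(a, o) * n / total
                                          for o, n in opponent_counts.items()))

row_counts = Counter({"H": 1, "T": 2})   # row's counts of the column player's moves
col_counts = Counter({"H": 2, "T": 1})   # column's counts of the row player's moves

for t in range(10):
    r = best_response(row_counts, lambda a, o: row_payoff[(a, o)])
    c = best_response(col_counts, lambda a, o: -row_payoff[(o, a)])
    print(t, r, c)
    row_counts[c] += 1                   # each player observes the other's move
    col_counts[r] += 1
```

The deterministic tie-break in max above is exactly the kind of choice rule that can lock players into the infinite cycles discussed on a later slide; adding randomness to the tie-break avoids this.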
Fictitious play
 Let’s have a look at some theorems:
 (Nash Equilibrium is Attractor to Fictitious
Play). If s is a strict Nash equilibrium and it is
played at time t then it will be played at all
times greater than t (Fudenberg and Kreps,
1990).
 (Fictitious Play Converges to Nash). If
fictitious play converges to a pure strategy
then that strategy must be a Nash equilibrium
(Fudenberg and Kreps, 1990).
Fictitious play
 Infinite cycles problem – fictitious play can get stuck in an infinite cycle of actions; we can avoid this by adding randomness to the agents' choices. Example of an infinite cycle in fictitious play: (figure)
Replicator dynamics
This model assumes that the fraction of agents playing a particular strategy will grow in proportion to how well that strategy performs in the population. A homogeneous population of agents is assumed; the agents are randomly paired in order to play a symmetric game (same strategies and payoffs for both players). The model is inspired by biological evolution.
Let φt(s) be the number of agents using strategy s at time t, ut(s) the expected utility for an agent playing strategy s at time t, and u(s,s’) the utility an agent playing s receives against an agent playing s’. We can define:
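A standard way to complete these definitions, assuming the usual discrete replicator update (a sketch; the exact formulation on the original slide may differ):

```latex
% Fraction of the population playing s at time t
\theta^t(s) = \frac{\varphi^t(s)}{\sum_{s'} \varphi^t(s')}

% Expected utility of playing s against a randomly drawn opponent
u^t(s) = \sum_{s'} \theta^t(s')\, u(s, s')

% Replicator update: the share of s grows in proportion to how well it does
% relative to the population average \bar{u}^t = \sum_{s} \theta^t(s)\, u^t(s)
\theta^{t+1}(s) = \theta^t(s)\, \frac{u^t(s)}{\bar{u}^t}
```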
Replicator dynamics
 Let’s have a look at some theorems:
 (Nash equilibrium is a Steady State). Every Nash equilibrium is a
steady state for the replicator dynamics (Fudenberg and Levine,
1998).
 (Stable Steady State is a Nash Equilibrium). A stable steady
state of the replicator dynamics is a Nash equilibrium. A stable
steady state is one that, after suffering from a small perturbation,
is pushed back to the same steady state by the system’s
dynamics (Fudenberg and Levine, 1998).
 (Asymptotically Stable is Trembling-Hand Nash). An asymptotically stable steady state corresponds to a Nash equilibrium that is trembling-hand perfect and isolated. That is, the stable steady states are a refinement on Nash equilibria: only a few Nash equilibria are stable steady states (Bomze, 1986).
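To illustrate these steady-state results, a minimal simulation sketch of the discrete replicator update above on a symmetric two-strategy coordination game (the payoff matrix, the initial shares, and the restriction to positive payoffs are assumptions made for this sketch):

```python
# Discrete replicator dynamics on a symmetric 2-strategy coordination game.
# Payoffs are kept positive so that the population shares stay non-negative.
strategies = ["A", "B"]
u = {("A", "A"): 3, ("A", "B"): 1, ("B", "A"): 1, ("B", "B"): 2}

theta = {"A": 0.4, "B": 0.6}     # initial population shares (assumed)
for t in range(20):
    expected = {s: sum(u[(s, o)] * theta[o] for o in strategies) for s in strategies}
    average = sum(theta[s] * expected[s] for s in strategies)
    theta = {s: theta[s] * expected[s] / average for s in strategies}

print(theta)   # shares converge towards all-A, a strict Nash equilibrium
```

Starting above the mixed-equilibrium share of 1/3 for A, the population converges to the pure profile (A, A); starting below it, it would converge to (B, B). Both limits are stable steady states and Nash equilibria, in line with the theorems above.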
Evolutionary stable strategy
 An ESS is an equilibrium strategy that can overcome the presence of a small number of invaders. That is, if the equilibrium strategy profile is ω and a small fraction ε of invaders start playing ω’, then for ω to be an ESS the existing population must get a higher payoff against the new mixture (εω’ + (1−ε)ω) than the invaders do.
 (ESS is Steady State of Replicator Dynamics). ESS is an
asymptotically stable steady state of the replicator
dynamics. However, the converse need not be true—a
stable state in the replicator dynamics does not need to
be an ESS (Taylor and Jonker, 1978).
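As a numeric illustration of the invasion condition above, a small sketch that checks it in the standard Hawk-Dove game (the payoff values V = 2, C = 4, the candidate equilibrium ω = V/C, and the invader strategies are all assumptions made for the example):

```python
# ESS check in symmetric Hawk-Dove: a mixed strategy is the probability of "Hawk".
V, C = 2.0, 4.0                  # value of the resource, cost of a fight (assumed)
payoff = {("H", "H"): (V - C) / 2, ("H", "D"): V, ("D", "H"): 0.0, ("D", "D"): V / 2}

def u(p, q):
    """Expected payoff of a p-Hawk mixture against a q-Hawk mixture."""
    return sum(payoff[(a, b)] * pa * qb
               for a, pa in (("H", p), ("D", 1 - p))
               for b, qb in (("H", q), ("D", 1 - q)))

omega = V / C                    # candidate ESS: play Hawk with probability 0.5
eps = 0.01                       # small share of invaders
for invader in (0.0, 1.0, 0.3, 0.9):                 # alternative strategies omega'
    mix = eps * invader + (1 - eps) * omega          # post-invasion population mixture
    print(invader, u(omega, mix) > u(invader, mix))  # True -> the invader is repelled
```

Every invader tested earns strictly less against the post-invasion mixture than the incumbent strategy does, so ω = V/C survives these invasions, as the ESS definition requires.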
AWESOME algorithm
The abbreviation stands for:
Adapt When Everybody is Stationary,
Otherwise Move to Equilibrium
Stochastic games
COMING SOON  (THIS AUTUMN)