Lecture 2: Multi-agent Interactions
SIF8072
Distributed Artificial Intelligence
and
Intelligent Agents
http://www.idi.ntnu.no/~agent/
Lecturer: Sobah Abbas Petersen
Email: sap@idi.ntnu.no
Lecture Outline
1. Multi-agent Systems
2. Utility and Preferences
3. Game Theory and Payoff Matrices
4. Strategies
5. Negotiation - Auctions
6. Summary
2
References
Wooldridge: ”Introduction to MAS”
– Multi-agent Interactions: Chapter 6
– Auctions: Chapter 7
3
Interactions
”The world functions through interacting
agents. Each person pursues his/her own
goals through encounters with other
people or machines.”
”Rules of Encounter” by Rosenschein and Zlotkin, 1994
4
Example 1
Two students decide to work together on their
exercises. They have to decide upon a time. One
prefers to work on Thursday afternoons after the
lecture while the other prefers to work on Friday
morning. How do they decide upon a time to do
the work?
5
Example 2
A friend invites you out for a drink and the cinema
tonight. But your favourite TV program is on tonight.
You think:
– It would be nice to go out with my friend, but it’s cheaper to watch TV.
– If I stay at home and watch TV, I might not have a chance to go out with my friend for a long time.
– I can always record the program and watch it afterwards.
– I can invite my friend home.
6
Multi-agent Systems (MAS)
• Contains a number of agents which:
– interact with one another through communication
– are able to act in an environment
– have different ”spheres of influence”
– may be linked by other relationships, e.g. organisational
• It is important to understand the type of interaction.
• Each agent can be assumed to be self-interested:
– has its own preferences and desires about how the world should be.
7
Multi-agent Systems (MAS)
[Figure: a multi-agent system. Agents act in a shared environment, each with its own sphere of influence, and are linked by interactions and organisational relationships.]
8
Utilities and Preferences
• Assume we have 2 agents: $Ag = \{i, j\}$.
• Assume $\Omega = \{\omega_1, \omega_2, \ldots\}$ is the set of ”outcomes” that agents have preferences over.
• We capture preferences by utility functions:
  $u_i : \Omega \to \mathbb{R}$
  $u_j : \Omega \to \mathbb{R}$
• Utility functions lead to preference orderings over outcomes:
  $\omega \geq_i \omega'$ means $u_i(\omega) \geq u_i(\omega')$
  $\omega >_i \omega'$ means $u_i(\omega) > u_i(\omega')$
9
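A minimal Python sketch of how a utility function induces a preference ordering over outcomes. The outcome names ("w1", "w2", "w3") and the utility values are illustrative assumptions, not taken from the slides.

```python
# Hypothetical outcomes and utility values for agent i, chosen only for illustration.
u_i = {"w1": 1.0, "w2": 1.0, "w3": 4.0}

def weakly_prefers(u, a, b):
    """a >=_i b  holds exactly when  u(a) >= u(b)."""
    return u[a] >= u[b]

def strictly_prefers(u, a, b):
    """a >_i b  holds exactly when  u(a) > u(b)."""
    return u[a] > u[b]

# Outcomes ranked from most to least preferred (ties keep their input order).
ranking = sorted(u_i, key=u_i.get, reverse=True)
print(ranking)
```

Two outcomes with equal utility are weakly preferred to each other, but neither is strictly preferred.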
What is Utility?
• Utility is not money, but money is a useful analogy.
• Typical relationship between utility and money:
[Figure: utility plotted against money.]
10
Multi-agent Encounters 1
• Need a model of the environment in which the agents will act.
• Agents simultaneously choose an action and, as a result, an outcome in $\Omega$ will result.
• Actual outcome depends on the combination of actions.
• Environment behaviour is given by the state transformer function (reference: p31 of textbook):
  $\tau : Ac \times Ac \to \Omega$
  where the first argument is agent i’s action and the second is agent j’s action.
11
Multi-agent Encounters 2
• Assume that each agent has two possible actions:
  1. C: cooperate
  2. D: defect
• Let $Ac = \{C, D\}$
12
State Transformer Functions
Let $Ac = \{C, D\}$.
• Environment sensitive to actions of both agents:
  $\tau(D,D) = \omega_1$, $\tau(D,C) = \omega_2$, $\tau(C,D) = \omega_3$, $\tau(C,C) = \omega_4$
• Environment where neither agent has any influence:
  $\tau(D,D) = \omega_1$, $\tau(D,C) = \omega_1$, $\tau(C,D) = \omega_1$, $\tau(C,C) = \omega_1$
• Environment controlled by j:
  $\tau(D,D) = \omega_1$, $\tau(D,C) = \omega_2$, $\tau(C,D) = \omega_1$, $\tau(C,C) = \omega_2$
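The three environments can be sketched as lookup tables in Python. The labels "w1" to "w4" stand for the four outcomes, and the helper `influences` is a hypothetical name introduced here; it checks whether an agent can change the outcome by changing only its own action.

```python
# Keys are (i's action, j's action); "C" = cooperate, "D" = defect.
tau_both = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
tau_none = {pair: "w1" for pair in tau_both}
tau_j = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w1", ("C", "C"): "w2"}

def influences(tau, agent):
    """True if the agent can change the outcome by changing only its own action.

    agent is 0 for i (first component of the key) or 1 for j (second component).
    """
    for (a_i, a_j), outcome in tau.items():
        for alt in "CD":
            other = (alt, a_j) if agent == 0 else (a_i, alt)
            if tau[other] != outcome:
                return True
    return False

for name, tau in [("both", tau_both), ("none", tau_none), ("j only", tau_j)]:
    print(name, influences(tau, 0), influences(tau, 1))
```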
13
Agent’s Preference
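• Consider the case where both agents influence the outcome and they have the following utility functions:
  $u_i(\omega_1) = 1$, $u_i(\omega_2) = 1$, $u_i(\omega_3) = 4$, $u_i(\omega_4) = 4$
  $u_j(\omega_1) = 1$, $u_j(\omega_2) = 4$, $u_j(\omega_3) = 1$, $u_j(\omega_4) = 4$
• Written directly over pairs of actions (agent i’s action first):
  $u_i(D,D) = 1$, $u_i(D,C) = 1$, $u_i(C,D) = 4$, $u_i(C,C) = 4$
  $u_j(D,D) = 1$, $u_j(D,C) = 4$, $u_j(C,D) = 1$, $u_j(C,C) = 4$
• Then, agent i’s preferences are:
  $C,C \geq_i C,D >_i D,C \geq_i D,D$
• Agent i prefers all outcomes that arise through C over all outcomes that arise through D.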
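The step from utilities over outcomes to utilities over action pairs can be sketched in Python by composing each utility function with the state transformer. The labels "w1" to "w4" stand for the four outcomes; the function and variable names are illustrative choices.

```python
# State transformer for the environment in which both agents influence the outcome.
tau = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
u_i = {"w1": 1, "w2": 1, "w3": 4, "w4": 4}
u_j = {"w1": 1, "w2": 4, "w3": 1, "w4": 4}

def utility_of_actions(u, tau):
    """Map each action pair (a_i, a_j) to u(tau(a_i, a_j))."""
    return {pair: u[outcome] for pair, outcome in tau.items()}

ui_actions = utility_of_actions(u_i, tau)
uj_actions = utility_of_actions(u_j, tau)

# Agent i prefers every outcome reached by playing C to every outcome reached by playing D.
c_payoffs = [v for (a_i, _), v in ui_actions.items() if a_i == "C"]
d_payoffs = [v for (a_i, _), v in ui_actions.items() if a_i == "D"]
i_prefers_c = min(c_payoffs) > max(d_payoffs)
print(ui_actions, i_prefers_c)
```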
14
Payoff Matrices
We can characterise the previous scenario in a payoff matrix:

                i defects     i cooperates
j defects       i: 1, j: 1    i: 4, j: 1
j cooperates    i: 1, j: 4    i: 4, j: 4

• Agent i is the column player
• Agent j is the row player
• e.g. Top right cell: i cooperates, j defects
(each cell gives the payoff received by i, then the payoff received by j)
15
Game Theory
• A mathematical theory that studies interactions among self-interested agents.
• Essential elements of a game are:
  – Players (2 or more)
  – Some choice of action (strategy)
  – One or more outcomes (someone wins, someone loses)
  – Information
• Suitable for situations where the other agent’s (player’s) behaviour matters.
16
The Prisoner’s Dilemma 1
• 2 men are collectively charged with a crime and held in separate cells. They have no way of communicating with each other or making an agreement. They are told:
  – If one confesses and the other does not, the confessor will be freed and the other jailed for 3 years.
  – If both confess, then each will be jailed for 2 years.
  – If neither confesses, then each will be jailed for 1 year.
• Confessing => defecting (D)
• Not confessing => cooperating (C)
If you were one of the prisoners, what would you do?
Discuss your answer with your neighbour.
17
The Prisoner’s Dilemma 2
Payoff matrix for Prisoner’s Dilemma (agent i is the column player, agent j is the row player):

                i defects     i cooperates
j defects       i: 2, j: 2    i: 1, j: 4
j cooperates    i: 4, j: 1    i: 3, j: 3

• Top left: if both defect, punishment for mutual defection.
• Top right: if i cooperates and j defects, i gets sucker’s payoff of 1 while j gets 4.
• Bottom left: if j cooperates and i defects, j gets sucker’s payoff of 1 while i gets 4.
• Bottom right: reward for mutual cooperation.
→ Numbers in the payoff matrix reflect how good an outcome is for the agent, e.g.
  $u_i(D,D) = 2$, $u_i(D,C) = 4$, $u_i(C,D) = 1$, $u_i(C,C) = 3$
  $u_j(D,D) = 2$, $u_j(D,C) = 1$, $u_j(C,D) = 4$, $u_j(C,C) = 3$
18
The Prisoner’s Dilemma 3
• The individually rational agent will defect!
  – This guarantees a payoff of no worse than 2.
  – Cooperating only guarantees a payoff of no worse than 1.
• Defection is the best response to all possible strategies.
  – Both agents defect and get a payoff = 2.
• If both agents cooperate, they will each get payoff = 3.
• Can we recover cooperation?
  – (The other prisoner is my twin!)
  – The Iterated Prisoner’s Dilemma
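The argument for defection can be checked with a short Python sketch using the payoffs from the Prisoner’s Dilemma matrix. The helper names are illustrative; the keys are (i’s action, j’s action) and the values are (payoff to i, payoff to j).

```python
payoff = {
    ("D", "D"): (2, 2),
    ("D", "C"): (4, 1),
    ("C", "D"): (1, 4),
    ("C", "C"): (3, 3),
}

def guaranteed_payoff_i(action):
    """Worst-case payoff to i for this action, over all actions of j."""
    return min(payoff[(action, a_j)][0] for a_j in "CD")

def best_response_i(a_j):
    """The action of i that maximises i's payoff against a fixed action of j."""
    return max("CD", key=lambda a_i: payoff[(a_i, a_j)][0])

print(guaranteed_payoff_i("D"), guaranteed_payoff_i("C"))  # 2 1
print(best_response_i("C"), best_response_i("D"))          # D D
```

Defecting is i’s best response whatever j does, yet mutual defection pays each agent 2 while mutual cooperation would pay each agent 3.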
19
Let’s take a minute…..
How can we apply the Prisoner’s Dilemma to real
situations?
• e.g. Arms races – nuclear weapons compliance treaty between two countries.
• Can you think of other situations?
20
Strategies
• ”A strategy is the way an agent behaves in an interaction”. (Ref: Rosenschein and Zlotkin, 1994)
  – From game theory: strategies are actions of agents (Ac)
• When 2 agents encounter each other, the important question is: What should I do?
21
Dominance
• Given any particular strategy s (e.g. C or D), there will be a number of outcomes.
• We say that s1 dominates s2 if every outcome possible by i playing s1 is preferred over every outcome possible by i playing s2.
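A minimal Python sketch of this dominance test, assuming ”preferred” means strictly preferred. The function name and the dictionary layout, keyed by (i’s action, j’s action), are illustrative choices.

```python
def dominates(u_i, s1, s2, actions_j):
    """True if every outcome reachable when i plays s1 is strictly preferred
    by i to every outcome reachable when i plays s2."""
    worst_with_s1 = min(u_i[(s1, a_j)] for a_j in actions_j)
    best_with_s2 = max(u_i[(s2, a_j)] for a_j in actions_j)
    return worst_with_s1 > best_with_s2

# Utilities of agent i from the "Agent's Preference" slide.
u_i_example = {("D", "D"): 1, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 4}
print(dominates(u_i_example, "C", "D", "CD"))  # True

# Utilities of agent i in the Prisoner's Dilemma.
u_i_pd = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}
print(dominates(u_i_pd, "D", "C", "CD"))  # False
```

Under this strict reading, D does not dominate C in the Prisoner’s Dilemma, because the worst outcome of defecting (2) is not better than the best outcome of cooperating (3). D is still the best response to each individual action of j.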
22
Nash Equilibrium
• 2 strategies s1 and s2 are in Nash Equilibrium if:
  – Under the assumption that agent i plays s1, agent j can do no better than play s2;
  – Under the assumption that agent j plays s2, agent i can do no better than play s1.
• Neither agent has any incentive to deviate from a Nash Equilibrium.
• Unfortunately:
  – Not every interaction scenario has a Nash Equilibrium in pure strategies.
  – Some interaction scenarios have more than one Nash Equilibrium.
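The two conditions can be turned into a brute-force search over pure strategy pairs. This is a sketch; the function name is illustrative, and the matching-pennies game is a standard textbook example that is not in the slides.

```python
from itertools import product

def pure_nash_equilibria(payoff, actions_i, actions_j):
    """Return all (a_i, a_j) where neither agent gains by deviating alone.

    payoff maps (a_i, a_j) to (payoff to i, payoff to j).
    """
    equilibria = []
    for a_i, a_j in product(actions_i, actions_j):
        u_i, u_j = payoff[(a_i, a_j)]
        i_cannot_improve = all(payoff[(b, a_j)][0] <= u_i for b in actions_i)
        j_cannot_improve = all(payoff[(a_i, b)][1] <= u_j for b in actions_j)
        if i_cannot_improve and j_cannot_improve:
            equilibria.append((a_i, a_j))
    return equilibria

# Prisoner's Dilemma payoffs from the earlier slides.
pd = {("D", "D"): (2, 2), ("D", "C"): (4, 1), ("C", "D"): (1, 4), ("C", "C"): (3, 3)}
print(pure_nash_equilibria(pd, "CD", "CD"))  # [('D', 'D')]

# Matching pennies (not from the slides): no equilibrium in pure strategies.
mp = {("H", "H"): (1, -1), ("H", "T"): (-1, 1), ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
print(pure_nash_equilibria(mp, "HT", "HT"))  # []
```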
23
Nash Equilibrium - Example
The Battle of the Sexes
• Conflict between a man and a woman, where the man wants to go to a Prize Fight and the woman wants to go to a Ballet.
• They are deeply in love. So, they would make a sacrifice to be with each other.

                    Woman: Prize Fight      Woman: Ballet
Man: Prize Fight    man: 2, woman: 1        man: 0, woman: 0
Man: Ballet         man: 0, woman: 0        man: 1, woman: 2

• 2 Nash Equilibria:
  – Strategy combination (Prize Fight, Prize Fight)
  – Strategy combination (Ballet, Ballet)
Ref: ”Games and Information”, E. Rasmusen, 2001
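The two equilibria can be confirmed with a small Python check. The payoffs below assume each person gets 2 at their own preferred event, 1 at the other person’s preferred event, and 0 if they end up apart; the names are illustrative.

```python
# (man's choice, woman's choice) -> (man's payoff, woman's payoff)
payoff = {
    ("Fight", "Fight"): (2, 1),
    ("Fight", "Ballet"): (0, 0),
    ("Ballet", "Fight"): (0, 0),
    ("Ballet", "Ballet"): (1, 2),
}
choices = ("Fight", "Ballet")

def is_equilibrium(man, woman):
    """True if neither person can do better by changing only their own choice."""
    u_man, u_woman = payoff[(man, woman)]
    man_stays = all(payoff[(m, woman)][0] <= u_man for m in choices)
    woman_stays = all(payoff[(man, w)][1] <= u_woman for w in choices)
    return man_stays and woman_stays

equilibria = [(m, w) for m in choices for w in choices if is_equilibrium(m, w)]
print(equilibria)  # [('Fight', 'Fight'), ('Ballet', 'Ballet')]
```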
24
Let’s play a little game…..
Guess half the average
• Choose a number between 0 and 100. Your aim is to choose a number that is closest to half the average of the numbers chosen by all the students.
• What is your number?
25
Competitive and zero-sum Interactions
• One agent can only get a more preferred outcome at the expense of the other agent
  → strictly competitive.
• Zero-sum encounters:
  – $u_i(\omega) + u_j(\omega) = 0$, for all $\omega \in \Omega$.
  – e.g. A football game where only one team can win.
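The zero-sum condition is easy to test mechanically. In this Python sketch the football-style outcomes and their utilities are hypothetical values; the Prisoner’s Dilemma utilities are the ones from the earlier slides.

```python
def is_zero_sum(u_i, u_j):
    """True if u_i(w) + u_j(w) == 0 for every outcome w."""
    return all(u_i[w] + u_j[w] == 0 for w in u_i)

# Hypothetical match outcomes: a win for one team is a loss for the other.
football_i = {"i_wins": 1, "j_wins": -1}
football_j = {"i_wins": -1, "j_wins": 1}
print(is_zero_sum(football_i, football_j))  # True

# Prisoner's Dilemma utilities, keyed by (i's action, j's action).
pd_i = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}
pd_j = {("D", "D"): 2, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 3}
print(is_zero_sum(pd_i, pd_j))  # False
```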
26
Assumptions in Game Theory
• All Players behave rationally
– Not always the case with all agents!
• Each player knows the rules.
• Payoffs are known and fixed.
→ These are limitations!
27
Multi-agent Interaction: Summary
• MAS: a number of agents which interact with one another through
communication.
• An agent’s action results in an outcome in the environment.
• Utility functions are used for preference orderings.
• Game theory – a mathematical theory that studies interactions among
agents.
• An agent’s action is a strategy:
– Dominant
– Nash Equilibrium
28
Negotiation
• ”The process of several agents searching for an agreement”, e.g. about price. (”Rules of Encounter” by Rosenschein and Zlotkin, 1994)
→ Reaching consensus
29
Auction: Example 1
Several millions of $ paid for art
at auction houses such as
Sotheby’s.
Ears 2 u, Vincent!
30
Auction: Example 2
Online Auctions
You want to buy some exciting video games.
You see that there are some available on eBay.
You register at eBay and offer a bid for some of
these games.
31
Auctions
• An Auction takes place between an auctioneer and a collection of bidders.
• Goal is for the auctioneer to allocate the goods to one of the bidders.
• In most settings, the auctioneer desires to maximise the price; bidders desire to minimise the price.
[Figure: auctioneer, bidders, and the price.]
32
Auction Parameters
Value of goods:         private, public/common, correlated
Winner determination:   first price, second price
Bids may be:            open cry, sealed
Bidding may be:         one shot, ascending, descending
33
English Auctions
• English auctions are:
  – First price
  – Open cry
  – Ascending
• Dominant strategy: successively bid a small amount more than the highest current bid until it reaches the valuation, then withdraw.
• Susceptible to Winner’s curse
  – Winner is the one who overvalues the goods on offer and may end up paying more than they are worth.
[Figure: auctioneer, Bidder 1 to Bidder x, and the price.]
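A small Python simulation of the bidding strategy described above. The valuations, the bid increment, and the round-robin bidding order are assumptions made for illustration.

```python
def english_auction(valuations, increment=1, start=0):
    """Simulate an ascending open-cry auction.

    valuations: dict mapping bidder -> private valuation (hypothetical numbers).
    A bidder who is not currently leading raises the price by `increment`
    as long as the new price does not exceed their valuation.
    Returns (winner, price); winner is None if nobody bids.
    """
    price, leader = start, None
    raised = True
    while raised:
        raised = False
        for bidder, value in valuations.items():
            if bidder != leader and price + increment <= value:
                price += increment
                leader = bidder
                raised = True
    return leader, price

print(english_auction({"a": 10, "b": 7, "c": 4}))  # ('a', 8)
```

In this run the bidder with the highest valuation wins and pays just above the second-highest valuation, not their own valuation.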
34
Dutch Auctions
• Dutch auctions are:
  – Open cry
  – Descending
• Auctioneer starts at an artificially high price. Then continually lowers the offer price until an agent makes a bid which is equal to the current offer price.
• Dominant strategy: None
• Susceptible to Winner’s curse
[Figure: auctioneer, bidder, and the price.]
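The descending procedure can be sketched in Python as follows. Each bidder is modelled by a threshold price at which they would accept; these thresholds, the starting price, and the tie-breaking rule are all assumptions. Choosing the threshold is the bidder’s strategic problem, for which the slide notes there is no dominant strategy.

```python
def dutch_auction(thresholds, start_price, decrement=1):
    """Simulate a descending open-cry auction.

    thresholds: dict mapping bidder -> highest price they would accept
    (hypothetical values). The auctioneer lowers the price from start_price
    until some bidder accepts. Ties go to the bidder listed first (assumption).
    Returns (winner, price), or (None, None) if the price reaches zero unsold.
    """
    price = start_price
    while price > 0:
        takers = [b for b, t in thresholds.items() if t >= price]
        if takers:
            return takers[0], price
        price -= decrement
    return None, None

print(dutch_auction({"a": 8, "b": 6}, start_price=20))  # ('a', 8)
```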
35
First-price Sealed-bid Auctions
• One shot auction
• Single round, where bidders submit a sealed bid for the good.
• Good is awarded to agent that made the highest bid.
• Winner pays price of highest bid.
• Best strategy: bid less than true value.
[Figure: auctioneer and bidders.]
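Why bidding below the true value can pay off is illustrated by this Python sketch. The bidder names, valuations and bids are hypothetical numbers.

```python
def first_price_sealed_bid(bids):
    """Highest sealed bid wins; the winner pays their own bid.
    Ties go to the bidder listed first (an assumption)."""
    winner = max(bids, key=bids.get)
    return winner, bids[winner]

def surplus(bidder, valuation, bids):
    """Valuation minus price if `bidder` wins, otherwise 0."""
    winner, price = first_price_sealed_bid(bids)
    return valuation - price if winner == bidder else 0

# Bidder "a" values the good at 10; rival "b" bids 6.
print(surplus("a", 10, {"a": 10, "b": 6}))  # 0: wins, but pays the full valuation
print(surplus("a", 10, {"a": 8, "b": 6}))   # 2: wins with a lower bid
print(surplus("a", 10, {"a": 5, "b": 6}))   # 0: bid too low and loses
```

Bidding the true value never yields a positive surplus here, while a lower bid trades a higher surplus against the risk of losing.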
36
Vickrey Auctions
• Vickrey auctions are:
  – second-price
  – sealed-bid
• Good is awarded to agent that made the highest bid.
• Winner pays price of second highest bid.
• Best strategy: bid the true value.
• Susceptible to anti-social behaviour
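The claim that bidding the true value is the best strategy can be checked by brute force on a small example. In this Python sketch the bidder names, the valuation of 10, the integer bid range, and the tie-breaking rule are assumptions.

```python
def vickrey(bids):
    """Highest bid wins; the winner pays the second-highest bid.
    Needs at least two bids. Ties go to the bidder listed first (assumption)."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    return winner, bids[runner_up]

def surplus(my_bid, my_value, rival_bid):
    """Surplus of bidder "me" (value minus price) if they win, otherwise 0."""
    winner, price = vickrey({"rival": rival_bid, "me": my_bid})
    return my_value - price if winner == "me" else 0

value = 10
# For every rival bid, no alternative bid does better than bidding the true value.
truthful_never_worse = all(
    surplus(value, value, rival) >= surplus(alt, value, rival)
    for rival in range(0, 21)
    for alt in range(0, 21)
)
print(truthful_never_worse)  # True
```

Overbidding can only add wins that cost more than the good is worth, and underbidding can only lose auctions that would have been profitable.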
37
Lies and Collusions
• Lies:
  – By the bidders (e.g. in Vickrey auctions)
  – By the auctioneer (shills, in Vickrey auction)
• Collusion of bidders
  – Coalition of bidders who agree beforehand to put forward artificially low bids for the good on offer. When the good is obtained, the bidders can then get the true value of the good and share the profits.
38
Limitations of Auctions
• Only concerned with the allocation of goods;
• Not adequate for settling agreements that concern matters of mutual interest.
→ Negotiation
39
Let’s take a minute……
• Can you think of any auctions that you have come
across?
• How about offering your notebook to the highest
bidder at the end of the year…..
• Discuss with your neighbour.
40
…..Selecting a Bid
41
Auctions: Summary
• An Auction takes place between an auctioneer and a collection of
bidders.
• In most settings, the auctioneer desires to maximise the price; bidders
desire to minimise the price.
• Types of Auctions:
– English auction
– Dutch auction
– First-price sealed bids
– Vickrey (Second-price sealed bids)
• Useful for allocating goods. But too simple for many other settings.
42
Next Lecture: Negotiation
Will be based on:
”Reaching Agreements”,
Chapter 7 in
Wooldridge: ”Introduction to MultiAgent
Systems”
Coordination – Working together, Chapter 9
43