Dima slides

Computing equilibria of security
games using linear integer
programming
Dima Korzhyk
dima@cs.duke.edu
Problem Description
• Targets $t$
• Utilities $u_d^c(t) \ge u_d^u(t)$, $u_a^u(t) \ge u_a^c(t)$ (subscript d: defender, a: attacker; superscript c: target covered, u: target uncovered)
• Resources $n_d$, $n_a$
Mixed strategies
• Defender strategy: the probability of defending target $t$ is $d_t$, s.t. $\sum_t d_t = n_d$
• Attacker strategy: similarly, the probability of attacking $t$ is $a_t$, s.t. $\sum_t a_t = n_a$
Stackelberg strategy
• The defender commits to a mixed strategy d
• The attacker observes d and plays a best
response a
• Note that the attacker’s best response doesn’t
need to be randomized, so we can assume a is
binary
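The best-response observation can be sketched in Python. This is an illustrative sketch, not code from the talk: the function names `attacker_utilities` and `best_response` and the three-target numbers are hypothetical. It assumes the attacker uses all n_a resources on distinct targets, that utilities add up across attacked targets, and that ties are broken by target index.

```python
def attacker_utilities(d, u_a_c, u_a_u):
    """Expected attacker utility of each target under coverage vector d."""
    return [dt * uc + (1 - dt) * uu for dt, uc, uu in zip(d, u_a_c, u_a_u)]


def best_response(d, u_a_c, u_a_u, n_a):
    """Binary attack vector that hits the n_a most attractive targets."""
    util = attacker_utilities(d, u_a_c, u_a_u)
    # Rank targets by expected utility, highest first (ties: lower index wins).
    ranked = sorted(range(len(util)), key=lambda t: -util[t])
    chosen = set(ranked[:n_a])
    return [1 if t in chosen else 0 for t in range(len(util))]


# Hypothetical instance with three targets and one attacker resource.
d = [0.5, 0.5, 0.0]          # defender coverage probabilities
u_a_c = [-1.0, -2.0, -1.0]   # attacker utility if the target is covered
u_a_u = [4.0, 2.0, 1.0]      # attacker utility if the target is uncovered
a = best_response(d, u_a_c, u_a_u, n_a=1)   # a == [1, 0, 0]
```

Because a pure best response always exists, the attacker variables can be modeled as binary in the Stackelberg program.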
Stackelberg – first formulation
Maximize
$\sum_t a_t [d_t u_d^c(t) + (1 - d_t) u_d^u(t)]$
Subject to
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) < \theta_a \Rightarrow a_t = 0$
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) > \theta_a \Rightarrow a_t = 1$
$\sum_t d_t = n_d$
$\sum_t a_t = n_a$
Trick – fixing an implication with a
binary right-hand side
Suppose b is binary and we have implication
if g(x) > 0 then b=1
Fix this as follows:
b >= g(x) / M
Here M is a sufficiently large number
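A small brute-force check illustrates the trick. This is a sketch under stated assumptions, not code from the talk: the helper names and sample values are hypothetical, and M = 100 is assumed to be an upper bound on every g(x) that is tested.

```python
def implication_holds(g, b):
    """Logical form: if g > 0 then b == 1."""
    return (g <= 0) or (b == 1)


def big_m_holds(g, b, M):
    """Linear form from the slide: b >= g / M."""
    return b >= g / M


M = 100.0  # assumed bound: every sampled g satisfies g <= M
samples = [-50.0, -1.0, 0.0, 0.5, 10.0, 100.0]

# For binary b, the linear constraint accepts exactly the (g, b) pairs
# that the implication accepts.
agree = all(
    implication_holds(g, b) == big_m_holds(g, b, M)
    for g in samples
    for b in (0, 1)
)

# If M is too small, the linear form wrongly rejects a valid pair:
# g = 10 with b = 1 satisfies the implication, but 1 >= 10 / 5 is false.
too_small_m_ok = big_m_holds(10.0, 1, 5.0)
```

The second check shows why M must be at least as large as the largest value g(x) can take.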
Fixed conditions
We can rewrite the implication
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) < \theta_a \Rightarrow a_t = 0$
As
$\theta_a - d_t u_a^c(t) - (1 - d_t) u_a^u(t) > 0 \Rightarrow a_t = 0$
$a_t \le 1 - [\theta_a - d_t u_a^c(t) - (1 - d_t) u_a^u(t)] / M$
$M (1 - a_t) \ge \theta_a - d_t u_a^c(t) - (1 - d_t) u_a^u(t)$
Fixing the objective
Note that the objective contains a product of
two variables $a_t$ and $d_t$
Maximize
$\sum_t a_t [d_t u_d^c(t) + (1 - d_t) u_d^u(t)]$
Trick – fixing a product of a binary and
a continuous variable in the objective
Suppose the objective is
Maximize bx
Where b is binary, and x>=0 is a continuous
variable.
Replace this with
Maximize v
v <= x
v <= bM
Where M is a sufficiently large number
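The replacement can be checked numerically for fixed b and x. This is a sketch with hypothetical names and values, assuming 0 <= x <= M. For fixed b and x, the largest v allowed by the two constraints is min(x, bM), and maximizing v drives it to that value.

```python
def max_v(b, x, M):
    """Largest v satisfying v <= x and v <= b * M."""
    return min(x, b * M)


M = 10.0  # assumed upper bound on x
cases = [(b, x) for b in (0, 1) for x in (0.0, 2.5, 10.0)]

# With 0 <= x <= M the optimal v equals the product b * x.
matches = all(max_v(b, x, M) == b * x for b, x in cases)

# The assumption x >= 0 matters: for b = 0 and x = -3 the constraints
# allow at most v = -3, whereas the product b * x is 0.
negative_case = max_v(0, -3.0, M)
```

In the security game setting this suggests shifting the defender utilities so that they are non-negative before applying the trick.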
MIP for computing a Stackelberg
strategy
Maximize
$\sum_t v_t$
Subject to
$v_t \le d_t u_d^c(t) + (1 - d_t) u_d^u(t)$
$v_t \le a_t M$
$\theta_a - d_t u_a^c(t) - (1 - d_t) u_a^u(t) \le M (1 - a_t)$
$M a_t \ge d_t u_a^c(t) + (1 - d_t) u_a^u(t) - \theta_a$
$\sum_t d_t = n_d$
$\sum_t a_t = n_a$
$0 \le d_t \le 1$
$a_t$ binary
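A candidate solution of the Stackelberg MIP can be checked constraint by constraint. The sketch below is illustrative only: the function name, the tolerance, and the two-target instance are hypothetical, and it assumes the resource constraints are equalities (both players use all their resources) and that defender utilities are non-negative, as the product trick requires.

```python
def stackelberg_mip_feasible(d, a, v, theta_a, u_d_c, u_d_u, u_a_c, u_a_u,
                             n_d, n_a, M, tol=1e-9):
    """Check (d, a, v, theta_a) against the Stackelberg MIP constraints."""
    for t in range(len(d)):
        def_util = d[t] * u_d_c[t] + (1 - d[t]) * u_d_u[t]
        att_util = d[t] * u_a_c[t] + (1 - d[t]) * u_a_u[t]
        ok = (
            a[t] in (0, 1)                                    # a_t binary
            and -tol <= d[t] <= 1 + tol                       # 0 <= d_t <= 1
            and v[t] <= def_util + tol                        # v_t <= defender utility
            and v[t] <= a[t] * M + tol                        # v_t <= a_t M
            and theta_a - att_util <= M * (1 - a[t]) + tol    # below threshold => a_t = 0
            and M * a[t] >= att_util - theta_a - tol          # above threshold => a_t = 1
        )
        if not ok:
            return False
    # Assumed equality form of the resource constraints.
    return abs(sum(d) - n_d) <= tol and sum(a) == n_a


# Hypothetical two-target instance with one resource per player.
u_d_c, u_d_u = [3.0, 3.0], [1.0, 0.0]
u_a_c, u_a_u = [0.0, 0.0], [4.0, 2.0]
d = [0.5, 0.5]   # attacker utilities are then 2.0 and 1.0

# Attacking target 0 is consistent with the threshold theta_a = 1.5.
good = stackelberg_mip_feasible(d, [1, 0], [2.0, 0.0], 1.5,
                                u_d_c, u_d_u, u_a_c, u_a_u, 1, 1, M=100.0)
# Attacking target 1 is not a best response, so the constraints reject it.
bad = stackelberg_mip_feasible(d, [0, 1], [0.0, 1.5], 1.5,
                               u_d_c, u_d_u, u_a_c, u_a_u, 1, 1, M=100.0)
```

A checker like this is useful for validating solver output on small instances; it does not search for the optimal defender strategy.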
Nash equilibrium
Find a pair of mixed strategies d,a such that both
players are best-responding. d,a are not binary.
Same attacker constraints
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) < \theta_a \Rightarrow a_t = 0$
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) > \theta_a \Rightarrow a_t = 1$
Similar defender constraints
$a_t (u_d^c(t) - u_d^u(t)) > \theta_d \Rightarrow d_t = 1$
$a_t (u_d^c(t) - u_d^u(t)) < \theta_d \Rightarrow d_t = 0$
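The threshold conditions can be tested directly for a given pair of mixed strategies. The sketch below is an illustration, not the method from the talk: the names `threshold_exists` and `satisfies_ne_conditions` and the symmetric two-target game are hypothetical, and the resource constraints are assumed to be equalities. A threshold exists exactly when every target played with probability below 1 is worth no more than every target played with positive probability.

```python
def threshold_exists(values, probs, tol=1e-9):
    """True if some theta satisfies: value < theta => prob == 0 and
    value > theta => prob == 1, for every target."""
    # theta must be at least the value of any target with prob < 1 ...
    lower = max((v for v, p in zip(values, probs) if p < 1 - tol),
                default=float("-inf"))
    # ... and at most the value of any target with prob > 0.
    upper = min((v for v, p in zip(values, probs) if p > tol),
                default=float("inf"))
    return lower <= upper + tol


def satisfies_ne_conditions(d, a, u_d_c, u_d_u, u_a_c, u_a_u, n_d, n_a,
                            tol=1e-9):
    """Check the attacker and defender threshold conditions for (d, a)."""
    att_values = [dt * uc + (1 - dt) * uu
                  for dt, uc, uu in zip(d, u_a_c, u_a_u)]
    def_values = [at * (uc - uu) for at, uc, uu in zip(a, u_d_c, u_d_u)]
    sums_ok = abs(sum(d) - n_d) <= tol and abs(sum(a) - n_a) <= tol
    return (sums_ok
            and threshold_exists(att_values, a, tol)
            and threshold_exists(def_values, d, tol))


# Hypothetical symmetric game: two identical targets, one resource each.
u_d_c, u_d_u = [0.0, 0.0], [-1.0, -1.0]
u_a_c, u_a_u = [0.0, 0.0], [1.0, 1.0]

uniform = satisfies_ne_conditions([0.5, 0.5], [0.5, 0.5],
                                  u_d_c, u_d_u, u_a_c, u_a_u, 1, 1)
# If the defender always covers target 0, attacking target 0 is not a
# best response, so this pair fails the attacker condition.
pure = satisfies_ne_conditions([1.0, 0.0], [1.0, 0.0],
                               u_d_c, u_d_u, u_a_c, u_a_u, 1, 1)
```

Unlike in the Stackelberg program, both `d` and `a` may be fractional here, which is why the implications need the either-or encoding rather than the simpler binary trick.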
Trick – fixing the implications
[aimms.com]
Suppose we have an implication
IF [g(x) > 0] THEN [h(x) > 0]
This is equivalent to
[g(x) <= 0] OR [h(x) > 0]
Now, we can use the either-or trick (next slide)
Either-Or Trick
Suppose we have a constraint
[g(x)>0] OR [h(x)>0]
We need to change this to AND
Replace with the following:
g(x) + M b > 0
h(x) + M (1-b) > 0
where M is a sufficiently large number
b is binary
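The equivalence can be verified by enumeration. This is a sketch with hypothetical helper names and sample values, assuming M is large enough that g(x) > -M and h(x) > -M for every tested point.

```python
def either_or(g, h):
    """Original disjunction: g > 0 or h > 0."""
    return g > 0 or h > 0


def big_m_feasible(g, h, M):
    """True if some binary b satisfies both linear constraints."""
    return any(
        g + M * b > 0 and h + M * (1 - b) > 0
        for b in (0, 1)
    )


M = 100.0  # assumed: every sampled value is greater than -M
values = [-99.0, -1.0, 0.0, 1.0, 50.0]

# The pair of big-M constraints is feasible for some b exactly when
# the disjunction holds.
agree = all(
    either_or(g, h) == big_m_feasible(g, h, M)
    for g in values
    for h in values
)
```

Setting b = 1 switches off the first constraint and b = 0 switches off the second, so at least one of the original conditions must hold.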
Fixing the implications in the NE
program
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) < \theta_a \Rightarrow a_t = 0$
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) \ge \theta_a$ OR $a_t \le 0$
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) \ge \theta_a - M b_t$
$a_t \le 0 + M (1 - b_t)$
$0 \le a_t \le 1$
$b_t$ binary
MIP for computing a Nash equilibrium
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) \ge \theta_a - b_t^{a0} M$
$a_t \le 0 + (1 - b_t^{a0}) M$
$d_t u_a^c(t) + (1 - d_t) u_a^u(t) \le \theta_a + b_t^{a1} M$
$a_t \ge 1 - (1 - b_t^{a1}) M$
$a_t (u_d^c(t) - u_d^u(t)) \ge \theta_d - b_t^{d0} M$
$d_t \le 0 + (1 - b_t^{d0}) M$
$a_t (u_d^c(t) - u_d^u(t)) \le \theta_d + b_t^{d1} M$
$d_t \ge 1 - (1 - b_t^{d1}) M$
$\sum_t d_t = n_d$
$\sum_t a_t = n_a$
$0 \le d_t \le 1$
$0 \le a_t \le 1$
$b_t^{a0}, b_t^{a1}, b_t^{d0}, b_t^{d1}$ binary
Scaling of the NE MIP in the number of
targets/variables
[Plot: axis ticks run from 0 to 160 on one axis and from 0 to 0.03 on the other; axis labels and data points were not preserved.]
Conclusion
• NE MIP is simpler than a specialized algorithm,
but may take longer to execute
• A benefit is that we can combine the two MIPs
in this presentation to find a defender strategy
that is both a Stackelberg and a Nash strategy
Thank you!
I thank Vince Conitzer and Ron Parr for detailed
discussions.
References
Security games (with a single attacker resource) were introduced in: Christopher
Kiekintveld, Manish Jain, Jason Tsai, James Pita, Fernando Ordóñez, and Milind Tambe. Computing Optimal
Randomized Resource Allocations for Massive Security Games. AAMAS-2009.
Computing a Stackelberg strategy in a security game with multiple attacker resources is NP-hard: Zhengyu
Yin, Dmytro Korzhyk, Christopher Kiekintveld, Vincent Conitzer, and Milind Tambe. Stackelberg vs. Nash in
security games: interchangeability, equivalence, and uniqueness. AAMAS-2010.
Computing a Nash equilibrium in security games with multiple attacker resources can be done in polynomial
time. The algorithm by Dmytro Korzhyk, Vincent Conitzer, Ronald Parr is currently unpublished and is under
submission.
Some of the integer programming tricks in this presentation were taken from AIMMS Modeling Guide
http://www.aimms.com/aimms/download/manuals/aimms3om_integerprogrammingtricks.pdf