Dima slides

Computing equilibria of security games using linear integer programming
Dima Korzhyk
dima@cs.duke.edu

Problem Description
• Targets $t$
• Utilities $u_d^c(t) \ge u_d^u(t)$, $u_a^u(t) \ge u_a^c(t)$
• Resources $n_d$, $n_a$

Mixed strategies
• Defender strategy: the probability of defending target $t$ is $d_t$, s.t. $\sum_t d_t = n_d$
• Attacker strategy: similarly, the probability of attacking $t$ is $a_t$, s.t. $\sum_t a_t = n_a$

Stackelberg strategy
• The defender commits to a mixed strategy $d$
• The attacker observes $d$ and plays a best response $a$
• Note that the attacker's best response doesn't need to be randomized, so we can assume $a$ is binary

Stackelberg: first formulation
Maximize
$$\sum_t a_t \left[ d_t u_d^c(t) + (1 - d_t) u_d^u(t) \right]$$
Subject to
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) < \theta_a \Rightarrow a_t = 0$$
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) > \theta_a \Rightarrow a_t = 1$$
$$\sum_t d_t = n_d$$
$$\sum_t a_t = n_a$$

Trick: fixing an implication with a binary right-hand side
Suppose $b$ is binary and we have the implication
if g(x) > 0 then b = 1
Fix this as follows:
b >= g(x) / M
Here M is a sufficiently large number

Fixed conditions
We can rewrite the implication
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) < \theta_a \Rightarrow a_t = 0$$
as
$$\theta_a - d_t u_a^c(t) - (1 - d_t) u_a^u(t) > 0 \Rightarrow a_t = 0$$
$$a_t \le 1 - \left[ \theta_a - d_t u_a^c(t) - (1 - d_t) u_a^u(t) \right] / M$$
$$M (1 - a_t) \ge \theta_a - d_t u_a^c(t) - (1 - d_t) u_a^u(t)$$

Fixing the objective
Note that the objective contains a product of two variables, $a_t$ and $d_t$:
Maximize
$$\sum_t a_t \left[ d_t u_d^c(t) + (1 - d_t) u_d^u(t) \right]$$

Trick: fixing a product of a binary and a continuous variable in the objective
Suppose the objective is
Maximize bx
where b is binary and x >= 0 is a continuous variable.
Replace this with
Maximize v
v <= x
v <= bM
where M is a sufficiently large number

MIP for computing a Stackelberg strategy
Maximize
$$\sum_t v_t$$
Subject to
$$v_t \le d_t u_d^c(t) + (1 - d_t) u_d^u(t)$$
$$v_t \le a_t M$$
$$\theta_a - d_t u_a^c(t) - (1 - d_t) u_a^u(t) \le M (1 - a_t)$$
$$M a_t \ge d_t u_a^c(t) + (1 - d_t) u_a^u(t) - \theta_a$$
$$\sum_t d_t = n_d$$
$$\sum_t a_t = n_a$$
$$0 \le d_t \le 1$$
$$a_t \text{ binary}$$

Nash equilibrium
Find a pair of mixed strategies $d$, $a$ such that both players are best-responding. Here $d$ and $a$ are not binary.
Same attacker constraints:
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) < \theta_a \Rightarrow a_t = 0$$
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) > \theta_a \Rightarrow a_t = 1$$
Similar defender constraints:
$$a_t \left( u_d^c(t) - u_d^u(t) \right) > \theta_d \Rightarrow d_t = 1$$
$$a_t \left( u_d^c(t) - u_d^u(t) \right) < \theta_d \Rightarrow d_t = 0$$

Trick: fixing the implications [aimms.com]
Suppose we have an implication
IF [g(x) > 0] THEN [h(x) > 0]
This is equivalent to
[g(x) <= 0] OR [h(x) > 0]
Now we can use the either-or trick (next slide)

Either-Or Trick
Suppose we have a constraint
[g(x) > 0] OR [h(x) > 0]
We need to change this to AND. Replace it with the following:
g(x) + M b > 0
h(x) + M (1-b) > 0
where M is a sufficiently large number and b is binary

Fixing the implications in the NE program
The implication
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) < \theta_a \Rightarrow a_t = 0$$
is equivalent to
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) \ge \theta_a \quad \text{OR} \quad a_t = 0$$
which becomes
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) \ge \theta_a - M b_t$$
$$a_t \le 0 + M (1 - b_t)$$
$$0 \le a_t \le 1$$
$$b_t \in \{0, 1\}$$

MIP for computing a Nash equilibrium
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) \ge \theta_a - b_t^{a0} M$$
$$a_t \le 0 + (1 - b_t^{a0}) M$$
$$d_t u_a^c(t) + (1 - d_t) u_a^u(t) \le \theta_a + b_t^{a1} M$$
$$a_t \ge 1 - (1 - b_t^{a1}) M$$
$$a_t \left( u_d^c(t) - u_d^u(t) \right) \ge \theta_d - b_t^{d0} M$$
$$d_t \le 0 + (1 - b_t^{d0}) M$$
$$a_t \left( u_d^c(t) - u_d^u(t) \right) \le \theta_d + b_t^{d1} M$$
$$d_t \ge 1 - (1 - b_t^{d1}) M$$
$$\sum_t d_t = n_d$$
$$\sum_t a_t = n_a$$
$$0 \le d_t \le 1$$
$$0 \le a_t \le 1$$
$$b_t^{a0}, b_t^{a1}, b_t^{d0}, b_t^{d1} \in \{0, 1\}$$

Scaling of the NE MIP in the number of targets/variables
[Figure: plot with horizontal axis ticks from 0 to 160 and vertical axis ticks from 0 to 0.03; axis labels and data are not recoverable from the extracted text.]

Conclusion
• The NE MIP is simpler than a specialized algorithm, but may take longer to execute
• A benefit is that we can combine the two MIPs in this presentation
to find a defender strategy that is both a Stackelberg and a Nash strategy

Thank you!
I thank Vince Conitzer and Ron Parr for detailed discussions.

References
Security games (with a single attacker resource) were introduced in:
Christopher Kiekintveld, Manish Jain, Jason Tsai, James Pita, Fernando Ordóñez, and Milind Tambe. Computing Optimal Randomized Resource Allocations for Massive Security Games. AAMAS-2009.

Computing a Stackelberg strategy in a security game with multiple attacker resources is NP-hard:
Zhengyu Yin, Dmytro Korzhyk, Christopher Kiekintveld, Vincent Conitzer, and Milind Tambe. Stackelberg vs. Nash in security games: interchangeability, equivalence, and uniqueness. AAMAS-2010.

Computing a Nash equilibrium in security games with multiple attacker resources can be done in polynomial time. The algorithm by Dmytro Korzhyk, Vincent Conitzer, and Ronald Parr is currently unpublished and under submission.

Some of the integer programming tricks in this presentation were taken from the AIMMS Modeling Guide:
http://www.aimms.com/aimms/download/manuals/aimms3om_integerprogrammingtricks.pdf
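
The three big-M tricks used in the slides (the implication with a binary right-hand side, the product of a binary and a continuous variable, and the either-or constraint) can be sanity-checked by brute force on a small grid. The following is a minimal Python sketch and not part of the original presentation; the function names, the test grid, and the value M = 100 are assumptions chosen for illustration, and the checks only hold while M exceeds every tested magnitude.

```python
from itertools import product

M = 100.0  # assumed big-M; must exceed every |g(x)|, |h(x)| and x tested below
GRID = [i / 2 for i in range(-20, 21)]  # hypothetical test values in [-10, 10]


def implication_logical(g, b):
    """Original condition: if g > 0 then b = 1."""
    return g <= 0 or b == 1


def implication_linear(g, b, big_m=M):
    """Linear replacement from the slides: b >= g / M."""
    return b >= g / big_m


def product_linear(x, b, big_m=M):
    """Optimal v of 'maximize v s.t. v <= x, v <= b*M' (the tighter bound)."""
    return min(x, b * big_m)


def either_or_logical(g, h):
    """Original condition: g > 0 OR h > 0."""
    return g > 0 or h > 0


def either_or_linear(g, h, big_m=M):
    """Feasible iff some binary b gives g + M*b > 0 AND h + M*(1-b) > 0."""
    return any(g + big_m * b > 0 and h + big_m * (1 - b) > 0 for b in (0, 1))


# Compare each logical condition with its linear replacement on the grid.
implication_ok = all(
    implication_logical(g, b) == implication_linear(g, b)
    for g, b in product(GRID, (0, 1))
)
product_ok = all(
    product_linear(x, b) == b * x
    for x, b in product([x for x in GRID if x >= 0], (0, 1))
)
either_or_ok = all(
    either_or_logical(g, h) == either_or_linear(g, h)
    for g, h in product(GRID, GRID)
)

print(implication_ok, product_ok, either_or_ok)
```

The product check deliberately iterates only over x >= 0: for x = -1 and b = 0 the linearized objective yields v = -1 rather than bx = 0, which is why the slide requires x >= 0.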
