Game Theory

Methods of PPE I
Part Game Theory

Introduction

Decision theory is the theory of rational decision making. We can distinguish between:
- Normative decision theory, which studies what rational decision makers ought to do;
- Descriptive decision theory, which tries to explain what decision makers actually do.

Decision matrices and decision trees are models that describe a decision situation for a decision maker. Decision matrices are usually used for one-shot decision situations, while decision trees are used for sequential (or dynamic) decision situations.

When we describe (or model) a decision situation, we distinguish three levels of abstraction:
1. The decision problem
2. A formalisation of the decision problem
3. A visualization of the formalisation

The model of a decision problem contains the following elements:
1. States
2. Acts or strategies
3. Outcomes
4. Preferences over outcomes
5. Information of the decision maker

Ad 1. A state represents everything that the decision maker has no influence on.

Ad 2. In a one-shot decision model, a decision maker can choose an act from a set $A = \{a_1, a_2, \dots, a_p\}$ of acts. So, an act represents what the agent chooses or does. In a sequential decision model, a decision maker has to choose a sequence of acts, where the set of acts available at some moment may depend on the acts chosen in the past. A sequence of acts is called a strategy.

Ad 3. An outcome is the result of an act (in a one-shot decision model) or a strategy (in a sequential decision model).

Ad 4. The outcomes can be ordered based on a preference relation. This preference relation states, for every two outcomes, whether one is preferred to the other or not. In the latter case, the decision maker is either indifferent between the two outcomes or cannot compare them. (We say more about this in the coming slides.)

Ad 5. Given the preferences over the outcomes, the decision maker chooses his/her act or strategy from the set of acts.
This can be done using different criteria or objectives. It also depends on the information of the decision maker: is it decision making under uncertainty, risk or ignorance?

Transformation of scales

We can distinguish three types of scales:
1. Ordinal scale: based on the numbers you can compare the outcomes, but the difference or ratio between these numbers has no meaning.
2. Cardinal scales:
   a. Interval scale: differences have meaning, but ratios do not.
   b. Ratio scale: differences and ratios both have meaning.

A transformation of a scale must maintain the following:
1. Ordinal scale: the preference order, e.g. A is better than B, and C is better than D.
2. Interval scale: the numerical differences, e.g. a difference of 160 between states.
3. Ratio scale: the ratios of differences, e.g. the difference in state B should be twice the difference in A.

Expected utility

Expected value (EV):
- Expected monetary value (EMV): $\sum_{x=1}^{n} p_x m_x = p_1 m_1 + p_2 m_2 + \dots + p_n m_n$
- Expected utility (EU): $\sum_{x=1}^{n} p_x u_x = p_1 u_1 + p_2 u_2 + \dots + p_n u_n$

Risk attitude:
- Risk averse: diminishing marginal utility
- Risk neutral: constant marginal utility
- Risk seeking: increasing marginal utility

Axiomatization of expected utility:

EU1: If all outcomes of an act have utility $u$, then the utility of the act is $u$.

EU2: If one act is certain to lead to better outcomes under all states than another, then the utility of the first act exceeds that of the latter; and if both acts lead to equal outcomes, they have the same utility.

EU3: Every decision problem can be transformed into a decision problem with equally probable states, in which the utility of all acts is preserved.

EU4: If two outcomes are equally probable, and if the better outcome is made slightly worse, then this can be compensated for by adding some amount of utility to the other outcome, such that the overall utility of the act is preserved.
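As a numerical illustration of the EMV and EU formulas and of a risk-averse attitude, here is a minimal sketch. The lottery and the square-root utility function are assumptions chosen for illustration only; they do not come from the lecture material.

```python
import math

def expected_value(probs, values):
    """Return sum_x p_x * v_x for a lottery given as two parallel lists."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * v for p, v in zip(probs, values))

# Hypothetical lottery: 50% chance of 0 and 50% chance of 100 (monetary units).
probs = [0.5, 0.5]
money = [0.0, 100.0]

# Expected monetary value: sum of p_x * m_x.
emv = expected_value(probs, money)

# Assumed utility function u(m) = sqrt(m). It is concave, so marginal
# utility diminishes, which models a risk-averse decision maker.
utilities = [math.sqrt(m) for m in money]
eu = expected_value(probs, utilities)

# Certainty equivalent: the sure amount whose utility equals the expected utility.
certainty_equivalent = eu ** 2

print(emv, eu, certainty_equivalent)
```

Under these assumptions the certainty equivalent is below the expected monetary value, which is what risk aversion means: the decision maker would accept a smaller sure amount instead of the lottery.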
Theorem 4.1 If axioms EU1-EU4 hold for all decisions under risk, then the utility of an act equals its expected utility.

© 2020 Patrick Hup

Preference relations

Consider a set of alternatives $X$. We use the following relations:
- $\succ$ (strict preference)
- $\succeq$ ("... at least as good as ...")
- $\sim$ (indifference)

Properties of preference relations:
- Complete: if and only if for all $x, y \in X$ it holds that $x \succ y$ or $x \sim y$ or $y \succ x$. In words, a preference relation is complete if any two alternatives $x$ and $y$ can be compared to each other: $x$ is better than $y$, or $y$ is better than $x$, or the decision maker is indifferent between $x$ and $y$.
- Asymmetric: if and only if for all $x, y \in X$ it holds that if $x \succ y$ then $\neg (y \succ x)$. In words, a preference relation is asymmetric if for any two alternatives at most one can be (strictly) better than the other.
- Transitive: if and only if for all $x, y, z \in X$ it holds that [$x \succ y$ and $y \succ z$] implies [$x \succ z$]. In words, a preference relation is transitive if, whenever the decision maker prefers alternative $x$ to alternative $y$ and prefers alternative $y$ to alternative $z$, he/she also prefers alternative $x$ to alternative $z$.
- Negatively transitive: if and only if for all $x, y, z \in X$ it holds that [$\neg (x \succ y)$ and $\neg (y \succ z)$] implies [$\neg (x \succ z)$].

Utility functions

Definition A real-valued function $u$ is a utility function representing the preference relation $\succ$ if and only if:
$x \succ y$ if and only if $u(x) > u(y)$.

Remark: such a utility function measures on an ordinal utility scale.

Theorem 5.1 Let $X$ be a finite set of outcomes. Preference relation $\succ$ can be represented by a utility function $u$ if and only if $\succ$ is complete, asymmetric and negatively transitive on $X$.

Axioms

vNM1 Preference relation $\succ$ is complete if and only if for all $A, B \in L$ it holds that: $A \succ B$ or $A \sim B$ or $B \succ A$.

vNM2 Preference relation $\succ$ is transitive if and only if for all $A, B, C \in L$ it holds that: [$A \succ B$ and $B \succ C$] implies [$A \succ C$].

Remark: Note that these are similar to the properties defined before, but now over lotteries.
A new axiom:

vNM3 Preference relation $\succ$ satisfies independence if and only if for all $A, B, C \in L$ and $0 < p \le 1$ it holds that: [$A \succ B$] if and only if [$ApC \succ BpC$].

In words, if you prefer lottery $A$ to lottery $B$, then you prefer any lottery between $A$ and a third lottery $C$ to the lottery between $B$ and $C$ with the same probabilities.

Theorem 5.2 Preference relation $\succ$ satisfies vNM1-vNM4 if and only if it can be represented by a utility function $u$ satisfying:
(i) $A \succ B$ if and only if $u(A) > u(B)$
(ii) $u(ApB) = p\,u(A) + (1 - p)\,u(B)$
(iii) for every other function $u'$ satisfying (i) and (ii) there are numbers $c > 0$ and $d$ such that $u' = c \cdot u + d$

A utility function as in Theorem 5.2 is called a von Neumann-Morgenstern (expected) utility function. This theory is fundamental to the development of decision theory and, in particular, game theory. When the preferences of a decision maker satisfy axioms vNM1-vNM4, the decision maker can base his/her decision on maximizing expected utility.

Decisions under ignorance

In decision making under risk, the decision maker knows the probabilities of the possible outcomes. In decision making under ignorance, the decision maker does not know the probabilities of the possible outcomes (or these probabilities do not exist). So, when making decisions under risk, you can use the probability distribution to calculate the expected payoff. This cannot be done when making decisions under ignorance.

What to do if you do not know the probabilities of the states? There are several criteria that can be used.

Strong dominance

$a_i \succ a_j$ if and only if $v(a_i, s) \ge v(a_j, s)$ for every state $s$, and there is at least one state $s_n$ such that $v(a_i, s_n) > v(a_j, s_n)$.

We say that $a_i$ is a strongly dominant act if $a_i \succ a_j$ for all $a_j \ne a_i$. If a rational decision maker uses this criterion, then he/she chooses the strongly dominant act (if it exists).
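The strong dominance criterion can be sketched as a check on a decision matrix. The matrix below is hypothetical (rows are acts, columns are states, entries are the values $v(a, s)$), and the function names are illustrative.

```python
def strongly_dominates(a, b):
    """True if a is at least as good as b in every state and strictly
    better in at least one state (the definition of strong dominance above)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better_somewhere = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better_somewhere

def strongly_dominant_acts(acts):
    """Return the acts that strongly dominate every other act."""
    return [
        name
        for name, row in acts.items()
        if all(
            strongly_dominates(row, other_row)
            for other_name, other_row in acts.items()
            if other_name != name
        )
    ]

# Hypothetical decision matrix with three acts and three states.
acts = {
    "a1": [4, 3, 2],
    "a2": [4, 1, 2],
    "a3": [0, 1, 1],
}

print(strongly_dominant_acts(acts))
```

In this made-up matrix, a1 is never worse than a2 or a3 and is strictly better in at least one state against each, so it is the strongly dominant act. Many decision matrices have no such act, in which case the list is empty.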
Similarly for weak dominance: $a_i \succeq a_j$ if and only if $v(a_i, s) \ge v(a_j, s)$ for every state $s$.

If there is no (strongly) dominant act, then there can still be a (strongly) dominated act. We say that act $a_j$ is strongly dominated by act $a_i$ if $a_i \succ a_j$. If a rational decision maker uses this criterion, then he/she does not choose a strongly dominated act. Similarly for weak dominance.

Notice that $a_i$ is a strongly dominant act if and only if all other acts are strongly dominated by act $a_i$.

Decision rules

Maximin

Maximize the minimal value obtainable with each act.
min(act a) = min(10, 5, −5, −10) = −10
min(act b) = min(6, 5, x, −5) = −5 (assuming x ≥ −5)
min(act c) = min(2, 2, 2, 2) = 2
So, maximin chooses 'act c', the act with the largest minimum.

Leximin

If the worst outcomes are equal, one should choose an alternative in which the second worst outcome is as good as possible.
min(act a) = min(−5, 6, 9) = −5
min(act b) = min(5, 12, −7) = −7
min(act c) = min(−10, 5, 0) = −10
min(act d) = min(10, −5, 7) = −5
So, it is act a or act d. Compare their second worst outcomes:
Act a: second worst of (−5, 6, 9) = 6
Act d: second worst of (10, −5, 7) = 7
So, choose 'act d'. Note that if the second worst outcomes are also equal, compare the third worst outcomes, etc.

Maximax

Maximize the maximal value obtainable with an act.
max(act a) = max(−5, 6, 9) = 9
max(act b) = max(5, 12, −7) = 12
max(act c) = max(−10, 5, 0) = 5
max(act d) = max(10, −5, 7) = 10
So, choose 'act b'.

Optimism-Pessimism Rule

Consider both the best and the worst possible outcome of each alternative, and then choose an alternative according to the decision maker's degree of optimism or pessimism.
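The rules above can be sketched in a few lines. The decision matrix is the one from the leximin example; the function names and the dictionary layout are assumptions for illustration, and the degree of optimism 0.7 is just an example value.

```python
# Decision matrix from the leximin example: one list of outcomes per act.
acts = {
    "a": [-5, 6, 9],
    "b": [5, 12, -7],
    "c": [-10, 5, 0],
    "d": [10, -5, 7],
}

def maximin(acts):
    """Acts whose worst outcome is as good as possible (all ties returned)."""
    best = max(min(values) for values in acts.values())
    return [name for name, values in acts.items() if min(values) == best]

def leximin(acts):
    """Compare worst outcomes first, then second worst, and so on.
    Sorted outcome lists are compared lexicographically by Python."""
    return max(acts, key=lambda name: sorted(acts[name]))

def maximax(acts):
    """Acts whose best outcome is as good as possible (all ties returned)."""
    best = max(max(values) for values in acts.values())
    return [name for name, values in acts.items() if max(values) == best]

def optimism_pessimism(acts, alpha):
    """Weigh the best outcome by alpha (degree of optimism, between 0 and 1)
    and the worst outcome by 1 - alpha, then pick the act with the highest value."""
    def value(name):
        return alpha * max(acts[name]) + (1 - alpha) * min(acts[name])
    return max(acts, key=value)

print(maximin(acts), leximin(acts), maximax(acts), optimism_pessimism(acts, 0.7))
```

Note that `leximin` and `optimism_pessimism` return a single act; if two acts were tied on every comparison, this sketch would simply return the first one it encounters.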
The decision maker is optimistic to degree 0.7.

Act a: max(act a) = max(−5, 6, 9) = 9; min(act a) = min(−5, 6, 9) = −5; total value = 0.7 · 9 + (1 − 0.7) · (−5) = 4.8
Act b: max(act b) = max(5, 12, −7) = 12; min(act b) = min(5, 12, −7) = −7; total value = 0.7 · 12 + (1 − 0.7) · (−7) = 6.3
Act c: max(act c) = max(−10, 5, 0) = 5; min(act c) = min(−10, 5, 0) = −10; total value = 0.7 · 5 + (1 − 0.7) · (−10) = 0.5
Act d: max(act d) = max(10, −5, 7) = 10; min(act d) = min(10, −5, 7) = −5; total value = 0.7 · 10 + (1 − 0.7) · (−5) = 5.5
So, choose 'act b'.

Minimax Regret

Choose an alternative under which the maximum regret value is as low as possible. Regret is calculated by subtracting the value of the best outcome of each state from the value of the outcome in question.

The best outcome in each state is:
max(Growing economy) = max(8, −10, 0) = 8
max(Shrinking economy) = max(4, −3, 0) = 4
max(Stable economy) = max(2, 2, 2) = 2

[Regret matrix not reproduced.] Comparing the largest regret of each act gives max(−12, −5, −6) = −5. So, according to the minimax regret criterion, choose 'Firm B'.

The Principle of Insufficient Reason

If one has no reason to think that one state of the world is more probable than another, then all states should be assigned equal probability.

No lockdown: $\frac{1}{4}(10 + 5 + (-5) + (-10)) = 0$
Partial lockdown: $\frac{1}{4}(6 + 5 + x + (-5)) = \frac{6 + x}{4}$
Lockdown: $\frac{1}{4}(2 + 2 + 2 + 2) = 2$

So, partial lockdown is chosen according to the Principle of Insufficient Reason if $\frac{6 + x}{4} > 2$, so $x > 2$.

Randomized acts

[Insert theory]

Game Theory

Game Theory (also called Interdependent or Interactive Decision Theory) studies situations with more than one decision maker, where the decision makers are aware of the influence of their actions on the choice behaviour of the other decision makers. In game theory we refer to the decision makers as players.

A game needs to describe the following elements:
- Players: decision makers
- Rules: which player moves when?
- Actions: possible moves of a player
- Outcomes: for each set of actions (or strategies)
- Payoff: the utility a player receives
- Information: the knowledge a player has of the relevant variables at a certain point in the game

Taxonomy:
- In zero-sum games, the sum of all payoffs is equal to zero in every outcome of the game. These games reflect strong competition. In nonzero-sum games, different outcomes of the game can have different total sums of payoffs.
- Noncooperative games: no previous binding commitments. Cooperative games: binding agreements during pre-play negotiations.
- Simultaneous-move games: each player moves once, without knowing the moves of the other players. A move is called an action. Sequential-move games: players do not all move at the same time, and a player may move more than once. Each move is called an action, and a contingent plan in which a player chooses an action at every moment he/she can make a move is called a strategy.

Normal form games

A simultaneous-move game is usually modelled as a normal form game.

Definition A normal form game consists of
(i) a set of players $N = \{1, \dots, n\}$,
(ii) for every player $i \in N$ a set of (pure) strategies $S_i$, and
(iii) for every player $i \in N$ a payoff (utility) function $u_i$ over the possible strategy profiles (outcomes of the game).

Here $u_i(s_1, s_2, \dots, s_n)$ is the payoff for player $i$ when player 1 plays $s_1$, player 2 plays $s_2$, etc. So, the payoff of a player depends on the strategies chosen by all players. This interactive element is essential in game theory.

We assume that the payoff functions are von Neumann-Morgenstern utility functions, see Chapter 5/Lecture DT2A.

A tuple $s = (s_1, s_2, \dots, s_n)$ is called a strategy profile. It consists of $n$ strategies, one for each player. Let $S = \prod_{i \in N} S_i$ be the space of all strategy profiles. Sometimes we write a strategy profile as $s = (s_i, s_{-i})$, with $s_{-i} \in S_{-i} = \prod_{j \in N \setminus \{i\}} S_j$, where $S_{-i}$ is the set of strategy profiles of 'the other players'.
Dominant strategies

Definition A strategy $s_i$ is a (strictly) dominant strategy for player $i$ if, against every strategy profile of the other players, it gives player $i$ a higher payoff than any other strategy player $i$ could play.

So, a strategy $s_i$ is a (strictly) dominant strategy for player $i$ if
$u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$ for all $s_{-i} \in S_{-i}$ and all $s_i' \in S_i \setminus \{s_i\}$.

Definition A strategy profile $s = (s_1, s_2, \dots, s_n)$ is Pareto dominated if there exists a strategy profile $s' = (s_1', s_2', \dots, s_n')$ such that $u_i(s') > u_i(s)$ for all $i \in N$.

A strategy profile is Pareto optimal if it is NOT Pareto dominated. The Prisoner's Dilemma shows that the strategy profile that results if all players play a dominant strategy need not be Pareto optimal.

Dominated strategies

Most games have no strictly dominant strategy. In these cases there can still exist strictly dominated strategies.

Example (payoff matrix not reproduced) This game does not have a strictly dominant strategy. However, it has a strictly dominated strategy, namely strategy R2 for player Row.

Definition A strategy $s_i$ is strictly dominated if there is another strategy $s_i' \ne s_i$ for player $i$ such that, against every strategy profile of the other players, $s_i'$ gives player $i$ a higher payoff than $s_i$.

So, a strategy $s_i$ is strictly dominated if there is another strategy $s_i' \ne s_i$ such that
$u_i(s_i, s_{-i}) < u_i(s_i', s_{-i})$ for all $s_{-i} \in S_{-i}$.

- A rational player will not play a strictly dominated strategy.
- Note that this says nothing about strategy $s_i'$, but only that strategy $s_i$ will not be played.
- There are games which have no strictly dominated strategies.
- If a game has a strictly dominant strategy (for a player), then it also has strictly dominated strategies (all the other strategies of that player).
- Alternative definition of a strictly dominant strategy: a strategy which strictly dominates all the other strategies of that player.
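The definitions of strictly dominant strategies and Pareto dominated profiles can be checked mechanically. The sketch below uses a Prisoner's Dilemma with hypothetical payoffs (the numbers and the strategy names "Silent" and "Confess" are assumptions, not taken from the lecture) to illustrate that the profile of dominant strategies need not be Pareto optimal.

```python
from itertools import product

# Hypothetical Prisoner's Dilemma (payoffs assumed for illustration).
# payoffs[(row_strategy, col_strategy)] = (payoff of Row, payoff of Col)
STRATEGIES = ["Silent", "Confess"]
payoffs = {
    ("Silent", "Silent"): (-1, -1),
    ("Silent", "Confess"): (-10, 0),
    ("Confess", "Silent"): (0, -10),
    ("Confess", "Confess"): (-5, -5),
}

def payoff(player, own, other):
    """Payoff of `player` (0 = Row, 1 = Col) playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return payoffs[profile][player]

def strictly_dominates(player, s, t):
    """True if s gives a strictly higher payoff than t against every opponent strategy."""
    return all(payoff(player, s, o) > payoff(player, t, o) for o in STRATEGIES)

def strictly_dominant(player):
    """Return the strictly dominant strategy of `player`, or None if there is none."""
    for s in STRATEGIES:
        if all(strictly_dominates(player, s, t) for t in STRATEGIES if t != s):
            return s
    return None

def pareto_dominated(profile):
    """True if some other profile gives every player a strictly higher payoff."""
    return any(
        all(payoffs[other][i] > payoffs[profile][i] for i in range(2))
        for other in product(STRATEGIES, repeat=2)
        if other != profile
    )

dominant_profile = (strictly_dominant(0), strictly_dominant(1))
print(dominant_profile, pareto_dominated(dominant_profile))
```

With these assumed payoffs, confessing is strictly dominant for both players, yet the resulting profile is Pareto dominated by the profile in which both stay silent.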
Iterated Elimination of Strictly Dominated Strategies

Consider again the previous example. What would you do (as player Row or Col)? We already saw that R2 is dominated. What would you do as player Col? What next? What is left after iterated elimination of dominated strategies is (R1, C2).

The procedure is as follows:

Step 1. If there are no strictly dominated strategies, then stop. Otherwise go to step 2.

Step 2. Choose a player that was not chosen in the previous round and delete at least one (maybe even all) of its strictly dominated strategies in the reduced game. Go to step 3.

Step 3. Consider the reduced game (the game that is left after step 2) and return to step 1.

The final reduced game is independent of the order of elimination. So, any player can find this reduced game by itself when it knows all payoffs. The rationality requirement is limited: it is only required that each player is able to make (a finite number of) comparisons.

Weak Dominance

What would you play in the game? (Payoff matrix not reproduced.) There are no strictly dominated strategies. We say that R2 is weakly dominated (by R1) for player Row. (Similarly, C2 is a weakly dominated strategy for player Col.)

Definition A strategy $s_i$ is a weakly dominated strategy for player $i$ if there is another strategy $s_i'$ such that, against every strategy profile of the other players, $s_i'$ gives player $i$ at least the same payoff as $s_i$, and there is at least one strategy profile of the other players against which $s_i'$ gives player $i$ a higher payoff than $s_i$.

Formally, a strategy $s_i$ is weakly dominated if there is another strategy $s_i' \ne s_i$ such that
$u_i(s_i, s_{-i}) \le u_i(s_i', s_{-i})$ for all $s_{-i} \in S_{-i}$,
and there is at least one $s_{-i} \in S_{-i}$ such that
$u_i(s_i, s_{-i}) < u_i(s_i', s_{-i})$.

Definition A strategy is weakly dominant for a player if it weakly dominates all the other strategies of that player.

Remarks: The order of elimination matters in the iterated elimination of weakly dominated strategies.
Therefore, we usually do not apply it. Moreover, it can eliminate Nash equilibria.

Game Theory: Nash Equilibrium

A rational player will try to maximize his/her payoff. Each player will choose a strategy, in response to the strategies of all other players, that maximizes his/her own payoff. The result is called a Nash equilibrium.

Definition A strategy profile is a (pure) Nash equilibrium if and only if, once every player has chosen its strategy, none of the players could reach a better outcome by unilaterally switching to another strategy.

Formally, a strategy profile $s = (s_1, s_2, \dots, s_n)$ is a (pure) Nash equilibrium if for every player $i$
$u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})$ for all $s_i' \in S_i$.

Remark: Note that a Nash equilibrium is a strategy profile, while a dominant (or dominated) strategy is a strategy of some player.

Remarks: A (pure) Nash equilibrium is always one of the pure strategy profiles that are left after iterated elimination of (pure) strictly dominated strategies. Generally, there are fewer pure Nash equilibria than pure strategy profiles that remain after the iterated elimination of strictly dominated pure strategies. As mentioned in Lecture DT2, this does not have to be the case for the iterated elimination of weakly dominated strategies.

An alternative definition considers the best response of a player. A strategy $s_i$ is a best response of player $i$ against the strategy profile $s_{-i}$ of the other players if it maximizes the payoff of player $i$ when the other players play their strategies in $s_{-i}$. Then a strategy profile is a (pure) Nash equilibrium if and only if every player plays a best response against the others.
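The iterated elimination procedure and the best-response characterisation of a pure Nash equilibrium can be sketched together for two-player games. The 3×3 payoff table below is hypothetical, and the sketch removes all strictly dominated strategies of Row and then of Col in each pass, which is one admissible order since the result does not depend on the order of elimination.

```python
from itertools import product

# Hypothetical 3x3 game: payoffs[(row, col)] = (payoff of Row, payoff of Col).
ROWS = ["R1", "R2", "R3"]
COLS = ["C1", "C2", "C3"]
payoffs = {
    ("R1", "C1"): (3, 2), ("R1", "C2"): (2, 3), ("R1", "C3"): (1, 1),
    ("R2", "C1"): (1, 1), ("R2", "C2"): (1, 0), ("R2", "C3"): (0, 4),
    ("R3", "C1"): (2, 1), ("R3", "C2"): (3, 2), ("R3", "C3"): (2, 0),
}

def strictly_dominated(own, opponents, u):
    """Strategies s in `own` for which some other t satisfies
    u(t, o) > u(s, o) against every remaining opponent strategy o."""
    return [
        s for s in own
        if any(all(u(t, o) > u(s, o) for o in opponents) for t in own if t != s)
    ]

def iesds(rows, cols, payoffs):
    """Iterated elimination of strictly dominated strategies."""
    rows, cols = list(rows), list(cols)
    while True:
        bad_rows = strictly_dominated(rows, cols, lambda r, c: payoffs[(r, c)][0])
        rows = [r for r in rows if r not in bad_rows]
        bad_cols = strictly_dominated(cols, rows, lambda c, r: payoffs[(r, c)][1])
        cols = [c for c in cols if c not in bad_cols]
        if not bad_rows and not bad_cols:
            return rows, cols

def pure_nash_equilibria(rows, cols, payoffs):
    """Profiles in which each player's strategy is a best response to the other's."""
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = payoffs[(r, c)]
        row_best = all(u_row >= payoffs[(r2, c)][0] for r2 in rows)
        col_best = all(u_col >= payoffs[(r, c2)][1] for c2 in cols)
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria

surviving_rows, surviving_cols = iesds(ROWS, COLS, payoffs)
equilibria = pure_nash_equilibria(ROWS, COLS, payoffs)
print(surviving_rows, surviving_cols, equilibria)
```

In this made-up game, R2 is eliminated first, then C1 and C3, then R1, so a single profile survives, and it coincides with the only pure Nash equilibrium. In general the surviving set can be larger than the set of equilibria.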
Example (payoff matrix not reproduced)

Best responses of player Row:
- R1 is a best response against C1
- R1 is a best response against C2
- R1 and R3 are best responses against C3

Best responses of player Col:
- C1 and C3 are best responses against R1
- C2 and C3 are best responses against R2
- C1 is a best response against R3

One (pure) Nash equilibrium: (R1, C3)

Some classic examples

We already saw one of the most famous games, the Prisoner's Dilemma, in Lecture DT2.

Another example: Coordination game

Both players prefer to do something together, but Row prefers Bar to Cinema, and Col prefers Cinema to Bar. The (pure) Nash equilibria are (Bar, Bar) and (Cinema, Cinema).

Remark: A game like this is also known as 'Battle of the Sexes'.

Hawk-Dove Game

The best outcome for a player is that he/she fights and the opponent does not. Then the 'fighter' wins. The worst that can happen for both players is that both fight (disaster). It is still better to lose to the other player, by not fighting when the other does, than to have both fight. The (pure) Nash equilibria are (Fight, Truce) and (Truce, Fight).

Remark: A game like this is also known as 'Chicken'.

Mixed strategies

A pure Nash equilibrium does not have to be unique. In other words, a game can have more than one pure Nash equilibrium. Not every game has a pure Nash equilibrium, see 'Matching Pennies'.

A mixed strategy for a player is a probability distribution over its pure strategies. Every finite normal form game has at least one mixed Nash equilibrium.

Each player will choose a mixed strategy, in response to the mixed strategies of all other players, that maximizes his/her own expected payoff. The result is called a mixed Nash equilibrium.

Finding Mixed Nash Equilibria

We illustrate this with an example of a game with two players who each have two pure strategies.
Notation:
- $p$: probability that player Row plays R1 ($0 \le p \le 1$)
- $q$: probability that player Col plays C1 ($0 \le q \le 1$)
Then
- $1 - p$: probability that player Row plays R2
- $1 - q$: probability that player Col plays C2

Note that $p$ describes a mixed strategy for player Row, and $q$ describes a mixed strategy for player Col.

The expected payoff $u_1(p, q)$ of player Row (player 1) depends on $p$ and $q$ and is given by:
$u_1(p, q) = 3pq + p(1 - q) + 2(1 - p)q + 4(1 - p)(1 - q)$
$= 3pq + p - pq + 2q - 2pq + 4 - 4p - 4q + 4pq$
$= 4pq - 3p - 2q + 4$

Similarly, the expected payoff $u_2(p, q)$ of player Col (player 2) depends on $p$ and $q$ and is given by:
$u_2(p, q) = 0pq + p(1 - q) + 2(1 - p)q + (1 - p)(1 - q)$
$= 0pq + p - pq + 2q - 2pq + 1 - p - q + pq$
$= -2pq + q + 1$

(The coefficients in the first line of each formula are the payoffs of the game: (R1, C1) gives (3, 0), (R1, C2) gives (1, 1), (R2, C1) gives (2, 2) and (R2, C2) gives (4, 1).)

Best responses of player 1:
- $p = 0$ (R2) is a best response for player Row
  if and only if $u_1(0, q) \ge u_1(p, q)$ for all $0 < p \le 1$
  if and only if $-2q + 4 \ge 4pq - 3p - 2q + 4$
  if and only if $0 \ge p(4q - 3)$
  if and only if $q \le 3/4$
- $p = 1$ (R1) is a best response for player 1 if and only if $q \ge 3/4$
- every $0 \le p \le 1$ is a best response for player 1 if and only if $q = 3/4$

Best responses of player 2:
- $q = 0$ (C2) is a best response for player 2
  if and only if $u_2(p, 0) \ge u_2(p, q)$ for all $0 < q \le 1$
  if and only if $1 \ge -2pq + q + 1$
  if and only if $0 \ge q(1 - 2p)$
  if and only if $p \ge 1/2$
- $q = 1$ (C1) is a best response for player 2 if and only if $p \le 1/2$
- every $0 \le q \le 1$ is a best response for player 2 if and only if $p = 1/2$

This gives the unique (mixed) Nash equilibrium: $p = \frac{1}{2}$, $q = \frac{3}{4}$.

In the Nash equilibrium player 1 plays R1 and R2 both with probability 1/2, while player 2 plays C1 with probability 3/4 and C2 with probability 1/4.

An alternative way to find mixed Nash equilibria is to use partial differentiation:

The first order condition for maximizing $u_1(p, q)$ is $\frac{\partial u_1(p, q)}{\partial p} = 4q - 3 = 0$, so $q = \frac{3}{4}$.

The first order condition for maximizing $u_2(p, q)$ is $\frac{\partial u_2(p, q)}{\partial q} = -2p + 1 = 0$, so $p = \frac{1}{2}$.

A social choice problem is any decision problem faced by a group in which each individual is able to state ordinal preferences over outcomes.
Social Choice: Social Choice and Welfare functions

A social choice problem consists of a set of individual decision makers and, for each decision maker, a preference relation. We distinguish two types of preference aggregation:
- A Social Choice Function assigns to every social choice problem one or more alternatives, which can be considered as the alternatives that are chosen by society.
- A Social Welfare Function assigns to every social choice problem one preference relation, which can be seen as the 'social preference relation'.

Social choice problems

We consider a society with a finite set of agents or individuals who can choose among a finite set of alternatives. The society should come to one collective decision (the choice of one alternative), taking into account the preferences of the individual agents.

Definition Given a set of alternatives $A = \{a_1, \dots, a_m\}$ and a finite set of agents $N = \{1, \dots, n\}$, a preference profile is a tuple $G = (\succeq_i)_{i \in N}$ with $\succeq_i$ a preference relation on $A$, for $i \in N$.

A social choice problem is a triple $(N, A, G)$ where
- $N$ is a finite set of agents,
- $A$ is a finite set of alternatives, and
- $G = (\succeq_i)_{i \in N}$ is a preference profile.

Since we take the set of agents $N$ as well as the set of alternatives $A$ as given, we represent a social choice problem $(N, A, G)$ just by its preference profile $G$.

We make the following assumption.

Assumption: All individual preference relations are transitive, complete and asymmetric.

Consequently, we can denote the preference relation of agent $i \in N$ by $\succ_i$, and a preference profile by $(\succ_i)_{i \in N}$. Here $a \succ_i b$ means that agent $i$ considers alternative $a$ 'better than' alternative $b$.

Social choice functions

Given the preferences of the individual agents, there are two main questions:
- How do/should the agents choose one alternative together for the whole society? (Social choice function)
- Is it possible to derive a social preference relation reflecting the preferences of the society as a whole?
(Social welfare function)

Note that both questions are relevant from a normative as well as a descriptive viewpoint.

Two viewpoints that have been taken in the literature are
- a cooperative viewpoint, where a benevolent dictator tries to do what is 'best' for society, and
- a strategic viewpoint, where agents can strategically manipulate the outcome by voting.

A social choice function $C$ assigns to every preference profile $G$ a subset of the set of alternatives $A$, i.e. $C(G) \subseteq A$. The set $C(G)$ is called the social choice set associated with preference profile $G$.

Remark: Social choice functions are also called voting rules, or simply rules.

Social welfare functions

Instead of only making a (social) choice, we might want to know the full social preference relation for a social choice situation. A social welfare function $F$ assigns a preference relation to every social choice situation.

We abbreviate:
- scf: social choice function
- swf: social welfare function

Most social choice and social welfare functions fall into one of the following categories:
1. Scoring functions (Borda)
2. Majoritarian functions (Condorcet)

Ad 1. Scoring scf's and swf's assign scores (points) to the alternatives in every preference profile, and the 'winner' is the alternative that has the highest sum of scores over all individual agents.

Ad 2. Majoritarian scf's and swf's derive from each social choice problem one preference relation (the social preference relation) and determine the 'winner' based on this relation.

Plurality social choice function

- The plurality scf chooses from the alternatives by only considering what the best alternative is for each agent. It chooses the alternatives that are best for the highest number of agents.

Example (best alternative first)
Agent 1: a b c d
Agent 2: a b c d
Agent 3: a b d c
Agent 4: c b d a
Agent 5: d b c a

plur(G) = (3, 0, 1, 1) for the alternatives (a, b, c, d). So, the plurality winner is alternative a.

Antiplurality social choice function

- The antiplurality scf chooses the alternatives by only considering what the worst alternative is for each agent. It chooses the alternatives that are worst for the lowest number of agents.

Consider the previous example: antiplur(G) = (2, 0, 1, 2). So, the antiplurality winner is alternative b.

Borda social welfare function

- The Borda swf assigns points to all alternatives, and the 'winner' is the alternative with the highest number of points when summing over all the agents. With $m$ alternatives, an agent's best alternative receives $m - 1$ points, the next one $m - 2$ points, and so on, down to 0 points for the worst alternative.

Consider the following preference profile:
Agent 1: $a \succ b \succ c \succ d \succ e$
Agent 2: $c \succ d \succ e \succ b \succ a$
Agent 3: $b \succ c \succ a \succ d \succ e$

The individual Borda scores for (a, b, c, d, e) are:
borda($\succ_1$) = (4, 3, 2, 1, 0)
borda($\succ_2$) = (0, 1, 4, 3, 2)
borda($\succ_3$) = (2, 4, 3, 1, 0)

Then, the total Borda scores are Borda(G) = (6, 8, 9, 5, 2). The Borda winner is alternative c.

Condorcet social welfare function (Majority rule)

- The Condorcet scf chooses the set of alternatives that are 'best elements' in the social preference relation $\succeq_G$.
- For the five-agent profile of the plurality example, the majority rule (Condorcet rule) gives the social preference relation $a \succeq_G b \succeq_G c \succeq_G d$.

Consider the following preference profile:
Agent 1: $d \succ a \succ b \succ c$
Agent 2: $b \succ c \succ d \succ a$
Agent 3: $c \succ d \succ a \succ b$

The majority relation is
$a \succeq_G b$, $c \succeq_G a$, $d \succeq_G a$, $b \succeq_G c$, $d \succeq_G b$, $c \succeq_G d$.
Every alternative is beaten by some other alternative, so there is no Condorcet winner.

Remark: The majority relation $\succeq_G$ need not be transitive.

Social Choice: Properties of Social Choice and Social Welfare Functions, Sen's liberalism, Restricted Domains

Properties of social welfare functions

Definition A group of people $D$ (which may be a single-member group), which is part of the group of all individuals, is decisive with respect to the ordered pair of social states $(a, b)$ if and only if state $a$ is socially preferred to $b$ whenever everyone in $D$ prefers $a$ to $b$. A group that is decisive with respect to all pairs of social states is called decisive.

Property A social welfare function satisfies non-dictatorship if and only if no single individual is decisive.
Property A social welfare function satisfies ordering if and only if, for every possible combination of individual preference relations, the social preference relation is complete, asymmetric and transitive.

Property A social welfare function satisfies Pareto efficiency if and only if the group of all individuals in society is decisive.

Property A social welfare function satisfies independence of irrelevant alternatives (IIA) if and only if all individuals having the same preference between $a$ and $b$ in two different preference profiles $G$ and $G'$ implies that society's preference between $a$ and $b$ must be the same in $G$ and $G'$.

Arrow's Impossibility Theorem

Theorem (Arrow's impossibility theorem) If there are at least three alternatives, then no Social Welfare Function satisfies independence of irrelevant alternatives, Pareto efficiency, non-dictatorship and the ordering condition.

Remark: This is Theorem 13.1 in the book.

Corollary If there are at least three alternatives, then every Social Welfare Function that satisfies Pareto efficiency, the ordering condition and IIA must be dictatorial.

Remark: There exist non-dictatorial social welfare functions that satisfy unanimity and IIA on restricted domains. For example, if preferences are single-peaked, then the Condorcet social welfare function satisfies IIA and is Pareto efficient.

Sen on Liberalism

Property A social welfare function satisfies minimal liberalism if and only if there are at least two individuals in society such that for each of them there is at least one pair of alternatives with respect to which she is decisive. That is, there is a pair $a$ and $b$ such that if she prefers $a$ to $b$, then society prefers $a$ to $b$ (and society prefers $b$ to $a$ if she prefers $b$ to $a$).

Theorem There is no Social Welfare Function that satisfies minimal liberalism, Pareto efficiency and the ordering condition.

Remark: This is Theorem 13.2 in the book.
(Paradox of the Paretian Liberal)

Properties of social choice functions

Next, we discuss an impossibility result for social choice functions.

Assumption: We assume the social choice function to be single-valued, i.e. to every social choice problem it assigns a unique choice (the choice set is a singleton).

Remark: Note that this is a rather strong assumption. Moreover, it is an assumption on the 'outcome' (what is assigned by the social choice function), and not an assumption on the preference profile.

Definition Agent $j \in N$ has a successful manipulation in preference profile $G = (\succ_i)_{i \in N}$ if, by 'misreporting' his/her preferences (i.e. stating a different preference relation $\succ_j'$ than the real preference relation $\succ_j$) while the other agents do not change their preference relations, the social choice becomes better for agent $j$.

Property A social choice function is strategy-proof if for every preference profile there is no agent who has a successful manipulation.

So, a social choice function is strategy-proof if misreporting is never beneficial for any agent.

Property A social choice function $C$ is dictatorial if there is an individual agent whose unique best element is always the social choice.

Formal definition: A social choice function $C$ is dictatorial if there is an agent $i \in N$ such that, for every preference profile $G$,
$a \succ_i b$ for all $b \in A \setminus \{a\}$ $\Rightarrow$ $C(G) = \{a\}$.

Gibbard-Satterthwaite Impossibility Theorem

Theorem (Gibbard-Satterthwaite Theorem) If there are at least three alternatives, then there is no Social Choice Function that is strategy-proof and non-dictatorial.

Phrased differently:

Corollary If there are at least three alternatives, then every strategy-proof social choice function is dictatorial.

Remark: As with Arrow's impossibility theorem for social welfare functions, if we restrict the domain (i.e. we do not allow all preference relations), then there might be strategy-proof social choice functions that are not dictatorial.
An example of such a domain is the domain of single-peaked preferences.

Remark: Dowding and van Hees (2007) argue that manipulation might be a virtue from a democratic perspective.

Strategic manipulation

Consider the following preference profile with 5 agents and 4 alternatives (best alternative first):

Agent 1: a(3) b(2) c(1) d(0)
Agent 2: a(3) b(2) c(1) d(0)
Agent 3: a(3) b(2) d(1) c(0)
Agent 4: c(3) b(2) d(1) a(0)
Agent 5: d(3) b(2) c(1) a(0)

(In parentheses are the Borda scores for the individual agents.)

The total Borda scores of the alternatives (a, b, c, d) are Borda(G) = (9, 10, 6, 5), so the Borda winner is alternative b.

Agent 1 has a strategic manipulation: reporting a c d b. Then the total Borda scores are Borda(G) = (9, 8, 7, 6), so the Borda winner becomes alternative a, which agent 1 prefers to b.

Single-peaked preferences

A preference relation is single-peaked if the alternatives can be put on a line (in an order) such that there is a best alternative $a^*$, and every alternative $b$ that 'lies between' $a$ and $a^*$ is considered better than alternative $a$. So, there is a unique best alternative, and if you 'walk away' from that alternative in either direction (left or right) your utility decreases.

So, we consider a set of alternatives that can be ordered on a line. In other words, let $A = \{a_1, a_2, \dots, a_m\}$ with $a_k \in \mathbb{N}$ be such that $a_k < a_{k+1}$ for all $k \in \{1, \dots, m - 1\}$.

Example: $A = \{1, 2, \dots, m\}$

Definition Preference relation $\succeq_i$ on $A$ is single-peaked if
1. there is an $a^* \in A$ such that $a^* \succ_i b$ for all $b \in A \setminus \{a^*\}$, and
2. for all $a, b \in A$ it holds that: if $a < b < a^*$ then $b \succ_i a$; and if $a > b > a^*$ then $b \succ_i a$.

Remark: a single-peaked preference relation need not be complete.

Some examples of single-peaked preferences on $A = \{1, 2, \dots, 100\}$:
- $a \succ_i b$ iff $a < b$ ($1 \succ_i 2 \succ_i 3 \succ_i \dots$)
- $a \succ_i b$ iff $a > b$ ($100 \succ_i 99 \succ_i 98 \succ_i \dots$)
- $a \succ_i b$ iff $|a - 4| < |b - 4|$ ($2 \succ_i 1$, $2 \succ_i 7$, $3 \succ_i 2$, ...)

Theorem If all preference relations $\succ_i$, $i \in N$, are single-peaked, then the majority relation $\succeq_G$ is complete and transitive.

Corollary If all preference relations $\succ_i$, $i \in N$, are single-peaked, then a Condorcet winner exists.
Theorem If all preference relations $\succ_i$, $i \in N$, are single-peaked, then the Condorcet rule is strategy-proof.

Remarks:
- For the Condorcet rule only the peaks matter.
- No scoring rule is strategy-proof.
- The Condorcet winner is the alternative $a^*$ such that the number of agents that have their peak 'to the left' of $a^*$ is equal to the number of agents that have their peak 'to the right' of $a^*$. (Median voter)
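The Borda rule, the strategic manipulation example and the search for a Condorcet winner can be reproduced with a short sketch. The two preference profiles are the ones from the examples above; the function names and the list-based representation of rankings are assumptions made for illustration.

```python
ALTERNATIVES = ["a", "b", "c", "d"]

# Five-agent profile from the strategic manipulation example (best first).
profile = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
    ["a", "b", "d", "c"],
    ["c", "b", "d", "a"],
    ["d", "b", "c", "a"],
]

# Three-agent profile from the Condorcet example, which has a majority cycle.
cyclic_profile = [
    ["d", "a", "b", "c"],
    ["b", "c", "d", "a"],
    ["c", "d", "a", "b"],
]

def borda_scores(profile):
    """With m alternatives, a ranking gives m - 1 points to its best
    alternative, m - 2 to the next, and so on down to 0."""
    m = len(ALTERNATIVES)
    scores = {x: 0 for x in ALTERNATIVES}
    for ranking in profile:
        for position, x in enumerate(ranking):
            scores[x] += m - 1 - position
    return scores

def winners(scores):
    """All alternatives with the highest total score."""
    best = max(scores.values())
    return [x for x in ALTERNATIVES if scores[x] == best]

def condorcet_winner(profile):
    """The alternative that a strict majority ranks above every other
    alternative in pairwise comparison, or None if there is none."""
    for x in ALTERNATIVES:
        if all(
            sum(r.index(x) < r.index(y) for r in profile) > len(profile) / 2
            for y in ALTERNATIVES
            if y != x
        ):
            return x
    return None

truthful = borda_scores(profile)

# Agent 1 misreports a c d b instead of the true ranking a b c d.
manipulated = borda_scores([["a", "c", "d", "b"]] + profile[1:])

print(winners(truthful), winners(manipulated))
print(condorcet_winner(profile), condorcet_winner(cyclic_profile))
```

The misreport moves the Borda winner from b to a, agent 1's best alternative, which is exactly why the Borda rule is not strategy-proof on this profile. The second profile has no Condorcet winner because its majority relation contains a cycle.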
