Coordination Problems and Norms in Heterogeneous Populations

Mark Bernard∗, Ernesto Reuben† and Arno Riedl‡

October 31, 2011

PRELIMINARY AND INCOMPLETE

Abstract

We study coordination frictions, and the importance of contribution norms, in step-level public good games with large equilibrium sets and heterogeneous agents. We show that heterogeneity creates frictions in the aggregate. An elicitation task and a questionnaire show that individuals hold, and expect others to hold, well-defined yet conflicting normative views of fair contribution rules related to efficiency, equality, and equity. Successful groups agree on, and then stick with, normatively appealing allocations with focal properties that can be derived from first principles. Moreover, normative viewpoints and expectations (as elicited ex-ante) predict group behavior, confirming the importance of normative arguments and expectations in complex coordination problems.

∗ Stockholm School of Economics, e-mail: mark.bernard@hhs.se
† IZA and Columbia University, e-mail: ereuben@columbia.edu
‡ CESifo, IZA, and Maastricht University, e-mail: a.riedl@maastrichtuniversity.nl

1 Introduction

The need for coordination among people or entities with heterogeneous characteristics is an undeniable fact of social and economic life. When a single point has to be selected from a potentially large equilibrium set, differing views and expectations resulting from heterogeneity can cause substantial frictions. What is more, coordination problems often have a threshold characteristic in the sense that an endeavor is only fruitful if a critical amount of resources can be bundled, and not otherwise.
The recent disagreement among Member States about the size, scope and particularly financing of the European Financial Stability Facility, devised to discourage speculation against Eurozone sovereign debt, presents a case in point.[1] On a smaller scale, (partly) irreversible investments into joint projects that require a minimum amount of financing, or cost-sharing agreements in contexts as mundane as building restoration (consider the case of a co-op), may, when interacted with heterogeneity on relevant and observable characteristics (wealth; benefit from project; stake in co-op), be hindered, or held up, by disagreement about how to take this heterogeneity into account. Crucially, part of (or all) the resources put into the project, or at least the effort sunk in trying to find a solution, can get lost in case of failure. Hence all parties involved are at risk not only of jettisoning efficiency but also of wasting private resources in case of miscoordination.

It has been suggested that (social) norms can act as equilibrium selection devices (Schelling [1960]; Lindahl and Johannesson [2009]). As laid out in Binmore [1994, 1998], evolution may have favored the emergence of social norms as a means to select among multiple equilibria on the Pareto frontier. The presence of shared views about normatively appealing behavioral rules is a necessary condition for a social norm to exist. In addition, for a norm to emerge and be actually observed, sufficiently many people have to be willing to follow the rule, either intrinsically or through the threat of sanctions (Bicchieri [2006]; Young [2008]). Stability is another key feature of norms, to be understood in the stochastic sense in the long run (Foster and Young [1990], Young [1993, 1998]). The key problem in heterogeneous groups, at least in the short run, is that there may not be a uniquely prevailing view about which social norm should be selected.
For instance, when it comes to cost sharing among agents or entities with differing wealth levels, assuming decreasing marginal utility of wealth, those who apply a principle of equal sacrifice will disagree with those who put emphasis on equal outcomes. And even if (without knowing it) subjects interacting in a group happen to agree ex-ante, they may expect others to disagree and adjust their behavior. Thus, heterogeneity in normative expectations may be sufficient to upset coordination.

[1] Admittedly the exact threshold is unknown here even to experts, but there is no doubt about there being a threshold. Former US Treasury Secretary Hank Paulson’s 2008 statement about the need for a “big bazooka” to calm markets in the wake of the financial crisis also springs to mind.

The purpose of this paper is to investigate, experimentally and using questionnaire data, (i) the extent of frictions caused by heterogeneity, (ii) whether people share (and expect to share) specific normative views regarding contribution behavior in homogeneous and heterogeneous groups that could serve as a basis for a contribution norm, (iii) whether successful groups coordinate on stationary equilibria and (iv) whether these stationary equilibria exhibit normatively appealing properties. Moreover, we investigate to what extent the type of heterogeneity influences equilibrium selection. Finally, we hope to gain insight into the link between (i) and (ii), i.e. whether coordination failure at the group level can indeed be related to normative disagreement between individuals, or diverging expectations, ex-ante.

Our workhorse is a step-level, or threshold, public good game where players are heterogeneous with respect to their wealth or the extent to which they benefit from the public good (in case it is provided) and this heterogeneity is public information. We find that heterogeneity causes substantial efficiency losses relative to the case of homogeneous groups.
While in the homogeneous treatment virtually all groups coordinate on the unique symmetric and efficient equilibrium, there is considerable variation between groups’ solutions to the coordination problem in the heterogeneous treatments. However, what successful groups have in common is that they do coordinate on a single allocation that is consistent with a specific normative principle. This reflects our questionnaire data, which show that in the heterogeneous treatments, while virtually all subjects agree on basic normative principles, there is some disagreement as to which specific rule is to be followed. Still, the number of competing candidates is small and most can be rationalized with a specific fairness/equity principle. Finally, there exists a clear statistical link between individual questionnaire responses and group outcomes.

The rest of the paper is organized as follows: Section 2 describes the experiment and the questionnaire. Section 3 contains equilibrium analysis and selection arguments, as well as our hypotheses. Section 4 discusses related literature. Section 5 presents our results and the final section concludes.

2 Design and Procedures

2.1 Underlying game

The game around which our experiment revolves is a step-level public goods game with groups of four players. Each player $i$ receives an endowment of $y_i$ in “points”. Players simultaneously but individually decide how many points to contribute to produce a public good. Let $c_i$ be player $i$’s contribution, where $c_i \in [0, y_i]$. Contributions are irreversible and sunk. The public good is only produced if the sum of all four players’ contributions reaches a threshold $\tilde{c}$, with $0 < \tilde{c} < \sum_{j=1}^{4} y_j$. If the threshold is reached, the public good provides a payoff of $v_i \tilde{c}$ to individual $i$, which may vary across players, where $0 < v_i \le 1$ and $\sum_{j=1}^{4} v_j > 1$.[2] Hence, individual $i$’s earnings can be expressed as

$$\pi_i = y_i - c_i + I\left(\sum_{j=1}^{4} c_j \ge \tilde{c}\right) v_i \tilde{c},$$

where $I(\cdot)$ is the indicator function. We implement three types of groups.
In our baseline treatment “Homogeneous” (shorthand “H”), $y_i = 30$ for all $i$ and $v_i = 0.5$ for all $i$. In our first heterogeneous treatment, “Heterogeneous Endowments” (“HE”), there are two “high types” with $y_1 = y_2 = 60$ and two “low types” with $y_3 = y_4 = 30$, while still $v_i = 0.5$ for all $i$.[3] That is, players 1 and 2 have twice the endowment of the other two players. In our second heterogeneous treatment, “Heterogeneous Benefits” (“HB”), on the other hand, $y_i = 30$ for all $i$ but now there are two “high types” with $v_1 = v_2 = 1$ and two “low types” with $v_3 = v_4 = 0.5$. In plain English, conditional on its provision, players 1 and 2 benefit twice as much from the public good as players 3 and 4. In all treatments we keep the threshold at $\tilde{c} = 60$. Table 1 summarizes the information on our treatments and data.

Table 1: Treatment details.

| Treatment | Type | $y_i$ | $v_i$ | $\tilde{c}$ | Groups | Subjects |
|-----------|------|-------|-------|-------------|--------|----------|
| H         | low  | 30    | 0.5   | 60          | 15     | 60       |
| HE        | high | 60    | 0.5   | 60          | 16     | 32       |
| HE        | low  | 30    | 0.5   | 60          |        | 32       |
| HB        | high | 30    | 1     | 60          | 16     | 32       |
| HB        | low  | 30    | 0.5   | 60          |        | 32       |
| Total     |      |       |       |             | 47     | 188      |

Each experimental session consisted of three parts which we shall call “Elicitation ex-ante”, “Interaction” and “Elicitation ex-post”, administered sequentially. The two Elicitation parts were structurally identical and non-interactive and aimed at eliciting subjects’ normative views and expectations. The Interaction part had subjects play 20 rounds of the game just described, with fixed partners.

[2] This latter assumption ensures that producing the public good is always efficient if $\sum_{j=1}^{4} c_j = \tilde{c}$.
[3] Since subjects will be assigned to roles randomly, the particular numbering is irrelevant.

2.2 Elicitation

After subjects had come into the lab and been randomly assigned to cubicles with computers,[4] they were randomly assigned to groups and roles and the step-level public good was explained to them on-screen (with parameters according to the treatment that was being run and using neutral language). That is, when learning about the game they knew whether they were low or high types.
However, they were not yet told that there would be an Interaction part. Instead, they were asked to choose an allocation (i.e. a feasible vector of contribution levels; henceforth the “prescribed allocation”) which would be implemented for a randomly selected other group in the lab. They were asked to act from the perspective of a “neutral, uninvolved arbitrator” and urged to specify an allocation they deemed “appropriate”. No particular allocations were suggested, but the computer interface allowed subjects to put in different contribution vectors and learn about their payoff consequences before finalizing and submitting a decision. The payoffs from the selected allocation were added anonymously to the earnings of the members of the randomly selected other group at the end of the experiment.

Having made their decisions, subjects were asked to guess the prescribed allocations of their fellow group members (hence, there were three vector-valued guesses to be made). Incentivization was such that a fixed premium was added to a subject’s earnings at the end of the experiment for each guess that turned out to be exactly correct, so that subjects should have reported their modal guess for each fellow group member. After all subjects had completed their input in this stage, the experiment moved on to the Interaction part, which had not been previously announced. No one was informed of anyone else’s guesses.

2.3 Interaction and second Elicitation phase

Subjects remained matched in the groups and roles they were initially assigned to and played 20 rounds of the step-level public good game. At the end of each round, which started with an input screen, subjects were informed of all group members’ choices, whether the public good had been provided, and all group members’ payoffs in that round. No history or cumulative payoffs were displayed. After round 20, there was a surprise announcement that there would be another set of questions.
The part that followed was identical to the initial Elicitation, with groups and roles unchanged, except that the prescribed allocation would now go to another randomly selected other group (there was another random draw). The purpose of the second Elicitation part was to check for consistency, but also to see whether there would be convergence in normative expectations in successful groups (provided they settled on a stationary allocation).

[4] The entire experiment was programmed using the software z-Tree (Fischbacher [2007]).

After all subjects were done submitting their prescriptions and guesses, subjects were called to the front desk individually, informed of their total earnings from the experiment, paid and dismissed. Subjects spent about 50 minutes in the lab and earned an average of $16.78. Sessions were run at the Center for Experimental Social Science at New York University and recruitment took place using the Center’s online recruitment system.[5] No subject participated in more than one session.

3 Equilibrium sets and norm-based selection arguments

3.1 Theory

We restrict attention to the pure strategy equilibria of the one-shot game. This is plausible since our focus on norms entails a stationarity assumption, given which no allocation outside the set of one-shot Nash equilibria can be an equilibrium of the finitely repeated game.[6] Clearly, there is always a trivial equilibrium in which no one contributes anything (call this vector “zero contributions”, or $z$). Let $\Sigma$ be the set of all feasible contribution vectors. A necessary condition for an allocation $\sigma$ with positive contributions to be an equilibrium is that $\sum_{j=1}^{4} c_j = \tilde{c}$, that is, the public good must be provided (otherwise reversion to 0 is a profitable deviation) and the allocation must be efficient (otherwise some players can reduce their contributions while the public good, which gives a fixed payoff, is still provided).
Moreover, each player $i$ must obtain weakly more than her endowment $y_i$. In fact, the two conditions are necessary and sufficient and we have, for all treatments:

$$NE = \left\{\sigma \in \Sigma : \sum_{j=1}^{4} c_j = \tilde{c} \,\cap\, \pi \ge y\right\} \cup \{z\} = \left\{\sigma \in \mathbb{R}^4_+ : \sum_{j=1}^{4} c_j = \tilde{c} \,\cap\, \max_i c_i \le 30\right\} \cup \{z\}$$

The set of undominated equilibria is:

$$NE_u = \left\{\sigma \in \Sigma : \sum_{j=1}^{4} c_j = \tilde{c} \,\cap\, \pi \gg y\right\} \cup \{z\}$$

Since $y$ is also the (pure-strategy) minmax vector, our earlier statement about the link between stationary equilibria of the repeated game and those of the one-shot game follows. It is easy to see that the set $NE$ has several thousand elements and does not vary by treatment. The sets of weakly dominated equilibria $WD$ for each treatment are $WD_H = WD_{HE} = \{\sigma \in \Sigma : c_i \ge 30 \text{ for some } i\}$ and $WD_{HB} = \{\sigma \in \Sigma : c_i = 30 \text{ for some } i > 2\}$. While there no longer is a perfect overlap when applying weak dominance, we have $WD_H = WD_{HE} \supset WD_{HB}$, and $WD_{HB}$ still has a cardinality of several thousand. Essentially, the sets only differ on those allocations where at least one of the high types contributes his/her entire endowment. Apart from $z$, which is always “safe”, there is also no clear risk ordering on $NE_u$ (Pesci [2010]). Stronger selection arguments are needed.

[5] http://rec.econ.nyu.edu/cessWeb/viewCalendar.do
[6] To be proved after characterization of the equilibrium set.

Relaxing the assumption of material self-interest can help reduce the cardinality of the equilibrium sets by ruling out allocations that lead to very unequal distributions of earnings.[7] For the same reason, other-regarding preferences could also explain differences in contribution behavior between H and the Heterogeneous treatments (but not between HE and HB). However, unless preferences are (a) commonly known and (b) strong enough to pin down a single allocation, multiplicity of equilibria and/or strategic uncertainty remain impediments to cooperation, albeit perhaps to a lesser extent, and our question as to what solves the selection problem remains nontrivial.
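The equilibrium characterization in Section 3.1 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes contributions are whole points (the grid is not stated in the text), takes the treatment parameters from Table 1, and uses hypothetical helper names (`payoff`, `is_nash_brute_force`, `is_nash_characterized`, `grid_agreement`). It compares a brute-force test for profitable unilateral deviations with the closed-form condition (zero vector, or exact provision with every player earning at least her endowment).

```python
from itertools import product

THRESHOLD = 60  # the threshold c-tilde, identical in all treatments

# Treatment parameters from Table 1: (endowments y, benefit factors v).
# Players 1 and 2 are the high types in HE and HB (ordering is a convention).
TREATMENTS = {
    "H": ((30, 30, 30, 30), (0.5, 0.5, 0.5, 0.5)),
    "HE": ((60, 60, 30, 30), (0.5, 0.5, 0.5, 0.5)),
    "HB": ((30, 30, 30, 30), (1.0, 1.0, 0.5, 0.5)),
}


def payoff(i, c, y, v, threshold=THRESHOLD):
    """Earnings of player i: y_i - c_i + 1{sum(c) >= threshold} * v_i * threshold."""
    provided = sum(c) >= threshold
    return y[i] - c[i] + (v[i] * threshold if provided else 0.0)


def is_nash_brute_force(c, y, v):
    """True if no player gains from any unilateral integer deviation in [0, y_i]."""
    c = tuple(c)
    for i in range(len(c)):
        current = payoff(i, c, y, v)
        for dev in range(y[i] + 1):
            alt = c[:i] + (dev,) + c[i + 1:]
            if payoff(i, alt, y, v) > current:
                return False
    return True


def is_nash_characterized(c, y, v):
    """Closed-form test: zero vector, or exact provision with pi_i >= y_i for all i."""
    c = tuple(c)
    if any(ci < 0 or ci > yi for ci, yi in zip(c, y)):
        return False  # infeasible contribution vector
    if all(ci == 0 for ci in c):
        return True
    exact = sum(c) == THRESHOLD
    return exact and all(payoff(i, c, y, v) >= y[i] for i in range(len(c)))


def grid_agreement(step=5):
    """Check that both tests agree on a coarse grid of feasible vectors."""
    for y, v in TREATMENTS.values():
        for c in product(*(range(0, yi + 1, step) for yi in y)):
            if is_nash_brute_force(c, y, v) != is_nash_characterized(c, y, v):
                return False
    return True
```

Under these assumptions, a call such as `is_nash_brute_force((40, 10, 5, 5), *TREATMENTS["HE"])` illustrates why no player can contribute more than 30 in equilibrium: the first player would earn 50, less than her endowment of 60, and prefers to withdraw her contribution.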
3.2 Focal and Conflicting Contribution Norms

Our main interest concerns the possible emergence of contribution norms in the homogeneous and heterogeneous groups. In the literature on social norms there are some divergent views regarding the exact definition of a social norm. For instance, in the tradition of Sugden [1986] and Coleman [1990], Bicchieri [2006] argues that norms enforce non-equilibrium behavior in situations where there is a tension between individual and collective material welfare. Young [2008] takes a different stance by arguing that the term “norm” can only apply to games with multiple equilibria and, hence, cannot serve as an enforcement mechanism for non-equilibrium behavior. We follow the approach of Young here and focus on equilibrium selection. Our hypothesis is that vectors which are focal because they heed particular normative principles will be chosen and enacted as contribution norms in successful groups. However, in case of heterogeneity there may be disagreement about which normative principle to apply, and the principles on which agreement can be reached may be too weak to uniquely select an allocation. We illustrate this in what follows.

Throughout we assume that all players agree on the zero vector $z$ being undesirable (although potentially the lesser evil compared to other allocations) and restrict attention to the equilibria on the Pareto frontier. Thus, we seek guidance from rules regarding how to split the burden of providing the public good, or relative contribution rules, conditional on Nash equilibrium play.

[7] Theoretical models assuming other-regarding preferences were originally proposed to explain behavior that is inconsistent with maximization of own material payoffs (e.g., Levine [1998]; Fehr and Schmidt [1999]; Bolton and Ockenfels [2000]; Charness and Rabin [2002]; Dufwenberg and Kirchsteiger [2004]; Falk and Fischbacher [2006]; Cox, Friedman, and Gjerstad [2007]). Specifically, it has been shown that, with appropriate assumptions on the strength of other-regarding preferences, social dilemma games are transformed into coordination games with multiple equilibria, most of which include positive contribution levels (e.g., Rabin [1993] and Propositions 4 and 5 in Fehr and Schmidt [1999]).

A very basic notion is that of horizontal fairness, which insists that equal cases be treated equally. In H this requirement is sufficient to single out the “Equal Contributions” allocation ($c_i = 15$ for all $i$) - essentially it is equivalent to assuming symmetry. In HE and HB, on the other hand, it only specifies that $c_1 = c_2 =: c_H$, $c_3 = c_4 =: c_L$ and $c_H + c_L = 30$. Slightly less basic, but still fairly uncontroversial, is the notion of vertical fairness, which in HE and HB prescribes $\min\{c_1, c_2\} \ge \max\{c_3, c_4\}$: high types should contribute at least as much as low types.[8] Horizontal and vertical fairness together (along with the implicit efficiency assumption) then restrict choices to the set

$$HVF = \{\sigma \in NE : c_1 = c_2 =: c_H \,\cap\, c_3 = c_4 =: c_L \,\cap\, c_H + c_L = 30 \,\cap\, c_H \ge c_L\}$$

The set $HVF$ is, however, still large and identical in both HE and HB. We therefore now turn to three very specific, and in an intuitive sense salient, fairness/equity principles that have been found to be popular and endorse (different) notions of equality and equity (see Konow [2003]; Konow, Saijo, and Akai [2009]). First, the concept of equality can be applied to contributions, leading to the Equal Contributions allocation discussed before. More often, however, and second, the term “equality” is meant to refer to equality of outcomes, here earnings. If such a notion is applied, one obtains $c_1 = c_2 = 30$ and $c_3 = c_4 = 0$ in both HE and HB.
We shall refer to this allocation as “Equal Earnings” in the analysis to follow. Finally, an obvious way to interpret “equity” in the present context is to appeal to the principle of equal (proportional) sacrifice in HE and equal (proportional) benefit in HB. In HE, since high types have an (unearned) endowment twice as high as that of the low types, they might reasonably be expected to contribute twice as much, i.e. $c_1 = c_2 = 20$ and $c_3 = c_4 = 10$. In HB, since high types exogenously benefit twice as much from the public good as low types, they might also be expected to contribute twice as much, so again $c_1 = c_2 = 20$ and $c_3 = c_4 = 10$. We henceforth label this allocation “Proportionality”.[9] It should briefly be noted that all fairness principles just discussed trivially prescribe Equal Contributions in H. Table 2 sums up the analysis. Moreover, all three allocations just derived satisfy horizontal and vertical fairness. A potential game-theoretic caveat is that in HE, the Equal Earnings allocation is weakly dominated.

[8] It is an interesting afterthought to realize that the Shapley value would propose a violation of this principle in HE, where high types are more often pivotal than low types and should hence earn more of the surplus, i.e. contribute less, than low types. In HB the anomaly disappears.
[9] Note that this allocation equalizes the input-output ratio, or the “return on investment”, in HB, since $\frac{60}{20} = \frac{30}{10} = 3$. However, in HE, equalizing the “return on investment” implies equal contributions, violating the principle of equal sacrifice.

Table 2: Summary of allocations resulting from popular fairness principles.

| Treatment | Equal Contributions | Proportionality | Equal Earnings |
|-----------|---------------------|-----------------|----------------|
| H         | $c_i = 15\ \forall i$ | $c_i = 15\ \forall i$ | $c_i = 15\ \forall i$ |
| HE        | $c_i = 15\ \forall i$ | $c_H = 20$, $c_L = 10$ | $c_H = 30$, $c_L = 0$ |
| HB        | $c_i = 15\ \forall i$ | $c_H = 20$, $c_L = 10$ | $c_H = 30$, $c_L = 0$ |

Our results imply that given agreement on a specific normative principle, the selection problem is easily solved, but while horizontal and vertical fairness seem uncontroversial there is no strong reason to believe that the entire population should agree on one particular fairness principle. Hence, the basic tension resulting from (anticipation of or beliefs about) normative disagreement resulting from heterogeneity remains. The purpose of our Elicitation part is exactly to substantiate these points.

3.3 Hypotheses

We hypothesize that normative viewpoints as revealed by the Elicitation part will heed the principles of efficiency, horizontal and vertical fairness and potentially be concentrated around the three candidate allocations Equal Contributions, Proportionality and Equal Earnings. In H there should be no disagreement (about choosing Equal Contributions). Moreover, we hypothesize these viewpoints to translate into the way groups solve the coordination problem in the 20-period Interaction part. At a minimum, we expect successful groups to pick and concentrate on one particular undominated equilibrium, i.e. display a stationary path. However, if our hypothesis about norms as focal points is true, allocations that were deemed normatively desirable should be more stable and widely accepted when played. Finally, we would expect a link between ex-ante normative expectations and interaction within groups, and in particular potentially a link between ex-ante normative disagreement within a group and the likelihood of success in the interaction.

4 Related Literature

A first version of the step-level public good game was introduced by Hardin [1976] and further developed by van de Kragt, Orbell, and Dawes [1983].
In its original conception, players make binary contribution decisions (either contribute the full endowment or nothing) and so equilibria consist of some players contributing their endowments and others nothing at all. Coordination is therefore on binary decisions (or on contribution probabilities), not on intensities. In this context Rapoport [1988] comes closest to our setup. The paper studies endowment heterogeneity, but still in a setting of binary decisions. The author finds experimentally that high-endowment subjects are more likely to contribute, in line with his hypothesis. This finding can, however, be predicted using standard noncooperative theory. In post-study interviews, some subjects revealed distributional concerns, while others opined that low types, benefiting more relative to endowments, should be more likely to contribute. However, since this was not the focus of the paper, it offers no quantification of responses or investigation of potential contribution norms. Success rates are quite low (40.3%).[10]

Continuous-contribution step-level public good games like the present one were introduced by, among others, Isaac et al. [1989]. Studies assessing the effects of heterogeneity are Bagnoli and McKee [1991], who have treatments similar to our game in HB, and Rapoport and Suleiman [1993], who investigate a close cousin to the game in HE. Bagnoli and McKee [1991] also vary group size and are mostly interested in general theoretical predictions (essentially whether groups play some Nash equilibrium, and whether they avoid $z$), rather than the effects of heterogeneity on selection per se. Rapoport and Suleiman [1993] test the effect of wealth heterogeneity on contribution behavior in 5-player groups. They use a within-subject design where subjects play 5 supergames of 3 rounds each, with endowments randomly redrawn at the beginning of each supergame.
The main results for our purpose are that (a) heterogeneity drives down success rates and (b) players contribute the same proportion of their endowment across wealth levels. The authors explain the latter fact by a theory of “mixed motives” whereby subjects trade off potential gains from contribution (relative to endowment) and their likelihood of affecting the group outcome. There is no discussion of focal points and conflicting normative concerns.

[10] Croson and Marks [2000] discuss success rates as a function of technology in their meta-analysis.

Our setup is novel in several ways. First, we introduce the ex-ante Elicitation task to canvass, and assess the nature and dispersion of, subjects’ normative views and expectations in heterogeneous setups (capturing normative disagreement). Second, we investigate different types of heterogeneity. This will for instance allow us to take a stand on Rapoport and Suleiman’s [1993] hypothesis that subjects coordinate on contributing a fixed share of their endowments. In HB this should then lead to Equal Contributions (and in HE to Proportionality), a strong hypothesis. Third, our long horizon (20 rounds) will allow for an analysis of the stability properties of different allocations, another key ingredient to norms (Young [2008]). Fourth, combining the data from Elicitation and Interaction will allow us to assess whether actual choices in groups do indeed correspond to initially revealed normative preferences, and whether ex-ante normative disagreement spills over into actual conflict.

In the related literature on standard (linear) public good games three papers are close to ours in spirit. Both Anderson, Mellor, and Milyo [2008] and Buckley and Croson [2006] study the impact of heterogeneity on overall contribution levels, with results pointing in opposite directions.
Reuben and Riedl [2011] run our three treatments in the standard public good context and investigate the emergence and enforcement of contribution norms, however in the spirit of Bicchieri [2006] rather than following Young’s [2008] equilibrium selection paradigm as we do. They find evidence for fairness- and efficiency-oriented contribution norms when allowing for punishment but not otherwise.

5 Results

Throughout the analysis we use parametric tests as a default, and supplement them with nonparametric results only in case of contradicting evidence or if parametric tests are clearly invalid/infeasible. Standard errors are always robust and, unless stated otherwise, clustered at the group level whenever multiple observations per group are used.

5.1 Aggregate statistics - behavior

We begin by analyzing the success rate, summarized in the first column of Table 3, which contains summary statistics. A Probit regression of success on heterogeneity type and punishment dummies, as well as interaction terms, is used to substantiate differences statistically. We find that without punishment, groups were significantly more successful in H than in HE (p < 0.035) and HB (p < 0.027). There is no significant difference between HE and HB (p = 0.852).

Turning to aggregate contribution behavior (columns 2-4 of Table 3), we investigate whether in case of success there were significantly more excess contributions in the heterogeneous treatments, as hinted at by column (3) (perhaps as a consequence of strategic uncertainty). There was indeed more slack in HE (negative binomial regression, p < 0.001) and in HB (p < 0.001) than in H, but again no difference between the Heterogeneous treatments (p = 0.478).

Failure and overcontribution were the two potential drags on efficiency in our experiment. Let the efficiency gain (column (8) of Table 3) denote the realized (group-level) gain in total earnings as a percentage of the maximum potential gain.
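The efficiency gain just defined can be made concrete with a short sketch. It rests on two assumptions that the text does not spell out: gains are measured relative to total endowments (the zero-contributions outcome), and the maximum potential gain is the one obtained when the threshold is met exactly. The function name `efficiency_gain` is hypothetical, and the result is returned as a fraction rather than a percentage.

```python
def efficiency_gain(c, v, threshold=60):
    """Realized group gain in total earnings as a share of the maximum potential gain.

    Assumption: gains are measured relative to total endowments, so the
    endowments themselves cancel out of the formula.
    """
    provided = sum(c) >= threshold
    # Realized gain: benefits from the public good (if provided) minus all contributions.
    realized = (sum(v) * threshold if provided else 0.0) - sum(c)
    # Maximum gain: the public good is provided with contributions summing exactly to the threshold.
    max_gain = sum(v) * threshold - threshold
    return realized / max_gain


v_h = (0.5, 0.5, 0.5, 0.5)  # benefit factors in treatment H (Table 1)
exact = efficiency_gain((15, 15, 15, 15), v_h)    # threshold met exactly
excess = efficiency_gain((20, 20, 20, 20), v_h)   # provision with 20 excess points
failure = efficiency_gain((10, 10, 10, 10), v_h)  # contributions are sunk, good not provided
```

Under these assumptions the measure equals 1 for exact provision, falls below 1 with excess contributions, and turns negative when a group fails to reach the threshold while having sunk positive contributions, which would be consistent with the large standard deviations of the efficiency gain reported in Table 3.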
In analogy to the analysis of success rates, without punishment, the Homogeneous treatment easily outperforms both Heterogeneous treatments (p < 0.010, negative binomial regression).[11] Again there is no difference between HE and HB (p = 0.696).

For an analysis of profits, remember that we had two types of players in our Heterogeneous treatments, high types and low types. Table 4 provides summary statistics. As a simple regression shows, there are no significant between-treatment differences in low type earnings (p ≥ 0.260).[12] High types earned more than low types in both Heterogeneous treatments (p < 0.001), and high types in HE earned more than high types in HB (p < 0.001). Subjects were able to improve significantly upon their initial endowments across treatments (p < 0.031), with the exception of high types in HE (p = 0.974).[13] As becomes clear from Table 4, though, high types earned considerably less than they would have assuming equal contributions. Correspondingly, looking at the low types’ contribution shares (last column of Table 3) we see that in both HE and HB, low types contributed significantly less than high types (p < 0.001), in fact slightly less than half as much (TEST).

Before concluding this section, let us briefly touch upon time trends. Probit regressions show that success rates trended up in all treatments, but significantly only in HB (p < 0.006). Excess contributions trended downward significantly in all treatments (p < 0.022, negative binomial regression). As a consequence, efficiency trended upwards in both Heterogeneous treatments (p < 0.017, Tobit/Probit). Hence groups to some extent learned to reduce frictions over time.

To summarize the aggregate results, we find that heterogeneity causes frictions which drive down success rates relative to the homogeneous case, replicating previous work by Rapoport and Suleiman [1993].
High types are able to capitalize somewhat on their advantageous position even though they contribute considerably more to the public good on average than low types.

Table 3: Aggregate statistics. Means of variables (standard deviations in parentheses).

| Treatment | Success rate | c | c (success) | Eff. gain | Low-type contribution share |
|-----------|--------------|---|-------------|-----------|-----------------------------|
| H  | .78 (.4149384) | 13.73917 (4.961685) | 15.11218 (2.207721) | .6440556 (.674346) | . |
| HE | .553125 (.4979484) | 12.86797 (8.797731) | 15.67655 (8.158692) | .2483854 (.8213708) | .4584473 (.1127259) |
| HB | .534375 (.4995982) | 12.9625 (8.653534) | 15.8962 (7.80412) | .3694792 (.65353) | .4618588 (.1260245) |

[11] The result is even stronger when running a Probit on an indicator for whether the contribution vector was efficient, conditional on success, p < 0.001 each way.
[12] In H, all players are counted as low types.
[13] Remember that high types in HE had initial endowments of 60 points whereas all others had initial endowments of 30 points.

Table 4: Earnings by treatment and type. Means of variables (standard deviations in parentheses).

| Treatment | low types | high types |
|-----------|-----------|------------|
| H  | 39.66083 (10.71805) | . |
| HE | 37.40156 (13.77438) | 60.05 (14.03664) |
| HB | 36.8 (14.13729) | 45.36875 (27.20228) |

5.2 Aggregate statistics - normative data

Before explaining the punishment mechanism (where applicable) and starting the 20-round interaction, we canvassed participants’ normative viewpoints by asking them to implement an allocation for a randomly selected other group (as part of the Elicitation task). Table 5 summarizes the responses we obtained by treatment and type. From the first two columns we see immediately that subjects a priori prescribed higher contributions for high types, and this difference is always highly significant (robust t-test p < 0.001). As a corollary,[14] subjects in HE and HB prescribed lower contributions for low types than subjects in H (p < 0.001).
Prescribed contribution levels did not differ between the Heterogeneous treatments (p = 0.257 for high type contributions, p = 0.132 for low types). Looking at the implied contribution shares for low types (column (3)), we find that at slightly above 25% they are much lower than the contribution shares actually observed in the game (cf. the last column of Table 3).

Table 5: Choices in allocation prescription task - prescribed contribution levels. Means of variables (standard errors in parentheses).

Treatment  (1) cH               (2) cL               (3) Low-type share
H          .                    14.95968 (1.964726)  .
HE         22.03516 (5.472258)  8.332031 (5.673629)  .2705474 (.1775873)
HB         22.37868 (5.765638)  8.095588 (6.394012)  .2617006 (.1944308)

Table 6 takes a closer look at the (distribution of) specific characteristics in the prescribed allocations. First, we find that virtually no one prescribed the zero contributions vector (column (1)). Second, virtually everyone prescribed a successful allocation (column (2)). And third, virtually everyone prescribed an efficient contribution vector, that is, one that is successful with zero excess contributions (column (3)). Moreover, and importantly, most subjects prescribed vertically (“VF”) and horizontally fair (“HF”) allocations (columns (4) and (5)), and typically both (column (6)). Our candidate normative allocations (Equal Contributions or “EQC”, Equal Earnings or “EQE” and Proportionality or “PRO”) form a subset of all vertically and horizontally fair allocations. Together they account for 71.9% of all prescribed allocations in HE, 76.6% in HB and 93.3% in H (column (10)). Thus the vast majority of participants prescribed contribution vectors that we had ex ante classified as normatively “preferable” even among those allocations that, in being vertically and horizontally fair, already heeded basic normative principles.

14 Subjects rarely ever prescribed overcontribution, see below.

Figure 1: Prescribed allocations, by treatment. Size of circles indicates frequency of allocation.
However, it is also at this point that the anticipated normative disagreement takes hold: while in H, all mass is naturally on equal contributions,15 we see disagreement in both HE and HB (columns (7)-(9)). Even though a plurality rooted for Proportionality (45.3% in HE and 34.4% in HB), Equal Earnings are strongly present (HE: 17.2%; HB: 29.7%), as are Equal Contributions (HE: 9.4%; HB: 12.5%). HE and HB do not differ significantly despite the seemingly stronger focus on Proportionality in HE and Equal Earnings in HB (χ² test, p = 0.193).16 Figure 1 summarizes our results graphically.

As column (12) of Table 6 shows, another popular allocation was “25250505” (high types contribute 25 points each, low types 5 points). Whether this was meant as a compromise between proportionality and equal earnings, or selected mainly owing to its focality, is unclear. Certainly normative concerns must have played some role, however, since allocations that were focal (in the sense of prescribing multiples of 5) but not both vertically and horizontally fair were virtually never selected.

15 There being no high types in H, all three candidate allocations collapse into one.
16 Note that the claim made by Rapoport and Suleiman [1993] on subjects’ preferences for tying contributions to endowments proportionally is not generally supported. Such attitudes would uniquely select Proportionality in HE and Equal Contributions in HB.

Table 6: Properties of choices in allocation prescription task. Percentages (expressed as fractions of 1).

Treatment  (1) z     (2) Success  (3) Efficient  (4) VF   (5) HF  (6) VF & HF
H          .0166667  .9833333     .9833333       .        .95     .
HE         0         .984375      .953125        .96875   .96875  .953125
HB         0         1            .96875         .984375  .96875  .96875

Treatment  (7) EQC   (8) PRO  (9) EQE  (10) Norm.  (11) Norm. or 25250505  (12) 25250505
H          .9333333  .        .        .9333333    .9333333                0
HE         .09375    .453125  .171875  .71875      .84375                  .125
HB         .125      .34375   .296875  .765625     .859375                 .09375
Including “25250505” in our list of desirable (and focal) allocations, we now see that that list explains more than 82% of prescriptions in each treatment (column (11)).17

The above results suggest that when it comes to equilibrium selection, common notions of fairness and/or equity play a strong role, and revealed preferences mostly fall into one of the three categories outlined in our discussion on equilibrium selection through normative arguments. A caveat could be that these allocations are also numerically focal (as is “25250505”). However, in our online appendix we present evidence for our normative hypothesis from a survey very similar in structure to the (first part of the) Elicitation part, but where our normative candidate allocations are not focal in terms of raw numbers. At any rate, the fact that allocations outside the domain of horizontal and vertical fairness (including several numerically focal ones) were essentially never chosen adds strength to our claim. The selected allocations are focal at least in part because they follow a particular principle, or a combination of principles. This can be seen in the spirit of Schelling [1960].

It should be noted that we find some evidence of self-serving bias, since in both HE and HB high types are more likely to prescribe the equal contributions allocation and less likely to prescribe the equal earnings allocation than low types (p < 0.041, multinomial Logit regression).18 However, normative disagreement remains even after controlling for types, as Table 7 shows.

Table 7: Selection between fair/equitable allocations, by treatment and type. Percentages of total (rows).

Treatment, type  EQC   PRO   EQE
HE, low          4.17  62.5  33.3
HE, high         22.7  63.6  13.6
HB, low          4.4   39.1  56.5
HB, high         26.9  50    23.1

In addition to our attempt at eliciting normative standpoints, we elicited participants’ beliefs about other participants’ standpoints. A detailed discussion of this variable is relegated to the online appendix.
For the moment, it suffices to note that on aggregate, beliefs were fairly consistent with reality and normative disagreement was relatively well anticipated, in the sense that participants did obviously consider the possibility that others might disagree (they did not fully project their own beliefs onto others, though). Table 8 illustrates this result.

17 Allocations that were both vertically and horizontally fair made up more than 95% of all allocations in both HE and HB.
18 More generally, a Tobit regression finds that high types prescribe significantly higher contribution shares for low types, p < 0.002.

Table 8: Relative frequencies of guessed prescribed allocations, by own prescribed allocation (conditioning on normative allocations).

            Own EQC  Own PRO  Own EQE
Expect EQC  4.17     4.4      26.9
Expect PRO  62.5     39.1     50
Expect EQE  33.3     56.5     23.1

To summarize, we find that despite near unanimous agreement on the desirability of efficiency, vertical and horizontal fairness, our data point to normative disagreement within the boundaries of the (large) set of allocations consistent with these desiderata. In particular, the allocations that were focal or consistent with more specific fairness/equity considerations all figure prominently, highlighting the potential for miscoordination and conflict in the Heterogeneous treatments.

5.3 Choices of individual groups

5.3.1 Categorization

We now turn to examining how groups in our different treatments solved the coordination problem they faced. In particular, we are interested in the extent to which normative, or focal, aspects and (in the Heterogeneous treatments) disagreement would influence choices. Therefore, Table 9 classifies groups’ contribution vectors, conditional on success, in the spirit of our earlier normative analysis. Contribution efficiency (i.e. zero overcontribution, column (1)) is substantially lower in the Heterogeneous treatments than in the Homogeneous cases.
These results (also statistically) mirror earlier findings on excess contributions. Slightly more interesting are the low types’ contribution shares (column (2)). While still comfortably above the levels reported in the “normative” part of the experiment, from Table 3 we can infer that low types contributed relatively less in case of success than in case of failure (albeit marginally, p < 0.085 in a pooled Tobit regression). This might point to high types being more likely than low types to trigger failure.19

Next, we turn to horizontal and vertical fairness. While most successful allocations were vertically fair in all heterogeneous treatments (column (4)), and horizontally fair in the Homogeneous treatments, less than half were exactly horizontally fair in the Heterogeneous treatments (column (3)), generating a low upper bound for allocations that were both vertically and horizontally fair (column (5)). This could either mean that horizontal fairness was a low-value chip on the bargaining table, or might reflect normative disagreement within types, which we pointed out earlier. In that case, less horizontal fairness should be associated with a lower success rate, which is what we find (p < 0.032).

Nearly all allocations that were both horizontally and vertically fair corresponded to one of our normative candidates. Columns (7)-(9) report the weights groups gave to our more specific candidates for norms (column (6) reports their sum). Column (10) captures the vector “25250505”, which showed strongly in the normative data but was never chosen in the actual interaction. Equal Earnings allocations are very scarce, too (column (9)). Proportionality (column (8)) remains popular, and is in fact the most popular allocation in HE (in HB, the Equal Contributions allocation wins the race).

19 The difference disappears statistically when breaking the analysis down by treatment.
Still, Equal Contributions allocations (column (7)) are less prevalent than indicated in the normative data in all treatments. Overall, at 20.4%, Proportionality is easily the most frequently chosen successful allocation in the Heterogeneous treatments, followed at some distance by Equal Contributions (10.5%). No other allocation accounts for more than 5% of observations.

To move from mere frequencies to a story about focal points and norms, one needs to show that groups did not merely make random draws from the distribution we just sketched, but rather that successful groups managed to agree on, and stick with, or at least closely around, a particular allocation once reached, at least for a while. Figure 2 indicates that this is exactly what happened in our experiment. The grey bars capture success rates (right axis) whereas the squares and diamonds plot individuals’ mean contribution levels in case of success (left axis). More successful groups clearly display more stable contribution patterns in all treatments. Evidence of the link between stability and success within groups is presented in Figure 3 and reinforced by regression analysis. More successful groups display less (within-group) contribution variance (p < 0.001, Tobit).

Turning to variation between groups, we find that while in H all successful groups coordinate on the Equal Contributions allocation, in HE and HB there is much more variance between (the averages of) groups’ successful contribution vectors (p < 0.001, Conover squared-rank tests for between-group variance). This evidences the selection problem, and strongly suggests that the best way for groups to solve the problem was to erect a stable contributions norm. What we shall show in the next section, however, is that some contribution norms were better than others.

5.3.2 Survival analysis

We further investigate the factors influencing emergence and stability of successful choices with a survival analysis.
This has several advantages over standard Tobit/Probit regressions involving lagged values or simple counting regressions. First, one avoids the problem of having to arbitrarily choose the number of lags and hence the “memory” of the underlying stochastic process. Second, a survival analysis easily accommodates multiple spells per group and can take into account the order of spells when estimating the hazard rate. Third, the evolution of the hazard rate over time can be studied, i.e. whether success is self-stabilizing, has a tendency to unravel, or even has a Markov property.20

Table 9: Properties of successful contribution vectors, by treatment. Percentages of total (expressed as fractions of 1).

Treatment  (1) Efficient  (2) Low-type share  (3) HF    (4) VF    (5) HF & VF
H          .9059829       .                   .8632479  .         .
HE         .6158192       .3561357            .3728814  .7627119  .3728814
HB         .5263158       .359527             .3508772  .8421053  .3450292

Treatment  (6) Normative  (7) EQC   (8) PRO   (9) EQE   (10) “25250505”
H          .8632479       .8632479  .         .         0
HE         .3728814       .0734463  .299435   0         0
HB         .2807018       .1345029  .1052632  .0409357  0

Figure 2: Contribution behavior and success rates within groups. Error bars reflect standard errors of the mean.

Figure 3: Stability and success.

We use the semiparametric Cox PH model as our baseline, whose validity rests on a single easily testable assumption and which is hence more robust than parametric specifications, supplementing parametric tests wherever necessary. First, let a group become at risk when it is either successful for the first time, or returns to success after at least one failure. Let the failure event be the first period after becoming at risk in which the group fails. Obviously all groups at risk in period 20 cease to be at risk thereafter, and independent censoring is assumed.21 Our initial covariates of choice are treatment type indicators. Figure 4 illustrates the regression results, displaying estimated hazard functions by treatment. We find that heterogeneity has a destabilizing effect on success (HE: p < 0.006; HB: p < 0.003).
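The at-risk and failure definitions given above can be illustrated with a minimal sketch that turns one group's period-by-period success record into success spells. The function name, the output layout and the example history are hypothetical and only serve the illustration; the paper's actual estimation code is not shown in the text.

```python
from typing import List, Tuple


def success_spells(success: List[int]) -> List[Tuple[int, int, bool]]:
    """Return (start_period, duration, failed) for each success spell.

    A group becomes at risk in the first period of a run of successes.
    The spell ends with the first failure afterwards (failed=True).
    A spell still running in the last period is right-censored
    (failed=False). Periods are numbered from 1.
    """
    spells = []
    start = None
    for period, s in enumerate(success, start=1):
        if s and start is None:
            start = period                                # group becomes at risk
        elif not s and start is not None:
            spells.append((start, period - start, True))  # failure event
            start = None
    if start is not None:                                 # censored at the final period
        spells.append((start, len(success) - start + 1, False))
    return spells


# Hypothetical 20-period history of one group (1 = threshold reached).
history = [0, 1, 1, 0, 0, 1, 1, 1, 1, 0] + [1] * 10
print(success_spells(history))
```

Each tuple corresponds to one spell, so a group can contribute several observations, and the censoring flag marks spells that were still intact in period 20.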
Importantly, the hazard rate is downward sloping over the duration of a success spell, meaning that success has a self-stabilizing property (the longer a group is successful, the more likely it is to remain successful). There is no measurable difference between the Heterogeneous treatments (p = 0.974). The Cox PH model does not supply an estimate for the slope of the hazard function, but parametric specifications based on Weibull and log-logistic distributions back this finding up statistically (p < 0.012).22

A natural question to ask in our context is whether normative desirability (or focality within the confines of basic normative principles) enhances longevity of success spells. Given previous results we pool the Heterogeneous treatments and include a dummy for whether, when becoming at risk, the group chose a vertically fair allocation, as well as a dummy indicating that the allocation was both vertically and horizontally fair.23 24 The latter indicator happens to capture our three candidate normative allocations. The regression demonstrates that vertical fairness alone has virtually no effect on the hazard rate (p = 0.907), while the pivotal effect of horizontal fairness (given vertical fairness) is highly significant and of the expected negative sign (p < 0.001). In combination, therefore, vertical and horizontal fairness greatly increase the expected duration of a success spell (p < 0.001). In fact, taking a step further and running the original treatment regression conditional on success spells starting with vertically and horizontally fair allocations, all previously significant effects disappear (F-test: p = 0.896). Hence, success is less stable in the Heterogeneous treatments primarily because some groups are unable to settle on an allocation that heeds basic normative principles. The question that remains to be answered, then, is whether this inability to coordinate can be linked statistically to ex-ante normative disagreement.

20 For comparison, a Probit or Tobit including a single lag would impose the Markov property on the data-generating process.
21 This assumption is at least indirectly testable, see below.
22 All other results are equally confirmed.
23 There was only one observation of a successful group which was not vertically fair but horizontally fair, hence a proper interaction design is impracticable.
24 We could have included indicators for each period (the Cox PH model accommodates time-varying covariates), but this creates endogeneity issues (rate dependence and state dependence) and hence we restricted ourselves to the more robust approach of proxying with the initial indicators. Results are even stronger with time-varying covariates.

Figure 4: Survival analysis of success. Multiple spells per group.

Another approach emphasizing the importance of our normative allocations is to compare them to other efficient allocations, again in a Cox PH setup. While efficiency alone greatly increases survival chances (p < 0.004), there is an additional significant downward shift of the hazard function for groups starting exactly with one of the three normative allocations (p < 0.006). Hence even conditional on being on the Pareto frontier, being at a normative allocation made a substantial difference.

We now change the definition of a failure event slightly to address a closely related but slightly different question. We are interested in exact stability, i.e. for how long a group remains at exactly the same successful contributions vector. This matters in our discussion because such exact stability is one key characteristic of norms as defined in the literature.25 We use the Cox PH model again and run the exact same regressions as before.
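As an illustration of the exact-stability notion just introduced, the following sketch counts for how many consecutive periods a group stays at exactly the same successful contribution vector. It rests on assumptions that the text does not spell out: a spell is taken to end as soon as the vector changes or the group fails, vectors are ordered (high, high, low, low), and the example data are invented.

```python
from typing import List, Optional, Tuple

Vector = Tuple[int, ...]


def exact_spells(vectors: List[Optional[Vector]]) -> List[Tuple[Vector, int, bool]]:
    """Return (vector, duration, ended) for each exact-stability spell.

    vectors[t] is the group's contribution vector in period t+1 if the
    group was successful in that period, and None otherwise. A spell ends
    (ended=True) when the vector changes or the group fails; a spell that
    is still running in the final period is censored (ended=False).
    """
    spells = []
    current: Optional[Vector] = None
    length = 0
    for v in vectors:
        if v is not None and v == current:
            length += 1                               # same vector repeated
            continue
        if current is not None:
            spells.append((current, length, True))    # vector changed or group failed
        current, length = v, (1 if v is not None else 0)
    if current is not None:
        spells.append((current, length, False))       # censored at the end
    return spells


# Hypothetical record: None marks a period in which the group failed.
record = [None, (20, 20, 10, 10), (20, 20, 10, 10), (20, 20, 10, 10),
          (20, 20, 12, 10), None, (15, 15, 15, 15), (15, 15, 15, 15)]
print(exact_spells(record))
```

Compared with success spells, this definition is stricter: a group that stays successful but adjusts its contributions starts a new spell.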
As in the case of success spells, heterogeneity undermines exact stability. Between Heterogeneous treatments there is again no difference (p ≥ 0.868). The hazard rate is even more strongly declining over time than in the survival of success (Weibull and log-logistic p < 0.001). Crucially in light of the definition of a norm, we again find the indicator for horizontal and vertical fairness to be highly significant in enhancing stability (p < 0.001 for both the incremental effect of horizontal fairness given vertical fairness, and the combined effect) while vertical fairness alone (p = 0.150) and punishment (p = 0.343) are not. Again, conditioning the treatment regression on vertically and horizontally fair allocations eliminates all previously measured effects (F-test p = 0.366). Moreover, again being normative trumps just being (Pareto-)efficient (p < 0.001), which in turn gives higher survival chances than other, inefficient successful allocations (p < 0.006).

25 As an aside, it should be noted that even allowing for arbitrary transition length, no group ever switched from one normative/focal allocation to another. Hence those particular allocations would not seem to have been regarded as easy substitutes for one another.

Summarizing the above, we find that heterogeneity acts as a destabilizing force even when concentrating on success. Interestingly, success was self-stabilizing over time. Groups in Heterogeneous treatments that managed to coordinate on normatively appealing allocations were as stable and successful as their Homogeneous cousins. This entails that the main friction caused by heterogeneity lies in some groups’ apparent inability to agree on one such allocation, and our next task will be to relate this disagreement to the self-reported preferred allocations we collected at the beginning and at the end of our sessions.
5.4 Normative data and allocation choice

Both before and after the main experiment, we asked subjects to (a) prescribe an allocation to a randomly chosen other group and (b) convey their beliefs about their group members’ prescriptions. This provides a basis for linking actual group behavior to individuals’ normative viewpoints and normative expectations, and in case of significant results further testifies to the importance of (fairness) norms, and/or perhaps focal points, in solving complex coordination problems. We approach the data in three ways: first, by linking group frequencies of prescriptions and expectations to group frequencies of actual choices; second, by constructing summary measures and comparing those; and third, by linking a measure of normative disagreement to success or, more generally, group behavior. In all the analysis that follows, we restrict attention to the (pooled) Heterogeneous treatments. Results are much stronger when including the Homogeneous treatments, but would be misleading since there was no normative conflict or focal ambiguity in those treatments.

We begin with the ex-ante normative data and look at the link between normative data and group choices. Specifically, we correlate the number of times a group chose a particular “normative” allocation (equal contributions or proportionality, all others are too infrequent) with the head counts of that and other “normative” or focal allocations in the normative data. Parametric regressions give considerable trouble: first, as seen in Table 6, most prescriptions fall into one of four normative or focal categories, so including all categories leads to a multicollinearity problem and produces odd coefficients. Second, Tobit and negative binomial regressions on reduced sets of covariates produce more reasonable coefficients but very bad fits of the data. We hence report simple (Spearman) rank-correlation coefficients instead.
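For readers unfamiliar with the statistic, a Spearman coefficient is the Pearson correlation of the ranks, with tied observations receiving their average rank. The sketch below computes it from scratch for two per-group count variables; the function names and the numbers are made up for illustration and are not the experimental data.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1              # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-group data: number of periods in which the group chose
# Proportionality, and number of members expecting others to prescribe it.
times_chosen = [0, 3, 12, 5, 0, 8]
head_count = [1, 2, 4, 2, 0, 3]
print(round(spearman(times_chosen, head_count), 3))
```

Because only ranks enter, the coefficient is insensitive to the skewed, zero-heavy distribution of choice frequencies that troubles the parametric regressions.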
The results are in Table 10, where the left columns focus on subjects’ prescriptions and the right columns use normative expectations. The frequency of “Equal contributions” allocations chosen in the game correlates negatively with the frequency of “Proportionality” in the normative data, and positively with the frequency of “Equal contributions”. The converse holds for the frequency of “Proportionality” allocations chosen during the 20-period interaction. Moreover, expectations seem to be more important, since coefficients and significance levels are clearly higher than those from prescribed allocations.

Table 10: Ex-ante normative data and normative allocations. Spearman correlation coefficients between frequencies (p-values in parentheses).

                     Contribution vector: Proportional   Contribution vector: Equal contributions
Normative category   Prescriptions    Expectations       Prescriptions    Expectations
Equal contributions  -0.054 (0.665)   -0.209 (0.092)     0.078 (0.532)    0.318 (0.010)
Proportionality      0.039 (0.757)    0.291 (0.018)      -0.084 (0.505)   -0.287 (0.020)
Equal earnings       -0.156 (0.211)   -0.059 (0.639)     0.039 (0.754)    -0.076 (0.543)
“25250505”           0.208 (0.094)    -0.015 (0.905)     -0.127 (0.312)   -0.246 (0.046)

Next, we run a Tobit regression of realized low type contribution shares on prescribed low type contribution shares (including expectations). The coefficient is positive and strongly significant (p < 0.001). Thus, groups whose members prescribed (and expected others to prescribe) higher low type contribution shares ex-ante also ended up choosing allocations with higher low type contribution shares. Again, expectations are better predictors of behavior (p < 0.023) than are prescriptions (p = 0.161).

Finally, we construct a measure of normative disagreement as follows: pooling prescription and expectational data, we compute group standard deviations of low types’ contribution shares. We then try to link this measure to either the group’s success rate (regression 1) or the frequency of “normative” allocations (pooled) among that group’s choices (regression 2).
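The disagreement measure described above can be sketched as follows. Two choices in the sketch are assumptions rather than the paper's stated definitions: the low-type share is taken to be cL / (cL + cH), and the sample (n - 1) standard deviation is used. The example allocations are likewise invented.

```python
from statistics import stdev
from typing import List, Tuple

# (c_H, c_L): contribution asked of each high type and of each low type.
Allocation = Tuple[float, float]


def low_share(alloc: Allocation) -> float:
    """Low types' contribution share, assumed here to be c_L / (c_L + c_H).

    The all-zero vector would need separate handling and is not covered.
    """
    c_h, c_l = alloc
    return c_l / (c_h + c_l)


def disagreement(prescriptions: List[Allocation],
                 expectations: List[Allocation]) -> float:
    """Group-level standard deviation of low-type shares, pooling
    members' own prescriptions and their expectations about others."""
    shares = [low_share(a) for a in prescriptions + expectations]
    return stdev(shares)


# Hypothetical four-member groups.
uniform = disagreement([(20, 10)] * 4, [(20, 10)] * 4)
mixed = disagreement([(20, 10), (15, 15), (25, 5), (20, 10)],
                     [(20, 10), (20, 10), (15, 15), (25, 5)])
print(uniform, mixed)
```

A group whose members all prescribe and expect the same allocation scores zero, while a mix of Equal Contributions, Proportionality and “25250505” views yields a strictly positive value.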
In both cases the coefficient is negative, indicating a detrimental effect of ex-ante disagreement on group success, but the effect is not significant (regression 1: p = 0.749; regression 2: p = 0.359).

Turning to the ex-post normative data, results are somewhat stronger. Table 11 replicates Table 10 and the correlations are qualitatively identical but quantitatively and statistically stronger. We run another set of Tobit regressions, this time of (ex-post) prescribed and expected low type contribution shares on realized low type contribution shares (to respect the timing of events). The coefficient from the full regression is strongly positive (p < 0.001) and robust to the exclusion of either prescriptions (p < 0.001) or belief data (p < 0.006). Finally, we regress our measure of (ex-post) normative disagreement on the success rate (regression 1) and the realized frequency of “normative” allocations (regression 2). The coefficient is negative in both regressions, and the results are statistically noticeably stronger than those obtained using the ex-ante normative data (regression 1: p < 0.095; regression 2: p < 0.019). Significance in regression 2 is driven mainly by normative expectations (p < 0.015) and less by subjects’ own prescriptions (p = 0.148).

Table 11: Ex-post normative data and normative allocations. Spearman correlation coefficients between frequencies (p-values in parentheses).

                     Contribution vector: Proportional   Contribution vector: Equal contributions
Normative category   Prescriptions    Expectations       Prescriptions    Expectations
Equal contributions  -0.325 (0.008)   -0.422 (0.000)     0.443 (0.000)    0.495 (0.000)
Proportionality      0.493 (0.000)    0.725 (0.000)      -0.176 (0.156)   -0.309 (0.012)
Equal earnings       -0.023 (0.857)   -0.210 (0.091)     0.069 (0.581)    0.032 (0.798)
“25250505”           0.045 (0.720)    -0.015 (0.905)     -0.130 (0.299)   -0.246 (0.046)

We conclude this section by observing that we find some, but not strong, links between normative data, in particular normative expectations, on the one hand, and realized allocations in a group on the other.
That the expectations component should be stronger is entirely consistent with the characterization of a norm (and can separate norms from intrinsic other-regarding, or distributional, preferences). However, while there is a reliable link between allocations prescribed and anticipated in the Elicitation part and group behavior (possibly a case of self-fulfilling expectations), there is no apparent link between normative disagreement and success. Results are stronger for the ex-post data, suggesting experience with the interaction informed both individual normative viewpoints and normative expectations.

We therefore cannot convincingly confirm our core hypothesis that ex-ante normative disagreement drives down success rates, despite the close correspondence on aggregate between prescribed, expected and actual successful allocations. However, there is ample evidence that normatively appealing allocations, once agreed on, do fare much better than other equilibria.

6 Conclusion

We studied coordination frictions, and the importance of contribution norms, in step-level public good games with large equilibrium sets and heterogeneous agents. Heterogeneity was with respect to wealth in one treatment, and with respect to the benefit from the public good in the other. We showed that heterogeneity creates frictions on aggregate. An elicitation task and a questionnaire revealed that individuals held, and expected others to hold, well defined yet conflicting normative views of fair contribution rules related to efficiency, equality, and equity. Successful groups agreed on, and then stuck with, a normatively appealing allocation with focal properties that can be derived from first principles. Moreover, normative viewpoints and expectations (as elicited ex-ante) predict group behavior. However, we cannot confirm the hypothesis of a link between ex-ante normative disagreement and coordination failure in the interaction.

References

Lisa R. Anderson, Jennifer M. Mellor, and Jeffrey Milyo. Inequality and public good provision: An experimental analysis. The Journal of Socio-Economics, 37:1010–1028, 2008.

Mark Bagnoli and Michael McKee. Voluntary contribution games: Private provision of public goods. Economic Inquiry, 29:351–366, 1991.

Cristina Bicchieri. The Grammar of Society: The Nature and Dynamics of Social Norms. Cambridge University Press, New York, 2006.

Ken Binmore. Playing Fair: Game Theory and the Social Contract I. MIT Press, Cambridge, MA, 1994.

Ken Binmore. Just Playing: Game Theory and the Social Contract II. MIT Press, Cambridge, MA, 1998.

Gary E. Bolton and Axel Ockenfels. A theory of equity, reciprocity, and competition. American Economic Review, 90:166–193, 2000.

Edward Buckley and Rachel T. A. Croson. Income and wealth heterogeneity in the voluntary provision of linear public goods. Journal of Public Economics, 90:935–955, 2006.

Gary Charness and Matthew Rabin. Understanding social preferences with simple tests. Quarterly Journal of Economics, 117:817–869, 2002.

James S. Coleman. Foundations of Social Theory. Harvard University Press, Cambridge, 1990.

James C. Cox, Daniel Friedman, and Steven Gjerstad. A tractable model of reciprocity and fairness. Games and Economic Behavior, 59(1):17–45, 2007.

Rachel T. A. Croson and Melanie Beth Marks. Step returns in threshold public goods: A meta- and experimental analysis. Experimental Economics, 2:239–259, 2000.

Martin Dufwenberg and Georg Kirchsteiger. A theory of sequential reciprocity. Games and Economic Behavior, 47:268–298, 2004.

Armin Falk and Urs Fischbacher. A theory of reciprocity. Games and Economic Behavior, 54:293–315, 2006.

Ernst Fehr and Klaus M. Schmidt. A theory of fairness, competition, and cooperation. Quarterly Journal of Economics, 114:817–868, 1999.

Urs Fischbacher. z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10(2):171–178, 2007.

Dean P. Foster and H. Peyton Young. Stochastic evolutionary game dynamics. Theoretical Population Biology, 38:219–232, 1990.

Russell Hardin. Group provision of step goods. Behavioral Science, 21, 1976.

R. Mark Isaac, David Schmitz, and James M. Walker. The assurance problem in a laboratory market. Public Choice, 62:217–236, 1989.

James Konow. Which is the fairest one of all? A positive analysis of justice theories. Journal of Economic Literature, 41:1186–1237, 2003.

James Konow, Tatsuyoshi Saijo, and Kenju Akai. Morals and mores: Experimental evidence on equity and equality. Working paper, Loyola Marymount University, 2009.

David K. Levine. Modeling altruism and spitefulness in experiments. Review of Economic Dynamics, 1:593–622, 1998.

Therese Lindahl and Magnus Johannesson. Bargaining over a common good with private information. Scandinavian Journal of Economics, 111:547–565, 2009.

Marcin Pęski. Generalized risk dominance and asymmetric dynamics. Journal of Economic Theory, 145:216–248, 2010.

Matthew Rabin. Incorporating fairness into game theory and economics. American Economic Review, 83:1281–1302, 1993.

Amnon Rapoport. Provision of step-level public goods: Effects of inequality in resources. Journal of Personality and Social Psychology, 54, 1988.

Amnon Rapoport and Ramzi Suleiman. Incremental contribution in step-level public goods games with asymmetric players. Organizational Behavior and Human Decision Processes, 55:171–194, 1993.

Ernesto Reuben and Arno Riedl. Enforcement of contribution norms in public good games with heterogeneous populations. Working Paper, 2011.

Thomas C. Schelling. The Strategy of Conflict. Harvard University Press, Cambridge, MA, 1960.

Robert Sugden. The Economics of Rights, Cooperation and Welfare. Basil Blackwell, Oxford, 1986.

Alfons J. C. van de Kragt, John M. Orbell, and Robyn M. Dawes. The minimal contributing set as a solution to public goods problems. The American Political Science Review, 77, 1983.

H. Peyton Young. The evolution of conventions. Econometrica, 61:57–84, 1993.

H. Peyton Young. Conventional contracts. Review of Economic Studies, 65, 1998.

H. Peyton Young. Social norms. In Steven N. Durlauf and Lawrence E. Blume, editors, The New Palgrave Dictionary of Economics, volume 7, pages 647–651. Palgrave Macmillan, Basingstoke, 2008.