Massachusetts Institute of Technology
Department of Economics Working Paper Series

FRAGILITY OF ASYMPTOTIC AGREEMENT UNDER BAYESIAN LEARNING

Daron Acemoglu, Victor Chernozhukov, and Muhammet Yildiz

Working Paper 08-09
March 15, 2008

Room E52-251, 50 Memorial Drive, Cambridge, MA 02142

This paper can be downloaded without charge from the Social Science Research Network Paper Collection at http://ssrn.com/abstract=1112855

Abstract

Under the assumption that individuals know the conditional distributions of signals given the payoff-relevant parameters, existing results conclude that, as individuals observe infinitely many signals, their beliefs about the parameters will eventually merge. We first show that these results are fragile when individuals are uncertain about the signal distributions: given any such model, a vanishingly small individual uncertainty about the signal distributions can lead to a substantial (non-vanishing) amount of differences between the asymptotic beliefs. We then characterize the conditions under which a small amount of uncertainty leads only to a small amount of asymptotic disagreement. According to our characterization, this is the case if the uncertainty about the signal distributions is generated by a family with "rapidly-varying tails" (such as the normal or the exponential distributions). However, when this family has "regularly-varying tails" (such as the Pareto, the log-normal, and the t-distributions), a small amount of uncertainty leads to a substantial amount of asymptotic disagreement.

Keywords: asymptotic disagreement, Bayesian learning, merging of opinions.
JEL Classification: C11, C72, D83.

*An earlier version of this paper was circulated under the title "Learning and Disagreement in an Uncertain World". We thank Eduardo Faingold, Greg Fisher, Bengt Holmstrom, Matthew Jackson, Drew Fudenberg, Alessandro Lizzeri, Giuseppe Moscarini, Marciano Siniscalchi, Robert Wilson and seminar participants at MIT, the University of British Columbia, the University of Illinois at Urbana-Champaign, Stanford and Yale for useful comments and suggestions.

†Department of Economics, Massachusetts Institute of Technology.

1 Introduction

The common prior assumption is one of the cornerstones of modern economic analysis. Most models postulate that the players in a game have the "same model of the world," or more precisely, that they have a common prior about the game form and payoff distributions; for example, they all agree that the state (payoff-relevant parameter vector) $\theta$ is drawn from a known distribution $G$, even though each may also have additional information about some components of $\theta$. The typical justification for the common prior assumption comes from learning: individuals, through their own experiences and the communication of others, will have access to a history of events informative about the state $\theta$, and this process will lead to "agreement" among individuals about the distribution of $\theta$. A strong version of this view is expressed in Savage (1954, p. 48) as the statement that a Bayesian individual, who does not assign zero probability to "the truth," will learn it eventually as long as the signals are informative about the truth.
An immediate implication of this result is that two individuals who observe the same sequence of signals will ultimately agree, even if they start with very different priors.

Despite this powerful intuition, disagreement is the rule rather than the exception in practice. For example, there is typically considerable disagreement even among economists working on a certain topic. Similarly, there are deep divides about religious beliefs within populations with shared experiences. In most cases, the source of disagreement does not seem to be differences in observations or experiences. Instead, individuals appear to interpret the available data differently. For example, an estimate showing that subsidies increase investment is interpreted very differently by two economists starting with different priors; an economist believing that subsidies have no effect on investment appears more likely to judge the data or the methods leading to this estimate to be unreliable and thus to attach less importance to this evidence.

In this paper, we investigate the outcome of learning about an underlying state by two Bayesian individuals when they are possibly uncertain about the conditional distributions (or interpretations) of signals. This leads to a potential identification problem, as the same long-run frequency of signals may result from different combinations of payoff-relevant variables and different interpretations of the signals. Hence, even though the individuals will learn the asymptotic frequency of signals, they may not always be able to infer the state $\theta$, and initial differences in their beliefs may translate into differences in asymptotic beliefs. When the amount of uncertainty is small, the identification problem is also small, in the sense that each individual finds it highly likely that he will eventually assign high probability to the true state. One may then expect that the asymptotic beliefs of the two individuals about the underlying state should be close as well. If so, the common prior assumption would be a good approximation when players have a long common experience and face only a small amount of uncertainty about how the signals are related to the states.

Our focus in this paper is to investigate the validity of this line of argument. In particular, we study whether asymptotic agreement is continuous at certainty: asymptotic agreement is continuous at certainty if a small amount of uncertainty leads only to a small amount of disagreement asymptotically. Our main result shows that asymptotic agreement is discontinuous at certainty for every model: for every model there is a vanishingly small amount of uncertainty that is sufficient for each individual to assign probability nearly 1 to the event that they will asymptotically hold significantly different beliefs about the underlying state. This result implies that the learning foundations of the common prior are not as strong as one might have thought.

Before explaining our main result and its intuition, it is useful to provide some details about the environment we study. Two individuals with given priors observe a sequence of signals, $\{s_t\}_{t=0}^{\infty}$, and form their posteriors about the state $\theta$. The only non-standard feature of the environment is that these individuals may be uncertain about the distribution of signals conditional on the underlying state. In the simplest case, the state and the signal are binary, e.g., $\theta \in \{A, B\}$ and $s_t \in \{a, b\}$, and $p_\theta \equiv \Pr(s_t = a \mid \theta)$ is not a known number.
Instead, each individual $i = 1, 2$ has a prior over $p_\theta$, given by a cumulative distribution function $F_\theta^i$. We refer to $F_\theta^i$ as individual $i$'s subjective probability distribution and to its density $f_\theta^i$ as his subjective (probability) density. This distribution, which can differ among individuals, is a natural measure of their uncertainty about the informativeness of signals. When some subjective probability distributions are non-degenerate, individuals will have some latitude in interpreting the sequence of signals they observe.

The presence of subjective probability distributions over the interpretation of the signals introduces an identification problem and implies that, in contrast to the standard learning environments, asymptotic learning and asymptotic agreement are not guaranteed. In particular, when each $F_\theta^i$ has full support, there will not be asymptotic learning or asymptotic agreement. Lack of asymptotic agreement implies that two individuals with different priors observing the same sequence of signals will reach different posterior beliefs even after observing infinitely many signals. Moreover, individuals attach ex ante probability 1 to the event that they will disagree after observing the sequence of signals.

Now consider a family of subjective density functions, $\{f_{\theta,m}^i\}$, becoming increasingly concentrated around a single point as $m \to \infty$, thus converging to certainty. When $m$ is large (and uncertainty is small), each individual is almost certain that he will assign nearly probability 1 to the true value of $\theta$. Despite this approximate asymptotic learning, our main result shows that asymptotic agreement may fail. In particular, for any $(p_A^1, p_B^1, p_A^2, p_B^2)$, we can construct sequences of $f_{\theta,m}^i$ that become more and more concentrated around $p_\theta^i$, but with a significant amount of asymptotic disagreement at almost all sample paths for all $m$. This establishes that asymptotic agreement is discontinuous at certainty for every model.

Under additional continuity and uniform convergence assumptions on the family $\{f_{\theta,m}^i\}$, we characterize the families of subjective densities under which asymptotic agreement is continuous at certainty. When $f_{A,m}^i$ and $f_{B,m}^i$ are concentrated around the same $p$, these additional assumptions ensure that asymptotic agreement is continuous at certainty. Otherwise, continuity of asymptotic agreement depends on the tail properties of the family of subjective density functions $\{f_{\theta,m}^i\}$. When this family has regularly-varying tails (such as the Pareto or the log-normal distributions), even under the additional regularity conditions that ensure uniform convergence, there will be a substantial amount of asymptotic disagreement. When $\{f_{\theta,m}^i\}$ has rapidly-varying tails (such as the normal distribution), asymptotic agreement will be continuous at certainty.

The intuition for this result is as follows. When the amount of uncertainty is small, each individual believes that he will learn the state $\theta$ himself, but he may still expect the other individual to fail to learn. Whether or not he believes this depends on what happens when the realized frequency of signals differs from the one the individual expects under his model of the world.
If this "surprise" event still ensures that the individual learns $\theta$ (as it does in the case of learning under certainty), then each individual will believe that the other will also learn and that they will asymptotically agree. This is what happens when the family $\{f_{\theta,m}^i\}$ has rapidly-varying tails. However, when the family $\{f_{\theta,m}^i\}$ has regularly-varying (thick) tails, an unexpected frequency of signals will prevent the individual from learning, because he interprets such a frequency as a likely possibility even near certainty, due to the thick tails. In this case, each individual expects the limiting frequencies to be consistent with his own model, expects the other individual not to learn the true state $\theta$, and concludes that there will be significant asymptotic disagreement.

Lack of asymptotic agreement has important implications for a range of economic situations. We illustrate some of these in a companion paper by studying a number of simple environments where two individuals observe the same sequence of signals before or while playing a game (Acemoglu, Chernozhukov and Yildiz, 2008). Our results cast doubt on the idea that the common prior assumption may be justified by learning. They imply that in many environments, even when there is little uncertainty, so that each individual believes that he will learn the true state, Bayesian learning does not necessarily imply agreement about the relevant parameters. Consequently, the strategic outcomes may be significantly different from those in the common-prior environments.^1 Whether this assumption is warranted therefore depends on the specific setting and what type of information individuals are trying to glean from the data.

Relating our results to the famous Blackwell-Dubins (1962) theorem may help clarify their essence. This theorem shows that when two agents agree on zero-probability events (i.e., their priors are absolutely continuous with respect to each other), asymptotically, they will make the same predictions about future frequencies of signals. Our results do not contradict this theorem, since we impose absolute continuity. Instead, as pointed out above, our results rely on the fact that agreeing about future frequencies is not the same as agreeing about the underlying payoff-relevant variables, because of the identification problem that arises in the presence of uncertainty.^2 This identification problem leads to different possible interpretations of the same signal sequence by individuals with different priors. In most economic situations, what is important is not future frequencies of signals but some payoff-relevant parameter. For example, what is relevant for economists trying to evaluate a policy is not the frequency of estimates on the effect of similar policies by other researchers, but the impact of this specific policy when (and if) implemented. Similarly, what may be relevant in trading assets is not the frequency of information about the dividend process, but the actual dividend that the asset will pay.

^1 For previous arguments on whether game-theoretic models should be formulated with common priors, see, for example, Aumann (1986, 1998) and Gul (1998). Gul (1998), for instance, questions whether the common prior assumption makes sense when there is no ex ante stage.
^2 In this respect, our paper is also related to Kurz (1994, 1996), who considers a situation in which agents agree about long-run frequencies, but their beliefs fail to merge because of the non-stationarity of the world.

Our analysis falls within the realm of the many situations in which individuals need to learn about a parameter or state that will determine their ultimate payoff as a function of their action. Our main result shows that even when this identification problem is negligible for individual learning, its implications for asymptotic agreement are large. In this respect, our work differs from papers, such as Freedman (1963, 1965) and Miller and Sanchirico (1999), that question the applicability of the absolute continuity assumption in the Blackwell-Dubins theorem in statistical and economic settings (see also Diaconis and Freedman, 1986, Stinchcombe, 2005). Similarly, a number of important theorems in statistics, for example, Berk (1966), show that when individuals place zero probability on the true data generating process, limiting posteriors will have their support on the set of all identifiable values (though they may fail to converge to a limiting distribution). Our results are different from those of Berk both because in our model individuals always place positive probability on the truth, and also because we provide a tight characterization of the conditions for lack of asymptotic learning and agreement.^3 In addition, neither Berk nor any other paper that we are aware of investigates whether asymptotic agreement is continuous at certainty, which is the main focus of our paper.

Our paper is also related to recent independent work by Cripps, Ely, Mailath and Samuelson (2006), who study the conditions under which there will be "common learning" by two agents observing correlated private signals. Cripps et al. focus on a model in which individuals start with common priors and then learn from private signals under certainty (though they note that their results could be extended to the case of non-common priors). They show that individual learning ensures "approximate common knowledge" when the signal space is finite, but not necessarily when it is infinite. In contrast, we focus on the case in which the agents start with heterogeneous priors and learn from public signals under uncertainty or under approximate certainty. Since all signals are public in our model, there is no difficulty in achieving approximate common knowledge.^4

^3 In dynamic games, another source of non-learning (and thus lack of convergence to a common prior) is that some subgames are never visited along the equilibrium path, and thus players do not observe information that contradicts their beliefs about payoffs in these subgames (see Fudenberg and Levine, 1993, Fudenberg and Kreps, 1995). Our results differ from those in this literature, since individuals fail to learn or fail to reach agreement despite the fact that they receive signals about all payoff-relevant variables.

^4 Put differently, we ask whether a player $i$ thinks that the other player $j$ will learn, whereas Cripps et al. ask whether a player $i$ thinks that $j$ thinks that $i$ thinks that $j$ thinks that ... a player will learn.

The rest of the paper is organized as follows. Section 2 provides a number of preliminary results focusing on the simple case of two states and two signals.
Section 3 contains our main results, characterizing the conditions under which agreement is continuous at certainty. Section 4 provides generalizations of these results to an environment with $K$ states and $L \geq K$ signals. Section 5 concludes, while the Appendix contains the proofs omitted from the text.

2 The Two-State Model and Preliminary Results

2.1 Environment

We start with a two-state model with binary signals. This model is sufficient to establish our main results in the simplest possible setting. These results are generalized to an arbitrary number of states and signal values in Section 4.

There are two individuals, denoted by $i = 1, 2$, who observe a sequence of signals $\{s_t\}_{t=0}^{\infty}$ where $s_t \in \{a, b\}$. The underlying state is $\theta \in \{A, B\}$, and agent $i$ assigns ex ante probability $\pi^i \in (0, 1)$ to $\theta = A$. The individuals believe that, given $\theta$, the signals are exchangeable, i.e., independently and identically distributed with an unknown distribution.^5 That is, the probability of $s_t = a$ given $\theta = A$ is an unknown number $p_A$; likewise, the probability of $s_t = b$ given $\theta = B$ is an unknown number $p_B$, as shown in the following table:

    signal      θ = A       θ = B
    s_t = a      p_A       1 − p_B
    s_t = b    1 − p_A       p_B

Our main departure from the standard models is that we allow the individuals to be uncertain about $p_A$ and $p_B$. We denote the cumulative distribution function of $p_\theta$ according to individual $i$ (namely, his subjective probability distribution) by $F_\theta^i$. In the standard models, $F_\theta^i$ is degenerate (Dirac) and puts probability 1 at some $p_\theta^i$. In contrast, for most of the analysis, we impose the following assumption:

^5 See, for example, Billingsley (1995). If there were only one state, then our model would be identical to De Finetti's canonical model (see, for example, Savage, 1954). In the context of this model, De Finetti's theorem provides a Bayesian foundation for classical probability theory by showing that exchangeability (i.e., invariance under permutations of the order of signals) is equivalent to having an independently and identically distributed sequence with an unknown distribution, and implies that posteriors converge to long-run frequencies. De Finetti's decomposition of probability distributions is extended by Jackson, Kalai and Smorodinsky (1999) to cover cases without exchangeability.

Assumption 1 For each $i$ and $\theta$, $F_\theta^i$ has a continuous, non-zero and finite density $f_\theta^i$ over $[0, 1]$.

Assumption 1 is stronger than necessary for our results, but it simplifies the exposition (as discussed in Remark 2 below, it is adopted only for simplicity). The assumption implies that $F_\theta^i$ has full support over $[0, 1]$. In addition, throughout we assume that $\pi^1$, $\pi^2$, $F_\theta^1$ and $F_\theta^2$ are known to both individuals.^6

We consider infinite sequences $s \equiv \{s_t\}_{t=1}^{\infty}$ of signals and write $S$ for the set of all such sequences. The posterior belief of individual $i$ about $\theta$ after observing the first $n$ signals $\{s_t\}_{t=1}^{n}$ is

    $\phi_n^i(s) \equiv \Pr^i(\theta = A \mid \{s_t\}_{t=1}^n)$,

where $\Pr^i$ denotes the posterior probability given a sequence of signals under the prior $\pi^i$ and the subjective probability distribution $F_\theta^i$.^7 Since the signals are generated by an exchangeable process, the order of the signals does not matter for the posterior. It depends only on

    $r_n(s) \equiv \#\{t \leq n \mid s_t = a\}$,

the number of times $s_t = a$ out of the first $n$ signals. By the strong law of large numbers, $r_n(s)/n$ converges to some $p(s) \in [0, 1]$ almost surely according to both individuals. Defining the set

    $\bar{S} \equiv \{s \in S : \lim_{n \to \infty} r_n(s)/n \text{ exists}\}$,

this observation implies that

    $\Pr^i(s \in \bar{S}) = 1$ for $i = 1, 2$.    (1)
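To make this environment concrete, the following minimal Python sketch simulates the exchangeable signal process: nature draws $p_\theta$ once from the subjective distribution, signals are then i.i.d. given that draw, and the empirical frequency $r_n(s)/n$ settles down to its limit $p(s)$. The Beta distribution and all parameter values are our illustrative assumptions, standing in for a generic $F_\theta^i$; they are not part of the model.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_path(theta, draw_p, n=10_000):
    """Simulate one exchangeable sample path: p_theta is drawn once from the
    subjective distribution, then n i.i.d. signals are generated given theta.
    Returns r_n(s)/n, the empirical frequency of the signal a."""
    p = draw_p()                            # one draw of the unknown p_theta
    prob_a = p if theta == "A" else 1 - p   # table above: Pr(s_t = a | theta)
    signals = rng.random(n) < prob_a        # True encodes s_t = a
    return signals.mean()

# Illustrative full-support prior concentrated above 1/2 (cf. Assumption 1).
draw = lambda: rng.beta(8, 2)
print([round(simulate_path("A", draw), 3) for _ in range(3)])
# Each path settles near its own draw of p_A: the limiting frequency
# identifies the realized p(s), not the state theta directly.
```

The last comment anticipates the identification problem studied below: the data pin down $p(s)$, while inference about $\theta$ still depends on the subjective distributions.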
We will often state our results for all sample paths $s$ in $\bar{S}$, which, by (1), equivalently implies that these statements are true almost surely, i.e., with probability 1.

Now, a straightforward application of Bayes rule gives

    $\phi_n^i(s) = \dfrac{1}{1 + \frac{1-\pi^i}{\pi^i} \cdot \frac{\Pr^i(r_n \mid \theta = B)}{\Pr^i(r_n \mid \theta = A)}}$,    (2)

where $\Pr^i(r_n \mid \theta)$ is the probability of observing the signal $s_t = a$ exactly $r_n$ times out of $n$ signals with respect to the distribution $F_\theta^i$.

^6 Since our purpose is to understand whether learning justifies the common prior assumption, we do not assume a common prior, allowing agents to have differing beliefs even when the beliefs are commonly known.

^7 Given the definition of $r_n(s)$, the probability distribution $\Pr^i$ on $\{A, B\} \times S$ satisfies

    $\Pr^i(E^{s,n}) = \pi^i \int_0^1 p^{r_n(s)} (1-p)^{n-r_n(s)} f_A^i(p)\, dp + (1-\pi^i) \int_0^1 (1-p)^{r_n(s)} p^{n-r_n(s)} f_B^i(p)\, dp$

at each event $E^{s,n} = \{(\theta, s') \mid s_t' = s_t \text{ for each } t \leq n\}$, where $s = \{s_t\}_{t=1}^\infty$ and $s' = \{s_t'\}_{t=1}^\infty$.

The following lemma provides a useful formula for $\phi_n^i(s)$ and $\lim_{n\to\infty} \phi_n^i(s)$ for all sample paths $s$ in $\bar{S}$. It also introduces the concept of the asymptotic likelihood ratio. Both the formula and the asymptotic likelihood ratio are crucial for our analyses throughout the paper.

Lemma 1 Suppose Assumption 1 holds. Then for all $s \in \bar{S}$,

    $\phi_\infty^i(p(s)) \equiv \lim_{n \to \infty} \phi_n^i(s) = \dfrac{1}{1 + \frac{1-\pi^i}{\pi^i} R^i(p(s))}$,    (3)

where $p(s) = \lim_{n\to\infty} r_n(s)/n$, and, for all $p \in [0, 1]$,

    $R^i(p) \equiv \dfrac{f_B^i(1-p)}{f_A^i(p)}$    (4)

is the asymptotic likelihood ratio.

Proof. See the Appendix.

In equation (4), $R^i(p)$ is the asymptotic likelihood ratio of observing frequency $p$ of $a$ when the true state is $B$ versus when it is $A$. Lemma 1 states that, asymptotically, individual $i$ uses this likelihood ratio and Bayes rule to compute his posterior beliefs about $\theta$. In the statements about learning, without loss of generality, we suppose that in reality $\theta = A$. The two questions of interest for us are:

1. Asymptotic learning: whether $\Pr^i(\lim_{n\to\infty} \phi_n^i(s) = 1 \mid \theta = A) = 1$ for $i = 1, 2$.

2. Asymptotic agreement: whether $\Pr^i(\lim_{n\to\infty} |\phi_n^1(s) - \phi_n^2(s)| = 0) = 1$ for $i = 1, 2$.

Notice that both asymptotic learning and agreement are defined in terms of the ex ante probability assessments of the two individuals. Therefore, asymptotic learning implies that an individual believes that he or she will ultimately learn the truth, while asymptotic agreement implies that both individuals believe that their assessments will eventually converge.^8

^8 We formulate asymptotic learning and agreement in terms of each individual's initial probability measure so as not to take a position on what the "objective" or "true" probability measure is. Under Assumption 1, asymptotic learning and agreement occur if and only if the corresponding limits hold for almost all long-run frequencies $p(s) \in [0, 1]$ under the Lebesgue measure, which also has an "objective" meaning.
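As a numerical check on Lemma 1, the sketch below computes the finite-$n$ posterior (2) by integrating the binomial likelihood against hypothetical subjective densities and compares it with the asymptotic formula (3)-(4). The Beta densities and all numbers are illustrative stand-ins satisfying Assumption 1, not the paper's own choices; for large $n$ the two printed values should nearly coincide.

```python
import numpy as np
from scipy import integrate, stats

def log_marginal(f, k, n):
    """log of the integral of p^k (1-p)^(n-k) against the subjective density f
    (binomial coefficients cancel in the posterior odds); the integrand is
    normalized at its peak k/n to avoid floating-point underflow."""
    c = k * np.log(k / n) + (n - k) * np.log(1 - k / n)
    g = lambda p: np.exp(k * np.log(p) + (n - k) * np.log1p(-p) - c) * f(p)
    val, _ = integrate.quad(g, 1e-9, 1 - 1e-9, points=[k / n])
    return c + np.log(val)

def phi_n(pi, f_A, f_B, r_n, n):
    """Finite-n posterior that theta = A, as in equation (2)."""
    log_odds = log_marginal(f_B, n - r_n, n) - log_marginal(f_A, r_n, n)
    return 1 / (1 + (1 - pi) / pi * np.exp(log_odds))

def phi_limit(pi, f_A, f_B, p):
    """Asymptotic posterior from Lemma 1, with R(p) = f_B(1-p)/f_A(p)."""
    return 1 / (1 + (1 - pi) / pi * f_B(1 - p) / f_A(p))

# Hypothetical subjective densities satisfying Assumption 1 (Beta stand-ins).
f_A, f_B = stats.beta(8, 2).pdf, stats.beta(8, 2).pdf
p, n = 0.75, 400
print(phi_n(0.5, f_A, f_B, round(p * n), n))   # finite-n belief ...
print(phi_limit(0.5, f_A, f_B, p))             # ... close to its limit (3)
```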
"'' = us define the threshold value - log (pb/ (1 (For future reference, this is j(?.4)) + log (p^/ (1 - Pb)) the unique solution to the equation p^(l— p^) p\-'{i-pbY-) (Generalized Asymptotic Learning and Agreement) Define p {pa,Pb) Theorem 1 as in (5). Assume P i.PA,j,PB,j) + Pa,i for i^ all 1. Pr'(lim„^^</);(.s) 2. W (lim„_.^ and that for each 6 = \d>\ (.s) - l|6' j. = i, Then for .4) 4)1 (s)| = = pg^,, — all i^ inf(suppFg) G (1/2, 1 For any s lnn^,{s) - {s) = limr„ 1 - pB,i 7^ l,- O) = 1 1} and only if 1 -ps.i < P {pa,j,Pb,j) < PA,i- of the following claim. E S, n —>oo - where p and j, Proof. Both parts of the theorem are a consequence Claim 1) (5) /n. "^ = {' (0 ^fP(^)>P(P-^^^-P^^^ if p{s) < p{pA,.,PBa), (6) (Proof of Claim) Let " Take any p > '"' ^ Pr^ {rn\e = Jprn A) < p{pA/,,PB,i)- Since 1 -ps,,; (1 ' _ p)"-" dFX PA,i, [l-PB,yp'Bf<fA,{l-PA,f-r function pP The p — p. Then, suppF^, p e (1 The first {I (7) — p) ~^ continuous and concave in imphes that there e (p-e^p \pA,t,p],rn/ri - p)- is p— < (1 and reaches p, and p > paj. its maximum such that for all at p G + e), - pB,r" p^7" < f" (1 -pf- '"" < p- (1 - p)"-^" (8) . inequality in (8) imphes that < i^-PB,rp%:r- j {i-pY-p^-^-dF^s On > exists e (7) the other hand, the last inequality in (8) implies (q) that ff-{i-pr-'^dF\> f f-{i-pr-'"dF\>F\{p)f^{i-pr'-, where the first inequahty follows from non-negativity of p'"" (1 — p)"~'"" By . (lo) dividing the we left-hand side [right-hand side] of (9) by the left-hand side [right-hand side] of (10), therefore obtain *' • °-''"<'^"*-^T(?)U-/-(i-P)-'"'" j Equation (6) follows from in parenthesis in (11) n — > oo and Tn/n -^ argument (switching (Part agent is i 1) Since Therefore, = 1 SufRciency) ymptotic agreement. when Vn/n G 1, /-^^ — (r„) > (s) (?!)';, We 0, — e/2,p and thus 4- e/2], (pni-s) when p < -^ 1), (6) the expression 1 Suppose p{pa,j,Pbj) < 10 — Pb,i. PA,i- 1. p (.s) > pA^ > < as The same O p{pA.i,PB,i.)- This implies that Pr' (limn^oo ^n prove that —^ implies that conditional on 9 5 and to the event that s G (5). [p so that the right-hand side converges to inf(suppF0) e (1/2, the last inequality follows from 2: (8), and B) imphes that assigns probabihty (Part By smaller than p. A p^,, (11). = A, p {pA:hPB,i), where i^) p{Pa,j,Pb,j) < = 1|^ = -4) = 1. PA,i implies as- Then, conditional on 9 = A, (6) . implies that (jP^ p{Pa,j,Pbj) > 1 - 1 establishes that (Part converges to (s) also PB/n conditional on (s) \(pl — 4>l — PB,i < p{Paj,Pbj) < (s)| -^ = 6i ^ p{s) e \pA,t,pipAj,PB,j))- But ^ (s)| -^ 0^. (s) when Next, This 0. sufficiency. assigns strictly positive probability to the event that i -^ (6) implies (j)'^{s) 1 and 0^ Therefore, the beliefs diverge almost surely. 1. where p {paj^Pbj) < for the case 0. Suppose the inequality does not hold, and consider the rn{s)/n 0^ and —> PA,i- p{pA,j,PB,j)- Then, — (s)| 0^ prove that asymptotic agreement imphes the inequality < |(^^ (s) — B, (pi{s) -^ and proves case pA,i that (s) \(t)\ We Necessity) 2: and therefore 1, — 1 PB,i is -^ (s) 0, so The argument analogous and completes the proof of the theorem. Theorem each 9 and shows that under the 1 i, assumption" that "full identification pe^i > 1/2 for asymptotic learning always obtains. Furthermore, asymptotic agreement depends on the lowest value of pg to which individual pe,, = i 1,2 assigns positive probability. An immediate corollary is Savage's theorem. Corollary 1 (Savage's Theorem) Assv,me some pg > 1/2, i.e., = Fg {pg) 1. Pr'(lim„_.^<(s) 2. Pr^ (lim„^oo 0n (5) = l|e - with the latter. 
and Fg = = .4) = 0' (s) 1 It is useful to spell 1 [p) that each Fg puis pivbability 1 = for each p i = 1,2, 0) = 1 I 1 states that when 1 and Corollary the individuals distributions of the signals (and hence they agree will learn the truth Then, for each pg. l. out the intuition for Theorem Corollary < on pg for what those with experience (almost surely as n — > know 1. Let us start the conditional distributions are), they oo) and two individuals observing the same sequence will necessarily come to agree what the underlying state, is. A simple intuition for this result is that the underlying state 9 is fully identified 6, from the limiting frequencies, so that both individuals can infer the underlying state from the observation of the limiting frequencies of signals. However, there is is more to this corollary than this simple intuition. Each individual sure that they will be confronted either with a limiting frequency of a signals equal = 9 = to Pa, in which case they will conclude that 9 A, or they wall observe a hmiting frequency of l—pB, and they will conclude that B; and they attach zero probabihty 11 to the events that they will observe a different asymptotic frequency. "What happens an individual observes a frequency p of sample of The answer n? size signals diff'erent from pA and 1 - ps in to this question will provide the intuition for the results that we will present in the next section. Observe that probabihty under the individual's = beliefs at the limit n cx5. this event However, for if a large some of has zero n < oo he will assign a strictly positive (but small) probability to such a frequency of signals resulting from sampling variation. Moreover, P {Pa,Pb) e - samphng (1 Pb.Pa) given by it is above such that when p (5) variation that leads to p under 9 one under 6 = A. Consequently, 1 to the event that assign probability 1 to 9 = 1 is 9, and the sequence of signals. individual i infinitely greater (as to p is B. each individual When variations are. is pg^j, infinitely smaller p > — pB,i.PA,i)-, with probability 1 that 9 similarly, close, The assumption 1/2 in Corohary and learn the payoff-relevant Avill = A that 9 1 — is when is more likely under 9 = A depend only on how large these beliefs — than under 9 — 1, he presumes that this has resulted > oo) the necessary sampling variation ~ A. Conversely, B. Consequently, the individual when p < p{pA,i^PB,i), he beheves B. Wlrether there will be asymptotic agreement then purely diff'erent the cutoff values p {pa,i,Pb,i) and p (p.4,2,Pb,2) are. both individuals mil interpret the limiting frequency of even when this 1, the lowest sampling variation that leads p iPA.i^PB,i) (and as n depends on whether and how Wlien they are 9, and the asymptotic under 9 beheves with probabihty > 1. In particular, similar to the intuition for Corollary In particular, for each attained at will both himself and the other player to do so before observing observes a frequency p E {I = oo) than the when p < p{pa,Pb), he A. Conversely, from sampling variation, and decides whether frequency p or under 9 n —^ (p.4,Pb), the individual will asymptotically the assumption that pg 1) generalizes also expects p {pa,Pb), the required very similar to that of Corollary sufficient to ensure that asymptotically state \s > B. 
It is useful to spell out the intuition for Theorem 1 and Corollary 1. Let us start with the latter. Corollary 1 states that when the individuals know the conditional distributions of the signals (and hence agree on what those distributions are), they will learn the truth with experience (almost surely as $n \to \infty$), and two individuals observing the same sequence will necessarily come to agree on what the underlying state, $\theta$, is. A simple intuition for this result is that the underlying state $\theta$ is fully identified from the limiting frequencies, so that both individuals can infer the underlying state from the observation of the limiting frequencies of signals.

However, there is more to this corollary than this simple intuition. Each individual is sure that he will be confronted either with a limiting frequency of $a$ signals equal to $p_A$, in which case he will conclude that $\theta = A$, or with a limiting frequency of $1 - p_B$, in which case he will conclude that $\theta = B$; and he attaches zero probability to the event that he will observe a different asymptotic frequency. What happens when an individual observes a frequency $p$ of signals different from $p_A$ and $1 - p_B$ in a sample of size $n$? The answer to this question will provide the intuition for the results that we will present in the next section. Observe that this event has zero probability under the individual's beliefs at the limit $n = \infty$. However, for a large but finite $n$ he will assign a strictly positive (but small) probability to such a frequency of signals resulting from sampling variation. Moreover, it is straightforward to see that there exists a unique $\rho(p_A, p_B) \in (1 - p_B, p_A)$, given by (5) above, such that when $p > \rho(p_A, p_B)$, the sampling variation that leads to frequency $p$ is infinitely more likely (as $n \to \infty$) under $\theta = A$ than under $\theta = B$. Consequently, when $p > \rho(p_A, p_B)$, the individual presumes that this frequency has resulted from sampling variation and assigns probability 1 to the event that $\theta = A$. Conversely, when $p < \rho(p_A, p_B)$, he believes with probability 1 that $\theta = B$.

The intuition for Theorem 1 (where the assumption that $\inf(\operatorname{supp} F_\theta^i) \in (1/2, 1]$ generalizes the assumption that $p_\theta > 1/2$ in Corollary 1) is very similar to that of Corollary 1. In particular, for each individual $i$, the lowest sampling variation that leads to a frequency $p$ under $\theta$ is attained at $p_{\theta,i}$, and the asymptotic beliefs depend only on how large these sampling variations are. When $p > \rho(p_{A,i}, p_{B,i})$, the sampling variation required under $\theta = A$ is infinitely smaller (as $n \to \infty$) than the one required under $\theta = B$, and individual $i$ believes with probability 1 that $\theta = A$; similarly, when $p < \rho(p_{A,i}, p_{B,i})$, he believes with probability 1 that $\theta = B$. Whether there will be asymptotic agreement then depends purely on whether and how different the cutoff values $\rho(p_{A,1}, p_{B,1})$ and $\rho(p_{A,2}, p_{B,2})$ are. When they are close, both individuals will interpret the limiting frequency of signals similarly, even when this is a frequency to which they initially assigned zero probability, and will reach asymptotic agreement.^9

^9 In contrast, if these cutoff values were far apart, so that $\rho(p_{A,j}, p_{B,j}) \notin (1 - p_{B,i}, p_{A,i})$, both players would assign positive probability to the event that their beliefs diverge to the extremes, and we would thus have $\lim_{n\to\infty} |\phi_n^1(s) - \phi_n^2(s)| = 1$ with positive probability.

The next corollary highlights a range of conditions other than those in Corollary 1 that, according to part 2 of Theorem 1, are sufficient for asymptotic agreement.
of Assumption 1 attach positive This latter feature is the essence (the "full support" requirement). and Agreement with Failure of Asymptotic Learning 2.3 may Full Support We next impose Assumption where Fg has full 1 and show that under the more general circumstances support, there will be neither asymptotic learning nor asymptotic agreement. Theorem (Lack of Asymptotic Learning and Agreement) Under Assumption 2 (lim„_^ 1. Pr' 2. Pr* (lim„_oo (/>; ^ [s) \<PI,. (-s) ~ 1|^ 0n = = .4) for 1 = {s)\ t^ O) i = 1,2; whenever 1 n'^ ^ = and Fg tt^ Fg for each ee {A,B}. Proof. Since f's{l-p{s)) Lemma [p (s)) 1, (p]^ note that, by Lemma <Pl (pis)) Since Pr' vr^ 7^ tt^ {\(t>l (s) ^ = <iL iPis)) and Fg -<pI{s)\ Remark 1 The We can see from each 1 for 1, for = > any if s and fA{p{s)) s, establishing the and only Fg, this implies that for each s G S, ^0) = 1 for i = assumption that Fg 2 Theorem 2. Assumption It is ymptotic learning 1 is is Hence, by (s) 7^ 0^ (s), and thus m 1,2. — (f)]^ (12) Fg in this theorem is adopted in Section 4 states a and signals, for simplicity. mil typically be more general version of this and shows how the assumption sigTiificantly. considerably stronger than the adopted only it 0. see the second part, (12) that even in the absence of this condition, there Fg can be relaxed Remark To part. > ^^R' ip{s)) = ^^^' (p(*0) if result for the case of multidimensional state = first R'{p{s)) E S, no asjTuptotic agreement. Theorem 6 that Fg is finite, for simplicity. sufficient (but It necessary conditions for can be verified that for lack of as- not necessary) that the measures generated by 14 the distribution functions F\ and F^ {p) (1 -p) be absolutely continuous with respect to each other. Similarly, for lack of asymptotic agreement, F\ essary) that the measures generated by {p), F^ -p), Fj (1 absolutely continuous with respect each other. For example, that Pa is sufficient (but it is if (p) and F^ (1 not nec- - p) both individuals believe and that ps either 0.3 or 0.7 (with the latter receiving greater probabihty) also either 0.3 or 0.7 (with the former receiving greater probabihty), then there neither asymptotic learning nor asymptotic agreement. Throughout 1 both because we turn and because simplifies the notation individual will fail Theorem 1 and imphes to learn the true state. The second 2 contrasts with amount is wll be we use Assumption a natural assumption it is to the anal5^sis of asymptotic agreement as the Theorem if it be when of uncertainty vanishes. that, with probabihty each 1, part of the theorem states that the individuals' prior behefs about the state differ (but they interpret the signals in the same way), then their posteriors will eventually disagree, and moreover, they will both attach probabihty 1 Put to the event that their beliefs will eventually diverge. differently, this implies that there is "agreement to eventually disagree" between the two individuals, in the sense that they both believe ex ante that after observing the signals they will fail to agree. Intuitively, individual is when Assumption 1 support feature) holds, an (in particular, the full never sure about the exact interpretation of the sequence of signals he observes and will update his views about pg (the informativeness of the signals) as well as his views about the underlying state. 
For example, even when the true state is $A$, a very high frequency of the signal $a$ may not necessarily convince him that the true state is $A$: he may instead infer that the signals are biased towards $a$ more than he initially believed, i.e., that the signals are not as reliable as he thought. Therefore, the individual never becomes certain about the state, which is captured by the fact that $R^i(p)$ defined in (4) never takes the value zero or infinity. Consequently, as shown in (3), his posterior beliefs about the state will be determined by his prior beliefs and also by $R^i$, which tells us how the individual updates his beliefs about the informativeness of the signals as he observes them. When two individuals interpret the informativeness of the signals in the same way (i.e., $R^1 = R^2$), the differences in their priors will always be reflected in their posteriors.

In contrast, if an individual were certain about the informativeness of the signals (i.e., if $i$ were sure that $p_\theta = p_\theta^i$ for some $p_\theta^i > 1/2$), as in Theorem 1 and Corollary 2, then he would never question the informativeness of the signals, even when the limiting frequency of $a$ converges to a value different from $p_A^i$ or $1 - p_B^i$, and would interpret such discrepancies as resulting from sampling variation. This would be sufficient for asymptotic agreement when $p_A^i = p_B^i$. The full support assumption in Assumption 1 prevents this type of reasoning and ensures asymptotic disagreement.
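The mechanics of Theorem 2 can be illustrated numerically. In the sketch below, both agents share a hypothetical full-support subjective density (a Beta density, our illustrative stand-in satisfying Assumption 1) but start from different priors $\pi^1 = 1/3$ and $\pi^2 = 2/3$; since $R(p)$ stays strictly between 0 and $\infty$, neither posterior reaches 1 and the prior-driven gap never closes.

```python
from scipy import stats

# Both agents interpret signals identically (R^1 = R^2) but start from
# different priors; the Beta density is an illustrative full-support choice.
f = stats.beta(8, 2).pdf
R = lambda p: f(1 - p) / f(p)                        # equation (4)
phi = lambda pi, p: 1 / (1 + (1 - pi) / pi * R(p))   # equation (3)

for p in (0.6, 0.75, 0.9):                           # frequencies favoring A
    g1, g2 = phi(1/3, p), phi(2/3, p)
    print(f"p = {p:.2f}: phi^1 = {g1:.3f}, phi^2 = {g2:.3f}, gap = {g2-g1:.3f}")
# 0 < R(p) < infinity at every p, so no posterior ever reaches 0 or 1
# (no asymptotic learning) and the posteriors never coincide (Theorem 2).
```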
This imphes that when a model of learning under certainty from full More continuous at certainty. is is perturbed, deviations learning will be small and each individual will attach a probability arbitrarily close to 1 that he will eventually learn the payoff-relevant state variable 6. We next define the continuity of asymptotic agreement at certainty. Definition 1 For any given family {F^^}, we continuous at certainty under {Fg^^,}, if for all e > limPr'>'"(|(/)^,,„.-0^,„,|<5) We agreement say that asymptotic is asymptotic agreement sa,y that and for each = = 1,2, l. continuous at certainty at {Pa^Pb^P%P%) continuous at certainty under every family {-Pg^} converging to if it is z is (5p. Thus, continuity at certainty requires that as the family of subjective probabihty distributions converge to a Dirac distribution (at which there is asymptotic agreement), the ex ante probability that both individuals assign to the event that they will agree asymptotically becomes arbitrarily close to tinuous at certainty at {Pa^Pb>p'a-p'b) and e > i^ 1. Hence, asymptotic agreement there exists a family is discon- {Fg^} converging to 5 pi such that hm — Pr'>"(|0L,..-<*L,^,|>£)>O 771. for i = 1, 2. We Definition 2 'CxD wll next define a stronger notion We say that asymptotic certainty under {Fg^} if there exists £ of discontinuity. agreement > is strongly discontinuous at such thai lim Pr'-"'(|0^,,„,-0^,„|>e)=l for i = 1,2. We say that asymptotic certainty at (p^ pg , , p^^ , p|) family {F^^} converging to if it is agreement is strongly discontinuous at strongly discontinuous at certainty under some 6.^^ Strong discontinuity requires that even as we approach the world of learning under certainty, asymptotic agreement will fail according to both individuals. 17 with probability approximately equal to 1 Discontinuity of Asymptotic Agreement 3.1 The next theorem estabUshes the strong discontinuity of asymptotic agreement at cer- tainty. Theorem ment 3 (Strong Discontinuity of Asymptotic Agreement) Asymptotic agree- strongly discontinuous at every {Pa^P^b^ Palp's) is Moreover, if tt^ ^ tt' then there exist {Fg} converging to , - \<plc,m [P ('S)) The each proof of this theorem the hmiting world 9, Theorem in Savage's of observing signal s 9 =B a p^ > 1/2 1 the state if on = A and 9 is pg). Therefore, this close to this world of certainty, the asymptotic gap in beliefs condition p^ G (1/2, E S. Note that when pi = pj — pe for not needed (see 1) is Theorem I — ps the state if is theorem establishes strong discontinuity at certainty for the canonical learning model; even The s such that both individuals are certain that the probability 1): each Fg puts probabihty (i.e., and all {d,i). the canonical learning model (under certainty) described is is Z> and 5^1 allmGN for provided below. is (Corollary = > Z <^L,m (P (^0)1 e (1/2,1) for p'g '^''i'ih is when we are arbitrarily bounded away from The proof 7 below). is zero. based on the following example. Excunple 1 For some small 1 — e, Pe is in a A G e, (0, 1), each individual A-neighborhood of some pl > More signals are not informative. (1 + precisely, for pg i thinks that with probability A) /2, but with probabihty > (1 + A) /2 the e, and A < iPl—pj], we have f f^)^ '^^^ for each 6 and i. ^ Now, by ^'^^^'^^^^ f / \ + (1 - e) 'fP e (Pe otherwise /^ e (4), ^ — ' - V2,P^ + A/2) ^ the asymptotic likelihood ratio ' is ifp{s)ED\^{p\-X/2^p\+\/2) z/>(.) G Z?i, ^ (1 -p'i, -A/2,1- Pi, + A/2) otherwise. 
This and other relevant functions are plotted likelihood ratio down to Lemma 1 1, /?' {[> (.s)) is 1 wlien p (s) is in (s) will also 1 for e — > 0, A small, takes a very high value at afterwards, becomes nearly zero around ^^ Figure p'_^, 1 is The 0. — Pg, and then jumps back to be non-monotone: when p[s) 18 ^ 1. goes By small, the signals are not ^' -^-. n' 7I--71' 1-71= 71' P \-p' Figure \4L 1: The - 0LI as three panels show, respectively, the approximate values of ^ e informative, thus signals ^^ P 0, for f)\ (/)J^ (s) is = p^ = (p), (j!)^, the same as the prior, In contrast, around 1 tt'. state become uninformative again and is B, thus 0'J^ (s) 0!^ (s) goes back to — = tt*. p^, the 0. become uninformative and when p(5) is around 1 — pg 0'^^ (s) falls back to After Around p^, the signals are again informative, but this time favoring state A, so (p^ (s) Finally, signals again and p'\ become very informative suggesting that the this point, the signals /?,' = 1. Intuitively, tt''. or p'4, the individual assigns very high probabiUty to the true state, but outside of this region, he sticks to his prior, concluding that the signals are not informative. The first important observation range of limiting frequencies, as 1 to the event that he will learn Theorem 1, as e -^ and A —^ frequencies will be either 1 — pg is in a region and A This 6. is is equal to the prior for a large each individual attaches probability > because as illustrated by the discussion after each individual becomes convinced that the Umiting is considerably weaker than as3aTiptotic agreement. also understands that since where he learns that signals are uninformative — 0^ or p\. However, asymptotic learning Each individual -^ f 0, that even though is 9 = and adhere to A < |pj — Pq|, when the long-run frequency A, the other individual will conclude that the his prior belief. Consequently, he expects the posterior beliefs of the other individual to be always far from his. e -^ and A -^ 0, each individual beheves that he that the other individual will fail differently, as will learn the value of 9 to learn, thus attaches probability they disagree. This can be seen from the third panel of Figure 19 Put 1; 1 himself but to the event that at each sample path one of the individuals in S, at least and the will fail to learn, between their difference limiting posteriors will be uniformly higher than the following "objective" = min z When = TT^ 1/3 and = tt^ {7r\ tt^, 1 — bound 2/3, this — 7r\ 1 tt^, — \Tr'^ 7r"|} . equal to 1/3. is In fact, the belief of each individual regarding potential disagreement can be greater than vidual beMeves that he will learn but the other individual will quently, for each min{7r\7r^, — 1 Pr' (|(^^ (s) -i, 7r\ 1 - and Pg = pI 0^ < 1x^/2 for (s) learning of 6. ~ where )^ By i). m -^ 00. G (s) In e > Z) 1 to do so. fail where as e, Example is for each 1, such that 1 — < (^^ (s) = e take 7/7., < A (1 Z= D\ U D'g. But z/2 Pr''-'" > By 0. {p (s) £ — 5p. choice of D\ U e = e — Conse- Z — 0, » D'^) , = e, A pj for 6 = e/m,, 7r-')/2 for Such e. construction, each Fq'^' converges to proof, pick > only consider the case p] p{s) E I?e whenever To complete the whenever p We 3. other cases are identical. > qr^ {s)\ - each indi- this; z > = This "subjective" bound can be as high as 1/2. tt^}. Proof of Theorem - bound e and - p], A, B; the = p{s) e exists (by pi + A, D\ and asymptotic — p1\ > A for each (s) — ^L.m i-^)\ > ^ \p]j |0^,„ e (1 = A), which goes to 1 as Therefore, hm Pr^''"{Ul^„,-ct>lJ>Z) = l. 
(15) TD—>00 To prove the last Z — statement in the theorem, pick Theorem In the example (and thus in the proof of and the asymptotic beliefs (j!)'^ (s) are non-monotone 3), z/2, which is positive the likelihood ratio in the frequency p{s). /?' when (p(s)) This is a natural outcome of uncertainty on conditional signal distributions (see the discussion at When the end of Section 2 and Figure 2 below). uncertainty is both of them will learn Z > monotone and the amount of the true state and consequently asymptotic disagreement will be impose the monotone likelihood For any is small, at each state one of the individuals assigns high probability that small. Nevertheless, asymptotic agreement Theorem 4 R is still discontinuous at uncertainty ratio property. This is when we sho^vn in the next theorem. (Discontinuity of Asymptotic Agreement under Monotonicity) p\,p''g > 1/2, i G {1,2}, mid 7r\ tt'^ such that: 20 G (0,1). there exist a family {Fg^^} and B} and 1. for each 9 G {A, 2. the likelihood ratio 3. for each R^ = i 1,2, Fg^ converges to S^i ; nonincreasing in p for each {p) is and i rn, and i, lim Pr'^''"(kU-Cn.|>^)>0. (16) Proof. See the Appendix. The monotonicity 3, of the so that the hmit in (16) hkehhood ratio has no longer equal to is discontinuous at certainty, but not strongty Note that weakened the conclusion 1, What {^9m} ^^^'^ ^'^^ important is is and characterize the conditions This is not crucial for the ensure continuity of the is We next pointwise). (as well as additional strong continuity for discontinuity of impose assumptions) asymptotic agreement at certainty. Agreement and Disagreement with Uniform Convergence 3.2 we consider a In this subsection, Dirac distribution 5pr for some p' G class of famihes [Fg-^] converging (1/2, 1) We start our analysis determining density function is strictly positive there exists x < tail is discontinuity properties of {Fg^}. by defining the family {Fg.^}, with a corresponding family of subjective probabihty density functions [fl^,]- / uniformly to the and show that whether there of asymptotic agreement at certainty depends on the (ii) to the discontinuity of that the likelihood ratios under families converge uniformly (instead, convergence a uniform convergence assumption (i) ratios. Fg^ would however, since smooth approximations to likelihood ratios as well. is so. asymptotic agreement induce discontinuous likehhood results, Theorem so that asymptotic agreement Theorems 3 and 4 the famihes {F^^,} leading in of /. We is parameterized by a impose the following conditions on and symmetric around oo such that / The family (.x) is /: zero; decreasing for all x > x\ (iii) ^,(x,y)= lim m^oo exists in [0, oo] at ah {x.y) G M+. 21 4^4 / [my) (17) Conditions and (i) (ii) are natural and serve to simplify the notation. Condition (iii) introduces the function R{i:,y), which will arise naturally in the study of asymptotic agreement and has a natural meaning in asymptotic statistics (see Definitions 1 and 2 below). In order to vary the [x — /m, which y) scale amount down we of uncertainty, consider mappings of the form x the real line around y by the factor 1/m. 
subjective densities for individuals' beliefs about pa and pe, {flm}^ by / and the transformation x ^-^ [x — p') we /m.'" In particular, ^^'^^^ i-^- The family of be determined consider the following family of densities = e{m)f[m{p-p^)) fl,^,{:p) for each 6 that /g^ and i where d (m) = 1/ J^ /' [m {p — p')) a proper probability density function on is subjective densities, the uncertainty about pa to the Dirac distribution dpi as rn ^ dp [0, 1] a correction factor to ensure each m. In this family of for down by scaled is is (18) oo, so that individual l/'m, and fg ^ converges becomes sure about the i informativeness of the signals in the limit. The next theorem characterizes the class of determining functions / for which the resulting family of the subjective densities {J} ment as the Theorem p" > amount the family {F^.m} defined in (18) for 1/2 and f, satisfying conditions (i)-(m) above. formly converges 1. of uncertainty vanishes. (Characterization) Consider 5 leads to approximate asymptotic agree- ,^„} If R{p^ to R{x,y) over a neighborhood of + p^ — l,\p^ ~ p'^l) = + p'^ — l,\p^ — p'^l) Assume that — 1, \p^ {p^ -|-p^ f{mx)/f{my) — ]5^|). some uni- 0; ihen agreement is continuous at certainty under 7^ 0, then agreement is strongly discontinuous at cer- in, J- . 2. If R{p^ tainty under {Fg.^}. Proof. Both parts Claim of the 2 Iim.^,_oo (0^.,™ if) (where (p'oom i'P') " ^o..n,. (p')) = if and only denotes beliefs evaluated under sample ^"This formulation assumes that do not introduce theorem are proved using the following claim. p'^ a.ncl Pg arc cqu.il. Wc can easily assume these to be this generality here to simplify the exposition. in the context of the more general model with multiple 22 R {f + p^ - h\p' - p^\) = paths with p = f ). if states Theorem different, but 8 allows for such differences and multiple signals. - (Proof of Claim) Let be the asymptotic likelihood ratio as defined in i?^, (p) One can associated with subjective density /g^. Hence, by By (12), lim,„._>oo (0^,m - iP') easily check that = (P^ocm iP')) and only if Rm (p') hm„,^«,i?4 (p^) limm^oo if (4) = = 0. 0. definition, hm Rl„ [p m^oo = ) hm — - —— m-»c» —— —p)) j['ni[p = R{l~p'-^f.,f-f) where the Rm establishes that hm^^^oo and only if R (p^ + p^ - (Proof of Part 0. We By Lemma f exists is) \<P' there exists 1, > e - > e' (i), symmetry the (and thus hm„j_oo -p^\) =0. 1) Talce any hm = (p') 1, \p^ show that there will Pr' There by condition last equality follows and G rfi. N This of the function /. (p') {4>'oo,m. - 4>io,m (p')) = 0) if D > (5 0, and assume that ^ (p^ + p^ — 1, |p^ — p^|) = such that 4>lm {s)\>e) such that 0^ .„ <6 (Vm >m,i = {p (s)) > 1 — 1, 2). whenever f i?' (p (s)) < e'. also exists xq such that Pr^ (p (s) - G {f + xo/m) Xo/m.,p' \0 = A) = f ° f (x) dx > 1 - 5. (19) J ~X0 Let K (see (xo = (ii) + xi) min-cg[_.CQ^,;(,] / (x) > Since / monotonically decreases to zero in the tails 0. above), there exists Xi such that / (x) — 1) > 0. +p'| > Xi/rn, I (2p' have |p(5) — 1 Then, for any m < e'n > mi and Again, by i?4 (p (s)) < all ni > mi and Lemma e". Now, for hm p G (s) (p' there exists e" 1, (s) G (p' — |x| > Let \xi\. xq/tii.p' mi = + Xo/m), we and hence /(m(p(5)-p')) Therefore, for p whenever each p — Xo/m^p'' > K + Xo/m.), we have that such that <J^m, (p(s)) > 1 — e whenever (,s), /?4(p(,)) = ^(p(,s)+p'-l,|p(5)-]y|). 
23 (21) Moreover, by the uniform convergence assumption, there exists ^ (/o(s) uniformly converges to — +it>' — p' |p(s) 1, I) on (p' — > r/ such that i?^ (p + ry,p' (s)) and r/) R{p{s)+p=-l,\p{s)-^y\) <e'72 — for each p{s) in (p' is continuous at it takes the value T],p'^ +p^ — (p^ + Moreover, uniform convergence also implies that ri). 1, [p^ — p^\) (and by hj^pothesis, oo such that for aU m- > m2 and in this part of the proof, m2 < Hence, there exists 0). R p(s) e (p'-r/,p' +77), i?4,(p(.s)) Therefore, for all m > 7712 <R{p{s)+fP - l^\p{s)-jV\)+e"/2 < and p {s) £ [ff — + 7/), we i],p' e". have C,,Jp(s))>l-e. Set m= ma.x{mi,m.2,ri/xo}. Then, by (20) and (22), for any {p'-Xo/m,f + that Pr^ B, (22) (|(/.j,„,„ we have Xo/Tn), (p(5)) - this establishes that (Proof of Part 2) < <^i,,„ (p(s))| Pr'' (10'^^ (p We will find = eie- > < (tL.rn {p (s))\ /I) - (^ „ (s)) f - (P (s)) \4'';^jn > 1 - (p (s)) By <5. | < e) 777. 1 > such that for each 5 fh and p{s) e Then, (19) implies e. symmetry the > > 0, 5 for of m> A and tti. eN there exists fh such that Pr' ( lim Since lim^^o, e = (1 1 to — i?^,, |(^,;,,,„ - 0^ > (s) e) > 1 - J (V7T7 > 777,, ^ = 1, 2). I {f) hmm,--^oo 0io (.s) m = R{p^+p^~ \f 1, - f~\) > (P')) /-^ ^'^^ ^^"^ similar 0, linv^.^ ,^^_„ (f) arguments to those in the < 1. We set proof of Part obtain the desired conclusion. The main assumption in Theorem 5 is that the likelihood ratios uniformly to a Hmiting likehhood ratio, given by R}^ In what /?'„ follows, "noise vanishes uniformly" as a shorthand for the statement that the /?^(p(s)) converges uniformly to the limiting likelihood complete characterization of the conditions ratio. for the continuity of [pis)) converge we say that hkehhood Theorem ratio 5 provides a asymptotic agreement at certainty under this uniform convergence assumption. In particular, this theorem shows that even ( when the likelihood ratios converge uniformly, asymptotic agreement ^^Note that the limiting hkclihood ratio ) model without uncertainty. ft is may fail. not related to the likelihood ratio that applies in the "limiting" 24 In contrast Corollary 2 shows that that there will always be asymptotic agreement in the limit. The theorem provides a simple condition on the of the distribution tail / that de- termines whether the asymptotic difference between the posteriors will be small as the amount of uncertainty concerning the conditional distribution of signals vanishes "uni- formly". This condition can be expressed R{x,y)= as: (^ = lim m.^oo f where x = p^ + - p'^ > 1 |p^ — p^| = (23) [my] The theorem shows y. that if this condition is then as uncertainty about the informativeness of the signals disappears the satisfied, difference between the posteriors that condition (23) is of the two individuals symmetric and does not depend on Intuitively, condition (23) other individual will learn. distributions vanishes, become neghgible. Notice will i. related to the beliefs of one individual on whether the is As the amount we always have of uncertainty concerning the conditional that lim„,^oo believes that he will learn the value of /?.'„ with probability (p') 1. = 0, so that each agent Asymptotic agreement (or lack thereof) depends on whether he also believes the other individual mil learn the value of 9. When R{x,y) = 0, an individual who expects a limiting frequency of p^ in the asymptotic distribution will p^. 
Therefore, individual 1, still who learn the true state p^ 0, individual and thus expects to learn the limiting frequency is almost certain that the limiting frequency will be is p^, still believes that individual 2 will reach the when R{x,y) ^ when 1 is still same inference as himself. In contrast, certain that Umiting frequency of signals will be himself. However, he understands that, when R {x, y) ^ 0, an individual who expects a limiting frequency of p^ will fail to learn the true state when almost certain that the limiting limiting frequency frequency wll be fy" happens to be (or 1 — p^), p^ Since he . he expects the other agent not to learn the truth and them thus he expects the disagreement between The theorem exploits this result their beliefs will disappear to asymptotic disagreement when 1 of R to show that the individuals to the event that the asymptotic difference when (23) holds, (23) fails to hold. beliefs as uncertainty vanishes "uniformly" are a condition on the to persist asymptotically. and the continuity attach probability arbitrarily close to between is tail of /. 25 and they attach probability Thus the behavior 1 of asj^mptotic completely determined by condition (23), . When Whether > y (i.e., when varying (thick) a familiar condition in statistics. is (thin) or regiilarly- tails: A Definition 3 p^), condition (23) depends on whether / has rapidly-varying satisfied it is ^ p^ density function f has regularly- varying tails if it has unbounded sup- port and satisfies Hm 4r^ = ^(^) e K m-too j (m) for any x The > 0. condition in Definition 3 that H (x) the functional equation a G i.e., H{x) = in the limit, the function H {) must be a solution = H{x)H[y) it H{xy), which is x~" only possible uniformly for x in any compact subset of a density / has regularly-varying (in particular, the R but nevertheless if for a € H{x) = (0, oo). x~" to for Moreover, Seneta (1976) shows that the convergence in Definition 3 holds (0, oo).^^ locally uniformly, if relatively weak, is implies that has important implications. In particular, This follows from the fact that M G defined in (17) tails, This imphes that then the assumptions imposed in Theorem 5 uniform convergence assumption) are is (0, oo). In fact, in this case, satisfied. given by R{x, y) .y, and varying f tails behave approximately with regularly-varjing (x) As everywhere continuous. is varying function £ this expression suggests, densities with regularlylike power functions in the tails; indeed = can be written as f{x) tails (with \m\n^^, C{rnx) / C {in) — 1). a density C{x)x"°' for some slowly- Many common distributions, including the Pareto, log-normal, and t-distributions, have regularly-varying densities. When We / has regularly varying tails, > R{x,y) 0, and condition (23) cannot be also define: Definition 4 A density function f has rapidly- varying tails if fimx) hm ^ip^ = = "-^°° ^("^) ^^To see this, 1 note that since hm„i_oc if{'nix)/f{rn)) y f f{^T^"''y)\ hm Hui[xy) = -rhm -> / ?n J = '"-=c= — oc \ ,. . -in See de Haan if x>l 1 ./ X oo if x<l ( .x-°° (1970) or Feller (1971). 26 = H (x) = ,/ my satisfies 1 e R, we have Krnxy) f{jny)\ f -— V it f{m)J = H [x)\„/ H (y) ^ satisfied. > for any x As 0. in Definition 3, the above convergence holds locally uniformly (uniformly over any compact subset that excludes tails 1). Examples include the exponential and the normal densities. R{x,y) = = {x/y)~°° The next 0, and condition (23) is in x of densities with rapidly- varying When / has rapidly varying tails satisfied. 
proposition formally states that under the assumptions that noise vanishes uniformly and set p^ 7^ p^, whether agreement is continuous depends on whether the family of subjective densities converging to "certainty" has regularly or rapidly-varying tails: Proposition (Tail Properties 1 form Convergence) p^ ^f. and Asymptotic Disagreement Under Uni- Suppose that the conditions in Theorem 5 are satisfied and that Then, 1. If f has regularly-varying 2. If f has rapidly-varying tails, tails, then agreement then agreement is is continuous at certainty under strongly discontinuous at certainty under [Fl^]. Proof. Wlien / has regularly or rapidly varying tion tails, uniform convergence assump- and the proposition follows from Definitions 3 and 4 and from Theorem is satisfied, 5. Returning to the intuition above, Proposition it clear that the failure of asymptotic agreement, converges to R uniformly, limiting frequencies, i.e., is p^ ^ p^, together when confronted with a given there, this is and the previous definitions under the assumption that make i?.^ [p] related to disagreement between the two individuals about with sufficiently thick probability distribution so that an individual certainty 1 sufficient for who tails of the subjective expects p^ should have sufficient un- limiting frequency of p^ . Along the lines of both individuals to believe that they value of 6 themselves, but that the other individual will fail to do the intuition will learn the true so. Rapidly- varying tails imply that individuals become relatively certain of their model of the world and thus when individual will interpret this as i observes a limiting frequency p close to, but different from p', he being driven by sampling variation and attach a high probability 27 to 9 = A. This will guarantee asymptotic agreement between the two individuals. In contrast, with regularly-varying tails, even under the uniform convergence assumptions, limiting frequencies different from p' will be interpreted not as sampling variation, but as = potential evidence for 9 B, preventing asymptotic agreement. The following example provides a simple illustration of part Example of Proposition 1. 2 Let / be the Pareto distribution and n^ = 1 = tt'^ 1/2. The likelihood ratio is and the asymptotic probability of ^i for all 777.. varying 6* = ipis)-p' fnf.\\ (These expressions hold tails.) As y4 is in the limit in —> cx) illustrated in Figure 2, in this case ^J^jm = magnitude of asymptotic disagreement, consider p{s) is under any / with regularly- approximately and 1, (pio m In that case, ^J^sm (pi^)) ff. + y^°). approximately y~^/ (x'" ^^ iP i^)) monotone. To see the i^ ^'^^^ Hence, both individuals believe that the difference between their asymptotic posteriors will be This asymptotic difference sponds to the difference likely. ^ to zero as y The last (i.e., R is in the = \p^ — p~\, which asymptotic difference y > 0. that R is example continuous. This is when in fact generally true is whenever \p^ e — P'\ < A. Pr' lim f > converge and S > W „ - 0I „ noise vanishes explored in the next proposition. continuous on the set Then for every will fp-). Proposition 2 (Limits to Asymptotic Disagreement) In Theorem assume corre- views on which frequencies of signals are most this expression that this as p^ -^ statement uniformly and increasing with the difference y in the individuals' from It is also clear is 0, > D — {{x,y) — | there exist A e) < I 28 f5 1 > (Vm > < < 2: and 7T7,, 7 1, m = |7/| 5, < in addition, y} for some G (0,oo) such 1,2). 
that Figure lini„_oo 0n 2: (^') To prove Proof. this proposition, and use the notation R[x,Qi) = whenever .x, — p^| < |p^ there exists A jfP — 1, |p (5) R > is — j^l) on 77 (p'' > — such tliat R{f?- p^. sucli that ?7,p' + Tj) >,p' (s) [a of Part 1 of +fP' Tlren, — 3/4. Theorem continuous on the compact set Fix any such p^ and A. gence assumption, there exists R {p (s) + we modify the proof Since in that proof. each for Pareto distribution as a function of p ^*^i" \,\f?-—p^\) 5 D and < e"/4 by the uniform conver- Rl^{p{s)) uniformly converges to and R{p{s)+jP-l,\p{s)-f\)<e"/2 for in each p (s) in (p' Theorem — ry,p'' + ?/). The rest of the proof is identical to the proof of Part 1 5. This proposition implies that in the case where noise vanishes uniformly and the individuals are almost certain about the informativeness of signals, any significant dif- ference in their asymptotic beliefs regarding the signal distribution when p^ = p"-, must be due — that we must have R{x,y) is, = \p^ 0, to differences in their subjective densities — p~\ cannot be too small. In particular, and thus, from Theorem 5, there will be = p^ For example. Corollary 2 established that under certainty there is convergence to asymptotic agreement. Notably, however, the requirement that p^ is rather strong. asymptotic agreement for all In closing this section, let p\p^ > 1/2. us reiterate that the key assumption in Proposition 2 is that i?^ (p) uniformly converges to a continuous limiting likelihood ratio R. In contrast, recall that Theorem 3 establishes that a slight uncertainty disagreement with nearly probability 1 even when p^ 29 may lead to substantial asymptotic = p^ The crucial difference is that . in Theorem hkehhood 3 the ratios converge to a continuous hmiting hkehhood function pointwise, but not umfoTinly. 4 Generalizations The previous two K generalize to an environment with this section are contained in the the analog of Theorem We the results from the previous two sections Appendix and K signal values. K and L > 2 states All the proofs for signals. on space, we do not provide to economize = {^i, ..., .4^-} is a set a generic element of the set by Ak- refer to L > > an environment with two states and 1 G 9, where Suppose results in we show that In this section, signals. main section provided our As before, define s = = #{t < signal Sj be the number of times the Similarty, let {sj^j, and rvj (s) = n\st = out of ai K> containing each for 2 distinct elements. E St / = {cii, 1, ..., ...,ai}, L, let ai] n first Once signals. again, the strong law of large numbers implies that, according to both individuals, for each almost surely converges to some Pi{s) e 1,...,L, rn,i{s)/n Define p{.s) 1. { p = {pi,. . . ,pl) 6 S= With analogy A (L) G {s as in usual models, frequencies p0 : ^,^1 pi lim„_^oo = i ^n,/ [s) = assigns to ^ a; when = and 1 [, to the two-state-two-signal of observing signal s we : £ S probabihty individual to before, 1]^"' [0, = {pi{s) let the set as the vector p{s) /n .4/^, tt' the true state immediate generalizations , S = is 9. of / = 1, zero and A (L) = where {n\, ...,tt\^), and L} ..., tt). pej, (24) . > be be the prior tlie frequency players are certain about pe/s Theorems 1 and 1 apply. With analogy define Fg as the joint subjective probability distribution of conditional = {pg^i, ,Pe,L) according to individual ?'. 
Since our focus under uncertainty, we impose an assumption similar to Assumption Assumption = = be in Section 2, let ^^''llen / with Y^i^iPiis) [0,1] ...,pj^{s)), exists for each model with 2 For each i and finite density fl over 9, the distribution Fg over A(L) has is learning 1. a continuous, non- A(L). This assumption can be weakened along the 30 lines discussed in Remark 2 above. We also define probabihty that 9 0^, „ {s) = Ak = Pr* {6 = Ak \ = {sJjLo) for each k 1, ,.., /C as the posterior after observing the sequence of signals {sJ"^q, and n —»oo Given now this structure, it is straightforward to generalize the results in Section define the transformation T^ : M^' — R^~\ 2. Let us such that > TUx)=(jj-;k'e{l,...,K}\k Here T^ (x) is taken as a column vector. This transformation will play a useful role in the theorems and the proofs. In particular, TT* transformation will be applied to the vector this of priors to determine the ratio of priors assigned the different states by individual Let us also define the norm The next lemma Lemma ||.t|| generalizes = max; Lemma 1 x for |.x-;| = (xi, . , . . i. xl) G M^. (proof omitted). 2 Suppose Assumption 2 holds. Then for all s E S, 1 '^U(p('^))1 Our first theorem + in this section parallels Theorem tion 2 there will be lack of asymptotic learning, 2 and shows that under Assump- and under a relatively weak additional condition, there will also asymptotic disagreement. Theorem 6 (Generalized Lack of Asymptotic Learning pose Assumption 2 holds for 2. W (K^ 0) = {p{s)) and F^ The additional 0) = 0, = i - ^l^ — 1,2, then for each k {p{s))\ 7^ 0) F| for each (9 = 1 in ciuestion). In particular, p{s), the relative asymptotic likehhood of different priors ..., K wheneverW({Tk condition in part 2 of Theorem two individuals with I, and for each , {n'yn i = Sup1,2, {n')m{f^{p{s)) G 9. plays the role of differences in priors in pose of the vector — and Agreement) if 6, that Pr'HTk {TT^)-Tk {n^)yTk{f'{p{s)) Theorem 2 (here ' " denotes the trans- this condition did not hold, then at some states could be the and they would interpret 31 " at least some same according to some sequences of = . . manner and achieve asymptotic agreement. signals in a similar - that the condition that Pr'((Tfc {n^) and holds generically restore it.^"* — The Part if it i.e., {ir'^))'Tk{.f"'{p{s)) = = 0) important to note is did not hold, a small perturbation of Theorem 2 of Tk It is weak relatively tt^ or tt^ would 6 therefore implies that asymptotic disagreement occurs generically. We this next define continuity and discontinuity of asymptotic agreement at certainty in more general case. A family of subjective probability distributions by {Fg.^}. Throughout {Fg^} converge and 5j,i when by such that there is learning 0fc,oo,m and is is asymptotic agreement (that for all e > (pl^^,^, for 0, = /c K 1, ..., — for each k 1, hm Asymptotic agreement K ..., Pr''-^" m~->oo and all and for each Mil (|0U,m. families > there exists e = 1, ..., K - i = ous at certainty at 4,o.,m\ 1 1 (p^.p'^) "More 1,2. G A (L) < e) = 1- {p^,p'^) 0. This is G A(L)'^ if it is con- to 5^^ strongly discontinuous at certainty under l Asymptotic agreement ^ Theorem formally, the set of solutions has Lebesgue measure denoted such that = result generalizes beliefs are 1 // is asymptotic agreement ous at certainty under some family {Fg^^} converging The next asymptotic agreement 1,2, {Fq^} converging is i and each where p^ G A(L), N. 
hm Pr''"(Koo,m-<oo,.J>^) = for each k is again denoted continuous at certainty under family {Fg^} is Definition 6 Asymptotic agreement if mG continuous at certainty at is tinuous at certainty under family {Fg.^} there is, under uncertainty). The corresponding asymptotic Definition 5 Asymptotic agreement if to a Dirac distribution 5p^, is to strongly discontinuis strongly discontinu- 5.pi 3: 5 = {{k\ k''' , p) G A(L)2 : a consequence of the Preimage (2^ (tt') - Tk {n-)YTk{f(p)) = Theorem and 0} Theorem in The Preimage Sard's example, Guillemin and Pollack, 1974, pp. 21 and 39). map f Y, then f"^ (y) is a submanifold of is a regular value of the with dimension equal to dimX — dimF. In our context, this implies that if map {Tk (tt^) -Tk (7r"))'Tfc(/'(p)), then the set S is a two dimensional submanifold of A(L)^ and thus differential topology (see, for Theorem implies that has Lebesgue measure if j/ is 0. a regular value of a : Sard's theorem implies that 32 is X^ generically a regular value. X . Theorem 7 (Generalized Strong Discontinuity of Asymptotic Agreement) As- ymptotic agreement is strongly discontinuous at each Towards generalizing Theorem of probabiHty densities Assumption 3 For {i, 9, whenever 9 m) = ^ 9' , {p^ ,p^) € A [Lf^ we now formally present the appropriate 5, families and introduce the necessary notation; Q each 9 & fi^,ip) where c . 1/ /pg^(^) and = / {m let the c{i,e,m) f {m{p [p and f :M.^ —^M. m G N, -p (?:, 9))) dp, p subjective density fg^ be defined by - p{i.e))) (?, 9) G A (L) with p (z, 9) y^ p {i, 9') a positive, continuous probability density function is that satisfies the following conditions: (i) lim^_,oomax|^.||^||>,j} / (,r) = 0, (u) R{x,y)^ exists at all x,y, ^^ lim (25) m^oo f {my) and (Hi) convergence in (25) holds uniformly over a neighborhood of each [p Writing i (f)\ ^^ (p (s)) = lim„^oo cpl (?', 9) — p [j, 9') ,p{i,9) „ _„ (s) for the asymptotic posterior of individual with subjective density fg ^ we are now ready to state the generalization of Theorem , 5. Theorem 8 (Generalized Asymptotic Agreement and Disagreement Under Uniform Convergence) Under Assumption 1. Suppose that Suppose that the following are true: R {p {i, 9) — p (j, 9') ,p{i,9) — p (j, ^)) = Then, asymptotic agreement 2. 3, is is 9' continuous under {i^^„,}. R (p {i. 9) - p [j, 9') Then, asymptotic agreement for each distinct 9 and , p {i, 9) ~ p {j, ^)) 7^ for each distinct 9 and strongly discontinuous under {-F(J,„}. 33 9'. — p (j, 6) These theorems therefore show that the asymptotic agreement derived in the results previous section do not depend on the assumption that there are only two states and binary signals. Propositions 2 and 1 about lack of asymptotic learning and It is to the case with multiple states also straightforward to generahze and signals; we omit this to avoid repetition. We assumed both the number can be dropped in the of signal values and states are finite. This assumption expense of introducing technical issues that are not central to our focus here. Concluding Remarks 5 The standard approach als game theory and economic modehng assumes in have a "common prior," meaning that they have regarding the game beliefs consistent that individu- with each other forms, institutions, and possible distributions of payoff-relevant pa- rameters. 
This presumption is often justified by the argimient that sufficient common experiences and observations, either through individual observations or transmission of information from others, will eliminate disagreements, taking agents towards priors. common This presumption receives support from a number of well-known theorems in statistics, such as Savage (1954) and Blackwell and Dubins (1962). Nevertheless, existing theorems apply to environments in which learning occurs un- der certainty, that Certainty is is, individuals are certain about the meaning of sufficient to ensure that payoff-relevant variables iting frequencies of signals. In many different signals. can be identified from lim- situations, individuals are not only learning a payoff-relevant parameter but also about the interpretation of different learning takes place under uncertainty. For example, interpretation might source. This may make many (in signals, i.e., signals favoring a particular individuals suspicious that the signals prevent full identification about come from a biased the standard sense of the term in econo- metrics and statistics). In such situations, information will be useful to individuals but may not lead to full learning. This paper investigates the conditions under which learning under uncertainty take individuals towards common priors and asymptotic agreement. We will consider an environnrent in which two individuals with different priors observe the same infinite sequence of signals informative about some underlying parameter. 34 Learning is under uncertainty, however, because each individual has a non-degenerate subjective probability distribution over the When will hkehhood of different signals given the values of the parameter. subjective probabihty distributions of both individuals have never agree, even after observing the same infinite sequence of signals. Our main results provide conditions under which asymptotic agreement or discontinuous at certainty (meaning that as the amount asymptotic agreement ishingly small amount is amount We first show that of uncertainty about the signal distribution can guarantee that close to 1 that they will asymptotically Under additional strong continuity and uniform convergence assumptions, also characterize the conditions under certainty. fragile discontinuous at certainty for every model. In particular, a van- both individuals attach probability arbitrarily disagree. is of uncertainty in the envi- ronment diminishes, we remain awaj' from asymptotic agreement). we support, they full which asymptotic agreement is continuous at Even under these assumptions, asymptotic disagreement may prevail as the of uncertainty vanishes, provided that the family of subjective distributions has regularly- varying tails (such as for the Pareto, the log-normal or the t-distributions). In contrast, with rapidly-varying tails (such as the normal and the exponential distributions), convergence to certainty leads to asymptotic agreement. Lack of common beliefs and common priors has important implications for economic behavior in a range of circumstances. The type of learning outlined in this paper interacts with economic behavior in various different situations. The companion paper, Acemoglu, Chernozhukov and Yildiz (2008), ihustrates the influence and lack of asymptotic agreement on games of learning under uncertainty of coordination, bargaining, asset trading and games of communication. 35 games of common interest, Appendix: Omitted Proofs 6 Lemma Proof of Write 1. 
jlv'"{\-pY-'-fB{i-v)dp {rn\9 Pr'' = A) ?)-"-'- fA{'p)dp j„'p-"(l-p)"-'-"/B(l-p)rip _ ~ Jo'p'-"(l-p)"-'-"dp Jo'p'"(l-p)"-^'"/4(p)rfp Jo'p'''(l-p)"-'"rfp E^ifBJl ~ p)\rn] E^[fA{p)K] Here, the first The same term. — p) equality obtained by dividing the numerator and the denominator by the is resulting expression on the numerator is the conditional expectation of (Lebesgue) prior on p and the Bernoulli distribution on —p)\r„]: Denoting this by and the denominator, which is similarly defined as E'^[/b(1 {st}(Lothe conditional expectation of Ja (p), by E'^[/.4(p)|r„], we obtain the last equality. By Doob's /b (1 given r„ under the flat consistency theorem for Bayesian posterior expectation of the parameter, as r„ that E^[/s(l -p)|r„,] -^ /b(1 - p) and E'M/4(p)|r„] -- /^(p) (see, and Ramamoorthi, 2003, Theorem 1.3.2). This establishes » p, we have Doob, 1949, Ghosh B) Pr'(r,,,| Pr'- e.g., — R' = irn\0 A) (P) as defined in (4). Equation (3) then follows from (2). Proof of Theorem where xe 9' ^ 9, e = 1 — e [pg, — = \ =^ A/2) - For each ?n S> 4. 1/m, p\ e^ (l - = xe/\ ifpe [p',-A/2,p^ + s^ if e otherwise, pa - p^, <^ + is clearly decreasing f 1 fl - e^jXA fp^4 m is oo /?.'•'" (p) - <( 1 - - p^, (0, 1). A/2] A/2, A, = pa - p\ Here, xg is X, p% I p p'b - > large. + A/2 p:4 + For e A/2 < p < p\ - <p<p:4 + A/2, A/2, A/2, = 0, we have p < 1 - ?5)5 + A/2, ifl - p'e + A/2 <p<p'4- A/2, if if p > p'4 36 - A/2, = Pb + close to 1 for large 1-Pb-A/2<p<1-p's + A/2, XBJe^ when < p^ ^ Pb - A, A/2) e if which p ifp<l-p'g-A/2, l/e2 f i?''™ (p) = 1, let A, and m. Then, and hence ifp<l-p'B + 0r(p)-<( Notice that when p G - p2j V^'" we have <p'^ {p) ^ and the proof of theorem, (P) (/>^"' (/j) we then - lim Pr^-- (10^- {p) 771—>00 ifi-p^B+V2<p<p:4-A/2, 1 if> \l2,ii\ - 0^'" {p)\ + = > p\ - A/2] </'^"' (P)| A/2. <v\- we have p , = 1-vri. Similarly, tt^ so that Z = min pick ' A/2, ^'- k;!'io"' {tt", 1 — tt^ } >Z)^ hm<0O Pr^'™ (p e 777— = (p) (p) ^ (^^ [p) ^ 1 and - p^ - A/2,pij + A/2], A/2, so that when ^ £ [l TT In order to complete . /2. In that case, - [l p^j - + A/2]) 'J/ = A/2,p)j '" > I-tt^ and lim Pr^.- (|<^Ji- (p) - > Z) = </,^- (p)| lim Pr^-- (p € [p?i - A/2,p2 + A/2]) = tt^ > 0, completing the proof. Proof of Theorem 6. (Proof of Parti) This part immediatel)^ follows from Lemma 2, as each fr^ifAy (p(s)) positive, and TT%fAk {p{s)) is finite. (Proof of Pai't 2) Assmiie Fg = Fg for each G 0. Then, bj' Lemma 2, 4>\ooip) 6' 4>k,ooiP) = if and only if (tt^) (7^ - Tk {n~))' Tk [{feiP))eeQ) under both probability measures probability Proof of Theorem for all 9,9' (including 9 Pick sequences jd^"' 7. = Pr'^ For each 9'). — > iff e = will 61' be the andi = pg and ^ is 1,777 such that < e/m} normalized so that /^ ^^ — > ^ 5^. for is 4,777, ifpel)(, 1/m otherwise. 2,771. - 'e > Pf,' e/r i. , Notice that Dl'^TiD^g,'^ a probability density function. each (6*, ^ i). We will By construction of se- show that agreement is discontinuous {FlJ. Now </'L.,0,m (P) if > Define quences /g ^ and Pg™', Fl under inequality has '^^^^ ^^^^^^ {9,i), define /ff.77,,(P) where x\ e set of likely frequencies at state 6 according to ?:'. ~~ and Pr" by hypothesis. D y" = {peA{L):3 P-Pe which "" ^- is p e Dg"'' for some 9 and ^J^j m. (p) — ~ ^' otherwise. Note that <i>^oo,e,m. Moreover, since the sets D^'™' are disjoint (as we have seen above), p G Dg"*. 
Hence, there exist m such that ||<?^'oo,777, for any m > m (p)-'^io,m(p)|| 37 and any p G >£ (p) ^ 1 if 4>'oo,m.{p) £?'"' = — P S Z?g"\ '"'"' [jgDg"^, when 0, where e Pr'-"' (p = - minj,^ [l G D''"') > 1 But tt^J /2. - 1/m. Lemma > ip e Dg'"'|0j 6, Pr'-'" - 1/m, showing 1 that Therefore, ,hm Proof of Theorem each for Our proof 8. = Pr''"'(||(^'^,„,-C,,„,||>c) l. utihzes the following two lemmas. A. 1 lim fc.OO.TTA (P) l Proof. By condition k and k', m^oo p^^ Then, m->oo c [p) Lemma A Lemma follows B. For any and each p{s) with Iimm.-,oo c (i), A;,, (z, \\p{s) m) Ak, rn) {i, — 1 for each and i Hence, for every distinct k. [m [p-p{l,Ak))) m-^oo / Lemma from e + E;e'^/c5^^(p-p(i,AfcO,p-p(i,A,)) 2. > and h > 0, there — p{i,Ai,)\\ < h/m, m exists such that for each in > ifi, k < K, < Proof. Since, by hypothesis, by Lemma A, there exists h' hm m.—>oo and by condition (iii), > R. is 0, m each k' ^ if —p (j, 9') ,p (?', 9) —p 9)), {j, {p Afc)) (i, < i/2 (27) ' — p{i,Ak)\\ < h' lim The inequahties . (i, -4fc)) - <^i_oo,m < [p (s)) (?5 (^. £/2. in (27) ^fc))) = (28) and iff (28) then .R (p (i, imply A^) - p (26). (j, ^/t') ,P (i, A^) - p {j k. Lemma A implies that and only (/>1,,^ ,„ m—>oo 4,^,„, ' (4,oo,m (P lim rn—too Proof. Proof. Since R{p[i,Ak) — p if 9) h/h! such that - ' for > (*)) 4,oo,m (P Lemma C. Imw^oo - ' ' (z, such that ^1,,^_,„ {p (s)) there exists holds uniformly in \\p{s) continuous at each (p (26) (i, ^4^/) ,0) limTO_,oo<)^fc,oo,-m(p(^'^fc)) limm^oo <?^i,oo,m (p(i,^fc)) = L = 1- = for each Hence, linim^oo Since each ratio 38 k' ^ k (by condition (?^fc,co,m 7r|,,/7r'[ is (P(^^A-)) positive, by (i)). " <?^fc,oo,m Lemma (P(^'^fc)) . A, the latter holds only if R {p if Ak) (z, - p {j, Ai-,) - p {j, Ak) ,p{i, = Ak)) each for k' y^ k. m (Proof of Part 1) Fix > e Pr'' (||<^L,n,.(s) Fix any and i (i), Since each k. > <Pk,oo,m (pis)) 1 - e > there exists ho^k - > S f\^^ is We 0. cl>l,m (s)|| 7r-^//7r;[ whenever 0, and m e N such will find >e)<5 by finite, that {^m>m,i = Lemma < {p{s))/f\^ {p{s)) 1,2). there exists 2, > e' holds for every k' e' 0, such that 7^ k. Now, by such that = Ak)= Fv''i\\p{s)-p{i,Ak)\\<ho,k/m\9 f / (x) > dx {1 - S) Let Qk,rr,. and f K. (x) = min||j,||</,g f {x) > < €'k/2. There exists 0. ^, p{s) e Qk,m., and any c(2, Afc/,77i) /c(i, Afc,7Ti) By II > there exists hijc (i), we have ||p(s) — 7nifc > p{i,Ak')\\ 1 /(777,(p(s)-p(J,yV))) 2 k, 1 for 2 for every k' ^ each i k and and m> 6, ||2;|| such that for any m > hj^k, > mi^fc, hi^k/m, and £Vc — ho,k/m} such that, whenever f{m{p{s)-p{i,Ak,))) (i, ^,777.) < - p (i, Ak) < \\p a sufficiently large constant k' 7^ k, Moreover, since lim,„.->ooC = {peA(L): ^ e^ 2- there exists 7712, fc > 'irii^k such that This implies 7772, fc- f\,{p{s))irA,{p{s))<e', establishing that ^k^oo.m Now, and 9'. for j 7^ 7, > ^ (p {i, 0) —p (j, 6*') (29) — p (j, ,p{i,0) = 9)) for A, limm-*oo 0^ 00 m (p(^^fc)) = li and hence by Tn2,k such that for each 771 > ma^/t, p [s) e Qkjn, Then, by exists m^^k assume that {p{s))>l-e. Lemma each distinct 9 Lemma B, there <oo,^,(/>(s))>l-e- (30) Notice that when (29) and (30) hold, we have ||<;i'L,,n. (*') 771 — max/j77i4 fc, we obtain the desired inequality for each m Pr' (||</.^,^, is) - 0';,,,„ {s)\\ <e) = Yl P^-' Yl p^'' (11'^-.- (^) - ~ '^L.jn, > in: (-5)11 < < e|e '^'So.n (s)|| £• - ^0 k<K > (^ k<K fc</C = 1-6. 
39 (^^) ^ ^'^.-1^ ^ ^^) p^"' (^ Then, setting = ^^) Pr'' (0 = A^) (Proof of Part 2) Assume that R (p {i, 6) - p {j, 9') ,p{i, 6) - p {j, 9)) and 9' We will find e > such that for each > 0, there exists m G tinct 9 (5 . P^' {\\<t>L,mis) Now, each since each k. we have 0^00 > > <Plo.rr,.is)\\ positive. e) Lemma A >l-S (Vm > 771, Z hmm^oo implies that <?^fc = 1, 2). ,30 m (^('^'^fc)) < dis- ^ ^^^ Let Then, by Part 777 is 7rj,,/7r-^ - ^ for each N such that 7775,^. 1, for iPi^)) and p (s) each > 1 \\^lo,m (s) - 7712, /t m such that for every B, there also exists 7715,/^ > > and p{s) G Qk,m^ such that for every 77i2,fc 77T.2,fc e Qfc.m, -^ioo.m (P (5)) This implies that there exists k, — £ By Lemma lle/i^.m (Plo,^ (5)11 < < J™, -^ioo.m (P ipi^)) e " at the ('. +£< ^k)) <^So,m (p('5'))|| > ^- 1 Setting end of the proof of Part obtain the desired inequality. 40 - 1 2e 771 < = 0-^,^ (p (s)) - e. maxk'm^^k and changing to ||0^_„. (s) - ^^^„ (s)|| > e, we References 7 Acemoglu, Daron, Victor Chemozhukov, and Muhamet Yildiz (2008): Applications of Learning under Uncertainty" Aumann, Robert (1986): , "Economic mimeo. "Correlated Equilibrium as an Expression of Bayesian Ra- Econometrica, 55, 1-18. tionality," Aumann, Robert (1998): "Common A priors: Reply to Qui," Econometrica, 66-4, 929-938. Berk, Robert (1966): "Limiting Behavior of Posterior Distributions Wlien the Model Annals of Mathematical Is Incorrect," Statistics, 37, 51-58. Billingsley, Patrick (1995): Probability New and Measure, Third Edition, Wiley and Sons, York. Blackwell, David and Lester Dubins (1962): "Merging of Opinions with Increasing Information," Annals of Mathematical Statistics, 33, 882-887. Cripps, Martin, Jeffrey Ely, George Mailath and Larry Samuelson (2006): "Common Learning" Cowles Foundation Discussion Paper No. 1575, August 2006. Doob, J. L. (1949): "Application of the Theory of Martingales," in Le Calcul des Probabilities et ses Applications, 23-27, Paris. Diaconis Persi and David Freedman (1986): "On Inconsistent Bayesian Estimates of Location" Annals of Statistics, 14, 68-87. Avinash and Jorgen Weibull (2007): Dixit, "Political Polarization," Mimeo, Prince- ton University and Stockliolm School of Economics. Feller, Volume William (1971): II, 2nd ed. Wiley, An New Introduction to Probability Theory and Its Applications. York. Freedman, David (1963): "On the Asymptotic Behavior of Bayes Estimates Discrete Case I" Annals of Mathematical in the Statistics, 34, 1386-1403. Freedman, David (1965): "On the Asjanptotic Behavior of Bayes Estimates in the Discrete Case IF' Annals of Mathematical Statistics, 36, 454-456. Fudenberg Drew and David Kreps (1995): "Learning Confirming Equihbria," Games and Economic Behavior, Fudenberg, rica, 61, Drew and David Levine (1993): in 8, Extensive Games, I: Self- 20-55. "Self-Confirming Equilibria" Economet- 523- 546. Fudenberg, Drew and David Levine (2005): Latin American Journal of Economics, 42, 199-27. 41 "Learning and Belief Based Trade," Geanakoplos, John and Heraklis Polemarchakis (1982): ever," Journal of Ghosh Verlang, Economic Theory, "We Can't Disagree For- 28, 192-200. K. and R. V. Ramamoorthi (2003): Bayesian Nonparametrics, Springer- J. New York. Qui, Faruk (1998): "A Comment on Aumann's Bayesian view", Econometri.ca, 66-4, 923-927. Guillemin Victor and Alan Pollack (1974): Differential Topology, Prentice Hall, New Jersey. Haan, L. 
de (1970): On Regular Variaiion and Application to the Its Weak Conver- gence of Sample Extremes, Mathematical Centre Ti'act 32, Mathematics Centre, Ams- terdam, Holland. Jackson, Matthew 0., Ehud Kalai and Rann Smorodinsky sentation of Stochastic Processes under Learning: De (1999): "Bayesian Repre- Finetti Revisited," Econometrica, 67, 875-893. Kurz, Mordecai (1994): "On the Structure and Diversity of Rational Behefs," Eco- nomic Theory, 4, 877-900. Kurz, Mordecai (1996): duction," Miller, Economic Theory, Ronald I. An "Rational Behefs and Endogenous Uncertainty: 8, Intro- 383-397. and Chris W. Sanchirico (1999): nuity in Merging of Opinions and Rational Learning," "The Role of Option with Conti- Games and Economic Behavior, 29, 170-190. Savage, Leonard J. (1954): The Foundations of Statistics, Dover reprint. New York, 1972. Seneta, Eugene (1976): Regularly Varying Functions, Lecture Notes in Mathematics, 508, Springer, BerUn. Stinchcombe, Maxwell (2005): "The Unbearable Fhghtness of Bayesians: Generically Erratic Updating," University of Texas, Austin, mimeo. 42 :' / 7 '•