Polarization: theory and evidence

Jean-Pierre Benoît
Juan Dubra
Outline
Two groups come into a lab and report beliefs
Group with the higher belief (higher in the FOSD sense):
Type          θ1 < θ2 < … < θn
Probability   .2   .1   …   .4
Group with the lower belief:
Type          θ1 < θ2 < … < θn
Probability   .3   .2   …   .3
The two groups observe a common signal S and report beliefs again
Larger belief increases in FOSD:
Type          θ1 < θ2 < … < θn
Probability   .1   .1   …   .5
Lower belief decreases in FOSD:
Type          θ1 < θ2 < … < θn
Probability   .4   .3   …   .2
Can the results be obtained with Bayesian model?
Obtained in a “reasonable” Bayesian model?
Can Bayesian model explain “agreed upon” patterns?
Can Bayesian model explain a particular experiment?
• Lord, Ross, and Lepper (1979)
• Darley and Gross (1983)
• Plous (1991) -- nuclear
• Nyhan and Reifler (2010) -- WMD
• Gerber and Green (1999) -- Bayesian implications
• Miller, McHoskey, Bane, and Dowd (1993) -- CP, direct response
• Kuhn and Lao (1996)
• Munro and Ditto (1997) -- homosexuality
• Kunda (1987) -- motivated reasoning
• Liberman and Chaiken (1992) -- defensive processing
• Lord, Lepper, and Preston (1984) -- corrective measures
• Acemoglu, Chernozhukov, and Yildiz (2009)
• Andreoni and Mylovanov (2012)
• Baliga, Hanany, and Klibanoff (2012)
• Dixit and Weibull (2007)
• Glaeser and Sunstein (2013)
• Kondor (2012)
• Rabin and Schrag (1999)
A Hypothesis-Confirming Bias in Labeling Effects
Darley and Gross (1983)
70 Princeton undergraduates.
Question: What is Hannah’s grade level?
Scale: Kindergarten -- sixth grade, in 3-month intervals.
Hannah: Nine years old, fourth grade, Caucasian
Demographic Information
Negative Expectancy Condition:
Father: Meat-packer
Mother: Seamstress
Both: High school education
Hannah’s neighbourhood
Positive Expectancy Condition
Father: Attorney
Mother: Free-lance writer
Both: College graduates
Hannah’s neighbourhood
Performance Information
Subjects viewed a video of Hannah answering
problems that ranged from “easy” to “difficult”.
Indicate Hannah’s Ability
                              No-Performance   Performance
Liberal Arts   Lower Class    3.98             3.79
               Upper Class    4.30             4.83
Reading        Lower Class    3.90             3.71
               Upper Class    4.29             4.67
Mathematics    Lower Class    3.85             3.04
               Upper Class    4.03             4.10
What should happen?
Initial responses: A – 4.25, B – 3.75.
A and B receive common information:
• A and B harmonize if, following the information,
A and B's responses both rise or both fall.
• A and B moderate if, following the information,
A's response falls and B's response rises.
• A and B polarize if, following the information, A's
response rises and B's response falls.
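The three definitions above can be written as a small classifier. This is only a sketch: the function name `classify`, the "none" fallback, and the post-information numbers in the usage lines are assumptions made for illustration, not values from the talk. It also assumes A starts with the higher response, as in the example A = 4.25, B = 3.75.

```python
def classify(a_before, a_after, b_before, b_after):
    """Classify how two responses move after common information.

    Assumes A starts with the higher response (a_before > b_before).
    """
    da = a_after - a_before
    db = b_after - b_before
    if da > 0 and db < 0:
        return "polarize"   # A rises, B falls
    if da < 0 and db > 0:
        return "moderate"   # A falls, B rises
    if (da > 0 and db > 0) or (da < 0 and db < 0):
        return "harmonize"  # both rise or both fall
    return "none"           # at least one response did not move


# Hypothetical post-information responses, chosen only to hit each case.
print(classify(4.25, 4.50, 3.75, 3.50))
print(classify(4.25, 4.00, 3.75, 3.90))
print(classify(4.25, 4.40, 3.75, 3.90))
```

Applied to the group means in the Darley and Gross table (upper class as A, lower class as B), each of the three ability indexes falls in the "polarize" case, although those means come from different subjects rather than from the same people reporting twice.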
Mixed Performance
• Hannah correctly answers some “difficult” questions but
misses some “easy” questions.
• Hannah sometimes concentrates, sometimes is distracted.
• The two groups’ reports move further apart (polarize).
• Prior literature has said “we have to explain how two people’s beliefs could move further apart”:
– Impossible with Bayes, so let’s use ambiguity (Baliga et al.)
– Impossible with Bayes, so let’s assume that people differ in their beliefs about how types map to signals (Acemoglu et al.)
– Andreoni & Mylovanov and Kondor do it with 2 dimensions
• We say that is the “wrong” question.
Mixed Performance
1. Hannah's mixed performance shows her to be
an average student.
2. Hannah manages to answer some difficult
questions and to concentrate when she wants
to. Hannah may not be the best student, but she
is well above average (say, 4.5 or above).
3. Hannah misses some easy questions and cannot
maintain her concentration. Hannah may not be
the worst student, but she is well below average
(say, 3.5 or below).
• Let’s take exactly that story, and let
– B be “distractions = Boredom”
– I be “distractions = Inability to focus”
• State space: {B, I} × {3.5, 4, 4.5}. Each state has p = 1/6.
• Signal 1: A = {b, i}, with P(b | B, j) = 2/3 and P(b | I, j) = 1/3.
• Signal 2: S = {sometimes distracted, focused} = {d, f}.

Probability of signal d in each state
            3.5    4      4.5
B           1/2    1/2    1
I           1      1/2    1/2

Posterior after b and common signal d
            3.5    4      4.5
B           1/6    1/6    1/3
I           1/6    1/12   1/12
Marginal    1/3    1/4    5/12

Posterior after i and common signal d
            3.5    4      4.5
B           1/12   1/12   1/6
I           1/3    1/6    1/6
Marginal    5/12   1/4    1/3
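The two posterior tables can be checked by applying Bayes' rule directly to the state space and likelihoods given above. The sketch below uses exact fractions; the names `posterior`, `grade_marginal` and `expected_grade` are illustrative, and the expected-grade summary at the end is an added assumption about how a subject might condense beliefs into one reported number.

```python
from fractions import Fraction

GRADES = [Fraction(7, 2), Fraction(4), Fraction(9, 2)]  # 3.5, 4, 4.5
ANCILLARY = ["B", "I"]
PRIOR = {(a, g): Fraction(1, 6) for a in ANCILLARY for g in GRADES}

# P(signal b | ancillary state); P(i | .) is the complement.
P_b = {"B": Fraction(2, 3), "I": Fraction(1, 3)}

# P(common signal d | ancillary state, grade), in the order of GRADES.
P_d = {
    "B": [Fraction(1, 2), Fraction(1, 2), Fraction(1)],
    "I": [Fraction(1), Fraction(1, 2), Fraction(1, 2)],
}


def posterior(private):
    """Joint posterior over (ancillary state, grade) after `private` ('b' or 'i') and d."""
    weights = {}
    for a in ANCILLARY:
        p_private = P_b[a] if private == "b" else 1 - P_b[a]
        for g, p_d in zip(GRADES, P_d[a]):
            weights[(a, g)] = PRIOR[(a, g)] * p_private * p_d
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


def grade_marginal(post):
    """Marginal posterior over grades, in the order of GRADES."""
    return [sum(post[(a, g)] for a in ANCILLARY) for g in GRADES]


def expected_grade(marginal):
    return sum(g * p for g, p in zip(GRADES, marginal))


for private in ("b", "i"):
    m = grade_marginal(posterior(private))
    print(private, [str(p) for p in m], float(expected_grade(m)))
```

Before the common signal, either value of Signal 1 alone leaves the grade marginal uniform (expected grade 4), so under these assumptions the two groups start from the same summary and move in opposite directions after d.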
• What needs to be explained? Given the three interpretations of the mixed performance, “certainly not” that people could polarize. We had two very simple and plausible interpretations of the data: one implied that Hannah was better than the prior, the other that she was worse.
• What do the results actually mean?
– Apart from the simple example, there are many other
plausible interpretations which are not “it’s a bias”.
– “Big point”: be careful when reading these papers (the
alternative hypothesis is never stated). Maybe it’s a
cheap point, but economists have embraced the idea
that it is a bias.
What do the answers mean?
• “Subjects who were given only demographic
information about the child demonstrated a
resistance to making expectancy-consistent
attributions on the ability indexes.”
• “Their estimations of the child’s ability level
tended to cluster closely around the one
concrete fact they had at their disposal: the
child’s grade in school.”
What do the answers mean?
• Base-rate information represents probabilistic
statements about a class of individuals, which
may not be applicable to every member of the
class. Thus, regardless of what an individual
perceives the actual base rates to be, rating
any one member of the class requires a higher
standard of evidence.
Don’t Prejudge:
Report a grade other than 4 only if 75% sure.
• Lower class child:
– Most likely 3.5; 35% chance she is exceptional, and thus 4.
• Upper class child:
– Most likely 4.5; 35% chance she is exceptional, and thus 4.
• No performance video
– Subjects answer “4”
• Performance video: Hannah is not exceptional
– Subjects answer “3.5” or “4.5”
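The "don't prejudge" rule can be sketched as a threshold reporting rule. Several things here are assumptions rather than part of the slide: the function name `report`, the use of "at least 75%" as the cutoff, the 65% figure (taken as the complement of the 35% chance of being exceptional), and the reading of the video as removing all doubt about Hannah not being exceptional.

```python
def report(beliefs, default=4.0, threshold=0.75):
    """Report the most likely grade only if its probability reaches `threshold`;
    otherwise give the neutral answer `default`."""
    grade, prob = max(beliefs.items(), key=lambda item: item[1])
    return grade if prob >= threshold else default


# Beliefs from demographic information only (65% is assumed as 1 - 35%).
lower_no_video = {3.5: 0.65, 4.0: 0.35}
upper_no_video = {4.5: 0.65, 4.0: 0.35}

# Assumed effect of the video: Hannah is not exceptional, so the 4 is ruled out.
lower_video = {3.5: 1.0}
upper_video = {4.5: 1.0}

print(report(lower_no_video), report(upper_no_video))
print(report(lower_video), report(upper_video))
```

Under this rule both groups answer 4 without the video and split to 3.5 and 4.5 with it, even though every subject is updating on the same information in the same way.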
• “A teacher, for example, would be extremely
hesitant to conclude that a black child had low
ability unless that child supplied direct behavioral
evidence validating the application of the label.”
• Darley & Gross (…):
– Don’t say bad unless she performs badly
– Don’t say good unless she performs well
• This other (rational) example:
– Don’t say bad/good unless sufficiently confident.
• So far, I have said that we can explain certain
results with a Bayesian model (low bar, and
Andreoni&Mylovanov and Kondor have done it).
• Next, we’ll raise the bar about what it is that one
should try to do.
• But before that I will discuss what the “deliverables” are:
– Polarization in Expected value?
– Polarization in FOSD?
Expected Value
Lower Class
              Exceptional   Average (-)   Average (+)   Exceptional
Probability   1/8           1/2           1/4           1/8
Grade         2             3             4             5
Mean: 3.37
Upper Class
              Exceptional   Average (-)   Average (+)   Exceptional
Probability   1/8           1/4           1/2           1/8
Grade         2             3             4             5
Mean: 3.62
Video signal: Hannah is pretty average
Lower class: 3.33
Upper class: 3.67
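The means in this example can be reproduced with a few lines of exact arithmetic. The sketch assumes that "Hannah is pretty average" means learning that her grade is 3 or 4 (the two "Average" columns); the function names are illustrative.

```python
from fractions import Fraction

GRADES = [2, 3, 4, 5]
LOWER = [Fraction(1, 8), Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
UPPER = [Fraction(1, 8), Fraction(1, 4), Fraction(1, 2), Fraction(1, 8)]


def mean(probs):
    return sum(g * p for g, p in zip(GRADES, probs))


def condition_on_average(probs):
    """Posterior after learning that the grade is 3 or 4 (assumed reading of the video)."""
    kept = [p if g in (3, 4) else Fraction(0) for g, p in zip(GRADES, probs)]
    total = sum(kept)
    return [p / total for p in kept]


for name, prior in (("lower", LOWER), ("upper", UPPER)):
    post = condition_on_average(prior)
    print(name, float(mean(prior)), "->", float(mean(post)))
```

The prior means are 27/8 and 29/8 (shown as 3.37 and 3.62 on the slide) and the posterior means are 10/3 and 11/3, so the gap in expected value widens after a common signal.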
FOSD Polarization
(Baliga, Hanany, and Klibanoff) (Dixit and Weibull)
[Figure with panels labeled Poor and Rich]
Baliga, Hanany, Klibanoff
Definition: Fix two individuals with beliefs η and η′ over Φ (in R) and with common support, such that η′ stochastically dominates η. After they both observe a signal x whose likelihood given θ ∈ Φ is π_θ(x), we say that polarization occurs if and only if the resulting posterior beliefs lie further apart in the sense of fosd.
Theorem (fosd): Polarization cannot occur if the two individuals use Bayesian updating.
• Precludes moderation.
• Unrelated to experiment (e.g. in Hannah, people report one summary statistic for whole beliefs, not the whole distribution).
Many “dimensions” to get polarization
• Andreoni and Mylovanov: The essential feature of our
model is simply that the optimal action depends on
relative values of different dimensions of the
information space and, as such, contains at least one
fewer degree of freedom.
• Kondor: The main result is based on a simple
observation. Agents’ opinions about the opinions of
others (higher-order expectations) respond differently
to public information than agents’ opinions about the
fundamentals of an economic object.
One dimension – only the type matters
Θ = {2, 3, 4, 5}, uniform beliefs; S = {s2, s3, s4, s5}

Likelihoods (rows: signal; columns: type)
      2      3      4      5
s2    3/4    1/4    0      0
s3    1/8    1/2    1/4    1/8
s4    1/8    1/4    1/2    1/8
s5    0      0      1/4    3/4

Posteriors (rows: signal; columns: type)
      2      3      4      5
s2    3/4    1/4    0      0
s3    1/8    1/2    1/4    1/8
s4    1/8    1/4    1/2    1/8
s5    0      0      1/4    3/4

Type (grade)       2      3      4      5
Subject I (s3)     1/8    1/2    1/4    1/8
Subject II (s4)    1/8    1/4    1/2    1/8

Common signal: M
Θ = {2, 3, 4, 5}, S = {s2, s3, s4, s5}, T = {L, M, H}

Likelihoods (rows: signal; columns: type)
     2    3    4    5
L    1    0    0    0
M    0    1    1    0
H    0    0    0    1
Lumping
Type (grade)       2      3      4      5
Subject I (s3)     1/8    1/2    1/4    1/8
Subject II (s4)    1/8    1/4    1/2    1/8

Θ = {2, 3, 4, 5}, S = {s2, s3, s4, s5}
Bad: t ∈ {2,3}
Good: t ∈ {4,5}

Posteriors after signals
           B      G
s3         5/8    3/8
s4         3/8    5/8
s3, M      2/3    1/3
s4, M      1/3    2/3

If, not knowing the real type space, you collect beliefs about B or G, you would find fosd polarization and conclude that there’s a bias.
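The lumping argument can be checked numerically. The sketch below starts from the two subjects' beliefs after s3 and s4, updates on the common signal M (sent exactly when the type is 3 or 4), and compares beliefs on the full type space with beliefs lumped into Bad and Good. The helper names `update`, `lump` and `fosd` are illustrative, and `fosd` implements the standard CDF comparison rather than anything specific to the talk.

```python
from fractions import Fraction

TYPES = [2, 3, 4, 5]
# Beliefs after the private signals (uniform prior, so they equal the likelihood rows).
SUBJECT_I = [Fraction(1, 8), Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]   # saw s3
SUBJECT_II = [Fraction(1, 8), Fraction(1, 4), Fraction(1, 2), Fraction(1, 8)]  # saw s4
P_M = [Fraction(0), Fraction(1), Fraction(1), Fraction(0)]  # P(M | type)


def update(belief, likelihood):
    """Bayes' rule on a finite type space."""
    weights = [b * l for b, l in zip(belief, likelihood)]
    total = sum(weights)
    return [w / total for w in weights]


def lump(belief):
    """Collapse beliefs over types into [P(Bad), P(Good)], Bad = {2,3}, Good = {4,5}."""
    bad = sum(p for t, p in zip(TYPES, belief) if t in (2, 3))
    return [bad, 1 - bad]


def fosd(p, q):
    """True if p first-order stochastically dominates q (CDF of p never above CDF of q)."""
    cdf_p = cdf_q = Fraction(0)
    for a, b in zip(p, q):
        cdf_p += a
        cdf_q += b
        if cdf_p > cdf_q:
            return False
    return True


post_I = update(SUBJECT_I, P_M)
post_II = update(SUBJECT_II, P_M)

print("lumped I :", lump(SUBJECT_I), "->", lump(post_I))
print("lumped II:", lump(SUBJECT_II), "->", lump(post_II))
print("II rises on {B,G}:", fosd(lump(post_II), lump(SUBJECT_II)))
print("II rises on full space:", fosd(post_II, SUBJECT_II))
```

On the lumped space the higher belief rises and the lower belief falls in the fosd sense, while on the full type space neither posterior is fosd-comparable to its own prior, which is the pattern the slide describes.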
• All the previous discussion was about “if we are
trying to explain experiments, what is it that
experiments tell us?”
– They don’t tell us anything about FOSD
– Despite Baliga et al., you could observe FOSD polarization in an experiment if you don’t know the exact type space.
– From Baliga also: FOSD seems too strong (precludes
moderation).
• Still, we can derive our results in FOSD, so we will
proceed with this “higher bar” for polarization.
Lord, Ross, and Lepper
• 151 subjects complete a questionnaire
• 48 are selected two weeks later.
– 24 Proponents: Favour capital punishment,
believe it has a deterrent effect, think most
relevant research supports their views.
– 24 Opponents: Oppose capital punishment, doubt
it has a deterrent effect, believe most relevant
research supports their views.
Two Studies
• Kroner and Phillips (1977) compared murder rates for the year before and the year after adoption of capital punishment in 14 states. In 11 of the 14 states, murder rates were lower after adoption of the death penalty. This research supports the deterrent effect of the death penalty.
• Palmer and Crandall (1977) compared murder rates in 10 pairs of
neighbouring states with different capital punishment laws. In 8 of the 10
pairs, murder rates were higher in the state with capital punishment. This
research opposes the deterrent effect of the death penalty.
• Two-page description of methodology.
• Studies also flipped
• Studies are fictional, but characteristic of research found in the literature
cited in judicial decisions.
Attitudes Polarize
“It is an impressive demonstration of
assimilation biases that contending factions
both believe the same data to justify their
position ‘objectively’.”
Updating Model
T: Capital punishment deters
F: Does not deter
[Diagram: both panels read T|G, F|B]
Ancillary State Matters
Ambiguous signals: different meanings in different ancillary states.
T: Capital punishment deters
F: Does not deter
[Diagram: one panel reads T|G, F|B; the other reads T|B, F|G]
• Recall from Hannah “ambiguous”: distracted meant
different things in different “ancillary” states
– B “distracted=boredom”; I “distracted=inability to focus”
• State space: {B, I} × {3.5, 4, 4.5}. Each state has p = 1/6.
• Signal 1: A = {b, i}, with P(b | B, j) = 2/3 and P(b | I, j) = 1/3.
• Signal 2: S = {sometimes distracted, focused} = {d, f}.

Probability of signal d in each state
            3.5    4      4.5
B           1/2    1/2    1
I           1      1/2    1/2

Posterior after b and common signal d
            3.5    4      4.5
B           1/6    1/6    1/3
I           1/6    1/12   1/12
Marginal    1/3    1/4    5/12

Posterior after i and common signal d
            3.5    4      4.5
B           1/12   1/12   1/6
I           1/3    1/6    1/6
Marginal    5/12   1/4    1/3
Plausible?
                T      F
Selection       50%    75%
No Selection    25%    50%
Info: % of states in which crime rises.
T: Capital punishment deters
F: Does not deter
• State space: {S, N} × {T, F}. Each state has p = 1/4.
• Signal 1: A = {s, n}, with P(s | S, j) = 2/3 and P(s | N, j) = 1/3.
• Signal 2: % of states with an increase in crime rate = {25%, 50%, 75%}.

Probability of signal 50% in each state
            T      F
S           3/4    1/2
N           1/2    3/4

Posterior after s and common signal 50%
            T      F
S           2/5    4/15
N           2/15   1/5
Marginal    8/15   7/15

Posterior after n and common signal 50%
            T      F
S           1/5    2/15
N           4/15   2/5
Marginal    7/15   8/15
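The same Bayes' rule check works for the capital punishment model. This sketch uses the state space, Signal 1 likelihoods and the "probability of signal 50%" table given above; the names `posterior` and `prob_deters` are illustrative.

```python
from fractions import Fraction

PRIOR = Fraction(1, 4)  # each (ancillary, hypothesis) state is equally likely

# P(signal s | ancillary state); P(n | .) is the complement.
P_s = {"S": Fraction(2, 3), "N": Fraction(1, 3)}

# P(common signal "50%" | ancillary state, hypothesis).
P_50 = {
    ("S", "T"): Fraction(3, 4), ("S", "F"): Fraction(1, 2),
    ("N", "T"): Fraction(1, 2), ("N", "F"): Fraction(3, 4),
}


def posterior(private):
    """Joint posterior after `private` ('s' or 'n') and the common signal 50%."""
    weights = {}
    for (a, t), p_50 in P_50.items():
        p_private = P_s[a] if private == "s" else 1 - P_s[a]
        weights[(a, t)] = PRIOR * p_private * p_50
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


def prob_deters(post):
    """Marginal posterior probability of T (capital punishment deters)."""
    return post[("S", "T")] + post[("N", "T")]


for private in ("s", "n"):
    print(private, prob_deters(posterior(private)))
```

Signal 1 alone leaves the probability of T at 1/2, so the divergence to 8/15 and 7/15 comes entirely from reading the same 50% figure through different beliefs about the ancillary state.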
• Can we do it? Yes (others have too).
• With a plausible model? Yes (previous slides). The
focus on plausibility is new.
• Next:
– Should you expect the result? Yes (new)
• Theorem
• Empirical “proof”
– Predict patterns? Yes (new)
• When would polarization occur
• Experts polarize more (Theorem). Unexpected!
• People who are more certain polarize more. Unexpected!
Theorem: Bayesian model says Polarization
“should” occur in setting of experiments
[Figure: diagram with labels Prior Evidence 50%, New Evidence 50%, Selection, No Selection, Believe true, Believe false, and ranges 0--65%, 66--100%, 0--35%, 36--100%]
True: CP deters
False: CP does not deter
“Proof” that result is expected: Affirmative Action
Miller, McHoskey, Bane, and Dowd (1993)
No net polarization.
“Why did relatively more subjects in this study
report a depolarization of their attitudes? We
have no convincing answer. Subjects may have
been less familiar with detailed arguments about
affirmative action relative to capital punishment.”
• Information orthogonal to how they were
selected
Who Polarizes?
• Experts and people with strong opinions.
• Capital Punishment (Miller…):
– Polarization: Extreme pro ≈ 52%, moderate pro ≈ 26%,
moderate anti ≈ 39%, extreme anti ≈ 47%.
– Depolarization: Extreme pro ≈ 7%, moderate pro ≈ 18%,
moderate anti ≈ 17%, extreme anti ≈ 11%.
– No change ≈ 50%.
• Nuclear:
– Attitude polarization highest among subjects who
reported high issue involvement and strong convictions.
Who polarizes? Experts
• On experts: If experts are people who know “all” the
available information (say: number and nature of
nuclear incidents), they will have observed the same
prior information S.
• For the general population, those who have seen
ambiguous signals (of the same kind as C) will
polarize, while the rest won’t.
More polarization for people with
strong opinions.