
Public Opinion and Survey Analysis
Module II: Introduction to Surveys
Tuesday, July 6, 2010
David Crow, Associate Director
UC Riverside, Survey Research Center
david.crow@ucr.edu
What Surveys Measure
• Attitudes – positive or negative orientation
toward something
• Beliefs – opinions about the objective state of
the world (something is true or untrue)
• Knowledge
• Behavior
BUT self-reported behavior (“stated” not
“revealed” preferences)
Survey Goals (Weisberg, Chap. 7)
• Four Basic Survey Goals (recapitulating from Chap. 1)
1) Measuring prevalence of attitudes, beliefs, and
behaviors
- attitudes: likes and dislikes (do I like or dislike it?)
- beliefs: acceptance of a factual proposition about the state of the world (is it true or not?)
- behaviors: things we do
2) Change over time
3) Differences between subgroups
4) Analyze causes of attitudes, beliefs, and behaviors
Uses of Surveys (Weisberg, Chap. 1)
• Polls and Elections
• Population Characteristics
– Current Population Survey (CPS)
– American Community Survey (ACS)
– Bureau of Labor Statistics
• Consumer Research
• Courts
– “contingent valuation”
– Utah apportionment case
Other Research Designs
• Experiments:
experimenter manipulates variable (experimental
treatment) and observes its effects in different groups
• Aggregate Data:
using data (census, election, sales) available for groups or geographic clusters of people (countries, states, census tracts) rather than for individuals
• Deliberative Poll: do entrance survey, then give participants
information and ask them to participate in group discussion, do
exit survey to see if views have changed.
• Focus Groups: moderated discussion
• Audience Reaction: exposing a group to a stimulus (e.g., an ad or broadcast) and recording its reactions
• Others: in-depth interviews, participant observation, content
analysis
Tradeoff: Broad vs. Deep
• Know a little about a lot of people, or a lot about a few people
• Surveys, Aggregate Data, Experiments: results can
be generalized from sample to populations, BUT
we can’t explore topics in-depth
• In-Depth Interviews, Participant Observation,
Focus Groups: We can explore concepts and
meanings with great detail and nuance, but we
can’t be sure the results are valid beyond the
participants
Measuring Behavior
• Self-reported actions in the past
• Problems:
1) Imperfect Recall: the farther the behavior is in the past, the
less accurate reports of it are; forward telescoping
(remembering events as more recent than they really were)
and backward telescoping (remembering events as more
remote than they really were)
Solutions: 1) ask people about recent behavior; 2) set time frames
for people with “warm-up” questions; 3) anchor memories to
important life or historical events
2) Sensitive Topics: people are reluctant to report engaging in
socially disapproved behavior and “over-report” engaging in
socially desirable behavior
Solutions: 1) “bogus pipeline” technique; 2) include behavior as
part of a list of non-controversial behaviors; 3) randomized
response technique
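A minimal sketch (Python) of solution 3, the randomized response technique, in its "forced-response" variant: with probability p_truth the respondent answers the sensitive question truthfully, otherwise simply says "yes". The numbers are illustrative, and this is one common variant rather than a specific procedure from the readings.

```python
# Sketch: forced-response variant of the randomized response technique.
# Assumption (not from the slides): with probability p_truth the respondent
# answers the sensitive question truthfully; otherwise they are told to say "yes".
# All numbers below are invented for illustration.

def estimate_prevalence(yes_rate, p_truth):
    """Back out the true prevalence pi from the observed 'yes' rate.

    P(yes) = p_truth * pi + (1 - p_truth)  =>  pi = (P(yes) - (1 - p_truth)) / p_truth
    """
    return (yes_rate - (1 - p_truth)) / p_truth

p_truth = 0.75        # e.g., a spinner instructs 75% of respondents to answer truthfully
observed_yes = 0.40   # share of "yes" answers actually observed
print(estimate_prevalence(observed_yes, p_truth))  # -> 0.20 estimated true prevalence
```

Because no one (including the interviewer) knows which instruction any individual received, respondents can admit the behavior without revealing it, yet the analyst still recovers an aggregate estimate.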
Measuring Attitudes: Non-Attitudes
• Non-Attitudes: respondents don’t know/ haven’t thought
enough about question to have meaningful opinion
(Converse, 1964)
– “Don’t know / No opinion”
– BUT, sometimes they give an on-the-spot answer that doesn’t reflect a
real opinion
– Debate: include NO/DK option or not? Pro: prevents reporting of
non-attitudes as attitudes; Con: gets respondents off the hook from
doing tough cognitive work → solutions: 1) offer DK/NO option, but
prompt people to reflect on responses; 2) deliberative polling
• Attitude Strength: how strongly a person feels about a
topic
– Strong attitudes tend to be more stable over time
– Strong attitudes are better at predicting behavior
– Factors contributing to attitude strength: 1) knowledge of topic; 2)
interest in topic; 3) value system
Measuring Attitude Strength
• Ask directly → respondents rate the topic on a scale of importance
• Measure response latency → longer response times indicate weaker opinions
• Give counterarguments →
survey-based experiment: ask for opinion, give a counter-argument, re-ask the question, and see whether the answer changes
Attitude Stability
• Is public opinion fickle (i.e., does it change easily over time) or stable?
• Is change real change or faulty measurement?
– Differences over meanings of words → e.g., conservative / liberal
• Strong opinions are more stable
– More resistant to new evidence
• When do opinions change?
- Exposure to new evidence (e.g., increased opposition to Clinton
health plan)
- Changing “frames”: a “frame” is a widely accepted conceptual lens
(often created in part by news media coverage) used to interpret
events → frames change over time
- E.g., Anita Hill / Clarence Thomas: from “sexual harassment” to “high-tech lynching”
- E.g., Schwarzenegger election: “Kooky Californians”, “Popular
Revolt”, “Great Incommunicator” (Lakoff, 2004)
Measuring Change Over Time
• Change: increase or decrease of numerical
variables over time
• Some ways to measure change:
– Attitude Recall Data: ask R directly how he/she felt about something
in the past → often inaccurate because of “consistency bias” (the desire to
present oneself as consistent), which leads to underestimating true change
– Comparison of Cross-Sectional Surveys Over Time: compare averages
for same (or similar) questions asked of different people at different
times
– Panel Studies: ask same questions to same people at different points
in time  “repeated measurements”
– Instant Polls: interactive polls that measure real-time reactions to
stimuli  e.g., Frank Luntz “dial” polls
Problems with
Cross-Sectional Comparisons
Problem: Distinguishing Between Real Differences and Ones that
are Artifacts of Survey Methodology
• Different Survey Organizations: different survey organizations use
different methods to select, contact, and interview respondents  results
could vary
• Different Populations: are the samples drawn from the same
population? E.g., voting age adults vs. “likely voters”
• Different Questions: even slight variations in question wording can
change responses
• Sampling Error: fluctuation over time might be random error rather
than real change
Panel Studies: Advantages
• Allows for Assessment of Causal Effects:
Causality implies temporal priority of cause to effect
cross-section allows us to see how an effect varies across groups,
but not how an effect varies as a result of some changing
circumstance;
- e.g., effect of age on voting: older people vote more often than
younger people, but getting older doesn’t increase your propensity
to vote
- e.g., effect of gun ownership on feelings of safety: people with guns
feel safer than people without, but getting a gun doesn’t necessarily
mean you will feel safer
• Gross vs. Net Change: gross = individual-level change (turnover); net =
overall, aggregate-level change → panel studies allow for
measurement of both (see the sketch below)
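A toy illustration (Python, invented data) of the distinction: the same five respondents measured in two waves can show zero net change even though several individuals switched, which is exactly the gross change only a panel can detect.

```python
# Toy panel: the same 5 respondents measured in two waves (1 = approves, 0 = does not).
# Data are invented for illustration.
wave1 = [1, 1, 0, 0, 1]
wave2 = [0, 1, 1, 0, 1]

net_change = (sum(wave2) - sum(wave1)) / len(wave1)                     # change in the aggregate share
gross_change = sum(a != b for a, b in zip(wave1, wave2)) / len(wave1)   # share of individuals who switched

print(f"net change:   {net_change:+.0%}")    # 0%: the aggregate looks perfectly stable
print(f"gross change: {gross_change:.0%}")   # 40%: two of five respondents actually switched
```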
Panel Studies: Disadvantages
• Expensive to locate people for re-interviews
• “Mortality” (aka “attrition”) → people drop out of successive
waves of the study
- Non-random attrition (i.e., when attrition systematically affects
one group more than others) could alter results
e.g., poorer people are more transient and more difficult to locate; people
who are less interested in a topic are less likely to be interviewed
• Survey process itself could alter behavior under study
- e.g., initial interview about elections could increase interest in an
election, causing people to vote who otherwise would not have
voted
Subgroup Comparisons
• Comparing differences in behaviors, attitudes, and beliefs across
subpopulations
• Take the average for one group and compare it to that of another group; e.g., Calderón’s
job approval ratings among PAN, PRI, PRD adherents (and those with no affiliation)
• Note that you must take into account the uncertainty associated with estimation for each
subgroup (see the sketch at the end of this slide)
• Implications for sampling  typical nationally representative
sample of 1,200 may not be enough to assess differences
- “double-” or “over-sample” subgroups
- “pyramiding”: combining estimates for subgroups over several
surveys (at same point in time or at different time points)
• BUT potential difficulties in assessing aggregate behavior and
attitudes are less severe when comparing subgroups
• E.g., recalled voting behavior: inaccurate memory affects our estimates of total
proportions of people who voted, but it would only alter our estimates of the
relationship of union membership to voting if we felt that union members were more or
less likely to remember inaccurately than non-union members
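A small sketch (Python, hypothetical party subgroups and counts) of the uncertainty point noted above: each subgroup estimate gets its own standard error and margin of error, which are much wider than for the full sample because the subgroup n is smaller.

```python
from math import sqrt

# Hypothetical subgroup results from a national sample of 1,200:
# (subgroup, n in subgroup, number approving of the president)
subgroups = [("PAN", 320, 230), ("PRI", 300, 140), ("PRD", 220, 80), ("None", 360, 150)]

for name, n, approvers in subgroups:
    p = approvers / n
    se = sqrt(p * (1 - p) / n)   # standard error of a proportion
    moe = 1.96 * se              # ~95% margin of error, wider than for the full sample
    print(f"{name:5s} approval = {p:.0%}  +/- {moe:.1%}  (n={n})")
```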
Assessing Causes of Behavior
• Ask people directly why they do things → but people offer post
hoc rationalizations
• Better strategy: think about possible causes of behavior, and
social and individual circumstances that influence actions, and
ask about them
• Explore numerical associations through statistical techniques such as crosstabulations, correlation, multiple regression
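A brief sketch (Python with pandas, invented micro-data) of the kinds of numerical associations mentioned: a crosstabulation of turnout by union membership and a simple correlation. The variable names and values are hypothetical.

```python
import pandas as pd

# Invented micro-data: one row per respondent.
df = pd.DataFrame({
    "union_member": [1, 1, 0, 0, 0, 1, 0, 1, 0, 0],
    "voted":        [1, 1, 1, 0, 0, 1, 0, 1, 1, 0],
    "age":          [55, 42, 29, 23, 61, 47, 35, 52, 19, 44],
})

# Crosstabulation: turnout by union membership (row percentages).
print(pd.crosstab(df["union_member"], df["voted"], normalize="index"))

# Simple correlation between age and reported turnout.
print(df["age"].corr(df["voted"]))
```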
Populations and Samples
• Define the group of people to be studied
- Characteristics: geography? age? gender?
- Should be population suited to study of research question
• Samples: Representative Subset of Population
- Who should be interviewed? Population or sample?
- How many interviews are necessary?
• Larger is more representative and gives a more precise
estimate, but costs more
• Are subgroups important?  oversampling
• Depends on research question:
elections  900 to 1,500
drug trials  often as few as 200
• Modes of contact: 1) face-to-face; 2) SAQ (self-administered
questionnaire, pencil-and-paper or Web); 3) telephone
Problems & Challenges (Cont’d)
• Response Rate: not everyone responds; the sample over-represents easy-to-reach
people and so is not representative of the population → solutions: 1) increase efforts
to reach hard-to-reach people (e.g., increase callbacks); 2) substitute new respondents
or PSUs for non-respondents → possibility of “substitution bias”; 3) offer
incentives
• “Sugging”: selling under the guise of surveys  people grow
wary of surveys
• Sampling Error: uncertainty in estimates based on only a part of
the population (margin of error) → can be calculated only for
probability samples
• Non-coverage Error: sampling frame does not correspond to
target population
Constructing Questionnaire
• Topic Order: Avoid embarrassing or difficult
topics at beginning; demographics, especially
income, toward end  ESTABLISH TRUST
• Question Order: Questions should flow easily;
place related questions together. However,
“consistency bias”  respondents want to appear consistent and
give same answers to similar questions; solution: invert scales, vary question
placement
• Response Set: vary response scales and their direction so respondents do not fall into answering every item the same way
• Number of Questions: Keep interview
manageable
Questionnaire Construction
(Weisberg, Chap. 4)
• Question Form:
- Closed-ended questions: predefined response categories  easier
to code and analyze, but don’t take into account all possible
responses
- Open-ended questions: allow free responses → accurately
reflect the range of possible responses, but answers are difficult to group
together and time-consuming to code
• Rating Scales:
- Likert scale: four or five ordinal categories  e.g., “strongly agree”,
“agree”, “neither agree nor disagree”, “disagree”, “strongly disagree”
- Feeling thermometer: 0 is very cold, 50 is neutral, 100 is very warm
- Semantic differential: bipolar scales (typically seven-point) that
ask respondents to rate something along several dimensions
- Numerical scales: e.g., 1 to 10, sometimes with end-points
anchored by semantic content
Question Wording
• Avoid Ambiguity:
– Conceptual Ambiguity: be as specific as possible; e.g., not
“racial integration”, but “racial integration” in specific
situations; short, direct questions
– Temporal ambiguity: avoid broad, undefined time periods for
self-reported past behaviors; specify a reference period
• Avoid Bias: the question should scrupulously avoid leading
respondents toward a particular response →
– 1) use neutral “frames”; e.g., for taxes → “estate tax” vs. “death tax”
– 2) social desirability bias → interviewees say what they think others want to
hear; solutions: non-judgmental question phrasing, interviewer
rapport
Question Wording (Cont’d)
• Avoid “double-barreled” questions: “twofers”
that ask about two things in one question →
e.g., bipolar scales should really be opposites; or two possible reasons for a
response, e.g., “do you think taxes on foreign oil should be used to reduce
consumption?” A “no” could mean R doesn’t want to reduce consumption or doesn’t
want to tax foreign oil → solution: branching questions
• No Opinion Option?
Debate: early research (Converse,
Michigan) portrayed citizens as uninformed → forcing an answer may
pressure citizens into meaningless responses; more recent research
(Krosnick) shows that a “no opinion” option gives respondents an easy out
→ allows them to avoid the cognitive work of thinking about tough issues;
solution: interviewer prompts
Question Wording
• Use Standard Wordings: if a question has
been asked before, see how others have done
it → Census Bureau (Current Population Survey, CPS; American Community
Survey, ACS); American National Election Studies, ANES (U. of Michigan);
General Social Survey, GSS (NORC, University of Chicago)
• Wording Matters!
Are different question wordings equivalent?
E.g., “prohibiting abortion” vs. “protecting the life of an unborn child” or
“satisfied with democracy in Mexico” vs. “satisfied with the way
democracy is working in Mexico”
Issues in Rating Scales
• Three Decisions: 1) How many points to include? 2)
Middle category or not? 3) How many and what labels?
• Difficult to remember many categories: solutions → 1)
prompt cards in face-to-face interviews; 2) branching
format → e.g., party ID on the American National Election
Studies (first question asks which party R identifies with; if
R answers none, a follow-up question asks whether R
leans toward any party), as in the sketch below
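A sketch (Python) of how such a branching sequence might be combined into a single summary measure; the helper function and coding scheme below are illustrative and simplified, not the exact ANES algorithm.

```python
def party_id(initial, lean=None):
    """Combine a branching party-ID sequence into one summary code (illustrative only).

    initial: "Democrat", "Republican", or "None"
    lean:    asked only if initial == "None"; "Democrat", "Republican", or "Neither"
    """
    if initial in ("Democrat", "Republican"):
        return initial
    # Respondent identified with no party -> use the leaner follow-up question.
    if lean in ("Democrat", "Republican"):
        return f"Independent, leans {lean}"
    return "Pure independent"

print(party_id("None", lean="Democrat"))   # -> "Independent, leans Democrat"
```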
Order Effects
1) Order in which answers are presented could slant
responses: “primacy” → first category privileged;
“recency” → last category privileged
2) Order in which questions are presented in the poll could
bias answers → e.g., a question that asks whether R voted,
followed immediately by a question about whether R is registered to
vote, will bias responses to the second question
Evaluating Questions
• Reliability: people should answer the question the
same way each time they are asked; results
should be reproducible → ways to assess reliability: 1) measure the
same people a short time later (test-retest); 2) batteries of similarly worded questions →
answers should be correlated (see the sketch after this slide)
• Validity: question should measure concept it is
intended to measure  1) “face validity”: question seems to
measure appropriate concept on first inspection; 2) “convergent validity” and
“divergent validity”: measures of same concept should have similar answers,
measures of different concepts should have different answers  tested by
correlational analysis; 3) “criterion validity”  compare answers against
objective data  e.g., self-reported voting behavior; 4) “content validity” 
measures all important aspects of concept; 5) “construct validity”  measures
how related one concept is to another concept  related concepts should be
correlated
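A short sketch (Python with NumPy, fabricated responses) of the correlational check for reliability mentioned above: items in a battery of similarly worded questions should hang together across respondents, and Cronbach's alpha is one standard summary of that.

```python
import numpy as np

# Fabricated battery: 6 respondents x 3 similarly worded items (1-5 agreement scale).
items = np.array([
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
    [4, 4, 4],
])

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1).sum()   # sum of the individual item variances
total_var = items.sum(axis=1).var(ddof=1)     # variance of the summed scale
alpha = (k / (k - 1)) * (1 - item_vars / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")      # values near 1 suggest a reliable battery
```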
Technical Concepts
• Sampling Error: because a survey result is based on only a
part of the population, it will typically be a little above or
below the real value of the variable
• Margin of Error: an estimate of the precision of the
estimated value, reported as +/- x% around the estimated
value; depends mainly on the number of respondents →
wider for subsamples
• Confidence Level: percentage of samples in which the true
value will fall within the margin of error; if CL is 95%, in 1
out of 20 samples, the real value will be outside the
margin of error; higher confidence level  wider margin of
error
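A minimal worked example (Python) tying the three concepts together for a simple random sample: the usual 95% margin-of-error formula for a proportion, showing why subsample estimates carry wider margins.

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p from a simple random sample of size n."""
    return z * sqrt(p * (1 - p) / n)

# Worst case (p = 0.5) for a typical national sample vs. a subsample:
print(f"n = 1200: +/- {margin_of_error(0.5, 1200):.1%}")   # about +/- 2.8 points
print(f"n =  300: +/- {margin_of_error(0.5,  300):.1%}")   # about +/- 5.7 points for a subsample
```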
Media and Polls
(Gollin 1987)
• Increase in media reliance on polls
- Volume of stories based on polls has expanded dramatically
- Polls become the story, rather than being used as part of a story
- Media outlets open their own polling operations
o Sporadic before the 1960s, increasingly common after
o CBS / NYT
o Washington Post/ABC
• Public demand for polls increased →
- From the 1940s to the 1980s, the public trusted polls as accurate reflections
of public opinion
Media Reporting of Polls
• However, suspicions arose in the 1970s
- News media using polls to “make news” rather than report it
- Candidates’ claims of electability (based on internal, “secret” polls)
contradicted by public polls
- Citizens suspected activists and politicians of manipulating polls to
further their aims
- Increased number of polling organizations  uneven quality of
polls
- Conflicting, inaccurate electoral forecasts
• Response: Legal regulation?
- Congressional bills after 1936 and 1948, but 1st Amendment
protections won out
- Self-regulation: professional associations (AAPOR, NCPP)
Public Perceptions of Polling
• Public is increasingly mistrustful
• Public is weary of polls
- Invasive and make demands on time
- Public attitudes toward telephone etiquette changing (Groves,
“Survey Nonresponse”)
- Sales, telemarketing, commercial and governmental research
Should We Trust Polls Reported in
Media? (Asher, Chap. 6)
• Is the poll well done technically?
• Is the media source interpreting the poll
correctly? (Do we have enough information to know?)
• Who’s paying for the poll and why?
Technical Reporting Standards
• NCPP/AAPOR Standards (Asher, Chap. 6)
– Sponsorship
– Field Work:
• Dates of field work
• Location
• Contact method
– Sample:
• Population sampled
• Description of sampling frame
• Selection procedure (random? Self-selected?)
• Size (N)
• Response / Completion rates
• Description of subsamples, if any
– Question wording
– Precision (sampling error, margin of error)
Technical Standards Don’t Ensure
Quality of Information
• Source of poll (pollster) may be different from the
news agency covering the poll
- If same, easier to comply
- If different, no way to enforce recommendations
• Standards themselves are incomplete
- Don’t address response rates / efforts to increase response
- Don’t address sample adjustments  weighting, filters
Weighting: e.g., Latinos constitute 35% of the CA population, but only
10% of the sample → weighting adjustment multiplies each Latino respondent by
.35 / .10 = 3.5
Filters: e.g., probable voters: “How likely are you to vote?” (five-point scale
ranging from “definitely” to “not at all”) → filter out “not at all” and base
conclusions on the other respondents (see the sketch below)
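A small sketch (Python with pandas, hypothetical data) of the two adjustments just described: a weight of .35/.10 = 3.5 applied to each under-sampled Latino respondent, and a likely-voter filter that drops the "not at all" category before estimating.

```python
import pandas as pd

# Hypothetical sample: the Latino share of the sample (10%) is below the population share (35%).
df = pd.DataFrame({
    "latino":      [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] * 10,   # 10% of the sample
    "vote_intent": ["definitely"] * 60 + ["probably"] * 25 + ["not at all"] * 15,
})

# Weighting: population share / sample share, per group.
pop_share = {1: 0.35, 0: 0.65}
samp_share = df["latino"].value_counts(normalize=True).to_dict()
df["weight"] = df["latino"].map(lambda g: pop_share[g] / samp_share[g])  # Latinos get 0.35/0.10 = 3.5

# Filter: keep probable voters only before estimating.
likely = df[df["vote_intent"] != "not at all"]

print(df.loc[df["latino"] == 1, "weight"].iloc[0])   # -> 3.5
print(len(likely), "likely voters kept of", len(df))
```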
Media Don’t Always Comply
with Standards
• Newspapers
- Compliance varies (Miller and Hurd, 1982: of 116 polls reported,
85% complied with sample size, but only 16% with margin of error)
- BUT, study based on big-market papers
- Coverage improved over time
• TV: Paletz et al., 1980
- Considerably worse than newspapers
- Sponsor never mentioned; question wording in 5% of TV news
programs; survey dates in only 30% (cf. 43% in NYT)
- Virtually no other technical information
– Larson, 2000
- 50% mentioned sampling error, but without explaining how
it works
Substantive Interpretation
• Media have wide latitude in interpreting
results of polls; e.g. 1985 NYT abortion poll:
- Legal as is now: 40%
- Legal only to save mother, rape, incest: 40%
- Not permitted: 16%
- Don’t know / NA: 4%
“Abortion is murdering a child / Abortion not murder because fetus isn’t a person”
- Murder: 53%
- Not murder: 35%
- Don’t know / NA: 10%
“Abortion sometimes is the best course in a bad situation”
- Agree: 66%
- Disagree: 26%
- Don’t know / NA: 8%
Criticism of Media Poll Reporting
• Misinterpretation, e.g.:
– NYT 1989 overstated support for tort reform (poll sponsored by
insurance company Aetna) (Krosnick 1989)
– News outlets wrongly reported increasing support of Panama Canal
Treaty (Smith and Hogan 1987)
– Coverage of Ohioans’ support for “intelligent design” ignored questions
in the poll suggesting it should be taught at home or in church
(Jacobs and Shapiro 1995)
• Also, media CREATE news by carrying out polls
and reporting on them
• Focus on numbers is often to the detriment of
underlying meaning → “horse race” coverage of presidential elections