Sweden lecture on survey modalities

Survey Experiment Modalities
Some Pros and Cons of Differing Sampling Sources and Methods
Matthew A. Baum
Harvard University
&
Leonie Huddy
Stony Brook University
Outline
Review of prominent Internet players
Amazon Mechanical Turk
Knowledge Networks
YouGov/Polimetrix
SurveyMonkey Audience
Comparing survey modalities
USA vs. Swedish Phone/Internet Usage
Statistics
Conclusions
Amazon Mechanical Turk (AMT)
Overview
Began in 2005
“Workers” sign up to participate in tasks for pay (called
“Human Intelligence Tasks”, or “HITs”)
Global workforce (~100,000 workers)
Salary: ~$3-$8/hour (maybe $10/hr for high-experience workers)
“Requestors” recruit “workers” on internal message board
describing task
Can specify “qualifications” for workers (e.g., experience, quality
ratings, nationality, etc.)
10% fee to Amazon
AMT provides survey building tools, or Requestors can include
links to external surveys
HITs range from 1-second marketing surveys (“Rate appeal of
this photo on 1-10 scale”) to elaborate 30+ minute experimental
surveys
Amazon Mechanical Turk (AMT)
Effects of Compensation Amount and Task Length on Participation Rates
(Submitted Surveys per Hour of Posting Time)
Compensation   Short survey   Medium survey   Long survey
Amount         (5 min)        (10 min)        (30 min)
$.02           5.6            5.6             5.3
$.10           25.0           14.3            6.3
$.50           40.5           31.6            16.7
Source: Buhrmester, Kwang, and Gosling (2011)
My AMT Experience:
• paid $1.00 per completed HIT for ~22 minute survey
• Listed as “News and Politics Survey”
• Received 1933 completed surveys in ~one month
• rapidly diminishing returns
• 1825 valid responses
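The cost implied by these figures can be worked out in a few lines. This is only a rough sketch: it assumes every one of the 1933 completed HITs was paid the full $1.00 and that Amazon's 10% fee was charged on top of that payment, neither of which the slide states explicitly. The function and variable names are illustrative.

```python
# Figures taken from the slides; how the fee applies is an assumption.
PAY_PER_HIT = 1.00       # dollars paid per completed HIT
COMPLETED = 1933         # completed surveys received
VALID = 1825             # valid responses among them
AMAZON_FEE_RATE = 0.10   # 10% fee to Amazon


def amt_total_cost(completed, pay_per_hit, fee_rate):
    """Total outlay, assuming all completed HITs are paid and the fee is added on top."""
    return completed * pay_per_hit * (1 + fee_rate)


total = amt_total_cost(COMPLETED, PAY_PER_HIT, AMAZON_FEE_RATE)
cost_per_valid = total / VALID
valid_share = VALID / COMPLETED

print(f"Total cost: ${total:,.2f}")
print(f"Cost per valid response: ${cost_per_valid:.2f}")
print(f"Valid share of completed surveys: {valid_share:.1%}")
```

Under those assumptions the whole study costs a little over $2,000, or a bit more than a dollar per usable response.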
Amazon Mechanical Turk (AMT)
Demographics
Buhrmester, Kwang, and Gosling (2011)

Sample     % Female   % NonWhite   % NonAmerican   Age
AMT        55         36           31              32.8
Internet   57         Fewer (?)    ~31             24.3

Paolacci, Chandler, and Ipeirotis (2010)

Category (N=1000 AMT workers)
% Female                     64.9
Average Age                  36
% Earning below $60k/year    66.7
Education                    “higher than general pop.”
% Non-American               53
Amazon Mechanical Turk (AMT)
Demographics
                  My AMT Survey   US National Averages
                  (N=1824)        (From Pew 2010 Survey)
Average Age       32.6            51.0
% Non-White       12.2            20.3
% Female          58.2            56.9
% Democrats       37.4            31.9
% Republicans     17.7            28.0
% Independents    44.3            40.1
% Liberal         36.7            17.5
% Moderate        46.9            43.9
% Conservative    16.3            38.6
Amazon Mechanical Turk (AMT)
Follow-Up Survey
Contacted all 1825 workers who completed valid HITs in original
survey approximately 4 months later
Offered $.50 for ~12 minute survey
426 valid responses (24%)
2nd batch of invites
Tradeoffs
Source: Paolacci, Chandler, & Ipeirotis (2010)
Some Conclusions re AMT
“Our analyses of demographic characteristics suggest that
MTurk participants are at least as diverse and more
representative of noncollege populations than those of
typical Internet and traditional samples. Most important, we
found that the quality of data provided by MTurk met or
exceeded the psychometric standards associated with
published research.”
Buhrmester et al. (2011)
“Our theoretical discussion and empirical findings suggest
that experimenters should consider Mechanical Turk as a
viable alternative for data collection.”
Paolacci et al. (2010)
Knowledge Networks
2007 Survey (via TESS)
Participants recruited via residential address searches
Respond online (Internet access provided to recruits who don’t
have it)
Pretty good samples, but imperfect (selection effects not
completely purged) and expensive
Demographics
Post-Survey Matched Sample (N=1014)

% Female                             50
% Non-white                          26.7
% Liberal                            25
% Moderate                           41
% Conservative                       33
Average Age                          46
Ideological Intensity (0-3 scale)    .99
Recent (2012) KN Proposal
• Design: 4-wave study, longitudinal sample, all waves occurring within 1 year
• Pretest: N=25 interviews each wave
• Sample: General population adults, age 18+, English-language survey-takers
• Number of completed interviews: N=2,000 wave 1, with about 70%-80% of
wave 1 respondents completing each of the later waves
• Median survey length: 20 minutes each wave
• Multimedia/graphics: None
• Incentives: $5 for each of waves 1-3, $10 for wave 4
• KN will provide standard deliverables (self-documented data file with all the
survey data, general demographic profile data, and field report documenting
all sampling and data collection procedures, codebook, and panel recruitment
methodology)
• Price: $255,550 (No, that’s not a typo!)
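To put that price in perspective, the proposal's own numbers give a rough cost per completed interview. The sketch below assumes the quoted price covers all four waves and that each of the three later waves retains between 70% and 80% of the 2,000 wave 1 respondents. Whether respondent incentives are included in the price is not stated, so they are left out. Names are illustrative.

```python
# Figures from the proposal; treating the price as all-inclusive is an assumption.
PRICE = 255_550          # quoted price in dollars
WAVE1_N = 2_000          # completed interviews in wave 1
LATER_WAVES = 3          # waves 2-4
RETENTION_RANGE = (0.70, 0.80)  # share of wave 1 respondents completing each later wave


def total_interviews(wave1_n, later_waves, retention):
    """Completed interviews across all waves at a given retention rate."""
    return wave1_n + later_waves * wave1_n * retention


def cost_per_interview(retention):
    """Quoted price divided by the total number of completed interviews."""
    return PRICE / total_interviews(WAVE1_N, LATER_WAVES, retention)


for retention in RETENTION_RANGE:
    n = total_interviews(WAVE1_N, LATER_WAVES, retention)
    print(f"Retention {retention:.0%}: {n:,.0f} interviews, "
          f"${cost_per_interview(retention):.2f} per interview")
```

On these assumptions each 20-minute interview costs roughly $38 to $41.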
YouGov/Polimetrix
Opt-in sampling
Random draw from target population matched with most
comparable available panelists to create representative
population samples
Demographics + attitudinal/behavioral factors
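The matching idea can be illustrated with a toy example. The sketch below is not the firm's actual procedure: it assumes a simple greedy nearest-neighbor match without replacement on a scaled Euclidean distance, and the people, covariates, and scale factors are all invented for illustration.

```python
import math

# Hypothetical scale factors so that no single covariate dominates the distance.
SCALES = {"age": 10.0, "female": 1.0, "ideology": 1.0}


def distance(a, b, scales=SCALES):
    """Scaled Euclidean distance between two people on the matching covariates."""
    return math.sqrt(sum(((a[k] - b[k]) / s) ** 2 for k, s in scales.items()))


def match_panelists(target_sample, panel):
    """For each person drawn from the target population, pick the closest
    still-unused panelist (greedy matching without replacement)."""
    available = list(range(len(panel)))
    matched = []
    for person in target_sample:
        best = min(available, key=lambda i: distance(person, panel[i]))
        matched.append(panel[best])
        available.remove(best)
    return matched


# Invented data: a random draw from the target population...
target_sample = [
    {"age": 25, "female": 1, "ideology": 1},
    {"age": 60, "female": 0, "ideology": 3},
]
# ...and the opt-in panelists available to stand in for them.
panel = [
    {"id": "p1", "age": 62, "female": 0, "ideology": 3},
    {"id": "p2", "age": 30, "female": 1, "ideology": 2},
    {"id": "p3", "age": 24, "female": 1, "ideology": 1},
    {"id": "p4", "age": 45, "female": 0, "ideology": 2},
]

matched_ids = [p["id"] for p in match_panelists(target_sample, panel)]
print(matched_ids)
```

The matched panelists are then interviewed in place of the randomly drawn individuals, which is why the quality of the match depends on which covariates are available for both groups.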
Post-Survey Matched Sample

                  My Survey          CCAP
                  (2007, N=1200)     (2008, 6-wave panel)*
Average Age       40                 44.2
% Female          51                 52
% Non-white       31.5               16.7
% Democrat        38.0               36.1
% Republican      39.8               30.6
% Independent     22.2               33.3
% Liberal         26.4               22.8
% Moderate        36.8               34.5
% Conservative    36.8               35.3

*Notes:
(1) Overweights battleground states 2-fold.
(2) Demo weights based on age, race, gender, educ, marital status, kids,
income, state, metro area, employment, citizenship.
(3) Attitude/Behavior weights based on religion, church attendance,
evangelical status, news interest, PID, ideology.
Comparing Modalities: KN vs. Polimetrix
vs. Natural Survey: Barabas & Jerit (2010)
Polimetrix vs. Pew vs. NES
(Partisan Distribution)
Source: Hill, Lo, Vavreck & Zaller (2007)
Polimetrix vs. Pew vs. NES
(Ideological Distribution)
Source: Hill, Lo, Vavreck & Zaller (2007)
Demographic Comparisons
Source: Hill, Lo, Vavreck & Zaller (2007)
Comparing Modalities: KN vs. Polimetrix
vs. Natural Survey: Barabas & Jerit (2010)
Conclusions:
“The results presented here should be encouraging to anyone
devoted to the scientific study of politics because they suggest
that what occurs in survey experiments resembles what takes
place in the real world.”
“Although there was a discrepancy between the size of survey
treatment effects and the general population in our natural
experiment, we observed correspondence exactly where one
would expect to find it—among those who were most likely to
be exposed to media messages about the two government
announcements.”
SurveyMonkey Audience
New enterprise for SM
still figuring it out!
Recruit participants from survey respondents
Currently U.S. only, but likely to expand
$3 per finished response; $5 for rush project (3 business
days)
Custom create demographic profiles
Gender, age, income, location, education, race, industry of
employment, job function, marital status, employment status,
home ownership, vehicle ownership, smartphone ownership,
exercise habits.
No political attitude selectors yet (though I’ve lobbied them!)
Similar to Polimetrix, except don’t start with random draw from
population
Much less expensive!
AAPOR 2010 Task Force on
Online Panels
Focus on nonprobability samples
Informative, but inconclusive…
Conclusions
If research objective includes accurate estimate of population
values, avoid nonprobability online panels.
Results differ significantly from probability-based methods (like
RDD telephone) on range of behaviors and attitudes, with latter
being more accurate.
Nonprobability online panels sometimes appropriate, when precise
estimates of population values not critical
More research needed on evaluating and testing techniques used
across disciplines to make population inferences from
nonprobability samples.
Ansolabehere & Schaffner (2011)
Comparison of Survey Modalities
3-mode study conducted in 2010
1. Opt-in Internet panel
2. Live telephone interviews (using national
RDD sample of landlines and cell phones)
3. Mail (using national sample of residential
addresses)
Ansolabehere & Schaffner (2011)
Mode       Sample Size   Field Dates        Response Rate   Completion Time
Internet   1000          1/15/10-2/11/10    42.9% (RR1)     8.94 mins
Mail       1207          1/30/10-9/30/10    21.1% (RR3)     11.80 mins*
Phone      907           1/28/10-1/30/10    19.5% (RR3)     14.33 mins

*mail recruits who took survey online
Ansolabehere & Schaffner (2011)
Item                            Response               Internet   Phone   Mail   Validating Source
Home Ownership                  Own                    .613       .632    .632   .669 (CPS)
Mobility                        Moved in past year     .152       .155    .162   .154 (ACS)
                                At address 5+ years    .555       .609    .519   .588 (ACS)
Smoked 100 Cigarettes           Yes                    .504       .471    .497   .430 (NHIS)
Smoke Cigarettes now            Every or some days     .259       .242    .241   .203 (NHIS)
Voted in ‘08 (if registered)    Yes                    .888       .876    .821   .896 (CPS)
Vote Choice in ‘08              Obama                  .482       .454    .553   .529
                                McCain                 .469       .505    .431   .456
Avg Diff.                                              .036       .035    .043
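The Avg Diff. row appears to be the mean absolute difference between each mode's estimate and the validating benchmark across the eight items. The sketch below recomputes it from the table values under that assumption. The dictionary layout and function name are illustrative.

```python
# Each row: (Internet, Phone, Mail, validating benchmark), copied from the table.
ITEMS = {
    "Own home":             (0.613, 0.632, 0.632, 0.669),
    "Moved in past year":   (0.152, 0.155, 0.162, 0.154),
    "At address 5+ years":  (0.555, 0.609, 0.519, 0.588),
    "Smoked 100 cigs":      (0.504, 0.471, 0.497, 0.430),
    "Smoke now":            (0.259, 0.242, 0.241, 0.203),
    "Voted in '08":         (0.888, 0.876, 0.821, 0.896),
    "Obama vote":           (0.482, 0.454, 0.553, 0.529),
    "McCain vote":          (0.469, 0.505, 0.431, 0.456),
}
MODES = ("Internet", "Phone", "Mail")


def mean_abs_diff(items, mode_index):
    """Average absolute gap between one mode's estimates and the benchmarks."""
    gaps = [abs(row[mode_index] - row[3]) for row in items.values()]
    return sum(gaps) / len(gaps)


avg_diff = {mode: round(mean_abs_diff(ITEMS, i), 3) for i, mode in enumerate(MODES)}
for mode, value in avg_diff.items():
    print(f"{mode}: {value:.3f}")
```

Under this reading, the recomputed averages match the reported .036, .035, and .043, so all three modes miss the benchmarks by about four percentage points on average.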
Ansolabehere & Schaffner (2011)
Non-validatable Political Point Estimate Comparison, by Mode
Includes State of Economy,
Approval of/Support for Obama, Congress, R’s Member, Abortion,
Affirmative Action, Gay Marriage, Investing Social Security, Tax over $200k,
Cut Spending, Government
Right and Wrong
Voting Method
Religious and/or Political Contributions
Political Knowledge
News Source
                                         Internet vs.   Phone vs.   Internet vs.
                                         Phone          Mail        Mail
Avg Diff. (All measures)                 .062           .042        .051
Avg Diff. (Attitudinal measures only)    .052           .042        .044
Weighted proportions of respondents in each category, excluding DK.
Ansolabehere & Schaffner (2011)
Small (“negligible”) differences across modes
Except…Internet respondents more politically knowledgeable &
made more political contributions
Mail costs 5 times more than Internet & twice as much as phone
Internet half as costly as phone and faster turnaround
Differences from other studies that found Internet samples less
valid than phone samples attributed to (1) more Internet users than
5+ years ago when prior data samples collected, and (2) advances
in “science of constructing, matching and weighting opt-in Internet
panels”
Conclusion: “...an opt-in Internet survey produced by a respected
firm can produce results that are as accurate as those generated by
a quality telephone poll and that these modes will produce few, if
any, differences in the types of conclusions researchers and
practitioners will draw in the realm of American public opinion.”
Media Access,
Sweden vs. USA
U.S. Telephone Ownership
Swedish Phone Use by Age
Percent Sending and Receiving Different Volumes of Mobile
Phone Calls, by Age (2007 Swedish Survey)
[Chart: percent (0-70) by age group (18-24, 25-39, 40-49, 50-59, 60-65);
series: 1-9/day sent, 1-9/day received, 10+/day sent, 10+/day received]
Source: Axelsson (2010)
Why Persistent Mobile Phone &
Internet Usage Gaps?
Income Inequality: USA vs. Sweden in 2011
[Chart: USA vs. Sweden by income quintile, 1st through 5th]
Effects of Income on Internet
Usage in U.S.
U.S. has more of these folks
What Do Data Tell Us?
Swedes more likely to be online in 2010 (by ~14 percentage
points), and make greater use of Internet
But, similar in fixed broadband and landline usage
More likely to use mobile phones
But similar in volume of mobile calls sent and received
Moral of story? Infrastructure looks, if anything, LESS hospitable
to probability sampling in Sweden than in USA
So, if RDD today works better in Sweden for generating
probability samples, reason seems likely to have more to do
with attitudes toward surveys than infrastructure
Conclusions
Opt-in Internet samples here to stay
Cheaper (by a lot!)
Faster (by a fair amount…)
Primary competitor (RDD phone surveys) increasingly
difficult
14% of adults “unreachable” in Sweden?
Estimated 25% of U.S. households cell only in 2010.
“Unreachable”: ~13% (AAPOR 2010); others say more
Open up new possibilities
E.g., cross-national samples/panels
Most current evidence suggests that with current matching
and weighting techniques, Opt-in Internet samples can be
representative of target populations