Modeling and Simulation of Survey Collection Using Paradata

Presented by: Kristen Couture
Co-authored by: Yves Bélanger and Elisabeth Neusy
Outline
 Motivation for Simulating Survey Collection
 Details of Simulation
 Modeling using Paradata
 Preliminary Results
 Conclusions
 Future Work
Motivation
 Ultimate goal: make CATI (computer-assisted telephone interviewing) survey collection more efficient
 Recent initiatives in the field
• Experimentation with call attempts and calling priorities
• Field experiments take time, offer little control, are costly, and give results that are not always easy to interpret
 Need for a controlled environment, where the impact of each experiment can be tested prior to collection
Microsimulation
 What is microsimulation?
• A modeling technique that operates at the level of individual
units, such as persons, households, vehicles, etc.
 For us: microsimulation = a "virtual collection" system
 Recreates the CATI collection environment with simulation software (SAS Simulation Studio)
 Allows manipulation of parameters in simulated environment
Microsimulation
 What are the elements of our microsimulation?
• Cases
• Queues
• Interviewers
• Rules of the Call Scheduler (flows and priorities)
• Output Call Transaction File
Overview of Microsimulation
[Diagram: Paradata feeds Model Call Outcomes and Model Call Duration, which yield Model Parameters; Model Parameters and Collection Parameters feed the Simulation Model, run in SAS Simulation Studio]
Modeling using Paradata
 Use existing survey data (Blaise Transaction History)
 Call Outcome
• Multinomial logistic regression
 Call Duration
• Create histograms and fit distributions for each of the outcomes
 Output Model parameters
• Estimated parameters from logistic regression model
• Fitted distribution and parameters
• Input into simulation model
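The call-duration step can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the slides do not say which distribution families were fitted, so the lognormal is an assumption, and the function names and the synthetic records are hypothetical stand-ins for a real transaction history.

```python
import math
import random
from collections import defaultdict
from statistics import mean, stdev


def fit_lognormal_by_outcome(calls):
    """Fit a lognormal to call durations, separately for each outcome.

    `calls` is an iterable of (outcome, duration_in_seconds) pairs.
    Returns {outcome: (mu, sigma)}: the mean and sample standard
    deviation of the log-durations for that outcome.
    """
    logs = defaultdict(list)
    for outcome, duration in calls:
        if duration > 0:  # the log is undefined for non-positive durations
            logs[outcome].append(math.log(duration))
    # At least two observations are needed for a standard deviation.
    return {o: (mean(v), stdev(v)) for o, v in logs.items() if len(v) >= 2}


def draw_duration(params, outcome, rng):
    """Draw one simulated call duration for the given outcome."""
    mu, sigma = params[outcome]
    return rng.lognormvariate(mu, sigma)


# Synthetic records standing in for a transaction history (not real paradata).
rng = random.Random(0)
calls = [("Respondent", rng.lognormvariate(6.0, 0.5)) for _ in range(500)]
calls += [("Unresolved", rng.lognormvariate(3.0, 0.3)) for _ in range(500)]

params = fit_lognormal_by_outcome(calls)
simulated = draw_duration(params, "Respondent", rng)
```

The fitted `(mu, sigma)` pairs play the role of the "fitted distribution and parameters" that are passed to the simulation model; in practice the histogram for each outcome would guide the choice of family.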
Modeling using Paradata: Call Outcome
 Multinomial Logistic Regression Model
\[
\log\left(\frac{p_j}{p_{k+1}}\right) = \sum_{i=1}^{n} \beta_{ij} x_i, \quad \text{for } j = 1, \dots, k
\]
Model probability of outcomes (sum of probabilities = 1)
k+1 outcomes
$x_i$ = explanatory variables from paradata
$p_j$ = probability of outcome j
$\beta_{ij}$ = parameters from logistic regression model
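Inverting the log-odds gives the outcome probabilities directly. The shorthand $\eta_j$ for the linear predictor of outcome j is introduced here for convenience and does not appear in the slides; outcome k+1 is the reference category, as in the model above.

```latex
\[
\eta_j = \sum_{i=1}^{n} \beta_{ij} x_i, \qquad
p_j = \frac{e^{\eta_j}}{1 + \sum_{l=1}^{k} e^{\eta_l}} \quad \text{for } j = 1, \dots, k, \qquad
p_{k+1} = \frac{1}{1 + \sum_{l=1}^{k} e^{\eta_l}}
\]
```

By construction the k+1 probabilities sum to 1, which is the constraint noted on the slide.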
Modeling Call Outcome: An Example
 Paradata: Existing RDD survey
 5 outcomes: Unresolved, Out of Scope, Refusal, Other Contact, Respondent
 7 explanatory variables entered into the model
• Time of Call: Afternoon, Evening, Weekend
• Residential Status: Residential phone number
• Call history: Unresolved, Refusal, Contact
 Estimated parameters from model are entered into simulation
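A rough Python sketch of how estimated parameters could be turned into a simulated call outcome is given here. Everything numeric is hypothetical: the coefficients in `BETA` are made up for illustration, the variable names are invented, and treating "Respondent" as the reference category is an assumption, since the slides do not state which outcome serves as the reference.

```python
import math
import random

OUTCOMES = ["Unresolved", "Out of Scope", "Refusal", "Other Contact", "Respondent"]

# Order of the 7 indicator variables (names are hypothetical).
VARIABLES = ["afternoon", "evening", "weekend", "residential",
             "hist_unresolved", "hist_refusal", "hist_contact"]

# HYPOTHETICAL coefficients, one row per non-reference outcome.
# These are not estimates from any survey.
BETA = {
    "Unresolved":    [0.4, -0.3, -0.1, -0.8, 0.9, 0.1, -0.5],
    "Out of Scope":  [0.1, -0.2, 0.0, -1.5, 0.2, 0.0, -0.9],
    "Refusal":       [0.0, 0.2, 0.1, 0.3, -0.2, 1.1, 0.2],
    "Other Contact": [0.2, 0.1, 0.1, 0.4, -0.1, 0.3, 0.6],
}


def outcome_probabilities(x, beta=BETA, reference="Respondent"):
    """Return {outcome: probability} for one call with covariates x."""
    scores = {j: sum(b * xi for b, xi in zip(coefs, x))
              for j, coefs in beta.items()}
    denom = 1.0 + sum(math.exp(s) for s in scores.values())
    probs = {j: math.exp(s) / denom for j, s in scores.items()}
    probs[reference] = 1.0 / denom  # reference category has linear predictor 0
    return probs


def draw_outcome(probs, rng):
    """Sample one outcome according to its modeled probability."""
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


# Evening call to a residential number with a previous contact.
x = [0, 1, 0, 1, 0, 0, 1]
probs = outcome_probabilities(x)
outcome = draw_outcome(probs, random.Random(1))
```

Because the model as written on the slide has no intercept term, a call with all seven indicators at zero gets equal probability for each of the five outcomes in this sketch.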
Modeling Call Outcome: An Example
 Calculate probability of each outcome
[Figure: $p_j$ values computed from Time of Call and Call History]
Preliminary Results: Two examples
 Investigate how collection parameters impact response rates
 Two Examples:
• Example 1: Different distributions of interviewers throughout the day
• Example 2: Different distributions of interviewers throughout the day, combined with different time slices
 Purpose:
• Demonstrate how users can manipulate collection parameters to test specific collection scenarios
• Verify that simulation results reflect actual collection
Example 1
 Change allocation of interviewers throughout the time periods
 3 time periods, each 4 hours in length
 30 interviewers per day for 30 days
 What happens to response rates?
One Possible Scenario
Time period          # of Interviewers
Morning (9h-13h)     4
Afternoon (13h-17h)  4
Evening (17h-21h)    22
Fixed Total          30
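The kind of experiment set up above can be imitated with a toy Python simulation. This is only a sketch under strong assumptions and not the SAS Simulation Studio model: the per-attempt response probabilities, the number of calls per interviewer, and the sample size are all hypothetical, so the rates it produces are not comparable to the presented results.

```python
import random

# HYPOTHETICAL per-attempt response probabilities by time period.
P_RESPONSE = {"Morning": 0.03, "Afternoon": 0.04, "Evening": 0.08}

# HYPOTHETICAL number of call attempts per interviewer per 4-hour period.
CALLS_PER_INTERVIEWER = 10


def simulate_response_rate(allocation, n_cases=5000, n_days=30, seed=0):
    """Simulate collection and return the final response rate.

    `allocation` maps each time period to its number of interviewers.
    """
    rng = random.Random(seed)
    pending = list(range(n_cases))  # cases still without a response
    completed = 0
    for _ in range(n_days):
        for period, n_interviewers in allocation.items():
            # Interviewers cannot make more attempts than there are pending cases.
            attempts = min(n_interviewers * CALLS_PER_INTERVIEWER, len(pending))
            called = rng.sample(pending, attempts)
            responded = {c for c in called if rng.random() < P_RESPONSE[period]}
            completed += len(responded)
            pending = [c for c in pending if c not in responded]
    return completed / n_cases


# The scenario from the slide, and a morning-heavy alternative with the same total.
evening_heavy = {"Morning": 4, "Afternoon": 4, "Evening": 22}
morning_heavy = {"Morning": 22, "Afternoon": 4, "Evening": 4}

rate_evening = simulate_response_rate(evening_heavy)
rate_morning = simulate_response_rate(morning_heavy)
```

Holding the total at 30 interviewers and changing only the split isolates the effect of the allocation, which is the point of the experiment.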
Example 1
[Figure: Impact on Response Rate when Changing Concentration of Interviewers in Evening. x-axis: # of Interviewers in Evening (Total = 30), from 0 to 30; y-axis: Response Rate, from 40% to 60%]
Example 2
 Same setup as Example 1
 Add time slices: control maximum number of attempts made at different time periods throughout the day
 What happens to response rates?
One Possible Scenario
Time period   # of Interviewers   Max # of Attempts
Morning       4                   2
Afternoon     4                   2
Evening       22                  16
Fixed Total   30                  20
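A time-slice rule of the kind used in Example 2 can be sketched as a simple eligibility check. The caps are the ones from the scenario (2, 2 and 16 attempts, 20 in total); reading them as limits per case is an assumption, and the function names are hypothetical rather than part of any call scheduler's API.

```python
from collections import Counter

# Maximum attempts per time period, taken from the Example 2 scenario.
MAX_ATTEMPTS = {"Morning": 2, "Afternoon": 2, "Evening": 16}


def can_attempt(history, period, caps=MAX_ATTEMPTS):
    """Return True if a case may still be called in `period`.

    `history` lists the time periods of the case's previous attempts.
    Assumption: the cap applies per case, not to the survey as a whole.
    """
    return Counter(history)[period] < caps[period]


def eligible_cases(histories, period, caps=MAX_ATTEMPTS):
    """Return the ids of cases that the scheduler may serve in `period`."""
    return [case_id for case_id, history in histories.items()
            if can_attempt(history, period, caps)]


# Hypothetical call histories for three cases.
histories = {
    "case-1": ["Morning", "Morning"],     # morning cap reached
    "case-2": ["Morning", "Evening"],
    "case-3": [],
}
morning_queue = eligible_cases(histories, "Morning")
```

In a simulation loop, a check like this would run before a case is handed to an interviewer, so that attempts are steered toward the periods with remaining capacity.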
Example 2
 Response Rates
Rows: time period with the majority of interviewers. Columns: time period with majority of attempts permitted.
                     Morning/Afternoon   Evening
Morning/Afternoon    47%                 37%
Evening              42%                 52%
Conclusions
 A simple simulation model built from paradata produces results that reflect actual collection
 Different collection parameters can be tested to see their impact on response rates without spending a lot of money or time
 Approach adaptable to all types of CATI surveys
Future Work
 Improve logistic model by adding more parameters
 Add more complicated collection procedures to the model, such as interviewer characteristics
 Simulate collection with multiple surveys at a time to see impact
 Run simulation for a survey to predict outcome and compare with actual results from field
 For more information, please contact:
Kristen Couture
kristen.couture@statcan.gc.ca
Yves Bélanger
yves.belanger@statcan.gc.ca
Elisabeth Neusy
elisabeth.neusy@statcan.gc.ca