Modeling and Simulation of Survey Collection Using Paradata

Presented by: Kristen Couture
Co-authored by: Yves Bélanger, Elisabeth Neusy

Outline
Motivation for Simulating Survey Collection
Details of Simulation
Modeling using Paradata
Preliminary Results
Conclusions
Future Work

Motivation
Ultimate goal: make CATI survey collection more efficient
Recent initiatives in the field
• Experimentation with call attempts and calling priorities
• Takes time, lack of control, costly, results not always easy to interpret
Need for a controlled environment, where the impact of each experiment can be tested prior to collection

Microsimulation
What is microsimulation?
• A modeling technique that operates at the level of individual units, such as persons, households, vehicles, etc.
For us: microsimulation = a "virtual collection" system
Recreates the CATI collection environment with simulation software (SAS Simulation Studio)
Allows manipulation of parameters in the simulated environment

Microsimulation
What are the elements of our microsimulation?
• Cases
• Queues
• Interviewers
• Rules of the Call Scheduler (flows and priorities)
• Output Call Transaction File

Overview of Microsimulation
[Diagram with boxes labeled: SAS Simulation Studio, Paradata, Model Call Outcomes, Model Call Duration, Model Parameters, Simulation Model, Collection Parameters]

Modeling using Paradata
Use existing survey data (Blaise Transaction History)
Call Outcome
• Multinomial logistic regression
Call Duration
• Create histograms and fit distributions for each of the outcomes
Output Model parameters
• Estimated parameters from logistic regression model
• Fitted distribution and parameters
• Input into simulation model

Modeling using Paradata: Call Outcome
Multinomial Logistic Regression Model

$$\log\left(\frac{p_j}{p_{k+1}}\right) = \sum_{i=1}^{n} \beta_{ij} x_i, \quad \text{for } j = 1, \ldots, k$$

Model probability of outcomes (sum of probabilities = 1)
$k+1$ outcomes
$x_i$ = explanatory variables from paradata
$p_j$ = probability of outcome $j$
$\beta_{ij}$ = parameters from logistic regression model

Modeling Call Outcome: An Example
Paradata: Existing RDD survey
5 outcomes: Unresolved, Out of Scope, Refusal, Other Contact, Respondent
7 explanatory variables entered into the model
• Time of Call: Afternoon, Evening, Weekend
• Residential Status: Residential phone number
• Call history: Unresolved, Refusal, Contact
Estimated parameters from model are entered into simulation

Modeling Call Outcome: An Example
Calculate probability of each outcome
[Figure labels: Time of Call, Call History, $p_j$ values]

Microsimulation
[Diagram with boxes labeled: Paradata, Model Call Outcomes, Model Call Duration, Model Parameters, Simulation Model, Collection Parameters]

Preliminary Results: Two examples
Investigate how collection parameters impact response rates
Two Examples:
• Example 1: Different distributions of interviewers throughout the day
• Example 2: Different distributions of interviewers throughout the day combined with different time slices
Purpose:
• Demonstrate how users can manipulate collection parameters to test specific collection scenarios
• Verify that simulation results reflect collection
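
As an illustration of the call outcome step, the multinomial logistic regression model can be inverted to give one probability per outcome, from which the simulation can draw a call result. The following is a minimal Python sketch and not the SAS Simulation Studio implementation used in the presentation. The function names, the choice of "Respondent" as the reference outcome, the ordering of the 7 explanatory variables and every coefficient value are assumptions made up for illustration only.

```python
import math
import random

# The five outcomes from the RDD example. Treating the last one as the
# reference outcome (k+1) is an assumption, not stated in the slides.
OUTCOMES = ["Unresolved", "Out of Scope", "Refusal", "Other Contact", "Respondent"]


def outcome_probabilities(x, betas):
    """Return the k+1 outcome probabilities, reference outcome last.

    x     : list of n explanatory values for one call attempt
    betas : k lists of n coefficients, one list per non-reference outcome
    Uses log(p_j / p_{k+1}) = sum_i beta_ij * x_i for j = 1..k.
    """
    scores = [sum(b * xi for b, xi in zip(beta_j, x)) for beta_j in betas]
    denom = 1.0 + sum(math.exp(s) for s in scores)
    return [math.exp(s) / denom for s in scores] + [1.0 / denom]


def draw_outcome(probs, rng):
    """Draw one simulated call outcome according to the probabilities."""
    return rng.choices(OUTCOMES, weights=probs, k=1)[0]


# HYPOTHETICAL coefficients (4 non-reference outcomes x 7 variables).
# Assumed variable order: afternoon, evening, weekend, residential,
# previous unresolved, previous refusal, previous contact.
BETAS = [
    [0.4, 0.1, 0.2, -0.5, 0.8, 0.1, -0.3],
    [0.1, 0.0, 0.1, -2.0, 0.2, 0.0, -0.5],
    [0.0, -0.2, -0.1, 0.3, 0.1, 1.2, 0.2],
    [0.2, 0.1, 0.0, 0.4, 0.0, 0.3, 0.6],
]

# One evening call to a residential number whose previous call was unresolved.
x = [0, 1, 0, 1, 1, 0, 0]
probs = outcome_probabilities(x, BETAS)

rng = random.Random(2024)  # fixed seed so a run can be repeated
print(dict(zip(OUTCOMES, (round(p, 3) for p in probs))))
print(draw_outcome(probs, rng))
```

In a full simulation this draw would be repeated for every call attempt, with the call history variables updated from the previous simulated outcomes.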
Example 1
Change allocation of interviewers throughout the time periods
3 time periods, each 4 hours in length
30 interviewers per day for 30 days
What happens to response rates?

One Possible Scenario
Time period            # of Interviewers
Morning (9h-13h)       4
Afternoon (13h-17h)    4
Evening (17h-21h)      22
Fixed Total            30

Example 1
[Figure: Impact on Response Rate when Changing Concentration of Interviewers in Evening. Y-axis: Response Rate (40% to 60%). X-axis: # of Interviewers in Evening, 0 to 30 (Total = 30)]

Example 2
Same setup as Example 1
Add time slices: control maximum number of attempts made at different time periods throughout the day
What happens to response rates?

One Possible Scenario
Time period    # of Interviewers    Max # of Attempts
Morning        4                    2
Afternoon      4                    2
Evening        22                   16
Fixed Total    30                   20

Example 2
Response Rates
Response rate by time period with the majority of interviewers and time period with the majority of attempts permitted

                     Morning/Afternoon    Evening
Morning/Afternoon    47%                  37%
Evening              42%                  52%

Conclusions
Created a simple simulation model using paradata that produces results that reflect collection
Able to test different collection parameters to see impact on response rates without spending a lot of money or time
Approach adaptable to all types of CATI surveys

Future Work
Improve logistic model by adding more parameters
Add more complicated collection procedures to the model, such as interviewer characteristics
Simulate collection with multiple surveys at a time to see impact
Run simulation for a survey to predict outcome and compare with actual results from field

For more information, please contact:
Kristen Couture: kristen.couture@statcan.gc.ca
Yves Bélanger: yves.belanger@statcan.gc.ca
Elisabeth Neusy: elisabeth.neusy@statcan.gc.ca