Multiple Indicator Cluster Surveys
Data Processing Workshop

Overview of Data Processing System

MICS Data Processing Workshop

Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
• Main characteristics of the MICS CAPI system
• Creating analysis files
• Data archiving

Content of the Presentation
• Overview of the MICS data processing system

MICS Data Processing System: Actors and Roles
– Country data processing manager and country team
  • Customization of data entry programs, data entry, editing, and production of datasets
  • Customization of tabulation syntaxes and tabulation
– Regional Office MICS Coordinator
  • Coordination and supervision, organization of the Data Processing workshop
– Regional Office Data Processing Consultant
  • Technical support, review of customized programs, and close work with country teams
– HQ Data Processing unit
  • Development of standard programs and templates, and coordination of Data Processing workshops

MICS Data Processing System
• The data processing system can be divided into the following phases:
  – Customization of the MICS data entry/collection program and tabulation syntaxes
  – Establishing the data entry system locally
  – Primary data processing (data entry/data collection)
  – Secondary data processing (creating analysis files)
  – Tabulation
  – Archiving

Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires

MICS Data Processing System
• Designed to deliver the first results of a survey within several weeks after the end of fieldwork
• Such rapid turnaround is possible when completed questionnaires are entered while survey fieldwork is still under way
• Data for each cluster are stored in a separate data file and processed as soon as all the questionnaires from the cluster are returned from the field
• This approach breaks data processing down into discrete segments and allows it to progress while fieldwork is ongoing
• By the time the last questionnaires are finished and returned to headquarters, most of the data have already been processed

Primary Data Processing Flow
1. Main Data Entry
2. Structure Check
3. Verification Data Entry
4. Difference Listing
5. Backup Raw Data
6. Secondary Editing
7. Backup Final Data

Primary Data Processing
• Main data entry
  – First time data are entered
• Structure check
  – Checks structure of data files
• Verification data entry
  – Second time data are entered
• Difference listing
  – Two data files are compared; differences are resolved

Primary Data Processing
• Raw data backup
  – Verified data are backed up to a directory containing raw data files
• Secondary editing
  – Complex inconsistencies are investigated and resolved where possible
• Final data backup
  – Edited data are backed up to a directory containing edited data files

Data Processing Personnel
• Questionnaire administrators
• Data entry operators
• Secondary editors
• Data processing supervisor

Questionnaire Administrators
• Receive clusters from the field
• Check that all questionnaires are present
• Check that questionnaires are ready to enter
• Check that HH and individual questionnaires are in the proper order, and rearrange them if not
• Keep track of the location of all clusters
• Should follow interviewer training

Data Entry Operators
• Enter main data
• Enter verification data
• Resolve differences between files
• Must follow interviewer training
• Must be familiar with the questionnaires

Secondary Editors
• Investigate complex inconsistencies
• Tell supervisor if and how to resolve inconsistencies
• Review editing guidelines
• Must be present during interviewer training
• Need excellent understanding of the questionnaire and the goals of the survey

Data Processing Supervisor
• Resolves data entry problems
• Maintains programs
• Oversees the entire data processing system
• Must be present during interviewer training
• Must have excellent grasp of the questionnaire
• Must have programming skills in SPSS and CSPro

Questionnaire Administrator Training
• Review the list of checks in the data processing chapter
• Give the questionnaire administrator (QA) several clusters and check the work
• Establish questionnaire storage procedures

Data Entry Training
• Begin when you have one cluster for each data entry operator
• Allows you to
  – Train data entry operators
  – Debug programs
• Practice verification at the same time
• When you have finished
  – Fix entry programs
  – Delete data files

Secondary Editor Training
• Wait until you have 3-4 clusters double-entered
• Give secondary editors
  – Copy of editing guidelines
  – A cluster’s error listing
  – The cluster’s questionnaires
• Review work with the secondary editor
• Try to schedule a day when data entry operators aren’t working

Data Processing Equipment
• Data entry machines
  – Windows XP, 2000, Vista, 7 or 8
• Supervisor’s machine
  – Windows XP, 2000, Vista, 7 or 8
  – Available disk drive space for all software and MICS data files
• Uninterruptible power supplies (UPS)
• Network
• Surge protectors

Data Processing Equipment
• A printer
• Paper
• Toner cartridges/printer ribbons
• Flash drives (if network is not established)
• Green pens

Data Processing Rooms
• Data Entry
  – Desk space for the monitor, keyboard, and stacks of keyed and unkeyed questionnaires
  – Area to store “in-progress” clusters
• Editing
  – Quiet space for editors to work
• Questionnaires
  – Must contain means (e.g., shelving) to organize questionnaires by cluster

Server and Keyer’s Directory Structure
MICS5
  CSPRO
    DATA     contains data from main entry
    ENTRY    contains data entry programs
    VERI     contains data from verification entry

Supervisor’s Directory Structure
MICS5
  CSPRO
    BACKUP   backup of verified data
    DICTS    dictionaries used in all programs (not just entry)
    ENTRY    contains data entry programs
    EXPORT   programs to transfer data from CSPro to SPSS
    FINAL    a copy of all edited data
    GPS      GPS data entry program (if applicable)
    RAW      contains all unedited data (just after differences resolved)
    SUPER    all programs not related to the data entry phase
  SPSS       contains all SPSS programs used for export and tabulation
  WEIGHTS    contains weights spreadsheet and data files

Data Entry
1. Cluster reaches the office.
2. Cluster acceptance is done. All the questionnaires are counted and recorded on the cluster tracking form (CTF) according to their type and result code.
3. Cluster is sent to the Supervisor. The Supervisor records the information from the CTF in the system. The cluster is now ready for main data entry. The Supervisor gives it to the first keyer and records the keyer’s information in the system and on the CTF.
4. MAIN DATA ENTRY: The first keyer enters all the questionnaires of the cluster.
5. The Supervisor runs the structure check. If all questionnaires have been entered, the Supervisor gives the cluster to the second keyer for verification and records this information in the system and on the CTF.
6. VERIFICATION: The second keyer enters all the questionnaires of the cluster.
7. The Supervisor verifies the main and verification data using the supervisor menu. If there are typing errors, the keyers correct them at their computers. If there is no problem, the raw data are backed up. The system automatically gives the date, which should be written on the CTF.
8. After the raw data are backed up, secondary editing is done from the supervisor menu.
   – NO ERROR: “Back up the final data”.
   – ERRORS: The secondary editor follows the “Editing Manual” and makes the necessary changes to the data with the “Modify the data” option in the Supervisor menu.
9. The Supervisor follows the same procedures for every cluster.
10. When all procedures are completed for all clusters: export the data to SPSS.
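
The verification part of the flow above (two keyers enter the same cluster independently, then the two files are compared and differences resolved before the raw data backup) can be illustrated with a minimal sketch. This is not the CSPro comparison used in the MICS system. It is a plain Python illustration under stated assumptions: a cluster file is modelled as a dictionary of case IDs, each questionnaire as a dictionary of field values, and the names `difference_listing`, `ready_for_raw_backup`, the case IDs, and the fields `age` and `sex` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed shapes for this sketch only: a questionnaire is {field: value},
# a cluster file is {case ID: questionnaire}.
Record = Dict[str, str]
ClusterFile = Dict[str, Record]
Difference = Tuple[str, str, str, str]  # (case ID, field, main value, verification value)


def difference_listing(main: ClusterFile, veri: ClusterFile) -> List[Difference]:
    """List every mismatch between main entry and verification entry."""
    differences: List[Difference] = []
    for case_id in sorted(set(main) | set(veri)):
        main_rec = main.get(case_id)
        veri_rec = veri.get(case_id)
        if main_rec is None or veri_rec is None:
            # A questionnaire keyed in only one of the two passes is a difference.
            differences.append((
                case_id,
                "<whole questionnaire>",
                "missing" if main_rec is None else "entered",
                "missing" if veri_rec is None else "entered",
            ))
            continue
        for field in sorted(set(main_rec) | set(veri_rec)):
            main_value = main_rec.get(field, "")
            veri_value = veri_rec.get(field, "")
            if main_value != veri_value:
                differences.append((case_id, field, main_value, veri_value))
    return differences


def ready_for_raw_backup(main: ClusterFile, veri: ClusterFile) -> bool:
    """The raw data are backed up only once no differences remain."""
    return not difference_listing(main, veri)


# Hypothetical cluster with one keying error in the verification pass.
main_entry = {"001-01": {"age": "34", "sex": "2"}}
verification_entry = {"001-01": {"age": "43", "sex": "2"}}

for case_id, field, main_value, veri_value in difference_listing(main_entry, verification_entry):
    print(f"{case_id} {field}: main={main_value!r} verification={veri_value!r}")
```

In this sketch the keyers would check each listed difference against the paper questionnaire and correct whichever file is wrong. Once `ready_for_raw_backup` returns `True`, the cluster can move on to the raw data backup and secondary editing.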
Network Structure

Network Computer:
\\MICS5\CSPRO\DATA
             \VERI
             \ENTRY\entry.enc
                   \entry_menu.enc

Keyer Computer:
\\MICS5\CSPRO\DATA
             \VERI
             \ENTRY\entry.enc
                   \entry_menu.enc
                   \entry_menu.pff

Supervisor Computer:
\\MICS5\SPSS
\\MICS5\CSPRO\BACKUP
             \DICTS
             \ENTRY
             \EXPORT
             \FINAL
             \GPS
             \RAW
             \SUPER

Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
• Main characteristics of the MICS CAPI system

Mobile data collection
• In recent years we have seen the development of many innovative data collection tools using handheld PocketPC personal digital assistants (PDAs), smartphones or tablet personal computers.

Tablet Personal Computer
• Tablet-sized computer with the key features of a full-size personal computer
• Various operating systems
• Computer vs. cell phone capability

Personal Digital Assistants
• Handheld computers
• Various operating systems
  – Windows Mobile
  – Palm
  – Others (e.g. iPhone, Nokia)
• Cell phone capability
  – Plus: good communications
  – Minus: security

Tablet vs PDA
• Choosing a tablet over a PDA
  – Tablets have much larger screens, better resolution, and more space for on-screen typing, making things easier to see and more useful for data entry
  – Tablets and PDAs have similar battery life, though if PDAs are used for making and receiving calls the battery will need more frequent charging
  – Better security

Hardware requirements for CSPro
Tablets:
• Required configuration: full Microsoft Windows 7 or 8 (NO Windows RT tablets)
PDAs:
• Required configuration: Windows Mobile versions 5 and 6 (note that Unicode is not supported)

MICS CAPI System
• All applications to collect and administer data are written in CSPro
• Three systems:
  – Interviewers: data collection (Tablet/PDA)
  – Supervisors: data monitoring and control (Tablet/PDA)
  – Central Office: centralizing data and monitoring fieldwork (PC)

[Diagram: Interviewer’s System, Supervisor’s System and Central Office System; labels include Int 2, Int 5, Repair Utility, Update Utility, Screen, Supervisor Reports and System Updates]

Training
• 2 weeks of training (paper questionnaires), including a 3-day field exercise, + 1 week of extra training (PDAs/tablets) + 4-5 days of extra field exercise
• High-level technical support

Directories on Interviewer's Tablet
\My Documents\MICS5\
• Data      Data from the questionnaires as collected during the interview
• Dicts     Dictionaries describing the data structures
• Entry     Questionnaire and Interviewer’s menu programs
• Ref       Reference data files used by the programs
• Utility   Utility programs used for Bluetooth transfer, compressing files, upgrading programs, etc.
• Work      Working files used by the menu and programs
\SD Card\
• Backup    Backup copy of the data files from Data, created after the interviews

Directories on Supervisor’s Tablet
\My Documents\MICS5\
• Receive   Directory for data received from the interviewers
• Temp      Temporary directory for working files transferred from interviewers, prior to transferring to the Receive directory
• Work      Working data files
• Reports   Report files created by supervisor applications

Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
• Main characteristics of the MICS CAPI system
• Creating analysis files

Secondary Data Processing
• Exporting data from CSPro
  – Create SPSS data file and syntax file from CSPro data file and dictionary
• Importing data into SPSS
  – Executing syntax file created by CSPro
• Recoding variables
  – Creating new variables and recoding old variables

Secondary Data Processing
• Adding sample weights
  – Sample weights are added from the weights spreadsheet
• Adding wealth index
  – Wealth index is calculated, then added to files
• Adding GPS data
  – Geographic location data are added to files
• Tabulation
  – Tables are generated from the analysis files

Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
• Main characteristics of the MICS CAPI system
• Creating analysis files
• Data archiving

Data archiving: rationale
• Collecting data is expensive. Data should be used beyond producing the basic report.
• Survey microdata are valuable resources for government departments and academic researchers.
• Survey data constitute valuable and irreplaceable assets which should be managed in a way that encourages their widest possible use and re-use.
• At the same time, data collectors’ main focus should be protecting respondents while making microdata accessible.