Development of Electronic Data Reporting (EDR) in Statistics Finland
Rami Peltola, 13.9.2005

Design goals
  - Framework for similar systems
  - Multi-language support
  - User authentication / authorisation
  - Data security (collection and production databases)
  - Simple method for transferring data between collection and production databases
  - "Mass emailer" for all kinds of collection systems

Three generations
  - Building cost index, 2001
      - sub-annual survey
      - built using Microsoft Windows DNA (Distributed interNet Applications Architecture)
  - 6 EDR solutions, 2002-2004
      - both sub-annual and annual surveys
      - VB.NET
  - 4 EDR solutions, 2005
      - sub-annual surveys
      - XCola

Data provider relations and response rates: Business surveys

Background
  - Traditionally high response rates in annual and sub-annual business surveys
      - in some cases over 99%
      - persistent staff
  - Good relations with data providers
      - experienced staff
      - many continuous personal contacts
      - usually long-lasting relationships
      - high level of trust

Infrastructure and atmosphere for EDR in Finland
  - High level of Internet use
      - almost every enterprise has access to the Internet
      - business surveys are typically directed at the largest enterprises
  - Positive atmosphere for using the Internet for transactions with the government
      - respondents are even enthusiastic about using the Internet
      - "It's fun to fill in web forms instead of paper ones!"

Key issues when implementing an EDR solution
  - Simplifying the data collection process
  - Cost-efficiency
      - Reducing the need for human resources
      - Reducing other data collection costs
  - Improving the quality of collected data
      - Accuracy
      - Decreasing non-response
  - Timeliness
      - Speeding up the data accumulation
  - Reducing response burden
  - Enabling direct individual feedback for respondents
  - Enabling browsing of previously submitted data
  - Assuring a high level of data security

Reducing response burden
  - Using e-mail for announcing the survey and sending the reminders
      - link to the EDR solution
  - Questionnaire is "always" available
  - Good questionnaire design
      - not just a copy of the paper form
  - Helpful validity checks
      - reduce the need for additional inquiries
  - Contextual on-line help
  - Option to fill in the questionnaire in separate sessions
  - Support for several languages

Enabling direct individual feedback for respondents (1)
  - Motivates the respondent to use the EDR solution instead of a paper form
  - For example, the enterprise's own data compared with the data of its own industry
  - Respondents have found the feedback useful
      - it has even led to inquiries for more information

Enabling direct individual feedback for respondents (2)
  - Monthly sale inquiry: [figure not recoverable]

Enabling browsing of previously submitted data
  - Respondents have access to all the data they have previously submitted
      - simple short-term (sub-annual) surveys
  - Pre-filling the form with data from previous surveys
      - useful especially in annual surveys
      - helps the respondent to remember how the figures were compiled last year

Assuring a high level of data security
  - Data security audit by an outside consultant
  - All traffic on the Internet is SSL-encrypted
  - An authentication / authorisation process is always required
  - New user IDs and passwords every year
  - User IDs and passwords are initially sent in a letter
      - only one of them can be sent by e-mail
      - the other one must always be sent in a letter or given over the telephone
  - Only a limited number of our staff have access to user IDs and passwords (usually two persons per survey)

Decreasing non-response and speeding up the accumulation of the data
  - Response rates have remained at a high level
  - The average response time of monthly surveys has been reduced, in the best case by 7-8 days or 30%
  - The number of reminders sent has decreased substantially, in the best case by 50%
  - The share of respondents using the EDR solution has in most cases reached a high level
      - sub-annual surveys: > 60% (in the best case 85%)
      - annual surveys: ~ 30% (in the best case 50%)

An example: Sale inquiry. Change in response media 10/2001 - 10/2004
  [Chart: response media (Fax, Mail, Internet) for 10/2001, 10/2002, 10/2003 and 10/2004]

An example: Sale inquiry. EDR users of all respondents
  - after 1st month: 48%
  - after 2nd month: 59%
  - after 3rd month: 61%
  - after 4th month: 70%
  - today: > 80%

An example: Sale inquiry. Reminders sent
  - before EDR: ~ 1000
  - after 1st month: ~ 800
  - after 2nd month: ~ 700
  - after 3rd month: ~ 600
  - since 4th month: ~ 500

An example: Sale inquiry. Responses per day (until t+45 days)
  [Four charts of responses per day, one per survey round; the annotated accumulation figures are listed below]
  - 10/2001: accumulation 31% at the due date, 76% at the reminders
  - 10/2002: accumulation 49% at the due date, 81% at the reminders
  - 10/2003: accumulation 68% at the due date, 78% at the reminders
  - 10/2004: accumulation 76% at the due date, 86% at the reminders

An example: Sale inquiry. Accumulation of data 10/2001 - 10/2004
  [Chart: accumulation of data for 10/2001, 10/2002, 10/2003 and 10/2004]

Costs and benefits of EDR solutions: Business surveys

Costs of EDR
  - The costs of developing and running web-based applications have dropped by 60-70% during the last three years
  - Average investment cost per new EDR solution (today)
      - An outside service provider: EUR 5000
      - In-house solution (XCola): 150 hours of work
  - Maintenance costs of an EDR solution per year (today)
      - An outside service provider: EUR 1000
      - In-house solution (XCola): 50 hours of work

Development costs: In-house EDR solutions
  - The in-house solutions are already in the third generation phase
  - During the first and second phases the total resource input was about 2.5 person-years
      - more or less "learning by doing"
      - includes the development of a secure communication environment
      - includes the implementation of 7 solutions
  - XCola (third generation phase) development took about 1 person-year
      - includes the implementation of 4 solutions

Benefits of EDR to Statistics Finland (1)
  - Four second generation solutions have been in production for more than a year
      - 3300 respondents per month and 800 per quarter
  - The average share of work saved in the data collection phase is over 40% (2 person-years)
  - The amount of ground mail has been reduced by 64 000, or 65% (0.5 person-years)
  - The average response time has been reduced, in the best case by 7-8 days or 30%
  - The number of reminders sent has halved
  - The investment has paid off in about a year

Benefits of EDR to Statistics Finland (2)
  - The data received are of better quality
      - both annual and sub-annual surveys
      - a common estimate is "25% less errors"
      - a comprehensive study has not been made
  - As manual handling diminishes, it can be replaced by more rewarding tasks

Benefits of EDR to respondents
  - The questionnaire can be completed more rapidly
  - Pre-filling helps respondents to remember how they have answered previously
  - Validity checks prevent the sending of erroneous data
      - no additional inquiries
  - The same piece of information needs to be entered only once
  - Respondents like to use the Internet
  - Perceived response burden has gone down

Effects of EDR on the data collection process
  Each step of the paper-based process is paired with its EDR counterpart:
  - Printing the questionnaires -> Transferring data to collection database
  - Mailing -> E-mail informing (mass emailer)
  - Receiving the questionnaires (mail, fax, e-mail) -> (Electronic data supply)
  - Validating and entering the data -> On-line validations + mass validation
  - Printing and mailing the reminders -> E-mail reminder (mass emailer)
  - Phone inquiry -> Phone inquiry
  - Non-individual delayed feedback -> Individual direct feedback
  - Limited access to previous own data -> Previous own data available
  - Manual exclusive treatment -> Electronic mass treatment

Data transfers
  - Data transfers between collection and production databases are handled with an external application
  - Data from the collection database is first transferred to temporary tables in the production database and then synchronized with the actual tables
  - The solution is quite customizable
      - Easy to customize for new collection systems
      - Easy to add new databases

Mass emailer
  - An external application for sending e-mails to the respondents
  - Modular approach
      - New systems can be added using textual configuration files
      - Reply requests (lists of e-mail addresses) can be added by writing SQL statements in the configuration files
  - Supports attachments
  - Replaces traditional letters

An example: Sale inquiry. Background
  - Monthly inquiry using paper forms (~2080 enterprises)
  - Data collection process (7-8 persons: ~2.0 working years)
      - Printing and mailing the questionnaires
      - Receiving the questionnaires (mail, fax, e-mail)
      - Validating and entering the data
      - Printing and mailing the reminders
      - Phone inquiry
  - Quarterly non-individual feedback for respondents
  - Previously submitted data for the last 2 months preprinted on the questionnaires

An example: Sale inquiry. Purpose
  - To describe the economic situation in different industries
      - Trade: 30 branches
      - Services: 23 branches
      - Construction: 3 branches
      - Manufacturing: 15 branches
  - To respond to EU legislation
      - The Regulation of the EU concerning short-term statistics (1165/98)

An example: Sale inquiry. Source data
  - Tax Administration's register
      - Value added tax data
      - 2-month delay in turnover data
      - The data accumulates for 6 months
  - Direct data collection (sale inquiry, ~2080 companies)
      - The largest companies from each industry
      - Trade: ~660 companies
      - Services: ~600 companies
      - Construction: ~200 companies
      - Manufacturing: ~620 companies
  - Business Register
      - Basic information about companies: branch of industry, location, contact information etc.

An example: Sale inquiry. End products
  - Monthly indicators (newsletters and Internet releases)
      - Trade: 30 (European sample), 45 (preliminary) and 75 days delay
      - Services: 45 (European sample) and 75 days delay
      - Construction: 75 days delay
      - Manufacturing: 75 days delay
  - Time series
      - Start mostly from 1995 (trade from 1985)

Data collection process in sale inquiry before EDR
  Collecting the sales data concerning March; manual data entering
  [Timeline chart covering 1 April to 4 May, with the following labels]
  - Printing and mailing the questionnaires
  - Due date
  - Printing and mailing the reminders
  - Phone inquiry for non-respondents

Data collection process in sale inquiry today
  Collecting the sales data concerning March; electronic data transfers
  [Timeline chart covering 1 to 30 April, with the following labels]
  - Transferring data to collection database
  - Due date
  - E-mail reminders
  - E-mail informing
  - 1st phone inquiry for non-respondents on European sample
  - 2nd phone inquiry for non-respondents on European sample
  - Phone inquiry for the remaining non-respondents

EDR solution / Sale inquiry
  [Figure not recoverable]

Change in working hours used in data collection and validation (hours per year)
  [Chart: hours per year for 2001, 2002, 2003 and 2004]

Hours used in development and maintenance of EDR solution (sale inquiry)
  [Chart: hours for 2002, 2003 and 2004; includes hours used in development of infrastructure]

Experiences (1)
  - Feedback from respondents has been very positive
      - Response burden has been reduced remarkably
  - Enthusiasm of persons involved in data collection
      - Manual data treatment has been reduced (by at least 50%)
  - Quality of data has improved
      - On-line validations, additional information if data is not comparable etc.

Experiences (2)
  - Number of enquiries made by respondents concerning the EDR solution
      - First two months: ~100 / month (mainly questions concerning base settings)
      - Since third month: ~30 / month (mainly forgotten passwords)

Effects of EDR on data quality (ESS quality dimensions)
  - Accuracy
      - Automatic validations
  - Timeliness
      - Mail -> e-mail and web
      - Manual data entering -> electronic data transfers
  - Comparability
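
Supplementary sketch: staged data transfer

The "Data transfers" slide describes a two-step move: collected data is first copied to temporary tables in the production database and only then synchronized with the actual tables. The sketch below illustrates that pattern under loud assumptions. The slides do not name the database product, schema or tooling, so SQLite (via Python's standard sqlite3 module) stands in for both databases, and the table name responses, the staging table staging_responses and the columns respondent_id, period and turnover are all hypothetical. It is an illustration of the idea, not the external application actually used at Statistics Finland.

```python
import sqlite3


def transfer(collection: sqlite3.Connection, production: sqlite3.Connection) -> int:
    """Copy collected rows to production through a temporary staging table.

    Table and column names are hypothetical; they are not taken from the slides.
    Returns the number of rows read from the collection database.
    """
    rows = collection.execute(
        "SELECT respondent_id, period, turnover FROM responses"
    ).fetchall()

    # The staging table is temporary and lives in the production database.
    production.execute(
        "CREATE TEMP TABLE IF NOT EXISTS staging_responses ("
        "respondent_id INTEGER, period TEXT, turnover REAL)"
    )

    with production:  # commit on success, roll back if any statement fails
        # Step 1: load the collected rows into the temporary table.
        production.execute("DELETE FROM staging_responses")
        production.executemany(
            "INSERT INTO staging_responses VALUES (?, ?, ?)", rows
        )
        # Step 2: synchronize the actual table with the staged rows.
        # Relies on the primary key (respondent_id, period) of the target:
        # existing rows are replaced, new rows are added.
        production.execute(
            "INSERT OR REPLACE INTO responses (respondent_id, period, turnover) "
            "SELECT respondent_id, period, turnover FROM staging_responses"
        )
    return len(rows)


def demo() -> list:
    """Run the transfer on two in-memory databases and return production rows."""
    collection = sqlite3.connect(":memory:")
    production = sqlite3.connect(":memory:")

    collection.execute(
        "CREATE TABLE responses (respondent_id INTEGER, period TEXT, turnover REAL)"
    )
    collection.executemany(
        "INSERT INTO responses VALUES (?, ?, ?)",
        [(1, "2005-03", 120.0), (2, "2005-03", 80.5)],
    )

    production.execute(
        "CREATE TABLE responses (respondent_id INTEGER, period TEXT, turnover REAL, "
        "PRIMARY KEY (respondent_id, period))"
    )
    # An earlier figure for respondent 1 that the new submission should overwrite.
    production.execute("INSERT INTO responses VALUES (1, '2005-03', 100.0)")
    production.commit()

    transfer(collection, production)
    return production.execute(
        "SELECT respondent_id, period, turnover FROM responses ORDER BY respondent_id"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

Staging first keeps the actual production tables untouched until the whole batch has been loaded, and wrapping both steps in one transaction means a failed transfer leaves production as it was. Running the transfer twice gives the same result, which suits a job that may be repeated as responses accumulate.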