The Experiences of Web Based Data Collection from Enterprises in Finland

August 9th 2006, JSM Seattle USA
Introduction - Strategies And Methods
Statistics Finland’s Strategy for EDR (electronic data reporting)
- To offer an electronic option in all data collections by 2007 (excluding person statistics)
- It’s the respondent’s choice whether to use it or not
Data Collection Methods
- About 97% of data are derived from administrative registers
- About 3% come from direct data collection (paper forms, machine-readable data / primary EDI, EDR, interviews by CATI/CAPI systems, mainly Blaise)
Business Data Collections
- About 50 surveys (excluding collections with fewer than 30 respondents)
- 45 Web (Internet form) collections in use
Rami Peltola
August 9th 2006
Background - Data Collection And Infrastructure

Traditionally high response rates (in both annual and sub-annual business surveys)
- Up to over 99%; persistent staff
Good relations with data providers
- Experienced staff, continuous personal contacts, high level of trust
High level of Internet use
- Almost every enterprise has access to the Internet (10+ employees: 98%; 100+ employees: 100%)
- Business surveys are directed to the largest enterprises
Positive atmosphere for using the Internet with the government
- Respondents are even enthusiastic about using the Internet
- “It’s fun to fill in web forms instead of paper ones!”
Background - Three Generations of In-house EDR Solutions
1st generation: Building cost index, 2001
- Built using Microsoft Windows DNA (Distributed interNet Applications Architecture)
2nd generation: 7 EDR solutions, 2002-2005
- VB.NET
3rd generation: 23 EDR solutions, 2005-2006
- XCola
11 EDR solutions made by an outside service provider, 1997-2006
Pilot in integrated data collection (tourism statistics)
Technical information - XCola in a Nutshell
A generic application for Web surveys
- Processes the XML questionnaires and transforms them into Web applications
- Supports client-side and server-side validations
- Executed on the server side; does not require any installation on the respondent side
- Works on every modern browser
- New questionnaires are easy to implement, in just hours


Main developer: Mr. Toni Räikkönen, toni.raikkonen@stat.fi
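
The idea of turning an XML questionnaire into a Web form with matching client-side and server-side checks can be sketched as follows. This is only an illustration: the XML element and attribute names, and the functions `parse_questions`, `render_form` and `validate`, are hypothetical and are not XCola's actual format or API.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical questionnaire format; XCola's real XML schema is not shown in the slides.
QUESTIONNAIRE_XML = """
<questionnaire id="sale-inquiry" title="Sale Inquiry">
  <question name="turnover" label="Turnover (EUR)" type="number" required="true" min="0"/>
  <question name="contact" label="Contact person" type="text" required="false"/>
</questionnaire>
"""


def parse_questions(xml_text):
    """Read the title and the question definitions from the XML questionnaire."""
    root = ET.fromstring(xml_text)
    return root.get("title"), [dict(q.attrib) for q in root.findall("question")]


def render_form(xml_text):
    """Transform the questionnaire into an HTML form.

    Client-side validation is delegated to standard HTML attributes
    (required, min), so nothing is installed on the respondent side.
    """
    title, questions = parse_questions(xml_text)
    rows = []
    for q in questions:
        attrs = [f'name="{html.escape(q["name"])}"', f'type="{html.escape(q["type"])}"']
        if q.get("required") == "true":
            attrs.append("required")
        if "min" in q:
            attrs.append(f'min="{html.escape(q["min"])}"')
        rows.append(f'<label>{html.escape(q["label"])} <input {" ".join(attrs)}></label>')
    body = "\n".join(rows)
    return f'<form method="post">\n<h1>{html.escape(title)}</h1>\n{body}\n<button>Submit</button>\n</form>'


def validate(xml_text, answers):
    """Repeat the same checks on the server; return a dict of field name -> error."""
    _, questions = parse_questions(xml_text)
    errors = {}
    for q in questions:
        value = answers.get(q["name"], "").strip()
        if not value:
            if q.get("required") == "true":
                errors[q["name"]] = "required"
            continue
        if q["type"] == "number":
            try:
                number = float(value)
            except ValueError:
                errors[q["name"]] = "not a number"
                continue
            if "min" in q and number < float(q["min"]):
                errors[q["name"]] = "below minimum"
    return errors
```

Because both the form and the server-side checks are derived from the same XML definition, adding a new questionnaire in this sketch means writing a new XML file rather than new application code.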
Benefits - Summary of Main Benefits
Simplifying data collection process (cost-efficiency)
- Reducing need for human resources
- Reducing other data collection costs
Improving the quality of collected data (accuracy)
- Decreasing non-response
Speeding up the data accumulation (timeliness)
Reducing response burden (data provider relations)
- Enabling direct individual feedback for respondents
- Enabling browsing of previously submitted data
- Assuring high-level data security
Achieved Cost-Efficiency - 2nd Generation

Four second-generation solutions have been in production for 3 years (3300 respondents per month plus 800 per quarter)
- On average, over 40% of the work in the data collection phase has been saved (2 person-years)
- The amount of ground mail has been reduced by 65% (0.5 person-years)
  - Number of reminders sent has gone down by half
  - “Mass e-mailer” for all kinds of collections
Investment is paid off in about a year
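
The payback claim can be checked with rough arithmetic from figures quoted in this deck. The sketch below assumes that the savings listed above are annual and that the total resource input of about 2.5 person-years reported for the first and second phases is the investment being recovered; neither assumption is stated explicitly on the slides.

```python
# Figures quoted on the slides; treating the savings as annual is an assumption.
work_saved_collection = 2.0  # person-years saved in the data collection phase
work_saved_mail = 0.5        # person-years saved through reduced ground mail
investment = 2.5             # person-years spent in the first and second phases

annual_saving = work_saved_collection + work_saved_mail
payback_years = investment / annual_saving

print(f"Payback period: about {payback_years:.1f} year(s)")
```

Under these assumptions the result is consistent with the statement that the investment is paid off in about a year.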
Cost-Efficiency Continues to Improve - 3rd Generation

Common framework (one engine) for similar systems
- An effective build-up
- Simple method for transferring data between collection and production databases
- Only one application to maintain and support
- Support and development knowledge easier to acquire and spread
- Reducing need for human resources
  - As manual handling diminishes, it can be replaced by more rewarding tasks
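
A transfer between a collection database and a production database might look like the sketch below. The slides do not describe the actual routine, so everything here is hypothetical: the table and column names, the `validated` and `transferred` flags, and the use of SQLite as a stand-in for the real databases.

```python
import sqlite3


def make_demo_databases():
    """Create two in-memory databases with a hypothetical schema and sample rows."""
    collection = sqlite3.connect(":memory:")
    production = sqlite3.connect(":memory:")
    collection.execute(
        "CREATE TABLE responses (respondent_id TEXT, period TEXT, value REAL, "
        "validated INTEGER, transferred INTEGER DEFAULT 0)"
    )
    production.execute(
        "CREATE TABLE statistics_input (respondent_id TEXT, period TEXT, value REAL)"
    )
    collection.executemany(
        "INSERT INTO responses (respondent_id, period, value, validated) VALUES (?, ?, ?, ?)",
        [("A1", "2006-06", 1500.0, 1), ("B2", "2006-06", -3.0, 0)],
    )
    collection.commit()
    return collection, production


def transfer_validated(collection, production):
    """Copy validated, not yet transferred responses to production; return the row count."""
    rows = collection.execute(
        "SELECT respondent_id, period, value FROM responses "
        "WHERE validated = 1 AND transferred = 0"
    ).fetchall()
    with production:  # commits on success, rolls back on error
        production.executemany(
            "INSERT INTO statistics_input (respondent_id, period, value) VALUES (?, ?, ?)",
            rows,
        )
    with collection:
        collection.execute(
            "UPDATE responses SET transferred = 1 WHERE validated = 1 AND transferred = 0"
        )
    return len(rows)
```

Marking rows as transferred makes the routine safe to run repeatedly, which suits a single shared engine serving many surveys.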
An Example - Working Hours Used in Data Collection and Validation in Sale Inquiry
[Figure: working hours per year, 2001-2005; vertical axis "hours", 0-2500]
Accuracy and Timeliness

The data received are of better quality: “25% less errors” (both annual and sub-annual surveys)
Response rates have remained at a high level
The average response time of monthly surveys has decreased
- In the best case by 8-10 days, or 30%
The number of reminders sent has decreased substantially
- In the best case by 50% (from 1000 to 500 in just 4 months)
The share of respondents using the EDR solution has in most cases reached a high level
- Sub-annual surveys: > 60% (in the best case 85%)
- Annual surveys: ~ 30% (in the best case 75%)
An Example - Sale Inquiry: Accumulation of Data 01/2002 - 01/2006
[Figure: accumulation of responses for 01/2002, 01/2003, 01/2004, 01/2005 and 01/2006; vertical axis "responses", 0-2000]
Data Provider Relations
Perceived response burden has gone down
- E-mail announces the survey and reminds respondents to answer
- Questionnaire is “always” available and fast to fill in
- Option to fill in the questionnaire in separate sessions
- Good questionnaire design
- Helpful validity checks, so no additional inquiries are needed
- Contextual on-line help
- Support for several languages
- Individually tailored feedback
- Access to all previously submitted data and pre-filled questionnaires

High Level of Data Security
Data security audit by an outside consultant
- All traffic on the Internet is SSL-encrypted
- An authentication / authorisation process is always needed
- New user IDs and passwords every year
- User IDs and passwords are initially sent in a letter
  - Only one of them can be sent by e-mail
  - The other one must always be sent in a letter or given by telephone
Only a certain number of our staff have access to user IDs
and passwords (usually two persons per survey)
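
The rule that a user ID and its password never travel together by e-mail can be sketched as below. The user ID format, the password length and the function names are hypothetical illustrations, not Statistics Finland's actual procedure.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def new_credentials(respondent_id, year):
    """Issue a fresh user ID and random password for one collection year.

    The "<respondent>-<year>" ID format and the 12-character length are assumptions.
    """
    user_id = f"{respondent_id}-{year}"
    password = "".join(secrets.choice(ALPHABET) for _ in range(12))
    return user_id, password


def plan_resend(email_item):
    """Choose delivery channels when credentials are sent again.

    Only one of the two items may go by e-mail; the other must go by
    letter or telephone.
    """
    if email_item not in ("user_id", "password"):
        raise ValueError("email_item must be 'user_id' or 'password'")
    other = "password" if email_item == "user_id" else "user_id"
    return {email_item: "email", other: "letter or telephone"}
```

For example, `plan_resend("user_id")` routes the user ID by e-mail and the password by letter or telephone, so intercepting a single channel never reveals a complete login.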
An Example - Sale Inquiry: Change in Response Media 12/2001 - 12/2005
[Figure: number of responses by response medium (Fax, Mail, EDR) for 12/2001, 12/2002, 12/2003, 12/2004 and 12/2005; vertical axis "responses", 0-1800]
Costs - Investment and Maintenance
The costs have dropped by 60-70% during the last few years
- Average investment cost per new EDR solution (today)
  - Outside service provider: formerly EUR 5000
  - In-house solution (XCola): less than 150 hours of work
- Maintenance costs of an EDR solution per year (today)
  - Outside service provider: formerly EUR 1000
  - In-house solution (XCola): less than 50 hours of work
During the first and second phases the total resource input was about 2.5 person-years (“learning by doing”)
- Included the development of a secure communication environment
- Included the implementation of 7 solutions
An Example - Work Done in Development and Maintenance of an EDR Solution (Sale Inquiry)
[Figure: hours of work per year, 2002-2005; vertical axis "hours", 0-900; chart note: "Includes hours used in development of infrastructure"]
Challenges - In-house Development and Maintenance
The development of surveys can be very fast if the IT personnel have good skills in XML and related techniques
- At the moment the number of highly skilled survey developers is limited
- The whole production environment around XCola is not yet finished
  - Somewhat dependent on certain named persons
The statistics departments typically have a lot of requirements for the surveys
- Some minor development in XCola is needed all the time
Pilot - Integrated Data Collection (Tourism)

Data are delivered directly from hotel management systems into our database
- No manual work needed (except to initiate the transfer)
After reception, the data are submitted to the standard validation process
- Software vendors implement a module for the hotel’s management software using Statistics Finland’s definitions for data and service interface
- Implemented using a typical B2B integration technique: XML Web Services
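
The flow described above, in which a hotel system produces an XML message and the receiving side validates it before storing, can be sketched as follows. The message layout (`accommodationReport`, `nights`, `hotelId`, `country`) is invented for illustration; the real data and service interface definitions are not given in the slides, and the Web Service transport itself is left out.

```python
import xml.etree.ElementTree as ET


def build_report(hotel_id, month, nights_by_country):
    """Hotel side: serialise monthly overnight stays into a (hypothetical) XML message."""
    root = ET.Element("accommodationReport", hotelId=hotel_id, month=month)
    for country, nights in sorted(nights_by_country.items()):
        ET.SubElement(root, "nights", country=country).text = str(nights)
    return ET.tostring(root, encoding="unicode")


def receive_report(xml_text):
    """Receiving side: parse the message and apply basic validation.

    Returns (rows, errors): rows are ready for the database, errors list
    the items that failed the checks.
    """
    root = ET.fromstring(xml_text)
    rows, errors = [], []
    for item in root.findall("nights"):
        country = item.get("country", "")
        try:
            nights = int(item.text or "")
        except ValueError:
            errors.append(f"{country}: nights is not an integer")
            continue
        if nights < 0:
            errors.append(f"{country}: nights is negative")
            continue
        rows.append((root.get("hotelId"), root.get("month"), country, nights))
    return rows, errors
```

A usage example: `receive_report(build_report("H1", "2006-07", {"FI": 120, "SE": 30}))` yields two rows and no errors, so the only manual step left in this sketch is starting the transfer.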

Near Future - Productisation and Integration
More integrated data collections?
- Co-operation with management system providers
Project for productisation of XCola (since June 2006)
- Already completed (XCola v. 3.1):
  - Developer’s manual, finalised administration tools
  - Routines for transfers between collection and production databases
  - XCola version for outside evaluation has been built
- Under development:
  - Graphical editor for building questionnaires and links to metadata
Project for co-ordination of business surveys
- In the future, more co-ordinated surveys instead of many independent surveys targeted towards businesses