Reducing Costs Work Stream: - Office for National Statistics

Methods for reducing respondent burden and saving
costs incurred in the production of Official Statistics
This document covers several methods for reducing respondent burden and saving costs that
are hoped to be generally applicable to many scenarios in both business and social statistics
across the GSS.
Each method covered below is explained and discussed briefly and links to methodological
guidance are provided where appropriate. Each section also contains at least one case study,
in which financial details are included where possible. The teams or projects quoted can be
contacted for further information. The methods suggested in this paper complement usual best
practice of regularly reviewing statistical output.
The methods and case studies cover a range of different aims such as:
- Reducing respondent burden
- Reducing costs for statistical producers
- Improving the quality of an output without increasing costs or respondent burden
This document aims to provoke further thinking, to facilitate communication amongst the GSS
community, and be a point of reference for statisticians looking for ideas to reduce costs or
burden.
Often several government bodies collect similar information for different purposes. Sometimes
statisticians are not aware of what others may be doing, and at other times legalities can
prevent data from being shared. To minimise duplication in research projects, all new surveys
should be discussed with departmental Survey Control Liaison Officers (SCLOs) in the first
instance and subsequently with the Survey Control Unit in ONS.
Contents:
Method 1: Imputing data from other sources
Method 2: Using alternative methods of data collection
Method 3: Using administrative data
Method 4: Reducing the burden of European legislation
Method 5: Reducing frequency of surveys/publications
Method 6: ESSnet: international collaboration
Method 7: Modelling and Estimation
Method 1: Imputing data from other sources
It may be possible to reduce the number of questions on a survey without losing information,
or add variables without including extra questions, by matching the survey respondents to an
existing source that already contains the information required. Many sources contain similar
core variables and it is sometimes possible to impute the information from one source onto
another set of respondents using these core variables. Matching techniques vary depending
on the quality of the data and the accuracy of the match required. Techniques can range from
looking at correlation coefficients for an entire dataset at aggregate level to propensity score
matching at record level.
Furthermore, if the sources used contain information on identical respondents it may be
possible to link them together. Linking two datasets together will produce the most accurate
results, but it should be noted that this option is rarely available, and reliable data can be
imputed using less precise techniques.
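As an illustration of the record-level end of that range, the sketch below copies a missing variable onto survey records from the most similar record in a donor source, matched on shared core variables. It is a simple nearest-neighbour (hot-deck style) match rather than propensity score matching, and every variable name and value in it is made up for illustration.

```python
from math import sqrt

# Hypothetical donor source: holds the core variables AND the target variable.
donors = [
    {"age": 25, "household_size": 1, "other_income": 0},
    {"age": 40, "household_size": 4, "other_income": 9000},
    {"age": 67, "household_size": 2, "other_income": 3000},
]

# Hypothetical survey records: core variables only, target variable missing.
recipients = [
    {"age": 38, "household_size": 4},
    {"age": 70, "household_size": 2},
]

CORE_VARIABLES = ("age", "household_size")


def distance(a, b, scales):
    """Scaled Euclidean distance over the shared core variables."""
    return sqrt(sum(((a[v] - b[v]) / scales[v]) ** 2 for v in CORE_VARIABLES))


def impute_from_donors(recipients, donors, target):
    """Copy `target` from the nearest donor onto each recipient record."""
    # Scale each variable by its range in the donor data so that no single
    # variable dominates the distance (a range of zero falls back to 1).
    scales = {}
    for v in CORE_VARIABLES:
        values = [d[v] for d in donors]
        scales[v] = (max(values) - min(values)) or 1
    imputed = []
    for r in recipients:
        nearest = min(donors, key=lambda d: distance(r, d, scales))
        imputed.append({**r, target: nearest[target]})
    return imputed


imputed_records = impute_from_donors(recipients, donors, "other_income")
```

In practice the choice of core variables, the distance measure and any quality checks on the imputed values would depend on the sources involved.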
Links:
One of the challenges with this method is that there is no exhaustive list of data sources
currently available. The publication hub and data.gov are two starting points for any potential
search.
Also, for basic details on survey names and departments see
Statnet: GSS@work: Guidance and Good Practice:
Survey Control: Comprehensive list of Regular Surveys
Communities and Local Government: The single data list (a list of all mandatory data
collections from Local Authorities)
Case study:
Scottish Household Survey – Scottish Government
The Scottish Household Survey (SHS) records the income of the Highest Income
Householder and their spouse/partner, but does not ask any questions about the income of
other householders. This means that the survey under-estimates 'total household income' for
some household types, especially multi-adult households. To get the information through
interview would increase the length of the interview and possibly require more householders
to be interviewed.
To address this, statisticians within Scottish Government worked with DWP colleagues from
the Family Resources Survey (FRS) and an ONS methodologist (funded by the ONS Quality
Improvement Fund) to devise a method for imputing 'other householders' income in the SHS
using similar cases in the FRS. The first set of SHS income data including the imputed cases
have now been published (classed as "data being developed") and are currently undergoing
further quality assurance work.
Further information on the SHS income imputation project can be found online.
Method 2: Using alternative methods of data collection
Alternative methods of data collection using new technology or mixed-mode approaches offer
potential cost savings. Furthermore, as people increasingly use the web and personal
technology, a traditional pen and paper survey approach may appear inefficient and may be
more burdensome to respondents. In such circumstances, switching to new methods of
technology and using sequential mixed modes may help to maintain response rates.
Links:
A report for the GSS: The application of alternative modes of data collection in UK
Government social surveys states that cost savings may be achievable in government social
surveys, by maximising the use of cheaper modes of data collection (mail, internet and
telephone) in mixed-mode data collection survey designs. In some circumstances probability
sampling and some face-to-face interviewing may need to be retained to ensure data quality.
More evidence is needed to assess the trade-off between reducing cost and maintaining
quality.
The report also flags the need to manage discontinuity in time series, to reduce or simplify data
needs, and to secure the organisational commitment, capability and resources required to develop,
test and maintain new survey designs or strategies.
Case study 1:
Developing the National Travel Survey (NTS) using GPS devices in place of the
traditional paper travel diary – Department for Transport
The NTS is a GB household survey whereby every household member is interviewed then
asked to complete a seven-day travel diary. The travel diary is a considerable burden upon
respondents; it is long (by international standards) and poses difficulties for specific groups of
people such as those with limited English, sight impairments or low literacy levels.
In 2006 DfT published a Review of the Potential Role of 'New Technologies' in the National
Travel Survey (Wolf et al). It was subsequently agreed that personal GPS devices were the
most suitable option to deliver affordable and practical improvements. In 2010, the National
Travel Survey GPS Feasibility Study (Anderson et al) concluded that GPS technology has
real promise for use within the NTS and that there are no fundamental barriers to feasibility or
public acceptability.
DfT has since started preparing a second pilot study that will test how the NTS would work in
a real-life scenario if the diary were replaced with an accelerometer-equipped GPS device. If
this work is successful it is possible that when the existing contract expires in 2012, the NTS
could be re-tendered using a new GPS-based methodology.
Using GPS devices means that the burden upon respondents is substantially reduced to
carrying the device and charging it for a few hours each evening, much like a mobile phone. It
would also remove the need for printing, coding and data entry of some 22,000 diaries each
year, offering potential cost savings beyond the initial investment in devices and ongoing
data processing costs. Until the pilot study is completed the true scale of cost savings is
unclear, but it is expected that the introduction of GPS devices will save money for the DfT in
the long run, in addition to dramatically cutting respondent burden.
For more information please email the National Travel Survey team
Case Study 2:
Developing the Secure Electronic File Transfer System to replace paper and removable
media transfers - ONS
The Secure Electronic File Transfer system (SEFT) is a web application which was developed
for the Office for National Statistics in 2006. It was originally intended to replace paper
questionnaires for the Foreign Direct Investment (FDI) family of surveys, which requires one
questionnaire to be filled out for each unit of the responding organisation. This often meant
250 questionnaires for a single respondent.
SEFT allows the transmission of files electronically, enabling the FDI respondents to complete
a single spreadsheet (often writing reports from their own software systems to do it
automatically) instead of handwriting all 250 responses.
A considerable amount of workflow functionality is included in the system. This includes
automatic prompts and reminders for completion and a secured messaging facility which
makes it possible to discuss responses in detail with respondents at their convenience.
The system is now used for approximately 35 surveys, some for the messaging facility alone
and some to replace files that would have been encrypted and written to removable media. In
all cases the effort required by a respondent to provide data or clarifications has been
reduced.
The purpose of developing this system was initially to manage statistical risk as well as to reduce
the burden on respondents supplying this data. Evidence shows that in excess of 60% of
respondents using this system do not require reminders to complete their surveys. This has
also resulted in a reduction of administrative costs. The SEFT system, although expensive in
terms of cost per respondent, is less expensive on a data item basis.
Basic information on costs:
- The system costs £250k per annum
- Per respondent: £104 per year (2400 respondent users)
- Per file: £54 (4575 files, Jan 2010 to Jan 2011)
- Per response: 36p (average 150 responses per file)
This cost includes all 6550 initial notifications to provide data, 2725 reminder emails and 6350
messages sent in the Jan 2010 – Jan 2011 period. ONS cannot report on the cost of
response chasing and querying in an equivalent mode.
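The unit costs quoted above can be reproduced from the annual cost and the volumes given. The short sketch below shows the arithmetic; it assumes the per-response figure is the per-file cost divided by the average number of responses per file. The results should agree with the quoted figures up to rounding (the quoted per-file figure appears to have been rounded down).

```python
# Figures quoted in the case study (Jan 2010 to Jan 2011).
annual_cost_gbp = 250_000
respondent_users = 2_400
files_received = 4_575
responses_per_file = 150  # average quoted in the case study

cost_per_respondent = annual_cost_gbp / respondent_users
cost_per_file = annual_cost_gbp / files_received
# Assumption: cost per response = cost per file / average responses per file.
cost_per_response = cost_per_file / responses_per_file

print(f"Per respondent: £{cost_per_respondent:.2f}")
print(f"Per file: £{cost_per_file:.2f}")
print(f"Per response: {cost_per_response * 100:.1f}p")
```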
For more information please email the respondent feedback team in the ONS
Method 3: Using Administrative Data
Government owned administrative databases can be rich and useful sources of data. It may
be possible to improve quality of statistical outputs whilst reducing burden on respondents if
statisticians could get access to such sources. On the other hand, administrative sources are
sometimes highly inaccessible due to legal barriers, data quality issues and subtle definitional
differences that make comparisons to statistical releases difficult.
Links:
- ONS legal guide to data sharing for statistical purposes (link temperamental) provides
  statisticians with issues to consider and some examples of best practice in this area.
- Eurostat: Six Dimensions of Statistical Quality provides a framework for assessing
  whether an administrative source meets the required level of quality needed for a
  statistical output.
- National Statistician's Summary Guidance on the Use of Administrative or
  Management Information and guiding principles (link only available over GSI).
- GSS guide giving step-by-step description of the issues and actions that need to be
  considered when looking to create new legal gateways - GSS Stepping Stones to
  data sharing.
Case study:
The Migration Statistics Improvement Programme (MSIP) – cross government
The internal migration estimates were previously based entirely on moves observed in the
Patient Register. Some of the moves made by students were not captured in this dataset, as
many students continued to be registered with family doctors at their parents’ home address
despite moving across the country. Additionally, at the end of study, subsequent moves to
locations for employment were poorly captured, particularly for male students, as a high
proportion did not register with doctors until they needed treatment of some kind.
The Higher Education Statistics Agency (HESA) holds the Student record, a dataset that
identifies the location of students in England and Wales. This source can identify students’
addresses, both term time and home, so students can be allocated to their
correct term-time Local Authority. Students can also be more accurately taken out of their
university location at the end of their study and redistributed according to popular graduate
locations from the most recent census.
Statistical colleagues in HESA advised the MSIP team that the Student record was
appropriate to be used in the population estimates. Parliamentary legislation was required
when seeking access to data at individual level and ONS achieved this legislative change.
The MSIP team were then able to incorporate the student record into their estimates,
improving accuracy without creating an additional survey or increasing respondent burden.
For more information please see the MSIP methodology paper online.
Method 4: Influencing European legislative requirements
Increased communication between EU countries can help to identify areas where legislative
EU data requirements could be lowered in order to save respondent burden and costs across
the EU.
Case Study 1:
EU trade in goods data collection (Intrastat) – HM Revenue & Customs (HMRC)
The Intrastat survey collects information on trade in goods between UK VAT registered
companies and other EU member states. This has one of the largest statistical survey
burdens on business in the UK. HMRC lobbied Eurostat and other Member States to
introduce legislation that would reduce the burden on business in complying with Intrastat
regulations.
In 2006 Eurostat agreed to pass legislation which meant that the weight of the traded goods
would not have to be declared on the Intrastat declaration (for around one quarter of all the
different products). Instead, Member States would be required to estimate the weight, using
either the declared value or ‘number of items’ information. Eurostat provided factors to use in
the estimation of weight, so as to have a consistent method employed across member states.
This has led to businesses not having to declare the weight on several million trades over the
course of a year, yielding around a £400k annual burden reduction (according to the standard
cost model).
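The estimation step described above amounts to multiplying the declared value (or the number of items) by a product-specific factor. The sketch below illustrates the idea only: the product codes and factor values are hypothetical and are not the factors supplied by Eurostat.

```python
# Hypothetical conversion factors: kilograms per pound of declared value,
# or kilograms per item. The actual Eurostat factors are not reproduced here.
CONVERSION_FACTORS = {
    "product_A": ("value", 0.002),  # kg per £ of declared value
    "product_B": ("items", 1.5),    # kg per item
}


def estimate_weight_kg(product, declared_value_gbp, number_of_items=None):
    """Estimate the weight of a trade for which weight is no longer declared."""
    basis, factor = CONVERSION_FACTORS[product]
    if basis == "value":
        return declared_value_gbp * factor
    if number_of_items is None:
        raise ValueError(f"{product} needs a number of items to estimate weight")
    return number_of_items * factor


weight_a = estimate_weight_kg("product_A", declared_value_gbp=10_000)
weight_b = estimate_weight_kg("product_B", declared_value_gbp=500, number_of_items=40)
```

Using one shared table of factors is what keeps the estimates consistent between member states.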
Later, in 2009 Eurostat also agreed to pass legislation requiring only the largest of EU import
traders to submit detailed declarations. For the UK this meant that around 20% of the current
Intrastat business population became exempt (around 7,000 SMEs). Only around 15% of the
largest businesses now have to make Intrastat declarations, and these account for around
95% of the total value of EU import trade. The remaining 5% value of trades is now estimated
from other administrative sources. This has led to an estimated £2m annual burden reduction
(according to the standard cost model).
Intrastat simplification details - 2010
Case Study 2:
Structural Business Statistics Regulation SBSR – GSS wide
The Structural Business Statistics Regulation (SBSR) is the legislation covering data
collection requirements for Business Statistics across EU countries. When this legislation was
being reviewed by the European Council, there were a number of changes made to reduce
burden both on statistical producers and on respondents.
The GSS were able to influence this decision while it was under review. The GSS were clear-sighted
in what they wanted; persistent (yet reasonable) in their views; and formed successful
alliances with other member states, as well as a close relationship with the Council chairman.
These aspects combined to provide the weight needed to change the legislation across the
EU.
For more information about this specific case, contact the Surveys and Administrative
Sources directorate in the ONS or the International Relations Branch of the ONS for
information on another area of the GSS.
Method 5: Reducing frequency of surveys/publications
Reducing frequency is a simple and effective way to cut internal costs and respondent
burden. However, the challenge is to assess how a reduced frequency will impact on the time
series and the end use of the data, and the relative importance of any impact. For example, if
the data are volatile you may lose a lot of information that is crucial for modelling or
estimation, or the loss may not be significant. It is also important to consider the needs of
users when evaluating the impact of reduced frequency.
Case study:
The Small Business Survey - Department for Business Innovation and Skills
The Annual Small Business Survey (2003 to 2007) asked small and medium sized
enterprises (SMEs) about a range of business issues. The survey interviewed around 9,000
SME respondents in the UK each year.
From 2008, it was agreed that the survey would move to a biennial frequency in order to
reduce the cost to the department and the burden on businesses. However, in 2009, the
planned survey was deferred so as not to burden businesses further during the recession.
The deferral of the survey in 2008 and 2009 saved £470,000 in total compliance costs.
It has now been agreed that the survey will only be carried out according to BIS business
needs and priorities. In 2010, a Small Business Survey was commissioned with a smaller
sample size of 4,000 SMEs, balancing the need for a robust sample size and the need to
keep costs down. As a result, it is estimated that the 2010 survey compliance costs to
business will be around £80,000 less than in 2007.
Statistical tables are published on the BIS website, or contact the Enterprise Statistics team
for more information.
Method 6: International Collaboration
Increased work with other countries can help reduce burden in the UK. Eurostat fund
numerous international collaborations in various ways and at various levels. These can be
useful ways for members of the European Statistical System (ESS) to receive funding to aid
methodological or strategic developments which aim to reduce costs incurred in producing
statistics and reduce burden. Examples of such (part-funded) projects are the European
Statistical System Network (ESSnet) Projects (previously known as the Centres and Networks
of Excellence (Cenex) Projects), which are part of the Modernisation in European Enterprise
and Trade Statistics (MEETS) Programme.
The purpose of these ESSnets is to “put together expertise distributed throughout the ESS
organisations in order to develop specific actions which would benefit the whole system.”
Examples of ESSnet projects which involve reducing costs or burden include:
o Decentralised access to microdata
o Use of administrative data
o Methodology for modern business statistics
o Standardisation and the integration of statistical processes in the ESS
Links
General introduction to ESSnet Projects:
http://epp.eurostat.ec.europa.eu/portal/page/portal/essnet/introduction
Completed ESSnet projects:
http://epp.eurostat.ec.europa.eu/portal/page/portal/essnet/essnet_projects/finished_ESSnet_projects
On-going ESSnet projects:
http://epp.eurostat.ec.europa.eu/portal/page/portal/essnet/essnet_projects/running_ESSnet_projects
ESSnet on Use of Administrative Data:
http://essnet.admindata.eu/
Case Study
The ESSnet on the use of administrative data in business statistics - ONS
Business statistics are often produced by conducting statistical surveys requiring enterprises
to submit data they have already provided to other government institutions. This places
additional burden on businesses.
As set out in Method 3, increasing the use of administrative data would reduce the reporting
burden on businesses, improve the quality of statistical information and may even reduce
costs to the statistical producer. This ESSnet project aims to produce guidance and best
practice to facilitate the use of administrative data in the production of business statistics.
Some of the topics covered by this ESSnet project include:
o Checklists for deciding to use administrative data and when receiving administrative data
from the source organisation,
o Methods for estimation of variables not directly available from administrative data,
o Methods to address issues with the timeliness of administrative data,
o Development of quality indicators when using administrative data in the production of
business statistics,
o Definitions and links between statistics and accounting standards.
For more information follow the above ESSnet link on Use of Administrative Data
Method 7: Modelling and Estimation Techniques
Instead of adding a new question into a survey, or running a separate survey, it may be
possible to model or estimate a response value based on information received from the
respondent and/or from alternative data sources that already exist. By collecting less
information the respondent burden would be reduced. In some circumstances, a required
variable can be directly estimated from an available correlated variable. Where this is not
possible, statistical techniques such as regression, survival analysis and others can be used
to predict outcomes based on several variables from the historical time-series or variables
collected elsewhere.
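A minimal sketch of the simplest case, estimating a required variable from a single correlated variable by ordinary least squares, is given below. The variables (employee counts and turnover) and all values are invented for illustration, and a real application would need model checking and an assessment of the resulting estimation error.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical records where both variables were collected.
employees = [5, 10, 20, 40]
turnover = [110, 210, 410, 810]

intercept, slope = fit_simple_regression(employees, turnover)

# Predict the uncollected variable for a unit where only `employees` is known.
predicted_turnover = intercept + slope * 15
```

The same pattern extends to several predictors or to other model families, at which point a statistical package would normally replace the hand-written fit.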
Related link:
Estimation and accurate data modelling can often be methodologically challenging. Many
statistical textbooks are available covering models of all kinds. Statisticians interested in
developing these techniques could look to attend specific courses, for example those run by
the RSS Centre for professional development or the University of Southampton.
The United Nations (UN) has also published guidelines for the Modelling of Statistical data
and Metadata.
Case Study 1
Foreign Affiliates Statistics - ONS
The Foreign Affiliates Statistics (FATS) regulation is a new European regulation requiring the
UK to produce statistics on the overseas activities of multinational enterprises that are
controlled by a UK entity. This is a new area of statistics for the UK and it was determined that
simply responding to the regulation by launching a full-scale survey would lead to
unacceptable costs both to ONS and business.
As a result, ONS have developed an approach in which only a very small survey is conducted
which is primarily focussed on the very largest multinational groups. Data for the remainder of
the population is imputed from the EuroGroups Register (EGR). The EGR is a new
multinational business register, coordinated by Eurostat, which contains information from all
European member states. Multi-level models, which have been developed based on the
survey returns and other available data, are applied where data on particular variables is
unavailable in the EGR or is of poor quality.
The methodology is currently being finalised to allow delivery of the first set of figures to
Eurostat in August 2011. For more information, please email a member of the FATS team.
Case Study 2
Sub-regional fuel poverty – DECC
DECC's method for producing sub-regional fuel poverty data takes national data, derived
primarily from survey data, and combines this with various other small area datasets, in
particular the census. It uses regression analysis to predict fuel poverty at low levels of
geography by linking common factors from the census (and other sources) with the national
fuel poverty data.
Until last year, local authorities were responsible for collecting their own “fuel poverty” data to
monitor a National Indicator (NI187) – they did this using an expensive survey of households
in the district, which presented difficulties both in terms of the time and burden of the survey, as well
as in comparability. By using sub-regional fuel poverty data, DECC are able to provide
figures at no cost to the LAs, and NI187 has been scrapped.
DECC’s 2008 Sub-regional fuel poverty methodology and documentation can be found on
their website. The data presented at sub-regional level help to predict where fuel poverty will
occur, and are available to download at local authority and parliamentary constituency level,
and on request at census output area.
Contact the Fuel Poverty Team for more information.