Methods for reducing respondent burden and saving costs incurred in the production of Official Statistics

This document covers several methods for reducing respondent burden and saving costs that are intended to be generally applicable to many scenarios in both business and social statistics across the GSS. Each method is explained and discussed briefly, and links to methodological guidance are provided where appropriate. Each section also contains at least one case study, with financial details included where possible. The teams or projects quoted can be contacted for further information. The methods suggested in this paper complement the usual best practice of regularly reviewing statistical output.

The methods and case studies cover a range of different aims such as:
- Reducing respondent burden
- Reducing costs for statistical producers
- Improving the quality of an output without increasing costs or respondent burden

This document aims to provoke further thinking, to facilitate communication amongst the GSS community, and to be a point of reference for statisticians looking for ideas to reduce costs or burden.

Often several government bodies collect similar information for different purposes. Sometimes statisticians are not aware of what others may be doing, and at other times legal restrictions can prevent data from being shared. To minimise duplication in research projects, all new surveys should be discussed with departmental Survey Control Liaison Officers (SCLOs) in the first instance and subsequently with the Survey Control Unit in ONS.
Contents:
Method 1: Imputing data from other sources
Method 2: Using alternative methods of data collection
Method 3: Using administrative data
Method 4: Reducing the burden of European legislation
Method 5: Reducing frequency of surveys/publications
Method 6: ESSnet: international collaboration
Method 7: Modelling and Estimation

Method 1: Imputing data from other sources

It may be possible to reduce the number of questions on a survey without losing information, or to add variables without including extra questions, by matching the survey respondents to an existing source that already contains the information required. Many sources contain similar core variables, and it is sometimes possible to impute the information from one source onto another set of respondents using these core variables.

Matching techniques vary depending on the quality of the data and the accuracy of the match required. Techniques can range from looking at correlation coefficients for an entire dataset at aggregate level to propensity score matching at record level. Furthermore, if the sources used contain information on identical respondents, it may be possible to link them together. Linking two datasets will produce the most accurate results, but this option is rarely available, and reliable data can still be imputed using less precise techniques.

Links:
One of the challenges with this method is that there is no exhaustive list of data sources currently available. The publication hub and data.gov are two starting points for any potential search.
Also, for basic details on survey names and departments see:
Statnet: GSS@work: Guidance and Good Practice: Survey Control: Comprehensive list of Regular Surveys
Communities and Local Government: The single data list (a list of all mandatory data collections from Local Authorities)

Case study: Scottish Household Survey – Scottish Government

The Scottish Household Survey (SHS) records the income of the Highest Income Householder and their spouse/partner, but does not ask any questions about the income of other householders. This means that the survey under-estimates 'total household income' for some household types, especially multi-adult households. Collecting this information through interview would increase the length of the interview and possibly require more householders to be interviewed.

To address this, statisticians within Scottish Government worked with DWP colleagues from the Family Resources Survey (FRS) and an ONS methodologist (funded by the ONS Quality Improvement Fund) to devise a method for imputing 'other householders' income in the SHS using similar cases in the FRS. The first set of SHS income data including the imputed cases has now been published (classed as "data being developed") and is currently undergoing further quality assurance work. Further information on the SHS income imputation project can be found online.

Method 2: Using alternative methods of data collection

Alternative methods of data collection using new technology or mixed-mode approaches offer potential cost savings. Furthermore, as people increasingly use the web and personal technology, a traditional pen and paper survey approach may appear inefficient and may be more burdensome to respondents. In such circumstances, switching to new technology and using sequential mixed modes may help to maintain response rates.
Links:
A report for the GSS: The application of alternative modes of data collection in UK Government social surveys states that cost savings may be achievable in government social surveys by maximising the use of cheaper modes of data collection (mail, internet and telephone) in mixed-mode survey designs. In some circumstances probability sampling and some face-to-face interviewing may need to be retained to ensure data quality. More evidence is needed to assess the trade-off between reducing cost and maintaining quality. The report flags the need to manage discontinuity in time series, to reduce or simplify data needs, and to secure the organisational commitment, capability and resources required to develop, test and maintain new survey designs or strategies.

Case study 1: Developing the National Travel Survey (NTS) using GPS devices in place of the traditional paper travel diary – Department for Transport

The NTS is a GB household survey in which every household member is interviewed and then asked to complete a seven-day travel diary. The travel diary is a considerable burden upon respondents; it is long (by international standards) and poses difficulties for specific groups of people such as those with limited English, sight impairments or low literacy levels.

In 2006 DfT published a Review of the Potential Role of 'New Technologies' in the National Travel Survey (Wolf et al). It was subsequently agreed that personal GPS devices were the most suitable option to deliver affordable and practical improvements. In 2010, the National Travel Survey GPS Feasibility Study (Anderson et al) concluded that GPS technology has real promise for use within the NTS and that there are no fundamental barriers to feasibility or public acceptability. DfT has since started preparing a second pilot study that will test how the NTS would work in a real-life scenario if the diary were replaced with an accelerometer-equipped GPS device.
If this work is successful it is possible that when the existing contract expires in 2012, the NTS could be re-tendered using a new GPS-based methodology.

Using GPS devices means that the burden upon respondents is substantially reduced to carrying the device and charging it for a few hours each evening, much like a mobile phone. It would also remove the need for printing, coding and data entry of some 22,000 diaries each year, offering potential cost savings beyond the initial investment in devices and ongoing data processing costs. Until the pilot study is completed the true scale of cost savings is unclear, but it is expected that the introduction of GPS devices will save money for the DfT in the long run, in addition to dramatically cutting respondent burden.

For more information please email the National Travel Survey team.

Case Study 2: Developing the Secure Electronic File Transfer System to replace paper and removable media transfers - ONS

The Secure Electronic File Transfer system (SEFT) is a web application which was developed for the Office for National Statistics in 2006. It was originally intended to replace paper questionnaires for the Foreign Direct Investment (FDI) family of surveys, which requires one questionnaire to be filled out for each unit of the responding organisation. This often meant 250 questionnaires for a single respondent. SEFT allows the transmission of files electronically, enabling the FDI respondents to complete a single spreadsheet (often writing reports from their own software systems to do it automatically) instead of handwriting all 250 responses.

A considerable amount of workflow functionality is included in the system. This includes automatic prompts and reminders for completion and a secured messaging facility which makes it possible to discuss responses in detail with respondents at their convenience.
The system is now used for approximately 35 surveys, some for the messaging facility alone and some to replace files that would have been encrypted and written to removable media. In all cases the effort required by a respondent to provide data or clarifications has been reduced.

The purpose of developing this system was initially to manage statistical risk as well as to reduce the burden on respondents supplying this data. Evidence shows that more than 60% of respondents using this system do not require reminders to complete their surveys. This has also resulted in a reduction of administrative costs. Although expensive per respondent, the SEFT system is less expensive on a data item basis.

Basic information on costs:
- The system costs £250k per annum
- Per respondent: £104 per year (2400 respondent users)
- Per file: £54 (4575 files, Jan 2010 to Jan 2011)
- Per response: 36p (average 150 responses per file)

This cost includes all 6550 initial notifications to provide data, 2725 reminder emails and 6350 messages sent in the Jan 2010 – Jan 2011 period. ONS cannot report on the cost of response chasing and querying in an equivalent mode.

For more information please email the respondent feedback team in the ONS.

Method 3: Using Administrative Data

Government-owned administrative databases can be rich and useful sources of data. It may be possible to improve the quality of statistical outputs whilst reducing burden on respondents if statisticians can gain access to such sources. On the other hand, administrative sources are sometimes highly inaccessible due to legal barriers, data quality issues and subtle definitional differences that make comparisons to statistical releases difficult.

Links:
ONS legal guide to data sharing for statistical purposes provides statisticians with issues to consider and some examples of best practice in this area.
Eurostat: Six Dimensions of Statistical Quality provides a framework for assessing whether an administrative source meets the level of quality needed for a statistical output.
National Statistician's Summary Guidance on the Use of Administrative or Management Information and guiding principles (link only available over GSI).
GSS Stepping Stones to data sharing: a GSS guide giving a step-by-step description of the issues and actions that need to be considered when looking to create new legal gateways.

Case study: The Migration Statistics Improvement Programme (MSIP) - cross government

The internal migration estimates were previously based entirely on moves observed in the Patient Register. Some of the moves made by students were not captured in this dataset, as many students continued to be registered with family doctors at their parents' home address despite moving across the country. Additionally, at the end of study, subsequent moves to locations for employment were poorly captured, particularly for male students, as a high proportion did not register with doctors until they needed treatment of some kind.

The Higher Education Statistics Agency (HESA) holds the Student record, a dataset that identifies the location of students in England and Wales. This source can identify students' addresses, both term-time and home, so students can be allocated to their correct term-time Local Authority. Students can also be more accurately taken out of their university location at the end of their study and redistributed according to popular graduate locations from the most recent census.

Statistical colleagues in HESA advised the MSIP team that the Student record was appropriate to be used in the population estimates. Parliamentary legislation was required when seeking access to data at individual level, and ONS achieved this legislative change.
The MSIP team were then able to incorporate the student record into their estimates, improving accuracy without creating an additional survey or increasing respondent burden.

For more information please see the MSIP methodology paper online.

Method 4: Influencing European legislative requirements

Increased communication between EU countries can help to identify areas where legislative EU data requirements could be lowered in order to save respondent burden and costs across the EU.

Case Study 1: EU trade in goods data collection (Intrastat) – HM Revenue & Customs (HMRC)

The Intrastat survey collects information on trade in goods between UK VAT registered companies and other EU member states. This has one of the largest statistical survey burdens on business in the UK. HMRC lobbied Eurostat and other Member States to introduce legislation that would reduce the burden on business in complying with Intrastat regulations.

In 2006 Eurostat agreed to pass legislation which meant that the weight of the traded goods would not have to be declared on the Intrastat declaration (for around one quarter of all the different products). Instead, Member States would be required to estimate the weight, using either the declared value or ‘number of items’ information. Eurostat provided factors to use in the estimation of weight, so as to have a consistent method employed across member states. This has led to businesses not having to declare the weight on several million trades over the course of a year, yielding around a £400k annual burden reduction (according to the standard cost model).

Later, in 2009, Eurostat also agreed to pass legislation requiring only the largest of EU import traders to submit detailed declarations. For the UK this meant that around 20% of the current Intrastat business population became exempt (around 7,000 SMEs).
Only around 15% of the largest businesses now have to make Intrastat declarations, and these account for around 95% of the total value of EU import trade. The remaining 5% value of trades is now estimated from other administrative sources. This has led to an estimated £2m annual burden reduction (according to the standard cost model).

Intrastat simplification details - 2010

Case Study 2: Structural Business Statistics Regulation (SBSR) – GSS wide

The Structural Business Statistics Regulation (SBSR) is the legislation covering data collection requirements for Business Statistics across EU countries. When this legislation was being reviewed by the European Council, a number of changes were made to reduce burden both on statistical producers and on respondents. The GSS were able to influence this decision while it was under review. The GSS were clear-sighted in what they wanted; persistent (yet reasonable) in their views; and formed successful alliances with other member states, as well as a close relationship with the Council chairman. These aspects combined to provide the weight needed to change the legislation across the EU.

For more information about this specific case, contact the Surveys and Administrative Sources directorate in the ONS, or the International Relations Branch of the ONS for information on another area of the GSS.

Method 5: Reducing frequency of surveys/publications

Reducing frequency is a simple and effective way to cut internal costs and respondent burden. However, the challenge is to assess how a reduced frequency will affect the time series and the end use of the data, and the relative importance of any impact. For example, if the data are volatile you may lose a lot of information that is crucial for modelling or estimation; in other cases the loss may not be significant. It is also important to consider the needs of users when evaluating the impact of reduced frequency.
Case study: The Small Business Survey - Department for Business Innovation and Skills

The Annual Small Business Survey (2003 to 2007) asked small and medium sized enterprises (SMEs) about a range of business issues. The survey interviewed around 9,000 SME respondents in the UK each year. From 2008, it was agreed that the survey would move to a biennial frequency in order to reduce the cost to the department and the burden on businesses. However, in 2009, the planned survey was deferred so as not to burden businesses further during the recession. The deferral of the survey in 2008 and 2009 saved £470,000 in total compliance costs.

It has now been agreed that the survey will only be carried out according to BIS business needs and priorities. In 2010, a Small Business Survey was commissioned with a smaller sample size of 4,000 SMEs, balancing the need for a robust sample size and the need to keep costs down. As a result, it is estimated that the 2010 survey compliance costs to business will be around £80,000 less than in 2007.

Statistical tables are published on the BIS website, or contact the Enterprise Statistics team for more information.

Method 6: International Collaboration

Increased work with other countries can help reduce burden in the UK. Eurostat fund numerous international collaborations in various ways and at various levels. These can be useful ways for members of the European Statistical System (ESS) to receive funding to aid methodological or strategic developments which aim to reduce costs incurred in producing statistics and reduce burden. Examples of such (part-funded) projects are the European Statistical System Network (ESSnet) Projects (previously known as the Centres and Networks of Excellence (Cenex) Projects), which are part of the Modernisation in European Enterprise and Trade Statistics (MEETS) Programme.
The purpose of these ESSnets is to “put together expertise distributed throughout the ESS organisations in order to develop specific actions which would benefit the whole system.”

Examples of ESSnet projects which involve reducing costs or burden include:
o Decentralised access to microdata
o Use of administrative data
o Methodology for modern business statistics
o Standardisation and the integration of statistical processes in the ESS

Links:
General introduction to ESSnet Projects: http://epp.eurostat.ec.europa.eu/portal/page/portal/essnet/introduction
Completed ESSnet projects: http://epp.eurostat.ec.europa.eu/portal/page/portal/essnet/essnet_projects/finished_ESSnet_projects
On-going ESSnet projects: http://epp.eurostat.ec.europa.eu/portal/page/portal/essnet/essnet_projects/running_ESSnet_projects
ESSnet on Use of Administrative Data: http://essnet.admindata.eu/

Case Study: The ESSnet on the use of administrative data in business statistics - ONS

Business statistics are often produced by conducting statistical surveys requiring enterprises to submit data they have already provided to other government institutions. This places additional burden on businesses. As set out in Method 3, increasing the use of administrative data would reduce the reporting burden on businesses, improve the quality of statistical information and may even reduce costs to the statistical producer. This ESSnet project aims to produce guidance and best practice to facilitate the use of administrative data in the production of business statistics.
Some of the topics covered by this ESSnet project include:
o Checklists for deciding to use administrative data and when receiving administrative data from the source organisation,
o Methods for estimation of variables not directly available from administrative data,
o Methods to address issues with the timeliness of administrative data,
o Development of quality indicators when using administrative data in the production of business statistics,
o Definitions and links between statistics and accounting standards.

For more information follow the above ESSnet link on Use of Administrative Data.

Method 7: Modelling and Estimation Techniques

Instead of adding a new question into a survey, or running a separate survey, it may be possible to model or estimate a response value based on information received from the respondent and/or from alternative data sources that already exist. Collecting less information reduces the respondent burden.

In some circumstances, a required variable can be directly estimated from an available correlated variable. Where this is not possible, statistical techniques such as regression and survival analysis can be used to predict outcomes based on several variables from the historical time series or variables collected elsewhere.

Related link:
Estimation and accurate data modelling can often be methodologically challenging. Many statistical textbooks are available covering models of all kinds. Statisticians interested in developing these techniques could look to attend specific courses, for example those run by the RSS Centre for professional development or the University of Southampton. The United Nations (UN) has also published guidelines for the Modelling of Statistical data and Metadata.
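As an illustration of estimating a required variable from an available correlated variable, the following minimal sketch fits a straight line by ordinary least squares to past returns in which both variables were collected, then predicts the uncollected variable for a new respondent. The variable names (turnover, employment) and all figures are hypothetical and are not taken from any survey described in this document; a real application would need proper model checking and variance estimation.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical historical returns where both variables were collected.
turnover = [100.0, 200.0, 300.0, 400.0]    # available (correlated) variable
employment = [12.0, 22.0, 32.0, 42.0]      # variable we would like to stop asking for

intercept, slope = fit_line(turnover, employment)


def predict_employment(new_turnover):
    """Model-based estimate for a respondent who was not asked the question."""
    return intercept + slope * new_turnover


print(round(predict_employment(500.0), 2))
```

With more than one predictor, the same idea extends to multiple regression, which is usually fitted with a statistical package rather than by hand.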
Case Study 1: Foreign Affiliates Statistics - ONS

The Foreign Affiliates Statistics (FATS) regulation is a new European regulation requiring the UK to produce statistics on the overseas activities of multinational enterprises that are controlled by a UK entity. This is a new area of statistics for the UK and it was determined that simply responding to the regulation by launching a full-scale survey would lead to unacceptable costs both to ONS and business.

As a result, ONS have developed an approach in which only a very small survey is conducted, primarily focussed on the very largest multinational groups. Data for the remainder of the population are imputed from the EuroGroups Register (EGR). The EGR is a new multinational business register, coordinated by Eurostat, which contains information from all European member states. Multi-level models, which have been developed based on the survey returns and other available data, are applied where data on particular variables are unavailable in the EGR or are of poor quality. The methodology is currently being finalised to allow delivery of the first set of figures to Eurostat in August 2011.

For more information, please email a member of the FATS team.

Case Study 2: Sub-regional fuel poverty – DECC

DECC's method for producing sub-regional fuel poverty data takes national data, derived primarily from survey data, and combines this with various other small area datasets, in particular the census. It uses regression analysis to predict fuel poverty at low levels of geography by linking common factors from the census (and other sources) with the national fuel poverty data.

Until last year, local authorities were responsible for collecting their own “fuel poverty” data to monitor a National Indicator (NI187). They did this using an expensive survey of households in the district, which posed difficulties in terms of the time and burden of the survey, as well as comparability.
By using sub-regional fuel poverty data, DECC are able to provide figures at no cost to the LAs, and NI187 has been scrapped. DECC’s 2008 Sub-regional fuel poverty methodology and documentation can be found on their website. The data presented at sub-regional level help to predict where fuel poverty will occur, and are available to download at local authority and parliamentary constituency level, and on request at census output area. Contact the Fuel Poverty Team for more information.
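
The general idea of combining national survey results with small area census data can be sketched as below. This is a deliberately simplified stand-in, not DECC's regression model: it applies national rates by household category to local census counts (a basic synthetic estimator). The categories, rates and counts are all hypothetical.

```python
# Hypothetical national survey rates of fuel poverty by household category.
national_rate = {"pensioner": 0.25, "family": 0.10, "other": 0.15}

# Hypothetical census household counts by category for two small areas.
census_counts = {
    "Area A": {"pensioner": 400, "family": 400, "other": 200},
    "Area B": {"pensioner": 100, "family": 700, "other": 200},
}


def synthetic_rate(counts, rates):
    """Expected share of households in fuel poverty for one area."""
    total_households = sum(counts.values())
    expected_in_poverty = sum(n * rates[category] for category, n in counts.items())
    return expected_in_poverty / total_households


estimates = {
    area: synthetic_rate(counts, national_rate)
    for area, counts in census_counts.items()
}

for area, rate in estimates.items():
    print(f"{area}: {rate:.1%}")
```

A regression-based approach generalises this by letting several census variables contribute to the predicted rate at once, rather than relying on a single categorical breakdown.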