ESCM Chapter 8: Data Quality and Metadata
United Nations Oslo City Group on Energy Statistics
8th Oslo Group Meeting, Baku, Azerbaijan
24-27 September 2013

Outline of the presentation
• Background
  – ESCM Chapter 8 outline
  – Comments from OG7 and VM3
• Chapter structure and content
  – Changes to quality dimensions
  – Examples/issues of data quality
  – Metadata and examples
  – Generic Statistical Business Process Model and examples of implementation
• Next steps

Background – ESCM Chapter 8 outline
Provide guidance on the compilation of energy indicators and the preparation of metadata for energy statistics:
A. Data quality indicators for energy statistics
B. Metadata
  – Description of metadata specific to energy statistics
  – Presentation and dissemination of metadata

Comments from OG7
OG7:
• Specific energy issues that will affect quality.
• The challenges with consistency within energy statistics itself and in relation to other statistical areas.
• The practice of using secondary sources for validation purposes.
• The use of the balances as a quality check.
• Generic Statistical Business Process Model to include administrative data sources.
• Quality in a decentralised model vs. a centralised system.
• Guidance for countries that do not yet have a well-developed energy system.

Comments from VM3
VM3:
• Overall good description of data quality and metadata.
• Incorporate energy-specific examples and references to energy issues.

Chapter structure and content
Three main sections to the chapter:
• Data quality
• Metadata
• Ensuring data quality and metadata using the Generic Statistical Business Process Model (GSBPM)
Each section is followed by energy-related examples.

Changes to quality dimensions
Updated data quality dimensions and indicators for measuring quality from the UN's International Recommendations for Energy Statistics.
Static elements:
• Relevance
• Credibility/objectivity
• Accuracy
• Timeliness
• Coherence
• Accessibility

Dynamic elements:
• Non-response
• Coverage
• Sampling

Another dimension of quality, for metadata:
• Interpretability

Energy-specific examples/issues of data quality
• Statistics Austria's country practice on Energy Balances for Austria and the Laender of Austria.
• Report on Energy Supply and Demand (RESD) from Statistics Canada.
• Use of standard concepts, definitions, classifications – North American Industry Classification System.
• The Swedish Official Statistics decentralized system.

Metadata
• Important for assessing "fitness for use" and ensuring interpretability.
• Required at every step of the survey process.
• Critical for enabling comparisons with other data.
• Used to prescribe definitions, concepts, variables and standards.
• Supports the harmonization of international surveys and data.

Example of metadata
Statistics Canada's Integrated Business Statistics Program metadata requirements
• Development, test, user acceptance testing, production-simulation, and production environments.
• Two levels of metadata:
  – Global: To support processes and functionalities of all surveys.
  – Survey: Specific to each survey depending on use and purpose.
• Defining metadata to drive systems.
• Quality of metadata.

Example of metadata characteristics to drive systems and the GSBPM phases
Metadata characteristics:
1. Metadata is defined in the design and build phases of development.
2. High-level concepts underpin system content, and metadata is used to describe concepts at various levels of abstraction.
4. Metadata is managed through a user interface (or several); metadata concepts underpin the browse functionality for locating information objects, such as variables and questionnaires, and their relationships and attributes.
5. Uses centralized management supporting single entry and a single source of, e.g., codesets: change once and propagate across the system wherever the codeset is used; flexible, not hard-coded.
8. Applies administrative metadata or 'settings' at different levels in the system.

GSBPM Design phase:
2.1 Design outputs
2.2 Design variable descriptions
2.3 Design data collection methodology
2.4 Design frame and sample methodology
2.5 Design statistical processing methodology
2.6 Design production and systems workflow

Example of metadata and quality dimensions
Criterion: Accuracy
Definition: Names, codesets and other metadata are created and implemented according to the IBSP naming convention, the hierarchy of identifiers and other standards to ensure that metadata are aligned with the established semantics that reflect the business survey management.
Metric: Test different features of the system, such as the Questionnaire Development Application, to verify that the logic required by the various standards is reflected in the system functionality. Gather information during user testing regarding the quality and suitability of codesets to their purpose.
Frequency: Every fiscal year.
Example: A questionnaire can be retrieved by its Survey ID, SDDS number or name.
Level: System
Follow-up: Log any issues concerning the standards and structure in the system and address the issues by working with the Standards Division or by fixing the problem in the metadata registry, as the case dictates.

Quality assurance must be built into all stages of the survey process
Generic Statistical Business Process Model (GSBPM) survey stages:
1. Specify needs
2. Design
3. Build
4. Collect
5. Process
6. Analyze
7. Disseminate
8. Archive
9. Evaluate
Quality Assurance Framework

Examples of GSBPM implementation
Implementation of GSBPM on crude oil in the State Statistical Committee of the Republic of Azerbaijan
Statistics Canada Integrated Business Statistics Program (IBSP) Initiative
• Redesign of the Business Statistics Program.
• Approximately 120 business surveys, including the content redesign of 23 energy surveys.
• Common sampling, collection and processing methodologies, and sharing of common systems and analytical tools.
• Will be done in several phases, to be completed by 2016.

IBSP - Monthly Refined Petroleum Products Survey
• Data on the activities of refineries and other producers of refined petroleum products.
• Consultations with clients and respondents across Canada to identify the statistical needs for the future and to review concepts and respondents' ability to provide the information requested.
• New content includes biofuels contained in light fuel oil, diesel and aviation fuel.
• Meetings with the IBSP collection, processing and analysis working group to identify the processing requirements.

IBSP - Monthly Refined Petroleum Products Survey (continued)
• A report is prepared outlining the sampling methodology, collection strategy, content of the questionnaire and collection edits, standards for classifications and metadata, survey processing, data integration, edit and imputation, adjustments and additions, estimation, reports, dissemination, break in the series, and documentation.
• The design of electronic questionnaires is starting, and field testing will take place this winter, with production/collection to be ready for January 2015.

Next steps…
• Determine whether different or additional energy-related examples and issues are needed.
• Review structure, content, and length of chapter for approval.

Thank you!
Andy Kohut, Director
Manufacturing and Energy Division
Statistics Canada
Section B-8, 11th Floor, Jean Talon Building
Ottawa, Ontario, Canada K1A 0T6
Telephone: 613-951-5858
E-mail: andy.kohut@statcan.gc.ca
www.statcan.gc.ca