The Data Quality Assessment Framework and IMF’s Dissemination Standards Bulletin Board
Tools and Practices for Collecting and Disseminating Metadata

CCSA CONFERENCE ON DATA QUALITY FOR INTERNATIONAL ORGANIZATIONS, 27-28 APRIL 2006

Prepared by Kim Zieschang, IMF Statistics Department (STA). The views expressed in this presentation are those of the author and should not be attributed to the International Monetary Fund, its Executive Board, or its management.

The IMF Data Standards Program
- Special Data Dissemination Standard (SDDS)
- General Data Dissemination System (GDDS)
- Data Quality Program (DQP)

Background – SDDS
- The Fund initiated the SDDS in 1996 in the wake of the Mexican financial crisis of the mid-1990s
- Response to a need to inform financial markets better, not only about country macroeconomic data themselves, but also information about the data (metadata)
- Focus on dissemination practices: Coverage, Periodicity, Timeliness
- Reserves template added in 1998 (Second Review) following the Asian financial crises of 1997-98
- External debt added in 2000 (Fourth Review)

Background – GDDS
- The Fund initiated the GDDS in 1997 as a developmental framework to improve member countries’ statistical capacities
- Statistical capacity is measured, among other things, by the ability to meet SDDS dissemination (coverage, periodicity, timeliness) requirements
- More extensive socio-demographic metadata than SDDS

Background – The DQP and the Data Quality Assessment Framework (DQAF)
- The DQP was introduced in the Fourth Review of the Data Standards Initiatives (2001) and introduced the DQAF as the framework for the Data Module of the Reports on Observance of Standards and Codes (Data ROSCs)
- The Fifth Review of the Data Standards Initiatives (2003, Supplement 2) introduced an update (July 2003, the current version) to the first version of the DQAF
- The Sixth Review (2005, Supplement 1) sets the DQAF of the DQP as the underlying metadata model of the SDDS and GDDS, as well as the Data ROSCs,
  noting its connection with the emerging Statistical Data and Metadata eXchange (SDMX) standard.

Metadata model of the Data Standards: DQAF
0. Prerequisites of quality
  0.1 Legal and institutional environment
  0.2 Resources
  0.3 Relevance
  0.4 Other quality management
1. Assurances of integrity
  1.1 Professionalism
  1.2 Transparency
    1.2.1 The terms and conditions under which statistics are collected, processed, and disseminated are available to the public.
    1.2.2 Internal governmental access to statistics prior to their release is publicly identified.
    1.2.3 Products of statistical agencies/units are clearly identified as such.
    1.2.4 Advance notice is given of major changes in methodology, source data, and statistical techniques.
  1.3 Ethical standards
2. Methodological soundness
  2.1 Concepts and definitions
  2.2 Scope
  2.3 Classification/sectorization
  2.4 Basis for recording
3. Accuracy and reliability
  3.1 Source data
  3.2 Assessment of source data
  3.3 Statistical techniques
  3.4 Assessment and validation of intermediate data and statistical outputs
  3.5 Revision studies
4. Serviceability
  4.1 Periodicity and timeliness
    4.1.1 Periodicity follows dissemination standards
    4.1.2 Timeliness follows dissemination standards
  4.2 Consistency
  4.3 Revision policy and practice
5. Accessibility
  5.1 Data accessibility
    5.1.1 Statistics are presented in a way that facilitates proper interpretation and meaningful comparisons
    5.1.2 Dissemination media and format are adequate
    5.1.3 Statistics are released on a preannounced schedule
    5.1.4 Statistics are made available to all users at the same time
    5.1.5 Statistics not routinely disseminated are made available upon request
  5.2 Metadata accessibility
  5.3 Assistance to users

Metadata model of the Data Standards: DQAF
- The Generic DQAF contains 3-digit detail not shown in the previous exhibit for 2-digit elements other than 1.2, 4.1, and 5.1
- This detail is suppressed only to show how the DQAF maps to the SDDS/GDDS presentation structure
- The internal organization of SDDS/GDDS metadata will eventually contain the full 3-digit DQAF granularity for the four macroeconomic data categories
- Conversion of existing metadata is proceeding for a first wave of about 10 of the 61 SDDS countries

Dissemination Standards Bulletin Board (DSBB)
- dsbb.imf.org
- Displays a dissemination practices view of metadata
- This structure dates from the beginning of the SDDS (1996) and GDDS (1997)
- Four dimensions and a summary methodology (SDDS)/comprehensive framework (GDDS)

Metadata model of the Data Standards: DSBB dissemination practices presentation
Data
  Coverage
  Periodicity
  Timeliness
Access
  Advance release calendar
  Simultaneous release
Integrity
  Terms and conditions
  Advance access
  Ministerial commentary
  Revisions and advance notice of methodological changes
Quality
  Documentation
  Detail/reconciliation/frameworks
Summary methodology
  Analytical framework, concepts, definitions, and classifications
  Scope of the data
  Accounting conventions
  Nature of the basic data sources
  Compilation practices
  Other aspects

Metadata model of the Data Standards: DQAF mapping to the presentation model
Data
  Coverage (2.1, 2.2, 2.3, 2.4, 3.1, 3.3, 3.4, 5.1.1)
  Periodicity (4.1.1)
  Timeliness (4.1.2)
Access
  Advance release calendar (5.1.3)
  Simultaneous release (5.1.4)
Integrity
  Terms and conditions (0.1, 1.1, 1.2.1, 1.3)
  Advance access (1.2.2)
  Ministerial commentary (1.2.3)
  Revisions and advance notice of methodological changes (1.2.4, 3.5, 4.3)
Quality
  Documentation (5.2)
  Detail/reconciliation/frameworks (4.2, 5.1.5)
Summary methodology
  Analytical framework, concepts, definitions, and classifications (2.1)
  Scope of the data (2.2)
  Accounting conventions (2.4)
  Nature of the basic data sources (3.1, 3.2)
  Compilation practices (3.3, 3.4, 3.5)
  Other aspects (4.2, 4.3)

Metadata model of the Data Standards: DQAF mapping to the presentation model
- The DQAF provides greater “granularity” to the SDDS and GDDS metadata presentation framework
- 2-digit DQAF topics map cleanly into the Data Standards presentation model, except for three 2-digit categories whose 3-digit items map to multiple DSBB presentation items:
  - DQAF 1.2 Transparency goes to the SDDS/GDDS Integrity items:
      Integrity: Terms and conditions (..., 1.2.1, ...)
      Integrity: Advance access (1.2.2)
      Integrity: Ministerial commentary (1.2.3)
      Integrity: Revisions and advance notice of methodological changes (1.2.4, ...)
  - DQAF 4.1 Periodicity and timeliness goes to the SDDS/GDDS Data items:
      Data: Periodicity (4.1.1)
      Data: Timeliness (4.1.2)
  - DQAF 5.1 Data accessibility goes to the following SDDS/GDDS Data, Access, and Quality items:
      Data: Coverage (..., 5.1.1)
      Access: Advance release calendar (5.1.3)
      Access: Simultaneous release (5.1.4)
      Quality: Detail/reconciliation/frameworks (..., 5.1.5)

Metadata model of the Data Standards: DQAF mapping to the presentation model
- The DQAF allows the DSBB to easily use ROSC Detailed Assessment metadata as a basis for updating SDDS and GDDS metadata and, conversely, to use the SDDS and GDDS as a basis for ROSC mission preparation.
- The DQAF will prospectively allow a similar two-way transmission of information between technical assistance documentation and SDDS and GDDS metadata.

Metadata model of the Data Standards: Economic subject area
Real sector
  National accounts
  Production
  Labor market
    Employment
    Unemployment
    Wages
  Price statistics
    CPI
    PPI
Fiscal sector
  General government operations
  Central government operations
  Central government debt
Financial sector
  Analytical accounts of depository corporations
  Analytical accounts of the central bank
  Interest rates
  Share price index
External sector
  Balance of payments
  Reserves template
  Merchandise trade
  International investment position
  External debt
  Exchange rates
Socio-demographic sector
  Population
  Health
  Education
  Poverty

Sixth Review (November 2005) initiatives – SDDS and GDDS metadata
- Implement 3-digit DQAF “granularization” of SDDS and GDDS metadata
- Improve coverage of oil and gas activities and products in the metadata for existing data categories

DQAF and the European Statistics Code of Practice
  IMF DQAF: 51 headings
  Eurostat Statistics Code of Practice: 77 headings
  Detailed cross-domain concepts from merging DQAF and Code of Practice: 106 headings

DQAF, European Code of Practice, and the SDMX
  SDMX version 2.0 guidelines for broad cross-domain concepts now in discussion: 26 headings
  Merged IMF DQAF and Eurostat Code of Practice: 106 headings
    Very general: could serve as a basis for possible SDMX version 2.0 guidelines for detailed cross-domain concepts that allow broad interoperability across virtually all data quality frameworks now available.
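
The interoperability idea already operates inside the Data Standards: the DQAF-to-DSBB mapping shown earlier is a small crosswalk between two structures. The following is a minimal sketch, not an IMF tool, of how that mapping could be held as data and checked mechanically. The dictionary layout and the function name `split_categories` are assumptions made for illustration. The codes are transcribed from the mapping exhibit for the four DSBB dimensions only, with 5.1.3 and 5.1.4 assigned according to the indicator wording in the DQAF outline.

```python
from collections import defaultdict

# DSBB presentation item -> DQAF codes, transcribed from the mapping exhibit.
# The (dimension, item) key layout is an assumption made for this sketch.
DSBB_TO_DQAF = {
    ("Data", "Coverage"): ["2.1", "2.2", "2.3", "2.4", "3.1", "3.3", "3.4", "5.1.1"],
    ("Data", "Periodicity"): ["4.1.1"],
    ("Data", "Timeliness"): ["4.1.2"],
    # 5.1.3 = preannounced schedule, 5.1.4 = simultaneous availability (DQAF outline)
    ("Access", "Advance release calendar"): ["5.1.3"],
    ("Access", "Simultaneous release"): ["5.1.4"],
    ("Integrity", "Terms and conditions"): ["0.1", "1.1", "1.2.1", "1.3"],
    ("Integrity", "Advance access"): ["1.2.2"],
    ("Integrity", "Ministerial commentary"): ["1.2.3"],
    ("Integrity", "Revisions and advance notice"): ["1.2.4", "3.5", "4.3"],
    ("Quality", "Documentation"): ["5.2"],
    ("Quality", "Detail/reconciliation/frameworks"): ["4.2", "5.1.5"],
}


def split_categories(mapping):
    """Return 2-digit DQAF categories whose 3-digit items land in several DSBB items."""
    targets = defaultdict(set)
    for item, codes in mapping.items():
        for code in codes:
            parts = code.split(".")
            if len(parts) == 3:  # a 3-digit indicator such as "1.2.1"
                targets[".".join(parts[:2])].add(item)
    return {cat: sorted(items) for cat, items in targets.items() if len(items) > 1}


split = split_categories(DSBB_TO_DQAF)
for category, items in sorted(split.items()):
    print(category, "->", [name for _, name in items])
```

Run on the transcribed mapping, the check singles out the same three categories the presentation names (1.2, 4.1, and 5.1), which is the case for carrying 3-digit granularity in the stored metadata.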
Proposed Broad SDMX Cross-Domain Concepts
01 Accessibility of documentation
02 Accounting conventions/basis
03 Accuracy
04 Classification systems
05 Comparability/Coherence
06 Confidentiality
07 Contact
08 Data presentation
09 Date of update
10 Dissemination formats
11 Frequency and Periodicity
12 Institutional framework
13 Professionalism and ethical standards
14 Quality management (including resource management)
15 Release calendar
16 Relevance
17 Revision policy and practice
18 Scope / coverage
19 Simultaneous release
20 Source data
21 Statistical concept
22 Statistical processing
23 Supplementary data
24 Timeliness and punctuality
25 Transparency
26 Validation

Illustrative Detail of SDMX “Accuracy”

Possible SDMX detailed cross-domain concept 03.01: Use of accuracy assessments, source data
  SDMX broad cross-domain concept: 03 Accuracy
  DQAF indicator: 3.2.1 Source data, including censuses, sample surveys, and administrative records, are routinely assessed, e.g., for coverage, sample error, response error, and nonsampling error; the results of the assessments are monitored and made available to guide statistical processes.
  European Statistics Code of Practice indicator: 12.1 Source data, intermediate results and statistical outputs are assessed and validated [12.1.1 Use of accuracy assessments, source data]

Possible SDMX detailed cross-domain concept 03.02: Measurement and documentation of sampling and non-sampling errors
  SDMX broad cross-domain concept: 03 Accuracy
  DQAF indicator: 3.2.1 Source data, including censuses, sample surveys, and administrative records, are routinely assessed, e.g., for coverage, sample error, response error, and nonsampling error; the results of the assessments are monitored and made available to guide statistical processes.
  European Statistics Code of Practice indicator: 12.2 Sampling errors and non-sampling errors are measured and systematically documented according to the framework of the ESS quality components

Possible SDMX detailed cross-domain concept 03.03: Use of accuracy assessments, intermediate data, validation
  SDMX broad cross-domain concept: 03 Accuracy
  DQAF indicator: 3.4.2 Statistical discrepancies in intermediate data are assessed and investigated.
  European Statistics Code of Practice indicator: 12.1 Source data, intermediate results and statistical outputs are assessed and validated [12.1.2 Use of accuracy assessments, intermediate data]

Possible SDMX detailed cross-domain concept 03.04: Use of accuracy assessments, statistical outputs
  SDMX broad cross-domain concept: 03 Accuracy
  DQAF indicator: 3.4.3 Statistical discrepancies and other potential indicators or problems in statistical outputs are investigated.
  European Statistics Code of Practice indicator: 12.1 Source data, intermediate results and statistical outputs are assessed and validated [12.1.3 Use of accuracy assessments, statistical outputs]

Possible SDMX detailed cross-domain concept 03.05: Documentation of revisions
  SDMX broad cross-domain concept: 03 Accuracy
  DQAF indicator: 3.5.1 Studies and analyses of revisions are carried out routinely and used internally to inform statistical processes (see also 4.3.3).
  European Statistics Code of Practice indicator: 08.6 Revisions follow standard, well-established and transparent procedures

Concluding remarks
- Interoperability of metadata frameworks requires maintaining sufficient “granularity” in the topical structure of the framework
- Merging the IMF DQAF and the Eurostat Statistics Code of Practice may take us a long way toward determining what metadata topical “granules” we need
- Although SDMX technical standards permit successful electronic data transmission of datasets and sub-cubes of datasets, interoperability requires international agreement at the granular level
- Both the SDMX technical standards and the content-oriented guidelines (granular cross-domain concepts) are needed to realize the potential of SDMX to:
  - Greatly reduce the cost of metadata capture by the IMF and other international organizations and international bodies
  - Greatly reduce the respondent burden on the national reporters of metadata
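
To make the granularity point concrete, the illustrative Accuracy detail above can be read as a crosswalk between the two frameworks. The sketch below is an illustration under stated assumptions, not an SDMX artefact: the `DetailedConcept` record and the helpers `translate` and `detailed_codes` are hypothetical names, and the rows are transcribed from the five notional codes 03.01 to 03.05, using the top-level Code of Practice indicator for each row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetailedConcept:
    code: str   # notional SDMX detailed cross-domain code, e.g. "03.01"
    label: str  # possible SDMX detailed cross-domain concept
    dqaf: str   # DQAF indicator
    cop: str    # European Statistics Code of Practice indicator


# Rows transcribed from the illustrative "Accuracy" detail in this presentation.
ACCURACY = [
    DetailedConcept("03.01", "Use of accuracy assessments, source data", "3.2.1", "12.1"),
    DetailedConcept("03.02", "Measurement and documentation of sampling and "
                             "non-sampling errors", "3.2.1", "12.2"),
    DetailedConcept("03.03", "Use of accuracy assessments, intermediate data, "
                             "validation", "3.4.2", "12.1"),
    DetailedConcept("03.04", "Use of accuracy assessments, statistical outputs",
                    "3.4.3", "12.1"),
    DetailedConcept("03.05", "Documentation of revisions", "3.5.1", "08.6"),
]


def translate(indicator, source, target, rows=ACCURACY):
    """Indicators in the `target` framework paired with `indicator` in `source`."""
    return sorted({getattr(r, target) for r in rows if getattr(r, source) == indicator})


def detailed_codes(indicator, source, rows=ACCURACY):
    """Notional detailed codes that carry `indicator` from the `source` framework."""
    return [r.code for r in rows if getattr(r, source) == indicator]


# One DQAF indicator corresponds to two Code of Practice indicators, and one
# Code of Practice indicator to three DQAF indicators.
print(translate("3.2.1", "dqaf", "cop"))
print(translate("12.1", "cop", "dqaf"))
print(detailed_codes("12.1", "cop"))
```

Because neither framework's indicators nest inside the other's, metadata filed under a single heading in one framework cannot be re-filed in the other without the finer detailed concepts; the notional codes act as the shared key.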