United Nations Economic Commission for Europe
Statistical Division

Measuring and Communicating Data Quality

UNECE Training Workshop on Dissemination of MDG Indicators and Statistical Information
Astana, Kazakhstan, 23-25 November 2009
Steven Vale, UNECE

Contents
What is quality?
How can we measure quality?
How should we report and communicate quality?

Steven Vale - UNECE Statistical Division Slide 2

Which is the Best Quality?

Definition of Quality
International Standard ISO 9000:2005 defines quality as:
'The degree to which a set of inherent characteristics fulfils requirements.'

What Does This Mean?
Whose requirements?
  • The user of the goods or services
A set of inherent characteristics?
  • Users judge quality against a set of criteria reflecting the different characteristics of the goods or services
So quality is all about providing goods and services that meet the needs of users (customers)

Quality Criteria

Quality Criteria for Statistics
Different statistical organisations use different criteria, but the lists of criteria are quite similar
UNECE list:
Relevance
Accuracy
Timeliness
Punctuality
Comparability
Clarity
Accessibility

Relevance
Are the statistics that are produced needed?
Are the statistics that are needed produced?
Do the concepts, definitions and classifications meet user needs?

Accuracy
The closeness of statistical estimates to true values
In the past: Quality = Accuracy
Now accuracy is just one part of quality

Timeliness
The length of time between data being made available and the event or phenomenon they describe
Punctuality
The time lag between the actual delivery date and the promised delivery date

Comparability
The extent to which differences are real, or due to methodological or measurement differences
  • Comparability over time
  • Comparability through space (e.g. between countries / regions)
  • Comparability between statistical domains (sometimes referred to as coherence)

Accessibility
The ways in which users can obtain or benefit from statistical services (pricing, format, location, language etc.)
Clarity
The availability of additional material (e.g. metadata, charts etc.) to allow users to understand outputs better

Importance of Accessibility
Not just about making data available on the Internet or in a book
  • Passive accessibility
Accessibility is about bringing data to users in an understandable way, opening a dialogue with those users, and ensuring that their information needs are met
  • Active accessibility

Accessibility Should Include:
Communicating
Marketing
Interpreting
“Story-telling”
Informing
Educating

Accessibility and Visualization
Good visualizations make data accessible to many more users
Bad visualizations are unhelpful / misleading
“Self-service” visualization needs to be simple, with guidance to help users get meaningful results
“Ready-made” visualizations can be more complex, tailored to specific data sets

Accessibility and Visualization
Is it more cost-effective to:
  • develop “ready-made” graphics, or
  • offer users more “self-service” functionality?
Many users don’t have the time or knowledge to produce good visualizations
Advanced users have access to their own visualization and analysis tools

Importance of Clarity
Clarity is all about explaining data
Do current explanatory notes help?
  • Often written by specialists for specialists
  • Full of jargon
  • Too long
  • Too boring!
Simplified, plain-text versions needed

Other Considerations
Cost / efficiency
Integrity / trust
Reputation of the organization
Professionalism
  • Adherence to international standards (e.g. UN Fundamental Principles of Official Statistics)

Quality is not just about outputs
Input, Process, Output
To have good outputs we need to have good inputs and processes, so we need to think about the quality of these as well

Quality of Inputs
Timeliness
Completeness: are there any missing units or variables?
Comparability with other sources
Quality check survey?
Knowledge of the source is vital!

Quality of Processing
Quality of matching / linking
Outlier detection and treatment
Quality of data editing
Quality of imputation
Keep raw data / metadata to refer back to if necessary

Quality of Outputs
Are the users satisfied?
Are the outputs comparable with data from other sources?
What is the impact on time series?
Are the outputs cost-effective?
Quality reports to measure and communicate differences?

Measuring Quality
Quantitative methods
  • E.g. confidence intervals
User surveys
Self evaluation
Benchmarking

Quantitative Measures
The tops of the bars indicate estimated values and the red lines represent the confidence intervals surrounding them.

UNECE Database User Survey
Launched each autumn on database web site
10 questions
150 responses (target 100)

Exercise
Design a user survey with up to 10 questions for users of your web site
20 minutes

UNECE User Survey Questions
1. Type of user
2. Frequency of use
3. Location (country)
4. Type of data
5. Database relevance
6. Timeliness

Continued...
7. Clarity (metadata)
8. Overall data quality
9. User interface
10. Other comments and questions

Results: Type of user
Chart categories: Media, Individual, Other, Private business, National Statistical Office, National government, International organization / NGO, Student, Academic / research

Results: Frequency of use

Results: Location

Results: Data quality
Excellent 18%
Good 63%
Average 17%
Poor 1%
Very poor 1%

Results: User interface
Excellent 15%
Good 60%
Average 23%
Poor 1%
Very poor 1%

Improving Our Services
Better timeliness of data
New “Country Overview” data cube to give quick access to key indicators
More content in Russian
Improved user interface
More and better metadata
Statistical literacy

Self-evaluation
Relatively quick and cheap
Is it sufficiently objective?
Needs a standard framework to ensure comparability of quality assessments
  • Eurostat DESAP check list: http://epp.eurostat.ec.europa.eu/portal/page/portal/quality/documents/desap%20G0LEG-20031010-EN.pdf

Benchmarking
Comparing data values or data production processes between two sources
Differences can be studied to try to find ways to improve quality

Benchmarking Between Countries
Fairly cheap and easy way to get ideas on how to improve statistical processes
Mutual benefit: “win - win”
Helps to improve international cooperation
May lead to joint development projects

Communicating Quality
Quality Reports
  • Summary: “traffic light” indicator
      Red: Serious quality issues, read the quality report before using
      Orange: Caution, do not use for important decisions without reading the quality report
      Green: Good quality
  • Intermediate: short quality report (1000 words maximum)
  • Detailed: full quality report

Detailed Quality Reports
Should cover all components of quality
Should be written for the user
Should be easily accessible
Should follow a standard template

Exercise
What should be covered in a detailed quality report?
  • List the topics that should be included
10 minutes

ESQR Contents (1)
Introduction to the statistical process and its outputs
Relevance
Accuracy
Timeliness
Punctuality
Accessibility
Clarity

ESQR Contents (2)
Comparability
Trade-offs between quality components
Assessment of User Needs and Perceptions
Performance, Cost and Respondent Burden
Confidentiality, Transparency and Security
Conclusion

Summary
Quality is all about meeting user needs
There are many different aspects to quality, some of which may be in conflict
  • E.g. Timeliness versus Accuracy
There are various ways of measuring quality; user views are important
Quality should be communicated to users in a way they can understand

Which is the Best Quality?
It depends what the user needs!

Questions?
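
Worked example: confidence intervals
The slides name confidence intervals as a quantitative quality measure. As a minimal sketch, assuming a simple random sample and the normal approximation (neither is stated in the slides), a 95% interval for a survey proportion could be computed as below. The function name is hypothetical, and pairing the 63% "Good" rating with the 150 survey responses is an illustrative assumption; a self-selected web survey would not strictly satisfy the sampling assumption.

```python
import math


def proportion_confidence_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    p_hat: observed proportion (between 0 and 1)
    n:     number of responses
    z:     critical value (1.96 corresponds to roughly 95% confidence)
    """
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Illustrative input (assumption): 63% of 150 respondents rated quality "Good".
low, high = proportion_confidence_interval(0.63, 150)
print(f"Estimate 63%, interval {low:.1%} to {high:.1%}")
```

The width of the interval shrinks as the number of responses grows, which is one reason to report the sample size alongside the estimate.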
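
Worked example: timeliness and punctuality
The definitions of timeliness and punctuality given earlier both reduce to a difference between two dates. The sketch below is one possible way to compute them; the function names and all dates are hypothetical, chosen only for illustration.

```python
from datetime import date


def timeliness_days(reference_period_end, release_date):
    """Days between the end of the period described and the data release."""
    return (release_date - reference_period_end).days


def punctuality_days(promised_release, actual_release):
    """Days between the promised and the actual delivery date.

    Zero means on time; a positive value means the release was late.
    """
    return (actual_release - promised_release).days


# Hypothetical release calendar entry (assumed dates, not from the slides).
reference_end = date(2009, 6, 30)
promised = date(2009, 9, 30)
actual = date(2009, 10, 5)

print("Timeliness (days):", timeliness_days(reference_end, actual))
print("Punctuality (days):", punctuality_days(promised, actual))
```

Keeping the two measures separate makes the difference visible: a release can be punctual but not timely if the promised date itself is long after the reference period.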
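
Worked example: traffic light summary
The summary-level "traffic light" indicator can be attached to a data set as a simple lookup. This is only a sketch: the class and function names are hypothetical, the message texts are taken from the Communicating Quality slide, and the slides do not say how a colour is assigned, so that decision is left to the caller.

```python
from enum import Enum


class TrafficLight(Enum):
    """Summary quality indicator; messages follow the slide wording."""
    RED = "Serious quality issues, read the quality report before using"
    ORANGE = ("Caution, do not use for important decisions "
              "without reading the quality report")
    GREEN = "Good quality"


def summary_line(dataset_name, light):
    """Format a one-line quality summary to display next to a data set."""
    return f"{dataset_name}: {light.name} - {light.value}"


# "Example dataset" is a placeholder name, not a real UNECE data set.
print(summary_line("Example dataset", TrafficLight.GREEN))
```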