New Perspectives on the Quality of Administrative Data
COPAFS Quarterly Meeting
September 21, 2012
Bill Iwig, USDA/NASS

Outline
• FCSM Subcommittee on Statistical Uses of Administrative Data: Where We’ve Been
• Data Quality Working Group: Addressing the Elusive Quality Issue
• Development of the Data Quality Assessment Tool
• Status of Pilot Testing
• Next Steps
• Discussion

FCSM Subcommittee on Statistical Uses of Administrative Data
• Created in late 2007 to identify and document:
  • Agency experiences and practices
  • Areas of needed research
  • Barriers that preclude or raise the cost of administrative records projects
  • Opportunities to collaborate

Subcommittee Products
• Profiles in Success of Statistical Uses of Administrative Data (Prell, et al.; 2009)
• Model Agreement to Share Administrative Records (Cornman, et al.; 2010): Under OMB Review
• Informed Consent: Requirements and Practices for Statistical Uses of Administrative Data (Nickerson, et al.; 2012): Under OMB Review

Data Quality Working Group: Addressing the Elusive Quality Issue
• A consistent and major concern of administrative data users.
• Survey organizations have an understanding about the error properties of their survey data – but not of administrative data.
• Other non-statistical quality dimensions also included in the Data Quality Frameworks of most major statistical organizations.

Development of the Data Quality Assessment Tool
• Data Quality ~ Fitness for Use
• Intended to facilitate decisions by a potential data user about the suitability of the data for identified purpose.
• Intended to prompt additional questions.
• Based on the information gathered, the data user may decide to:
  • Use the data as planned
  • Alter plans
  • Accommodate some quality weakness in the data
  • Not use the data

Development of the Data Quality Assessment Tool
• Multi-Dimensional
  • Relevance
  • Accessibility
  • Interpretability
  • Coherence
  • Accuracy
  • Institutional Environment
• Questions within each dimension are designed to address quality characteristics of micro-level administrative data.
• Example answers provided based on characteristics of a Supplemental Nutrition Assistance Program (SNAP) data set.

Development of the Data Quality Assessment Tool
Discovery Phase
• Evaluation period leading to approval for developing a data-sharing MOU.
Initial Acquisition Phase
• Evaluation period from MOU development approval to first-time receipt of data.
Repeated Acquisition Phase
• Continuing periodic receipt of data.

Putting the Phases and Dimensions Together
• Discovery Phase: 12 questions covering Relevance, Accessibility, and Interpretability, including a request for a Data Dictionary
• Initial Acquisition Phase: 29 questions covering Accessibility, Interpretability, Coherence, Accuracy, and Institutional Environment
• Repeated Acquisition: 11 questions covering Interpretability (updated Data Dictionary), Coherence, Accuracy, and Institutional Environment

What it is and What it isn’t
• Evaluates quality characteristics of an administrative data file.
• Doesn’t address quality characteristics of linked data.
• Doesn’t provide guidance for making fitness for use decisions.
• Doesn’t provide quality improvement recommendations.
• Not specifically designed to address quality of commercial data.

Status of Pilot Testing
• Obtained feedback from three federal/state reviewers prior to pilot testing.
• Pilot tested on:
  • Child Care data files from the Office of Child Care Information Systems at DHHS – Initial Acquisition Phase
  • Wisconsin Child Care Payment System files – All Phases
• Expecting a response for two additional files.

General Comments
• Completing the Data Quality Assessment Tool is perceived to be burdensome to the administrative agency.
• Some agencies may be concerned about data quality implications.
• Many administrative data files are compiled from various data sources.
• Availability of a data dictionary is very important.
• Some questions will be addressed in the Data Sharing MOU.
• Key issue is to understand the content, structure, strengths, and weaknesses of the data. Based on this information, the user then decides if the data can be used for the intended purpose.
• Census Bureau plans to add responses to their Meta-Data Discovery Site for access by researchers exploring potential uses of administrative data.

Next Steps
• Presented at the FCSM/COPAFS Statistical Policy Seminar in December.
• Update based on feedback.
• Make available to the public.
• Pursue opportunities to promote use.

Thank You
Bill Iwig
National Agricultural Statistics Service
Bill.Iwig@nass.usda.gov
202-720-3895/3918