Olivier Dupriez
World Bank, Development Data Group
Manager, International Household Survey Network (IHSN)
Addis Ababa, September 23, 2011

Two main components
◦ Metadata Editor: specialized software for documenting any kind of microdata (surveys, censuses, administrative records)
◦ National Data Archive (NADA): an open source application for cataloguing and dissemination
◦ (CD-Builder for dissemination)
◦ Compliant with the DDI/DCMI (XML) standards (Data Documentation Initiative and Dublin Core)

XML metadata standards
Standard checklists of what you need to know about a study and its dataset (DDI), and about the related resources (DCMI)
DDI developed by academic data centers
Now used in most countries in the world, and by various software applications (e.g. DevInfo, CSPro)
Two versions of DDI:
◦ Version 2.n (DDI codebook), used by the Toolkit
◦ Version 3.n (DDI life cycle)

“The National Statistics Office (NSO) of Popstan conducted the Multiple Indicator Cluster Survey (MICS) with the financial support of UNICEF. 5,000 households, representing the overall population of the country, were randomly selected to participate in the survey, following a two-stage stratified sampling methodology.
4,900 of these households provided information.”

In XML, this could look like:

<titl>Multiple Indicator Cluster Survey 2005</titl>
<altTitl>MICS 2005</altTitl>
<AuthEnty>National Statistics Office (NSO)</AuthEnty>
<fundAg abbr="UNICEF">United Nations Children's Fund</fundAg>
<nation>Popstan</nation>
<geogCover>National</geogCover>
<sampProc>5,000 households, stratified two stages</sampProc>
<respRate>98 percent</respRate>

Can be transformed into many kinds of outputs:
◦ HTML
◦ PDF
◦ Databases
◦ Others
Plain text files, not specific to any operating system or application (“durable” metadata)

Metadata Editor
◦ By Nesstar Ltd (“Nesstar Publisher”) with IHSN support
◦ Now freeware
◦ Development benefited from many users’ feedback
◦ Available at www.ihsn.org/toolkit
NADA, (CD-Builder)
◦ By the World Bank / IHSN
◦ Available at www.ihsn.org/nada

The Metadata Editor is a tool for preparing and packaging your data and metadata, not a tool for dissemination!

This DDI (+DCMI) file is ready to be “transformed”, e.g. by being published in a NADA catalog.
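To make the idea of "transforming" metadata concrete, here is a minimal Python sketch that reads the study-level elements from the Popstan example and renders them as an HTML definition list. It is an illustration only, not how the Metadata Editor or NADA work internally. The `<study>` wrapper, the `LABELS` mapping, and the function names `read_fields` and `to_html` are assumptions made for this example; a real DDI file nests these elements inside a larger, schema-defined structure.

```python
import xml.etree.ElementTree as ET
from html import escape

# The elements below come from the slide; the <study> wrapper is a
# hypothetical root added only so the fragment parses as one document.
DDI_FRAGMENT = """
<study>
  <titl>Multiple Indicator Cluster Survey 2005</titl>
  <altTitl>MICS 2005</altTitl>
  <AuthEnty>National Statistics Office (NSO)</AuthEnty>
  <fundAg abbr="UNICEF">United Nations Children's Fund</fundAg>
  <nation>Popstan</nation>
  <geogCover>National</geogCover>
  <sampProc>5,000 households, stratified two stages</sampProc>
  <respRate>98 percent</respRate>
</study>
"""

# Display labels chosen for this sketch (illustrative wording, not part of DDI).
LABELS = {
    "titl": "Title",
    "altTitl": "Alternative title",
    "AuthEnty": "Primary investigator",
    "fundAg": "Funding agency",
    "nation": "Country",
    "geogCover": "Geographic coverage",
    "sampProc": "Sampling procedure",
    "respRate": "Response rate",
}


def read_fields(xml_text):
    """Return {tag: text} for every child element of the root."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


def to_html(fields):
    """Render the fields as a simple HTML definition list."""
    rows = [
        f"  <dt>{escape(LABELS.get(tag, tag))}</dt><dd>{escape(value)}</dd>"
        for tag, value in fields.items()
    ]
    return "<dl>\n" + "\n".join(rows) + "\n</dl>"


fields = read_fields(DDI_FRAGMENT)
html_page = to_html(fields)
print(html_page)
```

Because the metadata is plain, tagged text, the same `fields` dictionary could just as easily feed a PDF generator or a database insert; only the rendering function would change.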
Replicability, transparency
Visibility
Credibility
Institutional memory
Knowledge generation (if microdata are disseminated)
Increase and demonstrate the value of data, leading to more funding
Satisfy a legal requirement in some countries
Participate in the Open Data / Data Liberation movement

Reports, tables (PDF): Web development tool
On-line tabulation (and analysis) tool: REDATAM, SuperStar, Nesstar, Tableau, etc.
Indicators: CensusInfo, DevInfo, etc.
Microdata (n% sample): IHSN Metadata Editor and NADA
Metadata: IHSN Metadata Editor and NADA
Microdata, full, raw and edited versions: IHSN Metadata Editor

Guidelines for documenting a dataset using the IHSN Toolkit
http://www.ihsn.org/home/index.php?q=tools/documentation
Formulating an access policy and procedures
http://www.ihsn.org/home/index.php?q=focus/dissemination-microdata-files-principles-procedures-and-practices
Long term preservation of data and metadata
◦ Based on OAIS “standard”
◦ Complex; useful as a “technical audit manual”
http://www.ihsn.org/home/index.php?q=tools/preservation
Country experience: Statistics Canada’s Data Liberation Initiative (forthcoming)
Other IHSN manuals (being drafted):
◦ Producing public use census sample files
◦ Anonymizing microdata

Countries
◦ Comply with the DDI standard
◦ Produce a sample dataset (n%) for public (free) dissemination of microdata
◦ Publish a formal microdata management and dissemination policy
◦ Assess your preservation policy/procedures
◦ Preserve all versions of your census data
International agencies
◦ Develop a central census catalog (UNSD?)
◦ Develop anonymization guidelines
◦ Support the establishment of data archives
Accelerated Data Program (PARIS21/WB)
◦ Training, technical support to data archiving
◦ Contacts:
Olivier Dupriez at the World Bank (odupriez@worldbank.org)
Francois Fonteneau at PARIS21 (francois.fonteneau@oecd.org)