House of Microdata (HoM) and Research Data and Service Centre (RDSC) at the Deutsche Bundesbank: a Draft Concept

Workshop on Integrated Management of Micro-Databases, hosted by the Banco de Portugal, Porto, June 20-22, 2013
Ulf von Kalckreuth, Deutsche Bundesbank

A mandate

Mandate for the Statistics Dept in medium-term strategic planning: enhance the availability of micro data both for scientific research and for analytical tasks.

Background:
❙ New focus on financial stability: need for granular information by Banking Supervision, the new Financial Stability Dept and the ECB in its new role
❙ Bundesbank Research Centre asking for support

Problems to solve

❙ Data management in the Stat Dept and elsewhere is not conceived to support micro-level analysis: micro data are intermediate products in a process of creating statistical aggregates
❙ No well-structured data sets ready for analysis
❙ Documentation for use by externals incomplete or missing
❙ Use of microdata is subject to complicated sets of confidentiality rules. The purpose of data use is important.
The level of confidentiality must be systematically assigned to groups of users.
❙ Data are created in different processes which are not integrated and therefore cannot be analysed simultaneously

Problems to solve

Working with new micro data is extremely time-intensive:
→ Data sets are underused
→ Certain questions are not addressed at all
→ Users are liable to misinterpret data, which makes the analysis misleading

Suggestion

Set up a House of Microdata with the following tasks:
❙ Compiling, documenting and archiving micro data collections that are informative for external research and analysis
❙ Providing a platform for rapid compilation of microdata for varying purposes: analytical (external) and cross-validation (internal)
❙ Enhancing existing data by record linkage, also based on external data
❙ Providing the micro data collections to internal and external users, within the legal bounds relevant for each case
❙ Methodological and substantive (data content and interrelations) support
❙ Analytical work on behalf of internal clients in those cases where clients either do not have legal access or where evaluation by the clients themselves is impossible due to time constraints or insufficient data expertise
❙ Limited amount of own research, descriptive in nature and focussed on the data provided by the unit

Suggestion

Create a potential for
❙ Receiving and disseminating data from other Depts and external sources
❙ Exchange of research data within the ESCB

Suggestion

❙ The Statistics Dept cannot answer all possible questions.
The production and dissemination of micro data collections fit for analysis by externals is a new statistical product sui generis: the result of the statistical process is not a set of indicators but an information basis for users to work out their own results, answering their own questions following their own criteria.
❙ Evaluations on behalf of externals are a classical product: moments and quantiles of distributions are derived from the available granular information
❙ Own research is needed to enhance methodological competence and knowledge of the data, to communicate with researchers, to understand the research agenda and to be motivated to create highly relevant data

1) A Research Data and Service Centre (RDSC)

❙ Changing statistical processes takes time and consumes human resources. Data recording and formats may have to be adapted. Legal constraints have to be overcome
❙ A first (limited) stage leaves statistical processes as they are
❙ Start providing data where there is experience with scientific usage
❙ Data would be constituted as scientific use files, periodically saved (data freeze), archived and documented
❙ Defining user rights and limitations, setting up an application procedure
❙ Manual record linkage on the basis of available characteristics
❙ Progressively develop new data collections
❙ Long time lags and limited linkage: in terms of flexibility, timeliness and simultaneous analysis, this is still very much the old world

Diagram labels: process data bases; analytical data bases; record linkages

1) A Research Data and Service Centre (RDSC)

Sub-Unit 1: Experts for the relevant data sets, with the task of setting up and developing the research data collections, the documentation and the interaction with users in matters of data content.
They would also do evaluations on demand and a certain amount of own research.

Sub-Unit 2: Staff for the technical side of the dissemination process, supporting
❙ Evaluation from a distance
❙ Safe computers "on site"
❙ Provision of de facto or fully anonymised scientific use files
❙ Possibly exchange with other research centres in "safe rooms"

1) A Research Data and Service Centre (RDSC)

Sub-Unit 3: Record linkage and validation, including the enhancement by additional characteristics, such as sector information

Substantive areas, e.g.:
❙ Company-level information from balance sheets and BoP statistics
❙ Financial institutions
❙ Securities
❙ Survey data (household wealth survey, travel survey, payment survey)

1) A Research Data and Service Centre (RDSC)

❙ The RDSC on its own will address some of the urgent needs, while leaving other problems by the wayside
❙ It will put a new focus on the use and the information content of granular information
❙ Manual record linking is a crutch: slow and limited
❙ Legal problems for matching data on companies (no access to the NSI's company register) and banks (dual purpose of bank data: supervision and statistics)

2) Integrated micro data management

❙ In the longer run, statistical processes can be adapted to make possible an integrated processing of micro data.
❙ The information available on any observational unit can be addressed simultaneously and without delay
❙ Data repositories could be centralised or kept decentralised
❙ The potential of an RDSC would increase exponentially, and record linkage would not create additional work as far as internal information is concerned
❙ Evaluation and quality control: increasing informational content by cross-checking, reducing inconsistencies

Important ideas from a visit to the Banco de Portugal

❙ Common reference data bases
❙ Introducing a distinct exploration level into the various stat. processes
  ❙ Read-only
  ❙ Validated
  ❙ Subset relevant for analysis
❙ An intermediate step may be to make common identifiers available: record linkage could then be done in a satellite environment (data repositories within the RDSC)

Four level concept within one business area

Each level is given for a generic business unit, followed by the example of bank balance sheets:
1. File reception: micro data from reporting (example: BMI reports)
2. Processing: data to be validated and recombined, visible only for the business unit (example: BASWeb operative system)
3. New: Analysis: finalised data used for cross-checking in the Stat Dept and as data warehouse for external analysis (example: "House of micro data")
4. Publication: dissemination (example: ZIS time series)

Four level concept between various business areas

Key features of integrated micro data management

❙ Shared reference information
  ❙ E.g. by using an SDMX framework
❙ Overarching four level framework
❙ Common model instead of a common platform
❙ Implemented by each business unit locally, respecting specific restrictions

Success factors as seen from the experience in Portugal

❙ "Step by step" instead of "big bang"
❙ Unit by unit, beginning with the most important and the easiest cases
❙ Sequential implementation on the basis of existing infrastructure
❙ Integrating only where there is value added
❙ Cost-benefit analysis
❙ New areas need to follow a common blueprint
❙ Overall, a disciplining effect

Summary

A vision for the House of Microdata:
1. The RDSC, a separate business unit, specialising in the interchange with external analytics
2. Integrated microdata management as a joint feature of all business units
❙ Common reference data
❙ Harmonised level for data analysis in all business units for information crossing

Many thanks for listening and for discussing!
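
Illustration: the slides suggest that record linkage becomes straightforward once business units share common identifiers. The following minimal Python sketch shows that idea on two tiny made-up data sets. The field name entity_id, the sample values and the function link_records are assumptions chosen for illustration only; they are not part of the Bundesbank concept or of any existing system.

```python
from typing import Dict, List

# Hypothetical records from two separate statistical processes.
# "entity_id" stands in for a common identifier from shared reference data.
balance_sheets = [
    {"entity_id": "C001", "total_assets": 120.0},
    {"entity_id": "C002", "total_assets": 45.5},
]
bop_reports = [
    {"entity_id": "C001", "foreign_claims": 10.0},
    {"entity_id": "C003", "foreign_claims": 7.2},
]


def link_records(left: List[Dict], right: List[Dict], key: str) -> List[Dict]:
    """Inner join on a shared identifier: keep units present in both sets."""
    # Index the right-hand data set by identifier for constant-time lookup.
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the characteristics of both sources into one record.
            linked.append({**row, **match})
    return linked


linked = link_records(balance_sheets, bop_reports, key="entity_id")
print(linked)
```

Only the unit C001 appears in both sources, so it is the only record in the linked result. Without a shared identifier, the same match would have to be made manually on the basis of available characteristics, which is the slow and limited route described for the first stage.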