The Data Warehouse of Banca d’Italia
Guiding Principles and Architecture of an Integrated Statistical Warehouse

Vincenzo Del Vecchio
Banca d’Italia, Statistics Collection and Processing Department
vincenzo.delvecchio@bancaditalia.it
2012 ESSnet Workshop – 4 December 2012

Agenda
1. Guiding principles
2. Architecture of the statistical data warehouse of the Bank of Italy

1.1 – Integrated Approach
[Diagram: data flows into and out of the integrated warehouse]
Sources:
• Reporting units: banks & OFIs (> 4,000), enterprises & families (> 15,000), individuals (> 150,000)
• Other institutions (IMF, OECD, ECB, BIS, Eurostat, ISTAT, …)
• Market providers (Bloomberg, IBCA, Enterprise Register, …)
• Internal sources (payment system, accounting system, …)
Statistical domains shown in the warehouse, each with its DEFINITIONS and DATA: Institutional statistics, Economic analysis, Central banking, Payment system, Supervision, C.C.R. (> 1 billion observations / year)
Outputs:
• BI users (research, supervision, markets; > 2,500 users)
• Public data (> 750,000 inquiries / year)
• Return flows (to > 5,000 reporting agents)
• Other flows (to other institutions)

1.1 – Integrated Approach
• Information is shared by many organizational functions and is accessible to users who hold the relevant rights;
• Data are collected and processed minimizing redundancies.
The integrated use of data from different sources and the reuse of data for many purposes are fostered through:
– Organizational measures (statistical committee, specialized units for warehouse administration, …)
– Methodological and technical measures (reference information model, common data dictionary, …)
– Harmonization of concepts, code lists and data contents (concepts and data administration)

1.2 – Information completeness
[Diagram: levels of information, from the real world up to the model of meta-models]
L4  Model of meta-models
L3  Model of data definitions (“meta-model”: how to make definitions); example: Object group: property
L2  Data definitions (“dictionary”); example: Cars by colour: percentage
L1  Data observations; example: Green cars: 40%
L0  Real world
Information Model: Matrix
Roles: Stats Definer, Stats Producer, Stats User

1.3 – Active Definitions (model-driven software)
[Diagram labels: Information Model Administrator, DEFINITIONS, System, Software Services, DATA, User]
• Automation
• Time to market
• Accurate and up-to-date

1.4 – User oriented model & languages
[Diagram labels: Information Model Administrator, DEFINITIONS, User]
• Information Model Administrator: Subject Matter Expert
• Independent of the IT implementation and the IT people
• Based on Mathematics & Statistics

1.5 – Unique model and approach (integration of methods and techniques)
[Diagram labels: Information Model Administrator, DEFINITIONS, DATA, Software Services]
• Statistical domain: Monetary & financial, Balance of Payments, …
• Data type: Quantitative / qualitative, Periodical / not, Multidimensional, Time series, Registers, Questionnaires, …
• Use: Definition, Extraction, Collection, Transformation, Storage, Transmission, Compilation, Dissemination

1.6 – Historical representation
Two different histories:
• of the real world (e.g. when something is true or false)
• of the information system (e.g. when something is known or unknown)
History of all the I.S. contents:
• definitions
• data observations

2.1 – The current reference architecture
• A unique information model, the Matrix model (designed and maintained by the Bank of Italy), describing concepts, data structures and algorithms for validation and calculation;
• A unique data dictionary, a database structured according to the Matrix model, storing the users’ definitions;
• A logically unique warehouse storing the data observations;
• A common software platform, Infostat, made of reusable services driven by the users’ definitions stored in the data dictionary.
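
The idea of reusable services driven by the users' definitions can be illustrated with a minimal sketch. It is not Infostat code and does not follow the actual Matrix model: the `Definition` class, the `DICTIONARY` content and the `validate` function are hypothetical names chosen for illustration, and the example reuses the "Cars by colour: percentage" definition from slide 1.2. The point it tries to show is that the validation service contains no survey-specific logic; every rule it applies is read from the dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """Hypothetical dictionary entry: one data structure and its rules."""
    dimensions: dict      # dimension name -> set of allowed codes (code list)
    measure: str          # name of the measured property
    valid_range: tuple    # inclusive (min, max) allowed for the measure


# Illustrative "data dictionary": definitions are data, maintained by
# subject matter experts rather than hard-coded in the services.
DICTIONARY = {
    "CARS_BY_COLOUR": Definition(
        dimensions={"colour": {"green", "red", "blue"}},
        measure="percentage",
        valid_range=(0.0, 100.0),
    ),
}


def validate(structure: str, observation: dict, dictionary: dict = DICTIONARY) -> list:
    """Generic validation service: all rules come from the dictionary."""
    definition = dictionary.get(structure)
    if definition is None:
        return [f"unknown data structure: {structure}"]

    errors = []
    # Each dimension value must belong to the code list of that dimension.
    for dim, codes in definition.dimensions.items():
        value = observation.get(dim)
        if value not in codes:
            errors.append(f"{dim}={value!r} is not in the code list")

    # The measure must be numeric and fall inside the defined range.
    measure = observation.get(definition.measure)
    low, high = definition.valid_range
    if not isinstance(measure, (int, float)) or not low <= measure <= high:
        errors.append(f"{definition.measure}={measure!r} is outside [{low}, {high}]")
    return errors


if __name__ == "__main__":
    # "Green cars: 40%" satisfies the definition; the second observation does not.
    print(validate("CARS_BY_COLOUR", {"colour": "green", "percentage": 40}))
    print(validate("CARS_BY_COLOUR", {"colour": "pink", "percentage": 140}))
```

Under these assumptions, supporting a new survey means adding an entry to the dictionary, while `validate` stays unchanged. That is the sense in which the definitions are "active".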

2.2 – User application architecture
[Diagram: user applications combine processes and software services on top of the data warehouse]
• PROCESSES (BPEL, GSBPM): Receive, Send, Check, Remarks, Release Microdata, Calculate Macrodata, Calculate Indicators, Send
• User applications
• SOFTWARE SERVICES (W3C, WS-I): Define, Send, Receive, Calculate & Check, Release, Monitor, Inquiry, Import/export, …
• DATA WH.: Data Definitions, Data Observations, Calc. Algorithms
• DEFINITIONS and DATA formats: MATRIX (SDMX, XBRL, CSV, …)

Warehouse Administration
[Diagram: Statistical community / Information system “A” and Statistical community / Information system “B”, each with its own DEFINITIONS and DATA, on top of common software services]

2.2 – Graph of Data and Algorithms
[Figure: graph of data (D) and algorithm (A) nodes connecting Banks & OFIs’ reports; C.C.R.; IMF, OECD, ECB, …; Economic research models; Statistical bulletin; Statistical products; Supervision models]

Supporting more Warehouses
[Diagram: statistical communities, each with DEFINITIONS and DATA, split into Specific Data and Shared Data]
• B.I. Institut. Functions: Economic analysis, Central banking, Supervision, Payment system, …
• F.I.U.
• ESCB (RIAD)
• Other Italian Instit.

INFOSTAT architecture
[Diagram: actors and service groups around the warehouse]
• Information provider: Collection & validation (User interface, A2A, Data entry, Messages upload, remarks download)
• Information consumer: Dissemination (User interface, A2A, Notifications, Alerts, Inquiry, search, Collaboration, Data services)
• Core: Format conversion, Workflow engine, Checks, Warehouse, Calculation engine
• Metadata administrator: Metadata administration (Data definition, Metadata import/export)
• Data analyst: Analysis & reporting (Inquiry, search, analysis tools, Data services)
• Operations administrator: Regular data production (Monitor, Report generation, Process monitor)

Brief history of the IT support for Statistics
THE ’60s & ’70s
• first IT solutions
• first “active” software (metadata driven)
THE ’80s
• launch of the integrated approach
• the Matrix schema and the first integrated solutions
THE ’90s
• integration of many silo applications
• evolution of the Matrix Model
• support to GESMES-CB standardization
THE 2000s
• Statistical Dictionary
• support to SDMX and XBRL standardization
THE 2010s
• INFOSTAT: a service oriented platform
  – 2009: data collection and data quality services
  – 2012-13: full set of services
• From 2009 on: migration of old surveys and databases

Thank you!
Vincenzo