InSPIRe
Australian initiatives for standardising statistical processes and metadata
Simon Wall
Australian Bureau of Statistics
December 2012

Challenges facing NSIs
• Riding the big data wave
• New competitors & changing expectations
• Rapid changes in the environment
• Increasing cost & difficulty of acquiring data
• Competition for skilled resources
• Reducing budget

Challenges facing ABS
• Reduce the cost and time of doing business
• Grow the business through new statistical products and services
• Deliver the first large-scale digital Census (2016) on time, to budget and to quality, while delivering Business As Usual.

How will we get there?
• Through large-scale innovation across the whole ABS
• We will:
  – radically transform the way we acquire, collate, use, reuse and disseminate statistical information
• By:
  – industrialising, modernising and reengineering our business processes
  – reengineering our statistical infrastructure and the way we manage information
  – developing the capability needed to meet future needs
  – collaborating with other international NSIs

InSPIRe

What is InSPIRe?
• Infrastructure for Statistical Process and Information Management Re-engineering
• Building core infrastructure in the ABS for:
  – Business Process management
  – Information management
• SWM – Statistical Workflow Management System
• MRR – Metadata Registry and Repository

SWM
• The Statistical Workflow Management System
• An environment that will be used to develop and manage common business processes
  – These processes can drive and/or be used by tools and applications.

MRR
• The Metadata Registry and Repository consists of two parts:
  – Repository
    • The centralised ‘bucket’ that stores standards-based metadata.
  – Registry
    • The catalogue that lets you find out what is in the repository.

MRR
• Registers and stores a wide range of information, including:
  – Metadata (e.g. Classifications, Variables, etc.)
  – Data – registered in MRR, stored in EDW
  – Process Definitions and Paradata – information to run a process
  – Process Metrics – information about a process that was run.

MRR
• The MRR
  – Enforces registration of the information, ensuring consistent documentation
  – Reads information in a range of supported standard metadata formats
    • (initially DDI and SDMX)
  – Re-issues metadata in whatever standard format is required by a given process.
    • Regardless of the format in which the metadata was originally created.
  – Stores this information for use and later re-use by the metadata-driven processes in SWM
  – Enables the searching and discovery of metadata for re-use.

Interfaces
• Processes and Systems will interact with the MRR through a web service interface that supports a controlled set of standards.
  – Currently limited to DDI 3.1 and SDMX 2.1

[Diagram. Labels: Users, User Interface, Business Process, Search, Retrieve, Register, SWM, InSPIRe, MRR]

Benefits of InSPIRe
• Basis for greater automation and reuse of processes and metadata
• Faster to market
  – For individual collections
  – For building new collections
• Cheaper to market (long term)
  – For individual collections
  – For building new collections

Progress so far…
• Proof of Concept Project (2010/2011)
  – A group of simple use cases to prove the idea of the MRR and SWM.
• Demonstrating:
  – Benefits
  – Potential functionality
  – What is technically achievable
• Building capability

Pathfinders
• Pathfinder Integration (July 2012)
  – Expanded on the PoC
  – ‘Pathfinder’ projects to produce further use cases for the MRR and SWM.
    • Not end-to-end processes, but isolated ‘snapshot’ projects to demonstrate the capabilities of InSPIRe.
• Four pathfinders chosen
  – Including web data capture (e-forms) and REEM (Remote Execution Environment for Microdata)

REEM and Web Data Capture Pathfinders
• Proved integration with real ABS systems
• Demonstrated the benefits of integration with InSPIRe, namely:
  – Re-use of processes (in SWM)
  – Standards-based metadata (DDI in MRR) driving business tools
  – Automating business processes (in SWM)

Early Adopter Projects
• Projects which are positioning to integrate with InSPIRe within the next 12 months.
  – Includes projects focusing on administrative data, e-forms and the 2016 Census.
• Suitable candidates to prove the operation of InSPIRe.
  – Implementing processes in different phases of the GSBPM.

What we have achieved so far
• MRR
  – Design and review, mappers, shredders, database, automated generation of metadata types, basic search, registry model
• SWM
  – Environment setup, test cases with projects
• Capability Development
  – Best practice guidelines for implementing DDI, metadata content guidelines, InSPIRe integration information packs

What we still have to do
• MRR
  – Production version
• SWM
  – Further process definition, integration with other projects
• Metadata Authoring Environment
  – A mechanism for:
    • Taking existing metadata content and registering it in the MRR
    • Creating new metadata content via a content creation tool and registering it in the MRR

The future… An example

[Diagram. Labels: Survey Designer, Data Collection, e-form, Transformation Environment, Content Creation Tool, DDI, SWM, Web form, DDI, DDI, Retrieve, Retrieve, Register, MRR, Store, Store, EDW]
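
The register, search and retrieve interactions with the MRR described in these slides can be illustrated with a small sketch. This is a toy in-memory model, not the ABS implementation or its web service interface: the class `ToyMRR`, its method names, the record fields and the sample classification are all assumptions made for illustration. Only the two format names (DDI 3.1 and SDMX 2.1) come from the slides. No real DDI or SDMX serialisation is attempted; the sketch only tags the re-issued item with the requested format.

```python
from dataclasses import dataclass

# The two formats named on the Interfaces slide.
SUPPORTED_FORMATS = {"DDI 3.1", "SDMX 2.1"}


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical catalogue record describing one stored item."""
    item_id: str
    kind: str            # e.g. "Classification" or "Variable"
    name: str
    source_format: str   # format the item arrived in


class ToyMRR:
    """Hypothetical in-memory stand-in for a registry plus repository."""

    def __init__(self):
        self._registry = {}    # item_id -> RegistryEntry (the catalogue)
        self._repository = {}  # item_id -> content dict (the 'bucket')

    def register(self, item_id, kind, name, content, source_format):
        # Enforce registration rules so every item is documented consistently.
        if source_format not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {source_format}")
        if not (item_id and kind and name):
            raise ValueError("item_id, kind and name are all required")
        if item_id in self._registry:
            raise ValueError(f"already registered: {item_id}")
        self._registry[item_id] = RegistryEntry(item_id, kind, name, source_format)
        self._repository[item_id] = dict(content)

    def search(self, kind=None, text=""):
        # Discovery consults only the catalogue, never the stored content.
        return [
            entry for entry in self._registry.values()
            if (kind is None or entry.kind == kind)
            and text.lower() in entry.name.lower()
        ]

    def retrieve(self, item_id, target_format):
        # Re-issue in the requested format, whatever the original format was.
        if target_format not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {target_format}")
        entry = self._registry[item_id]
        return {
            "format": target_format,
            "kind": entry.kind,
            "name": entry.name,
            "content": dict(self._repository[item_id]),
        }


# Illustrative usage with made-up sample metadata.
mrr = ToyMRR()
mrr.register("cls-001", "Classification", "Industry",
             {"codes": ["A", "B"]}, "DDI 3.1")
hits = mrr.search(kind="Classification")
reissued = mrr.retrieve("cls-001", "SDMX 2.1")
```

Keeping the catalogue and the stored content in separate structures mirrors the registry and repository split on the MRR slide: a process can discover what exists without loading the content itself.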