Generic Statistical Information Model (GSIM)
Jenny Linnerud
jal@ssb.no

This webinar on GSIM (Generic Statistical Information Model) is part of a series of lectures on the main projects undertaken by the High Level Group for the Modernization of Official Statistics (HLG-MOS).

Vision of the High Level Group

What is GSIM?
• It is a strategic approach and a new way of thinking, designed to bring together statisticians, methodologists and IT specialists to modernize and streamline the production of official statistics.
• It is a reference framework of internationally agreed definitions, attributes and relationships that describe the pieces of information used in the production of official statistics (information objects).
• This framework enables generic descriptions of the definition, management and use of data and metadata throughout the statistical production process.

What is the relationship between GSIM & GSBPM?
• GSIM and GSBPM are complementary models for the production and management of statistical information.
• GSBPM models the statistical production process and identifies the activities undertaken by producers of official statistics that result in information outputs.
• GSIM helps describe GSBPM sub-processes by defining the information objects that are used by them, that flow between them, and that are created in them in order to produce official statistics.

What is an information object?
• GSIM is a model of objects that specify information about the real world (“information objects”).
• Examples include data and metadata (such as classifications), as well as rules and parameters needed for production processes to run (e.g. data editing rules).
• GSIM identifies ca.
110 information objects, which are grouped into four broad categories:

[Figure: overview of GSIM information objects. Group labels: Business, Exchange, Structures, Concepts. Objects shown: Statistical Support Program, Statistical Program, Business Process, Data Set, Information Resource, Referential Metadata Set, Process Step, Web Scraper Channel, Data Structure, Variable, Referential Metadata Structure, Questionnaire, Exchange Channel, Product, Statistical Need, Administrative Register, Population, Concept, Unit, Statistical Classification.]

GSIM Development 2012
• GSIM sprint in Slovenia, February
• GSIM sprint in Republic of Korea, March
• Integration workshop in the Netherlands, November
• GSIM v1.0, December

Developing the GSIM
17 different organisations

What are the benefits of using GSIM?
• GSIM enables statistical organizations to rethink how their business could be more efficiently organized
  – by defining information objects common to all statistical production, regardless of the subject matter area
• Improves communication between different disciplines involved in statistical production
  – within and between statistical organizations;
  – between users and producers of official statistics.
• Generates economies of scale
  – reuse of information can improve comparability of statistics
• Enables greater automation of the statistical production process
• Validates existing information systems

In Statistics Norway we are also using GSIM to communicate with other government agencies and with IT consultants.

Statistics Norway’s participation in GSIM Implementation
• GSIM v1.0 Brochure and Communication document in Norwegian
• Informal task force on metadata flows in the GSBPM - ca.
20 GSIM information objects were mapped to the phases in GSBPM v4
• GSIM v1.0 discussion forum
• GSIM Statistical Classification Model -> GSIM v1.1 December 2013
• Trying out GSIM v1.1 within the RAIRD project

GSIM implementation 2013-2015
• 8 countries provided GSIM Case studies in 2015: Canada, Finland, France, Germany, Italy, New Zealand, Norway, Sweden
  http://www1.unece.org/stat/platform/display/CASES/GSBPM+and+GSIM+Case+Studies
• GSIM Statistical Classifications is the part of the model that statistical organisations have implemented most

GSIM in Statistics Norway - Vision
GSIM should lead to:
• A foundation for standardised statistical metadata use throughout systems
• A standardised framework for consistent and coherent design of statistical production
• Increased sharing of system components

Remote Access Infrastructure for Register Data (RAIRD)
• Statistics Norway and the Norwegian Social Science Data Services (NSD) aim to establish a national research infrastructure providing easy access to large amounts of rich high-quality statistical data for scientific research, while at the same time managing statistical confidentiality and protecting the integrity of the data subjects.
• The work is organized as a project, RAIRD (Remote Access Infrastructure for Register Data), and funded by the Research Council of Norway. See: www.raird.no

RAIRD Information Model (RIM)
• RIM is an implementation of the Generic Statistical Information Model (GSIM) v1.1.
• We have based RIM on the GSIM Design Principles.
• RIM extends GSIM with 27 information objects that are mainly specialisations, e.g. to include different types of agents (producers, administrators and researchers).
• RAIRD is a project that is still in progress, with completion planned in 2017.
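
The idea of extending a generic model with specialisations, as described for RIM above, can be pictured with a small sketch. This is only an illustration under stated assumptions: the class names, attributes and the `generic_view` helper are hypothetical and are not taken from the GSIM or RIM specifications. It shows that a specialised object (here a researcher) can add its own attributes while still being usable wherever the generic object is expected.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationObject:
    """Hypothetical generic object: a name plus an agreed definition."""
    name: str
    definition: str


@dataclass
class Agent(InformationObject):
    """Hypothetical generic agent; 'contact' is an illustrative attribute."""
    contact: str = ""


@dataclass
class Producer(Agent):
    """Hypothetical specialisation with no extra attributes."""


@dataclass
class Administrator(Agent):
    """Hypothetical specialisation with no extra attributes."""


@dataclass
class Researcher(Agent):
    """Hypothetical specialisation that adds one attribute of its own."""
    approved_projects: List[str] = field(default_factory=list)


def generic_view(obj: InformationObject) -> Dict[str, str]:
    """Describe any object using only the attributes of the generic model."""
    return {
        "type": type(obj).__name__,
        "name": obj.name,
        "definition": obj.definition,
    }


# Usage: a specialised object is still handled by code written for the generic one.
researcher = Researcher(
    name="Researcher A",
    definition="A person approved to analyse data",
    approved_projects=["Project X"],
)
print(generic_view(researcher))
```

Keeping the specialisations as subclasses means tools written against the generic objects do not need to change when a new specialised type is added.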

Potential Benefits of RAIRD
• Simplify the approval process
• Provide quicker access to analysis results
• More Masters students will use our data
• Simplify large, complicated studies by providing exploratory analysis in an early phase
• More research and use of our data

Contents in 2017
• Demography
• Education
• Income
• Labour market
• Social security and benefits

Better transfer of knowledge within Statistics Norway

Overview of the main components
[Figure: RAIRD architecture. Labels shown: Event History Input Data Set, Input Metadata Set, Load API, SSB Data Mgt. System, Event History Data Store, Data Catalogue, Virtual Statistical Machine, Analysis Data Set, Provisional Output, Disclosure Control System, Final Output, Browser (x3), Browse Data Catalogue, User Operations, User Views, Virtual Research Environment, Metadata, Analyse data.]
• Researcher cannot see the data -> Simplifies the approval process
• Metadata is the interface to the data

Statistical Confidentiality in RAIRD
Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality on 5-7 October 2015 at Statistics Finland - Topic (v): Practicum: Case Studies and Software

How do I find out more?
UNECE - GSIM Wiki
http://www1.unece.org/stat/platform/display/gsim/Generic+Statistical+Information+Model

Questions?
Thank you to Peter Frayne for contributing questions in advance

The End