Eric Hermouet
Statistics Division, ESCAP hermouete.unescap@un.org
The data deluge
New competitors
Changing user demand
Economic pressure
The internet had 1 800 exabytes of data in 2011 (exa = 10^18)
50 000 exabytes by 2020
Even if 99.9% are videos, photos, audio files, text messages etc., that still leaves a huge amount of potentially relevant data
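As a rough check on the scale involved, the arithmetic can be worked through directly. The sketch below uses only the 50 000 exabytes and the 99.9% share quoted above; the variable names are illustrative, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

EXA = 10**18  # 1 exabyte = 10^18 bytes

total_exabytes = 50_000                 # projected volume by 2020 (figure from the slide)
irrelevant_share = Fraction(999, 1000)  # the 99.9% assumed to be videos, photos, etc.

# What remains once the 99.9% is set aside
relevant_exabytes = total_exabytes * (1 - irrelevant_share)
relevant_bytes = relevant_exabytes * EXA

print(relevant_exabytes)  # 50
print(relevant_bytes)     # 50000000000000000000
```

Even the remaining 0.1% would amount to 50 exabytes, i.e. 5 x 10^19 bytes.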
Google:
– Real-time price indices
– Public Data Explorer
– First point of reference for the “data generation”
Facebook, store cards, credit agencies, ...
– What if they link their data?
Can they provide an alternative to population censuses?
Statistics made available faster
Need to answer a wider range of users
– Need to package data differently
Need for more detailed information
Need for integrated data products
– Logically linking datasets from different sources
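To make the idea of linking datasets from different sources concrete, here is a minimal sketch that joins two sets of records on a shared identifier. The record layouts, the `unit_id` key and the function name `link_by_key` are all hypothetical, chosen only for illustration; real linkage usually also has to handle missing or inconsistent identifiers.

```python
def link_by_key(left, right, key):
    """Join two lists of records on a shared identifier (inner join).

    If the right-hand source repeats an identifier, the last record wins.
    """
    right_index = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_index.get(row[key])
        if match is not None:
            # Merge the fields of both sources into one record
            linked.append({**row, **match})
    return linked


# Hypothetical records from two different sources
business_register = [
    {"unit_id": "A1", "industry": "transport"},
    {"unit_id": "B2", "industry": "trade"},
]
survey_results = [
    {"unit_id": "A1", "employees": 40},
    {"unit_id": "C3", "employees": 12},
]

linked = link_by_key(business_register, survey_results, "unit_id")
print(linked)
# [{'unit_id': 'A1', 'industry': 'transport', 'employees': 40}]
```

Only unit A1 appears in both sources, so it is the only linked record.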
Develop and promote new:
– Sources
– Processes
– Products
High-Level Group for Strategic Directions in Business Architecture in Statistics (HLG-BAS)
– Created by the Conference of European Statisticians in 2010
– 9 heads of national and international statistical organizations
Civil Registration and Vital Statistics
The challenges are too big for statistical organizations to tackle on their own.
We need to work together
Collaboration
Coordination
Communication
Common processes
Common tools
Common methodologies
Recognizing that all statistics are produced in a similar way:
No domain is “special”
Increased flexibility to adapt to new sources and produce new outputs
Generic Statistical Business Process Model (GSBPM)
Generic Statistical Information Model (GSIM)
Statistical Data and Metadata eXchange (SDMX)
Data Documentation Initiative (DDI)
Statistical production has traditionally been organised by topic, e.g. transport, trade, …
Some statistical organisations are moving towards a process-based approach
To define and describe statistical processes in a coherent way
To standardize process terminology
To compare and benchmark processes within and between organisations
To identify synergies between processes
To inform decisions on systems architectures and organisation of resources
All activities undertaken by producers of official statistics which result in data outputs
National and international statistical organisations
Independent of data source, can be used for:
– Surveys / censuses
– Administrative sources / register-based statistics
– Mixed sources
Sub-processes do not have to be followed in a strict order
It is a matrix, through which there are many possible paths, including iterative loops within and between phases
Some iterations of a regular process may skip certain sub-processes
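The "many possible paths" idea can be sketched as a small check on sequences of steps. For brevity the example works at the level of phases rather than sub-processes. The phase names are those of GSBPM version 4.0 as an assumption (the slides do not list them), and the function names and the sample monthly run are hypothetical.

```python
# Phase names as in GSBPM version 4.0 (an assumption; not listed in the slides)
PHASES = [
    "Specify Needs", "Design", "Build", "Collect", "Process",
    "Analyse", "Disseminate", "Archive", "Evaluate",
]


def is_valid_path(path):
    """A path is valid if every step names a known phase; the order is free."""
    return all(step in PHASES for step in path)


def skipped_phases(path):
    """Phases that this particular iteration does not visit."""
    return [phase for phase in PHASES if phase not in path]


def has_loop(path):
    """True if the path returns to a phase after moving on to a different one."""
    seen = set()
    previous = None
    for step in path:
        if step != previous and step in seen:
            return True
        seen.add(step)
        previous = step
    return False


# A hypothetical iteration of a regular (e.g. monthly) process:
# no redesign, and one loop back from analysis to processing.
monthly_run = ["Collect", "Process", "Analyse", "Process", "Analyse", "Disseminate"]

print(is_valid_path(monthly_run))   # True
print(has_loop(monthly_run))        # True
print(skipped_phases(monthly_run))
# ['Specify Needs', 'Design', 'Build', 'Archive', 'Evaluate']
```

The point of the sketch is that nothing enforces a fixed order: skipping phases and looping back are both legitimate paths through the model.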
Harmonizing statistical computing systems
Facilitating sharing of statistical software
Framework for process quality management
Structure for storage of documents
Measuring operational costs
The Generic Statistical Information Model is a reference framework of information objects, which enables generic descriptions of data and metadata definition, management, and use throughout the statistical production process
Another model is needed to describe data and metadata objects and flows within the statistical business process
Provide a common reference model for statistical information
Define the information required to drive statistical production processes and define outputs
Facilitate building efficient metadata-driven collection, processing, and dissemination systems
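One way to picture a metadata-driven system is a validation step whose rules live in the metadata rather than in the code. The sketch below is only an illustration: the dictionary layout, the variable names and the `validate` function are hypothetical and are not GSIM information objects.

```python
# Hypothetical variable descriptions; this layout is not defined by GSIM.
VARIABLES = {
    "age": {"type": int, "min": 0, "max": 120},
    "sex": {"type": str, "allowed": {"F", "M"}},
}


def validate(record, variables):
    """Check a record against the variable descriptions and return error messages."""
    errors = []
    for name, rule in variables.items():
        if name not in record:
            errors.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name}: above {rule['max']}")
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{name}: not in allowed values")
    return errors


print(validate({"age": 34, "sex": "F"}, VARIABLES))   # []
print(validate({"age": 130, "sex": "X"}, VARIABLES))
# ['age: above 120', 'sex: not in allowed values']
```

Adding a new variable or changing a limit then means editing the metadata only; the processing code stays the same.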
Leads to a modular approach in designing software
Plug and play tools
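A modular, plug-and-play design can be sketched as a chain of interchangeable steps that all share the same interface: each takes a list of records and returns a list of records. The step names, the record fields and `run_pipeline` are hypothetical and serve only to illustrate the idea.

```python
def drop_incomplete(records):
    """Remove records that contain any missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]


def add_total(records):
    """Add a derived 'total' field to each record."""
    return [{**r, "total": r["male"] + r["female"]} for r in records]


def run_pipeline(records, steps):
    """Apply each step in turn; steps can be added, removed or swapped freely."""
    for step in steps:
        records = step(records)
    return records


# Hypothetical input data
raw = [
    {"region": "North", "male": 10, "female": 12},
    {"region": "South", "male": None, "female": 7},
]

result = run_pipeline(raw, [drop_incomplete, add_total])
print(result)
# [{'region': 'North', 'male': 10, 'female': 12, 'total': 22}]
```

Because every step has the same shape, a tool developed by one organisation could in principle be plugged into another organisation's pipeline without changes to the surrounding steps.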
EGM in June 2011
SIAP Management Seminar in December 2011
Moscow workshop, April 2012
On the agenda of the upcoming Committee on Statistics, December 2012
http://www.unescap.org/stat/MSIS/index.asp
GSBPM
– http://www1.unece.org/stat/platform/display/metis/The+Generic+Statistical+Business+Process+Model
GSIM
– http://www1.unece.org/stat/platform/display/metis/Generic+Statistical+Information+Model+(GSIM)
HLG-BAS
– http://www1.unece.org/stat/platform/display/hlgbas
THANK YOU.
Questions?