Report of the TFIDC to the IAG

Report of the Task Force on
International Data Cooperation
Presented to the Meeting of the Inter-Agency Group on Economic and Financial Statistics
September 23 and 24, 2013
Table of Contents
List of Acronyms
Executive Summary
I. Introduction
II. International Data Cooperation: General Principles
III. Data Cooperation for GDP Main Aggregates and Population, and Sectoral Accounts
    A. Background
    B. Mandate and Objectives
    C. Deliverables and Timing
IV. Pilot Exercises
    A. Scope of the Pilot Exercises
        Datasets
        Pilot Countries
        Expected results and timeframes
        Criteria for evaluation
    B. Templates
    C. SDMX Infrastructure in Member Agencies
        Review of the Infrastructure
        Implications for Pilot Exercises
V. Current Workflow Practices and Workflow for the pilots
    A. Data transmission from country to international agencies
    B. Sharing of validated data among international agencies
ANNEX I: Members of the TFIDC
ANNEX II: Country coverage for the Pilot Exercise (by group of countries)
ANNEX III: Template for GDP Main Aggregates and Population
ANNEX IV: SDMX Survey Questionnaire
ANNEX V: Summary of SDMX Survey Results
ANNEX VI: Data Workflow Practices
ANNEX VII: Eurostat-OECD Protocol: Agreed Minimum Checking Rules
LIST OF ACRONYMS
2008 SNA    System of National Accounts, 2008
API         Application Programming Interface
BIS         Bank for International Settlements
DSD         Data Structure Definitions
ECB         European Central Bank
ECB SDW     ECB Statistical Data Warehouse
EDAMIS      Electronic Dataflow Administration and Management Information System
ESA         European System of Accounts
Eurostat    Statistical Office of the European Union
GPG         GNU Privacy Guard
IAG         Inter-Agency Group on Economic and Financial Statistics
MSD         Metadata Structure Definitions
OECD        Organisation for Economic Cooperation and Development
PGI         Principal Global Indicators
SDMX        Statistical Data and Metadata Exchange
SDMX-EDI    SDMX using EDIFACT Syntax
SDMX-ML     SDMX using XML Syntax
SDMX-NA     SDMX in National Accounts
TFIDC       Task Force for International Data Cooperation
UNECE       United Nations Economic Commission for Europe
UNSD        United Nations Statistics Division
EXECUTIVE SUMMARY

The Inter-Agency Group on Economic and Financial Statistics (IAG) has emphasized the
need to improve cooperation among international and supranational agencies in terms of
collecting, validating, and disseminating public official statistics from national and
international/supranational sources. In this regard, the IAG has established the Task Force on
International Data Cooperation (TFIDC) to examine the elements and undertake pilot exercises
on a framework that would allow member countries of the agencies represented on the IAG to
submit data only once, and for these data to be shared among the member agencies. The overall
objective of the TFIDC is to determine the procedures that could be applied for a successful data
cooperation arrangement across international agencies that would streamline and improve the
efficiency of data collection, sharing, and dissemination.

To meet this objective, the TFIDC will oversee two pilot exercises for (i) GDP main
aggregates and population, and (ii) sectoral accounts. The pilot exercises will begin in September
2013 and will run for about two years. In preparation for the exercises, the TFIDC has agreed on
data collection templates for each of the datasets. The template for GDP main aggregates and
population comprises a set of main national accounts aggregates and auxiliary indicators that are
widely used and are available for a large number of countries. The TFIDC will use the templates
on sectoral accounts that have been developed by the Working Group on Sectoral Accounts and
endorsed by the IAG.

The cooperation with regard to the agreed datasets should demonstrate that the exchange
of data among international agencies is technically feasible for both sending and receiving
agencies. This should result in the availability of consistent data across international agencies
and efficiency gains for all parties.

In general, participating international agencies are expected to use the pilot exercises to
evaluate the benefits and costs associated with international data cooperation. In particular, the
pilot exercises will determine the following: (i) whether the current data collection practices of
the respective agencies meet the requirements of the templates; (ii) where gaps exist in terms of
country coverage; (iii) the responsibilities for the validation of the data, the validation principles
to be used, and the timeframe for validation; (iv) a suitable schedule for the transmission of data
from national agencies to the collecting international agencies that will satisfy the timetables of
all receiving international agencies (with the implication that the strictest schedule currently in
place among all the agencies may have to be followed); and (v) the IT infrastructure in terms of
the ability of the agencies to receive and send (where appropriate) national accounts data using
SDMX-ML with the appropriate 2008 SNA data structure definition. Overall, the TFIDC has
determined that the technical systems of the participating agencies are capable of handling the
SDMX formats for data exchange and are generally ready for the pilot exercises.

In the medium term, data cooperation among the agencies may render multiple collection
and validation of data from countries unnecessary. The agencies may then focus on having
the data reported to only one international agency.
I. INTRODUCTION
1.
The Inter-Agency Group on Economic and Financial Statistics (IAG)¹ plays a key
monitoring and coordinating role in the implementation of the recommendations made in the
Report to the G20 on Data Gaps and the Financial Crisis by the International Monetary Fund and
the Financial Stability Board.² In addition to addressing the recommendations made in the report,
the IAG has focused its attention on improving practical cooperation between international and
supranational agencies in terms of collecting, validating and disseminating public official
statistics from national and international/supranational sources.
2.
The Task Force on International Data Cooperation (TFIDC) was established by the IAG
and began its work in early 2013. The purpose of the TFIDC is to examine the elements and
undertake pilot exercises on a framework that would allow member countries of the agencies
represented on the IAG to submit data only once, and for these data to be shared among the
member agencies.
3.
This report provides an update on the work of the TFIDC and outlines the details for
undertaking the pilot exercises. Section II outlines the general principles underlying international
data cooperation; Section III reviews the background and mandate of the data cooperation
arrangements for GDP main aggregates and population, and sectoral accounts; Section IV
reviews the scope of the pilot exercise, the templates for the datasets, and SDMX infrastructure
existing in international agencies; and Section V reviews the workflow practices for the pilot
exercises and presents their timing.
II. INTERNATIONAL DATA COOPERATION: GENERAL PRINCIPLES
4.
The IAG has emphasized the need to improve cooperation among international and
supranational agencies in terms of collecting, validating, and disseminating public official
statistics from national and international/supranational sources.
5.
The general objectives of the improved cooperation are as follows:
• reduce the reporting burden on national authorities;
• make more efficient use of resources at the national and international/supranational
  agencies;
• ensure that the economic and financial data and related metadata in the databases of
  international/supranational agencies are identical for the same statistical concepts and are
  of the highest quality, including in terms of frequency and timeliness; and
• improve the dissemination to users globally of a more consistent set of economic and
  financial data.

¹ The IAG includes the Bank for International Settlements (BIS), the European Central Bank (ECB), the Statistical
Office of the European Union (Eurostat), the International Monetary Fund (IMF), the Organization for Economic
Co-operation and Development (OECD), the United Nations (UN) and the World Bank. The IMF chairs the IAG
and provides its secretariat.
² See http://www.imf.org/external/np/g20/pdf/102909.pdf.
6.
The IAG has started to develop a set of cooperation guidelines for the international and
supranational agencies that are willing to participate in the improvement of data flows amongst
themselves. The guidelines are based on the following:
• the implementation of more aligned statistical concepts and definitions;
• the harmonization of reporting templates;
• the efficient exchange of data between international and supranational agencies, in
  particular the use of SDMX technical standards and data structure definitions;
• the allocation of responsibilities among these agencies with respect to data validation and
  quality assurance;
• making use, to the extent possible, of existing data cooperation arrangements already in
  place;
• the implementation of more efficient and consistent data flows to and from various data
  hubs, both across agencies and on public websites; and
• more formal arrangements between international agencies where appropriate.
III. DATA COOPERATION FOR GDP MAIN AGGREGATES AND POPULATION, AND SECTORAL
ACCOUNTS
A. Background
7.
The TFIDC organized its first physical meeting in April 2013. The meeting discussed a
framework for its activities leading up to the implementation of the pilot exercises on the data
cooperation arrangements for sectoral accounts, GDP and population statistics. Before and after
the April 2013 meeting, the TFIDC carried out its work by electronic means (e-mails,
teleconferences, and video conferences).
8.
The IAG has mandated that the TFIDC undertake two pilot exercises to test data
cooperation for national accounts datasets in line with the draft general principles outlined above.
The goal of the pilot exercises is to develop a set of commonly shared principles and working
arrangements for data cooperation that could be implemented by the international agencies. One
pilot exercise will test the arrangements for a set of GDP main aggregates and population data
with the widest possible representation of countries and agencies. The other pilot exercise will
test arrangements relating to institutional sector accounts within the scope of the G20
Recommendations (Recommendation #15 of the G20 Data Gaps Initiative).
B. Mandate and Objectives
9.
The overall objective of the TFIDC is to determine the procedures that could be applied
for a successful data cooperation arrangement across international agencies that would
streamline and improve the efficiency of data collection, sharing, and dissemination. Thus, the
TFIDC is expected to develop the framework for the data cooperation process—with the work
processes clearly outlined—and recommend a feasible program for its implementation based on
the outcomes of the pilot exercises.
10.
The IAG has mandated the TFIDC to prepare, by September 2013, a report outlining the
practical workflows among the international agencies and between the data-providing countries
and international agencies for collecting the relevant time series for each of the datasets
covered in the pilots, and the procedures for validating, sharing, and disseminating the collected data.
11.
At the April 2013 meeting, the TFIDC agreed on a basic timeline for its work, including
the schedule and anticipated outcomes of the pilot exercises. The mandate of the TFIDC has
been revised to reflect the anticipated workflows and expected outcomes. Following the
completion and review of the pilots, the initial informal guidelines outlined in the general
principles will be reviewed and, depending on the outcome of the pilots, these principles may be
applied to other datasets.
C. Deliverables and Timing
12.
The TFIDC is expected to deliver the following:
• the templates to be used for the pilots on data cooperation;
• the agreement about which international agency covers which countries in which pilot;
• the data flows from countries to international agencies;
• the workflows among the agencies;
• the clarification and establishment of the IT environment for the data cooperation
  exercise;
• transmission process to be used in case SDMX cannot be used directly by some agencies
  for a transitional period;
• feasibility of exchanging non-validated data; and
• processes beyond the successful pilots.
The first six items are addressed in this first progress report, as they are instrumental in getting
started with the pilot exercises. The two remaining items will be addressed at a later stage.
13.
The TFIDC will function for an initial period of two years while undertaking the pilot
exercises. A first review of the exercises will be undertaken nine months after the initiation of
the pilot.
IV. PILOT EXERCISES
A. Scope of the Pilot Exercises
14.
The TFIDC will oversee pilot exercises covering (i) GDP main aggregates and
population, and (ii) sectoral accounts. They will be launched in September 2013 and will run for
about two years.
15.
The pilot exercises will determine whether the current data collection practices by the
respective agencies meet the requirements of the templates developed for the data cooperation
(see Section IV.B for a description of the templates). They will therefore review the collection
practices currently being followed by the agencies and will identify their data requirements in
terms of the items presented in the templates. The exercises will also determine where gaps exist
in terms of country coverage and the completeness of the templates. In this regard, the TFIDC
will use the results of the pilot exercises to propose mechanisms to address the country coverage
and data gaps.
16.
In preparation for the pilots, the TFIDC conducted a review of the data collection and
workflow practices in member agencies as they relate to the datasets under consideration. Thus,
the review was conducted separately for GDP main aggregates, population, and sectoral accounts.
17.
The review determined that currently, the country coverage, range of data collected, and
collection practices vary widely across the agencies (see Annex VI). The agencies focus on
collecting data for their members and, in some cases, a few additional countries of interest.
Therefore, country coverage currently ranges from 36 for the ECB and Eurostat (covering the 28
European Union member countries, EFTA countries and candidate countries) to 235 for the
UNSD (covering all UN member countries as well as other areas and territories). Whereas all
the agencies collect data on GDP main aggregates, five collect data on sectoral accounts, these
being the BIS, ECB, Eurostat, OECD and UNSD. Further, the number of countries covered by
data collection for the sectoral accounts is significantly lower than the number of countries
covered for GDP main aggregates.
18.
The data specified in the proposed templates for GDP main aggregates and sectoral
accounts are covered by the ESA Transmission Programme, which is also used for OECD
data reporting requirements and, as regards sector accounts, for the quarterly data reporting
requirements of the ECB. However, the full dataset is not yet provided by all countries covered
by these three institutions, owing to derogations or non-compliance issues. By contrast, the World Bank has
no data reporting requirements and collects the data from the national agencies through visiting
World Bank missions. As regards the level of detail, the ECB, Eurostat and OECD collect a
broad range of data by institutional sector and subsector, but the BIS and UNSD collect a limited
number of items for the main institutional sectors.
19.
The agencies follow various formal and informal data cooperation arrangements. Most of
the arrangements involve the ECB, Eurostat or OECD as a partner. The OECD currently receives
non-validated data from Eurostat but an agreement has been reached for the transmission of
validated data from Eurostat, as is currently the case with the ECB. The OECD also receives
sectoral accounts data from Eurostat and the ECB. Sectoral accounts data received by the OECD
from Eurostat and ECB and remaining OECD countries are transmitted to the IMF by the OECD
for publication on the Principal Global Indicators (PGI) website. The IMF receives the GDP
main aggregates data for most EU member countries from Eurostat in a bulk file (using SDMX-EDI) on a monthly basis. It also gets additional data for non-EU countries from OECD.Stat. The
agencies are currently working on a data exchange process where the IMF would directly query
Eurostat's databases (“Pull” technology). The UNSD receives annual GDP data from OECD and
other non-IAG regional agencies and collects data from other UN member states.
20.
The data collection schedules of the agencies follow closely the national release calendars
of countries but are designed to meet the publication schedules of the agencies. Therefore, the
agencies may collect the data some time after these data are released by the national statistical
agencies (usually on their websites), but a cutoff date for data transmission is usually prescribed
by each IAG agency. The data collection schedules for the ECB and Eurostat are based on specific
regulations although national release calendars allow the countries to submit the data to these
agencies ahead of the prescribed timeframes. The pilot exercises will evaluate the existing
processes for handling data revisions, with a view to proposing ways to streamline them.
21.
During the pilot exercises, the agencies will determine the time lapse between the release
of the data by the national agencies and the transmission of the data by the receiving agencies.
There are two aspects of this flow of data that will be monitored:
1. The lapse of time between the release of data by a national agency and the receipt
by the collecting international agency; and
2. The amount of time taken to validate the data before transmission to other
agencies.
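The two intervals to be monitored can be illustrated with a short calculation. The following is a minimal sketch only: the function name, the field names and the dates are hypothetical and are not part of any agreed TFIDC procedure.

```python
from datetime import date


def transmission_lags(national_release: date, received: date, transmitted: date) -> dict:
    """Return, in days, the two lags described in the numbered items above."""
    if not national_release <= received <= transmitted:
        raise ValueError("dates must be in chronological order")
    return {
        # 1. release by the national agency -> receipt by the collecting agency
        "collection_lag_days": (received - national_release).days,
        # 2. receipt -> transmission to other agencies (time taken for validation)
        "validation_lag_days": (transmitted - received).days,
    }


# Hypothetical dates for one country and one quarterly release.
lags = transmission_lags(date(2013, 11, 14), date(2013, 11, 15), date(2013, 11, 19))
print(lags)
```

Recording both lags separately would make it possible to see whether delays arise mainly in collection or in validation.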
Datasets
GDP Main Aggregates and Population
22.
For GDP main aggregates and population, all agencies will continue to collect the data as
they currently do, which implies that the same data may be collected more than once for some
economies. In principle, only the primary validating international agency will push these data to
the other interested agencies using SDMX, where feasible. However, for countries beyond the
validation responsibilities of Eurostat, ECB and OECD, transmission from multiple international
organizations may be possible during the pilot, primarily to be able to assess the allocation of the
validating role for these countries. The exercise will be evaluated after the first nine months and
if agencies are satisfied with the progress, then they may choose to implement the process for
data cooperation before the end of the two-year period.
23.
During the pilot exercises, agencies will disseminate the data only after the validation
process is complete. The timeliness of the dissemination of the validated data will then be
compared with the current timeframe for data collection. The agencies will also explore the
possibility of disseminating non-validated data (where the data would be pushed to
the receiving agencies as soon as they are collected from the countries).
24.
Costs are mainly related to handling the reception and transmission of data. While it was
agreed that the pilot exercise on GDP and population data should be started on the basis of the
current technical environment involving varying technical infrastructure and manual
interventions, the exchange of data between international agencies will be more efficient once
SDMX is operational.
Sectoral Accounts
25.
For the sectoral accounts data, the pilot exercise will continue until data are available on a
2008 SNA/ESA 2010 basis for most of the participating countries. The data collection process
during the pilot exercise will follow the practices for submitting sectoral accounts data that have
been adopted by the IAG Working Group on Sectoral Accounts.
26.
Under the existing arrangements, Euro area and EU member countries submit the data to
the ECB and Eurostat. EFTA members and EU candidate and associated countries also already
submit partial datasets to Eurostat. These organizations then transmit the data to the OECD. The
OECD collects data for OECD member countries that are not members of the EU and for
selected OECD Key Partner countries where the data are available. The OECD submits the entire
data to the IMF for publication on the PGI website. For assessing the feasibility of data
cooperation, the pilot could test the transmission of sector accounts data to the other participating
agencies as well. In the future, the IMF may need to assume the responsibility for
collecting data from countries that are not covered by the OECD, subject to data availability. The
scope for any such collection could be broadened by the inclusion of annual data in the pilot.
Pilot Countries
27.
Building on already agreed responsibilities among the ECB, Eurostat and the OECD, the
TFIDC proposes that the transmission of data for the euro area, EU and remaining OECD and
Key Partner countries to other agencies be organized by these three agencies, while the IMF
agreed to send existing data it collects for all remaining countries. The BIS and the World Bank
will participate only as recipients of pilot project data.
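The proposed allocation of sending responsibilities can be sketched as a simple lookup. This is an illustration under stated assumptions: the country sets below are small illustrative subsets, the grouping of the ECB and Eurostat into a single sender label is an assumption made for brevity, and the actual country-by-country allocation is the one set out in Annex II.

```python
# Illustrative subsets only (ISO two-letter codes); Annex II holds the actual lists.
EU_INCLUDING_EURO_AREA = {"DE", "FR", "SE", "PL"}
OTHER_OECD_AND_KEY_PARTNERS = {"JP", "US", "BR"}

# Agencies that take part in the pilot as recipients only.
RECEIVE_ONLY = {"BIS", "World Bank"}


def sending_agency(country: str) -> str:
    """Return the agency assumed to push a country's data to the other agencies."""
    if country in EU_INCLUDING_EURO_AREA:
        return "ECB/Eurostat"  # assumption: the split between the two is not modelled
    if country in OTHER_OECD_AND_KEY_PARTNERS:
        return "OECD"
    return "IMF"  # all remaining countries


for code in ("DE", "JP", "BR", "KE"):
    print(code, "->", sending_agency(code))
```

A single lookup of this kind makes explicit that every country has exactly one sender, which is the property the pilot is meant to test.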
28.
Annex II summarizes the country list for the two pilot exercises and outlines the possible
arrangements for country by country sender responsibilities and receiving agencies. It is
expected that the pilots will review the list in Annex II and propose a feasible arrangement
for future cooperation.
Expected results and timeframes
29.
The cooperation with regard to the agreed datasets should demonstrate that the exchange
of data among international agencies is technically feasible for both sending and receiving
agencies. This will result in the availability of consistent data across international agencies and
efficiency gains for all parties. In the medium-term, this will also demonstrate that multiple data
collection and validation from national authorities may no longer be necessary and that agencies
may start working on reporting of data by countries to only one international agency.
Criteria for evaluation
30.
Participating international agencies should use the pilot exercises to evaluate the benefits
and costs associated with international data cooperation. Eurostat, the ECB and the OECD would
benefit by gaining additional country data that could be used to analyze the economies of main
trading and financial partners. The BIS, IMF, WB and the UN would need to assess the
timeliness and quality of data in comparison to current arrangements, and evaluate the savings
from the possibility of reducing data collection and validation routines.
31.
The documentation that will be developed as a basis for the evaluation of the pilots will
include the following:
1. Agencies sending data will document, country by country, when the data were pushed.
2. Agencies receiving data will document, country by country, the following:
   • the time they receive data through existing channels and when they publish the
     data;
   • the time they receive the data through the pilot exercises and when they could
     publish these data (note: during the pilot the data will not be made public);
   • any quality problems found in the data received, notifying the sender of them
     without delay; and
   • resources spent on data validation in person days, for the data received via own
     channels and via the pilot exercises.
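The documentation kept by a receiving agency could be held in one record per country, as sketched below. The record layout, field names and figures are hypothetical and serve only to show how the listed items map to comparable indicators.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class ReceiverRecord:
    """One receiving agency's documentation for one country (hypothetical layout)."""
    country: str
    received_existing: date           # receipt through existing channels
    published_existing: date          # publication under current arrangements
    received_pilot: date              # receipt through the pilot exercise
    publishable_pilot: date           # date the pilot data could have been published
    validation_days_existing: float   # person days, own channels
    validation_days_pilot: float      # person days, pilot channel
    quality_problems: List[str] = field(default_factory=list)

    def timeliness_gain_days(self) -> int:
        """Positive when the pilot would allow earlier publication."""
        return (self.published_existing - self.publishable_pilot).days

    def validation_saving_days(self) -> float:
        """Positive when the pilot channel needs less validation effort."""
        return self.validation_days_existing - self.validation_days_pilot


# Hypothetical entry; "XX" is a placeholder country code.
record = ReceiverRecord(
    country="XX",
    received_existing=date(2014, 3, 10),
    published_existing=date(2014, 3, 14),
    received_pilot=date(2014, 3, 7),
    publishable_pilot=date(2014, 3, 11),
    validation_days_existing=1.5,
    validation_days_pilot=0.5,
)
print(record.timeliness_gain_days(), record.validation_saving_days())
```

Keeping the existing-channel and pilot-channel figures side by side in the same record allows the benefits and costs to be compared country by country at the nine-month review.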
B. Templates
32.
The TFIDC has agreed on two sets of templates for (i) GDP main aggregates and
population and (ii) sectoral accounts. Regarding GDP main aggregates and population, the
TFIDC agreed on a set of main national accounts aggregates and auxiliary indicators that are
widely used and are available for a large number of countries (see Annex III).
GDP main aggregates and population
33.
The proposed set of 36 indicators presented in the template covers GDP (including statistical
discrepancies), main GDP aggregates relating to output, expenditure and income, as well as data
on gross national income, saving and net lending. These indicators are complemented by
population and employment figures, which are important auxiliary indicators for deriving indicators
per inhabitant or monitoring productivity. The TFIDC has included four key components in the
templates for GDP main aggregates. These are as follows:
• Seasonally adjusted and (original) non-seasonally adjusted data
• Quarterly and annual data
• Nominal and volume measures
• Inclusion of flash estimates (as received from countries)
34.
In terms of the published data, the TFIDC has decided that seasonally adjusted data
submitted by the countries should be used for dissemination. If these seasonally adjusted series
are not available, then the agencies may conduct the seasonal adjustment and transmit these data
to other agencies. However, series adjusted by an agency represent an additional series that
should be clearly presented as an estimate of the relevant international agency, and it would be
the responsibility of that agency to inform the countries concerned of the seasonal
adjustment.
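The rule in this paragraph amounts to a choice of series plus a provenance label. The sketch below is illustrative only: the function names and labels are hypothetical, and the adjustment routine is a stub standing in for whatever seasonal adjustment software an agency actually uses.

```python
from typing import Callable, List, Optional, Tuple


def series_for_dissemination(
    country_sa: Optional[List[float]],
    country_nsa: List[float],
    agency_adjust: Callable[[List[float]], List[float]],
) -> Tuple[List[float], str]:
    """Prefer the country's own seasonally adjusted series; otherwise adjust and flag."""
    if country_sa is not None:
        return country_sa, "country"
    # The agency's own adjustment must be labelled as an agency estimate.
    return agency_adjust(country_nsa), "agency estimate"


def stub_adjust(values: List[float]) -> List[float]:
    """Stand-in only: returns the input unchanged, NOT a real seasonal adjustment."""
    return list(values)


series, source = series_for_dissemination(None, [100.0, 110.0, 90.0, 120.0], stub_adjust)
print(source)
```

Carrying the label alongside the series means that receiving agencies can always distinguish country-submitted data from agency estimates.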
Population Statistics
35.
The TFIDC agreed that the definition of population used should preferably be based on
the national accounts concept of residence, as this definition is more relevant for per
capita measures. While EU countries report population statistics based on the national accounts
concepts, other countries may only have data based on demographic statistics. Therefore, where
the definition of population based on the national accounts concept is not available, countries
may report the data based on the demographic statistics.
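The preference order for the population concept, and its use in a per capita measure, can be illustrated as follows. This is a sketch under simple assumptions: the function name and the figures are hypothetical, and GDP and population are taken to be expressed in absolute units.

```python
from typing import Optional, Tuple


def gdp_per_capita(
    gdp: float,
    population_national_accounts: Optional[float],
    population_demographic: Optional[float],
) -> Tuple[float, str]:
    """Return GDP per inhabitant and the population concept that was used."""
    if population_national_accounts is not None:
        return gdp / population_national_accounts, "national accounts"
    if population_demographic is not None:
        return gdp / population_demographic, "demographic"
    raise ValueError("no population figure available")


# Hypothetical country that reports only demographic population statistics.
value, concept = gdp_per_capita(2_000_000_000.0, None, 4_000_000.0)
print(value, concept)
```

Returning the concept together with the value keeps the underlying definition visible to users who compare per capita figures across countries.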
Sectoral accounts
36.
The templates on sectoral accounts that have been developed by the Working Group on
Sectoral Accounts and endorsed by the IAG will be used for the pilot.³ The distinction between
the minimum and encouraged items stipulated in the templates will not be applied for the pilot
exercise. Therefore, countries are expected to complete the templates as they deem feasible.
37.
The data currently being compiled by countries are already being provided for
publication on the PGI website, and thus to a central location. It is expected that the sector accounts
templates will be applied as of end 2014, and data will be collected using the new SDMX-ML
data structure definitions for national accounts. A formal data cooperation pilot exercise will not
be established until then. The current data cooperation exercise involves the provision of data,
mostly from European countries, to the OECD and IMF. Currently, not all agencies receive these
data from the primary validating/disseminating agencies.

³ The templates are available on the IMF website: http://www.imf.org/external/np/sta/templates/sectacct/index.htm
38.
The TFIDC will focus on the collection of non-seasonally adjusted quarterly data
(seasonally adjusted data may need to be considered in cases where non-seasonally adjusted data
are not available). The agencies will also collect annual data, including from those countries that
compile quarterly sectoral accounts data.
39.
The TFIDC will investigate further whether it is possible to make the currently collected
sectoral accounts data available directly to other participating international agencies.
40.
For the future, the following issues need to be clarified:
• Whether the current cooperation arrangements regarding quarterly sectoral accounts data
  should be extended to include as recipients the other members of the TFIDC;
• Whether the data collection should include annual data. This raises issues of consistency
  between quarterly and annual data;
• Whether the agencies should collect and share seasonally adjusted data; and
• Whether the collection should be expanded beyond the G-20 economies.
C. SDMX Infrastructure in Member Agencies
Review of the Infrastructure
41.
The TFIDC conducted a survey in July 2013 to determine the status of the member agencies’ preparations
regarding their SDMX infrastructure for data and metadata transmission and reception, and the
readiness of their technical environment for exchanging data (see Annex IV for the survey
questionnaire and Annex V for the summary results). The infrastructure assessment follows from the
conclusion of the TFIDC that the pilot should test the agencies’ ability to receive and send
(where appropriate) national accounts data using SDMX-ML with the appropriate 2008 SNA
DSD.
42.
The overall assessment, based on the answers received to the questionnaire, is that the
participating agencies’ technical systems for data exchange are prepared to handle SDMX
formats for data exchange and are generally ready for the pilot exercise. In order to continue with
the next stage of the project, the following common features prevail:

SDMX-ML Compact 2.0 format is the preferred format for the data exchange; the
SDMX-EDI format would pose problems for the OECD, IMF and WB. For the BIS and
the ECB, which started using SDMX-EDI 15 years ago and based their systems on EDI
specifications, the SDMX-ML messages should therefore also be EDI compliant for the
purpose of the pilot exercise (e.g. Key Family/Data Set Identifier information to be
present in the header of the data file as agreed in the BPM6 and 2008 SNA DSD technical
WG), in order to benefit from automated processes;

DSD messages (over MSD messages) are the preferred SDMX messages for processing;

Automated creation, loading and reception of SDMX messages is preferred;

Current/existing/testing environment systems will be used;

Automated data transmission via SDMX messages is preferred;

Manual intervention, i.e. data exchange via e-mail, is to be used only as a last resort;

The pilot phase is intended to test data exchanges and synchronization of data vintages
between IOs; as such, the exchanged data are experimental. The data cannot be publicly
disseminated with the 2008 SNA DSD until the latter is officially used for data
transmission.
43.
Differences in the level of systems development and usage include:

New SDMX artifacts like hierarchical code lists cannot be handled by all participating
organizations;

Use of the SDMX-NA Sandbox versus existing systems;

The SDMX-EDI format is not supported by all organizations (see also the first common
feature above).
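The EDI-compliance requirement on the message header can be illustrated with a small check. The following is a minimal sketch, not the agreed validation procedure: it assumes that the Key Family and Data Set Identifier information is carried in header elements named KeyFamilyRef and DataSetID (as in the SDMX-ML 2.0 message header), and the sample message, its namespace URI and the identifier NA_MAIN are illustrative only.

```python
import xml.etree.ElementTree as ET

# Header fields treated here as required for an "EDI compliant" message.
# Assumption: the exact set agreed by the technical WG is not listed in this
# report; these two names are taken from the SDMX-ML 2.0 header structure.
REQUIRED_HEADER_FIELDS = ("KeyFamilyRef", "DataSetID")


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix: '{uri}Header' -> 'Header'."""
    return tag.rsplit("}", 1)[-1]


def missing_header_fields(xml_text: str) -> list[str]:
    """Return the required header fields that are absent or empty."""
    root = ET.fromstring(xml_text)
    header = next((el for el in root if local_name(el.tag) == "Header"), None)
    if header is None:
        return list(REQUIRED_HEADER_FIELDS)
    present = {
        local_name(child.tag)
        for child in header
        if (child.text or "").strip()
    }
    return [name for name in REQUIRED_HEADER_FIELDS if name not in present]


# Hypothetical message fragment: the header names a DSD but no data set ID.
SAMPLE = """<CompactData xmlns="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message">
  <Header>
    <ID>PILOT_TEST_1</ID>
    <KeyFamilyRef>NA_MAIN</KeyFamilyRef>
  </Header>
</CompactData>"""

print(missing_header_fields(SAMPLE))  # ['DataSetID']
```

A receiving agency whose loading process relies on these header fields could run such a check before attempting automated loading and route incomplete files to manual handling.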
SDMX transmission format
Table 1 highlights the results of the survey on SDMX transmission formats.
Table 1: Summary of SDMX Transmission Format
Agencies covered (rows): Eurostat, IMF, OECD, ECB, WB, BIS
Survey items (columns):
Messaging format (Q2): SDMX-ML 2.0; SDMX-ML 2.1; SDMX-EDI
Type of messages (Q3): DSD; MSD
System relies on SDMX-EDI dependences (Q4)
New SDMX artefacts (Q5): Hierarchical code lists; Attributes at any group level
Creation of SDMX messages from database (Q6)
Automatic data loading (Q7)
Reception of SDMX messages (Q8)
Monitoring of receipt of SDMX messages/post data during the pilot (Q9)
Automatic transmission channels in place on the dissemination side (Q10)
Cell annotations:
(only compact): OECD, ECB, WB
(not preferred)
(with extra effort)
(only structure specific)
(currently not used; if needed will be built upon)
(no data post on the web)
(possible data post on .Stat)
(EDAMIS)
(secured channel with OECD)
(only for SDMX-EDI and EDI compliant ML messages)
(direct data exchange with BIS, EUROSTAT and IMF; with OECD via Eurostat; secure e-mail with UN and WB)
(for most data)
(internal mapping may be needed)
(under evaluation)
This table highlights the following aspects:

The transmission format used in the pilot should be SDMX-ML 2.0 Compact, SDMX-EDI compliant, provided that all agencies manage to implement EDI compliance in due course;

The pilot should involve sharing of data, not metadata;

Hierarchical code lists and attributes at group level should be avoided.

Transmission methods
44.
Based on the review, several possible transmission methods have emerged as described
below.
Option 1: Agency to agency transmission of data using existing infrastructure: All
six responding agencies consider this feasible, provided the SDMX-ML 2.0 format is used
[ideally EDI compliant ML messages]. This would allow data coordination involving the
transmission of data using existing infrastructure from the sending agencies to the receiving
agencies directly, without manual intervention and without intermediate data storage.
Agencies that are equipped for automated transmission and reception could make use of
their existing infrastructure, and actual turnaround times could be validated.
Option 2: Use of the technical pilot sandbox: Four out of five agencies consider this
option viable. The ECB is the only agency that does not support this option (the BIS did
not express an opinion). The main advantage of using the technical pilot sandbox is that it
would provide a central hub for data cooperation. The main disadvantages associated
with its use are that it currently does not support automatic notifications upon arrival of
data and that data files need to be manually downloaded.
Option 3: Use of EDAMIS: EDAMIS is an application implementing the Single Entry
Point policy of Eurostat. A specific dataflow could be created for the pilot and any
participating agencies could be set up as sender/receiver. EDAMIS would provide a
central dispatch place for data cooperation. It would allow the data to be transmitted
securely and would automatically monitor the reception and re-dispatch of data, including
any small delays. Receivers could also choose in which form they want to receive the
data, including email. The main drawback of EDAMIS is that not all agencies are
familiar with it. However, Eurostat will configure EDAMIS accordingly and provide a
short webinar session to explain how it can be used in the context of the pilot.
Option 4: Email exchange: All agencies mention email exchange as a possible data
cooperation alternative, although none rated it as the preferred option. Under this
solution, each participating organization would send the pilot data through encrypted
emails to all other agencies. The main advantage of this option is its simplicity. It would
however lead to an overall increase in manual work compared to the previous option.
Moreover, this option does not contemplate any centralized hub, which would make data
cooperation more error-prone (due to e.g. inadvertent omissions of recipients in the
mailing lists).
45.
The TFIDC concluded that for the GDP pilot the EDAMIS option would be
implemented. Should there be problems with longer system delays, the fallback solution
considered would be the use of an email exchange.
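The decision rule in this conclusion can be sketched in a few lines. This is only an illustration under an explicit assumption: the report does not quantify what counts as a "longer system delay", so the 24-hour threshold below is hypothetical, as are the names used.

```python
from enum import Enum


class Channel(Enum):
    EDAMIS = "EDAMIS"
    ENCRYPTED_EMAIL = "encrypted e-mail"


# Hypothetical threshold: the report does not define "longer system delays".
MAX_ACCEPTABLE_DELAY_HOURS = 24.0


def choose_channel(observed_edamis_delay_hours: float) -> Channel:
    """Prefer EDAMIS; fall back to e-mail when its delay exceeds the threshold."""
    if observed_edamis_delay_hours <= MAX_ACCEPTABLE_DELAY_HOURS:
        return Channel.EDAMIS
    return Channel.ENCRYPTED_EMAIL


print(choose_channel(2.0).value)   # EDAMIS
print(choose_channel(48.0).value)  # encrypted e-mail
```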
Implications for Pilot Exercises
46.
Taking into account the current state of progress of SDMX implementation in Eurostat
and the other international agencies, Eurostat has prepared for a transmission of the selected list
of GDP and population data building on its public dissemination database (Eurobase).
47.
For this purpose, the list of identified indicators, units, frequencies and adjustment
methods were mapped to respective series in Eurobase, which are updated twice per day
following the validation of data in the internal production database, and a correspondence to the
SDMX DSD has been prepared.
48.
The files will be pushed via EDAMIS as a preferred option. While an initial transfer
would encompass all series (starting in 1995), the extraction of updates could be limited to
concerned countries.
49.
The OECD is preparing for a transmission of the selected list of GDP and population data
built on OECD.Stat. A correspondence between the DSDs representing the native structure of the
OECD dissemination data warehouse and the agreed DSD for National accounts is to be
prepared.
50.
Assuming that the data messages are SDMX-ML 2.0 Compact EDI compliant and
Eurostat sends the files via EDAMIS, ECB reception and loading of the data in the ECB test
system could be automated. With the OECD, secured e-mail could be used after an exchange of
PGP/GPG keys. If neither EDAMIS nor secured e-mail is used, manual intervention will be needed
on reception to load the data into the ECB reception database. Manual
intervention will also be required if the SDMX-ML data messages are not EDI compliant (see
above). Once the data are in the reception database they can be moved into production and
disseminated in the ECB test environment.
51.
Regarding the sending of the quarterly sector financial accounts, the ECB foresees two
steps, in both cases using its own tools. First, the test transmission planned for January 2014 will
require mapping to the new DSD series keys; the ECB test environment will be used and the data
will be sent via the agreed and currently established transmission channels to all agencies. The
second transmission, scheduled for November 2014, will take place after the official go-live of the
new 2008 SNA DSD. The ECB production system will therefore be used, no mapping will be
required by then, and data will be sent automatically via the agreed and currently established
transmission channels to all agencies. The published quarterly sectoral financial accounts data
will also be accessible at that time via the ECB SDW site and via web services.
V. CURRENT WORKFLOW PRACTICES AND WORKFLOW FOR THE PILOTS
52.
The TFIDC will examine the data cooperation arrangements. This includes arrangements
relating to the collection of data from the countries, the transmission of the data to the
international agencies, and the work flows among the international agencies with respect to the
sharing of consistent data on a timely basis. The TFIDC will also review the workflow practices
relating to the flow of information from the national (compiling) agencies to the international
agencies.
A. Data transmission from country to international agencies
53.
The review of workflow practices illustrated a range of methods, timelines, and coverage
for the transmission of country data to international agencies (see Annex VI). In addition, the
transmission schedules vary among international agencies as the schedules are designed to meet
the re-dissemination timetables of the agencies. The pilot exercises will need to determine a
suitable schedule for the transmission of data from national agencies to the collecting
international agencies that will satisfy the timetables of all receiving international agencies. This
implies that, among all the agencies, the strictest schedule currently in place may have to be
followed. Further, the validation process should also satisfy the timetables of the receiving
agencies.
54.
The pilot exercises will also need to determine which national agencies would be
responsible for submitting the data.
B. Sharing of validated data among international agencies
55.
For many years, international agencies have been exchanging national accounts data with
each other, to respond to user needs. Following are some examples of current data cooperation:
• Eurostat submits national accounts data for most EU countries to the IMF in a bulk file on
a monthly basis.
• The OECD provides quarterly national accounts data to all six other institutions once a
month (at the time of the press releases).
• UNSD receives data from the OECD, the UNECE and CARICOM. The UNECE provides
data for transition economies.
• The World Bank receives national accounts data for selected high-income economies from
the OECD and most of the population data from UNPD.
56.
Furthermore, in relation to the G20 Data Gaps Initiative, ECB, Eurostat and the OECD
started, from 2012 onwards, providing selected datasets on an ongoing basis for the PGI website
hosted by the IMF.
57.
More recently, Eurostat and the OECD signed a data exchange protocol for national
accounts data. This agreement describes the scope of data exchanges and data validation
arrangements4. The protocol includes more detailed validation requirements and maximum
delays for processing and validation, which could also serve as a basis for the pilot data
cooperation exercise at hand.
4
http://epp.eurostat.ec.europa.eu/portal/page/portal/national_accounts/documents/MoU%20OECD%20NA%202013.pdf
Workflow arrangements for the pilots
58.
Below the workflows to be followed are given separately for sender and receiver
agencies.
Workflow for data sender
Steps, in order: data collection; data validation; creation of SDMX file; pushing of data; documenting
Legend: Outside scope of pilots / Inside scope of pilots
data collection
• collection of data from countries
• the sender IO’s procedures in place are used
data validation
• data validation respecting minimum standards
creation of SDMX file
• creation of the agreed SDMX message
pushing of data
• transmission of the agreed SDMX message
documenting
• date and time of reception of data from countries
• person days used for validation
• date and time of pushing of the file
Workflow for data receiver
Steps, in order: data reception; data validation; data dissemination; documenting
data reception
• reception of the SDMX file from data sender
data validation
• data validation using own procedures in place
data dissemination
• note: during the pilots the data will not be made public
documenting
• date and time of reception of SDMX file (EDAMIS will automatically record)
• person days used for validation of the pilot data
• date and time of when the pilot data could be loaded into the own public
dissemination environment (note: during the pilots the data will not be made
public)
• date and time of reception of data from countries via the own collection system,
if applicable
• person days used for validation of the data collected via the own system, if
applicable
• date and time of dissemination of the data collected via the own system in the
public environment
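The items a receiving agency is asked to document lend themselves to a simple record from which delays can be derived. The sketch below is one possible way to hold them; the class and field names are hypothetical and are not prescribed by the TFIDC.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ReceiverLogEntry:
    """One documented reception of pilot data (field names are illustrative)."""
    dataset: str
    sdmx_file_received: datetime          # recorded automatically by EDAMIS
    ready_for_dissemination: datetime     # when the data could have been loaded
    validation_person_days: float
    own_collection_disseminated: Optional[datetime] = None  # if applicable

    def processing_delay(self) -> timedelta:
        """Time between receipt of the SDMX file and readiness to disseminate."""
        return self.ready_for_dissemination - self.sdmx_file_received

    def gain_over_own_collection(self) -> Optional[timedelta]:
        """How much earlier the pilot data were ready than own-collected data."""
        if self.own_collection_disseminated is None:
            return None
        return self.own_collection_disseminated - self.ready_for_dissemination


entry = ReceiverLogEntry(
    dataset="GDP main aggregates",
    sdmx_file_received=datetime(2013, 12, 4, 11, 0),
    ready_for_dissemination=datetime(2013, 12, 4, 16, 0),
    validation_person_days=0.5,
    own_collection_disseminated=datetime(2013, 12, 6, 16, 0),
)
print(entry.processing_delay())          # 5:00:00
print(entry.gain_over_own_collection())  # 2 days, 0:00:00
```

Comparing the two derived delays across agencies is one way to judge whether the pilots make consistent data available earlier to users.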
Procedures and validation rules
59.
International agencies will disseminate the data only once the validation process is
complete. The validation process will be based on rules and procedures which have been
identified as good practice by the ECB-Eurostat-OECD technical working group in 2012 and
agreed for the Eurostat-OECD protocol.
60.
The agreed minimum checking rules (Annex VII) to validate national accounts data
include basic format, structure, encoding and content tests, which are essential for automatic
processing, as well as revision checks and consistency checks within and between transmitted
national accounts datasets, which are essential for users’ perception of the quality of national
accounts data. Further checks (e.g. statistical and economic plausibility checks or cross-checks
against related data) are also considered useful for detecting problems in the transmission or
comparability of data and should be run occasionally or systematically depending on the dataset
in question. While focusing on checks that can be processed and interpreted in a relatively
automatic way will help to ensure that data fulfilling minimum quality requirements can be
published in a timely way, it is important that data specificities which affect validation checks
are clarified with national authorities and documented as metadata.
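Two of the check types mentioned above, revision checks and consistency checks, can be sketched as follows. This is an illustration only: the agreed rules are those of Annex VII, the 1 percent tolerance is hypothetical, and the consistency check assumes the standard expenditure identity (GDP equals final consumption plus gross capital formation plus the external balance, plus the statistical discrepancy) with the indicator codes of Annex III.

```python
def revision_flags(previous: dict, current: dict, tolerance_pct: float = 1.0) -> list:
    """Return periods whose value was revised by more than tolerance_pct percent."""
    flagged = []
    for period, old in previous.items():
        new = current.get(period)
        if new is None or old == 0:
            continue  # nothing to compare, or relative revision undefined
        if abs(new - old) / abs(old) * 100 > tolerance_pct:
            flagged.append(period)
    return flagged


def expenditure_identity_gap(obs: dict) -> float:
    """GDP minus the sum of its expenditure components (0 when consistent).

    Assumption: the statistical discrepancy YA0 enters with a positive sign.
    """
    components = obs["P3"] + obs["P5g"] + obs["B11"] + obs.get("YA0", 0.0)
    return obs["B1gQ"] - components


# Made-up figures for illustration.
previous = {"2013-Q1": 100.0, "2013-Q2": 102.0}
current = {"2013-Q1": 100.5, "2013-Q2": 105.0}
print(revision_flags(previous, current))  # ['2013-Q2']

obs = {"B1gQ": 1000.0, "P3": 750.0, "P5g": 200.0, "B11": 45.0, "YA0": 5.0}
print(expenditure_identity_gap(obs))  # 0.0
```

Checks of this kind can run automatically; flagged periods or non-zero gaps would then be clarified with the national authorities and documented as metadata.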
Timetables for Pilot Exercises
61.
In terms of the timetables, data received from national sources should preferably be
validated by the responsible international organization with the shortest possible delay. One of
the objectives of the data cooperation pilot is indeed to monitor these delays in practice.
62.
Based on the timeliness agreed in the Eurostat-OECD data exchange protocol, the sharing
of validated data is expected to be as follows:

Validated data for main aggregates and population will be shared:
• within three days for quarterly data (one working day for larger economies)
• within one week for annual data

Validated data for sector accounts will be shared:
• within one to two weeks for quarterly data
• within one month for annual data
63.
These time lags are counted from the moment the data-providing international agencies
receive data from the relevant national agency. The pilots will address timeliness issues and will
propose feasible timelines for data sharing.
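The time lags in paragraph 62 can be turned into a deadline calculation. The sketch below makes several assumptions that the report leaves open: "three days" and "one week" are counted in calendar days, "one to two weeks" is taken at its upper bound, "one month" is approximated as 30 days, and working days exclude only Saturdays and Sundays.

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Advance by a number of working days, skipping Saturdays and Sundays."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days -= 1
    return current


def sharing_deadline(received: date, dataset: str, frequency: str,
                     larger_economy: bool = False) -> date:
    """Latest date for sharing validated data, counted from receipt."""
    if dataset == "main_aggregates":
        if frequency == "Q":
            if larger_economy:
                return add_working_days(received, 1)
            return received + timedelta(days=3)
        return received + timedelta(days=7)
    if dataset == "sector_accounts":
        if frequency == "Q":
            return received + timedelta(days=14)  # upper bound of 1-2 weeks
        return received + timedelta(days=30)      # "one month", approximated
    raise ValueError(f"unknown dataset: {dataset}")


received = date(2013, 11, 15)  # a Friday
print(sharing_deadline(received, "main_aggregates", "Q", larger_economy=True))
print(sharing_deadline(received, "sector_accounts", "A"))
```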
Timetable for Pilot Exercise on GDP Aggregates and Population
The following timetable is proposed for the first data exchanges. In total the pilot will take 9
months. Eurostat will start pushing data according to the following timetable:
Test with dummy or old data                        26 September 2013
Initial data dissemination (up to Q2 2013):        7 October 2013
a. complete national dataset
b. aggregates
Continuous transmission of new and updated data    as of 07 October 2013
Q3 2013: GDP flash growth rate                     14 November
Q3 2013: aggregates                                04 December 2013
Q3 2013: full transmission                         09 January 2014
Q4 2013: GDP flash growth rate                     15 February 2014 (to be confirmed)
Report on status of pilot exercise                 July 2014
OECD and IMF will follow the same timetable. However, OECD will start in late
October and IMF in November 2013, in order to allow for some technical adaptations in
their own systems to receive and push data according to the agreed DSDs.
Timetable for Pilot Exercise on Sectoral Accounts
64.
The following tentative timetable is proposed for the first data exchanges:
Test of System with dummy or old data                  February 2014
Initial data dissemination (up to Q2 2014)             end-November 2014
Q2 2014: Sectoral Accounts and Balance Sheets          end-November 2014
Annual 2013: Sectoral Accounts and Balance Sheets      end-November 2014
Report on the pilot exercise and implementation        February 2015
65.
The timetables presented above may be adjusted where necessary as the situation
warrants and based on the first experiences with the pilot on GDP main aggregates5. A brief
status of the pilots could be included in the ISWGNA report to the 2014 meeting of the UN
Statistical Commission. A side event may be considered at the time of 2014 UNSC meeting.
5
For instance, whether the dissemination of sector accounts from Eurostat can be organized based on the automatic
extractions from the dissemination database (Eurobase), independently of the production database.
ANNEX I: MEMBERS OF THE TFIDC
Bank for International Settlements
Christian Dembiermont
European Central Bank
Tjeerd Jellema
Statistical Office of the European Communities
Silke Stapel-Weber (Co-Chair)
August Götzfried
Christine Gerstberger
Daniel Suranyi
International Monetary Fund
Manik Shrestha (Co-Chair)
Thomas Alexander
Olga Laveda
Gangti Zhu
Organization for Economic Co-operation and Development
Peter van de Ven
Jennifer Ribarsky
Rachida Dkhissi
Gyorgy Gyomai
United Nations
Herman Smith
World Bank
Ibo Levent
ANNEX II: COUNTRY COVERAGE FOR THE PILOT EXERCISE (BY GROUP OF COUNTRIES) 6
Pilot 1: GDP and Population
BIS
ECB
ESTAT
IMF
OECD
UN
WB
UN
WB
Sender
Country data
European Union members
X
EU-candidate countries
X
EFTA-members
X
Non-EU and non-EFTA OECD members
X
OECD key partners and accession countries
X
Other G20 countries (Argentina, Saudi Arabia)
X
X
Other countries (IMF members)
Aggregates
EU-28
X
EUR-17
X
OECD
X
G20
X
World
X
Pilot 2: Sector accounts
BIS
ECB7
ESTAT
IMF
OECD
Sender
Country data
European Union members
X
X
EU-candidate countries
X
EFTA-members
X
Non-EU and non-EFTA OECD members
X
OECD key partners and accession countries
X
Other G20 countries (Argentina, Saudi Arabia)
X
Aggregates
EU-28
EUR-17
X
X
X
OECD
X
G20
X
World
6
Data are not available for all countries and all indicators; i.e., the tables do not represent actual data availability but
rather the allocation of collection/validation responsibilities for the pilot. Transmission will depend on availability
from countries.
7
Quarterly sectoral financial positions and flows.
Overview of division of work for data transmissions for the Pilots 1 and 2
Countries marked in bold will also be covered by sector accounts (Pilot 2)
Senders (column headings): Eurostat or ECB; Eurostat; OECD; IMF

European Union (EU28) 8
Belgium (BE), Bulgaria* (BG), Czech Republic (CZ), Denmark (DK), Germany (DE), Estonia* (EE), Ireland (IE), Greece (GR), Spain (ES), France (FR), Croatia** (HR), Italy (IT), Cyprus (CY), Latvia* (LV), Lithuania* (LT), Luxembourg* (LU), Hungary* (HU), Malta* (MT), Netherlands (NL), Austria (AT), Poland (PL), Portugal (PT), Romania (RO), Slovenia (SI), Slovakia* (SK), Finland (FI), Sweden (SE), United Kingdom (UK)

EU-candidates 9
FYROM (MC), Iceland (IS), Montenegro (ME), Serbia (RS), Turkey (TR)

EFTA-Members
Liechtenstein (LI), Norway* (NO), Switzerland (CH)

Non-EU and non-EFTA OECD members
Australia (AU), Canada (CA), Chile (CL), Israel (IL), Japan (JP), Korea, Republic of (KR), Mexico (MX), New Zealand (NZ), USA (US)

OECD key partners and accession countries
Brazil (BR), China (CN), India (IN), Indonesia (ID), Colombia (CO), Russian Federation (RU), South Africa (ZA)

Other G20 countries
Argentina (AR), Saudi Arabia (SA)

Other countries (priority)
Egypt (EG), Hong Kong (HK), Malaysia (MY), Singapore (SG), Thailand (TH), Taiwan, PC (TW), Ukraine (UA)

Other countries (available)
All other countries
8
Quarterly non-financial accounts: * partial coverage, only S13 and S2; ** expected.
9
Annual non-financial accounts; Iceland: only limited financial accounts data; Montenegro: only non-financial
accounts data from 2013; Serbia: confidential data.
ANNEX III: TEMPLATE FOR GDP MAIN AGGREGATES AND POPULATION
Proposed selection of indicators for IAG Pilot project 1:
If all indicators in Bold are transmitted; indicators in italic could, in principle, be derived;
more precise information on preferred concepts, prices or units are provided as footnotes.
Dimensions of the template (column headings):
Frequency: Annual (A); Quarterly (Q)
Seasonal adjustment of quarterly series: no (1) (N); yes (2) (Y or S)
Prices (3): Current prices (V); Volumes (L)
Units (5), population and employment only: Persons (PERS); Hours worked (HR)

I/ Main GDP aggregates
Unit: national currencies (millions)
Code          Indicator
B1gQ          Gross domestic product at market prices
YA0           Statistical discrepancy (expenditure approach)
YA1           Statistical discrepancy (production approach)
YA2           Statistical discrepancy (income approach)
Main GDP aggregates from the output side
P1            Output
P2            Intermediate consumption
B1g           Gross value added at basic prices
D21XD31       Taxes less subsidies on products
Main GDP aggregates from the expenditure side
P3            Total final consumption expenditure
P3_S13        Government final consumption expenditure
P3_S14&15     Household and NPISH final consumption expenditure
P41           Actual individual consumption
P5g           Gross capital formation
P51g          Gross fixed capital formation
P52&P53       Changes in inventories and acquisition less disposals of valuables
P6            Exports of goods (fob) and services
P7            Imports of goods (fob) and services
B11           External balance of goods and services
Main GDP aggregates from the income side
D1            Compensation of employees
D11           Gross wages and salaries
B2g&B3g       Operating surplus and mixed income, gross
D2X3          Taxes less subsidies on production
Gross national income, saving and net lending
(D1_D4)       Net primary income from RoW
B5G           Gross national income at market prices
K1            Consumption of fixed capital
B5N           Net national income at market prices
(D5_D6_D7)    Net current transfers from RoW
B6N           Disposable income, net
D8            Adjustments for change in pension entitlements
B8N           National saving, net
D9            Net capital transfers from RoW
K2            (+/-) non-produced, non-financial assets
B9            Net lending or net borrowing of the nation

II/ Population and Employment (4)
Unit: 1000
Code          Indicator
POP           Total population
EMP_DC        Employment, domestic concept
SAL_DC        Employees, domestic concept
SELF_DC       Self employed, domestic concept
(1) Neither seasonally nor working day adjusted
(2) Working day and seasonally adjusted (Y) or seasonally adjusted, not working day adjusted (S);
if seasonal adjustment is done by the International Organisation, a flag (I) should be added.
(3) GDP volume data should preferably be provided as chain-linked volumes (L) or previous year prices (Y); if these are not available
constant prices (Q) can also be used.
(4) Population and employment are considered auxiliary variables in national accounts, aimed to calculate ratios like value added,
output, or labour costs per inhabitant or per employed person. If possible, population data should therefore refer to data compiled for
national accounts rather than demographic statistics, i.e. representing averages for the observation period and including adjustments
(e.g. for students, immigration, armed forces...). Employment should refer to employment in resident production units irrespective of
the place of residence of the employed person (i.e. domestic concept, DC) rather than resident persons in employment (i.e. the so-called national concept of employment, NC) as this concept is more appropriate when examining employment in relation to GDP (the
difference being mainly the net number of cross-border workers).
(5) If persons (PERS) and hours worked (HR) are not available, Jobs (JOB) or Full-time-equivalents (FTE) could be provided.
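The preference stated in note (3) for volume data can be expressed as a small selection rule. The sketch below is illustrative: it assumes that chain-linked volumes (L) are tried before previous year prices (Y), an ordering the note itself does not impose, with constant prices (Q) as the last resort. The function name and sample values are hypothetical.

```python
# Order in which volume measures are tried, following note (3).
# Assumption: L is tried before Y; the note treats both as preferred.
VOLUME_PREFERENCE = ("L", "Y", "Q")


def preferred_volume_series(available: dict) -> tuple:
    """Return (price code, series) for the most preferred volume measure."""
    for code in VOLUME_PREFERENCE:
        if code in available:
            return code, available[code]
    raise LookupError("no volume measure (L, Y or Q) available")


# A country that provides only previous year prices and constant prices.
series = {"Y": [101.2, 102.0], "Q": [100.9, 101.7]}
code, values = preferred_volume_series(series)
print(code)  # Y
```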
ANNEX IV: SDMX SURVEY QUESTIONNAIRE
June 20, 2013
Task Force on “International Data Cooperation”
Draft Questionnaire Infrastructure
Data cooperation Pilot Exercise. September 2013 – March 2014.
Introduction
The TFIDC agreed to conduct a data cooperation pilot with the following participating
international agencies
1. BIS
2. ECB
3. Eurostat
4. OECD
5. IMF
6. United Nations
7. World Bank
The pilot exercise will encompass the transmission of data messages with regards to the agreed
template for GDP main aggregates and population statistics; the data transmission will make
use of SDMX-ML messages using the now nearly finalized DSD for national accounts.
The scope of the data transmission will encompass a list of countries for which it is established
that quarterly main aggregates are regularly produced and disseminated. This list, it has been
established, is a super set of the data currently collected by the OECD and Eurostat, which
implies that for several countries a primary validator/disseminator IO needs to be assigned.
This questionnaire provides for a fact finding for the transmission and reception of SDMX type
of messages. A concrete follow up could involve a meeting or teleconference of subject matter
specialists to establish the precise parameters of the pilot data transmissions under data
cooperation. An early opportunity may be at the margin of the Eurostat workshop 10-11 June
2013.
Questionnaire
1. SDMX-NA DSD Pilot test participation.
Are you/is your organization actively participating in the technical testing of the
SDMX-NA, in May – July 2013?
a. Will you obtain the SDMX specifications from the Technical Pilot Sandbox?
b. Will you formulate and send test SDMX messages to the technical pilot sandbox?
c. Would you like the possibility to download SDMX messages from the technical
pilot sandbox?
2. Which SDMX messaging formats are you able to process?
a. SDMX-ML V2.0, which format: compact, generic?
b. SDMX-ML V2.1, which format: structure specific, generic
c. SDMX-EDI / GESMES/TS
3. Which SDMX messages are you able to process in your system?
MSD?
DSD?
4. Is your current system in terms of SDMX messages handling and data/metadata storage
relying on features/concepts used in SDMX-EDI, e.g. use of sibling and/or dataset
attachment level attribute for specific attributes across DSDs, mandatory OBS_STATUS,
etc?
5. Is your current system able to handle new SDMX artifacts like hierarchical code lists, to
handle attributes at any group level, etc? Do you intend to use those features? If yes, by
when will this be implemented? Please specify.
6. Creation of SDMX messages as of September 2013
a. You have the necessary infrastructure in place to create from your database
system SDMX messages, in all SDMX formats you mentioned above (question
2), for any DSDs. Please specify.
b. You will need to formulate SDMX type of messages manually (e.g. through
extracting data manually and filling a custom prepared Excel workbook). Please
specify.
c. Any other means of forming SDMX messages, namely.
d. No infrastructure present.
7. Loading of SDMX messages in your database system
a. You have the necessary infrastructure in place to automatically load SDMX
messages in your database system, in all SDMX formats you mentioned above
(question 2). Please specify.
b. You will need to rely on a manual loading of the data in your database.
8. Reception of SDMX messages as of September 2013
a. Do you have the necessary infrastructure in place to receive SDMX messages
from all participating agencies in all SDMX formats you mentioned above
(question 2). Please specify.
b. You will need to receive the SDMX-ML messages by e-mail
9. Monitoring of receipt of SDMX messages and upload of the data on your website (in
order to be able to see if the pilots are a success in terms of more consistent data being
available earlier to users)
a. What would be your system for systematically monitoring the receipt of the
SDMX messages?
b. What systems do you have in place to monitor the date and time of the upload of
the data onto your website?
c. What is your suggestion how we share this information?
10. Transmission channels in place on the dissemination side. Can you support any of the
following?
a. Any type of SDMX messages attached in an E-mail: secured with encryption and
signature or not?, requires or not manual intervention in preparing e-mails?, with
all data exchange partners?, on both reception and dissemination?) Please specify.
b. E-mail: automatically generate and process SDMX files and emails for data
exchange
c. Any type of SDMX-Messages exchanged via an automated process, i.e. via
dedicated data exchange infrastructure between institutions, including all data
exchange partners? and on both reception and dissemination? Please specify.
11. Any other relevant observations / remarks / limitations in your system that would affect
the success of the data cooperation pilot.
ANNEX V: SUMMARY OF SDMX SURVEY RESULTS
As of 12 August 2013, only one participating international organization (the United Nations) has
not provided input to the survey; all other participants’ replies have been taken into account
(BIS, ECB, Eurostat, IMF, OECD and the World Bank). Most of the organizations have
provided exhaustive replies to the questionnaire, which also helped in evaluating the
preparedness status of the project.
Q1: SDMX-NA DSD Pilot test participation. Are you/is your organization actively
participating in the technical testing of the SDMX-NA, in May – July 2013?
All responding international agencies (with the exception of the BIS) confirmed their
participation in the SDMX-national accounts DSD Pilot test.
a. The ECB, Eurostat, IMF and OECD confirmed that they can obtain the SDMX
specifications from the Technical Pilot Sandbox. The World Bank said that it will be
monitoring the progress of 2008 SNA through the SDMX Secretariat.
b. Eurostat, OECD and IMF will formulate and send test SDMX messages to the technical
pilot sandbox. The World Bank will use the technical pilot sandbox when it is ready; an
initial analysis of the NA DSD structure and their internal data structure is needed
followed by mapping of the two. The ECB will not use the 2008 SNA fusion registry or
the sandbox but will use its data exchange infrastructure and statistical environment.
c. The OECD, IMF and the World Bank would like to download SDMX messages from the
technical pilot sandbox. The ECB and Eurostat will not use this approach. The ECB will
set up its system test environment to generate SDMX-ML compact 2.0 and structure
specific 2.1 in addition to SDMX-EDI. Eurostat will rely on the “push” approach but will still
have the technical possibility to download SDMX messages from the technical pilot
sandbox.
Q2: Which SDMX messaging formats are you able to process?
a. SDMX-ML Compact 2.0 format is the preferred format for exchange by all agencies.
SDMX-ML Generic 2.0 format is used by Eurostat and the IMF, and is currently being developed by the
OECD.
b. The ECB and the BIS can process SDMX-ML Structure Specific 2.1 format. The World
Bank is evaluating its infrastructure to process the SDMX-ML V2.1 messages. No reply
from the OECD.
c. The ECB and the BIS are able to process SDMX-EDI for this pilot exercise. For Eurostat
this format is also possible but would require manual intervention. The OECD, IMF and
the World Bank do not support SDMX-EDI (GESMES/TS) technology and the latter two
agencies have no plans to provide support to that format.
Q3: Which SDMX messages are you able to process in your system?
All IOs can process DSD messages.
Eurostat can process MSD messages. The IMF and the World Bank would need further
investigation to implement them; they are currently not used by the World Bank. The ECB and
the BIS do not currently process MSD messages; reference metadata are defined at the level of
the DSD.
Q4: Is your current system in terms of SDMX messages handling and data/metadata
storage relying on features/concepts used in SDMX-EDI, e.g. use of sibling and/or dataset
attachment level attribute for specific attributes across DSDs, mandatory OBS_STATUS,
etc?
The BIS and the ECB systems rely on features/concepts used in SDMX-EDI (use of sibling
and/or dataset attachment level attribute for specific attributes across DSDs, mandatory
OBS_STATUS). Eurostat, OECD, IMF and the World Bank do not.
Q5: Is your current system able to handle new SDMX artefacts like hierarchical code lists,
to handle attributes at any group level, etc? Do you intend to use those features? If yes, by
when will this be implemented? Please specify.
Only Eurostat indicated that it can handle new SDMX artefacts such as hierarchical code lists,
if used with SDMX 2.0. The other IOs mentioned no definite commitment to implementation.
Q6: Creation of SDMX messages as of September 2013
a. All agencies in principle have the infrastructure in place to create SDMX messages from
their database systems. The common SDMX format is SDMX-ML v2.0 compact and generic
format as well as DSD based on the underlying dataset structure (IMF), DSD in the
native structure of the dissemination data warehouse (OECD), or the test environment (the
ECB, BIS). Additional mapping will be required for Eurostat and the OECD.
b. No agency will need to formulate SDMX type of messages manually (e.g. through
extracting data manually and filling a custom prepared Excel workbook) as in all cases
these processes are automated.
c. Eurostat replied that if needed SDMX Reference Infrastructure or SDMX Converter may
be used as any other means of forming SDMX messages.
d. The necessary infrastructure is present in all agencies.
Q7: Loading of SDMX messages in your database system
a. All agencies have the necessary infrastructure in place to automatically load SDMX
messages in their database systems, in all SDMX formats they support. The most
common format is SDMX-ML V2.0. The BIS and the ECB also support the SDMX-EDI format
(EDI or EDI compatible ML). OECD supported data sources are Web Service (pull
method) and file (push method).
b. Agencies do not need to rely on manual loading of the data into their databases. Eurostat
may need partly manual intervention, depending on the transmission format and the
means of transmission. The World Bank loads data manually from SDMX-ML into its
system for some datasets.
Q8: Reception of SDMX messages as of September 2013
a. All agencies have the necessary infrastructure in place to receive SDMX messages from
all participating agencies in all SDMX formats they support. BIS uses file exchange
based on Web or Java application. The ECB uses direct data exchanges with BIS,
Eurostat and IMF; data exchanges with the OECD are generally done via Eurostat
EDAMIS; there is no exchange with the UN and the World Bank, so it is foreseen to set up
infrastructure for the exchange of secured e-mails that would contain an SDMX data
message (PGP-GPG encrypted and signed message). The most common format is
SDMX-ML V2.0. The BIS and the ECB also support the SDMX-EDI format (an extension for
SDMX-ML data files should be available in the near future for the ECB). OECD supported
data sources are Web Service (pull method) and file (push method). The World Bank can
receive SDMX messages but will need to map them to its own data structure.
b. The need to receive the SDMX-ML messages by e-mail would, for the agencies, depend
on the data exchange partner. This is not seen as a preferred method but more as last
resort because manual treatment would be required in such cases.
Q9: Monitoring of receipt of SDMX messages and upload of the data on your website (in
order to be able to see if the pilots are a success in terms of more consistent data being
available earlier to users)
a. Systems for systematically monitoring the receipt of SDMX messages differ between
agencies, but agencies are basically using their current systems for monitoring data
reception. The ECB will rely on its current data exchange infrastructure, including a Web UI,
to monitor the data exchange in the SNA context as for any other data flow. Eurostat will use
manual monitoring if the reception is by email and automatic monitoring for reception by
EDAMIS. The OECD may use a mailbox, web service or file server depending on the data
transmission. The IMF will use the NA Registry Service notification, NA sandbox
service and SDMX reader service, forming a system that is currently being developed. The
World Bank is evaluating tools for this capability.
b. Systems in place to monitor the date and time of the upload of the data onto agencies'
websites are generally within the scope of the agencies' systems for systematically
monitoring the receipt of the SDMX messages. In addition, the ECB has developed an
interface for its data producers to monitor their data flows from reception to
dissemination, including scheduling actions for compilation of derived statistics and data
validation; during the pilot phase data will not be uploaded to the website until the go-live of
the SNA transmission. The BIS does not consider it necessary to post the data on the web
during the pilot phase. The OECD has the .Stat entry gate system, which can be
configured for this pilot (for the time being, QNA data is loaded on OECD.Stat twice a
day at fixed times, 6am and 12pm).
c. The information is proposed to be shared in various ways: web services (ECB);
EDAMIS (Eurostat) can also be setup as a message dispatcher and forward incoming
messages by email to agencies without EDAMIS client or not using the EDAMIS portal;
E-mail and RSS (OECD); web service and data API (IMF) as well as Sandbox support of
web service call for extracting the data; Registry service (IMF and World Bank).
Q10: Transmission channels in place on the dissemination side. Can you support any of the
following?
a. The ECB (for reception and dissemination) and Eurostat (for reception via EDAMIS) can
support SDMX messages attached in an E-mail: secured with encryption and signature
given that there are no manual interventions, i.e. data exchange is automated assuming
the exchange of the public keys took place and installation of the relevant keys in their
data exchange infrastructure. OECD is in principle able also to support this option. For all
e-mail data exchanges agencies will require manual work as such automated
infrastructure is not in place.
b. E-mail: automatic generation and processing of SDMX files and e-mails for data
exchange is possible for the ECB, provided that public keys are exchanged and installed on
both the sender and the receiver side; the exchange systems exist. For Eurostat it would
also be possible, but with manual intervention. Other agencies do not support this type of
data exchange.
c. SDMX-Messages exchanged via an automated process, i.e. via dedicated data exchange
infrastructure between institutions, including all data exchange partners is the supported
or preferred option by the participating agencies using their current systems. The World
Bank currently does not have any automated exchange of data.
Q11: Any other relevant observations / remarks / limitations in your system that would
affect the success of the data sharing pilot.
The ECB will set up the new SNA DSDs in its test environment until the SNA data exchanges go
live in 2014Q3. As a result, the data will not be available through the ECB SDW web services until
then, but data files will be exchanged with agencies using the ECB data exchange infrastructure.
After the third quarter of 2014 (when in production) ECB SDW web services 2.0 or 2.1 will be
available. No use of the Sandbox is foreseen.
Eurostat would not exclude that, depending on the solution chosen, if manual intervention is
needed then the throughput time might be delayed compared to the "real" production process in
the future.
The OECD foresees the following challenges: the mapping of NA DSDs; handling different
message formats, especially the legacy SDMX-EDI if required; handling custom and
group-level attributes such as sibling; and validating the data messages, because SDMX does
not currently provide integrity rule checks.
The IMF would like to develop the project in the direction of SDMX registry, Sandbox,
notification, monitoring system, web service and open data API platform.
No further comments were provided by BIS and the World Bank.
ANNEX VI: DATA WORKFLOW PRACTICES
Data Workflow Practices in International Agencies
GDP Main Aggregates
Questions (replies from ESTAT, BIS and OECD)

1. Which countries do you cover in your data collection?
ESTAT: EU27, HR and EFTA (CH, IS, NO) countries based on ESA 95 TP and methodology (legal basis). EU candidate countries (ME, MK, SE, RS) also provide some of these data.
BIS: Collects data from 56 central banks, published following national methodologies.
OECD: The 34 OECD MCs are covered by data collection plus other countries depending on the subject area.

2. Which indicators do you cover?
ESTAT: A and Q data via Table 1 of the ESA TP, including subtables on GDP from the output side (T0101), expenditure side (T0102) and income side (T0103), and savings and net lending (T0107). EU/EA aggregates are compiled by ESTAT.
BIS: GDP main aggregates, main expenditures in nominal and volume terms for all countries. Main income series for important countries.
OECD: For QNA (GDP and main expenditure components), volume and price indices as well as growth rates. Some zones' aggregates are compiled for the OECD-Total, OECD-Europe, G20 (only GDP), G7 and NAFTA.

3. Which templates do you use (please provide any templates or webforms)?
ESTAT: The requirements of data transmissions are specified in the ESA TP. Eurostat provides specific EXCEL questionnaires for each subtable, with a converter to EDAMIS files for transmission to Eurostat. For the ESA 2010 TP, a standardised SDMX framework is being prepared.
BIS: No template. National methodologies are used.
OECD: For EU countries, ESA 95 questionnaires are transmitted automatically by Eurostat to the OECD. For financial subjects, the ECB transmits to the OECD one file including only publishable data. For non-European countries, these templates are slightly adapted to refer to the SNA 93.

4. What is the time schedule of data collection and dissemination?
ESTAT: Q: t+70 days (t+90 days); A: usually in March (t+70) and at t+9 months, but MSs follow national release calendars, i.e. data are received between t+30 and t+70/90, and updated on the ESTAT website. EU/EA estimates are currently released at t+45 (flash), t+65 (including GVA and expenditure breakdowns), t+75 (income) and t+100 (third estimate for all variables).
BIS: Data are received within 24 hours after publication. Data are disseminated to external users within 30 minutes after reception.
OECD: For EU countries: QNA and QPOP: T+70 days in general and following any national release. For non-EU countries: QNA, QPOP, QNFSA and ANFA: following national releases. QNA, QPOP and QNFSA data are disseminated right after the validation.

5. How is data validation organised (timing, responsibilities, methods)?
ESTAT: Standard checks upon reception and loading to the production database include basic format, structure, encoding and content tests, which are essential for automatic processing, as well as revision checks and consistency checks. A peer review of validation checks was recently performed by an ECB-ESTAT-OECD WG.
BIS: Data are validated within 24 hours after reception. Automatic checks with thresholds are activated. Validation covers consistency checks, gap and outlier detection, vintage consistency and timeliness monitoring.
OECD: Data validation of a dataset takes on average 2 to 4 hours, depending on the dataset. The statisticians are responsible for data validation. The type of checks performed to validate data is summarized in the separate table.

6. Do internal stakeholders have access to non-validated data?
ESTAT: The access to non-validated data is limited to the production unit. Non-validated data for OECD members are currently forwarded to the OECD.
BIS: Non-validated data are available to internal stakeholders as well as to external users.
OECD: No.

7. Which data sharing methods with other international organisations are already in place?
ESTAT: Eurostat automatically forwards non-validated data to OECD, but following a forthcoming SLA, ESTAT will regularly transmit validated data to OECD, as is already the case for ECB. In relation to the G20 IAG initiative, Eurostat provides data for the PGI website.
BIS: None.
OECD: To IMF: QNA (GDP and main exp. comp.); to UN: twice a year ANFA series are sent; to all institutions (BIS, ECB, Eurostat, IMF, UN, WB): once a month a QNA CSV file is sent; from Eurostat: non-validated EU data for QNA, QPOP and ANFA are automatically forwarded; validated data will be sent from Eurostat following the new procedures.

8. What is your revision policy?
ESTAT: The revision policy for country data varies across MSs. Eurostat processes any update of MS data received within a few days. Estimations for EU/EA aggregates are updated at the t+65 and t+100 releases. At the t+45 EU/EA flash estimates only the volume series for the last quarter are updated.
BIS: As soon as revised data are reported, they are re-disseminated. There is no specific BIS policy.
OECD: Revisions erase previous data; methodological revisions are updated in a new set of series. QNA zones' aggregates (OECD-Total, OECD-Europe, G7 and NAFTA) are revised on an ongoing basis incl. data up to the second last quarter. First OECD GDP growth at T+50 days; revisions at T+70 days and T+90 days. For the G20 GDP aggregate, the first estimate at T+70 days and revisions at T+90 days and in the following quarters.
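The release lags quoted in the replies (for example the EU/EA estimates at t+45, t+65, t+75 and t+100) can be turned into calendar dates. The following is a minimal sketch under two assumptions that the report does not state: t is taken to be the last calendar day of the reference quarter, and the lags are counted in calendar days. The function and dictionary names are hypothetical.

```python
from datetime import date, timedelta

# Last calendar day of each quarter (assumed meaning of "t").
QUARTER_END = {1: (3, 31), 2: (6, 30), 3: (9, 30), 4: (12, 31)}

# EU/EA release lags in days, as quoted in the ESTAT reply to question 4.
EU_EA_LAGS = {
    "flash": 45,
    "GVA and expenditure breakdowns": 65,
    "income": 75,
    "third estimate": 100,
}


def release_date(year, quarter, lag_days):
    """Return the calendar date lag_days after the end of the reference quarter."""
    month, day = QUARTER_END[quarter]
    return date(year, month, day) + timedelta(days=lag_days)


def schedule(year, quarter, lags=EU_EA_LAGS):
    """Map each release name to its target date for the given reference quarter."""
    return {name: release_date(year, quarter, lag) for name, lag in lags.items()}


if __name__ == "__main__":
    for name, target in schedule(2013, 2).items():
        print(f"{name}: {target.isoformat()}")
```

A schedule of this kind could be compared with actual reception dates to measure timeliness during the pilot.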
Data Workflow Practices in International Agencies
GDP Main Aggregates
Questions (replies from IMF, UNSD and WB)

1. Which countries do you cover in your data collection?
IMF: Most of the 188 countries of the IMF as well as some non-sovereign entities.
UNSD: The UNSD national accounts data cover 235 countries, areas, and territories (including all UN Member States). The current database contains detailed annual national accounts estimates for 204 of these entities.
WB: GDP data mostly collected from NSIs/NCBs by visiting and resident WB missions.

2. Which indicators do you cover?
IMF: Q and A, at current prices and in volume terms, reported or calculated main GDP aggregates, including expenditure, stat. discrepancy with GDP by type of activity, net primary and secondary income from abroad, national disposable income, GNI and gross saving.
UNSD: Main aggregates (Table 1): GDP by expenditures at current and constant prices; relations among product, income, saving and net lending aggregates at current prices; domestic production by industries (value added by industries and fixed assets at current and constant prices).

3. Which templates do you use (please provide any templates or webforms)?
IMF: Countries submit the data on the prescribed forms. The forms may contain country-specific elements. Countries submit the data in the NC. The data are transmitted using special system or by email. The ICS is a web-based data reporting system for use by country data reporters.
UNSD: The National Accounts Questionnaire (NAQ) template for all but EU and OECD countries, for which data are obtained from the OECD. In addition, UNECE and the CARICOM secretariat collect data from their constituents. Countries are requested to update the data tables and complete them as comprehensively as possible.

4. What is the time schedule of data collection and dissemination?
IMF: A calendar indicating the monthly cut-off dates for data reporting is posted on the ICS website. The cut-off date is usually one week before the end of the cycle. Quarterly and annual reporters are invited to report every month even if no updates/revisions are available. Data are disseminated to an international dissemination space on a daily basis; they are disseminated to the external users once a month.
UNSD: NAQ is sent out to all countries in February/March; countries return completed questionnaires in April, but submissions continue to be received and validated between April and August; end of validation cycle: end August, no additional submissions after this point; data published online in Sept/Oct (data.un.org).

5. How is data validation organised (timing, responsibilities, methods)?
IMF: The Database Section is in charge of collecting, validating and disseminating these indicators every month. Staff members of perform preliminary checks. Color codes signal if data are first transmissions (cells are turned blue) or have been revised (a light-to-dark coloring reflects the size of the revisions); every quarter, data are compared (with IMF databases and international organizations).
UNSD: Once Excel questionnaires are uploaded, a validation worksheet is created from the internal database with an extensive set of validation rules (over 800). Inconsistencies may be corrected for minor aggregates if possible; otherwise the country is contacted or a footnote added to indicate the discrepancy.

6. Do internal stakeholders have access to non-validated data?
IMF: Internal stakeholders can only access the data once they have been validated and disseminated. A working group is currently examining the possibility of authorizing internal stakeholders to access non-validated data.
UNSD: Internal stakeholders do not have access to non-validated data.

7. Which data sharing methods with other international organisations are already in place?
IMF: In most cases, the IMF receives the NA data directly from the countries. Eurostat submits national accounts data for most EU countries in a bulk file (using GES.) on a monthly basis. Currently working on a data exchange process where would directly query Eurostat's databases using their "Bulk download facility".
UNSD: UNSD receives data from the OECD, the UNECE and CARICOM. Data for selected high-income economies are from the OECD. The ECE provides data for transition economies. CARICOM uses the NAQ to collect data, i.e. United Nations questionnaire based on the 1993 SNA is sent to about 160 countries, areas, and territories out of the 235 (204 have provided sufficient data to be published).

8. What is your revision policy?
IMF: Countries are required to submit all updates and revisions.
UNSD: The data provided by countries each year replace previously submitted data. In general, figures for the most recent year are regarded as provisional. Where large changes have been made to country data due to changes in currency, adoption of new statistical standards/methods, etc., new series are created, thereby preserving the older data in overlapping years for analysis.
Data Workflow Practices in International Agencies
Population
ESTAT
BIS
1. Which countries do you cover
in your data collection?
2. Which indicators do you
cover?
Population data (T0110) and
employment data (T0110/1).
3. Which templates do you use
(please provide any templates or
webforms)?
4. What is the time schedule of
data collection and
dissemination?
OECD
26 countries are covered by
data collection.
One series: total population
Collect data on QPOP.
Standard questionnaire,
national files or data is
extracted from national
websites.
EA/EU data released at t+75 and
t+100.
QPOP: following national
releases for dissemination:
QPOP data are disseminated
right after the validation.
5. How is data validation
organised (timing,
responsibilities, methods)?
6. Do internal stakeholders have
access to non-validated data?
7. Which data sharing methods
with other international
organisations are already in
place?
8. What is your revision policy?
Eurostat will transmit
validated European data for
QPOP following the new
data exchange procedures .
EA/EU data released at t+75 and
t+100
Data Workflow Practices in International Agencies
Population
IMF
1. Which countries do you cover
in your data collection?
2. Which indicators do you
cover?
STA collects data only on "total
population".
3. Which templates do you use
(please provide any templates or
webforms)?
No specific template is used. The
UNSD provides the data in an
Excel template .
4. What is the time schedule of
data collection and
dissemination?
Annual data usually in mid-June, internal dissemination
after one day, external with IFS
Yearbook.
5. How is data validation
organised (timing,
responsibilities, methods)?
No validation is performed on
this dataset as all validation is
the responsibility of the UNSD.
UNSD
WB
The World Bank
Development Data Group
(DECDG) covers most of the
World Bank member
economies
and all other
DECDG collects/estimates
many
population/demographic
data including total
population, population by
age/sex/place of residence
(urban/rural),
crude birth
No
specific template
is
used. The total population
data are collected variety of
sources (NSI, Eurostat,
UNDP). The World Bank
produces its own
population
estimates
for a
Data
are collected
twice
year. Data are updated in
April (WDI book, WDI
database and Health
Nutrition Population
database) and in July in
(WDI
database
Population
dataand
are Health
reviewed through WDI
review process around
December-January, and
Operational Guidelines
exercise around April-May.
6. Do internal stakeholders have Not applicable.
access to non-validated data?
Not applicable.
7. Which data sharing methods
with other international
organisations are already in
place?
All population data published by
STA are sourced from the UNSD
as part of a data sharing
agreement.
8. What is your revision policy?
Countries are required to submit
all available updates and
revisions to the UNSD.
About 35% of the total
population data are
collected from national
statistical offices or
Eurostat, and about 65% of
the total population data
received
from
the Unitedfor
The group
of countries
which the World Bank
considers using country
estimates are the developed
countries which produce
high quality estimates every
year or even more often
Data Workflow Practices in International Agencies
Sector Accounts
Questions (replies from ECB and BIS)

1. Which countries do you cover in your data collection?
ECB: We cover all 27 EU countries.
BIS: Same as for GDP main aggregates.

2. Which indicators do you cover?
ECB: The Euro Area Accounts (EAA) present a complete and consistent set of quarterly data for all resident institutional sectors and the rest of the world. In addition, the EAA integrates financial and non-financial statistics, thereby allowing for an integrated analysis of non-financial economic activities and financial transactions. The euro area accounts also contain consistent financial balance sheets.
BIS: Main items of the 5 sectors for the main countries.

3. Which templates do you use (please provide any templates or webforms)?
ECB: Data is always received through SDMX-EDI data files (from 2014 SDMX-ML 2.0/2.1). National compilers are provided only with the amended TP tables that include the expected codes for the series to be transmitted. These tables are included in Appendix I. No templates or webforms that convert actual data in SDMX-EDI data files are provided to national compilers.
BIS: Same as for GDP main aggregates.

4. What is the time schedule of data collection and dissemination?
ECB: For the regular quarterly financial accounts data production at the EAA the following time schedule exists: T+80 Transmission of MUFA Early Estimates (to ECB from national compilers); T+110 Transmission regular MUFA (to ECB from national compilers); T+120 Publication and dissemination of euro area and national accounts (from ECB to national compilers, international organizations, external/internal users). For the QSA dataflow received from Eurostat the following schedule exists: around T+94 Incomplete MS QSA transmissions (from Eurostat to ECB); around T+98 Validated MS QSA transmission (from Eurostat to ECB); around T+108 EA QSA transmission (from Eurostat to ECB). For the annual financial and non-financial accounts data received from Eurostat there is no time line since there is no revision policy. Annual data are disseminated automatically upon reception.
BIS: Same as for GDP main aggregates.

5. How is data validation organised (timing, responsibilities, methods)?
ECB: In order to ensure the efficient exchange of high quality national MUFA data, a set of data validation and consistency procedures are implemented at the ECB, in the statistical production environment. These data checks refer to: completeness checks; horizontal consistency; balancing items consistency; aggregation consistency; who-to-whom consistency; sizable revisions; overall plausibility checks on other changes; negative stocks. The data are said to be consistent if inconsistencies in the above mentioned cases do not exceed a threshold of 10 million for both stocks and transactions. The validation of country data is allocated between the EA national accounts team members. Each member validates its own set of countries and communicates any inconsistency issues directly with the country.
BIS: Same as for GDP main aggregates.

6. Do internal stakeholders have access to non-validated data?
ECB: No.
BIS: Same as for GDP main aggregates.

7. Which data sharing methods with other international organisations are already in place?
ECB: The ECB is responsible for the quarterly national financial accounts data which are disseminated to the users through the Statistical Data Warehouse and, at the same time, the data are also transmitted to the NCBs/NSIs, Eurostat, BIS, OECD and IMF via SDMX-EDI data files. In addition ECB receives validated non-financial accounts and annual financial and non-financial data from Eurostat by means of SDMX-EDI data files. The national accounts data flow at the ECB is visualized in the graphic below.
BIS: Same as for GDP main aggregates.

8. What is your revision policy?
ECB: There is no revision policy; national and international data providers are allowed to revise and resubmit their data at any time.
BIS: Same as for GDP main aggregates.
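The ECB reply to question 5 describes threshold-based consistency checks. The sketch below illustrates two of them, aggregation consistency and negative stocks, under stated assumptions: the threshold is applied to the absolute gap between a reported total and the sum of its components, values are expressed in millions, and the series keys and function names are hypothetical. It is not a description of the ECB's actual implementation.

```python
# Tolerance of 10 million, assuming all values are expressed in millions.
THRESHOLD = 10.0


def aggregation_gap(total, components):
    """Absolute difference between a reported total and the sum of its components."""
    return abs(total - sum(components))


def is_consistent(total, components, threshold=THRESHOLD):
    """True when the aggregation gap does not exceed the threshold."""
    return aggregation_gap(total, components) <= threshold


def negative_stocks(stocks):
    """Return the keys of stock series that carry a negative value."""
    return [key for key, value in stocks.items() if value < 0]


if __name__ == "__main__":
    print(is_consistent(1000.0, [400.0, 595.0]))   # gap of 5 is within tolerance
    print(is_consistent(1000.0, [400.0, 580.0]))   # gap of 20 exceeds tolerance
    print(negative_stocks({"SERIES_A": 10.0, "SERIES_B": -1.5}))
```

Applying the same tolerance to both stocks and transactions, as the reply states, means a single threshold parameter is enough for this kind of check.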
Data Workflow Practices in International Agencies
Sector Accounts
Questions (replies from OECD and UNSD)

1. Which countries do you cover in your data collection?
OECD: Very few countries are covered by data collection.
UNSD: Same as for GDP main aggregates.

2. Which indicators do you cover?
OECD: We collect data for non-financial and financial national accounts, of which the following subject areas are covered by this TF: ANFA, QNFSA, QFSA.
UNSD: Part IV: Integrated economic accounts (from production to financial accounts); Total economy (S.1); Rest of the world (S.2); Non-financial corporations (S.11); Financial corporations (S.12); General government (S.13); Households (S.14); Non-profit institutions serving households (S.15); Combined Sectors: Non-Financial and Financial Corporations (S.11 + S.12); Households and NPISH (S.14 + S.15).

3. Which templates do you use (please provide any templates or webforms)?
UNSD: Same as for GDP main aggregates.

4. What is the time schedule of data collection and dissemination?
OECD: EU countries - QNFSA: These data are transmitted to the OECD by Eurostat at T+105 days for full QSA and free QSA received before t+97, around T+113 (received between t+98 and t+105) and around T+120 (received by Eurostat between t+105 and t+111). For free QSA datasets received later, individual data deliveries are provided. ANFA: T+24 months; QFSA: T+120 days. Non-EU countries - QNFSA and ANFA: following national releases; QFSA: T+105 days.
UNSD: Same as for GDP main aggregates.

5. How is data validation organised (timing, responsibilities, methods)?
OECD: For EU countries, QNFSA and QFSA data are first validated by Eurostat and ECB respectively and then transmitted by these institutions to the OECD, which performs its own checks.
UNSD: Same as for GDP main aggregates.

6. Do internal stakeholders have access to non-validated data?
OECD: No.
UNSD: Same as for GDP main aggregates.

7. Which data sharing methods with other international organisations are already in place?
OECD: To IMF: QNFSA, QFSA data released in OECD.stat are used to feed the Principal Global Indicators website (in the context of Recommendation 15 of the DGI). To UN: twice a year ANFA series are sent by OECD to UNSD. From Eurostat: non-validated EU ANFA data are automatically forwarded to the OECD through the e-Damis system on the OECD generic account SNA.contact@oecd.org. Validated EU data for QNFSA (Full and Free datasets) are transmitted to the OECD generic account SNA.contact@oecd.org. From ECB: EU validated QFSA data are transmitted to the OECD on the OECD generic account SNA.contact@oecd.org.
UNSD: Same as for GDP main aggregates.

8. What is your revision policy?
OECD: Countries’ data are revised according to the national revisions policy.
UNSD: Same as for GDP main aggregates.
Data Workflow Practices in International Agencies
Sector Accounts
Questions
ESTAT
1. Which countries do you
cover in your data
collection?
Sector Accounts
EU27, HR and EFTA (CH, IS, NO) countries based on ESA 95 TP and
methodology (legal basis). Some EU candidate countries (ME, RS) also provide
some annual data.
2. Which indicators do you
cover?
Annual Sector Accounts data are transmitted via table 8 of the ESA TP, whereas
Quarterly Sector Accounts data are transmitted via table 801.
3. Which templates do you
use (please provide any
templates or webforms)?
The requirements of data transmissions are specified in the ESA TP. Eurostat
provides specific EXCEL questionnaires for each sub-table. Sector Accounts
templates are available at:
https://circabc.europa.eu/w/browse/ab340f5e-fd71-47b6-b8b6-486c437bfb2d
4. What is the time schedule
of data collection and
dissemination?
At present (ESA95 TP), Annual Sector Accounts have to be transmitted by t+9
months; countries' data and aggregates are disseminated on the website and in the
database of Eurostat at around t+10 months; Full Quarterly Sector Accounts (full
QSA) are transmitted by the countries at t+90 days; key indicators by country are
published at around t+105; EU and EA aggregates around t+120 days; Publishable
Quarterly Sector Accounts (as agreed with countries for free publication) are
transmitted to OECD (for international data sharing) at around t+105, t+113 and
t+120 days.
5. How is data validation
organised (timing,
responsibilities, methods)?
Standard checks upon reception and loading to the production database include
basic format, structure, encoding, content tests, outlier detection, which are
essential for automatic processing, as well as revision checks, and consistency
checks.
6. Do internal stakeholders
have access to non-validated
data?
Non-validated Sector Accounts data are not shared, except for Annual Sector
Accounts as given in section 4 above.
7. Which data sharing
methods with other
international organisations
are already in place?
The data sharing of Sector Accounts data with the other international organisations is
explained in section 4 above.
8. What is your revision
policy?
The Sector Accounts data EU/EA aggregates and Publishable QSA data by
country published quarterly at around t+120 days (or before: see section 4) are not
subject to revisions outside the quarterly disseminations. However, the Annual
Sector Accounts countries’ data may be revised during the year following the
revisions policies of the countries.
ANNEX VII: EUROSTAT-OECD PROTOCOL: AGREED MINIMUM CHECKING RULES
Type of check: Preliminary checks on data reception, conversion, loading
Target: Sender, formats, codes, etc.
Requirement: Full respect of transmission and encoding conventions
Minimum check: Essential
Implications for validation process: (Automatic) correction or back to sender

Type of check: Identification of key data characteristics
Target: ESA table number, unit, frequency, time span, series
Requirement: Consistency of data specifications
Minimum check: Essential
Implications for validation process: (Automatic) correction or back to sender

Type of check: Checks on dubious values
Target: Empty, zero or negative values, etc.
Requirement: Respect of coding conventions and expected range of values
Minimum check: Essential
Implications for validation process: (Automatic) correction or back to sender

Type of check: Revision checks
Target: Comparison with previous transmissions
Requirement: Significant revisions not due to regular updates should be explained (metadata)
Minimum check: Essential
Implications for validation process: Unexplained major revisions can lead to refusal of the dataset

Type of check: Intra-file/table checks on the consistency of totals and breakdowns
Target: Various types of possible breakdowns (indicator, industry, sector/sub-sectors, etc.)
Requirement: Sum of breakdowns should be equal to respective total (for additive series)
Minimum check: Essential
Implications for validation process: Misalignments can lead to refusal of dataset

Type of check: Other types of intra-file/table consistency checks
Target: Current prices/volumes, raw/adjusted data, annual/quarterly data, assets/liabilities, uses/resources, etc.
Requirement: Specific relations between some series are expected
Minimum check: Essential
Implications for validation process: Misalignments can lead to refusal of dataset

Type of check: Extra-file/table consistency checks
Target: Variables transmitted via different files/tables
Requirement: Series should be coherent (except for vintages)
Minimum check: Essential
Implications for validation process: Unexplained major discrepancies can lead to refusal of dataset

Type of check: Statistic plausibility checks
Target: Standard deviation, etc.
Requirement: Significant deviations may indicate errors
Minimum check: Useful
Implications for validation process: Possible follow-up questions

Type of check: Economic plausibility checks
Target: Economic ratios, growth rates, etc.
Requirement: Significant deviations may indicate errors
Minimum check: Useful
Implications for validation process: Possible follow-up questions

Type of check: Consistency checks against other statistics
Target: Related statistics (possible conceptual differences)
Requirement: Significant deviations may indicate errors
Minimum check: Useful
Implications for validation process: Possible follow-up questions

Type of check: Consistency checks against data published by other institution
Target: Same series published by other institutions
Requirement: Significant deviations may indicate errors
Minimum check: Useful
Implications for validation process: Possible follow-up questions
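Two of the checks in this annex, the revision check and the statistical plausibility check, lend themselves to a short illustration. The sketch below is only one possible reading: the protocol does not define what counts as a "significant" revision or deviation, so the 5% relative threshold, the 2.5 standard deviation cut-off and the function names are all assumptions made for the example.

```python
from statistics import mean, stdev


def significant_revisions(previous, current, rel_threshold=0.05):
    """Return periods whose value moved by more than rel_threshold
    relative to the previous transmission (threshold is an assumption)."""
    flagged = []
    for period, old in previous.items():
        new = current.get(period)
        if new is None or old == 0:
            continue  # nothing to compare, or relative change undefined
        if abs(new - old) / abs(old) > rel_threshold:
            flagged.append(period)
    return flagged


def implausible_observations(values, z_threshold=2.5):
    """Return indices of observations lying more than z_threshold
    sample standard deviations from the mean (cut-off is an assumption)."""
    if len(values) < 3:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_threshold]


if __name__ == "__main__":
    previous = {"2013-Q1": 100.0, "2013-Q2": 102.0}
    current = {"2013-Q1": 100.5, "2013-Q2": 110.0, "2013-Q3": 111.0}
    print(significant_revisions(previous, current))
    print(implausible_observations([1.0] * 10 + [50.0]))
```

In line with the table, a flagged revision would call for an explanation in the metadata, while a flagged observation would only prompt a follow-up question.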