Report of the Statistical Center of Iran

Report of the
Statistical Centre of Iran (SCI)
On
Implementation of Plan of Action
for the Framework of Cooperation in Statistics
Prepared for the
2nd High Level Expert Group Meeting (HLEGM) on Statistics
of the ECO National Statistical Offices
&
Detailed proposal of the SCI on
Establishment of the ECO Statistical Network
24-25 September 2009
Dushanbe, Tajikistan
SCI Report
Since submission of the ECO Framework of Cooperation and Plan of Action in
Statistics, the Statistical Centre of Iran (SCI) has treated the objectives and contents of
the Plan as a basis for intensive cooperation in the field of statistics among ECO member
states and has actively participated in the related events. In this regard, the SCI has
proposed and supported the use of effective modalities for implementing the Plan of
Action, such as setting up the High Level Expert Group (HLEG) on Statistics as a
modality for cooperation among the ECO National Statistical Offices in the field of
statistics. The following are some of the activities undertaken by the SCI to
implement the Plan of Action in recent years:
1. Hosting ECO events in the field of statistics: The Statistical Centre of Iran has
shown its interest and readiness to host ECO events in the field of statistics.
The most important ECO events relating to statistical issues have been hosted by
the SCI, namely:
1.1 The first meeting of the Heads of the ECO National Statistical Offices was
hosted by the SCI during 28-29 January 2008 in Tehran. It was the first meeting
of authorities and high-ranking officials of the ECO NSOs and the first step
toward realization of the Plan of Action on Statistics in the region. The meeting
concluded with the issuance of the Tehran Communiqué, which expressed the
willingness of the ECO NSOs to cooperate actively in the field of statistics and
proposed taking effective measures in this regard.
1.2 The first meeting of the ECO High Level Expert Group (HLEG) on statistics
was hosted by the SCI during 26-27 October 2008 in Tehran. This meeting was
organized based on the outcomes of the first meeting of the Heads of the ECO
National Statistical Offices in January 2008. The HLEG on statistics discussed
the most important issues and problems of joint statistical activities in the
region and put forward a number of proposals for consideration and further action
by the ECO member states. These two events were the initial steps in
promoting cooperation in the field of statistics in the ECO region in line with
the Plan of Action.
2. Establishment of the ECO Statistical Network: One of the main issues of the
ECO Plan of Action for the Framework of Cooperation in Statistics is creating
an institutional mechanism for this purpose. This issue was raised in the first
meeting of the ECO Heads of NSOs in January 2008, where the SCI proposed
establishing the ECO Statistical Network. Following this meeting, the ECO
secretariat circulated the proposal among member states for their comments
and views. The proposal was discussed in the first
meeting of the High Level Expert Group (HLEG) on statistics in October 2008
and also in the 6th NFPs on Economic Research and Statistics in November 2008
in Ankara. The 19th RPC meeting decided to put this proposal into operation
subject to approval by the CPR. Finally, the 469th CPR meeting on 7 June 2009
approved the SCI's proposal for establishment of the ECO Statistical Network
and allowed the Statistical Centre of Iran to realize this proposal in cooperation
with the ECO secretariat. Detailed information on the ECO Statistical Network has
been prepared by the SCI and will be presented to this meeting (Annex I).
3. Capacity building and organizing regional training workshops and courses:
During the last two years, the Statistical Centre of Iran organized and hosted a
number of professional workshops and training courses for the ECO member
states and other countries in the Asia and the Pacific region. Some main
specifications of these workshops and training courses are described below:
3.1 Sub-regional Course on Statistics for the Countries in Transition in Central Asia
and the Caucasus: This course was organized by the SIAP (Statistical Institute for
Asia and the Pacific) and hosted by the Statistical Centre of Iran from 21 April to 2
May 2007 in Tehran. Representatives from 12 countries (including Armenia,
Azerbaijan, Georgia, Kazakhstan, Kyrgyzstan, Tajikistan, Uzbekistan and Iran)
participated in this course.
3.2 Workshop on Economic Statistics and Informal Sector: This workshop was organized
for the ECO countries with the cooperation of UNSD, UNESCAP and ECO during
10-13 November 2007. The Statistical Centre of Iran hosted this workshop and
representatives from 9 ECO member states participated in it. The
workshop resource persons/lecturers were from UNSD and UNESCAP.
3.3 Workshop on Geographical Information System (GIS): This workshop was
designed and hosted by the Statistical Centre of Iran with the cooperation of the
ECO secretariat during 19-22 April 2009, based on the proposal of the SCI in the 6th
NFPs in Nov. 2008 in Ankara. The course was conducted by the Statistical
Research and Training Centre (SRTC) and the lectures were delivered
by experts from the Office of Map and Geospatial Information of the SCI.
Representatives from 7 ECO member countries participated in the workshop. It
was the first workshop designed on the basis of the capabilities and
potential of experts of the member states for capacity building within the
region.
3.4 Workshop on the System of National Accounts: Another workshop proposed by the
SCI in the 6th NFPs on Economic Research and Statistics in Nov. 2008 in
Ankara was held in the field of national accounts with the cooperation of the ECO
secretariat during 17-20 May 2009 in Tehran. The Office of Economic Accounts of
the SCI was assigned to prepare the curriculum of the workshop and the
Statistical Research and Training Centre (SRTC) was responsible for
conducting the event. Representatives from 7 member countries of the ECO
participated in this workshop, as well as representatives from UNESCAP and the
ECO Trade and Development Bank. The main aspects and structure of the System
of National Accounts (SNA Rev. 1993) were presented and discussed in this
workshop by an expert from the Office of Economic Accounts of the SCI. It was
the second workshop organized for the ECO member states with the cooperation of
experts from ECO NSOs for capacity building in the region.
As described above, the Statistical Research and Training Centre (SRTC) has
contributed significantly to the training programs of the SCI for ECO member
states. With its experience and good capacity (in terms of software
and hardware) for organizing regional and international training courses and
workshops, the centre can effectively assist the ECO secretariat and ECO NSOs as one
of the statistical training centers for capacity building, improvement of statistical
science and development of cooperation in the field of statistics in the region.
4. Preparation of data for National Economic Report: For publication of the
ECO Annual Report, the questionnaire of the ECO secretariat was completed
and sent back to the Secretariat covering requested data for 35 items for the
period of 2000-2007.
5. Providing Metadata for statistical items: Harmonizing the concepts
and definitions of statistical items and preparing metadata is one of the
main issues in statistical cooperation and must be pursued by
the ECO NSOs. This issue was also discussed in the first meeting of the ECO
High Level Expert Group (HLEG) on statistics, and member states agreed to
cooperate in this regard. At the request of the ECO secretariat, the SCI
has prepared and completed the required metadata for statistical items of the ECO
key indicators and ECO socio-economic indicators and provided them to the
Secretariat.
6. Roster of Leading Experts in Statistics: In the first meeting of the ECO High
Level Expert Group (HLEG) on statistics, the SCI put forward a proposal to
create a Roster of Leading Experts from national statistical offices in the region
as a directory for the ECO secretariat to consult in capacity building,
organizing training courses and workshops, and exchange of experience and
expertise within the region. The Statistical Centre of Iran has already prepared the
list of its leading experts in the various fields of statistics (economic statistics
and national accounts, population and labor force statistics, statistical survey
design, data processing and data warehousing, etc.) and provided the Secretariat
with the list for further measures.
Annex I
ECO Statistical Network
I. Introduction
This project aims to develop a comprehensive system of statistical data and information
management for the ECO Member States. The Statistical Centre of Iran has been selected to
develop and manage it.
The ECO Statistical Network is a place through which member countries can access these main
facilities:
• General information: users can access publications, events, general information about
  member countries, figures and charts.
• Business intelligence and data warehouse capabilities
These functionalities are described in the following sections.
II. Proposed Solution
The ECO Statistical Network draws on information from member countries and international
organizations and provides a platform where the data can be stored and used to answer
analytical queries. The solution is based on data warehousing technology, so a few definitions
from this area are reviewed first:
Data Warehouse:
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of
data in support of management's decision-making process.
Data Warehouse features:
• Subject-oriented: Data that gives information about a particular subject instead of
  about a company's ongoing (day-to-day) operations, for example Population or National
  Accounts.
• Integrated: Data that is gathered into the data warehouse from a variety of sources and
  merged into a coherent whole.
• Time-variant: All data in the data warehouse is identified with a particular time period.
• Non-volatile: Data is stable in a data warehouse. More data is added but data is never
  removed. This enables management to gain a consistent picture of the business.
Benefits of Data Warehousing:
Some of the benefits that a data warehouse provides are as follows:
• A data warehouse provides a common data model for all data of interest regardless of the
  data's source. This makes it easier to report and analyze information than it would be if
  multiple data models were used to retrieve information such as sales invoices, order
  receipts, general ledger charges, etc.
• Prior to loading data into the data warehouse, inconsistencies are identified and resolved.
  This greatly simplifies reporting and analysis.
• Information in the data warehouse is under the control of data warehouse users so that,
  even if the source system data is purged over time, the information in the warehouse can
  be stored safely for extended periods of time.
• Because they are separate from operational systems, data warehouses provide retrieval of
  data without slowing down operational systems.
• Data warehouses can work in conjunction with and, hence, enhance the value of
  operational business applications, notably customer relationship management (CRM)
  systems.
• Data warehouses facilitate decision support system applications such as trend reports (e.g.,
  the items with the most sales in a particular area within the last two years), exception
  reports, and reports that show actual performance versus goals.
OLAP (Online Analytical Processing):
OLAP allows users to analyze database information from multiple database systems at one time.
While relational databases are considered to be two-dimensional, OLAP data is multidimensional,
meaning the information can be compared in many different ways. For example, a company might
compare their computer sales in June with sales in July, and then compare those results with the
sales from another location, which might be stored in a different database.
For example, if a report shows sales are trending lower than expected, business users need to be
able to easily uncover the underlying issue by getting answers to questions such as:
• Is the problem with one product line, or certain regions?
• What is different between underperforming products or regions versus other combinations
  that are performing well?
• Is there a related problem with sales headcount? Marketing campaigns? Or something else?
The main functions of the ECO Statistical Network are:
1- Data Integration (ETL)
Data is everywhere. Providing a consistent, single version of the truth across all sources of
information is one of the biggest challenges faced by IT organizations today. The ECO Statistical
Network's Data Integration delivers powerful Extraction, Transformation and Loading (ETL)
capabilities using an innovative approach. The ease of use of the graphical, drag-and-drop design
increases productivity, and the extensible, standards-based architecture ensures that users will
never be forced to adopt proprietary methodologies in their ETL solution.
Extract: Most data warehousing projects consolidate data from different source systems. Each
separate system may also use a different data organization / format. Common data source formats
are relational databases and flat files.
Transform: The transform stage applies a series of rules or functions to the extracted data from
the source to derive the data for loading into the end target.
Some important functions of the Transform stage:
I. Selecting only certain columns to load
II. Translating coded values (e.g., if the source system stores 1 for male and 2 for female, but
the warehouse stores M for male and F for female)
III. Encoding free-form values (e.g., mapping "Male" to "1" and "Mr" to "M")
IV. Deriving a new calculated value (e.g., sale_amount = qty * unit_price)
V. Filtering
VI. Sorting
VII. Joining data from multiple sources (e.g., lookup, merge)
VIII. Aggregation (for example, rollup - summarizing multiple rows of data - total sales for each
store, and for each region, etc.)
IX. Transposing or pivoting (turning multiple columns into multiple rows or vice versa)
X. Applying any form of simple or complex data validation. If validation fails, it may result in a
full, partial or no rejection of the data, and thus none, some or all the data are handed over to
the next step, depending on the rule design and exception handling.
Many of the above transformations may result in exceptions, for example, when a code translation
parses an unknown code in the extracted data.
Load: The load phase loads the data into the end target, usually the data warehouse (DW).
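The three ETL stages described above can be illustrated with a minimal Python sketch. It is not the Network's actual implementation: the column names (store, sex, qty, unit_price), the code table and the sample rows are assumptions made up for illustration, and a plain list stands in for the end target.

```python
import csv
import io

# Hypothetical code table: the source stores 1/2, the warehouse stores M/F.
SEX_CODES = {"1": "M", "2": "F"}


def extract(text):
    """Extract: read rows from a CSV flat file held in a string."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: select columns, translate codes, derive a calculated value.

    Rows carrying an unknown code fail validation and are rejected.
    """
    accepted, rejected = [], []
    for row in rows:
        code = row["sex"].strip()
        if code not in SEX_CODES:
            rejected.append(row)  # partial rejection: only this row is dropped
            continue
        accepted.append({
            "store": row["store"],
            "sex": SEX_CODES[code],
            "sale_amount": int(row["qty"]) * float(row["unit_price"]),
        })
    return accepted, rejected


def aggregate(rows):
    """Aggregation (roll-up): total sale_amount for each store."""
    totals = {}
    for row in rows:
        totals[row["store"]] = totals.get(row["store"], 0.0) + row["sale_amount"]
    return totals


def load(rows, target):
    """Load: append the transformed rows to the end target."""
    target.extend(rows)
    return len(target)


# Made-up sample data; the "comment" column is deliberately not loaded.
SOURCE = """store,sex,qty,unit_price,comment
A,1,2,10.0,ok
A,2,1,5.5,ok
B,9,4,1.0,unknown code
B,2,3,2.0,ok
"""

warehouse = []
accepted, rejected = transform(extract(SOURCE))
load(accepted, warehouse)
store_totals = aggregate(warehouse)
```

The row with the unknown code 9 is kept in `rejected` rather than silently dropped, so the exception handling rule can decide later whether to correct or discard it.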
2- Analysis
ECO Statistical Network Analysis Overview
Analysis puts rich, analytic power in the hands of your knowledge workers – helping them
operate with maximum effectiveness by gaining the insights and understanding they need to make
optimal business decisions. For example, if a report shows sales are trending lower than expected,
knowledge workers need to be able to easily uncover the underlying issue by getting answers to
questions such as:
I. Is the problem with one product line, or certain regions?
II. Is it all states within that region, or a combination of certain products in certain regions?
III. What is different between underperforming products or regions versus other combinations
that are performing on target?
IV. Is there a related problem with sales headcount? Marketing campaigns? Or something else?
Analysis helps answer these kinds of business questions by:
I. Making it easy for users to freely explore business information by interactively drilling into
and cross-tabulating data
II. Providing speed-of-thought response times to complex analytical queries
III. Presenting data multi-dimensionally and letting users select what dimensions and measures
to explore
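Cross-tabulation over user-selected dimensions, with an optional filter that acts as a drill-down, can be sketched in a few lines of Python. This is only an illustration of the idea, not the analysis engine used by the Network. The fact rows, dimension names and the `cross_tab` helper are hypothetical.

```python
from collections import defaultdict

# Made-up fact rows: (product, region, month, sales).
FACTS = [
    ("laptop", "north", "June", 120),
    ("laptop", "north", "July", 90),
    ("laptop", "south", "June", 80),
    ("laptop", "south", "July", 85),
    ("tablet", "north", "June", 40),
    ("tablet", "north", "July", 45),
]
DIMENSIONS = {"product": 0, "region": 1, "month": 2}
MEASURE = 3  # position of the sales measure in each fact row


def cross_tab(facts, row_dim, col_dim, where=None):
    """Sum the measure for every (row_dim, col_dim) combination.

    `where` optionally maps dimension names to required values; passing it
    restricts the table to one member, which is how drilling in is modeled.
    """
    where = where or {}
    table = defaultdict(lambda: defaultdict(int))
    for fact in facts:
        if any(fact[DIMENSIONS[dim]] != value for dim, value in where.items()):
            continue
        row_key = fact[DIMENSIONS[row_dim]]
        col_key = fact[DIMENSIONS[col_dim]]
        table[row_key][col_key] += fact[MEASURE]
    return {row: dict(cols) for row, cols in table.items()}


# Overview: sales by product and month.
by_product = cross_tab(FACTS, "product", "month")
# Drill into the laptop line and break it down by region.
laptop_by_region = cross_tab(FACTS, "region", "month", where={"product": "laptop"})
```

In this made-up data the overview shows laptop sales falling from June to July, and the drill-down attributes the drop to the north region.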
3- Ad-hoc reporting
All organizations use reporting in one form or another. As a result, reporting is considered a core
Business Intelligence (BI) need and is frequently the first BI application deployed. ECO Statistical
Network Reporting allows members to easily access, format, and distribute information to their
users. Its features include:
• Flexible deployment from standalone desktop reporting to embedded reporting and
  enterprise business intelligence
• Broad data source support including relational, OLAP, or XML-based data sources
• Popular output options including Adobe PDF, HTML, Microsoft Excel, Rich Text Format, or
  plain text
• Web-based ad hoc query and reporting for business users
• Enterprise Edition provides enhanced software functionality, comprehensive professional
  technical support, product expertise, certified software and software maintenance, and
  more
III. Deployment
To extract data and information, it is suggested that an FTP site be set up for entry of statistical
data, indicators and items, so that the ECO Statistical Database can be fed from it.
[Figure: Flowchart of the work process based on the FTP site. Labeled components: Members' DBs
(Cnt 1, Cnt 2, Cnt 3), FTP-Based ETL, Staging, SCI's Data warehouse, OLAP, Reporting Tools,
Development & Design Tools]
Finally, the FTP site will be developed and implemented, and login and data entry facilities will
be made available to each of the ECO Member States. A username and password are given to every
member country for uploading data. As shown in the flowchart, after the member states upload
their data, the ETL (Extract, Transform, Load) process extracts the data from this site and loads
them into a staging (temporary) database. This operation is done automatically. Most importantly,
the ETL system supports the common data formats (Microsoft Excel, XML, HTML, TSV/CSV and
the like). After extraction, the data should be managed. That is, some of the following activities
may be carried out to unify and consolidate the display format and to resolve inconsistencies
and conflicts:

• Definition of conditions and contextual and thematic control rules
• Selecting and transferring only certain columns and fields
• Translating coded values (e.g., the source system stores code 1 for male and code 2 for
  female)
• Cleaning up values
• Joining data from multiple sources
• Aggregation (for example, minimum, maximum, number of records, etc.)
• …
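The step in which uploaded files in different formats are read into one staging area can be sketched as follows. This is a minimal illustration under stated assumptions, not the deployed system: it covers only CSV, TSV and XML through the Python standard library (Excel and HTML would need additional parsers), the XML layout with `<row>` elements is assumed, the staging database is represented by a list, and all file names and values are invented.

```python
import csv
import io
import os
import xml.etree.ElementTree as ET


def parse_delimited(text, delimiter):
    """Parse CSV or TSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def parse_xml(text):
    """Parse <row> elements whose child tags are column names (assumed layout)."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in row} for row in root.iter("row")]


# One parser per supported file extension.
PARSERS = {
    ".csv": lambda text: parse_delimited(text, ","),
    ".tsv": lambda text: parse_delimited(text, "\t"),
    ".xml": parse_xml,
}


def stage_upload(filename, text, country, staging):
    """Parse one uploaded file and append its rows, tagged by country, to staging."""
    suffix = os.path.splitext(filename)[1].lower()
    if suffix not in PARSERS:
        raise ValueError(f"unsupported format: {filename}")
    rows = PARSERS[suffix](text)
    for row in rows:
        staging.append({"country": country, **row})
    return len(rows)


# Made-up uploads from three hypothetical member accounts.
staging = []
stage_upload("pop.csv", "indicator,year,value\npopulation,2007,100\n", "Cnt 1", staging)
stage_upload("pop.tsv", "indicator\tyear\tvalue\npopulation\t2007\t200\n", "Cnt 2", staging)
stage_upload(
    "pop.xml",
    "<data><row><indicator>population</indicator><year>2007</year>"
    "<value>300</value></row></data>",
    "Cnt 3",
    staging,
)
```

All values arrive in staging as strings. Type conversion and the control rules listed above would be applied in the subsequent management step.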
The figures above show some schemas of the ETL process.
After these stages, the managed data are loaded into the Data Warehouse (DW). The data must be
loaded in a format on which cubes can be defined. Finally, reports produced from the data cubes,
which are designed on the special site, will be released. The ECO Data Warehouse is implemented
on the PostgreSQL RDBMS.
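One common layout on which cubes can be defined is a star schema: a fact table of values keyed by dimension tables. The sketch below is an assumption about what such a layout could look like, not the actual ECO schema. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server, and all table, column and member names are hypothetical. `INSERT OR IGNORE` is SQLite syntax; PostgreSQL would use `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3

# Hypothetical star schema: two dimension tables and one fact table.
SCHEMA = """
CREATE TABLE dim_country (
    country_id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE dim_indicator (
    indicator_id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE fact_value (
    country_id INTEGER NOT NULL REFERENCES dim_country(country_id),
    indicator_id INTEGER NOT NULL REFERENCES dim_indicator(indicator_id),
    year INTEGER NOT NULL,
    value REAL NOT NULL,
    PRIMARY KEY (country_id, indicator_id, year)
);
"""


def dimension_key(conn, table, id_column, name):
    """Return the key of dimension member `name`, inserting it if missing."""
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(
        f"SELECT {id_column} FROM {table} WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def load_fact(conn, country, indicator, year, value):
    """Load one managed record into the fact table."""
    country_id = dimension_key(conn, "dim_country", "country_id", country)
    indicator_id = dimension_key(conn, "dim_indicator", "indicator_id", indicator)
    conn.execute(
        "INSERT INTO fact_value VALUES (?, ?, ?, ?)",
        (country_id, indicator_id, year, value),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Made-up records for illustration only.
for record in [
    ("Cnt 1", "Indicator A", 2006, 10.0),
    ("Cnt 1", "Indicator A", 2007, 12.0),
    ("Cnt 2", "Indicator A", 2007, 30.0),
]:
    load_fact(conn, *record)

# A cube-style query: the measure summed along the country dimension.
totals_by_year = conn.execute(
    "SELECT year, SUM(value) FROM fact_value GROUP BY year ORDER BY year"
).fetchall()
```

The composite primary key on the fact table rejects a second value for the same country, indicator and year, which keeps each cube cell unambiguous.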
As mentioned before, for launching the ECO Statistical Database, the following stages are
required.
Project's Progress
IV. Pilot
In the pilot stage, all statistical indicators were extracted from the ECO website
(http://www.ecosecretariat.org) and then categorized into a few segments (such as Population,
Financial Intermediation, etc.) based on the Iran Statistical Yearbook. In the next step, these data
were transformed and loaded into a main database. After that, all data cubes were designed and
published. A few examples of ECO statistical data cubes are shown in figures 3 and 4.
Figure 3 shows one indicator, External Public Debt-GDP, across two dimensions (year and
country) along with its time series.
Figure 4 shows three indicators of Financial Intermediation across two dimensions. The sum of
each indicator is shown for all countries. Clicking the plus sign next to All Countries displays
these indicators for each country.
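The roll-up behaviour described for figure 4, a total row for All Countries that expands into per-country rows, can be sketched as below. The function name, the data layout and the numbers are assumptions for illustration. The published cubes are produced by the reporting tools, not by this code.

```python
def rollup_rows(values, expanded=False):
    """Build display rows for one indicator.

    `values` maps country -> {year: value}. The first row is always the
    "All Countries" total per year. Per-country rows are appended only when
    the node is expanded, i.e. after the plus sign has been clicked.
    """
    years = sorted({year for series in values.values() for year in series})
    total = {
        year: sum(series.get(year, 0) for series in values.values())
        for year in years
    }
    rows = [("All Countries", total)]
    if expanded:
        for country in sorted(values):
            rows.append((country, values[country]))
    return rows


# Made-up numbers for a hypothetical indicator.
SAMPLE = {
    "Cnt 1": {2006: 10, 2007: 12},
    "Cnt 2": {2006: 20, 2007: 30},
}

collapsed = rollup_rows(SAMPLE)
expanded = rollup_rows(SAMPLE, expanded=True)
```

A simple sum is used here because the text describes a sum. For ratio indicators a sum across countries is not meaningful, so a real cube would need an aggregation rule per indicator.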
Figure 3: a sample of time series
Figure 4: a sample of data cube
Figures 5 and 6 show a few charts and graphs.
Figure 5: Financial Intermediation and charts