Handbook on Major Statistical Data Management Platforms

United Nations Economic Commission for
Africa
African Centre for Statistics
Addis Ababa
October 2011
© 2011 African Centre for Statistics (UNECA)
Contents
I. BACKGROUND INFORMATION
1.1. Statistical data
1.1.1. Microdata
1.1.2. Macrodata
1.2. Statistical data management systems
1.3. Justification of the assignment
1.4. Organization of this document
II. PROJECT DEFINITION
2.1. Objective
2.2. Mode of operation
2.3. Scope of work
III. MAJOR REQUIREMENTS OF STATISTICAL DATA MANAGEMENT SYSTEMS
3.1. Data capture
3.2. Data storage and retrieval
3.3. Data processing and dissemination
3.4. Standard data sharing and exchange
3.5. Metadata management
3.6. Indicators management
3.7. Integration with other systems
3.8. Data security
3.8.1. Backup and restore features
3.8.2. Access control
3.8.3. User management
3.8.4. Users and data auditing
3.9. GIS support
3.10. Reporting features
3.11. Training
3.12. User interface
3.13. Alerting feature
3.14. Analysis tools
3.15. Scalability
3.16. Extendibility
3.17. System environment
IV. AVAILABLE STATISTICAL DATA MANAGEMENT SYSTEMS
4.1. List of statistical data management systems
4.2. Product descriptions
4.2.1. CountrySTAT (FAO)
4.2.2. DevInfo (UNICEF)
4.2.3. Eurotrace (Eurostat)
4.2.4. LABORSTA (ILO)
4.2.5. Live database (World Bank)
4.2.6. Nesstar
4.2.7. StatBase (UNECA)
4.2.8. StatWorks (OECD)
4.3. Feature comparisons
V. SOFTWARE SELECTION GUIDELINES
5.1. Hidden factors for software selection
5.1.1. Vendor history and experience
5.1.2. Cost
5.1.3. Ease of use/adoption
5.1.4. Maintenance
5.1.5. Familiarity
5.1.6. Security
5.1.7. Software as a service (SaaS)
5.2. Important steps in selecting the right SDMS
5.2.1. Needs analysis
5.2.2. Management support
5.2.3. Requirements specification
5.2.4. RFP preparation
5.2.5. Software demonstration
5.2.6. System selection and contract negotiation
VI. CONCLUSION
VII. RECOMMENDATIONS
ANNEX
1. Questionnaire to NSOs
2. Questionnaire to experts
3. Questionnaire to vendors
REFERENCES
I.
BACKGROUND INFORMATION
1.
There is broad consensus among African Governments and development partners
about the need for better statistics in support of sound policy formulation for the achievement
of internationally-agreed goals, including the Millennium Development Goals (MDGs).
Governments of African States realize that the right use of better statistics is essential for
good policies and development outcomes. This recognition requires more accurate and timely
statistics supported by a robust and integrated information technology environment.
2.
National statistical offices (NSOs) on the continent, however, are providing limited
statistical products and services in terms of quantity, type and quality, and are therefore
unable to respond adequately to the increasing demand by their Governments and the
international community for better development statistics.
3.
One of the recommendations put forward by the Data Management Working Group
during the first and second meetings of the Statistical Commission for Africa was to set up a
group of experts made up of statisticians, and data management and geo-information experts,
to evaluate the major statistical data management platforms available and compare their
features so that member States and their partners can make informed decisions on the
selection of platforms for statistical data collection, production and dissemination. The
recommendation was prompted by the plethora of offers of data management platforms that
member States receive. Some of these offers come at no cost or at reduced cost as part of
assistance projects, while others are priced commercially. Even when there are no financial
costs, accepting all offers would result in duplication of effort, with the associated waste of
scarce human capacity and the possibility of data inconsistencies. Feature documentation of
such statistical data management systems, together with selection guidelines or a handbook,
will therefore facilitate the right platform selection to enhance the sustainability of information
infrastructures and associated tools for the effective management and dissemination of
statistical data, applications and services.
1.1.
Statistical data
4.
The notion of statistical data encompasses all the facts and estimated values relating to a
given entity. In the context of this handbook, “statistical data” refers to sequences of
observations or estimated values of social, economic, political and environmental entities.
Although there are various ways of classifying and differentiating statistical data, micro- and
macrodata are worth mentioning in order to understand the scope of this assignment.
1.1.1.
Microdata
5.
Microdata are data about individual objects such as a person, event, transaction, etc.
Every object can be characterized by properties. The values of these properties are considered
as microdata. In microdata sets, each row typically represents an individual object and each
column an attribute or characteristic feature of the object. Microdata are often collected from
each object through a survey or individual measurement.
1.1.2.
Macrodata
6.
Macrodata are estimated values of statistical characteristics of sets of objects.
Macrodata can be generated by combining, aggregating, or summarizing microdata or by
direct observation and estimation of a group of entities. Macrodata comprise files containing
tabulations, counts and frequencies.
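The relationship between the two kinds of data can be illustrated with a minimal Python sketch. It aggregates a small, invented microdata set (one record per person) into macrodata (a count and a mean per region). The field names and values are hypothetical and are used for illustration only.

```python
from collections import defaultdict

# Hypothetical microdata: one record per individual object (here, a person).
microdata = [
    {"person_id": 1, "region": "North", "age": 34},
    {"person_id": 2, "region": "North", "age": 41},
    {"person_id": 3, "region": "South", "age": 29},
]


def aggregate_by_region(records):
    """Summarize microdata into macrodata: a count and mean age per region."""
    groups = defaultdict(list)
    for record in records:
        groups[record["region"]].append(record["age"])
    return {
        region: {"count": len(ages), "mean_age": sum(ages) / len(ages)}
        for region, ages in groups.items()
    }


macrodata = aggregate_by_region(microdata)
```

Each row of the input describes one object, whereas each entry of the result describes a set of objects, which is the distinction drawn above.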
1.2.
Statistical Data Management Systems
7.
A statistical data management system (SDMS) is a system that can model, store and
manipulate data in a manner well suited to the needs of users who want to perform statistical
analyses on the data. SDMSs offer process-oriented feature sets which help users traverse
from data capture through the process of statistical data validation and production and
information dissemination.
8.
Statistical data analysis functionalities, including data validation, standardization
support, metadata management and indicator management, are some of the core features of
SDMSs which differentiate them from ordinary database systems.
9.
Statistical data management systems are expected to:
(a)
Increase the quality of the statistical information produced;
(b)
Improve processes of statistical data analysis; and
(c)
Modernize and increase the quality of data dissemination.
1.3.
Justification of the assignment
10.
National, regional and subregional statistical offices and organizations often need to
compile data from various sources and disseminate the data to diverse user communities.
That need should be a determining factor in choosing a statistical data management platform
for such offices and organizations, which presupposes that the officers responsible for the
selection have adequate knowledge of the capabilities of the various offerings. That is not
always the case and some offices, therefore, end up with systems that may not fully satisfy
their needs. Some have implemented multiple systems to benefit from system
complementarity.
11.
While it is not necessarily wrong to implement multiple systems if the situation
warrants it, member States have expressed the need for guidance on the capabilities of the
various options to make informed decisions with regard to the optimum platform (or
platforms) for their particular environments. This handbook is therefore intended to address
this need by documenting feature descriptions of the existing platforms. It also presents
guidelines to be followed in selecting the required platform for the task at hand.
1.4.
Organization of this document
12.
This document is organized as follows. Section 2 presents the project definition where
objective, mode of operation and scope are described. Section 3 outlines the critical
requirements of a statistical data management system. This is by no means an exhaustive list
of features, but is intended to serve as a reference for organizations. Section 4 documents the
features of major statistical data management systems which are currently in use in member
States and partner institutions. This is simple feature documentation of SDMSs which should
not be considered as a feature comparison. Section 5 outlines the system selection guidelines
and describes the major factors which influence the process of SDMS selection. This section
also presents the steps to be followed in selecting the right SDMS for an organization.
Finally, concluding remarks and recommendations are presented in Sections 6 and 7
respectively.
II.
PROJECT DEFINITION
2.1.
Objective
13.
The main objective of this initiative is to produce a publication that documents
characteristic features of major statistical data management platforms to serve as a guide for
member States implementing data management services.
2.2.
Mode of operation
14.
In order to achieve the above objective, participatory design principles were strictly
followed in the implementation of this initiative. Participatory design is an approach which
gives much attention to the active involvement of all stakeholders in the whole
implementation process of an initiative. The approach promotes participative communication
and learning among stakeholders (including system vendors, experts, system users and
management) and is also known for reducing last-minute surprises by gradually and
continuously informing the individuals involved in the project.
15.
To that end, the following operations were performed in the course of the initiative:
(a)
An expert group, comprising individuals from different countries and
institutions, was formed to support the initiative;
(b)
An online discussion forum was set up to communicate ideas around selecting
a suitable statistical data management platform;
(c)
An expert group meeting was held and valuable feedback and suggestions on
the draft handbook were forwarded after the discussions;
(d)
Questionnaires were designed and distributed to three different types of
stakeholders, namely national statistical offices, experts and system vendors (see annex);
(e)
Physical observation of a selected site was conducted. This was to gauge how
comfortable users were in using the system. Other working environments for the system were
also taken into consideration;
(f)
A review of technical specifications for selected statistical data management
and dissemination platforms was conducted; and
(g)
Demonstrations of selected statistical data management and dissemination
systems were undertaken.
16.
In general, intensive communications and discussions with all stakeholders were
conducted to produce this document, including via an online discussion forum, emails,
telephone discussions and the distribution of questionnaires.
2.3.
Scope of work
17.
This initiative focused on macrodata management systems, identified as the area of
immediate need by member States. Microdata management platforms will be dealt with
separately as the needs in that area are different.
18.
The project is also limited to analysing and documenting statistical data management
platforms which are currently in use in the national statistical offices of member States and/or
partner institutions. Systems deployed elsewhere are not given much attention in this
document.
III.
MAJOR REQUIREMENTS OF STATISTICAL DATA MANAGEMENT SYSTEMS
3.1.
Data capture
19.
A statistical data management system should, first of all, allow users to capture
statistical data. The main requirement is that the system capture all the data that users
intend to store. The system should also offer appropriate data entry schemes. Some users
compile their data in other software, such as MS Excel, and need to import them into the
system in batch mode.
20.
The system is also expected to validate the data at the time of entry. Data validation is
a critical feature for SDMSs.
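Batch import and validation at the time of entry can be sketched together as follows. The example assumes that the data compiled in a spreadsheet have been exported as CSV text. The column names (indicator, year, value) and the three validation rules are assumptions made for this illustration, not requirements taken from any particular SDMS.

```python
import csv
import io


def validate_row(row):
    """Return a list of error messages for one imported row (empty if valid)."""
    errors = []
    indicator = (row.get("indicator") or "").strip()
    year = (row.get("year") or "").strip()
    value = (row.get("value") or "").strip()
    if not indicator:
        errors.append("missing indicator name")
    if not (year.isdigit() and len(year) == 4):
        errors.append("year must be a four-digit number")
    try:
        float(value)
    except ValueError:
        errors.append("value must be numeric")
    return errors


def import_batch(csv_text):
    """Split a CSV batch into accepted rows and (line number, errors) rejects."""
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        errors = validate_row(row)
        if errors:
            rejected.append((line_number, errors))
        else:
            accepted.append(row)
    return accepted, rejected


# Hypothetical batch: the second data row is invalid in all three columns.
sample = "indicator,year,value\nPupil-teacher ratio,2010,41.5\n,20x0,n/a\n"
accepted, rejected = import_batch(sample)
```

Rejected rows are returned with their line numbers rather than silently dropped, so that the user can correct the source file and import it again.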
21.
Most commercial word processing packages offer an AutoText feature, which has since been
expanded into Building Blocks, to facilitate data entry. In the word processing context, building
blocks are stored snippets that can contain formatted/unformatted text, graphics, and other
objects, which can be defined and inserted by the user into a document when needed.
Building Blocks as a concept can also be implemented in SDMSs to improve data entry by
speeding up the process and reducing errors.
22.
Pulling data through web services from third-party database systems is also a crucial
data capture feature that most SDMSs are required to possess.
3.2.
Data storage and retrieval
23.
Statistical organizations are responsible for collecting and storing huge amounts of
statistical data in order to provide decision makers, researchers and the general public with
timely and accurate information. Given the volume of data maintained and the
users’ expectations and demands for quality data, the processes of storage and providing
access need to be supported by a robust statistical database system.
24.
Storage and retrieval is, therefore, one of the major requirements of any statistical
database system. Database systems need to store huge amounts of data in a systematic
manner. They should also offer a flexible, intuitive and simple retrieval module which assists
decision makers, the general public, and other users with limited system manipulation
expertise to access the information from the database.
3.3.
Data processing and dissemination
25.
Any statistical data management system is expected to perform data processing
activities such as coding, editing and data harmonization, to name just a few. Once data are
processed and the required adjustments are made, the database system should provide a
dissemination facility.
26.
Nowadays, the Internet is the most widely used dissemination medium. This
technology is composed of a number of functional features:
(a)
Electronic mail serves as a common platform for sending electronic messages.
It is most appropriate for periodic reports to a selected and predefined user community;
(b)
Websites are used to publish statistical information at a specified location on
the Internet for the general public; and
(c)
Websites also provide features for transferring statistical data files in
different formats (Excel, PDF, Word, etc.). They are increasingly becoming dissemination
channels for statistical data. They offer a simple, comparatively cheap and efficient way to
provide timely information to the core users of statistics as well as to a broader audience.
27.
Most statistical database systems therefore possess a facility to publish information in
a web readable format. Accordingly, web publishing capability is a critical SDMS
requirement.
3.4.
Standard data sharing and exchange
28.
National statistical offices face tremendous pressure to provide reports to other
organizations including Government offices, international development organizations, and
partners. At the same time, NSOs need to capture data from various sources, including
partner institutions, in different formats. These activities are performed frequently and
entail large data flows. Keying in such data manually is a resource-intensive, tedious and
error-prone activity which needs to be reduced as far as possible.
29.
Synergies, standardization and optimization of processes and infrastructures are the
only solution to this challenge. Standard exchange formats such as Statistical Data and
Metadata Exchange (SDMX) can help by improving quality and efficiencies in the exchange
and dissemination of data and metadata through:
(a)
Harmonization and coherence of data;
(b)
Preservation of meaning by coupling data with metadata that defines and
explains it accurately;
(c)
Use of an open format such as XML rather than a proprietary one; and
(d)
Facilitating and standardizing the use of new technologies such as XML and
Web services. Many NSOs are already using, or are planning to use, XML as the basis for
their data management and dissemination systems. By choosing SDMX, the proliferation of
many XML grammars could be avoided.
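The idea of coupling data with the metadata that explain them, in an open XML format, can be sketched as follows. This is not SDMX-ML: the element and attribute names (dataset, metadata, series, obs, period, value) are hypothetical and were chosen only for this illustration, as were the sample values. A real exchange would follow the message structures defined by the SDMX standard.

```python
import xml.etree.ElementTree as ET


def build_message(indicator, unit, observations):
    """Serialize observations together with their describing metadata as XML."""
    root = ET.Element("dataset")
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "indicator").text = indicator
    ET.SubElement(meta, "unit").text = unit
    series = ET.SubElement(root, "series")
    for period, value in observations:
        ET.SubElement(series, "obs", period=str(period), value=str(value))
    return ET.tostring(root, encoding="unicode")


def read_message(xml_text):
    """Parse the message back into (metadata dict, list of observations)."""
    root = ET.fromstring(xml_text)
    metadata = {child.tag: child.text for child in root.find("metadata")}
    observations = [
        (int(obs.get("period")), float(obs.get("value")))
        for obs in root.find("series")
    ]
    return metadata, observations


# Hypothetical values, used only to show the round trip.
message = build_message(
    "Pupil-teacher ratio (primary)",
    "pupils per teacher",
    [(2009, 43.0), (2010, 41.5)],
)
metadata, observations = read_message(message)
```

Because the receiving side recovers both the figures and their meaning from the same message, no manual re-keying or separate documentation is needed, which is the benefit listed under (b) above.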
3.5.
Metadata management
30.
Metadata are defined as data about data, and refer to the definitions, descriptions of
procedures, methodologies, system parameters and operational results that characterize and
summarize statistical data. Metadata are data describing different quality aspects of statistical
data, such as file contents, and definitions of objects, populations, variables, etc. This
includes details on data accuracy, for example descriptions of the differences between the
observed/estimated and true values of variables and statistical characteristics. Metadata can
include information on which statistical data are available, where they are located, and how
they can be accessed. Metadata also might contain a description of the content and layout,
and a description of validation, aggregation and reports preparation rules. In other words,
metadata can be considered as an entity describing the meaning, accuracy, availability and
other important characteristics of the underlying data. These characteristic features of the
underlying data are essential for correctly identifying and retrieving relevant statistical data
for a specific problem as well as for correctly interpreting and reusing the data.
31.
Metadata are critical because data are only made accessible through their
accompanying documentation. Without a description of their various elements, data resources
will manifest themselves to the end user as more or less meaningless collections of numbers.
The metadata provide the bridge between the producers of data and their users and convey
information that is essential for secondary analysis.
32.
As metadata are critical, metadata management is one of the core requirements of
SDMSs. It is this feature which manages the metadata required for defining the content,
quality, security, accessibility and other aspects of the actual database. The system, through
the metadata management module, is expected to present a description of data content and
layout, as well as a description of validation, aggregation and reports preparation rules.
33.
Currently, standardization of metadata elements makes information sharing more
reliable and universal. The use of metadata standards enables producers to describe data sets
fully and coherently. They also facilitate data discovery, retrieval and use. The Data
Documentation Initiative is an example of a metadata standard which is used for
documenting data sets and designed to be fully machine readable and machine processable.
Metadata standard compliance is another critical requirement that an SDMS needs to
demonstrate.
3.6.
Indicators management
34.
Statistical indicators are any quantitative data that provide evidence about the
quantity, quality or standard of an entity. The following are some examples of indicators
collected by the World Bank (http://data.worldbank.org/indicator):
(a)
Expenditure per student, primary (% of GDP per capita);
(b)
Public spending on education, total (% of government expenditure);
(c)
Expenditure per student, secondary (% of GDP per capita); and
(d)
Pupil-teacher ratio (primary).
35.
In most cases, SDMSs should enable users to create new indicators and manage
existing ones. The management might include operations such as categorizing indicators into
thematic groups, deleting existing indicators, or any other modifications.
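The operations listed above can be sketched as a small in-memory registry. The class and method names, and the thematic group "Public finance", are hypothetical. A real SDMS would persist the catalogue in its database and would typically store further attributes for each indicator.

```python
class IndicatorRegistry:
    """In-memory catalogue of indicators grouped by theme (illustrative only)."""

    def __init__(self):
        self._themes = {}  # indicator name -> thematic group

    def create(self, name, theme):
        """Add a new indicator; refuse duplicates."""
        if name in self._themes:
            raise ValueError(f"indicator already exists: {name}")
        self._themes[name] = theme

    def recategorize(self, name, theme):
        """Move an existing indicator to another thematic group."""
        if name not in self._themes:
            raise KeyError(name)
        self._themes[name] = theme

    def delete(self, name):
        """Remove an existing indicator."""
        del self._themes[name]

    def by_theme(self, theme):
        """List the indicators of one thematic group in alphabetical order."""
        return sorted(n for n, t in self._themes.items() if t == theme)


registry = IndicatorRegistry()
registry.create("Pupil-teacher ratio (primary)", "Education")
registry.create("Expenditure per student, primary (% of GDP per capita)", "Education")
registry.recategorize(
    "Expenditure per student, primary (% of GDP per capita)", "Public finance"
)
```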
3.7.
Integration with other systems
36.
In this era of technology, no organization can expect a single software system to
manage all of its processes. For different reasons, most organizations
deploy multiple technology solutions over time to manage their day-to-day activities.
Ultimately, however, as those systems are working to realize the vision of a single
organization, the need for integration arises. The same requirement might arise with statistical
data management systems.
37.
System integration deals with making two or more systems communicate. Such
communication can happen with different levels of proximity. Support for a standard
import/export facility can be used to transfer data from one system to another, or to refer to
data held in another database.
3.8.
Data security
38.
Data security is a broad concept and can be defined from various perspectives, each
defining separate SDMS requirements. Some of these perspectives are presented in the
subsections below.
3.8.1.
Backup and restore features
39.
An SDMS should provide an automatic backup feature for all inputs made to the
system. It should also furnish a restore facility, which will enable the system to recover lost
data. A manual backup and restore feature is also a crucial component of any database
system. Users (administrators) should be allowed to configure periodic backups or run one-time
backup processes.
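A one-time backup followed by a restore can be sketched with the sqlite3 module of the Python standard library, whose Connection.backup method copies one database into another. SQLite is used here only as a stand-in, since the handbook does not prescribe a database engine. The table name and contents are hypothetical, and scheduling of periodic backups is left out of the sketch.

```python
import sqlite3


def backup_database(source, target):
    """Copy the whole source database into the target connection."""
    source.backup(target)


# Hypothetical live database with one table of observations.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE observation (indicator TEXT, year INTEGER, value REAL)")
live.execute("INSERT INTO observation VALUES ('Pupil-teacher ratio', 2010, 41.5)")
live.commit()

# One-time manual backup into a second database.
backup = sqlite3.connect(":memory:")
backup_database(live, backup)

# Simulate data loss, then restore by copying the backup over the live database.
live.execute("DELETE FROM observation")
live.commit()
backup_database(backup, live)

restored = live.execute(
    "SELECT indicator, year, value FROM observation"
).fetchall()
```

In practice the backup target would be a file on separate storage rather than a second in-memory database.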
3.8.2.
Access control
40.
In most database systems access control is defined through roles, which determine
permissions. Roles are job functions within the context of an organization, with some associated
semantics regarding the authority and responsibility conferred on the user assigned to the
role. A role can be configured to consolidate the users’ responsibilities, and the permissions
that users require to perform a specific function.
41.
Permissions can be granted to access functions such as data editing, data approval and
other administrative functions, or to access restricted data such as that which is
geographically specific. Role-based access control is required because it simplifies mass
updates of user permissions; an organization need only change the permissions of a role, and
the users assigned that role will inherit the new set of permissions automatically.
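The way in which a change to a role propagates to its users can be sketched as follows. The class, role, permission and user names are hypothetical. The sketch keeps everything in memory, whereas a real SDMS would store roles and assignments in its database.

```python
class AccessControl:
    """Role-based access control: users hold roles, roles hold permissions."""

    def __init__(self):
        self.role_permissions = {}  # role name -> set of permissions
        self.user_roles = {}        # user name -> set of role names

    def define_role(self, role, permissions):
        """Create a role or replace the permissions of an existing one."""
        self.role_permissions[role] = set(permissions)

    def assign(self, user, role):
        """Give a user an additional role."""
        self.user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, permission):
        """Check whether any of the user's roles grants the permission."""
        return any(
            permission in self.role_permissions.get(role, set())
            for role in self.user_roles.get(user, set())
        )


acl = AccessControl()
acl.define_role("data_editor", {"edit_data"})
acl.assign("amina", "data_editor")
acl.assign("kofi", "data_editor")
before = acl.is_allowed("amina", "approve_data")

# A single change to the role updates every user assigned to it.
acl.define_role("data_editor", {"edit_data", "approve_data"})
after = acl.is_allowed("amina", "approve_data") and acl.is_allowed("kofi", "approve_data")
```

Permissions are never attached to users directly, so one update to the role is enough, however many users hold it.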
3.8.3.
User management
42.
An SDMS, especially if it runs in a multi-user environment, is expected to provide a
feature that enables organizations to define administrative functions and manage users based
on specific requirements such as job role or geographic location.
Depending on the nature of the organization, different approaches can be followed to create a
user as indicated below:
(a)
User registration by centralized administration: In this approach, a system
administrator is responsible for creating and managing all users of the system. This approach
is appropriate in cases where the database has a small number of users;
(b)
Delegated administration: Instead of relying on a centralized administrator
to manage all users, an organization can create local administrators and grant them sufficient
privileges to manage a specific subset of the organization's users. This provides the
organization with a more granular level of security, and the ability to make the most effective
use of its administrative capabilities; and
(c)
Self-service requests: This approach enables end users to request initial
or additional access to the system. Requests are either approved automatically by the
system (with minimal privileges) or reviewed by the system administrator before approval.
Self-service registration is well suited to cases where the number of system users is large
or where users are not known to the administrator before they request data. This
approach is mostly used to grant access privileges for websites.
3.8.4.
Users and data auditing
43.
An SDMS should possess a feature to audit users and changes they make to the
database. It should allow the tracking of users' activities. Audit reports should give detailed
historical information on users' activities. Some applications offer real-time information on
user activities.
44.
Audit trails also help to keep a history of changes to important data. With an audit trail,
it is easy to determine how data elements obtained their current value.
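A minimal sketch of such an audit trail is given below. Every change is appended to a list together with the user, the old value, the new value and a timestamp, so that the path to the current value can be reconstructed. The class, key and user names are hypothetical.

```python
from datetime import datetime, timezone


class AuditedStore:
    """Key-value store that records who changed which value, and when."""

    def __init__(self):
        self._values = {}
        self.trail = []  # append-only list of change records

    def set(self, key, value, user):
        """Record the change in the trail, then apply it."""
        self.trail.append({
            "key": key,
            "old": self._values.get(key),
            "new": value,
            "user": user,
            "at": datetime.now(timezone.utc),
        })
        self._values[key] = value

    def history(self, key):
        """Return the (user, old, new) changes that led to the current value."""
        return [
            (entry["user"], entry["old"], entry["new"])
            for entry in self.trail
            if entry["key"] == key
        ]


store = AuditedStore()
store.set("pupil_teacher_ratio_2010", 43.0, user="amina")
store.set("pupil_teacher_ratio_2010", 41.5, user="kofi")
changes = store.history("pupil_teacher_ratio_2010")
```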
3.9.
GIS support
45.
Geographical information system (GIS) technology is now mature and is widely used to
present attractive and intuitive reports using maps. Because most statistical data are
geographically referenced, GIS support is a critical feature requirement of an SDMS.
3.10.
Reporting features
46.
An SDMS should have a reporting engine which allows users to generate different
types of reports. The reporting engine is expected to have predefined report templates as well
as allow users to design new and ad hoc reports on the fly.
3.11.
Training
47.
The critical factor in the success of a major system implementation project is the
knowledge transfer that takes place before and during implementation. This can be
accomplished using a combined approach where the main objectives are both to educate and
to train.
48.
Hence, training, though not directly considered as a statistical system requirement, is
a major factor to be considered when evaluating a specific platform. Questions such as the
following should be asked:
(a)
Does the vendor have a sound training strategy?
(b)
What is the training approach?
(c)
Is there separate training for ordinary end users and key users?
3.12.
User interface
49.
The SDMS user interface is the medium which helps the user to communicate with
the system. In order for the user to fully utilize the system’s functionalities, the system must
have a simple, attractive and intuitive user interface. Generally, graphical user interfaces are
preferable to their command line counterparts. Items to consider when evaluating a user
interface include:
(a)
Ease of customization of the look and feel of the user interface by database
managers without the intervention of the developer. These are simple modifications, such as
increasing/decreasing font size, changing colours of buttons, menus and texts, of the user
interface;
(b)
Validating data entry - when users enter invalid data, the system should return
an error message so that the user can correct the invalid entry;
(c)
Error reporting/feedback - the system should offer a facility to report
unexpected errors to the developer; and
(d)
Wizards - the system should guide the user step by step to complete processes.
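The data entry validation described in item (b) above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the field names (indicator, year, value) and the accepted year range are hypothetical and would be defined by each organization.

```python
def validate_observation(record):
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []

    # The indicator name must be present and non-blank.
    if not str(record.get("indicator", "")).strip():
        errors.append("Indicator name is required.")

    # The year must be an integer within an assumed plausible range.
    year = record.get("year")
    if not isinstance(year, int) or not 1900 <= year <= 2100:
        errors.append("Year must be a whole number between 1900 and 2100.")

    # The value must be convertible to a number.
    try:
        float(record.get("value"))
    except (TypeError, ValueError):
        errors.append("Value must be numeric.")

    return errors


good = {"indicator": "Population", "year": 2010, "value": "82.9"}
bad = {"indicator": "", "year": "2010", "value": "n/a"}
print(validate_observation(good))
for message in validate_observation(bad):
    print(message)
```

Returning all messages at once, rather than stopping at the first error, lets the user correct every invalid entry in a single pass.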
50.
The system should also offer an expert mode whereby expert users can use shortcuts
to operations.
3.13.
Alerting feature
51.
In organizational applications, users (mostly managers) commonly prefer to be
notified when a predefined event occurs in the database. This event could be a new
inclusion in the database, an approval request, or a threshold being exceeded. Such a function is
known as an alerting feature.
52.
An SDMS is required to have an alerting feature, which will assist users to configure
alert types and alert recipient groups. Once configured, the system should automatically send
alerts at the time of occurrence of the predetermined event.
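A configurable alerting feature of this kind could be sketched in Python as shown below. The rule names, the event fields and the e-mail addresses are hypothetical, and the send function is a stand-in for whatever delivery channel (e-mail, SMS, on-screen message) an actual system would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AlertRule:
    name: str                           # alert type, as configured by the user
    condition: Callable[[dict], bool]   # returns True when the event should raise an alert
    recipients: list                    # members of the alert recipient group


def dispatch_alerts(event, rules, send):
    """Check an event against every configured rule and notify the recipients."""
    fired = []
    for rule in rules:
        if rule.condition(event):
            for recipient in rule.recipients:
                send(recipient, f"[{rule.name}] {event}")
            fired.append(rule.name)
    return fired


rules = [
    AlertRule("threshold exceeded",
              lambda e: e.get("type") == "update" and e.get("value", 0) > 100,
              ["manager@example.org"]),
    AlertRule("approval request",
              lambda e: e.get("type") == "approval_request",
              ["approver@example.org"]),
]

outbox = []  # collects (recipient, message) pairs instead of really sending them
event = {"type": "update", "indicator": "CPI", "value": 120}
fired = dispatch_alerts(event, rules, lambda to, msg: outbox.append((to, msg)))
print(fired, outbox)
```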
3.14.
Analysis tools
53.
Data analysis is an integral part of SDMSs. The major requirement is that an SDMS
should possess an easy-to-use analysis tool. Users should be able to easily understand the tool
and interpret the results.
3.15.
Scalability
54.
Scalability is the ability of software to handle a growing amount of work. In statistical
database systems this is generally related to the increasing amount of data. If the system
quickly reaches a point where it cannot support new additions of data, users, and/or nodes of
operation, the system is not scalable.
55.
An example of a scalability requirement can be described as follows:
The system should be able to support up to five years of the maximum expected increase in
database size, number of terminals/workstations, and/or activity levels without a server
upgrade or a significant decrease in system response or performance.
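One simple way to check such a requirement is to project growth over the planning period and compare it with the capacity of the current server. The Python sketch below assumes compound annual growth; the starting size, growth rate and capacity figures are purely illustrative assumptions.

```python
def projected_size(current_size_gb, annual_growth_rate, years):
    """Project database size assuming compound annual growth."""
    return current_size_gb * (1 + annual_growth_rate) ** years


def meets_scalability_requirement(current_size_gb, annual_growth_rate,
                                  capacity_gb, years=5):
    """True if the projected size after `years` still fits the server capacity."""
    return projected_size(current_size_gb, annual_growth_rate, years) <= capacity_gb


# Illustrative figures: 50 GB today, growing by 30 per cent a year.
size_in_five_years = projected_size(50, 0.30, 5)
print(round(size_in_five_years, 1))
print(meets_scalability_requirement(50, 0.30, capacity_gb=200))
print(meets_scalability_requirement(50, 0.30, capacity_gb=150))
```

The same projection can be repeated for the number of users or workstations, since each may grow at a different rate.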
3.16.
Extendibility
56.
Extendibility is the extent to which software can be adapted to new requirements. It
refers to the magnitude of the effort required to add additional features after implementation
of an SDMS. Database systems developed on the basis of component-based architecture are
mostly highly extendible. They use plug-and-play components for new additional features.
57.
As the addition of new features after implementation of a statistical database system is
inevitable, the SDMS is required to be extendible so that the owning organization can
incorporate new features with minimal effort and expense.
3.17.
System environment
58.
The system environment in which an SDMS is running should be given due attention.
It is quite difficult to strictly identify a specific environment setting as this varies from
organization to organization.
59.
The system environment refers to the operating system and the relational database
engine an SDMS is running on. Consideration should be given to whether the software can
run on a network or is a stand-alone product.
IV.
AVAILABLE STATISTICAL DATA MANAGEMENT SYSTEMS
4.1.
List of statistical data management systems
60.
By employing different data collection methods such as the administration of
questionnaires, discussion forums and a literature review, we have come to understand that
the following statistical data management systems are in use on the continent. It should be
noted, however, that the following list only includes those systems which deal with
macrodata management:
(a)
CountrySTAT (Food and Agriculture Organization of the United Nations - FAO);
(b)
DevInfo (United Nations Children’s Fund - UNICEF);
(c)
Eurotrace (Eurostat);
(d)
LABORSTA (International Labour Organization - ILO);
(e)
Live Database (World Bank);
(f)
Nesstar;
(g)
StatBase (United Nations Economic Commission for Africa - UNECA); and
(h)
StatWorks (Organization for Economic Cooperation and Development - OECD).
4.2.
Product descriptions
4.2.1
CountrySTAT (FAO)
61.
CountrySTAT is a statistical database system for food and agriculture statistics at the
national and subnational levels. It provides access to statistics across thematic areas such as
production, prices, trade and consumption. CountrySTAT is the country-specific version of a
statistical data management system called FAOSTAT which is deployed at FAO.
CountrySTAT serves as a complementary system to FAOSTAT, in that the two systems can
seamlessly integrate for data sharing and consolidation. FAOSTAT is designed to consolidate
data transferred from specific CountrySTAT deployments to generate quality international
statistics on food and agriculture.
62.
CountrySTAT has two data categories, namely core and details. The core data
category consists of national data shared with the FAOSTAT database. The design of the core
data category enables both FAO and country-level statistical offices to easily transfer data
between their respective STAT databases. On the other hand, the details category provides
more detailed data with subnational relevance and with the lowest levels of disaggregation.
63.
CountrySTAT can operate in many popular data formats: HTML, XML, Microsoft
Excel, Microsoft Access, Comma-Separated Values (CSV) files and others. In addition, SDMX
Technical Standards Version 2.0 is supported for the exchange of data and metadata based on
a common information model. As it is a web-based system, there is no need to build costly
new computer networks to link government offices for the purpose of data exchange.
64.
Deploying CountrySTAT requires a Windows operating system, Microsoft Internet
Information Server, and PC-Axis and PC-Web family software. Depending on the
implementation environment, CountrySTAT can be deployed with a PC-Axis database or can
be extended to utilize popular database engines including Oracle, Sybase or MS-SQL.
4.2.2. DevInfo (UNICEF)
65.
DevInfo is an integrated desktop and web-enabled tool that supports both standard
and user-defined indicators. A standard set of MDG indicators is at the core of the DevInfo
package. In addition, at the regional and country levels, database administrators have the
option to add local indicators to their databases. The software supports an unlimited number
of levels of geographical coverage: from the global level to regional, subregional, national
and subnational levels down to subdistrict and village levels (including data on schools,
health centres, water points, and other infrastructure).
66.
DevInfo has simple and user-friendly features that can be used to query the database
and generate tables, graphs and maps. The system provides an ideal tool for evidence-based
planning, results-focused monitoring, and advocacy. It allows data to be organized, stored
and displayed in a uniform way to facilitate data sharing at the country level across
government departments, United Nations agencies and development partners.
67.
Data from DevInfo can be exported to XLS, HTML, PDF, CSV and XML files and
imported from spreadsheets in a standardized format. DevInfo also has a data exchange
module for importing data from industry-standard statistics software packages such as SPSS,
SAS, Stata, Redatam, and CSPro.
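To make the notion of exporting the same data to several formats concrete, the following Python sketch writes a small set of records to CSV and XML using only the standard library. It is not DevInfo's export module: the function names, the record layout and the XML element names are assumptions chosen for illustration, and the sample figure is invented.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_csv(records, fieldnames):
    """Serialize a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records, root_tag="dataset", row_tag="observation"):
    """Serialize the same records as XML; keys must be valid XML tag names."""
    root = ET.Element(root_tag)
    for record in records:
        row = ET.SubElement(root, row_tag)
        for key, value in record.items():
            ET.SubElement(row, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Invented sample record, for illustration only.
records = [
    {"indicator": "Literacy rate", "area": "Kenya", "year": 2009, "value": 87.0},
]
csv_text = to_csv(records, ["indicator", "area", "year", "value"])
xml_text = to_xml(records)
print(csv_text)
print(xml_text)
```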
68.
DevInfo is distributed royalty-free to all member States and United Nations agencies
for deployment on both desktops and the Internet. The user interface of the system and the
contents of the databases it supports include country-specific branding and packaging options
which have been designed to ensure broad ownership by national authorities. UNICEF has
absolutely no restrictions on the database and its use.
69.
The most common DevInfo users include United Nations country teams, national
statistical offices, planning ministries and district planners. Frequent users also comprise
members of the media (for reporting and tracking human development data), educational
institutions (for analysing data and helping students gain data access), as well as DevInfo
administrators (for customizing the system and adding data through advanced database
administration modules).
4.2.3. Eurotrace suite
70.
Eurotrace is a statistical data processing software package for external trade statistics,
designed by Eurostat. It has end-to-end features which enable users to capture, process, store
and disseminate statistical data. More specifically, it has tools for data entry, data transfers,
data checking, data editing, validation, and dissemination. Eurotrace can also be used as a
companion package with ASYCUDA (Automated System for Customs Data).
71.
It is composed of the following three main software modules:
(a)
Eurotrace Editor - is designed to allow users to enter data efficiently and
export it to Eurotrace DBMS. With Eurotrace, data can be grouped into manageable subsets
that can be distributed to many people for correction and adjustment. It keeps track of which
subsets are produced and matches the corrected subsets to the original, ensuring control of the
complete distribution of the data processing effort. Eurotrace also provides wizards for
translation and aggregation of data;
(b)
Eurotrace DBMS - allows users to carry out the complete preparation of statistical data
including metadata management, management of validation rules, data aggregation and
transformation, and management of data import/export. Manual data correction and editing is
also possible by exporting data to Eurotrace Editor; and
(c)
Comext Browser - is a system for the storage, analysis and retrieval of
statistical data, which is used to view, extract and do calculations on external trade data.
Comext also offers facilities to assist the dissemination of data. The browser has both server
and client versions. The main characteristics of the Comext Browser include its
multidimensional and virtual spreadsheet, integration with Microsoft Excel, exporting to
HTML and XML formats, online analytical processing (OLAP) engine to perform
aggregation and/or combine data among different nomenclatures on the fly, and multilingual
nomenclatures for successor-predecessor relationships.
72.
The following figure presents the three basic components of Eurotrace and their
integration:
Fig. 1. Eurotrace Suite Modules (Source: Eurotrace brochure)
73.
From the technical point of view, the Eurotrace suite of programmes is built with
Microsoft Visual Basic and C++ programming languages and supports Data Access Objects
(DAO) and open database connectivity standards (ODBC).
4.2.4.
LABORSTA (ILO)
74.
LABORSTA is a statistical data management platform of the International Labour
Organization, operated by the ILO Department of Statistics. It was developed to manage
labour-related statistical data such as:
(a)
Total and economically active population;
(b)
Employment;
(c)
Unemployment;
(d)
Hours of work;
(e)
Wages;
(f)
Labour costs;
(g)
Consumer price indices;
(h)
Occupational injuries;
(i)
Strikes and lockouts;
(j)
Household income and expenditure; and
(k)
International labour migration.
75.
It also has a predefined metadata definition which can be accessed by users of the
system. Users can file their request to the system in a query form and can download their
query results in a format of their choice.
4.2.5. Live database (World Bank)
76.
The Live Database (LDB) is a user-friendly computer-based data tool that consists of:
(a)
A Local Database - a tool for in-depth economic work;
(b)
Query - a tool for storing and manipulating economic and sectoral variables; and
(c)
Africa Briefings - presorted ready-to-use data.
77.
The system was developed by the World Bank’s Africa Region with two
complementary goals in mind: in the short term, to provide staff in the region with an
efficient means of collecting, analysing and manipulating economic and sectoral data, and in
the long term, to become the linchpin of a major capacity-building effort in African countries,
aimed at upgrading local capacity in statistical data collection and analysis.
78.
LDB is a fully web-based system with an intuitive and friendly user interface. It
utilizes web services to allow seamless integration and data sharing with third-party systems.
79.
LDB is also equipped with OLAP technology. This means that users can perform
complex calculations on the fly. Such capabilities were previously either unavailable or required
expensive programmers to implement. At the same time, the system is designed as a toolkit,
using off-the-shelf technology that allows it to be replicated, transferred and installed
anywhere with little software and hardware know-how.
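The kind of on-the-fly aggregation that OLAP tools provide can be illustrated with a small roll-up function in Python. This is a simplified sketch, not the LDB implementation: the function name, the dimension names and the sample figures are all assumptions made for illustration.

```python
from collections import defaultdict


def roll_up(records, dimensions, measure):
    """Sum a measure over all records, grouped by the chosen dimensions."""
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record[measure]
    return dict(totals)


# Invented sample data, for illustration only.
records = [
    {"country": "Ghana", "sector": "Agriculture", "year": 2009, "value": 10.0},
    {"country": "Ghana", "sector": "Industry", "year": 2009, "value": 5.0},
    {"country": "Kenya", "sector": "Agriculture", "year": 2009, "value": 7.5},
    {"country": "Ghana", "sector": "Agriculture", "year": 2010, "value": 11.0},
]

by_country = roll_up(records, ["country"], "value")
by_country_year = roll_up(records, ["country", "year"], "value")
print(by_country)
print(by_country_year)
```

Changing the list of dimensions changes the level of aggregation without any change to the stored data, which is the essence of the capability described above.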
80.
LDB has the built-in flexibility to allow the addition of new indicators, customization
of standard reports, change in the methodology used to calculate growth rates, etc. It is a
system, not simply a database with current data from the World Bank.
4.2.6. Nesstar
81.
Nesstar is a suite of software tools which offer features to publish, locate, and access
statistical data. It represents a system of software architecture that helps users to create,
locate, access and operate statistical data. Nesstar has added a level to the already existing
web technology by creating a web server geared towards statistical data manipulation based
on widely adopted data documentation standards. Accordingly, recognized
standards such as the Data Documentation Initiative and open source initiatives such as JBoss
are key components of the Nesstar suite of products.
82.
Though there are quite a number of tools available, the main Nesstar components are
the following:
(a)
Nesstar Publisher - is a data management programme, which consists of data
and metadata conversion and editing tools, enabling the user to prepare materials for
publication to a Nesstar server. It can also be used as a stand-alone tool for the preparation of
data and metadata. The Publisher enables users to enhance data sets by combining a wide
range of catalogue and contextual information, which can then be viewed within the Nesstar
web client called Nesstar WebView;
(b)
Nesstar Server - is built as an extension to a normal web server by
incorporating statistical data management features. As well as providing all the usual
facilities for publishing web content, this server provides the ability to publish statistical
information that can be searched, browsed, analysed and downloaded by users. This is done
either by using a standard web browser or using Nesstar WebView; and
(c)
Nesstar WebView - is a web-based system for the dissemination of statistical
data which can be used to view tabular (cube) data as well as metadata that have been
published using Nesstar Publisher and made available on a Nesstar server. The WebView
allows users to search for, locate, browse, analyse, and download a wide variety of statistical
and related data within a web browser. With the help of third party mapping solutions such as
GeoServer, it can also display statistical data in maps. It also helps users to perform data
analysis including cross-tabulation, correlation and regression.
4.2.7. StatBase
83.
StatBase is a statistical data management platform developed by the African Centre
for Statistics of UNECA as a central database system to manage all macro time series at
subregional offices and NSOs of member States. StatBase aims at sound and proper
management of macrodata and easy access to statistical information by users of all categories.
84.
StatBase is developed using the latest web-based architecture in order to benefit from
technological advancements, and is based on stable database management systems. The
back-end component of the StatBase application runs on Windows as server operating system, MS
SQL as database server, and Internet Information Server as application server.
85.
As StatBase is a web-enabled system, it follows all the web-based user-interface
standards which make access to the system intuitive and simple. Major features of StatBase
are:
(a)
Multi-user functionality;
(b)
Multisector data management capability;
(c)
Document management functions;
(d)
Structured generically to allow management of indicators;
(e)
Metadata management capability;
(f)
Import/export functionality;
(g)
Role-based access control;
(h)
Storage, retrieval and dissemination of data at national and subnational
levels (up to four levels in addition to cities/towns) and at different periodicities (annual,
quarterly and monthly);
(i)
Parameter driven application;
(j)
A centralized database system;
(k)
Latest relational database technology;
(l)
Complete scalability for any size of data and number of users; and
(m)
Facilitates textual data management.
4.2.8. StatWorks (OECD)
86.
StatWorks is a generic software toolkit for statistical database management designed
and implemented by OECD. It uses MS SQL as a database engine where statistical data is
stored and managed. The platform manages statistical production processes including initial
data migration, database administration, security management, data capture and validation,
indicators management, metadata management, data querying, and data export.
87.
StatWorks is designed to be fully integrated with other statistical data management
tools developed by OECD. The statistical information system architecture of OECD has the
following major tools which are vital for StatWorks:
(a)
OECD.stat - a data-sharing and dissemination environment of OECD. It is a
data warehouse platform designed to store and disseminate corporate statistical data.
Third-party OLAP tools can also be used to analyse the data stored in the warehouse;
(b)
MetaStore - a web-based system designed to manage metadata which
describes characteristic features of data sets including structure, collection methods,
manipulation techniques, quality attributes, etc; and
(c)
OECD eXplorer - a web-based interface to explore, analyse and visualize
statistics. It has mapping features and visual presentations such as bubble charts and a parallel
coordinates plotter, which enable users to analyse groups of areas of interest.
88.
The overall OECD statistical data analysis environment is depicted in the following
figure:
Fig. 2. OECD statistical data analysis environment (Source: [5])
89.
StatWorks developers are in the process of replacing existing CD-ROM-based data
exchange services with a web-based warehouse-to-warehouse SDMX enabled system. In
addition, StatWorks intensively utilizes spreadsheets, most notably Excel, for data
computation, presentation and visualization.
4.3.
Feature comparisons
The following table presents a summary of the features of the statistical database systems described above:
Systems compared: CountrySTAT, DevInfo, Eurotrace, LABORSTA, LDB, Nesstar, StatBase, StatWorks
Features:
Data storage and retrieval
Data entry features
Data processing and dissemination
SDMX support
Metadata management
Indicators management
User management
Multi-user support
GIS support
Data security features
Graphical user interface
Customization capabilities
Availability of wizards that guide users through a series of steps necessary to complete a defined process, without the use of commands or traditional menus
Ability to perform checking and validation of user input before sending data to the server
Alerting system
Audit trail management
yes
yes
yes
yes
yes
NIF
yes
yes
no
yes
yes
no
no
yes
yes
yes
yes
yes
yes
yes
yes
yes
yes
yes
no
no
yes
yes
yes
yes
yes
NIF
yes
yes
no
yes
yes
no
yes
yes
yes
yes
NIF
NIF
NIF
yes
yes
no
yes
yes
no
no
yes
yes
yes
NIF
NIF
NIF
yes
yes
yes
yes
yes
no
no
yes
yes
yes
yes
yes
NIF
yes
yes
no
yes
yes
no
no
yes
yes
yes
no
yes
yes
yes
yes
yes
yes
yes
no
no
yes
yes
yes
yes
yes
NIF
yes
yes
yes
yes
yes
no
no
NIF
yes
NIF
NIF
NIF
NIF
yes
NIF
NIF
NIF
NIF
yes
no
NIF
no
NIF
NIF
NIF
no
NIF
no
yes
NIF
yes
NIF: No information found
Systems compared: CountrySTAT, DevInfo, Eurotrace, LABORSTA, LDB, Nesstar, StatBase, StatWorks
Features:
Web-based
Multi-language support
Multisector support
Platform independence
Creation of new report templates
Continuous and automatic backup
End user training
Availability of ongoing support and maintenance services after implementation
Availability of context-sensitive help
Document management
yes
yes
no
NIF
no
no
yes
yes
yes
yes
yes
no
no
yes
yes
yes
yes
yes
yes
no
no
yes
yes
yes
yes
no
no
no
no
NIF
NIF
yes
yes
yes
NIF
no
no
NIF
NIF
yes
yes
yes
yes
no
no
yes
yes
yes
yes
yes
yes
no
no
yes
yes
yes
yes
yes
NIF
no
no
yes
NIF
no
NIF
no
NIF
no
NIF
no
NIF
no
NIF
no
NIF
no
yes
no
yes
V.
SOFTWARE SELECTION GUIDELINES
90.
Successful software selection and implementation begins with a comprehensive project
domain specification and planning. However, there are issues which are mostly not visible or
underestimated at the planning phase but have a huge negative influence on the project if not
properly handled. Such hidden factors are discussed in the following section, which is followed
by a presentation of statistical data management system selection guidelines.
5.1.
Hidden factors for software selection
91.
Software is an integral part of day-to-day activities in an organization. It is no less
important than the products and services acquired. This also applies to statistical data
management systems. In many ways, selecting an SDMS is no different from selecting a product
or service. Naturally, some of the same purchase criteria apply – brand, service, and maintenance
costs. Obvious as this may be, the selection of an SDMS, or of any software for that matter,
remains a grey zone, an underdeveloped arena. This accounts for the high incidence of “shelfware” –
software that is bought with grand intentions, but ends up on dusty shelves. This mainly happens
because purchases are made based on what immediately meets the eye – technical features. This
mistake is understandable, because technical features are well documented and advertised, and
easy for the buyer to use as purchasing criteria. But with this approach, factors that are equally, if
not more important, but not as immediately obvious, are neglected. Some of these critical factors
are presented in the subsequent paragraphs.
5.1.1. Vendor history and experience
92.
Vendor background is essential because vendors, directly or indirectly, are likely to be
responsible for working with the sensitive statistical data of an organization. A background
check is crucial as the investment needs to be made with a dependable vendor with a proven
track record. Some questions to ask about the vendor would be:
(a)
How long has the vendor operated?
(b)
How long has the vendor been in the field of software development?
(c)
Is the vendor the software developer, or are they merchandising the software?
(d)
What is the vendor’s niche? Does the vendor understand our organization’s niche
well enough to know our needs?
(e)
Who are the customers of this vendor?
(f)
Who is using this SDMS?
(g)
What did the customers say about the vendor/SDMS?
5.1.2. Cost
93.
There is no denying the importance of cost effectiveness in software choices. Yet costs
should be seen in a broad perspective; low entry costs may well result in higher total costs over
the life of the system. Both one-time fixed costs and subsequent recurring costs should be
considered when selecting an SDMS.
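The point that a low entry cost can lead to a higher total cost can be shown with a short calculation. The Python sketch below uses invented figures for two hypothetical options; the five-year period is likewise an assumption.

```python
def total_cost(one_time_cost, recurring_cost_per_year, years):
    """Total cost over the life of the system: fixed cost plus recurring costs."""
    return one_time_cost + recurring_cost_per_year * years


# Invented figures for two hypothetical options over five years.
option_a = total_cost(one_time_cost=10_000, recurring_cost_per_year=8_000, years=5)
option_b = total_cost(one_time_cost=25_000, recurring_cost_per_year=3_000, years=5)

print(option_a)  # 50000: cheaper to acquire, more expensive overall
print(option_b)  # 40000: more expensive to acquire, cheaper overall
```

Recurring costs would typically include licences, maintenance, support, hosting and training.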
94.
A cost-benefit analysis is a critical activity to determine investment feasibility. Costs
should be compared with the system’s range of features and functionalities. A system may not be
the cheapest, but it may allow you to perform many functions. On the other hand, opting for
many features can be a trap, because users never get around to using half of them. Many features
may not relate to the needs to be addressed.
5.1.3. Ease of use and adoption
95.
A system should have an intuitive interface, and the use of features should be self-evident.
The shorter the learning curve in training a new user, the better. The software should
also have the ability to easily fit into existing systems with which it will have to communicate.
96.
Adoption strategy and the vendor’s experience in assisting the organization’s training of
end users are key areas to be taken into consideration. One of the key selection criteria would be
the knowledge transfer approach the vendor is following with regard to the system. This could be
end user training, key user training in system implementation and configuration, and continuous
or one-off training. Free training seminars, or their newer form, webinars (online seminars),
greatly help users to get up to speed with the software at no extra cost. In some cases the company
might offer paid training, which may be essential.
97.
If users are not equipped with the necessary skills and knowledge to operate the system,
and if the system is too complex, it will become “shelfware”. Hence, ease of use and adoption
approaches need to be given due attention in the SDMS selection process.
5.1.4. Maintenance
98.
Maintenance costs and effort have a major impact on the performance and adoptability of
an SDMS, and hence, form an important criterion of the buying decision. If the system is hosted
by the vendor, it is of utmost importance that it be available online (“uptime”) at all times. A
minimum uptime of 99 per cent should be sought. Signing a service level agreement is a
common practice to ensure a contractual agreement with the vendor that quantifies performance
indicators such as uptime.
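When negotiating a service level agreement, it helps to translate an uptime percentage into the downtime it actually permits. The following Python sketch does this for a non-leap year; the function name is an assumption made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted in the period at the given uptime level."""
    return period_hours * (1 - uptime_percent / 100)


print(round(allowed_downtime_hours(99), 1))     # 87.6 hours per year
print(round(allowed_downtime_hours(99.9), 2))   # 8.76 hours per year
```

An uptime of 99 per cent therefore still permits more than three and a half days of unavailability per year, which may or may not be acceptable depending on how the system is used.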
99.
The vendor’s upkeep of the system is also important. Efforts exerted by the vendor to
constantly improve the system indicate commitment to providing quality services. Evaluating the
frequency of bug fixes, upgrade releases and the availability of strong user communities
demonstrates whether the vendor is active and dependable for post-implementation maintenance.
It is a good habit to review the vendor’s newsletter, release notes or the “what’s new” section on
its website, as frequent updates are indicative of a dynamic vendor.
5.1.5. Familiarity
100. The “look and feel” of the system is a major selection criterion. The new system should
keep the basic layout and navigation schemes that are already familiar, as this makes for a
quicker transition for the users. A comparison with the operating system in which the system is
to be implemented is recommended. For instance, a system with a Mac look and feel would not fit
well in Windows. It is also customary to look for an SDMS which can run in a familiar
environment in terms of operating system, development tools, reporting tools, etc.
5.1.6. Security
101. Security is a top consideration for statistical data management systems. The organization
needs to be assured that its data are secure and that there are no risks of data being compromised.
The extent of the security consideration might vary from one organization to another depending
on the sensitivity of data.
102. As discussed in the previous sections, security has various elements: data security,
function security and system security are all factors that should be given due attention.
5.1.7. Software as a service (SaaS)
103. With the emergence and maturity of cloud computing, services such as SaaS are gaining
popularity. SaaS is a scheme whereby the services of a software system, in this case an SDMS, are
acquired without physically purchasing the software. The software is hosted or deployed at the
site of the developer or vendor and access privilege is given to the client upon subscription. Once
the client is granted access to the system, its functionalities can be used through a network or the
Internet.
104. SaaS avoids hardware investments, which in turn drastically reduces the initial
investment cost. The scheme also avoids maintenance costs, as the vendor is responsible for
maintaining the system. There is also no need to hire dedicated support staff at the organization’s
site as most support-related activities are handled by the vendor. System updates and upgrades
can be automatically performed at the vendor’s site.
105. However, some organizations might not feel secure with their critical data being stored
on the server of an external company. The vendor should be trusted and there must be binding
agreements to cover the entire business. The cost and quality of network connections are also
issues to be considered if an organization opts to implement SaaS.
5.2.
Important steps in selecting the right SDMS
106. As in any project, a well-planned and researched approach must be adopted to ensure
success in SDMS selection. SDMS selection requires a significant investment of time and
resources, involvement of the entire organization, and a considerable amount of research,
planning and re-evaluation along the way.
107. The following are important steps in the SDMS selection/acquisition process. It should be
noted, however, that these steps may not exactly fit the requirements of every organization;
rather they can serve as guidelines for the SDMS selection process. Organizations may modify
the steps presented below depending on their culture, size and environmental settings.
5.2.1. Needs analysis
108. A needs analysis normally starts with a review of the current system being used to
manage statistical data in the organization. This will be followed by a process of identifying the
problems or shortcomings of the current system, leading to documentation of improvement
requirements.
109. Interviewing existing system (manual or automated) users, managers or other
stakeholders is a common method of conducting a needs analysis. The following are some of the
questions all stakeholders should be asked to collect data for the needs assessment:
(a)
What is the level of dependence on manual forms?
(b)
What is the level of support from the current SDMS supplier?
(c)
Does the current software support the organization’s mission statement? If not,
what improvements could be made?
(d)
Are there any areas of waste or possible inefficiencies in the current system that
need to be tackled urgently?
(e)
Do you feel you receive adequate reporting from the existing system? What
additional areas of reporting or type of information would you like to see with the new system?
(f)
Do you feel users spend a significant amount of time producing reports?
(g)
How easy is the user interface to use?
(h)
Do you feel the current system captures all the required data?
(i)
Do you feel the current system handles the growing user and data volume?
(j)
Do you feel your critical data is well secured in the current system?
110. The main objective of the needs analysis is to identify and document the gaps between
the current system and the needs of the organization to fulfil its mission.
5.2.2. Management support
111. Once the need is identified and justified, it must be presented to the management for
approval of the project plan and resources. A budget needs to be allocated and most importantly,
management commitment should be secured. A system development project is likely to fail
without management support.
112. Assigning a manager to lead the project is a crucial step. He or she will serve as a mentor
and sponsor for the project and will also be an invaluable resource in the event that the project
team struggles with difficult users and managers.
5.2.3. Requirements specification
113. Once the needs are identified and acquisition of the new SDMS is justified, a detailed list
of requirements must be prepared. The major requirements of an SDMS are discussed in Section 3
above. However, the system requirements presented in Section 3 are not prescriptive and may not be
relevant for every statistical organization. Rather, they are intended to serve as a springboard for
producing more detailed requirement specifications for a specific project based on the results of
the needs assessment.
114. It is good practice to focus on the key differentiating criteria of the system in
order to identify the most critical requirements. Doing so makes it possible to evaluate system
vendors quickly yet thoroughly.
115. It is also a common practice to prioritize required features as:
(a) “Must have” features;
(b) Desired features; and
(c) “Wish list” features.
116. The degree of fit of the SDMS to each required feature should be analysed and classified
as one of the following:
(a) System fully meets the requirement;
(b) System meets the requirement with customization;
(c) System meets the requirement with third-party add-on products; or
(d) System does not meet the requirement.
117. An extract of the requirements specification, with minor modifications, constitutes the
terms of reference (ToR), which can be used in the request for proposal (RFP) document.
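The priority classes in paragraph 115 and the degree-of-fit classes in paragraph 116 could be combined into a simple comparison score for each candidate system. The following is a minimal sketch under stated assumptions: the numeric weights, the function name `evaluate` and the sample requirements are hypothetical illustrations, since the handbook names the categories but does not assign numbers to them.

```python
from dataclasses import dataclass

# Hypothetical weights: the handbook names these categories but assigns no numbers.
PRIORITY_WEIGHT = {"must_have": 3, "desired": 2, "wish_list": 1}
FIT_SCORE = {
    "full": 1.0,           # system fully meets the requirement
    "customization": 0.6,  # meets the requirement with customization
    "add_on": 0.4,         # meets the requirement with third-party add-on products
    "none": 0.0,           # does not meet the requirement
}


@dataclass(frozen=True)
class Requirement:
    name: str
    priority: str  # one of the keys of PRIORITY_WEIGHT


def evaluate(requirements, fits):
    """Return (score, unmet_must_haves) for one candidate system.

    `fits` maps a requirement name to a key of FIT_SCORE. A requirement
    that is missing from `fits` is treated as not met.
    """
    total = sum(PRIORITY_WEIGHT[r.priority] for r in requirements)
    earned = 0.0
    unmet = []
    for r in requirements:
        fit = fits.get(r.name, "none")
        earned += PRIORITY_WEIGHT[r.priority] * FIT_SCORE[fit]
        if r.priority == "must_have" and fit == "none":
            unmet.append(r.name)
    score = earned / total if total else 0.0
    return round(score, 3), unmet


# Illustrative requirements only; a real list comes from the needs assessment.
requirements = [
    Requirement("data import/export", "must_have"),
    Requirement("periodic backup/restore", "must_have"),
    Requirement("SDMX dissemination", "desired"),
    Requirement("multi-language interface", "wish_list"),
]

system_a = {
    "data import/export": "full",
    "periodic backup/restore": "customization",
    "SDMX dissemination": "add_on",
    "multi-language interface": "none",
}

score_a, unmet_a = evaluate(requirements, system_a)
print(score_a, unmet_a)
```

Reporting unmet “must have” features separately from the score is a deliberate choice in this sketch: a high overall score should not hide the absence of a feature the organization cannot do without.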
5.2.4. RFP preparation
118. After initially reviewing the available SDMS vendors (see Section 4 above), it is
necessary to prepare the RFP, which is the best means of communicating the full project
requirements to the potential vendors.
119. The RFP may contain, but is not limited to, the following items:
(a) Cover letter summarizing the request for proposal;
(b) General information and scope of work:
    (i) Introduction;
    (ii) Overview and background of the organization;
    (iii) Objective of the project;
    (iv) Scope of the project;
    (v) Relationship to other systems;
    (vi) Project schedule and deadline for vendor response;
(c) ToR;
(d) Instructions to bidders:
    (i) Other binding information with regard to the bid;
    (ii) Evaluation criteria;
    (iii) Proposal response format;
(e) Vendor profile;
(f) Proposed statistical data management solution;
(g) Implementation services;
(h) Training services;
(i) Data migration services;
(j) Warranty period and annual maintenance;
(k) Cost breakdown;
(l) Available references; and
(m) General and specific conditions.
5.2.5. Software demonstration
120. Some SDMS vendors offer online demonstrations of their systems, and exploring such a demo
is a very helpful way to evaluate a system. Once the RFP has been sent out to potential vendors,
it is good practice to invite them for on-site demos. To prevent vendors from simply delivering a
sales presentation, they should be requested to prepare structured demo scripts based on the
requirements of the organization.
121. The following are some of the questions which should be raised during on-site demo
sessions:
(a) Did the demonstration follow the format or demo script provided?
(b) Did the representative review all of the “must-have” items?
(c) Did the system appear easy to use?
(d) What is the level of confidence in the system’s capacity to fulfil the majority of
the requirements?
(e) Is the system a significant improvement on what is currently used by the
organization?
5.2.6. System selection and contract negotiation
122. The processes outlined in Sections 3 and 4 of this document, the vendor’s response to the
RFP, and evaluation of both online and on-site demos facilitate selection of the right statistical
data management system and vendor.
123. As most system/software contracts are written by the software vendor, it is important to
negotiate the contract to protect the organization’s interests and save effort, time and cost that
might be incurred during and after system implementation.
124. Implementation issues such as project management, scheduling, staffing, data migration,
and training should be well articulated and thought out at the commencement of the project.
VI. CONCLUSION
125. This handbook has been prepared with the intention of guiding African statistical
organizations in their SDMS selection process. Accordingly, a list of the core features of a
generic statistical data management system is outlined. Major statistical data management
systems for macrodata processing which are currently deployed in statistical offices of member
States are also presented. This is followed by system selection guidelines and tips which should
be given utmost consideration when selecting a statistical data management platform.
126. The list of features is not ordered according to level of importance, because the
importance of features varies from one organization to another depending on the scope, nature
and overall environment of the statistical information infrastructure. The features discussed are
not prescriptive; rather, their level of importance is determined by the needs of the
organization looking for a specific SDMS. A thorough needs assessment exercise is therefore a
critical success factor in selecting the right statistical data management system.
127. Technical features are clearly key selection criteria. In most cases, they are
well documented and visible, which simplifies SDMS selection on the basis of features.
However, the hidden, or so-called soft, factors are just as crucial as the technical features, or
even more important, and must be taken into consideration when selecting statistical data
management systems. These hidden factors are difficult to measure but have a great impact on
the success of implementation. As direct measurement is not always possible, some research may
be required to gauge the impact of such factors.
128. The steps to be followed in selecting a statistical data management system are also
discussed in this handbook. Formal system acquisition procedures help to avoid unnecessary waste
of time, money and other resources. A critical part of this exercise is to secure management
willingness and approval. A statistical data management system deployment exercise without the
support of high-level management is very likely to fail.
VII. RECOMMENDATIONS
129. As mentioned in the section on “Mode of Operation”, considerable effort was made to gather
as much information as possible to document the characteristic features of the major statistical
data management platforms described. However, in the case of some platforms, notably Live
Database and LABORSTA, it was difficult to obtain all the required information. It is strongly
recommended that further investigation be carried out into other possibilities, such as acquisition
and configuration of these platforms on a local server, to fully understand and document their
features.
130. Secondly, in most African countries there are a number of government departments,
semi-government organizations, private institutions and non-governmental organizations
providing various statistics. For instance, in addition to the national statistical office, which is the
main official statistical data provider (in most cases), there are dozens of other government
offices including the national bank and ministries of finance, trade, tourism, agriculture, health,
and education. Most sector associations also have data that are available to the public. Private
institutions and non-governmental organizations compile statistical data on a daily basis. Just as
different parts of a country’s economic and socio-demographic entities are interconnected, data
released by different institutions are also interrelated and need to be consistent with each other.
Strategic deployment of statistical data management platforms plays a significant role in
promoting the consistency and seamless integration of data. It is highly recommended,
therefore, that a similar initiative be commissioned to document the current status of
national statistical systems and to investigate and suggest a way forward towards a robust and
integrated national statistical data management architecture.
ANNEX
1. Questionnaire to NSOs
Questionnaire – Handbook Development for the Selection of Statistical Data Management Platform
1. Your organization:
Name:
2. Please list what you consider as the essential functional features of a statistical data
management system.
2.1.
3. What are the data management platforms or systems that you have worked on or
know for statistical data management? Please complete one system features sheet
(make copies as necessary) for each system/platform you have used or know about.
Thank you.
System Details Number:
(please complete one sheet for each system)
4. Name of data management system:
5. Is this system currently used in your office?
Yes / No
6. Please list the major features/functions of this data management system:
3.1.
7. Support for periodical data backup/restore?
Fully supported / Partially supported / Not supported
8. Support for data import/export?
Fully supported / Partially supported / Not supported
9. How do you evaluate the user interface?
Attractive and simple / Attractive but complex / Unattractive but simple / Unattractive and complex
10. Please list the dissemination media supported by this data management system (e.g.
www, CD, etc.):
11. Please list the international data dissemination formats supported by this data
management system (e.g. SDMX):
11.1.
12. Vendor:
13. How would you rate the training provided by the system’s vendor?
Very good / Satisfactory / Not adequate / None
14. How would you rate other support given by the system’s vendor?
Very good / Satisfactory / Not adequate / None
15. How would you rate the security of the system (intruder access control)?
Strong security feature / Moderate security feature / Weak security feature / I don’t know
16. Sectoral support:
Multi-sectoral / Single sector
17. If the system is multi-sectoral, please list the statistical sectors supported:
17.1.
2. Questionnaire to experts
Questionnaire – Handbook Development for the Selection of Statistical Data Management Platform
1. Your organization:
Name:
2. Please list what you consider as the essential functional features of a statistical data
management system.
3. What do you think are the critical functional requirements of a statistical data
management system?
What are the data management platforms or systems that you have worked on or know for
statistical data management? Please complete one system features sheet (make copies as
necessary) for each system/platform you have used or know about. Thank you.
4. Name of data management system:
5. Is this system currently used in your office?
Yes / No
6. Please list the major features/functions of this data management system:
7. Support for periodical data backup/restore?
Fully supported / Partially supported / Not supported
8. Support for data import/export?
Fully supported / Partially supported / Not supported
9. How do you evaluate the user interface?
Attractive and simple / Attractive but complex / Unattractive but simple / Unattractive and complex
10. Please list the dissemination media supported by this data management system (e.g.
www, CD, etc.):
11. Please list the international data dissemination formats supported by this data
management system (e.g. SDMX):
12. Vendor:
13. How would you rate the training provided by the system’s vendor?
Very good / Satisfactory / Not adequate / None
14. How would you rate other support given by the system’s vendor?
Very good / Satisfactory / Not adequate / None
15. How would you rate the security of the system (intruder access control)?
Strong security feature / Moderate security feature / Weak security feature / I don’t know
16. Sectoral support:
Multi-sectoral / Single sector
17. If the system is multi-sectoral, please list the statistical sectors supported:
3. Questionnaire to vendors
Questionnaire – Statistical Data Management Platform
1. Name of statistical data management system:
2. Vendor:
3. Availability of data entry function?
Yes / No
4. Batch data entry function (e.g. through Excel data sheet)?
Yes / No
5. Ability to attach documents/other resources?
Yes / No
6. Ability to delete records?
Yes / No
7. Ability to view/undelete deleted entries?
Yes / No
8. Availability of user manual?
Yes / No
9. Availability of online/context help?
Yes / No
10. Facility to create report template?
Available / Not available
11. Facility to generate standard (predefined) reports?
Available / Not available
12. Facility to generate ad hoc (on the fly) reports?
Available / Not available
13. Facility to publish reports to web pages?
Supported / Not supported
14. List major metadata items that can be defined in the database (e.g. unit of measure,
indicators, scales, etc.)
15. Please list all other major features/functions of this data management system:
16. Support for periodical data backup/restore?
Fully supported / Partially supported / Not supported
17. Support for data import/export?
Fully supported / Partially supported / Not supported
18. How do you evaluate the user interface?
Attractive and simple / Attractive but complex / Unattractive but simple / Unattractive and complex
19. List the dissemination media supported by this data management system (e.g. www,
CD, etc.):
20. List the international data dissemination formats supported by this data management
system (e.g. SDMX):
21. Language support:
Single language / Dual language / Multi-language
Languages supported:
22. How would you rate the security of the system (intruder access control)?
Strong security feature / Moderate security feature / Weak security feature / I don’t know
23. Sectoral support:
Multi-sectoral / Single sector
24. If the system is multi-sectoral, please list the statistical sectors supported:
25. User responsibility management
26. Database management engine used to store data (Oracle, MS SQL, MySQL, etc.)
27. Operating system the database is running on
28. The database system is:
Desktop application / Server based – accessible through network