
Census Data Archiving
Experience of the Central Statistical
Agency (CSA) of Ethiopia
Presented at the United Nations
Regional Seminar on Census Data
Archiving for Africa
Addis Ababa, Ethiopia
20-23 September 2011
OUTLINE
1. Background Information;
2. Census Data Maintenance;
3. Census Data Archiving;
4. Type of Census Data Storage Device;
5. Data Storage Methodology;
6. Procedures for Safeguarding the Security of
the Census Data;
7. Challenges
1.BACKGROUND INFORMATION
CENSUS UNDERTAKING EXPERIENCE IN ETHIOPIA
 In Ethiopia, only three national Population and Housing
Censuses (PHCs) have been conducted;
 The first-ever Population and Housing Census was conducted
in May 1984 (39.9 million, excluding Eritrea);
 The second Population and Housing Census was conducted in
1994 (53.5 million);
 The third and latest PHC was conducted in May and November
2007 (73.9 million).
 Those censuses yielded millions of paper copies of filled-in
questionnaires containing data for each member of the household,
and millions of individual records created in soft-copy format.
 Archiving of the hard and soft copies of the census documents
began in 1984.
 The filled-in questionnaires of the previous two censuses were
archived until the beginning of enumeration of the
subsequent census.
 Because the documents collected from the field were so
voluminous that they required a large storage space, it was
necessary to dispose of the preceding census questionnaires under
secure conditions before conducting the next census, in order to
free adequate space for the later one.
 The questionnaires of the latest PHC (census 2007) are being
archived in two ways: as hard copies and as images captured
during scanning.
 The hard copies of the filled-in questionnaires are stored in
the warehouse, whereas the images of the questionnaires are
stored on the servers.
CENSUS PROCLAMATIONS
The country has had three Census Proclamations, one
independent proclamation for each PHC.
For the 1984 and 1994 PHCs, the Proclamations were
enacted before the commencement of the preparatory
activities of each respective census and were temporary
in nature, whereas that of the third is permanent and
was established according to the Constitution of the
country.
Each Census Proclamation mainly
focuses on the conduct of preparatory activities, field
enumeration and approval of the results of the census
count. It also stipulates the establishment of the Census
Commission, the highest body responsible for guiding,
coordinating and overseeing the overall census operations,
and determines the composition of its members.

The Census Proclamations define the duties and
responsibilities of each entity involved in the
operations, such as the Census Commission and
the Central Statistical Agency, and also determine
the obligations of the residents of the country to
provide correct information, the confidentiality of
individual data, etc.
 However, they do not explicitly address the
archiving of the documents.
2.CENSUS DATA MAINTENANCE
DATA BACKUP POLICY
 Backup is the key to recovering files in case of a disaster
or loss due to other causes.
 CSA has a backup policy, which is embedded in its ICT
Policy. Here are some guidelines to keep in mind:
 IT will replace/reinstall lost or damaged system files and
standard applications on users' hard drives.
 The user should keep the original operating system or
application media along with the licensing information.
 IT performs centralized backups for systems residing on
CSA's servers. Departments and services using
these systems will have their data and files backed up
routinely. ICT administers and maintains these backups.
 Users are responsible for backing up the files
stored on the computers issued to them by the Agency.
 The user should keep all documents in a single documents folder
for easy backup (best practice).
 The user should back up the entire documents folder to
removable media at least once a week, or daily if documents
are frequently created or changed.
 The user should maintain at least two backup sets,
alternating their use. Thus, if one backup goes bad, the
other is still available.
 Users must store their backup media in a safe place.
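The two-set rotation described above can be sketched in Python. This is only an illustration, not CSA's actual tooling: the set names `backup-set-A` and `backup-set-B`, the `last_set.txt` state file, and the function names are all assumptions made for the example.

```python
import shutil
from pathlib import Path

# Hypothetical names for the two alternating backup sets.
SETS = ("backup-set-A", "backup-set-B")


def next_set(media_root: Path) -> str:
    """Return the set that was NOT used last time (set A on the first run)."""
    state = media_root / "last_set.txt"
    last = state.read_text().strip() if state.exists() else SETS[1]
    return SETS[0] if last == SETS[1] else SETS[1]


def backup_documents(documents: Path, media_root: Path) -> Path:
    """Copy the whole documents folder into the set whose turn it is."""
    name = next_set(media_root)
    target = media_root / name
    if target.exists():
        shutil.rmtree(target)  # only the older set is overwritten
    shutil.copytree(documents, target)
    (media_root / "last_set.txt").write_text(name)  # remember for next run
    return target
```

Calling `backup_documents(Path("Documents"), Path("E:/"))` once a week (both paths are placeholders) overwrites only the older set, so the previous backup stays intact if the new copy turns out to be bad.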
Method of Securing Data Backup
 Use high-quality backup media.
 Restrict access to backup media. Keep backups in a locked area,
because they may contain a large amount of confidential data in a
form that easily fits into an attacker's pocket or briefcase.
 Sort and label backup media appropriately.
 Verify the integrity of the backup media by performing a test
restore of the data.
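A test restore can be checked mechanically by comparing every original file with its restored copy. The sketch below assumes the backup has already been restored into a scratch folder. The function names and the choice of SHA-256 are illustrative assumptions, not part of CSA's documented procedure.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks so large files fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(original: Path, restored: Path) -> list[str]:
    """Return relative paths that are missing or differ after a test restore."""
    problems = []
    for src in sorted(p for p in original.rglob("*") if p.is_file()):
        rel = src.relative_to(original)
        dst = restored / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every file in the original folder was found in the restored folder with identical content.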

B. Backup Type
 CSA uses three backup types (full, incremental and differential),
depending on the data.
 Full is a complete backup of all data in a given folder, volume, or
drive.
 Incremental backs up only files that have changed in a given period,
based on the date/time stamp of the file.
 Differential backs up only files that have changed, based on file size
and CRC (cyclic redundancy check).
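The two change-detection criteria named above (date/time stamp versus file size plus CRC) can be sketched as follows. This is a minimal illustration under assumptions: the function names and the in-memory `catalogue` dictionary are hypothetical, and real backup software keeps its own catalogue format.

```python
import zlib
from pathlib import Path


def crc32_of(path: Path) -> int:
    """CRC-32 of a file, computed incrementally in 1 MiB chunks."""
    crc = 0
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            crc = zlib.crc32(chunk, crc)
    return crc


def build_catalogue(root: Path) -> dict[str, tuple[int, int]]:
    """Record (size, CRC-32) for every file, e.g. at the time of a full backup."""
    return {
        p.relative_to(root).as_posix(): (p.stat().st_size, crc32_of(p))
        for p in root.rglob("*") if p.is_file()
    }


def changed_by_timestamp(root: Path, since: float) -> list[Path]:
    """Files modified after `since` (seconds since the epoch)."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_mtime > since
    )


def changed_by_content(root: Path, catalogue: dict[str, tuple[int, int]]) -> list[Path]:
    """Files that are new or whose (size, CRC-32) differs from the catalogue."""
    changed = []
    for p in sorted(q for q in root.rglob("*") if q.is_file()):
        signature = (p.stat().st_size, crc32_of(p))
        if catalogue.get(p.relative_to(root).as_posix()) != signature:
            changed.append(p)
    return changed
```

The timestamp check is cheap because it reads only file metadata, while the size-plus-CRC check reads every file but also catches edits that leave the timestamp or the size unchanged.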
Data Backup Procedure and Technologies
CSA has two data backup procedures:
1. D to D (Disk-to-Disk) backup, which copies data
from the working server to a storage server; and
2. Backup from a computer (server) to other storage media
devices, such as:
 CD or DVD for application software
 Plug-in external tape drive (sizes ranging from 50 to 500 GB) for data
 Networked (using iSCSI connection) HP 1/8 G2 Tape Autoloader for data
 Networked Dell PowerVault MD3000i for data storage
 In the near future (within three months), CSA plans to
establish a data backup infrastructure as part of
improving its ICT system infrastructure.
Data Maintenance
When data are missing, corrupted, infected by a
virus, or found to have an incorrect size, they are
usually recovered by restoring from the
latest backup using a disaster recovery system.
3.CENSUS DATA ARCHIVING
Directive for Census Data Archiving
 The CSA has issued a directive to every
Directorate responsible for conducting a sample
survey or census, requiring it to send the clean
data with complete documentation to the
Information Systems Technology Directorate,
which is responsible for archiving and
electronically disseminating the data,
including the metadata.
Procedures for Archiving Census Microdata
-Collect the micro-data from the directorate
(Population Statistics);
-Create an appropriate directory and file structure;
-Archive using the micro-data management toolkit of the
World Bank and store in the right folder;
-Disseminate metadata and report on the web.
4.Type of Census Data Storage Device
The Agency has three storage devices: one with a
capacity of 6 terabytes and the remaining two
with a capacity of 3 terabytes each.
Of these, the census data (images and micro-data)
are stored on the 6-terabyte server.
Moreover, the same data are also stored on
tapes of 400 GB on each side.
5.Data Storage Methodology
After the official dissemination of the census data, all
relevant documents, including metadata, collected from the
Population Statistics Directorate (department) are kept in
the data bank in the following structure:
C:\ETH-POP-YY\ where YY stands for the year in which the census/survey
was conducted
Under this folder there are sub-folders:
\DATA
\DOCS
\PROGRAMMS
\WORK
Under the Sub-Folder DATA
DATA\SPSS
DATA\ASCII
Under the Sub-Folder DOCS
DOCS\Report
DOCS\Questionnaires
DOCS\Technical
Under the Sub-Folder Program
Program\ All programs used for editing,
tabulation… (Plan) are kept here;
Under the Sub-Folder Work
Work\ Intermediate work done during
archiving is kept here.
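The folder layout described above could be created with a short script. The sketch below is illustrative only: it assumes YY is the last two digits of the census year, and it uses the sub-folder names exactly as they appear in the folder listing (including the spelling PROGRAMMS). The function name and the `bank_root` parameter are hypothetical.

```python
from pathlib import Path

# Sub-folders as listed in the slides; the values are second-level folders.
LAYOUT = {
    "DATA": ["SPSS", "ASCII"],
    "DOCS": ["Report", "Questionnaires", "Technical"],
    "PROGRAMMS": [],
    "WORK": [],
}


def create_archive_tree(bank_root: Path, year: int) -> Path:
    """Create ETH-POP-YY with its sub-folders under bank_root and return it."""
    top = bank_root / f"ETH-POP-{year % 100:02d}"  # assumes YY = last two digits
    for folder, children in LAYOUT.items():
        (top / folder).mkdir(parents=True, exist_ok=True)
        for child in children:
            (top / folder / child).mkdir(exist_ok=True)
    return top
```

For the 2007 census, `create_archive_tree(Path("C:/"), 2007)` would produce the `ETH-POP-07` tree. Running it again is harmless because existing folders are left in place.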
6. Procedures for Safeguarding the
Security of the Census Data
CSA ensures the security of the census data through the following measures:
I. All staff performing data processing are required to make a statement to
ensure the confidentiality of data;
II. All completed paper questionnaires will be processed and stored in an area
designated for processing census data only. Detailed records of document
movements are maintained;
III. All completed paper questionnaires will be destroyed after 10 years and
the commencement of the Census;
IV. The reference link between a record and the address of its unit of quarters
will be deleted when the data set is constructed for subsequent tabulations; and
V. All published tables will be scrutinized to ensure that no small values appear
in them for small geographical units from which personal particulars may be
derived through complicated deduction.
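The last measure, screening published tables for small values, can be illustrated with a minimal sketch of cell suppression. The threshold of 5, the `*` marker, the decision to leave zeros visible, and the function name are assumptions for illustration. The slides do not state CSA's actual disclosure rules.

```python
from typing import Dict, Union

Cell = Union[int, str]


def suppress_small_cells(
    table: Dict[str, Dict[str, int]],
    threshold: int = 5,   # hypothetical minimum publishable count
    marker: str = "*",    # hypothetical suppression symbol
) -> Dict[str, Dict[str, Cell]]:
    """Return a copy of the table with counts between 1 and threshold-1 masked."""
    return {
        area: {
            category: (marker if 0 < count < threshold else count)
            for category, count in row.items()
        }
        for area, row in table.items()
    }


# Made-up counts for two small geographical units.
example = {
    "Area 1": {"Male": 3, "Female": 12},
    "Area 2": {"Male": 0, "Female": 4},
}
published = suppress_small_cells(example)
```

Masking small cells alone may not be sufficient when row or column totals are published as well, since a masked value can sometimes be recovered by subtraction. Such tables usually need additional cells suppressed.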
7.Challenges
 Lack of an off-site backup in case of any damage to the Data
Storage Room(s);
 Use of tapes for backup, as tapes sometimes fail;
 Due to the limited number of tapes currently available, only
two copies are taken.
THANK YOU