Project Work

Module 4 Session 4 plus
Module 4: Session 4 onwards
Project Work using UNHS2 data
There are four tasks listed on the following pages. You select one of these to form the work
of your project. The aim of the project work is for you to acquire skills in data analysis with
clear objectives in mind, while at the same time consolidating your knowledge of Stata. You
carry out work on this project in several of the sessions that follow, but should aim to
complete your data analysis by the end of day 5 of this module and complete a report on your
data analysis by the end of session 11 on day 6. Resource persons will assist you if you have
difficulties in producing specific parts of the analysis using Stata.
The overall objectives of the four tasks (from which you choose one) are as follows:
Task 1: Investigate health conditions of children (defined as those less than 18 years of age) in
sampled households.
Task 2: Investigate health conditions of adults (defined as those who are 18 years of age or
above) in sampled households.
Task 3: Investigate the educational level of children (defined as those less than 18 years and
more than 4 years of age) in sampled households.
Task 4: Investigate the educational level of adults (defined as those aged 18 years and more) in
sampled households.
The data comes from sections 2, 3A and 4 of the UNHS 2002/2003 Socio-Economic
questionnaire. This data set includes the personal characteristics
of household members, their health and education, and is available in the Stata file called
SocioSects2to4.dta. Each task involves using a subset of this data. Copies of the relevant
pages of the questionnaire are on the final pages of this handout and/or in an Excel file
associated with Session 4.
If time permits, you can also look at some poverty related variables. This data is in file
pov0203.dta and is available at the household level.
The steps to follow are given below. Aim to complete up to Step (vii) in this session. Do
NOT attempt to start the major analyses related to the study objectives in this session. You do
this in Session 6, after a resource person has checked your analysis plan in Step (vii). Until
that time, we recommend that you browse your data and carry out exploratory data analysis
procedures that help you to understand the data variables and data structure.
Districts Training Programme
(i) First decide which of the objectives interests you most, and select the corresponding task.
If you are working alongside a friend or colleague, it is highly desirable that you both agree on
whether to work on Tasks 1 and 2 (health) or Tasks 3 and 4 (education). This will enable
you to discuss and compare your results as you go along. However, your final report should
be done by you, rather than jointly.
(ii) Next, read the specific questions to be answered in relation to your task as given on one of
the following pages. If you are not clear about any of the questions, ask a resource person.
(iii) Read the steps from (iv) onwards below, so you get a quick idea of good practice
procedures to be followed when undertaking a data analysis.
(iv) Now open your data file SocioSects2to4.dta in Stata. If Stata tells you that you don’t
have enough memory, type the command set memory 32m and try again. Browse your data
and check you can identify the variables related to your task. Note these down. Remember
that for any analysis, background variables, i.e. variables from hh to rurban, are likely to be
needed at some stage, so note these too as part of the set of variables you will need for
analysis. Ensure you understand the meaning of the variables you need for your task.
(v) You will notice that your task involves using a subset of the data. Don’t create this subset
until step (vi) below. For now, produce some summary statistics, using Stata facilities you
have learnt, that will enable you to identify the number of records that your data subset will
have. This is standard good practice. Note down this number. Also produce some simple
data summaries which will tell you about the number of records your subset data will have
across (a) regions, (b) urban and rural areas, and (c) sex of household member. Note down
these numbers (this is to help with the checking process in step (vi) below).
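The counts described in step (v) can be sketched as follows. This is only an illustration on a few made-up records so that it runs on its own. The variable names age, region and sex are assumptions (check the actual names in SocioSects2to4.dta), rurban is the name used in this handout but its coding here is invented, and the condition age < 18 corresponds to Task 1.

```stata
* Toy records standing in for SocioSects2to4.dta (all values are made up)
clear
input hh age region rurban sex
1 34 1 1 1
1 12 1 1 2
1  7 1 1 1
2 45 2 0 2
2 16 2 0 1
3 29 3 1 1
3  3 3 1 2
4 61 4 0 2
end

* Number of records the subset will contain, without dropping anything
count if age < 18
display "Records in subset: " r(N)

* Breakdown of the same records by region, urban/rural and sex
tabulate region if age < 18
tabulate rurban if age < 18
tabulate sex    if age < 18
```

Stata treats a missing numeric value as larger than any number, so for the adult tasks a condition such as age >= 18 would also pick up records with a missing age. Writing age >= 18 & !missing(age) avoids this.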
(vi) Now it is time to select the data subset you need for your task. BUT before you do so,
save the data set SocioSects2to4.dta under a different name and check you know the
directory on your computer from where you can recover this data set again. Now select the
data subset you need, for example with the keep or drop command together with an if
qualifier. Using a few Stata commands, e.g. repeating the analysis undertaken in step
(v) above, verify whether you have succeeded in selecting the data subset you need. If you
are happy with the result, then save your data.
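One possible way to carry out step (vi) is sketched below, again on made-up records. The file names SocioSects2to4_copy.dta and task1_children.dta and the variable names age and region are hypothetical, and the subset shown is the one for Task 1.

```stata
* Toy records standing in for SocioSects2to4.dta (all values are made up)
clear
input hh age region
1 34 1
1 12 1
2 45 2
2 16 2
3  3 3
end

* Show the working directory, then keep an untouched copy (hypothetical name)
pwd
save SocioSects2to4_copy.dta, replace

* Record the expected size of the subset, then keep only children (Task 1)
count if age < 18
local expected = r(N)
keep if age < 18

* Verify: the subset must contain exactly the number of records counted above
count
assert r(N) == `expected'
tabulate region

* Save the subset under its own (hypothetical) name
save task1_children.dta, replace
```

The assert command stops with an error if the condition is false, which makes it a convenient check that the selection did what was intended. The local macro only survives within a single run, so these lines are best executed together from a do-file.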
(vii) Conduct some simple exploratory data analysis to understand your data subset better.
You may have already done this in step (vi) above. Note down your results since this analysis
can also form part of your project report.
(viii) Now consider how you might address the first three specific objectives related to your
task. However, before starting the analysis using Stata, consider the type of tables and
graphs that will help in achieving the objectives. Draw up, on sheets of paper, “dummy”
tables and graphs to show the type of tables and graphs that would be appropriate.
For example, with respect to (say) two-way tables, you will want to specify what variable
makes up the rows of your table, what variable makes up the columns of your table, what the
table cells will contain, and a suitable title for your table.
With respect to graphs, include a title, and consider whether you want a pie chart, bar chart,
multiple bar chart, box plot, or other type of graph. Give as much information as possible
concerning this graph so that you are clear about its form at the time of analysis. Once you
have completed this task, show your results to a resource person to check that the tables and
graphs you have suggested are appropriate for the objectives at hand.
(ix) Once the resource person is satisfied with your suggestions, you may start the data
analysis using Stata. Most of this work will be done in Session 6. Note down the procedures
you followed while doing your analysis or, better, set up an appropriate do-file
so that you can rerun the same analysis at a later stage.
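A do-file for step (ix) might be organised along the following lines. This is a minimal sketch: the file names project.do and project.log and the 0/1 variable ill are hypothetical, and the toy data block only stands in for the real data file.

```stata
* project.do : a hypothetical do-file for the project analysis
capture log close
log using project.log, replace text

* In the project, replace this toy block with: use SocioSects2to4.dta, clear
clear
input hh age ill
1 12 1
1  7 0
2 16 0
3  3 1
end

* Question 1: percentage ill or injured (ill is a hypothetical 0/1 variable)
tabulate ill

log close
```

Running the whole file with the command do project reproduces every result and writes the output to the log file, which can then be copied into the report.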
(x) After each piece of analysis, corresponding to each question of your task taken in turn,
write up the component of your report related to that analysis before you proceed to the next
question in your task. This way, you will not run out of time at the end, and will have the
opportunity to revise your report to a suitable length at a later stage. Aim to complete
questions 1, 2, 3 of your task by the end of Session 6.
(xi) After a short presentation in Session 10, which gives you an appreciation of sampling
weights, you will have time to learn about merging of data files. You will then have time to
continue with your project work and begin addressing objectives 4 and 5 of your task.
(xii) You should aim to complete your project report by the end of Session 11 and hand your
report to a resource person.
Task 1
Overall objective
To investigate health conditions of children (defined as those less than 18 years of age) in
sampled households.
Specific objectives (questions):
1. What percentage of children suffered a sickness or injury during the 30 days prior to survey
work? Provide an appropriate estimate at this stage, but after Session 7, you should aim to
return to this question and provide a standard error and a confidence interval for your
estimate.
2. (i) What specific sicknesses/injuries were experienced by children in the household in the
period of interest? Use a suitable graph to summarise these results.
(ii) There is specific interest in investigating the occurrence of malaria and respiratory
diseases. Produce a table to explore how the proportion of those with each of these diseases
varies across the 4 regions.
3. What percentage took treatment for malaria and respiratory diseases? Of those who suffered
malaria, what percentage did not go outside the home for treatment of their sickness or injury?
What were the reasons for not consulting outside?
The remainder of this task will be restricted to household level variables and information
concerning one child in the household. For convenience, the oldest child will be selected for this
purpose. Create this data subset, save the data under a new name, and then proceed.
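Selecting the oldest child in each household can be sketched as below, on made-up records. The household identifier hh is named in this handout, while the person identifier pid and the variable age are assumed names.

```stata
* Toy records (all values are made up); pid is a hypothetical person identifier
clear
input hh pid age
1 1 34
1 2 12
1 3  7
2 1 16
2 2 16
3 1  3
end

* Restrict to children first
keep if age < 18

* Within each household, sort by age and keep the last (oldest) record.
* Including pid in the sort order makes the choice reproducible when ages tie.
bysort hh (age pid): keep if _n == _N

* Check that hh now identifies each record uniquely, then inspect the result
isid hh
list
```

The isid command stops with an error if any household still has more than one record, so it doubles as a check before the merge in question 5.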
4. Is there evidence of an association between the use of a mosquito net and the occurrence of
malaria? Explore whether this association exists for every region, or for every district within
one region of your choice.
5. Merge your data file with the household level file called pov0203.dta and investigate whether
the mean consumption expenditure per adult equivalent (variable welfare), converted to logs
if appropriate, differs significantly between those who (a) suffered and (b) did not suffer an
illness or injury in the 30 days prior to survey work. Do this for the whole of your data
subset and then repeat the analysis for a district of your choice.
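The merge and comparison in question 5 could look roughly like this. The sketch assumes Stata 11 or later (older versions use a different merge syntax), uses made-up values in place of pov0203.dta, and treats ill as a hypothetical 0/1 variable. The names hh and welfare are taken from this handout. It ignores sampling weights, which are introduced in Session 10.

```stata
* Toy household-level file standing in for pov0203.dta (all values are made up)
clear
input hh welfare
1 42000
2 18500
3 27000
4 65000
5 31000
6 22000
end
tempfile pov
save `pov'

* Toy one-child-per-household file (ill = 1 if ill or injured in last 30 days)
clear
input hh ill
1 0
2 1
3 1
4 0
5 0
6 1
end

* Both files have one record per household, so this is a 1:1 merge on hh
merge 1:1 hh using `pov'
tabulate _merge
keep if _merge == 3
drop _merge

* Compare mean log welfare between the two groups
generate lwelfare = ln(welfare)
ttest lwelfare, by(ill)
```

Tabulating _merge before dropping it shows how many households were matched and how many appear in only one of the two files. The temporary file only exists for the duration of one run, so these lines need to be executed together from a do-file.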
Task 2
Overall objective
To investigate health conditions of adults (defined as those who are 18 years of age or above) in
sampled households.
Specific objectives (questions):
1. What percentage of adults suffered a sickness or injury during the 30 days prior to survey
work? Provide an appropriate estimate at this stage, but after Session 7, you should aim to
return to this question and provide a standard error and a confidence interval for your
estimate.
2. (i) What specific sicknesses/injuries were experienced by adults in the household in the
period of interest? Use a suitable graph to summarise these results.
(ii) There is specific interest in investigating the occurrence of malaria and respiratory
diseases. Produce a table to explore how the proportion of those with each of these
diseases varies across the 4 regions.
3. What percentage took treatment for malaria and respiratory diseases? Of those who
suffered malaria, what percentage did not go outside the home for treatment of their sickness or
injury? What were the reasons for not consulting outside?
The remainder of this task will be restricted to household level variables and information
concerning the head of the household. Create this data subset, save the data under a new name, and
then proceed.
4. For heads of household, is there evidence of an association between the use of a mosquito
net and the occurrence of malaria? Explore whether this association exists for every region,
or for every district within one region of your choice.
5. Merge your data file with the household level file called pov0203.dta and investigate whether
the mean consumption expenditure per adult equivalent (variable welfare), converted to logs
if appropriate, differs significantly between household heads who (a) suffered and (b) did not
suffer an illness or injury in the 30 days prior to survey work. Do this for the whole of your
data subset and then repeat the analysis for a district of your choice.
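For question 4 of this task, one way to examine the association is a two-way table with a chi-squared test, first overall and then within each region. The sketch below uses invented frequencies, and net, malaria and region are assumed 0/1 and numeric-code variable names. The test shown takes no account of the survey design.

```stata
* Made-up frequencies: one line per combination of region, net use and malaria
clear
input region net malaria freq
1 0 0 20
1 0 1 30
1 1 0 35
1 1 1 15
2 0 0 25
2 0 1 25
2 1 0 30
2 1 1 20
end

* Expand the frequency table into one record per household head
expand freq
drop freq

* Overall association, with row percentages and a chi-squared test
tabulate net malaria, row chi2

* The same table separately for each region
bysort region: tabulate net malaria, row chi2
```

Row percentages make it easy to compare the proportion with malaria between net users and non-users, which is the comparison the question asks about.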
Task 3
Overall objective
To investigate the educational level of children (defined as those less than 18 years and more than
4 years of age) in sampled households.
Specific objectives (questions):
1. What percentage of children have never attended school? Provide an appropriate estimate
at this stage, but after Session 7, you should aim to return to this question and provide a
standard error and a confidence interval for your estimate.
2. How does the percentage in question 1 vary (a) between rural and urban areas, (b)
across the different regions, and (c) across the age groups <10 years, ≥10 and <14
years, and ≥14 years? Consider showing your results with suitable graphical presentations.
3. Of those children who have never attended school, what are the key reasons for not
attending? Consider a suitable table that may be produced to summarise your answer.
The remainder of this task will be restricted to household level variables and information
concerning one child in the household. For convenience, the oldest child will be selected for this
purpose. Create this data subset, save the data under a new name, and then proceed.
4. Classify the distance that children currently attending school have to travel to school as
“Near”, i.e. less than 1 km away, “Some distance”, i.e. 1-2 km, and “Far”, i.e. more than 2
km. Is there any evidence that the distance to the school is associated with the
type of school (boarding, day or day/boarding) they are attending? Explore whether this
association exists or not for every district within a region of your choice.
5. Merge your data file with the household level file called pov0203.dta and investigate whether
the mean consumption expenditure per adult equivalent (variable welfare), converted to logs
if appropriate, differs significantly across (a) those who have never had schooling and (b)
those who are currently attending school. Do this for the whole of your data subset and
then repeat the analysis for a district of your choice.
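For question 2(c) of this task, the age groups can be built with recode before tabulating. This is a sketch on made-up records: age and the 0/1 variable never (1 = never attended school) are assumed names, and ages are assumed to be recorded in whole years.

```stata
* Toy records for children aged 5 to 17 (all values are made up)
clear
input age never
5 1
7 0
9 1
10 0
13 0
14 0
17 1
end

* Group ages into <10, 10 to 13 and 14 to 17 years, with value labels
recode age (5/9 = 1 "<10") (10/13 = 2 "10-13") (14/17 = 3 "14-17"), generate(agegrp)

* Percentage who never attended school within each age group
tabulate agegrp never, row
```

Creating a new variable with generate() leaves the original age variable untouched, so the grouping can be changed later without reloading the data.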
Task 4
Overall objective
To investigate the educational level of adults (defined as those aged 18 years and more) in
sampled households.
Specific objectives (questions):
1. What percentage of adults currently attend school? Provide an appropriate estimate at this
stage, but after Session 7, you should aim to return to this question and provide a standard
error and a confidence interval for your estimate.
2. How does the percentage in question 1 vary (a) between rural and urban areas, (b)
across the different regions, and (c) across the age groups <25 years, ≥25 and <40
years, and ≥40 years? Consider showing your results with suitable graphical presentations.
3. Of those adults who have never attended school, what are the key reasons for not attending?
Consider a suitable table that may be produced to summarise your answer.
The remainder of this task will be restricted to household level variables and information
concerning the head of the household. Create this data subset, save the data under a new name, and
then proceed.
4. Is there any evidence that the literacy level of household heads is associated with whether or
not they have ever attended a literacy program? Explore whether this association exists or
not for every district within a region of your choice.
5. Merge your data file with the household level file called pov0203.dta and investigate whether
the mean consumption expenditure per adult equivalent (variable welfare), converted to logs
if appropriate, differs significantly between (a) those who have never had schooling or have
only completed primary level schooling and (b) those whose educational level is higher than
primary level. Do this for the whole of your data subset and then repeat the analysis for a
district of your choice.
UGANDA BUREAU OF STATISTICS
THE REPUBLIC OF UGANDA
UGANDA NATIONAL HOUSEHOLD SURVEY 2002/2003
SOCIO-ECONOMIC SURVEY QUESTIONNAIRE
SECTION 1A: IDENTIFICATION PARTICULARS
1. STRATUM:
2. COUNTY:
3. SUB-COUNTY
4. PARISH:
5. EA/ LC1:
6. HOUSEHOLD SR. NO.:
7. SAMPLE NO.:
8. HOUSEHOLD CODE:
9. NAME OF HEAD:
THIS SURVEY IS BEING CONDUCTED BY THE UGANDA BUREAU OF STATISTICS
OF THE MINISTRY OF FINANCE, PLANNING AND ECONOMIC DEVELOPMENT
UNDER THE AUTHORITY OF THE UGANDA BUREAU OF STATISTICS ACT, 1998.
THE UGANDA BUREAU OF STATISTICS
P.O. BOX 13,
ENTEBBE,
TEL: 041 – 320741, 322099, 322100, 322101, 075 -720745, 077 - 705127
Fax: 320147
E-mail: unhs@infocom.co.ug, ubos@infocom.co.ug
Website: www.ubos.org