MICS Data Processing Workshop

Multiple Indicator Cluster Surveys
Data Processing Workshop
Sample Weights
What are sample weights?
• Sample weight: a statistical correction factor
used to correct for imperfections in the
sample that might lead to bias:
– Unequal probabilities of selection
– Non-response
• A sample in which every unit has the same (constant)
sampling weight is called a self-weighting sample
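The two corrections listed above can be sketched in a few lines of Python. This is only an illustration under a common survey-sampling convention, not the MICS template itself: the weight is taken as the inverse of the selection probability, inflated by the inverse of the response rate. The function name and the 80 percent response rate are assumptions made for the example.

```python
def sample_weight(selection_probability, response_rate):
    """Inverse selection probability, adjusted for non-response.

    Hypothetical helper for illustration; both inputs must lie in (0, 1].
    """
    if not 0 < selection_probability <= 1:
        raise ValueError("selection_probability must be in (0, 1]")
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    design_weight = 1.0 / selection_probability  # corrects unequal selection
    return design_weight / response_rate         # corrects non-response


# 500 of 10,000 households selected; assumed 80 percent of them responded.
weight = sample_weight(500 / 10_000, 0.8)
print(round(weight, 2))
```

With full response (a response rate of 1.0) the function returns the inverse of the selection probability unchanged.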
Self-weighting sample
• The sampling weight can be constant at different levels:
– Stratum level (e.g., urban and rural within a
region)
– National level: an overall self-weighting sample
(almost nonexistent in household surveys)
Self-weighting sample
• Advantages
– Equally representative for every unit
– Reduced sampling errors
• Disadvantages
– Difficult for survey management (e.g., to
distribute the workload) because the sample
take varies by PSU
– Difficult to control the expected sample size
Self-weighting sample
• Disadvantages
– Self-weighting is not exact because the sample
takes are rounded, and this introduces bias into
the survey estimates
• In most MICS surveys, if not all, samples are
not self-weighting. Therefore, sample weights
must be used for reporting national estimates
Example - Sample Weights
• For example, the weights for North and West regions
(Popstan)
– North region: 10,000/500 = 20
– West region: 10,000/250 = 40
• In North region, each selected household represents 20
households in that region; in West, each represents 40
• Overall, every selected household in Popstan represents
26.6667 households (20,000/750)
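The arithmetic on this slide can be reproduced with a short Python sketch. The household counts come from the Popstan example; the variable names are chosen only for illustration.

```python
# Household counts from the Popstan example.
households_in_region = {"North": 10_000, "West": 10_000}
households_sampled = {"North": 500, "West": 250}

# Weight = households in the region / households sampled in the region.
design_weight = {
    region: households_in_region[region] / households_sampled[region]
    for region in households_in_region
}

# Number of households that one sampled household represents overall.
overall = sum(households_in_region.values()) / sum(households_sampled.values())

print(design_weight)
print(round(overall, 4))
```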
Example - Sample Weights
• In other words, relative to a proportional
selection (375 households would be selected
from each region), more households have
been selected from North and fewer have been
selected from West
• This has to be “compensated” by using sample
weights during analysis to re-calibrate the
sample to the national level
Example - Sample Weights
• In our example let’s assume that:
– 25 percent of households in North use improved
water sources
– 75 percent of households in West use improved
water sources
• If the sample was selected proportionally (375
households from each region), then our
survey estimate would be
– ((375 * 0.25) + (375 * 0.75)) / 750 = 0.50
Example - Sample Weights
• If we do not weight, then our national
estimate will be
– ((500 * 0.25) + (250 * 0.75)) / 750 = 0.417
– This is lower than 0.50 because we have
over-sampled a region (North region) where use of
improved water sources is lower
• We need to calculate sample weights to
“correct” this situation
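A small Python sketch makes the size of the bias visible. It compares the unweighted estimate from the actual sample (500 and 250 households) with the estimate a proportional sample (375 and 375) would give. The percentages are the ones assumed in the example; the variable names are illustrative.

```python
sampled = {"North": 500, "West": 250}
improved_water_share = {"North": 0.25, "West": 0.75}  # assumed in the example

n = sum(sampled.values())

# Unweighted estimate: every sampled household counts once.
unweighted = sum(sampled[r] * improved_water_share[r] for r in sampled) / n

# Proportional allocation: both regions have 10,000 households,
# so each would contribute half of the 750 sampled households.
proportional = {region: n / 2 for region in sampled}
proportional_estimate = (
    sum(proportional[r] * improved_water_share[r] for r in sampled) / n
)

print(round(unweighted, 3), proportional_estimate)
print("bias:", round(unweighted - proportional_estimate, 3))
```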
Example - Sample Weights
• If we assigned a weight of 20 to each
household in North, and 40 to each household
in West, this would do the trick
((500 * 20 * 0.25) + (250 * 40 * 0.75)) / ((500 * 20) + (250 * 40))
= 0.50
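The same calculation can be written as a general weighted proportion. The function below is a minimal sketch, not part of the MICS programs; its name and the tuple layout (sampled households, weight, proportion) are assumptions made for illustration.

```python
def weighted_proportion(strata):
    """Weighted proportion from (sampled, weight, proportion) tuples."""
    numerator = sum(n * w * p for n, w, p in strata)
    denominator = sum(n * w for n, w, _ in strata)
    return numerator / denominator


popstan = [
    (500, 20, 0.25),  # North: sampled households, weight, share using improved water
    (250, 40, 0.75),  # West
]
estimate = weighted_proportion(popstan)
weighted_total = sum(n * w for n, w, _ in popstan)

print(estimate)        # weighted national estimate
print(weighted_total)  # weighted number of households (the denominator)
```

The weighted denominator equals the total number of households in Popstan, not the number of households interviewed.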
Example - Sample Weights
• This is fine, but SPSS tables would show
20,000 households as the denominator
• We do not want this
• So, we normalize the weights
• We calibrate (normalize) them so that the
average of the weights in the data set is equal
to 1
Example - Sample Weights
• The normalized weight for the North region is
calculated as (10000/500)/(20000/750) = 0.75
• And for the West region, (10000/250)/(20000/750)
= 1.5
• When we calculate the national use of improved water
sources by using normalized weights, we get the same estimate:
((500 * 0.75 * 0.25) + (250 * 1.5 * 0.75)) / ((500 * 0.75) + (250 * 1.5))
= 375 / 750 = 0.50
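Normalization can be sketched in Python as dividing every weight by the average weight in the data set. The sketch below uses the Popstan figures and illustrative variable names. It shows that the normalized weights sum to the actual sample size (750) while the estimate stays the same.

```python
sampled = {"North": 500, "West": 250}
design_weight = {"North": 20.0, "West": 40.0}
improved_water_share = {"North": 0.25, "West": 0.75}  # assumed in the example

n = sum(sampled.values())

# Average weight over all sampled households: 20,000 / 750.
mean_weight = sum(sampled[r] * design_weight[r] for r in sampled) / n

# Dividing by the average makes the mean of the weights equal to 1.
normalized = {r: design_weight[r] / mean_weight for r in sampled}

weighted_n = sum(sampled[r] * normalized[r] for r in sampled)
estimate = (
    sum(sampled[r] * normalized[r] * improved_water_share[r] for r in sampled)
    / weighted_n
)

print({r: round(w, 4) for r, w in normalized.items()})
print(round(weighted_n, 4), round(estimate, 4))
```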
Sample weights
• Based on the design of the sample, there are
two (common) approaches to calculating
weights:
– Each cluster has a unique sample weight
(weights.xls)
– Each stratum has a unique sample weight
(weights_alt.xls)
• We have templates for both. You will need to
work with your sampling expert to see which
one you will use
Sample Weights Objects
• weights.xls
– spreadsheet that calculates weights
• weights_table.sps
– SPSS program that provides input data for spreadsheet
• weights.sps
– SPSS program that defines structure of spreadsheet’s output
• weights_merge.sps
– SPSS program that merges weights onto the MICS data files
Calculating sample weights
• The spreadsheet weights.xls is used to
calculate the sample weights
• It has two worksheets, calculations and
output.
– The calculations worksheet performs the
calculations
– The output worksheet contains only the sample
weights and a list of cluster numbers; format
useful for reading the data into SPSS
• Weights calculation template
Calculating and adding sample
weights
• weights_table.sps produces data
needed for calculating the sample
weights
• weights_merge.sps adds the
appropriate sample weights to the
analysis files
Steps in calculating sample weights
• The process of calculating sample
weights and adding them to your
analysis files can be broken down
into six steps
Steps in calculating sample weights
Step 1
• Adjust the number of rows in the calculations
and output worksheets so that there is one
row per cluster in your survey. After you have
added or deleted rows, be sure to check that
doing so did not affect the totals row in the
calculations worksheet
Steps in calculating sample weights
Step 2
• Enter required information for columns B to F
and for columns H and I
Steps in calculating sample weights
Step 3
• Update the definition of strata (or domains) on lines
3 through 10 of the program weights_table.sps
• The standard programs assume that strata are
formed by all combinations of area (that is, urban
and rural) and region and that there are four regions
(the program should be modified to reflect the strata
or domains in use in your sample)
Steps in calculating sample weights
Step 4
• Execute the program weights_table.sps.
Steps in calculating sample weights
Step 5
• Copy the information in the table and
paste it into the calculations worksheet
of weights.xls
• When you complete this step,
weights.xls will automatically calculate
the sample weights
Steps in calculating sample weights
Step 6
• Execute the program weights_merge.sps
• Once you have completed the sixth step,
be sure to check the output list for error
messages and to open the analysis files
and confirm that the weights have been
properly merged
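The check in Step 6 can be illustrated with a small Python sketch. In the actual workflow the merge is done by weights_merge.sps; the code below only shows the logic of the check on made-up records. The field names (hh_id, cluster, weight), the cluster numbers and the weights are all hypothetical.

```python
def merge_weights(households, cluster_weights):
    """Attach a weight to each household record by cluster number.

    Returns the merged records and the ids of households whose
    cluster has no weight (these need to be investigated).
    """
    merged, unmatched = [], []
    for household in households:
        weight = cluster_weights.get(household["cluster"])
        if weight is None:
            unmatched.append(household["hh_id"])
        merged.append({**household, "weight": weight})
    return merged, unmatched


# Hypothetical output of the weights spreadsheet: cluster number -> weight.
cluster_weights = {1: 0.75, 2: 0.75, 3: 1.5}

# Hypothetical household records from an analysis file.
households = [
    {"hh_id": "1-01", "cluster": 1},
    {"hh_id": "2-01", "cluster": 2},
    {"hh_id": "3-01", "cluster": 3},
]

merged, unmatched = merge_weights(households, cluster_weights)
mean_weight = sum(h["weight"] for h in merged) / len(merged)

print("households without a weight:", unmatched)
print("mean weight:", mean_weight)
```

If the weights were normalized as described earlier, the mean weight in the household file should be close to 1, and the list of households without a weight should be empty.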