Non-sampling errors

(Session 20)
SADC Course in Statistics
Learning Objectives
By the end of this session, you will be able to
• describe the types of non-sampling errors
that arise in survey work
• explain actions that may be taken to
minimise commonly occurring non-sampling
errors
• appreciate that sampling error is only a
small component of all the errors that may
arise, and that close attention to reducing
non-sampling errors is equally or more
important in survey work.
To put your footer here go to View > Header and Footer
Non-sampling errors: 1
Non-sampling errors cover all errors other
than those due to sampling a subset of the
population.
In both surveys and censuses, it is quite usual
to find non-sampling errors because their
absence implies that the data collection
process has been:
(a) implemented and enumerated perfectly, and
(b) completely free of measurement errors,
i.e. inaccuracies in the recording of
information from selected units.
Non-sampling errors: 2
Non-sampling errors are not all due to
avoidable mistakes and/or deficiencies.
They can often occur because of decisions
by the researchers to balance the need for
good quality data with the need to obtain
timely data at acceptable cost.
The problem then reduces to one of
defining and minimising errors associated
with the data collection and data processing
procedures.
Types of non-sampling errors
Non-sampling errors can be of various
types:
• Coverage (or frame) errors
• Non-response errors
• Measurement errors
• Data handling errors
Note that the first more often applies to
sample surveys, while the last three apply
to both surveys and censuses.
Coverage (frame) errors
In surveys, the sample is selected from a list,
i.e. a sampling frame, of all population
members.
An inadequate frame leads to coverage
errors. A frame often suffers from either
• under-coverage (missing elements), or
• over-coverage (duplicates)
Both lead to biased results. See below for an
example.
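The effect of under-coverage on a mean can be sketched numerically. The figures below are invented for illustration only: a small population in which the frame misses the larger households. The bias of the frame-based mean equals the share of uncovered units multiplied by the difference between the covered and uncovered means.

```python
# Hypothetical population of 10 households; y = household size.
# The frame lists only six of them (under-coverage).
covered = [2, 3, 3, 4, 3, 2]      # units listed on the frame
uncovered = [6, 7, 5, 6]          # units missing from the frame


def mean(values):
    return sum(values) / len(values)


population = covered + uncovered
true_mean = mean(population)      # the quantity we want to estimate
frame_mean = mean(covered)        # the most a frame-based survey can recover

# Bias observed directly
bias = frame_mean - true_mean

# Same bias from the identity:
# (share of uncovered units) * (covered mean - uncovered mean)
share_uncovered = len(uncovered) / len(population)
bias_from_identity = share_uncovered * (mean(covered) - mean(uncovered))

print(round(bias, 3), round(bias_from_identity, 3))
```

The bias disappears only if the uncovered units resemble the covered ones, or if there are very few of them; taking a larger sample from the same frame does not reduce it.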
Minimising frame errors
For under-coverage, consider re-defining the
population, i.e. the target population is
simply considered as the population which
can be accessed by the frame.
For duplicates, develop a system to identify
the duplicates, e.g. by using additional
information on the recording unit.
Both under- and over-coverage are minimised
by using up-to-date frames, e.g. in the UK the
Postcode Address File is updated every 3
months, and is hence often used by the
Office for National Statistics (ONS).
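One possible way of identifying duplicates from additional information on the recording unit is sketched below. The frame records, the field names and the matching rule (lower-casing, dropping punctuation, treating "rd" as "road") are all assumptions made for illustration; a real frame would need matching rules suited to its own data.

```python
from collections import defaultdict

# Hypothetical frame records: (unit_id, name, address)
frame = [
    (1, "A. Banda", "12 Market Road"),
    (2, "a banda", "12 market rd."),
    (3, "C. Phiri", "4 Lake View"),
    (4, "D. Moyo", "12 Market Road"),
]


def normalise(text):
    """Lower-case, drop punctuation and expand the abbreviation 'rd'."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    words = ["road" if word == "rd" else word for word in cleaned.split()]
    return " ".join(words)


def match_key(name, address):
    """Crude matching key built from name and address together."""
    return (normalise(name), normalise(address))


# Group unit ids that share the same matching key
groups = defaultdict(list)
for unit_id, name, address in frame:
    groups[match_key(name, address)].append(unit_id)

# Any key shared by more than one unit is a suspected duplicate
suspected_duplicates = [ids for ids in groups.values() if len(ids) > 1]
print(suspected_duplicates)
```

Units 1 and 2 are flagged as the same household, while unit 4 shares an address but not a name and is kept. Suspected duplicates should be checked before removal, since a crude key can also match units that are genuinely different.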
Non-response errors
Non-response errors are all errors arising
from:
• Unit non-response, i.e. failure to obtain
information from a pre-chosen sampling
unit or population unit
• Item non-response, i.e. failure to get a
response to a specific question or item in
the data recording form.
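The two kinds of non-response can be summarised as simple rates. The sketch below uses invented returns for five sampled households; the identifiers and the items "size" and "income" are hypothetical. A household with no record at all counts as unit non-response, while a missing value inside a returned record counts as item non-response.

```python
# Hypothetical returns for five sampled households.
# None for the whole record  -> unit non-response
# None for a single item     -> item non-response
sampled_units = {
    "HH01": {"size": 5, "income": 1200},
    "HH02": {"size": 3, "income": None},   # income question not answered
    "HH03": None,                          # no information obtained
    "HH04": {"size": 4, "income": 800},
    "HH05": None,                          # no information obtained
}

# Units from which at least some information was obtained
responding = {k: v for k, v in sampled_units.items() if v is not None}

# Share of sampled units that gave no information
unit_nonresponse_rate = 1 - len(responding) / len(sampled_units)


def item_nonresponse_rate(item):
    """Share of responding units with no answer to the given item."""
    missing = sum(1 for record in responding.values() if record[item] is None)
    return missing / len(responding)


print(unit_nonresponse_rate)
print(item_nonresponse_rate("income"))
print(item_nonresponse_rate("size"))
```

Here the item rate is computed among responding units only, which is one common convention; other denominators are possible and the choice should be stated when rates are reported.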
Types of non-response errors
Discussion:
What are the typical forms of non-response
(both unit and item non-response) you
encounter in your work?
What are the reasons for non-response?
How can such non-response errors be
minimised?
Measurement Errors
Measurement errors arise when the recorded
response differs from the true value.
They can occur for a variety of reasons, e.g.
• by respondent (e.g. heads of households)
giving an incorrect answer
• because of instrument or question error
• by interviewer error.
Further, errors may be greater for some subgroups of the population, e.g. those less
literate, or those unwilling to co-operate.
Reasons for respondent errors
Respondent errors arise for many reasons, e.g.
• respondent gives an incorrect answer, e.g. due to
prestige or competence implications, or due to
sensitivity or social undesirability of the question
• respondent misunderstands the requirements
• respondent lacks motivation to give an accurate answer
• “lazy” respondent gives an “average” answer
• question requires memory/recall
• proxy respondents are used, i.e. answers are taken
from someone other than the intended respondent.
How can such errors be minimised?
Instrument Errors
Instrument or question errors arise when
• the question is unclear, ambiguous or difficult
to answer
• the list of possible answers suggested in the
recording instrument is incomplete
• the requested information assumes a framework
unfamiliar to the respondent
• the definitions used by the survey are different
from those used by the respondent (e.g. how
many part-time employees do you have? See
the example that follows)
How can such errors be minimised?
An example of instrument error
The following example is from Ruddock (1998); see References.
In the Short Term Employment Survey (STES)
conducted by the Office for National Statistics in the UK,
data are collected on numbers of full-time and
part-time employees on a given reference date.
Some firms ignored the reference date and gave
figures for employees paid at the end of the
month, thus including those who joined and those
who left in that month – leading to an overestimate.
Firms found it difficult to give details of part-time
employees as their definition of “part-time” did
not agree with that used by ONS.
Interviewer errors
Interviewer errors arise when
• different interviewers administer a survey in
different ways
• respondents react differently to different
interviewers, e.g. to interviewers of their
own sex or own ethnic group
• interviewers are inadequately trained
• inadequate attention is paid to the selection
of interviewers
• there is too high a workload for the
interviewer
How can such errors be minimised?
Data handling errors
Data handling errors can occur from the stage of
data collection up to the final stages of data analysis.
Types of errors that can arise include:
• errors in transmission of data from the field to
the office
• errors in preparing the data in a suitable format
for computerisation, e.g. during coding of
qualitative answers
• errors in computerisation of the data
• errors during data analysis, e.g. imputation and
weighting.
Do any of these types of error occur in your
work? If so, what can you do to minimise them?
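One commonly used safeguard at the computerisation stage is double data entry followed by a comparison of the two keyings, often combined with simple range checks. The sketch below is illustrative only: the questionnaire identifiers, the fields and the age limits are assumptions, not part of any particular survey system.

```python
# Two independent keyings of the same hypothetical questionnaires.
first_entry = {
    "Q001": {"age": 34, "sex": "F"},
    "Q002": {"age": 27, "sex": "M"},
    "Q003": {"age": 45, "sex": "F"},
}
second_entry = {
    "Q001": {"age": 34, "sex": "F"},
    "Q002": {"age": 72, "sex": "M"},   # digits transposed by one operator
    "Q003": {"age": 45, "sex": "F"},
}


def keying_mismatches(a, b):
    """List (questionnaire, field, value_a, value_b) where the keyings differ."""
    differences = []
    for qid in sorted(a):
        for field in sorted(a[qid]):
            if a[qid][field] != b[qid][field]:
                differences.append((qid, field, a[qid][field], b[qid][field]))
    return differences


def out_of_range(data, field, low, high):
    """List questionnaires whose value for `field` lies outside [low, high]."""
    return [qid for qid in sorted(data) if not low <= data[qid][field] <= high]


mismatches = keying_mismatches(first_entry, second_entry)
implausible_ages = out_of_range(first_entry, "age", 0, 110)  # assumed limits

print(mismatches)
print(implausible_ages)
```

A mismatch shows only that at least one keying is wrong; the paper questionnaire still has to be consulted to decide which value is correct. Note also that both ages for Q002 pass the range check, so range checks alone would not have caught this error.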
Measuring non-sampling errors
Measuring non-sampling errors is difficult and
often impossible. Attempts have often been
made through specific additional studies, e.g.
characteristics of non-respondents in the 1996
British Crime Survey were investigated by a
mini-questionnaire administered to those living
at 25% of non-responding addresses.
Several studies to assess non-sampling errors
can be found in Ruddock (1998) (see References
for the full reference) and in Lessler, J.T. and
Kalsbeek, W.D. (1992) Non-sampling error in
surveys; Wiley.
Non-sampling errors: Key Points
Non-sampling errors are inevitable in the production
of national statistics. It is therefore important to:
• list all potential non-sampling errors at the
planning stage and consider steps to minimise
them
• question the procedures adopted for data
collection, and for data verification at each step
of the data chain, if data are collected from
other sources
• view the data collected critically and attempt to
resolve queries as soon as they arise
• document sources of non-sampling errors so that
results presented can be interpreted meaningfully.
References
Ruddock, V. (1998) Measuring and Improving Data
Quality. UK Government Statistical Service Methodology
Series No. 14. A very comprehensive coverage of
non-sampling errors. Available at
http://www.statistics.gov.uk/methods_quality/publications.asp
Lepkowski, J. (2004) Non-observation error in
household surveys in developing countries. Chapter
VIII, pp. 149-169 of the UN publication An Analysis of
Operating Characteristics of Household Surveys in
Developing and Transition Countries: Survey Costs,
Design Effects and Non-Sampling Errors. Available at
http://unstats.un.org/unsd/hhsurveys/index.htm
Practical work follows…