
Data Quality Control
by
Naila Baig Ansari
Research Fellow
Dept of Community Health Sciences
The Aga Khan University
Karachi, Pakistan
Who am I?
Education:
MSc (Epidemiology), The Aga Khan University, 2001. Thesis: Care and feeding practices and their association with stunting among young children residing in Karachi's squatter settlements
BBA (Management), The College of William and Mary, Williamsburg, VA, USA, 1989
Research interests: Nutritional and behavioral epidemiology, methodological issues in dietary assessment methods, household food security and gender-related issues, care and feeding practices, data management and questionnaire design
Learning Objectives
• To know the steps necessary for ensuring quality assurance and control of data at various stages of a study
• To understand the difference between pilot testing and pretesting
• To understand the importance of designing data collection instruments
• To understand how data can be managed using an audit trail and the various techniques that can be used to inspect your dataset after it has been entered
Performance Objectives
• Know the difference between quality assurance and quality control and ways to ensure them
• Know the objectives of a pilot test and a pre-test
• Understand how data collection instruments should be designed and coded
• Be able to manage data using an audit trail
• Be able to inspect datasets for errors and rectify them
Data Quality Control
• Quality Assurance
  – Activities to ensure the quality of data before data collection
• Quality Control
  – Monitoring and maintaining the quality of data during the conduct of the study
• Data Management
  – Handling and processing of data throughout the study
Steps in Quality Assurance
1. Specify the study hypothesis
2. Specify the general design to test the study hypothesis → develop an overall study protocol
3. Choose or prepare specific instruments
4. Develop procedures for data collection and processing → develop operations manuals
5. Train staff → certify staff
6. Using certified staff, pretest and pilot-study the data collection and processing instruments and procedures
Quality Assurance: Standardization of Procedures
• Why is standardization important?
  – To achieve the highest possible level of uniformity and standardization of data collection procedures across the entire study population
• Preparation of a written manual of operations
  – Detailed descriptions of exactly how the procedures specific to each data collection instrument are to be carried out (e.g. blood pressure measurement)
  – Q-by-Q (question by question) instructions for interviews
Quality Assurance: Training of Staff

• Aim: to make each staff member thoroughly familiar with the procedures under his/her responsibility
• Training → certification of the staff member to perform a specific procedure
Quality Assurance: Pretesting and Pilot Testing
• Pretesting
  – Involves assessing specific procedures on a sample in order to detect major flaws
• Pilot Testing
  – A formal rehearsal of the study procedures
  – Attempts to reproduce the whole flow of operations in a sample as similar as possible to the study participants
Pretesting and Pilot testing results

• Pretesting of the questionnaire is used to assess:
  – the flow of questions,
  – the presence of sensitive questions,
  – the appropriateness of the categorization of variables,
  – the clarity of the q-by-q instructions to the interviewer
• Pilot testing
  – In addition to the above, the flow of the whole process
Quality Assurance: Data Management
• Designing data collection instruments
  – Layout, questions to ask, sequence of questions, phrasing of questions, response categories, skip patterns
  – Collect and record "raw", not processed, information (e.g. age)
  – Codebook: the link between the questionnaire and the data entered in the computer
Code book example
Variable | QNo | Meaning          | Codes                                                                   | Format
Q1Id     | Q1  | Quest. no        | 1-750                                                                   | C3
Q2Sex    | Q2  | Respondent's sex | 1 = male, 2 = female                                                    | N 1.0
Q3Child  | Q3  | No of children   | 99 = no response                                                        | N 2.0
Q4Wt     | Q4  | Weight in kg     | 999 = not recorded                                                      | N 3.1
Q5roof   | Q5  | Roof type        | 1 = RCC, 2 = Cement sheet, 3 = Tin sheet, 4 = Thatched, Other (specify) | N 2.0
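The same codebook can also be kept in a machine-readable form and used to check entered values. Below is a minimal sketch in Python/pandas, not part of the original slides: the valid range assumed for Q3Child (0-20 plus 99 = no response) and the assumption that "Other (specify)" for Q5roof is coded 5 are illustrative only.

```python
# Minimal sketch: the example codebook as a machine-readable structure,
# used to list values that fall outside the defined codes.
import pandas as pd

CODEBOOK = {
    "Q1Id":    {"meaning": "Questionnaire number", "valid": set(range(1, 751))},
    "Q2Sex":   {"meaning": "Respondent's sex",     "valid": {1, 2}},
    "Q3Child": {"meaning": "No of children",       "valid": set(range(0, 21)) | {99}},  # 0-20 assumed
    "Q4Wt":    {"meaning": "Weight in kg",         "valid": None},  # continuous, range-checked separately
    "Q5roof":  {"meaning": "Roof type",            "valid": {1, 2, 3, 4, 5}},  # 5 = Other assumed
}

def check_against_codebook(df: pd.DataFrame) -> pd.DataFrame:
    """List every value that falls outside the codes defined in the codebook."""
    problems = []
    for var, spec in CODEBOOK.items():
        if spec["valid"] is None:
            continue
        bad = df.loc[~df[var].isin(spec["valid"]), var]
        for row, value in bad.items():
            problems.append({"row": row, "variable": var, "value": value})
    return pd.DataFrame(problems)
```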
Quality Assurance: Use of a Code book

• Variable names
  – Up to 8 characters, a-z and 0-9; must start with a letter
  – A combination of question number and description (e.g. q3age)
• Meaning
  – A short text description of the meaning of the variable
  – SPSS can store this information as variable labels and display it in the output
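To illustrate the idea outside SPSS, the hedged Python sketch below keeps the variable and value labels from the codebook in plain dictionaries and applies them when tabulating; this is an analogy to SPSS variable/value labels, not the slides' own procedure.

```python
# Hedged sketch: codebook labels applied when tabulating, so the output shows
# "male"/"female" rather than 1/2. (SPSS stores these as variable/value labels.)
import pandas as pd

VARIABLE_LABELS = {"Q2Sex": "Respondent's sex", "Q5roof": "Roof type"}
VALUE_LABELS = {
    "Q2Sex": {1: "male", 2: "female"},
    "Q5roof": {1: "RCC", 2: "Cement sheet", 3: "Tin sheet", 4: "Thatched", 5: "Other"},
}

def labelled_frequencies(df: pd.DataFrame, var: str) -> pd.Series:
    """Frequency table of `var` with value labels applied and the variable label as its name."""
    table = df[var].map(VALUE_LABELS[var]).value_counts(dropna=False)
    table.name = VARIABLE_LABELS[var]
    return table
```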
Quality Assurance: Use of a Code book

• Codes
  – Try to use numerical codes
• Pre-decide codes for no response and missing values
  – Question could not be asked or was not applicable (e.g. pregnancy outcome)
  – Question was asked but the respondent did not reply (e.g. salary)
  – Respondent replied "don't know"
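For analysis, these pre-decided codes must eventually be converted into genuine missing values so they are not mistaken for real observations. A hedged sketch follows, reusing the 99/999 codes from the codebook example; the pandas approach is an assumption (the slides work with SPSS/EpiData).

```python
# Illustrative sketch: convert pre-decided "no response"/"not recorded" codes
# (99 and 999 from the codebook example) to missing values before analysis.
import numpy as np
import pandas as pd

MISSING_CODES = {"Q3Child": [99], "Q4Wt": [999]}

def recode_missing(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    for var, codes in MISSING_CODES.items():
        out[var] = out[var].replace(codes, np.nan)
    return out
```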
Quality Control
Observation of procedures and the performance of staff members for identification of obvious protocol deviations

• Strategies include:
  – Over-the-shoulder observation of staff
  – Taping all interviews and reviewing a random sample
  – Ongoing field supervision
  – Field editing by the interviewer as well as the field supervisor
  – Office editing, which includes coding
  – Log book maintenance
  – Statistical assessment of trends over time in the performance of each observer/interviewer/technician
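The last strategy can be as simple as tabulating each interviewer's output over time and watching for drift. A hedged Python sketch follows; the column names (interviewer, interview_date) and the choice of Q4Wt as the monitored measurement are assumptions made for illustration.

```python
# Sketch: monthly record counts and mean recorded weight per interviewer.
# Sudden shifts or outlying interviewers are flags for closer supervision.
import pandas as pd

def interviewer_trends(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df["month"] = pd.to_datetime(df["interview_date"]).dt.to_period("M")
    return (
        df.groupby(["interviewer", "month"])["Q4Wt"]
          .agg(n_records="count", mean_weight="mean")
          .reset_index()
    )
```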
Data Management: Audit trail

• The researcher should be able to trace each piece of information back to the original document:
  – The ID is included in the original documents and in the dataset (a simple check is sketched at the end of this slide)
  – All corrections must be documented and explained
  – All modifications to the dataset must be documented by command files
  – Each analysis must be documented by a command file

• The purpose of the audit trail is to
  – protect yourself against mistakes, errors, waste of time and loss of information
  – enable external audit (revision)
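One simple check of the ID requirement above might look like the sketch below; the log-book DataFrame and the column name Q1Id are assumptions for illustration.

```python
# Hedged sketch: confirm every dataset record can be traced back to an issued
# questionnaire ID in the log book, and vice versa.
import pandas as pd

def check_id_trail(dataset: pd.DataFrame, log_book: pd.DataFrame) -> None:
    data_ids = set(dataset["Q1Id"])
    issued_ids = set(log_book["Q1Id"])
    print("In dataset but not in log book:", sorted(data_ids - issued_ids))
    print("Issued but missing from dataset:", sorted(issued_ids - data_ids))
```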
Data Management: Handling of Data

• Entering data
  – Use a professional data entry program such as EpiData
• Preparations
  – Complete the codebook
  – Examine the questionnaires for obvious inconsistencies and skip patterns
Data Management: Handling of Data

• Error prevention:
  – Set up a data entry form resembling your questionnaire
  – Define valid values before entering data
  – Double data entry by two different operators:
    – Compare the contents to get a list of discrepancies (e.g. with EpiInfo)
    – Correct errors in both files and run a new comparison
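The comparison step could look like the sketch below, which lists every cell where the two operators' entries differ. The slides use EpiData/EpiInfo for this; the pandas version, and the assumption that both files contain the same IDs and columns, are illustrative only.

```python
# Hedged sketch: list discrepancies between two independently entered files
# so each can be checked against the paper questionnaire and corrected in both.
import pandas as pd

def list_discrepancies(entry1: pd.DataFrame, entry2: pd.DataFrame) -> pd.DataFrame:
    a = entry1.set_index("Q1Id").sort_index()
    b = entry2.set_index("Q1Id").sort_index()
    diffs = []
    for var in a.columns:
        differs = a[var].ne(b[var]) & ~(a[var].isna() & b[var].isna())
        for qid, is_diff in differs.items():
            if is_diff:
                diffs.append({"Q1Id": qid, "variable": var,
                              "entry1": a.at[qid, var], "entry2": b.at[qid, var]})
    return pd.DataFrame(diffs)
```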
First Inspection of Data: Error Finding
• Add variable and value labels to your data using a syntax command

• Searching for errors
  – Make printouts of a codebook generated from the data, an overview of variables, and simple frequency tables of appropriate variables
  – Compare the generated codebook with the original codebook and check that the label information is correct
  – Inspect the generated summary/frequency tables for illegal or improbable minimum and maximum values of variables and for inconsistencies (e.g. an age of 250 years, a pregnant male, a 23-year-old woman with a 19-year-old son)

• Calculate the error rate
  – Randomly select 10% (or at least 40) of your questionnaires, re-enter them into a new file, and compare them with the originally entered data (as sketched below)
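The range/consistency checks and the error-rate calculation above might be sketched as follows; the variable names, the plausibility limits chosen for weight, and the pandas approach are assumptions for illustration.

```python
# Hedged sketch: frequency tables, min/max overview, one plausibility check,
# and the error rate from a re-entered 10% sample compared cell by cell.
import pandas as pd

def first_inspection(df: pd.DataFrame) -> None:
    for var in ["Q2Sex", "Q5roof"]:                      # simple frequency tables
        print(df[var].value_counts(dropna=False), "\n")
    print(df[["Q3Child", "Q4Wt"]].agg(["min", "max"]))   # min/max overview
    print("Implausible weights:\n", df[(df["Q4Wt"] < 1) | (df["Q4Wt"] > 300)])

def error_rate(original: pd.DataFrame, reentered: pd.DataFrame) -> float:
    a = original.set_index("Q1Id").loc[reentered["Q1Id"]].sort_index()
    b = reentered.set_index("Q1Id").sort_index()
    mismatches = (a.ne(b) & ~(a.isna() & b.isna())).to_numpy().sum()
    return mismatches / a.size                           # discrepancies per cell entered
```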
Correction of errors - Documentation

• If errors are discovered
  – Make the corrections in a command file (an SPSS syntax file); this provides full documentation of the changes made to the dataset (a sketch follows after this slide)
• If errors are discovered when comparing files after double data entry
  – You can make corrections directly in the data entered, provided you end this step with a comparison of the two files entered and corrected
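One way to picture the command-file approach: the correction lives in a script rather than in hand-edited data. The sketch below is a Python analogue of the SPSS syntax file described in the slides; the file names and the specific correction are invented for illustration.

```python
# Hedged sketch of a correction "command file": it reads the raw entry, applies
# documented corrections, and writes a new version, so every change is traceable.
import pandas as pd

raw = pd.read_csv("entered_raw.csv")        # illustrative file name

# Correction: questionnaire 123 had Q2Sex mis-keyed as 3; the paper form says 2 (female).
raw.loc[raw["Q1Id"] == 123, "Q2Sex"] = 2

raw.to_csv("entered_corrected_v1.csv", index=False)
```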
Correction of errors - Documentation

• Split the process into distinct and well-defined steps, and make sure that your documentation from one step to another is consistent

• Archive
  – Once you have a "clean", documented version of your primary data, save one copy in a safe place and do your work with another copy
Analysis

• Make sure you use the right dataset
  – It is recommended to create command files for analysis that start with the command that reads the dataset (a sketch follows below)

Late discovery of errors and inconsistencies
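A sketch of such an analysis command file follows; it makes the dataset it uses explicit on its first line. The file and variable names are assumptions.

```python
# Hedged sketch of an analysis command file: the first step always reads the
# clean, archived dataset, so there is no doubt which version produced the results.
import pandas as pd

df = pd.read_csv("archive/clean_dataset_v1.csv")     # illustrative archived copy

print(df["Q5roof"].value_counts(dropna=False))       # example analysis command
```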
Backing up vs Archiving

• Backing up
  – An everyday activity
  – The purpose is to enable you to restore your data and documents in case of destruction or loss
  – Covers not only the datasets, but also the command files modifying your data and written documents such as the protocol, the log book, and other documenting information (a sketch follows below)

• Archiving
  – Takes place once or a few times during the life of the project
  – The purpose is to preserve your data and documents for the more distant future, maybe even to allow other researchers access to the information.
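An everyday backup along these lines can be a single small script; the directory layout in the sketch below is an assumption made for illustration.

```python
# Hedged sketch: copy the whole project folder (datasets, command files,
# protocol, log book) into a dated backup directory.
import shutil
from datetime import date
from pathlib import Path

def backup(project_dir: str = "study_project", backup_root: str = "backups") -> Path:
    target = Path(backup_root) / f"{date.today():%Y-%m-%d}"
    shutil.copytree(project_dir, target, dirs_exist_ok=True)
    return target
```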