
Questionnaire Development
Health Survey Research Methods
Susan Sherman
December 6, 2010
Lecture Objectives
1. Learn about the sampling universe,
sample, and sampling methodology
– Random
– Nonrandom
2. Describe strengths and weaknesses of
different approaches.
3. To conceptualize, operationalize, and
specify research questions;
Learning Objectives
4. To understand how questions, responses,
instructions and the questionnaire can affect the
meaning of a question.
5. To recognize and revise unbalanced, loaded, or
double-barreled questions.
6. Hear about the block
Study Population
• The population (universe, target
population) is the entire set of individuals
to which findings of the survey are to be extrapolated
– Members of the population are elements
• Often cannot sample elements directly
(not available, too expensive), but are
associated with other units, enumeration
units or listing units.
Why sample?
• Economy!
• No need to determine all possible
• Can use probability and statistics to assist
making an informed judgment
• Sampling frames are out of date as soon
as they are developed
The sample
Probability and non-probability sampling
• Probability sample: every element in the
population has a known, nonzero probability of
being included in the sample
• Non-probability sampling does not have this
feature, but is commonly used in market
research and public opinion polls (time,
expense, not feasible) – quota surveys
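A known, nonzero inclusion probability is what allows each sampled element to be weighted back up to the population. The sketch below is a minimal illustration only; the two strata, their sizes, and the function names are hypothetical and do not come from the lecture.

```python
# Hypothetical design: simple random sampling within two strata.
# N = number of elements in the stratum, n = number sampled from it.
strata = {
    "urban": {"N": 8000, "n": 400},
    "rural": {"N": 2000, "n": 200},
}


def inclusion_probability(N, n):
    """Chance that any one element is drawn in a simple random sample of n from N."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N so the probability is known and nonzero")
    return n / N


def design_weight(N, n):
    """Inverse of the inclusion probability: how many elements one respondent represents."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return N / n


weights = {name: design_weight(s["N"], s["n"]) for name, s in strata.items()}

# Summing the weights over all sampled elements recovers the population size.
estimated_population = sum(weights[name] * s["n"] for name, s in strata.items())
```

In a non-probability sample such as a quota survey, the inclusion probabilities are unknown, so weights of this kind cannot be computed.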
Sampling frame
• Reliance on known probability of being
selected (vs. marketing research)
• Provides means of identifying and locating
population elements.
– Often contains additional information that can
be used for stratification and clustering
– Organization of frame exerts strong influence
over sample design.
Sampling frame
“Ideal frame” lists each population
element once and contains no other
listings (rare)
Kish’s classification of possible frame problems:
Missing elements (not in the frame)
Blanks and foreign elements
Duplicate listings
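The frame problems listed above can be illustrated with a small audit of a frame against a reference list. This is a sketch under a strong assumption: the true population list is rarely available in practice, and the household IDs and the `audit_frame` helper are hypothetical.

```python
from collections import Counter


def audit_frame(frame, population):
    """Compare a frame listing with a reference population list (both lists of IDs)."""
    counts = Counter(frame)
    listed = set(counts)
    pop = set(population)
    return {
        # Population elements that never appear in the frame.
        "missing": sorted(pop - listed),
        # Frame entries that are not members of the population.
        "foreign": sorted(listed - pop),
        # Elements listed more than once.
        "duplicates": sorted(e for e, c in counts.items() if c > 1),
    }


population = ["h01", "h02", "h03", "h04", "h05"]
frame = ["h01", "h02", "h02", "h04", "h99"]
report = audit_frame(frame, population)
```

Here the report lists `h03` and `h05` as missing, `h99` as foreign, and `h02` as a duplicate.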
Problems constructing sampling frames
Household information may be dated
New households may have been added
Older houses may have been demolished
Block listings date quickly
Organizational lists may be fraught with
missing information
• Clustering
• Substitution
Forms of Probability Sampling
Simple random sampling (rarely done)
Systematic sampling
Stratified sampling
Cluster Sampling
Serpentine fashion order
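The four designs named above can be sketched in a few lines each. This is an illustrative sketch, not a production sampler: the frame of 100 numbered elements, the strata, the clusters of ten, and all function names are assumptions made for the example.

```python
import random


def simple_random_sample(frame, n, rng):
    """Every subset of size n is equally likely."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Random start, then every k-th element of the ordered frame."""
    k = len(frame) // n           # sampling interval
    start = rng.randrange(k)      # random start within the first interval
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Independent simple random samples within each stratum."""
    strata = {}
    for element in frame:
        strata.setdefault(stratum_of(element), []).append(element)
    return {s: rng.sample(members, n_per_stratum) for s, members in strata.items()}


def cluster_sample(frame, cluster_of, n_clusters, rng):
    """Select whole clusters at random and keep every element in them."""
    clusters = {}
    for element in frame:
        clusters.setdefault(cluster_of(element), []).append(element)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for c in chosen for element in clusters[c]]


rng = random.Random(2010)         # fixed seed so the example is repeatable
frame = list(range(100))
srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(frame, lambda e: "low" if e < 50 else "high", 5, rng)
clustered = cluster_sample(frame, lambda e: e // 10, 3, rng)
```

Note that the systematic sample depends on how the frame is ordered, which is why the organization of the frame matters for the design.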
Unknown Source Population
• Absence of a sampling frame; unknown
boundaries and size of target population
• Privacy concerns; illegal or stigmatized activities
i.e., Hidden populations
Men who have sex with men (MSM)
Injection drug users (IDU)
Commercial sex workers
Migrant workers
• Relatively small groups
Sampling Methods
• Time-Location (TLS) or Venue-Based
• Respondent-Driven (RDS)
Facility-Based Sampling
• Sample of clients from facilities serving the
target population
• Examples:
– Jails/Prisons
– STI clinics
– Drug Treatment Centers
• Biased sample based on service seeking
• Considered convenience sample
Snowball Sampling
• “Random” selection of “seeds”
• Seeds refer others with outcome/exposure
of interest
• Endpoint: sample size or sample
• Considered convenience sample
• Example: ALIVE Study looking at the
natural history of HIV/AIDS among IDU
Targeted Sampling
• Formative research to identify networks of
outcome/exposure of interest
• Different networks treated as sampling strata
• Systematic sampling within strata
• Practically treated as a convenience sample
• Heavily dependent on extensive formative research
Venue-Based Sampling (VBS)
• Sampling of physical venues attended by target population
• Formative research identifies public/private venues
and days/times of attendance
• Venue-Day-Times (VDT) enumerated for eligibility
and viability
• Sampling frame consists of VDTs; random selection
of VDTs to construct sampling event calendar
• Individuals systematically recruited at sampling events
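The venue-based steps above can be sketched as follows. The venues, attendance figures, viability threshold, and function names are all hypothetical; a real study would take them from formative research.

```python
import random

# Hypothetical venue-day-time (VDT) frame produced by formative research.
vdt_frame = [
    {"venue": "Bar A", "day": "Fri", "time": "22:00-02:00", "expected_attendance": 40},
    {"venue": "Bar A", "day": "Tue", "time": "18:00-22:00", "expected_attendance": 5},
    {"venue": "Park B", "day": "Sat", "time": "14:00-18:00", "expected_attendance": 25},
    {"venue": "Club C", "day": "Sat", "time": "22:00-02:00", "expected_attendance": 60},
    {"venue": "Cafe D", "day": "Sun", "time": "10:00-14:00", "expected_attendance": 12},
    {"venue": "Cafe D", "day": "Mon", "time": "10:00-14:00", "expected_attendance": 3},
]

MIN_ATTENDANCE = 8  # assumed viability threshold, not from the lecture


def build_calendar(frame, n_events, rng, min_attendance=MIN_ATTENDANCE):
    """Drop non-viable VDTs, then randomly select sampling events."""
    viable = [v for v in frame if v["expected_attendance"] >= min_attendance]
    return rng.sample(viable, n_events)


def recruit_systematically(attendees, interval, rng):
    """Approach every interval-th attendee after a random start."""
    start = rng.randrange(interval)
    return attendees[start::interval]


rng = random.Random(7)
calendar = build_calendar(vdt_frame, 3, rng)
approached = recruit_systematically(list(range(20)), 5, rng)
```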
Respondent-Driven Sampling
• Type of chain referral sampling to reach hidden
populations (Markov chain)
• Begin with a set of nonrandomly selected seeds
• Seeds recruit peers, who recruit peers, etc.
• Recruits are linked by coupons with unique
identifying numbers
• Recruitment quota through coupons
• Incentives provided for completed survey and for
each successful recruit
Heckathorn 1997; Heckathorn & Salganik, 2004; Broadhead et al. 1998
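The coupon mechanics described above can be simulated in a few lines. This sketch covers only recruitment chains, coupon quotas, and coupon tracking; it does not implement incentives or the RDS estimators from the cited papers. The ring-shaped social network and all names are hypothetical.

```python
import itertools
import random
from collections import deque


def ring_peers(person, size=30):
    """Hypothetical network: each person knows the two people on either side."""
    return [(person + d) % size for d in (-2, -1, 1, 2)]


def simulate_rds(seeds, peers_of, coupons_per_person, target_n, rng):
    """Return {person: None for seeds, or (recruiter, coupon_id) for recruits}."""
    coupon_ids = itertools.count(1)          # unique identifying numbers
    recruited = {seed: None for seed in seeds}
    queue = deque(seeds)
    while queue and len(recruited) < target_n:
        recruiter = queue.popleft()
        eligible = [p for p in peers_of(recruiter) if p not in recruited]
        rng.shuffle(eligible)
        # The coupon quota caps how many peers each person can recruit.
        for peer in eligible[:coupons_per_person]:
            if len(recruited) >= target_n:
                break
            recruited[peer] = (recruiter, next(coupon_ids))
            queue.append(peer)
    return recruited


rng = random.Random(1997)
sample = simulate_rds(seeds=[0, 15], peers_of=ring_peers,
                      coupons_per_person=3, target_n=20, rng=rng)
```

Because every recruit is linked to a recruiter through a coupon number, the full recruitment network can be reconstructed afterwards.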
RDS Recruitment Network
Bias (systematic error)
• Sampling and non-sampling biases
• Sampling biases come from the sampling
processes themselves or from the
statistical estimation process
• Frame biases are the most problematic
– Inappropriate selection procedure
– Elements appear > 1 time
– Non-random ordering
Non-sampling biases
• Account for the largest source of total survey
error; most often ignored and underappreciated
• Observational biases: caused by obtaining
and recording observations incorrectly
– Field errors (data collection, enumeration,
– Processing and data analysis errors
Overall Conclusions
• Sampling methods have improved ability to arrive
at valid inferences
• Each method has to be considered and applied
based on objectives and target population
• Formative research is vital for implementation and
• These active surveillance and research efforts can
greatly supplement and enhance passive surveillance
• Infrastructures can be used for prevention efforts
To start: what are your goals?
Research Question: A statement that identifies the
phenomenon to be studied.
What are the units or entities being studied?
What variables will be compared across those units?
What relationships do you want to examine?
What relationships do you want to examine?
• Associative
– correlate
• Causal
– question a direction
– temporality
• Mediating
To determine if the relationship between
exercise and obesity varies by race/ethnicity
among 15-24 year olds in Baltimore, MD.
•What are the units?
•What are the variables of interest?
•What are the relationships between
variables of interest?
Measurement Process
Measurement is the process of
assigning numbers or labels to units
of analysis in order to represent
conceptual properties.
(Singleton, ‘93)
3 steps…
1. Conceptual Definition
2. Operational Definition
3. Variable Definition
1. Conceptual Definition
Process of formulating and clarifying
concepts of interest
– Refines problem statements or hypotheses
which can be vague
Example: Obesity
2. Operational Definition
Questions asked to obtain
information on concept or issue
1. Do you consider yourself overweight,
underweight, or just about right?
2. a. About how tall are you without shoes?
b. About how much do you weigh without shoes?
3. Variable Definition
Variable constructed from questions to
be used in the analysis of the data.
1. Obesity:
1 = overweight
2 = underweight
3 = about right
2. Obesity: Construct an index of obesity based on body mass index (BMI),
calculated as weight in kilograms divided by height in meters squared.
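The second variable definition can be sketched as code. This assumes respondents answer in pounds and inches, and it uses the standard adult BMI cut points (18.5, 25, 30), which are not stated in the lecture; for respondents under 20, age- and sex-specific percentiles are normally used instead. The function names are hypothetical.

```python
LB_TO_KG = 0.45359237   # exact definition of the pound in kilograms
IN_TO_M = 0.0254        # exact definition of the inch in meters


def bmi_from_us_units(weight_lb, height_in):
    """BMI = weight in kilograms divided by height in meters squared."""
    weight_kg = weight_lb * LB_TO_KG
    height_m = height_in * IN_TO_M
    return weight_kg / height_m ** 2


def bmi_category(bmi):
    """Standard adult categories (an assumption; not from the lecture)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"


bmi_a = bmi_from_us_units(180, 70)   # 180 lb, 5 ft 10 in
bmi_b = bmi_from_us_units(220, 68)   # 220 lb, 5 ft 8 in
```

The two definitions of obesity can disagree: a respondent who answers "just about right" to the self-assessment item may still fall in the overweight range by BMI.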
Note of Caution
• No indicator can perfectly represent a
single concept.
• No two indicators measure a single
concept in exactly the same way.
New vs. Existing Questions
Pros of Existing: enhances quality and applicability of
items, enables comparisons across studies
– Tested for validity and reliability
– Evidence of methodological problems—missing
– How questionnaire was delivered (self administered,
ACASI, etc)
– Study population
– Social changes (time period)
Four elements that affect the meaning of a question
1. Questions
2. Responses
3. Instructions
4. Questionnaires
1. Questions
a. Wording
Unclear word choice
b. Phrasing
Unbalanced question
Loaded question
Double barreled question
c. Sentence
Wordiness (short vs. long)
d. Question
Irrelevant to population or research question
1a. Wording
• Avoid jargon
• Avoid acronyms
• Be precise
• Ensure there is adequate knowledge
1b. Phrasing
• Unbalanced Question
• Double-barreled Question
• Loaded Question
Unbalanced Question
Definition: Both sides of a question are not
adequately represented.
Example: Do you agree that medical
marijuana is bad?
Loaded Question
Definition: A question that encourages
participants to respond to the question in a
certain way
Example: There are many people who
believe that medicinal marijuana should be
available. Are you one of them?
Revised: Medicinal marijuana has positive
medical properties. [strongly agree… strongly disagree]
Double-barreled Question
Definition: A question that has more than
one question embedded within it. A red
flag is the word “AND”
Example: Do you agree that medical
marijuana should be legal and that you
would vote for it on a ballot?
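The "AND" red flag can be turned into a rough screening check for draft items. This is only a heuristic sketch with a hypothetical function name: it flags questions for human review and will produce false positives (for example, "fruits and vegetables" in a single-concept item).

```python
import re

# Match "and" only as a standalone word, in any letter case.
AND_PATTERN = re.compile(r"\band\b", re.IGNORECASE)


def may_be_double_barreled(question):
    """Return True if the question contains the standalone word 'and'."""
    return AND_PATTERN.search(question) is not None


draft_items = [
    "Do you agree that medical marijuana should be legal and that you "
    "would vote for it on a ballot?",
    "Do you agree that medical marijuana should be legal?",
]
flagged = [q for q in draft_items if may_be_double_barreled(q)]
```

A flagged item is usually revised by splitting it into two questions, one per idea.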
1c. Sentences
Avoid wordiness and confusing sentence structure.
Example: Do you believe that the parking situation
on campus is problematic or difficult because of
the lack of spaces and the walking distances or
do you believe that the parking situation on
campus is ok?
1d. Question
Be sure that questions directly relate to
research questions.
Example (for research on parking): Do you
like or dislike the bus system?
2. Responses
• Open-ended
– comprehensiveness
• Closed-ended (rating, ranking)
– Should be completely exhaustive
– Should be mutually exclusive
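For numeric response categories, both requirements can be checked mechanically. The sketch below assumes integer-valued categories with inclusive bounds; the age brackets and the `check_categories` helper are hypothetical.

```python
def check_categories(categories, lowest, highest):
    """categories: list of (label, low, high) with inclusive integer bounds.

    Returns a list of problems; an empty list means the categories are
    mutually exclusive and exhaustive over lowest..highest.
    """
    problems = []
    expected = lowest  # next value that still needs to be covered
    for label, low, high in sorted(categories, key=lambda c: c[1]):
        if low > expected:
            problems.append(f"gap: {expected}-{low - 1} not covered")
        elif low < expected:
            problems.append(f"overlap: {low}-{min(high, expected - 1)} covered twice")
        expected = max(expected, high + 1)
    if expected <= highest:
        problems.append(f"gap: {expected}-{highest} not covered")
    return problems


# Age brackets for respondents aged 15-24.
flawed = [("15-18", 15, 18), ("18-21", 18, 21), ("23-24", 23, 24)]
revised = [("15-17", 15, 17), ("18-20", 18, 20), ("21-24", 21, 24)]
flawed_problems = check_categories(flawed, 15, 24)
revised_problems = check_categories(revised, 15, 24)
```

In the flawed set, an 18-year-old fits two categories and a 22-year-old fits none.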
Open-ended vs. Closed-ended
Open-ended:
• Enables participant to talk about what comes to
mind first
• Respondents can provide a comprehensive and
diverse array of answers
• Difficult to code and
Closed-ended:
• Dependent on structure of responses: some categories
may be inadvertently omitted
• May be affected by number and type of response
categories and by the presence of a neutral or “don’t
know” category
• Easier to code and may be more reliable across
respondents and interviewers
3. Instructions
• to ensure that the question or questionnaire is answered
in the way that it should be.
• As part of question itself
• To introduce or close questionnaire
• To make meaningful transitions between topics
• To guide respondent/interviewer on skip patterns
• parentheses, all capital letters or some other type face
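Skip patterns are easier to verify when the routing is written down explicitly. The following is a minimal sketch of skip logic; the three questions, their wording, and the `route` function are hypothetical examples, with the interviewer instruction shown in capital letters inside parentheses.

```python
# Each question names the next question as a function of the answer given.
QUESTIONNAIRE = {
    "Q1": {
        "text": "Have you seen a doctor in the past 12 months? (IF NO, SKIP TO Q3)",
        "next": lambda answer: "Q2" if answer == "yes" else "Q3",
    },
    "Q2": {
        "text": "How many times did you see a doctor in the past 12 months?",
        "next": lambda answer: "Q3",
    },
    "Q3": {
        "text": "In general, how would you rate your health?",
        "next": lambda answer: None,   # end of questionnaire
    },
}


def route(questionnaire, answers, start="Q1"):
    """Return the ordered list of question IDs a respondent is asked."""
    path = []
    current = start
    while current is not None:
        path.append(current)
        current = questionnaire[current]["next"](answers.get(current))
    return path


path_yes = route(QUESTIONNAIRE, {"Q1": "yes", "Q2": "3"})
path_no = route(QUESTIONNAIRE, {"Q1": "no"})
```

Tracing a few answer patterns this way before fielding helps catch questions that can never be reached or that are asked of the wrong respondents.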
To introduce or close questionnaire
We are now done with the questionnaire. If
you have any questions about this
interview, please feel free to ask me now.
If you have no questions, I'd like to thank you for
participating in this interview.
To make meaningful transitions between topics
The next set of questions is about using
drugs. Please remember that your
answers are strictly confidential. Your
name is not on this form. No one can
trace these answers back to you. If you
do not want to answer certain questions
you don’t have to, but please answer all
the questions that you can.
4. Questionnaire
The order and context in which items
are placed has an impact on the
meaning of certain questions and how
respondents answer them.
Sequencing of Questions
1. Begin with behavioral questions about
the present and then the past.
2. Start with questions that make them feel
3. Place sensitive questions in the middle.
Categories of Health
1. Socio-demographics: e.g., age
2. Behavior: e.g., condom use
3. Knowledge: e.g., HIV transmission,
4. Attitudes: e.g., towards HIV positive
1. Socio-demographics
• Use of standardized measures:
– Reduce time and effort
– Permit direct comparison
– May have documented validity and reliability
• Adapt US Census, federally sponsored
2. Behaviors
• Prone to over- and underreporting due to
recall errors
• Recall errors are a function of:
– time period over which events are to be
– Salience or significance of event to person
Minimizing Recall Bias:
Memory Aid Procedures
• Aided recall: Clues provided in question
• Records: Ask respondents to consult personal
records, e.g. checkbooks, doctor or hospital bills
• Diaries: Respondents use calendars to record
relevant events; respondent consults diary when
interviewed about these events
3-4. Knowledge and Attitudes
• Knowledge: more objective because a “right”
answer is presumed to exist
• Opinion: refers to views about particular
objects such as a person or policy
• Attitude: Bundle of opinions that are more
or less coherent and are about some
complex object
Minimize Threat: Knowledge
• Threat stems from fear of being perceived
as ignorant
• To minimize threat, phrase questions as
opinions (e.g., “do you think…”)
• To accurately capture knowledge:
– Ask more than one question
– Keep responses open ended
Rules to Write By
• Be mindful of the 4 elements that affect the meaning of a
question: question, response, instructions, questionnaire
• Write in everyday terms
• Avoid unbalanced questions
• Avoid loaded questions
• Avoid double-barreled questions
• Write short, simple questions
• Ensure questions are relevant to research question
• Minimize social desirability bias
• Minimize recall bias