Unit 2 Notes 2300
Jenna G. Tichon
Unit 2 Part 1
Objectives:
By the end of this class the student should be able to:
define basic sampling terms
explain common concerns when taking samples
identify sources of sampling and non-sampling error
suggest remedies to reduce non-sampling error
2.1.1 Basic Sampling Terms
Element
An object on which a measurement is taken.
Population
A collection of elements about which we wish to make an inference.
Sampling Unit
Nonoverlapping collections of elements from the population that cover the entire population.
(Sampling) Frame
A list of sampling units.
Sample
A collection of sampling units drawn from a frame or frames.
Note that a unit and an element may or may not be the same thing. If your elements are puppies and you’re sampling individual puppies at pet shelters, your units and elements are the same.
QUESTION: What could your sampling units be instead so that your elements and sampling units would not
be the same?
Suppose I wanted to survey adults in the City of Winnipeg about how often they wear their masks in public places. Our elements would be adults
and our population would be adults in the City of Winnipeg.
QUESTION: What might we use for sampling units?
QUESTION: What might we use for a sampling frame?
2.1.2 Basic Considerations for Sampling
Sample Size
We want to estimate a population parameter.
Our best guess is a sample statistic.
We realize every sample is different, but ideally we’d like to make sure our estimate θ̂ is within B of θ.
We need to live with some error in our lives or we’d need gigantic sample sizes, so we also want P(|θ̂ − θ| < B) = 1 − α.
Or, equivalently, P(|θ̂ − θ| > B) = α, i.e. we only “want” a big error 100(α)% of the time. (Type 1 error rate)
What sample size will we need for that?
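Here is a minimal Python sketch (the numbers are invented) of the usual large-sample calculation for estimating a mean, assuming we have a rough guess at the population standard deviation σ:

```python
import math
from statistics import NormalDist

# Minimal sketch with invented numbers: how big a simple random sample do we
# need so that P(|theta_hat - theta| > B) = alpha when estimating a mean?
B = 2          # bound we want on the error of estimation
alpha = 0.05   # acceptable Type 1 error rate
sigma = 15     # rough guess at the population standard deviation (assumed here)

z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 when alpha = 0.05
n = (z * sigma / B) ** 2                 # large-sample formula: n = (z * sigma / B)^2
print(math.ceil(n))                      # round up; prints 217 for these numbers
```

The point is just that B, α, and how variable the population is together drive the required sample size.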
Sampling Method
How are our elements spread out throughout our sampling frame? How easy is it for me to access them?
Easy to sample, no groupings of common elements. Maybe simple random sample?
Are there defined subgroups? Should we stratify?
Are our sampling units very geographically spread apart? Multistage?
Are there defined groups where there’s no big difference between groups but inside the groups it is diverse? Clusters?
Do I have an easily accessible list where the order is more or less random? Maybe systematic.
How much money do I have available?
We must always remember that sampling costs time and money. The “best” answer to every sampling question is to survey the whole population, but we can’t, so we are not concerned with “best” so much as best within practical, time, and monetary constraints. An impractical answer is as useless as a blatantly wrong answer.
Am I genuinely selecting my sample randomly?
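As a quick check on that last question, here’s a minimal Python sketch (the frame and sample size are made up) contrasting a simple random sample with a systematic sample drawn from the same frame:

```python
import random

# Minimal sketch with a made-up frame of 1000 unit labels and n = 50.
frame = [f"unit_{i}" for i in range(1000)]
n = 50

# Simple random sample: every possible set of n units is equally likely.
srs = random.sample(frame, n)

# Systematic sample: pick a random starting point, then take every k-th unit.
k = len(frame) // n
start = random.randrange(k)
systematic = frame[start::k]

print(srs[:3], systematic[:3])
```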
AT HOME: Read the Gallup statement at the end of section 2.3. As you’re reading the statement, what year does it sound like it was written? When you’re done, note the year in the citation and think about how that approach may be out of date or inappropriate today. Here are some ways that modern large opinion polling companies do their sampling:
Angus Reid (http://angusreid.org/how-we-poll-ari/)
A list of blog topics (https://news.gallup.com/topic/methodologyblog.aspx) by Gallup on modern survey methods at their company
In particular, an article by Gallup on the state of telephone surveys (https://news.gallup.com/opinion/methodology/225143/listening-statetelephone-surveys.aspx)
Pew Research Center (https://www.pewresearch.org/methods/u-s-survey-research/our-survey-methodology-in-detail/)
2.1.3 Error of Nonobservation vs. Error of Observation
Errors of nonobservation are related to our sample making up only part of the target population and errors of observation are related to what is
recorded about our sampling units being inaccurate.
Errors of Nonobservation
Sampling Error
The distance between the recorded statistic and the population parameter due to only collecting a sample of the population. (e.g. our
statistic changes between each sample merely because each sample is different, not because the parameter is changing.)
Undercoverage
When a sampling frame does not include the entire target population. (There can also be issues with the sampling frame containing units
not in the target population.)
Nonresponse
When you cannot collect measurement on selected units in your sample.
Our sampling error is something we have to live with as the price of being statisticians. Assuming we have 100% ideal conditions/compliance/measurement/sampling frame/etc… (ha!), we can at least control this by setting α.
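To see “the statistic changes between each sample” in action, here’s a tiny simulation with a made-up population: the parameter θ never changes, but each sample gives a different θ̂, and that gap is the sampling error.

```python
import random
import statistics

random.seed(1)

# Made-up population of 10,000 values; its mean is the fixed parameter theta.
population = [random.gauss(50, 10) for _ in range(10_000)]
theta = statistics.mean(population)

# Five independent simple random samples of size 100 give five different estimates.
for _ in range(5):
    sample = random.sample(population, 100)
    theta_hat = statistics.mean(sample)
    print(f"theta = {theta:.2f}, theta_hat = {theta_hat:.2f}")
```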
Issues with coverage are hard to correct after the fact, as there was a reason those units were not included in the original sampling frame in the first place. Responsibly, you should report what your sampling frame was and how it compares to your target population.
e.g.
U of M alumni vs. alumni organization’s mailing list
Eligible voters in Winnipeg vs. people on last year’s registered voters list
Households in Winnipeg vs. houses listed in the telephone book
Often those missing are missing for reasons that may make them important and unique parts of your population to survey. In particular, at-risk or low-income populations can be marginalized from participation in opinion polling.
We can broadly classify non-response into three causes:
An inability to physically reach a sampling unit. e.g. No internet connection, no phone line, no permanent address.
An inability of the sampling unit to give the correct response. e.g. They may not have the appropriate data available to them such as a
person not being able to say how much they’ve paid in GST over the past 3 months.
A person may refuse to answer the survey.
QUESTION: What are some reasons a person may refuse to answer a survey?
Errors of Observation
We can broadly group errors of observation as being due to:
Interviewers: Tone, age, gender, physical appearance, and demeanor of an interviewer can all affect how truthful people will be,
intentionally or unintentionally (e.g. Not wanting to tell a woman they don’t support changes to parental leave vs. Being influenced by a
perceived shameful tone in the way an interviewer reads a question.)
Respondents: Respondents might not understand questions, may not seek clarification, may be embarrassed (or fearful) to answer truthfully, may exaggerate, may make up answers to not appear uninformed, or may confuse units of measurement.
Measurement Instrument: Confusion around what the unit of measurement is or how something is defined. e.g. Does employed mean full time? Does “your children” include adult children? Step-children? Would you like me to quantify my commute time in minutes or hours?
Method of Data Collection: Accuracy can be affected by conducting personal interviews vs. telephone interviews vs. self-administered questionnaires vs. direct observation.
QUESTION: I gave a question several times to my STAT 1150 students asking how many keys they had on the
keyring with their house key. What do you think were some of the problems students had when deciding to
answer it?
2.1.4 Reducing Error
There are many ways research companies and researchers attempt to reduce errors in their surveys:
Callbacks: Making follow-up calls or sending reminder surveys (by mail or email) can help response rates. Follow-up calls should vary by
time of day and week to catch people on different schedules.
Rewards and Incentives: Surveys can offer monetary incentives for participating or put respondents into draws for a potential reward. With large survey companies that sample from standing panels, panel members may earn points towards gift cards or other rewards.
Interviewer Training: Interviewers should have opportunities to practice asking questions under watchful eyes that can suggest
improvements in intonation or pronunciation or demeanour that may get more truthful answers.
Data Checks: Data can be cross-referenced (e.g. age to year of birth), and obvious “wrong” answers can be eliminated or corrected by follow-ups if possible. (A small sketch of such a check follows this list.)
Questionnaire Construction: In our next lecture we will look at how questions can be constructed to get honest and truthful answers from respondents and to help keep people from giving inaccurate answers simply because they did not understand the question.
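As one example of such a data check, here’s a minimal Python sketch (the records and survey year are invented) that flags respondents whose reported age doesn’t line up with their reported year of birth:

```python
# Minimal sketch with invented records: cross-reference reported age against
# reported year of birth and flag anyone off by more than a year for follow-up.
survey_year = 2021  # hypothetical year the survey was run

records = [
    {"id": 1, "age": 34, "birth_year": 1987},
    {"id": 2, "age": 34, "birth_year": 1967},  # inconsistent: implies age about 54
]

for r in records:
    implied_age = survey_year - r["birth_year"]
    if abs(implied_age - r["age"]) > 1:
        print("follow up with respondent", r["id"])
```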
2.1.5 Summary
Summary
A well thought out survey considers both how to pick sampling units as well as how to question them.
We will always have sampling error we can’t control beyond fixing α but we should try to fix non-sampling
error.
We can have errors in both getting our sample and in getting our answers.
There are techniques that can be employed to help minimize non-sampling error.
2.1.6 References for Reading:
Textbook sections: 2.1 - 2.4
Slides for a talk on incentives in surveys (https://iriss.stanford.edu/sites/g/files/sbiybj6196/f/singer_slides.pdf)
An academic paper on whether incentives degrade data quality
(https://scholarworks.iu.edu/dspace/bitstream/handle/2022/23761/Does%20use%20of%20survey%20incentives%20degrade%20data%20quality
sequence=1&isAllowed=y). Longer read, fair warning.
2.1.7 Practice Problems:
Give some thought to textbook problems 2.1 to 2.7. Feel free to share ideas or thoughts on the forums for this class.
Unit 2 Part 2
Objectives:
By the end of this class the student should be able to:
design a questionnaire
word surveys to receive accurate and unbiased results
plan the stages involved in designing a survey
When designing a questionnaire there are a lot of things that may affect people’s answers. Unintentionally, or let’s hope not intentionally, answers can be swayed one way or another by the way the questions are worded or the survey is designed. Let’s look at some of the things that influence a survey:
2.2.1 Question Ordering
When a question has many options to choose between, there can be a primacy or recency effect.
Sanjeev and Balyan, 2014:
A primacy effect occurs when some respondents remember choices that appear first in a given list and
are therefore more likely to select these response options. It can also happen when an agreeable choice
is read from a list, because respondents may select it and move on, without reading through the entire
response list for a question. A recency effect, on the other hand, occurs when some respondents are
more likely to remember the last choices of a list, and are therefore more likely to select a choice from the
final part of a response list. This effect is much more pronounced when a response list has too many
options or the scaling is wide.
Randomizing the order of options amongst participants is a way to combat this (see the small sketch at the end of this section). For ordinal questions you can present the scale worst-to-best for some respondents and best-to-worst for others.
Similar questions can have an effect based on ordering, particularly if one goes from more general to more specific or vice versa. These are
called context effects.
The text gives an example about people being asked if they were happy in their marriage and if they were happy with their life in general. When life was asked about before marriage, 52% responded they were very happy with life in general. When marriage was asked about first, 38% responded they were very happy with life in general. The theory is that people felt so happy thinking about their marriages specifically that life in general seemed less great in comparison.
Pew Research, n.d.:
Another experiment embedded in a December 2008 Pew Research poll also resulted in a contrast effect.
When people were asked “All in all, are you satisfied or dissatisfied with the way things are going in this
country today?” immediately after having been asked “Do you approve or disapprove of the way George
W. Bush is handling his job as president?”; 88% said they were dissatisfied, compared with only 78%
without the context of the prior question.
Magelssen et al., 2016:
Assimilation effects entail that question order reduces differences in responses between adjacent
questions; in contrast effects, question order increases differences. Question order effects occur when
the thoughts and feelings triggered by a question carry over to the next question, thereby influencing the
response.
Another e.g.: A person who is given a long list of questions about crimes might respond differently to a question about whether they’ve been a victim of crime, as it primes them and gives them opportunities to remember things that have happened to them in the past.
Having questions written out or restated can help reduce the issues with question ordering or long lists of choices by making sure respondents refocus on the given question.
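Here is the small sketch of per-respondent randomization mentioned above (the response options are invented): each respondent sees the same choices in an independently shuffled order, so no single option always benefits from a primacy or recency effect.

```python
import random

# Invented response options; the master list stays in a fixed order.
options = ["Option A", "Option B", "Option C", "Option D", "Option E"]

def presentation_order(choices):
    """Return an independently shuffled copy of the options for one respondent."""
    shuffled = choices[:]
    random.shuffle(shuffled)
    return shuffled

for respondent_id in range(3):
    print(respondent_id, presentation_order(options))
```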
2.2.2 Open vs Closed Questions
Closed questions have a predetermined set of answers or a finite numerical answer (e.g. age or a pain rating from 1 to 10).
Open questions allow people to answer however they would like.
Pros and cons of each: closed questions are easier to analyze, but it is harder to capture all possible choices and they are subject to effects from question order. Respondents are more likely to suggest “other” answers with an open-ended question, which also implies some categories may get “over chosen” in a closed list.
(Pew Research, n.d.)
Sometimes a pre-survey is run to find the most common options for the real survey, to help capture what the public will actually answer as opposed to what the surveyors think they might answer. (Has Family Feud not taught us anything about the things people will suggest?)
2.2.3 Response Options
A forced choice question makes a respondent select a yes or a no, a one or the other.
Laur and Kennedy (2019) state the research is inconclusive on which is more accurate generally but research shows fairly consistently that
the rates of agreement are higher in forced choice.
In certain situations, however, it seems highly likely that people are more accurate when answering forced choice rather than “select all that apply” type questions. e.g. Someone is unlikely to report they’ve suffered from addiction when they haven’t (no benefit to doing so), but someone may report they haven’t suffered from addiction when they have (to not embarrass themselves), so the reporting method with the higher rate is likely more accurate.
They give an example of victimization rate questions. People were either asked, in a yes/no format, whether they had been victimized by each of six particular things (e.g. job loss), or they had to select all that applied from a single list. The rates were higher in the forced choice format.
2.2.4 Wording of Questions
Leading questions include extra information that purposely influences people towards a particular answer.
Pew Research, n.d.:
An example of a wording difference that had a significant impact on responses comes from a January
2003 Pew Research Center survey. When people were asked whether they would “favor or oppose taking
military action in Iraq to end Saddam Hussein’s rule,” 68% said they favored military action while 25%
said they opposed military action. However, when asked whether they would “favor or oppose taking
military action in Iraq to end Saddam Hussein’s rule even if it meant that U.S. forces might suffer
thousands of casualties,” responses were dramatically different; only 43% said they favored military
action, while 48% said they opposed it. The introduction of U.S. casualties altered the context of the
question and influenced whether people favored or opposed military action in Iraq.
Magelssen et al., 2016:
For instance, in a classic study carried out in the USA, 23 % of the public agreed that too little was being
spent on “welfare”, whereas 64 % agreed that too little was being spent on “assistance to the poor” [9].
The two terms were intended to describe the same policy, yet evidently evoked different judgments in the
minds of respondents.
Magelssen et al., 2016:
An Australian study investigated patients’ views on AD [assisted dying] with the aid of face-to-face
interviews in which all respondents were asked a set of questions describing AD in different ways [15].
The study demonstrated that question wording impacted on answers. In particular, to the question “Do
you support the idea of euthanasia?” 79 % answered yes; 70 % answered yes to “Do you believe that a
doctor should be able to assist a patient to die?”; and only 34 % gave an affirmative answer to “Do you
believe that a doctor should be able to deliberately bring about a patient’s death?”.
It is generally good to give someone two options in the wording rather than a straight “Do you favour…?” The text suggests, for example, “Do you favor or oppose the use of capital punishment?” over “Do you favor capital punishment?”.
Asking “Do you agree with…?” may make the interviewee feel like the interviewer thinks it’s agreeable and make them more likely to
respond with yes.
Only one question should be asked at a time. A question that addresses two ideas is called a double-barreled question. e.g. “Do you believe the IB program helped promote thinking about global issues and multiculturalism?”
Don’t use double negatives: e.g. avoid “Do you favour or oppose not allowing teenage drivers to drive alone after midnight?”
Recall measurement instruments from last class? You should be clear in writing out questions. “How much do you work a week?” could be better as “On average, how many hours a week are you paid to work?”
For in person interviews, a prop might be helpful to demonstrate height or volume.
Hospitals often give pain scales with descriptors for each number when asking questions of patients.
2.2.5 Planning a Survey
The text suggests the following series of steps as a checklist for a good questionnaire project:
Statement of objectives.
Target population.
The frame.
Sample Design.
Method of Measurement.
Measurement Instrument.
Selection and training of fieldworkers.
The pretest.
Organization of fieldwork.
Organization of data management.
Data analysis.
I would also add a step about summarizing and presenting your analysis and conclusions!
2.2.6 Summary
Summary
People are really easily influenced. Use this for good, not evil.
Consider wording of questions, what options you give for answers, and the order of your questions.
When reading results of an organization’s survey, always find out the question actually asked.
There are lots of steps to consider before giving a questionnaire.
2.2.7 References for Reading:
Sections 2.5 and 2.6 of the text
Pew Research Center. (n.d.). Questionnaire Design. Pew Research Center. https://www.pewresearch.org/methods/u-s-survey-research/questionnaire-design/
Lasorsa, D. (2003). Question-Order Effects in Surveys: The Case of Political Interest, News Attention, and Knowledge. Journalism & Mass Communication Quarterly, 80(3), 499–512. https://doi.org/10.1177/107769900308000302
Magelssen, M., Supphellen, M., Nortvedt, P., & Materstvedt, L. (2016). Attitudes towards assisted dying are influenced by question wording and order: A survey experiment. BMC Medical Ethics, 17(1), 24. https://doi.org/10.1186/s12910-016-0107-3
Sanjeev, M., & Balyan, P. (2014). Response Order Effects in Online Surveys: An Empirical Investigation. International Journal of Online Marketing (IJOM), 4(2), 28–44. https://doi.org/10.4018/ijom.2014040103
Laur, A., & Kennedy, C. (2019, May 9). When Online Survey Respondents Only ‘Select Some That Apply’. Pew Research Center. https://www.pewresearch.org/methods/2019/05/09/when-online-survey-respondents-only-select-some-that-apply
2.2.8 Practice Problems:
Consider your answers and reflect on questions 2.16 to 2.22 in the text.