Writing Survey Questions “Items”

Overview
Characteristics of good survey items
Two basic kinds of questions
Some common item writing problems
Scale selection
Evaluating questions after you write them

It’s harder than you think
Despite your best efforts, you will probably write questions that raise more questions than they answer
There are no hard and fast rules, only a set of guidelines
Common sense is the real rule
CAVEAT of all survey research: surveys record second-hand information (communicated perceptions, opinions, beliefs). Direct observation is always better (but not always possible)

What will you measure
Always begin by asking yourself what you intend to measure or evaluate. This serves as the guide to what items you need to make a decision.
Ask yourself what covariates (blocking variables) you will need
  Common ones include age, gender, SES, race, and ethnicity, but you may identify others that are relevant to your study (Fowler, 1995)
  A guiding question should be “on what might the answers depend?” (Yovanoff, 2005)
More is usually better than less: you can always throw items away if you don’t need them, but you usually can’t go back and get more info

Basic Characteristics
There are some basic characteristics of questions and answers that are fundamental to good measurement (Fowler, 1995)
1. Questions need to be consistently understood: the same for each person
2. Questions need to be consistently administered: sometimes an issue with in-person interviews, less so with online surveys
3. Respondents need to be capable of understanding the question, and understanding how to answer
4. Respondents need to be willing to answer
In general, we want to maximize the degree to which a question produces answers that measure something

Two Basic Kinds of Questions
Those that aim to gather facts (objective)
  e.g., In what month were you born?
  e.g., What is your approximate annual income in thousands of CDN dollars?
Those that aim to gather opinion (subjective)
  e.g., Indicate on the following scale how much you liked last night’s math homework.
  e.g., What is the top reason you think an SDSU prof who works for UofO is in BC?

Questions to Gather Facts (e.g., month born, annual salary, etc.)
Begin by clearly defining what you want to measure, overall and with each item. This definition will guide you in item development.
Decide how you want the response to look, and on what scale (more on that later)

Questions to Gather Facts
Think of a common question you might see on a survey designed to gather a ‘fact’

Questions to Gather Opinion (e.g., liking homework, why SDSU for UO in BC, etc.)
When you gather opinion, you are gathering subjective data
There is no right or wrong answer
Sometimes these look like factual questions, but if different people could reasonably give different answers about the same thing, they are not
  e.g., How friendly is your teacher?

Open Ended Questions
They are needed when …
  You cannot put all possible responses on a scale, e.g., what are your work duties?
  You want to understand thinking, e.g., how would you solve this problem?
  You really don’t understand what the answer might look like, e.g., how did you become homeless?
Unlike scales or MC questions, open ended questions allow more respondent flexibility, but that flexibility is coupled with difficult analysis later
  People are not constrained to an area of response, so there is a lot of variability in the data
  This leads to subjectivity in coding / imposition of a scale

Common Item Writing Problems (and how to deal with them)
Multi-dimensionality
Ambiguous Stems
Response Restriction
Sensitive Items
Distorted Responses

Item Writing - Dimensionality
A single dimension in a question is usually best, both in the stem and in the possible responses
e.g., How would you rate your health?
  In this case “health” is ill-defined, and may mean different things to different people. Is it level of fitness? Absence of disease? Weight?
  This construct has multiple dimensions
e.g., How would you rate your teacher?
  a) smart and confident
  b) smart and not confident
  c) not smart and confident
  d) not smart and not confident

Item Writing - Dimensionality
Sometimes this is referred to as double barreled
In questions, watch out for coordinating conjunctions, because they are designed to join clauses
  “and,” “but,” “or,” “nor,” “for,” “so,” “yet”
e.g., Although the system of education in BC is of good quality, it really should not be mirrored in other provinces. Agree or Disagree?
  You may agree the BC system is good, or not good
  You may think it should be mirrored, or not

Item Writing - Ambiguous
Avoid ambiguous words
Be concrete, define key terms in questions
  Vague: Do you spend a lot of time studying in the Master’s Program?
  Better: On average, how many hours do you study each day?
  Vague: Do you usually prepare before class?
  Better: On average, how many hours do you spend preparing for class each week?

Item Writing - Restricted Response
During the semester, on average, how many hours do you spend preparing for this class each week?
  Less than 1 hour
  Between 1 and 2 hours
  Between 2 and 4 hours
  More than 4 hours
VS
During the semester, on average, how many hours do you spend preparing for this class each week?
  Enter response _____

Item Writing - Leading
Watch out for statements of supposed fact
  e.g., Overall, would you agree with most people that our federal government is corrupt?
  Better: Overall, is our federal government corrupt?
  e.g., This year, wild dogs killed more cats than in any other year, ever. Do you agree that dog ownership should be better regulated?
  Better: Should dog ownership be regulated?

Item Writing - Sensitivity
Simple fact: some sensitive questions will not elicit truthful responses
Many respondents will answer positively to avoid the question
  e.g., Do you love your children?
Better to triangulate with…
  How much time do you spend with your children?
  Do you play games with your children?
  How much do you cuddle your children?
  Have you ever cried because of your child’s behavior?
  Have you ever hit your child with a wooden bat?

Item Writing - Sensitivity
Take note: sensitivity is in the eye of the beholder. It is not so much the question as the way a person will answer
e.g., Have you been hospitalized in the last year?
  a. For those who have not, it isn’t sensitive: truthful response
  b. For those who have been to hospital for the flu, it probably isn’t: probably truthful response
  c. For those who have been to hospital for teenage pregnancy or an STD, it probably is: probably untruthful response

Item Writing - Sensitivity
Some studies seem to indicate that respondents tend to answer questions in a way that might make them look better to the surveyor or the public (Locander, Sudman, Bradburn, 1976)
  Tend to under-report disease when questioned about health (Cannell, Fisher, Bakker, 1965)
  Tend to report status quo when questioned about voting (Madow, 1967)
  Tend to highly under-report when questioned about masturbating (Sudman & Bradburn, 1982)

Item Writing - Sensitivity
To reduce sensitivity
  Assure confidentiality, use blind data analysis
  Better yet, don’t ask for names / personally identifying info (or do so at the end of the survey)
  Communicate the importance of accurate responses
  Reduce the role of the interviewer (if face to face)
  Reduce the amount of detail you ask for, so it is less obtrusive

Item Writing - Distortions
Outside of social desirability, there are two primary reasons we might not get an accurate answer (“distortions in answers”)
  The respondent may have forgotten
  The respondent may not have the information
Be aware of your respondents and how these factors may create measurement error. Try to design around them.
  e.g., lead into a question with a brief description
    Imagine yourself at home eating dinner as you answer this question
    Keep in mind that your responses are completely anonymous as you answer the following questions

Scales
With your question stem, you also need to design a response format. It might be a selection of choices or open ended, but often it is a scale.

Scale Selection
Your selection of response format (the scaling) is determined, in part, by the kind of analysis you might want to use
Nominal: words/names, with no order and no comparative meaning
  e.g., red, blue, green; Mark, Mary, Chris
  Good for descriptive data analysis only
Ordinal: ordered, less to more
  e.g., no experience, a little experience, some experience, a lot of experience
  Good for descriptive data too; lends itself to proportions analysis like % or chi-square
Interval (no true zero) / ratio (true zero): equal-interval scales
  e.g., months of experience
  Lends itself to parametric statistics: t-tests, correlations, regression, ANOVA, etc.

Scale Selection
No matter what scale you choose, that scale is open to subjectivity. This is a constant source of measurement error, and is difficult to quantify.
  e.g., if I ask you to rate this class on a scale from 1 to 10, one person’s 6 might be pretty good, where another’s 6 might mean just palatable.
Often, we use numbers, words, and pictures to let people express their feelings and opinions; each may be more or less appropriate in different settings

Response Scale Examples
Numbers
  e.g., on a scale of 1 to 10…
Words
  e.g., a lot, some, only a little, not at all
Pictures

Scales - the details
How many categories?
  Fewer categories lead to less variance (less data to analyze)
  But research points to fewer than 10 being adequate (Andrews, 1984)
  In truth, 5 or 7 is sufficient (a 10-point scale seems to be as much as people can handle)
  An even number forces a choice one way or the other
  An odd number permits sitting on the fence

Evaluating Questions
After creating questions, there are several methods we can use to evaluate their efficacy/quality
1. Focus group discussions
  e.g., did you like this question? Why or why not?
2. Interviews that focus on how a person answers a question (a cognitive approach)
  e.g., what were you thinking when you answered this question?
3. Field test under realistic conditions (pilot testing)
  e.g., give to 10 middle-school-age students first, before giving to the whole middle school

Evaluating Questions
We can also evaluate the degree to which our questions work by looking at data after they are administered
1. Relationships between responses that theory predicts (e.g., how much do you like your job? What is your level of job stress?)
2. Comparison of data from questions that ask almost the same thing (but are worded differently)
3. Comparison of answers to records (when data are available)
4. Consistency of answers from a respondent at two time points

Activity
With a partner or small group, design a good question
Now, make it bad in just one of the ways I covered
Come type on my computer when ready

Some Bad Questions
(10 numbered blanks for the questions the groups type in)

Activity - fix the questions
(10 numbered blanks for the fixed questions)

Another Activity (pick an area to evaluate)
Write 3-5 questions that fit your topic
In the same groups, review the items to make sure they are good.
Consider covariates (usually demographic variables) you might want – these are also called blocking variables
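
Two of the after-the-fact checks listed under Evaluating Questions (comparing two items that ask almost the same thing, and checking one respondent's consistency at two time points) can be tried with very little code. The following is a minimal sketch, not a prescribed procedure: the function names `pearson_r` and `exact_agreement` and the response data are made up for illustration, and treating 1-to-5 ratings as numbers for a Pearson correlation is a simplifying assumption that may not suit every ordinal scale.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numeric responses."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length, with at least 2 responses")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when every response is identical")
    return cov / sqrt(var_x * var_y)


def exact_agreement(x, y):
    """Proportion of respondents who gave exactly the same answer both times."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty lists of the same length")
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical data: the same 8 respondents rate the same item (1 to 5)
# at two time points. These numbers are invented for illustration only.
time_1 = [4, 5, 3, 2, 4, 1, 5, 3]
time_2 = [4, 4, 3, 2, 5, 1, 5, 3]

print("exact agreement:", exact_agreement(time_1, time_2))
print("correlation:", round(pearson_r(time_1, time_2), 3))
```

The same two functions work for the second check (two differently worded items answered by the same people): pass one item's responses as `x` and the other's as `y`. Low agreement or a weak correlation does not say which wording is at fault, only that the item deserves another look.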