General Questions about Research Design: What Decisions Must I Make as a Researcher?
• Quantitative vs. qualitative
  • If quantitative, you have to decide on the degree of researcher intervention
    • Correlational: what are the precursors (enabling conditions, sufficient conditions) that account for variation in an existing outcome; manipulation is not a factor
    • Experimental and quasi-experimental: what are the effects of a treatment variable on a dependent variable
• Conceptual framework: what is the research tradition, the literature in which the work is grounded
• Who (what) will be studied

General Questions about Research Design: What Decisions Must I Make as a Researcher?, cont’d
• Plan for data collection and analysis
  • Data collection
    • Experiments: develop a plan for physical or statistical control of extraneous variables (e.g., exclude people of a certain age, or control for age)
    • Correlational: focus on design of measuring instruments and identification of possible alternative explanatory variables (you hypothesize that individualism/collectivism produces variation in attitudes towards technology; what other variables do you need to take into account or control for?)
  • Analysis
    • Correlation and regression for non-experimental designs based on uncovering patterns of association
    • t-tests, ANOVA, MANOVA for analyzing results of experiments and quasi-experiments

Experiments
• A classic experiment involves randomly assigning participants to treatment groups and control groups, or in some other way removing all possible differences between them other than the treatment to which they have been exposed (the manipulation), and then evaluating the outcome of the treatment
  • The assumption is that if all other between-group factors are controlled, any post-treatment differences between the groups can be attributed to (can be said to be caused by) the treatment
  • Randomization in theory should reduce the likelihood of any systematic source of variation between groups that could impact outcomes; the researcher does not have to specify every possible characteristic of the subjects which might be a confounding factor
  • Other physical control methods include matching or exclusion

Experiments, cont’d
• Independent (treatment) and dependent (outcome) variables
• Control variable (covariate): one whose potential effect can be confounded with effects of the treatment variable and must be removed through randomization or statistical controls, like the educational attainment variable in the example of the association between gender and employment category
  • Virtually any variable may occupy one of these roles in a study, and it is up to the researcher to clarify for any given study which variable is playing which role
• Internal validity: refers to the internal logic of the study, the ability to show that the putative causal impact of the IV on the DV is legitimate and not attributable to other extraneous and uncontrolled variables
• Ecological validity: the design is a good simulation of external circumstances and relationships among variables in the “real world”

Quasi-Experiments, “Natural Experiments,” and Non-Experiments
• Quasi-experiments involve comparisons between naturally occurring treatment groups (formed by self-selection or administrative selection). The researcher does not control group assignment or treatment, but has control over when and what to observe (DV)
  • Examples might be people in a face-to-face vs. distance education version of the same class; people who do or do not sign up to work at a polling place on election day; people named A-M versus N-Z
  • The researcher must rely on statistical controls (ANCOVA, partialling) to rule out extraneous variables which ordinarily would be controlled by randomization; these should be variables suspected of a relationship to either the DV or the IV

Natural experiments and nonexperimental designs
• “Natural experiments” typically involve before-and-after designs in which you look at a DV of interest before and after some phenomenon has occurred, for example, tying gyrations in the stock market to increases in oil prices or significant world events (sometimes classified as quasi-experiments)
• Non-experimental designs are basically cross-sectional studies which are correlational in nature, in which the researcher makes an effort to establish causal influence through measurement and statistical control of competing explanations

Some Issues in Experimental Design
• Typical experiment in which you attempt to compare two groups, one of which has been exposed to some sort of treatment (e.g., a message, watching a video, doing a task, etc.) and one of which has not, on some variable of interest (attitudes, beliefs, etc.)
• Some things that you have to think about
  • How can you be sure in advance that your two groups do not already differ on the variable of interest? (random assignment, analysis of covariance)
  • How can you be sure that your experimental manipulation is valid? (validation studies, post-experimental “manipulation check” during debriefing) Validity decreases in the research setting as the role imposed on the subject by the experimenter departs from his or her usual role in comparable behavior settings outside the lab
  • How do you know that your post-treatment measures of the variable of interest are reliable? (that they measure the individual’s “true score”)

Issues in Design of Research
• Reliability analysis is particularly important in studies involving pre/post test measures where change scores are computed (test-retest reliability can be affected by issues of memory, maturation, development, and random error)
• Replicability: have your manipulations and measures been chosen and implemented (and will you be able to report them) in such a way that other experimenters could replicate your experiment?
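
The random assignment mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomly_assign`, the participant IDs 1 to 20, and the seed value are all hypothetical. Fixing the seed makes the assignment reproducible, which also speaks to the replicability concern.

```python
import random


def randomly_assign(participant_ids, seed=None):
    """Shuffle participants and split them into two groups.

    With an odd number of participants, the control group gets the extra one.
    """
    rng = random.Random(seed)          # seeded generator: same seed, same assignment
    shuffled = list(participant_ids)   # copy so the caller's sequence is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical study with 20 participants numbered 1..20
groups = randomly_assign(range(1, 21), seed=42)
print("Treatment:", sorted(groups["treatment"]))
print("Control:  ", sorted(groups["control"]))
```

Because every participant has the same chance of landing in either group, the researcher does not need to list every possible confounding characteristic in advance; reporting the procedure (and the seed, if one is used) lets others repeat it.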
Some Issues in Experimental Design, cont’d
• Demand characteristics: features of an experimental setting or questionnaire which induce people to behave in an artificial way
  • In our society the roles of subject and experimenter are fairly well understood and carry with them mutual role expectations
  • Many subjects, particularly college students, expect that you might be “trying to prove something” and, depending on whether or not they like you, may try to help you prove it by altering their behavior to fit their understanding of what you “want them to prove”
    • Often their notions about your hypothesis are incorrect
  • People may also feel pressure to make a good showing for themselves by answering questionnaires in an untruthful way that makes them look good to the experimenter
    • Testing for social desirability effects

Issues in Design of Experiments, cont’d
• Sometimes the mere process of measurement (for example, taking a pre-experiment attitude survey) may induce change that will later be incorrectly attributed to intervening events, such as the experimental treatment (a message about the survey topic).
That is, simply asking questions about a topic may induce attitudes to change, even for the “don’t know” group
• People can improve their scores on tests just through repeated experience (intelligence tests, for example), even if they don’t get feedback as to right or wrong answers
• Experimenter characteristics: characteristics of the interviewer or experimenter can influence results, with different responses depending on the gender or race of the experimenter as well as changes in the E’s skill or interest
• Interpretations: experimenters can be influenced by early data returns in ...

Potential Sources of Confounding
• Method variance
  • Different experimenters for different conditions
  • Lack of uniformity in conditions of measurement (for example, letting some Ss “take home” a survey or giving them additional time to complete it; using an interview for some subjects, perhaps due to lack of literacy, and a paper-and-pencil measure for others)
  • Differences in instructional set (typically experimenters begin to memorize their instructions over time, but they then forget parts or say things differently to the first few subjects than they do to the last few)
  • Different recruitment strategies or different incentives for participation will result in different kinds of subjects, so be consistent in both regards within a single study

Correlational Studies
• The objective is, minimally, to be able to say that variation in an IV (call it “X”) accounts for (“explains”) variation in a DV (call it “Y”). Characterizing the IV as causal is a bigger leap: although X could cause Y, there is nothing in the correlational relationship to say that Y doesn’t cause X, or that both aren’t caused by Z
• In most social science research today it would be regarded as unsophisticated to present a simple correlational analysis in which one variable was proposed as accounting for variation in another
  • Techniques (multiple regression, SEM) exist which allow us to simultaneously investigate the separate and combined effects of multiple variables as they interact to account for variance in a DV

Correlational Analysis, cont’d
• Using multiple regression/SEM techniques we can systematically observe the contribution of individual IVs and various combinations of IVs and measure the extent to which they increase or decrease the amount of variance in the DV which can be explained
• Techniques like step-wise multiple regression allow us to include or remove predictor variables (IVs) in a progressive fashion to see how they add to or take away from a combination of variables in accounting for larger and larger portions of the variation in a DV
• Techniques like hierarchical linear modeling (HLM, also called multilevel modeling) allow us to account for the effects of nesting (e.g., individuals within couples, students within classrooms), so we can sort out the impact of any lack of independence of the observations we make on, say, a husband and wife, or among the students of the same classroom teacher

When to Use Measurement
• Measurement is appropriate when there is a quality or property which we know how to describe, which we think can be arrayed along a continuum of some sort with identifiable signposts that tell us at the very least whether a given instance of the property constitutes a little or a lot of it
• Further, measurement requires that the quality or property we propose to measure is comparatively stable over time (like height in middle-aged adults), or varies in a systematic way.
  • You might think that under that criterion you couldn’t measure weight, because we all know that we gain weight in mysterious and unpredictable ways, overnight
  • But in fact it’s a systematic process having to do with energy intake and energy expenditure that can be described as evidencing certain regularities

The view from the other side…
• There are certain paradigmatic objections to the whole notion of measurement, on the grounds that there is no objective reality, even if people behave as though there is; that behavior is socially constructed; and that dissecting the behavior stream and assigning it numbers (1, 2, 3, ... 7, 8, 9) is essentially throwing away data by reducing a phenomenon experienced analogically and holistically to a set of categories, no matter how fine the categories

Some Fundamentals of Measurement
• Assumption that a measuring instrument only provides indicators of, or clues to, an underlying trait which cannot be observed directly
• The measurement instrument should consist of items which constitute a representative sample of the universe of items which could be regarded as indices of the trait
• Basic types of scales
  • Likert (“you say lick-urt and I say like-urt”): strongly agree/strongly disagree responses to declarative statements, 5-7 scale steps
  • Thurstone: series of statements thought to represent equally spaced intervals of attitudes toward a target stimulus along a bipolar continuum, e.g., I am a very strong proponent of the legalization of marijuana; I am in favor of the legalization of marijuana as long as its sale is regulated like cigarettes; …; I am opposed to the legalization of marijuana except for pain relief by cancer patients; I am an opponent of the legalization of marijuana in all circumstances
  • Guttman: ordered series of statements which are arrayed in a pattern of increasingly polarized positions on an issue, such as acceptance of gay marriage. Subjects indicate which items they agree/disagree with.
Assumption that the scale is cumulative; that is, if you agree with a “very strong” item, you also agree with the less extreme items below it
  • Semantic differential scales: an object of judgment is evaluated against a set of rating scales (usually five to seven steps) with bipolar adjectives at either end, such as good-bad or friendly-unfriendly

Constructing a New Measure vs. Utilizing an Existing One
• For your purposes it would be to your advantage to use an existing instrument with published reliability and validity coefficients
• The best way to find these is through Googling for the construct keyword plus terms like “scale” or “measure” and perhaps restricting your search to PDF files, or to .edu domains. “Paper presented at” or “annual meeting” are good phrases to accompany the construct keyword
• PsycINFO is another very good source. Access this from an on-campus computer or home computer through the remote access portal. It’s found on the ISD Electronic Resources Page in the Quick Links menu.

Some Issues in Questionnaire Design if You Must Create Your Own
• Provide as clear a description of your construct as possible; then narrow it and narrow it some more
  • Consider how it will manifest itself under many different circumstances
  • Consider what other constructs it might be closely related to and how it differs from those other constructs (e.g., speech apprehension, shyness, social anxiety); trait vs. state
• Select a measurement type. For your purposes Likert-type scales or semantic differential scales will probably be sufficient
  • Likert scales would be used with declarative statements about the construct, e.g., “I thought that the person depicted in this video was very friendly” with a scale below it of 5-7 steps ranging from strongly agree to strongly disagree
  • Semantic differential scales would be used to elicit attributes associated with the construct or stimulus, e.g., for the person depicted in the video:
      Friendly    ----- ----- ----- ----- ----- ----- -----  Unfriendly
      Unpleasant  ----- ----- ----- ----- ----- ----- -----  Pleasant

Some Issues in Questionnaire Design if You Must Create Your Own, cont’d
• Some problems with response to questionnaires that you should keep in mind whether designing your own or using an existing one
  • People will develop “response sets” to respond to surveys in a particular way
    • They may like only the “neutral” category, or like to respond only on the right-hand side of the page because it’s easier (so generally we would rotate some items)
    • Fatigue factor: interest tends to decline as subjects become tired, and responses to later items may be less thoughtful, skipped, etc., so it is best to rotate the order of items
    • Order effects: primacy and recency effects (argues for counterbalancing)
    • Generally people are more likely to endorse a statement than to disagree with its opposite

Further Issues In Questionnaire Design
• Items which are ambiguously worded can produce what’s called an “acquiescence set”
  • Although people generally prefer to endorse strong statements rather than moderate or indecisive ones
  • Yet items with “always” or “never” tend to be rejected as too inclusive
• Subjects prefer round numbers (2, 4, 6, etc.) if response categories are numbered
• Some items may be culturally unacceptable. For example, in one study the item asked how frequently respondents have “the feeling I am going crazy.” Most of the 350 Vietnamese subjects refused to answer; they knew what was meant, but it was a strong norm to conceal or deny mental illness.
  • This is a feature of almost any population. People are likely to seriously under-report their socially undesirable behaviors and over-report their desirable ones.
The problem can be reduced by confidentiality assurances and by removing the survey process from the presence of an experimenter or perceived evaluator

Further Issues In Questionnaire Design, cont’d: Validity and Reliability
• To review from previous slides
  • Reliability: internal consistency of a measure; alternatively, consistency of measurement over time with the same subject, case, or instance. Test-retest; alpha coefficient
    • More on how to compute reliability in lesson on February 26
  • Validity: does the measure really assess what it claims to? Face, concurrent, predictive, construct (convergent validity and discriminant validity)
    • Convergent validity: your new measure of social anxiety is positively related to established measures of self-consciousness or shyness
    • Discriminant validity: your new measure of support for the death penalty is negatively related to an established measure of political conservatism

Further Issues In Questionnaire Design, cont’d: Validity and Reliability
• If you are using an existing measure, look for articles in which it is used and for which validity and/or reliability data have been collected and reported
  • Validity and reliability should be reported as coefficients ranging between 0 and 1. Typically you will see reliability reported in terms of an alpha value. This is a measure of the internal consistency of scale items and should be at least .8
  • Validity is more rarely reported by users, more often by the original scale developer

Collecting Data
• Try to define the population from which you will collect data and, if possible, create a sampling frame from which you can randomly sample to obtain respondents
• If a sampling frame for the population of subjects cannot be defined, then try to narrow the range of persons or entities (or other cases, such as geographical units, web sites, TV shows, etc.) to which you hope to generalize, and prepare a list and sample from it.
The point is to provide a replicable method and to give every element in the population of interest an equal chance of being included.
• Prepare a consent form or information sheet for dissemination to your respondents. Secure IRB approval for your project. Consult the syllabus or previous slides for links, templates, etc.
  • You would be likely to need a consent form only if you were planning to collect personally identifying information, use deception, subject them to some possibly harmful treatment, etc.
• Secure participants’ consent and provide a copy of the consent form/information sheet to them.
  • The consent form should be countersigned by you and a copy returned to the respondent (make two copies if you are collecting signed consent)
  • An information sheet can just be a separate page at the beginning of your questionnaire
• Collect all data under consistent conditions; no method variance allowed

Special Topics in Data Collection: Web Surveys
• Don Dillman is a leading researcher in the field of surveys. He has an excellent article with practical advice for constructing Web surveys. Here are the high points:
  • Realize that some users may not be able to complete questionnaires which use high-end programming techniques because of browser or computer limitations, so keep in mind the computing resources of your target population
  • Don’t up-end the usual method of filling out questionnaires that people are accustomed to in order to “take advantage of” the advanced design features of Web programming languages and multimedia
  • Some people may use Web questionnaires in mixed-mode situations (in combination with paper surveys, for example). While this invites method variance, it may be unavoidable.
If this is the case, take care to make the questionnaire format as consistent as possible across methods

More on Web Questionnaires
• Use a welcome screen to motivate and inform (and of course to obtain agreement to participate)
  • Emphasize ease of response and make it clear how to proceed to participation
• Begin the questionnaire with a fully visible, easy question
• Present questions in the same way they usually appear on paper questionnaires
• Keep line lengths short to avoid participants’ skipping words
• If particular computer functions are necessary to complete a question, explain them if you think it would be necessary for the type of respondent you expect. For example, with radio buttons explain that only one item can be checked. Drop-down menus are hard for some people to negotiate and tend to encourage people to choose wherever they land. Indicate if the available space exceeds the apparent space for an open-ended answer, etc.
  • Place these instructions near the question, not at the beginning

More on Web Questionnaires, 2
• Deal with the “forced answer” problem. Some IRBs (like USC’s) object to Web surveys that require that every item, or even some items, be answered, and want you to assure subjects that their participation is strictly voluntary, which may extend to being able to opt out of any question they don’t like. You could approach this in a couple of ways
  • Require only that they agree to the “consent item”: an item like “I have read the information sheet and I am willing to participate in this study”
  • Provide a “don’t wish to answer” alternative for every item, which you will later treat as missing data, but require them to indicate formally that they don’t wish to answer and don’t let them submit the form until they do
  • You could provide some gentle encouragement for really critical items.
For example, you could set required fields for the critical items, and if respondents don’t answer or select “don’t wish to answer,” they could receive a screen which suggests that their answer is really important and that if they could find a way to reply it would be helpful to the study. IRBs might differ as to how coercive this is, and you might not like it either

More on Web Surveys, 3
• In general, surveys should scroll from question to question unless they are really long.
  • For a long questionnaire, give the respondent an idea of how far along in the process they are from time to time. Use words or simple graphics
• Realize that “check all that apply” formats probably will yield different results than if you listed each potentially applicable item and put yes or no radio buttons next to it

Special Problems in Data Collection: Some Guidelines on Conducting Face-to-Face Interviews
• Check to see if your interviewee has completed the consent form process or has read the information sheet and agreed to participate
• Find a suitable, comfortable, quiet spot for the interview. It is important that it be comfortable for the interviewee, as fatigue can be a factor. The additional stimulus of another person increases the level of arousal during the interview compared to a paper-and-pencil questionnaire and can produce anxiety and fatigue; these can be major problems if you are interviewing the very old, or children, or interviewing people standing up or on their way to or from somewhere, so adjust the length accordingly
• During the interview, if the respondent begins to look tired or bored, it is appropriate to suggest taking a short break.

Interviews, continued
• During a long interview, look for signs of restlessness or impatience that may be indicators of fatigue. Look ahead to see how many questions remain, inform the participant, and ask if he or she needs a break or has to stop.
• Explain some general guidelines before you begin:
  • Tell the interviewees that you will mostly ask questions and write down their answers on a piece of paper.
  • For any question which is sensitive, such as one which asks about income, sex, drug use, etc., you can have them circle the answer on a sheet of paper so that they don’t have to say it aloud in front of you
  • Be prepared that some items may seem ambiguous and you may have to provide definitions. Give the same definition to anyone who asks

More about Face-to-Face Interviewing
• Sometimes in an interview when you are asking for scaled responses it is useful to provide printed response cards with the scale alternatives
  • Explain that you will point at the appropriate response scale and they can point at the alternative that they think best answers the question (e.g., strongly agree, agree, etc.)
• Let them know that they have a right to decline to answer any question, and that they don’t have to give a reason for why they don’t want to answer it.
  • There is potential for embarrassment with respect to questions about certain subjects. Look for signals that the participant is reluctant. These signals might be questions about the use of the data, clearing of the throat, behaviors indicating distraction, or nervous gestures. Participants should be reminded that they can skip any question.
• Let them know that they can stop the interview altogether at any time and they will still be compensated with whatever compensation you may have offered, or that you will still be able to “use their answers”

More on Face-to-Face Interviews
• What your demeanor should be like (and this advice applies to any data collection context, including experiments and survey administration):
  • Don’t be overly friendly to the interviewees, merely polite and pleasant; otherwise, they may try to give positive answers to please you
  • Don’t react to or comment on the interviewee’s answers
  • Don’t allow the interviewee to engage you in chat.
Say, “I’ll be happy to talk about that when we’ve finished the interview questions.”
  • Be encouraging but don’t force them if they are reluctant to answer
  • Keep any ideas about the hypotheses of the study or why certain questions are being asked strictly to yourself

Special Issues with Multilingual Populations: Major Problem Areas
• Lack of semantic equivalence across languages: problems of finding words or phrases in the target language that are the same or similar in meaning to those in the source language.
• Lack of conceptual equivalence across languages: the concepts which the researcher operationalizes in the source language may not exist in the same form in the target culture; these concepts may not even be part of their thinking.
• Norms which govern behavior may not be consistent across cultures
  • there may be more or less openness or willingness to talk about certain topics such as politics, money, family matters
  • in some cultures people may be very assertive, or be very willing to say positive things about themselves, while in others modesty is more normative
  • in some societies there will be a preference for giving positive responses to make the interviewer happy; in other cultures, the interviewer may be regarded with suspicion and deliberately not be given correct information
  • there is wide variation in how likely respondents from different cultures are to use the end-points of scales; in certain cultures respondents cling to the middle choices, while others prefer the extremes
  • the Western custom of repeating questions to ensure reliability may make the interviewer look stupid or as if she or he has a poor memory

Problems with Backtranslation Process
• You may decide you want to ask questions in a language other than English and decide to try to translate a questionnaire yourself
• It’s typical to take the original English version, translate it into, say, Spanish, and then have it backtranslated to English and compare the two English versions
  • Can be subject to false equivalence because the backtranslator is familiar with the research construct and can guess what the items should be
  • Can be subject to false equivalence because the backtranslator is familiar with the grammatical structure of the target language and can guess what the items should be
  • Be wary of literal translation of idioms and metaphors as they probably will not have the same connotation in the target language. For example, English expressions like “lending a hand” may not translate well.
  • Explore alternatives to literal translations that capture the intent but not the letter of the original
(Source: Behling and Law, Translating Questionnaires and Other Research Instruments.)

Debriefing: After Collecting the Data
• Debriefing is an important part of any study, no matter how you conduct it, although you may have to debrief by email or a special password-protected URL for a Web study
• Purpose of debriefing is twofold:
  • To attend to the post-study needs of subjects for information, assurances, giving feedback
  • To gain information that will help you assess the effectiveness of experimental manipulations and improve your design for subsequent studies
• Some things that should be covered in a debriefing:
  • Information provided to the subject
    • A brief account of the purpose of the study and what you hoped to learn
    • An explanation of any deception that may have occurred, particularly as it might affect the subject’s self-evaluation (e.g., bogus test feedback)
    • Promise to provide a copy of the results
  • Information gained from the subject
    • Open-ended or scaled responses to questions about experience as a participant in the study, including opinion of the experimenter (confederate)
    • Manipulation check
    • Agreement not to reveal the nature of the experiment to other potential subjects

Analyzing the Data
• We have already talked about statistical tests that are suitable for use with nominal (categorical) data.
We will continue to talk about what kinds of tests are used with what sorts of hypotheses
  • Contingency table tests (chi-square): already done
  • Single-sample tests of means and proportions (t): done some of this
  • Two-sample tests of means and proportions (t for independent samples, paired samples): doing next
  • Analysis of Variance and ANCOVA
  • Two-way Analysis of Variance
  • MANOVA
  • Correlation Analysis
  • Multiple Regression Analysis
  • Factor Analysis
  • Discriminant Analysis and Classification
  • Multidimensional Scaling and Cluster Analysis (maybe)
  • Reliability
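
As a preview of the reliability topic, the alpha coefficient mentioned earlier can be computed by hand. The following is a minimal sketch, assuming the standard formula for Cronbach's alpha, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are made up for illustration; a statistics package would normally do this for you.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    items = list(zip(*item_scores))                       # regroup scores by item
    sum_item_vars = sum(variance(col) for col in items)   # sample variance per item
    totals = [sum(row) for row in item_scores]            # each respondent's scale total
    return (k / (k - 1)) * (1 - sum_item_vars / variance(totals))


# Hypothetical data: 5 respondents answering a 3-item Likert-type scale
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The printed value can be compared against the .8 guideline given in the validity and reliability slides. Note that the formula is undefined when every respondent has the same total score, since the variance of the totals is then zero.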