Chapter Seventeen
External Validity and Critiquing Experimental Research
PowerPoint Presentation created by Dr. Susan R. Burns, Morningside College
Smith/Davis (c) 2005 Prentice Hall

External Validity: Generalizing Your Experiment to the Outside
– Chapter 5 covered the concept of internal validity, which concerns the question of whether your experiment is confounded.
– The second type of evaluation that you must make of your experiment involves external validity.
– When you consider external validity, you are asking a question about generalization.

External Validity: Generalizing Your Experiment to the Outside
– External validity
  – A type of evaluation of your experiment that asks whether your experimental results apply to populations and situations that are different from those of your experiment.
– Generalization
  – Applying the results from an experiment to a different situation or population.
  – In essence, we would like to take our results beyond the narrow confines of our specific experiment.
  – Generalization is an important aspect of any science.

External Validity: Generalizing Your Experiment to the Outside
There are three customary types of generalization in which we are interested.
– Population generalization
  – Applying the results from an experiment to a group of participants that is different and more encompassing than those used in the original experiment.
– Environmental generalization
  – Applying the results from an experiment to a situation or environment that differs from that of the original experiment.
– Temporal generalization
  – Applying the results from an experiment to a time that is different from that when the original experiment was conducted.

Psychological Detective
Read the following sentence:
– The best way to exert control over factors is to conduct your experiment in a lab (or a similar setting) with participants who are highly similar.
Can you figure out why exerting control, which helps us in terms of internal validity, ends up weakening our external validity?

Threats to External Validity (Based on Methods)
Campbell and Stanley (1966) have provided a list of factors relating to external validity:
– Interaction of Testing and Treatment
  – A threat to external validity that occurs when a pretest sensitizes participants to the treatment yet to come.
  – Occurs for the pretest-posttest control group design.
  – Because of a pretest, your participants’ reaction to the treatment will be different.

Threats to External Validity (Based on Methods)
Campbell and Stanley (1966) have provided a list of factors relating to external validity:
– Interaction of Selection and Treatment
  – A threat to external validity that can occur when a treatment effect is found only for a specific sample of participants.
  – Occurs when the effects that you demonstrate hold true only for the particular groups that you selected for your experiment.
  – The selection-by-treatment interaction becomes greater as it becomes more difficult to find participants for your experiment (Campbell & Stanley, 1966).

Threats to External Validity (Based on Methods)
Campbell and Stanley (1966) have provided a list of factors relating to external validity:
– Reactive Arrangements
  – A threat to external validity caused by an experimental situation that alters participants’ behavior, regardless of the IV involved.
  – We cannot be sure that the behaviors we observe in the experiment will generalize outside that setting because the artificial conditions of the experiment do not exist in the real world.

Threats to External Validity (Based on Methods)
Campbell and Stanley (1966) have provided a list of factors relating to external validity:
– Demand characteristics
  – Features of the experiment that inadvertently lead participants to respond in a particular manner.
  – Demand characteristics make generalizations difficult because it is not clear from a set of research findings whether the participants are responding to an experiment’s IV, its demand characteristics, or both.
– Multiple-Treatment Interference
  – A threat to external validity that occurs when a set of findings results only when participants experience multiple treatments in the same experiment (repeated measures designs).

Threats to External Validity (Based on Our Participants)
The Infamous White Rat
– “There’s always an additional subject that screws up your data.”
– Sometimes it’s hard to tell which participants are more numerous in psychology experiments – lab rats or humans.
– If you are interested in the behavior of subhumans, generalizing from rats (and pigeons) to all other animals may be a stretch.
– If you are interested in generalizing from animal to human behavior, there are certainly closer approximations to humans than rats (and pigeons) in the animal kingdom.

Threats to External Validity (Based on Our Participants)
The Ubiquitous College Student
– Psychologists who want to conduct human research turn to a ready, convenient source of human participants – students in introductory psychology courses (a technique referred to as convenience sampling).
– Convenience Sampling
  – A researcher’s sampling of participants based on ease of locating the participants; often does not involve true random selection.

Threats to External Validity (Based on Our Participants)
The “Opposite” or “Weaker” or “Inferior” or “Second” Sex
– All four of these derogatory labels have been applied to women at various points in time.
– The supposed inferiority of women has carried over into some psychological theories.
  – Freud’s theories
  – Erikson’s theory of psychosocial crises (“Eight Stages of Man”)
– Carol Tavris’s (1992) thesis is that “despite women’s gains in many fields in the last twenty years, the fundamental belief in the normalcy of men, and the corresponding abnormality of women, has remained virtually untouched” (p. 17).

Threats to External Validity (Based on Our Participants)
Even the Rats and Students Were White
– Just as history has failed to record the accomplishments of many women throughout time, it has largely ignored the accomplishments of African Americans and other minority groups.
– When we conduct research and make generalizations, we should be cautious that we do not exclude minority groups from our considerations.
Even the Rats, Students, Women, and Minorities Were American
– Although experimental psychology’s early roots are based in Europe, this aspect of the discipline quickly became Americanized, largely due to the influence of John B. Watson’s behaviorism.
– In the mid-1960s, psychologists started taking culture and ethnicity more seriously. The field of cross-cultural psychology has evolved from those changes that began in the 1960s.

Threats to External Validity (Based on Our Participants)
– Cross-cultural psychology
  – A branch of psychology whose goal is to determine the universality of research results.
– Ethnocentricity
  – Other cultures are viewed as an extension of one’s own.

The Devil’s Advocate: Is External Validity Always Necessary?
Mook (1983) pointed out four alternative goals of research that do not stress external validity:
– We may merely want to find out if something can happen (not whether it actually happens).
– We may be predicting from the real world to the lab – seeing a phenomenon in the real world, we think it will operate in a certain manner in the lab.
– If we can demonstrate that a phenomenon occurs in a lab’s unnatural setting, the validity of the phenomenon may actually be strengthened.
– We may study phenomena in the lab that don’t even have a real-world analogy.

The Devil’s Advocate: Is External Validity Always Necessary?
– Replication
  – An additional scientific study that is conducted in exactly the same manner as the original research project.
  – When we replicate an experimental finding, we are able to place more confidence in that result.
– Replication with extension
  – An experiment that seeks to confirm (replicate) a previous finding but does so in a different setting or with different participants or under different conditions.

Guidelines for Critiquing Psychological Research Literature
Does the literature review adequately describe the research area? Is this material consistent with the specific research question?
– As a research project evolves, the literature review and the actual experiment diverge somewhat over time.
– After you complete your project and work on the report, double-check to make certain that the actual project still shows a direct link with your research literature.
– Because most researchers carry out programmatic research, their new research ideas are likely to build directly on their (and others’) previous research.

Guidelines for Critiquing Psychological Research Literature
Is the research question stated clearly? Do you have a clear idea concerning the research to be reported?
– The title and abstract of a research report should give you an indication of the research’s topic, although they may not contain the specific question per se.
– You will find the author’s review of relevant literature in the article’s introduction. As you read further into the introduction, the literature should apply more specifically to the particular research question. The research question will often be in the last paragraph of the introduction.

Guidelines for Critiquing Psychological Research Literature
In view of the research area and research question, are the hypotheses appropriate, clearly stated, and able to be stated in general implication form?
– Appropriate hypotheses are those that follow logically from the literature review.
– If you find a hypothesis that seems to come from nowhere and surprises you, it may be inappropriate – reread the introduction to make sure.
– A clearly stated hypothesis is one that you can easily understand without having to guess what the researcher is predicting.
– Remember that general implication form is the “if…then” format.

Psychological Detective
Why is the “if…then” approach of the general implication form important in phrasing a research hypothesis?
– It is the “if…then” approach to research questions that allows us to draw cause-and-effect conclusions – assuming that the researcher has done a good job in designing the experiment.

Guidelines for Critiquing Psychological Research Literature
Are the key terms operationally defined?
– The reader should not have to guess what a researcher means when he or she refers to a specific independent, dependent, or extraneous variable.
– Remember that operational definitions mean that you should define your variables in terms of the operations you use to manipulate, measure, or control them.
Are the IVs and their levels appropriate?
– Be sure to pick a manipulation that is actually appropriate to the IV – don’t choose something merely because it is easy or convenient to use.
– Be sure to choose the levels of your IV appropriately. Choose levels of the IV to answer your experimental question, but do so economically (remember the principle of parsimony).

Guidelines for Critiquing Psychological Research Literature
Is the DV appropriate for this research? Should the researcher have included more than one DV if only one was recorded?
– If a researcher wishes to study a particular outcome, the behavior chosen for measuring (the dependent variable) should be a good indicator of that outcome.
– The operational definition of the DV should be one that other researchers would judge to be valid.
– A researcher with broad interests should use multiple DVs to get a better sense of the concept he or she is measuring.

Guidelines for Critiquing Psychological Research Literature
Are the controls sufficient and appropriate? Are there any uncontrolled variables that could affect the results of the experiment?
– Leaving variables uncontrolled can result in a confounded experiment, which leaves the researcher unable to draw a conclusion.
– As you look for possible extraneous variables, you should concentrate on variables that have a legitimate or reasonable chance to actually make a difference.
– Look for extraneous variation, but don’t go overboard and find variation that most researchers would consider negligible.

Psychological Detective
Why is the question regarding control such an important question?
– Leaving variables uncontrolled can result in a confounded experiment, which leaves the researcher unable to draw a conclusion.

Guidelines for Critiquing Psychological Research Literature
Did the author(s) use an appropriate research design to test the specific hypotheses and answer the general research question?
– With poor planning, it is possible to gather data for which there is no appropriate research design and, thus, no appropriate statistical test.
– Make sure that research reports use designs that match the question(s) they sought to answer. For example, if the researcher asked a question involving multiple IVs, the experiment should involve a factorial design.

Guidelines for Critiquing Psychological Research Literature
Assuming you had access to the appropriate equipment and materials, could you replicate the research after reading the method section?
– The method section should contain enough detail about the variables and procedures of the experiment to enable a reader to replicate the experiment.
– The reader should not have to guess about any of the manipulations, measurements, or controls the researcher used.
– The reader must have all the vital details of the experiment in order to evaluate the operational definitions, the variables, and the procedures used in the experiment.

Guidelines for Critiquing Psychological Research Literature
Did the researcher(s) use appropriate sampling procedures to select the participants and assign them to groups?
– Random sampling and assignment in creating independent groups, or the appropriate matching or repeated measures approach for correlated groups, are important for both internal and external validity.
– If a researcher uses sampling techniques that result in biased samples, the internal validity of the experiment is threatened because the groups are likely to be different before the experiment.
– Biased samples also threaten external validity.
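
Illustration (not part of the original slides): the random assignment mentioned in the sampling guideline can be sketched in a few lines of Python. This is only one reasonable way to do it, under stated assumptions: the function name `randomly_assign`, the participant labels, and the seed are hypothetical and chosen for the example. The idea is to shuffle the participant list and then deal participants into groups, so that no characteristic of a participant can systematically determine the group he or she ends up in.

```python
import random


def randomly_assign(participants, n_groups, seed=None):
    """Shuffle participants, then deal them round-robin into n_groups groups.

    Hypothetical helper for illustration; the seed only makes the
    example reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    # Slice with a stride so group sizes differ by at most one.
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Twenty hypothetical participant IDs, split into two independent groups.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, n_groups=2, seed=42)

for label, members in zip(("control", "treatment"), groups):
    print(label, sorted(members))
```

Note that this sketch covers random assignment only. Random selection of the sample from a population is a separate step, and a convenience sample that is randomly assigned to groups can still limit external validity.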

Guidelines for Critiquing Psychological Research Literature
What procedures were used to ensure group equivalence prior to the experiment?
– Poor sampling techniques can result in biased samples.
– Biased samples are usually not equivalent before the experiment begins, so it would be impossible to draw valid conclusions about the effects of the IV (internal validity would be compromised).
– If you have reason to doubt the equivalence of your groups beforehand, you would be wise to use a pretest.

Guidelines for Critiquing Psychological Research Literature
Did the research use a sufficient number of participants?
– With small numbers of participants, statistical tests simply have less power to detect differences – the differences between groups have to be quite large to turn out significant.
– Don’t back yourself into a corner so that you use the age-old student lament after your experiment: “If I had run more participants, my differences might have been significant.”

Guidelines for Critiquing Psychological Research Literature
Were there any history, instrumentation, statistical regression, or mortality effects that might have influenced the results?
– For history, be alert to outside events that could affect the results.
– Be sure to check the operation of your equipment before each session to avoid instrumentation effects.
– Choosing extreme high- or low-scoring participants can result in lower or higher scores, respectively, simply due to statistical regression.
– If many participants drop out of one condition in the experiment (i.e., mortality), the participants who are left in that condition may differ in some important way(s) from the participants in other conditions.
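
Illustration (not part of the original slides): the statistical regression effect described above can be demonstrated with a small simulation. The sketch below rests on assumptions made purely for the example: each observed score is modeled as a stable "true" level plus independent random measurement error, and the means, standard deviations, sample size, and the name `simulate_regression` are all hypothetical. No treatment is applied between the two tests, so any change in the extreme groups reflects regression toward the mean alone.

```python
import random
import statistics


def simulate_regression(n=10_000, extreme_fraction=0.10, seed=0):
    """Compare extreme scorers on test 1 with their own scores on test 2."""
    rng = random.Random(seed)
    # Observed score = stable true level + independent measurement error.
    true_level = [rng.gauss(100, 10) for _ in range(n)]
    test1 = [t + rng.gauss(0, 10) for t in true_level]
    test2 = [t + rng.gauss(0, 10) for t in true_level]

    # Select the lowest and highest scorers on test 1 only.
    k = int(n * extreme_fraction)
    order = sorted(range(n), key=lambda i: test1[i])
    low, high = order[:k], order[-k:]

    def group_mean(scores, members):
        return statistics.mean(scores[i] for i in members)

    return {
        "high_test1": group_mean(test1, high),
        "high_test2": group_mean(test2, high),
        "low_test1": group_mean(test1, low),
        "low_test2": group_mean(test2, low),
    }


result = simulate_regression()
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions, the high scorers average lower on the second test and the low scorers average higher, even though nothing happened between tests. A critique should therefore ask whether a study that selected extreme scorers included a comparison group selected the same way.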

Guidelines for Critiquing Psychological Research Literature
Were the appropriate statistical tests used, and are they reported correctly?
– You may need to consult a statistics text or someone who teaches statistics to help you answer this question.
– On the other hand, this guideline points out the importance of becoming statistically knowledgeable so that you can evaluate this guideline on your own.
– Remember that statistics are merely a tool experimenters use to decipher the results they obtained – you should be well armed with the proper tools as you evaluate and conduct research.

Guidelines for Critiquing Psychological Research Literature
Did the author(s) report means, standard deviations, and a measure of effect size?
– Group means may allow the reader to compare participants’ performance against existing norms.
– Standard deviations may allow the reader to determine that nonsignificant findings are due to extreme variability between groups rather than small differences between means.
– Effect sizes give standard comparison units so that readers can compare significant differences from several different experiments.

Psychological Detective
Is there any difference between independent groups and correlated groups experiments in terms of internal or external validity?
– As long as you use the proper techniques for creating independent or correlated groups, there should be no difference between the two approaches as far as internal or external validity is concerned.

Guidelines for Critiquing Psychological Research Literature
Are the tables and figures clearly and appropriately labeled and presented accurately?
– Tables and figures should present a larger amount of data than is possible in writing.
– Just as paragraph after paragraph of statistical results can be confusing, a poorly constructed table or figure can confuse the reader.

Guidelines for Critiquing Psychological Research Literature
Do the author(s) correctly interpret the results? Does the discussion follow logically from the results?
– Did the researcher correctly interpret p < .05 as significant and p > .05 as nonsignificant?
– Did the researcher give a correct interpretation of his or her results in light of previous research?
– Does the discussion “make sense” given the data the researcher just presented?
– Authors should make it clear when conclusions follow from data and when they are engaging in speculation.

Guidelines for Critiquing Psychological Research Literature
Are the conclusions and generalizations valid and justified by the data? Did the author(s) consider other possible interpretations of the results?
– Difficulty in considering other interpretations often arises when researchers have a favorite theory that they espouse.
– Sometimes, this theoretical leaning is so strong that it seems to blind them to any alternative explanations.
– Alternative explanations for findings may provide you with the impetus for a new experiment.

Psychological Detective
How can you help yourself to consider alternative explanations for your experimental findings?
– Considering alternative explanations, particularly for published findings, is often difficult for novice researchers. The authors of your text recommend two approaches that may help:
  – First, play “devil’s advocate.” As you read a study, try to put yourself in that role – look for any aspect of the author’s explanation (no matter how small) that you disagree with.
  – Second, particularly in cases of your own research, have another person who is familiar with your project read your report. An unbiased eye can often find weaknesses in your arguments that you may have overlooked.

Guidelines for Critiquing Psychological Research Literature
Do all references cited in the text appear in the reference section, and vice versa?
– It is highly unlikely you would find this problem in a published study.
– There should be a one-to-one correspondence of the citations in the text and the references at the end of the study.
– The reference section of an APA-format report consists only of material that you have read and included in the report.

Guidelines for Critiquing Psychological Research Literature
Did the experimenter follow appropriate ethical procedures during all phases of the experiment?
– To evaluate this guideline, you may need to refresh your memory of the ethical principles that psychologists follow in conducting research.
– Some older research involves procedures whose ethical nature has been hotly debated.
– It is doubtful that any ethically questionable study would receive approval from an institutional review board.