Generalizing Results
Introduction to Experimental Psychology
Golden West College
Dr. Isonio

Review: types of validity
- Measurement validity: “Is this measure appropriately measuring what it is intended to measure?”
- Internal validity: “Are the effects observed on the dependent variable(s) uniquely attributable to the independent variable?”
- External validity: “Do the findings of this study apply to other populations, settings, and contexts?”
Today: external validity

External Validity
Focus: generalizability of results
Other aspects, some of which we have already considered:
- Reliability of the findings (statistical significance)
- Practical significance: importance and usefulness; do the results “make a difference”?
- Effect size: is the effect (difference) large or trivial, irrespective of statistical significance?

Representative Samples
- Sampling: probability samples assure representativeness, but in practice samples of convenience are most typically used.
- Sample size: all else being equal, more participants are better than fewer.

Typical Participants
- Non-human studies
  - Depends much on the variables being studied
  - Very common: white rats (male albino rats of the Sprague-Dawley strain)
- Humans
  - College sophomores
  - Approximately ¾ of all studies using human participants used college students

Human participants: the College Sophomore problem
Why use college students so often?
Stanovich: 3 reasons why doing so is not necessarily a problem

Reasons why it may not be a problem
1. Using college students does not invalidate findings; it only requires follow-up tests to check on generalizability
   - Cozby: generalization as statistical interaction
2. When basic psychological processes (e.g., perception, functioning of the nervous system) are being studied, college students are not unlike the rest of the population
   - Clearest exception: social psychology
   - Cultural differences; collectivist/individualist and field dependent/independent dimensions

Participant characteristics that can matter
- Sex
- Age
- Race / ethnicity
- Others: SES, family structure, education level

Other dimensions to consider
- Location: country, state, community
- Setting: college campus, laboratory, participant’s home

Reasons why it may not be a problem
3. College students are now a more diverse group
   - Yet at many colleges and universities they are still fairly homogeneous with reference to: intelligence, life experiences, attitudes, values, goals, self-identity development

Volunteers
Do they differ from the rest of the population?

The Volunteer Subject
Many studies have examined the characteristics of volunteer subjects and have shown that such subjects:
- Are more sociable
- Have a greater need for approval
- Are less authoritarian
- Generally are of higher social-class status
- Have a greater need for arousal / sensation-seeking
- Are less anxious
- Are more well-adjusted psychologically
- As students, earn better grades (perhaps indicating a higher level of achievement motivation)

Participant characteristics and type of study volunteered for
- People generally are much more willing to volunteer for studies on attitudes or personality than for those on learning, which might entail some type of punishment or harm
- Males: more inclined to volunteer for studies on hypnosis, sensory deprivation, or interviews on personal topics such as sex
- Females: generally more willing to volunteer than are males; prefer studies that don’t involve “unusual tasks or situations”

External inference: the bottom-line question
Does this study have anything to do with how variables relate in the world beyond the laboratory?
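
Worked example: statistical significance versus effect size
The distinction drawn earlier between the reliability of a finding (statistical significance) and its effect size, together with the point about sample size, can be illustrated with a small calculation. This is only a sketch under stated assumptions: the means, the pooled standard deviation, and the group sizes are hypothetical numbers chosen for illustration, not data from any study, and the function names are made up for this example. It computes Cohen's d (the standardized mean difference) and the independent-samples t statistic for two equal-sized groups.

```python
import math


def cohens_d(mean_a, mean_b, pooled_sd):
    """Standardized mean difference between two groups (Cohen's d)."""
    return (mean_a - mean_b) / pooled_sd


def t_statistic(mean_a, mean_b, pooled_sd, n_per_group):
    """Independent-samples t for two equal-sized groups sharing a pooled SD."""
    standard_error = pooled_sd * math.sqrt(2.0 / n_per_group)
    return (mean_a - mean_b) / standard_error


# Hypothetical summary statistics (illustration only, not real data).
MEAN_TREATMENT = 101.0
MEAN_CONTROL = 100.0
POOLED_SD = 15.0

# The effect size does not depend on how many participants were tested.
d = cohens_d(MEAN_TREATMENT, MEAN_CONTROL, POOLED_SD)

# The t statistic grows with the square root of the sample size.
for n in (20, 200, 20000):
    t = t_statistic(MEAN_TREATMENT, MEAN_CONTROL, POOLED_SD, n)
    print(f"n per group = {n:>6}: d = {d:.3f}, t = {t:.2f}")
```

With these assumed numbers the effect size stays the same small value at every sample size, while the t statistic rises as participants are added. A very large sample can therefore make a trivial difference statistically reliable, which is why effect size and practical significance are considered separately from statistical significance.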
Laboratory to world
External inference:
- Mundane realism: does the experiment “look and feel like” events in the real world?
- Experimental realism: are participants impacted by and engaged with the experiment; are they involved, and do they take it seriously?

Methodological limitations: use of a pretest
- Pretests, as helpful as they can be, can nevertheless limit the generalizability of the findings to populations that did not get the pretest, and can serve as a source of demand characteristics for participants
- Can use the Solomon four-group design to assess this
- Compare conditions with and without the pretest: does it make a difference?

Solomon four-group design
Group   Pretest   IV    Posttest
1       pretest   IV    posttest
2       pretest   -     posttest
3       -         IV    posttest
4       -         -     posttest
Here, we would expect groups 1 and 3 to differ from groups 2 and 4 due to the effect of the IV, but would not expect (or want) differences between groups 1 and 3 or between groups 2 and 4.

Methodological limitations: experimenters
- Personal characteristics of the experimenter(s)
- The concern is that the results might only apply to certain types of experimenters who behave in specific ways.

Generalization via Literature Reviews and Meta-analyses
Literature review: a written summary and synthesis of a large body of research in a given domain.
- Potential problems: is there a common measure? the file-drawer problem; which studies to include?
- e.g., Is schizophrenia a progressive neurodevelopmental disorder? (handout)
- e.g., Life events and bipolar disorder (handout)

Meta-analysis: a statistical evaluation of the strength and generality of a given effect
- Potential problems: judgments regarding emphasis and interpretation
- e.g., How children and adolescents spend time (handout)
- e.g., Gender differences in self-esteem (handout)

Generalization via Replication
- Direct (exact) replication
- Conceptual replication

Summary: Campbell & Stanley’s list of Threats to External Validity
- Interaction effect of testing
- Interaction effects of selection biases and the experimental treatment
- Reactive effects of experimental arrangements
- Multiple-treatment interference

Connectivity Principle
- Stanovich: the notion that there is a network of concepts in science that, collectively, constitute our understanding in an area.
- A new theory (or research finding) must connect to previously established facts
- Psychology operates more under the “gradual synthesis” model than the “great leap” model

How important IS external validity? How does it compare to internal validity?

The Importance of External Validity: Differing Views

We are not examining genuine behavior in realistic ways:
“In order to behave like scientists we must construct situations in which our subjects . . . can behave as little like human beings as possible and we do this in order to allow ourselves to make statements about the nature of their humanity” (Bannister, 1966, p. 24)

Artificiality is a critical problem:
“The greatest weakness of laboratory experiments lies in their artificiality. Social processes observed to occur within a laboratory setting might not necessarily occur within more natural social settings.” (Babbie, 1975, p. 254)

It is not a problem:
“The problem of external validity is often either meaningless or trivial, and a misplaced preoccupation with it can seriously distort our evaluation of a research study.” (Mook, 1983, p. 381)

The term “external validity” itself creates unrealistic and erroneous expectations:
“Who wants to be invalid--internally or externally, or in any other way? One might as well ask for acne. In a way, I wish we still used the term generalizability, precisely because it does not sound so good. It would then be easier to remember that we are not dealing with a criterion, like clear skin, but with a question, like ‘How do I get the sofa down the stairs?’ One asks that question if, and only if, moving the sofa is what one wants to do.” (Mook, 1983, p. 379)

When artificiality can be good/necessary
Mook, in defense of external invalidity:
- To demonstrate the power of a phenomenon: show that it occurs even under trivial, contrived conditions (e.g., aggression, with a gun as a stimulus cue)
- To use the lab setting to create a situation that does not have a counterpart in real life
- When we ask whether something can happen, rather than whether it does happen (e.g., extreme obedience)
- When a prediction from theory specifies something that ought to happen in the lab (even though it does not generally happen in the real world)

e.g., Consider aggression: read the scenarios in the handout. Does #2 have anything to do with #1?

Do artificial studies/measures differ from “real life” ones? The case of aggression
- Oral, written, and physical indices of aggression correlate at between .70 and .80
- Outside the lab: the male rate of assault and murder is 10x that of females; the difference holds for physical aggression but not for verbal hostility
- Inside the lab: males are much more physically aggressive; for verbal hostility, there are no strong differences
- Buss-Durkee Hostility Inventory: predicts aggression equally well in the lab and in the real world
- Type A pattern: more aggressive than Type B; holds inside and outside of the laboratory
- Media violence: associated with increased violence both inside and outside of the laboratory
- Anonymity/deindividuation: a strong precursor to violence both in the lab and outside of it
- High temperature: associated with greater hostility, both in the lab and out

Sometimes generalization is not the goal
- e.g., survey research when a specific population is targeted, such as all GWC students

Another perspective
Does psychological research conducted in artificial settings improve lives?
This is, in a sense, the ultimate “does it matter” question.
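
Worked example: reading a Solomon four-group result
The Solomon four-group comparison described under methodological limitations can be sketched numerically. This is only an illustrative sketch: the posttest scores below are hypothetical values invented for the example, and the variable names are assumptions, not part of any published procedure. Groups are numbered as in the design (1: pretest and IV, 2: pretest only, 3: IV only, 4: neither). The code compares group means only; a real analysis would also test these differences statistically.

```python
from statistics import mean

# Hypothetical posttest scores per group (illustration only, not real data).
posttest = {
    1: [14.0, 15.0, 16.0, 15.0],  # pretest + IV
    2: [10.0, 11.0, 9.0, 10.0],   # pretest, no IV
    3: [15.0, 14.0, 16.0, 15.0],  # no pretest, IV
    4: [10.0, 9.0, 11.0, 10.0],   # no pretest, no IV
}

group_mean = {group: mean(scores) for group, scores in posttest.items()}

# Effect of the IV: groups that received it (1, 3) versus those that did not (2, 4).
iv_effect = mean([group_mean[1], group_mean[3]]) - mean([group_mean[2], group_mean[4]])

# Effect of the pretest, checked separately with and without the IV.
pretest_effect_with_iv = group_mean[1] - group_mean[3]
pretest_effect_without_iv = group_mean[2] - group_mean[4]

# Does the size of the IV effect depend on having been pretested?
pretest_by_iv_interaction = (group_mean[1] - group_mean[2]) - (group_mean[3] - group_mean[4])

print(f"IV effect (1 and 3 vs. 2 and 4): {iv_effect:.2f}")
print(f"Pretest effect with IV (1 vs. 3): {pretest_effect_with_iv:.2f}")
print(f"Pretest effect without IV (2 vs. 4): {pretest_effect_without_iv:.2f}")
print(f"Pretest x IV interaction: {pretest_by_iv_interaction:.2f}")
```

In this made-up data set the IV difference is clear while both pretest comparisons and the interaction come out at zero, which is the pattern one hopes for: the pretest did not change how participants responded, so the findings are not limited to pretested populations.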