Experiments: Part 1 Slide 1 This lecture addresses the use of experiments in marketing research. I’ll talk about experimentation in general and about specific examples of marketing research experiments. Slide 2 There are three criteria for establishing a causal relationship between two variables. For example, let’s say those variables are advertising and sales. We may want to establish a causal relationship between advertising and sales, such that we can say “when advertising increases, sales increase, and when advertising decreases, sales decrease.” That’s an example of a direct relationship. In an inverse relationship, an increase in one variable causes a decrease in the other, and vice versa. If we’re trying to establish a causal relationship between advertising and sales, we must establish the following. First, there is concomitant variation: when the amount spent on advertising changes, dollar or unit sales change. Second, there is temporal ordering: we must show that sales increase or decrease after advertising increases or decreases, not vice versa. This matters because the causal arrow could run the other way; economists report that when sales increase, marketing managers take some of that increase and spend it on additional advertising. Let’s assume that advertising causes sales, so that a change in advertising causes a change in sales. We must establish that advertising changes always precede sales changes to establish the temporal ordering of those variables. Third, the effect is not due to other variables. To conclusively show that changes in advertising cause changes in sales, we must rule out other causes for changes in sales, such as changes in sales force, product configuration, and competitor behavior. We must eliminate all alternative causes of changes in sales to say conclusively that changes in advertising cause changes in sales.
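To make the first criterion concrete, concomitant variation can be checked with a simple correlation between the two series. The figures below are invented for illustration; a high correlation satisfies the first criterion, but by itself it says nothing about temporal ordering or third variables.

```python
import numpy as np

# Hypothetical monthly figures; the numbers are illustrative only.
advertising = np.array([10, 12, 15, 11, 18, 20, 16, 22])       # ad spend ($000s)
sales       = np.array([95, 102, 118, 98, 130, 141, 121, 150])  # unit sales

# Concomitant variation: do the two series move together?
# np.corrcoef returns a 2x2 correlation matrix; [0, 1] is r.
r = np.corrcoef(advertising, sales)[0, 1]
print(f"correlation between advertising and sales: r = {r:.2f}")
```

A strong positive r here is consistent with a direct relationship, but the same r would appear if sales drove advertising, or if a third variable drove both.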
Only after we’ve eliminated all alternative causes of changes in sales can we say that changes in the dependent variable (sales) depend on changes in the independent variable (advertising). Slide 3 More formally, let me define independent and dependent variables. The independent variable is the variable that the experimenter controls; in other words, the researcher can set the value of that variable in an experiment. For example, in an ad copy experiment, the researcher would control the versions of the ad copy that different study participants viewed, so this ‘value’ is manipulated by the researcher. In the case of ad copies, the manipulation involves multiple versions of the ad, but researchers can manipulate values of continuous as well as categorical variables. Slide 4 The researcher expects the dependent variable to vary in accord with the manipulated independent variable. For example, the researcher may believe that ad likability will depend on the ad study participants viewed. The criterion or standard by which results are judged is the dependent variable. For ad likability, the most likeable ad is the one that received the highest or best scores on ad likability. Slide 5 Although the test units of marketing experiments tend to be people, that’s not necessarily the case; the test unit could be an organization. For example, suppose a publisher of marketing research textbooks wanted to explore alternative approaches to promoting those books. The publisher might have its sales reps at one university use one promotional technique and its sales reps at a different university use a different technique, and then compare adoption rates. In this example, the test units are university marketing departments instead of people. Researchers must be clear on the test unit to be able to design appropriate experiments. Slide 6 What’s an experiment? An experiment is a research study conducted under controlled conditions.
Controlled conditions allow researchers to eliminate alternative explanations for the observed phenomenon. An experiment has at least one independent variable that the researcher manipulates. The effect of that independent variable on one dependent variable is measured. The purpose of an experiment, the purpose of worrying about controls, dependent variables, and independent variables, is to test a hypothesis in the most rigorous way possible. Slide 7 The basic issues that confront an experimenter relate to the way in which the independent variable is manipulated. In an advertising study, which alternative copies will be presented to respondents? In selecting the dependent variable for a copy-testing experiment that will measure ad likability, is it best to use attitudinal or physiological measures? The researcher also must properly assign subjects or other test units to treatments or conditions. That assignment must be random to control for alternative explanations for the scores received on the dependent variable. Slide 8 Even something as simple as a researcher smiling or frowning can influence the way people in an experiment behave. It’s important to control experiments as much as possible to eliminate all the alternative explanations for the results. Slide 9 Although not an exhaustive list of the ways researchers control for extraneous variation, this slide shows important ways researchers achieve this goal. One way is to hold the conditions of the experiment as constant as possible. For example, in an ad copy test, the researcher would prefer that all study participants view the alternative test ads under similar lighting and seating conditions.
If some people viewed one test ad while sitting on a comfy chair in a softly lit room, but other people viewed another test ad while sitting on a hard wooden chair in a harshly lit room, then the researcher couldn’t be certain whether different responses to the ads were attributable to differences in the viewing environment or differences in the ads themselves. It’s also critical that researchers ensure that study participants who have been assigned to the different treatments don’t differ systematically from one another. If they do, then the observed differences may be due to differences between the groups that were exposed to the different treatments, rather than differences between the treatments themselves. For example, if women tend to respond more favorably to a certain type of ad appeal, an inordinate number of women are assigned to view one ad, and an inordinate number of men are assigned to view another ad, then differences in the average group responses to the two ads may be attributable to the dominant sex in each group rather than differences between the ads. To avoid such confounding of results, researchers try, as much as possible, to randomly assign test units to treatment conditions. (That helps to eliminate many of the alternative explanations for experimental results.) When randomization is difficult, due to some environmental constraint, it may be necessary to approximate it by matching subjects across treatment conditions. Researchers may select a few key variables (like age, income, occupation, and sex) and try to match the groups in the experiment on those profile variables. Another way researchers control for extraneous variation is blinding. In a double-blind experiment, as in drug testing, neither the experimenter administering the treatment nor the subjects participating in the experiment know whether a given subject is receiving the placebo or the drug.
When people are unaware of whether they’re receiving the placebo or the drug, they are more likely to react naturally. Finally, presentation order may be important. Whether we ask people questions of a certain type first or second may determine their responses to those questions. When it’s necessary to control for presentation order, researchers run different groups with the presentation of different stimuli or different question sets reversed. Slide 10 It’s not a true experiment unless there’s a control group. The control group isn’t exposed to any of the manipulations, but it is exposed to the same environment and responds to the same set of dependent variable measures. If the control group is similar in all regards except exposure to the experimental manipulation, then its responses can be compared to the responses of the group(s) exposed to the manipulation. Consider the control group the baseline group against which all other, non-baseline groups are compared. Slide 11 This slide depicts the problems associated with not having a control group. In Case A, the researcher assumes that X has a positive influence on Y, but a third variable Z is simultaneously having a positive effect on both X and Y. As Z increases, both X and Y increase. If only X and Y are measured, it seems that if X increases, then Y increases. In fact, X and Y are unrelated to one another, but both are related to Z. Hence, it’s an erroneous inference that changes in X cause changes in Y, because it’s changes in Z that cause simultaneous changes in X and Y. In Case B, there’s an erroneous inference that X has no influence on Y. The reason is that the effect of Z on Y offsets the effect of X on Y. By not controlling for the effect of Z, it seems that X does not relate to Y in any meaningful way, when in fact it does relate meaningfully. In Case C, there’s an erroneous inference that X is solely responsible for changes in Y.
Again, Z is not measured, and Y changes when X changes; hence, the effect of X on Y is inferred. In fact, X is correlated with Z, and it’s Z that has an effect on Y. Such errors cause mistaken inferences that only one thing is the cause of something else, when in fact multiple things are the cause. A proper control group would eliminate these erroneous inferences, which is why it’s important that experiments have control groups. Slide 12 (No Audio) Slide 13 One important distinction between types of experiments is laboratory versus field experiments. In a laboratory experiment, study participants visit a centralized location that’s carefully controlled by the researcher, are exposed to some treatment (unless in a control group), and then have their responses measured. Laboratory experiments include concept tests, simulated test markets, product taste tests, advertising copy tests, and package tests. Field tests, in contrast, include store audits, home use tests, traditional test markets, and on-air ad testing. Slide 14 There’s a difference between laboratory and field experiments that relates to their relative artificiality. In terms of their environments, lab experiments are relatively artificial and field experiments tend to be more natural. Slide 15 As this slide shows, the relative advantages and disadvantages of laboratory and field experiments are the reverse of one another. Laboratory experiments, because they have high control, can eliminate most extraneous variables as an explanation for scores on the dependent variable. Hence, laboratory experiments have high internal validity, meaning most erroneous explanations for the experimental results have been eliminated. Laboratory experiments tend to be of lower cost and shorter duration (completed in a week or two versus multiple months for field experiments). Due to extensive control, laboratory experiments tend to have a higher signal-to-noise ratio.
As a result, far fewer study participants are needed to detect the effect of the independent variable on the dependent variable. Fewer participants also contribute to the lower cost and greater ease of running laboratory experiments. In contrast, the one major advantage of field experiments is their high external validity. Field experiments occur in a natural environment, so there’s greater comfort in generalizing field experiment results to what will occur in the real world. Slide 16 Demand artifacts are caused by experimental procedures that induce unnatural responses from study participants. Merely by being in a study and being asked to perform certain activities, participants will behave in ways that differ from the ways they’d behave if not constrained by the experiment and its treatment. There are well-known terms for the differing behavior of people who know they’re being observed for a study: guinea pig effects and Hawthorne effects. This is a major problem with laboratory experiments. Because study participants know they’re being observed and are being asked to do things that aren’t part of their routine behaviors or experiences, they tend to respond in unnatural ways. For example, participants in an ad copy experiment may be more attentive to the experimental task than they would be otherwise. Typically, viewing ads is a low-involvement activity; people aren’t highly engaged, so they don’t read every word of ad copy. However, when participating in an ad copy experiment, they suspect they’ll be asked questions about the ads they viewed, so they attend carefully to those ads. As a result, any questions a researcher asks study participants about an ad, such as how much they liked it or how informative they found it, may receive artifactual answers because participants were inadvertently encouraged to attend more closely to the ad. Slide 17 This slide is worth repeating.
One major concern about experiments is internal validity: the ability of the experiment to determine whether the treatment was the sole cause of the changes in the dependent variable. In other words, did the experimenter’s manipulation do what it was supposed to do? To the extent that this question can be answered ‘yes’, the stronger the belief that the experiment truly reflects the underlying process of interest. Slide 18 Without high internal validity, especially in a laboratory experiment, the study results become untrustworthy and perhaps useless. Experimenters must recognize and minimize threats to internal validity, such as history, maturation, testing, instrumentation, selection bias, and mortality. Slide 19 These next two slides provide excellent definitions and examples of the various threats to internal validity. ‘History’ refers to changes in the environment that are irrelevant to the effects of interest but may modify scores on the dependent variable. If a test market ran in a geographic area in which a major employer shut down, sales of the tested product might decline due to a reduction in disposable income, rather than a problem with the new marketing mix. ‘Maturation’ refers to changes in study participants that occur during the course of an experiment. Assume a company wants to test alternative copy treatments. Its researcher would show study participants one ad and ask them a series of questions; then expose them to a second ad and ask the same questions; and then expose them to a third ad and ask the same questions. By the time study participants are viewing and responding to that third ad, the task may have become dull and they may be a little tired. As a result, their responses to the third ad may differ from their responses to the first two ads because they’re tired, rather than because of something systematically true about the third ad.
That’s why presentation order is one way that experimenters can control for this effect; in this case, some subjects will see a given ad first, some will see it second, and some will see it third. Balancing the presentation sequence of the different ads can control for this maturation effect. Testing effects occur when measurement taken before study participants are exposed to a manipulation changes their later responses. Pre-exposure measurement may sensitize them to the nature of the experiment; as a result, they respond differently to subsequent post-exposure measurement. Suppose we want to run an experiment related to women in traditional gender roles. Before we expose participants to the manipulation, we administer a questionnaire about the traditional gender roles of women. Completing this questionnaire will encourage participants to think about traditional gender roles, which may sensitize them to the experiment and thus induce responses very different from those that would have occurred otherwise. Slide 20 ‘Instrument effects’ are a threat to internal validity because changes in the instrument itself may cause differences in responses. The example here relates to using alternative sets of questions to explore the same underlying notion. Perhaps the instrument is a human observer: one observer at one time and a different observer at a different time. Changing the observer means changing the instrument. As a result, instrument bias rather than fundamental before-and-after changes may have caused the observed result. With ‘selection bias’, the sample is not randomly assigned, or perhaps there’s no control group. Assume an experiment in which the control and experimental groups self-select. Without random assignment, it’s impossible to know whether observed between-group differences are due to the experimental treatment or to systematic differences in the self-selection process; for example, more women chose one treatment and more men chose another treatment.
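In contrast to self-selection, random assignment lets chance, not the participants, determine group membership, so the groups should not differ systematically. A minimal sketch (the subject IDs and the two ad treatments here are hypothetical):

```python
import random

# 20 hypothetical participants, identified as S01..S20.
subjects = [f"S{i:02d}" for i in range(1, 21)]

rng = random.Random(42)  # fixed seed so the split is reproducible
rng.shuffle(subjects)    # chance, not the subjects, decides group membership

# Split the shuffled list evenly between the two hypothetical ad treatments.
half = len(subjects) // 2
groups = {"ad_A": subjects[:half], "ad_B": subjects[half:]}

print(len(groups["ad_A"]), len(groups["ad_B"]))  # 10 10
```

With larger samples, this kind of assignment makes it unlikely that, say, one group ends up mostly women and the other mostly men; with small samples, matching on key profile variables can supplement it.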
With ‘mortality’, there may be something systematically different about the people who complete a long-running experiment. The example is based on subjects in one group of a hair-dyeing study: widows who moved to Florida. The widows who dropped out of the experiment may have been the most successful users of the hair dye; they remarried and were no longer attracted to the remuneration for study participation. Nonetheless, the researcher may conclude that the dye doesn’t effectively modify women’s lifestyles. Slide 21 This is such a good experiment that it’s worth mentioning again. Assume we were concerned about the ways that consumers used unit pricing information, and our goal was to encourage them to use that information in making purchase decisions. That would be our goal if we were a supermarket retailer, because the profit margins on the store brand are higher than on national brands. We believe that people who see and properly internalize unit pricing information are more strongly encouraged to buy the store brand. We want to determine the best way to present unit pricing information so it’ll be seen and properly internalized, so we conduct the following experiment. In this case, the test units are stores that are randomly assigned to one of two treatment conditions. In one treatment, the stores place unit pricing information underneath the shelf that holds the item in question (as it’s traditionally displayed). In the other treatment, the stores place a list adjacent to the shelves that hold the product category in question (such as peanut butter); the list is organized by unit price, with the lowest unit price listed first and the highest unit price listed last. Our goal is to measure store-versus-national-brand sales in the tested product categories. Let’s assume we’re experimenting with peanut butter sales. First, we record sales before any unit pricing information is provided.
Then, we provide the information through either shelf tags or a list of peanut butters ranked by unit price. Finally, we record peanut butter sales in these different stores. We might discover, as was the case in this experiment, that presenting unit pricing information in list form rather than under-the-shelf-tag form encourages far more people to use unit pricing information, because a list eases unit price comparisons. As a result, shoppers are more likely to buy the store brand because they’re more likely to recognize the savings associated with buying it. From a public policy perspective as well, this experiment showed that unit pricing information should be provided in list form; in that form, people are more likely to use it and make more informed purchase decisions. However, this form would reduce sales of national brands, so national brand manufacturers would strongly oppose this shift in consumption behavior. Given their power in the distribution channel, these manufacturers would pressure grocery retailers not to provide unit pricing information in list form. Slide 22 Although I’m more interested in your understanding what constitutes a trustworthy experiment than in your memorizing details about the different types of experimental designs, I believe a brief overview of experimental designs will prove helpful. The first and third columns of this slide indicate pre-experimental and quasi-experimental designs. In those cases, there’s no control group. Only true or statistically valid experiments have control groups, which allow researchers to claim that they have strong evidence for rejecting alternative explanations for the results of their experiments.
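As a rough sketch of how before/after results like those in the unit pricing experiment might be compared, consider the store-brand share of peanut butter sales in each treatment. The numbers below are invented for illustration only; the original experiment’s data are not reproduced here.

```python
# Hypothetical store-brand share of peanut butter sales (percentage points),
# before and after unit pricing information was introduced.
shelf_tags = {"before": 18.0, "after": 19.5}   # traditional under-shelf tags
price_list = {"before": 18.5, "after": 24.0}   # list ranked by unit price

# Lift within each treatment: change from the pre-period to the post-period.
lift_tags = shelf_tags["after"] - shelf_tags["before"]
lift_list = price_list["after"] - price_list["before"]

# Difference between the two treatments' lifts: how much more the
# list format shifted purchases toward the store brand.
print(f"shelf-tag lift: {lift_tags:.1f} points")
print(f"list lift:      {lift_list:.1f} points")
print(f"difference:     {lift_list - lift_tags:.1f} points")
```

Comparing lifts rather than raw post-period shares helps control for pre-existing differences between the randomly assigned groups of stores, in the same spirit as the before/after-with-control designs mentioned on the final slide.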