Hawthorne Effect
The observed effect is due to the experiment itself: the subject is at the center of attention.
Can manifest itself as a spurt or elevation in performance or in the physical phenomenon being measured.
More of a problem when it operates differently in different cells of the experiment.
Solution: Add a control group to the experiment. Have them go through the same experimental procedure, but administer a placebo instead of the treatment.
Example: Testing a new design tool. Bring two groups into the lab and tell both of them you have an exciting new tool. Use your real tool with one group and the old tool with the placebo group.

Blind and Double Blind Procedures
• Medical terminology
• Blind Administration: When the subject does not know whether he/she is in the experimental or the control condition
• Double Blind Administration: When the above is true, and the experimenter also does not know which condition the subject is in (controls for expectancy effects)

Experimental terminology in Multifactor experiments
• Factor / Independent Variable (IV) / Treatment Condition: Directly manipulated in real experiments; selected in quasi-experiments.
• Levels of the IV: Each specific variation of the factor, e.g. the different font sizes
• Main Effect: The difference in the dependent variable (DV) between the different levels of the IV
• Interaction: Does the effect of one independent variable depend on the level of the other? Do they interact?
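The definitions of main effect and interaction above can be made concrete with a small calculation on a 2 x 2 table of cell means. The sketch below is only an illustration: the readability scores are assumed values for a font size by screen resolution design, and the names `cell_means` and `marginal_mean` are hypothetical helpers, not part of any library.

```python
# Illustrative cell means (assumed values chosen for this sketch).
# Keys are (font size, screen resolution); values are mean readability scores (the DV).
cell_means = {
    ("Size 12", "Low Resolution"): 3.0,
    ("Size 12", "High Resolution"): 4.0,
    ("Size 16", "Low Resolution"): 5.0,
    ("Size 16", "High Resolution"): 10.0,
}


def marginal_mean(factor_index, level):
    """Mean of the cells at one level of a factor, averaged over the other factor."""
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)


# Main effect of an IV: difference between its marginal means,
# i.e. the effect of that IV while ignoring the other IV.
main_effect_size = marginal_mean(0, "Size 16") - marginal_mean(0, "Size 12")
main_effect_resolution = (
    marginal_mean(1, "High Resolution") - marginal_mean(1, "Low Resolution")
)

# Interaction: does the effect of font size differ between the two resolutions?
size_effect_low = (
    cell_means[("Size 16", "Low Resolution")]
    - cell_means[("Size 12", "Low Resolution")]
)
size_effect_high = (
    cell_means[("Size 16", "High Resolution")]
    - cell_means[("Size 12", "High Resolution")]
)
interaction = size_effect_high - size_effect_low  # 0 would mean parallel lines

print("Main effect of font size:", main_effect_size)
print("Main effect of resolution:", main_effect_resolution)
print("Interaction contrast:", interaction)
```

A nonzero interaction contrast means the effect of font size is not the same at both resolutions. Whether any of these differences is statistically reliable would still need a test such as a two-way ANOVA, which this sketch does not perform.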
Effect of Font Size and Screen Resolution on Readability
• Main Effect-Size
• Main Effect-Resolution
• No interaction
[Figure: readability at Size 12 and Size 16, with one line for Low Resolution and one for High Resolution]

                   Size 12   Size 16   Row Means
Low Resolution     3         5         4
High Resolution    5         7         6
Column Means       4         6         5

Effect of Font Size and Screen Resolution on Readability
• Main Effect-Size
• No Main Effect-Resolution
• No interaction
[Figure: readability at Size 12 and Size 16, with one line for Low Resolution and one for High Resolution]

                   Size 12   Size 16   Row Means
Low Resolution     3         6         4.5
High Resolution    3.2       6.2       4.7
Column Means       3.1       6.1       4.6

Effect of Font Size and Screen Resolution on Readability
• Main Effect-Size
• Main Effect-Resolution
• Interaction
High sizes at High resolution have great readability
[Figure: readability at Size 12 and Size 16, with one line for Low Resolution and one for High Resolution]

                   Size 12   Size 16   Row Means
Low Resolution     3         5         4.0
High Resolution    4         10        7.0
Column Means       3.5       7.5       5.5

Effect of Font Size and Screen Resolution on Readability
• Main Effect-Size
• Main Effect-Resolution
• Interaction
[Figure: readability at Size 12 and Size 16, with one line for Low Resolution and one for High Resolution]

                   Size 12   Size 16   Row Means
Low Resolution     1         9         4.8
High Resolution    5         10        7.5
Column Means       2.8       9.5       6.1

• Main Effects: When we look at a main effect (the effect of one variable averaged over the other), we are ignoring the other variable
• Interaction: Concerned with the joint effects of both variables
When the lines are parallel, no interaction is present. In case of interaction, the lines will theoretically cross at some point.
Independent variables can be depicted on either axis.

Establishing a Cause-Effect Relationship

Temporal Precedence
• The cause happened before the effect. Real-life relationships between variables are never simple. Cyclical situations involving ongoing processes that interact are hard to interpret.

Covariation of the Cause and Effect
if X then Y
if not X then not Y
• If you observe that whenever X is present, Y is also present, and whenever X is absent, Y is absent too, then there is covariation between the two.
• For example: better website, more visitors; bad website, fewer visitors

No Plausible Alternative Explanations
• Covariation does not imply causation.
• Rule out alternative explanations (a third variable that might be causing the outcome).
• Referred to as the "third variable" or "missing variable" problem. Also at the heart of establishing internal validity.
• For example: a third variable (better company, more marketing) may lead to both a better site and more visitors

Hypothetical Case Study: Barnes and Noble site redesign
• Hired one of the famous “ient” web design companies to redesign the site
• Purpose: make online shopping easy and the site more attractive
• Paid a lot of money
• Did the site redesign work? Let's look at sales figures

Hypothetical Data
[Figure: Effect of Site Redesign on Online Sales; Sales for Old Site vs. New Site]
• Sales increased!

Problems with Deducing that site redesign worked
• Temporal relationship
• Covariation
• Alternative Explanations:

Reliability
• Replicability
• Ensure that random confounding factors are not playing a role

External Validity
• Related to generalizing. The degree to which the conclusions of your study would hold for other persons in other places and at other times.
• Sampling Model: Identify the population you would like to generalize to. Then draw a random sample from that population. You can then generalize your results back to it. Problems: time and place constraints

Threats to External Validity
• People: The results of your study could be due to the unusual type of people who were in the study.
• Places: Results may be limited to the experimental context. For example: if you conducted the study in an office atmosphere.
• Time: Results may be limited to the time period when you did your experiment. For example: a study on web interfaces in 1997
• Objects: In HCI, your results might extend only to similar objects / interfaces.

What is validity
• Validity refers to the operationalization or measurement of concepts.
• Any time you translate a concept or construct into a functioning and operating reality (the operationalization), you need to be concerned about how well you did the translation.

Internal Validity
Concerns inferences regarding cause-effect or causal relationships.
• Only relevant in studies that try to establish a causal relationship.
• Not relevant in most observational or descriptive studies.
Important for studies that assess the effects of certain changes to websites or to products.

Are there alternative explanations?
• Example: Amazon.com increased the number of tabs on its home page.
• Assume that a study showed that the increase in the number of tabs was associated with an increase in ease of navigation.
Alternative explanations:
• At the same time, Amazon.com launched a marketing campaign.
• The key question in internal validity is whether observed changes can be attributed to your intervention (i.e., the cause) and not to other possible causes (sometimes described as "alternative explanations" for the outcome).

Construct Validity
• Degree to which you can generalize back to the theoretical construct you started from.
• Construct validity can be thought of as a "labeling" issue.
• Real objective: to make the site easier to navigate. Operationalization: give users more options on each page by increasing the number of links. Does increasing the number of links really give users more options?

Kinds of construct validity
• Face Validity
• Content Validity

Face Validity
• Does the operationalization of the concept seem like a good translation "on its face", i.e. superficially speaking?
• The weakest way to try to demonstrate construct validity.
• For example: you can take a measure of math ability, read through the questions, and decide that it seems like a good measure of math ability (i.e., the label "math ability" seems appropriate for this measure).

Content Validity
• Check the operationalization against the relevant content domain for the construct.
• For example: you are trying to measure usability.
What are the subdomains of usability?
Efficiency
Attractiveness
Control
• Check your measure of usability against these domains

Research Designs

Single Group Experimental Designs
Repeated measurements are taken across time for one group.
Does not lend itself to clear statistical analysis and hypothesis testing
Cannot control for order effects; difficult to generalize
Can provide important information that we might not have access to through experiments

Randomized Group Experimental Designs
• This is what you want to aim for
• You have an experimental group and a control group. Randomly assign subjects to either group
• All sorts of causal inferences are possible

Quasi Experimental Design
• When you cannot control who gets assigned to which group
• For example: in an ex post facto study, the IV has already occurred, and you want to draw inferences.
• For example: You want to compare users of Palm Pilot and Handspring. You have no control over who goes into which group

Comparing Quasi-Experimental and Experimental designs
• The experimental design is as sound in both cases
• It is harder to make causal inferences with quasi-experimental designs, since the groups were not equal to start with
• You can do a pretest on the groups and do an analysis of covariance
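The pretest idea in the last bullet can be sketched as follows. This is a minimal illustration of the adjustment step behind an analysis of covariance, not a full analysis: it computes posttest means adjusted for pretest differences using the pooled within-group slope, and it leaves out the significance test. The scores are hypothetical, and `groups`, `pooled_within_slope`, and `adjusted_means` are names made up for this example.

```python
from statistics import mean

# Hypothetical pretest / posttest scores for two pre-existing (non-randomized) groups.
groups = {
    "group_a": {"pre": [10, 12, 14, 16], "post": [15, 17, 19, 21]},
    "group_b": {"pre": [14, 16, 18, 20], "post": [16, 18, 20, 22]},
}


def pooled_within_slope(groups):
    """Slope of posttest on pretest, pooled over the within-group deviations."""
    sxy = 0.0
    sxx = 0.0
    for scores in groups.values():
        x_bar = mean(scores["pre"])
        y_bar = mean(scores["post"])
        for x, y in zip(scores["pre"], scores["post"]):
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


def adjusted_means(groups):
    """Posttest means adjusted to the grand pretest mean (the covariate)."""
    slope = pooled_within_slope(groups)
    grand_pre = mean([x for scores in groups.values() for x in scores["pre"]])
    return {
        name: mean(scores["post"]) - slope * (mean(scores["pre"]) - grand_pre)
        for name, scores in groups.items()
    }


raw = {name: mean(scores["post"]) for name, scores in groups.items()}
adjusted = adjusted_means(groups)
print("Raw posttest means:", raw)
print("Adjusted posttest means:", adjusted)
```

In this made-up data, group_b has the higher raw posttest mean, but it also started with higher pretest scores. After the adjustment, group_a comes out ahead, which is the kind of correction a pretest makes possible when the groups were not equal to start with.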