Reliability and Validity of Dependent Measures

Validity of Dependent Variables
Does it measure the concept?
- Construct validity: Does the DV really capture what you want to measure (is the operational definition good)? Or does it also pick up mood, cultural or gender bias, confusing wording, observational bias, etc.?

Indicators of Construct Validity
- Face validity: Does it appear to be a good measure? (Do experts think so?)
- Predictive validity: Does it predict later behavior? (Do GRE scores predict grad school success?)
- Concurrent validity: Do groups known to differ actually score differently? (Self-Monitoring)
- Convergent validity: Do other kinds of ratings agree? Similar scales should get similar responses.
- Divergent validity: Is it different from other constructs? (It measures intelligence, not SES or gender bias; shyness isn't loneliness.)
- Reactivity: Knowing you are being studied changes behavior.

Reliability of DV
Are results repeatable?
- All measurement contains a true score plus error of measurement.
- Not an issue of replication: the same subjects should get the same scores.

Types of Reliability
- Inter-rater reliability: calculate r between observers, or Cohen's kappa.
- Internal consistency: split-half reliability; Cronbach's alpha, which is roughly the average of all possible split-half correlations.
- Temporal consistency: test-retest reliability with the SAME people.
- Restaurant example

Can a variable be reliable and not valid? Valid and not reliable?
How do you know you have a good DV?
- Mental Measurements Yearbook

Validity of Experimental Designs
Survey Design

Internal validity
- Does the design test the hypothesis we want it to test?
- Did the IV manipulation cause the change in the DV?
- Can we infer causality?
- What if internal validity is low?

External validity
- Does your study represent a broad population?
- If external validity is weak, be cautious in the Discussion section.
- Random sampling
- Stratified sampling
- Block randomization

Ecological validity
- Does the study reflect the real world: do people really behave this way?
- Can you study anything without changing it?
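Two of the statistics named under Types of Reliability above, Cronbach's alpha and Cohen's kappa, can be sketched in a few lines of Python. This is only an illustrative sketch using the standard textbook formulas: the function names and the example ratings are made up for the demonstration and do not come from the lecture.

```python
from statistics import variance


def cronbach_alpha(items):
    """Internal consistency of a scale.

    `items` is a list of per-item score lists (hypothetical layout):
    items[j][i] is respondent i's score on item j.
    """
    k = len(items)
    n = len(items[0])
    # Total scale score for each respondent.
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement.

    Undefined (division by zero) when chance agreement is already 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    # Agreement expected if each rater coded at random at their own base rates.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Made-up example: two observers coding four behaviors as "y" or "n".
observer_1 = ["y", "y", "n", "n"]
observer_2 = ["y", "n", "n", "n"]
print(cohen_kappa(observer_1, observer_2))  # 0.5

# Made-up example: three scale items answered by four respondents.
scale_items = [[1, 2, 3, 4], [2, 2, 3, 5], [1, 3, 3, 4]]
print(round(cronbach_alpha(scale_items), 3))
```

The observers agree on 3 of 4 codes (75%), but half of that agreement would be expected by chance, which is why kappa comes out lower than raw agreement.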
Threats to Internal Validity

In a pre-post design:
- Test participants
- Administer IV
- Post test for effect of IV
- Compare pre vs. post results to look for effect of IV

History
- World events may cause change in attitudes or behavior over time.
- Tests of patriotism pre/post 9/11
- Views of the President pre/post Katrina
- Attitudes of adolescents pre/post Cobain suicide

Maturation
- Individuals change over time as they mature.
- An issue for studies of children, but there is also huge growth in freshman year: change of attitudes and behavior.

Testing
- Taking part in the study may itself cause differences in behavior.
- Similar to REACTIVITY, but for the entire study, not just the DV.
- Parenting study, for example

Instrumentation
- Use of the instrument may get better or worse with time.
- Observation studies
- Testing skill / interviewing

Regression toward the mean
- Extreme scores do not tend to be repeatable: those who score very high or very low on a test will be closer to the average if tested again.
- A big issue for any study where the pretest is used to select subjects for the post test.

Mortality
- Those who drop out of your study may differ from those who choose to continue.

Placebo effect
- Behavior may change after any treatment, even one that is not meaningful. (Fake drugs get some results.)

How can we improve internal validity?
- History
- Maturation
- Testing
- Instrumentation
- Regression toward the mean
- Mortality
- Placebo effect

Improved Design

In a pre-post design:
- Test participants
- Administer IV
- Post test for effect of IV
- Compare pre vs. post results to look for effect of IV

Two-group design:
- Pretest (do you need to do this?)
- RANDOMIZED assignment to levels of IV
- Compare post test results of IV and Control groups

Extraneous Variables
- Any variable that you have not measured or controlled (e.g., through random assignment, RA) that may impact the results of your study.

Demand Characteristics
- Participants behave in ways demanded by the situation or experimental set-up.
- Behavior does not reflect actual beliefs or attitudes.
- Demand characteristics are an issue of ecological validity.

Subject Bias
- Bias brought on by subjects' beliefs (Overhead of mood and menstrual cycle)

Social desirability
- Subjects want to do the "right thing": they try to guess what the experimenter wants and do not behave naturally.
- How to reduce subject biases?

Experimenter Bias
- Experimenters' behavior and expectations can sway the results of a test.
- How to reduce these biases?

Floor & Ceiling Effects
- If measures are too easy or too difficult, you will not see differences between groups.
- Pilot test with similar subjects!

Order effects
When using within-subjects designs, order of presentation can affect results in several ways.
- Practice effects: subjects get better at the task with successive trials.
- Fatigue effects: subjects get tired and do worse, or lose interest.
- Carryover effects: subjects' experience in one condition impacts results in another condition (subject bias, or anchoring and adjustment issues).

How to reduce order effects

Counterbalancing
- Does not get rid of the effects; it just makes them equal for all groups.
- Can do complete counterbalancing if there is a small number of conditions.
- Latin Square counterbalancing
- First order: write A, B, skip, C, skip, D, etc., then fill the skipped positions back from the end of the list: A, B, N, C, N-1, D, N-2, E, etc. Each later order moves every condition one letter forward (A becomes B, ..., the last becomes A).

A Latin Square for 6 conditions

Order   Sequence of conditions
1       A  B  F  C  E  D
2       B  C  A  D  F  E
3       C  D  B  E  A  F
4       D  E  C  F  B  A
5       E  F  D  A  C  B
6       F  A  E  B  D  C

Pretest vs. Pilot test
- When do you use a pilot test?
- When do you use a pretest?
- Can a DV be reliable but not valid?

Experimental Validity
- What to do if internal validity is low?
- What are the impacts of low external validity?
- What if ecological validity is low?
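The Latin Square fill rule given under Counterbalancing (A, B, N, C, N-1, D, ...) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name is hypothetical, it assumes an even number of conditions, and it assumes each later order is the first order with every condition moved one position forward in the list.

```python
def balanced_latin_square(conditions):
    """Build one presentation order per row (hypothetical helper).

    Assumes an even number of conditions. The first row follows the
    fill rule A, B, N, C, N-1, D, ...; each later row shifts every
    condition forward by one, wrapping around at the end.
    """
    n = len(conditions)
    if n % 2 != 0:
        raise ValueError("this sketch only handles an even number of conditions")

    # Index pattern for the first row: 0, 1, n-1, 2, n-2, 3, ...
    first_row = [0]
    low, high = 1, n - 1
    take_low = True
    while len(first_row) < n:
        if take_low:
            first_row.append(low)
            low += 1
        else:
            first_row.append(high)
            high -= 1
        take_low = not take_low

    return [
        [conditions[(index + shift) % n] for index in first_row]
        for shift in range(n)
    ]


# Six conditions, as in the table above.
for order_number, order in enumerate(balanced_latin_square(list("ABCDEF")), start=1):
    print(order_number, " ".join(order))
# First line printed: 1 A B F C E D
```

With this construction every condition appears once in each serial position, and each condition immediately follows every other condition exactly once across the six orders, so practice, fatigue, and carryover effects are spread evenly rather than removed.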