Lab 8
PSY 395 Lab #8 – Data Analysis

Three basic data analysis approaches for many research questions:
(1) Correlation
(2) Regression
(3) Independent Samples t-test

For each approach:
- What is it?
- Hypothesis Testing
- When do you use it?

Method 1: Correlation

What is it?
- Numerical index that reflects the degree of linear relationship between two variables
- Also called the Pearson product-moment correlation
- Measures a linear relationship between 2 variables (usually continuous)
- Represented as $r$

Characteristics
1. Direction: The sign specifies the direction of the relationship (positive = direct relationship, negative = inverse relationship).
2. Form: typically used for linear relationships.
   [Figure: example plots labeled "linear" and "not linear!"]
3. Magnitude or strength: the degree to which the relationship between the two variables fits the form (typically a straight line). A perfect correlation has a value of +1 or -1 (so a correlation of -.78 is stronger than a correlation of .23, even though -.78 is negative).

A correlation is a ratio between the degree that 2 variables vary together and the degree that the 2 variables vary in different ways.

r = (degree that X and Y vary together) / (degree that X and Y vary separately)

or in mathematical terms:

$$r = \frac{\operatorname{cov}(x, y)}{\sqrt{\operatorname{var}(x)\,\operatorname{var}(y)}}$$

where cov = covariance and var = variance.

A perfect correlation is +1 or -1, which would mean that for every z-score change in the value of X there is an equivalent corresponding z-score change in the value of Y (+1 means the change is in the same direction; -1 means the change is in the opposite direction).

Hypothesis Testing
Sample statistic: $r$
Population parameter: population correlation $\rho$ (rho)

Step 1: Hypotheses & $\alpha$
The question of interest is whether the relationship obtained in the sample will hold in the population.
Two-tailed:
$H_0: \rho = 0$
$H_1: \rho \neq 0$
We will use $\alpha = .05$.

Step 2: Critical region
Use the table of critical values for the correlation. Need to know:
- 1- or 2-tailed test
- df
Example: $r_{crit} = .805$.
To reject $H_0$, we need to obtain $r > +.805$.

Step 3: Test statistic.
The obtained sample correlation is $r = +.90$.

Step 4: Evaluate $H_0$.
We reject the null hypothesis because $r > +.805$. We can conclude there is a significant correlation.

When do you use it?
- Interested in assessing the strength of association between two variables
- Sometimes there is no obvious distinction between independent variable and dependent variable
Examples:
- What is the relationship between self-esteem and extraversion?
- Is level of reading readiness related to intelligence in children?

Method 2: Regression

What is it?
Linear regression fits a line to a set of data and uses the equation for this line to predict future values on the variables. The relationship can also be represented by the following general equation:

$$Y = b_0 + b_1 X$$

$b_0$: Y-intercept
$b_1$: slope

Regression is like correlation in that positive correlations mean positive regression slopes (and negative correlations mean negative regression slopes). However, regression is used for predicting scores: it can be used to predict values of Y.

If the relationship between the # of hours of TV watched per week (X) and GPA (Y) is $Y = -.07X + 4.0$, you would predict that a person who watches 10 hours per week has a GPA of

$$Y = -.07(10) + 4.0 = 3.3$$

In order to find the regression line that best fits the data, we solve the following equation:

$$\hat{Y} = b_0 + b_1 X$$

$\hat{Y}$: predicted value of Y

There will be some error of prediction between $\hat{Y}$ and the actual value of Y, represented as the difference between actual and predicted values: $Y - \hat{Y}$

To estimate the total error, it is necessary to square the discrepancies and sum them to calculate the total squared error: $\sum (Y - \hat{Y})^2$

We need to find the linear equation that minimizes the total squared error. This equation is the least-squares error solution.

Hypothesis Testing
Sample statistics: $b_0$, $b_1$
Population parameters: $\beta_0$, $\beta_1$

Step 1: Hypotheses & $\alpha$
The question of interest often is focused on slope.
Two-tailed:
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
We will use $\alpha = .05$.

Step 2: Calculate t statistic

$$t = \frac{\text{sample statistic} - \text{hypothesized population parameter}}{\text{estimated standard error}}$$

$$t = \frac{b_1 - \beta_1}{S_{b_1}}$$

Step 3: Critical region
Use the table for the t statistic. Need to know:
- 1- or 2-tailed test
- df
Example: t-critical = 2.04

Step 4: Test statistic.
The obtained t-value is t = 3.36.

Step 5: Evaluate $H_0$.
We reject the null hypothesis because t > 2.04.

When do you use it?
- Related to correlation in that both are concerned with assessing the relationship between sets of paired data.
- Regression is often used when there is a clearer distinction between the independent and dependent variables (i.e., when you know Y, what is being predicted [the DV], and X, what is predicting it [the IV]).
- Also, correlation tends to be used when the primary interest is in assessing the strength of the relationship, whereas regression is used when the primary interest is in prediction.
- Regression is also often used when we have more than one independent variable.
Example: We’re interested in predicting extroversion from self-esteem scores.

Method 3: Independent Samples t-test

What is it?
- The independent samples t-test is used when we’re interested in the mean difference between two sets of data.
- Used when we have an independent-groups design: the study uses separate samples for each treatment condition (also known as a between-subjects or between-groups design).
- We want to know if the sample data support rejecting equal means between groups in the population. If this support is obtained, then we can conclude that the mean of one group is significantly different from the mean of the other group.

Hypothesis Testing
Sample statistic: $\bar{X}_1 - \bar{X}_2$
Population parameter: $\mu_1 - \mu_2$

Step 1: Hypotheses & $\alpha$
$H_0: \mu_1 - \mu_2 = 0$ (no difference between population means)
$H_1: \mu_1 - \mu_2 \neq 0$ OR equivalently $H_1: \mu_1 \neq \mu_2$ (there is a mean difference between groups)
$\alpha = .05$ and a two-tailed test will be used.

Step 2: Set decision criteria and locate critical regions.
Example: To reject $H_0$, the obtained t-statistic must be < -2.101 or > +2.101.

Step 3: Collect sample data and calculate the t-statistic.
The independent-measures t-statistic has the same basic structure as before.

$$t = \frac{\text{sample statistic} - \text{hypothesized population parameter}}{\text{estimated standard error}}$$

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$$

Step 4: Evaluate $H_0$.
Example: The obtained t-statistic is greater than +2.101. We reject $H_0$.

When do you use it?
You have a study with a dependent variable, and you are interested in the mean difference on that variable for two independent groups.
Examples:
- Do people who jog have fewer heart attacks than those who don’t?
- What is the effect of caffeine vs. no caffeine on test performance? (Note: this is a t-test if you had 2 groups, one that had a cup of coffee and one that had a glass of water. How could you make this a correlational study instead, correlating caffeine intake with test scores?)

Homework for Lab 8

1. Which of the three methods would likely be most appropriate for the following situations? Explain why.
a. An organization wants to evaluate whether there are differences between African Americans and Caucasians regarding whether an employment test is fair.
b. Research designed to assess the relationship between gender and approval ratings of President Bush.
c. The Michigan Highway Safety Commission wants to know whether the “Click It or Ticket” billboards are effective.
d. Research designed to examine the extent to which higher levels of education lead to higher levels of money given to charities.

2. Consider your final project:
(2a) At the construct level, what is the general relationship you are interested in (remember, focus only on two variables)? Which might be considered the IV or predictor, and which the DV or criterion?
(2b) Now look at each variable. List three different ways you could measure each variable.
Be specific (e.g., don’t just say “look at their behavior,” say “count the number of times an individual opens a door for another person in a day”).
(2c) What would you say to convince a granting agency that your study is important and worth funding (2-3 sentences max)?
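
Supplement: the test statistics for the three methods in this lab can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the lab itself. All data values are hypothetical, the function names are made up for this sketch, and two formulas are assumptions because the handout does not spell out how the estimated standard errors are computed: the slope standard error uses the usual textbook form sqrt(MSE / SSx), and the t-test uses a pooled-variance standard error.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    # Variance with n - 1 in the denominator.
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def sample_covariance(x, y):
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def pearson_r(x, y):
    # r = cov(x, y) / sqrt(var(x) * var(y))
    return sample_covariance(x, y) / math.sqrt(sample_variance(x) * sample_variance(y))

def least_squares_line(x, y):
    # Returns (b0, b1), the intercept and slope that minimize sum (Y - Y_hat)^2.
    b1 = sample_covariance(x, y) / sample_variance(x)
    b0 = mean(y) - b1 * mean(x)
    return b0, b1

def slope_t(x, y):
    # t = (b1 - 0) / s_b1 under H0: beta1 = 0, with df = n - 2.
    # Assumption: s_b1 = sqrt(MSE / SSx), the usual simple-regression formula.
    b0, b1 = least_squares_line(x, y)
    n = len(x)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mx = mean(x)
    ssx = sum((xi - mx) ** 2 for xi in x)
    return b1 / math.sqrt(sse / (n - 2) / ssx)

def independent_t(group1, group2):
    # t = ((M1 - M2) - 0) / s_(M1 - M2) under H0: mu1 - mu2 = 0, df = n1 + n2 - 2.
    # Assumption: pooled-variance estimate of the standard error.
    n1, n2 = len(group1), len(group2)
    pooled = ((n1 - 1) * sample_variance(group1)
              + (n2 - 1) * sample_variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled * (1 / n1 + 1 / n2))

# Hypothetical data: weekly TV hours (X) and GPA (Y) for five students.
tv_hours = [0, 5, 10, 15, 20]
gpa = [3.9, 3.7, 3.4, 2.9, 2.6]

r = pearson_r(tv_hours, gpa)
b0, b1 = least_squares_line(tv_hours, gpa)
t_slope = slope_t(tv_hours, gpa)

# Hypothetical data: test scores for a coffee group and a water group.
coffee = [82, 85, 88, 90, 95]
water = [78, 80, 83, 85, 84]
t_groups = independent_t(coffee, water)

print("r =", round(r, 3))
print("b0 =", round(b0, 3), "b1 =", round(b1, 3))
print("t for slope =", round(t_slope, 2))
print("t for group difference =", round(t_groups, 2))
```

Each t value would then be compared against the tabled critical value for its df (n - 2 for the slope, n1 + n2 - 2 for the two groups), following the steps described for each method.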