Chapter 9: Linear Correlation

What is a Perfect Positive Linear Correlation?
– It occurs when everyone has exactly the same score on two different variables.
– It also occurs when everyone's score on one variable is a constant amount above, or a constant positive multiple of, their score on the other variable (e.g., everyone's final exam score is exactly 10 points higher than, or exactly twice, their midterm score).
– More generally, it occurs if everyone occupies the same relative position in the distribution of one variable as in the distribution of the other (i.e., if everyone has the same z score on both variables).
– Perfect negative correlation occurs when everyone has z scores of the same magnitude on both variables, but with opposite signs (e.g., if z = +1.5 for one variable, that person's z will be –1.5 for the other variable).

Chapter 9 For Explaining Psychological Statistics, 4th ed. by B. Cohen

The Pearson Correlation Coefficient (r)
– Pearson's r ranges from –1.0 to +1.0, where
  • +1.00 = perfect positive correlation
  • –1.00 = perfect negative correlation
  • 0 = a total lack of linear correlation
– A formula that illustrates the relationship between z scores and Pearson's r is the following:
  $$r = \frac{\sum z_x z_y}{N}$$
– The magnitude of r represents the strength of the correlation, while its sign represents the direction of the correlation.

Graphing a Correlation: The Scatterplot
[Figure: scatterplot, correlation coefficient r = +1.0]
[Figure: scatterplot, correlation coefficient r = –1.0]

The Graph of a Correlation That Is Less Than Perfect
[Figure: scatterplot, correlation r = .83]
[Figure: scatterplot, correlation r = –.83]
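The z-score formula for Pearson's r, and the claims about perfect correlation on the first slide, can be checked with a short Python sketch. The midterm scores below are hypothetical, and the helper names `z_scores` and `pearson_r_from_z` are illustrative only. The sketch assumes the z scores are computed with the biased (population) standard deviation, which is what dividing by N in the formula requires.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Convert raw scores to z scores using the biased (divide-by-N) SD."""
    m = mean(values)
    sd = pstdev(values)  # population standard deviation
    return [(v - m) / sd for v in values]


def pearson_r_from_z(x, y):
    """Pearson's r as the mean cross-product of paired z scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical midterm scores (not from the slides).
midterm = [60, 72, 85, 91, 78]

# Final exam exactly 10 points higher than the midterm for everyone.
final_plus_10 = [score + 10 for score in midterm]
# Final exam exactly twice the midterm for everyone.
final_doubled = [2 * score for score in midterm]
# A variable that reverses everyone's position (same z, opposite sign).
mirrored = [-score for score in midterm]

print(pearson_r_from_z(midterm, final_plus_10))  # very close to +1
print(pearson_r_from_z(midterm, final_doubled))  # very close to +1
print(pearson_r_from_z(midterm, mirrored))       # very close to -1
```

Adding a constant or multiplying by a positive constant leaves every z score unchanged, which is why both transformed variables correlate perfectly with the original.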
Calculating Pearson's r
• Computing formula in terms of population standard deviations:
  $$r = \frac{\dfrac{\sum XY}{N} - \mu_X \mu_Y}{\sigma_X \sigma_Y}$$
  – The numerator is the biased estimate of the covariance. Dividing the covariance by the product of the biased standard deviations ensures that r can never fall outside the range –1.0 to +1.0.
• Computing formula in terms of the unbiased covariance estimate divided by the unbiased standard deviations:
  $$r = \frac{\dfrac{1}{N-1}\left(\sum XY - N\bar{X}\bar{Y}\right)}{s_X s_Y}$$
  This formula always yields the same value for r as the preceding formula based on the biased covariance.

• Try this example…
Is education about other ethnicities correlated with tolerant attitudes towards others?

  Education Score (X)   Tolerance Score (Y)   XY
  25                    3                     75
  25                    9                     225
  33                    14                    462
  35                    11                    385
  38                    13                    494
  36                    14                    504
  31                    12                    372
  29                    12                    348
  22                    9                     198
  41                    14                    574
  Sum: 315              111                   3,637

  $\bar{X}$ = 31.5   $\bar{Y}$ = 11.1   N = 10
  $s_X$ = 6.222   $s_Y$ = 3.414
  $\sigma_X$ = 5.903   $\sigma_Y$ = 3.239

  $$r = \frac{\dfrac{1}{10-1}\left(3{,}637 - 10 \cdot 31.5 \cdot 11.1\right)}{6.222 \cdot 3.414} = \frac{\dfrac{1}{9}\left(3{,}637 - 3{,}496.5\right)}{21.242} = \frac{15.611}{21.242} = .735$$

Testing Pearson's r for Significance
– H0: ρ = 0; df = N – 2
– Using the t distribution:
  $$t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}} = \frac{.735\sqrt{10-2}}{\sqrt{1-.540}} = \frac{2.079}{.6782} = 3.07$$
– The critical t.05(8) = 2.306 < 3.07, so we can reject the null hypothesis for the correlation example from the previous slide.
– Using the table of critical values:
  • df = N – 2 (where N is the number of pairs of scores).
  • Critical r for df = 8 is .632, for a .05, two-tailed test. Because .632 < .735, H0 can be rejected (consistent with the t test).
– Effects of N on Critical r
  • For small N, you need a fairly large r to find significance.
  • For very large N, even tiny sample rs can attain statistical significance.
  • As N increases, r is a better estimate of ρ.

Assumptions of the Test of Significance for Pearson's r
– The sample has been obtained by independent random sampling.
– Both variables have been measured on interval or ratio scales.
– Both variables exhibit normal distributions in the population.
– The two variables jointly follow a bivariate normal distribution (see next slide).
– If one or both variables have been measured on an ordinal scale, or the distribution assumptions have been severely violated, consider calculating the Spearman rank-order correlation coefficient (rS) as an alternative (a special table must be used for its critical values when N is small).

[Figure: illustration of a bivariate normal distribution]

Limitations and Cautions
• Pearson's r measures only the degree of linear correlation (r can be small even though there is a very close curvilinear relationship between the two variables).
• Pearson's r can underestimate the population correlation (ρ) if there are:
  – Restricted (truncated) ranges on one or both variables.
  – Bivariate outliers.
• Correlation does not imply causation!

Uses of Pearson's r
– Reliability
  • Test-retest reliability
  • Split-half reliability
  • Inter-rater reliability
– Criterion validity (e.g., of a self-report measure).
– To measure the degree of linear association between two variables that are not obviously related, but are predicted by some theory or past research to have an important connection.
– To evaluate the results of experimental studies in which both the DV and the levels of the manipulated variable (IV) have been measured on interval or ratio scales.
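Returning to the education and tolerance example, the hand calculation of r and its t test can be reproduced with a minimal Python sketch. The data are the ten score pairs from the example slide. The function names `pearson_r` and `t_for_r` are illustrative, and the critical value 2.306 is copied from the slides rather than computed.

```python
from math import sqrt

# Score pairs from the education/tolerance example slide.
education = [25, 25, 33, 35, 38, 36, 31, 29, 22, 41]
tolerance = [3, 9, 14, 11, 13, 14, 12, 12, 9, 14]


def pearson_r(x, y):
    """Unbiased covariance divided by the product of the unbiased SDs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum(a * b for a, b in zip(x, y))
    covariance = (sum_xy - n * mean_x * mean_y) / (n - 1)
    s_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return covariance / (s_x * s_y)


def t_for_r(r, n):
    """t statistic for testing H0: rho = 0, with df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


r = pearson_r(education, tolerance)
t = t_for_r(r, len(education))

# Critical t for df = 8, alpha = .05, two-tailed (value taken from the slides).
CRITICAL_T = 2.306

print(f"r = {r:.3f}")  # agrees with the slide's .735 to three decimals
print(f"t = {t:.2f}")  # close to the slide's 3.07; small rounding differences are expected
print("Reject H0" if abs(t) > CRITICAL_T else "Fail to reject H0")
```

Because the slide rounds r and r² before computing t, the unrounded t from this sketch may differ from 3.07 in the second decimal place; the decision to reject H0 is the same either way.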