The Sample Linear Correlation Coefficient rXY (or just r for short) is the sample linear correlation coefficient. rXY measures the strength and direction of linear association between two quantitative variables X and Y . rXY = Pn i=1 (Xi − X̄)(Yi − Ȳ )/(n − 1) , sX sY where n is number of pairs of observations, X̄ is the sample average of the X data, Ȳ is the sample average of the Y data, sX is the sample standard deviation of the X data, and sY is the sample standard deviation of the Y data. For example, consider 11 families randomly selected from the population of families with one brother and one sister, both full grown. Let Xi denote the height (in inches) of the brother in the ith family. Let Yi denote the height (in inches) of the sister in the ith family. i 1 2 3 4 5 6 7 8 9 10 11 Xi 71 68 66 67 70 71 70 73 72 65 66 759 Yi 69 64 65 63 65 62 65 64 66 59 62 704 Xi − X̄ 2 -1 -3 -2 1 2 1 4 3 -4 -3 0 X̄ = 759/11 = 69 rXY = Pn Yi − Ȳ 5 0 1 -1 1 -2 1 0 2 -5 -2 0 (Xi − X̄)(Yi − Ȳ ) 10 0 -3 2 1 -4 1 0 6 20 6 39 Ȳ = 704/11 = 64 i=1 (Xi SX = s (Xi − X̄)2 4 1 9 4 1 4 1 16 9 16 9 74 74 11 − 1 SY = s (Yi − Ȳ )2 25 0 1 1 1 4 1 0 4 25 4 66 66 11 − 1 − X̄)(Yi − Ȳ )/(n − 1) 3.9 ≈ 0.558 =p sX sY (7.4)(6.6) rXY estimates the population linear correlation coefficient ρXY . rXY is dimensionless and is always between -1 and 1. rXY = 1 if and only if all data points fall perfectly on a line with positive slope. rXY = −1 if and only if all data points fall perfectly on a line with negative slope. rXY = 0 means there is no linear association between X and Y . Write the letter for each pair of variables on the number line to indicate the value of rXY that you would expect to see. Pair A X Stalk Diameter of Corn Plant Y Weight of Corn Plant B Person’s Age Person’s Year of Birth C Daily Dow Jones Industrial Average Daily Rainfall in Seattle D # of Ultrasounds During Pregnancy Birth Weight of Baby E U.S. Monthly Ice Cream Cone Sales Drowning per Month in U.S. F Age of Wife Age of Husband −1 0 1