Stat 203: Additional (FOR INTEREST) material. You are not responsible for knowing this.

We've been looking at the Pearson correlation r without looking at how it's calculated.

For correlating the response variable to multiple explanatory variables, the easiest way is to use the error and total sums of squares (SSE and SST).

For only one y variable and one x variable, we have a more direct way:

$$ r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right) $$

r is the Pearson correlation coefficient. n is the sample size. s_x and s_y are the sample standard deviations of x and y. The parts in the brackets are "how many standard deviations above the x mean" and "how many standard deviations above the y mean", respectively.

The following notation isn't exactly right, but it will serve our purposes:

$$ r = \frac{1}{n-1} \sum (z_x)(z_y) $$

z_x and z_y are the standardized scores of x and y (the raw scores).

For a set of 5 dragons, we might have a dataset like this:

Length in cm (x)    Weight in grams (y)
34.3                670
24.8                373
30.0                557
28.7                480
30.9                567

Which produces this scatterplot:

[Scatterplot: weight in grams (y) against length in cm (x)]

If y (weight) increases with x (length), then above-average x values will occur for the same cases as above-average y values.
So z_x > 0 usually when z_y > 0. Likewise, z_x < 0 usually when z_y < 0, and the product of two negatives is also positive.
That means, for most values, (z_x)(z_y) > 0.
In the correlation formula you're adding mostly positive numbers, and your correlation will end up positive.

If y decreases as x increases, below-average x occurs with above-average y.
So z_x < 0 usually when z_y > 0.
That means, for most values, (z_x)(z_y) < 0.
In the correlation formula you're adding mostly negative numbers, and your correlation will end up negative.

First, standardize the scores.
Length in cm (x)     Weight in grams (y)
34.3   z = 1.32      670   z = 1.27
24.8   z = -1.43     373   z = -1.41
30.0   z = 0.08      557   z = 0.25
28.7   z = -0.30     480   z = -0.45
30.9   z = 0.34      567   z = 0.34

Then multiply each pair of z-scores together.

Length in cm (x)     Weight in grams (y)    (z_x)(z_y)
34.3   z = 1.32      670   z = 1.27         1.68
24.8   z = -1.43     373   z = -1.41        2.02
30.0   z = 0.08      557   z = 0.25         0.02
28.7   z = -0.30     480   z = -0.45        0.13
30.9   z = 0.34      567   z = 0.34         0.11

Then add the multiplied values.

Length in cm (x)     Weight in grams (y)    (z_x)(z_y)
34.3   z = 1.32      670   z = 1.27         1.68
24.8   z = -1.43     373   z = -1.41        2.02
30.0   z = 0.08      557   z = 0.25         0.02
28.7   z = -0.30     480   z = -0.45        0.13
30.9   z = 0.34      567   z = 0.34         0.11
TOTAL                                       3.97

(The total is computed from the unrounded products, so it differs slightly from the sum of the rounded values shown.)

This does most of the formula for us. All that remains is to divide the total by n - 1 = 4:

r = 3.97 / 4 ≈ 0.99, a very strong positive correlation.

Final note: The correlation formula doesn't appear in your textbook in this form, but in an equivalent, longer form. For the equivalence and more information, I recommend http://en.wikipedia.org/wiki/Pearson_productmoment_correlation_coefficient
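
The three steps above (standardize, multiply, add) can also be checked by computer. The following is a minimal sketch in Python using only the standard library's statistics module. The function names standardize and pearson_r are illustrative choices, not part of the course material, and the sketch assumes the sample standard deviation (dividing by n - 1) is the one used for the z-scores.

```python
from statistics import mean, stdev

# Dragon data from the worked example
length_cm = [34.3, 24.8, 30.0, 28.7, 30.9]  # x
weight_g = [670, 373, 557, 480, 567]        # y


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - m) / s for v in values]


def pearson_r(x, y):
    """Pearson r as the sum of z-score products divided by n - 1."""
    zx = standardize(x)                          # step 1: standardize
    zy = standardize(y)
    products = [a * b for a, b in zip(zx, zy)]   # step 2: multiply
    return sum(products) / (len(x) - 1)          # step 3: add, then divide


r = pearson_r(length_cm, weight_g)
print(round(r, 3))
```

Because the code keeps the unrounded z-scores throughout, its result can differ in the last digit from a hand calculation that rounds at each step.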