The Sample Linear Correlation Coefficient

Formula for the Sample Linear Correlation Coefficient

• $r_{XY}$ (or just $r$ for short) is the sample linear correlation coefficient.
• $r_{XY}$ measures the strength and direction of linear association between two quantitative variables X and Y.

$$r_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) / (n-1)}{s_X \, s_Y}$$

Here $\bar{X}$ and $\bar{Y}$ are the sample means, and $s_X$ and $s_Y$ are the sample standard deviations.

For example, consider 11 families randomly selected from the population of families with one brother and one sister, both full grown. Let $X_i$ denote the height (in inches) of the brother in the i-th family. Let $Y_i$ denote the height (in inches) of the sister in the i-th family.

[Figure: Scatter Plot of the Heights of a Brother and Sister in 11 Families. Sister's height (inches) against brother's height (inches); r = 0.558.]

| $i$ | $X_i$ | $Y_i$ | $X_i - \bar{X}$ | $Y_i - \bar{Y}$ | $(X_i - \bar{X})(Y_i - \bar{Y})$ | $(X_i - \bar{X})^2$ | $(Y_i - \bar{Y})^2$ |
|---|---|---|---|---|---|---|---|
| 1 | 71 | 69 | 2 | 5 | 10 | 4 | 25 |
| 2 | 68 | 64 | -1 | 0 | 0 | 1 | 0 |
| 3 | 66 | 65 | -3 | 1 | -3 | 9 | 1 |
| 4 | 67 | 63 | -2 | -1 | 2 | 4 | 1 |
| 5 | 70 | 65 | 1 | 1 | 1 | 1 | 1 |
| 6 | 71 | 62 | 2 | -2 | -4 | 4 | 4 |
| 7 | 70 | 65 | 1 | 1 | 1 | 1 | 1 |
| 8 | 73 | 64 | 4 | 0 | 0 | 16 | 0 |
| 9 | 72 | 66 | 3 | 2 | 6 | 9 | 4 |
| 10 | 65 | 59 | -4 | -5 | 20 | 16 | 25 |
| 11 | 66 | 62 | -3 | -2 | 6 | 9 | 4 |
| Sum | 759 | 704 | 0 | 0 | 39 | 74 | 66 |

$$\bar{X} = 759 / 11 = 69, \qquad \bar{Y} = 704 / 11 = 64$$

$$s_X = \sqrt{\frac{74}{11 - 1}} = \sqrt{7.4}, \qquad s_Y = \sqrt{\frac{66}{11 - 1}} = \sqrt{6.6}$$

$$r_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) / (n-1)}{s_X \, s_Y} = \frac{39 / (11 - 1)}{\sqrt{7.4}\,\sqrt{6.6}} = \frac{3.9}{\sqrt{(7.4)(6.6)}} \approx 0.558$$

Some Properties of $r_{XY}$

• $r_{XY}$ estimates the population linear correlation coefficient $\rho_{XY}$.
• $r_{XY}$ is dimensionless and is always between 1 and -1.
• $r_{XY} = 1$ if and only if all data points fall perfectly on a line with positive slope.
• $r_{XY} = -1$ if and only if all data points fall perfectly on a line with negative slope.
• $r_{XY} = 0$ suggests that there is no linear association between X and Y.
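The hand calculation for the 11 brother and sister heights can be checked with a few lines of code. What follows is a minimal Python sketch, not part of the original slides: the helper name `sample_correlation` is an assumption chosen for illustration, and it applies the formula directly, dividing by n - 1 in both the numerator and the standard deviations.

```python
import math

# Heights (inches) for the 11 families in the example.
brother = [71, 68, 66, 67, 70, 71, 70, 73, 72, 65, 66]  # X_i
sister = [69, 64, 65, 63, 65, 62, 65, 64, 66, 59, 62]   # Y_i


def sample_correlation(x, y):
    """Sample linear correlation coefficient r_XY (hypothetical helper name)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator of the formula: sum of cross-products divided by n - 1.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    # Sample standard deviations s_X and s_Y.
    s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    return s_xy / (s_x * s_y)


r = sample_correlation(brother, sister)
print(round(r, 3))  # 0.558
```

Because the n - 1 factors cancel, the same value comes from 39 / sqrt(74 × 66); the sketch keeps them only to mirror the formula term by term.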
Is a Bear's Neck Circumference Correlated with a Bear's Weight?

[Figure: scatter plot of weight in pounds against neck circumference in inches; r = 0.934.]

Is Outside Temperature Associated with Crawling Time?

[Figure: scatter plot of average crawling age in weeks against average monthly temperature (F) in the first 6 months, one point per month from Jan to Dec; r = -0.699.]

Does Watching TV Increase Life Expectancy?

[Figure: scatter plot of life expectancy in years against people per television, one point per country: Japan, France, Canada, Australia, UK, US, Mexico, Sri Lanka, China, Russia, Iraq, Morocco, Vietnam, S. Africa, Egypt, Pakistan, Haiti, Uganda, Madagascar, Yemen, Cambodia, Angola; r = -0.804.]

Guess the correlation for each pair of variables.

A) Stalk Diameter of Corn Plant; Weight of Corn Plant
B) Person's Age; Person's Year of Birth
C) Daily Dow Jones Industrial Average; Daily Rainfall in Seattle
D) # of Ultrasounds During Pregnancy; Birth Weight of Baby
E) U.S. Monthly Ice Cream Cone Sales; Drownings per Month in U.S.
F) Age of Wife; Age of Husband

[Figure: number line marked at -1, 0, and 1.]