ST361: Ch3 Bivariate Data Analysis

Overview/Review of Bivariate Data Analysis

Bivariate data: 2 variables, X and Y, are involved. Usually denote
  X as ___________________ variable, aka _______________ variable
  Y as ___________________ variable, aka _______________ variable

Bivariate data analysis: interested in the relationship between X and Y.

Overview of bivariate data analysis:

Example 1
  Obs   Battery Brand   Lifetime (hr)
  1     Duracell        4.2
  2     Eveready        5.1
  3     Eveready        3.9
  ...   ...             ...
  N     Duracell        3.8

Example 2
  Obs   Study Time (hr)   Exam Score
  1     6.5               76
  2     7.3               83
  3     9.5               92
  4     7.1               87
  5     8.4               93
  6     8.0               88

Example 3
  Obs   Drug Type   Side Effect
  1     New         Y
  2     New         Y
  3     New         N
  ...   ...         ...
  N     Old         N

Question of interest
  Example 1: Do the two brands have the same lifetime?
  Example 2: Is exam score related to study time?
  Example 3: Can the new drug reduce side effects?

Independent variable X
  Example 1: __________   Example 2: __________   Example 3: __________

Dependent variable Y
  Example 1: __________   Example 2: __________   Example 3: __________

Type of Variables
  Example 1: X: __________    Y: __________
  Example 2: X: __________    Y: __________
  Example 3: X: categorical   Y: categorical

Graphical Presentation
  Example 1: Side-by-side boxplot
  Example 2: Scatter plot
  Example 3: Bar plot

Numerical Summary
  Example 1: __________   Example 2: __________   Example 3: __________

Statistical Inference
  Example 1: H0: μ_D − μ_E = 0 vs. Ha: μ_D − μ_E ≠ 0
  Example 2: Population regression line Y = α + βX, H0: β = 0
             Population correlation coefficient ρ, H0: ρ = 0
  Example 3: H0: π_New − π_Old = 0 vs. Ha: π_New − π_Old < 0

-----------------------------------------------------------------
ST361: Ch3.2 Correlation Coefficient
Topics:
  a) Definition
  b) Interpretation
  c) Calculation
-----------------------------------------------------------------

a) Definition:

The sample correlation coefficient r is a statistic that quantifies the ______________ and ______________ of the _____________________________ between 2 continuous variables X and Y.

The ____________ of r indicates the strength of the relationship:

The correlation coefficient r takes values in the range of _____________________

The ____________ of r indicates the direction of the relationship between X and Y:

Ex.
b) Interpretation:
  (1) 0.8 < |r| < 1 : _______________ relationship between X and Y
  (2) 0.5 < |r| ≤ 0.8 : _______________ relationship
  (3) 0.0 < |r| ≤ 0.5 : _______________ relationship

Comments:
  The value of r ____________________________________________________________

  Meaning of r = 0:

c) Calculation

Q: If the definitions of X and Y are swapped, will the value of r change?

Ex. Study time vs. Exam score

  Obs   X: Study Time (hr)   Y: Exam Score   x_i y_i
  1     6.5                  76              494
  2     7.3                  83              605.9
  3     9.5                  92              874
  4     7.1                  87              617.7
  5     8.4                  93              781.2
  6     8.0                  88              704.0

  x̄ = 7.8, s_x = 1.1, ȳ = 86.5, s_y = 6.3, Σ x_i y_i = 4076.8

  Calculate the sample correlation coefficient r.

Ex. Body Mass Index vs. Blood Pressure

  Obs   X: Body Mass Index   Y: Systolic Blood Pressure   x_i y_i
  1     18                   120                          2160
  2     20                   110                          2200
  3     22                   120                          2640
  4     25                   135                          3375
  5     26                   140                          3640
  6     29                   115                          3335
  7     30                   150                          4500
  8     33                   165                          5445
  9     33                   160                          5280
  10    35                   180                          6300

  s_x = 5.9, Σ x_i = 271, s_y = 23.7, Σ y_i = 1395, Σ x_i y_i = 38875

  Calculate the sample correlation coefficient r.

  [Scatter plot: Blood Pressure (vertical axis, 60 to 200) vs. BMI (horizontal axis, 10 to 40)]
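
The two calculation examples above can be checked numerically. The handout does not state which formula it uses; the sketch below assumes the standard computing formula r = (Σ x_i y_i − n·x̄·ȳ) / ((n − 1)·s_x·s_y), which matches the summary quantities supplied with each example. The function names `sample_correlation` and `r_from_summary` are illustrative and do not come from the course materials.

```python
from statistics import mean, stdev


def sample_correlation(x, y):
    """Sample correlation r computed from raw data.

    Assumed formula: (sum of x_i*y_i - n*xbar*ybar) / ((n - 1) * s_x * s_y),
    where s_x and s_y are sample standard deviations (n - 1 divisor).
    """
    n = len(x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    return (sum_xy - n * mean(x) * mean(y)) / ((n - 1) * stdev(x) * stdev(y))


def r_from_summary(n, sum_xy, xbar, ybar, sx, sy):
    """Same formula, applied to (possibly rounded) summary statistics."""
    return (sum_xy - n * xbar * ybar) / ((n - 1) * sx * sy)


# Data copied from the two examples in the handout.
study_time = [6.5, 7.3, 9.5, 7.1, 8.4, 8.0]
exam_score = [76, 83, 92, 87, 93, 88]

bmi = [18, 20, 22, 25, 26, 29, 30, 33, 33, 35]
blood_pressure = [120, 110, 120, 135, 140, 115, 150, 165, 160, 180]

r_study = sample_correlation(study_time, exam_score)
r_bmi = sample_correlation(bmi, blood_pressure)

# Hand calculation for the first example using the rounded values
# printed in the handout (xbar = 7.8, ybar = 86.5, s_x = 1.1, s_y = 6.3).
r_study_rounded = r_from_summary(6, 4076.8, 7.8, 86.5, 1.1, 6.3)

print(f"Study time vs. exam score: r = {r_study:.3f}")
print(f"  using rounded s_x, s_y:  r = {r_study_rounded:.3f}")
print(f"BMI vs. blood pressure:    r = {r_bmi:.3f}")
```

Both examples give r of about 0.85 from the raw data. Using the rounded standard deviations printed in the handout gives a slightly smaller value for the first example (about 0.83), so a hand calculation may differ from software output in the second decimal place. The formula treats x and y symmetrically, so `sample_correlation(exam_score, study_time)` returns the same value as `r_study`.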