ch3.2 - NCSU Statistics

ST361: Ch3 Bivariate Data Analysis
Overview/Review of Bivariate Data Analysis
 Bivariate data: 2 variables X and Y involved. Usually denote
 X as ___________________ variable, aka _______________ variable
 Y as ___________________ variable, aka _______________ variable
 Bivariate data analysis: interested in the relationship between X and Y
 Overview of bivariate data analysis:
Example 1

  Obs   Battery Brand   Lifetime (hr)
  1     Duracell        4.2
  2     Eveready        5.1
  3     Eveready        3.9
  ...   ...             ...
  N     Duracell        3.8

  Question of interest: Do the two brands have the same lifetime?
  Type of variables:
    Independent variable X:
    Dependent variable Y:
  Graphical presentation: Side-by-side boxplot
  Numerical summary:
  Statistical inference: $H_0: \mu_D - \mu_E = 0$ vs. $H_a: \mu_D - \mu_E \neq 0$

Example 2

  Obs   Study Time (hr)   Exam Score
  1     6.5               76
  2     7.3               83
  3     9.5               92
  4     7.1               87
  5     8.4               93
  6     8.0               88

  Question of interest: Is exam score related to study time?
  Type of variables:
    Independent variable X:
    Dependent variable Y:
  Graphical presentation: Scatter plot
  Numerical summary:
  Statistical inference:
    Population regression line $Y = \alpha + \beta X$, with $H_0: \beta = 0$
    Population correlation coefficient $\rho$, with $H_0: \rho = 0$

Example 3

  Obs   Drug Type   Side Effect
  1     New         Y
  2     New         Y
  3     New         N
  ...   ...         ...
  N     Old         N

  Question of interest: Can new drug reduce side effect?
  Type of variables:
    Independent variable X: categorical
    Dependent variable Y: categorical
  Graphical presentation: Bar plot
  Numerical summary:
  Statistical inference: $H_0: \pi_{New} - \pi_{Old} = 0$ vs. $H_a: \pi_{New} - \pi_{Old} \neq 0$
------------------------------------------------------------
ST361: Ch3.2 Correlation Coefficient
Topics:
a) Definition
b) Interpretation
c) Calculation
------------------------------------------------------------
a) Definition: The sample correlation coefficient r is a statistic that quantifies the ______________ and ______________ of the _____________________________ between 2 continuous variables X and Y
 The ____________ of r indicates the strength of the relationship:
The correlation coefficient r takes values in the range of _____________________
 The ____________ of r indicates the direction of the relationship between X and Y:
Ex.
b) Interpretation:
(1) $0.8 < |r| < 1$: _______________ relationship between X and Y
(2) $0.5 < |r| \le 0.8$: _______________ relationship
(3) $0.0 < |r| \le 0.5$: _______________ relationship
Comments:
 The value of r ____________________________________________________________
 Meaning of r = 0:
c) Calculation
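For reference, one standard way to write the sample correlation coefficient is sketched below. It is offered as an assumption about the intended formula, chosen because it uses only the quantities supplied with the worked examples in these notes (the sum of products, the sample means, and the sample standard deviations). Here $n$ is the number of observations, and $s_x$ and $s_y$ are the sample standard deviations computed with divisor $n - 1$.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x\, s_y}
  = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{(n-1)\, s_x\, s_y}
```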
Q: if the definitions of X and Y are swapped, will the value of r change?
Ex. Study time vs. Exam score

  Obs   X: Study Time (hr)   Y: Exam Score   x_i y_i
  1     6.5                  76              494
  2     7.3                  83              605.9
  3     9.5                  92              874
  4     7.1                  87              617.7
  5     8.4                  93              781.2
  6     8.0                  88              704.0

$\bar{x} = 7.8$, $s_x = 1.1$
$\bar{y} = 86.5$, $s_y = 6.3$
$\sum_i x_i y_i = 4076.8$
Calculate the sample correlation coefficient r.
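As a check on the hand calculation, the sketch below computes r for the study time data in two ways: from the raw data, and from the rounded summary values given above. It assumes the standard formula r = (sum of x_i y_i - n * x_bar * y_bar) / ((n - 1) * s_x * s_y); the function names are illustrative and do not come from the course materials.

```python
import math

# Data from the study time vs. exam score example.
study_time = [6.5, 7.3, 9.5, 7.1, 8.4, 8.0]  # X, in hours
exam_score = [76, 83, 92, 87, 93, 88]        # Y

def corr_from_raw(x, y):
    """Sample correlation from raw data, using sums of squares and products."""
    n = len(x)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

def corr_from_summaries(sum_xy, n, x_bar, y_bar, s_x, s_y):
    """Sample correlation from summary statistics (hypothetical helper)."""
    return (sum_xy - n * x_bar * y_bar) / ((n - 1) * s_x * s_y)

r_raw = corr_from_raw(study_time, exam_score)
r_rounded = corr_from_summaries(4076.8, 6, 7.8, 86.5, 1.1, 6.3)

print(round(r_raw, 3))      # full-precision data
print(round(r_rounded, 3))  # rounded s_x and s_y from the handout
```

The two results differ slightly (about 0.85 from the raw data versus about 0.83 from the summaries) because s_x and s_y are rounded to one decimal place in the handout.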
Ex. Body Mass Index vs. Blood Pressure

  Obs   X: Body Mass Index   Y: Systolic Blood Pressure   x_i y_i
  1     18                   120                          2160
  2     20                   110                          2200
  3     22                   120                          2640
  4     25                   135                          3375
  5     26                   140                          3640
  6     29                   115                          3335
  7     30                   150                          4500
  8     33                   165                          5445
  9     33                   160                          5280
  10    35                   180                          6300

$s_x = 5.9$, $\sum_i x_i = 271$
$s_y = 23.7$, $\sum_i y_i = 1395$
$\sum_i x_i y_i = 38875$
Calculate the sample correlation coefficient r.
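The sketch below is one way to verify the answer from the raw data, this time using the deviation form of the formula (sum of cross-deviations divided by the square root of the product of the sums of squared deviations). It also computes r with the roles of the two variables exchanged, which may help in checking the earlier question about swapping X and Y. The function name is illustrative only.

```python
import math

# Data from the BMI vs. systolic blood pressure example.
bmi = [18, 20, 22, 25, 26, 29, 30, 33, 33, 35]
sbp = [120, 110, 120, 135, 140, 115, 150, 165, 160, 180]

def sample_corr(x, y):
    """Sample correlation coefficient using deviations from the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

r_xy = sample_corr(bmi, sbp)  # BMI as X, blood pressure as Y
r_yx = sample_corr(sbp, bmi)  # roles of X and Y exchanged

print(round(r_xy, 3))
print(round(r_yx, 3))
```

With the rounded summaries from the handout (s_x = 5.9, s_y = 23.7), the hand calculation gives roughly 0.85, which agrees with the full-precision value to two decimal places.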
[Scatter plot: Blood Pressure (vertical axis, 60 to 200) vs. BMI (horizontal axis, 10 to 40)]