Stat 101 – Lecture 11

Correlation Coefficient
$r = \frac{\sum z_x z_y}{n - 1}$
$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$
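The two forms of r give the same number, because each z-score is a deviation from the mean divided by the sample standard deviation. A minimal Python sketch of this, assuming small made-up data (not the lecture's cigarette data) and hypothetical function names:

```python
from statistics import mean, stdev

def r_from_z_scores(xs, ys):
    """r = sum(z_x * z_y) / (n - 1), using sample standard deviations."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)

def r_from_deviations(xs, ys):
    """r = sum((x - x_bar)(y - y_bar)) / ((n - 1) * s_x * s_y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

# Made-up illustration data, not taken from the lecture.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r1 = r_from_z_scores(xs, ys)
r2 = r_from_deviations(xs, ys)
print(round(r1, 4), round(r2, 4))
```

The two printed values agree up to floating-point rounding.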
Standardized Values
• Lucky Strike
– Tar = 24
– z_x = 2.6
– Nicotine = 1.5
– z_y = 2.1
• Now
– Tar = 2
– z_x = –2.1
– Nicotine = 0.2
– z_y = –2.5
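Nicotine Content vs. Tar Content
[Figure: scatterplot of Standardized Nicotine (vertical axis) against Standardized Tar (horizontal axis), both axes running from –3 to 3. Two points are labeled with their standardized values and the product of those values: (2.6, 2.2) with product 5.72, and (–2.1, –2.5) with product 5.25.]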
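Each point contributes its product z_x z_y to the sum in the numerator of r. A short Python check of the two labeled products, using the standardized values as read off the plot (the dictionary keys are just descriptive labels chosen here):

```python
# Standardized (tar, nicotine) pairs as labeled on the plot.
points = {
    "upper right": (2.6, 2.2),
    "lower left": (-2.1, -2.5),
}

# Product z_x * z_y for each labeled point, rounded to two decimals.
products = {name: round(zx * zy, 2) for name, (zx, zy) in points.items()}
print(products)
```

Both products are positive because the two coordinates of each point share a sign, so points in the upper-right and lower-left corners push r upward.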
Correlation Coefficient
• Tar and nicotine
$r = \frac{\sum z_x z_y}{n - 1} = \frac{+22.9437}{24}$
• r = 0.956
Correlation Coefficient
• There is a very strong positive
correlation (linear association)
between the tar content and
nicotine content of the various
cigarette brands.
Correlation Coefficient
• Tar and nicotine
$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$
$r = \frac{+29.889}{24(4.636)(0.281)} = +0.956$
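The arithmetic for both versions of the calculation can be checked directly. A minimal Python sketch using the sums and standard deviations quoted in the slides; n = 25 is inferred from n − 1 = 24, and the variable names are hypothetical:

```python
n = 25                    # inferred: the slides use n - 1 = 24
sum_zxzy = 22.9437        # sum of z_x * z_y from the slides
sum_cross = 29.889        # sum of (x - x_bar)(y - y_bar) from the slides
s_x, s_y = 4.636, 0.281   # sample standard deviations (tar, nicotine)

r_from_z = sum_zxzy / (n - 1)
r_from_dev = sum_cross / ((n - 1) * s_x * s_y)
print(round(r_from_z, 3), round(r_from_dev, 3))
```

Both expressions round to 0.956, matching the value reported above.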
Correlation Conditions
• Correlation applies only to
quantitative variables.
• Correlation measures the
strength of linear association.
• Outliers can distort the value of
the correlation coefficient.
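To see how much a single outlier can distort r, here is a minimal Python sketch. The data are made up for illustration and `pearson_r` is a hypothetical helper, not part of JMP or the lecture:

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Sample correlation: sum of cross-deviations over (n - 1) * s_x * s_y."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

# Made-up data lying exactly on an increasing line.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 3, 4, 5, 6, 7]
r_clean = pearson_r(xs, ys)

# The same data plus one made-up point far from the pattern.
r_outlier = pearson_r(xs + [7], ys + [-10])

print(round(r_clean, 3), round(r_outlier, 3))
```

In this made-up example one added point is enough to turn a perfect positive correlation into a negative one, which is why it helps to look at the scatterplot before trusting r.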
JMP
• Analyze – Multivariate methods –
Multivariate
• Y, Columns
– Tar
– Nicotine
Multivariate
Correlations
            Tar       Nicotine
Tar         1.0000    0.9560
Nicotine    0.9560    1.0000
Scatterplot Matrix
[Figure: scatterplot matrix of Tar and Nicotine. Tar axes are marked from 5 to 25; Nicotine axes are marked from 0.5 to 1.5.]
Correlation Properties
• The sign of r indicates the
direction of the association.
• The value of r is always between
–1 and +1.
• Correlation has no units.
• Correlation is not affected by
changes of center or scale.
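The last property can be checked numerically. A minimal Python sketch, assuming made-up data and a hypothetical `pearson_r` helper: adding a constant to x (change of center) or multiplying x by a positive constant (change of scale) leaves r unchanged.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Sample correlation: sum of cross-deviations over (n - 1) * s_x * s_y."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

# Made-up illustration data, not taken from the lecture.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r_original = pearson_r(xs, ys)

# Rescale and shift x, as a change of units would.
r_transformed = pearson_r([10 * x + 3 for x in xs], ys)

# Multiplying by a negative constant reverses the direction of the association.
r_flipped = pearson_r([-x for x in xs], ys)

print(round(r_original, 4), round(r_transformed, 4), round(r_flipped, 4))
```

Note that a negative multiplier flips the sign of r while keeping its magnitude, so "scale" here means multiplication by a positive constant.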
Correlation Cautions
• “Correlation” and “Association”
are different.
– Correlation: specific; it measures only linear association.
– Association: vague; it can refer to any relationship between variables.
• Don’t correlate categorical
variables.
Correlation Cautions
• Don’t confuse correlation with
causation.
– There is a strong positive correlation
between the number of crimes committed
in communities and the number of 2nd
graders in those communities.
• Beware of lurking variables.
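A small simulation can show how a lurking variable produces a strong correlation without causation. This is only a sketch: community population is assumed here as one plausible lurking variable, and every number and name in the code is made up for illustration.

```python
import random
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Sample correlation: sum of cross-deviations over (n - 1) * s_x * s_y."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical lurking variable: population of 200 made-up communities.
population = [rng.uniform(1_000, 100_000) for _ in range(200)]

# Each count depends on population plus its own independent noise.
# Neither count is computed from the other.
crimes = [0.05 * p + rng.gauss(0, 300) for p in population]
second_graders = [0.012 * p + rng.gauss(0, 80) for p in population]

r = pearson_r(crimes, second_graders)
print(round(r, 2))
```

The simulated correlation is strong and positive even though, by construction, the two counts are linked only through community size.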