Correlations
MEASURES OF RELATIONSHIP
Key Concepts
 Pearson Correlation: interpretation, limits, computation, graphing
 Factors that affect the Pearson Correlation
 Coefficient of Determination (r²): 'variance explained'
 Correlation vs. Causation
Correlations
 A correlation measures a linear relationship
between two variables
Correlation: Scatterplots
Scatterplots are graphic representations of the relationship
between two continuous variables
[Figure: scatterplot of Weight (vertical axis, 0 to 120) against Age (horizontal axis, 0 to 12)]
Correlation: Coefficients
Correlation coefficients are numbers between -1.00 and +1.00
representing the relationship between two variables
[Figure: number line running from -1 through 0 to +1]
Stop and think
 What types of variables are correlated in education?
 Can you provide some examples of both positive and
negative relationships?
The Ugly Formula
…the variance formula for r
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$
This formula calculates the correlation between X and Y.
It builds on your knowledge of variance, showing how the variation in X and in Y,
along with the covariation between X and Y, makes up the Pearson correlation
coefficient.
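The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `pearson_r` and its error handling are assumptions made for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviation scores (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviation of each score from its own mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Numerator: sum of cross-products (the covariation between X and Y)
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    # Denominator pieces: sums of squared deviations (the variation in X and in Y)
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sum_xy / sqrt(sum_xx * sum_yy)
```

For instance, `pearson_r([1, 2, 3], [2, 4, 6])` describes a perfectly linear increasing pair of variables, and `pearson_r([1, 2, 3], [6, 4, 2])` a perfectly linear decreasing pair.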
Example
Step 1: Lay out the problem
Age (X)   Weight (Y)
7          70
4          50
9         100
3          25
5          55
4          40
6          75
10         90
10         25
Step 2: Compute the Mean for both variables
Age (X)   Weight (Y)
7          70
4          50
9         100
3          25
5          55
4          40
6          75
10         90
10         25
Sum of X = 58        Sum of Y = 530
Number of X = 9      Number of Y = 9
Mean of X = 6.44     Mean of Y = 58.89
Step 3: Compute the difference of each score from its Mean
Age (X)   X-Xbar    Weight (Y)   Y-Ybar
7           .56      70           11.11
4         -2.44      50           -8.89
9          2.56     100           41.11
3         -3.44      25          -33.89
5         -1.44      55           -3.89
4         -2.44      40          -18.89
6          -.44      75           16.11
10         3.56      90           31.11
10         3.56      25          -33.89
Mean of X = 6.44     Mean of Y = 58.89
Note: The sum of (X-Xbar) should equal 0 and the sum of (Y-Ybar)
should equal 0. Why?
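One way to reason about the question in the note, written out for X (the same steps apply to Y). The symbol n, the number of scores, is introduced only for this sketch:

$$ \sum (X - \bar{X}) = \sum X - n\bar{X} = \sum X - n \cdot \frac{\sum X}{n} = 0 $$

Because the means in the table are rounded to two decimals, the tabled deviations sum to approximately, not exactly, 0.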
Step 4: Square each difference from the mean
Age (X)   X-Xbar   (X-Xbar)²    Weight (Y)   Y-Ybar    (Y-Ybar)²
7           .56      .3136       70           11.11     123.4321
4         -2.44     5.9536       50           -8.89      79.0321
9          2.56     6.5536      100           41.11    1690.0321
3         -3.44    11.8336       25          -33.89    1148.5321
5         -1.44     2.0736       55           -3.89      15.1321
4         -2.44     5.9536       40          -18.89     356.8321
6          -.44      .1936       75           16.11     259.5321
10         3.56    12.6736       90           31.11     967.8321
10         3.56    12.6736       25          -33.89    1148.5321
Step 5: Sum the squared differences from the means
Age (X)   X-Xbar   (X-Xbar)²    Weight (Y)   Y-Ybar    (Y-Ybar)²
7           .56      .3136       70           11.11     123.4321
4         -2.44     5.9536       50           -8.89      79.0321
9          2.56     6.5536      100           41.11    1690.0321
3         -3.44    11.8336       25          -33.89    1148.5321
5         -1.44     2.0736       55           -3.89      15.1321
4         -2.44     5.9536       40          -18.89     356.8321
6          -.44      .1936       75           16.11     259.5321
10         3.56    12.6736       90           31.11     967.8321
10         3.56    12.6736       25          -33.89    1148.5321
Sum (X-Xbar)² = 58.22
Sum (Y-Ybar)² = 5788.89
Step 6: Compute the cross-product of the differences (for the
numerator)
Age (X)   X-Xbar   (X-Xbar)²    Weight (Y)   Y-Ybar    (Y-Ybar)²    (X-Xbar)(Y-Ybar)
7           .56      .3136       70           11.11     123.4321        6.2216
4         -2.44     5.9536       50           -8.89      79.0321       21.6916
9          2.56     6.5536      100           41.11    1690.0321      105.2416
3         -3.44    11.8336       25          -33.89    1148.5321      116.5816
5         -1.44     2.0736       55           -3.89      15.1321        5.6016
4         -2.44     5.9536       40          -18.89     356.8321       46.0916
6          -.44      .1936       75           16.11     259.5321       -7.0884
10         3.56    12.6736       90           31.11     967.8321      110.7516
10         3.56    12.6736       25          -33.89    1148.5321     -120.6484
Step 7: Sum the cross-products of the differences
Age (X)   X-Xbar   (X-Xbar)²    Weight (Y)   Y-Ybar    (Y-Ybar)²    (X-Xbar)(Y-Ybar)
7           .56      .3136       70           11.11     123.4321        6.2216
4         -2.44     5.9536       50           -8.89      79.0321       21.6916
9          2.56     6.5536      100           41.11    1690.0321      105.2416
3         -3.44    11.8336       25          -33.89    1148.5321      116.5816
5         -1.44     2.0736       55           -3.89      15.1321        5.6016
4         -2.44     5.9536       40          -18.89     356.8321       46.0916
6          -.44      .1936       75           16.11     259.5321       -7.0884
10         3.56    12.6736       90           31.11     967.8321      110.7516
10         3.56    12.6736       25          -33.89    1148.5321     -120.6484
Sum (X-Xbar)(Y-Ybar) = 284.4444
Step 8: Collect the partial values together, and substitute each
into the formula. Solve the formula.
Sum (X-Xbar)² = 58.22
Sum (Y-Ybar)² = 5788.89
Sum (X-Xbar)(Y-Ybar) = 284.44
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} = \frac{284.44}{\sqrt{(58.22)(5788.89)}} = .49 $$
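The eight steps can be retraced in Python as a check on the hand computation. This is a sketch using the age and weight scores from the example; the variable names are chosen here for illustration, and the script works from unrounded means, so intermediate values can differ from the table in the last decimal places.

```python
from math import sqrt

# Step 1: lay out the problem
ages = [7, 4, 9, 3, 5, 4, 6, 10, 10]
weights = [70, 50, 100, 25, 55, 40, 75, 90, 25]
n = len(ages)

# Step 2: means
mean_age = sum(ages) / n
mean_weight = sum(weights) / n

# Step 3: differences from the means
dev_age = [x - mean_age for x in ages]
dev_weight = [y - mean_weight for y in weights]

# Steps 4-5: squared differences, summed
sum_sq_age = sum(d * d for d in dev_age)
sum_sq_weight = sum(d * d for d in dev_weight)

# Steps 6-7: cross-products, summed
sum_cross = sum(a * b for a, b in zip(dev_age, dev_weight))

# Step 8: substitute into the formula
r = sum_cross / sqrt(sum_sq_age * sum_sq_weight)

print(round(mean_age, 2), round(mean_weight, 2))      # 6.44 58.89
print(round(sum_sq_age, 2), round(sum_sq_weight, 2))  # 58.22 5788.89
print(round(sum_cross, 2))                            # 284.44
print(round(r, 2))                                    # 0.49
```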
Last Step: Check the computed r for reasonableness, then
interpret the value (sign and magnitude)
 The value of r must be between -1 and +1
 Computed r = .49, which is between -1 and +1
 The sign of r is positive
 The relationship between the two variables is positive
 “In general, younger people weigh less than older people.”
 “In general, older people weigh more than younger
people.”
 The magnitude of r is “moderate”
 Although age and weight are related, the relationship is
not very strong. Some of the variation in age has nothing
to do with weight, and some of the variation in weight has
nothing to do with age.
Cautions
 The strength of a curvilinear relationship will be
underestimated if r is applied, because r measures only the
linear part of a relationship.
 The size of the group does not affect the size of the
correlation coefficient.
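A small sketch of the first caution, using made-up data in which Y is completely determined by X through a curve (Y = X squared). The data and variable names are assumptions for illustration only.

```python
from math import sqrt

# Hypothetical data: a perfect U-shaped (curvilinear) relationship
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
dx = [x - mean_x for x in xs]
dy = [y - mean_y for y in ys]

sum_cross = sum(a * b for a, b in zip(dx, dy))
r = sum_cross / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Positive and negative cross-products cancel, so r comes out at zero
print(abs(round(r, 2)))  # 0.0
```

Even though Y can be predicted perfectly from X here, r is zero, which is why a scatterplot is worth checking before interpreting r.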
Effect Size
 ES = the correlation coefficient, squared (r²)
 The proportion of the total variance of one variable
that can be associated with the variance in the other
variable.
 It is the proportion of shared or common variance
between two variables.
 Example: calorie intake & weight
 r = .60
 r² = .36 or 36%
[Figure: diagram labeled CAL and WEIGHT, annotated with r² = .36]
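The arithmetic behind the calorie intake example can be written out as a short sketch. The variable names are illustrative, and the unshared proportion is simply 1 minus the shared proportion.

```python
r = 0.60                 # correlation from the calorie intake & weight example
r_squared = r ** 2       # proportion of shared (common) variance
unshared = 1 - r_squared # proportion of variance not shared

print(f"{r_squared:.2f}")  # 0.36
print(f"{r_squared:.0%}")  # 36%
print(f"{unshared:.0%}")   # 64%
```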
Correlation & Causality
 Correlation does not indicate causation
 Correlation indicates a relationship or association
Practice
Compute the Pearson correlation and the r² value
for the following example. First draw a rough
scatterplot to see whether the relationship
looks linear.
X: 3, 7, 8, 2, 5
Y: 5, 8, 10, 3, 9
Interpret your results.
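x    y     X-Xbar   Y-Ybar   (X-Xbar)²   (Y-Ybar)²   (X-Xbar)(Y-Ybar)
3    5
7    8
8    10
2    3
5    9
The sum of the (X-Xbar)(Y-Ybar) column is the numerator of r:
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$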
x    y     X-Xbar   Y-Ybar   (X-Xbar)²   (Y-Ybar)²   (X-Xbar)(Y-Ybar)
3    5
7    8
8    10
2    3
5    9
Xbar=5   Ybar=7
x    y     X-Xbar   Y-Ybar   (X-Xbar)²   (Y-Ybar)²   (X-Xbar)(Y-Ybar)
3    5     -2       -2
7    8      2        1
8    10     3        3
2    3     -3       -4
5    9      0        2
Xbar=5   Ybar=7   Check=0   Check=0
x    y     X-Xbar   Y-Ybar   (X-Xbar)²   (Y-Ybar)²   (X-Xbar)(Y-Ybar)
3    5     -2       -2       4           4
7    8      2        1       4           1
8    10     3        3       9           9
2    3     -3       -4       9           16
5    9      0        2       0           4
Xbar=5   Ybar=7   Check=0   Check=0   Sum=26   Sum=34
x    y     X-Xbar   Y-Ybar   (X-Xbar)²   (Y-Ybar)²   (X-Xbar)(Y-Ybar)
3    5     -2       -2       4           4           4
7    8      2        1       4           1           2
8    10     3        3       9           9           9
2    3     -3       -4       9           16          12
5    9      0        2       0           4           0
Xbar=5   Ybar=7   Check=0   Check=0   Sum=26   Sum=34   Sum=27
Sum (X-Xbar)(Y-Ybar) = 27, Sum (X-Xbar)² = 26, Sum (Y-Ybar)² = 34
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} = \frac{27}{\sqrt{(26)(34)}} $$
r = 27 / 29.7
r = .908, r² = .82
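The practice answer can be double-checked with a short script. This is a sketch with illustrative variable names; because both means are whole numbers here, the sums come out exactly as in the worksheet.

```python
from math import sqrt

xs = [3, 7, 8, 2, 5]
ys = [5, 8, 10, 3, 9]

mean_x = sum(xs) / len(xs)   # 5.0
mean_y = sum(ys) / len(ys)   # 7.0
dx = [x - mean_x for x in xs]
dy = [y - mean_y for y in ys]

sum_cross = sum(a * b for a, b in zip(dx, dy))  # numerator
sum_sq_x = sum(a * a for a in dx)
sum_sq_y = sum(b * b for b in dy)

r = sum_cross / sqrt(sum_sq_x * sum_sq_y)
r_squared = r ** 2

print(sum_cross, sum_sq_x, sum_sq_y)      # 27.0 26.0 34.0
print(round(r, 3), round(r_squared, 2))   # 0.908 0.82
```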
Key Points
 Correlation is a measure of relationship, and
ranges from -1 to +1. The sign indicates direction, and
the absolute size of the coefficient indicates the strength of the relationship.
 r² represents the shared variance
 Correlations do not imply causality