Chapter 9 : Linear Correlation

What is a Perfect Positive Linear
Correlation?
– It occurs when everyone has the same exact
score on two different variables.
– It also occurs when everyone’s score on one
variable differs from their score on the other
variable by the same additive or (positive)
multiplicative constant (e.g., everyone’s final exam
score is exactly 10 points higher than their
midterm score, or exactly twice as high).
– It will occur if everyone occupies the same
relative position in the distribution of one variable
that they occupy in the distribution of the other (i.e., if
everyone has the same z score on both variables).
– Perfect negative correlation occurs when
everyone has the same z score on both variables,
but with opposite signs (e.g., if z = +1.5 for one
variable, that person’s z will be –1.5 for the other
variable).
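The conditions above can be checked numerically. The following is a minimal sketch using made-up midterm scores; the helper names `z_scores` and `same_z` are illustrative and not part of the original slides.

```python
from math import isclose
from statistics import mean, pstdev

def z_scores(values):
    """Convert raw scores to z scores (population standard deviation)."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]

def same_z(a, b, sign=1):
    """True if every z score in a equals sign * the paired z score in b."""
    return all(isclose(za, sign * zb, abs_tol=1e-9)
               for za, zb in zip(z_scores(a), z_scores(b)))

midterm = [62, 70, 75, 81, 90]              # hypothetical midterm scores
final_plus_10 = [x + 10 for x in midterm]   # exactly 10 points higher
final_doubled = [2 * x for x in midterm]    # exactly twice as high
final_flipped = [100 - x for x in midterm]  # rank order reversed

print(same_z(midterm, final_plus_10))       # identical z scores
print(same_z(midterm, final_doubled))       # identical z scores
print(same_z(midterm, final_flipped, -1))   # same z scores, opposite signs
```

Adding a constant or multiplying by a positive constant leaves every z score unchanged, which is why both cases give a perfect positive correlation.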
Chapter 9
For Explaining Psychological
Statistics, 4th ed. by B. Cohen
The Pearson Correlation
Coefficient (r)
– Pearson’s r ranges from -1.0 to +1.0,
where
• +1.00 = perfect positive correlation
• –1.00 = perfect negative correlation
• 0 = a total lack of correlation
– A formula that illustrates the relationship
between z-scores and Pearson’s r is the
following:
\[ r = \frac{\sum z_X z_Y}{N} \]
– The magnitude of the number represents
the amount of correlation, while its sign
represents the direction of the correlation.
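As a rough illustration of the z-score formula, the sketch below computes r as the average cross-product of paired z scores. The data and the function name `pearson_r_from_z` are hypothetical, and the z scores are assumed to use the population (divide-by-N) standard deviation, which is what dividing the sum by N requires.

```python
from math import isclose
from statistics import mean, pstdev

def pearson_r_from_z(x, y):
    """Pearson's r as the mean of the cross-products of paired z scores."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)   # population (biased) SDs, dividing by N
    z_products = [((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)]
    return sum(z_products) / len(x)

x = [1, 2, 3, 4, 5]   # hypothetical scores
y = [2, 1, 4, 3, 5]
r = pearson_r_from_z(x, y)
print(round(r, 3))
```

When each person's two z scores tend to share a sign, the cross-products are mostly positive and r is positive; opposite signs push r below zero.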
Graphing a Correlation:
The Scatterplot
[Figure: two scatterplots, one with correlation coefficient r = +1.0 and one with correlation coefficient r = -1.0; both axes run from 0 to 25.]
The Graph of a Correlation
That Is Less Than Perfect
[Figure: two scatterplots, one with correlation r = .83 and one with correlation r = -.83.]
Calculating Pearson’s r
• Computing formula in terms of population
standard deviations:
\[ r = \frac{\dfrac{\sum XY}{N} - \mu_X \mu_Y}{\sigma_X \sigma_Y} \]
– The numerator is the biased estimate of the
covariance. Dividing the covariance by the
product of the biased standard deviations
ensures that r always falls between –1.0 and +1.0.
• Computing formula in terms of the
unbiased covariance estimate divided by
the unbiased standard deviations:
\[ r = \frac{\dfrac{1}{N-1}\left(\sum XY - N\bar{X}\bar{Y}\right)}{s_X s_Y} \]
This formula always yields the same value for r as the
preceding formula based on the biased
covariance.
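The equivalence of the two computing formulas can be confirmed with a short sketch. The function names `r_biased` and `r_unbiased` and the data are hypothetical; the only assumption is that the same N or N – 1 divisor is used consistently for the covariance and for both standard deviations within each formula.

```python
from math import isclose
from statistics import mean, pstdev, stdev

def r_biased(x, y):
    """Biased covariance divided by the population standard deviations."""
    n = len(x)
    cov = sum(a * b for a, b in zip(x, y)) / n - mean(x) * mean(y)
    return cov / (pstdev(x) * pstdev(y))

def r_unbiased(x, y):
    """Unbiased covariance divided by the unbiased standard deviations."""
    n = len(x)
    cov = (sum(a * b for a, b in zip(x, y)) - n * mean(x) * mean(y)) / (n - 1)
    return cov / (stdev(x) * stdev(y))

x = [2, 4, 5, 7, 9]   # hypothetical scores
y = [1, 3, 2, 6, 5]
print(round(r_biased(x, y), 6), round(r_unbiased(x, y), 6))
```

The two results agree because the unbiased covariance and the product of the unbiased standard deviations are each N/(N – 1) times their biased counterparts, so the factor cancels in the ratio.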
• Try this example…
Is education about other ethnicities
correlated with tolerant attitudes
towards others?
Education Score (X)   Tolerance Score (Y)   XY
25                    3                     75
25                    9                     225
33                    14                    462
35                    11                    385
38                    13                    494
36                    14                    504
31                    12                    372
29                    12                    348
22                    9                     198
41                    14                    574
Sum: 315              111                   3637
Xbar = 31.5
Ybar = 11.1
N = 10
sX = 6.222
sY = 3.414
σX = 5.903
σY = 3.239
\[ r = \frac{\dfrac{1}{10-1}\left(3{,}637 - 10 \times 31.5 \times 11.1\right)}{6.222 \times 3.414} = \frac{\dfrac{1}{9}\left(3{,}637 - 3{,}496.5\right)}{21.242} = \frac{15.611}{21.242} = .735 \]
Testing Pearson’s r for Significance
– H0: ρ = 0; df = N – 2
– Using the t distribution:
\[ t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}} = \frac{.735\sqrt{10-2}}{\sqrt{1-.540}} = \frac{2.079}{.6782} = 3.07 \]
– t.05 (8) = 2.306 < 3.07, so we can reject the
null hypothesis for the correlation example
from the previous slide.
– Using the table of critical values:
• df = N – 2 (where N is the number of
pairs of scores).
• Critical r for df = 8 is .632 for a .05, two-tailed test. Because .735 > .632, H0 can be
rejected (consistent with the t test).
– Effects of N on Critical r
• For small N, you need a fairly large r to find
significance.
• For very large N, even tiny sample rs can attain
statistical significance.
• As N increases, r is a better estimate of ρ.
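The t test and the table of critical r values are two views of the same test: solving the t formula for r gives a critical r of t_crit / sqrt(t_crit² + df). The sketch below illustrates this. The function names are hypothetical, and the critical t values for df = 28 and df = 100 are taken from a standard two-tailed .05 t table rather than from these slides.

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic for testing H0: rho = 0, with df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

def critical_r(t_crit, df):
    """Smallest |r| that reaches significance, given the critical t for df."""
    return t_crit / sqrt(t_crit ** 2 + df)

# Worked example: r = .735, N = 10, critical t(8) = 2.306
t = t_from_r(0.735, 10)
print(round(t, 2), t > 2.306)

# Two-tailed .05 critical t values (df: t_crit); df = 8 is from the slide,
# the others are assumed from a standard t table.
t_table = {8: 2.306, 28: 2.048, 100: 1.984}
for df, t_crit in t_table.items():
    print(df, round(critical_r(t_crit, df), 3))
```

The critical r shrinks as df grows, which is why even a small sample r can reach significance when N is very large.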
Assumptions of the Test of
Significance for Pearson’s r
– The sample has been obtained by
independent random sampling.
– Both variables have been measured
on interval or ratio scales.
– Both variables exhibit normal
distributions in the population.
– The two variables jointly follow a
bivariate normal distribution (see
next slide).
– If one or both variables have been
measured on an ordinal scale, or the
distribution assumptions have been
severely violated, consider calculating
the Spearman rank-order correlation
coefficient (rS) as an alternative (a
special table must be used for its critical
values when N is small).
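One common way to obtain the Spearman coefficient is to convert each variable to ranks (giving tied scores the average of the ranks they span) and then apply Pearson's formula to the ranks. The sketch below assumes that approach; the function names and data are hypothetical, and it does not perform the significance test.

```python
from statistics import mean, pstdev

def average_ranks(values):
    """Rank scores from 1 (lowest); tied scores share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1   # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r from the biased covariance and population SDs."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def spearman_rs(x, y):
    """Spearman's rS: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]           # hypothetical scores
y = [v ** 3 for v in x]       # monotonic but not linear
print(round(pearson_r(x, y), 3), round(spearman_rs(x, y), 3))
```

Because rS depends only on rank order, it reaches 1.0 for any strictly increasing relationship, while Pearson's r falls short of 1.0 unless the relationship is linear.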
Limitations and Cautions
• Pearson’s r measures only the
degree of linear correlation (r can
be small even though there is a
very close curvilinear relationship
between the two variables).
• Pearson’s r can underestimate the
population correlation (ρ) if there
are:
– Restricted (truncated) ranges on one
or both variables.
– Bivariate outliers (depending on where they
fall, these can also inflate r).
• Correlation does not imply
causation!
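Two of these cautions can be illustrated with a short sketch. The first part uses made-up scores with a perfect curvilinear (quadratic) relationship; the second restricts the range of the education and tolerance scores from the earlier worked example, using a cutoff of 30 chosen purely for illustration. The function name `pearson_r` is hypothetical.

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson's r from the biased covariance and population SDs."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

# 1. A perfect curvilinear relationship with no linear trend
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [v ** 2 for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)
print(round(r_curve, 3))

# 2. Restricted range: keep only education scores of 30 or more
education = [25, 25, 33, 35, 38, 36, 31, 29, 22, 41]
tolerance = [3, 9, 14, 11, 13, 14, 12, 12, 9, 14]
r_full = pearson_r(education, tolerance)
kept = [(e, t) for e, t in zip(education, tolerance) if e >= 30]
r_restricted = pearson_r([e for e, _ in kept], [t for _, t in kept])
print(round(r_full, 3), round(r_restricted, 3))
```

In the first case y is completely determined by x, yet r is zero. In the second, dropping the low-education cases lowers r in this particular sample; a single small sample is only suggestive of the general tendency.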
Uses of Pearson’s r
– Reliability
• Test-retest reliability
• Split-half reliability
• Inter-rater reliability
– Criterion validity (e.g., of a self-report measure).
– To measure the degree of linear
association between two variables that
are not obviously related, but are
predicted by some theory or past
research to have an important
connection.
– To evaluate the results of experimental
studies in which both the DV and the
levels of the manipulated variable (IV)
have been measured on interval or
ratio scales.