Correlation

Correlation refers to a relationship that exists between pairs
of measures. Knowledge of the strength and direction of the
relationship allows us to predict one variable from the other
with an accuracy greater than chance.
For example, you can guess someone’s weight more
accurately if you know how tall they are because height and
weight are positively correlated.
When two variables are positively correlated that means they
tend to both move higher or both move lower at the same time.
Generally, taller people weigh more than shorter people.
When two variables are negatively correlated, that means that
they change in value inversely. Higher scores on one
generally go with lower scores on the other.
For example, the outdoor temperature and the weight of one’s
clothing are negatively correlated. The higher the
temperature is, the less clothing we wear. The lower the
outdoor temperature, the more clothing we wear.
Another example: How old you are and how much longer you
will live are negatively correlated.
The existence of a negative correlation does not mean the
absence of a relationship. It means that the variables tend to
move in opposite directions, not that the variables are
unrelated.
A near zero correlation is said to exist when scores on the
two variables are unrelated. Higher scores on one variable
are just as likely to be accompanied by higher scores as by
lower scores on the other variable.
An example: The street number on your house and the
odometer reading of your car.
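The three directions described above can be illustrated with a minimal Python sketch. All of the numbers are invented for illustration (they are not from these slides), and the helper names `pearson_r` and `direction`, as well as the 0.1 cutoff for "near zero", are assumptions made for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation from deviation scores: sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    return co_deviation / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def direction(r, cutoff=0.1):
    """Label the direction of r; the 0.1 cutoff is an arbitrary illustrative choice."""
    if r > cutoff:
        return "positive"
    if r < -cutoff:
        return "negative"
    return "near zero"

# Invented toy data, one pair of lists per example.
height_in = [60, 62, 65, 68, 70, 72, 75]
weight_lb = [115, 120, 140, 155, 165, 180, 200]
temp_f = [30, 40, 50, 60, 70, 80, 90]
clothing_lb = [8.0, 7.0, 5.5, 4.0, 3.0, 2.0, 1.5]
street_no = [100, 200, 300, 400, 500, 600, 700]
odometer_k = [52, 18, 75, 40, 75, 18, 52]

labels = {
    "height vs weight": direction(pearson_r(height_in, weight_lb)),
    "temperature vs clothing": direction(pearson_r(temp_f, clothing_lb)),
    "street number vs odometer": direction(pearson_r(street_no, odometer_k)),
}
for pair, label in labels.items():
    print(f"{pair}: {label}")
```

The toy odometer readings were chosen so that high and low values are spread evenly across the street numbers, which is what "unrelated" looks like in the data.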
For Each of the Following Examples, State From Your General
Knowledge Whether the Correlation Between the Two
Variables is Likely to be Positive, Negative, or Near Zero, and
Explain Why
Average Number of Calories Eaten Per Day and
Body Weight
Positive – Caloric Intake is One of the Major
Determinants of Body Weight
Golf Scores and the Number of Years of Golfing
Experience
Negative – Golfers Improve with Experience and
Hence Would Be Expected to Get Better (Lower)
Scores
Length of Hair and Shoe Size in Adult Males
Near Zero – It’s Doubtful That These Two
Measures Could Be Influenced By Common
Factors
Amount of Formal Education One Has Received
and the Time Spent Collecting Public
Assistance (Welfare)
Negative – Educated Individuals Are More Likely
to Be Employable and Hence Less Likely to Need
Welfare
Per Capita Consumption of Alcohol in a Group
of Cities and Suicide Rates in Those Cities
Positive – A Common Set of Stresses and Other
Factors Are Likely to Influence Rates of Both
Alcoholism and Suicide in a Given Community
Number Correct on a Current Events Test and
Time Spent Reading the Newspaper
Positive – The Newspapers Are Full of Stories
Concerning Current Events Around the World
Strength of Traditional Religious Beliefs and
Favorableness of Attitude Toward Abortion on
Demand
Negative – Members of Traditional Religious
Groups Are More Likely to Regard Abortion as
Immoral Than Others
Height and Political Conservatism
Near Zero
Correlations can be represented either graphically by the construction
of a special type of graph called a scatter-plot diagram or through the
computation of an index called a coefficient of correlation.
Most of the seminal work that went into the development of these
representations was done by the early statistician Karl Pearson.
Construction of a scatter-plot diagram requires acquiring a pair of
scores from each subject.
For example, suppose we wanted to look at the relationship
between height and self-esteem in men. Perhaps we have a
hypothesis that how tall you are affects your level of self-esteem.
So we collect pairs of scores from twenty male individuals:
height, measured in inches, and self-esteem, based on a self-rating
scale (where higher scores mean higher self-esteem).
Man   Height (inches)   Self Esteem
 1    68                4.1
 2    71                4.6
 3    62                3.8
 4    75                4.4
 5    58                3.2
 6    60                3.1
 7    67                3.8
 8    68                4.1
 9    71                4.3
10    69                3.7
11    68                3.5
12    67                3.2
13    63                3.7
14    62                3.3
15    60                3.4
16    63                4.0
17    65                4.1
18    67                3.8
19    63                3.4
20    61                3.6
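To show how the pairs of scores become a scatter-plot diagram, here is a rough text-only sketch in Python. It uses the twenty height and self-esteem pairs listed above. The function name `ascii_scatter` and the choice of eight rows are assumptions for illustration, and a real analysis would more likely use a plotting library.

```python
# Height (inches) and self-esteem ratings for the twenty men in the table.
heights = [68, 71, 62, 75, 58, 60, 67, 68, 71, 69,
           68, 67, 63, 62, 60, 63, 65, 67, 63, 61]
esteem = [4.1, 4.6, 3.8, 4.4, 3.2, 3.1, 3.8, 4.1, 4.3, 3.7,
          3.5, 3.2, 3.7, 3.3, 3.4, 4.0, 4.1, 3.8, 3.4, 3.6]

def ascii_scatter(xs, ys, rows=8):
    """Return a list of text lines with one '*' per (x, y) pair.

    Columns are whole inches of height (left = shortest).
    Rows are bands of self-esteem (top = highest).
    Pairs that land in the same cell share a single '*'.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    width = x_max - x_min + 1
    grid = [[" "] * width for _ in range(rows)]
    for x, y in zip(xs, ys):
        col = x - x_min
        row = round((y_max - y) / (y_max - y_min) * (rows - 1))
        grid[row][col] = "*"
    return ["".join(line) for line in grid]

plot = ascii_scatter(heights, esteem)
for line in plot:
    print("|" + line)
print("+" + "-" * len(plot[0]))
```

Each subject contributes exactly one point, placed by his height (horizontal position) and his self-esteem score (vertical position). A cloud of points rising from lower left to upper right suggests a positive correlation.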
Linear Regression
[Figure: scatter plot of Self Esteem against Height for the twenty men, with a fitted line labeled "Best Fitting Line" and "Line of Prediction".]
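As a hedged sketch of how a best fitting line (line of prediction) can be obtained, the Python below fits an ordinary least-squares line to the twenty height and self-esteem pairs from the table. The slides do not state which fitting method was used, so least squares is an assumption here, as are the names `fit_line` and `predict_esteem`.

```python
# Height (inches) and self-esteem ratings for the twenty men in the table.
heights = [68, 71, 62, 75, 58, 60, 67, 68, 71, 69,
           68, 67, 63, 62, 60, 63, 65, 67, 63, 61]
esteem = [4.1, 4.6, 3.8, 4.4, 3.2, 3.1, 3.8, 4.1, 4.3, 3.7,
          3.5, 3.2, 3.7, 3.3, 3.4, 4.0, 4.1, 3.8, 3.4, 3.6]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx_dev = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy_dev / sum_xx_dev
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(heights, esteem)

def predict_esteem(height_in):
    """Predicted self-esteem score for a given height in inches."""
    return intercept + slope * height_in

print(f"predicted self-esteem = {intercept:.3f} + {slope:.4f} * height")
print(f"prediction for a 70-inch man: {predict_esteem(70):.2f}")
```

For these data the slope comes out at roughly 0.07 self-esteem points per inch of height. The line only describes the trend in this sample; it says nothing about why the two measures move together.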
Coefficient of Correlation
A statistical computation that indicates the strength
and direction of an underlying correlation
Always results in a signed number in the range from -1.00 to
+1.00. If the sign is positive, the underlying relationship is a
positive correlation. If the sign is negative, the underlying
relationship is a negative correlation. The closer the value is
to 1 (either positive or negative), the stronger the indicated
underlying relationship.
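As an illustration of the computation, the sketch below applies the standard raw-score formula for Pearson's r to the twenty height and self-esteem pairs from the table earlier in this presentation. The function name `pearson_r` is a choice made for this example. The slides themselves do not show a formula, so treat this as one common way to compute the index rather than the presenter's own method.

```python
from math import sqrt

# Height (inches) and self-esteem ratings for the twenty men in the table.
heights = [68, 71, 62, 75, 58, 60, 67, 68, 71, 69,
           68, 67, 63, 62, 60, 63, 65, 67, 63, 61]
esteem = [4.1, 4.6, 3.8, 4.4, 3.2, 3.1, 3.8, 4.1, 4.3, 3.7,
          3.5, 3.2, 3.7, 3.3, 3.4, 4.0, 4.1, 3.8, 3.4, 3.6]

def pearson_r(xs, ys):
    """Pearson's r from raw sums:
    (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(heights, esteem)
print(f"r = {r:+.2f}")
```

For these twenty pairs the result is roughly +.73: the positive sign says taller men in this sample tended to report higher self-esteem, and the magnitude says the relationship is fairly strong but far from perfect.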
r = +.86
Sign (+): Direction
Magnitude (.86): Strength
r = +.86
r = +.31
r = -.96   Strongest
r = +.04   Weakest
r = +1.02  Computational Error (outside the range -1.00 to +1.00)
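The reasoning behind those labels can be sketched in a few lines of Python: strength is judged by the absolute value, ignoring the sign, and any value outside -1.00 to +1.00 signals a mistake in the computation. The helper name `is_valid` is an assumption for this example.

```python
# The five coefficients shown on the slide.
coefficients = [0.86, 0.31, -0.96, 0.04, 1.02]

def is_valid(r):
    """A coefficient of correlation must lie between -1.00 and +1.00 inclusive."""
    return -1.0 <= r <= 1.0

valid = [r for r in coefficients if is_valid(r)]
invalid = [r for r in coefficients if not is_valid(r)]

# Strength depends on distance from zero, so compare absolute values.
strongest = max(valid, key=abs)
weakest = min(valid, key=abs)

print(f"strongest: r = {strongest:+.2f}")
print(f"weakest:   r = {weakest:+.2f}")
print(f"impossible values (computational error): {invalid}")
```

Note that -.96 beats +.86 even though it is negative: the sign carries only the direction of the relationship.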
Direction?
Strength?
Prediction?
Correlation Does Not Imply Causation
The Third Variable Problem
The Problem of Directionality
The Third Variable Problem
Refers to the possibility that two variables are
correlated with each other, not because one
causes the other, but because both are effects
of some third unidentified cause.
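A small simulation can make the third variable problem concrete. In the Python sketch below, two variables are each generated from a shared cause plus independent noise, and neither influences the other. Everything here is simulated and hypothetical: the variable names, the sample size, and the random seed are assumptions for illustration. The partial correlation formula used at the end is the standard first-order one.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from deviation scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )

random.seed(0)  # fixed seed so the simulated numbers are repeatable
n = 2000

# One unobserved cause drives both effects; the effects never touch each other.
common_cause = [random.gauss(0, 1) for _ in range(n)]
effect_a = [c + random.gauss(0, 1) for c in common_cause]
effect_b = [c + random.gauss(0, 1) for c in common_cause]

r_ab = pearson_r(effect_a, effect_b)
r_ac = pearson_r(effect_a, common_cause)
r_bc = pearson_r(effect_b, common_cause)

# Correlation between the two effects after statistically removing the cause.
partial_ab = (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

print(f"r between the two effects:        {r_ab:+.2f}")
print(f"same, controlling for the cause:  {partial_ab:+.2f}")
```

The two effects come out clearly correlated even though neither causes the other, and the correlation largely disappears once the shared cause is taken into account. In real research the third variable is often unidentified, so this kind of adjustment is not always available.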
The Problem of Directionality
Even when two variables are correlated because of a
causal relationship between them, we cannot tell from the
correlational data alone which is the cause and which is
the effect.
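One way to see why the data alone cannot settle the direction: the coefficient of correlation is symmetric, so swapping the roles of the two variables gives exactly the same number. The sketch below checks this with the height and self-esteem pairs from the table earlier in the presentation. The function name `pearson_r` is a choice made for this example.

```python
from math import isclose, sqrt

# Height (inches) and self-esteem ratings for the twenty men in the table.
heights = [68, 71, 62, 75, 58, 60, 67, 68, 71, 69,
           68, 67, 63, 62, 60, 63, 65, 67, 63, 61]
esteem = [4.1, 4.6, 3.8, 4.4, 3.2, 3.1, 3.8, 4.1, 4.3, 3.7,
          3.5, 3.2, 3.7, 3.3, 3.4, 4.0, 4.1, 3.8, 3.4, 3.6]

def pearson_r(xs, ys):
    """Pearson's r computed from deviation scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )

r_height_then_esteem = pearson_r(heights, esteem)
r_esteem_then_height = pearson_r(esteem, heights)

print(f"height -> esteem: {r_height_then_esteem:+.4f}")
print(f"esteem -> height: {r_esteem_then_height:+.4f}")
print("same value:", isclose(r_height_then_esteem, r_esteem_then_height))
```

Because the statistic treats the two variables interchangeably, it cannot favor "height affects self-esteem" over the reverse. That judgment has to come from outside the correlational data.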
Why Do Correlation Research?
When we are dealing with variables that we have not yet
learned to directly control
When we are dealing with variables that it would not be
ethical to directly control
Reasons of economy (cheaper, faster, easier), because we
are analyzing data that already exist rather than creating
data through our own experimentation
Prelude to Experimentation