Correlation doesn’t mean causation

+
Stats
Chapter 7: Correlation doesn’t mean causation
+
Do Now:

The following graph shows ages of men and women who are
married to each other. Based on the graph, would calculating
the correlation coefficient be appropriate? Why or why not?
+
+
Pirates are causing global warming?
+
EVIL MAN PIRATE.
+
Cheese is killing people?
+
Margarine killing the love?
+
Political Action Committees are killers?
+
Skiing facilities! NO!!!!
+
Hmmmm….

Do NOT confuse association, correlation and causation.
+
Association

Deliberately vague term to describe the relationship
between two variables.

This can be used for CATEGORICAL data

Ex: Did you know there’s a strong correlation between playing an
instrument and drinking coffee…. NO. It’s an association!
+
Correlation

Precise term describing the strength and direction of the
linear relationship between two quantitative variables.
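
To make the definition concrete, here is a minimal sketch of how the correlation coefficient can be computed by hand. The function name `pearson_r` and the data are made up for illustration; they are not from the chapter.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator pieces: sums of squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied vs. quiz score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = pearson_r(hours, scores)
print(round(r, 3))
```

The sign of `r` gives the direction, and how close it is to -1 or 1 gives the strength. Both inputs have to be lists of numbers, which is why this only works for quantitative variables.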
+
Causation

Scatterplots and correlation coefficients NEVER prove
causation.


This is why it is so hard to prove that one thing causes
another. For example, just because lung cancer and smoking are
correlated doesn’t mean that one causes the other, and if one
does cause the other, you still need to consider both directions.

Does smoking cause cancer?

OR

Does cancer cause smoking?
You need to do further tests to actually figure this out.
+
Example 1

A news reporter claims “There appears to be a strong
correlation between whether a person owns a pet and the
condition of that person’s yard.”

Thoughts on this?
+
Example 2:

A researcher studies children in elementary school and finds
a strong positive linear association between height and
reading scores.

Does this mean that taller children are generally better readers?

What might explain the strong correlation?
+
What’s wrong? Example 3

The correlation coefficient between Olympic gold medal
times for the 800m hurdles and year is -0.66 seconds per
year.
+
Answer

The correlation coefficient has NO UNITS.
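
One way to see why, assuming the usual z-score form of the formula (the symbols below are standard notation, not taken from the slides):

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each deviation $x_i - \bar{x}$ is in the units of x, and so is the standard deviation $s_x$, so the units cancel inside the first fraction. The same happens for y. What is left is a pure number.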
+
What’s wrong? Example 4

The correlation coefficient between Olympic gold medal
times for the 100m dash and year is -1.37.
+
Answer

A correlation coefficient of -1.37 is impossible! The correlation
coefficient will always be between -1 and 1!
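
A quick way to see the limits, sketched with a hand-written helper (`pearson_r` is an illustrative name, and the data are made up): points that sit exactly on a line give 1 or -1, and anything messier lands in between.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4]
r_up = pearson_r(x, [2, 4, 6, 8])     # points exactly on a rising line
r_down = pearson_r(x, [8, 6, 4, 2])   # points exactly on a falling line
r_mixed = pearson_r(x, [2, 1, 4, 3])  # scattered points
print(r_up, r_down, r_mixed)          # 1.0 -1.0 0.6
```

A perfect line is the strongest linear relationship there is, so no data set can push the value past these two extremes.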
+
What’s wrong? Example 5

Since the correlation coefficient between Olympic gold
medal times for the 800m hurdles and 100m dash is -0.41, the
correlation coefficient between times for the 100m dash and
the 800m hurdles will be 0.41.
+
Answer

Correlation does not change if we reverse the x and y variables.
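
This can be checked with a small sketch. The helper name `pearson_r` and the two lists are made up for illustration; swapping which list plays x and which plays y leaves the result the same.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements.
a = [1, 2, 3, 4, 5]
b = [9, 7, 8, 4, 3]

r_ab = pearson_r(a, b)  # a as x, b as y
r_ba = pearson_r(b, a)  # b as x, a as y
print(r_ab, r_ba)       # the two values match, sign included
```

The formula multiplies the x and y deviations together, and multiplication does not care about order, so the sign cannot flip.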
+
What’s wrong? Example 6

If we were to measure Olympic gold medal times for the
800m hurdles in minutes instead of seconds, the correlation
coefficient would be -0.66/60 = -0.011.
+
Answer

Correlation does not change when we change units!
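
Here is a minimal sketch of that idea. The helper name `pearson_r` is illustrative, and the years and times are invented for the example; they are NOT real Olympic results.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical winning times, invented for illustration only.
years = [1980, 1984, 1988, 1992, 1996]
seconds = [112.0, 111.2, 110.9, 110.1, 109.8]
minutes = [s / 60 for s in seconds]  # same times, different units

r_sec = pearson_r(years, seconds)
r_min = pearson_r(years, minutes)
print(r_sec, r_min)  # the same up to rounding error
```

Dividing every time by 60 shrinks the deviations and the spread by the same factor, so the factor cancels out of the formula.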
+
Practice Problems

Try some on your own!

As always, call me over if you are confused!
+
Exit Ticket

Explain the error:

Your friend claims that the correlation coefficient between
what continent you live on and how many hours you sleep is
0.14.