Nonlinear Correlation

Correlation with a Non-Linear Emphasis: Day 2

• Correlation measures the strength of the linear association between two quantitative variables.
• Before you use correlation, you must check several conditions:
• Quantitative Variables Condition: Are both variables quantitative?
• Straight Enough Condition: Is the form of the scatterplot straight enough that a linear relationship makes sense? If the relationship is not linear, the correlation will be misleading.
• Outlier Condition: Outliers can distort the correlation dramatically. If an outlier is present, it is often good to report the correlation with and without that point.
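The Outlier Condition can be illustrated with a short calculation. This is only a sketch: the six data points are hypothetical (five points on a perfect line plus one outlier), and pearson_r is a helper written for this illustration, not something from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: five points on the line y = 2x, then one outlier.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 0]

r_with = pearson_r(xs, ys)                # all six points
r_without = pearson_r(xs[:-1], ys[:-1])   # outlier dropped

print(f"r with outlier:    {r_with:.3f}")
print(f"r without outlier: {r_without:.3f}")
```

Reporting both numbers side by side shows how much a single point is driving the correlation.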
Warning: Correlation ≠ Causation
• A hidden variable that stands behind a relationship and determines it by simultaneously affecting the other two variables is called a lurking (confounding) variable.
• Scatterplots and correlation coefficients NEVER prove causation.
• Don’t ever assume the relationship is linear just because the correlation coefficient is high.
• To determine whether a relationship is linear or not, we must always look at the residual plot.
Residuals
• A residual is the vertical distance between a data point and the graph of a regression equation.
The residual is
• positive if the data point is above the graph.
• negative if the data point is below the graph.
• 0 only when the graph passes through the data point.
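The sign rule above can be sketched in a few lines. The line y = 2x + 1 and the three points are hypothetical values chosen for illustration, and residual is a helper name introduced here.

```python
def residual(x, y, slope, intercept):
    """Observed y minus the y predicted by the line at the same x."""
    return y - (slope * x + intercept)

# Hypothetical regression line y = 2x + 1 and three hypothetical points.
slope, intercept = 2, 1
points = [(1, 5), (2, 4), (3, 7)]

for x, y in points:
    e = residual(x, y, slope, intercept)
    position = "above" if e > 0 else "below" if e < 0 else "on"
    print(f"({x}, {y}): residual = {e}, point is {position} the line")
```

Each point lands in one of the three cases: above the line, below it, or exactly on it.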
What should you look for to tell if it is not linear?
• Sometimes a high “r” value for linear regression is deceptive. You must look at the scatterplot AND at the pattern the residuals make.
• If the residuals have a curved pattern, then the relationship is NOT linear.
To prove linearity
• A scatterplot of the residuals vs. the x-values should be the most boring scatterplot you’ve ever seen.
• It shouldn’t have any interesting features, like a direction or a shape.
• It should stretch horizontally, with about the same amount of scatter throughout.
• It should show no bends.
• It should show no outliers.
Some Non-Linear Regression Shapes
• Positive Quadratic Regression: [graph not shown]
• Negative Quadratic Regression: [graph not shown]
More Non-Linear Regression Shapes
• Positive Exponential Regression: [graph not shown]
• Negative Exponential Regression: [graph not shown]
Quadratic and Exponential on GDC
• Quadratic: [calculator screen not shown]
• Exponential: [calculator screen not shown]
Example: The scatterplot could possibly be linear. You must check the residual pattern.
x     y
5     16.3
10    9.7
15    8.1
20    4.2
45    1.9
25    3.4
60    1.3
NOTE: RESID is found under 2nd STAT AFTER doing a LINEAR REGRESSION.
• Change the y-list to RESID after running a linear regression (2nd STAT, RESID).
• Notice the curved pattern in the residuals.
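The same residual check can be reproduced away from the calculator. The sketch below fits a least-squares line to the example data and lists the residuals in order of increasing x. It is an illustration in plain Python, not the calculator's own procedure.

```python
# Example data from the table above.
xs = [5, 10, 15, 20, 45, 25, 60]
ys = [16.3, 9.7, 8.1, 4.2, 1.9, 3.4, 1.3]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Least-squares slope and intercept of the line y = intercept + slope * x.
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = observed y minus predicted y, keyed by x.
residuals = {x: y - (intercept + slope * x) for x, y in zip(xs, ys)}

for x in sorted(residuals):
    print(f"x = {x:2d}  residual = {residuals[x]:+.2f}")
```

With these data the residuals are positive at the two ends (x = 5 and x = 60) and negative in between. That is the curved pattern the slide points out.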
NOTE!
• Just because the curved pattern in the residuals looks quadratic, we cannot conclude that it is quadratic until we check the “r” value of other curved functions and see how well the data fit.
• You should also consider “real-life” implications when deciding.
• When you see that the residuals are curved, you must check the correlation coefficient for both the exponential and the quadratic regression and choose the stronger correlation.
• A check on the exponential regression yields an r-value of -0.956. (Strong negative, but check out the quadratic.)
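One way to check this r-value by hand is to correlate x with ln(y), since an exponential model y = a·b^x becomes a straight line after taking logs. The sketch below assumes the calculator's exponential regression works on the log-transformed data in this way. That is an assumption, not something stated on the slides.

```python
from math import log, sqrt

# Example data from the table above.
xs = [5, 10, 15, 20, 45, 25, 60]
ys = [16.3, 9.7, 8.1, 4.2, 1.9, 3.4, 1.3]

# y = a * b**x  is equivalent to  ln(y) = ln(a) + x * ln(b),
# so the linear correlation is measured between x and ln(y).
log_ys = [log(y) for y in ys]

n = len(xs)
mean_x, mean_ly = sum(xs) / n, sum(log_ys) / n
sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((ly - mean_ly) ** 2 for ly in log_ys)

r_exp = sxy / sqrt(sxx * syy)
print(f"r for x vs ln(y): {r_exp:.3f}")
```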
This is a quadratic regression.
• Equation: y = 0.00946x² - 0.839x + 18.5
r = 0.966
This value is even stronger than the exponential.
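The quadratic fit can also be sketched without a calculator by solving the least-squares normal equations for y = ax² + bx + c. The helper solve3 is written for this illustration. Treating the square root of R² as the counterpart of the slide's r is an assumption.

```python
from math import sqrt

# Example data from the table above.
xs = [5, 10, 15, 20, 45, 25, 60]
ys = [16.3, 9.7, 8.1, 4.2, 1.9, 3.4, 1.3]

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    coeffs = [0.0] * 3
    for r in (2, 1, 0):  # back substitution
        known = sum(m[r][c] * coeffs[c] for c in range(r + 1, 3))
        coeffs[r] = (m[r][3] - known) / m[r][r]
    return coeffs

# Sums needed by the normal equations.
s = [sum(x ** k for x in xs) for k in range(5)]                   # sum of x^0 .. x^4
t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]   # sum of y*x^0 .. y*x^2

a, b, c = solve3(
    [[s[4], s[3], s[2]],
     [s[3], s[2], s[1]],
     [s[2], s[1], s[0]]],
    [t[2], t[1], t[0]],
)

# Goodness of fit: R^2 = 1 - SS_res / SS_tot, and its square root.
mean_y = sum(ys) / len(ys)
ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot
r = sqrt(r_squared)

print(f"y = {a:.5f}x^2 + ({b:.3f})x + {c:.1f}")
print(f"R^2 = {r_squared:.3f}, sqrt(R^2) = {r:.3f}")
```

The printed coefficients and the square root of R² can be compared with the equation and r-value on the slide.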
Example 2: Is it linear?
x     y
0     1
-3    0.125
-4    0.0625
3     8
4     16
5     32
Look at the residuals.
• There is a curved pattern in the residuals. It is NOT linear; it is either quadratic or exponential (positive).
• Use the “r” value to help you decide.
And the Winner is...
• Here is the equation you should use for predictions: y = 1(2)^x
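As a check on the winning equation, the sketch below fits y = a·b^x to the Example 2 data by running a least-squares line through (x, ln y). The log-linear approach is one common way to do exponential regression and is assumed here. The slides do not say how the calculator computes it.

```python
from math import exp, log

# Example 2 data from the table above.
xs = [0, -3, -4, 3, 4, 5]
ys = [1, 0.125, 0.0625, 8, 16, 32]

# Fit ln(y) = ln(a) + x * ln(b) by least squares.
log_ys = [log(y) for y in ys]
n = len(xs)
mean_x, mean_ly = sum(xs) / n, sum(log_ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))

slope = sxy / sxx
intercept = mean_ly - slope * mean_x

# Undo the log transform to recover a and b in y = a * b**x.
a, b = exp(intercept), exp(slope)
print(f"y = {a:.3f} * ({b:.3f}) ** x")
```

Every y in the table is exactly 2 raised to its x, so the fit should recover a = 1 and b = 2 up to rounding.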
Homework
• Follow the flowchart.