Analyzing Bivariate Data With Fathom

Correlation and Linear Regression
Analyzing Bivariate Data
With Fathom*
CFU 3102.5.10 Using technology with a set of contextual linear data to
examine the line of best fit;
determine and interpret the correlation coefficient.
Andy Wilson – APSU – wilsona@apsu.edu
Jackie Vogel – APSU – vogelj@apsu.edu
*Adapted from Preparing to Teach Mathematics with Technology by Lee, Hollebrands & Wilson, 2012
Open the 2006Vehicle.ftm file.
Is there a relationship between City mpg and Hwy mpg?
Click on the 2006 Vehicles Collection Icon and then drag
a table to the workspace. Be sure “Prevent Changing
Values in Graphs” is selected in the Collection menu.
Explore the data set.
Drag two blank graphs to the workspace and create dot
plots for each attribute by dragging each column
heading in the table to each x-axis on the graphs.
Your screen should look something like this.
Click on a point in one of the graphs and notice what happens in
the other graph and the table. You can also select several cases
by shift-clicking or clicking and dragging a rectangle around
multiple points.
What do you anticipate might be a reasonable relationship
between City mpg and Hwy mpg?
In the upper right corner of each graph, use the pull-down menu
to display box plots. Click on the lower whisker in one box plot
and notice the location of the highlighted cases in the other box
plot. Repeat for other cases. What do you notice?
Drag the label (Hwy) from the x-axis and drop it on the y-axis.
Change the window sizes and orient the two box plots as shown.
Drag a new graph to the workspace and drag Hwy from the table
to the y-axis and City to the x-axis to make a Scatterplot as shown.
Select the graph window, then use Plot Value with mean(City) and
Plot Function with mean(Hwy). Match up the scales by dragging.
Use form (linear, quadratic, exponential, etc.), direction (positive or
negative), and strength (weak, moderate, or strong) to describe the
relationship between City and Hwy mpg.
Describe the location of the data points in relation to the mean
City and mean Hwy mpg. What does this tell you about the
general trend of the data? The correlation coefficient, r, is given
below. Use the scatterplot to discuss the sign of r (positive or
negative) for the mpg data.
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$$
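As a rough illustration of how the formula for r works, the sketch below computes it term by term in Python. The City and Hwy values are hypothetical toy numbers chosen for this example, not the contents of 2006Vehicle.ftm, and the function name `correlation` is likewise just an illustrative choice.

```python
from math import sqrt

# Hypothetical toy values, NOT taken from 2006Vehicle.ftm.
city = [18, 20, 22, 24, 26, 30]
hwy = [25, 27, 30, 31, 34, 39]

def correlation(xs, ys):
    """Correlation coefficient r computed directly from the formula."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations from the two means.
    # A point above both means (or below both) contributes a positive
    # product; a point above one mean and below the other, a negative one.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = correlation(city, hwy)
print(round(r, 3))
```

Because most points in an increasing scatterplot lie either above both means or below both means, most products in the numerator are positive, which is why r comes out positive for data like these.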
Open the Correlation.ftm
file. Move the slider back
and forth and notice how
the scatterplot changes
with the value of r.
Use the slider to create a
scatterplot that can help
you estimate a value of
the correlation coefficient
for the relationship
between City mpg and
Hwy mpg.
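For readers without the Correlation.ftm file, the slider can be loosely imitated in Python. This is a sketch under assumptions, not how Fathom generates its plot: it simulates standardized data whose correlation is close to a chosen target value, using the hypothetical helper names `simulate` and `correlation`.

```python
import random
from math import sqrt

def simulate(target_r, n=2000, seed=1):
    """Generate n (x, y) pairs whose correlation is close to target_r."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    # y mixes x with independent noise; the weights are chosen so that
    # the theoretical correlation between x and y equals target_r.
    noise_weight = sqrt(1 - target_r ** 2)
    ys = [target_r * x + noise_weight * rng.gauss(0, 1) for x in xs]
    return xs, ys

def correlation(xs, ys):
    """Correlation coefficient r from sums of deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Try several "slider positions" and compare the sample r with the target.
for target in (-0.9, 0.0, 0.5, 0.95):
    xs, ys = simulate(target)
    print(target, round(correlation(xs, ys), 2))
```

Plotting the simulated pairs for each target value gives a picture similar to what the slider shows: the cloud of points tightens around a line as the target moves toward -1 or +1.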
The linear correlation
coefficient, r, measures
the strength and direction
of the linear association
between two variables.
• r is always between -1 and +1
• The sign of r indicates a positive or negative association;
r near 0 indicates no linear association
Go back to the 2006Vehicle.ftm file. Drag an
empty Summary object to the workspace.
Click and drag the City and Hwy attributes as
shown. The default measure that will be
displayed is the correlation between the two
attributes.
Compare the calculated correlation coefficient with the one you
estimated using the Correlation.ftm file.
What does this value of the correlation coefficient imply about
the relationship between City and Hwy mpg?
Since we have high correlation, it makes sense to try to use a
linear function to model the vehicle data. Click on the scatterplot
and under the Graph menu, select Add Movable Line. Dragging
the line by its middle changes the intercept (translates the line)
while dragging by either end changes the slope (rotates the line).
Note that the equation is displayed at the bottom of the window.
Adjust your line so that you feel it best models the data.
Interpret the slope and y-intercept in the equation of your linear
model. Use your equation to predict the Hwy mpg for a vehicle
with City mpg of 31.
Is the slope of your line the same as the value of r? Should they
be the same? Why or why not?
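The prediction step can be sketched in Python as follows. The slope and intercept here are hypothetical values picked only for illustration; substitute the values from the equation displayed below your own movable line.

```python
# Hypothetical movable-line parameters chosen for illustration only;
# read your own values from the equation shown below your scatterplot.
slope = 1.1       # predicted change in Hwy mpg per 1 mpg increase in City mpg
intercept = 5.0   # predicted Hwy mpg when City mpg is 0 (an extrapolation)

def predict_hwy(city_mpg):
    """Predicted Hwy mpg from the linear model Hwy = slope * City + intercept."""
    return slope * city_mpg + intercept

# Prediction for a vehicle with City mpg of 31.
print(predict_hwy(31))

# The difference between predictions one City mpg apart equals the slope.
print(predict_hwy(32) - predict_hwy(31))
```

The second printed value shows what the slope means in context: it is the change in predicted Hwy mpg for each additional 1 mpg in the city, measured in the units of the data.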
One common method that is used
for finding a linear model is to
minimize the deviations of the actual
data points from the predicted
values. Visually, these are the
vertical distances between the
actual data points and the line.
These vertical deviations are called residuals.
A linear model that minimizes the
sum of the squares of the
residuals is called the Least Squares
regression line.
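To make the idea of vertical deviations concrete, the following sketch computes the residuals and their sum of squares for a candidate line. The data are hypothetical toy values, not the 2006 vehicle data, and the two candidate lines are arbitrary choices for comparison.

```python
# Hypothetical toy values, NOT taken from 2006Vehicle.ftm.
city = [18, 20, 22, 24, 26, 30]
hwy = [25, 27, 30, 31, 34, 39]

def residuals(xs, ys, slope, intercept):
    """Vertical deviation (actual y minus predicted y) for each data point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def sum_of_squares(xs, ys, slope, intercept):
    """Sum of the squared residuals for the line y = slope * x + intercept."""
    return sum(e ** 2 for e in residuals(xs, ys, slope, intercept))

# Compare two arbitrary candidate lines; the smaller sum is the better fit
# under the least squares criterion.
for slope, intercept in [(1.0, 8.0), (1.2, 3.0)]:
    print(slope, intercept, round(sum_of_squares(city, hwy, slope, intercept), 2))
```

Squaring keeps positive and negative residuals from cancelling each other, so a line cannot look good merely by overshooting some points and undershooting others.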
Double-click the equations below the scatterplot that contain
mean(Hwy) and mean(City). Delete each expression and click
OK. This removes the lines for the means, which is necessary
for the next step.
Click on the graph with the
movable line and select Show
Squares from the Graph menu.
Notice the sum of the squares is
computed and displayed below
the equation of the line.
Manipulate the movable line to
minimize the sum of squares.
Click on the graph and select Least-Squares Line from the Graph
menu. Compare the function rule for the least squares line with
the function rule for your estimated linear model (movable line).
Also compare the sum of squares.
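The comparison can also be tried outside Fathom. The sketch below uses the standard closed-form formulas for the least squares slope and intercept and compares the resulting sum of squares with that of a hand-placed line. The data and the guessed line are hypothetical, and this is not a description of how Fathom computes its line internally.

```python
# Hypothetical toy values, NOT taken from 2006Vehicle.ftm.
city = [18, 20, 22, 24, 26, 30]
hwy = [25, 27, 30, 31, 34, 39]

def sum_of_squares(xs, ys, slope, intercept):
    """Sum of the squared residuals for the line y = slope * x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    """Slope and intercept of the least squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    # The least squares line always passes through (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return slope, intercept

ls_slope, ls_intercept = least_squares_line(city, hwy)
guess_slope, guess_intercept = 1.0, 8.0  # hypothetical hand-placed movable line

ss_ls = sum_of_squares(city, hwy, ls_slope, ls_intercept)
ss_guess = sum_of_squares(city, hwy, guess_slope, guess_intercept)

print("least squares:", round(ls_slope, 3), round(ls_intercept, 3), round(ss_ls, 3))
print("hand-placed:  ", guess_slope, guess_intercept, round(ss_guess, 3))
```

No hand-placed line can produce a smaller sum of squares than the least squares line, so the first sum printed is never larger than the second.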
From the Graph menu, choose
Remove Movable Line. Use
the regression line to find the
predicted Hwy mpg for the
Ford Ranger Manual and the Ford
Ranger Automatic.
Is the least squares line a good
model for the 2006 Vehicle data?