Top 10 Things to Remember about Summarizing Bivariate Data
10. Always make a picture (scatterplot of data, residual plot).
9. Identify the explanatory and response variables (either in your equation or by defining the variables separately).
True or False: Pearson’s correlation coefficient, r, does not depend on the units of measurement of the two variables.
True or False: The value of Pearson’s correlation coefficient, r, is always between 0 and 1.
8. If the goal is to describe the strength of the relationship, report the correlation coefficient (strength, direction, form, and unusual features, always in context).
7. If the goal is to predict, report the LSRL and the coefficient of determination.
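
As a rough illustration of item 7, the sketch below fits a least-squares line to a small made-up data set and reports the coefficient of determination. The data values and the helper name `lsrl` are assumptions for illustration only; they do not come from the slides.

```python
from math import sqrt


def lsrl(xs, ys):
    """Return (a, b, r) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept: the line passes through the means
    r = sxy / sqrt(sxx * syy)  # Pearson's correlation coefficient
    return a, b, r


# Hypothetical data, chosen only to keep the arithmetic simple.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, r = lsrl(xs, ys)
r_squared = r ** 2
print(f"predicted y = {a:.2f} + {b:.2f}x, r^2 = {r_squared:.2f}")
```

When reporting such a fit, the line and r^2 would still need to be interpreted in the context of the variables.
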
True or False: The slope of the least squares line is the average amount by which y increases as x increases by one unit.
True or False: The slopes of the LSRL for predicting y from x, and the LSRL for predicting x from y, are equal.
6. Explain what the slope b and y-intercept a mean in CONTEXT in the regression line predicted y = a + bx.
5. Beware of extrapolation (do not assume that a linear model is valid over a wider range of x values than the data cover).
True or False: The LSRL passes through the point (x̄, ȳ).
True or False: The coefficient of determination is equal to the positive square root of Pearson’s r.
4. A correlation coefficient of 0 does not necessarily imply that there is no relationship between two variables (the relationship could be strong but not linear).
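
A minimal sketch of item 4, using made-up points that lie exactly on the curve y = x^2: the relationship is perfectly predictable, yet Pearson's r works out to 0 because the pattern is not linear. The helper name `pearson_r` and the data are assumptions for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Hypothetical data: x values symmetric about 0, y determined exactly by x.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

A scatterplot of these points would show the curve immediately, which is one reason to plot the data before relying on r.
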
3. Watch out for influential observations. (An influential point pulls the LSRL toward itself, so its residual can be close to 0 in the residual plot.)
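
The sketch below illustrates item 3 with made-up data: five clustered points plus one hypothetical point far to the right. Fitting the line with and without the far point shows how much it changes the slope, while its own residual stays smaller than the largest residual among the other points. The names `fit_line`, `base`, and `far_point` are assumptions for illustration.

```python
def fit_line(points):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical data: a loose cluster and one point with an extreme x value.
base = [(1, 3), (2, 1), (3, 2), (4, 3), (5, 1)]
far_point = (20, 15)

a0, b0 = fit_line(base)                # fit without the far point
a1, b1 = fit_line(base + [far_point])  # fit with the far point

# Residuals (observed minus predicted) from the fit that includes the far point.
residuals = [y - (a1 + b1 * x) for x, y in base + [far_point]]
far_residual = residuals[-1]
largest_other = max(abs(e) for e in residuals[:-1])

print(f"slope without far point: {b0:.3f}")
print(f"slope with far point:    {b1:.3f}")
print(f"far point residual: {far_residual:.3f}, largest other residual: {largest_other:.3f}")
```

Comparing the two fits in this way is a simple check for influence; a residual plot alone may not flag the point.
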
True or False: If |r| = 1, the standard deviation of y is equal to the standard deviation of the residuals.
True or False: The standard deviation about the LSRL is roughly the typical amount by which an observation deviates from the least squares line.
True or False: A transformation (or reexpression) of a variable is accomplished by substituting a function of the variable in place of the variable for further analysis.
True or False: The higher the value of the coefficient of determination, the greater the evidence for a causal relationship between x and y.
2. Correlation does not imply causation. A strong correlation implies only that the two variables tend to vary together in a predictable way.
And the #1 thing to remember...
1. Only use QUANTITATIVE data when comparing bivariate data.
Plot your data. (scatterplot)
Interpret what you see. (direction, form, strength, outliers)
Numerical summary? (x̄, ȳ, s_x, s_y, and r)
Mathematical model? (Regression line)
How well does it fit? (Residuals and r²)
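
The checklist above can be sketched in code as follows, assuming a small made-up data set. The sketch computes the numerical summary, builds the regression line from it using the standard relations b = r·s_y/s_x and a = ȳ − b·x̄, and then computes the residuals and r². All variable names and data values are assumptions for illustration.

```python
from statistics import mean, stdev

# Hypothetical data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Numerical summary: means, sample standard deviations, and r.
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)) / (n - 1)

# Mathematical model: the LSRL built from the summary statistics.
b = r * s_y / s_x
a = y_bar - b * x_bar

# How well does it fit? Residuals (observed minus predicted) and r^2.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
r_squared = r ** 2

print(f"predicted y = {a:.2f} + {b:.2f}x")
print("residuals:", [round(e, 2) for e in residuals])
print(f"r^2 = {r_squared:.2f}")
```

The residuals from a least-squares fit sum to zero (up to rounding), so it is their pattern in a residual plot, not their total, that indicates whether a line is appropriate.
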
1) Given this residual plot, which of the following is not a correct conclusion?
[Residual plot: residuals vs. fitted values, with one point labeled A]
a) The pattern in the residuals indicates the regression line does not fit the data well.
b) Point A is a candidate as an outlier.
c) Point A is a candidate as an influential point.
d) The relationship between the variables is positive.
e) All of these are correct.
2) Which of the following residual plots indicates a reasonable fit to a given set of data?
[Four residual plots, labeled a) through d)]
e) None of these indicates a reasonable fit.
3) Which of the following is a correct conclusion based on the residual plot displayed?
[Residual plot: residuals vs. fitted values]
a) The line overestimates the data.
b) The line underestimates the data.
c) It is not appropriate to fit a line to these data since there is clearly no correlation between the variables.
d) The data are not related.
e) None of these choices is correct.
5) You are given the regression equation: temperature = 30.4 - 0.72(distance), where temperature is the temperature displayed on a sensor in °C and distance is the distance in centimeters from the sensor to a heat source. Which of the following is not a reasonable conclusion?
a) The temperature of the heat source is approximately 30.4°C.
b) The temperature decreases approximately 0.72°C for each centimeter the sensor is moved away from the heat source.
c) We can predict that the sensor displays a temperature of 21.76°C when the sensor is 12 centimeters away from the heat source.
d) The correlation coefficient between temperature and distance indicates a negative relationship.
e) All of these are reasonable.
7) If the correlation coefficient of a bivariate set of data {(x,y)} is r, then which of the following is true?
a) The variables x and y are linearly related.
b) The correlation coefficient of the set {(y,x)} is also r.
c) The correlation coefficient of the set {(x,ay)} is also a·r.
d) The correlation coefficient of the set {(ax,ay)} is also a·r.
e) None of these is true.