Uploaded by demotequila0

ap stats ch3 review practice test key

Chapter 3 Review KEY
1. List three synonyms for the x-variable: Explanatory, Independent, Control, Predictor,
Input.
2. List three synonyms for the y-variable: Response, Dependent, Output
3. r is called: Correlation Coefficient
4. r² is called: Coefficient of Determination
5. What information must be included in a description of the relationship between two
variables? Form (overall pattern), Direction, Strength
6. LSRL stands for: Least-Squares Regression Line
7. When calculating the equation of the LSRL from summary statistics, what information is
needed? r, Sy, Sx, x-bar, y-bar.
8. Calculate an r from the following points by hand. At any stage where you may need to
round, round to the nearest tenth.
r = .5 using (0,0), (1, 2), (2, 1)
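The hand calculation can be checked with a short sketch. It assumes the usual definition of r as the sum of products of z-scores divided by n - 1; the function name pearson_r is just an illustrative choice.

```python
import math

def pearson_r(xs, ys):
    """Correlation as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# The three points from question 8: (0,0), (1,2), (2,1)
r = pearson_r([0, 1, 2], [0, 2, 1])
print(round(r, 1))  # 0.5
```

Here both standard deviations come out to 1, so the z-scores are just the deviations from the means, and only the point (0,0) contributes a nonzero product.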
9. Describe in words what a residual represents. It's the difference between the observed y
and the y predicted by the regression line.
10. In a regression line, the sum of all residuals = 0
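A quick numerical check of the zero-sum property, as a sketch: the five data points below are made up for illustration (they are not from this review), and fit_lsrl is a hypothetical helper using the standard least-squares formulas.

```python
def fit_lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

a, b = fit_lsrl(xs, ys)
# Residual = observed y minus predicted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
total = sum(residuals)
print(abs(total) < 1e-9)  # True: the residuals cancel out
```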
11. Explain the meaning of r-squared. It's the proportion of variability in the response
variable explained by the linear regression of the response variable onto the explanatory
variable.
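To see the "proportion of variability explained" idea in numbers, here is a small sketch on made-up data (not from this review). It compares 1 - SSE/SST, the share of the variation in y that the line accounts for, against the square of r. The variable names are illustrative assumptions.

```python
import math

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y (SST)

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Variation left over after fitting the line (SSE).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

explained = 1 - sse / syy          # proportion of variation explained
r = sxy / math.sqrt(sxx * syy)     # correlation coefficient

print(round(explained, 2), round(r ** 2, 2))  # 0.64 0.64
```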
12. Explain why changing units of measure has no effect on r. No effect because changing
units constitutes a linear transformation. Linear transformations don't affect r. OR,
better: because r is based on standardized values, and because converting numbers into
z-scores absorbs any units of measure, r doesn't care about units of measure one way or
another.
13. A point is removed from a scatterplot. The point is very close to the regression line, but way
out in the "outfield." Will r increase or decrease? Decrease
14. A point is removed from a scatterplot, but in this case the point is offset from the regression
line by a considerable distance. Will r increase or decrease? Increase.
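Questions 13 and 14 can be tried out on a toy example. This is only a sketch: the cluster of points, the far-out point (20, 20), and the offset point (3, 10) are all made up for illustration, and pearson_r is a hypothetical helper.

```python
import math

def pearson_r(points):
    """Pearson correlation for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

# Made-up cluster of points (not from the review sheet).
cluster = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]

# Question 13: a far-out point lying almost on the line.
far_point = (20, 20)
r_with_far = pearson_r(cluster + [far_point])
r_without_far = pearson_r(cluster)

# Question 14: a point offset from the line by a considerable distance.
off_point = (3, 10)
r_with_off = pearson_r(cluster + [off_point])
r_without_off = pearson_r(cluster)

print(r_without_far < r_with_far)  # True: removing the far-out point lowers r
print(r_without_off > r_with_off)  # True: removing the offset point raises r
```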
15. A point is smack dab in the middle of the bunch, and it lies right on the regression line. If
you remove it, will r go up or down? Neither, to any noticeable degree. A point at
(x-bar, y-bar) contributes nothing to the sum of products of standardized x & y. Removing
it lowers the n - 1 divisor, but Sx and Sy go up by exactly the compensating amount, so r
stays put. (If the point is on the line but a little off center, r drops a hair.)
15. In a residual plot, what is the preferred pattern if you hope to have faith in how well a
regression line can be used for prediction all along the domain? No pattern at all.
16. You have a point (15, 18) found in some data whose regression line is:
y-hat = 14 - 3x
Find the residual for this point. y - y-hat = 18 - (14 - 3(15)) = 18 - (-31)
= 49 (Check my work; I was going quickly)
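The arithmetic can be double-checked in a few lines, taking the regression line to be y-hat = 14 - 3x (the reading that matches the stated answer of 49). The function name predict is an illustrative choice.

```python
def predict(x):
    """Predicted y from the regression line y-hat = 14 - 3x."""
    return 14 - 3 * x

x_obs, y_obs = 15, 18
y_hat = predict(x_obs)        # 14 - 45 = -31
residual = y_obs - y_hat      # observed minus predicted
print(residual)  # 49
```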
17. You're given this data: (2, 3), (4, 7), and (-3, -1). In under 30 seconds, without a calculator,
find one other point on the regression line for this data. (1, 3)
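The shortcut here is that every least-squares line passes through (x-bar, y-bar), so averaging the x's and the y's is enough. A minimal sketch that also fits the line and confirms the point lies on it (variable names are illustrative):

```python
points = [(2, 3), (4, 7), (-3, -1)]

n = len(points)
x_bar = sum(x for x, _ in points) / n   # (2 + 4 - 3) / 3 = 1
y_bar = sum(y for _, y in points) / n   # (3 + 7 - 1) / 3 = 3

# Fit the least-squares line to confirm it passes through (x_bar, y_bar).
sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
sxx = sum((x - x_bar) ** 2 for x, _ in points)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

prediction_at_mean = intercept + slope * x_bar
print(x_bar, y_bar)                            # 1.0 3.0
print(abs(prediction_at_mean - y_bar) < 1e-9)  # True
```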
18. You look at a residual plot and see that residuals begin positive, then go negative, then pass
to positive. What do you conclude? A line is not the right regression (or model) to use.
19. You're given this info: r = .5, x-bar = 4, y-bar = 8, Sx = .3, Sy = .8
Find the residual for the point (1,3): -1
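A sketch of the steps behind the answer, assuming the 4 and 8 in the given info are the means of x and y (the reading that produces the stated residual of -1). The slope comes from b = r * Sy / Sx and the intercept from a = y-bar - b * x-bar.

```python
r = 0.5
x_bar, y_bar = 4, 8      # assumed to be the means of x and y
s_x, s_y = 0.3, 0.8

slope = r * s_y / s_x                 # about 1.333
intercept = y_bar - slope * x_bar     # about 2.667

x_obs, y_obs = 1, 3
y_hat = intercept + slope * x_obs     # about 4
residual = y_obs - y_hat              # observed minus predicted

print(round(residual, 6))  # -1.0
```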
20. You're performing a study relating years of driving and accident incidence during the first
five years of driving.
Name a lurking variable that might operate through common effect. (Bad question. Tough to
find a variable that would affect time :)
Name a lurking variable that might be a confound not operating through common effect Type of
car, age of driver, climate in which driving occurs—many choices...
21. You're performing a study on children and believe that the ingestion of melamine in milk has
a deleterious effect on growth. For obvious reasons, you can't perform an experiment.
Nonetheless, you wish to establish a strong case for melamine "causing" stunted growth. List
four of the criteria you would have to satisfy in order to begin building a case for causality, in the
context of this situation. Multiple studies showing the same relationship between melamine
ingestion and growth, melamine ingestion precedes growth stunting in time, Effect on
growth is proportional to amount of melamine ingested (more melamine, less growth), and
a strong r-value between melamine amount and growth "loss"
22. You're using family income (in dollars) to predict the size of homes (in square feet).
You find an r value of .43. In your follow-on study, you wish to convert your study for a
European audience. There are 1.2 Euros per dollar, and 10.1 square feet per square meter. Find
your new r value. r = .43, no change! Linear transformations do not affect r.
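The unit-change claim is easy to test numerically. In this sketch the incomes and home sizes are made up for illustration (so r will not be .43), while the conversion factors are the ones given in the question. The helper pearson_r is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data, for illustration only.
income_dollars = [40000, 55000, 62000, 80000, 95000]
size_sqft = [1500, 1400, 2100, 1900, 2600]

# Conversion factors as stated in the question.
income_euros = [d * 1.2 for d in income_dollars]
size_sqm = [s / 10.1 for s in size_sqft]

r_original = pearson_r(income_dollars, size_sqft)
r_converted = pearson_r(income_euros, size_sqm)
print(abs(r_original - r_converted) < 1e-9)  # True: r is unchanged
```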
Practice Multiple Choice (Correct Answer is in Bold)
1. In a statistics course, a linear regression equation was computed to predict the final exam
score from the score on the first test. The equation was y = 10 + 0.9x where y is the final
exam score and x is the score on the first test. Carla scored 95 on the first test. What is the
predicted value of her score on the final exam?
(a) 95
(b) 85.5
(c) 90
(d) 95.5
(e) None of the above
2. Refer to the previous problem. On the final exam Carla scored 98. What is the value of her
residual?
(a) 98
(b) 2.5
(c) -2.5
(d) 0
(e) None of the above
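For problems 1 and 2, the prediction and the residual can be worked out as follows. This sketch takes the regression equation to be y = 10 + 0.9x; the function name is an illustrative choice.

```python
def predict_final(first_test_score):
    """Predicted final exam score from the line y = 10 + 0.9x."""
    return 10 + 0.9 * first_test_score

predicted = predict_final(95)    # Carla's predicted final exam score
residual = 98 - predicted        # observed minus predicted

print(round(predicted, 1), round(residual, 1))  # 95.5 2.5
```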
3. A study of the fuel economy for various automobiles plotted the
fuel consumption (in liters of gasoline used per 100 kilometers
traveled) vs. speed (in kilometers per hour). A least squares line was fit to the data. Here is
the residual plot from this least squares fit.
[Residual plot not reproduced.]
What does the pattern of the residuals tell you about the linear model?
(a) The evidence is inconclusive.
(b) The residual plot confirms the linearity of the fuel economy data.
(c) The residual plot does not confirm the linearity of the data.
(d) The residual plot clearly contradicts the linearity of the data.
(e) None of the above
4. All but one of the following statements contain a blunder. Which statement is
correct?
(a) There is a correlation of 0.54 between the position a football player plays and
their weight.
(b) The correlation between planting rate and yield of corn was found to be r=0.23.
(c) The correlation between the gas mileage of a car and its weight is r=0.71 MPG.
(d) We found a high correlation (r=1.09) between the height and age of children.
(e) We found a correlation of r=-.63 between gender and political party preference.
5. After a linear regression, it was found that the r-value was .65. If each x-value were decreased
by one unit and the y-values remained the same, then the correlation r would
(a) Decrease by 1 unit
(b) Decrease slightly
(c) Increase slightly
(d) Stay the same
(e) Can't tell without knowing the data values
6. In regression, the residuals are which of the following?
(a) Those factors unexplained by the data
(b) The difference between the observed responses and the values predicted by the
regression line
(c) Those data points which were recorded after the formal investigation was completed
(d) Possible models unexplored by the investigator
(e) None of the above
7. What does the square of the correlation (r²) measure?
(a) The slope of the least squares regression line
(b) The intercept of the least squares regression line
(c) The extent to which cause and effect is present in the data
(d) The fraction of the variation in the values of y that is explained by least-squares
regression of y on the other variable.
8. Which of the following statements are true?
I. Correlation requires one variable to be identified as the explanatory variable and other as the
response variable.
II. A two-variable scatterplot requires that both variables be quantitative.
III. Every least-squares regression line passes through (x-bar, y-bar).
(a) I and II only
(b) I and III only
(c) II and III only
(d) I, II, and III
(e) None of the above.
9. A local community college announces the correlation between college entrance exam grades and
scholastic achievement was found to be -1.08. On the basis of this you would tell the college
that
(a) The entrance exam is a good predictor of success.
(b) The exam is a poor predictor of success.
(c) Students who do best on this exam will be poor students.
(d) Students at this school are underachieving.
(e) The college should hire a new statistician.
10. The following are resistant:
(a) Least squares regression line
(b) Correlation coefficient
(c) Both the least squares line and the correlation coefficient
(d) Neither the least squares line nor the correlation coefficient
(e) It depends
11. A study found correlation r = 0.61 between the sex of a worker and his or her income.
You conclude that:
(a) Women earn more than men on the average.
(b) Women earn less than men on average.
(c) An arithmetic mistake was made; this is not a possible value of r.
(d) This is nonsense because r makes no sense here.
12. A copy machine dealer has data on the number x of copy machines at each of 89 customer
locations and the number y of service calls in a month at each location. Summary
calculations give x-bar = 8.4, Sx = 2.1, y-bar = 14.2, Sy = 3.8, and r = 0.86. What is the slope of the
least squares regression line of number of service calls on number of copiers?
(a) 0.86
(b) 1.56
(c) 0.48
(d) None of these
(e) Can't tell from the information given
13. In the setting of the previous problem, about what percent of the variation in the number of
service calls is explained by the linear relation between number of service calls and number
of machines?
(a) 86%
(b) 93%
(c) 74%
(d) None of these
(e) Can't tell from the information given
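Problems 12 and 13 both follow from the summary statistics. This is a sketch, reading the statistics as Sx = 2.1, Sy = 3.8, and r = 0.86: the slope is b = r * Sy / Sx, and the percent of variation explained is r squared.

```python
r = 0.86
s_x = 2.1    # standard deviation of number of copiers
s_y = 3.8    # standard deviation of number of service calls

slope = r * s_y / s_x          # slope of service calls on copiers
percent_explained = 100 * r ** 2

print(round(slope, 2))           # 1.56
print(round(percent_explained))  # 74
```

The means (8.4 and 14.2) are not needed for either answer; they would only matter for the intercept.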
14. If dataset A of (x,y) data has correlation coefficient r = 0.65, and a second dataset B has
correlation r = -0.65, then
(a) The points in A exhibit a stronger linear association than B.
(b) The points in B exhibit a stronger linear association than A.
(c) Neither A nor B has a stronger linear association.
(d) You can't tell which dataset has a stronger linear association without seeing the data or
seeing the scatterplots.
15. There is a linear relationship between the number of chirps made by the striped ground
cricket and the air temperature. A least squares fit of some data collected by a biologist gives
the model y = 25.2 + 3.3x, 9 < x < 25, where x is the number of chirps per minute and y
is the estimated temperature in degrees Fahrenheit. What is the estimated increase in
temperature that corresponds to an increase in 5 chirps per minute?
(a) 3.3°F
(b) 16.5°F
(c) 25.2°F
(d) 28.5°F
(e) 41.7°F
16. Linear regression usually employs the method of least squares. Which of the following is the
quantity that is minimized by the least squares process?
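(a) [formula not recoverable]
(b) [formula not recoverable]
(c) sum of (y_i - y-bar_i)^2
(c) is the correct answer, once you've changed bars to hats.
(d) [formula not recoverable]
(e) [formula not recoverable]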
17. A set of data relates the amount of annual salary raise and the performance rating. The least
squares regression equation is y = 1,400 + 2,000x where y is the estimated raise and x is
the performance rating. Which of the following statements is not correct?
(a) For each increase of one point in performance rating, the raise will increase on average
by $2,000.
(b) This equation produces predicted raises with an average residual of 0.
(c) A rating of 0 will yield a predicted raise of $1,400.
(d) The correlation for the data is positive.
(e) All of the above are true.
18. Which of the following would not be a correct interpretation of a correlation of
r = -.30?
(a) The variables are inversely related.
(b) The coefficient of determination is 0.09.
(c) 30% of the variation between the variables is linear. (yikes!)
(d) There exists a weak relationship between the variables.
(e) All of the above statements are correct.
19. If removing an observation from a data set would have a marked change on the position of
the LSRL fit to the data, what is the point called:
(a) Robust
(b) A residual
(c) A response
(d) Influential
(e) None of the above