Uploaded by Lauren Harris

Stats Midterm Review

MIDTERM REVIEW PODCAST OUTLINE AU20
1. The distribution of budget for the top-grossing movies of 2018 is summarized below.
What describes the shape of this data set?
a. Symmetric
b. Skewed right
c. Skewed left
d. None of the above
2. Which measure of center for movie budget is going to be higher, based on the previous
histogram?
a. The mean will be higher
b. The median will be higher
c. The mean and median will be equal
d. Not enough information to tell.
For questions 3-6, use the following descriptive statistics from the 2018 movie data. (Note
that the median for budget is shown as 1e8, which is exponential notation. This means a 1
followed by eight zeros, or $100,000,000.)
Column            Mean          Std. dev.     Median
Runtime           118.05882     17.227429     118
Days Released     115.82353     43.793227     110.5
THEATERS          4042.2069     310.24591     4118
BUDGET            1.1367647e8   74325541      1e8
OPENING WEEKEND   67120349      54785996      47802879
U.S. REVENUE      2.2249474e8   1.569501e8    1.7324568e8
INT'L REVENUE     3.6916133e8   2.8464669e8   3.0070489e8
WORLD REVENUE     5.9165607e8   4.1461165e8   4.5143926e8
Critics Ratings   69.294118     22.824136     71
Audience Ratings  67.088235     17.180623     71
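The exponential notation used for some entries can be checked quickly in Python. This is only a minimal sketch; the variable names are illustrative, and the two values are the mean and median reported for BUDGET.

```python
# Values written the way StatCrunch prints them (exponential notation).
budget_median = float("1e8")
budget_mean = float("1.1367647e8")

# Print with thousands separators and no decimal places.
print(f"{budget_median:,.0f}")  # 100,000,000
print(f"{budget_mean:,.0f}")    # 113,676,470
```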
3. The distribution of the data for opening weekend revenue (in $) is shaped how?
a. Skewed right
b. Skewed left
c. Symmetric
d. Not enough information to tell
4. The correlation between opening weekend and U.S. revenue is the inverse of the
correlation between U.S. revenue and opening weekend.
a. True
b. False
5. Looking (only) at the descriptive statistics above you can tell that critics ratings and
audience ratings are at least moderately correlated.
a. True
b. False
6. The above table shows you that the standard deviation of any data set can never be larger
than the mean of that same data set.
a. True
b. False
7. True or False: The best fitting line always has an SSE of 0.
a. True
b. False
8. Bob runs an experiment to see which brand of paper towel is more absorbent: Brand A or
Brand B. He takes a random sample of 10 sheets from each brand of paper towel and puts
each sheet in a cup of water and measures how much water was absorbed by the sheet by
squeezing it tightly for 10 seconds and weighing the water that comes out. What is the
response variable?
a. Weight of the water squeezed out
b. Which brand is more absorbent in the end
c. Brand type (A or B)
d. None of the above
9. Undercoverage occurs when a certain group from the population is sampled but does not
respond.
a. True
b. False
10. Confidentiality is ________________ than anonymity.
a. Weaker
b. Stronger
c. No different
11. The results of a well-designed experiment are ____________ than the results of a
well-designed observational study (assuming it is ethical to do an experiment).
a. Stronger
b. Weaker
c. No different
12. What are the units of the residuals?
a. Same as the original units of X
b. Same as the units of Y
c. No units
d. None of the above
13. Suppose the best fitting line for a data set is y = x+2. The 3 points in the data set are
(0, 2); (1, 2); and (2, 5). What is the value of SSE in this case?
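As a sketch of the arithmetic behind question 13: SSE is found by taking each residual (observed y minus the y predicted by the line), squaring it, and adding the squares. The helper names `sse` and `line` below are hypothetical and are not part of the review.

```python
def sse(points, predict):
    """Sum of squared residuals: (observed y - predicted y)^2 over all points."""
    return sum((y - predict(x)) ** 2 for x, y in points)

# The three data points given in the question.
points = [(0, 2), (1, 2), (2, 5)]

def line(x):
    # The line stated in the question: y = x + 2
    return x + 2

# Residual for each point, then the sum of their squares.
residuals = [y - line(x) for x, y in points]
print(residuals)
print(sse(points, line))
```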
14. If the correlation is zero, what is the equation of the best-fitting regression line through
the data?
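One way to explore question 14 is to fit a least-squares line to a small data set whose correlation is zero and see what comes out. The data below are made up for illustration only; the slope formula used is the standard one, slope = Sxy / Sxx, with intercept = mean(y) - slope * mean(x).

```python
# Made-up data (an assumption for illustration) with zero linear association.
xs = [1, 2, 3]
ys = [4, 1, 4]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Sxy is the numerator of both the correlation and the slope.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)

slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(slope, intercept, mean_y)
```

When Sxy is zero the correlation is zero, the slope is zero, and the fitted line is flat at the mean of the y values.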
15. Bob wants to survey OSU students regarding their opinions on textbook costs. What would
response bias mean here?
16. Bob wants to survey OSU students regarding their opinions on textbook costs. Give a clear
example of a self-selected (volunteer) sample in this case.
Use the following edited StatCrunch output:
Simple linear regression results:
Sample size: 34
R-sq = 0.89443651
Parameter estimates:
Parameter  Coeff       Std. Err.   Alternative  DF  T-Stat     P-value
Constant   40641523    14171899    ≠ 0          32  2.8677541  0.0073
X          -2.7093604  0.16454091  ≠ 0          32  16.466181  <0.0001
17. What is the equation of the best-fitting regression line? Assume the variables are just named X
and Y.
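A short sketch of how the output maps onto a line of the form y = a + bx, taking the Coeff value for Constant to be 40641523 (the intercept a) and the Coeff value for X to be -2.7093604 (the slope b). The function name `predict` is hypothetical.

```python
# Coefficients read from the StatCrunch parameter estimates.
intercept = 40641523   # Coeff for Constant
slope = -2.7093604     # Coeff for X

def predict(x):
    """Predicted y on the fitted line y = intercept + slope * x."""
    return intercept + slope * x

# At x = 0 the prediction equals the intercept; each extra unit of x
# lowers the prediction because the slope is negative.
print(predict(0))
print(predict(1) - predict(0))
```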
18. If P(A)= .3, P(B) = .2, and P(B|A) = .1, what is P(A OR B)?
a. .44
b. .47
c. .48
d. None of the above
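The calculation in question 18 can be sketched with two standard rules: the multiplication rule, P(A and B) = P(A) * P(B|A), and the addition rule, P(A or B) = P(A) + P(B) - P(A and B). Exact fractions are used here to avoid floating-point rounding; the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(3, 10)          # P(A) = .3
p_b = Fraction(2, 10)          # P(B) = .2
p_b_given_a = Fraction(1, 10)  # P(B|A) = .1

# Multiplication rule, then addition rule.
p_a_and_b = p_a * p_b_given_a
p_a_or_b = p_a + p_b - p_a_and_b

print(p_a_and_b, float(p_a_and_b))
print(p_a_or_b, float(p_a_or_b))
```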
19. If A and B are disjoint, then A and B are independent.
a. True
b. False
20. In the following two-way table are opinion and gender independent?
         YES   NO
FEMALE   15    30
MALE     30    60
a. True
b. False
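One way to check independence in a two-way table like the one in question 20 is to compare the conditional probability of YES within each gender to the overall probability of YES. This is a sketch using the counts from the table; the dictionary layout and names are assumptions made for illustration.

```python
from fractions import Fraction

# Counts from the two-way table: counts[gender][opinion]
counts = {
    "FEMALE": {"YES": 15, "NO": 30},
    "MALE": {"YES": 30, "NO": 60},
}

total = sum(sum(row.values()) for row in counts.values())
total_yes = sum(row["YES"] for row in counts.values())

# Overall P(YES) and P(YES | gender) for each gender.
p_yes = Fraction(total_yes, total)
p_yes_given = {
    gender: Fraction(row["YES"], sum(row.values()))
    for gender, row in counts.items()
}

print(p_yes)
print(p_yes_given)

# The variables are independent when every conditional matches the overall value.
independent = all(p == p_yes for p in p_yes_given.values())
print(independent)
```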
21. In the following two-way table, what notation stands for the probability that a female selected at
random said yes?
         YES   NO
FEMALE   15    30
MALE     30    60
a. P(F|Y)
b. P(Y|F)
c. P(Y and F)
22. What is the name of the distribution shown by the pie chart below?
[Pie chart: MALES (n = 100); categories: favor, oppose; slices: 28.6%, 71.4%]
a. The marginal distribution of opinion
b. The marginal distribution of opinion for the males
c. The conditional distribution of males given opinion
d. The conditional distribution of opinion given males
23. Bob guesses on every question of a 5-question multiple choice test where there are 4 choices for
each answer. What's the chance Bob gets at least 1 problem correct?
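A common approach to an "at least 1" question like 23 is the complement rule: P(at least 1 correct) = 1 - P(none correct). The sketch below assumes the guesses are independent and that each question has exactly one correct choice out of 4; the variable names are illustrative.

```python
from fractions import Fraction

n_questions = 5
p_correct = Fraction(1, 4)   # one right answer out of 4 choices (assumption)
p_wrong = 1 - p_correct

# Independent guesses: multiply the chance of a wrong guess 5 times.
p_none_correct = p_wrong ** n_questions
p_at_least_one = 1 - p_none_correct

print(p_none_correct)
print(p_at_least_one, float(p_at_least_one))
```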
24. 40% of the workers at a certain factory work on the first floor, and the rest work on the 2nd floor.
80% of the workers on the first floor come to work on time and 75% of the workers on the 2nd
floor come to work on time. You randomly select a worker who was on time. Which floor are they
more likely to be working on, the first floor or the 2nd floor?
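Question 24 is a reverse-conditional (Bayes-style) problem: the given percentages are P(on time | floor), but the question asks about P(floor | on time). A minimal sketch of that calculation, with illustrative variable names and exact fractions:

```python
from fractions import Fraction

p_first = Fraction(40, 100)             # P(first floor)
p_second = 1 - p_first                  # P(2nd floor)
p_on_given_first = Fraction(80, 100)    # P(on time | first floor)
p_on_given_second = Fraction(75, 100)   # P(on time | 2nd floor)

# Total probability of being on time, summed over both floors.
p_on = p_first * p_on_given_first + p_second * p_on_given_second

# Reverse the conditioning: P(floor | on time).
p_first_given_on = p_first * p_on_given_first / p_on
p_second_given_on = p_second * p_on_given_second / p_on

print(p_on)
print(p_first_given_on, float(p_first_given_on))
print(p_second_given_on, float(p_second_given_on))
```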