Practice Exam 1, STAT 2331

Name…..………………………
Lab section ……
Please do all questions.
1. Which of the following statements is ALWAYS TRUE?
a. The sample median is more sensitive to extreme values than the mean.
b. The sample standard deviation will be larger than the mean in a right-skewed
distribution.
c. The sample standard deviation is a measure of variation around the sample
mean.
d. The IQR is a measure of variation of the lower half of the data.
e. If a distribution is perfectly symmetric, then the mean will be equal to the
standard deviation.
2. The stem and leaf plot below summarizes the final year averages of a graduating
honors class in the Dedman College. Select the correct statement.
6 88
7 24567
8 2346
9 014
10
a. The distribution is bimodal.
b. The mean is 95.
c. The IQR is 50.
d. The maximum score is 94.
e. The median score is 68.
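When checking statements about a stem-and-leaf plot, it can help to expand the plot back into raw values first. The following is a minimal Python sketch, assuming each stem is the tens digit and each leaf is a units digit (the plot does not state its key, so this is an assumption). The names `stem_leaf` and `scores` are illustrative only.

```python
from statistics import mean, median, quantiles

# Stem-and-leaf plot from Question 2.
# Assumption: stem = tens digit, leaf = units digit (e.g. stem 7, leaf 2 means 72).
stem_leaf = {6: "88", 7: "24567", 8: "2346", 9: "014", 10: ""}

# Expand every (stem, leaf) pair into a raw score and sort the result.
scores = sorted(
    10 * stem + int(leaf)
    for stem, leaves in stem_leaf.items()
    for leaf in leaves
)

# Quartile conventions differ between textbooks and software,
# so the IQR printed here may not match a hand calculation exactly.
q1, _, q3 = quantiles(scores, n=4)

print("scores :", scores)
print("min/max:", min(scores), max(scores))
print("mean   :", mean(scores))
print("median :", median(scores))
print("IQR    :", q3 - q1)
```

Each answer option can then be compared against the printed summaries.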
3. Suppose that two variables X and Y are known to have a correlation of 0.98. Which of
the following statements do we know must always be true?
a. X is normally distributed
b. There is a 95% chance that Y values will be found within 2 standard deviations
of their mean
c. A regression of Y on X will produce a line with a positive slope
d. The X variable will have a larger standard deviation than the Y variable
e. 98% of Y’s variation is explained by Y’s linear relationship with X.
4. Which of the following are true statements about the correlation coefficient r?
(i) The correlation r is always between -1 and 1.
(ii) Correlation r measures how well a straight line models your data.
(iii) Correlation r varies with your units of measurement.
(iv) The square of the correlation is the coefficient of determination.
(a) (i) and (ii) only.
(b) (i) and (iii) only.
(c) (i) and (iv) only.
(d) (i), (ii) and (iv).
(e) (i) only.
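Statements like those in Question 4 can be explored numerically. The sketch below computes r by hand and then recomputes it after changing units and after swapping the two variables. The data values are made up purely for illustration and do not come from the exam. The helper name `pearson_r` is also an assumption.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample correlation coefficient r of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data (NOT from the exam): heights in inches, weights in pounds.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 140, 155, 160, 180]

# The same measurements expressed in different units.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]

r_original = pearson_r(heights_in, weights_lb)
r_rescaled = pearson_r(heights_cm, weights_kg)   # units changed
r_swapped = pearson_r(weights_lb, heights_in)    # roles of x and y swapped

print("r, original units :", round(r_original, 4))
print("r, metric units   :", round(r_rescaled, 4))
print("r, x and y swapped:", round(r_swapped, 4))
print("r squared         :", round(r_original ** 2, 4))
```

Comparing the printed values shows which properties of r survive a change of units or a swap of the variables.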
The following two multiple-choice questions refer to cats. The weights of the cats in
Questions 5 and 6 are normally distributed with a mean of 9.5 pounds and a standard
deviation of 1.5 pounds.
5. How much do the heaviest 2.5% of these cats weigh?
a. 9.5 pounds or more
b. 12.5 pounds or more
c. 6.5 pounds or less
d. 11 pounds or less
e. 11 pounds or more
6. What percentage of these cats weigh less than 8 pounds?
a. 32%
b. 20%
c. 16%
d. 5%
e. 50%
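Questions 5 and 6 are meant to be answered with the 68-95-99.7 rule, but the same quantities can be checked against the exact normal distribution. This is a minimal sketch using `statistics.NormalDist` from the Python standard library. The variable names are illustrative.

```python
from statistics import NormalDist

# Cat weights from Questions 5 and 6: normal, mean 9.5 lb, standard deviation 1.5 lb.
cat_weights = NormalDist(mu=9.5, sigma=1.5)

# Proportion of cats lighter than 8 lb (8 lb is one standard deviation below the mean).
below_8 = cat_weights.cdf(8)

# Weight that cuts off the heaviest 2.5% of cats, i.e. the 97.5th percentile.
cutoff_top = cat_weights.inv_cdf(0.975)

print(f"P(weight < 8 lb)      = {below_8:.3f}")
print(f"97.5th percentile (lb) = {cutoff_top:.2f}")
```

The exact values differ slightly from the rounded figures the 68-95-99.7 rule gives, because the rule treats 1.96 standard deviations as 2.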
7. A 2331 statistics student has a grade of 86% going into the final exam. The final exam is
worth 20% of the overall grade. What percentage score does the student need on the final
exam to ensure she gets at least 80% overall in the class?
a. 80%
b. 90%
c. 86%
d. 17%
e. 56%
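Question 7 is a weighted-average calculation. The sketch below assumes that the 86% applies to everything except the final exam, which therefore carries the remaining 80% of the overall grade. The function name `required_final_score` is hypothetical.

```python
def required_final_score(current_pct, final_weight, target_pct):
    """Score needed on the final so the weighted overall grade reaches target_pct.

    Solves (1 - final_weight) * current_pct + final_weight * x = target_pct for x.
    Assumes current_pct covers all coursework other than the final exam.
    """
    return (target_pct - (1 - final_weight) * current_pct) / final_weight

needed = required_final_score(current_pct=86, final_weight=0.20, target_pct=80)
print(f"Needed on the final: {needed:.1f}%")
```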
The following 4 multiple choice problems relate to the information below on frogs.
We are interested in predicting how far a frog can jump (measured in cm), based on its leg
length (also in cm). We collect data on 50 frogs, allowing each frog one leap. We find that the
correlation, r, is 0.7; the mean leg length is 16 cm, with a standard deviation of 2 cm; and
the mean jumping distance is 20 cm, with a standard deviation of 5 cm.
8. The equation for the least squares regression line of jumping distance on leg length is
a. Jumping distance = 10.4+ 0.28 (leg length)
b. Leg length = -8 + 1.75 (Jumping distance)
c. Jumping distance = -8+ 1.75 (Leg length)
d. Jumping distance = -8 + 0.28 (Leg length)
e. Leg length = 10.4 + 0.28 (Jumping distance)
9. A frog with legs 20cm long would be predicted to jump
a. 31.4cm
b. 25.8cm
c. 29.3cm
d. 15.7cm
e. 27cm
10. What proportion of variation in jumping distances is due to the linear relation to leg
length?
a. 49%
b. 90%
c. 0.9%
d. 0.49%
e. 81%
11. The predicted distance jumped by a frog of leg length 4cm is -1cm. The best explanation
of this is
a. Frogs this small can only jump backwards
b. The researchers made a mistake in measuring distances
c. The straight line relationship should not be extrapolated for such small frogs
d. Frogs with legs this small are tasty.
e. We should have done a regression of leg length on distance jumped.
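The frog questions can all be worked from the summary statistics alone. The sketch below uses the standard least-squares results: the slope is r times the ratio of the standard deviations, and the line passes through the point of means. The variable and function names are illustrative.

```python
# Summary statistics from the frog study
# (x = leg length in cm, y = jumping distance in cm).
r = 0.7
mean_x, sd_x = 16.0, 2.0
mean_y, sd_y = 20.0, 5.0

# Least-squares regression of y on x:
#   slope     = r * sd_y / sd_x
#   intercept = mean_y - slope * mean_x   (line passes through the means)
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

def predict_jump(leg_length_cm):
    """Predicted jumping distance (cm) for a given leg length (cm)."""
    return intercept + slope * leg_length_cm

# Proportion of variation in y explained by the linear relationship with x.
r_squared = r ** 2

print(f"Jumping distance = {intercept:.2f} + {slope:.2f} (leg length)")
print(f"Prediction at 20 cm: {predict_jump(20):.1f} cm")
print(f"Prediction at 4 cm : {predict_jump(4):.1f} cm")
print(f"r squared          : {r_squared:.2f}")
```

A leg length of 4 cm lies six standard deviations below the mean leg length, far outside the range the line was fitted on, which is worth keeping in mind when reading that prediction.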
12. Suppose we have recorded the weights and heights of 50 people. We decide to regress
weight (Y) on height (X) because we wish to predict Y from X. We find that the
coefficient of determination is 25%. Suppose we change our mind, and now want to
regress height on weight. What will be the coefficient of determination for this new
regression?
a. 75%
b. -25%
c. 25%
d. 50%
e. It is not possible to calculate it given this information.
The following questions concern a study on two airlines: “Alaskan Skies” and “America
Best”. There are two departure locations considered, Seattle and Phoenix, and for each airline
and location flights are categorized by either being “on time” or “delayed”. The following
tables categorize flights by these 3 variables.
Seattle
                      On time   Delayed   Total
Alaskan Skies (AS)    3213      976       4189
America Best (AB)     335       228       563

Phoenix
                      On time   Delayed   Total
Alaskan Skies (AS)    400       47        447
America Best (AB)     2345      468       2813
13) The on-time rates for airlines AS and AB for flights from Seattle are, respectively:
a) 23.3% and 40%
b) 76.7% and 60%
c) 23.3% and 60%
d) 32% and 40%
e) 93% and 35%
14) The on-time rates for airlines AS and AB for flights from Phoenix are, respectively:
a) 93% and 87%
b) 83% and 98%
c) 73% and 56%
d) 11% and 17%
e) 89% and 83%
15) What percentage of Seattle passengers use AS?
a) 50%
b) 88%
c) 76%
d) 44%
e) 24%
16) What percentage of Phoenix passengers use AB?
a) 24%
b) 77%
c) 23%
d) 44%
e) 86%
Now suppose we collapse the tables over airport and form a new table as follows:
                 On time   Delayed   Total
Alaskan Skies    3613      1023      4636
America Best     2680      696       3376
17) The on-time rates for airlines AS and AB for all passengers are, respectively:
a) 50% and 60%
b) 44% and 66%
c) 90% and 50%
d) 78% and 79%
e) 22% and 17%
18) A correct statement about the variables is
a) The collapsing variable is airport location.
b) The collapsing variable is on time/delayed
c) The collapsing variable is airline (AS, AB).
d) There is no Simpson’s paradox here.
e) The response variable is airport location.
19) The correct conclusion to draw here is
a) Airport location is not a factor in whether or not a flight is on time.
b) Yes, airline AB was slightly more on time overall, but it is because it mostly had
passengers leaving from Phoenix, and airline AS mostly had passengers leaving from Seattle.
c) Yes, airline AS was slightly more on time overall, but it is because it mostly had
passengers leaving from Phoenix, and airline AB mostly had passengers leaving from Seattle.
d) We should fly airline AB, as it had a higher on time rate overall.
e) Airport location is not related to airline.
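The rates asked for in Questions 13 to 17 can be recomputed directly from the counts in the Seattle and Phoenix tables. The sketch below calculates the on-time rate within each airport, then collapses over airport and recalculates. The dictionary layout and the names `counts`, `by_airport`, `overall`, and `share` are illustrative choices.

```python
# Counts from the tables above: counts[airport][airline] = (on_time, delayed).
counts = {
    "Seattle": {"AS": (3213, 976), "AB": (335, 228)},
    "Phoenix": {"AS": (400, 47), "AB": (2345, 468)},
}

def on_time_rate(on_time, delayed):
    """Fraction of flights that were on time."""
    return on_time / (on_time + delayed)

# On-time rate for each airline within each airport.
by_airport = {
    airport: {airline: on_time_rate(*cell) for airline, cell in airlines.items()}
    for airport, airlines in counts.items()
}

# Collapse over airport: add the counts per airline, then recompute the rate.
overall = {}
for airline in ("AS", "AB"):
    on_time = sum(counts[airport][airline][0] for airport in counts)
    delayed = sum(counts[airport][airline][1] for airport in counts)
    overall[airline] = on_time_rate(on_time, delayed)

# Share of each airport's total traffic carried by each airline.
share = {}
for airport, airlines in counts.items():
    airport_total = sum(sum(cell) for cell in airlines.values())
    share[airport] = {a: sum(cell) / airport_total for a, cell in airlines.items()}

for airport in counts:
    for airline in ("AS", "AB"):
        print(f"{airport:8s} {airline}: on time {by_airport[airport][airline]:.1%}, "
              f"share of airport {share[airport][airline]:.1%}")
for airline in ("AS", "AB"):
    print(f"Overall  {airline}: on time {overall[airline]:.1%}")
```

Comparing the within-airport rates with the collapsed rates, alongside each airline's share of traffic at each airport, shows how the ordering of the two airlines can change once airport is ignored.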
Each of the next several statements has just two options, true and false. Circle the one you
believe to be correct in each case. Here we say a statement is true if it is ALWAYS true,
otherwise we designate it false.
20) T F The mode is another name for the median.
21) T F The standard deviation is an outlier-resistant measure of variation.
22) T F If the data are normally distributed, 68% of the data are within one standard
deviation of the mean.
23) T F If two variables are positively correlated, the regression line must have a positive
slope.
24) T F The correlation coefficient (r) does not depend on the units of measurement.
25) T F Simpson’s paradox concerns 3 categorical variables.