Practice Final Spring 2013

Statistics 4220 Final
Version A
NAME: _________________________________________
Instructions:
Read these instructions
Do not turn the page until the test begins
You have 120 minutes
This test is printed on both sides, so don’t miss a page.
Each question is worth the number of minutes. This test is timed for 100 minutes.
For this test you may use a page of notes, a calculator, and z/t/χ² tables.
If you need any of these please find a solution before the exam begins
If you have a question during the test please come forward quietly so that you are not
disruptive. If you leave early please do so quietly. Note that I cannot give answers that
are part of the test, only clarify the English being used.
You must show your work. Answers which are correct but do not show any work may
not get full credit. I might assume you either guessed, cheated, or used some fancy
calculator.
Cheating is not tolerated. Any inappropriate activity will be discussed after the final
Hats or hoods must be moved so that your face is not obscured.
If you think you might throw away your tables you are encouraged to donate them to
the students of next semester
Please turn off your cell phone. You cannot have your phone out at all.
No one wants to hear “Use ANOVA” during the test. Actually we probably do, but turn
off your phone anyway
1) (5 minutes) Road Ordinance Company Kits sells bags of gravel. A civil engineer wants to test
whether the size of the gravel is 5 grams on average assuming the standard deviation is 2.2 grams.
She knows the distribution of gravel sizes is highly right skewed. She randomly selects 121 pieces of
gravel and finds an average of 5.18. Which picture below shows her p-value (Circle the letter)?
[Pictures A through F are not reproduced here]
(G) This problem cannot be done because it is not actually normal
2) (4 minutes) I fired an angry bird at 45° at full power 40 times. The distance it travels is programmed
to have a random element to it, but a 95% confidence interval for the mean distance was (9.8, 10.2).
The use of “95%” means 95% of what? (check all that apply)
95% of my averages will be within the given confidence interval
95% of confidence intervals I could do with 40 trials would capture the true average
95% of the angles used to fire an angry bird will cause the bird to land within the interval
95% of the true average lands within a distance of 9.8 to 10.2
95% of the angry birds will travel a distance between 9.8 and 10.2
95% of the time this method is used, it will correctly capture the population average
3) (10 minutes) I’m going to try to convince my wife to let me experiment on how hot a spray paint can
gets before it explodes. I have a hypothesis that it’s more than 120° F. I plan on getting 36 cans and
testing at the 1% significance level. If I assume I know the standard deviation is 20° F, how powerful
is my test at a mean value of 125° F?
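For checking a hand calculation afterwards, here is a minimal sketch of the power computation for an upper-tailed z-test with known sigma. The function name `upper_tail_z_power` is made up for illustration; the numbers come from the problem statement, and the sample mean is treated as normal with standard error sigma / sqrt(n).

```python
from statistics import NormalDist

def upper_tail_z_power(mu0, mu_true, sigma, n, alpha):
    """Power of an upper-tailed z-test (H0: mu = mu0 vs Ha: mu > mu0)."""
    se = sigma / n ** 0.5                      # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha)   # rejection cutoff on the z scale
    cutoff = mu0 + z_crit * se                 # same cutoff on the x-bar scale
    # Power = P(x-bar > cutoff) when the true mean is mu_true
    return 1 - NormalDist(mu_true, se).cdf(cutoff)

power = upper_tail_z_power(mu0=120, mu_true=125, sigma=20, n=36, alpha=0.01)
print(f"power at 125 F: {power:.3f}")
```

The steps are the same as with the z-table: find the cutoff under the null, then find the chance of landing past it under the alternative.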
4) (8 minutes) In fall 2012, students were asked on the teacher evaluation to rank the effort they
put into class on a 5-point scale: 5 = a great deal, 4 = a good deal, 3 = moderate, 2 = a little, and 1 = very
little. The overall department average was 3.93. The 146 students from Dr. Crawford’s Stat 2050
class reported an average of 3.85 with a standard deviation of 0.85. Assuming the results are
actually numerical, test if this indicates that Dr. Crawford’s 2050 class is less work on average.
5) (10 minutes) Is higher intelligence related to poor social skills? To find out, a random sample of 260
people was tested for poor social skills or low intelligence. Of the 125 people with poor social skills
56 had low intelligence. Of the 135 people with good social skills 73 had low intelligence. Test
whether there is evidence that social skills change depending on the intelligence level.
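One way to check the arithmetic afterwards is a two-sided two-proportion z-test with a pooled standard error, sketched below. This is only one reasonable reading of the problem: a chi-squared test of independence on the 2x2 table is an equivalent route (its statistic is z squared). The helper name `two_proportion_z_test` is hypothetical.

```python
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# 56 of 125 with poor social skills, 73 of 135 with good social skills
z, p_value = two_proportion_z_test(56, 125, 73, 135)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.3f}")
```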
6) (6 minutes) Scales are supposed to be recalibrated after every use, but does it really matter? To find
out I asked 600 people to step on a scale, record the measurement, step off, and then step back on
again and record the second measurement. Based on the results below find a 99% confidence
interval for the average change in scale reading.
            First measurement    Second measurement
N           600                  600
Mean        263.14               277.24
S           60                   75

Pooled Standard Deviation: 67.92
Matched Pairs Standard Deviation: 7.2
Difference between the Standard Deviation: 5
7) (6 minutes) Below is the output from regression on the length of a pipe and the deflection in the
center. Assume the assumptions are met. Unfortunately the p-value is super small and not very
interesting. I have decided for this test that I am going to hack the data and change the p-value to
0.0499. Of course if I change the p-value I have to change other values to make this output look
consistent (wouldn’t it be awful if a student noticed I cheated on the numbers!) Circle the numbers
which must be changed and only numbers that must be changed (note that there is more than one
correct way to do this).
Regression Statistics
Multiple R        0.698729
R Square          0.488222
Standard Error    77.28074
Observations      20

ANOVA
              Df    SS          MS          F          P-value
Regression     1    102553.6    102553.6    17.1715    0.00061
Residual      18    107501.6    5972.313
Total         19    210055.2

              Coefficients    Std Error    t Stat      P-value
Intercept     -2.48371        32.63094     -0.07612    0.940167
length        5.615491        1.355138     4.143851    0.00061
8) (6 minutes) A bridge designer believes that the distribution of wind velocities is normal with the
same variance in any city. The wind speed was measured in Laramie and Cheyenne:
Laramie: 8 days averaged 12.9 mph with a standard deviation of 4.7 miles per hour.
Cheyenne: 20 days averaged 9.1 mph with a standard deviation of 3.3 miles per hour.
The pooled standard deviation was 3.73 which got a t-score of 2.436 and a p-value of 0.022.
They concluded that the average wind velocity is not the same between Laramie and Cheyenne.
Fill in the ANOVA table that they would have gotten if they had done ANOVA
          DF      SS      MS      F
Group     ____    ____    ____    ____
Error     ____    ____    ____
Total     ____    ____
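To check a filled-in table afterwards, here is a minimal sketch that builds a one-way ANOVA table from group summary statistics (n, mean, standard deviation). The function name `anova_from_summary` is made up for illustration. With only two groups, F should equal the square of the pooled t-score and the square root of MS Error should match the pooled standard deviation, which gives a built-in sanity check.

```python
def anova_from_summary(groups):
    """One-way ANOVA table from a list of (n, mean, sd) tuples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-group variation: group means around the grand mean
    ss_group = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group variation: recovered from each group's standard deviation
    ss_error = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_group = len(groups) - 1
    df_error = total_n - len(groups)
    ms_group = ss_group / df_group
    ms_error = ss_error / df_error
    return {
        "df_group": df_group, "df_error": df_error, "df_total": total_n - 1,
        "ss_group": ss_group, "ss_error": ss_error, "ss_total": ss_group + ss_error,
        "ms_group": ms_group, "ms_error": ms_error, "F": ms_group / ms_error,
    }

# Laramie: 8 days, mean 12.9, sd 4.7; Cheyenne: 20 days, mean 9.1, sd 3.3
table = anova_from_summary([(8, 12.9, 4.7), (20, 9.1, 3.3)])
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```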
9) (5 minutes) A survey asked engineering students if they use TI calculators. 256 said they did, while
144 said they did not. Find a 92% confidence interval for the percent of engineering students that
use a TI calculator.
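A minimal sketch for checking the interval afterwards, assuming the usual large-sample (normal approximation) interval for a proportion. The helper name `proportion_ci` is hypothetical; the counts are the ones given in the problem.

```python
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Large-sample confidence interval for a single proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(successes=256, n=256 + 144, confidence=0.92)
print(f"92% CI: ({low:.3f}, {high:.3f})")
```

A 92% level is not on most tables, so the critical value is the z-score with 4% in each tail.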
10) (3 minutes) My wife asked five of her friends how many loads of laundry they did each month. Their
answers were: 32, 45, 20, 28, and 37. My wife thought they would say 30 loads per month, and she
asked me to test whether their answers were significantly different from 30 on average. What type
of test could I do? Check all that apply under the correct assumptions
A z-test if I assume that the sample size is greater than 30
A z-test if I assume normality and assume the standard deviation is the true sigma
A t-test if I assume normality
A t-test if I assume each data point comes from a matched pairs experiment
A chi-squared test if I assume the expected values will be greater than 5
An F test if I assume normality and assume that the standard deviation for each friend is equal
11) (5 minutes) A study looked at several types of engineers and where they lived. Below is the
observed, expected, and partial chi-squared values. Fill in the missing values in the three tables.
OBS       Mechanical    Civil    Electrical    Chemical
City      26            44       63            45
Suburb    ____          33       45            59

EXP       Mechanical    Civil    Electrical    Chemical
City      ____          ____     54.8          52.8
Suburb    30.6          38.0     53.2          51.2

χ²        Mechanical    Civil    Electrical    Chemical
City      0.94          ____     1.24          1.14
Suburb    0.97          0.65     1.27          1.17
12) (4 minutes) At the protest in front of the Union on April 29th I could hear them shouting statistics to
make their protest. One thing they said was that women who wear revealing clothing are not more
likely to be attacked. Suppose we had the following data:
Of 300 women who do not wear revealing clothing 94% have not been victims of an attack
Of 200 women who do wear revealing clothing 93% have not been victims of an attack
Which of the following equations would you use to make a 95% confidence interval?
.94  .93  1.96
.941  .94 .931  .93

300
200
.94  .93  1.96
.9361  .936 .9361  .936

300
200
.94  .93  1.96
.941  .94 .931  .93

200
300
0.95  1.96
.951  .95
500
.94  .93  2.004
0.95  2.004 0.95
.942 .952

300 200
500
None of the above
13) (6 minutes) We want to get a 90% confidence interval for the percent of engineers that have ever
been accused of incompetence. It’s got to be somewhere around 1%. We want the confidence
interval to have a width no larger than 0.03. How many engineers are we going to need to sample?
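A minimal sketch for checking the sample size afterwards. It rests on two assumptions that should be stated in any written answer: "width" is read as the full width of the interval (so the margin of error is half of it), and 1% is used as the planning value for the proportion. The helper name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_guess, width, confidence):
    """Smallest n so a large-sample CI for a proportion is no wider than `width`."""
    margin = width / 2                                        # half-width of the interval
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    n_exact = z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2
    return ceil(n_exact)                                      # always round up

n_needed = sample_size_for_proportion(p_guess=0.01, width=0.03, confidence=0.90)
print(n_needed)
```

If the 0.03 were instead meant as the margin of error, passing `width=0.06` gives the corresponding answer.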
14) (4 minutes) A study examined the difference in the percentage of lead pipes versus copper pipes
that have microfractures. The following confidence intervals were reported in the journal article.
80% CI: ( 0.007, 0.057)
90% CI: ( 0.001, 0.064)
92% CI: ( -0.002, 0.066)
95% CI: ( -0.006, 0.070)
99% CI: ( -0.018, 0.082)
If the journal article had done a hypothesis test for the difference in the proportions what is the
smallest interval you can put on the p-value?
15) (8 minutes) A study surveyed 31 college students to ask them how many times they had camped in
the mountains. It is assumed the true standard deviation is 8.1 times. The average was 3.6 times.
When they made the 99% confidence interval the answer they got was unrealistic. Which of the
following might be reasons for that?
They calculated the confidence interval wrong – the correct calculation does show realistic results
Their sample must have been biased to get an unrealistically low mean
If the confidence level is too high then the results may not be realistic, and 99% must be too high
They assumed it was normally distributed when in fact it was not
It is because they assumed the standard deviation was known, but it should have been a t-score
The rule of n > 30 doesn’t mean it’s exactly normal, and this shows a time when that rule fails
16) (6 minutes) We want to know how much faster it is to build a road using the old paving machine or
the XRJ500. To find out we use them both for several weeks and record how long it takes them to
pave a quarter mile. Find the 95% confidence interval for the mean difference.
                Old Paving machine    XRJ5000
N               50 roads              70 roads
Average time    4.3 hours             3.4 hours
Std Dev         1.7 hours             0.9 hours

Matched Pairs Standard Deviation: 0.22 hours
Pooled Standard Deviation: 1.28 hours
17) (3 minutes) Based on the residual plot shown below, which assumptions for regression do you feel
ought to be examined and why?
18) (1 minute) What is the hardest topic in this class?