Statistics 4220 Final Version A

NAME: _________________________________________

Instructions:
- Read these instructions.
- Do not turn the page until the test begins.
- You have 120 minutes.
- This test is printed on both sides, so don’t miss a page.
- Each question is worth the number of minutes listed next to it. This test is timed for 100 minutes.
- For this test you may use a page of notes, a calculator, and z/t/χ² tables. If you need any of these, please find a solution before the exam begins.
- If you have a question during the test, please come forward quietly so that you are not disruptive. If you leave early, please do so quietly. Note that I cannot give answers that are part of the test, only clarify the English being used.
- You must show your work. Answers which are correct but do not show any work may not get full credit. I might assume you either guessed, cheated, or used some fancy calculator.
- Cheating is not tolerated. Any inappropriate activity will be discussed after the final.
- Hats or hoods must be moved so that your face is not obscured.
- If you think you might throw away your tables, you are encouraged to donate them to the students of next semester.
- Please turn off your cell phone. You cannot have your phone out at all. No one wants to hear “Use ANOVA” during the test. Actually we probably do, but turn off your phone anyway.

1) (5 minutes) Road Ordinance Company Kits sells bags of gravel. A civil engineer wants to test whether the size of the gravel is 5 grams on average, assuming the standard deviation is 2.2 grams. She knows the distribution of gravel sizes is highly right skewed. She randomly selects 121 pieces of gravel and finds an average of 5.18. Which picture below shows her p-value (Circle the letter)?

[Pictures A, B, C, D, E, and F]
(G) This problem cannot be done because it is not actually normal

2) (4 minutes) I fired an angry bird at 45° at full power 40 times. The distance it travels is programmed to have a random element to it, but a 95% confidence interval for the mean distance was (9.8, 10.2).
The use of “95%” means 95% of what? (Check all that apply.)
[ ] 95% of my averages will be within the given confidence interval
[ ] 95% of confidence intervals I could do with 40 trials would capture the true average
[ ] 95% of the angles used to fire an angry bird will cause the bird to land within the interval
[ ] 95% of the true average lands within a distance of 9.8 to 10.2
[ ] 95% of the angry birds will travel a distance between 9.8 and 10.2
[ ] 95% of the time this method is used, it will correctly capture the population average

3) (10 minutes) I’m going to try to convince my wife to let me experiment on how hot a spray paint can gets before it explodes. I have a hypothesis that it’s more than 120°F. I plan on getting 36 cans and testing at the 1% significance level. If I assume I know the standard deviation is 20°F, how powerful is my test at a mean value of 125°F?

4) (8 minutes) In fall 2012, students were asked on the teacher evaluation to rank the effort they put into class on a 5-point scale: 5 - a great deal, 4 - a good deal, 3 - moderate, 2 - a little, and 1 - very little. The overall department average was 3.93. The 146 students from Dr. Crawford’s Stat 2050 class reported an average of 3.85 with a standard deviation of 0.85. Assuming the results are actually numerical, test if this indicates that Dr. Crawford’s 2050 class is less work on average.

5) (10 minutes) Is higher intelligence related to poor social skills? To find out, a random sample of 260 people was tested for poor social skills or low intelligence. Of the 125 people with poor social skills, 56 had low intelligence. Of the 135 people with good social skills, 73 had low intelligence. Test whether there is evidence that social skills change depending on the intelligence level.

6) (6 minutes) Scales are supposed to be recalibrated after every use, but does it really matter?
To find out, I asked 600 people to step on a scale, record the measurement, step off, and then step back on again and record the second measurement. Based on the results below, find a 99% confidence interval for the average change in scale reading.

          First measurement   Second measurement
N         600                 600
Mean      263.14              277.24
S         60                  75

Pooled Standard Deviation: 67.92
Matched Pairs Standard Deviation: 7.2
Difference between the Standard Deviations: 5

7) (6 minutes) Below is the output from regression on the length of a pipe and the deflection in the center. Assume the assumptions are met. Unfortunately the p-value is super small and not very interesting. I have decided for this test that I am going to hack the data and change the p-value to 0.0499. Of course, if I change the p-value I have to change other values to make this output look consistent (wouldn’t it be awful if a student noticed I cheated on the numbers!). Circle the numbers which must be changed, and only the numbers that must be changed (note that there is more than one correct way to do this).

Regression Statistics
Multiple R        0.698729
R Square          0.488222
Standard Error    77.28074
Observations      20

ANOVA
              Df    SS          MS          F          P-value
Regression    1     102553.6    102553.6    17.1715    0.00061
Residual      18    107501.6    5972.313
Total         19    210055.2

              Coefficients    Std Error    t Stat      P-value
Intercept     -2.48371        32.63094     -0.07612    0.940167
length        5.615491        1.355138     4.143851    0.00061

8) (6 minutes) A bridge designer believes that the distribution of wind velocities is normal with the same variance in any city. The wind speed was measured in Laramie and Cheyenne:
Laramie: 8 days averaged 12.9 mph with a standard deviation of 4.7 mph.
Cheyenne: 20 days averaged 9.1 mph with a standard deviation of 3.3 mph.
The pooled standard deviation was 3.73, which got a t-score of 2.436 and a p-value of 0.022. They concluded that the average wind velocity is not the same between Laramie and Cheyenne.
Fill in the ANOVA table that they would have gotten if they had done ANOVA.

          DF      SS      MS      F
Group     ____    ____    ____    ____
Error     ____    ____    ____
Total     ____    ____

9) (5 minutes) A survey asked engineering students if they use TI calculators. 256 said they did, while 144 said they did not. Find a 92% confidence interval for the percent of engineering students that use a TI calculator.

10) (3 minutes) My wife asked five of her friends how many loads of laundry they did each month. Their answers were: 32, 45, 20, 28, and 37. My wife thought they would say 30 loads per month, and she asked me to test whether their answers were significantly different from 30 on average. What type of test could I do? Check all that apply under the correct assumptions.
[ ] A z-test if I assume that the sample size is greater than 30
[ ] A z-test if I assume normality and assume the standard deviation is the true sigma
[ ] A t-test if I assume normality
[ ] A t-test if I assume each data point comes from a matched pairs experiment
[ ] A chi-squared test if I assume the expected values will be greater than 5
[ ] An F test if I assume normality and assume that the standard deviation for each friend is equal

11) (5 minutes) A study looked at several types of engineers and where they lived. Below are the observed, expected, and partial chi-squared values. Fill in the missing values in the three tables.

OBS       Mechanical    Civil    Electrical    Chemical
City      26            44       63            45
Suburb    ____          33       45            59

EXP       Mechanical    Civil    Electrical    Chemical
City      ____          ____     54.8          52.8
Suburb    30.6          38.0     53.2          51.2

X2        Mechanical    Civil    Electrical    Chemical
City      0.94          ____     1.24          1.14
Suburb    0.97          0.65     1.27          1.17

12) (4 minutes) At the protest in front of the Union on April 29th I could hear them shouting statistics to make their protest. One thing they said was that women who wear revealing clothing are not more likely to be attacked.
Suppose we had the following data:
Of 300 women who do not wear revealing clothing, 94% have not been victims of an attack.
Of 200 women who do wear revealing clothing, 93% have not been victims of an attack.
Which of the following equations would you use to make a 95% confidence interval?

$(0.94 - 0.93) \pm 1.96\sqrt{\frac{0.94(1 - 0.94)}{300} + \frac{0.93(1 - 0.93)}{200}}$

$(0.94 - 0.93) \pm 1.96\sqrt{\frac{0.936(1 - 0.936)}{300} + \frac{0.936(1 - 0.936)}{200}}$

$(0.94 - 0.93) \pm 1.96\sqrt{\frac{0.94(1 - 0.94)}{200} + \frac{0.93(1 - 0.93)}{300}}$

$0.95 \pm 1.96\sqrt{\frac{0.95(1 - 0.95)}{500}}$

.94 .93 2.004 0.95 2.004 0.95 .942 .952 300 200 500

None of the above

13) (6 minutes) We want to get a 90% confidence interval for the percent of engineers that have ever been accused of incompetence. It’s got to be somewhere around 1%. We want the confidence interval to have a width no larger than 0.03. How many engineers are we going to need to sample?

14) (4 minutes) A study examined the difference in the percentage of lead pipes versus copper pipes that have microfractures. The following confidence intervals were reported in the journal article.
80% CI: (0.007, 0.057)
90% CI: (0.001, 0.064)
92% CI: (-0.002, 0.066)
95% CI: (-0.006, 0.070)
99% CI: (-0.018, 0.082)
If the journal article had done a hypothesis test for the difference in the proportions, what is the smallest interval you can put on the p-value?

15) (8 minutes) A study surveyed 31 college students to ask them how many times they had camped in the mountains. It is assumed the true standard deviation is 8.1 times. The average was 3.6 times. When they made the 99% confidence interval, the answer they got was unrealistic. Which of the following might be reasons for that?
[ ] They calculated the confidence interval wrong; the correct calculation does show realistic results
[ ] Their sample must have been biased to get an unrealistically low mean
[ ] If the confidence level is too high then the results may not be realistic, and 99% must be too high
[ ] They assumed it was normally distributed when in fact it was not
[ ] It is because they assumed the standard deviation was known, but it should have been a t-score
[ ] The rule of n > 30 doesn’t mean it’s exactly normal, and this shows a time when that rule fails

16) (6 minutes) We want to know how much faster it is to build a road using the old paving machine or the XRJ5000. To find out, we use them both for several weeks and record how long it takes them to pave a quarter mile. Find the 95% confidence interval for the mean difference.

                Old paving machine    XRJ5000
N               50 roads              70 roads
Average time    4.3 hours             3.4 hours
Std Dev         1.7 hours             0.9 hours

Matched Pairs Standard Deviation: 0.22 hours
Pooled Standard Deviation: 1.28 hours

17) (3 minutes) Based on the residual plot shown below, which assumptions for regression do you feel ought to be examined and why?

18) (1 minute) What is the hardest topic in this class?