Stat 4220 homework – Due April 24

1) The casino boss has heard a rumor that the dice in his casino are rigged. (For those not familiar with dice, there are 6 possible outcomes, and each outcome is supposed to be equally likely.) He hired a graduate student (lackey) to take a die and roll it 1002 times. The student recorded that the die rolled a one 150 times, a two 155 times, a three 164 times, a four 180 times, a five 150 times, and a six 203 times. Test whether the die is rigged.

H0: The die is not rigged (all proportions are 1/6)
Ha: The die is rigged (not all proportions are 1/6)
α=0.05

Outcome   Expected   Observed   (O-E)^2/E
1         167        150        1.73
2         167        155        .86
3         167        164        .054
4         167        180        1.012
5         167        150        1.73
6         167        203        7.76

Chi-squared = 13.15 (df = 5)
.02 < p-value < .025
Reject H0
Our data show the die is rigged.

2) The same group of students also indicated on the survey what year in school they were. Since statistics is a sophomore level class, we might expect to see more sophomores than any other group. Use a chi-square goodness-of-fit test to determine if the different years in school are equally distributed.

Year        Count
Freshman    50
Sophomore   65
Junior      37
Senior      20

H0: The distribution across the classes is even
Ha: The distribution across the classes is uneven
α=0.05

Year        Expected   (O-E)^2/E
Freshman    43         1.14
Sophomore   43         11.26
Junior      43         0.84
Senior      43         12.30

Χ2 = 25.54 (df = 3)
p-value < 0.0001
Reject H0
The distribution of classes is not even.

3) For many years TV executives used the guideline that 30% of the viewing audience were watching each of the traditional big three prime-time networks and 10% were watching cable stations on a weekday night. A random sample of 500 viewers in the Tampa-St. Petersburg, Florida, area last Monday night showed that 165 homes were tuned in to the ABC affiliate, 140 to the CBS affiliate, 125 to the NBC affiliate, and the remainder were viewing a cable station.
At the .05 significance level, can we conclude that the guideline is still reasonable?

H0: The distribution given is correct
Ha: The distribution has changed
α=0.05

            ABC    CBS    NBC    Cable
Observed    165    140    125    70
Expected    150    150    150    50
(O-E)^2/E   1.5    .67    4.17   8

Chi-squared = 14.33 (df = 3)
p-value = 0.002
Reject H0
The distribution has changed, so the guideline is no longer reasonable.

4) A 98% confidence interval for the difference in the average length of a movie between “action flicks” and “chick flicks” was (22.43, 25.61) minutes. Which of the following statements is true?

98% of “action flicks” and “chick flicks” last between 22.43 minutes and 25.61 minutes
The probability the next “action flick” or “chick flick” lasts between 22.43 and 25.61 minutes is 98%
We are 98% confident the average time for “chick flicks” and “action flicks” is between 22.43 and 25.61
The evidence does not support the claim that “action flicks” have a different average than “chick flicks”
X “Action flicks” are 22.43 to 25.61 minutes longer than “chick flicks” on average with 98% confidence
Of all possible “action flicks” and “chick flicks” 98% have an average difference of 22.43 to 25.61

5) University of Michigan surveyed high school seniors nationwide who smoke and asked them which brands of cigarettes they use. Is there a relationship between Race and Cigarette Brand?

http://www.monitoringthefuture.org/data/tables/cigbrands/table1.html
LD Johnston, PM O'Malley, JG Bachman, JE Schulenberg. (Apr. 1999). Cigarette brands smoked by American teens: One brand predominates; three account for nearly all of teen smoking. University of Michigan News and Information Services: Ann Arbor, MI. [On-line].
Available: www.isr.umich.edu/src/mtf; accessed 04/15/2013

Observed counts:
                   Black      White      Hispanic
Marlboro           6          1276       90
Newport            87         138        36
Camel              0          198        5
All other Brands   13         205        25

H0: Cigarette preference is independent of race
Ha: Cigarette preference depends on race
α=0.05

Expected counts:
                   Black      White      Hispanic
Marlboro           69.95286   1199.098   102.9495
Newport            13.30736   228.1082   19.58442
Camel              10.35017   177.4175   15.23232
All other Brands   12.38961   212.3766   18.23377

Chi-squared contributions, (O-E)^2/E:
                   Black      White      Hispanic
Marlboro           58.46749   4.932019   1.628851
Newport            408.0904   35.59491   13.75948
Camel              10.35017   2.387808   6.87357
All other Brands   0.030072   0.256217   2.510832

Chi-squared = 544.88 (df = 6)
p-value ≈ 0
Reject H0
Cigarette brand preference does depend on race.

6) A study investigated whether people think Labrador retrievers are cuter than Afghan Hounds. The researchers walked a Labrador past 100 people and 78 petted the dog. They walked an Afghan Hound past 90 people and 61 petted the dog. Find a 96% confidence interval for the difference in proportions of people who will pet a Labrador versus an Afghan Hound.

(78/100 - 61/90) ± 2.054*sqrt(78/100*(1-78/100)/100 + 61/90*(1-61/90)/90) = (-0.0300, 0.2344)

7) George Bush Sr. mentions on T.V. that the average age of a student at UW is 23 years old. To test his hypothesis, you ask 3 randomly chosen UW students what their ages are, and use α=.01. Assume the ages of students at UW are normally distributed. The ages were: 22 years old, 28 years old, and 24 years old. Test whether George Bush was right.

n = 3
x̄ = (22+28+24)/3 = 24.67
μ0 = 23
df = 3-1 = 2
sx = sqrt(((22-24.67)^2 + (28-24.67)^2 + (24-24.67)^2)/(3-1)) = 3.055

H0: μ = 23
Ha: μ ≠ 23
α=.01

The ages are normally distributed, so we can use a t-test.

t_df = (x̄ - μ0)/(sx/sqrt(n))
t_2 = (24.67 - 23)/(3.055/sqrt(3)) = .9468

The t-value of .9468 is between .816 and 1.061 on the t-table for 2 degrees of freedom, so the tail probability is between .20 and .25. This is a two-sided test, so we need to double the probability to get the p-value: .4 < p-value < .5. Since the p-value is greater than .4, which is far above α=.01, we cannot reject the null hypothesis. So Bush was right.
(Again), it appears that the average age of students at UW is about 23 years old.

8) The Working Imitation Design Gadget Engineering Tool is manufactured by a machine that sometimes has a flaw in the production. According to the machine specs the flaw distribution per hour should be:

0 flaws   70%
1 flaw    15%
2 flaws   8%
3 flaws   3%
4 flaws   2%
5 flaws   1%

To see whether the machine is performing according to specifications we randomly sample 200 hours and get the following flaw distribution:

0 flaws   131
1 flaw    28
2 flaws   16
3 flaws   11
4 flaws   8
5 flaws   6

Can we say with 5% significance that the machine is not performing at specifications?

There is not enough data to do a goodness-of-fit test: at least one expected count is less than 5 (for example, 5 flaws: 200 * 0.01 = 2).

9) Billy says that the average speed of a mule is faster than the average speed of a zebra. To test this he rides 7 different zebras and 7 different mules. Assume each time he rides the animal the exact same way. Test whether Billy is right.

Ride    First   Second   Third   Fourth   Fifth   Sixth   Seventh   MEAN    S
Zebra   31      22       40      28       35      37      34        32.43   6.02
Mule    42      37       28      39       31      48      33        36.86   6.87

H0: mu_zebra >= mu_mule
Ha: mu_zebra < mu_mule (mules are faster)
α=0.05

T = (32.43-36.86)/sqrt(6.02^2/7 + 6.87^2/7) = -1.28
p-value = .12 (lower tail)
Fail to reject H0
We cannot say the mules run faster than the zebras.

10) In a class survey done in a statistics class, students were asked, “Suppose that you are buying a new car and the model you are buying is available in three colors: silver, blue, or green. Which color would you pick?” Of the 111 students who responded, 59 picked silver, 27 picked green, and 25 picked blue. Is there sufficient evidence to conclude that the colors are not equally preferred?
H0: Colors are equally preferred
Ha: Colors are not equally preferred
α=0.05

            Silver   Green   Blue
Observed    59       27      25
Expected    37       37      37
(O-E)^2/E   13.08    2.70    3.89

X2 = 19.68 (df = 2)
p-value < 0.0001
Reject H0
The colors are not equally preferred.

11) Suppose that on a typical day, the proportion of students who drive to campus is .30 (30%), the proportion of students who bike is .60 (60%), and the remaining .10 (10%) come to campus some other way (e.g., walk, take the bus, get a ride). The campus sponsors a “spare the air” day to encourage people not to drive to campus on that day. They want to know whether the proportions using each mode of transportation on that day differ from the norm. To test this hypothesis, a random sample of 300 students that day was asked how they got to campus, with the following results:

Method of Transportation   Frequency
Drive                      80
Bike                       200
Other                      20
Total                      300

H0: The new program did nothing (proportions match the norm)
Ha: The new program effected a change
α=0.05

            Drive   Bike   Other
Observed    80      200    20
Expected    90      180    30
(O-E)^2/E   1.11    2.22   3.33

X2 = 6.667 (df = 2)
p-value = 0.036
Reject H0
The program did effect change.

12) Ashley is eating ice cream when she gets a brain freeze. Her thought is that it’s because she was eating with her left hand. So she gets 100 bowls of cookie dough ice cream and asks 50 of her friends to randomly choose either their right or their left hand. They eat as fast as they can until they get a brain freeze. Then she asks them to switch hands with a new bowl of cookie dough ice cream and eat until they get a brain freeze. Here is her data:

Left hand: 50 bowls, Average time: 48 seconds, Standard Deviation: 27 seconds
Right hand: 50 bowls, Average time: 37 seconds, Standard Deviation: 24 seconds
Pooled Standard deviation: 25.5 seconds
Matched Pairs deviation: 2.5 seconds
Difference in deviations: 3 seconds

Make a 99% confidence interval for the difference in times for each hand.

The problem doesn’t say the times are normal, but n=50, so the sample mean is approximately normal. This is matched pairs because each right hand is paired to one of the left hands by being attached to the same body.
t score = 2.704 (t-table value for 99% confidence, df = 40 row)
(48-37) ± 2.704 * 2.5 / sqrt(50) = (10.04, 11.96)

13) The MagBlast company demolishes buildings by setting four charges on each corner of the building. The charges are supposed to detonate at the same time, but sometimes something goes wrong and not all of them ignite on time. If you had 500 buildings, the expected distribution of the number of charges that go off on time is given below:

All five detonate       406
Only four               35
Only three              27
Only two                15
Just one detonates      10
No charges detonate     7

You have been asked to investigate what would happen to the charges if a building was demolished when it is raining. To find out you randomly select 500 buildings around the United States and demolish them when it is raining. The data you observed is given below:

All five detonate       417
Only four               25
Only three              31
Only two                17
Just one detonates      10
No charges detonate     0

Test whether your data supports the hypothesis that rain affects the detonation of the charges (assuming homeland security does not catch you).

H0: The rain does not affect the detonation (proportions remain the same)
Ha: The rain does affect the detonation (at least one proportion is different)
α=0.05

                        (O-E)^2/E
All five detonate       .298
Only four               2.857
Only three              .5926
Only two                .2667
Just one detonates      0
No charges detonate     7

Chi-squared = 11.01 (df = 5)
.05 < p-value < .06
Fail to reject H0
Our data does not show that rain affects the detonation.

14) Captain Buckwheat uses a sextant to measure the height of a ship when he spots it on the horizon. After attacking the ship he calculates the gold looted. His goal is to be able to predict the amount of gold based on the ship height. Below is the data and regression from the 500 ships he has attacked during his career.

Coefficients:
              Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)   9.8958     0.1879       52.66     <2e-16 ***
Height        6.5537     0.3154       20.78     <2e-16 ***

Based on the output above, are there any assumptions that you feel should be investigated?
Perhaps there is clumping, which might indicate a nonrandom sample.

Based on the output above find a 90% confidence interval for the slope of the regression line.

6.5537 ± 1.65*0.3154 = (6.03, 7.07)

15) Every day I see the elevator says it’s been “inspected.” Somehow I feel dubious. I think the elevator gets inspected less than 70% of the time. I plan on doing a hypothesis test by investigating 100 elevators and using α=0.10. If the true percentage was actually only 65%, how powerful would my test be?

Z = -1.28 (more precisely -1.2816)
critical value = 0.70 - 1.2816*sqrt(0.70*0.30/100) = 0.64127
Power = P(sample proportion < 0.64127 when p = 0.65) = P(Z < (0.64127-0.65)/sqrt(0.65*0.35/100)) = 0.427

0.427 is the best answer, but if they use the null proportion in the standard error of the alternative distribution (which I think is a reasonable error, easy to make) they’ll get .424.

16) Randomly selected deaths of motorcycle riders are summarized in the table below. Use a .05 significance level to test the claim that such fatalities occur with equal frequency in the different months.

Month   Observed   Expected   X2
Jan     6          16.7       6.86
Feb     8          16.7       4.53
Mar     10         16.7       2.69
Apr     16         16.7       0.03
May     22         16.7       1.68
June    28         16.7       7.64
July    24         16.7       3.19
Aug     28         16.7       7.64
Sept    26         16.7       5.18
Oct     14         16.7       0.44
Nov     10         16.7       2.69
Dec     8          16.7       4.53
Tot     200        200        47.1

H0: Fatalities are even across the months
Ha: Fatalities are not even across the months
α=0.05

X2 = 47.1 (df = 11)
p-value ≈ 0
Reject H0
Fatalities are not even across the months.
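
The goodness-of-fit arithmetic in problem 16 can be checked with a short script. This is only a sketch, not part of the assigned solution: the helper name chi_square_gof is made up for illustration, and the critical value 19.675 is taken from a chi-square table (upper 5% point, 11 degrees of freedom) rather than computed. The script uses the exact expected count 200/12 instead of the rounded 16.7, so it gives 47.2 rather than 47.1.

```python
# Sketch: chi-square goodness-of-fit statistic for problem 16.
# Assumption: under H0 every month has the same expected proportion (1/12).

def chi_square_gof(observed, expected_props):
    """Return (statistic, df) for a chi-square goodness-of-fit test."""
    total = sum(observed)
    expected = [total * p for p in expected_props]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1

# Observed motorcycle fatalities, Jan through Dec (from the table in problem 16).
observed = [6, 8, 10, 16, 22, 28, 24, 28, 26, 14, 10, 8]
stat, df = chi_square_gof(observed, [1 / 12] * 12)

# Table value (not computed here): upper 5% point of chi-square with 11 df.
CRITICAL_05_DF11 = 19.675
reject = stat > CRITICAL_05_DF11

print(f"X2 = {stat:.1f}, df = {df}, reject H0: {reject}")
# prints: X2 = 47.2, df = 11, reject H0: True
```

The same helper should work for the other goodness-of-fit problems by passing their observed counts and hypothesized proportions, for example [0.3, 0.6, 0.1] for problem 11, with the critical value looked up for the matching degrees of freedom.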