1) The expected value comes from the regression output. The intercept was 5.0811 and the slope was 2.0923, so the expected value at x=6 is 5.0811 + 2.0923*6 = 17.6349.

2) DFRegression = G-1 = 3-1 = 2
DFResidual = N-G = 47-3 = 44
You don't have to calculate both: once you find one, you can subtract it from the total degrees of freedom, since the two must add up to 46.

SSRegression = SUM( n*(xbar - totalaverage)^2 ) = 17*(15.21-15.92)^2 + 22*(17.32-15.92)^2 + 8*(13.59-15.92)^2 = 95.12
SSResidual = SUM( (n-1)*s^2 ) = (17-1)*4.02^2 + (22-1)*4.20^2 + (8-1)*3.81^2 = 730.62
Again, you don't have to calculate both: once you have one SS, you can subtract it from the total of 825.74.

MSRegression = 95.12/2 = 47.56
MSResidual = 730.62/44 = 16.60
F = 47.56/16.60 = 2.864

ANOVA
            df    SS        MS       F       P-value
Groups       2    95.12     47.56    2.864   0.06771
Residual    44    730.62    16.60
Total       46    825.74

3) The decision that the two are related is the alternative hypothesis, so the Grinch rejected the null hypothesis. The truth was that the two variables are independent. That means the Grinch rejected the null when the null was true, so a Type I error was made. But the Grinch made an error before that: the p-value was larger than 0.05, so he should have failed to reject the null instead of rejecting it.

5) It will decrease the width of her confidence interval.

6) H0: p1 - p2 = 0
None of the confidence intervals contain 0 (note: you are not checking whether α is in the interval).
Even at 98% we would reject, so the p-value is smaller than 0.02 (note: you do not divide by 2).
P-value < α, therefore reject.
Conclude that the Republicans and Democrats do not spend the same amount of money on campaigns.

7) Note: It doesn't matter that one sample has 10 times as many as the other sample, and the proportion of dogs who die without the cure is (1000-19)/1000 = 981/1000.
n1π1 = 981, n1(1-π1) = 19, n2π2 = 11, n2(1-π2) = 89, so the assumptions for normality are met.

$(p_1 - p_2) \pm Z\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$

$= \left(\frac{981}{1000} - \frac{11}{100}\right) \pm 1.88\sqrt{\frac{\frac{981}{1000}\left(1-\frac{981}{1000}\right)}{1000} + \frac{\frac{11}{100}\left(1-\frac{11}{100}\right)}{100}} = (0.8116, 0.9304)$

8) np = 49 and n(1-p) = 576, so the assumptions for normality are met.
   1) H0: p ≤ 0.10
   2) HA: p > 0.10
   3) α = 0.05
   4) $Z = \frac{\hat{p} - 0.10}{\sqrt{\frac{0.10(1-0.10)}{n}}} = \frac{\frac{49}{625} - 0.10}{\sqrt{\frac{0.10(1-0.10)}{625}}} = -1.8$
      If you look up 1.8 (using the positive z-table) you'll get an area of 0.9641, so the area to the right of -1.8 is 0.9641. If you don't draw a picture you'll never get it straight.
   5) p-value: 0.9641
   6) Fail to reject the null.
   7) There is not enough evidence to conclude that the percent of veterinarians who go bankrupt is over 10%.

11) SSE = SUM( (n-1)*s^2 ) = (4-1)*4.6^2 + (4-1)*5.2^2 + (6-1)*4.8^2 + (6-1)*5.8^2 = 428
An alternative way: MSE = SSE/(n-g), so 5.17^2 = SSE/(20-4) and SSE = 5.17^2 * (20-4) = 428

12) H0: μ ≤ 10
HA: μ > 10 (the claim)
α = 0.05
The measurements are random and normally distributed.

$\bar{X} = \frac{23 + 12 + 18}{3} = 17.67$

$s = \sqrt{\frac{(23-17.67)^2 + (12-17.67)^2 + (18-17.67)^2}{3-1}} = 5.5$

$T_2 = \frac{17.67 - 10}{5.5/\sqrt{3}} = 2.415$

Looking up 2.415 on the T-table with 2 degrees of freedom, 0.05 < p-value < 0.10.
Thus we fail to reject the null. There is not enough evidence to conclude that Santa's bag weighs more than 10 kilotons.

13) Clark rejected the null hypothesis, and Truthman said that the null hypothesis was incorrect, so Clark did not make a Type I or Type II error. The error that Clark made was to do a z-test when n*p < 10.
n*p = 200*0.02 = 4

14) ANOVA
              df    SS     MS      F      P-value
Regression     1    23     23      3.03   0.09845
Residual      29    220    7.58
Total         30    243

15) The correct answers are:
______ Major and gender are related
______ The distribution of majors is not the same for males and females

16) The matched pairs test has a smaller standard deviation, which means a more powerful test.

18) The residual plot looks like a dog, which is meant to be humorous, but believe it or not residual plots like this really do happen, usually in models that have more than one x variable, and usually because multiple assumptions have been violated. These assumptions should be investigated as possibly problematic:
- Independence: There could be clumping in the area of -3, which is why we see a "head".
- Constant Variance: The variance near 1 is small, but the variance near -1 is large, as is the variance near 3.5, which is why we see "legs".
- Linearity: Since there is a curved shape (flowing from the "head" down the "body"), the assumption of linearity could be wrong.
- Normality: There should be a lot of residuals near zero, with fewer residuals further away from zero. Instead there are few near zero, many near -1, few near 1, few near -3.5, and many near 3. A histogram of these residuals would not be bell shaped.
The one assumption that does not look suspicious is that the residuals are centered at zero. Zero looks like the balancing point for the data.

19)
              DF    SS        MS        F
Regression     1    6.40      6.40      4.3048
Error         79    117.45    1.4867
Total         80    123.85

20)
Horse            Age   Price    Expected   Residual
Red Fury           2   $8000    7600          400
Light Knight       9   $4000    6200        -2200
Twigger           15   $2900    5000        -2100
Pastey Beauty     25   $1000    3000        -2000
The best value will have the most negative residual. Light Knight saves 2200, so it is the best value.

There is an expected cell count which is less than 5 (the Mosquito - Cayenne value of 4.4). Therefore we cannot do this test using the methods we learned in class, and no further conclusion (or hypothesis test) can be made.
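The residual comparison in problem 20 can be double-checked with a short script. This is only a sketch: the `horses` list and its tuple layout are names made up for illustration, and the prices and expected values are copied straight from the table in problem 20.

```python
# Residual = actual price - expected price.
# Values are copied from the problem 20 table; the `horses` layout
# (name, age, price, expected) is a hypothetical structure for this sketch.
horses = [
    ("Red Fury", 2, 8000, 7600),
    ("Light Knight", 9, 4000, 6200),
    ("Twigger", 15, 2900, 5000),
    ("Pastey Beauty", 25, 1000, 3000),
]

residuals = {name: price - expected for name, _age, price, expected in horses}

# The best value is the horse priced furthest below its expected price,
# i.e. the one with the most negative residual.
best_value = min(residuals, key=residuals.get)

for name, resid in residuals.items():
    print(f"{name}: {resid}")
print("Best value:", best_value)
```

Taking the minimum residual rather than the largest absolute residual matters here: Red Fury has a positive residual, meaning it costs more than the regression line predicts.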
22)

23) We have the difference in two means, with variances unpooled. Both sample sizes are greater than 30.
t80 = 2.639

$(15 - 12) \pm 2.639\sqrt{\frac{3.5^2}{81} + \frac{1.2^2}{81}} = (1.915, 4.085)$

24) H0: µd = 0
Ha: µd ≠ 0
α = 0.05
t34 = (31-48)/(10.1/sqrt(35)) = -9.96
The p-value is off the chart, ≈ 0.
Reject the null. Our data show that one of the airlines (PilotAirOr) is significantly higher than the other.

25) H0: µ ≥ 4.7
HA: µ < 4.7
α = 0.05
t31 = (4.5-4.7)/(0.5/sqrt(32)) = -2.26
0.01 < p-value < 0.02
Reject the null. Our evidence shows that the Dean's claim that honor students graduate earlier than 4.7 years is correct.

26) Not a large enough sample size.

27) t4 = 2.776
(107.4-88.4) ± 2.776*sqrt(17.4^2/7 + 14.4^2/5) = (-6.55, 44.55)

28) First let's find the cutoff that Bill will use. His distribution for the average is normal, centered at 50, with a standard error of 21/sqrt(9). With α = 0.05 the cutoff will be at z = 1.96. (You could argue that Bill wants a two-tailed test, but I think that if he's betting with the other guy he would only care about the upper tail of his test. The other tail would add very little to the power anyway.)
1.96 = (xbar-50)/(21/sqrt(9))
xbar = 63.72
So if Bill gets an average above 63.72 he will reject (because his p-value will be smaller than 0.05 at that point).
Will's distribution for the average is normal with mean 70 and standard error 21/sqrt(9). The power is the probability, under Will's distribution, that the average is above 63.72 (which means Bill rejects).
Z = (63.72-70)/(21/sqrt(9)) = -0.9
Probability = 1 - 0.1841 = 0.8159
While that's enough to answer the question, I hope you notice that the alpha is reasonably small, the power is reasonably high, and the sample size is rather small (if you can get titanium at WalMart), so everyone is likely to be happy with this study and agree to its results.
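The power calculation in problem 28 can be reproduced without a z-table. The sketch below uses only the Python standard library; the variable names are made up for illustration, and the critical value 1.96 is taken as given from the answer above rather than derived.

```python
from math import sqrt
from statistics import NormalDist

# Numbers from problem 28; variable names are illustrative assumptions.
sigma, n = 21, 9
mu_bill, mu_will = 50, 70
z_crit = 1.96  # critical value used in the written answer

se = sigma / sqrt(n)               # standard error: 21/sqrt(9) = 7
cutoff = mu_bill + z_crit * se     # Bill rejects when the average exceeds this
z_will = (cutoff - mu_will) / se   # cutoff expressed under Will's distribution

# Power = probability, under Will's distribution, of landing above the cutoff.
power = 1 - NormalDist().cdf(z_will)

print(round(cutoff, 2), round(z_will, 3), round(power, 4))
```

The written answer rounds z to -0.9 before using the table, which gives 0.8159; keeping the unrounded z gives a power of about 0.815, so the two agree to two decimal places.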
29) H0: The distribution is what it should be
HA: The distribution is not what it should be (the machine is not working right)
α = 0.05

Flavor         Chi-squared contribution
Watermelon     0.56
Green Apple    2.22
Lemon          0.50
Cherry         4.09
Peach          4.09

The Chi-squared statistic with 4 degrees of freedom is 11.46.
The p-value is between 0.02 and 0.025.
Reject the null. The data show that the machine is not operating within the specified parameters (the quality control expert now has to stop the machine and see if he can fix it).
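As a check on problem 29, the statistic and its p-value can be computed directly. This is a minimal sketch using only the standard library: the `contributions` dictionary is a hypothetical container for the five values listed above, and the p-value uses the closed-form upper tail of the chi-squared distribution that holds specifically for 4 degrees of freedom.

```python
from math import exp

# Per-flavor chi-squared contributions copied from the answer above;
# the dictionary itself is an illustrative structure, not part of the problem.
contributions = {
    "Watermelon": 0.56,
    "Green Apple": 2.22,
    "Lemon": 0.50,
    "Cherry": 4.09,
    "Peach": 4.09,
}

chi_sq = sum(contributions.values())
df = len(contributions) - 1  # 5 categories -> 4 degrees of freedom

# For df = 4 only, P(X > x) = exp(-x/2) * (1 + x/2).
p_value = exp(-chi_sq / 2) * (1 + chi_sq / 2)

print(round(chi_sq, 2), df, round(p_value, 4))
```

The computed p-value falls between 0.02 and 0.025, matching the range read from the table. For other degrees of freedom this closed form does not apply, and a table or a statistics library would be needed.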