2014 AP Stats Exam Solutions

On campus: $\frac{17 + 7}{33} = \frac{24}{33} \approx 0.73$

Off campus: $\frac{25 + 12}{67} = \frac{37}{67} \approx 0.55$

Residential status appears to make a difference in participation for this sample of students. On-campus students are more likely to participate in one or more activities (73%) than off-campus students (55%).

Because the percentages in the stacked bar charts are not the same, we can see that there is an association between residential status and level of participation in activities. On-campus students are slightly more likely to participate in 2 or more activities (21% vs. 18%), and off-campus students are more likely to not participate in any activity (45% vs. 27%).

Since p = 0.23 is greater than any reasonable significance level (0.01, 0.05, 0.10), the administrator should fail to reject the null hypothesis. This means that there is not significant evidence of an association between residential status and level of participation in activities at the university.

$P(X = 3) = \frac{3}{9} \times \frac{2}{8} \times \frac{1}{7} = \frac{1}{84} \approx 0.0119$

Yes, since the probability is so small, it gives us reason to doubt the manager's claim. Only about 1% of the time would we select all 3 women from this group if it truly was a random sample. An outcome this unlikely is hard to attribute to chance alone.

No, this does not correctly simulate the selection. The dice rolls are independent. They simulate a situation where 3 of the 9 people are chosen WITH replacement, so that the probability of picking a woman remains constant at one-third. However, we need to simulate choosing 3 WITHOUT replacement, which changes the probabilities after each selection, as shown in part (a). We should use a different method, such as drawing names out of a hat without replacement, which would correctly simulate the process.

$P(X > 140)$: $z = \frac{140 - 120}{10.5} \approx 1.90$. Looking up -1.90 on the normal table, I get 0.0287. The probability of them losing some state funding is 0.0287.
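The normal calculation above can be checked numerically. The following is a minimal sketch in Python using only the standard library. The function name `normal_upper_tail` is a hypothetical helper introduced here for illustration, and the mean of 120 and standard deviation of 10.5 are the values that appear in the z-score above. Because the sketch uses the unrounded z-score, its result can differ slightly from the table value of 0.0287 in the later decimal places.

```python
import math

def normal_upper_tail(x, mean, sd):
    """Return P(X > x) for a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    # For a standard normal Z, P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))

# Values taken from the z-score in the solution above.
z = (140 - 120) / 10.5
p_single = normal_upper_tail(140, 120, 10.5)

print(f"z = {z:.2f}, P(X > 140) = {p_single:.4f}")
```

Rounding z to two decimals before the lookup, as a printed table forces us to do, is what produces the table value quoted in the solution.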
School A would be less likely to lose funding because it would be less likely to show a mean greater than 140. This is because the sampling distribution of the mean for samples of size 3 has less variability than the distribution for samples of size 1. It is less likely for the AVERAGE of 3 observations to be over 140 than for one single observation to be over 140.

PROOF: $P(\bar{X} > 140)$: $z = \frac{140 - 120}{10.5/\sqrt{3}} \approx 3.30$. Looking up -3.30, we get a probability of 0.0005, which is definitely less likely than our answer in part (a).

$P(\text{Monday or Friday}) = \left(\frac{2}{5}\right)^3 = \frac{8}{125} = 0.064$

Since income data is likely to be skewed to the right because of many low incomes and only a few high outliers, the mean is likely to be much higher than the median and most of the data. When data is skewed, the median is a more accurate representation of the majority of the data.

Method 2 is better because Method 1 is biased. Method 1 is a voluntary response sample. If only 600 (less than 10%) answer as expected, those who answer may be different from the others in some way. For instance, alumni might be more likely to respond if their income is higher (because they're proud of it), which would cause an overestimation of the average income. Method 2 avoids this nonresponse bias by following up to make sure all randomly selected people answer the question. A smaller unbiased random sample will give better results than a larger biased sample. Method 1 will overestimate, whereas Method 2 should give an estimate close to the true average yearly income of the class of 1988.

This is a matched pairs t test.

$\mu_d$ = the true mean difference in price (woman - man)
$H_0: \mu_d = 0$
$H_a: \mu_d > 0$

Assumptions:
It is stated that the 8 car models were randomly selected.
The population of car models is greater than 10 × 8 = 80.
Normality: The plot of differences (given) is not skewed (fairly symmetric), so normality can be assumed.

Matched pairs t test: $t = \frac{585 - 0}{530.71/\sqrt{8}} \approx 3.118$

With 7 degrees of freedom, our p-value is 0.0084. Let the significance level be 5%.
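The test statistic and p-value above can be reproduced with a short sketch in Python. It assumes that 585 is the sample mean of the differences and 530.71 is their sample standard deviation, as the structure of the formula above suggests. The helpers `t_pdf` and `t_upper_tail` are hypothetical names introduced here. They approximate the tail area by integrating the t density with Simpson's rule, which stands in for a t table or calculator and is not the method used in the original solution.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 via Simpson's rule on [0, t].

    steps must be even for Simpson's rule.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric about 0, so P(T > t) = 0.5 - P(0 < T < t).
    return 0.5 - area_0_to_t

mean_diff = 585.0   # assumed: sample mean of the differences (woman - man)
sd_diff = 530.71    # assumed: sample standard deviation of the differences
n = 8               # number of car models

t_stat = (mean_diff - 0) / (sd_diff / math.sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)

print(f"t = {t_stat:.3f}, one-sided p-value = {p_value:.4f}")
```

The p-value is one-sided because the alternative hypothesis is that the mean difference is greater than 0.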
Since 0.0084 < 0.05, I would reject the null hypothesis and say that there is significant evidence that in this county women pay more than men on average for the same car.

$\hat{y} = -1.5958 + 0.03726(175) = 4.9247$

Residual $= y - \hat{y} = 5.88 - 4.9247 = 0.9553$

Interpretation: The actual fuel consumption rate of Car A is 0.9553 gallons per 100 miles higher than the fuel consumption rate predicted by our LSRL.

Car B has a very small, positive residual, which means that the predicted fuel consumption rate was very close to the actual fuel consumption rate. (The actual was only SLIGHTLY higher.)

There is a moderate positive linear association between the residuals of FCR vs. Length and Engine Size, shown on Graph II. Graph III shows almost no linear association between Wheel Base and the residuals of FCR vs. Length. The points are randomly scattered. In comparison, the residuals of FCR vs. Length have a stronger linear association with engine size than with wheel base.

Jamal should use engine size to improve his predictions, because it has a stronger association with the residuals of FCR vs. Length. The lack of association on Graph III tells us that using wheel base to predict FCR will not give us better predictions than what we already had with length alone. However, the positive association on Graph II tells us that some of the variability that has not yet been explained by length could be explained by engine size.
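The predicted value and residual for Car A computed earlier in this solution can be checked with a few lines of Python. This is only a sketch. The function name `predict_fcr` is hypothetical, the intercept and slope are copied from the least-squares line above, and 175 is taken to be Car A's length because the line is described as FCR vs. Length.

```python
def predict_fcr(length):
    """Predicted FCR from the least-squares line quoted in the solution."""
    intercept = -1.5958
    slope = 0.03726
    return intercept + slope * length

actual_fcr = 5.88                  # observed FCR for Car A
predicted_fcr = predict_fcr(175)   # assumed: 175 is Car A's length
residual = actual_fcr - predicted_fcr  # residual = actual - predicted

print(f"predicted = {predicted_fcr:.4f}, residual = {residual:.4f}")
```

A positive residual means the car's actual FCR is above what the line predicts, which matches the interpretation given for Car A.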