Introduction to Statistics
Professor Hardin
Math 58 – Fall 2007
HW #7, due Wednesday 11/7

In Introduction to the Practice of Statistics: 1.98 – 1.109, 5.3 – 5.8

E14. Suppose you have just moved to California and you face the traumatic situation of taking a multiple-choice exam to qualify for a California driver’s license. This exam consists of 36 multiple-choice questions, with three options provided for each question. Candidates need to answer at least 31 questions correctly in order to pass. Let the random variable X be the number of questions answered correctly on the exam.

a. Suppose that the candidate guesses randomly among the three options on each question. What probability distribution does X have? (Specify the parameters of the distribution as well as its name.)
b. If the candidate guesses randomly on each question, what would be the expected number of questions that he would answer correctly? What would the standard deviation of the number correct be? (Hint: for the standard deviation, look up the properties of the binomial in your textbook.)
c. If the candidate guesses randomly on each question, what is the probability of passing the exam? (You’ll have to use R or the theoretical answer from the binomial simulation applet, here and for the rest of the questions.)
d. If the candidate can eliminate one wrong option on each question and guess randomly between the other two, what is the probability of passing the exam? (Identify the probability distribution of X in this situation.)
e. If the candidate studies to the point of having a 0.9 probability of answering each question correctly, independently from question to question, what is the probability of passing the exam? (Identify the probability distribution of X in this case also.)
f. Let p represent the probability of answering each question correctly, independently from question to question.
If the candidate wants to have at least a 99% chance of passing the exam, what is the smallest value of p that will achieve this?

The rules actually allow a candidate to take the exam three times, and passing (with at least 31 of 36 correct) at least once is sufficient to qualify for a license.

g. Suppose that the candidate studies to the point of having a 0.8 probability of answering each question correctly, independently from question to question. What is the probability of passing the exam at least once in the three allotted attempts? (Hint: it is much easier to figure out the probability of not passing at all and then to use the complement rule.)

E15. Birthweights of babies in the United States have been said to be reasonably modeled by a normal distribution with mean 3250 grams and standard deviation 550 grams, N(3250, 550). Babies weighing less than 2500 grams are considered to be of low birthweight.

a. Produce a sketch of the normal density curve described by these parameters (with appropriate scaling). Provide a label and a scale for the horizontal axis.
b. Approximately 68% of newborns weigh between which two values? Shade the corresponding area of the distribution in (a).
c. On the graph in (a), shade the area of the distribution corresponding to babies of low birthweight. Estimate from your shading the percentage of the distribution that falls in this range.
d. To estimate the probability of a randomly selected baby being of low birthweight, we will use the normal model to determine P(X ≤ 2500), where X, which denotes the birthweight of a randomly selected baby, follows a N(3250, 550) distribution. We can calculate the probabilities using R or an applet. The applet is slightly more visual. Open the “Normal Probability Calculator” applet and enter “birthweight” in the variable box. Specify 3250 as the mean and 550 as the standard deviation. Press the “Scale to Fit” button. Select the first row and, in the “X” box, enter the value 2500 and hit the Enter key.
(Note that the applet scales both the x-axis and the “z-axis”.)
   i. What does the applet report for the z-score and the probability?
   ii. Write a one-sentence interpretation of the z-score and the above probability.
e. Data from the National Vital Statistics Report indicate that there were 3,880,894 births in the United States in 1997. A total of 291,154 babies were of low birthweight. What is the observed proportion of low birthweight babies, and how does this compare to the proportion predicted by the normal model?
f. Suppose you wanted to estimate the probability of a baby weighing more than 10 pounds (4536 grams) at birth. Sketch (and label) the distribution and shade the area of interest.
g. In the applet, change the inequality to >, specify 4536 in the “X” box, and press Enter. What are the z-score and the probability reported by the applet? Confirm that the numbers are consistent with your sketch.
h. Suppose you wanted to estimate the probability that a randomly selected baby weighs between 3000 and 4000 grams. Sketch (and label) the probability model with the area of interest shaded.
i. In the applet, specify 3000 in the first “X” box, and then select the second row and specify 4000 in the second “X” box. Press the Enter key. What does the applet report for the z-scores and probability between these two values?
j. Data from the National Vital Statistics Report indicate that there were 2,552,852 newborns weighing between 3000 and 4000 grams in 1997. How closely does the model’s prediction in (i) match the observed relative frequency?
k. You can also use the normal distribution model to “work backwards.” Suppose you want to know how much a baby would have to weigh to be among the lightest 2.5% of all newborns (the 2.5th percentile). Sketch (and label) a normal model and indicate the appropriate area on the sketch.
l. In the applet, unselect the second row.
Since we want the lightest 2.5%, make sure the first row inequality is set at < and change the “probability” box value to 0.025. Hit the Enter key. The applet will determine the corresponding z-score and weight. Are the results consistent with your sketch? Explain.
m. Use the applet to determine how much a baby would have to weigh at birth to be in the heaviest 2.5% of all newborns. How does this z-score compare to that in (l) and how do the weight cutoffs compare? Explain the relationships.
n. Suppose instead that birthweights of newborns followed a normal distribution with mean 3500 and standard deviation 400. Use the applet to create this sketch (pressing the “Scale to Fit” button) and to determine the two values such that the middle 95% of birthweights fall in between the two values.
o. How do the z-scores in (n) compare to those in (l) and (m)?
p. Notice that R gives the same values as the applet. Use R to calculate the probabilities in (d), (g), and (i). Write down the R commands you used.
q. You can also use R to calculate the percentile cutoffs. Recalculate the cutoffs from (l), (m), and (n). Write down the R commands you used.
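
R reminder. The sketch below shows the kind of base R commands that E14 and E15 call for: pbinom for binomial probabilities, pnorm for normal probabilities, and qnorm for percentile cutoffs. All of the numbers (n, p, mu, sigma, the cutoffs, and the variable names) are made up for illustration and are not the values from the exercises; substitute your own.

```r
# Illustrative values only (hypothetical; NOT the parameters of E14 or E15).
n     <- 10    # number of independent trials
p     <- 0.5   # probability of success on each trial
mu    <- 100   # mean of the normal model
sigma <- 15    # standard deviation of the normal model

# Binomial mean and standard deviation: n*p and sqrt(n*p*(1-p)).
binom_mean <- n * p
binom_sd   <- sqrt(n * p * (1 - p))

# Upper-tail binomial probability: P(X >= 8) = 1 - P(X <= 7).
p_at_least_8 <- 1 - pbinom(7, size = n, prob = p)

# Complement rule: P(at least one success in 3 independent attempts),
# where each attempt succeeds with probability p_at_least_8.
p_once_in_3 <- 1 - (1 - p_at_least_8)^3

# Normal probabilities: lower tail, upper tail, and between two values.
p_below   <- pnorm(85, mean = mu, sd = sigma)                       # P(X <= 85)
p_above   <- pnorm(130, mean = mu, sd = sigma, lower.tail = FALSE)  # P(X > 130)
p_between <- pnorm(115, mean = mu, sd = sigma) -
             pnorm(85,  mean = mu, sd = sigma)                      # P(85 < X < 115)

# Percentile cutoffs ("working backwards"): 2.5th and 97.5th percentiles.
cut_low  <- qnorm(0.025, mean = mu, sd = sigma)
cut_high <- qnorm(0.975, mean = mu, sd = sigma)

print(c(binom_mean, binom_sd, p_at_least_8, p_once_in_3))
print(c(p_below, p_above, p_between, cut_low, cut_high))
```

Note that pbinom and pnorm return lower-tail probabilities by default, so an "at least" or "greater than" probability needs either a subtraction from 1 or lower.tail = FALSE.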