Introduction to the Practice of Statistics, Sixth Edition
Moore, McCabe
Section 4.5 Homework Answers

4.103 Attendance at 2-year and 4-year colleges. In a large national population of college students, 61% attend 4-year institutions and the rest attend 2-year institutions. Males make up 44% of the students in the 4-year institutions and 41% of the students in the 2-year institutions.

(a) Find the four probabilities for each combination of gender and type of institution in the following table. Be sure your probabilities sum to 1.

          4-year institution   2-year institution   Total
Men       0.2684               0.1599               0.4283
Women     0.3416               0.2301               0.5717
Total     0.61                 0.39                 1

Males make up 44% of the students in the 4-year institutions:
P(male | 4-year) = P(male and 4-year) / P(4-year), so 0.44 = x / 0.61 and x = 0.44(0.61) = 0.2684.

Males make up 41% of the students in the 2-year institutions:
P(male | 2-year) = P(male and 2-year) / P(2-year), so 0.41 = y / 0.39 and y = 0.41(0.39) = 0.1599.

The entries for women are what remains in each column: 0.61 - 0.2684 = 0.3416 and 0.39 - 0.1599 = 0.2301.

(b) Consider randomly selecting a female student from this population. What is the probability that she attends a 4-year institution?

"Consider randomly selecting a female student from this population" means it is given that the student is female.
P(4-year | woman) = P(4-year and woman) / P(woman) = 0.3416 / 0.5717 = 0.5975

4.104 Draw a tree diagram. Refer to the previous exercise. Draw a tree diagram to illustrate the probabilities in a situation where you first identify the type of institution attended and then identify the gender of the student.

4.105 Draw a different tree diagram for the same setting. Refer to the previous two exercises. Draw a tree diagram to illustrate the probabilities in a situation where you first identify the gender of the student and then identify the type of institution attended. Explain why the probabilities in this tree diagram are different from those that you used in the previous exercise.

In problem 104 we start by dividing the entire sample space by institution type.
Then we find the percentages of males and females within each of the two smaller subgroups (all students in 4-year institutions, and all students in 2-year institutions). Problem 105 instead starts by dividing the same sample space by gender, and then finds the percentage of males who attend 4-year versus 2-year institutions, and the same for females. The second-stage branches therefore condition on different events: P(gender | institution type) in 104, but P(institution type | gender) in 105.

4.106 Education and income. Call a household prosperous if its income exceeds $100,000. Call the household educated if the householder completed college. Select an American household at random, and let A be the event that the selected household is prosperous and B the event that it is educated. According to the Current Population Survey, P(A) = 0.138, P(B) = 0.261, and the probability that a household is both prosperous and educated is P(A and B) = 0.082. What is the probability P(A or B) that the household selected is either prosperous or educated?

P(A or B) = P(A) + P(B) - P(A and B)
P(prosperous OR educated) = P(prosperous) + P(educated) - P(prosperous AND educated) = 0.138 + 0.261 - 0.082 = 0.317

4.107 Find a conditional probability. In the setting of the previous exercise, what is the conditional probability that a household is prosperous, given that it is educated? Explain why your result shows that events A and B are not independent.

P(A | B) = P(prosperous | educated) = P(A and B) / P(B) = 0.082 / 0.261 = 0.3142
If A and B were independent, P(A | B) would equal P(A). Since 0.3142 does not equal P(A) = 0.138, the events are not independent.

4.108 Draw a Venn diagram. Draw a Venn diagram that shows the relation between the events A and B in Exercise 4.106. Indicate each of the following events on your diagram and use the information in Exercise 4.106 to calculate the probability of each event. Finally, describe in words what each event is.

a) {A and B} The household is both prosperous and educated.
P(A and B) = P(prosperous AND educated) = 0.082

b) {A and Bc} The household is prosperous, but not educated.
P(A and Bc) = P(prosperous AND not educated) = 0.138 - 0.082 = 0.056

c) {Ac and B} The household is not prosperous, but it is educated.
P(Ac and B) = P(not prosperous AND educated) = 0.261 - 0.082 = 0.179

d) {Ac and Bc} The household is neither prosperous nor educated.
P(Ac and Bc) = P(not prosperous AND not educated) = 1 - P(A or B) = 1 - 0.317 = 0.683

4.109 Sales of cars and light trucks. Motor vehicles sold to individuals are classified as either cars or light trucks (including SUVs) and as either domestic or imported. In a recent year, 69% of vehicles sold were light trucks, 78% were domestic, and 55% were domestic light trucks. Let A be the event that a vehicle is a car and B the event that it is imported. Write each of the following events in set notation and give its probability.

Step 1: Write down probabilities using function notation.
P(LT) = 0.69
P(domestic) = 0.78
P(domestic AND LT) = 0.55

(a) The vehicle is a light truck.
This is the event Ac. P(Ac) = P(LT) = 0.69

(b) The vehicle is an imported car.
This is the event {A and B}. P(A and B) = P(not LT AND not domestic) = 1 - P(LT OR domestic) = 1 - (0.69 + 0.78 - 0.55) = 0.08

4.110 Income tax returns. In 2004, the Internal Revenue Service received 312,226,042 individual tax returns. Of these, 12,757,005 reported an adjusted gross income of at least $100,000, and 240,128 reported at least $1 million. If you know that a randomly chosen return shows an income of $100,000 or more, what is the conditional probability that the income is at least $1 million?

Step 1: Write down probabilities using function notation.
P(≥ $100K) = 12,757,005 / 312,226,042 = 0.04086
P(≥ $1 million) = 240,128 / 312,226,042 = 0.00077

Notice that P(≥ $1 million) = P(≥ $1 million AND ≥ $100K). A return showing at least $1 million automatically shows at least $100,000, so the returns that satisfy both conditions are exactly the returns showing at least $1 million.

P(≥ $1 million | ≥ $100K) = P(≥ $1 million AND ≥ $100K) / P(≥ $100K) = 0.00077 / 0.04086 ≈ 0.0188

This is the same as if we used the actual counts:
P(≥ $1 million | ≥ $100K) = 240,128 / 12,757,005 ≈ 0.0188

4.111 Conditional probabilities and independence. Using the information in Exercise 4.109, answer these questions.
(a) Given that a vehicle is imported, what is the conditional probability that it is a light truck?

P(LT | not domestic) = P(LT AND not domestic) / P(not domestic) = (0.69 - 0.55) / (1 - 0.78) = 0.14 / 0.22 = 0.6364

(b) Are the events "vehicle is a light truck" and "vehicle is imported" independent? Justify your answer.

We can check using either P(B | A) = P(B) or P(A and B) = P(A)P(B).
P(LT | not domestic) = 0.6364, which does not equal P(LT) = 0.69. Therefore we do not have independence.

4.116 Academic degrees and gender. Here are the projected numbers (in thousands) of earned degrees in the United States in the 2010-2011 academic year, classified by level and by the sex of the degree recipient:

               Female   Male   Total
Bachelor's     933      661    1594
Master's       402      260    662
Professional   51       44     95
Doctorate      26       26     52
Total          1412     991    2403

(a) Convert this table to a table giving the probabilities for selecting a degree earned and classifying the recipient by gender and the degree by the levels given above.

To convert this table to probabilities, divide every entry by the sample space count of 2403.

               Female   Male     Total
Bachelor's     0.3883   0.2751   0.6633
Master's       0.1673   0.1082   0.2755
Professional   0.0212   0.0183   0.0395
Doctorate      0.0108   0.0108   0.0216
Total          0.5876   0.4124   1

(b) If you choose a degree recipient at random, what is the probability that the person you choose is a woman?

Everyone in the table is a degree recipient. Thus, we need only find P(woman) = 0.5876.

(c) What is the conditional probability that you choose a woman, given that the person chosen received a professional degree?

P(woman | professional) = P(woman AND professional) / P(professional) = 0.0212 / 0.0395 = 0.5367

(d) Are the events "choose a woman" and "choose a professional degree recipient" independent? How do you know?

Among professional degree recipients the percentage of women is 53.67%, but the percentage of women in the whole population of degree recipients is 58.76%.
Thus, we do not have independence, since the percentage of women among professional degree recipients differs from the percentage of women among degree recipients in general. Had the numbers been the same, we would have had independence.

4.117 Find some probabilities. The previous exercise gives the projected number (in thousands) of earned degrees in the United States in the 2010-2011 academic year. Use these data to answer the following questions.

(a) What is the probability that a randomly chosen degree recipient is a man?

P(man) = 0.4124

(b) What is the conditional probability that the person chosen received a bachelor's degree, given that he is a man?

P(bachelors | man) = P(bachelors AND man) / P(man) = 0.2751 / 0.4124 = 0.6671

(c) Use the multiplication rule to find the probability of choosing a male bachelor's degree recipient. Check your result by finding this probability directly from the table of counts.

The general multiplication rule is P(A and B) = P(A)P(B | A), as shown in class with the marble problem.
P(man AND bachelors) = P(man)P(bachelors | man) = 0.4124(0.6671) = 0.2751
Directly from the table of counts: 661 / 2403 = 0.2751, which agrees.

4.122 Gender and majors. The probability that a randomly chosen student at the University of New Harmony is a woman is 0.62. The probability that the student is studying education is 0.17. The conditional probability that the student is a woman, given that the student is studying education, is 0.8. What is the conditional probability that the student is studying education, given that she is a woman?

P(woman) = 0.62
P(study education) = 0.17
P(woman | study education) = 0.8
P(study education | woman) = ?

The formula for conditional probability is P(A | B) = P(A and B) / P(B), which here translates to
P(study education | woman) = P(study education AND woman) / P(woman)
= P(study education)P(woman | study education) / P(woman)
= (0.17)(0.8) / 0.62
= 0.2194

4.123 Spelling errors. As explained in Exercise 4.74 (page 286), spelling errors in a text can be either nonword errors or word errors. Nonword errors make up 25% of all errors.
A human proofreader will catch 90% of nonword errors and 70% of word errors. What percent of all errors will the proofreader catch? (Draw a tree diagram to organize the information given.)

First we need to write down the probabilities using the correct conjunctions and function notation.
P(nonword) = 0.25, so P(word) = 1 - 0.25 = 0.75
P(catch | nonword) = 0.9
P(catch | word) = 0.7

Now write the question using function notation: P(catch) = ?

The question asks for all the ways an error can be caught. Since there are two types of error, I need the probability of catching each type and then add the two.
P(catch) = P(nonword AND catch OR word AND catch)
= P(nonword AND catch) + P(word AND catch)
= 0.25(0.9) + 0.75(0.7)
= 0.75
The proofreader will catch 75% of all errors.

4.124 Mathematics degrees and gender. Of the 16,071 degrees in mathematics given by U.S. colleges and universities in a recent year, 73% were bachelor's degrees, 21% were master's degrees, and the rest were doctorates. Moreover, women earned 48% of the bachelor's degrees, 42% of the master's degrees, and 29% of the doctorates. You choose a mathematics degree at random and find that it was awarded to a woman. What is the probability that it is a bachelor's degree?

P(bachelors) = 0.73
P(masters) = 0.21
P(doctorate) = 1 - 0.73 - 0.21 = 0.06

"Moreover, women earned 48% of the bachelor's degrees." This is difficult to interpret, and you may need to read it several times until it is clear what the 48% applies to and with what sample space. Read carefully, the 48% applies to women, but 48% of what? Of the bachelor's degrees. Taking all bachelor's degrees as the whole, women earned 48% of that whole.
P(woman | bachelors) = 0.48
P(woman | masters) = 0.42
P(woman | doctorate) = 0.29

"What is the probability that it is a bachelor's degree?" Given the previous statement, I must assume that it is given that the person is a woman.
P(bachelors | woman) = P(bachelors AND woman) / P(woman)

The denominator is calculated as:
P(woman) = P(bachelors AND woman OR masters AND woman OR doctorate AND woman)
= P(bachelors AND woman) + P(masters AND woman) + P(doctorate AND woman)
= 0.73(0.48) + 0.21(0.42) + 0.06(0.29)
= 0.456

P(bachelors | woman) = P(bachelors AND woman) / P(woman) = 0.73(0.48) / 0.456 = 0.7684

4.127 Cystic fibrosis. Cystic fibrosis is a lung disorder that often results in death. It is inherited but can be inherited only if both parents are carriers of an abnormal gene. In 1989, the CF gene that is abnormal in carriers of cystic fibrosis was identified. The probability that a randomly chosen person of European ancestry carries an abnormal CF gene is 1/25. (The probability is less in other ethnic groups.) The CF20m test detects most but not all harmful mutations of the CF gene. The test is positive for 90% of people who are carriers. It is (ignoring human error) never positive for people who are not carriers. Jason tests positive. What is the probability that he is a carrier?

P(carries gene) = 1/25
Note that I am assuming from this point on that we are only considering someone of European ancestry.
P(positive test | carries gene) = 0.9
P(positive test | NOT carries gene) = 0
P(carries gene | positive test) = ?

You can logically see that the answer is 1, since the test is never positive if you are not a carrier. Creating a tree diagram will allow you to see this as well. The formula agrees:
P(carries gene | positive test) = (1/25)(0.9) / [(1/25)(0.9) + (24/25)(0)] = 1

4.128 Use Bayes's rule. Refer to the previous exercise. Jason knows that he is a carrier of cystic fibrosis. His wife, Julianne, has a brother with cystic fibrosis, which means the probability is 2/3 that she is a carrier. If Julianne is a carrier, each child she has with Jason has probability 1/4 of having cystic fibrosis. If she is not a carrier, her children cannot have the disease. Jason and Julianne have one child, who does not have cystic fibrosis. This information reduces the probability that Julianne is a carrier.
Use Bayes's rule to find the conditional probability that Julianne is a carrier, given that she and Jason have one child who does not have cystic fibrosis.

P(Julianne is a carrier) = P(carrier) = 2/3, so P(not carrier) = 1/3
P(cystic | carrier) = 1/4, so P(no cystic | carrier) = 3/4
P(cystic | not carrier) = 0, so P(no cystic | not carrier) = 1

P(carrier | no cystic) = P(carrier AND no cystic) / P(no cystic)
= (2/3)(3/4) / [(2/3)(3/4) + (1/3)(1)]
= (1/2) / (5/6)
= 0.6

4.129 Muscular dystrophy. Muscular dystrophy is an incurable muscle-wasting disease. The most common and serious type, called DMD, is caused by a sex-linked recessive mutation. Specifically: women can be carriers but do not get the disease; a son of a carrier has probability 0.5 of having DMD; a daughter has probability 0.5 of being a carrier. As many as 1/3 of DMD cases, however, are due to spontaneous mutations in sons of mothers who are not carriers. Toni has one son, who has DMD. In the absence of other information, the probability is 1/3 that the son is the victim of a spontaneous mutation and 2/3 that Toni is a carrier. There is a screening test called the CK test that is positive with probability 0.7 if a woman is a carrier and with probability 0.1 if she is not. Toni's CK test is positive. What is the probability that she is a carrier?

P(son has DMD | mom carrier) = 0.5
P(daughter a carrier of DMD | mom carrier) = 0.5
Toni has one son who has DMD, so before the test:
P(Toni carrier) = 2/3
P(Toni not a carrier) = 1/3 (the son's DMD is due to a spontaneous mutation)
P(CK test positive | mom is a carrier) = 0.7
P(CK test positive | mom not a carrier) = 0.1
P(Toni carrier | test positive) = ?

P(Toni carrier | test positive) = (2/3)(0.7) / [(2/3)(0.7) + (1/3)(0.1)] = 0.9333
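
The Bayes's rule calculations in Exercises 4.128 and 4.129 follow the same two-branch pattern, so they can be checked with a short script. The following is a minimal sketch in Python, not part of the textbook solutions; the function name posterior and its parameter names are made up for illustration. It uses exact fractions to avoid rounding.

```python
from fractions import Fraction


def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes's rule for a yes/no hypothesis H and observed evidence E.

    prior                = P(H)
    p_evidence_if_true   = P(E | H)
    p_evidence_if_false  = P(E | not H)
    Returns P(H | E).
    """
    joint_true = prior * p_evidence_if_true          # P(H and E)
    joint_false = (1 - prior) * p_evidence_if_false  # P(not H and E)
    # The denominator is P(E), summed over the two branches of the tree.
    return joint_true / (joint_true + joint_false)


# Exercise 4.129: H = "Toni is a carrier", E = "CK test is positive".
toni = posterior(Fraction(2, 3), Fraction(7, 10), Fraction(1, 10))

# Exercise 4.128: H = "Julianne is a carrier", E = "child does not have CF".
julianne = posterior(Fraction(2, 3), Fraction(3, 4), Fraction(1))

print("4.129:", toni, float(toni))
print("4.128:", julianne, float(julianne))
```

For Toni the script gives 14/15, which rounds to the 0.9333 found above. The same function handles any exercise in this section with one yes/no hypothesis and one observed event, for example the positive test in 4.127.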