1. Eagle Credit Union (ECU) has experienced a 10% default rate with its commercial loan customers (that is, 90% of commercial loan customers pay back their loans). ECU has developed a statistical test to assist in predicting which commercial loan customers will default. The test assigns a rating of either “Approve” or “Reject” to each loan applicant. When applied to recent commercial loan customers who paid their loans, the test gave an “Approve” rating in 80% of the cases examined. When applied to recent commercial loan customers who defaulted, it gave a “Reject” rating in 70% of the cases examined.

a. Use this data to construct a joint probability table.

The probabilities are:

Default and approved = (0.10)(0.30) = 0.03
Default and rejected = (0.10)(0.70) = 0.07
Paid and approved = (0.90)(0.80) = 0.72
Paid and rejected = (0.90)(0.20) = 0.18

The joint probability table is:

            Default   Paid    Total
Approved     0.03     0.72    0.75
Rejected     0.07     0.18    0.25
Total        0.10     0.90    1.00

b. What is the conditional probability of a “Reject” rating given that the customer defaulted?

$$P(\text{Rejected} \mid \text{Default}) = \frac{0.07}{0.10} = 0.70$$

c. What is the conditional probability of an “Approve” rating given that the customer defaulted?

$$P(\text{Approved} \mid \text{Default}) = \frac{0.03}{0.10} = 0.30$$

d. Suppose a new customer receives a “Reject” rating. If they are given the loan anyway, what is the probability that they will default?

$$P(\text{Default} \mid \text{Rejected}) = \frac{P(\text{Default and Rejected})}{P(\text{Rejected})} = \frac{0.07}{0.25} = 0.28$$

3. A soft drink machine can be regulated (discharge level m) so that it dispenses an average of m ounces per cup. If the ounces of fill are normally distributed with mean m and standard deviation equal to 0.3 ounces, find the setting of the discharge level m so that eight-ounce cups will overflow only one percent of the time.

Overflowing only one percent of the time corresponds to a right-tail area of 0.01. The z-value for this tail area is 2.326.
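The quoted z-value can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original solution. It also carries the z-value through to the implied setting m and confirms that the overflow probability at that setting is 1%.

```python
from statistics import NormalDist

cup_size = 8.0        # ounces, from the problem statement
sigma = 0.3           # standard deviation of the fill, in ounces
overflow_prob = 0.01  # allowed right-tail area

# z-value that leaves a right-tail area of 0.01 under the standard normal curve
z = NormalDist().inv_cdf(1 - overflow_prob)

# Solve cup_size = m + z * sigma for the discharge setting m
m = cup_size - z * sigma

# Sanity check: probability that a fill exceeds the cup size at this setting
check = 1 - NormalDist(mu=m, sigma=sigma).cdf(cup_size)

print(f"z = {z:.3f}, m = {m:.2f} oz, P(overflow) = {check:.4f}")
```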
The corresponding fill volume is:

x = mean + z × (standard deviation)
8 = m + 2.326(0.3)
8 = m + 0.6978
m = 8 - 0.6978
m = 7.30

The discharge level should be set to m = 7.3 ounces.

6. The United States Golf Association requires that the weight of a golf ball must not exceed 1.62 oz. The association periodically checks golf balls sold in the United States by sampling specific brands stocked by pro shops. Suppose that a manufacturer claims that no more than 1 percent of its brand of golf balls exceeds 1.62 oz. in weight. Suppose that 24 of this manufacturer’s golf balls are randomly selected, and let X denote the number of the 24 randomly selected golf balls that exceed 1.62 oz.

a. Find the probability that none of the randomly selected golf balls exceed 1.62 oz.

If the manufacturer’s claim is true, then the probability of any single golf ball exceeding the 1.62 oz. weight is p = 0.01. In 24 trials, the probability that 0 golf balls will exceed that weight is:

$$P(X = 0) = \binom{24}{0}(0.01)^{0}(1 - 0.01)^{24} = 0.7857$$

b. Find the probability that at least one of the randomly selected golf balls exceeds 1.62 oz.

The probability that at least one of the selected golf balls exceeds 1.62 oz. is the complement of the probability that none of the balls exceeds that weight.

$$P(X \ge 1) = 1 - P(X = 0) = 1 - 0.7857 = 0.2143$$

c. Suppose that two of the randomly selected golf balls are found to exceed 1.62 oz. Do you believe the claim that no more than 1 percent of this brand of golf balls exceed 1.62 oz. in weight?

If the manufacturer’s claim is true, the probability that two balls will exceed the weight limit is:

$$P(X = 2) = \binom{24}{2}(0.01)^{2}(1 - 0.01)^{22} = 0.0221$$

Since there is only a 2.21% chance of finding two balls that exceed the weight limit out of 24 randomly selected balls, it is likely that the manufacturer’s claim is not true.

7.
Owing to several major ocean oil spills by tank vessels, Congress passed the 1990 Oil Pollution Act, which requires tankers to be designed with thicker hulls. Further improvements in the structural design of a tank vessel have been implemented since then, each with the objective of reducing the likelihood of an oil spill and decreasing the amount of outflow in the event of hull puncture. To aid in this development, J.C. Daidola reported on the spillage amount and cause of puncture for 50 recent major oil spills from tankers and carriers. The file OilSpill.sgd contains the data for the 50 spills reported.

a. Is any one cause more likely to occur than any other? Justify the answer using hypothesis tests.

There are four identified causes of oil spills in the data set: collision, fire, hull failure, and grounding. Two cases are identified as having an “unknown” cause. These two cases will be ignored, and the sample will be considered to consist of the 48 spills that have one of the four identified causes.

If there is no difference in the likelihood of each of the four causes, then each would be expected to occur 25% of the time. The expected number of spills for each cause would then be 48 × 0.25 = 12. The following hypothesis test checks whether there is a significant difference between the expected distribution of causes and the observed distribution.

Hypotheses:
Null: $H_0$: Causes of oil spills are equally likely.
Alternative: $H_1$: Causes of oil spills are not equally likely.

Critical value: This will be a $\chi^2$ test, with 4 - 1 = 3 degrees of freedom. Assuming $\alpha = 0.05$, the critical value is 7.815. The null hypothesis will be rejected if the test statistic is greater than 7.815.

Test statistic:

               Collision   Fire     Hull failure   Grounding   Total
Observed (O)   10          12       12             14          48
Expected (E)   12          12       12             12          48
(O - E)²/E     0.3333      0.0000   0.0000         0.3333      0.6667

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 0.6667$$

Decision: Since the test statistic is less than the critical value, the decision is to fail to reject the null hypothesis.
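The goodness-of-fit statistic above can be reproduced directly from the observed counts. This is a minimal sketch, assuming the counts are listed in the order collision, fire, hull failure, grounding; the critical value 7.815 is taken from the text rather than computed, and the variable names are illustrative.

```python
# Observed spill counts for the four identified causes
# (assumed order: collision, fire, hull failure, grounding)
observed = [10, 12, 12, 14]

n = sum(observed)                               # 48 spills with a known cause
expected = [n / len(observed)] * len(observed)  # equal likelihood: 12 per cause

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over categories
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 7.815  # chi-square, 3 df, alpha = 0.05 (quoted in the text)
reject_null = chi_sq > critical_value

print(f"chi-square = {chi_sq:.4f}, reject H0: {reject_null}")
```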
Summary: There is insufficient evidence at the 0.05 level of significance to support a claim that one cause is more likely than another.

b. Construct a 90 percent confidence interval for the difference between the mean spillage amount of accidents caused by collision and the mean spillage amount of accidents caused by fire/explosion. Interpret the result.

Data:
Collision: Mean: $\bar{x}_1 = 76.6$, StdDev: $s_1 = 70.3629$, Sample size: $n_1 = 10$
Fire/Explosion: Mean: $\bar{x}_2 = 70.9167$, StdDev: $s_2 = 60.6757$, Sample size: $n_2 = 12$

Since the standard deviations are approximately equal, a pooled variance is used and the number of degrees of freedom is $n_1 + n_2 - 2 = 10 + 12 - 2 = 20$. The critical t-value for a 90% confidence interval with 20 degrees of freedom is $t_{crit} = \pm 1.7247$.

The limits of the confidence interval are then calculated from:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$$(76.6 - 70.9167) \pm (1.7247)\sqrt{\frac{(10 - 1)(70.3629)^2 + (12 - 1)(60.6757)^2}{10 + 12 - 2}}\sqrt{\frac{1}{10} + \frac{1}{12}}$$

$$5.6833 \pm (1.7247)(65.2133)(0.4282)$$

$$5.6833 \pm 48.1582$$

$$(-42.4749,\ 53.8415)$$

The 90% confidence interval is (-42.47, 53.84). This confidence interval is interpreted to mean that we can be 90% confident that the true difference between the two means lies within the limits of this interval. Because the interval contains zero, the data do not show a significant difference between the two mean spillage amounts.

c. Can we say that the mean spillage amount of accidents caused by grounding is the same as the corresponding mean of accidents caused by hull failure?

Data:
Grounding: Mean: $\bar{x}_1 = 47.7857$, StdDev: $s_1 = 28.4664$, Sample size: $n_1 = 14$
Hull Failure: Mean: $\bar{x}_2 = 54.4167$, StdDev: $s_2 = 56.3874$, Sample size: $n_2 = 12$

Hypotheses:
Null: $H_0: \mu_1 = \mu_2$
Alternative: $H_1: \mu_1 \ne \mu_2$

Critical value: Treating the variances as equal, the number of degrees of freedom is 14 + 12 - 2 = 24. Assuming that $\alpha = 0.05$, the critical value is $t_{crit} = \pm 2.064$. The null hypothesis will be rejected if the test statistic is less than -2.064 or greater than 2.064.
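Parts b and c rest on the same pooled-variance standard error, so the arithmetic for both can be checked with one helper. This is a minimal sketch built from the summary statistics quoted above; `pooled_std_error` is a hypothetical helper name, and the critical t-values are copied from the text rather than computed.

```python
import math

def pooled_std_error(s1, n1, s2, n2):
    """Return (pooled standard deviation, standard error of the mean difference)."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    sp = math.sqrt(pooled_var)
    return sp, sp * math.sqrt(1 / n1 + 1 / n2)

# Part b: 90% confidence interval, collision vs. fire/explosion
diff_b = 76.6 - 70.9167
sp_b, se_b = pooled_std_error(70.3629, 10, 60.6757, 12)
t_crit_90 = 1.7247                     # 20 df, quoted in the text
margin_b = t_crit_90 * se_b
ci_b = (diff_b - margin_b, diff_b + margin_b)

# Part c: pooled two-sample t statistic, grounding vs. hull failure
sp_c, se_c = pooled_std_error(28.4664, 14, 56.3874, 12)
t_c = (47.7857 - 54.4167) / se_c
t_crit_95 = 2.064                      # 24 df, two-tailed, quoted in the text
reject_c = abs(t_c) > t_crit_95

print(f"part b: sp = {sp_b:.4f}, CI = ({ci_b[0]:.2f}, {ci_b[1]:.2f})")
print(f"part c: t = {t_c:.4f}, reject H0: {reject_c}")
```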
Test statistic:

$$t_{test} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

$$t_{test} = \frac{47.7857 - 54.4167}{\sqrt{\frac{(14 - 1)(28.4664)^2 + (12 - 1)(56.3874)^2}{14 + 12 - 2}}\sqrt{\frac{1}{14} + \frac{1}{12}}}$$

$$t_{test} = -0.3871$$

Decision: Since the test statistic is between the two critical values, the decision is to fail to reject the null hypothesis.

Summary: There is insufficient evidence to reject a claim that the mean spillage amount of accidents caused by grounding is the same as the corresponding mean of accidents caused by hull failure.

d. State any assumptions required for the inferences derived from the analyses to be valid. Are these assumptions reasonably satisfied?

The $\chi^2$ “goodness of fit” test performed in part a requires that the data be obtained from a random sample, and that the expected frequency of each category be at least 5. As the problem statement gives no information about how the sample was obtained, it is not possible to make a determination about the validity of the random-sample assumption. The requirement concerning the expected frequencies is met, however, as the expected frequency for each category was 12.

The t-based inferences in parts b and c require that the samples be independent, and that they be drawn from normally distributed populations. The inferences assume that these requirements are met. Again, there is insufficient information about the sample to determine whether the requirement of independence is valid. It is possible that some of the reported spills involved the same ships, thereby compromising the independence of the samples.

e. Is the variation in spillage amounts for accidents caused by collision the same as the variation in spillage amounts for accidents caused by grounding?

Data:
Collision: Variance: $s_1^2 = 4950.9333$, Sample size: $n_1 = 10$
Grounding: Variance: $s_2^2 = 810.3352$, Sample size: $n_2 = 14$

Hypotheses:
Null: $H_0: \sigma_1^2 = \sigma_2^2$
Alternative: $H_1: \sigma_1^2 \ne \sigma_2^2$

Critical values: A test of variances uses the F statistic. For this test, the number of degrees of freedom in the numerator is 10 - 1 = 9. The number of degrees of freedom in the denominator is 14 - 1 = 13. Assuming $\alpha = 0.05$ (two-tailed, with the larger sample variance in the numerator), the critical value is $F_{crit} = 3.3120$.

Test statistic:

$$F_{test} = \frac{s_1^2}{s_2^2} = \frac{4950.9333}{810.3352} = 6.11$$

Decision: Since the test value is greater than the critical value, the decision is to reject the null hypothesis.

Summary: There is sufficient evidence at the 0.05 level of significance to reject a claim that the variation in spillage amounts for accidents caused by collision is the same as the variation in spillage amounts for accidents caused by grounding.
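The variance ratio in part e can be recomputed from the standard deviations reported in parts b and c (70.3629 for collision, 28.4664 for grounding). This is a minimal sketch with illustrative variable names; the critical value 3.3120 is copied from the text rather than computed.

```python
# Sample standard deviations reported earlier in the solution
s_collision = 70.3629   # n = 10
s_grounding = 28.4664   # n = 14

# Sample variances are the squared standard deviations
var_collision = s_collision ** 2
var_grounding = s_grounding ** 2

# F statistic with the larger sample variance in the numerator (9 and 13 df)
f_ratio = var_collision / var_grounding

f_crit = 3.3120         # upper critical value quoted in the text
reject_null = f_ratio > f_crit

print(f"variances: {var_collision:.2f}, {var_grounding:.2f}")
print(f"F = {f_ratio:.2f}, reject H0: {reject_null}")
```

Because the ratio exceeds the quoted critical value by a wide margin, the decision to reject the null hypothesis does not depend on small rounding differences in the inputs.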