Statistics 2 Samples and hypothesis testing

Chapter assessment

1. A factory manager is specifying a new storage tank for a particular chemical. In routine use, the tank will be filled to capacity each weekend. There should be enough chemical to last until the next weekend, as emergency deliveries are very expensive. On the other hand, money is wasted if an excessive amount of the chemical is stored. The volume of chemical varies from week to week and is modelled by a Normally distributed random variable X. The manager is investigating the mean of X.

Data are available for a random sample of 15 weeks, giving the volumes of the chemical used in each week. These are as follows (in litres).

1962 1909 1928 1940 1943 1897 1939 1924
1866 1978 1964 1944 1942 1992 1996

The standard deviation of X is taken from long experience to be 28 litres. A 2000-litre tank will be specified if the mean of X is no more than 1930 litres. Carry out a 5% significance test to examine whether a 2000-litre tank should be specified, stating clearly the null and alternative hypotheses and the conclusion. [8]

2. A craftsman makes hand-made souvenirs. The time taken to make a souvenir is a Normally distributed random variable with mean 34 minutes and standard deviation 2.6 minutes. The craftsman undertakes a training course to improve his skill. Afterwards, a random sample of 8 times taken to make souvenirs is as follows (in minutes).

35.4 32.3 26.6 30.4 31.9 33.8 29.6 28.4

Assuming that the underlying standard deviation has not changed, test at the 0.1% level whether the mean time taken to make a souvenir has decreased after the training course. [8]

3. Psychologists are developing a new index of overall intelligence for 11-year-old children. It is assumed that the index is Normally distributed over the whole underlying population and that the standard deviation of this distribution is 12. If the index has been created correctly, its mean over the population should be 50.
The index is measured for a random sample of 100 11-year-old children. It is found that the sample mean value is 47.8. Test the hypothesis that the true mean of the index is 50, against the alternative that it is not 50, at the 1% level of significance. [8]

© MEI, 22/04/08

4. An office experiences a lack of reliability of its email system when transmitting messages. Many emails are successfully transmitted at the first attempt, others are eventually successfully transmitted, but only after more than one attempt, and others are not successfully transmitted at all. The computer manager thinks there may be an association between the success of transmission and the type of user at the intended destination. Results for a random sample of 300 emails are as follows.

                                                Type of destination
Transmission                              Commercial   Government   University
                                          user         department
Successful at first attempt               100          57           23
Successful after more than one attempt    21           14           13
Not successful at all                     31           21           20

(i) State the null and alternative hypotheses under examination in the usual χ² test applied to this contingency table. [2]
(ii) Carry out the test, at the 10% significance level. [12]
(iii) Discuss your conclusions. [4]

5. As part of a survey of interest in local elections, a random sample of 845 people was taken in towns which did not have directly-elected mayors. The people were classified according to age (< 30 or ≥ 30) and their stated level of interest (great or little) in local elections. The results were as follows.

           Level of interest
Age        Great     Little
< 30       49        216
≥ 30       145       435

(i) Carry out the usual χ² test for independence, at the 5% significance level, stating carefully the null and alternative hypotheses and briefly discussing the conclusions. [12]

At a later stage in the survey, a random sample of 1327 people was taken in towns which had directly elected mayors. These people were classified similarly, with the following results.
           Level of interest
Age        Great     Little
< 30       118       314
≥ 30       260       635

(ii) The organisation that commissioned the survey then asked whether, for people under the age of 30, the level of interest in local elections is independent of whether or not there is a locally elected mayor. Using the data in the tables above, write down the 2 × 2 table, including its margins, to be analysed. [4]
(iii) Explain why the usual χ² test might not be appropriate for the 2 × 2 table you have written down in part (ii). [2]

Total 60

Solutions to Chapter assessment

1. H0: μ = 1930
H1: μ > 1930
where μ is the population mean volume in litres.

$\bar{x} = \frac{29124}{15} = 1941.6$

EITHER: Test statistic $z = \frac{1941.6 - 1930}{28/\sqrt{15}} = 1.6045$
Right-hand tail so critical value at 5% level = $\Phi^{-1}(0.95) = 1.645$
Critical region is $z > 1.645$
1.6045 < 1.645 so accept H0: there is not sufficient evidence to suggest that the mean volume is greater than 1930 litres, so a 2000-litre tank should be specified.

OR: Critical value for right-hand tail = $1930 + 1.645 \times \frac{28}{\sqrt{15}} = 1941.89$
Critical region is $\bar{X} > 1941.89$
$\bar{x}$ is not in the critical region so accept H0: there is not sufficient evidence to suggest that the mean volume is greater than 1930 litres, so a 2000-litre tank should be specified.

OR: $P(\bar{X} \ge 1941.6) = 1 - \Phi\left(\frac{1941.6 - 1930}{28/\sqrt{15}}\right) = 1 - \Phi(1.6045) = 1 - 0.9457 = 0.0543$
0.0543 > 0.05, so accept H0: there is not sufficient evidence to suggest that the mean volume is greater than 1930 litres, so a 2000-litre tank should be specified.
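The three routes in solution 1 (test statistic, critical value for the sample mean, and tail probability) can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and are not part of the original solution.

```python
from math import sqrt
from statistics import NormalDist, mean

# Weekly volumes used (litres), as given in question 1.
volumes = [1962, 1909, 1928, 1940, 1943, 1897, 1939, 1924,
           1866, 1978, 1964, 1944, 1942, 1992, 1996]
mu0 = 1930      # mean under H0
sigma = 28      # known population standard deviation
alpha = 0.05    # significance level, right-hand tail

n = len(volumes)
x_bar = mean(volumes)
standard_error = sigma / sqrt(n)
std_normal = NormalDist()  # standard Normal, mean 0 and sd 1

# Route 1: standardised test statistic against the critical z value.
z = (x_bar - mu0) / standard_error
z_crit = std_normal.inv_cdf(1 - alpha)

# Route 2: critical value expressed on the scale of the sample mean.
x_crit = mu0 + z_crit * standard_error

# Route 3: right-hand tail probability of the observed sample mean.
p_value = 1 - std_normal.cdf(z)

reject_by_z = z > z_crit
reject_by_mean = x_bar > x_crit
reject_by_p = p_value < alpha

print(f"x_bar = {x_bar:.1f}, z = {z:.4f}, z_crit = {z_crit:.3f}")
print(f"critical sample mean = {x_crit:.2f}, p-value = {p_value:.4f}")
print("reject H0:", reject_by_z, reject_by_mean, reject_by_p)
```

All three decisions necessarily agree, because each is a rearrangement of the same inequality.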
2. H0: μ = 34
H1: μ < 34
where μ is the population mean time taken in minutes.

$\bar{x} = \frac{248.4}{8} = 31.05$

EITHER: Test statistic $z = \frac{31.05 - 34}{2.6/\sqrt{8}} = -3.209$
Left-hand tail so critical value at 0.1% level = $\Phi^{-1}(0.001) = -\Phi^{-1}(0.999) = -3.090$
Critical region is $z < -3.090$
-3.209 < -3.090, so reject H0: there is evidence to suggest that the craftsman's mean time to make a souvenir has decreased.

OR: Critical value for left-hand tail = $34 - 3.090 \times \frac{2.6}{\sqrt{8}} = 31.16$
Critical region is $\bar{X} < 31.16$
$\bar{x}$ is in the critical region so reject H0: there is evidence to suggest that the craftsman's mean time to make a souvenir has decreased.

OR: $P(\bar{X} < 31.05) = \Phi\left(\frac{31.05 - 34}{2.6/\sqrt{8}}\right) = \Phi(-3.209) = 1 - \Phi(3.209) = 1 - 0.9993 = 0.0007$
0.0007 < 0.001 so reject H0: there is evidence to suggest that the craftsman's mean time to make a souvenir has decreased.

3. H0: μ = 50
H1: μ ≠ 50
where μ is the population mean value of the index.

$\bar{x} = 47.8$

EITHER: Test statistic $z = \frac{47.8 - 50}{12/\sqrt{100}} = -1.8333$
Two-tail test at 1% significance so each tail is 0.5%
Critical value for left-hand tail = $\Phi^{-1}(0.005) = -\Phi^{-1}(0.995) = -2.576$
Critical region is $z < -2.576$ (and, by symmetry, $z > 2.576$)
-1.8333 > -2.576 so accept H0: there is not sufficient evidence to suggest that the true population mean is not 50.

OR: Two-tail test at 1% significance so each tail is 0.5%
Critical value for left-hand tail = $50 - 2.576 \times \frac{12}{\sqrt{100}} = 46.9088$
Critical region is $\bar{X} < 46.9088$ (and, by symmetry, $\bar{X} > 53.0912$)
$\bar{x}$ is not in the critical region so accept H0: there is not sufficient evidence to suggest that the true population mean is not 50.

OR: $P(\bar{X} < 47.8) = \Phi\left(\frac{47.8 - 50}{12/\sqrt{100}}\right) = \Phi(-1.8333) = 1 - 0.9666 = 0.0334$
For a two-tail test at the 1% significance level, compare the probability with 0.005.
0.0334 > 0.005 so accept H0: there is not sufficient evidence to suggest that the true population mean is not 50.
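The left-tail test of solution 2 and the two-tail test of solution 3 differ only in how the tail probability is formed. The sketch below illustrates this with a small helper; `z_test` and its `tail` argument are hypothetical names introduced here, not part of the original solutions. For the two-tail case it doubles the smaller tail probability and compares with 0.01, which is equivalent to comparing the single tail with 0.005 as the solution does.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, tail):
    """Return (z, p_value) for a one-sample z-test with known sigma.

    tail is "left", "right" or "two" (a convention assumed for this sketch).
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    lower = NormalDist().cdf(z)  # P(Z < z) under H0
    if tail == "left":
        p_value = lower
    elif tail == "right":
        p_value = 1 - lower
    elif tail == "two":
        p_value = 2 * min(lower, 1 - lower)
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return z, p_value


# Solution 2: has the mean time decreased? Left tail, 0.1% level.
z2, p2 = z_test(x_bar=31.05, mu0=34, sigma=2.6, n=8, tail="left")
print(f"Q2: z = {z2:.3f}, p = {p2:.4f}, reject H0: {p2 < 0.001}")

# Solution 3: is the mean different from 50? Two tails, 1% level.
z3, p3 = z_test(x_bar=47.8, mu0=50, sigma=12, n=100, tail="two")
print(f"Q3: z = {z3:.4f}, p = {p3:.4f}, reject H0: {p3 < 0.01}")
```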
4. (i) H0: there is no association between success of transmission and type of destination
H1: there is some association between success of transmission and type of destination

(ii) Observed frequencies:

                                                Type of destination
Transmission                              Commercial   Government   University   Marginal
                                          user         department                totals
Successful at first attempt               100          57           23           180
Successful after more than one attempt    21           14           13           48
Not successful at all                     31           21           20           72
Marginal totals                           152          92           56           300

Expected frequencies:

                                                Type of destination
Transmission                              Commercial   Government   University   Marginal
                                          user         department                totals
Successful at first attempt               91.2         55.2         33.6         180
Successful after more than one attempt    24.32        14.72        8.96         48
Not successful at all                     36.48        22.08        13.44        72
Marginal totals                           152          92           56           300

$$\chi^2 = \frac{(100 - 91.2)^2}{91.2} + \frac{(57 - 55.2)^2}{55.2} + \frac{(23 - 33.6)^2}{33.6} + \frac{(21 - 24.32)^2}{24.32} + \frac{(14 - 14.72)^2}{14.72} + \frac{(13 - 8.96)^2}{8.96} + \frac{(31 - 36.48)^2}{36.48} + \frac{(21 - 22.08)^2}{22.08} + \frac{(20 - 13.44)^2}{13.44} = 10.64$$

For a 3 × 3 table there are 2 × 2 = 4 degrees of freedom.
Critical value at 10% significance level with ν = 4 is 7.779
10.64 > 7.779 so reject H0: there is evidence to suggest an association between the success of the transmission and the type of destination.

(iii) When the destination is a university, far fewer transmissions are successful at the first attempt than would be expected, and more are successful after more than one attempt or unsuccessful than would be expected. For commercial users, more are successful at the first attempt than would be expected, and fewer are successful after more than one attempt or unsuccessful than would be expected. For Government departments the results are quite close to the expected frequencies.
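The expected frequencies and the χ² statistic in solution 4 follow mechanically from the observed table: each expected frequency is (row total × column total) / grand total. A minimal sketch in plain Python is given below. The critical value 7.779 is copied from the solution (tables, ν = 4, 10% level) rather than computed, and the variable names are illustrative.

```python
# Observed frequencies from question 4.
# Rows: first attempt, more than one attempt, not successful.
# Columns: commercial user, government department, university.
observed = [
    [100, 57, 23],
    [21, 14, 13],
    [31, 21, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell under H0 (no association).
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

dof = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 7.779  # from tables, as quoted in the solution
reject_h0 = chi2 > critical_value

for exp_row in expected:
    print([round(e, 2) for e in exp_row])
print(f"chi2 = {chi2:.2f}, dof = {dof}, reject H0: {reject_h0}")
```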
5. (i) H0: there is no association between age group and level of interest
H1: there is some association between age group and level of interest

Observed frequencies

                    Level of interest
Age                 Great     Little    Marginal totals
< 30                49        216       265
≥ 30                145       435       580
Marginal totals     194       651       845

Expected frequencies

                    Level of interest
Age                 Great     Little    Marginal totals
< 30                60.84     204.16    265
≥ 30                133.16    446.84    580
Marginal totals     194       651       845

$$\chi^2 = \frac{(49 - 60.84)^2}{60.84} + \frac{(216 - 204.16)^2}{204.16} + \frac{(145 - 133.16)^2}{133.16} + \frac{(435 - 446.84)^2}{446.84} = 4.357$$

For a 2 × 2 table there is 1 × 1 = 1 degree of freedom.
Critical value for 5% significance level with ν = 1 is 3.841
4.357 > 3.841 so reject H0: there is evidence to suggest an association between age group and level of interest.
Under 30s seem to have less interest than would be expected, while the 30 and over age group seem to have more interest than would be expected.

(ii)
                    Level of interest
Locally elected     Great     Little    Marginal totals
mayor
Yes                 118       314       432
No                  49        216       265
Marginal totals     167       530       697

(iii) This is not a random sample of 697 people classified over these four cells.
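The 2 × 2 table in part (ii) is assembled from the under-30 rows of the two separate surveys, which is also why part (iii) notes that it is not a single random sample of 697 people. The short sketch below rebuilds the table and its margins; the dictionary layout and names are assumptions made for illustration only.

```python
# Under-30 counts (great interest, little interest) from each survey.
under_30_with_mayor = (118, 314)     # towns with directly elected mayors
under_30_without_mayor = (49, 216)   # towns without directly elected mayors

# Rows of the combined table, keyed by "locally elected mayor?".
rows = {"Yes": under_30_with_mayor, "No": under_30_without_mayor}

row_totals = {label: sum(counts) for label, counts in rows.items()}
col_totals = tuple(sum(col) for col in zip(*rows.values()))
grand_total = sum(row_totals.values())

print("Mayor  Great  Little  Total")
for label, (great, little) in rows.items():
    print(f"{label:<6} {great:<6} {little:<7} {row_totals[label]}")
print(f"{'Total':<6} {col_totals[0]:<6} {col_totals[1]:<7} {grand_total}")
```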