The set of values of the test statistic which lead to rejection of the null hypothesis.
The probability of getting an outcome at least as extreme as that observed, if the null hypothesis is true.
Null hypothesis, symbol H0
Test statistic
Alternative hypothesis, symbol H1
A value which gives a characteristic of the population. Often symbolised by a Greek letter.
A test which looks for evidence that the parameter is greater than (or less than) a particular value.
Critical value
Critical region
A single value calculated from the sample. It is used to make a decision.
The belief you start with. You will only stop believing this if there is enough evidence.
Significance level
The probability that the null hypothesis is rejected, even though it is true.
Parameter
The value which you compare the test statistic with to decide whether to reject the null hypothesis.
This is what you are looking for evidence of.
p value
1 tail test

Null hypothesis, symbol H0
    The belief you start with. You will only stop believing this if there is enough evidence.
Alternative hypothesis, symbol H1
    This is what you are looking for evidence of.
p value
    The probability of getting an outcome at least as extreme as that observed, if the null hypothesis is true.
Significance level
    The probability that the null hypothesis is rejected, even though it is true.
Test statistic
    A single value calculated from the sample. It is used to make a decision.
Parameter
    A value which gives a characteristic of the population. Often symbolised by a Greek letter.
Critical value
    The value which you compare the test statistic with to decide whether to reject the null hypothesis.
Critical region
    The set of values of the test statistic which lead to rejection of the null hypothesis.
1 tail test
    A test which looks for evidence that the parameter is greater than (or less than) a particular value.

3 Over a long period of time, 20% of all bowls made by a particular manufacturer are imperfect and cannot be sold.
The manufacturer introduces a new process for producing bowls. To test whether there has been an improvement, each of a random sample of 20 bowls made by the new process is examined. From this sample, 2 bowls are found to be imperfect.

(ii) Show that this does not provide evidence, at the 5% level of significance, of a reduction in the proportion of imperfect bowls. You should show your hypotheses and calculations clearly. [6]

(MEI S1 Jan 2006 (part))

State the null hypothesis
    H0: p = 0.2
State the alternative hypothesis
    H1: p < 0.2
Say what p stands for
    p is the proportion of imperfect bowls produced
Decide what the distribution seen in the sample would be if the null hypothesis is true
    X ∼ B(20, 0.2) where X is the number of imperfect bowls in the sample
Decide whether large or small values of X (or both) would lead to rejection of the null hypothesis
    Small values of X would lead to rejection of H0
Find the probability of the observed value of X and the values more extreme
    P(X ≤ 2) = 0.2061
Compare the probability to the significance level
    20.61% > 5%
Decide whether to accept or reject the null hypothesis
    Accept H0
State the decision in a way that relates to the original situation
    There is insufficient evidence, at the 5% level of significance, of a reduction in the proportion of imperfect bowls

Mark scheme

3 (i)
    X ~ B(10, 0.2)
    P(X < 4) = P(X ≤ 3) = 0.8791 (M1 for X ≤ 3, A1)
    OR attempt to sum P(X = 0, 1, 2, 3) using X ~ B(10, 0.2) can score M1, A1
    Marks: 2
3 (ii)
    Let p = the probability that a bowl is imperfect (B1 Definition of p)
    H0: p = 0.2, H1: p < 0.2 (B1, B1)
    X ~ B(20, 0.2), P(X ≤ 2) = 0.2061 (B1 for 0.2061 seen)
    0.2061 > 5% (M1 for this comparison)
    Cannot reject H0 and so insufficient evidence to claim a reduction. (A1 dep for comment in context)
    OR using critical region method: CR is {0} B1, 2 not in CR M1, A1 as above
    Marks: 6
TOTAL 8

Examiners' report

In the hypothesis test, although many candidates gave correct hypotheses in terms of p, few defined p explicitly in words.
Centres should advise candidates that such a definition does attract credit. It was notable that from any given centre it was usually the case that either almost all candidates defined p or no candidates did so. The hypotheses themselves were usually correctly given but a number of candidates still continue to lose marks through poor notation. Candidates should be aware that H0 = 0.2 is not an acceptable notation, nor is H0 : P(X=0.2). The standard notation is H0: p = 0.2. As in previous sessions, many candidates used point probabilities, which effectively prevents any further credit being gained. Those who were successful in comparing the tail probability of 0.2061 with 0.05 often lost the final mark by not putting their conclusion in context. To simply state ‘Accept H0’ on its own is not sufficient to gain credit here. A conclusion along the lines of ‘There is insufficient evidence to claim that there has been a reduction’ is needed to gain the mark. An argument based on critical regions is of course perfectly acceptable, but candidates preferring to use such arguments need to be very precise. To simply state that the critical region is {0} without a probability justification is insufficient. 
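
The two warnings in the report, about point probabilities and about critical regions stated without a probability justification, can be checked numerically. What follows is a minimal sketch in Python and is not part of the original paper. The helper names binom_pmf and binom_cdf are illustrative assumptions, and only the standard library is used.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Values taken from the question: 20 bowls, 2 imperfect, H0: p = 0.2, 5% level.
n, p0, alpha, observed = 20, 0.2, 0.05, 2

point = binom_pmf(observed, n, p0)  # P(X = 2): a point probability, not the p value
tail = binom_cdf(observed, n, p0)   # P(X <= 2): the lower-tail probability the test needs

# Lower-tail critical region: every k whose cumulative probability is at most alpha.
critical_region = [k for k in range(n + 1) if binom_cdf(k, n, p0) <= alpha]

print(f"P(X = 2)  = {point:.4f}")
print(f"P(X <= 2) = {tail:.4f}")
print(f"P(X <= 0) = {binom_cdf(0, n, p0):.4f}, P(X <= 1) = {binom_cdf(1, n, p0):.4f}")
print(f"Critical region: {critical_region}")
print("Reject H0" if tail <= alpha else "Do not reject H0")
```

The tail probability matches the 0.2061 in the mark scheme. The pair P(X ≤ 0) and P(X ≤ 1) is the kind of probability justification the report asks for when the critical region is quoted as {0}: the first is below 5% and the second is above it.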
Accept H0
p is the proportion of imperfect bowls produced
State the null hypothesis
Decide whether large or small values of X (or both) would lead to rejection of the null hypothesis
State the alternative hypothesis
P(X ≤ 2) = 0.2061
There is insufficient evidence, at the 5% level of significance, of a reduction in the proportion of imperfect bowls
Compare the probability to the significance level
Decide whether to accept or reject the null hypothesis
Small values of X would lead to rejection of H0
H0: p = 0.2
Decide what the distribution seen in the sample would be if the null hypothesis is true
X ∼ B(20, 0.2) where X is the number of imperfect bowls in the sample
Find the probability of the observed value of X and the values more extreme
20.61% > 5%
H1: p < 0.2
Say what p stands for
State the decision in a way that relates to the original situation

Dream Number

In the National Lottery Dream Number game seven digits are drawn at random. In the 17 draws in November and December 2007, the first digit to be drawn was 2 on four occasions. If the draw is truly random, the probability of the first digit being 2 should be 1/10. Is there evidence, at the 5% level of significance, that the draw is not random?

Three solutions are given below; each of them is incorrect or incomplete (or both). What mistakes have been made in each solution? Can you produce a correct and complete solution?

4/17 = 23.5%. The probability should be 1/10 = 10%. There is more than 5% difference between these so there is evidence that the draw is not random.

H0: p = 0.1
H1: p ≠ 0.1
X ~ B(7, 0.1)
P(X ≥ 4) = 1 − P(X ≤ 3) = 1 − 0.9973 = 0.0027 = 0.27%

X ~ B(17, 0.1)
P(X ≥ 4) = 1 − P(X ≤ 4) = 1 − 0.9779 = 0.0221 = 2.21%
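
When you write your own solution, one way to check an upper-tail probability is to sum the binomial probabilities directly instead of reading them from cumulative tables. The sketch below is an illustration only and is not part of the original worksheet. The function name upper_tail is an assumption, and the values n = 17, p = 0.1 and the observed count 4 are taken from the question. Deciding what to compare the result with, and how to word the conclusion, is left to you.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ B(n, p), summed directly over k, k+1, ..., n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# From the question: 17 draws, first digit was 2 on four occasions, p = 1/10 under H0.
n_draws, p_null, observed = 17, 0.1, 4

p_upper = upper_tail(observed, n_draws, p_null)
print(f"P(X >= {observed}) = {p_upper:.4f}")
```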