II- Review of Chapter 8: Interval Estimation

Population: A population is the set of all elements of interest in a particular study (for that reason, it is sometimes called the universe).
Example: If one is interested in evaluating the average revenue of the Moroccan family, the population would be the set of all Moroccan households.

Parameter: A parameter is a summary measure describing a characteristic of the entire population. It is generally unknown and very difficult to assess precisely. Very often the population is so large that it is extremely difficult, if not impossible, to consider it in its entirety because of the lack of time, money, etc. It is then necessary to limit oneself to a subset of the entire population.

Sample: A sample is a subset of a population.
Example: A sample of 5000 households is selected from the population of Moroccan households to conduct the study.

Statistic: A statistic is a summary measure computed for a sample rather than for the entire population. It is generally used as an estimate of the corresponding population parameter. Since it is often difficult, if not impossible, to study the parameters of entire populations, one limits oneself to subsets of the population, known as samples, whose statistics provide approximations of the population parameters. How well these statistics approximate the true parameters is the subject of the current chapter.

III- Interval Estimation of Population Mean

III-1- The population standard deviation, σ, is KNOWN

Assume the standard deviation, σ, of the population is known. Given a sample and its corresponding mean \(\bar{X}\), we may state with (1 − α) confidence that the population mean µ is within

\[ \bar{X} \pm Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]

if X follows a normal distribution with mean µ and standard deviation σ. If X is not normally distributed, then the sample size, n, must be at least 30 (\(n \ge 30\)).
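The interval for a known σ can be sketched numerically. The Python snippet below is a minimal illustration and not part of the original notes: the function name `z_interval` is hypothetical, the sample values are illustrative, and the critical value is computed with the standard library's `statistics.NormalDist` instead of being read from a Z-table, so it is slightly more precise than a rounded table value such as 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds for the population mean when sigma is known.

    Assumes X is normally distributed or n >= 30 (hypothetical helper).
    """
    alpha = 1 - confidence
    # Critical value Z_{alpha/2}: P[Z <= Z_{alpha/2}] = 1 - alpha/2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Margin of error e = Z_{alpha/2} * sigma / sqrt(n)
    e = z * sigma / sqrt(n)
    return xbar - e, xbar + e


# Illustrative values: sample mean 24.80, sigma 5, n = 49, 95% confidence
lower, upper = z_interval(xbar=24.80, sigma=5, n=49, confidence=0.95)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```

Changing `confidence` to 0.90 or 0.99 shows how the interval narrows or widens for the same sample.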
In the interval \(\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}\), the critical value \(Z_{\alpha/2}\) corresponding to a level of confidence of (1 − α) is defined by

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 \]

and

\[ e = Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]

is called the sampling error or the margin of error.

Exercise 1: In an effort to estimate the mean amount spent per customer for dinner at a major Atlanta restaurant, data were collected for a sample of 49 customers. Assume a population standard deviation of 5. If the sample mean is 24.80, what is the 95% confidence interval for the population mean?

Solution:
Since the sample size is 49, which is ≥ 30, we are (1 − α) confident that

\[ \bar{X} - Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]

Let's first find \(Z_{\alpha/2}\). Since we use 95% confidence, α = 1 − 0.95 = 0.05, so

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The Z-table gives us \(Z_{\alpha/2} = 1.96\), so we are 95% confident that the population mean µ satisfies

\[ 24.80 - 1.96 \times \frac{5}{\sqrt{49}} \le \mu \le 24.80 + 1.96 \times \frac{5}{\sqrt{49}} \]

\[ 23.40 \le \mu \le 26.20 \]

We are 95% confident that the mean amount spent per customer for dinner, µ, is between 23.40 and 26.20.

Exercise 2: The National Quality Research Center at the University of Michigan provides a quarterly measure of consumer opinions about products and services (The Wall Street Journal, February 18, 2003). A survey of 10 restaurants in the Fast Food/Pizza group showed a sample mean customer satisfaction index of 71. Past data indicate that the population standard deviation of the index has been relatively stable with σ = 4.85.
a. What is the population? What is the parameter of interest? What is the statistic?
b. What assumption should the researcher be willing to make if a margin of error is desired?
c. Using 95% confidence, what is the margin of error?
d. What is the margin of error if 99% confidence is desired?

Solution:
b. Since the sample size is 10, which is less than 30, the researcher should assume that the customer satisfaction index, X, is normally distributed.
c. Since we assume the customer satisfaction index is normally distributed, the margin of error, e, is

\[ e = Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]

Let's first find \(Z_{\alpha/2}\). Since we use 95% confidence, α = 1 − 0.95 = 0.05, so

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The Z-table gives us \(Z_{\alpha/2} = 1.96\), so the margin of error is

\[ e = Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} = 1.96 \times \frac{4.85}{\sqrt{10}} = 3.01 \]

d. The margin of error has the same form, \(e = Z_{\alpha/2} \times \sigma/\sqrt{n}\). Since we use 99% confidence, α = 1 − 0.99 = 0.01, so

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.01/2 = 0.995 \]

The Z-table gives us \(Z_{\alpha/2} = 2.58\), so the margin of error is

\[ e = Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} = 2.58 \times \frac{4.85}{\sqrt{10}} = 3.96 \]

Exercise 3: A survey of small businesses with Web sites found that the average amount spent on a site was $11500 per year (Fortune, March 5, 2001). Given a sample of 60 businesses and a population standard deviation of σ = $4000,
a. What is the population? What is the random variable X? What is the parameter of interest? What is the statistic?
b. Using 95% confidence, what is the margin of error?
c. What would you recommend if the study required a margin of error of $500?

Solution:
a. The population is all small businesses with Web sites. The random variable X is the amount spent on a Web site by a small business. The parameter is the average amount spent on a Web site per year. The statistic is $11500, which is the average of a sample of 60 businesses.
b. Since the sample size is 60, which is ≥ 30, the margin of error at 95% confidence is

\[ e = Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]

Let's first find \(Z_{\alpha/2}\). Since we use 95% confidence, α = 1 − 0.95 = 0.05, so

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The Z-table gives us \(Z_{\alpha/2} = 1.96\), so

\[ e = 1.96 \times \frac{4000}{\sqrt{60}} = \$1012.14 \]

c. If the study requires e = $500, then

\[ Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} = 500 \]

\[ n = \left( \frac{Z_{\alpha/2} \times \sigma}{500} \right)^2 = \left( \frac{1.96 \times 4000}{500} \right)^2 = 245.86 \]

So we need a sample size of 246 (the result is rounded up to the next whole number).

Exercise 4: Vogue magazine reported that the mean annual household income of its readers is $120000 (Vogue, January 2008).
Assume this estimate of the mean annual household income is based on a sample of 80 households and that, based on past studies, the population standard deviation is known to be σ = $30000.
a. What is the population? What is the random variable X? What is the parameter of interest? What is the statistic?
b. Develop a 90% confidence interval estimate of the population mean.
c. Develop a 95% confidence interval estimate of the population mean.
d. Develop a 99% confidence interval estimate of the population mean.
e. Discuss what happens to the width of the confidence interval as the confidence level is increased. Does this result seem reasonable? Explain.

Solution:
a. The population is all readers of the magazine Vogue. The random variable X is the annual household income of a reader of the magazine Vogue. The parameter is the mean household income, µ, of all readers of the magazine Vogue. The statistic is $120000, which is the average of a sample of 80 households of readers of the magazine Vogue.
b. Since the sample size is 80, which is ≥ 30, we are (1 − α) confident that

\[ \bar{X} - Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]

Let's first find \(Z_{\alpha/2}\). Since we use 90% confidence, α = 1 − 0.9 = 0.1, so

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.1/2 = 0.95 \]

The Z-table gives us \(Z_{\alpha/2} = 1.64\), so we are 90% confident that the population mean µ satisfies

\[ 120000 - 1.64 \times \frac{30000}{\sqrt{80}} \le \mu \le 120000 + 1.64 \times \frac{30000}{\sqrt{80}} \]

\[ \$114499.27 \le \mu \le \$125500.73 \]

c. The sample size is again 80 ≥ 30, so the same interval formula applies. Let's first find \(Z_{\alpha/2}\). Since we use 95% confidence, α = 1 − 0.95 = 0.05, so

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The Z-table gives us \(Z_{\alpha/2} = 1.96\), so we are 95% confident that the population mean µ satisfies

\[ 120000 - 1.96 \times \frac{30000}{\sqrt{80}} \le \mu \le 120000 + 1.96 \times \frac{30000}{\sqrt{80}} \]

\[ \$113425.96 \le \mu \le \$126574.04 \]

d. The sample size is again 80 ≥ 30, so the same interval formula applies. Let's first find \(Z_{\alpha/2}\).
Since we use 99% confidence, α = 1 − 0.99 = 0.01, so

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.01/2 = 0.995 \]

The Z-table gives us \(Z_{\alpha/2} = 2.58\), so we are 99% confident that the population mean µ satisfies

\[ 120000 - 2.58 \times \frac{30000}{\sqrt{80}} \le \mu \le 120000 + 2.58 \times \frac{30000}{\sqrt{80}} \]

\[ \$111346.42 \le \mu \le \$128653.58 \]

e. The confidence interval gets wider as the confidence level is increased. This makes sense: with the same sample size, a higher confidence level requires a larger critical value, and therefore a larger margin of error.

III-2- The population standard deviation, σ, is UNKNOWN

The previous result giving the confidence interval estimate for the mean assumed that the standard deviation of the population is known. Very often, however, this assumption cannot be made. When this occurs, one needs to modify the way a confidence interval estimate is formed. This is the objective of this section.

To the unknown-σ problem, a straightforward and intuitive answer can be given: estimate σ using the point estimator S, the standard deviation of the sample, then apply the usual general formula of a confidence interval estimate:

Confidence Interval = \(\bar{X}\) ± (Critical Value) × (Std. error).

The critical value, though, comes in this case from a new probability distribution instead of the normal distribution: Student's t distribution.

Assume the standard deviation, σ, of the population is unknown. Given a sample and its corresponding mean \(\bar{X}\) and standard deviation S, we may state with (1 − α) confidence that the population mean µ is within

\[ \bar{X} \pm t_{\alpha/2} \times \frac{S}{\sqrt{n}} \]

In most applications, a sample size of n ≥ 30 is adequate when using the above interval estimate. However, one needs to consider its validity in the following cases:
- any n is acceptable if the population follows a normal distribution (X is normally distributed);
- n must be ≥ 15 if the population is not normally distributed but is roughly symmetric and without outliers;
- n must be ≥ 50 if the population's distribution is highly skewed or contains outliers.
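The interval \(\bar{X} \pm t_{\alpha/2} \times S/\sqrt{n}\) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the original notes: the function name `t_interval` is hypothetical, and because Python's standard library has no Student's t quantile function, the critical value is assumed to be read from a t-table and passed in as `t_crit`. The numbers used (mean 120000, S = 30000, n = 80, and 1.664 as the table value for 79 degrees of freedom at 90% confidence) are example inputs.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) bounds for the population mean when sigma is unknown.

    t_crit is the critical value t_{alpha/2} with n - 1 degrees of freedom,
    taken from a t-table (assumption: the caller looks it up).
    """
    # Margin of error e = t_{alpha/2} * S / sqrt(n)
    e = t_crit * s / sqrt(n)
    return xbar - e, xbar + e


# Example inputs: 90% confidence, df = 80 - 1 = 79, table value 1.664
lower, upper = t_interval(xbar=120000, s=30000, n=80, t_crit=1.664)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```

Passing the critical value explicitly keeps the sketch dependency-free; a statistics package that provides t quantiles could replace the table lookup.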
In the interval \(\bar{X} \pm t_{\alpha/2}\,S/\sqrt{n}\), the critical value \(t_{\alpha/2}\) of Student's t distribution, corresponding to a level of confidence of (1 − α) and df = n − 1 degrees of freedom, is defined by

\[ P\left[T \le t_{\alpha/2}\right] = 1 - \alpha/2 \]

and

\[ e = t_{\alpha/2} \times \frac{S}{\sqrt{n}} \]

is called the sampling error or the margin of error.

Remember: as n becomes larger and larger, the t distribution with n − 1 degrees of freedom gets closer and closer to the standard normal distribution N(0, 1).

Exercise 1: Redo the exercise in the previous section about a sample of 80 households of Vogue readers, but consider that we do not know the population's standard deviation and that the sample's standard deviation is S = $30000. Compare these results with the results found in the previous section (σ KNOWN).

Solution:
a. No need to answer it.
b. Since the sample size is 80, which is ≥ 50, we are (1 − α) confident that

\[ \bar{X} - t_{\alpha/2} \times \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2} \times \frac{S}{\sqrt{n}} \]

Let's first find \(t_{\alpha/2}\). Since we use 90% confidence, α = 1 − 0.9 = 0.1, so

\[ P\left[T \le t_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.1/2 = 0.95 \]

The t-table gives us \(t_{\alpha/2} = 1.664\) (we use df = n − 1 = 80 − 1 = 79). So we are 90% confident that the population mean µ satisfies

\[ 120000 - 1.664 \times \frac{30000}{\sqrt{80}} \le \mu \le 120000 + 1.664 \times \frac{30000}{\sqrt{80}} \]

\[ \$114418.77 \le \mu \le \$125581.23 \]

The interval estimate we found when we considered σ KNOWN was

\[ \$114499.27 \le \mu \le \$125500.73 \]

It is clear that the interval when σ is UNKNOWN is wider than when σ is KNOWN.

c. The sample size is again 80 ≥ 50, so the same interval formula applies. Let's first find \(t_{\alpha/2}\). Since we use 95% confidence, α = 1 − 0.95 = 0.05, so

\[ P\left[T \le t_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The t-table gives us \(t_{\alpha/2} = 1.99\) (we use df = n − 1 = 80 − 1 = 79). So we are 95% confident that the population mean µ satisfies

\[ 120000 - 1.99 \times \frac{30000}{\sqrt{80}} \le \mu \le 120000 + 1.99 \times \frac{30000}{\sqrt{80}} \]

\[ \$113325.34 \le \mu \le \$126674.66 \]

The interval estimate we found when we considered σ KNOWN was

\[ \$113425.96 \le \mu \le \$126574.04 \]

It is clear that the interval when σ is UNKNOWN is wider than when σ is KNOWN.

d. The sample size is again 80 ≥ 50, so the same interval formula applies. Let's first find \(t_{\alpha/2}\). Since we use 99% confidence, α = 1 − 0.99 = 0.01, so

\[ P\left[T \le t_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.01/2 = 0.995 \]

The t-table gives us \(t_{\alpha/2} = 2.64\) (we use df = n − 1 = 80 − 1 = 79). So we are 99% confident that the population mean µ satisfies

\[ 120000 - 2.64 \times \frac{30000}{\sqrt{80}} \le \mu \le 120000 + 2.64 \times \frac{30000}{\sqrt{80}} \]

\[ \$111145.17 \le \mu \le \$128854.83 \]

The interval estimate we found when we considered σ KNOWN was

\[ \$111346.42 \le \mu \le \$128653.58 \]

It is clear that the interval when σ is UNKNOWN is wider than when σ is KNOWN.

Exercise 2: Redo Exercise 1 for a sample size of 100. Compare these results with the results found in the previous section (σ KNOWN with sample size 80).

Solution:
a. No need to answer it.
b. Since the sample size is 100, which is ≥ 50, we are (1 − α) confident that

\[ \bar{X} - t_{\alpha/2} \times \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2} \times \frac{S}{\sqrt{n}} \]

Let's first find \(t_{\alpha/2}\). Since we use 90% confidence, α = 1 − 0.9 = 0.1, so

\[ P\left[T \le t_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.1/2 = 0.95 \]

The t-table gives us \(t_{\alpha/2} = 1.66\) (we use df = n − 1 = 100 − 1 = 99). So we are 90% confident that the population mean µ satisfies

\[ 120000 - 1.66 \times \frac{30000}{\sqrt{100}} \le \mu \le 120000 + 1.66 \times \frac{30000}{\sqrt{100}} \]

\[ \$115020 \le \mu \le \$124980 \]

The interval estimate we found when we solved with σ KNOWN and n = 80 was

\[ \$114499.27 \le \mu \le \$125500.73 \]

It is clear that the interval when σ is UNKNOWN and n = 100 is narrower than when σ is KNOWN and n = 80. Even if we do not have the population's standard deviation, we can obtain a narrower interval by increasing the sample size.

c. The sample size is again 100 ≥ 50, so the same interval formula applies. Let's first find \(t_{\alpha/2}\). Since we use 95% confidence, α = 1 − 0.95 = 0.05, so

\[ P\left[T \le t_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The t-table gives us \(t_{\alpha/2} = 1.984\) (we use df = n − 1 = 100 − 1 = 99).
So we are 95% confident that the population mean µ satisfies

\[ 120000 - 1.984 \times \frac{30000}{\sqrt{100}} \le \mu \le 120000 + 1.984 \times \frac{30000}{\sqrt{100}} \]

\[ \$114048 \le \mu \le \$125952 \]

The interval estimate we found when we solved with σ KNOWN and n = 80 was

\[ \$113425.96 \le \mu \le \$126574.04 \]

It is clear that the interval when σ is UNKNOWN and n = 100 is narrower than when σ is KNOWN and n = 80. Even if we do not have the population's standard deviation, we can obtain a narrower interval by increasing the sample size.

d. The sample size is again 100 ≥ 50, so the same interval formula applies. Let's first find \(t_{\alpha/2}\). Since we use 99% confidence, α = 1 − 0.99 = 0.01, so

\[ P\left[T \le t_{\alpha/2}\right] = 1 - \alpha/2 = 1 - 0.01/2 = 0.995 \]

The t-table gives us \(t_{\alpha/2} = 2.626\) (we use df = n − 1 = 100 − 1 = 99). So we are 99% confident that the population mean µ satisfies

\[ 120000 - 2.626 \times \frac{30000}{\sqrt{100}} \le \mu \le 120000 + 2.626 \times \frac{30000}{\sqrt{100}} \]

\[ \$112122 \le \mu \le \$127878 \]

The interval estimate we found when we solved with σ KNOWN and n = 80 was

\[ \$111346.42 \le \mu \le \$128653.58 \]

It is clear that the interval when σ is UNKNOWN and n = 100 is narrower than when σ is KNOWN and n = 80. Even if we do not have the population's standard deviation, we can obtain a narrower interval by increasing the sample size.

Exercise 3 (Excel): A study designed to estimate the mean credit card debt for the population of U.S. households used a sample of 70 households. The data collected is in the sheet named "SigmaUnknown-Ex3" in the Excel file "Interval Estimation". Develop a 95% confidence interval for the mean credit card debt for the population of U.S. households.

Solution:
Using Excel,

\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \$9312 \]

\[ S = \sqrt{\frac{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{70} \left(X_i - 9312\right)^2}{69}} = \$4007 \]

Since we use 95% confidence, α = 1 − 0.95 = 0.05 and \(t_{\alpha/2} = t_{0.025}\). We do not know the population's standard deviation, σ, but since the sample size is 70, which is ≥ 50, we are 95% confident that the population mean µ satisfies

\[ \bar{X} - t_{0.025} \times \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{0.025} \times \frac{S}{\sqrt{n}} \]

where t has n − 1 = 70 − 1 = 69 degrees of freedom.
Since

\[ P\left[T \le t_{0.025}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

the t-table gives us \(t_{0.025} = 1.99\), so we are 95% confident that the mean credit card debt, µ, of U.S. households satisfies

\[ 9312 - 1.99 \times \frac{4007}{\sqrt{70}} \le \mu \le 9312 + 1.99 \times \frac{4007}{\sqrt{70}} \]

\[ \$8358.93 \le \mu \le \$10265.07 \]

Exercise 4 (Excel): A study designed to estimate the average price per square meter of apartments in a certain neighborhood in Casablanca used a sample of 100 apartments. The data collected is in the sheet named "SigmaUnknown-Ex4" in the Excel file "Interval Estimation". Develop a 95% confidence interval for the mean price per square meter of apartments in that neighborhood.

IV- Population Proportion

Given a sample of size n and the corresponding sample proportion \(\bar{p}\), we may state with (1 − α) confidence that the population proportion, p, is within

\[ \bar{p} \pm Z_{\alpha/2} \times \sqrt{\frac{\bar{p}\,(1-\bar{p})}{n}} \]

if \(n\bar{p} \ge 5\) and \(n(1-\bar{p}) \ge 5\), where \(Z_{\alpha/2}\) is the critical value corresponding to the (1 − α) level of confidence.

Example: A random sample of 400 voters showed that 54 will vote for Candidate A. Set up a 95% confidence interval estimate for the proportion of voters who will vote for Candidate A.

The sample proportion is

\[ \bar{p} = \frac{54}{400} = 13.5\% \]

so \(n\bar{p} = 400 \times 0.135 = 54 \ge 5\) and \(n(1-\bar{p}) = 400 \times (1 - 0.135) = 346 \ge 5\). We are 95% confident that the population proportion, p, satisfies

\[ \bar{p} - Z_{\alpha/2} \times \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \le p \le \bar{p} + Z_{\alpha/2} \times \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]

\[ 0.135 - 1.96 \times \sqrt{\frac{0.135 \times (1 - 0.135)}{400}} \le p \le 0.135 + 1.96 \times \sqrt{\frac{0.135 \times (1 - 0.135)}{400}} \]

\[ 0.1015 \le p \le 0.1685 \]

Exercise 2 (Excel): A study about the car colors Moroccans prefer used a sample of 120 cars. The data collected is in the sheet named "Proportion-Ex2" in the Excel file "Interval Estimation". The study considered only four colors: black, dark blue, red, and white.
a. Develop a 95% confidence interval for the proportion of black cars in Morocco.
b. Develop a 95% confidence interval for the proportion of dark blue cars in Morocco.

Solution:
a. The sample proportion is

\[ \bar{p} = \frac{35}{120} = 29.17\% \]

so \(n\bar{p} = 35 \ge 5\) and \(n(1-\bar{p}) = 85 \ge 5\).
We are 95% confident that the population proportion, p, satisfies

\[ \bar{p} - Z_{\alpha/2} \times \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \le p \le \bar{p} + Z_{\alpha/2} \times \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]

\[ 0.2917 - 1.96 \times \sqrt{\frac{0.2917 \times (1 - 0.2917)}{120}} \le p \le 0.2917 + 1.96 \times \sqrt{\frac{0.2917 \times (1 - 0.2917)}{120}} \]

\[ 0.2104 \le p \le 0.3730 \]

b. The sample proportion is

\[ \bar{p} = \frac{29}{120} = 24.17\% \]

so \(n\bar{p} = 29 \ge 5\) and \(n(1-\bar{p}) = 91 \ge 5\). We are 95% confident that the population proportion, p, satisfies

\[ \bar{p} - Z_{\alpha/2} \times \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \le p \le \bar{p} + Z_{\alpha/2} \times \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]

\[ 0.2417 - 1.96 \times \sqrt{\frac{0.2417 \times (1 - 0.2417)}{120}} \le p \le 0.2417 + 1.96 \times \sqrt{\frac{0.2417 \times (1 - 0.2417)}{120}} \]

\[ 0.1651 \le p \le 0.3183 \]

==============================================================

Chapter 09: Hypothesis Testing

I. Introduction

In the previous chapter, we saw how to develop confidence interval estimates for the mean and the proportion. In many practical situations, decision makers are more interested in deciding whether to reject an assessment about a population parameter than in finding point or interval estimates of that parameter. Statistical hypothesis testing provides an analytical and rigorous methodology for such tests. Because the decision is based on one or a few samples, it is always subject to some amount of statistical error. However, the advantage of hypothesis testing is that it gives managers and decision makers control over the errors they may make when testing hypotheses or claims, as we shall see further in the chapter.

Definition: A null hypothesis is a claim or a statement about a parameter (e.g., the mean, the proportion, the variance, etc.) of the population, which will be tested. Testing the null hypothesis leads to determining whether the statement about the population parameter should or should not be rejected. The null hypothesis is always represented by \(H_0\).
By convention, we have the following:
(a) A null hypothesis should always contain an equality (e.g., µ = 4.65) or a non-strict inequality (e.g., µ ≤ 4.65 or µ ≥ 4.65). A test like µ = 4.65 is called a two-tail test, while a test like µ ≤ 4.65 or µ ≥ 4.65 is called a one-tail test.
(b) It can only be rejected, but never confirmed or accepted.
(c) It is always accompanied by its opposite, called the alternative hypothesis.

Definition: The alternative hypothesis, represented by \(H_1\), is the opposite claim of the null hypothesis. Therefore, the null hypothesis is rejected if and only if the alternative hypothesis is accepted.

Example 1: A new machine in a manufacturing plant is designed to produce an hourly average of 90 units. This is the status quo, or the current state. Now, let us suppose the engineering department comes up with a new design that is supposed to increase the output of the machine. The objective is then to prove that the new design really worked as expected and increased the output. In this case the hypotheses are set as follows:

\[ \begin{cases} H_0: \mu \le 90 \\ H_1: \mu > 90 \end{cases} \]

As you can see, since we can only reject or fail to reject the null hypothesis (\(H_0: \mu \le 90\)), we end up proving or not proving the alternative (\(H_1: \mu > 90\)), which is our objective.

Example 2: Let us now present a different situation in which the objective is not to try to prove a claim, but rather to reject it. Suppose a consumer defense group receives complaints concerning a famous brand of shoes, which mention that the actual sizes of the shoes do not conform to the sizes announced. Size 11, in particular, is mentioned many times as a source of errors. The consumer group thus decides to investigate this size. In this case, they are primarily interested in challenging the company's claim that the average size is 11.
Therefore, their hypotheses would be:

\[ \begin{cases} H_0: \mu = 11 \\ H_1: \mu \ne 11 \end{cases} \]

The hypothesis testing procedure leads to one of two conclusions: the rejection of \(H_0\), or the conclusion that \(H_0\) cannot be rejected (again, do not forget that \(H_0\) cannot be accepted!). Of course, \(H_0\) may in reality be true or false. Hence, there are two types of errors that can be made:
Type I error: rejecting \(H_0\) while it is in fact true; or
Type II error: failing to reject \(H_0\) while it is in fact false.

II. Hypothesis Testing for the Mean, σ KNOWN

Let us start with hypotheses that have equalities, that is, with what we call two-tail tests. Using Example 2 above, we have a null hypothesis \(H_0: \mu = 11\). Intuitively, if we take a sample (with a large enough size), compute its mean \(\bar{X}\), and find a value that is very far from our claim (11), then we suspect that the null hypothesis could be rejected.

More rigorously, assuming µ = 11, we know from the Central Limit Theorem that \(\bar{X}\) is approximately normally distributed with mean µ and standard deviation \(\sigma/\sqrt{n}\) (see Figure 1 below). Let us define the area of the distribution located in the two tails, so that the probability of each tail is equal to α/2. There is then only a probability of α that \(\bar{X}\) falls in this area while µ = 11. Consequently, we can say with (1 − α) confidence that µ ≠ 11 if we find a value of \(\bar{X}\) in this area. Hence, this two-tail area is called the rejection region and is used, depending on the position of \(\bar{X}\) (within or outside the rejection region), to reject or not reject the null hypothesis.

How to determine the rejection region? We generally prefer to work with the standardized normal distribution instead of the distribution of \(\bar{X}\). The boundaries \(-Z_{\alpha/2}\) and \(Z_{\alpha/2}\) are obtained from the standardized normal table or from a statistical software package, using the fact that

\[ P\left[Z \le Z_{\alpha/2}\right] = 1 - \alpha/2 \]
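As an illustration of obtaining these boundaries from software rather than from the table, here is a minimal Python sketch. It is not part of the original notes: the helper name `z_critical` is hypothetical, and it relies on `statistics.NormalDist` from the Python standard library (available from Python 3.8).

```python
from statistics import NormalDist


def z_critical(alpha):
    """Return Z_{alpha/2}, the value with P[Z <= Z_{alpha/2}] = 1 - alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Boundaries of the two-tail rejection region for common significance levels
for alpha in (0.10, 0.05, 0.01):
    z = z_critical(alpha)
    print(f"alpha = {alpha:.2f}: boundaries are {-z:.3f} and {z:.3f}")
```

The software values are more precise than a printed table; rounded table readings such as 1.64 or 2.58 correspond to the same boundaries for α = 0.10 and α = 0.01.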
Since the standardized normal distribution is used, we need to transform \(\bar{X}\) accordingly by computing the Z-test statistic,

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]

The final step in our testing procedure is to determine whether the test statistic is inside the rejection region or not, by comparing Z to \(-Z_{\alpha/2}\) and \(Z_{\alpha/2}\): if \(Z \le -Z_{\alpha/2}\) or \(Z \ge Z_{\alpha/2}\), then we are inside the rejection region and the null hypothesis is rejected; otherwise, we conclude that there is not enough statistical evidence to reject \(H_0\).

For our Example 2, suppose a sample of 100 pairs of shoes is selected and gives a sample mean of 11.3. Assume also that the standard deviation announced by the company is 0.55. Let us perform the test of hypothesis \(H_0\) with a significance level α = 0.05.

Let us first compute the Z-test statistic. We have µ = 11, \(\bar{X}\) = 11.3, σ = 0.55, and n = 100, so

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{11.3 - 11}{0.55/\sqrt{100}} = 5.45 \]

Let's find \(Z_{\alpha/2}\). Since we use a significance level α = 0.05,

\[ P\left[Z \le Z_{0.025}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The Z-table gives us \(Z_{0.025} = 1.96\). Since \(Z \ge Z_{\alpha/2}\) (5.45 ≥ 1.96), our sample is located in the rejection region, so we reject \(H_0\) at the 5% significance level.

Exercise 1: ATMs must be stocked with enough cash to satisfy customers making withdrawals over an entire weekend. But if too much cash is unnecessarily kept in the ATMs, the bank is forgoing the opportunity of investing the money and earning interest. Assume that at a particular branch the population mean amount of money withdrawn from ATMs per customer transaction over the weekend has traditionally been $160, with a population standard deviation of $30. The branch managers are investigating the possibility that the population mean amount of money withdrawn, µ, has moved away from $160.
a. If a random sample of 36 customer transactions indicates that the sample mean withdrawal amount is $172, is there evidence to believe that the population mean is no longer $160?
(Hint: first formulate your null and alternative hypotheses, then conduct the test at the 5% significance level.)
b. How would your results change if the significance level is 1%?
c. How would your results change if the population standard deviation is $24?

Solution:
a. The hypotheses are

\[ \begin{cases} H_0: \mu = \$160 \\ H_1: \mu \ne \$160 \end{cases} \]

Let's first find \(Z_{\alpha/2}\). Since we use a significance level α = 0.05,

\[ P\left[Z \le Z_{0.025}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The Z-table gives us \(Z_{0.025} = 1.96\). The test statistic is

\[ Z = \frac{172 - 160}{30/\sqrt{36}} = 2.4 \]

Since \(Z \ge Z_{\alpha/2}\) (2.4 ≥ 1.96), our sample is located in the rejection region, so we reject \(H_0\) at the 5% significance level.

b. Let's first find \(Z_{\alpha/2}\). Since we use a significance level α = 0.01,

\[ P\left[Z \le Z_{0.005}\right] = 1 - \alpha/2 = 1 - 0.01/2 = 0.995 \]

The Z-table gives us \(Z_{0.005} = 2.58\). The test statistic is unchanged:

\[ Z = \frac{172 - 160}{30/\sqrt{36}} = 2.4 \]

Since Z is between \(-Z_{\alpha/2}\) and \(Z_{\alpha/2}\) (−2.58 ≤ 2.4 ≤ 2.58), we do not reject \(H_0\). There is not enough evidence to conclude that the mean amount of cash withdrawn per customer from the ATM is not equal to $160.

c. With σ = $24, the test statistic becomes

\[ Z = \frac{172 - 160}{24/\sqrt{36}} = 3 \]

For α = 0.05:

\[ P\left[Z \le Z_{0.025}\right] = 1 - \alpha/2 = 1 - 0.05/2 = 0.975 \]

The Z-table gives us \(Z_{0.025} = 1.96\). Since \(Z \ge Z_{\alpha/2}\) (3 ≥ 1.96), our sample is located in the rejection region, so we reject \(H_0\) at the 5% significance level.

For α = 0.01:

\[ P\left[Z \le Z_{0.005}\right] = 1 - \alpha/2 = 1 - 0.01/2 = 0.995 \]

The Z-table gives us \(Z_{0.005} = 2.58\). Since \(Z \ge Z_{\alpha/2}\) (3 ≥ 2.58), our sample is located in the rejection region, so we reject \(H_0\) at the 1% significance level.
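The two-tail testing procedure used in this exercise can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the original notes: the function name `two_tail_z_test` is hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` instead of the Z-table, so it differs slightly from the rounded table values (for instance 2.576 instead of 2.58) without changing any of the decisions above.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(xbar, mu0, sigma, n, alpha):
    """Two-tail Z test of H0: mu = mu0 when sigma is known.

    Returns (z, z_crit, reject), where reject is True when the test
    statistic falls in the rejection region |Z| >= Z_{alpha/2}.
    """
    z = (xbar - mu0) / (sigma / sqrt(n))          # Z-test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # critical value Z_{alpha/2}
    return z, z_crit, abs(z) >= z_crit


# ATM exercise: sample mean 172, claimed mean 160, n = 36
for sigma in (30, 24):
    for alpha in (0.05, 0.01):
        z, z_crit, reject = two_tail_z_test(172, 160, sigma, 36, alpha)
        decision = "reject H0" if reject else "do not reject H0"
        print(f"sigma={sigma}, alpha={alpha}: Z={z:.2f}, "
              f"critical={z_crit:.2f}, {decision}")
```

Comparing `abs(z)` with a single critical value is equivalent to checking both tails separately, because the standard normal distribution is symmetric about zero.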