Chapter 7 – Inferences Concerning a Mean

Defn: Inferential statistics is the branch of statistics concerned with inferring the characteristics of populations (i.e., parameter values) from the information contained in sample data. There are two branches of statistical inference: 1) estimation of parameters, and 2) testing hypotheses about the values of parameters. We consider estimation first.

Defn: A point estimate of a parameter θ is a specific numerical value θ̂ of a statistic Θ̂, computed from the data obtained in a sample. The statistic Θ̂ used to provide a point estimate of a parameter is called an estimator.

Note: Θ̂ is a function of X1, X2, ..., Xn, the elements of a random sample, and is therefore a random variable with its own sampling distribution.

Note: Nothing in the definition says anything about the “goodness” of the estimate. Any statistic is an estimate of a parameter. For a particular parameter θ, some statistics provide better estimates than others, so we examine the statistical properties of each estimator to decide which one gives the best estimate.

Defn: The point estimator Θ̂ is said to be an unbiased estimator of the parameter θ if E[Θ̂] = θ. If the estimator is not unbiased, then the difference B = E[Θ̂] − θ is called the bias of the estimator Θ̂.

Example: Assume that we have a random sample X1, X2, ..., Xn from a distribution with unknown mean μ. We want to find an unbiased estimator of μ. From the linearity property of expectation,

E[X̄] = E[(1/n) Σ Xi] = (1/n) Σ E[Xi] = (1/n)(nμ) = μ,

where the sums run over i = 1, ..., n. Hence, for any distribution, the sample mean is an unbiased estimator of the distribution mean.

Sometimes there are several unbiased estimators of a given parameter. For example, each individual member of the sample is also an unbiased estimator of the distribution mean. We decide which estimator provides the best estimate by comparing the sampling distributions of the estimators. Suppose that we have selected a random sample X1, X2, ..., Xn from a distribution characterized by an unknown parameter θ, and that we have two statistics Θ̂1 and Θ̂2, both of which are unbiased estimators of θ. Which estimator should we use? Since we want our particular estimate to be close to the true value of θ, we prefer the estimator whose sampling distribution has the smaller variance.

Defn: In the class of all unbiased estimators of a parameter θ, the estimator whose sampling distribution has the smallest variance is called the minimum variance unbiased estimator (MVUE) of θ.

Defn: The standard error of an estimator Θ̂ of a parameter θ is the square root of the variance of the sampling distribution of Θ̂: S.E.(Θ̂) = √V(Θ̂).

Example: Estimating the population mean from sample data.
Parameter: population mean, μ
Data: a random sample, X1, X2, ..., Xn
Unbiased and efficient estimator: X̄
Estimator of the standard error: S/√n

In any given estimation situation we have no way of knowing how close the point estimate is to the true value of the parameter, since the true value is unknown. We can be virtually certain, however, that the estimated value is not exactly equal to the true value. We want to be able to say how good our estimate is, so we extend the idea of a point estimate to the following type of estimation.
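Before turning to interval estimation (defined next), the point-estimate and standard-error calculations can be illustrated with a short sketch. The code below is in Python rather than Excel, which these notes use; the six sample values are purely hypothetical, and the code simply computes X̄ and S/√n.

# Minimal illustrative sketch (Python; the notes themselves use Excel).
# Point estimate of a population mean and its estimated standard error.
# The data values here are hypothetical, used only for illustration.
import math
import statistics

sample = [8.23, 8.31, 8.42, 8.29, 8.19, 8.24]   # hypothetical measurements

x_bar = statistics.mean(sample)          # X-bar: unbiased point estimate of mu
s = statistics.stdev(sample)             # S: sample standard deviation (n - 1 divisor)
se = s / math.sqrt(len(sample))          # estimated standard error, S / sqrt(n)

print(f"point estimate of mu: {x_bar:.4f}")
print(f"estimated standard error: {se:.4f}")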
Defn: A confidence interval estimate of a parameter is an interval constructed from a point estimate, together with a percentage that specifies how confident we are that the true value of the parameter lies in the interval. This percentage is called the confidence level, or confidence coefficient.

The general procedure for obtaining a confidence interval estimate for a parameter θ is as follows:
1) Choose the level of confidence, 1 − α (usually 90%, 95%, or 99%).
2) Find statistics L and U such that P(L ≤ θ ≤ U) = 1 − α.

Interpreting a Confidence Interval: We say that we are (1 − α)100% confident that the true value of the parameter lies in the interval. This means that the interval was obtained by a method such that (1 − α)100% of all intervals so obtained actually contain the true parameter value.

Confidence interval for μ: We choose our confidence level to be 1 − α. Then we can write the statement

P(−t_{n−1, α/2} ≤ (X̄ − μ)/(S/√n) ≤ t_{n−1, α/2}) = 1 − α.

Rearranging, we obtain

P(X̄ − t_{n−1, α/2}·S/√n ≤ μ ≤ X̄ + t_{n−1, α/2}·S/√n) = 1 − α.

Then X̄ ± t_{n−1, α/2}·S/√n is a (1 − α)100% confidence interval for μ.

Example 1: A machine produces metal rods used in an automobile suspension system. A random sample of 12 rods is selected, and the diameter of each is measured, resulting in the following data (measurements in millimeters):

8.23 8.31 8.42 8.29 8.19 8.24
8.19 8.29 8.30 8.14 8.32 8.40

We want a 95% confidence interval estimate for μ, the mean diameter of all metal rods produced by the machine. We first calculate the sample mean and sample standard deviation using the Descriptive Statistics function of Excel, and then find the confidence interval using the CONFIDENCE.T function of Excel. Caution: This Excel function gives the margin of error only, given the confidence level, sample standard deviation, and sample size. We must then construct the confidence interval as the previously calculated sample mean plus or minus this margin of error.

Example 2: Corrosion of reinforcing steel is a serious problem in concrete structures located in environments affected by severe weather conditions. For this reason, researchers have been investigating the use of reinforcing bars made of composite material. One study was carried out to develop guidelines for bonding glass-fiber-reinforced plastic rebars to concrete (“Design recommendations for bond of GFRP rebars to concrete,” Journal of Structural Engineering, 1996: 247-254). Consider the following 48 observations on measured bond strength:

11.5 12.1 9.9 9.3 7.8 6.2 6.6 7.0
13.4 17.1 9.3 5.6 5.7 5.4 5.2 5.1
4.9 10.7 15.2 8.5 4.2 4.0 3.9 3.8
3.6 3.4 20.6 25.5 13.8 12.6 13.1 8.9
8.2 10.7 14.2 7.6 5.2 5.5 5.1 5.0
5.2 4.8 4.1 3.8 3.7 3.6 3.6 3.6

We want to use this data to obtain a 99% confidence interval estimate for μ, the overall mean bond strength.

Sample Size for a Specified Margin of Error: As part of our experimental design, we may want to specify the margin of error, E, that is acceptable for our estimate of μ, and choose a sample size that ensures we achieve this margin of error. We set

E = z_{α/2}·σ/√n.

Solving for n, we obtain

n = (z_{α/2}·σ/E)².

Now, we know E and α, but we need a usable value for σ² before we can find the sample size. Since we do not know the value of the sample variance until we collect the data, we must go to another source for a usable value of σ². Often we do a literature search for previously published research on the same topic and use the sample variance from that research. The above equation then gives a sample size that achieves the desired margin of error with the desired level of confidence.
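The interval X̄ ± t_{n−1, α/2}·S/√n can also be computed directly, paralleling the Excel workflow (Descriptive Statistics plus CONFIDENCE.T) described in Example 1. The sketch below assumes Python with the scipy library available and uses the rod-diameter data from Example 1; the computed margin of error plays the same role as the value returned by CONFIDENCE.T.

# Minimal sketch: 95% t confidence interval for the mean rod diameter
# (Example 1 data), assuming Python with scipy available.
import math
import statistics
from scipy.stats import t

rods = [8.23, 8.31, 8.42, 8.29, 8.19, 8.24,
        8.19, 8.29, 8.30, 8.14, 8.32, 8.40]   # Example 1 data (mm)

n = len(rods)
x_bar = statistics.mean(rods)                 # sample mean
s = statistics.stdev(rods)                    # sample standard deviation

alpha = 0.05                                  # 95% confidence
t_crit = t.ppf(1 - alpha / 2, df=n - 1)       # t_{n-1, alpha/2}
margin = t_crit * s / math.sqrt(n)            # margin of error (the CONFIDENCE.T value)

print(f"95% CI for mu: ({x_bar - margin:.4f}, {x_bar + margin:.4f})")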
Example: Suppose that we are interested in the burning rate of a solid propellant used to power aircrew escape systems; burning rate is a random variable that can be described by a probability distribution. Our interest focuses on the mean burning rate. We want a 95% confidence interval estimate, and we want the margin of error of our estimate to be no more than E = 1.5 cm/sec. Previous studies have shown that the best estimate of the standard deviation of the burning rate is σ = 2.0 cm/sec. We then need the value of z_{α/2} = z_{0.025} = NORM.INV(0.975, 0, 1) = 1.96. Substituting into the formula above, we find that the required sample size for estimating μ with 95% confidence and a margin of error of no more than 1.5 cm/sec is

n = ((1.96 × 2.0)/1.5)² = 6.8295 ≅ 7.

Note that we always round up to obtain the required sample size.

Hypothesis Testing

Often, instead of estimating the value of a parameter based on sample data, we simply want to decide whether we believe a specific assertion about the value of the parameter.

Definition: A hypothesis is a statement about the value of a population parameter.

Examples: 1) Nothing outlasts the Energizer. 2) More doctors recommend Tylenol for the relief of headache pain than any other pain reliever.

Hypotheses are tested in pairs, to decide which of the two statements is more believable.

The Null Hypothesis, H0
This hypothesis usually represents the state of no change or no difference, from the researcher’s point of view. Often the null hypothesis is a statement of current belief about the value of the parameter; the researcher doubts the null hypothesis and wants to disprove it. Symbolically, this hypothesis is never a strict inequality.

The Alternative Hypothesis, Ha
This statement is what the researcher is attempting to prove. It is usually the negation of the null hypothesis. Symbolically, this hypothesis is always a strict inequality. It can take one of three forms for a parameter θ and a given number θ0:
1) Ha: θ > θ0
2) Ha: θ < θ0
3) Ha: θ ≠ θ0

Example: Let pT be the proportion of all doctors who recommend Tylenol, and let pA be the proportion of all doctors who recommend alternatives to Tylenol. We want to test the following two hypotheses against each other: H0: pT ≤ pA vs. Ha: pT > pA.

Whenever incomplete information, such as that from a sample, is used to make an inference about the value of a population parameter, there is the risk of making a mistake. In a hypothesis-testing situation, there are two possible mistakes that could be made.

Type I Error
This type of error occurs when our test leads us to reject the null hypothesis when, in fact, the null hypothesis is true. The probability of making a Type I error is denoted by the Greek letter α, and is called the significance level of the test. In scientific research, a Type I error is usually considered the more serious. This error is made when the researcher concludes that she has proved what she wanted to prove, but that conclusion is mistaken.

Type II Error
This type of error occurs when our test leads us to fail to reject the null hypothesis when, in fact, the null hypothesis is false. The probability of making a Type II error is denoted by the Greek letter β. This error is made when the researcher concludes that the data do not give sufficient evidence to support the researcher’s original conjecture, when in fact the conjecture is true. Later research may then provide sufficient evidence to validate the researcher’s conjecture.
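As an aside before the discussion of test outcomes continues, the sample-size arithmetic from the burning-rate example can be checked with a short sketch, here assuming Python with the scipy library available (the notes themselves use Excel’s NORM.INV).

# Minimal sketch: required sample size n = (z_{alpha/2} * sigma / E)^2, rounded up.
import math
from scipy.stats import norm

E = 1.5       # required margin of error (cm/sec)
sigma = 2.0   # planning value of sigma from previous studies (cm/sec)
alpha = 0.05  # 95% confidence

z = norm.ppf(1 - alpha / 2)             # z_{alpha/2} = NORM.INV(0.975, 0, 1), about 1.96
n = math.ceil((z * sigma / E) ** 2)     # always round up

print(f"required sample size: {n}")     # prints 7, matching the worked example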
Possible Results of a Hypothesis Test

                Reject H0            Fail to Reject H0
H0 True         Type I Error         Correct decision
Ha True         Correct decision     Type II Error

Before conducting a hypothesis test, the researcher decides on an acceptable level of risk for committing a Type I error (i.e., chooses a value for α). Most commonly, α = 0.05. If a Type I error is deemed to have more serious consequences, then the researcher may choose a smaller value for α, such as 0.01 or 0.001. The researcher also chooses the amount of evidence to collect (the sample size or sizes).

Test Statistic
The researcher summarizes the information contained in the simple random sample(s) in the form of a test statistic, a random variable whose value is calculated from the sample data. The value of this statistic tells the researcher whether to reject H0 or to fail to reject H0. The test statistic must be chosen so that its probability distribution under the null hypothesis is known.

Rejection Region (or Critical Region)
The rejection region is the range of possible values of the test statistic such that, if the actual value falls in this region, the researcher will reject H0 and conclude that Ha is true. The boundary point(s) of this region is (are) called the critical value(s) of the test. The form of the rejection region depends on the form of the alternative hypothesis:
1) If the alternative hypothesis has the form Ha: θ > θ0, then the rejection region is a right-hand tail of the distribution of the test statistic, with area α.
2) If the alternative hypothesis has the form Ha: θ < θ0, then the rejection region is a left-hand tail of the distribution of the test statistic, with area α.
3) If the alternative hypothesis has the form Ha: θ ≠ θ0, then the rejection region is the union of a right-hand tail and a left-hand tail of the distribution of the test statistic, each having area α/2.

Note: You may be wondering why we don’t simply always choose a very small value for the significance level of the test, so that we will have a very slim chance of making a Type I error. The reason is that, for a given level of evidence (sample size(s)), if we make the probability of a Type I error smaller, we automatically increase the probability of making a Type II error. Our goal is to make both probabilities as small as possible. Usually, the consequences of making a Type I error are more serious than the consequences of making a Type II error, so we want to control α. It is important, however, to make both α and β as small as possible. We do this by choosing an appropriate sample size(s): for a given chosen value of α, making the sample size(s) larger decreases the probability of making a Type II error.

Steps in Statistical Hypothesis Testing
The following steps must appear in each statistical hypothesis test. The first four steps are the set-up of the test; they are completed before the researcher chooses a sample(s) and collects data.

Step 1: State the null hypothesis, H0, and the alternative hypothesis, Ha. The alternative hypothesis represents what the researcher is trying to prove; the null hypothesis represents its negation. (In a criminal trial in the American justice system, the null hypothesis is that the defendant is innocent and the alternative is that the defendant is guilty; the jury rejects the null hypothesis if it finds that the prosecution has presented convincing evidence, or fails to reject the null hypothesis if it finds that the prosecution has not presented convincing evidence.)
Step 2: State the size(s) of the sample(s). This represents the amount of evidence being used to make a decision. State the significance level, α, for the test. The significance level is the probability of making a Type I error. A Type I error is a decision in favor of the alternative hypothesis when, in fact, the null hypothesis is true. A Type II error is a decision to fail to reject the null hypothesis when, in fact, the null hypothesis is false.

Step 3: State the test statistic that will be used to conduct the hypothesis test. The following statement should appear in this step: “The test statistic is _________, which under H0 has a _____________ probability distribution (with ____ degrees of freedom).”

Step 4: Find the critical value for the test. This is found using the T.INV function of Excel. This value represents the cutoff point for the test statistic. If the value of the test statistic computed from the sample data is beyond the critical value, the decision will be to reject the null hypothesis in favor of the alternative hypothesis.

Step 5: Calculate the value of the test statistic using the sample data. We find the sample mean and standard deviation using Excel’s Descriptive Statistics, and then calculate t.

Step 6: Decide, based on a comparison of the calculated value of the test statistic with the critical value of the test, whether to reject the null hypothesis in favor of the alternative. If the decision is to reject H0, the statement of the conclusion should read as follows: “We reject H0 at the (value of α) level of significance. There is sufficient evidence to conclude that (statement of the alternative hypothesis).” If the decision is to fail to reject H0, the statement of the conclusion should read as follows: “We fail to reject H0 at the (value of α) level of significance. There is not sufficient evidence to conclude that (statement of the alternative hypothesis).”

Example 1: A machine produces metal rods used in an automobile suspension system. The manufacturing specifications say that the mean diameter of the rods should be 8.20 mm. The quality control engineer wants to test for conformity to specifications. He will select a random sample of 12 of the rods from a large production run and use the data from the sample to test whether the mean diameter of the rods differs from the specified value. He will use α = 0.05 as the significance level of the test.

Testing Hypotheses Concerning a Population Mean, μ:
We want to test hypotheses of the following possible forms:
1) H0: μ = μ0 vs. Ha: μ ≠ μ0
2) H0: μ ≥ μ0 vs. Ha: μ < μ0
3) H0: μ ≤ μ0 vs. Ha: μ > μ0

The test statistic to be used is

T = (X̄ − μ0)/(S/√n).

Under the null hypothesis, this statistic has an approximate t distribution with d.f. = n − 1. For the three types of alternative hypotheses, the rejection regions are:
1) Ha: μ ≠ μ0: Reject H0 if |T| > t_{n−1, α/2}
2) Ha: μ < μ0: Reject H0 if T < −t_{n−1, α}
3) Ha: μ > μ0: Reject H0 if T > t_{n−1, α}

Example 1: The data for the automobile suspension rods are given again below. We will use this data set to perform the hypothesis test.

8.23 8.31 8.42 8.29 8.19 8.24
8.19 8.29 8.30 8.14 8.32 8.40

Example 2: Corrosion of reinforcing steel is a serious problem in concrete structures located in environments affected by severe weather conditions. For this reason, researchers have been investigating the use of reinforcing bars made of composite material.
One study was carried out to develop guidelines for bonding glass-fiber-reinforced plastic rebars to concrete (“Design recommendations for bond of GFRP rebars to concrete,” Journal of Structural Engineering, 1996: 247-254). Consider the following 48 observations on measured bond strength (all measurements in MPa):

11.5 12.1 9.9 9.3 7.8 6.2 6.6 7.0
13.4 17.1 9.3 5.6 5.7 5.4 5.2 5.1
4.9 10.7 15.2 8.5 4.2 4.0 3.9 3.8
3.6 3.4 20.6 25.5 13.8 12.6 13.1 8.9
8.2 10.7 14.2 7.6 5.2 5.5 5.1 5.0
5.2 4.8 4.1 3.8 3.7 3.6 3.6 3.6

It is desirable that the mean bond strength exceed 6.5 MPa. We want to use the sample data to test whether μ > 6.5 MPa.
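A worked sketch of this one-sided test, following Steps 3 through 6 above, is given below. It assumes Python with the scipy library available (rather than the Excel workflow the notes use) and assumes a significance level of α = 0.05, which Example 2 does not state explicitly; the upper-tail critical value plays the role of Excel’s T.INV output.

# Minimal sketch: one-sample t test of H0: mu <= 6.5 MPa vs. Ha: mu > 6.5 MPa
# for the 48 bond-strength observations of Example 2 (Python, scipy assumed).
import math
import statistics
from scipy.stats import t

bond = [11.5, 12.1, 9.9, 9.3, 7.8, 6.2, 6.6, 7.0,
        13.4, 17.1, 9.3, 5.6, 5.7, 5.4, 5.2, 5.1,
        4.9, 10.7, 15.2, 8.5, 4.2, 4.0, 3.9, 3.8,
        3.6, 3.4, 20.6, 25.5, 13.8, 12.6, 13.1, 8.9,
        8.2, 10.7, 14.2, 7.6, 5.2, 5.5, 5.1, 5.0,
        5.2, 4.8, 4.1, 3.8, 3.7, 3.6, 3.6, 3.6]

mu_0 = 6.5                                     # hypothesized mean (MPa)
alpha = 0.05                                   # assumed significance level
n = len(bond)
x_bar = statistics.mean(bond)
s = statistics.stdev(bond)

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))   # test statistic T
t_crit = t.ppf(1 - alpha, df=n - 1)            # upper-tail critical value t_{n-1, alpha}

print(f"T = {t_stat:.3f}, critical value = {t_crit:.3f}")
if t_stat > t_crit:
    print("Reject H0: sufficient evidence that the mean bond strength exceeds 6.5 MPa.")
else:
    print("Fail to reject H0 at this significance level.")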