 

Confidence Intervals Based on the t-Distribution

You have learned that a confidence interval for a population mean μ has the form:

$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

One huge problem with this expression is that if the population mean μ is unknown and needs to be estimated, it is not likely that the population standard deviation σ would be known. A natural solution is to simply use the sample standard deviation s as an estimate for the population standard deviation σ. So it would seem reasonable to use $\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ to obtain a confidence interval for μ. But does this expression actually produce an interval that will succeed in capturing the population mean μ 100(1-α)% of the time? We will examine this question using simulation in Example 1 below.

Example 1: Chest Measurements of Militiamen

The file Militia.MTW on the FacShare drive contains chest measurements (in inches) for 5738 Scottish militiamen in the early 19th century. We will treat these observations as our population, take samples of size n = 5 from it, and form 95% confidence intervals for the population mean.

(a) Use Stat > Basic Statistics > Display Descriptive Statistics to find the population mean and standard deviation. Record them below with appropriate symbols. Also make a histogram of the population distribution. Does the population distribution appear to be approximately normal?

(b) For this population, is it appropriate to use the Central Limit Theorem to approximate the distribution of $\bar{X}$ even with a very small sample size like n = 5? Explain.

(c) We will now use the Minitab macro militia.mac (found on FacShare) to simulate selecting 1000 samples of size n = 5 from this population. For each of these 1000 samples, we compute a confidence interval for the mean using the formula suggested above, and then we will count how many of those intervals contain the actual population mean μ.
Copy the macro from FacShare to your desktop and then type a command like the following in Minitab (replacing bgill with your username):

%'C:\Users\bgill\Desktop\militia'

(type a % sign followed by the full path name of the file in single quotes). It might be easiest to save the macro to a USB drive, where the path will be something simpler like %'D:\militia'. Enter 5 for the sample size and 1000 for the number of samples to generate. The macro will select 1000 samples of size 5 and compute the mean and standard deviation for each sample. It will also produce histograms of the simulated sampling distributions of $\bar{x}$ and s. Does the sampling distribution of $\bar{x}$ appear to be approximately normal? What about the sampling distribution of s?

(d) Now create the bounds for the expression $\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ from each of your 1000 samples, and count how many of these (alleged) 95% confidence intervals succeed in capturing the population mean μ. (As the formula implies, the commands below take c3 to hold the sample means and c4 to hold the sample standard deviations.)

MTB> let c5=c3-1.96*c4/sqrt(5)
MTB> let c6=c3+1.96*c4/sqrt(5)
MTB> let c7=(c5<39.832 & c6>39.832)
MTB> tally c7

What percentage of your 1000 intervals succeed in capturing the population mean? Is it close to 95%?

You should have found that this procedure does not produce intervals that succeed in capturing μ 95% of the time, so we need to adjust the procedure. The adjustment needed is to use a critical value not from the z-distribution (standard normal) but rather from what is called a t-distribution (you will receive a separate handout with more discussion of the t-distribution). These critical values for the t-distribution will depend on both the desired confidence level and on our sample size.

To find a confidence interval based on a sample of size n, we will use what is called the t-distribution with n - 1 degrees of freedom. The critical value for a 100(1-α)% confidence interval will be denoted by $t_{n-1,\alpha/2}$.
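The simulation in parts (c) and (d) can also be sketched outside Minitab. The Python sketch below is only an illustration under stated assumptions: Militia.MTW is not read here, so a stand-in population of 5738 normal values with mean 39.8 and standard deviation 2.0 is generated instead (both values are assumptions, not the handout's data), and the function name `coverage` is hypothetical.

```python
import math
import random
import statistics

def coverage(population, n, crit, reps, rng):
    """Fraction of `reps` intervals  x-bar +/- crit * s / sqrt(n)
    that contain the mean of `population`."""
    mu = statistics.fmean(population)
    hits = 0
    for _ in range(reps):
        sample = rng.sample(population, n)      # simple random sample, no replacement
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)            # sample SD (divides by n - 1)
        half_width = crit * s / math.sqrt(n)
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

rng = random.Random(1)                          # fixed seed so the run is repeatable
# Stand-in population (assumption): NOT the Militia.MTW data.
population = [rng.gauss(39.8, 2.0) for _ in range(5738)]

z_cov = coverage(population, n=5, crit=1.96, reps=1000, rng=rng)
print(f"share of z-based intervals capturing the mean: {z_cov:.3f}")
```

Passing a different value for `crit` lets the same function check any other critical value against the nominal 95% level.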
A 100(1-α)% confidence interval for μ is given by:

$\bar{x} \pm t_{n-1,\alpha/2} \cdot \frac{s}{\sqrt{n}}$

This procedure is appropriate when one has a simple random sample and when either the population has a normal distribution or the sample size is large (n ≥ 30 as a rule of thumb).

(e) Use Minitab to find the critical value $t_{n-1,\alpha/2}$ for a 95% confidence interval for μ with a sample of size n = 5. Note that this means that we will have n - 1 = 4 degrees of freedom. Also, for a 95% confidence interval, we have 1 - α = 0.95, so α = 0.05 and α/2 = 0.025. We can find the critical value using a process very similar to what we did for critical values for the normal distribution:

• Select Graph > Probability Distribution Plot from the menu.
• Select View Probability and click OK.
• Under Distribution, select t. Then enter 4 as the Degrees of freedom (n - 1 = 4).
• Click on the Shaded Area tab. Under Define Shaded Area By, select Probability. Then select Middle and enter 0.025 for both Probability 1 and Probability 2. Click OK.

Record the resulting critical value $t_{4,0.025}$ below.

(f) Use this t-interval procedure to form 95% confidence intervals from your 1000 samples:

MTB> let c8=c3-2.776*c4/sqrt(5)
MTB> let c9=c3+2.776*c4/sqrt(5)
MTB> let c10=(c8<39.832 & c9>39.832)
MTB> tally c10

What percentage of your 1000 intervals succeed in capturing the population mean? Is it close to 95%?

You should find that this new confidence interval formula using the sample standard deviation s along with the t-distribution works much better than the formula using the standard normal distribution.

Example 2: ACT Scores

Thirteen randomly selected juniors at a university are given the ACT. The mean and standard deviation for their scores are 26 and 4, respectively. Find a 95% confidence interval for the mean score of all juniors at the university. Note that scores on standardized exams like the ACT are approximately normally distributed.
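The critical values $t_{n-1,\alpha/2}$ that Minitab reports can be cross-checked numerically. The following is a minimal Python sketch, assuming only the standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names `t_pdf`, `t_central_area`, and `t_critical` are hypothetical helpers, not Minitab or library functions.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_central_area(t, df, steps=2000):
    """Area under the t density between -t and t (Simpson's rule, steps must be even)."""
    h = t / steps                       # integrate over [0, t], then double by symmetry
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3

def t_critical(conf, df):
    """Value t such that the central area between -t and t equals conf."""
    lo, hi = 0.0, 1.0
    while t_central_area(hi, df) < conf:    # widen the bracket until it contains t
        hi *= 2
    for _ in range(60):                     # bisection
        mid = (lo + hi) / 2
        if t_central_area(mid, df) < conf:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% confidence with n = 5, so 4 degrees of freedom; compare with the 2.776 in part (f)
print(round(t_critical(0.95, 4), 3))
```

Calling `t_critical` with a different confidence level and n - 1 degrees of freedom gives a value to compare against the Minitab plot for the other examples.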
(a) Find the critical value $t_{n-1,\alpha/2}$ needed to form a 95% confidence interval for the population mean μ based on this sample of size n = 13. (Remember to use n - 1 as the degrees of freedom.)

(b) Find a 95% confidence interval for the mean ACT score for all juniors at the university.

Example 3:

A pharmaceutical company checks the potency of the drugs it manufactures by chemical analysis. It selects a random sample of 16 cephalothin crystals manufactured in the last week and finds the following values for the release potency of the crystals:

897 914 913 906 916 918 905 921
918 906 895 893 908 906 907 901

The data can be found in the file Cephalothin.MTW on FacShare.

(a) Find the mean and standard deviation of the sample data. Also examine a histogram of the data and comment on its key features.

(b) Find a 99% confidence interval for the mean release potency of all cephalothin crystals manufactured in the last week. Use Minitab to find the critical value $t_{n-1,\alpha/2}$ and then show the details of computing the interval.

(c) Use Minitab to verify your computations from part (b). Select Stat > Basic Statistics > One Sample t. Double-click on the column containing the data to select it as the variable, use the Options button to set the confidence level to 99%, and then click OK twice.

(d) The release potency is supposed to be 910. Based on the confidence interval, does it seem plausible that the population mean for all cephalothin crystals produced in the last week could be 910? Do these data provide strong evidence that the mean release potency is not 910? Explain.

(e) Comment on whether this t-procedure is valid here. [Hints: First ask whether the sample is a random one from the population. Then ask whether the sample size is large or whether the data appear to be normally distributed.]

(f) What percentage of the crystals from the sample fall within the interval that you computed? Is this percentage close to 99%? Should it be? Explain.
This last question should remind you that the confidence interval estimates the value of the population mean. If we repeatedly take random samples and form a confidence interval like this, then in the long run 99% of those intervals would contain the actual value of the population mean. The interval does not aim to estimate the value of an individual observation, however.

(g) What effect would it have on the confidence intervals and the questions above if the first value in the data (897) were replaced by 807?
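The interval computations in Example 3, including the change asked about in part (g), can be sketched in Python as a cross-check on the Minitab output. This is a minimal sketch under stated assumptions: the function name `t_interval` is hypothetical, and the critical value 2.947 for 15 degrees of freedom at 99% confidence is taken from a standard t table and should be confirmed in Minitab.

```python
import math
import statistics

def t_interval(data, t_crit):
    """Return (lower, upper) for  x-bar +/- t_crit * s / sqrt(n)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)              # sample SD (divides by n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Release potency values listed in Example 3
potency = [897, 914, 913, 906, 916, 918, 905, 921,
           918, 906, 895, 893, 908, 906, 907, 901]

# Assumption: table value for 99% confidence with 15 degrees of freedom
T_CRIT_99_DF15 = 2.947

original = t_interval(potency, T_CRIT_99_DF15)
altered = t_interval([807] + potency[1:], T_CRIT_99_DF15)   # part (g): 897 becomes 807

print("original data:", original)
print("with 807:     ", altered)
```

Comparing the two printed intervals shows how a single unusual observation shifts the center and changes the width, since both the sample mean and the sample standard deviation respond to it.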

 
