G. W. Teklewolde Math MS Statistics Basics Study Note
Part 5 Statistics Basics

Transforming a z-Score to an x-Value

Recall that to transform an x-value to a z-score, you can use the formula

$z = \frac{x - \mu}{\sigma}$

This formula gives z in terms of x. If you solve this formula for x, you get a new formula that gives x in terms of z.

$z\sigma = x - \mu$
$\mu + z\sigma = x$
$x = \mu + z\sigma$

The last formula transforms a z-score to an x-value.

Sampling Distributions

In previous sections, you studied the relationship between the mean of a population and values of a random variable. In this section, you will study the relationship between a population mean and the means of samples taken from the population.

DEFINITION A sampling distribution is the probability distribution of a sample statistic that is formed when samples of size n are repeatedly taken from a population. If the sample statistic is the sample mean, then the distribution is the sampling distribution of sample means.

Properties of Sampling Distributions of Sample Means

1. The mean of the sample means, $\mu_{\bar{x}}$, is equal to the population mean µ.
   $\mu_{\bar{x}} = \mu$
2. The standard deviation of the sample means, $\sigma_{\bar{x}}$, is equal to the population standard deviation σ divided by the square root of n.
   $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

The standard deviation of the sampling distribution of the sample means is called the standard error of the mean.

The Central Limit Theorem

The Central Limit Theorem forms the foundation for the inferential branch of statistics. This theorem describes the relationship between the sampling distribution of sample means and the population that the samples are taken from. The Central Limit Theorem is an important tool that provides the information you'll need to use sample statistics to make inferences about a population mean.

The Central Limit Theorem

1. If samples of size n, where n ≥ 30, are drawn from any population with a mean µ and a standard deviation σ, then the sampling distribution of sample means approximates a normal distribution.
   The greater the sample size, the better the approximation.
2. If the population itself is normally distributed, the sampling distribution of sample means is normally distributed for any sample size n.

In either case, the sampling distribution of sample means has a mean equal to the population mean.

Mean: $\mu_{\bar{x}} = \mu$

The sampling distribution of sample means has a variance equal to 1/n times the variance of the population and a standard deviation equal to the population standard deviation divided by the square root of n.

Variance: $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$
Std. Dev.: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

The standard deviation of the sampling distribution of the sample means, $\sigma_{\bar{x}}$, is also called the standard error of the mean.

Probability and the Central Limit Theorem

Previously you saw how to find the probability that a random variable x will fall in a given interval of population values. In a similar manner, you can find the probability that a sample mean $\bar{x}$ will fall in a given interval of the $\bar{x}$ sampling distribution. To transform $\bar{x}$ to a z-score, you can use the formula

$z = \frac{\text{Value} - \text{Mean}}{\text{Std. Dev.}} = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

Approximating a Binomial Distribution

Previously you learned how to find binomial probabilities. For instance, if a surgical procedure has an 85% chance of success and a doctor performs the procedure on 10 patients, it is easy to find the probability of exactly two successful surgeries. But what if the doctor performs the surgical procedure on 150 patients and you want to find the probability of fewer than 100 successful surgeries? To do this using the techniques described in Section 4.2, you would have to use the binomial formula 100 times and find the sum of the resulting probabilities. This approach is not practical, of course. A better approach is to use a normal distribution to approximate the binomial distribution.
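
As a rough illustration of the surgery example, the Python sketch below computes the 100-term binomial sum directly and compares it with a normal approximation that uses mean np and standard deviation sqrt(npq), plus the 0.5 continuity correction described in the next section. The function names (`normal_cdf`, `binom_cdf_exact`, `binom_cdf_normal`) are illustrative choices, not part of these notes, and the standard normal table is replaced here by the complementary error function from Python's `math` module.

```python
from math import comb, erfc, sqrt


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * erfc(-z / sqrt(2))


def binom_cdf_exact(k, n, p):
    """P(X <= k) for a binomial variable: one binomial-formula term per value."""
    q = 1 - p
    return sum(comb(n, x) * p**x * q**(n - x) for x in range(k + 1))


def binom_cdf_normal(k, n, p):
    """Normal approximation to P(X <= k), with a continuity correction."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("normal approximation requires np >= 5 and nq >= 5")
    mu = n * p                # mean of the binomial distribution
    sigma = sqrt(n * p * q)   # standard deviation of the binomial distribution
    z = (k + 0.5 - mu) / sigma  # move 0.5 unit right to include the bar at k
    return normal_cdf(z)


# Surgery example from the text: n = 150 patients, p = 0.85.
# "Fewer than 100 successes" means X <= 99.
n, p = 150, 0.85
print("exact sum (100 terms):", binom_cdf_exact(99, n, p))
print("normal approximation: ", binom_cdf_normal(99, n, p))
```

Both results are extremely small here because 99 lies far below the mean np = 127.5. The approximation is usually closer in relative terms for values nearer the center of the distribution.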

Normal Approximation to a Binomial Distribution

If np ≥ 5 and nq ≥ 5, then the binomial random variable x is approximately normally distributed, with mean

$\mu = np$

and standard deviation

$\sigma = \sqrt{npq}$

Correction for Continuity

The binomial distribution is discrete and can be represented by a probability histogram. To calculate exact binomial probabilities, you can use the binomial formula for each value of x and add the results. Geometrically, this corresponds to adding the areas of bars in the probability histogram. Remember that each bar has a width of one unit and x is the midpoint of the interval. When you use a continuous normal distribution to approximate a binomial probability, you need to move 0.5 unit to the left and right of the midpoint to include all possible x-values in the interval. When you do this, you are making a correction for continuity.

Approximating Binomial Probabilities

GUIDELINES
Using the Normal Distribution to Approximate Binomial Probabilities

1. In words: Verify that the binomial distribution applies.
   In symbols: Specify n, p, and q.
2. In words: Determine if you can use the normal distribution to approximate x, the binomial variable.
   In symbols: Is np ≥ 5? Is nq ≥ 5?
3. In words: Find the mean µ and standard deviation σ for the distribution.
   In symbols: $\mu = np$, $\sigma = \sqrt{npq}$
4. In words: Apply the appropriate continuity correction. Shade the corresponding area under the normal curve.
   In symbols: Add or subtract 0.5 from endpoints.
5. In words: Find the corresponding z-score(s).
   In symbols: $z = \frac{x - \mu}{\sigma}$
6. In words: Find the probability.
   In symbols: Use the Standard Normal Table.

Estimating Population Parameters

In this chapter, you will learn an important technique of statistical inference: using sample statistics to estimate the value of an unknown population parameter. In this section, you will learn how to use sample statistics to make an estimate of the population parameter µ when the sample size is at least 30 or when the population is normally distributed and the standard deviation σ is known.
To make such an inference, begin by finding a point estimate.

DEFINITION A point estimate is a single value estimate for a population parameter. The most unbiased point estimate of the population mean µ is the sample mean $\bar{x}$.

DEFINITION An interval estimate is an interval, or range of values, used to estimate a population parameter.

Although you can assume that the point estimate in Example 1 is not equal to the actual population mean, it is probably close to it. To form an interval estimate, use the point estimate as the center of the interval, then add and subtract a margin of error. For instance, if the point estimate is 12.4 and the margin of error is 2.1, then an interval estimate would be given by 12.4 ± 2.1 or 10.3 < µ < 14.5.

Point estimate: $\bar{x} = 12.4$
Interval estimate: $10.3 < \mu < 14.5$

Before finding an interval estimate, you should first determine how confident you need to be that your interval estimate contains the population mean µ.

DEFINITION The level of confidence c is the probability that the interval estimate contains the population parameter.

The difference between the point estimate and the actual parameter value is called the sampling error. When µ is estimated, the sampling error is the difference $\bar{x} - \mu$. In most cases, of course, µ is unknown, and $\bar{x}$ varies from sample to sample. However, you can calculate a maximum value for the error if you know the level of confidence and the sampling distribution.

DEFINITION Given a level of confidence c, the margin of error (sometimes also called the maximum error of estimate or error tolerance) E is the greatest possible distance between the point estimate and the value of the parameter it is estimating.

$E = z_c \sigma_{\bar{x}} = z_c \frac{\sigma}{\sqrt{n}}$

When n ≥ 30, the sample standard deviation s can be used in place of σ.

Confidence Intervals for the Population Mean

Using a point estimate and a margin of error, you can construct an interval estimate of a population parameter such as µ.
This interval estimate is called a confidence interval.

DEFINITION A c-confidence interval for the population mean µ is

$\bar{x} - E < \mu < \bar{x} + E$

The probability that the confidence interval contains µ is c.

GUIDELINES
Finding a Confidence Interval (CI) for a Population Mean (n ≥ 30 or σ known with a normally distributed population)

1. In words: Find the sample statistics n and $\bar{x}$.
   In symbols: $\bar{x} = \frac{\sum x}{n}$
2. In words: Specify σ if known. Otherwise, if n ≥ 30, find the sample standard deviation s and use it as an estimate for σ.
   In symbols: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
3. In words: Find the critical value $z_c$ that corresponds to the given level of confidence.
   In symbols: Use the Standard Normal Table.
4. In words: Find the margin of error E.
   In symbols: $E = z_c \frac{\sigma}{\sqrt{n}}$
5. In words: Find the left and right endpoints and form the confidence interval.
   In symbols: Left endpoint: $\bar{x} - E$. Right endpoint: $\bar{x} + E$. Interval: $\bar{x} - E < \mu < \bar{x} + E$

Sample Size

For the same sample statistics, as the level of confidence increases, the confidence interval widens. As the confidence interval widens, the precision of the estimate decreases. One way to improve the precision of an estimate without decreasing the level of confidence is to increase the sample size. But how large a sample size is needed to guarantee a certain level of confidence for a given margin of error?

Find a Minimum Sample Size to Estimate µ

Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate µ, the population mean, is

$n = \left( \frac{z_c \sigma}{E} \right)^2$

If σ is unknown, you can estimate it using s, provided you have a preliminary sample with at least 30 members.
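
The confidence-interval guidelines and the minimum sample size formula can be sketched in Python as follows. This is a minimal sketch, assuming σ is known (or estimated by s with n ≥ 30). The function names and the numeric inputs (σ = 5.0, n = 50, c = 0.95, E = 1.0) are hypothetical values chosen only for illustration; the sample mean 12.4 is borrowed from the interval estimate example earlier in these notes. The critical value $z_c$ is taken from `statistics.NormalDist` instead of the Standard Normal Table, and the sample size is rounded up to the next whole number because n must be an integer.

```python
from math import ceil, sqrt
from statistics import NormalDist


def critical_value(c):
    """z_c such that the area between -z_c and z_c under the standard normal curve is c."""
    return NormalDist().inv_cdf((1 + c) / 2)


def confidence_interval(x_bar, sigma, n, c):
    """c-confidence interval for the population mean: x_bar - E < mu < x_bar + E."""
    margin = critical_value(c) * sigma / sqrt(n)  # E = z_c * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


def min_sample_size(sigma, margin, c):
    """Smallest whole n with z_c * sigma / sqrt(n) <= margin."""
    return ceil((critical_value(c) * sigma / margin) ** 2)


# Hypothetical inputs, for illustration only.
left, right = confidence_interval(x_bar=12.4, sigma=5.0, n=50, c=0.95)
print(f"95% confidence interval: {left:.2f} < mu < {right:.2f}")
print("minimum n for E = 1.0:", min_sample_size(sigma=5.0, margin=1.0, c=0.95))
```

Raising c increases $z_c$ and widens the interval, while raising n shrinks the margin of error, which matches the trade-off described in the Sample Size section.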