Sampling Distribution and Point Estimation of Parameters
MATH30-6 Probability and Statistics

Objectives
At the end of the lesson, the students are expected to:
• Explain the general concepts of estimating the parameters of a population or a probability distribution;
• Explain the important role of the normal distribution as a sampling distribution; and
• Understand the central limit theorem.

Point Estimator
• A point estimate of some population parameter $\theta$ is a single numerical value $\hat{\theta}$ of a statistic $\hat{\Theta}$. The statistic $\hat{\Theta}$ is called the point estimator.

Estimation problems occur frequently in engineering. We often need to estimate:
• The mean $\mu$ of a single population
• The variance $\sigma^2$ (or standard deviation $\sigma$) of a single population
• The proportion $p$ of items in a population that belong to a class of interest
• The difference in means of two populations, $\mu_1 - \mu_2$
• The difference in two population proportions, $p_1 - p_2$

Reasonable point estimates:
• $\hat{\mu} = \bar{x}$, the sample mean
• $\hat{\sigma}^2 = s^2$, the sample variance
• $\hat{p} = x/n$, the sample proportion
• $\hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2$
• $\hat{p}_1 - \hat{p}_2 = x_1/n_1 - x_2/n_2$

Sampling Distribution
• The random variables $X_1, X_2, \ldots, X_n$ are a random sample of size $n$ if (a) the $X_i$'s are independent random variables, and (b) every $X_i$ has the same probability distribution.
• A statistic is any function of the observations in a random sample.
• The probability distribution of a statistic is called a sampling distribution.
  - For example, the probability distribution of $\bar{X}$ is called the sampling distribution of the mean.

Central Limit Theorem
Consider determining the sampling distribution of the sample mean $\bar{X}$. If the random sample of size $n$ is taken from a normal population with mean $\mu$ and variance $\sigma^2$, the sample mean
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
has a normal distribution with mean
$$\mu_{\bar{X}} = \frac{\mu + \mu + \cdots + \mu}{n} = \mu$$
and variance
$$\sigma_{\bar{X}}^2 = \frac{\sigma^2 + \sigma^2 + \cdots + \sigma^2}{n^2} = \frac{\sigma^2}{n}$$

• If we are sampling from a population that has an unknown probability distribution, the sampling distribution of the sample mean will still be approximately normal with mean $\mu$ and variance $\sigma^2/n$, if the sample size $n$ is large.
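The claim that the sample mean is approximately normal with mean μ and variance σ²/n, even for a non-normal population, can be checked by simulation. The following is a minimal sketch, not a definitive procedure. It assumes a continuous uniform population on [4, 6], a sample size of n = 40, 10,000 repetitions, and a fixed seed, all chosen only for illustration; the function name simulate_sample_means is hypothetical.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` random samples of size `n` from Uniform(4, 6); return their means.

    The uniform population and the fixed seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.uniform(4.0, 6.0) for _ in range(n)])
        for _ in range(reps)
    ]


# Population mean and variance of Uniform(a, b): (a + b) / 2 and (b - a)^2 / 12.
N = 40
MU = (4.0 + 6.0) / 2
SIGMA2 = (6.0 - 4.0) ** 2 / 12

means = simulate_sample_means(n=N, reps=10_000)

# Compare the simulated sampling distribution with the values mu and sigma^2 / n.
observed_mean = statistics.fmean(means)
observed_var = statistics.pvariance(means)
theoretical_var = SIGMA2 / N

# Fraction of sample means within one standard error of mu.
# For an exactly normal distribution this fraction would be about 0.683.
standard_error = theoretical_var ** 0.5
within_one_se = sum(abs(m - MU) <= standard_error for m in means) / len(means)

print(f"mean of sample means:     {observed_mean:.4f} (theory {MU:.4f})")
print(f"variance of sample means: {observed_var:.5f} (theory {theoretical_var:.5f})")
print(f"fraction within one SE:   {within_one_se:.3f}")
```

Changing the population inside simulate_sample_means (for example, to a skewed distribution) and varying n gives a feel for how quickly the normal approximation becomes reasonable.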
• In inferential statistics, a sample with $n \geq 40$ is considered large (Montgomery and Runger, 2011); otherwise it is considered small.
• Walpole et al. (2012) consider $n \geq 30$ a large sample.

If $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ taken from a population (either finite or infinite) with mean $\mu$ and variance $\sigma^2$, and if $\bar{X}$ is the sample mean, the limiting form of the distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad (7\text{-}1)$$
as $n \to \infty$, is the standard normal distribution.

Examples:
7-1/228 An electronics company manufactures resistors that have a mean resistance of 100 ohms and a standard deviation of 10 ohms. The distribution of resistance is normal. Find the probability that a random sample of $n = 25$ resistors will have an average resistance less than 95 ohms.

7-2/228 Suppose that a random variable $X$ has a continuous uniform distribution
$$f(x) = \begin{cases} 1/2, & 4 \leq x \leq 6 \\ 0, & \text{otherwise} \end{cases}$$
Find the distribution of the sample mean of a random sample of size $n = 40$.

7-10/230 Suppose that the random variable $X$ has the continuous uniform distribution
$$f(x) = \begin{cases} 1, & 0 \leq x \leq 1 \\ 0, & \text{otherwise} \end{cases}$$
Suppose that a random sample of $n = 12$ observations is selected from this distribution. What is the approximate probability distribution of $\bar{X} - 6$? Find the mean and variance of this quantity.

7-11/230 Suppose that $X$ has a discrete uniform distribution
$$f(x) = \begin{cases} 1/3, & x = 1, 2, 3 \\ 0, & \text{otherwise} \end{cases}$$
A random sample of $n = 36$ is selected from this population. Find the probability that the sample mean is greater than 2.1 but less than 2.5, assuming that the sample mean would be measured to the nearest tenth.

7-12/231 The amount of time that a customer spends waiting at an airport check-in counter is a random variable with mean 8.2 minutes and standard deviation 1.5 minutes. Suppose that a random sample of $n = 49$ customers is observed.
Find the probability that the average time waiting in line for these customers is
(a) Less than 10 minutes
(b) Between 5 and 10 minutes
(c) Less than 6 minutes

Difference in Sample Means
If we have two independent populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, and if $\bar{X}_1$ and $\bar{X}_2$ are the sample means of two independent random samples of sizes $n_1$ and $n_2$ from these populations, then the sampling distribution of
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \qquad (7\text{-}4)$$
is approximately standard normal, if the conditions of the central limit theorem apply. If the two populations are normal, the sampling distribution of $Z$ is exactly standard normal.

Examples:
7-3/229 Aircraft Engine Life. The effective life of a component used in a jet-turbine aircraft engine is a random variable with mean 5000 hours and standard deviation 40 hours. The distribution of effective life is fairly close to a normal distribution. The engine manufacturer introduces an improvement into the manufacturing process for this component that increases the mean life to 5050 hours and decreases the standard deviation to 30 hours. Suppose that a random sample of $n_1 = 16$ components is selected from the “old” process and a random sample of $n_2 = 25$ components is selected from the “improved” process. What is the probability that the difference in the two sample means $\bar{X}_2 - \bar{X}_1$ is at least 25 hours? Assume that the old and improved processes can be regarded as independent populations.

7-13/231 A random sample of size $n_1 = 16$ is selected from a normal population with a mean of 75 and a standard deviation of 8. A second random sample of size $n_2 = 9$ is taken from another normal population with mean 70 and standard deviation 12. Let $\bar{X}_1$ and $\bar{X}_2$ be the two sample means.
Find:
(a) The probability that $\bar{X}_1 - \bar{X}_2$ exceeds 4
(b) The probability that $3.5 \leq \bar{X}_1 - \bar{X}_2 \leq 5.5$

7-14/231 A consumer electronics company is comparing the brightness of two different types of picture tubes for use in its television sets. Tube type A has mean brightness of 100 and standard deviation of 16, while tube type B has unknown mean brightness, but the standard deviation is assumed to be identical to that for type A. A random sample of $n = 25$ tubes of each type is selected, and $\bar{X}_B - \bar{X}_A$ is computed. If $\mu_B$ equals or exceeds $\mu_A$, the manufacturer would like to adopt type B for use. The observed difference is $\bar{x}_B - \bar{x}_A = 3.5$. What decision would you make, and why?

Summary
• The probability distribution of a statistic is called the sampling distribution. For example, the sampling distribution of the sample mean $\bar{X}$ is normal when the population is normal, and approximately normal for large $n$ otherwise.
• The simplest form of the central limit theorem states that the sum of $n$ independently distributed random variables tends to be normally distributed as $n$ becomes large. It is a necessary and sufficient condition that none of the variances of the individual random variables is large in comparison to their sum.
• Sampling Distribution of the Mean
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
• Approximate Sampling Distribution of a Difference in Sample Means
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

References
• Montgomery and Runger. Applied Statistics and Probability for Engineers, 5th Ed. © 2011
• Walpole, et al. Probability and Statistics for Engineers and Scientists, 9th Ed. © 2012, 2007, 2002
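
As a numerical check on the two standardizations in the Summary, the sketch below evaluates them with the figures given in Examples 7-1 and 7-3. It is an illustration under stated assumptions rather than a prescribed solution method: the helper names z_mean and z_diff are hypothetical, and for Example 7-3 the improved process is placed in the role of population 1 so that the difference matches the one asked for (improved minus old).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_mean(xbar, mu, sigma, n):
    """Standardize a sample mean: Z = (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / (sigma / sqrt(n))


def z_diff(diff, mu1, mu2, sigma1, sigma2, n1, n2):
    """Standardize a difference in sample means (population 1 minus population 2)."""
    return (diff - (mu1 - mu2)) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)


# Example 7-1: mu = 100 ohms, sigma = 10 ohms, n = 25; P(sample mean < 95).
z_resistors = z_mean(xbar=95, mu=100, sigma=10, n=25)
p_resistors = STD_NORMAL.cdf(z_resistors)

# Example 7-3: improved process (mean 5050, sd 30, n = 25) minus
# old process (mean 5000, sd 40, n = 16); P(difference >= 25 hours).
z_engine = z_diff(diff=25, mu1=5050, mu2=5000, sigma1=30, sigma2=40, n1=25, n2=16)
p_engine = 1 - STD_NORMAL.cdf(z_engine)

print(f"Example 7-1: z = {z_resistors:.2f}, probability = {p_resistors:.4f}")
print(f"Example 7-3: z = {z_engine:.2f}, probability = {p_engine:.4f}")
```

With these inputs the first standardized value is exactly -2.5, giving a probability of about 0.0062, and the second is about -2.14, giving a probability of about 0.984. The same two helpers apply to the remaining exercises once the appropriate means, standard deviations, and sample sizes are substituted.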