HAWKES LEARNING SYSTEMS
math courseware specialists
Copyright © 2010 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved.

Chapter 9 Samples and Sampling Distributions

Section 9.6 The Distribution of the Sample Mean

Objectives:
• To describe attributes of sampling distributions.
• To understand the concept of sample means.
• To understand the theory behind the Central Limit Theorem.

Definition:
• Sampling distribution for sample means: describes the means of all possible samples of a particular sample size from a specified population.

Example: Determine the sampling distribution of all possible samples of size two of the diameters of the valves of respirators.

Valve Measurements
Valve   Diameter
A       0.124
B       0.136
C       0.201
D       0.144
E       0.138

It is important to realize that this data set constitutes a population with $\mu = 0.1486$ and $\sigma^2 = 0.00073$. Both the population mean and variance are considered population parameters.
Solution:

Sample Valve Measurements
Sample Number   First Observation   Second Observation   $\bar{x}$
1 (A & B)       0.124               0.136                0.1300
2 (A & C)       0.124               0.201                0.1625
3 (A & D)       0.124               0.144                0.1340
4 (A & E)       0.124               0.138                0.1310
5 (B & C)       0.136               0.201                0.1685
6 (B & D)       0.136               0.144                0.1400
7 (B & E)       0.136               0.138                0.1370
8 (C & D)       0.201               0.144                0.1725
9 (C & E)       0.201               0.138                0.1695
10 (D & E)      0.144               0.138                0.1410

Attributes of Sample Means:
• Central value of the variable (mean).
• Variability of the variable (variance).
• Familiar pattern of the variable (distribution).

Definition:
• Unbiased: if the average value of an estimator equals the population parameter being estimated, the estimator is unbiased.

Bias: An estimator is unbiased if its values are centered on the parameter being estimated, and biased if they are centered somewhere else.
[Figure: two panels labeled "Unbiased" and "Biased"]

Examples of Unbiased Estimators:
• $\bar{x}$ is an unbiased estimator of $\mu$.
• $\hat{p}$ is an unbiased estimator of $p$.
• $s^2$ is an unbiased estimator of $\sigma^2$.

Definition: Variance of the sample mean: it can be shown that for a population of infinite size the variance of $\bar{x}$, written $\sigma_{\bar{x}}^2$, equals
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$,
where $\sigma^2$ is the population variance and $n$ is the sample size.

Example: For a population whose variance is 0.00073, calculate the variance of the sample means for samples of size n = 2.
Solution:
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{0.00073}{2} = 0.000365$

Definition: Variance of the sample mean: it can be shown that for a population of finite size the variance of $\bar{x}$, written $\sigma_{\bar{x}}^2$, equals
$\sigma_{\bar{x}}^2 = \left(\frac{N - n}{N - 1}\right)\frac{\sigma^2}{n}$,
where N = size of the population, and n = size of the sample.

Example: For a population whose variance is 0.00073, calculate the variance of the sample means for samples of size n = 2, assuming there is a finite population of N = 5.

Solution:
$\sigma_{\bar{x}}^2 = \left(\frac{N - n}{N - 1}\right)\frac{\sigma^2}{n} = \left(\frac{5 - 2}{5 - 1}\right)\frac{0.00073}{2} = 0.0002738$

Example: For a population whose variance is 0.00073, calculate the variance of the sample means for samples of size n = 3, assuming there is a finite population of N = 5.

Solution:
$\sigma_{\bar{x}}^2 = \left(\frac{N - n}{N - 1}\right)\frac{\sigma^2}{n} = \left(\frac{5 - 3}{5 - 1}\right)\frac{0.00073}{3} = 0.0001217$

Note: For n = 2, $\sigma_{\bar{x}}^2 = 0.0002738$; for n = 3, $\sigma_{\bar{x}}^2 = 0.0001217$. So as n grows larger, the variance of the sample mean grows smaller.

Choosing Sample Sizes: Based on the graphs below of a population with mean 43,660 and standard deviation of 2500, which sample size of n = 25, n = 100, or n = 200 is preferred?
[Figure: sampling distributions of the sample mean for n = 25, n = 100, and n = 200]

Solution: Because it has the least variability, the estimator with n = 200 would be preferred.

Characteristics of Sample Means:
• The mean of the sample means is the population mean. Another way of expressing this concept is to say that the expected value of $\bar{x}$ is equal to the population mean. Symbolically, this can be expressed as $E(\bar{x}) = \mu$ (unbiasedness).
• If the sample size is increased, the variability of the sample mean decreases.
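Both characteristics can be checked directly on the five-valve population from the earlier example. The following is a minimal sketch, not part of the original slides: it enumerates every possible sample of size n drawn without replacement and compares the variance of the resulting sample means with the finite-population formula. The function names `sample_means` and `finite_population_variance` are illustrative choices.

```python
from itertools import combinations
from statistics import fmean, pvariance

# Valve diameters from the example; these five values are the whole population.
valves = {"A": 0.124, "B": 0.136, "C": 0.201, "D": 0.144, "E": 0.138}
population = list(valves.values())
N = len(population)

mu = fmean(population)          # population mean
sigma2 = pvariance(population)  # population variance (divides by N)


def sample_means(n):
    """Means of every possible sample of size n drawn without replacement."""
    return [fmean(sample) for sample in combinations(population, n)]


def finite_population_variance(n):
    """Variance of the sample mean from the finite-population formula."""
    return (N - n) / (N - 1) * sigma2 / n


for n in (2, 3):
    means = sample_means(n)
    print(f"n = {n}: {len(means)} possible samples")
    print(f"  mean of sample means     = {fmean(means):.4f} (mu = {mu:.4f})")
    print(f"  variance of sample means = {pvariance(means):.7f}")
    print(f"  finite-population value  = {finite_population_variance(n):.7f}")
```

Because the sketch uses the unrounded population variance rather than 0.00073, its variances differ slightly from the worked solutions in the last digits.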
This implies that the quality of the estimator tends to improve as the sample size increases.

Central Limit Theorem: If a sufficiently large sample (i.e. n > 30) is drawn from a population with mean $\mu$ and variance $\sigma^2$, the distribution of the sample mean will have the following characteristics:
• An approximately normal distribution regardless of the distribution of the underlying population.
• $\mu_{\bar{x}} = E(\bar{x}) = \mu$ (The mean of the sample means equals the population mean.)
• $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$ (The variance of the sample means equals the variance of the population divided by the sample size.)

[Figure: four examples, each showing the distribution of the population beside the distribution of the sample mean for large samples]

Section 9.7 Using the Central Limit Theorem

Objective:
• To apply Central Limit Theorem to normal random variables.
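Before the theorem is applied, it may help to see it at work in a small simulation. The sketch below is an illustration under stated assumptions, not material from the slides: the population is assumed to be exponential with rate 0.5 (so $\mu = 2$ and $\sigma^2 = 4$), and the sample size of 36 and the 10,000 repetitions are arbitrary choices. It compares the simulated mean and variance of $\bar{x}$ with $\mu$ and $\sigma^2 / n$, and counts how many sample means fall within two standard deviations of $\mu$.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Assumed skewed population (hypothetical): exponential with rate 0.5,
# which has mean mu = 2 and variance sigma^2 = 4.
RATE = 0.5
MU = 1 / RATE
SIGMA2 = 1 / RATE**2

N_SAMPLE = 36         # sample size, above the n > 30 rule of thumb
REPETITIONS = 10_000  # number of samples drawn


def draw_sample_mean(n):
    """Draw one sample of size n from the population and return its mean."""
    return fmean(random.expovariate(RATE) for _ in range(n))


means = [draw_sample_mean(N_SAMPLE) for _ in range(REPETITIONS)]

# Standard deviation of the sample mean predicted by the theorem.
std_error = sqrt(SIGMA2 / N_SAMPLE)

# Share of sample means within two standard deviations of mu;
# a normal distribution would put roughly 95% of them there.
within_two = sum(abs(m - MU) <= 2 * std_error for m in means) / REPETITIONS

print(f"mean of sample means     = {fmean(means):.4f} (mu = {MU})")
print(f"variance of sample means = {pvariance(means):.4f} "
      f"(sigma^2 / n = {SIGMA2 / N_SAMPLE:.4f})")
print(f"share within 2 std devs  = {within_two:.3f}")
```

Changing `RATE` or swapping in another population generator is a simple way to see that the shape of the population matters less and less as `N_SAMPLE` grows.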
Example: If a population has a mean of 30, a variance of 25, and a sample of 100 is drawn from the population, what is the probability that the sample mean will be larger than 31?

Solution: Here $\sigma^2 = 25$, $n = 100$, and $\mu = 30$. By the Central Limit Theorem, the distribution of $\bar{x}$ will be approximately normal with a mean equal to the population mean, 30, and a variance given by
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{25}{100} = 0.25$, so that $\sigma_{\bar{x}} = \sqrt{0.25} = 0.5$.
Then
$P(\bar{x} > 31) = P\left(z > \frac{31 - 30}{0.5}\right) = P(z > 2) = 0.5 - 0.4772 = .0228$

Example: If a population has a mean of 30, a variance of 25, and a sample of 100 is drawn from the population, what is the probability that the sample mean will be in error by at most one unit from the true mean?

Solution: For the error to be at most one unit, the condition $|\bar{x} - \mu| \le 1$ must be satisfied.
$P(|\bar{x} - \mu| \le 1) = P(29 \le \bar{x} \le 31) = P\left(\frac{29 - 30}{0.5} \le z \le \frac{31 - 30}{0.5}\right) = P(-2 \le z \le 2) = .4772 + .4772 = .9544$
Thus, if a sample of size 100 is drawn from the population given in the problem, the probability that the sample mean will be within one unit of the population mean is .9544.
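The two probabilities above can also be computed without a z-table. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are illustrative, and the normal model for $\bar{x}$ is the Central Limit Theorem approximation used in the solutions.

```python
from math import sqrt
from statistics import NormalDist

mu = 30      # population mean
sigma2 = 25  # population variance
n = 100      # sample size

# By the Central Limit Theorem, x-bar is approximately normal
# with mean mu and variance sigma2 / n.
x_bar = NormalDist(mu=mu, sigma=sqrt(sigma2 / n))

p_above_31 = 1 - x_bar.cdf(31)              # P(x-bar > 31)
p_within_1 = x_bar.cdf(31) - x_bar.cdf(29)  # P(29 <= x-bar <= 31)

print(f"standard deviation of x-bar = {x_bar.stdev}")
print(f"P(x-bar > 31)               = {p_above_31:.4f}")
print(f"P(29 <= x-bar <= 31)        = {p_within_1:.4f}")
```

Because a z-table rounds each area to four decimal places, the table-based answers can differ from these values in the last digit.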
Section 9.8 The Distribution of the Sample Proportion

Objective:
• To apply the Central Limit Theorem to population proportions.

Definition:
• Population proportion ($p$): the percentage of a population that has a certain characteristic.
• Sample proportion ($\hat{p}$): the percentage of a sample that has a certain characteristic.
• Notation: $\hat{p}$ is pronounced "p-hat".
• $\hat{p}$ is defined by the formula $\hat{p} = \frac{x}{n}$, where x is the number in the sample possessing the characteristic of interest and n is the sample size.

Attributes of Sample Proportions:
• Central value of the variable (expected value).
• Variability of the variable (variance).
• Familiar pattern of the variable (distribution).

Sample Proportions:
• The expected value of the sample proportion, $\hat{p}$, is the population proportion, $p$. Symbolically this is expressed as $E(\hat{p}) = p$.
• The variance of $\hat{p}$ is given by $\sigma_{\hat{p}}^2 = \frac{p(1 - p)}{n}$. Since $\hat{p}$ is the estimator for $p$, the variance of the sample proportion is estimated as $\sigma_{\hat{p}}^2 \approx \frac{\hat{p}(1 - \hat{p})}{n}$.

Sampling Distribution:
• The sampling distribution of $\hat{p}$ approaches normality as n becomes sufficiently large. The sample size is generally considered sufficiently large if $np \ge 5$ and $n(1 - p) \ge 5$.
[Figure: sampling distribution of $\hat{p}$]

Sampling Distribution of the Sample Proportion
• If the population is infinite and the sample is sufficiently large, the distribution of $\hat{p}$ has the following characteristics:
1. An approximately normal distribution.
2. $\mu_{\hat{p}} = E(\hat{p}) = p$ (The mean of the sample proportions equals the population proportion.)
3. $\sigma_{\hat{p}}^2 = \frac{p(1 - p)}{n}$, so that $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$.

• If the population is finite and the sample is sufficiently large, the distribution of $\hat{p}$ has the following characteristics:
1. An approximately normal distribution.
2. $\mu_{\hat{p}} = E(\hat{p}) = p$ (The mean of the sample proportions equals the population proportion.)
3. $\sigma_{\hat{p}}^2 = \left(\frac{N - n}{N - 1}\right)\frac{p(1 - p)}{n}$, so that $\sigma_{\hat{p}} = \sqrt{\left(\frac{N - n}{N - 1}\right)\frac{p(1 - p)}{n}}$, where N is the size of the population.

Example: Suppose a sample of 400 persons is used to perform a taste test. If the true fraction in the population that prefers Pepsi is really .5, what is the probability that less than .44 of the persons in the sample will prefer Pepsi?

Solution: Assume the distribution of $\hat{p}$ is normal with $E(\hat{p}) = .5$, $\sigma_{\hat{p}}^2 = \frac{.5(1 - .5)}{400} = 0.000625$, and $\sigma_{\hat{p}} = .025$. Then
$P(\hat{p} < .44) = P\left(z < \frac{.44 - .5}{.025}\right) = P(z < -2.4) = .5 - .4918 = .0082$

Example: Suppose a sample of 500 is used to estimate the fraction of voters that favor a particular candidate. If the population proportion that favors the candidate is really .4, what is the probability that the error of estimation will be less than .05?

Solution: Since the true value of the population proportion is .4, the value of $\hat{p}$ must fall between .35 and .45 in order for the error to be less than .05. Here
$E(\hat{p}) = .4$, $\sigma_{\hat{p}}^2 = \frac{.4(1 - .4)}{500} = .00048$, and $\sigma_{\hat{p}} = .0219$.
To find the probability that $\hat{p}$ is within .05 of the true proportion, we must find
$P(p - .05 < \hat{p} < p + .05) = P(.35 < \hat{p} < .45)$
Using the z-transformation,
$P\left(\frac{.35 - .40}{.0219} < z < \frac{.45 - .40}{.0219}\right) = P(-2.283 < z < 2.283) = .4887 + .4887 = .9774$
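The taste-test and voter examples follow the same pattern, which can be sketched in a few lines. This is an illustrative sketch rather than part of the slides: the helper `p_hat_distribution` is a hypothetical name, it assumes an infinite population, and it applies the $np \ge 5$ and $n(1 - p) \ge 5$ check before using the normal approximation.

```python
from math import sqrt
from statistics import NormalDist


def p_hat_distribution(p, n):
    """Normal approximation to the sampling distribution of p-hat.

    Assumes an infinite population, so the variance is p(1 - p) / n.
    """
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("sample too small for the normal approximation")
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))


# Taste test: n = 400, p = .5, probability that p-hat is below .44.
taste_test = p_hat_distribution(p=0.5, n=400)
p_below_44 = taste_test.cdf(0.44)

# Voter poll: n = 500, p = .4, probability that the error is below .05.
poll = p_hat_distribution(p=0.4, n=500)
p_error_below_05 = poll.cdf(0.45) - poll.cdf(0.35)

print(f"taste test: std dev = {taste_test.stdev:.4f}, "
      f"P(p-hat < .44) = {p_below_44:.4f}")
print(f"voter poll: std dev = {poll.stdev:.4f}, "
      f"P(.35 < p-hat < .45) = {p_error_below_05:.4f}")
```

The results should agree with the table-based answers to about three decimal places; small differences come from rounding z and the table areas.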