Business Statistics: A Decision-Making Approach 8th Edition Chapter 7 Introduction to Sampling Distributions 7-1 Chapter Goals After completing this chapter, you should be able to: Define the concept of sampling error Determine the mean and standard deviation _ for the sampling distribution of the sample mean, x Determine the mean and standard deviation for the _ sampling distribution of the sample proportion, p Describe the Central Limit Theorem and its importance _ _ Apply sampling distributions for both x and p 7-2 Sampling Error Sample Statistics are used to estimate Population Parameters ex: X is an estimate of the population mean, μ Problems: Different samples provide different estimates of the population parameter Sample results have potential variability, thus sampling error exits 7-3 Calculating Sampling Error Sampling Error: The difference between a value (a statistic) computed from a sample and the corresponding value (a parameter) computed from a population Example: (for the mean) Sampling Error x - μ where: x sample mean μ population mean Always present just because you sample! 7-4 Example If the population mean is μ = 98.6 degrees and a sample of n = 5 temperatures yields a sample mean of x = 99.2 degrees, then the sampling error is x μ 99.2 98.6 0.6 degrees 7-5 Sampling Errors The sampling error may be positive or negative ( x may be greater than or less than μ) The size of the error depends on the sample selected i.e., a larger sample does not necessarily produce a smaller error if it is not a representative sample Download “Sampling Distributions using Excel” First sheet: “Sampling Error” 7-6 Sampling Distribution A sampling distribution is a distribution of all possible values of a statistic for a given sample size that has been randomly selected from a population. Download “Sampling Distributions using Excel” Second sheet: “Sampling Dist” 7-7 Sampling Distribution Properties For any population, The average value of all possible sample means computed from all possible random samples of a given size from the population is equal to the population mean (call “Unbiased Estimator”): See “Sampling Distributions using Excel” Considered an “unbiased” estimator μx μ Theorem 1 7-8 Sampling Distribution Properties (continued) The standard deviation of the possible sample means computed from all random samples of size n is equal to the population standard deviation divided by the square root of the sample size (call Standard Error): Try using “Sampling Distributions using Excel” Not Equal! Also called the standard error σ σx n Theorem 2 7-9 Finite Population Correction Virtually all survey research, sampling is conducted without replacement from populations that are of a finite size N. In these cases, particularly when the sample size n is not small in comparison with the population size N (i.e., more than 5% of the population is sampled) so that n/N > 0.05, a finite population correction factor (fpc) is used to define both the standard error of the mean and the standard error of the proportion. 7-10 Impact of fpc by Sample Size 7-11 Finite Population Correction Apply the Finite Population Correction (fpc) if: The sample size is greater than 5% of population size. Only with sampling without replacement Then (x μ) z σ Nn n N 1 See fpc by “Sampling Distributions using Excel” 7-12 Sampling Distribution Properties (continued) As the sample size is increased, the StdDev of the sampling distribution is reduced….. That is, the potential for extreme sampling error is reduced when larger sample size are used. Graphical illustration: next slide 7-13 Sampling Distribution Properties (continued) The value of increases): x becomes closer to μ as n Population x Small sample size As n increases, x σ x σ/ n decreases Larger sample size μ x 7-14 If the Population is Normal If a population is normal with mean μ and standard deviation σ, the sampling distribution of x is also normally distributed with μx μ and σ σx n Theorem 3 As n increases the data behaves more like a normal distribution 7-15 z-value for Sampling Distribution of x z-value for the sampling distribution of x: (x μ) z σ n where: x = sample mean μ = population mean σ = population standard deviation n = sample size 7-16 Example Suppose that a population is known to be normally distributed with μ = 2,000 and σ = 230 and random sample of size n = 8 is selected. Because the population is normally distributed, the sampling distribution for the mean will also be normally distributed. What is the probability that the sample mean will exceed 2,100? Convert to Z value Using Excel: 1-NORMSDIST(1.23) Answer on the website 7-17 If the Population is not Normal We can apply the Central Limit Theorem: Even if the population is not normal, …sample means from the population will be approximately normal as long as the sample size is large enough …and the sampling distribution will have μx μ and σ σx n Theorem 4 7-18 Central Limit Theorem If necessary, watch the Video As the sample size gets large enough… n↑ the sampling distribution becomes almost normal regardless of shape of population x 7-19 How Large is Large Enough? For most distributions, n > 30 will give a sampling distribution that is nearly normal For fairly symmetric distributions, n > 15 is sufficient For normal population distributions, the sampling distribution of the mean is always normally distributed 7-20 Example Suppose a population (not normally distributed) has mean μ = 8 and standard deviation σ = 3 and random sample of size n = 36 (greater than 30) is selected. What is the probability that the sample mean is between 7.8 and 8.2? 7-21 Example (continued) Solution (continued) -- find z-scores: μ μ 7.8 - 8 8.2 - 8 x P(7.8 μ x 8.2) P 3 σ 3 36 n 36 P(-0.4 z 0.4) 0.3108 Population Distribution ??? ? ?? ? ? ? ? ? μ8 Sampling Distribution Standard Normal Distribution Sample 0.1554 +0.1554 Standardize ? x 7.8 μx 8 8.2 x -0.4 μz 0 0.4 z 7-22 Sampling Distribution of a Proportion Try this by yourself! In many instances, the objective of sample is to estimate a population proportion. An accountant may be interested in determining the proportion of accounts payable balances that are correct. A production supervisor may wish to determine the percentage of product that is defect free. A marketing research department might want to know the proportion of potential customers who will purchase a prticular product. 7-23 Population Proportions Example If the true proportion of voters who support Proposition A is π (population proportion) = 0.4, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45? i.e.: if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45) ? 7-24 Example (continued) if π = .4 and n = 200, what is P(0.40 ≤ p ≤ 0.45) ? Find σ p : σ p π(1 π) .4(1 0.4) 0 0.03464 n 200 Convert to P(0.40 p 0.45) standard normal (z-values): 0.45 0.40 0.40 0.40 P 0 z0 .03464 .03464 P(0 z 1.44) 7-25 Example (continued) if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45) ? Use standard normal table: P(0 ≤ z ≤ 1.44) = 0.4251 Standardized Normal Distribution Sampling Distribution 0.4251 Standardize 0.40 0.45 p 0 1.44 z 7-26 Population Proportions, π π = the proportion of the population having some characteristic Sample proportion ( p ) provides an estimate of π : x number of successes in the sample p n sample size If two outcomes, p is a binomial distribution 7-27 Sampling Distribution of p Approximated by a normal distribution if: Sampling Distribution .3 .2 .1 0 nπ 5 n(1 π) 5 0 where μp π P( p ) and .2 .4 .6 π(1 π) σp n 8 1 p Theorem 5 (where π = population proportion) 7-28 z-Value for Proportions Standardize p to a z value with the formula: pπ z σp If sampling is without replacement and n is greater than 5% of the population size, then σ p must use the finite population correction factor: pπ π(1 π) n σp π(1 π) N n n N 1 7-29 Using the Sample Distribution for Proportions 1. Determine the population proportion, p 2. Calculate the sample proportion, p 3. Derive the mean and standard deviation of the sampling distribution 4. Define the event of interest 5. If np and n(1-p) are both > 5, then covert p to z-value 6. Use standard normal table (Appendix D) to determine the probability 7-30 Chapter Summary Discussed sampling error Introduced sampling distributions Described the sampling distribution of the mean For normal populations Using the Central Limit Theorem (normality unknown) Described the sampling distribution of a proportion Calculated probabilities using sampling distributions Discussed sampling from finite populations 7-31 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. 7-32