7-1 The Excel NORMDIST Function Computes the cumulative probability to the value X NORMDIST X Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. 7-2 The Excel NORMDIST Function Computes the cumulative probability to the value X No need to convert into a Z value Needs the mean, and standard distribution Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. 7-3 The Excel NORMINV Function To compute an X value that has a cumulative probability of a certain amount If mean is 100 and the standard deviation is 20, the X value with a cumulative probability of .95 is 132.9. Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc.. Business Statistics: A First Course 5th Edition Chapter 7 Sampling and Sampling Distributions Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. 7-4 7-5 Why Sample? Why not use the entire population? Selecting a sample is less time-consuming than selecting every item in the population (census). Selecting a sample is less costly than selecting every item in the population. An analysis of a sample is less cumbersome and more practical than an analysis of the entire population. Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-6 A Sampling Process Begins With A Sampling Frame The sampling frame is a listing of items that make up the population Sampling Frame For Population With 850 Items Frames are data sources such as population lists, Item Name directories, or maps Bev R. Ulan X. . . Joan T. Paul F. Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. Item # 001 002 . . 849 850 A Sampling Process Begins With A Sampling Frame The sampling frame is a listing of items that make up the population Inaccurate or biased results can result if a frame excludes certain portions of the population Using different frames to generate data can lead to dissimilar conclusions Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-7 7-8 Types of Samples Samples Non-Probability Samples Judgment Convenience Probability Samples Simple Random Stratified Systematic Cluster In nonprobability sampling, items not are selected by probability. Statistical inference methods cannot be used. Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-9 Types of Samples Samples In judgment sampling, experts are selected. Non-Probability Probability Samples May Samples not be able to generalize to the general public. Judgment Convenience Simple Random Stratified Systematic Cluster In convenience sampling, items are easy, quickly, convenient, or convenient to sample. Surveys of users visiting a website. Representative of what population? Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-10 Types of Samples Samples Non-Probability In probability sampling, Samples you select frame items based on probabilities. To be used whenever Judgment Convenience possible to get unbiased inferences about the population of interest. Probability Samples Simple Random Stratified Systematic Cluster There are four types of probability sampling methods. We will look at two. Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. Probability Sample: Simple Random Sample 7-11 Every individual or item from the frame has an equal chance of being selected Selection may be with replacement (selected individual is returned to frame for possible reselection) or without replacement (selected individual isn’t returned to the frame). Samples obtained from table of random numbers or computer random number generators. Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. Probability Sample: Simple Random Sample The book describes tables of random numbers 7-12 This is prehistoric. The Excel RAND() function Randomly generates a number between 0 and 1 If you have a sampling frame numbered 1 to 850, create a table each cell having the equation =850*RAND() A random number between 0 and 500 will appear in each cell Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. Probability Sample: Simple Random Sample The Excel RAND() function If you make any changes to the worksheet, the numbers will be recalculated. To prevent this, Select the range holding the random numbers Give the command to copy them Select the range again Give the command to paste special Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-13 Probability Sample: Stratified Sample 7-14 Divide population into two or more subgroups (called strata) according to some common characteristic A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes Population Divided into 4 strata Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. Probability Sample: Stratified Sample 7-15 Samples from subgroups are combined into one This is a common technique when sampling population of voters, stratifying across racial or socio-economic lines. Sampling employees of different levels in an organization. Population Divided into 4 strata Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-16 Evaluating a Survey Design What is the purpose of the survey? Is the survey based on a probability sample? How will various errors be controlled? Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-17 Types of Survey Errors Coverage error or selection bias Exists if some groups are excluded from the frame and have no chance of being selected Example: Calling households during the daytime will exclude many working people Always consider who is being undercounted Excluded from frame Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-18 Types of Survey Errors Non response error or bias People who do not respond may be different from those who do respond Income level Ethnic background Busier Try multiple times to reach nonrespondents Do selective follow-up to determine differences Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-19 Types of Survey Errors Sampling error Variation from sample to sample will always exist Increasing sample size minimizes this Random differences from sample to sample Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-20 Types of Survey Errors Measurement error Problems in the design of questions Leading questions Ambiguous questions Hawthorne effect: Answering to please the interviewer Can be interpreted in different ways Questions the respondent cannot answer Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-21 Ethics Purposely create survey errors Coverage errors Nonresponse Errors Purposely exclude certain groups to give more favorable results Ignore nonrespondents, whose answers you probably will not like Measurement Errors Asking leading questions Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-22 Sampling Distributions A sampling distribution is a distribution of all of the possible values of a sample statistic for a given size sample selected from a population. Sample to find the mean age for college students. If you did this for N times, you would get N sample statistic values for the mean. The totality of these values would constitute the sampling distribution. μ Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. x 7-23 Sampling Distributions A sampling distribution is a distribution of all of the possible values of a sample statistic for a given size sample selected from a population. Most sample values will be close to the population mean P(X) .3 .2 .1 0 Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 18 19 20 21 22 23 24 _ X Sample Mean Sampling Distribution: 7-24 Standard Error of the Mean A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean: (This assumes that sampling is with replacement or sampling is without replacement from an infinite population) σX σ n Note that the standard error of the mean decreases as the sample size increases Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. Sample Mean Sampling Distribution: If the Population is Normal 7-25 If a population is normally distributed with mean μ and standard deviation σ, the sampling distribution of X is also normally distributed with μX μ Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. σ σX n Z-value for Sampling Distribution of the Mean Z-value for the sampling distribution of X : Z where: ( X μX ) σX ( X μ) σ n X = sample mean μ = population mean σ = population standard deviation n = sample size Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-26 7-27 Sampling Distribution Properties μx μ Normal Population Distribution μ (i.e. x is unbiased ) Normal Sampling Distribution (has the same mean) μx Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. x x 7-28 Sampling Distribution Properties (continued) As n increases, Larger sample size σ x decreases Smaller sample size μ Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. x Determining An Interval Including A Fixed Proportion of the Sample Means 7-29 Find a symmetrically distributed interval around µ that will include 95% of the sample means when µ = 368, σ = 15, and n = 25. Since the interval contains 95% of the sample means 5% of the sample means will be outside the interval Since the interval is symmetric 2.5% will be above the upper limit and 2.5% will be below the lower limit. From the standardized normal table, the Z score with 2.5% (0.0250) below it is -1.96 and the Z score with 2.5% (0.0250) above it is 1.96. Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. Determining An Interval Including A 7-30 Fixed Proportion of the Sample Means (continued) Calculating the lower limit of the interval σ 15 XL μ Z 368 (1.96) 362.12 n 25 Calculating the upper limit of the interval σ 15 XU μ Z 368 (1.96) 373.88 n 25 95% of all sample means of sample size 25 are between 362.12 and 373.88 Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-31 Sample Mean Sampling Distribution: If the Population is not Normal We can apply the Central Limit Theorem: Even if the population is not normal, …sample means from the population will be approximately normal as long as the sample size is large enough. Properties of the sampling distribution: μx μ σ σx n Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-32 How Large is Large Enough? For most distributions, n > 30 will give a sampling distribution that is nearly normal For fairly symmetric distributions, n > 15 will usually give a sampling distribution is almost normal For normal population distributions, the sampling distribution of the mean is always normally distributed Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-33 Example Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected. What is the probability that the sample mean is between 7.8 and 8.2? Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-34 Example (continued) Solution: Even if the population is not normally distributed, the central limit theorem can be used (n > 30) … so the sampling distribution of approximately normal x is … with mean μx = 8 σ 3 0.5 …and standard deviation σ x n 36 Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-35 Example (continued) Solution (continued): 7.8 - 8 X -μ 8.2 - 8 P(7.8 X 8.2) P 3 σ 3 36 n 36 P(-0.4 Z 0.4) 0.3108 Population Distribution ??? ? ?? ? ? ? ? ? μ8 Sampling Distribution Standard Normal Distribution Sample .1554 +.1554 Standardize ? X Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7.8 μX 8 8.2 x -0.4 μz 0 0.4 Z 7-36 Population Proportions π = the proportion of the population having some characteristic p Sample proportion ( p ) provides an estimate of π: X number of items in the sample having the characteristic of interest n sample size 0≤ p≤1 p is approximately distributed as a normal distribution when n is large (assuming sampling with replacement from a finite population or without replacement from an infinite population) Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-37 Sampling Distribution of p Approximated by a normal distribution if: Sampling Distribution .3 .2 .1 0 nπ 5 and 0 n(1 π ) 5 where P( ps) μp π and .2 .4 π(1 π ) σp n (where π = population proportion) Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. .6 8 1 p 7-38 Z-Value for Proportions Standardize p to a Z value with the formula: p Z σp Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. p (1 ) n 7-39 Example If the true proportion of voters who support Proposition A is π = 0.4, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45? i.e.: if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45) ? Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-40 Example (continued) if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45) ? Find σ p : σ p (1 ) n 0.4(1 0.4) 0.03464 200 0.45 0.40 0.40 0.40 Convert to P(0.40 p 0.45) P Z standardized 0.03464 0.03464 normal: P(0 Z 1.44) Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. 7-41 Example (continued) if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45) ? Use standardized normal table: P(0 ≤ Z ≤ 1.44) = 0.4251 Standardized Normal Distribution Sampling Distribution 0.4251 Standardize 0.40 0.45 Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.. p 0 1.44 Z 7-42 Chapter Summary Probability and nonprobability samples Two common probability samples Examined survey worthiness and types of errors Introduced sampling distributions Described the sampling distribution of the mean For normal populations Using the Central Limit Theorem for any distribution Described the sampling distribution of a proportion Calculated probabilities using sampling distributions Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc..