Sampling Methods and Sampling Distributions

Learning Objectives
  Explain Types of Samples
  Describe the Properties of Estimators
  Explain Sampling Distribution
  Describe the Relationship between Populations & Sampling Distributions
  State the Central Limit Theorem
  Solve Probability Problems Involving Sampling Distributions

Sampling Methods

Types of Samples
  Type of Sample
    Nonprobability: Judgment, Quota, Chunk
    Probability: Simple Random, Systematic, Stratified, Cluster

Simple Random Sample
1. Each population element has an equal chance of being selected
2. Selecting 1 subject does not affect selecting others
3. May use random number table, lottery, 'fish bowl'

Random Number Table
  Row   Column
        00000  00001  11111  11111
        12345  67890  12345  67890
  01    49280  88924  35779  00283
  02    61870  41657  07468  08612
  03    43898  65923  25078  86129

Systematic Sample
1. Items of the population are arranged in some way, e.g., alphabetically or by date received
2. Every kth element is selected after a random start within the first k elements
3. Used in telephone surveys
© 1984-1994 T/Maker Co.

Stratified Sample
1. Divide the population into subgroups
     Mutually exclusive
     Collectively exhaustive
     At least 1 common characteristic of interest
   [Figure: All Students divided into subgroups such as Commuters]
2.
Select simple random samples from the subgroups
   [Figure: a sample is drawn from each subgroup, e.g., Residents]

Cluster Sample
  Divide the population into clusters
    If managers are elements, then companies are clusters
  Select clusters randomly
  Survey all or a random sample of the elements in each selected cluster
  [Figure: Companies (Clusters) and the Sample drawn from them]

Nonprobability Samples
1. Judgment
     Use experience to select the sample, e.g., test markets
2. Quota
     Similar to stratified sampling except no random sampling
3. Chunk (Convenience)
     Use the elements most available

Errors Due to Sampling
  Sampling Error: occurs because a sample is taken instead of a census
    Errors are due to chance
    Equally likely to be too high or too low
    Reduced by increasing sample size
  Nonsampling Error: Bias
    A directional error
    Cannot be reduced by increasing sample size

Sampling Distributions

Statistical Methods
  Descriptive Statistics
  Inferential Statistics

Inferential Statistics
  Involves
    Estimation
    Hypothesis Testing
  Purpose
    Make decisions about population characteristics

Inference Process
  [Figure: Population -> Sample -> Sample Statistic ($\bar{X}$, $P$) -> Estimates & Tests -> Population]

Estimators
1. Random variables used to estimate a population parameter
     Sample mean, sample proportion, sample median
2. The sample mean $\bar{X}$ is an estimator of the population mean $\mu$
     If $\bar{X} = 3$, then 3 is the estimate of $\mu$
3.
The theoretical basis is the sampling distribution

Properties of Mean
  Unbiasedness
    Mean of the sampling distribution equals the population mean
  Efficiency
    Sample mean comes closer to the population mean than any other unbiased estimator
  Consistency
    As sample size increases, variation of the sample mean from the population mean decreases

Unbiasedness
  [Figure: P($\bar{X}$) against $\bar{X}$ for an unbiased estimator (A), centered at $\mu_{\bar{X}} = \mu_X$, and a biased estimator (C)]

Efficiency
  [Figure: P($\bar{X}$) against $\bar{X}$; sampling distribution of the mean (B) and sampling distribution of the median (A), both centered at $\mu_X$]

Consistency
  [Figure: P($\bar{X}$) against $\bar{X}$; larger sample size (B) and smaller sample size (A), both centered at $\mu_X$]

Sampling Distribution
  Theoretical probability distribution
  Random variable is a sample statistic
    Sample mean, sample proportion, etc.
  Results from drawing all possible samples of a fixed size
  List of all possible [$\bar{X}$, P($\bar{X}$)] pairs
    Sampling distribution of the mean

Developing Sampling Distributions
Suppose there's a population ...
  Population size, N = 4
  Random variable, X, is # errors in work
  Values of X: 1, 2, 3, 4
  Uniform distribution

Population Mean and Standard Deviation
  X (# of errors)    $\mu$    (X - $\mu$)    (X - $\mu$)^2
        1            2.5       -1.5           2.25
        2            2.5       -0.5           0.25
        3            2.5        0.5           0.25
        4            2.5        1.5           2.25
  Total 10                                    5.00

  $$\mu = \frac{10}{4} = 2.5 \qquad \sigma = \sqrt{\frac{5}{4}} = 1.12$$

Population Characteristics
  Summary Measures
  $$\mu_X = \frac{\sum_{i=1}^{N} X_i}{N} = 2.5$$
  $$\sigma_X = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu_X)^2}{N}} = 1.12$$
  Population Distribution
  [Figure: P(X) against X, uniform over X = 1, 2, 3, 4]
All Possible Samples of Size n = 2 (Sample With Replacement)

  16 Samples
                  2nd Observation
  1st Obs      1      2      3      4
     1        1,1    1,2    1,3    1,4
     2        2,1    2,2    2,3    2,4
     3        3,1    3,2    3,3    3,4
     4        4,1    4,2    4,3    4,4

  16 Sample Means
                  2nd Observation
  1st Obs      1      2      3      4
     1        1.0    1.5    2.0    2.5
     2        1.5    2.0    2.5    3.0
     3        2.0    2.5    3.0    3.5
     4        2.5    3.0    3.5    4.0

Sampling Distribution of All Sample Means
  $\bar{X}$     f     P($\bar{X}$)
  1.0      1     1/16
  1.5      2     2/16
  2.0      3     3/16
  2.5      4     4/16
  3.0      3     3/16
  3.5      2     2/16
  4.0      1     1/16
  [Figure: histogram of P($\bar{X}$) against $\bar{X}$ = 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]

  $\bar{X}$     $\mu_{\bar{X}}$    ($\bar{X}$ - $\mu_{\bar{X}}$)    ($\bar{X}$ - $\mu_{\bar{X}}$)^2
  1.0     2.5      -1.5        2.25
  1.5     2.5      -1.0        1.00
  1.5     2.5      -1.0        1.00
  2.0     2.5      -0.5        0.25
  2.0     2.5      -0.5        0.25
  2.0     2.5      -0.5        0.25
  2.5     2.5       0.0        0.00
  2.5     2.5       0.0        0.00
  2.5     2.5       0.0        0.00
  2.5     2.5       0.0        0.00
  3.0     2.5       0.5        0.25
  3.0     2.5       0.5        0.25
  3.0     2.5       0.5        0.25
  3.5     2.5       1.0        1.00
  3.5     2.5       1.0        1.00
  4.0     2.5       1.5        2.25
  Total 40                    10.00

Summary Measures of All Possible Sample Means
Here N = 16 is the number of possible samples, not the population size.
  $$\mu_{\bar{X}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{1.0 + 1.5 + \cdots + 4.0}{16} = \frac{40}{16} = 2.5$$
  $$\sigma_{\bar{X}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(1.0 - 2.5)^2 + (1.5 - 2.5)^2 + \cdots + (4.0 - 2.5)^2}{16}} = \sqrt{\frac{10}{16}} = .79$$

Comparison of Population & Sampling Distribution
  [Figure: population distribution P(X) over X = 1, 2, 3, 4 beside sampling distribution P($\bar{X}$) over $\bar{X}$ = 1, 1.5, 2, 2.5, 3, 3.5, 4]
  Both distributions have the same mean: $\mu_X = 2.5$ and $\mu_{\bar{X}} = 2.5$.
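The enumeration of all 16 samples can be reproduced in a few lines. The following is a minimal Python sketch, not part of the original slides; the variable names (`population`, `sample_means`, `distribution`) are illustrative assumptions. It lists every ordered sample of size 2 drawn with replacement, tabulates the sampling distribution of the mean, and compares the spread of the sample means with the population standard deviation divided by the square root of n.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import sqrt

population = [1, 2, 3, 4]  # values of X (# of errors), N = 4
n = 2                      # sample size, sampling with replacement

# Population mean and standard deviation (divide by N, not N - 1)
mu = sum(population) / len(population)
sigma = sqrt(sum((x - mu) ** 2 for x in population) / len(population))

# All 4**2 = 16 ordered samples and the mean of each one
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Sampling distribution: each distinct sample mean with its probability
# (Fraction reduces automatically, so 4/16 is shown as 1/4)
counts = Counter(sample_means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

# Mean and standard deviation of the 16 sample means
mu_xbar = sum(sample_means) / len(sample_means)
sigma_xbar = sqrt(sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means))

for mean, prob in distribution.items():
    print(mean, prob)
print("population:    ", round(mu, 2), round(sigma, 2))
print("sample means:  ", round(mu_xbar, 2), round(sigma_xbar, 2))
print("sigma / sqrt(n):", round(sigma / sqrt(n), 2))
```

Exact fractions are used for the probabilities so that they can be compared directly with the f/16 column of the table.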
  The standard deviations differ: $\sigma_X = 1.12$ for the population and $\sigma_{\bar{X}} = .79$ for the sampling distribution.

Standard Error of Mean (Standard Deviation of the Sampling Distribution of Means)
  Standard deviation of all possible sample means, $\bar{X}$
    Measures scatter in all sample means, $\bar{X}$
  Less than the population standard deviation
  Formula (sampling with replacement)
  $$\sigma_{\bar{X}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \frac{\sigma_X}{\sqrt{n}}$$

Sampling Distribution of the Sample Means Summary
  $$\mu_{\bar{X}} = \mu_X \qquad \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$$
  Sampling is done with replacement, or the population is infinite, or n/N < .05

Sampling from Normal Populations
  Central Tendency: $\mu_{\bar{X}} = \mu_X$
  Dispersion: $\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$ (sampling with replacement)
  [Figure: Population Distribution with $\mu_X = 50$, $\sigma_X = 10$; Sampling Distribution with $\mu_{\bar{X}} = 50$, where n = 4 gives $\sigma_{\bar{X}} = 5$ and n = 16 gives $\sigma_{\bar{X}} = 2.5$]

Standardizing Sampling Distribution of Mean
  $$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu_X}{\sigma_X / \sqrt{n}}$$
  [Figure: Sampling Distribution (mean $\mu_{\bar{X}}$, standard deviation $\sigma_{\bar{X}}$) mapped to the Standardized Normal Distribution ($\mu_Z = 0$, $\sigma_Z = 1$)]

Thinking Challenge
You're an operations analyst for AT&T. Long-distance telephone calls are normally distributed with $\mu_X$ = 8 min. and $\sigma_X$ = 2 min. If you select random samples of 25 calls, what percentage of the sample means would be between 7.8 and 8.2 minutes?

Sampling Distribution Solution*
  $$Z = \frac{\bar{X} - \mu_X}{\sigma_X / \sqrt{n}} = \frac{7.8 - 8}{2 / \sqrt{25}} = -.50$$
  $$Z = \frac{\bar{X} - \mu_X}{\sigma_X / \sqrt{n}} = \frac{8.2 - 8}{2 / \sqrt{25}} = .50$$
  [Figure: Sampling Distribution with $\sigma_{\bar{X}} = .4$, shaded between 7.8 and 8.2; Standardized Normal Distribution with $\sigma_Z = 1$, area .1915 on each side of 0 between -.50 and .50, total .3830]
  So 38.30% of the sample means fall between 7.8 and 8.2 minutes.

Sampling from Non-Normal Populations
  Central Tendency: $\mu_{\bar{X}} = \mu_X$
  Dispersion: $\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$ (sampling with replacement)
  [Figure: non-normal Population Distribution with $\mu_X = 50$, $\sigma_X = 10$; Sampling Distribution with $\mu_{\bar{X}} = 50$, where n = 4 gives $\sigma_{\bar{X}} = 5$ and n = 30 gives $\sigma_{\bar{X}} = 1.8$]

Central Limit Theorem
For a population with a mean $\mu$ and a standard deviation $\sigma$, the sampling distribution of the means of all possible samples of size n generated from the population will be approximately normally distributed, assuming that the sample size is sufficiently large.

Central Limit Theorem
As sample size gets large enough (n ≥ 30) ...
sampling distribution becomes almost normal.
  [Figure: sampling distribution of $\bar{X}$]

Central Limit Theorem
  The sampling distribution of means is a normal distribution if the population is normally distributed
  Even if the population is not normally distributed, the sampling distribution of means is approximated by a normal distribution for large n (n > 30)

Proportions
  Categorical variable (e.g., gender)
    % of population having a characteristic
  If two outcomes, binomial distribution
    Possess / don't possess characteristic
  Sample proportion formula:
  $$P = \frac{X}{n} = \frac{\text{number of successes}}{\text{sample size}}$$

Sampling Distribution of Proportion
  Approximated by a normal distribution when
    $n \cdot p \ge 5$
    $n \cdot (1 - p) \ge 5$
  Mean
    $\mu_P = p$
  Standard Error
    $\sigma_P = \sqrt{\dfrac{p(1 - p)}{n}}$, where p = population proportion
  [Figure: Sampling Distribution, P(P) against P from .0 to 1.0]

Standardizing Sampling Distribution of Proportion
  $$Z \cong \frac{P - \mu_P}{\sigma_P} = \frac{P - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
  [Figure: Sampling Distribution (mean $\mu_P$, standard deviation $\sigma_P$) mapped to the Standardized Normal Distribution ($\mu_Z = 0$, $\sigma_Z = 1$)]

Thinking Challenge
You're manager of a bank. 40% of depositors have multiple accounts. You select a random sample of 200 customers. What is the probability that the sample proportion of depositors with multiple accounts would be between 40% and 43%?
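The standardization of a sample proportion can be sketched as a small helper. This is an illustrative Python sketch under the normal approximation described above, not something taken from the slides; the function names `normal_cdf` and `proportion_between` are hypothetical, and the numbers in the usage line are made up and deliberately differ from the challenge.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative probability, computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def proportion_between(p, n, lo, hi):
    """Normal-approximation probability that the sample proportion lies in [lo, hi].

    p is the population proportion and n the sample size. No finite
    population correction is applied here.
    """
    # Rule of thumb from the slides: n*p >= 5 and n*(1 - p) >= 5
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("normal approximation needs n*p >= 5 and n*(1-p) >= 5")
    se = sqrt(p * (1 - p) / n)   # standard error of the proportion
    z_lo = (lo - p) / se         # standardize both limits
    z_hi = (hi - p) / se
    return normal_cdf(z_hi) - normal_cdf(z_lo)

# Illustrative numbers only: p = 0.5, n = 100, limits one standard error
# either side of p, so this is the area between Z = -1 and Z = +1.
prob = proportion_between(0.5, 100, 0.45, 0.55)
print(round(prob, 4))
```

The helper returns an area computed from the error function, so its results can differ in the last digit from values read off a printed Z table with Z rounded to two decimals.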
Solution*
  $P(.40 \le P \le .43)$
  Conditions: $n \cdot p = 200(.40) = 80 \ge 5$ and $n \cdot (1 - p) = 200(.60) = 120 \ge 5$
  $$Z \cong \frac{P - p}{\sqrt{\dfrac{p(1 - p)}{n}}} = \frac{.43 - .40}{\sqrt{\dfrac{.40(1 - .40)}{200}}} = .87$$
  [Figure: Sampling Distribution with $\mu_P = .40$, $\sigma_P = .0346$, shaded between .40 and .43; Standardized Normal Distribution with $\mu_Z = 0$, $\sigma_Z = 1$, area .3078 between 0 and .87]
  So $P(.40 \le P \le .43) = .3078$.

Sampling from Finite Populations
  Modify the standard error if the sample size (n) is large relative to the population size (N)
    n > .05·N (or n/N > .05)
  Use the finite population correction (fpc) factor for standard errors if n/N > .05
  $$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
  $$\sigma_P = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$$

Sampling Distribution of the Sample Means Summary
  $\mu_{\bar{X}} = \mu_X$
  $\sigma_{\bar{X}} = \dfrac{\sigma_X}{\sqrt{n}}$
    Sampling is done with replacement, or the population is infinite, or n/N < .05
  $\sigma_{\bar{X}} = \dfrac{\sigma_X}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$
    Sampling is without replacement, and the population is finite, and n/N > .05

Thinking Challenge
You're manager of a bank. 40% of all 1000 depositors have multiple accounts. You select a random sample of 200 customers. What is the probability that the sample proportion of depositors with multiple accounts would be between 40% and 43%?

Solution*
  $P(.40 \le P \le .43)$
  $$Z \cong \frac{P - p}{\sqrt{\dfrac{p(1 - p)}{n}} \sqrt{\dfrac{N - n}{N - 1}}} = \frac{.43 - .40}{\sqrt{\dfrac{.40(1 - .40)}{200}} \sqrt{\dfrac{1000 - 200}{1000 - 1}}} = .97$$
  [Figure: Sampling Distribution with $\mu_P = .40$, $\sigma_P = .0310$, shaded between .40 and .43; Standardized Distribution with $\mu_Z = 0$, $\sigma_Z = 1$, area .3340 between 0 and .97]
  So $P(.40 \le P \le .43) = .3340$.

Selecting a Sample Size
The sample size depends on:
  The degree of confidence selected
  The maximum allowable error
  The population standard deviation

Sample Size for Means
  $$n = \frac{z^2 \sigma^2}{E^2} = \left( \frac{z \sigma}{E} \right)^2$$
  E is the allowable error
  z is the z score associated with the degree of confidence
  $\sigma$ is the population standard deviation

The marketing manager would like to estimate the population mean annual usage of home heating oil to within 50 gallons of the true value and desires to be 95% confident of correctly estimating the true mean. Based on a previous study taken last year, the marketing manager feels that the standard deviation can be estimated as 325 gallons. What is the sample size needed to obtain these results?
  Confidence = 95%, so z = 1.96
  E = 50
  $\sigma$ = 325
  $$n = \frac{z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (325)^2}{(50)^2}$$
  $$n = \frac{(3.8416)(105{,}625)}{2{,}500} = 162.31$$
  Rounding up, n = 163 homes need to be sampled.

Sample Size for Proportions
  $$n = \frac{p(1 - p) z^2}{E^2}$$
  E is the maximum allowable error
  z is the z value associated with the degree of confidence
  p is the estimated proportion

A political pollster would like to estimate the proportion of voters who will vote for the Democratic candidate in a presidential campaign. The pollster would like 95% confidence that her prediction is correct to within .04 of the true proportion. What sample size is needed?
  Confidence = 95%, so z = 1.96
  E = .04
  p is unknown, so use p = .5
  $$n = \frac{p(1 - p) z^2}{E^2} = \frac{.5(1 - .5)(1.96)^2}{(.04)^2} = 600.25$$
  Rounding up, n = 601 voters.

Conclusion
  Examined Sampling Methods
  Described the Properties of Estimators
  Explained Sampling Distribution
  Described the Relationship between Populations & Sampling Distributions
  Stated the Central Limit Theorem
  Solved Probability Problems Involving Sampling Distributions
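As a companion to the two sample-size formulas, here is a minimal Python sketch, added for illustration and not part of the original slides. The function names are hypothetical. `z_for_confidence` uses the standard library's `statistics.NormalDist` to look up the two-sided z value, and both sample-size functions round up, because a fractional observation cannot be sampled.

```python
from math import ceil
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided z value for a confidence level, e.g. 0.95 gives about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

def sample_size_mean(z, sigma, error):
    """n = (z * sigma / E)^2, rounded up to the next whole unit."""
    return ceil((z * sigma / error) ** 2)

def sample_size_proportion(z, error, p=0.5):
    """n = p(1 - p) z^2 / E^2, rounded up; p = 0.5 is the most conservative choice."""
    return ceil(p * (1 - p) * z ** 2 / error ** 2)

# Heating-oil example: 95% confidence, E = 50 gallons, sigma = 325 gallons
print(sample_size_mean(1.96, 325, 50))

# Polling example: 95% confidence, E = .04, p unknown so p = .5
print(sample_size_proportion(1.96, 0.04))

# The z value itself can be looked up instead of read from a table
print(round(z_for_confidence(0.95), 2))
```

Defaulting p to 0.5 maximizes p(1 - p), so the resulting sample size is large enough whatever the true proportion turns out to be.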