Estimating population characteristics with simple random sampling (Session 06)

SADC Course in Statistics

Learning Objectives

By the end of this session, you will be able to
• explain exactly what is meant by a simple random sample
• distinguish between “with” and “without” replacement sampling
• estimate the population mean or total from a sample selected by simple random sampling
• compute measures of precision for such an estimate, recognising the need for a finite population correction.

Simple random sampling: definition

• Strictly, simple random sampling is a procedure whereby every possible sample of size n has an equal chance of being selected.
• In practice, this is achieved by picking one unit at a time, at random and without replacement, from a list of population members.
• “Without replacement” means that once a unit is chosen, it is not returned to the population list until all the necessary units have been sampled.

An illustration

• Suppose the population size is N = 6, with the observable values of the six members being 10, 4, 17, 6, 8, 15. Suppose the values are observed accurately, without any error.
• Suppose we want a sample of size 2.
• How many possibilities are there to choose 2 out of 6 members?
• A list of all such pairs appears below.

Illustration continued…

(10, 4)    (4, 17)    (17, 6)    (6, 8)     (8, 15)
(10, 17)   (4, 6)     (17, 8)    (6, 15)
(10, 6)    (4, 8)     (17, 15)
(10, 8)    (4, 15)
(10, 15)

In simple random sampling, each of the 15 pairs above has an equal chance of selection, i.e. probability of selection = 1/15.

“With replacement” sampling

• A simple random sample is taken using “without replacement” sampling.
• “With replacement” sampling involves noting the value for the unit drawn and then returning the unit to the population before the next draw.
• This means that the same unit may be selected more than once!
• Is this sensible?

Note: In multi-stage sampling, there is often a valid reason for doing “with replacement” sampling at the first stage of selection. More on this later!

Estimation with SRS

Suppose a sample of size n, $(x_1, x_2, \ldots, x_n)$, is selected from a population of size N whose true mean is $\bar{X}$ ($\bar{X} = 10$ for our 6-member population).

Then the best estimator of the population mean is the sample mean, given by

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Note: Lower case letters will be used for sample values, and upper case for population values.

Variance of the SRS estimator

The variance of the sample mean is given by

$$V(\bar{x}) = \left(1 - \frac{n}{N}\right) \frac{S^2}{n}, \quad \text{where } S^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1}$$

$S^2 = 26.0$ using the population values 10, 4, 17, 6, 8, 15. Hence, for a sample of size n = 2,

$$V(\bar{x}) = \left(1 - \frac{2}{6}\right) \frac{26.0}{2} = 8.67$$

Notes concerning the variance

• Compared to the variance of a sample mean used in Module H2 (which assumes an infinite population), the formula here is the same except for the inclusion of the term (1 - n/N).
• This multiplier is called the finite population correction. It may be ignored if the population is very large relative to the sample, since n/N is then nearly zero.
• The quantity n/N is called the sampling fraction, often denoted by f. Thus f = n/N.

Example

• Suppose the sample values were 6 and 15.
• The population mean is then estimated by

$$\bar{x} = \frac{6 + 15}{2} = 10.5$$

• Its variance is estimated by

$$V(\bar{x}) = \left(1 - \frac{n}{N}\right) \frac{s^2}{n}, \quad \text{where } s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = 40.5,$$

i.e.

$$V(\bar{x}) = \left(1 - \frac{2}{6}\right) \frac{40.5}{2} = 13.5, \quad \text{std. error} = \sqrt{13.5} = 3.7$$

Estimating population total, $X_T$

• The appropriate estimate is given by

$$x_T = N \bar{x} = (6)(10.5) = 63 \text{ in the example above}$$

• The variance of this estimator is

$$V(x_T) = N^2 \, V(\bar{x}) = 6^2 (13.5) = 486$$

Hence std. error = $\sqrt{486}$ = 22.0.
• Confidence intervals for both the population mean and the population total can be obtained in the usual way (refer to methods covered in Module H2).

Estimating population proportion

• The results below are for use when the denominator for the proportion is fixed, e.g. the proportion of households (HHs) with at least one child aged 12-23 months. The denominator (total number of HHs) is fixed by the investigator.
• The appropriate estimate for the population proportion is the sample proportion p = r/n, where r = number of sampled units having the attribute and n = sample size.
• The standard error of this estimate is

$$\sqrt{\frac{(1-f)\, p(1-p)}{n-1}}, \quad \text{where } f = \frac{n}{N}$$

Further notes

• It is important not to confuse estimating a population proportion with estimating a population ratio.
• An example of a ratio is the ratio of male children to female children in the population.
• You will briefly meet the estimation of a ratio in the practical exercise “To the Woods”, done in Sessions 8, 9, 10.

Further notes

• In computing confidence intervals for the estimators considered above, large sample sizes are usually assumed, so z-values from a standard normal distribution are used, i.e.

$$\bar{x} \pm z_{\alpha} \sqrt{\frac{s^2 (1-f)}{n}}$$

• However, if n is small, t-values should be used, i.e.

$$\bar{x} \pm t_{\alpha,\,n-1} \sqrt{\frac{s^2 (1-f)}{n}}$$

Some practical work follows…
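
As a possible starting point for the practical work, the calculations in this session can be reproduced with a short Python sketch. It enumerates every sample of size 2 from the six-member population, checks that the variance of the sample mean over all samples agrees with the formula that includes the finite population correction, and then computes the estimates for the sample (6, 15). The function names (`sample_variance`, `srs_mean_estimate`, `srs_total_estimate`) are illustrative choices and are not part of the course material.

```python
from itertools import combinations
from math import sqrt


def sample_variance(values):
    """Variance with divisor (len - 1), as in the S^2 and s^2 formulas."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def srs_mean_estimate(sample, pop_size):
    """Return (estimated mean, estimated variance, standard error) under SRS."""
    n = len(sample)
    f = n / pop_size                               # sampling fraction f = n/N
    xbar = sum(sample) / n
    var = (1 - f) * sample_variance(sample) / n    # (1 - f) is the fpc
    return xbar, var, sqrt(var)


def srs_total_estimate(sample, pop_size):
    """Return (estimated total, estimated variance, standard error) under SRS."""
    xbar, var, _ = srs_mean_estimate(sample, pop_size)
    var_total = pop_size ** 2 * var
    return pop_size * xbar, var_total, sqrt(var_total)


population = [10, 4, 17, 6, 8, 15]
N, n = len(population), 2

# Every possible sample of size n; under SRS each one is equally likely.
all_samples = list(combinations(population, n))
sample_means = [sum(s) / n for s in all_samples]
mean_of_means = sum(sample_means) / len(sample_means)

# Variance of the sample mean from the formula (uses the population S^2) ...
true_var = (1 - n / N) * sample_variance(population) / n
# ... and directly, as the spread of the sample means over all equally likely samples.
var_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)

print("number of possible samples:", len(all_samples))
print("mean of all sample means:", mean_of_means)
print("V(xbar) from formula:", round(true_var, 2))
print("V(xbar) by enumeration:", round(var_of_means, 2))

# Estimates based on the single sample (6, 15).
print("mean estimate, variance, s.e.:", srs_mean_estimate([6, 15], N))
print("total estimate, variance, s.e.:", srs_total_estimate([6, 15], N))
```

Enumerating all samples is only feasible for a tiny population like this one. It is used here purely to show that the sample mean is unbiased and that the variance formula holds exactly.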