Estimating population characteristics with simple random sampling

(Session 06)
SADC Course in Statistics
Learning Objectives
By the end of this session, you will be able to
• explain exactly what is meant by a simple
random sample
• distinguish between “with” and “without”
replacement sampling
• estimate the population mean or total
using a sample where the sampling has
been by simple random sampling
• compute measures of precision for such an
estimate, recognising the need for a finite
population correction.
Simple random sampling: definition
• The exact definition of simple random
sampling is a procedure whereby every
sample of size n has an equal chance of
being selected.
• In practice, this is achieved by picking one
unit at a time without replacement from
a list of population members.
• “Without replacement” means that once a
unit is chosen, it is not returned to the
population list until all the necessary units
have been sampled.
An illustration:
• Suppose population size is N = 6 with the
observable values of the six members being
10, 4, 17, 6, 8, 15. Suppose the values
are observed accurately without any error.
• Suppose we want a sample of size 2.
• How many possibilities are there to choose
2 out of 6 members?
• A list of all such pairs appears below.
Illustration continued…
(10, 4)  (10, 17)  (10, 6)  (10, 8)  (10, 15)
(4, 17)  (4, 6)  (4, 8)  (4, 15)
(17, 6)  (17, 8)  (17, 15)
(6, 8)  (6, 15)
(8, 15)
In simple random sampling, each of the
above pairs has an equal chance of selection,
i.e. probability of selection = 1/15.
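The count of 15 possible samples can be checked by enumeration. The following is a minimal sketch in Python using the six population values from the illustration; the variable names are illustrative only.

```python
from itertools import combinations
from fractions import Fraction

# The six population values from the illustration (N = 6).
population = [10, 4, 17, 6, 8, 15]
n = 2

# Each unordered pair of distinct units is one possible sample of size 2.
samples = list(combinations(population, n))

# Under simple random sampling every sample is equally likely.
prob = Fraction(1, len(samples))

print(len(samples))  # 15
print(prob)          # 1/15
```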
“With replacement” sampling
• Taking a simple random sample is done
using “without replacement” sampling.
• “With replacement” involves noting the
value for the unit drawn, and returning the
unit to the population.
• This means there is potential for the same
unit to be selected more than once!
• Is this sensible?
Note: In multi-stage sampling, there is often a valid
reason for doing “with replacement” sampling at
the first stage of selection. More on this later!
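The difference between the two schemes can be sketched with Python's standard `random` module. This is only an illustration: the seed and the sample size of 4 are arbitrary choices, and the six values double as unit labels because they are all distinct.

```python
import random
from itertools import product

population = [10, 4, 17, 6, 8, 15]
rng = random.Random(1)  # arbitrary seed, only so the run can be repeated

# Without replacement: a unit, once drawn, cannot be drawn again.
without_repl = rng.sample(population, k=4)

# With replacement: each draw is made from the full list, so repeats can occur.
with_repl = rng.choices(population, k=4)

# For ordered draws of size 2 with replacement there are 6 * 6 = 36 outcomes,
# and 6 of them select the same unit twice.
ordered_with = list(product(population, repeat=2))
repeats = [pair for pair in ordered_with if pair[0] == pair[1]]

print(len(ordered_with), len(repeats))  # 36 6
```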
Estimation with SRS
Suppose a sample of size n (x1, x2, …, xn) is
selected from a population of size N whose
true mean is $\bar{X}$ ($\bar{X} = 10$ for our
6-member population).
Then the best estimator of the population
mean is the sample mean, given by
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Note: Lower case letters will be used for
sample values, and upper case for population
values
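One property behind this choice of estimator is unbiasedness: averaged over all possible simple random samples, the sample mean equals the population mean. The sketch below checks this for the 6-member population with n = 2; the helper `sample_mean` is a name introduced here for illustration.

```python
from itertools import combinations
from fractions import Fraction

population = [10, 4, 17, 6, 8, 15]

def sample_mean(values):
    """Arithmetic mean as an exact fraction."""
    return Fraction(sum(values), len(values))

pop_mean = sample_mean(population)

# Sample mean for each of the 15 equally likely samples of size 2.
means = [sample_mean(s) for s in combinations(population, 2)]

# Average of the sample means over all possible samples.
avg_of_means = sum(means) / len(means)

print(pop_mean)      # 10
print(avg_of_means)  # 10
```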
Variance of the SRS estimator
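The variance of the sample mean is given by
$$V(\bar{x}) = \left(1 - \frac{n}{N}\right)\frac{S^2}{n}, \quad \text{where } S^2 = \frac{\sum_{i=1}^{N}(X_i - \bar{X})^2}{N-1} = 26.0$$
using population values 10, 4, 17, 6, 8, 15.
Hence, for a sample of size n = 2,
$$V(\bar{x}) = \left(1 - \frac{2}{6}\right)\frac{26.0}{2} = 8.67$$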
Notes concerning the variance
• Compared to the variance of a sample mean
used in Module H2 (assuming an infinite
population), the formula here is similar
except for the inclusion of the term (1-n/N).
• This multiplier is called the finite
population correction. It may be ignored
if the population is very large since n/N is
then nearly zero.
• The quantity n/N is called the sampling
fraction, often denoted by f. Thus f=n/N.
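As a check on the formula with the finite population correction, the sketch below computes (1 - n/N) S²/n for the 6-member population and compares it with the variance of the sample mean obtained by listing all 15 samples of size 2. The function name `fpc` is introduced here for illustration, and the large-population figures in the last line are made up simply to show the correction approaching 1.

```python
from itertools import combinations
from fractions import Fraction

population = [10, 4, 17, 6, 8, 15]
N, n = len(population), 2

def fpc(n, N):
    """Finite population correction 1 - f, with sampling fraction f = n/N."""
    return 1 - Fraction(n, N)

pop_mean = Fraction(sum(population), N)

# Population variance S^2 with divisor N - 1.
S2 = sum((x - pop_mean) ** 2 for x in population) / (N - 1)

# Variance of the sample mean from the formula.
var_formula = fpc(n, N) * S2 / n

# Variance of the sample mean by direct enumeration of all samples.
means = [Fraction(sum(s), n) for s in combinations(population, n)]
var_enum = sum((m - pop_mean) ** 2 for m in means) / len(means)

print(S2)                         # 26
print(var_formula, var_enum)      # 26/3 26/3
print(float(fpc(100, 1_000_000))) # 0.9999
```

Dropping the correction here would give 26/2 = 13, noticeably larger than 8.67, because a third of the population is sampled.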
Example
• Suppose the sample values were 6 and 15.
• The population mean is then estimated by
$$\bar{x} = \frac{6 + 15}{2} = 10.5$$
• Its variance is estimated by
$$V(\bar{x}) = \left(1 - \frac{n}{N}\right)\frac{s^2}{n}, \quad \text{where } s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = 40.5,$$
i.e.
$$V(\bar{x}) = \left(1 - \frac{2}{6}\right)\frac{40.5}{2} = 13.5 \;\Rightarrow\; \text{std. error} = 3.7$$
Estimating population total, XT
• The appropriate estimate is given by
$$x_T = N\bar{x} = (6)(10.5) = 63 \text{ in the example above}$$
• The variance of this estimator is
$$V(x_T) = N^2\,V(\bar{x}) = 6^2\,(13.5) = 486$$
Hence std. error = 22.0
• Confidence intervals for both the population
mean and the population total can be
obtained in the usual way (refer to methods
covered in Module H2).
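The arithmetic for the sample (6, 15) can be reproduced with a short Python sketch using only the standard library. The variable names are illustrative. With the sample mean of 10.5, the estimated total is N times 10.5, i.e. 63.

```python
import math
from statistics import mean, variance

N = 6             # population size
sample = [6, 15]  # the observed sample values
n = len(sample)

x_bar = mean(sample)     # 10.5
s2 = variance(sample)    # 40.5, sample variance with divisor n - 1

# Estimated variance and standard error of the sample mean, with the fpc.
var_mean = (1 - n / N) * s2 / n   # about 13.5
se_mean = math.sqrt(var_mean)     # about 3.7

# Estimated population total with its variance and standard error.
total_hat = N * x_bar             # 63.0
var_total = N ** 2 * var_mean     # about 486
se_total = math.sqrt(var_total)   # about 22.0

print(round(se_mean, 1), round(se_total, 1))
```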
Estimating population proportion
• Results below are for use when the
denominator for the proportion is fixed, e.g.
proportion of households (HHs) with at least
1 child aged 12-23 months. The denominator
(total no. of HHs) is fixed by the investigator.
• The appropriate estimate for the population
proportion is the sample proportion p = r/n,
where r = number of sampled units having the
attribute and n = sample size.
• The standard error of this estimate is the
square root of
(1-f)p(1-p)/(n-1), where f = n/N.
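A minimal sketch of this calculation follows. The function name `proportion_estimate` and the figures (N = 500 households, n = 50 sampled, r = 12 with the attribute) are hypothetical and chosen only to illustrate the formula.

```python
import math

def proportion_estimate(r, n, N):
    """Sample proportion p = r/n and its standard error under SRS,
    using sqrt((1 - f) * p * (1 - p) / (n - 1)) with f = n/N."""
    p = r / n
    f = n / N
    se = math.sqrt((1 - f) * p * (1 - p) / (n - 1))
    return p, se

# Hypothetical figures, not taken from any real survey.
p, se = proportion_estimate(r=12, n=50, N=500)

print(p)  # 0.24
print(se)
```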
Further notes
• Important not to confuse estimating a
population proportion with estimating a
population ratio.
• For example, estimating the ratio of male
children to female children in the population
is a ratio problem: both the numerator and the
denominator vary from sample to sample.
• You will briefly meet with the estimation of a
ratio through the practical exercise “To the
Woods” done in Sessions 8, 9, 10.
Further notes
• In computing confidence intervals for the
estimators considered above, large sample
sizes are usually assumed, so z-values from
a standard normal distribution are used, i.e.
$$\bar{x} \pm Z_{\alpha}\sqrt{(1-f)\,\frac{s^2}{n}}$$
• However, if n is small, t-values should be
used, i.e.
$$\bar{x} \pm t_{\alpha,\,n-1}\sqrt{(1-f)\,\frac{s^2}{n}}$$
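The two intervals can be compared with a short sketch. It assumes a two-sided 95% interval, so the z-value is the 97.5% point of the standard normal distribution, and it takes the t-value for n - 1 = 1 degree of freedom (12.706) from standard tables rather than computing it. The function name `srs_interval` is introduced here for illustration; the data are the sample (6, 15) from the 6-member population.

```python
import math
from statistics import NormalDist, mean, variance

def srs_interval(sample, N, critical):
    """Interval x_bar +/- critical * sqrt((1 - f) * s^2 / n), with f = n/N."""
    n = len(sample)
    f = n / N
    se = math.sqrt((1 - f) * variance(sample) / n)
    x_bar = mean(sample)
    return x_bar - critical * se, x_bar + critical * se

sample, N = [6, 15], 6

z = NormalDist().inv_cdf(0.975)  # about 1.96
t_1df = 12.706                   # tabulated two-sided 95% t-value, 1 df

z_interval = srs_interval(sample, N, z)
t_interval = srs_interval(sample, N, t_1df)

print(z_interval)
print(t_interval)
```

With only two observations the t-based interval is far wider than the z-based one, which illustrates why the normal approximation should not be relied on when n is small.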
Some practical work follows…