Objectives, Data and Measurements

Module H2 Practical 3
Sampling Distributions and Standard Errors
Objectives:
By the end of this practical you should be able to:



- explain what is meant by an “estimate” of a population characteristic, calculated using sample data;
- explain what is meant by the sampling distribution of an estimate;
- calculate and interpret the standard error of a sample mean from data collected according to a simple random sampling scheme.
1. Open the Excel file named H2_data.xls and move to the sheet named cattle. As in
practical 2, compute the mean and standard deviation of the two data columns: variable
cattle000, the (fictitious) number of cattle in each district, in thousands, and variable
pprm, the mean number of persons per sleeping room within households in each district.
Use the Excel functions AVERAGE and STDEV for this purpose.
(a) Note down your results in the table below, noting that these are population values (this
is exactly what you did in practical 2, but is repeated here to make it easier to compare
sample values with population values).
District Number              Cattle Numbers    Persons per sleeping room
Population Mean (μ) =        ______________    ______________
Population Std.Dev. (σ) =    ______________    ______________
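
For readers who want to check the spreadsheet results outside Excel, the same two summaries can be sketched in Python. This is only an illustration: the values below are made up for the example and are not taken from H2_data.xls, and the variable names are assumptions.

```python
import statistics

# Hypothetical values for illustration only; NOT the data in H2_data.xls.
cattle000 = [120.0, 310.5, 95.2, 480.0, 260.3, 150.8]

# Counterpart of Excel's AVERAGE.
mean_cattle = statistics.mean(cattle000)

# Counterpart of Excel's STDEV, which uses the divisor n - 1.
sd_cattle = statistics.stdev(cattle000)

print(f"mean = {mean_cattle:.2f}, std. dev. = {sd_cattle:.2f}")
```

Excel's STDEV divides by n - 1. When every district in the population is included, the divisor-n version (STDEVP in Excel, statistics.pstdev in Python) is the strict population value, although the two are close when the number of districts is large.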
(b) Move now to the next worksheet named 50 cattle samples. Here, 50 samples have
been drawn, each of size 10, and the individual sample values have been recorded – leading
to 500 observations. Use Excel to find the means and standard deviations of both
variables with respect to the 1st sample, i.e. observations in cells B2:B11 and C2:C11.
Record your results below.
District Number              Cattle Numbers    Persons per sleeping room
Sample Mean (x̄) =            ______________    ______________
Sample Std.Dev. (s) =        ______________    ______________
SADC Course in Statistics
(c) Note down the algebraic relationship between the standard deviation and the standard
error of the mean.
Use the above formula to find the standard error of the mean for each of the variables
cattle000 and pprm. Note down the results below.
s.e.m. for cattle =
s.e.m. for pprm =
Write down your interpretation of the standard error in each case.
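
As a check on the hand calculation in (c), the usual estimate of the standard error of a sample mean, the sample standard deviation divided by the square root of the sample size, can be sketched as below. The function name and the ten sample values are hypothetical and chosen only for illustration; they are not the values in cells C2:C11.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Return s / sqrt(n), the estimated standard error of the sample mean."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (divisor n - 1)
    return s / math.sqrt(n)


# Hypothetical sample of 10 observations, for illustration only.
sample = [2.1, 3.4, 1.8, 2.9, 4.0, 2.5, 3.1, 2.2, 3.6, 2.7]

sem = standard_error_of_mean(sample)
print(f"s = {statistics.stdev(sample):.3f}, s.e.m. = {sem:.3f}")
```

The standard error is always smaller than the standard deviation for n > 1, because it describes the variability of the mean rather than of individual observations.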
(d) Move to the next worksheet named sample means&sds, which includes the means and
standard deviations for each sample (n=10 in each case) across the 50 repeated samples.
Check that the first row of this worksheet matches your answers given in part (b) above.
Each column in this worksheet represents values from a sampling distribution. Discuss
with the person sitting next to you what sampling distribution is being shown in each
column, e.g. column catt10mn is the sampling distribution of what?
(e) Go to the bottom of the worksheet sample means&sds and, in row 52, compute the
standard deviation of each of the four columns of data. Note down below the standard
deviation of the two columns corresponding to the sample means, i.e. of columns named
catt10mn and pprm10mn.
Std. deviation of cattle000 means =
Std. deviation of pprm means =
How close are these empirically computed standard errors to the results from the single
sample, as calculated in part (c) above?
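
The idea behind parts (d) and (e), that the standard deviation of many sample means is an empirical estimate of the standard error, can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is 2000 randomly generated values, not the worksheet's districts, and the seed is fixed only so that a run can be repeated.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population of 2000 values; NOT the worksheet's data.
population = [random.uniform(0, 600) for _ in range(2000)]
sigma = statistics.pstdev(population)  # population standard deviation

n = 10          # size of each sample, as in the practical
n_samples = 50  # number of repeated samples, as in the practical

# Draw each sample by simple random sampling without replacement
# and keep only its mean.
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(n_samples)
]

empirical_se = statistics.stdev(sample_means)  # std. dev. of the 50 means
theoretical_se = sigma / math.sqrt(n)          # population std. dev. / sqrt(n)

print(f"empirical s.e. = {empirical_se:.1f}")
print(f"theoretical s.e. = {theoretical_se:.1f}")
```

With only 50 samples the two values will usually be close but not identical. Because the hypothetical population is much larger than the sample size, the finite population correction is ignored in this sketch.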
(f) Note that the empirical values above, computed from 50 repeated samples, are estimates
of the true standard error of the mean (for samples of size 10), given by the formula
s.e.m. (cattle000 mean) = population std.dev./√10 = 149.71/√10 = 47.3
s.e.m. (pprm mean) = population std.dev./√10 = 2.76/√10 = 0.873.
Of course, in practice, you will not have population values, nor will you be able to take
repeat samples. So in practice, the precision of the sample mean has to be estimated using
the standard error of the mean derived from the sample values, as has been done in part (c)
above.
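
The arithmetic in (f), dividing each population standard deviation by the square root of the sample size 10, can be checked with a few lines of Python. The variable names are assumptions; the numbers are those quoted in (f).

```python
import math

n = 10                 # observations per sample
sigma_cattle = 149.71  # population std. dev. of cattle000, as quoted in (f)
sigma_pprm = 2.76      # population std. dev. of pprm, as quoted in (f)

sem_cattle = sigma_cattle / math.sqrt(n)
sem_pprm = sigma_pprm / math.sqrt(n)

print(round(sem_cattle, 1))  # 47.3
print(round(sem_pprm, 3))    # 0.873
```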
2. Several computations have been done above giving you different standard deviations.
They correspond to:
(a) standard deviation of population values
(b) standard deviation of sample values
(c) standard error of the mean for a single sample of 10 observations
(d) standard deviation of 50 sample means = empirical estimate of the standard error of the
mean for a single sample of 10 observations.
(e) theoretical standard error of a sample mean if it is based on 10 observations drawn as a
simple random sample from a population with known standard deviation.
Discuss, in pairs or in small groups of 3-4 persons per group, what these different values
mean. Ensure you are clear about how they relate to each other and the interpretation of
each.
Finally, indicate which of these values would be used in practice and how they are useful.
Write down below the main lessons you learnt from this practical exercise.