Module H6 Practical 6
1. A particular family consists of Mother, Father and six children. The children are 10, 4, 17, 6, 8 and 15 years of age. Suppose we wish to draw a simple random sample of size 3 from the 6 children to estimate the average age of a child. (Note: this is not a particularly useful question, since the population is small and sampling is not necessary. However, the purpose of the exercise is (i) to see what happens when all possible samples are drawn and (ii) to introduce the idea of what is meant by an unbiased estimate.)
(a) How many possible samples of size 3 may be drawn, without replacement, from the population of 6?
(b) Open Excel and enter all the possible samples in three columns, one sample per row, so that each row holds the three values of one sample. Name the columns x1, x2 and x3. Calculate a fourth and a fifth column containing the sample mean and the sample variance of each row. Use Excel's functions AVERAGE and VAR for this purpose (not STDEV, which returns the standard deviation rather than the variance). Call these columns sampmean and sampvar respectively.
(c) Find the mean of the column of sample means.
(d) Find the mean of the column of sample variances.
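(e) Also enter the population values in a new column, naming this column popn (say).
Find the mean and variance of the population values; these are the true values for the population. (Use the divisor N - 1 for the population variance, i.e. Excel's VAR rather than VARP, so that it is directly comparable with the mean of the sample variances.)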
Now look at your results. You should find that the population mean and the mean of the sample means are the same, and that the population variance and the mean of the sample variances are the same.
In other words, the sample mean is an unbiased estimate of the population mean, and the sample variance is an unbiased estimate of the population variance: averaged over all possible samples, each estimate equals the value it is estimating.
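For anyone wishing to check the Excel results another way, the enumeration in parts (a) to (e) can be sketched in Python. This is only a minimal sketch and not part of the original practical: the variable names are illustrative, and the population variance is computed with the divisor N - 1, which is an assumption needed for the comparison with the mean of the sample variances to hold exactly.

```python
import math
from itertools import combinations
from statistics import mean, variance

# Ages of the six children (the whole population, taken from the exercise).
ages = [10, 4, 17, 6, 8, 15]
n = 3  # sample size

# (a) Every possible sample of size 3 drawn without replacement.
samples = list(combinations(ages, n))
num_samples = math.comb(len(ages), n)
assert len(samples) == num_samples

# (b) Sample mean and sample variance (divisor n - 1) for each sample.
sample_means = [mean(s) for s in samples]
sample_vars = [variance(s) for s in samples]

# (c), (d) Averages over all possible samples.
mean_of_means = mean(sample_means)
mean_of_vars = mean(sample_vars)

# (e) Population values. Assumption: divisor N - 1 for the variance.
pop_mean = mean(ages)
pop_var = variance(ages)

print(f"number of possible samples: {num_samples}")
print(f"mean of sample means     = {mean_of_means:.4f}, population mean     = {pop_mean:.4f}")
print(f"mean of sample variances = {mean_of_vars:.4f}, population variance = {pop_var:.4f}")
```

The two printed pairs of values can be compared with the spreadsheet columns sampmean, sampvar and popn.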
SADC Course in Statistics Module H2 Practical 6 – Page 1
2. A simple random sample of 10 farms belonging to subsistence farmers was selected without replacement from a population of 379 farms in a particular region. The areas, in acres, of the 10 chosen farms are as follows.
152, 231, 138, 140, 242, 260, 312, 396, 277, 163
The data are also available in the Excel spreadsheet named farmacres in file H6_data.xls.
(a) Estimate the mean area of a farm and the total area of farmland in the population.
(b) Calculate the standard error of each of the above two estimates and determine a 95% confidence interval for each.
(c) Interpret your results and comment on the precision of your sample estimates.
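One way of carrying out parts (a) and (b) is sketched below in Python. The sketch assumes the usual simple random sampling formulae with a finite population correction, 1 - n/N, and uses the t-table value 2.262 (9 degrees of freedom) as the multiplier for the 95% intervals. The practical does not state which multiplier to use, so both choices should be treated as assumptions, and the variable names are illustrative only.

```python
import math
from statistics import mean, variance

# Areas in acres of the 10 sampled farms (from the exercise).
areas = [152, 231, 138, 140, 242, 260, 312, 396, 277, 163]
N = 379          # number of farms in the population
n = len(areas)   # sample size

# Assumption: t-table value for a 95% interval with n - 1 = 9 degrees of freedom.
T_CRIT = 2.262

# (a) Estimates of the population mean and total.
ybar = mean(areas)
total_hat = N * ybar

# (b) Standard errors, including the finite population correction (assumed).
s2 = variance(areas)          # sample variance, divisor n - 1
fpc = 1 - n / N
se_mean = math.sqrt(fpc * s2 / n)
se_total = N * se_mean

ci_mean = (ybar - T_CRIT * se_mean, ybar + T_CRIT * se_mean)
ci_total = (total_hat - T_CRIT * se_total, total_hat + T_CRIT * se_total)

print(f"mean area:  {ybar:.1f} acres, s.e. {se_mean:.2f}")
print(f"95% CI for mean:  ({ci_mean[0]:.1f}, {ci_mean[1]:.1f})")
print(f"total area: {total_hat:.1f} acres, s.e. {se_total:.1f}")
print(f"95% CI for total: ({ci_total[0]:.1f}, {ci_total[1]:.1f})")
```

For part (c), the width of each interval relative to its estimate gives a rough indication of how precise a sample of only 10 farms can be.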
ATTEMPT EITHER 3 OR 4 BELOW.
3. A simple random sample of 15 cows was drawn from a population of 168 cows on a farm. The data below show whether the cows were found to have a particular disease.
Cow No.    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Diseased?  Y  N  Y  Y  N  N  Y  Y  Y  N  Y  N  Y  Y  N
(a) Estimate the proportion of diseased cows on the farm and find a standard error for your estimate.
(b) Calculate an approximate 95% confidence interval for the true proportion of diseased cows on the farm.
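The calculation for question 3 might be sketched in Python as follows. The sketch assumes the standard error formula sqrt((1 - n/N) p(1 - p)/(n - 1)) for a proportion under simple random sampling without replacement, and the normal multiplier 1.96 for the approximate 95% interval. Neither is stated in the practical, and with only 15 cows the normal approximation is rough.

```python
import math

# Disease status of the 15 sampled cows, in the order given in the exercise.
status = "Y N Y Y N N Y Y Y N Y N Y Y N".split()
N_cows = 168          # cows on the farm
n_cows = len(status)  # cows sampled

# (a) Estimated proportion diseased and its standard error.
diseased = status.count("Y")
p_cows = diseased / n_cows
fpc_cows = 1 - n_cows / N_cows  # finite population correction (assumed)
se_cows = math.sqrt(fpc_cows * p_cows * (1 - p_cows) / (n_cows - 1))

# (b) Approximate 95% confidence interval (normal approximation, assumed).
Z = 1.96
ci_cows = (p_cows - Z * se_cows, p_cows + Z * se_cows)

print(f"diseased in sample: {diseased} of {n_cows}")
print(f"estimated proportion: {p_cows:.3f}, s.e. {se_cows:.4f}")
print(f"approximate 95% CI: ({ci_cows[0]:.3f}, {ci_cows[1]:.3f})")
```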
4. A village has 800 people. In order to estimate how many are HIV-positive, a simple random sample of 80 persons is chosen to be given blood tests. Sixteen of the selected 80 are found to be HIV-positive.
(a) Estimate the proportion of persons in the village who are HIV-positive and find a standard error for your estimate.
(b) Calculate an approximate 95% confidence interval for the true proportion of HIV-positive persons in the village.
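Since the aim in question 4 is to estimate how many villagers are HIV-positive, the Python sketch below also scales the proportion and its interval up to the village of 800. As an assumption not stated in the practical, it uses the standard error formula sqrt((1 - n/N) p(1 - p)/(n - 1)) and the normal multiplier 1.96. The variable names are illustrative.

```python
import math

N_village = 800   # people in the village
n_tested = 80     # people sampled and tested
positives = 16    # HIV-positive results in the sample

# (a) Estimated proportion and its standard error.
p_hiv = positives / n_tested
fpc_hiv = 1 - n_tested / N_village  # finite population correction (assumed)
se_hiv = math.sqrt(fpc_hiv * p_hiv * (1 - p_hiv) / (n_tested - 1))

# (b) Approximate 95% confidence interval (normal approximation, assumed).
Z = 1.96
ci_hiv = (p_hiv - Z * se_hiv, p_hiv + Z * se_hiv)

# Scaling up: estimated number of HIV-positive persons in the village.
count_hat = N_village * p_hiv
ci_count = (N_village * ci_hiv[0], N_village * ci_hiv[1])

print(f"estimated proportion: {p_hiv:.3f}, s.e. {se_hiv:.4f}")
print(f"approximate 95% CI for proportion: ({ci_hiv[0]:.3f}, {ci_hiv[1]:.3f})")
print(f"estimated number HIV-positive: {count_hat:.0f}")
print(f"approximate 95% CI for number: ({ci_count[0]:.0f}, {ci_count[1]:.0f})")
```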