Uploaded by Nkosingiphile Mncube

2018 Sup Exam (18)

Question 1
(a) Researchers wish to design a survey to estimate the number of
oak trees in a study area. The study area has been divided
into 1000 plots. From previous experience, the variance in the
number of stems per plot is known to be approximately 45.
Using simple random sampling, what sample size should be used
to estimate the total number of trees in the study area to within
500 trees of the true value with 95% confidence?
(3)
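One way the sample-size calculation in Question 1(a) might be set up is sketched below. It assumes the normal approximation (z = 1.96 for 95% confidence) and the usual finite population correction for simple random sampling without replacement; the variable names are illustrative only.

```python
import math

N = 1000        # number of plots in the study area
sigma2 = 45.0   # approximate variance of stems per plot (given)
d = 500.0       # allowed error in the estimated total
z = 1.96        # assumed normal quantile for 95% confidence

# Sample size for a total, ignoring the finite population correction.
n0 = (z ** 2) * (N ** 2) * sigma2 / d ** 2

# Solve z^2 * N * (N - n) * sigma2 / n = d^2 for n, then round up.
n_exact = n0 / (1 + n0 / N)
sample_size = math.ceil(n_exact)
print(sample_size)
```

Rounding up rather than to the nearest integer keeps the error bound satisfied.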
(b) To estimate the proportion of voters in favour of a certain
constitutional amendment, a simple random sample of 1200
eligible voters was contacted and questioned. Of these, 552
reported that they favoured the amendment. Estimate the
population proportion in favour and give a 95% confidence
interval for the population proportion. The number of eligible
voters in the population is approximately 1800000.
(6)
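The estimate and interval asked for in Question 1(b) could be computed along the following lines. This is a sketch that assumes the unbiased variance estimator with divisor n - 1 and a normal quantile of 1.96; other textbooks use slightly different conventions.

```python
import math

N = 1_800_000   # approximate number of eligible voters
n = 1200        # sample size
favour = 552    # respondents in favour of the amendment
z = 1.96        # assumed normal quantile for 95% confidence

p_hat = favour / n
# Estimated variance of p_hat with the finite population correction.
var_hat = (1 - n / N) * p_hat * (1 - p_hat) / (n - 1)
half_width = z * math.sqrt(var_hat)
ci = (p_hat - half_width, p_hat + half_width)
print(round(p_hat, 4), tuple(round(v, 4) for v in ci))
```

With N this large the finite population correction is nearly 1, so it has almost no effect on the interval.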
(c) Assume that the following are data from cluster sampling with
simple random sampling of clusters. There are 10 clusters
(primary units) and a total of 100 secondary units in the
population. The n = 3 selected clusters have t_1 = 4, M_1 = 5,
t_2 = 12, M_2 = 20, t_3 = 7, M_3 = 10, where t_i is the cluster
total for the variable of interest and M_i is the cluster size. Give
an unbiased estimate of the population total and estimate its
variance.
(5)
[14 Marks]
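For Question 1(c), a minimal sketch of the unbiased (expansion) estimator of the total under simple random sampling of clusters is given below. It assumes the estimator N times the mean of the sampled cluster totals, which does not use the cluster sizes M_i; the names are illustrative.

```python
N = 10                 # number of clusters in the population
totals = [4, 12, 7]    # cluster totals t_1, t_2, t_3 from the question
n = len(totals)

mean_total = sum(totals) / n
tau_hat = N * mean_total                      # unbiased estimate of the total

# Sample variance of the cluster totals.
s2_u = sum((t - mean_total) ** 2 for t in totals) / (n - 1)
var_hat = N * (N - n) * s2_u / n              # estimated variance of tau_hat
print(round(tau_hat, 2), round(var_hat, 2))
```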
Question 2
The sample estimators of the population mean $\bar{Y}$ under simple
random sampling without replacement and stratified random sampling
are given as
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $\bar{y}_{st} = \frac{1}{N}\sum_{i=1}^{k} N_i \bar{y}_i$, respectively.
(a) Show that the two estimators are unbiased under their
respective sampling techniques.
(4)
(b) The variance of $\bar{y}_{st}$ under stratified random sampling is given as
$V(\bar{y}_{st}) = \sum_{i=1}^{k} \left(\frac{N_i}{N}\right)^2 \frac{N_i - n_i}{N_i n_i} S_i^2$.
Derive the variance of the estimator under proportional and
optimal sample size allocation at a fixed per unit cost.
(10)
(c) The table below shows results released by Statistics South Africa
that were obtained from a stratified random sample to estimate
the average amount of money lost by insurance companies in
cash-in-transit heists in three provinces of South Africa from
1980 to 2018. Note that y_i represents the amount of money in
billions of rands.
Stratum i    N_i    n_i    ȳ_i    s_i^2
1            100    50     10     2800
2            50     50     20     700
3            300    50     30     600
i) Estimate the average amount of money lost by insurance
companies in South Africa for the stated period.
(3)
ii) Give a 95% confidence interval for this average.
(5)
[22 Marks]
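The stratified estimate and interval in Question 2(c) might be computed as sketched below, using the variance formula stated in Question 2(b) with the sample variances in place of the S_i^2. The normal quantile 1.96 is an assumption; a t quantile with approximate degrees of freedom would be an alternative.

```python
import math

# (N_i, n_i, ybar_i, s2_i) for each stratum, copied from the table.
strata = [
    (100, 50, 10.0, 2800.0),
    (50, 50, 20.0, 700.0),
    (300, 50, 30.0, 600.0),
]
z = 1.96  # assumed normal quantile for 95% confidence

N = sum(Ni for Ni, _, _, _ in strata)
y_st = sum(Ni * ybar for Ni, _, ybar, _ in strata) / N

# Stratum 2 is fully enumerated (n_i = N_i), so it contributes zero variance.
var_st = sum(
    (Ni / N) ** 2 * (Ni - ni) / (Ni * ni) * s2 for Ni, ni, _, s2 in strata
)
half_width = z * math.sqrt(var_st)
ci = (y_st - half_width, y_st + half_width)
print(round(y_st, 3), round(var_st, 3), tuple(round(v, 3) for v in ci))
```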
Question 3
Given the ratio estimator, $\hat{R} = \frac{\bar{y}}{\bar{x}}$, of the population ratio $R$,
(a) Show that the large-sample approximate variance of $\hat{R}$ is given as
$V(\hat{R}) = \frac{1 - f}{n\bar{X}^2} \cdot \frac{\sum_{i=1}^{N} (y_i - R x_i)^2}{N - 1}$.
(7)
(b) Show further that the variances of the ratio estimators for the
population mean and total, defined as $\bar{y}_R = \frac{\bar{y}}{\bar{x}}\bar{X}$ and $t_R = \hat{R} T_x$,
respectively, are given as
$V(\bar{y}_R) = \frac{1 - f}{n} \left( \frac{\sum_{i=1}^{N} (y_i - R x_i)^2}{N - 1} \right)$
and
$V(t_R) = \frac{N^2 (1 - f)}{n} \left( \frac{\sum_{i=1}^{N} (y_i - R x_i)^2}{N - 1} \right)$,
respectively.
(7)
(c) Show that the coefficient of variation of $\hat{R}$ is given as
$\sqrt{\frac{1 - f}{n} \left( C_{yy} + C_{xx} - 2C_{xy} \right)}$,
where $C_{yy}$ and $C_{xx}$ are the squared coefficients of variation
of y and x, respectively, and $C_{xy} = \frac{S_{xy}}{\bar{X}\bar{Y}}$ is the relative covariance.
(6)
[20 Marks]
Question 4
In simple random sampling with regression coefficient b, the linear
regression estimator of the population mean is given by
$\bar{y}_{reg} = \bar{y} + b(\bar{X} - \bar{x})$,
where $\bar{y}$ and $\bar{x}$ are the sample means per unit of y and x,
respectively, and $\bar{X}$ is the population mean of x.
(a) Deduce that the estimator reduces to the ratio estimator $\bar{y}_R$ if
b is selected to be $\frac{\bar{y}}{\bar{x}}$.
(2)
(b) Show that $\bar{y}_{reg}$ is an unbiased estimator of the population mean
$\bar{Y}$ if the regression coefficient b is preassigned a value of $b_0$.
(2)
(c) Show that the variance of $\bar{y}_{reg}$ at $b = b_0$ is given by
$V(\bar{y}_{reg}) = \frac{1 - f}{n} \left( S_y^2 - 2 b_0 S_{xy} + b_0^2 S_x^2 \right)$.
(10)
(d) Show that the optimal value of b, denoted as $b_{opt}$, is given as
$b_{opt} = \frac{S_{xy}}{S_x^2}$ and the resulting variance is given as
$V(\bar{y}_{reg})_{min} = \frac{1 - f}{n} S_y^2 \left( 1 - \rho^2 \right)$,
where $\rho$ is the population correlation coefficient between y and
x.
(6)
[20 Marks]
Question 5
In a Johannesburg city suburb of 725 people, a simple random sample
of four households is selected from the 250 households in the population
to estimate the average expenditure on fizzy drinks per household per day
for a market research survey conducted by a leading producer of soft drinks
in South Africa. The first household in the sample had 4 people and
spent a total of R150 on fizzy drinks per day. The second household
had 2 people and spent R100. The third, with 4 people, spent R200.
The fourth, with 3 people, spent R140 per day.
(a) Estimate:
(i) The population mean expenditure using the simple random
sample estimator of the mean and the ratio estimator.
(6)
(ii) The variance of both estimators.
(6)
(iii) Based on the results in (ii), which estimator appears
preferable in this study?
(1)
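The two estimators in Question 5(a) could be compared with a short calculation such as the one below. It is a sketch that takes household size as the auxiliary variable, so the population mean household size is 725/250, and it uses the usual residual-based variance estimate for the ratio estimator; the variable names are illustrative.

```python
N = 250                              # households in the population
people_total = 725                   # people in the suburb
x = [4, 2, 4, 3]                     # household sizes in the sample
y = [150.0, 100.0, 200.0, 140.0]     # daily spend in rands

n = len(y)
f = n / N                            # sampling fraction
X_bar = people_total / N             # population mean household size

y_bar = sum(y) / n                   # simple random sample estimator
r = sum(y) / sum(x)                  # estimated spend per person
y_ratio = r * X_bar                  # ratio estimator of the mean

s2_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
var_srs = (1 - f) * s2_y / n

# Residuals about the fitted ratio line through the origin.
s2_r = sum((yi - r * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
var_ratio = (1 - f) * s2_r / n
print(round(y_bar, 2), round(y_ratio, 2), round(var_srs, 2), round(var_ratio, 2))
```

The estimator with the smaller estimated variance would normally be preferred, although with n = 4 both variance estimates are rough.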
(b) In a survey to determine the amount of crop yield due to an
air pollutant on farms in KwaZulu-Natal, a simple random sample of
n = 20 plots was selected from N = 1000 in the province. The
summary statistics on yield $y_i$ (in weight) and level of pollutant
$x_i$ (in parts per million) were $\bar{y} = 10$, $\bar{x} = 6$,
$\sum_{i=1}^{20} (x_i - \bar{x})(y_i - \bar{y}) = -60$,
$\sum_{i=1}^{20} (x_i - \bar{x})^2 = 30$, and
$\sum_{i=1}^{20} (y_i - \bar{y})^2 = 130$.
The population mean pollutant level is $\bar{X} = 5.0$.
(i) Estimate the mean yield for the province with a linear
regression estimator.
(3)
(ii) Estimate the variance of the linear regression estimator and
of the simple random sample estimator of the population mean.
(7)
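A possible setup for Question 5(b) is sketched below. It assumes the least-squares slope as the regression coefficient and a residual mean square with divisor n - 2 for the variance of the regression estimator; some texts use n - 1 instead, so the divisor should be treated as an assumption.

```python
N, n = 1000, 20
y_bar, x_bar, X_bar = 10.0, 6.0, 5.0

# Sums of cross-products and squares given in the question.
s_xy, s_xx, s_yy = -60.0, 30.0, 130.0

f = n / N                                  # sampling fraction
b = s_xy / s_xx                            # least-squares slope
y_reg = y_bar + b * (X_bar - x_bar)        # regression estimate of the mean

sse = s_yy - s_xy ** 2 / s_xx              # residual sum of squares
var_reg = (1 - f) / n * sse / (n - 2)      # assumed divisor n - 2
var_srs = (1 - f) / n * s_yy / (n - 1)     # variance of the sample mean
print(y_reg, round(var_reg, 4), round(var_srs, 4))
```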
(c) Under what condition will the regression estimator be equal to
the simple random sample estimator of the population mean?
(1)
[24 Marks]
TOTAL MARKS=100