```BBAC2
QUESTION 1
(a)
(i) A random variable, denoted by X, is a function whose domain is the sample space S and
whose range is a set of real numbers; its value is determined by each element of the
sample space.
(ii) A random variable X is discrete if the set of values it can take is either finite or countably infinite.
(iii) A discrete probability function is a formula or table that lists all possible values a discrete
random variable can take, together with their associated probabilities.
(b)
(i) E(X) = Σ x P(X = x), summed over all values x
E(X) = Σ x P(X = x), for x = 0 to 5
= 0(0.10) + 1(0.15) + 2(0.30) + 3(0.20) + 4(0.15) + 5(0.10)
= 0 + 0.15 + 0.6 + 0.6 + 0.6 + 0.5
= 2.45 calls
(ii) E(X²) = Σ x² P(X = x), summed over all values x
E(X²) = Σ x² P(X = x), for x = 0 to 5
= 0²(0.10) + 1²(0.15) + 2²(0.30) + 3²(0.20) + 4²(0.15) + 5²(0.10)
= 0 + 0.15 + 1.2 + 1.8 + 2.4 + 2.5
= 8.05
(iii) σ = √(E(X²) − [E(X)]²) = √(8.05 − 2.45²) = √2.0475 ≈ 1.43
(c) P(X = x) = C(n, x) p^x (1 − p)^(n−x), where C(n, x) = n! / (x!(n − x)!)
p = 0.9
1 − p = 0.1
n = 5
(i) P(X = 0) = C(5, 0) (0.9)^0 (0.1)^5
= 0.00001
(ii) P(X = 2) = C(5, 2) (0.9)^2 (0.1)^3
= 10(0.81)(0.001)
= 0.0081
(iii) P(X ≤ 1) = P(X = 0) + P(X = 1)
= C(5, 0) (0.9)^0 (0.1)^5 + C(5, 1) (0.9)^1 (0.1)^4
= 0.00001 + 5(0.9)(0.0001)
= 0.00001 + 0.00045
= 0.00046
(iv) P(X ≥ 2) = 1 − P(X ≤ 1)
= 1 − 0.00046
= 0.99954
(d) P(X = x) = e^(−λ) λ^x / x!, x = 0, 1, …, and E(X) = Var(X) = λ
(e) λ = 4
(i) P(X = 0) = e^(−4) 4^0 / 0!
= 0.018315638
≈ 0.01832
(ii) P(X = 4) = e^(−4) 4^4 / 4!
= 0.195366814
≈ 0.19537
(iii) P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
= 0.01832 + e^(−4) 4^1 / 1! + e^(−4) 4^2 / 2! + e^(−4) 4^3 / 3!
= 0.01832 + 0.07326 + 0.14653 + 0.19537
= 0.43348
(iv) P(X ≥ 2) = 1 − P(X ≤ 1)
= 1 − (0.01832 + 0.07326)
= 1 − 0.09158
= 0.90842
(v) P(X ≤ 4) = P(X ≤ 3) + P(X = 4)
= 0.43348 + 0.19537
≈ 0.6288
QUESTION 2
(a)
i. The null hypothesis is the assumption that there is no difference between specified
populations; any observed difference is attributed to sampling or experimental error.
ii. The alternative hypothesis is the statement that contradicts the null hypothesis: it asserts that a
real difference or effect exists, for example that a new theory holds instead of an old one.
iii. The critical region is the set of values of the test statistic that leads to rejection of the null
hypothesis at the chosen significance level.
(b) The central limit theorem states that if repeated samples of size n are drawn from any infinite
population with mean μ and variance σ², and n is large (n ≥ 30), then the distribution of the sample
mean x̄ is approximately normal with mean μ (i.e. E(x̄) = μ) and variance σ²/n (i.e. Var(x̄) = σ²/n),
and this approximation becomes better as n becomes larger.
(c)
(i) X ~ Po(4.5)
μ = λ = 4.5, σ² = λ = 4.5
By the central limit theorem, since n is large, X̄ is approximately normal.
So X̄ ~ N(μ, σ²/n), with n = 30
X̄ ~ N(4.5, 4.5/30)
X̄ ~ N(4.5, 0.15)
P(X̄ > 5) = P(Z > (x̄ − μ)/(σ/√n))
= P(Z > (5 − 4.5)/√0.15)
= P(Z > 1.29)
= 0.5 − P(0 < Z < 1.29)
= 0.5 − 0.4015
= 0.0985
(ii) X ~ B(9, 0.5)
μ = np = 9 × 0.5 = 4.5, σ² = npq = 9 × 0.5 × 0.5 = 2.25
By the central limit theorem, since n is large, X̄ is approximately normal.
So X̄ ~ N(μ, σ²/n), with n = 30
X̄ ~ N(4.5, 2.25/30)
X̄ ~ N(4.5, 0.075)
P(X̄ > 5) = P(Z > (x̄ − μ)/(σ/√n))
= P(Z > (5 − 4.5)/√0.075)
= P(Z > 1.83)
= 0.5 − P(0 < Z < 1.83)
= 0.5 − 0.4664
= 0.0336
(d) n = 200, μ = 75, σ = 15, z = (x − μ)/σ
(i) P(X > 60) = P(Z > (60 − 75)/15)
= P(Z > −1)
= 0.5 + P(0 ≤ Z ≤ 1)
= 0.5 + 0.3413
= 0.8413
(ii) Number of clinics = 0.8413 × 200
≈ 168 clinics
(iii) P(65 < X < 85) = P((65 − 75)/15 < Z < (85 − 75)/15)
= P(−0.67 < Z < 0.67)
= 2 × P(0 < Z < 0.67)
= 2 × 0.2486
= 0.4972
(e) x̄₁ = 76, n₁ = 50, σ₁ = 8
x̄₂ = 68, n₂ = 65, σ₂ = 9
(i) Point estimate of μ₁ − μ₂ = x̄₁ − x̄₂
= 76 − 68
= 8
(ii) 95% confidence level
AL = (1 + CL)/2 = (1 + 0.95)/2 = 1.95/2 = 0.975; therefore, z for this level of confidence is
Z = 1.96
Confidence interval for μ₁ − μ₂ = (x̄₁ − x̄₂) ± z √(σ₁²/n₁ + σ₂²/n₂)
= 8 ± 1.96 √(8²/50 + 9²/65)
= 8 ± 3.12
= (4.88, 11.12) years
QUESTION 3
(a)
(i) We find x̄₁ and x̄₂; if x̄₁ = x̄₂, then E(X) = μ, the population mean.
x̄₁ = Σf₁x₁ / Σf₁
= [(18 × 3) + (19 × 7) + (20 × 15) + (21 × 10) + (22 × 5)] / 40
= 807/40 = 20.175
≈ 20
x̄₂ = Σf₂x₂ / Σf₂
= [(18 × 10) + (19 × 21) + (20 × 8) + (21 × 6) + (22 × 3) + (23 × 2)] / 50
= 977/50 = 19.54
≈ 20
Since E(X₁) = E(X₂) = 20,
the population μ = 20.
(ii) s₁² = Σf₁(x₁ − x̄₁)² / Σf₁
= [3(18 − 20)² + 7(19 − 20)² + 15(20 − 20)² + 10(21 − 20)² + 5(22 − 20)²] / 40
= 49/40 = 1.23
(iii) s₂² = Σf₂(x₂ − x̄₂)² / Σf₂
= [10(18 − 20)² + 21(19 − 20)² + 8(20 − 20)² + 6(21 − 20)² + 3(22 − 20)² + 2(23 − 20)²] / 50
= 97/50 = 1.94
Since s₁² is small, the estimate of the population σ² = s₁²/n = 1.23/40 = 0.03075
(b) H0: μ = 130
HA: μ ≠ 130
The critical value for α = 0.05:
The area of each shaded "tail" of the standard normal curve is 0.025, and the corresponding Z
scores (Z tabulated) at the boundaries are ±1.96.
Sample: n = 64, X̄ = 132
The Z score for the random sample of 64 persons of the village aged 20 to 40 years:
Z calc = (132 − 130) / (10/√64) = 2/1.25 = 1.6
This score falls inside the "fail to reject" region from −1.96 to +1.96.
Hence, we fail to reject the null hypothesis. That is, there is no significant evidence at the 5%
level that the mean systolic blood pressure of persons (aged 20 to 40) living in village x differs
from the mean systolic blood pressure of the inhabitants (aged 20 to 40) of the district.
(c) p = 0.4, q = 0.6, n = 150
(i) Sample proportion = 60/150 = 0.4
σp = √(pq/n) = √(0.4 × 0.6 / 150) = 0.04
Confidence interval = p ± z σp
For a 99% confidence interval:
AL = (1 + CL)/2 = (1 + 0.99)/2 = 1.99/2 = 0.995; therefore, z for this level of confidence is
Z = (2.57 + 2.58)/2 = 2.575
Confidence interval = 0.4 ± 2.575(0.04) = 0.4 ± 0.103
= (0.297, 0.503)
(ii) The epidemiologist needs to worry, because the confidence interval for the proportion of
people infected with malaria is large, running from about 30% to 50%.
QUESTION 4
(a)
Uniform Probability Density Function
(i) f(x) = 1/(b − a), for a ≤ x ≤ b
= 0 elsewhere
f(x) = 1/10 for 5 ≤ x ≤ 15
= 0 elsewhere
Where: x = salad plate filling weight
(ii) P = (1/10)(3) = 0.3, where 3 is the width of the required interval within 5 ≤ x ≤ 15
(iii) Expected value of x
E(x) = (a + b)/2 = (5 + 15)/2 = 10
(iv) Variance of x
Var(x) = (b − a)²/12 = (15 − 5)²/12 = 8.33
(b) This is like sampling from an urn. The N = 20 "balls" in the urn correspond to the 20 cars, of
which M = 7 are "black" (i.e. polluting). When n = 5 are sampled, X, the number in the sample
exceeding pollution control standards, has a hypergeometric (N = 20, M = 7, n = 5) distribution.
We can use this to calculate any probabilities of interest:
P(X = x) = C(M, x) C(N − M, n − x) / C(N, n)
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
= C(7, 0) C(13, 5) / C(20, 5) + C(7, 1) C(13, 4) / C(20, 5) + C(7, 2) C(13, 3) / C(20, 5)
= 0.0830 + 0.3228 + 0.3874
= 0.7932
(c) We are given E(X) = 12.5. Since X is the number of tests completed when the 1st person with high
blood pressure is found, not the number of tests completed before the 1st person with high
blood pressure is found, E(X − 1) = 11.5.
For a geometric distribution, E(X − 1) = (1 − p)/p. As a result, p = 1/12.5 = 0.08:
11.5 = (1 − p)/p
11.5p = 1 − p
12.5p = 1
p = 1/12.5 = 0.08
```
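
The discrete-distribution answers in Question 1 and Question 4(b) can be cross-checked numerically. The following is a minimal sketch using only Python's standard library. The helper names (`binomial_pmf`, `poisson_pmf`, `hypergeom_pmf`) and variable names are illustrative assumptions and do not come from the original answers; the parameter values are the ones stated in the solutions.

```python
from math import comb, exp, factorial, sqrt


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)


def hypergeom_pmf(x, N, M, n):
    """P(X = x) when n items are drawn without replacement from N items,
    M of which are of the type being counted."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


# Question 1(b): mean, second moment and standard deviation from the table.
table = {0: 0.10, 1: 0.15, 2: 0.30, 3: 0.20, 4: 0.15, 5: 0.10}
mean = sum(x * p for x, p in table.items())
second_moment = sum(x**2 * p for x, p in table.items())
std_dev = sqrt(second_moment - mean**2)

# Question 1(c): binomial with n = 5, p = 0.9.
binom_le_1 = sum(binomial_pmf(x, 5, 0.9) for x in range(2))
binom_ge_2 = 1 - binom_le_1

# Question 1(e): Poisson with lambda = 4.
pois_le_3 = sum(poisson_pmf(x, 4) for x in range(4))
pois_le_4 = pois_le_3 + poisson_pmf(4, 4)

# Question 4(b): hypergeometric with N = 20, M = 7, n = 5.
hyper_le_2 = sum(hypergeom_pmf(x, 20, 7, 5) for x in range(3))

print(f"E(X) = {mean:.2f}, E(X^2) = {second_moment:.2f}, sd = {std_dev:.4f}")
print(f"Binomial: P(X <= 1) = {binom_le_1:.5f}, P(X >= 2) = {binom_ge_2:.5f}")
print(f"Poisson: P(X <= 3) = {pois_le_3:.5f}, P(X <= 4) = {pois_le_4:.5f}")
print(f"Hypergeometric: P(X <= 2) = {hyper_le_2:.4f}")
```

Summing exact terms instead of rounded ones can shift the last decimal place slightly compared with the hand calculations, so small differences in the fifth decimal are expected.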