Uploaded by stanleykasusu3

# Answers to my assignment

```
BBAC2
QUESTION 1
(a)
(i) A random variable X is a function whose domain is the sample space S and whose range is a
set of real numbers; its value is determined by each element of the sample space.
(ii) A random variable X is discrete if the set of values it can take is finite or countably infinite.
(iii) A discrete probability function is a formula or table that lists all possible values a discrete
random variable can take, together with their associated probabilities.
(b)
(i) E(X) = Σ x·P(X = x), summed over x = 0, 1, ..., 5
= 0(0.10) + 1(0.15) + 2(0.30) + 3(0.20) + 4(0.15) + 5(0.10)
= 0 + 0.15 + 0.6 + 0.6 + 0.6 + 0.5
= 2.45 calls
(ii) E(X²) = Σ x²·P(X = x), summed over x = 0, 1, ..., 5
= 0²(0.10) + 1²(0.15) + 2²(0.30) + 3²(0.20) + 4²(0.15) + 5²(0.10)
= 0 + 0.15 + 1.2 + 1.8 + 2.4 + 2.5
= 8.05
(iii) σ = √(E(X²) − [E(X)]²) = √(8.05 − 2.45²) = √2.0475 ≈ 1.43
(c) P(X = x) = C(n, x) p^x (1 − p)^(n − x), where C(n, x) = n!/(x!(n − x)!)
p = 0.9
1 − p = 0.1
n = 5
(i) P(X = 0) = C(5, 0)(0.9)^0 (0.1)^5
= 0.00001
(ii) P(X = 2) = C(5, 2)(0.9)^2 (0.1)^3
= 10(0.81)(0.001)
= 0.0081
(iii) P(X ≤ 1) = P(X = 0) + P(X = 1)
= C(5, 0)(0.9)^0 (0.1)^5 + C(5, 1)(0.9)^1 (0.1)^4
= 0.00001 + 5(0.9)(0.0001)
= 0.00001 + 0.00045
= 0.00046
(iv) P(X ≥ 2) = 1 − P(X ≤ 1)
= 1 − 0.00046
= 0.99954
(d) P(X = x) = e^(−λ) λ^x / x!, x = 0, 1, ..., and E(X) = Var(X) = λ
(e) λ = 4
(i) P(X = 0) = e^(−4) 4^0 / 0!
= 0.018315638
≈ 0.01832
(ii) P(X = 4) = e^(−4) 4^4 / 4!
= 0.195366815
≈ 0.19537
(iii) P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
= 0.01832 + e^(−4) 4^1 / 1! + e^(−4) 4^2 / 2! + e^(−4) 4^3 / 3!
= 0.01832 + 0.07326 + 0.14653 + 0.19537
= 0.43348
(iv) P(X≥ 2)=1− P(X≤ 1)
=1−(0.01832 + 0.07326)
= 1-0.09158
= 0.90842
(v) P(X≤ 4)= P(X≤ 3) + P(X=4)
=0.43348 + 0.19536
= 0.62884
QUESTION 2
(a)
i. The null hypothesis is the assumption that there is no difference between the specified
populations; any observed difference is attributed to sampling or experimental error.
ii. The alternative hypothesis is the claim accepted when the null hypothesis is rejected, for
example that a new theory is true instead of an old one.
iii. The critical region is the set of values of the test statistic that corresponds to rejection of
the null hypothesis at the chosen significance level.
(b) The central limit theorem states that if repeated samples of size n are drawn from any infinite
population with mean μ and variance σ², and n is large (n ≥ 30), then the sampling distribution of
the sample mean x̄ is approximately normal with mean μ (i.e. E(x̄) = μ) and variance σ²/n
(i.e. V(x̄) = σ²/n), and this approximation becomes better as n becomes larger.
(c)
(i) X ~ Po(4.5)
μ = λ = 4.5, σ² = λ = 4.5
By the central limit theorem, since n is large, X̄ is approximately normal.
So X̄ ~ N(μ, σ²/n), with n = 30
X̄ ~ N(4.5, 4.5/30)
X̄ ~ N(4.5, 0.15)
P(X̄ > 5) = P(Z > (x̄ − μ)/√(σ²/n))
= P(Z > (5 − 4.5)/√0.15)
= P(Z > 1.29)
= 0.5 − P(0 < Z < 1.29)
= 0.5 − 0.4015
= 0.0985
(ii) X ~ B(9, 0.5)
μ = np = 9 × 0.5 = 4.5, σ² = npq = 9 × 0.5 × 0.5 = 2.25
By the central limit theorem, since n is large, X̄ is approximately normal.
So X̄ ~ N(μ, σ²/n), with n = 30
X̄ ~ N(4.5, 2.25/30)
X̄ ~ N(4.5, 0.075)
P(X̄ > 5) = P(Z > (x̄ − μ)/√(σ²/n))
= P(Z > (5 − 4.5)/√0.075)
= P(Z > 1.83)
= 0.5 − P(0 < Z < 1.83)
= 0.5 − 0.4664
= 0.0336
(d) n = 200, μ = 75, σ = 15, z = (x − μ)/σ
(i) P(X > 60) = P(Z > (60 − 75)/15)
= P(Z > −1)
= 0.5 + P(0 ≤ Z ≤ 1)
= 0.5 + 0.3413
= 0.8413
(ii) Number of clinics = 0.8413 × 200
= 168.26 ≈ 168 clinics
(iii) P(65 < X < 85) = P((65 − 75)/15 < Z < (85 − 75)/15)
= P(−0.67 < Z < 0.67)
= 2 × P(0 < Z < 0.67)
= 2 × 0.2486
= 0.4972
(e) x̄1 = 76, n1 = 50, σ1 = 8
x̄2 = 68, n2 = 65, σ2 = 9
(i) Point estimate of μ1 − μ2 = x̄1 − x̄2
= 76 − 68
= 8
(ii) 95% confidence level
AL = (1 + CL)/2 = (1 + 0.95)/2 = 1.95/2 = 0.975; therefore, z for this level of confidence is
z = 1.96
Confidence interval for μ1 − μ2 = (x̄1 − x̄2) ± z√(σ1²/n1 + σ2²/n2)
= 8 ± 1.96√(8²/50 + 9²/65)
= 8 ± 1.96√(1.28 + 1.2462)
= 8 ± 3.12
= (4.88, 11.12) years
QUESTION 3
(a)
(i) We find x̄1 and x̄2; if x̄1 = x̄2, then E(x) = μ, the population mean.
x̄1 = Σf1x1 / Σf1
= [(18 × 3) + (19 × 7) + (20 × 15) + (21 × 10) + (22 × 5)] / 40
= 807/40 = 20.175 ≈ 20
x̄2 = Σf2x2 / Σf2
= [(18 × 10) + (19 × 21) + (20 × 8) + (21 × 6) + (22 × 3) + (23 × 2)] / 50
= 977/50 = 19.54 ≈ 20
Since E(x1) = E(x2) ≈ 20, the population mean is μ = 20.
(ii) s1² = Σf1(x1 − x̄1)² / Σf1
= [3(18 − 20)² + 7(19 − 20)² + 15(20 − 20)² + 10(21 − 20)² + 5(22 − 20)²] / 40
= 49/40 ≈ 1.23
(iii) s2² = Σf2(x2 − x̄2)² / Σf2
= [10(18 − 20)² + 21(19 − 20)² + 8(20 − 20)² + 6(21 − 20)² + 3(22 − 20)² + 2(23 − 20)²] / 50
= 97/50 = 1.94
Since s1² is the smaller of the two, the estimate of the population σ² = s1²/n = 1.23/40 = 0.03075
(b) H0: μ = 130
HA: μ ≠ 130
The critical value for α = 0.05:
The area of each shaded "tail" of the standard normal curve is 0.025, and the corresponding Z
scores (Z tabulated) at the boundaries are ±1.96.
Sample: n = 64, x̄ = 132
The Z score for the random sample of 64 persons of the village aged 20 to 40 years:
Z calc = (132 − 130)/(10/√64) = 2/1.25 = 1.6
This score falls inside the "fail to reject" region from −1.96 to +1.96.
Hence, we fail to reject the null hypothesis. That is, there is no evidence that the mean systolic
blood pressure of persons (aged 20 to 40) living in village x differs from the mean systolic blood
pressure of the inhabitants (aged 20 to 40) of the district.
(c) p = 0.4, q = 0.6, n = 150
(i) Sample proportion = 60/150 = 0.4
σp = √(pq/n) = √(0.4 × 0.6/150) = √0.0016 = 0.04
Confidence interval = p ± z σp
For a 99% confidence interval:
AL = (1 + CL)/2 = (1 + 0.99)/2 = 1.99/2 = 0.995; therefore, z for this level of confidence is
z = (2.57 + 2.58)/2 = 2.575
Confidence interval = 0.4 ± 2.575(0.04) = 0.4 ± 0.103
= (0.297, 0.503).
(ii) The epidemiologist needs to worry, because the confidence interval for the proportion of
people infected with malaria is wide.
QUESTION 4
(a) Uniform Probability Density Function
(i) f(x) = 1/(b − a), for a ≤ x ≤ b
= 0 elsewhere
f(x) = 1/10 for 5 ≤ x ≤ 15
= 0 elsewhere
where x = salad plate filling weight
(ii) P = (1/10)(3) = 0.3 for the required interval, which has width 3.
(Over the full range 5 ≤ x ≤ 15 the probability is 1.)
(iii) Expected value of x
E(x) = (a + b)/2 = (5 + 15)/2 = 10
(iv) Variance of x
Var(x) = (b − a)²/12 = (15 − 5)²/12 = 100/12 = 8.33
(b) This is like sampling from an urn. The N = 20 "balls" in the urn correspond to the 20 cars, of
which M = 7 are "black" (i.e. polluting). When n = 5 are sampled, the distribution of X, the
number in the sample exceeding pollution control standards, is hypergeometric (N = 20, M = 7,
n = 5).
We can use this to calculate any probabilities of interest, where C(n, k) denotes the binomial
coefficient "n choose k":
P(X = x) = C(M, x) C(N − M, n − x) / C(N, n)
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
= C(7, 0)C(13, 5)/C(20, 5) + C(7, 1)C(13, 4)/C(20, 5) + C(7, 2)C(13, 3)/C(20, 5)
= 0.0830 + 0.3228 + 0.3874
= 0.7932
(c) We are given E(X) = 12.5. Here X is the number of tests completed when the 1st person with high
blood pressure is found, not the number of tests completed before the 1st person with high
blood pressure is found. So E(X − 1) = 11.5.
For a geometric distribution, E(X) = 1/p and E(X − 1) = (1 − p)/p. As a result, p = 1/12.5 = 0.08.
11.5 = (1 − p)/p
11.5p = 1 − p
12.5p = 1
p = 1/12.5 = 0.08
```
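
The arithmetic in the answers above can be cross-checked numerically. What follows is a minimal sketch in Python using only the standard library. The helper names (`discrete_moments`, `binom_pmf`, `poisson_pmf`, `hypergeom_pmf`, `upper_tail`, `two_mean_ci`) are illustrative and are not part of the original assignment. Normal probabilities come from `statistics.NormalDist` rather than a printed z table, so they can differ from the table-based answers in the third or fourth decimal place.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist


def discrete_moments(values, probs):
    """Return (mean, variance) of a discrete distribution given as parallel sequences."""
    mean = sum(x * p for x, p in zip(values, probs))
    second_moment = sum(x * x * p for x, p in zip(values, probs))
    return mean, second_moment - mean ** 2


def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)


def hypergeom_pmf(x, N, M, n):
    """P(X = x) when n items are drawn from N, of which M are 'successes'."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def upper_tail(x, mu, sigma):
    """P(X > x) for X ~ Normal(mu, sigma), sigma being the standard deviation."""
    return 1 - NormalDist(mu, sigma).cdf(x)


def two_mean_ci(x1, s1, n1, x2, s2, n2, level=0.95):
    """z-based confidence interval for mu1 - mu2 with known standard deviations."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    half_width = z * sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width


if __name__ == "__main__":
    # Question 1(b): mean and standard deviation of the number of calls
    mean, var = discrete_moments(range(6), [0.10, 0.15, 0.30, 0.20, 0.15, 0.10])
    print("E(X) =", round(mean, 4), " sigma =", round(sqrt(var), 4))

    # Question 1(c): binomial with n = 5, p = 0.9
    print("P(X <= 1) =", round(binom_pmf(0, 5, 0.9) + binom_pmf(1, 5, 0.9), 5))

    # Question 1(e): Poisson with lambda = 4
    print("P(X <= 3) =", round(sum(poisson_pmf(k, 4) for k in range(4)), 5))

    # Question 2(c)(i): sample mean of 30 Poisson(4.5) observations
    print("P(Xbar > 5) =", round(upper_tail(5, 4.5, sqrt(4.5 / 30)), 4))

    # Question 2(e): 95% interval for the difference of two means
    print("CI =", tuple(round(v, 2) for v in two_mean_ci(76, 8, 50, 68, 9, 65)))

    # Question 4(b): hypergeometric with N = 20, M = 7, n = 5
    print("P(X <= 2) =", round(sum(hypergeom_pmf(k, 20, 7, 5) for k in range(3)), 4))
```

Running the script prints each quantity so it can be compared line by line with the hand calculations; the functions can also be imported and called with other parameters to check the remaining parts.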