BBAC2

QUESTION 1

(a)
(i) A random variable X is a function whose domain is the sample space S and whose range is a set of real numbers; its value is determined by each element of the sample space.
(ii) A random variable X is discrete if the set of values it can take is finite or countably infinite.
(iii) A discrete probability function is a formula or table that lists all possible values a discrete random variable can take, together with their associated probabilities.

(b)
(i) $E(X) = \sum_{x} x\,P(X = x)$
$E(X) = \sum_{x=0}^{5} x\,P(X = x)$
$= 0(0.10) + 1(0.15) + 2(0.30) + 3(0.20) + 4(0.15) + 5(0.10)$
$= 0 + 0.15 + 0.6 + 0.6 + 0.6 + 0.5$
$= 2.45$ calls

(ii) $E(X^2) = \sum_{x} x^2\,P(X = x)$
$E(X^2) = \sum_{x=0}^{5} x^2\,P(X = x)$
$= 0^2(0.10) + 1^2(0.15) + 2^2(0.30) + 3^2(0.20) + 4^2(0.15) + 5^2(0.10)$
$= 0 + 0.15 + 1.2 + 1.8 + 2.4 + 2.5$
$= 8.05$

(iii) $\sigma = \sqrt{E(X^2) - [E(X)]^2} = \sqrt{8.05 - 2.45^2} = \sqrt{2.0475} \approx 1.43$ calls

(c) $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$, with $p = 0.9$, $1 - p = 0.1$, $n = 5$

(i) $P(X = 0) = \binom{5}{0} (0.9)^0 (0.1)^5 = 0.00001$

(ii) $P(X = 2) = \binom{5}{2} (0.9)^2 (0.1)^3 = 10(0.81)(0.001) = 0.0081$

(iii) $P(X \le 1) = P(X = 0) + P(X = 1)$
$= \binom{5}{0} (0.9)^0 (0.1)^5 + \binom{5}{1} (0.9)^1 (0.1)^4$
$= 0.00001 + 5(0.9)(0.0001)$
$= 0.00001 + 0.00045$
$= 0.00046$

(iv) $P(X \ge 2) = 1 - P(X \le 1) = 1 - 0.00046 = 0.99954$

(d) $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, $x = 0, 1, \ldots$, and $E(X) = Var(X) = \lambda$

(e) $\lambda = 4$

(i) $P(X = 0) = \frac{e^{-4} 4^0}{0!} = 0.018315639 \approx 0.01832$

(ii) $P(X = 4) = \frac{e^{-4} 4^4}{4!} = 0.195366815 \approx 0.19537$

(iii) $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$
$= 0.01832 + \frac{e^{-4} 4^1}{1!} + \frac{e^{-4} 4^2}{2!} + \frac{e^{-4} 4^3}{3!}$
$= 0.01832 + 0.07326 + 0.14653 + 0.19537$
$= 0.43348$

(iv) $P(X \ge 2) = 1 - P(X \le 1) = 1 - (0.01832 + 0.07326) = 1 - 0.09158 = 0.90842$

(v) $P(X \le 4) = P(X \le 3) + P(X = 4) = 0.43348 + 0.19537 \approx 0.6288$

QUESTION 2

(a)
i. The null hypothesis is the assumption that there is no difference between the specified populations; any observed difference is due to sampling or experimental error.
ii. The alternative hypothesis is the competing assumption that a difference does exist, for example that a new theory is true instead of an old one.
iii. The critical region is the set of values of the test statistic that leads to rejection of the null hypothesis at the chosen significance level.
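The definition of the critical region in (a) iii can be made concrete with a short numerical sketch. The example below assumes a two-sided test based on a standard normal statistic; the function names `two_sided_critical_value` and `in_critical_region`, and the sample statistics 2.3 and 1.0, are illustrative choices and do not come from the question paper.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha):
    """Return z such that each tail beyond +/-z has area alpha/2 under N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def in_critical_region(z_calc, alpha=0.05):
    """True when the statistic lies in the rejection region |z| > z_crit."""
    return abs(z_calc) > two_sided_critical_value(alpha)


# Boundary of the critical region at the 5% significance level
z_crit = two_sided_critical_value(0.05)
print(round(z_crit, 2))

# Illustrative statistics: one outside and one inside the boundaries
print(in_critical_region(2.3))
print(in_critical_region(1.0))
```

A statistic that lands in the critical region leads to rejection of the null hypothesis; otherwise the null hypothesis is not rejected.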
(b) The central limit theorem states that if repeated samples of size n are drawn from any infinite population with mean $\mu$ and variance $\sigma^2$, and n is large ($n \ge 30$), then the distribution of the sample mean $\bar{x}$ is approximately normal with mean $\mu$ (i.e. $E(\bar{x}) = \mu$) and variance $\frac{\sigma^2}{n}$ (i.e. $Var(\bar{x}) = \frac{\sigma^2}{n}$). The approximation becomes better as n becomes larger.

(c)
(i) $X \sim Po(4.5)$
$\mu = \lambda = 4.5$
$\sigma^2 = \lambda = 4.5$
By the central limit theorem, since n is large, $\bar{X}$ is approximately normal.
So $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, with $n = 30$
$\bar{X} \sim N\left(4.5, \frac{4.5}{30}\right)$
$\bar{X} \sim N(4.5, 0.15)$
$P(\bar{X} > 5) = P\left(Z > \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}\right)$
$= P\left(Z > \frac{5 - 4.5}{\sqrt{0.15}}\right)$
$= P(Z > 1.29)$
$= 0.5 - P(0 < Z < 1.29)$
$= 0.5 - 0.4015$
$= 0.0985$

(ii) $X \sim B(9, 0.5)$
$\mu = np = 9 \times 0.5 = 4.5$
$\sigma^2 = npq = 9 \times 0.5 \times 0.5 = 2.25$
By the central limit theorem, since n is large, $\bar{X}$ is approximately normal.
So $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, with $n = 30$
$\bar{X} \sim N\left(4.5, \frac{2.25}{30}\right)$
$\bar{X} \sim N(4.5, 0.075)$
$P(\bar{X} > 5) = P\left(Z > \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}\right)$
$= P\left(Z > \frac{5 - 4.5}{\sqrt{0.075}}\right)$
$= P(Z > 1.83)$
$= 0.5 - P(0 < Z < 1.83)$
$= 0.5 - 0.4664$
$= 0.0336$

(d) $n = 200$, $\mu = 75$, $\sigma = 15$, $z = \frac{x - \mu}{\sigma}$

(i) $P(X > 60) = P\left(Z > \frac{x - \mu}{\sigma}\right)$
$= P\left(Z > \frac{60 - 75}{15}\right)$
$= P(Z > -1)$
$= 0.5 + P(0 \le Z \le 1)$
$= 0.5 + 0.3413$
$= 0.8413$

(ii) Number of clinics $= 0.8413 \times 200 \approx 168$ clinics

(iii) $P(65 < X < 85) = P\left(\frac{65 - 75}{15} < Z < \frac{85 - 75}{15}\right)$
$= P(-0.67 < Z < 0.67)$
$= 2 \times P(0 < Z < 0.67)$
$= 2 \times 0.2486$
$= 0.4972$

(e) $\bar{x}_1 = 76$, $n_1 = 50$, $\sigma_1 = 8$
$\bar{x}_2 = 68$, $n_2 = 65$, $\sigma_2 = 9$

(i) Point estimate of $\mu_1 - \mu_2 = \bar{x}_1 - \bar{x}_2 = 76 - 68 = 8$

(ii) 95% confidence level
$AL = \frac{1 + CL}{2} = \frac{1 + 0.95}{2} = \frac{1.95}{2} = 0.975$; therefore, z for this level of confidence is $Z = 1.96$.
Confidence interval for $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$= 8 \pm 1.96\sqrt{\frac{8^2}{50} + \frac{9^2}{65}}$
$= 8 \pm 3.12$
$= (4.88, 11.12)$ years

QUESTION 3

(a)
(i) We find $\bar{x}_1$ and $\bar{x}_2$; if $\bar{x}_1 = \bar{x}_2$, then $E(x) = \mu$, the population mean.
$\bar{x}_1 = \frac{\sum f_1 x_1}{\sum f_1} = \frac{(18 \times 3) + (19 \times 7) + (20 \times 15) + (21 \times 10) + (22 \times 5)}{40} = \frac{807}{40} \approx 20$
$\bar{x}_2 = \frac{\sum f_2 x_2}{\sum f_2} = \frac{(18 \times 10) + (19 \times 21) + (20 \times 8) + (21 \times 6) + (22 \times 3) + (23 \times 2)}{50} = \frac{977}{50} \approx 20$
Since $E(x_1) = E(x_2) \approx 20$, the population mean is $\mu = 20$.

(ii) $s_1^2 = \frac{\sum f_1 (x_1 - \bar{x}_1)^2}{\sum f_1}$
$= \frac{3(18 - 20)^2 + 7(19 - 20)^2 + 15(20 - 20)^2 + 10(21 - 20)^2 + 5(22 - 20)^2}{40}$
$= \frac{49}{40} \approx 1.23$

(iii) $s_2^2 = \frac{\sum f_2 (x_2 - \bar{x}_2)^2}{\sum f_2}$
$= \frac{10(18 - 20)^2 + 21(19 - 20)^2 + 8(20 - 20)^2 + 6(21 - 20)^2 + 3(22 - 20)^2 + 2(23 - 20)^2}{50}$
$= \frac{97}{50} = 1.94$
Since $s_1^2$ is the smaller of the two, the estimate of the population $\sigma^2 = \frac{s_1^2}{n} = \frac{1.23}{40} = 0.03075$

(b) $H_0: \mu = 130$
$H_A: \mu \ne 130$
The critical value for $\alpha = 0.05$: the area of each shaded "tail" of the standard normal curve is 0.025, and the corresponding Z scores (Z tabulated) at the boundaries are $\pm 1.96$.
Sample: $n = 64$, $\bar{X} = 132$
The Z score for the random sample of 64 persons of the village aged 20 to 40 years:
$Z_{calc} = \frac{132 - 130}{10 / \sqrt{64}} = \frac{2}{1.25} = 1.6$
This score falls inside the "fail to reject" region from $-1.96$ to $+1.96$. Hence, the null hypothesis is not rejected. That is, there is no significant evidence that the mean systolic blood pressure of persons (aged 20 to 40) living in village x differs from the mean systolic blood pressure of the inhabitants (aged 20 to 40) of the district.

(c) $p = 0.4$, $q = 0.6$, $n = 150$

(i) Sample proportion $= \frac{60}{150} = 0.4$
$\sigma_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.4 \times 0.6}{150}} = 0.04$
Confidence interval $= p \pm z\,\sigma_p$
For a 99% confidence interval:
$AL = \frac{1 + CL}{2} = \frac{1 + 0.99}{2} = \frac{1.99}{2} = 0.995$; therefore, z for this level of confidence is $Z = \frac{2.57 + 2.58}{2} = 2.575$
Confidence interval $= 0.4 \pm 2.575(0.04) = 0.4 \pm 0.103 = (0.297, 0.503)$

(ii) The epidemiologist needs to worry because the confidence interval for the proportion of people infected with malaria is wide (0.297 to 0.503).

QUESTION 4

(a)
Uniform Probability Density Function

(i) $f(x) = \frac{1}{b - a}$ for $a \le x \le b$
$= 0$ elsewhere
$f(x) = \frac{1}{10}$ for $5 \le x \le 15$
$= 0$ elsewhere
Where: x = salad plate filling weight

(ii) For an interval of width 3 within $5 \le x \le 15$, the probability is $\frac{1}{10}(3) = 0.3$

(iii) Expected value of x
$E(x) = \frac{a + b}{2} = \frac{5 + 15}{2} = 10$

(iv) Variance of x
$Var(x) = \frac{(b - a)^2}{12} = \frac{(15 - 5)^2}{12} = 8.33$

(b) This is like sampling from an urn. The N = 20 "balls" in the urn correspond to the 20 cars, of which M = 7 are "black" (i.e. polluting). When n = 5 are sampled, X, the number in the sample exceeding pollution control standards, has a hypergeometric (N = 20, M = 7, n = 5) distribution. We can use this to calculate any probabilities of interest.
$P(X = x) = \frac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}}$
$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$
$= \frac{\binom{7}{0}\binom{13}{5}}{\binom{20}{5}} + \frac{\binom{7}{1}\binom{13}{4}}{\binom{20}{5}} + \frac{\binom{7}{2}\binom{13}{3}}{\binom{20}{5}}$
$= 0.0830 + 0.3228 + 0.3874$
$= 0.7932$

(c) We are given E(X) = 12.5. X is the number of tests completed when the 1st person with high blood pressure is found; it is not the number of tests completed before the 1st person with high blood pressure is found. So E(X - 1) = 11.5. For a geometric distribution, $E(X - 1) = \frac{1 - p}{p}$.
$\frac{1 - p}{p} = 11.5$
$11.5p = 1 - p$
$12.5p = 1$
$p = \frac{1}{12.5} = 0.08$
As a result, p = 1/12.5 = 0.08.
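The hypergeometric sum in (b) and the value of p in (c) can be cross-checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `hypergeom_pmf` and the variable names are illustrative and are not part of the original answers.

```python
from math import comb


def hypergeom_pmf(x, N, M, n):
    """P(X = x) when n items are drawn without replacement from N items,
    of which M are of the type being counted."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


# Question 4(b): N = 20 cars, M = 7 polluting, n = 5 sampled
terms = [hypergeom_pmf(x, 20, 7, 5) for x in range(3)]
p_at_most_2 = sum(terms)

# Question 4(c): for the number of trials up to and including the first
# success, E(X) = 1/p, so p = 1/E(X)
p_geometric = 1 / 12.5

print([round(t, 4) for t in terms])
print(round(p_at_most_2, 4))
print(p_geometric)
```

The three terms and their sum should agree with the hand calculation to four decimal places, and the geometric parameter follows directly from the given mean.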