Chapter #6 – Functions of Random Variables
Question #2: Let $X \sim \mathrm{UNIF}(0,1)$. Use the CDF technique to find the PDF of the following random variables: a) $Y = X^{1/4}$, b) $W = e^{-X}$, c) $Z = 1 - e^{-X}$, and d) $U = X(1-X)$.
a) Since $X \sim \mathrm{UNIF}(0,1)$, we know that the density function is
$$f_X(x) = \begin{cases} 1 & \text{if } x \in (0,1) \\ 0 & \text{otherwise,} \end{cases}$$
while the distribution function is
$$F_X(x) = \begin{cases} 0 & \text{if } x \in (-\infty,0] \\ x & \text{if } x \in (0,1) \\ 1 & \text{if } x \in [1,\infty). \end{cases}$$
We can then use the CDF technique to find $F_Y(y) = P(Y \le y) = P(X^{1/4} \le y) = P(X \le y^4) = F_X(y^4)$, so that $f_Y(y) = \frac{d}{dy}F_Y(y) = \frac{d}{dy}F_X(y^4) = f_X(y^4)\,4y^3 = (1)4y^3$. Since $0 < x < 1$, the bounds are $0 < y^4 < 1$, or $0 < y < 1$. Therefore,
$$f_Y(y) = \begin{cases} 4y^3 & \text{if } y \in (0,1) \\ 0 & \text{otherwise.} \end{cases}$$
b) We have $F_W(w) = P(W \le w) = P(e^{-X} \le w) = P(X \ge -\ln(w)) = 1 - F_X(-\ln(w))$, so that $f_W(w) = \frac{d}{dw}F_W(w) = \frac{d}{dw}[1 - F_X(-\ln(w))] = -f_X(-\ln(w))\left(-\frac{1}{w}\right) = \frac{1}{w}$. Since $0 < x < 1$, the bounds are $0 < -\ln(w) < 1$, or $e^{-1} < w < 1$. The probability density function of the random variable $W$ is therefore
$$f_W(w) = \begin{cases} \frac{1}{w} & \text{if } w \in (e^{-1},1) \\ 0 & \text{otherwise.} \end{cases}$$
c) By the CDF technique, we have that $F_Z(z) = P(Z \le z) = P(1 - e^{-X} \le z) = P(e^{-X} \ge 1-z) = P(X \le -\ln(1-z)) = F_X(-\ln(1-z))$, so we have that $f_Z(z) = \frac{d}{dz}F_Z(z) = \frac{d}{dz}F_X(-\ln(1-z)) = f_X(-\ln(1-z))\left(\frac{1}{1-z}\right) = \frac{1}{1-z}$. Since $0 < x < 1$, the bounds are $0 < -\ln(1-z) < 1$, or $0 < z < 1 - e^{-1}$. Therefore, the probability density function of this random variable is
$$f_Z(z) = \begin{cases} \frac{1}{1-z} & \text{if } z \in (0, 1-e^{-1}) \\ 0 & \text{otherwise.} \end{cases}$$
d) We will use Theorem 6.3.2 for this question. Suppose that $X$ is a continuous random variable with density $f_X(x)$ and assume that $Y = u(X)$ is a one-to-one transformation with inverse $x = w(y)$. If $\frac{d}{dy}w(y)$ is continuous and nonzero, the density of $Y$ is given by $f_Y(y) = f_X(w(y))\left|\frac{d}{dy}w(y)\right|$. Here, the transformation is $u = x(1-x) = -x^2 + x$ and, based on the $y$-values of the graph over the $x$-values of $(0,1)$, the range over which the density of $U$ is defined is $(0,\tfrac14)$. Since this transformation is not one-to-one, we must partition the interval $(0,1)$ into two parts: $(0,\tfrac12)$ and $(\tfrac12,1)$. Then solve the transformation by completing the square to obtain $u = -x^2 + x \;\rightarrow\; -u = x^2 - x \;\rightarrow\; \tfrac14 - u = x^2 - x + \tfrac14 \;\rightarrow\; \tfrac14 - u = \left(x - \tfrac12\right)^2 \;\rightarrow\; x - \tfrac12 = \pm\sqrt{\tfrac14 - u} \;\rightarrow\; x = \tfrac12 \pm \sqrt{\tfrac14 - u}$. Since $x = w(u) = \tfrac12 \pm \sqrt{\tfrac14 - u}$, we have $w'(u) = \pm\tfrac12\left(\tfrac14 - u\right)^{-1/2}(-1) = \frac{\mp 1}{2\sqrt{\tfrac14 - u}}$. Thus,
$$f_U(u) = f_X\!\left(\tfrac12 + \sqrt{\tfrac14 - u}\right)|w'(u)| + f_X\!\left(\tfrac12 - \sqrt{\tfrac14 - u}\right)|w'(u)| = (1+1)\frac{1}{2\sqrt{\tfrac14 - u}} = \left(\tfrac14 - u\right)^{-1/2},$$
so we can conclude that the density function is
$$f_U(u) = \begin{cases} \left(\tfrac14 - u\right)^{-1/2} & \text{if } u \in (0,\tfrac14) \\ 0 & \text{otherwise.} \end{cases}$$
Question #3: The measured radius $R$ of a circle has PDF $f_R(r) = 6r(1-r)$ if $r \in (0,1)$ and $f_R(r) = 0$ otherwise. Find the distribution of a) the circumference and b) the area of the circle.

a) The circumference of a circle is given by $C = 2\pi R$, so by the CDF technique we have that $F_C(c) = P(C \le c) = P(2\pi R \le c) = P\!\left(R \le \frac{c}{2\pi}\right) = F_R\!\left(\frac{c}{2\pi}\right)$. The density is thus $f_C(c) = \frac{d}{dc}F_C(c) = \frac{d}{dc}F_R\!\left(\frac{c}{2\pi}\right) = f_R\!\left(\frac{c}{2\pi}\right)\left(\frac{1}{2\pi}\right) = 6\left(\frac{c}{2\pi}\right)\left(1 - \frac{c}{2\pi}\right)\left(\frac{1}{2\pi}\right) = \cdots = \frac{6c(2\pi - c)}{(2\pi)^3}$. Since $0 < r < 1$ we have $0 < \frac{c}{2\pi} < 1$, so that $0 < c < 2\pi$. Therefore, the probability density function of the circumference is given by
$$f_C(c) = \begin{cases} \frac{6c(2\pi - c)}{(2\pi)^3} & \text{if } c \in (0,2\pi) \\ 0 & \text{otherwise.} \end{cases}$$
b) The area is given by $A = \pi R^2$, so we have $F_A(a) = P(A \le a) = P(\pi R^2 \le a) = P\!\left(R^2 \le \frac{a}{\pi}\right) = P\!\left(|R| \le \sqrt{\frac{a}{\pi}}\right) = P\!\left(-\sqrt{\frac{a}{\pi}} \le R \le \sqrt{\frac{a}{\pi}}\right) = F_R\!\left(\sqrt{\frac{a}{\pi}}\right) - F_R\!\left(-\sqrt{\frac{a}{\pi}}\right)$. Thus, we have $f_A(a) = \frac{d}{da}F_A(a) = \frac{d}{da}\left[F_R\!\left(\sqrt{\frac{a}{\pi}}\right) - F_R\!\left(-\sqrt{\frac{a}{\pi}}\right)\right] = \cdots = \frac{3(\sqrt{\pi} - \sqrt{a})}{\pi^{3/2}}$. Since we have $0 < r < 1$, then $0 < \sqrt{\frac{a}{\pi}} < 1$, so that $0 < a < \pi$ and
$$f_A(a) = \begin{cases} \frac{3(\sqrt{\pi} - \sqrt{a})}{\pi^{3/2}} & \text{if } a \in (0,\pi) \\ 0 & \text{otherwise.} \end{cases}$$
Question #10: Suppose $X$ has density $f_X(x) = \frac{1}{2}e^{-|x|}$ for all $x \in \mathbb{R}$. a) Find the density of the random variable $Y = |X|$. b) If $W = 0$ when $X \le 0$ and $W = 1$ when $X > 0$, find the CDF of $W$.

a) We have $F_Y(y) = P(Y \le y) = P(|X| \le y) = P(-y \le X \le y) = F_X(y) - F_X(-y)$, so that we obtain $f_Y(y) = \frac{d}{dy}F_Y(y) = \frac{d}{dy}[F_X(y) - F_X(-y)] = f_X(y)(1) - f_X(-y)(-1) = \frac{1}{2}e^{-|y|} + \frac{1}{2}e^{-|-y|} = e^{-|y|}$. Since the transformation was an absolute value function and we have the bounds $-\infty < x < \infty$, the bounds become $0 < y < \infty$. This allows us to write the probability density function of $Y$ as
$$f_Y(y) = \begin{cases} e^{-y} & \text{if } y \in (0,\infty) \\ 0 & \text{otherwise.} \end{cases}$$

b) We see that $P(W = 0) = \frac{1}{2}$ and $P(W = 1) = \frac{1}{2}$ since $f_X(x)$ is symmetric about zero. This allows us to write the cumulative distribution function as
$$F_W(w) = \begin{cases} 0 & \text{if } w \in (-\infty,0) \\ \tfrac12 & \text{if } w \in [0,1) \\ 1 & \text{if } w \in [1,\infty). \end{cases}$$
Question #13: Suppose $X$ has density $f_X(x) = \frac{1}{24}x^2$ for $x \in (-2,4)$ and $f_X(x) = 0$ otherwise. Find the probability density function of the random variable $Y = X^2$.

• We will use Theorem 6.3.2 for this question: Suppose that $X$ is a continuous random variable with density $f_X(x)$ and assume that $Y = u(X)$ is a one-to-one transformation with inverse $x = w(y)$. If $\frac{d}{dy}w(y)$ is continuous and nonzero, the density of $Y$ is given by $f_Y(y) = f_X(w(y))\left|\frac{d}{dy}w(y)\right|$. Here, the transformation is $y = x^2$ and, based on the $y$-values of the graph over the $x$-values of $(-2,4)$, the range over which the density of $Y$ is defined is $(0,16)$. Solving the transformation gives $x = w(y) = \pm\sqrt{y}$, so that $w'(y) = \pm\frac{1}{2\sqrt{y}}$. We must consider two cases on the interval $(0,16)$: over $(0,4)$ the transformation is not one-to-one, since both $-\sqrt{y}$ and $\sqrt{y}$ lie in $(-2,4)$, while over $(4,16)$ it is one-to-one, since only $\sqrt{y}$ lies in $(-2,4)$. The density is thus
$$f_Y(y) = \begin{cases} f_X(-\sqrt{y})|w'(y)| + f_X(\sqrt{y})|w'(y)| & \text{if } y \in (0,4) \\ f_X(\sqrt{y})|w'(y)| & \text{if } y \in [4,16) \\ 0 & \text{otherwise} \end{cases} = \begin{cases} \frac{\sqrt{y}}{24} & \text{if } y \in (0,4) \\ \frac{\sqrt{y}}{48} & \text{if } y \in [4,16) \\ 0 & \text{otherwise.} \end{cases}$$
Question #16: Let $X_1$ and $X_2$ be independent random variables, each having density function $f_{X_i}(x_i) = \frac{1}{x_i^2}$ for $x_i \in [1,\infty)$ and $f_{X_i}(x_i) = 0$ otherwise. a) Find the joint PDF of $U = X_1X_2$ and $V = X_1$. b) Find the marginal probability density function of the random variable $U$.

a) Since $X_1$ and $X_2$ are independent, their joint density is simply the product of their marginal densities, so $f_{X_1X_2}(x_1,x_2) = \left(\frac{1}{x_1^2}\right)\left(\frac{1}{x_2^2}\right) = \frac{1}{(x_1x_2)^2}$ whenever $(x_1,x_2) \in [1,\infty)\times[1,\infty)$ and zero otherwise. We will use Theorem 6.3.6, which says that if $u = u(x_1,x_2)$ and $v = v(x_1,x_2)$ and we can solve uniquely for $x_1$ and $x_2$, then $f_{UV}(u,v) = f_{X_1X_2}(x_1(u,v), x_2(u,v))|J|$, where $J = \det\begin{bmatrix} \frac{\partial x_1}{\partial u} & \frac{\partial x_1}{\partial v} \\ \frac{\partial x_2}{\partial u} & \frac{\partial x_2}{\partial v}\end{bmatrix}$. Here, we have $v = x_1$ so $x_1 = v$, and $u = x_1x_2$ so $x_2 = \frac{u}{x_1} = \frac{u}{v}$, so we can calculate the Jacobian as $J = \det\begin{bmatrix} 0 & 1 \\ \frac{1}{v} & -\frac{u}{v^2}\end{bmatrix} = -\frac{1}{v}$. We can therefore find the joint density as $f_{UV}(u,v) = f_{X_1X_2}\!\left(v,\frac{u}{v}\right)\left|-\frac{1}{v}\right| = \frac{1}{\left(v\cdot\frac{u}{v}\right)^2}\cdot\frac{1}{v} = \frac{1}{u^2v}$ if $1 < v < u < \infty$. We can find this region by substituting into the constraints of $f_{X_1X_2}(x_1,x_2)$: the first is $1 < x_1 < \infty$, so $1 < v < \infty$, while the second is $1 < x_2 < \infty$, so $1 < \frac{u}{v} < \infty$, which reduces to $v < u < \infty$. Combining these gives the required bounds $1 < v < u < \infty$.

b) We have $f_U(u) = \int_1^u f_{UV}(u,v)\,dv = \int_1^u \frac{1}{u^2v}\,dv = \left[\frac{1}{u^2}\ln(v)\right]_1^u = \frac{\ln(u)}{u^2}$ if $1 < u < \infty$.
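As an informal check of the marginal $f_U(u) = \ln(u)/u^2$ (added by us; not part of the original solution), note that a variable with density $1/x^2$ on $[1,\infty)$ can be simulated as $X = 1/U$ with $U \sim \mathrm{UNIF}(0,1)$, since $P(1/U \le x) = 1 - 1/x$.

```python
import numpy as np

rng = np.random.default_rng(3)
x1 = 1.0 / rng.uniform(size=500_000)    # X with CDF 1 - 1/x on [1, inf)
x2 = 1.0 / rng.uniform(size=500_000)
u = x1 * x2                              # U = X1 * X2

for point in (2.0, 5.0, 10.0):
    est = np.mean(np.abs(u - point) < 0.05) / 0.1
    theory = np.log(point) / point ** 2
    print(f"u={point}: simulated {est:.4f} vs ln(u)/u^2 = {theory:.4f}")
```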
Question #18: Let $X$ and $Y$ have joint density function $f_{XY}(x,y) = e^{-y}$ for $0 < x < y < \infty$ and $f_{XY}(x,y) = 0$ otherwise. a) Find the joint density function of $S = X+Y$ and $T = X$. b) Find the marginal density function of $S$. c) Find the marginal density function of $T$.

a) We have that $t = x$ so $x = t$, and $s = x+y$ so $y = s - x = s - t$, so we can calculate the Jacobian as $J = \det\begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t}\end{bmatrix} = \det\begin{bmatrix} 0 & 1 \\ 1 & -1 \end{bmatrix} = -1$. We can therefore find the joint density as $f_{ST}(s,t) = f_{XY}(t, s-t)\,|-1| = (e^{-(s-t)})(1) = e^{t-s}$ if $0 < t < \frac{s}{2} < \infty$. We can find these bounds by substituting the solved transformations into the bounds $0 < x < y < \infty$, which gives $0 < t < s-t < \infty$, or $0 < 2t < s < \infty$, or $0 < t < \frac{s}{2} < \infty$.

b) We have that $f_S(s) = \int_0^{s/2} f_{ST}(s,t)\,dt = \int_0^{s/2} e^{t-s}\,dt = \int_0^{s/2} e^te^{-s}\,dt = \left[e^te^{-s}\right]_0^{s/2} = e^{-s}\left(e^{s/2} - 1\right)$ if $0 < s < \infty$.

c) We have that $f_T(t) = \int_{2t}^{\infty} f_{ST}(s,t)\,ds = \int_{2t}^{\infty} e^{t-s}\,ds = \int_{2t}^{\infty} e^te^{-s}\,ds = \left[-e^te^{-s}\right]_{2t}^{\infty} = -e^t(0 - e^{-2t}) = e^{-t}$ if $0 < t < \infty$. Note that we have omitted the steps where the infinite limit of integration is replaced by a parameter and a limit to infinity is evaluated to show that the boundary term goes to zero.
Question #21: Let $X$ and $Y$ have joint density $f_{XY}(x,y) = 2(x+y)$ for $0 < x < y < 1$ and $f_{XY}(x,y) = 0$ otherwise. a) Find the joint probability density function of $S = X$ and $T = XY$. b) Find the marginal probability density function of the random variable $T$.

a) We have that $s = x$ so $x = s$, and $t = xy$ so $y = \frac{t}{x} = \frac{t}{s}$, so we can calculate the Jacobian as $J = \det\begin{bmatrix} \frac{\partial x}{\partial s} & \frac{\partial x}{\partial t} \\ \frac{\partial y}{\partial s} & \frac{\partial y}{\partial t}\end{bmatrix} = \det\begin{bmatrix} 1 & 0 \\ -\frac{t}{s^2} & \frac{1}{s}\end{bmatrix} = \frac{1}{s}$. The joint probability density function is thus $f_{ST}(s,t) = f_{XY}\!\left(s,\frac{t}{s}\right)\left|\frac{1}{s}\right| = 2\left(s + \frac{t}{s}\right)\left(\frac{1}{s}\right) = 2\left(1 + \frac{t}{s^2}\right)$. The region is then found by substituting into $0 < x < y < 1$, so we have $0 < s < \frac{t}{s} < 1$, or $0 < s^2 < t < s < 1$. This can be visualized as the region between $t = s$ and $t = s^2$ in the $st$ plane.

b) The marginal probability density function is given by $f_T(t) = \int_t^{\sqrt{t}} f_{ST}(s,t)\,ds = \int_t^{\sqrt{t}} 2\left(1 + \frac{t}{s^2}\right)ds = \left[2s - \frac{2t}{s}\right]_t^{\sqrt{t}} = \left(2\sqrt{t} - 2\sqrt{t}\right) - (2t - 2) = 2 - 2t$ if $0 < t < 1$.
Question #25: Let $X_1, X_2, X_3, X_4$ be independent random variables. Assume that $X_2, X_3, X_4$ are each distributed Poisson with parameter 5 and that the random variable $Y = X_1 + X_2 + X_3 + X_4$ is distributed Poisson with parameter 25. a) What is the distribution of the random variable $X_1$? b) What is the distribution of the random variable $W = X_1 + X_2$?

a) We first note that while $X_1, X_2, X_3, X_4$ are independent, they are not iid since only $X_2, X_3, X_4 \sim \mathrm{POI}(5)$, with $X_1$ not being listed. Thus, we must use the unsimplified formula 6.4.4, which says that if $X_1, \dots, X_n$ are independent random variables with moment generating functions $M_{X_i}(t)$ and $Y = X_1 + \cdots + X_n$, then the moment generating function of $Y$ is $M_Y(t) = [M_{X_1}(t)]\cdots[M_{X_n}(t)]$. We use the fact that if $X \sim \mathrm{POI}(\mu)$, then $M_X(t) = e^{\mu(e^t-1)}$. If we let $Y = X_1 + X_2 + X_3 + X_4$, we have $M_Y(t) = [M_{X_1}(t)][M_{X_2}(t)][M_{X_3}(t)][M_{X_4}(t)] = e^{25(e^t-1)}$, since $Y \sim \mathrm{POI}(25)$. Substituting gives $[M_{X_1}(t)][e^{5(e^t-1)}][e^{5(e^t-1)}][e^{5(e^t-1)}] = e^{25(e^t-1)}$, so that $[M_{X_1}(t)][e^{5(e^t-1)}]^3 = e^{25(e^t-1)}$ and $[M_{X_1}(t)] = \frac{e^{25(e^t-1)}}{[e^{5(e^t-1)}]^3} = \frac{e^{25(e^t-1)}}{e^{15(e^t-1)}} = e^{10(e^t-1)}$. This is the moment generating function of a Poisson(10) random variable, so $X_1 \sim \mathrm{POI}(10)$.

b) We have $M_W(t) = [M_{X_1}(t)][M_{X_2}(t)] = [e^{10(e^t-1)}][e^{5(e^t-1)}] = e^{15(e^t-1)}$, which is the moment generating function of a Poisson(15) random variable, so $W \sim \mathrm{POI}(15)$. We can see a general pattern here: if $X_i \sim \mathrm{POI}(\mu)$ for $i = 1, \dots, n$ are independent random variables and we define $Z = \sum_{i=1}^{k} X_i$ for $k \le n$, then $Z \sim \mathrm{POI}(k\mu)$.
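The MGF argument can be illustrated numerically (our own sketch, not part of the original solution): if $X_1 \sim \mathrm{POI}(10)$ and $X_2, X_3, X_4 \sim \mathrm{POI}(5)$ independently, their sum should behave like a $\mathrm{POI}(25)$ variable.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
y = (rng.poisson(10, n) + rng.poisson(5, n)
     + rng.poisson(5, n) + rng.poisson(5, n))     # X1 + X2 + X3 + X4
print("mean", y.mean(), "variance", y.var())       # both should be close to 25
print("P(Y = 25) simulated:", np.mean(y == 25))    # compare with the POI(25) pmf at 25 (about 0.08)
```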
Question #17: Suppose that $X_1$ and $X_2$ denote a random sample of size 2 from a gamma distribution such that $X_i \sim \mathrm{GAM}(2, \tfrac12)$. Find the PDF of a) $Y = \sqrt{X_1 + X_2}$ and b) $W = \frac{X_1}{X_2}$.

a) We know that if $X \sim \mathrm{GAM}(\theta,\kappa)$, then $f_X(x) = \frac{1}{\theta^\kappa\Gamma(\kappa)}x^{\kappa-1}e^{-x/\theta}$ if $x > 0$. Since $X_1$ and $X_2$ are independent, their joint density function is given by $f_{X_1X_2}(x_1,x_2) = f_{X_1}(x_1)f_{X_2}(x_2) = \left(\frac{1}{\sqrt{2}\,\Gamma(\frac12)}x_1^{-1/2}e^{-x_1/2}\right)\left(\frac{1}{\sqrt{2}\,\Gamma(\frac12)}x_2^{-1/2}e^{-x_2/2}\right) = \frac{1}{2\pi}\frac{1}{\sqrt{x_1}\sqrt{x_2}}e^{-(x_1+x_2)/2}$ if $x_1, x_2 > 0$, since $\Gamma\!\left(\frac12\right) = \sqrt{\pi}$. We have the transformation $y = \sqrt{x_1 + x_2}$ and generate another transformation $w = x_1$, so we have $x_1 = w$ and $x_2 = y^2 - w$, which allows us to find $J = \det\begin{bmatrix} 1 & 0 \\ -1 & 2y\end{bmatrix} = 2y$. Then the joint density of $W$ and $Y$ is given by $f_{WY}(w,y) = f_{X_1X_2}(w, y^2-w)|J| = \frac{1}{\pi}\frac{1}{\sqrt{w}\sqrt{y^2-w}}\,y\,e^{-y^2/2}$ if $w > 0$ and $y^2 - w > 0$; these bounds can be combined to give $0 < w < y^2$. Finally, the density of $Y$ is $f_Y(y) = \int_{-\infty}^{\infty} f_{WY}(w,y)\,dw = \frac{1}{\pi}\,y\,e^{-y^2/2}\int_0^{y^2}\frac{1}{\sqrt{w}\sqrt{y^2-w}}\,dw = \cdots = y\,e^{-y^2/2}$ if $y > 0$ and zero otherwise. The evaluation of this integral has been omitted, but can be computed by two substitutions.

b) We have $w = \frac{x_1}{x_2}$ and generate $z = x_1$, so that $x_1 = z$ and $x_2 = \frac{z}{w}$, which allows us to calculate $J = \det\begin{bmatrix} 1 & 0 \\ \frac{1}{w} & -\frac{z}{w^2}\end{bmatrix} = -\frac{z}{w^2}$. Then we have $f_{ZW}(z,w) = f_{X_1X_2}\!\left(z,\frac{z}{w}\right)|J| = \frac{1}{2\pi}\frac{1}{\sqrt{z}\sqrt{z/w}}e^{-(z + z/w)/2}\cdot\frac{z}{w^2} = \frac{1}{2\pi}\frac{1}{w^{3/2}}e^{-(z+z/w)/2}$ if $z, w > 0$, so that the density of $W$ is given by $f_W(w) = \int_{-\infty}^{\infty} f_{ZW}(z,w)\,dz = \frac{1}{2\pi w^{3/2}}\int_0^{\infty} e^{-(z+z/w)/2}\,dz = \cdots = \frac{1}{\pi}\frac{1}{(w+1)\sqrt{w}}$ if $w > 0$. The evaluation of this integral has been omitted, but can be computed by substitution.
Question #26: Let $X_1$ and $X_2$ be independent negative binomial random variables such that $X_1 \sim \mathrm{NB}(r_1, p)$ and $X_2 \sim \mathrm{NB}(r_2, p)$. a) Find the MGF and distribution of $Y = X_1 + X_2$.

• We use Theorem 6.4.3, which says that if the random variables $X_i$ are independent with respective MGFs $M_{X_i}(t)$, then the MGF of the random variable that is their sum is simply the product of their respective MGFs. Also, if a discrete random variable $X \sim \mathrm{NB}(r,p)$, then $M_X(t) = \left(\frac{pe^t}{1-qe^t}\right)^r$, where $q = 1-p$. Therefore, the moment generating function of $Y$ is $M_Y(t) = [M_{X_1}(t)][M_{X_2}(t)] = \left(\frac{pe^t}{1-qe^t}\right)^{r_1}\left(\frac{pe^t}{1-qe^t}\right)^{r_2} = \left(\frac{pe^t}{1-qe^t}\right)^{r_1+r_2}$. This then allows us to determine the distribution of $Y = X_1 + X_2$, namely that $Y \sim \mathrm{NB}(r_1+r_2, p)$.
Question #27: Recall that $Y \sim \mathrm{LOGN}(\mu,\sigma^2)$ if $\ln(Y) \sim N(\mu,\sigma^2)$. Assume that $Y_i \sim \mathrm{LOGN}(\mu_i,\sigma_i^2)$ for $i = 1, \dots, n$ are independent. Find the distributions of the following random variables: a) $A = \prod_{i=1}^n Y_i$, b) $B = \prod_{i=1}^n Y_i^{a_i}$, c) $C = \frac{Y_1}{Y_2}$, and d) find $E(A) = E\!\left(\prod_{i=1}^n Y_i\right)$.

a) We have that $\ln(A) = \ln\!\left(\prod_{i=1}^n Y_i\right) = \sum_{i=1}^n\ln(Y_i) = \ln(Y_1) + \cdots + \ln(Y_n)$, so the random variable $\ln(A)$ is the sum of $n$ independent normally distributed random variables. This implies that $\ln(A) \sim N\!\left(\sum_{i=1}^n\mu_i, \sum_{i=1}^n\sigma_i^2\right)$, which means $A \sim \mathrm{LOGN}\!\left(\sum_{i=1}^n\mu_i, \sum_{i=1}^n\sigma_i^2\right)$.

b) The random variable $\ln(B) = \ln\!\left(\prod_{i=1}^n Y_i^{a_i}\right) = \sum_{i=1}^n\ln(Y_i^{a_i}) = \sum_{i=1}^n a_i\ln(Y_i) = a_1\ln(Y_1) + \cdots + a_n\ln(Y_n)$. We use that if $X \sim N(\mu,\sigma^2)$, then $aX \sim N(a\mu, a^2\sigma^2)$, to conclude that $\ln(B) \sim N\!\left(\sum_{i=1}^n a_i\mu_i, \sum_{i=1}^n a_i^2\sigma_i^2\right)$, so $B \sim \mathrm{LOGN}\!\left(\sum_{i=1}^n a_i\mu_i, \sum_{i=1}^n a_i^2\sigma_i^2\right)$.

c) We have that $\ln(C) = \ln\!\left(\frac{Y_1}{Y_2}\right) = \ln(Y_1) - \ln(Y_2)$, so the random variable $\ln(C)$ is the difference of two independent normally distributed random variables. Thus, $\ln(C) \sim N(\mu_1 - \mu_2, \sigma_1^2 + \sigma_2^2)$, which implies that the distribution of $C = \frac{Y_1}{Y_2}$ is $C \sim \mathrm{LOGN}(\mu_1 - \mu_2, \sigma_1^2 + \sigma_2^2)$.

d) For $X \sim N(\mu,\sigma^2)$, we have $M_X(t) = E(e^{tX}) = e^{\mu t + \sigma^2t^2/2}$, and for $Z \sim N(0,1)$, we have $M_Z(t) = E(e^{tZ}) = e^{t^2/2}$. Thus, the expected value is given by $E(Y_i) = E(e^{\mu_i + \sigma_i Z}) = e^{\mu_i + \sigma_i^2/2}$. Since the random variables $Y_i$ are all independent, we therefore have that $E(A) = E\!\left(\prod_{i=1}^n Y_i\right) = \prod_{i=1}^n E(Y_i) = \prod_{i=1}^n e^{\mu_i + \sigma_i^2/2} = \exp\!\left\{\sum_{i=1}^n\mu_i + \sum_{i=1}^n\sigma_i^2/2\right\}$.
Question #28: Let $X_1$ and $X_2$ be a random sample of size 2 from a continuous distribution with PDF of the form $f_X(x) = 2x$ if $0 < x < 1$ and zero otherwise. a) Find the marginal densities of $Y_1$ and $Y_2$, the smallest and largest order statistics, b) find the joint probability density function of $Y_1$ and $Y_2$, and c) find the density of the sample range $R = Y_2 - Y_1$.

a) Since $f_X(x) = \begin{cases} 2x & \text{if } x \in (0,1) \\ 0 & \text{otherwise,}\end{cases}$ we know that $F_X(x) = \begin{cases} 0 & \text{if } x \in (-\infty,0] \\ x^2 & \text{if } x \in (0,1) \\ 1 & \text{if } x \in [1,\infty).\end{cases}$ Then from Theorem 6.5.2, we have that $g_1(y_1) = nf_X(y_1)[1 - F_X(y_1)]^{n-1}$, so we can calculate the density of the smallest order statistic as $g_1(y_1) = 2[2y_1][1 - y_1^2]^{2-1} = 4y_1 - 4y_1^3$ whenever $y_1 \in (0,1)$. Similarly, $g_n(y_n) = nf_X(y_n)[F_X(y_n)]^{n-1}$, so we can calculate the density of the largest order statistic as $g_2(y_2) = 2[2y_2][y_2^2]^{2-1} = 4y_2^3$ whenever $y_2 \in (0,1)$.

b) From Theorem 6.5.1, the joint probability density function of the order statistics is $g_{\mathbf{Y}}(y_1,\dots,y_n) = n!\,f_X(y_1)\cdots f_X(y_n)$ for $y_1 < \cdots < y_n$. In this question, we have that $g_{Y_1Y_2}(y_1,y_2) = 2!\,f_X(y_1)f_X(y_2) = 2(2y_1)(2y_2) = 8y_1y_2$ whenever $0 < y_1 < y_2 < 1$.

c) We first find the joint density of the smallest and largest order statistics in order to make a transformation to get the marginal density of the sample range. From the work we did above, we have that $g_{Y_1Y_2}(y_1,y_2) = 8y_1y_2$. We have the transformation $r = y_2 - y_1$ and generate $s = y_1$, so we have $y_1 = s$ and $y_2 = r + s$, which allows us to calculate $J = \det\begin{bmatrix} 1 & 0 \\ 1 & 1\end{bmatrix} = 1$. The joint density of $S$ and $R$ is therefore $f_{SR}(s,r) = g_{Y_1Y_2}(s, r+s)|J| = 8s(r+s)(1) = 8s^2 + 8sr$ if $0 < s < r+s < 1$, which can also be written as $0 < s < 1-r$. The marginal density is thus $f_R(r) = \int_{-\infty}^{\infty} f_{SR}(s,r)\,ds = \int_0^{1-r}(8s^2 + 8sr)\,ds = \left[\frac{8}{3}s^3 + 4rs^2\right]_0^{1-r} = \frac{8}{3}(1-r)^3 + 4r(1-r)^2$ if $0 < r < 1$.
Question #31: Consider a random sample of size $n$ from an exponential distribution such that $X_i \sim \mathrm{EXP}(1)$. Give the density of a) the smallest order statistic, denoted by $Y_1$, b) the largest order statistic, denoted by $Y_n$, and c) the sample range of the order statistics, $R = Y_n - Y_1$.

a) Since $f_{X_i}(x_i) = \begin{cases} e^{-x_i} & \text{if } x_i \in (0,\infty) \\ 0 & \text{otherwise,}\end{cases}$ we have $F_{X_i}(x_i) = \begin{cases} 1 - e^{-x_i} & \text{if } x_i \in (0,\infty) \\ 0 & \text{otherwise.}\end{cases}$ Then we have that $g_1(y_1) = nf_X(y_1)[1 - F_X(y_1)]^{n-1} = ne^{-y_1}[1 - (1 - e^{-y_1})]^{n-1} = ne^{-y_1}(e^{-y_1})^{n-1} = ne^{-ny_1}$ if $y_1 > 0$ and zero otherwise.

b) Similarly, we have $g_n(y_n) = nf_X(y_n)[F_X(y_n)]^{n-1} = ne^{-y_n}[1 - e^{-y_n}]^{n-1}$ for $y_n > 0$.

c) Since the exponential distribution has the memoryless property, the difference $R = Y_n - Y_1$ does not depend on the value of $Y_1$. This allows us to treat $Y_1 = 0$, so that $R = Y_n - Y_1$ behaves like the largest order statistic of a sample of size $n-1$ from the same exponential distribution. From above, we have that $g_n(y_n) = ne^{-y_n}[1 - e^{-y_n}]^{n-1}$, so substituting $n-1$ for $n$ gives $g_R(r) = (n-1)e^{-r}[1 - e^{-r}]^{n-2}$ for $r > 0$.
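A short simulation (ours, for illustration only) confirms the result for the minimum and the range: with $X_i \sim \mathrm{EXP}(1)$ and $n = 5$, the smallest order statistic should behave like an exponential variable with density $5e^{-5y}$, and the range should behave like the maximum of a sample of size 4.

```python
import numpy as np

rng = np.random.default_rng(5)
samples = rng.exponential(scale=1.0, size=(100_000, 5))   # 100,000 samples of size n = 5
y1 = samples.min(axis=1)                                   # smallest order statistic
r = samples.max(axis=1) - y1                               # sample range Y_n - Y_1

print("mean of Y1:", y1.mean(), "(theory 1/5 = 0.2)")
# The range should match the largest order statistic of a sample of size n - 1 = 4,
# whose mean for EXP(1) is 1 + 1/2 + 1/3 + 1/4.
print("mean of R:", r.mean(), "(theory", 1 + 1/2 + 1/3 + 1/4, ")")
```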
Question #32: A system is composed of five independent components connected in series, one after the other. a) If the PDF of the time to failure of each component is $X_i \sim \mathrm{EXP}(1)$, give the PDF of the time to failure of the system; b) if the components are connected in parallel, so that all must fail before the system fails, give the PDF of the time to failure.

a) Since $X_i \sim \mathrm{EXP}(1)$, we know that the density is given by $f_{X_i}(x_i) = e^{-x_i}$ for $x_i > 0$ and $F_{X_i}(x_i) = 1 - e^{-x_i}$ for $x_i > 0$, with both zero otherwise. The series system fails whenever the earliest component fails, which happens at time $X_{(1)} = Y_1$, the first order statistic. Thus, the probability density function of the time to failure is given by $f_{Y_1}(y_1) = nf_X(y_1)[1 - F_X(y_1)]^{n-1} = 5e^{-y_1}[e^{-y_1}]^4 = 5e^{-5y_1}$ whenever $y_1 > 0$.

b) For the system in parallel, the system fails whenever the last component fails, which happens at time $X_{(5)} = Y_5$, the largest order statistic. Thus, the density is $f_{Y_5}(y_5) = nf_X(y_5)[F_X(y_5)]^{n-1} = 5e^{-y_5}[1 - e^{-y_5}]^4$ whenever $y_5 > 0$.
Question #33: Consider a random sample of size $n$ from a geometric distribution such that $X_i \sim \mathrm{GEO}(p)$. Give the CDF of a) the minimum $Y_1$, b) the $k$th smallest $Y_k$, and c) the maximum $Y_n$.

a) If $X_i \sim \mathrm{GEO}(p)$, then $f_{X_i}(x) = p(1-p)^{x-1}$ for $x = 1, 2, \dots$ and $F_{X_i}(x) = 1 - (1-p)^x$. Consider the event $X_{(1)} > m$, which happens if and only if $X_i > m$ for all $i = 1, \dots, n$. Therefore, we have $P(X_{(1)} > m) = P(X_i > m)^n = [1 - F_{X_i}(m)]^n = [(1-p)^m]^n = (1-p)^{mn}$, which implies that the CDF is given by $F_{X_{(1)}}(m) = P(X_{(1)} \le m) = 1 - (1-p)^{mn}$.

b) The event $Y_k \le m$ happens when at least $k$ of the $X_i$ satisfy $X_i \le m$ and the remaining ones satisfy $X_i > m$. Thus, we have $P(Y_k \le m) = \sum_{j=k}^{n}\binom{n}{j}\,P(X_i \le m)^j\,P(X_i > m)^{n-j} = \sum_{j=k}^{n}\binom{n}{j}[1 - (1-p)^m]^j[(1-p)^m]^{n-j}$, which is the distribution function of $Y_k$.

c) The event $Y_n \le m$ happens when $X_i \le m$ for all $i = 1, \dots, n$. Thus, $P(Y_n \le m) = P(X_i \le m)^n = \left(\sum_{j=1}^{m}p(1-p)^{j-1}\right)^n = \left(p\,\frac{1-(1-p)^m}{1-(1-p)}\right)^n = [1 - (1-p)^m]^n$.
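The CDF of the minimum can be checked numerically (our own illustration, not part of the original solution): with $X_i \sim \mathrm{GEO}(p)$, the derived formula says $P(Y_1 \le m) = 1 - (1-p)^{mn}$. Note that `numpy`'s `geometric` sampler already uses the support $\{1, 2, \dots\}$ assumed here.

```python
import numpy as np

rng = np.random.default_rng(6)
p, n = 0.3, 4
x = rng.geometric(p, size=(200_000, n))   # each row is a sample of size n from GEO(p)
y1 = x.min(axis=1)

for m in (1, 2, 3, 5):
    simulated = np.mean(y1 <= m)
    theory = 1 - (1 - p) ** (m * n)
    print(f"m={m}: simulated {simulated:.4f} vs 1-(1-p)^(mn) = {theory:.4f}")
```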
Question #30: If $X \sim \mathrm{PAR}(\theta,\kappa)$, then $f_X(x) = \frac{\kappa}{\theta\left(1 + \frac{x}{\theta}\right)^{\kappa+1}}$ if $x > 0$ and zero otherwise. Consider a random sample of size $n = 5$ from a Pareto distribution where $X \sim \mathrm{PAR}(1,2)$; that is, suppose that $X_1, \dots, X_5$ are drawn from the given Pareto distribution above. a) Find the joint PDF of the second and fourth order statistics, given by $Y_2 = X_{(2)}$ and $Y_4 = X_{(4)}$, and b) find the joint PDF of the first three order statistics, given by $Y_1 = X_{(1)}$, $Y_2 = X_{(2)}$ and $Y_3 = X_{(3)}$.

a) The CDF of the population is given by $F_X(x) = \int_0^x f_X(t)\,dt = \int_0^x \frac{2}{(1+t)^3}\,dt$, so that we can calculate the joint density using Corollary 6.5.1 as $f_{Y_2Y_4}(y_2,y_4) = \frac{5!}{(2-1)!\,(4-2-1)!\,(5-4)!}[F_X(y_2)]^{2-1}f_X(y_2)[F_X(y_4) - F_X(y_2)]^{4-2-1}f_X(y_4)[1 - F_X(y_4)]^{5-4} = 5!\,f_X(y_2)f_X(y_4)[F_X(y_2)][F_X(y_4) - F_X(y_2)][1 - F_X(y_4)]$ if $0 < y_2 < y_4 < \infty$.

b) From Theorem 6.5.4, we have $g(y_1,\dots,y_r) = \frac{n!}{(n-r)!}[1 - F_X(y_r)]^{n-r}[f_X(y_1)\cdots f_X(y_r)]$, so we may calculate that $f_{Y_1Y_2Y_3}(y_1,y_2,y_3) = 60[1 - F_X(y_3)]^2[f_X(y_1)f_X(y_2)f_X(y_3)]$ whenever $0 < y_1 < y_2 < y_3 < \infty$, since we have that $\frac{n!}{(n-r)!} = \frac{5!}{2!} = 60$.

Chapter #7 – Limiting Distributions
Question #1: Consider a random sample of size $n$ from a distribution with cumulative distribution function $F_X(x) = 1 - \frac{1}{x}$ whenever $1 \le x < \infty$ and zero otherwise. That is, let the random variables $X_1, \dots, X_n$ be iid from the distribution with CDF $F_X(x)$. a) Derive the CDF of the smallest order statistic, given by $X_{(1)} = X_{1:n}$, b) find the limiting distribution of $X_{1:n}$; that is, if $G_n(y)$ denotes the CDF from part a), find $\lim_{n\to\infty}G_n(y)$, c) find the limiting distribution of $X_{1:n}^n$; that is, find the CDF of $X_{1:n}^n$ and its limit as $n \to \infty$.

a) We can compute that $F_{X_{1:n}}(y) = P(X_{1:n} \le y) = 1 - P(X_{1:n} > y) = 1 - P(X_i > y)^n = 1 - \left[1 - \left(1 - \frac{1}{y}\right)\right]^n = 1 - \left(\frac{1}{y}\right)^n$ whenever $1 \le y < \infty$. We thus have that the CDF of the smallest order statistic is
$$F_{X_{1:n}}(y) = \begin{cases} 1 - (1/y)^n & \text{if } y \ge 1 \\ 0 & \text{if } y < 1.\end{cases}$$
Finally, we note that $P(X_{1:n} > y) = P(X_i > y)^n$ since the smallest order statistic is greater than some $y$ if and only if all $n$ of the independent samples are also greater than $y$. We can use this approach for any order statistic, including the largest, by changing the exponent.

b) We have that $\lim_{n\to\infty}G_n(y) = \begin{cases} \lim_{n\to\infty}[1 - (1/y)^n] & \text{if } y \ge 1 \\ \lim_{n\to\infty}0 & \text{if } y < 1\end{cases} = \begin{cases} 0 & \text{if } y \le 1 \\ 1 & \text{if } y > 1\end{cases} = G(y)$, so the limiting distribution of $X_{1:n}$ is degenerate at $y = 1$. From Definition 7.2.2, this means that $G(y)$ is the cumulative distribution function of a discrete distribution $g(y)$ that assigns probability one to $y = 1$ and zero otherwise.

c) As before, we have $F_{X_{1:n}^n}(y) = P(X_{1:n}^n \le y) = P\!\left(X_{1:n} \le y^{1/n}\right) = 1 - P\!\left(X_{1:n} > y^{1/n}\right) = 1 - P\!\left(X_i > y^{1/n}\right)^n = 1 - \left[1 - \left(1 - \frac{1}{y^{1/n}}\right)\right]^n = 1 - \frac{1}{y}$ whenever $y \ge 1$. Therefore, it is clear that the limiting distribution of this sequence of random variables is given by $\lim_{n\to\infty}F_{X_{1:n}^n}(y) = \begin{cases} 1 - 1/y & \text{if } y \ge 1 \\ 0 & \text{if } y < 1\end{cases} = G(y)$, since there is no dependence on $n$.
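The part c) result can be seen numerically (a sketch we add for illustration): $X_{1:n}^n$ should already follow the CDF $1 - 1/y$ for every $n$. Sampling from $F_X(x) = 1 - 1/x$ is done by inversion, $X = 1/(1-U)$ with $U \sim \mathrm{UNIF}(0,1)$.

```python
import numpy as np

rng = np.random.default_rng(7)
for n in (2, 10, 50):
    u = rng.uniform(size=(100_000, n))
    x = 1.0 / (1.0 - u)                  # inverse-CDF samples with F(x) = 1 - 1/x
    w = x.min(axis=1) ** n               # the transformed minimum X_{1:n}^n
    for y in (2.0, 5.0):
        print(n, y, round(np.mean(w <= y), 4), "vs", round(1 - 1 / y, 4))
```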
Question #2: Consider a random sample of size $n$ from a distribution with CDF given by $F(x) = \frac{1}{1+e^{-x}}$ for all $x \in \mathbb{R}$. Find the limiting distribution of a) $X_{n:n}$ and b) $X_{n:n} - \ln(n)$.

a) We have $F_{X_{n:n}}(y) = P(X_{n:n} \le y) = P(X_i \le y)^n = \left(\frac{1}{1+e^{-y}}\right)^n$ for all $y \in \mathbb{R}$. Since $\lim_{n\to\infty}\left(\frac{1}{1+e^{-y}}\right)^n = 0$ for every fixed $y$, we conclude that $X_{n:n}$ does not have a limiting distribution.

b) We calculate that $F_{X_{n:n}-\ln(n)}(y) = P(X_{n:n} - \ln(n) \le y) = P(X_{n:n} \le y + \ln(n)) = F_{X_{n:n}}(y + \ln(n)) = \left(\frac{1}{1 + e^{-(y+\ln(n))}}\right)^n = \left(\frac{1}{1 + e^{-y}e^{-\ln(n)}}\right)^n = \left(\frac{1}{1 + \frac{e^{-y}}{n}}\right)^n$. Evaluating this limit gives $\lim_{n\to\infty}F_{X_{n:n}}(y + \ln(n)) = \lim_{n\to\infty}\left(\frac{1}{1 + \frac{e^{-y}}{n}}\right)^n = e^{-e^{-y}}$ for all $y \in \mathbb{R}$.
Question #3: Consider a random sample of size $n$ from the distribution with CDF $F(x) = 1 - x^{-2}$ if $x > 1$ and zero otherwise. Find the limiting distribution of a) $X_{1:n}$, b) $X_{n:n}$, and c) $\frac{1}{\sqrt{n}}X_{n:n}$.

a) We can compute that $F_{X_{1:n}}(y) = P(X_{1:n} \le y) = 1 - P(X_{1:n} > y) = 1 - P(X_i > y)^n = 1 - \left[1 - \left(1 - \frac{1}{y^2}\right)\right]^n = 1 - \frac{1}{y^{2n}}$ if $y > 1$. Thus, $F_{X_{1:n}}(y) = \begin{cases} 1 - y^{-2n} & \text{if } y > 1 \\ 0 & \text{if } y \le 1,\end{cases}$ so the limiting distribution is $\lim_{n\to\infty}F_{X_{1:n}}(y) = \begin{cases} \lim_{n\to\infty}[1 - y^{-2n}] & \text{if } y > 1 \\ \lim_{n\to\infty}0 & \text{if } y \le 1\end{cases} = \begin{cases} 1 & \text{if } y > 1 \\ 0 & \text{if } y \le 1.\end{cases}$ We therefore say that the limiting distribution is degenerate at $y = 1$.

b) We have $F_{X_{n:n}}(y) = P(X_{n:n} \le y) = P(X_i \le y)^n = \left(1 - \frac{1}{y^2}\right)^n$ whenever $y > 1$. Thus, $F_{X_{n:n}}(y) = \begin{cases} (1 - y^{-2})^n & \text{if } y > 1 \\ 0 & \text{if } y \le 1,\end{cases}$ so $\lim_{n\to\infty}F_{X_{n:n}}(y) = \begin{cases} \lim_{n\to\infty}(1 - y^{-2})^n & \text{if } y > 1 \\ \lim_{n\to\infty}0 & \text{if } y \le 1\end{cases} = 0$. We would therefore conclude that there is no limiting distribution for $X_{n:n}$.

c) We compute that $F_{\frac{1}{\sqrt{n}}X_{n:n}}(y) = P\!\left(\frac{1}{\sqrt{n}}X_{n:n} \le y\right) = P(X_{n:n} \le \sqrt{n}y) = F_{X_{n:n}}(\sqrt{n}y) = \left(1 - \frac{1}{(\sqrt{n}y)^2}\right)^n = \left(1 - \frac{1}{ny^2}\right)^n$ whenever $\sqrt{n}y > 1$, or $y > \frac{1}{\sqrt{n}}$. We can therefore compute the limit as $\lim_{n\to\infty}F_{\frac{1}{\sqrt{n}}X_{n:n}}(y) = \begin{cases} \lim_{n\to\infty}\left(1 - \frac{1}{ny^2}\right)^n & \text{if } y > \frac{1}{\sqrt{n}} \\ \lim_{n\to\infty}0 & \text{if } y \le \frac{1}{\sqrt{n}}\end{cases} = \begin{cases} e^{-1/y^2} & \text{if } y > 0 \\ 0 & \text{if } y \le 0.\end{cases}$
Question #5: Suppose that $Z_i \sim N(0,1)$ and that the $Z_i$ are all independent. Use moment generating functions to find the limiting distribution of $A_n = \frac{1}{\sqrt{n}}\sum_{i=1}^n\left(Z_i + \frac{1}{n}\right)$ as $n \to \infty$.

• We have $A_n = \frac{\sum_{i=1}^n\left(Z_i + \frac{1}{n}\right)}{\sqrt{n}} = \frac{\left(\sum_{i=1}^n Z_i\right) + \left(\sum_{i=1}^n\frac{1}{n}\right)}{\sqrt{n}} = \frac{\left(\sum_{i=1}^n Z_i\right) + 1}{\sqrt{n}} = \frac{\sum_{i=1}^n Z_i}{\sqrt{n}} + \frac{1}{\sqrt{n}}$, so the MGF is $M_{A_n}(t) = [M_1(t)]^n[M_2(t)] = [M_1(t)]^n\,E\!\left[e^{\frac{t}{\sqrt{n}}}\right]$, where $M_1(t)$ is the MGF of $\frac{Z_i}{\sqrt{n}}$; since $A_n$ is the sum of independent parts, we can multiply their respective MGFs. The MGF of a standard normal random variable with $\mu = 0$ and $\sigma^2 = 1$ is $M_Z(t) = e^{t^2/2}$, which allows us to calculate $M_1(t) = e^{\left(\frac{t}{\sqrt{n}}\right)^2/2} = e^{\frac{t^2}{2n}}$. Also, since $\frac{1}{\sqrt{n}}$ is a constant, $E\!\left[e^{\frac{t}{\sqrt{n}}}\right] = e^{\frac{t}{\sqrt{n}}}$, so combining these gives $M_{A_n}(t) = \left[e^{\frac{t^2}{2n}}\right]^n\left[e^{\frac{t}{\sqrt{n}}}\right] = \left[e^{\frac{t^2}{2}}\right]\left[e^{\frac{t}{\sqrt{n}}}\right]$. Then we can use Theorem 7.3.1 to calculate $\lim_{n\to\infty}M_{A_n}(t) = \lim_{n\to\infty}\left[e^{\frac{t^2}{2}}\right]\left[e^{\frac{t}{\sqrt{n}}}\right] = e^{\frac{t^2}{2}} = M(t)$, which we know is the MGF of a standard normal, so the limiting distribution is $A \sim N(0,1)$. Note that this is also a direct consequence of the Central Limit Theorem.
Question #9: Let $X_1, X_2, \dots, X_{100}$ be a random sample of size $n = 100$ from an exponential distribution such that each $X_i \sim \mathrm{EXP}(1)$, and let $Y = X_1 + X_2 + \cdots + X_{100}$. a) Give a normal approximation for the probability $P(Y > 110)$, and b) if $\bar{X} = \frac{Y}{100}$ is the sample mean, give a normal approximation for the probability $P(1.1 < \bar{X} < 1.2)$.

a) Since each $X_i \sim \mathrm{EXP}(1)$, we know that $E(X_i) = \theta = 1$ while $\mathrm{Var}(X_i) = \theta^2 = 1$. Due to the independence of the $X_i$, we have that $E(Y) = E\!\left(\sum_{i=1}^{100}X_i\right) = \sum_{i=1}^{100}E(X_i) = 100$ and $\mathrm{Var}(Y) = \mathrm{Var}\!\left(\sum_{i=1}^{100}X_i\right) = \sum_{i=1}^{100}\mathrm{Var}(X_i) = 100$, so $\mathrm{sd}(Y) = \sqrt{100} = 10$. We can therefore calculate that $P(Y > 110) = 1 - P(Y \le 110) = 1 - P\!\left(\sum_{i=1}^{100}X_i \le 110\right) = 1 - P\!\left(\frac{\sum_{i=1}^{100}X_i - 100}{10} \le \frac{110-100}{10}\right) \approx 1 - P(Z \le 1) = 1 - \Phi(1) = 1 - 0.8413 = 0.1587$, where $Z$ denotes a standard normal random variable with $\mu = 0$ and $\sigma^2 = 1$.

b) We know that $Z_n = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \to_d N(0,1)$ by the Central Limit Theorem. We then have that $E(\bar{X}) = E\!\left(\frac{Y}{100}\right) = \frac{1}{100}E(Y) = 1$ and $\mathrm{Var}(\bar{X}) = \mathrm{Var}\!\left(\frac{Y}{100}\right) = \frac{1}{10{,}000}\mathrm{Var}(Y) = \frac{1}{100}$, so $\mathrm{sd}(\bar{X}) = \frac{1}{10}$, which allows us to find $P(1.1 < \bar{X} < 1.2) = P\!\left(\frac{1.1-1}{1/10} < \frac{\bar{X}-1}{1/10} < \frac{1.2-1}{1/10}\right) \approx P(1 < Z_n < 2) = \Phi(2) - \Phi(1) = 0.9772 - 0.8413 = 0.1359$. Here, we have used the fact that $\mu = 1$ and $\sigma = 1$, which come from the population distribution $X_i \sim \mathrm{EXP}(1)$.
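The two approximations can be compared against the exact gamma probabilities (our own check, not part of the original solution), since $Y = \sum_{i=1}^{100}X_i \sim \mathrm{GAM}(1,100)$ when $X_i \sim \mathrm{EXP}(1)$.

```python
from scipy import stats

# Exact distribution of the sum of 100 iid EXP(1) variables: Gamma(shape=100, scale=1).
exact_a = 1 - stats.gamma.cdf(110, a=100, scale=1)
approx_a = 1 - stats.norm.cdf((110 - 100) / 10)
print("P(Y > 110): exact", round(exact_a, 4), "normal approx", round(approx_a, 4))

# Part b): P(1.1 < Xbar < 1.2) = P(110 < Y < 120).
exact_b = stats.gamma.cdf(120, a=100) - stats.gamma.cdf(110, a=100)
approx_b = stats.norm.cdf(2) - stats.norm.cdf(1)
print("P(1.1 < Xbar < 1.2): exact", round(exact_b, 4), "normal approx", round(approx_b, 4))
```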
Question #11: Let $X_i \sim \mathrm{UNIF}(0,1)$ where $X_1, X_2, \dots, X_{20}$ are all independent. Find a normal approximation for the probability $P\!\left(\sum_{i=1}^{20}X_i \le 12\right)$.

• Since each $X_i \sim \mathrm{UNIF}(0,1)$, we know that $E(X_i) = 1/2$ while $\mathrm{Var}(X_i) = 1/12$. Due to the independence of the $X_i$, we have that $E\!\left(\sum_{i=1}^{20}X_i\right) = \sum_{i=1}^{20}E(X_i) = 10$ and $\mathrm{Var}\!\left(\sum_{i=1}^{20}X_i\right) = \sum_{i=1}^{20}\mathrm{Var}(X_i) = 5/3$, so that $\mathrm{sd}\!\left(\sum_{i=1}^{20}X_i\right) = \sqrt{5/3}$. This allows us to find $P\!\left(\sum_{i=1}^{20}X_i \le 12\right) = P\!\left(\frac{\sum_{i=1}^{20}X_i - 10}{\sqrt{5/3}} \le \frac{12-10}{\sqrt{5/3}}\right) \approx P(Z \le 1.55) = \Phi(1.55) = 0.9394$.
Chapter #8 – Statistics and Sampling Distributions
Question #1: Let $X$ denote the weight in pounds of a single bag of feed, where $X \sim N(101,4)$. What is the probability that 20 bags will weigh at least 2,000 pounds?

• Let $Y = \sum_{i=1}^{20}X_i$ where $X_i \sim N(101,4)$. We have that $E(Y) = E\!\left(\sum_{i=1}^{20}X_i\right) = \sum_{i=1}^{20}E(X_i) = 20(101) = 2{,}020$ and $\mathrm{Var}(Y) = \mathrm{Var}\!\left(\sum_{i=1}^{20}X_i\right) = \sum_{i=1}^{20}\mathrm{Var}(X_i) = 20(4) = 80$, such that $\mathrm{sd}(Y) = \sqrt{80} = 4\sqrt{5}$. We can thus calculate the probability $P(Y \ge 2{,}000) = P\!\left(\sum_{i=1}^{20}X_i \ge 2{,}000\right) = P\!\left(\frac{\sum_{i=1}^{20}X_i - E(Y)}{\mathrm{sd}(Y)} \ge \frac{2{,}000 - E(Y)}{\mathrm{sd}(Y)}\right) = P\!\left(\frac{\sum_{i=1}^{20}X_i - 2{,}020}{4\sqrt{5}} \ge \frac{2{,}000 - 2{,}020}{4\sqrt{5}}\right) = P\!\left(Z \ge -\frac{20}{4\sqrt{5}}\right) \approx P(Z \ge -2.24) = 1 - \Phi(-2.24) = 0.987$, where $Z \sim N(0,1)$.
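The same computation can be reproduced with `scipy` (added here only as a convenience; the variable names are ours):

```python
from math import sqrt
from scipy import stats

mean_y = 20 * 101          # E(Y) = 2,020
sd_y = sqrt(20 * 4)        # sd(Y) = sqrt(80)
prob = 1 - stats.norm.cdf(2000, loc=mean_y, scale=sd_y)
print(round(prob, 4))      # approximately 0.987
```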
Question #2: Let $S$ denote the diameter of a shaft and $B$ the diameter of a bearing, where $S$ and $B$ are independent with $S \sim N(1, 0.0004)$ and $B \sim N(1.01, 0.0009)$. a) If a shaft and a bearing are selected at random, what is the probability that the shaft diameter will exceed the bearing diameter? b) Now assume equal variances ($\sigma_S^2 = \sigma_B^2 = \sigma^2$), so that $S \sim N(1,\sigma^2)$ and $B \sim N(1.01,\sigma^2)$. Find the value of $\sigma$ that will yield a probability of noninterference of 0.95 (which here is taken to mean that the shaft diameter exceeds the bearing diameter).

a) Define $Y = S - B$, since we wish to find $P(S > B) = P(S - B > 0) = P(Y > 0)$. We have that $E(Y) = E(S - B) = E(S) - E(B) = 1 - 1.01 = -0.01$ and $\mathrm{Var}(Y) = \mathrm{Var}(S - B) = \mathrm{Var}(S) + \mathrm{Var}(B) = 0.0004 + 0.0009 = 0.0013$, such that $\mathrm{sd}(Y) = \sqrt{0.0013} \approx 0.036$. Thus, we have $P(S > B) = P(S - B > 0) = P(Y > 0) = P\!\left(\frac{Y - E(Y)}{\mathrm{sd}(Y)} > \frac{0 - E(Y)}{\mathrm{sd}(Y)}\right) = P\!\left(\frac{Y + 0.01}{0.036} > \frac{0.01}{0.036}\right) \approx P(Z > 0.28) = 1 - \Phi(0.28) = 0.39$.

b) For $Y = S - B$, we have that $E(Y) = -0.01$ but $\mathrm{Var}(Y) = 2\sigma^2$, so $\mathrm{sd}(Y) = \sqrt{2}\,\sigma$. We wish to find $\sigma$ so that $P(Y > 0) = 0.95$, that is, $1 - P\!\left(Z \le \frac{0.01}{\sqrt{2}\sigma}\right) = 0.95$, or $\Phi\!\left(\frac{0.01}{\sqrt{2}\sigma}\right) = 0.05$. Since only the critical value $z = -1.645$ satisfies $\Phi(-1.645) = 0.05$, we would need $\frac{0.01}{\sqrt{2}\sigma} = -1.645$, which gives $\sigma \approx -0.004$. But since we must have $\sigma \ge 0$, no such $\sigma$ exists.
Question #3: Let $X_1, \dots, X_n$ be a random sample of size $n$, iid with $X_i \sim N(\mu,\sigma^2)$, and define $U = \sum_{i=1}^n X_i$ and $W = \sum_{i=1}^n X_i^2$. a) Find a statistic that is a function of $U$ and $W$ and is unbiased for the parameter $\theta = 2\mu - 5\sigma^2$. b) Find a statistic that is unbiased for $\gamma = \sigma^2 + \mu^2$. c) If $c$ is a constant and $Y_i = 1$ if $X_i \le c$ and zero otherwise, find a statistic that is a function of $Y_1, \dots, Y_n$ and is unbiased for $F_X(c) = \Phi\!\left(\frac{c-\mu}{\sigma}\right) = \int_{-\infty}^{(c-\mu)/\sigma}\frac{1}{\sqrt{2\pi}}e^{-t^2/2}\,dt$.

a) We first find an estimator for $\mu$: $\mu = E(\bar{X}) = E\!\left(\frac{1}{n}\sum_{i=1}^n X_i\right) = \frac{1}{n}E\!\left(\sum_{i=1}^n X_i\right) = E\!\left(\frac{U}{n}\right)$. Next, for $\sigma^2$: $\sigma^2 = E(S^2) = E\!\left(\frac{1}{n-1}\sum_{i=1}^n(X_i - \bar{X})^2\right) = E\!\left(\frac{1}{n-1}\left[\sum_{i=1}^n X_i^2 - n\bar{X}^2\right]\right) = \frac{1}{n-1}E\!\left[W - n\left(\frac{1}{n}\sum_{i=1}^n X_i\right)^2\right] = \frac{1}{n-1}E\!\left[W - \frac{\left(\sum_{i=1}^n X_i\right)^2}{n}\right] = \frac{1}{n-1}E\!\left[W - \frac{U^2}{n}\right]$, so $\frac{1}{n-1}\left[W - \frac{U^2}{n}\right]$ is unbiased for $\sigma^2$. We thus have $\hat{\theta} = 2\left[\frac{U}{n}\right] - 5\left[\frac{1}{n-1}\left(W - \frac{U^2}{n}\right)\right] = \frac{2U}{n} - \frac{5}{n-1}\left(W - \frac{U^2}{n}\right)$, which is an unbiased estimator of $\theta$ since $E(\hat{\theta}) = 2\mu - 5\sigma^2 = \theta$.

b) Since we found that $\mu = E(\bar{X})$, we have $\mu^2 = [E(\bar{X})]^2 = E(\bar{X}^2) - \mathrm{Var}(\bar{X}) = E\!\left[\frac{1}{n^2}\left(\sum_{i=1}^n X_i\right)^2\right] - \frac{1}{n^2}\mathrm{Var}\!\left(\sum_{i=1}^n X_i\right) = E\!\left[\frac{U^2}{n^2}\right] - \frac{\sigma^2}{n}$. Combining this with the unbiased estimator of $\sigma^2$ found above, $\gamma = \sigma^2 + \mu^2 = \frac{(n-1)}{n}\sigma^2 + E\!\left[\frac{U^2}{n^2}\right]$ is estimated without bias by $\hat{\gamma} = \frac{(n-1)}{n}\left[\frac{1}{n-1}\left(W - \frac{U^2}{n}\right)\right] + \frac{U^2}{n^2} = \frac{1}{n}\left(W - \frac{U^2}{n}\right) + \frac{U^2}{n^2} = \frac{W}{n} - \frac{U^2}{n^2} + \frac{U^2}{n^2} = \frac{W}{n}$, which is an unbiased estimator of $\gamma$ since $E(\hat{\gamma}) = E\!\left(\frac{W}{n}\right) = \frac{1}{n}\sum_{i=1}^n E(X_i^2) = \sigma^2 + \mu^2 = \gamma$.

c) We have $P(Y_i = 1) = P(X_i \le c) = P\!\left(\frac{X_i-\mu}{\sigma} \le \frac{c-\mu}{\sigma}\right) = P\!\left(Z \le \frac{c-\mu}{\sigma}\right) = \Phi\!\left(\frac{c-\mu}{\sigma}\right) = F_X(c)$, and $E(Y_i) = 1\cdot P(Y_i = 1) + 0\cdot P(Y_i = 0) = P(Y_i = 1) = \Phi\!\left(\frac{c-\mu}{\sigma}\right) = F_X(c)$. Then, $E(\bar{Y}) = E\!\left(\frac{1}{n}\sum_{i=1}^n Y_i\right) = \frac{1}{n}E\!\left(\sum_{i=1}^n Y_i\right) = \frac{1}{n}\sum_{i=1}^n E(Y_i) = \frac{1}{n}\,n\,\Phi\!\left(\frac{c-\mu}{\sigma}\right) = \Phi\!\left(\frac{c-\mu}{\sigma}\right) = F_X(c)$, which means that $\bar{Y}$ is an unbiased estimator of $F_X(c) = \Phi((c-\mu)/\sigma)$.
Question #4: Assume that $X_1$ and $X_2$ are independent normal random variables such that each $X_i \sim N(\mu,\sigma^2)$, and define $Y_1 = X_1 + X_2$ and $Y_2 = X_1 - X_2$. Show that the random variables $Y_1$ and $Y_2$ are independent and normally distributed.

• Since $X_1$ and $X_2$ are independent normal random variables, we know that their joint density function is $f_{X_1X_2}(x_1,x_2) = f_{X_1}(x_1)f_{X_2}(x_2) = \left[\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x_1-\mu)^2}{2\sigma^2}}\right]\left[\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x_2-\mu)^2}{2\sigma^2}}\right] = \frac{1}{2\pi\sigma^2}e^{-\frac{1}{2\sigma^2}[(x_1-\mu)^2 + (x_2-\mu)^2]}$. We have the transformation $y_1 = x_1 + x_2$ and $y_2 = x_1 - x_2$, which can be solved to obtain $x_1 = \frac{y_1+y_2}{2}$ and $x_2 = \frac{y_1-y_2}{2}$. This allows us to calculate the Jacobian $J = \det\begin{bmatrix} 1/2 & 1/2 \\ 1/2 & -1/2\end{bmatrix} = -\frac{1}{2}$, so we can compute the joint density $f_{Y_1Y_2}(y_1,y_2) = f_{X_1X_2}\!\left(\frac{y_1+y_2}{2}, \frac{y_1-y_2}{2}\right)|J| = \frac{1}{2}\left[\frac{1}{2\pi\sigma^2}e^{-\frac{1}{2\sigma^2}\left(\left(\frac{y_1+y_2}{2}-\mu\right)^2 + \left(\frac{y_1-y_2}{2}-\mu\right)^2\right)}\right]$. After simplifying this expression, we have $f_{Y_1Y_2}(y_1,y_2) = \frac{1}{4\pi\sigma^2}e^{-\frac{1}{4\sigma^2}[y_1 - 2\mu]^2}e^{-\frac{1}{4\sigma^2}y_2^2}$. Since the joint density factors into separate marginal densities, this shows that $Y_1$ and $Y_2$ are independent and normally distributed. Moreover, we see that $Y_1 \sim N(2\mu, 2\sigma^2)$ and $Y_2 \sim N(0, 2\sigma^2)$.
Question #12: The distance in feet by which a parachutist misses a target is $D = \sqrt{X_1^2 + X_2^2}$, where $X_1$ and $X_2$ are independent with each $X_i \sim N(0,25)$. Find the probability $P(D \le 12.25)$.

• We wish to find $P(D \le 12.25) = P\!\left(\sqrt{X_1^2 + X_2^2} \le 12.25\right) = P[X_1^2 + X_2^2 \le (12.25)^2] = P[(X_1 - 0)^2 + (X_2 - 0)^2 \le (12.25)^2] = P[(X_1 - \mu)^2 + (X_2 - \mu)^2 \le (12.25)^2] = P\!\left[\frac{(X_1-\mu)^2}{\sigma^2} + \frac{(X_2-\mu)^2}{\sigma^2} \le \frac{(12.25)^2}{\sigma^2}\right] = P\!\left[\sum_{i=1}^2\frac{(X_i-\mu)^2}{\sigma^2} \le \frac{(12.25)^2}{\sigma^2}\right]$. Since $\mu = 0$ and $\sigma^2 = 25$, we have $P\!\left[\sum_{i=1}^2\frac{(X_i-0)^2}{25} \le \frac{(12.25)^2}{25}\right] \approx P[\chi^2(2) \le 6] = 0.95$. Note that we have used Corollary 8.3.4 to transform the question into one using the chi-square distribution, since $\sum_{i=1}^n\left(\frac{X_i-\mu}{\sigma}\right)^2 \sim \chi^2(n)$. This is because $X_i \sim N(\mu,\sigma^2)$ implies that $\frac{X_i-\mu}{\sigma} \sim N(0,1)$, so $\left(\frac{X_i-\mu}{\sigma}\right)^2 \sim \chi^2(1)$, and the sum of $n$ independent chi-square random variables with one degree of freedom each is distributed $\chi^2(n)$.
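The chi-square step can be verified directly (our own check, not part of the original solution): $D^2/25 \sim \chi^2(2)$, so $P(D \le 12.25) = P(\chi^2(2) \le 12.25^2/25)$.

```python
import numpy as np
from scipy import stats

threshold = 12.25 ** 2 / 25                        # about 6.0025
print(round(stats.chi2.cdf(threshold, df=2), 4))   # about 0.9503

# Monte Carlo confirmation with X1, X2 ~ N(0, 25), i.e. standard deviation 5.
rng = np.random.default_rng(8)
x = rng.normal(0.0, 5.0, size=(500_000, 2))
d = np.sqrt((x ** 2).sum(axis=1))
print(round(np.mean(d <= 12.25), 4))
```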
Question #8: Suppose that $X$ and $Y$ are independent and distributed $X \sim \chi^2(m)$ and $Y \sim \chi^2(n)$. Is the random variable $Z = Y - X$ distributed chi-square if we have $n > m$?

• No. The random variable $Z = Y - X$ can clearly take on negative values, whereas a random variable following the chi-square distribution must be nonnegative.
Question #9: Suppose that $X \sim \chi^2(m)$, $S = X + Y \sim \chi^2(m+n)$, and that $X$ and $Y$ are independent random variables. Use moment generating functions to show that $S - X \sim \chi^2(n)$.

• We know that if $A \sim \chi^2(v)$, then its MGF is given by $M_A(t) = (1-2t)^{-v/2}$. We thus have $M_X(t) = (1-2t)^{-m/2}$ and $M_S(t) = M_{X+Y}(t) = (1-2t)^{-(m+n)/2}$. Since $X$ and $Y$ are independent, we know that $M_{X+Y}(t) = M_X(t)M_Y(t)$, which implies that $M_{S-X}(t) = M_{(X+Y)-X}(t) = M_Y(t) = \frac{M_{X+Y}(t)}{M_X(t)} = \frac{(1-2t)^{-(m+n)/2}}{(1-2t)^{-m/2}} = (1-2t)^{-n/2}$. Thus, we have that $Y = S - X$ is distributed chi-square with $n$ degrees of freedom.
Question #14: If $T \sim t(\nu)$, find the distribution of the random variable $T^2$.

• We know that if $Z \sim N(0,1)$ and $V \sim \chi^2(\nu)$ are independent random variables, then $T = \frac{Z}{\sqrt{V/\nu}}$ follows Student's t distribution. We can square this to produce $T^2 = \frac{Z^2}{V/\nu} = \frac{Z^2/1}{V/\nu}$, which makes it clear that $T^2 \sim F(1,\nu)$. The reason for this is that if $Z \sim N(0,1)$, then $Z^2 \sim \chi^2(1)$; moreover, we are already given that $V \sim \chi^2(\nu)$. Combining these results with the fact that if $V_1 \sim \chi^2(\nu_1)$ and $V_2 \sim \chi^2(\nu_2)$ are independent, then the random variable $X = \frac{V_1/\nu_1}{V_2/\nu_2} \sim F(\nu_1,\nu_2)$, we conclude that $T^2$ follows the F distribution with 1 and $\nu$ degrees of freedom whenever $T \sim t(\nu)$.
Question #15: Suppose that $X_i \sim N(\mu,\sigma^2)$ for $i = 1, \dots, n$ and $Z_i \sim N(0,1)$ for $i = 1, \dots, k$, and that all variables are independent. Find the distribution of the following random variables.

a) $X_1 - X_2 \sim N(\mu - \mu, \sigma^2 + \sigma^2) \equiv N(0, 2\sigma^2)$.

b) $X_2 + 2X_3 \sim N(\mu + 2\mu, \sigma^2 + 4\sigma^2) \equiv N(3\mu, 5\sigma^2)$.

c) $Z_1^2 \sim \chi^2(1)$, since the square of a standard normal random variable is chi-square with one degree of freedom.

d) $\frac{X_1 - X_2}{\sigma S_z\sqrt{2}} \sim t(k-1)$, since $X_1 - X_2 \sim N(0,2\sigma^2)$ implies that $\frac{X_1-X_2}{\sigma\sqrt{2}} \sim N(0,1)$, and dividing this by the sample standard deviation $S_z$ of the $Z$ sample makes it clear that the ratio is $\sim t(k-1)$.

e) $\frac{\sqrt{n}(\bar{X}-\mu)}{\sigma S_z} \sim t(k-1)$, since $Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$, $V_z = (k-1)S_z^2 \sim \chi^2(k-1)$, and we can write $\frac{\sqrt{n}(\bar{X}-\mu)}{\sigma S_z} = \frac{Z}{S_z} = \frac{Z}{\sqrt{V_z/(k-1)}} \sim t(k-1)$ by the definition of the t distribution (see above).

f) $Z_1^2 + Z_2^2 \sim \chi^2(1) + \chi^2(1) \equiv \chi^2(2)$, since we can simply add the degrees of freedom for a sum of independent chi-square random variables.

g) $Z_1^2 - Z_2^2$: the distribution is not one of the standard named distributions.

h) $\frac{Z_1}{\sqrt{Z_2^2}} \sim t(1)$, since $V = Z_2^2 \sim \chi^2(1)$ and we can write $\frac{Z_1}{\sqrt{Z_2^2}} = \frac{Z_1}{\sqrt{Z_2^2/1}} = \frac{Z_1}{\sqrt{V/1}} \sim t(1)$.

i) $\frac{Z_1^2}{Z_2^2} \sim F(1,1)$, since $V_1 = Z_1^2 \sim \chi^2(1)$, $V_2 = Z_2^2 \sim \chi^2(1)$, and we have $\frac{Z_1^2}{Z_2^2} = \frac{V_1/1}{V_2/1} \sim F(1,1)$.

j) $\frac{Z_1}{Z_2} \sim \mathrm{CAU}(1,0)$, since we can generate the joint transformation $u = \frac{z_1}{z_2}$ and $v = z_2$, calculate the joint density $f_{UV}(u,v)$, and integrate out $v$ to find $f_U(u) = \frac{1}{\pi(u^2+1)}$.

k) $\frac{\bar{X}}{\bar{Z}}$: the distribution is not one of the standard named distributions.

l) $\frac{\sqrt{nk}(\bar{X}-\mu)}{\sigma\sqrt{\sum_{i=1}^k Z_i^2}} \sim t(k)$, since $W = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$ and $V = \sum_{i=1}^k Z_i^2 \sim \chi^2(k)$, and we can write the expression as $\frac{\sqrt{nk}(\bar{X}-\mu)}{\sigma\sqrt{\sum_{i=1}^k Z_i^2}} = \frac{\frac{\sqrt{n}(\bar{X}-\mu)}{\sigma}}{\sqrt{\sum_{i=1}^k Z_i^2}/\sqrt{k}} = \frac{W}{\sqrt{V/k}} \sim t(k)$ by the definition of the distribution.

m) $\sum_{i=1}^n\frac{(X_i-\mu)^2}{\sigma^2} + \sum_{i=1}^k(Z_i-\bar{Z})^2 \sim \chi^2(n+k-1)$, since $\sum_{i=1}^n\frac{(X_i-\mu)^2}{\sigma^2} \sim \chi^2(n)$ by Corollary 8.3.4 and $\sum_{i=1}^k(Z_i-\bar{Z})^2 = (k-1)S_z^2 = \frac{(k-1)S_z^2}{1^2} \sim \chi^2(k-1)$ by Theorem 8.3.6. Thus, we have the sum of two independent chi-square random variables, so we sum the degrees of freedom.

n) $\frac{\bar{X}}{\sigma^2} + \frac{1}{k}\sum_{i=1}^k Z_i \sim N\!\left(\frac{\mu}{\sigma^2}, \frac{1}{n\sigma^2} + \frac{1}{k}\right)$, since $\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$ implies that $\frac{\bar{X}}{\sigma^2} \sim N\!\left(\frac{\mu}{\sigma^2}, \frac{1}{n\sigma^2}\right)$. Also, we have $\frac{1}{k}\sum_{i=1}^k Z_i = \bar{Z} \sim N\!\left(0, \frac{1}{k}\right)$, so the distribution of their sum is normal and we sum their respective means and variances to conclude that $\frac{\bar{X}}{\sigma^2} + \frac{1}{k}\sum_{i=1}^k Z_i \sim N\!\left(\frac{\mu}{\sigma^2}, \frac{1}{n\sigma^2} + \frac{1}{k}\right)$.

o) $k\bar{Z}^2 \sim \chi^2(1)$, since $\sqrt{k}\bar{Z} \sim N(0,1)$, so it must be that $\left(\sqrt{k}\bar{Z}\right)^2 = k\bar{Z}^2 \sim \chi^2(1)$.

p) $\frac{(k-1)\sum_{i=1}^n(X_i-\bar{X})^2}{(n-1)\sigma^2\sum_{i=1}^k(Z_i-\bar{Z})^2} \sim F(n-1, k-1)$, since we can rewrite the random variable as $\frac{(k-1)\sum_{i=1}^n(X_i-\bar{X})^2}{(n-1)\sigma^2\sum_{i=1}^k(Z_i-\bar{Z})^2} = \frac{\frac{\sum_{i=1}^n(X_i-\bar{X})^2}{\sigma^2(n-1)}}{\frac{\sum_{i=1}^k(Z_i-\bar{Z})^2}{1^2(k-1)}} = \frac{S_X^2/\sigma^2}{S_Z^2/1^2}$, where $\frac{(n-1)S_X^2}{\sigma^2} \sim \chi^2(n-1)$ and $\frac{(k-1)S_Z^2}{1^2} \sim \chi^2(k-1)$. We thus have the ratio of two independent chi-square random variables, each divided by its degrees of freedom, which we know follows the F distribution.
Question #18: Assume that $Z \sim N(0,1)$, $V_1 \sim \chi^2(5)$, and $V_2 \sim \chi^2(9)$ are all independent. Compute the probabilities a) $P(V_1 + V_2 < 8.6)$, b) $P\!\left(\frac{Z}{\sqrt{V_1/5}} < 2.015\right)$, c) $P(Z > 0.611\sqrt{V_2})$, d) $P\!\left(\frac{V_1}{V_2} < 1.45\right)$, and e) find the value of $b$ such that $P\!\left(\frac{V_1}{V_1+V_2} < b\right) = 0.9$.

a) Since $V_1 \sim \chi^2(5)$ and $V_2 \sim \chi^2(9)$ are independent, we know that $V_1 + V_2 \sim \chi^2(14)$. This allows us to compute $P(V_1 + V_2 < 8.6) = 0.144$ using the tables for the chi-square distribution.

b) We know that if $Z \sim N(0,1)$ and $V \sim \chi^2(\nu)$ are independent random variables, then $T = \frac{Z}{\sqrt{V/\nu}}$ follows the t distribution with $\nu$ degrees of freedom. We thus have that $T = \frac{Z}{\sqrt{V_1/5}} \sim t(5)$, so we can compute $P\!\left(\frac{Z}{\sqrt{V_1/5}} < 2.015\right) = 0.95$ using the t table.

c) We wish to compute $P(Z > 0.611\sqrt{V_2}) = P\!\left(\frac{Z}{\sqrt{V_2}} > 0.611\right) = P\!\left(\frac{Z}{\sqrt{V_2}}\cdot 3 > 0.611(3)\right) = P\!\left(\frac{Z}{\sqrt{V_2/9}} > 1.833\right) = 0.05$, using the t table, since we know that $\frac{Z}{\sqrt{V_2/9}} \sim t(9)$.

d) We wish to compute $P\!\left(\frac{V_1}{V_2} < 1.45\right) = P\!\left(\frac{V_1/5}{V_2/9} < 1.45\left(\frac{9}{5}\right)\right) = P\!\left(\frac{V_1/5}{V_2/9} < 2.61\right)$. We know that if $V_1 \sim \chi^2(\nu_1)$ and $V_2 \sim \chi^2(\nu_2)$ are independent, then the random variable $X = \frac{V_1/\nu_1}{V_2/\nu_2} \sim F(\nu_1,\nu_2)$. We thus have that $\frac{V_1/5}{V_2/9} \sim F(5,9)$, so we can use the F table to compute the desired probability as $P\!\left(\frac{V_1/5}{V_2/9} < 2.61\right) = 0.9$.

e) We wish to compute $b$ such that $P\!\left(\frac{V_1}{V_1+V_2} < b\right) = P\!\left(\frac{V_1+V_2}{V_1} > \frac{1}{b}\right) = P\!\left(1 + \frac{V_2}{V_1} > \frac{1}{b}\right) = P\!\left(\frac{V_2}{V_1} > \frac{1}{b} - 1\right) = P\!\left(\frac{V_2/9}{V_1/5} > \frac{5}{9}\left(\frac{1}{b} - 1\right)\right) = 0.9$. But we know that $F = \frac{V_2/9}{V_1/5} \sim F(9,5)$, so we can use tables to find that $P(F > 0.383) = 0.9$. This means that we must solve the equation $\frac{5}{9}\left(\frac{1}{b} - 1\right) = 0.383 \;\rightarrow\; \frac{1}{b} = \frac{9}{5}(0.383) + 1 \;\rightarrow\; b = \frac{1}{\frac{9}{5}(0.383) + 1} \approx 0.592$.
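All five table lookups can be reproduced with `scipy` (a convenience added by us, not part of the original solution):

```python
from scipy import stats

print(round(stats.chi2.cdf(8.6, df=14), 3))             # a) about 0.144
print(round(stats.t.cdf(2.015, df=5), 3))               # b) about 0.95
print(round(1 - stats.t.cdf(1.833, df=9), 3))           # c) about 0.05
print(round(stats.f.cdf(2.61, dfn=5, dfd=9), 3))        # d) about 0.90

# e) the constant 0.383 is the 10th percentile of F(9,5); solve (5/9)(1/b - 1) = 0.383 for b.
c = stats.f.ppf(0.10, dfn=9, dfd=5)                     # about 0.383
b = 1 / (1 + (9 / 5) * c)
print(round(b, 3))                                      # about 0.592
```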
Question #19: Suppose that $T \sim t(1)$. a) Show that the CDF of $T$ is $F_T(t) = \frac{1}{2} + \frac{1}{\pi}\arctan(t)$, and b) show that the $100\cdot\gamma$th percentile is given by $t_\gamma(1) = \tan\!\left[\pi\left(\gamma - \frac{1}{2}\right)\right]$.

a) If $T \sim t(\nu)$, then its density is given by $f_T(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)\sqrt{\nu\pi}}\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$. When $\nu = 1$, we have $f_T(t) = \frac{\Gamma(1)}{\Gamma\!\left(\frac12\right)\sqrt{\pi}}(1 + t^2)^{-1} = \frac{1}{\pi}\frac{1}{1+t^2}$, since $\Gamma(1) = 1$ and $\Gamma\!\left(\frac12\right) = \sqrt{\pi}$. We thus have that $f_T(t) = \frac{1}{\pi}\frac{1}{1+t^2}$ when $\nu = 1$, which is the density of a Cauchy random variable. To find the cumulative distribution function, we simply compute $F_T(t) = \frac{1}{\pi}\int_{-\infty}^{t}\frac{1}{1+x^2}\,dx = \frac{1}{\pi}\left[\arctan(x)\right]_{-\infty}^{t} = \frac{1}{\pi}\left(\arctan(t) - \left(-\frac{\pi}{2}\right)\right) = \frac{1}{\pi}\arctan(t) + \frac{1}{2}$.

b) The $100\cdot\gamma$th percentile is the value of $t$ such that $F_T(t) = \gamma$. From the work above, we have $\frac{1}{\pi}\arctan(t) + \frac{1}{2} = \gamma \;\rightarrow\; \arctan(t) = \left(\gamma - \frac{1}{2}\right)\pi \;\rightarrow\; t = \tan\!\left[\left(\gamma - \frac{1}{2}\right)\pi\right]$. This proves that the $100\cdot\gamma$th percentile is given by $t_\gamma(1) = \tan\!\left[\left(\gamma - \frac{1}{2}\right)\pi\right]$.
Question #22: Compute $E(X^p)$ for $p > 0$ if we have that $X \sim \mathrm{BETA}(a,b)$.

• Since $X \sim \mathrm{BETA}(a,b)$, its PDF is $f_X(x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}x^{a-1}(1-x)^{b-1}$ whenever $0 < x < 1$, with $a > 0$ and $b > 0$. Then, using the definition of expected value, we can compute $E(X^p) = \int_{-\infty}^{\infty}x^pf_X(x)\,dx = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\int_0^1 x^px^{a-1}(1-x)^{b-1}\,dx = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\cdot\frac{\Gamma(a+p)\Gamma(b)}{\Gamma(a+b+p)} = \frac{\Gamma(a+b)\Gamma(a+p)}{\Gamma(a)\Gamma(a+b+p)}$, since we have $\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\int_0^1 x^{a-1}(1-x)^{b-1}\,dx = 1$, so we can solve for the integral to conclude that $\int_0^1 x^{a-1}(1-x)^{b-1}\,dx = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$. In this case, we are using that $\int_0^1 x^px^{a-1}(1-x)^{b-1}\,dx = \int_0^1 x^{(p+a)-1}(1-x)^{b-1}\,dx = \frac{\Gamma(a+p)\Gamma(b)}{\Gamma(a+b+p)}$. Therefore, all of the moments of the beta distribution for some fixed $p > 0$ can be written in terms of the gamma function, which can be evaluated numerically.
Question #24: Suppose that $Y_\nu \sim \chi^2(\nu)$. Use moment generating functions to find the limiting distribution of the transformed random variable $\frac{Y_\nu - \nu}{\sqrt{2\nu}}$ as $\nu \to \infty$.

• This result follows directly from the Central Limit Theorem. If we let $Y_\nu = \sum_{i=1}^{\nu}X_i$ where $X_i \sim \chi^2(1)$ for $i = 1, \dots, \nu$, then $Y_\nu \sim \chi^2(\nu)$, so that $E(Y_\nu) = \nu$, $\mathrm{Var}(Y_\nu) = 2\nu$ and $\mathrm{sd}(Y_\nu) = \sqrt{2\nu}$. Therefore, $\frac{Y_\nu - E(Y_\nu)}{\mathrm{sd}(Y_\nu)} = \frac{Y_\nu - \nu}{\sqrt{2\nu}} \to Z \sim N(0,1)$ as $\nu \to \infty$. We will now prove this result using moment generating functions. By the definition of the MGF, we have
$$M_{\frac{Y_\nu-\nu}{\sqrt{2\nu}}}(t) = E\!\left[e^{t\left(\frac{Y_\nu-\nu}{\sqrt{2\nu}}\right)}\right] = E\!\left[e^{-t\sqrt{\frac{\nu}{2}}}e^{\frac{tY_\nu}{\sqrt{2\nu}}}\right] = e^{-t\sqrt{\frac{\nu}{2}}}\,E\!\left[e^{\frac{tY_\nu}{\sqrt{2\nu}}}\right] = e^{-t\sqrt{\frac{\nu}{2}}}\,M_{Y_\nu}\!\left(\frac{t}{\sqrt{2\nu}}\right) = e^{-t\sqrt{\frac{\nu}{2}}}\left(1 - \frac{2t}{\sqrt{2\nu}}\right)^{-\frac{\nu}{2}} = e^{-t\sqrt{\frac{\nu}{2}}}\left(1 - t\frac{\sqrt{2}}{\sqrt{\nu}}\right)^{-\frac{\nu}{2}}.$$
In order to evaluate $\lim_{\nu\to\infty}M_{\frac{Y_\nu-\nu}{\sqrt{2\nu}}}(t)$, we first take logarithms and then exponentiate the result. This implies that
$$\ln\!\left[M_{\frac{Y_\nu-\nu}{\sqrt{2\nu}}}(t)\right] = \ln\!\left[e^{-t\sqrt{\frac{\nu}{2}}}\right] + \ln\!\left[\left(1 - t\frac{\sqrt{2}}{\sqrt{\nu}}\right)^{-\frac{\nu}{2}}\right] = -t\sqrt{\frac{\nu}{2}} - \frac{\nu}{2}\ln\!\left(1 - t\frac{\sqrt{2}}{\sqrt{\nu}}\right).$$
From here, we use the Taylor series $\ln(1-z) = -z - \frac{z^2}{2} - \frac{z^3}{3} - \cdots$ with $z = t\frac{\sqrt{2}}{\sqrt{\nu}}$ to evaluate the limit, which gives
$$\lim_{\nu\to\infty}\ln\!\left[M_{\frac{Y_\nu-\nu}{\sqrt{2\nu}}}(t)\right] = \lim_{\nu\to\infty}\left[-t\sqrt{\frac{\nu}{2}} - \frac{\nu}{2}\left(-t\frac{\sqrt{2}}{\sqrt{\nu}} - \frac{t^2}{\nu} - \frac{2^{3/2}t^3}{3\nu^{3/2}} - \cdots\right)\right] = \lim_{\nu\to\infty}\left[-t\sqrt{\frac{\nu}{2}} + t\sqrt{\frac{\nu}{2}} + \frac{t^2}{2} + \frac{\sqrt{2}\,t^3}{3\sqrt{\nu}} + \cdots\right] = \frac{t^2}{2}.$$
This result therefore implies that the limit
$$\lim_{\nu\to\infty}M_{\frac{Y_\nu-\nu}{\sqrt{2\nu}}}(t) = e^{\lim_{\nu\to\infty}\ln\left[M_{\frac{Y_\nu-\nu}{\sqrt{2\nu}}}(t)\right]} = e^{\frac{t^2}{2}},$$
which is the moment generating function of a random variable that follows a standard normal distribution. This proves that the random variable $\frac{Y_\nu - \nu}{\sqrt{2\nu}} \to Z \sim N(0,1)$ as $\nu \to \infty$, just as is guaranteed by the CLT.
Chapter #9 – Point Estimation
Question #1: Assume that $X_1, \dots, X_n$ are independent and identically distributed with common density $f(x;\theta)$, where $\theta > 0$ is an unknown parameter. Find the method of moments estimator (MME) of $\theta$ if the density function is a) $f(x;\theta) = \theta x^{\theta-1}$ for $0 < x < 1$, b) $f(x;\theta) = (\theta+1)x^{-\theta-2}$ whenever $x > 1$, and c) $f(x;\theta) = \theta^2xe^{-\theta x}$ whenever $x > 0$.

a) We begin by computing the first population moment: $E(X) = \int_0^1 xf(x;\theta)\,dx = \int_0^1 x(\theta x^{\theta-1})\,dx = \theta\int_0^1 x^\theta\,dx = \frac{\theta}{\theta+1}\left[x^{\theta+1}\right]_0^1 = \frac{\theta}{\theta+1}(1-0) = \frac{\theta}{\theta+1}$. We therefore have $E(X) = \frac{\theta}{\theta+1}$. Next, we equate the first population moment with the first sample moment, which gives $\mu_1' = M_1' \;\rightarrow\; \frac{\theta}{\theta+1} = \frac{1}{n}\sum_{i=1}^n X_i \;\rightarrow\; \frac{\theta}{\theta+1} = \bar{X}$. Finally, we replace $\theta$ by $\hat{\theta}$ and solve the equation $\frac{\hat{\theta}}{\hat{\theta}+1} = \bar{X}$ for $\hat{\theta}$, which implies that $\hat{\theta}_{MME} = \frac{\bar{X}}{1-\bar{X}}$.

b) Just as above, we first compute $E(X) = \int_1^{\infty}xf(x;\theta)\,dx = \int_1^{\infty}x[(\theta+1)x^{-\theta-2}]\,dx = (\theta+1)\int_1^{\infty}x^{-\theta-1}\,dx = \frac{\theta+1}{-\theta}\left[x^{-\theta}\right]_1^{\infty} = -\frac{\theta+1}{\theta}[0-1] = \frac{\theta+1}{\theta}$. Thus, we have $E(X) = \frac{\theta+1}{\theta}$, which means that $\mu_1' = M_1' \;\rightarrow\; \frac{\theta+1}{\theta} = \frac{1}{n}\sum_{i=1}^n X_i \;\rightarrow\; \frac{\theta+1}{\theta} = \bar{X}$, and $\hat{\theta}_{MME} = \frac{1}{\bar{X}-1}$.

c) We have $E(X) = \int_0^{\infty}xf(x;\theta)\,dx = \int_0^{\infty}x[\theta^2xe^{-\theta x}]\,dx = \theta^2\int_0^{\infty}x^2e^{-\theta x}\,dx = \cdots = \frac{2}{\theta}$ after doing integration by parts. We can also find this directly by noting that the density $f(x;\theta) = \theta^2xe^{-\theta x}$ is that of $X \sim \mathrm{GAMMA}\!\left(\frac{1}{\theta}, 2\right)$, which implies that $E(X) = \text{(shape)}\times\text{(scale)} = 2\cdot\frac{1}{\theta} = \frac{2}{\theta}$. We therefore set $\mu_1' = M_1'$, so that $\frac{2}{\theta} = \frac{1}{n}\sum_{i=1}^n X_i$, or $\frac{2}{\theta} = \bar{X}$, and then solve for the method of moments estimator, which is given by $\hat{\theta}_{MME} = \frac{2}{\bar{X}}$.
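A quick numerical illustration of part a) (ours, not from the original solution): draw data from $f(x;\theta) = \theta x^{\theta-1}$, which is the $\mathrm{BETA}(\theta,1)$ density, and compare $\hat{\theta}_{MME} = \bar{X}/(1-\bar{X})$ with the true value.

```python
import numpy as np

rng = np.random.default_rng(9)
theta_true = 3.0
# f(x; theta) = theta * x^(theta - 1) on (0,1) is the Beta(theta, 1) density;
# it can be sampled by inversion as U^(1/theta).
x = rng.uniform(size=100_000) ** (1.0 / theta_true)

xbar = x.mean()
theta_mme = xbar / (1.0 - xbar)
print("true theta:", theta_true, " MME:", round(theta_mme, 3))
```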
Question #2: Assume that $X_1, \dots, X_n$ are independent and identically distributed. Find the method of moments estimators (MME) of the unknown parameters if the random sample comes from a) $X \sim \mathrm{NB}(3,p)$, b) $X \sim \mathrm{GAMMA}(2,\kappa)$, c) $X \sim \mathrm{WEI}(\theta,\tfrac12)$, and d) $X \sim \mathrm{PAR}(\theta,\kappa)$.

a) Since $X \sim \mathrm{NB}(3,p)$, we know that $E(X) = \frac{r}{p} = \frac{3}{p}$. Equating this with the first sample moment gives $\mu_1' = M_1' \;\rightarrow\; \frac{3}{p} = \bar{X}$, so the estimator is $\hat{p}_{MME} = \frac{3}{\bar{X}}$.

b) Since $X \sim \mathrm{GAMMA}(2,\kappa)$, we know that $E(X) = \kappa\theta = 2\kappa$. Equating this with the first sample moment gives $\mu_1' = M_1' \;\rightarrow\; 2\kappa = \bar{X}$, so the estimator is $\hat{\kappa}_{MME} = \frac{\bar{X}}{2}$.

c) Since $X \sim \mathrm{WEI}(\theta,\tfrac12)$, we know that $E(X) = \theta\Gamma\!\left(1 + \frac{1}{\beta}\right) = \theta\Gamma(1 + 2) = \theta\Gamma(3) = \theta(3-1)! = 2\theta$. Thus, we have $\mu_1' = M_1' \;\rightarrow\; 2\theta = \bar{X}$, so the estimator is $\hat{\theta}_{MME} = \frac{\bar{X}}{2}$.

d) Since $X \sim \mathrm{PAR}(\theta,\kappa)$, we have $\mu_1 = \frac{\theta}{\kappa-1}$ and $\mu_2' = \sigma^2 + \mu_1^2 = \frac{\theta^2\kappa}{(\kappa-2)(\kappa-1)^2} + \frac{\theta^2}{(\kappa-1)^2}$. This means that $\mu_1 = \frac{\theta}{\kappa-1} = M_1' = \bar{X}$ and $\mu_2' = \frac{\theta^2\kappa}{(\kappa-2)(\kappa-1)^2} + \frac{\theta^2}{(\kappa-1)^2} = M_2' = \frac{1}{n}\sum_{i=1}^n X_i^2$. We must solve for the unknown parameters $\theta$ and $\kappa$ in terms of the two sample moments $\bar{X}$ and $\frac{1}{n}\sum_{i=1}^n X_i^2$. From the first equation, we can solve to find $\theta = (\kappa-1)\bar{X}$, and substitute into the second equation to find $\frac{\bar{X}^2(\kappa-1)^2\kappa}{(\kappa-2)(\kappa-1)^2} + \frac{\bar{X}^2(\kappa-1)^2}{(\kappa-1)^2} = \frac{1}{n}\sum_{i=1}^n X_i^2 \;\rightarrow\; \frac{\bar{X}^2\kappa}{\kappa-2} + \bar{X}^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 \;\rightarrow\; \bar{X}^2\left(\frac{\kappa}{\kappa-2} + 1\right) = \frac{1}{n}\sum_{i=1}^n X_i^2 \;\rightarrow\; \frac{2\kappa-2}{\kappa-2} = \frac{\sum_{i=1}^n X_i^2}{n\bar{X}^2}$. But this means $n\bar{X}^2(2\kappa-2) = (\kappa-2)\sum_{i=1}^n X_i^2 \;\rightarrow\; 2n\bar{X}^2\kappa - 2n\bar{X}^2 = \kappa\sum_{i=1}^n X_i^2 - 2\sum_{i=1}^n X_i^2$, so that $2n\bar{X}^2\kappa - \kappa\sum_{i=1}^n X_i^2 = 2n\bar{X}^2 - 2\sum_{i=1}^n X_i^2 \;\rightarrow\; \kappa\left(2n\bar{X}^2 - \sum_{i=1}^n X_i^2\right) = 2n\bar{X}^2 - 2\sum_{i=1}^n X_i^2$. Finally, we divide through to find $\kappa = \frac{2n\bar{X}^2 - 2\sum_{i=1}^n X_i^2}{2n\bar{X}^2 - \sum_{i=1}^n X_i^2}$. Plugging in to the other equation implies that $\theta = (\kappa-1)\bar{X} = \left(\frac{2n\bar{X}^2 - 2\sum_{i=1}^n X_i^2}{2n\bar{X}^2 - \sum_{i=1}^n X_i^2} - 1\right)\bar{X}$, so the two method of moments estimators are $\hat{\kappa}_{MME} = \frac{2n\bar{X}^2 - 2\sum_{i=1}^n X_i^2}{2n\bar{X}^2 - \sum_{i=1}^n X_i^2}$ and $\hat{\theta}_{MME} = \left(\frac{2n\bar{X}^2 - 2\sum_{i=1}^n X_i^2}{2n\bar{X}^2 - \sum_{i=1}^n X_i^2} - 1\right)\bar{X}$.
Question #3: Assume that $X_1, \dots, X_n$ are independent and identically distributed with common density $f(x;\theta)$, where $\theta > 0$ is an unknown parameter. Find the maximum likelihood estimator (MLE) of $\theta$ when the PDF is a) $f(x;\theta) = \theta x^{\theta-1}$ whenever $0 < x < 1$, b) $f(x;\theta) = (\theta+1)x^{-\theta-2}$ whenever $x > 1$, and c) $f(x;\theta) = \theta^2xe^{-\theta x}$ whenever $x > 0$.

a) We first find the likelihood function based on the joint density of $X_1, \dots, X_n$, which is $L(\theta) = f(x_1;\theta)\cdots f(x_n;\theta) = \prod_{i=1}^n f(x_i;\theta) = \prod_{i=1}^n\theta x_i^{\theta-1} = \theta^n(x_1\cdots x_n)^{\theta-1}$. Next, we construct the log-likelihood function, since it is easier to differentiate and achieves a maximum at the same point as the likelihood function. This gives $\ln[L(\theta)] = \ln[\theta^n(x_1\cdots x_n)^{\theta-1}] = n\ln(\theta) + (\theta-1)[\ln(x_1) + \cdots + \ln(x_n)]$, which we differentiate so $\frac{\partial}{\partial\theta}\ln[L(\theta)] = \frac{\partial}{\partial\theta}\left[n\ln(\theta) + (\theta-1)\sum_{i=1}^n\ln(x_i)\right] = \frac{n}{\theta} + \sum_{i=1}^n\ln(x_i)$. We then solve for the value of $\theta$ which makes the derivative equal to zero, so $\frac{n}{\theta} + \sum_{i=1}^n\ln(x_i) = 0 \;\rightarrow\; \theta = -\frac{n}{\sum_{i=1}^n\ln(x_i)}$. Since it is clear that the second derivative of $\ln[L(\theta)]$ is negative, we have found that the maximum likelihood estimator is $\hat{\theta}_{MLE} = -\frac{n}{\sum_{i=1}^n\ln(X_i)}$. (Note that we must capitalize the $x_i$ to $X_i$ when presenting the estimator.)

b) We have $L(\theta) = \prod_{i=1}^n f(x_i;\theta) = \prod_{i=1}^n(\theta+1)x_i^{-\theta-2} = (\theta+1)^n(x_1\cdots x_n)^{-\theta-2}$, so that $\ln[L(\theta)] = \ln[(\theta+1)^n(x_1\cdots x_n)^{-\theta-2}] = n\ln(\theta+1) - (\theta+2)\sum_{i=1}^n\ln(x_i)$. Then we find $\frac{\partial}{\partial\theta}\ln[L(\theta)] = \frac{\partial}{\partial\theta}\left[n\ln(\theta+1) - (\theta+2)\sum_{i=1}^n\ln(x_i)\right] = \frac{n}{\theta+1} - \sum_{i=1}^n\ln(x_i)$. Finally, we must solve $\frac{n}{\theta+1} - \sum_{i=1}^n\ln(x_i) = 0 \;\rightarrow\; \theta = \frac{n}{\sum_{i=1}^n\ln(x_i)} - 1$. Since the second derivative of $\ln[L(\theta)]$ will be negative, we have found that $\hat{\theta}_{MLE} = \frac{n}{\sum_{i=1}^n\ln(X_i)} - 1$.

c) We have $L(\theta) = \prod_{i=1}^n f(x_i;\theta) = \prod_{i=1}^n\theta^2x_ie^{-\theta x_i} = \theta^{2n}(x_1\cdots x_n)e^{-\theta(x_1+\cdots+x_n)}$, so that $\ln[L(\theta)] = \ln\!\left[\theta^{2n}(x_1\cdots x_n)e^{-\theta\sum_{i=1}^n x_i}\right] = 2n\ln(\theta) + \sum_{i=1}^n\ln(x_i) - \theta\sum_{i=1}^n x_i$. Then we have $\frac{\partial}{\partial\theta}\ln[L(\theta)] = \frac{\partial}{\partial\theta}\left[2n\ln(\theta) + \sum_{i=1}^n\ln(x_i) - \theta\sum_{i=1}^n x_i\right] = \frac{2n}{\theta} - \sum_{i=1}^n x_i$. Finally, we must solve $\frac{2n}{\theta} - \sum_{i=1}^n x_i = 0 \;\rightarrow\; \theta = \frac{2n}{\sum_{i=1}^n x_i} = \frac{2}{\bar{x}}$, which implies that $\hat{\theta}_{MLE} = \frac{2}{\bar{X}}$.
Question #4: Assume that 𝑋1 , … , 𝑋𝑛 are independent and identically distributed. Find the
maximum likelihood estimator (MLE) of the parameter if the distribution is a) 𝑋𝑖 ~𝐡𝐼𝑁(1, 𝑝),
1
b) 𝑋𝑖 ~𝐺𝐸𝑂(𝑝) , c) 𝑋𝑖 ~𝑁𝐡(3, 𝑝), d) 𝑋𝑖 ~𝐺𝐴𝑀𝑀𝐴(πœƒ, 2), e) 𝑋𝑖 ~π‘ŠπΈπΌ (πœƒ, 2), and f) 𝑋𝑖 ~𝑃𝐴𝑅(1, πœ…).
a) Since the density of 𝑋~𝐡𝐼𝑁(1, 𝑝) is 𝑓(π‘₯; 𝑝) = (π‘₯1)𝑝 π‘₯ (1 − 𝑝)1−π‘₯ = 𝑝 π‘₯ (1 − 𝑝)1−π‘₯ , we
𝑛
𝑛
have 𝐿(𝑝) = ∏𝑛𝑖=1 𝑓(π‘₯𝑖 ; 𝑝) = ∏𝑛𝑖=1 𝑝 π‘₯𝑖 (1 − 𝑝)1−π‘₯𝑖 = 𝑝∑𝑖=1 π‘₯𝑖 (1 − 𝑝)𝑛−∑𝑖=1 π‘₯𝑖 and then
𝑛
𝑛
ln[𝐿(𝑝)] = ln[𝑝∑𝑖=1 π‘₯𝑖 (1 − 𝑝)𝑛−∑𝑖=1 π‘₯𝑖 ] = (∑𝑛𝑖=1 π‘₯𝑖 ) ln(𝑝) + (𝑛 − ∑𝑛𝑖=1 π‘₯𝑖 )ln(1 − 𝑝).
Differentiating
∑𝑛
𝑖=1 π‘₯𝑖
𝑝
−
𝑛−∑𝑛
𝑖=1 π‘₯𝑖
1−𝑝
gives
=0→
πœ•
πœ•π‘
ln[𝐿(𝑝)] =
∑𝑛
𝑖=1 π‘₯𝑖
𝑝
=
πœ•
πœ•π‘
𝑛−∑𝑛
𝑖=1 π‘₯𝑖
1−𝑝
[(∑𝑛𝑖=1 π‘₯𝑖 ) ln(𝑝) + (𝑛 − ∑𝑛𝑖=1 π‘₯𝑖 )ln(1 − 𝑝)] =
→ (1 − 𝑝) ∑𝑛𝑖=1 π‘₯𝑖 = 𝑝(𝑛 − ∑𝑛𝑖=1 π‘₯𝑖 ) →
∑𝑛𝑖=1 π‘₯𝑖 − 𝑝 ∑𝑛𝑖=1 π‘₯𝑖 = 𝑝𝑛 − 𝑝 ∑𝑛𝑖=1 π‘₯𝑖 → ∑𝑛𝑖=1 π‘₯𝑖 = 𝑝𝑛 → 𝑝 =
∑𝑛
𝑖=1 π‘₯𝑖
𝑛
= π‘₯Μ… .
Since
the
second derivative will be negative, we have found that 𝑝̂ 𝑀𝐿𝐸 = 𝑋̅.
b) Since 𝑓(π‘₯; 𝑝) = 𝑝(1 − 𝑝)π‘₯−1 , we have 𝐿(𝑝) = ∏𝑛𝑖=1 𝑓(π‘₯𝑖 ; 𝑝) = ∏𝑛𝑖=1 𝑝(1 − 𝑝)π‘₯𝑖 −1 =
𝑛
𝑝𝑛 (1 − 𝑝)[∑𝑖=1 π‘₯𝑖 ]−𝑛 and then the log likelihood function becomes ln[𝐿(𝑝)] =
𝑛
ln[𝑝𝑛 (1 − 𝑝)[∑𝑖=1 π‘₯𝑖 ]−𝑛 ] = 𝑛 ln(𝑝) + {[∑𝑛𝑖=1 π‘₯𝑖 ] − 𝑛}ln(1 − 𝑝).
πœ•
πœ•
𝑛
ln[𝐿(𝑝)] = πœ•π‘ [𝑛 ln(𝑝) + {[∑𝑛𝑖=1 π‘₯𝑖 ] − 𝑛} ln(1 − 𝑝)] = 𝑝 −
πœ•π‘
with zero implies
𝑛
−
𝑝
[∑𝑛
𝑖=1 π‘₯𝑖 ]−𝑛
1−𝑝
𝑛
=0→𝑝=
[∑𝑛
𝑖=1 π‘₯𝑖 ]−𝑛
1−𝑝
𝑛
𝑛 − 𝑛𝑝 = 𝑝 ∑𝑛𝑖=1 π‘₯𝑖 − 𝑛𝑝 → 𝑛 = 𝑝 ∑𝑛𝑖=1 π‘₯𝑖 → 𝑝 = ∑𝑛
𝑖=1 π‘₯𝑖
Differentiating
[∑𝑛
𝑖=1 π‘₯𝑖 ]−𝑛
1−𝑝
gives
. Equating this
→ (1 − 𝑝)𝑛 = 𝑝[∑𝑛𝑖=1 π‘₯𝑖 − 𝑛] →
=
1
1 𝑛
∑
π‘₯
𝑛 𝑖=1 𝑖
1
= π‘₯Μ… . Since the second
1
derivative will be negative, we have found that 𝑝̂ 𝑀𝐿𝐸 = 𝑋̅.
(π‘₯−1)!
c) Since 𝑋~𝑁𝐡(3, 𝑝), we have 𝑓(π‘₯; 𝑝) = (π‘₯−1
)𝑝3 (1 − 𝑝)π‘₯−3 = 2(π‘₯−3)! 𝑝3 (1 − 𝑝)π‘₯−3 =
3−1
(π‘₯−1)(π‘₯−2)
2
1
𝑝3 (1 − 𝑝)π‘₯−3 = 2 (π‘₯ 2 − 3π‘₯ + 2)𝑝3 (1 − 𝑝)π‘₯−3.
This
implies
that
the
1
likelihood function 𝐿(𝑝) = ∏𝑛𝑖=1 𝑓(π‘₯𝑖 ; 𝑝) = ∏𝑛𝑖=1 [2 (π‘₯𝑖2 − 3π‘₯𝑖 + 2)𝑝3 (1 − 𝑝)π‘₯𝑖 −3 ] =
𝑛
2−𝑛 (π‘₯𝑖2 − 3π‘₯𝑖 + 2)𝑛 𝑝3𝑛 (1 − 𝑝)[∑𝑖=1 π‘₯𝑖 ]−3𝑛 , so the log likelihood function ln[𝐿(𝑝)] =
𝑛
ln[2−𝑛 (π‘₯𝑖2 − 3π‘₯𝑖 + 2)𝑛 𝑝3𝑛 (1 − 𝑝)[∑𝑖=1 π‘₯𝑖 ]−3𝑛 ] = −𝑛 ln(2) + 𝑛 ln(π‘₯𝑖2 − 3π‘₯𝑖 + 2) +
3𝑛 ln(𝑝) + {[∑𝑛𝑖=1 π‘₯𝑖 ] − 3𝑛} ln(1 − 𝑝). Differentiating this then gives
πœ•
πœ•π‘
πœ•
ln[𝐿(𝑝)] =
πœ•π‘
[−𝑛 ln(2) + 𝑛 ln(π‘₯𝑖2 − 3π‘₯𝑖 + 2) + 3𝑛 ln(𝑝) + {[∑𝑛𝑖=1 π‘₯𝑖 ] − 3𝑛} ln(1 − 𝑝)] =
[∑𝑛
𝑖=1 π‘₯𝑖 ]−3𝑛
1−𝑝
3
3𝑛
𝑝
−
3
= 0 → β‹― → 𝑝 = 𝑋̅. Therefore, we have that 𝑝̂ 𝑀𝐿𝐸 = 𝑋̅.
π‘₯
1
π‘₯
1
d) Since 𝑋~𝐺𝐴𝑀𝑀𝐴(πœƒ, 2), we have 𝑓(π‘₯; πœƒ) = πœƒ2 Γ(2) π‘₯ 2−1 𝑒 −πœƒ = πœƒ2 π‘₯𝑒 −πœƒ . This means
π‘₯𝑖
1
1
1
𝐿(πœƒ) = ∏𝑛𝑖=1 𝑓(π‘₯𝑖 ; πœƒ) = ∏𝑛𝑖=1 [πœƒ2 π‘₯𝑖 𝑒 − πœƒ ] = πœƒ2𝑛 (π‘₯1 … π‘₯𝑛 )𝑒 −πœƒ
1
1
ln [πœƒ2𝑛 (π‘₯1 … π‘₯𝑛 )𝑒 −πœƒ
gives
πœ•
∑𝑛
𝑖=1 π‘₯𝑖
so that ln[𝐿(πœƒ)] =
1
] = −2𝑛 ln(πœƒ) + ∑𝑛𝑖=1 ln(π‘₯𝑖 ) − πœƒ ∑𝑛𝑖=1 π‘₯𝑖 .
πœ•
πœ•πœƒ
∑𝑛
𝑖=1 π‘₯𝑖
1
ln[𝐿(πœƒ)] = πœ•πœƒ [−2𝑛 ln(πœƒ) + ∑𝑛𝑖=1 ln(π‘₯𝑖 ) − πœƒ ∑𝑛𝑖=1 π‘₯𝑖 ] = −
we solve
1
πœƒ2
∑𝑛𝑖=1 π‘₯𝑖 −
2𝑛
πœƒ
1
= 0 → πœƒ2 ∑𝑛𝑖=1 π‘₯𝑖 =
2𝑛
πœƒ
Differentiating
2𝑛
πœƒ
1
+ πœƒ2 ∑𝑛𝑖=1 π‘₯𝑖 . Then
→ πœƒ ∑𝑛𝑖=1 π‘₯𝑖 = 2π‘›πœƒ 2 → πœƒ =
∑𝑛
𝑖=1 π‘₯𝑖
2𝑛
π‘₯Μ…
= 2.
𝑋̅
Since the second derivative will be zero, we have found that πœƒΜ‚π‘€πΏπΈ = 2 .
e) Since X ~ WEI(θ, 1/2), we have f(x; θ) = [1/(2√θ)] x^{−1/2} e^{−√(x/θ)}. Thus, we have L(θ) = ∏_{i=1}^n f(x_i; θ) = ∏_{i=1}^n [(1/(2√θ)) x_i^{−1/2} e^{−√(x_i/θ)}] = 2^{−n} θ^{−n/2}(x_1 ⋯ x_n)^{−1/2} e^{−(1/√θ)∑√x_i}, so that the log of the likelihood function is ln[L(θ)] = −n ln(2) − (n/2) ln(θ) − ½∑_{i=1}^n ln(x_i) − (1/√θ)∑_{i=1}^n √x_i. Differentiating this gives ∂/∂θ ln[L(θ)] = −n/(2θ) + (∑√x_i)/(2θ^{3/2}). Setting this equal to zero and solving implies −n/(2θ) + (∑√x_i)/(2θ^{3/2}) = 0 → (∑√x_i)/(2θ^{3/2}) = n/(2θ) → 2θ∑√x_i = 2nθ^{3/2} → √θ = (∑√x_i)/n → θ = [(∑_{i=1}^n √x_i)/n]². Therefore, we have found θ̂_MLE = [(∑_{i=1}^n √X_i)/n]².
πœ…
f) Since 𝑋~𝑃𝐴𝑅(1, πœ…), we have 𝑓(π‘₯; πœ…) = (1+π‘₯)πœ…+1 so the likelihood function is 𝐿(πœ…) =
∏𝑛𝑖=1 𝑓(π‘₯𝑖 ; πœ…) = ∏𝑛𝑖=1[πœ…(1 + π‘₯𝑖 )−πœ…−1 ] = πœ… 𝑛 ∏𝑛𝑖=1(1 + π‘₯𝑖 )−πœ…−1. Then we have that
ln[𝐿(πœ…)] = ln[πœ… 𝑛 ∏𝑛𝑖=1(1 + π‘₯𝑖 )−πœ…−1 ] = 𝑛 ln(πœ…) − (πœ… + 1) ∑𝑛𝑖=1 ln(1 + π‘₯𝑖 ).
compute the derivative so that
𝑛
πœ…
𝑛
πœ…
πœ•
πœ•πœ…
Next,
we
πœ•
ln[𝐿(πœ…)] = πœ•πœ… [𝑛 ln(πœ…) − (πœ… + 1) ∑𝑛𝑖=1 ln(1 + π‘₯𝑖 )] =
− ∑𝑛𝑖=1 ln(1 + π‘₯𝑖 ). Finally, we set this result equal to zero and solve for πœ… to find that
𝑛
𝑛
− ∑𝑛𝑖=1 ln(1 + π‘₯𝑖 ) = 0 → πœ… = ∑𝑛𝑖=1 ln(1 + π‘₯𝑖 ) → πœ… = ∑𝑛
. Since the second
𝑖=1 ln(1+π‘₯𝑖 )
derivative will be negative, we have found that πœ…Μ‚ 𝑀𝐿𝐸 = ∑𝑛
𝑛
.
𝑖=1 ln(1+𝑋𝑖 )
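As a check, numpy's pareto generator draws from exactly this density κ/(1+x)^{κ+1}, so the estimator can be verified by simulation (a minimal sketch; the parameter values below are arbitrary choices):

import numpy as np

rng = np.random.default_rng(2)
kappa_true, n = 1.7, 10000
x = rng.pareto(kappa_true, size=n)            # density kappa/(1+x)^(kappa+1), x > 0
kappa_hat = n / np.log(1 + x).sum()           # MLE derived above
print(kappa_hat)                              # close to 1.7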
Chapter #9 – Point Estimation
Question #7: Let X_1, …, X_n be a random sample from X ~ GEO(p). Find the Maximum Likelihood Estimator (MLE) for a) E(X) = 1/p, b) Var(X) = (1−p)/p², and c) P(X > k) = (1−p)^k where k ∈ {1, 2, …}. Do it both ways for each part to verify the Invariance Property.
a) We begin by computing p̂_MLE by first calculating the likelihood function L(p) = ∏_{i=1}^n f(x_i; p) = ∏_{i=1}^n (1−p)^{x_i−1} p = p^n(1−p)^{∑(x_i−1)}. Then we can compute ln[L(p)] = n ln(p) + [∑_{i=1}^n (x_i − 1)] ln(1−p) and then differentiate: ∂/∂p ln[L(p)] = n/p − (∑x_i − n)/(1−p). Setting equal to zero and solving for p gives n/p = (∑x_i − n)/(1−p) → (1−p)n = p∑x_i − np → n − pn = p∑x_i − pn → n = p∑x_i. This then implies that p = n/∑_{i=1}^n x_i = 1/x̄. Since the second derivative will be negative, we have found that p̂_MLE = 1/X̄. By the Invariance Property of the Maximum Likelihood Estimator, we have that τ(p̂) = 1/p̂_MLE = 1/(1/X̄) = X̄ is the MLE for τ(p) = E(X) = 1/p.
b) Since p̂_MLE = 1/X̄ and τ(p) = Var(X) = (1−p)/p², then τ(p̂) = (1 − p̂_MLE)/p̂_MLE² = (1 − 1/X̄)/(1/X̄)² = X̄(X̄ − 1), by the Invariance Property of the Maximum Likelihood Estimator.
c) Since p̂_MLE = 1/X̄ and τ(p) = (1−p)^k, then τ(p̂) = (1 − p̂_MLE)^k = (1 − 1/X̄)^k, by the
Invariance Property of the Maximum Likelihood Estimator.
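The Invariance Property in parts a)–c) can be illustrated numerically (a minimal sketch; numpy's geometric generator has support {1, 2, …}, matching GEO(p), and the chosen p, n, and k are arbitrary):

import numpy as np

rng = np.random.default_rng(3)
p_true, n, k = 0.4, 20000, 3
x = rng.geometric(p_true, size=n)             # GEO(p), support {1, 2, ...}
p_hat = 1 / x.mean()                          # MLE of p
print(x.mean(), 1 / p_true)                                   # MLE of E(X) vs true mean
print(x.mean() * (x.mean() - 1), (1 - p_true) / p_true**2)    # MLE of Var(X) vs truth
print((1 - p_hat)**k, (1 - p_true)**k)                        # MLE of P(X > k) vs truth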
Question #12: Let 𝑋1 , … , 𝑋𝑛 be a random sample from 𝑋~𝐿𝑂𝐺𝑁(πœ‡, 𝜎 2 ). Find the Maximum
Likelihood Estimator (MLE) for a) the parameters πœ‡ and 𝜎 2 , and b) 𝜏(πœ‡, 𝜎 2 ) = 𝐸(𝑋).
a) We have that the density function of X is f_X(x; μ, σ²) = [1/(√(2πσ²) x)] e^{−(ln x − μ)²/(2σ²)}, so that the likelihood function of the sample is given by L(μ, σ²) = ∏_{i=1}^n f(x_i; μ, σ²) = ∏_{i=1}^n [1/(√(2πσ²) x_i)] e^{−(ln x_i − μ)²/(2σ²)} = (2πσ²)^{−n/2}(x_1 ⋯ x_n)^{−1} e^{−(1/2σ²)∑(ln x_i − μ)²}. Then the log likelihood function is ln[L(μ, σ²)] = −(n/2) ln(2πσ²) − ∑_{i=1}^n ln(x_i) − (1/2σ²)∑_{i=1}^n (ln x_i − μ)². We differentiate this with respect to both parameters and set the resulting expressions equal to zero so we can simultaneously solve for the parameters, so ∂/∂μ ln[L(μ, σ²)] = (1/σ²)∑_{i=1}^n (ln x_i − μ) = 0 and ∂/∂σ² ln[L(μ, σ²)] = −n/(2σ²) + (1/2σ⁴)∑_{i=1}^n (ln x_i − μ)² = 0. The first equation implies (1/σ²)∑(ln x_i − μ) = 0 → ∑(ln x_i − μ) = 0 → ∑(ln x_i) − nμ = 0 → μ = (1/n)∑_{i=1}^n ln(x_i), and the second −n/(2σ²) + (1/2σ⁴)∑(ln x_i − μ)² = 0 → (1/2σ⁴)∑(ln x_i − μ)² = n/(2σ²) → (1/σ²)∑(ln x_i − μ)² = n → σ² = (1/n)∑(ln x_i − μ)². Thus, we have that the maximum likelihood estimators are μ̂_MLE = (1/n)∑_{i=1}^n ln(x_i) and σ̂²_MLE = (1/n)∑_{i=1}^n (ln x_i − μ̂_MLE)².
b) We know that X ~ LOGN(μ, σ²) if and only if Y = ln(X) ~ N(μ, σ²). But Y = ln(X) if and only if X = e^Y, which implies that E(X) = E(e^Y) = M_Y(1) = e^{μ(1) + σ²(1)²/2} = e^{μ + σ²/2}. By the Invariance Property of the Maximum Likelihood Estimator, we can conclude that τ(μ̂, σ̂²)_MLE = e^{μ̂_MLE + σ̂²_MLE/2} is the MLE for τ(μ, σ²) = E(X) = e^{μ + σ²/2}.
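A quick simulation check of parts a) and b) (a minimal sketch; numpy's lognormal generator is parameterized by the underlying normal mean and standard deviation, and the values below are arbitrary):

import numpy as np

rng = np.random.default_rng(4)
mu, sigma, n = 1.0, 0.5, 5000
x = rng.lognormal(mean=mu, sigma=sigma, size=n)
mu_hat = np.log(x).mean()                       # (1/n) sum ln x_i
var_hat = np.log(x).var()                       # (1/n) sum (ln x_i - mu_hat)^2
print(mu_hat, var_hat)                          # close to (1.0, 0.25)
print(np.exp(mu_hat + var_hat / 2), np.exp(mu + sigma**2 / 2))   # MLE of E(X) vs truth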
Question #17: Let X_1, …, X_n be a random sample from X ~ UNIF(θ − 1, θ + 1). a) Show that the sample mean X̄ is an unbiased estimator for θ; b) show that the midrange M = (X_(1) + X_(n))/2 is also an unbiased estimator for the parameter θ; c) which one has a smaller variance?
a) To show that X̄ is an unbiased estimator for θ, we must verify that E(X̄) = θ. But we see that E(X̄) = E((1/n)∑_{i=1}^n X_i) = (1/n)E(∑X_i) = (1/n)∑_{i=1}^n E(X_i) = (1/n)∑_{i=1}^n [(θ+1)+(θ−1)]/2 = (1/n)∑_{i=1}^n θ = (1/n)nθ = θ, so it is clear that the sample mean is an unbiased estimator for θ.
b) We have that E(M) = E((X_(1) + X_(n))/2) = ½E(X_(1) + X_(n)) = ½[E(X_(1)) + E(X_(n))]. We must therefore compute the mean of the smallest and largest order statistics, which we can do by first finding their density functions. We first note that since X ~ UNIF(θ−1, θ+1), then f_X(t) = 1/[(θ+1)−(θ−1)] = 1/2 whenever t ∈ (θ−1, θ+1) and F_X(t) = ∫_{θ−1}^t (1/2) dx = ½[x]_{θ−1}^t = (t − (θ−1))/2 whenever t ∈ (θ−1, θ+1). Then the distribution function of X_(n) is given by F_n(t) = P(X_(n) ≤ t) = P(X_i ≤ t)^n = [(t − (θ−1))/2]^n = (t − θ + 1)^n/2^n, so the density function of X_(n) is f_n(t) = d/dt F_n(t) = n(t − θ + 1)^{n−1}/2^n. We can then compute the mean of X_(n) as E(X_(n)) = ∫_{θ−1}^{θ+1} t f_n(t) dt = ∫_{θ−1}^{θ+1} t·n(t − θ + 1)^{n−1}/2^n dt. This integral can be calculated by completing the substitution u = t − θ + 1 so that du = dt and t = u + θ − 1. This then implies ∫_{θ−1}^{θ+1} t·n(t − θ + 1)^{n−1}/2^n dt = (n/2^n)∫_0^2 (u + θ − 1)u^{n−1} du = (n/2^n)∫_0^2 (u^n + θu^{n−1} − u^{n−1}) du = (n/2^n)[u^{n+1}/(n+1) + θu^n/n − u^n/n]_0^2 = (n/2^n)[2^{n+1}/(n+1) + θ2^n/n − 2^n/n] = 2n/(n+1) + θ − 1. We can similarly compute that the expected value of the first order statistic is E(X_(1)) = −2n/(n+1) + θ + 1. Thus, we have that E(M) = ½[E(X_(1)) + E(X_(n))] = ½[(θ − 2n/(n+1) + 1) + (θ + 2n/(n+1) − 1)] = ½[2θ] = θ, so the midrange is also an unbiased estimator for the parameter θ.
c) We have that Var(X̄) = Var((1/n)∑_{i=1}^n X_i) = (1/n²)Var(∑X_i) = (1/n²)∑_{i=1}^n Var(X_i) = (1/n²)∑_{i=1}^n [(θ+1)−(θ−1)]²/12 = (1/n²)∑_{i=1}^n 4/12 = (1/n²)(n/3) = 1/(3n). Similarly, we can calculate that Var(M) = Var((X_(1) + X_(n))/2) = ¼Var(X_(1) + X_(n)). Carrying out this computation using the joint density of X_(1) and X_(n) gives the standard uniform-midrange result Var(M) = 2/[(n+1)(n+2)], which is smaller than 1/(3n) for every n ≥ 2, so the midrange has the smaller variance.
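The comparison can also be confirmed by simulation (a minimal sketch; the closed form 2/[(n+1)(n+2)] quoted above is the standard result for the uniform midrange, and θ, n, and the replication count are arbitrary choices):

import numpy as np

rng = np.random.default_rng(5)
theta, n, reps = 10.0, 20, 50000
x = rng.uniform(theta - 1, theta + 1, size=(reps, n))
xbar = x.mean(axis=1)
mid = (x.min(axis=1) + x.max(axis=1)) / 2
print(xbar.var(), 1 / (3 * n))                 # variance of the sample mean
print(mid.var(), 2 / ((n + 1) * (n + 2)))      # variance of the midrange (smaller)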
Question #21: Let 𝑋1 , … , 𝑋𝑛 be a random sample from 𝑋~𝐡𝐼𝑁(1, 𝑝). a) Find the Cramer-Rao
lower bound for the variances of all unbiased estimators of 𝑝; b) Find the Cramer-Rao lower
bound for the variances of unbiased estimators of 𝑝(1 − 𝑝); c) Find a UMVUE of 𝑝.
a) We have that CRLB = [τ′(p)]²/(n·E[(∂/∂p ln f(X; p))²]), so we compute each of these parts individually. First, we have τ(p) = p, so τ′(p) = 1 and [τ′(p)]² = 1. Next, since X ~ BIN(1, p) we know that f_X(x) = p^x(1−p)^{1−x} and so f(X; p) = p^X(1−p)^{1−X}, which means that ln f(X; p) = X ln(p) + (1−X) ln(1−p). Taking the derivative and squaring gives ∂/∂p ln f(X; p) = X/p − (1−X)/(1−p) = [(1−p)X − p(1−X)]/[p(1−p)] = (X − pX − p + pX)/[p(1−p)] = (X − p)/[p(1−p)] → (∂/∂p ln f(X; p))² = [(X−p)/(p(1−p))]² = (X² − 2pX + p²)/[p²(1−p)²]. Finally, we compute E[(∂/∂p ln f(X; p))²] = E[(X² − 2pX + p²)/(p²(1−p)²)] = [1/(p²(1−p)²)][E(X²) − 2pE(X) + p²] = [1/(p²(1−p)²)][(p(1−p) + p²) − 2p(p) + p²] = [1/(p²(1−p)²)][p − p² + p² − 2p² + p²] = [1/(p²(1−p)²)][p − p²] = p(1−p)/[p²(1−p)²] = 1/[p(1−p)]. Thus, we have found that CRLB = p(1−p)/n.
b) Now, τ(p) = p(1−p) = p − p², so [τ′(p)]² = [1 − 2p]² = 1 − 4p + 4p², so the Cramer-Rao Lower Bound becomes CRLB = (1 − 4p + 4p²)p(1−p)/n.
c) Since for the estimator p̂ = X̄, we have E(p̂) = E(X̄) = E((1/n)∑_{i=1}^n X_i) = (1/n)∑_{i=1}^n E(X_i) = (1/n)np = p and then Var(p̂) = Var(X̄) = Var((1/n)∑X_i) = (1/n²)∑_{i=1}^n Var(X_i) = (1/n²)np(1−p) = p(1−p)/n = CRLB, we can conclude that p̂ = X̄ is a Uniform Minimum Variance Unbiased
Estimator (UMVUE) for the parameter p in X ~ BIN(1, p).
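A short simulation illustrates that the variance of X̄ matches the bound p(1−p)/n (a minimal sketch with arbitrary parameter choices):

import numpy as np

rng = np.random.default_rng(6)
p, n, reps = 0.3, 50, 100000
x = rng.binomial(1, p, size=(reps, n))
p_hat = x.mean(axis=1)
print(p_hat.var(), p * (1 - p) / n)            # empirical variance attains the CRLB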
Question #22: Let 𝑋1 , … , 𝑋𝑛 be a random sample from 𝑋~𝑁(πœ‡, 9). a) Find the Cramer-Rao
lower bound for the variances of unbiased estimators of πœ‡; b) is the Maximum Likelihood
Estimator πœ‡Μ‚ 𝑀𝐿𝐸 = 𝑋̅ a UMVUE for the parameter πœ‡?
a) We have τ(μ) = μ, so τ′(μ) = 1 and [τ′(μ)]² = 1. Next, since X ~ N(μ, 9) we know that the density is f_X(x) = [1/√(18π)] e^{−(x−μ)²/18}, so that we have f(X; μ) = [1/√(18π)] e^{−(X−μ)²/18} and ln f(X; μ) = −½ ln(18π) − (X−μ)²/18. We then differentiate twice to obtain ∂/∂μ ln f(X; μ) = (X−μ)/9 → ∂²/∂μ² ln f(X; μ) = −1/9. Since we have shown that E[(∂/∂μ ln f(X; μ))²] = −E[∂²/∂μ² ln f(X; μ)] and −E(−1/9) = 1/9, we can conclude that the Cramer-Rao Lower Bound is CRLB = 9/n. This then means that Var(T) ≥ 9/n for any unbiased estimator T of the parameter μ in X ~ N(μ, 9).
b) We first verify that E(μ̂_MLE) = E(X̄) = E((1/n)∑_{i=1}^n X_i) = (1/n)∑_{i=1}^n E(X_i) = (1/n)nμ = μ, so μ̂_MLE = X̄ is an unbiased estimator for μ. Then we compute Var(μ̂_MLE) = Var(X̄) = Var((1/n)∑X_i) = (1/n²)∑_{i=1}^n Var(X_i) = (1/n²)9n = 9/n = CRLB, so that μ̂_MLE = X̄ is a UMVUE for the parameter μ.
Question #23: Let 𝑋1 , … , 𝑋𝑛 be a random sample from 𝑋~𝑁(0, πœƒ). a) Is the Maximum
Likelihood Estimator (MLE) for πœƒ unbiased?; b) is the MLE also a UMVUE for πœƒ?
a) We first find θ̂_MLE by noting that since X ~ N(0, θ), then its density function is f_X(x) = [1/√(2πθ)] e^{−x²/(2θ)}, so the likelihood function is L(θ) = ∏_{i=1}^n f_X(x_i; θ) = ∏_{i=1}^n [1/√(2πθ)] e^{−x_i²/(2θ)} = (2πθ)^{−n/2} e^{−(1/2θ)∑x_i²} and then ln[L(θ)] = −(n/2) ln(2πθ) − (1/2θ)∑_{i=1}^n x_i². Next, we differentiate so that ∂/∂θ ln[L(θ)] = −n/(2θ) + (1/2θ²)∑_{i=1}^n x_i² = 0 → (1/2θ²)∑x_i² = n/(2θ) → ∑x_i² = nθ → θ = (1/n)∑_{i=1}^n x_i². Since the second derivative is negative, we have θ̂_MLE = (1/n)∑_{i=1}^n X_i². We verify unbiasedness by computing E(θ̂_MLE) = E((1/n)∑X_i²) = (1/n)∑_{i=1}^n E(X_i²) = (1/n)∑_{i=1}^n (θ + 0²) = (1/n)nθ = θ.
b) The estimator θ̂_MLE will be a UMVUE for θ if Var(θ̂_MLE) = CRLB. We therefore begin by computing the Cramer-Rao Lower Bound. First, we have τ(θ) = θ, so τ′(θ) = 1 and [τ′(θ)]² = 1. Next, since we previously found that f(X; θ) = [1/√(2πθ)] e^{−X²/(2θ)}, then we have ln f(X; θ) = −½ ln(2πθ) − X²/(2θ) so that ∂/∂θ ln f(X; θ) = −1/(2θ) + X²/(2θ²). We then find ∂²/∂θ² ln f(X; θ) = 1/(2θ²) − 2X²/(2θ³) = 1/(2θ²) − X²/θ³ and take the negative of its expected value to obtain −E[1/(2θ²) − X²/θ³] = −[1/(2θ²) − E(X²)/θ³] = (θ + 0²)/θ³ − 1/(2θ²) = 1/θ² − 1/(2θ²) = 1/(2θ²). This implies that CRLB = 2θ²/n. We must verify that the variance of our estimator is equal to this lower bound, so we compute Var(θ̂_MLE) = Var((1/n)∑_{i=1}^n X_i²) = (1/n²)∑_{i=1}^n Var(X_i²). In order to compute Var(X_i²), we use the formula Var(X_i²) = E(X_i⁴) − [E(X_i²)]² = ⋯ = 3θ² − θ² = 2θ² by finding the moments of X_i using the derivatives of the Moment Generating Function at t = 0. Then we have that Var(θ̂_MLE) = (1/n²)∑_{i=1}^n Var(X_i²) = (1/n²)n·2θ² = 2θ²/n = CRLB, which verifies that the Maximum Likelihood Estimator is a UMVUE for the parameter θ.
Question #31: Let πœƒΜ‚ and πœƒΜƒ be the MLE and MME estimators for the parameter πœƒ, where
𝑋1 , … , 𝑋𝑛 is a random sample of size 𝑛 from a Uniform distribution such that 𝑋𝑖 ~π‘ˆπ‘πΌπΉ(0, πœƒ).
Show that a) πœƒΜ‚ is MSE consistent, and b) πœƒΜƒ is MSE consistent.
a) We first derive the MLE θ̂ for θ. Since X ~ UNIF(0, θ), we know that the density function is f(x; θ) = 1/θ for x ∈ (0, θ). This allows us to construct the likelihood function L(θ) = ∏_{i=1}^n f(x_i; θ) = ∏_{i=1}^n θ^{−1} = θ^{−n} whenever x_{1:n} ≥ 0 and x_{n:n} ≤ θ, and zero otherwise. Then the log likelihood function is ln[L(θ)] = −n ln(θ) so that ∂/∂θ ln[L(θ)] = −n/θ < 0 for all n and θ. This means that L(θ) = θ^{−n} is a decreasing function of θ for x_{n:n} ≤ θ since its first derivative is always negative, so we can conclude that the MLE is the largest order statistic, θ̂ = X_{n:n}. Next, we show that this estimator is MSE consistent, which means verifying that lim_{n→∞} E[X_{n:n} − θ]² = 0. But then we can see that lim_{n→∞} E[X_{n:n} − θ]² = lim_{n→∞} E[X_{n:n}² − 2θX_{n:n} + θ²] = lim_{n→∞}[E(X_{n:n}²) − 2θE(X_{n:n}) + θ²]. In order to compute this limit, we must find the first and second moments of the largest order statistic. But we already know that f_n(y) = n f(y; θ)F(y; θ)^{n−1} = (n/θ)(y/θ)^{n−1} = ny^{n−1}/θ^n, so we can calculate E(X_{n:n}) = ∫_0^θ y f_n(y) dy = (n/θ^n)∫_0^θ y^n dy = (n/θ^n)[y^{n+1}/(n+1)]_0^θ = (n/θ^n)(θ^{n+1}/(n+1)) = nθ/(n+1) and E(X_{n:n}²) = ∫_0^θ y² f_n(y) dy = (n/θ^n)∫_0^θ y^{n+1} dy = (n/θ^n)[y^{n+2}/(n+2)]_0^θ = (n/θ^n)(θ^{n+2}/(n+2)) = nθ²/(n+2). Thus, we have lim_{n→∞}[E(X_{n:n}²) − 2θE(X_{n:n}) + θ²] = lim_{n→∞}[nθ²/(n+2) − 2nθ²/(n+1) + θ²] = θ² − 2θ² + θ² = 0. That this limit is zero verifies that the maximum likelihood estimator θ̂ = X_{n:n} is mean square error (MSE) consistent.
πœƒ
b) We first derive the MME πœƒΜƒ for πœƒ. Since 𝑋~π‘ˆπ‘πΌπΉ(0, πœƒ), we know that 𝐸(𝑋) = 2 so we
can equate πœ‡1′ = 𝑀1′ →
πœƒ
2
1
= 𝑛 ∑𝑛𝑖=1 𝑋𝑖 →
πœƒ
2
= 𝑋̅ → πœƒ = 2𝑋̅. This means that πœƒΜƒ = 2𝑋̅.
Next, we show that this estimator is MSE consistent, which means verifying that
lim 𝐸[2𝑋̅ − πœƒ]2 = 0. But we have lim 𝐸[2𝑋̅ − πœƒ]2 = lim 𝐸[4𝑋̅ 2 − 4πœƒπ‘‹Μ… + πœƒ 2 ] =
𝑛→∞
𝑛→∞
lim [4𝐸(𝑋̅ 2 ) − 4πœƒπΈ(𝑋̅) + πœƒ 2 ].
We
𝑛→∞
1
𝑛
1
1
𝑛→∞
therefore
πœƒ
1
πœƒ
𝐸(∑𝑛𝑖=1 𝑋𝑖 ) = 𝑛 ∑𝑛𝑖=1 𝐸(𝑋𝑖 ) = 𝑛 ∑𝑛𝑖=1 2 = 𝑛 𝑛 2 =
πœƒ 2
1
1
π‘‰π‘Žπ‘Ÿ (𝑛 ∑𝑛𝑖=1 𝑋𝑖 ) + (2) = 𝑛2 ∑𝑛𝑖=1 π‘‰π‘Žπ‘Ÿ(𝑋𝑖 ) +
πœƒ2
+
12𝑛
πœƒ2
4
lim [4
𝑛→∞
=
(3𝑛+1)πœƒ 2
12𝑛
(3𝑛+1)πœƒ2
12𝑛
πœƒ2
1
𝐸(𝑋̅) = 𝐸 (𝑛 ∑𝑛𝑖=1 𝑋𝑖 ) =
compute
πœƒ
2
and 𝐸(𝑋̅ 2 ) = π‘‰π‘Žπ‘Ÿ(𝑋̅) + 𝐸(𝑋̅)2 =
πœƒ2
1
= 𝑛2 ∑𝑛𝑖=1 12 +
4
πœƒ2
4
1
πœƒ2
= 𝑛2 𝑛 12 +
πœƒ2
4
=
. Thus, we can compute that lim [4𝐸(𝑋̅ 2 ) − 4πœƒπΈ(𝑋̅) + πœƒ 2 ] =
𝑛→∞
πœƒ
− 4πœƒ + πœƒ 2 ] = lim [
2
(3𝑛+1)πœƒ2
3𝑛
𝑛→∞
− 2πœƒ 2 + πœƒ 2 ] = πœƒ 2 − 2πœƒ 2 + πœƒ 2 = 0. That
this limit is zero verifies that the MME πœƒΜƒ = 2𝑋̅ is mean square error (MSE) consistent.
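Both forms of consistency can be seen numerically: the empirical mean squared errors of X_{n:n} and 2X̄ shrink as n grows (a minimal sketch with arbitrary θ and replication count):

import numpy as np

rng = np.random.default_rng(7)
theta, reps = 5.0, 20000
for n in (10, 100, 1000):
    x = rng.uniform(0, theta, size=(reps, n))
    mse_mle = ((x.max(axis=1) - theta)**2).mean()       # MSE of X_{n:n}
    mse_mme = ((2 * x.mean(axis=1) - theta)**2).mean()  # MSE of 2*X-bar
    print(n, mse_mle, mse_mme)                          # both shrink toward 0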
Question #29: Let 𝑋1 , … , 𝑋𝑛 be a random sample of size 𝑛 from a Bernoulli distribution such
that 𝑋𝑖 ~𝐡𝐼𝑁(1, 𝑝). For a Uniform prior density 𝑝~π‘ˆπ‘πΌπΉ(0,1) and a squared error loss
function 𝐿(𝑑; 𝑝) = (𝑑 − 𝑝)2 , a) find the posterior distribution of the unknown parameter 𝑝,
b) find the Bayes estimator of 𝑝, and c) find the Bayes risk for the Bayes estimator of 𝑝 above.
a) We have that the posterior density is given by f_{P|x}(p) = f(x_1, …, x_n; p)p(p) / ∫ f(x_1, …, x_n; p)p(p) dp, where f(x_1, …, x_n; p) = ∏_{i=1}^n f(x_i; p) = ∏_{i=1}^n p^{x_i}(1−p)^{1−x_i} = p^{∑x_i}(1−p)^{n−∑x_i} since the random variables are independent and identically distributed, and p(p) = 1 since the prior density is uniform. We then express ∫_0^1 p^{∑x_i}(1−p)^{n−∑x_i} dp in terms of the beta distribution. Recall that if Y ~ BETA(a, b), then its density is f(y; a, b) = [1/B(a, b)] y^{a−1}(1−y)^{b−1} where B(a, b) = Γ(a)Γ(b)/Γ(a+b). Next, we define a = ∑_{i=1}^n x_i and b = n − ∑_{i=1}^n x_i, so we can write ∫_0^1 p^{∑x_i}(1−p)^{n−∑x_i} dp = B(a+1, b+1) = B(∑x_i + 1, n − ∑x_i + 1). Thus, we have f_{P|x}(p) = p^{∑x_i}(1−p)^{n−∑x_i}/B(∑x_i + 1, n − ∑x_i + 1) = [1/B(a+1, b+1)] p^a(1−p)^b, which verifies that the random variable given by P|x ~ BETA(∑_{i=1}^n x_i + 1, n − ∑_{i=1}^n x_i + 1) ≡ BETA(a+1, b+1).
π‘Ž
b) For some random variable π‘Œ~𝐡𝐸𝑇𝐴(π‘Ž, 𝑏), we know that 𝐸(π‘Œ) = π‘Ž+𝑏. Moreover,
Theorem 9.5.2 states that when we have a squared error loss function, the Bayes
estimator is simply the expected value of the posterior distribution. This implies that
∑𝑛
𝑖=1 π‘₯𝑖 +1
the Bayes estimator of 𝑝 is given by 𝑝̂ 𝐡𝐸 = ∑𝑛
𝑛
𝑖=1 π‘₯𝑖 +1+𝑛−∑𝑖=1 π‘₯𝑖 +1
=
c) The risk function in this case is 𝑅𝑇 (𝑝) = 𝐸[(𝑇 − 𝑝)2 ], where 𝑇 =
∑𝑛
𝑖=1 π‘₯𝑖 +1
𝑛+2
.
∑𝑛
𝑖=1 π‘₯𝑖 +1
𝑛+2
is the Bayes
Estimator derived above. We would therefore substitute for 𝑇 in the risk function,
1
evaluate the expected value of that expression and then compute ∫0 𝐸[(𝑇 − 𝑝)2 ] 𝑑𝑝.
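The posterior-mean formula can be checked against a brute-force numerical integration of the unnormalized posterior (a minimal sketch; the grid resolution and data-generating values are arbitrary):

import numpy as np

rng = np.random.default_rng(8)
n, p_true = 30, 0.6
x = rng.binomial(1, p_true, size=n)
s = x.sum()

grid = np.linspace(0, 1, 20001)
post = grid**s * (1 - grid)**(n - s)          # unnormalized posterior p^s (1-p)^(n-s)
post_mean = np.trapz(grid * post, grid) / np.trapz(post, grid)
print(post_mean, (s + 1) / (n + 2))           # agree: mean of BETA(s+1, n-s+1)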
Question #34: Consider a random sample of size n from a distribution with discrete probability mass function f_X(x; p) = (1−p)^x p for x ∈ {0, 1, 2, …}. a) Find the MLE of the unknown parameter p. b) Find the MLE of θ = (1−p)/p. c) Find the CRLB for variances of all unbiased estimators of the parameter θ above. d) Is the MLE of θ = (1−p)/p a UMVUE? e) Is the MLE of θ = (1−p)/p also MSE consistent? f) Compute the asymptotic distribution of the MLE of θ = (1−p)/p. g) If we have the estimator θ̂ = [n/(n+1)]X̄, then find the risk functions of both θ̂ and X̄ using the loss function given by L(t; θ) = (t − θ)²/(θ² + θ).
a) We have L(p) = ∏_{i=1}^n f(x_i; p) = ∏_{i=1}^n (1−p)^{x_i} p = p^n(1−p)^{∑x_i}, so that ln[L(p)] = n ln(p) + ∑_{i=1}^n x_i ln(1−p). Then we have that ∂/∂p ln[L(p)] = n/p − (∑x_i)/(1−p).
Setting this equal to zero and solving for p gives the estimator p̂_MLE = 1/(1 + X̄).
b) By the Invariance Property, we have that the estimator is θ̂_MLE = (1 − p̂_MLE)/p̂_MLE = X̄.
c) Since θ = τ(p) = (1−p)/p = 1/p − 1, then τ′(p) = −1/p² and [τ′(p)]² = 1/p⁴. Then since f(X; p) = (1−p)^X p, we can compute ln f(X; p) = ln(p) + X ln(1−p) so that ∂/∂p ln f(X; p) = 1/p − X/(1−p) and ∂²/∂p² ln f(X; p) = −1/p² − X/(1−p)². We can then compute the negative of the expected value of this second derivative so that −E[∂²/∂p² ln f(X; p)] = 1/p² + E(X)/(1−p)² = 1/p² + (1−p)/[p(1−p)²] = [(1−p)² + p(1−p)]/[p²(1−p)²] = (1 − 2p + p² + p − p²)/[p²(1−p)²] = (1−p)/[p²(1−p)²] = 1/[p²(1−p)]. These results imply that CRLB = [1/p⁴]·p²(1−p)/n = (1−p)/(np²).
d) We first verify that E(θ̂_MLE) = E(X̄) = E((1/n)∑_{i=1}^n X_i) = (1/n)∑_{i=1}^n E(X_i) = (1/n)·n·(1−p)/p = (1−p)/p, so that the MLE is an unbiased estimator of θ = (1−p)/p. Next, we compute Var(θ̂_MLE) = Var(X̄) = Var((1/n)∑X_i) = (1/n²)∑_{i=1}^n Var(X_i) = (1/n²)·n·(1−p)/p² = (1−p)/(np²) = CRLB, which verifies that θ̂_MLE = X̄ is the UMVUE for the parameter θ = (1−p)/p.
e) To verify that θ̂_MLE = X̄ is MSE consistent, we must show that lim_{n→∞} E[X̄ − (1−p)/p]² = 0. But we can see that we have lim_{n→∞} E[X̄ − (1−p)/p]² = lim_{n→∞} E[X̄² − 2(1−p)X̄/p + (1−p)²/p²] = lim_{n→∞}[E(X̄²) − 2(1−p)E(X̄)/p + (1−p)²/p²], so we must compute the expectation of both the mean and the mean squared. However, we already know that E(X̄) = (1−p)/p = θ since θ̂_MLE is unbiased. Then E(X̄²) = Var(X̄) + E(X̄)² = (1/n²)∑_{i=1}^n Var(X_i) + (1−p)²/p² = (1/n²)·n·(1−p)/p² + (1−p)²/p² = (1−p)/(np²) + (1−p)²/p². Thus lim_{n→∞}[E(X̄²) − 2(1−p)E(X̄)/p + (1−p)²/p²] = lim_{n→∞}[(1−p)/(np²) + (1−p)²/p² − 2(1−p)²/p² + (1−p)²/p²] = 0. This shows that θ̂_MLE = X̄ is MSE consistent.
f) We use Definition 9.4.5, which states that for large values of n, the MLE estimator is distributed normal with mean θ = (1−p)/p and variance CRLB. Since we previously found that CRLB = (1−p)/(np²), we can conclude that θ̂_MLE ~ N((1−p)/p, (1−p)/(np²)).
g) Definition 9.5.2 states that the risk function is the expected loss R_T(θ) = E[L(T; θ)]. In this case, the loss function is L(t; θ) = (t − θ)²/(θ² + θ) = (t² − 2θt + θ²)/(θ² + θ). Therefore, for the estimator X̄ we compute R_{X̄}(θ) = E[(X̄² − 2θX̄ + θ²)/(θ² + θ)] = [1/(θ² + θ)][E(X̄²) − 2θE(X̄) + θ²] = [1/(θ² + θ)][Var(X̄) + θ² − 2θ² + θ²] = Var(X̄)/(θ² + θ) = [(1−p)/(np²)]/[(1−p)/p²] = 1/n. Similarly, for the estimator θ̂ = [n/(n+1)]X̄, using E(X̄) = θ and E(X̄²) = θ(θ+1)/n + θ², we can compute R_{θ̂}(θ) = [1/(θ² + θ)]{[n/(n+1)]²E(X̄²) − 2θ[n/(n+1)]E(X̄) + θ²} = [1/(θ² + θ)]·[nθ(θ+1) + n²θ² − 2n(n+1)θ² + (n+1)²θ²]/(n+1)² = [1/(θ² + θ)]·[nθ + (n+1)θ²]/(n+1)² = [n + (n+1)θ]/[(θ+1)(n+1)²].
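A Monte Carlo check of the two risk functions derived above (a minimal sketch; the values of p, n, and the replication count are arbitrary, and the sampled geometric values are shifted to the {0, 1, 2, …} support used in this question):

import numpy as np

rng = np.random.default_rng(9)
p, n, reps = 0.4, 25, 200000
theta = (1 - p) / p
x = rng.geometric(p, size=(reps, n)) - 1        # pmf (1-p)^x p on {0, 1, 2, ...}
xbar = x.mean(axis=1)
loss = lambda t: (t - theta)**2 / (theta**2 + theta)
print(loss(xbar).mean(), 1 / n)                                   # risk of X-bar
print(loss(n / (n + 1) * xbar).mean(),
      (n + (n + 1) * theta) / ((theta + 1) * (n + 1)**2))         # risk of theta-hat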
Question #36: Let 𝑋1 , … , 𝑋𝑛 be a random sample of size 𝑛 from a Normal distribution such
that each 𝑋𝑖 ~𝑁(0, πœƒ). Find the asymptotic distribution of the MLE of the parameter πœƒ.
• From the previous assignment, we know that θ̂_MLE = (1/n)∑_{i=1}^n X_i². We then use Definition 9.4.5, which states that for large values of n, the MLE estimator is distributed normal with mean θ and variance CRLB; that is, we have θ̂_MLE ~ N(θ, CRLB). This means that we must compute the Cramer-Rao Lower Bound. Since τ(θ) = θ, then τ′(θ) = 1 and [τ′(θ)]² = 1. Next, since we previously found that f(X; θ) = [1/√(2πθ)] e^{−X²/(2θ)}, then we have ln f(X; θ) = −½ ln(2πθ) − X²/(2θ) so that ∂/∂θ ln f(X; θ) = −1/(2θ) + X²/(2θ²). We then find ∂²/∂θ² ln f(X; θ) = 1/(2θ²) − 2X²/(2θ³) = 1/(2θ²) − X²/θ³ and take the negative of its expected value to obtain −E[1/(2θ²) − X²/θ³] = −[1/(2θ²) − E(X²)/θ³] = (θ + 0²)/θ³ − 1/(2θ²) = 1/θ² − 1/(2θ²) = 1/(2θ²). This implies that CRLB = 2θ²/n. Combining these facts reveals that the asymptotic distribution of the MLE is θ̂_MLE ~ N(θ, 2θ²/n). We can transform this to get a standard normal distribution by noting that the random variable (θ̂_MLE − θ)/(θ√(2/n)) ~ N(0, 1) for large values of n. We could further reduce this by multiplying through by the constant θ so that (θ̂_MLE − θ)/√(2/n) ~ N(0, θ²).
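The asymptotic normality claim can be illustrated by standardizing the simulated MLE (a minimal sketch with arbitrary θ, n, and replication count):

import numpy as np

rng = np.random.default_rng(10)
theta, n, reps = 3.0, 400, 50000
x = rng.normal(0, np.sqrt(theta), size=(reps, n))
theta_hat = (x**2).mean(axis=1)
z = (theta_hat - theta) / (theta * np.sqrt(2 / n))   # approximately N(0, 1)
print(z.mean(), z.var())                             # close to 0 and 1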
Chapter #10 – Sufficiency and Completeness
Question #6: Let 𝑋1 , … , 𝑋𝑛 be independent and each 𝑋𝑖 ~𝐡𝐼𝑁(π‘šπ‘– , 𝑝). Use the Factorization
Criterion to show that 𝑆 = ∑𝑛𝑖=1 𝑋𝑖 is sufficient for the unknown parameter 𝑝.
• Since each X_i ~ BIN(m_i, p), we know that the probability mass function is given by f(x; m_i, p) = (m_i choose x) p^x(1−p)^{m_i−x} 1{x = 0, …, m_i}. We can then construct their joint probability mass function, due to the fact that they are independent, as f(x_1, …, x_n; m_i, p) = ∏_{i=1}^n f(x_i; m_i, p) = ∏_{i=1}^n (m_i choose x_i) p^{x_i}(1−p)^{m_i−x_i} 1{x_i = 0, …, m_i} = [∏_{i=1}^n (m_i choose x_i)] p^{∑x_i}(1−p)^{∑m_i − ∑x_i} 1{x_i = 0, …, m_i}. If C = [∏_{i=1}^n (m_i choose x_i)] and q = 1 − p, then we have that C p^{∑x_i} q^{∑m_i − ∑x_i} 1{x_i = 0, …, m_i} = C (p/q)^{∑x_i} q^{∑m_i} 1{x_i = 0, …, m_i}. But then if we define s = ∑_{i=1}^n x_i, we have that f(x_1, …, x_n; m_i, p) = C (p/q)^s q^{∑m_i} 1{x_i = 0, …, m_i} = g(s; m_i, p)h(x_1, …, x_n). Since g(s; m_i, p) = (p/q)^s q^{∑m_i} does not depend on x_1, …, x_n except through s = ∑_{i=1}^n x_i and h(x_1, …, x_n) = C·1{x_i = 0, …, m_i} does not involve p, the Factorization Criterion guarantees that S = ∑_{i=1}^n X_i is sufficient for the unknown parameter p.
Question #7: Let X_1, …, X_n be independent and each X_i ~ NB(r_i, p). This means that each X_i has probability mass function P(X_i = x) = (x−1 choose r_i−1) p^{r_i}(1−p)^{x−r_i} for x = r_i, r_i + 1, r_i + 2, … Find a sufficient statistic for the unknown parameter p using the Factorization Criterion.
• As in the question above, we have that f(x_1, …, x_n; r_i, p) = ∏_{i=1}^n f(x_i; r_i, p) = ∏_{i=1}^n (x_i−1 choose r_i−1) p^{r_i}(1−p)^{x_i−r_i} 1{x_i = r_i, r_i+1, …}. After applying the product operator, this becomes [∏_{i=1}^n (x_i−1 choose r_i−1)] p^{∑r_i} q^{∑x_i − ∑r_i} 1{x_i = r_i, r_i+1, …}, where q = 1 − p. Then if we define C = [∏_{i=1}^n (x_i−1 choose r_i−1)], this expression becomes C (p/q)^{∑r_i} q^{∑x_i} 1{x_i = r_i, r_i+1, …}. Finally, if we let s = ∑_{i=1}^n x_i, we have that the joint mass function is f(x_1, …, x_n; r_i, p) = C (p/q)^{∑r_i} q^s 1{x_i = r_i, r_i+1, …} = g(s; r_i, p)h(x_1, …, x_n). Since g(s; r_i, p) = (p/q)^{∑r_i} q^s does not depend on x_1, …, x_n except through s = ∑_{i=1}^n x_i and h(x_1, …, x_n) = C·1{x_i = r_i, r_i+1, …} does not involve p, the Factorization Criterion guarantees that S = ∑_{i=1}^n X_i is sufficient for p.
Question #16: Let X_1, …, X_n be independent and each X_i ~ NB(r_i, p). This means that each X_i has mass function P(X_i = x) = (x−1 choose r_i−1) p^{r_i}(1−p)^{x−r_i} for x = r_i, r_i + 1, r_i + 2, … Find the Maximum Likelihood Estimator (MLE) of p by maximizing the likelihood of the sufficient statistic.
• In the previous question, we found that L(p) = f(x_1, …, x_n; r_i, p) = [∏_{i=1}^n (x_i−1 choose r_i−1)] p^{∑r_i}(1−p)^{∑x_i}(1−p)^{−∑r_i}. Taking the natural logarithm gives ln[L(p)] = ∑_{i=1}^n ln(x_i−1 choose r_i−1) + ∑_{i=1}^n r_i ln(p) + ∑_{i=1}^n x_i ln(1−p) − ∑_{i=1}^n r_i ln(1−p). Then differentiating the log likelihood function and equating to zero implies that ∂/∂p ln[L(p)] = (∑r_i)/p − (∑x_i)/(1−p) + (∑r_i)/(1−p) = 0 → (1−p)∑r_i − p∑x_i + p∑r_i = 0. Then we have ∑r_i − p∑r_i − p∑x_i + p∑r_i = 0 → ∑r_i − p∑x_i = 0. This implies that the Maximum Likelihood Estimator of p is p̂_MLE = ∑_{i=1}^n r_i / ∑_{i=1}^n x_i.
Question #12: Let X_1, …, X_n be independent and identically distributed from a two parameter exponential distribution EXP(θ, η) such that the probability density function is f(x; θ, η) = (1/θ) e^{−(x−η)/θ} 1{x > η}. Find jointly sufficient statistics for the parameters θ and η.
• Since the random variables are iid, their joint probability density function is thus given by f(x_1, …, x_n; θ, η) = ∏_{i=1}^n f(x_i; θ, η) = ∏_{i=1}^n θ^{−1} e^{−(x_i−η)/θ} 1{x_i > η} = θ^{−n} e^{(1/θ)(nη − ∑x_i)} ∏_{i=1}^n 1{x_i > η}. This then shows that S_1 = ∑_{i=1}^n X_i and S_2 = X_{1:n} are jointly sufficient for θ and η by the Factorization Criterion, with h(x_1, …, x_n) = 1 being independent of the unknown parameters θ and η and g(s_1, s_2; θ, η) = θ^{−n} e^{(1/θ)(nη − s_1)} 1{s_2 > η} depending on x_1, …, x_n only through S_1 and S_2.
Question #13: Let X_1, …, X_n be independent and identically distributed from a beta distribution BETA(θ_1, θ_2) such that the probability density function of each of these random variables is given by f(x; θ_1, θ_2) = [Γ(θ_1+θ_2)/(Γ(θ_1)Γ(θ_2))] x^{θ_1−1}(1−x)^{θ_2−1} whenever 0 < x < 1. Find jointly sufficient statistics for the unknown parameters θ_1 and θ_2.
• Since the random variables are iid, their joint density is given by f(x_1, …, x_n; θ_1, θ_2) = ∏_{i=1}^n f(x_i; θ_1, θ_2) 1{0 < x_i < 1} = ∏_{i=1}^n [Γ(θ_1+θ_2)/(Γ(θ_1)Γ(θ_2))] x_i^{θ_1−1}(1−x_i)^{θ_2−1} 1{0 < x_i < 1} = [Γ(θ_1+θ_2)/(Γ(θ_1)Γ(θ_2))]^n [x_1 ⋯ x_n]^{θ_1−1}[(1−x_1) ⋯ (1−x_n)]^{θ_2−1} ∏_{i=1}^n 1{0 < x_i < 1}. This then shows that S_1 = ∏_{i=1}^n X_i and S_2 = ∏_{i=1}^n (1−X_i) are jointly sufficient for θ_1 and θ_2 by the Factorization Criterion, with h(x_1, …, x_n) = ∏_{i=1}^n 1{0 < x_i < 1} being independent of the unknown parameters and g(s_1, s_2; θ_1, θ_2) = [Γ(θ_1+θ_2)/(Γ(θ_1)Γ(θ_2))]^n [s_1]^{θ_1−1}[s_2]^{θ_2−1} depending on the observations only through S_1 and S_2.
Question #18: Let 𝑋~𝑁(0, πœƒ) for πœƒ > 0. a) Show that 𝑋 2 is complete and sufficient for the
unknown parameter πœƒ, and b) show that 𝑁(0, πœƒ) is not a complete family.
a) Since X ~ N(0, θ), we know that f_X(x; θ) = [1/√(2πθ)] e^{−x²/(2θ)} for x ∈ ℝ. Therefore, by the Regular Exponential Class (REC) Theorem, X² is complete and sufficient for θ.
b) Since X ~ N(0, θ), we know that E(X) = 0 for all θ > 0. Therefore, completeness fails: u(X) = X is a function that is not identically zero yet satisfies E[u(X)] = 0 for every θ > 0, so the family N(0, θ) is not complete.
Question #21: If 𝑋1 , … , 𝑋𝑛 is a random sample from a Bernoulli distribution such that each
𝑋𝑖 ~𝐡𝐸𝑅𝑁(𝑝) ≡ 𝐡𝐼𝑁(1, 𝑝) where 𝑝 is the unknown parameter to be estimated, find the
UMVUE for a) 𝜏(𝑝) = π‘‰π‘Žπ‘Ÿ(𝑋) = 𝑝(1 − 𝑝), and b) 𝜏(𝑝) = 𝑝2.
a) We first verify that the Bernoulli distribution is a member of the Regular Exponential Class (REC) by noting that its density can be written as f(x; p) = p^x(1−p)^{1−x} = [p^x/(1−p)^x](1−p) = [p/(1−p)]^x (1−p) = exp{ln([p/(1−p)]^x (1−p))}. This equality implies f(x; p) = exp{x ln(p/(1−p)) + ln(1−p)} = (1−p) exp{x ln(p/(1−p))} = c(p) exp{t_1(x)q_1(p)}, so the Bernoulli distribution is a member of the REC by Definition 10.4.2. We then use Theorem 10.4.2, which guarantees the existence of sufficient statistics for distributions from the REC, to construct the sufficient statistic S_1 = ∑_{i=1}^n t_1(X_i) = ∑_{i=1}^n X_i. Next, we appeal to the Rao-Blackwell Theorem in justifying the use of S_1 (or any one-to-one function of it) in our search for a UMVUE for Var(X) = p(1−p). Our initial guess for an estimator is T = X̄(1−X̄), so we first compute that E(T) = E[X̄(1−X̄)] = E(X̄) − E(X̄²) = E((1/n)∑X_i) − [Var(X̄) + E(X̄)²] = (1/n)∑_{i=1}^n E(X_i) − (1/n²)∑_{i=1}^n Var(X_i) − E(X̄)². Thus, we have calculated that E(T) = (1/n)np − (1/n²)np(1−p) − [(1/n)np]² = p − p(1−p)/n − p² = p(1−p)(1 − 1/n) = p(1−p)(n−1)/n, which implies that T* = [n/(n−1)]T = [n/(n−1)][X̄(1−X̄)] will have expected value equal to Var(X) = p(1−p). The Lehmann-Scheffe Theorem finally guarantees that T* is a UMVUE for Var(X) = p(1−p), since it states that any unbiased estimator which is a function of complete sufficient statistics is a UMVUE.
b) We note that for the complete sufficient statistic S_1 = ∑_{i=1}^n X_i, we have E(S_1) = np and Var(S_1) = np(1−p) since S_1 ~ BIN(n, p), which is true because it is the sum of n independent Bernoulli random variables. This implies E(S_1²) = Var(S_1) + E(S_1)² = np(1−p) + (np)² = np(1−p) + n²p². By the Lehmann-Scheffe Theorem, we know that we must use some function of the complete sufficient statistic S_1 to construct a UMVUE for the unknown parameter p². We note that for T = S_1² − S_1, we have E(T) = E(S_1²) − E(S_1) = np(1−p) + n²p² − np = np − np² + n²p² − np = p²(n² − n). This implies that the statistic T* = T/(n² − n) = (S_1² − S_1)/(n² − n) = [(∑_{i=1}^n X_i)² − ∑_{i=1}^n X_i]/(n² − n) will have
expected value equal to p², so it is a UMVUE by the Lehmann-Scheffe Theorem.
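Both UMVUEs from this question can be verified to be unbiased by simulation (a minimal sketch with arbitrary parameter choices):

import numpy as np

rng = np.random.default_rng(11)
p, n, reps = 0.35, 40, 200000
x = rng.binomial(1, p, size=(reps, n))
s1 = x.sum(axis=1)
t_var = n / (n - 1) * x.mean(axis=1) * (1 - x.mean(axis=1))   # part a) estimator
t_p2 = s1 * (s1 - 1) / (n * (n - 1))                          # part b) estimator
print(t_var.mean(), p * (1 - p))      # unbiased for Var(X)
print(t_p2.mean(), p**2)              # unbiased for p^2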
Question #23: If 𝑋1 , … , 𝑋𝑛 is a random sample from a Normal distribution such that each
𝑋𝑖 ~𝑁(πœ‡, 9) where πœ‡ is unknown, find the UMVUE for a) the 95th percentile, and b) 𝑃(𝑋1 ≤ 𝑐),
where 𝑐 is a known constant. Hint: find the conditional distribution of 𝑋1 given 𝑋̅ = π‘₯, and
apply the Rao-Blackwell Theorem with 𝑇 = 𝑒(𝑋1 ), where define 𝑒(π‘₯) = 1{π‘₯ ≤ 𝑐}.
a) The 95th percentile of a random variable X from a N(μ, 9) distribution is the value of k such that P(X ≤ k) = 0.95 → P((X−μ)/3 ≤ (k−μ)/3) = 0.95 → P(Z ≤ (k−μ)/3) = 0.95 where Z ~ N(0, 1). From tabulations of the standard normal distribution function Φ(z), we know that P(Z ≤ 1.645) = 0.95, so we equate (k−μ)/3 = 1.645 → k = 4.935 + μ = τ(μ). This is what we wish to find a UMVUE for, but since the expectation of a constant is that constant itself, we simply need to find a UMVUE for μ. We begin by verifying that the Normal distribution is a member of the Regular Exponential Class (REC) by noting that the density of X ~ N(μ, 9) can be written as f(x; μ) = [1/√(18π)] exp{−(x−μ)²/18} = [1/√(18π)] exp{−x²/18 + μx/9 − μ²/18}, where we have t_1(x) = x² and t_2(x) = x. Thus, the Normal distribution is a member of the REC by Definition 10.4.2. We then use Theorem 10.4.2, which guarantees the existence of sufficient statistics for distributions from the REC, to construct the sufficient statistics S_1 = ∑_{i=1}^n t_1(X_i) = ∑_{i=1}^n X_i² and S_2 = ∑_{i=1}^n t_2(X_i) = ∑_{i=1}^n X_i. Since the sample mean is an unbiased estimator for the population mean, we have that E(T) = E(X̄) = E(S_2/n) = μ. Thus, an unbiased estimator for τ(μ) = k = 4.935 + μ is given by T* = 4.935 + X̄, which is also a UMVUE of τ(μ) by the Lehmann-Scheffe Theorem.
b) Note that we are trying to estimate P(X_1 ≤ c) = P((X_1−μ)/3 ≤ (c−μ)/3) = Φ((c−μ)/3) = τ(μ), where Φ: ℝ → (0, 1) is the cumulative distribution function of Z ~ N(0, 1). Since τ(μ) is a nonlinear function of μ, we cannot simply insert X̄ to obtain a UMVUE. To find an unbiased estimator, we note that u(X_1) = 1{X_1 ≤ c} is unbiased for τ(μ) since we have E[u(X_1)] = E[1{X_1 ≤ c}] = P(X_1 ≤ c) = τ(μ). But since it is not a function of the complete sufficient statistic S_2 = ∑_{i=1}^n X_i, this estimator cannot be a UMVUE. However, the Rao-Blackwell Theorem states that E[u(X_1)|S_2] = E[1{X_1 ≤ c}|S_2] will also be unbiased and will be a function of S_2 = ∑_{i=1}^n X_i. The Lehmann-Scheffe Theorem then guarantees that E[1{X_1 ≤ c}|S_2] will be a UMVUE. In order to find this, we must compute the conditional distribution of X_1 given S_2. We know that the random variable S_2 = ∑_{i=1}^n X_i ~ N(nμ, 9n) and that X_1 = x, S_2 = s is equivalent to X_1 = x, ∑_{i=2}^n X_i = s − x. This implies that f_{X_1|S_2}(x|s) = ⋯ = [1/(√(2π)σ′)] exp{−(x − μ′)²/(2(σ′)²)}, where μ′ = s/n and (σ′)² = 9(n−1)/n. Therefore, if we let A ~ N(s/n, 9(n−1)/n) we have that E[1{X_1 ≤ c}|S_2] = P(A ≤ c) = Φ((c − s/n)/(3√((n−1)/n))), which is a UMVUE for Φ((c−μ)/3) = τ(μ).
Question #25: If X_1, …, X_n is a random sample from the probability density function f(x; θ) = θx^{θ−1} 1{0 < x < 1} where θ > 0 is the unknown parameter, find the UMVUE for a) τ(θ) = 1/θ by using the fact that E[−ln(X)] = 1/θ, and b) the unknown parameter θ.
a) We first verify that the density is a member of the REC by noting that it can be written as f(x; θ) = θx^{θ−1} = exp{ln[θx^{θ−1}]} = exp{ln(θ) + (θ−1) ln(x)} = θ exp{(θ−1) ln(x)}, where t_1(x) = ln(x). We then use Theorem 10.4.2, which guarantees the existence of sufficient statistics for REC distributions, to construct the sufficient statistic S_1 = ∑_{i=1}^n t_1(X_i) = ∑_{i=1}^n ln(X_i). Next, we appeal to the Rao-Blackwell Theorem in justifying the use of S_1 (or any one-to-one function of it) in our search for a UMVUE for 1/θ. From the hint provided, we initially guess that T = −S_1/n = (1/n)∑_{i=1}^n [−ln(X_i)] and check that E(T) = (1/n)∑_{i=1}^n E[−ln(X_i)] = (1/n)·n·(1/θ) = 1/θ. The Lehmann-Scheffe Theorem finally guarantees that T = −(1/n)∑_{i=1}^n ln(X_i) is a UMVUE for 1/θ, since it states that any unbiased estimator which is a function of complete sufficient statistics is a UMVUE.
b) Any UMVUE of the unknown parameter θ must be a function of the complete and sufficient statistic S_1 = ∑_{i=1}^n ln(X_i) by the Lehmann-Scheffe Theorem. We begin by noting that E(S_1) = ∑_{i=1}^n E[ln(X_i)] = −∑_{i=1}^n E[−ln(X_i)] = −n/θ, so we would like to compute an expectation of the form E(1/S_1), noting that E(1/S_1) ≠ 1/E(S_1). We do this by finding the distribution of Y = −ln(X) using the CDF technique, which shows that Y ~ EXP(1/θ) with density f(y; θ) = θe^{−θy} 1{y > 0}. This is equivalent to Y ~ GAMMA(1/θ, 1), so by the Moment Generating Function technique, we see that W = −S_1 = ∑_{i=1}^n [−ln(X_i)] ~ GAMMA(1/θ, n). We can thus calculate E(1/W) = [θ^n/Γ(n)] ∫_0^∞ (1/w) w^{n−1} e^{−θw} dw = ⋯ = θ/(n−1), which implies that T = (n−1)/W = (n−1)/∑_{i=1}^n [−ln(X_i)] is an unbiased estimator of θ. Then the Lehmann-Scheffe Theorem
guarantees that it is also a UMVUE for the unknown parameter θ.
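Unbiasedness of the two estimators in parts a) and b) can be checked by simulation (a minimal sketch; numpy's power generator draws from the density θx^{θ−1} on (0, 1), and the parameter values are arbitrary):

import numpy as np

rng = np.random.default_rng(12)
theta, n, reps = 2.2, 30, 100000
x = rng.power(theta, size=(reps, n))       # density theta * x^(theta - 1) on (0, 1)
w = -np.log(x).sum(axis=1)                 # sum of -ln(X_i), GAMMA(1/theta, n)
print(((n - 1) / w).mean(), theta)         # part b) estimator, unbiased for theta
print((w / n).mean(), 1 / theta)           # part a) estimator, unbiased for 1/theta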
Question #31: If X_1, …, X_n is a random sample from the probability density function f(x; θ) = θ(1+x)^{−(1+θ)} 1{x > 0} for unknown θ > 0, find a) the MLE of θ, b) a complete and sufficient statistic for θ, c) the CRLB for τ(θ) = 1/θ, d) the UMVUE for τ(θ) = 1/θ, e) the mean and variance of the asymptotic normal distribution of the MLE, and f) the UMVUE for θ.
a) We have L(θ) = ∏_{i=1}^n f(x_i; θ) = ∏_{i=1}^n θ(1+x_i)^{−(1+θ)} = θ^n[∏_{i=1}^n (1+x_i)]^{−(1+θ)}, so that ln[L(θ)] = n ln(θ) − (1+θ)∑_{i=1}^n ln(1+x_i). Then we have that ∂/∂θ ln[L(θ)] = n/θ − ∑_{i=1}^n ln(1+x_i) = 0 → θ = n/∑_{i=1}^n ln(1+x_i), so that θ̂_MLE = n/∑_{i=1}^n ln(1+X_i).
b) To check that it is a member of the REC, we verify that we can write the probability
density function of 𝑋 as 𝑓(π‘₯; πœƒ) = πœƒ(1 + π‘₯)−(1+πœƒ) = exp{ln[πœƒ(1 + π‘₯)−(1+πœƒ) ]} =
exp{ln(πœƒ) − (1 + πœƒ) ln(1 + π‘₯)}, where 𝑑1 (π‘₯) = 1 and 𝑑2 (π‘₯) = ln⁑(1 + π‘₯). Thus, 𝑓(π‘₯; πœƒ)
is a member of the REC and 𝑆2 = ∑𝑛𝑖=1 𝑑2 (𝑋𝑖 ) = ∑𝑛𝑖=1 ln(1 + 𝑋𝑖 ) is a complete and
sufficient statistic for the unknown parameter πœƒ to be estimated.
c) Since τ(θ) = 1/θ, we have [τ′(θ)]² = [−1/θ²]² = 1/θ⁴. Then we have that f(X; θ) = θ(1+X)^{−(1+θ)}, so its log is ln f(X; θ) = ln(θ) − (1+θ) ln(1+X) and ∂/∂θ ln f(X; θ) = 1/θ − ln(1+X). Finally, ∂²/∂θ² ln f(X; θ) = −1/θ² so that −E[∂²/∂θ² ln f(X; θ)] = 1/θ². These results combined allow us to conclude that CRLB = (1/θ⁴)/[n(1/θ²)] = 1/(nθ²).
d) We previously verified that this density is a member of the REC and that the statistic S_2 = ∑_{i=1}^n ln(1+X_i) is complete and sufficient for θ. Next, we use the Rao-Blackwell Theorem in justifying the use of S_2 (or any one-to-one function of it) in our search for a UMVUE for 1/θ. In order to compute E(S_2), we need to find the distribution of the random variable Y = ln(1+X), which we do using the CDF technique. We thus have that F_Y(y) = P(Y ≤ y) = P(ln(1+X) ≤ y) = P(X ≤ e^y − 1) = F_X(e^y − 1), so that then f_Y(y) = d/dy F_X(e^y − 1) = e^y f_X(e^y − 1) = e^y[θ(1 + e^y − 1)^{−(1+θ)}] = θe^y(e^y)^{−(1+θ)} = θe^{y−y−θy} = θe^{−θy} whenever y > 0. It is immediately clear that Y ~ EXP(1/θ), so that E(Y) = E[ln(1+X)] = 1/θ. This allows us to find E(S_2) = E[∑_{i=1}^n ln(1+X_i)] = ∑_{i=1}^n E[ln(1+X_i)] = n/θ. Since we want an unbiased estimator for 1/θ, it is clear that T = S_2/n = (1/n)∑_{i=1}^n ln(1+X_i) will suffice by the Lehmann-Scheffe Theorem.
e) We previously found that the MLE for θ is θ̂_MLE = n/∑_{i=1}^n ln(1+X_i). From Chapter 9, we know that the MLE for some unknown parameter θ has an asymptotic normal distribution with μ = θ and σ² = CRLB; that is, θ̂_MLE ~ N(θ, CRLB) for large n. We must therefore find the Cramer-Rao Lower Bound, which can be easily done from the work in part c) above with τ(θ) = θ, so that CRLB = θ²/n. This means that we have θ̂_MLE ~ N(θ, θ²/n) for large n. We can similarly argue for the MLE of τ(θ) = 1/θ, where we see that θ̃_MLE = 1/θ̂_MLE = (1/n)∑_{i=1}^n ln(1+X_i) by the Invariance Property of the Maximum Likelihood Estimator. Then using the work done in part c) above for the Cramer-Rao Lower Bound, we can conclude that θ̃_MLE ~ N(1/θ, 1/(nθ²)) for large n.
f) We previously verified that this density is a member of the REC and that the statistic S_2 = ∑_{i=1}^n ln(1+X_i) is complete and sufficient for θ, where E(S_2) = n/θ. As in the previous question, we have that E(1/S_2) = E(1/∑_{i=1}^n ln(1+X_i)) = θ/(n−1), which implies that T = (n−1)/S_2 = (n−1)/∑_{i=1}^n ln(1+X_i) is unbiased and a UMVUE for the unknown parameter θ.
Chapter #11 – Interval Estimation
Question #5: If 𝑋1 , … , 𝑋𝑛 is a random sample from 𝑓𝑋 (π‘₯; πœ‚) = 𝑒 πœ‚−π‘₯ 1{π‘₯ > πœ‚} with πœ‚ unknown,
then a) show that 𝑄 = 𝑋1:𝑛 − πœ‚ is a pivotal quantity and find its distribution, and b) derive a
100𝛾% equal-tailed confidence interval for the unknown parameter πœ‚.
a) We first find the distribution of the smallest order statistic X_{1:n} using the formula f_1(y; η) = n f_X(y; η)[1 − F_X(y; η)]^{n−1}. We thus need the CDF of the population, which is given by F_X(x; η) = ∫_η^x f_X(t; η) dt = ∫_η^x e^{η−t} dt = e^η[−e^{−t}]_η^x = e^η(−e^{−x} + e^{−η}) = 1 − e^{η−x} whenever x > η. We therefore have that f_1(y; η) = ne^{η−y}[1 − (1 − e^{η−y})]^{n−1} = ne^{η−y}[e^{η−y}]^{n−1} = ne^{n(η−y)} when y > η. Now that we have the density of X_{1:n}, we can use the CDF technique to find the density of Q = X_{1:n} − η. Thus, we have F_Q(q) = P(Q ≤ q) = P(X_{1:n} − η ≤ q) = P(X_{1:n} ≤ q + η) = F_1(q + η), so f_Q(q) = d/dq F_1(q + η) = f_1(q + η) = ne^{−nq} whenever q + η > η → q > 0. This reveals that Q = X_{1:n} − η ~ EXP(1/n), so it is clearly a pivotal quantity since it is a function of η but its distribution does not depend on η.
b) We have P(x_{(1−γ)/2} < Q < x_{(1+γ)/2}) = γ → P(x_{(1−γ)/2} < X_{1:n} − η < x_{(1+γ)/2}) = γ, so after solving for the unknown parameter we obtain the 100γ% equal tailed confidence interval P(X_{1:n} − x_{(1+γ)/2} < η < X_{1:n} − x_{(1−γ)/2}) = γ. This can also be expressed as the random interval (X_{1:n} − x_{(1+γ)/2}, X_{1:n} − x_{(1−γ)/2}). Finally, we know that the EXP(1/n) distribution has CDF F_Q(q) = 1 − e^{−nq} so that F_Q(x_α) = α implies 1 − e^{−nx_α} = α. We solve this last equality for x_α = −(1/n) ln(1−α). This means that the confidence interval becomes (X_{1:n} + (1/n) ln((1−γ)/2), X_{1:n} + (1/n) ln((1+γ)/2)), where each term is found by substituting α = (1+γ)/2 and α = (1−γ)/2 into the expression x_α = −(1/n) ln(1−α).
2
Question #7: If 𝑋1 , … , 𝑋𝑛 is a random sample from 𝑓𝑋 (π‘₯; πœƒ) = πœƒ2 π‘₯𝑒 −π‘₯
unknown parameter πœƒ, a) show that 𝑄 =
2
2 ∑𝑛
𝑖=1 𝑋𝑖
πœƒ2
~πœ’ 2 (2𝑛), b) use 𝑄 =
2 /πœƒ 2
1{π‘₯ > 0} with
2
2 ∑𝑛
𝑖=1 𝑋𝑖
πœƒ2
to derive an
equal-tailed 100𝛾% confidence interval for πœƒ, c) find a lower 100𝛾% confidence limit for
𝑃(𝑋 > 𝑑) = 𝑒 −𝑑
2 /πœƒ 2
, d) find an upper 100𝛾% confidence limit for the π‘π‘‘β„Ž percentile.
2
a) Since 𝑓𝑋 (π‘₯; πœƒ) = πœƒ2 π‘₯𝑒 −π‘₯
2 /πœƒ 2
1{π‘₯ > 0}, we know that 𝑋~π‘ŠπΈπΌ(πœƒ, 2). The CDF technique
then reveals that 𝑋 2 ~𝐸𝑋𝑃(πœƒ 2 ) so that ∑𝑛𝑖=1 𝑋𝑖2 ~𝐺𝐴𝑀𝑀𝐴(πœƒ 2 , 𝑛). A final application of
the CDF technique shows that
also shows that 𝑄 =
2
2 ∑𝑛
𝑖=1 𝑋𝑖
πœƒ2
2
πœƒ2
∑𝑛𝑖=1 𝑋𝑖2 ~πœ’ 2 (2𝑛), proving the desired result. This
is a pivotal quantity for the unknown parameter πœƒ.
b) We find that the confidence interval is P(χ²_{(1−γ)/2}(2n) < Q < χ²_{(1+γ)/2}(2n)) = γ → P(χ²_{(1−γ)/2}(2n) < (2/θ²)∑_{i=1}^n X_i² < χ²_{(1+γ)/2}(2n)) = γ → P(2∑X_i²/χ²_{(1+γ)/2}(2n) < θ² < 2∑X_i²/χ²_{(1−γ)/2}(2n)) = γ. Taking square roots gives the desired random interval (√(2∑_{i=1}^n X_i²/χ²_{(1+γ)/2}(2n)), √(2∑_{i=1}^n X_i²/χ²_{(1−γ)/2}(2n))).
c) From the work done above, a lower confidence limit for θ is √(2∑_{i=1}^n X_i²/χ²_γ(2n)). Since the quantity τ(θ) = e^{−t²/θ²} is a monotonically increasing function of θ, we can simply substitute √(2∑_{i=1}^n X_i²/χ²_γ(2n)) for θ into the expression τ(θ) = e^{−t²/θ²} by Corollary 11.3.1.
d) We must solve the equation P(X > t_p) = 1 − p for t_p. From the question above, we are given that P(X > t) = e^{−t²/θ²}, so we must solve e^{−t_p²/θ²} = 1 − p for t_p, which gives t_p = θ√(−ln(1−p)). By the same reasoning as above, since t_p is increasing in θ, we substitute the upper confidence limit √(2∑_{i=1}^n X_i²/χ²_{1−γ}(2n)) in for θ into the expression t_p = τ(θ) = θ√(−ln(1−p)) to obtain t_p = √(−2∑_{i=1}^n X_i² ln(1−p)/χ²_{1−γ}(2n)).
Question #8: If 𝑋1 , … , 𝑋𝑛 is a random sample from 𝑋~π‘ˆπ‘πΌπΉ(0, πœƒ) with πœƒ > 0 unknown and
𝑋𝑛:𝑛 is the largest order statistic, then a) find the probability that the random interval given
by (𝑋𝑛:𝑛 , 2𝑋𝑛:𝑛 ) contains πœƒ, and b) find the value of the constant 𝑐 such that the random
interval (𝑋𝑛:𝑛 , 𝑐𝑋𝑛:𝑛 ) is a 100(1 − 𝛼)% confidence interval for the parameter πœƒ.
a) We have that θ ∈ (X_{n:n}, 2X_{n:n}) if and only if θ < 2X_{n:n}, since the inequality θ > X_{n:n} will always be true by the definition of the density. We must therefore compute P(2X_{n:n} > θ) = P(X_{n:n} > θ/2) = 1 − P(X_{n:n} ≤ θ/2) = 1 − [P(X_i ≤ θ/2)]^n = 1 − 2^{−n}.
b) As above, we have that P[θ ∈ (X_{n:n}, cX_{n:n})] = 1 − c^{−n}, so if we set this equal to 1 − α
and solve for the value of the constant, we obtain 1 − c^{−n} = 1 − α → c = α^{−1/n}.
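Both coverage statements can be checked by simulation (a minimal sketch with arbitrary θ, n, and α):

import numpy as np

rng = np.random.default_rng(14)
theta, n, alpha, reps = 7.0, 10, 0.05, 200000
x = rng.uniform(0, theta, size=(reps, n))
xmax = x.max(axis=1)
c = alpha**(-1 / n)
print(((xmax < theta) & (theta < 2 * xmax)).mean(), 1 - 2**(-n))   # part a)
print((theta < c * xmax).mean(), 1 - alpha)                        # part b)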
Question #13: Let X_1, …, X_n be a random sample from X ~ GAMMA(θ, κ) such that their common density is f(x; θ, κ) = [1/(θ^κ Γ(κ))] x^{κ−1} e^{−x/θ} 1{x > 0} with the parameter κ known but θ unknown. Derive a 100(1−α)% equal-tailed confidence interval for θ based on the sufficient statistic for the unknown parameter θ.
• We begin by noting that the given density is a member of the Regular Exponential Class (REC) since f(x; θ, κ) = θ^{−κ}Γ(κ)^{−1} x^{κ−1} e^{−x/θ} = c(θ)h(x) exp{q_1(θ)t_1(x)} where t_1(x) = x. Then we know that S = ∑_{i=1}^n t_1(X_i) = ∑_{i=1}^n X_i is complete sufficient for the unknown parameter θ. Next, we need to create a pivotal quantity from S; from the distribution in question 7, which is similar, we guess that Q = (2/θ)S = (2/θ)∑_{i=1}^n X_i might be appropriate. We now derive the distribution of Q and, by showing that it is simultaneously a function of θ but its density does not depend on θ, will verify that it is a pivotal quantity. Since the X_i ~ GAMMA(θ, κ), we know that the random variable A = ∑_{i=1}^n X_i ~ GAMMA(θ, nκ). Then F_Q(q) = P(Q ≤ q) = P((2/θ)∑X_i ≤ q) = P(A ≤ qθ/2) = F_A(qθ/2), so that f_Q(q) = d/dq F_A(qθ/2) = f_A(qθ/2)·(θ/2) = [1/(θ^{nκ}Γ(nκ))](qθ/2)^{nκ−1} e^{−(qθ/2)/θ}·(θ/2) = ⋯ = [1/(2^{nκ}Γ(nκ))] q^{nκ−1} e^{−q/2}, which shows that the transformed random variable Q = (2/θ)∑_{i=1}^n X_i ~ GAMMA(2, nκ) ≡ χ²(2nκ). This allows us to compute P[χ²_{α/2}(2nκ) < Q < χ²_{1−α/2}(2nκ)] = 1 − α, so after substituting in for Q and solving for θ, we have P[2∑_{i=1}^n X_i/χ²_{1−α/2}(2nκ) < θ < 2∑_{i=1}^n X_i/χ²_{α/2}(2nκ)] = 1 − α, which is the desired 100(1−α)% equal-tailed confidence interval for θ based on the sufficient statistic
S = ∑_{i=1}^n X_i for the unknown parameter θ.
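Finally, the coverage of this interval can be confirmed by simulation (a minimal sketch using scipy's chi-square quantile function; θ, κ, n, and α below are arbitrary choices):

import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(15)
theta, kappa, n, alpha, reps = 2.0, 3.0, 12, 0.10, 50000
x = rng.gamma(shape=kappa, scale=theta, size=(reps, n))
s = x.sum(axis=1)
lo = 2 * s / chi2.ppf(1 - alpha / 2, df=2 * n * kappa)
hi = 2 * s / chi2.ppf(alpha / 2, df=2 * n * kappa)
print(((lo < theta) & (theta < hi)).mean())        # close to 0.90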