Stat 330 (Spring 2015): Homework 11
Due: April 24, 2015

Show all of your work, and please staple your assignment if you use more than one sheet. Write your name, the course number and the section on every sheet. Show all work to earn partial credit. Problems marked with * will be graded, and one additional randomly chosen problem will be graded.

1. A sample of 3 observations of waiting time to access an internet server is $x_1 = 0.4$, $x_2 = 0.7$, $x_3 = 0.9$ seconds. It is believed that the waiting time has the continuous distribution

\[ f(t) = \begin{cases} \theta t^{\theta-1}, & 0 < t < 1 \\ 0, & \text{otherwise} \end{cases} \]

(a) Find the maximum likelihood estimate of $\theta$.

Answer: The likelihood and log-likelihood are

\[ L(\theta) = \prod_{i=1}^{3} \theta t_i^{\theta-1} = \theta^3 \Big( \prod_{i=1}^{3} t_i \Big)^{\theta-1} \]

\[ \ell(\theta) = 3 \log\theta + (\theta - 1) \sum_{i=1}^{3} \log t_i \]

Differentiating and setting the result equal to zero,

\[ \frac{\partial \ell(\theta)}{\partial\theta} = \frac{3}{\theta} + \sum_{i=1}^{3} \log t_i \overset{\text{set}}{=} 0 \]

\[ \hat\theta = \frac{-3}{\sum_{i=1}^{3} \log t_i} = \frac{-3}{-1.378326} = 2.17655 \]

2. (Baron's book) 9.1 (Find MLE only and omit part (c) in this homework set)

Answer: Let $X$ be a random variable with $P(X = 3) = \theta$ and $P(X = 7) = 1 - \theta$, where $\theta \in (0, 1)$. Then $E[X] = 3\theta + 7(1 - \theta) = 7 - 4\theta$.

(a) To find the maximum likelihood estimator,

\[ L(\theta) = \theta^5 (1 - \theta)^3 \]

\[ \ell(\theta) = \log L(\theta) = 5 \log\theta + 3 \log(1 - \theta) \]

\[ \frac{\partial}{\partial\theta} \ell(\theta) = \frac{5}{\theta} - \frac{3}{1 - \theta} \overset{\text{set}}{=} 0 \]

\[ \hat\theta_{MLE} = 5/8 \]

(b) Note that if $Y \sim \mathrm{Ber}(\theta)$ then $X = 7 - 4Y$, and since $\mathrm{Var}[Y] = \theta(1 - \theta)$, we have $\mathrm{Var}[X] = (-4)^2 \mathrm{Var}[Y] = 16\theta(1 - \theta)$. Now,

\[ \mathrm{Var}[\hat\theta_{MoM}] = \mathrm{Var}\left[ \frac{7 - \bar{X}}{4} \right] = \frac{1}{16} \mathrm{Var}[\bar{X}] = \frac{1}{16} \cdot \frac{16\theta(1 - \theta)}{n} = \frac{\theta(1 - \theta)}{n} \]

which we estimate by $\hat\theta(1 - \hat\theta)/n$. The standard error is

\[ \widehat{se}(\hat\theta) = \sqrt{\frac{\hat\theta(1 - \hat\theta)}{n}} = \sqrt{\frac{0.625(1 - 0.625)}{8}} \approx 0.17 \]

A similar argument shows the standard error is the same for the MLE.

3. * (Baron's book) 9.3 (only find maximum likelihood estimator in each case)

Answer:

(a) $\mathrm{Unif}(a, b)$. The joint density is

\[ f(x) = \begin{cases} \left( \dfrac{1}{b - a} \right)^n, & a \le x_1, \ldots, x_n \le b \\ 0, & \text{otherwise} \end{cases} \]

This function is monotonically increasing in $a$ and decreasing in $b$. It is maximized at the largest value for $a$ and the smallest value for $b$ such that the density is not zero.
These are $\hat{a} = \min(x_i)$ and $\hat{b} = \max(x_i)$.

(b) $\mathrm{Exp}(\lambda)$. The likelihood is

\[ f(x) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} \]

\[ \log f(x) = n \log\lambda - \lambda n \bar{x} \]

\[ \frac{\partial}{\partial\lambda} \log f(x) = \frac{n}{\lambda} - n\bar{x} \overset{\text{set}}{=} 0 \]

\[ \hat\lambda = 1/\bar{x} \]

(c) $N(\mu, \sigma^2)$, $\sigma$ known. The likelihood is

\[ f(x) = \prod_{i=1}^{n} (2\pi\sigma^2)^{-1/2} \exp\left( -\frac{1}{2\sigma^2} (x_i - \mu)^2 \right) \]

\[ \log f(x) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \]

\[ \frac{\partial}{\partial\mu} \log f(x) = \frac{1}{2\sigma^2} \sum_{i=1}^{n} 2(x_i - \mu) = \frac{1}{\sigma^2} (n\bar{x} - n\mu) \overset{\text{set}}{=} 0 \]

\[ \hat\mu = \bar{x} \]

(d) $N(\mu, \sigma^2)$, $\mu$ known. The likelihood is

\[ f(x) = \prod_{i=1}^{n} (2\pi\sigma^2)^{-1/2} \exp\left( -\frac{1}{2\sigma^2} (x_i - \mu)^2 \right) \]

\[ \log f(x) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \]

\[ \frac{\partial}{\partial\sigma^2} \log f(x) = -\frac{n}{2\sigma^2} + \frac{1}{2(\sigma^2)^2} \sum_{i=1}^{n} (x_i - \mu)^2 \overset{\text{set}}{=} 0 \]

\[ \hat\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 \]

(e) $N(\mu, \sigma^2)$. Notice that the MLE from part (c) did not depend on $\sigma^2$, so $\hat\mu = \bar{x}$. Now, to find the MLE for $\sigma^2$ we plug in $\mu = \bar{x}$ in part (d) and find $\hat\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$.

4. * There is concern about the speed of automobiles traveling over a particular stretch of highway. For a random sample of thirty automobiles, radar indicated the following speeds, in miles per hour:

82 88 64 78 90 57 74 70 81 60
75 78 85 77 78 65 73 79 73 66
71 70 61 77 66 69 72 67 64 74

Let the mean speed of all automobiles traveling over this stretch of highway be $\mu$ mph.

(a) Find the sample mean and variance, $\bar{x}$ and $s^2$.

Answer: Using the data provided we have

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 72.8, \qquad s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = 66.16552 \]

(you can use R or other software like Matlab to get these too).

(b) Find a 95% confidence interval for the mean speed of all automobiles traveling over this stretch of highway.

Answer: We have a large sample ($n = 30$), so we may assume the sample mean approximately follows a normal distribution and use $z$ (the standard normal) for the quantiles. The 95% confidence interval for the mean speed of all automobiles traveling over this stretch of highway is

\[ \left( \bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}}, \; \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}} \right) \]

Here $1 - \alpha = 0.95$, $z_{\alpha/2} = \Phi^{-1}(0.975) = 1.96$ and $s = \sqrt{s^2} = 8.134219$.
Hence the margin of error is $1.96 \times 8.134219/\sqrt{30} = 2.91$, and the 95% confidence interval for $\mu$ is given by $72.8 \pm 2.91$ or $(69.89, 75.71)$.

(c) Test the hypothesis that people are speeding, if the legal speed on this highway is 65 mph. That is, test $H_0: \mu = 65$ vs. $H_1: \mu > 65$.

Answer: We need to test $H_0: \mu = 65$ vs. $H_1: \mu > 65$, where the test statistic is given as

\[ z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{72.8 - 65}{8.134219/\sqrt{30}} = 5.252 \]

The p-value is $P(Z > 5.252) = 1 - P(Z \le 5.252) \approx 0$. Since the p-value is extremely small, we reject the null hypothesis and conclude that the mean speed of all the automobiles is greater than 65. Also, since 65 is not contained in the confidence interval from part (b), we can conclude that the mean speed of the cars is not equal to 65.

5. A manager evaluates effectiveness of a major hardware upgrade by running a certain process 50 times before the upgrade and 50 times after it. Based on these data, the average running time is 8.5 minutes before the upgrade, 6.2 minutes after it. Historically, the standard deviation $\sigma$ of the running times of this process has been 1.8 minutes and presumably it has not changed.

(a) Construct a 90% confidence interval for the difference in the mean running times $\mu_{Before} - \mu_{After}$.

Answer: The 90% confidence interval for $\mu_{Before} - \mu_{After}$, the difference in the mean running times, is

\[ \left( (\bar{x}_B - \bar{x}_A) - z_{\alpha/2} \cdot \sigma \sqrt{\frac{2}{n}}, \; (\bar{x}_B - \bar{x}_A) + z_{\alpha/2} \cdot \sigma \sqrt{\frac{2}{n}} \right) = 2.3 \pm 1.65 \times 1.8 \times \sqrt{2/50} = (1.706, 2.894) \]

where $\bar{x}_B$ and $\bar{x}_A$ are the average running times before and after the upgrade respectively, so that $\bar{x}_B - \bar{x}_A = 8.5 - 6.2 = 2.3$, $\sigma = 1.8$ and $z_{\alpha/2} = \Phi^{-1}(0.95) = 1.65$ for $1 - \alpha = 0.9$.

(b) Using this interval, can you conclude that the upgrade was effective? Why?

Answer: Using the interval we can conclude that the upgrade is effective since the running time has decreased after the upgrade.
As the confidence interval contains only positive values, we can conclude that the mean running time before the upgrade is greater than the mean running time after the upgrade, and hence the upgrade was effective.

(c) Test the hypothesis that the hardware upgrade improved the average running time of this process.

Answer: We need to test $H_0: \mu_A - \mu_B = 0$ vs. $H_1: \mu_A - \mu_B < 0$, where the test statistic is given as

\[ z = \frac{\bar{x}_A - \bar{x}_B}{\sigma \sqrt{2/n}} = \frac{-2.3}{0.36} = -6.39 \]

The p-value is $P(Z < -6.39) \approx 0$. Since the p-value is extremely small, we reject the null hypothesis and conclude that the hardware upgrade improved the average running time of this process.

6. Given two samples with $n_1 = 30$, $n_2 = 39$, $\bar{x}_1 = 4.2$, $\bar{x}_2 = 3.4$, $s_1^2 = 49$, and $s_2^2 = 32$, where $\bar{x}_1$ is the mean of a sample of size $n_1$ from the first population (mean $\mu_1$) and has sample variance $s_1^2$, and $\bar{x}_2$ is the mean of a sample of size $n_2$ from the second population (mean $\mu_2$) with sample variance $s_2^2$.

(a) Find a 95% confidence interval for $\mu_1$.

(b) Find a 90% confidence interval for $\mu_1 - \mu_2$.

(c) Suppose $n_1$ is 10 instead of 30. Re-establish the confidence interval in part (a) by assuming the first population follows a normal distribution.

Answer:

(a) Using the large-sample C.I. for a population mean, the interval endpoints are

\[ \bar{x}_1 \pm z_{\alpha/2} \frac{s_1}{\sqrt{n_1}}. \]

Since $z_{0.025} = 1.96$ and $s_1 = \sqrt{49} = 7$, the C.I. is $(1.70, 6.70)$.

(b) We utilize the large-sample C.I. for $\mu_1 - \mu_2$. The standard error of our estimator $\bar{X}_1 - \bar{X}_2$ is

\[ \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{49}{30} + \frac{32}{39}} = 1.566. \]

For a 90% confidence interval, we need the 0.95 quantile of $Z$: $z_{0.05} = 1.645$. Putting all this together gives the interval endpoints $\bar{x}_1 - \bar{x}_2 \pm 1.645 \cdot 1.566$, or $(0.8 - 2.58, 0.8 + 2.58)$, which yields the 90% C.I. for $\mu_1 - \mu_2$ of $(-1.78, 3.38)$.

(c) Using the small-sample (with unknown $\sigma$) C.I. for a population mean, the interval endpoints are

\[ \bar{x}_1 \pm t_{n_1 - 1, \alpha/2} \frac{s_1}{\sqrt{n_1}}. \]

Since $t_{9, 0.025} = 2.262$, the C.I. is $(-0.81, 9.21)$.

7.
(Baron's book) 9.7

Answer: Or via the p-value approach: The p-value is $P(Z > 2.93) = 0.0017 < \alpha = 0.01$. Thus, we reject $H_0$.
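The numerical answers to Problems 1, 4, and 7 can be checked with a short script. The following is a minimal sketch in Python (the assignment itself suggests R or Matlab), using only the standard library and assuming Python 3.8 or later for `statistics.NormalDist`. The function names and the `SPEEDS` constant are illustrative and not part of the assignment.

```python
import math
from statistics import NormalDist, mean, variance

# Radar speeds (mph) from Problem 4, in the order given in the problem.
SPEEDS = [
    82, 88, 64, 78, 90, 57, 74, 70, 81, 60,
    75, 78, 85, 77, 78, 65, 73, 79, 73, 66,
    71, 70, 61, 77, 66, 69, 72, 67, 64, 74,
]


def mle_theta(times):
    """Problem 1: closed-form MLE -n / sum(log t_i) for f(t) = theta * t**(theta - 1)."""
    return -len(times) / sum(math.log(t) for t in times)


def standard_error(data):
    """Estimated standard error of the sample mean, s / sqrt(n)."""
    return math.sqrt(variance(data) / len(data))


def z_interval(data, level=0.95):
    """Problem 4(b): large-sample z confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * standard_error(data)
    return mean(data) - margin, mean(data) + margin


def z_statistic(data, mu0):
    """Problem 4(c): z statistic for H0: mu = mu0."""
    return (mean(data) - mu0) / standard_error(data)


def upper_tail_p(z):
    """One-sided p-value P(Z > z) under the standard normal."""
    return 1 - NormalDist().cdf(z)


if __name__ == "__main__":
    print("Problem 1 theta-hat:", mle_theta([0.4, 0.7, 0.9]))
    print("Problem 4 mean, variance:", mean(SPEEDS), variance(SPEEDS))
    print("Problem 4 95% CI:", z_interval(SPEEDS, 0.95))
    print("Problem 4 z statistic:", z_statistic(SPEEDS, 65))
    print("Problem 7 p-value:", upper_tail_p(2.93))
```

Note that `z_interval` uses the exact normal quantile rather than the rounded table value 1.96, so its endpoints may differ from the hand calculation in the later decimal places.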