C22.0015 / B90.3302 NOTES for Wednesday 2011.APR.06

Here’s a fun maximum likelihood estimation problem. Suppose that $X_1, X_2, \ldots, X_n$ is a sample from the density $f(x \mid \theta) = \frac{1}{2} e^{-|x - \theta|}$. What’s the maximum likelihood estimate?

The likelihood is

$$L = \frac{1}{2^n} \, e^{-\sum_{i=1}^{n} |x_i - \theta|}$$

The log likelihood is

$$\log L = -n \log 2 - \sum_{i=1}^{n} |x_i - \theta|$$

Can we ever succeed in differentiating this? Maybe, but perhaps we don’t need to. Let’s suppose that the $x_i$’s have been sorted into increasing order $x_1 < x_2 < \cdots < x_n$ and that there are no ties.

The function $\sum_{i=1}^{n} |x_i - \theta|$ is a polygonal line in $\theta$. If $x_j < \theta < x_{j+1}$, then

$$\sum_{i=1}^{n} |x_i - \theta| = \sum_{i=1}^{j} (\theta - x_i) + \sum_{i=j+1}^{n} (x_i - \theta) = j\theta - \sum_{i=1}^{j} x_i + \sum_{i=j+1}^{n} x_i - (n - j)\theta$$

The derivative $\frac{d}{d\theta}$ is $j - (n - j) = 2j - n$. If $n$ is even, this can be made zero by using $j = \frac{n}{2}$. This would correspond to using $\theta$ in the interval which defines the median value. If $n$ is odd, you can minimize this by selecting $\theta$ as the median, since the slope $2j - n$ changes from negative to positive there.

We have a detailed handout on confidence intervals. The information presented next gives numeric examples related to that handout.

We need to talk about the notion of confidence intervals. Please see the handout on this topic.

Example 1 from this handout, with actual numbers. (It helps to lay these out on a number line.) Suppose that you have a sample of 28 values from a normal population with mean $\mu$ and standard deviation $\sigma$. The sample has an average of 2,840 and a standard deviation of 418. The 95% confidence interval for $\mu$ is $\bar{x} \pm t_{0.025;\,27} \frac{s}{\sqrt{n}}$. This is $2{,}840 \pm 2.0518 \cdot \frac{418}{\sqrt{28}}$, or about $2{,}840 \pm 162.08$. This interval is (2,677.92 , 3,002.08).

The 99% confidence interval uses $t_{0.005;\,27} = 2.7707$. It is $2{,}840 \pm 2.7707 \cdot \frac{418}{\sqrt{28}}$, which is $2{,}840 \pm 218.87$. It’s quite a bit longer, but that’s the cost of greater confidence.

The one-sided 95% interval to infinity starts at $\bar{x} - t_{0.05;\,27} \frac{s}{\sqrt{n}}$. This is $2{,}840 - 1.7032 \cdot \frac{418}{\sqrt{28}} = 2{,}840 - 134.54 = 2{,}705.46$. The interval is (2,705.46 , ∞). Observe that it starts below the value for $\bar{x}$.

The one-sided 95% interval from -∞ ends at $\bar{x} + t_{0.05;\,27} \frac{s}{\sqrt{n}}$. This is $2{,}840 + 1.7032 \cdot \frac{418}{\sqrt{28}} = 2{,}840 + 134.54 = 2{,}974.54$.
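The four intervals of Example 1 can be reproduced with a short Python sketch. It is only an arithmetic check: the t critical values are copied from the notes rather than computed, and the variable and function names are illustrative, not part of the handout.

```python
import math

# Summary statistics from Example 1 of the notes.
n = 28
xbar = 2840.0
s = 418.0

# t critical values for 27 degrees of freedom, copied from the notes
# (they are not computed here).
T_025 = 2.0518  # t_{0.025; 27}, two-sided 95%
T_005 = 2.7707  # t_{0.005; 27}, two-sided 99%
T_05 = 1.7032   # t_{0.05; 27}, one-sided 95%


def margin(t_crit):
    """Return t * s / sqrt(n), the amount added to or subtracted from xbar."""
    return t_crit * s / math.sqrt(n)


two_sided_95 = (xbar - margin(T_025), xbar + margin(T_025))
two_sided_99 = (xbar - margin(T_005), xbar + margin(T_005))
lower_bound_95 = xbar - margin(T_05)  # interval (lower_bound_95, infinity)
upper_bound_95 = xbar + margin(T_05)  # interval (-infinity, upper_bound_95)

print(two_sided_95, two_sided_99, lower_bound_95, upper_bound_95)
```

The one-sided bounds sit closer to the sample average than the two-sided endpoints because the whole 5% error allowance is spent on one side.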
The interval is (-∞ , 2,974.54). This interval ends above the value for $\bar{x}$.

You get to pick your confidence interval strategy once. You don’t get to play around with lots of possibilities, as we’ve done here for illustrative purposes.

Example 2 from this handout, with actual numbers. Suppose that, with the data above, you wanted a 95% confidence interval for $\sigma^2$, and you wanted this interval one-sided from zero to an upper bound. Writing the chi-square percent points in the style of Example 1 (upper-tail probability first, then degrees of freedom), this upper bound is

$$\frac{(n-1)s^2}{\chi^2_{1-\alpha;\,n-1}} = \frac{27 \times 418^2}{16.1514} \approx 292{,}082.9154$$

We are 95% confident that $\sigma^2$ is at most 292,082.9154. A simple square root says that we’re 95% confident that $\sigma$ is at most 540.45. Note that this upper limit is above the data value s = 418.

The one-sided interval to infinity has lower limit

$$\frac{(n-1)s^2}{\chi^2_{\alpha;\,n-1}} = \frac{27 \times 418^2}{40.1133} \approx 117{,}605.5822$$

We’re 95% confident that $\sigma^2$ exceeds this. We’re 95% confident that $\sigma$ exceeds 342.94. This lower limit is below the data value s = 418.

The two-sided 95% interval would probably be done as

$$\left( \frac{(n-1)s^2}{\chi^2_{\alpha/2;\,n-1}} \, , \; \frac{(n-1)s^2}{\chi^2_{1-\alpha/2;\,n-1}} \right)$$

This interval is $\left( \frac{27 \times 418^2}{43.1945} \, , \; \frac{27 \times 418^2}{14.5734} \right)$. This is (109,216.4049 , 323,709.4981). By taking square roots, we get the interval for $\sigma$ as (330.48, 568.95). This is not going to be the shortest interval.

Example 3 from this handout with actual numbers. Suppose that you have two samples of values with these summaries:

    Sample    Size    Average    Standard deviation
    A         18      310        47
    B         22      355        51

We want the 95% confidence interval for the difference between population means. Begin by finding the pooled standard deviation. Its value is $s_p = \sqrt{\frac{17 \times 47^2 + 21 \times 51^2}{38}} \approx 49.2507$. The confidence interval is

$$\bar{X} - \bar{Y} \pm t_{\alpha/2;\,m+n-2} \; s_p \sqrt{\frac{m+n}{mn}}$$

and this works out to (-76.7, -13.3). Note that this excludes zero, corresponding to a significant difference between the sample means. The other interesting thing here is $t_{\alpha/2;\,m+n-2} = t_{0.025;\,38} = 2.0244$.

Example 4 from this handout with actual numbers. Using the values above, let’s say you wanted a 95% one-sided interval, zero to an upper limit, for $\frac{\sigma_A^2}{\sigma_B^2}$.
Let’s agree that m = 18 (sample size for $s_A$) and n = 22 (sample size for $s_B$). This upper limit is $\frac{s_A^2}{s_B^2} F_{0.05;\,n-1,\,m-1}$, which here uses $F_{0.05;\,n-1,\,m-1} = F_{0.05;\,21,\,17} = 2.2189$, the point of the F distribution with 21 and 17 degrees of freedom that has upper-tail probability 0.05. The upper limit is $\left(\frac{47}{51}\right)^2 \times 2.2189 \approx 1.8845$.

Now . . . suppose that you had phrased this as getting a lower limit for $\frac{\sigma_B^2}{\sigma_A^2}$. This lower limit is $\frac{s_B^2}{s_A^2} F_{0.95;\,m-1,\,n-1}$, here using $F_{0.95;\,m-1,\,n-1} = F_{0.95;\,17,\,21} = 0.4507$. The lower limit is $\left(\frac{51}{47}\right)^2 \times 0.4507 \approx 0.5307$.

Two observations:

(1) $\frac{1}{0.5307} \approx 1.884$, which matches the upper limit 1.8845 up to rounding, so these are really the same interval viewed in two different ways.

(2) $F_{0.95;\,17,\,21} = \frac{1}{F_{0.05;\,21,\,17}} = \frac{1}{2.2189} \approx 0.4507$. Note that the degrees of freedom swap places when you take the reciprocal. If you lack a table of lower percent points, you can still get the values from upper percent point tables.

All the usual notes apply again. We don’t get to play around with one-sided versus two-sided, and we don’t get to play games with the confidence level.

Example 5 from this handout. There’s a separate handout on Fieller’s result.

Example 6 from this handout. It’s the simple binomial confidence interval. You’ve seen this many times.

Example 7 from this handout. This has accompanying numbers.

Our final issue here is that of the relevance of hypothesis testing. Is it really just a game? There’s a nice handout on this. This was a Science News article, March 27, 2010. The article is by Tom Siegfried, pp 26-29.
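The arithmetic in Examples 2 and 3 above can be re-checked with a short Python sketch. The chi-square and t percent points are copied from the notes rather than computed, and all variable names are illustrative assumptions, not part of the handout.

```python
import math

# --- Example 2: limits for sigma^2 with n = 28 and s = 418 ---
n = 28
s = 418.0
ss = (n - 1) * s ** 2  # (n - 1) * s^2 = 27 * 418^2

# Chi-square percent points for 27 degrees of freedom, copied from the notes.
CHI2_LOW_05 = 16.1514    # 5% of the distribution lies below this point
CHI2_HIGH_05 = 40.1133   # 5% of the distribution lies above this point
CHI2_LOW_025 = 14.5734   # 2.5% lies below
CHI2_HIGH_025 = 43.1945  # 2.5% lies above

var_upper = ss / CHI2_LOW_05   # one-sided interval (0, var_upper)
var_lower = ss / CHI2_HIGH_05  # one-sided interval (var_lower, infinity)
var_two_sided = (ss / CHI2_HIGH_025, ss / CHI2_LOW_025)
sd_two_sided = tuple(math.sqrt(v) for v in var_two_sided)

# --- Example 3: difference of means with a pooled standard deviation ---
m, xbar_a, s_a = 18, 310.0, 47.0
n_b, xbar_b, s_b = 22, 355.0, 51.0
T_025_38 = 2.0244  # t_{0.025; 38}, copied from the notes

s_p = math.sqrt(((m - 1) * s_a ** 2 + (n_b - 1) * s_b ** 2) / (m + n_b - 2))
half_width = T_025_38 * s_p * math.sqrt((m + n_b) / (m * n_b))
diff = xbar_a - xbar_b
diff_ci = (diff - half_width, diff + half_width)

print(var_upper, var_lower, var_two_sided, sd_two_sided)
print(s_p, diff_ci)
```

Small percent points go in the denominator of the upper limits and large ones in the denominator of the lower limits, which is why the lower-tail value 16.1514 produces the upper bound for the variance.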