Statistics 550 Notes 15
Reading: Section 3.3
For finding minimax estimators, there is no constructive approach as there is for finding Bayes estimators, and it is often difficult to find minimax estimators. However, there are some tools that allow us to find minimax estimators in particular settings.
I. Finding Minimax Procedures
The minimax criterion minimizes the worst possible risk. That is, we prefer $\delta$ to $\delta'$ if and only if
$$\sup_\theta R(\theta, \delta) < \sup_\theta R(\theta, \delta').$$
A procedure $\delta^*$ is minimax (over a class of considered decision procedures) if it satisfies
$$\sup_\theta R(\theta, \delta^*) = \inf_\delta \sup_\theta R(\theta, \delta).$$
Let $\delta_\pi$ denote the Bayes estimator with respect to the prior $\pi(\theta)$, and let
$$r_\pi = E_\pi\left[ E[\, l(\theta, \delta_\pi(X)) \mid \theta \,] \right] = E_\pi[R(\theta, \delta_\pi)]$$
denote the Bayes risk of the Bayes estimator for the prior $\pi(\theta)$.
A prior distribution $\pi$ is least favorable if $r_\pi \geq r_{\pi'}$ for all prior distributions $\pi'$.
This is the prior distribution that causes the statistician the greatest average loss, assuming the statistician uses the Bayes estimator.
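To make these definitions concrete, here is a small numerical sketch (the two-point parameter space, two decision rules, and all risk values are made up for illustration; none of this is from the text) comparing worst-case risks and locating a least favorable prior on a grid:

```python
import numpy as np

# Hypothetical two-point parameter space {theta_1, theta_2} and two
# decision rules; risk[i, j] = R(theta_i, delta_j). Values are made up.
risk = np.array([[0.30, 0.10],
                 [0.20, 0.50]])

# Worst-case risk of each rule: the minimax rule has the smaller sup-risk.
print("sup-risk of each rule:", risk.max(axis=0))   # delta_1 wins here

# Bayes risk of the Bayes rule under a prior (p, 1-p); a least favorable
# prior maximizes this quantity over priors.
grid = np.linspace(0, 1, 1001)
r_pi = [min(p * risk[0] + (1 - p) * risk[1]) for p in grid]
i = int(np.argmax(r_pi))
print(f"least favorable p ~ {grid[i]:.2f}, Bayes risk {r_pi[i]:.3f}")
```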
Theorem 2: Suppose that $\pi$ is a prior distribution on $\Theta$ and $\delta_\pi$ is a Bayes estimator with respect to $\pi$ such that
$$r_\pi(\delta_\pi) = \int R(\theta, \delta_\pi)\, d\pi(\theta) = \sup_\theta R(\theta, \delta_\pi). \quad (1.1)$$
Then:
(i) $\delta_\pi$ is minimax.
(ii) If $\delta_\pi$ is the unique Bayes solution with respect to $\pi$, it is the unique minimax procedure.
(iii) $\pi$ is a least favorable prior.
Proof:
(i) Let $\delta$ be any other procedure. Then,
$$\sup_\theta R(\theta, \delta) \geq \int R(\theta, \delta)\, d\pi(\theta) \geq \int R(\theta, \delta_\pi)\, d\pi(\theta) = \sup_\theta R(\theta, \delta_\pi).$$
(ii) This follows by replacing $\geq$ by $>$ in the second inequality of the proof of (i).
(iii) Let $\pi'$ be some other distribution of $\theta$. Then,
$$r_{\pi'}(\delta_{\pi'}) = \int R(\theta, \delta_{\pi'})\, d\pi'(\theta) \leq \int R(\theta, \delta_\pi)\, d\pi'(\theta) \leq \sup_\theta R(\theta, \delta_\pi) = r_\pi(\delta_\pi).$$
Corollary 1: If a Bayes procedure $\delta_\pi$ has constant risk, then it is minimax.
Proof: If $\delta_\pi$ has constant risk, then (1.1) is clearly satisfied.
Corollary 2 (Theorem 3.3.2): Suppose $\delta^*$ has $\sup_\theta R(\theta, \delta^*) = r < \infty$. If there exists a prior $\pi^*$ such that $\delta^*$ is Bayes for $\pi^*$ and $\pi^*\{\theta : R(\theta, \delta^*) = r\} = 1$, then $\delta^*$ is minimax.
Example 1 (Example 3.3.1, Problem 3.3.4): Suppose $X_1, \ldots, X_n$ are iid Bernoulli($\theta$) and we want to estimate $\theta$. Consider the squared error loss function $l(\theta, a) = (\theta - a)^2$. For squared error loss and a Beta(r, s) prior, we showed in Notes 16 that the Bayes estimator is
$$\hat{\theta}_{r,s} = \frac{r + \sum_{i=1}^n x_i}{r + s + n}.$$
We now seek to choose r and s so that $\hat{\theta}_{r,s}$ has constant risk. The risk of $\hat{\theta}_{r,s}$ is
$$
\begin{aligned}
R(\theta, \hat{\theta}_{r,s}) &= E\left[\left(\frac{r + \sum_{i=1}^n x_i}{r+s+n} - \theta\right)^2\right] \\
&= \mathrm{Var}\left(\frac{r + \sum_{i=1}^n x_i}{r+s+n}\right) + \left[E\left(\frac{r + \sum_{i=1}^n x_i}{r+s+n}\right) - \theta\right]^2 \\
&= \frac{n\theta(1-\theta)}{(r+s+n)^2} + \left(\frac{r + n\theta}{r+s+n} - \theta\right)^2 \\
&= \frac{n\theta(1-\theta) + \left(r + n\theta - \theta(r+s+n)\right)^2}{(r+s+n)^2} \\
&= \frac{n\theta(1-\theta) + \left(r - \theta(r+s)\right)^2}{(r+s+n)^2}.
\end{aligned}
$$
Expanding the numerator gives $r^2 + \theta\,[n - 2r(r+s)] + \theta^2\,[(r+s)^2 - n]$. The coefficient on $\theta^2$ in the numerator is $(r+s)^2 - n$ and the coefficient on $\theta$ in the numerator is $n - 2r(r+s)$. We choose r and s so that both these coefficients are zero:
$$(r+s)^2 - n = 0, \quad n - 2r(r+s) = 0.$$
Solving these equations gives $r = s = \frac{\sqrt{n}}{2}$.
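As a sanity check on this algebra (a sketch using sympy; not part of the original notes), we can expand the numerator and solve for r and s symbolically:

```python
import sympy as sp

theta, r, s, n = sp.symbols('theta r s n', positive=True)

# Numerator of the risk: n*theta*(1-theta) + (r - theta*(r+s))**2.
numer = sp.expand(n*theta*(1 - theta) + (r - theta*(r + s))**2)
c2, c1, c0 = sp.Poly(numer, theta).all_coeffs()
print(c2, c1, c0)   # (r+s)^2 - n,  n - 2r(r+s),  r^2 (in expanded form)

# Setting the theta^2 and theta coefficients to zero and solving:
print(sp.solve([c2, c1], [r, s], dict=True))   # r = s = sqrt(n)/2
```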
The unique minimax estimator is
$$\hat{\theta}_{\mathrm{minimax}} = \hat{\theta}_{\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}} = \frac{\frac{\sqrt{n}}{2} + \sum_{i=1}^n x_i}{n + \sqrt{n}},$$
which has constant risk $\frac{1}{4(1+\sqrt{n})^2}$, compared to $\frac{\theta(1-\theta)}{n}$ for the MLE $\bar{X}$.
For small n, the minimax estimator is better than the MLE for a large range of $\theta$. For large n, the minimax estimator is better than the MLE for only a small range of $\theta$ near 0.5.
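The following sketch (illustrative, not from the notes) computes, for a few values of n, the interval of $\theta$ on which the minimax rule's constant risk beats the MLE's risk $\theta(1-\theta)/n$:

```python
import numpy as np

def crossover(n):
    """Interval of theta where the minimax risk 1/(4(1+sqrt(n))^2) is
    below the MLE risk theta*(1-theta)/n."""
    c = n / (4 * (1 + np.sqrt(n))**2)       # solve theta*(1-theta) = c
    half = np.sqrt(1 - 4 * c) / 2
    return 0.5 - half, 0.5 + half

for n in [4, 100, 10000]:
    lo, hi = crossover(n)
    print(f"n={n}: minimax beats MLE for theta in ({lo:.3f}, {hi:.3f})")
```

The interval shrinks toward 0.5 as n grows, matching the claim above.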
Note: For large $n$, the least favorable prior concentrates nearly its entire attention on the neighborhood of $\theta = \frac{1}{2}$, for which accurate estimation of $\theta$ is most difficult, leading to poor performance relative to the MLE for other neighborhoods.
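By part (iii) of Theorem 2, the least favorable prior in this example is Beta($\frac{\sqrt{n}}{2}, \frac{\sqrt{n}}{2}$). A quick check (a sketch) of how tightly it concentrates around 1/2 as n grows:

```python
import numpy as np

# Beta(a, a) with a = sqrt(n)/2 has mean 1/2 and variance 1/(4(2a + 1)).
for n in [4, 100, 10000]:
    a = np.sqrt(n) / 2
    sd = np.sqrt(1 / (4 * (2 * a + 1)))
    print(f"n={n:>5}: Beta({a:.1f}, {a:.1f}) prior, sd = {sd:.3f}")
```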
Minimax as limit of Bayes rules:
If the parameter space $\Theta$ is not bounded, minimax rules are often not Bayes rules but instead can be obtained as limits of Bayes rules. To deal with such situations we need an extension of Theorem 3.3.2.
Theorem 3.3.3: Let $\delta^*$ be a decision rule such that $\sup_\theta R(\theta, \delta^*) = r < \infty$. Let $\{\pi_k\}$ be a sequence of prior distributions and let $r_k$ be the Bayes risk of the Bayes rule with respect to the prior $\pi_k$. If $r_k \to r$ as $k \to \infty$, then $\delta^*$ is minimax.
Proof: Suppose $\delta$ is any other estimator. Then,
$$\sup_\theta R(\theta, \delta) \geq \int R(\theta, \delta)\, d\pi_k(\theta) \geq r_k,$$
and this holds for every k. Letting $k \to \infty$, $\sup_\theta R(\theta, \delta) \geq r = \sup_\theta R(\theta, \delta^*)$, and $\delta^*$ is minimax.
Note: Unlike Theorem 3.3.2, even if the Bayes estimators for the priors $\pi_k$ are unique, the theorem does not guarantee that $\delta^*$ is the unique minimax estimator.
Example 2 (Example 3.3.3): $X_1, \ldots, X_n$ iid $N(\mu, 1)$, $-\infty < \mu < \infty$. Suppose we want to estimate $\mu$ with squared error loss. We will show that $\bar{X}$ is minimax.
First, note that $\bar{X}$ has constant risk $\frac{1}{n}$. Consider the sequence of priors $\pi_k = N(0, k)$. In Notes 16, we showed that the Bayes estimator for squared error loss with respect to the prior $\pi_k$ is
$$\hat{\mu}_k = \frac{n}{n + \frac{1}{k}} \bar{X}.$$
The risk function of $\hat{\mu}_k$ is
$$
\begin{aligned}
R(\mu, \hat{\mu}_k) &= E\left[\left(\frac{n}{n + \frac{1}{k}} \bar{X} - \mu\right)^2\right] \\
&= [\mathrm{Bias}(\hat{\mu}_k)]^2 + \mathrm{Var}(\hat{\mu}_k)
 = \left(\frac{n}{n + \frac{1}{k}}\,\mu - \mu\right)^2 + \left(\frac{n}{n + \frac{1}{k}}\right)^2 \mathrm{Var}(\bar{X}) \\
&= \frac{\left(\frac{1}{k}\right)^2 \mu^2}{\left(n + \frac{1}{k}\right)^2} + \frac{n}{\left(n + \frac{1}{k}\right)^2}
 = \frac{n + \frac{\mu^2}{k^2}}{\left(n + \frac{1}{k}\right)^2}.
\end{aligned}
$$
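A quick Monte Carlo check of this risk formula (a sketch; the values of n, k, and $\mu$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, mu = 10, 5.0, 1.5                    # arbitrary illustrative values
shrink = n / (n + 1 / k)                   # Bayes shrinkage factor

# Monte Carlo estimate of E[(shrink * Xbar - mu)^2] vs. the closed form.
xbar = rng.normal(mu, 1 / np.sqrt(n), size=200_000)
mc = np.mean((shrink * xbar - mu) ** 2)
exact = (n + mu**2 / k**2) / (n + 1 / k) ** 2
print(mc, exact)                           # should agree to ~3 decimals
```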
The Bayes risk of $\hat{\mu}_k$ with respect to $\pi_k$ is
$$r_k = \int_{-\infty}^{\infty} \frac{n + \frac{\mu^2}{k^2}}{\left(n + \frac{1}{k}\right)^2} \cdot \frac{1}{\sqrt{2\pi k}} \exp\left(-\frac{\mu^2}{2k}\right) d\mu = \frac{n + \frac{1}{k}}{\left(n + \frac{1}{k}\right)^2} = \frac{1}{n + \frac{1}{k}},$$
using $E_{\pi_k}[\mu^2] = k$.
As $k \to \infty$, $r_k \to \frac{1}{n}$, which is the constant risk of $\bar{X}$. Thus, by Theorem 3.3.3, $\bar{X}$ is minimax.
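Numerically (a small sketch, with n chosen arbitrarily), the Bayes risks $r_k$ approach the constant risk $1/n$ of $\bar{X}$:

```python
# r_k = 1/(n + 1/k) increases to 1/n as k grows.
n = 10
for k in [1, 10, 100, 1000, 10**6]:
    print(f"k={k}: r_k = {1 / (n + 1/k):.6f}   (1/n = {1/n:.6f})")
```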