THE RESISTANT L I N E AND RELATED REGRESSION METHODS ORION 018 I A I N JOHNSTONE and PAUL VELLEMAN Project ORION JUNE 1983 Department of Statistics Stanford University Stanford, California THE RESISTANT LINE AND RELATED REGRESSION METHODS I a i n Johnstone Stanford U n i v e r s i t y and Mathematical Sciences Research I n s t i t u t e and Paul Ve 11eman Cornell U n i v e r s i t y ORION 018 J U N E 1983 PROJECT ORION Department of S t a t i s t i c s Stanford U n i v e r s i t y Stanford, California The R e s i s t a n t Line and R e l a t e d R e g r e s s i o n Xethods I a i n Johnstone * , Stanford U n i v e r s i t y , and NSP-I P a u l Velleman, C o r n e l l U n i v e r s i t y Abstract I n a bivariate (x,y) s c a t t e r - p l o t t h e three-group r e s i s t a n t l i n e i s t h a t l i n e f o r which t h e median r e s i d u a l i n each o u t e r t h i r d of t h e d a t a ( o r d e r e d on x) is zero. It was proposed by Tukey as a n e x p l o r a - t o r y method r e s i s t a n t t o o u t l i e r s i n culation. x and y and s u i t e d t o hand c a l - A d u a l p l o t r e p r e s e n t a t i o n of t h e procedure y i e l d s a f a s t , convergent a l g o r i t h m , non-parametric c o n f i d e n c e i n t e r v a l s f o r t h e s l o p e , c o n s i s t e n c y , i n f l u e n c e f u n c t i o n and a s y m p t o t i c n o r m a l i t y r e s u l t s . Monte- C a r l o r e s u l t s show t h a t s m a l l sample e f f i c i e n c i e s exceed t h e i r a s y m p t o t i c v a l u e s i n i m p o r t a n t c a s e s and a r e a d e q u a t e f o r e x p l o r a t o r y work. medians by o t h e r M-estimators Replacing of l o c a t i o n i n c r e a s e s e f f i c i e n c y s u b s t a n t i a l l y w i t h o u t a f f e c t i n g breakdown o r c o m p u t a t i o n a l complexity. The method t h u s combines bounded i n f l u e n c e and a c c e p t a b l e e f f i c i e n c y w i t h c o n c e p t u a l and computational s i m p l i c i t y . Key Words: R e s i s t a n t l i n e , bounded i n f l u e n c e , r o b u s t r e g r e s s i o n , d u a l p l o t , v a r i a n c e - r e d u c t i o n s w i n d l e , Brown-Mood r e g r e s s i o n , BerryEsseen theorem, M-estimator,confidence i n t e r v a l , r e p e a t e d median regression. *Work by P r o f e s s o r J o h n s t o n e was s u p p o r t e d i n p a r t a t C o r n e l l by an A u s t r a l i a n N a t i o n a l U n i v e r s i t y P o s t b a c h e l o r T r a v e l l i n g S c h o l a r s h i p , a t S t a n f o r d by ONR c o n t r a c t d N00014-81-K-0340, and a t t h e Math S c i e n c e s Research I n s t i t u t e by NSF. Work by P r o f e s s o r Yelleman was s u p p o r t e d i n p a r t by ARO(Durham) jIDAAG29-82-C-0178 awarded t o P r i n c e t o n U n i v e r s i t y . The a u t h o r s thank David Hoaglin f o r h e l p f u l comments on a n earlier d r a f t of this p a p e r and C o r n e l l U n i v e r s i t y f o r t h e computer funds. 1. INTRODUCTION APTD SUMMARY Tukey (1971) proposed t h e ( t h r e e group) r e s i s t a n t l i n e a s an exploratory d a t a a n a l y t i c t o o l f o r quickly f i t t i n g a s t r a i g h t l i n e t o bivariate (x,y) data. The d a t a p o i n t s a r e d i v i d e d i n t o t h r e e groups according t o s m a l l e s t , middle o r l a r g e s t x v a l u e s and t h e l i n e with an equal number of p o i n t s above and below i t i n each of t h e o u t e r groups is fitted. The r e s u l t i n g parameter e s t i m a t e s a r e r e s i s t a n t t o t h e e f f e c t s of d a t a p o i n t s extreme i n y, or x o r both. @or detailed d i s c u s s i o n s TncPudTng examples s e e Velleman and Hoaglin (198'1) and Emerson and Hoaglin (1983a) .) Designed a s a " p e n c i l and paper" method, t h e r e s i s t a n t l i n e i s a l s o u s e f u l a s a computationally quick nonparametric r e g r e s s i o n i n computer-assisted d a t a a n a l y s e s . Although i t compares favorably t o o t h e r well-known nonparametric r e g r e s s i o n methods and i s widely a v a i l a b l e i n s t a t i s t i c s packages (for example, Minitab ( ~ y a n&. 19811, and P-Stat &. (Buhler and Buhler, 1981), t h e method has not been analyzed t h e o r e t i c a l l y . T h f s etrt5cle p r e s e n t s an inexpensive computational algorithm (52), aonparametric confidence i n t e r v a l s ( § 3 ) , and asymptotic r e s u l t s on cons i s t e n c y , i n f l u e n c e f u n c t i o n , and asymptotic normality ( 5 4 ) f o r t h e resistant line. A c l a s s of " $ - r e s i s t a n t " l i n e s c o n s t r u c t e d from M- e s t i m a t o r s of l o c a t i o n i s t h e n introduced (55). While r e t a i n i n g t h e computational and r e s i s t a n c e p r o p e r t i e s of t h e r e s i s t a n t l i n e , t h e s e e s t i m a t o r s can have s u b s t a n t i a l l y h i g h e r e f f i c i e n c y . F i n a l l y , w e use t h e preceding theory and t h e r e s u l t s of a Monte Carlo s i m u l a t i o n experrnent of s m a l l sample p r o p e r t i e s (56) t o compare t h e r e s i s t a n t l i n e s w i t h r e l a t e d lfne-fl'ttfng procedures. The experiment u s e s two new v a r i a n c e r e d u c t i o n "swindles" and provides small-sample r e s u l t s n o t p r e v i o u s l y a v a i l a b l e f o r most of t h e e s t i m a t o r s s t u d i e d . I n c o n s i d e r i n g t h e merits of v a r i o u s a l t e r n a t i v e s t o l e a s t s q u a r e s r e g r e s s i o n methods i n e x p l o r a t o r y work our primary c r i t e r i a a r e : (1) r e s i s t a n c e ; i n c l u d i n g p r o t e c t i o n from d a t a v a l u e s extreme i n x a s w e l l a s t h o s e extreme i n y , ("bounded influence"), (21 c o s t of computation, and (3) asyrsptotic and small-sample e f f i c i e n c y f o r b o t h Gaussian and h e a v y - t a i l e d e r r o r d i s t r i b u t i o n s . The r e s i s t a n t l i n e performs w e l l f o r (1) and (2) , and i n many cases has an e f f i c i e n c y of roughly 60%--exceeding t h e 50% u s u a l l y considered adequate f o r e x p l o r a t o r y d a t a a n a l y s e s . The m o d i f i c a t i o n s proposed i n S e c t i o n 5 can i n c r e a s e t h e p r a c t i c a l e f f i c i e n c y t o roughly 75% under a wide range of e r r o r d i s t r i b u t i o n and i n c r e a s e t h e asymptotic r e l a t i v e e f f i c i e n c y t o 80% under c l a s s i c a l r e g r e s s i o n assumptions, Precedent L i t e r a t u r e Tukey's method combines f e a t u r e s of two t r a d i t i o n a l approaches t o l i n e a r regression. Wald (1940) s t u d i e d t h e e r r o r s - i n - v a r i a b l e s of f i t t i n g a s t r a i g h t l i n e y = a to error. + bx when both y and x problem a r e subject He p a r t i t i o n s t h e observed d a t a p o i n t s ( xi' y.) 1 i n t o two s e t s and and u s e s an e s t i m a t o r based on group means: Independently of Wald, Nair and S h r i v a s t a v a (1942) a p p l i e d a s i m i l a r e s t i m a t o r t o f i x e d , e q u a l l y spaced x-values divided i n t o t h r e e groups. They showed f o r Gaussian y t h a t an optimal (minimum v a r i a n c e ) esti- m a t o r ~ f t h eform (1.1) i s obtained by u s i n g only n/3 p o i n t s i n each of t h e extreme g r o u p s , i n c r e a s i n g i t s e f f i c i e n c y r e l a t i v e t o l e a s t s q u a r e s from 3 / 4 t o 819- observation. X o s t e l l e r (1946) n o t e d t h e advantage of t h r e e groups i n B a r t l e t t (1949) made a - s i m i l a r s i m p l e p a r t i t i o n - b a s e d e s t i m a t i o n of t h e c o r r e l a t i o n c o e f f i c i e n t of a (x,y) b i v a r i a t e Gaussian p o p u l a t i o n and found t h a t t h e o p t i m a l d i v i s i o n p l a c e s a b o u t 27% of t h e d a t a i n e a c h of t h e o u t e r groups. The same o p t i m a l i t y c a l c u l a t i o n ( S e c t i o n 4) a p p l i e s t o t h e r e g r e s s i o n problem and i n a d i f f e r e n t c o n t e x t l e a d s t o t h e s o c a l l e d "27% r u l e " i n psychometrics ( c . f . Kelley (19391, and McCabe (1980) f o r some r e f e r e n c e s ) . F u r t h e r r e s u l t s on o p t i m a l a l l o c a t i o n f o r o t h e r x - d i s t r i b u t i o n s and s m a l l sample approximations are g i v e n by Gibson and J o w e t t (1957a) and B a r t ~ n and Casley (1958). The second approach (Mood, 1950; Brown and Mood, 1951) i s non-parametric: d i v i d e t h e d a t a i n t o two groups w i t h t h e s p l i t a t t h e median ( a s i n Wald's method), and f i n d t h e s l o p e bm x making t h e median r e s i d u a l i n t h e l e f t group e q u a l t o t h e median of t h e r e s i d u a l s i n t h e r i g h t group. The i n t e r c e p t a 2 Let , L B'M = med{xi: i s t h e n chosen t o make t h e median of a l l t h e r e s i d u a l s z e r o . x i E L), ..yL = med{yi: xi E L), e t c . Xood's (1950) a l g o r i t h m f o r f i n d i n g t h e s l o p e ( r e p e a t e d i n Tukey, 1971) b e g i n s w i t h b ( l ) = (yR Here e m(k) R - FL)/(SR and -(k) eL - it), computes r e s i d u a l s e i , and iterates : a r e t h e medians of t h e kth s t e p r e s i d u a l s i n t h e r i g h t and l e f t g r o u p s , r e s p e c t i v e l y . The i t e r a t i o n is t o c o n t i n u e u n t i l t h e two g r o u p m e d i a n s a r e e q u a l o r t h e c o r r e c t i o n i s as s m a l l as r e q u i r e d . a p p e a r s t o b e t h e o n l y p u b l i s h e d a l g o r i t h m f o r Brown-Mood r e g r e s s i o n . This -4However, experience shows t h a t i t f a i l s t o converge o f t e n enough t o S e c t i o n 2 o f f e r s an make i t i n a p p r o p r i a t e f o r p r a c t i c a l d a t a a n a l y s i s . e x p l a n a t i o n and an a l g o r i t h m whose r a p i d convergence i s guaranteed. Brown and Mood, and Daniels (1954) provided t e s t s f o r t h e hypothesis ( a , b ) = (ao ,bo) assuming f i x e d x-values. Bhapkar (1961) gave some consistency r e s u l t s and H i l l (1962) proved asymptotic normality f o r b ) ( a ~ BII ~ ' Both a u t h o r s proposed tests f o r l i n e a r i t y based on Brown-l4ood r e g r e s s i o n , Hogg (1975) f i t r e g r e s s i o n l i n e s based on p e r c e n t i l e s o t h e r t h a n t h e median of t h e r e s i d u a l s i n each h a l f of t h e data. Kildea (1981) obtained asymptotic n o r m a l i t y of a c l a s s of Brown-Xood e s t i m a t o r s u s i n g weighted medians and proposed an o p t i m a l l y weighted, but non-robust estimator. The r e s i s t a n t l i n e thus combines t h e r e s i s t a n c e t o o u t l i e r s of t h e Brown-Mood median r e g r e s s i o n w i t h t h e improved e f f i c i e n c y obt a i n e d from t h r e e p a r t i t i o n s . I n a d d i t i o n t o t h e r e g r e s s i o n e s t i m a t o r s discussed above w e i n c l u d e t h r e e o t h e r s of c u r r e n t i n t e r e s t i n t h e comparisons. Two a r e more expensive t o compute than t h o s e discussed t h u s f a r , b u t reward t h e e x t r a e f f o r t w i t h h i g h e r breakdown v a l u e s and/or These a r e t h e median-of-pairwise-slopes and Sen (1968) ; bTS = med { (y j - yi) greater efficiencies. procedure s t u d i e d by T h e i l (1950) / (xj - xi) g e n e r a l l y high e f f i c i e n c y , and t h e r e l a t e d x j > x. 1 1, which h a s r e p e a t e d median estimatof = med med{(y - y i ) / ( x - xi)} i n t r o d u c e d by S i e g e 1 (1982), which bm i j+i 5 j h a s optimal 50% breakdown. The t h i r d , l e a s t a b s o l u t e r e s i d u a l (L ) 1 r e g r e s s i o n (e.g. Laplace 1818, B a s s e t t and Koenker 1978), h a s been suggested as an improvement on least s q u a r e s f o r heavy-tailed error -5distributions. It has an i n f l u e n c e f u n c t i o n t h a t i s unbounded i n x , and t h u s can be s e n s i t i v e t o extreme x v a l u e s . With t h i s c a v e a t , r e g r e s s i o n performs comparably t o t h e $ - r e s i s t a n t t h e c r i t e r i a (1) - (3) above. L1 l i n e s with respect tq We have n o t included a bounded i n f l u e n c e r o b u s t r e g r e s s i o n of t h e Krasker-Welsch (1982) type because t h e r e i s no agreement y e t on how t o apply t h e d e f i n i t i o n s t o simple r e g r e s s i o n . Some of t h e r e s u l t s i n t h i s paper were f i r s t announced i n Johnstone and Velleman (1982). 52. PROPERTIES OF THE KESISTANT LINE Let d a t a p o i n t s (x1,yl),...,(xn.yn) be given w i t h x 1- x <. 2 - ..lxn . P a r t i t i o n t h e x-values i n t o t h r e e groups L, M y and R c o n t a i n i n g and n n ~ ' R data points respectively x, = max(L)< xR = min{R) and (2+ % + nR = n) ; where M C ( 5 , xR) might be empty. and Hoaglin (1981) f o r a d e t a i l e d algorithm.) abuse n o t a t i o n by w r i t i n g Line s l o p e e s t i m a t e b (2.1) . med ieL RL L f o r (i : xi We s h a l l o c c a s i o n a l l y 11, e t c . E (See Velleman The r e s i s t a n t i s then defined a s t h e s o l u t i o n t o (yi - bxi) = med (yi i eR - bxi) The i n t e r c e p t e s t i m a t e aRL may, f o r example, be chosen t o make t h e median r e s i d u a l s of b o t h groups zero, s p e c i a l c a s e i n which %= Brown-Mood r e g r e s s i o n i s t h e 0, nL = nR = n/2. The r e s i s t a n t l i n e i s a s p e c i a l c a s e of K i l d e a ' s (1981) weighted median e s t i m a t o r s t h a t uses only 0-1 weights. 2 . 1 A, Dual. p l ? t li-nes. The d u a l p l o t f n t e r c h a n g e s t h e r o l e s of p o i n t s and - I g n o r e t h e i n t e r c e p t , a , and p l o t e a c h r e s i d u a l , e2(b) = yl a g a i n s t b. T h i s y i e l d s a l i n e w i t h s l o p e -x to the original data point x i y ) . i xib,- and i n t e r c e p t yf t o c o r r e s ~ o n d Two d u a l l i n e s ei(b) and eJ ( b ) w i l l i n t e r s e c t a t a point having'b-value . , y .1 ) namely t h e s l o p e o f t h e l i n e i n (x,y) s p a c e j o i n i n g ( x1 - (interpreted as if x i = x.). 3 and ( xj ,y j) For an i l l u s t r a t i v e example, s e e Emerson and H o a g l i n (1983a). Dolby (1960) and D a n i e l s (1954) have used t h e d u a l p l o t i n simple l i n e a r r e g r e s s i o n contexts. The median t r a c e of t h e l e f t p a r t i t i o n i s t h e p i e c e w i s e - l i n e a r TL(b) = med (yi-x ib ) . i€L the dual l i n e s i n L. If is odd, If i s even, t h e l i n e segments i n a d j a c e n t p a i r s of d u a l l i n e s . Since \ TL < function c o n s i s t s of l i n e segments from 5, t h e TL average r i g h t median t r a c e TR(b) = med (yi-bx i'1 i n t e r s e c t s TL e x a c t l y once. The b-value of t h i s ~ E R This i n t e r s e c t i o n i s t h e s l o p e of t h e r e s i s t a n t l i n e . ( s e e f i g u r e A.) g r a p h i c a l c o n s t r u c t i o n shows t h a t bRL always e x i s t s and i s unique. The d u a l p l o t r e v e a l s t h e d i f f i c u l t i e s encountered by t h e . , e t c . and L e t xR = med xi " ' ~= med i a R -y 1 iER Then t h e f i r s t e s t i m a t e d s l o p e i s b(')= (;B yL)/Ax. Mood-Tukey a l g o r i t h m . Ax = -- < - 4. - On t h e d u a l p l o t t h e d i s t a n c e between t h e median t r a c e s a t any s l o p e , b , i s t h e d i f f e r e n c e between t h e median r e s i d u a l s i n t h e l e f t and r i g h t groups. Thus, from ( 1 , 2 ) , b (k)) - TL(b(k))]/Ax. Now, i f Left - Figure A. (r....) The -Dual P l o t d i s p l a y s f o r each d a t a p o i n t (x, y) t h e l i n e e = y - xb. H e r e l i n e s a r e shown only f o r p o i n t s L e f t o r t h e R i g h t p a r t i t i o n s of t h e d a t a s e t . i n the The median t r a c e s , T and TRY f o l l o w t h e median r e s i d u a l , e , a t each L s l o p e , b. The a b c i s s a of t h e i r i n t e r s e c t i o n i s t h e resistant l i n e slope, bRL ' Right (- -j then I b (k+l) - bRLI > Ib (k ) - bRL I, has made t h e approximation worse, and t h i s s t e p of t h e i t e r a t i o n I n e q u a l i t y (2.2) c a n b e w r i t t e n which bounds t h e d i f f e r e n c e i n t h e s l o p e s of t h e median t r a c e s n e a r b~~ A median t r a c e c a n be s t e e p o n l y i f a n x-value i n i t s p a r t i t i o n i s l a r g e i n magnitude. Thus t h e i t e r a t i o n c a n be stymied by h i g h - l e v e r a g e p o i n t s w i t h e x t r a o r d i n a r y x-values. See Emersan and Haqglin C1983t) f o r a s p e c l f i c example c o n s t r u c t e d by A. S i e g e l , 2.2 - An Algorithm f o r R e s i s t a n t L i n e s . for the resistant l i n e The d u a l p l o t s u g g e s t s a n a l g o r i t h m t h a t w i l l always converge, l i n e a r monotone-decreasing function yields the r e s i s t a n t l i n e slope. The z e r o of t h e piecewise- - E(b) = TL(b) TR(b) Among a l g o r i t h m s f o r f i n d i n g z e r o s of f u n c t i o n s t h e one known a s ZEROIN (Wilkinson (1967), Dekker (1969), Brent (1973))is well-suited t o t h i s problem. The a l g o r i t h m r e q u i r e s c h o i c e of a t o l e r a n c e , TOL, which i s t h e amount of e r r o r i n and a n i n t e r v a l (bmin, b ) max t o search. TOLl = 0.5TOL + 2.0 x Ex Ibmax I and t h a t c a n be t o l e r a t e d , Brent (1973) shows t h a t t h i s bmax-bmin [log2( TOLl )12 a l g o r i t h m w i l l always converge i n fewer t h a n where b E steps, i s t h e machine e p s i l o n . Brent c l a i m s t h a t t h e a l g o r i t h m c a n u s u a l l y be expected t o converge much more q u i c k l y , and f o r t h i s a p p l i c a t i o n o u r e x p e r i e n c e c o n f i r m s this, Each s t e p r e q u i r e s t h e median of t h e r e s i d u a l s i n e a c h of t h e o u t e r groups. A s y m p t o t i c a l l y , t h e median of n numbers can b e found w i t h O(n) comparisons. (See, f o r example, Aho, H o p c r o f t , and Ullman (1974) pp. 97-99.) However, f o r moderate sample s i z e , n / 3 i s t o o small t o make t h e l a r g e overThe convergence r a t e of ZEROIN i s head of s p e c i a l i z e d methods worthwhile. independent of n , s o t h i s a l g o r i t h m r e q u i r e s O(n) o p e r a t i o n s . Because E(b) i s monotone, any b and bmin max that satisfy sign(g(bmin)) a r e s u i t a b l e ; t h e f i r s t two e s t i m a t e s s i g n ( E (bmax)) Velleman and H o a g l i n from Mood's o r i g i n a l i t e r a t i o n o f t e n s u f f i c e . (1981) p r o v i d e p o r t a b l e programs f o r f i n d i n g r e s i s t a n t l i n e s u s i n g t h i s algorithm. 2.3 - Unbiasedness. Consider t h e b i v a r i a t e l i n e a r model Yi = a + + BXi E ~ , i = 1 , n i n which t h e {si} (I) t h e X. a r e f i x e d , o r (11) t h e X. a r e i . i . d . 1 E i' are i.i.d. and and independent of 1 The r e s u l t s a r e g i v e n f o r n o d e l (I) and f o l l o w f o r model (11) by c o n d i t i o n i n g on X. If nL without and n R a r e both odd, t h e n any assumption o f symmetry on t h e P { b l U . ~B } = P{med isR p{bRL and s i m i l a r l y d i s t r i b u t e d about - 6 > < m) < 81 2 1 / 2 . 0: a r e symmetric a b o u t (if E \ cil i s median unbiased f o r B bRL B E i si] bRL (and a ) RL y and n c) = R' - {med ( E ~ c x . ) > m e d ( ~ ~cxi) 1 R L i s e q u i v a l e n t t o t h e same e v e n t w i t h e v e r y E { b~~ 2 112 , E i a r e symmetrically a), and t h u s a r e median unbiased and unbiased f o r a l l v a l u e s of - > med i EL i: Suppose now a l s o t h a t t h e e r r o r s i n t h i s case (and E i 1 Toseethisnotethat and t h a t r e p l a c e d by {bRL -& - @ i' ' -c) 3 ~ r e a k d o w nBounds. The ( g r o s s e r r o r ) breakdown p o i n t of a n e s t i m a t o r measures i t s a b i l i t y t o r e s i s t t h e e f f e c t s of o u t l i e r s ( 1971). Loosely,.. ~ ~ t h e breakdown p o i n t i s t h e l a r g e s t f r a c t i o n of ~ ~ * t h e d a t a t h a t c a n be changed a r b i t r a r i l y w i t h o u t changing t h e e s t i m a t o r beyond a l l bounds. n minl[(t - The r e s i s t a n t l i n e h a s a breakdown v a l u e of 1)/21, [(nR - 1)/21} because we need o n l y a l t e r h a l f of t h e d a t a p o i n t s i n one of t h e o u t e r part&t$ons t o t a k e complete c o n t r o l of t h e slope estimate. If = y4 = nR, t h i s i s a p p r o x i m a t e l y e q u a l t o 116, A P r o b a b i l i s t i c Breakdown Value. kThile a breakdown v a l u e o f 116 i s c e r t a i n l y r e s i s t a n t ( l e a s t s q u a r e s and l e a s t - a b s o l u t e - v a l u e r e g r e s s i o n both have breakdowns of O X ) , i t i s p o o r e r t h a n comparable v a l u e s f o r Brown-Mood ( 2 5 % ) , Theil-Sen r e g r e s s i o n ( 2 9 % ) , o r r e p e a t e d median r e g r e s s i o n (50%). However, breakdown c o n c e n t r a t e s on worst-case behavior, I n r e a l i t y , not a l l s u b s e t s of s i z e 116 a r e e q u a l l y dangerous f o r t h e r e s i s t a n t l i n e ; f o r t h e l i n e If we r e s t r i c t t o b r e a k down, a l l bad p o i n t s must l i e i n t h e same o u t e r t h i r d , a t t e n t i o n t o g r o s s e r r o r s i n y ( s o t h a t t h e x-values a r e held f i x e d ) , and assume t h a t extreme y-values o c c u r i n d e p e n d e n t l y w i t h p r o b a b i l r t y a , we can c a l c u l a t e a " p r o b a b i l i s t i c breakdown value1' (c. f . Donoho and Hu6er , 1982, T h i s g i v e s a p l a u s i b l e r a t h e r t h a n worst-case available. For t h e r e s i s t a n t l i n e y v a l u e s i n t h e l e f t ( r i g h t ) groups. 1 - '1-a (resp. breakdown [ (nR-1) /2]) extreme For convenience t a k e nL = n R = 2k t h e n t h e breakdown p r o b a b i l i t y becomes 1 - Pr{Bin(2k+l,a) + 5 kI2= ~ k + l , k + l ) ~wbere , I (.p ,q) i s P e a r s o n r s i n c o m p l e t e B e t a f u n c t i o n . X T a b l e 1 g i v e s some v a l u e s f o r t h e breakdown p r o b a b i l i t y under t h e above "random g r o s s e r r o r modelk', . bound on t h e p r o t e c t i o n ( f n c l u d i n g grown-Mood), o c c u r s i f t h e r e are more t h a n ( - 1 2 S e c t i o n 5.1) For t h e r e s i s t a n t l i n e , nL = nR = n / 3 i s used, w h i l e f o r Brown-Mood r e g r e s s i o n , w e t a k e nL = n R = [ n / 2 ] , which y i e l d s a l a r g e r v a l u e of k and hence lower breakdown p r o b a b i l i t i e s . 1; ~ For comparison, r e s u l t s f o r r e p e a t e d median g,nd TBe$l*en g r e s s i o n are g i v e n . rem I n t h e s e c a s e s , t h e p o s s i b i l i t y of breakdown i s a f f e c t e d o n l y by t h e s i z e of t h e s u b s e t drawn, s o t h a t t h e breakdown proba b i l i t y i s g i v e n by a s i n g l e b i n o m i a l p r o b a b i l i t y . T a b l e 1 shows f o r example t h a t f o r sample s i z e 27, i n 90% of c a s e s t h e r e s i s t a n t l i n e c a n t o l e r a t e a b o u t 25% c o n t a m i n a t i o n w i t h o u t l i e r s (and t h i s s t i l l assumes t h a t a l l o u t l i e r s r e i n f o r c e each o t h e r ) . For p r a c t i c a l a p p l i c a t i o n s , a v a l u e n e a r 25% seems a p p r o p r i a t e as a g u i d e i n d e t e r m i n i n g t h e a p p l i c a b i l i t y of t h e r e s i s t a n t l i n e method. Note t h a t the Theil-Sen method i s much more s e n s i t i v e t o a - v a l u e s c l o s e t o and e x c e e d i n g i t s nominal breakaown p o i n t . 1. I l l u s t r a t i v e Breakdown P r o b a b i l i t i e s a = P r o b a b i l i t y of Gross E r r o r method .20 ,25 .33 Resistant l i n e .04 .lo .26 Brown-Mood .014 .05 .19 Repeated Median .0002 .003 .04 Theil-Sen .07 .25 .56 Resistant l i n e .002 .013 .10 Brawn-Mood ,0002 .003 .05 n=2 7 n=6 3 Repeated Median Theil-Sen 0 0 .002 .O2 .14 .37 2.5 R e s i s t a n t L i n e and Repeated Median R e g r e s s i o n The r e p e a t e d median r e g r e s s i o n e s t i m a t e of S i e g e 1 ( 1 9 8 2 ) , bm = rned rned b i j f i ij pays f o r i t s o p t i m a l g r o s s e r r o r breakdown of n e a r l y 50% w i t h a comput a t E o n a l complexity o f 0 ( n 2 ) o p e r a t i o n s . r e p e a t e d median by computing b I n t e r e s t i n g l y , modifying t h e only f o r x ij o L and x i j E R is equivalent t o t h e O(n) r e s i s t a n t l i n e e s t i m a t o r . P r o p o s i t i o n 2.1. If and n a r e b o t h odd, t h e n R bRL = rned rned b Rr' rER R E L lo. xn t h e d u a l plot, l e t tlre i ? n t e r s e c e l o n o f er(b] = y r x b (X O R ) w i t h Proof - r TL(b) occur a t br. T (b ) = med eR(br) L r R EL From t h e d e f i n i t i o n T r and t h e relation i t follows (since 2'. - Since b~~ n L i s odd) t h a t br = rned bRr. R EL satisfies med e r ( bRL ) = TL ( bRL ) by d e f i n i t i o n , i t i s rER enough t o n o t e t h a t Remark. denote I n a n o r d e r e d set x n c l u s i o n of by lomed xi x and < x < 1- 2 x n+l *.. -< x 2n by himed 0 1 i s weakened t o br ( h i m e d bgr, xi. of even c a r d i n a l i t y , we If i s even, t h e con- and i f b o t h and R EL a r e even ( f o r example) t h e c o n c l u s i o n of t h e p r o p o s i t i o n i s j u s t t h a t lomad lomed bRr r € R EEL 2 bRL < himed himed bgr O R QoL . n R 53. CONFIDENCE INTERVALS FOR 6 Non-parametric confidence i n t e r v a l s f o r t h e s l o p e f3 i n a r e g r e s s i o n model may be d e r i v e d .by an analogue of t h e method t h a t g i v e s i n t e r v a l e s t imates f o r t h e p o p u l a t i o n median. the E Consider Model I of 52.3 and assume t h a t have a continuous d i s t r i b u t i o n ; t h e r e s u l t s extend t o t h e random i c a r r i e r s model a s before. a) - ei(f3) = yi Write i n the dual plot. l e f t group, group. - {yi Let e xiB, Assume t h a t xi@ f o r t h e r e s i d u a l of % L (8) (r) denote t h e r t h L}, and d e f i n e i E = nR = m e R (8) (s) f3 at smallest residual i n the similarly i n the r i g h t (say) f o r convenience: t h e r e i s no d i f f i c u l t y i n extending t h e t h e o r y o r computations t o unequal A s a convention, s e t s = m + 1- r. Let Br (ignoring % and "Re be t h e unique s o l u t i o n of h Existence and uniqueness of 8 follow from t h e d u a l p l o t a s b e f o r e . r i s odd then B m Of course, i f [(&1)/2] - @E, and, w i t h an a p p r o p r i a t e choice of i n t e r c e p t a , f i n d i n g Br amounts:to l o c a t i n g t h e l i n e i n (x,y) space such t h a t C l e a r l y Bl 2 B2 2 . . . -> Bm, and we now show t h a t p a i r s (Bs. Br) y i e l d nonparametric confidence i n t e r v a l s f o r B. (r < s ) g(B) = e = a d a t a p o i n t s i n t h e l e f t group l i e on o r below t h e r p o i n t s i n t h e r i g h t group l i e on o r above t h e l i n e . l i n e and Yi r R + (s) (B) Bx i L - e(,)(B) + Ei The f u n c t i o n i s s t r i c t l y d e c r e a s i n g , and hence i f t h e model obtains, - ... -< E (m) L L < E(l) where L E 9 (r) < ... -< E (m) a r e t h e ordered E i m The q u a n t i t y U = number of v a l u e s i n r i g h t group r i n t h e o u t e r groups. which exceed R and i s a n exceedance s t a t i s t i c and h a s p r o b a b i l i t y m a s s function as may be s e e n from a s i m p l e c o r n b i n a t o r i a l argument. pB{B > 8,) probability < r) also. Thus t h e i n t e r v a l 1- 2 P(U'< r). r E p s t e i n (1954) f o r s = m and by a symmetrical argument ( s i n c e = P Q I < ~ r), P {B > f3} = P(U: B s Now [Bs, Br] covers + 1B with The l a t t e r e x p r e s s i o n h a s been t a b u l a t e d by m = 2(1)15 (5)20, r < m, and more e x t e n s i v e l y by B e c h t e l (1982). Remarks : 1) F i n i t e n e s s of t h e s e t confidence l e v e l s . 2) {B1,. ..,Bm 1 l i m i t s t h e number of p o s s i b l e This can be a n o n - t r i v i a l r e s t r i c t i o n i f m i s small. The c o n f i d e n c e i n t e r v a l above c a n b e i n v e r t e d i n t h e u s u a l way t o g i v e a t e s t of Ho: f3 = Bo against general alternatives. The t e s t i s based on t h e number of d a t a p o i n t s i n t h e l e f t group t h a t l i e on o r above t h e l i n e with slope PO l i e on o r above i t . This t e s t is d i s c u s s e d i n t h e case and i n t e r c e p t chosen s o t h a t m points i n L = 0 , u s i n g nor- mal approximations, by Hogg (1975), a l o n g w i t h e x t e n s i o n s t o p e r c e n t i l e regression. UR r), I f i n s t e a d of Models I o r 11, i t i s assumed t h a t g i v e n 3) is the (LOO r / m ) t h p e r c e n t i l e of t h e Asymptotic approximation f3 + (3x Y d i s t r i b u t i o n , t h e n a n e s t i m a t e of t h e kind c o n s i d e r e d by Hogg (1975) i s o b t a i n e d (with be z e r o ) by s o l v i n g f o r x,a n~ not restricted t o i n anologue of ( 3 . 1 ) : for The e x p r e s s i o n (3.2) i s a n e g a t i v e P (um < r ) r hypergeometric p r o b a b i l i t y and as such nay b e thought o f , f n terms of f a 2 r c o i n - t o s s f n g a s P[X = +Y X ~ X = m] where X and are Y independent n e g a t i v e binomial v a r i a b l e s r e c o r d i n g t h e number of t a i l s be(m fore the -r + l ) a f and r t h heads r e s p e c t i v e l y . = 1 i f heads, -1 IS2m = 0 9 '2m+1 r one s e t s where = 0 W (t) (m i f t a i l s , with = 1) + 2 -x and &)/2, m = :1 S {X < r ) = {(s, Xi, + m)/2 Switching t o +Y we have {X - + 11, > m - r i.i.d. = m) = so t h a t i f then i s a s t a n d a r d Brownian Bridge on [0,1] ( B i l l i n g s l e y , 1968). A,lessrevealingdemonstration c a n b e based d i r e c t l y on t h e n e g a t i v e hypergeometric p r o b a b i l i t i e s u s i n g t h e ( l o c a l ) normal approximation g i v e n i n F e l l e r (1968, -- p. 1 9 4 ) . The approximate c o n f i d e n c e i n t e r v a l s t h a t r e s u l t have c o v e r a g e p r o b a b i l i t i e s c l o s e t o t h e nominal l e v e l even f o r moderate m. The s t a n d a r d c o n t i n u i t y c o r r e c t i o n ( r e p l a c e x by s l / &i n e q u a t i o n 3 , 3 ) y i e l d s t h e formula f o r a two-sided approximate 1 0 0 ( 1 example, if m = 20 (fi13, - 2a)% confidence i n t e r v a l , Far and a = - 0 5 , t h e n r = 7.9, W M Crounds ~ t o 8, B 8) h a s e x a c t c w e r a g e p r o b a b i l i t y 1 - 2P ( IJi0- 7) < The i n t e r v a l = .89 (Epstein B e c h t e l (1982) h a s shown t h a t v a l u e s of rc from e q u a t i o n 3.4 1954). round t o t h e e x a c t v a l u e s even a t the s m a l l e s t a p p l i c a b l e sample s i z e s . When r C i s n o t an i n t e g e r , w e can i n t e r p o l a t e f o r t h e edges of t h e i n t e r v a l . n I n t e r p o l a t i o n of logP(Ur < r be s a t i s f a c t o r y . - 1 ) vs r / n a p p e a r s (from d a t a a n a l y s i s ) t o This can overcome t h e r e s t r i c t e d c h o i c e of a v a l u e s imposed by i n t e g e r r. (See remark 1 of t h e p r e v i o u s s e c t i o n . ) t h e coverage p r o b a b i l i t y i s no l o n g e r e x a c t l y d i s t r i b u t i o n - f r e e . Of c o u r s e , 54. ASYMPTOTIC PROPERTIES OF THE ~ESISTANTLINE To s t u d y a s y m p t o t i c p r o p e r t i e s , we extend t h e d e f i n i t i o n of t h e r e s i s t a n t l i n e s l o p e estimate from b i v a r i a t e s c a t t e r p l o t s t o a r b i t r a r y H bivariate distributions Let xL 2 xR b e lower P~ of random v a r i a b l e s t h and upper d i s t r i b u t i o n of X such t h a t assume t h a t e i t h e r m e d ( ~ X 1 Assume t h a t 0 < pL pL -- 5 1- 5 P~ (X,Y) with values i n th p e r c e n t i l e s of t h e m a r g i n a l * pL = P(X 5 xL) 3)< o r m e d ( ~ 2( ~ %) > xR o r b o t h . and PR = P(X XI() and pR < 1, and t h a t t h e p a r t i t i o n i n g i s symmetric: pR, a l t h o u g h t h i s i s done f o r n o t a t i o n a l convenience only. The lower and upper medians of a d i s t r i b u t i o n f u n c t i o n G a r e d e f i n e d by lomed G = sup(x:G(x) t a k e med G = (lomed G functional B WI (aas < 1 / 2 ) and himed G = i n f (x:G(x) + himed > 112) , and we Define t h e r e s i s t a n t l i n e s l o p e G)/2. the c r ~ s s i n gpoint of z e r o of th.e f u n c t i o n E x i s t e n c e and u n i q u e n e s s a r e c l e a r because t h e s l o p e of l e s s than - (xR-xL) and h(f3) If3 + t[med (XI x < xL) - med h(f3) (XI i s always x 2 xR)] as f 3 + +- w . 4.1 Consistency F i s h e r c o n s i s t e n c y of t h e form Indeed, Y = a + BX + E, (3 RL (11) where h o l d s f o r l i n e a r r e g r e s s i o n models of I med (E X) = med E almost s u r e l y . rned(Y-f3xlx > xR) = m e d ( a + ~ ( x> xR) = m e d ( a + e ) = r n e d ( a + & l l < xL)= med(Y-BXIX~xL) so t h a t BRL(H) = 6. I n l o c a t i o n problems t h e median of i . i . d . i f f lomed G = himed C. samples from G is consistent To d e s c r i b e t h e analogous s t r o n g c o n s i s t e n c y -17p r o p e r t i e s of BRL(H), we i n t r o d u c e , h*($) = h i m e d ( Y - @ I ~ > x ) - l b m e d ( ~ -BXIX < xL) - R B X ~ X> xR) - hk(B) = lome&Y- h a ( @ ) 2 h(f3) Clearly 2 h,(B), { - - 1 ) ( - 0 ) h,(B) = - ( 1 0 ) ( 1 , ) B* Z B , and = 1, B Hn 8, B*, BRL, B* of RL Then h*(@) = 2 = 112, 8, B (9 ) = n he> 28, = 1- 2B, For samples H of s i z e n n = 9. IZL - O o r 1. b e a sequence of b i v a r i a t e d i s t r i b u t i o n s w i t h c o r r e s p o n d i n g partition points estimators . xL) BRL 2 &,. B* 2 from H i t i s c l e a r l y p o s s i b l e t o have Let 5 p l a c e s e q u a l m a s s on t h e f o u r p o i n t s H Suppose t h a t PIX and t h e u n i q u e z e r o s the respective functions s a t i s f y Example 4.1 himed(Y- and x < x L,n (n) probabilities RYny P~,ny 'RYn H n The n o t a t i o n s 3 H and r e s i s t a n t l i n e L(X ) n and + P(X) d e n o t e convergence i n d i s t r i b u t i o n of d i s t r i b u t i o n f u n c t i o n s and random variables respectively. P r o p o s i t i o n 4.2 P ~ + Suppose t h a t PLY , ~ PRYn+ PR . 8 Hn + H, and t h a t Then Proof. We prove t h a t B, -c lim check, f o r example, t h a t L ,& h p ) t h i s implies t h a t ultimately - 8 L n )i l i m B * ( ~ )i B*. (i)2 h*(b) < b. > 0 It i s enough t o f o r any b < BAY for Appendix 1 shows t h a t g(Y - - b ~x l-> xR,nyHn ) with t h e corresponding v e r s i o n f o r x L,n' -t - X ( X> xR,H) -~ r(Y together - The r e s u l t i s t h u s a consequence of t h e lower (upper) s e m i - c o n t i n u i t y of t h e f u n c t i o n lamed(.) (resp. h i n e d ( - ) ) f o r weak convergence of d i s t r i b u t i o n . Remark 1. BRL (H) Consequently, i f dy Hn -+ H i s u n i q u e l y d e f i n e d and , (which o c c u r s w i t h p r o b a b i l i t y one i f Hn i s t h e e m p i r i c a l measure of i . i . d . then o b s e r v a t i o n s from H) 6RL (Hn ) ,-t BRL (H) . I n particular, i f il = {H: r r e g r e s s i o n models t h e unique median of Y = a L,n and x Remark 2. R, n B0X l i e s i n t h e f a m i l y of l i n e a r + E; independent) and e i t h e r 0 i s E,X o r X h a s a n everywhere p o s i t i v e d e n s i t y , t h e n E, is strongly consistent for on x + 0 H 0 B Note t h a t t h e c o n d i t i o n s permit c h o i c e of t h e p a r t i t i o n p o i n t s based on t h e sample i t s e l f . A s w i t h Wald's e s t i m a t o r , t h e r e s i s t a n t l i n e may b e f i t t e d t o any b i v a r i a t e s c a t t e r p l o t , i n c l u d i n g t h o s e i n which t h e d a t a a r e believed t o follow an errors-in-variables model. I s s u e s of i d e n t i f i - a b i l i t y and c o n s i s t e n c y a r i s e j u s t a s f o r ~ a l d ' se s t i m a t o r . We mention w i t h o u t proof o n l y one such r e s u l t i n t h e s p i r i t of Neyman and S c o t t L e t (Xi, (1951, Theorem 2 ) . s t r u c t u r a l model X = U and + 0, i = 1, Yi), Y = BU + E ..., n where U , 0 and € h a v e smooth p o s i t i v e d e n s i t i e s , . t h e s u p p o r t of 0 b e t h e i n t e r v a l R = {Xi; X. 1 RL - xR} only i f P(xL-v ( U . Then < xL - [u, b e a sample from t h e and 11, E a r e independent, L e t t h e convex h u l l of v ] , and L = X ; Xi 2 xL}, BRL i s a c o n s i s t e n t e s t i m a t o r of B if and p) = P(xR -v < U I xR - p) = 0. - 194.2 Influence Function, S e v e r a l a u t h o r s (Mallows 1973, Huber 1973, Velleman and Ypelaar 1980, Krasker and Welsch 1982) have n o t e d t h e importance of bounding t h e i n f l u e n c e f u n c t i o n of a r e g r e s s i o n e s t i mator i n b o t h y and x. The r e s i s t a n t l i n e i n f l u e n c e f u n c t i o n i s s o T h e r e f o r e , i n c o n t r a s t t o , s a y L1 r e g r e s s i o n , bounded. is n o t unduly a f f e c t e d by a d a t a p o i n t w i t h an extreme x-value. Let p r o b a b i l i t y mass a t + H6 = ( 1 - 6 ) H (x,y). 6~ where ( X , Y1' The i n f l u e n c e f u n c t i o n of e (x,Y> BRL is a point i s d e f i n e d by - IC(BRL,H, ( x , ~ ) )= l i m IBRL(HF,' BRL(H)l/6. Suppose t h a t -- +O-- 6 = IH: Y = a + BX + E, f o r some a, B e E and X independent 1, and .x, assume t h a t E]x 1 -. has a density E g p o s i t i v e a t i t s median, 0, and t h a t I x 5 x.) and . I = E ( X x 2 xR) Then Appendix R 2 shows t h a t under mild smoothness c o n d i t i o n s on t h e d i s t r i b u t i o n of X, < Let vL = E (X 11 for qO(x) where median and c = H I{X > 0) L < x < x R , i s t h e i n f l u e n c e f u n c t i o n of t h e - pL) 1-1. (4.1) a c c o r d s w i t h i n t u i t i o n from d u a l The p o s i t i o n of t h e p o i n t plots. (BO,y- Box) x I{X < 0) = [ 2g (0) pL (pR Qualitatively, if - x 2 xR) with r e s p e c t t o the appropriate ( l e f t i f BO median t r a c e e v a l u a t e d a t x < xL, right d e t e r m i n e s t h e d i r e c t i o n of 1. change i n s l o p e . The a n a l o g y between t h i s influence function and t h a t of t h e median i n l o c a t i o n problems i s pursued f u r t h e r i n S e c t i o n 5. For a n a r b i t r a r y b i v a r i a t e d i s t r i b u t i o n f u n c t i o n remains t h e same e x c e p t t h a t Appendix 2 ) according a s x > xR or cH H, t h e form of t h e i n f l u e n c e t a k e s on two v a l u e s ( g i v e n i n x < xL' 4.3 Asymptotic Normality. assertsthqt i f sample H n (Xi,Yi) A s t a n d a r d h e u r i s t i c argument (e.g. d e n o t e s t h e e m p i r i c a l d i s t r i b u t i o n f u n c t i o n of a of s i z e n from 2 . If H [ BRL (Hn) A s Var E - H, then t h e s t a t i s t i c %L (H) a s y m p t o t i c a l l y normal w i t h mean EH[1C(hL,H, (X,Y))] )?, 6RL (Hn ) is and a s y m p t o t i c v a r i a n c e g i v e n by t h e n we u s e ( 4 . 1 ) t o o b t a i n BRL (H) ] = 1 - 2pLg (0) (vR-vL) We Huber, 1981) 2 s t a t e a v e r s i o n of t h e a s y m p t o t i c n o r m a l i t y theorem f o r t h e r e d s t a n t l i n e s l o p e e s t i m a t o r , proved i n Appendix 3 f o r t h e more g e n e r a l s e t t i n g of S e c t i o n 5 , Theorem 4.3. H E Jlr. Suppose t h a t Suppose a l s o t h a t (XiYi) i = 1 ~1x1 E P d e n s i t y a t i t s median, 0 , convenience o n l y ) Remarks. 1) 7' If p - L - PR* l %In -+ i s an i . i . d . sample from has a continuous p o s i t i v e pL , %/n P -* pR, and ( f o r Then pL#P(X<xL) forsome ~ - ~ ( ~ )where d ~ , F-L (p) x L' then p L isinterpreted d e n o t e s t h e l e f t c o n t i n u o u s i n v e r s e of as p~ F, t h e m a r g i n a l d i s t r i b u t i o n of 2) A s i m i l a r r e s u l t i s o b t a i n e d by K i l d e a (1981) i n t h e c o n t e x t of weighted medians when t h e 3) X i X. F u r t h e r d e t a i l s a r e g i v e n i n Appendix3, a r e c o n s i d e r e d a s f i x e d r a t h e r t h a n random. I f a h e t e r o s c e d a s t i c model of t h e form where X and E Y = a +B X + E/(T(X) obtains, a r e s t i l l assumed independent, t h e n t h e argument of Appendix 3 e x t e n d s ( w i t h a p p r o p r i a t e m o d i f i c a t i o n s of t h e r e g u l a r i t y c o n d i t i o n s ) t o show t h a t - i s a s y x p t o t i c a l l y c e n t e r e d Gaussian w i t h v a r i a n c e (BRL-f3 ) / 5 xL) = E(Xo(X) X I~L 4) 6 and .rR, fiR a r e d e f i n e d c o r r e s p o n d i n g l y , I n s p e c t i o n of t h e proof r e v e a l s t h a t of a l l c h o i c e s of s e t s (with L < R) such t h a t I+, of t h e s m a l l e s t %In -t p and l a r g e s t and L nR nR/n +- pR, L and R t h e s e t s composed a r e o p t i m a l ( i n t h e s e n s e of X.'s 1 minimizing a s y m p t o t i c v a r i a n c e ) . In the special case that L = {xi:Xi a d i r e c t proof i s e a s y . equivalent t o 5 xL} and > The e v e n t = med ( ~ ~ - c n - ' / ~ ~>. med ) ( E .-cn-l/*Xi) {Z The s p e c i a l form of 1 i&R n~ L a l l o w s Wn = W 1 EL > xR 1 , - R = {x.:Xi 1 Z C} is 1. t o b e approximated t o L order o (n+) P by t h e median t h e s e being regarded as 2(xL) = $,(XIx < xL) . W of m = [npL] m i , i . d c o p i e s of E - c n i.i.d. variables N(-c(liL,yR) , 14 " xL, S i m i l a r l y one o b t a i n s corresponding t o t h e r i g h t group. m ' 2 (011 -1 1 2 ) . Thus ' E i - c n14 Xi, where independent of 'm' Standard a s y m p t o t i c t h e o r y f o r medians of e x t e n d s t o show t h a t g v a l u e s of mw,,zm) p(vmRL - ?!. (W,Z) T, Bo) > c ) = P ( Z > W) + ~ ( l ) and t h i s l e a d s t o t h e claimed c o n c l u s i o n . G e n e r a l l y , however, t h e groups t o fixed values r, and x R' L and R a r e n o t chosen w i t h r e f e r e n c e b u t r a t h e r u s i n g sample q u a n t i l e s . To cover t h e s e c a s e s i t i s c o n v e n i e n t t o u s e t h e Berry-Esseen method of Appendix 3. - 4.4 Asymptotic Efficiency Given the marginal distribution F (and from symmetry xR L If E x2 as + - PRY then V < R - If F is symmetric about pL12. and it suffices to maximize R - - p ~ then the Cauchy-Schwartz inequality shows that h( However, if the tails of then h(x) f vary as fails to converge to zero as x lowing (4.3) .) "L also) can be determined to minimize asyptotic variance by maximizing p (p 0 and p of X, the partition point When E x2 < and += -rn. IxI "L) -+ 0 -a for a < 2, (cf. the remarks fol- F has a density f , any critical point xo<(-my 0) of (4.2) satisfies p ( x ) L 0 condition for x to maximize (4.2) is that 0 = 2x and a sufficient 0' f be unimodal and Specific cases were considered by Nair and Shrivastava, Bartlett, Barton and Casley, and Gibson and Jowett, who obtained 2&[F(xL) (v R - pL)2]-1 as the variance of the Wald estimator (1.1) under normal theory assumptions. We list a couple of these cases and remark that Section 5 shows that asymptotically, the optimal choice of division points depends only on the marginal distribution of X and not on the particular M-estimator applied to the residuals. If X is uniformly distributed on some interval or places equal mass at equally spaced points, then (4.2) is maximized by F(xL) = 113, which lends support to the choice of "thirds" as the partition points. (where $(x) to m(%) = = @'(x)), If F(x) = @(x), then (4.2) becomes +2 (%)/@(xL) which is maximized by 3= -.6121, corresponding .2702, and yields the "27%-rule'' described in the introduction. We now c o n s i d e r t h e a s y m p t o t i c r e l a t i v e e f f i c i e n c y of t h e blr . r e s i s t a n t l i n e e s t i m a t o r f o r models i n T (Xi, n n; i . e . a + BXi + ci) be any sequence b a s e d on samples of r e g r e s s i o n i n v a r i a n t e s t i m a t e s of f3 of s i z e ITn Let ' Tn(Xi, ....(Xn,Yn) (XI, Y1), ci) + B e The l o c a l a s y m p t o t i c minimax r e s u l t s of Hajek (1972, Theorem 4.2) imply t h a t f o r a p p r o p z i a t e l y smooth models i n l i m in£ n-t nE (T a,B n - blr -I f3)2 { ~ aX r E [ ~ ' ( E ) / ~ ( E ) ] ~= I r~2 && and t h a t e q u a l i t y c a n o c c u r o n l y if - &(T, B) ' is asymptotically ~ ( 0 , o i . ~ ) . It i s t h e r e f o r e n a t u r a l t o c o n s i d e r t h e r a t i o (c.f. S e c t i o n 6) o2 /[AsVarBm] = {pL(vR - I . I ~ ) V~ a/r ~XI 14g2 (0) 11 1 , X ,E g where I = E [ ~ ' ( E ) / ~ ( E ) ] 'i s t h e F i s h e r i n f o r m a t i o n of t h e d e n s i t y g . g C l e a r l y (4.3) c a n b e made a r b i t r a r i l y s m a l l by a p p r o p r i a t e c h o i c e of g. (4.3) T h i s s i t u a t i o n c a n , i n p r i n c i p l e , b e remedied by r e p l a c i n g t h e median by a n (adaptive, e f f i c i e n t ) e s t i m a t e of l o c a t i o n ( c . f . S e c t i o n 5 ) . depending on t h e X d i s t r i b u t i o n c a n a l s o b e made a r b i t r a r i l y s m a l l by t a k i n g x -(3+E) d e n s i t i e s f o r X w i t h t a i l s of o r d e r 1 -a fX(x) = -(a-l)x I{ x > 11 , t h e n I 1 2 3 - 1 ) 2 -2 However t h e r a t i o /4. sup 3 ' p o r L (u E > 0. For example, i f -I.I ) 2 / (2 Var X) R L = O Thus t h e r e c a n b e a l o s s of e f f i c i e n c y when t h e c a r r i e r h a s a d i s t r i b u t i o n w i t h v e r y heavy t a i l s . I n t h e f a m i l i a r c a s e i n which squares estimate e f f i c i e n t , with are i . i . d . E N(0,1), the least x) yI/L (xi - x12 i s a s y m p t o t i c a l l y A s V a r B /AsvarBRL p - ~ . ) ~ / ( n x). ~ a r If X is LS L R 6 BLS = E (xi - = (11 1 uniform, o r e q u i s p a c e d , t h e n t h e Gaussian e f f i c i e n c y of t h e r e s i s t a n t l i n e (with pL = 1 / 3 ) i s 56.6% w h i l e i f X i s N(0,1), and t h e e f f i c i e n c y becomes 50.5% ( r i s i n g t o 51.6% i f p L = ,27). p L = 113, W e note fQn t h a t e f f i c i e n c f e s of g T e a t e r 50% are conaidered qdequate f o r e x p l o r a t o r y d a t a a n a l y t i c methods. The c o r r e s p o n d i n g e f f i c i e n c i e s f o r Brown-Mood r e g r e s s i o n (pL = 112) 40.5% respectively, a r e 4 7 8 % and s o t h a t n e g l e c t i n g a n a p p r o p r i a t e middle group of o b s e r v a t i o n s i n c r e a s e s e f f i c i e n c y by a f a c t o r of a b o u t o n e - f i f t h . 55. THE CLASS OF $-RESISTANT LINES A g e n e r a l c l a s s of r e g r e s s i o n e s t i m a t e s t h a t i n c l u d e s t h e r e s i s t a n t l i n e , Wald-Nair-Shrivastava-Bartlett, and Brown-Mood methods c a n be d e f i n e d by r e p l a c i n g t h e medians a p p l i e d t o t h e r e s i d u a l s i n t h e d e f i n i t i o n of t h e r e s i s t a n t l i n e w i t h a n M-estimator of l o c a t i o n . This c l a s s is i n t r o - duced because of t h e s u b s t a n t i a l improvements i n e f f i c i e n c y t h a t c a n be achieved w i t h o u t s i g n i f i c a n t l y a f f e c t i n g breakdown o r ( i n some c a s e s ) com- A s i m i l a r t r e a t m e n t c a n i n p r i n c i p l e b e based on p u t a t i o n a l expense, L- o r R- e s t i m a t o r s . Let Tn(Zi):= Tn(Z1,.,.,Z monotone i n f l u e n c e f u n c t i o n redescending ...,n i = 1, t o those largest 'i 'i I)). n $ b e a n M-estimator of l o c a t i o n w i t h ) (We make some remarks below concerning A s b e f o r e , suppose t h a t b i v a r i a t e d a t a a r e a v a i l a b l e . Let with indices respectively. i T (Z.) L 1 and TR(Zi) B 9 T denote corresponding t o t h e n Define (xi,yi) L , applied s m a l l e s t and n = B + ( T ~ ) a s t h e z e r o of t h e function The monotonicity of that h $ and t r a n s l a t i o n i n v a r i a n c e of T e n s u r e i s monotone d e c r e a s i n g w i t h s l o p e a t most x (nL) - x (n-nR+l) ' R -25- which assures the existence of a unique solution to (5.1). Locating the zero of such a function is possible using (say) the ZERION algorithm as described in Section 2 and thus requires the same order of computation effort as computation of T. The gross error breakdown value of B is one third of that of the corresponding M-estimator if Remark 2 = nR $ = "13. In practice, one would have to estimate the scale of the residuals , also, and (5.1) might for example then be replaced by a T~ i'( = TR 5'( bxi) h where a is a robust estimate of scale which, In homoscedasric situatiom n can be based on all the data, In the presence of heteroscedastic errors a * could be replaced by o and a estimated separately in the two R L groups. 4 To define the $-resistant line as a functional on theoretical distributions H of the pair (X,Y), we proceed as follows. Let xL < xR X, and be lower pth and upper pth R percentiles of the distribution of L write F = F (H) and F L,b L,b R,b = F (H) for the respective distribution R,b functionsof Y - b X ( X Z 3 a n d Y -bXIX,%. Let t(F) be the M-estimator of location obtained from a monotone increasing function $(x) by solving X(F,t) = /$(x - t) dF(x) = 0. The $-resistant line estimator b (H) is defined as the (unique, since $ is monotone) solution '4 where TR(Y - i s a mnemonic f o r bX) The c o n s i s t e n c y of analogous t o t h a t of ,6 t(FR,b), e t c . can be discussed i n a f a s h i o n e x a c t l y .$,f , To r e p l a c e M m e d F and lomed F, we u s e The s t a t e m e n t and proof of P r o p o s i t i o n 4.2 remain v a l i d ( w i t h t h e obvious changes) under t h e e x t r a striction that b e bounded. T h i s f a c i l i t a t e s t h e proof t h a t * and T (F) are semi c o n t i n u o u s . In particular, i f T(E) i s bounded, t h e If bounded i n f l u e n c e i n b o t h line. Let H 6 +E 'pi = (1 - 6)H with + and 6 ~ X and { YY y g with ~ ( X ( E ) )= 0 , and Thus t h e i n f l u e n c e f u n c t i o n i s ( f o r 60 . l i n e r e t a i n s t h e p r o p e r t y of p o s s e s s e d by t h e o r d i n a r y r e s i s t a n t ~Under a p p r o p r i a t e r e g u l a r i t y con- d i t i o n s , t h e argument i n Appendix 2 shows t h a t i f density independent and E is strongly consistent f o r $-resistant x T*(F) (See Appendix 1.) Y = aO + DO X uniquely defined, then technical re- E(x(< x H E Hr and E has then values i n the outer thirds), a c o n s t a n t m u l t i p l e of t h e i n f l u e n c e f u n c t i o n of t h e l o c a t i o n e s t i m a t e T. -27- I f the observations (Xi,Yi) then asymptotic normality of B Q come from a model H C can b e $, X , E e s t a b l i s h e d under f a i r l y g e n e r a l c o n d i t i o n s on r u l e . Let denote t h e asymptotic v a r i a n c e of VT LC€& (5.4) B /'J B - 6 )1 + 4(t)' = E$(E-t) 1 h (t) = O ( l t ) . $ as t + a. t = 0 ?L. n~ ( ) n nR) $(E-t) have t h i r d moments, and e i t h e r EX 2 <. The hypotheses cover t h e extremes i s t h e median o r t h e mean ( a t l e a s t f o r most (resp. { T ~ } , Then a r e d e t a i l e d i n Appendix 3 . be monotone, be d i f f e r e n t i a b l e a t and t h e p a r t i t i o n i n g cau (0, 2p~1(uR-p,)2.~T) The r e g u l a r i t y . c o n d i t i o n s and proof E s s e n t i a l l y we r e q u i r e t h a t ~ E). or QJ i n which Tn The number o f ' p o i n t s Z i n t h e l e f t and r i g h t groups need only t o be chosen so t h a t pL(pR). The proof c o n d i t i o n s on t h e observed X's and u s e s a Berry-Esseen theorem t o o b t a i n a uniform normal approximation. I t ' i s apparent from (5.4) t h a t t h e asymptotic r e l a t i v e e f f i c i e n c y of two $-resistant l i n e s i s t h e r e l a t i v e e f f i c i e n c y of t h e two corresponding l o c a t i o n e s t i m a t o r s , s o t h a t t h e c h o i c e of d i s t r i b u t i o n of' E, $(x) = ~ ~ 1 < x k) 1 i f known (approximately) + could be t a i l o r e d t o t h e . For example, use of k s i g n ~ ( 1 x 1> k ) (which i s minimax over contaminatton neighborhoods of t h e Gaussian d i s t r i b u t i o n (Huber 1964)) w i t h k = 1 . 5 r a i s e s t h e A.R. E. w i t h r e s p e c t t o l e a s t s q u a r e s ( f o r uniform X and Gaussian Y) from 57% t o 85%, w b E l e r e t a i n i n g good breakdown -28- Redescending $ The use of non-monotone ("redescending") $-functions has been advocated to afford greater protection from outliers. However, solutions are no longer guaranteed to exist or be unique and care is required in estimating the scale, 6 (c.f. 5.2). Without unimodality assumptions on the error distribution, the resulting estimators need not even be consistent (Diaconis and Freedman, 1982). perform well in practice. Nevertheless, such estimators We have included a $-resistant line using a one-step biweight (see, e.g. Mosteller and Tukey,1977) in the experiment discussed in Section 6. A heuristic argument (outlined in Appendix 4) shows, under regularity conditions ensuring asymptotic uniqueness of the solution, that the asymptotic normality result: (5.4) is preserved for a wide class of non-monotone $-functions. Further, the Gaussian efficiency of the biweight is close to that of the optimal "hyperbolic tangent" redescending estimators ( Hampel, Rousseeuw, Rochetti; 1981). -29- 56, SMALL SAMPLE EFFICIENCIES We s t u d i e d small-sample p r o p e r t i e s of t e n under t h e s t a n d a r d l i n e a r model assumptions. a Monte C a r l o experiment. s i m p l e r e g r e s s i o n methods Hr ( s e c t i o n 4.2) , using The experiment d e s i g n was a complete c r o s s i n g of t h r e e sample s i z e s ( n = 1 0 , 22, and 40) by two x c o n f i g u r a t i o n s (fixed equispaced and random Gaussian ( 0 , l ) by t h r e e e r r o r d i s t r i b u t i o n s ( ~ a u s s i a n( 0 , 1 ) , a m i x t u r e of 90% Gau(0,l) and 10% Gau(0,9), and " s l a s h " ) . The Gaussian m i x t u r e i s one f o r which t h e mean and median have approximately e q u a l a s y p t o t i c v a r i a n c e s . as a unit The " s l a s h " d i s t r i b u t i o n , g e n e r a t e d Gaussian v a r i a t e d i v i d e d by a Uniform ( 0 , l ) v a r i a t e , h a s i n f i n i t e v a r i a n c e b u t i s n o t as peaked a s t h e Cauchy d e n s i t y . The methods s t u d i e d f a l l i n t o t h r e e c a t e g o r i e s . ( I ) non-resistant (breakdown = 0) methods w i t h computing e f f o r t O(n): L e a s t Squares (LS) , L e a s t Absolute R e s i d u a l (LAR) ( a l g o r i t h m from IMSL, 1 9 8 0 ) , and t h e p a r t i t i o n e d l i n e method of N a i r , S h x i v a s t a v a , and B a r t l e t t (NSB). (11) R e s i s t a n t methods w i t h computing e f f o r t 0(n2) : Repeated Median (RM) and S e n ' s v e r s i o n of t h e g l o b a l median p a i r w i s e s l o p e (SEN). (111) R e s i s t a n t methods w i t h computing e f f o r t O(n) : Brown-Mood r e g r e s s i o n (BM) , Resistant l i n e (RL) , $-Resistant line u s i n g one b i w e i g h t s t e p ( c = 6 ) t o improve each median and t h e g l o b a l median a b s o l u t e d e v i a t i o n from the median (MAD) a s a s c a l e e s t i m a t e , t h e $-Resistant l i n e u s i n g one b i w e i g h t s t e p b u t a l o c a l MAD computed o n l y w i t h i n each p a r t i t i o n (RLBW31 and t h e $-Resistant l i n e using a f u l l y i t e r a t e d Huber $-function and l o c a l s c a l e (RLHU3). partition-based l i n e s (NSB, BM, RL, RLBW, RLBW3, RLHU3) were computed A l l of the -- u s i n g a modified v e r s i o n of t h e code p u b l i s h e d by Velleman and Hoaglin (1981) f o r t h e r e s i s t a n t l i n e . a r e r e p o r t e d i n Appendix 6. T e c h n i c a l d e t a i l s of t h e exper2ment The experiment employed a "Gaussian over indepedent" v a r i a n c e reduction"swindlel'. (See, f o r example, Simon (1976) atid l.lart5n (3976.) . I , or Goodf ellow W e a l s o developed a new swindle f o r t h i s study t o improve performance on non-Gaussian e r r o r d i s t r i b u t i o n s , The swindle i s defined i n Appendix 5 and d i s c u s s e d more e x t e n s i v e l y i n Johnstone and Velleman (1983). The swindles i n c r e a s e t h e e f f e c t i v e number of r e p l i c a t i o n s , t y p i c a l l y by f a c t o r s of 3 t o 8. (That i s , t h e standard e r r o r s of our e f f i c i e n c y e s t i m a t e s a r e a s s m a l l a s t h o s e of a naive s i m u l a t i o n experiment w i t h between 3 and 8 times t h e number of r e p l i c a t i o n s . ) Occasionally ( e s p e c i a l l y f o r t h e more e f f i c i e n t e s t i m a t o r s ) swindle gain f a c t o r s reached 160. Table 2 p r e s e n t s t h e small-sample e f f i c i e n c i e s f o r f i x e d equispaced x. The e f f i c i e n c i e s have been computed r e l a t i v e t o t h e Cramer-Rao lower bound f o r each s i t u a t i o n , s o 100% e f f i c i e n c y w i l l only be a t t a i n a b l e i n t h e Gaussian c a s e here. I n t h i s sense, t h e e f f i c i e n c i e s r e p o r t e d f o r t h e mixed Gaussian and s l a s h e r r o r d i s t r i butions a r e conservative. Few small-sample r e s u l t s have been publ-ished f o r any of t h e s e methods. S i e g e 1 (1982) r e p o r t s e f f i c i e n c i e s f o r t h e repeated median (RM) a t sample s i z e 10 and 20 ( f o r f i x e d x: .69 and , 7 3 ; for Gaussian X, .53 and .61, r e s p e c t i v e l y ) t h a t a g r e e w i t h ours. Except f o r s l a s h d i s t r i b u t e d e r r o r s a t sample s i z e 10 where a l l e s t i m a t o r s p r e d i c t a b l y d i d b a d l y ) , t h e r e s i s t a n t l i n e performs w e l l , s t a y i n g above t h e nominal 50% v a l u e f o r e x p l o r a t o r y methods and u s u a l l y exceeding 60% e f f i c i e n c y . The i n c r e a s e i n e f f i c i e n c y over asymptotic v a l u e s i t e x h i b i t s f o r Gaussian and mixed Gaussian e r r o r s a t s m a l l e r sample s i z e s i s a p l e a s a n t b e n e f i t i n a technique o f t e n computed by hand f o r small d a t a sets. It appears t o d e r i v e from t h e s i m i l a r behavior of t h e . Key: 0.10 91.4 (.2) (.4) (.3) (.8) (.7) 93.C(.3). 72.3 89.2 63.6 40 95.5 89 63.7 - 2, I: R e s u l t s f o r n = 10 Based on 10,000 r e p l i c a t i o n s R e s u l t s f o r n = 22 Based on 5,000 r e p l i c a t i o n s R e s u l t s f o r n = 40 Based on 2,000 r e p l i c a t i o n s 11: 61.2 (.6) 67.2(.5) 22 95.8 62.0 69.7 m (,9) (1.3) 35.1(1,1) 37.0 0 36.2 10 Slash (1.7) 56.0 (.7) 59.4 (.7) 0 60.9 2 3 (Standard e r r o r ) . . 62.5 65.9 0 68.0 40 (1.1) (1.1) (1.1) - 70.4 0 77.3 - Least Absolute Residual Xair-Shrivastava (1942); B a r t l e t t (1949) Repeated Median; S i e g e 1 (1982) Sen (1968); T n e i l (1950) Brown-Mood (1950) R e s i s t a n t l i n e ; Tukey (1971) $-Resistant l i n e , One-step biweight, l o c a l MAD, c = 6 $-Resistant l i n e . Huber $ (e.g. Huber 1981). l o c a l MAD, c $-Resistant l i n e , one-step biweight, g l o b a l MAD, c 6 (.3) (. 7) (.9) SEN: BY : RL: RLBW3 : RLBU3 : RLBW: R1";: LAR: NSB : 92.0 75.3 61.4 68.4 (. 7) A Mixed Gaussian 75.0 (-6) (.4) 81.5(.4) . 89.1 (.3) 67.1 62w8(.6) 62.9(-5) 10 Residual D i s t r i b u t i o n E f f i c i e n c y . R e l a t i v e t o Cramer-Rao Lower Bound Breakdown = OX, Cost = O(n) Breakdown = 29%(Sen), 5 0 % ( p ) , Cost = 0(n2) 111: Breakdown = 25%(BM), 16%(ot hers), Cost O(n) CRLB Variance 87.9(.2) SEN (.4) 73.0 69.7 RM (.7) 89.1 (.I) 89.2 NSB Gaussian 63. 3(. 5) 22 61.9(.4) 10 LAR n = Fixed Equispased x 1.5 56.0 59.4 0 60.9 62.5 65.9 0 63.6 L4(1 Trief ficiency - 31median, which i s 100% e f f i c i e n t ( r e l a t i v e t o t h e mean) i n Gaussian samples of one o r two and d e c l i n e s i n e f f i c i e n c y t o an ARE of 2/a. The r e v e r s e t r e n d f o r s l a s h e r r o r s appears t o be due t o breakdown of t h e median f o r t h e samples of n/3 found i n t h e o u t e r p a r t i t i o n s of s m a l l samples. The r e s i s t a n t l i n e is t h u s a p p r o p r i a t e f o r hand c a l c u l a t i o n on small samples w i t h t h e caveat t h a t d a t a s e t s should be a b i t l a r g e r ( a t l e a s t 20, p r e f e r r a b l y 40) i f v e r y h e a v y t a i l e d e r r o r d i s t r i b u t i o n s a r e suspected. Among t h e e s t i m a t o r s o r d i n a r i l y found only w l t h t h e a i d of a .computer, t h e biweight + - r e s i s t a n t l i n e (RLBW) performs q u i t e w e l l . It has t h e h i g h e s t maximim e f f i c i e n c y ( " t r i e f f i c i e n c y " ) and 40 among methods s t u d i e d h e r e . a t sample s i z e s 22 It c l e a r l y dominates a t moderate sample s i z e s among e s t i m a t o r s w i t h bounded i n f l u e n c e and inexpensive (O(n)) computation o r d e r . Because t h e biweight is a s i n g l e s t e p improvement of t h e medians i n t h e r e s i s t a n t l i n e , t h i s e s t i m a t o r can be computed by hand i f a programmable c a l c u l a t o r is available, For i t s f a v o r a h l e combination of p r o p e r t i e s , we p r e f e r t h i s e s t i m a t o r among t h o s e s t u d i e d h e r e . Ifhen t h e s c a l e e s t i m a t e i s computed only from t h e l o c a l p a r t i t i o n (RLBW3, E H U 3 ) w e can r e l a x t h e assumption of homoscedastic e r r o r s , ' b u t we w i l l pay about 5% i n e f f i c i e n c y i f t h a t assumption i s i n f a c t t r u e . The l e a s t a b s o l u t e r e s i d u a l t h e RLBW. (LAR) l i n e performs s i m i l a r l y t o Its primary weakness is t h a t i t i s n o t r e s i s t a n t t o t h e e f f e c t s of l a r g e f l u c t u a t i , o n s a t extreme x v a l u e s . The method of T h e i l and Sen, and S i e g e l ' s r e p e a t e d median reward t h e 0(n2) computing expense w i t h h i g h e r . Gaussian e f f i c i e n c i e s and ( f o r t h e r e p e a t e d median) h i g h e r breakdown. However, t h e i r e f f i c i e n c y advantage i s eroded a t t h e more s e v e r e e r r o r distributions, and t h e i r computations a r e t o o complex f o r hand c a l c u l a t i o n and t o o e x p e n s i v e t o b e a p p l i e d t o l a r g e d a t a sets o r g e n e r a l i z e d t o more t h a n 3 o r 4 dimensions. The e f f i c i e n c i e s r e p o r t e d i n Table 3 f o r random Gaussian x f o l l o w t h e same g e n e r a l p a t t e r n , b u t are somewhat lower t h a n t h e c o r r e s p o n d i n g fixed-x e f f i c i e n c i e s . For t h e r e s i s t a n t l i n e f a m i l y e s t i m a t o r s , some of t h e l o s s i s due t o t h e s u b o p t i m a l placement of x L and x 57. R a t t h e 33% r a t h e r t h a n t h e 27% p o i n t s i n x. MULTIPLE REGRESSION P a r t i t i o n - b a s e d l i n e s can b e extended t o t h e m u l t i p l e r e g r e s - s i o n model i n a v a r i e t y of ways: no d e f i n i t i v e answers have y e t emerged. Gibson and J o w e t t (1957b) g e n e r a l i z e ~ a r t l e t t ' s (1949) method t o two c a r r i e r s , b u t t h e i r method i s n o t p r a c t i c a l f o r more t h a n two c a r r i e r s , Andrews (1974) proposes b u i l d i n g m u l t i p l e r e g r e s - s i o n s by "sweeping" w i t h a s i m p l e r e g r e s s i o n r e l a t e d t o t h e r e s i s t a n t line, Beaton and Tukey (1974) sweep i n a d i f f e r e n t o r d e r t o b u i l d a m u l t i p l e r e g r e s s i o n from a s i m p l e r e s i s t a n t r e g r e s s i o n . Emerson and Hoaglin (1983b) f o l l o w Andrew's sweeping o r d e r t o b u i l d a m u l t i p l e r e g r e s s i o n from t h e r e s i s t a n t l i n e . +resistant S u b s t i t u t i n g more e f f i c i e n t l i n e s i n t h e s e sweep methods w i l l improve t h e i r e f f i c i e n c y w i t h almost n o . i n c r e a s e i n computing e f f o r t . Random Gaussian X 69.8 .0526 64.6(l.1) 58.5(1.~) 69.2 (.7) 50-8(1.0) 42*4(1.1) 62.3(.8) 60.6 (.7) 37.4(.7) 52.8 (.6) 88.0 84.8 (.7) 64.2(1.0) 61.7(.8) 62.4(1.0) 78'7(.8) (.7) 102.6 (.3) 40 80'1(.6) 62.6 105.0(.3) 20 Gaussian 71.0 71.0 73.4 50.5 40.5 91.2 79.3 63.7 10Q m 60.3 8Q.Q 10 - - 3. 67.5 72.7 22 (-81 67.4 70,5 40 Mixed Gaussian (1.12 Residual Distribution 69.7 69.7 m 27.8 Q 10 - (standard error) - ~ f f i c i e n cRelatiye ~ to CRLB 0 22 - Slash 7 0 4Q - T r i e ff i c i e n c y One e x t r a p o l a t i o n of e q u a t i o n (2.1) r e q u i r e s t h e r e s i d u a l s from a m u l t i p l e r e g r e s s i o n f i t t o have z e r o each c a r r i e r : $-resistant l i n e slope against For p c a r r i e r s ( i n c l u d i n g t h e c o n s t a n t c a r r i e r ) and c o e f f i c i e n t v e c t o r , 8, d e f i n e e = y - XB. We s e e k t o satisfy the p simultaneous c o n d i t i o n s f o r W e s t i m a t o r , T , and p a r t i t i o n s e t s (L o r d e r i n g of c a r r i e r x j . j' R.) J d e f i n e d by t h e rank Other p o s s i b i l i t i e s i n c l u d e f i r s t o r t h o g o n a l i z i n g t h e c a r r i e r s i n some ( p o s s i b l y r e s i s t a n t ) f a s h i o n . The e x p e r i e n c e of Emerson and Hoaglin i n d i c a t e s t h a t (7.1) can be approximated i n e x p e n s i v e l y . No $-resistant multiple regression w i l l compete i n e f f i c i e n c y w i t h , f o r example, t h e bounded i n f l u e n c e r e g r e s s i o n s of Krasker and Welsch (1982). However, t h e y p r o v i d e a l e s s expensive e x p l o r a t o r y - q u a l i t y bounded i n f l u e n c e r o b u s t m u l t i p l e r e g r e s s i o n e s t i m a t e , which w i l l be s u f f i c i e n t f o r many a p p l i c a t i o n s , When greater efficiency is desired, the $-resistant multiple regression p r o v i d e s a b e t t e r s t a r t i n g v a l u e f o r t h e Krasker-Welsch i t e r a t t o n than l e a s t squares o r l e a s t absolute r e s i d u a l regression. APPEND1 CES A.l ADDENDA TO CONSISTENCY RESULTS We show f i r s t t h a t under t h e c o n d i t i o n of P r o p o s i t i o n 4.2', < 8~&) ( Y - ~ X [ X X ( Y - ~ X I X X ~ , ~ , H I+ < xL,H) A = { Y - bX < of t h e l a t t e r , and l e t Let be a c o n t i n u i t y p o i n t z By h y p o t h e s i s 2). H (A,X < x n - L,n ) s o t h a t i t remains t o check t h a t PL = H(X ( x L ) , For t h i s purpose write x and x f o r n X ~ . nand H ( A , X z xL). H (A,X ) f o r H (A n n n fl {(--,xnl X ~ ~ , = ~ pL,n ) + -t 3 X I ) and simply Hn(xn) when Hn(A) = 1. x Using r i g h t c o n t i n u i t y of H(x ) , and g i v e n y > x, y > 0 , choose E < H(y) < H(x) p o i n t of H , s o t h a t H(x) - a continuity 5x Hn(X + From E, t h i s we o b t a i n I H,(A,x,) - H(A,x) I < I Hn(A,y) I - Hn(Ayxn) + I Hn(A,y) The f i r s t term on t h e r i g h t s i d e i s bounded by where (-w,y] (-m,y] x R. denotes and converges t o IH(x) asymptotically since - 1 H(y) < then + if h(Fn,t) + X(F,t) p o s s i b l y depending on for F. T,(F) where then Now i n v a r i a n t ; and t h i s i m p l i e s t h a t Nt = N 0 +t E > 0 is & b Fn -+ F: Suppose t h a t such t h a t that for a l l sufficiently large Nt where NF c { t :F(Nt) = F(NO + t ) > 0) T, (Fn) ) T,(F) X(Fn,tO-E) + X(F,tO-E) > 0: n, T*(Fn) 2 to - i s countable. If is translation since It i s now e a s y t o check t h a t Indeed, choose Since To s e e t h i s , n o t e t h a t t h e monotone f u n c t i o n F(Nt) = 0 , countable. by assumption. i s a countable s e t NF i s bounded and c o n t i n u o u s on I R \ N t ' NF. H i s lower semi- > 0) i s bounded and monotone. x + +(x-t) t - Hn(y) 1 ) as was r e q u i r e d . = s u p ( t : X ( ~ ,t ) t &R\NF, ~ xn ; The second term v a n i s h e s a. Hn(A,xn) + H(A,x), Secondly, we prove t h a t continuous n + IH(~) H(x)l < E . F i n a l l y , t h e t h i r d term i s bounded by a r b i t r a r y i t follows t h a t 1H i s a c o n t i n u i t y set of (-,y] A Hn[(-m,ylRA(~,.xn]$ The l a t t e r e q u a l s as E - H(A,Y) I + IH(A,Y)- H(A,x) I E, i s a t most = to, say. t h i s implies and t h i s s u f f i c e s . A.2. DERIVATION OF INFLUENCE FUNCTION and For brevity in what follows we shall assume that F (z), FRPb(2) L,b $(z) are sufficiently smooth in b and z that the necessary differentiations and interchanges are valid. ction corresponding to T(F) In the case of the $ fun- the results will hold except on an = med(F), easily identified null set. b (H ) is defined implicitly as the solution to 4J 6 t (6,b) = 0, where tR(6,b) = t(FR,b,6) and L The estimate b(6) h(6.b) = tR(6,b) (H ) R,b,6- F ~ , b 6 F - = (and similarly for tL(6,b)). In turn, t (6,b) is de- R fined implicitly by ~t follows by implicit differentiation that where to = tL(Oyb(0)) = tR(O,b(0)). To evaluate (A.l), note that H6 = (1-6) H + 6cIX 9 F -- (t) = H6(Y R,6,b where we write - bX -< ~ I x-> xR) {A) instead of = (1 I({A}) R - as) FRYb(t) + , and Y R implies that {y - bx < t}, - i s t h e p r o b a b i l i t y t h a t a w i l d p o i n t i s drawn g i v e n t h a t i t l i e s i n R. . A s i m i l a r expression holds f o r F and hence gL ( 6 , b , t ) L*&,b H E Hr i n order t o simplify (A.l). Assume f o r now t h a t case F R,O,B b+(O) = B = F and t (0, B y t o ) hence t h a t hold f o r . A c a l c u l a t i o n shows t h a t , t o ) = $(y-6 Thus x* R and hence 1 - . D3gR = D3gL xR}IPR and R6 a & 6=0 = {X ) {x > x R1 / p R ' = u Analogous r e s u l t s D g 1 L' -+ Soo, D g 2 L ~1x1< E ( x I x L ~ ~ ) I.f v s o i t i s e a s y t o check t h a t Y gL(O, b(O), t ) = gR(O, b(O), t ) DlgR(0,f3 and t h a t as +a1 + a 1) It remains t o f i n d g X(E = tR(O, b ( 0 ) ) = tL(Oy b ( 0 ) ) = t ( t ( ~ This implies t h a t at = Ly 096 In this and RYb Assume now t h a t FR(x) = P(X < X ~ > X xR) - Write $(v)F D2gR. (v+ t)+O as v + & and and has a density E vR for $ ( v ) [ l - F R y b ( v + t ) ] -+ 0 then D 2 gR ( O , b , t ) = - / d$(v) -cu and a s i m i l a r r e s u l t h o l d s f o r ,f X dFR(z)zg((b-B ) z +v+ t ) , and R D2g~ I n s e r t i n g t h e s e e x p r e s s i o n s i n (A.l) l e a d s t o t h e i n f l u e n c e f u n c t i o n s (4.1) and (5.3). The i n f l u e n c e f u n c t i o n t a k e s a more complicated form i f and we g i v e o n l y t h e r e s u l t f o r t h e o r d i n a r y r e s i s t a n t l i n e . H 6 ar9 I n t h i s case, a s t r a i g h t f o r w a r d c a l c u l a t i o n shows t h a t where b. bo = b R L ( 0 ) , and Db denotes p a r t i a l d i f f e r e n t i a t i o n with respect t o A.3 ASYMPTOTIC NORMALITY We u s e f r e e l y t h e n o t a t i o n of S e c t i o n s 4 and 5, (Xi,Yi) is an i.i.d. c o n d i t i o n s on $ (A) If X, sample from i s monotone, w i t h T n = T,(Z1, (i) X ( t ) = E$(E-t) E 3 r and impose t h e following f u n c t i o n corresponding t o - $(x) > 0 ...,Z n ) n TE = sup{t:E1 $(Zi-t) (B) $ and t h e E, H for x > 0 i s any p o i n t i n > 01 Suppose t h a t and and Tn $(x) < 0 [T:,Tr], . x < 0. for where 01, < T** = i n £ {t:z;$(zi-t) n i s continuously d i f f e r e n t i a b l e a t then t = 0, with X(0) = 0. (ii) EX 2 < i s continuous a t t + w2(e-t) The f u n c t i o n (C) Either or (D) E [ $ ( E + w + c x ) [ <~ w 1 X(t) = 0(1 t ) for a l l as t 0. -t m. w,c E R . Under t h e s e assumptions, w e show f o r t h e $ - r e s i s t a n t A l i n e estimator A that @ ' @ + &($-Po) [pi1 + pa1] (vR-vL) -2 2 i s a s y m p t o t i c a l l y c e n t e r e d Gaussian w i t h v a r i a n c e / [A' (0) 1 2 Here U L-L - 7 L F - (p) ~ d and 1 v~ - -1 - PR j ~ - ~ ( ~ )where d ~ , F-L (p) i s t h e l e f t continuous i n v e r s e of l-pR t h e d i s t r i b u t i o n of shown t h a t X. When 11, = E ( X \ X5 x L ) , pR = P (X > xR). p L = P(X xL) and s i m i l a r l y f o r some vR = x L ' E(XIX2 xR) i t may be when Proof. Y =.a 0 From t h e m o n o t o n i c i t y of + 6 0X +E h ( s e e ( 5 . 1 ) ) and t h e model we have (TL ,T R ) = (TL ( ~1 . - c n - ' / ~ X ~ ) ,TR(~i-~n-112Xi)) We show converges i n p r o b a b i l i t y t o a j o i n t l y normal d i s t r i b u t i o n w i t h mean covariance matrix -2 [A1(0)] 2 -1 -1 . d i a g ( p L ,pR ), -C (pL,uR) and from which t h e r e s u l t follows e a s i l y . Xn, C o n d i t i o n a l on TL and TR a r e independent, s o t h a t Consider f o r now t h e f i r s t term, which i n view of (A.2) may be sandwiched between and t h e c o r r e s p o n d i n g e x p r e s s i o n w i t h n o n - s t r i c t i n e q u a l i t y . I t w i l l b e s e e n t h a t i t s u f f i c e s t o c o n s i d e r (A.4). f o r independent r . v . ' s Let ( e . g . F e l l e r (1971), p. 544) t o approximate (A.4). denote conditional expectation given zn , i = $ ~ & ~ - n (w+cxi) - ~ / ~ and P, rn = Apply t h e Berry Esseen theorem -Z - E - Pn,i, i&L - 13. I, pn, NOW = write where as always ZZn, i , and s e t 2 'n - -112 =A[n (*cxi)l, - i E ~ n , i 9sn < X L = {i:X (i) } = on,i = ~ a rn , (i ~ z icL and Note t h a t a l l t h e s e - (2) sn. From Berry-Esseen q u a n t i t i e s a r e f u n c t i o n s of follows ' I PL(?(,) - (3 (-vn/sn) 1 3 6rn/sn Now t h e second f a c t o r i n (A.3) can be handled analogously, s o an appeal t o t h e Bounded Convergence Theorem reduces t h e proof t o showing t h a t r/s3 n n 0 , and 2 -1 112 P 'nIsn + -(*P,){P~Q~(O))~(W We begin by showing t h a t where qi =-(&exi). (A.7) tends t o 0 This follows i f 2/ n sn p p L q2 + . 1 Write The Markov i n e q u a l i t y shows t h a t t h e second term i n i n probability i f a ) Nn P + W and W b) 2 ( E ) = W. = @J~(E+II-'/~~ +) n i s uniformly i n t e g r a b l e . Wn The c o n t i n u i t y p r o p e r t y ( B ) ( i i ) i m p l i e s a ) ; while b) follows from monotonicity of 2 {$ (€1 +$ 2 $ and c o n d i t i o n (D). Indeed, (E+v) }1{q2 (E) V $2 ( ~ + q )> a ) , The t h i r d term of (A.7) i s a p p e a l s t o t h e c o n t i n u i t y of o (1) P X(t) $ 2 ( e + q / & ) 1 [ ~ 2 ( ~ + q / f i ) > a] < and t h e l a t t e r i s i n t e g r a b l e . f o r s i m i l a r r e a s o n s , except t h a t one at 0 (and X(0) = 0) Consequently (A.5) can be e s t a b l i s h e d by showing t h a t Indeed - r / n < n-' n - n E 'n,i9 which i s 0 (1) P because i n s t e a d of (B) ( i i ) . rn is 312) op(n . which i s u n i f o r m l y bounded i n n by v i r t u r e of assumption (D) and I ) . m o n o t o n i c i t y of To e s t a b l i s h (A.6) we write " n-I' - - - -l 4; (w+cxi) C J;; EL - ' Io + A ' n(0) [A ( s ) -X' (O)]ds That t h e second term converges i n p r o b a b i l i t y t o 2 -pL(wtcvL) [A' (0) 1 may be s e e n by w r i t i n g n . C (w+cXi) EL P ~ X ' ( O ) ( W + C ~=- I ~ ) -1 C as Xi i&L L ~;'(~)dp, where Fn is the empirical d.f. of XI, 0 0 1 ) : term i s = w + cXi The f i r s t t h i s i s a consequence of t h e n e x t lemma a p p l i e d t o r . Y.1 ...,Xn . and 7 g(y) = [A1(s)-AV(O)]ds. Y o Lemma. Y1,Y2,. Let .. ( i i ) E[Y 1 s u b s e t of g(y) be a f u n c t i o n s u c h t h a t be i.i . d .r .v. ' s . < and 1 ,, g(y) n g(y) -t i s bounded. Let y -t 0; and or in be a ( p o s s i b l y random) Then E1yg(n-ll2y) 1 + 0. Assumption t h e lemma a p p l i e s t o t h e p r e s e n t s i t u a t i o n . - as Suppose t h a t e i t h e r ( i ) E Y ~< We omit t h e p r o o f , remarking o n l y t h a t i n c a s e ( i ) while i n case ( i i ) 0 n -1' 2 - P IYJ max l < i< n -+ A B and C e n s u r e t h a t o A. 4 HEURISTIC . ARGUMENT FOR REDESCENDING JI I n t h e following, we c o n d i t i o n on t h e observed sequence of c a r r i e r s X = (xl, * , x2, ... ) and set m of t n i n i m i z i n g ( L 9 (yi 1 I t h e sequence (x L -L Define T(y -1 t~ - ,..., ym) a s a v a l u e i n t e r v a l [-M,I$]. t h a t t h P and t h a t A(t) = E$(E n has a unique z e r o a t t = O w i t h and i n t e g r a t l i l i t y = 1 i n (5.2). - t ) 1 l o a prespecified -... - vL12 E 8 ( 1 ) , -1 z (xi n xz9 2 xi+vL Suppose f o r and L t) n - X'(0) # 0. Then under f u r t h e r smoothness c o n d i t i o n s one can show u s i n g Huber (1967) t h a t . w i t h an analogous r e s u l t v a l i d f o r TR n These r e s u l t s hold f o r each value of y . Under a p p r o p r i a t e conditions t h e random process w i t h p a t h s i n D(-m, a) converges weakly t o t h e degenerate -1 Gaussian p r o c e s s X(y) = (2pL v T ) l l % - (pg - t ) y , where Z - ~ ( 0 ,1 ) . The mapping h: D(-m,~) -+ IR t h a t l o c a t e s t h e minimum i n [-M,Y] of y + 1xC-y) 1 (with conventions t o handle non-uniqueness and jump d i s c o n t i n u i t i e s ) i s continuous on t h e support of (X(y)), and s o from t h e continuous mapping theorem A.5 VARIANCE REDUCTION METHODS Thfs 2 s an e x t e n s i o n a l s o l d i s - Gaussian o v e r Independent Swindles, cussed by Goodfellow and M a r t i n (1976) t o t h e r e g r e s s i o n s i t u a t i o n of a common method f o r l o c a t i o n problems (e.g. (1976)). an n Assume t h a t a l i n e a r model +E y = X f3 o b t a i n s , where Y i s 1 column v e c t o r , X a f i x e d n x p m a t r i x of c a r r i e r s , x parameter v e c t e r , and Here z Andrews e t . a1 (1972), Simon i - E i s a n n x 1 v e c t o r of i . i . d . ~ ( 002) , and wi a r e i.i.d. - , positive ..., w,), statistics i: C o n d i t i o n a l on t h e v e c t o r w t h e (complete) s u f f i c i e n t (wl, W* B a p variables E 1 x i = z./wi. 1 and independent of z i . s t a n d a r d normal t h e o r y y i e l d s a s t h e weighted l e a s t s q u a r e s A e s t i m a t e s of 6, o? and t h e ( a n c i l l a r y ) s t a n d a r d i z e d r e s i d u a l s e = (y - A A X fiW)/ow. Let b(y) b e a n a r b i t r a r y r e g r e s s i o n i n v a r i a n t e s t i m a t o r of 8, i . e . satisfying b(c Y + X d) = c b(Y) + d for c E IR , ds IR'. C o n d i t i o n a l on w, t h e ,., s u f f i c i e n c y a n c i l l a r i t ~theorem (e.g. Simon 11976)) e n s u r e s independence of A Bw' A2 Ow and i. From t h i s flows t h e decomposition (under t h e assumption t h a t . 6 = 0 , a 2 = 1, and t h a t b ( y ) is unbiased ) where A = d i a g (w.) and Var d e n o t e s t h e v a r i a n c e - c o v a r i a n c e m a t r i x . ,.. 1 I n many c a s e s t h e f i r s t t e r m can b e e v a l u a t e d a n a l y t i c a l l y , o r once-and-for-all by A Monte C a r l o , w h i l e Var b ( e ) , b e i n g s m a l l e r t h a n Var b ( y ) , can u s u a l l y b e ,. .., e s t i m a t e d more a c c u r a t e l y . A A Var b ( e ) = E b ( e ) b ( e ) -., 1 - C b ( Ie N ) ( I Based on I = 1, ..,, N replications, A ) , i s e s t i m a t e d v i a t h e method of moments a s wTth t h e v a r i a b i l i t y of t h e l a t t e r e s t i m a t e d i n a s t a n d a r d way u s i n g sums ofi f o u r t h moments. F i n a l l y we n o t e t h a t t h e decomposition (1) remains v a l i d i f t h e rows of x a r e i . i . d . t h i s f o l l o w i n g from t h e u n b i a s e d n e s s of b ( y ) . random v e c t o r s , All Score f u n c t i o n swindle This i s a g e n e r a l method, n o t l i m i t e d t o t h e r e g r e s s i o n c o n t e x t , o r t o d i s t r i b u t i o n s from t h e Gaussiandver-independent family. For more d e t a i l s see Johnstone and Velleman (1983). simple l i n e a r r e g r e s s i o n model Yi = a where a and B a r e unknown s c a l a r s , + E i B(Pi pX) ake i . i . d . t h e Xi may b e e i t h e r f i x e d ( i n which case p (with - X = -X1 + Consider f o r example a i = 1, ..., n; w i t h d e n s i t y g , and o r i.i . d . variables The e f f i c i e n t s c o r e f u n c t i o n f o r e s t i m a t i o n of 8 p X = E X1). i s given by n and h a s ( i n r e g u l a r c a s e s ) e x p e c t a t i o n z e r o and v a r i a n c e n 021(g), where X I ( g ) = / ( g ' ) 2 / g i s t h e F i s h e r i n f o r m a t i o n ( f o r l o c a t i o n ) of t h e d e n s i t y g. The Cramer-Rao i n e q u a l i t y s a y s t h a t L = In o;1(g)] t o t h e v a r i a n c e of any unbiased e s t i m a t o r b(y) of b(y) - LS n and LS n a r e u n c o r r e l a t e d (when B = 0 ) . -1 i s a lower bound B , and t h a t This gives t h e decomposition ( f o r B = 0, and a r e g r e s s i o n i n v a r i a n t e s t i m a t e r b ( ~ ) ) V a r b (y) = L + Var b (y - LSnX) . Again, i n many c a s e s L can b e e v a l u a t e d e x p l i c i t l y (perhaps w i t h t h e a i d of a numerical i n t e g r a t i o n ) , while Var(b(y) - LS ) can u s u a l l y be estimated n from s i m u l a t i o n more a c c u r a t e l y than Var(b(y)). The e f f e c t i v e n e s s of t h e swindle w i l l depend on how c l o s e t h e ( u s u a l l y u n a t t a i n a b l e ) bound L i s t o t h e v a r i a n c e of t h e Pitman e s t i m a t e , and on t h e amount of c o r r e l a t i o n between b (y) and LSn. MONTE CARLO EXPERIMENT DETAILS Without l o s s of g e n e r a l i t y w e s e t t h e t r u e B = 0. Pseudo-random e r r o r s were generated u s i n g t h e IMSL (1990) l i n e a r c o n g r u e n t i a l random number generator GGUBS (seed = 714235803) and Box-Muller Gaussian transformation i n GGNOR, Gaussian v a r i a t e s w e r e taken d i r e c t l y from GGNOR, Mixed Gaussian were generated by m u l t i p l y i n g u n i t Gaussian v a l u e s by 3.0 w i t h p r o b a h f l i t y 0.1. S l a s h v a r 2 a t e s were generated a s the q u o t i e n t of a random Gaussian from GGNOR and an independent random uniform from GGUBS , For sample s i z e 10 ye performed 10,000 r e p l i c a t i o n s ; s h e 22, 5,000; f o r sample s i z e 40, 2,000. f o r sample s i z e I n each run, t h e same samples were presented t o a l l e s t i m a t o r s 2n p a r a l l e l , s o compar2sons of e f f i c i e n c i e s among e s t i m a t o r s i n t h e same run a r e somewhat more r e l i a b l e than t h e r e p o r t e d s t a n d a r d e r r o r s would i n d i c a t e . A l l programs were w r i t t e n i n F o r t r a n 66, compiled e 5 t h e r by I B M * s F o r t r a n H compiler o r by IRMVs FORTVS compiler a t o p t i m i z a t i o n l e v e l 3, T r i a l s were run on C o r n e l l Un2versity1.s IBM 370/168 under a modified OS/MVT o p e r a t i n g system u s i n g CMS. REFERENCES Adichie, J, N. (1967), "Asymptotfc E f f i c i e n c y of a Class of Non-parametric Tests f o r Regression Parameters," Annals of Mathematical S t a t i s t i c s , 38, 884-893. Aho, A. N., Hopcroft, J. E . , and Ullman, J. D. (1974), The Design and Analysis of Computer Algorithms, Reading Massachusetts: Addison-Wesley. Huber, P. J., Rogers, W. H . , Andrews, D, F., B i c k e l , P. J . , Hampel, F, R., and Tukey, J. W. (1972), Robust Estimates of Location, Survey and Advances, P r i n c e t o n , New J e r s e y : P r i n c e t o n U n i v e r s i t y P r e s s . Andrews, D. F, (1974), "A Robust Method f o r M u l t i p l e Linear Regression," Technometrics, 16, 523-531. B a r t l e t t , M. S. (1949), " F i t t i n g a S t r a i g h t Lfne When Both V a r i a b l e s a r e Subject t o Error," Barton, D. E . , Biometrics, and Casley, D. J. (1958), 5, 207-212. "A Quick Estimate of t h e Regres- s i o n C o e f f i c i e n t , " Biometrika, 45, 431-435. Basset t , G i l b e r t , and Koenker , Roger (1978) , "Asymptotic Theory of Least Absolute E r r o r Regression," J o u r n a l of,American S t a t i s t i c a l Associa t i o n , 73: 618-622. Beaton, A. E. , and Tukey, J. W. (1974), he. F i t t i n g of Power S e r i e s , Meaning Polynomials, I l l u s t r a t e d on Ba-nd-Spectroscopic Data," Technometrics, 16, 147-185. B e c h t e l l , M. L. . (1982), ''A Confidence I n t e r v a l f o r t h e Slope C o e f f i c i e n t of t h e R e s i s t a n t Line", M.S. t h e s i s , C o r n e l l University. Bhapkar, V. P. (1961), "Some Nonparametric Median Procedures"', Annals of Mathemat2,calStatistics, 32, 846-863. Billingsley, P. (1968), Convergence of Probability Measures, New York: John Wiley. Bingham, R. J., and Emerson, J. D. (1983), "The Resistance of Three Regres- sion Methods", Proceedings of Social Statistical Section, American Statistical Association 1982, 514-519. Brent, R. P. (1973), Algorithms for Minimization Without Derivatives, Englewood Cliffs, New Jersey: Prenrice-Hall. Brown, G, W., and Mood, A. M. , (1951) , "On Median Tests for Linear Hypotheses," Proceedings Second Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 159-166. Buhler, S., and Buhler, R. (1979), P-Stat 78, Princeton, New Jersey: P-Stat, Inc. Daniels, H. E, (1954), "A Distribution-Free Test for Regression parameters," Apnals of Mathematical Statistics, 25, 499-513. Dekker, T. J. (1969), "Finding a Zero by Means of Successive Linear Inter- polati~n,~' in B. Dejon and P. Henrici (eds.) Constructive Aspects of the Fundamental Theorem of Algebra, New York: Wiley-Interscience. Diaconis, P,, and Freedman, D. A, (1982) , "On Inconsistent *Estimators ," Annals of Statistics, 10, 454-461. Dixon, W,, and Tukey, J. W. (1968), "Approximate Behavior of the Distribution of Winsorized t (Tri&ng/Winsorization - 10, 83-98, 2)," Technometrics, Dolby, J. L. (1960), "Graphical Procedures f o r F i t t i n g t h e B e s t Line t o a S e t of P o i n t s , " Technometrics, 2, 477-481. Donoho, D., and Huber, P. J. (1982), "The Notion of Breakdown Point," F e s t s c h i f t f o r E r i c h Lehmann, B i c k e l , P. J . , Doksom, K, A , , and Hodges, J . L. ( e d s . ) , Belmont, C a l i f o r n i a : Wadsworth P r e s s , 157-184. Emerson, J. D. , and Hoaglin, D. C. i n Hoaglin, D. C., (1983a), M o s t e l l e r , F., "Resistant Lines f o r y-versus-x," and Tukey, J. W. (eds.), Under- s t a n d i n g RobustandExploratory Data Analysis, New York: John Wiley. , and (1983b), " R e s i s t a n t M u l t i p l e Regression, One Variable a t a Time," i n Hoaglin, D. C., Tukep, J. W. ( e d s . ) , Exploring New York: John Wiley Epstein, B. (1954), M o s t e l l e r , F., and Tables, Trends, and Shapes, (forthcoming). "Tables f o r t h e D i s t r i b u t i o n of t h e Number of Exceedances," Annals of Mathematical S t a t i s t i c s , 25, 762-768, F e l l e r , W. (1968), An I n t r o d u c t i o n t o P r o b a b i l i t y Theory and Its A p p l i c a t i o n s , Vol. 1, 3rd Ed., New York: John Wiley. , (1971), An I n t r o d u c t i o n t o P r o b a b i l i t y Theory and Its A p p l i c a t i o n s , Vol, 11, New York: John Wiley. Gibson, W . M., and Jowett, G. H. (1957a), "Three-Group Regression Analysis: P a r t 1, Simple Regression Analysis," Applied S t a t t s t i c s , , and (1957b), "Three Group Regression Analysis: P a r t 2 , Multiple Regression Analysis ," Applied S t a t i s t i c s , 6 , '189-197. Goodfellow, D. M., and Martin, R. D. Techniques i n Robust Estimation," (19761, "Monte Carlo Swindle EE Technical Report, Department of E l e c t r i c a l Engineering, U n i v e r s i t y of ~ a s h f n ~ t o n . 49 ~ g j e k ,J , (1972), "Local Asymptotic Wnimax and A d m i s s a b i l i t y i n E s t i - mation," P r o c e e d i n g s S i x t h Berkeley Symposium on Mathematical S t a t e s t i c s and P r o b a b i l i t y , Vol. 1, B e r k e l e y , C a l i f o r n i a : U n i v e r s i t y of C a l i f o r n i a P r e s s , 175-194. Hampel, F, R. (1971), "A Q u a l i t a t i v e D e f i n i t i o n of Robustness," Annals of Mathematical S t a t i s t i c s , 4 2 , 1887-1896, , Rousseeuw, P. J . , and R o n c h e t t i , E, (1981), "The Change-of-Variance Curve and Optimal Redescending E s t i m a t o r s , " J o u r n a l of American S t a t i s t i c a l A s s o c i a t i o n , 76, 643-648. H i l l , B. M, (1962), "A T e s t of L i n e a r i t y Versus Convexity of a Median Regression Curve,". Hogg, R, Yn (1975), Annals of Mathematical S t a t i s t i c s , 33, 1096-1123, "Estimates of P e r c e n t i l e Regression L i n e s Using S a l a r y Data," J o u r n a l of American S t a t i s t i c a l A s s o c i a t i o n , 70, 56-59. Huber, P, J, (1964), "Robust E s t i m a t i o n of a L o c a t i o n Parameter," Annals of N a t h e m a t i c a l S t a t i s t i c s , 35, 73-101. (19.671, "The Behavior of Maximum L i k e l i h o o d E s t i m a t e s Under Nons t a n d a r d C o n d i t i o n s , " Proceedings F i f t h Berkeley Symposium on Mathem a t i c a l S t a t i s t i c s and P r o b a b i l i t y , Vol. 1, Berkeley, C a l i f o r n i a : U n i v e r s i t y of C a l i f o r n i a P r e s s . (1973), "Robust Regression: Asymptotics, C o n j e c t u r e s and Monte c a r l o , " Annals of S t a t i s t i c s , 1, 799-821. (1981) , Robust S t a t i s t i c s , New York: IMSL L i b r a r y , E d i t i o n 8 (1980) , John Wiley and Sons. I n t e r n a t i o n a l Mathematical and S t a t i s t i c a l L i b r a r i e s , I n c . , 6 t h F l o o r , NBC B u i l d i n g , Houston, Texas. Johnstone, I, and Velleman, P. F. (1982), " ~ u k e y ~Resistant s Line and Related Methods: Asymptotics and Algorithms," Proceedings of the Statistical Computing Section, 1981, American Statistical Association, 218-223. , and (1984), "Variance Reduction Methods for Monte Carlo Study of Regression," to appear Proceedings of the Statistical Computing Section, 1983, American Statistical Association. Kelley, T. L. (1939), "The Selection of Upper and Lower Groups for the Validation of Test Items," Journal of Educational Psychology, 30, 17-24. Kildea, D. G. (1981), "Brown-Mood Type Median Estimators for Simple Regression Models ,'I Annals of Statistics, 9, 438-442. Krasker, W,, and Welsch, R. E. (1982), "Efficient Bounded-Influence Regression ~stimation,"Journal of American Statistical Association, 77, 595-604, detLaplace, Pierre Simon (1818), ~euxfGmeSupplement la 'i'he'orie Analytique des ~robabilitgs , Paris: Courcier. Lehmann, E. L. (19591, Testing Statistical Hypotheses, New York: John Wiley. Mallows, C, P, (1973), "Influence ~unctions,"paper presented at the Natlonal Bureau of Economic Research Conference on Robust Regression in Cambridge, Mass. McCabe, G. P. (1980), "Use of the 27% Rule in ~$erimental ~esign," Comunications in Statistics Mood, A. M. . - (1950), -- Theory and Methods, A9(7), 765-776. Introduction to the Theory of Statistics, New York, - New York: McGraw-Hill. Mosteller, F. (1946), ltOn Some Useful ' ~ n e f f i c i e n t ' ~ t a t i s t i c s , "Annals of Mathematical S t a t i s t i c s , 17, 377-404. , and Tukey, J. W. (1977), Data Analysis and Regression, Boston, Massachusetts: Addison-Wesley. and Shrivastava, M. P. Nair, K. R., (1942), "On a Simple Method of Curve F i t t i n g , " sankhya, 6 , 121-132. Neyman, J , , and S c o t t , E. L. (1951), "On C e r t a i n Methods of Estimatrng t h e Linear S t r u c t u r a l Relation," Annals of Mathematlcal S t a t i s t i c s , 22, 352-361. Ryan, T., J o i n e r , B., and Ryan, B. (1981), The Minitab Manual, S t a t e College, Pa. : Minitab P r o j e c t , (1968), Sen, P, K. "Estimates of t h e Regression C o e f f i c i e n t Based on Kendall's ~ a u , "J o u r n a l of American S t a t i s t i c a l Association, 6 3 , 1379-1389. S i e g e l , A, F. (1982), "Robust Regression Using Repeated Medfans,'" Biometrika, 69, 242-244. Simon, G. (1976), "Computer Simulation Swindles, with Applications t o Estimates of Location and Dispersion," T h e i l , J. (1950), Applied S t a t i s t i c s , 25, 266-274, "A Rank-invariant Method of Linear and Polynomial Regression Analysis," I , 11, and 111, Koninklijke Nederlandse Akademie v a r Wetenschappen Proceedings, 53, 386-392, Tukey, J. W. (1971), 521-525, 1397-1412, Exploratory Data Analysis, Limited Preliminary E d i t i o n , Ann Arbor, Michigan: University Wcrofilms. Velleman, P. F., and Hoaglin, D. C. (1981), Applications, Basics and Computing of Exploratory Data Analysis, Boston, Massachusetts: Duxbury Press. , and Ypelaar, M. A. (1980), "Constructing Regressions with Controlled Features: A Method of Probing Regression Performance," J o u r n a l of American S t a t i s t i c a l Association, 75, 839-844. Wald, A. (1940), "The F i t t i n g of S t r a i g h t Lines i f Both V a r i a b l e s a r e Subject t o E r r o r , " Wilkinson, J. H. (1967), Interpolation," Annals of Mathematical S t a t i s t i c s , 11, 282-300. "Two Algorithms Based on Successive Linear Tech. Report STAN-CS-67-60, Stanford, California: Computer Science Department, Stanford U n i v e r s i t y ,