ROBUST STATISTICS

INTRODUCTION
• Robust statistics provides an alternative approach to classical
statistical methods. The motivation is to produce estimators that are
not excessively affected by small departures from model
assumptions. These departures may include departures from an
assumed sample distribution or data departing from the rest of the
data (i.e. outliers).
MEAN VS MEDIAN
nm
Tn    
Xj
j m1 n  2m
where m  n Usually,0.1    0.2
mX m1  X nm     X i 
nm
Tn   
i m1
n
where m  n 
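• A quick sketch in R (the sample x and the winsor.mean helper are illustrative, not part of these notes); mean() computes the trimmed mean directly via its trim argument:

# Trimmed vs. Winsorized mean on a sample with one gross outlier
x <- c(2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 3.4, 95.0)
mean(x)              # pulled far upward by the outlier
mean(x, trim = 0.1)  # 10% trimmed mean: drops m = [n*alpha] points per tail

# Winsorized mean computed from the definition above
winsor.mean <- function(x, alpha = 0.1) {
  x <- sort(x)
  n <- length(x)
  m <- floor(n * alpha)
  (m * (x[m + 1] + x[n - m]) + sum(x[(m + 1):(n - m)])) / n
}
winsor.mean(x)       # close to the trimmed mean, far from mean(x)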
ROBUST MEASURE OF VARIABILITY
$$\mathrm{MAD} = \operatorname*{median}_{i}\,\bigl|\, Y_i - \operatorname*{median}_{j}(Y_j) \,\bigr|$$
ORDER STATISTICS AND ROBUSTNESS
• Order statistics and their functions are usually somewhat robust
(e.g. the median, MAD, IQR), but not all of them are robust (e.g.
X(1), X(n), and the range R = X(n) − X(1)).
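• A small R demonstration of this contrast (the sample y is illustrative): the median, MAD, and IQR barely move when one gross outlier is added, while the mean and range react strongly. Note that mad() in R rescales by 1.4826 by default, so that it estimates the standard deviation at the normal; set constant = 1 for the raw definition above.

y  <- c(2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 3.4)
y2 <- c(y, 95.0)                      # add one gross outlier
c(mean(y), mean(y2))                  # mean shifts badly
c(median(y), median(y2))              # median barely moves
median(abs(y2 - median(y2)))          # MAD from the definition
mad(y2, constant = 1)                 # the same value via mad()
c(diff(range(y)), diff(range(y2)))    # range R = X(n) - X(1) explodes
c(IQR(y), IQR(y2))                    # IQR stays stable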
M-ESTIMATORS
• An M-estimator of location solves
$$\min_{\mu}\; \sum_i \bigl[-\log f(Y_i - \mu)\bigr] \;=\; \min_{\mu}\; \sum_i \rho(Y_i - \mu).$$
Differentiating with respect to μ gives the estimating equation
$$\sum_i \psi(Y_i - \hat{\mu}) = 0
\quad\text{or}\quad
\sum_i w_i\,(Y_i - \hat{\mu}) = 0,
\qquad\text{where } w_i = \frac{\psi(Y_i - \hat{\mu})}{Y_i - \hat{\mu}}.$$
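• The second (weighted) form suggests a simple fixed-point algorithm for the location estimate. A minimal R sketch with Huber weights (huber.loc is an illustrative name; the tuning constant c = 1.345 and the MAD scale are conventional choices, not prescribed above):

huber.loc <- function(y, c = 1.345, tol = 1e-8, maxit = 50) {
  mu <- median(y)                 # robust starting value
  s  <- mad(y)                    # robust scale, held fixed
  for (it in seq_len(maxit)) {
    u <- (y - mu) / s
    w <- pmin(1, c / abs(u))      # w_i = psi(u_i)/u_i for Huber's psi
    mu.new <- sum(w * y) / sum(w) # weighted mean solves sum w_i (y_i - mu) = 0
    if (abs(mu.new - mu) < tol * s) break
    mu <- mu.new
  }
  mu
}

• MASS::huber() implements essentially this computation.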
M-ESTIMATORS
• When an estimator is robust, the influence of any single observation
is insufficient to produce a significant offset in the estimate.
There are several constraints that a robust M-estimator should meet:
1. The first is of course to have a bounded influence function.
2. The second is naturally the requirement that the robust estimator
be unique.
Briefly, we give a few indications of these functions:
• L2 (least-squares) estimators are not robust because their influence
function is not bounded.
• L1 (absolute-value) estimators are not stable because the ρ function |x| is not strictly convex in x: the second derivative
at x = 0 is unbounded, and an indeterminate solution may result.
• L1-L2 estimators reduce the influence of large errors, but such errors still
have an influence because the influence function has no cutoff point.
EXAMPLES OF M-ESTIMATORS
• The mean corresponds to ρ(x) = x², and the median to ρ(x) = |x|. (For
even n any median will solve the problem.) The function
$$\psi(x) = \begin{cases} x, & |x| < c \\ 0, & \text{otherwise} \end{cases}$$
corresponds to metric trimming and large outliers have no influence at
all. The function
$$\psi(x) = \begin{cases} -c, & x < -c \\ x, & |x| \le c \\ c, & x > c \end{cases}$$
is known as metric Winsorizing and brings in extreme observations to
μ±c.
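• In R these two ψ functions can be written directly (psi.trim and psi.winsor are illustrative names; c = 2 is an arbitrary choice for the plot):

psi.trim   <- function(x, c = 2) ifelse(abs(x) < c, x, 0)  # metric trimming
psi.winsor <- function(x, c = 2) pmax(-c, pmin(x, c))      # metric Winsorizing
curve(psi.winsor(x), -5, 5, ylab = "psi(x)")
curve(psi.trim(x), -5, 5, add = TRUE, lty = 2)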
EXAMPLES OF M-ESTIMATORS
• The corresponding −log f is
$$\rho(x) = \begin{cases} x^{2}, & \text{if } |x| < c \\ c\,(2|x| - c), & \text{otherwise} \end{cases}$$
and corresponds to a density with a Gaussian center and double-exponential tails. This estimator is due to Huber.
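• A sketch of Huber's ρ in R (rho.huber is an illustrative name; c = 1.345 is a conventional tuning constant, not stated above). Plotted against x², it is quadratic in the center but grows only linearly in the tails:

rho.huber <- function(x, c = 1.345)
  ifelse(abs(x) < c, x^2, c * (2 * abs(x) - c))
curve(x^2, -4, 4, lty = 2, ylab = "rho(x)")  # least squares, for comparison
curve(rho.huber(x), -4, 4, add = TRUE)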
EXAMPLES OF M-ESTIMATORS
• Tukey’s biweight has
$$\psi(t) = t\left[\,1 - \left(\frac{t}{R}\right)^{2}\right]_{+}^{2}$$
where $[\,\cdot\,]_{+}$ denotes the positive part. This implements ‘soft’ trimming.
The value R = 4.685 gives 95% efficiency at the normal.
• Hampel’s ψ has several linear pieces,
$$\psi(x) = \operatorname{sgn}(x)\begin{cases} |x|, & 0 \le |x| < a \\ a, & a \le |x| < b \\ a\,(c - |x|)/(c - b), & b \le |x| < c \\ 0, & c \le |x| \end{cases}$$
for example, with a = 2.2s, b = 3.7s, c = 5.9s.
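• Both ψ functions are easy to code and compare in R (illustrative names; the scale s in Hampel's constants is set to 1 here):

psi.biweight <- function(t, R = 4.685)
  t * pmax(1 - (t / R)^2, 0)^2              # soft trimming: 0 beyond |t| = R
psi.hampel <- function(x, a = 2.2, b = 3.7, c = 5.9) {
  u <- abs(x)
  sign(x) * ifelse(u < a, u,
            ifelse(u < b, a,
            ifelse(u < c, a * (c - u) / (c - b), 0)))
}
curve(psi.biweight(x), -8, 8, ylab = "psi(x)")
curve(psi.hampel(x), -8, 8, add = TRUE, lty = 2)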
ROBUST REGRESSION
• Robust regression procedures dampen the influence of outlying cases, as compared with
ordinary least squares, in an effort to provide a better fit for the majority of cases.
• LEAST ABSOLUTE RESIDUALS (LAR) REGRESSION: estimates the
regression coefficients by minimizing the sum of the absolute deviations of
the Y observations from their fitted values:
$$\min_{\beta}\;\sum_{i=1}^{n}\left|\, Y_i - \beta_0 - \beta_1 X_{i,1} - \cdots - \beta_{p-1} X_{i,p-1} \,\right|$$
Since absolute deviations rather than squared ones are involved, LAR
places less emphasis on outlying observations than does the method of
least squares. The residuals ordinarily will not sum to 0, and the solution
for the estimated coefficients may not be unique.
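• LAR is median (L1) regression, so one way to fit it in R is quantile regression at τ = 0.5 via the quantreg package (dat, y, and x1 are placeholder names):

library(quantreg)
fit.lar <- rq(y ~ x1, tau = 0.5, data = dat)  # minimizes sum |residual|
coef(fit.lar)
sum(residuals(fit.lar))                       # need not be 0, unlike OLS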
ROBUST REGRESSION
• ITERATIVELY REWEIGHTED LEAST SQUARES (IRLS) ROBUST
REGRESSION: uses a weighted least squares procedure:
$$\min_{\beta}\;\sum_{i=1}^{n} w_i \left( Y_i - \beta_0 - \beta_1 X_{i,1} - \cdots - \beta_{p-1} X_{i,p-1} \right)^{2}$$
This regression uses weights based on how far outlying a case is, as
measured by the residual for that case. The weights are revised with
each iteration until a robust fit has been obtained.
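• The mechanics can be sketched by hand in R with Huber weights (x and y are placeholder vectors; MASS::rlm automates this, including the scale estimation):

fit <- lm(y ~ x)                    # start from the OLS fit
for (it in 1:25) {
  r <- residuals(fit)
  s <- mad(r)                       # robust scale of the residuals
  w <- pmin(1, 1.345 / abs(r / s))  # Huber weights from scaled residuals
  fit <- lm(y ~ x, weights = w)     # refit by weighted least squares
}
coef(fit)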
ROBUST REGRESSION
• LEAST MEDIAN OF SQUARES (LMS) REGRESSION: minimizes the median,
rather than the sum, of the squared deviations:
$$\min_{\beta}\;\operatorname*{median}_{i}\left( Y_i - \beta_0 - \beta_1 X_{i,1} - \cdots - \beta_{p-1} X_{i,p-1} \right)^{2}$$
• Other robust regression methods: Some involve trimming extreme
squared deviations before applying LSE, others are based on ranks.
Many of the robust regression procedures require intensive
computing.
EXAMPLE
• This data set gives n = 24 observations of the annual number of
telephone calls made in Belgium (calls, in millions of calls), with the
year recorded by its last two digits (year); see Rousseeuw and Leroy (1987)
and Venables and Ripley (2002). As can be noted in the figure, there are
several outliers in the y-direction in the late 1960s.
• Let us start the analysis with the classical OLS fit.
> library(MASS)   # provides the phones data and rlm()
> data(phones)
> attach(phones)
> plot(year,calls)
> fit.ols <- lm(calls~year)
> summary(fit.ols,cor=F)
..
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -260.059 102.607 -2.535 0.0189 *
year 5.041 1.658 3.041 0.0060 **
Residual standard error: 56.22 on 22 degrees of freedom
Multiple R-Squared: 0.2959, Adjusted R-squared: 0.2639
F-statistic: 9.247 on 1 and 22 DF, p-value: 0.005998
> abline(fit.ols$coef)
> par(mfrow=c(1,4))
> plot(fit.ols,1:2)
> plot(fit.ols,4)
> hmat.p <- model.matrix(fit.ols)
> h.phone <- hat(hmat.p)
> cook.d <- cooks.distance(fit.ols)
> plot(h.phone/(1-h.phone),cook.d,xlab="h/(1-h)",ylab="Cook distance")
• To take into account the observations with large residuals, i.e. the
outliers in the late 1960s, consider a robust regression based on Huber-type estimates:
> fit.hub <- rlm(calls~year,maxit=50)
> fit.hub2 <- rlm(calls~year,scale.est="proposal 2")
> summary(fit.hub,cor=F)
..
Coefficients:
Value Std. Error t value
(Intercept) -102.6222 26.6082 -3.8568
year 2.0414 0.4299 4.7480
Residual standard error: 9.032 on 22 degrees of freedom
> summary(fit.hub2,cor=F)
..
Coefficients:
Value Std. Error t value
(Intercept) -227.9250 101.8740 -2.2373
year 4.4530 1.6461 2.7052
Residual standard error: 57.25 on 22 degrees of freedom
> abline(fit.hub$coef,lty=2)
> abline(fit.hub2$coef,lty=3)
• From these results, and also from the previous plot, we note that there are
some differences from the OLS estimates; this is particularly true for the Huber-type
estimator with MAD scale. Consider again some classic diagnostic plots for the
robust fit: the plot of the observed values versus the fitted values, the plot of the
residuals versus the fitted values, the normal QQ-plot of the residuals, and the fit
weights of the robust estimator. Note that there are some observations with low
Huber-type weights that were not identified by the classical Cook’s statistic.
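• A sketch of those diagnostic plots (this assumes the fit.hub object above; rlm stores the final IRLS weights in the w component):

> par(mfrow=c(1,4))
> plot(fitted(fit.hub),calls); abline(0,1)
> plot(fitted(fit.hub),residuals(fit.hub))
> qqnorm(residuals(fit.hub)); qqline(residuals(fit.hub))
> plot(year,fit.hub$w,ylab="Huber weight")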