ROBUST COVARIANCES Common Risk versus Specific Risk Outliers R. Douglas Martin Professor of Applied Mathematics Director of Computational Finance Program University of Washington doug@amath.washington.edu R-Finance Conference 2013 Chicago, May 17-18 5/30/2012 1 R Packages and Code Used R robust package PerformanceAnalytics package Global minimum variance portfolios with constraints – GmvPortfolios.r: gmv, gmv.mcd, gmv.qc, etc. Backtesting – btShell.portopt.r: btTimes, backtet.weight – gmvlo & gmvlo.robust.r 2 Robust Covariance Uses in Finance Asset returns EDA, multi-D outlier detection and portfolio unusual movement alerts – SM (2005), MGC (2010), Martin (2012) Data cleaning pre-processing – BPC (2008) Reverse stress testing – Example to follow Robust mean-variance portfolio optimization – Is it usefull ???? If so, which method ???? 3 Robust vs. Classical Correlations (Two assets in a larger fund-of-funds portfolio) -0.04 0.00 0.04 Pfand -0.04 0.00 0.04 USHYHinDM Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4 Q1 Q2 Q3 1995 1996 1997 1998 1999 2000 2001 4 0.06 Tolerance Ellipses (95%) 0.04 CLASSIC CORR. = .30 0.0 -0.02 ROBUST CORR. = .65 -0.04 Pfand 0.02 What you get from every stats package. Gives an overly optimistic view of diversification benefit! ROBUST CLASSICAL -0.04 -0.02 0.0 0.02 A more realistic view of a lower diversification benefit! 0.04 USHYHinDM 5 Hedge Fund Returns Example 2000 2002 2004 0.00 0.00 F5 F6 -0.1 0.0 0.00 0.0 0.1 0.05 0.1 F4 F3 0.05 0.05 0.00 -0.05 2003 F2 0.05 F1 2001 F8 2000 2001 2002 2003 2004 -0.05 0.0 0.00 0.1 0.05 F7 2000 2001 2002 2003 2004 6 F8 F7 F6 F5 F4 F3 F2 F1 Hedge Fund Returns Example F1 F2 0.26 0.32 0.3 F3 0.29 0.3 0.23 F4 0.34 0.44 0.28 0.45 0.45 0.31 F5 0.05 0.26 -0.13 0.2 0.06 0.36 0.01 0.21 F6 0.36 0.34 0.32 -0.15 0.5 0.36 0.3 0.1 -0.3 0.02 F7 0.37 0.16 0.35 -0.04 0.53 0.24 0.24 -0.01 0.16 -0.31 0.4 0.26 F8 0.14 0.2 -0.03 0.06 0.44 0.16 0.09 0.13 -0.01 -0.19 0.51 0.28 ROBUST CLASSICAL 0.38 0.33 7 Portfolio Unusual Movement Alerts Mahalanobis Squared Distance (MSD) 2 dt 1 ˆ ˆ rt μ Σ rt μˆ Crucial to use a robust covariance matrix estimate Retrospective analysis Dynamic alerts Σ̂ ! 8 Commodities Example (see Appendix A of Martin, Clark and Green, 2009) 2004 2005 2006 2007 2008 2009 HOGS COPPER COFFEE SILVER SUGAR OJ PLANTINUM OILC -0.2 0.0 0.2 -0.2 0.0 0.2 -0.2 0.0 0.2 CATTLE 2004 2005 2006 2007 2008 2009 2004 2005 2006 2007 2008 2009 9 Classical Alerts Robust Alerts 0 10 20 CLASSICAL 30 40 50 60 ROBUST 7 58 COMMODITY RETURNS DISTANCES Unreliable alerts! 57 6 9 28 5052 5 57 60 59 11 4 3 2 1 0 10 20 30 40 50 60 Index 10 library(xts) library(robust) library(lattice) ret = read.zoo("commodities9.csv",sep=",",header = T,format = "%m/%d/%Y") ret = as.xts(ret) ret = ret['2004-01-31/2008-12-31',] xyplot(ret,layout = c(3,3)) # Not the same as slide data = coredata(ret) cov.fm <- fit.models(CLASSICAL = covMLE(data), ROBUST = covRob(data,estim = "mcd",quan = .7)) plot(cov.fm,which.plots = 3) 11 Robust Covariance Choices in R “robust” Min. covariance determinant (MCD) M-estimate (M) affine equivariant Donoho-Stahel (DS) Pairwise estimates (PW) not affine equivariant – Quadrant correlation and GK versions – Positive definite (Maronna & Zamar, 2002) For details see the R robust package reference manual. See also Chapter 10.2.2 of Pfaff (2013) Financial Risk Modeling and Portfolio Optimization with R, Wiley. 12 The Usual Robustness Outliers Model R T n table of returns with rows rt rt F (1 ) N (μ, Σ) H A natural model for common factor outliers – Market crashes 1. Probability of a row r containing an outlier is independent of the dimension n, so the majority of the rows of R are outlier-free. 2. Fraction of rows that have outliers is unchanged under affine transformations, so use affine equivariant estimators, e.g., MCD 13 Independent Outliers Across Assets (IOA) Let Bi 1 (0) if asset i is (is not) an outlier. (AKMZ, 2002 and AVYZ, 2009) Assume B1 ,B2 , ,Bn are independent with P( Bi ) i A natural model for specific risk outliers Suppose for example that P ( Bi ) , i 1, 2, , n . Then the probability of a row rt not containing an outlier is (1 ) n, which decreases rapidly with increasing p . E.g., for .05 : # of assets n prob. clean row 5 10 15 20 .77 .60 .46 .36 N.B. Affine transformations increase the percent of rows with outliers, so no need to restrict attention to affine equivariant estimators. 14 Choice of Outliers Model and Estimator Both are useful, but: The usual outliers model handles market events outliers and for these an affine equivariant robust covariance matrix estimator will suffice, e.g., MCD. The independent outliers across assets model is needed for specific risk outliers, and for these one may need to use a pairwise estimator to avoid breakdown! Goal: Determine when pairwise robust covariance matrix estimator performs better than MCD, etc. 15 Asset Class & Frequency Considerations Specific risk outliers are more frequent in the case of: Higher returns frequency, e.g., weekly and daily Smaller market-cap stocks Hedge funds Commodities … ??? 16 Non-Normality Increases with Frequency Comparison of Non-normality over Different Time Scales 11403 W EEKLY 11403 D AILY 0.2 -0.3 -0.2 -0.2 -0.1 0.0 0.0 0.2 0.0 -0.2 0.2 0.4 0.3 0.1 0.2 0.0 0.1 0.2 0.05 0.05 -0.05 -0.1 -0.10 -0.2 -0.1 0.0 0.0 0.0 0.1 0.0 -0.2 -0.05 20220 D AILY 0.1 0.2 0.0 20220 W EEKLY 0.2 20220 MON TH LY -0.1 0.10 0.0 -0.4 -0.4 -0.4 -0.2 -0.2 Quantiles of sample data 0.1 0.4 11403 MON TH LY -0.10 0.0 0.05 0.10 -0.06 -0.02 0.02 0.06 Quantiles of fitted normal distribution 17 Outlier Detection Rule for Counting ̂ optimal 90% efficient bias robust location estimate* ŝ associated robust scale estimate* Outliers: returns outside of ( ˆ sˆ 2.83, ˆ sˆ 2.83) Probability of normal return being an outlier: 0.5% * Use lmRob with intercept only in R package robust 18 Empirical Study of IOA Model Validity Four market-cap groups of 20 stocks, weekly returns 1997 – 2010 in three regimes: – 1997-01-07 to 2002-12-31 – 2003-01-07 to 2008-01-01 – 2008-01-08 to 2010-12-28 1. Estimate outlier probability i for each asset, and hence the probability 1 i that a row is free of outliers under the IOA model. 2. Directly estimate the probability row that a row has at least one outlier. 3. Compare results from 1 and 2 across market-caps and regimes. n i 1 19 -0.1 0.0 0.1 WTS 0.2 -0.2 0.0 0.1 HGIC 0.2 -0.2 0.0 0.1 0.2 BWINB -0.3 -0.1 0.1 PLXS 0.3 4 of the 20 Small-Caps for Entire History SMALL-CAPS 2000 2005 Index 2010 20 Small-Caps Outliers in Third Regime % OUTLIERS IN EACH ASSET 8 4 6 COUNT 4 0 2 2 0 PERCENT 6 10 8 12 # OF ASSETS WITH AN OUTLIER 1 3 5 7 9 11 ASSETS 14 17 20 2008 2009 2010 2011 Index 21 Large-Caps Outliers in Third Regime % OUTLIERS IN EACH ASSET 10 COUNT 4 5 3 2 0 1 0 PERCENT 5 6 15 # OF ASSETS WITH AN OUTLIER 1 3 5 7 9 11 ASSETS 14 17 20 2008 2009 2010 2011 Index 22 Evaluation of IOA Model for Weekly Returns 1997-01-07 to 2002-12-31 MICRO % Clean Rows IOA Model 32 % Clean Rows Direct Count 37 SMALL 39 48 MID 46 58 LARGE 59 69 2003-01-07 to 2008-01-01 MICRO % Clean Rows IOA Model 33 % Clean Rows Direct Count 37 SMALL 46 49 MID 52 62 LARGE 57 66 2008-01-08 to 2010-12-28 MICRO % Clean Rows IOA Model 23 % Clean Rows Direct Count 43 SMALL 31 48 MID 39 57 LARGE 46 62 23 gmv.lo gmv.lo.mcd gmv.lo.qc mkt -0.1 -0.15 0.00 1.0 1.5 2.0 LONG-ONLY GMV,GMV.MCD, GMV.PW, MKT -0.5 Drawdown Weekly Return Cumulative Return 2.5 Weekly Returns, Window = 60, Rebalance = Weekly 1998-03-03 2001-07-03 2005-01-04 Date 2008-07-01 24 Weekly Returns, Window = 60, Rebalance = Monthly 2.0 1.5 -0.1 -0.15 0.00 1.0 gmv.lo gmv.lo.mcd gmv.lo.qc mkt -0.5 Drawdown Weekly Return Cumulative Return LONG-ONLY GMV,GMV.MCD, GMV.PW, MKT 1998-03-03 2001-07-03 2005-01-04 2008-07-01 Date 25 HHI Diversification Index (sum-of-squared wts.) dvi.gmv 0.9 0.8 0.7 0.6 0.5 0.4 dvi.gmv.mcd 0.9 0.8 0.7 0.6 0.5 0.4 dvi.gmv.qc 0.9 0.8 0.7 0.6 0.5 0.4 1998 2000 2002 2004 2006 Time 2008 2010 26 Back-Test Code library(PerformanceAnalytics) library(robust) source("GmvPortfolios.r") source("btShell.portopt.r") source("btTimes.r") # Diversification Index Function dvi =function(x){1-sum(x^2)} # Input returns ret.all = read.zoo("smallcap_weekly.csv",sep=",",header = T,format = "%m/%d/%Y") mkt = ret.all[,"VWMKT"] ret = ret.all[,1:20] n.assets <- ncol(ret) # get returns dates all.date = index(ret) 27 # compute the backtest times t.mw <- btTimes.mw(all.date, 4, 60) # backtesting weight.gmv.lo <- backtest.weight(ret, t.mw,gmv.lo)$weight weight.gmv.lo.mcd <- backtest.weight(ret, t.mw, gmv.lo.mcd)$weight weight.gmv.lo.qc <- backtest.weight(ret, t.mw, gmv.lo.qc)$weight # The Diversification Index Plots gmvdat = coredata(weight.gmv.lo) gmvdat.mcd = coredata(weight.gmv.lo.mcd) gmvdat.qc = coredata(weight.gmv.lo.qc) dvi.gmv = apply(gmvdat,1,dvi) dvi.gmv.mcd = apply(gmvdat.mcd,1,dvi) dvi.gmv.qc = apply(gmvdat.qc,1,dvi) dvi.all = cbind(dvi.gmv,dvi.gmv.mcd,dvi.gmv.qc) dvi.all.ts = as.zoo(dvi.all) index(dvi.all.ts) = index(weight.gmv.lo) xyplot(dvi.all.ts, scales = list(y="same")) 28 # compute cumulative returns of portfolio gmv.lo <- Return.rebalancing(ret, weight.gmv.lo) gmv.lo.mcd <- Return.rebalancing(ret, weight.gmv.lo.mcd) gmv.lo.qc <- Return.rebalancing(ret, weight.gmv.lo.qc) # combined returns ret.comb <- na.omit(merge(gmv.lo, gmv.lo.mcd, gmv.lo.qc, mkt, all=F)) # return analysis charts.PerformanceSummary(ret.comb,wealth.index = T, lty = c(1,1,1,4),colorset = c("black","red","blue","black"), cex.legend = 1.3,cex.axis = 1.3, cex.lab = 1.5, main = "Weekly Returns, Window = 60, Rebalance = Monthly \n LONG-ONLY GMV,GMV.MCD, GMV.PW, MKT") 29 “Statistics is a science in my opinion, and it is no more a branch of mathematics than are physics, chemistry and economics; for if its methods fail the test of experience – not the test of logic – they will be discarded” - J. W. Tukey Thank You! 30 31 “Statistics is a science in my opinion, and it is no more a branch of mathematics than are physics, chemistry and economics; for if its methods fail the test of experience – not the test of logic – they will be discarded” - J. W. Tukey Thank You! Proprietary, for use only by permission. References Alqallaf, Konis, Martin and Zamar (2002). “Scalable robust covariance and correlation estimates for data mining”, Proceedings of the eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 14-23. ACM. Scherer and Martin (2005). Modern Portfolio Optimization, Chapter 6.6 – 6.9, Springer Maronna, Martin, and Yohai (2006). Robust Statistics : Theory and Methods, Wiley. Boudt, Peterson & Croux (2008). “Estimation and Decomposition of Downside Risk for Portfolios with Non-Normal Returns”, Journal of Risk, 11, No. 2, pp. 79-103. Alqallaf, Van Aelst, Yohai and Zamar (2009). “Propagation of Outliers in Multivariate Data”, Annals of Statistics, 37(1). p.311-331. Martin, R. D., Clark, A and Green, C. G. (2010). “Robust Portfolio Construction”, in Handbook of Portfolio Construction: Contemporary Applications of Markowitz Techniques, J. B. Guerard, Jr., ed., Springer. Martin, R. D. (2012). “Robust Statistics in Portfolio Construction”, Tutorial Presentation, R-Finance 2012, Chicago, 32