ROBUST COVARIANCES
Common Risk versus Specific Risk Outliers
R. Douglas Martin
Professor of Applied Mathematics
Director of Computational Finance Program
University of Washington
doug@amath.washington.edu
R-Finance Conference 2013
Chicago, May 17-18
5/30/2012
R Packages and Code Used
• R robust package
• PerformanceAnalytics package
• Global minimum variance portfolios with constraints
– GmvPortfolios.r: gmv, gmv.mcd, gmv.qc, etc.
• Backtesting
– btShell.portopt.r: btTimes, backtest.weight
– gmvlo & gmvlo.robust.r

Robust Covariance Uses in Finance
• Exploratory data analysis (EDA) of asset returns, multi-dimensional outlier detection and portfolio unusual movement alerts
– SM (2005), MGC (2010), Martin (2012)
• Data cleaning pre-processing
– BPC (2008)
• Reverse stress testing
– Example to follow
• Robust mean-variance portfolio optimization
– Is it useful? If so, which method?

Robust vs. Classical Correlations
(Two assets in a larger fund-of-funds portfolio)
[Figure: time series of returns for Pfand and USHYHinDM, quarterly, 1995 to 2001; vertical axes from -0.04 to 0.04]

Tolerance Ellipses (95%)
[Figure: scatter plot of Pfand versus USHYHinDM returns with CLASSICAL and ROBUST 95% tolerance ellipses]
CLASSIC CORR. = .30
What you get from every stats package. Gives an overly optimistic view of diversification benefit!
ROBUST CORR. = .65
A more realistic view of a lower diversification benefit!

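The gap between the classical and robust correlations can be reproduced on simulated data. The sketch below is only an illustration, not the fund-of-funds data from the slide: it assumes a bivariate normal bulk with correlation 0.65 plus five planted outliers, and it uses cov.rob from MASS (an MCD fit) rather than covRob from robust so that it runs on a stock R installation.

```r
library(MASS)  # mvrnorm() and cov.rob() ship with R's recommended packages

set.seed(42)
n <- 100
# Bulk of the data: bivariate normal with true correlation 0.65 (assumed value)
Sigma <- matrix(c(1, 0.65, 0.65, 1), 2, 2)
x <- mvrnorm(n, mu = c(0, 0), Sigma = Sigma)
# Replace five rows with outliers lying against the main trend
x[1:5, ] <- cbind(rnorm(5, 4, 0.3), rnorm(5, -4, 0.3))

# Classical sample correlation, pulled down by the outliers
classic.corr <- cor(x)[1, 2]
# Robust correlation from a minimum covariance determinant fit
robust.corr <- cov.rob(x, cor = TRUE, method = "mcd")$cor[1, 2]

round(c(classic = classic.corr, robust = robust.corr), 2)
```

A handful of rows is enough to move the classical estimate a long way from the correlation of the bulk, while the MCD-based estimate stays close to it.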
Hedge Fund Returns Example
[Figure: time series of returns for eight hedge funds, F1 to F8, 2000 to 2004]

Hedge Fund Returns Example
[Figure: pairwise correlations among hedge funds F1 to F8, with the ROBUST and CLASSICAL estimates displayed for each pair]

Portfolio Unusual Movement Alerts
Mahalanobis Squared Distance (MSD)
$$d_t^2 = (r_t - \hat{\mu})' \, \hat{\Sigma}^{-1} (r_t - \hat{\mu})$$
Crucial to use a robust covariance matrix estimate $\hat{\Sigma}$!
• Retrospective analysis
• Dynamic alerts

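A minimal sketch of MSD-based alerts on simulated returns follows. The data, the planted outlier rows and the chi-squared 97.5% cutoff are assumptions made for illustration; the slides do not state a cutoff. The robust fit uses cov.rob from MASS as a stand-in for covRob.

```r
library(MASS)  # cov.rob()

set.seed(1)
n.obs <- 120
n.assets <- 4
# Simulated returns: independent normals with 2% volatility (assumed)
ret <- matrix(rnorm(n.obs * n.assets, sd = 0.02), n.obs, n.assets)
# Plant ten rows in which all assets move together by a large amount
bad <- 1:10
ret[bad, ] <- ret[bad, ] + 0.15

# Classical MSD: sample mean and sample covariance
d2.classic <- mahalanobis(ret, colMeans(ret), cov(ret))
# Robust MSD: MCD location and scatter
rob <- cov.rob(ret, method = "mcd")
d2.robust <- mahalanobis(ret, rob$center, rob$cov)

# Alert when the squared distance exceeds a chi-squared quantile (assumed rule)
cutoff <- qchisq(0.975, df = n.assets)
alerts.classic <- which(d2.classic > cutoff)
alerts.robust <- which(d2.robust > cutoff)
```

Because the planted rows inflate the classical covariance estimate, their classical distances are compressed toward the cutoff, whereas the robust distances separate them clearly from the bulk.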
Commodities Example
(see Appendix A of Martin, Clark and Green, 2009)
[Figure: time series of returns for nine commodities (HOGS, COPPER, COFFEE, SILVER, SUGAR, OJ, PLANTINUM, OILC, CATTLE), 2004 to 2009; vertical axes from -0.2 to 0.2]

Classical Alerts
Robust Alerts
[Figure: COMMODITY RETURNS DISTANCES, distance versus observation index (0 to 60) in two panels, CLASSICAL and ROBUST, with flagged observations labeled by index]
Unreliable alerts!

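library(xts)      # also loads zoo, which provides read.zoo()
library(robust)   # covRob(); also loads fit.models
library(lattice)  # xyplot()
# Read the commodity returns and index them by date
ret = read.zoo("commodities9.csv", sep = ",", header = TRUE,
               format = "%m/%d/%Y")
ret = as.xts(ret)
ret = ret['2004-01-31/2008-12-31', ]
xyplot(ret, layout = c(3,3)) # Not the same as slide
data = coredata(ret)
# Fit classical and robust (MCD) covariance estimates side by side.
# covMLE is the classical estimator as named at the time of the talk; if it is
# unavailable in your version of robust, covClassic plays the same role.
cov.fm <- fit.models(CLASSICAL = covMLE(data),
                     ROBUST = covRob(data, estim = "mcd", quan = .7))
# Compare the two fits graphically
plot(cov.fm, which.plots = 3)
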
Robust Covariance Choices in R “robust”
Affine equivariant:
• Minimum covariance determinant (MCD)
• M-estimate (M)
• Donoho-Stahel (DS)
Not affine equivariant:
• Pairwise estimates (PW)
– Quadrant correlation and GK (Gnanadesikan-Kettenring) versions
– Positive definite (Maronna & Zamar, 2002)
For details see the R robust package reference manual. See also Chapter 10.2.2 of Pfaff (2013), Financial Risk Modeling and Portfolio Optimization with R, Wiley.

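To make the pairwise idea concrete, here is a sketch of a quadrant-correlation matrix in base R. The helper names quadrant.cor and pairwise.qc are hypothetical, and this is not the implementation inside the robust package. Each entry uses only two columns at a time, so an outlier in one asset affects only that asset's row and column. The assembled matrix is not guaranteed to be positive definite, which is the issue the Maronna & Zamar (2002) adjustment listed above addresses.

```r
# Quadrant correlation for one pair of series (hypothetical helper, base R only)
quadrant.cor <- function(x, y) {
  sx <- sign(x - median(x))
  sy <- sign(y - median(y))
  q <- mean(sx * sy)   # raw quadrant statistic in [-1, 1]
  sin(pi * q / 2)      # transform that makes it consistent for normal data
}

# Pairwise matrix: apply the estimator to every pair of columns
pairwise.qc <- function(X) {
  p <- ncol(X)
  R <- diag(p)
  for (i in seq_len(p - 1)) {
    for (j in (i + 1):p) {
      R[i, j] <- R[j, i] <- quadrant.cor(X[, i], X[, j])
    }
  }
  dimnames(R) <- list(colnames(X), colnames(X))
  R
}

set.seed(11)
X <- matrix(rnorm(300), 100, 3, dimnames = list(NULL, c("A", "B", "C")))
X[, "B"] <- 0.8 * X[, "A"] + 0.6 * X[, "B"]  # make A and B correlated
X[5, "C"] <- 25                              # one asset-specific outlier
R.qc <- pairwise.qc(X)
```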
The Usual Robustness Outliers Model
$R$: $T \times n$ table of returns with rows $r_t$
$$r_t \sim F = (1 - \varepsilon)\, N(\mu, \Sigma) + \varepsilon H$$
• A natural model for common factor outliers
– Market crashes
1. Probability of a row $r_t$ containing an outlier is independent of the dimension $n$, so the majority of the rows of $R$ are outlier-free.
2. Fraction $\varepsilon$ of rows that have outliers is unchanged under affine transformations, so use affine equivariant estimators, e.g., MCD.

Independent Outliers Across Assets (IOA)
Let $B_i = 1\ (0)$ if the return of asset $i$ is (is not) an outlier. (AKMZ, 2002 and AVYZ, 2009)
Assume $B_1, B_2, \ldots, B_n$ are independent with $P(B_i = 1) = \varepsilon_i$.
• A natural model for specific risk outliers
Suppose for example that $P(B_i = 1) = \varepsilon$, $i = 1, 2, \ldots, n$. Then the probability of a row $r_t$ not containing an outlier is $(1 - \varepsilon)^n$, which decreases rapidly with increasing $n$. E.g., for $\varepsilon = .05$:
# of assets n        5     10    15    20
prob. clean row     .77   .60   .46   .36
N.B. Affine transformations increase the percent of rows with outliers, so no need to restrict attention to affine equivariant estimators.

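The clean-row probabilities in the table follow directly from $(1 - \varepsilon)^n$. A short check in base R (the variable names are illustrative):

```r
eps <- 0.05                      # per-asset outlier probability from the slide
n.assets <- c(5, 10, 15, 20)     # portfolio sizes from the slide
prob.clean <- (1 - eps)^n.assets # probability that a row has no outlier

data.frame(n.assets, prob.clean = round(prob.clean, 2))
```

Even a modest 5% per-asset outlier rate leaves fewer than half of the rows clean once the portfolio holds 15 or more assets, which is why estimators that discard whole rows can break down under this model.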
Choice of Outliers Model and Estimator
Both are useful, but:
• The usual outliers model handles market-event outliers, and for these an affine equivariant robust covariance matrix estimator will suffice, e.g., MCD.
• The independent outliers across assets model is needed for specific risk outliers, and for these one may need to use a pairwise estimator to avoid breakdown!
• Goal: Determine when a pairwise robust covariance matrix estimator performs better than MCD, etc.

Asset Class & Frequency Considerations
Specific risk outliers are more frequent in the case of:
• Higher returns frequency, e.g., weekly and daily
• Smaller market-cap stocks
• Hedge funds
• Commodities
• … ???

Non-Normality Increases with Frequency
[Figure: Comparison of Non-normality over Different Time Scales. Normal QQ-plots for two return series labeled 11403 and 20220 at DAILY, WEEKLY and MONTHLY frequencies; axes: quantiles of sample data and quantiles of fitted normal distribution]

Outlier Detection Rule for Counting
$\hat{\mu}$ = optimal 90% efficient bias robust location estimate*
$\hat{s}$ = associated robust scale estimate*
Outliers: returns outside of $(\hat{\mu} - 2.83\,\hat{s},\ \hat{\mu} + 2.83\,\hat{s})$
Probability of normal return being an outlier: 0.5%
* Use lmRob with intercept only in R package robust

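The counting rule can be sketched in base R. Note the substitution: the slide obtains the location and scale from an intercept-only lmRob fit, whereas this sketch uses the median and MAD as simple robust stand-ins, so flagged counts may differ slightly from the author's. The function name flag.outliers and the simulated series are hypothetical.

```r
# Flag returns outside mu.hat +/- k * s.hat.
# Stand-in estimates: median and MAD, not the lmRob fit used on the slide.
flag.outliers <- function(r, k = 2.83) {
  mu.hat <- median(r)
  s.hat <- mad(r)  # scaled so that it estimates sigma for normal data
  abs(r - mu.hat) > k * s.hat
}

# Nominal probability that a normal return falls outside the band
p.normal <- 2 * pnorm(-2.83)

# Simulated weekly returns with two planted outliers at the end
set.seed(7)
r <- c(rnorm(200, mean = 0, sd = 0.02), 0.15, -0.12)
out <- flag.outliers(r)
which(out)  # positions of flagged returns
mean(out)   # estimated outlier probability for this asset
```

The value of p.normal is roughly 0.005, which matches the 0.5% figure quoted for the 2.83 multiplier.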
Empirical Study of IOA Model Validity
• Four market-cap groups of 20 stocks, weekly returns
• 1997 – 2010 in three regimes:
– 1997-01-07 to 2002-12-31
– 2003-01-07 to 2008-01-01
– 2008-01-08 to 2010-12-28
1. Estimate outlier probability $\varepsilon_i$ for each asset, and hence the probability $\prod_{i=1}^{n} (1 - \varepsilon_i)$ that a row is free of outliers under the IOA model.
2. Directly estimate the probability $\varepsilon_{\text{row}}$ that a row has at least one outlier.
3. Compare results from 1 and 2 across market-caps and regimes.

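The three steps can be sketched on a matrix of outlier flags. Everything here is simulated under assumed values (300 weeks, 20 assets, a common 3% outlier rate, independence across assets); with real data the flags would come from the outlier detection rule applied to each asset.

```r
set.seed(3)
n.weeks <- 300
n.assets <- 20
eps.true <- 0.03  # hypothetical common outlier rate

# flags[t, i] is TRUE when asset i has an outlier in week t
# (simulated independently across assets, i.e. the IOA model holds)
flags <- matrix(runif(n.weeks * n.assets) < eps.true, n.weeks, n.assets)

# Step 1: per-asset outlier rates and the IOA-implied clean-row probability
eps.hat <- colMeans(flags)
clean.ioa <- prod(1 - eps.hat)

# Step 2: direct estimate from rows with no flagged asset
clean.direct <- mean(rowSums(flags) == 0)

# Step 3: compare the two estimates
round(100 * c(IOA = clean.ioa, Direct = clean.direct))
```

When the flags really are independent the two numbers should be close. A direct count well above the IOA value would suggest that outliers cluster in the same weeks rather than arriving independently across assets.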
4 of the 20 Small-Caps for Entire History
[Figure: SMALL-CAPS, return time series for WTS, HGIC, BWINB and PLXS]

Small-Caps Outliers in Third Regime
[Figure, two panels: "% OUTLIERS IN EACH ASSET" (PERCENT by ASSETS, 1 to 20) and "# OF ASSETS WITH AN OUTLIER" (COUNT over time, 2008 to 2011)]

Large-Caps Outliers in Third Regime
[Figure, two panels: "% OUTLIERS IN EACH ASSET" (PERCENT by ASSETS, 1 to 20) and "# OF ASSETS WITH AN OUTLIER" (COUNT over time, 2008 to 2011)]

Evaluation of IOA Model for Weekly Returns

1997-01-07 to 2002-12-31
                             MICRO   SMALL    MID   LARGE
% Clean Rows IOA Model          32      39     46      59
% Clean Rows Direct Count       37      48     58      69

2003-01-07 to 2008-01-01
                             MICRO   SMALL    MID   LARGE
% Clean Rows IOA Model          33      46     52      57
% Clean Rows Direct Count       37      49     62      66

2008-01-08 to 2010-12-28
                             MICRO   SMALL    MID   LARGE
% Clean Rows IOA Model          23      31     39      46
% Clean Rows Direct Count       43      48     57      62

Weekly Returns, Window = 60, Rebalance = Weekly
LONG-ONLY GMV, GMV.MCD, GMV.PW, MKT
[Figure: performance summary with Cumulative Return, Weekly Return and Drawdown panels for gmv.lo, gmv.lo.mcd, gmv.lo.qc and mkt; date axis from 1998-03-03 to 2008-07-01]

Weekly Returns, Window = 60, Rebalance = Monthly
LONG-ONLY GMV, GMV.MCD, GMV.PW, MKT
[Figure: performance summary with Cumulative Return, Weekly Return and Drawdown panels for gmv.lo, gmv.lo.mcd, gmv.lo.qc and mkt; date axis from 1998-03-03 to 2008-07-01]

HHI Diversification Index (sum-of-squared wts.)
[Figure: time series of dvi.gmv, dvi.gmv.mcd and dvi.gmv.qc, 1998 to 2010; vertical axes from 0.4 to 0.9]

Back-Test Code
library(PerformanceAnalytics)  # Return.rebalancing(), charts.PerformanceSummary()
library(robust)
library(lattice)               # xyplot()
source("GmvPortfolios.r")
source("btShell.portopt.r")
source("btTimes.r")
# Diversification index function: one minus the sum of squared weights
dvi = function(x){1-sum(x^2)}
# Input returns
ret.all = read.zoo("smallcap_weekly.csv", sep = ",", header = TRUE,
                   format = "%m/%d/%Y")
mkt = ret.all[,"VWMKT"]
ret = ret.all[,1:20]
n.assets <- ncol(ret)
# get returns dates
all.date = index(ret)
# compute the backtest times (the arguments 4 and 60 appear to correspond to
# the monthly rebalance and the 60-week window named in the chart title)
t.mw <- btTimes.mw(all.date, 4, 60)
# backtesting: rolling weights for the classical, MCD and quadrant-correlation GMV portfolios
weight.gmv.lo <- backtest.weight(ret, t.mw, gmv.lo)$weight
weight.gmv.lo.mcd <- backtest.weight(ret, t.mw, gmv.lo.mcd)$weight
weight.gmv.lo.qc <- backtest.weight(ret, t.mw, gmv.lo.qc)$weight
# The Diversification Index Plots
gmvdat = coredata(weight.gmv.lo)
gmvdat.mcd = coredata(weight.gmv.lo.mcd)
gmvdat.qc = coredata(weight.gmv.lo.qc)
dvi.gmv = apply(gmvdat,1,dvi)
dvi.gmv.mcd = apply(gmvdat.mcd,1,dvi)
dvi.gmv.qc = apply(gmvdat.qc,1,dvi)
dvi.all = cbind(dvi.gmv,dvi.gmv.mcd,dvi.gmv.qc)
dvi.all.ts = as.zoo(dvi.all)
index(dvi.all.ts) = index(weight.gmv.lo)
xyplot(dvi.all.ts, scales = list(y="same"))
# compute cumulative returns of portfolio
gmv.lo <- Return.rebalancing(ret, weight.gmv.lo)
gmv.lo.mcd <- Return.rebalancing(ret, weight.gmv.lo.mcd)
gmv.lo.qc <- Return.rebalancing(ret, weight.gmv.lo.qc)
# combined returns
ret.comb <- na.omit(merge(gmv.lo, gmv.lo.mcd, gmv.lo.qc, mkt, all = FALSE))
# return analysis
charts.PerformanceSummary(ret.comb, wealth.index = TRUE,
    lty = c(1,1,1,4), colorset = c("black","red","blue","black"),
    cex.legend = 1.3, cex.axis = 1.3, cex.lab = 1.5,
    main = "Weekly Returns, Window = 60, Rebalance = Monthly\nLONG-ONLY GMV, GMV.MCD, GMV.PW, MKT")

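The gmv.lo, gmv.lo.mcd and gmv.lo.qc functions come from GmvPortfolios.r, which is not reproduced in the slides. As a rough illustration of what such a function computes, the sketch below gives the unconstrained global minimum variance weights in closed form. It is not the long-only version used in the back-test, which would need a quadratic programming solver, and the function name gmv.weights and the covariance matrix are hypothetical.

```r
# Unconstrained global minimum variance weights: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
# (hypothetical helper; the long-only constraint of the back-test is NOT imposed)
gmv.weights <- function(Sigma) {
  ones <- rep(1, ncol(Sigma))
  w <- solve(Sigma, ones)  # Sigma^{-1} 1 without forming the inverse
  w / sum(w)               # normalize so the weights sum to one
}

# Diversification index as defined in the back-test code
dvi <- function(x) 1 - sum(x^2)

# Hypothetical covariance matrix for three uncorrelated assets
Sigma <- diag(c(0.04, 0.01, 0.0225))
w <- gmv.weights(Sigma)

port.var <- drop(t(w) %*% Sigma %*% w)  # variance of the GMV portfolio
eq.w <- rep(1 / 3, 3)
eq.var <- drop(t(eq.w) %*% Sigma %*% eq.w)  # variance of the equal-weight portfolio

round(w, 3)
c(dvi.gmv = dvi(w), dvi.equal = dvi(eq.w))
```

Passing a robust covariance estimate as Sigma instead of the sample covariance gives the robust variant of the same calculation. The diversification index is highest for equal weights and falls as the weights concentrate in fewer assets.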
“Statistics is a science in my opinion, and it is
no more a branch of mathematics than are
physics, chemistry and economics; for if its
methods fail the test of experience – not the
test of logic – they will be discarded”
- J. W. Tukey
Thank You!
Proprietary, for use only by permission.
References
• Alqallaf, Konis, Martin and Zamar (2002). “Scalable robust covariance and correlation estimates for data mining”, Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 14-23. ACM.
• Scherer and Martin (2005). Modern Portfolio Optimization, Chapter 6.6 – 6.9, Springer.
• Maronna, Martin and Yohai (2006). Robust Statistics: Theory and Methods, Wiley.
• Boudt, Peterson and Croux (2008). “Estimation and Decomposition of Downside Risk for Portfolios with Non-Normal Returns”, Journal of Risk, 11(2), pp. 79-103.
• Alqallaf, Van Aelst, Yohai and Zamar (2009). “Propagation of Outliers in Multivariate Data”, Annals of Statistics, 37(1), pp. 311-331.
• Martin, R. D., Clark, A. and Green, C. G. (2010). “Robust Portfolio Construction”, in Handbook of Portfolio Construction: Contemporary Applications of Markowitz Techniques, J. B. Guerard, Jr., ed., Springer.
• Martin, R. D. (2012). “Robust Statistics in Portfolio Construction”, Tutorial Presentation, R-Finance 2012, Chicago,