Mgmt 237H Lecture #1

Lecture #7
Professor Jason C. Hsu, Ph.D.
(C) Created by Jason C. Hsu for use
by UCLA Anderson and Research
Affiliates, LLC
Mgmt 237H
Admin
Incorporating Signal
Into Portfolios
Portfolio Mathematics
Portfolio weights:
Sequence of portfolio weights over time :
Stock returns:
Portfolio return:
• Note:
• w_t is the vector of weights at the beginning of period t
• r_t is the vector of return over holding period t
• A proper long-only portfolio has weights that sum to 1
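The quantities named on this slide can be written compactly as follows. This is a sketch under assumed notation (N stocks, a prime for transpose, and $R_{p,t}$ for the portfolio return), chosen to agree with the notes above:

```latex
w_t = (w_{1,t}, \dots, w_{N,t})', \qquad
\{ w_t \}_{t=1}^{T}, \qquad
r_t = (r_{1,t}, \dots, r_{N,t})', \qquad
R_{p,t} = w_t' r_t = \sum_{i=1}^{N} w_{i,t}\, r_{i,t}, \qquad
\sum_{i=1}^{N} w_{i,t} = 1 .
```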
Some Simple Portfolios
• Equal weighting:
• Cap-weighting:
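A minimal sketch of the two weighting schemes, assuming equal weighting means 1/N per stock and cap-weighting means weights proportional to market capitalization. The function names and the market caps are hypothetical.

```python
import numpy as np

def equal_weights(n_stocks):
    """Equal weighting: every stock gets weight 1/N."""
    return np.full(n_stocks, 1.0 / n_stocks)

def cap_weights(market_caps):
    """Cap-weighting: weight proportional to market capitalization."""
    caps = np.asarray(market_caps, dtype=float)
    return caps / caps.sum()

# Hypothetical market caps (in $bn) for four stocks.
caps = [400.0, 300.0, 200.0, 100.0]
w_ew = equal_weights(len(caps))
w_cw = cap_weights(caps)
```

Both vectors sum to 1, as required of a long-only portfolio.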
Active Portfolio
Benchmark portfolio:
Active Portfolio:
These long-only portfolios have weights that sum to 1
Active Weights:
*This is a long-short portfolio whose weights sum to 0
Active Portfolio Returns
• $E[R_a] = \alpha + E[R_b]$
• Realized excess return relative to the benchmark: $R_{a,t} - R_{b,t}$
• The arithmetic average of the excess return relative to the benchmark is a
proxy for alpha
Portfolio Risk
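• TE: $TE = \sqrt{x_{aw}' \Sigma x_{aw}}$, the ex ante volatility of the active weights $x_{aw}$ given the stock covariance matrix $\Sigma$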
• TE can also be estimated by the volatility of the realized excess
return
• TE measures the active deviation from the benchmark
• The larger the TE, the larger the deviations in stock weights
from the benchmark stock weights
• A manager with greater conviction will take on larger active
weights, which leads to a larger TE
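As a rough illustration of the two estimates described above, the sketch below takes the average realized excess return as the alpha proxy and its volatility as the realized TE. The monthly returns are made up, and the annualization (times 12 for the mean, times the square root of 12 for the volatility) assumes iid monthly excess returns.

```python
import numpy as np

def active_stats(r_active, r_bench, periods_per_year=12):
    """Return (alpha proxy, realized TE), both annualized."""
    excess = np.asarray(r_active, dtype=float) - np.asarray(r_bench, dtype=float)
    alpha = excess.mean() * periods_per_year
    te = excess.std(ddof=1) * np.sqrt(periods_per_year)
    return alpha, te

# Hypothetical monthly returns for the active portfolio and its benchmark.
r_a = [0.02, -0.01, 0.03, 0.00]
r_b = [0.01, -0.02, 0.02, 0.01]
alpha, te = active_stats(r_a, r_b)
```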
Active Weights for Long Only
Strategy
• The active weights for a long only strategy can be expressed as
a long-short portfolio (active weights sum to 0%).
• The LS portfolio created in the previous sections can be used
as the basis for the active weights.
• Recall that the portfolio can be described by
$W = W_{Bench} + AW$
• So 𝐴𝑊 = 𝛾𝐿𝑆 where the active weight portfolio (AW) is just a
scaled version of the long-short portfolio where 𝛾 is chosen to
satisfy the TE constraint of the long only portfolio.
Active Weights for Long Only
Strategy
• To continue, we need to know the benchmark weight and the
TE budget for the long only portfolio.
• Recall that active weights and ex ante TE have the following
relationship
• Vol (active weight portfolio) = TE
$TE = vol(AW) = \sqrt{x_{aw}' \Sigma x_{aw}}$
where $x_{aw}$ is the vector of active weights and $\Sigma$ is the
covariance matrix for the stocks in the benchmark index
• Since $AW = \gamma LS$, $vol(AW) = \gamma \cdot vol(LS)$, so $\gamma = TE / vol(LS)$
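The scaling step can be sketched as below, assuming the long-short weights, the covariance matrix, and the TE budget are already known. All inputs are hypothetical, and `scale_ls_to_te` is an illustrative name rather than anything defined in the lecture.

```python
import numpy as np

def scale_ls_to_te(ls_weights, cov, te_budget):
    """Scale a long-short portfolio so its ex ante vol equals the TE budget."""
    ls = np.asarray(ls_weights, dtype=float)
    cov = np.asarray(cov, dtype=float)
    vol_ls = np.sqrt(ls @ cov @ ls)
    gamma = te_budget / vol_ls
    return gamma, gamma * ls

# Hypothetical three-stock example with uncorrelated stocks.
cov = np.diag([0.04, 0.09, 0.16])     # annual variances
ls = np.array([0.5, 0.5, -1.0])       # long-short weights, sum to 0
w_bench = np.array([0.5, 0.3, 0.2])   # benchmark weights, sum to 1

gamma, aw = scale_ls_to_te(ls, cov, te_budget=0.02)
w = w_bench + aw                       # active long-only candidate
```

Because the active weights sum to zero, the combined portfolio still sums to 1.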
Negative Weights
• The active long portfolio that was just built may still have
some short weights. This happens when the scaled short
weights in the LS portfolio are larger in magnitude than the corresponding benchmark
weights.
• There are a number of ways to deal with this:
• Zero out the short weight and rescale the entire portfolio back to
100%
• Find “similar characteristic” stocks and spread the excess short
weights to those stocks.
• Similar-characteristic stocks will be other underweight stocks that
are being shorted for “roughly” the same reason(s)
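The first remedy (zero out the short weights and rescale to 100%) might look like the sketch below. The weights are hypothetical; the second remedy is not shown because it needs a definition of "similar characteristics".

```python
import numpy as np

def clip_and_rescale(weights):
    """Set negative weights to zero, then rescale so the weights sum to 1."""
    w = np.clip(np.asarray(weights, dtype=float), 0.0, None)
    total = w.sum()
    if total <= 0:
        raise ValueError("no positive weights left to rescale")
    return w / total

# Hypothetical portfolio with one residual short position.
w = np.array([0.55, 0.35, 0.15, -0.05])
w_long_only = clip_and_rescale(w)
```

Rescaling shifts every remaining weight, so the resulting TE will generally differ somewhat from the original budget.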
Why don’t we optimize?
• We will talk more about issues with optimization.
• Generally, optimization doesn’t work.
• The estimation errors are so large that the optimization generally
gives too much weight to the most over-estimated stocks.
Passive Strategies
Passive Strategies
• The first quant product was an index
• Quant departments usually also handle indexing
• BGI, SSgA and Mellon Capital were the original index
shops and have since become the biggest quant investment
shops
Index Fund Performance
• A pure-replication passive index fund has a 0% chance of
outperforming its benchmark
• But it has a 70% chance of outperforming an active manager!
• The remaining part of the course will be focused on how to
outperform a benchmark index.
Passive Index Plus
• Producing Index + returns
• Securities Lending
• Investment bank (keeps 50% of the lending income)
• Collateral Management
• Cash management service charges mgmt fee
Portable Alpha
• Gain index exposure through futures contracts
• Actively manage the cash collateral to take additional risk
Enhanced Indexes
• Producing Index + returns
• Tilting toward quant factors to earn premium
• Tilting toward value and small cap under a tight TE
Passive Approach to
Outperformance
• This approach claims (empirically or theoretically) that the
“cap-weighted” benchmark is sub-optimal.
• It tries to build a better (more optimal) portfolio
• This approach does not start with the benchmark and then try
to create active weights against it; it just builds a
new portfolio from scratch
MVO
Optimal Portfolio
MVO as a method for
outperformance
• MV Optimal portfolio construction
• Why cap-weight?
• We have no reason to believe that the cap-weighted portfolio is mean-variance optimal
• So you can create a better passive portfolio by directly trying to
build an MVO portfolio
Tangency Portfolio
• What do we need to construct the tangency portfolio?
• Mean and covariance
• Does it matter that we estimate them very poorly?
• As it turns out MVO is very sensitive to small variations in the inputs
• Stocks which we estimate with big positive errors in the expected
return will get enormous weights
• MVO can often be very undiversified as a result
• Numerically, how tractable is this method?
• Since we impose a long only constraint, we need to numerically solve
for the MVO. This is very difficult in practice when we need to deal
with hundreds of stocks.
• Simply using the algebraic solution and cutting out the negative
weights does not lead to a good approximation of the true tangency
portfolio
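To make the sensitivity point concrete, here is a small sketch of the unconstrained algebraic tangency solution, with weights proportional to the inverse covariance matrix times expected excess returns. The three-stock inputs are hypothetical, and the example is an illustration rather than the lecture's own calculation.

```python
import numpy as np

def tangency_weights(mu, cov, rf=0.0):
    """Unconstrained tangency portfolio: w proportional to inv(cov) @ (mu - rf)."""
    excess = np.asarray(mu, dtype=float) - rf
    raw = np.linalg.solve(np.asarray(cov, dtype=float), excess)
    return raw / raw.sum()

# Hypothetical inputs: three stocks with 20% vol and 0.9 pairwise correlation.
vol, rho = 0.20, 0.9
cov = vol**2 * (rho * np.ones((3, 3)) + (1 - rho) * np.eye(3))

mu_base = np.array([0.08, 0.08, 0.08])
mu_bumped = np.array([0.09, 0.08, 0.08])   # +1% estimation error on stock 1

w_base = tangency_weights(mu_base, cov)     # equal weights
w_bumped = tangency_weights(mu_bumped, cov) # stock 1 dominates
```

With identical inputs the solution is equal-weighted. A one-percentage-point error on a single expected return pushes that stock's weight above 100% and turns the other two into shorts.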
Naïve Tangent Portfolio
• Let’s use the most naïve asset pricing model to set expected
stock returns: using sample estimates
• Empirically, how well does this method work?
• Not so well!
• Empirically, this method consistently underperforms equal
weighting!
Naïve Tangent Portfolio
• Why doesn’t it work?
• MVO is guaranteed to be ex ante optimal if inputs are correct.
• However, what if inputs are not 100% correct? What if they are
only “generally” correct?
• Using some useful information should still be better than EW, which
is almost entirely without information!
• However, since sample averages tend to significantly over- or underestimate the true mean, this information actually appears to be “almost
useless” or even harmful when combined with MVO
• We will discuss how to implement MVO better (later)
MVO Portfolio
• Ingredients:
• Define a stock universe
• Assign expected returns to stocks
• Estimate the covariance matrix for stocks
Estimating Future Stock
Returns
• If stock returns were ergodic (iid), then past return information
is indicative of future returns
• So historical realized returns tell us something about likely future
returns; but just how accurately can we forecast?
Mean Estimates with High
Frequency Data
• Great, so the more data the better? We can always go to higher
frequency.
• But that doesn’t really help. Going from 10 years of monthly data
(m=120) to 10 years of weekly data (m=520), the std err
on the weekly expected return becomes smaller, but when
you annualize, the std err on the annualized forecast remains
the same!
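A quick numerical check of this claim, assuming iid returns so that per-period volatility scales with the square root of the period length. The 30% annual vol is a hypothetical input.

```python
import math

def annualized_mean_std_err(annual_vol, years, periods_per_year):
    """Std err of the annualized mean return estimated from higher-frequency data."""
    per_period_vol = annual_vol / math.sqrt(periods_per_year)
    n_obs = years * periods_per_year
    per_period_se = per_period_vol / math.sqrt(n_obs)
    # Annualizing the mean multiplies both the estimate and its std err.
    return per_period_se * periods_per_year

se_monthly = annualized_mean_std_err(0.30, years=10, periods_per_year=12)
se_weekly = annualized_mean_std_err(0.30, years=10, periods_per_year=52)
```

Both calls reduce to `annual_vol / sqrt(years)`: only the length of the sample in years matters, not the sampling frequency.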
Estimating Returns
• The average individual arithmetic stock return is about 10%
per annum
• The average stock vol is 30%
• With 10 years of data, your std err on annual arithmetic return
estimate is 9.5%! You couldn’t really say if the realized 10%
return was different from 0%
• Easier to estimate lower vol portfolios (like an index)
• How much time would you need in order to conclude that a
stock actually has positive expected return?
• Need std err to be less than 5% (so that 10% is 2 std dev away
from zero)
• So T needs to be at least 36 years!
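The arithmetic behind these numbers, assuming iid annual returns with volatility $\sigma = 30\%$ so that the standard error of the sample mean over T years is $\sigma / \sqrt{T}$:

```latex
\mathrm{SE}(\hat{\mu}) = \frac{\sigma}{\sqrt{T}}, \qquad
\frac{0.30}{\sqrt{10}} \approx 0.095, \qquad
\frac{\sigma}{\sqrt{T}} \le 0.05
\;\Longrightarrow\;
T \ge \left( \frac{0.30}{0.05} \right)^{2} = 36 .
```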
What if expected returns are time-varying instead of ergodic?
• If expected returns for stocks are time-varying, then having a
long time history of data doesn’t help, because you are not
just using the data to estimate the parameters of one
stationary distribution.
Modeling Expected Returns
• Even if returns were ergodic, it is much too hard to estimate
them with any reliability. So we will need to assume some
structure—build a theory that allows us to use more data or to
estimate fewer parameters, etc.
• We will spend a lot of time working on these asset pricing
models, which help us forecast returns.
Asset Pricing Model
• All reasonable asset pricing models try to relate risk to
expected return.
• This then allows us to use higher moment distributional
parameters to give us information on the first moment
• If return is related to vol of a stock, then we can use the vol
estimate to help us estimate expected returns. We can use high
frequency data to improve on the expected return estimate then!
• Think CAPM. Why is CAPM pricing equation more useful than
historical average return for estimating future expected stock
returns?
APT
• $r_i = \beta_{i,1} f_1 + \dots + \beta_{i,k} f_k + \varepsilon_i$
• A few risk factors drive much of the aggregate
(undiversifiable) risk in the economy
• Exposure to these factors usually pays a premium (some might
pay no premium while others pay a large premium)
• We can figure out the expected return for a stock by
estimating its exposure to the factors
• We need to estimate the factor premiums as well
Return forecasting
• The practice of forecasting returns in excess of the APT model
generally involves
• Forecasting the idiosyncratic error component of stock returns
(inside information, insights into mispricing)
• Forecasting the time-varying risk premium associated with the
factors.
Estimating Returns and Using
them!
• So you have estimated returns. You probably want to use
them for something useful!
• This is when you need to do more work!
• Your return estimates are plagued with outliers. These
outliers will hurt your portfolio strategy.
• Shrinkage approach.
• Creates biased but more useful estimates.
• Idea is to reduce the effects of the outlier estimates by shrinking
them toward the mean
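One simple way to express this idea is sketched below, assuming a linear shrinkage toward the cross-sectional mean. The `intensity` parameter and the raw estimates are hypothetical; the lecture does not specify a particular shrinkage rule.

```python
import numpy as np

def shrink_toward_mean(estimates, intensity):
    """Pull each estimate toward the cross-sectional mean.

    intensity = 0 keeps the raw estimates; intensity = 1 replaces them
    all with the mean.
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be between 0 and 1")
    est = np.asarray(estimates, dtype=float)
    grand_mean = est.mean()
    return grand_mean + (1.0 - intensity) * (est - grand_mean)

# Hypothetical raw expected-return estimates with two outliers.
raw = np.array([0.40, 0.10, 0.05, -0.15])
shrunk = shrink_toward_mean(raw, intensity=0.5)
```

The cross-sectional mean is unchanged, while the distance of each outlier from it is halved.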
Estimating Factor Portfolio Mean
Return
• How do we identify the factor portfolio mean return?
• Take the sample arithmetic average return for each factor portfolio
• This turns out not to be the best way
• The better way is the Fama-MacBeth Approach
• Intuitively, we want to use the cross-sectional stock information to
help us improve our mean estimate
APT Model and Fama-MacBeth
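In the first stage, we run the following time-series regressions (one per stock)
to estimate the betas of each stock on the factors
• $R_{1,t} = \alpha_1 + \beta_{1,F_1} F_{1,t} + \beta_{1,F_2} F_{2,t} + \dots + \beta_{1,F_m} F_{m,t} + \epsilon_{1,t}$
• $R_{2,t} = \alpha_2 + \beta_{2,F_1} F_{1,t} + \beta_{2,F_2} F_{2,t} + \dots + \beta_{2,F_m} F_{m,t} + \epsilon_{2,t}$
• ⋮
• $R_{n,t} = \alpha_n + \beta_{n,F_1} F_{1,t} + \beta_{n,F_2} F_{2,t} + \dots + \beta_{n,F_m} F_{m,t} + \epsilon_{n,t}$
For the second stage, we take the expectation of the APT model
to get the following expression
• $E(R_{i,t}) = a + \beta_{i,F_1} E(F_{1,t}) + \beta_{i,F_2} E(F_{2,t}) + \dots + \beta_{i,F_m} E(F_{m,t})$
This is now a cross-sectional regression (across stocks i, with the estimated betas as regressors) that we can use to
estimate the factor premiums $E(F_{j,t})$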
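The two stages can be sketched with ordinary least squares as below. This is a bare-bones illustration on simulated data: it runs one cross-sectional regression per period and averages the slopes, and it omits the standard-error calculations a full Fama-MacBeth implementation would include. All names and simulation parameters are hypothetical.

```python
import numpy as np

def fama_macbeth(returns, factors):
    """returns: T x N stock returns; factors: T x m factor realizations.

    Returns (betas as N x m, estimated factor premiums as length m).
    """
    T, N = returns.shape
    # Stage 1: time-series regression of each stock on the factors.
    X1 = np.column_stack([np.ones(T), factors])
    coef1, *_ = np.linalg.lstsq(X1, returns, rcond=None)
    betas = coef1[1:].T                       # drop intercepts -> N x m
    # Stage 2: each period, regress the cross-section of returns on betas.
    X2 = np.column_stack([np.ones(N), betas])
    coef2, *_ = np.linalg.lstsq(X2, returns.T, rcond=None)
    lambdas = coef2[1:].T                     # T x m period-by-period premiums
    return betas, lambdas.mean(axis=0)

# Simulated data with known betas (all values hypothetical).
rng = np.random.default_rng(0)
T, N, m = 240, 50, 2
true_premia = np.array([0.005, 0.002])
factors = true_premia + 0.04 * rng.standard_normal((T, m))
true_betas = rng.uniform(0.5, 1.5, size=(N, m))
returns = factors @ true_betas.T + 0.001 * rng.standard_normal((T, N))

betas, premia = fama_macbeth(returns, factors)
```

In this simulation the factors are themselves returns, so the estimated premiums land close to the sample means of the factor series.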
MVO
• Once you have estimated the expected factor returns, you can
estimate the expected return for each individual stock
• Now you can apply MVO!
• But your optimizer probably will struggle to deal with
optimizing 1000 stocks!
• In fact, if you want to run a back test, MVO over 100 stocks will
make your back test extremely slow.
• The more stocks you add to the MVO, the less robust the output
can become
Estimating Covariance Matrix
• We will need to estimate the covariance matrix for a variety of
reasons
• As input into computing min-var and MVO portfolios
• As input for estimating ex ante portfolio volatility
Issues with Estimating Covariance
Matrix
• The N x N covariance matrix contains
• N unique variance terms
• 0.5* N * (N -1) unique correlation terms
• For S&P 500 stocks (N = 500), that is 500 variances plus 124,750 correlations, or 125,250 unique
parameters
• You will need at least that many data points (125,250) to estimate
a full rank covariance matrix
• If you want to have a “small” standard error on the variance
estimate, you will need to have at least 200 observations (per
stock)
• This is about 17 years of monthly data, which is unwieldy in a
backtest
• Using daily data will solve this issue.
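*Recall the formula for the standard error of vol (under iid normal returns): $SE(\hat{\sigma}) \approx \sigma / \sqrt{2T}$, so T = 200 gives a standard error of about 5% of $\sigma$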
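The counts on this slide are easy to verify. The sketch below counts the unique entries of an N x N covariance matrix and evaluates the usual large-sample approximation for the standard error of a volatility estimate, which assumes iid normal returns. Function names are hypothetical.

```python
import math

def n_unique_cov_params(n_stocks):
    """N variances plus 0.5 * N * (N - 1) distinct pairwise terms."""
    n_var = n_stocks
    n_pairs = n_stocks * (n_stocks - 1) // 2
    return n_var + n_pairs

def vol_std_err_ratio(n_obs):
    """Approximate std err of a vol estimate as a fraction of the vol itself."""
    return 1.0 / math.sqrt(2.0 * n_obs)

n_params = n_unique_cov_params(500)
ratio_200 = vol_std_err_ratio(200)
years_monthly = 200 / 12
```

For N = 500 the count is 125,250, and 200 observations give a standard error of about 5% of the volatility, which takes roughly 17 years of monthly data.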
Issues with Cov Matrix
Estimation
• As with estimating the mean, the Cov matrix estimate can be
noisy, though that issue is significantly reduced by high
frequency data.
• However, there are techniques for improving the accuracy of
the Cov matrix which you should be aware of
• Covariance shrinkage
• PCA
Covariance Shrinkage (1/2)
• This is related to the Shrinkage which we described for
shrinking sample mean estimates.
• The shrunk covariance matrix has the form:
$\Sigma_S = \delta \Sigma + (1 - \delta) S$
• Where $\Sigma$ is the N x N sample covariance matrix and S is also N x N, but contains only two distinct numbers:
• Diagonal elements are all
= average(the N stock sample variances)
• Off-diagonal elements are all
= average(the 0.5 * N * (N-1) off-diagonal covariances)
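A minimal sketch of the two-value target and the shrunk matrix, following the form above. The 3 x 3 sample covariance matrix is hypothetical and the function names are illustrative.

```python
import numpy as np

def shrinkage_target(sample_cov):
    """Target S: average variance on the diagonal, average covariance elsewhere."""
    cov = np.asarray(sample_cov, dtype=float)
    n = cov.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    target = np.full((n, n), cov[off_diag].mean())
    np.fill_diagonal(target, np.trace(cov) / n)
    return target

def shrink_cov(sample_cov, delta):
    """Sigma_S = delta * Sigma + (1 - delta) * S."""
    cov = np.asarray(sample_cov, dtype=float)
    return delta * cov + (1.0 - delta) * shrinkage_target(cov)

# Hypothetical sample covariance matrix for three stocks.
cov = np.array([[0.04, 0.01, 0.02],
                [0.01, 0.09, 0.03],
                [0.02, 0.03, 0.16]])
S = shrinkage_target(cov)
cov_shrunk = shrink_cov(cov, delta=0.5)
```

With delta = 1 the sample matrix is returned unchanged; with delta = 0 the result is the target itself.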
Covariance Shrinkage (2/2)
• The shrinkage parameter 𝛿 is “the” art
• You can use an explicit solution recommended by Ledoit and
Wolf (2003); it’s a very ugly beast.
• You can also just estimate 𝛿 using a quick and dirty empirical
step.
• Step #1, define in-sample period for estimating sample
covariance Σ and shrinkage target 𝑆
• Step #2, use the out-of-sample period to compute the realized out-of-sample
Σ; optimize 𝛿 to minimize the squared deviation of the shrunk estimate's
0.5 * N * (N-1) off-diagonal elements from that realized Σ.
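One possible reading of the quick-and-dirty procedure is a grid search, sketched below under that assumption. The grid, the split into in-sample and out-of-sample halves, and the simulated returns are all hypothetical choices.

```python
import numpy as np

def two_value_target(cov):
    """Average variance on the diagonal, average covariance off the diagonal."""
    n = cov.shape[0]
    off = ~np.eye(n, dtype=bool)
    target = np.full((n, n), cov[off].mean())
    np.fill_diagonal(target, np.trace(cov) / n)
    return target

def pick_delta(in_sample, out_sample, grid=None):
    """Choose delta on a grid to best match out-of-sample covariances."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 21)
    cov_in = np.cov(in_sample, rowvar=False)      # Step 1: sample Sigma
    target = two_value_target(cov_in)             # Step 1: target S
    cov_out = np.cov(out_sample, rowvar=False)    # Step 2: realized Sigma
    off = ~np.eye(cov_in.shape[0], dtype=bool)
    losses = []
    for delta in grid:
        shrunk = delta * cov_in + (1.0 - delta) * target
        # Each off-diagonal pair appears twice; the minimizer is unaffected.
        losses.append(np.sum((shrunk[off] - cov_out[off]) ** 2))
    return float(grid[int(np.argmin(losses))])

# Simulated returns: 120 periods, 10 stocks (hypothetical).
rng = np.random.default_rng(1)
returns = 0.05 * rng.standard_normal((120, 10))
delta = pick_delta(returns[:60], returns[60:])
```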
PCA Approach (1/3)
• Start with APT framework
• Stock movements can be modeled as driven by a few common
(orthogonal) factors + idiosyncratic noise
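• $r_i = \beta_{i,1} f_1 + \dots + \beta_{i,k} f_k + \varepsilon_i$
$\sigma_i^2 = \beta_{i,1}^2 \sigma_{f_1}^2 + \dots + \beta_{i,k}^2 \sigma_{f_k}^2 + \varphi_i^2$
$\sigma_{ij} = \beta_{i,1} \beta_{j,1} \sigma_{f_1}^2 + \dots + \beta_{i,k} \beta_{j,k} \sigma_{f_k}^2$
• You now only need to estimate N x k betas ($\beta_{i,j}$), k factor vols ($\sigma_{f_j}$), and N idiosyncratic vols ($\varphi_i$)
• For 500 stocks and 5 APT factors, that’s only 2500+5+500 or
3005 elements instead of the 125,250 unique elements in an unrestricted
model.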
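In matrix form, the variance and covariance expressions above amount to a loading matrix times a diagonal factor covariance, plus a diagonal idiosyncratic term. A small sketch with hypothetical inputs:

```python
import numpy as np

def factor_model_cov(betas, factor_vols, idio_vols):
    """Covariance implied by orthogonal factors: B diag(sf^2) B' + diag(phi^2)."""
    B = np.asarray(betas, dtype=float)                        # N x k
    F = np.diag(np.asarray(factor_vols, dtype=float) ** 2)    # k x k
    D = np.diag(np.asarray(idio_vols, dtype=float) ** 2)      # N x N
    return B @ F @ B.T + D

def n_factor_params(n_stocks, n_factors):
    """N*k betas, k factor vols, N idiosyncratic vols."""
    return n_stocks * n_factors + n_factors + n_stocks

# Hypothetical two-stock, one-factor example.
cov = factor_model_cov(betas=[[1.0], [0.5]],
                       factor_vols=[0.2],
                       idio_vols=[0.1, 0.3])
n_params = n_factor_params(500, 5)
```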
PCA (2/3)
• So how do you get the APT factors and how do you make
them orthogonal?
• We use a statistical approach called the Principal Component
Analysis
• This technique essentially examines the covariance matrix,
extracts its eigenvectors, and sorts
the eigenvectors by their eigenvalues
• Intuitively, the method finds the linear combination of the N time
series of stock returns such that the resulting portfolio explains
the greatest total variance.
• We then extract the next portfolio, which explains the greatest
amount of the residual variance.
PCA (3/3)
• Once you identify the PCs, you can estimate the N x k ($\beta_{i,j}$), k
($\sigma_{f_j}$), and N ($\varphi_i$) parameters
• Running regressions of stock returns on the PC portfolios to get $\beta_{i,j}$
• Taking the vol of the PC portfolios to estimate $\sigma_{f_j}$
• And then backing out $\varphi_i$ from the total variance of each stock
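The three estimation steps can be sketched with an eigendecomposition of the sample covariance matrix, as below. This is an illustrative implementation on simulated returns, not a production risk model; the choice of k and all simulation parameters are hypothetical.

```python
import numpy as np

def pca_factor_model(returns, k):
    """returns: T x N. Returns (betas N x k, factor_vols k, idio_vols N)."""
    X = returns - returns.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    pc_weights = eigvecs[:, order]               # N x k PC portfolio weights
    pcs = X @ pc_weights                         # T x k PC portfolio returns
    # Regress every stock on the PC portfolios to get the betas.
    coef, *_ = np.linalg.lstsq(pcs, X, rcond=None)
    betas = coef.T                               # N x k
    factor_vols = pcs.std(axis=0, ddof=1)
    # Back out idiosyncratic vol from each stock's total variance.
    total_var = X.var(axis=0, ddof=1)
    explained = (betas ** 2) @ (factor_vols ** 2)
    idio_vols = np.sqrt(np.clip(total_var - explained, 0.0, None))
    return betas, factor_vols, idio_vols

# Simulated returns: one common factor plus noise (hypothetical).
rng = np.random.default_rng(7)
T, N = 250, 8
common = 0.04 * rng.standard_normal((T, 1))
returns = common @ rng.uniform(0.5, 1.5, size=(1, N)) \
    + 0.02 * rng.standard_normal((T, N))

betas, factor_vols, idio_vols = pca_factor_model(returns, k=2)
```

Because the PC portfolios are uncorrelated in sample, the regression betas coincide with the eigenvector entries, and the squared factor vols equal the top eigenvalues.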
MVO
• Once you have estimated the expected factor returns, you can
estimate the expected return for each individual stock
• Now you can apply MVO!
• But your optimizer probably will struggle to deal with
optimizing 1000 stocks!
• In fact, if you want to run a back test, MVO over 100 stocks will
make your back test extremely slow.
• The more stocks you add to the MVO, the less robust the output
can become
MVO in Factor Space
• We apply MVO in the factor space
• Since only the factors earn a risk premium, we can largely ignore
idiosyncratic volatility
• So why don’t we just optimize a portfolio of the k factors?
• That’s exactly what we should do!
• Applying MVO to the factors instead of the stocks will achieve a more
robust portfolio than doing MVO on the 500 stocks!
MVO and Estimation Errors
• Michaud Resampling Technique (parameter uncertainty
technique)
• The issue with MVO is that your mean and covariance
estimates are estimates with errors. They are not true
distribution parameters known with certainty. So you need to
adjust for that.
• Use a bootstrap resampling technique
• Start with the full sample of history
• Randomly select T dates to form a new sub-sample of data
• Use the sub-sample to compute all parameters (this works whether
you use the average return model, the APT or CAPM); then form the
MVO portfolio
• Repeat this process hundreds of times (M).
• Average over all M MVO portfolios
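The resampling loop can be sketched as below. Several choices are assumptions made for illustration: dates are drawn with replacement, parameters come from the sample-average return model, and the MVO step is the unconstrained mean-variance solution with a hypothetical risk-aversion parameter rather than a constrained optimizer.

```python
import numpy as np

def mvo_weights(mu, cov, risk_aversion=5.0):
    """Unconstrained mean-variance weights: inv(cov) @ mu / risk_aversion."""
    return np.linalg.solve(cov, mu) / risk_aversion

def resampled_mvo(returns, n_resamples=200, seed=0):
    """Average MVO weights over bootstrap resamples of the return history."""
    rng = np.random.default_rng(seed)
    T, N = returns.shape
    weights = np.zeros((n_resamples, N))
    for m in range(n_resamples):
        dates = rng.integers(0, T, size=T)     # T dates drawn with replacement
        sample = returns[dates]
        mu = sample.mean(axis=0)               # parameters from the sub-sample
        cov = np.cov(sample, rowvar=False)
        weights[m] = mvo_weights(mu, cov)      # one MVO portfolio per resample
    return weights.mean(axis=0)                # average over all M portfolios

# Simulated monthly returns for four stocks (hypothetical).
rng = np.random.default_rng(42)
returns = 0.01 + 0.05 * rng.standard_normal((240, 4))
w_avg = resampled_mvo(returns)
```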
Minimum Variance
• Minimum Variance is clearly not an optimal portfolio and
should generally underperform most portfolios in the SR
space.
Minimum Variance
• Empirically, minimum variance has a significantly higher SR than
most known portfolio strategies! It outperforms cap-weighting handily.
• Under what conditions would minimum variance be optimal?
• Optimal if expected returns are equal for all stocks (since we are
then only focused on achieving the lowest-vol portfolio)
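For reference, the unconstrained minimum-variance weights have a closed form, sketched below with a hypothetical diagonal covariance matrix. A long-only version would need a numerical optimizer, which is not shown here.

```python
import numpy as np

def min_variance_weights(cov):
    """Unconstrained minimum variance: w proportional to inv(cov) @ 1."""
    cov = np.asarray(cov, dtype=float)
    raw = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return raw / raw.sum()

# Hypothetical uncorrelated stocks with vols of 20%, 30% and 60%.
cov = np.diag([0.04, 0.09, 0.36])
w = min_variance_weights(cov)
```

With uncorrelated stocks the weights are proportional to inverse variance. Plugging equal expected returns into the tangency formula yields the same weights, which is the optimality condition stated above.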
Reading Review
• Clarke, de Silva and Thorley
• Theoretical paper about what drives portfolio variance; it
explores why minimum variance can achieve lower risk with no
return give-up
• Key takeaways:
• The portfolio volatility (of a diversified basket of stocks) is
determined by its beta
• Min-var portfolio is 85-90% allocated to the lowest two beta
quintiles of stocks
• Since there is no empirical relationship between beta and stock
returns, min-var achieves lower risk with no return degradation.
Minimum Variance
• Has not been popular until recently.
• TE is too high; IR is poor
• Why should investors care about IR since TE isn’t really risk?
Isn’t a better SR more meaningful?
• There is a concern that MinVar success was an accident in history
• Hard to believe that you could achieve better return with lower
“market beta”—goes against CAPM
• There is increased awareness that low vol (as well as low beta)
stocks do not have lower premium
• Low volatility puzzle
• CAPM rejection
Equal Weighting
• Equal weighting has been a reliable method for outperforming
the cap-weighted benchmark
• How can a naïve and uninformative approach outperform a
benchmark that most intelligent active managers cannot
outperform?
• Taking on small cap risk
• Earning illiquidity premium
Reading Review
• DeMiguel, Garlappi and Uppal
• Horseraces between 1/N vs. MVO portfolios
• Key takeaways:
• Problems with MVO when using historical sample averages
• General issues with MVO: its sensitivity to estimation errors and
its concentration
• How to properly run a horserace to illustrate the successes or problems
of a strategy
Risk Cluster Weighting
• An extension of the EW approach
• Solves the arbitrariness of N (how do you know how many
stocks and which stocks to equal weight)
• Define how many natural groups exist; use statistical methods
to organize stocks into groups; equal weight these clusters
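Given cluster labels from whatever statistical method is used, the weighting step might look like the sketch below. It assumes each cluster gets an equal share and that stocks are equal-weighted within their cluster; the within-cluster rule and the labels are assumptions, and the clustering step itself is not shown.

```python
import numpy as np

def cluster_equal_weights(labels):
    """Give each cluster weight 1/K, split equally among its members."""
    labels = np.asarray(labels)
    clusters, counts = np.unique(labels, return_counts=True)
    size = dict(zip(clusters, counts))
    k = len(clusters)
    return np.array([1.0 / (k * size[lab]) for lab in labels])

# Hypothetical cluster assignments for six stocks.
labels = ["tech", "tech", "tech", "energy", "banks", "banks"]
w = cluster_equal_weights(labels)
```

Each of the three clusters receives one third of the portfolio regardless of how many stocks it contains.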
Fundamental Indexing
• Based on the NIP theory of Summers and Black
• If prices are noisy and mean-reverting, then cap-weighting
would over-weight overvalued stocks and under-weight
undervalued stocks.
• Weighting by non-price-based measures will improve portfolio
return
• Specifically weighting by fundamentals will create a liquid and
low TE portfolio to the standard benchmarks
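A minimal sketch of weighting by a non-price measure, assuming a single fundamental such as sales and assuming negative values are floored at zero. Both the choice of measure and the flooring rule are illustrative assumptions, and the figures are made up.

```python
import numpy as np

def fundamental_weights(fundamentals):
    """Weight each stock in proportion to a non-price fundamental measure."""
    f = np.clip(np.asarray(fundamentals, dtype=float), 0.0, None)
    return f / f.sum()

# Hypothetical annual sales (in $bn) for three companies.
sales = [50.0, 30.0, 20.0]
w_fund = fundamental_weights(sales)
```

Because price does not enter the calculation, a stock whose price rises without a change in fundamentals does not receive a larger weight.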