Lecture 7
Statistical Arbitrage
October 19, 2023

Announcements
• https://tinyurl.com/quant-week7
• Max Dama Interest Form due Sunday 11:59PM
  • https://tinyurl.com/max-interest-formquantdecal
• Lab 1 extended to Sunday 11:59PM - Office Hours available by email
• Resources for Pandas/Coding
  • Kaggle is a good resource
  • documentation
  • some links will be added to Ed

Today’s Lecture
• Introduction to Stat Arb
• Structural Risk Decomposition
• Portfolio Construction
• How to (not) kill your ops guy
• Alt Data

Intro to Stat Arb

Arbitrage
• Arbitrage describes trades where one earns a risk-free profit.
• Traditionally, arbitrage occurs between fundamentally linked assets, like an ETF and its constituent securities.
• See the previous lectures for more.

Statistical Arbitrage
• Stat arb is a class of trading strategies in which you identify and trade on statistical relationships between assets.
• Usually this looks like pairs trading.
• Stat arb is fundamentally a reversion strategy: we generally hope that any deviation from historical statistical relationships is simply due to inefficiency or a temporary disruption and will correct in the future.
• Stat arb differs from other styles of trading that we have discussed in that it generally operates at longer time scales (seconds to days).
• While market making has a capped upside, stat arb has potentially unlimited upside.

Structural Risk Decomposition

Arbitrage Pricing Theory
• Arbitrage Pricing Theory (APT), developed by Stephen Ross, asserts that asset returns must be priced as linear combinations of factors.
• Why linear? Because portfolios are linear combinations of securities, any arbitrage opportunity must be linear in some set of factors.
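The "Why linear?" argument above can be checked numerically. The following is a minimal sketch, assuming a simulated linear factor model; the loadings `B`, factor returns `f`, noise `eps`, weights `w`, and the helper `portfolio_factor_exposure` are all hypothetical names and values chosen for illustration, not anything from the lecture. It shows that a portfolio's return splits into a factor part, driven by the weighted sum of the asset loadings, plus a residual part.

```python
import numpy as np


def portfolio_factor_exposure(weights, loadings):
    """Exposure of a portfolio to each factor: B^T w."""
    return loadings.T @ weights


rng = np.random.default_rng(0)
n_assets, n_factors, n_days = 5, 2, 250

# Hypothetical inputs (assumptions for illustration only).
B = rng.normal(size=(n_assets, n_factors))               # asset loadings
f = rng.normal(scale=0.01, size=(n_days, n_factors))     # factor returns
eps = rng.normal(scale=0.005, size=(n_days, n_assets))   # idiosyncratic returns

# Linear factor model: each asset return is loadings times factors plus noise.
R = f @ B.T + eps                                        # shape (days, assets)

# An arbitrary long/short portfolio.
w = np.array([0.4, -0.1, 0.3, -0.2, 0.6])

port_ret = R @ w                                         # realized portfolio returns
exposure = portfolio_factor_exposure(w, B)               # portfolio loadings
decomposed = f @ exposure + eps @ w                      # factor part + residual part

print("portfolio factor exposures:", exposure)
print("max decomposition error:", np.abs(port_ret - decomposed).max())
```

Because the portfolio is a linear combination of securities, its factor exposure is the same linear combination of the securities' exposures, which is why any mispricing a portfolio can capture has to be linear in the factors.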
Structural Risk Decomposition
• Consider a matrix of equity returns, $X$ (rows are time periods, columns are assets).
• We seek a structured approach to the risks implied by these returns.
• There are two general approaches:
  • Fundamental Factors: sector, big vs. small companies, etc. See the Fama-French factor models.
  • Statistical Factors: principal components
• Recall the SVD of X: $X = U S V^\top$. We select the first k columns of V and call these our risk factors.
• Our matrix of factors is therefore $V_k$ (the first k columns of V), which we can project X onto.
• Our factor loadings are correlations between features and each of these factors.

Factor Timing & Why Neural Nets are Hard
• Factor timing is when you try to predict movements in a single factor.
• The original factor timing is “Market Timing”.
• Doing this is generally considered to be very hard.
• Why are neural networks therefore difficult to fit?
  • We fit neural networks using gradient descent, which greedily selects the directions of maximum improvement.
  • Therefore we are typically fitting to noise, as we have an inverted noise structure, i.e. the largest components of the data are noise rather than signal.
  • Statistical factors change, so we may have to refit our network fairly frequently, which can be expensive.

Popular Off-The-Shelf Factor Models
• Barra, Axioma, Barclays
• Sometimes when these providers add factors, they become real because so many people use their models.
• The markets are a social system.

Portfolio Construction

So You Want to Make Money?
• Factor timing is hard. Wouldn’t it be nice if we could predict something that didn’t have any influence from any of these factors?
• We instead predict residual returns, i.e. we attempt to predict the returns after residualizing on our factor model.
• To compute this, we need our factor loadings, or the correlations between the features and each factor (assuming the factors are orthogonal. Why?)

Portfolio Optimization
• So, you have your loading matrix $B$; we now want to find the portfolio that allows us to capitalize on these returns.
• We also define $\Sigma$ as the covariance of returns and $\mu$ as our predicted returns.
• Modern portfolio theory tells us that we should maximize expected returns for a given variance constraint while maintaining factor neutrality. This yields the optimization problem
  $$\max_{w} \; \mu^\top w \quad \text{subject to} \quad w^\top \Sigma w \le \sigma^2, \quad B^\top w = 0$$
  where $w$ is the vector of portfolio weights, $\sigma^2$ is the variance budget, and $B$ is the loading matrix.

Taking A Position
• Stat arb portfolios generally take large positions over longer periods of time.
• We therefore need good execution, as this can add to our alpha.
• Major players often go through broker-dealers who guarantee VWAP pricing.
• Because we are taking a position, our strategy has much higher capacity and our returns are potentially unlimited.

Turnover
• Ideally, we want w to be consistent over time, because each time we change positions we incur transaction costs.
• We therefore may add a regularization term to our optimization that penalizes deviations from our existing portfolio.
• There are typically teams whose job it is to model transaction costs to help determine how strongly turnover should be penalized.
• Unfortunately, as we get new data our factors and loadings will change and induce some turnover. How you deal with this can make or break the fund.

How to (not) Kill Your Ops Guy (Risk Management)

Portfolio Risk
• All correlations tend towards 1 during large drops and factor variances collapse.
• This will kill your risk model and cause your entire portfolio to lose a lot of its value.
• You can prevent exposure to this risk by buying downside puts on your portfolio.
• You can also exit your position quickly if you have a robust predictor of when this might occur.

Are My Quants Bad At Their Job?
• Ideally, if a position isn’t realizing its expected gains, we want to dump it before we lose too much money.
• We therefore may choose to model expected drawdowns during prediction intervals.
• Drawdown is a fact of life, but we want to limit our downside so we can keep the lights on.

Structural Divergence
• A relationship may no longer exist for fundamental reasons.
• Example:
  • Let’s say you’re pairs trading Microsoft and Amazon, specifically on their respective cloud computing businesses.
  • However, an antitrust action is taken against Microsoft, so they have to drop Azure (their cloud platform).
  • Your statistical relationship will likely no longer hold, so when the two securities diverge they may not reconverge.

Liquidity Risk
• If you are running a strategy in illiquid assets, it may be very hard to clear positions, even after reconvergence.
• You will also not be able to take large positions without having a large market impact.
• How can you solve this?

Alternative Data

Market Data is Useless (T&C Apply)
• At longer time horizons, order book data is typically useless. If it weren’t, HFTs would be incorporating it into the price much more quickly.
• Therefore, from the perspective of a stat arb strategy, the market is efficient with respect to very granular order book data.
• This isn’t always the case, especially when we aggregate data.

What is Alt Data?
• Alt data is anything that isn’t market data.
• Some common examples include credit card data, satellite data, and weather data.
• These data can provide us orthogonal alphas to use in computing our returns matrix.
• Often alt data is sold by third-party vendors for firms to use.

Is My Competition Using This? And other questions to ask your vendor.
• Is my competition using this data? How many of them?
• What is the latency of this data?
• Where are you getting this data from?
• How do you ensure correctness?
• Is there a human verifying this data?
• How much does the feed cost?
• Do you already work with my firm?

Major Players

Some firms
• Renaissance Technologies
• Two Sigma
• D.E. Shaw
• TGS Management
• PDT Partners
• Citadel GQS
• AQR Capital Management

Questions?

Notebook Demo
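The following is not the course notebook. It is a minimal sketch of the pipeline described in the Structural Risk Decomposition and Portfolio Construction slides, run on simulated returns. All function names (`statistical_factors`, `residualize`, `factor_neutral_weights`), the simulated data, and the random placeholder forecast `mu` are assumptions made for illustration. It also assumes the first k right singular vectors serve as the loading matrix, and it solves the penalized problem (maximize mu'w - 0.5 w'Σw subject to factor neutrality) in closed form, then rescales the weights to a target volatility in place of the explicit variance constraint.

```python
import numpy as np


def statistical_factors(X, k):
    """First k right singular vectors of demeaned X (T x N), as an N x k matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T


def residualize(X, V_k):
    """Remove the part of X spanned by the factors: X - X V_k V_k^T."""
    return X - (X @ V_k) @ V_k.T


def factor_neutral_weights(mu, Sigma, B):
    """Solve max_w mu'w - 0.5 w'Sigma w subject to B'w = 0 (closed form)."""
    Si_mu = np.linalg.solve(Sigma, mu)            # Sigma^{-1} mu
    Si_B = np.linalg.solve(Sigma, B)              # Sigma^{-1} B
    lam = np.linalg.solve(B.T @ Si_B, B.T @ Si_mu)  # Lagrange multipliers
    return Si_mu - Si_B @ lam


# Simulated returns: T days, N assets, k hidden factors (all hypothetical).
rng = np.random.default_rng(1)
T, N, k = 500, 20, 3
true_B = rng.normal(size=(N, k))
X = (rng.normal(scale=0.01, size=(T, k)) @ true_B.T
     + rng.normal(scale=0.005, size=(T, N)))

# 1. Statistical risk factors from the SVD of the returns matrix.
V_k = statistical_factors(X, k)

# 2. Residual returns: what is left after projecting out the factors.
resid = residualize(X - X.mean(axis=0), V_k)

# 3. Factor-neutral portfolio from a forecast of returns.
Sigma = np.cov(X, rowvar=False) + 1e-8 * np.eye(N)  # small ridge for stability
mu = rng.normal(scale=1e-4, size=N)                 # placeholder forecast, not a real alpha
w = factor_neutral_weights(mu, Sigma, V_k)

# 4. Rescale to a target daily volatility (stand-in for the variance constraint).
target_vol = 0.01
w *= target_vol / np.sqrt(w @ Sigma @ w)

print("factor exposures of w:", V_k.T @ w)
print("portfolio volatility:", np.sqrt(w @ Sigma @ w))
```

Rescaling preserves factor neutrality because the constraint is linear in w. A turnover penalty of the kind mentioned on the Turnover slide would add a term penalizing the distance from the previous weights, which this sketch leaves out.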