MFE 237H: Quantitative Asset Management
Notes on Minimum Variance Portfolios
Professor Jason Hsu
UCLA Anderson School of Management

Minimum Variance Portfolio

Let $w_p$ be an $N \times 1$ vector of weights for a portfolio of risky assets, $p$. Let $x_p$ be the $N \times 1$ vector of total returns in excess of the risk-free rate, $r_f$, where $E[x_p] = \mu_p$ is the vector of expected excess returns. Then the expected excess return of portfolio $p$ is:

$$E[w_p' x_p] = w_p' E[x_p] = w_p' \mu_p = r_p \tag{1}$$

The variance of a portfolio's excess returns can be expressed as:

$$Var(w_p' x_p) = w_p' \, Var(x_p) \, w_p = w_p' \Sigma_p w_p = \sigma_p^2 \tag{2}$$

The minimum variance portfolio is the portfolio with the lowest excess returns volatility. It also happens to be the mean-variance optimal portfolio if the expected returns for all stocks are equal! [1] In general, the minimum variance portfolio can be expressed as the solution to:

$$\underset{w}{\text{minimize}} \quad w' \Sigma_p w \qquad \text{subject to} \quad \sum_{i=1}^{N} w_i = 1 \tag{3}$$

Using the clever observation that the minimum variance portfolio is the tangency portfolio when we assume the vector of expected returns, $\mu_p$, is a constant vector, the closed form solution for the minimum variance portfolio is:

$$w_p = \frac{\Sigma_p^{-1} e}{e' \Sigma_p^{-1} e} \tag{4}$$

where $e$ is an $N \times 1$ vector of ones.

[1] Think about this point or draw a picture to convince yourself. What would the efficient frontier look like if all stocks have the same expected returns?

Estimating The Returns Covariance Matrix

The population covariance between two stocks' excess returns can be expressed as:

$$Cov(x_i, x_j) = E[(x_i - \mu_i)(x_j - \mu_j)] = \sigma_{i,j} \tag{5}$$

where $x_i$ is the $i$th stock's excess return and $E[x_i] = \mu_i$ is its expected excess return. The sample covariance over the period $\theta$, with observations $t = 1, \dots, T$, is:

$$\widehat{Cov}(x_i, x_j) = \frac{1}{T-1} \sum_{t=1}^{T} (x_i^t - \hat{\mu}_i)(x_j^t - \hat{\mu}_j) = \hat{\sigma}_{i,j} \tag{6}$$

where $\hat{\mu}_i$ is the sample mean of the $i$th stock, such that $\hat{\mu}_i = \frac{1}{T} \sum_{t=1}^{T} x_i^t$. Let $X$ be a $T \times N$ matrix whose columns are the stocks' excess returns, such that:

$$X = \begin{pmatrix} x_1^1 & x_2^1 & \cdots & x_N^1 \\ x_1^2 & x_2^2 & \cdots & x_N^2 \\ \vdots & \vdots & \ddots & \vdots \\ x_1^T & x_2^T & \cdots & x_N^T \end{pmatrix} \tag{7}$$

Then the sample excess returns covariance matrix over period $\theta$ can be defined as: [2]

$$\hat{\Sigma} = \frac{1}{T-1} X' \left( I - \frac{1}{T} e e' \right) X \tag{8}$$

Here $I$ is the $T \times T$ identity matrix and $e$ is a $T \times 1$ vector of ones. Or, more explicitly:

$$\hat{\Sigma} = \begin{pmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{1,2} & \cdots & \hat{\sigma}_{1,N} \\ \hat{\sigma}_{2,1} & \hat{\sigma}_2^2 & \cdots & \hat{\sigma}_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\sigma}_{N,1} & \hat{\sigma}_{N,2} & \cdots & \hat{\sigma}_N^2 \end{pmatrix} \tag{9}$$

[2] Notice that when there are more assets in the portfolio than returns observations, the sample covariance matrix is rendered singular. The rank of $I - \frac{1}{T} e e'$ is $T - 1$, so the rank of the covariance matrix can be at most $T - 1$. When $N > T - 1$, the sample covariance matrix will not have full rank and is degenerate.

SAS Examples

This section will give a few brief demonstrations of how tangency and minimum-variance portfolios can be constructed in SAS with popular academic data sets.

Common Data Formatting

After an accounting data set has been merged with a returns data set in SAS, it is not uncommon for a SAS table to appear in a format similar to the following (when sorted by ID and time):

    ID    Name             Month  Year  Tot Ret  RF      ...  Mkt Cap
    567   US Auto Parts    1      1975   0.0657  0.0044  ...  8573.21
    567   US Auto Parts    2      1975   0.0102  0.0043  ...  8573.21
    567   US Auto Parts    3      1975  -0.0037  0.0042  ...  8573.21
    567   US Auto Parts    4      1975  -0.0289  0.0043  ...  8573.21
    ...
    7824  Delta Financial  1      1975   0.0971  0.0044  ...   789.42
    7824  Delta Financial  2      1975   0.0213  0.0043  ...   789.42
    7824  Delta Financial  3      1975   0.0004  0.0042  ...   789.42
    7824  Delta Financial  4      1975  -0.0575  0.0043  ...   789.42
    ...

At the beginning of the examples, this data set is referred to as financial_data.

Data Preparation

This section will show how to take data formatted like the table above and transform it into naïve forecasts of mean excess returns and the excess returns covariance matrix, which can then be used to generate portfolios. Two examples will be given: the unconstrained and the constrained minimum variance portfolio.
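The target of the data preparation is the T × N returns matrix of equation (7) and its sample covariance matrix from equation (8). As a cross-check outside SAS, the following is a minimal Python sketch of that calculation. The function names and the small returns matrix are hypothetical and chosen only for illustration. Exact fractions are used so that the rank bound from footnote 2 (at most T − 1) can be verified without rounding error.

```python
from fractions import Fraction

def sample_covariance(X):
    """Sample covariance matrix of a T x N list of rows, as in equation (8)."""
    T, N = len(X), len(X[0])
    # Sample means: mu_hat_i = (1/T) * sum over t of x_i^t
    means = [sum(row[i] for row in X) / T for i in range(N)]
    # Demeaned returns; this is (I - ee'/T) X written out elementwise
    D = [[row[i] - means[i] for i in range(N)] for row in X]
    # X'(I - ee'/T)X / (T - 1), using that I - ee'/T is symmetric and idempotent
    return [[sum(D[t][i] * D[t][j] for t in range(T)) / (T - 1)
             for j in range(N)] for i in range(N)]

def matrix_rank(M):
    """Rank by exact Gaussian elimination (entries must be Fractions)."""
    A = [row[:] for row in M]
    rank = 0
    for col in range(len(A[0])):
        pivot = next((r for r in range(rank, len(A)) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        for r in range(rank + 1, len(A)):
            factor = A[r][col] / A[rank][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[rank])]
        rank += 1
    return rank

# Hypothetical excess returns (in percent): T = 3 months, N = 4 stocks
raw = [[1, 2, 0, 3],
       [2, 0, 1, 1],
       [0, 1, 4, 2]]
X = [[Fraction(v) for v in row] for row in raw]

S = sample_covariance(X)
T, N = len(X), len(X[0])
print("rank:", matrix_rank(S), "of", N, "(bound T - 1 =", T - 1, ")")
```

Because N = 4 exceeds T − 1 = 2 in this toy example, the estimated covariance matrix cannot be inverted, which is why the SAS examples use 60 months of returns for 25 stocks.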
Construction of the minimum variance portfolios will begin by sampling the largest 25 stocks, sorted by descending market capitalization on the last trading day of the year, with each stock having 60 consecutive months of nonmissing total returns available.

First, define the year for which the portfolio will be built. If a portfolio is being built for the year 2002, then monthly returns data from 1997 through the end of 2001 will be required. [3]

[3] In the following examples, only a portfolio for a single year is computed. In a historical backtest, many portfolios will need to be built for a given strategy across time. The syntax of these examples accommodates that reality: SAS macro variables can be used in do-loops, which can be used to iteratively build portfolios across time.

    /* Defines Year for Portfolio */
    %let year = 2002;

    /* Sorts for Data Step */
    proc sort data=financial_data;
        by id year month;
    run;

After the year for the portfolio is defined and the data are properly sorted, stocks without enough available returns are eliminated from the data. A variable named returns_count is created which counts the number of consecutive historical returns available for a stock on a given date. The count restarts at the first observation of each stock and after any missing return. Then, only stocks with at least 60 months of consecutive historical returns are retained at the time of construction.

    /* Counts Consecutive Non-missing Returns */
    data historical_returns_&year;
        set financial_data;
        where (year <= &year - 1) and (year >= &year - 5);
        by id year;
        /* Restart the count for each stock and after a missing return */
        if first.id or missing(tot_ret) then returns_count = 0;
        if not missing(tot_ret) then returns_count + 1;
    run;

    /* Lists Stocks w/ 60 mo. Returns */
    data holdings_&year(drop=count returns_count);
        set historical_returns_&year;
        where (month = 12) and (year = &year - 1) and (returns_count >= 60);
        length ref_id $12;
        count + 1;
        ref_id = catt('w', count);
    run;

Next, the largest 25 stocks (by market capitalization) will be chosen from the set of stocks with complete returns. The SORT and RANK procedures should be straightforward.
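The selection rule described so far (a full run of consecutive nonmissing returns, then the largest stocks by market capitalization) can be sketched outside SAS as follows. This is only an illustrative Python version under stated assumptions: the function name, the dictionary layout, and the toy data are hypothetical, and a missing return is represented by None. The toy example uses 3 months and 2 stocks in place of 60 months and 25 stocks.

```python
def eligible_top_k(histories, caps, months_required=60, k=25):
    """Return the ids of the k largest stocks with enough consecutive returns.

    histories: dict mapping id -> list of monthly returns, oldest first,
               with None marking a missing return (hypothetical layout).
    caps:      dict mapping id -> market capitalization at the formation date.
    """
    def trailing_run(returns):
        # Length of the run of non-missing returns ending at the last month
        run = 0
        for r in returns:
            run = 0 if r is None else run + 1
        return run

    eligible = [stock for stock, rets in histories.items()
                if trailing_run(rets) >= months_required]
    # Largest market capitalization first, then keep the top k
    return sorted(eligible, key=lambda stock: caps[stock], reverse=True)[:k]

# Toy data: stock B has a gap, so it is excluded despite the largest cap
histories = {
    "A": [0.01, 0.02, -0.01],
    "B": [0.03, None, 0.01],
    "C": [0.00, 0.01, 0.02],
    "D": [0.02, 0.01, 0.03],
}
caps = {"A": 500.0, "B": 900.0, "C": 300.0, "D": 700.0}

selected = eligible_top_k(histories, caps, months_required=3, k=2)
print(selected)
```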
The following data step merges the list of the largest 25 stocks with the set of complete returns, discarding data for stocks with market capitalizations ranked greater than 25. The WHERE= data set option restricts holdings_&year to ranks of 25 or better before the merge, and the variable x defined by the IN= data set option identifies the matching returns. If the set holdings_&year contributes an observation during the merge, then x is equal to one (and zero otherwise). The if statement keeps only observations where x is equal to one, which is equivalent to true as a boolean variable. The retained stocks are also renumbered so that ref_id runs from w1 to wN, because later steps refer to the columns by those names. Finally, the excess returns are defined by subtracting the risk free rate from the total return, which will be used later on.

    /* Sorts for Rank Procedure */
    proc sort data=holdings_&year;
        by year month descending mkt_cap;
    run;

    /* Ranks Market Capitalization (OUT= writes the ranks back to the table) */
    proc rank data=holdings_&year out=holdings_&year descending;
        by year month;
        ranks mkt_cap_rank;
        var mkt_cap;
    run;

    /* Sorts for Merge */
    proc sort data=holdings_&year;
        by id;
    run;

    /* Maps Returns to Stocks */
    data holdings_&year(drop = mkt_cap_rank stock_num);
        merge holdings_&year(in=x where=(mkt_cap_rank <= 25))
              historical_returns_&year;
        by id;
        if x;
        /* Renumbers ref_id as w1, ..., wN */
        if first.id then stock_num + 1;
        ref_id = catt('w', stock_num);
        excess_ret = tot_ret - rf;
    run;

In order to calculate a sample covariance matrix with the SAS procedure CORR, returns data must be presented as a time series column for each stock. To generate a data set with this format, the TRANSPOSE procedure can be used after sorting the data by date. Unnecessary variables are then deleted from the output, which is then used as input to the CORR procedure.

    /* Sorts for Transpose */
    proc sort data=holdings_&year;
        by year month;
    run;

    /* Transpose Data */
    proc transpose data=holdings_&year
                   out=transpose_holdings_&year
                   name=col_xpose;
        var excess_ret;
        by year month;
        id ref_id;
    run;

    /* Remove Excess Variables */
    data transpose_holdings_&year;
        set transpose_holdings_&year(drop=year month col_xpose);
    run;

    /* Covariance Matrix & Summary Statistics */
    proc corr data=transpose_holdings_&year
              cov
              outp=cov_matrix_summary_&year(type=cov)
              nocorr;
    run;

The data set cov_matrix_summary_&year will be used to build the unconstrained minimum-variance portfolio.
The IML procedure will be able to read the data as is and define a covariance matrix object and a vector of excess returns. These will then be algebraically manipulated to generate a vector of portfolio weights. Two additional data tables will be created with the mean excess returns and covariances for use in the OPTMODEL procedure, which will be used to generate portfolio weights for the constrained minimum variance portfolio.

    /* Table of Covariance Matrix Elements */
    data covariance_matrix(drop=_TYPE_ _NAME_);
        set cov_matrix_summary_&year;
        if _TYPE_ = 'COV';
    run;

    /* Table of Mean Returns */
    data mean_returns(drop=_TYPE_ _NAME_);
        set cov_matrix_summary_&year;
        if _TYPE_ = 'MEAN';
    run;

Lastly, when building constrained portfolios, you might want to let the number of securities in your portfolio vary from year to year based upon some criteria you choose. The OPTMODEL procedure will read the number of assets in your portfolio as a macro variable named n. The covariance table has one row per asset, so counting its rows gives the number of assets. To change the value of a macro variable while the program is running, the following trick works:

    /* Counts Number of Assets (one row of covariance_matrix per asset) */
    proc means noprint data=covariance_matrix;
        output out=n_&year n=constituents;
        var w1;
    run;

    /* Sets &n = # Assets */
    data _null_;
        set n_&year;
        call symputx('n', constituents);
    run;

The CALL SYMPUTX routine defines a global macro variable using data from a SAS data table and strips leading and trailing blanks from the value. This is convenient for defining macro variables during processing. A %LET statement, by contrast, only defines a macro variable when the code is first read. Using DATA _NULL_ lets the data step execute as usual, but no new data set is created or stored in the work library.

Global Minimum-Variance Portfolio

The IML procedure allows the user to perform matrix operations on SAS data tables. For example, we can find weights for the unconstrained minimum variance portfolio by reading the sample covariance matrix into a matrix object and performing a few algebraic operations.
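The algebraic operations in question are those of equation (4): solve for Σ⁻¹e and rescale so that the weights sum to one. As a way to check the output of any implementation by hand, here is a minimal Python sketch with exact fractions. The helper names and the 3 × 3 covariance matrix are hypothetical, and solving the linear system directly (rather than forming the inverse) is a choice made for this sketch, not something the notes prescribe.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            raise ValueError("covariance matrix is singular")
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [a - factor * v for a, v in zip(M[r], M[c])]
    return [row[n] for row in M]

def min_variance_weights(cov):
    """Equation (4): w = inv(cov) e / (e' inv(cov) e), with e a vector of ones."""
    z = solve(cov, [1] * len(cov))   # z = inv(cov) e
    total = sum(z)                   # e' inv(cov) e
    return [v / total for v in z]

# Hypothetical covariance matrix for three assets
cov = [[4, 1, 0],
       [1, 2, 0],
       [0, 0, 1]]

w = min_variance_weights(cov)
variance = sum(w[i] * cov[i][j] * w[j] for i in range(3) for j in range(3))
print(w, variance)
```

For this matrix the weights are 1/11, 3/11 and 7/11, and the portfolio variance of 7/11 is below the smallest individual variance of 1, as expected of the global minimum variance portfolio.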
The following code generates portfolio weights for the global minimum variance portfolio using the sample covariance matrix computed earlier with the CORR procedure. The use statement defines the data table from which the covariance matrix will be read. [4] Weights are printed to the output and a new data set is created in the SAS work library.

[4] Alternatively, you could read the column returns table generated from proc TRANSPOSE at this point, and use this to generate the covariance matrix.

    proc iml;
        /* Points To Data */
        use cov_matrix_summary_&year;

        /* Reading Data For Covariance (numeric columns only) */
        read all var _NUM_ where(_TYPE_='COV') into COV;
        close cov_matrix_summary_&year;

        /* Computes Inverse Covariance Matrix */
        COV_inv = INV(COV);

        /* Counts Number of Assets */
        N = nrow(COV_inv);

        /* Creates N x 1 Vector of Ones */
        e = j(N,1,1);

        /* Compute Weights */
        x = (COV_inv * e)/(t(e)*COV_inv*e);

        /* Print Solution Weights */
        print x;

        /* Create Solution Weights Data Set */
        create min_var_weights_&year var {x};
        append;
        close min_var_weights_&year;
    quit;

The IML procedure has many features: it allows the user to write if statements and do loops, and to define complex functions.

Minimum-Variance Portfolio with Constraints

The OPTMODEL procedure in SAS is a powerful tool which can be used to solve many different types of optimization problems. In the minimum variance example using OPTMODEL, the IPNLP (interior point nonlinear programming) solver is used to generate portfolio weights. Also, the objective function (expected portfolio variance) must be expressed algebraically as a sum in the OPTMODEL procedure:

$$f(w) = w' \hat{\Sigma}_p w = \sum_{i,j=1}^{N} w_i \hat{\sigma}_{i,j} w_j \tag{10}$$

The OPTMODEL procedure reads values from the table covariance_matrix into an $N \times N$ numeric parameter holding the covariance matrix. An $N \times 1$ vector of weights is created with values initialized to $1/N$ (equal weighting). Three constraints are defined: weights must sum to one, no single short position may exceed 10% of the portfolio value, and no single long position may exceed 10% of the portfolio value.
In sum, we are effectively solving the problem:

$$\begin{aligned} \underset{w}{\text{minimize}} \quad & f(w) \\ \text{subject to} \quad & \sum_{i=1}^{N} w_i = 1 \\ & w_i \le 0.10 \quad \forall i \\ & w_i \ge -0.10 \quad \forall i \end{aligned} \tag{11}$$

Also, notice that the variables are referred to by the ref_id names created earlier. This is because SAS can process enumerated lists of variables like w1, ..., wN much more efficiently than common financial identifiers. However, it is important to accurately map each ref_id back to its true financial identifier.

    proc optmodel;
        set N = 1..&n;               /* N number of assets index */
        num cov{N,N};                /* N x N covariance matrix  */
        var x{i in N} init 1/&n;     /* N x 1 vector of weights  */

        /* Objective Function */
        min f = (sum{i in N}(sum{j in N}(x[i]*cov[i,j]*x[j])));

        /* Constraints (c2 and c3 are applied to each weight) */
        con c1: sum{i in N} x[i] = 1;
        con c2{i in N}: x[i] >= -0.10;
        con c3{i in N}: x[i] <= 0.10;

        /* Reading Data */
        read data covariance_matrix into
            [_n_] {i in N} <cov[_n_,i]=col("w"||i)>;

        /* Execute Optimization */
        solve with ipnlp;

        /* Create Solution Weights Data Set */
        create data min_var_weights_&year from
            {i in N} <col("x"||i)=x[i]>;

        /* Print Solution Weights */
        print x;
    quit;
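Whatever solver produces the weights, it is worth confirming that they satisfy the constraints in (11) and evaluating the objective (10) independently. The following Python sketch does this for a hypothetical 20-asset example. The function names, the diagonal covariance matrix, and the candidate weight vectors are assumptions made for illustration, and exact fractions are used so that the sum-to-one check needs no tolerance (floating point weights from a solver would need one).

```python
from fractions import Fraction

def portfolio_variance(w, cov):
    """Objective (10): the double sum of w_i * sigma_ij * w_j."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))

def constraint_violations(w, lower=Fraction(-1, 10), upper=Fraction(1, 10)):
    """List every constraint in (11) that the weight vector w breaks."""
    problems = []
    if sum(w) != 1:
        problems.append("weights do not sum to one")
    for i, wi in enumerate(w, start=1):
        if wi < lower:
            problems.append(f"w{i} below lower bound")
        if wi > upper:
            problems.append(f"w{i} above upper bound")
    return problems

# Hypothetical inputs: 20 uncorrelated assets, each with variance 1/25
n = 20
cov = [[Fraction(1, 25) if i == j else Fraction(0) for j in range(n)]
       for i in range(n)]

equal = [Fraction(1, n)] * n
tilted = [Fraction(3, 20), Fraction(-1, 20)] + [Fraction(1, 20)] * (n - 2)

print(constraint_violations(equal), portfolio_variance(equal, cov))
print(constraint_violations(tilted), portfolio_variance(tilted, cov))
```

Note that with a 10% cap on every position the weights can sum to one only if the portfolio holds at least 10 assets, so the constrained problem is infeasible for smaller values of n.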