ORF 523 Advanced Programming (2012): Final Report

Tracking Index Return and Price with Constraints of Transaction Cost and Tracking Error

Changle Lin and Weichen Wang
Operations Research and Financial Engineering, Princeton University

Abstract: Index tracking is widely used in the financial industry for efficient investing, hedging and, more generally, tracking any desirable portfolio. We draw inspiration from the Support Vector Machine and formulate several optimization models using linear, quadratic and mixed-integer programming with different penalty functions. Based on these models, we develop a more realistic model: in addition to generating sparsity and tracking accurately, it minimizes transaction costs explicitly. Our empirical results indicate that this model tracks the target index very accurately at minimal cost, which makes it a very desirable tool for index tracking.

Key words and phrases: Optimization, Mixed-Integer Programming, Index Tracking, Support Vector Machine (SVM), Transaction Costs

1 Introduction

Index tracking, or more generally portfolio tracking, is a widely used tool in the financial industry. It is used for efficient investment, hedging and tracking a desirable portfolio in the general sense.

First, index tracking can be used to track the efficient portfolio. For example, in Markowitz's model (Markowitz, 1952), the market portfolio is theoretically the optimal portfolio, and we can use the stocks in the market to track it. Usually a composite index exists in the stock market to represent the market portfolio. For example, we can track the S&P 500 index to obtain a portfolio similar to the market portfolio.

Second, index tracking techniques can be used to construct a hedging portfolio. To hedge an equity portfolio, a tracking portfolio in the commodity sector may be very appealing to the investor because of the commodity market's low correlation with the equity market.
To hedge against positions in the U.S. or Europe, the investor may need a tracking portfolio in emerging markets like China, India or Brazil. Moreover, an investor can always track a basket of commodities for tail-risk hedging.

Third, the tracking technique can be used to track any portfolio an investor desires. For example, Robert C. Merton mentions in his paper Optimal Investment Strategies for University Endowment Funds (Merton, 1992) that a university endowment should track real estate assets to hedge the compensation risks of faculty members. An investor with limited funds may be unable to invest directly in real estate assets, but can use tracking techniques to invest in a portfolio with returns similar to those of the real estate sector.

The wide use of tracking techniques in the financial industry drives the need for a good index tracking model. From the discussion above, the needs of a tracking investor are clear: to track the target index or portfolio as closely as possible, and to do so at minimal cost. These needs set our modeling goals: minimize tracking error and tracking costs at the same time.

The costs incurred by investing can be categorized as follows:
(1) Transaction Costs. The fees charged by the exchange or broker-dealer when transactions happen.
(2) Execution Costs. The costs incurred by execution, often due to the price impact of buying or selling large volumes of assets.
(3) Research Costs. The costs of researching the market, e.g. reading industry or corporate reports and formulating strategies.
In this work we focus mainly on minimizing transaction costs, which are the most explicit and the easiest to model.
Index or portfolio tracking generates no research costs since it is passive asset management: the investor does not need to read industry or corporate reports. Execution costs are stochastic and are negligible for an investor with moderate funds.

Now our goal is clear: Track a Target Index or a Portfolio with Minimal Transaction Costs. Correspondingly, our optimization objectives are: (a) Minimize Tracking Error; (b) Minimize Transaction Costs.

Regarding transaction costs, we should first see how transaction fees are charged. Below is an excerpt from the SEC (U.S. Securities and Exchange Commission) website:

Washington, D.C., Mar. 1, 2012. Pursuant to Section 31 of the Securities Exchange Act of 1934, the Commission has determined that a mid-year adjustment to the Section 31 transaction fee rate is necessary. Effective on April 1, 2012, the Section 31 transaction fee rate will be set at $22.40 per million. (http://www.sec.gov/news/press/2012/2012-35.htm)

We can see that the transaction fee is proportional to the traded volume, with a fixed transaction fee rate. The SEC usually makes a mid-year adjustment to the rate. This fixed fee rate is reflected in our models.

In the following sections, we first draw inspiration from Support Vector Machine (SVM) regression to formulate several optimization models with different penalty functions, various kinds of constraints, and many other model settings. We implemented our optimization models using Gurobi mex in Matlab. Using daily data of stocks on the NYSE and the S&P 500 composite stock index, we carried out various empirical analyses to test our models and to learn what is important in modeling. We then formulate a more realistic model based on the results and understanding gained from the previous models. Finally, we present the tracking results of the final model and give some insights.
Moreover, several interesting questions remain for future work.

2 Models for Index Tracking

The first question we would like to ask about index tracking is: what should we track, index return or index price? The index return, or daily return, is defined as the difference between the index prices on two consecutive days divided by the price on the first day, while the index price is just the value itself. Tracking return and tracking price therefore use different data and generate different, albeit similar, results. As mentioned in Section 1, our goal is to construct a portfolio which mimics the return of the index, so both ideas will work. We develop models for each of them separately below. Which model one prefers depends on whether one cares more about daily returns or about the accumulated long-term return, which is represented by the price.

2.1 Our motivation: SVM

The Support Vector Machine (SVM) was originally designed for classification problems, but can be adapted for regression with a quantitative response. SVM regression solves the quadratic optimization problem

\[
\begin{aligned}
\min_{w,\,b,\,\xi^+,\,\xi^-}\quad & \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}(\xi_i^+ + \xi_i^-) \\
\text{s.t.}\quad & w^T x_i + b - y_i \le \varepsilon + \xi_i^+, \quad i = 1,\dots,l \\
& w^T x_i + b - y_i \ge -\varepsilon - \xi_i^-, \quad i = 1,\dots,l \\
& \xi_i^+ \ge 0,\ \xi_i^- \ge 0
\end{aligned}
\]

Minimizing the objective function of SVM regression is equivalent to minimizing the "ε-insensitive" loss function with an L2-norm penalty (regularized regression):

\[
L_\varepsilon(y_i - w^T x_i - b) =
\begin{cases}
0, & \text{if } |y_i - w^T x_i - b| < \varepsilon \\
|y_i - w^T x_i - b| - \varepsilon, & \text{otherwise}
\end{cases}
\]

The dual of SVM regression can be easily derived. The dual problem and the final regression function depend only on the inner products of data points. Hence we can generalize SVM from linear regression functions to more complicated functions using kernel methods. The L1-norm SVM has also been studied in recent research (Zhu et al., 2003).
The L1 norm can generate sparsity, which is promising for index tracking, since automatically choosing a parsimonious number of stocks usually means better prediction ability. We will see how these ideas motivate our modeling in the next section.

2.2 Tracking Index Return

Traditional index tracking methods often model the tracking error as the variance of the difference between the portfolio return and the index return, then minimize it. Let w be the weights of the stocks in a portfolio, r the daily returns of the stocks, and r_ind the daily return of the index. We minimize the variance under the constraint that the weights sum to 1. Normally, we also constrain each weight to be nonnegative, which means we do not allow short selling in the portfolio. Thus the optimization problem is:

\[
\begin{array}{ll}
\min_w & \mathrm{Var}(w^T r - r_{ind}) \\
\text{s.t.} & w^T \mathbf{1} = 1,\ w \ge 0
\end{array}
\quad\Longleftrightarrow\quad
\begin{array}{ll}
\min_w & w^T\, \mathrm{Var}(r - r_{ind}\mathbf{1})\, w \\
\text{s.t.} & w^T \mathbf{1} = 1,\ w \ge 0
\end{array}
\]

We see from the objective function that we need to know the covariance matrix of stock and index returns. That is why covariance matrix estimation plays a key role in traditional index tracking. The idea of estimating the covariance matrix is good in that the covariance matrix contains a lot of information about the market, but it is an undesirable approach for several reasons.
(1) The covariance matrix is usually huge. For example, more than 2500 stocks are listed on the NYSE. With just 500 stocks (S&P), the covariance matrix has more than 100000 entries to estimate.
(2) Hundreds of thousands of data points are needed to estimate it. This is almost impossible since the covariance structure changes with time and market conditions. We can only assume the same covariance over a short time, often less than a year.
(3) The data are always very noisy, especially high-frequency trading data. The denoising step may require even more skill.
There are indeed statistical methods to estimate the matrix under the assumption that it is sparse (Bickel and Levina, 2008a,b; Lam and Fan, 2009).
But these methods bring other issues, which we will not discuss in detail.

The traditional approach takes a random view of the returns, whereas we tend to regard them as fixed data. This is partly because of the intractability of covariance matrix estimation, but we also take the idea from SVM. SVM regards the input as given training data and focuses more on prediction performance. This suggests inferring directly from the data when we do not have sufficient information to estimate the density or moments of the returns. In his famous book The Nature of Statistical Learning Theory (Vapnik, 1996), Vapnik argues that we should make decisions directly from the data if we do not have enough information to do density estimation. SVM tends to generate more robust results.

In light of these SVM ideas, we avoid assuming an underlying covariance structure and propose our own models to track index returns. Similar to the SVM regression model, we adopt the L_ε loss function and propose the following quadratic optimization model, which we call the L2-norm SVM model:

\[
\begin{aligned}
\min_{w^t,\,\xi,\,\xi^*}\quad & \frac{1}{2}\sum_{i=1}^{N}(w_i^t - w_i^{t-1})^2 + \beta\sum_{h=1}^{t}(\xi_h + \xi_h^*) \\
\text{s.t.}\quad & -\sum_{i=1}^{N} w_i^t R_i^h + y^h \le \varepsilon + \xi_h, \quad h = 1,\dots,t \\
& \sum_{i=1}^{N} w_i^t R_i^h - y^h \le \varepsilon + \xi_h^*, \quad h = 1,\dots,t \\
& \sum_{i=1}^{N} w_i^t = 1, \quad w_i^t \ge 0 \\
& \xi_h \ge 0,\ \xi_h^* \ge 0
\end{aligned}
\]

where w_i^t is the weight of stock i held by the index fund at time t, with the weights summing to 1; R_i^h is the return of stock i at time h; y^h is the index return at time h, h = 1, ..., t; and w_i^{t-1} is the given weight held on the previous day, with w_i^0 = 0, which allows us to update w_i^t day by day. The first term in the objective function measures the difference between the weights on two consecutive days and thus represents the transaction costs: the larger the difference, the larger the proportion of stocks we have to buy or sell to update the portfolio, and the more transaction costs accrue.
The second term in the objective represents the tracking error, since ξ_h and ξ_h^* bound the negative part and the positive part, respectively, of the tracking error $\sum_{i=1}^{N} w_i^t R_i^h - y^h$. That is, we use the ε-insensitive loss function to represent the tracking error. β is the tuning parameter which balances the tracking error against the transaction fees. This formulation exactly meets our goal of minimizing the tracking error and transaction costs stated in Section 1. We will see a more direct way of modeling these two quantities in Section 2.3.

The model is not actually an SVM, since we add the constraint that the weights total 1. This changes the nature of SVM to a large degree. One of the advantages of SVM is that its dual problem depends only on the inner products of the data points, so SVM can map data into a higher-dimensional space using kernels and do linear classification or regression in that space. The constraint on the sum of the weights eliminates this advantage. But the model can still be very effective in capturing the features of the index tracking problem, as we will see in Section 3.

Motivated by the L1-norm SVM, which generates sparse results, we also try the following L1-norm SVM model:

\[
\begin{aligned}
\min_{w^t,\,\xi,\,\xi^*}\quad & \sum_{i=1}^{N}\left|w_i^t - w_i^{t-1}\right| + \beta\sum_{h=1}^{t}(\xi_h + \xi_h^*) \\
\text{s.t.}\quad & -\sum_{i=1}^{N} w_i^t R_i^h + y^h \le \varepsilon + \xi_h, \quad h = 1,\dots,t \\
& \sum_{i=1}^{N} w_i^t R_i^h - y^h \le \varepsilon + \xi_h^*, \quad h = 1,\dots,t \\
& \sum_{i=1}^{N} w_i^t = 1, \quad w_i^t \ge 0 \\
& \xi_h \ge 0,\ \xi_h^* \ge 0
\end{aligned}
\]

The L1-norm SVM model uses the L1 norm to measure the difference in weights. This is more intuitive than using the L2 norm because we tend to think of the transaction cost as linear in the traded volume, though there is research on nonlinear transaction costs.
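To make the two terms of the L1-norm objective concrete, the following Python sketch evaluates them for a given weight vector. It is only an illustration, not the authors' Matlab/Gurobi code: the function names and the toy numbers are assumptions. It uses the fact that, at an optimum with β > 0, the slack sum ξ_h + ξ_h^* equals the ε-insensitive loss of the day-h tracking error.

```python
def eps_insensitive(residual, eps):
    """Epsilon-insensitive loss: zero inside the band |residual| <= eps."""
    return max(0.0, abs(residual) - eps)


def l1_tracking_objective(w_new, w_old, stock_returns, index_returns, beta, eps):
    """L1 weight change (turnover) plus beta times the summed tracking loss.

    stock_returns[h][i] plays the role of R_i^h and index_returns[h] of y^h.
    """
    turnover = sum(abs(a - b) for a, b in zip(w_new, w_old))
    loss = 0.0
    for day_returns, y in zip(stock_returns, index_returns):
        portfolio_return = sum(w * r for w, r in zip(w_new, day_returns))
        loss += eps_insensitive(portfolio_return - y, eps)
    return turnover + beta * loss


# Toy data (made up): two stocks, two days, starting from zero weights.
w_old = [0.0, 0.0]
w_new = [0.5, 0.5]
stock_returns = [[0.01, 0.03], [0.02, -0.02]]
index_returns = [0.02, 0.01]

objective = l1_tracking_objective(
    w_new, w_old, stock_returns, index_returns, beta=2.0, eps=0.001
)
print(round(objective, 6))
```

On day 1 the portfolio matches the index, so the loss is zero; on day 2 the error of 0.01 exceeds ε by 0.009, so the objective is the turnover 1 plus 2 × 0.009.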
We can easily split the absolute value into positive and negative parts, so that the above becomes a linear optimization problem:

\[
\begin{aligned}
\min_{k^{t+},\,k^{t-},\,\xi,\,\xi^*}\quad & \sum_{i=1}^{N}(k_i^{t+} + k_i^{t-}) + \beta\sum_{h=1}^{t}(\xi_h + \xi_h^*) \\
\text{s.t.}\quad & -\sum_{i=1}^{N} w_i^{t-1} R_i^h - \sum_{i=1}^{N}(k_i^{t+} - k_i^{t-}) R_i^h + y^h \le \varepsilon + \xi_h, \quad h = 1,\dots,t \\
& \sum_{i=1}^{N} w_i^{t-1} R_i^h + \sum_{i=1}^{N}(k_i^{t+} - k_i^{t-}) R_i^h - y^h \le \varepsilon + \xi_h^*, \quad h = 1,\dots,t \\
& \sum_{i=1}^{N} w_i^{t-1} + \sum_{i=1}^{N}(k_i^{t+} - k_i^{t-}) = 1, \quad w_i^{t-1} + k_i^{t+} - k_i^{t-} \ge 0 \\
& \xi_h \ge 0,\ \xi_h^* \ge 0,\ k_i^{t+} \ge 0,\ k_i^{t-} \ge 0
\end{aligned}
\]

where $|w_i^t - w_i^{t-1}| = k_i^{t+} + k_i^{t-}$ and $w_i^t - w_i^{t-1} = k_i^{t+} - k_i^{t-}$.

There are three modifications of our model:
• Allowing short selling. The no-short constraints w_i^t ≥ 0 can be removed.
• Controlling the tracking error upper bound. In practice, a tracking error bound is usually stated explicitly by an index fund. It is one indicator of whether the index fund is worth investing in, so controlling the upper bound is important in real cases. This is easily done by adding the constraints ξ_h + ξ_h^* ≤ τ for some predetermined tracking error bound τ. We call this the ErrCtrl model.
• Controlling the number of stocks. Some investors would like to hold a limited number of stocks to make managing them easier. We can achieve this with the L0-norm constraint $\|w\|_0 \le M$, where M is the limit on the number of stocks. We can express this constraint with binary variables:

\[
w_i^t \le \gamma_i, \qquad \sum_{i=1}^{N}\gamma_i \le M, \qquad \gamma_i \in \{0, 1\}
\]

The L1- or L2-norm SVM model plus the above constraints is a mixed-integer programming problem. We call this the NumCtrl model.

Should we constrain the number of selected stocks? Some previous research in index tracking suggested that constraining the number of stocks is a good way to go. The arguments are that using fewer stocks could (1) reduce transaction costs, (2) solve the investor's limited-fund problem, and (3) provide more stable prediction. However, on deeper analysis, we find this method quite unappealing.
Our empirical results also corroborate that constraints on the number of selected stocks only make the tracking results worse.

First, fewer stocks do not necessarily reduce transaction costs. The transaction cost is usually proportional to the traded volume, with a fixed transaction fee rate. So if an investor uses 1 million dollars to track some target index, he has to pay the $22.40 transaction fee stipulated by the SEC no matter how many stocks he puts in his portfolio. Our empirical results show clearly that with constraints on the number of selected stocks, the tracking error becomes larger while transaction costs are not reduced. In the real world, such constraints can even make an investor worse off: they force the investor to put more money into each single stock, which causes a larger price impact on each stock and hence higher execution costs. In conclusion, constraining the number of selected stocks is not a good way to reduce transaction costs. Instead, we have developed models that minimize transaction costs explicitly.

Second, the argument that constraining the number of selected stocks solves the investor's limited-fund problem is not valid either. In Section 2.3 we construct a model that takes the investor's wealth as a parameter, so that the model gives different results to investors with different wealth. This gives a direct answer to the limited-fund problem.

Third, we agree that fewer stocks can provide more stable prediction. It is common sense that using a moderate number of features in a model is better for prediction than using too many features, which overfits the noise.
In our case, we can achieve a moderate number of stocks and stable tracking by using the L1-norm model. We therefore construct our price model in Section 2.3 to generate sparsity.

2.3 Tracking Index Price

The return tracking model above leaves several practical aspects out of consideration. First, it minimizes transaction costs only indirectly, through the L1 or L2 norm, rather than explicitly. Second, it regards the investor's total wealth as constant and considers only stock proportions. Moreover, it ignores the fact that stock shares should be integers and the possibility of holding cash. In this section, we therefore construct our price tracking model to (1) minimize transaction costs explicitly, (2) take the investor's initial wealth as a user-supplied parameter, (3) consider cash and integer stock shares, (4) again bound the tracking error and control transaction costs, (5) apply the L1 norm to generate sparsity, and (6) be fast and easy to implement.

Let TW_t be the total wealth at time t; S_i^t the number of shares of asset i at time t, with S_i^0 = 0; P_i^t the price of stock i for i ≥ 2, with P_1^t = 1 the price of cash; and Y_t the index value at time t. δ is the transaction fee rate and τ is the tracking error bound. The minimization problem is:

\[
\begin{aligned}
\min_{S^t}\quad & \delta\sum_{i=2}^{N}\left|S_i^t - S_i^{t-1}\right| P_i^t + \beta\sum_{h=1}^{t} L_\varepsilon\!\left(\sum_{i=1}^{N} S_i^t P_i^h - TW_{t-1}\frac{Y_h}{Y_{t-1}}\right) \\
\text{s.t.}\quad & \left|\sum_{i=1}^{N} S_i^t P_i^h - TW_{t-1}\frac{Y_h}{Y_{t-1}}\right| \le \tau\, TW_{t-1}, \quad h = 1, 2, \dots, t \\
& \sum_{i=1}^{N} S_i^t P_i^t + \delta\sum_{i=2}^{N}\left|S_i^t - S_i^{t-1}\right| P_i^t = \sum_{i=1}^{N} S_i^{t-1} P_i^t \\
& S_i^t \text{ is integer for } i \ge 2 \text{ (stock)}, \quad S_1^t \text{ is real (cash)}
\end{aligned}
\]

The model is similar to the L1-norm return tracking model. L_ε is still the ε-insensitive loss function given in Section 2.1 and β is the tuning parameter. The first term in the objective function is the explicit expression of linear transaction costs: the transaction fee rate δ times the traded volume (change in stock shares times stock prices).
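As a small numerical illustration of this first term, the Python sketch below computes δ times the traded dollar volume, with position 0 standing for cash (which is not charged). The function name, prices and share counts are made up for the example; only the $22.40 per million rate comes from the SEC excerpt in Section 1.

```python
def transaction_cost(shares_new, shares_old, prices, fee_rate):
    """fee_rate times traded dollar volume; position 0 is cash and is not charged."""
    traded_volume = sum(
        abs(new - old) * price
        for new, old, price in zip(shares_new[1:], shares_old[1:], prices[1:])
    )
    return fee_rate * traded_volume


SEC_RATE = 22.40 / 1_000_000  # $22.40 per million dollars traded

prices = [1.0, 60.0, 80.0]           # cash, stock A, stock B (hypothetical prices)
shares_old = [1_000_000.0, 0, 0]     # start fully in cash
two_stocks = [0.0, 10_000, 5_000]    # $600,000 + $400,000 traded
one_stock = [0.0, 0, 12_500]         # $1,000,000 traded in a single stock

cost_two = transaction_cost(two_stocks, shares_old, prices, SEC_RATE)
cost_one = transaction_cost(one_stock, shares_old, prices, SEC_RATE)
print(round(cost_two, 2), round(cost_one, 2))
```

Both trades move one million dollars, so both incur the same fee of about $22.40 regardless of how many stocks are bought, which is the point made in Section 2.2. For simplicity the example does not deduct the fee from the cash position.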
The tracking error is given by the second term, which measures the difference at time h between our portfolio and the index; TW_{t-1}/Y_{t-1} is the number of index shares we would hold if we invested all the wealth in the index. The first constraint bounds the maximal tracking error. The second constraint states that we have only limited wealth to pay for the transaction fees and the updated portfolio. We also require the numbers of stock shares to be integers to mimic the realistic situation. The problem can again be transformed into a mixed-integer linear programming problem:

\[
\begin{aligned}
\min\quad & \delta\sum_{i=2}^{N}(k_i^{t+} + k_i^{t-}) P_i^t + \beta\sum_{h=1}^{t}(\xi_h + \xi_h^*) \\
\text{s.t.}\quad & -\sum_{i=1}^{N}(k_i^{t+} - k_i^{t-} + S_i^{t-1}) P_i^h + TW_{t-1}\frac{Y_h}{Y_{t-1}} \le \varepsilon + \xi_h, \quad h = 1, 2, \dots, t \\
& \sum_{i=1}^{N}(k_i^{t+} - k_i^{t-} + S_i^{t-1}) P_i^h - TW_{t-1}\frac{Y_h}{Y_{t-1}} \le \varepsilon + \xi_h^*, \quad h = 1, 2, \dots, t \\
& \xi_h + \xi_h^* \le \tau\, TW_{t-1}, \quad h = 1, 2, \dots, t \\
& \sum_{i=1}^{N}(k_i^{t+} - k_i^{t-}) P_i^t + \delta\sum_{i=2}^{N}(k_i^{t+} + k_i^{t-}) P_i^t = 0 \\
& S_i^t \text{ is integer for } i \ge 2 \text{ (stock)}, \quad S_1^t \text{ is real (cash)}
\end{aligned}
\]

where $|S_i^t - S_i^{t-1}| = k_i^{t+} + k_i^{t-}$ and $S_i^t - S_i^{t-1} = k_i^{t+} - k_i^{t-}$, and where ξ_h and ξ_h^* bound the negative and positive parts of the deviation, so that, exactly when ε = 0 and up to the tolerance ε otherwise,

\[
\left|\sum_{i=1}^{N} S_i^t P_i^h - TW_{t-1}\frac{Y_h}{Y_{t-1}}\right| = \xi_h + \xi_h^*, \qquad
\sum_{i=1}^{N} S_i^t P_i^h - TW_{t-1}\frac{Y_h}{Y_{t-1}} = \xi_h^* - \xi_h
\]

The only problem is the expensive computation. For the MIP problem, Gurobi in most cases reaches the iteration limit and cannot return the optimal solution, and the run time is huge. So we modify the procedure slightly: we first solve the linear program without the integer constraint, then round the stock shares to the nearest integers and adjust cash accordingly so that the total wealth does not change. All prediction is then done with the integer stock shares. That is, if $\tilde{S}_i^t$ is the optimal number of shares without the integer constraint, then

\[
S_i^t = \left[\tilde{S}_i^t\right],\ i \ge 2, \qquad
S_1^t = \sum_{i=1}^{N}\tilde{S}_i^t P_i^t - \sum_{i=2}^{N}\left[\tilde{S}_i^t\right] P_i^t
\]

where $[\tilde{S}_i^t]$ denotes the nearest integer to $\tilde{S}_i^t$. After this approximation, the computation time decreases dramatically. We run the code on a 2.27 GHz CPU in Matlab using Gurobi mex. On average, one time step takes only 0.32 s.
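The rounding step described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' Matlab code; the function name and sample numbers are assumptions, and position 0 is taken to be cash with price 1. Note that Python's built-in round() sends exact halves to the nearest even integer, which may differ from other rounding conventions.

```python
def round_shares_keep_wealth(relaxed_shares, prices):
    """Round stock holdings to integers and reset cash so total wealth is unchanged.

    relaxed_shares[0] is the cash holding (price 1); the rest are stock shares
    from the LP relaxation.
    """
    total_wealth = sum(s * p for s, p in zip(relaxed_shares, prices))
    rounded_stocks = [round(s) for s in relaxed_shares[1:]]
    stock_value = sum(s * p for s, p in zip(rounded_stocks, prices[1:]))
    cash = total_wealth - stock_value  # S_1 = total wealth minus rounded stock value
    return [cash] + rounded_stocks


relaxed = [100.0, 10.4, 3.7]   # cash, stock A, stock B (made-up LP solution)
prices = [1.0, 50.0, 20.0]
rounded = round_shares_keep_wealth(relaxed, prices)
print(rounded)
```

Here 10.4 and 3.7 shares become 10 and 4, and cash moves from 100 to about 114 so that the total wealth of 694 is preserved. If rounding up costs more than the available cash, the cash entry produced by this sketch can become negative, so a practical implementation would need to check for that case.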
We can therefore easily update our portfolio every day to track the index price.

3 Data Analysis and Discussions

For empirical analysis, we use the S&P 500 composite stock index as our target index, and use the stocks listed on the NYSE at 1009 time points from Jan. 2, 2008 to Dec. 30, 2011 as our pool of assets to track the target index. The S&P 500 composite stock index was developed by Standard & Poor's, the most famous rating agency in the financial world, and is regarded as representative of the U.S. stock market. We can see from Figure 1 that the index is a good indicator of the U.S. stock market, or even the U.S. economy, with rises and drops corresponding to bull and bear markets: for example, the large drop in 2008 during the financial crisis and the slow rise over 2009, 2010 and 2011. The index contains 500 large-cap stocks and is calculated as a market-value-weighted average of their prices relative to a base time point. Standard & Poor's changes the component stocks and adjusts the weights over time.

We have chosen 506 stocks listed on the NYSE to track the S&P 500 composite stock index. These stocks are either in the S&P 500 component list or were once in the list. In our empirical analysis we find that we can track the index with far fewer than 500 stocks.

Figure 1: Returns and Prices of S&P Stocks and Index. (a) Returns of S&P Stocks and Index; (b) Prices of S&P Stocks and Index

3.1 Comparison of Tracking Return Models for One-Time Step

We use the returns of the S&P stocks and index over 120 days (from Jan. 20, 2011 to Jul. 12, 2011) to fit the model coefficients w_i^1, given the initial w_i^0 = 0. We then use the fitted model to predict the index return over the next 120 days and examine the mean and maximal prediction error over the next 120, 30 and 10 days. For the L1 or L2 models controlling the number of stocks, we choose the stock limit M = 40. For the L1 or L2 models controlling the tracking error bound, we set the upper bound to 10 basis points, τ = 0.001. Besides these, Table 1 also lists the average and maximal tracking error on the 120 training days, the number of stocks selected, the run time and the number of iterations. We consider no-short models first, with ε = 0.0001. For the tuning parameter β, we used the idea of validation: we tried a wide range of β values and chose the one with the minimax prediction error. We find that 5 × 10^{-6} and 0.05 are fair values for the L1 and L2 models respectively.

Table 1: Comparison of Tracking Return Models for One-Time Step

From the table, we notice several facts. The L1 norm gives a sparse model, while the L2 norm selects almost all of the 506 stocks. When controlling the number of stocks, the iteration limit (maximum 100000 iterations) is often reached, as for the L2-NumCtrl model, so the result is not optimal. However, even when the solution is optimal, as for L1-NumCtrl, the results are not good. In addition to the large tracking error, which is natural since the number of variables is limited, the prediction is quite bad for both the short-term period (10 days) and the long-term period (120 days). Our data analysis thus supports our statement in Section 2.2 that controlling the number of stocks is not as effective as previous literature suggested.

Controlling tracking error is important. That is why the error-control models perform better than the original L1 or L2 models, especially for short-term (10-day) prediction. Though we care about controlling tracking error, what really matters is the prediction error: even with almost zero tracking error, the prediction error is not zero in general. As a practical guideline, an index fund with a prediction error of more than 38 basis points (0.0038) is usually considered not worth investing in. In this sense, our model tracks the index satisfactorily at least within the first 10 days.
After that, the maximal prediction error grows fast. That is why we need to update the portfolio on a weekly or daily basis. We do this in Section 3.2 for the return tracking model and in Section 3.3 for the price tracking model.

We also plotted the tracked index return against the true index return for the one-time step. Figure 2 shows the comparison for L1-ErrCtrl and L1-NumCtrl over 120 tracking days and 120 prediction days. L2-ErrCtrl and L2-NumCtrl give similar results (not shown). We see again that controlling tracking errors gives almost zero tracking errors as well as better prediction errors than controlling the number of stocks. In general, we track the index quite well even with just one time step, as one can hardly see the difference between the two curves. In conclusion, number control and error control trade off against each other; we cannot achieve both. In practice we care more about error, and this indeed leads to better performance.

Figure 2: Comparison of True and Tracked Index Returns

Besides the above observations, we also compared short selling versus no short selling, and more versus fewer iterations for the models controlling the number of stocks. We found little difference with or without short selling: with short selling we get a slightly smaller tracking error, but no guarantee on prediction error. Similarly, more iterations make little difference. So if we stop at a certain iteration before the optimal solution is reached, we reduce the run time significantly without harming the prediction results much.

3.2 Comparison of Multi-Step and One-Step of Tracking

In this section, we used the same 240 days as in Section 3.1 to run the model, but updated the model every 10 days using the L1-ErrCtrl model. We used the first 120 days as training data to predict the returns of the next 10 days, then updated the portfolio and predicted the following 10 days.
We repeated this step 12 times to obtain the average prediction error over the total 120 days. Table 2 compares this multi-step tracking with one-step tracking, where we solve the problem only once and predict 120 days directly.

Table 2: Comparison of Multi-Step and One-Step of Tracking

To track the return, the multi-step model has to use more stocks on average. The average prediction error decreases from 15bp to almost zero. The maximal prediction error is just 23bp, which is below the practical guideline level of 38bp, while one-step tracking has a maximal prediction error of 47bp, which makes the one-step model unusable in the real market.

3.3 Results of Multi-Step Index Price Tracking

As discussed before, the price tracking model keeps many advantages of the return tracking model and adds more practical features, so it is our final model for the index tracking problem. We investigate its performance here. We choose ε = 0.0001 and τ = 10bp as before. The transaction fee rate δ is set to 22 × 10^{-6} according to the SEC. After trying a wide range of β values, we choose β = 0.0035.

We first plot the one-step prediction error in Figure 3. The left panel uses 120 dates from Jan. 2, 2008 to Jun. 23, 2008 to predict the following 100 days, while the right panel uses 120 dates from May 23, 2008 to Nov. 11, 2008 to predict the following 100 days. As with the return tracking models, one-step tracking is useful only for a short period: the prediction error grows rapidly after several days. This demonstrates again the necessity of multi-step tracking.

Finally, we applied our final model at a much bigger data scale. Starting from Jan. 2, 2008, we updated our model on a daily basis using the stock prices of the 120 days before the current day, and continued the procedure for 350 business days (almost one and a half years).
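The multi-step procedure amounts to a rolling-window schedule: fit on the most recent 120 days, predict the next block of days, then move forward. The Python sketch below only generates the index ranges for such a schedule; the function name is hypothetical, the model fit itself is omitted, and applying a sliding 120-day window to the 10-day schedule of Section 3.2 is an assumption (the text states the sliding window explicitly only for the daily updates).

```python
def rolling_schedule(n_days, window, horizon):
    """Yield (train_start, train_end, predict_end) as half-open day-index ranges.

    Each step trains on days [train_start, train_end) and predicts
    days [train_end, predict_end); the window then slides by `horizon` days.
    """
    train_end = window
    while train_end + horizon <= n_days:
        yield (train_end - window, train_end, train_end + horizon)
        train_end += horizon


# Section 3.2 setting: 240 days, 120-day window, update every 10 days.
multi_step = list(rolling_schedule(n_days=240, window=120, horizon=10))

# Section 3.3 setting: daily updates for 350 days after an initial 120-day window.
daily = list(rolling_schedule(n_days=120 + 350, window=120, horizon=1))

print(len(multi_step), multi_step[0], multi_step[-1])
print(len(daily))
```

In a full implementation, each tuple would drive one solve of the tracking model, with the previous step's holdings passed in as the starting portfolio.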
The tracked return, wealth (price) and tracking error are depicted in Figure 4. From Figure 4, it is clear that our model does a wonderful job of tracking the index. The prediction errors are very small; the transaction fees are controlled very well (no more than $0.3 per day with an initial wealth of $100000); and almost every main upward and downward trend of the index is captured by our portfolio.

Figure 3: One-Step Prediction Errors for Next 100 Days with Different Start Dates

Figure 4: Multi-Step Tracking Results for Next 350 Days. (a) Prediction Return and Index Return; (b) Total Wealth and Index Wealth; (c) Transaction Costs of Everyday Trading

4 Conclusions

In this paper, we drew inspiration from the Support Vector Machine and constructed models based on linear, quadratic and mixed-integer programming. At first, we constructed various models with different penalty functions and model settings to gain some understanding of index return tracking techniques. Based on the understanding and evidence collected from these models, we formulated a more realistic optimization model that performs better both in tracking the target index and in minimizing transaction costs. We used our models to track the S&P 500 composite stock index using stocks listed on the NYSE. Based on our empirical analysis, we can see that:
• Tracking errors on the training data sets and prediction errors are both very small in general, though the prediction error is generally much larger than the tracking error. The prediction error increases with time, as expected: new information comes into the market every day, so the weights and shares of stocks should be adjusted dynamically to track the index.
• Our final model reduces the transaction costs to a negligible level. This is good news for investors.
• From the various models we constructed, we can see that the L1-norm penalty generates sparsity. This is true in general (e.g. the Lasso). Since the L2 norm does not generate sparsity, we prefer the L1 norm in our final model.
• We have seen from the empirical results that a proper tuning parameter can reduce running time and increase prediction performance significantly. Obviously, a fair tuning parameter also depends on the transaction fee rate. We use the validation idea to select the tuning parameter.
• The tracking error upper bound is vital as well. Too tight an error bound may render the optimization problem infeasible, so that no results are generated.

In conclusion, we have used optimization models to construct a successful tracking technique which can track the target index as closely as desired with minimal transaction costs. We also addressed the limited-fund issue: the investor's fund size is a specifiable parameter in our model, and the tracking model gives suitable solutions to investors with different fund sizes.

In spite of the success of our model, it can still be improved in several ways. We used the naive validation method to choose the tuning parameters, but since the tuning parameter exerts significant influence on our tracking results, we would like a better way to estimate it. Also, in reality, going long and going short generate different transaction costs for investors, so differentiating between the transaction fees of long and short positions is a good direction. Moreover, we want to add the execution costs and price impact caused by large purchases or sales. The price impact is generally not negligible for big institutional investors, but, being stochastic, it needs more sophisticated methods to model. Before we get into more complicated modeling methods, our model already provides a good solution and guidance for the index tracking problem as a whole.

Acknowledgment

We thank Prof. Cook for the whole year's teaching and hard work, and our first-year PhD colleagues for their helpful discussions.
References

Markowitz, H.M. (1952). Portfolio Selection. The Journal of Finance 7(1), 77-91.
Merton, R.C. (1992). Optimal Investment Strategies for University Endowment Funds. Journal of Economic Dynamics and Control 16, 27-449.
Zhu, J., Rosset, S., Hastie, T. and Tibshirani, R. (2003). L1 norm support vector machines. Advances in Neural Information Processing Systems 16.
Bickel, P.J. and Levina, E. (2008a). Covariance Regularization by Thresholding. The Annals of Statistics 36(6), 2577-2604.
Bickel, P.J. and Levina, E. (2008b). Regularized Estimation of Large Covariance Matrices. The Annals of Statistics 36(1), 199-227.
Lam, C. and Fan, J. (2009). Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation. The Annals of Statistics 37(6B), 4254-4278.
Vapnik, V.N. (1996). The Nature of Statistical Learning Theory. Springer, New York.