Mutual Fund Flows: A study of variables influencing investment
Ted Morrissey
November 20, 2001
QTM7010-61
Professor Sharpe

Introduction

Investment occurs for many reasons. Companies invest their money in assets that they hope will generate revenue. Individuals invest their money hoping to earn a return and secure their future. This analysis examines which factors determined the flow of money into and out of US stock market mutual funds from 1984 to 1996.

Data

The Investment Company Institute tracked the money flowing into and out of US stock market mutual funds on a monthly basis from April 1984 until December 1996. Other economic variables recorded monthly during this period were the stock market’s return (%), the interest rate on one-year certificates of deposit (CDs), US disposable income per capita and the gold price per ounce (Sharpe, Ali and Potter, 2001).

Data Analysis

Scatter plots were constructed to determine whether any relationships appeared between these variables and mutual fund flows (Figures 1-4). Market return % and disposable income appeared to have linear relationships with fund flows. CD interest rates appeared to have a curvilinear relationship with fund flows. No relationship between gold prices and fund flows was visible during this exploratory analysis.

A correlation table was constructed to measure the relationship each variable had with mutual fund flows (Table 1). Market return %, CD interest rates and disposable income each had a p-value below the .05 significance level, which pointed to a significant relationship between these variables and mutual fund flows.
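The p-values in Table 1 can be cross-checked by hand. The following is a minimal sketch, assuming the full sample contains n = 153 monthly observations (April 1984 through December 1996; this count is inferred from the date range, not stated in the case). A correlation r converts to a t statistic with n - 2 degrees of freedom, which for a single predictor should match the T value of that predictor in the corresponding simple regression. The function name is illustrative.

```python
import math

def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, given sample correlation r and n observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Assumption: 153 monthly observations (April 1984 to December 1996).
N_MONTHS = 153

# Correlations with fund flows, taken from Table 1.
table_1 = {
    "Market Return %": 0.204,
    "CD Interest Rate": -0.629,
    "US Disposable Income": 0.665,
    "Gold Price/oz.": -0.071,
}

for name, r in table_1.items():
    print(f"{name:22s} r = {r:+.3f}  t = {correlation_t_stat(r, N_MONTHS):+.2f}")
```

Under that assumption the results land close to the T values Minitab reports in Tables 2-5 (2.56, -9.93, 10.95 and -0.87); the small differences come from the correlations being rounded to three decimals.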
Disposable income had the strongest correlation with fund flows (.665), followed by CD interest rates (-.629) and market return % (.204). Gold prices did not appear to have a significant relationship with mutual fund flows.

A potential area of concern was observed during the examination of the correlation table. Collinearity appeared to exist between CD interest rates and disposable income (r = -.595; Table 1), which could lead to unstable coefficients in a multiple regression model. Figure 5 suggested a negative linear relationship between the two variables, indicating that when both are included in a multiple regression equation, each may mask part of the other’s effect on mutual fund flows.

Further analysis consisted of a simple regression of fund flows on each variable (Tables 2-5). US disposable income (Table 4) had the highest coefficient of determination (44.3%), with a similar adjusted value (43.9%). The fitted relationship was as follows: for every additional dollar of US disposable income per capita, an additional $5.64 million flowed into US stock market mutual funds each month (Table 4).

CD interest rates (Table 3) also had a high coefficient of determination with respect to fund flows (39.5%). However, the plot of this equation’s residuals versus fits (Figure 7) showed a pattern in the form of a curve. A linear model did not sufficiently describe this relationship, because the residuals did not vary randomly across the fitted values. Since this variable was initially hypothesized to have a curvilinear relationship with fund flows, a squared CD interest rate term was added to the model (Table 8). The plot of residuals versus fits for this model (Figure 11) indicated that it satisfied the assumption of randomly scattered residuals needed to describe the relationship between fund flows and CD interest rates as curvilinear.
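To see what the curvilinear fit implies, the quadratic from Table 8 can be evaluated directly. This is a minimal sketch using the reported coefficients. The function names are illustrative, and the turning-point formula is the standard vertex of a parabola, not something computed in the case itself.

```python
# Coefficients of the quadratic model reported in Table 8.
B0, B1, B2 = 22899.0, -3836.7, 149.62

def predicted_flow(cd_rate: float) -> float:
    """Predicted fund flow ($millions) at a given one-year CD interest rate (%)."""
    return B0 + B1 * cd_rate + B2 * cd_rate ** 2

def turning_point() -> float:
    """CD rate at which the fitted parabola stops falling: -b1 / (2 * b2)."""
    return -B1 / (2.0 * B2)

for rate in (4.0, 6.0, 8.0, 10.0):
    print(f"CD rate {rate:4.1f}%  predicted flow {predicted_flow(rate):9.0f}")

print(f"turning point at about {turning_point():.1f}%")
```

The positive squared term means the fitted curve falls as CD rates rise but flattens at higher rates. The turning point works out to about 12.8%, so for any CD rate below that level the model predicts lower fund flows as rates increase.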
Multiple regression analysis with market return %, CD interest rates and disposable income (Table 6) raised the coefficient of determination to 57.7%, but collinearity was a concern, as was the fact that the CD interest rate variable, in its original form, did not adequately describe its relationship with fund flows (which also affected the plot of residuals versus fits for this equation; Figure 9). Market return % and disposable income together (Table 7) provided a coefficient of determination (48.6%) that was lower than that of the three-variable model, yet this model was less exposed to collinearity. Therefore, it was considered the most reliable.

A previous data analysis (Mutual Fund Flows Case A: Sharpe, Ali and Potter, 2001) found that market return % showed different patterns with mutual fund flows depending on the time frame (i.e. 1984-1989 and 1990-1996). It was therefore hypothesized that different variables may have been more or less significant depending on the time period analyzed, and regression analysis was conducted for each time period (Tables 10-18).

From 1984 until 1989, market return % appeared to be the most significant factor in determining mutual fund flows, with the highest coefficient of determination (38.7%; Table 11) as well as the highest correlation (.622; Table 10). An interesting discovery was that during this period an increase in disposable income was associated with a decrease in money invested in mutual funds (Tables 10 & 12). Once again, CD interest rates and disposable income appeared to be related (Table 10). From 1984-1989, gold prices appeared to have a significant relationship with mutual fund flows (Tables 10 & 13). To achieve a better coefficient of determination, multiple regression analysis was conducted (Table 14). When all variables were taken into consideration, the gold price was no longer significant (p = .224; Table 14).
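Because adding predictors can never lower R-Sq, the adjusted values that Minitab prints alongside it are the fairer basis for comparing models of different sizes. The following is a minimal sketch of the standard adjustment, assuming n = 153 months in the full sample and n = 69 months for April 1984 through December 1989 (both counts are inferred from the date ranges, not stated in the case). The function name is illustrative.

```python
def adjusted_r_squared(r_sq: float, n: int, k: int) -> float:
    """Adjusted R-squared for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r_sq) * (n - 1) / (n - k - 1)

# (label, reported R-Sq, assumed n, number of predictors)
models = [
    ("Table 6: three predictors, entire sample", 0.577, 153, 3),
    ("Table 7: two predictors, entire sample", 0.486, 153, 2),
    ("Table 14: four predictors, 1984-1989", 0.575, 69, 4),
]

for label, r_sq, n, k in models:
    adj = adjusted_r_squared(r_sq, n, k)
    print(f"{label:45s} R-Sq = {r_sq:.1%}  R-Sq(adj) = {adj:.1%}")
```

Under those assumptions the results agree with the reported R-Sq(adj) values (56.9%, 47.9% and 54.8%) to within the rounding of the inputs.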
These variables played a much different role in mutual fund flows from 1990-1996 (Table 15). Disposable income had the highest correlation with fund flows (.655), while the correlation of market return % with fund flows during this period (.282) was similar to that of gold prices (.260). Collinearity between CD interest rates and disposable income did not appear to be of concern for this period (Table 15; Figure 15). A multiple regression equation involving all variables (Table 17) had a high coefficient of determination (71.8%), but once again collinearity, this time between disposable income and gold prices, raised concern. To obtain a more reliable model, gold prices were excluded from the multiple regression analysis (Table 18), and a high coefficient of determination was still obtained (68.2%).

Conclusion

It makes sense that disposable income and market return % were the most influential variables in fund flows. People who have money will want to watch it grow, so they will invest it. If the market is providing a good return, they will invest their money there. Put these two variables together (Table 7), and one has both money to invest and a place to earn a return on it. If other investment opportunities, such as CDs, provide a better return, then it would be wise to withdraw money from mutual funds and invest in the better opportunity (Table 8). Another relationship that would be interesting to examine is that between the actual worth of a mutual fund and the money flowing into it.

It was interesting to see the difference between the investment practices of two different decades. In the 1980s investors were more conservative, and market return % had the heaviest influence on money flowing into mutual funds. The 1990s saw a booming economy, and people were much more willing to invest their money in the market, hoping to watch it grow.
With the economy in its current state, it would be interesting to examine mutual fund flows for this decade. Combined with the variable of a mutual fund’s worth, a full analysis of mutual fund investment could be completed for the 1980s, 1990s and 2000s, covering a full range of economic conditions and providing a better understanding of when to invest in a mutual fund and when to take that investment elsewhere.

Bibliography:

McClave, James T., Benson, P. George and Sincich, Terry (2001), Statistics for Business and Economics, 8th Edition. New Jersey: Prentice Hall.

Sharpe, N., Ali, A., and Potter, M.E. (2001), A Casebook for Business Statistics: Laboratories for Decision Making. NY: John Wiley & Sons, Inc.

All tables and graphs were constructed in the Minitab program.

Figure 1: Mutual Fund Flows to Market Return % (Entire Sample)
[Scatter plot: Fund Flows ($millions) versus Market Return (%)]

Figure 2: Mutual Fund Flows to CD Interest Rate (Entire Sample)
[Scatter plot: Fund Flows ($millions) versus CD_Interest_Rate]

Figure 3: Mutual Fund Flows to US Disposable Income per Capita (Entire Sample)
[Scatter plot: Fund Flows ($millions) versus US Disposable Income per Capita]

Figure 4: Mutual Fund Flows to Gold Price/oz. (Entire Sample)
[Scatter plot: Fund Flows ($millions) versus Gold price per oz.]

Table 1: Correlation Chart (Entire Sample)

                         Fund Flows   Market%   CD Rate   Dispos Inc.
Market Return %             0.204
                            0.011
CD Interest Rate           -0.629      0.049
                            0.000      0.546
US Disposable Income        0.665     -0.006    -0.595
                            0.000      0.941     0.000
Gold Price/oz.             -0.071     -0.067    -0.005     0.188
                            0.386      0.412     0.955     0.020

Cell Contents: Correlation
               P-Value

Figure 5: CD Interest Rate to US Disposable Income (Entire Sample)
[Scatter plot: CD_Interest_Rate versus US_Disposable_Income_per_Capita]

Table 2: Regression Analysis of Fund Flows by Market Return % (Entire Sample)

The regression equation is
Fund Flows ($millions) = 4589 + 310 Market Return (%)

Predictor             Coef     StDev        T       P
Constant            4589.1     523.3     8.77   0.000
Market %             310.2     121.1     2.56   0.011

S = 6145   R-Sq = 4.2%   R-Sq(adj) = 3.5%

Figure 6: Residuals of Market Return % v. Fits (Entire Sample)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 3: Regression Analysis of Fund Flows by CD Interest Rate (Entire Sample)

The regression equation is
Fund Flows ($millions) = 16470 - 1761 CD Interest Rate

Predictor             Coef     StDev        T       P
Constant             16470      1219    13.51   0.000
CD Rate            -1761.2     177.3    -9.93   0.000

S = 4882   R-Sq = 39.5%   R-Sq(adj) = 39.1%

Figure 7: Residuals of CD Interest Rate vs. Fits (Entire Sample)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 4: Regression Analysis of Fund Flows by US Disposable Income (Entire Sample)

The regression equation is
Fund Flows ($millions) = - 94972 + 5.64 US Disposable Income

Predictor             Coef     StDev        T       P
Constant            -94972      9140   -10.39   0.000
Disposable Income   5.6359    0.5148    10.95   0.000

S = 4686   R-Sq = 44.3%   R-Sq(adj) = 43.9%

Figure 8: Residuals of US Disposable Income vs. Fits (Entire Sample)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 5: Regression Analysis of Fund Flows by Gold Price/oz. (Entire Sample)

The regression equation is
Fund Flows ($millions) = 9384 - 11.6 Gold price/oz.

Predictor             Coef     StDev        T       P
Constant              9384      5058     1.86   0.065
Gold price          -11.58     13.32    -0.87   0.386

S = 6261   R-Sq = 0.5%   R-Sq(adj) = 0.0%

Table 6: Multiple Regression Equation for Fund Flows (Entire Sample)

The regression equation is
Fund Flows ($millions) = - 55144 + 342 Market Return (%) - 1054 CD Interest Rate + 3.75 US Disposable Income

Predictor             Coef     StDev        T       P
Constant            -55144     10732    -5.14   0.000
Market %            342.41     81.10     4.22   0.000
CD Rate            -1054.3     186.0    -5.67   0.000
Disposable income   3.7513    0.5617     6.68   0.000

S = 4109   R-Sq = 57.7%   R-Sq(adj) = 56.9%

Figure 9: Residuals vs. Fits of Multiple Regression Model (Entire Sample)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 7: Multiple Regression Model for Market Return & Disposable Income (Entire Sample)

The regression equation is
Fund Flows ($millions) = - 95591 + 316 Market Return (%) + 5.65 US_Disposable_Income_per_Capita

Predictor             Coef     StDev        T       P
Constant            -95591      8809   -10.85   0.000
Market %            316.35     88.98     3.56   0.001
Disposable income   5.6466    0.4960    11.38   0.000

S = 4516   R-Sq = 48.6%   R-Sq(adj) = 47.9%

Figure 10: Residuals vs. Fits of Market Return & Disposable Income (Entire Sample)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 8: Regression Analysis for CD Interest Rate (Adjusted Variable) (Entire Sample)

The regression equation is
Fund Flows ($millions) = 22899 - 3837 CD Interest Rate + 150 cd interest rate squared

Predictor             Coef     StDev        T       P
Constant             22899      2730     8.39   0.000
CD rate            -3836.7     811.1    -4.73   0.000
CD rate squ.        149.62     57.11     2.62   0.010

S = 4790   R-Sq = 42.2%   R-Sq(adj) = 41.4%

Figure 11: Residuals of Adjusted CD Rate vs. Fits (Entire Sample)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 9: Correlation Chart (Entire Data) with Adjusted Variable

                            Fund Flows   Market%   CD Rate   Dispos Inc.
Market Return %                0.204
                               0.011
CD Interest Rate              -0.629      0.049
                               0.000      0.546
US Disposable Income           0.665     -0.006    -0.595
                               0.000      0.941     0.000
CD Interest Rate Squared      -0.579      0.038     0.977    -0.612
                               0.000      0.639     0.000     0.000

Table 10: Correlation Model (1984-1989)

                         Fund Flows   Market%   CD Rate   Dispos Inc.
Market Return %             0.622
                            0.000
CD Rate                    -0.177      0.011
                            0.145      0.929
US Disposable Income       -0.290      0.019    -0.315
                            0.016      0.874     0.008
Gold price                 -0.243     -0.121    -0.399     0.524
                            0.044      0.322     0.001     0.000

Table 11: Regression Analysis of Fund Flows by Market Return % (1984-1989)

The regression equation is
Fund Flows ($millions) = 206 + 262 Market Return (%)

Predictor             Coef     StDev        T       P
Constant             205.9     206.2     1.00   0.322
Market %            261.50     40.23     6.50   0.000

S = 1630   R-Sq = 38.7%   R-Sq(adj) = 37.8%

Figure 12: Residuals of Market Return vs. Fits (1984-1989)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 12: Regression Analysis of Fund Flows by US Disposable Income (1984-1989)

The regression equation is
Fund Flows ($millions) = 20003 - 1.13 US Disposable Income

Predictor             Coef     StDev        T       P
Constant             20003      7825     2.56   0.013
Disposable Income  -1.1324    0.4569    -2.48   0.016

S = 1992   R-Sq = 8.4%   R-Sq(adj) = 7.0%

Figure 13: Residuals of Disposable Income vs. Fits (1984-1989)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 13: Regression Analysis of Fund Flows by Gold Price/oz. (1984-1989)

The regression equation is
Fund Flows ($millions) = 4343 - 9.70 Gold _price_per_oz

Predictor             Coef     StDev        T       P
Constant              4343      1832     2.37   0.021
Gold price          -9.701     4.728    -2.05   0.044

S = 2019   R-Sq = 5.9%   R-Sq(adj) = 4.5%

Figure 14: Residuals of Gold Price/oz. vs. Fits (1984-1989)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 14: Multiple Regression Model for Fund Flows (1984-1989)

The regression equation is
Fund Flows ($millions) = 28469 + 260 Market Return (%) - 407 CD Interest Rate - 1.35 US Disposable Income - 4.96 Gold _price_per_oz

Predictor             Coef     StDev        T       P
Constant             28469      6182     4.61   0.000
Market R            259.57     34.71     7.48   0.000
CD Rate             -406.7     106.7    -3.81   0.000
Disposable Income  -1.3465    0.3792    -3.55   0.001
Gold price          -4.958     4.038    -1.23   0.224

S = 1389   R-Sq = 57.5%   R-Sq(adj) = 54.8%

Figure 14: Residuals of Multiple Regression Model vs. Fits (1984-1989)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 15: Correlation Chart (1990-1996)

                 Fund Flows   Market%   CD Rate   Dispos. Inc.
Market %            0.282
                    0.009
CD Rate            -0.467      0.040
                    0.000      0.718
Dispos. Inc.        0.655      0.090    -0.041
                    0.000      0.414     0.708
Gold price          0.260      0.059     0.401     0.455
                    0.017      0.595     0.000     0.000

Figure 15: Matrix Plot of Fund Flows to Variables (1990-1996)
[Scatter plot matrix of Fund Flows ($millions), Market Return (%), CD_Interest_Rate, US_Disposable_Income_per_Capita and Gold _price_per_oz]

Table 16: Regression Analysis of Fund Flows by Disposable Income (1990-1996)

The regression equation is
Fund Flows ($millions) = - 165690 + 9.55 US Disposable Income

Predictor             Coef     StDev        T       P
Constant           -165690     22241    -7.45   0.000
Disposable Income    9.550     1.218     7.84   0.000

S = 4748   R-Sq = 42.8%   R-Sq(adj) = 42.1%

Figure 16: Residuals of Disposable Income vs. Fits (1990-1996)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 17: Multiple Regression Analysis of Fund Flows (1990-1996)

The regression equation is
Fund Flows ($millions) = - 141998 + 458 Market Return (%) - 2114 CD Interest Rate + 7.30 US Disposable Income + 74.7 Gold price/oz.

Predictor             Coef     StDev        T       P
Constant           -141998     16155    -8.79   0.000
Market R             457.9     112.2     4.08   0.000
CD Rate            -2113.8     259.8    -8.14   0.000
Disposable Income    7.297     1.022     7.14   0.000
Gold _pr             74.74     23.71     3.15   0.002

S = 3400   R-Sq = 71.8%   R-Sq(adj) = 70.3%

Figure 17: Residuals of Multiple Regression vs. Fits (1990-1996)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]

Table 18: Adjusted Multiple Regression Model (1990-1996)

The regression equation is
Fund Flows ($millions) = - 146414 + 458 Market Return (%) - 1727 CD Interest Rate + 8.95 US Disposable Income

Predictor             Coef     StDev        T       P
Constant           -146414     16969    -8.63   0.000
Market R             457.6     118.3     3.87   0.000
CD Rate            -1727.3     241.5    -7.15   0.000
Disposable Income   8.9542    0.9246     9.68   0.000

S = 3585   R-Sq = 68.2%   R-Sq(adj) = 67.0%

Figure 18: Residuals of Adjusted Model and Fits (1990-1996)
[Residuals Versus the Fitted Values (response is Fund Flo): Residual versus Fitted Value]