An “Expert System” for Asset Timing
by Michael L. Tindall, Ph.D.
Revised 12-26-01
SUMMARY
This paper presents an expert system for making asset-allocation decisions
between equities and riskless assets. The system makes use of the fact that
individuals are able to incorporate only small information sets in their
asset-allocation decisions whereas expert systems can in principle incorporate
information sets which approximate the content of the information set available to
the entire market. Using principal components analysis, it is possible to construct
a system that incorporates the information content of large sets of data into the
asset-allocation decision. In within-sample and out-of-sample tests, the system
produces positive excess returns.
BACKGROUND AND INTRODUCTION
Advances in economic analysis in recent years have led to the possibility of
building expert systems, or economic decision-making systems which employ very
large information sets and replace or replicate human decision making. The
expert system proposed here would make decisions concerning the allocation of
funds between risky assets and "riskless" alternatives.
In a yet-to-be-published research paper, “Monetary Policy in a Data-Rich
Environment,” Ben Bernanke, chairman of the economics department at Princeton
University and one of the country’s leading macroeconomists, and Jean Boivin of
Columbia University have constructed an expert system for making monetary
policy decisions as the policymakers at the Federal Reserve Board do. The
system formulates monetary policy as a recommended target level of the federal
funds rate. The central premise of the Bernanke-Boivin paper is that, to date,
academic research of Federal Reserve decision-making has been based on the
notion that the Federal Reserve takes only a handful of economic variables into
account in its policy decisions. Academic researchers have attempted to model
Federal Reserve manipulation of the federal funds rate by fitting the rate to a small
set of variables to construct a “reaction function” of the central bank. This function
supposedly shows how the central bank changes the federal funds rate in
response to the small sets of variables. In contrast, Bernanke-Boivin state that a
careful examination of the internal workings of the Federal Reserve shows that it
takes into account a large information set, i.e. a large set of economic and financial
time series, in making monetary policy decisions. Federal Reserve officers from
district banks and branch facilities of district banks funnel information from bankers
and businessmen regarding local and regional economic conditions to district
presidents and on to the Federal Open Market Committee. This information is
combined with research from economists both at district banks and at the Board of
Governors in Washington, D.C. Thus, a very large information set is incorporated
into policy decisions. Bernanke-Boivin note that the relevant content of large
information sets can be captured with a statistical technique called “principal
components analysis” (PCA). Using this analysis, a large part of the relevant
content of large sets of information, i.e. economic and financial time series, can be
extracted in the form of a tractable set of a few time series which are themselves
linear weightings of the original set of series. Bernanke-Boivin substitute the
information content from principal components into the “Taylor rule,” a rule about
how Federal Reserve policy decisions are made. They find that policy formulation
from this system closely parallels actual Federal Reserve interest-rate decisions.
The Bernanke-Boivin paper is based on the idea that forecasting accuracy and
policy formulation can be improved dramatically using what is called a "large
model" approach. In conventional models--even those containing thousands of
equations--each equation contains only a handful of variables as determinants.
This is a "small model" approach. Bernanke-Boivin employ the large-model
econometric methods pioneered by J.H. Stock and Mark Watson.
In an unpublished paper, "Macroeconomic Forecasting Using Many Predictors,"
Watson states, "The last twenty-five years has seen enormous intellectual effort
and progress on the development of small-scale macroeconometric models....
Despite this..., it is also not too much of an overstatement to say that these
small-scale macroeconometric models have had little effect on practical
macroeconomic forecasting and policymaking. There are several reasons for this,
but the most obvious is the inherent defect of small models: they include only a
small number of variables. Practical forecasters and policymakers find it useful to
extract information from many more series...."
Turning to the large-model approach, Watson says, "This research program is not
new..., but it is still immature.... Yet, the results that we do have suggest that there
may be large payoffs from this 'large model' approach." The model, or expert
system, which we develop in this paper is nonlinear. In his paper, Watson says, "While some may scoff at the notion that there are large gains from modeling
nonlinearity in macroeconomic time series, much of this (well-founded) skepticism
comes from experience with small models. But large-model results can be quite
different."
The implication of this research for asset timing is straightforward. Just as
academic researchers have concentrated on small information sets in their studies
of the Federal Reserve, portfolio managers typically use relatively small
information sets in determining asset weightings. That is, they do not employ the
information set available to the entire market. Portfolio managers may outperform
the market if they have the requisite ability to interpret small information sets. The
market as a whole, however, incorporates all relevant information by definition,
making it difficult for individual portfolio managers to outperform the market.
As in the Bernanke-Boivin analysis, the information content of large sets of data
relevant for asset timing can be extracted by means of PCA. Then, the principal
components can be inputted into a rule for allocating portfolio funds between risky
and riskless assets.
Such a system constitutes an expert system for asset timing. This paper examines
the structure of such a system.
METHODOLOGY
Consider the problem of allocating funds period by period between a risky asset in
the form of the Standard & Poor's 500 index, or the futures contract for that index,
and a "riskless" asset in the form of the 90-day Treasury bill. The problem can be
thought of as maximizing 1) the mean return over time of a portfolio consisting of a
mix of the stock index and the Treasury bill subject to a decision-making rule
regarding asset allocation or 2) the mean return of such a portfolio relative to its
standard deviation of return. For simplicity, we will examine the former problem in
this paper although the method for its solution can be easily adapted to solve the
latter problem.
Let a_t be the portfolio weighting of the stock index at time t so that 1 - a_t is the
weighting of the Treasury bill. The decision maker has available at time t a set of
information in the form of k economic time series x_{1t}, x_{2t}, x_{3t}, ..., x_{kt}. For simplicity,
we assume that the decision regarding the value of a_t is made according to an
equation where a_t is a weighting of the data in the information set. The weights are
denoted w_i. Maximization of portfolio mean return is the same as maximizing final
portfolio value, and the problem can be stated as follows:
Max p_n
with respect to w_i, i = 0, 1, 2, ..., k
subject to:
p_t = p_{t-1} [a_t (1 + r_{s,t}) + (1 - a_t)(1 + r_t)] + c_t
a_t = w_0 + w_1 x_{1t} + w_2 x_{2t} + w_3 x_{3t} + ... + w_k x_{kt}
where p_n is the portfolio value in the final sample period n, p_t is the portfolio value at
time t, r_{s,t} is the return of the stock index, r_t is the rate of return of the riskless asset,
and c_t is the transaction cost of reallocating the portfolio between stocks and bills in
period t.
The equation for a_t could yield a value of a_t which is less than zero, resulting in
short selling, or greater than the maximum weight allowed by the terms of the
investment fund. The equation for a_t can be modified to incorporate a restriction
that prevents such possibilities. For example, the sigmoid function f(x) = 1/(1 + e^{-x}) is
a smooth, continuous "s-shaped" function with a maximum value of 1 and a minimum
value of 0. We can write the equation for a_t as:
a_t = f(w_0 + w_1 x_{1t} + w_2 x_{2t} + w_3 x_{3t} + ... + w_k x_{kt})
Where it is desired to leverage the investment in equities so that the maximum
allowable a_t is L > 1, the equation can be written as:
a_t = L f(w_0 + w_1 x_{1t} + w_2 x_{2t} + w_3 x_{3t} + ... + w_k x_{kt})
In the empirical work which follows, L is set equal to 1. We can modify this last
equation to allow for partial adjustment of the allocation a_t from period to period as
follows:
a_t = L f(w_0 + w_1 x_{1t} + w_2 x_{2t} + w_3 x_{3t} + ... + w_k x_{kt} + w_{k+1} a_{t-1})
The computation of the w_i which generate a maximum final portfolio value is a
nonlinear problem which can be solved by numerical methods.
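To make the computation concrete, the following Python sketch illustrates one way the maximization might be carried out numerically. The synthetic data, the simplified transaction-cost treatment, and the use of SciPy's derivative-free Nelder-Mead routine are assumptions made for illustration; they are not the data or routine used in the study.

    # Sketch: numerical search for weights w that maximize final portfolio value p_n.
    # Synthetic inputs stand in for the real information set; costs are simplified.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    n, k = 120, 3                     # months, number of information series
    x = rng.normal(size=(n, k))       # placeholder information set x_1t ... x_kt
    rs = rng.normal(0.008, 0.04, n)   # stock-index returns (synthetic)
    rf = np.full(n, 0.004)            # riskless (T-bill) returns
    L = 1.0                           # maximum allowable equity weight
    cost = 0.0005                     # proportional cost of reallocation

    def final_value(w):
        """Portfolio recursion p_t = p_{t-1}[a_t(1+r_st) + (1-a_t)(1+r_t)], less costs."""
        a_prev, p = 0.0, 1.0
        for t in range(n):
            z = w[0] + x[t] @ w[1:k + 1] + w[k + 1] * a_prev   # linear index with partial adjustment
            a = L / (1.0 + np.exp(-z))                         # sigmoid keeps 0 <= a_t <= L
            p *= a * (1 + rs[t]) + (1 - a) * (1 + rf[t])
            p -= cost * abs(a - a_prev) * p                    # cost of shifting the allocation
            a_prev = a
        return p

    # Maximize p_n by minimizing its negative; Nelder-Mead avoids derivatives.
    res = minimize(lambda w: -final_value(w), x0=np.zeros(k + 2), method="Nelder-Mead")
    print("final portfolio value:", final_value(res.x))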
In traditional economic modeling techniques, the x’s above would consist of a
limited set of series. For example, the set of x’s could be a distributed lag of a
single series. In the expert-system approach proposed here, the goal is to employ
large sets of data. To understand why, consider the situation faced by a portfolio
manager. He uses an information set which is much smaller than that available to,
and used by, the set of all market participants. The portfolio manager may be able
to outperform the market if he can interpret the information available to him better
than other market participants can interpret their information sets, which are each
also relatively small in size compared to the information set of the entire market. In
an expert system, large sets of data can be used in the asset-allocation rule, a_t = L
f(w_0 + w_1 x_{1t} + w_2 x_{2t} + w_3 x_{3t} + ... + w_k x_{kt} + w_{k+1} a_{t-1}), giving the system a potential advantage
over the individual portfolio manager with his small information set.
Two things, however, work to inhibit the use of large sets of data in an expert
system such as this. First, as the number of time series increases relative to the
number of observations within each series, the efficiency of computing the
asset-allocation equation breaks down. Second, the various series within the set
of data may be so highly correlated between pairs of series and among groups of
series that estimation of the weightings becomes intractable. This is the problem
of multicollinearity.
These obstacles can be overcome with PCA. Using this technique, the information
in a large set of data is extracted in a relatively few component series which
capture most of the information in the set. The principal components of a set of
data series are extracted sequentially. Each principal component of a set of data is
a weighted sum of the series in that set so that the i-th principal component is:
z_{it} = a_{i1} y_{1t} + a_{i2} y_{2t} + a_{i3} y_{3t} + ... + a_{im} y_{mt}
where y_{1t}, y_{2t}, y_{3t}, ..., y_{mt} are the m series in the large set of data.
The first principal component z_{1t} is the weighted sum of the underlying series in
the set of data with the largest variance. The second principal component z_{2t} is the
weighted sum with the second largest variance which is independent of, i.e.
uncorrelated with, the first principal component z_{1t}. The third principal component
is the weighted sum which has the third largest variance and is independent of the
first two principal components, and so on. The computation of the principal
components of a set of data is a space rotation or eigenvector/eigenvalue problem.
There are as many principal components as there are time series in the set of data,
and the set of all principal components contains exactly the same information
content as that set. That is, where a variable is fitted to all the series in the set of
data and also to the set of all principal components, the set of estimated values of
the variable is the same in both fitted relationships.
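This equivalence can be checked numerically. The short Python sketch below, using synthetic data, fits a variable by least squares to the original series and to the full set of principal components and confirms that the two sets of fitted values coincide.

    # Sketch: the full set of principal components spans the same space as the data,
    # so least-squares fitted values are identical in the two regressions.
    import numpy as np

    rng = np.random.default_rng(1)
    Y = rng.normal(size=(200, 6))        # the m underlying series (synthetic)
    target = rng.normal(size=200)        # variable to be fitted
    Yc = Y - Y.mean(axis=0)              # center the series before extracting components

    # Principal components: eigenvectors of the covariance matrix supply the weights a_ij.
    eigval, eigvec = np.linalg.eigh(np.cov(Yc, rowvar=False))
    order = np.argsort(eigval)[::-1]     # order components by variance, largest first
    Z = Yc @ eigvec[:, order]            # all m principal-component series z_it

    def fitted(X):
        return X @ np.linalg.lstsq(X, target, rcond=None)[0]

    print(np.allclose(fitted(Yc), fitted(Z)))   # True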
The concept underlying the use of PCA is that the first few principal components
contain most of the information content of the entire set of data. Then, the
remaining principal components can be omitted from the analysis.
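As a small illustration of this idea, the Python sketch below, again with synthetic data, retains just enough leading components to cover a chosen share of the total variation; the 60 percent threshold is arbitrary and is used here only because it matches the order of magnitude reported later in the paper.

    # Sketch: keep only the leading principal components that account for a target
    # share of total variation. Synthetic, correlated series stand in for real data.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(2)
    data = rng.normal(size=(260, 40)) @ rng.normal(size=(40, 40))   # 40 correlated series

    pca = PCA().fit(data)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    n_keep = int(np.searchsorted(cumulative, 0.60)) + 1   # smallest set covering 60 percent
    components = pca.transform(data)[:, :n_keep]          # the retained component series
    print(n_keep, round(cumulative[n_keep - 1], 3))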
WITHIN-SAMPLE RESULTS FROM THE EXPERT SYSTEM
Working with monthly data where the S&P 500 index and 90-day Treasury bill rate
are monthly averages of daily data, research with the expert system described
above produces the following within-sample results.
Where the x's used in the asset-allocation equation are not leading indicators of
the market, the allocation equation tends to keep the weighting of the stock index
over time at L, the maximum allowable level. This is intuitive because, lacking
relevant information about the future course of the stock market, the best course of
action is to remain fully invested in the asset with higher long-term returns.
The situation changes where a "true" leading indicator of the market is used. The
maximization problem was computed where the x's consist of a distributed lag of
the level of stock prices, measured by the S&P 500 index, extending two months into
the future. In other words, the information set was guaranteed to provide superior
returns provided it was interpreted correctly. Indeed, the maximization routine
provided mean returns over the sample period which were roughly twice that of the
stock market index itself and standard deviations of returns which were roughly
half that of the index. This was made possible because the maximization routine
tended to get the investor out of the market in periods before sharp downturns,
increasing the mean return and reducing the volatility of return. In this setup, the
allocation weightings varied considerably from period to period.
Thus, the system could be said in a sense to recognize the limits of the information
set inputted into it, reducing the variation in equity allocation where the information
set contains little useful information for asset timing and increasing the variation
where the information set is rich in useful information.
The maximization problem was also computed where the x’s consisted of
12-month distributed lags of various individual economic time series. Here, the
goal was to select individual time series that are generally accepted as leading
indicators of the economy or as important indicators of overall economic activity.
In every case, those time series that generated higher mean return also generated
portfolios with lower standard deviations of return. Forty such series were
selected. The table below shows the series, the mean returns of the portfolios
generated by each of these series relative to the mean return of the S&P 500
index, and the standard deviation of portfolio return relative to the standard
deviation of return of the S&P 500 index. The table, as well as all those that follow, is
based on returns defined as monthly changes in logarithms. The sample period for
the results in the table is March 1980 to September 2001. This choice of sample
period was dictated by such things as data availability of the various series, the
lengths of distributed lags used in the analysis, and the consequences of
differencing of data. Each of the following series was employed as first differences
in the maximization problem. Series which cannot assume nonpositive values
were employed as first differences of logarithms. Indexes of the Philadelphia Fed
business survey, which range between -100 and 100, were incremented by 100
and then employed as differences of logarithms.
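The Python sketch below illustrates these transformations; the function and the sample values are purely illustrative and do not reproduce the actual data set.

    # Sketch of the stationarity transformations described above.
    import numpy as np
    import pandas as pd

    def transform(series: pd.Series, kind: str) -> pd.Series:
        if kind == "diff":        # series that can take nonpositive values
            return series.diff()
        if kind == "log_diff":    # strictly positive series
            return np.log(series).diff()
        if kind == "survey":      # Philadelphia Fed indexes ranging between -100 and 100
            return np.log(series + 100).diff()
        raise ValueError(kind)

    # Example with made-up survey readings:
    philly = pd.Series([12.0, -5.0, 20.0, 8.0])
    print(transform(philly, "survey"))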
Series                                              Return*    Volatility**
Nonborrowed reserves                                1.462      0.767
Excess reserves                                     1.297      0.882
Free reserves                                       1.540      0.719
Federal funds rate                                  1.318      0.896
Domestic unit auto sales                            1.252      0.900
Domestic unit light truck sales                     1.253      0.901
Domestic unit heavy truck sales                     1.341      0.902
U.S. Refiners composite cost of crude oil           1.385      0.845
Initial jobless claims***                           1.328      0.857
New orders for consumer goods***                    1.000      1.000
New orders for capital goods***                     1.000      1.000
Real money supply***                                1.000      1.000
Michigan consumer survey--expectations              1.443      0.753
Conference Board survey--current                    1.340      0.878
Conference Board survey--expectations               1.545      0.711
NAPM survey--overall index                          1.283      0.909
NAPM survey--new orders                             1.378      0.881
NAPM survey--production                             1.450      0.767
Philadelphia Fed survey--general activity index     1.568      0.706
Philadelphia Fed survey--new orders                 1.291      0.953
Philadelphia Fed survey--deliveries                 1.405      0.883
Philadelphia Fed survey--employment                 1.486      0.802
Real construction spending--residential             1.000      1.000
Real construction spending--nonresidential          1.000      1.000
Real construction spending--public                  1.000      1.000
Existing home sales                                 1.530      0.798
New home sales                                      1.512      0.710
Building permits                                    1.456      0.824
Housing starts                                      1.359      0.872
Houses not started                                  1.493      0.815
Industrial production--autos                        1.452      0.784
Industrial production--trucks                       1.355      0.904
New orders for durable goods                        1.494      0.869
New orders for nondefense capital goods             1.581      0.783
Shipments of nondefense capital goods               1.000      1.000
Personal consumption in constant dollars            1.000      1.000
Retail sales                                        1.000      1.000
Retail sales--auto dealers                          1.343      0.882
Retail sales--auto parts and supplies               1.000      1.000
Retail sales--building materials                    1.000      1.000
-------------------
*Ratio of the mean portfolio return to the mean return of the S&P 500 index
**Ratio of the standard deviation of portfolio return to the standard deviation of return of the S&P 500 index
***From the leading indicators report.
Next, the maximization problem was computed where the x's consisted of
12-month distributed lags of each of the first eight principal components of the 40
series. That is, the set of x's was expanded to 12 * 8 = 96 variables. According to
the table below, these 8 principal components comprise 60.3 percent of the total
variation, i.e. the information content, of the entire set of 40 series. Each of the principal
components is, again, a weighted sum of the 40 series.
Principal component    Cumulative percent of total variation
1                      16.4
2                      25.2
3                      35.8
4                      41.8
5                      47.6
6                      52.2
7                      56.4
8                      60.3
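The construction of the distributed-lag regressor set from the component series can be sketched as follows; the component series here are synthetic, and only the 8 x 12 = 96 dimensions mirror the setup described above.

    # Sketch: expand 8 principal-component series into a 12-month distributed lag
    # (the contemporaneous value plus 11 lags), giving 8 x 12 = 96 regressors.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(3)
    pcs = pd.DataFrame(rng.normal(size=(260, 8)),
                       columns=[f"pc{i + 1}" for i in range(8)])

    X = pd.concat({f"lag{l}": pcs.shift(l) for l in range(12)}, axis=1).dropna()
    print(X.shape)   # (249, 96) after dropping the first 11 incomplete months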
Here, the analytical benefits of PCA over traditional methods are dramatic. Running
the maximization routine, the 8 principal components produce a portfolio with a
mean return which is 1.874 times the mean return of the S&P 500 index over the
sample period, and the standard deviation of portfolio return is 0.617 that of the
S&P 500 index.
The table below presents within-sample annual results from 1981 to 2000. In this
analysis, the differences in the means and standard deviations are presented
rather than the ratios. This is done because, in some years, the mean return of the
S&P 500 index was negative, generating negative, and thus meaningless, ratios.
According to the results in the table, the portfolios created by the maximization
routine never had negative returns in any one-year period, the portfolios
outperformed the S&P 500 index in 15 out of the 20 annual periods, and in every
year except one the portfolios had lower standard deviations than the S&P 500
index.
Period       Portfolio           S&P 500             Portfolio - S&P
             Mean (SD)           Mean (SD)           Mean (SD)
81.01-12     0.0100 (0.0121)     -0.0063 (0.0337)    0.0163 (-0.0216)
82.01-12     0.0281 (0.0334)     0.0099 (0.0518)     0.0182 (-0.0183)
83.01-12     0.0127 (0.0109)     0.0137 (0.0222)     -0.0011 (-0.0114)
84.01-12     0.0129 (0.0213)     0.0001 (0.0324)     0.0128 (-0.0112)
85.01-12     0.0214 (0.0208)     0.0193 (0.0279)     0.0021 (-0.0070)
86.01-12     0.0151 (0.0198)     0.0152 (0.0262)     -0.0001 (-0.0064)
87.01-12     0.0148 (0.0300)     -0.0026 (0.0682)    0.0174 (-0.0383)
88.01-12     0.0163 (0.0212)     0.0115 (0.0275)     0.0048 (-0.0063)
89.01-12     0.0218 (0.0144)     0.0193 (0.0199)     0.0025 (-0.0056)
90.01-12     0.0144 (0.0144)     -0.0049 (0.0386)    0.0193 (-0.0242)
91.01-12     0.0151 (0.0307)     0.0139 (0.0317)     0.0012 (-0.0010)
92.01-12     0.0092 (0.0237)     0.0095 (0.0242)     -0.0003 (-0.0005)
93.01-12     0.0040 (0.0045)     0.0056 (0.0097)     -0.0016 (-0.0051)
94.01-12     0.0035 (0.0099)     -0.0019 (0.0167)    0.0055 (-0.0067)
95.01-12     0.0250 (0.0104)     0.0250 (0.0104)     -0.0000 (0.0000)
96.01-12     0.0165 (0.0202)     0.0158 (0.0254)     0.0006 (-0.0051)
97.01-12     0.0247 (0.0275)     0.0215 (0.0338)     0.0032 (-0.0063)
98.01-12     0.0281 (0.0322)     0.0177 (0.0483)     0.0104 (-0.0161)
99.01-12     0.0169 (0.0212)     0.0152 (0.0316)     0.0017 (-0.0103)
00.01-12     0.0025 (0.0176)     -0.0059 (0.0271)    0.0084 (-0.0095)
The table below presents the results for five-year periods. In every five-year
period, the portfolios outperformed the S&P 500 index and did so with lower
volatility of return.
Period         Portfolio           S&P 500             Portfolio - S&P
               Mean (SD)           Mean (SD)           Mean (SD)
81.01-85.12    0.0170 (0.0217)     0.0073 (0.0351)     0.0097 (-0.0134)
86.01-90.12    0.0165 (0.0202)     0.0077 (0.0398)     0.0088 (-0.0196)
91.01-95.12    0.0114 (0.0197)     0.0104 (0.0216)     0.0010 (-0.0019)
96.01-00.12    0.0177 (0.0252)     0.0129 (0.0344)     0.0049 (-0.0093)
The maximization routine was computed under the assumption that the
contemporaneous principal components are each available at the time the
portfolio weighting is calculated. This means that the underlying series on which
the principal components are computed are reported in the month in which they
are used in the maximization routine. However, economic time series are only
available with a reporting lag. So, the maximization routine was also run deleting
the contemporaneous principal components such that the set of x's was reduced
from 8x12 = 96 to 8x11 = 88. Then, the set of x's was reduced again, eliminating
both the contemporaneous principal components and those principal components
lagged one period such that the set contained 8x10 = 80 terms. Repeating this,
the results are as follows:
Lag    Number of x's    Return*    Volatility**
0      96               1.874      0.617
1      88               1.769      0.654
2      80               1.801      0.641
3      72               1.739      0.662
-------------------
*Ratio of the mean portfolio return to the mean return of the S&P 500 index
**Ratio of the standard deviation of portfolio return to the standard deviation of return of the S&P 500 index
The results in the table above indicate that the benefits of PCA do not fall away
rapidly as the timeliness of information is progressively degraded.
A "TRADING TEST" OF THE SYSTEM
The results reported above are for within-sample tests of the expert system. In
contrast, a "trading test" of the system is a special kind of out-of-sample test
designed to replicate the performance of the system in actual practice. In this test,
the S&P 500 index and 90-day Treasury bill rate are end-of-month data rather than
monthly averages of daily data.
A trading test is performed as follows. Each month at the end of the month, the
portfolio is reallocated between stocks in the form of the stock index and a riskless
asset in the form of 90-day Treasury bills. The portfolio is invested in those assets
at their month-end prices according to the new allocation. Then, the portfolio is
held until the end of the following month when it is reallocated again.
In setting the month-end weights of stocks and bills, it is assumed that information
regarding economic data, i.e. the principal components, is not available for that
month. That is, in setting, say, the November weighting of stocks and bills at the
end of October, it is assumed that economic data for November are not available.
In fact, given the reporting lag for economic time series, at the end of October,
many economic time series for that month would not be available. So, in the
trading test performed here, the November weighting of stocks is set according to
economic data ending in September. That is, the November weighting is set
according to a distributed lag of the various principal components extending from
September two months earlier into the past. As a result, in the trading test
conducted here, we employ a ten-month distributed lag rather than a twelve-month
lag of the principal components.
Naturally, the principal components are recomputed for each out-of-sample
weighting to reflect the availability of data at that time. That is, to compute the
out-of-sample November weighting of stocks in the portfolio for the trading test, the
principal components are computed using data through September. The
maximization problem is computed using data for the stock index through October.
Since the principal components are lagged two periods in the computation of the
stock weighting, principal components data through August are used in the
computation of the maximization problem. Principal components data through
September are used to compute the November weighting.
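The timing structure of the trading test can be sketched as follows. In this illustrative Python fragment the allocation rule is a deliberately simple placeholder, a fixed sigmoid of the first component, rather than the re-estimated weights of the maximization routine, and all series are synthetic; the point of the sketch is only the walk-forward loop with its two-month reporting lag.

    # Sketch of the walk-forward trading test: at each month-end the principal
    # components are recomputed from the data available at that time, an allocation
    # is formed, and the following month's return is realized.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(4)
    raw = rng.normal(size=(260, 40))       # underlying monthly series (synthetic)
    rs = rng.normal(0.008, 0.05, 260)      # month-end stock-index returns (synthetic)
    rf = np.full(260, 0.004)               # T-bill returns
    lag, start = 2, 200                    # reporting lag; first out-of-sample month

    realized = []
    for t in range(start, 259):
        pcs = PCA(n_components=8).fit_transform(raw[: t - lag + 1])   # data through t - 2 only
        a = 1.0 / (1.0 + np.exp(-pcs[-1, 0]))                         # placeholder allocation rule
        realized.append(a * rs[t + 1] + (1 - a) * rf[t + 1])          # hold through month t + 1

    print(np.mean(realized), np.std(realized, ddof=1))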
In the computation of the out-of-sample weightings, the beginning of the sample
periods is March 1980 in each computation, the same beginning period as in the
within-sample tests.
The trading test was conducted for the period from January 1998 to the end of the
sample in September 2001, a period of time when stock prices rose to a peak and
then fell sharply. An out-of-sample portfolio weighting of the stock index was
computed for each month in this period under the rules described above. To
compute out-of-sample weightings earlier than this would require reducing the
sample size, which would decrease the realism of the trading test.
Defining returns as changes in logarithms, the mean monthly return of the portfolio
in the trading test is 0.004414, and the mean monthly return of the stock index is
0.001559. The standard deviation of return of the portfolio is 0.037483, and the
standard deviation of return of the stock index is 0.053997. So the maximization
routine produces a mean portfolio return which is 0.002855 higher than the stock
index, or roughly 30 basis points per month, with a portfolio standard deviation of
return which is 0.016513 less than that of the stock index.
To adjust for the difference in risk of the portfolio and the stock index, the
calculation was modified to allow for investing in the stock index with leverage. In
this calculation, the computed out-of-sample portfolio stock weightings were
scaled by a number greater than one to achieve leverage. The portfolio weighting
of the Treasury bills was adjusted to allow for borrowing sufficient funds to achieve
the leverage. The results are as follows:
Leverage    Mean (SD)
1.10        0.004555 (0.041279)
1.20        0.004682 (0.045079)
1.30        0.004795 (0.048881)
1.40        0.004894 (0.052687)
1.50        0.004979 (0.056496)
According to the table, leverage between 1.40 and 1.50 produces a portfolio with
risk approximately the same as that of the S&P 500 index. At this level of leverage, the
portfolio return is 33-34 basis points per month higher than that of the S&P index.
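The mechanics of this leverage adjustment can be sketched as follows; the weights and returns below are synthetic, and borrowing is assumed to occur at the bill rate.

    # Sketch: scale the out-of-sample stock weights by a leverage factor; the bill
    # position is allowed to go negative, representing borrowed funds.
    import numpy as np

    rng = np.random.default_rng(5)
    a = rng.uniform(0.0, 1.0, 45)          # unlevered out-of-sample stock weights (synthetic)
    rs = rng.normal(0.002, 0.05, 45)       # stock-index returns over the test period (synthetic)
    rf = np.full(45, 0.004)                # T-bill rate, also the assumed borrowing rate

    for lev in (1.10, 1.20, 1.30, 1.40, 1.50):
        w = lev * a                        # levered stock weight; 1 - w can be negative
        port = w * rs + (1 - w) * rf       # negative bill weight corresponds to borrowing
        print(f"{lev:.2f}  mean {port.mean():.6f}  sd {port.std(ddof=1):.6f}")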
CONCLUDING REMARKS
In this paper, we have examined the application of a large-model approach to the
construction of an expert system for asset timing. The natural motivation for this
approach is that securities markets by their nature incorporate large information
sets. Using a maximization routine which maximizes the absolute return of a
portfolio consisting of the S&P 500 stock index and Treasury bills, we find that the
large-model approach generates within-sample excess returns of roughly 7
percentage points annually. Turning to an out-of-sample "trading test," the expert
system generates excess returns of about 4 percentage points annually.
In the within-sample and out-of-sample tests, the expert system always generates
portfolios with no more risk than the stock market and usually produces portfolios
with substantially less risk than the stock market. This takes place because the
system invests only in the stock index or Treasury bills. Since the volatility of
return of Treasury bills is less than that of the stock index, the system produces
portfolios which are “safe” in the sense that they are no more risky than the stock
market.
The expert system developed in this paper is driven by a set of 40 underlying
variables which are used to construct a set of principal components. Future
research may focus on the benefits of expanding the set of underlying variables.
The underlying variables used here consist of monthly indexes familiar to business
economists. Purely financial indicators such as credit spreads, other indicators of
risk premia, and so forth were not employed in the study even though market
participants pay great attention to such indicators. This suggests that the results
presented here could be improved to some degree by enlarging the underlying set
of time series.
The focus in this paper was on a portfolio of U.S. equities represented by the S&P 500
index. The method developed here may also be applicable to foreign equity
markets and to domestic and foreign fixed-income markets.