On Market Timing and Investment Performance. II. Statistical Procedures for Evaluating Forecasting Skills Author(s): Roy D. Henriksson and Robert C. Merton Source: The Journal of Business, Vol. 54, No. 4, (Oct., 1981), pp. 513-533 Published by: The University of Chicago Press Stable URL: http://www.jstor.org/stable/2352722 Accessed: 18/08/2008 14:57 Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use. Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at http://www.jstor.org/action/showPublisher?publisherCode=ucpress. Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is a not-for-profit organization founded in 1995 to build trusted digital archives for scholarship. We work with the scholarly community to preserve their work and the materials they rely upon, and to build a common research platform that promotes the discovery and use of these resources. For more information about JSTOR, please contact support@jstor.org. http://www.jstor.org Roy D. Henriksson and Robert C. Merton Massachusetts Institute of Technology On Market Timing and Investment Performance. IL. Statistical Procedures for Evaluating Forecasting Skills* I. Introduction In Merton(1981;hereafterreferredto as Part I), one of us developed a basic model of markettiming forecasts where the forecaster predicts when stocks will outperformbonds and when bonds will outperformstocks but does not predict the magnitudeof the superiorperformance. In that analysis, it was shown that the patternof returns from successful market timing has an isomorphiccorrespondenceto the patternof returns from following certain option investment strategieswhere the implicit prices paid for the options are less than their "fair" or marketvalues. This isomorphiccorrespondencewas used to drive an equilibrium theory of value for market-timingforecasting skills. By analyzing how investorswould use the markettimer'sforecast to modify their probability beliefs about stock returns,it was shown that the conditional probabilitiesof a correctforecast (conditionalon the returnon the market)provideboth necessary and sufficient conditions for such forecasts to have a positive value. In the analysis presented here, we use the * Earlierversions of the paper were presentedin seminars at Berkeley, Carnegie-Mellon,Universityof Chicago, Dartmouth,Harvard,Universityof SouthernCalifornia,and Vanderbilt;we thank the participantsfor their comments. Aid from the National Science Foundationis gratefully acknowledged. (Journal of Business, 1981, vol. 54, no. 4) ? 1981 by The University of Chicago 0021-9398/81/5404-0005$01.50 513 The evaluationof the performanceof investment managersis a much studiedproblem in finance. Based upon the model developed in PartI of this paper, the statisticalframeworkis derivedfor both parametricand nonparametrictests of market-timingability. If the manager'sforecasts are observable,then the nonparametrictest can be used withoutfurther assumptionsabout the distributionof security returns.If the manager's forecasts are not observable,then the parametrictest can be used underthe assumption of either a capital asset pricingmodel or a multifactorreturn structure.The tests differfrom earlierwork because they permit identificationand separationof the gains of market-timingskills from the gains of micro stock-selectionskills. 514 Journal of Business basicmodelof markettimingderivedin PartI to developbothparametric and nonparametricstatisticalproceduresto test for superiorforecasting skills. The evaluationof the performanceof investmentmanagersis a topic of considerableinterestto both practitionersand academics.To the former,such evaluationsprovidea usefulaidforthe efficientallocation of investmentfunds among managers.To the latter, significantevidence of superiorforecastingskillswouldviolatethe EfficientMarkets Hypothesis.'Such violations,if found,wouldhave far-reachingimplications for the theory of finance with respect to optimal portfolio holdings of investors, the equilibriumvaluation of securities, and manydecisionsin corporatefinance.With so much at stake, it is not surprisingthatmuchhas been writtenon this subject.Indeed,a major applicationof modem capital markettheory has been to provide a structuralspecificationto measureinvestmentperformance.Within this structure,it is the practiceto partitionforecastingskills into two components(see Fama 1972):(1) "microforecasting,"whichforecasts pricemovementsof individualstocks relativeto stocks generally,and (2) "macroforecasting,"whichforecastsprice movementsof the general stock marketrelative to fixed income securities. The formeris frequentlycalled "securityanalysis" and the latter is referredto as "markettiming."Moreover,this partitioningof forecastingskillstakes on addedsignificancethroughthe work of Treynorand Black (1973), who have shown that investmentmanagerscan effectively separate actionsrelatedto securityanalysisfromthoserelatedto markettiming. Most of the recent empiricalstudies of investment performance focus on microforecastingand are based on a mean-variancecapital asset pricingmodel (CAPM)framework'where the 1-periodexcess returnon securityi can be writtenas Zi(t) - R(t) = ai + fJZM(t) - R(t)] + Ei(t), (1) where Zi(t) is the 1-periodreturnper dollar on securityi, ai is the expectedexcess returnfrommicroforecasting, f3iis the ratioof the covarianceof the returnon securityi with the marketdivided by the varianceof the returnon the market,andEi(t)has the propertythatits expectation,conditionalon knowingthe outcomefor the marketreturn thatis not 1. As a tautology,superiorforecastingskillsmustbe basedon information is obtainable,then security reflectedin securityprices.Therefore,if such information andthemarketwillnot be efficient.Fama priceswillnotreflectall availableinformation (1970)providesan excellentdiscussionof boththe EfficientMarketsTheoryandvarious attemptsto test it. 2. The capitalasset pricingmodel (CAPM)refersto the equilibriumrelationships amongsecuritypriceswhichresultwheninvestorshavehomgeneousbeliefsandchoose criterionfunction.Fortheoriginalderivations, theirportfoliosbasedon a mean-variance reviewof the see Sharpe(1964),Lintner(1965),andMossin(1966).Fora comprehensive model,see Jensen(1972a). On Market Timing and InvestmentPerformance 515 ZM(t),is equal to its unconditionalexpectation which is zero. That is, E [Ei(t) IZM(t)1 = E [Ei(t)] = 0. Using this specification, both Fama (1972) and Jensen (1972b) develop theoretical structuresfor the evaluation of micro- and macroforecastingperformanceof investment managerswhere the basis for the evaluation is a comparison of the ex post performanceof the manager'sfund with the returnson the market.In the Jensen analysis, the markettimeris assumedto forecast the actualreturnon the market portfolio,and the forecastedreturnand the actualreturnon the market are assumed to have a joint normal distribution.Jensen shows that under these assumptions, a markettimer's forecasting ability can be measuredby the correlationbetween the markettimer's forecast and the realized returnon the market.3However, Jensen also shows that the separate contributionsof micro- and macroforecastingcannot be identifiedusing the structureof (1) unless for each period, the market-timing forecast, the portfolio adjustment correspondingto that forecast, and the expected returnon the market are known. Grant(1977) explains how market-timingactions will affect the results of empiricaltests that focus only on microforecastingskills. He shows that market-timingabilitywill cause the regressionestimateof ai in (1) to be a downward-biasedmeasureof the excess returnsresulting from microforecastingability. Treynor and Mazuy (1966) add a quadraticterm to (1) to test for market-timingability. In the standardCAPM regression equation, a portfolio's return is a linear function of the return on the market portfolio. However, they argue that if the investment managercan forecast marketreturns,he will hold a greaterproportionof the market portfolio when the returnon the marketis high and a smallerproportion when the marketreturnis low. Thus, the portfolioreturnwill be a nonlinearfunction of the marketreturn. Using annual returnsfor 57 open-endmutualfunds, they findthat for only one of the funds can the hypothesis of no market-timingability be rejected with 95% confidence. Kon and Jen (1979) use the Quandt (1972) switching regression techniquein a CAPMframeworkto examinethe possibilityof changing levels of market-relatedrisk over time for mutual fund portfolios. Using a maximumlikelihoodtest, they separatetheir data sampleinto differentrisk regimesand then runthe standardregressionequationfor each such regime. They find evidence that many mutualfunds do have discrete changes in the level of market-relatedrisk they choose which is consistent with the view that managersof such funds do attemptto incorporatemarket timing into their investment strategies. 3. See Jensen(1972b),pp. 317-18. This result is also derivedin Treynorand Black (1973). Journal of Business 516 The model of market-timingforecasts presented here differs from those of these earlier studies in that we assume that our forecasters follow a more qualitative approach to market timing. Namely, we assumethat they eitherforecast thatZM(t) > R (t) or forecast thatZM(t) < R (t). The forecastersin our model are less sophisticatedthan those hypothesized in, for example, the Jensen (1972b) formulationwhere they do forecast how muchbetterthe forecast superiorinvestmentwill perform. However, as is shown in Part I, when this simple forecast informationis combined with a prior distributionfor returns on the market, a posteriordistributionis derived which would permitprobability statements about the magnitudesof the superior investment's performance. A brief formaldescriptionof our forecast model is as follows: Let y(t) be the market timer's forecast variable where y(t) = 1 if the forecast, made at time t - 1, for time period t is that ZM(t) > R (t) and y(t) = 0 if the forecast is that ZM(t) - R (t). We definethe probabilities for y(t) conditionalupon the realized returnon the marketby prob [y(t) = 0 ZM(t) R (t)] 1 - p,(t) = prob [y(t) = 1 IZM(t) 6 R(t)] pl(t) (2a) and p2(t) prob [y(t) = 1 ZA(t) > R(t)] 1 - p2(t) = prob [y(t) = 0 Zm(t) > (2b) R(t)]. Therefore,p,(t) is the conditionalprobabilityof a correct forecast given that ZM(t) - R(t), and p2(t) is the conditionalprobabilityof a correct forecast given that ZM(t) > R (t). It is assumed thatp ,(t) and p2(t) do not dependupon the magnitudeof IZM(t) - R (t) I. Hence, the conditionalprobabilityof a correct forecast depends only on whether or not ZM(t) > R (t). Under this assumption, it was shown in Part I that the sum of the conditionalprobabilitiesof a correct forecast, p ,(t) + p 2(t), is a sufficient statistic for the evaluationof forecastingability. Unlike the earlier studies of markettiming, this formulationof the problempermitsus to study markettimingwithout assuminga CAPM framework. Indeed, provided that the market timer's forecasts are observable, we derive in Section II of this paper a nonparametrictest of forecasting ability which does not require any assumptions about either the distributionof returns on the market or the way in which individualsecurityprices are formed.Althoughthe substantivecontext of the test presentedthere is markettiming,the same test couldbe used to evaluate forecastingability between any two securities. If the markettimer's forecasts are not directly observable, then to test markettimingrequiresfurtherassumptionsabout the structureof equilibriumsecurityprices. In Section III, we derive such a test using the assumption that the CAPM holds. However, in contrast to the On Market Timing and InvestmentPerformance 517 Jensen formulation, our parametrictest permits us to identify the separate contributionsfrom micro- and macroforecastingto a portfolio's return using as our only data set the realized excess returns on the portfolio and on the market. Althoughthe test specificationin Section III assumes a CAPMframework,it can easily be adoptedto a multifactorpricing model as described in Merton (1973) and Ross (1976). II. A NonparametricTest of MarketTiming In PartI, it was shown that a necessary and sufficientcondition for a forecaster's predictions to have no value is that p1(t) + p2(t) = 1. Under this condition, an investor would not modify his priorestimate of the distributionof returns on the market portfolio as a result of receiving the predictionand therefore would pay nothingfor the prediction. It follows that a necessary condition for market-timingforecasts to have a positive value is thatp 1(t) + p 2(t) 7 1. As shownin Part I, a sufficientconditionfor a positive value is thatp 1(t) + p 2(t) > 1. For example, a perfect forecaster who is always correct will have 1(t) = 1 andp2(t) = 1; therefore,pl(t) + P2(t) = 2 > 1. Formally,forecasts with p1(t) + p2(t) < 1 can be shown to have a negative value because such forecasts are systematically incorrect. However, such forecasts are perverse in the sense that the contrary forecasts with p 1(t) 1 - p1(t) andp'(t) 1 - p2(t) would satisfyp (t) + p'(t) > 1 and thereforehave positive value. For example, a markettimer who is always wrong will have p 1(t) + p 2(t) = 0. However, such forecasts have all the informational content of a forecasterwho is always rightbecause by following a strategyof always doing the opposite of the forecasts that are always wrong, one will always be right. Thus, one can reasonablyarguethat forecasts with p1(t) + p2(t) < 1 have positive value as well, provided one is aware that the forecasts are perverse. Therefore, a test of a forecaster's market-timingability is to determine whetheror notp l(t) + p2(t) = 1. Of course, ifp l(t) andp2(t) were known, then such a test is trivial. However, p1(t), p2(t), or their sum, are rarely, if ever, observable. Generally,it will be necessary to estimatep 1(t) + p2(t), and then use these estimates to determinewhether one can reject the naturalnull hypothesisof no forecastingskills. That is, Ho: p1(t) + p2(t) = 1 where the conditionalprobabilitiesof a correct forecast are not known. Essentially, this is a test of independence between the markettimer's forecast and whether or not the returnon the marketportfoliois greaterthan the returnfrom riskless securities. The nonparametrictest constructed around this null hypothesis takes advantageof the fact that conditionalprobabilitiesof a correct forecast are sufficientstatistics to measureforecastingability and yet they do not depend on the distributionof returnson the marketor on any particularmodel for security price valuation. The essence of the Journal of Business 518 test is to determine the probabilitythat a given outcome from our sample came from a populationthat satisfies the null hypothesis. To determinethis probability,we proceed as follows. First, we definethe followingvariables:N1 numberof observationswhereZM S R; N2 number of observations where ZM > R; N N1 + N2 = total number of observations;n1 numberof successful predictions,givenZm - R; n2 numberof unsuccessful predictions,given Zm > R; n n1 + n2 numberof times forecast that ZM - R. By definition,E(n1/N,) = Pi and E(n2/N2) =1 - P2 where E is the expected value operator. From the null hypothesis, we have that E(n1/N1) = pi = Ho = E(n2/N2), - P2 (3) and from (3), it follows that E [(n1 + n2)/(N1 + N2)1 = E (n/N) P p. (4) Ho Both n1/N1 and n2/N2 have the same expected value under our null hypothesis, namely,p, and both are drawnfrom independentsubsamples. Hence, only one or the other need be estimated. Both n1 andn2 are sums of independentlyand identicallydistributed randomvariableswith binomialdistributions.Therefore,the probability that ni - x from a subsampleof Ni drawingscan be written as P(ni= x |Nip) ( Ni)PX(I _ P)Nx; 1,2. i= (5) Giventhe nullhypothesis,we can use Bayes's Theoremto determine the probabilitythat n1 = x given N1, N2, and n, that is, P(n1 = x IN1,N2,n). Denote the event that our markettimerforecasts m times that Zm s R (i.e., n = m) as A and the event that, of the times he forecaststhatZM- R, he is correctx times and incorrectm - x times (i.e., n1 = x and n2 =m - x) as B. Then P(n 1 = x IN1,N2,m) =P(B |A), and by Bayes's Theorem, we have that P(B+ A) P (A) P(B IA) (N1)( - P(B) P (A) N2 j)px(1 ()m(l tN,8 _ N2 ~j)mA?x {N _ )Njl-xpl-x(j - p)N-rn - p)N2-tn+x 519 On MarketTiming and InvestmentPerformance Hence, under the null hypothesis, the probabilitydistributionfor n 1-the numberof correct forecasts, given thatZM- R-has the form of a hypergeometricdistributionand is independentof bothp, andp 2. Thereforeto test the null hypothesis, it is unnecessary to estimate either of the conditionalprobabilities.So, providedthat the forecasts are known, all the variablesnecessary for the test are directlyobservable. GivenN1, N2, andn, the distributionof n1underthe null hypothesis is determinedby (6) where the feasible range for n, is given by: min(N,,n) =En. n, _ max(O,n - N2) n, (7) Equations (6) and (7) can be used in a straightforwardfashion to establish confidence intervals for testing the null hypothesis of no forecasting ability. For a standard two-tail test with a probability confidencelevel of c, one would reject the null hypothesis if n,1 , (c) or if n1, x(c), wherei and x are defined to be the solutions to the equations4 L- (NX )(N -)x )/(N ) = (1 - c)12 (8a) and 1~f(n- c)12. (8b) However, we would arguethat a one-tail test (or at least one which weights the right-handtail much more heavily than the left) is more appropriatein this case. If forecastersare rational,then it will never be true thatp l(t) + p 2(t) < 1, and a very smalln1 would simplybe the "luck of the draw" no matterhow unlikely. It seems most unlikelyto us that a "real world" forecaster who had the talents to generate significantforecastinginformationwould not have the talent to recognize that his forecastswere systematicallyperverse, while at the same time, we as outside observers of those forecasts can clearly see the errors of his ways. For such a one-tail test with a probabilityconfidence level of c, one would reject the null hypothesis if n1 ? x*(c) where x*(c) is defined as the solution to x*(IN1) (N2x) )(AT) - (N) = 1 - c. (9) By inspection of (8a) and (9), x*(c) < xY(c),and therefore, given an observationin the righttail, a one-tailtest is, of-course, more likely to distributionis discrete,the strictequalitiesof eqq. (8a) 4. Becausethe hypergeometric and(8b)will not, in general,be attainable.Therefore,in (8a),Y shouldbe interpretedas thelowestvalueof x for whichthe summationdoes not exceed (1 - c)12.In (8b),x should be interpretedas the highestvalue of x for whichthe summationdoes not exceed (1 c)12. Journal of Business 520 rejectthe null hypothesisthan a two-tailone for any fixed confidence level c. However, this fact in no way impliesa greaterlikelihoodof rejectingthe null hypothesiswhen it is true by using a one-tailtest. Computationof the confidenceintervalsfor either the two-tail or one-tailtest using(8) or (9) is straightforward when the samplesize is small. However, for large samples, the factorialor gammafunction computationscan become quite cumbersome.Fortunately,for those large samples where such computationsbecome a problem, the hypergeometricdistributioncan be accuratelyapproximatedby the normaldistributionsThe parametersused for this normalapproximation are the mean and variancefor the hypergeometricdistribution given in (6), which can be writtenas E(n)- nN N (1Oa) and r2(n1) = [n1N1(N - N1)(N - n)]I[N2(N - 1)]. (lOb) Tables1, 2, and 3 give valuesof n1 for a one-tailtest that rejectthe null hypothesisat the 99%confidencelevel for differentvalues of N1, N2, andn. As wouldbe expected,the requiredestimatedvalueof p l(t) + p2(t)decreasesas the size of the total sampleincreases.Tables 1-3 also demonstratethat the normal distributioncan be an excellent approximationfor determining the confidence intervals for the hypergeometric distribution,even for observationsamplesas smallas 50.6 By focusingon the conditionalfrequenciesof correctforecasts,the test proceduredescribedin (6)-(10) takes into accountthe possibility that the markettimer may not have the same skill in forecastingup marketsas down markets.That is, pQ(t) need not be equal to p2(t). However,if one knowsthatthe forecasterwhose predictionsarebeing tested has equalabilitywith respectto bothtypes of markets,then the conditionalprobabilitiesof a correctforecast,p(t) andp2(t),are equal to each other, andtherefore,each is equalto the unconditionalprobability of a correctforecast,p (t). That is, p1(t) = p2(t) = p (t). In that case, one need only measurethe unconditionalfrequencyof a correct forecastto test for market-timing abilitywhere the null hypothesisof 5. Thelarge-sample cases wheredirectcomputation of the confidenceintervalsusing (8)or (9)aremostcumbersomearewhenN1 N2orn N12.Inthesecases, thenormal approximation will be quite good for even moderatelylarge samples.See Lehmann (1975,theorem19)for a generalproof.Thenormalapproximation willnotbe a goodone even for quite large samplesin those cases where there are substantialdifferences betweenN1 andN2 or betweenn andN/2. However,it is preciselyin these lattercases wheredirectcomputation using(8)or (9)is notcumbersome evenforverylargesamples. 6. As discussedin 5, this excellentapproximation shouldonlybe expectedto obtain whenN1 N2 andn N/2. 521 On Market Timing and InvestmentPerformance 0 0 zau2neeee + -- "0 - > .4) N ~~~~~cN C) + L4- ---__ o o -)OQOa _ __-_ 0 0 m ttr a~~~~~~~~~c om 0= z - "0 y *:=t FNN -e1-e _--__ __ 0 o~ C)tN ~~u<<<V N__,_ m Journal of Business 522 + - *X it o + -) S 0 E 0 -U',, Xm >E0~ Z~ mbt o"0m ~~~ t On Market Timing and InvestmentPerformance 523 0 "0n t cow~w?>otlo ) , - Q , i ~ :HtQ__ CH ) o ?- o r o 00ro000C ___ _ Journal of Business 524 no forecasting skills is that p (t) = .5. The distributionof outcomes drawnfrom a populationthat satisfies this null hypothesis is the binomial distributionwhich can be written as P(k IN,p) = (NP k ( = p)N(k (N (.5)N, k wherek is the numberof correctpredictionsandN is the total number of observations. One can use (11) in an analogous fashion to (6) to construct either one-tail or two-tail confidence intervals for rejecting the null hypothesis. Whilethe simplicityof this test may be attractive, the readershouldbe warnedthat a test which uses (11) insteadof (6) is only appropriateif there is strongreason to believe thatp 1(t) = p2(t). As is discussed at lengthin PartI, the unconditionalprobabilityof a correct forecast cannot, in general, be used as a measure of markettimingability.Specifically,it is shown thatan unconditionalprobability of a correct forecast greater than one-half, p(t) > .5, is neither a necessary nor a sufficient condition for a forecaster's market-timing abilityto have positive value. To see why it is not sufficient,consider the case of a forecaster who always predicts that the return on the marketwill exceed the returnon riskless securities. Such completely predictable forecasts, like a stopped clock, clearly have no value. However, if the historical frequency with which the returns on the marketexceeded the returns on riskless securities were significantly greaterthan one-half, then this forecaster's unconditionalprobability of a correct forecast would exceed one-half, and the null hypothesis would be rejected. Indeed, if a two-tail test were used, then all that would be requiredto reject the null hypothesis is that the historical frequencyof up marketsversus down marketsbe significantlydifferent than one-half.However, if this "stopped clock" forecasterwere evaluated by the test proceduredescribedin (6)-(10), then, independentof the relativefrequenciesof up and down markets,the null hypothesisof no forecastingability would not be rejected because for any sampleof observations,p1(t) = 0 andp2(t) = 1, and hence, p1(t) + p2(t) = 1. Therefore, by using the unconditionalprobabilityprocedure in (11), one is actually testing the joint null hypothesis of no market-timing ability andp 1(t) = p2(t). In summary,we have deriveda nonparametricprocedurefor testing market-timingability which takes into account the possibility that forecastingskills are differentfor up marketsthan for down markets. Because the critical statistic for the test is p1(t) + p2(t), it is not essential that the individual conditional probabilities be stationary throughtime. Rather,the critical stationaritypropertyfor the validity of equation(6) is that their sum,p 1(t) + p2(t), be stationary,which is, On MarketTiming and InvestmentPerformance 525 of course, true under the null hypothesis of no market-forecasting ability. Moreover, it is straightforwardto show that this same procedurecan be used to test forecasterswho have moreconfidencein their predictionsduring some periods than they do in other periods. One such application can be found in Lessard, Henriksson, and Majd (1981),where the predictionsof some foreign-exchangeforecastersare tested using our procedure.However, it is essential to our test procedure that the forecasts of market-timingbe observable.We will therefore turn to the development of a procedure to test market-timing ability when such forecasts cannot be observed. III. Parametric Tests of Market Timing To use the nonparametricproceduresto test investmentperformance, the predictionsof the forecaster must be observable. However, it is frequentlythe case when measuringmanagedportfolio performance that the examineronly has access to the time series of realizedreturns on the portfolioand does not have the investmentmanager'smarkettimingforecaststhemselves. Whileundercertainconditionsit is possible to infer from the portfolioreturnseries alone what the manager's forecasts were, such inferences will, in general, provide noisy estimates of the forecasts. These estimates will be especially noisy if the manager'sportfoliopositions are influencedby his microforecastsfor individual securities. In this section, we derive procedures which permitthe testing of timingabilityusing returndata alone. Of course, there is a "cost" of not having the time series of forecasts, which is that these test proceduresrequirethe assumptionof a specific generatingprocess for returnson securities. Thus, these proceduresare parametrictests of thejoint hypothesisof no market-timingabilityand the assumed process for the returnson securities. As noted earlier,most of the recent empiricalstudies of investment performanceassume a patternof equilibriumsecurityreturnswhich is consistent with the SecurityMarketLine of the CAPMin additionto some assumptions about the market-timingbehavior. The standardregression-equationspecificationfor portfolio returns used in these studies can be written as Zp(t) R(t) - = a+ ?X (t) + E(t), (12) whereZp(t) is the realizedreturnon the portfolio,x(t) ZM(t) - R (t) is the realizedexcess returnon the market,and E(t) is a residualrandom term which is assumed to satisfy the conditions E[E(t)] = 0 E[E(t) x(t)] = 0 E[E(t) jE(t - i)] = (13) 0, i = 1,2,3. Journal of Business 526 Providedthat the investmentmanagerdoes not attempt(or, at least, is unsuccessful at) forecasting market returns, the standard leastsquares estimation of (12) can be used to test for microforecasting skills. However, Jensen (1972b)shows that it is impossibleto use this structuralspecificationto separatethe incrementalperformancedue to stock selection from the increment due to market timing when the return data alone are used. The tests derived here do permit such a separation. As in the earlier studies, we also assume that securities are priced accordingto the CAPM, althoughthe tests can easily be adaptedto accommodatea multifactormodelprovidedthat the factors are known. We furtherassume that as a function of his forecast, discretely different systematicrisk levels for the portfolioare chosen by the forecaster. For example, in the case we analyze in detail here, it is assumed that there are two target risk levels which depend on whether or not the return on the market portfolio is forecast to exceed the return on riskless securities.That is, the investmentmanageris assumedto have one targetbeta when he predictsZM(t)> R (t) and anothertargetbeta when he predicts that ZM(t)- R(t). In Section IV, we indicate how the test procedurescan be adaptedto the moregeneralcase of multiple target risk levels. Let vj, denote the targetbeta chosen for the portfolioby the manager when his forecast is thatZM(t) - R (t) and let j2 denote the targetbeta chosen when his forecastis thatZM(t)>R(t). If/3(t) denotes the beta of the portfolio at time t, then p3(t)= vjl for a down-market forecast and ,8(t) = -r,2for an up-marketforecast. If the forecaster is rational,then > vl. Of course, if/3(t) were observableat each point in time, then, as discussed in Fama (1972),the market-timingforecast is observable, and one could simply apply the nonparametrictests of the previous section. However, if beta is not observable, then /(t) is a random variable.Underthe assumptionthat beta is not observable,let b denote the unconditional(on the forecast) expected value of /3(t). Then 7)2 b = q[p 11 + (1 - P 1)q2] + (1 - q)[P27)2 + (1 - P2)>1], (14) where q is equal to the unconditional(on the forecast) probabilitythat ZM(t)- R (t). In Part I, the distributionfromwhich q is computed was called the prior distribution.If we define the randomvariable0(t) as equal to [/3(t) - b], then 0(t) is the unanticipatedcomponentof beta, and its distribution,conditionalon the realized excess returnon the market,x(t), can be written as: conditionalon x(t) - 0: = 8 where =1 = (7)i = (q2 - 2)[1 -i1)[qpI - qp1 - (1 - q)(1 - P2)] with prob = pi + (1 -q)(1 -p2)]withprob = 1 -p1 (l5a) 527 On Market Timing and InvestmentPerformance and conditionalon x(t) > 0: _= _02where 02 + (1 - q)(1 - with prob = (15b) = (62 - - (n1 - 70)L1 - qpI - (1 - q)(l - pa)] with prob = 1 -P 2 7,l)[qP 1 P2)] P2 From(15), it follows that the conditional(onx[t]) expected value of 0 can be written as E(0 Ix) (16a) 01 = (1 - q)(p1 +P2 - 1)Qq1 72), forx(t) S 0 and E(0 X) = = (16b) 02 q(p1 + P2 - 1)(72 - 1), forx(t) > 0. The per period returnon the forecaster'sportfoliocan be writtenas Zp(t) = R(t) + [b + 0(t)]x (t) + X + Ev(t), (17) where A is the expected incrementto the returnon the portfoliofrom microforecastingor securityanalysisand E.(t) is assumedto satisfy the standardCAPM conditions given in (13). Under the posited returnprocess for the portfolio given in (17), a least-squaresregression analysis can be used to identify the separate incrementsto performancefrom microforecastingand macroforecasting. The regression specificationcan be written as Zp(t) - R(t) = a + 3Jx(t) + 2y(t) + e(t) (18) where y(t) max[OR(t) - ZM(t)] = max[0,-x(t)]. The motivationbehindthe specificationgiven in (18) comes from the analysis of the value of markettimingpresentedin Section IV of PartI. There it was shown that up to an additive noise term, the returnsper dollarinvested in a portfoliousing the market-timingstrategydescribed here will be the same as those that would be generatedby pursuinga partial "protective put" option investment strategy where for each dollarinvested in this strategy, [P2-q2 + (1 - P2)7)1] dollarsare invested in the market; (p1 + P2 - 1)(n2 - -rp)put options on the market portfolioare purchasedwith an exercise price (per dollarof the market) equal to R(t); and the balance is invested in riskless securities. The value of the markettiming(per dollarof assets managed)is that the (p1 + P2 - 1)(X2 - ijl) puts are obtained, in effect, for no cost. Note that y(t) as defined in (18) is exactly the returnon one such put option. From(17), the expected returnon the portfolio,conditionalonx > 0, can be written as Journal of Business 528 X, (19a) E(Zp x>O)=R+(b+02)E(xIx>0)+ and the expected return, conditionalon x - 0, can be written as (19b) E(Zp Ix - O) = R + (b + 01)E(xIx - O) + X, where bars over randomvariables denote expected values. For the analysis of the regressioncoefficientsand errortermin (18), it will be convenientto express the relevantvariancesand covariances of the regression variables in terms of the expected values and variances of the randomvariablesxl(t) and x2(t) which are defined by x1(t) min[Ox(t)] x2(t) max[O,x(t)]. (20) If var[xi(t)] o-4, i = 1,2, then we can write the variances and covariances of the regression variables as var[y(t)] = U2 = U2 2 = r2o + o-2 - var[x(t)] cov[x (t) ,y (t)]0 cov[Zp(t),x(t)] , = (b + (21) 2x-lx-2 = X1X2-cr 1)(o2 cov[Zp(t),y]- ary= (b + 02)X1X2 (b + 02)(o2 + X2 -X - (b - X-1X2) + 6l)o-~. From (18) and (21), it follows that the large sample least-squares estimates of ,1 and /32 can be written as plim /a PYOXY 2 PX0Y = -r 2 O2... (22) = b+ 02 = P2172 + (1 -P2)nl and plim , - "-pY- =02 x rPX XY - (23) -01 = (P1 + P2 - 1)()2- 7) =E From (22), plim E[,B(t)Ix(t) > 0], and it is also equal to the in the market invested portfolioin the option strategyused in fraction Section IV of Part I to replicate the market-timingstrategy. The investment manager'smarket-timingabilityis expressed as /2' The true /2 will equal zero if either the forecasterhas no timingability-that is, ifp1(t) + p2(t) = 1-or he does not act on his forecasts-that is, if q2 = r%.With referenceto the replicatingoption investment strategy, from On MarketTiming and InvestmentPerformance 529 (23), plim82 is equalto the numberof "free" put optionson the market providedby the manager'smarket-timingskills. Indeed, as shown in PartI, the value of market-timingskills per dollarof assets managedis equal to (p1 + P2 - 1)(-q2 - Tj1)g(t)where g(t) is the market price of such a put. Thus, ,/2g(t) is an estimateof the value of the market-timing ability of the manager. Formallyinterpreted,a negativevalue for the regressionestimate/2 would imply a negative value for market timing. However, a true negative value for /32 would violate the rationality assumptions of p,(t) 1 andq2 : -r1.Hence, as was discussedfor the nonparametric +p2(t) W tests in Section II, the reader should consider the relative meritsof a one-tailversus the standardtwo-tailtest of significancewith respect to rejectingthe null hypothesis that /2 = 0. The incrementto portfolioperformancefrom microforecastingcan also be measured using regression equation (18). The large sample least-squaresestimate of a can be written as7 plim a = E(Z,) - R - plim f1x - plim A7= A. (24) Hence, from (23) and (24), regression equation (18) can be used to identify and estimate the separate contributionsof microforecasting and macroforecastingto overall portfolio performance. To completethe analysis of (18), we now investigatethe properties of the errorterm E(t)being carefulto take into accountthe differences between the actualbetas and their estimates. That is, if the forecaster strictlyfollows the posited behaviorof two discrete risk levels 7rnand 7q2,then the actual,B(t)of the fundwill never be equal to the estimated ,8 unless, of course, he is a perfect forecaster. Let us definethe followingvariables:A1 = 1 if x - 0 and the market timer's forecast is incorrect,A1 = 0 otherwise; V1 = 1 if x - 0 and the markettimer's forecast is correct, V1 = 0 otherwise; A2 = 1 if x > 0 and the markettimer's forecast is incorrect,A2 = 0 otherwise; V2 = 1 if x > 0 and the markettimer's forecast is correct, V2 = 0 otherwise. It follows immediatelythat: E(Al) =PI E(A2) = 1 E(V1)= P2 1 -Pi (25) E(V2) =P 2 From (18), each period's estimationerrorE can be written as: E= A1(-P 1)(q1 - -2)X1-V -P - 11712)X1 + A2P2(71 V2(1 - P2)(71l - 2)X2 v2)X2 + E1, (26) 7. Whentestingfor forecastingability,the relevantportfolioreturnsare thoseearned before any deductionfor managementfees. If one uses the returnsearnedafter the deductionof managementfees and if the fees chargedcan be expressedas a fixed percentage of assets, m, then plim a' = A - m. Journal of Business 530 where E, includes any error term resulting from microforecasting. Since, by definition,microforecastingis independentof x, Ep is independent of x. It follows by the Law of Large Numbers that as the numberof observations, N, gets large, V Y. 2, and Y. Y. N will all approachtheir expectation. Therefore, from (26), lim N-~ Nl - Nj - I[Pi(l P I)(l - - + [(1- P2)P72(71- + lim lEpIN lim N-oo [ NP] (1 - Pi)P1(fl O2) - - P2)-P2(0 P2)(fll XI - - 72)]X2 (27) = 0. Thus, for large samples, the coefficient from least-squaresestimation of (18), plus the realized excess returnon the market, will give us an unbiased estimate of the portfolio return.8 As discussed, the motivationbehindthe regressionspecification(18) was the analysis in Part I which showed the correspondencebetween market-timinginvestment strategies and certain option investment strategies. However, there is alternative,but equivalent, specification which some may find to be more intuitive. Namely, by a linear transformationof (18), we can write this alternativeregressionequationas Zp(t) - R(t) = a' + fI1XA(t)+ 3x2(t) + E, (28) wherex1(t)andx2(t)are as definedin (20). Becausex1(t) = 0 andx2(t) = x(t) if x(t) > 0 f32has a rather intuitive interpretationas the "upmarket"beta of the portfolio. Similarly,because x1(t) = x (t) and x2(t) = 0 if x(t) - 0, p8can be interpretedas the "down-market"beta of the portfolio. Indeed, the large sample properties of the regression estimators/ and f3"fit these intuitiveinterpretations(at least in the sense of expected values). That is, plim ,8 = E[f3(t) Ix(t) -P 171 + S 0] (1 -P1)f2 (29a) 8. Althoughunbiased,ordinaryleast squaresestimationis not efficientbecausep(t) is not stationary,and thereforethe standarddeviationof the errorterm is an increasing functionof Ix(t) l. To improvethe efficiencyof the estimates,one coulduse generalized least squaresestimationto correctfor this heteroscedasticity. 531 On Market Timing and InvestmentPerformance and plim /2 = E [,3(t) | x(t) > ] =P272 +2(1 (29b) P2)77l, - 0] = b + 1 The where E[E3(t) x(t) > 0] =b + 02Iand E[/(t)x(t) test for market-timingabilityusing this specificationwould be to show that /2 is significantlygreaterthan ,. That is, show that the expected "up-market" beta of the portfolio is greater than the expected "down-market"beta of the portfolio.The large samplepropertiesof oi' are the same as for a^in (18): namely, plim ^' IV. = A. Summary and Extensions Providedthatthe forecasteronly attemptsto predictthe sign of ZM(t)R (t) but not its magnitude,and providedthat his forecasts are observable, a procedurefor testing market timing has been derived which does not dependon any distributionalassumptionsaboutthe returnson securities. The test includes the possibility that the forecaster's confidencein his forecasts as measuredby (P 1,P2) can vary over time, and indeed, if such variations are observable, then the test can be refined to measure his forecastingability for each such variation. In the case where the forecaster's predictionsare not observable, a parametrictest procedure was derived which permits separate measurementsof the contributionsto portfolio performancefrom market timing and security analysis. As is apparentfrom the analysis of the errortermin Section III, this test will accommodatethe case where the two-targetrisk levels chosen by the managervary over time provided that these variationsare randomarounda stationarymean. The test is also applicableto the case where the forecaster selects from morethantwo discretesystematicrisktargetlevels, as long as the differentlevels are based on differinglevels of confidence in the forecasts and not differingexpectations of the level of the return. In this case, the large sampleleast-squaresestimates of)/1 and 32 representa weighted average of the differentrisk levels. The test procedurespresentedhere can be extended to evaluate the performanceof a markettimerwho segmentshis predictionof x(t) into more than two discrete regions. For example, a forecastermighthave that - 10%o- x(t) < 0, that fourpossible predictions:thatx(t) < - 10%F, 0 - x(t) < 10%, and that 10%lo x(t). We briefly illustrate how the analysis would be applied to such multipleregions for the parametric test case. As in the two-regioncase, it is assumed that the probability of a particularforecast will only depend on the region in which x(t) falls. However, there are now more than two possible forecasts. Journalof Business 532 Specifically,we assumethat there are n differentregionsthat the forecastermightpredictand definepij as the probabilitythat the forecaster'spredictionwas thatx(t) wouldbe in thejth region,given thatx(t) actuallyendedup in theith region.Theonlyconstrainton the conditionalprobabilitiesis that j=1pij = 1, i = 1,2 . . . , n. The return on theforecaster'sportfoliocanbe definedas in SectionIIIexceptthat now n b = E i=1 n > jpj, i j-1 thatx(t) willendupin regioni and where6iis definedas theprobability ,qj is the chosenlevel of systematicriskwhenthe forecastis thatx(t) will end up in regionj. to (28)in the two regioncase Theregressionequationcorresponding x + E,wherexi = x if x is in can be writtenas Z -R = a region i, and xi = 0 otherwise. A Thelargesampleleast-squaresestimatesof /pianda are:plim = j- pijqj and plim a' = X. Fromthis analysis,it follows that for regions(thatis, forlargeenoughvaluesof finelypartitioned sufficiently returns to separatetheincremental in is least possible n), it at principle, withoutany restrictionson the disfrommicro-andmacroforecasting tributionof forecasts.All that is requiredare the actualreturnsfrom the market,the portfolio,and risklesssecurities. References Fama, E. 1970.Efficientcapitalmarkets:a review of theoryandempiricalwork.Journal of Finance 25, no. 2:383-417. Fama, E. 1972. Componentsof investmentperformance.Journal of Finance 27, no. 2:551-67. Grant, D. 1977. Portfolioperformanceand the "cost" of timingdecisions. Journal of Finance 32, no. 2:837-46. Jensen, M. C. 1972a. Capitalmarkets:theory and evidence. Bell Journalof Economics and ManagementScience 3, no. 2:357-98. Jensen, M. C. 1972b. Optimalutilization of market forecasts and the evaluation of investmentperformance.In G. P. Szego andK. Shell (eds.), MathematicalMethodsof Investmentand Finance. Amsterdam:North-Holland. Kon, S. J., and Jen, F. C. 1978. The investment performanceof mutualfunds: an empiricalinvestigationof timing, selectivity, and marketefficiency.Journalof Business 52, no. 2:263-89. Lehmann,E. 1975.Nonparametrics:Statistical Methods Based on Ranks. San Francisco: Holden-Day. Lessard, D.; Henriksson,R.; and Majd, S. 1981. Risk and returnon foreign currency assets: an evaluationof forecastingability.Workingdraft. Cambridge,Mass.: Massachusetts Instituteof Technology. Lintner,J. 1965.Securityprices, risk, andmaximalgainsfromdiversification.Journalof Finance 20 (December):587-615. Merton,R. C. 1973.An intertemporalcapitalasset pricingmodel.Econometrica41, no. 5:867-87. Merton, R. C. 1981. On markettimingand investmentperformance.I. An equilibrium theory of value for marketforecasts.Journal of Business 54 (July): 363-406. On MarketTiming and InvestmentPerformance 533 in a capitalasset market.Econometrica34, no. 4:768-83. Mossin,J. 1966.Equilibrium Quandt,R. E. 1972.A new approachto estimatingswitchingregressions.Journalof the American Statistical Association 67, no. 338:306-10. Ross, S. 1976.The arbitragetheoryof capitalassetpricing.Journalof EconomicTheory 3 (December):343-62. Sharpe,W. F. 1964.Capitalassetprices:a theoryof marketequilibrium underconditions of risk. Journal of Finance 19, no. 4:425-42. Treynor,J., and Black, F. 1973. How to use securityanalysis to improveportfolio selection. Journal of Business 46, no. 1:66-86. Treynor,J., and Mazuy, F. 1966. Can mutualfunds outguessthe market?Harvard BusinessReview 44 (July-August):131-36.