Asset pricing in the frequency domain: theory and empirics

Ian Dew-Becker and Stefano Giglio

10th May 2013

Abstract

In many affine asset pricing models, the innovation to the pricing kernel is a function of innovations to current and expected future values of an economic state variable, often consumption growth, aggregate market returns, or short-term interest rates. The impulse response of the priced state variable to various shocks has a frequency (Fourier) decomposition, and we show that the price of risk for a given shock can be represented as a weighted integral over that spectral decomposition. In terms of consumption growth, Epstein–Zin preferences imply that the weight of the pricing kernel lies almost entirely at low frequencies, while internal habit-formation models imply that the weight is shifted to high frequencies. We estimate spectral weighting functions for the equity market and find that they place most of their weight at low frequencies, consistent with Epstein–Zin preferences. For Treasuries, we find that investors view increases in interest rates at low frequencies and decreases at business-cycle frequencies negatively.

Federal Reserve Bank of San Francisco and Booth School of Business, University of Chicago. The views in this paper are those of the authors and do not represent the views of the Federal Reserve System or the Board of Governors. We appreciate helpful comments and discussions from Francisco Barillas, Rhys Bidder, John Campbell, Lars Hansen, Nikola Mirkov, Marius Rodriguez, Eric Swanson, and seminar participants at the San Francisco Fed, the University of Bergen, the University of Wisconsin, Chicago Booth, UC Santa Cruz, and the Bank of Canada.

1 Introduction

This paper studies how risk prices for shocks depend on their dynamic effects on the economy.
Theoretical asset pricing models have strong implications for how short- and long-term shocks should be priced, and we empirically estimate how the power of a shock at different frequencies determines its risk price.

Affine models are a workhorse of both theoretical and empirical asset pricing. In these models, innovations to the pricing kernel are linearly related to innovations in economic state variables. This paper shows that many widely used affine frameworks can be written, estimated, and interpreted in the frequency domain. The frequency-domain decompositions give a clear and compact characterization of exactly how the dynamics of the economy affect risk prices and provide sharp tests of competing asset pricing models.

The dynamic behavior of the economy is a key input to asset pricing models. In a representative-agent model with Epstein–Zin (1989) preferences, the risk premium of an asset depends on the covariance of its return with current and expected future consumption growth. For the intertemporal CAPM (Merton, 1973; Campbell, 1993), risk premia depend on covariances with shocks to both current market returns and future expected returns. And in affine term structure models, we show that risk premia depend on covariances with innovations to current and future short-term interest rates.

In dynamic asset pricing models, then, the price of risk for a shock depends on how it affects the state of the economy in the current period and in the future. The dynamic response of the economy to a shock is represented in the time domain with an impulse response function (IRF). For example, long-run shocks to consumption growth that have large risk prices under Epstein–Zin preferences, such as those studied in Bansal and Yaron (2004), have IRFs that decay slowly.

In this paper we propose and derive a new frequency-domain representation of risk prices. First, we map the IRF of a shock into the frequency domain.
A shock that has strong long-run effects has high power at low frequencies, whereas shocks that dissipate rapidly have more power at high frequencies. We refer to the frequency-domain version of the IRF as the impulse transfer function. Our key result is that the price of risk for a shock depends on the integral of the impulse transfer function weighted by a function $Z(\omega)$ over the set of all frequencies $\omega$. The weighting function, $Z$, determines how shocks are priced depending on how they affect the economy at different frequencies. In other words, $Z(\omega)$ represents the price of exposure to shocks with frequency $\omega$. In this paper we derive $Z(\omega)$ for various theoretical models and estimate it empirically in both equity and debt markets.

The advantage of studying risk prices in the frequency domain is that $Z$ gives a compact and intuitive measure of how different shocks affect the pricing kernel. For example, under power utility, the only thing that determines the price of risk for a shock is how it affects consumption today. So $Z$ is perfectly flat across frequencies because cycles of all frequencies receive identical weight in the pricing kernel. Under Epstein–Zin preferences, long-run risks matter, and $Z$ places much more weight at low than high frequencies; in fact, the weight is focused only at the very lowest frequencies. Conversely, for an agent with internal habit formation most of the weight of $Z$ is located at high frequencies.

The spectral representation we derive is useful for two main reasons. First, it gives us new insights about the importance of the dynamics of shocks for asset prices in different models. For example, from the spectral decomposition we learn that Epstein–Zin preferences imply that the majority of the pricing weight lies extraordinarily close to frequency zero: under a standard calibration of the model, more than half of the mass of the spectral weighting function lies on cycles longer than 230 years.
With internal habit formation, by contrast, most of the weight of $Z$ is located at high frequencies. The decomposition also tells us which aspects of the consumption process one needs to focus on most when calibrating models: for example, under Epstein–Zin preferences, we find that the key statistics are the unconditional standard deviation and the long-run standard deviation of consumption growth; other aspects of consumption's dynamic behavior are unimportant.[1]

The second key reason the spectral representation is useful is that it enables more general and powerful estimation of dynamic asset pricing models. Our theoretical analysis shows that Epstein–Zin preferences put weight mainly on the lowest-frequency effects of shocks on consumption growth. But low frequencies are precisely where we have the least estimation power empirically. Our spectral decomposition makes it simple to test a more general model in which agents still care about "long-run" shocks, but where the long run is defined as, say, 10 or more years in the future, rather than 200 years. Economically, given that people do not live hundreds of years, it makes sense to focus on dynamic effects within human lifespans. Furthermore, we obtain much more precise estimates of the weighting functions because estimation power rises substantially as we move away from the lowest frequencies.

After deriving the frequency decomposition in section 2 and characterizing weighting functions theoretically for various consumption-based models in section 3, we proceed to estimate them using the cross-section of equity prices. Since we are interested in distinguishing among theoretical models that have very stark implications for the pricing of different frequencies, we parametrize $Z$ so that it captures the prices of high-frequency and low-frequency shocks separately, letting the data speak on which frequencies investors consider more important.
We do this in two ways: first, by restricting the weighting function $Z$ to be the one literally implied by the various models, and, second, by focusing on the pricing of economically interesting groups of frequencies (long-run as captured by below-business-cycle frequencies, business-cycle frequencies, and higher than business-cycle frequencies).

[1] See Dew-Becker (2013) for a more extensive analysis of the issues around calibrating the long-run standard deviation.

The estimation shows strong support for long-run risk models. However, it only does so when we define the long run based on the frequency-domain interpretation of shocks with cycles longer than the business cycle. In our sample, when we estimate the parameters of Epstein–Zin preferences structurally, none of them are significant, which would normally be interpreted as a rejection of the model: neither short- nor long-run consumption growth seems to price equities. But that rejection would be a mistake. When we allow "long-run" to simply mean anything longer than the business cycle, we find that covariance with long-run shocks is a statistically and economically significant determinant of average portfolio returns.

Section 4 extends the analysis to returns-based models, in which agents price equity portfolios based on their covariance with short- and long-run shocks to equity market returns. We find (consistent with Campbell and Vuolteenaho, 2004) that it is low-frequency shocks to equity market returns that drive the pricing kernel. In section 5 we show that the methodology easily generalizes to pricing kernels that depend on innovations of multiple variables. For example, stochastic volatility models (as in Campbell et al., 2013) can imply that agents care about long-run innovations in returns and volatility.

Next, we estimate weighting functions for affine term structure models. In this case we show that the price of risk for a shock depends on how it affects the dynamics of short-term interest rates in the future.
Standard term structure models have the problem that the fundamental shocks are only identified up to a rotation, making interpretation of estimated risk prices difficult. Our frequency-domain estimates of $Z$, on the other hand, are invariant to a rotation of the shocks. So instead of interpreting risk prices for the usual “level”, “slope”, and “curvature” factors, risk prices are interpreted in terms of how investors price shocks to interest rates at low frequencies, business-cycle frequencies, and high frequencies. We find that the low-frequency shocks to short-term interest rates have a significantly positive price of risk, consistent with the idea that investors want to hedge against persistent increases in interest rates (and, presumably, inflation). At business-cycle frequencies and higher, shocks to interest rates have a negative price of risk, as we would expect given that short-term interest rates are procyclical.

There is very little extant analysis of preference-based asset pricing in the frequency domain. Otrok, Ravikumar, and Whiteman (2002) is the most prominent example.[2] While their paper also presents a spectral decomposition, the object of the decomposition is different from ours. Instead of studying risk prices in the frequency domain, they ask how the price of a consumption claim depends on the spectral density of consumption.[3]

[2] See also Yu (2012).

Empirically, a number of papers study the relationship between asset returns and consumption growth at long horizons as methods of testing the implications of Epstein–Zin preferences.[4] These papers essentially assume that changes in expected consumption growth at any date in the future carry the same weight in the pricing kernel since they assume that the pricing kernel is driven by changes in the long-run expectation of consumption (which is what Epstein–Zin preferences imply).
Our empirical estimates allow for a much more general specification where shocks to consumption at different horizons may have different risk prices.

The paper is also, in some respects, related to Hansen, Heaton, and Li (HHL; 2008). HHL decompose risk premia on assets in terms of their cash flows at different horizons, essentially deriving term structures for various types of zero-coupon claims (e.g. consumption claims, as in Lettau and Wachter, 2007). We, on the other hand, decompose risk prices for shocks based on how they affect some economic state variable at different horizons. Whereas HHL (and, relatedly, Alvarez and Jermann, 2005, and Otrok, Ravikumar, and Whiteman, 2007) study the present and future dynamics of the pricing kernel itself, we study how the dynamics of various shocks affect the pricing kernel only in the current period. In their empirical analysis, HHL look at how the dynamics of dividends of value and growth stocks relate to the dynamics of the pricing kernel over time. We have nothing to say about the term structure of discount rates on dividends. Rather, we ask how the dynamics of various shocks to consumption growth (i.e. short- and long-run shocks) affect the pricing kernel today.

[3] Moreover, unlike this paper, they do not obtain analytic relationships between the spectrum and asset prices; their results are all generated numerically.

[4] Parker and Julliard, ; Malloy, Moskowitz, and Vissing-Jorgensen, 2009; Bansal, Dittmar, and Lundblad, 2005; Yu, 2012; Daniel and Marshall, 1997.

2 Spectral Decomposition and the Weighting Function

We derive our spectral decomposition of the pricing kernel under two main assumptions. First, the pricing kernel, $m_t$, depends on the current and future values of a state variable, $x$ (perhaps consumption growth or market returns). Second, the dynamics of the economy are described by a vector moving average process $X_t$ which includes $x_t$.

Assumption 1: Structure of the SDF. Denote the log pricing kernel (or stochastic discount factor, SDF) $m_t$.[5] We assume that $m_t$ depends on current and future values of some state variable in the economy $x_t$:

$$m_{t+1} = F(x_t) - \Delta E_{t+1} \sum_{k=0}^{\infty} z_k x_{t+1+k} \tag{1}$$

where $x_t$ is the (scalar) priced variable, $F$ is some unspecified function, $\Delta E_{t+1} \equiv E_{t+1} - E_t$ denotes the innovation in expectations, and $E_t$ is the expectation operator conditional on information available on date $t$. Note that this specification is sufficiently flexible to match standard empirical applications of power utility, habit formation, and Epstein–Zin preferences. It can also accommodate the CAPM and ICAPM. Equation (1) implies that risk prices are constant, but we relax that assumption below.

[5] We do not take a position on whether $m_t$ is the pricing kernel for all markets or whether there is some sort of market segmentation. We also do not assume at this point that there is a representative investor.

Assumption 2: Dynamics of the economy. $x_t$ is driven by an $N$-dimensional vector moving average process

$$x_t = B_1 X_t \tag{2}$$

$$X_t = \Gamma(L)\, \varepsilon_t \tag{3}$$

where $X_t$ has dimension $N \times 1$, $L$ is the lag operator, $\Gamma(L)$ is an $N \times N$ matrix lag polynomial,

$$\Gamma(L) = \sum_{k=0}^{\infty} \Gamma_k L^k \tag{4}$$

and $\varepsilon_t$ is an $N \times 1$ vector of (potentially correlated) martingale difference sequences. We refer to $\varepsilon_t$ as the fundamental shocks to the economy. Throughout the paper $B_j$ denotes a conformable (here, $1 \times N$) vector equal to 1 in element $j$ and zero elsewhere. We assume without loss of generality that $x_t$ is the first element of $X_t$. Furthermore, we require that $\Gamma(L)$ has properties sufficient to ensure that $x_t$ is covariance stationary with a finite and continuous spectrum.

Putting together the assumptions about $m$ with those about the dynamics of the economy, we can write the innovations to the pricing kernel as a function of the impulse response functions (IRFs) of $x_t$ to each of the fundamental shocks.
In particular, for the $j$th fundamental shock, $\varepsilon_{j,t}$, the IRF of $x_t$ is the set of $g_{j,k}$ for all horizons $k$ defined as:

$$g_{j,k} \equiv \begin{cases} B_1 \Gamma_k B_j' & \text{for } k \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

We can then rewrite the innovation to the SDF as:

$$\Delta E_{t+1} m_{t+1} = -\sum_j \left( \sum_{k=0}^{\infty} z_k g_{j,k} \right) \varepsilon_{j,t+1} \tag{6}$$

and we refer to $\sum_{k=0}^{\infty} z_k g_{j,k}$ as the price of risk for shock $j$. In this representation, the effect of a fundamental shock $\varepsilon_{j,t+1}$ on the pricing kernel is decomposed by horizon: for every horizon $k$, the effect of the shock on $m_{t+1}$ depends on the response of $x$ at that horizon (captured by $g_{j,k}$) and on the horizon-specific price of risk $z_k$.

Our main result is a spectral decomposition in which the price of risk of a shock depends on the response of $x$ to that shock at each frequency $\omega$ and on a frequency-specific price of risk, $Z(\omega)$.

Result 1. Under Assumptions 1 and 2, we can write the innovations to the SDF as

$$\Delta E_{t+1} m_{t+1} = -\sum_j \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} Z(\omega)\, G_j(\omega)\, d\omega \right) \varepsilon_{j,t+1} \tag{7}$$

where $Z(\omega)$ is a weighting function depending on the risk prices $\{z_k\}$ and $G_j(\omega)$ measures the dynamic effects of $\varepsilon_{j,t}$ on $x$ in the frequency domain,

$$Z(\omega) \equiv z_0 + 2 \sum_{k=1}^{\infty} z_k \cos(\omega k) \tag{8}$$

$$G_j(\omega) \equiv \sum_{k=0}^{\infty} \cos(\omega k)\, g_{j,k} \tag{9}$$

Equivalently, the price of risk for a shock can be written as

$$\sum_{k=0}^{\infty} z_k g_{j,k} = \frac{1}{2\pi} \int_{-\pi}^{\pi} Z(\omega)\, G_j(\omega)\, d\omega \tag{10}$$

Derivation and discussion

For each shock $\varepsilon_{j,t}$, the set of coefficients $\{g_{j,k}\}$ is the impulse response function of $x_t$ at different horizons $k$. Moving into the frequency domain, the first step is to decompose the effects of each shock $\varepsilon_{j,t}$ on the future values of $x_t$ into cycles of different frequencies. To do this, we use the discrete Fourier transform, and define

$$\tilde{G}_j(\omega) \equiv \sum_{k=0}^{\infty} e^{-i\omega k} g_{j,k} \tag{11}$$

If $\varepsilon_{j,t}$ has very long-lasting effects on $x$, then most of the mass of $\tilde{G}_j(\omega)$ will lie at low frequencies, while if $\varepsilon_{j,t}$ induces mainly transitory dynamics in $x$, then $\tilde{G}_j(\omega)$ will isolate high frequencies.[6] We refer to $\tilde{G}_j$ as the impulse transfer function of shock $j$ since it is the transfer function associated with the filter $\sum_{k=0}^{\infty} g_{j,k} L^k$.
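The identity in equation (10) can be checked numerically. The following is a minimal sketch, not the authors' code: the risk prices `z` and the impulse response `g` are hypothetical geometric sequences truncated at 40 horizons, and the integral over $[-\pi, \pi]$ is approximated by an average over a uniform frequency grid.

```python
import numpy as np

def weighting_function(z, omega):
    """Z(omega) = z_0 + 2 * sum_{k>=1} z_k cos(omega k), equation (8), truncated at len(z)."""
    k = np.arange(len(z))
    scale = np.where(k == 0, 1.0, 2.0)  # z_0 enters once, every later z_k twice
    return np.cos(np.outer(omega, k)) @ (scale * z)

def impulse_transfer(g, omega):
    """G_j(omega) = sum_{k>=0} cos(omega k) g_{j,k}, equation (9), truncated at len(g)."""
    k = np.arange(len(g))
    return np.cos(np.outer(omega, k)) @ g

# Hypothetical inputs, chosen only for illustration.
K = 40
horizons = np.arange(K)
z = 0.9 ** horizons        # assumed horizon-specific risk prices z_k
g = 0.5 * 0.7 ** horizons  # assumed impulse response g_{j,k}

# Uniform grid on [-pi, pi); the grid mean approximates (1 / 2pi) * integral.
n_grid = 4096
omega = -np.pi + 2.0 * np.pi * np.arange(n_grid) / n_grid

price_time_domain = float(z @ g)
price_freq_domain = float(
    np.mean(weighting_function(z, omega) * impulse_transfer(g, omega))
)
print(price_time_domain, price_freq_domain)
```

Because both functions are finite cosine sums here, the grid average reproduces the integral up to rounding error, so the two printed numbers should agree closely.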
Using the inverse Fourier transform, the price of risk for shock $j$ is

$$\sum_{k=0}^{\infty} z_k g_{j,k} = \sum_{k=0}^{\infty} z_k \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{G}_j(\omega)\, e^{i\omega k}\, d\omega \tag{12}$$

Now note that $g_{j,k} = 0$ for all $k < 0$, so for any $k > 0$ the inverse transform evaluated at $-k$ is zero and $e^{i\omega k}$ can be replaced by $e^{i\omega k} + e^{-i\omega k} = 2\cos(\omega k)$. We can therefore rewrite (12) as:[7]

$$\sum_{k=0}^{\infty} z_k g_{j,k} = \frac{1}{2\pi} \int_{-\pi}^{\pi} G_j(\omega) \left( z_0 + 2 \sum_{k=1}^{\infty} z_k \cos(\omega k) \right) d\omega \tag{13}$$

where $G_j(\omega)$ is the real part of $\tilde{G}_j(\omega)$,

$$G_j(\omega) \equiv \mathrm{re}\left( \tilde{G}_j(\omega) \right) = \sum_{k=0}^{\infty} \cos(\omega k)\, g_{j,k} \tag{14}$$

In other words, the price of risk for any shock depends on the integral of its response in the frequency domain, $G_j(\omega)$, weighted by a real-valued function $Z(\omega)$, where

$$Z(\omega) \equiv z_0 + 2 \sum_{k=1}^{\infty} z_k \cos(\omega k) \tag{15}$$

[6] To be more rigorous about the sense in which $\tilde{G}$ gives weights in terms of cycles of different frequencies, we refer to the spectral representation theorem. Specifically, denote $x_{k,t}$ the process induced in $x_t$ if the only shock realizations were for $\varepsilon_k$. That is,

$$x_{k,t} = \sum_{j=0}^{\infty} g_{k,j}\, \varepsilon_{k,t-j}$$

$\varepsilon_{k,t}$ has a spectral representation

$$\varepsilon_{k,t} = \int_{-\pi}^{\pi} e^{it\omega}\, dZ(\omega)$$

where $dZ(\omega)$ is an orthogonal increment process with constant variance (see, e.g., Priestley, 1981, for a textbook statement and proof of the spectral representation theorem). The spectral representation of $x_{k,t}$ is then

$$x_{k,t} = \int_{-\pi}^{\pi} e^{it\omega} \sum_{j=0}^{\infty} g_{k,j}\, e^{-ij\omega}\, dZ(\omega) = \int_{-\pi}^{\pi} e^{it\omega}\, \tilde{G}_k(\omega)\, dZ(\omega)$$

$\tilde{G}_k$ thus determines the magnitude of fluctuations in $x_{k,t}$ at frequency $\omega$.

[7] See the appendix for a derivation of this equation.

We thus have

$$\sum_{k=0}^{\infty} z_k g_{j,k} = \frac{1}{2\pi} \int_{-\pi}^{\pi} G_j(\omega)\, Z(\omega)\, d\omega \tag{16}$$

This equation maps the element-by-element product of the infinite collections $\{g_{j,k}\}$ and $\{z_k\}$ into a simple integral over a finite range in the frequency domain. This result is closely related to Plancherel's and Parseval's theorems, but is not identical because we take advantage of the fact that $g_{j,k} = 0$ for $k < 0$ to ensure that $Z(\omega)$ is real-valued.

The price of risk for shock $\varepsilon_j$ thus depends on an integral over the function $G_j(\omega)$, with weights $Z(\omega)$. Recall that for each frequency $\omega$, $G_j(\omega)$ tells us the effect of $\varepsilon_j$ on $x$ at frequency $\omega$. $Z(\omega)$ therefore determines the price of risk for any shock to the variable $x$ at frequency $\omega$.

2.1 Examples of impulse transfer functions $G_j(\omega)$

Before proceeding further, it is helpful to see examples of what the impulse transfer function looks like for some simple impulse response functions. For the sake of concreteness, suppose for the moment that the priced variable $x_t$ is log consumption growth, $\Delta c_t$. Figure 1 plots the impulse response (IRF) and impulse transfer functions for four different hypothetical shocks. Note that while we are ultimately interested in the effects of the shocks on consumption growth, $\Delta c_t$, we plot the IRF in terms of consumption levels, $c_t$, as they are the more natural way to think about consumption.

The first shock is a simple one-time increase in the level of consumption. This shock has a flat impulse transfer function on consumption growth, indicating it has equal power at all frequencies. The second shock is a long-run-risk type shock, inducing persistently positive consumption growth, with the level of consumption ultimately reaching the same level as that induced by the first shock. In this case, there is much less power at high frequencies, but the power at frequency zero is identical, since $G(0)$ depends only on the long-run effect of the shock on the level of consumption ($G_j(0) = \sum_{k=0}^{\infty} g_{j,k}$).

The next two shocks have purely transitory effects. The third shock raises consumption for just a single period, and we now see zero power at frequency zero and positive power at high frequencies. The fourth shock is more interesting. Consumption rises initially, turns negative in the second period, and returns to its initial level in the third period. The transfer function is again equal to zero at $\omega = 0$, but it now actually has negative power at low and middle frequencies. This is a result of the fact that the impulse response of consumption is actually negative in some periods. The sign of $G$ reflects the direction in which the shock drives consumption.
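The four cases can be reproduced with a few lines of code. This is an illustrative sketch only: the specific impulse responses below (including the persistence value 0.8 and the unit shock sizes) are assumptions chosen to match the verbal descriptions, not the values behind Figure 1. Each IRF is stated for consumption growth, so a one-period rise in the level is the growth sequence (1, -1).

```python
import numpy as np

def impulse_transfer(g, omega):
    """G(omega) = sum_{k>=0} cos(omega k) g_k for a consumption-growth IRF g."""
    g = np.asarray(g, dtype=float)
    k = np.arange(len(g))
    return np.cos(np.outer(omega, k)) @ g

persistence = 0.8  # hypothetical persistence of the long-run shock
growth_irfs = {
    # permanent one-time jump in the level: growth responds only on impact
    "permanent": [1.0],
    # persistent growth whose cumulative effect on the level is also 1
    "long_run": (1 - persistence) * persistence ** np.arange(400),
    # level up for one period, then back: growth is +1 then -1
    "one_period": [1.0, -1.0],
    # level +1, then -1, then 0: growth is +1, -2, +1
    "reversal": [1.0, -2.0, 1.0],
}

omega = np.array([0.0, np.pi / 3, np.pi])  # zero, a middle, and the highest frequency
transfer = {name: impulse_transfer(g, omega) for name, g in growth_irfs.items()}
for name, values in transfer.items():
    print(name, np.round(values, 3))
```

Evaluating at frequency zero returns the cumulative growth response in each case, which is the long-run effect on the level of consumption.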
If we had reversed the signs of the impulse responses for the first three shocks, their transfer functions would all have been negative.

3 Weighting functions in consumption-based models

This section applies the analysis above to a range of standard utility functions for which $m$ can be written as a function of innovations to consumption growth. We analyze power utility, models of internal and external habit formation, and Epstein–Zin preferences.[8] We then estimate weighting functions empirically using data on equity returns.

[8] While these models of preferences are often applied under the assumption of the existence of a representative agent, note that that assumption is not strictly necessary. In particular, the pricing kernel generated by an agent's Euler equation will hold for any market in which he participates. We thus do not concern ourselves, for now, with issues of market completeness or the existence of a representative agent. Of course, we will assume a representative agent when testing the model using data on aggregate consumption.

3.1 Weighting functions in theoretical models

3.1.1 Power utility

Under power utility, the log pricing kernel is

$$m_{t+1} = \log \beta - \alpha\, \Delta c_{t+1} \tag{17}$$

where $c_t$ denotes the log of an agent's consumption, $\alpha$ is the coefficient of relative risk aversion, and $-\log \beta$ is the rate of pure time preference. (17) implies that $z_0 = \alpha$ and $z_k = 0$ for all $k > 0$, and thus the weighting function under power utility is simply

$$Z^{power}(\omega) = \alpha \tag{18}$$

$Z^{power}$ is flat and exactly equal to the coefficient of relative risk aversion. $Z^{power}$ is constant because the only determinant of the innovation to the SDF is the innovation to consumption on date $t+1$. A shock to consumption growth has the same effect on the pricing kernel regardless of how long the innovation is expected to last.

3.1.2 Habits

Adding an internal habit to the preferences yields the lifetime utility function

$$V_t = \sum_{j=0}^{\infty} \beta^j \frac{\left( C_{t+j} - b C_{t+j-1} \right)^{1-\alpha}}{1-\alpha} \tag{19}$$

where $C_t = \exp(c_t)$ is the level of consumption and $0 \le b < 1$ is a parameter determining the importance of the habit. The pricing kernel is

$$\exp(m_{t+1}) = \beta\, \frac{\left( C_{t+1} - b C_t \right)^{-\alpha} - E_{t+1}\, \beta b \left( C_{t+2} - b C_{t+1} \right)^{-\alpha}}{\left( C_t - b C_{t-1} \right)^{-\alpha} - E_t\, \beta b \left( C_{t+1} - b C_t \right)^{-\alpha}} \tag{20}$$

If we log-linearize the pricing kernel in terms of $\Delta c_{t+1}$ and $\Delta c_{t+2}$ around a zero-growth steady state, we obtain

$$\Delta E_{t+1} m_{t+1} \approx -\alpha \left( b (1-b)^{-2} + 1 \right) \Delta E_{t+1} \Delta c_{t+1} + \alpha\, b (1-b)^{-2}\, \Delta E_{t+1} \Delta c_{t+2} \tag{21}$$

With internal habits the pricing kernel depends on both the innovation to current consumption growth and also the change in consumption growth between dates $t+1$ and $t+2$. The spectral weighting function under habit formation is

$$Z^{internal}(\omega) = \alpha \left( 1 + b (1-b)^{-2} \right) - \alpha\, b (1-b)^{-2}\, 2 \cos(\omega) \tag{22}$$

The weighting function with habits is equal to a constant plus a negative multiple of $\cos(\omega)$. As we would expect, $Z^{internal}(\omega) = Z^{power}(\omega)$ when $b = 0$.

The left panel of Figure 2 plots $Z^{internal}(\omega)$ for various values of $b$. Here and in all cases below we only plot $Z$ between 0 and $\pi$, as is standard, since $Z$ is even across 0 and $\pi$. The x-axis lists the wavelength of the cycles, as opposed to the frequency $\omega$. Given a frequency of $\omega$, the corresponding cycle has length $2\pi/\omega$ periods (the smallest cycle we can discern lasts two periods).

As $b$ rises, there are two effects. First, the integral over $Z$ gets larger, and second, its mass shifts to higher frequencies. The latter effect is consistent with the usual intuition about internal habit formation that households prefer to smooth consumption growth and avoid high-frequency fluctuations to a greater extent than they would under power utility.[9]

One lesson from the equation for $Z^{internal}$ is that as long as $b$ is the only parameter we can vary, there is little flexibility in controlling preferences over different frequencies. $\cos(\omega)$ always crosses zero at $\pi/2$, so the pricing kernel will always place higher weight on cycles of frequency greater than $\pi/2$ and relatively less weight on cycles with frequency less than $\pi/2$.
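The two effects of raising $b$ can be seen by evaluating equation (22) directly. This is a small sketch under assumed values: a risk-aversion coefficient of 5 and three habit parameters that are purely illustrative. The average over $[0, \pi]$ is computed with a midpoint grid.

```python
import numpy as np

def z_internal(omega, alpha, b):
    """Internal-habit weighting function from equation (22)."""
    habit_term = b / (1.0 - b) ** 2
    return alpha * (1.0 + habit_term) - alpha * habit_term * 2.0 * np.cos(omega)

alpha = 5.0  # hypothetical coefficient of relative risk aversion
n_grid = 2000
omega = (np.arange(n_grid) + 0.5) * np.pi / n_grid  # midpoints of [0, pi]

averages = {}
high_frequency_shares = {}
for b in (0.0, 0.3, 0.6):  # hypothetical habit parameters
    z = z_internal(omega, alpha, b)
    averages[b] = float(z.mean())  # (1/pi) * integral of Z over [0, pi]
    # share of the integral that lies on frequencies above pi/2
    high_frequency_shares[b] = float(z[omega > np.pi / 2].sum() / z.sum())
    print(b, round(averages[b], 3), round(high_frequency_shares[b], 3))
```

With $b = 0$ the function is flat at `alpha`, so exactly half of the integral lies above $\pi/2$; for larger $b$ both the average and the high-frequency share rise.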
Furthermore, $Z^{internal}$ is monotone, regardless of the value of $b$.[10]

[9] Otrok, Ravikumar, and Whiteman (2002) obtain a similar result, but in a different manner. Rather than characterize the volatility of the pricing kernel, they characterize the price of a Lucas tree, which is equivalent to simply characterizing lifetime utility as a function of the spectral density of consumption growth. While lifetime utility is important, it is not the same as the price of risk in the economy. Our results are therefore complements rather than substitutes.

[10] One potential way to enrich preferences so that they isolate smaller ranges of the spectrum may be to allow more lags of consumption to enter the utility function.

Under external habit formation, the SDF is

$$\exp(m_{t+1}) = \beta \left( \frac{C_{t+1} - b \bar{C}_t}{C_t - b \bar{C}_{t-1}} \right)^{-\alpha} \tag{23}$$

where $\bar{C}$ denotes some external measure of consumption (e.g. aggregate consumption or that of an agent's neighbors). In this case, the innovation to the SDF depends only on the innovation to $C_{t+1}$. So the weighting function with an external habit will be completely flat. Otrok, Ravikumar, and Whiteman (2002) show that the external habit has a strong effect on what weights utility places on consumption cycles of different frequencies, but what we show here is that the SDF is driven entirely by one-period innovations, so all cycles receive the same weight.

The pricing kernel in models with external habit formation, e.g. Campbell and Cochrane (1999), places equal weight on all frequencies. On the other hand, the internal habit models of Constantinides (1990) and Abel (1990) are heavily weighted towards high-frequency fluctuations.

3.1.3 Epstein–Zin preferences

An alternative to habits for incorporating non-separabilities over time is Epstein and Zin's (1991) formulation of recursive preferences. In general, under recursive preferences, anything that affects an agent's welfare affects the pricing kernel.
So not only shocks to current and future consumption growth, but also innovations to higher moments will be priced. We begin by focusing on the case where consumption growth is log-normal and homoskedastic. Subsequent sections consider models with stochastic volatility. Suppose an agent has lifetime utility

$$V_t = \left\{ (1-\beta)\, C_t^{1-\rho} + \beta \left( E_t V_{t+1}^{1-\alpha} \right)^{\frac{1-\rho}{1-\alpha}} \right\}^{\frac{1}{1-\rho}} \tag{24}$$

Campbell (1993) and Restoy and Weil (1998) show that if consumption growth is log-normal and homoskedastic, the stochastic discount factor for these preferences can be log-linearized as

$$\Delta E_{t+1} m_{t+1} \approx -\rho\, \Delta E_{t+1} \Delta c_{t+1} + (\rho - \alpha)\, \Delta E_{t+1} \left( \sum_{j=0}^{\infty} \theta^j \Delta c_{t+1+j} \right) \tag{25}$$

where $\rho$ is the inverse elasticity of intertemporal substitution (EIS), and $\alpha$ is the coefficient of relative risk aversion. $\theta$ is a parameter (generally close to 1) that comes from the log-linearization of the return on the agent's wealth portfolio (Campbell and Shiller, 1988).[11] $\theta$ is a measure of impatience: if the agent is highly impatient, then he consumes a large fraction of his wealth in each period and $\theta$ is small. In the case where $\rho = 1$, (25) is exact.

[11] Specifically, $\theta = (1 + DP)^{-1}$, where $DP$ is the dividend-price ratio for the wealth portfolio (i.e. the consumption-wealth ratio) around which we approximate. $\theta$ generalizes the rate of pure time preference somewhat because it also depends on discounting due to uncertainty about future consumption.

For the case of (25), the weighting function is

$$Z^{EZ}(\omega) \equiv \alpha + (\alpha - \rho) \sum_{j=1}^{\infty} \theta^j\, 2 \cos(\omega j) \tag{26}$$

which can be further simplified using Euler's formula as

$$Z^{EZ}(\omega) = \rho + (\alpha - \rho)\, \frac{1 - \theta^2}{1 - 2\theta \cos(\omega) + \theta^2} \tag{27}$$

Under power utility, $\alpha = \rho$ and $Z^{EZ}(\omega) = \alpha$ is flat, so all frequencies receive equal weight, as discussed above. On the other hand, if $\alpha \ne \rho$, then weights can vary across frequencies due to the second term.

$Z^{EZ}$ is much richer than what we obtain in the case of power utility and it has a number of interesting properties. First, as with power utility, its average value is exactly equal to the coefficient of relative risk aversion,

$$\frac{1}{\pi} \int_0^{\pi} Z^{EZ}(\omega)\, d\omega = \alpha \tag{28}$$

So the total weight placed on the spectrum depends only on risk aversion. To the extent that the volatility of the pricing kernel depends on the EIS, it is due only to how $\rho$ affects which frequencies receive weight. Moreover, in the special case where consumption follows a random walk with standard deviation $\sigma$, $G(\omega) = \sigma$ and the standard deviation of the log pricing kernel is simply $\alpha\sigma$.

Looking at $Z^{EZ}$, we have

$$Z^{EZ}(0) = (\alpha - \rho)\, \frac{1+\theta}{1-\theta} + \rho \tag{29}$$

$$Z^{EZ}(\pi) = (\alpha - \rho)\, \frac{1-\theta}{1+\theta} + \rho \tag{30}$$

$$\frac{dZ^{EZ}(\omega)}{d\omega} \propto -(\alpha - \rho) \quad \text{for } \omega \in [0, \pi] \tag{31}$$

For $\theta$ near 1, $Z^{EZ}(0)$ is driven by the $(\alpha - \rho)$ term since $\frac{1+\theta}{1-\theta}$ is large (approaching $\infty$ as $\theta \to 1$). Since the integral of $Z$ is always equal to $\alpha$, the $(\alpha - \rho)$ term determines the relative weight of $Z^{EZ}$ near frequency zero. The sign of $dZ^{EZ}/d\omega$ depends only on $\alpha - \rho$. If risk aversion is higher than the inverse EIS (agents prefer an early resolution of uncertainty), then $Z^{EZ}(0)$ is high and $Z^{EZ}$ is decreasing on $(0, \pi)$. If $\alpha < \rho$, then $Z^{EZ}(0)$ is low (or negative) and $Z^{EZ}$ is increasing. So, essentially, $Z^{EZ}(0)$ depends on the magnitude of $(\alpha - \rho)$, and $Z^{EZ}$ then monotonically moves towards $\rho$ as $\omega$ increases.

An obvious question is how rapidly $Z^{EZ}$ falls as $\omega$ rises above zero. That is, how much of the mass of $Z^{EZ}$ is concentrated at very low frequencies? In the limit as $\theta \to 1$, i.e. the case where households are indifferent about when consumption occurs, $Z^{EZ}(\omega)$ approaches

$$Z^{EZ}(\omega) = (\alpha - \rho)\, D_{\infty}(\omega) + \rho \tag{32}$$

where $D_{\infty}$ is the limit of the Dirichlet kernel (closely related to the Dirac delta function), with the key properties

$$D_{\infty}(\omega) = 0 \quad \text{for } \omega \ne 0 \tag{33}$$

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} D_{\infty}(\omega)\, d\omega = 1 \tag{34}$$

for $\omega$ in the interval $[-\pi, \pi]$. For an agent who is effectively infinitely patient, then, two features of the consumption process matter: the permanent innovations at $\omega = 0$ ($\lim_{j \to \infty} \Delta E_{t+1} c_{t+j}$), which are weighted by $\alpha$, and all other innovations, which have no effect on $\lim_{j \to \infty} \Delta E_{t+1} c_{t+j}$, and are weighted by $\rho$.

Moving away from the limiting case, the right-hand panel of Figure 2 plots $Z^{EZ}$ for a variety of parameterizations.
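The closed form for the Epstein–Zin weighting function and its stated properties are easy to verify numerically. The sketch below is illustrative rather than the authors' implementation: it compares the closed form in (27) with a truncated version of the cosine series in (26), and checks the average in (28) and the endpoint values in (29) and (30). The parameter values are those used in the annual benchmark discussed in the text; the truncation point of 5,000 terms and the grid sizes are assumptions.

```python
import numpy as np

def z_ez_closed(omega, alpha, rho, theta):
    """Closed form of the Epstein-Zin weighting function, equation (27)."""
    kernel = (1 - theta**2) / (1 - 2 * theta * np.cos(omega) + theta**2)
    return rho + (alpha - rho) * kernel

def z_ez_series(omega, alpha, rho, theta, n_terms=5000):
    """Truncated cosine series from equation (26)."""
    j = np.arange(1, n_terms + 1)
    return alpha + (alpha - rho) * 2.0 * (np.cos(np.outer(omega, j)) @ theta**j)

alpha, rho, theta = 5.0, 0.5, 0.975

# The series and the closed form should coincide on any grid of frequencies.
coarse = np.linspace(0.0, np.pi, 50)
max_gap = float(np.max(np.abs(
    z_ez_closed(coarse, alpha, rho, theta) - z_ez_series(coarse, alpha, rho, theta)
)))

# Average over [0, pi] using a midpoint grid, to compare with equation (28).
n_grid = 20000
midpoints = (np.arange(n_grid) + 0.5) * np.pi / n_grid
average = float(np.mean(z_ez_closed(midpoints, alpha, rho, theta)))

z_at_zero = float(z_ez_closed(0.0, alpha, rho, theta))
z_at_pi = float(z_ez_closed(np.pi, alpha, rho, theta))
print(max_gap, average, z_at_zero, z_at_pi)
```

The gap between the value at frequency zero and the value at $\pi$ illustrates how sharply the weight is concentrated near zero when $\theta$ is close to 1.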
The parameterizations are meant to correspond to annual data, so we take $\theta = 0.975$ as our benchmark, which corresponds to a 2.5 percent annual dividend yield. For $\alpha = 5$ and $\rho = 0.5$, we see a large peak near frequency zero, with little weight elsewhere. In fact, half the mass of $Z^{EZ}$ in this case lies on cycles with length of 230 years or more, and 75 percent lies on cycles with length 72 years or more. In this parameterization, it is effectively only the very longest cycles in consumption (up to permanent shocks) that carry any substantial weight in the pricing kernel. Purely temporary shocks to the level of consumption (which is what are induced by shocks to monetary policy in standard models, for example) are essentially unpriced.

The line that is highly negative near $\omega = 0$ is for $\alpha = 0.5$ and $\rho = 5$, where households prefer a late resolution of uncertainty. In this case, the mass of $Z^{EZ}$ is still effectively isolated near zero, but because households now prefer a late resolution of uncertainty, $Z^{EZ}$ is negative at that point. The integral of $Z^{EZ}$ is still equal to $\alpha$, though, so it turns positive at higher frequencies.[12]

[12] Note, though, that the case where $\rho > \alpha$ is not taken as a benchmark and is not widely viewed as empirically relevant (see, e.g., Bansal and Yaron, 2004).

The final two lines in the right-hand panel of Figure 2 plot $Z^{EZ}$ for $\alpha = 5$ and $\rho = 0.5$ with lower values of $\theta$, 0.9 and 0.5. These are values that imply a decidedly unrealistic discount rate, but they help show what is necessary for Epstein–Zin to place any meaningful weight on higher frequencies. Even with $\theta = 0.9$, half the weight of $Z^{EZ}$ is on cycles with length 50 years or more. When we push $\theta$ all the way to 0.5, the median cycle finally has length 9 years, roughly corresponding to a long business cycle (whereas under power utility, half the weight is on cycles with length 4 years or more).
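One way to compute cycle lengths of this kind is to integrate $Z^{EZ}$ from zero up to a cutoff frequency and solve for the cutoff at which a given share of the total integral is reached. The sketch below does this under an assumed definition of "mass" (the integral of $Z^{EZ}$ over $[0, \pi]$, which requires $Z^{EZ} \ge 0$); the paper's exact definition is not stated in this section, so the results need not match the quoted figures exactly. The antiderivative used for the kernel in (27) is a standard result for the Poisson kernel, and the bisection routine is a generic root finder.

```python
import numpy as np

def cumulative_mass_share(omega, alpha, rho, theta):
    """Share of the integral of Z^EZ over [0, pi] that lies on [0, omega]."""
    ratio = (1 + theta) / (1 - theta)
    # Integral over [0, omega] of (1 - theta^2) / (1 - 2 theta cos(w) + theta^2)
    kernel_integral = 2.0 * np.arctan(ratio * np.tan(omega / 2.0))
    return (rho * omega + (alpha - rho) * kernel_integral) / (np.pi * alpha)

def cycle_length_at_share(share, alpha, rho, theta):
    """Cycle length 2*pi/omega at which the cumulative share is reached (bisection)."""
    low, high = 0.0, np.pi
    for _ in range(200):
        mid = 0.5 * (low + high)
        if cumulative_mass_share(mid, alpha, rho, theta) < share:
            low = mid
        else:
            high = mid
    return 2.0 * np.pi / (0.5 * (low + high))

alpha, rho = 5.0, 0.5
median_cycles = {
    theta: cycle_length_at_share(0.5, alpha, rho, theta)
    for theta in (0.975, 0.9, 0.5)
}
flat_median = cycle_length_at_share(0.5, 5.0, 5.0, 0.975)  # power utility: alpha == rho
print(median_cycles, flat_median)
```

Under this definition the median cycle length falls sharply as $\theta$ is lowered, and the flat power-utility case returns a median cycle of four periods.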
3.1.4 Ambiguity aversion interpretation

As usual, the analysis of Epstein–Zin preferences naturally also applies to the preferences of an ambiguity-averse agent (e.g. Hansen and Sargent, 2001; Barillas, Hansen, and Sargent, 2009). When the agent has a preference for robustness, he can be viewed as having a reference distribution (the true distribution) and a worst-case distribution, which is what he actually uses to price assets. Under the reference distribution, the agent simply has power utility, so his weighting function would be flat. Under the worst-case distribution, though, he places relatively more weight on certain "bad" states of the world (based on a joint entropy condition on the two distributions). Our weighting function shows the effect of that reweighting in the frequency domain. Agents essentially place more weight on the possibility of low-frequency fluctuations, which gives them a relatively high weight in the function $Z^{EZ}$.$^{13}$

$^{13}$ Hansen and Sargent (2007) provide a similar interpretation of their multiplier preferences in the frequency domain for a linear-quadratic control problem.

3.2 Estimates of weighting functions

We now estimate the weighting function $Z(\omega)$ for consumption growth using the cross-section of equity prices. Estimating $Z$ involves three main steps. First, we estimate the dynamics of the economy, which we represent with a VAR, and identify the fundamental shocks $\varepsilon_{t+1}$ and the dynamic response of consumption growth to these shocks. Second, we parametrize the function $Z(\omega)$. Third, we estimate the parameters of $Z$ using the cross-section of equity returns.

3.2.1 Step 1: Estimation of the dynamics

We assume the process driving the priced variable, $x_t$, follows a finite-order VAR,

$$X_t = \Phi(L)\,X_{t-1} + \varepsilon_t \tag{35}$$

where $x_t = B_1 X_t$ is the first element of $X_t$ and $X_t$ has dimension $N \times 1$.$^{14}$ In our benchmark results, $x_t$ is log consumption growth, $\Delta c_t$.
If the lag polynomial $\Phi(L)$ has order $K$, then we can stack $K$ consecutive observations of $X_t$ so that the stacked vector $[X_t', X_{t-1}', \ldots]'$ follows a VAR(1). Redefining $X_t$ as this stacked vector,

$$X_t = \Phi X_{t-1} + \varepsilon_t \tag{36}$$

and $x_t = B_1 X_t$. We estimate this VAR using OLS, yielding estimates of $\Phi$ and $\varepsilon_t$.

3.2.2 Step 2: Parametrization of the Spectral Weighting Function

The weighting function that we want to estimate, $Z(\omega)$, is potentially infinite-dimensional. However, we only have a finite number of risk prices (one for each estimated shock in $\varepsilon_t$) with which to estimate it. We therefore need to choose a functional form for $Z$ with a finite number of parameters. We consider two specifications: a flexible function motivated by the utility functions discussed above, and a step function.

The utility basis

The analysis of the utility functions in the previous sections suggests modeling $Z$ as:

$$Z^{U}(\omega) = q_1 \sum_{j=1}^{\infty} \theta^j \cos(\omega j) + q_2 + q_3 \cos(\omega) \tag{37}$$

where $q_1$, $q_2$, and $q_3$ are unknown coefficients. We call (37) the utility basis because it nests the weighting functions derived from utility-based models. If $q_3 = 0$, (37) matches the weighting function for Epstein–Zin preferences in (26). If $q_1 = 0$, the long-run component that is crucial in the Epstein–Zin case is shut off, and we obtain the weighting function for internal habit formation in (22). Finally, if both $q_1 = 0$ and $q_3 = 0$, then we have the weighting function for power utility. Note that we have an extra parameter $\theta$ here. Following the most common calibration of the Epstein–Zin model that motivated our specification of the $Z$ function, we choose $\theta = 0.975^{1/4}$ for quarterly data, corresponding to a 2.5 percent annual consumption/wealth ratio.$^{15}$

Because the utility basis is so closely related to the weighting functions we derived under various preference specifications, the constituent functions are already plotted in Figure 2. In particular, the lines in the right-hand panel represent the first function, $\sum_{j=1}^{\infty} \theta^j \cos(\omega j)$, shifted upward by a constant.
This function clearly isolates very low frequencies, and the extent to which the lowest frequencies are isolated depends on the parameter $\theta$.

$^{14}$ Recall that $B_j$ represents a conformable selection vector equal to 1 in element $j$ and 0 elsewhere.

$^{15}$ In theory, we could estimate $\theta$. However, we find that it is poorly identified in the data, so we proceed to calibrate it to the value most commonly used in the literature.

The bandpass basis

One advantage of working in the frequency domain is that it is straightforward to estimate risk prices for ranges of frequencies of interest. We simply break the interval $[0, \pi]$ into three intervals, corresponding to business-cycle-length fluctuations with wavelength between 6 and 32 quarters (as is standard in the macro literature, e.g. Christiano and Fitzgerald, 2003), and frequencies above and below that window. Under Epstein–Zin preferences, we would expect most of the weight of $Z$ to lie in the range of frequencies below the business cycle, while habit formation implies that the mass should lie at higher frequencies. We refer to the set of three step functions as the bandpass basis, since $Z(\omega)$ is composed of the sum of three bandpass filters. Specifically, we define

$$Z^{(a,b)}(\omega) \equiv \begin{cases} 1 & \text{if } a < |\omega| \le b \\ 0 & \text{otherwise} \end{cases} \tag{38}$$

For quarterly data, our three basis functions are then $Z^{(0,\,2\pi/32)}(\omega)$, $Z^{(2\pi/32,\,2\pi/6)}(\omega)$, and $Z^{(2\pi/6,\,\pi)}(\omega)$. We therefore estimate the function

$$Z^{BP}(\omega) = q_1 Z^{(0,\,2\pi/32)}(\omega) + q_2 Z^{(2\pi/32,\,2\pi/6)}(\omega) + q_3 Z^{(2\pi/6,\,\pi)}(\omega) \tag{39}$$

3.2.3 Step 3: Estimation of the spectral weighting function

Result 1 and the estimated VAR imply that the innovations to the SDF are:

$$\Delta E_{t+1} M_{t+1} = W(q)\,\varepsilon_{t+1} \tag{40}$$

for a $1 \times N$ vector $W$ that depends on the parameters $q \equiv [q_1\ q_2\ q_3]'$. We then estimate the vector $q$ using the cross-section of asset prices.
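To make the link between a weighting function and the VAR concrete, here is a minimal sketch of how risk prices for each shock could be computed under the bandpass basis. It assumes the frequency-domain pricing formula in which the price of risk for shock $j$ is $\frac{1}{2\pi}\int_{-\pi}^{\pi} Z(\omega)\,G_j(\omega)\,d\omega$, with impulse transfer function $G(\omega) = \sum_{k\ge 0}\cos(\omega k)\,B_1\Phi^k$ for a VAR(1). The matrix `phi`, the coefficient vector `q`, and all function names are hypothetical, the closed-form matrix inverse used for $G$ is a standard identity rather than the authors' procedure, and the sign convention relating risk prices to SDF innovations is left aside.

```python
import numpy as np

# Band edges for quarterly data: cycles longer than 32 quarters, 6 to 32 quarters,
# and shorter than 6 quarters.
BAND_EDGES = (0.0, 2.0 * np.pi / 32.0, 2.0 * np.pi / 6.0, np.pi)


def bandpass_basis(omega):
    """Evaluate the three step functions Z^(a,b); returns shape (3, len(omega))."""
    abs_omega = np.abs(np.asarray(omega, dtype=float))
    return np.array([
        ((abs_omega > lo) & (abs_omega <= hi)).astype(float)
        for lo, hi in zip(BAND_EDGES[:-1], BAND_EDGES[1:])
    ])


def transfer_function(phi, omega):
    """G(omega) = sum_k cos(omega*k) B1 Phi^k = Re[B1 (I - Phi e^{i*omega})^{-1}].

    Requires all eigenvalues of phi to lie inside the unit circle.
    Returns an array of shape (len(omega), N); column j is G_j(omega).
    """
    omega = np.asarray(omega, dtype=float)
    n = phi.shape[0]
    stacked = np.eye(n)[None, :, :] - phi[None, :, :] * np.exp(1j * omega)[:, None, None]
    return np.linalg.inv(stacked)[:, 0, :].real  # first row selects x_t = B1 X_t


def risk_prices(q, phi, n_grid=20000):
    """Approximate (1/2pi) * integral of Z^BP(omega) * G_j(omega) over [-pi, pi]."""
    step = 2.0 * np.pi / n_grid
    omega = -np.pi + step * (np.arange(n_grid) + 0.5)  # midpoint grid, avoids omega = 0
    z = np.asarray(q, dtype=float) @ bandpass_basis(omega)  # Z^BP on the grid
    g = transfer_function(phi, omega)
    return (z[:, None] * g).sum(axis=0) * step / (2.0 * np.pi)


if __name__ == "__main__":
    # Hypothetical two-variable VAR(1): the second shock feeds into x_t persistently.
    phi = np.array([[0.5, 0.1], [0.0, 0.9]])
    for q in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
        print(q, risk_prices(q, phi))
```

A useful sanity check on this construction is that a flat weighting function (all three coefficients equal to one) prices only the contemporaneous response of $x_t$, so the risk-price vector collapses to the selection vector $B_1$.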
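To find $W(q)$ for a given basis, go back to the VAR representation to write:

$$\Delta E_{t+1} M_{t+1} = -\sum_{k=0}^{\infty} z_k B_1 \Phi^k \varepsilon_{t+1} \tag{41}$$

According to Result 1, the time-domain weights $\{z_k\}$ are transformations of the weighting function,

$$z_k = \begin{cases} \dfrac{1}{2\pi}\displaystyle\int_{-\pi}^{\pi} Z(\omega)\,d\omega & \text{for } k = 0 \\[2ex] \dfrac{1}{\pi}\displaystyle\int_{-\pi}^{\pi} Z(\omega)\cos(\omega k)\,d\omega & \text{for } k > 0 \end{cases} \tag{42}$$

For both the utility and bandpass basis, $Z(\omega)$ is linear in the coefficients $q$, which implies that $z_k$ is also linear in $q$. Specifically,

$$z_k = q' H_k \tag{43}$$

where $H_k$ contains the integrals of the basis functions for $Z$. For the utility basis,

$$H_0 = \begin{bmatrix} \frac{1}{2\pi}\int_{-\pi}^{\pi} \sum_{i=1}^{\infty}\theta^i\cos(\omega i)\,d\omega \\ \frac{1}{2\pi}\int_{-\pi}^{\pi} 1\,d\omega \\ \frac{1}{2\pi}\int_{-\pi}^{\pi} \cos(\omega)\,d\omega \end{bmatrix}, \qquad H_{k>0} = \begin{bmatrix} \frac{1}{\pi}\int_{-\pi}^{\pi} \sum_{i=1}^{\infty}\theta^i\cos(\omega i)\cos(\omega k)\,d\omega \\ \frac{1}{\pi}\int_{-\pi}^{\pi} \cos(\omega k)\,d\omega \\ \frac{1}{\pi}\int_{-\pi}^{\pi} \cos(\omega)\cos(\omega k)\,d\omega \end{bmatrix} \tag{44}$$

For the bandpass basis, we obtain:

$$H_0 = \begin{bmatrix} \frac{1}{2\pi}\int_{-\pi}^{\pi} Z^{(0,\,2\pi/32)}(\omega)\,d\omega \\ \frac{1}{2\pi}\int_{-\pi}^{\pi} Z^{(2\pi/32,\,2\pi/6)}(\omega)\,d\omega \\ \frac{1}{2\pi}\int_{-\pi}^{\pi} Z^{(2\pi/6,\,\pi)}(\omega)\,d\omega \end{bmatrix}, \qquad H_{k>0} = \begin{bmatrix} \frac{1}{\pi}\int_{-\pi}^{\pi} Z^{(0,\,2\pi/32)}(\omega)\cos(\omega k)\,d\omega \\ \frac{1}{\pi}\int_{-\pi}^{\pi} Z^{(2\pi/32,\,2\pi/6)}(\omega)\cos(\omega k)\,d\omega \\ \frac{1}{\pi}\int_{-\pi}^{\pi} Z^{(2\pi/6,\,\pi)}(\omega)\cos(\omega k)\,d\omega \end{bmatrix} \tag{45}$$

which can be further simplified as a function of sines (without integrals) and then computed numerically. The set of vectors $\{H_j\}$ is thus determined exogenously by the choice of the basis. Given that $z_k = q' H_k$, (41) becomes

$$\Delta E_{t+1} M_{t+1} = -q' \left( \sum_{j=0}^{\infty} H_j B_1 \Phi^j \right) \varepsilon_{t+1} \tag{46}$$

$\Delta E_{t+1} M_{t+1}$ is thus a function of the VAR parameters $\Phi$, the innovations $\varepsilon_t = X_t - \Phi X_{t-1}$, and the three parameters $q_1$, $q_2$ and $q_3$. The risk prices are then estimated from the asset pricing condition$^{16}$

$$E[R_{it} - R_f] = -\mathrm{Cov}(m_{t+1}, r_{it+1}) \tag{47}$$

$$\phantom{E[R_{it} - R_f]} = E[q' u_{t+1} r_{it+1}] \tag{48}$$

with $u_{t+1} \equiv \sum_{j=0}^{\infty} H_j B_1 \Phi^j \varepsilon_{t+1}$ as in (46). Our full set of moment conditions identifying the parameters of the model is

$$G_{t+1}(\Phi, q) = \begin{bmatrix} \underbrace{(X_{t+1} - \Phi X_t) \otimes X_t}_{\text{VAR moments}} \\[3ex] \underbrace{r_{t+1} - r^f_{t+1} - q' \overbrace{\sum_{j=0}^{\infty} H_j B_1 \Phi^j}^{\text{Mapping into frequency domain}} (X_{t+1} - \Phi X_t)\, r_{t+1}}_{\text{Asset pricing moments}} \end{bmatrix} \tag{49}$$

where $r_t$ is the vector of test asset returns and $r^f_t$ is the risk-free rate.

$^{16}$ We use level returns as opposed to log returns for the left-hand side of the following equation: under the assumption of lognormality, $E_t[r_{it+1}] - r^f_{t+1} + \frac{\sigma_t^2}{2} \simeq E_t[R_{it+1} - R^f_t]$.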
We also condition the equation down.

While we could in principle minimize the GMM objective function for all the parameters simultaneously, that method has the drawbacks that the optimization is difficult to perform (due to the large number of parameters) and that it allows errors in the asset pricing model to affect the VAR estimates. We therefore construct estimates of $\Phi$ and $q$ by minimizing the two sets of moment conditions separately. That is, $\Phi$ is simply estimated through OLS and then $q$ is estimated using GMM, taking $\Phi$ as given.$^{17}$ Given estimates $\hat{\Phi}$ and $\hat{q}$, we construct standard errors using the full set of moments, $G_t(\hat{\Phi}, \hat{q})$. The standard errors we report for the risk prices $q$ therefore always incorporate uncertainty about the dynamics of the economy through $\Phi$. We perform the GMM estimation of $q$, taking $\Phi$ as given, using either one-step GMM (using the identity matrix to weight the asset pricing moments) or two-step GMM (using the estimated variance-covariance matrix of the moment residuals to construct the weighting matrix for the second step).$^{18}$

3.3 Empirical results

3.3.1 Data

The most natural choice for the priced variable, $x_t$, is consumption growth, but we also explore using other variables: GDP, durable consumption, and investment growth. The rationale for using variables other than consumption, even though we are motivated by consumption-based models, is that to the extent that the pricing kernel is driven by permanent shocks to consumption, permanent shocks to any variable that is cointegrated with consumption should also proxy for the pricing kernel, since the permanent shocks to consumption and any variable it is cointegrated with must be perfectly correlated. That said, households want to smooth consumption compared to income, so we cannot view estimates of the spectral weighting function for aggregates other than consumption as yielding direct tests comparing utility functions.
Rather, we interpret them as simply illustrating how the dynamics of the economy are priced. Furthermore, Cochrane (1996) argues that investment growth should price the cross-section of asset returns. Our results on investment are a generalization of his analysis that asks whether and how future dynamics of investment growth are priced.

$^{17}$ Optimizing the full GMM objective function (or even using two-stage GMM) would be more efficient, so our standard errors will in general be larger than if we used a fully efficient method.

$^{18}$ When computing the standard errors incorporating the full estimation uncertainty (according to eq. 49), we take into account the weighting matrix we have used to estimate $q$. We construct the full weighting matrix in the following way. We assign equal weight to all VAR moment conditions (i.e. we use the identity matrix for the block of the weighting matrix that corresponds to the VAR moment conditions). For the block that corresponds to the asset pricing moments, we use the same weighting matrix we used in the estimation of $q$. We set to zero the weight on the cross-product between VAR and asset pricing moment conditions. Finally, we scale the VAR moment conditions by a (common) constant such that on average the block of VAR moments and the block of asset pricing moments get the same weight.

For the vector of state variables $X_t$, we want variables that are both priced and can forecast our priced variable $x_t$. Since the number of parameters of the VAR increases quadratically with the dimension of $X_t$, we look for a parsimonious representation with few state variables.
We include in $X_t$ the lagged values of the priced variable, and we add the first two principal components of a set of 9 real and financial variables: the aggregate price/earnings and price/dividend ratios; the 10-year/3-month term spread; the AAA–Baa corporate yield spread (default spread); the small-stock value spread; the unemployment rate minus its 8-year moving average; RREL, the detrended version of the short-term interest rate that Campbell (1991) finds forecasts market returns; the three-month Treasury yield; and Lettau and Ludvigson's (2001) cay. Because many of the variables used are only available after 1952, in the analysis that follows we use quarterly data over the period 1952–2011. Finally, we use 3 lags of quarterly data, but results are robust to the choice of the number of lags. Table 1 reports the VAR coefficients using consumption growth as a priced variable.

3.3.2 Parameter estimates

Table 2 reports the estimation results using two-step GMM to estimate the risk prices. The left-hand side uses the set of 25 size and book/market-sorted portfolios, while the right-hand side adds in a set of 49 industry portfolios (both sets of portfolios are obtained from Ken French's website; we drop six industry portfolios for which we have missing data in the period considered). For both portfolio sets we estimate both the bandpass basis and the utility basis. For the bandpass basis, $q_1$ corresponds to the price of lower-than-business-cycle risks, $q_2$ to business-cycle risks, and $q_3$ to higher-than-business-cycle risks. For the utility basis, $q_1$ is the price of the long-run component, $q_2$ is the constant, and $q_3$ is the high-frequency component (the coefficient on $\cos(\omega)$). The first set of rows in table 2 reports results obtained using consumption growth as the priced variable in the SDF.
We find that long-run shocks to consumption are strongly priced both in the 25 Fama–French portfolios and in the industry portfolios, while business-cycle frequency shocks and high-frequency shocks do not seem to be priced. The result can be seen much more clearly when using the bandpass basis. If one only used the utility basis, one could barely reject at the 10% level the hypothesis that long-run consumption shocks are not priced, and the coefficient on the long-run shocks is not statistically different from 0 in the joint cross-section of value-growth and industry portfolios. In other words, when estimating structural preference-based models, we find no significant parameters, implying that neither short- nor long-run shocks to consumption growth are priced in equity returns. On the other hand, with the bandpass basis we find that long-run shocks are significantly priced.

We obtain similar results for the other priced variables: with the bandpass basis we find strongly significant risk prices for almost all of the variables: GDP growth, durable consumption growth, and the various measures of investment growth. By contrast, we find almost no significance when using the utility basis.

Table 3 repeats the analysis using one-step GMM (i.e. using the identity matrix as a weighting matrix). Using the utility basis, we can only distinguish the price of long-run risk from zero in a single case (residential investment, only for the cross-section of the 25 FF portfolios). Using the bandpass basis, we find several cases in which the long-run risk price is significant: consumption growth, durables growth, fixed investment and residential investment. In any case, the bandpass basis yields results on the price of long-run risks that are much stronger than the ones indicated by the utility basis.

We interpret these results in two ways.
First, it is possible that agents do care about long-run shocks, but their definition of “long-run” is closer to the one captured by the bandpass basis (cycles longer than the business cycle) than to that captured by the utility basis (where more than half of the pricing weight falls on cycles longer than 230 years). Second, the bandpass basis loads on frequencies that can be economically considered “long-run” but are much easier to detect empirically than frequencies close to 0, for which we have little empirical power.

Using the frequency-domain decomposition leads us to very different conclusions about the underlying theories than standard time-domain techniques would have. The results that employ the utility basis show essentially no support for the long-run risk model. Looking at the problem using the bandpass filter and targeting the economically relevant set of frequencies instead yields strong and robust support for the idea that low-frequency shocks to the economy are priced in equity markets.

3.3.3 Impulse transfer and weighting functions

Figure 3 plots estimated impulse transfer functions, $G_j$, for consumption growth. To help show the behavior of the functions near zero, we plot them from $-\pi$ to $\pi$, instead of beginning at zero as elsewhere. Note that the functions are all symmetrical across the vertical axis (since they are linear combinations of cosines). The shaded regions in each figure are 95-percent confidence intervals.

There are two key features of the transfer functions to note. First, there are meaningful qualitative differences across the functions in how power is distributed, which helps identify the underlying risk prices. If the transfer functions were all highly similar, then we would not expect to be able to distinguish risk prices across frequencies very well. Second, the confidence bands make clear that the transfer functions are poorly estimated near frequency zero. $\omega = 0$ corresponds to the infinite-horizon response to each shock, so it is not surprising that it is the most difficult to estimate. Nevertheless, the fact that the uncertainty rises so much at very low frequencies helps explain why we have trouble estimating the coefficient on the low-frequency component of the utility basis.

The right-hand set of plots in each figure zooms in on frequencies corresponding to cycles longer than 5 years. In each of those right-hand-side figures, the vertical lines demarcate the set of frequencies that receive half the weight under our benchmark calibration of Epstein–Zin preferences (i.e. cycles longer than 230 years). In all the cases, it is clear that the confidence bands are far larger in the region where the mass of the Epstein–Zin weighting function is focused than elsewhere.

Figure 4 plots the estimated spectral weighting functions for consumption growth, estimated with the 25 Fama–French portfolios, using the bandpass basis and the utility basis. The left panel plots all frequencies, while the right panel zooms in on the cycles longer than 5 years. The figure shows significant weight at low frequencies. The price of long-run risks is quite precisely estimated using the bandpass basis (and significantly different from zero), while the standard errors of the utility basis estimates diverge quickly as we look at frequencies closer to zero, confirming the huge amount of statistical uncertainty exactly in the frequency range most important for the Epstein–Zin model.

4 Weighting functions in returns-based models

4.1 Weighting functions in theoretical models

4.1.1 The CAPM

Under the CAPM, innovations to the SDF are proportional to innovations to the market return,

$$m_{t+1} - E_t m_{t+1} = -\frac{E[r_{m,t+1} - r_{f,t+1}]}{Var(r_{m,t+1} - r_{f,t+1})}\,(r_{m,t+1} - E r_{m,t+1}) \tag{50}$$

where $r_{m,t+1}$ is the market return. The weighting function under the CAPM is thus simply

$$Z^{CAPM}(\omega) = \frac{E[r_{m,t+1} - r_{f,t+1}]}{Var(r_{m,t+1} - r_{f,t+1})} \tag{51}$$

4.1.2 Epstein–Zin and power utility

In a model with a representative agent with Epstein–Zin preferences (with power utility as a special case) and where consumption growth is log-normal and homoskedastic, Campbell (1993) shows that innovations to the pricing kernel can be written purely in terms of returns on the representative agent's wealth portfolio,

$$m_{t+1} - E_t m_{t+1} = -\gamma\,\Delta E_{t+1} r_{w,t+1} - (\gamma - 1)\,\Delta E_{t+1} \sum_{j=1}^{\infty} \theta^j r_{w,t+1+j} \tag{52}$$

where $r_w$ is the log return on the wealth portfolio of the representative agent and $\theta$ is the same log-linearization parameter as in the previous section. Campbell (1993) interprets (52) as a version of Merton's (1973) intertemporal CAPM because both current returns and changes in the investment opportunity set are priced risk factors. The weighting function for (52) is

$$Z^{EZ\text{-}returns}(\omega) = \gamma + (\gamma - 1) \sum_{j=1}^{\infty} \theta^j\, 2\cos(\omega j) \tag{53}$$

As $\theta \to 1$ we obtain the limit

$$Z(\omega) = (\gamma - 1)\,D_{\infty}(\omega) + 1 \tag{54}$$

with $D_{\infty}$ the limit of the Dirichlet kernel, which essentially corresponds to a point mass at 0. All agents, then, regardless of $\rho$ (i.e., regardless of whether they have power utility or more general recursive preferences), place high weight on low-frequency fluctuations in equity returns.

4.1.3 Returns-based asset pricing when we can forecast returns but not consumption

Campbell's (1993) analysis, and that used in Campbell and Vuolteenaho (2004, 2009), assumes that risk premia are constant and that consumption growth is potentially forecastable. Suppose, alternatively, that we cannot forecast consumption growth at all, and that when we forecast asset returns we are simply forecasting risk premia. For example, return predictability might arise from stochastic volatility (as in Bansal and Yaron, 2004 and Campbell, Giglio, Polk and Turley, 2012) or time-varying risk aversion (Campbell and Cochrane, 1999; Dew-Becker, 2012).
The Campbell–Shiller approximation when consumption is unpredictable reduces to

$$\Delta E_{t+1} r_{w,t+1} = \Delta E_{t+1} \Delta c_{t+1} - \Delta E_{t+1} \sum_{j=1}^{\infty} \theta^j r_{w,t+j+1} \tag{55}$$

and the pricing kernel is

$$\Delta E_{t+1} m_{t+1} = -\gamma\,\Delta E_{t+1} r_{w,t+1} - \rho\,\frac{1-\gamma}{1-\rho}\,\Delta E_{t+1} \sum_{j=1}^{\infty} \theta^j r_{w,t+j+1} \tag{56}$$

This result is notably different from that of Campbell (1993) and equation (52) above, which are derived assuming risk premia are constant. Specifically, if the EIS is greater than 1 ($\rho < 1$), then the coefficient on expected future returns becomes proportional to $(1 - \gamma)$: it has the opposite sign from that in Campbell (1993) and equation (52).

The intuition for this result is as follows. In Campbell (1993), news about high future returns corresponds to an improvement in future expected consumption growth (or, in the ICAPM, the investment opportunity set), which is unambiguously good.$^{19}$ If, however, high expected returns are due to high future risk aversion or volatility, then there is only a discounting effect: agents dislike news about high future expected returns because it is associated with low lifetime utility. An increase in risk aversion or volatility is purely bad news.

4.2 Estimation of the weighting function

4.2.1 Methods

Given that the weighting function presented above can be decomposed into two of the three constituent functions that we saw for the case of consumption (and that are plotted in Figure 2), the utility basis representation in the case of returns will simply be:

$$Z^{U}(\omega) = q_1 \sum_{j=1}^{\infty} \theta^j \cos(\omega j) + q_2 \tag{57}$$

Since we are mostly interested in estimating the pricing of long-run discount rate news, we parametrize the bandpass basis to only include a constant and a long-run component,

$$Z^{BP}(\omega) = q_1 Z^{(0,\,2\pi/32)}(\omega) + q_2 \tag{58}$$

As in Campbell and Vuolteenaho (2004), we use a VAR(1) with state vector composed of log excess returns, the price/earnings ratio, term spread and default spread. We use quarterly data from 1926q3 to 2011q2. We estimate the VAR using OLS, and set $\theta = 0.95$ per year.
We then use GMM as above to estimate the two parameters $q_1$ and $q_2$ using the cross-section of 25 Fama–French assets or the combination of those assets and the 49 industry portfolios. As before, we estimate $\Phi$ and $q \equiv [q_1, q_2]$ separately. Again, we report results using both one-step and two-step GMM to estimate $q$, and compute standard errors for $q$ taking into account the uncertainty related to the estimation of the VAR parameters as explained in Section 3. For robustness, we also compute the results using the three-parameter bandpass and utility bases we presented in Section 3.

$^{19}$ In the terminology of the ICAPM, the cash flow effect from higher expected future consumption growth outweighs the discounting effect that comes from higher discount rates in equation (55).

4.2.2 Results

Table 4 shows the results using only the 25 FF assets (left columns) or adding the 49 industry portfolios (right columns). The top panel reports the version with two parameters (where the first one captures the long-run risks) discussed in the previous section. For both the bandpass basis and the utility basis, we find evidence that the long-run component of discount rate news is priced, at least when using only two parameters and using the efficient matrix to estimate $q$. Consistent with equation (53), when we use the utility basis we find that both the constant and the discount-rate news (long-run component) are priced, and that $q_1$ is approximately equal to $q_2 - 1$. Similarly, for the bandpass basis, the price for frequencies below the business cycle is positive and significant.

The bottom panel of Table 4 reports estimates of the three-parameter version described in Section 3. Here we find much weaker evidence that long-run innovations in returns are priced. Overall, then, looking at returns we find mild evidence that news about long-term expected returns carries a positive risk price, though the results seem to be more sensitive to the specification used than in the previous section.
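The sign reversal discussed in Section 4.1.3 can be checked with a line of arithmetic. The sketch below compares the loading of the SDF innovation on discount-rate news, $\Delta E_{t+1}\sum_{j\ge 1}\theta^j r_{w,t+j+1}$, under the two assumptions: $-(\gamma - 1)$ when risk premia are constant and consumption is forecastable, and $-\rho(1-\gamma)/(1-\rho)$ when consumption growth is unforecastable. The function name and the parameter values are illustrative only.

```python
def discount_rate_news_loading(gamma, rho, consumption_forecastable):
    """Coefficient on news about future returns in the SDF innovation.

    gamma is risk aversion and rho is the inverse of the EIS.
    """
    if consumption_forecastable:
        # Constant risk premia: return news reflects news about consumption growth.
        return -(gamma - 1.0)
    if rho == 1.0:
        raise ValueError("the expression is not defined for rho == 1 (unit EIS)")
    # Unforecastable consumption: return news reflects news about risk premia.
    return -rho * (1.0 - gamma) / (1.0 - rho)


if __name__ == "__main__":
    gamma = 5.0
    for rho in (0.5, 2.0):  # EIS of 2 and EIS of 0.5
        constant_premia = discount_rate_news_loading(gamma, rho, True)
        varying_premia = discount_rate_news_loading(gamma, rho, False)
        print(f"rho={rho}: constant premia {constant_premia:+.1f}, "
              f"time-varying premia {varying_premia:+.1f}")
```

With risk aversion above one, the two loadings have opposite signs when the EIS exceeds one ($\rho < 1$) and the same sign when it is below one, which is the case distinction made in the text.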
5 Multiple priced variables and stochastic volatility

So far the analysis has focused only on the case where there is a single priced variable. In some models, though, the dynamics of multiple variables matter for asset pricing. For example, in many applications with Epstein–Zin preferences, both consumption growth and variation in volatility or disaster risk are priced (e.g. Bansal and Yaron, 2004; Campbell et al., 2012; Gourio, 2012; Constantinides and Ghosh, 2013, study a model with time-varying cross-sectional skewness with similar results). It turns out that the results above map easily into a multivariate setting.

Assumption 1a: Structure of the SDF

Instead of there being a single priced variable $x_t$, suppose there is an $M \times 1$ vector of priced variables, $\tilde{x}_t$, with

$$m_{t+1} = F(\tilde{x}_t) - \Delta E_{t+1} \sum_{k=0}^{\infty} Z_k \tilde{x}_{t+1+k} \tag{59}$$

where $Z_k$ is a $1 \times M$ vector of weights and $F(\tilde{x}_t): \mathbb{R}^M \to \mathbb{R}$ is a scalar-valued function.

Assumption 2a: Dynamics of the economy

We assume that $\tilde{x}_t$ is driven by a vector moving average process as before,

$$\tilde{x}_t = J X_t \tag{60}$$

$$X_t = \Gamma(L)\,\varepsilon_t \tag{61}$$

for some matrix $J$ of dimension $M \times N$. The appendix derives the following extension to Result 1.

Result 2. Under Assumptions 1a and 2a, we can write the innovations to the SDF as

$$\Delta E_{t+1} M_{t+1} = -\sum_{j} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{Z}(\omega)\,G_j(\omega)\,d\omega \right) \varepsilon_{j,t+1} \tag{62}$$

where $\tilde{Z}(\omega)$ is a vector-valued weighting function and $G_j(\omega)$, the $j$th column of $G(\omega)$, measures the dynamic effects of $\varepsilon_{j,t}$ on $\tilde{x}$ in the frequency domain,

$$\tilde{Z}(\omega) \equiv Z_0 + 2 \sum_{k=1}^{\infty} Z_k \cos(\omega k) \tag{63}$$

$$G(\omega) \equiv \sum_{k=0}^{\infty} \cos(\omega k)\,g_k \tag{64}$$

and $g_k$ is the impulse response function, $g_k \equiv J\Gamma_k$.

In this case, then, we have multiple variables whose impulse responses we track in $G$, and each of the priced variables has its own weighting function, represented as one of the elements of $\tilde{Z}(\omega)$. We can thus also write the price of risk for a shock as

$$\sum_{k=0}^{\infty} Z_k g_k B_j = \frac{1}{2\pi} \int_{-\pi}^{\pi} \sum_{m} \tilde{Z}_m(\omega)\,G_{m,j}(\omega)\,d\omega$$

where $\tilde{Z}_m(\omega)$ denotes the $m$th element of $\tilde{Z}(\omega)$ and $G_{m,j}(\omega)$ denotes the $(m,j)$th element of $G(\omega)$. The $M$ weighting functions each multiply $N$ different impulse transfer functions, $G_{m,j}(\omega)$. The price of risk for shock $j$ depends on how it affects the various priced variables at all horizons.

5.1 Epstein–Zin with stochastic volatility

Using Result 2, we now extend the results on Epstein–Zin preferences to also allow for stochastic volatility. We use the same log-normal and log-linear framework as above. The log stochastic discount factor under Epstein–Zin preferences is, up to a constant,

$$m_{t+1} = -\rho\,\frac{1-\gamma}{1-\rho}\,\Delta c_{t+1} + \frac{\rho-\gamma}{1-\rho}\,r_{w,t+1} \tag{65}$$

where $r_{w,t+1}$ is the return on a consumption claim on date $t+1$. Whereas we previously assumed that consumption growth was log-normal and homoskedastic, we now allow for time-varying volatility driven by a variable $\sigma_t^2$. We assume that $\sigma_t^2$ follows a linear, homoskedastic, and stationary process. The volatility term may affect consumption growth in myriad ways. For example, consumption growth might be a simple ARMA process with $\sigma_t^2$ driving the volatility of the innovations. Alternatively, consumption growth might follow the long-run risk process from Bansal and Yaron (2004), and $\sigma_t^2$ could affect either the persistent or the transitory component of consumption growth. Regardless, though, it is straightforward to show that we will have

$$E_t r_{w,t+1} = k_0 + \rho\,E_t \Delta c_{t+1} + k_1 \sigma_t^2 \tag{66}$$

where $k_0$ and $k_1$ are constants that depend on the underlying process driving consumption growth. Using the Campbell–Shiller approximation, we can then write the innovation to the SDF as

$$\Delta E_{t+1} m_{t+1} = -\gamma\,\Delta E_{t+1}\Delta c_{t+1} - (\gamma - \rho)\,\Delta E_{t+1} \sum_{j=1}^{\infty} \theta^j \Delta c_{t+1+j} \tag{67}$$

$$\qquad\qquad + \theta\,\frac{\gamma-\rho}{1-\rho}\,\Delta E_{t+1} k_1 \sigma_{t+1}^2 + \theta\,\frac{\gamma-\rho}{1-\rho}\,\Delta E_{t+1} \sum_{j=1}^{\infty} \theta^j k_1 \sigma_{t+j+1}^2 \tag{68}$$

The weighting functions for consumption growth and volatility are now

$$Z_C^{EZ\text{-}SV}(\omega) = \gamma + (\gamma - \rho) \sum_{j=1}^{\infty} \theta^j\, 2\cos(\omega j) \tag{69}$$

$$Z_{\sigma^2}^{EZ\text{-}SV}(\omega) = \theta k_1\,\frac{\rho-\gamma}{1-\rho} \left( 1 + \sum_{j=1}^{\infty} \theta^j\, 2\cos(\omega j) \right) \tag{70}$$

In the case where $\rho = 0$, $Z_C^{EZ\text{-}SV}$ is exactly proportional to $Z_{\sigma^2}^{EZ\text{-}SV}$. In any case, even for $\rho > 1$ they are highly similar. $Z_C^{EZ\text{-}SV}$ is in fact the same as the function we obtained in the homoskedastic case. Both weighting functions have a constant and also allow for a point mass near zero. $Z_{\sigma^2}^{EZ\text{-}SV}$ always has the same basic shape regardless of the value of $\rho$: unless we are in the particular case $\rho = \gamma$, in which $Z_{\sigma^2}^{EZ\text{-}SV}(\omega) = 0$, agents always place high weight on the low-frequency features of volatility.

Alternatively, the weighting functions can be written in terms of returns and their volatility,

$$Z_R^{EZ\text{-}SV\text{-}R}(\omega) = \gamma - (1 - \gamma) \sum_{j=1}^{\infty} \theta^j\, 2\cos(\omega j) \tag{71}$$

$$Z_{\sigma^2}^{EZ\text{-}SV\text{-}R}(\omega) = \theta k_1\,\frac{1-\gamma}{1-\rho} \left( 1 + \sum_{j=1}^{\infty} \theta^j\, 2\cos(\omega j) \right) \tag{72}$$

which yields conceptually very similar results.

6 Spectral weighting functions in affine term structure models

The spectral analysis laid out in Section 2 applies to affine asset pricing models, so it can also be used to understand models of the term structure. This section shows that standard essentially affine asset pricing models can be recast in the frequency domain. The risk prices are reinterpreted as weighting functions in terms of shocks to short-term interest rates. The usual “level factor”, for example, corresponds to low-frequency shocks to short-term interest rates.

Our method applies to affine and essentially affine (Duffee, 2002) models. It can accommodate standard yields-only models, models with macro factors (Ang and Piazzesi, 2003), and models with hidden factors that are not reflected in the yield curve (Duffee, 2011). In all these cases, we show that the risk premium for a given shock (e.g. a shock to the level factor) depends on its dynamic effects on the short-term interest rate.

It is in general difficult to interpret risk prices in term structure models because they are only identified up to a rotation (at least in yields-only models). That is, the underlying factors, and consequently their risk prices, can be rotated without having any observable effect on the dynamics of bond prices.
Some papers identify the shocks by assuming, for example, that the coefficient matrix in the VAR driving the factors is lower triangular, while others assume the factors are principal components or that the factors are fixed functions of the yields (Hamilton and Wu, 2012; Joslin, Singleton, and Zhu, 2011; and Christensen, Diebold, and Rudebusch, 2011, respectively). These restrictions are not necessarily economically motivated, however, and there is no guarantee that they will deliver economically interpretable factors (we are fortunate that the principal components in bond yields seem to include “level” and “slope” factors that we can tell intuitive stories about). Moreover, it is not obvious how the estimated risk prices can be compared across samples, even if the identifying assumptions are the same (the three principal components in one sample will not be identical to those in another sample). Our spectral decomposition, on the other hand, has a clear and stable interpretation, and the risk prices can be easily compared across various sample periods or datasets.

6.1 A canonical term structure model

We consider here a standard yields-only model in which the factors follow a homoskedastic VAR(1) because it is widely studied in the literature. The analysis here can be easily generalized to other settings.

Suppose the state of the economy is summarized by a vector $X_t$ (with dimension $N \times 1$) that follows a VAR(1),

$$X_t = \Phi X_{t-1} + \varepsilon_t \tag{73}$$

where $\varepsilon_t$ is a vector of mean-zero normally distributed random variables. The SDF takes the essentially affine form,

$$m_{t+1} = -\delta_0 - \delta_1' X_t - \tfrac{1}{2}\lambda_t'\lambda_t - \lambda_t'\varepsilon_{t+1} \tag{74}$$

$$\lambda_t = \lambda + \Lambda X_t \tag{75}$$

where the short-term interest rate, $r_t$, follows

$$r_t = \delta_0 + \delta_1' X_t \tag{76}$$

We now show that for a model of this form, the innovation to the SDF can be written as

$$\Delta E_{t+1} m_{t+1} = -\sum_{k=0}^{\infty} (z_{0,k} + z_{1,k} X_t)\,\Delta E_{t+1} r_{t+k+1} \tag{77}$$

where $z_{0,k}$ is a scalar and $z_{1,k}$ is a $1 \times N$ vector of coefficients. First, suppose that (77) is the true model of the SDF.
The innovation to future expected values of the short rate is

$$ \Delta E_{t+1} r_{t+k+1} = \delta_1' \Phi^k \varepsilon_{t+1} \tag{78} $$

and (77) can then be rewritten as

$$ \Delta E_{t+1} m_{t+1} = -\left( \sum_{k=0}^{\infty} \left( z_{0,k} + z_{1,k} X_t \right)' \delta_1' \Phi^k \right) \varepsilon_{t+1} \tag{79} $$

So in order for (74) and (77) to be equivalent representations of the pricing kernel, we simply need to be able to solve the pair of vector equations

$$ \sum_{k=0}^{\infty} z_{0,k}' \, \delta_1' \Phi^k = \lambda' \tag{80} $$

$$ \sum_{k=0}^{\infty} z_{1,k}' \, \delta_1' \Phi^k = \Lambda' \tag{81} $$

Equations (80) and (81) give us a way to directly map a vector of risk prices $\lambda_t$ into the frequency domain in terms of sets of coefficients $\{z_{0,k}\}$ and $\{z_{1,k}\}$. As above, we can map the coefficients $z_{k,t} = z_{0,k} + z_{1,k} X_t$ into the frequency domain through the cosine transform,

$$ Z_t(\omega) = z_{0,t} + 2 \sum_{k=1}^{\infty} z_{k,t} \cos(\omega k) \tag{82} $$

where the subscript on $Z_t(\omega)$ denotes the dependence of the weighting function on time (which follows from the fact that the risk prices vary over time through the term $X_t$).

In standard term structure models (where the factors can be recovered from bond yields alone), the shocks are not uniquely identified: any full-rank rotation produces an observationally equivalent model, which means that the risk prices are also not uniquely identified. It is thus not obvious how to interpret the risk prices. The spectral representation of the risk prices, $Z_t(\omega)$, is invariant to a rotation of the shocks and thus can be interpreted without any ambiguity.

For example, standard intuition tells us that highly persistent increases in nominal interest rates are associated with persistent increases in inflation and are generally viewed negatively. So we would expect $Z_t$ to generally take on negative values for low values of $\omega$. An increase in interest rates just at business-cycle frequencies, though, tends to represent good news as short-term interest rates are empirically procyclical, so we would expect $Z_t$ to be positive at business-cycle frequencies.

6.2 Empirics

We start our empirical analysis by estimating a standard three-factor model using the method of Hamilton and Wu (2012).
We assume that three yields are observed perfectly and one is measured with error (as in, e.g., Duffee, 2002; Kim and Wright, 2005; Joslin, Singleton, and Zhu, 2011). We use data on 1-, 12-, 36-, and 60-month yields from 1980–2003 (a period with stable monetary policy where the zero lower bound does not bind) and assume that the 36-month yield is measured with error.

We only report a single set of estimates here. In experiments with various specifications, we have found that the results are somewhat sensitive to choices about the yields used, the sample period, and details of the estimation (this is true of both the reduced-form and spectral risk prices). For now, then, the results here are meant to be illustrative of the method, rather than definitive.

We use the same bandpass basis for pricing bonds as for equities. Specifically, we solve the equations,

$$ K_0' \sum_{k=0}^{\infty} \begin{bmatrix} F_k^{(0,\,2\pi/32)} \\ F_k^{(2\pi/32,\,2\pi/6)} \\ F_k^{(2\pi/6,\,\pi)} \end{bmatrix} \delta_1' \Phi^k = \lambda' \tag{83} $$

$$ K_1' \sum_{k=0}^{\infty} \begin{bmatrix} F_k^{(0,\,2\pi/32)} \\ F_k^{(2\pi/32,\,2\pi/6)} \\ F_k^{(2\pi/6,\,\pi)} \end{bmatrix} \delta_1' \Phi^k = \Lambda' \tag{84} $$

for $K_0$ and $K_1$, which is a simple matrix inversion problem. The vector $K_0$ gives the set of steady-state frequency-domain risk prices, and the matrix $K_1$ determines how those risk prices respond to the factors $X_t$.

6.3 Results

We begin by reporting and interpreting the risk prices in the usual way. First, the left-hand panel of figure 5 plots the loadings of bond yields with maturities from 1 to 60 months on the three factors. The factors are identified as principal components; that is, we rotate them so that they are independent and have unit variance. The first factor can be interpreted as a level factor since it affects all yields roughly equally. The second factor is generally thought of as affecting the slope, and the third factor is a curvature factor. The center panel of figure 5 plots the response of the 1-month interest rate to shocks to the three factors.
As we would expect, the level shock has persistent effects, and the curvature shock has a hump-shaped response. Interestingly, the shock to the slope factor has strongly persistent effects on interest rates. Even though the slope of the yield curve rises, the short-term interest rate is driven persistently lower, which implies that the slope factor mainly represents a shift in the term premium.

The right-hand panel of figure 5 plots the corresponding impulse transfer functions. For the sake of readability, the horizontal axis only covers cycles with length greater than 12 months (i.e. $\omega < 2\pi/12$). The key result in this plot is that the three shocks have noticeably different impacts in the frequency domain. For example, all three have positive effects at business-cycle frequencies, while only the level factor has positive effects at low frequencies. This result implies that the transfer functions of the three factors are not highly collinear, which should help identification of the spectral weighting functions.

Table 5 reports the estimated risk prices for the three reduced-form shocks. $\lambda$ represents the steady-state reduced-form risk prices. The risk prices for the level and curvature shocks are statistically significant. The shock to the level factor has a negative price, implying that, conditional on slope and curvature, periods when the level of the term structure is high are viewed as bad times: investors want to purchase assets that insure them against the possibility of an increase in the level factor. This is consistent with the view that long-term bonds are risky because they are exposed to persistent shifts in inflation (e.g. Bekaert, Cho, and Moreno, 2010). The price of risk for the slope factor is insignificant, which is somewhat surprising since the slope of the term structure tends to be high during recessions.
However, the curvature factor has a significant price, and if curvature is also correlated with the state of the business cycle, then it might have driven out the slope factor.

Table 5 also reports the parameters determining time variation in risk prices, $\Lambda$. The nine parameters in $\Lambda$ are somewhat more difficult to interpret. Only three of them are statistically significant. The price of risk for shocks to the level factor depends significantly on the level and slope. It is higher when either of those factors is higher, consistent with previous findings that the term spread forecasts returns on long-term bonds (and risky assets more generally; Campbell and Shiller, 1991; Fama and French, 1989).

Table 6 shows that the steady-state risk prices for "level", "slope", and "curvature" are highly dependent on the definitions of the factors. The first row of table 6 uses the principal-components rotation from table 5. The remaining rows use three yields, $y_1$, $y_2$, and $y_3$, with the level factor defined as the mean of the yields, slope as $y_3 - y_1$, and curvature as $y_1 + y_3 - 2y_2$. The three rows use different yields for $y_1$, $y_2$, and $y_3$. In each case, the factors are normalized so that their innovations have unit variance, meaning that the risk prices can be compared across the rotations in terms of their effects on the SDF. Beyond the definition of the factors (i.e. how they are rotated), though, the estimation of the model is identical (and the models are all observationally equivalent, with equal likelihoods and reduced-form implications for yields).

When we use 1-, 12-, and 60-month yields in constructing the factors, the results are quantitatively similar to those for the principal-components rotation. However, when we use 1-, 36-, and 60-month yields, the risk price on shocks to the curvature factor doubles and the risk price on the slope factor becomes significantly negative.
That is, we now find that positive shocks to the slope are viewed negatively (consistent with the idea that the Federal Reserve cuts interest rates temporarily in recessions). When we switch to using 12-, 36-, and 60-month yields in defining the factors, the price of risk on the slope factor rises in magnitude by a factor of four and the price of risk on curvature nearly triples.

So even for four seemingly reasonable and similar rotations of the underlying factors in the model, we obtain qualitatively different steady-state risk prices on the factors. This demonstrates one of the primary advantages of the spectral risk prices in term structure models: they are invariant to any rotation of the underlying factors. The spectral risk prices for all four rows in table 6 are identical.

The right-hand side of table 5 reports estimates of $K_0$ and $K_1$, where $K_0$ collects the steady-state prices of low, business-cycle, and high frequencies, and $K_1$ reports the nine coefficients of the function that links the variation of the risk prices on the three groups of frequencies to the three state variables. All three elements of $K_0$ are significantly different from zero. The low-frequency shocks have a negative price of risk, implying that a highly persistent increase in nominal interest rates is viewed as a bad shock. This finding fits with our priors that market participants view permanent increases in interest rates (presumably from increases in long-run inflation expectations) negatively. Unlike the interpretation of $\lambda$, though, the interpretation (and estimation) of $K_0$ is completely independent of any rotation of the factors. Regardless of how we may define “level”, “slope”, and “curvature”, highly persistent shocks to nominal interest rates carry a negative price of risk.
The fact that the second element of $K_0$ is positive means that positive shocks to interest rates at business-cycle frequencies are viewed positively, which is consistent with the idea that short-term interest rates have a procyclical component. For $K_1$, only one of the nine parameter estimates in table 5 is individually significant.

In the end, even though estimating the spectral risk prices is more difficult than estimating the reduced-form risk prices, since they depend on estimates of $\Phi$ and $\delta_1$, we still obtain highly significant estimates for $K_0$, and the results fit well with standard intuition about the prices of risk for different fluctuations in interest rates.

7 Conclusion

This paper studies risk prices in the frequency domain. The impulse response of consumption growth to a given shock to the economy can be decomposed into components of varying frequencies. In a model where innovations to current and expected future consumption growth drive the pricing kernel, the price of risk for a given shock then depends on a weighted integral over the frequency-domain representation of the impulse response function.

We study this weighting function both theoretically and empirically. Theoretically, we find that the weighting function helps us gain a deeper understanding of the behavior of asset pricing models. Empirically, our estimates of the weighting function are consistent with the idea of long-run risk models. Estimating a standard version of Epstein–Zin preferences yields statistically weak results, but using our spectral decomposition to target economically meaningful “long-run” frequencies (specifically, below-business-cycle frequencies) yields strong support for the importance of long-run risks for asset prices.

The method of analysis used here is generally applicable in asset pricing models where the pricing kernel is a linear (or log-linear) function of some state variable.
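The weighted-integral representation summarized above can be checked numerically. The following is a minimal sketch, not the paper's code: it assumes a hypothetical AR(1)-style impulse response `g` and hypothetical geometric time-domain weights `z`, both truncated at a finite number of lags, and compares the time-domain sum of `z[k] * g[k]` with the frequency-domain average of the transfer function times the weighting function. The function names and all parameter values are illustrative assumptions.

```python
import numpy as np


def transfer_function(g, omega):
    """Real part of sum_k exp(-i*omega*k) g_k, i.e. sum_k g_k cos(omega*k)."""
    k = np.arange(len(g))
    return np.cos(np.outer(omega, k)) @ g


def weighting_function(z, omega):
    """Cosine transform Z(omega) = z_0 + 2 * sum_{k>=1} z_k cos(omega*k)."""
    k = np.arange(len(z))
    scale = np.where(k == 0, 1.0, 2.0)
    return np.cos(np.outer(omega, k)) @ (scale * z)


# Hypothetical inputs (assumptions for illustration only).
n_lags = 200
lags = np.arange(n_lags + 1)
g = 0.9 ** lags    # impulse response of the priced variable to one shock
z = 0.95 ** lags   # time-domain weights in the pricing kernel

# Uniform grid on [-pi, pi). With more grid points than twice the number of
# lags, the grid average integrates these truncated cosine sums exactly.
omega = np.linspace(-np.pi, np.pi, 1024, endpoint=False)

time_domain = float(z @ g)
freq_domain = float(np.mean(transfer_function(g, omega)
                            * weighting_function(z, omega)))

print(time_domain, freq_domain)
```

The grid average plays the role of (1/2π) times the integral over [-π, π]; the two printed numbers should agree up to floating-point rounding.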
Frequency domain analysis is useful for showing what aspects of the dynamics of the data are important to focus on in studying asset pricing models. For example, under Epstein–Zin preferences, we show that the variance of the pricing kernel is driven essentially by the long-run standard deviation of consumption growth, i.e. its spectral density at frequency zero. Whereas calibrations in the consumption-based asset pricing literature tend to focus on matching the unconditional standard deviation of consumption growth, our results show that they should match the long-run standard deviation.

Our spectral method is also useful for studying the term structure. We show that the price of risk for a shock to the term structure can be written in terms of its dynamic effects on short-term interest rates. This formulation has the advantage that it expresses risk prices in a way that is invariant to any rotation of the underlying factors. As we would have expected, we find that investors want to hedge increases in interest rates at very low frequencies and decreases at business-cycle frequencies.

A Derivation of equation (13)

For any $g_{j,k}$, we have

$$ g_{j,k} = \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{G}_j(\omega) \left( \cos(\omega k) + i \sin(\omega k) \right) d\omega \tag{85} $$

Now since $g_{j,k} = 0$ for $k < 0$, for any $k > 0$ we have

$$ g_{j,k} = g_{j,k} + g_{j,-k} = \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{G}_j(\omega) \left( \cos(\omega k) + i \sin(\omega k) + \cos(-\omega k) + i \sin(-\omega k) \right) d\omega = \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{G}_j(\omega) \, 2\cos(\omega k) \, d\omega $$

Furthermore, note that the complex part of $\tilde{G}(\omega)$ multiplied by any $\cos(\omega k)$ for integer $k$ integrates to zero, which is why we can just study $G \equiv \mathrm{re}(\tilde{G})$. We thus have

$$ \sum_{k=0}^{\infty} z_k g_{j,k} = \frac{1}{2\pi} \int_{-\pi}^{\pi} G_j(\omega) \left( z_0 + 2 \sum_{k=1}^{\infty} z_k \cos(\omega k) \right) d\omega \tag{86} $$

B Derivation of weighting function with multiple priced variables

The impulse response function is denoted

$$ g_k \equiv J \Phi^k \tag{87} $$

where $g_k$ is an $M \times N$ matrix whose $\{m, n\}$ element determines the effect of a shock to the $n$th element of $\varepsilon_t$ on the $m$th element of $X_{t+k}$. The innovation to the SDF is then

$$ \Delta E_{t+1} m_{t+1} = -\left( \sum_{k=0}^{\infty} Z_k g_k \right) \varepsilon_{t+1} \tag{88} $$

The price of risk for the $j$th element of $\varepsilon$ is simply the $j$th element of $\sum_{k=0}^{\infty} Z_k g_k$. As before, we take the discrete Fourier transform of $\{g_k\}$, defining

$$ \tilde{G}(\omega) \equiv \sum_{k=0}^{\infty} e^{-i\omega k} g_k \tag{89} $$

Following the same steps as in section 2 and defining $G(\omega) \equiv \mathrm{re}(\tilde{G}(\omega))$, we arrive at

$$ \sum_{k=0}^{\infty} Z_k g_k B_j = \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{Z}(\omega) G(\omega) B_j \, d\omega \tag{90} $$

$$ = \frac{1}{2\pi} \int_{-\pi}^{\pi} \sum_m \tilde{Z}_m(\omega) G_{m,j}(\omega) \, d\omega \tag{91} $$

where

$$ \tilde{Z}(\omega) \equiv Z_0 + 2 \sum_{k=1}^{\infty} Z_k \cos(\omega k) \tag{92} $$

and where $\tilde{Z}_m(\omega)$ denotes the $m$th element of $\tilde{Z}(\omega)$ and $G_{m,j}(\omega)$ denotes the $m,j$th element of $G(\omega)$. We thus have $M$ different weighting functions, one for each of the priced variables. The $M$ weighting functions each multiply $N$ different impulse transfer functions, $G_{m,j}(\omega)$. The price of risk for shock $j$ depends on how it affects the various priced variables at all horizons.

References

Abel, Andrew B., “Asset Prices under Habit Formation and Catching up with the Joneses,” The American Economic Review, Papers and Proceedings, 1990, 80(2), 38–42.
Alvarez, Fernando and Urban J. Jermann, “Using Asset Prices to Measure the Persistence of the Marginal Utility of Wealth,” Econometrica, 2005, 73(6), 1977–2016.
Ang, Andrew and Monika Piazzesi, “No-Arbitrage Vector Autoregression of Term Structure Dynamics with Macroeconomic and Latent Variables,” Journal of Monetary Economics, 2003, 50, 745–787.
Bansal, Ravi and Amir Yaron, “Risks for the Long Run: A Potential Resolution of Asset Pricing Puzzles,” Journal of Finance, 2004, 59(4), 1481–1509.
Bansal, Ravi, Robert F. Dittmar, and Christian T. Lundblad, “Consumption, Dividends, and the Cross Section of Equity Returns,” Journal of Finance, 2005, 60(4), 1639–1672.
Barillas, Francisco, Lars P. Hansen, and Thomas J. Sargent, “Doubts or Variability?,” Journal of Economic Theory, 2009, 144(6), 2388–2418.
Bekaert, Geert, Seonghoon Cho, and Antonio Moreno, “New Keynesian Macroeconomics and the Term Structure,” Journal of Money, Credit and Banking, 2010, 42(1), 33–62.
Campbell, John Y., “A Variance Decomposition for Stock Returns,” The Economic Journal, 1991, 101(405), 157–179.
Campbell, John Y., “Intertemporal Asset Pricing Without Consumption Data,” American Economic Review, 1993, 83(3), 487–512.
Campbell, John Y. and John H. Cochrane, “By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior,” Journal of Political Economy, 1999, 107(2), 205–251.
Campbell, John Y. and Robert J. Shiller, “The Dividend-Price Ratio and Expectations of Future Dividends and Discount Factors,” Review of Financial Studies, 1988, 1(3), 195–228.
Campbell, John Y. and Robert J. Shiller, “Yield Spreads and Interest Rate Movements: A Bird’s Eye View,” Review of Economic Studies, 1991, 58(3), 495–514.
Campbell, John Y. and Tuomo Vuolteenaho, “Bad Beta, Good Beta,” American Economic Review, 2004, 94(5), 1249–1275.
Campbell, John Y., Christopher Polk, and Tuomo Vuolteenaho, “Growth or Glamour? Fundamentals and Systematic Risk in Stock Returns,” Review of Financial Studies, 2010, 23(1), 305–344.
Christensen, Jens H.E., Francis X. Diebold, and Glenn D. Rudebusch, “The Affine Arbitrage-Free Class of Nelson–Siegel Term Structure Models,” Journal of Econometrics, 2011, 164(1), 4–20.
Christiano, Lawrence J. and Terry J. Fitzgerald, “The Band Pass Filter,” International Economic Review, 2003, 44(2), 435–465.
Cochrane, John H., “A Cross-Sectional Test of an Investment-Based Asset Pricing Model,” Journal of Political Economy, 1996, 104(3), 572–621.
Constantinides, George M., “Habit Formation: A Resolution of the Equity Premium Puzzle,” The Journal of Political Economy, 1990, 98(3), 519–543.
Constantinides, George M. and Anisha Ghosh, “Asset Pricing with Countercyclical Household Consumption Risk,” 2013. Working paper.
Daniel, Kent D. and David Marshall, “The Equity Premium Puzzle and the Risk-Free Rate Puzzle at Long Horizons,” Macroeconomic Dynamics, 1997, 1(2), 452–484.
Dew-Becker, Ian, “A Model of Time-Varying Risk Premia with Habits and Production,” 2012. Working paper.
Dew-Becker, Ian, “Estimates of the volatility of the permanent component of consumption and their implications for asset pricing.” Working paper.
Duffee, Gregory R., “Term Premia and Interest Rate Forecasts in Affine Models,” Journal of Finance, 2002, 57(1), 405–443.
Duffee, Gregory R., “Information in (and not in) the Term Structure,” Review of Financial Studies, 2011, 24(9), 2895–2934.
Epstein, Larry G. and Stanley E. Zin, “Substitution, Risk Aversion, and the Temporal Behavior of Consumption and Asset Returns: A Theoretical Framework,” Econometrica, 1989, 57(4), 937–969.
Epstein, Larry G. and Stanley E. Zin, “Substitution, Risk Aversion, and the Temporal Behavior of Consumption and Asset Returns: An Empirical Analysis,” The Journal of Political Economy, 1991, 99(2), 263–286.
Fama, Eugene F. and Kenneth R. French, “Business Conditions and Expected Returns on Stocks and Bonds,” Journal of Financial Economics, 1989, 25(1), 23–49.
Hamilton, James D. and Cynthia Wu, “Identification and Estimation of Gaussian Affine Term Structure Models,” Journal of Econometrics, 2012. Working Paper.
Hansen, Lars P. and Thomas J. Sargent, “Robust Control and Model Uncertainty,” American Economic Review, 2001, 91(2), 60–66.
Hansen, Lars P., John C. Heaton, and Nan Li, “Consumption Strikes Back? Measuring Long-Run Risk,” Journal of Political Economy, 2008, 116(2), 260–302.
Hansen, Lars Peter and Thomas J. Sargent, Robustness, Princeton University Press, 2007.
Joslin, Scott, Kenneth J. Singleton, and Haoxiang Zhu, “A New Perspective on Gaussian Dynamic Term Structure Models,” Review of Financial Studies, 2011, 24(3), 926–970.
Kim, Don H. and Jonathan H. Wright, “An Arbitrage-Free Three-Factor Term Structure Model and the Recent Behavior of Long-Term Yields and Distant-Horizon Forward Rates,” Finance and Economics Discussion Series 2005-33, Federal Reserve Board, 2005.
Lettau, Martin and Jessica A. Wachter, “Why Is Long-Horizon Equity Less Risky? A Duration-Based Explanation of the Value Premium,” Journal of Finance, 2007, 62(1), 55–92.
Lettau, Martin and Sydney Ludvigson, “Consumption, Aggregate Wealth, and Expected Stock Returns,” Journal of Finance, 2001, 56(3), 815–849.
Malloy, Christopher J., Tobias J. Moskowitz, and Annette Vissing-Jørgensen, “Long-Run Stockholder Consumption Risk and Asset Returns,” Journal of Finance, 2009, 64(6), 2427–2479.
Merton, Robert C., “An Intertemporal Capital Asset Pricing Model,” Econometrica, 1973, 41(5), 867–887.
Otrok, Christopher, B. Ravikumar, and Charles H. Whiteman, “Habit Formation: A Resolution of the Equity Premium Puzzle?,” Journal of Monetary Economics, 2002, 49(6), 1261–1288.
Otrok, Christopher, B. Ravikumar, and Charles H. Whiteman, “A generalized volatility bound for dynamic economies,” Journal of Monetary Economics, 2007, 54(8), 2269–2290.
Parker, Jonathan A. and Christian Julliard, “Consumption Risk and the Cross Section of Expected Returns,” Journal of Political Economy, 2005, 113(1), 185–222.
Restoy, Fernando and Philippe Weil, “Approximate Equilibrium Asset Prices,” 1998. NBER Working Paper.
Yu, Jianfeng, “Using Long-Run Consumption-Return Correlations to Test Asset Pricing Models,” 2012. Working paper.

Figure 1. Impulse response functions and impulse transfer functions
[Figure: two panels, “Impulse response functions” and “Impulse transfer functions”, each showing Shocks 1 through 4.]
Notes: The left panel plots responses of the level of consumption to four hypothetical shocks. The right-hand panel plots the Fourier transforms of the shocks to consumption growth, which we refer to as the impulse transfer functions.

Figure 2. Theoretical spectral weighting functions
[Figure: two panels. “Internal habit formation” shows curves for b=0.25, b=0.5, and b=0.75. “Epstein–Zin preferences” shows curves for α=5, ρ=0.5, θ=0.975; α=5, ρ=0.5, θ=0.9; α=5, ρ=0.5, θ=0.5; and α=0.5, ρ=5, θ=0.99. Horizontal axis: cycle length (2π/frequency) in years.]
Notes: Plots of the spectral weighting function Z for various utility functions.
The x-axis is the cycle length. In the left-hand panel, the parameter b determines the importance of the internal habit in the agent's utility function. In the right-hand panel, α is the coefficient of relative risk aversion; ρ is the inverse elasticity of intertemporal substitution; and θ is the discount factor.

Figure 3. Estimated impulse transfer functions for consumption VAR
[Figure: panels for the shock to the cycle factor, the shock to the price factor, and the shock to consumption growth, each shown for all frequencies and for cycles longer than 5 years. The zoomed panels mark the range containing 50% of the E–Z mass (benchmark parameters imply cycles longer than 230 years).]
Notes: Impulse transfer functions estimated from a VAR in consumption growth and the two principal components. Shaded regions represent 95-percent confidence intervals. The left-hand plots are for all frequencies, while the right-hand plots zoom in on cycles longer than 5 years. The range between the lines on the right-hand side contains 50 percent of the mass of the weighting function for Epstein–Zin preferences with RRA=5 and EIS=2 (cycles longer than 230 years). The x-axis gives frequencies in terms of quarters. Shocks are not orthogonalized.

Figure 4. Estimated spectral weighting functions for equities
[Figure: panels for the bandpass basis and the utility basis, each shown for all frequencies and for cycles longer than 5 years.]
Notes: Estimated weighting functions for consumption growth as the priced variable. Risk prices are estimated using the 25 Fama–French portfolios with the efficient weighting matrix for GMM. Shaded areas denote 95-percent confidence regions. The utility basis uses a discount factor of 0.975 at the annual horizon. The x-axis gives frequencies in quarters.

Figure 5.
Bond pricing factor dynamics
[Figure: three panels, “Loadings of yields on three factors”, “Responses of short-term interest rates to shocks”, and “Impulse transfer functions”, each with Level, Slope, and Curvature series. The impulse transfer function panel marks business-cycle frequencies and high frequencies.]
Notes: Estimates from a term structure model in which the underlying factors are rotated so as to be orthogonal and have unit variance. The left-hand panel gives the loading of yields from 1 to 60 months on the three factors. The center panel gives the response of the 1-month interest rate to a unit increase in each of the three factors from 1 to 60 months. The right-hand panel plots the impulse transfer functions for frequencies between 0 and 0.52, where ω=0.52 corresponds to a wavelength of 12 months. Units are annualized percentage points.

Table 1. Regression coefficients from VARs

               Lag 1                            Lag 2                            Lag 3
        Cons.      Price      Cycle      Cons.      Price      Cycle      Cons.      Price      Cycle
Cons.   0.322***   0.629***   0.4935**   0.1459**   -0.485*    -0.516**   0.1658**   -0.207     0.129
se      (0.08)     (0.23)     (0.14)     (0.07)     (0.29)     (0.21)     (0.07)     (0.15)     (0.17)

Notes: VAR results for consumption growth and the two principal components. The table reports the regression of consumption growth on its own lags and those of the two principal components. The sample is 1952:1–2011:2, quarterly. Standard errors are reported in brackets. * indicates significance at the 10 percent level, ** at the 5 percent level, and *** at the 1 percent level.

Table 2.
Parameter estimates for the spectral weighting function (efficient matrix for GMM)

Portfolios:                      FF25                                          FF25+IND
Basis:                  Bandpass   t-stat     Utility (0.975)  t-stat     Bandpass   t-stat     E–Z (0.975)  t-stat
Consumption growth  q1    269       2.47**      555.47          1.66*       112       1.95*      197.52       1.35
                    q2   -431      -1.17       -442.65         -0.44       -116      -0.87      -279.30      -0.65
                    q3    138       0.33        616.12          0.32       -134      -0.70       504.32       0.63
GDP                 q1    124       1.85*       231.42          0.69         91       1.64       186.58       0.69
                    q2   -106      -1.29        119.67          0.87        -72      -1.13        26.70       0.24
                    q3    127       1.33       -217.42         -1.04         22       0.40       -77.25      -0.49
Durables            q1     49       2.66***      75.62          1.69*        15       2.30**      19.96       1.58
                    q2    -38      -1.26         44.87          2.29**       -2      -0.30        10.79       2.37**
                    q3     33       1.70*       -86.66         -0.63         -5      -0.83        19.62       0.55
Investment          q1     12       2.03**       29.25          1.03         14       2.26**      31.87       1.07
                    q2     -7      -1.18         -0.22         -0.03         -7      -1.22         4.93       0.88***
                    q3     -7      -1.12          5.17          0.36         -3      -0.66         2.11       0.15
Fixed Investment    q1     27       2.16**       39.33          0.81         15       1.54        26.42       0.82
                    q2    -25      -1.11         67.18          2.59***     -20      -1.26        -5.05      -0.41
                    q3     61       3.10***     -90.96         -1.66*        -4      -0.45       -20.23      -0.63
Residential         q1     16       3.52***      27.00          2.44**        3       1.96**       5.38       1.40
Investment          q2     -3      -0.45         -4.74         -0.12          2       1.36       -11.73      -1.19
                    q3      4       0.24         55.19          0.85        -14      -2.26**      33.03       1.97**

Notes: Risk price estimates for the period 1952:1–2011:2 using quarterly data. The priced variable is listed in the left-hand column. The left-hand set of columns uses the Fama–French portfolios as the test assets; the right-hand columns add 49 industry portfolios from Ken French's website. For the bandpass basis, q1 is the price of low-frequency risk, q2 business-cycle frequency, and q3 high frequency. For the utility basis, q1 is the low-frequency component, q2 the constant, and q3 the coefficient on cos(ω). The asset pricing moments are estimated using two-step GMM. The "t-stat" column gives the t statistics for the risk prices. * indicates

Table 3.
Parameter estimates for the spectral weighting function (identity matrix for GMM)

Portfolios:                      FF25                                          FF25+IND
Basis:                  Bandpass   t-stat     Utility (0.975)  t-stat     Bandpass   t-stat     E–Z (0.975)  t-stat
Consumption growth  q1    336       1.73*       703.66          1.62        112       1.22       198.11       1.02
                    q2   -541      -1.14       -340.19         -0.24       -116      -0.54      -277.49      -0.58
                    q3    401       0.72        558.13          0.21       -131      -0.52       502.47       0.60
GDP                 q1    138       1.55        258.39          0.62         91       1.38       186.71       0.62
                    q2   -117      -1.11        133.31          0.80        -71      -0.85        25.95       0.22
                    q3    139       1.10       -237.80         -0.95         21       0.18       -75.69      -0.37
Durables            q1     54       1.79*        82.87          1.43         15       0.95        19.96       0.72
                    q2    -40      -0.98         52.91          1.82*        -2      -0.14        10.77       0.84
                    q3     37       1.31        -91.29         -0.48         -5      -0.37        19.62       0.40
Investment          q1     13       1.44         30.07          0.86         14       1.55        31.64       0.94
                    q2     -7      -0.80          0.04          0.00         -7      -0.86         4.74       0.63
                    q3     -7      -0.73          4.15          0.24         -3      -0.40         2.20       0.13
Fixed Investment    q1     37       1.70*        54.66          0.77         15       0.92        26.06       0.68
                    q2    -36      -1.03         84.97          2.03*       -19      -0.82        -5.58      -0.29
                    q3     77       2.15**     -118.72         -1.35         -4      -0.25       -19.59      -0.34
Residential         q1     18       2.49**       30.60          1.98**        3       0.75         5.44       0.66
Investment          q2     -3      -0.36          2.03          0.03          2       0.84       -11.63      -0.70
                    q3     11       0.37         57.95          0.59        -13      -1.14        33.05       1.25

Notes: See table 2. These estimates differ only in that they use the identity matrix for the GMM weighting matrix.

Table 4.
Parameter estimates with returns as priced variable

Two-parameter weighting function
                                       FF25                                  FF25 + Industry
                          Weighting: S       Weighting: I         Weighting: S       Weighting: I
                          coeff   t-stat     coeff   t-stat       coeff   t-stat     coeff   t-stat
Utility basis
  q1 Long-run              8.36    2.19**     7.99    1.65*        7.01    3.06***    7.07    1.57
  q2 Constant              9.99    3.34***    9.56    1.81*        8.60    6.25***    8.63    2.09**
Bandpass basis
  q1 Long-run             13.10    2.36**    13.65    1.53         9.65    2.62***    9.65    1.39
  q2 Constant             -2.36   -0.65      -2.76   -0.72        -1.31   -0.41      -1.36   -0.48

Three-parameter weighting function
                                       FF25                                  FF25 + Industry
                          Weighting: S       Weighting: I         Weighting: S       Weighting: I
                          coeff   t-stat     coeff   t-stat       coeff   t-stat     coeff   t-stat
Utility basis
  q1 Long-run              1.98    0.29       2.61    0.26         6.36    1.33       6.59    0.83
  q2 Constant             10.40    0.95      10.57    0.90         8.69    4.79***    8.75    2.09**
  q3 High Freq           114.20    1.05     108.64    1.01        12.58    0.17      10.07    0.07
Bandpass basis
  q1 Long-run              9.22    0.95       9.26    0.44        15.53    2.53**    15.60    1.45
  q2 Business cycle       46.38    0.28      52.58    0.20       -76.23   -0.82     -77.02   -0.43
  q3 Short-run           -45.62   -0.30     -51.75   -0.22        70.11    0.81      70.80    0.42

Notes: Risk price estimates for the period 1926:3–2011:2, using quarterly data. The top panel uses two parameters for the weighting function (a long-run component and a constant); the bottom panel uses three parameters corresponding to the decomposition of Table 2. t-statistics take into account VAR estimation uncertainty using GMM. The weighting matrix used is either the inverse of the variance-covariance matrix of the moment residuals (Weighting: S) or the identity matrix (Weighting: I).

Table 5.
Estimates of bond pricing model

lam          Level       Slope       Curvature        K0     Low         Business-cycle   High
             -2.38***    0.13        -0.41**                 -102**      284***           -146**
             [-4.93]     [0.27]      [-2.34]                 [-1.98]     [2.98]           [-2.05]

LAM          Level       Slope       Curvature        K1               Level      Slope      Curvature
Level        -0.65       -0.84**     0.29             Low              20.2       -84.3**    21.5
             [-1.43]     [-1.97]     [0.73]                            [0.48]     [-2.34]    [0.56]
Slope        0.43**      -0.28*      -0.24            Business-Cycle   -76.6      199.3      215.2
             [2.44]      [-1.84]     [-1.51]                           [-0.56]    [1.54]     [1.57]
Curvature    -0.10       -0.11       -0.21            High             -50.4      -29.0      -47.7
             [-0.77]     [-0.90]     [-1.40]                           [-1.53]    [-0.85]    [-1.15]

Notes: lam and LAM are the reduced-form risk prices for the three-factor term structure model. K0 gives the set of steady-state risk prices for low, business-cycle, and high frequencies. K1 determines how the risk prices interact with the underlying factors driving the model. t-statistics are reported in brackets. * denotes significance at the 10 percent level, ** the 5 percent level, and *** the 1 percent level.

Table 6. Term structure risk prices under alternative rotations

                                   Level       Slope       Curvature
Principal components:              -2.38***    0.13        -0.41**
                                   [-4.93]     [0.27]      [-2.34]
Yields used for exogenously defined factors (months maturity):
1, 12, 60                          -2.68***    0.01        -0.50**
                                   [-4.87]     [0.02]      [-2.20]
1, 36, 60                          -2.66***    -0.67**     -0.90*
                                   [-4.74]     [-2.28]     [-1.92]
12, 36, 60                         -3.34***    -2.74**     -2.58
                                   [-3.16]     [-1.52]     [-1.28]

Notes: Risk prices for level, slope, and curvature factors under various rotations of the term structure model. The rotations are observationally equivalent in their implications for bond yields. The first row defines the factors as the first three principal components of the yields used in estimation. The remaining rows define the factors in terms of yields for the three maturities listed in the first column. Level is the sum of the maturities; slope the difference between the first and last; curvature the first plus last minus twice the middle maturity. In each case, the factors are normalized so that their innovations have unit variance.
t-statistics are reported in brackets.
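
The contrast between the rotation-dependent prices in Table 6 and the rotation-invariant spectral prices can be illustrated with a small numerical sketch of the matrix-inversion step in equations (83) and (84). This is not the estimation code behind the tables. It assumes that the bandpass terms F_k are the cosine-series coefficients of an ideal band-pass indicator, it uses a hypothetical VAR matrix `Phi`, hypothetical short-rate loadings `delta1`, and a hypothetical rotation `R`, and it borrows the principal-components risk prices from Table 6 only as illustrative inputs. All function names are made up for this example.

```python
import numpy as np


def bandpass_weights(lo, hi, n_lags):
    """Cosine-series coefficients of the indicator of lo < |omega| < hi.
    (Assumed form of the bandpass basis terms F_k.)"""
    k = np.arange(1, n_lags + 1)
    tail = (np.sin(hi * k) - np.sin(lo * k)) / (np.pi * k)
    return np.concatenate(([(hi - lo) / np.pi], tail))


def band_loadings(Phi, delta1, bands, n_lags=2000):
    """A = sum_k F_k delta1' Phi^k, truncated at n_lags.
    Rows index frequency bands, columns index shocks."""
    F = np.array([bandpass_weights(lo, hi, n_lags) for lo, hi in bands])
    A = np.zeros((len(bands), Phi.shape[0]))
    response = delta1.astype(float)       # delta1' Phi^0
    for k in range(n_lags + 1):
        A += np.outer(F[:, k], response)
        response = response @ Phi         # delta1' Phi^(k+1)
    return A


def spectral_prices(Phi, delta1, lam, bands):
    """Solve K0' A = lam' for the steady-state spectral prices K0."""
    return np.linalg.solve(band_loadings(Phi, delta1, bands).T, lam)


# Hypothetical model parameters (assumptions, not estimates from the paper).
Phi = np.array([[0.98, 0.05, 0.00],
                [0.00, 0.90, 0.10],
                [0.00, 0.00, 0.60]])
delta1 = np.array([1.0, 0.5, 0.2])
lam = np.array([-2.38, 0.13, -0.41])      # illustrative reduced-form prices
bands = [(0.0, 2 * np.pi / 32),
         (2 * np.pi / 32, 2 * np.pi / 6),
         (2 * np.pi / 6, np.pi)]

# A hypothetical full-rank rotation of the factors, X* = R X. The rotated
# shocks are not renormalized to unit variance here; invariance of K0 does
# not depend on that normalization.
R = np.array([[1.0, 0.3, -0.2],
              [0.1, 1.0, 0.4],
              [0.0, -0.5, 1.0]])
R_inv = np.linalg.inv(R)
Phi_rot = R @ Phi @ R_inv
delta1_rot = R_inv.T @ delta1
lam_rot = R_inv.T @ lam

K0 = spectral_prices(Phi, delta1, lam, bands)
K0_rot = spectral_prices(Phi_rot, delta1_rot, lam_rot, bands)

print("reduced-form prices:", lam, "->", lam_rot)
print("spectral prices:    ", K0, "->", K0_rot)
```

Under these assumptions the reduced-form prices change with the rotation while the two sets of spectral prices coincide up to numerical error, because the rotation multiplies both sides of the linear system by the same matrix.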