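Solutions to the Review Questions at the End of Chapter 6

“Introductory Econometrics for Finance” © Chris Brooks 2008

1. (a) This is simple to accomplish in theory, but difficult in practice as a result of the algebra. The original equations are (renumbering them (1), (2) and (3) for simplicity)

$$y_{1t} = \alpha_0 + \alpha_1 y_{2t} + \alpha_2 y_{3t} + \alpha_3 X_{1t} + \alpha_4 X_{2t} + u_{1t} \tag{1}$$

$$y_{2t} = \beta_0 + \beta_1 y_{3t} + \beta_2 X_{1t} + \beta_3 X_{3t} + u_{2t} \tag{2}$$

$$y_{3t} = \gamma_0 + \gamma_1 y_{1t} + \gamma_2 X_{2t} + \gamma_3 X_{3t} + u_{3t} \tag{3}$$

The easiest place to start (I think) is to take equation (1), and substitute in for $y_{3t}$, to get

$$y_{1t} = \alpha_0 + \alpha_1 y_{2t} + \alpha_2(\gamma_0 + \gamma_1 y_{1t} + \gamma_2 X_{2t} + \gamma_3 X_{3t} + u_{3t}) + \alpha_3 X_{1t} + \alpha_4 X_{2t} + u_{1t}$$

Working out the products that arise when removing the brackets,

$$y_{1t} = \alpha_0 + \alpha_1 y_{2t} + \alpha_2\gamma_0 + \alpha_2\gamma_1 y_{1t} + \alpha_2\gamma_2 X_{2t} + \alpha_2\gamma_3 X_{3t} + \alpha_2 u_{3t} + \alpha_3 X_{1t} + \alpha_4 X_{2t} + u_{1t}$$

Gathering terms in $y_{1t}$ on the LHS:

$$y_{1t} - \alpha_2\gamma_1 y_{1t} = \alpha_0 + \alpha_1 y_{2t} + \alpha_2\gamma_0 + \alpha_2\gamma_2 X_{2t} + \alpha_2\gamma_3 X_{3t} + \alpha_2 u_{3t} + \alpha_3 X_{1t} + \alpha_4 X_{2t} + u_{1t}$$

$$y_{1t}(1 - \alpha_2\gamma_1) = \alpha_0 + \alpha_1 y_{2t} + \alpha_2\gamma_0 + \alpha_2\gamma_2 X_{2t} + \alpha_2\gamma_3 X_{3t} + \alpha_2 u_{3t} + \alpha_3 X_{1t} + \alpha_4 X_{2t} + u_{1t} \tag{4}$$

Now substitute into (2) for $y_{3t}$ from (3).

$$y_{2t} = \beta_0 + \beta_1(\gamma_0 + \gamma_1 y_{1t} + \gamma_2 X_{2t} + \gamma_3 X_{3t} + u_{3t}) + \beta_2 X_{1t} + \beta_3 X_{3t} + u_{2t}$$

Removing the brackets

$$y_{2t} = \beta_0 + \beta_1\gamma_0 + \beta_1\gamma_1 y_{1t} + \beta_1\gamma_2 X_{2t} + \beta_1\gamma_3 X_{3t} + \beta_1 u_{3t} + \beta_2 X_{1t} + \beta_3 X_{3t} + u_{2t} \tag{5}$$

Substituting into (4) for $y_{2t}$ from (5),

$$\begin{aligned} y_{1t}(1 - \alpha_2\gamma_1) = {} & \alpha_0 + \alpha_1(\beta_0 + \beta_1\gamma_0 + \beta_1\gamma_1 y_{1t} + \beta_1\gamma_2 X_{2t} + \beta_1\gamma_3 X_{3t} + \beta_1 u_{3t} + \beta_2 X_{1t} + \beta_3 X_{3t} + u_{2t}) \\ & + \alpha_2\gamma_0 + \alpha_2\gamma_2 X_{2t} + \alpha_2\gamma_3 X_{3t} + \alpha_2 u_{3t} + \alpha_3 X_{1t} + \alpha_4 X_{2t} + u_{1t} \end{aligned}$$

Taking the $y_{1t}$ terms to the LHS:

$$\begin{aligned} y_{1t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) = {} & \alpha_0 + \alpha_1\beta_0 + \alpha_1\beta_1\gamma_0 + \alpha_1\beta_1\gamma_2 X_{2t} + \alpha_1\beta_1\gamma_3 X_{3t} + \alpha_1\beta_1 u_{3t} + \alpha_1\beta_2 X_{1t} + \alpha_1\beta_3 X_{3t} + \alpha_1 u_{2t} \\ & + \alpha_2\gamma_0 + \alpha_2\gamma_2 X_{2t} + \alpha_2\gamma_3 X_{3t} + \alpha_2 u_{3t} + \alpha_3 X_{1t} + \alpha_4 X_{2t} + u_{1t} \end{aligned}$$

Gathering like-terms in the other variables together:

$$\begin{aligned} y_{1t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) = {} & \alpha_0 + \alpha_1\beta_0 + \alpha_1\beta_1\gamma_0 + \alpha_2\gamma_0 + X_{1t}(\alpha_1\beta_2 + \alpha_3) + X_{2t}(\alpha_1\beta_1\gamma_2 + \alpha_2\gamma_2 + \alpha_4) \\ & + X_{3t}(\alpha_1\beta_1\gamma_3 + \alpha_1\beta_3 + \alpha_2\gamma_3) + u_{3t}(\alpha_1\beta_1 + \alpha_2) + \alpha_1 u_{2t} + u_{1t} \end{aligned} \tag{6}$$

Multiplying all through equation (3) by $(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1)$:

$$\begin{aligned} y_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) = {} & \gamma_0(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + \gamma_1 y_{1t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + \gamma_2 X_{2t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) \\ & + \gamma_3 X_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + u_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) \end{aligned} \tag{7}$$

Replacing $y_{1t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1)$ in (7) with the RHS of (6),

$$\begin{aligned} y_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) = {} & \gamma_0(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) \\ & + \gamma_1\big[\alpha_0 + \alpha_1\beta_0 + \alpha_1\beta_1\gamma_0 + \alpha_2\gamma_0 + X_{1t}(\alpha_1\beta_2 + \alpha_3) + X_{2t}(\alpha_1\beta_1\gamma_2 + \alpha_2\gamma_2 + \alpha_4) \\ & \qquad + X_{3t}(\alpha_1\beta_1\gamma_3 + \alpha_1\beta_3 + \alpha_2\gamma_3) + u_{3t}(\alpha_1\beta_1 + \alpha_2) + \alpha_1 u_{2t} + u_{1t}\big] \\ & + \gamma_2 X_{2t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + \gamma_3 X_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + u_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) \end{aligned} \tag{8}$$

Expanding the brackets in equation (8) and cancelling the relevant terms

$$\begin{aligned} y_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) = {} & \gamma_0 + \gamma_1\alpha_0 + \gamma_1\alpha_1\beta_0 + X_{1t}(\alpha_1\beta_2\gamma_1 + \gamma_1\alpha_3) + X_{2t}(\gamma_2 + \gamma_1\alpha_4) \\ & + X_{3t}(\gamma_1\alpha_1\beta_3 + \gamma_3) + u_{3t} + \gamma_1\alpha_1 u_{2t} + \gamma_1 u_{1t} \end{aligned} \tag{9}$$

Multiplying all through equation (2) by $(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1)$:

$$\begin{aligned} y_{2t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) = {} & \beta_0(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + \beta_1 y_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + \beta_2 X_{1t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) \\ & + \beta_3 X_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + u_{2t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) \end{aligned} \tag{10}$$

Replacing $y_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1)$ in (10) with the RHS of (9),

$$\begin{aligned} y_{2t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) = {} & \beta_0(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) \\ & + \beta_1\big[\gamma_0 + \gamma_1\alpha_0 + \gamma_1\alpha_1\beta_0 + X_{1t}(\alpha_1\beta_2\gamma_1 + \gamma_1\alpha_3) + X_{2t}(\gamma_2 + \gamma_1\alpha_4) \\ & \qquad + X_{3t}(\gamma_1\alpha_1\beta_3 + \gamma_3) + u_{3t} + \gamma_1\alpha_1 u_{2t} + \gamma_1 u_{1t}\big] \\ & + \beta_2 X_{1t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + \beta_3 X_{3t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) + u_{2t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) \end{aligned} \tag{11}$$

Expanding the brackets in (11) and cancelling the relevant terms

$$\begin{aligned} y_{2t}(1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1) = {} & \beta_0 - \beta_0\alpha_2\gamma_1 + \beta_1\gamma_0 + \beta_1\gamma_1\alpha_0 + X_{1t}(\beta_1\gamma_1\alpha_3 + \beta_2 - \beta_2\alpha_2\gamma_1) + X_{2t}(\beta_1\gamma_2 + \beta_1\gamma_1\alpha_4) \\ & + X_{3t}(\beta_1\gamma_3 + \beta_3 - \beta_3\alpha_2\gamma_1) + \beta_1 u_{3t} + u_{2t}(1 - \alpha_2\gamma_1) + \beta_1\gamma_1 u_{1t} \end{aligned} \tag{12}$$

Although it might not look like it (!), equations (6), (12), and (9) respectively will give the reduced form equations corresponding to (1), (2), and (3), by doing the necessary division to make $y_{1t}$, $y_{2t}$, or $y_{3t}$ the subject of the formula.

From (6),

$$\begin{aligned} y_{1t} = {} & \frac{\alpha_0 + \alpha_1\beta_0 + \alpha_1\beta_1\gamma_0 + \alpha_2\gamma_0}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} + \frac{\alpha_1\beta_2 + \alpha_3}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{1t} + \frac{\alpha_1\beta_1\gamma_2 + \alpha_2\gamma_2 + \alpha_4}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{2t} \\ & + \frac{\alpha_1\beta_1\gamma_3 + \alpha_1\beta_3 + \alpha_2\gamma_3}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{3t} + \frac{u_{3t}(\alpha_1\beta_1 + \alpha_2) + \alpha_1 u_{2t} + u_{1t}}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} \end{aligned} \tag{13}$$

From (12),

$$\begin{aligned} y_{2t} = {} & \frac{\beta_0 - \beta_0\alpha_2\gamma_1 + \beta_1\gamma_0 + \beta_1\gamma_1\alpha_0}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} + \frac{\beta_1\gamma_1\alpha_3 + \beta_2 - \beta_2\alpha_2\gamma_1}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{1t} + \frac{\beta_1\gamma_2 + \beta_1\gamma_1\alpha_4}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{2t} \\ & + \frac{\beta_1\gamma_3 + \beta_3 - \beta_3\alpha_2\gamma_1}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{3t} + \frac{\beta_1 u_{3t} + u_{2t}(1 - \alpha_2\gamma_1) + \beta_1\gamma_1 u_{1t}}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} \end{aligned} \tag{14}$$

From (9),

$$\begin{aligned} y_{3t} = {} & \frac{\gamma_0 + \gamma_1\alpha_0 + \gamma_1\alpha_1\beta_0}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} + \frac{\alpha_1\beta_2\gamma_1 + \gamma_1\alpha_3}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{1t} + \frac{\gamma_2 + \gamma_1\alpha_4}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{2t} \\ & + \frac{\gamma_1\alpha_1\beta_3 + \gamma_3}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} X_{3t} + \frac{u_{3t} + \gamma_1\alpha_1 u_{2t} + \gamma_1 u_{1t}}{1 - \alpha_2\gamma_1 - \alpha_1\beta_1\gamma_1} \end{aligned} \tag{15}$$

Notice that all of the reduced form equations (13)-(15) in this case depend on all of the exogenous variables, which is not always the case, and that the equations contain only exogenous variables on the RHS, which must be the case for these to be reduced forms.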
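Because the algebra behind (13)-(15) is long, a quick numerical check can be reassuring. The sketch below plugs arbitrary values into the reduced forms and confirms that the resulting $y_{1t}$, $y_{2t}$ and $y_{3t}$ satisfy the structural equations (1)-(3) exactly. All of the numbers and the variable names (`a0`, `b1`, `g2` and so on, standing for the $\alpha$, $\beta$ and $\gamma$ coefficients) are hypothetical and chosen purely for illustration; exact fractions are used so that no rounding error enters the comparison.

```python
from fractions import Fraction as F

# Hypothetical structural coefficients (alpha, beta, gamma), for illustration only.
a0, a1, a2, a3, a4 = F(1), F(1, 2), F(1, 3), F(2), F(-1)
b0, b1, b2, b3 = F(2), F(1, 4), F(1), F(3)
g0, g1, g2, g3 = F(-1), F(1, 5), F(1, 2), F(1)

# Arbitrary values for the exogenous variables and the disturbances.
X1, X2, X3 = F(1), F(2), F(3)
u1, u2, u3 = F(1, 10), F(-1, 5), F(3, 10)

# Common denominator of the reduced forms; it must be non-zero.
D = 1 - a2 * g1 - a1 * b1 * g1

# Reduced form (13) for y1.
y1 = (a0 + a1 * b0 + a1 * b1 * g0 + a2 * g0
      + X1 * (a1 * b2 + a3)
      + X2 * (a1 * b1 * g2 + a2 * g2 + a4)
      + X3 * (a1 * b1 * g3 + a1 * b3 + a2 * g3)
      + u3 * (a1 * b1 + a2) + a1 * u2 + u1) / D

# Reduced form (14) for y2.
y2 = (b0 - b0 * a2 * g1 + b1 * g0 + b1 * g1 * a0
      + X1 * (b1 * g1 * a3 + b2 - b2 * a2 * g1)
      + X2 * (b1 * g2 + b1 * g1 * a4)
      + X3 * (b1 * g3 + b3 - b3 * a2 * g1)
      + b1 * u3 + u2 * (1 - a2 * g1) + b1 * g1 * u1) / D

# Reduced form (15) for y3.
y3 = (g0 + g1 * a0 + g1 * a1 * b0
      + X1 * (a1 * b2 * g1 + g1 * a3)
      + X2 * (g2 + g1 * a4)
      + X3 * (g1 * a1 * b3 + g3)
      + u3 + g1 * a1 * u2 + g1 * u1) / D

# Gaps between each y and the RHS of its structural equation (1)-(3).
r1 = y1 - (a0 + a1 * y2 + a2 * y3 + a3 * X1 + a4 * X2 + u1)
r2 = y2 - (b0 + b1 * y3 + b2 * X1 + b3 * X3 + u2)
r3 = y3 - (g0 + g1 * y1 + g2 * X2 + g3 * X3 + u3)

print("structural residuals:", r1, r2, r3)
```

A check of this kind only shows that the expressions agree at the chosen point, so it is a sanity test of the algebra rather than a proof.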
(b) The term “identification” refers to whether or not it is in fact possible to obtain the structural form coefficients (the $\alpha$’s, $\beta$’s and $\gamma$’s in equations (1)-(3)) from the reduced form coefficients (the $\pi$’s) by substitution. An equation can be over-identified, just-identified, or under-identified, and the equations in a system can have differing orders of identification. If an equation is under-identified (or not identified), then we cannot obtain the structural form coefficients from the reduced forms using any technique. If it is just identified, we can obtain unique structural form estimates by back-substitution, while if it is over-identified, we cannot obtain unique structural form estimates by substituting from the reduced forms.

There are two rules for determining the degree of identification of an equation: the rank condition, and the order condition. The rank condition is a necessary and sufficient condition for identification, so if the rule is satisfied, it guarantees that the equation is indeed identified. The rule centres around a restriction on the rank of a sub-matrix containing the reduced form coefficients, and is rather complex and not particularly illuminating, and was therefore not covered in this course.

The order condition can be expressed in a number of ways, one of which is the following. Let $G$ denote the number of structural equations (equal to the number of endogenous variables). An equation is just identified if $G-1$ variables are absent. If more than $G-1$ are absent, then the equation is over-identified, while if fewer are absent, then it is not identified.

Applying this rule to equations (1)-(3), $G=3$, so for an equation to be just identified, we require 2 variables to be absent. The variables in the system are $y_1$, $y_2$, $y_3$, $X_1$, $X_2$, $X_3$. Is this the case?
Equation (1): $X_{3t}$ only is missing, so the equation is not identified.
Equation (2): $y_{1t}$ and $X_{2t}$ are missing, so the equation is just identified.
Equation (3): $y_{2t}$ and $X_{1t}$ are missing, so the equation is just identified.

However, the order condition is only a necessary (and not a sufficient) condition for identification, so there will exist cases where a given equation satisfies the order condition, but we still cannot obtain the structural form coefficients. Fortunately, for small systems this is rarely the case. Also, in practice, most systems are designed to contain equations that are over-identified.

(c) It was stated in Chapter 4 that omitting a relevant variable from a regression equation would lead to an “omitted variable bias” (in fact an inconsistency as well), while including an irrelevant variable would lead to unbiased but inefficient coefficient estimates. There is a direct analogy with the simultaneous equations case. Treating a variable as exogenous when it really should be endogenous because there is some feedback will result in biased and inconsistent parameter estimates. On the other hand, treating a variable as endogenous when it really should be exogenous (that is, having an equation for the variable and then substituting the fitted value from the reduced form if 2SLS is used, rather than just using the actual value of the variable) would result in unbiased but inefficient coefficient estimates.

If we take the view that consistency and unbiasedness are more important than efficiency (which is the view that I think most econometricians would take), this implies that treating an endogenous variable as exogenous represents the more severe mis-specification. So if in doubt, include an equation for it! (Although, of course, we can test for exogeneity using a Hausman-type test.)

(d) A tempting response to the question might be to describe indirect least squares (ILS), that is, estimating the reduced form equations by OLS and then substituting back to get the structural forms; however, this response would be WRONG, since the question tells us that the system is over-identified.
A correct answer would be to describe either two stage least squares (2SLS) or instrumental variables (IV). Either would be acceptable, although IV requires the user to determine an appropriate set of instruments and hence 2SLS is simpler in practice. 2SLS involves estimating the reduced form equations, and obtaining the fitted values in the first stage. In the second stage, the structural form equations are estimated, but replacing the endogenous variables on the RHS with their stage one fitted values. Application of this technique will yield unique and consistent (although not, in finite samples, unbiased) structural form coefficient estimates.

2. (a) A glance at equations (6.97) and (6.98) reveals that the dependent variable in (6.97) appears as an explanatory variable in (6.98) and that the dependent variable in (6.98) appears as an explanatory variable in (6.97). The result is that it would be possible to show that the explanatory variable $y_{2t}$ in (6.97) will be correlated with the error term in that equation, $u_{1t}$, and that the explanatory variable $y_{1t}$ in (6.98) will be correlated with the error term in that equation, $u_{2t}$. Thus, there is causality from $y_{1t}$ to $y_{2t}$ and from $y_{2t}$ to $y_{1t}$, so that this is a simultaneous equations system. If OLS were applied separately to each of equations (6.97) and (6.98), the result would be biased and inconsistent parameter estimates. That is, even with an infinitely large number of observations, OLS could not be relied upon to deliver the appropriate parameter estimates.

(b) If the variable $y_{1t}$ had not appeared on the RHS of equation (6.98), this would no longer be a simultaneous system, but would instead be an example of a triangular system (see question 3). Thus it would be valid to apply OLS separately to each of the equations (6.97) and (6.98).

(c) The order condition for determining whether an equation from a simultaneous system is identified was described in question 1, part (b).
There are two equations in the system of (6.97) and (6.98), so that only one variable would have to be missing from an equation to make it just identified. If no variables are absent, the equation would not be identified, while if more than one were missing, the equation would be over-identified. Considering equation (6.97), no variables are missing so that this equation is not identified, while equation (6.98) excludes only variable $X_{2t}$, so that it is just identified.

(d) Since equation (6.97) is not identified, no method could be used to obtain estimates of the parameters of this equation, while either ILS or 2SLS could be used to obtain estimates of the parameters of (6.98), since it is just identified. ILS operates by obtaining and estimating the reduced form equations and then obtaining the structural parameters of (6.98) by algebraic back-substitution. 2SLS involves again obtaining and estimating the reduced form equations, and then estimating the structural equations but replacing the endogenous variables on the RHS of (6.97) and (6.98) with their reduced form fitted values.

Comparing ILS and 2SLS, the former method only requires one set of estimations rather than two, but this is about its only advantage, and conducting a second stage OLS estimation is usually a computationally trivial exercise. The primary disadvantage of ILS is that it is only applicable to just identified equations, whereas many sets of equations that we may wish to estimate are over-identified. Second, obtaining the structural form coefficients via algebraic substitution can be a very tedious exercise in the context of large systems (as the solution to question 1, part (a) shows!).
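As a rough illustration of parts (c) and (d), the sketch below first applies the order condition to the two equations and then estimates (6.98) by 2SLS on simulated data, with OLS shown alongside for comparison. It assumes that (6.97) is $y_{1t} = \alpha_0 + \alpha_1 y_{2t} + \alpha_2 X_{1t} + \alpha_3 X_{2t} + u_{1t}$ and that (6.98) is $y_{2t} = \beta_0 + \beta_1 y_{1t} + \beta_2 X_{1t} + u_{2t}$, which is consistent with the variable counts given in part (c). The parameter values, the sample size, the standard normal draws and the helper names (`order_condition`, `solve`, `ols`) are all assumptions made purely for illustration.

```python
import random

def order_condition(n_equations, all_vars, included):
    """Compare the number of excluded variables with G - 1."""
    missing = len(set(all_vars) - set(included))
    if missing < n_equations - 1:
        return "not identified"
    if missing == n_equations - 1:
        return "just identified"
    return "over-identified"

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Order condition for the assumed forms of (6.97) and (6.98); G = 2.
ALL_VARS = ["y1", "y2", "X1", "X2"]
EQ_697 = ["y1", "y2", "X1", "X2"]
EQ_698 = ["y2", "y1", "X1"]
print("(6.97):", order_condition(2, ALL_VARS, EQ_697))
print("(6.98):", order_condition(2, ALL_VARS, EQ_698))

# Hypothetical "true" parameters used to simulate the system.
A0, A1, A2, A3 = 1.0, 0.5, 1.0, 1.0
B0, B1, B2 = 1.0, 0.5, 1.0

random.seed(0)
n = 20000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
u1 = [random.gauss(0, 1) for _ in range(n)]
u2 = [random.gauss(0, 1) for _ in range(n)]

# Generate y1 from its reduced form, then y2 from the structural equation.
denom = 1 - A1 * B1
y1 = [(A0 + A1 * B0 + (A2 + A1 * B2) * a + A3 * b + e1 + A1 * e2) / denom
      for a, b, e1, e2 in zip(x1, x2, u1, u2)]
y2 = [B0 + B1 * v + B2 * a + e2 for v, a, e2 in zip(y1, x1, u2)]

# Stage 1: regress y1 on all exogenous variables and keep the fitted values.
z = [[1.0, a, b] for a, b in zip(x1, x2)]
pi_hat = ols(z, y1)
y1_hat = [sum(c * v for c, v in zip(pi_hat, row)) for row in z]

# Stage 2: estimate (6.98) with y1 replaced by its stage one fitted value.
b_2sls = ols([[1.0, f, a] for f, a in zip(y1_hat, x1)], y2)
# Naive OLS on (6.98), ignoring the simultaneity.
b_ols = ols([[1.0, v, a] for v, a in zip(y1, x1)], y2)

print("true beta1:", B1)
print("2SLS estimate:", round(b_2sls[1], 3))
print("OLS estimate:", round(b_ols[1], 3))
```

Under these assumed parameter values the OLS estimate of $\beta_1$ settles away from the true value however large the sample, while the 2SLS estimate should lie close to it. Since (6.98) is just identified, ILS would give the same point estimates as 2SLS here.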
(e) The Hausman procedure works by first obtaining and estimating the reduced form equations, and then estimating the structural form equations separately using OLS, but also adding the fitted values from the reduced form estimations as additional explanatory variables in the equations where those variables appear as endogenous RHS variables. Thus, if the reduced form fitted values corresponding to equations (6.97) and (6.98) are given by $\hat{y}_{1t}$ and $\hat{y}_{2t}$ respectively, the Hausman test equations would be

$$y_{1t} = \alpha_0 + \alpha_1 y_{2t} + \alpha_2 X_{1t} + \alpha_3 X_{2t} + \alpha_4 \hat{y}_{2t} + u'_{1t}$$

$$y_{2t} = \beta_0 + \beta_1 y_{1t} + \beta_2 X_{1t} + \beta_3 \hat{y}_{1t} + u'_{2t}$$

Separate tests of the significance of the $\hat{y}_{1t}$ and $\hat{y}_{2t}$ terms would then be performed. If it were concluded that they were both significant, this would imply that additional explanatory power can be obtained by treating the variables as endogenous.

3. An example of a triangular system was given in Section 6.7. Consider a scenario where there are only two “endogenous” variables. The key distinction between this and a fully simultaneous system is that in the case of a triangular system, causality runs only in one direction, whereas for a simultaneous system, it would run in both directions. Thus, to give an example, for the system to be triangular, $y_1$ could appear in the equation for $y_2$ and not vice versa. For the simultaneous system, $y_1$ would appear in the equation for $y_2$, and $y_2$ would appear in the equation for $y_1$.

4. (a) $p=2$ and $k=3$ imply that there are two variables in the system, and that both equations have three lags of the two variables. The VAR can be written in long-hand form as:

$$y_{1t} = \beta_{10} + \beta_{111} y_{1t-1} + \beta_{211} y_{2t-1} + \beta_{112} y_{1t-2} + \beta_{212} y_{2t-2} + \beta_{113} y_{1t-3} + \beta_{213} y_{2t-3} + u_{1t}$$

$$y_{2t} = \beta_{20} + \beta_{121} y_{1t-1} + \beta_{221} y_{2t-1} + \beta_{122} y_{1t-2} + \beta_{222} y_{2t-2} + \beta_{123} y_{1t-3} + \beta_{223} y_{2t-3} + u_{2t}$$

where

$$\beta_0 = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix}, \quad y_t = \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix}, \quad u_t = \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix},$$

and the coefficients on the lags of $y_t$ are defined as follows: $\beta_{ijk}$ refers to the $k$th lag of the $i$th variable in the $j$th equation.
This seems like a natural notation to use, although of course any sensible alternative would also be correct.

(b) This is basically a “what are the advantages of VARs compared with structural models?” type question, to which a simple and effective response would be to list and explain the points made in the book.

The most important point is that structural models require the researcher to specify some variables as being exogenous (if all variables were endogenous, then none of the equations would be identified, and therefore estimation of the structural equations would be impossible). This can be viewed as a restriction (a restriction that the exogenous variables do not have any simultaneous equations feedback), often called an “identifying restriction”. Determining what the identifying restrictions are is supposed to be based on economic or financial theory, but Sims, who first proposed the VAR methodology, argued that such restrictions were “incredible”. He thought that they were too loosely based on theory, and were often specified by researchers on the basis of giving the restrictions that the models required to make the equations identified. Under a VAR, all the variables have equations, and so in a sense, every variable is endogenous, which takes the ability to cheat (either deliberately or inadvertently) or to mis-specify the model in this way, out of the hands of the researcher.

Another possible reason why VARs are popular in the academic literature is that standard form VARs can be estimated using OLS since all of the lags on the RHS are counted as pre-determined variables.

Further, a glance at the academic literature which has sought to compare the forecasting accuracies of structural models with VARs reveals that VARs seem to be rather better at forecasting (perhaps because the identifying restrictions are not valid).
Thus, from a purely pragmatic point of view, researchers may prefer VARs if the purpose of the modelling exercise is to produce precise point forecasts.

(c) VARs have, of course, also been subject to criticisms. The most important of these criticisms is that VARs are atheoretical. In other words, they use very little information from economic or financial theory to guide the model specification process. The result is that the models often have little or no theoretical interpretation, so that they are of limited use for testing and evaluating theories.

Second, VARs can often contain a lot of parameters. The resulting loss in degrees of freedom if the VAR is unrestricted and contains a lot of lags could lead to a loss of efficiency and the inclusion of lots of irrelevant or marginally relevant terms. Third, it is not clear how the VAR lag lengths should be chosen. Different methods are available (see part (d) of this question), but they could lead to widely differing answers.

Finally, the very tools that have been proposed to help to obtain useful information from VARs, i.e. impulse responses and variance decompositions, are themselves difficult to interpret (see Runkle, 1987).

(d) The two methods that we have examined are model restrictions and information criteria. Details on how these work are contained in Sections 6.12.4 and 6.12.5. But briefly, the model restrictions approach involves starting with the larger of the two models and testing whether it can be restricted down to the smaller one using the likelihood ratio test based on the determinants of the variance-covariance matrices of residuals in each case. The alternative approach would be to examine the value of various information criteria and to select the model that minimises the criteria. Since there are only two models to compare, either technique could be used.
The restriction approach assumes normality for the VAR error terms, while use of the information criteria does not. On the other hand, the information criteria can lead to quite different answers depending on which criterion is used and the severity of its penalty term. A completely different approach would be to put the VARs in the situation that they were intended for (e.g. forecasting, making trading profits, determining a hedge ratio etc.), and see which one does best in practice!
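The two lag-selection approaches in part (d) can be sketched numerically as follows. The likelihood ratio statistic is taken in its simplest form, $T[\ln|\hat{\Sigma}_r| - \ln|\hat{\Sigma}_u|]$, and the multivariate information criteria are taken as $\ln|\hat{\Sigma}|$ plus a penalty in $k'$, the total number of regressors across all equations ($p^2k + p$ for a $p$-variable VAR with $k$ lags and an intercept in each equation). The sample size, the determinants of the residual variance-covariance matrices and the function names are hypothetical, chosen only to show the mechanics.

```python
import math

def lr_statistic(T, det_restricted, det_unrestricted):
    """LR statistic T * (ln|Sigma_r| - ln|Sigma_u|) for comparing two VAR orders."""
    return T * (math.log(det_restricted) - math.log(det_unrestricted))

def information_criteria(T, det_sigma, p, k):
    """Multivariate AIC, SBIC and HQIC for a p-variable VAR with k lags."""
    n_params = p * p * k + p  # lag coefficients plus one intercept per equation
    log_det = math.log(det_sigma)
    return {
        "MAIC": log_det + 2 * n_params / T,
        "MSBIC": log_det + n_params / T * math.log(T),
        "MHQIC": log_det + 2 * n_params / T * math.log(math.log(T)),
    }

# Hypothetical inputs: a bivariate VAR estimated with 1 lag and with 3 lags.
T, p = 200, 2
det_by_lag = {1: 0.0050, 3: 0.0045}  # made-up determinants of the residual covariance matrices

# Restricting the VAR(3) down to a VAR(1) removes p*p coefficients per dropped lag.
lr = lr_statistic(T, det_by_lag[1], det_by_lag[3])
df = p * p * (3 - 1)
CHI2_5PCT_DF8 = 15.507  # 5% critical value of a chi-squared distribution with 8 df
print(f"LR = {lr:.2f} on {df} df; reject the shorter VAR: {lr > CHI2_5PCT_DF8}")

# Information criteria: the preferred lag length minimises the chosen criterion.
for k, det in det_by_lag.items():
    crit = information_criteria(T, det, p, k)
    print(k, {name: round(value, 3) for name, value in crit.items()})
```

With inputs like these the criteria need not agree with one another or with the LR test, since SBIC penalises the extra parameters more heavily than AIC, which is the point made above about the severity of the penalty term.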