Simultaneous Equations

Introduction
• Most economic models are simultaneous, i.e. there are at least two relationships between the variables in the regression.
• Macro example: $c = \beta_1 + \beta_2 y$
• Micro example: supply and demand
• Lesson: simultaneity can appear anywhere
• OLS will mix up the two relationships

Macro Example
1. Consumption, $c$, is a function of income, $y$: $c = \beta_1 + \beta_2 y$. Here $c$ is “endogenous” and $\beta_2$ is the marginal propensity to consume (MPC).
2. Income is consumption plus investment: $y = c + i$. So $y$ is also endogenous.
3. Investment is assumed independent of income: $i$ is “exogenous”.

The Structural Form of the Statistical Model
$c_t = \beta_1 + \beta_2 y_t + e_t$
Identity: $y_t = c_t + i_t$
$e_t$ is a random disturbance term
• The model is simultaneous because we cannot determine $c$ or $y$ without knowing the other
• Jargon: $c$ and $y$ are:
  – endogenous
  – jointly determined
  – jointly endogenous
• But $i$ (investment) is exogenous
• We rely on economic intuition to tell us whether a variable is endogenous or exogenous; it is not really a statistical issue

Single vs. Simultaneous Equations
[Figure: two diagrams. Single equation: $y_t$, $e_t$, $c_t$. Simultaneous equations: $y_t$, $e_t$, $c_t$, $i_t$.]

Reduced Form
• For later use, it is useful to rewrite the system of equations in its reduced form
  – “Solve” the model
  – Reduced form: each equation has a single endogenous variable on the left and only exogenous variables on the right
  – Method: substitute one equation into the other
• Easy for this simple macro example, more difficult in real-world cases
• Note the conceptual difference between structural and reduced forms
$c_t = \beta_1 + \beta_2 y_t + e_t$
$y_t = c_t + i_t$
$c_t = \beta_1 + \beta_2 (c_t + i_t) + e_t$
$(1 - \beta_2) c_t = \beta_1 + \beta_2 i_t + e_t$
$c_t = \frac{\beta_1}{1 - \beta_2} + \frac{\beta_2}{1 - \beta_2} i_t + \frac{1}{1 - \beta_2} e_t$
$c_t = \pi_{11} + \pi_{21} i_t + \nu_t$
• We can do the same for the equation in $y$
• We get the reduced form of the system:
$c_t = \pi_{11} + \pi_{21} i_t + \nu_t$
$y_t = \pi_{12} + \pi_{22} i_t + \nu_t$

Failure of OLS
[Figure: plot of C against Y]
• OLS picks the best fit, which is a mixture of both relationships
• It does not give the correct estimate of the MPC
• OLS is biased and inconsistent because the right-hand-side variable ($y$) is correlated with the disturbance term:
1.
Any change in $e$ leads to a change in $c$ via the consumption equation
2. The change in consumption leads to a change in income via the identity
3. This change in income feeds back into a change in consumption via the consumption equation
• Thus any time there is a change in $e$ there is a simultaneous change in $y$
[Figure: the equations $c_t = \beta_1 + \beta_2 y_t + e_t$ and $y_t = c_t + i_t$, annotated with steps 1 to 3]

Fundamental Problem of OLS
• OLS will give credit to $y$ for changes in $e$, i.e. the estimated effect of $y$ on $c$ will also include the effect of $e$ on $c$
• OLS will act as if a change in consumption brought about by some random effect ($e$) was due to a change in income
• OLS will overstate the effect of income on consumption, i.e. the MPC
• OLS will be biased and inconsistent

The Failure of Least Squares
The least squares estimator of the parameters in a structural simultaneous equation is biased and inconsistent because of the correlation between the random error and the endogenous variables on the right-hand side of the equation.

Indirect Least Squares
• One way to estimate is to do OLS on the reduced form:
$c_t = \pi_{11} + \pi_{21} i_t + \nu_t$
$y_t = \pi_{12} + \pi_{22} i_t + \nu_t$
• This works because there is no endogenous variable on the right-hand side, so OLS on the reduced form is unbiased and consistent
• We can then use the formulae that link the parameters of the reduced and structural forms to calculate the estimates of the structural parameters:
$\pi_{11} = \pi_{12} = \frac{\beta_1}{1 - \beta_2}$, $\pi_{22} = \frac{1}{1 - \beta_2}$, so $\hat{\beta}_{1,ILS} = \frac{\hat{\pi}_{11}}{\hat{\pi}_{22}}$
• In practice, this method is not used, because the link between the reduced form and the structural form is usually very complicated in more realistic models
• Several different structural forms may have the same reduced form
• It is difficult to get standard errors on the structural estimates

Identification
• Biggest issue in simultaneous equations, biggest issue in econometrics
• OLS cannot distinguish between the effect of $y$ and the effect of $e$
• The problem is to separate these two effects, or literally to “identify” the effect of $y$ on $c$

Micro Example
• We use a microeconomic example, i.e.
the supply and demand model
• Structural model:
  Demand: $q = \alpha_1 p + \alpha_2 y + e_d$
  Supply: $q = \beta_1 p + e_s$
• Price and quantity are endogenous (jointly determined) and income is exogenous
• The model is simultaneous because:
  – $q$ is a function of $p$ (demand curve)
  – $p$ is a function of $q$ (supply curve)
• OLS estimation of the demand equation will be biased and inconsistent
• The OLS estimate of $\alpha_1$ will also pick up the effect of the supply curve
• $\mathrm{Cov}(p, e_d)$ is not equal to zero
• The problem of identification is to separate the effect of the supply curve from that of the demand curve
• We have to do this to have any hope of estimating the structural parameters

Illustrating the Identification Problem
[Figure: scatter plot of price of truffles (vertical axis) against quantity of truffles demanded (horizontal axis)]
• Is this a supply curve or a demand curve?
• It looks like a supply curve
• It could be a supply curve, i.e. the data are generated by movements of the demand curve along a supply curve, which trace out the supply curve
[Figure: supply (S) and demand (D) curves, $p$ against $q$]
• Or it could be movement in both curves
[Figure: supply (S) and demand (D) curves, $p$ against $q$, with S drawn twice]
• It turns out that we can estimate the supply curve consistently, but cannot estimate the demand curve
• The reason is that income, $y$, is in the demand curve but excluded from the supply curve
• As income changes we know the demand curve will shift but the supply curve will stay fixed
• Therefore, if we can concentrate on those changes in $p$ and $q$ that are caused by changes in income, we can trace out the supply curve

Exclusion Restrictions
• We can identify (trace out) the supply curve only because $y$ is in the demand curve equation but not in the supply curve
• It is because $y$ is excluded from the supply curve that we can be sure that changes in $y$ move the demand curve only
• If $y$ were in the supply curve we could not do this
• We cannot identify (trace out) the demand curve, because there is no variable in the supply curve that is not in the demand curve
• These are called “exclusion restrictions”

General Condition for Identification of an Equation
An equation containing $M$ endogenous variables must exclude at least $M-1$ exogenous
variables in order for its parameters to be identified and consistently estimated.

Importance of Identification
• Must check identification before trying to estimate
• If an equation is unidentified, we will not be able to get consistent estimates of the structural parameters
• Always try to design models so that the equations are identified

Beware of Artificial Restrictions
• Must justify exclusion restrictions using economic intuition
• For example: is it reasonable that income affects demand but not supply?
• Most cases are not so obvious
• If a restriction is wrong, there is no hope of getting correct answers
• Most arguments in applied economic papers are over the validity of these restrictions

Estimation: 2SLS
• Two stage least squares
1. Estimate the reduced form using OLS:
$p_t = \pi_1 y_t + v_{1t}$
$q_t = \pi_2 y_t + v_{2t}$
2. Do OLS on the structural form with the actual values replaced by the fitted values from the first stage
• Why this works for the supply equation
  – The fitted values from the first stage are by definition the part of the variation in $p$ and $q$ that is due to changes in income
  – Therefore we are sure that the fitted values lie along the supply curve, so we just do OLS on these values
  – More formally: the fitted value of $p$ is uncorrelated with $e_s$ because it is a function solely of $y$, which is uncorrelated with $e_s$ (i.e. exogenous)
$\hat{p}_t = \hat{\pi}_1 y_t$
$\hat{q}_t = \beta_1 \hat{p}_t + e_{st}$
• Why does it not work on the demand equation?
  – The computer will generate an error at the second-stage estimation of the demand equation because the income variable will effectively appear twice
  – Perfect multicollinearity
$\hat{q}_t = \alpha_1 \hat{p}_t + \alpha_2 y_t + e_{dt}$, where $\hat{p}_t = \hat{\pi}_1 y_t$

General 2SLS Procedure
• The 2SLS procedure can be used for a system of any degree of complication
• $M$ equations
• $M$ endogenous variables ($y_1, \ldots, y_M$)
• $K$ exogenous variables ($x_1, \ldots,$
$x_K$)
• Remember: we can only estimate those equations that pass the identification condition
• Suppose one of the equations you want to estimate is:
$y_1 = \beta_1 x_1 + \beta_2 x_2 + \alpha_2 y_2 + e_1$
• First check that it is identified, i.e. that enough $x$ variables are excluded from the equation
• Estimate the reduced form for the entire model:
$y_1 = \pi_{11} x_1 + \ldots + \pi_{1K} x_K + v_1$
$y_2 = \pi_{21} x_1 + \ldots + \pi_{2K} x_K + v_2$
$y_M = \pi_{M1} x_1 + \ldots + \pi_{MK} x_K + v_M$
• Replace the endogenous variables in the structural equations with their fitted values and do OLS:
$\hat{y}_1 = \beta_1 x_1 + \beta_2 x_2 + \alpha_2 \hat{y}_2 + e_1$
• Note: the standard errors reported for this second-stage OLS may be incorrect in some computer programs

Properties of 2SLS
• Estimates are consistent
• Estimates are biased in finite samples
• Estimates are asymptotically normal
• Standard errors do not use the same formula as OLS; the correct formula is usually built into software
• Also known as Instrumental Variables (IV)
• Beware of false restrictions

Example: Market for Truffles
• Structural model:
  Demand: $q_t = \alpha_1 + \alpha_2 p_t + \alpha_3 ps_t + \alpha_4 y_t + e_t^d$
  Supply: $q_t = \beta_1 + \beta_2 p_t + \beta_3 c_t + e_t^s$
• $ps$ = price of substitute, $c$ = rent of pig (i.e. cost of production), $y$ = per capita disposable income
• Estimate by OLS:
$q_t = \alpha_1 + \alpha_2 p_t + \alpha_3 ps_t + \alpha_4 y_t + e_t^d$
  – Note the sign of the price coefficient

Identification
• $p$ and $q$ are endogenous
• $c$, $ps$ and $y$ are exogenous. Plausible?
• Is supply identified? Why?
• Is demand identified? Why?
• Are the restrictions plausible? This is very important
• Can we use 2SLS?
• N.B.: two subjective judgments
  – is it reasonable to say a variable is exogenous?
  – is it reasonable to exclude it?

Stage 1: Estimate Reduced Form
• Endogenous variables on the left, all exogenous variables on the right:
$q_t = \pi_{11} + \pi_{21} ps_t + \pi_{31} y_t + \pi_{41} c_t + v_t^q$
$p_t = \pi_{12} + \pi_{22} ps_t + \pi_{32} y_t + \pi_{42} c_t + v_t^p$
• See the results: note that the exogenous variables are significant and $R^2$ is high
  – this is close to being the “sufficient condition”
  – a.k.a. the “rank” condition
  – what happens if they are insignificant?

Stage 2: Estimate Structural Form
• Calculate the fitted values for $p$ and $q$
• Do OLS on:
$\hat{q}_t = \alpha_1 + \alpha_2 \hat{p}_t + \alpha_3 ps_t + \alpha_4 y_t + e_t^d$
$\hat{q}_t = \beta_1 + \beta_2 \hat{p}_t + \beta_3 c_t + e_t^s$
• Note the signs and significance of the coefficients
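
Sketch: 2SLS by hand on simulated data
The two stages above can be tried out on artificial data. The block below is a minimal sketch, not the truffle data set: the parameter values in ALPHA and BETA, the sample size, the distributions of the exogenous variables and the helper name ols are all assumptions made up for illustration. It simulates a demand and supply system with the same variables as the truffle model, then compares OLS on the demand equation with the two-stage estimates.

```python
import numpy as np

def ols(X, y):
    """Return least-squares coefficients from regressing y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical structural parameters (assumed for illustration, not estimated).
ALPHA = np.array([10.0, -1.0, 0.5, 1.0])  # demand: constant, p, ps, y
BETA = np.array([2.0, 1.0, -1.0])         # supply: constant, p, c

rng = np.random.default_rng(0)
n = 5000
ps = rng.normal(5.0, 1.0, n)   # price of substitute (exogenous)
y = rng.normal(5.0, 1.0, n)    # income (exogenous)
c = rng.normal(3.0, 1.0, n)    # cost of production (exogenous)
e_d = rng.normal(0.0, 1.0, n)  # demand disturbance
e_s = rng.normal(0.0, 1.0, n)  # supply disturbance

# Solve the two structural equations for the equilibrium price and quantity.
p = (ALPHA[0] - BETA[0] + ALPHA[2] * ps + ALPHA[3] * y
     - BETA[2] * c + e_d - e_s) / (BETA[1] - ALPHA[1])
q = BETA[0] + BETA[1] * p + BETA[2] * c + e_s

const = np.ones(n)

# Stage 1: regress the endogenous price on all exogenous variables.
Z = np.column_stack([const, ps, y, c])
p_hat = Z @ ols(Z, p)

# Stage 2: replace p with its fitted value in each structural equation.
# Using q or its fitted value on the left gives the same coefficients,
# because every second-stage regressor is a combination of the columns of Z.
alpha_2sls = ols(np.column_stack([const, p_hat, ps, y]), q)
beta_2sls = ols(np.column_stack([const, p_hat, c]), q)

# For comparison: OLS on the demand equation with the actual price.
alpha_ols = ols(np.column_stack([const, p, ps, y]), q)

print("demand price coefficient, OLS :", alpha_ols[1])
print("demand price coefficient, 2SLS:", alpha_2sls[1])
print("supply price coefficient, 2SLS:", beta_2sls[1])
```

Under these assumed parameters price is positively correlated with the demand disturbance, so the OLS price coefficient should sit well above the assumed value of -1, while the two-stage estimates should be close to the assumed values. The second-stage standard errors are not computed here, since the plain OLS formula would not be the right one.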