Lecture 18 Preview: Explanatory Variable/Error Term Independence Premise, Consistency, and Instrumental Variables

Review
  Regression Model
  Standard Ordinary Least Squares (OLS) Premises
  Estimation Procedures Embedded within the Ordinary Least Squares (OLS) Estimation Procedure
Taking Stock and a Preview: The Ordinary Least Squares (OLS) Estimation Procedure
A Closer Look at the Explanatory Variable/Error Term Independence Premise
Explanatory Variable/Error Term Correlation and Bias
  Geometric Motivation
  Confirming Our Logic
Estimation Procedures: Large and Small Sample Properties
  Unbiased and Consistent Estimation Procedure
  Unbiased but Not Consistent Estimation Procedure
  Biased but Consistent Estimation Procedure
The Ordinary Least Squares (OLS) Estimation Procedure, and Consistency
Instrumental Variable (IV) Estimation Procedure: A Two Regression Procedure
  Mechanics
  The Two “Good” Instrument Conditions

Regression Model
  $y_t = \beta_{Const} + \beta_x x_t + e_t$
  $y_t$ = Dependent variable
  $x_t$ = Explanatory variable
  $e_t$ = Error term
  $\beta_{Const}$ and $\beta_x$ are the parameters
  $t = 1, 2, \ldots, T$
  The error term is a random variable representing random influences: Mean[$e_t$] = 0

Standard Ordinary Least Squares (OLS) Premises
  Error Term Equal Variance Premise: The variance of the error term’s probability distribution for each observation is the same.
  Error Term/Error Term Independence Premise: The error terms are independent.
  Explanatory Variable/Error Term Independence Premise: The explanatory variables, the $x_t$’s, and the error terms, the $e_t$’s, are not correlated.

OLS Estimation Procedure Includes Three Estimation Procedures
  Value of the parameters, $\beta_{Const}$ and $\beta_x$:
    $b_x = \dfrac{\sum_{t=1}^{T} (y_t - \bar{y})(x_t - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$,   $b_{Const} = \bar{y} - b_x \bar{x}$
  Variance of the error term’s probability distribution, Var[$e$]:
    $\text{EstVar}[e] = \dfrac{SSR}{\text{Degrees of Freedom}}$
  Variance of the coefficient estimate’s probability distribution, Var[$b_x$]:
    $\text{EstVar}[b_x] = \dfrac{\text{EstVar}[e]}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$

Question: What happens when the explanatory variable/error term independence premise is violated?
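The three estimation procedures embedded in OLS can be sketched in a few lines of Python. This is only an illustration: the function name `ols_simple` and the five-observation data set are made up for the example, the formulas are the standard simple-regression ones, and the degrees of freedom are taken to be T - 2 because two parameters are estimated.

```python
import numpy as np

def ols_simple(x, y):
    """Simple-regression OLS: parameter estimates plus the two variance estimates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    T = len(x)
    x_dev = x - x.mean()
    sum_sq_x = np.sum(x_dev ** 2)

    # Procedure 1: estimate the values of the parameters.
    b_x = np.sum((y - y.mean()) * x_dev) / sum_sq_x
    b_const = y.mean() - b_x * x.mean()

    # Procedure 2: estimate the variance of the error term's distribution.
    residuals = y - (b_const + b_x * x)
    ssr = np.sum(residuals ** 2)
    est_var_e = ssr / (T - 2)  # assumption: degrees of freedom = T - 2

    # Procedure 3: estimate the variance of the coefficient estimate's distribution.
    est_var_bx = est_var_e / sum_sq_x
    return b_const, b_x, est_var_e, est_var_bx

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]
b_const, b_x, est_var_e, est_var_bx = ols_simple(x, y)
print(b_const, b_x, est_var_e, est_var_bx)
```

All three calculations take the standard premises for granted, which is why a violated premise matters for each of them.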
Good News: When the standard premises are satisfied, each of these procedures is unbiased.
Good News: When the standard premises are satisfied, the OLS estimation procedure for the coefficient value is the best linear unbiased estimation procedure (BLUE).
Crucial Point: When the ordinary least squares (OLS) estimation procedure performs its calculations, it implicitly assumes that the three standard OLS premises are satisfied.

Taking Stock and a Preview: The Ordinary Least Squares (OLS) Estimation Procedure

OLS Bias Question: Is the explanatory variable/error term independence premise satisfied or violated? Is the OLS estimation procedure for the value of the coefficient biased or unbiased?
  Satisfied (explanatory variable and error terms are independent): Yes, unbiased.
  Violated (explanatory variable and error terms are correlated): No, biased.

OLS Reliability Question: Are the error term equal variance and the error term/error term independence premises satisfied or violated?

                                                                         Satisfied   Violated
  Can the OLS calculation for the coefficient’s standard error
  be “trusted”?                                                          Yes         No
  Is the OLS estimation procedure for the value of the
  coefficient BLUE?                                                      Yes         No

Preview: When the explanatory variable/error term independence premise is violated, and the ordinary least squares (OLS) estimation procedure is consequently biased, other estimation procedures can be used to mitigate, although not completely remedy, the bias problem.

Explanatory Variable/Error Term Independence Premise: The explanatory variables, the $x_t$’s, and the error terms, the $e_t$’s, are not correlated.
Question: What happens when this premise is violated?
Claim: When the explanatory variables and the error terms are correlated, the ordinary least squares estimation procedure for the coefficient value is biased.
Question: What do explanatory variable/error term independence and correlation “look like”?

Explanatory Variable/Error Term Independence: Corr X&E = 0.
The explanatory variable and the error terms appear to be independent. After many, many repetitions, the mean of each student’s error term is approximately 0. (Lab 18.1)

Explanatory Variable/Error Term Positive Correlation: Corr X&E = .6 (Lab 18.1)
The explanatory variable and the error terms appear to be positively correlated. After many, many repetitions:
  When the value of the explanatory variable is low, the error term is typically negative.
  When the value of the explanatory variable is high, the error term is typically positive.

Explanatory Variable/Error Term Negative Correlation: Corr X&E = -.6 (Lab 18.1)
The explanatory variable and the error terms appear to be negatively correlated. After many, many repetitions:
  When the value of the explanatory variable is low, the error term is typically positive.
  When the value of the explanatory variable is high, the error term is typically negative.

Consequences of Explanatory Variable/Error Term Correlation

Explanatory variable, $x_t$, and error term, $e_t$, are positively correlated
  $x_t$ up → $e_t$ up
  Plot the $y$’s on the diagram: $y_t = \beta_{Const} + \beta_x x_t + e_t$
  Find the best fitting line.
  [Figure: $y_t$ plotted against $x_t$, showing the actual equation $y = \beta_{Const} + \beta_x x$ and the estimated equation $y = b_{Const} + b_x x$]
  Explanatory variable and error term are positively correlated
    → Estimated equation more steeply sloped than actual equation
    → OLS estimation procedure for coefficient value is biased upward

Explanatory variable, $x_t$, and error term, $e_t$, are negatively correlated
  $x_t$ up → $e_t$ down
  Plot the $y$’s on the diagram: $y_t = \beta_{Const} + \beta_x x_t + e_t$
  Find the best fitting line.
  [Figure: $y_t$ plotted against $x_t$, showing the actual equation $y = \beta_{Const} + \beta_x x$ and the estimated equation $y = b_{Const} + b_x x$]
  Explanatory variable and error term are negatively correlated
    → Estimated equation less steeply sloped than actual equation
    → OLS estimation procedure for coefficient value is biased downward

Explanatory Variable/Error Term Independence Premise: The explanatory variables, the $x_t$’s, and the error terms, the $e_t$’s, are not correlated.

Confirm Our Suspicions
  When the explanatory variable and the error terms are positively correlated, the ordinary least squares (OLS) estimation procedure will be biased upward.
  When the explanatory variable and the error terms are negatively correlated, the ordinary least squares (OLS) estimation procedure will be biased downward.

Lab 18.2
  Estimation   Corr    Sample   Actual   Mean of     Magnitude   Variance of
  Procedure    X&E     Size     Coef     Coef Ests   of Bias     Coef Ests
  OLS           .00    50       2.0       2.0        0.0         4.0
  OLS           .30    50       2.0       6.1        4.1         3.6
  OLS          -.30    50       2.0      -2.1        4.1         3.6

  Explanatory variable and error term are uncorrelated → OLS estimation procedure for coefficient value is unbiased
  Explanatory variable and error term are positively correlated → OLS estimation procedure for coefficient value is biased upward
  Explanatory variable and error term are negatively correlated → OLS estimation procedure for coefficient value is biased downward

Estimation Procedures: Unbiased versus Biased and Consistent versus Inconsistent

Unbiased: Small Sample Property. The estimation procedure does not systematically underestimate or overestimate the actual value. Formally, the mean of the estimate’s probability distribution equals the actual value:
  Mean[Est] = Actual Value
When the estimate’s probability distribution is symmetric, the chances that the estimate is greater than the actual value equal the chances that it is less.
Unbiasedness is called a small sample property because it does not depend on the sample size. Unbiasedness depends only on the mean of the estimate’s probability distribution.

Consistent: Large Sample Property. Both the mean and variance of the estimate’s probability distribution are important for consistency:
Mean of the estimate’s probability distribution: Either
  the estimation procedure is unbiased: Mean[Est] = Actual Value
or
  the estimation procedure is biased, but the magnitude of the bias diminishes as the sample size becomes larger.
Formally, as the sample size approaches infinity, the mean approaches the actual value:
  As Sample Size → ∞: Mean[Est] → Actual Value
Variance of the estimate’s probability distribution: The variance diminishes as the sample size becomes larger. Formally, as the sample size approaches infinity, the variance approaches 0:
  As Sample Size → ∞: Variance[Est] → 0

[Figure: diagram of all estimation procedures, with regions labeled “Unbiased” and “Consistent”]

To get a better sense of the two different properties of estimation procedures, we shall consider three estimation procedures:
  Unbiased and Consistent
  Unbiased but Not Consistent
  Biased but Consistent

Categorizing Estimation Procedures
  Does Mean[Est] equal the Actual Value?
    Yes (unbiased): Does Var[Est] → 0 as the sample size → ∞?
      Yes: Unbiased and Consistent
      No: Unbiased but Not Consistent
    No (biased): Does Mean[Est] → Actual Value as the sample size → ∞?
      No: Biased and Not Consistent
      Yes: Does Var[Est] → 0 as the sample size → ∞?
        Yes: Biased but Consistent
        No: Biased and Not Consistent

Labs 18.3 and 18.4: After Many, Many Repetitions
  Estimation   Corr   Sample   Actual   Mean of     Magnitude   Variance of
  Procedure    X&E    Size     Coef     Coef Ests   of Bias     Coef Ests
  OLS          .0     3        2.0      2.0         0.0          2.50
  OLS          .0     6        2.0      2.0         0.0          1.14
  Any Two      .0     3        2.0      2.0         0.0          7.5
  Any Two      .0     6        2.0      2.0         0.0         17.3

Illustrating a Consistent but Biased Estimation Procedure: Revisit Our Friend Clint

Random Sample Procedure:
  Write the name of each individual in the population on a 3×5 card.
  Perform the following procedure 16 times:
    Thoroughly shuffle the cards.
    Randomly draw one card.
    Ask that individual if he/she is voting for Clint and record the answer.
    Replace the card.
  Calculate the fraction of the sample supporting Clint.

Nonrandom Sample Procedure:
  Leave Clint’s dorm room and ask each of the first 16 people you run into if he/she is voting for Clint.
  Calculate the fraction of the sample supporting Clint.

Questions: Compared to the general student population:
Are the students who live near Clint more likely to be Clint’s friends?
Yes.
Are the students who live near Clint more likely to vote for him? Yes.
Since your starting point is Clint’s dorm room, is it likely that you will poll students who are more supportive of Clint than the general student population? Yes.
Would you be biasing your poll in Clint’s favor? Yes.

Consistent Estimation Procedure Simulation (Lab 18.5)
  Sampling     Population   Sample   Mean (Average)   Magnitude   Variance of
  Technique    Fraction     Size     of Estimates     of Bias     Estimates
  Random       .50          16       .50              .00         .016
  Random       .50          25       .50              .00         .010
  Random       .50          100      .50              .00         .0025
  Nonrandom    .50          16       .56              .06         .015
  Nonrandom    .50          25       .54              .04         .010
  Nonrandom    .50          100      .51              .01         .0025

Is the random procedure unbiased? Yes.
Is the nonrandom procedure unbiased? No.
Is the random procedure consistent? Yes.
Is the nonrandom procedure consistent? Yes.
  As the sample size increases, the magnitude of the bias diminishes.
  As the sample size increases, the variance of the estimates diminishes.
The nonrandom procedure is biased but consistent.

The Explanatory Variable/Error Term Premise, the Ordinary Least Squares (OLS) Estimation Procedure, and Consistency

Review: We have already shown that the ordinary least squares (OLS) estimation procedure is biased when explanatory variable/error term correlation is present.
  Estimation   Corr    Sample   Actual   Mean of     Magnitude   Variance of
  Procedure    X&E     Size     Coef     Coef Ests   of Bias     Coef Ests
  OLS           .00    50       2.0       2.0        0.0         4.0
  OLS           .30    50       2.0       6.1        4.1         3.6
  OLS          -.30    50       2.0      -2.1        4.1         3.6

Question: But might the ordinary least squares (OLS) estimation procedure still be consistent when explanatory variable/error term correlation is present?

Lab 18.6
  Estimation   Corr   Sample   Actual   Mean of     Magnitude   Variance of
  Procedure    X&E    Size     Coef     Coef Ests   of Bias     Coef Ests
  OLS          .30    50       2.0      6.1         4.1         3.6
  OLS          .30    100      2.0      6.1         4.1         1.7
  OLS          .30    150      2.0      6.1         4.1         1.2

The variance of the estimates diminishes as the sample size increases, but the magnitude of the bias does not. There is nothing but bad news.
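The same pattern, a bias that stays put while only the variance shrinks, can be checked with a small Python simulation. This is a rough sketch under assumptions that are not from the lecture: the explanatory variable and error term are drawn as jointly normal with unit variances, the constant is set to 1.0, and the function name `simulate_ols` is invented here. Because the lab's variances are not stated, the numbers will not match Lab 18.6; with unit variances the mean of the estimates should settle near 2.0 + 0.30 = 2.3 at every sample size.

```python
import numpy as np

def simulate_ols(corr_xe, sample_size, actual_coef=2.0, reps=2000, seed=0):
    """Mean and variance of OLS coefficient estimates over many repetitions."""
    rng = np.random.default_rng(seed)
    # Assumption: x and e are jointly normal, each with variance 1.
    cov = [[1.0, corr_xe], [corr_xe, 1.0]]
    ests = np.empty(reps)
    for r in range(reps):
        draws = rng.multivariate_normal([0.0, 0.0], cov, size=sample_size)
        x, e = draws[:, 0], draws[:, 1]
        y = 1.0 + actual_coef * x + e  # assumption: constant of 1.0
        x_dev = x - x.mean()
        ests[r] = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
    return ests.mean(), ests.var()

results = {}
for T in (50, 100, 150):
    results[T] = simulate_ols(corr_xe=0.30, sample_size=T)
    mean_est, var_est = results[T]
    print(f"T={T:3d}  mean={mean_est:.3f}  bias={mean_est - 2.0:.3f}  var={var_est:.4f}")
```

Setting `corr_xe=0.0` in the same function gives a mean close to 2.0, which is the unbiased case from the review table.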
In the presence of explanatory variable/error term correlation, the ordinary least squares (OLS) estimation procedure is:
  Biased
  Not consistent
Question: Where do we go from here?

The Instrumental Variable (IV) Estimation Procedure

Question: Why is there a problem?
  $y_t = \beta_{Const} + \beta_x x_t + \varepsilon_t$
  When $x_t$ and $\varepsilon_t$ are correlated, $x_t$ is a “problem” explanatory variable.

“Problem” Explanatory Variable: $x_t$ is the “problem” explanatory variable:
  The explanatory variable, $x_t$, is correlated with the error term, $\varepsilon_t$.
  Consequently, the explanatory variable/error term independence premise is violated.
  The ordinary least squares (OLS) estimation procedure for the coefficient value is biased.

Addressing the “Problem” Explanatory Variable Using Instrumental Variables

Choose an Instrument: A “good” instrument, $z_t$, must meet two conditions.
  Good Instrument Condition 1: Correlated with the “problem” explanatory variable, $x_t$.
  Good Instrument Condition 2: Uncorrelated with the error term, $\varepsilon_t$.

Instrumental Variables (IV) Regression 1: Use the instrument, $z_t$, to provide an “estimate” of the problem explanatory variable, $x_t$.
  Dependent Variable: “Problem” explanatory variable, $x_t$.
  Explanatory Variable: Instrument, $z_t$.
  Estimate of the problem explanatory variable: $Estx_t = a_{Const} + a_z z_t$, where $a_{Const}$ and $a_z$ are the estimates of the constant and coefficient in this regression, IV Regression 1.

Instrumental Variables (IV) Regression 2: In the original model, replace the “problem” explanatory variable, $x_t$, with its surrogate, $Estx_t$, the estimate of the “problem” explanatory variable provided by the instrument, $z_t$, from IV Regression 1.
  Dependent Variable: Original dependent variable, $y_t$.
  Explanatory Variable: Estimate of the “problem” explanatory variable based on the results from IV Regression 1, $Estx_t$.

The “Good” Instrument Conditions

Good Instrument Condition 1: The instrument, $z_t$, must be correlated with the “problem” explanatory variable, $x_t$.
Focus on Instrumental Variables (IV) Regression 1: Use the instrument, $z_t$, to provide an “estimate” of the problem explanatory variable, $x_t$.
  Dependent Variable: “Problem” explanatory variable, $x_t$.
  Explanatory Variable: Instrument, $z_t$.
We are using the instrument to create a surrogate for the “problem” explanatory variable:
  $Estx_t = a_{Const} + a_z z_t$
The estimate, $Estx_t$, will be a “good” surrogate only if the instrument is correlated with the problem explanatory variable; that is, only if $Estx_t$ is a good predictor of the “problem” explanatory variable.

Good Instrument Condition 2: The instrument, $z_t$, must be independent of the error term, $\varepsilon_t$.
Focus on Instrumental Variables (IV) Regression 2: In the original model, replace the “problem” explanatory variable, $x_t$, with its surrogate, $Estx_t$, the estimate of the “problem” explanatory variable provided by the instrument, $z_t$, from IV Regression 1.
  Original model: $y_t = \beta_{Const} + \beta_x x_t + \varepsilon_t$
  Replace the “problem” explanatory variable with its surrogate: $y_t = \beta_{Const} + \beta_x Estx_t + \varepsilon_t$
  To avoid violating the explanatory variable/error term independence premise, $Estx_t$ and $\varepsilon_t$ must be independent.
  Since $Estx_t = a_{Const} + a_z z_t$, this means $z_t$ and $\varepsilon_t$ must be independent.

Justifying the Instrumental Variable (IV) Approach: A Simulation

Model: $y_t = \beta_{Const} + \beta_x x_t + e_t$

Defaults:
  IV is selected, indicating that the instrumental variable (IV) estimation procedure we just described will be used to estimate the value of the explanatory variable’s coefficient.
  In the Corr X&E list, the value .30 is specified: the correlation coefficient for the explanatory variable and error term equals .30. Hence, the explanatory variable/error term independence premise is violated.
  Two new correlation lists appear in this simulation: Corr X&Z and Corr Z&E. The two new lists reflect the two conditions required for a good instrument.
  The Corr X&Z list specifies the correlation coefficient for the explanatory variable and the instrument. To be a “good” instrument, the instrument must be correlated with the explanatory variable.
The default value is .50.
  The Corr Z&E list specifies the correlation coefficient for the instrument and error term. To be a “good” instrument, the instrument must be independent of the error term. The default value is .00; that is, the instrument and error term are independent.

Lab 18.7
  Estimation   Corr   Corr   Corr   Sample   Actual   Mean of     Magnitude   Variance of
  Procedure    X&Z    Z&E    X&E    Size     Coef     Coef Ests   of Bias     Coef Ests
  IV           .50    .00    .30    50       2.0      1.61        .39         20.3
  IV           .50    .00    .30    100      2.0      1.82        .18          8.7
  IV           .50    .00    .30    150      2.0      1.88        .12          5.5

Question: Is the IV estimation procedure:
  Unbiased? No.
  Consistent? Yes. As the sample size increases, both the magnitude of the bias and the variance of the estimates diminish.

Good Instrument Condition 1: Instrument/“Problem” Explanatory Variable Correlation
Suppose that the instrument is more highly correlated with the “problem” explanatory variable (Lab 18.7):
  Estimation   Corr   Corr   Corr   Sample   Actual   Mean of     Magnitude   Variance of
  Procedure    X&Z    Z&E    X&E    Size     Coef     Coef Ests   of Bias     Coef Ests
  IV           .75    .00    .30    150      2.0      1.95        .05         2.3

Question: Is the instrument better? Yes. At a sample size of 150, both the magnitude of the bias and the variance of the estimates are smaller than with Corr X&Z = .50.
Intuition: $z_t$ is more highly correlated with $x_t$
  → $Estx_t$ is a better predictor of $x_t$
  → $Estx_t$ is a better surrogate for $x_t$
  → The instrumental variables (IV) estimation procedure is better.

Good Instrument Condition 2: Instrument/Error Term Independence
Suppose that the instrument is correlated with the error term (Lab 18.7):
  Estimation   Corr   Corr   Corr   Sample   Actual   Mean of     Magnitude   Variance of
  Procedure    X&Z    Z&E    X&E    Size     Coef     Coef Ests   of Bias     Coef Ests
  IV           .75    .10    .30    50       2.0      3.69        1.69        6.8
  IV           .75    .10    .30    100      2.0      3.74        1.74        3.2
  IV           .75    .10    .30    150      2.0      3.76        1.76        2.1

Question: Is the IV estimation procedure:
  Unbiased? No.
  Consistent? No. The magnitude of the bias does not diminish as the sample size increases.
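The two-regression mechanics and the role of the two “good” instrument conditions can be sketched in Python. This is a minimal illustration, not the lab's simulation: the helper names `ols`, `iv_estimate`, and `draw_sample` are invented here, and the data are assumed to be jointly normal with unit variances and a constant of 1.0, so the output will not reproduce the Lab 18.7 figures. Under these assumptions the IV estimate should approach 2.0 as the sample grows when Corr Z&E is .00, and should settle away from 2.0 when Corr Z&E is .10, while OLS stays biased in both cases.

```python
import numpy as np

def ols(x, y):
    """Return (constant, slope) from a simple OLS regression of y on x."""
    x_dev = x - x.mean()
    slope = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
    return y.mean() - slope * x.mean(), slope

def iv_estimate(x, y, z):
    """Two-regression IV estimate of the coefficient on x, using instrument z."""
    # IV Regression 1: regress the "problem" explanatory variable on the instrument.
    a_const, a_z = ols(z, x)
    est_x = a_const + a_z * z
    # IV Regression 2: regress y on the surrogate, est_x.
    _, b_x = ols(est_x, y)
    return b_x

def draw_sample(rng, T, corr_xz, corr_ze, corr_xe, actual_coef=2.0):
    """Draw (x, y, z); assumes x, z, e are jointly normal with unit variances."""
    cov = np.array([[1.0, corr_xz, corr_xe],
                    [corr_xz, 1.0, corr_ze],
                    [corr_xe, corr_ze, 1.0]])
    x, z, e = rng.multivariate_normal(np.zeros(3), cov, size=T).T
    y = 1.0 + actual_coef * x + e  # assumption: constant of 1.0
    return x, y, z

rng = np.random.default_rng(0)
for corr_ze in (0.0, 0.10):
    for T in (50, 5000, 500000):
        x, y, z = draw_sample(rng, T, corr_xz=0.50, corr_ze=corr_ze, corr_xe=0.30)
        print(f"Corr Z&E={corr_ze:.2f}  T={T:6d}  "
              f"OLS={ols(x, y)[1]:.3f}  IV={iv_estimate(x, y, z):.3f}")
```

Each printed row is a single sample rather than a mean over many repetitions, so the small-sample rows are noisy; the large-sample rows are the ones that speak to consistency.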