0. Introduction
ADA University, © Khatai Abbasov

Organizational issues:
1. Lecture
   ▪ Saturday 08:30 – 09:45, D207
   ▪ Saturday 10:15 – 11:30, D207
2. Application Classes
   ▪ Tutorials will be held together with lectures.

Content (with readings):
1. General Review, Economic Data: Wooldridge (2016, Ch. 1.3)
2. Testing hypotheses for one population parameter: Lee (2020, Ch. 7)
3. Testing hypotheses for two population parameters: Lee (2020, Ch. 8)
4. Testing hypotheses for several population means: Lee (2020, Ch. 9)
5. Non-parametric Statistical Analysis: Lee (2020, Ch. 10)
6. Categorical Data Analysis: Lee (2020, Ch. 11)
7. Correlation Analysis: Lee (2020, Ch. 12)
8. Simple/Multiple Linear Regression Analysis: Wooldridge (2016, Ch. 2 & 3)
9. Time Series Analysis: Wooldridge (2016, Ch. 10)

8. Simple/Multiple Linear Regression Analysis

Outline of Section 8:
8.1 Statistical Models
8.2 Linear Regression Models
8.3 Ordinary Least Squares
8.4 R-squared
8.5 Multiple Regression Analysis
8.6 Omitted Variable Bias
8.7 Examples to widen your understanding
8.8 Qualitative Information: Binary (or Dummy) Variables
8.9 Binary Dependent Variable – LPM Model
8.10 Discrete Dependent Variables

8.1 Statistical Models
❑ Business Statistics/Econometrics is mainly concerned with model building.
❑ A model formalizes the idea that one variable is caused by another.
❑ Model building often begins with an idea of a relationship between variables.
❑ Statistical model building: translating this idea into an equation (or a set of equations).
❑ Some features of this equation answer a relevant/interesting question about the variable of interest.
❑ Examples: Does insurance coverage affect health care utilization? What is the direction of this effect, if it exists? How "big" is this effect, if it exists?
❑ Statistical point of view: health care utilization, insurance coverage, and further covariates have a joint probability distribution.
❑ We are often interested in the conditional distribution of one of these variables given the others.
❑ The focus is often on the conditional mean of one variable y given the value of the covariates x, i.e. E[y|x].
❑ E[y|x] is the regression function.
❑ E.g. the expected number of doctor visits given income, health status, insurance status, etc.
❑ The linear regression model is the most common choice.

8.2 Linear Regression Models
❑ Simple/Bivariate Regression Model
  ✓ y: left-hand-side variable (lhs)
  ✓ x: right-hand-side variable (rhs)
  ✓ u: disturbance / error / unobserved part
❑ The disturbance u captures:
  ✓ the random component of the underlying theoretical model
  ✓ measurement error in y
  ✓ anything not explicitly taken into account by the model
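As a quick, concrete illustration of these pieces (lhs variable, rhs variable, disturbance), the following sketch simulates data from a hypothetical bivariate model. The coefficient values (2.0 and 0.5), the sample size, and the use of Python/NumPy are illustrative assumptions, not part of the lecture material; binning x and averaging y within bins gives a rough empirical picture of the regression function E[y|x].

```python
import numpy as np

# Hypothetical data generating process: y = b0 + b1*x + u
# (all numbers below are made up for illustration)
rng = np.random.default_rng(42)
n = 200
x = rng.uniform(0, 10, size=n)   # right-hand-side variable
u = rng.normal(0, 1, size=n)     # disturbance: everything the model leaves out
y = 2.0 + 0.5 * x + u            # left-hand-side variable = deterministic part + error

# E[y|x] is the deterministic part 2.0 + 0.5*x; averaging y within
# narrow bins of x should roughly track that line.
bins = np.digitize(x, np.linspace(0, 10, 6))
for b in np.unique(bins):
    m = bins == b
    print(f"bin {b}: mean x = {x[m].mean():5.2f}, mean y = {y[m].mean():5.2f}")
```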
❑ Simple/Bivariate Regression Model, continued
  ✓ Sample of data for yᵢ and xᵢ with i = 1, …, n
  ✓ Key assumption: each observed value yᵢ is generated by the underlying data generating process
  ✓ yᵢ = β₁xᵢ + uᵢ
  ✓ yᵢ is determined by the deterministic part β₁xᵢ and the random part uᵢ
❑ Objective of statistical analysis: estimate the unknown model parameters, here β₁
  ✓ Testing hypotheses (Does the number of doctor visits increase with income?)
  ✓ Identifying the independent effect of xᵢ on y (How strongly will doctor visits increase if income increases by one unit?)
  ✓ Making predictions about yᵢ (How often will individual i with characteristics xᵢ visit a doctor?)
❑ Linearity does not mean that the statistical model needs to be linear in the variables:
  ✓ y = α·x^β·e^u is nonlinear, but its transformation ln(y) = ln(α) + β·ln(x) + u is linear
  ✓ y = α·x^β + e^u is nonlinear too, and cannot be transformed into a linear model
❑ Linearity refers to linearity in the parameters and in the disturbances, not to linearity in the (original, untransformed) variables, e.g.
  ✓ y = α + β₁·ln(x) + β₂·ln(z) + u
  ✓ ln(y) = α + β₁·x + β₂·z^2 + u
  ✓ y = α + β₁·(1/x) + β₂·z + u
  are all linear regression models even though they are nonlinear in x and z.
❑ Log-linear (log-log, semi-log) models in particular are frequently used in applied work.
❑ In short, "linear" means the equation is linear in the parameters β.
  ✓ Nothing prevents us from using simple regression to estimate a model relating cons (annual consumption) to inc (annual income).
❑ Linear models with nonlinear variables are often more realistic.

8.3 Ordinary Least Squares (OLS)
❑ Estimating the model parameters is the objective of the econometric model.
❑ There are different approaches to model estimation; least squares regression is the most popular.
❑ Starting with simple least squares is often a good idea in applied work, even if more sophisticated (and possibly better suited) methods are available.
❑ Idea of least squares estimation: choose the coefficient β such that the sum of squared residuals (the estimated unobserved parts) is minimized.
  ✓ Intuition: the fitted line β₁x is close to the observed data points.
  ✓ Algebraic perspective: least squares allows an algebraic solution of the minimization problem.
  ✓ Least squares puts much weight on avoiding large deviations of observed data points from the fitted line (because deviations are squared).
❑ Population versus sample regression: the sketch below fits a sample regression line to data drawn from a hypothetical population model.
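The following sketch makes the least squares idea concrete by estimating the intercept and slope of a simple regression with the standard closed-form formulas (minimizing the sum of squared residuals). The simulated data and coefficient values are illustrative assumptions; only the formulas correspond to the method described above.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=100)   # hypothetical sample

# OLS for y = b0 + b1*x + u: choose b0, b1 to minimize the sum of squared residuals
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

y_hat = b0_hat + b1_hat * x   # fitted values (the explained part)
resid = y - y_hat             # residuals (the estimated unobserved part)

print(f"intercept = {b0_hat:.3f}, slope = {b1_hat:.3f}")
print(f"sum of residuals (should be ~0 with an intercept): {resid.sum():.2e}")
```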
❑ In the linear model y = β₁x + u:
  ✓ E(u) = 0: the unobserved factors average out to zero in the population.
  ✓ E(u|x) = E(u): u is mean independent of x, i.e. the average of the unobserved factors does not vary with x.
  ✓ Example: E(abil|educ) = E(abil|8) = E(abil|16). What if ability increases with years of education? Then this assumption fails.
  ✓ E(y|x) = β₁x: the average value of y changes with x, not every individual unit.
  ✓ Which is the systematic part and which is the unsystematic part?
❑ Example (wage equation):
  ✓ What is the wage for a person with 8 years of education?
  ✓ How much does the hourly wage increase with 1 or 4 more years of education?
❑ A note on terminology (statistical jargon):
  ✓ Often we indicate the estimation of a relationship using OLS by writing out the estimated equation.
  ✓ Alternatively, we say that we run the regression of y on x, or simply that we regress y on x: always the dependent variable on the independent variables.
  ✓ Example: regress salary on roe.
  ✓ When we use such terminology, we always mean that we estimate the intercept along with the slope coefficient; this is appropriate for the vast majority of applications.
  ✓ Unless explicitly stated otherwise, we estimate an intercept along with a slope.
❑ Fitted values (the explained part) versus actual data versus residuals.

8.4 R-squared
❑ R-squared is the ratio of the explained variation to the total variation.
❑ It is interpreted as the fraction of the sample variation in y that is explained by x:
  SST = SSE + SSR; dividing both sides by SST gives 1 = SSE/SST + SSR/SST,
  so R² = SSE/SST = 1 − SSR/SST.
❑ Example: the firm's ROE explains only 1.3% of the variation in salaries in this sample.
❑ R-squared alone is a poor tool for model analysis.
❑ Example:
  ✓ narr86 = number of arrests in 1986
  ✓ pcnv = proportion of arrests prior to 1986 that led to conviction
  ✓ Interpret the coefficients; interpret the R-squared value.
  ✓ In the arrest example, the small R² reflects what we already suspect in the social sciences: it is generally very difficult to predict individual behavior.

8.5 Multiple Regression Analysis
❑ Linear regression model with multiple independent variables:
  ✓ For any values of x₁ and x₂ in the population, the average of the unobserved factors is equal to zero.
  ✓ This implies that other factors affecting y are not related, on average, to x₁ and x₂.
  ✓ Thus the assumption fails whenever any problem causes u to be correlated with any of the independent variables.
❑ Example (wage equation with educ, exper, tenure):
  ✓ exper = years of labor market experience
  ✓ tenure = years with the current employer
  ✓ If we take two people with the same levels of experience and job tenure, the coefficient on educ is the proportionate difference in predicted wage when their education levels differ by one year.
  ✓ What is the estimated effect on wage when an individual stays at the same firm for another year?
  ✓ How do we obtain fitted (predicted) values for each observation?
  ✓ What is the difference between the actual values and the fitted values?
  ✓ The sketch below estimates such a multiple regression on simulated data and reports its R².
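Here is a minimal sketch of such a multiple regression, assuming simulated data. It solves the OLS normal equations directly and computes the R² decomposition (SST = SSE + SSR) from the previous subsection. The variable names (educ, exper, tenure) echo the wage example, but the coefficients and data are made up, so the output does not reproduce the slide's numbers.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
educ = rng.uniform(8, 20, size=n)
exper = rng.uniform(0, 30, size=n)
tenure = rng.uniform(0, 20, size=n)
u = rng.normal(0, 0.3, size=n)
log_wage = 0.3 + 0.09 * educ + 0.004 * exper + 0.02 * tenure + u  # hypothetical DGP

X = np.column_stack([np.ones(n), educ, exper, tenure])  # intercept plus regressors
beta_hat = np.linalg.solve(X.T @ X, X.T @ log_wage)      # OLS via the normal equations

fitted = X @ beta_hat                 # predicted values
resid = log_wage - fitted             # residuals

sst = np.sum((log_wage - log_wage.mean()) ** 2)   # total sum of squares
sse = np.sum((fitted - fitted.mean()) ** 2)       # explained sum of squares
ssr = np.sum(resid ** 2)                          # residual sum of squares
r2 = 1 - ssr / sst                                # equals sse/sst

print("coefficients (const, educ, exper, tenure):", np.round(beta_hat, 4))
print(f"R-squared = {r2:.3f}")
```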
8.6 Omitted Variable Bias
❑ OVB occurs when a statistical model leaves out one or more relevant variables.
❑ The bias results in the model attributing the effect of the missing variables to those that were included.
  ✓ Full model: y = β₀ + β₁x₁ + β₂x₂ + u
  ✓ Underspecified model: y = β̃₀ + β̃₁x₁ + v
❑ What happens if Corr(x₁, x₂) = 0?

8.7 Examples to widen your understanding
1. Some students are randomly given grants to buy a computer. If the amount of the grant is truly randomly determined, we can estimate the ceteris paribus effect of the grant amount on the subsequent college grade point average by simple regression analysis.
   a) Because of random assignment, all of the other factors that affect GPA are uncorrelated with the amount of the grant.
   b) R-squared would probably be very small.
   c) In a large sample we could still get a precise estimate of the effect of the grant.
   d) For a more precise estimate, SAT score and family background variables would be good candidates, since correlation with the amount of the grant is not an issue. Remember, unbiasedness is not the concern here; it is already obtained. The issue is getting an estimator with a small sampling variance.
❑ Keep in mind that the smaller the R-squared, the harder the prediction/forecast.
❑ Yet goodness of fit and similar criteria should not be decisive in model selection; what matters is the purpose of your assessment.
2. Ecolabeled apples: the dependent variable is ecolbs (ecolabeled apples purchased), and the independent variables are the price of ecolabeled apples (ecoprc) and the price of regular apples (regprc). Families are chosen randomly, and we want to estimate the price effects.
   a) Since assignment is random, family income and the desire for a clean environment are unrelated to the prices.
   b) Hence regressing ecolbs on ecoprc and regprc (reg ecolbs ecoprc regprc) produces unbiased estimators of the price effects.
   c) R-squared is 0.0364, which means the price variables explain only 3.6% of the total variation in the dependent variable.
3. Suppose we want to estimate the effect of pesticide usage among farmers on family health expenditures. Should we include the number of doctor visits as an explanatory variable? The answer is NO.
   a) Health expenditures include doctor visits, and we want to capture all health expenditures.
   b) If we include doctor visits, then we are only measuring the effect of pesticide usage on health expenditures other than doctor visits.

8.8 Qualitative Information: Binary (or Dummy) Variables
❑ The intercept in a model with binary variables is interpreted with respect to the base group.
❑ Example (wage regressed on a female dummy):
  ✓ What does the intercept tell us?
  ✓ What does the coefficient on female tell us?
  ✓ The base group (benchmark group) here is male, since female = 0 leaves us with males, and that group is captured by the intercept.
❑ Example (wage regressed on female, educ, exper, and tenure):
  ✓ What does the negative intercept, −1.57, stand for? Is it economically meaningful? Why?
  ✓ What does the coefficient on female measure?
  ✓ Why is it important in this example to control for educ, exper, and tenure?
  ✓ The sketch below illustrates the base-group interpretation on simulated data.
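A minimal sketch of the dummy-variable logic, assuming simulated data: with only a female indicator on the right-hand side, the OLS intercept reproduces the mean outcome of the base group (males) and the coefficient on female reproduces the difference in group means. The coefficient values below are invented for illustration; they are not the slide's estimates.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
female = rng.integers(0, 2, size=n)                     # dummy: 1 = female, 0 = male (base group)
wage = 7.0 - 1.5 * female + rng.normal(0, 2, size=n)    # hypothetical wage equation

X = np.column_stack([np.ones(n), female])
b0, b1 = np.linalg.solve(X.T @ X, X.T @ wage)           # OLS with the dummy regressor

male_mean = wage[female == 0].mean()
female_mean = wage[female == 1].mean()
print(f"intercept:          {b0:.3f}   (mean wage of males:          {male_mean:.3f})")
print(f"female coefficient: {b1:.3f}   (female minus male mean wage: {female_mean - male_mean:.3f})")
```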
8.9 Binary Dependent Variable – LPM Model
❑ The Linear Probability Model (LPM) applies when the dependent variable is zero/one.
❑ Thus β cannot be interpreted as the change in y given a one-unit increase in x.
❑ Instead, E(y|x) = P(y = 1|x), the probability that y = 1.
❑ We must now interpret the estimated (fitted) y as the predicted probability of success.
  ✓ inlf = labor force participation (1 if in the labor force)
  ✓ nwifeinc = husband's earnings (non-wife income)
  ✓ kidslt6 = number of kids less than 6 years old
  ✓ kidsge6 = number of kids from 6 to 18 years old
❑ Ten more years of education increase the probability of being in the labor force by 0.38, which is a pretty large increase in a probability.

8.10 Discrete Dependent Variables
❑ The dependent variable takes on a small set of integer values, and zero is a common value.
❑ For instance: number of arrests, number of living children, etc.
❑ Example (fertility equation; see the sketch below):
  ✓ How do we interpret the estimate on educ?
  ✓ "Each additional year of education reduces the estimated number of children by 0.079"? Clearly, that is not a possible situation for any particular woman.
  ✓ The estimate on education means that average fertility falls by 0.079 children given one more year of education.
  ✓ An even better summary: if each woman in a group of 100 women obtains another year of education, we estimate there will be roughly 8 fewer children among them.
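To close, a minimal sketch of this interpretation, assuming simulated data: with a discrete (count) dependent variable, the OLS coefficient describes a change in the average outcome, not a change for any particular individual. The data generating process below is invented to mimic the fertility example (an educ effect near −0.079); the exact output will not match the slide's estimates.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4000
educ = rng.integers(0, 21, size=n).astype(float)
# Hypothetical DGP: expected number of children falls with education
mean_children = np.clip(3.0 - 0.079 * educ, 0.1, None)
children = rng.poisson(mean_children)                 # discrete dependent variable (a count)

X = np.column_stack([np.ones(n), educ])
b0, b1 = np.linalg.solve(X.T @ X, X.T @ children)     # OLS on the count outcome

print(f"educ coefficient: {b1:.3f}")
# One more year of education lowers *average* fertility by about |b1| children;
# across 100 women that is roughly 100*|b1| (about 8) fewer children in total.
print(f"implied change across 100 women: {100 * b1:.1f}")
```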