Lab: Univariate & Multivariate OLS
7,160 Quantitative Methods
TA: Edouard Mattille
September 22, 2023

Introduction to computer labs

Computer labs are meant to show you how to apply the theory learned during the lectures. We will look at issues that are often encountered in applied research, as well as some interesting concepts that are best demonstrated by programming. Seeing certain concepts implemented in code gives us a deeper understanding of the theory.

Programming in this course

A few words on programming in this course...

Univariate OLS: Test Score Example

Dependent variable:
- Test score: average of the reading and math scores on the Stanford 9 Achievement Test, a standardized test administered to fifth-grade students.

Independent variable:
- Student-teacher ratio: number of students in the district divided by the number of full-time teachers.

Student characteristics (averaged across the district):
- Percentage of students who are English learners
- Percentage of students who qualify for reduced-price lunch
- Percentage of students in the public assistance program CalWorks

District and school characteristics (averaged across the district):
- Expenditures per student
- Income per capita

Preliminaries, packages, and data set

Preliminaries:

    rm(list = ls())
    getwd()
    setwd("/Users/...")

Packages:

    library(AER)        # contains the data
    library(stargazer)  # makes nice tables and outputs to LaTeX
    library(fBasics)    # makes nice summary statistics

The necessary data set:

    data(CASchools)

Construction of the required variables:

    # student-teacher ratio (STR)
    CASchools$STR <- CASchools$students / CASchools$teachers

    # average test score (score)
    CASchools$score <- (CASchools$read + CASchools$math) / 2

Let's get some statistics

    # compute sample means of STR and score
    mean_STR   <- mean(CASchools$STR)
    mean_score <- mean(CASchools$score)

    # compute sample standard deviations of STR and score
    sd_STR   <- sd(CASchools$STR)
    sd_score <- sd(CASchools$score)

    # compute sample minimum and maximum of STR and score
    min_STR   <- min(CASchools$STR)
    min_score <- min(CASchools$score)
    max_STR   <- max(CASchools$STR)
    max_score <- max(CASchools$score)

Or we can use the fBasics package and get summary statistics directly:

    basicStats(CASchools$STR)
    basicStats(CASchools$score)

Summary statistics

                        STR        score
    nobs                420        420
    NAs                 0          0
    Minimum             14         605.6
    Maximum             25.8       706.8
    1. Quartile         18.6       640.1
    3. Quartile         20.9       666.7
    Mean                19.6       654.2
    Median              19.7       654.4
    Sum                 8,249      274,745.8
    Variance            3.6        363
    Stdev               1.9        19.1
    Skewness            0          0.1
    Excess Kurtosis     0.6        -0.3
The Simple Regression Model

Regression model: estimation of the effect of the student-teacher ratio (STR) on the test score:

    \text{TestScore}_i = \beta_0 + \beta_1 \cdot \text{STR}_i + u_i

Goal: estimate the coefficients β0 and β1, and assess how well the model explains the observations.

OLS regression: the lm command

    # estimate the model and assign the result to linear_model
    linear_model <- lm(score ~ STR, data = CASchools)

    # print the standard output
    linear_model
    # Coefficients:
    # (Intercept)          STR
    #      698.93        -2.28

If you want to regress without an intercept, run lm(score ~ STR - 1, data = CASchools). Why?

Let's do some plotting

    # plot the data
    plot(score ~ STR,
         data = CASchools,
         main = "Regression of TestScore and STR",
         xlab = "Student-Teacher Ratio",
         ylab = "Test Score",
         col  = "royalblue1")

    # add the regression line to the plot
    abline(linear_model, col = "green")

[Figure: scatter plot of test scores against the student-teacher ratio, with the fitted regression line.]

Estimation of the coefficients

We now manually derive the results from the definition of the OLS estimators:

    \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}

    \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}

Implementation in R

    # compute and print beta_1_hat
    beta_1 <- sum((CASchools$STR - mean(CASchools$STR)) *
                  (CASchools$score - mean(CASchools$score))) /
              sum((CASchools$STR - mean(CASchools$STR))^2)
    beta_1

    # compute and print beta_0_hat
    beta_0 <- mean(CASchools$score) - beta_1 * mean(CASchools$STR)
    beta_0

    # We get the same results.

We can also express the OLS slope as:

    # Note: we can also calculate beta_1 with
    cov(CASchools$score, CASchools$STR) / var(CASchools$STR)

    # or
    cor(CASchools$score, CASchools$STR) *
      (sd(CASchools$score) / sd(CASchools$STR))

Measures of fit

The R²:

    R^2 = 1 - \frac{SSR}{TSS}

where SSR = \sum_{i=1}^{n} \hat{u}_i^2 and TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2.

The adjusted R²:

    \bar{R}^2 = 1 - \frac{n-1}{n-k-1} \cdot \frac{SSR}{TSS}

The standard error of the regression (SER):

    SER = \sqrt{s_{\hat{u}}^2}, \qquad \text{where } s_{\hat{u}}^2 = \frac{SSR}{n-k-1}

Let's compute it in R

    # compute the components
    residuals <- CASchools$score - predict(linear_model, newdata = CASchools)
    n   <- nrow(CASchools)
    SSR <- sum(residuals^2)
    TSS <- sum((CASchools$score - mean(CASchools$score))^2)

    # compute and print R^2
    R2 <- 1 - SSR / TSS
    R2

    # compute and print SER
    SER <- sqrt(SSR / (n - 2))
    SER

    # measures of fit from the lm object
    summary(linear_model)
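The code above computes R² and the SER but not the adjusted R². As a minimal sketch, not on the original slides, it follows from the formula above by reusing n, SSR, and TSS; adj_R2 and k are hypothetical names, with k = 1 regressor here:

    # adjusted R^2 from the formula above (sketch; k = number of regressors)
    k <- 1
    adj_R2 <- 1 - ((n - 1) / (n - k - 1)) * SSR / TSS
    adj_R2

    # cross-check against the value stored in the lm summary
    summary(linear_model)$adj.r.squared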
The Multivariate Regression Model

Regression model: estimation of the effect of the student-teacher ratio (STR) and the percentage of English learners (PctEL) on the test score:

    \text{TestScore}_i = \beta_0 + \beta_1 \cdot \text{STR}_i + \beta_2 \cdot \text{PctEL}_i + u_i

Goal: estimate the coefficients β0, β1, and β2, and assess how well the model explains the observations.

R implementation of multivariate OLS

    mult_model <- lm(score ~ STR + english, data = CASchools)

    # Call:
    # lm(formula = score ~ STR + english, data = CASchools)
    #
    # Coefficients:
    # (Intercept)          STR      english
    #    686.0322      -1.1013      -0.6498

Minimizing the least squares

Let's now consider why least squares is called least squares by solving the same linear regression with a numerical optimizer. We are going to build a function in R that takes the regression parameters β0, β1, and β2 as inputs and outputs the sum of squared residuals. Then we instruct R to find the parameters that minimize this output: least squares!

The function to be minimized

    multivariateOLS <- function(parameters) {
      beta0 <- parameters[1]
      beta1 <- parameters[2]
      beta2 <- parameters[3]

      y  <- data[, 1]
      x1 <- data[, 2]
      x2 <- data[, 3]

      # squared residuals for the candidate parameters
      squared_residuals <- (y - beta0 - beta1 * x1 - beta2 * x2)^2
      SSR <- sum(squared_residuals)
      return(SSR)
    }

Minimizing the function

    # prepare the data
    data <- cbind(score   = CASchools$score,
                  STR     = CASchools$STR,
                  english = CASchools$english)

    # feed in some random starting values
    parameters <- rnorm(3)

    # run the optimizer (L-BFGS-B is a quasi-Newton method)
    leastsquares <- optim(fn = multivariateOLS, par = parameters,
                          method = "L-BFGS-B")

Let's run it live to check the results.

Appendix 1: Matrix version of multivariate OLS

For completeness, let's demonstrate the multivariate setting:

    \hat{\beta} = (X^\top X)^{-1} X^\top Y

Recall that the inverse of a matrix A solves A^{-1} A = I, the identity matrix (a diagonal of 1s). For comparison, the univariate β̂1 was derived as:

    \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}

Let's understand what is in the matrices through programming.

    X <- cbind(1, CASchools$STR, CASchools$english)  # Why the 1?
    Y <- CASchools$score

    beta_hat <- solve(t(X) %*% X) %*% t(X) %*% Y
    beta_hat
    # Note: with a single argument, "solve" calculates the inverse of a matrix.

Appendix 2: FAQ

Question: Earlier we programmed β̂1 as cov(x, y)/var(x) and got the same OLS estimate as the lm command. However, from the definition of the univariate β̂1 estimator, the denominator is actually the sum of the squared deviations, \sum_{i=1}^{n} (X_i - \bar{X})^2, seemingly the variance times n. How is this possible?

Answer: What we ran in R is the sample covariance divided by the sample variance, which is:

    \hat{\beta}_1 = \frac{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}

Therefore the two are equivalent, by virtue of the 1/(n-1) corrections cancelling out.
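A quick sketch, not on the original slides, confirming the cancellation numerically on the data already loaded; x, y, manual, and ratio are hypothetical names:

    # the 1/(n-1) factors cancel: manual formula vs. cov/var (sketch)
    x <- CASchools$STR
    y <- CASchools$score
    manual <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
    ratio  <- cov(y, x) / var(x)
    all.equal(manual, ratio)  # TRUE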
Question: Why are the sample variance and sample covariance divided by n - 1 instead of n?

Answer: This adjustment is needed to produce an unbiased estimator, and it applies whenever we use the sample mean in place of the population mean. Let's consider the sample variance. Let μ be the true population mean, σ² the true population variance, X̄ the sample mean, and S² the "wrong" sample variance that divides by n. We have:

    \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad S^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2

Then let's take the expectation of our biased estimator:

    E[S^2] = E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right]
           = E\left[ \frac{1}{n} \sum_{i=1}^{n} \left( (X_i - \mu) - (\bar{X} - \mu) \right)^2 \right]
           = E\left[ \frac{1}{n} \sum_{i=1}^{n} \left( (X_i - \mu)^2 - 2(\bar{X} - \mu)(X_i - \mu) + (\bar{X} - \mu)^2 \right) \right]
           = E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - \frac{2}{n} (\bar{X} - \mu) \sum_{i=1}^{n} (X_i - \mu) + (\bar{X} - \mu)^2 \right]   (1)

Subtract the true population mean from both sides of \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i and get:

    \bar{X} - \mu = \frac{1}{n} \sum_{i=1}^{n} X_i - \mu = \frac{1}{n} \sum_{i=1}^{n} X_i - \frac{1}{n} \sum_{i=1}^{n} \mu = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)

    \therefore \quad n (\bar{X} - \mu) = \sum_{i=1}^{n} (X_i - \mu)

Plug back into (1) and get:

    E[S^2] = E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - 2(\bar{X} - \mu)^2 + (\bar{X} - \mu)^2 \right]
           = E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - (\bar{X} - \mu)^2 \right]
           = E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 \right] - E\left[ (\bar{X} - \mu)^2 \right]
           = \sigma^2 - \mathrm{var}(\bar{X})

From the least squares assumptions, the X_i are i.i.d., and therefore:

    \mathrm{var}(\bar{X}) = \mathrm{var}\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{var}(X_i) = \frac{1}{n} \sigma^2

Finally we see that:

    E[S^2] = \sigma^2 - \frac{1}{n} \sigma^2 = \frac{n-1}{n} \sigma^2 \neq \sigma^2

Thus the estimator S² is biased.

Therefore we need an unbiased estimator of the variance; call it σ̂². Using the definition of S², we get:

    E[\hat{\sigma}^2] = E\left[ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right] = E\left[ \frac{n}{n-1} \cdot \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right] = \frac{n}{n-1} \cdot \frac{n-1}{n} \sigma^2 = \sigma^2

It should now be clear why multiplying the sample variance of the dependent variable by n - 1 gives us the TSS.
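A minimal simulation sketch, not on the original slides, illustrating the bias derived above: with n = 5 draws from N(0, 1), the n-divided estimator should average (n-1)/n = 0.8, while R's var(), which divides by n - 1, should average 1. The seed and number of replications are arbitrary choices:

    # Monte Carlo check of E[S^2] = (n-1)/n * sigma^2 (sketch)
    set.seed(1)
    n <- 5
    sims <- replicate(100000, {
      x <- rnorm(n)  # sigma^2 = 1
      c(biased = sum((x - mean(x))^2) / n, unbiased = var(x))
    })
    rowMeans(sims)  # approximately 0.8 and 1.0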