Uploaded by Duy Kim H. Tran

CL1 OrdinaryLeastSquares

Lab : Univariate & Multivariate OLS
7,160 Quantitative Methods
TA : Edouard Mattille
September 22, 2023
Introduction
Introduction to computer labs
Computer labs are meant to show you how to apply the theory learned
during the lectures.
We will look at issues that are often encountered during applied research as
well as some interesting concepts best demonstrated by programming.
Seeing certain concepts implemented in programming gives us a deeper
understanding of the theory.
Programming in this course
A few words on programming in this course...
Ordinary Least Squares
Univariate OLS
Test Score Example
Dependent Variable :
Test Scores : Average of the reading and math scores on the Stanford
9 Achievement Test, a standardized test administered to fifth-grade
students
Independent Variable :
Student-Teacher Ratio : Number of students in the district divided by
the number of full-time teachers
Student Characteristics (averaged across the district) :
Percentage of students who are English learners
Percentage of students who qualify for reduced-price lunch
Percentage of students who are in the public assistance program
CalWorks
District and School Characteristics (averaged across the district) :
Expenditures per student
Income per capita
Preliminaries, packages, and data set
Preliminaries :
rm(list=ls())
getwd()
setwd("/Users/...")
Packages :
library(AER) # contains the data
library(stargazer) # makes nice tables and outputs to LaTeX
library(fBasics) # makes nice summary statistics
Necessary data set :
data(CASchools)
Construction of required variables :
# student-teacher ratio (STR)
CASchools$STR <- CASchools$students/CASchools$teachers
# average test-score (score)
CASchools$score <- (CASchools$read + CASchools$math)/2
Let’s get some statistics
# compute sample means of STR and score
mean_STR <- mean(CASchools$STR)
mean_score <- mean(CASchools$score)
# compute sample standard deviations of STR and score
sd_STR <- sd(CASchools$STR)
sd_score <- sd(CASchools$score)
# compute sample minimum and maximum of STR and score
min_STR <- min(CASchools$STR)
min_score <- min(CASchools$score)
max_STR <- max(CASchools$STR)
max_score <- max(CASchools$score)
Or we can use the fBasics package and get some summary statistics :
basicStats(CASchools$STR)
basicStats(CASchools$score)
Summary Statistics
                    STR       score
nobs                420         420
NAs                   0           0
Minimum              14       605.6
Maximum            25.8       706.8
1. Quartile        18.6       640.1
3. Quartile        20.9       666.7
Mean               19.6       654.2
Median             19.7       654.4
Sum               8,249   274,745.8
Variance            3.6         363
Stdev               1.9        19.1
Skewness              0         0.1
Excess Kurtosis     0.6        -0.3
The Simple Regression Model
Regression Model :
Estimation of the effect of the student-to-teacher ratio (STR) on the
test score.
\[ \text{TestScore}_i = \beta_0 + \beta_1 \cdot \text{STR}_i + u_i \]
Goal :
Estimate the coefficients β0 and β1 .
Assess how well the model explains the observations.
OLS Regression : the lm command.
# estimate the model and assign the result to linear_model
linear_model <- lm(score ~ STR, data = CASchools)
# print the standard output
linear_model
# Coefficients:
# (Intercept)          STR
#      698.93        -2.28
If you want to regress without an intercept, run
lm(score ~ STR - 1, data = CASchools)
Why ?
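To see what the `- 1` does, here is a minimal sketch on simulated data. The variables x and y and all numbers are made up for illustration and are not taken from CASchools. It suggests that dropping the intercept forces the fitted line through the origin, so the slope is no longer the usual OLS slope.

```r
# Simulated data (hypothetical, not CASchools): true intercept 5, true slope 2
set.seed(1)
x <- runif(100, min = 10, max = 20)
y <- 5 + 2 * x + rnorm(100)

with_intercept    <- lm(y ~ x)      # estimates beta_0 and beta_1
without_intercept <- lm(y ~ x - 1)  # "- 1" removes the intercept term

coef(with_intercept)     # two coefficients: (Intercept) and x
coef(without_intercept)  # one coefficient: x only

# Without an intercept, the least-squares slope is sum(x * y) / sum(x^2)
slope_origin <- sum(x * y) / sum(x^2)
all.equal(unname(coef(without_intercept)), slope_origin)
```

Comparing the two slopes shows how much the estimate moves when the line is forced through the origin.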
Let’s do some plotting
# plot the data
plot(score ~ STR, data = CASchools,
     main = "Regression of TestScore and STR",
     xlab = "Student-Teacher Ratio",
     ylab = "Test Score", col = "royalblue1")
# add the regression line to the plot
abline(linear_model, col = "green")
Estimation of Coefficients
We now manually derive the results from the definition of the OLS
parameters :
\[ \hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2} \]
\[ \hat\beta_0 = \bar Y - \hat\beta_1 \bar X \]
Implementation in R
# compute and print beta_1_hat
# the "/" must end the line: a new line starting with "/" is a syntax error in R
beta_1 <- sum((CASchools$STR - mean(CASchools$STR))
              * (CASchools$score - mean(CASchools$score))) /
  sum((CASchools$STR - mean(CASchools$STR))^2)
beta_1
# compute and print beta_0_hat
beta_0 <- mean(CASchools$score) - beta_1 * mean(CASchools$STR)
beta_0
# We get the same results.
We can also express OLS as :
# Note we can also calculate beta1 with:
cov(CASchools$score, CASchools$STR)/var(CASchools$STR)
# or
cor(CASchools$score, CASchools$STR) * (sd(CASchools$score)
/ sd(CASchools$STR))
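As a quick check of these equivalences, the following sketch compares the three expressions and the lm slope on simulated data. The variables x and y and the numbers used to generate them are hypothetical and chosen only for illustration.

```r
# Simulated data (hypothetical): slope of -2 plus noise
set.seed(42)
x <- rnorm(50, mean = 20, sd = 2)
y <- 700 - 2 * x + rnorm(50, sd = 10)

# Three ways of writing the univariate OLS slope
b_def <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b_cov <- cov(y, x) / var(x)
b_cor <- cor(y, x) * sd(y) / sd(x)

# Slope reported by lm, for comparison
b_lm <- unname(coef(lm(y ~ x))["x"])

# All four should agree up to floating-point error
all.equal(b_def, b_cov)
all.equal(b_def, b_cor)
all.equal(b_def, b_lm)
```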
Measures of Fit
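The R² :
\[ R^2 = 1 - \frac{SSR}{TSS}, \quad \text{where } SSR = \sum_{i=1}^{n} \hat u_i^2 \text{ and } TSS = \sum_{i=1}^{n} (Y_i - \bar Y)^2 \]
The adjusted R² :
\[ \bar R^2 = 1 - \frac{n-1}{n-k-1} \cdot \frac{SSR}{TSS} \]
The SER of the regression :
\[ SER = \sqrt{s_{\hat u}^2}, \quad \text{where } s_{\hat u}^2 = \frac{SSR}{n-k-1} \]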
Let’s compute it in R
# compute the components
residuals <- CASchools$score - predict(linear_model,
newdata = CASchools)
n <- nrow(CASchools)
SSR <- sum(residuals^2)
TSS <- sum((CASchools$score - mean(CASchools$score))^2)
# compute and print R^2
R2 <- 1 - SSR/TSS
R2
# compute and print SER
SER <- sqrt(SSR / (n-2))
SER
# Measures of Fit from lm object
summary(linear_model)
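The code above computes R² and the SER for one regressor (k = 1, hence n - 2). As a sketch of the general formulas, including the adjusted R², the example below uses simulated data with k = 2 regressors and compares the manual results with the components of summary(). The data and the names x1, x2, y, and fit are assumptions made for illustration.

```r
# Simulated data (hypothetical) with k = 2 regressors
set.seed(7)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)
k   <- 2                       # number of regressors, excluding the intercept

SSR <- sum(resid(fit)^2)       # sum of squared residuals
TSS <- sum((y - mean(y))^2)    # total sum of squares

R2     <- 1 - SSR / TSS
adj_R2 <- 1 - (n - 1) / (n - k - 1) * SSR / TSS
SER    <- sqrt(SSR / (n - k - 1))

# Compare with the values stored in the summary object
s <- summary(fit)
all.equal(R2, s$r.squared)
all.equal(adj_R2, s$adj.r.squared)
all.equal(SER, s$sigma)
```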
Multivariate OLS
The Multivariate Regression Model
Regression model :
Estimation of the effect of the student-to-teacher ratio (STR) and the
percentage of English learners (PctEL) on the test score :
\[ \text{TestScore}_i = \beta_0 + \beta_1 \cdot \text{STR}_i + \beta_2 \cdot \text{PctEL}_i + u_i \]
Goal :
Estimate the coefficients β0 , β1 , and β2 .
Assess how well the model explains the observations.
R implementation of multivariate OLS
mult_model <- lm(score ~ STR + english, data = CASchools)
# Call:
# lm(formula = score ~ STR + english, data = CASchools)
#
# Coefficients:
# (Intercept)          STR      english
#    686.0322      -1.1013      -0.6498
Minimizing the least squares
Let's now consider why least squares is called least squares by solving the
same linear regression with a numerical optimizer (here a gradient-based
quasi-Newton method).
We are going to build a function in R which takes as input the regression
parameters β0 , β1 , and β2 and outputs the sum of squared residuals (SSR).
Then, we instruct R to find the parameters which minimize that sum of squares !
The function to be minimized
multivariateOLS <- function(parameters){
  beta0 <- parameters[1]
  beta1 <- parameters[2]
  beta2 <- parameters[3]
  y  <- data[,1]
  x1 <- data[,2]
  x2 <- data[,3]
  residuals <- (y - beta0 - beta1*x1 - beta2*x2)^2
  SSR <- sum(residuals)
  return(SSR)
}
Minimizing the function
# Prepare the data
data <- cbind(score = CASchools$score, STR = CASchools$STR,
english = CASchools$english)
# We feed in some random starting values
parameters <- c(rnorm(3))
# Run the optimizer
leastsquares <- optim(fn=multivariateOLS, par = parameters,
method = "L-BFGS-B")
Let’s run it live to check the results.
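For readers without the live session, here is a self-contained sketch of the same idea on simulated data, so that the optimizer's answer can be compared with lm directly. Several things differ from the slides and are assumptions: the data are made up, the objective is a hypothetical function ssr that receives the data as arguments instead of reading a global object, the start values are zeros, and the method is "BFGS" rather than "L-BFGS-B". Numerical optimizers can be sensitive to starting values and scaling, so results on other data may be less precise.

```r
# Simulated data (hypothetical): true coefficients 3, 1.5, -2
set.seed(123)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 3 + 1.5 * x1 - 2 * x2 + rnorm(n)

# Objective: sum of squared residuals for a given parameter vector beta
ssr <- function(beta, y, x1, x2) {
  sum((y - beta[1] - beta[2] * x1 - beta[3] * x2)^2)
}

# Minimize the SSR numerically; extra arguments are passed on to ssr
fit_optim <- optim(par = c(0, 0, 0), fn = ssr,
                   y = y, x1 = x1, x2 = x2, method = "BFGS")

# Closed-form OLS for comparison
fit_lm <- lm(y ~ x1 + x2)

fit_optim$convergence                            # 0 indicates success
cbind(optim = fit_optim$par, lm = coef(fit_lm))  # side-by-side estimates
```

The two columns should agree to several decimal places, which is the sense in which OLS "minimizes the least squares".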
Appendices
Matrix version of multivariate OLS
Appendix 1 : Matrix version of multivariate OLS
For completeness, let's state the multivariate estimator in matrix form.
\[ \hat\beta = (X^T X)^{-1} X^T Y \]
Recall that the inverse of a matrix A satisfies A^{-1} A = I, the identity
matrix (1s on the diagonal, 0s elsewhere). For comparison, the univariate
β̂1 was derived as :
\[ \hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2} \]
Let’s understand what is in the matrices through programming.
X <- cbind(1, CASchools$STR, CASchools$english)
# Why the 1 ?
Y <- CASchools$score
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% Y
beta_hat
# Note: "solve" calculates the inverse of a matrix.
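The same computation can be sketched on simulated data so that it runs without the AER package. The data and variable names below are hypothetical. The sketch also shows an alternative that solves the normal equations with solve(A, b) instead of forming the inverse explicitly, and compares both with lm.

```r
# Simulated data (hypothetical): true coefficients 2, 0.5, -1
set.seed(2023)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
Y  <- 2 + 0.5 * x1 - 1 * x2 + rnorm(n)

# Design matrix: the column of 1s corresponds to the intercept beta_0
X <- cbind(1, x1, x2)

# Textbook formula with an explicit inverse
beta_inverse <- solve(t(X) %*% X) %*% t(X) %*% Y

# Alternative: solve (X'X) beta = X'Y directly, without inverting X'X
beta_system <- solve(crossprod(X), crossprod(X, Y))

# Reference values from lm
beta_lm <- coef(lm(Y ~ x1 + x2))

all.equal(as.vector(beta_inverse), unname(beta_lm))
all.equal(as.vector(beta_system), unname(beta_lm))
```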
FAQ
Appendix 2 : FAQ
Question : Under "We can also express OLS as" (slide 15) we programmed β̂1
as cov(x, y)/var(x) and got the same OLS estimate as the lm command.
However, in the definition of the univariate β̂1 estimator under
"Estimation of Coefficients" (slide 13), the denominator is the sum of
squared deviations, \sum_{i=1}^{n} (X_i - \bar X)^2 , which looks like the
variance times n. How is this possible ?
Answer : What we ran in R is the sample covariance divided by the sample
variance, which is :
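\[ \hat\beta_1 = \frac{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)^2} = \frac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2} \]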
Therefore the two are equivalent by virtue of the (1/(n − 1)) corrections
cancelling out.
Question : Why are the sample variance and sample covariance divided by
n − 1 instead of n ?
Answer : This adjustment needs to be made in order to produce an
unbiased estimator, and is applied whenever we are using the sample mean
as opposed to the population mean. Let’s consider the sample variance.
Consider µ as the true population mean, σ 2 as the true population
variance, X̄ as the sample mean, and S 2 as the “wrong” sample variance,
dividing by n. We have :
\[ \bar X = \frac{1}{n} \sum_{i=1}^{n} X_i , \qquad S^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 \]
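Then let's take the expectation of our biased estimator :
\begin{align*}
E[S^2] &= E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 \right] \\
&= E\left[ \frac{1}{n} \sum_{i=1}^{n} \big( (X_i - \mu) - (\bar X - \mu) \big)^2 \right] \\
&= E\left[ \frac{1}{n} \sum_{i=1}^{n} \big( (X_i - \mu)^2 - 2(\bar X - \mu)(X_i - \mu) + (\bar X - \mu)^2 \big) \right] \\
&= E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - \frac{1}{n} \sum_{i=1}^{n} 2(\bar X - \mu)(X_i - \mu) + \frac{1}{n} \sum_{i=1}^{n} (\bar X - \mu)^2 \right] \\
&= E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - \frac{2}{n} (\bar X - \mu) \sum_{i=1}^{n} (X_i - \mu) + (\bar X - \mu)^2 \right] \tag{1}
\end{align*}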
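Subtract the true population mean from both sides of
\bar X = \frac{1}{n} \sum_{i=1}^{n} X_i and get :
\[ \bar X - \mu = \frac{1}{n} \sum_{i=1}^{n} X_i - \mu = \frac{1}{n} \sum_{i=1}^{n} X_i - \frac{1}{n} \sum_{i=1}^{n} \mu = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu) \]
\[ \therefore \; n \cdot (\bar X - \mu) = \sum_{i=1}^{n} (X_i - \mu) \]
Plug back into (1) and get :
\begin{align*}
E[S^2] &= E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - 2(\bar X - \mu)^2 + (\bar X - \mu)^2 \right] \\
&= E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 - (\bar X - \mu)^2 \right]
\end{align*}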
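\[ E[S^2] = E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 \right] - E\left[ (\bar X - \mu)^2 \right] = \sigma^2 - \operatorname{var}(\bar X) \]
From the Least Squares Assumptions, the Xi are i.i.d., and therefore
\[ \operatorname{var}(\bar X) = \operatorname{var}\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{var}(X_i) = \frac{1}{n} \sigma^2 \]
Finally we see that :
\[ E[S^2] = \sigma^2 - \frac{1}{n} \sigma^2 = \frac{n-1}{n} \sigma^2 \neq \sigma^2 \]
Thus the estimator S² is biased.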
Therefore we need an unbiased estimator of the variance, call it σ̂ 2 . Using
the definition of S 2 , we get :
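\[ E[\hat\sigma^2] = E\left[ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)^2 \right] = \frac{n}{n-1} E\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 \right] = \frac{n}{n-1} \cdot \frac{n-1}{n} \sigma^2 = \sigma^2 \]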
It should now be clear why multiplying the sample variance of the
dependent variable by n − 1 gives the TSS.
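The bias result can also be illustrated by simulation. The sketch below draws many small samples from a normal distribution with known variance and averages both estimators. The sample size, variance, number of replications, and variable names are all assumptions chosen for illustration. The averages will only be close to the theoretical values, not exactly equal to them.

```r
# Monte Carlo illustration (hypothetical settings)
set.seed(99)
n      <- 5       # small sample, so the bias is visible
sigma2 <- 4       # true population variance
reps   <- 20000   # number of simulated samples

S2         <- numeric(reps)  # "wrong" estimator, divides by n
sigma2_hat <- numeric(reps)  # unbiased estimator, divides by n - 1

for (r in seq_len(reps)) {
  x <- rnorm(n, mean = 10, sd = sqrt(sigma2))
  S2[r]         <- mean((x - mean(x))^2)
  sigma2_hat[r] <- var(x)    # var() divides by n - 1
}

mean(S2)          # should be close to (n - 1) / n * sigma2
mean(sigma2_hat)  # should be close to sigma2

# TSS identity for the last sample: (n - 1) * var(x) is the sum of squared deviations
all.equal(sum((x - mean(x))^2), (n - 1) * var(x))
```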