Uploaded by Sarthak Sachdev

Econometrics: Regression Theory - Functional Forms & R-squared

Lecture 8
Regression Theory - 3
Kriti Khanna
Plaksha University
Functional Forms: Logs
Functional Forms: Constant Elasticity Model
• Level-level: 𝑤𝑎𝑔𝑒 = 𝛽0 + 𝛽1 𝑒𝑑𝑢𝑐 + 𝑢
• Level-log: 𝑤𝑎𝑔𝑒 = 𝛽0 + 𝛽1 log(𝑒𝑑𝑢𝑐) + 𝑢
• Log-level: log(𝑤𝑎𝑔𝑒) = 𝛽0 + 𝛽1 𝑒𝑑𝑢𝑐 + 𝑢
• Log-log: log(𝑤𝑎𝑔𝑒) = 𝛽0 + 𝛽1 log(𝑒𝑑𝑢𝑐) + 𝑢
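In the log-log form, 𝛽1 is the elasticity of wage with respect to educ: a 1% increase in educ is associated with roughly a 𝛽1% change in wage, whatever the level of educ. The following is a minimal sketch of this on simulated data. The sample size, the true coefficients, and the helper name `ols` are assumptions made for illustration and do not come from the lecture.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for y on X (X must already contain a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated (hypothetical) data from a log-log model with elasticity 0.8.
rng = np.random.default_rng(0)
n = 5000
educ = rng.uniform(8, 20, size=n)
u = rng.normal(0, 0.1, size=n)
true_b0, true_b1 = 1.0, 0.8
log_wage = true_b0 + true_b1 * np.log(educ) + u

# Regress log(wage) on a constant and log(educ).
X = np.column_stack([np.ones(n), np.log(educ)])
b0_hat, b1_hat = ols(X, log_wage)

# b1_hat estimates the elasticity of wage with respect to educ.
print(f"estimated elasticity: {b1_hat:.3f}")
```

The same helper fits the other three forms by swapping `np.log(educ)` for `educ`, or `log_wage` for the wage in levels.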
Using Logs in Regression Models
• The variable must take strictly positive values, because the log is undefined at zero and
for negative values
• Changing units from, say, dollars to thousands of dollars has no effect on an
independent variable’s slope coefficient when that independent variable appears in
logarithmic form; only the intercept changes
• The log of a variable is usually taken in regression models when the range of values of the
variable is large, e.g. population, salaries, sales
• In many cases, taking the log greatly reduces the variation of a variable, making OLS
estimates less prone to outlier influence. However, in cases where y is a fraction and
close to zero for many observations, log(yi) can have much more variability than yi.
• Models with log(y) as the dependent variable often satisfy the classical linear model
assumptions more closely: the model has a better chance of being linear,
homoskedasticity is more likely to hold, and normality is often more plausible
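The point about units can be checked numerically. Since log(x/1000) = log(x) − log(1000), rescaling a logged regressor shifts the intercept by the slope times log(1000) and leaves the slope unchanged. The sketch below uses simulated data; the variable names and coefficient values are hypothetical.

```python
import numpy as np

def fit(x, y):
    """OLS of y on a constant and x; returns [intercept, slope]."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated (hypothetical) data: y depends on log(salary).
rng = np.random.default_rng(1)
n = 200
salary_dollars = rng.uniform(20_000, 200_000, size=n)
y = 2.0 + 0.5 * np.log(salary_dollars) + rng.normal(0, 0.3, size=n)

# Same regression with salary measured in dollars and in thousands of dollars.
b_dollars = fit(np.log(salary_dollars), y)
b_thousands = fit(np.log(salary_dollars / 1000.0), y)

slope_diff = b_thousands[1] - b_dollars[1]        # zero up to rounding error
intercept_shift = b_thousands[0] - b_dollars[0]   # slope * log(1000)
print(slope_diff, intercept_shift, b_dollars[1] * np.log(1000))
```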
Models with Quadratics
• A quadratic function in an explanatory variable allows the variable's marginal effect to
increase or decrease with its own level
• Example, with rooms entering a model for log(price) quadratically: at low values of rooms,
an additional room has a negative effect on log(price). At some point the effect becomes
positive, and log(price) increases as rooms increases (a U-shaped relationship).
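With a quadratic 𝛽1 x + 𝛽2 x², the marginal effect of x is 𝛽1 + 2𝛽2 x, and it changes sign at the turning point x* = −𝛽1 / (2𝛽2). A small sketch of that arithmetic follows. The coefficient values are hypothetical, chosen only to give the U shape described above, and are not the lecture's estimates.

```python
def marginal_effect(b1, b2, x):
    """Derivative of b1*x + b2*x**2 with respect to x."""
    return b1 + 2.0 * b2 * x

def turning_point(b1, b2):
    """Value of x at which the marginal effect is zero (requires b2 != 0)."""
    return -b1 / (2.0 * b2)

# Hypothetical coefficients: negative on rooms, positive on rooms squared.
b_rooms, b_rooms_sq = -0.5, 0.06

x_star = turning_point(b_rooms, b_rooms_sq)
effect_at_3 = marginal_effect(b_rooms, b_rooms_sq, 3)   # below the turning point
effect_at_6 = marginal_effect(b_rooms, b_rooms_sq, 6)   # above the turning point
print(x_star, effect_at_3, effect_at_6)
```

Under these assumed values the effect of an extra room is negative at 3 rooms and positive at 6, with the sign change a little above 4 rooms.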
Models with interaction terms
• Sometimes the partial effect of one explanatory variable on the dependent variable
depends on the magnitude of yet another explanatory variable
• If the coefficient 𝛽3 on the interaction between square footage and number of bedrooms
is positive, an additional bedroom yields a higher increase in housing price for larger houses
• In other words, there is an interaction effect between square footage and
number of bedrooms
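To make the interaction concrete, suppose the model contains the terms 𝛽2·bdrms + 𝛽3·sqrft·bdrms (an assumed form, since only 𝛽3 is named here). The effect of one more bedroom is then 𝛽2 + 𝛽3·sqrft, which grows with house size when 𝛽3 > 0. The numbers below are hypothetical.

```python
def bedroom_effect(beta2, beta3, sqrft):
    """Partial effect of one more bedroom on price at a given square footage."""
    return beta2 + beta3 * sqrft

# Hypothetical coefficients, for illustration only.
beta2, beta3 = 5000.0, 10.0

small_house = bedroom_effect(beta2, beta3, 1000)
large_house = bedroom_effect(beta2, beta3, 3000)
print(small_house, large_house)
```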
Effects of Attendance on Final Exam Performance
• Consider a model that explains the standardized outcome on a final exam (stndfnl) in terms of the
percentage of classes attended (atndrte), prior college grade point average (priGPA), and ACT score.
• The model includes an interaction between priGPA and the attendance rate. Class
attendance might have a different effect for students who have performed differently in
the past, as measured by priGPA.
• If we simply look at the coefficient on atndrte, we will incorrectly conclude that attendance has
a negative effect on final exam score. But this coefficient measures the effect only when
priGPA = 0, which is not a value of interest. The effect at a given priGPA is the coefficient on
atndrte plus the interaction coefficient times that priGPA.
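One way to read the attendance effect at a meaningful point is to evaluate it at the sample mean of priGPA, or equivalently to center priGPA inside the interaction so that the coefficient on atndrte is itself the effect at the mean. The sketch below shows the equivalence on simulated data. The data-generating values are hypothetical, and the model is a simplified version with only atndrte, priGPA, and their interaction.

```python
import numpy as np

def ols(columns, y):
    """OLS of y on a constant plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated (hypothetical) data with an atndrte x priGPA interaction.
rng = np.random.default_rng(3)
n = 2000
atndrte = rng.uniform(40, 100, size=n)
priGPA = rng.uniform(1.5, 4.0, size=n)
stndfnl = (0.5 - 0.01 * atndrte - 0.5 * priGPA
           + 0.005 * atndrte * priGPA + rng.normal(0, 0.2, size=n))

# Raw interaction: the coefficient on atndrte is the effect at priGPA = 0.
raw = ols([atndrte, priGPA, atndrte * priGPA], stndfnl)
gpa_bar = priGPA.mean()
effect_at_mean = raw[1] + raw[3] * gpa_bar

# Centered interaction: the coefficient on atndrte is the effect at mean priGPA.
centered = ols([atndrte, priGPA, atndrte * (priGPA - gpa_bar)], stndfnl)
print(raw[1], effect_at_mean, centered[1])
```

Both regressions have the same fit. Centering only changes what the coefficient on atndrte measures.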
Adjusted R-squared
Adjusted 𝑅2 = 1 − (1 − 𝑅2)(n − 1)/(n − p − 1), where
n = number of observations in the sample
p = number of independent variables
• 𝑅2 can never fall when a new independent variable is added to a regression equation: this is because
SSR never goes up (and usually falls) as more independent variables are added
• The Adjusted R-squared takes into account the number of independent variables used for predicting the
outcome variable. In doing so, we can determine whether adding new variables to the model actually
increases the model fit.
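The standard adjusted R-squared formula with n observations and p independent variables is 1 − (1 − 𝑅2)(n − 1)/(n − p − 1). A minimal sketch follows that computes both measures and adds a pure-noise regressor. The data are simulated and the function names are assumptions for illustration.

```python
import numpy as np

def r_squared(X, y):
    """R-squared from OLS of y on a constant and the columns of X."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    ssr = resid @ resid                      # sum of squared residuals
    sst = ((y - y.mean()) ** 2).sum()        # total sum of squares
    return 1.0 - ssr / sst

def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Simulated (hypothetical) data: y depends on x but not on noise.
rng = np.random.default_rng(4)
n = 100
x = rng.normal(size=n)
noise = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

r2_small = r_squared(x.reshape(-1, 1), y)
r2_big = r_squared(np.column_stack([x, noise]), y)
adj_small = adjusted_r_squared(r2_small, n, 1)
adj_big = adjusted_r_squared(r2_big, n, 2)
print(r2_small, r2_big, adj_small, adj_big)
```

R-squared cannot fall when the noise column is added, while the adjusted value may fall because of the penalty for the extra variable.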
• If R-squared increases only slightly on the addition of a new independent variable, too little
to offset the lost degree of freedom, then the value of Adjusted R-squared will actually decrease.
• On the other hand, if adding the new independent variable raises R-squared by enough, then the
Adjusted R-squared value will also increase. For a single added variable, this happens exactly
when its t statistic is greater than one in absolute value.
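For a single added variable, adjusted R-squared rises exactly when the absolute t statistic of that variable exceeds one. The sketch below checks this on repeated simulated samples. The data-generating process and the helper `fit` are hypothetical, and standard errors use the usual homoskedastic OLS formula.

```python
import numpy as np

def fit(X, y):
    """OLS with intercept; returns coefficients, t statistics, adjusted R-squared."""
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])
    k = X1.shape[1]                          # parameters, including the intercept
    xtx_inv = np.linalg.inv(X1.T @ X1)
    coef = xtx_inv @ X1.T @ y
    resid = y - X1 @ coef
    ssr = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    sigma2 = ssr / (n - k)                   # error variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    adj_r2 = 1.0 - (ssr / (n - k)) / (sst / (n - 1))
    return coef, coef / se, adj_r2

rng = np.random.default_rng(5)
n = 60
agreements = []
for _ in range(20):
    x = rng.normal(size=n)
    z = rng.normal(size=n)                   # candidate variable with a weak effect
    y = 1.0 + 2.0 * x + 0.1 * z + rng.normal(size=n)
    _, _, adj_without = fit(x.reshape(-1, 1), y)
    _, t_stats, adj_with = fit(np.column_stack([x, z]), y)
    # Compare the |t| > 1 rule with the observed change in adjusted R-squared.
    agreements.append((abs(t_stats[2]) > 1.0) == (adj_with > adj_without))

print(all(agreements))
```

Since a t statistic of one is far below conventional significance thresholds, a rise in adjusted R-squared alone does not show that the new variable is statistically significant.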