Linear Regression Assignment
Jose Fernandez
MSBA Data Analytics III
In this homework assignment, you will use linear regression to study speeding tickets. Each question builds
on the previous question. Your regressions should have more controls as you move through the assignment.
Try to capture all of these regressions in one nicely formatted table.
What determines how much drivers are fined if they are stopped for speeding? Do demographics like
age, gender, and race matter? To answer this question, we’ll investigate traffic stops and citations in
Massachusetts using data from Makowsky and Stratmann (2009). Even though state law sets a formula for
tickets based on how fast a person was driving, police officers in practice often deviate from the formula. An
amount for the fine is given only for observations in which the police officer decided to assess a fine.
a) Plot a histogram of fines. Does it look normally distributed or skewed?
b) Estimate a simple linear regression model in which the ticket amount is the dependent variable as a
function of age. Is age statistically significant?
c) What does it mean for a variable to be endogenous? Is age possibly endogenous? Please explain
your answer.
d) Estimate the model from part b), also controlling for miles per hour over the speed limit. Explain
what happens to the coefficient on age and why.
e) Is the effect of age on fines linear or non-linear? Assess this question by estimating a model with a
quadratic age term, controlling for MPHover, Female, Black, and Hispanic. Interpret the coefficients
on the age variables.
f) Sketch the relationship between age and ticket amount from the foregoing quadratic model: calculate
the fitted value for a white male with 0 MPHover (probably not many people going zero miles over
the speed limit got a ticket, but this simplifies calculations a lot) for ages equal to 20, 25, 30, 35, 40,
and 70. Use R to calculate these values and plot them.
g) Calculate the age that is associated with the lowest predicted fines. Hint: You can use calculus or
the simple formula for the minimum or maximum of a quadratic function.
h) Do drivers from out of town and out of state get treated differently? Do state police and local police
treat nonlocals differently? Estimate a model that allows us to assess whether out of towners and out
of staters are treated differently and whether state police respond differently to out of towners and out
of staters. Interpret the coefficients on the relevant variables. Hint: you have to do something
more than just including the dummy variables.
i) Test whether the two state police interaction terms are jointly significant. Briefly explain your results.
Hint: it says jointly, so it is not a t-test.
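As a reminder of the algebra behind the hints in parts g) and i), the block below writes out the quadratic specification from part e), its turning point, and the usual F statistic for a joint test. This is a sketch under stated assumptions: Amount is used here as shorthand for the assessed fine, the F statistic is the homoskedastic version, SSR_r and SSR_u denote the sums of squared residuals of the restricted and unrestricted models, q is the number of restrictions (2 in part i), and k is the number of regressors in the unrestricted model.

```latex
\begin{align*}
% Quadratic specification from part e)
\text{Amount}_i &= \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Age}_i^2
  + \beta_3 \text{MPHover}_i + \beta_4 \text{Female}_i
  + \beta_5 \text{Black}_i + \beta_6 \text{Hispanic}_i + \varepsilon_i \\
% Marginal effect of age depends on age itself
\frac{\partial \text{Amount}_i}{\partial \text{Age}_i} &= \beta_1 + 2\beta_2 \text{Age}_i \\
% Setting the marginal effect to zero gives the turning point
\text{Age}^{*} &= -\frac{\beta_1}{2\beta_2} \\
% Joint test of q restrictions (part i)
F &= \frac{(SSR_r - SSR_u)/q}{SSR_u/(n - k - 1)}
\end{align*}
```

The turning point is a minimum only when the coefficient on the squared term is positive; if it is negative, the same formula gives the age with the highest predicted fine.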
Variable Name   Description
MPHover         Miles per hour over the speed limit
                Assessed fine for the ticket
                Age of driver
Female          Equals 1 for women and 0 for men
Black           Equals 1 for African-American and 0 otherwise
Hispanic        Equals 1 for Hispanics and 0 otherwise
StatePol        Equals 1 if ticketing officer was state patrol officer and 0 otherwise
                Equals 1 if driver from out of town and 0 otherwise
                Equals 1 if driver from out of state and 0 otherwise
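
The following R sketch illustrates the mechanics of parts e) through i) on simulated data. It is not a solution and does not use the Makowsky and Stratmann sample: the data are randomly generated, the coefficients in the simulation are arbitrary, and the column names Amount, Age, OutTown, and OutState are assumptions chosen for illustration (check the names in your own data file). It uses only base R.

```r
# Simulated stand-in data; NOT the real ticket data.
# Amount, Age, OutTown, and OutState are assumed (hypothetical) column names.
set.seed(42)
n <- 2000
tickets <- data.frame(
  Age      = sample(16:80, n, replace = TRUE),
  MPHover  = sample(5:40, n, replace = TRUE),
  Female   = rbinom(n, 1, 0.4),
  Black    = rbinom(n, 1, 0.1),
  Hispanic = rbinom(n, 1, 0.1),
  StatePol = rbinom(n, 1, 0.3),
  OutTown  = rbinom(n, 1, 0.5)
)
# In this simulation, out-of-state drivers are a subset of out-of-town drivers.
tickets$OutState <- tickets$OutTown * rbinom(n, 1, 0.3)
# Arbitrary data-generating process, chosen only so the example runs.
tickets$Amount <- with(tickets,
  60 + 7 * MPHover - 1.0 * Age + 0.01 * Age^2 +
  5 * OutTown + 5 * OutState + 10 * StatePol * OutState +
  rnorm(n, sd = 20))

# Part e): quadratic in age with demographic controls.
m_quad <- lm(Amount ~ Age + I(Age^2) + MPHover + Female + Black + Hispanic,
             data = tickets)

# Part f): fitted values for a white male with 0 MPHover at selected ages.
new_drivers <- data.frame(Age = c(20, 25, 30, 35, 40, 70),
                          MPHover = 0, Female = 0, Black = 0, Hispanic = 0)
new_drivers$fit <- predict(m_quad, newdata = new_drivers)
plot(new_drivers$Age, new_drivers$fit, type = "b",
     xlab = "Age", ylab = "Predicted fine")

# Part g): turning point of the quadratic, -b1 / (2 * b2).
b <- coef(m_quad)
age_star <- unname(-b["Age"] / (2 * b["I(Age^2)"]))

# Part h): StatePol * (OutTown + OutState) expands to the three dummies
# plus the StatePol:OutTown and StatePol:OutState interactions.
m_int <- lm(Amount ~ Age + I(Age^2) + MPHover + Female + Black + Hispanic +
              StatePol * (OutTown + OutState), data = tickets)

# Part i): joint F test, comparing against the model without the interactions.
m_restricted <- lm(Amount ~ Age + I(Age^2) + MPHover + Female + Black +
                     Hispanic + StatePol + OutTown + OutState, data = tickets)
joint <- anova(m_restricted, m_int)
print(joint)
```

The second row of the anova() table reports the F statistic and p-value for the two interaction terms taken together. Comparing nested models this way assumes homoskedastic errors, so a robust test may be preferable on the real data.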