Uploaded by Emmanuel Nkansah


I use the bwght (birth weight) data set from Wooldridge's textbook. These are cross-sectional,
individual-level data on birth weights, used to examine how cigarette smoking during pregnancy
affects the weight of newborns, controlling for other factors. The data set contains 1,388
observations, of which 1,191 are used in the final estimation sample.
(I & II)
The dependent variable is the log of birth weight (log of bwght). The explanatory variables are
the log of family income (lfaminc), father's years of education (fatheduc), mother's years of
education (motheduc), birth order of the child (parity), male (1 if the child is male), white
(1 if White), cigarette tax (cigtax), and packs smoked per day during pregnancy (packs).
The endogenous variable is packs. The instrument is the average price of cigarettes (cigprice)
in each woman’s state of residence.
Table 1: First stage regression
Dependent variable: Packs
For an instrument to be valid, it must be relevant (statistically significant in the first-stage
regression) and excludable (insignificant in the second-stage regression). Table 1 shows that
the instrument, the average price of cigarettes, is not significant in the first stage: its
p-value (0.7653) is well above the 0.05 significance level. Cigprice therefore fails as an IV
for packs because it is not partially correlated with packs once the other covariates are
controlled for.
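The relevance check described above can be sketched in Python. This is only a minimal illustration on simulated data, not the bwght sample: it uses a single instrument and no other covariates, and the function name first_stage, the true slope of 0.5, and the normal approximation used for the p-value are all assumptions made for the example.

```python
import math
import random


def first_stage(z, x):
    """Regress x on a constant and z; return slope, standard error, t, p."""
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    slope = szx / szz
    intercept = x_bar - slope * z_bar
    # Residual variance with n - 2 degrees of freedom (constant + slope).
    rss = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    se = math.sqrt(rss / (n - 2) / szz)
    t_stat = slope / se
    # Two-sided p-value from the normal approximation to the t distribution.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return slope, se, t_stat, p_value


# Simulated data (hypothetical): z shifts x with a true slope of 0.5.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(1000)]
x = [0.5 * zi + rng.gauss(0, 1) for zi in z]

slope, se, t_stat, p_value = first_stage(z, x)
print(f"slope={slope:.3f}  se={se:.3f}  t={t_stat:.2f}  p={p_value:.4f}")
print("Relevant instrument" if p_value < 0.05 else "Weak or irrelevant instrument")
```

In this simulation the instrument is relevant by construction, so the p-value falls below 0.05. An instrument like cigprice in Table 1, with a p-value of 0.7653, would fall into the second branch.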
Can the instrument be excluded from the second stage?
Table 2: Regression Results
The p-value on the instrument (cigprice) is 0.8199, which is above 0.05. The instrument is
therefore insignificant, which is consistent with the second requirement that the IV be
insignificant in the second-stage regression equation.
Thus, the instrument can be excluded from the second-stage outcome regression.
Table 3: Regression Results
The p-value (0.8217) is above 0.05, so we fail to reject the null hypothesis that packs is
exogenous: the OLS estimate on the potentially endogenous covariate (packs) is not
statistically different from the IV estimate.
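One common way to compare OLS and IV estimates is the regression-based Durbin-Wu-Hausman test: add the first-stage residuals to the outcome regression and test their coefficient. The sketch below assumes one endogenous regressor, one instrument, and no other covariates. Whether Table 3 was produced with this exact form of the test is not stated, and all names and parameter values here are hypothetical. The simulated data build in endogeneity on purpose, so the test rejects here, which is the opposite of the result reported in Table 3.

```python
import math
import random


def ols_simple(x, y):
    """OLS of y on a constant and x; return the slope and the residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return slope, resid


def iv_simple(z, x, y):
    """Simple IV estimator: cov(z, y) / cov(z, x)."""
    z_bar = sum(z) / len(z)
    szy = sum((zi - z_bar) * yi for zi, yi in zip(z, y))
    szx = sum((zi - z_bar) * xi for zi, xi in zip(z, x))
    return szy / szx


def dwh_test(y, x, z):
    """t-test on first-stage residuals added to the outcome regression."""
    n = len(y)
    _, v_hat = ols_simple(z, x)  # first-stage residuals
    # Coefficient on v_hat in y ~ 1 + x + v_hat via Frisch-Waugh-Lovell:
    # partial the constant and x out of both y and v_hat.
    _, y_tilde = ols_simple(x, y)
    _, v_tilde = ols_simple(x, v_hat)
    svv = sum(v * v for v in v_tilde)
    delta = sum(v * e for v, e in zip(v_tilde, y_tilde)) / svv
    rss = sum((e - delta * v) ** 2 for v, e in zip(v_tilde, y_tilde))
    se = math.sqrt(rss / (n - 3) / svv)
    t_stat = delta / se
    # Two-sided p-value from the normal approximation.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, p_value


# Simulated data (hypothetical): x is endogenous because v enters both x and y.
rng = random.Random(1)
beta = -0.1
z, x, y = [], [], []
for _ in range(4000):
    zi, vi, ei = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    xi = zi + vi
    z.append(zi)
    x.append(xi)
    y.append(1.0 + beta * xi + 0.8 * vi + ei)

b_ols, _ = ols_simple(x, y)
b_iv = iv_simple(z, x, y)
t_stat, p_value = dwh_test(y, x, z)
print(f"OLS={b_ols:.3f}  IV={b_iv:.3f}  t={t_stat:.2f}  p={p_value:.4f}")
```

A large p-value, as in Table 3, means the residual term adds nothing to the outcome regression, so OLS and IV agree up to sampling error.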
Assignment on Fixed Effects and Random Effects
These data come from the Panel Study of Income Dynamics (PSID). They cover the years 1976 to
1982 and include demographic and earnings information on 595 people. The variables are:
Id = Unique identifier for each survey respondent
T = time (1 through 7)
Wks = Weeks worked in the past year
lwage = Natural logarithm of earnings in the past year
Ms = binary indicator for whether the respondent is married (1 = married)
Occ = binary indicator for whether the respondent is a blue-collar (= 0) or white-collar (= 1) worker
Ind = binary indicator for whether the respondent works in manufacturing (= 1)
South = binary indicator for whether the respondent lives in the South (= 1)
Smsa = binary indicator for whether the respondent lives in a standard metropolitan statistical area (SMSA; = 1)
Fem = binary indicator for whether the respondent is female (= 1)
blk = binary indicator for whether the respondent is African-American (= 1)
ed = years of education
exp = years in the workforce.
Random Effects
The Hausman test indicates that the fixed-effects estimates differ systematically from the
random-effects estimates. The null hypothesis is that random effects is the preferred model;
the alternative hypothesis is that fixed effects is the preferred model.
Since the p-value (0.000) is below 0.05, we reject the null hypothesis and conclude that fixed
effects is the preferred model.
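The decision rule above rests on the Hausman statistic, which scales the gap between the fixed-effects and random-effects estimates by the difference in their variances. The sketch below covers only the simplest case of a single compared coefficient, where the statistic is chi-square with one degree of freedom. The function name hausman_one_coef and the input numbers are hypothetical and are not taken from the PSID output.

```python
import math


def hausman_one_coef(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value when a single coefficient is compared."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("Var(b_FE) - Var(b_RE) must be positive for the test.")
    h = (b_fe - b_re) ** 2 / var_diff
    # With one coefficient, H is chi-square(1) under the null, so the
    # upper-tail probability equals erfc(sqrt(H / 2)).
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value


# Hypothetical estimates, for illustration only.
h, p = hausman_one_coef(b_fe=0.10, b_re=0.05, var_fe=0.0004, var_re=0.0001)
decision = "fixed effects" if p < 0.05 else "random effects"
print(f"H = {h:.2f}, p = {p:.4f}, preferred model: {decision}")
```

With several coefficients the same idea applies in matrix form, using the difference of the two coefficient vectors and the inverse of the difference of their covariance matrices, with degrees of freedom equal to the number of coefficients compared.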
Fixed Effects