Problem Set 1
Troubleshooting
Log Files
Save in text format for readability: log using ps1.log, replace
or: log using ps1, text
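A minimal do-file skeleton, assuming the log is named ps1 as above. The `capture log close` line is an addition here, not part of the slide: it closes any log left open by an earlier, interrupted run so that `log using` does not stop with an error.

```stata
* Close any log still open from a previous run (no error if none is open)
capture log close

* Open a plain-text log; replace overwrites an existing ps1.log
log using ps1.log, text replace

* ... analysis commands for the problem set go here ...

* Close the log so the file is written out completely
log close
```

Putting `log close` at the very end of the do-file means everything between the two log commands ends up in ps1.log.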
Handling Missing Values
- By default, Stata excludes all observations marked with a period (.) from subsequent statistical analysis.
- Best practices:
- Recode appropriate survey responses to missing.
Safest: replace v1 = . if v1 == 6
- Do not drop observations with missing values.
- Be careful not to recode missing values by accident.
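Before recoding anything, it can help to see how many values are already missing. The sketch below uses Stata's shipped auto dataset and its variable rep78 purely for illustration; they are assumptions standing in for the problem-set data.

```stata
* Illustration on Stata's built-in example data, not the problem-set data
sysuse auto, clear

* Count observations where rep78 is missing
count if missing(rep78)

* summarize silently skips missing values; compare its N with the total
summarize rep78
display "used " r(N) " of " _N " observations"

* Show the missing category explicitly in a frequency table
tabulate rep78, missing
```

Comparing `r(N)` with `_N` is a quick way to confirm that observations were excluded from a calculation rather than dropped from the data.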
Problematic (observations with missing v1 are coded 0 instead of missing):
gen dummy1 = 0
replace dummy1 = 1 if v1 == 4 | v1 == 5
Safe:
gen dummy1 = .
replace dummy1 = 1 if v1 == 4 | v1 == 5
replace dummy1 = 0 if v1 <= 3
or:
gen dummy1 = 1 if v1 == 4 | v1 == 5
replace dummy1 = 0 if v1 <= 3
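One way to check a recode like this is to cross-tabulate the new dummy against the original variable. The sketch below builds a small hypothetical dataset (the six values of v1 are invented for illustration) and verifies that missing v1 stays missing in dummy1.

```stata
* Hypothetical toy data: five valid responses and one missing value
clear
input v1
1
2
3
4
5
.
end

* Safe recode: start from missing, then fill in both categories explicitly
gen dummy1 = .
replace dummy1 = 1 if v1 == 4 | v1 == 5
replace dummy1 = 0 if v1 <= 3

* Cross-tabulate, including missing values, to inspect the result
tabulate v1 dummy1, missing

* Stop with an error if any missing v1 was turned into a 0 or 1
assert missing(dummy1) if missing(v1)
```

The `assert` line makes the do-file stop if the recode ever assigns a value to an observation with missing v1, which is easier to notice than a wrong cell in a table.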
Stata treats missing (.) as larger than any numeric value.
Will recode missing observations: replace dummy1 = 1 if v1 > 6
Safe: replace dummy1 = 1 if v1 > 6 & v1 != .
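The ordering can be seen directly on a small hypothetical dataset. The sketch also includes an extended missing value (.a), which is an addition not covered on the slide: Stata sorts .a through .z above the plain period, so `v1 != .` does not exclude them, while `!missing(v1)` does.

```stata
* Hypothetical toy data: two valid values, one "." and one extended missing ".a"
clear
input v1
5
7
.
.a
end

* Unprotected comparison: counts 7, "." and ".a" (3 observations)
count if v1 > 6

* Excludes only the plain "."; ".a" still passes (2 observations)
count if v1 > 6 & v1 != .

* Excludes every kind of missing value (1 observation)
count if v1 > 6 & !missing(v1)
```

If the data contain only plain periods, the two safe forms give the same result; `!missing(v1)` (or equivalently `v1 < .`) is the more defensive choice when extended missing codes may be present.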
Optional: PS2 Time Saver
Stata supports loops:
foreach x of numlist 1800 1850 1870 1900 1920 1950 1970 1975 {
    sum spending if year==`x', detail
}
foreach x of varlist gdpcap pop taxhead race* {
    gen log_`x' = log(`x')
    sum `x', detail
    hist `x', name(`x')
}
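A runnable version of the same pattern, sketched on Stata's shipped auto dataset because the problem-set variables are not available here; the variables price, weight and length and the graph names h_price etc. are assumptions for illustration. Note that the loop variable is written with a backtick on the left and a straight apostrophe on the right, the opening brace sits on the `foreach` line, and the closing brace gets a line of its own.

```stata
* Illustration on Stata's built-in example data, not the problem-set data
sysuse auto, clear

foreach x of varlist price weight length {
    * Create the logged variable, e.g. log_price
    gen log_`x' = log(`x')
    * Detailed summary statistics of the logged variable
    summarize log_`x', detail
    * Histogram kept in memory under its own name; replace allows rerunning
    histogram log_`x', name(h_`x', replace)
}
```

Adding `replace` inside `name()` lets the do-file be run repeatedly without an error about the graph name already existing.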