Advanced Stata Programming
Andrew Hicks, CCPR Statistics and Methods Core

Automating your work
1. Macros
2. Saved Results
3. Loops
4. Egen commands
5. Writing programs
6. Ado files

Macros
A macro assigns a string of text or a number to an abbreviation (a box you put text in).
    local name content
    local x 1
    display `x'
(name and content stand for something you provide.)

Macros
    local nobs = 198
    display "The number of observations equals: `nobs'"
    display `nobs' - 90
Local macros can be used only within the do-file or program in which they are defined. When it finishes running, the macro disappears.
    local x 1
Global macros persist until you delete them or exit Stata.
    global x 1

Macros
Writing the variable list out each time:
    regress price headroom trunk weight length
Storing the list in a macro and reusing it:
    local size "headroom trunk weight length"
    regress price `size'
    regress mpg `size'
    regress gear_ratio `size' foreign turn

Results saved by Stata
Stata saves the numbers it computes, so you don't have to enter them by hand.
    summarize price
    return list
    generate price_centered = price - 6165.257
    generate price_centered2 = price - r(mean)

Loops
Center 6 variables from the Auto dataset: price, mpg, weight, length, turn, displacement
Manually:
    summarize price
    generate price_centered = price - r(mean)
    summarize mpg
    generate mpg_centered = mpg - r(mean)
    summarize weight
    generate weight_centered = weight - r(mean)
    summarize length
    generate length_centered = length - r(mean)
    summarize turn
    generate turn_centered = turn - r(mean)
    summarize displacement
    generate displacement_centered = displacement - r(mean)

Foreach Loops
Center 6 variables from the Auto dataset: price, mpg, weight, length, turn, displacement
Loop (foreach):
    foreach var in price mpg weight length turn displacement {
        summarize `var'
        generate `var'_centered2 = `var' - r(mean)
    }
The loop runs the two commands once per variable:
    1.
        summarize price
        generate price_centered2 = price - r(mean)
    2.
        summarize mpg
        generate mpg_centered2 = mpg - r(mean)
    .
    .
    6.
        summarize displacement
        generate displacement_centered2 = displacement - r(mean)

Foreach Loops
Loop (foreach):
    foreach depvar in weight length turn displacement {
        regress `depvar' price mpg headroom trunk
    }
    foreach depvar of varlist weight-displacement {
        regress `depvar' price mpg headroom trunk
    }
    foreach item in quest1 quest2 quest3 quest4 {
        replace `item' = . if `item' == 99
    }
    foreach item of varlist quest* {
        replace `item' = . if `item' == 99
    }

Foreach Loops
Create your own foreach loop to replace these commands:
    generate taxinc1 = inc1 * .10
    generate taxinc2 = inc2 * .10
    generate taxinc3 = inc3 * .10
    generate taxinc4 = inc4 * .10
    generate taxinc5 = inc5 * .10
    generate taxinc6 = inc6 * .10
    generate taxinc7 = inc7 * .10
    generate taxinc8 = inc8 * .10
    generate taxinc9 = inc9 * .10
    generate taxinc10 = inc10 * .10
    generate taxinc11 = inc11 * .10
    generate taxinc12 = inc12 * .10
Template:
    foreach lname in list {
        commands referring to lname
    }

Forvalues Loops
    gen hadInc1990 = (inc1990>0) if inc1990<.
    gen hadInc1991 = (inc1991>0) if inc1991<.
    .
    .
    gen hadInc2009 = (inc2009>0) if inc2009<.
    gen hadInc2010 = (inc2010>0) if inc2010<.

Forvalues Loops
Loop (forvalues):
    forvalues year=1990/2010 {
        gen hadInc`year' = (inc`year'>0) if inc`year'<.
    }
    forvalues year=1990(2)2010 {
        gen hadInc`year' = (inc`year'>0) if inc`year'<.
    }
    foreach year of numlist 1980 1983 1990 {
        gen hadInc`year' = (inc`year'>0) if inc`year'<.
    }
    forvalues race=1/3 {
        svy, subpop(if race==`race'): regress income age i.education
    }
    levelsof race, local(races)
    foreach race of local races {
        svy, subpop(if race==`race'): regress income age i.education
    }

Forvalues Loops
Create your own forval loop to replace these commands:
    generate taxinc1 = inc1 * .10
    generate taxinc2 = inc2 * .10
    generate taxinc3 = inc3 * .10
    generate taxinc4 = inc4 * .10
    generate taxinc5 = inc5 * .10
    generate taxinc6 = inc6 * .10
    generate taxinc7 = inc7 * .10
    generate taxinc8 = inc8 * .10
    generate taxinc9 = inc9 * .10
    generate taxinc10 = inc10 * .10
    generate taxinc11 = inc11 * .10
    generate taxinc12 = inc12 * .10
Template:
    forvalues lname = range {
        commands referring to lname
    }

Nested Loops
    forval i=1/3 {
        forval j=1/3 {
            display "`i', `j'"
        }
    }

Nested Loops
    forval year = 1990/2010 {
        foreach month in Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec {
            gen hadInc`month'`year' = (inc`month'`year'>0) if inc`month'`year'<.
        }
    }

Loops with counter
    local counter = 0
    foreach var in price mpg weight length turn displacement {
        local ++counter
        display "`counter' - `var'"
    }
Two equivalent ways to increment the counter:
    local counter = `counter' + 1
    local ++counter

While Loops
foreach and forvalues loops repeat a command a set number of times:
    forval i=1/5 {
        display `i'
    }
while loops repeat until a condition is no longer true:
    local i 1
    while `i'<=5 {
        display `i++'
    }

Egen commands
Take the average:
    generate mean_score = (read + write + math + science)/4
But if any single score is missing, the mean score will be missing for that observation.
Solution (three equivalent ways to list the variables; run only one):
    egen mean_score = rowmean(read write math science)
    egen mean_score = rowmean(score1 score2 score3 score4)
    egen mean_score = rowmean(score*)
Other row functions work the same way: rowtotal(score*), rowmax(score*), rowmin(score*)

Egen commands
rowmean( ) averages across variables
mean( ) averages across observations
    egen mean_read = mean(read)
    egen schooltype_mean_read = mean(read), by(schtyp)

Writing programs
A simple program:
    program define hello
        di "Hello World"
    end
A more complicated program:
    capture program drop hello
    program define hello
        di "Hello `1'"
    end
An even more complicated program:
    capture program drop hello
    program define hello
        di "Hello `0'"
    end

Writing programs
A program to center variables:
    program define demean
        foreach var of local 0 {
            quietly: sum `var'
            replace `var' = `var' - r(mean)
        }
    end

Writing programs
A program to center variables, skipping anything that is not numeric:
    capture program drop demean
    program define demean
        foreach var of local 0 {
            capture confirm numeric variable `var'
            if _rc==0 {
                sum `var', meanonly
                replace `var' = `var' - r(mean)
            }
            else {
                di "`var' is not a numeric variable and cannot be demeaned."
            }
        }
    end

Ado files
How to make an ado file:
1. Write a program in a do-file
2. Save the do-file with an .ado extension, named after the program: demean.ado
3. Put the .ado file in your personal ado directory. The sysdir command shows where that is:
       sysdir
       PERSONAL: c:\ado\personal\
4. Use your ado program just like any other Stata command
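
As a minimal sketch of step 4: assuming the numeric-checking version of demean above has been saved as demean.ado in the PERSONAL directory, a session might look like the following. The choice of the Auto dataset and of these particular variables is an illustrative assumption, not part of the original slides.

```stata
* Confirm where Stata looks for personal ado files
sysdir

* Load the example dataset that ships with Stata
sysuse auto, clear

* Call demean like a built-in command.
* Note: it overwrites the variables in place.
demean price mpg weight

* The centered variables should now have means of approximately zero
summarize price mpg weight

* make is a string variable, so demean prints its message and skips it
demean make
```

Because demean replaces the original values, it may be safer to run it on a copy of the data or on variables cloned for the purpose.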