Computing for Research I
Spring 2014
Stata Programming
March 5
Primary Instructor:
Elizabeth Garrett-Mayer
Some simple programming
• Once again, Princeton’s site has some great, easy
info:
http://data.princeton.edu/stata/programming.aspx
• We will discuss a few things:
– ‘macros’
– looping
– writing commands
• We will not discuss Mata, Stata’s powerful matrix
programming language
macros
• macro = a name associated with some text.
• macros can be local or global in scope.
• Example of use: shorthand for repeated
phrase
– graphics title
– set of ‘adjustment’ covariates
• syntax: local name content
– local: the command
– name: the name of the macro
– content: the “guts” of the macro
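A small sketch of the syntax above, run from a do-file. The macro names sum1 and sum2 are made up for illustration. It shows the difference between storing text as-is and storing the result of an expression (the form with the equals sign):

```stata
* store the text 2+2 exactly as typed
local sum1 2+2
* with an equals sign, Stata evaluates the expression first
local sum2 = 2+2

display "`sum1'"    // shows the text 2+2
display "`sum2'"    // shows 4
display `sum1'      // display sees 2+2 and evaluates it, so shows 4
```

The text form is what you want for lists of covariates or options; the equals form is what you want for saving a number.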
Example: covariates
* use SCBC data
use "I:\Classes\StatComputingI\SCBC2004.dta", clear
* make tumor numeric and transform
gen sizen=real(tumor)
gen logsize = log(sizen)
replace logsize = . if sizen==999
regress logsize age black graden
*define local macro
local adjusters age black graden
regress logsize `adjusters'
NOTE: a macro reference must begin with the accent (`), at the upper left
of the keyboard, and end with the straight apostrophe ('), next to the
Enter key.
regress logsize `adjusters' i.ercat
regress logsize `adjusters' i.prcat
regress logsize `adjusters' i.ercat i.prcat
More examples
local erprknown ercat<9 & prcat<9
regress logsize `adjusters' i.ercat i.prcat if `erprknown'
• An important property of local macros, and the reason
they are called "local", is that they exist only within the
do-file or program in which they were defined
• This means when you highlight and run from a ‘do’ file, every
local you use must be defined within the highlighted
portion.
• Stata will NOT remember locals defined in earlier calls to
the do file!
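One way to see this scoping, sketched with a throwaway program (the name showlocal is hypothetical, not part of the course materials). The local is defined in the do-file, so the program cannot see it:

```stata
* define a local in the do-file
local adjusters age black graden

* a program has its own private set of locals
capture program drop showlocal
program define showlocal
    * `adjusters' was never defined inside this program, so it is empty here
    display "inside program:  [`adjusters']"
end

showlocal
display "outside program: [`adjusters']"
```

Run all of these lines together: the first display shows empty brackets and the second shows the covariate list.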
Example: titles
* another example
infile str14 country setting effort change ///
using http://data.princeton.edu/wws509/datasets/effort.raw, clear
graph twoway (lfitci change setting) ///
(scatter change setting) ///
, title("Fertility Decline by Social Setting") ///
ytitle("Fertility Decline") ///
legend(ring(0) pos(5) order(2 "linear fit" 1 "95% CI"))
local gtitles title("Fertility Decline by Social Setting") ytitle("Fertility Decline")
* with macro
graph twoway (lfitci change setting) ///
(scatter change setting) ///
, `gtitles' legend(ring(0) pos(5) order(2 "linear fit" 1 "95% CI"))
* without macro
graph twoway (lfitci change setting) ///
(scatter change setting) ///
, legend(ring(0) pos(5) order(2 "linear fit" 1 "95% CI"))
Storing results
• Stata commands (and new commands that you and
others write) can be classified as follows:
– r-class: General commands such as summarize.
Results are returned in r() and generally must be
used/saved before executing more commands.
– e-class: Estimation commands such as regress,
logistic etc., that fit statistical models. Results
are returned in e() and remain there until the
next model is estimated.
– s-class: Programming commands that assist in
parsing. These commands are relatively rare.
Results are returned in s().
– n-class: Commands that do not save results at all,
such as generate and replace.
– c-class: Values of system parameters and settings
and certain constants (such as the value of π)
which are contained in c().
Accessing returned values
• return list, ereturn list, sreturn
list and creturn list return all the values
contained in the r(), e(), s() and c()
vectors, respectively.
• For example, after using summarize, r() will
contain r(N), r(mean), r(sd), r(sum) etc.
• Elements of each of the vectors can be used
when creating new variables. They can also be
saved as macros.
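A minimal sketch of using returned values, assuming the auto example dataset that ships with Stata (it is not the dataset used elsewhere in these slides). The names mpgmean and mpg_centered are made up for illustration:

```stata
sysuse auto, clear

* summarize is r-class: results land in r()
summarize mpg
return list

* save a returned value as a macro before another r-class command replaces it
local mpgmean = r(mean)

* use a returned value directly when creating a new variable
gen mpg_centered = mpg - r(mean)

display "mean mpg = " %6.2f `mpgmean'
```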
Using regression results
Although coefficients and standard errors from
the most recent model are saved in e(), it is
quicker to refer to them by using _b[varname]
and _se[varname], respectively.
regress change setting effort
gen fitvals = setting*_b[setting] + effort*_b[effort] ///
    + _cons*_b[_cons]
predict fit
Storing results
* run regression and store r-squared value
regress change setting
local rsq = e(r2)
* a local must be referenced in quotes: `rsq'
display `rsq'
* run new regression
regress change setting effort
display e(r2)
* see old saved r-squared
display `rsq'
* still there if you run it ALL in the same call to do file
Saving matrix results
matrix list e(b)
matrix list e(V)
matrix betamodel1=get(_b)
matrix list betamodel1
* help matrix get
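Once the results are saved as matrices, individual elements can be pulled out by position. A sketch, again assuming the auto example dataset that ships with Stata; the matrix and scalar names (b, V, bmpg, sempg) are made up for illustration:

```stata
sysuse auto, clear
regress price mpg weight

* copy the coefficient vector and its variance matrix
matrix b = e(b)
matrix V = e(V)
matrix list b

* find the mpg column by name rather than assuming its position
local col = colnumb(b, "mpg")
scalar bmpg  = b[1,`col']
scalar sempg = sqrt(V[`col',`col'])

* these should agree with _b[mpg] and _se[mpg]
display "coefficient: " bmpg  "   _b[mpg]:  " _b[mpg]
display "std. error:  " sempg "   _se[mpg]: " _se[mpg]
```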
Global macros
• Global macros have names of up to 32 characters and, as the name
indicates, have global scope.
• You define a global macro using
global name [=] text
and evaluate it using $name. (You may need to use ${name} to
clarify where the name ends.)
• “I suggest you avoid global macros because of the potential for
name conflicts.”
• A useful application, however, is to map the function keys on your
keyboard. If you work on a shared network folder with a long name
try something like this:
global F5 \\server\shared\research\project\subproject\
• Then when you hit F5, Stata will substitute the full name. And your
do files can use commands like do ${F5}dofile. (We need the
braces to indicate that the macro is called F5, not F5dofile.)
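A small sketch of why the braces matter. The macro name outdir and the folder name results are hypothetical:

```stata
global outdir results

* plain $name works when the name is followed by a character
* that cannot be part of a macro name
display "$outdir/table1.txt"

* without braces Stata looks for a macro called outdir_v2, which is empty
display "$outdir_v2/table1.txt"

* braces mark where the name ends
display "${outdir}_v2/table1.txt"

* clean up, since globals stay around for the whole session
macro drop outdir
```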
More on macros
• Macros can also be used to obtain and store
information about the system or the variables in
your dataset using extended macro functions.
• For example, you can retrieve variable and value
labels, a feature that can come in handy in
programming.
• There are also commands to manage your
collection of macros, including macro list and
macro drop. Check out help macro to learn
more.
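A sketch of extended macro functions for labels, assuming the auto example dataset that ships with Stata. The local names (lbl, vlname, first) are made up for illustration:

```stata
sysuse auto, clear

* variable label of mpg
local lbl : variable label mpg
display "`lbl'"

* name of the value label attached to foreign, then the text for value 0
local vlname : value label foreign
local first : label `vlname' 0
display "foreign uses value label `vlname'; 0 means `first'"

* show every macro currently defined
macro list
```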
Looping
• foreach: loops over the items in a list (for example, a set of variables)
• forvalues: loops over consecutive values of a numeric
index
• Also:
– while loops
– if and else sets of commands
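A sketch of each construct, assuming the auto example dataset that ships with Stata. In every loop the index is a local macro, so it is referenced with the accent and apostrophe:

```stata
sysuse auto, clear

* foreach over a list of variables
foreach v of varlist price mpg weight {
    quietly summarize `v'
    display "`v': mean = " %9.2f r(mean)
}

* forvalues over a range of numbers
forvalues i = 1/3 {
    display "i = `i', i squared = " `i'^2
}

* while loop with if / else inside
local i = 1
while `i' <= 3 {
    if mod(`i', 2) == 0 {
        display "`i' is even"
    }
    else {
        display "`i' is odd"
    }
    local i = `i' + 1
}
```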
Programming
• ‘ado’ files
• create commands in ado file and put them in
the appropriate directory for Stata to find
• Can also create them in do files for local use
• See
– http://data.princeton.edu/stata/programming.html
– www.ssc.upenn.edu/scg/stata/stata-programming-1.ppt
– http://www.ssc.wisc.edu/sscc/pubs/stata_prog2.htm
Ado files
• An ado-file (“automatic do-file”) is a do-file that
defines a Stata command. It has the file extension
.ado.
• Not all Stata commands are defined by ado-files:
some are built-in commands.
• The difference between a do-file and an ado-file
is that when the name of the latter is typed as a
Stata command, Stata will search for and run that
file.
• For example, the program mysum could be saved
in mysum.ado and used in future sessions.
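The slides do not show the body of mysum, so the following is only a hypothetical sketch of what such a program might look like: a small r-class command that reports the count and mean of one variable. It is tried out on the auto example dataset that ships with Stata.

```stata
* drop any earlier definition so the do-file can be rerun
capture program drop mysum
program define mysum, rclass
    * accept exactly one variable name; syntax stores it in `varlist'
    syntax varname
    quietly summarize `varlist'
    display "`varlist': n = " r(N) ", mean = " %9.3f r(mean)
    * hand the results back to the caller in r()
    return scalar N = r(N)
    return scalar mean = r(mean)
end

sysuse auto, clear
mysum mpg
return list
```

Saved on its own as mysum.ado (without the last three lines) in a folder Stata searches, the command would be available in later sessions without rerunning the definition.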
Ado files
• Ado-files often have help files (.sthlp, or .hlp in older versions)
associated with them.
• There are three main sources of ado-files:
– Official updates from StataCorp.
– User-written additions (e.g. from the Stata
Journal).
– Ado-files that you have written yourself.
• Stata stores these in different locations, which can be
reviewed by typing sysdir.
Ado files
• Official updates are saved in the folder associated
with UPDATES.
• User-written additions are saved in the folder
associated with PLUS.
• Ado-files written by yourself should be saved in
the folder associated with PERSONAL.
• If you have an Internet connection, official updates and
user-written ado-files can be installed easily.
• To install official updates, type:
update from http://www.stata.com
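To check where Stata looks for ado-files on your own machine, something like the following can be run. It only reports information and installs nothing; the folders shown will differ between installations.

```stata
* list the system directories (PLUS, PERSONAL, and so on)
sysdir

* show the order in which Stata searches for ado-files
adopath

* report whether a command is built in or defined by an ado-file
which regress
which summarize

* the PERSONAL folder is also available as a c-class value
display "`c(sysdir_personal)'"
```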