Tricks in Stata
Anke Huss

Generating "automatic" tables in a do-file

Why program tables?
• It takes much more writing in the do-file!
• BUT: once you have done it, the next table will be faster (copy & paste...)
• No more trouble when your data are updated
• No more copying mistakes, because Stata does it for you

(Picture: Caerphilly castle)
Data used: Caerphilly Prospective Study (CAPS)
Download at: www.blackwellpublishing.com/essentialmedstats/datasets.htm

Basic idea
• Use the Stata data sheet for your table-to-be

    illn        %
    MI          19.48
    diabetes    1.85

Stored results in r() and e()
• Use stored results, usually from:
  r-class: results of general commands such as summarize are saved in r() and generally must be used before executing more commands. For an overview type: return list
  e-class: results of estimation commands (regress/logistic...) are saved in e() until the next model is fitted. Overview: ereturn list

Steps
1. DESIGN TABLE FIRST: what do I want my table to look like?
2. Generate a new variable for each column
3. Replace each cell with the number of interest
4. Use outsheet to write your new variables to a text/Excel file

Example 1
1. DESIGN FIRST: what do I want my table to look like? E.g.:

    Illness           %
    Myocardial inf    19.48
    diabetes          1.85

Example 1
2. Generate a new variable for each column

    gen str illness = ""
    gen percent = .

    Illness           %

Example 1
3. Replace cell with contents/number of interest: first column

    sort id
    replace illness = "myocardial inf" in 1
    replace illness = "diabetes" in 2

    Illness           %
    Myocardial inf
    diabetes

Example 1
3. Replace cell with contents/number of interest: second column

    sum mi
    sort id
    replace percent = r(mean)*100 in 1

    sum diabetes
    sort id
    replace percent = r(mean)*100 in 2
    format percent %9.2f

    Illness           %
    Myocardial inf    19.48
    diabetes          1.85

Example 1
4. Use outsheet to write your new variables to a text/Excel file

    outsheet illness percent in 1/2 using textres/illns.txt

For further...
*comment 1: this works only if you have set Stata to work in a specific folder. E.g.:

    cd "d:/Statistisches/automatic_tables/STATA"

*comment 2: you can also export as an Excel file (*.xls), but automatic import of the new text file lets graphics survive...

Example 1
*Alternative way to do the same: program a small loop. The data are sorted once before the loop so that every replace in the loop hits the same rows.

    gen str name = ""
    gen percent = .
    sort id
    local i = 1
    foreach var of varlist mi diabetes {
        replace name = "`var'" in `i'
        sum `var'
        replace percent = r(mean)*100 in `i'
        local i = `i' + 1
    }
    format percent %9.2f

Example 2
1. DESIGN TABLE FIRST:

    Category       percent
    underweight    4.20
    normal         32.03
    overweight     51.29
    obese          12.49

Example 2
2. Generate a new variable for each column

    gen str category = ""
    gen percent = .

    Category       percent

Example 2
3. Replace cell with contents/number of interest: first column

    sort id
    replace category = "underweight" in 1
    replace category = "normal" in 2
    replace category = "overweight" in 3
    replace category = "obese" in 4

    Category       percent
    underweight
    normal
    overweight
    obese

Example 2
3. Replace cell with numbers: second column

    ta bmicat, gen(bminew)
    *4 lines with percentages
    *4 variables ending in numbers from 1 to 4 -- LOOP!
    forvalues i = 1/4 {
        sum bminew`i'
        sort id
        replace percent = r(mean)*100 in `i'
    }
    format percent %9.2f

    Category       percent
    underweight    4.20
    normal         32.03
    overweight     51.29
    obese          12.49

Example 2
4. Outsheet ...same as in example 1

Less writing...
Instead of typing the category names, store the category codes and attach the existing value label bmicat:

    label list bmicat
    capture drop percent category bminew*
    ta bmicat, gen(bminew)
    gen category = .
    gen percent = .
    forvalues i = 1/4 {
        sum bminew`i'
        sort id
        replace category = `i' in `i'
        replace percent = r(mean)*100 in `i'
    }
    label values category bmicat
    format percent %9.2f

Example 3
1. THINK FIRST: table after logistic regression

    Myocardial infarction            OR    uci    lci    pval
    Current smoking
    Current smoking (+ age)
    Current smoking (+ age + bmi)

Example 3
2. Generate a new variable for each column

    gen str currentsm = ""
    gen OR = .
    gen uci = .
    gen lci = .
    gen pval = .

Example 3
3. Replace cell with contents/number of interest: first column

    sort id
    replace currentsm = "current smoking" in 1
    replace currentsm = "current smoking + age" in 2
    replace currentsm = "current smoking + age + bmi" in 3

Example 3
3. Replace cell with numbers: remaining columns

    logistic mi cursmoke
    sort id
    replace OR = exp(_b[cursmoke]) in 1
    replace lci = exp(_b[cursmoke] - 1.96*_se[cursmoke]) in 1
    replace uci = exp(_b[cursmoke] + 1.96*_se[cursmoke]) in 1
    est store A
    logistic mi
    est store B
    lrtest A B
    sort id
    replace pval = r(p) in 1

...and likewise for lines 2 and 3

Example 3
4. Outsheet ...as in example 1

Resulting table

    Myocardial infarction            OR      uci     lci     pval
    Current smoking                  1.74    2.22    1.36    6.76e-06
    Current smoking (+ age)          1.67    2.18    1.28    0
    Current smoking (+ age + bmi)    1.82    2.40    1.39    0

Other way to save results after estimation commands
• Use the statsby command, e.g.:

    statsby "logistic mi diabetes smoking" _b _se, saving(D:\Statistisches\automatic_tables\STATA\data\caerphillystatsby.dta) replace

Statsby will collapse your dataset! Store the results in a new dataset and open the original file again. Rerun statsby with the next variables and append the data to the first stored results.
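
The statsby workflow just described (store results in a new dataset, reopen the original data, rerun, append) could look roughly like the sketch below. It rests on several assumptions that are not from the slides: Stata's shipped auto dataset stands in for the CAPS data, the models are arbitrary, res1 and res2 are hypothetical temporary files, and the prefix syntax of newer Stata versions is used instead of the quoted-command form shown above.

```stata
* Sketch only: the auto data and both models are stand-ins, not the CAPS analysis
sysuse auto, clear
tempfile res1 res2                 // hypothetical temporary result files

* First model: coefficients and standard errors go to a separate file
statsby _b _se, saving(`res1'): logistic foreign mpg

* Open the original data again before the next run
sysuse auto, clear
statsby _b _se, saving(`res2'): logistic foreign mpg weight

* Append the second set of results to the first
use `res1', clear
append using `res2'
gen model = _n                     // one row per fitted model
describe
list
```

After logistic, _b holds coefficients on the log-odds scale, so they would still need to be exponentiated to give odds ratios, as in Example 3. The resulting dataset has one row per model and can be written out with outsheet like the tables above.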