Stata Introduction, Short
v2
Hein Stigum
Presentation, data and programs at:
http://folk.uio.no/heins/
courses
May-16
H.S.
Stata introduction
• General use
– Interface and menu
– Do-files and syntax
– Data handling
• Analysis
– Descriptive
– Graphs
– Bivariate
Why Stata
• Pro
– Aimed at epidemiology
– Many methods, growing
– Graphics
– Structured, Programmable
– Coming soon to a course near you
• Con
– Memory > file size (the whole data set must fit in memory)
Interface
Interface Stata 9
Interface Stata 12
Toolbar icons: Do-file editor, Data editor
Menu
Do-file example
New do-file: icon or Ctrl-9
Run: mark the lines, then Ctrl-D
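A minimal sketch of what a do-file might contain. The auto data set shipped with Stata is used as a stand-in because the course data are not available here, and the version number is an assumption that should be adjusted to your installation.

```stata
* A minimal do-file sketch (the course's own example is not reproduced here)
version 12              // assumed Stata version, adjust as needed
clear all
sysuse auto, clear      // example data shipped with Stata (stand-in data)
describe                // overview of the variables
summarize price mpg     // basic descriptives for two variables
```

Marking these lines in the do-file editor and pressing Ctrl-D runs them in order.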
Syntax
• Syntax
[bysort varlist:] command [varlist] [if exp] [in range][, opts]
• Examples
– mean age
– mean age if sex==1
– bysort sex: summarize age
– summarize age, detail
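The syntax pattern above can be tried out as follows. This is a sketch on the auto data set shipped with Stata, where price and foreign stand in for age and sex; these variables are not part of the course data.

```stata
sysuse auto, clear                  // stand-in data shipped with Stata

mean price                          // command varlist
mean price if foreign==1            // ... with an if-expression
bysort foreign: summarize price     // repeat the command for each group
summarize price, detail             // ... with an option after the comma
summarize price in 1/10             // ... restricted to observations 1 to 10
```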
Data handling
Import data
• Using SPSS 14.0-17.0
– Save as, Stata Version 8 SE
Use and save data
• Open data
– use "C:\Course\Myfile.dta", clear
• Describe
– describe    // describe all variables
– list x1 x2 in 1/20    // list obs nr 1 to 20
• Save data
– save "C:\Course\Myfile.dta", replace
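A sketch of the same open, describe, save cycle that can be run without the course files. It assumes the auto data set shipped with Stata and a hypothetical file name, auto_copy.dta, written to the current working directory.

```stata
sysuse auto, clear                  // stand-in for use "...", clear
describe                            // describe all variables
list make price in 1/5              // list observations 1 to 5

save "auto_copy.dta", replace       // hypothetical file name, current directory
use "auto_copy.dta", clear          // open the saved copy again
```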
Use data from web
• webuse "file"    // use data from Stata homepage
1. webuse set "http://www.med.uio.no/forskning/doktorgradkarriere/forskerutdanning/kurs/biostatistikk/mf9510-logistisk-regresjon-overlevelsesanalysecox/"    // set homepage
2. webuse "birth1"    // data for exercise 1
Generate, replace
• Index
– generate index=0
– replace index=1 if sex==1 & age<30
• Young/Old
– generate old=(age>50) if age<.
• Serial numbers, lags
– generate id=_n
– generate age1=age[_n-1]
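The generate and replace patterns above, sketched on four made-up observations. The values are invented for illustration only.

```stata
clear
input sex age
1 25
2 62
1 .
2 41
end

* Index: 1 if sex==1 and age under 30, otherwise 0
generate index = 0
replace index = 1 if sex==1 & age<30

* Old: 1 if age>50, 0 if not, and missing where age is missing
generate old = (age>50) if age<.

* Serial number and the previous observation's age (lag)
generate id = _n
generate age1 = age[_n-1]

list
```

The first observation has no predecessor, so age1 is missing there.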
Dates
• From numeric to date
ex: m=12, d=2, y=1987
generate birth=mdy(m,d,y)
format birth %td
• From string to date
ex: bstr="01.12.1987"
generate birth=date(bstr,"DMY")
format birth %td
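Both conversions can be checked on a single made-up observation. In this sketch the variable names birth1 and birth2 are hypothetical; they keep the two results apart.

```stata
clear
input m d y str10 bstr
12 2 1987 "01.12.1987"
end

generate birth1 = mdy(m, d, y)          // numeric month, day, year -> date
generate birth2 = date(bstr, "DMY")     // string in day.month.year order -> date
format birth1 birth2 %td                // display as dates instead of day counts

list
```

Stata stores a date as the number of days since 1 January 1960, so the %td format only changes how the value is displayed.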
Missing
• Note!
– Missing is represented as "."
– Missing values are stored as very large numbers
– age>30 will include missing.
– age>30 if age<. will not.
• Test
– replace age=0 if (age==.)
• Remove
– drop if age==.
• Change
– replace educ=. if educ==99
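A sketch of the missing-value pitfall on four made-up observations. The values are invented; 99 plays the role of a missing code for educ.

```stata
clear
input age educ
25 1
45 99
.  2
60 3
end

count if age>30             // counts the observation with missing age as well
count if age>30 & age<.     // counts only observed ages above 30

replace educ=. if educ==99  // recode 99 to missing
drop if age==.              // remove observations with missing age
```

Because missing sorts above every number, the first count is one higher than the second.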
Describe missing
• Summarize variables
summarize id bullied sex
• Missing in tables
tab bullied sex, missing
misstable summarize bullied sex    // new command
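The three ways of describing missing values, sketched on the auto data set shipped with Stata. Its variable rep78 has some missing values and stands in for bullied; this is not the course data.

```stata
sysuse auto, clear

summarize rep78 mpg foreign         // the Obs column shows nonmissing counts
tabulate rep78 foreign, missing     // missing shown as its own row
misstable summarize rep78 mpg       // reports only variables with missing values
```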
Help
• General
– help command
– findit keyword    // search Stata+net
• Examples
– help table
– findit aflogit
Summing up
• Use do files
– Run: Mark, Ctrl-D
• Syntax
– command [varlist] [if exp] [in range] [, options]
• Missing
– age>30 if age<.
– generate old=(age>50) if age<.
• Help
– help describe
Descriptive
Descriptive
• Continuous
summarize weight
summarize weight, detail    // percentiles and more
• Categorical
tabulate bullied
tabulate bullied, nolab    // show coding
Other descriptives
tabstat mAge, stat( N min p50 mean max) by(parity)
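The descriptive commands above, sketched on the auto data set shipped with Stata. Here weight is car weight and foreign stands in for the categorical variable; neither is the course data.

```stata
sysuse auto, clear

* Continuous
summarize weight
summarize weight, detail            // adds percentiles, skewness, kurtosis

* Categorical
tabulate foreign
tabulate foreign, nolabel           // show the numeric coding instead of labels

* Selected statistics by group
tabstat mpg, stat(count min p50 mean max) by(foreign)
```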
Graphics
Twoway plots
• Syntax
– twoway (plot1, opts) (plot2, opts), opts
• One plot
– kdensity bw
[Figure: kernel density estimate of birth weight (kernel = epanechnikov, bandwidth = 102.3251)]
– scatter bw gest
[Figure: scatter plot of birth weight against gestational age]
twoway ( kdensity bw if sex==1, lcolor(blue) ) ///
       ( kdensity bw if sex==2, lcolor(red) )
[Figure: Weight distribution by sex; x-axis: gram]
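A sketch of overlaying one density per group. It uses the auto data set shipped with Stata, with mpg and foreign standing in for bw and sex; the graph name bygroup and the legend texts are hypothetical.

```stata
sysuse auto, clear

* One kernel density per group, drawn in the same plot region
twoway (kdensity mpg if foreign==0, lcolor(blue)) ///
       (kdensity mpg if foreign==1, lcolor(red)), ///
       legend(order(1 "Domestic" 2 "Foreign")) ///
       title("Mileage distribution by origin") name(bygroup, replace)
```

Each parenthesis holds one plot with its own options; the options after the last comma apply to the graph as a whole.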
twoway (scatter bw gest) (fpfitci bw gest) (lfit bw gest)
scatter = scatter, fpfitci = smooth with CI, lfit = line fit
[Figure: Weight by gestational age; x-axis: days, y-axis: gram]
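A sketch of a scatter plot combined with a smooth fit and a linear fit. It uses the auto data set shipped with Stata, with mpg and weight standing in for bw and gest; the graph name fits is hypothetical.

```stata
sysuse auto, clear

* Scatter, fractional-polynomial fit with confidence band, and linear fit
twoway (scatter mpg weight) (fpfitci mpg weight) (lfit mpg weight), ///
       name(fits, replace)
```

Plots are drawn in the order listed, so moving fpfitci first would keep the confidence band behind the points.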
Titles
scatter bw gest, ///
    title("title") subtitle("subtitle") ///
    xtitle("xtitle") ytitle("ytitle") note("note")
[Figure: scatter plot showing where title, subtitle, xtitle, ytitle and note are placed]
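A sketch of the title options on a runnable example. It uses the auto data set shipped with Stata; the title texts and the graph name titled are made up for illustration.

```stata
sysuse auto, clear

* /// continues the command on the next line in a do-file
scatter mpg weight, ///
    title("Mileage by weight") subtitle("auto example data") ///
    xtitle("Weight (lbs)") ytitle("Mileage (mpg)") ///
    note("Source: auto.dta shipped with Stata") name(titled, replace)
```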
Bivariate analysis
2 independent samples
Do boys and girls have the same mean birth weight?
twoway ( kdensity weight if sex==1, lcolor(blue) ) ///
       ( kdensity weight if sex==2, lcolor(red) )
Equal means?
Equal variance?
[Figure: kernel densities of birth weight by sex; x-axis: Birth weight]
2 independent samples test
ttest weight, by(sex)    // 2-sample t-test
ttest weight, by(sex) unequal    // 2-sample t-test, unequal variances
ttest w1==w2    // paired t-test
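A sketch of the two-sample comparison on the auto data set shipped with Stata, with mpg and foreign standing in for weight and sex. The sdtest line is an addition not on the slide; it is one way to look at the "equal variance?" question.

```stata
sysuse auto, clear

sdtest mpg, by(foreign)             // compare the two group variances
ttest mpg, by(foreign)              // 2-sample t-test, equal variances assumed
ttest mpg, by(foreign) unequal      // 2-sample t-test, unequal variances
```

After ttest, the two-sided p-value is available as r(p) and the group sizes as r(N_1) and r(N_2).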
Crosstables
Are boys bullied as much as girls?
tabulate bullied sex, col chi2 nofreq
equal proportions?
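A sketch of a crosstable with column percentages and a chi-square test. It uses the auto data set shipped with Stata; highmpg is a hypothetical binary variable created here to stand in for bullied, and the cut-off of 25 is arbitrary.

```stata
sysuse auto, clear

* Hypothetical binary outcome: 1 if mileage above 25, 0 otherwise
generate byte highmpg = (mpg>25) if mpg<.

* Column percentages and chi-square test, frequencies suppressed
tabulate highmpg foreign, col chi2 nofreq
display "chi2 p-value: " r(p)
```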
Summing up
• Descriptive
summarize weight
tabulate sex
• Graphs
twoway (plot1, opts) (plot2, opts), opts
• Bivariate
ttest weight, by(sex)
tabulate bullied sex, chi2