Stata Intro Mixed Models
Hein Stigum
Presentation, data and programs at:
http://folk.uio.no/heins/
Why Stata
• Pro
– Aimed at epidemiology
– Many methods, growing
– Graphics
– Structured, Programmable
– Coming soon to a course near you
• Con
– Memory>file size
– Copy tables
Apr-20
H.S.
2
Use
Interface
Do Editor
• New
– Ctrl-8, or the Do-file Editor toolbar button
• Run
– Mark commands, Ctrl-D to do (execute)
Do-file example
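The slide's screenshot is not available, so the following is only a minimal sketch of what a do-file might look like. It assumes the example dataset auto that ships with Stata; the file name example.do is hypothetical.

```stata
* example.do: a minimal do-file sketch (hypothetical file name)
clear all
sysuse auto          // example dataset shipped with Stata (assumption)
describe             // list the variables
summarize price mpg  // mean, sd, min and max
```

Mark the lines in the Do Editor and press Ctrl-D to run them.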
Syntax
• Syntax
[bysort varlist:] command [varlist] [if exp] [in range][, opts]
• Examples
– mean age
– mean age if sex==1
– bysort sex: summarize age
– summarize age ,detail
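The syntax elements above can be tried on a dataset that ships with Stata. This sketch assumes nlsw88, where collgrad (0/1) stands in for the sex variable used on the slide.

```stata
sysuse nlsw88, clear                // example dataset shipped with Stata (assumption)
mean age                            // command varlist
mean age if collgrad==1             // ... if exp
bysort collgrad: summarize age      // bysort varlist: command
summarize age in 1/100, detail      // ... in range, options
```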
Data handling
Import data
• Using SPSS 14.0
– Save as, Stata Version 8 SE
Use and save data
• Open data
– set memory 200m
– use "C:\Course\Myfile.dta", clear
• Describe
– describe                  describe all variables
– list x1 x2 in 1/20        list obs nr 1 to 20
• Save data
– save "C:\Course\Myfile.dta", replace
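A runnable sketch of the open, describe and save cycle, assuming the auto dataset shipped with Stata and a temporary file in place of C:\Course\Myfile.dta. Run it as a do-file, since the tempfile macro only exists within one run. In Stata 12 and later, memory is managed automatically and set memory is no longer needed.

```stata
sysuse auto, clear            // example dataset shipped with Stata (assumption)
describe                      // describe all variables
list make price in 1/5        // list observations 1 to 5

tempfile mycopy               // temporary file name, stands in for Myfile.dta
save "`mycopy'", replace      // save data
use "`mycopy'", clear         // open data again
```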
Generate, replace
• Age square
– generate ageSqr=age^2
• Young/Old
– generate old=0 if (age<=50)
– replace old=1 if (age>50)
• Alternatives
– generate old=(age>50)
– generate old=(age>50) if age<.
• Observation numbers
– gen id=_n
– gen lag=age[_n-1]
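The commands above can be compared on a tiny made-up dataset. The four ages below are invented for illustration, and old1 and old2 are hypothetical names. The sketch suggests that the two alternatives differ only when age is missing.

```stata
clear
input age                            // made-up illustration data
25
60
.
51
end
generate ageSqr = age^2              // age squared
generate old1 = (age>50)             // missing age is coded 1
generate old2 = (age>50) if age<.    // missing age stays missing
gen id  = _n                         // observation number
gen lag = age[_n-1]                  // value from the previous observation
list
```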
Missing
• Note!
– Missing values are large numbers
– age>30 will include missing.
– age>30 & age<. will not.
• Test
– replace x=0 if (x==.)
• Remove
– drop if age==.
• Change
– replace educ=. if educ==99
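A small sketch of these rules on invented data (the values of age and educ below are made up for illustration), counting how many observations each condition selects.

```stata
clear
input age educ                   // made-up illustration data
25  2
60 99
 .  3
35 99
end
count if age>30                  // includes the missing age
count if age>30 & age<.          // excludes it
replace educ=. if educ==99       // recode 99 to missing
drop if age==.                   // remove observations with missing age
list
```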
Calculator
• Display
– dis 26/3
– dis exp(1.2)
• Store results
– scalar se=sqrt( 0.8*(1-0.8)/60 )
– dis se
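As a sketch of how stored scalars can be reused, the standard error from the slide can be turned into an approximate 95% confidence interval for a proportion of 0.8 with n=60. The scalar names prop, nobs, lo and hi are hypothetical.

```stata
display 26/3
display exp(1.2)

scalar prop = 0.8                           // proportion from the slide
scalar nobs = 60                            // sample size from the slide
scalar se   = sqrt(prop*(1-prop)/nobs)      // standard error of a proportion
scalar lo   = prop - invnormal(0.975)*se    // lower limit, normal approximation
scalar hi   = prop + invnormal(0.975)*se    // upper limit
display "se = " se
display "95% CI: " lo " to " hi
```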
Help
• General
– help command
– findit keyword            search Stata+net
• Examples
– help table
– findit aflogit
Summing up
• Use do files
– Mark, Ctrl-D to do (execute)
• Syntax
– command [varlist] [if exp] [in range] [, options]
• Missing
– age>30 & age<.
– generate old=(age>50) if age<.
• Help
– help describe
Books
Web: http://www.stata.com/bookstore
A Gentle Introduction to Stata
by Alan C. Acock
A visual guide to Stata graphics
by M.N. Mitchell
Multilevel and longitudinal modeling using Stata
by S. Rabe-Hesketh, A. Skrondal
Graphics
Twoway density
• Syntax
– graph twoway (plot1, opts) (plot2, opts), opts
• One plot
– kdensity x
• Two plots, boys and girls compared
twoway ( kdensity weight if sex==1, lcolor(blue) ) ///
       ( kdensity weight if sex==2, lcolor(red) )
Figure: Weight distribution by sex (kernel density of weight in gram, one curve per sex)
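The course data are not reproduced here, so this sketch overlays two densities using the auto dataset shipped with Stata, with foreign (0/1) standing in for sex. The graph name dens is hypothetical.

```stata
sysuse auto, clear
twoway (kdensity weight if foreign==0, lcolor(blue)) ///
       (kdensity weight if foreign==1, lcolor(red)), ///
       legend(order(1 "Domestic" 2 "Foreign"))       ///
       title("Weight distribution by origin") name(dens, replace)
```

Each plot sits in its own parentheses with its own options. Options after the last comma apply to the whole graph.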
Twoway scatter
• Syntax
– graph twoway (plot1, opts) (plot2, opts), opts
• Examples
– scatter y x
– twoway (scatter y x) (lfit y x)
Fitlines          with CI
lfit              lfitci      Linear
qfit              qfitci      Quadratic
mband, mspline                Median band, median spline
                  fpfitci     Fractional polynomial
lowess                        Local regression
twoway (scatter weight gest)(lfitci weight gest)
Figure: Weight by gestational age (weight in gram against gestational age in days, linear fit with CI)
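A sketch of two of the fit lines on the auto dataset shipped with Stata, with mpg and weight standing in for the birth weight and gestational age of the course data. The graph names fit and smooth are hypothetical.

```stata
sysuse auto, clear
* linear fit with confidence band, drawn first so the points stay visible
twoway (lfitci mpg weight) (scatter mpg weight), name(fit, replace)
* local regression (lowess) as a check on linearity
twoway (scatter mpg weight) (lowess mpg weight), name(smooth, replace)
```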
Descriptives
Central tendency and dispersion
Mean and standard deviation:
Mean with confidence interval:
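The slide's output is not reproduced here. Commands that could produce these two results are sketched below, assuming the nlsw88 dataset shipped with Stata and its wage variable.

```stata
sysuse nlsw88, clear
summarize wage      // mean and standard deviation
mean wage           // mean with standard error and 95% confidence interval
```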
Frequency and proportion
Frequency:
Proportion with CI:
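As above, the output is missing from the extracted slide. A sketch of matching commands, assuming the nlsw88 dataset shipped with Stata and its 0/1 variable collgrad:

```stata
sysuse nlsw88, clear
tabulate collgrad       // frequencies and percentages
proportion collgrad     // proportions with 95% confidence intervals
```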
Crosstables
Are boys bullied as much as girls?
equal proportions?
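One way to approach the question is a cross table with column percentages and a chi-square test. The counts below are invented purely for illustration and are not the course data. The variable names sex and bullied follow the slides, and n is a hypothetical frequency variable.

```stata
clear
input sex bullied n         // made-up counts, for illustration only
1 0 40
1 1 10
2 0 35
2 1 15
end
* column percentages compare the proportion bullied in each sex
tabulate bullied sex [fweight=n], column chi2
```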
Tables for epidemiologists
• Data
– Must be 0/1
– Long format. Wide format
• Commands
– cc                          Case-control
– mcc                         Matched case-control
• Example
– cc disease exposed, by(sex) Stratified MH-OR
• Calculator (i=immediate)
– cci 10 90 5 95              OR
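A sketch of the immediate command with the slide's numbers. cci takes the four cells in the order exposed cases, unexposed cases, exposed controls, unexposed controls, so the odds ratio can be checked by hand as the cross-product ratio.

```stata
cci 10 90 5 95                          // immediate case-control table
display "OR from cci: " r(or)
display "OR by hand:  " (10*95)/(90*5)  // cross-product ratio
```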
Logistic regression
Being bullied
11 April 2020
Syntax
• Estimation
– logistic y x1 x2            logistic regression
– xi: logistic y x1 i.c1      categorical c1
• Post estimation
– predict yf, pr              predict probability
• Manage models
– estimates store m1          save model
– est table m1, eform         show OR
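The sequence estimate, store, predict can be sketched on the nlsw88 dataset shipped with Stata, with union as a stand-in outcome. In Stata 11 and later, i.var works without the xi: prefix, which is what this sketch assumes. The name phat is hypothetical.

```stata
sysuse nlsw88, clear
logistic union age i.race       // odds ratios, race as categorical
estimates store m1              // save model
predict phat, pr                // predicted probability
estimates table m1, eform       // show ORs
summarize phat
```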
Bivariate, dummies
Generate dummies
gen Island=(country==2) if country<.
gen Norway=(country==3) if country<.
gen Finland=(country==4) if country<.
gen Denmark=(country==5) if country<.
Model 1: outcome and exposure
Alternative commands:
xi:logistic bullied i.country                         use xi: i.var for categorical variables
xi:logistic bullied i.country , coef                  coefs instead of ORs
xi:logistic bullied i.country if sex!=. & age!=.      do if sex and age not missing
Model 2: Add confounders
Estimate associations: m1=m2
Predict: m2 best
Model 3: interaction
lincom age+1*agesex           effect of age for boys
lincom age+2*agesex           effect of age for girls
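The lincom lines above presuppose a product term agesex=age*sex with sex coded 1/2. A sketch of the same construction on the nlsw88 dataset shipped with Stata, where the hypothetical variable grp (coded 1/2 from collgrad) stands in for sex and agegrp for agesex:

```stata
sysuse nlsw88, clear
generate grp    = collgrad + 1    // hypothetical grouping coded 1/2, stands in for sex
generate agegrp = age*grp         // product term, like agesex
logistic union age grp agegrp
lincom age + 1*agegrp             // effect of age when grp==1
lincom age + 2*agegrp             // effect of age when grp==2
```

After logistic, lincom reports the combination as an odds ratio.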
Regression Summary
• Estimation
– regress y x1 x2             linear regression
– logistic y x1 x2            logistic regression
– xi:regress y x1 i.x2        categorical x2
• Manage results
– estimates store m1          store results
– estimates table m1 m2       table of results
– estimates stats m1 m2       statistics of results
• Post estimation
– predict yhat, xb            linear prediction
– predict res, resid          residuals
– lincom b0+2*b3              linear combination
• Help
– help logistic postestimation
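The summary can be run end to end as a sketch on the auto dataset shipped with Stata. The model names m1 and m2 follow the slides, while the choice of variables is only an illustration.

```stata
sysuse auto, clear
regress price weight mpg              // linear regression
estimates store m1                    // store results
regress price weight mpg i.foreign    // add a categorical variable
estimates store m2
estimates table m1 m2, b(%9.2f) se    // table of results
estimates stats m1 m2                 // AIC and BIC for both models
predict yhat, xb                      // linear prediction from m2
predict res, resid                    // residuals from m2
lincom 1000*weight                    // linear combination: effect of 1000 lbs
```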
Mixed Models
Multilevel models
Panel data
Repeated measurements
Long and wide data
Wide data
       id     bp0     bp1     bp2     bp3
 1.     1   151.6   156.8   138.5   161.7
 2.     2   132.5   139.1   150.0   159.9

reshape long bp, i(id) j(occ)      wide to long
reshape wide bp, i(id) j(occ)      long to wide

Long data
       id   occ      bp
 1.     1     0   151.6
 2.     1     1   156.8
 3.     1     2   138.5
 4.     1     3   161.7
 5.     2     0   132.5
 6.     2     1   139.1
 7.     2     2   150.0
 8.     2     3   159.9
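The round trip between the two layouts can be sketched with the blood pressure values shown on the slide, typed in with input.

```stata
clear
input id bp0 bp1 bp2 bp3          // wide data from the slide
1 151.6 156.8 138.5 161.7
2 132.5 139.1 150.0 159.9
end
reshape long bp, i(id) j(occ)     // wide to long: one row per id and occasion
list, sepby(id)
reshape wide bp, i(id) j(occ)     // long to wide: one row per id
list
```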
Correlated measures
• Two measures per person: W1 W2
– symmetry W1 W2              Measure the same?
• Matched Case-Control
– mcc expCase expContr        Matched OR
Multilevel data
• Panel data
• xt                          cross-sectional time-series data
• help xt
Setup and describe
• Set panel data
– xtset school                pupils nested in schools
– xtset id time               times nested in subjects
• Describe panel data
– xtdes                       describe data and missing
– xtsum bp                    summarize blood pressure
– xttab ht                    tabulate hypertension
– xtline bp                   plot bp versus time for each id
• Lag and lead
– replace bp=bp[_n+1] if id==1
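A sketch of the setup on the long blood pressure data from the earlier slide. The variable names bp_next and bp_prev are hypothetical. Leads and lags are taken within each id so that values never cross from one subject to the next.

```stata
clear
input id occ bp                   // long data from the earlier slide
1 0 151.6
1 1 156.8
1 2 138.5
1 3 161.7
2 0 132.5
2 1 139.1
2 2 150.0
2 3 159.9
end
xtset id occ                                  // occasions nested in subjects
xtdescribe                                    // pattern of observations
xtsum bp                                      // between and within variation
bysort id (occ): generate bp_next = bp[_n+1]  // lead, within id
generate bp_prev = L.bp                       // lag via time-series operator
list, sepby(id)
```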
Logistic regression methods
• Fixed effects models
– logit y x1 x2, or
• Conditional fixed effects models
– clogit y x1 x2, group(id) or
• Random intercept models
– xtlogit y x1 x2, i(id) or
• Mixed effects models
– xtmelogit y x1 x2 || id: x1 , or
• Population average effects
– xtgee y x1 x2, i(id) t(time) fam(bin) link(logit) robust eform
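To compare some of these methods side by side, the sketch below simulates clustered binary data with a random intercept per id. The seed, sample sizes and coefficients are arbitrary assumptions. The mixed effects command is left out: xtmelogit was renamed meqrlogit in Stata 13, and melogit is the newer command, so the right name depends on the Stata version.

```stata
clear
set seed 12345
set obs 200                            // 200 subjects (arbitrary)
generate id = _n
generate u  = rnormal(0, 1)            // subject-specific random intercept
expand 5                               // 5 measurements per subject
bysort id: generate time = _n
generate x1 = rnormal()
generate y  = runiform() < invlogit(-0.5 + 0.8*x1 + u)
xtset id time

logit   y x1, or                       // ignores the clustering
clogit  y x1, group(id) or             // conditional fixed effects
xtlogit y x1, re or                    // random intercept
xtgee   y x1, family(binomial) link(logit) ///
        corr(exchangeable) vce(robust) eform   // population average
```

With a sizeable random intercept, the population average odds ratio from xtgee is typically closer to 1 than the subject-specific odds ratio from xtlogit.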