STATA Program for Probit/Logit Models

STATA Program for OLS
cps87_or.do
* the data for this project is a small subsample;
* of full time (30 or more hours) male workers;
* aged 21-64 from the outgoing rotation;
* samples of the 1987 current population survey;
* this line defines the semicolon as the ;
* end of line delimiter;
# delimit ;
* set memory to 10 meg;
set memory 10m;
* write results to a log file;
* the replace option writes over old;
* log files;
log using cps87_or.log,replace;
* open stata data set;
use c:\bill\stata\cps87_or;
* list variables and labels in data set;
desc;
* generate new variables;
* lines 1-2 illustrate basic math functions;
* lines 3-4 illustrate logical operators;
* line 5 illustrates the OR statement;
* line 6 illustrates the AND statement;
* after you construct new variables, compress the data again;
gen age2=age*age;
gen earnwkl=ln(earnwke);
gen union=unionm==1;
gen topcode=earnwke==999;
gen nonwhite=((race==2)|(race==3));
gen big_ne=((region==1)&(smsa==1));
* label the data;
label var age2 "age squared";
label var earnwkl "log earnings per week";
label var topcode "=1 if earnwke is topcoded";
label var union "1=in union, 0 otherwise";
label var nonwhite "1=nonwhite, 0=white" ;
label var big_ne "1= live in big smsa from northeast, 0=otherwise";
compress;
* get descriptive statistics;
sum;
* get detailed descriptive statistics for continuous variables;
sum earnwke, detail;
* get frequencies of discrete variables;
tabulate unionm;
tabulate race;
* get two-way table of frequencies;
tabulate region smsa, row column cell;
*run simple regression;
reg earnwkl age age2 educ nonwhite union;
* run regression adding smsa, region and race fixed-effects;
* the xi command constructs the dummies for you;
* the lowest numbered dummy is usually the;
* omitted variable;
xi: reg earnwkl age age2 educ union i.race i.region i.smsa;
more;
* close log file;
log close;
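
The indicator-variable lines in the program above (`gen union=unionm==1;`, the OR and AND cases, and the `xi` expansion of `i.race`) can be mirrored outside Stata. The following is a minimal Python sketch for illustration only, not part of the original program; the five-observation sample and the helper names `indicator` and `expand_dummies` are hypothetical.

```python
# Hypothetical mini-sample; codes follow the variable labels in the data set.
unionm = [1, 2, 2, 1, 2]      # 1=union member, 2=otherwise
race = [1, 2, 3, 1, 2]        # three race categories coded 1, 2, 3

def indicator(values, condition):
    """Return 1 where condition(value) is true and 0 otherwise,
    mirroring a Stata line such as `gen union = unionm==1`."""
    return [1 if condition(v) else 0 for v in values]

def expand_dummies(values, prefix):
    """Build one 0/1 column per category and drop the lowest-numbered
    category, as xi does for a naturally coded variable."""
    categories = sorted(set(values))
    omitted, kept = categories[0], categories[1:]
    columns = {f"{prefix}_{c}": indicator(values, lambda v, c=c: v == c)
               for c in kept}
    return omitted, columns

union = indicator(unionm, lambda v: v == 1)
nonwhite = indicator(race, lambda v: v == 2 or v == 3)   # the OR case
omitted, race_dummies = expand_dummies(race, "_Irace")

print(union)
print(nonwhite)
print(omitted, race_dummies)
```

The omitted category becomes the reference group, so each remaining dummy coefficient is read as a difference from that group.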
STATA Results for OLS
cps87_or.log
-------------------------------------------------------------------------------
       log:  c:\bill\stata\cps87_or.log
  log type:  text
 opened on:   6 Nov 2004, 08:14:10
. * open stata data set;
. use c:\bill\stata\cps87_or;
. * list variables and labels in data set;
. desc;
Contains data from c:\bill\stata\cps87_or.dta
  obs:        19,906
 vars:             7                          6 Nov 2004 08:11
 size:       636,992 (93.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
age             float  %9.0g                  age in years
race            float  %9.0g                  1=white, non-hisp, 2=place,
                                                n.h, 3=hisp
educ            float  %9.0g                  years of education
unionm          float  %9.0g                  1=union member, 2=otherwise
smsa            float  %9.0g                  1=live in 19 largest smsa,
                                                2=other smsa, 3=non smsa
region          float  %9.0g                  1=east, 2=midwest, 3=south,
                                                4=west
earnwke         float  %9.0g                  usual weekly earnings
-------------------------------------------------------------------------------
Sorted by:
. * generate new variables;
. * lines 1-2 illustrate basic math functions;
. * lines 3-4 illustrate logical operators;
. * line 5 illustrates the OR statement;
. * line 6 illustrates the AND statement;
. * after you construct new variables, compress the data again;
. gen age2=age*age;
. gen earnwkl=ln(earnwke);
. gen union=unionm==1;
. gen topcode=earnwke==999;
. gen nonwhite=((race==2)|(race==3));
. gen big_ne=((region==1)&(smsa==1));
. * label the data;
. label var age2 "age squared";
. label var earnwkl "log earnings per week";
. label var topcode "=1 if earnwke is topcoded";
. label var union "1=in union, 0 otherwise";
. label var nonwhite "1=nonwhite, 0=white" ;
. label var big_ne "1= live in big smsa from northeast, 0=otherwise";
. compress;
age was float now byte
race was float now byte
educ was float now byte
unionm was float now byte
smsa was float now byte
region was float now byte
earnwke was float now int
age2 was float now int
union was float now byte
topcode was float now byte
nonwhite was float now byte
big_ne was float now byte
. more;
. * get descriptive statistics;
. sum;
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         age |     19906    37.96619    11.15348          21         64
        race |     19906    1.199136     .525493           1          3
        educ |     19906    13.16126    2.795234           0         18
      unionm |     19906    1.769065    .4214418           1          2
        smsa |     19906    1.908369    .7955814           1          3
-------------+---------------------------------------------------------
      region |     19906    2.462373    1.079514           1          4
     earnwke |     19906     488.264    236.4713          60        999
        age2 |     19906    1565.826    912.4383         441       4096
     earnwkl |     19906    6.067307     .513047    4.094345   6.906755
       union |     19906    .2309354    .4214418           0          1
-------------+---------------------------------------------------------
     topcode |     19906    .0719381    .2583919           0          1
    nonwhite |     19906    .1408118    .3478361           0          1
      big_ne |     19906    .1409625    .3479916           0          1

. * get detailed descriptive statistics for continuous variables;
. sum earnwke, detail;
                    usual weekly earnings
-------------------------------------------------------------
      Percentiles      Smallest
 1%          128             60
 5%          178             60
10%          210             60       Obs               19906
25%          300             63       Sum of Wgt.       19906

50%          449                      Mean            488.264
                        Largest       Std. Dev.      236.4713
75%          615            999
90%          865            999       Variance        55918.7
95%          999            999       Skewness        .668646
99%          999            999       Kurtosis       2.632356

. more;
. * get frequencies of discrete variables;
. tabulate unionm;
    1=union |
    member, |
2=otherwise |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      4,597       23.09       23.09
          2 |     15,309       76.91      100.00
------------+-----------------------------------
      Total |     19,906      100.00

. tabulate race;
   1=white, |
  non-hisp, |
   2=place, |
n.h, 3=hisp |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |     17,103       85.92       85.92
          2 |      1,642        8.25       94.17
          3 |      1,161        5.83      100.00
------------+-----------------------------------
      Total |     19,906      100.00

. more;
. * get two-way table of frequencies;
. tabulate region smsa, row column cell;
+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
|  cell percentage  |
+-------------------+

   1=east, |
2=midwest, |    1=live in 19 largest smsa,
  3=south, |     2=other smsa, 3=non smsa
    4=west |         1          2          3 |     Total
-----------+---------------------------------+----------
         1 |     2,806      1,349        842 |     4,997
           |     56.15      27.00      16.85 |    100.00
           |     38.46      18.89      15.39 |     25.10
           |     14.10       6.78       4.23 |     25.10
-----------+---------------------------------+----------
         2 |     1,501      1,742      1,592 |     4,835
           |     31.04      36.03      32.93 |    100.00
           |     20.58      24.40      29.10 |     24.29
           |      7.54       8.75       8.00 |     24.29
-----------+---------------------------------+----------
         3 |     1,501      2,542      1,904 |     5,947
           |     25.24      42.74      32.02 |    100.00
           |     20.58      35.60      34.80 |     29.88
           |      7.54      12.77       9.56 |     29.88
-----------+---------------------------------+----------
         4 |     1,487      1,507      1,133 |     4,127
           |     36.03      36.52      27.45 |    100.00
           |     20.38      21.11      20.71 |     20.73
           |      7.47       7.57       5.69 |     20.73
-----------+---------------------------------+----------
     Total |     7,295      7,140      5,471 |    19,906
           |     36.65      35.87      27.48 |    100.00
           |    100.00     100.00     100.00 |    100.00
           |     36.65      35.87      27.48 |    100.00
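
Each cell of the two-way table stacks a frequency with its row, column and cell percentages. As a worked check of how the three percentages differ, the minimal Python sketch below recomputes them for the region 1, smsa 1 cell from the counts in the table; the variable names are illustrative and not part of the Stata program.

```python
# Counts copied from the region x smsa table (region 1, smsa 1 cell).
cell_count = 2806
row_total = 4997       # all workers in region 1
column_total = 7295    # all workers in the 19 largest SMSAs
grand_total = 19906    # full sample

row_pct = 100 * cell_count / row_total        # share of region 1 living in a big SMSA
column_pct = 100 * cell_count / column_total  # share of big-SMSA workers in region 1
cell_pct = 100 * cell_count / grand_total     # share of the whole sample in this cell

print(f"{row_pct:.2f} {column_pct:.2f} {cell_pct:.2f}")
```

The three results match the 56.15, 38.46 and 14.10 printed under the frequency 2,806.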
. more;
. *run simple regression;
. reg earnwkl age age2 educ nonwhite union;
      Source |       SS       df       MS              Number of obs =   19906
-------------+------------------------------           F(  5, 19900) = 1775.70
       Model |  1616.39963     5  323.279927           Prob > F      =  0.0000
    Residual |  3622.93905 19900  .182057239           R-squared     =  0.3085
-------------+------------------------------           Adj R-squared =  0.3083
       Total |  5239.33869 19905  .263217216           Root MSE      =  .42668

------------------------------------------------------------------------------
     earnwkl |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0679808   .0020033    33.93   0.000      .0640542    .0719075
        age2 |  -.0006778   .0000245   -27.69   0.000     -.0007258   -.0006299
        educ |    .069219   .0011256    61.50   0.000      .0670127    .0714252
    nonwhite |  -.1716133   .0089118   -19.26   0.000     -.1890812   -.1541453
       union |   .1301547   .0072923    17.85   0.000      .1158613    .1444481
       _cons |   3.630805   .0394126    92.12   0.000      3.553553    3.708057
------------------------------------------------------------------------------

. more;
. * run regression adding smsa, region and race fixed-effects;
. * the xi command constructs the dummies for you;
. * the lowest numbered dummy is usually the;
. * omitted variable;
. xi: reg earnwkl age age2 educ union i.race i.region i.smsa;
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)
i.region          _Iregion_1-4        (naturally coded; _Iregion_1 omitted)
i.smsa            _Ismsa_1-3          (naturally coded; _Ismsa_1 omitted)

      Source |       SS       df       MS              Number of obs =   19906
-------------+------------------------------           F( 11, 19894) =  920.86
       Model |  1767.66908    11  160.697189           Prob > F      =  0.0000
    Residual |  3471.66961 19894  .174508375           R-squared     =  0.3374
-------------+------------------------------           Adj R-squared =  0.3370
       Total |  5239.33869 19905  .263217216           Root MSE      =  .41774

------------------------------------------------------------------------------
     earnwkl |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |    .070194   .0019645    35.73   0.000      .0663435    .0740446
        age2 |  -.0007052    .000024   -29.37   0.000     -.0007522   -.0006581
        educ |   .0643064   .0011285    56.98   0.000      .0620944    .0665184
       union |   .1131485    .007257    15.59   0.000      .0989241    .1273729
    _Irace_2 |  -.2329794   .0110958   -21.00   0.000      -.254728   -.2112308
    _Irace_3 |  -.1795253   .0134073   -13.39   0.000     -.2058047   -.1532458
  _Iregion_2 |  -.0088962   .0085926    -1.04   0.301     -.0257383     .007946
  _Iregion_3 |  -.0281747    .008443    -3.34   0.001     -.0447238   -.0116257
  _Iregion_4 |   .0318053   .0089802     3.54   0.000      .0142034    .0494071
    _Ismsa_2 |  -.1225607   .0072078   -17.00   0.000     -.1366886   -.1084328
    _Ismsa_3 |  -.2054124   .0078651   -26.12   0.000     -.2208287   -.1899961
       _cons |    3.76812   .0391241    96.31   0.000      3.691434    3.844807
------------------------------------------------------------------------------

. more;
. * close log file;
. log close;
       log:  c:\bill\stata\cps87_or.log
  log type:  text
 closed on:   6 Nov 2004, 08:14:19
-------------------------------------------------------------------------------
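
Several summary numbers in the first regression output follow directly from the ANOVA block and the coefficient table. The minimal Python sketch below recomputes R-squared and the F statistic from the reported sums of squares, and also the age at which the fitted quadratic age profile peaks. The peak-age calculation is an illustration implied by the age and age2 coefficients, not a result reported in the log, and the variable names are hypothetical.

```python
# Values copied from the first regression (reg earnwkl age age2 educ nonwhite union).
model_ss, model_df = 1616.39963, 5
resid_ss, resid_df = 3622.93905, 19900
total_ss = 5239.33869

# R-squared is the share of total variation explained by the model.
r_squared = model_ss / total_ss

# The overall F statistic is the ratio of the two mean squares.
f_stat = (model_ss / model_df) / (resid_ss / resid_df)

# With log earnings = ... + b_age*age + b_age2*age^2, the profile peaks
# where the derivative b_age + 2*b_age2*age equals zero.
b_age, b_age2 = 0.0679808, -0.0006778
peak_age = -b_age / (2 * b_age2)

print(round(r_squared, 4), round(f_stat, 2), round(peak_age, 1))
```

Because the inputs are rounded as printed in the log, the recomputed values agree with the reported ones only up to rounding.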
STATA Program for Probit/Logit Models
workplace.do
* the data for this program are a random sample;
* of 10k observations from the data used in;
* evans, farrelly and montgomery, aer, 1999;
* the data are indoor workers in the 1991 and 1993;
* national health interview survey. the survey;
* identifies whether the worker smoked and whether;
* the worker faces a workplace smoking ban;
* set semi colon as the end of line;
# delimit;
* ask it NOT to pause;
set more off;
* open log file;
log using c:\bill\jpsm\workplace1.log,replace;
* use the workplace data set;
use c:\bill\jpsm\workplace1;
* print out variable labels;
desc;
* get summary statistics;
sum;
* run a linear probability model for comparison purposes;
* estimate white standard errors to control for heteroskedasticity;
reg smoker age incomel male black hispanic
hsgrad somecol college worka, robust;
* run probit model;
probit smoker age incomel male black hispanic
hsgrad somecol college worka;
*predict probability of smoking;
predict pred_prob_smoke;
* get detailed descriptive data about predicted prob;
sum pred_prob, detail;
* predict binary outcome with 50% cutoff;
gen pred_smoke1=pred_prob_smoke>=.5;
label variable pred_smoke1 "predicted smoking, 50% cutoff";
* compare actual values;
tab smoker pred_smoke1, row col cell;
* ask for marginal effects/treatment effects;
mfx compute;
* the same type of variables can be produced with;
* prchange. this command is however more flexible;
* in that you can change the reference individual;
prchange, help;
* get marginal effect/treatment effects for specific person;
* age 40, not black, not hispanic, hsgrad=0, somecol=0, without workplace;
* smoking ban. if a variable is not specified, its value is assumed to be;
* the sample mean. in this case male, college and log income are not;
* listed in x(), so they are held at their sample means;
prchange, x(age=40 black=0 hispanic=0 hsgrad=0 somecol=0 worka=0);
* using a wald test, test the null hypothesis that;
* all the education coefficients are zero;
test hsgrad somecol college;
* how to run the same test with a -2 log like test;
* estimate the unrestricted model and save the estimates ;
* in urmodel;
probit smoker age incomel male black hispanic
hsgrad somecol college worka;
estimates store urmodel;
* estimate the restricted model. save results in rmodel;
probit smoker age incomel male black hispanic
worka;
estimates store rmodel;
lrtest urmodel rmodel;
* run logit model;
logit smoker age incomel male black hispanic
hsgrad somecol college worka;
* ask for marginal effects/treatment effects;
* logit model;
mfx compute;
log close;
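
The "-2 log like" test in the program compares the log likelihoods of the unrestricted and restricted probits. As a worked check, the minimal Python sketch below recomputes the statistic from the two log likelihoods reported in the results log and evaluates its chi-square p-value with 3 degrees of freedom (one per excluded education dummy). The helper `chi2_sf_3df` is hypothetical and uses a closed form that holds only for 3 degrees of freedom.

```python
import math

# Log likelihoods as reported in the results log.
loglik_unrestricted = -8761.7208   # probit with hsgrad, somecol, college
loglik_restricted = -9022.1031     # probit without the education dummies

# Likelihood-ratio statistic: twice the gain in log likelihood.
lr_stat = 2 * (loglik_unrestricted - loglik_restricted)

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees
    of freedom (closed form valid for df = 3 only)."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))

p_value = chi2_sf_3df(lr_stat)
print(round(lr_stat, 2), p_value)
```

The statistic reproduces the LR chi2(3) value that `lrtest` reports, and the p-value is far below any conventional significance level.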
STATA Results for Probit/Logit Models
workplace.log
-------------------------------------------------------------------------------
       log:  c:\bill\jpsm\workplace1.log
  log type:  text
 opened on:   4 Nov 2004, 07:29:21
. * use the workplace data set;
. use c:\bill\jpsm\workplace1;
. * print out variable labels;
. desc;
Contains data from c:\bill\jpsm\workplace1.dta
  obs:        16,258
 vars:            10                          28 Oct 2004 05:27
 size:       325,160 (96.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
smoker          byte   %9.0g                  is current smoking
worka           byte   %9.0g                  has workplace smoking bans
age             byte   %9.0g                  age in years
male            byte   %9.0g                  male
black           byte   %9.0g                  black
hispanic        byte   %9.0g                  hispanic
incomel         float  %9.0g                  log income
hsgrad          byte   %9.0g                  is hs graduate
somecol         byte   %9.0g                  has some college
college         float  %9.0g
-------------------------------------------------------------------------------
Sorted by:
. * get summary statistics;
. sum;
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
      smoker |     16258      .25163     .433963           0          1
       worka |     16258    .6851396    .4644745           0          1
         age |     16258    38.54742    11.96189          18         87
        male |     16258    .3947595     .488814           0          1
       black |     16258    .1119449    .3153083           0          1
-------------+---------------------------------------------------------
    hispanic |     16258    .0607086    .2388023           0          1
     incomel |     16258    10.42097    .7624525    6.214608   11.22524
      hsgrad |     16258    .3355271    .4721889           0          1
     somecol |     16258    .2685447    .4432161           0          1
     college |     16258    .3293763    .4700012           0          1

. * run a linear probability model for comparison purposes;
. * estimate white standard errors to control for heteroskedasticity;
. reg smoker age incomel male black hispanic
> hsgrad somecol college worka, robust;
Regression with robust standard errors                 Number of obs =   16258
                                                       F(  9, 16248) =   99.26
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0488
                                                       Root MSE      =  .42336

------------------------------------------------------------------------------
             |               Robust
      smoker |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0004776   .0002806    -1.70   0.089     -.0010276    .0000725
     incomel |  -.0287361   .0047823    -6.01   0.000       -.03811   -.0193621
        male |   .0168615   .0069542     2.42   0.015      .0032305    .0304926
       black |  -.0356723   .0110203    -3.24   0.001     -.0572732   -.0140714
    hispanic |   -.070582   .0136691    -5.16   0.000      -.097375    -.043789
      hsgrad |  -.0661429   .0162279    -4.08   0.000     -.0979514   -.0343345
     somecol |  -.1312175   .0164726    -7.97   0.000     -.1635056   -.0989293
     college |  -.2406109   .0162568   -14.80   0.000      -.272476   -.2087459
       worka |   -.066076   .0074879    -8.82   0.000      -.080753    -.051399
       _cons |   .7530714   .0494255    15.24   0.000      .6561919    .8499509
------------------------------------------------------------------------------

. * run probit model;
. probit smoker age incomel male black hispanic
> hsgrad somecol college worka;
Iteration 0:   log likelihood =  -9171.443
Iteration 1:   log likelihood =  -8764.068
Iteration 2:   log likelihood = -8761.7211
Iteration 3:   log likelihood = -8761.7208

Probit estimates                                  Number of obs   =      16258
                                                  LR chi2(9)      =     819.44
                                                  Prob > chi2     =     0.0000
Log likelihood = -8761.7208                       Pseudo R2       =     0.0447

------------------------------------------------------------------------------
      smoker |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0012684   .0009316    -1.36   0.173     -.0030943    .0005574
     incomel |   -.092812   .0151496    -6.13   0.000     -.1225047   -.0631193
        male |   .0533213   .0229297     2.33   0.020      .0083799    .0982627
       black |  -.1060518    .034918    -3.04   0.002       -.17449   -.0376137
    hispanic |  -.2281468   .0475128    -4.80   0.000     -.3212701   -.1350235
      hsgrad |  -.1748765   .0436392    -4.01   0.000     -.2604078   -.0893453
     somecol |   -.363869   .0451757    -8.05   0.000     -.4524118   -.2753262
     college |  -.7689528   .0466418   -16.49   0.000      -.860369   -.6775366
       worka |  -.2093287   .0231425    -9.05   0.000     -.2546873   -.1639702
       _cons |    .870543    .154056     5.65   0.000      .5685989    1.172487
------------------------------------------------------------------------------

. *predict probability of smoking;
. predict pred_prob_smoke;
(option p assumed; Pr(smoker))
. * get detailed descriptive data about predicted prob;
. sum pred_prob, detail;
                         Pr(smoker)
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .0959301       .0615221
 5%     .1155022       .0622963
10%     .1237434       .0633929       Obs               16258
25%     .1620851       .0733495       Sum of Wgt.       16258

50%     .2569962                      Mean           .2516653
                        Largest       Std. Dev.      .0960007
75%     .3187975       .5619798
90%     .3795704       .5655878       Variance       .0092161
95%     .4039573       .5684112       Skewness       .1520254
99%     .4672697       .6203823       Kurtosis       2.149247

. * predict binary outcome with 50% cutoff;
. gen pred_smoke1=pred_prob_smoke>=.5;
. label variable pred_smoke1 "predicted smoking, 50% cutoff";
. * compare actual values;
. tab smoker pred_smoke1, row col cell;
+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
|  cell percentage  |
+-------------------+

           |  predicted smoking,
is current |      50% cutoff
   smoking |         0          1 |     Total
-----------+----------------------+----------
         0 |    12,153         14 |    12,167
           |     99.88       0.12 |    100.00
           |     74.93      35.90 |     74.84
           |     74.75       0.09 |     74.84
-----------+----------------------+----------
         1 |     4,066         25 |     4,091
           |     99.39       0.61 |    100.00
           |     25.07      64.10 |     25.16
           |     25.01       0.15 |     25.16
-----------+----------------------+----------
     Total |    16,219         39 |    16,258
           |     99.76       0.24 |    100.00
           |    100.00     100.00 |    100.00
           |     99.76       0.24 |    100.00
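
The cross-tabulation of actual against predicted smoking can be summarized in a few rates. The minimal Python sketch below, with illustrative variable names, computes the share of correct predictions from the four cell counts and compares it with the naive rule of predicting "non-smoker" for everyone.

```python
# Cell counts from the table of smoker (rows) by pred_smoke1 (columns).
true_negative = 12153    # non-smokers predicted not to smoke
false_positive = 14      # non-smokers predicted to smoke
false_negative = 4066    # smokers predicted not to smoke
true_positive = 25       # smokers predicted to smoke

total = true_negative + false_positive + false_negative + true_positive

# Share of observations classified correctly with the 50% cutoff.
accuracy = (true_negative + true_positive) / total

# Accuracy of always predicting the more common outcome (non-smoker).
baseline = (true_negative + false_positive) / total

# Share of actual smokers that the model flags as smokers.
sensitivity = true_positive / (true_positive + false_negative)

print(round(accuracy, 4), round(baseline, 4), round(sensitivity, 4))
```

With predicted probabilities that rarely exceed .5, the 50% cutoff classifies almost everyone as a non-smoker, so its accuracy is barely above the naive rule.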
. * ask for marginal effects/treatment effects;
. mfx compute;
Marginal effects after probit
      y  = Pr(smoker) (predict)
         =  .24093439
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     age |  -.0003951      .00029   -1.36   0.173  -.000964  .000174   38.5474
 incomel |  -.0289139      .00472   -6.13   0.000   -.03816 -.019668    10.421
    male*|   .0166757       .0072    2.32   0.021   .002568  .030783    .39476
   black*|  -.0320621      .01023   -3.13   0.002  -.052111 -.012013   .111945
hispanic*|  -.0658551      .01259   -5.23   0.000  -.090536 -.041174   .060709
  hsgrad*|   -.053335      .01302   -4.10   0.000   -.07885  -.02782   .335527
 somecol*|  -.1062358      .01228   -8.65   0.000  -.130308 -.082164   .268545
 college*|  -.2149199      .01146  -18.76   0.000  -.237378 -.192462   .329376
   worka*|  -.0668959      .00756   -8.84   0.000   -.08172 -.052072    .68514
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
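
For a continuous regressor, the probit marginal effect at the means is the standard normal density at the index, multiplied by the coefficient. The minimal Python sketch below recovers the index from the predicted probability reported by `mfx` and reproduces the dy/dx entries for age and incomel. It is an illustrative check, not Stata's internal routine, and it does not apply to the starred dummy variables, whose entries are discrete 0 to 1 changes.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal: mean 0, sd 1

# Predicted probability at the sample means, as reported by mfx.
prob_at_means = 0.24093439

# Invert the normal CDF to recover the index x'b at the means,
# then evaluate the normal density at that index.
index_at_means = std_normal.inv_cdf(prob_at_means)
density = std_normal.pdf(index_at_means)

# Probit coefficients of the two continuous regressors.
coefficients = {"age": -0.0012684, "incomel": -0.092812}

# Marginal effect = density at the index times the coefficient.
marginal_effects = {name: density * beta for name, beta in coefficients.items()}

for name, effect in marginal_effects.items():
    print(name, round(effect, 7))
```

The results agree with the table's -.0003951 and -.0289139 up to rounding of the printed inputs.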
. * the same type of variables can be produced with;
. * prchange. this command is however more flexible;
. * in that you can change the reference individual;
. prchange, help;

probit: Changes in Predicted Probabilities for smoker

          min->max      0->1     -+1/2    -+sd/2  MargEfct
     age   -0.0269   -0.0004   -0.0004   -0.0047   -0.0004
 incomel   -0.1589   -0.0361   -0.0289   -0.0220   -0.0289
    male    0.0167    0.0167    0.0166    0.0081    0.0166
   black   -0.0321   -0.0321   -0.0330   -0.0104   -0.0330
hispanic   -0.0659   -0.0659   -0.0710   -0.0170   -0.0711
  hsgrad   -0.0533   -0.0533   -0.0544   -0.0257   -0.0545
 somecol   -0.1062   -0.1062   -0.1130   -0.0502   -0.1134
 college   -0.2149   -0.2149   -0.2366   -0.1123   -0.2396
   worka   -0.0669   -0.0669   -0.0652   -0.0303   -0.0652

               0         1
Pr(y|x)   0.7591    0.2409

            age  incomel     male    black hispanic   hsgrad  somecol
    x=  38.5474   10.421   .39476  .111945  .060709  .335527  .268545
sd(x)=  11.9619  .762452  .488814  .315308  .238802  .472189  .443216

        college    worka
    x=  .329376   .68514
sd(x)=  .470001  .464475

 Pr(y|x): probability of observing each y for specified x values
Avg|Chg|: average of absolute value of the change across categories
Min->Max: change in predicted probability as x changes from its minimum to
          its maximum
    0->1: change in predicted probability as x changes from 0 to 1
   -+1/2: change in predicted probability as x changes from 1/2 unit below
          base value to 1/2 unit above
  -+sd/2: change in predicted probability as x changes from 1/2 standard
          dev below base to 1/2 standard dev above
MargEfct: the partial derivative of the predicted probability/rate with
          respect to a given independent variable

. * get marginal effect/treatment effects for specific person;
. * age 40, not black, not hispanic, hsgrad=0, somecol=0, without workplace;
. * smoking ban. if a variable is not specified, its value is assumed to be;
. * the sample mean. in this case male, college and log income are not;
. * listed in x(), so they are held at their sample means;
. prchange, x(age=40 black=0 hispanic=0 hsgrad=0 somecol=0 worka=0);

probit: Changes in Predicted Probabilities for smoker

          min->max      0->1     -+1/2    -+sd/2  MargEfct
     age   -0.0323   -0.0005   -0.0005   -0.0056   -0.0005
 incomel   -0.1795   -0.0320   -0.0344   -0.0263   -0.0345
    male    0.0198    0.0198    0.0198    0.0097    0.0198
   black   -0.0385   -0.0385   -0.0394   -0.0124   -0.0394
hispanic   -0.0804   -0.0804   -0.0845   -0.0202   -0.0847
  hsgrad   -0.0625   -0.0625   -0.0648   -0.0306   -0.0649
 somecol   -0.1235   -0.1235   -0.1344   -0.0598   -0.1351
 college   -0.2644   -0.2644   -0.2795   -0.1335   -0.2854
   worka   -0.0742   -0.0742   -0.0776   -0.0361   -0.0777

               0         1
Pr(y|x)   0.6479    0.3521

            age  incomel     male    black hispanic   hsgrad  somecol
    x=       40   10.421   .39476        0        0        0        0
sd(x)=  11.9619  .762452  .488814  .315308  .238802  .472189  .443216

        college    worka
    x=  .329376        0
sd(x)=  .470001  .464475

. * using a wald test, test the null hypothesis that;
. * all the education coefficients are zero;
. test hsgrad somecol college;

 ( 1)  hsgrad = 0
 ( 2)  somecol = 0
 ( 3)  college = 0

           chi2(  3) =  504.78
         Prob > chi2 =    0.0000

. * how to run the same test with a -2 log like test;
. * estimate the unrestricted model and save the estimates ;
. * in urmodel;
. probit smoker age incomel male black hispanic
> hsgrad somecol college worka;

Iteration 0:   log likelihood =  -9171.443
Iteration 1:   log likelihood =  -8764.068
Iteration 2:   log likelihood = -8761.7211
Iteration 3:   log likelihood = -8761.7208

Probit estimates                                  Number of obs   =      16258
                                                  LR chi2(9)      =     819.44
                                                  Prob > chi2     =     0.0000
Log likelihood = -8761.7208                       Pseudo R2       =     0.0447

------------------------------------------------------------------------------
      smoker |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0012684   .0009316    -1.36   0.173     -.0030943    .0005574
     incomel |   -.092812   .0151496    -6.13   0.000     -.1225047   -.0631193
        male |   .0533213   .0229297     2.33   0.020      .0083799    .0982627
       black |  -.1060518    .034918    -3.04   0.002       -.17449   -.0376137
    hispanic |  -.2281468   .0475128    -4.80   0.000     -.3212701   -.1350235
      hsgrad |  -.1748765   .0436392    -4.01   0.000     -.2604078   -.0893453
     somecol |   -.363869   .0451757    -8.05   0.000     -.4524118   -.2753262
     college |  -.7689528   .0466418   -16.49   0.000      -.860369   -.6775366
       worka |  -.2093287   .0231425    -9.05   0.000     -.2546873   -.1639702
       _cons |    .870543    .154056     5.65   0.000      .5685989    1.172487
------------------------------------------------------------------------------

. estimates store urmodel;
. * estimate the restricted model. save results in rmodel;
. probit smoker age incomel male black hispanic
> worka;
Iteration 0:   log likelihood =  -9171.443
Iteration 1:   log likelihood = -9022.2473
Iteration 2:   log likelihood = -9022.1031

Probit estimates                                  Number of obs   =      16258
                                                  LR chi2(6)      =     298.68
                                                  Prob > chi2     =     0.0000
Log likelihood = -9022.1031                       Pseudo R2       =     0.0163

------------------------------------------------------------------------------
      smoker |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0003514   .0009163     0.38   0.701     -.0014445    .0021473
     incomel |  -.1802868   .0143242   -12.59   0.000     -.2083617    -.152212
        male |  -.0117546   .0223519    -0.53   0.599     -.0555635    .0320543
       black |  -.0650982   .0345516    -1.88   0.060     -.1328181    .0026217
    hispanic |   -.152071   .0465132    -3.27   0.001     -.2432351   -.0609069
       worka |  -.2501544   .0227794   -10.98   0.000     -.2948012   -.2055076
       _cons |    1.37729   .1472574     9.35   0.000       1.08867    1.665909
------------------------------------------------------------------------------

. estimates store rmodel;
. lrtest urmodel rmodel;

likelihood-ratio test                                  LR chi2(3)  =    520.76
(Assumption: rmodel nested in urmodel)                 Prob > chi2 =    0.0000

. * run logit model;
. logit smoker age incomel male black hispanic
> hsgrad somecol college worka;

Iteration 0:   log likelihood =  -9171.443
Iteration 1:   log likelihood = -8770.6512
Iteration 2:   log likelihood = -8760.9282
Iteration 3:   log likelihood = -8760.9112

Logit estimates                                   Number of obs   =      16258
                                                  LR chi2(9)      =     821.06
                                                  Prob > chi2     =     0.0000
Log likelihood = -8760.9112                       Pseudo R2       =     0.0448

------------------------------------------------------------------------------
      smoker |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0026236   .0015594    -1.68   0.092     -.0056799    .0004327
     incomel |  -.1518663   .0251899    -6.03   0.000     -.2012376    -.102495
        male |   .0942472   .0390171     2.42   0.016      .0177751    .1707192
       black |   -.196468   .0598366    -3.28   0.001     -.3137456   -.0791904
    hispanic |  -.4024453   .0825043    -4.88   0.000     -.5641507   -.2407399
      hsgrad |  -.2906189   .0707661    -4.11   0.000      -.429318   -.1519199
     somecol |  -.6092455    .073822    -8.25   0.000     -.7539339   -.4645571
     college |  -1.325203   .0780572   -16.98   0.000     -1.478192   -1.172214
       worka |  -.3508271   .0389286    -9.01   0.000     -.4271257   -.2745285
       _cons |   1.467936    .255991     5.73   0.000      .9662025    1.969669
------------------------------------------------------------------------------

. * ask for marginal effects/treatment effects;
. * logit model;
. mfx compute;
Marginal effects after logit
      y  = Pr(smoker) (predict)
         =  .23812502
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     age |   -.000476      .00028   -1.68   0.092   -.00103  .000078   38.5474
 incomel |  -.0275518      .00457   -6.03   0.000    -.0365 -.018604    10.421
    male*|   .0171866      .00715    2.40   0.016   .003174    .0312    .39476
   black*|  -.0342102      .00998   -3.43   0.001  -.053765 -.014655   .111945
hispanic*|  -.0661959      .01217   -5.44   0.000  -.090044 -.042347   .060709
  hsgrad*|  -.0513887      .01219   -4.22   0.000  -.075278   -.0275   .335527
 somecol*|   -.102284      .01141   -8.97   0.000  -.124644 -.079924   .268545
 college*|  -.2120833       .0108  -19.64   0.000  -.233248 -.190919   .329376
   worka*|  -.0657566       .0075   -8.76   0.000  -.080464  -.05105    .68514
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
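
In the logit model the marginal effect of a continuous regressor at the means is p(1-p) times its coefficient, where p is the predicted probability at the means. The minimal Python sketch below checks this against the dy/dx entries for age and incomel. The variable names are illustrative, and the formula does not apply to the starred dummy variables, which are reported as discrete 0 to 1 changes.

```python
# Predicted probability at the sample means, as reported by mfx after logit.
prob_at_means = 0.23812502

# Derivative of the logistic CDF evaluated at the index: p * (1 - p).
logistic_density = prob_at_means * (1 - prob_at_means)

# Logit coefficients of the two continuous regressors.
coefficients = {"age": -0.0026236, "incomel": -0.1518663}

# Marginal effect = p(1-p) times the coefficient.
marginal_effects = {name: logistic_density * beta
                    for name, beta in coefficients.items()}

for name, effect in marginal_effects.items():
    print(name, round(effect, 7))
```

Although the logit coefficients are larger in magnitude than the probit ones, the implied marginal effects are close to those reported after the probit.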
. log close;
       log:  c:\bill\jpsm\workplace1.log
  log type:  text
 closed on:   4 Nov 2004, 07:30:16
-------------------------------------------------------------------------------
STATA Program for Odds Ratio in Logit Models
natal95.do
* this data set is a small 0.5% random sample;
* of observations from the 1995 natality detail;
* data. we will examine the impact of smoking;
* on birth weight. two large states, NY and CA, do not;
* record mothers smoking status. therefore, of the ;
* 4 million births in the US, only 3 million have all;
* the necessary data so there should be 3 million*.005;
* or roughly 15,000 obs;
* set semi colon as the end of line;
# delimit;
* ask it NOT to pause;
set more off;
* open log file;
log using c:\bill\jpsm\natal95.log,replace;
* use the natality detail data set;
use c:\bill\jpsm\natal95;
* print out variable labels;
desc;
* construct indicator for low birth weight;
gen lowbw=birthw<=2500;
label variable lowbw "dummy variable, =1 ifBW<2500 grams";
* get frequencies;
tab lowbw smoked, col row cell;
* run a logit model;
xi: logit lowbw smoked age married i.educ5 i.race4;
* get marginal effects;
mfx compute;
* run a logit but report the odds ratios instead;
xi: logistic lowbw smoked age married i.educ5 i.race4;
log close;
STATA Results for Odds Ratio in Logit Models
natal95.log
-------------------------------------------------------------------------------
       log:  c:\bill\jpsm\natal95.log
  log type:  text
 opened on:   4 Nov 2004, 05:48:05
. * use the natality detail data set;
. use c:\bill\jpsm\natal95;
. * print out variable labels;
. desc;
Contains data from c:\bill\jpsm\natal95.dta
  obs:        14,230
 vars:             7                          27 Oct 2004 14:58
 size:       170,760 (98.4% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
birthw          int    %9.0g                  birth weight in grams
smoked          byte   %9.0g                  =1 if mom smoked during
                                                pregnancy
age             byte   %9.0g                  moms age at birth
married         byte   %9.0g                  =1 if married
race4           byte   %9.0g                  1=white,2=black,3=asian,4=other
educ5           byte   %9.0g                  1=0-8, 2=9-11, 3=12, 4=13-15,
                                                5=16+
visits          byte   %9.0g                  prenatal visits
-------------------------------------------------------------------------------
Sorted by:
. * construct indicator for low birth weight;
. gen lowbw=birthw<=2500;
. label variable lowbw "dummy variable, =1 ifBW<2500 grams";
. * get frequencies;
. tab lowbw smoked, col row cell;
+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
|  cell percentage  |
+-------------------+

     dummy |
 variable, |
        =1 |   =1 if mom smoked
 ifBW<2500 |   during pregnancy
     grams |         0          1 |     Total
-----------+----------------------+----------
         0 |    11,626      1,745 |    13,371
           |     86.95      13.05 |    100.00
           |     94.64      89.72 |     93.96
           |     81.70      12.26 |     93.96
-----------+----------------------+----------
         1 |       659        200 |       859
           |     76.72      23.28 |    100.00
           |      5.36      10.28 |      6.04
           |      4.63       1.41 |      6.04
-----------+----------------------+----------
     Total |    12,285      1,945 |    14,230
           |     86.33      13.67 |    100.00
           |    100.00     100.00 |    100.00
           |     86.33      13.67 |    100.00
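
The two-by-two table already contains an unadjusted odds ratio for smoking. The minimal Python sketch below computes it from the four cell counts, along with the low birth weight rate in each group. The variable names are illustrative; the result is the raw odds ratio, with none of the controls used in the logit model.

```python
# Cell counts from the table of lowbw (rows) by smoked (columns).
normal_bw_nonsmoker = 11626
normal_bw_smoker = 1745
low_bw_nonsmoker = 659
low_bw_smoker = 200

# Share of births that are low birth weight within each smoking group.
rate_smokers = low_bw_smoker / (low_bw_smoker + normal_bw_smoker)
rate_nonsmokers = low_bw_nonsmoker / (low_bw_nonsmoker + normal_bw_nonsmoker)

# Odds of low birth weight in each group, and their ratio.
odds_smokers = low_bw_smoker / normal_bw_smoker
odds_nonsmokers = low_bw_nonsmoker / normal_bw_nonsmoker
unadjusted_odds_ratio = odds_smokers / odds_nonsmokers

print(round(rate_smokers, 4), round(rate_nonsmokers, 4),
      round(unadjusted_odds_ratio, 3))
```

The two rates reproduce the column percentages 10.28 and 5.36 in the table.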
. * run a logit model;
. xi: logit lowbw smoked age married i.educ5 i.race4;
i.educ5           _Ieduc5_1-5         (naturally coded; _Ieduc5_1 omitted)
i.race4           _Irace4_1-4         (naturally coded; _Irace4_1 omitted)

Iteration 0:   log likelihood =  -3244.039
Iteration 1:   log likelihood = -3149.3534
Iteration 2:   log likelihood = -3137.0703
Iteration 3:   log likelihood = -3136.9913
Iteration 4:   log likelihood = -3136.9912

Logit estimates                                   Number of obs   =      14230
                                                  LR chi2(10)     =     214.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -3136.9912                       Pseudo R2       =     0.0330

------------------------------------------------------------------------------
       lowbw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      smoked |   .6740651   .0897869     7.51   0.000      .4980861    .8500441
         age |   .0080537    .006791     1.19   0.236     -.0052564    .0213638
     married |  -.3954044   .0882471    -4.48   0.000     -.5683654   -.2224433
   _Ieduc5_2 |  -.1949335   .1626502    -1.20   0.231     -.5137221    .1238551
   _Ieduc5_3 |  -.1925099   .1543239    -1.25   0.212     -.4949791    .1099594
   _Ieduc5_4 |  -.4057382   .1676759    -2.42   0.016     -.7343769   -.0770994
   _Ieduc5_5 |  -.3569715   .1780322    -2.01   0.045     -.7059081   -.0080349
   _Irace4_2 |   .7072894   .0875125     8.08   0.000      .5357681    .8788107
   _Irace4_3 |    .386623    .307062     1.26   0.208     -.2152075    .9884535
   _Irace4_4 |   .3095536   .2047899     1.51   0.131     -.0918271    .7109344
       _cons |  -2.755971   .2104916   -13.09   0.000     -3.168527   -2.343415
------------------------------------------------------------------------------

. * get marginal effects;
. mfx compute;
Marginal effects after logit
      y  = Pr(lowbw) (predict)
         =  .05465609
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  smoked*|   .0436744      .00706    6.18   0.000   .029834  .057514   .136683
     age |   .0004161      .00035    1.19   0.236  -.000271  .001104   26.6564
 married*|  -.0218806       .0052   -4.21   0.000  -.032074 -.011687   .683204
_Ieduc~2*|  -.0095123      .00749   -1.27   0.204  -.024188  .005164   .165495
_Ieduc~3*|  -.0096965      .00758   -1.28   0.201  -.024554  .005161   .345397
_Ieduc~4*|  -.0190499      .00714   -2.67   0.008  -.033043 -.005057    .22319
_Ieduc~5*|  -.0169077      .00771   -2.19   0.028  -.032027 -.001788   .216093
_Irace~2*|   .0453844      .00675    6.72   0.000   .032148  .058621    .17168
_Irace~3*|   .0236917      .02204    1.07   0.282  -.019506   .06689   .010401
_Irace~4*|    .018225      .01363    1.34   0.181  -.008488  .044938   .031694
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1

. * run a logit but report the odds ratios instead;
. xi: logistic lowbw smoked age married i.educ5 i.race4;
i.educ5           _Ieduc5_1-5         (naturally coded; _Ieduc5_1 omitted)
i.race4           _Irace4_1-4         (naturally coded; _Irace4_1 omitted)

Logistic regression                               Number of obs   =      14230
                                                  LR chi2(10)     =     214.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -3136.9912                       Pseudo R2       =     0.0330

------------------------------------------------------------------------------
       lowbw | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      smoked |   1.962198   .1761796     7.51   0.000     1.645569     2.33975
         age |   1.008086   .0068459     1.19   0.236     .9947574    1.021594
     married |   .6734077   .0594262    -4.48   0.000     .5664506    .8005604
   _Ieduc5_2 |   .8228894   .1338431    -1.20   0.231     .5982646    1.131852
   _Ieduc5_3 |   .8248862   .1272996    -1.25   0.212     .6095837    1.116233
   _Ieduc5_4 |   .6664847   .1117534    -2.42   0.016     .4798043    .9257979
   _Ieduc5_5 |   .6997924   .1245856    -2.01   0.045     .4936601    .9919973
   _Irace4_2 |   2.028485   .1775178     8.08   0.000      1.70876    2.408034
   _Irace4_3 |   1.472001   .4519957     1.26   0.208     .8063741    2.687076
   _Irace4_4 |   1.362817   .2790911     1.51   0.131     .9122628    2.035893
------------------------------------------------------------------------------

. log close;
       log:  c:\bill\jpsm\natal95.log
  log type:  text
 closed on:   4 Nov 2004, 05:48:39
------------------------------------------------------------------------------
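
The logit and logistic tables above describe the same fitted model, so the numbers can be cross-checked by hand. Each odds ratio is the exponential of the matching logit coefficient, and for a continuous regressor such as age the mfx marginal effect is the coefficient times p(1-p), evaluated at the predicted probability that mfx reports. The lines below are a minimal sketch of that check, not part of the original program. They use values copied from the log above and assume the default carriage-return delimiter rather than the semicolon delimiter used in the do-file.

```stata
* odds ratio for smoked = exp(logit coefficient);
* compare with the 1.962198 reported by logistic
display exp(.6740651)

* marginal effect of age = b*p*(1-p) at p = .05465609;
* compare with the .0004161 reported by mfx
display .0080537 * .05465609 * (1 - .05465609)
```

For the dummy variables (marked with * in the mfx table) the reported effect is a discrete 0 to 1 change in the predicted probability, so the p(1-p) shortcut only approximates those rows.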
STATA Program for Ordered Probit Models
sr_health_status.do
* the data for this example are adults, 18-64;
* who answered the cancer control supplement to;
* the 1994 national health interview survey;
* the key outcome is self reported health status;
* coded 1-5, poor, fair, good, very good, excellent;
* a key covariate is current smoking status and whether;
* one smoked 5 years ago;
# delimit;
set memory 20m;
set matsize 200;
set more off;
log using c:\bill\jpsm\sr_health_status.log,replace;
* load up sas data set;
use c:\bill\jpsm\sr_health_status;
* get contents of data file;
desc;
* get summary statistics;
sum;
* get tabulation of sr_health;
tab sr_health;
* run OLS models, just to look at the raw correlations in data;
reg sr_health male age educ famincl black othrace smoke smoke5;
* do ordered probit, self reported health status;
oprobit sr_health male age educ famincl black othrace smoke smoke5;
* get marginal effects, evaluated at y=5 (excellent);
mfx compute, predict(outcome(5));
* get marginal effects, evaluated at y=3 (good);
mfx compute, predict(outcome(3));
* use prchange, evaluate marginal effects for;
* a 40 year old white respondent with a college degree;
* who never smoked; male and log income stay at their means;
prchange, x(age=40 black=0 othrace=0 smoke=0 smoke5=0 educ=16);
log close;
STATA Results for Ordered Probit Models
sr_health_status.log
------------------------------------------------------------------------------
       log:  c:\bill\iadb\sr_health_status.log
  log type:  text
 opened on:   1 Nov 2004, 12:06:56

. * load up sas data set;
. use sr_health_status;
. * get contents of data file;
. desc;

Contains data from sr_health_status.dta
  obs:        12,900
 vars:             9                          1 Nov 2004 11:51
 size:       322,500 (98.5% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
male            byte   %9.0g                  =1 if male
age             byte   %9.0g                  age in years
educ            byte   %9.0g                  years of education
smoke           byte   %9.0g                  current smoker
smoke5          byte   %9.0g                  smoked in past 5 years
black           float  %9.0g                  =1 if respondent is black
othrace         float  %9.0g                  =1 if other race (white is ref)
sr_health       float  %9.0g                  1-5 self reported health,
                                                5=excel, 1=poor
famincl         float  %9.0g                  log family income
-------------------------------------------------------------------------------
Sorted by:

. * get summary statistics;
. sum;

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
        male |     12900     .438062    .4961681          0          1
         age |     12900    39.84124    11.60603         21         64
        educ |     12900    13.24016     2.73325          0         18
       smoke |     12900    .2891473     .453384          0          1
      smoke5 |     12900    .0813953    .2734519          0          1
-------------+--------------------------------------------------------
       black |     12900    .1242636    .3298948          0          1
     othrace |     12900    .0412403    .1988532          0          1
   sr_health |     12900    3.888992    1.063713          1          5
     famincl |     12900    10.21313      .95086   6.214608   11.22524

. * get tabulation of sr_health;
. tab sr_health;

   1-5 self |
   reported |
    health, |
   5=excel, |
     1=poor |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        342        2.65        2.65
          2 |        991        7.68       10.33
          3 |      3,068       23.78       34.12
          4 |      3,855       29.88       64.00
          5 |      4,644       36.00      100.00
------------+-----------------------------------
      Total |     12,900      100.00

. * run OLS models, just to look at the raw correlations in data;
. reg sr_health male age educ famincl black othrace smoke smoke5;

      Source |       SS       df       MS              Number of obs =   12900
-------------+------------------------------           F(  8, 12891) =  350.85
       Model |  2609.62058     8  326.202572           Prob > F      =  0.0000
    Residual |  11985.4163 12891  .929750704           R-squared     =  0.1788
-------------+------------------------------           Adj R-squared =  0.1783
       Total |  14595.0369 12899  1.13148592           Root MSE      =  .96424

------------------------------------------------------------------------------
   sr_health |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .1033877   .0172399     6.00   0.000     .0695949    .1371804
         age |  -.0189687   .0007472   -25.39   0.000    -.0204333   -.0175041
        educ |    .074539   .0033897    21.99   0.000     .0678946    .0811833
     famincl |   .2299388   .0099542    23.10   0.000     .2104271    .2494504
       black |  -.2127016   .0265726    -8.00   0.000    -.2647878   -.1606153
     othrace |  -.2120907   .0429632    -4.94   0.000    -.2963049   -.1278765
       smoke |  -.1800193   .0196221    -9.17   0.000    -.2184815   -.1415572
      smoke5 |  -.1356116   .0317119    -4.28   0.000    -.1977716   -.0734515
       _cons |   1.362405   .1005616    13.55   0.000     1.165289     1.55952
------------------------------------------------------------------------------

. * do ordered probit, self reported health status;
. oprobit sr_health male age educ famincl black othrace smoke smoke5;

Iteration 0:   log likelihood = -17591.791
Iteration 1:   log likelihood = -16403.785
Iteration 2:   log likelihood = -16401.987
Iteration 3:   log likelihood = -16401.987

Ordered probit estimates                          Number of obs   =      12900
                                                  LR chi2(8)      =    2379.61
                                                  Prob > chi2     =     0.0000
Log likelihood = -16401.987                       Pseudo R2       =     0.0676

------------------------------------------------------------------------------
   sr_health |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .1281241   .0195747     6.55   0.000     .0897583    .1664899
         age |  -.0202308   .0008499   -23.80   0.000    -.0218966    -.018565
        educ |   .0827086   .0038547    21.46   0.000     .0751535    .0902637
     famincl |   .2398957   .0112206    21.38   0.000     .2179037    .2618878
       black |   -.221508    .029528    -7.50   0.000    -.2793818   -.1636341
     othrace |  -.2425083   .0480047    -5.05   0.000    -.3365958   -.1484208
       smoke |  -.2086096   .0219779    -9.49   0.000    -.2516855   -.1655337
      smoke5 |  -.1529619   .0357995    -4.27   0.000    -.2231277   -.0827961
-------------+----------------------------------------------------------------
       _cut1 |   .4858634    .113179          (Ancillary parameters)
       _cut2 |   1.269036     .11282
       _cut3 |   2.247251   .1138171
       _cut4 |   3.094606   .1145781
------------------------------------------------------------------------------

. * get marginal effects, evaluated at y=5 (excellent);
. mfx compute, predict(outcome(5));

Marginal effects after oprobit
      y  = Pr(sr_health==5) (predict, outcome(5))
         =  .34103717
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
    male*|   .0471251      .00722    6.53   0.000    .03298   .06127   .438062
     age |  -.0074214      .00031  -23.77   0.000  -.008033  -.00681   39.8412
    educ |   .0303405      .00142   21.42   0.000   .027565  .033116   13.2402
 famincl |   .0880025      .00412   21.37   0.000    .07993  .096075   10.2131
   black*|  -.0781411      .00996   -7.84   0.000  -.097665 -.058617   .124264
 othrace*|  -.0843227      .01567   -5.38   0.000  -.115043 -.053602    .04124
   smoke*|  -.0749785      .00773   -9.71   0.000   -.09012 -.059837   .289147
  smoke5*|  -.0545062      .01235   -4.41   0.000  -.078719 -.030294   .081395
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1

. * get marginal effects, evaluated at y=3 (good);
. mfx compute, predict(outcome(3));

Marginal effects after oprobit
      y  = Pr(sr_health==3) (predict, outcome(3))
         =  .25239744
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
    male*|  -.0276959      .00425   -6.51   0.000  -.036029 -.019363   .438062
     age |   .0043717       .0002   21.81   0.000   .003979  .004765   39.8412
    educ |  -.0178727      .00089  -20.02   0.000  -.019623 -.016123   13.2402
 famincl |  -.0518395      .00261  -19.85   0.000  -.056959  -.04672   10.2131
   black*|   .0464219      .00599    7.75   0.000   .034675  .058169   .124264
 othrace*|   .0501493      .00934    5.37   0.000   .031834  .068464    .04124
   smoke*|   .0443735      .00464    9.56   0.000   .035272  .053476   .289147
  smoke5*|   .0323707      .00739    4.38   0.000   .017882   .04686   .081395
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1

. * use prchange, evaluate marginal effects for;
. * a 40 year old white respondent with a college degree;
. * who never smoked; male and log income stay at their means;
. prchange, x(age=40 black=0 othrace=0 smoke=0 smoke5=0 educ=16);

oprobit: Changes in Predicted Probabilities for sr_health

male
        Avg|Chg|           1           2           3           4           5
    0->1    .0203868   -.0020257  -.00886671  -.02677558  -.01329902   .05096698

age
            Avg|Chg|           1           2           3           4           5
Min->Max   .13358317    .0184785   .06797072   .17686112   .07064757  -.33395794
   -+1/2   .00321942   .00032518   .00141642   .00424452   .00206241  -.00804856
  -+sd/2   .03728014   .00382077   .01648743   .04910323    .0237889  -.09320036
MargEfct   .00321947   .00032515   .00141639   .00424462   .00206252  -.00804868

educ
            Avg|Chg|           1           2           3           4           5
Min->Max   .21397413  -.10945692  -.19725057  -.22822781   .07974288   .45519245
   -+1/2   .01315829  -.00133136  -.00579271  -.01734608  -.00842556   .03289571
  -+sd/2   .03589903   -.0036753  -.01587057  -.04728749  -.02291423   .08974758
MargEfct   .01316202   -.0013293  -.00579057  -.01735309  -.00843208   .03290504

famincl
            Avg|Chg|           1           2           3           4           5
Min->Max   .16759798  -.05486112  -.13623201  -.22790183   .00276569   .41622926
   -+1/2   .03808549  -.00390581  -.01684746  -.05016185  -.02429861   .09521371
  -+sd/2   .03622223   -.0037093  -.01601486  -.04771243  -.02311897   .09055558
MargEfct   .03817633  -.00385563   -.0167955  -.05033251  -.02445719   .09544083

black
            Avg|Chg|           1           2           3           4           5
    0->1   .03467907   .00473166   .01835598   .04779626   .01581377  -.08669767

othrace
            Avg|Chg|           1           2           3           4           5
    0->1   .03787661   .00532324   .02040636   .05239134    .0165706  -.09469151

smoke
            Avg|Chg|           1           2           3           4           5
    0->1   .03270518   .00438228   .01712416   .04497364   .01528287  -.08176297

smoke5
            Avg|Chg|           1           2           3           4           5
    0->1   .02411037   .00299019     .012047   .03281575   .01242298  -.06027591

                 1           2           3           4           5
Pr(y|x)  .00563112   .03431748   .17979275   .30986777   .47039089

            male       age      educ   famincl     black   othrace     smoke
    x=   .438062        40        16   10.2131         0         0         0
sd(x)=   .496168    11.606   2.73325    .95086   .329895   .198853   .453384

          smoke5
    x=         0
sd(x)=   .273452

. log close;
       log:  c:\bill\iadb\sr_health_status.log
  log type:  text
 closed on:   1 Nov 2004, 12:07:40
------------------------------------------------------------------------------
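
The mfx results for outcome 5 can be reproduced from the ordered probit coefficients. With four cut points, Pr(y=5|x) = 1 - Phi(cut4 - xb), so the marginal effect of a continuous regressor is its coefficient times the normal density phi(cut4 - xb). The index cut4 - xb at the sample means can be backed out from the predicted probability .34103717 that mfx prints. The lines below are a sketch of that arithmetic, not part of the original program. They use values copied from the log above, assume the default carriage-return delimiter, and use the function names normalden() and invnormal() from recent Stata releases (older releases used different names for these functions).

```stata
* index at the means: cut4 - xb = invnormal(1 - Pr(y=5))
display invnormal(1 - .34103717)

* marginal effect of age on Pr(y=5) = b_age * phi(cut4 - xb);
* compare with the -.0074214 reported by mfx
display -.0202308 * normalden(invnormal(1 - .34103717))

* same calculation for educ; compare with .0303405
display .0827086 * normalden(invnormal(1 - .34103717))
```

The effects for the middle outcomes, such as y=3, involve the difference of two densities (at cut2 - xb and cut3 - xb), which is why their signs can differ from the sign of the coefficient.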
STATA Program for Count Data Models
drvisits.do
* drvisits.do;
* this program estimates a poisson and negative binomial;
* count data model. the data includes people aged 65+;
* from the 1987 nmes data set. dr visits are annual;
* this line defines the semicolon as the line delimiter;
# delimit ;
* set memory for 10 meg;
set memory 10m;
* open output file;
log using c:\bill\jpsm\drvisits.log,replace;
* open stata data set;
use c:\bill\jpsm\drvisits;
* generate new variables;
gen incomel=ln(income);
* get distribution of dr visits;
tabulate drvisits;
* get descriptive statistics;
sum;
* run poisson regression;
poisson drvisits age65 age70 age75 age80 chronic excel good fair female
black hispanic hs_drop hs_grad mcaid incomel;
* run neg binomial regression;
nbreg drvisits age65 age70 age75 age80 chronic excel good fair female
black hispanic hs_drop hs_grad mcaid incomel, dispersion(constant);
log close;
STATA Results for Count Data Models
drvisits.log
------------------------------------------------------------------------------
       log:  C:\bill\stata\drvisits.log
  log type:  text
 opened on:  28 Oct 2004, 13:44:05

. * open stata data set;
. use drvisits;
. * generate new variables;
. gen incomel=ln(income);
(28 missing values generated)
. * get distribution of dr visits;
. tabulate drvisits;

 annual doc |
     visits |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        915       17.18       17.18
          1 |        601       11.28       28.46
          2 |        533       10.01       38.46
          3 |        503        9.44       47.91
          4 |        450        8.45       56.35
          5 |        391        7.34       63.69
          6 |        319        5.99       69.68
          7 |        258        4.84       74.53
          8 |        216        4.05       78.58
          9 |        192        3.60       82.19
         10 |        147        2.76       84.94
         11 |        123        2.31       87.25
         12 |         99        1.86       89.11
         13 |         81        1.52       90.63
         14 |         80        1.50       92.13
         15 |         66        1.24       93.37
         16 |         56        1.05       94.42
         17 |         56        1.05       95.48
         18 |         34        0.64       96.11
         19 |         26        0.49       96.60
         20 |         17        0.32       96.92
         21 |         21        0.39       97.32
         22 |         20        0.38       97.69
         23 |         11        0.21       97.90
         24 |         15        0.28       98.18
         25 |          4        0.08       98.25
         26 |         12        0.23       98.48
         27 |          9        0.17       98.65
         28 |          6        0.11       98.76
         29 |          4        0.08       98.84
         30 |          5        0.09       98.93
         31 |          6        0.11       99.04
         32 |          2        0.04       99.08
         33 |          2        0.04       99.12
         34 |          3        0.06       99.17
         35 |          2        0.04       99.21
         36 |          2        0.04       99.25
         37 |          4        0.08       99.32
         38 |          2        0.04       99.36
         39 |          5        0.09       99.46
         40 |          2        0.04       99.49
         41 |          1        0.02       99.51
         42 |          4        0.08       99.59
         43 |          2        0.04       99.62
         44 |          2        0.04       99.66
         47 |          1        0.02       99.68
         48 |          2        0.04       99.72
         49 |          1        0.02       99.74
         50 |          1        0.02       99.76
         51 |          1        0.02       99.77
         53 |          2        0.04       99.81
         55 |          1        0.02       99.83
         56 |          1        0.02       99.85
         58 |          2        0.04       99.89
         61 |          1        0.02       99.91
         63 |          1        0.02       99.92
         65 |          1        0.02       99.94
         66 |          1        0.02       99.96
         68 |          1        0.02       99.98
         89 |          1        0.02      100.00
------------+-----------------------------------
      Total |      5,327      100.00

. * get descriptive statistics;
. sum;

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
    drvisits |      5327    5.563732    6.676081          0         89
       age65 |      5327    .3358363    .4723263          0          1
       age70 |      5327    .2802703    .4491734          0          1
       age75 |      5327    .2003004    .4002627          0          1
       age80 |      5327    .1101934      .31316          0          1
-------------+--------------------------------------------------------
     chronic |      5327    .6279332    .4834015          0          1
       excel |      5327    .0749014     .263257          0          1
        good |      5327    .3792003    .4852336          0          1
        fair |      5327    .3305801    .4704662          0          1
     hs_drop |      5327    .5029097    .5000385          0          1
-------------+--------------------------------------------------------
     hs_grad |      5327    .2922846    .4548551          0          1
       black |      5327    .1255866     .331414          0          1
    hispanic |      5327    .0324761    .1772774          0          1
      female |      5327    .5969589    .4905549          0          1
       mcaid |      5327    .1019335    .3025893          0          1
-------------+--------------------------------------------------------
      income |      5327    25381.78    28962.69          0     548224
     incomel |      5299    9.754733    .8911269   2.639057   13.21444

. * run poisson regression;
. poisson drvisits age65 age70 age75 age80 chronic excel good fair female
> black hispanic hs_drop hs_grad mcaid incomel;

Iteration 0:   log likelihood = -22275.374
Iteration 1:   log likelihood = -22275.351
Iteration 2:   log likelihood = -22275.351

Poisson regression                                Number of obs   =       5299
                                                  LR chi2(15)     =    3334.46
                                                  Prob > chi2     =     0.0000
Log likelihood = -22275.351                       Pseudo R2       =     0.0696

------------------------------------------------------------------------------
    drvisits |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       age65 |   .2144282    .026267     8.16   0.000     .1629458    .2659106
       age70 |    .286831   .0263077    10.90   0.000     .2352689    .3383931
       age75 |   .2801504   .0269802    10.38   0.000     .2272702    .3330307
       age80 |     .24314   .0292045     8.33   0.000     .1859001    .3003798
     chronic |   .4997173   .0137789    36.27   0.000     .4727111    .5267235
       excel |  -.7836622   .0305392   -25.66   0.000    -.8435178   -.7238065
        good |  -.4774853   .0159987   -29.85   0.000    -.5088422   -.4461284
        fair |  -.2578352   .0155473   -16.58   0.000    -.2883073   -.2273631
      female |   .0960976   .0123182     7.80   0.000     .0719543    .1202409
       black |  -.2838081   .0202163   -14.04   0.000    -.3234314   -.2441849
    hispanic |  -.2051023   .0368764    -5.56   0.000    -.2773788   -.1328258
     hs_drop |  -.2323802    .016066   -14.46   0.000     -.263869   -.2008914
     hs_grad |  -.1200559    .016517    -7.27   0.000    -.1524287   -.0876831
       mcaid |   .1535708   .0203414     7.55   0.000     .1137025    .1934392
     incomel |   .0211453   .0072946     2.90   0.004     .0068481    .0354425
       _cons |   1.348084   .0804659    16.75   0.000     1.190374    1.505795
------------------------------------------------------------------------------

. * run neg binomial regression;
. nbreg drvisits age65 age70 age75 age80 chronic excel good fair female
> black hispanic hs_drop hs_grad mcaid incomel, dispersion(constant);

Fitting Poisson model:

Iteration 0:   log likelihood = -22275.374
Iteration 1:   log likelihood = -22275.351
Iteration 2:   log likelihood = -22275.351

Fitting constant-only model:

Iteration 0:   log likelihood = -17434.216
Iteration 1:   log likelihood =  -15076.44
Iteration 2:   log likelihood = -14841.425
Iteration 3:   log likelihood = -14840.935
Iteration 4:   log likelihood = -14840.935

Fitting full model:

Iteration 0:   log likelihood = -14840.935
Iteration 1:   log likelihood = -14540.408
Iteration 2:   log likelihood = -14519.799
Iteration 3:   log likelihood = -14519.721
Iteration 4:   log likelihood = -14519.721

Negative binomial (constant dispersion)           Number of obs   =       5299
                                                  LR chi2(15)     =     642.43
                                                  Prob > chi2     =     0.0000
Log likelihood = -14519.721                       Pseudo R2       =     0.0216

------------------------------------------------------------------------------
    drvisits |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       age65 |   .1034281    .054664     1.89   0.058    -.0037113    .2105675
       age70 |   .2039634   .0546788     3.73   0.000     .0967949    .3111319
       age75 |   .2094928   .0560412     3.74   0.000     .0996541    .3193314
       age80 |   .2227169   .0605925     3.68   0.000     .1039579     .341476
     chronic |   .5091666   .0292189    17.43   0.000     .4518986    .5664347
       excel |  -.5272908   .0594584    -8.87   0.000    -.6438271   -.4107545
        good |  -.3422506   .0353507    -9.68   0.000    -.4115368   -.2729645
        fair |  -.1526385   .0351632    -4.34   0.000    -.2215571   -.0837198
      female |   .1321966   .0263028     5.03   0.000     .0806441     .183749
       black |  -.3300031   .0438969    -7.52   0.000    -.4160395   -.2439668
    hispanic |  -.1527763   .0763018    -2.00   0.045    -.3023251   -.0032275
     hs_drop |  -.1912903   .0344335    -5.56   0.000    -.2587787   -.1238018
     hs_grad |  -.0869843   .0354543    -2.45   0.014    -.1564733   -.0174952
       mcaid |   .1341325   .0442797     3.03   0.002     .0473459    .2209191
     incomel |   .0379834   .0155687     2.44   0.015     .0074693    .0684975
       _cons |    1.11029     .17092     6.50   0.000     .7752924    1.445287
-------------+----------------------------------------------------------------
    /lndelta |    1.65017   .0286445                      1.594027    1.706312
-------------+----------------------------------------------------------------
       delta |   5.207863   .1491766                      4.923538    5.508607
------------------------------------------------------------------------------
Likelihood-ratio test of delta=0:  chibar2(01) =  1.6e+04 Prob>=chibar2 = 0.000

. log close;
log: C:\bill\stata\drvisits.log
log type: text
closed on: 28 Oct 2004, 13:44:20
------------------------------------------------------------------------------
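
Three numbers in the log above explain why the negative binomial is preferred here. The Poisson model forces the variance of drvisits to equal its mean, but the summary statistics show a variance far above the mean of 5.563732. The likelihood-ratio statistic at the bottom of the nbreg output is twice the gap between the two log likelihoods, and delta is the exponential of the /lndelta parameter that nbreg actually estimates. The lines below are a sketch of those checks, not part of the original program. They use values copied from the log and assume the default carriage-return delimiter.

```stata
* sample variance of drvisits = sd^2; compare with the mean, 5.563732
display 6.676081^2

* LR statistic for delta=0: 2*(lnL nbreg - lnL poisson);
* compare with the chibar2(01) of about 1.6e+04 in the log
display 2 * (-14519.721 - (-22275.351))

* delta = exp(/lndelta); compare with the 5.207863 in the log
display exp(1.65017)
```

Note also that both models use 5299 observations rather than 5327: the 28 people with income of 0 have a missing incomel and drop out of the estimation sample.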
Program for Duration Data
Surv_data.do
* this data set has married males, aged 50-70;
* from the nhis multiple cause of death file;
* data is taken from the 1987-1990 nhis;
* surveys. all people are followed for;
* up to 60 months. max_mths is the number of months;
* a person is followed and diedin5;
* indicates whether the person died;
* in five years (60 months);
* set end of line marker;
# delimit;
set more off;
* increase memory;
set memory 20m;
* write results to file;
log using c:\bill\jpsm\surv_data.log,replace;
* load up sas data set;
use c:\bill\jpsm\surv_data;
* get contents of data file;
desc;
* get summary statistics;
sum;
* define the duration data in the analysis;
stset max_mths, failure(diedin5);
* list the kaplan meier survivor function;
sts list;
* you can graph the functions as well;
* output the graphs to a file;
sts graph;
graph save c:\bill\jpsm\graph1.gph, replace;
* you can draw graphs for various subgroups;
* output the graphs to a file;
sts graph, by(educ);
graph save c:\bill\jpsm\graph2.gph, replace;
* run a duration model where the hazard varies across;
* people. first, ask stata to print out the raw;
* coefficients (nohr option), then use the default (hazard ratios);
* show weibull first, then exponential;
* first, construct dummies for the income and;
* education categories. in the regression statement;
* _Ie star includes all variables beginning with _Ie;
* and _Ii star includes all variables starting with;
* _Ii;
xi i.income i.educ;
streg age_s_yrs black hispanic _Ie* _Ii*, d(weibull) nohr;
* now get the hazard ratios, where each coef b is reported as;
* exp(b);
streg age_s_yrs black hispanic _Ie* _Ii*, d(weibull);
* for comparison purposes, look at results from an exponential;
streg age_s_yrs black hispanic _Ie* _Ii*, d(exp) nohr;
streg age_s_yrs black hispanic _Ie* _Ii*, d(exp);
log close;
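
The nohr option only changes how streg reports a fit: with nohr the table shows raw coefficients b, and without it the same estimates appear as hazard ratios exp(b). The exponential model is the Weibull with shape parameter p fixed at 1, so the two log likelihoods can also be compared with a likelihood-ratio statistic. The lines below are a sketch of those checks, not part of the original program. They use values copied from the results log for this program (the coefficient on black, /ln_p, and the two log likelihoods) and assume the default carriage-return delimiter.

```stata
* hazard ratio for black = exp(coefficient from the nohr run);
* compare with the 1.611258 in the hazard-ratio table
display exp(.4770152)

* Weibull shape parameter p = exp(/ln_p); compare with 1.171789
display exp(.1585315)

* LR statistic, Weibull vs. exponential (tests p = 1, one restriction)
display 2 * (-12425.055 - (-12465.218))
```

A value of p above 1 means the estimated hazard rises with time in the Weibull model, which is consistent with the exponential (constant hazard) fitting slightly worse.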
STATA Results for Duration Data
surv_data.log
------------------------------------------------------------------------------
       log:  c:\bill\jpsm\surv_data.log
  log type:  text
 opened on:   7 Nov 2004, 06:26:56

. * load up sas data set;
. use c:\bill\jpsm\surv_data;
. * get contents of data file;
. desc;

Contains data from c:\bill\jpsm\surv_data.dta
  obs:        26,654
 vars:             7                          2 Nov 2004 10:59
 size:       533,080 (97.5% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
age_s_yrs       byte   %9.0g                  age in years at the time of
                                                survey
max_mths        byte   %9.0g                  max months of followup
black           byte   %9.0g                  dummy variable, =1 if black
hispanic        byte   %9.0g                  dummy variable, =1 hispanic
income          float  %9.0g                  =1 if <10K, 2 if 10-20, 3 if
                                                20-30, 4 if 30-40, 5 if 40+
educ            float  %9.0g                  =1 if <9, =2 if 9-11, =3 if
                                                12-15, =4 if 16+
diedin5         float  %9.0g                  died with 5 year followup
-------------------------------------------------------------------------------
Sorted by:

. * get summary statistics;
. sum;

    Variable |       Obs        Mean   Std. Dev.        Min        Max
-------------+--------------------------------------------------------
   age_s_yrs |     26654    59.42586    5.962435         50         70
    max_mths |     26654    56.49077    11.15384          0         60
       black |     26654    .0928566    .2902368          0          1
    hispanic |     26654    .0454716      .20834          0          1
      income |     26654    3.592181    1.327325          1          5
-------------+--------------------------------------------------------
        educ |     26654    2.766677     .961846          1          4
     diedin5 |     26654    .1226082    .3279931          0          1

. * define the duration data in the analysis;
. stset max_mths, failure(diedin5);

     failure event:  diedin5 != 0 & diedin5 < .
obs. time interval:  (0, max_mths]
 exit on or before:  failure

------------------------------------------------------------------------------
    26654  total obs.
       23  obs. end on or before enter()
------------------------------------------------------------------------------
    26631  obs. remaining, representing
     3245  failures in single record/single failure data
  1505705  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        60

. * list the kaplan meier survivor function;
. sts list;

         failure _d:  diedin5
   analysis time _t:  max_mths

           Beg.             Net    Survivor      Std.
  Time    Total   Fail     Lost    Function     Error     [95% Conf. Int.]
------------------------------------------------------------------------------
     1    26631     38        0      0.9986    0.0002     0.9980    0.9990
     2    26593     42        0      0.9970    0.0003     0.9963    0.9976
     3    26551     40        0      0.9955    0.0004     0.9946    0.9962
     4    26511     49        0      0.9937    0.0005     0.9926    0.9945
     5    26462     50        0      0.9918    0.0006     0.9906    0.9928
     6    26412     61        0      0.9895    0.0006     0.9882    0.9906
     7    26351     45        0      0.9878    0.0007     0.9864    0.9890
     8    26306     60        0      0.9855    0.0007     0.9840    0.9869
     9    26246     46        0      0.9838    0.0008     0.9822    0.9853
    10    26200     42        0      0.9822    0.0008     0.9806    0.9838
    11    26158     52        0      0.9803    0.0009     0.9785    0.9819
    12    26106     56        0      0.9782    0.0009     0.9764    0.9799
    13    26050     53        0      0.9762    0.0009     0.9743    0.9780
    14    25997     64        0      0.9738    0.0010     0.9718    0.9756
    15    25933     48        0      0.9720    0.0010     0.9699    0.9739
    16    25885     49        0      0.9701    0.0010     0.9680    0.9721
    17    25836     54        0      0.9681    0.0011     0.9659    0.9702
    18    25782     46        0      0.9664    0.0011     0.9642    0.9685
    19    25736     51        0      0.9645    0.0011     0.9622    0.9666
    20    25685     38        0      0.9631    0.0012     0.9607    0.9652
    21    25647     56        0      0.9609    0.0012     0.9586    0.9632
    22    25591     51        0      0.9590    0.0012     0.9566    0.9613
    23    25540     48        0      0.9572    0.0012     0.9547    0.9596
    24    25492     51        0      0.9553    0.0013     0.9528    0.9577
    25    25441     59        0      0.9531    0.0013     0.9505    0.9556
    26    25382     58        0      0.9509    0.0013     0.9483    0.9535
    27    25324     63        0      0.9486    0.0014     0.9458    0.9511
    28    25261     50        0      0.9467    0.0014     0.9439    0.9493
    29    25211     50        0      0.9448    0.0014     0.9420    0.9475
    30    25161     52        0      0.9428    0.0014     0.9400    0.9456
    31    25109     60        0      0.9406    0.0014     0.9377    0.9434
    32    25049     52        0      0.9386    0.0015     0.9357    0.9415
    33    24997     54        0      0.9366    0.0015     0.9336    0.9395
    34    24943     56        0      0.9345    0.0015     0.9315    0.9374
    35    24887     66        0      0.9320    0.0015     0.9289    0.9350
    36    24821     70        0      0.9294    0.0016     0.9263    0.9324
    37    24751     45        0      0.9277    0.0016     0.9245    0.9308
    38    24706     59        0      0.9255    0.0016     0.9223    0.9286
    39    24647     54        0      0.9235    0.0016     0.9202    0.9266
    40    24593     48        0      0.9217    0.0016     0.9184    0.9248
    41    24545     61        0      0.9194    0.0017     0.9160    0.9226
    42    24484     63        0      0.9170    0.0017     0.9136    0.9203
    43    24421     56        0      0.9149    0.0017     0.9115    0.9182
    44    24365     52        0      0.9130    0.0017     0.9095    0.9163
    45    24313     60        0      0.9107    0.0017     0.9072    0.9141
    46    24253     56        0      0.9086    0.0018     0.9051    0.9120
    47    24197     68        0      0.9060    0.0018     0.9025    0.9095
    48    24129     59        0      0.9038    0.0018     0.9002    0.9073
    49    24070     57        0      0.9017    0.0018     0.8981    0.9052
    50    24013     57        0      0.8996    0.0018     0.8959    0.9031
    51    23956     66        0      0.8971    0.0019     0.8934    0.9007
    52    23890     57        0      0.8949    0.0019     0.8912    0.8986
    53    23833     50        0      0.8931    0.0019     0.8893    0.8967
    54    23783     53        0      0.8911    0.0019     0.8873    0.8947
    55    23730     64        0      0.8887    0.0019     0.8848    0.8924
    56    23666     55        0      0.8866    0.0019     0.8827    0.8903
    57    23611     65        0      0.8842    0.0020     0.8803    0.8879
    58    23546     66        0      0.8817    0.0020     0.8777    0.8855
    59    23480     44        0      0.8800    0.0020     0.8761    0.8839
    60    23436     50  2.3e+04      0.8781    0.0020     0.8742    0.8820
------------------------------------------------------------------------------

. * you can graph the functions as well;
. * output the graphs to a file;
. sts graph;

         failure _d:  diedin5
   analysis time _t:  max_mths

. graph save c:\bill\jpsm\graph1.gph, replace;
(file c:\bill\jpsm\graph1.gph saved)
. * you can draw graphs for various subgroups;
. * output the graphs to a file;
. sts graph, by(educ);

         failure _d:  diedin5
   analysis time _t:  max_mths

. graph save c:\bill\jpsm\graph2.gph, replace;
(file c:\bill\jpsm\graph2.gph saved)
. * run a duration model where the hazard varies across;
. * people. first, ask stata to print out the raw;
. * coefficients (nohr option), then do default;
. * show weibull first, then exponential;
. * first, construct dummies for the income and;
. * education categories. in the regression statement;
. * _Ie star includes all variables beginning with _Ie;
. * and _Ii star includes all variables starting with;
. * _Ii;
. xi i.income i.educ;
i.income          _Iincome_1-5        (naturally coded; _Iincome_1 omitted)
i.educ            _Ieduc_1-4          (naturally coded; _Ieduc_1 omitted)

. streg age_s_yrs black hispanic _Ie* _Ii*, d(weibull) nohr;

         failure _d:  diedin5
   analysis time _t:  max_mths

Fitting constant-only model:

Iteration 0:   log likelihood = -12759.823
Iteration 1:   log likelihood = -12723.121
Iteration 2:   log likelihood = -12722.924
Iteration 3:   log likelihood = -12722.924

Fitting full model:

Iteration 0:   log likelihood = -12722.924
Iteration 1:   log likelihood = -12454.553
Iteration 2:   log likelihood = -12425.111
Iteration 3:   log likelihood = -12425.055
Iteration 4:   log likelihood = -12425.055

Weibull regression -- log relative-hazard form

No. of subjects =        26631                     Number of obs   =     26631
No. of failures =         3245
Time at risk    =      1505705
                                                   LR chi2(10)     =    595.74
Log likelihood  =   -12425.055                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   age_s_yrs |   .0452588   .0031592    14.33   0.000     .0390669    .0514508
       black |   .4770152   .0511122     9.33   0.000     .3768371    .5771932
    hispanic |   .1333552    .082156     1.62   0.105    -.0276676     .294378
    _Ieduc_2 |   .0093353   .0591918     0.16   0.875    -.1066786    .1253492
    _Ieduc_3 |   -.072163   .0503131    -1.43   0.151    -.1707748    .0264488
    _Ieduc_4 |  -.1301173   .0657131    -1.98   0.048    -.2589126   -.0013221
  _Iincome_2 |  -.1867752   .0650604    -2.87   0.004    -.3142914   -.0592591
  _Iincome_3 |  -.3268927   .0688635    -4.75   0.000    -.4618627   -.1919227
  _Iincome_4 |  -.5166137   .0769202    -6.72   0.000    -.6673747   -.3658528
  _Iincome_5 |  -.5425447   .0722025    -7.51   0.000     -.684059   -.4010303
       _cons |  -9.201724   .2266475   -40.60   0.000    -9.645945   -8.757503
-------------+----------------------------------------------------------------
       /ln_p |   .1585315   .0172241     9.20   0.000     .1247729    .1922901
-------------+----------------------------------------------------------------
           p |   1.171789    .020183                      1.132891    1.212022
         1/p |   .8533961    .014699                      .8250675    .8826974
------------------------------------------------------------------------------

. * now get the hazard ratios, where each coef b is reported as;
. * exp(b);
. streg age_s_yrs black hispanic _Ie* _Ii*, d(weibull);

         failure _d:  diedin5
   analysis time _t:  max_mths

Fitting constant-only model:

Iteration 0:   log likelihood = -12759.823
Iteration 1:   log likelihood = -12723.121
Iteration 2:   log likelihood = -12722.924
Iteration 3:   log likelihood = -12722.924

Fitting full model:

Iteration 0:   log likelihood = -12722.924
Iteration 1:   log likelihood = -12454.553
Iteration 2:   log likelihood = -12425.111
Iteration 3:   log likelihood = -12425.055
Iteration 4:   log likelihood = -12425.055

Weibull regression -- log relative-hazard form

No. of subjects =        26631                     Number of obs   =     26631
No. of failures =         3245
Time at risk    =      1505705
                                                   LR chi2(10)     =    595.74
Log likelihood  =   -12425.055                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   age_s_yrs |   1.046299   .0033055    14.33   0.000      1.03984    1.052797
       black |   1.611258    .082355     9.33   0.000     1.457667    1.781032
    hispanic |   1.142656    .093876     1.62   0.105     .9727116    1.342291
    _Ieduc_2 |   1.009379    .059747     0.16   0.875     .8988145    1.133544
    _Ieduc_3 |   .9303792   .0468103    -1.43   0.151     .8430114    1.026802
    _Ieduc_4 |   .8779924   .0576956    -1.98   0.048     .7718905    .9986788
  _Iincome_2 |   .8296302   .0539761    -2.87   0.004     .7303062    .9424625
  _Iincome_3 |   .7211611   .0496617    -4.75   0.000     .6301089    .8253706
  _Iincome_4 |   .5965372   .0458858    -6.72   0.000     .5130537    .6936049
  _Iincome_5 |   .5812672    .041969    -7.51   0.000     .5045648    .6696297
-------------+----------------------------------------------------------------
       /ln_p |   .1585315   .0172241     9.20   0.000     .1247729    .1922901
-------------+----------------------------------------------------------------
           p |   1.171789    .020183                      1.132891    1.212022
         1/p |   .8533961    .014699                      .8250675    .8826974
------------------------------------------------------------------------------

. * for comparison purposes, look at results from an exponential;
. streg age_s_yrs black hispanic _Ie* _Ii*, d(exp) nohr;
         failure _d:  diedin5
   analysis time _t:  max_mths

Iteration 0:   log likelihood = -12759.823
Iteration 1:   log likelihood = -12493.913
Iteration 2:   log likelihood = -12465.272
Iteration 3:   log likelihood = -12465.218
Iteration 4:   log likelihood = -12465.218
Exponential regression -- log relative-hazard form

No. of subjects =        26631                     Number of obs   =     26631
No. of failures =         3245
Time at risk    =      1505705
                                                   LR chi2(10)     =    589.21
Log likelihood  =   -12465.218                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   age_s_yrs |   .0450058   .0031587    14.25   0.000     .0388149    .0511968
       black |   .4739259   .0511077     9.27   0.000     .3737567     .574095
    hispanic |   .1325028   .0821549     1.61   0.107    -.0285178    .2935235
    _Ieduc_2 |   .0094567   .0591916     0.16   0.873    -.1065568    .1254701
    _Ieduc_3 |   -.071804   .0503096    -1.43   0.154     -.170409    .0268011
    _Ieduc_4 |  -.1293206   .0657092    -1.97   0.049    -.2581081    -.000533
  _Iincome_2 |  -.1855024   .0650573    -2.85   0.004    -.3130123   -.0579925
  _Iincome_3 |  -.3244382   .0688567    -4.71   0.000    -.4593948   -.1894816
  _Iincome_4 |  -.5134143   .0769126    -6.68   0.000    -.6641602   -.3626684
  _Iincome_5 |  -.5391811    .072196    -7.47   0.000    -.6806827   -.3976795
       _cons |  -8.491069   .2107085   -40.30   0.000     -8.90405   -8.078088
------------------------------------------------------------------------------

. streg age_s_yrs black hispanic _Ie* _Ii*, d(exp);
         failure _d:  diedin5
   analysis time _t:  max_mths

Iteration 0:   log likelihood = -12759.823
Iteration 1:   log likelihood = -12493.913
Iteration 2:   log likelihood = -12465.272
Iteration 3:   log likelihood = -12465.218
Iteration 4:   log likelihood = -12465.218
Exponential regression -- log relative-hazard form

No. of subjects =        26631                     Number of obs   =     26631
No. of failures =         3245
Time at risk    =      1505705
                                                   LR chi2(10)     =    589.21
Log likelihood  =   -12465.218                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   age_s_yrs |   1.046034   .0033041    14.25   0.000     1.039578     1.05253
       black |   1.606288   .0820936     9.27   0.000     1.453184    1.775523
    hispanic |   1.141682   .0937948     1.61   0.107      .971885    1.341145
    _Ieduc_2 |   1.009502    .059754     0.16   0.873      .898924    1.133681
    _Ieduc_3 |   .9307133   .0468238    -1.43   0.154     .8433198    1.027163
    _Ieduc_4 |   .8786922   .0577381    -1.97   0.049     .7725117    .9994672
  _Iincome_2 |   .8306869   .0540422    -2.85   0.004      .731241     .943657
  _Iincome_3 |   .7229334   .0497788    -4.71   0.000     .6316658     .827388
  _Iincome_4 |   .5984488   .0460282    -6.68   0.000     .5147056    .6958171
  _Iincome_5 |   .5832257   .0421066    -7.47   0.000     .5062713    .6718773
------------------------------------------------------------------------------

. log close;
       log:  c:\bill\jpsm\surv_data.log
  log type:  text
 closed on:   7 Nov 2004, 06:27:08
------------------------------------------------------------------------------
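
The results above can be checked by hand. The do-file fragment below is a sketch, not part of the original program, and uses only numbers printed in the log. Part 1 exponentiates the coefficient, standard error and confidence limits for black from the d(exp) nohr run to recover the hazard-ratio row. Part 2 compares the Weibull and exponential fits with a likelihood-ratio test, on the assumption that the exponential is the Weibull with p = 1 (one restriction). The scalar names ll_weib, ll_exp and lr are illustrative.

```stata
* sketch only: arithmetic checks using values copied from the log above;
# delimit ;
* 1. hazard ratio for black = exp(coefficient) from the d(exp) nohr run;
display exp(.4739259);
* delta-method standard error of the hazard ratio: exp(b) times se(b);
display exp(.4739259)*.0511077;
* confidence limits are the exponentiated coefficient limits;
display exp(.3737567);
display exp(.574095);
* 2. likelihood-ratio test of exponential (p=1) against Weibull;
scalar ll_weib = -12425.055;
scalar ll_exp = -12465.218;
scalar lr = 2*(ll_weib - ll_exp);
display "LR chi2(1) = " lr;
display "p-value = " chi2tail(1, lr);
```

The first four values should match the black row of the exponential hazard-ratio table up to rounding. The LR statistic works out to 2*(12465.218 - 12425.055) = 80.326 on one degree of freedom, which points the same way as the z statistic of 9.20 on /ln_p in the Weibull output.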