LPM and logit example

PROC IMPORT OUT= WORK.woodwc
DATAFILE= "D:\data\logitex.xls"
DBMS=EXCEL REPLACE;
RANGE="data";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
DATA logitex;
SET WORK.woodwc;
PROC SORT;
BY GPA;
RUN;
PROC REG;
TITLE 'OLS Estimation of Personalized Instruction Model';
MODEL GRADE = GPA TUCE PSI;
RUN;
/* What follows is Weighted Least Squares; do not use this code unless you are doing WLS */
OUTPUT OUT=RESFILE PREDICTED=YHAT;
RUN;
DATA TWO;
/* RESFILE already holds every variable from the PROC REG input data plus YHAT, so no merge is needed */
SET RESFILE;
YHATC = (YHAT*(0<YHAT<1)) + (0.999*(YHAT>=1)) + (0.001*(YHAT<=0));
W = SQRT(YHATC*(1-YHATC));
RECIPW = 1/W;
GRADEW = GRADE/W;
GPAW = GPA/W;
TUCEW = TUCE/W;
PSIW = PSI/W;
PROC PRINT;
RUN;
PROC REG;
TITLE 'Weighted Least Squares Estimation of Linear Probability Model';
MODEL GRADEW = RECIPW GPAW TUCEW PSIW /NOINT;
RUN;
COMMENT 'End Weighted Least Squares code';
PROC LOGISTIC DESCENDING;
TITLE 'Logit Estimation of Personalized Instruction Model';
MODEL GRADE = GPA TUCE PSI /CTABLE PPROB=0.5;
RUN;
COMMENT 'PROC QLIM promising new procedure for getting marginal effects';
PROC QLIM;
model GRADE = GPA TUCE PSI/ DISCRETE(D=logit);
output out=meffects marginal;
run;
proc means data=meffects;
var meff:;
run;
OLS Estimation of Personalized Instruction Model

The REG Procedure
Model: MODEL1
Dependent Variable: GRADE GRADE

Number of Observations Read    32
Number of Observations Used    32

                       Analysis of Variance

                           Sum of        Mean
Source             DF     Squares      Square   F Value   Pr > F
Model               3     3.00228     1.00076      6.65   0.0016
Error              28     4.21647     0.15059
Corrected Total    31     7.21875

Root MSE            0.38806    R-Square    0.4159
Dependent Mean      0.34375    Adj R-Sq    0.3533
Coeff Var         112.88935

                       Parameter Estimates

                              Parameter    Standard
Variable     Label       DF    Estimate       Error   t Value   Pr > |t|
Intercept    Intercept    1    -1.49802     0.52389     -2.86     0.0079
GPA          GPA          1     0.46385     0.16196      2.86     0.0078
TUCE         TUCE         1     0.01050     0.01948      0.54     0.5944
PSI          PSI          1     0.37855     0.13917      2.72     0.0111
OLS Estimation of Personalized Instruction Model

Obs    GPA    TUCE   PSI   GRADE       YHAT     YHATC         W
  1    2.06    22     1      0      0.06696   0.06696   0.24996
  2    2.39    19     1      1      0.18855   0.18855   0.39115
  3    2.63    20     0      0     -0.06818   0.00100   0.03161
  4    2.66    20     0      0     -0.05427   0.00100   0.03161
  5    2.67    24     1      0      0.37090   0.37090   0.48305
  6    2.74    19     0      0     -0.02766   0.00100   0.03161
  7    2.75    25     0      0      0.03995   0.03995   0.19585
  8    2.76    17     0      0     -0.03937   0.00100   0.03161
  9    2.83    19     0      0      0.01409   0.01409   0.11786
 10    2.83    27     1      1      0.47661   0.47661   0.49945
 11    2.86    17     0      0      0.00702   0.00702   0.08347
 12    2.87    21     0      0      0.05363   0.05363   0.22530
 13    2.89    22     0      0      0.07341   0.07341   0.26080
 14    2.89    14     1      0      0.36800   0.36800   0.48226
 15    2.92    12     0      0     -0.01763   0.00100   0.03161
 16    3.03    25     0      0      0.16983   0.16983   0.37548
 17    3.10    21     1      0      0.53888   0.53888   0.49849
 18    3.12    23     1      0      0.56914   0.56914   0.49520
 19    3.16    25     1      1      0.60869   0.60869   0.48804
 20    3.26    25     0      1      0.27652   0.27652   0.44728
 21    3.28    24     0      0      0.27530   0.27530   0.44666
 22    3.32    23     0      0      0.28336   0.28336   0.45063
 23    3.39    17     1      1      0.63141   0.63141   0.48242
 24    3.51    26     1      0      0.78153   0.78153   0.41321
 25    3.53    26     0      0      0.41225   0.41225   0.49224
 26    3.54    24     1      1      0.77446   0.77446   0.41794
 27    3.57    23     0      0      0.39932   0.39932   0.48976
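As a check on how YHAT, YHATC and W are built, the arithmetic for observation 3 (GPA = 2.63, TUCE = 20, PSI = 0) can be worked by hand from the rounded OLS coefficients reported above. This is only an illustrative sketch; the small gap between the hand-computed fitted value and the listed -0.06818 comes from using coefficients rounded to five decimals.

```latex
\begin{align*}
\widehat{y}_3 &= -1.49802 + 0.46385(2.63) + 0.01050(20) + 0.37855(0) \approx -0.068 \\
\widehat{y}^{\,c}_3 &= 0.001 \quad \text{(the fitted value is not inside } (0,1)\text{, so it is reset)} \\
W_3 &= \sqrt{\widehat{y}^{\,c}_3\,\bigl(1-\widehat{y}^{\,c}_3\bigr)} = \sqrt{0.001 \times 0.999} \approx 0.03161
\end{align*}
```

The negative fitted value is the reason the DATA step truncates YHAT before forming the weights: the estimated variance p(1-p) is only positive for a probability strictly between 0 and 1.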
Weighted Least Squares Estimation of Linear Probability Model

The REG Procedure
Model: MODEL1
Dependent Variable: GRADEW

Number of Observations Read    32
Number of Observations Used    32

NOTE: No intercept in model. R-Square is redefined.

                       Analysis of Variance

                             Sum of        Mean
Source               DF     Squares      Square   F Value   Pr > F
Model                 4    74.64082    18.66020     22.98   <.0001
Error                28    22.73882     0.81210
Uncorrected Total    32    97.37964

Root MSE            0.90117    R-Square    0.7665
Dependent Mean      0.91932    Adj R-Sq    0.7331
Coeff Var          98.02486

                       Parameter Estimates

                  Parameter    Standard
Variable    DF     Estimate       Error   t Value   Pr > |t|
RECIPW       1     -1.30873     0.28849     -4.54     <.0001
GPAW         1      0.39817     0.08783      4.53     <.0001
TUCEW        1      0.01216     0.00454      2.68     0.0123
PSIW         1      0.38782     0.10518      3.69     0.0010
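The transformed-variable regression above divides every variable by W and drops the intercept. A common alternative, sketched here rather than taken from the original program, is to leave the variables untransformed and pass the inverse of the estimated error variance to PROC REG through its WEIGHT statement. The data set name THREE and the variable WT are hypothetical names introduced for this illustration; the sketch assumes data set TWO from the WLS step still exists and contains YHATC.

```sas
/* Sketch only: THREE and WT are hypothetical names, not part of the original program. */
DATA THREE;
  SET TWO;                       /* TWO is assumed to hold GRADE, GPA, TUCE, PSI and YHATC */
  WT = 1/(YHATC*(1-YHATC));      /* inverse of the estimated error variance p(1-p) */
RUN;

PROC REG DATA=THREE;
  TITLE 'WLS of Linear Probability Model via WEIGHT statement';
  WEIGHT WT;                     /* minimizes the WT-weighted sum of squared residuals */
  MODEL GRADE = GPA TUCE PSI;    /* intercept kept; it plays the role of RECIPW */
RUN;
QUIT;
```

Because both approaches minimize the same weighted sum of squared residuals, the coefficient estimates and standard errors should agree with the RECIPW, GPAW, TUCEW and PSIW rows above. The reported R-Square will generally differ, since the no-intercept run redefines it.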
Logit Estimation of Personalized Instruction Model

The LOGISTIC Procedure

                Model Information
Data Set                      WORK.TWO
Response Variable             GRADE        GRADE
Number of Response Levels     2
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read    32
Number of Observations Used    32

          Response Profile
Ordered                  Total
  Value    GRADE     Frequency
      1    1                11
      2    0                21

Probability modeled is GRADE=1.

          Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

          Model Fit Statistics
                             Intercept
              Intercept            and
Criterion          Only     Covariates
AIC              43.183         33.779
SC               44.649         39.642
-2 Log L         41.183         25.779

     Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       15.4042     3        0.0015
Score                  13.3088     3        0.0040
Wald                    8.3762     3        0.0388
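The fit statistics and the likelihood ratio test above are tied together by simple arithmetic, which is worth verifying once. The sketch below uses the standard definitions AIC = -2 Log L + 2k and SC = -2 Log L + k ln N, with k = 4 estimated parameters (intercept plus three slopes) and N = 32 observations.

```latex
\begin{align*}
\text{LR} &= (-2\log L_0) - (-2\log L) = 41.183 - 25.779 = 15.404 \\
\text{AIC} &= 25.779 + 2(4) = 33.779 \\
\text{SC} &= 25.779 + 4\ln(32) = 25.779 + 13.863 = 39.642
\end{align*}
```

The same formulas with k = 1 reproduce the intercept-only column: 41.183 + 2 = 43.183 and 41.183 + ln(32) = 44.649.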
       Analysis of Maximum Likelihood Estimates
                              Standard          Wald
Parameter    DF    Estimate      Error    Chi-Square    Pr > ChiSq
Intercept     1    -13.0204     4.9310        6.9723        0.0083
GPA           1      2.8259     1.2629        5.0072        0.0252
TUCE          1      0.0951     0.1415        0.4518        0.5015
PSI           1      2.3785     1.0645        4.9925        0.0255

          Odds Ratio Estimates
             Point         95% Wald
Effect    Estimate     Confidence Limits
GPA         16.877      1.420    200.567
TUCE         1.100      0.833      1.451
PSI         10.789      1.339     86.917
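The odds ratio estimates are the exponentiated logit coefficients, and the Wald limits are the exponentiated endpoints of the usual coefficient interval. A short worked check for GPA, using the rounded estimate and standard error printed above and the normal critical value 1.96:

```latex
\begin{align*}
\text{OR}_{\text{GPA}} &= e^{2.8259} \approx 16.88 \\
\text{Lower limit} &= e^{\,2.8259 - 1.96(1.2629)} = e^{0.3506} \approx 1.42 \\
\text{Upper limit} &= e^{\,2.8259 + 1.96(1.2629)} = e^{5.3012} \approx 200.6
\end{align*}
```

The very wide interval reflects the small sample (32 observations) rather than any problem with the calculation.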
Association of Predicted Probabilities and Observed Responses
Percent Concordant     88.3    Somers' D    0.771
Percent Discordant     11.3    Gamma        0.774
Percent Tied            0.4    Tau-a        0.359
Pairs                   231    c            0.885

                          Classification Table
              Correct           Incorrect                  Percentages
 Prob              Non-              Non-              Sensi-   Speci-   False   False
Level    Event    Event    Event    Event    Correct   tivity   ficity     POS     NEG
0.500        6       18        3        5       75.0     54.5     85.7    33.3    21.7
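The percentages in the classification table follow directly from the four counts. With 11 events (GRADE = 1) and 21 nonevents, the sensitivity and specificity figures imply 6 events and 18 nonevents classified correctly, 3 nonevents wrongly predicted as events, and 5 events wrongly predicted as nonevents. A worked check under that reading:

```latex
\begin{align*}
\text{Correct} &= \frac{6 + 18}{32} = 75.0\% \\
\text{Sensitivity} &= \frac{6}{11} = 54.5\% \qquad
\text{Specificity} = \frac{18}{21} = 85.7\% \\
\text{False POS} &= \frac{3}{6 + 3} = 33.3\% \qquad
\text{False NEG} = \frac{5}{18 + 5} = 21.7\%
\end{align*}
```

Note that the false positive and false negative rates are computed relative to the number of predicted events and predicted nonevents, not relative to the actual counts.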
Logit Estimation of Personalized Instruction Model

The QLIM Procedure

   Discrete Response Profile of GRADE
Index    Value    Frequency    Percent
    1        0           21      65.63
    2        1           11      34.38

              Model Fit Summary
Number of Endogenous Variables              1
Endogenous Variable                     GRADE
Number of Observations                     32
Log Likelihood                      -12.88963
Maximum Absolute Gradient          3.82282E-6
Number of Iterations                       17
Optimization Method              Quasi-Newton
AIC                                  33.77927
Schwarz Criterion                    39.64221

                     Goodness-of-Fit Measures
Measure                    Value    Formula
Likelihood Ratio (R)      15.404    2 * (LogL - LogL0)
Upper Bound of R (U)      41.183    - 2 * LogL0
Aldrich-Nelson             0.325    R / (R+N)
Cragg-Uhler 1             0.3821    1 - exp(-R/N)
Cragg-Uhler 2             0.5278    (1-exp(-R/N)) / (1-exp(-U/N))
Estrella                  0.4528    1 - (1-R/U)^(U/N)
Adjusted Estrella         0.2251    1 - ((LogL-K)/LogL0)^(-2/N*LogL0)
McFadden's LRI             0.374    R / U
Veall-Zimmermann          0.5774    (R * (U+N)) / (U * (R+N))
McKelvey-Zavoina          0.7915
N = # of observations, K = # of regressors

Algorithm converged.
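Several of the goodness-of-fit measures can be reproduced by hand from the formulas printed in the table, which helps show what each one measures. The sketch below uses R = 15.404, U = 41.183 and N = 32 as reported by PROC QLIM.

```latex
\begin{align*}
\text{McFadden's LRI} &= \frac{R}{U} = \frac{15.404}{41.183} \approx 0.374 \\
\text{Aldrich-Nelson} &= \frac{R}{R+N} = \frac{15.404}{47.404} \approx 0.325 \\
\text{Cragg-Uhler 1} &= 1 - e^{-R/N} = 1 - e^{-0.4814} \approx 0.3821 \\
\text{Cragg-Uhler 2} &= \frac{1 - e^{-R/N}}{1 - e^{-U/N}} = \frac{0.3821}{1 - e^{-1.2870}} \approx 0.5278 \\
\text{Veall-Zimmermann} &= \frac{R\,(U+N)}{U\,(R+N)} = \frac{15.404 \times 73.183}{41.183 \times 47.404} \approx 0.5774
\end{align*}
```

The measures differ substantially (from about 0.33 to 0.58 among those computed here), so any pseudo R-squared should be reported together with its name.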
                       Parameter Estimates
                                 Standard                 Approx
Parameter    DF      Estimate       Error    t Value    Pr > |t|
Intercept     1    -13.021347    4.931350      -2.64      0.0083
GPA           1      2.826113    1.262912       2.24      0.0252
TUCE          1      0.095158    0.141555       0.67      0.5014
PSI           1      2.378688    1.064557       2.23      0.0255
Logit Estimation of Personalized Instruction Model

The MEANS Procedure

Variable        Label                                                    N          Mean
----------------------------------------------------------------------------------------
Meff_P1_GPA     Marginal effect of GPA on the probability of GRADE=1    32    -0.3625809
Meff_P2_GPA     Marginal effect of GPA on the probability of GRADE=2    32     0.3625809
Meff_P1_TUCE    Marginal effect of TUCE on the probability of GRADE=1   32    -0.0122084
Meff_P2_TUCE    Marginal effect of TUCE on the probability of GRADE=2   32     0.0122084
Meff_P1_PSI     Marginal effect of PSI on the probability of GRADE=1    32    -0.3051777
Meff_P2_PSI     Marginal effect of PSI on the probability of GRADE=2    32     0.3051777
----------------------------------------------------------------------------------------

Variable          Std Dev       Minimum       Maximum
-----------------------------------------------------
Meff_P1_GPA     0.2354968    -0.7055222    -0.0674638
Meff_P2_GPA     0.2354968     0.0674638     0.7055222
Meff_P1_TUCE    0.0079294    -0.0237555    -0.0022716
Meff_P2_TUCE    0.0079294     0.0022716     0.0237555
Meff_P1_PSI     0.1982133    -0.5938252    -0.0567830
Meff_P2_PSI     0.1982133     0.0567830     0.5938252
-----------------------------------------------------
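The marginal effects summarized above can be related back to the logit coefficients. In the labels, "GRADE=1" and "GRADE=2" refer to the response index in the QLIM Discrete Response Profile, so the Meff_P2 variables describe the probability that GRADE takes the value 1. The sketch below uses the standard logit marginal-effect formula; the notation (P_i, Lambda, beta_k) is introduced here for illustration and does not appear in the SAS output.

```latex
\begin{align*}
P_i &= \Lambda(x_i'\beta) = \frac{1}{1 + e^{-x_i'\beta}}, \qquad
\frac{\partial P_i}{\partial x_{ik}} = \beta_k \, P_i \,(1 - P_i) \\
\text{Mean of Meff\_P2\_GPA} &= \frac{1}{32}\sum_{i=1}^{32} \beta_{\text{GPA}}\, P_i (1 - P_i) = 0.3626 \\
\max_i \frac{\partial P_i}{\partial \text{GPA}_i} &\le \frac{\beta_{\text{GPA}}}{4} = \frac{2.826113}{4} \approx 0.7065
\quad \text{(reported maximum: } 0.7055\text{)} \\
\frac{\text{Mean of Meff\_P2\_GPA}}{\text{Mean of Meff\_P2\_PSI}} &= \frac{0.3625809}{0.3051777} \approx 1.188
= \frac{2.826113}{2.378688} = \frac{\beta_{\text{GPA}}}{\beta_{\text{PSI}}}
\end{align*}
```

Because every observation's marginal effect is the coefficient scaled by the same factor P(1-P), the averaged effects stay proportional to the coefficients. The means reported by PROC MEANS are averages of the individual marginal effects over the 32 observations, which is not the same quantity as a marginal effect evaluated at the sample means of the regressors.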
In the logit analysis, grade point average and tutoring both had positive effects on the probability of receiving an A.
The higher a student’s overall grade point average, the more likely the student was to receive an A
in intermediate macroeconomics. Averaged over the 32 students in the sample, an increase of one full grade point was
associated with a 36.3 percentage point greater probability of receiving an A. The coefficient on grade point
average was statistically significant (p = 0.0252).
Students who received tutoring were also more likely to receive A grades in intermediate
macroeconomics. Averaged over the sample, receiving tutoring was associated with a 30.5 percentage point greater
probability of receiving an A. The coefficient on the tutoring dummy variable was statistically
significant (p = 0.0255).
The coefficient on the TUCE score was not statistically significant (p = 0.5015).
The overall equation fit well, with an Rp² (the share of observations classified correctly at the 0.5 cutoff)
equal to 0.75 and a highly statistically significant likelihood ratio test (χ² = 15.40, p = 0.0015).