The Generalized Linear Model and PROC GENMOD

Dependent Variable
Discrete
• 2 values – binomial
• 3 or more discrete values – multinomial
• Skewed – e.g. Poisson
Continuous
• Non-normal
Link Function
Connection between the mean of the dependent variable and the linear predictor:
• Logit – ln(p/(1-p))
• Probit – inverse of the standard normal CDF
• Other nonlinear connections (exponential, logarithmic, power, etc.)
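As a minimal sketch of the two links named above, the following Python computes the logit and probit of a probability p. The function names are illustrative only, and the probit relies on the standard library's `statistics.NormalDist` (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def logit(p):
    """Logit link: the log odds ln(p / (1 - p)), for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def probit(p):
    """Probit link: inverse of the standard normal CDF, for 0 < p < 1."""
    return NormalDist().inv_cdf(p)

# Both links map p = 0.5 to 0 and spread probabilities over the whole real line.
logit_half = logit(0.5)
probit_half = probit(0.5)
```

Both functions take a probability strictly between 0 and 1 and return a value on an unbounded scale, which is what lets a linear predictor model it.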
Function
link(E[y]) = a + b1*x1 + b2*x2 + … + bn*xn
The link function connects the expected value of the (discrete) dependent variable to the linear predictor.
• E[y] = inverse-link(a + b1*x1 + …)
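The inverse-link step can be illustrated with a short Python sketch for the logit case. The intercept, coefficients, and predictor values below are made up for illustration and do not come from any fitted model.

```python
import math

def linear_predictor(a, bs, xs):
    """Compute a + b1*x1 + ... + bn*xn."""
    return a + sum(b * x for b, x in zip(bs, xs))

def inverse_logit(eta):
    """Map a linear predictor back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and predictor values, chosen so that eta = 0.
eta = linear_predictor(-1.0, [0.5, 0.25], [2.0, 0.0])
p = inverse_logit(eta)  # predicted probability on the original scale
```

With these illustrative numbers the linear predictor is 0, so the predicted probability is 0.5.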
Link Functions
Distribution              Link
Normal, gamma, Poisson    Linear, log, power
Binomial                  Logit, probit
Multinomial               Log(x1/(1 – x2 - … - xn))
Solution
• Requires an iterative numerical solution (maximum likelihood), rather than the closed-form algebraic solution available for the traditional general linear model
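To show what an iterative numerical solution can look like, here is a minimal Python sketch of Newton-Raphson for a logistic model with one predictor, logit(p) = a + b*x. It is one common approach under simplifying assumptions, not a description of what PROC GENMOD does internally, and the data are invented for illustration.

```python
import math

def fit_logistic(xs, ys, max_iter=25, tol=1e-10):
    """Newton-Raphson maximum likelihood for logit(p) = a + b*x, y in {0, 1}."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0          # score vector (gradient of the log likelihood)
        h00 = h01 = h11 = 0.0  # information matrix (negative Hessian)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 system (information matrix) * step = score.
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Invented data: 1 event in 4 trials at x = 0, 3 events in 4 trials at x = 1.
xs = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
a_hat, b_hat = fit_logistic(xs, ys)
```

Because x takes only two values here, the fitted probabilities reproduce the observed proportions (1/4 and 3/4), which gives a simple check on the iteration.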
Significance
• Wald statistic
• Likelihood Ratio statistic
• Score statistic
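As a rough illustration of the first of these, a Wald chi-square for a single parameter is the squared ratio of the estimate to its standard error, referred to a chi-square distribution with 1 degree of freedom. The Python sketch below uses hypothetical values for the estimate and standard error.

```python
import math

def wald_chi_square(estimate, std_error):
    """Return the 1-df Wald chi-square and its p-value."""
    chi_sq = (estimate / std_error) ** 2
    # For 1 df, P(chi-square > c) equals erfc(sqrt(c / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p_value

# Hypothetical estimate and standard error (ratio 1.96).
chi_sq, p_value = wald_chi_square(0.98, 0.5)
```

The erfc shortcut applies only to the 1-df case; tests on several parameters at once need a general chi-square tail probability.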
Residuals
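• Pearson residuals – observed minus predicted values, scaled by the estimated standard deviation of the observation
• Deviance residuals – each observation's signed contribution to the deviance (the log-likelihood ratio statistic)
• Leverage
• Studentized residuals
• Cook’s D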
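The first two residual types can be sketched in Python for the Poisson case, where the variance equals the mean. This assumes an observed count y and a fitted mean mu are already available; the function names are illustrative.

```python
import math

def pearson_residual(y, mu):
    """Poisson Pearson residual: (observed - predicted) / sqrt(variance)."""
    return (y - mu) / math.sqrt(mu)

def deviance_residual(y, mu):
    """Poisson deviance residual: signed square root of the deviance term."""
    # y * ln(y / mu) is taken as 0 when y = 0.
    term = y * math.log(y / mu) if y > 0 else 0.0
    d = 2.0 * (term - (y - mu))
    return math.copysign(math.sqrt(max(d, 0.0)), y - mu)

# Example: an observed count of 0 where the model predicts a mean of 2.
r_pearson = pearson_residual(0, 2.0)
r_deviance = deviance_residual(0, 2.0)
```

The squared deviance residuals sum to the model deviance, and the squared Pearson residuals sum to the Pearson chi-square.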
Models
• ANOVA
• Regression
• ANCOVA
• More complex linear models
SAS
• PROC GENMOD: procedure call
• CLASS: categorical (ANOVA) variables
• MODEL: dependent = independent variables
MODEL
• MODEL dependent = independent variables
• MODEL events/trials = independent variables (number of events divided by number of trials, for summarized binomial responses)
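To make the events/trials form concrete, the Python sketch below collapses individual 0/1 responses into one (events, trials) pair per group, which is the summarized layout the second MODEL form expects. The group labels and rows are invented for illustration.

```python
from collections import defaultdict

def summarize(rows):
    """Collapse (group, outcome) rows with 0/1 outcomes into events/trials."""
    counts = defaultdict(lambda: [0, 0])  # group -> [events, trials]
    for group, outcome in rows:
        counts[group][0] += outcome
        counts[group][1] += 1
    return {group: tuple(c) for group, c in counts.items()}

# Invented data: one row per subject, outcome 1 = event, 0 = no event.
rows = [("A", 1), ("A", 0), ("A", 1), ("B", 0), ("B", 0), ("B", 1), ("B", 0)]
summary = summarize(rows)  # group A: 2 events in 3 trials; group B: 1 in 4
```

Each summarized row then carries two variables, the event count and the trial count, instead of one 0/1 response per subject.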
Model Options
• CORRB, COVB: parameter correlations or covariances
• DIST= specifies the assumed distribution of the dependent variable (see SAS docs)
• LINK= specifies the link function. SAS picks a default link for the chosen DIST= if you don’t specify one
• TYPE1 (sequential), TYPE3 (partial), WALD statistics
• P (predicted values), R (residuals)