Chapter 17.1
Poisson Regression
Classic Poisson Example
• Number of deaths by horse kick, for each of 14
corps in the Prussian army, from 1875 to 1894
• Did the risk of death show a trend across
years for the Guard corps?
1. Construct Model – Graphical
1. Construct Model Formal
Write General Linear Model:
General linear model inappropriate for count data:
• Variance likely increases with mean
• Fitted values may be negative
• Errors tend not to be normal
• Zeros are difficult to handle with transformations
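The first and last points above can be illustrated with a small sketch. It uses simulated Poisson counts, not the horse-kick data; the means (1 and 10), the sample size, and the seed are arbitrary assumptions chosen only for illustration.

```r
# Simulated counts (NOT the horse-kick data): Poisson draws at two mean levels.
set.seed(1)                          # arbitrary seed so the sketch is repeatable
low  <- rpois(10000, lambda = 1)     # counts with a small mean
high <- rpois(10000, lambda = 10)    # counts with a larger mean

# For Poisson counts the variance equals the mean, so spread grows with the mean.
var_low  <- var(low)
var_high <- var(high)
print(c(mean_low = mean(low), var_low = var_low,
        mean_high = mean(high), var_high = var_high))

# Zeros break the usual log transformation of the response.
log_zero <- log(0)                   # -Inf
print(log_zero)
```

A constant-variance normal error cannot describe this pattern, which is one reason to move to a model built for counts.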
Write General Linear Model:
$Deaths = \beta_o + \beta_{Year}\,Year + \varepsilon$, with normal errors $\varepsilon$
Write Generalized Linear Model:
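The model equation did not survive on this slide. A plausible reconstruction, inferred from the `glm()` call used in this chapter (`family=poisson(link=log)`), is sketched below. The symbol $\mu$ for the expected number of deaths is introduced here as an assumption; it does not appear in the original slides.

```latex
\begin{aligned}
Deaths &\sim \mathrm{Poisson}(\mu) \\
\log(\mu) &= \beta_o + \beta_{Year}\,Year
\end{aligned}
```

Under this form the fitted mean $\mu = e^{\beta_o + \beta_{Year}\,Year}$ is always positive, and the variance is allowed to increase with the mean.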
2. Execute analysis & 3. Evaluate model
glm1 <- glm(deaths~year, family=poisson(link=log),
data=horsekick)
4. State population and whether sample is
representative.
5. Decide on mode of inference. Is hypothesis
testing appropriate?
6. State HA / Ho pair, tolerance for Type I error
Statistic: ΔG (change in deviance)
Distribution: chi-square (χ²)
7. ANODEV. Calculate change in fit
(ΔG) due to explanatory variables.
• The F-statistic is not used for models with
non-normal errors
• We will assess improvement in fit (ANODEV)
> anova(glm1, test="Chisq")
Analysis of Deviance Table
Model: poisson, link: log
Response: deaths
Terms added sequentially (first to last)
     Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL                    19     22.050
year  1  0.61137        18     21.439   0.4343
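The p-value in the table can be checked by hand from the two residual deviances. The sketch below uses the rounded values printed in the table; the variable names are chosen here for illustration. It compares ΔG with a chi-square distribution whose degrees of freedom equal the change in residual df.

```r
# Residual deviances and df copied (rounded) from the analysis of deviance table.
null_dev <- 22.050    # intercept-only model, 19 residual df
year_dev <- 21.439    # model with year,      18 residual df

delta_G  <- null_dev - year_dev   # improvement in fit due to year, about 0.611
delta_df <- 19 - 18               # one parameter added

# Upper-tail chi-square probability for the change in deviance.
p_value <- pchisq(delta_G, df = delta_df, lower.tail = FALSE)
print(c(delta_G = delta_G, df = delta_df, p = p_value))
```

Because the inputs are rounded, the result agrees with the table's 0.4343 only to about three decimal places.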
8. Assess table in view of evaluation of
residuals.
– Residuals acceptable
9. Declare decision about terms.
– Fail to reject Ho: there was no apparent trend in deaths by
horse kick over two decades (ΔG = 0.611, df = 1, p = 0.4343)
10. Analysis of parameters of biological interest.
– βyear was not significant, so report mean deaths/yr
• 16 deaths / 20 years = 0.8 deaths/year
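The same mean can be recovered from an intercept-only Poisson model, since with a log link the intercept is the log of the mean count. The sketch below is self-contained and uses a stand-in vector: only the totals (16 deaths over 20 years) come from the slides, while the split into individual yearly counts and their ordering are assumptions made here for illustration. For the real analysis use the Guard corps rows of `prussian`.

```r
# Stand-in counts: 20 yearly values summing to 16 deaths.
# The individual values and their order are hypothetical; only the totals are from the slides.
deaths <- rep(c(0, 1, 2, 3), times = c(9, 7, 3, 1))

# Intercept-only Poisson model with log link.
fit0 <- glm(deaths ~ 1, family = poisson(link = log))

# Back-transform the intercept from the log scale to deaths per year.
rate <- unname(exp(coef(fit0)))
print(c(total = sum(deaths), years = length(deaths), rate = rate))
```

The back-transformed intercept matches the simple ratio of total deaths to years, which is why the mean is a sufficient summary once the year term is dropped.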
library(pscl)   # provides the prussian data set
library(Hmisc)  # provides Lag()
data(prussian)  # columns: y (deaths), year (two-digit year), corp
horsekick <- subset(prussian, corp == "G")  # Guard corps only
names(horsekick) <- c("deaths", "year", "corps")
glm0 <- glm(deaths ~ 1, family = poisson(link = log), data = horsekick)  # intercept only
glm1 <- glm(deaths ~ year, family = poisson(link = log), data = horsekick)
# Residuals against fitted values
plot(glm1, which = 1, add.smooth = FALSE, pch = 16)
# Residuals against lagged residuals (glm1$residuals holds the working residuals)
plot(glm1$residuals, Lag(glm1$residuals), xlab = "Residuals", ylab = "Lagged residuals", pch = 16)
# Data with both fitted lines
plot(deaths ~ year, data = horsekick, pch = 16, axes = FALSE, xlab = "Year", ylab = "Deaths (Guard corps)")
axis(1, at = 75:94, labels = 1875:1894)
axis(2, at = 0:3)
box()
lines(horsekick$year, fitted(glm1))           # with regression term
lines(horsekick$year, fitted(glm0), lty = 2)  # intercept only
# Analysis of deviance
anova(glm1, test = "Chisq")