Chapter 17.1 Poisson Regression

Classic Poisson Example
• Number of deaths by horse kick, for each of 14 corps in the Prussian army, from 1875 to 1894
• Did the risk of death show a trend across years for the Guard corps?

1. Construct Model – Graphical

1. Construct Model – Formal
Write General Linear Model:
$\text{Deaths} = \beta_0 + \beta_{year}\,\text{Year} + \varepsilon$, with normal errors $\varepsilon$
General linear model inappropriate for count data:
• Variance likely increases with mean
• Fitted values may be negative
• Errors tend not to be normal
• Zeros are difficult to handle with transformations
Write Generalized Linear Model:
$\text{Deaths} \sim \text{Poisson}(\mu)$, $\ln(\mu) = \beta_0 + \beta_{year}\,\text{Year}$

2. Execute analysis & 3. Evaluate model
glm1 <- glm(deaths~year, family=poisson(link=log), data=horsekick)

4. State population and whether sample is representative.
5. Decide on mode of inference. Is hypothesis testing appropriate?
6. State HA / Ho pair, tolerance for Type I error
Statistic: ΔG (change in deviance)
Distribution: chi-square

7. ANODEV. Calculate change in fit (ΔG) due to explanatory variables.
• The F-statistic is not used for models with non-normal errors
• We will assess improvement in fit (ANODEV)
> anova(glm1, test="Chisq")
Analysis of Deviance Table
Model: poisson, link: log
Response: deaths
Terms added sequentially (first to last)
     Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL                    19     22.050
year  1  0.61137        18     21.439   0.4343

8. Assess table in view of evaluation of residuals.
– Residuals acceptable
9. Declare decision about model terms.
– Reject HA (fail to reject Ho): there was no apparent trend in deaths by horse kick over two decades (ΔG = 0.611, p = 0.4343)
10. Analysis of parameters of biological interest.
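– βyear was not significant – report mean deaths/yr
• 16 deaths / 20 years = 0.8 deaths/year

library(pscl)   # provides the prussian horse kick data
library(Hmisc)  # provides Lag()
data(prussian)  # load the data set shipped with pscl
# Keep the Guard corps only and give the columns readable names
horsekick <- subset(prussian, corp == "G")
names(horsekick) <- c("deaths", "year", "corps")
glm0 <- glm(deaths ~ 1, family = poisson(link = log), data = horsekick)     # intercept only
glm1 <- glm(deaths ~ year, family = poisson(link = log), data = horsekick)  # trend across years
# Residuals against fitted values
plot(glm1, which = 1, add.smooth = FALSE, pch = 16)
# Residuals against lagged residuals, to check for serial dependence
plot(glm1$residuals, Lag(glm1$residuals), xlab = "Residuals", ylab = "Lagged residuals", pch = 16)
# Data with both fitted lines (year is stored as 75 to 94)
plot(deaths ~ year, data = horsekick, pch = 16, axes = FALSE, xlab = "Year", ylab = "Deaths (Guard corps)")
axis(1, at = 75:94, labels = 1875:1894)
axis(2, at = 0:3)
box()
lines(horsekick$year, fitted(glm1))           # with regression term
lines(horsekick$year, fitted(glm0), lty = 2)  # intercept only
anova(glm1, test = "Chisq")                   # analysis of deviance table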
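
The ΔG and p-value in the analysis of deviance table can be checked by hand from the two deviances it reports. The following is a minimal sketch in base R that uses only numbers printed in this chapter. The variable names (null_dev, resid_dev, delta_G, and so on) are illustrative and are not part of the original script.

```r
# Deviances as reported in the analysis of deviance table
null_dev  <- 22.050   # intercept-only model, 19 residual df
resid_dev <- 21.439   # model with year, 18 residual df

# Change in fit due to year, and the df spent on it
delta_G  <- null_dev - resid_dev
delta_df <- 19 - 18

# Upper-tail chi-square probability for the improvement in fit
p_value <- pchisq(delta_G, df = delta_df, lower.tail = FALSE)

# Mean rate reported once the year term is dropped
mean_rate <- 16 / 20          # deaths per year in the Guard corps
intercept <- log(mean_rate)   # the same rate on the log-link scale

print(round(c(delta_G = delta_G, p_value = p_value, mean_rate = mean_rate), 3))
```

Because the link is log, an intercept-only Poisson model estimates the log of the mean count, so exp(intercept) returns the 0.8 deaths/year reported above. The printed ΔG and p-value should agree with the table to rounding.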