Moving further: categorical count data

- Word counts
- Speech error counts
- Metaphor counts
- Active construction counts

Examples: counts of "hissing" sounds in Korean (Winter & Grawunder, 2012); number of grammatical cases across languages (Bentz & Winter, 2013).

[Figure: number of speech errors (y-axis, 0 to 32) plotted against alcohol level (x-axis), with a fitted Poisson model.]

The Poisson distribution

Named after Siméon Poisson. In 1898, Ladislaus Bortkiewicz used it to model deaths by horse kick in the Prussian army: army corps with few horses had few deaths and low variability; corps with lots of horses had many deaths and high variability.

Poisson regression = a generalized linear model with a Poisson error structure and a log link function.

The Poisson model

log(mean of Y) = b0 + b1*X1 + b2*X2
(equivalently: mean of Y = exp(b0 + b1*X1 + b2*X2))

In R (a runnable sketch follows at the end of this section):

glmer(my_counts ~ my_predictors + (1|subject), mydataset, family="poisson")

(In current versions of lme4 this is glmer(); older versions used lmer() with a family argument.)

Poisson model output comes as log values (e^x): exponentiate to get the predicted mean rate.

Moving further: binary categorical data

- Focus vs. no-focus
- Yes vs. no
- Dative vs. genitive
- Correct vs. incorrect

Example: case marking (yes vs. no) ~ percent L2 speakers (Bentz & Winter, 2013).

[Figure: four panels of binary accuracy (0/1) plotted against noise level (dB, 5 to 20), illustrating a fitted logistic curve.]

Logistic regression = a generalized linear model with a binomial error structure and a logistic (logit) link function.

The logistic model

p(Y) = logit⁻¹(b0 + b1*X1 + b2*X2)

In R:

glmer(binary_variable ~ my_predictors + (1|subject), mydataset, family="binomial")

Probabilities and odds

Probability of an event: p(x)
Odds of an event: p(x) / (1 - p(x))

Intuition about odds: what are the odds that I pick a blue marble from N = 12 marbles, 2 of which are blue? Answer: 2/10 (2 blue vs. 10 not blue).

Log odds

log( p(x) / (1 - p(x)) ) = the logit function

Representative values (Snijders & Bosker, 1999: 212):

Probability   Odds    Log odds (= "logits")
0.1           0.111   -2.197
0.2           0.25    -1.386
0.3           0.428   -0.847
0.4           0.667   -0.405
0.5           1        0
0.6           1.5      0.405
0.7           2.33     0.847
0.8           4        1.386
0.9           9        2.197

Case (yes vs. no) ~ Percent L2 speakers (Bentz & Winter, 2013):

              Estimate   Std. Error   z value   Pr(>|z|)
(Intercept)    1.4576    0.6831        2.134    0.03286
Percent.L2    -6.5728    2.0335       -3.232    0.00123

The intercept is the log odds of case marking when Percent.L2 = 0. The slope is how much the log odds decrease for each unit increase in Percent.L2 (coded as a proportion, so 0.3 = 30% L2 speakers).

These estimates are logits ("log odds"): exponentiate to get odds, or transform by the inverse logit to get probabilities. For the slope:

exp(-6.5728) = 0.001397878

Interpreting odds: if p(x) / (1 - p(x)) > 1, the numerator is more likely, i.e. the event happens more often than not; if it is < 1, the denominator is more likely, i.e. the event is more likely not to happen.
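Two short R sketches tie this together. First, the Poisson side promised above: a minimal sketch using plain glm() (without the random subject effect, for simplicity) on a simulated data frame d with hypothetical variables errors and alcohol, invented for illustration and not data from Winter & Grawunder (2012):

## Simulate hypothetical count data: speech errors at four alcohol levels.
set.seed(42)
alcohol <- rep(c(0, 8, 16, 24), each = 10)
errors  <- rpois(length(alcohol), lambda = exp(0.5 + 0.08 * alcohol))
d <- data.frame(alcohol, errors)

## Generalized linear model with Poisson error structure and log link:
mdl <- glm(errors ~ alcohol, data = d, family = "poisson")
summary(mdl)

## Coefficients are on the log scale; exponentiating gives the
## predicted mean rate at alcohol = 0 (intercept) and the
## multiplicative change in that rate per unit alcohol (slope):
exp(coef(mdl))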
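Second, the odds arithmetic: a base-R sketch that reproduces the representative values from Snijders & Bosker (1999) and the exponentiated Percent.L2 slope:

## Probabilities -> odds -> log odds (logits):
p        <- seq(0.1, 0.9, by = 0.1)
odds     <- p / (1 - p)       # e.g. 0.5 -> 1, 0.9 -> 9
log.odds <- log(odds)         # the logit; base-R equivalent: qlogis(p)
round(cbind(p, odds, log.odds), 3)

## Exponentiating a logistic regression slope turns a change in
## log odds into an odds ratio, e.g. the Percent.L2 slope:
exp(-6.5728)                  # ~0.0014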
Transforming by the inverse logit takes you from logits back to probabilities (Bentz & Winter, 2013):

logit.inv(1.4576) = 0.81

About 80%: the predicted probability of case marking when Percent.L2 = 0 (makes sense).

logit.inv(1.4576 + -6.5728*0.3) = 0.37

At 30% L2 speakers, the predicted probability drops to about 37%.

The logit and inverse logit functions

logit(p) = log( p / (1 - p) )       = the logit function
logit⁻¹(x) = e^x / (1 + e^x)        = the inverse logit function

This is the famous "logistic function".

The inverse logit function in R:

logit.inv = function(x){exp(x)/(1+exp(x))}

(This defines the function in R; it transforms logits back to probabilities.)

General Linear Model → Generalized Linear Model → Generalized Linear Mixed Model

Generalized Linear Model = "generalizing" the general linear model to cases that don't involve continuous response variables (in particular categorical ones). It consists of two things: (1) an error distribution, (2) a link function.

- Logistic regression: binomial error distribution, logit link function
- Poisson regression: Poisson error distribution, log link function

In R:

lm(response ~ predictor)
glm(response ~ predictor, family="binomial")
glm(response ~ predictor, family="poisson")

Categorical data

- Dichotomous/binary data → logistic regression
- Count data → Poisson regression

General structure

- Linear model: continuous ~ any type of variable
- Logistic regression: dichotomous ~ any type of variable
- Poisson regression: count ~ any type of variable

For the generalized linear mixed model, you only have to specify the family:

lmer(…)
glmer(…, family="poisson")
glmer(…, family="binomial")

(Again, in current lme4 these are glmer() calls; older lme4 versions used lmer() with a family argument.)

That's it (for now), apart from one closing sketch below.
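The closing sketch, under the assumption of a current lme4 version (1.0 or later), where generalized mixed models are fitted with glmer(). The data are simulated and all names are hypothetical:

library(lme4)

## Simulated binary data: 6 subjects responding at 20 noise levels.
set.seed(1)
dat <- data.frame(subject = factor(rep(1:6, each = 20)),
                  noise   = rep(0:19, times = 6))
dat$correct <- rbinom(nrow(dat), size = 1,
                      prob = plogis(2 - 0.2 * dat$noise))

## Generalized linear mixed model: binomial family, logit link,
## with a by-subject random intercept:
m <- glmer(correct ~ noise + (1 | subject),
           data = dat, family = "binomial")
summary(m)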