Psych 1950 Unit 19: Generalized Linear Mixed-Effects Models II

Today's Menu

Remember that in logistic regression Y is 0/1 (e.g., it could be a single wrong/right item). Today we extend logistic regression in two directions:

- Multiple 0/1 items: Y is based on a sum score of k "correct" answers out of N binary items (binomial regression).
- Single Likert item (e.g., 5-point): Y is ordinal and has more than 2 categories (ordinal logistic regression).

We also introduce a variant of the ordinal logistic regression model that is useful for modeling sums of Likert items, or Likert items with many categories (the equidistant threshold model). Of course, all of these models can be fitted in a mixed-effects fashion as well, by adding random effects accordingly.

Distribution of the day: the binomial distribution.

Resources:
- Agresti, A. (2012). Analysis of Ordinal Categorical Data (3rd ed.). Wiley.
- Bürkner, P. C., & Vuorre, M. (2019). Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science, 2, 77-101. https://doi.org/10.1177/2515245918823199

Binomial Regression

Binomial Distribution

The Bernoulli distribution from last unit focuses on a single 0/1 trial per person. The binomial distribution generalizes this setup to multiple 0/1 trials (e.g., each participant answers 10 binary questions). Formally, we have N 0/1 trials per person with "success" probability p for each trial. The probability mass function (pmf) of the binomial distribution, expressing the probability that a person has k "successes" (i.e., k 1-scores), is

P(Y = k) = \binom{N}{k} p^k (1 - p)^{N - k}

Note that the binomial distribution is discrete. Within a regression context we will use it to model a response reflecting the number of successes in N binary (!!) trials.

Binomial Distribution

[Interactive figure: binomial pmf for varying parameters n (1-40; shown: 10) and p (0-1; shown: 0.5).]

Support: k ∈ {0, 1, ..., n}
E(X) = np
Var(X) = np(1 − p)
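As a quick numerical check on the pmf and the moment formulas above, here is a minimal base-R sketch (not part of the original slides); the values of N and p are arbitrary illustration choices:

# Evaluate the binomial pmf and verify its mean and variance numerically.
N <- 10    # number of 0/1 trials per person
p <- 0.3   # per-trial "success" probability

k <- 0:N
pmf <- dbinom(k, size = N, prob = p)                  # P(Y = k) for k = 0, ..., N
all.equal(pmf, choose(N, k) * p^k * (1 - p)^(N - k))  # matches the formula above

# E(Y) = Np and Var(Y) = Np(1 - p)
c(mean = sum(k * pmf), var = sum((k - N * p)^2 * pmf))
c(N * p, N * p * (1 - p))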
Sidestep: Sum Scores

Sum scores as response variables are tricky to model (lower and upper bound; questionable whether they are actually metric). Remarks:

- A sum score (or mean) is a very coarse aggregate measure [1] (e.g., each item gets the same weight of 1). In the psychometrics class we learn specific techniques for aggregating items properly (principal component analysis, factor analysis, item response theory).
- If a sum score is based on a sum of 0/1 items, use binomial regression.
- If a sum score is based on a sum of Likert items, there are several options:
  - Ordinal regression (sum score taken as ordinal; no distributional violations).
  - Beta regression if we hit the boundaries (sum score taken as metric); Gaussian regression if normality is not violated too strongly (sum score taken as metric).

[1] McNeish, D., & Wolf, M. G. (2020). Thinking twice about sum scores. Behavior Research Methods, 52, 2287-2305. https://doi.org/10.3758/s13428-020-01398-0

Binomial Regression

If Y is based on a set of N binary responses and we are interested in modeling the number of successes, binomial regression does the job.

- Single 0/1 response: logistic regression → Bernoulli distribution.
- Multiple 0/1 responses aggregated: binomial regression → binomial distribution.

The model expression is the same as in logistic regression [1], i.e.,

logit(p_i) = \log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_p X_{pi}

with p_i as the proportion of successes and 1 − p_i as the proportion of failures of person i.

[1] The parameter interpretation is analogous to logistic regression: e.g., an exp(β̂) transformation gives the odds ratio (success vs. failure for a 1-unit increase in X, holding all other predictors constant). In any case, effects plots show the effects in a straightforward manner.

Remarks

Some remarks before we fit a binomial regression model:

- Binomial regression only works if there is an upper limit N on the number of trials. It does not work for an arbitrary count variable (we will learn corresponding count regression models later).
- The number N does NOT have to be constant across subjects, i.e., each participant can be exposed to a different number of trials.
- We don't use the "proportion correct" directly as the response. We will learn a corresponding modeling approach when we cover beta regression.
- Binomial regression is great for modeling sum scores of binary (dichotomous) items. Don't use it for a sum of Likert items, as the binomial distribution does not reflect the underlying data-generating process (DGP).

Binomial Regression

Binomial regression models can be fitted with glm() (or brm() / stan_glm() for Bayesian) with family = "binomial". The response variable occupies two columns in the data frame. Say we have a predictor X, N as the number of trials per person, and sumY as the number of 1-responses in the N trials. A simple data frame would look as follows:

dat
##    X sumY  N
## 1 10    5 10
## 2 16    8 10
## 3  8   10 10
## 4 20    1 10
## 5 15    3 10

glm() call (we cbind successes and failures; same in stan_glm()):

glm(cbind(sumY, N - sumY) ~ X, family = "binomial", data = dat)

brm() call (we pipe successes and number of trials):

brm(sumY | trials(N) ~ X, family = "binomial", data = dat)

In the code file we show an example of a mixed-effects binomial regression using a word recognition dataset; a minimal sketch with simulated data follows below.
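The course code file is not reproduced here, but a mixed-effects binomial regression could look like the following minimal sketch with lme4. The data are simulated, and all names (dat, subject, block, X, N, sumY) are hypothetical placeholders, not the word-recognition data:

library(lme4)

# Simulate toy data: 30 subjects, 4 blocks each, N = 10 binary trials per block.
set.seed(1)
dat <- expand.grid(subject = factor(1:30), block = 1:4)
dat$X <- rnorm(nrow(dat))
dat$N <- 10
u <- rnorm(30, sd = 0.8)                                 # random intercept per subject
eta <- -0.5 + 0.6 * dat$X + u[as.integer(dat$subject)]   # linear predictor (logit scale)
dat$sumY <- rbinom(nrow(dat), size = dat$N, prob = plogis(eta))

# Mixed-effects binomial regression: successes and failures on the left-hand side,
# fixed slope for X, random intercept per subject.
fit <- glmer(cbind(sumY, N - sumY) ~ X + (1 | subject),
             family = binomial, data = dat)
summary(fit)

# Bayesian analogue with brms (aggregated binomial via trials()):
# brm(sumY | trials(N) ~ X + (1 | subject), family = binomial(), data = dat)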
Ordinal Logistic Regression

Now we extend basic logistic regression to a situation where Y is ordinal and has more than 2 categories. Examples:

- 4 mental impairment categories (well, mild, moderate, impaired).
- A Likert item (e.g., 5-point scale) as response.
- A sum of Likert items: e.g., the sum of 3 items with 0-4 response categories → minimum score of 0, maximum of 12.

There are several types of ordinal regression models, depending on how they are parameterized [1]. Here we focus on:

- the proportional odds model (the most popular ordinal regression model);
- the equidistant threshold model (super useful; a restricted version of the proportional odds model).

[1] Bürkner, P. C., & Vuorre, M. (2019). Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science, 2, 77-101. https://doi.org/10.1177/2515245918823199

Proportional Odds Model

For an ordinal response Y_i, the most popular model is the proportional odds model. It is based on the following cumulative logit expression:

\log\left(\frac{P(Y_i \le j)}{1 - P(Y_i \le j)}\right) = \alpha_j + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip}

For instance, for j = 2: P(Y_i ≤ 2) is the probability of scoring 2 or less (i.e., P(Y_i = 1) + P(Y_i = 2)); 1 − P(Y_i ≤ 2) = P(Y_i > 2), i.e., the probability of scoring higher than 2.

Each cumulative logit has its own intercept α_j, which increases with the response category j. This intercept determines the horizontal shift of the logistic function. Note that we only get one set of β parameters, typically the target of inference. For instance, for j = 2 vs. j = 3, the curve for P(Y ≤ 2) is the curve for P(Y ≤ 3) shifted by (α_3 − α_2)/β_1 units in the X_1 direction. This applies to all predictors and all adjacent cumulative logits; that is why it is called the proportional odds model.

Example: Mental Impairment Data

We model mental impairment (4 categories: well, mild, moderate, impaired) as the response, with SES and the number of critical life events as predictors, using the clm() function from the ordinal package:

clmFit <- clm(impairment ~ ses + events, data = mentalImpairment)

...
## Coefficients:
##         Estimate Std. Error z value Pr(>|z|)
## seshigh   1.1112     0.6109   1.819   0.0689 .
## events   -0.3189     0.1210  -2.635   0.0084 **
## ---
##
## Threshold coefficients:
##           Estimate Std. Error z value
## imp|mod    -2.2094     0.7210  -3.064
## mod|mild   -1.2128     0.6607  -1.836
## mild|well   0.2819     0.6423   0.439
...

Slopes: β̂_1 = 1.111, β̂_2 = −0.319 (can be exponentiated for interpretability).
Intercepts (threshold coefficients): α̂_1 = −2.209, α̂_2 = −1.213, α̂_3 = 0.282.

Example: Mental Impairment Data

Let us have a look at the effects plots (cumulative logits):

[Figure: effects plots of the cumulative logits for ses and events.]

E.g., for events, −β̂_2 = 0.319 reflects the slope of the logits. The α̂_j parameters shift the curves.

Extensions

At this point we can think of the following extensions:

- Mixed-effects extensions: super easy to fit, since we only change the right-hand side of the equation. The clmm() function in the ordinal package is an easy-to-use implementation. In the Bayesian domain, the brms package can handle a variety of ordinal models in a mixed-effects fashion.
- The proportional odds model is a cumulative odds model. Depending on the specification of the log-odds on the left-hand side of the equation, other models can be formulated (e.g., adjacent-category, sequential, etc.). Details can be found in Bürkner & Vuorre (2019), including illustrations of how to fit these models with brms.
- One can think of a more parsimonious version of the proportional odds model that restricts the thresholds to be equidistant. This is useful for ordinal responses with many categories. In the code we show a mixed-effects ordinal regression model with equidistant thresholds (a minimal sketch also appears at the end of this unit).

Bürkner, P. C., & Vuorre, M. (2019). Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science, 2, 77-101. https://doi.org/10.1177/2515245918823199

Summary

Today we've extended basic logistic regression in two directions:

- binomial logit model: modeling sum scores of binary (!!) items as the response;
- ordinal logit models (ordinal response with more than 2 categories): the proportional odds and equidistant threshold models.

There is another extension, the multinomial logit model, for responses that are NOT ordinal and have more than 2 categories. See here for a tutorial. Ultimately, the best way to fit these models is to do it in a Bayesian way using brm() with family = "categorical". Random effects can be added as desired.

Next on the list: a formal introduction of the GLM class, followed by modeling metric response variables that don't fit into a Gaussian distribution framework.
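As a companion to the ordinal part of this unit, here is a minimal sketch of the proportional odds model, its equidistant-threshold restriction, and a mixed-effects version using the ordinal package. The data are simulated, and all names (d, subject, x, rating) are hypothetical placeholders; the actual model in the course code file may differ.

library(ordinal)

# Toy data: a 5-point ordinal rating, one metric predictor, 5 observations per subject.
set.seed(2)
d <- data.frame(subject = factor(rep(1:40, each = 5)), x = rnorm(200))
eta <- 0.8 * d$x + rnorm(40, sd = 0.7)[as.integer(d$subject)]
d$rating <- factor(findInterval(eta, c(-1.5, -0.5, 0.5, 1.5)) + 1, ordered = TRUE)

# Proportional odds model with freely estimated ("flexible") thresholds:
fit_po <- clm(rating ~ x, data = d)

# Restricted version: thresholds forced to be equidistant (handy with many categories):
fit_eq <- clm(rating ~ x, data = d, threshold = "equidistant")
anova(fit_eq, fit_po)   # likelihood-ratio test of the equidistance restriction

# Mixed-effects extension: random intercept per subject via clmm():
fit_mm <- clmm(rating ~ x + (1 | subject), data = d, threshold = "equidistant")
summary(fit_mm)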