Multinomial, Ordered and Multivariate Models

Multinomial, ordered and multivariate models allow us to use various types of categorical data for our dependent variable.

What type of dependent variable do you have?

Dependent Variable                               Model
Interval or ratio scale                          Ordinary Least Squares
Binary: 0, 1                                     Logit or Probit
Ordinal: 1st, 2nd, 3rd (mutually exclusive)      Ordered Logit or Probit
Nominal: gender (mutually exclusive)             Multinomial Model
Multiple Binary: y1: 0, 1; y2: 0, 1; …           Multivariate Model

Should you use Logit or Probit?

Logit                                            Probit
Coefficients affect the log odds                 Coefficients affect the probability through the normal CDF
\partial P_i / \partial x_k = \beta_k P_i (1 - P_i)    \partial P_i / \partial x_k = \beta_k \phi(Z_i)
Odds = P / (1 - P)                               Based on the normal distribution
Logistic distribution tails are fatter

Here P_i is the predicted probability for observation i, Z_i is the probit index (the linear combination of the explanatory variables) and \phi is the standard normal density.

Deciding between a Multinomial and an Ordered Model

This choice depends on what type of data you are working with.

Nominal Data: Categorical data where the categories have no inherent order; any ordering of them is arbitrary.

Ordinal Data: Categorical data where the categories have a meaningful rank. However, ordinal data may not describe the relative size of each category or the amount of difference between categories.

Multinomial Models

When to use a multinomial model?

When the dependent variable has more than two discrete, unordered categorical outcomes, a multinomial model can estimate a separate set of effects of the explanatory variables for each category of the dependent variable.

[Figure: a dependent variable with multiple categories. Example: Land Use (1. Housing, 2. Farming, 3. Forest Land). Each category has its own set of variable effects: Variables 1a-3a, Variables 1b-3b, Variables 1c-3c.]

Ordered Models

When to use an ordered model?

If the dependent variable is both categorical and ordinal, use an ordered model to estimate the effects of explanatory variables on the probability that an observation will fall within one of the categories of the dependent variable.
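As a rough illustration of how an ordered logit assigns probabilities to ranked categories, the sketch below uses a common parameterization in which P(y <= j) = F(cutpoint_j - xb), with F the logistic CDF. The function name ordered_logit_probs, the coefficient 0.8 and the cutpoints -1 and 1 are made-up values for illustration only, not estimates from any data.

```python
import math
from typing import List, Sequence


def logistic_cdf(z: float) -> float:
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(xb: float, cutpoints: Sequence[float]) -> List[float]:
    """Return P(y = j) for each ordered category, lowest to highest.

    xb        -- linear index (coefficients times explanatory variables)
    cutpoints -- increasing thresholds; J categories need J - 1 cutpoints
    """
    # Cumulative probabilities P(y <= j); the highest category closes at 1.
    cumulative = [logistic_cdf(c - xb) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)  # P(y = j) = P(y <= j) - P(y <= j - 1)
        previous = cum
    return probs


if __name__ == "__main__":
    beta = 0.8               # hypothetical coefficient on a single variable x
    cutpoints = [-1.0, 1.0]  # hypothetical thresholds for three categories
    for x in (0.0, 1.0, 2.0):
        probs = ordered_logit_probs(beta * x, cutpoints)
        print(f"x = {x}: " + ", ".join(f"{p:.3f}" for p in probs))
```

Only one coefficient enters the calculation, whatever the category. As x rises, probability mass shifts from the lowest category toward the highest, and the cutpoints alone separate the categories.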
Ordered models assume that the independent variables have the same effect when moving between any two adjacent categories (the proportional odds, or parallel lines, assumption), so a single set of coefficients is estimated.

[Figure: Explanatory Variables 1-3 all feed a single dependent variable with multiple ordered categories. Example: Rank (1. Gold, 2. Silver, 3. Bronze).]

Multivariate Models

When is a multivariate approach necessary?

Multivariate models simultaneously estimate multiple dependent variables. These dependent variables can be binary or continuous. Since we are addressing logit and probit models here, we will focus on binary responses.

[Figure: Explanatory Variables 1-3 feed two dependent variables. Dependent Variable #1, example: Male has bank account. Dependent Variable #2, example: Female has bank account.]

Multivariate models can be particularly helpful when two dependent variables are correlated. Multivariate commands take the extra step of calculating the correlation between the two y variables and including it in the estimation of the coefficients. Estimating the equations together improves efficiency. There is no gain in efficiency if the two dependent variables are not correlated; in that case it is better to estimate them separately.

How the multivariate analysis is set up (bivariate example):

Typical set up of the dependent variables (here shown as binary). In the standard bivariate probit form:

y_1^* = x'\beta_1 + \varepsilon_1,   y_1 = 1 if y_1^* > 0, otherwise 0
y_2^* = x'\beta_2 + \varepsilon_2,   y_2 = 1 if y_2^* > 0, otherwise 0

The only addition is that we have more than one equation, with errors that are allowed to be correlated:

(\varepsilon_1, \varepsilon_2) ~ bivariate normal, with means 0, variances 1 and correlation \rho

This added step of estimating \rho is what allows us to use dependent variables that may be highly correlated.

Example of a Multivariate and Bivariate Analysis

Marieka Klawitter, 'Who Is Banked in Low Income Families? The Effects of Gender and Bargaining Power', Social Science Research, 40 (2011), 50.

Data: Survey that includes data on which families have bank accounts and whether those accounts are jointly or individually owned.

Models: Probit, bivariate and multivariate models were each specified with bank account ownership as the dependent variable and various socioeconomic factors as explanatory variables.
Analysis: By using a multivariate method, the study is able to show how intrahousehold dynamics influence whether a family or an individual has a bank account. The probit model on its own does not give the richness of detail that we see from the multivariate approach.

Stata Code for Reference

Category        Model                      Stata Command
                Ordinary Least Squares     regress
Binary          Binary Logit               logit, logistic
                Binary Probit              probit
Ordinal         Ordered Logit              ologit
                Generalized Logit          gologit2 (user-written; install from SSC)
                Ordered Probit             oprobit
Nominal         Multinomial Logit          mlogit
                Conditional Logit          clogit
                Multinomial Probit         mprobit
Multivariate    Bivariate                  biprobit
                Multivariate               Many; search Stata for options

Resources:

Multinomial and Ordinal Models: http://www.indiana.edu/~statmath/stat/all/cdvm/cdvm1.html#s11
Multinomial Logit: http://en.wikipedia.org/wiki/Multinomial_logit
Intro to Multivariate: http://www.pisces-conservation.com/pdf/mvstats-lecture1.pdf