Chapter 2: Logistic Regression and Correspondence Analysis

2.1 Fitting Ordinal Logistic Regression Models
2.2 Fitting Nominal Logistic Regression Models
2.3 Introduction to Correspondence Analysis


2.1 Fitting Ordinal Logistic Regression Models

Objectives
Define a cumulative logit.
Fit an ordinal logistic regression model.
Interpret parameter estimates.
Compute odds ratios.

When Do You Use Ordinal Logistic Regression?
Response Variable                        Type of Logistic Regression
Two categories                           Binary
Three or more categories, nominal        Nominal
Three or more categories, ordinal        Ordinal

Cumulative Logits
Logit(1) = log[ P(Response <= 1) / P(Response > 1) ]
Logit(2) = log[ P(Response <= 2) / P(Response > 2) ]
Number of Cumulative Logits = Number of Levels - 1

Proportional Odds Assumptions
[Figure: Logit(i) plotted against Predictor X; two parallel lines, labeled Equal Slopes]
Logit(1) = a1 + BX
Logit(2) = a2 + BX

Sample Data Set
Predictors: Gender, Income, Age
Outcome levels:
Range      Level
>100       5
75-100     4
50-74      3
25-49      2
0-24       1

Examining Distributions
This demonstration illustrates the concepts discussed previously.

Exercise
This exercise reinforces the concepts discussed previously.


2.2 Fitting Nominal Logistic Regression Models

Objectives
Explain a generalized logit.
Fit a nominal logistic regression model.
Interpret the parameter estimates.
Compute odds ratios.

When Do You Use Nominal Logistic Regression?
Response Variable                        Type of Logistic Regression
Two categories                           Binary
Three or more categories, nominal        Nominal
Three or more categories, ordinal        Ordinal

Generalized Logits
Logit(i) = log[ P(Response = i) / P(Response = reference level) ]
Number of Generalized Logits = Number of Levels - 1

Generalized Logit Model
[Figure: Logit(i) plotted against Predictor X; two lines, labeled Different Slopes and Intercepts]
Logit(1) = a1 + B1X
Logit(2) = a2 + B2X

2.01 Multiple Choice Poll
Suppose a nominal response variable has four levels. Which of the following statements is true?
a. JMP will compute three generalized logits.
b. Logit(1) is the log odds for level 1 occurring versus level 4 occurring.
c. JMP will compute a separate intercept parameter for each logit.
d. JMP will compute a separate slope parameter for each logit.
e. All of the above are true.

Sample Data Set
Predictors: Gender, Income, Age
Outcome levels:
Range      Level
>100       5
75-100     4
50-74      3
25-49      2
0-24       1

Nominal Logistic Regression Model
This demonstration illustrates the concepts discussed previously.

Exercise
This exercise reinforces the concepts discussed previously.


2.3 Introduction to Correspondence Analysis

Objectives
Explain how correspondence analysis can help you study data.
Perform a simple correspondence analysis.
Interpret a correspondence plot.

What Is Correspondence Analysis?
Correspondence analysis is a data analysis technique that enables you to display the associations between the levels of two or more categorical variables graphically, and to extract information from a frequency table with many levels for the rows and columns.

Row and Column Profiles
Each cell shows Row % / Column %.
       A                B                C
1      19.55 / 27.39    25.91 / 23.27    54.55 / 25.53
2      17.27 / 24.20    28.18 / 25.31    54.55 / 25.53
3      17.67 / 24.20    28.84 / 25.31    53.49 / 24.47
4      17.51 / 24.20    29.49 / 26.12    53.00 / 24.47
Row % gives the row profile. Column % gives the column profile.
Row and column percentages are used to obtain row and column profiles.

Row Profiles
Row %
       A        B        C
1      19.55    25.91    54.55
2      17.27    28.18    54.55
3      17.67    28.84    53.49
4      17.51    29.49    53.00
Row Profile = Row % / 100
Row percentages are used to obtain row profiles.

Column Profiles
Column %
       A        B        C
1      27.39    23.27    25.53
2      24.20    25.31    25.53
3      24.20    25.31    24.47
4      24.20    26.12    24.47
Column Profile = Column % / 100
Column percentages are used to obtain column profiles.

Correspondence Plot
Rows 1 and 2 have similar profiles. Their points are close together and fall in the same direction away from the origin.
The profile for Row 7 is different. Its point is closer to the origin and falls in a different direction away from it.

Association
Row 8 and Column D fall in approximately the same direction from the origin, and are relatively close to one another.

2.02 Multiple Answer Poll
In correspondence analysis, which of the following are true? (Choose all answers that apply.)
a. Row points that fall far from each other but in the same direction away from the origin indicate that they have similar profiles.
b. Column points that fall close together and in the same direction away from the origin indicate that they have similar profiles.
c. Row and column points that fall in the same direction away from the origin indicate that they have an association.

Sample Data Set
Variables: MOVIES, AGE, GENDER
Movie categories shown: ACTION, MYSTERY, COMEDY, SPORTS, ROMANCE, SCI-FI, HORROR, DRAMA, FAMILY

Analysis Approaches
You want to perform an analysis that takes into account the three variables Movie, Age, and Gender. There are several approaches. You can
analyze a two-way table where the rows correspond to the levels of Movie and the columns correspond to combinations of the levels of Age and Gender
treat Gender as a stratification variable and analyze males and females separately.

Correspondence Analysis
This demonstration illustrates the concepts discussed previously.

Exercise
This exercise reinforces the concepts discussed previously.

2.03 Quiz
Ice cream brands A through D are tested by a panel and rated from 1 through 9 (with 9 as the best score). What can you conclude from the Correspondence Analysis?
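
Worked Example: Row and Column Profiles
The Row Profiles and Column Profiles slides in this section state that a profile is the row or column percentage divided by 100. The Python sketch below illustrates that calculation. It is not part of the course material, and it makes some assumptions: the cell counts are not given in the slides and were back-calculated so that they reproduce the percentages shown, and the function names are illustrative only. It also computes the chi-square distance of each row profile from the average row profile, which is the usual basis for how far a row point falls from the origin in a correspondence plot.

```python
# Assumption: these counts do not appear in the slides. They were
# back-calculated so that they reproduce the row and column
# percentages shown for rows 1-4 and columns A, B, C.
counts = [
    [43, 57, 120],  # row 1
    [38, 62, 120],  # row 2
    [38, 62, 115],  # row 3
    [38, 64, 115],  # row 4
]


def row_profiles(table):
    """Divide each cell by its row total (Row % / 100)."""
    return [[cell / sum(row) for cell in row] for row in table]


def column_profiles(table):
    """Divide each cell by its column total (Column % / 100)."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[cell / col_totals[j] for j, cell in enumerate(row)] for row in table]


def average_row_profile(table):
    """Column totals divided by the grand total."""
    grand_total = sum(sum(row) for row in table)
    return [sum(col) / grand_total for col in zip(*table)]


def chi_square_distances(table):
    """Chi-square distance of each row profile from the average row profile."""
    avg = average_row_profile(table)
    return [
        sum((p - a) ** 2 / a for p, a in zip(profile, avg)) ** 0.5
        for profile in row_profiles(table)
    ]


rows = row_profiles(counts)
cols = column_profiles(counts)
distances = chi_square_distances(counts)

for label, profile, dist in zip("1234", rows, distances):
    percents = ", ".join(f"{100 * p:.2f}" for p in profile)
    print(f"Row {label}: row % = ({percents}); distance from average = {dist:.4f}")
```

Each row profile sums to 1 across the columns, and each column profile sums to 1 down the rows. In this table the four row profiles are nearly identical, so their distances from the average profile are small. That is the situation the Correspondence Plot slide describes as similar profiles.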