Logistic Regression and Correspondence Analysis

Chapter 2: Logistic Regression and
Correspondence Analysis
2.1 Fitting Ordinal Logistic Regression Models
2.2 Fitting Nominal Logistic Regression Models
2.3 Introduction to Correspondence Analysis
2.1 Fitting Ordinal Logistic Regression Models
Objectives
• Define a cumulative logit.
• Fit an ordinal logistic regression model.
• Interpret parameter estimates.
• Compute odds ratios.
When Do You Use Ordinal Logistic Regression?
Does the response variable have two categories?
• Yes: binary logistic regression
• No (three or more categories):
  nominal response: nominal logistic regression
  ordinal response: ordinal logistic regression
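Cumulative Logits
For an ordinal response, each cumulative logit is the log odds of the response falling at or below a level versus above it:
Logit(1) = Log[ P(response ≤ 1) / P(response > 1) ]
Logit(2) = Log[ P(response ≤ 2) / P(response > 2) ]
Number of Cumulative Logits = Number of Levels - 1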
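As a minimal sketch, the following Python computes cumulative logits from a set of response-level probabilities, assuming the usual definition in which cumulative logit i is the log odds of the response falling at or below level i. The function name and the probabilities are illustrative assumptions, not values from the course data.

```python
import math

def cumulative_logits(probs):
    """Return log(P(Y <= i) / P(Y > i)) for each level i except the last."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    logits = []
    cumulative = 0.0
    for p in probs[:-1]:  # the last level has no cumulative logit
        cumulative += p
        logits.append(math.log(cumulative / (1.0 - cumulative)))
    return logits

# Hypothetical probabilities for a three-level ordinal response.
example_probs = [0.2, 0.3, 0.5]
example_logits = cumulative_logits(example_probs)
```

A three-level response yields two cumulative logits, which matches the rule Number of Cumulative Logits = Number of Levels - 1.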
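Proportional Odds Assumptions
Logit(1) = a1 + BX
Logit(2) = a2 + BX
Equal slopes: the logits share the slope B and differ only in their intercepts.
[Figure: Logit(i) plotted against Predictor X, one line per logit]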
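The equal-slopes idea can be sketched in Python. The example below assumes each Logit(i) = ai + BX is the log odds of the response falling at or below level i, and uses made-up parameter values; software may parameterize the sign of B differently, so treat this as an illustration of the arithmetic only.

```python
import math

def cumulative_probs(intercepts, slope, x):
    """P(Y <= i) for each cumulative logit a_i + B*x (inverse logit)."""
    return [1.0 / (1.0 + math.exp(-(a + slope * x))) for a in intercepts]

def level_probs(intercepts, slope, x):
    """Probability of each response level, obtained by differencing."""
    cum = cumulative_probs(intercepts, slope, x) + [1.0]
    return [cum[0]] + [cum[i] - cum[i - 1] for i in range(1, len(cum))]

def cumulative_odds(intercepts, slope, x):
    """Odds of Y <= i versus Y > i for each cumulative logit."""
    return [p / (1.0 - p) for p in cumulative_probs(intercepts, slope, x)]

# Hypothetical parameters: a1 < a2 keeps the cumulative probabilities ordered.
a = [-1.0, 0.5]
B = 0.8

odds_at_0 = cumulative_odds(a, B, 0.0)
odds_at_1 = cumulative_odds(a, B, 1.0)
# Under equal slopes, every cumulative odds ratio equals exp(B).
odds_ratios = [o1 / o0 for o0, o1 in zip(odds_at_0, odds_at_1)]
```

Because the slope is shared, a one-unit increase in X multiplies the odds for every cumulative logit by the same factor, exp(B).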
Sample Data Set
Predictors: Gender, Income, Age
The model relates these predictors to the outcome.
Outcome:
Range     Level
>100      5
75-100    4
50-74     3
25-49     2
0-24      1
Examining Distributions
This demonstration illustrates the concepts
discussed previously.
Exercise
This exercise reinforces the concepts discussed
previously.
2.2 Fitting Nominal Logistic Regression Models
Objectives
• Explain a generalized logit.
• Fit a nominal logistic regression model.
• Interpret the parameter estimates.
• Compute odds ratios.
When Do You Use Nominal Logistic Regression?
Does the response variable have two categories?
• Yes: binary logistic regression
• No (three or more categories):
  nominal response: nominal logistic regression
  ordinal response: ordinal logistic regression
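Generalized Logits
Each generalized logit is the log odds of one response level versus a reference level (the last level):
Logit(1) = Log[ P(response = 1) / P(response = last level) ]
Logit(2) = Log[ P(response = 2) / P(response = last level) ]
Number of Generalized Logits = Number of Levels - 1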
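A minimal Python sketch of generalized logits follows, assuming the last response level is the reference (consistent with this section's poll statement that, for a four-level response, Logit(1) compares level 1 with level 4). The function name and probabilities are illustrative assumptions.

```python
import math

def generalized_logits(probs):
    """Return log(P(Y = i) / P(Y = last level)) for each non-reference level."""
    reference = probs[-1]
    return [math.log(p / reference) for p in probs[:-1]]

# Hypothetical probabilities for a four-level nominal response.
example_probs = [0.1, 0.2, 0.3, 0.4]
example_logits = generalized_logits(example_probs)
```

Four levels give three generalized logits, one for each level other than the reference.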
Generalized Logit Model
Logit(1) = a1 + B1X
Logit(2) = a2 + B2X
Different slopes and intercepts: each logit has its own intercept and its own slope.
[Figure: Logit(i) plotted against Predictor X, one line per logit]
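To show how separate intercepts and slopes translate into level probabilities, here is a hedged Python sketch. It assumes the last level is the reference (its logit is fixed at 0) and uses made-up parameter values; it is not the fitting procedure, only the arithmetic that turns generalized logits into probabilities and odds ratios.

```python
import math

def nominal_probs(intercepts, slopes, x):
    """Level probabilities from generalized logits a_i + B_i*x.

    The last level is the reference, so its logit is fixed at 0.
    """
    logits = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical parameters for a three-level response (two logits).
a = [0.5, -0.2]
B = [1.0, -0.5]

probs_at_0 = nominal_probs(a, B, 0.0)
probs_at_1 = nominal_probs(a, B, 1.0)

# Odds of level i versus the reference change by exp(B_i) per unit of x.
odds_ratio_level1 = (probs_at_1[0] / probs_at_1[2]) / (probs_at_0[0] / probs_at_0[2])
odds_ratio_level2 = (probs_at_1[1] / probs_at_1[2]) / (probs_at_0[1] / probs_at_0[2])
```

Unlike the equal-slopes case, each logit here has its own odds ratio, exp(B1) and exp(B2).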
2.01 Multiple Choice Poll
Suppose a nominal response variable has four levels.
Which of the following statements is true?
a. JMP will compute three generalized logits.
b. Logit(1) is the log odds for level 1 occurring versus
level 4 occurring.
c. JMP will compute a separate intercept parameter for
each logit.
d. JMP will compute a separate slope parameter for each
logit.
e. All of the above are true.
2.01 Multiple Choice Poll – Correct Answer
Correct answer: e. All of the above are true.
With four levels there are 4 - 1 = 3 generalized logits. Each logit compares one level with the last level (level 4), and each has its own intercept and slope parameter.
Sample Data Set
Predictors: Gender, Income, Age
The model relates these predictors to the outcome.
Outcome:
Range     Level
>100      5
75-100    4
50-74     3
25-49     2
0-24      1
Nominal Logistic Regression Model
This demonstration illustrates the concepts
discussed previously.
Exercise
This exercise reinforces the concepts discussed
previously.
2.3 Introduction to Correspondence Analysis
Objectives
• Explain how correspondence analysis can help you study data.
• Perform a simple correspondence analysis.
• Interpret a correspondence plot.
What Is Correspondence Analysis?
Correspondence analysis is a data analysis technique that enables you to
• display the associations between the levels of two or more categorical variables graphically
• extract information from a frequency table with many levels for the rows and columns.
Row and Column Profiles
Each cell shows Row % / Column %.
Row   A               B               C
1     19.55 / 27.39   25.91 / 23.27   54.55 / 25.53
2     17.27 / 24.20   28.18 / 25.31   54.55 / 25.53
3     17.67 / 24.20   28.84 / 25.31   53.49 / 24.47
4     17.51 / 24.20   29.49 / 26.12   53.00 / 24.47
Row % gives the row profile. Column % gives the column profile.
Row and column percentages are used to obtain row and column profiles.
Row Profiles
Row %
Row   A       B       C
1     19.55   25.91   54.55
2     17.27   28.18   54.55
3     17.67   28.84   53.49
4     17.51   29.49   53.00
Row Profile = Row % / 100
Row percentages are used to obtain row profiles.
Column Profiles
Column %
Row   A       B       C
1     27.39   23.27   25.53
2     24.20   25.31   25.53
3     24.20   25.31   24.47
4     24.20   26.12   24.47
Column Profile = Column % / 100
Column percentages are used to obtain column profiles.
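The profile calculations can be sketched in Python. The slides show only percentages, so the count table below is an inference: one set of counts that reproduces the published row and column percentages to two decimals. It is not data taken from the course, and the function names are illustrative.

```python
# One count table consistent with the percentages on the slides (an inference,
# not data taken from the course). Rows are levels 1-4; columns are A, B, C.
counts = [
    [43, 57, 120],
    [38, 62, 120],
    [38, 62, 115],
    [38, 64, 115],
]

def row_profiles(table):
    """Divide each cell by its row total."""
    return [[cell / sum(row) for cell in row] for row in table]

def column_profiles(table):
    """Divide each cell by its column total (same layout as the input)."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[cell / total for cell, total in zip(row, col_totals)] for row in table]

rows = row_profiles(counts)
cols = column_profiles(counts)

# Print the row profiles as percentages for comparison with the Row % table.
for label, profile in zip("1234", rows):
    print(label, [round(100 * p, 2) for p in profile])
```

Each row profile sums to 1 across the columns, and each column profile sums to 1 down the rows.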
Correspondence Plot
Rows 1 and 2 have similar profiles. Their points are close together and fall in the same direction away from the origin.
The profile for Row 7 is different. Its point is closer in and falls in a different direction away from the origin.
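One way to quantify "similar profiles" is the chi-square distance between row profiles, which is the distance correspondence analysis approximates when it places row points in the plot. The Python sketch below is an illustration under assumptions: the count table is inferred from the percentages on the profile slides (it is not the table behind this plot, which has more rows), and the function name is hypothetical.

```python
import math

# Inferred counts that reproduce the profile-slide percentages (an assumption).
counts = [
    [43, 57, 120],
    [38, 62, 120],
    [38, 62, 115],
    [38, 64, 115],
]

def chi_square_distance(table, i, k):
    """Chi-square distance between the profiles of rows i and k.

    Squared profile differences are weighted by the inverse column mass.
    """
    grand_total = sum(sum(row) for row in table)
    col_masses = [sum(col) / grand_total for col in zip(*table)]
    profile_i = [cell / sum(table[i]) for cell in table[i]]
    profile_k = [cell / sum(table[k]) for cell in table[k]]
    return math.sqrt(sum((a - b) ** 2 / m
                         for a, b, m in zip(profile_i, profile_k, col_masses)))

d_23 = chi_square_distance(counts, 1, 2)  # rows 2 and 3
d_14 = chi_square_distance(counts, 0, 3)  # rows 1 and 4
```

In this table rows 2 and 3 have nearly identical profiles, so their distance is smaller than the distance between rows 1 and 4.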
Association
Row 8 and Column D fall in approximately the same direction from the origin, and are relatively close to one another.
2.02 Multiple Answer Poll
In correspondence analysis, which of the following are
true? (Choose all answers that apply.)
a. Row points that fall far from each other but in the
same direction away from the origin indicate that they
have similar profiles.
b. Column points that fall close together and in the same
direction away from the origin indicate that they have
similar profiles.
c. Row and column points that fall in the same direction
away from the origin indicate that they have an
association.
2.02 Multiple Answer Poll – Correct Answers
Correct answers: b and c.
Points with similar profiles fall close together and in the same direction away from the origin, so row points that fall far from each other (statement a) do not indicate similar profiles.
Sample Data Set
Variables: AGE, GENDER, MOVIES
MOVIES levels: ACTION, MYSTERY, COMEDY, SPORTS, ROMANCE, SCI-FI, HORROR, DRAMA, FAMILY
Analysis Approaches
You want to perform an analysis that takes into account
the three variables Movie, Age, and Gender. There
are several approaches.
You can
• analyze a two-way table where the rows correspond to the levels of Movie and the columns correspond to combinations of the levels of Age and Gender
• treat Gender as a stratification variable and analyze males and females separately.
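The two approaches above amount to two ways of tabulating the same records. The Python sketch below illustrates both with a handful of made-up respondents; the age groups, gender codes, and function names are hypothetical and are not taken from the course data set.

```python
from collections import Counter

# Hypothetical respondent records: (movie, age group, gender).
records = [
    ("ACTION", "18-24", "M"),
    ("ACTION", "18-24", "M"),
    ("ROMANCE", "18-24", "F"),
    ("COMEDY", "25-34", "F"),
    ("ACTION", "25-34", "M"),
    ("COMEDY", "25-34", "F"),
]

def two_way_table(rows):
    """Count movies (rows) against combined Age x Gender levels (columns)."""
    return Counter((movie, f"{age} {gender}") for movie, age, gender in rows)

def stratify_by_gender(rows):
    """Build one Movie x Age table per gender."""
    tables = {}
    for movie, age, gender in rows:
        tables.setdefault(gender, Counter())[(movie, age)] += 1
    return tables

combined = two_way_table(records)
by_gender = stratify_by_gender(records)
```

The first function yields a single table suitable for one correspondence analysis. The second yields one table per gender, each analyzed separately.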
Correspondence Analysis
This demonstration illustrates the concepts
discussed previously.
Exercise
This exercise reinforces the concepts discussed
previously.
2.03 Quiz
Ice cream brands A through D are tested by a panel, and
rated from 1 through 9 (with 9 as the best score). What
can you conclude from the Correspondence Analysis?