Multivariate Statistics
Discriminant Function Analysis
MANOVA
Discriminant Function Analysis
• You wish to predict group membership
from a set of two or more continuous
variables.
• Example: The IRS wants to classify tax
returns as OK or fraudulent.
• They have data on many predictor variables
from audits conducted in past years.
The Discriminant Function
• Is a weighted linear combination of the
predictors
$D_i = a + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p$
• The weights are selected so that the two
groups differ as much as possible on the
discriminant function.
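A minimal sketch of how such weights can be found for two groups, assuming Fisher's classical solution (weights proportional to the inverse pooled within-group covariance matrix times the difference in group means). The function name, the simulated "OK" and "fraudulent" data, and the omission of the intercept a are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def two_group_weights(X0, X1):
    """Weights b proportional to S_pooled^{-1} (mean_1 - mean_0)."""
    X0 = np.asarray(X0, dtype=float)
    X1 = np.asarray(X1, dtype=float)
    n0, n1 = len(X0), len(X1)
    # Pooled within-group covariance matrix of the predictors.
    S = ((n0 - 1) * np.cov(X0, rowvar=False)
         + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    return np.linalg.solve(S, X1.mean(axis=0) - X0.mean(axis=0))

# Hypothetical data: two predictors, groups differ on the first one.
rng = np.random.default_rng(0)
ok = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2))
fraud = rng.normal(loc=[2.0, 0.0], scale=1.0, size=(50, 2))

b = two_group_weights(ok, fraud)
d_ok, d_fraud = ok @ b, fraud @ b   # discriminant scores (intercept omitted)
```

With these weights the mean score of the second group is always higher than that of the first, because the mean difference on D is a positive-definite quadratic form in the mean difference on the predictors.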
Eigenvalue and Canonical r
• Compute a discriminant score, $D_i$, for each
case.
• Use ANOVA to compare the groups on $D_i$.
• $SS_{\text{Between Groups}} / SS_{\text{Within Groups}}$ = eigenvalue
• canonical $r = \sqrt{SS_{\text{Between Groups}} / SS_{\text{Total}}}$
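The two ratios above can be sketched directly from the discriminant scores. The function name and the tiny data set are hypothetical; the arithmetic is just the one-way ANOVA sums of squares.

```python
from math import sqrt

def eigenvalue_and_canonical_r(groups):
    """groups: one list of discriminant scores per group."""
    all_scores = [d for g in groups for d in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Between-groups SS: group size times squared deviation of group mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups SS: squared deviations of scores from their group mean.
    ss_within = sum((d - sum(g) / len(g)) ** 2 for g in groups for d in g)
    eigenvalue = ss_between / ss_within
    canonical_r = sqrt(ss_between / (ss_between + ss_within))
    return eigenvalue, canonical_r

scores = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # hypothetical D scores, two groups
eig, r = eigenvalue_and_canonical_r(scores)
```

Because the total SS is the sum of the between and within SS, the squared canonical r equals eigenvalue / (1 + eigenvalue).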
Classification
• The analysis includes a classification
function.
• This allows one to predict group
membership for any case on which you
have data on the predictors.
• Those who are predicted to have
submitted fraudulent returns are audited.
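A rough sketch of classification for the two-group case, assuming equal prior probabilities and a common within-group variance, so that a case is simply assigned to the group whose centroid on D is nearer. The weights, intercept, and centroids below are made-up values; statistical packages compute classification functions that can also take prior probabilities into account.

```python
def discriminant_score(x, a, b):
    """D = a + b1*x1 + ... + bp*xp for one case."""
    return a + sum(bj * xj for bj, xj in zip(b, x))

def classify(x, a, b, centroid_0, centroid_1):
    """Return 0 or 1: the group whose centroid on D is closer to the case."""
    d = discriminant_score(x, a, b)
    return 0 if abs(d - centroid_0) <= abs(d - centroid_1) else 1

# Hypothetical values: group 0 = OK return, group 1 = fraudulent return.
a, b = -1.0, [0.5, 0.25]
centroid_ok, centroid_fraud = -0.8, 0.8

audit_flag = classify([4.0, 4.0], a, b, centroid_ok, centroid_fraud)
no_audit_flag = classify([0.0, 0.0], a, b, centroid_ok, centroid_fraud)
```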
Two or More Discriminant
Functions
• You may be able to obtain more than one
discriminant function.
• The maximum number you can obtain is
the smaller of
– The number of predictor variables
– One less than the number of groups
• Each function is orthogonal to the others.
• The first will have the greatest eigenvalue,
the second the next greatest, etc.
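The rule for the maximum number of functions is easy to state in code. The function name is hypothetical; the example values (eight predictors, three verdict groups) are those of the juror study described later.

```python
def max_discriminant_functions(n_predictors, n_groups):
    """Smaller of the number of predictors and one less than the number of groups."""
    return min(n_predictors, n_groups - 1)

n_functions = max_discriminant_functions(n_predictors=8, n_groups=3)
```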
Labeling Discriminant Functions
• You may wish to name these things you
have created or discovered.
• As when naming factors from a factor
analysis, look at the loadings (correlations
between $D_i$ and the predictor variables).
• Look at the standardized discriminant
function coefficients (weights).
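A minimal sketch of computing loadings as plain correlations between each predictor and the discriminant scores. The data and weights are simulated for illustration, and the correlations here are total-sample correlations; some packages report pooled within-group correlations instead.

```python
import numpy as np

def loadings(X, d):
    """Correlation of each predictor column of X with the scores d."""
    X = np.asarray(X, dtype=float)
    d = np.asarray(d, dtype=float)
    return np.array([np.corrcoef(X[:, j], d)[0, 1] for j in range(X.shape[1])])

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))        # hypothetical predictors
b = np.array([1.0, 0.5, 0.0])        # hypothetical discriminant weights
d = X @ b                            # discriminant scores
r = loadings(X, d)
```

A predictor with a zero weight can still show a nonzero loading when it is correlated with the other predictors, which is why loadings and standardized weights are both worth inspecting.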
Predicting Jurors’ Verdict
Selections
• Poulson, Braithwaite, Brondino, and
Wuensch (1997).
• Subjects watch a simulated trial.
• Defendant accused of murder.
• There is no doubt that he did the crime.
• He is pleading insanity.
• What verdict does the juror recommend?
The Verdict Choices
• Guilty
• GBMI (Guilty But Mentally Ill)
• NGRI (Not Guilty By Reason of Insanity)
• The jurors are not allowed to know the
consequences of these different verdicts.
Eight Predictor Variables
• Attitude about crime control
• Attitude about the insanity defense
• Attitude about the death penalty
• Attitude about the prosecuting attorneys
• Attitude about the defense attorneys
• Assessment of the expert testimony
• Assessment of mental status of defendant
• Can the defendant be rehabilitated?
Multicollinearity
• This is a problem that arises when one
predictor can be nearly perfectly predicted
by a weighted combination of the others.
• It creates problems with the analysis.
• One solution is to drop one or more of the
predictors.
• If two predictors are so highly correlated,
what is to be lost by dropping one of
them?
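One common way to look for this problem is to regress each predictor on all of the others and inspect the resulting R² (one minus this value is often called the tolerance). The sketch below is an illustration on simulated data, not the diagnostic used in the study; the function name is hypothetical.

```python
import numpy as np

def r_squared_from_others(X):
    """For each column of X, R^2 from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Design matrix: intercept plus every predictor except column j.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        out[j] = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + x2 + rng.normal(scale=0.01, size=100)   # nearly a weighted sum of x1, x2
r2 = r_squared_from_others(np.column_stack([x1, x2, x3]))
```

Values of R² close to 1 (tolerance close to 0) signal that a predictor is almost redundant given the others.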
But I Do Not Want To Drop Any
• The lead researcher did not want to drop
any of the predictors.
• He considered them all theoretically
important.
• So we did a little magic to evade the
multicollinearity problem.
Principal Components
• We used principal components analysis to
repackage the variance in the predictors
into eight orthogonal components.
• We used those components as predictors
in a discriminant function analysis.
• And then transformed the results back into
the metric of the original predictor
variables.
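A sketch of the repackaging step, assuming a principal components analysis of the correlation matrix of eight simulated predictors. The back-transformation shown (multiplying the component weights by the eigenvector matrix to get weights on the standardized predictors) is one plausible way to return to the original variables and is an assumption; the slides do not say exactly how it was done.

```python
import numpy as np

def pca(X):
    """Return standardized data Z, eigenvectors V (columns), and eigenvalues."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # largest component first
    return Z, eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 8)) @ rng.normal(size=(8, 8))   # correlated predictors
Z, V, variances = pca(X)
scores = Z @ V                                # eight orthogonal component scores

c = rng.normal(size=8)    # hypothetical discriminant weights on the components
b = V @ c                 # equivalent weights on the standardized predictors
```

The component scores are mutually uncorrelated, so they can be used as predictors without the multicollinearity problem, and `scores @ c` gives the same discriminant scores as `Z @ b`.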
The First Discriminant Function
• Separated those selecting NGRI from those
selecting Guilty.
• Those selecting NGRI:
– Believed the defendant mentally ill
– Believed the defense expert testimony more than the
prosecution expert testimony
– Were receptive to the insanity defense
– Opposed the death penalty
– Thought the defendant could be rehabilitated
– Favored lenient treatment over strict crime control.
The Second Discriminant Function
• Separated those selecting GBMI from
those selecting NGRI or Guilty.
• Those selecting GBMI:
– Distrust attorneys (especially prosecution)
– Think rehabilitation likely
– Oppose lenient treatment
– Are not receptive to the insanity defense
– Do not oppose the death penalty.
MANOVA
• This is just a discriminant function analysis (DFA) in reverse.
• You predict a set of continuous variables
from one or more grouping variables.
• Often used in an attempt to control
familywise error when there are multiple
outcome variables.
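• This approach is questionable (a significant
MANOVA does not by itself hold down the Type I
error rate of each follow-up univariate test), but
popular.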
MANOVA First, ANOVA Second
• Suppose you have an A x B factorial
design.
• You have five dependent variables.
• You worry that the Type I boogeyman will
get you if you just do five A x B ANOVAs.
• You do an A x B factorial MANOVA first.
• For any effect that is significant (A, B, A x
B) in MANOVA, you do five ANOVAs.
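The worry behind this strategy can be made concrete with the usual familywise error formula, 1 − (1 − α)^k. The sketch assumes k independent tests with every null hypothesis true; real dependent variables are usually correlated, so treat the result as a rough guide. The function name is hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """P(at least one Type I error) across n independent tests, all nulls true."""
    return 1.0 - (1.0 - alpha) ** n_tests

# Five univariate tests of one effect, one per dependent variable.
fw_five = familywise_error_rate(5)
```

With α = .05, five tests of the same effect already push the chance of at least one false rejection to about .23.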
The Beautiful Criminal
• Wuensch, Chia, Castellow, Chuang, &
Cheng (1993)
• Data collected in Taiwan
• Grouping variables
– Defendant physically attractive or not
– Sex of defendant
– Type of crime: Swindle or burglary
– Defendant American or Chinese
– Sex of juror
Dependent Variables
• One set of two variables
– Length of recommended sentence
– Rated seriousness of the crime
• A second set of 12 variables, ratings of the
defendant on attributes such as
– Physical attractiveness
– Intelligence
– Sociability
Type I Boogeyman
• If we did a five-way ANOVA on one DV
– We would do 31 F tests (2^5 − 1 main effects and interactions)
– And that is just for the omnibus analysis
• If we do that for each of the 14 DVs
– That is 434 F tests
– And the Boogeyman is licking his chops
Results, Sentencing
• Female jurors gave longer sentences, but
only with American defendants
• Attractiveness lowered the sentence for
American burglars
• But increased the sentence for American
swindlers
• Female jurors gave shorter sentences to
female defendants
Results, Ratings
• The following were rated more favorably
– Physically attractive defendants
– American defendants
– Swindlers
Canonical Variates
• For each effect (actually each treatment
df) there is a different set of weights
applied to the outcome variables.
• The weights are those that make the effect
as large as possible.
• The resulting linear combination is called a
canonical variate.
• Again, one canonical variate per treatment
degree of freedom.
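For a single grouping variable, the weights that make the effect as large as possible can be sketched as eigenvectors of W⁻¹B, where B and W are the between-groups and within-groups sums-of-squares-and-cross-products matrices of the outcome variables. This is a simplified one-way illustration on simulated data; in a factorial design each effect has its own hypothesis matrix in place of B. All names are hypothetical.

```python
import numpy as np

def canonical_variate_weights(groups):
    """groups: list of (n_k x p) arrays of outcomes, one per treatment group."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    grand_mean = np.vstack(groups).mean(axis=0)
    p = groups[0].shape[1]
    B = np.zeros((p, p))   # between-groups SSCP
    W = np.zeros((p, p))   # within-groups SSCP
    for g in groups:
        m = g.mean(axis=0)
        diff = (m - grand_mean)[:, None]
        B += len(g) * (diff @ diff.T)
        centered = g - m
        W += centered.T @ centered
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(W, B))
    order = np.argsort(eigvals.real)[::-1]
    k = min(p, len(groups) - 1)          # one variate per treatment df
    return eigvals.real[order][:k], eigvecs.real[:, order][:, :k]

rng = np.random.default_rng(4)
means = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
data = [rng.normal(loc=m, size=(40, 3)) for m in means]   # three groups, three DVs
eigvals, weights = canonical_variate_weights(data)
```

Each column of `weights` defines one canonical variate, and its eigenvalue is the ratio of between-groups to within-groups sums of squares on that variate.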
Labeling the Canonical Variates
• Look at the loadings
• Look at the standardized weights
(standardized discriminant function
coefficients)
Sexual Harassment Trial:
Manipulation Check
• Moore, Wuensch, Hedges, and Castellow (1994)
• Physical attractiveness (PA) of defendant,
manipulated.
• Social desirability (SD) of defendant,
manipulated.
• Sex/gender of mock juror.
• Ratings of the litigants on 19 attributes.
• Experiment 2: manipulated PA and SD of
plaintiff.
Experiment 1: Ratings of
Defendant
• Social Desirability and Physical
Attractiveness manipulations significant.
• $CV_{\text{Social Desirability}}$ loaded most heavily on
sociability, intelligence, warmth, sensitivity,
and kindness.
• $CV_{\text{Physical Attractiveness}}$ loaded well on only the
physical attractiveness ratings.
Experiment 2: Ratings of Plaintiff
• Social Desirability and Physical
Attractiveness manipulations significant.
• $CV_{\text{Social Desirability}}$ loaded most heavily on
intelligence, poise, sensitivity, kindness,
genuineness, warmth, and sociability.
• $CV_{\text{Physical Attractiveness}}$ loaded well on only the
physical attractiveness ratings.