Discriminant-Analysis

MANOVA AND DISCRIMINANT ANALYSIS
Juan Carlos Penagos
Saul Hoyos
HISTORY
•Discriminant Analysis (DA) goes back to around
the 1920s, when the statistician Karl Pearson
proposed what was called the coefficient of racial
likeness, a type of intergroup distance index
•In the 1930s, R.A. Fisher translated the idea of
multivariate intergroup distance into a linear
composite of variables derived for the purpose of
two-group classification
•In 1948, C.R. Rao extended two-group classification to multiple groups
•DA was first applied in the biological and
medical sciences; nowadays it is also used in
the social sciences
Calyampudi Radhakrishna Rao
CLASSIFICATION OF MULTIVARIATE ANALYSIS (HUBERTY ET AL 2006)
•The first three to four decades focused on the prediction of group
membership: Predictive Discriminant Analysis (PDA)
•Fisher's ideas about LDA were later used for the purpose
of interpreting effects revealed via Multivariate Analysis of
Variance (MANOVA)
•This aspect of DA is called Descriptive Discriminant Analysis
(DDA)
•MANOVA and DFA are mathematically identical.
•MANOVA seeks differences in the dependent (response) variables between 2 or
more groups.
•In DA the same variables are used as predictors of group
membership, so the grouping variable may be considered the dependent
variable.
Predictive Discriminant Analysis PDA
How accurately can group membership be predicted?
•Prediction and identification are very common in our society (life expectancy,
economic growth, voting support of a candidate, etc.)
•Academic setting (predicting family planning devices, fish site rainfall, marital
outcome, etc.).
The primary questions addressed in a DDA are:
•How many dimensions characterize the group separation?
•What constructs characterize group separation?
•If the study involves a single set of response variables, the analysis to
be applied would be, for example a cluster analysis or a principal
component analysis. For two sets of response variables, analyses of
interest may be multiple regression (MRA), multiple correlation (MCA)
or canonical correlation (CCA).
MANOVA
•Ecologists often record the abundance of many
species from each sampling or experimental unit,
and physiologists commonly measure more than
one variable.
•With multiple response variables we are
interested in whether there are group
differences on all the response variables
considered simultaneously
•Our hypothesis is about group effects on a
combination of the response variables (we
compare group centroids for 2 or more variables)
ASSUMPTIONS OF MANOVA
•It is important to check normality, homogeneity
of variances, and outliers for each response
variable
•1) MANOVA tests are sensitive to multivariate
outliers
•2) Homogeneity of variances and covariances is
an important assumption
•MANOVA tests are more reliable when sample sizes are
equal. Reducing the dimensionality of the analysis
improves the robustness of all the MANOVA test
statistics.
Wilks' λ is a test statistic used in MANOVA to test whether
there are differences between the means of identified groups
of subjects on a combination of dependent variables.
Wilks' λ can be transformed to a statistic that
has approximately an F distribution. This makes it easier
to calculate the P-value.
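As a rough illustration of the statistic, the sketch below computes Wilks' λ as det(W) / det(T), where W is the pooled within-group sums-of-squares-and-cross-products matrix and T is the total one. The data are made up for the example, the code is restricted to two response variables so that the determinant can be written out by hand, and the F transformation shown is the one that is exact only for two groups (with p and N − p − 1 degrees of freedom). With more groups, statistical software typically uses an approximation instead.

```python
# Made-up illustrative data: two groups, two response variables.
groups = {
    "A": [(1.0, 2.0), (2.0, 3.0), (3.0, 3.5), (2.5, 2.0)],
    "B": [(4.0, 6.0), (5.0, 6.5), (6.0, 8.0), (5.5, 7.0)],
}

def centroid(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

def sscp(rows, center):
    """2x2 sums-of-squares-and-cross-products matrix about `center`."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - center[0], r[1] - center[1])
        for a in range(2):
            for b in range(2):
                m[a][b] += d[a] * d[b]
    return m

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# W: within-group SSCP, pooled over the groups.
W = [[0.0, 0.0], [0.0, 0.0]]
for rows in groups.values():
    s = sscp(rows, centroid(rows))
    for a in range(2):
        for b in range(2):
            W[a][b] += s[a][b]

# T: total SSCP about the grand centroid.
all_rows = [r for rows in groups.values() for r in rows]
T = sscp(all_rows, centroid(all_rows))

wilks = det2(W) / det2(T)

# F transformation that is exact in the two-group case only.
N, p = len(all_rows), 2
F = (1.0 - wilks) / wilks * (N - p - 1) / p
print(round(wilks, 4), round(F, 2))
```

Values of λ near 0 indicate that most of the variation is between groups; values near 1 indicate little group separation.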
Go to the script!!!!
•MANOVA uses the linear combination (z) of
response variables, out of the infinite number of
possible linear combinations, that maximizes
the ratio of between-groups to within-groups
variance of z.
•This linear combination is called the
DISCRIMINANT FUNCTION (DF) for the difference
between groups
$$z_{ik} = b_0 + \sum_{j=1}^{p} c_j y_{ij}$$
$$z_{ik} = b_0 + c_1 y_{i1} + c_2 y_{i2} + \dots + c_j y_{ij} + \dots + c_p y_{ip}$$
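For the special case of two groups, the coefficients that maximize the between-groups to within-groups ratio are proportional to W⁻¹ times the difference between the group centroids (Fisher's two-group solution). The sketch below assumes made-up data with two response variables and writes the 2x2 inverse out by hand. The constant b0 is set so that the grand centroid scores zero, which is a convention chosen for this example rather than something the slides specify.

```python
# Made-up illustrative data: two groups, two response variables.
groups = {
    "A": [(1.0, 2.0), (2.0, 3.0), (3.0, 3.5), (2.5, 2.0)],
    "B": [(4.0, 6.0), (5.0, 6.5), (6.0, 8.0), (5.5, 7.0)],
}
all_rows = [r for rows in groups.values() for r in rows]

def centroid(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

# Pooled within-group sums of squares and cross-products (2x2).
W = [[0.0, 0.0], [0.0, 0.0]]
for rows in groups.values():
    m = centroid(rows)
    for r in rows:
        d = (r[0] - m[0], r[1] - m[1])
        for a in range(2):
            for b in range(2):
                W[a][b] += d[a] * d[b]

# Coefficients c = W^-1 (centroid_B - centroid_A), 2x2 inverse written out.
mA, mB = centroid(groups["A"]), centroid(groups["B"])
diff = (mB[0] - mA[0], mB[1] - mA[1])
det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
c = ((W[1][1] * diff[0] - W[0][1] * diff[1]) / det,
     (W[0][0] * diff[1] - W[1][0] * diff[0]) / det)

# Constant chosen so that the grand centroid scores zero (assumed convention).
grand = centroid(all_rows)
b0 = -(c[0] * grand[0] + c[1] * grand[1])

def score(y):
    """Discriminant score z = b0 + c1*y1 + c2*y2 for one observation."""
    return b0 + c[0] * y[0] + c[1] * y[1]

def variance_ratio(w):
    """Between-groups / within-groups sum of squares of w1*y1 + w2*y2."""
    between = sum(
        len(rows) * (w[0] * (centroid(rows)[0] - grand[0])
                     + w[1] * (centroid(rows)[1] - grand[1])) ** 2
        for rows in groups.values()
    )
    within = sum(w[a] * W[a][b] * w[b] for a in range(2) for b in range(2))
    return between / within

for name, rows in groups.items():
    print(name, [round(score(y), 2) for y in rows])
print(round(variance_ratio(c), 3), round(variance_ratio((1.0, 0.0)), 3))
```

Comparing `variance_ratio(c)` with the ratio for any other weights, such as using the first variable alone, shows that no other linear combination separates the groups better on this criterion.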
•Coefficients of the linear combination
•If the different variables are measured on
comparable scales, then the relative sizes of these
coefficients (weights) provide a comparable measure
of the contribution of each variable to the variance
explained by the DF, and thus to the differences
between groups.
•If variables are measured on very different scales,
we need to standardize them so the coefficients can be
compared
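One common convention for standardizing, assumed here, is to multiply each raw coefficient by the pooled within-group standard deviation of its variable. In the sketch below the data are made up, the second variable is deliberately recorded in units 1000 times smaller than the first, and the raw coefficients in `raw_coefs` are hypothetical values chosen for illustration.

```python
import math

# Made-up data: the second variable is on a much larger numeric scale.
groups = {
    "A": [(1.0, 2000.0), (2.0, 3000.0), (3.0, 3500.0), (2.5, 2000.0)],
    "B": [(4.0, 6000.0), (5.0, 6500.0), (6.0, 8000.0), (5.5, 7000.0)],
}

# Hypothetical raw discriminant function coefficients, one per variable.
raw_coefs = (-0.34, 0.00138)

def pooled_within_sd(groups, j):
    """Pooled within-group standard deviation of variable j."""
    ss = 0.0
    n_total = 0
    for rows in groups.values():
        vals = [r[j] for r in rows]
        m = sum(vals) / len(vals)
        ss += sum((v - m) ** 2 for v in vals)
        n_total += len(vals)
    # Degrees of freedom: total observations minus number of groups.
    return math.sqrt(ss / (n_total - len(groups)))

pooled_sd = [pooled_within_sd(groups, j) for j in range(2)]
std_coefs = [raw_coefs[j] * pooled_sd[j] for j in range(2)]

print([round(s, 4) for s in pooled_sd])
print([round(c, 4) for c in std_coefs])
```

The raw coefficient of the second variable looks negligible only because of its units; after standardizing, it is the larger of the two in absolute value.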
LOADINGS
•Loadings are the correlations between each variable and the DF.
They represent the correlation between the value of a
variable and the score for the DF, with the units as
replicates.
•Loadings measure the correlation between each
variable and each DF, although they ignore any
correlation between variables.
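A minimal sketch of the loading calculation, under the definition given above (plain correlation between a variable and the DF scores across all units). The data are made up and the coefficients in `coefs` are hypothetical.

```python
import math

# Made-up data: eight units, two response variables.
rows = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.5), (2.5, 2.0),
        (4.0, 6.0), (5.0, 6.5), (6.0, 8.0), (5.5, 7.0)]

# Hypothetical discriminant function coefficients (constant omitted,
# since adding a constant does not change a correlation).
coefs = (-0.34, 1.38)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# DF score for every unit, then one loading per variable.
scores = [coefs[0] * r[0] + coefs[1] * r[1] for r in rows]
loadings = [pearson([r[j] for r in rows], scores) for j in range(2)]

print([round(l, 3) for l in loadings])
```

In this example the first variable has a negative coefficient but a positive loading, because the two variables are strongly correlated. This is the sense in which loadings and coefficients can tell different stories.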
DISCRIMINANT FUNCTION ANALYSIS DFA
•DFA is used when we have observations from
predetermined groups with 2 or more response variables
recorded for each observation.
•DFA generates a linear combination of variables that
maximizes the probability of correctly assigning
observations to their predetermined groups, and it can also
be used to classify new observations into one of the
groups
•We might also wish to have some measure of the
likelihood of success of our classification
Jackknife
The basic idea behind the jackknife estimator lies
in systematically recomputing the statistic
estimate leaving out one observation at a time
from the sample set.
From this new set of "observations" for the
statistic an estimate for the bias can be
calculated, as well as an estimate for the variance
of the statistic.
(Resampling with replacement instead gives the bootstrap.)
•Jackknife classification results in a lower percentage of
successful predictions, but that percentage may be a more
reliable indicator of classification success because each
observation is excluded when calculating the coefficients
used to classify it.
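The general leave-one-out recipe described above can be sketched as follows. The function name `jackknife` and the sample values are made up for the example; the bias and variance formulas are the standard jackknife ones.

```python
def jackknife(sample, statistic):
    """Return (full-sample estimate, jackknife bias, jackknife variance)."""
    n = len(sample)
    full = statistic(sample)
    # Recompute the statistic n times, leaving out one observation each time.
    loo = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    bias = (n - 1) * (loo_mean - full)
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return full, bias, variance

def mean(xs):
    return sum(xs) / len(xs)

estimate, bias, variance = jackknife([2.0, 4.0, 6.0, 8.0], mean)
print(estimate, round(bias, 6), round(variance, 6))
```

For the sample mean the jackknife bias is zero and the jackknife variance equals the usual s²/n, which makes the mean a convenient check before applying the same function to a less tractable statistic.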
•DESCRIPTION AND HYPOTHESIS TESTING
•DFA is mathematically identical to a single-factor
MANOVA, although DFA emphasizes
classification and prediction rather than tests of
hypotheses about group differences
DISCRIMINANT FUNCTION ANALYSIS DFA (H0 TESTING)
•First step: derive discriminant functions that are linear
combinations of the original variables
•The 1st DF is the linear combination of variables that
maximizes the ratio of between-groups to within-groups
variance.
•The 2nd DF is independent of the 1st and best separates
groups using the variation remaining after the 1st DF has
been determined, and so on for the 3rd, 4th, etc. DFs
Heavy metals in marine sediments, Haynes et al. (1995)
Differences between sites in trace metal concentrations.
At each site, they recorded the concentrations of copper,
lead, nickel, manganese
We used only the 1991 data in our analyses. There were strong
correlations among some of the metals (e.g. Cu and Cr, Fe and Ni),
so only four variables (Cu, Ni, Pb, Mn) were included in the
analysis.
Plant functional groups and leaf characters, Reich et al. (1999)
Leaf traits from different species across a range of ecosystems and
geographic regions.
Two locations and two functional groups.
Five response variables were measured: specific leaf area
(log10-transformed), leaf nitrogen concentration, mass-based net
photosynthetic capacity, area-based net photosynthetic capacity
and leaf diffusive conductance at photosynthetic capacity.
$$z_{ik} = \text{constant} + c_1(\log_{10}\text{specific leaf area})_i + c_2(\text{leaf N})_i + c_3(\text{mass-based photosynthetic capacity})_i + c_4(\text{area-based photosynthetic capacity})_i + c_5(\text{leaf diffusive conductance})_i$$
•Determining which variable contributes the most
to a DF (group separation) is done in the same way
as for MANOVA
•The relative sizes of the standardized coefficients for
each DF indicate which variables are more
important to each DF.
•The test of the hypothesis of no difference
between group centroids (MANOVA) is usually the
first step in a DFA, because if it is not significant,
the DFA will not be very useful for separating
groups and therefore for classifying observations.
•We can calculate the DF scores (Zik) for each observation
on each function (k) by simply solving each DF. These
scores can be used in a Linear Discriminant Function (LDF)
plot with the first DF scores on one axis and the 2nd DF
scores on the other axis. Either individual observations or
centroids can be plotted. These plots indicate subjectively
how similar or different groups are in terms of the DFs
•CLASSIFICATION AND PREDICTION
•The second purpose of a DFA is to classify each
observation into one of the groups and assess the success
of the classification. A classification equation is derived
for each group and is a linear combination of variables
like a DF, including a constant.
•The success of the classification of observations will be
greater if the groups are clearly distinguishable on the
first discriminant function.
•One difficulty is that the classification functions are
calculated using all observations, and these functions are
then used to classify the same observations. One way of
avoiding the resulting inherent bias is to use a jackknife
procedure.
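The difference between the two ways of scoring a classification can be sketched as follows. As a stand-in for the classification equations, this example assigns each observation to the group whose centroid is nearest in Mahalanobis distance based on the pooled within-group matrix, which corresponds to linear classification with equal prior probabilities (an assumption of this sketch). The data are made up and limited to two variables.

```python
# Made-up labelled data: (group, (y1, y2)).
data = [("A", (1.0, 2.0)), ("A", (2.0, 3.0)), ("A", (3.0, 3.5)), ("A", (2.5, 2.0)),
        ("B", (4.0, 6.0)), ("B", (5.0, 6.5)), ("B", (6.0, 8.0)), ("B", (5.5, 7.0))]

def fit(train):
    """Return group centroids and the inverse pooled within-group SSCP matrix."""
    cents = {}
    W = [[0.0, 0.0], [0.0, 0.0]]
    for g in sorted({lab for lab, _ in train}):
        rows = [y for lab, y in train if lab == g]
        n = len(rows)
        m = (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)
        cents[g] = m
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for a in range(2):
                for b in range(2):
                    W[a][b] += d[a] * d[b]
    det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
    # Using W rather than the covariance matrix only rescales all distances,
    # so the nearest centroid is unchanged.
    W_inv = [[W[1][1] / det, -W[0][1] / det], [-W[1][0] / det, W[0][0] / det]]
    return cents, W_inv

def classify(model, y):
    """Assign y to the group with the nearest centroid (Mahalanobis distance)."""
    cents, W_inv = model
    def dist2(m):
        d = (y[0] - m[0], y[1] - m[1])
        return sum(d[a] * W_inv[a][b] * d[b] for a in range(2) for b in range(2))
    return min(cents, key=lambda g: dist2(cents[g]))

# Standard (resubstitution): fit on all observations, classify the same ones.
full_model = fit(data)
resub_acc = sum(classify(full_model, y) == g for g, y in data) / len(data)

# Jackknife: refit without observation i before classifying observation i.
jack_hits = 0
for i, (g, y) in enumerate(data):
    model_i = fit(data[:i] + data[i + 1:])
    jack_hits += classify(model_i, y) == g
jack_acc = jack_hits / len(data)

print(resub_acc, jack_acc)
```

With groups as well separated as in this toy data the two success rates may coincide; with overlapping groups the jackknife rate is typically the lower and less optimistic one.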
ASSUMPTIONS OF DFA
•DFA has the same assumptions as MANOVA (the most
important is homogeneity of the within-group
variance-covariance matrices)
•If there is clear heterogeneity across the within-group
variance-covariance matrices, you can try fitting
quadratic functions instead of the usual linear ones.
Quadratic functions include coefficients for squares of
the variables and do not assume equal within-group
covariances
•MORE COMPLEX DESIGNS
•Because DFA is identical to a MANOVA, DFA can be
extended to more complex designs, such as factorial
designs
MANOVA vs. DFA
•MANOVA and DFA are mathematically identical,
although the terminology used in the two procedures
often differs
•In MANOVA we test whether population centroids,
based on a number of response variables, are different
between groups
•In DFA we use the response variables to try to predict
group membership and also to classify new observations into
one or other of the groups, with some measure of the success
of that classification
•The two are functionally equivalent: MANOVA emphasizes
between-groups differences on a single discriminant
function, while DFA uses more than one discriminant
function and focuses on classification
Discriminant Analysis (DA) vs Principal Component
Analysis (PCA)
•Principal Component Analysis (PCA) and DA are both
eigenanalysis techniques, but they have
fundamentally different goals.
•DA maximizes the separation of prior groups using
linear combinations of variables, while PCA represents
as much of the variation among the individual points as
possible in a low-dimensional space, also using
linear combinations of variables.
•Prior groups have no role in PCA but can be
superimposed on the ordination space
HINTS FOR ANALYSIS
•Homogeneity of within-group variances and covariances is
important.
•Keep sample sizes similar and at least ensure homogeneity of
variances for each variable separately.
•Check for outliers, for example by testing squared Mahalanobis
distances against a χ² distribution with p degrees of freedom and a
strict significance level (0.001).
•Pillai's trace is the most robust of the test statistics for MANOVA
and is recommended (divide each eigenvalue by 1 + that
eigenvalue, then sum these ratios)
•The contribution of each variable to a DF is best measured by
the standardized coefficients
•Jackknife classifications of each observation to each group are
probably more reliable than standard classifications, because the
former do not include the observation being classified when
calculating the classification score
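The Pillai's trace recipe in the hints above can be written out directly. In this sketch the eigenvalues are hypothetical (in practice they are the eigenvalues of W⁻¹B, the within-group inverse times the between-groups matrix), and Wilks' λ is computed from the same eigenvalues for comparison.

```python
def pillai_trace(eigenvalues):
    """Sum of eigenvalue / (1 + eigenvalue) over all discriminant functions."""
    return sum(lam / (1.0 + lam) for lam in eigenvalues)

def wilks_lambda(eigenvalues):
    """Product of 1 / (1 + eigenvalue) over all discriminant functions."""
    out = 1.0
    for lam in eigenvalues:
        out *= 1.0 / (1.0 + lam)
    return out

# Hypothetical eigenvalues, one per discriminant function.
eigs = [1.0, 0.25]
print(round(pillai_trace(eigs), 4), round(wilks_lambda(eigs), 4))
```

Larger eigenvalues mean stronger group separation, so they push Pillai's trace up and Wilks' λ down.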