MANOVA AND DISCRIMINANT ANALYSIS
Juan Carlos Penagos
Saul Hoyos

HISTORY
•Discriminant Analysis (DA) goes back to around the 1920s, when the statistician Karl Pearson proposed what was called the coefficient of racial likeness, a type of intergroup distance index.
•In the 1930s, R. A. Fisher translated the idea of multivariate intergroup distance into a linear composite of variables derived for the purpose of two-group classification.
•In 1948, C. R. Rao (Calyampudi Radhakrishna Rao) extended two-group classification to multiple groups.
•The initial study of DA started with applications in the biological and medical sciences; nowadays it is also used in the social sciences.

CLASSIFICATION OF MULTIVARIATE ANALYSIS (HUBERTY ET AL. 2006)
•The first three to four decades focused on the prediction of group membership: Predictive Discriminant Analysis (PDA).
•Fisher's ideas about linear discriminant analysis (LDA) were later used for the purpose of interpreting effects revealed via Multivariate Analysis of Variance (MANOVA).
•This aspect of DA is called Descriptive Discriminant Analysis (DDA).
•MANOVA and discriminant function analysis (DFA) are mathematically identical.
•MANOVA seeks differences in the response (dependent) variables between two or more groups.
•In DA, the same variables are treated as independent variables and used as predictors of group membership; the groups may then be considered the dependent variable.

Predictive Discriminant Analysis (PDA)
How accurately can group membership be predicted?
•Prediction and identification are very common in our society (life expectancy, economic growth, voting support for a candidate, etc.).
•Academic settings (predicting family planning devices, fish site rainfall, marital outcome, etc.).
The primary questions addressed in a DDA are:
•How many dimensions characterize the group separation?
•What constructs characterize group separation?
•If the study involves a single set of response variables, the analysis to be applied would be, for example, a cluster analysis or a principal component analysis. For two sets of response variables, analyses of interest may be multiple regression (MRA), multiple correlation (MCA) or canonical correlation (CCA).

MANOVA
•Ecologists often record the abundance of many species from each sampling or experimental unit, and physiologists commonly measure more than one variable.
•With multiple response variables, we are interested in whether there are group differences on all the response variables considered simultaneously.
•Our hypothesis is about group effects on a combination of the response variables (we compare group centroids for 2 or more variables).

ASSUMPTIONS OF MANOVA
•It is important to check normality, homogeneity of variances, and outliers for each response variable.
•1) MANOVA tests are sensitive to multivariate outliers.
•2) Homogeneity of variances and covariances is an important assumption.
•MANOVA tests are more reliable when sample sizes are equal, and reducing the dimensionality of the analysis improves the robustness of all the MANOVA test statistics.

Wilks' λ is a statistic used in MANOVA to test whether there are differences between the means of identified groups of subjects on a combination of dependent variables. Wilks' λ can be transformed to a statistic that has approximately an F distribution, which makes it easier to calculate the P-value.

Go to the script!

•MANOVA uses the linear combination (z) of response variables, out of the infinite number of possible linear combinations, that maximizes the ratio of between-groups to within-groups variance of z.
•This linear combination is called the DISCRIMINANT FUNCTION (DF) for the difference between groups, hence discriminant function analysis (DFA):

$$z_{ik} = b_0 + \sum_{j=1}^{p} c_j y_{ij} = b_0 + c_1 y_{i1} + c_2 y_{i2} + \dots + c_j y_{ij} + \dots + c_p y_{ip}$$

•Coefficients of the linear combination ($c_j$):
•If the different variables are measured on comparable scales, then the relative sizes of these coefficients (weights) provide a comparable measure of the contribution of each variable to the variance explained by the DF, and thus to the differences between groups.
•If the variables are measured on very different scales, we need to standardize them so that the coefficients can be compared.

LOADINGS
•Loadings are the correlations between each variable and the DF. Each represents the correlation between the values of a variable and the scores for the DF, with the units as replicates.
•Loadings measure the correlation between each variable and each DF, although they ignore any correlation between variables.

DISCRIMINANT FUNCTION ANALYSIS (DFA)
•DFA is used when we have observations from predetermined groups with 2 or more response variables recorded for each observation.
•DFA generates a linear combination of variables that maximizes the probability of correctly assigning observations to their predetermined groups, and it can also be used to classify new observations into one of the groups.
•We might also wish to have some measure of the likelihood of success of our classification.

Jackknife
The basic idea behind the jackknife estimator lies in systematically recomputing the statistic, leaving out one observation at a time from the sample. From this new set of "observations" of the statistic, an estimate of the bias can be calculated, as well as an estimate of the variance of the statistic. (The bootstrap, by contrast, resamples with replacement.)
•A jackknifed model results in a lower percentage of successful predictions, but that percentage may be a more reliable indicator of classification success because each observation is excluded when the coefficients used to classify it are calculated.
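The leave-one-out idea described above can be sketched in a few lines. This is a minimal illustration, not the script used in the presentation: the function name `jackknife`, the helper `biased_variance`, and the data values are all made up for the example, and the bias and variance formulas are the standard jackknife ones.

```python
from statistics import mean, variance


def jackknife(sample, statistic):
    """Return jackknife (bias, variance) estimates for `statistic` on `sample`."""
    n = len(sample)
    full = statistic(sample)
    # Recompute the statistic n times, leaving out one observation each time.
    leave_one_out = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    loo_mean = mean(leave_one_out)
    bias = (n - 1) * (loo_mean - full)
    var = (n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
    return bias, var


def biased_variance(xs):
    """Variance with divisor n (a deliberately biased estimator)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]  # hypothetical measurements

# For the sample mean, the jackknife bias is zero and the jackknife
# variance equals the usual s^2 / n.
bias_mean, var_mean = jackknife(data, mean)

# For the divisor-n variance, subtracting the jackknife bias estimate
# recovers the usual unbiased (divisor n - 1) sample variance.
bias_var, _ = jackknife(data, biased_variance)
corrected = biased_variance(data) - bias_var

print(bias_mean, var_mean, variance(data) / len(data))
print(corrected, variance(data))
```

In a DFA setting the same loop is applied to classification: each observation is left out, the classification functions are refitted on the rest, and the held-out observation is then classified.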
•DESCRIPTION AND HYPOTHESIS TESTING
•DFA is mathematically identical to a single-factor MANOVA, although the former emphasizes classification and prediction rather than tests of hypotheses about group differences.

DISCRIMINANT FUNCTION ANALYSIS DFA (H0 TESTING)
•The first step is to derive discriminant functions that are linear combinations of the original variables.
•The 1st DF is the linear combination of variables that maximizes the ratio of between-groups to within-groups variance.
•The 2nd DF is independent of the 1st and best separates groups using the variation remaining after the 1st DF has been determined, and so on for the 3rd, 4th, etc. DF.

Heavy metals in marine sediments
Haynes et al. (1995) examined differences between sites in trace metal concentrations. At each site, they recorded the concentrations of copper, lead, nickel, manganese. We used only the 1991 data in our analyses. There were strong correlations among some of the metals (e.g. Cu and Cr, Fe and Ni), so only four variables (Cu, Ni, Pb, Mn) were included in the analysis.

Plant functional groups and leaf characters
Reich et al. (1999) examined leaf traits from different species across a range of ecosystems and geographic regions. The example uses two locations and two functional groups. Five response variables were measured: specific leaf area (log10-transformed), leaf nitrogen concentration, mass-based net photosynthetic capacity, area-based net photosynthetic capacity, and leaf diffusive conductance at photosynthetic capacity.

$$z_{ik} = \text{constant} + c_1(\log_{10}\text{specific leaf area})_i + c_2(\text{leaf N})_i + c_3(\text{mass-based photosynthetic capacity})_i + c_4(\text{area-based photosynthetic capacity})_i + c_5(\text{leaf diffusive conductance})_i$$

•Determining which variable contributes the most to a DF (group separation) is done in the same way as for MANOVA.
•The relative sizes of the standardized coefficients for each DF indicate which variables are most important to that DF.
•The test of the hypothesis of no difference between group centroids (MANOVA) is usually the first step in a DFA, because if it is not significant, the DFA will not be very useful for separating groups and therefore for classifying observations.
•We can calculate the DF scores ($z_{ik}$) for each observation on each function (k) by simply solving each DF. These scores can be used in a Linear Discriminant Function (LDF) plot, with the first DF scores on one axis and the 2nd DF scores on the other axis. Either individual observations or centroids can be plotted. These plots indicate subjectively how similar or different the groups are in terms of the DFs.

•CLASSIFICATION AND PREDICTION
•The second purpose of a DFA is to classify each observation into one of the groups and assess the success of the classification. A classification equation is derived for each group; like a DF, it is a linear combination of the variables, including a constant.
•The success of the classification of observations will be greater if the groups are clearly distinguishable on the first discriminant function.
•One difficulty is that the classification functions are calculated using all observations, and these functions are then used to classify the same observations. One way of avoiding the resulting inherent bias is to use a jackknife procedure.

ASSUMPTIONS OF DFA
•DFA has the same assumptions as MANOVA (the most important is homogeneity of the within-group variance-covariance matrices).
•If there is clear heterogeneity across the within-group variance-covariance matrices, you can try fitting quadratic functions instead of the usual linear ones. Quadratic functions include coefficients for squares of the variables and do not assume equal within-group covariances.

•MORE COMPLEX DESIGNS
•Because DFA is identical to a MANOVA, DFA can be extended to more complex designs, such as factorial designs.

MANOVA vs. DFA
•MANOVA and DFA are mathematically identical, although the terminology used in the two procedures often differs.
•In MANOVA we test whether population centroids, based on a number of response variables, differ between groups.
•In DFA we use the response variables to try to predict group membership, and also to classify new observations into one or other of the groups with some measure of the success of that classification.
•The two are functionally equivalent, the former emphasizing between-groups differences on a single discriminant function, the latter using more than one discriminant function and focusing on classification.

Discriminant Analysis (DA) vs Principal Component Analysis (PCA)
•Principal Component Analysis (PCA) and DA are both eigenanalysis techniques, but they have fundamentally different goals.
•DA maximizes the separation of prior groups using linear combinations of variables, while PCA represents as much of the variation among the individual points as possible in a low-dimensional space, also using linear combinations of variables.
•Prior groups have no role in PCA but can be superimposed on the ordination space.

HINTS FOR ANALYSIS
•Homogeneity of variances and covariances between groups is important.
•Keep sample sizes similar and at least ensure homogeneity of variances for each variable separately.
•Check for multivariate outliers, typically by testing squared Mahalanobis distances against a χ² distribution with p degrees of freedom and a strict significance level (0.001).
•Pillai's trace is the most robust of the test statistics for MANOVA and is recommended (divide each eigenvalue by 1 plus that eigenvalue, then sum these ratios).
•The contribution of each variable to a DF is best measured by the standardized coefficients.
•Jackknife classifications of each observation to each group are probably more reliable than standard classifications, because the former do not include the observation being classified when calculating the classification score.
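
The Pillai's trace recipe in the hints above, together with Wilks' λ from the MANOVA slides, can be written out as a small sketch. It assumes the eigenvalues come from the product of the inverse within-groups matrix and the between-groups matrix of sums of squares and cross-products (notation not used in the slides). The function names and the eigenvalues themselves are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import prod


def pillai_trace(eigenvalues):
    """Sum of eigenvalue / (1 + eigenvalue) over all discriminant functions."""
    return sum(ev / (1.0 + ev) for ev in eigenvalues)


def wilks_lambda(eigenvalues):
    """Product of 1 / (1 + eigenvalue) over all discriminant functions."""
    return prod(1.0 / (1.0 + ev) for ev in eigenvalues)


# Hypothetical eigenvalues for two discriminant functions (not real data).
eigenvalues = [1.5, 0.25]

pillai = pillai_trace(eigenvalues)  # 1.5/2.5 + 0.25/1.25
wilks = wilks_lambda(eigenvalues)   # (1/2.5) * (1/1.25)

print(f"Pillai's trace: {pillai:.3f}")
print(f"Wilks' lambda:  {wilks:.3f}")
```

Larger eigenvalues (stronger group separation) push Pillai's trace up and Wilks' λ down toward zero. Converting either statistic to an approximate F and a P-value is left to the statistical software.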