
Multivariate Analysis: An Overview
CHAPTER LEARNING OBJECTIVES
After reading this chapter, students should understand…
1. How to classify and select multivariate techniques.
2. That multiple regression predicts a metric dependent variable from a set of metric independent variables.
3. That discriminant analysis classifies people or objects into categorical groups using several metric predictors.
4. How multivariate analysis of variance assesses the relationship between two or more metric dependent variables and independent classificatory variables.
5. How structural equation modeling explains causality among constructs that cannot be directly measured.
6. How conjoint analysis measures complex decision making that requires multiattribute judgments.
7. How principal components analysis extracts uncorrelated factors from an initial set of variables and (exploratory) factor analysis reduces the number of variables to discover the underlying constructs.
8. The use of cluster analysis techniques for grouping similar objects or people.
9. How perceptions of products or services are revealed numerically and geometrically by multidimensional scaling.
CHAPTER LECTURE NOTES
INTRODUCTION

In recent years, multivariate statistical tools have been applied with increasing
frequency to research problems.

Multivariate analysis—statistical techniques that focus upon and bring out in
bold relief the structure of simultaneous relationships among three or more
phenomena.
SELECTING A MULTIVARIATE TECHNIQUE

Multivariate techniques may be classified as dependency and interdependency techniques.

Dependency techniques—those techniques where criterion or dependent variables and predictor or independent variables are present.
– Examples include multiple regression, multivariate analysis of variance (MANOVA), and discriminant analysis.

Interdependency techniques—techniques where criterion or dependent variables and predictor or independent variables are not present.
– Examples include factor analysis, cluster analysis, and multidimensional scaling.
The measurement level of the variables must also be checked:

Metric measures—interval and ratio measures (used by metric statistical techniques).

Nonmetric measures—ordinal and nominal measures (used by nonmetric techniques).
DEPENDENCY TECHNIQUES
Multiple Regression

Multiple regression—statistical tool used to develop a self-weighting estimating
equation that predicts values for a dependent variable from the values of
independent variables.

Multiple regression is used as a descriptive tool in three types of situations:


It is often used to develop a self-weighting estimating equation by which to
predict values for a criterion variable (DV) from the values for several
predictor variables (IVs).

A descriptive application of multiple regression calls for controlling for
confounding variables to better evaluate the contribution of other variables.

Multiple regression can also be used to test and explain causal theories. This
approach is referred to as path analysis (describing, through regression,
an entire structure of linkages advanced by a causal theory).
Multiple regression is also used as an inference tool to test hypotheses and to
estimate population values.
Method

Multiple regression is an extension of the bivariate linear regression discussed in
Chapter 18.

Dummy variables—nominal variables converted (typically coded 0 or 1) for use in multivariate statistics.

Regression coefficients are stated either in raw score units (the actual X values) or
as standardized coefficients (regression coefficients in standardized form [mean = 0,
standard deviation = 1] used to determine the comparative impact of variables that
come from different scales).

When regression coefficients are standardized, they are called beta weights (β)
(standardized regression coefficients where the size of the number reflects the
level of influence X exerts on Y), and their values indicate the relative importance
of the associated X values, particularly when the predictors are unrelated.
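For instructors who want a hands-on illustration of the points above, here is a minimal Python sketch (not from the text) showing a nominal predictor converted to a dummy variable, raw-score coefficients, and beta weights obtained by standardizing the variables; the variable names (price, ad_spend, region, sales) and data are hypothetical.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "price": rng.normal(10, 2, 200),
        "ad_spend": rng.normal(50, 10, 200),
        "region": rng.choice(["east", "west"], 200),   # nominal variable
    })
    df["sales"] = (100 - 3 * df["price"] + 0.8 * df["ad_spend"]
                   + 5 * (df["region"] == "west") + rng.normal(0, 5, 200))

    # Convert the nominal variable into a 0/1 dummy variable.
    X = pd.get_dummies(df[["price", "ad_spend", "region"]], drop_first=True, dtype=float)
    y = df["sales"]

    # Raw-score (unstandardized) regression coefficients.
    raw_model = sm.OLS(y, sm.add_constant(X)).fit()
    print(raw_model.params)

    # Beta weights: refit after standardizing Y and the Xs (mean 0, sd 1).
    Xz = (X - X.mean()) / X.std()
    yz = (y - y.mean()) / y.std()
    print(sm.OLS(yz, Xz).fit().params)   # standardized coefficients (beta weights)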
Example

Most statistical packages provide various methods for selecting variables for the
equation.

Forward selection—sequentially adds the variable to a regression model that
results in the largest R2 increase.

Backward elimination—sequentially removes the variable from a regression
model that changes R2 the least.

Stepwise selection—a method for sequentially adding or removing variables
from a regression model to optimize R2 .
– Collinearity—when two independent variables are highly correlated.
– Multicollinearity—when more than two independent variables are highly correlated.
– A solution to the above problem can be the holdout sample (the portion of the sample excluded for later validity testing when the estimating equation is first computed).
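A short Python sketch of the forward selection idea, assuming the predictor data frame X and response y from the regression sketch above; at each step it adds the candidate variable that yields the largest R-squared increase.

    from sklearn.linear_model import LinearRegression

    def forward_selection(X, y, n_keep):
        """Sequentially add the predictor that gives the largest R^2 increase."""
        selected, remaining = [], list(X.columns)
        while remaining and len(selected) < n_keep:
            scores = {}
            for var in remaining:
                cols = selected + [var]
                model = LinearRegression().fit(X[cols], y)
                scores[var] = model.score(X[cols], y)   # R^2 with this variable added
            best = max(scores, key=scores.get)
            selected.append(best)
            remaining.remove(best)
            print(f"added {best}: R^2 = {scores[best]:.3f}")
        return selected

    # Hypothetical usage: forward_selection(X, y, n_keep=2)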
Discriminant Analysis

Discriminant analysis is frequently used in market segmentation research.
Method

Discriminant analysis is a technique using two or more independent interval or
ratio variables to classify the observations in the categories of a nominal
dependent variable.


Once the discriminant equation is found, it can be used to predict the
classification of a new observation.
The most common use for discriminant analysis is to classify persons or objects
into various groups; it can also be used to analyze known groups to determine the
relative influence of specific factors for deciding into which group various cases
fall.
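As a classroom illustration, a minimal Python sketch of discriminant analysis with scikit-learn; the two metric predictors, the buyer/non-buyer groups, and the data are hypothetical.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(1)
    # Two metric predictors (e.g., income in $000s, monthly store visits) per group.
    buyers = rng.normal([60, 8], [10, 2], size=(100, 2))
    non_buyers = rng.normal([45, 4], [10, 2], size=(100, 2))
    X = np.vstack([buyers, non_buyers])
    y = np.array(["buyer"] * 100 + ["non-buyer"] * 100)

    lda = LinearDiscriminantAnalysis().fit(X, y)
    print(lda.coef_)               # discriminant function weights
    print(lda.predict([[55, 7]]))  # classify a new observation
    print(lda.score(X, y))         # proportion of cases correctly classified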
MANOVA

Multivariate analysis of variance (MANOVA) assesses the relationship
between two or more dependent variables and classificatory variables or factors.
Method

MANOVA:

Is similar to univariate ANOVA, with the added ability to handle several
dependent variables.

Uses special matrices to test for differences among groups.

Examines similarities and differences among the multivariate mean scores of
several populations.

Centroids—term used for the multivariate mean score in MANOVA.

Before using MANOVA to test for significant differences, you must first
determine that the assumptions for its use are met.
Example

When MANOVA is applied properly, the dependent variables are correlated.

If the dependent variables are unrelated, there would be no need for a multivariate
test.
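A minimal Python sketch of MANOVA using statsmodels, under the assumption that two correlated metric dependent variables (ad recall and purchase intent, both hypothetical) are compared across three treatment groups.

    import numpy as np
    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    rng = np.random.default_rng(2)
    group = np.repeat(["ad_A", "ad_B", "ad_C"], 50)
    recall = rng.normal(np.repeat([5.0, 5.5, 6.2], 50), 1.0)
    intent = 0.6 * recall + rng.normal(0, 0.8, 150)   # correlated second DV
    df = pd.DataFrame({"group": group, "recall": recall, "intent": intent})

    # Both dependent variables appear on the left-hand side of the formula.
    fit = MANOVA.from_formula("recall + intent ~ group", data=df)
    print(fit.mv_test())   # Wilks' lambda, Pillai's trace, and related tests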
Structural Equation Modeling

Since the 1980s, marketing researchers have relied increasingly on structural
equation modeling to test hypotheses about the dimensionality of, and
relationships among, latent and observed variables.

Structural equation modeling (SEM) uses analysis of covariance structures to
explain causality among constructs.

SEM is most commonly called LISREL (linear structural relations) modeling.

The major advantages of SEM are:

Multiple and interrelated dependence relationships can be estimated
simultaneously.

It can represent unobserved concepts, or latent variables, in these relationships
and account for measurement error in the estimation process.
Method

Researchers using SEM must follow five basic steps:

Model specification.
– A formal statement of the model's parameters.
– Specification error—an overestimation of the importance of the variables included in a structural model.

Estimation.
– Often uses an iterative method such as maximum likelihood estimation (MLE).

Evaluation of fit.
– Goodness-of-fit tests are used to determine if the model should be accepted or rejected.

Respecification of the model.

Interpretation and communication.
– SEM hypotheses and results are most commonly presented in the form of path diagrams (presenting predictive and associative relationships among constructs and indicators in a structural model).
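SEM is usually estimated with dedicated software (LISREL, AMOS, or lavaan in R). As a rough, non-authoritative sketch, the third-party Python package semopy accepts a lavaan-style model description; the constructs (satisfaction, loyalty), their indicators, and the simulated data below are all hypothetical.

    import numpy as np
    import pandas as pd
    from semopy import Model

    rng = np.random.default_rng(5)
    n = 300
    satisfaction = rng.normal(size=n)                       # latent construct
    loyalty = 0.7 * satisfaction + rng.normal(0, 0.7, n)    # structural path

    # Observed indicators = construct plus measurement error.
    df = pd.DataFrame({
        "sat1": satisfaction + rng.normal(0, 0.5, n),
        "sat2": satisfaction + rng.normal(0, 0.5, n),
        "sat3": satisfaction + rng.normal(0, 0.5, n),
        "loy1": loyalty + rng.normal(0, 0.5, n),
        "loy2": loyalty + rng.normal(0, 0.5, n),
        "loy3": loyalty + rng.normal(0, 0.5, n),
    })

    # "=~" defines the measurement model; "~" states the structural path.
    description = """
    satisfaction =~ sat1 + sat2 + sat3
    loyalty =~ loy1 + loy2 + loy3
    loyalty ~ satisfaction
    """

    model = Model(description)
    model.fit(df)            # iterative maximum likelihood estimation
    print(model.inspect())   # loadings, path coefficient, standard errors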
Conjoint Analysis

The most common applications for conjoint analysis are market research and
product development.
Method

Conjoint analysis measures complex decision making that requires multiattribute
judgments.

The objective of conjoint analysis is to secure utility scores (also called partworths), which represent the importance of each aspect of a product or service in the participant's overall preference ratings.

The first step in a conjoint study is to select the attributes most pertinent to the
purchase decision.

Possible values for an attribute are called factor levels.

After selecting the factors and their levels, a computer program determines the
number of product descriptions necessary to estimate the utilities.
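A minimal Python sketch of the estimation step in a traditional full-profile conjoint study: one participant's preference ratings are regressed on dummy-coded factor levels, and the resulting coefficients serve as part-worth utilities. The attributes (brand, price), their levels, and the ratings are hypothetical.

    import pandas as pd
    import statsmodels.api as sm

    profiles = pd.DataFrame({
        "brand":  ["A", "A", "B", "B", "A", "B"],
        "price":  ["$10", "$15", "$10", "$15", "$15", "$10"],
        "rating": [9, 6, 7, 3, 5, 8],   # one participant's preference ratings
    })

    # Dummy-code the factor levels, dropping one level per factor as the baseline.
    X = pd.get_dummies(profiles[["brand", "price"]], drop_first=True, dtype=float)
    fit = sm.OLS(profiles["rating"].astype(float), sm.add_constant(X)).fit()

    # Coefficients are the part-worth utilities relative to the baseline levels.
    print(fit.params)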
INTERDEPENDENCY TECHNIQUES
Factor Analysis

Factor analysis is a technique for discovering patterns among the variables to
determine whether an underlying combination of the original variables (a factor) can
summarize the original set.
Method

Factor analysis begins with the construction of a new set of variables based on the
relationships in the correlation matrix.

The most frequently used approach is principal components analysis (a method of
factor analysis that transforms a set of variables into a new set of
composite variables).

These linear combinations of variables, called factors (the result of transforming
a set of variables into a new set of composite variables through factor analysis),
account for the variance in the data as a whole.

The best combination makes up the first principal component and is the first
factor (and so on).

The process continues until all the variance is accounted for.

Loadings—the correlation coefficients that estimate the strength of the
variables composing the factor.

Eigenvalues—the proportion of total variance in all the variables that is
accounted for by a factor.

Communalities—the estimate of the variance in each variable that is
explained by the factors being studied.

Rotation—a technique used to provide a more simple and interpretable
picture of the relationship between factors and variables.
– See chapter section for a complete description of the factor loading and rotation process.
– The interpretation of factor loadings is largely subjective.
– For this reason, factor analysis is largely used for exploration.
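A minimal Python sketch of principal components analysis with scikit-learn, showing where eigenvalues, loadings, and communalities come from. The six synthetic variables stand in for survey items driven by two underlying constructs; a rotation (e.g., varimax) would normally be applied before interpretation but is not shown here.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(3)
    # Six observed variables driven by two underlying constructs.
    constructs = rng.normal(size=(300, 2))
    X = np.hstack([constructs[:, [0]] + 0.5 * rng.normal(size=(300, 3)),
                   constructs[:, [1]] + 0.5 * rng.normal(size=(300, 3))])

    Xz = StandardScaler().fit_transform(X)   # analyze the correlation structure
    pca = PCA().fit(Xz)

    eigenvalues = pca.explained_variance_                 # variance per factor
    loadings = pca.components_.T * np.sqrt(eigenvalues)   # variable-factor correlations
    communalities = (loadings[:, :2] ** 2).sum(axis=1)    # variance explained by 2 factors

    print(np.round(eigenvalues, 2))
    print(np.round(loadings[:, :2], 2))
    print(np.round(communalities, 2))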
Example

See chapter section for a student grades example; note exhibits that follow.
Interpretation

If its results are examined with care, factor analysis can be a powerful tool.
Cluster Analysis

Cluster analysis identifies homogeneous subgroups of study objects or
participants and then studies the data by these subgroups.

Often used in the fields of medicine, biology, and other sciences.

Cluster analysis offers a means for segmentation research and other marketing
problems where the goal is to classify similar groups.

Cluster analysis starts with an undifferentiated group of people, events, or objects
and attempts to reorganize them into homogeneous subgroups.
Method

Five steps are basic to the application of most cluster studies:

Selection of the sample to be clustered.

Definition of the variables on which to measure the objects, events, or people.

Computation of similarities among the entities through correlation, Euclidean
distances, and other techniques.

Selection of mutually exclusive clusters or hierarchically arranged clusters.

Cluster comparison and validation.
Example

The average linkage method (evaluates the distance between two clusters by
first finding the geometric center of each cluster and then computing distances
between the two centers) is demonstrated.

The resulting diagram is called a dendrogram.
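A minimal Python sketch of hierarchical clustering with SciPy: average linkage on Euclidean distances, followed by the dendrogram of the merge sequence. The two synthetic segments stand in for respondents measured on three metric variables.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

    rng = np.random.default_rng(4)
    # Two latent segments measured on three metric variables.
    X = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(4, 1, (20, 3))])

    Z = linkage(X, method="average", metric="euclidean")
    labels = fcluster(Z, t=2, criterion="maxclust")   # cut the tree into two clusters
    print(labels)

    dendrogram(Z)   # the tree diagram described above
    plt.show()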
Multidimensional Scaling


Multidimensional scaling (MDS) is a scaling technique that simultaneously measures more than one attribute of the participants or objects.
A perceptual map is created, locating objects that are perceived as similar close to one another.
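A minimal Python sketch of multidimensional scaling with scikit-learn: a small matrix of hypothetical perceived dissimilarities among four brands is mapped into two dimensions to form the perceptual map.

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.manifold import MDS

    brands = ["Brand A", "Brand B", "Brand C", "Brand D"]
    # Symmetric matrix of perceived dissimilarities (0 = identical).
    D = np.array([[0, 2, 6, 7],
                  [2, 0, 5, 6],
                  [6, 5, 0, 2],
                  [7, 6, 2, 0]], dtype=float)

    mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
    coords = mds.fit_transform(D)

    plt.scatter(coords[:, 0], coords[:, 1])
    for name, (x, y) in zip(brands, coords):
        plt.annotate(name, (x, y))
    plt.title("Perceptual map from MDS")
    plt.show()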