Uploaded by Marija Trpkova Nestorovska

Chapter 01 7e

Chapter 1
Introduction
Copyright © 2010 Pearson Education, Inc., publishing as Prentice-Hall.
LEARNING OBJECTIVES:
Upon completing this chapter, you should be able to do the following:
1. Explain what multivariate analysis is and when its application is appropriate.
2. Define and discuss the specific techniques included in multivariate analysis.
3. Determine which multivariate technique is appropriate for a specific research problem.
4. Discuss the nature of measurement scales and their relationship to multivariate techniques.
5. Describe the conceptual and statistical issues inherent in multivariate analyses.
What is Multivariate Analysis?
• What is it? Multivariate data analysis = all statistical methods that simultaneously analyze multiple measurements on each individual or object under investigation.
• Why use it?
  o Measurement
  o Explanation & Prediction
  o Hypothesis Testing
Basic Concepts of Multivariate Analysis
• The Variate
• Measurement Scales
  o Nonmetric
  o Metric
• Multivariate Measurement
• Measurement Error
• Types of Techniques
The Variate
• The variate is a linear combination of variables with empirically determined weights.
• Weights are determined to best achieve the objective of the specific multivariate technique.
• Variate equation: Y’ = W1 X1 + W2 X2 + . . . + Wn Xn
• Each respondent has a variate value (Y’).
• The Y’ value is a linear combination of the entire set of variables. In a dependence technique such as multiple regression, it is the predicted value of the dependent variable.
• Potential independent variables:
  X1 = income
  X2 = education
  X3 = family size
  X4 = ??
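The variate equation can be illustrated with a minimal sketch. The weights and respondent values below are hypothetical and chosen only for illustration; in practice the weights are estimated by the multivariate technique, not set by hand.

```python
# Hypothetical weights (W) for three of the candidate variables.
# Real weights would be estimated by the chosen multivariate technique.
weights = {"income": 0.5, "education": 2.0, "family_size": -1.0}

# Hypothetical respondents (X values); units are arbitrary.
respondents = [
    {"income": 40.0, "education": 12.0, "family_size": 4.0},
    {"income": 65.0, "education": 16.0, "family_size": 2.0},
]


def variate_value(x, w):
    """Return Y' = W1*X1 + W2*X2 + ... + Wn*Xn for one respondent."""
    return sum(w[name] * x[name] for name in w)


# Each respondent gets one variate value.
variate_values = [variate_value(r, weights) for r in respondents]
for i, y in enumerate(variate_values, start=1):
    print(f"Respondent {i}: Y' = {y}")
```

Each respondent ends up with a single number that summarizes the whole set of variables, which is what the later techniques operate on.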
Types of Data and Measurement Scales
Data
• Nonmetric or Qualitative
  o Nominal Scale
  o Ordinal Scale
• Metric or Quantitative
  o Interval Scale
  o Ratio Scale
Measurement Scales
• Nonmetric
  o Nominal – size of number is not related to the amount of the characteristic being measured.
  o Ordinal – larger numbers indicate more (or less) of the characteristic measured, but not how much more (or less).
• Metric
  o Interval – contains ordinal properties, and in addition, there are equal differences between scale points.
  o Ratio – contains interval scale properties, and in addition, there is a natural zero point.
NOTE: The level of measurement is critical in determining the appropriate multivariate technique to use!
Measurement Error
• All variables have some error. What are the sources of error?
• Measurement error distorts observed relationships and makes multivariate techniques less powerful.
• Researchers use summated scales, for which several variables are summed or averaged together to form a composite representation of a concept.
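A summated scale can be sketched in a few lines. The item ratings below are hypothetical, and the function name `summated_scale` is an illustrative choice rather than anything defined in the chapter.

```python
from statistics import mean

# Hypothetical 1-7 ratings on three items intended to measure one concept.
item_ratings = [
    [6, 7, 5],  # respondent 1
    [2, 3, 2],  # respondent 2
    [4, 4, 5],  # respondent 3
]


def summated_scale(items, average=True):
    """Combine several item scores into one composite score.

    average=True returns the mean of the items; average=False returns the sum.
    """
    return mean(items) if average else sum(items)


composite_scores = [summated_scale(row) for row in item_ratings]
print(composite_scores)
```

Averaging keeps the composite on the same scale as the individual items; summing is equivalent for analysis as long as every respondent answered every item.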
Measurement Error
In addressing measurement error,
researchers evaluate two important
characteristics of measurement:
• Validity = the degree to which a
measure accurately represents what it
is supposed to.
• Reliability = the degree to which the
observed variable measures the “true”
value and is thus error free.
Statistical Significance and Power
• Type I error is rejecting the null hypothesis when it is true; its probability is α.
• Type II error is failing to reject the null hypothesis when it is false; its probability is β.
• Power, or 1 - β, is the probability of rejecting the null hypothesis when it is false.

            Fail to Reject H0      Reject H0
H0 true     1 - α                  α (Type I error)
H0 false    β (Type II error)      1 - β (Power)
Power is Determined by Three Factors
• Effect size: the actual magnitude of the effect of interest (e.g., the difference between means or the correlation between variables). Larger effects are easier to detect, so power increases with effect size.
• Alpha (α): as α is set at smaller levels, power decreases. Typically, α = .05.
• Sample size: as sample size increases, power increases. With very large sample sizes, even very small effects can be statistically significant, raising the issue of practical significance vs. statistical significance.
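The interplay of the three factors can be sketched numerically. The example below assumes a two-sided, two-group z-test on means with a standardized effect size (mean difference divided by the standard deviation); that test, the function name, and the effect size of 0.5 are illustrative assumptions, not something the chapter prescribes.

```python
from math import sqrt
from statistics import NormalDist


def power_two_group_z(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test on means.

    effect_size is the standardized mean difference (difference / SD).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # critical value for alpha
    shift = effect_size * sqrt(n_per_group / 2)  # expected test statistic under H1
    # Probability that the statistic lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Power rises with sample size and falls when alpha is made more stringent.
for n in (20, 50, 100, 200):
    p05 = power_two_group_z(0.5, n, alpha=0.05)
    p01 = power_two_group_z(0.5, n, alpha=0.01)
    print(f"n per group = {n:3d}  power(.05) = {p05:.3f}  power(.01) = {p01:.3f}")
```

When the effect size is zero the function returns alpha itself, which matches the definition of the Type I error rate.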
Impact of Sample Size on Power
Rules of Thumb 1–1
Statistical Power Analysis
• Researchers should always design the study to achieve a power level of .80 at the desired significance level.
• More stringent significance levels (e.g., .01 instead of .05) require larger samples to achieve the desired power level.
• Conversely, power can be increased by choosing a less stringent alpha level (e.g., .10 instead of .05).
• Smaller effect sizes always require larger sample sizes to achieve the desired power.
• Any increase in power is most likely achieved by increased sample size.
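These rules of thumb can be checked with a rough sample-size calculation. The sketch assumes a two-sided, two-group comparison of means and uses the standard normal approximation; the function name and the effect sizes are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group z-test.

    Uses n = 2 * ((z_(1 - alpha/2) + z_power) / effect_size) ** 2, rounded up.
    effect_size is the standardized mean difference (difference / SD).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# A more stringent alpha or a smaller effect size both call for a larger sample.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: n per group = {n_per_group(0.5, alpha=alpha)}")
print(f"smaller effect (0.2): n per group = {n_per_group(0.2)}")
```

The approximation ignores the negligible probability of rejecting in the wrong tail, so dedicated power software may report slightly different numbers.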
Types of Multivariate Techniques
Dependence techniques: a variable or set of
variables is identified as the dependent variable to
be predicted or explained by other variables
known as independent variables.
o Multiple Regression
o Multiple Discriminant Analysis
o Logit/Logistic Regression
o Multivariate Analysis of Variance (MANOVA)
and Covariance
o Conjoint Analysis
o Canonical Correlation
o Structural Equation Modeling (SEM)
The relationships between multivariate
dependence methods
Types of Multivariate Techniques
Interdependence techniques: involve the
simultaneous analysis of all variables in the
set, without distinction between dependent
variables and independent variables.
o Principal Components and Common
Factor Analysis
o Cluster Analysis
o Multidimensional Scaling (perceptual
mapping)
o Correspondence Analysis
Selecting a Multivariate Technique
1. What type of relationship is being examined – dependence or interdependence?
2. Dependence relationship: How many variables are being predicted?
   • What is the measurement scale of the dependent variable?
   • What is the measurement scale of the predictor variable?
3. Interdependence relationship: Are you examining relationships between variables, respondents, or objects?
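Two Broad Types of Multivariate Methods:
1. Dependence – one or more variables are identified as dependent variables to be predicted or explained by a set of independent variables.
2. Interdependence – all variables are analyzed simultaneously as one set, with no distinction between dependent and independent variables.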
Selecting a multivariate technique
Selecting the Correct Multivariate Method
Multivariate Methods
• Dependence Methods
  o One Dependent Variable
      Metric: Multiple Regression and Conjoint
      Nonmetric: Discriminant Analysis and Logit
  o Several Dependent Variables
      Metric: MANOVA and Canonical
      Nonmetric: Canonical Correlation with Dummy Variables
  o Multiple Relationships: Structural Equations (SEM)
• Interdependence Methods
  o Metric: CFA, Factor Analysis, Cluster Analysis, Metric MDS
  o Nonmetric: Nonmetric MDS and Correspondence Analysis
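The selection questions can be sketched as a small lookup. The function and argument names are hypothetical. The dependence branch follows the classification chart above. The interdependence branch follows the earlier question about variables, respondents, or objects, and the pairing of each focus with a technique is an assumption based on how those techniques are described later in the chapter.

```python
# Dependence branch: (number of dependent variables, scale of the dependent variable).
DEPENDENCE = {
    ("one", "metric"): "Multiple regression or conjoint analysis",
    ("one", "nonmetric"): "Discriminant analysis or logit (logistic regression)",
    ("several", "metric"): "MANOVA or canonical correlation",
    ("several", "nonmetric"): "Canonical correlation with dummy variables",
}

# Interdependence branch: what the relationships are examined among (assumed mapping).
INTERDEPENDENCE = {
    "variables": "Factor analysis",
    "respondents": "Cluster analysis",
    "objects": "Multidimensional scaling or correspondence analysis",
}


def select_technique(relationship, n_dependent=None, dependent_scale=None, focus=None):
    """Suggest a technique family from the answers to the selection questions."""
    if relationship == "dependence":
        if n_dependent == "multiple relationships":
            return "Structural equation modeling (SEM)"
        return DEPENDENCE[(n_dependent, dependent_scale)]
    if relationship == "interdependence":
        return INTERDEPENDENCE[focus]
    raise ValueError("relationship must be 'dependence' or 'interdependence'")


print(select_technique("dependence", n_dependent="one", dependent_scale="nonmetric"))
print(select_technique("interdependence", focus="respondents"))
```

The lookup is only a first screen: the measurement scale of the predictor variables and the assumptions of each technique still need to be checked before committing to a method.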
Multiple Regression
. . . a single metric dependent variable is predicted by several metric independent variables.
E.g.:
• Monthly expenditures on dining out (Y) predicted from family income, family size, and age of head of household.
• Sales (Y) predicted from expenditures on advertising, number of salespeople, and number of stores carrying the products.
Diagram: X1, X2 → Y
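A minimal sketch of how the regression weights are obtained, using ordinary least squares solved through the normal equations in plain Python. The data are hypothetical and were constructed without noise from Y = 10 + 2*X1 + 3*X2 so the recovered weights can be checked; real data would include error, and statistical software would normally be used instead of a hand-written solver.

```python
def fit_multiple_regression(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Returns [intercept, b1, b2, ...].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for row in range(k):
            if row != col:
                factor = A[row][col]
                A[row] = [a - factor * b for a, b in zip(A[row], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical predictors (X1, X2) and outcome Y = 10 + 2*X1 + 3*X2.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [18, 17, 28, 27, 35]

coefficients = fit_multiple_regression(X, y)
print([round(b, 6) for b in coefficients])
```

The returned weights are exactly the W values of the variate: the predicted Y for a new case is the intercept plus the weighted sum of its X values.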
Discriminant Analysis
• What is it?
. . . single, non-metric (categorical)
dependent variable is predicted by several
metric independent variables.
• Why use it?
Logistic Regression
(Logit analysis)
• A single nonmetric dependent variable is predicted by several metric independent variables.
• This technique is similar to discriminant analysis, but it accepts both metric and nonmetric independent variables, does not require multivariate normality, and relies on calculations more like regression (with differences in estimation method and assumptions).
E.g. Financial advisors trying to select emerging firms for start-up investment review past records and place firms in two groups: successful over a 5-year period and unsuccessful over a 5-year period. Using financial and managerial data, they identify the measures that best differentiate between successful and unsuccessful firms in order to select the best candidates in the future.
MANOVA
Several metric dependent variables are predicted by a set of nonmetric (categorical) independent variables.
E.g. A company wants to know if a humorous ad will be more effective than a non-humorous ad. It develops two ads, one humorous and one non-humorous, and shows them to a group of customers. After seeing the ads, the customers rate the company and its products on several dimensions, such as modern versus traditional or high quality versus low quality. MANOVA would be the technique to use to determine the extent of any statistical differences between the perceptions of customers who saw the humorous ad versus those who saw the non-humorous one.
CANONICAL ANALYSIS
Logical extension to multiple regression analysis
• Several metric dependent variables are predicted by several metric independent variables.
• Develops a linear combination of each set of variables (both independent and dependent) that maximizes the correlation between the two sets.
• Procedure involves obtaining a set of weights for the dependent and independent variables that provides the maximum simple correlation between the set of dependent variables and the set of independent variables.
E.g. A company collects information on its service quality based on answers to 50 metrically measured questions.
• The study includes benchmarking information on perceptions of the service quality of world-class companies as well as the company for which the research is being conducted.
• Canonical correlation is used to compare the perceptions of the world-class companies on the 50 questions with the perceptions of the company. The research could then conclude whether the perceptions of the company are correlated with those of world-class companies.
• The technique provides information on the overall correlation of perceptions as well as the correlation for each of the 50 questions.
CONJOINT ANALYSIS
. . . is used to understand respondents’ preferences for products and services.
In doing this, it determines the importance of both:
• attributes and
• levels of attributes
. . . based on a smaller subset of combinations of attributes and levels.
E.g.:
• Assume a product concept has three attributes (price, quality, and color), each at three possible levels (e.g., red, yellow, and blue for color).
• Instead of having to evaluate all 27 (3 * 3 * 3) possible combinations, a subset (9 or more) can be evaluated for their attractiveness to consumers (e.g., the attractiveness of red versus yellow versus blue).
• The results can also be used in product design simulators, which show customer acceptance for any number of product formulations and aid in the design of the optimal product.
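The count of 27 combinations and a balanced subset of 9 can be sketched as follows. The level labels for price and quality are hypothetical (the slide names only the color levels), and the rule used to pick the subset is one simple way to build a balanced fraction, not necessarily the design a conjoint package would produce.

```python
from itertools import product

attributes = {
    "price": ["low", "medium", "high"],           # hypothetical level labels
    "quality": ["basic", "standard", "premium"],  # hypothetical level labels
    "color": ["red", "yellow", "blue"],
}

# Full factorial: every combination of one level per attribute.
profiles = [dict(zip(attributes, combo)) for combo in product(*attributes.values())]
print(len(profiles))

# Balanced subset: keep combinations whose level indices sum to a multiple of 3.
# Every level appears three times and every pair of levels from two different
# attributes appears exactly once.
subset = [
    {name: attributes[name][idx] for name, idx in zip(attributes, (i, j, k))}
    for i, j, k in product(range(3), repeat=3)
    if (i + j + k) % 3 == 0
]
for profile in subset:
    print(profile)
```

Respondents would rate or rank only the profiles in `subset`, and the importance of each attribute and level is then estimated from those evaluations.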
CONJOINT ANALYSIS
Typical Applications:
• Soft Drinks
• Candy Bars
• Cereals
• Beer
• Apartment Buildings; Condos
• Solvents; Cleaning Fluids
Structural Equation Modeling (SEM)
• Technique that allows separate relationships for each of a set of dependent variables.
• Provides the appropriate and most efficient estimation technique for a series of separate multiple regression equations estimated simultaneously.
• Two basic components:
  o (1) the structural model: the path model, which relates independent to dependent variables. In such situations, theory, prior experience, or other guidelines enable the researcher to distinguish which independent variables predict each dependent variable.
  o (2) the measurement model: enables the researcher to use several variables for a single independent or dependent variable. For example, the dependent variable might be a concept represented by a summated scale, such as self-esteem.
Structural Equation Modeling (SEM)
• E.g. - A study by management consultants identified several factors that affect worker satisfaction: supervisor support, work environment, and job performance.
• Also, supervisor support and the work environment not only affected worker satisfaction directly, but had possible indirect effects through the relationship with job performance, which was also a predictor of worker satisfaction.
• To assess these relationships, multi-item scales were developed for each construct (supervisor support, work environment, job performance, and worker satisfaction).
Path diagram: Supervisor support and Work environment → Job performance → Worker satisfaction, with direct paths from Supervisor support and Work environment to Worker satisfaction.
Factor analysis
. . . analyzes the structure of the interrelationships among a large number of variables to determine a set of common underlying dimensions (factors).
• E.g. - A researcher can use factor analysis to better understand the relationships between customers’ ratings of a fast-food restaurant.
• Assume you ask customers to rate the restaurant on the following six variables: food taste, food temperature, freshness, waiting time, cleanliness, and friendliness of employees.
• The analyst would like to combine these six variables into a smaller number. By analyzing the customer responses, the analyst might find that:
  o the variables food taste, temperature, and freshness combine together to form a single factor of food quality, whereas
  o the variables waiting time, cleanliness, and friendliness of employees combine to form another single factor, service quality.
Cluster Analysis
• . . . groups objects (respondents, products, firms, variables, etc.) so that each object is similar to the other objects in the cluster and different from objects in all the other clusters.
• Cluster analysis usually involves at least three steps:
  o 1) measurement of some form of similarity or association among the entities to determine how many groups really exist in the sample.
  o 2) the actual clustering process, whereby entities are partitioned into groups (clusters).
  o 3) profiling of the persons or variables to determine their composition.
• E.g. Restaurant owner wants to know whether customers are patronizing the restaurant for different reasons. Data could be collected on perceptions of pricing, food quality, and so forth.
• Cluster analysis could be used to determine whether some subgroups (clusters) are highly motivated by low prices versus those who are much less motivated to come to the restaurant based on price considerations.
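The three steps can be sketched with a small k-means routine. K-means is used here only as one common clustering algorithm; the slide does not name a specific method. The customer ratings, the choice of two clusters, and the starting centers are all hypothetical.

```python
from math import dist
from statistics import mean

# Hypothetical ratings per customer: (importance of low price, importance of food quality).
customers = [(9, 3), (8, 4), (9, 2), (2, 8), (3, 9), (1, 7)]


def k_means(points, centers, n_iter=10):
    """Assign points to the nearest center and update centers, repeated n_iter times."""
    labels = []
    for _ in range(n_iter):
        # Steps 1 and 2: measure similarity (Euclidean distance) and partition.
        labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(tuple(mean(col) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        centers = new_centers
    return labels, centers


# Two clusters assumed; the first and fourth customers serve as starting centers.
labels, centers = k_means(customers, [customers[0], customers[3]])

# Step 3: profile each cluster by its average ratings.
for c, center in enumerate(centers):
    print(f"Cluster {c}: size = {labels.count(c)}, mean ratings = {center}")
```

In this toy data one cluster rates low prices as highly important and the other does not, which is the kind of subgroup difference the restaurant example describes.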
Perceptual mapping
(multidimensional scaling)
• Objective is to transform consumer judgments of similarity or preference into distances represented in multidimensional space.
• If objects A and B are judged by respondents as being the most similar compared with all other possible pairs of objects, perceptual mapping techniques will position objects A and B in such a way that the distance between them in multidimensional space is smaller than the distance between any other pairs of objects.
E.g. - Owner of a Burger King franchise wants to know whether the strongest competitor is McDonald's or Wendy's. A sample of customers is given a survey and asked to rate the pairs of restaurants from most similar to least similar. The results show that Burger King is most similar to Wendy's, so the owner knows that the strongest competitor is the Wendy's restaurant because it is thought to be the most similar. Follow-up analysis can identify what attributes influence perceptions of similarity or dissimilarity.
Correspondence Analysis
. . . uses non-metric data and evaluates either linear or non-linear relationships in an effort to develop a perceptual map representing the association between objects (firms, products, etc.) and a set of descriptive characteristics of the objects.
• E.g. – Respondents’ brand preferences can be cross-tabulated on demographic variables (gender, income categories, occupation) by indicating how many people preferring each brand fall into each category of the demographic variables.
• Through correspondence analysis, the association of brands and the distinguishing characteristics of those preferring each brand are then shown in a two- or three-dimensional map of both brands and respondent characteristics.
• Brands perceived as similar are located close to one another.
• The most distinguishing characteristics of respondents preferring each brand are also determined by the proximity of the demographic variable categories to the brand’s position.
Guidelines for Multivariate Analysis
o Establish Practical Significance as well as Statistical Significance.
o Sample Size Affects All Results.
o Know Your Data.
   o Descriptive data; charts
o Strive for Model Parsimony.
   o Irrelevant variables
   o Multicollinearity
o Look at Your Errors.
   o Starting points to diagnose validity and an indication of the remaining unexplained relationships
o Validate Your Results.
   o Splitting the sample (one part to estimate the model, the second to estimate predictive accuracy)
   o Gathering a separate sample
   o Bootstrapping: draw a large number of subsamples, estimate the model for each, calculate the means of the estimated coefficients, and examine the actual values from the repeated samples.
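Two of the validation approaches, splitting the sample and bootstrapping, can be sketched with the standard library. The data are simulated, and the sample mean stands in for a model coefficient purely to keep the example short; with a real model the estimator would refit the model on each resample.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical sample of 40 observations.
sample = [random.gauss(50, 10) for _ in range(40)]

# Split-sample validation: one half to estimate, the other half to check.
shuffled = random.sample(sample, len(sample))
estimation, holdout = shuffled[:20], shuffled[20:]
print(f"estimation mean = {mean(estimation):.2f}, holdout mean = {mean(holdout):.2f}")


def bootstrap_estimates(data, estimator, n_resamples=1000):
    """Re-estimate on resamples drawn with replacement from the original sample."""
    return [estimator(random.choices(data, k=len(data))) for _ in range(n_resamples)]


estimates = sorted(bootstrap_estimates(sample, mean))
boot_mean = mean(estimates)
# Roughly the middle 95% of the 1000 bootstrap estimates.
interval = (estimates[24], estimates[974])
print(f"bootstrap mean = {boot_mean:.2f}, interval = ({interval[0]:.2f}, {interval[1]:.2f})")
```

The spread of the bootstrap estimates shows how much the estimate would vary across repeated samples, without relying on a distributional formula.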
A Structured Approach to
Multivariate Model Building:
Stage 1: Define the Research Problem, Objectives, and
Multivariate Technique(s) to be Used
Stage 2: Develop the Analysis Plan
Stage 3: Evaluate the Assumptions Underlying the
Multivariate Technique(s)
Stage 4: Estimate the Multivariate Model and Assess
Overall Model Fit
Stage 5: Interpret the Variate(s)
Stage 6: Validate the Multivariate Model
Description of HBAT Primary Database Variables
Variable   Description                                         Variable Type
Data Warehouse Classification Variables
X1         Customer Type                                       nonmetric
X2         Industry Type                                       nonmetric
X3         Firm Size                                           nonmetric
X4         Region                                              nonmetric
X5         Distribution System                                 nonmetric
Performance Perceptions Variables
X6         Product Quality                                     metric
X7         E-Commerce Activities/Website                       metric
X8         Technical Support                                   metric
X9         Complaint Resolution                                metric
X10        Advertising                                         metric
X11        Product Line                                        metric
X12        Salesforce Image                                    metric
X13        Competitive Pricing                                 metric
X14        Warranty & Claims                                   metric
X15        New Products                                        metric
X16        Ordering & Billing                                  metric
X17        Price Flexibility                                   metric
X18        Delivery Speed                                      metric
Outcome/Relationship Measures
X19        Satisfaction                                        metric
X20        Likelihood of Recommendation                        metric
X21        Likelihood of Future Purchase                       metric
X22        Current Purchase/Usage Level                        metric
X23        Consider Strategic Alliance/Partnership in Future   nonmetric
Multivariate Analysis
Learning Checkpoint
1. What is multivariate analysis?
2. Why use multivariate analysis?
3. Why is knowledge of measurement scales
important in using multivariate analysis?
4. What basic issues need to be examined
when using multivariate analysis?
5. Describe the process for applying
multivariate analysis.