Session 5- Measurement and survey design

Session 5. Causal Research and Survey Design
MKTG 3010 MARKETING RESEARCH
Marketing Research Process
Step 1: Defining the Problem
Step 2: Developing an Approach to the Problem
Step 3: Formulating a Research Design
Step 4: Doing Field Work or Collecting Data
Step 5: Preparing and Analyzing Data
Step 6: Preparing and Presenting the Report
A Comparison of Basic Research Designs
Exploratory
Objective: Discovery of ideas and insights
Characteristics: Flexible, versatile; often the front end of total research design
Methods: Expert surveys; pilot surveys; focus group; secondary data: qualitative analysis; qualitative research
Descriptive
Objective: Describe market characteristics or functions
Characteristics: Marked by the prior formulation of specific hypotheses; preplanned and structured design
Methods: Secondary data: quantitative analysis; surveys; panels; observation and other data
Causal
Objective: Determine cause and effect relationships
Characteristics: Manipulation of independent variables, effect on dependent variables; control mediating variables
Methods: Experiments
Causal Research Design
The danger of mixing up causality and
correlation
https://www.youtube.com/watch?v=8B271L3NtAw
Is ice cream dangerous?
Underlying factor: weather
Why do married men live longer than single men? Is it because of marriage?
Reverse causality
Do bed lamps harm children?
Underlying factor: genetics
Does high self-esteem lead to good grades?
Reverse causality
Correlation vs causation
• Correlation doesn’t imply causation
• For any two correlated events A and B, the following relationships are possible:
– A causes B;
– B causes A;
– A and B are consequences of a common cause, but do not cause each other;
– There is no connection between A and B; the correlation is coincidental.
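The common-cause case can be illustrated with a small simulation. This is only a sketch with made-up numbers: temperature stands in for the weather, and the names ice_cream_sales and drownings, the coefficients, and the noise levels are all assumptions chosen for illustration. A and B never influence each other, yet they correlate strongly until the common cause is taken out.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical common cause: daily temperature (made-up range).
temperature = [rng.uniform(10, 35) for _ in range(1000)]

# A and B each depend on temperature plus independent noise, never on each other.
ice_cream_sales = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1) for t in temperature]

# Raw correlation between A and B is strongly positive.
r_ab = pearson(ice_cream_sales, drownings)

# Remove the (known, assumed) temperature effect from each series;
# what remains is independent noise, so the correlation vanishes.
sales_resid = [s - 2.0 * t for s, t in zip(ice_cream_sales, temperature)]
drown_resid = [d - 0.3 * t for d, t in zip(drownings, temperature)]
r_resid = pearson(sales_resid, drown_resid)

print(f"raw correlation: {r_ab:.2f}, after removing temperature: {r_resid:.2f}")
```

In real data the effect of the common cause is not known in advance, which is one reason experiments with randomization are preferred for causal claims.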
Conditions for causality
• Concomitant variation is the extent to which a cause, X, and an effect, Y, occur together or vary together in the way predicted by the hypothesis under consideration.
• The time order of occurrence condition states that the causing event must occur either before or simultaneously with the effect; it cannot occur afterwards.
• The absence of other possible causal factors means that additional or extraneous variables that affect the effect variable must be controlled.
Experimental design
An experimental design is a set of procedures specifying
• the test units and how these units are to be divided into homogeneous subsamples;
• what independent variables or treatments are to be manipulated;
• what dependent variables are to be measured;
• how the extraneous variables are to be controlled.
Definition and concepts

Independent variables are variables or alternatives that are
manipulated and whose effects are measured and compared,
e.g., price levels.

Test units are individuals, organizations, or other entities
whose response to the independent variables or treatments
is being examined, e.g., consumers or stores.

Dependent variables are the variables which measure the
effect of the independent variables on the test units, e.g.,
sales, profits, and market shares.

Extraneous variables are all variables other than the
independent variables that affect the response of the test
units, e.g., store size, store location, and competitive effort.
A Classification of Experimental Designs
Experimental Designs
Pre-experimental: One-Shot Case Study; One Group Pretest-Posttest; Static Group
True Experimental: Pretest-Posttest Control Group; Posttest-Only Control Group; Solomon Four-Group
Quasi-Experimental: Time Series; Multiple Time Series
Statistical: Randomized Blocks; Latin Square; Factorial Design
True Experimental Designs
Posttest-Only Control Group Design
(EG) R X O1
(CG) R   O2
(time runs from left to right)
X = Treatment; O = Observation
R = Two stage randomization
EG = Experimental Group; CG = Control Group

Treatment effect = O1 - O2
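A minimal sketch of the posttest-only control group design, assuming stores as test units and weekly sales as the dependent variable (both hypothetical, with made-up figures). Units are first assigned at random, and the treatment effect is taken as the posttest mean of the experimental group minus the posttest mean of the control group.

```python
import random
from statistics import mean

def randomize(units, rng):
    """Randomly split test units into an experimental and a control group."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def posttest_only_effect(eg_post, cg_post):
    """Posttest mean of the experimental group minus that of the control group."""
    return mean(eg_post) - mean(cg_post)

rng = random.Random(42)
stores = [f"store_{i}" for i in range(10)]  # hypothetical test units
eg, cg = randomize(stores, rng)

# Made-up weekly sales measured after the treatment (e.g. a new price level).
eg_sales = [120, 135, 128, 140, 132]  # experimental group, treated
cg_sales = [110, 118, 115, 121, 116]  # control group, untreated

effect = posttest_only_effect(eg_sales, cg_sales)
print(effect)
```

Because there is no pretest, the design relies on randomization to make the two groups comparable before the treatment.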
True Experimental Designs
Pretest-Posttest with Control Group Design
(EG) R O1 X O2
(CG) R O3   O4
(time runs from left to right)
X = Treatment; O = Observation
R = Two stage randomization
EG = Experimental Group; CG = Control Group

Treatment Effect = TE = (O2-O1)-(O4-O3)
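The formula TE = (O2-O1)-(O4-O3) can be sketched as below. The observations are made-up numbers, and using group means for each O is an assumption of this sketch.

```python
from statistics import mean

def pretest_posttest_effect(o1, o2, o3, o4):
    """TE = (O2 - O1) - (O4 - O3), computed on the group mean of each observation."""
    change_in_eg = mean(o2) - mean(o1)  # change in the experimental group
    change_in_cg = mean(o4) - mean(o3)  # change in the control group
    return change_in_eg - change_in_cg

# Made-up measurements of the dependent variable.
o1 = [100, 104, 98, 102]   # EG pretest
o2 = [115, 118, 110, 117]  # EG posttest (after treatment X)
o3 = [99, 103, 101, 101]   # CG pretest
o4 = [104, 107, 105, 104]  # CG posttest (no treatment)

te = pretest_posttest_effect(o1, o2, o3, o4)
print(te)
```

Subtracting the control group's change removes whatever would have happened anyway over the same period, such as seasonality.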
Limitations of Experimentation

Experiments can be time consuming, particularly if
the researcher is interested in measuring the long-term effects.

Experiments are often expensive. The
requirements of experimental group, control group,
and multiple measurements significantly add to the
cost of research.

Experiments can be difficult to administer. It may
be impossible to control for the effects of the
extraneous variables, particularly in a field
environment.
Marketing Applications of Experiments
• New product launch
– Controlled field experiment
– Test markets (real/simulated)
• Assessing advertising effectiveness
– Recognition, recall, persuasion
– Purchase behavior (split-cable tests)
• Direct mail: Randomized offers
• Online experimentation
Homework - Psychology beats business training when it comes to entrepreneurship
(https://www.economist.com/news/business/21729454-among-smallbusiness-owners-togo-least-psychology-beats-business-training-whenit-comes)
• What is the research question?
• Which experimental design was used?
• What is the treatment? How to calculate the treatment effect?
• What is the dependent variable?
Define the Information Needed
Design the Exploratory, Descriptive, and/or Causal Phases of the Research
Specify the Measurement and Scaling Procedures
Construct a Questionnaire
Specify the Sampling Process and the Sample Size
Develop a Plan of Data Analysis
Measurement and Scaling
Measure information
• Gender
• Brands
• Price
• Income
• Preference
• Attitude
Measurement

Measurement means assigning numbers
or other symbols to characteristics of
objects according to predetermined rules.
Scaling

In the social sciences, scaling is the
process of measuring or ordering entities
with respect to quantitative attributes or
traits.

It can be described in terms of four basic
characteristics:
Scaling
Description
Refers to the unique labels or descriptors that are used to designate each value of the scale (e.g. 1 = Female, 0 = Male). All scales have description.
Order
Refers to the relative sizes or positions of the descriptors. Order is denoted by descriptors such as greater than, less than, and equal to.
Distance
Means that the absolute differences between the scale descriptors are known and may be expressed in units.
Origin
Means that the scale has a unique or fixed beginning or true zero point.
Four Types of Primary Scales
Primary scales, from the lowest to the highest level of measurement: Nominal Scale, Ordinal Scale, Interval Scale, Ratio Scale
Levels of Measurement Scales
Nominal
Rules: Objects are either identical or different
Applications: Description/classification
Ordinal
Rules: Objects are greater or smaller
Applications: Rankings
Interval
Rules: Intervals between objects are equal (arbitrary zero)
Applications: Ratings
Ratio
Rules: Meaningful zero, for comparison of absolute magnitude
Applications: Amounts
Nominal Scale Measures
1. When at home, which TV brand do you watch most often? (circle one).
□ Sony □ Toshiba □ Panasonic □ Samsung □ Philips □ Other
• Descriptions
– Each individual/object either is or is not in a group
– Can’t order the labels
• Statistics
– Frequency tables, mode
– Use percentages belonging to each group
– chi-square tests, Z-test for proportions, discriminant, logistic regression
• Useful as grouping variable
– t-tests/ANOVA (independent variable only)
Ordinal Scale Measures
1. Please rank each television brand in terms of your preference. Place a “1” by your top choice, a “2” by your second choice, etc.
_ Sony _ Toshiba _ Panasonic _ Samsung _ Philips
2. For each pair of television brands, circle the one you would be more likely to purchase.
Sony versus Toshiba
Sony versus Panasonic ...
• Ranking
– X is better than Y
– Can’t say by how much
• Statistics
– Frequency tables, mode, median
– Rank-order correlation, some conjoint models
• Useful for ranking alternatives
Interval Scale Measures
For each brand below, indicate how much you like each brand by circling the appropriate number.
(1 = Not Like At All, 5 = Like Very Much)
Sony        1  2  3  4  5
Toshiba     1  2  3  4  5
Panasonic   1  2  3  4  5
• Ratings
– X is n units different from Y
– Distance between 2 and 3 is same as distance between 3 and 4 (or is it?)
– Can’t compare absolute magnitude (“4” ≠ 2 x “2”)
• Statistics
– Frequency tables, mode, median, mean, standard deviation
– t-tests, regression, ANOVA, factor, cluster, some conjoint
Ratio Scale Measures
How much would you be willing to pay for a 24” flat screen TV made by each company?
Sony $_______
Toshiba $_______
Panasonic $_______
• Amounts, zero has meaning
– X is n units different than Y
– Distance between 2 and 3 is half of distance between 2 and 4
– Can compare absolute magnitude (“4” = 2 x “2”)
• Statistics
– Frequency tables, mode, median, mean, standard deviation, geometric mean
– t-tests, regression, ANOVA, factor, cluster, some conjoint
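As a rough sketch of how the level of measurement limits the summary statistics listed above, the lookup below computes only the measures of central tendency named for each scale. The PERMISSIBLE table and the sample data are illustrative assumptions, not a complete list of valid analyses.

```python
from statistics import geometric_mean, mean, median, mode

# Central-tendency statistics named on the slides for each scale level.
PERMISSIBLE = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean", "geometric_mean"],
}
FUNCS = {
    "mode": mode,
    "median": median,
    "mean": mean,
    "geometric_mean": geometric_mean,
}

def summarize(values, level):
    """Compute only the statistics that are meaningful at the given scale level."""
    return {name: FUNCS[name](values) for name in PERMISSIBLE[level]}

# Made-up responses.
brand_watched = ["Sony", "Samsung", "Sony", "Toshiba", "Sony"]  # nominal
liking = [4, 5, 3, 4, 2]                                        # interval (1-5 rating)
willingness_to_pay = [200, 250, 400]                            # ratio (dollars)

print(summarize(brand_watched, "nominal"))
print(summarize(liking, "interval"))
print(summarize(willingness_to_pay, "ratio"))
```

Averaging brand names is not possible at all, which is the nominal case in its plainest form; for ordinal ranks a mean can be computed but is not meaningful, so the table leaves it out.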
Use of Rating Scales in Self-Report Measurements
Scaling Techniques
Comparative Scales: Paired Comparison; Rank Order; Constant Sum
Noncomparative Scales: Continuous Rating Scales; Itemized Rating Scales (Likert, Semantic Differential, Stapel)
Comparative scales involve the direct comparison of stimulus objects.
– e.g. Do you prefer Pepsi or Coke?
– Comparative scale data must be interpreted in relative terms and have only ordinal or rank order properties.
In noncomparative scales, each object is scaled independently of the others in the stimulus set.
– e.g. How do you feel about Coke?
– The resulting data are generally assumed to be interval or ratio scaled.
Comparative Scaling
- Nonmetric Scaling
Comparative Scaling Techniques:
Paired Comparison Scaling

A respondent is presented with two objects and asked
to select one according to some criterion.

e.g. Do you prefer Pepsi or Coke?

The data obtained are ordinal in nature.

Paired comparison scaling is the most widely used
comparative scaling technique.

Under the assumption of transitivity, it is possible to
convert paired comparison data to a rank order.
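One way to turn paired comparison data into a rank order is to count how often each object is chosen. The sketch below assumes one respondent who has judged every pair once; the function name rank_from_pairs and the brand choices are hypothetical. It also checks the transitivity assumption, since without it no unique order exists.

```python
from itertools import combinations

def rank_from_pairs(objects, preferred):
    """Rank objects by the number of paired comparisons they win.

    preferred maps each unordered pair (a frozenset) to the object chosen.
    Raises ValueError when the choices are not transitive.
    """
    wins = {obj: 0 for obj in objects}
    for a, b in combinations(objects, 2):
        wins[preferred[frozenset((a, b))]] += 1
    # With n objects and transitive choices, the win counts are exactly 0, 1, ..., n-1.
    if sorted(wins.values()) != list(range(len(objects))):
        raise ValueError("choices are not transitive; no unique rank order")
    return sorted(objects, key=lambda obj: wins[obj], reverse=True)

brands = ["Coke", "Pepsi", "Sprite"]
choices = {  # made-up answers to "Which do you prefer?"
    frozenset(("Coke", "Pepsi")): "Coke",
    frozenset(("Coke", "Sprite")): "Coke",
    frozenset(("Pepsi", "Sprite")): "Sprite",
}
ranking = rank_from_pairs(brands, choices)
print(ranking)
```

With n objects the respondent has to judge n(n-1)/2 pairs, so the technique becomes tiring quickly as the number of objects grows.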
Comparative Scaling Techniques:
Rank Order Scaling

Respondents are presented with several objects
simultaneously and asked to order or rank them
according to some criterion.

e.g. Rank the following advertisements. 1 denotes the
one you like the most and 10 the one you like the
least. No two ads should receive the same rank
number.

Rank order scaling also results in ordinal data.
Comparative Scaling Techniques:
Constant Sum Scaling

A respondent is given a constant sum of money, script,
credits, or points and asked to allocate these to various
items

e.g. If you had 100 HKD to spend on food products,
how much would you spend on product A, on product
B, on product C, etc.

This is also an ordinal level technique.
Relative Advantages of Comparative Scales
• Small differences between stimulus objects can be detected.
• Same known reference points for all respondents.
• Easily understood and can be applied.
• Involve fewer theoretical assumptions.
• Tend to reduce halo or carryover effects from one judgment to another.
The halo effect or halo error is a cognitive bias in which one's judgments of a person’s character can be influenced by one's overall impression of him or her. (wikipedia)
Relative Disadvantages of Comparative Scales
• Ordinal nature of the data.
• Inability to generalize beyond the stimulus objects scaled.
Noncomparative Scaling
- Metric Scaling
A classification of scaling techniques
Scaling Techniques
Comparative Scales
Noncomparative Scales: Itemized Rating Scales (Likert, Semantic Differential, Stapel); Continuous Rating Scales
Continuous Rating Scale

Respondents rate the objects by placing a mark at the
appropriate position on a line that runs from one extreme
of the criterion variable to the other.
How would you rate SOGO as a department store?
Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -- - Probably the best
0 10 20 30 40 50 60 70 80 90 100

Scoring and codification are difficult.
Itemized Rating Scales - Likert Scale

Respondents are asked to indicate the amount of agreement or disagreement (from strongly agree to strongly disagree) on a five- to nine-point scale.
1 = Strongly disagree, 2 = Disagree, 3 = Neither agree nor disagree, 4 = Agree, 5 = Strongly agree
1. SOGO sells high quality merchandise.   1  2  3  4  5
2. SOGO has poor in-store service.        1  2  3  4  5
3. I like to shop at SOGO.                1  2  3  4  5

The analysis can be conducted on an item-by-item basis (profile analysis),
or a total (summated) score can be calculated.
When arriving at a total score, the categories assigned to the negative
statements by the respondents should be scored by reversing the scale.
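The reverse scoring described above can be sketched as follows. The item keys and the answers are made up for illustration; on a 1 to 5 scale a reversed score is 6 minus the original score.

```python
def summated_score(responses, negative_items, scale_min=1, scale_max=5):
    """Total Likert score with negatively worded items reverse-scored."""
    total = 0
    for item, score in responses.items():
        if item in negative_items:
            # Reverse the scale: 1 <-> 5, 2 <-> 4, 3 stays 3 on a 1-5 scale.
            score = scale_min + scale_max - score
        total += score
    return total

# One hypothetical respondent's answers to the three SOGO statements.
responses = {
    "high_quality_merchandise": 4,
    "poor_in_store_service": 2,  # negative statement
    "like_to_shop": 5,
}
total = summated_score(responses, negative_items={"poor_in_store_service"})
print(total)
```

Here disagreeing with the "poor service" statement (a 2) counts as a favorable 4, so a higher total consistently means a more favorable attitude.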
Itemized Rating Scales - Semantic Differential Scale

The semantic differential is a seven-point rating scale with end points associated with bipolar labels that have semantic meaning.
e.g. Please mark (X) at the point along the scale that best describes what the store means to you. Please be sure to mark every scale; do not omit any scale.
SOGO is:
Powerful   --:--:--:--:-X-:--:--: Weak
Unreliable --:--:--:--:--:-X-:--: Reliable
Modern     --:--:--:--:--:--:-X-: Old-fashioned
The negative adjective or phrase sometimes appears at the left side of the scale and sometimes at the right.
This controls the tendency of some respondents, particularly those with very positive or very negative attitudes, to mark the right- or left-hand sides without reading the labels.
Itemized Rating Scales - Stapel Scale

The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.
e.g. Please evaluate how accurately each word or phrase describes the store. Select a plus number by placing an (x) beside it for the phrases you think describe the store accurately. The more accurately you think the phrase describes the store, the larger the positive number you should choose. A larger negative number indicates that the phrase does not describe the store at all.
SOGO
+5             +5
+4             +4
+3             +3
+2             +2x
+1             +1
High Quality   Poor Service
-1             -1
-2             -2
-3             -3
-4x            -4
-5             -5
The data obtained by using a Stapel scale can be analyzed in the same way as semantic differential data.
Some Unique Rating Scale Configurations
Thermometer Scale
Instructions: Please indicate how much you like McDonald’s hamburgers by coloring in the thermometer. Start at the bottom and color up to the temperature level that best indicates how strong your preference is.
Form: a thermometer marked 100, 75, 50, 25, 0, running from “Like very much” (100) down to “Dislike very much” (0)
Smiling Face Scale
Instructions: Please point to the face that shows how much you like the Barbie Doll. If you do not like the Barbie Doll at all, you would point to Face 1. If you liked it very much, you would point to Face 5.
Form: five faces numbered 1 to 5
The Big Debate: Response Scales
Likert-type scales: 3-11 categories
• Even or Odd?
– Even: Less “cop-outs”
– Odd: Can express indifference
• All Labeled or Ends Labeled?
– All labeled: Consistent interpretation
– Ends labeled: Interval not Ordinal
• Fewer or More?
– Fewer: Less interpretation error (unmotivated, uninformed, heterogeneous)
– More: Can express distinctions (motivated, informed, homogeneous)
Warning: In 1-10 scales, is 5 the midpoint?
Choosing the Right Level Scale
• Consider information needs
– Use dummy tables
– Plan analysis before you begin
– Plan ahead for advanced methods (segmentation, conjoint, regression, etc.)
– You can always do less...but never more
• Consider ease of use for respondent
– Choice can be easier than absolute preference
Questionnaire Design
Questionnaire Design Process
Specify the Information Needed
Specify the Type of Interviewing Method
Determine the Content of Individual Questions
Design the Question to Overcome the Respondent’s Inability and
Unwillingness to Answer
Decide the Question Structure
Determine the Question Wording
Arrange the Questions in Proper Order
Identify the Form and Layout
Reproduce the Questionnaire
Eliminate Bugs by Pre-testing
Key Issues in Survey Design
• Structure of the survey
– Order of information
– Respondent-driven design
• Understanding the psychology of survey response
• Question wording
– Are they answering what we’re asking?
Survey Content and Order
• What do you need to know?
– Backwards marketing research
– What analyses will you want to do?
• How much do you need to ask?
– Topic “blocks”
– Deliberate redundancy, “what if”
– Branching
– Splits for length (random rotate)
• Implications of structure
– General, uninformative
– New information
– Specific
– Past behavior, demographics
Question Sequence: Example
Location: Screeners
Type: Qualifying questions
Example: “Have you been snow skiing in the past twelve months?”
Rationale: To identify target respondents (ski owners who have skied in the past year).

Location: First few questions
Type: Warm-ups
Example: “What brand of skis do you own?”
Rationale: Easy to answer. Shows respondents survey is simple.

Location: First third of questions
Type: Transitions/focus questions
Example: “What feature do you like best about skis?”
Rationale: Relate to research objectives, slightly more effort needed to answer.

Location: Middle
Type: Difficult and complicated
Example: “Following are ten characteristics of snow skis. Please rate your skis on each characteristic using the scale below.”
Rationale: Respondent has committed to completing survey and can see that just a few questions are left.

Location: Last section
Type: Demographic/firmographic information
Example: “What is the highest level of education you have attained?”
Rationale: Some questions may be considered “personal” and respondent may leave them blank, but they are at the end of the survey.
What’s Going On In There?
• NOT retrieval of existing info?
– Recall (incomplete, distorted)
– Associative processes
– Constructed assessments / preferences
The Psychology of Survey Response
Comprehension, then Retrieval, then Judgment, then Response
Source: “Self-Reports”, N. Schwarz, Am Psych 1999
I. Question Wording:
Speak Their Language
“Which of the following rationales motivate your decisions in
the carbonated beverage category in majority of purchase
occasions? Please check all that apply.”
“How do you usually decide which soda to buy? Please check
all that apply.”
• Use respondents’ own vocabulary!
Comprehension
Question Wording:
Avoid Implicit Assumptions

An alternative that is not explicitly expressed in the
options is an implicit alternative.
1. Do you like to fly when traveling short distances? (Incorrect)
2. Do you like to fly when traveling short distances, or would you rather drive? (Correct)
Comprehension
Question Wording:
Avoid Compound Questions
– “Do you believe that McYummys offers fast and courteous service?”
– “Do you believe that McYummys offers courteous service?”
Comprehension
Question Wording:
Avoid insufficiently specific questions
• Which brand of shampoo do you use?
– Household or personal brand?
– What if more than one brand?
– Use now? During last week, month, year?
– At home? Away from home?
• Which brand or brands of shampoo have you personally used at home during the last month? In case of more than one brand, please list all the brands that apply.
• How many members are there in your family?
Comprehension
Question Wording:
Avoid Double Meanings
• Do you eat at Joe’s daily?
– Seven days a week?
– Five days a week?
• Going to Home Depot feels like being in a warehouse.
– Yes, it has lots of everything at low prices.
– Yes, it is noisy and crowded and dangerous.
• Do you think any brand is better than Gap?
– Yes, the brand Abercrombie & Fitch is better.
– Yes, any brand is better.
• Attributes: Neat, Fresh, Cheap, Tough, etc.
Comprehension
II. Failures of Retrieval
• When in doubt: people guess-timate.
– Due to passage of time
– Due to uncertainty of actual timing of event
• Timing of event may be uncertain (allergies?)
• When mundane: people guess-timate
• When salient: people over-estimate
– Telescoping
Retrieval
Error from Inability to Recall
[Figure: Years suffering allergies. x-axis: Years (0 to 20); y-axis: % (0 to 25)]
1) What % have suffered for 10 years?
2) What % have suffered for 9 years?
Retrieval
Priming:
Context Affects Retrieval

Cues in the environment affect what’s
available in memory.


Survey-related content
General attitudes
Retrieval
Addressing Memory Issues
• Use a reasonable time frame:
– “How many cartons of orange juice did you buy in the last year?”
– “How many cartons of orange juice did you buy this week?” “Is this a normal number for you?”
• Use available cues:
– “Were you a driver or passenger in a car that was involved in an accident in the last year?”
– “Since last Thanksgiving, were you a driver or passenger in a car that was involved in an accident?”
• Be aware of the limits of human memory!
Retrieval
III. Issues in Judgment:
Constructed Preferences
• Order effects: Anchoring, contrast
• Demand Effect
• Framing
• Halo effect
• Priming
• Context effects: choice set
Hmmm…let me figure out what I think.
Judgment
Anchoring / Contrast Effects
Scale: 1--2--3--4--5--6--7 (1 = Hate it, 7 = Love it)
List 1: □ Squid Ink Ice Cream □ Lemon Tart □ Chocolate Cake □ Tiramisu □ Crème Brûlée
List 2: □ Vanilla Ice Cream □ Lemon Tart □ Chocolate Cake □ Tiramisu □ Crème Brûlée
Comprehension (past information) Affects Judgments
Judgment
Demand Effect
“What do I think you want me to say?”
Are you in favor of stem cell research?
• 69% YES (NBC news)
• 24% YES (Conference of Catholic Bishops)
• 70% YES (Juvenile Diabetes Foundation)
• 57% don’t know enough to say (Gallup)
Source: “Unbearable Lightness of Public Opinion Polls”, NYT 2001
Judgment
Leading Questions:
“What kind of person are you?”
– “Do you think that patriotic Americans should buy imported automobiles when that would put American labor out of work?”
___ yes ___ no ___ don’t know
– “Do you think that Americans should buy imported automobiles or automobiles made in this country?”
___ imported ___ domestic ___ don’t know
– “Is Colgate your favorite toothpaste?”
– “What is your favorite toothpaste brand?”
Avoid leading or biased questions
Judgment
Framing

“Sony fired 4% of its work force.”

“Sony was able to retain 96% of its work
force.”
Judgment
Context Effects: Compromise
[Figure: axes HD Space and RAM]
Preference for the middle option
Judgment
IV. Response Scale Induced Bias
How well does this brand’s product perform:
Sony   Not Well  1  2  3  4  5  Very Well
….
Rate how you feel about the following brands:
Sony   Not Well  -2  -1  0  1  2  Very Well
….
Response
Scale Usage Bias
Rate Apple on 20 attributes on a 5 point scale:
• Avg=1.2
• Avg=3.0
What does this mean?
• She likes Apple more than he does
• She has positive scale usage bias (rates more favorably)
• She has extremeness usage bias (rates high and low only)
• He was uninvolved or uninformed
Alternative: Rankings remove all (good & bad) usage variance
Response
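The ranking alternative can be sketched by converting each respondent's ratings into within-respondent ranks. The attributes and ratings below are made up; the two hypothetical respondents use different parts of the scale but order the attributes identically, so their ranks coincide.

```python
def to_ranks(ratings):
    """Convert one respondent's ratings to ranks (1 = highest rated).

    Tied ratings share the average of the ranks they would occupy.
    """
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j to the end of the group of tied ratings.
        while j + 1 < len(ordered) and ratings[ordered[j + 1]] == ratings[ordered[i]]:
            j += 1
        average_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = average_rank
        i = j + 1
    return ranks

# Made-up ratings on a 5 point scale.
her = {"design": 5, "battery": 4, "price": 3}  # uses the top of the scale
him = {"design": 3, "battery": 2, "price": 1}  # uses the bottom of the scale

her_ranks = to_ranks(her)
him_ranks = to_ranks(him)
print(her_ranks, him_ranks)
```

The difference in average level disappears, which removes scale usage bias but also removes any real difference in how much the two respondents like the brand.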
Minimizing Survey Bias:
Best Practices
• Plan order carefully
– Rotate items
– Split the sample (monadic testing)
• Multiple measures
– Ratings and rankings
– Different wording
• Consistent methods for comparisons
• Provide (realistic) context
• Recognize limitations: realistic expectations
Complexity of Questionnaire Design
• Looks easy.
• Very difficult.
• No rules can guarantee flawless questionnaire.
• Questionnaires by skilled researchers may have drawbacks.
• Researchers may discover questionnaire flaws after data collection.
Think Ahead: How Can You Tell
Whether They “Get It”?
• “Don’t Know” / “Not Apply” option
• Open-ended responses
– “Other (specify)”
– Probing questions: “Why?”
– But… coding is costly and time consuming
• Multiple methods
Bad Respondents
• Causes:
– Hurried
– Irked
– Inexperienced
– Imposter (“Gaming”)
• Clues In the Data
– Response time
– Patterns of response (all ‘3’)
– Blank / nonsensical open-ends
– Traps: usage of low incidence / bogus products; verification (“Please check ‘5’ ”); “recall test”
• What’s the right cut-off?
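A few of these clues can be checked mechanically. The sketch below is illustrative only: the function name, the field layout, and especially the 60-second cut-off are assumptions, and the right cut-off depends on the survey.

```python
def flag_respondent(answers, seconds, trap_answer, trap_expected=5, min_seconds=60):
    """Return a list of data-quality flags for one respondent.

    answers:      ratings given to the main scale items
    seconds:      total time spent on the survey
    trap_answer:  response to the verification item ("Please check '5'")
    min_seconds:  assumed speeding cut-off, to be tuned per survey
    """
    flags = []
    if seconds < min_seconds:
        flags.append("too_fast")
    if len(answers) > 1 and len(set(answers)) == 1:
        flags.append("straight_lining")  # same answer to every item, e.g. all '3'
    if trap_answer != trap_expected:
        flags.append("failed_trap")
    return flags

# Made-up respondents.
careful = flag_respondent([4, 2, 5, 3, 4], seconds=310, trap_answer=5)
hurried = flag_respondent([3, 3, 3, 3, 3], seconds=35, trap_answer=3)
print(careful, hurried)
```

Flags are clues rather than proof; a respondent with one flag may simply be fast or genuinely neutral, so several clues are usually combined before dropping anyone.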
Pre-Testing
• Dry run on a small sample
– sample members should be actual members of the target population
– ensures the questions you intend are being answered
– ensures that questions are understood by respondents
– screens for problems with flow patterns
• Verbal protocol questionnaire testing (“think aloud”)