Session 5. Causal Research and Survey Design
MKTG 3010 MARKETING RESEARCH
Marketing Research Process
Step 1: Defining the Problem
Step 2: Developing an Approach to the Problem
Step 3: Formulating a Research Design
Step 4: Doing Field Work or Collecting Data
Step 5: Preparing and Analyzing Data
Step 6: Preparing and Presenting the Report
A Comparison of Basic Research Designs
Exploratory
Objective: Discovery of ideas and insights
Characteristics: Flexible, versatile; often the front end of total research design
Methods: Expert surveys; pilot surveys; focus groups; secondary data (qualitative analysis); qualitative research
Descriptive
Objective: Describe market characteristics or functions
Characteristics: Marked by the prior formulation of specific hypotheses; preplanned and structured design
Methods: Secondary data (quantitative analysis); surveys; panels; observation and other data
Causal
Objective: Determine cause and effect relationships
Characteristics: Manipulation of independent variables and measurement of their effect on dependent variables; control of mediating variables
Methods: Experiments
Causal Research Design
The danger of mixing up causality and correlation https://www.youtube.com/watch?v=8B271L3NtAw
Is ice cream dangerous?
Why do married men live longer than single men? Is it because of marriage?
Are bed lamps a form of child abuse?
Does high self-esteem lead to good grades?
Is ice cream dangerous?
Underlying factor - weather
Why do married men live longer than single men? Is it because of marriage?
Reverse causality
Are bed lamps a form of child abuse?
Underlying factor - genetics
Does high self-esteem lead to good grades?
Reverse causality
Correlation vs causation
Correlation doesn’t imply causation
For any two correlated events A and B, the following relationships are possible:
A causes B;
B causes A;
A and B are consequences of a common cause, but do not cause each other;
There is no connection between A and B, the correlation is coincidental.
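The common-cause case can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variables, the coefficients, and the choice of swimming accidents as the second outcome are hypothetical and are not data from the lecture. Temperature drives both outcomes, so they correlate even though neither causes the other; once the contribution of temperature is removed, the correlation should largely disappear.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical common cause: daily temperature.
temperature = [rng.gauss(25, 5) for _ in range(2000)]

# Both outcomes depend on temperature only (assumed coefficients), never on each other.
ice_cream_sales = [2.0 * t + rng.gauss(0, 5) for t in temperature]
swimming_accidents = [0.5 * t + rng.gauss(0, 2) for t in temperature]

# Raw correlation between the two outcomes.
r_raw = pearson(ice_cream_sales, swimming_accidents)

# Control for the common cause by subtracting its (known, assumed) contribution.
ice_resid = [s - 2.0 * t for s, t in zip(ice_cream_sales, temperature)]
acc_resid = [a - 0.5 * t for a, t in zip(swimming_accidents, temperature)]
r_controlled = pearson(ice_resid, acc_resid)

print(f"raw correlation: {r_raw:.2f}")
print(f"after controlling for temperature: {r_controlled:.2f}")
```

In real data the contribution of the common cause is not known in advance and would have to be estimated, for example by regression, or ruled out by design through an experiment.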
Condition for causality
Concomitant variation is the extent to which a cause, X, and an effect, Y, occur together or vary together in the way predicted by the hypothesis under consideration.
The time order of occurrence condition states that the causing event must occur either before or simultaneously with the effect; it cannot occur afterwards.
The absence of other possible causal factors means that any additional or extraneous variables that influence the effect variable must be controlled.
Experimental design
An experimental design is a set of procedures specifying
the test units and how these units are to be divided into homogeneous subsamples;
what independent variables or treatments are to be manipulated;
what dependent variables are to be measured;
how the extraneous variables are to be controlled.
Definition and concepts
Independent variables are variables or alternatives that are manipulated and whose effects are measured and compared, e.g., price levels.
Test units are individuals, organizations, or other entities whose response to the independent variables or treatments is being examined, e.g., consumers or stores.
Dependent variables are the variables which measure the effect of the independent variables on the test units, e.g., sales, profits, and market shares.
Extraneous variables are all variables other than the independent variables that affect the response of the test units, e.g., store size, store location, and competitive effort.
A Classification of Experimental Designs
Experimental Designs
Pre-experimental: One-Shot Case Study; One Group Pretest-Posttest; Static Group
True Experimental: Pretest-Posttest Control Group; Posttest-Only Control Group; Solomon Four-Group
Quasi-Experimental: Time Series; Multiple Time Series
Statistical: Randomized Blocks; Latin Square; Factorial Design
True Experimental Designs
Posttest-Only Control Group Design
(EG)  R  X  O_1
(CG)  R     O_2
Time runs from left to right.
X = Treatment; O = Observation
EG = Experimental Group; CG = Control Group
R = Two-stage randomization
Treatment effect: TE = O_1 - O_2
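A minimal sketch of the posttest-only control group design follows. It assumes hypothetical store names and sales figures, and it takes the treatment effect as the mean posttest observation of the experimental group minus that of the control group. The random shuffle stands in for R, the random assignment of test units to groups.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the assignment is reproducible

# Hypothetical test units (e.g., stores).
test_units = [f"store_{i}" for i in range(20)]

# R: randomly assign test units to the experimental and control groups.
shuffled = test_units[:]
rng.shuffle(shuffled)
experimental_group = shuffled[:10]  # receives the treatment X
control_group = shuffled[10:]       # receives no treatment


def treatment_effect_posttest_only(eg_posttest, cg_posttest):
    """Mean posttest of the experimental group minus mean posttest of the control group."""
    return statistics.fmean(eg_posttest) - statistics.fmean(cg_posttest)


# Hypothetical posttest measurements (e.g., weekly unit sales).
eg_posttest = [105, 110, 98, 112, 107]
cg_posttest = [100, 101, 97, 103, 99]

te = treatment_effect_posttest_only(eg_posttest, cg_posttest)
print(f"estimated treatment effect: {te:.1f}")
```

Because there is no pretest, the design relies on randomization alone to make the two groups comparable before the treatment.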
True Experimental Designs
Pretest-Posttest with Control Group Design
(EG)  R  O_1  X  O_2
(CG)  R  O_3     O_4
Time runs from left to right.
X = Treatment; O = Observation
EG = Experimental Group; CG = Control Group
R = Two-stage randomization
Treatment effect: TE = (O_2 - O_1) - (O_4 - O_3)
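The pretest-posttest calculation can be sketched as follows. The measurements are hypothetical; the function simply subtracts the change observed in the control group from the change observed in the experimental group, which removes any change that would have happened without the treatment.

```python
import statistics


def treatment_effect_pre_post(o1, o2, o3, o4):
    """TE = (O_2 - O_1) - (O_4 - O_3), computed on group means.

    o1, o2: pretest and posttest observations of the experimental group.
    o3, o4: pretest and posttest observations of the control group.
    """
    eg_change = statistics.fmean(o2) - statistics.fmean(o1)
    cg_change = statistics.fmean(o4) - statistics.fmean(o3)
    return eg_change - cg_change


# Hypothetical measurements (e.g., weekly unit sales per store).
o1 = [100, 102, 98]    # EG pretest
o2 = [112, 115, 109]   # EG posttest, after treatment X
o3 = [101, 99, 100]    # CG pretest
o4 = [104, 103, 102]   # CG posttest, no treatment

te = treatment_effect_pre_post(o1, o2, o3, o4)
print(f"estimated treatment effect: {te:.1f}")
```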
Limitations of Experimentation
Experiments can be time consuming, particularly if the researcher is interested in measuring the long-term effects.
Experiments are often expensive. The requirements of experimental group, control group, and multiple measurements significantly add to the cost of research.
Experiments can be difficult to administer. It may be impossible to control for the effects of the extraneous variables, particularly in a field environment.
Marketing Applications of Experiments
New product launch
Controlled field experiment
Test markets (real/simulated)
Assessing advertising effectiveness
Recognition, recall, persuasion
Purchase behavior (split-cable tests)
Direct mail: Randomized offers
Online experimentation
Homework - Psychology beats business training when it comes to entrepreneurship
( https://www.economist.com/news/business/21729454-among-smallbusiness-owners-togo-least-psychology-beats-business-training-whenit-comes )
What is the research question?
Which experimental design was used?
What is the treatment? How is the treatment effect calculated?
What is the dependent variable?
Define the Information Needed
Design the Exploratory, Descriptive, and/or Causal Phases of the Research
Specify the Measurement and Scaling Procedures
Construct a Questionnaire
Specify the Sampling Process and the Sample Size
Develop a Plan of Data Analysis
Measurement and Scaling
Measure information
Gender
Brands
Price
Income
Preference
Attitude
Measurement
Measurement means assigning numbers or other symbols to characteristics of objects according to predetermined rules.
Scaling
In the social sciences, scaling is the process of measuring or ordering entities with respect to quantitative attributes or traits.
It can be described in terms of four basic characteristics:
Scaling
Description
Refers to the unique labels or descriptors that are used to designate each value of the scale (e.g., 1 = Female, 0 = Male). All scales have description.
Order
Refers to the relative sizes or positions of the descriptors. Order is denoted by descriptors such as greater than, less than, and equal to.
Distance
Means that the absolute differences between the scale descriptors are known and may be expressed in units.
Origin
Means that the scale has a unique or fixed beginning, or true zero point.
Four Types of Primary Scales
Primary Scales
Nominal Scale (lowest level of measurement)
Ordinal Scale
Interval Scale
Ratio Scale (highest level of measurement)
Scale | Rules | Applications
Nominal | Objects are either identical or different | Description/classification
Ordinal | Objects are greater or smaller | Rankings
Interval | Intervals between objects are equal (arbitrary zero) | Ratings
Ratio | Meaningful zero, for comparison of absolute magnitude | Amounts
Nominal Scale Measures
1. When at home, which TV brand do you watch most often? (circle one).
Sony
Toshiba
Panasonic
Samsung
Philips
Other
Descriptions
Each individual/object either is or is not in a group
Can’t order the labels
Statistics
Frequency tables, mode
Use percentages belonging to each group: chi-square tests, Z-test for proportions, discriminant analysis, logistic regression
Useful as a grouping variable (independent variable only): t-tests, ANOVA
Ordinal Scale Measures
1. Please rank each television brand in terms of your preference. Place a “1” by your top choice, a “2” by your second choice, etc.
_ Sony _ Toshiba _ Panasonic _ Samsung _ Philips
2. For each pair of television brands, circle the one you would be more likely to purchase.
Sony versus Toshiba
Sony versus Panasonic ...
Ranking
X is better than Y
Can’t say by how much
Statistics
Frequency tables, mode, median
Useful for ranking alternatives
Rank-order correlation, some conjoint models
Interval Scale Measures
For each brand below, indicate how much you like each brand by circling the appropriate number.
(1 = Not Like At All, 5 = Like Very Much)
Sony 1 2 3 4 5
Toshiba 1 2 3 4 5
Panasonic 1 2 3 4 5
• Ratings
– X is n units different from Y
– Distance between 2 and 3 is same as distance between 3 and 4 (or is it?)
• Statistics
– Frequency tables, mode, median, mean, standard deviation
– t-tests, regression, ANOVA, factor, cluster, some conjoint
• Can’t compare absolute magnitude (“4” ≠ 2 × “2”)
Ratio Scale Measures
How much would you be willing to pay for a 24” flat screen TV made by each company?
Sony $_______
Toshiba $_______
Panasonic $_______
Amounts, zero has meaning
X is n units different from Y
Distance between 2 and 3 is half of distance between 2 and 4
Statistics
Frequency tables, mode, median, mean, standard deviation, geometric mean
t-tests, regression, ANOVA, factor, cluster, some conjoint
Can compare absolute magnitude (“4” = 2 × “2”)
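The descriptive statistics listed for the four primary scales build on one another. The sketch below encodes that hierarchy using only Python's standard `statistics` module. The dictionary names and the `describe` helper are hypothetical and written for illustration; the mapping follows the slide lists (mode for nominal, plus median for ordinal, plus mean and standard deviation for interval, plus geometric mean for ratio).

```python
import statistics

# Descriptive statistics treated as permissible at each level of measurement.
PERMISSIBLE = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean", "stdev"},
    "ratio": {"mode", "median", "mean", "stdev", "geometric_mean"},
}

STAT_FUNCS = {
    "mode": statistics.mode,
    "median": statistics.median,
    "mean": statistics.fmean,
    "stdev": statistics.stdev,
    "geometric_mean": statistics.geometric_mean,
}


def describe(values, scale):
    """Compute only the statistics that are meaningful for the given scale."""
    return {name: STAT_FUNCS[name](values) for name in sorted(PERMISSIBLE[scale])}


# Nominal: most-watched TV brand (labels only, so only the mode applies).
brands = describe(["Sony", "Samsung", "Sony", "Toshiba"], "nominal")

# Ordinal: preference ranks (mode and median, but no mean).
ranks = describe([1, 2, 2, 3], "ordinal")

# Ratio: willingness to pay in dollars (all statistics, including geometric mean).
prices = describe([2.0, 8.0], "ratio")

print(brands)
print(ranks)
print(prices)
```

Restricting the computation by scale type is a design choice that prevents, for example, averaging brand codes that are only labels.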
Use of Rating Scales in Self-Report Measurements
Scaling Techniques
Comparative Scales: Paired Comparison; Rank Order; Constant Sum
Noncomparative Scales: Continuous Rating Scales; Itemized Rating Scales (Likert; Semantic Differential; Stapel)
Comparative scales involve the direct comparison of stimulus objects.
- e.g. Do you prefer Pepsi or Coke?
- Comparative scale data must be interpreted in relative terms and have only ordinal or rank order properties.
In noncomparative scales, each object is scaled independently of the others in the stimulus set.
- e.g. How do you feel about Coke?
- The resulting data are generally assumed to be interval or ratio scaled.
Comparative Scaling
- Nonmetric Scaling
Comparative Scaling Techniques:
Paired Comparison Scaling
A respondent is presented with two objects and asked to select one according to some criterion.
e.g. Do you prefer Pepsi or Coke?
The data obtained are ordinal in nature.
Paired comparison scaling is the most widely used comparative scaling technique.
Under the assumption of transitivity, it is possible to convert paired comparison data to a rank order.
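The conversion from paired comparisons to a rank order can be sketched by counting how often each object is chosen. The function names and the example choices are hypothetical. The sketch assumes every possible pair has been judged exactly once; under that assumption the choices are transitive exactly when the win counts are 0, 1, ..., n-1.

```python
def win_counts(preferred):
    """Count how many paired comparisons each object wins.

    `preferred` maps each judged pair (a, b) to the object that was chosen.
    """
    wins = {}
    for (a, b), winner in preferred.items():
        wins.setdefault(a, 0)
        wins.setdefault(b, 0)
        wins[winner] += 1
    return wins


def is_transitive(preferred):
    """With all pairs judged once, transitivity holds iff win counts are 0..n-1."""
    counts = sorted(win_counts(preferred).values())
    return counts == list(range(len(counts)))


def rank_order(preferred):
    """Order objects from most to least preferred by number of wins."""
    wins = win_counts(preferred)
    return sorted(wins, key=lambda obj: wins[obj], reverse=True)


# Hypothetical choices of one respondent.
choices = {
    ("Sony", "Toshiba"): "Sony",
    ("Sony", "Panasonic"): "Sony",
    ("Toshiba", "Panasonic"): "Panasonic",
}
ranking = rank_order(choices)

# A circular (intransitive) set of choices cannot be turned into a clean ranking.
cycle = {("A", "B"): "A", ("B", "C"): "B", ("A", "C"): "C"}

print(ranking, is_transitive(choices), is_transitive(cycle))
```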
Comparative Scaling Techniques:
Rank Order Scaling
Respondents are presented with several objects simultaneously and asked to order or rank them according to some criterion.
e.g. Rank the following advertisements. 1 denotes the one you like the most and 10 the one you like the least. No two ads should receive the same rank number.
Rank order scaling also results in ordinal data.
Comparative Scaling Techniques:
Constant Sum Scaling
A respondent is given a constant sum of money, scrip, credits, or points and asked to allocate these to various items.
e.g. If you had 100 HKD to spend on food products, how much would you spend on product A, on product B, on product C, etc.?
This is also an ordinal level technique.
Relative Advantages of Comparative Scales
Small differences between stimulus objects can be detected.
Same known reference points for all respondents.
Easily understood and easy to apply.
Involve fewer theoretical assumptions.
Tend to reduce halo or carryover effects from one judgment to another.
The halo effect or halo error is a cognitive bias in which one's judgments of a person’s character can be influenced by one's overall impression of him or her. (Wikipedia)
Relative Disadvantages of Comparative Scales
Ordinal nature of the data.
Inability to generalize beyond the stimulus objects scaled.
Noncomparative Scaling
- Metric Scaling
A classification of scaling techniques
Scaling Techniques
Comparative Scales
Noncomparative Scales: Continuous Rating Scales; Itemized Rating Scales (Likert; Semantic Differential; Stapel)
Continuous Rating Scale
Respondents rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to the other.
How would you rate SOGO as a department store?
Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -- - Probably the best
0 10 20 30 40 50 60 70 80 90 100
Scoring and codification are difficult.
Itemized Rating Scales- Likert Scale
Respondents are asked to indicate the amount of agreement or disagreement (from strongly agree to strongly disagree) on a five- to nine-point scale.
1 = Strongly disagree, 2 = Disagree, 3 = Neither agree nor disagree, 4 = Agree, 5 = Strongly agree
1. SOGO sells high quality merchandise. 1 2 3 4 5
2. SOGO has poor in-store service. 1 2 3 4 5
3. I like to shop at SOGO. 1 2 3 4 5
The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated.
When arriving at a total score, the categories assigned to the negative statements by the respondents should be scored by reversing the scale.
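A summated score with reverse scoring can be sketched as follows. The item keys and the respondent's ratings are hypothetical; the sketch assumes a five-point scale, so a rating r on a negative statement is rescored as 6 - r before the items are added.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # assumed five-point Likert scale


def reverse_score(rating):
    """Flip a rating so that 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - rating


def summated_score(ratings, negative_items):
    """Total score across items, reversing the negatively worded ones."""
    total = 0
    for item, rating in ratings.items():
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating for {item!r} is outside the scale: {rating}")
        total += reverse_score(rating) if item in negative_items else rating
    return total


# Hypothetical respondent: agrees on quality, disagrees that service is poor.
respondent = {
    "sells_high_quality_merchandise": 4,
    "has_poor_in_store_service": 2,   # negative statement, rescored as 4
    "like_to_shop_here": 5,
}
score = summated_score(respondent, negative_items={"has_poor_in_store_service"})
print(score)
```

After reversal a higher total consistently indicates a more favorable attitude toward the store.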
Itemized Rating Scales- Semantic Differential Scale
The semantic differential is a seven-point rating scale with end points associated with bipolar labels that have semantic meaning. e.g. Please mark (X) at the point along the scale that best describes what the store means to you. Please be sure to mark every scale; do not omit any scale.
SOGO is:
Powerful --:--:--:--:-X-:--:--: Weak
Unreliable --:--:--:--:--:-X-:--: Reliable
Modern --:--:--:--:--:--:-X-: Old-fashioned
The negative adjective or phrase sometimes appears at the left side of the scale and sometimes at the right.
This controls the tendency of some respondents, particularly those with very positive or very negative attitudes, to mark the right- or left-hand sides without reading the labels.
Itemized Rating Scales- Stapel Scale
The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.
e.g. Please evaluate how accurately each word or phrase describes the store. Select a plus number by placing an (x) beside it for the phrases you think describe the store accurately. The more accurately you think the phrase describes the store, the larger the positive number you should choose. A larger negative number indicates that the phrase does not describe the store at all.
SOGO
     +5               +5
     +4               +4
     +3               +3
     +2               +2 x
     +1               +1
High Quality     Poor Service
     -1               -1
     -2               -2
     -3               -3
     -4 x             -4
     -5               -5
The data obtained by using a Stapel scale can be analyzed in the same way as semantic differential data.
Some Unique Rating Scale Configurations
Thermometer Scale
Instructions: Please indicate how much you like McDonald’s hamburgers by coloring in the thermometer. Start at the bottom and color up to the temperature level that best indicates how strong your preference is.
Form: thermometer graduated at 100 (Like very much), 75, 50, 25, and 0 (Dislike very much)
Smiling Face Scale
Instructions: Please point to the face that shows how much you like the Barbie Doll. If you do not like the Barbie Doll at all, you would point to Face 1. If you liked it very much, you would point to Face 5.
Form: five faces numbered 1 2 3 4 5
The Big Debate: Response Scales
Likert-type scales: 3-11 categories
Even or Odd?
Even: fewer “cop-outs” vs. Odd: can express indifference
All Labeled or Ends Labeled?
All labeled: consistent interpretation vs. Ends labeled: interval not ordinal
Fewer or More?
Fewer: less interpretation error (unmotivated, uninformed, heterogeneous respondents) vs. More: can express distinctions (motivated, informed, homogeneous respondents)
Warning: In 1-10 scales, is 5 the midpoint?
Consider information needs
Use dummy tables
Plan analysis before you begin
Plan ahead for advanced methods (segmentation, conjoint, regression, etc..)
You can always do less...but never more
Consider ease of use for respondent
Choice can be easier than absolute preference
Questionnaire Design
Questionnaire Design Process
Specify the Information Needed
Specify the Type of Interviewing Method
Determine the Content of Individual Questions
Design the Question to Overcome the Respondent’s Inability and Unwillingness to Answer
Decide the Question Structure
Determine the Question Wording
Arrange the Questions in Proper Order
Identify the Form and Layout
Reproduce the Questionnaire
Eliminate Bugs by Pre-testing
Key Issues in Survey Design
Structure of the survey
Order of information
Respondent-driven design
Understanding the psychology of survey response
Question wording
Are they answering what we’re asking?
Survey Content and Order
What do you need to know?
Topic “blocks”
Deliberate redundancy, “what if”
How much do you need to ask?
Branching
Splits for length (random rotate)
Backwards marketing research
What analyses will you want to do?
Implications of structure
General,
Uninformative
New
Information
Specific
Past behavior,
Demographics
Question Sequence: Example
Location: Screeners
Type: Qualifying questions
Example: “Have you been snow skiing in the past twelve months?”
Rationale: To identify target respondents (ski owners who have skied in the past year).

Location: First few questions
Type: Warm-ups
Example: “What brand of skis do you own?”
Rationale: Easy to answer. Shows respondents survey is simple.

Location: First third of questions
Type: Transitions/focus questions
Example: “What feature do you like best about skis?”
Rationale: Relate to research objectives, slightly more effort needed to answer.

Location: Middle
Type: Difficult and complicated
Example: “Following are ten characteristics of snow skis. Please rate your skis on each characteristic using the scale below.”
Rationale: Respondent has committed to completing survey and can see that just a few questions are left.

Location: Last section
Type: Demographic/firmographic information
Example: “What is the highest level of education you have attained?”
Rationale: Some questions may be considered “personal” and respondent may leave them blank, but they are at the end of the survey.
What’s Going On In There?
NOT Retrieval of existing info ?
Recall (incomplete, distorted)
Associative processes
Constructed assessments / preferences
The Psychology of Survey Response
Comprehension
Retrieval
Judgment
Response
Source: “Self-Reports”, N. Schwarz, Am Psych 1999
I. Question Wording:
Speak Their Language
“Which of the following rationales motivate your decisions in the carbonated beverage category in majority of purchase occasions? Please check all that apply.”
“How do you usually decide which soda to buy? Please check all that apply.”
• Use respondents’ own vocabulary!
Comprehension
Question Wording:
Avoid Implicit Assumptions
An alternative that is not explicitly expressed in the options is an implicit alternative.
1. Do you like to fly when traveling short distances? (Incorrect)
2. Do you like to fly when traveling short distances, or would you rather drive? (Correct)
Comprehension
Question Wording:
Avoid Compound Questions
– “Do you believe that McYummys offers courteous service?”
Comprehension
Here you go…
@#%*!
Question Wording:
Avoid insufficiently specific questions
Which brand of shampoo do you use?
Household or personal brand?
What if more than one brand?
Use now? During last week, month, year?
At home? Away from home?
Which brand or brands of shampoo have you personally used at home during the last month? In case of more than one brand, please list all the brands that apply.
How many members are there in your family?
Comprehension
Question Wording:
Avoid Double Meanings
Do you eat at Joe’s daily?
Seven days a week?
Five days a week?
Going to Home Depot feels like being in a warehouse.
Yes, it has lots of everything at low prices.
Yes, it is noisy and crowded and dangerous.
Do you think any brand is better than Gap?
Yes, the brand Abercrombie & Fitch is better.
Yes, any brand is better.
Attributes: Neat, Fresh, Cheap, Tough, etc…
Comprehension
II. Failures of Retrieval
When in doubt: people guess-timate.
Due to passage of time
Due to uncertainty of actual timing of event
Timing of event may be uncertain (allergies?)
When mundane: people guess-timate
When salient: people over-estimate
Telescoping
Retrieval
Years suffering allergies
1) What % have suffered for 10 years?
2) What % have suffered for 9 years?
[Figure: percentage of respondents (y-axis, 0 to 25%) by years suffering allergies (x-axis, 0 to 20)]
Retrieval
Priming:
Context Affects Retrieval
Cues in the environment affect what’s available in memory.
Survey-related content
General attitudes
Retrieval
Addressing Memory Issues
Use a reasonable time frame:
Use available cues:
• Be aware of the limits of human memory!
Retrieval
III. Issues in Judgment:
Constructed Preferences
Order effects: Anchoring, contrast
Demand Effect
Framing
Halo effect
Priming
Context effects: choice set
Hmmm…let me figure out what I think.
Judgment
Rating scale: 1 - - 2 - - 3 - - 4 - - 5 - - 6 - - 7 (1 = Hate it, 7 = Love it)
Version 1: □ Squid Ink Ice Cream □ Lemon Tart □ Chocolate Cake □ Tiramisu □ Crème Brûlée
Version 2: □ Vanilla Ice Cream □ Lemon Tart □ Chocolate Cake □ Tiramisu □ Crème Brûlée
Comprehension (past information) Affects Judgments
Judgment
Demand Effect
“What do I think you want me to say?”
Are you in favor of stem cell research?
69% YES
(NBC news)
24% YES
(Conference of Catholic Bishops)
70% YES
(Juvenile Diabetes Foundation)
57% don’t know enough to say
(Gallup)
Judgment
Source: “Unbearable Lightness of Public Opinion Polls”, NYT 2001
Leading Questions:
“What kind of person are you?”
Avoid leading or biased questions
Judgment
Framing
“Sony fired 4% of its work force.”
“Sony was able to retain 96% of its work force.”
Judgment
Context Effects: Compromise
Preference for the middle option
[Figure: choice options plotted by HD space and RAM]
Judgment
IV. Response Scale Induced Bias
How well does this brand’s product perform:
Sony: Not Well 1 2 3 4 5 Very Well
….
Rate how you feel about the following brands:
Sony: Not Well -2 -1 0 1 2 Very Well
….
Response
Scale Usage Bias
Rate Apple on 20 attributes on a 5 point scale:
Avg=1.2 Avg=3.0
What does this mean?
She likes Apple more than he does
She has positive scale usage bias (rates more favorable)
She has extremeness usage bias (rates high and low only)
He was uninvolved or uninformed
Alternative: Rankings remove all (good & bad) usage variance
Response
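The point that rankings remove scale usage differences can be sketched by converting each respondent's ratings into within-respondent ranks. The ratings below are hypothetical. Two respondents who order the attributes identically but use different parts of the scale end up with the same ranks, so both the "good" variance (true differences in liking) and the "bad" variance (response style) in their scale levels are removed.

```python
def to_ranks(ratings):
    """Convert one respondent's ratings to ranks (1 = highest rating).

    Tied ratings share the average of the rank positions they occupy.
    """
    ordered = sorted(ratings, reverse=True)
    average_rank = {}
    for value in set(ratings):
        first_position = ordered.index(value) + 1
        tie_count = ordered.count(value)
        average_rank[value] = first_position + (tie_count - 1) / 2
    return [average_rank[r] for r in ratings]


# Hypothetical ratings of the same four attributes on a 5-point scale.
she = [5, 4, 5, 3]   # uses the top of the scale
he = [3, 2, 3, 1]    # same ordering, two points lower throughout

she_ranks = to_ranks(she)
he_ranks = to_ranks(he)
print(she_ranks, he_ranks)
```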
Minimizing Survey Bias: Best Practices
Plan order carefully
Rotate items
Split the sample (monadic testing)
Multiple measures
Ratings and rankings
Different wording
Consistent methods for comparisons
Provide (realistic) context
Recognize limitations: realistic expectations
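Item rotation and sample splitting are both randomization steps, and a minimal sketch is given below. The function names, item labels, and questionnaire versions are hypothetical. Each respondent sees the items in an order derived from their ID, and respondents are shuffled and then dealt evenly across the questionnaire versions.

```python
import random


def rotated_items(items, respondent_id):
    """Return the items in a random order that is reproducible per respondent."""
    rng = random.Random(respondent_id)  # seeding by ID makes the order repeatable
    order = list(items)
    rng.shuffle(order)
    return order


def assign_split(respondent_ids, versions, seed=0):
    """Randomly assign respondents to questionnaire versions in equal shares."""
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    # Deal the shuffled respondents round-robin across the versions.
    return {rid: versions[i % len(versions)] for i, rid in enumerate(ids)}


items = ["Neat", "Fresh", "Cheap", "Tough"]
order_for_respondent_7 = rotated_items(items, respondent_id=7)

split = assign_split(range(10), versions=["concept_A", "concept_B"])
count_a = sum(1 for v in split.values() if v == "concept_A")

print(order_for_respondent_7)
print(split)
```

Rotating items spreads order effects evenly across items, and a split sample lets each respondent evaluate only one version, as in monadic testing.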
Complexity of Questionnaire Design
Looks easy.
Very difficult.
No rules can guarantee flawless questionnaire.
Questionnaires by skilled researchers may have drawbacks.
Researchers may discover questionnaire flaws after data collection.
“Don’t Know” / “Not Apply” option
Open-ended responses
“Other (specify)”
Probing questions
“Why?”
But… coding is costly and time consuming
Bad Respondents
Causes: Hurried, Irked, Inexperienced, Imposter (“Gaming”)
Clues In the Data
Response time
Patterns of response (all ‘3’)
Blank / nonsensical open-ends
Traps
Usage of low incidence / bogus products
Verification: “Please check ‘5’ ”
“Recall test”
What’s the right cut-off?
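The data clues and traps above can be turned into simple screening flags. The sketch below is an assumption-laden illustration: the function name, the 60-second cut-off, and the example respondents are hypothetical, and the right cut-off remains a judgment call for each study.

```python
def flag_respondent(ratings, seconds_taken, verification_answer,
                    min_seconds=60, expected_answer=5):
    """Return a list of quality flags for one respondent.

    min_seconds and expected_answer are assumed values; tune them per study.
    """
    flags = []
    if seconds_taken < min_seconds:
        flags.append("too_fast")                 # response time clue
    if len(ratings) > 1 and len(set(ratings)) == 1:
        flags.append("straight_lining")          # same answer everywhere, e.g. all '3'
    if verification_answer != expected_answer:
        flags.append("failed_verification")      # trap item: "Please check '5'"
    return flags


# Hypothetical respondents.
careful = flag_respondent([4, 2, 5, 3, 4], seconds_taken=310, verification_answer=5)
hurried = flag_respondent([3, 3, 3, 3, 3], seconds_taken=45, verification_answer=3)

print(careful)
print(hurried)
```

Flags are better treated as prompts for review than as automatic grounds for deletion, since a single flag can arise innocently.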
Pre-Testing
Dry run on a small sample
Sample members should be actual members of the target population
Ensures the questions you intend are being answered
Ensures that questions are understood by respondents
Screens for problems with flow patterns
Verbal protocol questionnaire testing (“think aloud”)