Optimal Database Marketing Drozdenko & Drake, 2002
Copyright © 1999 by Ancell School of Business. All Rights Reserved.
Chapter 8
Segmenting the Customer
Database
Objectives
• Learn the importance and basic concepts of
segmentation
• Explore how databases are segmented
• Examine the types of segmentation schemes typically
employed by database marketers such as by
promotional product offers, life-stage marketing and
market research
• Review the appropriate analysis techniques such as
univariate and cross-tabulation analysis, formal RFM
analysis, CHAID analysis and multivariate analysis
• Examine issues when preparing to implement
segmentation schemes.
Segmentation Objectives
• Segmentation is the process of dividing
the total market into groups of people
with similar needs and desires based on
their characteristics and past purchase
behavior.
• As the segments become smaller, we
are able to develop marketing programs
that are more specific to the needs of
the segment (see Exhibit 8.1).
Conditions for Proper Segmentation
• Customers’ needs in the market must be
heterogeneous.
• There must be customer information on the database
that reflects the heterogeneous needs of the overall
market so that the market can be divided into
segments.
• There must be a way to measure the transactions or
potential transactions of this group in order to
forecast revenue from the segment.
• There must be an economical way to reach the
segment with marketing programs.
Exhibit 8.2 P&G Personalized Cosmetic Line
Segmentation Schemes
The customer file can have several overall
segmentation schemes depending on the
objective to be attained. In general, they
can be classified into three main categories:
• Promotional product offers
• Life-stage marketing
• Market research
Segmentation for Promotional
Product Offerings
Typically, house customer data is used for this type of
segmentation scheme. Two basic levels of segmentation
are employed, each serving a unique purpose:
• Corporate-level segmentation
  - concerned with issues that are common across all product lines
    within a corporation
• Product line-specific segmentation
  - recency, frequency and monetary value are important variables
    for product line segmentation
Exhibit 8.3 Corporate-Level Elimination Segments
ACME Database (UNIVERSE SIZE = 10,000,000)
• Known Frauds and Suspect High Risk Accounts (UNIVERSE SIZE: 122,435)
• Frequent Samplers (UNIVERSE SIZE: 434,782)
• Bad Credit/Bad Debt Accounts (UNIVERSE SIZE: 256,887)
• DMA Do-Not-Promotes (UNIVERSE SIZE: 67,544)
• Remaining Names (UNIVERSE SIZE: 9,118,352)
Exhibit 8.4 Names Remaining for Product Line Segmentation
ACME Database (UNIVERSE SIZE = 10,000,000)
• Known Frauds and Suspect High Risk Accounts (UNIVERSE SIZE: 122,435)
• Frequent Samplers (UNIVERSE SIZE: 434,782)
• Bad Credit/Bad Debt Accounts (UNIVERSE SIZE: 256,887)
• DMA Do-Not-Promotes (UNIVERSE SIZE: 67,544)
• Remaining Names (UNIVERSE SIZE: 9,118,352)
Product Line Segmentation
• Typically, a scheme divides the “Remaining
Names” (commonly called “promotable”
names) into groups generically defined as
the primary, secondary, tertiary, … and
finally, the conversion segment.
• These divisions are based on recency,
frequency and monetary data related to the
product line of concern.
Exhibit 8.5 Product Line Segmentation
ACME Database (UNIVERSE SIZE = 10,000,000)
• Known Frauds and Suspect High Risk Accounts (UNIVERSE SIZE: 122,435)
• Frequent Samplers (UNIVERSE SIZE: 434,782)
• Bad Credit/Bad Debt Accounts (UNIVERSE SIZE: 256,887)
• DMA Do-Not-Promotes (UNIVERSE SIZE: 67,544)
• Remaining Names (UNIVERSE SIZE: 9,118,352)
  - Most Active Video Buyer Segment
  - Least Active Video Buyer Segment
  - Non-Video Buyer Segment
Segmentation for Life-Stage
Marketing and Research
• Life-stage segmentation divides the file in a way
that considers primarily demographic and
psychographic data.
• This enables marketers to develop, market, or
advertise more relevant products and offers on the
basis of their customers’ life-stages. Segmenting a
customer file in this manner also allows a direct
marketer to understand the future needs of their
customers via research.
Life-Stage Segmentation
Life-stage segments residing on a customer database might
include:
• Young families with children
• Newly moved
• Professional 25-40 year olds, no children
• Entering the retirement years
• Children 2-5, 6-8, 9-12
• Adolescents
• College students
• Empty nesters
• New grandparents
Segmentation Examples
• The New York Times might wish to highlight in their
promotional copy “sports” coverage to professional males
and “home and fashion” coverage to professional females
based on demographically defined segments.
• A magazine publisher wishing to increase advertising sales
revenue might use such data to develop unique
demographically based segments for targeted advertising.
• The PRIZM clusters defined in Chapter 3 can be used by
retailers for determining where to place stores.
• Performance data by Trans Union as discussed in Chapter
4 can be used by creditors to segment the customer base
into those most likely and least likely to purchase a home
equity loan.
Segmentation Techniques
There are four commonly used analysis methods for
segmenting a customer file for promotional
product offers, life-stage marketing, or market
research purposes:
• Univariate and cross-tabulation analysis
• Formal RFM analysis
• CHAID analysis
• Multivariate analysis
Univariate and Cross-Tabulation Analysis
• The customer file can be segmented on a number
of variables including Recency, Frequency and
Monetary Value.
• To develop a segmentation scheme based on these
three data elements, you can create two- or three-way
cross-tabulations and divide the file based on
an analysis of historical response data in
conjunction with your marketing assumptions.
RFM Segmentation Steps
• Step 1 Create a large sample composed of past product promotions
to the group of customers you wish to segment. Each sample used
must reflect the customer’s characteristics at the point in time of the
promotion (Chapter 6). Additionally, if you sell “one-shot” items
such as books or music, you will want to include samples
representing an array of product offerings across the various genres.
• Step 2 Create a two- or three-way cross-tabulation on recency,
frequency and monetary values and display the response rates, index
values and percentages falling into each cell.
• Step 3 Define the segments by looking for natural breaks in
response rates that are meaningful with respect to the profitability
of your product line. The number of segments depends on the size of
your database. The smaller your database, the fewer segments
you will want.
• Step 4 To confirm, you will test the final segmentation scheme on
past product promotion samples not used in the analysis.
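Steps 1 and 2 might be sketched as follows. This is a minimal illustration, not the authors' procedure: the sample records are made up, the band edges are borrowed from the row and column labels of Exhibit 8.7, and treating each upper recency edge as exclusive is an assumption.

```python
from collections import defaultdict

# Hypothetical promotion-history records (illustrative values only):
# (months_since_last_purchase, number_of_past_purchases, responded)
sample = [
    (2, 12, True), (2, 1, False), (5, 3, True), (5, 3, False),
    (8, 6, False), (11, 0, False), (14, 2, False), (1, 11, True),
]

def recency_band(months):
    # Upper edges treated as exclusive (an assumption).
    for upper, label in [(3, "0-3"), (6, "3-6"), (9, "6-9"), (12, "9-12")]:
        if months < upper:
            return label
    return "12+"

def frequency_band(purchases):
    if purchases <= 1:
        return "0-1"
    if purchases <= 4:
        return "2-4"
    if purchases <= 10:
        return "5-10"
    return "11+"

def cross_tab(records):
    """Count orders and names per (frequency, recency) cell, then add
    the response rate and its index to the overall response rate."""
    cells = defaultdict(lambda: {"orders": 0, "total": 0})
    for months, purchases, responded in records:
        cell = cells[(frequency_band(purchases), recency_band(months))]
        cell["total"] += 1
        cell["orders"] += int(responded)
    overall_rr = sum(c["orders"] for c in cells.values()) / len(records)
    for cell in cells.values():
        cell["rr"] = cell["orders"] / cell["total"]
        cell["index"] = round(100 * cell["rr"] / overall_rr) if overall_rr else 0
    return dict(cells)

table = cross_tab(sample)
for key in sorted(table):
    print(key, table[key])
```

Step 3 is then a judgment call made by reading the index values cell by cell, and Step 4 reruns the same tabulation on samples held out of the analysis.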
Exhibit 8.6 Current Video Segmentation Scheme
ACME Database (UNIVERSE SIZE = 10,000,000)
• Corporate Level Eliminations (UNIVERSE SIZE: 881,648)
• Remaining Promotable Names (UNIVERSE SIZE: 9,118,352)
  - Video Buyers (UNIVERSE SIZE: 4,784,544). Group to be further segmented.
  - Non-Video Buyers (UNIVERSE SIZE: 4,333,808)
Exhibit 8.7 Cross-Tabulation of Recency and Frequency Data
Rows: Number of Past Purchases. Columns: Last Purchase Date.
RR = response rate (index to total in parentheses), Ord = orders, Tot = total names in the cell.

                       0-3 Months     3-6 Months     6-9 Months     9-12 Months    12+ Months     TOTAL
                       Ago            Ago            Ago            Ago            Ago
0-1 Purchases   RR     5.34% (106)    4.58% (91)     3.75% (75)     2.98% (59)     1.45% (29)     3.37% (67)
                Ord    285            383            428            488            139            1,723
                Tot    5,337          8,354          11,420         16,391         9,568          51,070
2-4 Purchases   RR     7.54% (150)    6.57% (131)    4.98% (99)     4.35% (87)     2.79% (56)     4.56% (91)
                Ord    361            945            1,098          1,314          721            4,439
                Tot    4,789          14,376         22,039         30,203         25,838         97,245
5-10 Purchases  RR     11.23% (224)   9.44% (188)    6.45% (128)    5.45% (109)    4.48% (89)     5.57% (111)
                Ord    76             192            801            1,418          809            3,296
                Tot    677            2,033          12,426         26,018         18,051         59,206
11+ Purchases   RR     14.71% (293)   11.46% (228)   8.82% (176)    7.01% (140)    6.34% (126)    7.30% (145)
                Ord    20             77             792            1,448          763            3,100
                Tot    136            672            8,981          20,654         12,036         42,479
TOTAL           RR     6.78% (135)    6.28% (125)    5.68% (113)    5.01% (100)    3.71% (74)     5.02% (100)
                Ord    742            1,597          3,119          4,668          2,432          12,558
                Tot    10,939         25,435         54,867         93,266         65,493         250,000
Exhibit 8.8 Creation of the Four Segments
The figure below illustrates how the product manager might combine the cells to
create four segments based on index values.
[Cross-tabulation as in Exhibit 8.7, with the 20 interior cells labeled C1 to C20
row by row: C1-C5 for 0-1 Purchases, C6-C10 for 2-4 Purchases, C11-C15 for
5-10 Purchases, and C16-C20 for 11+ Purchases. Each cell is shaded by segment.]
• Yellow = Excellent Responders
• Magenta = Good Responders
• Gray = Average Responders
• Cyan = Poor Responders
Exhibit 8.9 New Video Segmentation Scheme
ACME Database (UNIVERSE SIZE = 10,000,000)
• Corporate Level Eliminations (UNIVERSE SIZE: 881,648)
• Remaining Promotable Names (UNIVERSE SIZE: 9,118,352)
  - Video Buyers (UNIVERSE SIZE: 4,784,544), divided into four segments:
    Excellent, Good, Average, Poor
  - Never Video Buyers (UNIVERSE SIZE: 4,333,808)
Segmentation Techniques (Cont.)
Based on the cross-tabulation, how many names of the 4,784,544 video
buyers can the product manager expect to fall within the “Excellent
Responders” segment?
[Cross-tabulation with segment shading, repeated from Exhibit 8.8.]
Segmentation Techniques (Cont.)
What is the “index to total” value for the “Excellent Responder” segment?
That is, how much higher is the “Excellent Responders” segment
expected to respond versus all video buyers?
[Cross-tabulation with segment shading, repeated from Exhibit 8.8.]
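The arithmetic behind both questions can be sketched as follows. The orders and totals are the cell values from the cross-tabulation (cells C1 to C20), but the exhibit's color shading is not reproduced here, so the rule used to pick the "Excellent Responders" cells (index to total of at least 145) is purely an assumption for illustration. The numbers it produces are not the textbook's answer.

```python
# (orders, total names) for the 20 interior cells of the cross-tabulation.
cells = {
    "C1": (285, 5337), "C2": (383, 8354), "C3": (428, 11420),
    "C4": (488, 16391), "C5": (139, 9568),
    "C6": (361, 4789), "C7": (945, 14376), "C8": (1098, 22039),
    "C9": (1314, 30203), "C10": (721, 25838),
    "C11": (76, 677), "C12": (192, 2033), "C13": (801, 12426),
    "C14": (1418, 26018), "C15": (809, 18051),
    "C16": (20, 136), "C17": (77, 672), "C18": (792, 8981),
    "C19": (1448, 20654), "C20": (763, 12036),
}
SAMPLE_ORDERS = 12_558     # TOTAL row of the cross-tabulation
SAMPLE_TOTAL = 250_000     # TOTAL row of the cross-tabulation
VIDEO_BUYERS = 4_784_544   # universe of video buyers

overall_rr = SAMPLE_ORDERS / SAMPLE_TOTAL

def index_to_total(orders, total):
    """Response rate of a group relative to the overall rate, times 100."""
    return 100 * (orders / total) / overall_rr

# ASSUMPTION: "Excellent" means an index to total of 145 or more.
excellent = {label for label, (o, t) in cells.items()
             if index_to_total(o, t) >= 145}

seg_orders = sum(cells[c][0] for c in excellent)
seg_total = sum(cells[c][1] for c in excellent)

# Question 1: scale the segment's share of the sample up to the universe.
projected_names = VIDEO_BUYERS * seg_total / SAMPLE_TOTAL

# Question 2: index of the combined segment against all video buyers.
segment_index = index_to_total(seg_orders, seg_total)

print(sorted(excellent), round(projected_names), round(segment_index))
```

The same two formulas apply to whichever cells the product manager actually shades as "Excellent": share of sample times universe size for the projected count, and segment response rate over total response rate for the index.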
Formal RFM Analysis
• Formal RFM segmentation analysis is an
algorithmic analysis of customer behavior based on
the same basic customer data used in the
cross-tabulation analysis: recency of orders/purchases,
frequency of orders and monetary value of orders.
• The main advantage of Formal RFM analysis is its
ease of implementation.
• However, don’t mistake simplicity for effectiveness.
Formal RFM analysis has many drawbacks and will
not produce a segmentation scheme as powerful as
other methods.
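One common way to make RFM "formal" is to score each customer from 1 to 5 on each of the three variables and concatenate the scores into a cell code. The sketch below assumes that quintile approach; the slides do not specify the algorithm, and the customer data is hypothetical.

```python
def quintile_scores(values, higher_is_better=True):
    """Return a 1-5 score per value; 5 marks the best fifth of the file."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = 1 + (5 * rank) // len(values)
    return scores

# Hypothetical customers: (months since last order, orders, dollars spent)
customers = {
    "A": (1, 12, 600.0),
    "B": (14, 1, 20.0),
    "C": (3, 8, 310.0),
    "D": (7, 4, 150.0),
    "E": (10, 2, 45.0),
}

ids = list(customers)
r = quintile_scores([customers[c][0] for c in ids], higher_is_better=False)
f = quintile_scores([customers[c][1] for c in ids])
m = quintile_scores([customers[c][2] for c in ids])

# Concatenate the three scores into an RFM cell code such as "555".
rfm = {c: f"{r[i]}{f[i]}{m[i]}" for i, c in enumerate(ids)}
print(rfm)
```

Because the cut points come from rank alone, the scheme is easy to run but ignores response data entirely, which is one reason it tends to be weaker than the other methods.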
Hard Coded RFM
CHAID Analysis
• CHAID is an acronym for Chi-Squared
Automatic Interaction Detection, sometimes
referred to as a “tree algorithm.”
• The main benefit of performing a CHAID
analysis is that it can assist you in determining
statistically meaningful splits in your data.
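To make "statistically meaningful" concrete, the sketch below computes a Pearson chi-squared statistic for one candidate split, using the responder and non-responder counts shown in Exhibit 8.17. It illustrates only the test at the heart of the method; a full CHAID routine also merges categories and adjusts for multiple comparisons, which is not shown.

```python
def chi_square(table):
    """Pearson chi-squared statistic and degrees of freedom for a
    contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Rows: last payment within 1 year, 1 to 2 years, 2+ years.
# Columns: responders, non-responders (counts from Exhibit 8.17).
split = [
    [4362, 60168],
    [3913, 79527],
    [2622, 99408],
]

stat, dof = chi_square(split)
CRITICAL_5PCT_DF2 = 5.991  # chi-squared critical value, 2 df, 5% level
print(f"chi-square = {stat:.1f}, df = {dof}, "
      f"significant = {stat > CRITICAL_5PCT_DF2}")
```

A statistic far above the critical value indicates that response really does differ across the three branches rather than by chance.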
Exhibit 8.16 First-Level Split of CHAID Analysis
Music Sample: Quantity = 250,000, Response rate = 4.36%
• Last Payment Date (any PL) within 1 Year (64,530 @ 6.76% - 155 index to total)
• Last Payment Date (any PL) 1 to 2 Years (83,440 @ 4.69% - 108 index to total)
• Last Payment Date (any PL) 2+ Years (102,030 @ 2.57% - 59 index to total)
Exhibit 8.17 SAS Output of Tree
Node             Response   Percent    Count
Root             1          4.36%      10,900
                 0          95.64%     239,100
                 Total      100.00%    250,000
Split on LSTPAYDT:
LSTPAYDT = 1     1          6.76%      4,362
                 0          93.24%     60,168
                 Total      100.00%    64,530
LSTPAYDT = 2     1          4.69%      3,913
                 0          95.31%     79,527
                 Total      100.00%    83,440
LSTPAYDT = 3     1          2.57%      2,622
                 0          97.43%     99,408
                 Total      100.00%    102,030
Exhibit 8.18 Additional Splits for “Last Payment Date
Within 1 Year” Group Based on CHAID Analysis
Music Sample: Quantity = 250,000, Response rate = 4.36%
• Last Payment Date (any PL) within 1 Year (64,530 @ 6.76% - 155 index to total)
  - 1-6 Music Purchases Ever (24,660 @ 5.49% - 126 index to total)
  - 7+ Music Purchases Ever (39,870 @ 7.54% - 173 index to total)
• Last Payment Date (any PL) 1 to 2 Years (83,440 @ 4.69% - 108 index to total)
• Last Payment Date (any PL) 2+ Years (102,030 @ 2.57% - 59 index to total)
Exhibit 8.19 Additional Splits for “Last Payment Date
Within 1 to 2 Years” Group Based on CHAID Analysis
Music Sample: Quantity = 250,000, Response rate = 4.36%
• Last Payment Date (any PL) within 1 Year (64,530 @ 6.76% - 155 index to total)
  - 1-6 Music Purchases Ever (24,660 @ 5.49% - 126 index to total)
  - 7+ Music Purchases Ever (39,870 @ 7.54% - 173 index to total)
• Last Payment Date (any PL) 1 to 2 Years (83,440 @ 4.69% - 108 index to total)
  - 1-5 Music Purchases Ever (33,630 @ 3.10% - 71 index to total)
  - 6+ Music Purchases Ever (49,810 @ 5.77% - 132 index to total)
• Last Payment Date (any PL) 2+ Years (102,030 @ 2.57% - 59 index to total)
Exhibit 8.20 Final Segmentation Scheme for the Music
Product Line
Music Sample: Quantity = 250,000, Response rate = 4.36%
• Last Payment Date (any PL) within 1 Year (64,530 @ 6.76% - 155 index to total)
  - 1-6 Music Purchases Ever (24,660 @ 5.49% - 126 index to total)
  - 7+ Music Purchases Ever (39,870 @ 7.54% - 173 index to total)
• Last Payment Date (any PL) 1 to 2 Years (83,440 @ 4.69% - 108 index to total)
  - 1-5 Music Purchases Ever (33,630 @ 3.10% - 71 index to total)
  - 6+ Music Purchases Ever (49,810 @ 5.77% - 132 index to total)
• Last Payment Date (any PL) 2+ Years (102,030 @ 2.57% - 59 index to total)
Exhibit 8.20 Final Segmentation Scheme for the Music
Product Line
For this example, the CHAID analysis created five segments. Each segment or group was
determined so that the separation in response between segments to music promotions
was maximized and significant.
• Segment 1: Last Payment Date within 1 year and Music Purchases Ever (1-6)
  (24,660 @ 5.49% - 126 index to total)
• Segment 2: Last Payment Date within 1 year and Music Purchases Ever (7+)
  (39,870 @ 7.54% - 173 index to total)
• Segment 3: Last Payment Date within 1 to 2 years and Music Purchases Ever (1-5)
  (33,630 @ 3.10% - 71 index to total)
• Segment 4: Last Payment Date within 1 to 2 years and Music Purchases Ever (6+)
  (49,810 @ 5.77% - 132 index to total)
• Segment 5: Last Payment Date 2+ years
  (102,030 @ 2.57% - 59 index to total)
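Once the tree is final, assigning a customer to a segment is a matter of applying the split rules in order. A minimal sketch follows; the function name and argument names are hypothetical, and the handling of customers who fall exactly on a boundary (for example, exactly 1 year since last payment) is an assumption, since the exhibit does not state it.

```python
def music_segment(years_since_last_payment, music_purchases_ever):
    """Return the segment number (1-5) from the final CHAID scheme."""
    if years_since_last_payment < 1:          # within 1 year
        return 1 if music_purchases_ever <= 6 else 2
    if years_since_last_payment < 2:          # 1 to 2 years
        return 3 if music_purchases_ever <= 5 else 4
    return 5                                  # 2+ years, no further split

# Hypothetical customers: (years since last payment, music purchases ever)
examples = [(0.5, 3), (0.5, 9), (1.5, 2), (1.5, 8), (3.0, 20)]
assigned = [music_segment(y, n) for y, n in examples]
print(assigned)
```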
Factor and Cluster Analysis
• Factor and cluster analysis are more sophisticated
segmentation techniques used by savvy direct
marketers for segmenting the customer file.
• Both techniques are exploratory in nature.
• Often these techniques are used together to create
the most powerful segmentation scheme available.
Factor Analysis
• Factor analysis often reveals unusual and not readily apparent
relationships in the customer data.
• This technique will help determine the relationships among
various data elements in an attempt to summarize predictors into
fewer data elements.
• It provides a way to reduce large numbers of data elements to
fewer, more powerful data elements for input to a target model or
for the development of a segmentation scheme.
• Factor analysis reduces the data elements by creating various
linear combinations or groupings of them based on patterns seen
in the data. It does not consider response information. This
analysis is performed only on predictor variables.
Factor Analysis
The enhancement data elements to be examined for patterns
include:
• Household Size
• Household Income
• Age of Head of Household
• Children Present
• Apartment Renter
• Cooking Interest
• Wine Interest
• Home Improvement Interest
• Car Repair Interest
• Own Investment Portfolio
• Have Retirement Account
Exhibit 8.21 Resulting Factors on a Sample of 100,000 ACME Direct Names
Variable/Data Element               Factor 1 Loadings   Factor 2 Loadings
Household Size = 2                  0.85                -0.01
Household Income = $80,000 +        0.89                0.14
Cooking Interest = yes              0.56                0.13
Wine Interest = yes                 0.44                0.03
Home Improvement Interest = yes     0.05                0.78
Car Repair Interest = yes           -0.06               0.76
Age of Head of Household = 30-35    0.56                0.62
Own Investment Portfolio = yes      0.92                0.20
Have Retirement Account = yes       0.94                0.23
Children Present = Under 1 Yr.      0.11                0.89
Rent Apartment = yes                0.02                0.95
Important Data Elements in
Factor 1
♦ Household size = 2
♦ Household income = $80,000+
♦ Cooking interest = yes
♦ Wine interest = yes
♦ Age of head of household = 30–35
♦ Own investment portfolio = yes
♦ Have retirement account = yes
Important Data Elements in
Factor 2
♦ Home improvement interest = yes
♦ Car repair interest = yes
♦ Age of head of household = 30–35
♦ Children present = under 1 yr.
♦ Rent apartment = yes
Exhibit 8.22 Jones and Smith Customer Records
Customer   Home          Car      Age of Head    Children Present   Rent
           Improvement   Repair   of Household   = Under 1 Yr.      Apartment
Jones      No            Yes      33             Yes                Yes - Rent
Smith      Yes           Yes      37             No                 No - Own
Exhibit 8.23 Resulting Factor Scores for Customers Jones and Smith
Factor 2 Variables                  Jones   Jones Score   Smith   Smith Score
Home Improvement                    no      0             yes     .78
Car Repair                          yes     .76           yes     .76
Age of Head of Household = 30-35    yes     .62           no      0
Children Present = Under 1 Yr.      yes     .89           no      0
Rent Apartment                      yes     .95           no      0
TOTAL SCORE                                 3.22                  1.54
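The scores in this exhibit are simple sums: each customer receives the factor loading for every variable on which they qualify, and zero otherwise. The sketch below reproduces the two totals using the loadings from Exhibit 8.21 and the records from Exhibit 8.22; the field names are hypothetical.

```python
# Loadings of the five important variables on the "struggling" factor,
# taken from Exhibit 8.21.
LOADINGS = {
    "home_improvement": 0.78,
    "car_repair": 0.76,
    "age_30_35": 0.62,
    "child_under_1": 0.89,
    "rents_apartment": 0.95,
}

def factor_score(record, loadings):
    """Sum the loadings of every variable the customer qualifies on."""
    return sum(weight for name, weight in loadings.items() if record[name])

# Customer records from Exhibit 8.22 (Jones is 33, Smith is 37).
jones = {"home_improvement": False, "car_repair": True, "age_30_35": True,
         "child_under_1": True, "rents_apartment": True}
smith = {"home_improvement": True, "car_repair": True, "age_30_35": False,
         "child_under_1": False, "rents_apartment": False}

jones_score = round(factor_score(jones, LOADINGS), 2)
smith_score = round(factor_score(smith, LOADINGS), 2)
print(jones_score, smith_score)
```

Jones scores higher, so Jones matches this factor's profile more closely than Smith does.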
Selecting Significant Factors
• A direct marketer can also use this information in a
“select.” That is, they can identify customers on their
database who are considered to be a “young struggling
family” by selecting names that meet all criteria determined
to be important for that factor.
• For example, a direct marketer will identify such names
via a “select” by choosing those names on their database or
on an outside list that are:
• Interested in home improvement
• Interested in car repair
• Between the ages of 30 and 35
• Have a child less than 1 year old
• Rent an apartment
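Expressed in code, such a select is a filter that keeps only the names meeting every criterion. The records and field names below are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical customer records; field names are illustrative only.
customers = [
    {"name": "Jones", "home_improvement": False, "car_repair": True,
     "age": 33, "child_under_1": True, "rents": True},
    {"name": "Smith", "home_improvement": True, "car_repair": True,
     "age": 37, "child_under_1": False, "rents": False},
    {"name": "Lee", "home_improvement": True, "car_repair": True,
     "age": 31, "child_under_1": True, "rents": True},
]

def is_young_struggling_family(c):
    """True only when the customer meets all five criteria."""
    return (c["home_improvement"] and c["car_repair"]
            and 30 <= c["age"] <= 35
            and c["child_under_1"] and c["rents"])

selected = [c["name"] for c in customers if is_young_struggling_family(c)]
print(selected)
```

Note that a strict all-criteria select drops Jones, who misses only one criterion, whereas a factor score would still rank Jones highly.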
Exhibit 8.24 Graphical Presentation of Factor 1 and Factor 2
[Factor plot. Horizontal axis: Factor 1, from less to more 2 Person/Affluent.
Vertical axis: Factor 2, from less to more Struggling.
Plotted variables: Rent, Child LE 1 Yr., Home Improve, Car Repair, Age 30-35,
Retire Acct., Invest., Inc $80M+, HH Size = 2, Cooking Interest, Wine Interest.
Group labels: “Young Struggling Family” and “2 Person Household with High Income”.]
Cluster Analysis
• Cluster Analysis, as opposed to factor analysis, groups
populations (people, bank accounts, office branches, zip
codes, countries, etc.) together based on similarities in the
data. It is utilized to obtain subset segments that can be
marketed to and treated differently based on different
needs.
• Cluster Analysis calculates a statistical measure of
distance similar to the actual distance between two points
on an X/Y axis.
• Often, the analyst will first run a factor analysis to
reduce the available data elements to fewer, more
meaningful descriptors. It is these factors that are used in
the cluster analysis.
Cluster Analysis Method
• Commonly, the Cluster Analysis routine will select 2
observations (seeds) from the data set at random for a 2-cluster
solution. The routine then assigns each of the
remaining observations in the data set to one of the 2 initial
clusters, whichever it is closest to based on the key
descriptors (e.g., age, income, gender, etc.) being considered.
• Once the initial 2 clusters are created and all observations
assigned, the initial seed values are replaced by the cluster
centroids (the averages of all observations in each cluster). An
iterative process now begins, and each observation is
re-evaluated to determine whether it is in the correct cluster by
examining its distance from the other clusters based on the
key descriptor values. After several iterations, the final
clusters result.
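The routine described above corresponds closely to the k-means procedure. A minimal 2-cluster sketch follows, assuming Euclidean distance and two numeric descriptors; the observations (age, income in thousands) are made up, and a real analysis would standardize the descriptors first so that no single one dominates the distance.

```python
import math
import random

def two_cluster_kmeans(points, rng, max_iter=100):
    """Seed with 2 random observations, assign every observation to the
    nearest centroid, recompute centroids, and repeat until stable."""
    centroids = rng.sample(points, 2)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min((0, 1), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no observation changed cluster
        assignment = new_assignment
        for k in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return assignment, centroids

# Hypothetical observations: (age, income in $000)
points = [(25, 30), (27, 32), (30, 35), (60, 90), (62, 95), (65, 88)]
assignment, centroids = two_cluster_kmeans(points, random.Random(0))
print(assignment, centroids)
```

With groups this well separated, the younger, lower-income observations and the older, higher-income observations end up in different clusters regardless of which seeds are drawn.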
Determining the Appropriate
Number of Clusters
• The sample size
  - If you have only 10 observations to cluster,
    you would not be looking for a 10-cluster
    solution.
• The size of each cluster
  - A final cluster of only one person may indicate an outlier
    in your data.
• Your knowledge of the business question at hand
• Cluster profiles that make the most sense, given
your business knowledge
Try several different cluster solutions and determine
which best suits your objectives.
Cluster Analysis Example
• Proflowers.com, an Internet flower company, used to implement
its marketing programs by following traditional RFM
segmentation analysis as discussed earlier in this chapter.
• Executives at Proflowers.com wanted a better understanding of
who was buying and why in order to create more personalized
and effective marketing communications.
• Using cluster analysis, they created new customer segments and
tested various messages to them.
• As a result, Proflowers.com was able to gain a thorough
understanding of its customer base and increase response
rates.
• In addition, this analysis revealed new business
development opportunities based on the various segments’
demographic profiles, allowing Proflowers.com to partner with
companies like Omaha Steaks and the Bombay Company.
Promotional Intensity
• Segmentation is almost always used to partition out “good”
customers. As we have seen, “good” is typically defined as
customers with high past transaction levels. Promotions are
then directed to the good customers, while the “bad” (less
active) customers are ignored.
• The continual concentration of marketing efforts on the good
customers can have negative effects. With repeated contacts,
customers may begin to ignore offers if there is no current need
for the products. Further, a high contact rate may degrade the
customer perception of an organization.
• While segmentation techniques are often directed toward good
customers, it can be unwise to ignore “bad” customers. A “bad”
customer that ranks low on a segmentation scheme may actually
be a brand loyal customer that has a longer purchase cycle or a
lower overall need for a company’s products.
Too Many Products
• From the consumer’s perspective, highly
segmented markets that result in a proliferation of
products can be confusing.
• Although people who are experienced or highly
involved in the product category prefer the
selection, less experienced or less motivated
consumers may become frustrated and postpone
the purchase.
Cannibalism
• In marketing, the term “cannibalism” refers to the situation
where a company’s new product takes sales away from
existing products.
• As the company develops more products in response to
emerging market segments, the probability of cannibalism
increases. If the new product is more profitable for the
company and steals customers from the company’s less
profitable products, then cannibalism can have a beneficial
impact on overall profitability.
• On the other hand, if a company “down lines” to appeal to a
younger, less affluent segment, the danger exists that some
potential customers of more expensive (and profitable)
models will move down to the new less expensive model.
• The goal for developing new products for new or
existing segments is to draw sales from competitors, not
from other products of the company.
Overgeneralization
• Marketers must be careful not to overgeneralize the results
of a segmentation analysis. Overgeneralization might
lead the marketer to undervalue a segment.
• For example, looking only at an aggregate breakdown of
age categories, a database marketer may conclude that the
older customers on her database are not interested in
gourmet foods.
• However, when the analysis includes other variables, such
as frequency of international travel and education level,
there may be many older customers who could be
classified as good prospects for a gourmet food offering.
Ethical and Public Policy Issues
• Targeting customers based on certain psychographic or
demographic variables (e.g., children, smokers, ethnic
groups) can result in reactions from public interest groups.
• There is no question that the public is opposed to direct and
indirect targeting of children for products that are intended
for adults, such as liquor and cigarettes. However, there are
circumstances that are less obvious where targeting certain
groups has elicited public response.
• In particular, telemarketers who target the elderly have been
scrutinized because of the fraudulent practices of some
organizations, and database marketers who sell children’s
products must handle their databases carefully.
Review Questions
1. What is the importance of segmentation to marketing, and
how is it used in database marketing?
2. What variables might be used in database segmentation?
3. Describe how tabulation techniques are used to segment a
database.
4. What is RFM and how is it used to segment a database?
What are the limitations of formal RFM techniques?
5. How are factor analysis and cluster analysis used to segment
a database? What are the advantages and limitations of these
techniques?
6. Discuss some of the potential problems with the application
of segmentation methods to database marketing.