Segmentation and Targeting
• Basics
• Market Definition
• Segmentation Research and Methods
• Behavior-Based Segmentation

Market Segmentation
• Market segmentation is the subdividing of a market into distinct subsets of customers.

Segments
• Members are different between segments but similar within.

Segmentation Marketing
Definition: Differentiating your product and marketing efforts to meet the needs of different segments, that is, applying the marketing concept to market segmentation.

Primary Characteristics of Segments
• Bases: characteristics that tell us why segments differ (e.g., needs, preferences, decision processes).
• Descriptors: characteristics that help us find and reach segments.
  – Business markets: industry, size, location, organizational structure
  – Consumer markets: age/income, education, profession, lifestyles, media habits

A Two-Stage Approach in Business Markets
Macro-Segments:
• First stage/rough cut
  – Industry/application
  – Firm size
Micro-Segments:
• Second stage/fine cut
  – Different customer needs, wants, values within macro-segment

[Figure (a): Relevant Segmentation Descriptor. Variable A: Climatic Region (1. Snow Belt, 2. Moderate Belt, 3. Sun Belt). Fraction of customers in Segments 1, 2, and 3 plotted against likelihood of purchasing a solar water heater (0 to 100%).]

[Figure (b): Irrelevant Segmentation Descriptor. Variable B: Education (1. Low Education, 2. Moderate Education, 3. High Education). Same axes as (a).]

Variables to Segment and Describe Markets

Segmentation Bases
  Consumer:   Needs, wants, benefits, solutions to problems, usage situation, usage rate.
  Industrial: Needs, wants, benefits, solutions to problems, usage situation, usage rate, size*, industrial*.

Descriptors
  Demographics
    Consumer:   Age, income, marital status, family type & size, gender, social class, etc.
    Industrial: Industry, size, location, current supplier(s), technology utilization, etc.
  Psychographics
    Consumer:   Lifestyle, values, & personality characteristics.
    Industrial: Personality characteristics of decision makers.
  Behavior
    Consumer:   Use occasions, usage level, complementary & substitute products used, brand loyalty, etc.
    Industrial: Use occasions, usage level, complementary & substitute products used, brand loyalty, order size, applications, etc.
  Decision Making
    Consumer:   Individual or group (family) choice, low or high involvement purchase, attitudes and knowledge about product class, price sensitivity, etc.
    Industrial: Formalization of purchasing procedures, size & characteristics of decision making group, use of outside consultants, purchasing criteria, (de)centralized buying, price sensitivity, switching costs, etc.
  Media Patterns
    Consumer:   Level of use, types of media used, times of use, etc.
    Industrial: Level of use, types of media used, time of use, patronage at trade shows, receptivity of sales people, etc.

Segmentation in Action
We segment our customers by letter volume, by postage volume, by the type of equipment they use. Then we segment on whether they buy or lease equipment. Based on this knowledge, we target our marketing messages, fine tune our sales tactics, learn which benefits appeal to which customers and zero in on key decision makers at a company.
- Kathleen Synnot, VP, Worldwide Marketing Mailing Systems Division, Pitney Bowes, Inc. [quoted in Marketing Masters (Walden and Lawler)]

Segmentation
If you’re not thinking segments, you’re not thinking. To think segments means you have to think about what drives customers, customer groups, and the choices that are or might be available to them.
- Levitt, Marketing Imagination

STP as Business Strategy
Segmentation
• Identify segmentation bases and segment the market.
• Develop profiles of resulting segments.
Targeting
• Evaluate attractiveness of each segment.
• Select target segments.
Positioning
• Identify possible positioning concepts for each target segment.
• Select, develop, and communicate the chosen concept.
… to create and claim value

Overview of Methods for STP
• Clustering and discriminant analysis
• Choice-based segmentation
• Perceptual mapping (later)

Segmentation (for Carpet Fibers)
[Figure: scatter plot of customer values, with axes Strength (Importance) and Water Resistance (Importance). Each dot is the perceptions/ratings for one respondent. A, B, C, D mark the locations of segment centers; the distance between segments C and D is indicated.]
Typical members:
• A: schools
• B: light commercial
• C: indoor/outdoor carpeting
• D: health clubs

Targeting
[Figure: the same map, highlighting the segment(s) to serve.]

Positioning
[Figure: the same map, showing the product positioning of Us, Comp 1, and Comp 2.]

A Note on Positioning
Positioning involves designing an offering so that the target segment members perceive it in a distinct and valued way relative to competitors.
Three ways to position an offering:
1. Unique (“Only product/service with XXX”)
2. Difference (“More than twice the [feature] vs. [competitor]”)
3. Similarities (“Same functionality as [competitor]; lower price”)
What are you telling your targeted segments?

Behavior-Based Segmentation
• Traditional segmentation (e.g., demographic, psychographic)
• Needs-based segmentation
• Behavior-based segmentation (choice models)

Steps in a Segmentation Study
• Articulate a strategic rationale for segmentation (i.e., why are we segmenting this market?).
• Select a set of needs-based segmentation variables most useful for achieving the strategic goals.
• Select a cluster analysis procedure for aggregating (or disaggregating) customers into segments.
• Group customers into a defined number of different segments.
• Choose the segments that will best serve the firm’s strategy, given its capabilities and the likely reactions of competitors.
Segmentation: Methods Overview
• Factor analysis (to reduce data before cluster analysis).
• Cluster analysis to form segments.
• Discriminant analysis to describe segments.

Cluster Analysis for Segmenting Markets
• Define a measure to assess the similarity of customers on the basis of their needs.
• Group customers with similar needs. Recommended: Ward’s minimum variance criterion and, as an option, the K-Means algorithm.
• Select the number of segments using numeric and strategic criteria, and your judgment.
• Profile the needs of the selected segments (e.g., using cluster means).

Cluster Analysis Issues
• Defining a measure of similarity (or distance) between segments.
• Identifying “outliers.”
• Selecting a clustering procedure
  – Hierarchical clustering (e.g., single linkage, average linkage, and minimum variance methods)
  – Partitioning methods (e.g., K-Means)
• Cluster profiling
  – Univariate analysis
  – Multiple discriminant analysis

Doing Cluster Analysis
[Figure: respondents plotted on Dimension 1 and Dimension 2 and grouped into clusters I, II, and III. Each dot is the perceptions or ratings data from one respondent; a = distance from member to cluster center; b = distance from I to III.]

Ward’s Minimum Variance Agglomerative Clustering Procedure
(Entries are the total error sum of squares, ESS, after each candidate merge.)

First Stage:   A = 2     B = 5     C = 9     D = 10    E = 15
Second Stage:  AB = 4.5    AC = 24.5   AD = 32.0   AE = 84.5   BC = 8.0
               BD = 12.5   BE = 50.0   CD = 0.5    CE = 18.0   DE = 12.5
Third Stage:   CDA = 38.0  CDB = 14.0  CDE = 20.66  AE = 85.0  BE = 50.5  AB = 5.0
Fourth Stage:  ABCD = 41.0  ABE = 93.17  CDE = 25.18
Fifth Stage:   ABCDE = 98.8

Ward’s Minimum Variance Agglomerative Clustering Procedure
[Figure: dendrogram over A, B, C, D, E with fusion levels 0.50, 5.00, 25.18, and 98.80.]

Discriminant Analysis for Describing Market Segments
• Identify a set of “observable” variables that helps you to understand how to reach and serve the needs of selected clusters.
• Use discriminant analysis to identify underlying dimensions (axes) that maximally differentiate between the selected clusters.
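The Ward's minimum variance stages tabulated earlier for the points A = 2, B = 5, C = 9, D = 10, E = 15 can be reproduced with a few lines of code. The sketch below is in Python rather than the SAS used on these slides, and the function names (`ess`, `ward_merge_history`) are made up for illustration. It is a brute-force version for tiny one-dimensional data, not how PROC CLUSTER is implemented.

```python
from itertools import combinations


def ess(values):
    """Error sum of squares of one cluster: squared deviations from its mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def ward_merge_history(points):
    """Greedy agglomeration: at each stage, merge the two clusters whose
    union gives the smallest total within-cluster ESS."""
    clusters = [(name,) for name in points]
    history = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            merged = a + b
            rest = [c for c in clusters if c not in (a, b)]
            total = ess([points[n] for n in merged]) + sum(
                ess([points[n] for n in c]) for c in rest
            )
            if best is None or total < best[0]:
                best = (total, merged, rest)
        total, merged, rest = best
        clusters = rest + [merged]
        history.append(("".join(sorted(merged)), round(total, 2)))
    return history


# Data from the slide's first stage
points = {"A": 2, "B": 5, "C": 9, "D": 10, "E": 15}
history = ward_merge_history(points)
for label, total in history:
    print(label, total)
```

The merges come out in the order CD, AB, CDE, ABCDE, matching the fusion levels on the dendrogram. The third level is computed as 25.17, where the slide shows 25.18; the gap appears to be rounding in the slide's intermediate figures.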
Two-Group Discriminant Analysis
[Figure: customers plotted on two axes, Price Sensitivity and Need for Data Storage (x = high propensity to buy, o = low propensity to buy), separating an X-segment from an O-segment.]

Interpreting Discriminant Analysis Results
• What proportion of the total variance in the descriptor data is explained by the statistically significant discriminant axes?
• Does the model have good predictability (“hit rate”) in each cluster?
• Can you identify good descriptors to find differences between clusters? (Examine correlations between discriminant axes and each descriptor variable.)

PDA Example

PDA – Segmentation
• Performs Ward’s method. Code:

    proc cluster data=hold.pda method=ward standard outtree=treedat pseudo;
      var Innovator Use_Message Use_Cell Use_PIM Inf_Passive Inf_Active
          Remote_Acc Share_Inf Monitor Email Web M_Media Ergonomic Monthly Price;
    run;
    proc tree data=treedat;
    run;

PDA – Segmentation (alternative)
• Performs K-means method. Code:

    * the slide listed maxiter= twice (10 and 50); 50 is kept here;
    proc fastclus data=hold.pda maxc=4 random=41 maxiter=50 out=clus;
      var Innovator Use_Message Use_Cell Use_PIM Inf_Passive Inf_Active
          Remote_Acc Share_Inf Monitor Email Web M_Media Ergonomic ;
    run;
    * BY-group processing needs the data sorted by cluster;
    proc sort data=clus;
      by cluster;
    run;
    proc means data=clus;
      var Innovator Use_Message Use_Cell Use_PIM Inf_Passive Inf_Active
          Remote_Acc Share_Inf Monitor Email Web M_Media Ergonomic Monthly Price;
      by cluster;
    run;

Output
• The following clusters are quite close together and can be combined with a small loss in consumer grouping information: i) clusters 7 and 5 at 0.27, ii) clusters 1 and 6 at 0.28, iii) fused cluster 7-5 and cluster 2 at 0.34.
• However, when going from a four-cluster solution to a three-cluster solution, the distance to be bridged is much larger (1.11); thus, the four-cluster solution is indicated by the ESS.
• In addition, four seems a reasonable number of segments to handle based on managerial judgment.
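The "distance to be bridged" argument can be sketched numerically. The Python below (not the SAS used on these slides) takes the six fusion distances shown on the dendrogram for the seven starting clusters and looks for the merge where the distance jumps most in relative terms. The ratio rule and the function name `suggest_n_clusters` are assumptions made for illustration; the slides rely on reading the tree and on managerial judgment rather than on a fixed formula.

```python
def suggest_n_clusters(fusion_distances, start_clusters):
    """fusion_distances: merge distances in the order the merges occur,
    beginning from `start_clusters` clusters.
    Returns (cluster count just before the largest relative jump, that ratio)."""
    best_ratio, best_n = 0.0, start_clusters
    n = start_clusters
    for prev, nxt in zip(fusion_distances, fusion_distances[1:]):
        n -= 1  # clusters remaining after the merge made at distance `prev`
        ratio = nxt / prev
        if ratio > best_ratio:
            best_ratio, best_n = ratio, n
    return best_n, best_ratio


# Fusion distances read off the PDA dendrogram (seven clusters at the bottom)
distances = [0.27, 0.28, 0.34, 1.11, 1.45, 3.13]
n_clusters, jump = suggest_n_clusters(distances, start_clusters=7)
print(n_clusters, round(jump, 2))
```

On these numbers the largest relative jump is from 0.34 to 1.11, so the rule stops at four clusters, in line with the slide's reading. An absolute-difference rule would instead favour two clusters (3.13 minus 1.45 is the largest gap), which is one reason the numeric criterion should be combined with judgment.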
[Figure: dendrogram of clusters 1, 6, 4, 2, 5, 7, 3. Fusion distances (not to scale): 0.27, 0.28, 0.34, 1.11, 1.45, 3.13.]

Four Cluster Solution – profile code

    proc tree data=treedat nclusters=4 out=outclus noprint;
    run;
    ** create new data set (one-to-one merge: assumes both data sets are in the same observation order);
    data temp;
      merge hold.pda outclus;
    run;
    ** profile these segments (BY-group processing needs sorted data);
    proc sort data=temp;
      by cluster;
    run;
    proc means data=temp;
      var Innovator Use_Message Use_Cell Use_PIM Inf_Passive Inf_Active
          Remote_Acc Share_Inf Monitor Email Web M_Media Ergonomic Monthly Price;
      by cluster;
    run;

PDA profiles
[Table: cluster means on INNOVATOR, USE_MESSAGE, USE_CELL, USE_PIM, INF_PASSIVE, INF_ACTIVE, REMOTE_ACC, SHARE_INF, MONITOR, EMAIL, WEB, M_MEDIA, ERGONOMIC, MONTHLY, PRICE; values not recoverable.]

PDA Visual profile
[Figure: line chart of the four cluster profiles (Series1 to Series4) across the needs variables; vertical scale 0 to 2.5.]

PDA Visual profile…
[Figure: radar chart of the same four profiles across PRICE, INNOVATOR, USE_MESSAGE, USE_CELL, USE_PIM, INF_PASSIVE, INF_ACTIVE, REMOTE_ACC, SHARE_INF, MONITOR, EMAIL, WEB, M_MEDIA, ERGONOMIC, MONTHLY; scale 0 to 2.5.]

PDA profiles
• Cluster 1. Phone users who use Personal Information Management software, to whom Email and Web access, as well as Multimedia capabilities, are important.
• Cluster 2. People who use messaging services and cell phones, need remote access to information, and appreciate better monitors, but not for multimedia usage.

PDA profiles…
• Cluster 3. Pager users who have a high need for fast information sharing (receiving as well as sending) and also remote access. They use neither Email extensively, nor the Web, nor Multimedia, but do require a handy, non-bulky device.
• Cluster 4. Innovators who use cell phones a lot and have a high need for Email, Web, and Multimedia use. They also require a sleek device.

Profile based on Demos/behaviour

Name the segments
Cluster 1 – Sales Pros: Cluster 1 consists mainly of sales professionals: 54% of the cluster members indicated Sales as their occupation. They use the cell phone heavily, and many (45%) own a PDA already; practically all have access to a PC. Their work often takes them away from the office.
They mostly read two of the selected magazines: 30% read BW. From the needs data, we see that they are quite price sensitive.
Cluster 2 – Service Pros: Cluster 2 is made up primarily of service personnel (39%) and secondarily of sales personnel (23%). They use cell phones heavily, but only about one fifth currently use a PDA. They spend much time on the road and in remote locations. They read PC Magazine (29%). From the needs data, we see that they are quite price sensitive.

Name the segments…
Cluster 3 – Hard Hats: Cluster 3 is made up predominantly of construction (31%) and emergency (19%) workers. They use cell phones, but usually do not own a PDA. By the nature of their work, they have high information relay needs and generally work in remote locations. They exchange information with colleagues in the field (e.g., construction workers on the site). Many read Field & Stream (31%) and also PC Magazine. Note also from the needs data that they are the least price sensitive (willing to pay the highest price plus monthly fee) and also have the lowest income. This apparent anomaly occurs because these folks are less likely to have to pay for the device themselves, raising the question of whose preferences (their own or their employers’) will drive the adoption decision.

Name the segments…
Cluster 4 – Innovators: Cluster 4 represents early adopters (see needs data), predominantly professionals (lawyers, consultants, etc.). Every cluster member has access to a PC; 89 percent already own PDAs. They read many magazines, especially BW (49%) and PCMag (32%). Most are highly paid and highly educated.

Who to target…
• Discuss.

Interpreting Cluster Analysis Results
• Select the appropriate number of clusters:
  – Are the bases variables highly correlated? (Should we reduce the data through factor analysis before clustering?)
  – Are the clusters separated well from each other?
  – Should we combine or separate the clusters?
  – Can you come up with descriptive names for each cluster (e.g., professionals, technosavvy, etc.)?
• Segment the market independently of your ability to reach the segments (i.e., separately evaluate segmentation and discriminant analysis results).

Discrimination based on demographics/behaviour

    proc discrim data=temp outstat=outdisc method=normal pool=yes list crossvalidate;
      class cluster;
      priors prop;
      var age education etc… ;
    run;
    ** all relevant vars. not used to create segment solutions;

This allows us a way to target and profile future customers:
• The first discriminant function explains 51% of the variation. According to its coefficients, the four groups are particularly different with respect to the amount of time spent away from the office.
• In addition, the function shares high correlation with the level of education, possession of a PDA, and income.
• The second function explains 32% of the variance and primarily distinguishes the occupation types construction/emergency from sales/service, and the third function separates Sales and Service types.

Visualising relationships

Correspondence Analysis
• Provides a graphical summary of the interactions in a table
• Also known as a perceptual map
  – But so are many other charts
• Can be very useful
  – E.g.
    to provide overview of cluster results
• However the correct interpretation is less than intuitive, and this leads many researchers astray

Four Clusters (imputed, normalised)
[Figure: correspondence analysis map placing Clusters 1 to 4 among Usage 1 to 10 and Reason 1 to 15. The two axes account for 53.8% and 25.3% (2D fit = 79.1%); a marker flags points with correlation < 0.50.]

Interpretation
• Correspondence analysis plots should be interpreted by looking at points relative to the origin
  – Points that are in similar directions are positively associated
  – Points that are on opposite sides of the origin are negatively associated
  – Points that are far from the origin exhibit the strongest associations
• Also the results reflect relative associations, not just which rows are highest or lowest overall

Software for Correspondence Analysis
• Earlier chart was created using a specialised package called BRANDMAP
• Can also do correspondence analysis in most major statistical packages
• For example, using PROC CORRESP in SAS:

    *---Perform Simple Correspondence Analysis (Example 1 in SAS OnlineDoc)---;
    proc corresp all data=Cars outc=Coor;
      tables Marital, Origin;
    run;
    *---Plot the Simple Correspondence Analysis Results---;
    %plotit(data=Coor, datatype=corresp)

[Figure: Cars by Marital Status]

Segmentations: Other details

Tandem Segmentation
• One general method is to conduct a factor analysis, followed by a cluster analysis
• This approach has been criticised for losing information and not yielding as much discrimination as cluster analysis alone
• However it can make it easier to design the distance function, and to interpret the results

Tandem k-means Example

    proc factor data=datafile n=6 rotate=varimax round reorder flag=.54 scree out=scores;
      var reasons1-reasons15 usage1-usage10;
    run;
    * in PROC FASTCLUS, seed= names a data set of initial seeds,
      so the random start is set with replace=random random= instead;
    proc fastclus data=scores maxc=4 replace=random random=109162319 maxiter=50;
      var factor1-factor6;
    run;

• Have used the
  default unweighted Euclidean distance function, which is not sensible in every context
• Also note that k-means results depend on the initial cluster centroids (determined here by the seed)
• Typically k-means is very prone to local minima
  – Run at least 20 times to ensure a reasonable minimum

Cluster Analysis Options
• There are several choices of how to form clusters in hierarchical cluster analysis
  – Single linkage
  – Average linkage
  – Density linkage
  – Ward’s method
  – Many others
• Ward’s method (like k-means) tends to form equal sized, roundish clusters
• Average linkage generally forms roundish clusters with equal variance
• Density linkage can identify clusters of different shapes

[Figure: two panels, titled FASTCLUS and Density Linkage.]

Cluster Analysis Issues
• Distance definition
  – Weighted Euclidean distance often works well, if weights are chosen intelligently
• Cluster shape
  – Shape of clusters found is determined by method, so choose method appropriately
• Hierarchical methods usually take more computation time than k-means
• However multiple runs are more important for k-means, since it can be badly affected by local minima
• Adjusting for response styles can also be worthwhile
  – Some people give more positive responses overall than others
  – Clusters may simply reflect these response styles unless this is adjusted for, e.g.
    by standardising responses across attributes for each respondent

MVA - FASTCLUS
• PROC FASTCLUS in SAS tries to minimise the root mean square difference between the data points and their corresponding cluster means
  – Iterates until convergence is reached on this criterion
  – However it often reaches a local minimum
  – Can be useful to run many times with different seeds and choose the best set of clusters based on this RMS criterion
• See http://en.wikipedia.org/wiki/K-means_clustering for more k-means issues

Iteration History from FASTCLUS

                          Relative Change in Cluster Seeds
Iteration   Criterion       1        2        3        4        5
-------------------------------------------------------------------
    1        0.9645      1.0436   0.7366   0.6440   0.6343   0.5666
    2        0.8596      0.3549   0.1727   0.1227   0.1246   0.0731
    3        0.8499      0.2091   0.1047   0.1047   0.0656   0.0584
    4        0.8454      0.1534   0.0701   0.0785   0.0276   0.0439
    5        0.8430      0.1153   0.0640   0.0727   0.0331   0.0276
    6        0.8414      0.0878   0.0613   0.0488   0.0253   0.0327
    7        0.8402      0.0840   0.0547   0.0522   0.0249   0.0340
    8        0.8392      0.0657   0.0396   0.0440   0.0188   0.0286
    9        0.8386      0.0429   0.0267   0.0324   0.0149   0.0223
   10        0.8383      0.0197   0.0139   0.0170   0.0119   0.0173

Convergence criterion is satisfied.
Criterion Based on Final Seeds = 0.83824

Results from Different Initial Seeds

19th run of 5 segments
Cluster Means
Cluster    FACTOR1    FACTOR2    FACTOR3    FACTOR4    FACTOR5    FACTOR6
--------------------------------------------------------------------------
   1      -0.17151    0.86945   -0.06349    0.08168    0.14407    1.17640
   2      -0.96441   -0.62497   -0.02967    0.67086   -0.44314    0.05906
   3      -0.41435    0.09450    0.15077   -1.34799   -0.23659   -0.35995
   4       0.39794   -0.00661    0.56672    0.37168    0.39152   -0.40369
   5       0.90424   -0.28657   -1.21874    0.01393   -0.17278   -0.00972

20th run of 5 segments
Cluster Means
Cluster    FACTOR1    FACTOR2    FACTOR3    FACTOR4    FACTOR5    FACTOR6
--------------------------------------------------------------------------
   1       0.08281   -0.76563    0.48252   -0.51242   -0.55281    0.64635
   2       0.39409    0.00337    0.54491    0.38299    0.64039   -0.26904
   3      -0.12413    0.30691   -0.36373   -0.85776   -0.31476   -0.94927
   4       0.63249    0.42335   -1.27301    0.18563    0.15973    0.77637
   5      -1.20912    0.21018   -0.07423    0.75704   -0.26377    0.13729

Howard-Harris Approach
• Provides automatic approach to choosing seeds for k-means clustering
• Chooses initial seeds by fixed procedure
  – Takes variable with highest variance, splits the data at the mean, and calculates centroids of the resulting two groups
  – Applies k-means with these centroids as initial seeds
  – This yields a 2 cluster solution
  – Choose the cluster with the higher within-cluster variance
  – Choose the variable with the highest variance within that cluster, split the cluster as above, and repeat to give a 3 cluster solution
  – Repeat until have reached a set number of clusters
• I believe this approach is used by the ESPRI software package (after variables are standardised by their range)

Another “Clustering” Method
• One alternative approach to identifying clusters is to fit a finite mixture model
  – Assume the overall distribution is a mixture of several normal distributions
  – Typically this model is fit using some variant of the EM algorithm
• E.g.
  weka.clusterers.EM method in WEKA data mining package
• See WEKA tutorial for an example using Fisher’s iris data
• Advantages of this method include:
  – Probability model allows for statistical tests
  – Handles missing data within model fitting process
  – Can extend this approach to define clusters based on model parameters, e.g. regression coefficients
• Also known as latent class modeling

Segmentations via Choice Modelling

Choice Models
1. Observe choice:
   – Buy/not buy => direct marketers
   – Brand bought => packaged goods, ABB
2. Capture related data:
   – demographics
   – attitudes/perceptions
   – market conditions (price, promotion, etc.)
3. Link 1 to 2 via “choice model”: the model reveals importance weights of characteristics

Choice Models vs Surveys
With standard survey methods . . .
  – importance weights: observe/ask
  – perceptions: observe/ask
  – preference/choice: predict
But with choice models . . .
  – choice: observe
  – perceptions: observe/ask
  – importance weights: infer

Behavior-Based Segmentation Model
Stage 1: Screen products using key attributes to identify the “consideration set of suppliers” for each type of customer.
Stage 2: Assume that customers (of each type) will choose suppliers to maximize their utility via a random utility model.

$$U_{ij} = V_{ij} + \varepsilon_{ij}$$

where:
  $U_{ij}$ = Utility that customer i has for supplier j’s product.
  $V_{ij}$ = Deterministic component of utility that is a function of product and supplier attributes.
  $\varepsilon_{ij}$ = An error term that reflects the non-deterministic component of utility.

Specification of the Deterministic Component of Utility

$$V_{ij} = \sum_{k} w_k b_{ijk}$$

where:
  i = an index to represent customers, j is an index to represent suppliers, and k is an index to represent attributes.
  $b_{ijk}$ = i’s perception of attribute k for supplier j.
  $w_k$ = estimated coefficient to represent the impact of $b_{ijk}$ on the utility realized for attribute k of supplier j for customer i.
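As a small numerical illustration of the specification above, the Python sketch below computes the deterministic utility for one customer and three suppliers, then turns the utilities into logit choice shares of the kind the multinomial logit slide that follows describes. All numbers, the supplier labels, and the function names are hypothetical, and the weights are simply assumed rather than estimated by maximum likelihood.

```python
import math


def deterministic_utility(weights, perceptions):
    """V_ij = sum over attributes k of w_k * b_ijk (one customer i, one supplier j)."""
    return sum(w * b for w, b in zip(weights, perceptions))


def choice_probabilities(utilities):
    """Logit shares: exp(V_ij) divided by the sum of exp(V_ik) over all suppliers."""
    top = max(utilities)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(v - top) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical importance weights w_k for two attributes
weights = [0.8, -0.5]
# Hypothetical perceptions b_ijk of one customer for three suppliers
perceptions = {
    "Us": [4.0, 3.0],
    "Comp 1": [3.0, 2.0],
    "Comp 2": [2.0, 1.0],
}

utilities = {s: deterministic_utility(weights, b) for s, b in perceptions.items()}
shares = dict(zip(utilities, choice_probabilities(list(utilities.values()))))
for supplier in shares:
    print(supplier, round(utilities[supplier], 2), round(shares[supplier], 3))
```

The shares always sum to one, and a supplier with higher deterministic utility always receives a higher predicted share, which is what makes these probabilities usable as a basis for the switchability segments discussed next.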
A Key Result from this Specification: The Multinomial Logit (MNL) Model
If customer’s past choices are assumed to reflect the principle of utility maximization and the error ($\varepsilon_{ij}$) has a specific form called double exponential, then:

$$\hat{p}_{ij} = \frac{e^{\hat{V}_{ij}}}{\sum_{k} e^{\hat{V}_{ik}}}$$

where:
  $\hat{p}_{ij}$ = probability that customer i chooses supplier j.
  $\hat{V}_{ij}$ = estimated value of utility (i.e., based on estimates of $b_{ijk}$) obtained from maximum likelihood estimation.
  (In the sum, k runs over the suppliers in customer i’s consideration set.)

Applying the MNL Model in Segmentation Studies
Key idea: Segment on the basis of probability of choice:
1. Loyal to us
2. Loyal to competitor
3. Switchables: loseable/winnable customers

Switchability Segmentation
[Figure: Current Product-Market by Switchability, divided into Loyal to Us, Losable, Winnable Customers (business to gain), and Loyal to Competitor.]
Questions: Where should your marketing efforts be focused? How can you segment the market this way?

Using Choice-Based Segmentation for Database Marketing

             A             B                  C         D
Customer     Purchase      Average Purchase   Margin    Customer Profitability
             Probability   Volume                       = A × B × C
   1           30%          $31.00             0.70       $6.51
   2            2%         $143.00             0.60       $1.72
   3           10%          $54.00             0.67       $3.62
   4            5%          $88.00             0.62       $2.73
   5           60%          $20.00             0.58       $6.96
   6           22%          $60.00             0.47       $6.20
   7           11%          $77.00             0.38       $3.22
   8           13%          $39.00             0.66       $3.35
   9            1%         $184.00             0.56       $1.03
  10            4%          $72.00             0.65       $1.87

Managerial Uses of Segmentation Analysis
• Select attractive segments for focused effort (can use models such as Analytic Hierarchy Process or GE Planning Matrix).
• Develop a marketing plan (4P’s and positioning) to target selected segments.
  – In consumer markets, we typically rely on advertising and channel members to selectively reach targeted segments.
  – In business markets, we use sales force and direct marketing. You can use the results from the discriminant analysis to assign new customers to one of the segments.

Checklist for Segmentation Studies
• Is it values, needs, or choice-based? Whose values and needs?
• Is it a projectable sample?
• Is the study valid? (Does it use multiple methods and multiple measures?)
• Are the segments stable?
• Does the study answer important marketing questions (product design, positioning, channel selection, sales force strategy, sales forecasting)?
• Are segmentation results linked to databases?
• Is this a one-time study or is it a part of a long-term program?

Concluding Remarks
In summary,
• Use needs variables to segment markets.
• Select segments taking into account both the attractiveness of segments and the strengths of the firm.
• Use descriptor variables to develop a marketing plan to reach and serve chosen segments.
• Develop mechanisms to implement the segmentation strategy on a routine basis (one way to do this is through information technology).
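As one illustration of implementing segmentation routinely through information technology, the customer profitability calculation from the database marketing slide (profitability = purchase probability × average purchase volume × margin) can be scripted and used to rank customers. The Python sketch below uses the ten customers from that slide; the function name and the idea of ranking by expected profitability are assumptions for illustration, not a procedure the slides prescribe.

```python
# (customer, purchase probability A, average purchase volume B in $, margin C)
customers = [
    (1, 0.30, 31.00, 0.70),
    (2, 0.02, 143.00, 0.60),
    (3, 0.10, 54.00, 0.67),
    (4, 0.05, 88.00, 0.62),
    (5, 0.60, 20.00, 0.58),
    (6, 0.22, 60.00, 0.47),
    (7, 0.11, 77.00, 0.38),
    (8, 0.13, 39.00, 0.66),
    (9, 0.01, 184.00, 0.56),
    (10, 0.04, 72.00, 0.65),
]


def expected_profitability(probability, volume, margin):
    """Column D of the slide's table: A x B x C."""
    return probability * volume * margin


# Rank customers from most to least profitable in expectation
ranked = sorted(
    ((cid, expected_profitability(p, v, m)) for cid, p, v, m in customers),
    key=lambda row: row[1],
    reverse=True,
)
for cid, profit in ranked:
    print(f"customer {cid:2d}: ${profit:.2f}")
```

Customer 9 has the largest purchase volume but ranks last because the purchase probability is only 1%, while customer 5 ranks first on a small volume. This is the point of weighting by choice probability rather than by volume alone.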