Modeling the Network of Loyalty-Profit Chain Carl Lee, Central Michigan University, Mt. Pleasant, MI Tim Rey, The Dow Chemical Company, Midland, MI Olga Tabolina, The Dow Chemical Company, Midland, MI James Mentele, Center for Applied Research & Technology, Central Michigan University Research Corporation, Mt. Pleasant, MI Tim Pletcher, Center for Applied Research & Technology, Central Michigan University Research Corporation, Mt. Pleasant, MI ABSTRACT This article presents a case study on modeling the network of cause-and-effect relationships of the loyalty-profit chain for the chemical industry. The modeling of the loyalty-profit chain has become an important research topic in marketing due to the dynamic change of the global economical marketing. The article first presents a project model and strategy and discuss how it is applied to this case study. In order to model the complex network of the potentially nonlinear and asymmetric cause-and-effect relationships, a modified neural network technique, structural neural network is developed. Detailed strategy of modeling building and evaluation is presented. A comparison between this modified neural network, the traditional neural network and regression models is presented. 1. INTRODUCTION This article presents a case study for predicting the profit through modeling the network of loyalty-profit chain for a company in the chemical industry, which will be named as Company A through out the article. The complete modeling process included three stages which spanned several years of elapsed time and was conducted in three stages. The first stage was to establish the network of the cause-and-effect relationships in the individual business study specifically for the attitudinal or performance portion of the performance-satisfaction-loyalty chain (Rey and Johnson 2002; Rey 2002). Structured equation model and partial least square model were applied to build the loyalty construct based on theoretical loyalty framework available in the marketing literature (e.g., Dick and Basu, 1994; Oliver, 1994; Oliver, 1997; Gustafsson and Johnson, 2000; 1 Gustafsson and Johnson, 2004). . The second stage took the conceptual loyalty construct from stage one as the basic network and used customer attitudinal performance data, perceived values, satisfaction, image and customers’ characteristics across the accumulation of 40+ individual business studies spanning four years to build a predictive model of loyalty index (Lee, Rey, Mentele & Garver, 2005). The third stage was to model the complete loyalty-profit chain for predicting the profit for Company A. The complete loyalty-profit network is built on the loyalty network obtained from the stage two with additional network structure connecting the loyalty construct with variables ob employee satisfaction, market orientation, and financial data. This article focuses on the third stage of modeling the complete network of loyalty-profit chain. Recent research findings in the marketing research literatures have confirmed that customer satisfaction, customer loyalty and retention are related to key measures of financial performance, such as increased sales, lower costs, and more predictable profit streams are some of the tangible benefits to the company of having loyal customers (Bejou and Palmer 1998; Terrill et al. 2000). Customer loyalty has also been documented as a source of competitive advantage and a key to firm survival and growth (Bharadwaj et al. 1993; Reichheld 1993; Reichheld, 1996; Terrill, 2000 ). Reichheld and Sasser (1990) identified numerous bottom line benefits of customer retention due to loyal and satisfied customers including willing to purchase more, paying higher prices, easier to service (thus reducing operating costs), and helping to expand the customer base by giving positive referrals. The bottom line is that Building and enhancing long-term relationships with customers generates positive returns to a company. On the other hand, how a firm should do in order to build satisfied and loyal customers? Empirical evidence in the satisfaction literature has shown that performance attributes of a firm, such as product quality, customer services, technical support, cost, and so on, are associated with customer satisfaction, which in turn impact the loyalty (see, e.g., Hanson 1992; Mittal, Ross, and Baldasare 1998, Anderson and Mittal 2000). The marketing literature also suggests that different customer segments may place different levels of importance on product and service attributes, and that for different segments attribute may have more or less impact on predicting satisfaction, loyalty, and retention. Further, different level of retained customers may have different impact on the firm’s profitability. For example, Mittal and Katrichis (2000) argue that newly acquired and loyal customers should be 2 treated as distinct segments. They present three case studies from the automotive, mutual fund, and credit card industry to show that attribute importance varies significantly between these two segments. Reichheld (1996) showed that for the credit card and insurance industry, the relationship between a customer’s duration with the firm and profitability varied significantly. By calculating the cost of maintaining customers in each segment and the revenue they generated, the firm will be able to calculate the differential profitability rates for each segment. Failure to consider segment-specific differences may lead a firm to optimize performance on the wrong attribute for a given segment (Anderson and Mittal 2000). The modeling strategy used for this project consists of a large task of data preparation, data harmonization and data processing. As indicated in data mining literature (e.g., Berry and Linoff 2000; Han and Kamber 2001), the data cleansing stage took over 80% of the project period. The network of cause and effect relationship for describing the ‘path’ that leads to profit is complicated. A modification of the neural network technique, namely, structural neural network (SNN) is developed for modeling the loyalty-profit chain. Section two presents the complete loyalty-profit network structure and discusses the motivation behind the development of the SNN technique for modeling the loyalty-profit chain. Section three discusses the project plan and the issues related to data preparation, harmonization and processing. Section four describes the SNN technique and the SNN modeling strategies for building the SNN model using the data provided by Company A. Section five summaries the results and gives a brief discussion of the findings. A brief conclusion and remarks of the SNN techniques and the lesson learned is discussed in section six. 2. THE NETWORK OF THE LOYALTY-PROFIT CHAIN There is a long history of development of the loyalty and profitability framework in the marketing research literature (e.g., Dick and Basu, 1994; Oliver, 1994; Oliver, 1997; Gustafsson and Johnson, 2000; Gustafsson and Johnson, 2004). The complete loyalty-profitability framework adopted in stage three is given in Figure 1. This framework was developed by testing and modifying various theoretical frameworks in the marketing literature using company A’s data. For more detailed discussion of the development of the framework, one may refer to Rey and Johnson (2002), Rey (2002) and Lee, et al (2005). Consistent with the literature (e.g., 3 Gustafsson and Johnson, 2000), customer perceptions of product and service attributes (technical support, customer service, availability and delivery, product quality, and cost) lead to customer perceptions of value. In turn, perceived value, ease of doing business, and the business relationship influence and predict customer satisfaction. Loyalty intentions (intentions to repurchase and recommend) are predicted by the firm’s perceived image in the marketplace, customer characteristics (type of buyer, type of firm, etc.), and their current level of customer satisfaction. Loyalty intention predicts loyalty behaviors, which in turn affects the customer’s purchase volume, level of price sensitivity, and retention. Various profitability measures are directly predicted by these variables. Figure 1: The Complete Network of the Loyalty-Profit Chain 2.1 THE NEED FOR NEW TECHNIQUES FOR MODELING THE LOYALTY-PROFIT CHAIN Regression techniques such as multiple regression and principle component regression are typical techniques for modeling profit without taking into account the network structure. Structural equation modeling (SEM) and partial least square (PLS) techniques are typical techniques for modeling profit when the network structure is considered (e.g., Johnson and Gustafsson, 2000). Gustafsson and Johnson (2004) compared multiple regression, partial least 4 square and principle component regression techniques for three different service industries. They suggested that one should not solely rely on one technique until they are carefully compared. This is mainly due to the fact that the modeling structure and the underlining assumptions are different and serve for different purposes. Lee, et. al. (2005) discussed the strengths and weakness of these traditional techniques and indicated that the strengths of these traditional techniques include (1) parameter/weight estimates are more easily interpreted, (2) models are easy to construct, and (3) in most cases, confidence level and hypothesis testing can be performed. The weakness of these techniques include (1) inability to model nonlinear relationship between inputs and targets, (2) inability to model higher order interactions effectively, (3) requirement of distribution assumptions such as normality, and (4) inability to effectively model large amounts of messy data.. Anderson and Mittal (2000) gave a thorough discussion about the nature of nonlinearity and asymmetry in the chain relationship between the constructs of attribute performance, satisfaction, loyalty and profit, and showed that the relation between each link often is nonlinear and asymmetric. For instance, the nonlinear link between attribute performance and satisfaction constructs may occur when performance increases in certain types of attributes have less of an impact on satisfaction at some point, while at other points in the chain, there are increasing returns. The nonlinear link between satisfaction and profit construct may appear in the form of diminishing returns. That is, each additional one-unit increase in an input has a smaller impact than the preceding one-unit increase. Asymmetric link occurs when the impact of an increase is different from the impact of an equivalent decrease, not only in terms of direction but also in terms of size (e.g., Anderson and Mittal, 2000). Empirical evidences have been reported in the literature in a variety of industries such as health care ((Mittal and Baldasare 1996), airlines and telephone directory service (Danaher 1998), automotive (Mittal, Ross, and Baldasare 1998), and business-to-business marketing (Kumar 1998). In the Chemical industry, similar nonlinear and asymmetric relationships also exist based on exploring the Company A’s data (Rey 2002). Marketing literature has suggested that many marketing characteristics such as customer’s satisfaction, retention rates, and profit measurements do not follow a normal distribution. In addition, the fast growth of data collected by firms not only results in a complex and messy data structure but also in a large amount of data. The traditional statistical inference and hypothesis 5 testing may no longer be appropriate in these situations. In various data mining literature (e.g., see, Hand, et, al, 2001; Hastie, 2001; Riply, 1996; Han and Kamber, 2001) a variety of techniques have been developed for dealing with problems involving large amounts of nonnormal, nonlinear, and messy data. A key feature of the loyalty-profit chain (Figure 1) is that the attribute performance, satisfaction, loyalty and financial constructs in the model are inherently abstract or latent variables. Appropriate modeling techniques need to accommodate the fact that the model is a network of cause-and-effect relationships that contains latent variables. Traditional SEM and PLS techniques are natural choices for modeling such a network (e.g., see Johnson and Gustafsson, 2000; Hahn, et al, 2002; Gustafsson and Johnson, 2004). However, the weaknesses mentioned above have caught the attention of various researchers. Alternative modeling techniques have been developed to deal with these drawbacks. For instance, Hahn, et al (2002) proposed a mixture PLS model for taking into account the difference of business segments. Ansari, et, al. (2000) proposed a hierarchical Bayesian methodology for treating heterogeneity in structural equation models. Hruschka (2001) applied a one hidden layer neural network to model net attraction. The ultimate goal of the loyalty-profit modeling project is: “To develop a predictive model for predicting profit that is capable of taking into the nonlinear and asymmetric cause-effect relationships in the loyalty-profit framework without the assumptions such as normality and homogenous variance for large and messy data for Company A.” Lee, et, al, (2005) proposed a modification of the traditional neural network, namely, a structural neural network technique (SNN) to model the loyalty construct in stage two of the project. They demonstrated that the SNN technique performs better than traditional NN models as well as regression modeling techniques. The proposed SNN technique mimics the theoretical loyalty framework, and takes into account the nonlinear and asymmetric relationships between constructs (i.e., performance to satisfaction to loyalty). Building on the success of the SNN model for the loyalty construct, it was decided to apply the same technique for modeling the complete loyalty-profit chain in stage three of this project. The major advantage of a neural network technique is that it is a universal approximator 6 (Riply, 1996) for any type of function. However, since the weight estimates of the neural network model are not meaningful for interpreting the impact of the inputs (or independent variables), it has been criticized as a ‘Black Box’ approach. Therefore, in the development of the structural neural network system, some strategies are implemented to deal with the issue of validity of the technique. 3. THE PROJECT PLANNING AND DATA PREPARATION AND EXPLORATION 3.1: Project Model and Strategy for Data Mining Project A successful data mining project requires the integration of three skills: information technology skill, domain knowledge skill and analytic skill. A data mining project team therefore must be multi-disciplinary. The project model implemented for this and other data mining projects is summarized in Figure 2. Figure 2: Project Model for Conducting Data Mining Project Client •Link to Business Strategy •Data Knowledge •Provide Corporate Data •Prioritize Need •Establish Requirements •Interpret Results •Monitor Results •Identify Actionable Events Research Discipline PROJECT • • Center & IT Expertise TEAM •Tools & IT Skills •Project Management •Summarize and Analyze • Identify data to extract •External Data •Quality Control • Create Models •Integrate Data •Discover and Explore •Security & Confidentiality As shown in Figure 2, our project model is a collaboration team approach from both business client and academic researchers. The project team for the loyalty-profit project consists of information technology support from both the research center and Company A. Similarly, the domain experts for marketing research in loyalty and profitability is also from both Company A and researchers at the center. The analytic experts are from the research center. Table 1 7 summarizes our data mining project strategy, including this loyalty-profit project. Such a strategy is common in business intelligence (e.g., Berry and Linoff, 2000). Table 1: The Data Mining Project Strategy 1. Define business problem 4. Prepare data for modeling 7. Deploy model and results 2. Build data mining database 5. Build model 8. Take action 3. Explore data 6. Evaluate model 9. Measure the results 3.2 Data Sources and Data Preparation In order to build a predictive model based on the framework in Figure 1, a variety of data sources were required. These data include customer satisfaction surveys, employee satisfaction surveys, customer complaint data, market orientation survey, purchasing volume data, price data, revenue data and cost data. Some data were at customer level and some were collected at business account level. Some data were cross-sectional, and some were longitudinal. All of the data were transformed into business account level and aggregated into a cross-sectional data structure. Table 2 summarizes the data sources and brief characteristics of the data. Table 2: Data Sources for the Loyalty-Profit Predictive Model Data Source Customer Satisfaction and Loyalty Market Orientation Employee Satisfaction Customer Complaints Account Sales/Volume Price Profit Characteristics • Cross sectional, Conducted in different sectors in different time period (a total of 40+ survey studies) • 100+ variables, 20,000+ observations (by individual within a business customer) • Cross sectional • 20+ variables, 300+ observations (business sector in Company A) • Multiple time frames • 20+ variables, 30,000+ observations (business sector in Company A) • Quarterly Over time • 30 variables, 180,000+ observations (customer level) • Quarterly Over time • 40+ of variables, 100,000+ observations (account level) • Quarterly over time • 30+ variables, 1,400,000+ observations (customer level) Quarterly Over time • 25+ of variables, 300,000+ observations (customer level) The quarterly data were collapsed into annual average data. By going through the steps of data collapsing, merging, titration data quality checking, and missing data processing, data transformation, new variable creation, and variable selection, the final data set consist of 2976 cases and 63 inputs and five target variables. The target variable presented in this article is the 8 adjusted 2001 Economic Profit (A_EP01), which is the measure of the ‘pure’ profit for the year of 2001 at the account level defined by Company A. As suggested in data mining literature (e.g., Berry and Linoff, 2000), we also experienced that over 80% of the project time was spent on the data preparation. Table 3 summarizes the inputs for each latent variable shown in the loyaltyprofit chain network in Figure 1. Table 3: The Input Variables for each Latent Variable of the Network of Loyalty-Profit Chain Latent Variable # of Inputs (data source) Latent Variable # of Inputs (Data Source) Costs Availability Technical Support 4 (Survey) 5 (Survey) 4 (Survey) Product Quality Customer Service Perceived Value 2 (Survey) 4 (Survey, Complaint data) 2 (Survey) Ease of Doing Business Customer Characteristics 7 (Survey) 3 (Survey, Internal data) Commercial Relation Customer Satisfaction 6 (Survey, Internal data) 4 (Survey) Image Business Competition 6 (Survey) 3 (Internal data Purchase Intent Retention, Attrition 2 (Survey) 8 (Internal data) Purchase Behavior Price Index 1 (Internal data) 1 (Internal data) Volume Index 1 (Internal data) These inputs were determined using both the domain knowledge from the loyalty-profit framework and the empirical data collected by Company. Most of the survey questions are in the scale of 1 to 10 with one being ‘extremely negative’ and ten being ‘extremely positive’. These data are usually distributed skewed to the left (that is, more customers were in positive or extremely positive category). The distributions of volume, price, the EP indices and other financial variables are highly skewed to the right. Accounts that are outside the 99th percentile were considered outliers. These variables are adjusted by the 99th percentile defined as: A _ VNAMEi VNAMEi / VNAME _ P99 . Where VNAMEi is the variable name with the original scale for the ith account, and VNAME_P99 is the 99th percentile of the corresponding variable. The transformed data values that are greater than one were treated as outliers and were deleted. Other inputs were also standardized using range normalization: S _ NAME NAME /( Max Min) . This transformation eliminates the effect of the different scales of the original data, and retains the variation structure of individual input. 9 3.2.1: Missing Data Processing Missing data processing is often one of the tricky issues in the data preparation. This project involves with over 40 different surveys from both business customers and internal employees in different time periods with different questions or different ways of asking the same questions, as well as different time periods of financial data and profit data. In the process of preparation of data at the account level, we encountered various missing data problems. For each type of missing data, a strategy was developed using both the domain knowledge and the property of the type of data. Some missing data were set to zero, some were deleted and some were imputed. Two imputation techniques were applied. One was by using the tree imputation technique with surrogate variables. The other was by trend analysis. For instance, missing in price at different quarter des not mean there was no price nor zero price. A time series trend imputation was used to impute the price data. Figure 3 shows an example of price imputation. Figure 3: Price Imputation using Trend Analysis Before Imputation After Imputation 15 15 10 10 5 5 0 0 0 2 4 6 8 10 12 14 16 0 2 4 6 8 10 12 14 16 3.2.2: Hostage and Mercenary Customer Identification Lee, et, al, (2005) developed an SNN model for the loyalty construct and mentioned an unsolved issue about hostage and mercenary customers. Hostage customers are those who are dissatisfied, but, have to purchase the product, while mercenary customers are those who are satisfied but tend to shopping around regardless. In modeling the profit, it is important to carefully examine these two types of customers. It was decided to take loyalty-specific segmentation approach by segmenting the customers into hostage, mercenary and ‘logical’ customers. The ‘logical’ customers segment was then used for the profit model building. A set of criteria was developed within Company A for segmenting the three types of customers. Such a segmentation strategy is necessary and was also recommended in the marketing literature (e.g., Anderson and Mittal, 2000). 10 4. MODELING STRATEGY FOR BUILDING A STRUCTURAL NEURAL NETWORK 4.1: A Brief Overview of a Neural Network Model A neural network (NN) can be considered as a two-stage nonlinear or classification model, usually represented by a network diagram. The two-stage process is, first, to derive a hidden layer of variables through a nonlinear function acting upon the linear combination of the H i. g Z ' X i. ' , i 1,2, inputs: ,N , where g is the activation function and Z is the weight matrix of the inputs. Additional layers can be derived using H i. as inputs to create two or more hidden layers. Commonly used activation functions are: Hyperbolic tangent: tanh a e a e a e a e a , Logistic function: 1 1 e , Arctangent function: 2 tan a and Elliott function: a 1 a . The 1 a target Yi . where f is modeled as the function of the linear combination of H i. defined as Yi. f W ' H i. ' is the activation function connecting hidden layers with the targets. The function , f can be taken the same as g or as an identity function. If f is taken as an identity function, Y is a linear combination of H. In modeling with a NN model, one usually normalizes both targets and inputs to eliminate the problem of different units and magnitudes among the variables. The Backpropagation algorithm is one of the earlier techniques developed to estimate the weights. Many alterative algorithms have been developed (Ripley,1996). Most algorithms for estimating the weight matrices Z and W minimize certain objective functions, which are defined as the functions of the difference between the observed values Y and predicted values Ŷ . For detailed description of neural network, one may refer to, for example, Fausett (1994), Ripley (1996), and Han and Kamber (2001). 4.2 SNN Model for the Loyalty-Profit Chain The following strategy is applied to build a structural neural network model for fitting the theoretical framework of cause-effect relationship. (1) Network Identification: The framework in Figure 1 is used as the underlying network for building the SNN model. Each node in the SNN model represents a latent variable in the framework. The layout and the number of hidden layers are determined by the framework itself. For example, the Product Quality, Cost, Customer service, Availability/Delivery and Technical Support are the nodes for the first hidden layer, which are the inputs for the second 11 hidden layer “Perceived Value”. The “Ease of Doing Business” is also a first Hidden Layer, which is the input for “Biz/ Commercial Relation. The “Perceived Value” and “Biz/Commercial Relationship” are the inputs for the third hidden layer, “Satisfaction”. (2) Determination of the number of neurons for each hidden node (latent variable): For each hidden node, the number of neurons decides the degree of approximation of the inputs to the node. The more the neurons, the better the approximation supposes to be, but the risk of over fitting also increases. Therefore, it is important to determine an adequate number of neurons for each node. Principle Component Analysis is applied to determine the number of the principle components of the input variables for each hidden node as the number of neurons for the hidden node. The percent of variation explained for choosing the number of neurons is 80% or higher. Hence, the eigenvalues may be less than one in some cases. The following Figure (Figure 4) is an example of traditional NN and a Structural Model Figure 4: Traditional NN model and SNN Model Traditional NN Model Input H1 Structural NN Model Input H2 H1 H2 Target Target Input The data mining software, SAS Enterprise Miner® is used to building the SNN model. Figure 5 gives the SNN architecture of the SNN model for the target variable, pure economic profit. Figure 5: The SNN Architecture for the Loyalty-Profit Network Using SAS/EM 12 The light blue blocks in Figure 5 represent the input variables, and the dark blue squares are the hidden layers representing the latent variables. The adjusted 2001 Pure Economic Profit (A_EP01) is the target variable (yellow block). The number inside each hidden layer is the number of neurons applied to the hidden node, which is determined using factor analysis as described in step (2) above. Notice that the structure in Figure 5 mimics the framework in Figure 1. The price and volume are combined into one latent variable. The marketing orientation is combined into the business competition latent variable. [Olga and Rey: Please revise this. I am not so sure about the Gap and Competition inputs]. Thus, based on the loyalty-profit framework, the SNN model for modeling the target A_EP01 (2001 adjusted Pure Economic Profit) has seven hidden layers. 4.3 THE MODELING STRATEGY AND ASSESSMENT The modeling strategy and evaluation are the steps 5, 6, and 7 shown in our data mining project strategy in Table 1. The following processes are considered during modeling. (a) Starting Weights and Stopping Rule: Five preliminary networks are conducted using random samples based on different seeds. The weight estimates that give the smallest error is chosen to be the initial values. This is done using the neural network options in the SAS/EM. (b) Control over fitting: A simple cross validation approach is applied to guide against over fitting. The data are split into Training (60%), Validation (20%) and Testing (20%). Other partitions are also conducted. No noticeable differences are noticed. (c) The objective function for model comparison: Three objective functions are used for model comparison. The primary objective function is the Root Average Square Error, which is also used by EM as the default for determining the final model, is defined as: ASE = SUM(yi – yi(Pred))2/n , where n is the total number of cases. The other two criteria are Root Mean Square Error (RMSE) and Schwarz Bayesian Criterion (SBC): MSE = SUM(yi – yi(Pred))2/(n-p), where p is total number of estimated weights. SSE SBC n ln p ln n n 13 (d) Dummy Variable Handling: For nominal input, deviation coding is used. For ordinal input, bathtub coding is used. For each case in the ith category, the jth dummy variable is set to 1.5C /(C 2 1) for i >j, or otherwise to 1.5C /(C 2 1) otherwise (see SAS Enterprise Miner Reference Manual for details). (e) Activation Functions: The hyperbolic tangent is used to connect the inputs and hidden nodes. Logistic activation function is used to link hidden layers and the target variable. (f) The competing models considered include (1) Linear Model with complete two-way interaction. Stepwise variable selection is applied for selecting variables. (2) Traditional NN, that is, all of the input variables are feeding into the first hidden layer. To make a proper comparison, three hidden layers, similar to the SNN, are also used. The number of neurons for each hidden layer is three, which is the SAS® neural network default, and (3) SNN having multiple neurons per node, where the number of neurons are determined using Principle Component Analysis. 5. RESULTS AND DISCUSSION Using the 60%/20%/20% data partition, the fit statistics for the traditional NN, regression and SNN models are given in Table 4. The objective function is the Average Square Error (ASE). The best model is the model that gives the smallest ASE for the validation data. Figure 6 gives the average errors for different iterations in the modeling process. The best is obtained near the 135th iteration. The test data is not included in the modeling process. It is used as an independent evaluation of the model. Table 4: The Goodness of Fit Statistics for the Three Models Model Traditional NN (3 nodes) Regression (with twoway interactions) Structural NN Degrees of Freedom Root ASE (Training) Root ASE (Validation) Schwarz Bayesian Criterion 0.2394 0.2412 -1159.6 0.1877 0.1928 -2768.3 0.1719 0.1719 -2614.9 Adjusted R-Square 0.86 [Olga and Rey, please revise this table if the numbers are not correct. I get them from the power point presentation you presented at the final presentation. In the slides, there are several different summary tables. The results are all different. I am not sure if these 14 numbers I took are the correct ones. I can not find the degrees of freedom from your final presentation. Are they available? Figure 6: The Average Square Error of each Iteration of modeling process for the Validation Data The target, adjusted 2001 Economic Profit is range normalized. Values are between 0 and 1. The root average square error for the SNN model is the lowest, which is about 17%. Regression model has error at 19%, while the traditional NN model has error at 24%. Figure 7 gives the plot between predicted profit (left) and residual plot (right) against the original profit. The residual plot shows some extreme accounts that are not fit properly. Figure 7: Plots for Predicted Profits (left) and Residuals (right) against Original Profits 15 The stepwise regression technique is also applied to select key factors that have significant impact to the prediction of profit. The adjusted R-square for the final model is at 86%. Using the adjusted R-square as the selecting criterion, Figure 8 gives the selected variables based on the Adjusted R-square (left) and the t-statistics (right). Figure 8: Selected Input Variables (left) and the corresponding t-test statistics (right) Rey and Olga: Please note: (1) The variables selected are: ______________________________________________ [Rey and Olga, please fill this, and give some discussion about these variables. ] 6. CONCLUSION This article presents a data mining project strategy and proposes a modified neural network technique, a structural neural network. The strategy and modeling technique are applied to model the network of loyalty and profit chain for a chemical company. The project management model and strategy described in this article have been successfully applied to various projects in our research center, including chemical industry, automotive industry, information technology industry, health care industry, pharmaceutical industry and others. This article discusses how the SNN architecture is built to mimics the theoretical framework that describes the cause-and-effect relationships between the constructs of performance, satisfaction, loyalty and profit chain. The SNN technique takes into account the 16 potential nonlinear and asymmetric relationships that can not be handled using the traditional SEM and PLS modeling techniques. If the relationship is nonlinear and asymmetric, then, the SNN model is shown that it performs better than others. The results of this loyalty-profit modeling project seem to suggest that the SNN technique performs better than the traditional NN model and somewhat better than the regression models. This insight is important in several aspects for the chemical industry: (1) The small advantage of SNN model against the simpler regression model suggests that the nonlinear and asymmetric relationships between constructs of the loyalty-profit chain may be as strong as originally expected. (2) Neural network technique is mainly for prediction purpose. Understanding that the causeand-effect relationships are less nonlinear suggests that one can take a two stage modeling strategy: first apply the more traditional response surface modeling methodology to search for the key drivers of between constructs and to understand the global trend of the relationship, and then, apply the SNN technique to fine turn the prediction of profit to obtain more localized profit prediction that may be more useful decision making in different divisions within the company. (3) Literature also showed that the relationships between constructs vary in different industries. For instance, the study by Goodman and associates (1995) for estimating the importance of attributes of the U.S. postal service indicated that linear relationship between attributes of the postal service and overall satisfaction is adequate. However, this does not mean that one should restrict to applying only linear techniques for modeling the loyalty-profit chain. In stead, it is important to apply a more general modeling technique that is robust to assumptions and capable of handling large and messy data structure. The traditional NN model is an empirical modeling technique. In general, the underlying theoretical framework is not taken into consideration. Instead, the traditional NN model attempts to allow the data to speak for itself. The failure of the traditional NN model sends an important message that when applying the ‘black box’ neural network modeling, it is essential to take into account the contextual and theoretical knowledge. For the loyalty-profit modeling case, it is clear that the theoretical framework provides a great deal of insight about the cause-and-effect relationships among the latent variables and input data and the targets. Structural neural network 17 techniques should be considered for any predictive modeling problems when the contextual and theoretical knowledge is available to assist in the designing the structure. This technique is applicable to other modeling problems where frameworks are well defined. REFERENCES Anderson, Eugene W. and Vikas Mittal (2000), “Strengthening the Satisfaction-Profit Chain”, Journal of Service Research, Volume 3, No. 2, November 2000 107-120. Ansari, Asim, Kamel Jedidi and Harsharan S. Jagpal (2000), “A hierarchical Bayesian methodology for treating heterogeneity in structural equation models”, Marketing Science, Vol. 19, 328 – 347. Bansal, Harvir S. and Shirley F. Taylor (1999), "The Service Provider Switching Model (SPSM): A Model of Consumer Switching Behavior in the Services Industry.," Journal of Service Research, 2 (2), 200-18. Bejou, David and Adrian Palmer (1998), "Service failure and loyalty: An exploratory empirical study of airline customers," Journal of Services Marketing, 12 (1), 7-22. Bharadwaj, S. G., P.R. Vanradarajan, and J. Fahy (1993), "Sustainable competitive advantage in service industries: conceptual model and research propositions," Journal of Marketing, 57, 83-99. Bloemer, Josee, Ko de Ruyter, and Martin Wetzels (1999), "Linking perceived service quality and service loyalty: a multi-dimensional perspective," European Journal of Marketing, 33 (11/12), 1082-106. Butcher, Ken, Beverley Sparkes, and Frances O'Callaghan (2001), "Evaluative and relational influences on service loyalty," International Journal of Service Industry Management, 12 (4), 310-27. Danaher, Peter J. (1998), “Customer Heterogeneity in Service Management,”Journal of Service Research, 1 (November), 129-39. de Ruyter, Ko, Martin Wetzels, and Josee Bloemer (1998), "On the relationship between perceived service quality, service loyalty and switching costs," International Journal of Service Industry Management, 9 (5), 436-53. Dick, Alan S. and Kunal Basu (1994), "Customer Loyalty: Toward an Integrated Conceptual Framework," Journal of the Academy of Marketing Science, 22 (2), 99-113. Fausett, L. (1994), Fundamentals of Neural Network Architectures, Algorithms, and Applications. Prentice Hall. Fornell, Claes and Jaesung Cha (1994), “Partial Least Squares,” in Advanced Methods of Marketing Research, Richard P. Bagozzi, ed. Cambridge, MA: Blackwell, 52-78. Garbarino, Ellen and Mark S. Johnson (1999), "The Different Roles of Satisfaction, Trust, and Commitment in Customer Relationships," Journal of Marketing, 63 (2), 70-87. Grossman, Randi P. (1998), "Developing and Managing Effective Consumer Relationships," Journal of Product and Brand Management, 7 (1), 27-40. Gustafsson, Anders and Michael D. Johnson (2004), “Determining Attribute Importance in a Service Satisfaction Model”, Journal of Service Research, Volume 7, No. 2, November 2004 124-141. Hand, D., H. Mannila, and P. Smyth, (2001), Principles of Data Mining. MIT Press, 2001. 18 Hastie, T., R. Tibshirani and J. Friedman (2001), The Elements of Statistical Learning Data Mining, Inference, and Prediction. Springer. Hruschka, Harald (2001), An Artificial Neural Net Attraction Model (Annam) To Analyze Market Share Effects Of Marketing Instruments, Schmalenbach Business Review u Vol. 53 u January 2001 u pp. 27 – 40 Hahn, Carsten, Michael D. Johnson, Andreas Herrmann and Frank Huber (2002), “Capturing Customer Heterogeneity Using A Finite Mixture PLS Approach”, Schmalenbach Business Review, Vol. 54, July 2002, 243 – 269. Han, J. and M. Kamber (2001). Concepts and Techniques. Morgan Kaufmann, NY. Johnson, Michael and Anders Gustafsson (2000), Improving Customer Satisfaction, Loyalty and Profit: An Integrated Measurement and Management System. San Francisco: Jossey-Bass. Jones, Tim and Shirley, F. Tayler (2003). The Conceptual Domain of Service Loyalty: How Many Dimensions? Unpublished manuscript. Keiningham, Timothy L. ,Tiffany Perkins-Munn and Heather Evans (2003), “The Impact of Customer Satisfaction on Share Of Wallet in a Business-to-Business Environment”, Journal of Service Research, Vol6, No. 1, August, 2003, 37-50 Kumar, Piyush (1998), “A Reference-Dependent Model of Business Customers’ Repurchase Intent,” working paper,William Marsh Rice University, Houston, TX. Lee, Carl, Tim Rey, James Mentele, and Michael Garver (2005). Structural Neural Network Model for Modeling Loyalty and Profitability. To appear in the Proceedings, SAS Users’ Group International Conference, Philadelphia, USA, April, 2005. Mittal, Vikas and Patrick M. Baldasare (1996), “Impact Analysis and the Asymmetric Influence of Attribute Performance on Patient Satisfaction,” Journal of Health Care Marketing, 16 (3), 24-31. Mittal, Vikas and Jerome Katrichis (2000), “Distinctions between New and Loyal Customers,” Marketing Research, 12 (Spring), 27-32. Mittal, Vikas, William T. Ross, and Patrick M. Baldasare (1998), “The Asymmetric Impact of Negative and Positive Attribute-Level Performance on Overall Satisfaction and Repurchase Intentions,” Journal of Marketing, 62 (January), 33-47. Oliver, Richard L. (1997), Satisfaction: A Behavioral Perspective on the Consumer. New York: McGraw-Hill. Oliver, Richard L (1999), "Whence Consumer Loyalty," Journal of Marketing, 63 (Special Issue), 33-44. Pritchard, Mark P., Mark E. Havitz, and Dennis R. Howard (1999), "Analyzing the commitmentloyalty link in service contexts," Journal of the Academy of Marketing Science, 27 (3), 33348. Pugesek, B. H., A. Tomer, A. and A. Von Eye (2003), Structural Equation Modeling: Applications in Ecological and Evolutionary Biology. Cambridge University Press. Sharma, Neeru and Paul G. Patterson (2000), "Switching costs, alternative attractiveness, and experience as moderators of relationship commitment in professional, consumer services.," International Journal of Service Industry Management, 11 (5), 470-90. Reichheld, Frederick (1996). The Loyalty Effect: The Hidden Source Behind Growth, Profits, and Lasting Value. Boston: Harvard Business School Press. Reichheld, Frederick F. (1994), "Loyalty and the renaissance of marketing," Marketing Management, 2 (4), 10. 19 Reichheld, Frederick F (1993), "Loyalty-based management," Harvard Business Review, 71, 6473 Reichheld, Frederick & Sasser, W. Earl (1990). “Zero Defections: Quality Comes to Services.” Harvard Business Review, September–October. Rey, T. D., (2002), “Using JMP and Enterprise Miner to Mine Customer Loyalty Data”, MidWest SAS Users Group, 13th Annual Conference, October, 14. Rey, T. D., (2004), “Tying Customer Loyalty to Financial Impact”, Symposium on Complexity and Advanced Analytics Applied to Business, Government and Public Policy Society for Industrial and Applied Mathematics, Great Lakes Section , October 23, University of Michigan, Dearborn Campus. Rey, T. D. and Johnson, M., (2002), “Modeling the Connection Between Loyalty and Financial Impact: A Journey”, Earning a Place at the Table, 23rd Annual Marketing Research Conference, American Marketing Association, September 8-11, Chicago, IL. Ripley, Brain D. (1996), Pattern Recognition and Neural Networks. Cambridge University Press. Rusbult, Caryl E., Jennifer Wieselquist, Craig A. Foster, and Betty S. Witcher (1999), "Commitment and trust in close relationships," in Handbook of interpersonal commitment and relationship stability, Jeffrey M. Adams and Warren H. Jones, Eds. New York, NY: Kluwer Academic. SAS Helps and Documentation (2004), Enterprise Miner 4.3 Reference. Terrill, Craig, Arthur Middlebrooks, and American Marketing Association. (2000), Market leadership strategies for service companies : creating growth, profits, and customer loyalty. Lincolnwood, Il.: NTC/Contemporary Publishing. Implications, May 6-7, Ann Arbor, MI. ACKNOWLEDGEMENTS This project was conducted at the Center for Applied Research & Technology Central Michigan University Research Cooperation (CMURC) at Central Michigan University. The financial support came from the Dow Chemical Company. The authors are grateful for the support of both CMURC and the Dow Chemical Company. 20