Consumer Personality and Research Methods 2005 Conference
Dubrovnik, Croatia, September 20-24, 2005
http://www.cpr2005.info

Abstract

Data-mining in direct marketing: A comparison of RFM, CHAID, and logistic regression

John A. McCarty
The College of New Jersey, School of Business, NJ, USA
mccarty@tcnj.edu

Manoj Hastak
American University, Kogod College of Business, Washington, DC, USA

Keywords: data-mining, statistical techniques, RFM
Type of contribution: Talk / Paper presentation

Abstract: The field of direct marketing has become more efficient in recent years because of the development of database marketing techniques. These data-mining approaches have allowed direct marketers to better segment their current customers and to develop marketing strategies tailored to particular segments and/or individuals. In recent years, database marketing techniques have evolved from simple RFM (recency, frequency, and monetary value) models to statistical techniques such as chi-square automatic interaction detection (CHAID) and logistic regression. In spite of recent statistical advances in data-mining, marketers continue to employ RFM, primarily because it is easy to implement and because managers can readily understand the results of an RFM analysis. It has therefore been argued that the simplicity of RFM has been emphasized, while its efficiency relative to statistical techniques has not been considered to the extent that it should be. Although the efficiency of RFM has been questioned, little research has documented its performance relative to newer statistical techniques. The current study evaluates RFM, comparing it to CHAID and logistic regression, in an effort to understand its capabilities as a database marketing analytical tool. The analysis involves two customer data sets, each with approximately 100,000 customer records.
We test one RFM procedure, which divides customers into cells (or nodes) as a function of their recency of purchase, frequency of purchase, and monetary value (the amount of money they have spent). These variables are evaluated in terms of their ability to predict customer response. The study compares the lift in customer response obtained with RFM to the lift provided by CHAID and logistic regression. Using a catalog marketer’s database and a nonprofit marketer’s database, the study shows that RFM performed well relative to the statistical techniques. The results are considered in light of the distribution-free nature of RFM, whereas the statistical techniques assume a linear relationship between response and recency, frequency, and monetary value.
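
As an illustration of the cell-based procedure and the lift measure described above, the following minimal Python sketch may help. It is not the authors' implementation. The independent quintile scoring (1 to 5 on each variable), the train/holdout split, the synthetic data, and all function and field names (quantile_scores, rfm_cells, recency_days, and so on) are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def quantile_scores(values, n_bins=5):
    """Rank values ascending and map each to a score 1..n_bins (n_bins = largest values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = rank * n_bins // len(values) + 1
    return scores


def rfm_cells(customers, n_bins=5):
    """Assign each customer an (R, F, M) cell; scoring scheme is an assumption."""
    # Fewer days since the last purchase is better, so negate recency before scoring.
    r = quantile_scores([-c["recency_days"] for c in customers], n_bins)
    f = quantile_scores([c["frequency"] for c in customers], n_bins)
    m = quantile_scores([c["monetary"] for c in customers], n_bins)
    return list(zip(r, f, m))


def cell_response_rates(cells, responded):
    """Observed response rate within each RFM cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [responders, customers]
    for cell, y in zip(cells, responded):
        counts[cell][0] += y
        counts[cell][1] += 1
    return {cell: hits / n for cell, (hits, n) in counts.items()}


def lift_at_depth(scores, responded, depth):
    """Response rate in the top `depth` fraction (by score) divided by the overall rate."""
    n_top = max(1, int(len(scores) * depth))
    ranked = sorted(zip(scores, responded), key=lambda p: p[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    overall_rate = sum(responded) / len(responded)
    return top_rate / overall_rate


def demo_lift(seed=0, n_customers=4000, depth=0.2):
    """Synthetic example (hypothetical data): recent buyers respond more often."""
    rng = random.Random(seed)
    customers, responded = [], []
    for _ in range(n_customers):
        c = {"recency_days": rng.randint(1, 365),
             "frequency": rng.randint(1, 20),
             "monetary": rng.uniform(10, 500)}
        p = 0.30 if c["recency_days"] < 60 else 0.03  # assumed response model
        customers.append(c)
        responded.append(1 if rng.random() < p else 0)
    cells = rfm_cells(customers)
    half = n_customers // 2
    # Estimate cell response rates on the first half, then rank the holdout half.
    rates = cell_response_rates(cells[:half], responded[:half])
    base_rate = sum(responded[:half]) / half  # fallback for cells unseen in training
    holdout_scores = [rates.get(cell, base_rate) for cell in cells[half:]]
    return lift_at_depth(holdout_scores, responded[half:], depth)


if __name__ == "__main__":
    print(f"Holdout lift in the top 20% of customers: {demo_lift():.2f}")
```

The same lift_at_depth function could be applied to a CHAID or logistic regression model by passing that model's predicted response probabilities as the scores, which would allow a like-for-like comparison at a given mailing depth.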