Data Mining Applications
Dr. Zhao Weidong (赵卫东)
School of Software, Fudan University
wdzhao@fudan.edu.cn

Customer Relationship Management (CRM)

Customer life cycle
[Figure: customer life cycle. Labels: profit, revenue, expenditure, lifetime; phases: acquiring customers, retaining customers, customer analysis and win-back]

Data mining support across the customer life cycle
[Figure; label: customer data]

Applications of data mining in CRM

Customer identification
• CRM begins with customer identification.
• This phase involves targeting the population most likely to become customers or to be most profitable to the company.
• It also involves analyzing customers who are being lost to the competition and how they can be won back.
• Elements of customer identification include target customer analysis and customer segmentation.

Customer attraction
• Organizations can direct effort and resources into attracting the target customer segments.
• Direct marketing is a promotion process that motivates customers to place orders through various channels, for example direct mail or coupons.

Target marketing

Customer retention
• Customer retention is the central concern of CRM.
• Customer satisfaction is the essential condition for retaining customers.
• Elements of customer retention include one-to-one marketing, loyalty programs, and complaints management.
• One-to-one marketing refers to personalized marketing campaigns supported by analyzing, detecting, and predicting changes in customer behavior.
• Loyalty programs involve campaigns or supporting activities that aim to maintain a long-term relationship with customers. Churn analysis, credit scoring, and service quality or satisfaction form part of loyalty programs.

Customer churn analysis

Customer development
• Elements of customer development include customer lifetime value analysis, up/cross selling, and market basket analysis.
• Customer lifetime value analysis is the prediction of the total net income a company can expect from a customer.
• Up/cross selling refers to promotion activities that aim to increase the number of associated or closely related services a customer uses within a firm.
• Market basket analysis aims to maximize customer transaction intensity and value by revealing regularities in customers' purchase behavior.
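The "regularities in purchase behavior" that market basket analysis looks for are commonly expressed as association rules with a support and a confidence value. The following is a minimal sketch for item pairs only. The function name `pair_rules`, the thresholds, and the sample baskets are illustrative assumptions, not taken from the slides.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) rules for item pairs.

    support(A, B)    = share of baskets containing both A and B
    confidence(A->B) = baskets with A and B / baskets with A
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore repeated items within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; names break ties so the order is stable.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical baskets, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "tea"],
    ["bread", "butter", "tea"],
]
rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

With these sample baskets, the rule "butter -> bread" comes out first because every basket containing butter also contains bread. A production system would normally use a full frequent-itemset algorithm such as Apriori rather than enumerating pairs.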
SPSS special topic: telecommunications industry analysis

SPSS Modeler analysis models for the telecommunications industry

Personalized recommendation systems

Personalized recommendation
• Personalization is defined as "the ability to provide content and services tailored to individuals based on knowledge about their preferences and behavior" or "the use of technology and customer information to tailor electronic commerce interactions between a business and each individual customer".
• The purpose of Internet recommendation systems (Internet recommender systems) in electronic commerce is to reduce irrelevant content and provide users with more pertinent information or products.
• A recommendation system is a computer-based system that uses profiles built from past usage behavior to provide relevant recommendations.

Information filtering and recommendation
• Three approaches: rule-based filtering, content-based filtering, and collaborative filtering.
• Rule-based filtering uses pre-specified if-then rules to select relevant information for recommendation.
• Content-based filtering uses keywords or other product-related attributes to make recommendations.
• Collaborative filtering uses the preferences of similar users in the same reference group as a basis for recommendation.

Typical personalization process
• Understanding customers through profile building
• Delivering personalized offerings based on knowledge about the product and the customer
• Measuring personalization impact

Inadequate information in IR
• One possible solution to the problem of inadequate information is to expand the query by adding more semantic information to better describe the concepts.
• Relevance feedback and knowledge structures are used to add appropriate terms to expand the queries.
• Relevance feedback is information on the items the user selected from the output of previous queries.

Spreading Activation Model
• In the Spreading Activation (SA) Model, concepts are expanded based on their semantics in the process of identifying the customer profile and matching items. The model has been applied to expand queries.
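The query expansion idea behind the Spreading Activation Model can be sketched as follows. This is a simplified illustration, not the model as specified in any particular paper: the concept network, the edge weights, and the `decay` and `threshold` parameters are all assumptions chosen for the example.

```python
def spread_activation(network, seeds, decay=0.5, threshold=0.2, max_steps=3):
    """Expand seed concepts by propagating activation through a weighted network.

    network: dict mapping a concept to {neighbor: association weight in [0, 1]}
    seeds:   dict mapping the original query concepts to initial activation
    Returns a dict of every activated concept and its activation level.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = {}
        for node, level in frontier.items():
            for neighbor, weight in network.get(node, {}).items():
                passed = level * weight * decay  # activation weakens at each hop
                if passed < threshold:
                    continue  # too weak to activate the neighbor
                if passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier[neighbor] = passed
        if not next_frontier:
            break
        frontier = next_frontier
    return activation


# Hypothetical concept network, for illustration only.
network = {
    "data mining": {"association rules": 0.9, "clustering": 0.8},
    "association rules": {"market basket analysis": 0.9},
    "clustering": {"customer segmentation": 0.6},
}
expanded = spread_activation(network, {"data mining": 1.0})
for term, level in sorted(expanded.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {level:.4f}")
```

Starting from the single query term "data mining", the sketch adds "association rules", "clustering", and "market basket analysis" to the query, while "customer segmentation" falls below the threshold and is left out. The threshold is what keeps the expansion from drifting to loosely related concepts.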
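A personalized knowledge recommendation system
• A semantic-expansion approach builds the user profile by analyzing documents previously read by the person.
• The approach integrates semantic information for spreading expansion with content-based filtering for document recommendation.

A sample semantic-expansion network
[Figure]

Experimental results
• An empirical study using master's theses in the National Central Library in Taiwan shows that the semantic-expansion approach outperforms the traditional keyword approach in capturing user interests.

Component library management

Adaptive component retrieval
• Component retrieval is an important problem in component library research; an effective retrieval mechanism lowers the cost of component reuse.
• The people who reuse components are not the component designers or the library administrators. When searching, they do not fully understand how the library describes its components, so they find it hard to state complete and precise retrieval requirements.
• The components a user finally selects reflect the user's real needs. If patterns linking the user's imprecise search conditions to the precise conditions actually needed can be inferred from search behavior and from feedback on the search results, the precision of the system can be improved.

Adaptive component retrieval based on association mining
• Association rule mining is introduced into component retrieval: rules linking imprecise search conditions to precise retrieval results are mined from users' search behavior and feedback, and are then used to adjust the retrieval mechanism and raise retrieval precision.

Example
• {Windows} -> {Windows, SQL Server}
• {Linux} -> {Linux, MySQL}
• {finance} -> {finance, SQL Server}
• {Windows, finance} -> {Windows, finance, SQL Server}

Supply chain management

Parts supplier selection
• The choice of suppliers determines not only product quality and cost but also the selling price, maintenance costs, and customer satisfaction.
• Supplier selection usually aims to minimize logistics cost subject to time constraints, and does not consider the correlation between part failure rates and regional environments.

Association-rule-based parts supplier selection
• An association rule mining algorithm is applied to product repair records to find frequent failure patterns, by region, of parts and part combinations from different suppliers.
• When generating supplier selection and distribution plans, these frequent failure patterns are used to choose a suitable combination of parts suppliers, jointly optimizing logistics cost and product maintenance cost.

Human resource management
• Human resources are very important in high-tech companies. Recruitment directly affects the quality of a company's staff, but traditional human resource management methods no longer meet the needs of high-tech companies.
• In the high-tech industry, knowledge changes constantly, jobs are hard to delimit, cross-functional tasks are common, and work processes are becoming more diverse. These factors place higher demands on employees and make it difficult to judge by traditional methods whether a candidate is competent for a job.
• Decision trees are used to mine personnel selection rules.

CHAID decision tree for predicting job performance
[Figure]

Improving education

Improving teaching and learning
• Instructors can have trouble identifying students' real difficulties in learning.
• Based on the students' testing records, the system identifies those problems and then suggests new teaching strategies.
• It assists teachers in identifying students' specific difficulties and weaknesses in learning.
• It helps each student find his or her weak points in learning and offers recommendations for improvement.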
ESL recommender teaching and learning

Right/wrong answer statistical table
• For every student, the system creates a right/wrong answer statistical table: a wrong answer is represented by 1 and a right answer by 0.

Summary table of students' wrong answers
• The right/wrong answer statistical tables of the individual students are combined into a summary table of students' wrong answers. The sums in the table are then ranked in descending order, so that the students' collective weaknesses appear from most to least severe.

Hierarchical clustering
• A hierarchical clustering algorithm is then applied to the collected data to segment the students into a certain number of clusters, or categories, each of which contains students sharing the same or similar characteristics.

All students' right/wrong answer statistical tables
[Table]

Clustering analysis
• A clustering analysis is performed on the data in all students' right/wrong answer statistical tables.
• The students fall into three clusters, listed here by student number:
  (9, 15, 6, 17, 13, 19, 14, 5)
  (22, 23, 4, 3, 21, 11, 24, 20, 7, 1)
  (12, 18, 2, 8, 25, 10, 16)

Search engine optimization
• Search result clustering engines are usually not search engines themselves.
• The clustering engine uses one or more traditional search engines to gather a number of results; it then post-processes these results to cluster them into meaningful groups.
• The post-processing step analyzes snippets, i.e., short document abstracts returned by the search engine, usually containing the words around occurrences of the query terms.

E-Commerce Recommender Systems

Background
• E-commerce has allowed businesses to provide consumers with more choices.
• Increasing choice, however, has also brought about information overload.
• E-commerce stores are applying mass customization principles to their presentation in online stores.
• One way to achieve mass customization in e-commerce is the use of recommender systems.
What are E-commerce Recommender Systems?
• Recommender systems are used by e-commerce sites to suggest products to their customers and to give consumers information that helps them decide which products to purchase.
• In a sense, recommender systems enable the creation of a new store personally designed for each consumer (one-to-one marketing).
• They are a tool for database marketing and CRM.

The Structure of Recommender Systems
• A typical e-commerce recommender application includes the functional I/O and the recommendation method.

Recommendation Method
• Targeted customer inputs
• Community inputs
• Outputs

Targeted Customer Inputs
• Explicit navigation inputs are made intentionally by the customer to inform the recommender application of his or her preferences, for example keyword search and registration.
• Implicit inputs: the specific item or items the customer is currently viewing, or the items in the customer's shopping cart (purchase history).

Community Inputs
• Community purchase history
• Best-seller lists
• Text comments

Output
• Recommendations: a set of suggestions, as an ordered or unordered list
• Ratings
• Meta-rating: rating the comments themselves
• Text comments
• Item-to-item correlation
• User-to-user correlation
• Top-N
• Email marketing

Delivery and Presentation
• Push methods reach a customer who is not currently interacting with the system, for example by e-mailing recommendations for related products.
• Pull methods notify customers that personalized information is available but display it only when the customer explicitly requests it.
• Other types of visualization.
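The item-to-item correlation and Top-N outputs listed above can be illustrated with a small sketch built on community purchase history. It uses cosine similarity over binary purchase vectors, which is one common choice and an assumption here, since the slides do not name a specific measure. The function names and the purchase data are hypothetical.

```python
from math import sqrt


def item_similarity(purchases):
    """Cosine similarity between items, treating each item as a set of buyers."""
    buyers = {}
    for customer, items in purchases.items():
        for item in items:
            buyers.setdefault(item, set()).add(customer)

    sims = {}
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            common = len(buyers[a] & buyers[b])
            if common:
                sims[(a, b)] = common / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims


def top_n(purchases, customer, n=2):
    """Recommend up to n items the customer does not own, ranked by similarity."""
    owned = set(purchases[customer])
    scores = {}
    for (a, b), sim in item_similarity(purchases).items():
        if a in owned and b not in owned:
            scores[b] = scores.get(b, 0.0) + sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


# Hypothetical community purchase history, for illustration only.
purchases = {
    "ann": ["camera", "tripod"],
    "bob": ["camera", "tripod", "memory card"],
    "cat": ["camera", "tripod"],
    "dan": ["phone", "phone case"],
    "eve": ["camera"],
}
recommendations = top_n(purchases, "eve")
print(recommendations)
```

For the customer "eve", who has bought only a camera, the sketch ranks the tripod ahead of the memory card because more camera buyers also bought a tripod. Items with no buyers in common with her purchases, such as the phone, are never suggested.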
Recommendation Methods
• Statistical summaries of community opinion: within-community popularity measures and aggregate or summary ratings
• Association analysis
• Content-based recommendations: the user is recommended items similar to the ones the user preferred in the past.
• Collaborative recommendations: the user is recommended items that people with similar tastes and preferences liked in the past.
• Hybrid approaches: these methods combine collaborative and content-based methods.
• Examples?

Techniques for Recommendation
• Many techniques from data mining can be adapted to the scalability problem for recommender systems: nearest neighbor, classifiers (rule induction, neural networks, and Bayesian networks), clustering, and association analysis.
• Web usage mining and more general commerce-related data mining may reveal techniques for exploiting complex behavioral data.

E-commerce Recommender Applications
E-commerce recommender systems enhance e-commerce sales (2%-8%) in the following ways:
• Converting browsers into buyers
• Increasing cross-sell
• Building credibility through community
• Inviting customers back
• Giving marketing professionals the type of feedback they need

Association analysis of customer reviews

A location-aware recommender system for mobile shopping environments
• When it receives a service request, the online subsystem generates a list of possibly interesting web pages based on the customer's interest profile, vendor data, and the customer's instantaneous position as provided by the location manager.

Discussion questions
• Read the references that follow and analyze the data mining methods used in the cases and the main problems they solve.
• Drawing on your own practice, describe the business intelligence needs of your current position (for Master of Software Engineering students).