Advances in BI

1. Why Data Mining?
2. Expert Systems: A Tool for Sifting Through Mountains of Data
   - Case Example: Ocean Spray Cranberries
3. Data Mining Models:
   - Association, Sequential Patterns, Classification, Clustering and Predictive Models
4. Data Mining Techniques:
   - Decision Trees, Rule Induction, Regression & Neural Networks
5. Text Mining for Unstructured Data
6. Business Activity Monitoring: A Priority Today

Dr. Lakshmi Mohan

Why Data Mining?
“Now that we have gathered so much data, what do we do with it?”
“The datasets are of little direct value themselves. What is of value is the knowledge that can be inferred from the data and put to use.”
- Data volumes are TOO BIG for traditional DSS query/reporting and OLAP tools.
- Organizations have to get value from the huge investments of time and money made in building data warehouses.

“Discover the Diamonds in Your Data Warehouse”
“Maximize your ROI on data warehousing & data marts by enabling your decision makers to exploit your customer data for competitive advantage”
“This web-enabled, point-and-click approach lets you employ OLAP, neural networks, churn analysis, and many other visualizations and analytical techniques to:
- Improve customer retention
- Target key prospects
- Profile market segments
- Detect fraud
- Analyze customer response, and much more”
Without BI, your DW is… well, a warehouse full of data
Source: Ads of BI vendors

The Economics of Attention
“A wealth of information creates a poverty of attention.” - Nobel prize-winning economist Herbert Simon
Problem: NOT Information Access BUT Information Overload
Challenge: Locating, Filtering & Communicating what is useful to the user

Why is Data Mining a “Hot” Topic Today?
1. Implementation of ERP, CRM & SCM systems has resulted in vast stores of operational data.
2. Emergence of global competition has put pressure on companies to be “data-driven”, i.e., to make informed decisions based on facts and not hunches.
3. The speed of change in the marketplace demands that the pearls of actionable information be found faster in the ocean of data, for companies to stay one step ahead of the competition.
4. The hardware needed to store and process a “ton of data” was prohibitively expensive until recently: “You would have had to have NASA at your disposal”. Today, the technology makes it feasible to apply complex models to ferret out patterns previously left to rot in “data jails”.

The Payoff from Data Mining - Two Examples
1. Farmer’s Insurance
   - Based on traditional data analysis, drivers of sports cars were determined to be at higher risk for collisions than drivers of “safe” cars such as Volvos. Hence the insurer charged them more for car insurance.
   - Data mining discovered a pattern that changed the pricing policy: as long as the sports car was not the only car in the household, the driver fit the profile of the “safe” family car driver, not the risky sports car driver.
2. Walgreen (a large retailer)
   - In the past, success of promotional offers such as 2-for-1 sales was measured primarily by product sales.
   - With data mining, Walgreen can see what other items are selling with its promotional offers.
   - It tuned its programs to put things on sale that people tend to buy in tandem with high-margin items.

What are Expert Systems?
A technology that enables expertise to be distributed throughout a firm without the presence of the human expert
Rule-Based System
- If “This”, Then “That”
- Rules are determined from expert knowledge and programmed in the software
An HR Application
- Screening a large number of resumes for relatively low-level positions with well-defined and precise skill requirements, e.g., Call Center Agents
- Expert System can weed out applicants who do not meet the requirements

Applying Expert Systems: To Extract “News” from Scanner Data
The Promise: Better Data for Tracking Market Shares, Compared to Retail Store Audits
- Frequency: Weekly vs. Bimonthly
- Level of Detail: UPCs vs. Brands
- Scope: Top 50 Markets vs. Regions
The Problem: Too Much Data
- At least 100 times more data
The Result: Impossible to Use the Quality Data

"CoverStory" - An Expert System: Replaced the Human Analyst
Before...
- Companies circulated top-line reports, including tables and charts from the retail store audit data.
- An analyst prepared the cover memo highlighting important news in the data.
Now...
- Not feasible to have an army of analysts to sift through the mountain of scanner data.
- Instead, "CoverStory" automatically writes this memo!
  - a model-embedded expert system extracts the news
  - includes a built-in thesaurus to eliminate repetitious wording

Case Example: Ocean Spray Cranberries
- A $1 billion grower-owned agricultural cooperative
- Lean IS staff
- Only one marketing professional for analyzing the tracking data
- Scanner data for juices is imposing
  - 400 M numbers covering up to 100 data measures, 10,000 products, 125 weeks and 50 geographic markets
  - Grows by 10 million new numbers every four weeks

Impact of CoverStory
- Enables a department of one to alert all Ocean Spray marketing and sales managers to key problems and opportunities and provide problem-solving information
- Being done across 4 business units handling scores of company products in dozens of markets representing hundreds of millions of dollars of sales
- System is totally integrated into business operations because it delivers information of competitive value in running the business

Tools to Get Value from Data Warehouses
Business Intelligence Tools
- To enable users without programming skills to analyze the raw data in the data warehouse.
- Ad Hoc Query / Reporting
- OLAP Tools to “slice” and “dice” data.
Data Mining Tools
- Automate the detection of patterns in the data warehouse
- Build models to predict behavior through statistical and machine-learning techniques.

Data Mining Not Limited to Discovery…
… i.e., finding an existing nugget of “gold” in the “mountain” of data
Data Mining is used for Prediction also
- Telling you not just where the gold is “today”, but where the gold might be “tomorrow”
- Predict what is going to happen next based on what we have found.
“From the moment I signed up for my Total Rewards card in the casino lobby and filled in my name, address, date of birth and driver’s license number, Harrah’s had a pretty good hunch that my long term potential was already low… I was a 32-year-old man from the distant state of Montana… did not fit the profile of a high-value customer!”
Age, gender and distance from the casino were identified through data mining as critical predictors of frequency of visiting casinos.

Knowledge Discovery in Databases - Steps in KDD process
Data Warehouse -> (Selection) -> Target Data -> (Cleaning) -> Pre-processed Data -> (Data reduction) -> Transformed Data -> (DATA MINING) -> Patterns -> (Evaluation & Interpretation) -> Knowledge
Source: Communications of the ACM, 1996

Data Mining is One Step in the KDD Process
Determine patterns from observed data to solve a business problem.
Step 1: Identify the Business Problem
        - e.g., Who are “good” customers? Which customers are likely to leave?
Step 2: Choose Model or Goal for Data Mining
        - Some models are better for predictions while others are better for describing behavior
Step 3: Choose Technology to Build Model
Step 4: Apply the Algorithm (Computation process) to Data. Review the results and refine the Model
Step 5: Validate the Model on New Data (the “hold-out” dataset)

Data Mining Models
1. Association
   - If a customer buys spaghetti, also buys red wine in 70% of cases
2. Sequential Patterns (time or event based)
   - A customer orders new sheets and pillow cases followed by drapes in 75% of the cases
3. Classification
   - Opera ticket buyers are usually young urban professionals with high income while country music concert ticket purchasers are typically blue collar workers
4. Clustering
   - Discovers different groups in the data whose members are very similar
5. Predictive Models
   - Relate behavior of customers (“dependent” variable) to predictors (“independent” variables felt to be “responsible” for the dependent one)

Association Models for Market-Basket Analysis
- Model finds items that occur together in a given event or record
- Discovers rules of the form: If item A is part of an event, then X% of the time (confidence factor), item B is part of the event.
- Used to discover patterns of items bought together from the “mountain” of scanner data
Example: If a customer buys corn chips, then 65% of the time, also buys cola. Unless there is a promotion, in which case buys cola 85% of the time.

Sequential Patterns
- Similar to Association Models, except that the relationships among items are spread over time.
- Sequences are associations in which events are linked by time
- Require data on the identity of the transactors in addition to details of each transaction.
Example: If surgical procedure X is performed, then 45% of the time infection Y occurs within 5 days. But after 5 days, the likelihood of infection Y drops to 4%.

Classification Models - Most Common Data Mining Model
- Describe the group that a member belongs to by examining existing cases that already have been classified, and inferring a set of rules
- These IF-THEN rules are often depicted in a tree-like structure
Examples:
- What are the characteristics of customers who are likely to switch to a rival telecom service provider?
- Which kinds of promotions have been effective in keeping which types of customers so that you can target the right promotion to the right customer?

Clustering Models
Segment a database into different groups whose members are very similar
- Similar to Classification except that no groups have yet been defined
The Clustering model discovers groupings within the data
- You do not know what the clusters will be when you start, or on what attributes the data will be clustered. Hence, a user who is knowledgeable in the business needs to interpret the clusters.
Example:
- Xerox has developed predictive models using clusters for analyzing usage profile history, maintenance data, and representations of knowledge from field engineers to predict photocopier component failure. An email is sent to the repair staff to schedule maintenance PRIOR to the breakdown
- “Root Cause Analysis” enables a “prescription” for what to do about a problem

Predictive Models
- Combine predictors (or “independent” variables) in a model relating them to the variable to be predicted (“dependent” or “predicted” variable), using historical data on the predictors and the predicted variable: the “training” data set
- Resulting model is used to predict the value for new data that does not include the predicted variable.
Example 1: Predefined Predictors
- If the customer is rural and her monthly usage is high, then the customer will probably renew. If the customer is urban and new feature exploration is high, then the customer will probably not renew.
Example 2: Customer Profiling
- “We can tell the profile of someone who is about to have a baby by what purchases they make… We can then compare that profile with those of others “who are moving into baby space” to predict needs. For instance, such a customer may be a good target for a life insurance sales pitch.”

Data Mining Techniques - Decision Trees
- Derives rules from patterns in data to create a hierarchy of IF-THEN statements, called a Decision Tree, to classify the data.
- Segments the original data set:
  - Each segment is one of the leaves of the tree
  - Records in each segment are similar with regard to the variable of interest
Example: Classification of Credit Risks

Pros & Cons of Decision Trees
1. How to handle continuous sets of data, like age or sales?
   - Ranges have to be created such as 25-34 years, 35-44 years, etc.
   - This grouping of ages could inadvertently hide patterns, e.g., a significant break at 30 could be concealed
2. Crux of the “Tree-Growing” Process: What is the best possible question to ask at each branch point of the tree?
   - e.g., The question “are you over 35?” may not distinguish between churners and those who are not if the split of people over 35 is 40% for churners & 60% for others. The goal is to get a 90%-10% (or 10%-90%) split in the segment of people over 35 years.
3. The algorithms look at all possible distinguishing questions, and the sequence of asking them, that could break up the “training data set” into segments that are nearly homogeneous with respect to the variable to be predicted. They stop growing the tree when the improvement is not substantial enough to warrant asking the question.

CART: Classification and Regression Trees - A Popular Statistical Package for Decision Trees
- CART begins by trying all the questions for grouping the population and picks the best one that splits the data into two or more “organized” segments that decrease the “disorder” of the original population as much as possible.
- Then, CART repeats the process on each of these new segments individually.
- The algorithm not only discovers the optimally generated tree but also has the validation of the model on new test data (holdout sample) built in.
- The most complex tree rarely fares the best on the holdout sample because it has been over-fitted to the training data set.
- The tree is pruned back based on the performance of the various pruned versions on the test data.

CHAID: Another Statistical Tool for Decision Trees
Chi-Square Automatic Interaction Detector
- Relies on the “Chi-Square” test used in “contingency” tables obtained by cross-tabulating the data on, say, churners and non-churners by predictors, which have to be “categorical”, such as age groups: Less than 20, 20-29, 30-39, etc.
- It determines which categorical predictor is “furthest from independence” with the prediction values of churners and non-churners.
- Problem: Continuous variables such as age have to be coerced into a categorical form. How many categories? Where should the splits be?

Decision Tree for Segmenting Customers - Who Responded to a Marketing Campaign
- Overall: 7% of Customers Responded
- Segment of Customers Who Rent with High Family Income and No Savings A/c: 45% response
- Target this segment for Future Direct Marketing Campaign

Data Mining Techniques - Rule Induction
Most common form of knowledge discovery in unsupervised learning systems
Rule: “IF this and this and this, THEN that”
- Accuracy or Confidence: How often is this rule correct?
- Coverage: How many records does this rule apply to?
  High Coverage means that the rule can be used often and is less likely to be an idiosyncrasy of the data set
Examples:
  Rule                                                                    Accuracy   Coverage
  If cereal purchased, Then milk is purchased                             85%        20%
  If bread, Then Swiss Cheese                                             15%        6%
  If 40-45 yrs and purchased pretzels and peanuts, Then beer purchased    95%        0.01%
Left Side of Rule (before THEN): Antecedent (Can Have Multiple Conditions)
Right Side of Rule (after THEN): Consequent (Only ONE Condition)
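
The accuracy and coverage of a rule can be computed directly from transaction records. The following is a minimal sketch in Python, assuming each basket is represented as a set of item names; the function name `rule_stats` and the 100 sample baskets are hypothetical and serve only as an illustration.

```python
def rule_stats(baskets, antecedent, consequent):
    """Return (accuracy, coverage) for the rule IF antecedent THEN consequent.

    accuracy: share of baskets containing the antecedent that also contain the consequent
    coverage: share of all baskets that contain the antecedent
    """
    antecedent, consequent = set(antecedent), set(consequent)
    applies = [b for b in baskets if antecedent <= b]
    if not applies:
        return 0.0, 0.0  # the rule never applies to this data set
    correct = sum(1 for b in applies if consequent <= b)
    return correct / len(applies), len(applies) / len(baskets)


# Hypothetical data set of 100 baskets (illustration only).
baskets = (
    [{"eggs", "milk"}] * 20
    + [{"eggs"}] * 10
    + [{"milk"}] * 20
    + [{"bread"}] * 50
)

for antecedent, consequent in [("milk", "eggs"), ("eggs", "milk")]:
    accuracy, coverage = rule_stats(baskets, {antecedent}, {consequent})
    print(f"IF {antecedent} THEN {consequent}: "
          f"accuracy={accuracy:.0%}, coverage={coverage:.0%}")
```

Reversing a rule changes both measures, because the antecedent decides which baskets the rule applies to.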

Rule Coverage vs Accuracy
                  Accuracy Low                       Accuracy High
  Coverage High   Rule is rarely correct,            Rule is often correct
                  BUT can be used often              AND can be used often
  Coverage Low    Rule is rarely correct             Rule is often correct
                  AND can only rarely be used        BUT can only rarely be used

Total # of baskets in database = 100
# with eggs = 30
# with milk = 40
# with both eggs and milk = 20
Rule: IF Milk, THEN Eggs    Accuracy = 20/40 = 50%    Coverage = 40/100 = 40%
Rule: IF Eggs, THEN Milk    Accuracy = 20/30 = 67%    Coverage = 30/100 = 30%

What To Do With A Rule?
1. Target the Antecedent:
   - All rules with a certain value for the antecedent, e.g., “nails, bolts and screws”, are presented to a retailer
   - Would discontinuing the sale of these low-margin items have any effect on sales of higher margin products, e.g., expensive hammers?
   - Example: A British supermarket was about to discontinue a line of expensive French cheeses which were not selling well. But data mining showed that the few people who were buying the cheeses were among the supermarket’s most profitable customers, so it was worth keeping the cheese to retain them.
2. Target the Consequent:
   - Understand what affects the consequent, say, purchase of coffee
   - Put those items near the coffee on the store shelves to increase sales of coffee and those items
   - Example: Sales of diapers and beer were found to be highly correlated in shopping transactions between 5pm and 7pm… young fathers dropped in at the stores to pick up diapers, and decided to stock up on beer at the same time… hence put the beer display near the diapers

Rule Induction vs. Decision Trees
Decision Trees: One AND ONLY One Rule for a Record
- All records in the training data set fall into mutually exclusive (non-overlapping) segments
- Supervised learning where the outcome is known for each record in the training data set, e.g., Was the person a good risk or a bad risk?
- Process trains the algorithm to recognize key variables and values that will be used for predictions with new data.
Rule Induction: May be Many Rules for a Record
- Not guaranteed that a rule will exist for every possible record in the training data set
- Will not partition the data into mutually exclusive segments… a particular record may match any number of rules, including no rules at all
- More commonly used for knowledge discovery in unsupervised learning than for prediction
- Rules are generally created by taking a simple high-level rule, and then adding new constraints to it until the coverage gets so small that it is not meaningful

When to Use What?
Decision Trees:
- Create the smallest possible set of rules for a predictive model
- Work from a prediction target downward in what is known as “greedy” search: look for the best possible split on the next step, greedily picking the best one without looking any further than the next step
- If there is overlap between two predictors, the better of the two would be picked, e.g., height might be used instead of shoe size as a predictor, whereas both could be used as antecedents in a rule induction system
- Traditionally used for exploration to determine the useful predictors to be fed on the second pass of data mining into prediction models using statistical techniques or neural networks
Rule Induction:
- Yields a variety of rules with different predictors even if some are redundant.
- Even though height and shoe size are highly correlated, both could be present as antecedents in two different rules. In contrast, the decision tree would pick the better of the two predictors
- Mainly used to discover interesting patterns in the data
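
The greedy split search used by decision trees can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of any particular package: "disorder" is measured here with Gini impurity (one common choice), the candidate questions are supplied by hand, and the churn records and helper names (`gini`, `split_impurity`, `best_question`) are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly homogeneous)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def split_impurity(records, question):
    """Weighted impurity of the two segments produced by a yes/no question."""
    yes = [label for features, label in records if question(features)]
    no = [label for features, label in records if not question(features)]
    n = len(records)
    return len(yes) / n * gini(yes) + len(no) / n * gini(no)


def best_question(records, questions):
    """Greedy step: pick the question with the lowest impurity, one step ahead only."""
    return min(questions, key=lambda name: split_impurity(records, questions[name]))


# Hypothetical training records: (features, churned?) with 1 = churner.
records = [
    ({"age": 25, "months": 3}, 1),
    ({"age": 40, "months": 4}, 1),
    ({"age": 52, "months": 2}, 1),
    ({"age": 30, "months": 30}, 0),
    ({"age": 45, "months": 24}, 0),
    ({"age": 60, "months": 36}, 0),
]
questions = {
    "age > 35": lambda f: f["age"] > 35,
    "months < 12": lambda f: f["months"] < 12,
}

for name, question in questions.items():
    print(name, round(split_impurity(records, question), 3))
print("best first question:", best_question(records, questions))
```

A full tree builder would repeat this step on each resulting segment and stop when the improvement becomes too small, as described for the tree-growing process.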

Data Mining Techniques - Regression Models
- Statistical models which link predictors or “independent” variables to the variable to be predicted or “dependent” variable
- User has to select the predictors and define the structure of the linkage
  e.g., a linear model linking the predictor, Customer’s Annual Income (X), to the variable to be predicted, Average Customer Bank Balance (Y):
      Y = a + b*X
- The constants ‘a’ and ‘b’ in the above model are called “parameters”; they specify the intercept and slope of the line relating X and Y.
- The parameters are calculated so as to minimize the sum of squares of the forecast errors when the model is applied to the training or model-fitting data set of X values and corresponding actual Y values… The “least squares method” uses calculus to derive the formulas for the parameters a and b.

Validation and Refinement of Regression Models
- “R-Squared” value is calculated to show the goodness of fit of the predicted Y values from the model to the actual Y values in the data set, e.g., a value of 0.87 means that 87% of the variation in Y was explained by the model
- Acid test of the model is to apply the fitted model to new data not used to calculate the parameters (‘a’ and ‘b’) of the model: the “hold-out” or “validation” data set
- Refine the model, if necessary, to make better predictions:
  … Add multiple predictors (“multiple regression” models)
  … Transform predictors by squaring, taking logarithms, etc. (“non-linear” models)
  … Combine predictors by multiplying or taking ratios (e.g., ratio of annual household income to family size)
- If the dependent variable is a response variable with just Yes/No or 0/1 values, a different model called the “logistic regression” model is used.
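
The least-squares fit of a one-predictor line and its R-squared value can be sketched as follows. The closed-form formulas for `a` and `b` are the standard ones for simple linear regression; the function names and the five income/balance pairs are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates of a and b in Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(xs, ys, a, b):
    """Share of the variation in Y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical training data: annual income (X) and average bank balance (Y),
# both in arbitrary units.
income = [1, 2, 3, 4, 5]
balance = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(income, balance)
print(f"a = {a:.2f}, b = {b:.2f}, R-squared = {r_squared(income, balance, a, b):.3f}")
```

For the hold-out test, the same `a` and `b` would be passed to `r_squared` together with X and Y values that were not used in `fit_line`.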

Data Mining Techniques - Neural Networks
Based on the concept of the human brain in that it learns
- originally developed for military applications to tell whether a speck on a screen is a bomber or a bird, and discriminate between decoys and genuine missiles
- now, the same technology can separate good customers from bad ones
Network composed of a large number of “neurons” (or processing elements) tied together with weighted connections (synapses)
- A collection of connected nodes, each having an input and an output, and arranged in layers.
- Between the visible Input Layer and final Output Layer, there could be a number of hidden processing layers

Structure of a Neural Network
- A neural network uses a training data set to produce outputs from inputs, which are then compared with the known output.
- A correction is then calculated for the discrepancy in the output and applied to the processing in the nodes in the network
- The process is repeated until its stopping condition, such as deviations being less than a prescribed amount, is reached

A Simple Example
Output: 0.47(0.7) + 0.65(0.1) = 0.39 vs. actual value of 0 (No Default)
• Link weights (0.7 & 0.1 in the above example) are adjusted to correct for the deviation between the output of the processing (0.39 in this case) and the actual value (0 in this case)
• Large errors are given greater attention in the correction than small errors

How do Neural Networks Learn?
Compute Output -> Desired Output Achieved? -> No: Adjust Weights and compute the output again; Yes: Stop
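
The compute-compare-adjust loop can be sketched for a single processing element using the numbers from the simple example (inputs 0.47 and 0.65, link weights 0.7 and 0.1, actual value 0). The update rule below is the classic delta rule for a linear unit, which is an assumption, since the slides do not name a specific correction formula; the learning rate, tolerance, and function names are likewise hypothetical.

```python
def predict(inputs, weights):
    """Output of one linear processing element: the weighted sum of its inputs."""
    return sum(x * w for x, w in zip(inputs, weights))


def train(inputs, weights, target, learning_rate=0.5, tolerance=0.01, max_rounds=1000):
    """Adjust the link weights until the output is within `tolerance` of the target."""
    weights = list(weights)
    for _ in range(max_rounds):
        error = predict(inputs, weights) - target
        if abs(error) < tolerance:  # stopping condition reached
            break
        # Delta rule: the correction is proportional to the error,
        # so large errors produce large weight changes.
        weights = [w - learning_rate * error * x for x, w in zip(inputs, weights)]
    return weights


inputs = [0.47, 0.65]
start_weights = [0.7, 0.1]
target = 0  # actual value: No Default

print("initial output:", round(predict(inputs, start_weights), 3))
trained = train(inputs, start_weights, target)
print("adjusted weights:", [round(w, 3) for w in trained])
print("final output:", round(predict(inputs, trained), 4))
```

A real network applies the same idea to many nodes and many training records, usually with a non-linear activation in each node.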

Pros and Cons of Neural Nets
Pros:
- Data-driven
- Used when expertise is hard to codify, but good results are known
- Works well when the technique is customized for a well-defined problem such as:
  - Credit Card Fraud Detection (HNC Software’s Falcon System)
  - Direct Marketing Campaigns (ASA’s ModelMAX)
- After the technique has proven to be successful, it can be used over and over again without a deep understanding of how it works
Cons:
- Hard to interpret weights and neuron relationships
- Not easy to use:
  - All the predictors must have numeric values
  - Output is also numeric and needs to be translated if the final output variable is categorical, such as the purchase of blue or white or black jeans

How to Evaluate a Data Mining Product
1. What kind of business problem does it address?
2. What technique does it use to model the data?
3. How does it handle categorical data and continuous data?
4. How sensitive is it to “noisy” data?
5. How does it avoid the problem of “overfitting” the model?
6. Does it have a built-in process for validating the model on the “holdout” data?
7. Is the user interface easy to understand and use?
8. How long does it take to get useful answers from the data?
9. How clear are the results to interpret?
10. ABOVE ALL, TEST DRIVE THE PRODUCT ON YOUR DATA!

Text Mining: An Imperative Today
“We are drowning in information, but are starving for knowledge”
Unstructured data, most of it in the form of text files, typically accounts for 85% of an organization's knowledge stores, but it’s not always easy to find, access, analyze or use.

New Generation of Text Mining Tools…
…to extract key elements from large unstructured data sets, discover relationships and summarize the information
- Categorization: Presents the search results in categories, rather than an undifferentiated mass.
- Clustering: Grouping similar documents based on their content.
- Extraction: Extracting relevant information from a document, e.g., pulling out all the company names from a data set.

New Generation of Text Mining Tools
- Keyword Search: Searching documents for the occurrence of a particular word or set of words.
- Natural-Language Processing: Determining the meaning of written words taking into account their context, grammar, etc.
- Visualization: Graphically presenting the mined data, as relationships are easier to spot and understand.

Case Example of Text Mining - Dow Chemical’s BI Center
- Using ClearResearch software to extract data from a century’s worth of chemical patent abstracts, published research papers and the company’s own files.
  “By eliminating the irrelevant, we’ve been able to reduce the time it takes for researchers to find what they need to read.”
- ClearResearch uses a proprietary pattern-matching technology to search for information, categorize it and show its relationship to other data.
  “The software can see, discover and extract concepts, not just words. It gives us a pictorial representation of the text in the document in an easy-to-understand chart”

Case Example of Text Mining - Air Products & Chemicals’ Knowledge Management System
- Company has over 18,000 employees in 300 countries, and more than 600 intranet and extranet sites. Its file servers contain 9TB of unstructured data, excluding email or anything stored on local drives.
- Using SmartDiscovery to generate a catalog and index of the data repository so that it can be more easily accessed by MS SharePoint Portal Document Management System.
- Also using the software for Sarbanes-Oxley compliance and e-learning: by correctly categorizing the data, business rules can be applied to a category of documents rather than to individual documents, e.g., if a document relates to operations covered by SOX, then the appropriate data-retention policies are applied to it.
  “I call it the central nervous system for what we are doing with knowledge management.”

Text Mining Tools
Come either as stand-alone products or embedded as part of a larger software system:
- Database vendors: Oracle, IBM,…
  - Incorporating pattern-matching algorithms into their database products
- Data Mining vendors: SAS, SPSS,…
  - Added text mining to their portfolios.
- Enterprise Search Engine vendors: Autonomy, Verity,…
- Specialized Text Mining firms: Inxight Software, Stratify…
“Installing SAS Text Miner is a simple process - just needed to load 6 CDs on my workstation”
Hard part: Get meaningful results
- Depends on the skill and knowledge of the user to properly interrogate text repositories
“We are getting an increasing understanding of what things are possible with text mining. But there is a huge skills problem in this area, which is why it hasn’t gotten much traction so far” - Gartner

Dec 2003 Report of Gartner
Text Mining Will Revolutionize CRM Strategies by 2008…
“Companies will retire older technologies such as IVR, and redesign customer-facing processes. Text Mining has not been well coupled with clearly recognized “pain points” in the organisation. Customer service has been mainly handled in call centers, with an emphasis on transaction processing and short interaction times. As a result, most firms have been missing valuable input from customers on how to improve their business processes. This has led to low levels of customer satisfaction, little long-term loyalty and an expensive, albeit necessary, way of resolving customer complaints… Blended service delivery models using text mining, telephone and web services will enable companies to identify not only what the customer said, but also what was meant… will be able to spot and resolve problems earlier… improve their ability to prevent problems recurring… improved measurement of customer satisfaction over today’s flawed survey methodology.”

Business Activity Monitoring (BAM)
Automated monitoring of business-related activity affecting an enterprise
- Report on activity in the current operational cycle, e.g., the current hour, day or week.
- Designed to spot problems early enough to head them off.
BAM is not a new concept
- Credit Card companies have had real-time fraud monitors for years.
- Manufacturers have real-time error-detection software built into their assembly lines.
Proactive or Reactive?
“The conventional wisdom has been to just take transactional data and move it to the data warehouse and then to the BI System. But these systems aren’t responsive”
- Monitoring business activity after the fact is too late to head off a problem such as a missed deadline or the loss of a major customer.
- BAM systems pluck the data in real time from the applications where it originates: order entry, accounts receivable, call centers, etc.
- Output in a variety of forms: dashboards, e-mails, pager alerts,…

GE’s Real-Time Dashboard
GE’s aim is to monitor everything in real time, GE’s CIO explains, calling up a special web page on his PC: a “digital dashboard”. From a distance it looks like a Mondrian canvas in green, yellow and red. A closer look reveals that the colors signal the status of software applications critical to GE’s business. If one of the programs stays red or even yellow for too long, he gets the system to email the people in charge. He can also see when he had to intervene the last time, or how individual applications such as programs to manage book-keeping or orders have performed. As CIO, Mr. Reiner was the first in the firm to get a dashboard, in early 2001. Now most of GE’s senior managers have such a constantly updated view of their enterprise.
Their screens differ according to their particular business, but the principle is the same: the dashboard compares how certain measurements, such as response times or sales or margins, perform against goals, and alerts managers if the deviation becomes large enough for them to have to take action.

BAM Case Example - Davis Controls Ltd. (Canada)
Every afternoon, at 4:30 pm, a screen pops up on the CEO’s PC with important “news”:
- How many orders the company booked
- Names of customers who have gone past 90 days without paying
- Orders that have missed delivery promises
PLUS 15 Daily E-mail Alerts, e.g.,
- Which salespeople have not logged in that day to download the latest data from a corporate database about the customers in their territories
  “Sometimes those remote sales guys will just sit out there in never-never land, and as long as they think no one is watching, they will march to their own drummer.”
- When a promised order-delivery is missed, one e-mail alert is generated for the responsible salesperson, one goes to a customer with an apology, and one goes to an expediter…
- Different e-mails go to new customers, depending on the size of their initial orders.

BAM Case Example - Davis Controls Ltd. (Canada)
- Uses Macola Enterprise Suite, an ERP package from Exact Software, a subsidiary of a Dutch company
- Includes the Exact Event Manager, a BAM product that triggers alerts and reports on activity and non-activity, both inside and outside the ERP system.
“BAM enables me to manage the Company more proactively. Before, I’d have to wait until a customer called with a complaint or the month-end financial reports to really get a feel for how the business was doing.”

BAM Case Example - A Fortune 100 Financial Services Firm
- Uses SeeRun Platform, a suite of products from SeeRun Corp.
  in San Francisco
- To monitor some 50,000 cases per year where the firm has signed contracts with its clients guaranteeing performance against operational metrics relating to dozens of milestones in the contracts.
  “If a task is supposed to be completed within 24 hours but isn’t, an alert is generated for the appropriate manager.”
  “Even more helpful is receiving live activity-tracking along the way – at 6 hours, 12 hours, 18 hours and so on.”
- Benefits: Improved Performance & Reduced Expenses
- Serves also as a marketing tool to show prospective clients
- Biggest Challenge: What To Do With All the Data
  “You can actually over-engineer something like this. If you get too many stakeholders involved, everyone wants their own particular metric. We have been able to keep it focused and simple.”

BAM Case Example - The Albuquerque City Government
- Uses NoticeCast from Cognos
- To proactively push e-mail notices of important events, in near real time, to city employees, residents & vendors
- NoticeCast sits outside the city’s firewall on an extranet and monitors events by periodically querying Oracle tables populated by municipal systems.
Vendors
- Sends an e-mail to each vendor that was issued an electronic payment during the night.
- Directs the vendor to a Website on the extranet where it can get a remittance report
Residents
- Sends an e-mail to each resident for whom a water bill was produced, with all the pertinent billing info
- Directs the resident to a Website where he may pay his bill online
City Employees
- Once-a-day e-mails to certain employees letting them know of all online payments made to the city during the past 24 hours
- Whenever a candidate files a contribution report, NoticeCast sends an e-mail to city employees responsible for tracking campaign law compliance

What’s Next for BAM?
- Will become tightly coupled to Business Process Management (BPM) systems
- Send Alerts in a publish/subscribe model to lots of BPM systems throughout the enterprise.
- Events go in and alerts come out, but those alerts just become events in other applications
Example: A BAM system could generate an alert that the estimated date of a package delivery had slipped. A CRM system and a BPM system might each subscribe to such “package due-date change” alerts, extending the usefulness of the alerts.

What’s Next for BAM?
- More sophisticated rules of logic will be included in BAM, capable of finding hidden patterns in current business activity by doing on-the-fly analyses of historical data.
  “If a process is beginning to go South, the early birds of that are hard to see. Eventually, we’ll see BI & BAM married at the level of using historically recorded data to identify problems much earlier.”
- Even further out lies the Holy Grail of BAM: when a system not only sees a problem coming but also goes beyond alerts to actually fixing the problem, e.g., automatically reordering a part when it sees that a shipment has been lost. This is an example of autonomic response, a self-learning system.

An Example of Autonomic Response
- 10 years ago: If you were a good customer, FedEx shipped you a PC and allowed you to dial into their network
- 5 years ago: You could get the shipping information from any browser
- Customers now want shipping information on their order status screen
Tomorrow's Scenario:
- FedEx plane containing your package is snowed in at Cincinnati
- FedEx system knows your package will not arrive in the morning
- A Web service can send you early notice of a non-delivery through the CRM system
- Business process for supply chain looks for an alternate supplier, if you cannot wait for the package
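
The publish/subscribe alert model described under “What’s Next for BAM?” can be sketched as a tiny in-process example. Everything here is hypothetical: the `AlertBus` class, the topic name, and the day numbers are illustrative stand-ins, not the interface of any BAM, CRM, or BPM product named in these slides.

```python
from collections import defaultdict


class AlertBus:
    """Tiny in-process publish/subscribe hub (hypothetical stand-in for BAM middleware)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, alert):
        # Every subscriber receives the alert as an incoming event.
        for handler in self._handlers[topic]:
            handler(alert)


def check_delivery(bus, package_id, promised_day, estimated_day):
    """BAM rule: publish an alert when the estimated delivery day slips past the promise."""
    if estimated_day <= promised_day:
        return False
    bus.publish(
        "package-due-date-change",
        {"package": package_id, "promised": promised_day, "estimated": estimated_day},
    )
    return True


crm_events, bpm_events = [], []
bus = AlertBus()
bus.subscribe("package-due-date-change", crm_events.append)  # CRM would notify the customer
bus.subscribe("package-due-date-change", bpm_events.append)  # BPM would seek an alternative

check_delivery(bus, "PKG-1", promised_day=3, estimated_day=3)  # on time: no alert
check_delivery(bus, "PKG-2", promised_day=3, estimated_day=5)  # slipped: alert published
print(crm_events)
print(bpm_events)
```

Because each subscriber only sees events on the topics it registered for, new consumers of an alert can be added without changing the monitoring rule itself.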