Business Intelligence Systems
Chapter 9

Objectives
• Q1 – Why do organizations need business intelligence?
• Q2 – What business intelligence systems are available?
• Q3 – What are typical reporting applications?
• Q4 – What are typical data-mining applications?
• Q5 – What is the purpose of data warehouses and data marts?

Why do organizations need business intelligence?
• Business intelligence is information that contains patterns, relationships, and trends about customers, suppliers, business partners, and employees.
• Business intelligence systems process, store, and provide useful information to users who need it, when they need it.

What business intelligence systems are available?
• A business intelligence (BI) system is an information system that employs business intelligence tools to produce and deliver information.
• Business intelligence tools are computer programs that implement a particular BI technique. The techniques fall into three categories:

Business Intelligence Tools
– Reporting tools read data, process them, and format them into structured reports that are delivered to users. They are used primarily for assessment.
– Data-mining tools process data using statistical techniques, search for patterns and relationships, and make predictions based on the results.
– Knowledge-management tools store employee knowledge and make it available to whoever needs it. These tools differ from the others because the source of the data is human knowledge.

It’s important to understand the difference between these business intelligence components:
– A BI tool is a computer program that implements the logic of a particular procedure or process.
– A BI application uses BI tools on a particular type of data for a particular purpose.
– A BI system is an information system with all five components (hardware, software, data, procedures, people) that delivers the results of a BI application to users.

What are typical reporting applications?
• Reporting applications input data from one or more sources and apply a reporting tool to the data to produce information. The reporting system delivers the information to users.
• Basic reporting operations include sorting, grouping, calculating, filtering, and formatting.

Raw Data
• This figure shows raw data before any reporting operations are applied.

Sales Data Sorted by Customer Name
Sales Data, Sorted by Customer Name & Grouped by Number of Orders & Purchase Amount
• The figure on the left shows the raw sales data sorted by customer name. The figure on the right shows data that has been sorted and grouped.

Fig 9-5 Sales Data Filtered to Show Repeat Customers
• This figure shows even more useful information: the data has been filtered and formatted according to specific criteria.

• RFM analysis lets you analyze and rank customers according to their purchasing patterns, as this figure shows.
– R = how recently a customer purchased your products
– F = how frequently a customer purchases your products
– M = how much money a customer typically spends on your products
• The lower the score, the better the customer.
• Online Analytical Processing (OLAP) is more generic than RFM and gives you the dynamic ability to sum, count, average, and perform other arithmetic operations on groups of data. OLAP reports, also called OLAP cubes, use:
– Measures, which are data items of interest. In the next figure, a measure is Store Sales Net.
– Dimensions, which are characteristics of a measure. In the next figure, a dimension is Product Family.

Fig 9-7 OLAP Product Family by Store Type
• A presentation like the one in the prior slide is often called an OLAP cube, or simply a cube.
• An OLAP cube and an OLAP report are the same thing.
• Users can alter the format of a report.
• It’s possible to drill down into the available data.

Drilled down by store location and store type
Further drilled down to just stores in California

What are typical data-mining applications?
Businesses use statistical techniques to find patterns and relationships among data and use them for classification and prediction. Data-mining techniques are a blend of statistics and mathematics with artificial intelligence and machine learning.

Fig 9-11 Convergence Disciplines for Data Mining

Data mining
• Data-mining terminology is an odd blend of terms from these different disciplines. Data mining is sometimes referred to as knowledge discovery in databases.
• There are two types of data-mining techniques:
– Unsupervised data-mining characteristics:
  • No model or hypothesis exists before running the analysis
  • Analysts apply data-mining techniques and then observe the results
  • Analysts create hypotheses after the analysis is completed
  • Cluster analysis, a common technique in this category, groups together entities that have similar characteristics
– Supervised data-mining characteristics:
  • Analysts develop a model prior to their analysis
  • Analysts apply statistical techniques to estimate the parameters of the model
  • Regression analysis is a technique in this category that measures the impact of a set of variables on another variable
  • Neural networks predict values and make classifications

Market-Basket Analysis
• Market-basket analysis is a data-mining tool for determining sales patterns. It helps businesses create cross-selling opportunities.
• Two terms used with this type of analysis, and shown in the figure, are:
– Support: the probability that two items will be purchased together
– Confidence: a conditional probability estimate, that is, the probability that a customer buys one item given that the customer has bought the other

Decision Trees
• A decision tree is a hierarchical arrangement of criteria that predicts a classification or value. In the sense defined above, it is an unsupervised data-mining technique: the analyst supplies no model in advance, and the program selects the attributes most useful for classifying entities on some criterion. (Machine-learning texts usually call decision-tree learning supervised, because it trains on examples with known outcomes.) It uses if…then rules in the decision process.
• Next are two examples.

Fig 9-13 Grades of Students from Past MIS Class (Hypothetical Data)
Fig 9-14 Credit Score Decision Tree

What is the purpose of data warehouses and data marts?
Data warehouses and data marts address the problems companies have with missing data values and inconsistent data. They also help standardize data formats between operational data and data purchased from third-party vendors. These facilities prepare, store, and manage data specifically for data mining and analysis.

Fig 9-15 Components of a Data Warehouse

• Granularity refers to the level of detail in the data; data can be too fine or too coarse for a given analysis.
• Clickstream data records the clicking behavior of customers on Web sites.
• The curse of dimensionality: having more attributes does not necessarily give you a more worthwhile predictor.

Figure 9-16 lists some of the data that’s readily available for purchase from data vendors.
Figure 9-17 shows some of the problems companies experience with operational data.

Here’s the difference between a data warehouse and a data mart:
• A data warehouse stores operational data and purchased data. It cleans and processes data as necessary. It serves the entire organization.
• A data mart is smaller than a data warehouse and addresses a particular component or functional area of an organization.

Fig 9-18 Data Mart Examples
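
Worked example: basic reporting operations

The basic reporting operations named in the reporting section (sorting, grouping, calculating, filtering, and formatting) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows in RAW_SALES, the function name build_report, and the min_orders threshold are all hypothetical and do not come from the figures in the slides.

```python
from collections import defaultdict

# Hypothetical raw sales rows: (customer name, purchase amount).
RAW_SALES = [
    ("Baker, Sam", 110.00),
    ("Ashley, Jane", 110.00),
    ("Baker, Sam", 45.00),
    ("Corning, Sandra", 375.00),
    ("Ashley, Jane", 42.50),
    ("Ashley, Jane", 30.00),
]


def build_report(rows, min_orders=2):
    """Return formatted report lines for repeat customers only."""
    # Grouping and calculating: order count and total amount per customer.
    totals = defaultdict(lambda: [0, 0.0])
    for name, amount in rows:
        totals[name][0] += 1
        totals[name][1] += amount

    # Filtering: keep customers with at least min_orders orders.
    repeat = {name: t for name, t in totals.items() if t[0] >= min_orders}

    # Sorting (by customer name) and formatting (fixed-width columns).
    lines = []
    for name in sorted(repeat):
        count, total = repeat[name]
        lines.append(f"{name:<16}{count:>3} orders  ${total:>8.2f}")
    return lines


report = build_report(RAW_SALES)
for line in report:
    print(line)
```

With this sample data, the single-order customer is filtered out and the two repeat customers appear in name order.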
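
Worked example: RFM scoring

As a rough illustration of the RFM analysis described in the reporting section, the sketch below ranks customers on recency, frequency, and money and assigns each a score per dimension. It assumes a 1-to-5 scale in which 1 marks the best fifth of customers, consistent with "the lower the score, the better the customer"; the scale, the customer data, and the function names are assumptions made for this example.

```python
from datetime import date

# Hypothetical customers:
# (name, date of last purchase, number of orders, average order value).
CUSTOMERS = [
    ("Ajax", date(2024, 6, 28), 40, 950.0),
    ("Bloom", date(2024, 6, 1), 12, 120.0),
    ("Caruthers", date(2023, 11, 15), 35, 80.0),
    ("Davidson", date(2023, 2, 3), 2, 40.0),
    ("Ellis", date(2024, 5, 20), 20, 600.0),
]


def quintile_scores(values, higher_is_better=True):
    """Score each value from 1 (best fifth) to 5 (worst fifth) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    scores = [0] * len(values)
    for rank, index in enumerate(order):
        scores[index] = rank * 5 // len(values) + 1
    return scores


def rfm_scores(customers, today):
    """Return {name: (R, F, M)} with lower scores meaning better customers."""
    # Recency: fewer days since the last purchase is better.
    days_since = [(today - last).days for _, last, _, _ in customers]
    r = quintile_scores(days_since, higher_is_better=False)
    # Frequency and money: larger values are better.
    f = quintile_scores([c[2] for c in customers])
    m = quintile_scores([c[3] for c in customers])
    return {c[0]: (r[i], f[i], m[i]) for i, c in enumerate(customers)}


scores = rfm_scores(CUSTOMERS, today=date(2024, 7, 1))
for name, (r, f, m) in scores.items():
    print(f"{name:<10} R={r} F={f} M={m}")
```

A customer scoring (1, 1, 1) bought recently, buys often, and spends a lot; one scoring (5, 5, 5) is weak on all three measures.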
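
Worked example: support and confidence

The two market-basket terms from the data-mining section can be computed directly from a list of transactions. The sketch below is a minimal illustration, not a full market-basket tool: the transactions are invented, and the function names support and confidence are chosen for this example. Support is taken as the fraction of transactions containing all the named items, and confidence as support(A and B) divided by support(A).

```python
# Hypothetical transactions: each set holds the items bought together.
TRANSACTIONS = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"tank", "weights"},
    {"mask", "snorkel"},
    {"fins", "weights"},
]


def support(transactions, *items):
    """Fraction of transactions that contain every item in `items`."""
    wanted = set(items)
    hits = sum(1 for basket in transactions if wanted <= basket)
    return hits / len(transactions)


def confidence(transactions, given, then):
    """Estimated probability of buying `then`, given `given` was bought."""
    base = support(transactions, given)
    if base == 0:
        raise ValueError(f"{given!r} never appears in the transactions")
    return support(transactions, given, then) / base


pair_support = support(TRANSACTIONS, "mask", "fins")
mask_to_fins = confidence(TRANSACTIONS, "mask", "fins")
print(f"support(mask, fins)     = {pair_support:.2f}")
print(f"confidence(mask → fins) = {mask_to_fins:.2f}")
```

Here masks and fins appear together in 2 of 5 transactions, and 2 of the 3 transactions with a mask also include fins, which is the kind of pattern that suggests a cross-selling opportunity.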