Final Exam Review
Spring 2011

Exam format
  About 75 questions
  45% multiple choice and T/F
  30% short fill-ins
  25% short-paragraph explanations

What to study
  50 questions from Exam 1 and 2
  12-15 questions about presentation topics
  10-13 questions will come from labs

Concepts from
  Market Basket: Data Mining
  SCM: RFID (hardware) + XML (concept)
  Fund Trading: DBs for Optimization
  Pivot Chart: DBs for Discovery & Prediction
  Wagemart: DBs for Decision Support

Market Basket Analysis
  Support: Probability (P) that an item is in someone’s checkout basket
  Baskets:
    A,B,E
    A,B,F
    A,B
    A,B,F,G
    A,D,F
    C,D
    C,D,G
    E,F,G
    E,F
    E,G
  P(A) = 5/10 = 50%
  P(AB) = 4/10 = 40%
  P(C) = 2/10 = 20%
  P(CD) = 2/10 = 20%

Market Basket Analysis
  Confidence X → Y = P(XY)/P(X): if item X is purchased, what is the probability that item Y is also purchased?
  Confidence A → B = P(AB)/P(A) = 40%/50% = 80%
  Confidence C → D = P(CD)/P(C) = 20%/20% = 100%
  Given:
    P(A) = 5/10 = 50%
    P(AB) = 4/10 = 40%
    P(C) = 2/10 = 20%
    P(CD) = 2/10 = 20%

Market Basket Analysis
  Quality X → Y = Confidence X → Y * P(XY)
  High quality association rules are both confident and well supported
  Quality A → B = 80% * 40% = 32%
  Quality C → D = 100% * 20% = 20%

Apriori Algorithm
  Calculates high quality association rules given
    billions of transactions
    millions of items
  Complex Association Rule
    ADGMS → CLPT (50% quality, 80% confidence), i.e., 5 items (A, D, G, M, and S) imply with great confidence that 4 items (C, L, P, and T) are purchased.
  Without the Apriori Algorithm, the calculation would take too long (millions of years).

Apriori Algorithm
  How it works: By setting a minimum support level, the algorithm can prune low-support pairs (2-itemsets) before computing 3-itemsets. Then, the pruned 3-itemsets are used to compute 4-itemsets.
  The algorithm is guaranteed to return all the itemsets above the minimum support level.
  When you get to 5-, 6-, or 7-itemsets, the pruning reduces the number of possible sets from trillions to a few thousand or a few hundred, which can help humans discover very complex, high quality association rules.
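Worked example (not from the lab materials): the support, confidence, and quality numbers above can be checked with a few lines of Python, and the same helper can drive a small Apriori-style search. This is a minimal sketch under stated assumptions. The function names (support, confidence, quality, frequent_itemsets) are made up for illustration, and the search is a simplified level-wise version of the pruning idea, not a production implementation.

```python
from itertools import combinations

# The ten checkout baskets from the Market Basket Analysis slide.
BASKETS = [
    {"A", "B", "E"}, {"A", "B", "F"}, {"A", "B"}, {"A", "B", "F", "G"},
    {"A", "D", "F"}, {"C", "D"}, {"C", "D", "G"}, {"E", "F", "G"},
    {"E", "F"}, {"E", "G"},
]


def support(items, baskets=BASKETS):
    """P(items): fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(x, y, baskets=BASKETS):
    """Confidence X -> Y = P(XY) / P(X)."""
    return support(set(x) | set(y), baskets) / support(x, baskets)


def quality(x, y, baskets=BASKETS):
    """Quality X -> Y = Confidence X -> Y * P(XY)."""
    return confidence(x, y, baskets) * support(set(x) | set(y), baskets)


def frequent_itemsets(baskets, min_support):
    """Level-wise search: (k+1)-itemsets are built only from surviving k-itemsets."""
    all_items = sorted(set().union(*baskets))
    level = [frozenset([i]) for i in all_items
             if support([i], baskets) >= min_support]
    result = {}
    while level:
        for itemset in level:
            result[itemset] = support(itemset, baskets)
        survivors = set(level)
        k = len(level[0]) + 1
        # Candidates: unions of two survivors that contain exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: keep a candidate only if every (k-1)-subset also survived.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in survivors
                             for sub in combinations(c, k - 1))}
        level = [c for c in candidates if support(c, baskets) >= min_support]
    return result


if __name__ == "__main__":
    print(f"P(A) = {support('A'):.0%}, P(AB) = {support('AB'):.0%}")
    print(f"Confidence A -> B = {confidence('A', 'B'):.0%}")
    print(f"Quality A -> B = {quality('A', 'B'):.0%}")
    for itemset, s in sorted(frequent_itemsets(BASKETS, 0.2).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print("".join(sorted(itemset)), f"{s:.0%}")
```

With a minimum support of 20%, the pair A,G (support 10%) is pruned, so the 3-itemset A,F,G is never counted at all. On ten baskets the savings are trivial; the point of the slide is that the same pruning is what makes billions of transactions tractable.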
Importance of Apriori Algorithm
  A process
  An innovation that takes terabytes of data and reduces it to meaningful rules
  Raw Data → Relevant and Timely Information
  A.I.
  Data Mining
  Market Basket Analysis

Pivot Chart Lab
  Great example of Online Analytical Processing (OLAP)
  Slice & dice data (Temp., Mood, Day, Weather)
  Drill down (look at only incorrect predictions)
  Unlike data mining, the process is interactive: a person participates in the process
  The process is ad hoc; it is not pre-determined like the Apriori algorithm

Significance of Pivot Chart Lab
  Business Intelligence (like A.I.)
  Use OLAP to find patterns
  Encode patterns as IF statements to predict future cases
  The spreadsheet can automate the human decision-making process on a large scale, faster than a human.
  Such a system enables timely, accurate predictions without a human decision-maker (Business Intelligence System)

Excel Pivot Charts as a tool
  First: A pattern is noticed
  Second: Interactive analysis tools (Pivot Chart) help to confirm and pinpoint the pattern
  Example: A marketer thinks that geography plays a role in sales; a Pivot Chart shows that Southern stores do have better sales.

Database queries as tools
  First: Data mining reveals numerous patterns (association rules)
  Second: Human intelligence can derive the theory behind the pattern.
  Example: The Apriori algorithm discovers a high quality association rule (Beer → Diapers). Later, marketers try to unravel the reason why.
  The data analysis must come before the hypothesis because the data is too big for humans to analyze.

Fund Trading Lab
  Decision Support Automation: Using a database to compute the optimal sequence of trades.
  Too many combinations for a human to analyze
  Another example of Business Intelligence
  1. At first we use a graph and human intuition to make the trades
  2. We do better if we use a query to calculate and sort all possible transactions
  3. We use database tools to pick the best ones that don’t overlap

Decision Support Systems: Wagemart vs. Fund Trading
  Wagemart
    Start with tons of data (individual salaries, availability)
    Reduce it to simple info (total cost, average rating) to help make a decision
  Fund Trading
    Start with less data (fund value for each day)
    Compute every possible transaction (much more data)
    Queries are used to find the optimal transactions
  Both systems model scenarios to compute the outcome of decisions
    One is structured: one scenario to optimize
    The other is unstructured: many different scenarios to consider
  Fund Trading was more structured, i.e., you can only buy and sell; you just have to decide the optimal days and funds to buy/sell.
  Wagemart was very unstructured: there were many different ways to cut costs.

Porter’s 5-forces
  Do companies compete because it’s fun? Maybe some…
  They compete because of the threat of going out of business.
  Profitability is the ultimate measure of success. Why?
  What are the threats?
  A new competitor will take away your sales and profits. Because they are better?
  In business, what does better really mean?

The five forces/threats
  New entrants
  Substitute products
  Rivalry
  Bargaining power of consumers
  Bargaining power of suppliers

Example
  Target forces its suppliers to use XML-formatted shipment data and boxes tagged with RFID chips.
  Apple refuses and wins; Target has to use Apple’s system to sell Apple’s products.
  What force is this?

Example
  Indirect: Brooke visits Google Shopping and Shopzilla to compare prices on a new camera. She’ll buy from the least expensive online retailer.
  Direct: Bradley uses LendingTree.com, where banks try to underbid each other to get his business.

Example
  Disney World implements a new ride tracking system that directs visitors to the rides with the shortest wait times.
  This forces Universal Studios to invest in a similar system.
Example
  Everyone at the gym is using their iPhone or Android phone to listen to music
  MP3 players are now collecting dust

Example
  Netflix emerges and puts 120 Blockbuster video stores out of business

Competitive strategies
  To fight the forces
  1. Do something totally new (innovation)
  2. Be inexpensive (cost leadership)
  3. Be big to increase power (growth)
       Lock in your customers
       Lock out your competition
  4. Make mutually beneficial partnerships (alliance)
  5. Be different but in a good way (differentiation)
       Put up barriers to the competition

Example
  Imagine if Blockbuster had decided to use Internet/mail delivery before Netflix.
  But Blockbuster was NOT ______________
  By the way, Netflix created a totally new process for renting videos.
  How does an IS make this possible?
  How is the IS better than the old-fashioned process?

E-commerce
  It was an innovation at one point
  Now it is necessary to stay in business

Example
  Walmart’s efficient supply chain cuts costs. RFID and XML play a role.
  Their size allows them to negotiate low prices with suppliers.
  Large companies absolutely need information systems for good management
  Walmart’s strategy is 2-fold.

How do Information Systems really help businesses to compete?
  The labs provide many examples
  RFID, XML: More accessible, timely information for improving the supply chain
  Market Basket: More relevant information for increasing sales/profits
  Wagemart: More accurate information for modeling decisions
  Pivot Chart & Fund Trading: Flexible information, manipulated in real time to solve problems (prediction & optimization)

The 11 information attributes are fair game
  Flexibility and accessibility are different.
  Putting something on the web makes it more accessible
  Storing data electronically can make it more flexible
  Putting electronic data in a robust, standardized format (XML) improves both.
  Attribute Trade-offs
    Simple vs. Complete
    Secure vs. Accessible

Presentations
  Don’t forget to review presentations
  The websites will be linked on Tuesday

Textbook Reading
  Low priority

Top Priority
  Review past exams and look up correct answers (Text and Google)
  Will post them on Tuesday
  Skim lab materials and instructions on Blackboard
  Create cheat sheet